Published February 6, 2025 | Version v1
Conference paper
Open
ADIFF: Explaining audio difference using natural language
Description
ADIFF is a language model based on audio prefix tuning. It includes a cross-projection module and is trained with a three-step process. ADIFF takes two audio recordings and a text prompt as input and produces difference explanations at several tiers of detail as output. Producing these explanations involves identifying and describing audio events, acoustic scenes, signal characteristics, and their emotional impact on listeners.
The code repository is: soham97/ADIFF
Files
(2.4 GB)
| MD5 checksum | Size |
|---|---|
| md5:37ae169cb2706e0a0976aec73aeb5b82 | 1.2 GB |
| md5:3f61dd7937d43f064a2130bc2fbe5c80 | 1.2 GB |
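Because each file is about 1.2 GB, it can be worth checking a download against the MD5 values listed above before using it. The following is a minimal sketch using only the Python standard library. The file names are not shown in this listing, so the keys `file_1` and `file_2` are hypothetical placeholders, and the helper names `md5_of_file` and `verify` are illustrative rather than part of the ADIFF repository.

```python
import hashlib

# MD5 checksums as listed on this record.
# The keys are hypothetical placeholders: the actual file names
# are not shown in the listing and must be filled in by the reader.
EXPECTED_MD5 = {
    "file_1": "37ae169cb2706e0a0976aec73aeb5b82",
    "file_2": "3f61dd7937d43f064a2130bc2fbe5c80",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte file is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file at `path` matches `expected_md5`
    (comparison is case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()
```

A call such as `verify("path/to/downloaded_file", EXPECTED_MD5["file_1"])` would then return `True` only for an intact download, where the path is whatever location the file was saved to.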
Additional details
Software
- Repository URL: https://github.com/soham97/ADIFF
- Programming language: Python