Text-to-audio grounding (TAG) dataset: AudioGrounding
Description
The AudioGrounding dataset includes audio files and timestamp annotations.
Changes in version 2: the train/validation/test sets were re-split, and the validation and test annotations were refined.
----------------------------------------------------------
References
[1] Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. "Text-to-audio grounding: Building correspondence between captions and sound events." In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 606-610.
[2] Xuenan Xu, Mengyue Wu and Kai Yu. "Investigating Pooling Strategies and Loss Functions for Weakly-Supervised Text-to-Audio Grounding via Contrastive Learning." In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2023, pp. 1-5.