Published April 19, 2023 | Version 2.2
Dataset Open

Text to audio grounding (TAG) dataset: AudioGrounding

Description

The AudioGrounding dataset contains audio files and timestamp annotations for text-to-audio grounding (TAG).

Changes in version 2: the train/validation/test sets were re-split, and the validation and test annotations were refined.

 

----------------------------------------------------------

References

[1] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, and Kai Yu. "Text-to-audio grounding: Building correspondence between captions and sound events." In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 606-610.

[2] Xuenan Xu, Mengyue Wu, and Kai Yu. "Investigating Pooling Strategies and Loss Functions for Weakly-Supervised Text-to-Audio Grounding via Contrastive Learning." In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2023, pp. 1-5.

Files

audio.zip

Files (2.5 GB)

Checksum Size
md5:0634c9a8eb12a844f3772a1bc22ed1f4 2.5 GB
md5:e67693e1ec7c8a040db1aeddf1b40118 475.0 kB
md5:09fe45cfab1be904f41f2182f261b7ab 4.0 MB
md5:917058633e744d08a7cf48d4161be960 486.9 kB
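
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library. The helper names md5_of_file and matches_checksum are hypothetical, and pairing audio.zip with the 2.5 GB checksum is an assumption, since the listing does not show which file each checksum belongs to.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value such as 'md5:0634...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumption: audio.zip sits in the current directory and corresponds
    # to the 2.5 GB entry in the file listing.
    archive = Path("audio.zip")
    if archive.exists():
        ok = matches_checksum(archive, "md5:0634c9a8eb12a844f3772a1bc22ed1f4")
        print("checksum OK" if ok else "checksum mismatch")
```

If the comparison fails, the file may be a different entry from the listing, or the download may be incomplete; downloading it again is usually the simplest fix.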