Text to audio grounding (TAG) dataset: AudioGrounding

Xuenan Xu

doi:10.5281/zenodo.7269161

There is a newer version of the record available.

Published October 31, 2022 | Version 2

Dataset (open access)

AudioGrounding dataset, including audio files and timestamp annotations.

Changes in version 2: The train/validation/test sets were re-split, and the validation and test annotations were refined.

----------------------------------------------------------

References

[1] Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. "Text-to-audio grounding: Building correspondence between captions and sound events." In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 606-610.

Files (2.5 GB)

Name	Size	MD5
audio.zip	2.5 GB	ecdb48fab2d09abceb0d14165201f23e
test.json	474.9 kB	2f988a2271a2cd9307473281ff458d64
train.json	4.3 MB	2a1e7ec1da4072a5b6b8e802e37036e9
val.json	486.9 kB	917058633e744d08a7cf48d4161be960
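
The MD5 digests in the file listing above can be used to confirm that a download completed without corruption. The following is a minimal sketch, assuming the four files were saved under their listed names in a single directory. The helper names `md5_of` and `verify` are hypothetical and are not part of the dataset or of any tooling shipped with it.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "audio.zip": "ecdb48fab2d09abceb0d14165201f23e",
    "test.json": "2f988a2271a2cd9307473281ff458d64",
    "train.json": "2a1e7ec1da4072a5b6b8e802e37036e9",
    "val.json": "917058633e744d08a7cf48d4161be960",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file to True/False (digest match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name  # assumed location of the download
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```

Reading in chunks keeps memory use flat even for the 2.5 GB archive. A file reported as MISMATCH is most likely a truncated download, or a copy from a different version of the record, and should be fetched again.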

	All versions	This version
Views	1,417	724
Downloads	1,316	570
Data volume	1.5 TB	574.2 GB

Resource type: Dataset
Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 1, 2022
Modified: April 19, 2023
