Audio captioning DCASE 2020 evaluation (testing) split

Konstantinos Drossos; Samuel Lipping; Tuomas Virtanen

doi:10.5281/zenodo.3865658

Published May 29, 2020 | Version 1.0

Dataset Open

1. Audio Research Group, Faculty of Information Technology and Communication Sciences, Tampere University

This is the evaluation split for Task 6, Automated Audio Captioning, of the DCASE 2020 Challenge.

This evaluation split is the Clotho testing split, which is thoroughly described in the corresponding paper:

K. Drossos, S. Lipping and T. Virtanen, "Clotho: an Audio Captioning Dataset," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 736-740, doi: 10.1109/ICASSP40776.2020.9052990.

available online at: https://arxiv.org/abs/1910.09387 and at: https://ieeexplore.ieee.org/document/9052990

This evaluation split is meant to be used only for Task 6 of the DCASE 2020 Challenge. It is not meant to be used for developing audio captioning methods. For developing audio captioning methods, you should use the development and evaluation splits of Clotho.

The development and evaluation splits of the Clotho dataset are also available on Zenodo, at: https://zenodo.org/record/3490684

--------------------------------------------------------------------------------------------------------

== License ==

The audio files in the archive:

clotho_audio_test.7z

and the associated meta-data in the CSV file:

clotho_metadata_test.csv

are under the corresponding licences (mostly Creative Commons with attribution) of the Freesound [1] platform, which are stated explicitly in the CSV file for each audio file. That is, each audio file in the 7z archive is listed in the CSV file together with its meta-data. The meta-data for each file are:

File name
Start and ending samples for the excerpt that is used in the Clotho dataset
Uploader/user in the Freesound platform (manufacturer)
Link to the licence of the file
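
Since the licence differs per file, it can help to inspect the metadata before reusing any audio. The following is a minimal Python sketch for loading the CSV. The exact column headers, delimiter, and text encoding are not stated in this description, so the sketch assumes a comma-separated file with a header row and UTF-8 encoding, and it only reports the headers it finds. The function name `load_metadata` is hypothetical.

```python
import csv


def load_metadata(csv_path, encoding="utf-8"):
    """Read a metadata CSV and return (column_names, rows).

    Assumptions (not confirmed by the dataset description): the file is
    comma-separated, has a header row, and is readable with `encoding`.
    """
    with open(csv_path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)  # each row is a dict keyed by column name
        return reader.fieldnames or [], rows


if __name__ == "__main__":
    columns, rows = load_metadata("clotho_metadata_test.csv")
    print("Columns:", columns)
    print("Number of listed audio files:", len(rows))
```

If decoding fails, passing a different `encoding` value is one thing to try; which encoding the file actually uses is not specified here.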

--------------------------------------------------------------------------------------------------------

== References ==
[1] Frederic Font, Gerard Roma, and Xavier Serra. 2013. Freesound technical demo. In Proceedings of the 21st ACM international conference on Multimedia (MM '13). ACM, New York, NY, USA, 411-412. DOI: https://doi.org/10.1145/2502081.2502245

Files (1.3 GB)

Name	Size	MD5
clotho_audio_test.7z	1.3 GB	9b3fe72560a621641ff4351ba1154349
clotho_metadata_test.csv	89.2 kB	52f8ad01c229a310a0ff8043df480e21
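
The MD5 values listed above can be used to check that a download completed without corruption. The following is a minimal Python sketch, assuming both files were saved under their original names in one directory. The function names `md5_of_file` and `verify_downloads` are hypothetical helpers, not part of any dataset tooling.

```python
import hashlib
from pathlib import Path

# MD5 digests as listed in the file table of this record.
EXPECTED_MD5 = {
    "clotho_audio_test.7z": "9b3fe72560a621641ff4351ba1154349",
    "clotho_metadata_test.csv": "52f8ad01c229a310a0ff8043df480e21",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Return {file name: True if present and its MD5 matches, else False}."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in chunks keeps memory use low for the 1.3 GB archive.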

Additional details

European Commission
EVERYSOUND - Computational Analysis of Everyday Soundscapes 637422
