There is a newer version of the record available.

Published June 2, 2022 | Version v1
Dataset Open

Clotho Analysis Set

  • 1. Inria
  • 2. Université de Lorraine
  • 3. Tampere University
  • 4. Wolt Enterprises Oy

Description

This dataset is derived from the evaluation subset of the Clotho dataset. It is designed to analyze the behavior of a captioning system under specific perturbations, in order to identify open challenges in automated audio captioning. The original audio clips are transformed with audio_degrader. The following transformations are applied:

  • Microphone response simulation

  • Mixup with another clip from the dataset (ratios: -6 dB, -3 dB and 0 dB)

  • Additive noise from DESED (ratios: -12 dB, -6 dB and 0 dB)
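
To make the dB ratios above concrete, here is a minimal sketch of mixing a second signal into a clip at a given level. It is not the audio_degrader implementation, and it rests on an assumption the record does not state: that the ratio is the RMS level of the added signal (other clip or noise) relative to the original clip. The function names `rms` and `mix_at_ratio` are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
import math


def rms(signal):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_ratio(target, interferer, ratio_db):
    """Add `interferer` to `target` at `ratio_db` dB relative to the target.

    Assumption (not confirmed by the dataset record): the ratio is the
    interferer RMS level relative to the target RMS level, so -6 dB means
    the added signal is about half the amplitude of the original clip.
    """
    # Truncate both signals to the shorter length so they can be summed.
    n = min(len(target), len(interferer))
    target, interferer = target[:n], interferer[:n]

    interferer_rms = rms(interferer)
    if interferer_rms == 0.0:
        # A silent interferer cannot be scaled; return the clip unchanged.
        return list(target)

    # Gain that brings the interferer to the requested relative level.
    gain = rms(target) / interferer_rms * 10 ** (ratio_db / 20)
    return [t + gain * i for t, i in zip(target, interferer)]
```

Under this reading, 0 dB adds the second signal at the same RMS level as the clip, and -12 dB adds it at roughly a quarter of the clip's amplitude.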

Files

clotho_analysis_2022.zip (12.7 GB)
md5:cda8b216a3531ec03e2a7b5164c54174
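
Because the archive is large, it may be worth checking a download against the MD5 checksum listed for it. The sketch below uses only Python's standard `hashlib` module and reads the file in chunks so the archive is never loaded into memory at once. The helper names `md5_of_file` and `verify_archive` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib

# Checksum as listed on the record for clotho_analysis_2022.zip.
EXPECTED_MD5 = "cda8b216a3531ec03e2a7b5164c54174"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected`."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("clotho_analysis_2022.zip")` returns `True` only if the file in the current directory matches the listed checksum.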