Published November 16, 2023 | Version 1.0.0
Dataset Open

Song Describer Dataset

  • 1. Queen Mary University of London
  • 2. Pompeu Fabra University
  • 3. Universitat Pompeu Fabra


The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

A retro-futurist drum machine groove drenched in bubbly synthetic sound effects and a hint of an acid bassline.

The Song Describer Dataset (SDD) contains about 1.1k captions for 706 permissively licensed music recordings. It is designed for evaluating models that address music-and-language (M&L) tasks such as music captioning, text-to-music generation, and music-language retrieval. More information about the data, the collection method, and validation is provided in the paper describing the dataset.

If you use this dataset, please cite our paper:

Ilaria Manco, Benno Weck, Seungheon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, and Juhan Nam. "The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation." Machine Learning for Audio Workshop at NeurIPS 2023, 2023.


Files (3.3 GB)

Size
3.3 GB
162.2 kB
97.7 kB
141.8 kB
204.2 kB
186.2 kB
108.7 kB

Additional details

Additional titles

a Corpus of Audio Captions for Music-and-Language Evaluation

Related works

Is derived from
Dataset: 10.5281/zenodo.3826813 (DOI)
Is supplemented by
Software: (URL)


Funding

UK Research and Innovation
UKRI Centre for Doctoral Training in Artificial Intelligence and Music: EP/S022694/1
Agencia Estatal de Investigación
Musical AI: Artificial intelligence to support musical experiences: towards a data-driven, human-centred approach: PID2019-111403GB-I0