Published July 26, 2024 | Version 1.0.0
Dataset Open

MuChoMusic dataset

  • 1. Pompeu Fabra University
  • 2. Queen Mary University of London
  • 3. Universal Music Group

Description

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

MuChoMusic is a benchmark designed to evaluate music understanding in multimodal language models focused on audio. It includes 1,187 multiple-choice questions validated by human annotators, based on 644 music tracks from two publicly available music datasets. These questions cover a wide variety of genres and assess knowledge and reasoning across several musical concepts and their cultural and functional contexts. The benchmark provides a holistic evaluation of five open-source models, revealing challenges such as over-reliance on the language modality and highlighting the need for better multimodal integration.
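
Because every question is multiple-choice, a model's performance can be summarised as the fraction of questions it answers correctly. The snippet below is a minimal sketch of that computation and not the authors' evaluation code: the function name `accuracy`, the question ids, and the dictionary layout are assumptions made for illustration, and the toy data is invented rather than drawn from the benchmark.

```python
def accuracy(predictions, answer_key):
    """Return the fraction of questions answered correctly.

    Both arguments are dicts mapping a question id to the index of the
    chosen option (a hypothetical layout, not the dataset's own schema).
    Questions with no prediction are counted as wrong.
    """
    if not answer_key:
        raise ValueError("answer_key is empty")
    correct = sum(
        1 for qid, gold in answer_key.items() if predictions.get(qid) == gold
    )
    return correct / len(answer_key)


# Toy data, made up for illustration only.
answer_key = {"q1": 0, "q2": 2, "q3": 1, "q4": 3}
predictions = {"q1": 0, "q2": 1, "q3": 1}  # "q4" was left unanswered

print(accuracy(predictions, answer_key))  # 0.5
```

Counting unanswered questions as wrong keeps the denominator fixed at the size of the answer key, so scores stay comparable across models that skip different questions.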

Note on Audio Files

This dataset comes without audio files. The audio files can be downloaded from two datasets: SongDescriberDataset (SDD) and MusicCaps. Please see the code repository for more information on how to download the audio.

Citation

If you use this dataset, please cite our paper:

@inproceedings{weck2024muchomusic,
   title={MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models},
   author={Weck, Benno and Manco, Ilaria and Benetos, Emmanouil and Quinton, Elio and Fazekas, György and Bogdanov, Dmitry},
   booktitle={Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR)},
   year={2024}
}

Files

datasheet.pdf

Files (340.2 kB)

Checksum Size
md5:4e5501c0cb11504069851cc4a93d6b7a 61.0 kB
md5:1493835617013f8e47cd91c97ba52d6e 279.1 kB
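
The MD5 checksums listed for the two files can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because this listing does not pair checksums with file names, the check only tests whether a file matches either listed value.

```python
import hashlib

# Checksums copied from the file listing of this record ("md5:" prefix removed).
LISTED_MD5 = {
    "4e5501c0cb11504069851cc4a93d6b7a",
    "1493835617013f8e47cd91c97ba52d6e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

Calling `matches_listed_checksum` on the local path of a downloaded file returns `False` if the file was truncated or altered in transit.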

Additional details

Additional titles

Subtitle
Evaluating Music Understanding in Multimodal Audio-Language Models

Related works

Is derived from
Dataset: 10.5281/zenodo.10072000 (DOI)
Dataset: arXiv:2301.11325 (arXiv)
Is described by
Preprint: arXiv:2408.01337 (arXiv)

Funding

Musical AI: Artificial intelligence to support musical experiences: towards a data-driven, human-centred approach PID2019-111403GB-I0
Agencia Estatal de Investigación
UKRI Centre for Doctoral Training in Artificial Intelligence and Music EP/S022694/1
UK Research and Innovation

Software

Repository URL
https://github.com/mulab-mir/muchomusic
Programming language
Python