Music4All A+A
Authors/Creators
Description
Music4All A+A: Artist and Album Dataset
Music4All A+A (Artist and Album) is a large-scale multimodal dataset for Music Information Retrieval (MIR) tasks, providing comprehensive metadata, genre labels, image representations, and textual descriptors for 6,741 artists and 19,511 albums.
This dataset extends the [Music4All-Onion dataset](https://doi.org/10.1145/3511808.3557656) by providing multimodal data at the artist and album level, enabling research in:
- Multimodal music genre classification
- Music recommendation systems
- Missing-modality scenarios
- Cross-domain transfer learning
Key Features
- Multimodal Data: Images and text for both artists and albums
- Rich Genre Labels: 659 unique artist genres and 737 unique album genres
- Balanced Distribution: Addresses class imbalance issues in existing datasets
- Missing-Modality Splits: Pre-defined test splits for evaluating robustness (10%, 30%, 50%, 70%, 90%, 100% modality availability)
- Extensible: Built on Music4All-Onion, allowing integration with track-level audio, video, and user-item interaction data
Note: The missing-modality splits are nested: every item in a smaller subset also appears in all larger subsets, so the items in the 10% subset are also present in the 30%, 50%, 70%, 90%, and 100% subsets.
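The nesting property can be illustrated with a minimal sketch. This is not the procedure used to build the released splits; it only shows one common way to obtain nested subsets, by shuffling the item IDs once and taking prefixes of that single ordering. The function name `nested_subsets`, the seed, and the `album_{i}` identifiers are hypothetical and do not come from the dataset.

```python
import random


def nested_subsets(item_ids, fractions=(0.1, 0.3, 0.5, 0.7, 0.9, 1.0), seed=0):
    """Return {fraction: set_of_ids} where each smaller subset is
    contained in every larger one (hypothetical helper, not the
    dataset authors' code)."""
    order = sorted(item_ids)            # deterministic starting order
    random.Random(seed).shuffle(order)  # one fixed shuffle shared by all levels
    subsets = {}
    for frac in sorted(fractions):
        k = round(frac * len(order))    # number of items at this level
        subsets[frac] = set(order[:k])  # prefix of the same ordering => nested
    return subsets


# Toy identifiers (assumption): the real dataset uses its own IDs.
ids = [f"album_{i}" for i in range(100)]
splits = nested_subsets(ids)

# Check the nesting property for consecutive levels.
levels = sorted(splits)
for small, large in zip(levels, levels[1:]):
    assert splits[small] <= splits[large]
```

Because every level is a prefix of the same shuffled list, results at different availability levels are computed on overlapping items, which makes them easier to compare.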
Citation
If you use this dataset in your research, please cite:
@inproceedings{geiger2025music4all,
  title={Music4All A+A: A Multimodal Dataset for Music Information Retrieval Tasks},
  author={Geiger, Jonas and Moscati, Marta and Nawaz, Shah and Schedl, Markus},
  booktitle={Proceedings of the IEEE International Conference on Content-Based Multimedia Indexing, Dublin, Ireland, October 22-24, 2025},
  year={2025},
  url={https://arxiv.org/abs/2509.14891}
}
Files
album_json.zip
Additional details
Identifiers
- arXiv: arXiv:2509.14891
- Other: 10.5281/zenodo.17278677
Related works
- Is described by: conference paper arXiv:2509.14891 (arXiv)