WikiMuTe: A web-sourced dataset of semantic descriptions for music audio

doi:10.5281/zenodo.10223363

Published December 13, 2023 | Version 1.0.0

Dataset Open

This upload contains the supplementary material for our paper presented at the MMM2024 conference.

Dataset

The dataset contains rich text descriptions for music audio files collected from Wikipedia articles.

The audio files are freely accessible and available for download through the URLs provided in the dataset.
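As a rough sketch of how one audio file might be fetched, assuming a row has already been read into a dict keyed by the `file` and `audio_url` columns of the CSV files: the helper names, the `audio` output directory, and the User-Agent string below are illustrative assumptions, not part of the dataset or its companion software.

```python
import urllib.request
from pathlib import Path


def local_path_for(row, out_dir="audio"):
    """Build a local target path from the row's `file` column."""
    # Keep only the final path component so a separator in the name
    # cannot place the file outside out_dir.
    return Path(out_dir) / Path(row["file"]).name


def download_audio(row, out_dir="audio", user_agent="wikimute-example/0.1"):
    """Download row['audio_url'] into out_dir, skipping files that already exist."""
    target = local_path_for(row, out_dir)
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # The User-Agent value is a placeholder; some hosts reject requests without one.
    request = urllib.request.Request(
        row["audio_url"], headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response, open(target, "wb") as fh:
        fh.write(response.read())
    return target
```

Calling `download_audio(row)` for each row would mirror the audio locally. Pausing between requests is advisable when fetching many files.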

Example

A few hand-picked, simplified examples of the dataset.

file	aspects	sentences
🔈 Bongo sound.wav	['bongoes', 'percussion instrument', 'cumbia', 'drums']	['a loop of bongoes playing a cumbia beat at 99 bpm']
🔈 Example of double tracking in a pop-rock song (3 guitar tracks).ogg	['bass', 'rock', 'guitar music', 'guitar', 'pop', 'drums']	['a pop-rock song']
🔈 OriginalDixielandJassBand-JazzMeBlues.ogg	['jazz standard', 'instrumental', 'jazz music', 'jazz']	['Considered to be a jazz standard', 'is an jazz composition']
🔈 Colin Ross - Etherea.ogg	['chirping birds', 'ambient percussion', 'new-age', 'flute', 'recorder', 'single instrument', 'woodwind']	['features a single instrument with delayed echo, as well as ambient percussion and chirping birds', 'a new-age composition for recorder']
🔈 Belau rekid (instrumental).oga	['instrumental', 'brass band']	['an instrumental brass band performance']
...	...	...

Dataset structure

We provide three variants of the dataset in the data folder.

All are described in the paper.

all.csv contains all the data we collected, without any filtering.
filtered_sf.csv contains the data obtained using the self-filtering method.
filtered_mc.csv contains the data obtained using the filtering method based on the MusicCaps dataset.

File structure

Each CSV file contains the following columns:

file: the name of the audio file
pageid: the ID of the Wikipedia article from which the text was collected
aspects: the short-form (tag) description texts collected from the Wikipedia articles
sentences: the long-form (caption) description texts collected from the Wikipedia articles
audio_url: the URL of the audio file
url: the URL of the Wikipedia article from which the text was collected
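The columns above could be read along the following lines. This is a minimal sketch that assumes the files are comma-separated with a header row and that `aspects` and `sentences` are stored as Python-style list literals, as the example table suggests. The function names are hypothetical, and the sample row uses made-up `pageid` and URL values.

```python
import ast
import csv
import io


def parse_list_field(text):
    """Parse a cell such as "['jazz', 'instrumental']" into a list of strings."""
    # literal_eval only evaluates literals, so it is safe on untrusted cells.
    value = ast.literal_eval(text) if text.strip() else []
    return [str(item) for item in value]


def read_wikimute(handle):
    """Yield one dict per CSV row with `aspects` and `sentences` as lists."""
    for row in csv.DictReader(handle):
        row["aspects"] = parse_list_field(row["aspects"])
        row["sentences"] = parse_list_field(row["sentences"])
        yield row


# Tiny in-memory sample; pageid and both URLs are placeholders.
sample = io.StringIO(
    "file,pageid,aspects,sentences,audio_url,url\n"
    "Bongo sound.wav,0,\"['bongoes', 'drums']\","
    "\"['a loop of bongoes playing a cumbia beat at 99 bpm']\","
    "https://example.org/audio,https://example.org/article\n"
)
rows = list(read_wikimute(sample))
print(rows[0]["aspects"])
```

For a downloaded file, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`.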

Citation

If you use this dataset in your research, please cite the following paper:

@inproceedings{wikimute,
    title = {WikiMuTe: {A} Web-Sourced Dataset of Semantic Descriptions for Music Audio},
    author = {Weck, Benno and Kirchhoff, Holger and Grosche, Peter and Serra, Xavier},
    booktitle = "MultiMedia Modeling",
    year = "2024",
    publisher = "Springer Nature Switzerland",
    address = "Cham",
    pages = "42--56",
    doi = {10.1007/978-3-031-56435-2_4},
    url = {https://doi.org/10.1007/978-3-031-56435-2_4},
}

License

The data is available under the Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.

Each entry in the dataset contains a URL linking to the article from which the text data was collected.

Files

Files (13.7 MB)

Name	MD5	Size
all.csv	90a68ade8b1328a5a9b99996fd86eb72	4.7 MB
filtered_mc.csv	e8802df8e10425c25d39516820bcabb7	4.3 MB
filtered_sf.csv	e67467ce7fa20b4ea24aded9d44f9ba3	4.7 MB

Additional details

Is supplement to: Preprint: arXiv:2312.09207 (arXiv); Publication: 10.1007/978-3-031-56435-2_4 (DOI)
Is supplemented by: Software: https://github.com/Bomme/wikimute (URL)