Hung, Hsiao-Tzu
Ching, Joann
Doh, Seungheon
Kim, Nabin
Nam, Juhan
Yang, Yi-Hsuan
2021-07-18
<p>The EMOPIA (pronounced ‘yee-mò-pi-uh’) dataset is a shared multi-modal (audio and MIDI) database focusing on perceived emotion in <strong>pop piano music</strong>, intended to facilitate research on various tasks related to music emotion. The dataset contains <strong>1,087</strong> music clips from 387 songs, with <strong>clip-level</strong> emotion labels annotated by four dedicated annotators. </p>
<p>For more detailed information about the dataset, please refer to our paper: <strong>EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation</strong>. </p>
<p><strong>File Description</strong></p>
<ul>
<li><em><strong>midis/</strong></em>: MIDI clips transcribed using GiantMIDI.
<ul>
<li>Filename `Q1_xxxxxxx_2.mid`: Q1 means this clip belongs to Q1 on the V-A (valence-arousal) space; xxxxxxx is the song ID on YouTube, and the `2` means this clip is the 2nd clip taken from the full song (see the parsing sketch after this list).</li>
</ul>
</li>
<li><em><strong>metadata/</strong></em>: metadata retrieved from YouTube during crawling.</li>
<li>
<p><em><strong>songs_lists/</strong></em>: YouTube URLs of songs.</p>
</li>
<li>
<p><em><strong>tagging_lists/</strong></em>: raw tagging result for each sample.</p>
</li>
<li>
<p><em><strong>label.csv</strong></em>: metadata that records filename, clip timestamps, and annotator.</p>
</li>
<li>
<p><em><strong>metadata_by_song.csv</strong></em>: lists all the clips grouped by song. It can be used to create train/val/test splits so that the same song does not appear in both train and test (see the split sketch after this list).</p>
</li>
<li>
<p><em><strong>scripts/prepare_split.ipynb:</strong></em> the script to create train/val/test splits and save them to csv files.</p>
</li>
</ul>
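<p>As a concrete illustration of the filename convention and the song-level split described above, here is a minimal Python sketch. The regular expression, the <code>.pkl</code>-free file handling, and the <code>song_id</code> column name for <em>metadata_by_song.csv</em> are assumptions for illustration; check the actual CSV header and <em>scripts/prepare_split.ipynb</em> for the exact field names.</p>
<pre><code>import re
import random
import pandas as pd

# Parse the clip naming scheme, e.g. "Q1_0vLPYiPN7qY_2.mid":
# quadrant on the valence-arousal space, YouTube song ID, clip index.
CLIP_RE = re.compile(r"^(Q[1-4])_(.+)_(\d+)\.(?:mid|mp3)$")

def parse_clip_name(filename):
    m = CLIP_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected clip name: {filename}")
    return m.group(1), m.group(2), int(m.group(3))  # quadrant, song_id, clip_idx

# Song-level split: keep all clips of a song in the same subset so the
# same song never appears in both train and test.
# "song_id" is an assumed column name; see metadata_by_song.csv for the real header.
def song_level_split(csv_path, val_ratio=0.1, test_ratio=0.1, seed=42):
    df = pd.read_csv(csv_path)
    songs = sorted(df["song_id"].unique())
    random.Random(seed).shuffle(songs)
    n_test = int(len(songs) * test_ratio)
    n_val = int(len(songs) * val_ratio)
    test_songs = set(songs[:n_test])
    val_songs = set(songs[n_test:n_test + n_val])

    def subset(song):
        if song in test_songs:
            return "test"
        if song in val_songs:
            return "val"
        return "train"

    df["split"] = df["song_id"].map(subset)
    return df
</code></pre>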
<p>------</p>
<p><strong>2.0 Update</strong></p>
<p>Added two new folders:</p>
<ul>
<li><strong><em>corpus/</em></strong>: processed data following <a href="https://github.com/YatingMusic/compound-word-transformer/blob/main/dataset/Dataset.md">the preprocessing flow</a>. (Please note that although the dataset contains <code>1,087</code> clips, some clips were lost during steps 1~4 of the flow, so the final number of clips in this <strong><code>corpus</code></strong> is <code>1052</code>, which is the number used for training the generative model.)</li>
<li><strong><em>REMI_events/</em></strong>: REMI events for each MIDI file, generated using this <a href="https://github.com/YatingMusic/compound-word-transformer/blob/main/dataset/representations/uncond/remi/corpus2events.py">script</a>.</li>
</ul>
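<p>For a quick look at the new folders, the sketch below loads one REMI event file. It assumes the events are stored as Python pickles of event sequences, as produced by the linked <code>corpus2events.py</code> script, and that the files use a <code>.pkl</code> extension; if the format in your copy differs, adapt the loading step accordingly.</p>
<pre><code>import pickle
from pathlib import Path

# Assumption: each file in REMI_events/ is a pickle of an event sequence
# (e.g. dicts with 'name'/'value' fields) produced by corpus2events.py.
def load_remi_events(path):
    with open(path, "rb") as f:
        return pickle.load(f)

events_dir = Path("REMI_events")
first_file = next(events_dir.glob("*.pkl"), None)  # file extension is an assumption
if first_file is not None:
    events = load_remi_events(first_file)
    print(first_file.name, "->", len(events), "events")
    print(events[:5])  # peek at the first few events
</code></pre>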
<p>------</p>
<p><strong>Cite this dataset</strong></p>
<pre><code>@inproceedings{EMOPIA,
  author    = {Hung, Hsiao-Tzu and Ching, Joann and Doh, Seungheon and Kim, Nabin and Nam, Juhan and Yang, Yi-Hsuan},
  title     = {{EMOPIA}: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation},
  booktitle = {Proc. Int. Society for Music Information Retrieval Conf.},
  year      = {2021}
}</code></pre>
<p><strong>DOI</strong>: <a href="https://doi.org/10.5281/zenodo.5144853">10.5281/zenodo.5144853</a> (related: <a href="https://doi.org/10.5281/zenodo.5090630">10.5281/zenodo.5090630</a>)</p>
<p><strong>License</strong>: <a href="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International (CC BY 4.0)</a></p>
<p><strong>Published at</strong>: ISMIR, International Society for Music Information Retrieval Conference 2021</p>
<p><strong>Keywords</strong>: piano, emotion, music, MIDI</p>