There is a newer version of the record available.

Published May 5, 2022 | Version 1.1
Dataset Open

Song Interpretation Dataset

  • 1. Queen Mary University of London
  • 2. NYU Shanghai

Description

The Song Interpretation Dataset combines data from two sources: (1) music and metadata from the Music4All Dataset and (2) lyrics and user interpretations from SongMeanings.com. We designed a matching algorithm based on music metadata to align corresponding items across the two sources. It matches 25.47% of the tracks in the Music4All Dataset.
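The record does not spell out the matching procedure, so the following is only a minimal sketch of one common way to align two catalogues by metadata: compare normalized (artist, title) pairs. The field names (`id`, `artist`, `title`, `url`), the helper names and the toy records are assumptions made for illustration, not the authors' algorithm or the actual schema of either source.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_tracks(tracks, pages):
    """Pair each track with the lyrics page that shares its normalized key."""
    # Hypothetical schema: every record carries 'artist' and 'title' strings.
    index = {(normalize(p["artist"]), normalize(p["title"])): p for p in pages}
    matches = []
    for track in tracks:
        key = (normalize(track["artist"]), normalize(track["title"]))
        if key in index:
            matches.append((track["id"], index[key]["url"]))
    return matches


# Toy records, invented for this example only.
tracks = [
    {"id": "t1", "artist": "Café Noir", "title": "Night Drive"},
    {"id": "t2", "artist": "Some Band", "title": "Unlisted Song"},
]
pages = [{"artist": "cafe noir", "title": "NIGHT DRIVE!", "url": "page-1"}]

matches = match_tracks(tracks, pages)
match_rate = len(matches) / len(tracks)  # share of tracks with a lyrics page
print(matches, match_rate)
```

Exact-key matching like this is strict by design; a fuzzier comparison would recover more pairs at the cost of false matches, and the reported match rate depends on where that trade-off is set.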

The dataset contains audio excerpts from 27,834 songs (30 seconds each, recorded at 44.1 kHz), the corresponding music metadata, about 490,000 user interpretations of the lyric text, and the number of votes given for each of these user interpretations. The average length of the interpretations is 97 words. Music in the dataset covers various genres, of which the top 5 are: Rock (11,626), Pop (6,071), Metal (2,516), Electronic (2,213) and Folk (1,760). 

For more details, please refer to our paper "Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model".

Files

Files (522.0 MB)

Checksum                              Size
md5:dc49ae3d10bf37980855fbda9e4c9ba6  246.8 MB
md5:c582a24f5af4e3e222f65f9a862c8dd9  212.9 MB
md5:f33239ffdb7f3a9a6ff9d1cca38d2152  60.7 MB
md5:30e3406a48a7a83c82571f8ff6081ad3  1.6 MB
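The md5 values listed above can be used to check that a download is intact. The sketch below assumes the files have already been saved locally; the file names are not shown in this record, so any path passed to `matches_checksum` (a hypothetical helper name) has to be supplied by the reader. The self-check at the end uses a throwaway temporary file rather than a dataset file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:dc49...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Self-check on a temporary file (not part of the dataset).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name

demo_ok = matches_checksum(demo_path, "md5:" + hashlib.md5(b"hello").hexdigest())
demo_bad = matches_checksum(demo_path, "md5:dc49ae3d10bf37980855fbda9e4c9ba6")
os.remove(demo_path)
print(demo_ok, demo_bad)
```

For a real download, call `matches_checksum` with the local path of the file and the corresponding listed value; a `False` result suggests a truncated or corrupted transfer.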