Dataset Open Access

Novel Datasets for Evaluating Song Popularity Prediction Tasks

Voetter, Michael; Mayerl, Maximilian; Specht, Guenther; Zangerle, Eva

We present two novel datasets for hit song prediction (HSP): HSP-S and HSP-L. They are substantially larger than currently available datasets with respect to the number of features they contain.

Both datasets provide high- and low-level audio features stemming from AcousticBrainz and short representative MP3 samples, as well as Mel-spectrogram features from the same samples. Further, we include listener- and play-counts gathered from last.fm for both datasets. The larger dataset, HSP-L, contains 73,482 songs with audio features, listener- and play-counts, making it substantially larger than previous datasets. In addition, we provide the release year information for 65,575 songs as provided by the Million Song Dataset.
The smaller dataset, HSP-S, contains 7,736 songs with listener- and play-counts, Billboard Hot 100 data for 50% of the songs, and release year information for 7,449 songs.

Files (44.4 GB)
Name                  Size     MD5
hsp-l_dataset.tar.xz  40.2 GB  2042f749d84bd13d157c23e49a5276a3
hsp-s_dataset.tar.xz  4.3 GB   16f66a5a93005f77b143558259dc0699
README.md             3.7 kB   1f340cdec9ca3360ff2882ab544fa3d3
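
Given the size of the archives, it may be worth checking a download against the MD5 checksums listed above before unpacking it. The following is a minimal sketch, assuming the files were saved under their original names; the helper names `md5_of_file` and `verify_download` are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "hsp-l_dataset.tar.xz": "2042f749d84bd13d157c23e49a5276a3",
    "hsp-s_dataset.tar.xz": "16f66a5a93005f77b143558259dc0699",
    "README.md": "1f340cdec9ca3360ff2882ab544fa3d3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the checksum listed for its file name."""
    path = Path(path)
    # Assumption: the file kept its original name, which is used as the lookup key.
    expected = EXPECTED_MD5[path.name]
    return md5_of_file(path) == expected
```

For example, `verify_download("hsp-s_dataset.tar.xz")` returns `True` only if the local file matches the listed checksum. Reading in chunks keeps memory use low even for the 40.2 GB archive.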
