Dataset Open Access

Novel Datasets for Evaluating Song Popularity Prediction Tasks

Voetter, Michael; Mayerl, Maximilian; Specht, Guenther; Zangerle, Eva

We present two novel datasets for hit song prediction (HSP): HSP-S and HSP-L. They are substantially larger than currently available datasets with respect to the number of features contained.

Both datasets provide high- and low-level audio features stemming from AcousticBrainz and short representative MP3 samples. Further, we include listener- and play-counts gathered from last.fm for both datasets. The larger dataset, HSP-L, contains 73,482 songs with audio features, listener- and play-counts, making it substantially larger than previous datasets. In addition, we provide the release year information for 65,575 songs as provided by the Million Song Dataset.
The smaller dataset, HSP-S, contains 7,736 songs with listener- and play-counts, as well as Billboard Hot 100 data for 50% of the songs. Release year information is available for 7,449 of its songs.
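
To put the release year figures in perspective, the share of songs with a known release year can be computed directly from the counts stated above. This is a minimal arithmetic sketch; the dictionary layout and the function name `year_coverage` are illustrative and not part of the datasets.

```python
# Song counts as stated in the dataset description.
COUNTS = {
    "HSP-L": {"songs": 73_482, "with_year": 65_575},
    "HSP-S": {"songs": 7_736, "with_year": 7_449},
}


def year_coverage(dataset: str) -> float:
    """Return the fraction of songs in `dataset` that have release year information."""
    entry = COUNTS[dataset]
    return entry["with_year"] / entry["songs"]


for name in COUNTS:
    print(f"{name}: {year_coverage(name):.1%} of songs have a release year")
```

Under these counts, roughly nine in ten HSP-L songs and a slightly larger share of HSP-S songs carry a release year, so analyses that depend on the year need to drop or handle the remainder.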

Files (3.7 GB)
| Name | MD5 | Size |
|---|---|---|
| hsp-l_acousticbrainz.parquet | 936d414b14b65e4e168777972021b6f9 | 1.8 GB |
| hsp-l_essentia.parquet | 4e9c0a42c9cb4355f7f69f8dfb276c09 | 1.5 GB |
| hsp-l_uuid_year.csv | 5c32908ce6936e432ec3da600fb9caf1 | 7.9 MB |
| hsp-s_acousticbrainz.parquet | c3372acb249f9243e92613537825ad4d | 210.7 MB |
| hsp-s_essentia.parquet | 60bacd786a3374207b9cd97ba734b34d | 167.0 MB |
| hsp-s_uuid_year.csv | 572a6a94ac783459ec141e89a834aed3 | 2.1 MB |
| README.md | f87f9256197a0db7b5f8484d5a693a76 | 3.1 kB |
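
Because the larger files are well over a gigabyte, it can be worth checking a download against the MD5 checksum listed for it. The following is a minimal sketch using only the Python standard library. The checksums are copied from the file listing; the function names `md5_of_file` and `verify_download` are illustrative, and the sketch assumes each file keeps its listed name on disk.

```python
import hashlib
import os

# MD5 checksums as given in the file listing for this record.
EXPECTED_MD5 = {
    "hsp-l_acousticbrainz.parquet": "936d414b14b65e4e168777972021b6f9",
    "hsp-l_essentia.parquet": "4e9c0a42c9cb4355f7f69f8dfb276c09",
    "hsp-l_uuid_year.csv": "5c32908ce6936e432ec3da600fb9caf1",
    "hsp-s_acousticbrainz.parquet": "c3372acb249f9243e92613537825ad4d",
    "hsp-s_essentia.parquet": "60bacd786a3374207b9cd97ba734b34d",
    "hsp-s_uuid_year.csv": "572a6a94ac783459ec141e89a834aed3",
    "README.md": "f87f9256197a0db7b5f8484d5a693a76",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str) -> bool:
    """Return True if the file at `path` matches the checksum listed for its file name.

    Raises KeyError if the file name is not one of the listed files.
    """
    expected = EXPECTED_MD5[os.path.basename(path)]
    return md5_of_file(path) == expected
```

For example, `verify_download("hsp-s_uuid_year.csv")` returns `True` only if the local copy is byte-identical to the published file. Reading in chunks keeps memory use flat even for the 1.8 GB Parquet file.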
| Metric | All versions | This version |
|---|---|---|
| Views | 229 | 229 |
| Downloads | 97 | 97 |
| Data volume | 316.9 GB | 316.9 GB |
| Unique views | 192 | 192 |
| Unique downloads | 43 | 43 |
