Published September 8, 2021 | Version v1
Dataset Open

Novel Datasets for Evaluating Song Popularity Prediction Tasks

Description

We present two novel datasets for hit song prediction (HSP): HSP-S and HSP-L. They are substantially larger than currently available datasets with respect to the number of features they contain.

Both datasets provide high- and low-level audio features stemming from AcousticBrainz and short representative MP3 samples. Further, we include listener- and play-counts gathered from last.fm for both datasets. The larger dataset, HSP-L, contains 73,482 songs with audio features, listener- and play-counts, making it substantially larger than previous datasets. In addition, we provide the release year information for 65,575 songs as provided by the Million Song Dataset.
The smaller dataset, HSP-S, contains 7,736 songs with listener- and play-counts, as well as Billboard Hot 100 data for 50% of the songs. Release year information is available for 7,449 of these songs.

Files

hsp-l_uuid_year.csv

Files (3.7 GB)

Checksum  Size
md5:936d414b14b65e4e168777972021b6f9  1.8 GB
md5:4e9c0a42c9cb4355f7f69f8dfb276c09  1.5 GB
md5:5c32908ce6936e432ec3da600fb9caf1  7.9 MB
md5:c3372acb249f9243e92613537825ad4d  210.7 MB
md5:60bacd786a3374207b9cd97ba734b34d  167.0 MB
md5:572a6a94ac783459ec141e89a834aed3  2.1 MB
md5:f87f9256197a0db7b5f8484d5a693a76  3.1 kB