Novel Datasets for Evaluating Song Popularity Prediction Tasks
Description
We present two novel datasets for hit song prediction (HSP): HSP-S and HSP-L. They are substantially larger than currently available datasets with respect to the number of features they contain.
Both datasets provide high- and low-level audio features stemming from AcousticBrainz and short representative MP3 samples. Further, we include listener- and play-counts gathered from last.fm for both datasets. The larger dataset, HSP-L, contains 73,482 songs with audio features, listener- and play-counts, making it substantially larger than previous datasets. In addition, release year information taken from the Million Song Dataset is available for 65,575 of these songs.
The smaller dataset, HSP-S, contains 7,736 songs with listener- and play-counts, as well as Billboard Hot 100 data for 50% of the songs. Release year information is available for 7,449 of these songs.
Files (3.7 GB)
Previewed file: hsp-l_uuid_year.csv
MD5 checksum | Size |
---|---|
936d414b14b65e4e168777972021b6f9 | 1.8 GB |
4e9c0a42c9cb4355f7f69f8dfb276c09 | 1.5 GB |
5c32908ce6936e432ec3da600fb9caf1 | 7.9 MB |
c3372acb249f9243e92613537825ad4d | 210.7 MB |
60bacd786a3374207b9cd97ba734b34d | 167.0 MB |
572a6a94ac783459ec141e89a834aed3 | 2.1 MB |
f87f9256197a0db7b5f8484d5a693a76 | 3.1 kB |
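
The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `verify_download` are illustrative and not part of the dataset, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the checksum from the file listing.

    Accepts the checksum with or without a leading "md5:" prefix.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download(local_path, "md5:936d414b14b65e4e168777972021b6f9")` should return `True` for an intact copy of the 1.8 GB file, where `local_path` is a placeholder for wherever that file was saved. Reading in chunks matters here because the two largest files would otherwise be loaded into memory in full.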