Published October 7, 2019 | Version v2
Dataset Open

MusicOSet: An Enhanced Open Dataset for Music Data Mining

  • 1. Universidade Federal de Minas Gerais

Description

MusicOSet is an open, enhanced dataset of musical elements (artists, songs, and albums) built around musical popularity classification. It provides a directly accessible collection of data suitable for numerous music data mining tasks, such as data visualization, classification, clustering, similarity search, music information retrieval (MIR), and hit song science (HSS). To create MusicOSet, the potential information sources were divided into three main categories: music popularity sources, metadata sources, and acoustic and lyrical feature sources. Data from all three categories were initially collected between January and May 2019, then updated and enhanced in June 2019.

The attractive features of MusicOSet include:

  • Integration and centralization of different musical data sources
  • Calculation of popularity scores and classification of musical elements as hits or non-hits, covering 1962 to 2018
  • Enriched metadata for music, artists, and albums from the US popular music industry
  • Availability of acoustic and lyrical resources
  • Unrestricted access in two formats: SQL database and compressed .csv files

|        Data       | # Records |
|:-----------------:|:---------:|
| Songs             | 20,405    |
| Artists           | 11,518    |
| Albums            | 26,522    |
| Lyrics            | 19,664    |
| Acoustic Features | 20,405    |
| Genres            | 1,561     |
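The record counts above can be checked against a downloaded copy of the compressed .csv files. The following is a minimal sketch using only the Python standard library. The archive name, the member name, the tab delimiter, and the presence of a single header row are all assumptions for illustration; the listing here does not state them, so adjust them to match the actual files.

```python
import csv
import io
import zipfile

# Record counts as published in the table above.
EXPECTED_RECORDS = {
    "Songs": 20405,
    "Artists": 11518,
    "Albums": 26522,
    "Lyrics": 19664,
    "Acoustic Features": 20405,
    "Genres": 1561,
}


def count_records(zip_path, member, delimiter="\t"):
    """Count data rows (header excluded) of one CSV file inside a zip archive.

    `member` is the path of the CSV inside the archive. The default tab
    delimiter is an assumption, not something stated by the dataset page.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # newline="" lets the csv module handle quoted multi-line fields.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter=delimiter)
            next(reader, None)  # skip the assumed header row
            return sum(1 for _ in reader)
```

With a hypothetical archive `songs_archive.zip` containing `songs.csv`, a call such as `count_records("songs_archive.zip", "songs.csv") == EXPECTED_RECORDS["Songs"]` would confirm that the local copy matches the published count.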

Files

additional.zip

Files (376.7 MB)

| MD5 checksum                     | Size     |
|:--------------------------------:|:--------:|
| 88626a90c7f4bdf0cfdb6c39a2c2fd31 | 58.5 MB  |
| 6bcb13b308b0ec6457a15712dafbf0f0 | 245.3 MB |
| dbf4de4942ba54e3892a213bf4856675 | 6.0 MB   |
| f14c16252f1cb3af49f5f8fad56deaaa | 12.4 MB  |
| b0f5dd491b0d2a85bc7e6d5c57b0b5c1 | 54.5 MB  |
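The MD5 checksums listed above can be used to confirm that a download completed without corruption. Below is a minimal sketch with Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative, not part of any dataset tooling, and the local file path is whatever name the archive was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for large archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:88626a90...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file.zip", "md5:6bcb13b308b0ec6457a15712dafbf0f0")` returns `True` only if the local file is byte-identical to the 245.3 MB file in the listing. The file name here is a placeholder, since the listing does not show which name belongs to which checksum.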

Additional details

References

  • Silva, M. O., Rocha, L. M., and Moro, M. M. (2019). MusicOSet: An Enhanced Open Dataset for Music Data Mining. In XXXIV Simpósio Brasileiro de Banco de Dados: Dataset Showcase Workshop, SBBD 2019 Companion, Fortaleza, CE, Brazil.