Playlist2vec: Spotify Million Playlist Dataset

Papreja, Piyush

doi:10.5281/zenodo.5002584

Published March 27, 2020 | Version v1

Dataset Open

Papreja, Piyush¹

1. Arizona State University

This dataset was created using the Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
The dataset contains 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
The data is stored in a SQL database, with the primary entities being tracks (songs), albums, artists, and playlists.
Each of these entities is identified by a unique ID (Spotify URI).
The data is stored in the following tables:

album
artist
track
playlist
track_artist1
track_playlist1

album

| id | name | uri |

id: Album ID as provided by Spotify
name: Album Name as provided by Spotify
uri: Album URI as provided by Spotify

artist

| id | name | uri |

id: Artist ID as provided by Spotify
name: Artist Name as provided by Spotify
uri: Artist URI as provided by Spotify

track

| id | name | duration | popularity | explicit | preview_url | uri | album_id |

id: Track ID as provided by Spotify
name: Track Name as provided by Spotify
duration: Track Duration (in milliseconds) as provided by Spotify
popularity: Track Popularity as provided by Spotify
explicit: Whether the track has explicit lyrics (true or false)
preview_url: A link to a 30-second preview (MP3 format) of the track. Can be null.
uri: Track URI as provided by Spotify
album_id: ID of the album to which the track belongs

playlist

| id | name | followers | uri | total_tracks |

id: Playlist ID as provided by Spotify
name: Playlist Name as provided by Spotify
followers: Playlist Followers as provided by Spotify
uri: Playlist URI as provided by Spotify
total_tracks: Total number of tracks in the playlist

track_artist1

| track_id | artist_id |

Track-Artist association table

track_playlist1

| track_id | playlist_id |

Track-Playlist association table
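
The association tables are what link tracks to playlists and artists. As a rough illustration, the sketch below lists the tracks and artists of one playlist by joining through track_playlist1 and track_artist1. It is not taken from the dataset's tooling: it uses an in-memory SQLite database as a stand-in for the MySQL database, creates only a subset of the columns described above, assumes text IDs, and fills the tables with invented rows. The function name playlist_tracks is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
# Only a subset of the documented columns is created, and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE artist (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE playlist (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE track_artist1 (track_id TEXT, artist_id TEXT);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
INSERT INTO track VALUES ('t1', 'Song A'), ('t2', 'Song B');
INSERT INTO artist VALUES ('a1', 'Artist X'), ('a2', 'Artist Y');
INSERT INTO playlist VALUES ('p1', 'Morning Mix');
INSERT INTO track_artist1 VALUES ('t1', 'a1'), ('t2', 'a2');
INSERT INTO track_playlist1 VALUES ('t1', 'p1'), ('t2', 'p1');
""")

# Walk playlist -> track -> artist through the two association tables.
PLAYLIST_TRACKS = """
SELECT t.name, a.name
FROM track_playlist1 AS tp
JOIN track AS t ON t.id = tp.track_id
JOIN track_artist1 AS ta ON ta.track_id = t.id
JOIN artist AS a ON a.id = ta.artist_id
WHERE tp.playlist_id = ?
ORDER BY t.name, a.name
"""


def playlist_tracks(connection, playlist_id):
    """Return (track name, artist name) pairs for one playlist."""
    return connection.execute(PLAYLIST_TRACKS, (playlist_id,)).fetchall()


rows = playlist_tracks(conn, "p1")
```

The same SELECT should carry over to the MySQL database, although Python MySQL drivers typically use %s rather than ? as the parameter placeholder.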

- - - - - SETUP - - - - -

The data is provided as a SQL dump. The download size is about 10 GB, and the database populated from it is about 35 GB.

spotifydbdumpschemashare.sql contains the schema for the database (for reference).
spotifydbdumpshare.sql is the actual data dump.

Setup steps:
1. Create database <dbname>
2. mysql -u <username> -p <dbname> < spotifydbdumpshare.sql
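
The two setup steps can also be scripted. The following is a minimal sketch, assuming the mysql command-line client is installed and on the PATH; the function names and the DUMP_FILE constant are hypothetical, and the database name is assumed to be a simple, trusted identifier because it is placed directly into the SQL statement. The -p flag makes the client prompt for the password, as in the command above.

```python
import subprocess

DUMP_FILE = "spotifydbdumpshare.sql"  # data dump named in this record


def create_db_command(username, dbname):
    # Step 1: issue CREATE DATABASE through the mysql client's -e (execute) flag.
    # dbname is assumed to be a trusted, simple identifier.
    return ["mysql", "-u", username, "-p", "-e", f"CREATE DATABASE {dbname}"]


def import_command(username, dbname):
    # Step 2: same arguments as the documented command, minus the shell redirect.
    return ["mysql", "-u", username, "-p", dbname]


def load_dump(username, dbname, dump_path=DUMP_FILE):
    """Create the database, then feed the dump file to mysql on standard input."""
    subprocess.run(create_db_command(username, dbname), check=True)
    with open(dump_path, "rb") as dump:
        subprocess.run(import_command(username, dbname), stdin=dump, check=True)
```

Calling load_dump("<username>", "<dbname>") would prompt for the password twice, once per step. Given the size of the dump, the import can be expected to take a long time.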

- - - - - PAPER - - - - -

The description of this dataset can be found in the following paper:

Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham

Files (10.7 GB)

| Name | MD5 | Size |
| spotifydbdumpschemashare.sql | 015c03a86fd2d2c92426db68e83a1862 | 5.0 kB |
| spotifydbdumpshare.sql | 3549b42e207a76ba5c20e650f1cd044e | 10.7 GB |
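
Because the data dump is large, it may be worth checking a download against the MD5 values listed above before importing it. The sketch below is one way to do that with Python's standard hashlib module; the function names and the chunk size are illustrative choices, and the files are assumed to sit in the current directory under their listed names.

```python
import hashlib

# MD5 checksums as listed for the two files in this record.
EXPECTED_MD5 = {
    "spotifydbdumpschemashare.sql": "015c03a86fd2d2c92426db68e83a1862",
    "spotifydbdumpshare.sql": "3549b42e207a76ba5c20e650f1cd044e",
}


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so the 10.7 GB dump never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 matches the expected hex string."""
    return file_md5(path) == expected.lower()
```

For example, verify("spotifydbdumpshare.sql", EXPECTED_MD5["spotifydbdumpshare.sql"]) should return True for an intact download.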

Additional details

Is documented by: Conference paper: 10.1007/978-3-030-43887-6_50 (DOI)

Papreja, Piyush, Hemanth Venkateswara, and Sethuraman Panchanathan. "Representation, exploration and recommendation of playlists." Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham, 2019.

