lastfm Music Recommendation Dataset
Description
This is a common Zenodo repository for both the lastfm-360K and lastfm-1K datasets. Details of both datasets follow, including license, acknowledgements, contact, and citation instructions.
LASTFM-360K (version 1.2, March 2010).
- What is this? This dataset contains <user, artist, plays> tuples (for ~360,000 users) collected from the Last.fm API, using the user.getTopArtists() method.
- Files:
- usersha1-artmbid-artname-plays.tsv (MD5: be672526eb7c69495c27ad27803148f1)
- usersha1-profile.tsv (MD5: 51159d4edf6a92cb96f87768aa2be678)
- mbox_sha1sum.py (MD5: feb3485eace85f3ba62e324839e6ab39)
- Data Statistics:
- Data Format: The data is formatted one entry per line as follows (tab separated "\t"):
- File usersha1-artmbid-artname-plays.tsv:
user-mboxsha1 \t musicbrainz-artist-id \t artist-name \t plays
- File usersha1-profile.tsv:
user-mboxsha1 \t gender (m|f|empty) \t age (int|empty) \t country (str|empty) \t signup (date|empty)
- Example:
- File usersha1-artmbid-artname-plays.tsv:
000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432 \t u2 \t 31 ...
- File usersha1-profile.tsv:
000063d3fe1cf2ba248b9e3c3f0334845a27a6be \t m \t 19 \t Mexico \t Apr 28, 2008 ...
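The tab-separated layout of usersha1-artmbid-artname-plays.tsv described above can be read with Python's standard csv module. The following is a minimal sketch, not an official loader: the names `PlayCount` and `read_play_counts` are hypothetical, and it assumes that the MusicBrainz artist ID column may be empty and that the plays column always holds an integer.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple, Optional


class PlayCount(NamedTuple):
    """One <user, artist, plays> tuple (hypothetical helper type)."""
    user_sha1: str
    artist_mbid: Optional[str]  # None when the MBID column is empty (assumption)
    artist_name: str
    plays: int


def read_play_counts(lines: Iterable[str]) -> Iterator[PlayCount]:
    """Yield one PlayCount per line that has exactly four tab-separated fields."""
    # QUOTE_NONE keeps quote characters inside artist names as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip malformed lines rather than guess their layout
        user, mbid, name, plays = (field.strip() for field in row)
        yield PlayCount(user, mbid or None, name, int(plays))


# In-memory sample built from the example record shown above.
sample = io.StringIO(
    "000063d3fe1cf2ba248b9e3c3f0334845a27a6be\t"
    "a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432\tu2\t31\n"
)
records = list(read_play_counts(sample))
print(records[0].artist_name, records[0].plays)
```

For the real file, an open text file handle can be passed in place of the sample, for example `open("usersha1-artmbid-artname-plays.tsv", encoding="utf-8", newline="")`; the UTF-8 encoding is an assumption, not something the dataset description states.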
LASTFM-1K (version 1.0, March 2010).
- What is this? This dataset contains <user, timestamp, artist, song> tuples collected from the Last.fm API, using the user.getRecentTracks() method. It covers the complete listening history (up to May 5th, 2009) of nearly 1,000 users.
- Files:
- userid-timestamp-artid-artname-traid-traname.tsv (MD5: 64747b21563e3d2aa95751e0ddc46b68)
- userid-profile.tsv (MD5: c53608b6b445db201098c1489ea497df)
- Data Statistics:
- File userid-timestamp-artid-artname-traid-traname.tsv:
- Total Lines: 19,150,868
- Unique Users: 992
- Artists with MBID: 107,528
- Artists without MBID: 69,420
- Data Format: The data is formatted one entry per line as follows (tab separated, "\t"):
- File userid-timestamp-artid-artname-traid-traname.tsv:
userid \t timestamp \t musicbrainz-artist-id \t artist-name \t musicbrainz-track-id \t track-name
- File userid-profile.tsv:
userid \t gender ('m'|'f'|empty) \t age (int|empty) \t country (str|empty) \t signup (date|empty)
- Example:
- File userid-timestamp-artid-artname-traid-traname.tsv:
user_000639 \t 2009-04-08T01:57:47Z \t MBID \t The Dogs D'Amour \t MBID \t Fall in Love Again?
user_000639 \t 2009-04-08T01:53:56Z \t MBID \t The Dogs D'Amour \t MBID \t Wait Until I'm Dead
...
- File userid-profile.tsv:
user_000639 \t m \t Mexico \t Apr 27, 2005 ...
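The six-column listening log described above can be parsed in a similar way. This is a rough sketch under stated assumptions: the function name `read_events` and the dictionary keys are hypothetical, the timestamps are assumed to always follow the `2009-04-08T01:57:47Z` pattern shown in the example, and the sample keeps the literal placeholder `MBID` from the example in place of real MusicBrainz identifiers.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Iterator


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one listening event per line that has exactly six fields."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 6:
            continue  # skip lines that do not match the documented layout
        userid, stamp, artist_mbid, artist_name, track_mbid, track_name = row
        # The trailing "Z" marks UTC, so the parsed value is tagged as UTC.
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        yield {
            "userid": userid,
            "timestamp": when.replace(tzinfo=timezone.utc),
            "artist_mbid": artist_mbid or None,
            "artist_name": artist_name,
            "track_mbid": track_mbid or None,
            "track_name": track_name,
        }


# Sample mirroring the example above; "MBID" is the placeholder used there.
sample = io.StringIO(
    "user_000639\t2009-04-08T01:57:47Z\tMBID\tThe Dogs D'Amour\tMBID\tFall in Love Again?\n"
    "user_000639\t2009-04-08T01:53:56Z\tMBID\tThe Dogs D'Amour\tMBID\tWait Until I'm Dead\n"
)
events = list(read_events(sample))

# Count listening events per artist name.
artist_counts = Counter(event["artist_name"] for event in events)
print(artist_counts.most_common(1))
```

Because the full file has about 19 million lines, iterating over the generator directly instead of calling `list()` keeps memory use low.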
LICENSE OF BOTH DATASETS. The data contained in both datasets is distributed with permission of Last.fm. The data is made available for non-commercial use. Those interested in using the data or web services in a commercial context should contact:
partners [at] last [dot] fm
For more information see Last.fm terms of service
ACKNOWLEDGEMENTS. Thanks to Last.fm for providing access to this data via their web services. Special thanks to Norman Casagrande.
REFERENCES. When using this dataset you must reference the Last.fm webpage. Optionally (not mandatory at all!), you can cite Chapter 3 of this book:
@book{Celma:Springer2010,
  author = {Celma, O.},
  title = {{Music Recommendation and Discovery in the Long Tail}},
  publisher = {Springer},
  year = {2010}
}
CONTACT: This data was collected by Òscar Celma @ MTG/UPF
Files
(1.2 GB)
MD5 | Size |
---|---|
md5:a79a6808f54f73354789a9fb02cb1e41 | 672.7 MB |
md5:635e6ed3fc873aa4ba33aba0ebce02b1 | 569.2 MB |
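The MD5 checksums published for each file can be checked after download. The sketch below uses Python's hashlib and reads the file in chunks so that large archives do not need to fit in memory. The function names `md5_of_file` and `verify_md5` and the 1 MiB chunk size are illustrative choices, and the demonstration runs on a throwaway temporary file rather than on the dataset itself.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with a published hexadecimal checksum."""
    return md5_of_file(path) == expected.strip().lower()


# Demonstration on a temporary file with known content.
payload = b"example\tpayload\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name
try:
    demo_ok = verify_md5(demo_path, hashlib.md5(payload).hexdigest())
    demo_bad = verify_md5(demo_path, "0" * 32)
finally:
    os.remove(demo_path)
print(demo_ok, demo_bad)
```

With a downloaded file, a call such as `verify_md5("usersha1-profile.tsv", "51159d4edf6a92cb96f87768aa2be678")` would compare against the checksum listed for that file; the path is assumed to point to wherever the file was extracted.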