Dataset Open Access

#nowplaying-rs

Eva Zangerle; Asmita Poddar; Yi-Hsuan Yang


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Poddar, Asmita; Zangerle, Eva; Yang, Yi-Hsuan  #nowplaying-RS: A New Benchmark Dataset for Building Context-Aware Music Recommender Systems Inproceedings  Proceedings of the 15th Sound &amp; Music Computing Conference, Limassol, Cyprus, 2018.</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">context</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">recommender system</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">music</subfield>
  </datafield>
  <controlfield tag="005">20200124192514.0</controlfield>
  <controlfield tag="001">3247476</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">National University of Singapore</subfield>
    <subfield code="a">Asmita Poddar</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Academia Sinica, Taiwan</subfield>
    <subfield code="a">Yi-Hsuan Yang</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Innsbruck, Austria</subfield>
    <subfield code="0">(orcid)0000-0003-3195-8273</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Eva Zangerle</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">National University of Singapore</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Asmita Poddar</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Academia Sinica, Taiwan</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Yi-Hsuan Yang</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">26964118</subfield>
    <subfield code="z">md5:32d00b55dbeb2b617fd041c8156d3e86</subfield>
    <subfield code="u">https://zenodo.org/record/3247476/files/nowplaying_rs_train_test.zip</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2656519068</subfield>
    <subfield code="z">md5:b2fc636fbd3ea8ddf81165aab197108a</subfield>
    <subfield code="u">https://zenodo.org/record/3247476/files/nowplayingrs.zip</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-03-15</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-mir</subfield>
    <subfield code="o">oai:zenodo.org:3247476</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Innsbruck, Austria</subfield>
    <subfield code="0">(orcid)0000-0003-3195-8273</subfield>
    <subfield code="a">Eva Zangerle</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">#nowplaying-rs</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-mir</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">http://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The nowplaying-rs dataset provides context and content features of listening events. It contains 11.6 million music listening events of 139K users and 346K tracks collected from Twitter. The dataset comes with a rich set of item content features and user context features, as well as timestamps of the listening events. Moreover, some of the user context features imply the cultural origin of the users, while others, such as hashtags, give clues to the emotional state of a user underlying a listening event.&lt;/p&gt;

&lt;p&gt;The dataset contains three files:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;user_track_hashtag_timestamp.csv contains basic information about each listening event. For each listening event, we provide an id, the user_id, track_id, hashtag, and created_at.&lt;/li&gt;
	&lt;li&gt;context_content_features.csv contains all context and content features. For each listening event, we provide the id of the event, user_id, track_id, artist_id, content features regarding the track mentioned in the event (instrumentalness, liveness, speechiness, danceability, valence, loudness, tempo, acousticness, energy, mode, key) and context features regarding the listening event (coordinates (as geoJSON), place (as geoJSON), geo (as geoJSON), tweet_language, created_at, user_lang, time_zone, entities contained in the tweet).&lt;/li&gt;
	&lt;li&gt;sentiment_values.csv contains sentiment information for hashtags. It contains the hashtag itself and the sentiment values gathered via four different sentiment dictionaries: AFINN, Opinion Lexicon, SentiStrength Lexicon and VADER. For each of these dictionaries we list the minimum, maximum, sum and average of the sentiments of all tokens of the hashtag (if available, else we list empty values). However, as most hashtags consist of only a single token, these values are equal in most cases. Please note that the lexica are rather diverse and therefore resolve very different sets of terms to a score. Hence, the resulting csv is rather sparse. The file contains the following comma-separated values: &amp;lt;hashtag, vader_min, vader_max, vader_sum, vader_avg, afinn_min, afinn_max, afinn_sum, afinn_avg, ol_min, ol_max, ol_sum, ol_avg, ss_min, ss_max, ss_sum, ss_avg&amp;gt;, where we abbreviate all scores gathered over the Opinion Lexicon with the prefix &amp;#39;ol&amp;#39;. Similarly, &amp;#39;ss&amp;#39; stands for SentiStrength.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please note that user_track_hashtag_timestamp.csv and context_content_features.csv partly provide the same features. We deliberately chose to do so in order to provide usable files that do not have to be matched and joined with each other to perform, e.g., simple recommendation tasks.&lt;/p&gt;

&lt;p&gt;The training and test splits for the dataset are also available in this repository. In addition, Asmita Poddar provides prototypical implementations of a context-aware recommender system based on the dataset at https://github.com/asmitapoddar/nowplaying-RS-Music-Reco-FM.&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
If you make use of this dataset, please cite the following paper where we describe and experiment with the dataset:&lt;/p&gt;

&lt;p&gt;@inproceedings{smc18,&lt;br&gt;
title = {#nowplaying-RS: A New Benchmark Dataset for Building Context-Aware Music Recommender Systems},&lt;br&gt;
author = {Asmita Poddar and Eva Zangerle and Yi-Hsuan Yang},&lt;br&gt;
url = {http://mac.citi.sinica.edu.tw/~yang/pub/poddar18smc.pdf},&lt;br&gt;
year = {2018},&lt;br&gt;
date = {2018-07-04},&lt;br&gt;
booktitle = {Proceedings of the 15th Sound &amp;amp; Music Computing Conference},&lt;br&gt;
address = {Limassol, Cyprus},&lt;br&gt;
note = {code at https://github.com/asmitapoddar/nowplaying-RS-Music-Reco-FM},&lt;br&gt;
tppubtype = {inproceedings}&lt;br&gt;
}&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.2594537</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3247476</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
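
The 856 fields in the record above carry, for each downloadable file, its size in bytes (subfield `s`), an MD5 checksum (subfield `z`) and its URL (subfield `u`). The following is a minimal Python sketch of how these values could be read from the export and used to check a downloaded archive. It assumes the record is available as a string; the names `SAMPLE_RECORD`, `list_files` and `md5_matches` are hypothetical helpers introduced for illustration and are not part of Zenodo or the dataset.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace used by the MARC21 "slim" export shown above.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the record above, reduced to its two 856 file entries
# (an assumption made to keep the example self-contained).
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">26964118</subfield>
    <subfield code="z">md5:32d00b55dbeb2b617fd041c8156d3e86</subfield>
    <subfield code="u">https://zenodo.org/record/3247476/files/nowplaying_rs_train_test.zip</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2656519068</subfield>
    <subfield code="z">md5:b2fc636fbd3ea8ddf81165aab197108a</subfield>
    <subfield code="u">https://zenodo.org/record/3247476/files/nowplayingrs.zip</subfield>
  </datafield>
</record>"""


def list_files(record_xml):
    """Return one dict per 856 datafield: URL, size in bytes, MD5 hex digest."""
    root = ET.fromstring(record_xml)
    files = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        # Map each subfield code ("s", "z", "u") to its text content.
        subfields = {
            sf.get("code"): (sf.text or "").strip()
            for sf in field.findall("marc:subfield", MARC_NS)
        }
        files.append({
            "url": subfields["u"],
            "size": int(subfields["s"]),
            # The export prefixes the digest with "md5:"; keep only the hex part.
            "md5": subfields["z"].split(":", 1)[-1],
        })
    return files


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Hash a local file in chunks and compare it with the expected digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


if __name__ == "__main__":
    for entry in list_files(SAMPLE_RECORD):
        print(entry["url"], entry["size"], entry["md5"])
```

After downloading an archive, a call such as `md5_matches("nowplayingrs.zip", entry["md5"])` would indicate whether the local copy matches the checksum listed in the record. The file is read in chunks because the larger archive is about 2.6 GB according to its `s` subfield.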