Dataset Open Access

Fair RecSys Datasets

Kowald Dominik

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">multimedia recommender systems</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">fairness</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">popularity bias</subfield>
  </datafield>
  <controlfield tag="005">20220302155528.0</controlfield>
  <controlfield tag="001">6123879</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2099899</subfield>
    <subfield code="z">md5:537b5cdaf8c02e34a2552cd47eb58a82</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2985860</subfield>
    <subfield code="z">md5:96cddcdc4dbb8b62ea1e7b96933415e7</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">9161489</subfield>
    <subfield code="z">md5:57a773a0c30c097dfc987a3fdb0b322e</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2040657</subfield>
    <subfield code="z">md5:6a879d1fc781e0b37c42bbbdc5f27deb</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2022-02-17</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Know-Center GmbH, TU Graz</subfield>
    <subfield code="0">(orcid)0000-0003-3230-6234</subfield>
    <subfield code="a">Kowald Dominik</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Fair RecSys Datasets</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Four multimedia recommender systems datasets to study popularity bias and fairness:&lt;/p&gt;

	&lt;li&gt;A dataset based on the LFM-1b dataset of JKU Linz&lt;/li&gt;
	&lt;li&gt;MovieLens, based on the MovieLens-1M dataset&lt;/li&gt;
	&lt;li&gt;BookCrossing, based on the BookCrossing dataset of Uni Freiburg&lt;/li&gt;
	&lt;li&gt;MyAnimeList, based on the MyAnimeList dataset of Kaggle&lt;/li&gt;

&lt;p&gt;Each dataset consists of user interactions (user_events.txt) and three user groups that differ in their inclination to popular/mainstream items: LowPop (low_main_users.txt), MedPop (med_main_users.txt), and HighPop (high_main_users.txt).&lt;/p&gt;

&lt;p&gt;The format of the three user files is &amp;quot;user,mainstreaminess&amp;quot;.&lt;/p&gt;

&lt;p&gt;The format of the user-events files is &amp;quot;user,item,preference&amp;quot;.&lt;/p&gt;

&lt;p&gt;Example Python code for analyzing the datasets, as well as more information on the user groups, can be found on GitHub and on arXiv.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.6123878</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.6123879</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
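
The record description gives the two file layouts as "user,mainstreaminess" (user group files) and "user,item,preference" (user_events.txt). The following is a minimal Python sketch of how such files might be read. It is not the author's published code. It assumes comma-separated fields, a numeric last field, and that any row not matching the layout (for example a header line, if one exists) can be skipped. The function names and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def _is_number(text):
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def read_user_group(handle):
    """Parse "user,mainstreaminess" rows into {user: mainstreaminess}."""
    group = {}
    for row in csv.reader(handle):
        # Assumption: rows that do not have exactly two fields, or whose
        # second field is not numeric (e.g. a header), are skipped.
        if len(row) != 2 or not _is_number(row[1]):
            continue
        group[row[0]] = float(row[1])
    return group


def read_user_events(handle):
    """Parse "user,item,preference" rows into {user: {item: preference}}."""
    events = defaultdict(dict)
    for row in csv.reader(handle):
        # Assumption: same skipping rule as above, with three fields.
        if len(row) != 3 or not _is_number(row[2]):
            continue
        user, item, preference = row
        # If a (user, item) pair repeats, the last row wins in this sketch.
        events[user][item] = float(preference)
    return dict(events)


def profile_sizes(events, group):
    """Count distinct items per user, restricted to users in `group`."""
    return {user: len(items) for user, items in events.items() if user in group}


if __name__ == "__main__":
    # Made-up sample rows for illustration only; not taken from the datasets.
    sample_events = io.StringIO("u1,i1,5\nu1,i2,3\nu2,i1,4\n")
    sample_low = io.StringIO("u1,0.12\n")
    events = read_user_events(sample_events)
    low_pop = read_user_group(sample_low)
    print(profile_sizes(events, low_pop))
```

With the real files, the in-memory samples would be replaced by file handles, for example open("user_events.txt", newline="") and open("low_main_users.txt", newline=""), after unpacking one of the four dataset archives.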
Statistic         All versions  This version
Views             306           306
Downloads         38            38
Data volume       145.6 MB      145.6 MB
Unique views      252           252
Unique downloads  21            21

