Dataset Open Access

Fair RecSys Datasets

Kowald Dominik


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.6123879", 
  "language": "eng", 
  "title": "Fair RecSys Datasets", 
  "issued": {
    "date-parts": [
      [
        2022, 
        2, 
        17
      ]
    ]
  }, 
  "abstract": "<p>Four multimedia recommender systems datasets to study popularity bias and fairness:</p>\n\n<ol>\n\t<li>Last.fm (lfm.zip), based on the LFM-1b dataset of JKU Linz (http://www.cp.jku.at/datasets/LFM-1b/)</li>\n\t<li>MovieLens (ml.zip), based on the MovieLens-1M dataset (https://grouplens.org/datasets/movielens/1m/)</li>\n\t<li>BookCrossing (book.zip), based on the BookCrossing dataset of Uni Freiburg (http://www2.informatik.uni-freiburg.de/~cziegler/BX/)</li>\n\t<li>MyAnimeList (anime.zip), based on the MyAnimeList dataset of Kaggle (https://www.kaggle.com/CooperUnion/anime-recommendations-database)</li>\n</ol>\n\n<p>Each dataset consists of user interactions (user_events.txt) and three user groups that differ in their inclination to popular/mainstream items: LowPop (low_main_users.txt), MedPop (med_main_users.txt), and HighPop (high_main_users.txt).</p>\n\n<p>The format of the three user files is &quot;user,mainstreaminess&quot;.</p>\n\n<p>The format of the user-events file is &quot;user,item,preference&quot;.</p>\n\n<p>Example Python code for analyzing the datasets, as well as more information on the user groups, can be found on GitHub (https://github.com/domkowald/FairRecSys) and on arXiv (https://arxiv.org/abs/2203.00376).</p>", 
  "author": [
    {
      "family": "Kowald Dominik"
    }
  ], 
  "version": "1.0", 
  "type": "dataset", 
  "id": "6123879"
}
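
The two file formats named in the abstract ("user,mainstreaminess" and "user,item,preference") can be read with Python's standard csv module. The following is a minimal sketch, not the authors' code: it assumes comma-separated rows with no header line and numeric mainstreaminess and preference values, none of which the record confirms. The function names and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict


def read_user_groups(handle):
    """Parse 'user,mainstreaminess' rows into {user: mainstreaminess}."""
    # Assumption: no header row, second column is numeric.
    return {row[0]: float(row[1]) for row in csv.reader(handle) if row}


def read_user_events(handle):
    """Parse 'user,item,preference' rows into {user: {item: preference}}."""
    events = defaultdict(dict)
    for row in csv.reader(handle):
        if row:  # skip blank lines
            user, item, preference = row[0], row[1], float(row[2])
            events[user][item] = preference
    return dict(events)


# Made-up sample rows standing in for the real files.
sample_users = io.StringIO("u1,0.12\nu2,0.87\n")
sample_events = io.StringIO("u1,i1,3.0\nu1,i2,5.0\nu2,i1,4.0\n")

groups = read_user_groups(sample_users)
events = read_user_events(sample_events)

# Item popularity, counted here as the number of users who interacted with it.
popularity = defaultdict(int)
for items in events.values():
    for item in items:
        popularity[item] += 1
popularity = dict(popularity)
```

To apply this to a real dataset, unzip one of the archives and pass an opened file such as open("user_events.txt", newline="") in place of the io.StringIO objects. If the files turn out to contain a header row or a different delimiter, the parsing would need to be adjusted accordingly.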
                  All versions   This version
Views             304            304
Downloads         38             38
Data volume       145.6 MB       145.6 MB
Unique views      250            250
Unique downloads  21             21
