Fair RecSys Datasets

DOI: 10.5281/zenodo.6123879
URL: https://zenodo.org/records/6123879
OAI identifier: oai:zenodo.org:6123879
Creator: Kowald, Dominik (ORCID 0000-0003-3230-6234), Know-Center GmbH, TU Graz
Publisher: Zenodo
Publication year: 2022
Keywords: multimedia recommender systems, fairness, popularity bias
Dates: 2022-02-17, 2023-02-22
Language: English
Related DOI: 10.5281/zenodo.6123878
Version: 2.0
License: Creative Commons Attribution 4.0 International

Four multimedia recommender systems datasets to study popularity bias and fairness:

- Last.fm (lfm.zip), based on the LFM-1b dataset of JKU Linz (http://www.cp.jku.at/datasets/LFM-1b/)
- MovieLens (ml.zip), based on the MovieLens-1M dataset (https://grouplens.org/datasets/movielens/1m/)
- BookCrossing (book.zip), based on the BookCrossing dataset of Uni Freiburg (http://www2.informatik.uni-freiburg.de/~cziegler/BX/)
- MyAnimeList (anime.zip), based on the MyAnimeList dataset of Kaggle (https://www.kaggle.com/CooperUnion/anime-recommendations-database)

Each dataset consists of user interactions (user_events.txt) and three user groups that differ in their inclination to popular/mainstream items: LowPop (low_main_users.txt), MedPop (med_main_users.txt), and HighPop (high_main_users.txt).

The format of the three user files is "user,mainstreaminess".
The format of the user-events files is "user,item,preference".

Example Python code for analyzing the datasets, as well as more information on the user groups, can be found on GitHub (https://github.com/domkowald/FairRecSys) and on arXiv (https://arxiv.org/abs/2203.00376).
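
The two file formats described above ("user,mainstreaminess" and "user,item,preference") can be read with Python's standard csv module. The following is a minimal sketch, not the author's analysis code (which is in the GitHub repository). It assumes the files are comma-separated with no header row and that mainstreaminess and preference are numeric. The function names and the popularity measure used here (the fraction of users who interacted with an item) are illustrative assumptions, and the sample rows are made up so that the snippet runs without the archives.

```python
import csv
import io
from collections import defaultdict


def read_user_groups(handle):
    """Parse 'user,mainstreaminess' lines into {user: mainstreaminess}.

    Assumes no header row and a numeric second column.
    """
    return {row[0]: float(row[1]) for row in csv.reader(handle) if row}


def read_user_events(handle):
    """Parse 'user,item,preference' lines into (user, item, preference) tuples.

    Assumes no header row and a numeric third column.
    """
    return [(row[0], row[1], float(row[2])) for row in csv.reader(handle) if row]


def item_popularity(events):
    """Illustrative popularity measure: fraction of users who interacted with each item."""
    users_per_item = defaultdict(set)
    all_users = set()
    for user, item, _ in events:
        users_per_item[item].add(user)
        all_users.add(user)
    return {item: len(users) / len(all_users) for item, users in users_per_item.items()}


def mean_profile_popularity(events, group_users, popularity):
    """Average item popularity over all interactions of the users in one group."""
    values = [popularity[item] for user, item, _ in events if user in group_users]
    return sum(values) / len(values) if values else 0.0


# Made-up sample rows standing in for user_events.txt and low_main_users.txt.
sample_events = io.StringIO("u1,i1,5\nu1,i2,3\nu2,i1,4\nu3,i1,2\nu3,i3,1\n")
sample_low = io.StringIO("u3,0.1\n")

events = read_user_events(sample_events)
low_users = read_user_groups(sample_low)
popularity = item_popularity(events)
low_pop_score = mean_profile_popularity(events, low_users, popularity)
```

To run this on the real data, replace the io.StringIO objects with file handles opened on user_events.txt and one of the three user files extracted from the same archive. Repeating the last step for the MedPop and HighPop files gives one score per group that can be compared across groups.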