Published September 15, 2020 | Version v1
Dataset Open

User Groups for Robustness of Meta Matrix Factorization Against Decreasing Privacy Budgets

  • 1. Know-Center GmbH
  • 2. Graz University of Technology


This dataset comprises subsets of rating data from five datasets: Douban [1], Hetrec-MovieLens [2], MovieLens 1M [3], Ciao [4], and Jester [5]. Each subset contains rating data from three distinct user groups: users with few ratings (low), users with a medium number of ratings (med), and users with many ratings (high). Each row in the user files contains a user's id and that user's number of ratings. The rows of the ratings files are in the format (user_id, item_id, rating). For more details, we refer to our publication in

Douban
* 375 users (i.e., 125 users per user group)
* 32,191 items
* 266,517 ratings

Hetrec-MovieLens
* 318 users (i.e., 106 users per user group)
* 9,553 items
* 207,943 ratings

MovieLens 1M
* 906 users (i.e., 302 users per user group)
* 3,613 items
* 275,119 ratings

Ciao
* 1,107 users (i.e., 369 users per user group)
* 60,132 items
* 107,807 ratings

Jester
* 11,013 users (i.e., 3,671 users per user group)
* 100 items
* 618,768 ratings
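The row formats described above (user rows with an id and a rating count, rating rows as (user_id, item_id, rating)) can be checked with a short script. The following is a minimal sketch, not the authors' code: the sample rows are made up, and the comma delimiter and the helper names `read_ratings`, `read_users`, and `summarize` are assumptions, since the actual file names and delimiter are not stated here.

```python
import csv
import io
from collections import Counter

# Hypothetical sample content. The real file names and delimiter are not
# given in this description, so comma-separated text is an assumption.
RATINGS_TEXT = """\
u1,i10,4.0
u1,i11,3.5
u2,i10,5.0
u3,i12,2.0
u3,i10,1.0
u3,i11,4.5
"""
USERS_TEXT = """\
u1,2
u2,1
u3,3
"""


def read_ratings(handle, delimiter=","):
    """Parse rows of the form (user_id, item_id, rating)."""
    return [(u, i, float(r)) for u, i, r in csv.reader(handle, delimiter=delimiter)]


def read_users(handle, delimiter=","):
    """Parse rows of the form (user_id, number_of_ratings)."""
    return {u: int(n) for u, n in csv.reader(handle, delimiter=delimiter)}


def summarize(ratings):
    """Return (number of users, number of items, number of ratings)."""
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    return len(users), len(items), len(ratings)


ratings = read_ratings(io.StringIO(RATINGS_TEXT))
users = read_users(io.StringIO(USERS_TEXT))

# Recount the ratings per user and compare with the counts in the user file.
recount = Counter(u for u, _, _ in ratings)
consistent = all(recount[u] == n for u, n in users.items())
stats = summarize(ratings)
```

Run against a real subset, `stats` could be compared with the user, item, and rating counts listed above, and `consistent` indicates whether the user file agrees with the ratings file.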

The Python code for generating and utilizing this dataset can be found in

This work is supported by the H2020 project TRUSTS (GA: 871481) and the "DDAI" COMET Module within the COMET – Competence Centers for Excellent Technologies Programme, funded by the Austrian Federal Ministry for Transport, Innovation and Technology (bmvit), the Austrian Federal Ministry for Digital and Economic Affairs (bmdw), the Austrian Research Promotion Agency (FFG), the province of Styria (SFG) and partners from industry and academia. The COMET Programme is managed by FFG.

[1] Hu, L., Sun, A., Liu, Y.: Your neighbors affect your ratings: on geographical neighborhood influence to rating prediction. In: SIGIR’14 (2014)
[2] Cantador, I., Brusilovsky, P., Kuflik, T.: Second international workshop on information heterogeneity and fusion in recommender systems (HetRec2011). In: RecSys’11 (2011)
[3] Harper, F. M., Konstan, J. A.: The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TIIS) 5(4), 1–19 (2015)
[4] Guo, G., Zhang, J., Thalmann, D., Yorke-Smith, N.: ETAF: An extended trust antecedents framework for trust prediction. In: ASONAM’14 (2014)
[5] Goldberg, K., Roeder, T., Gupta, D., Perkins, C.: Eigentaste: A constant time collaborative filtering algorithm. Information Retrieval 4(2), 133–151 (2001)



Files (10.5 MB)
