Dataset Open Access

A Dataset of State-Censored Tweets

Elmas, Tuğrulcan; Overdorf, Rebekah; Aberer, Karl

This is the dataset associated with the paper of the same name. You can find it here: https://arxiv.org/abs/2101.05919


Files:

  • tweets.csv : All 583k censored tweets
  • tweets_debiased.csv : Debiased sample of tweets (Section 6.1)
  • all_users.csv : All users who are censored at least once
  • users.csv : All 4301 users whose entire profile is censored
  • users_inferred.csv : 1931 extra users inferred to be censored by the procedure described in Section 3.3
  • supplement.csv : The supplementary tweet data (Section 3.5)

Please refer to the GitHub repository below for detailed documentation and the code for reproduction.

https://github.com/tugrulz/CensoredTweets

Files (445.3 MB)

  • all_users.csv : 2.2 MB, md5:0f450a646503986fbe8c80ffb58be4bd
  • buzzfeed_users.csv : 23.2 kB, md5:a6f10a0314cc8ca4598ee1ccb5143e48
  • supplement.csv : 430.9 MB, md5:121c28bd0d6f2be0ee4e81c3f702c465
  • tweets.csv : 11.3 MB, md5:8859c72022ec01ecf582dd1d3c24bd3b
  • tweets_debiased.csv : 775.2 kB, md5:df73ca41948b942d36b86184511783c6
  • users.csv : 63.2 kB, md5:e394f9ecf149f7172de574df6e51ff7b
  • users_inferred.csv : 30.1 kB, md5:74af1ce74e538b407b420f40ee72c5b4
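
The md5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the dataset's own tooling: it assumes the files were saved under their original names in a single directory, and the helper names `md5_of_file` and `verify_downloads` are hypothetical, introduced here only for illustration.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "all_users.csv": "0f450a646503986fbe8c80ffb58be4bd",
    "buzzfeed_users.csv": "a6f10a0314cc8ca4598ee1ccb5143e48",
    "supplement.csv": "121c28bd0d6f2be0ee4e81c3f702c465",
    "tweets.csv": "8859c72022ec01ecf582dd1d3c24bd3b",
    "tweets_debiased.csv": "df73ca41948b942d36b86184511783c6",
    "users.csv": "e394f9ecf149f7172de574df6e51ff7b",
    "users_inferred.csv": "74af1ce74e538b407b420f40ee72c5b4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so the 430.9 MB supplement need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.exists() else None
    return results


if __name__ == "__main__":
    # Assumption: the downloads sit in the current working directory.
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify_downloads().items():
        print(f"{name}: {labels[ok]}")
```

A file reported as MISMATCH is most likely a truncated or interrupted download and should be fetched again.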
