Published February 20, 2019 | Version v1
Dataset Open

Replication data for: Reconciliation k-median: Clustering with non-polarized representatives

  • 1. Aalto University

Description

These files contain the data employed in the experiments described in Bruno Ordozgoiti and Aristides Gionis. 2019. Reconciliation k-median: Clustering with Non-Polarized Representatives. In Proceedings of the 2019 World Wide Web Conference (WWW’19), May 13–17, 2019, San Francisco, CA, USA.

Twitter IDs have been anonymized.

# Contents
domain_mentions.txt: Each line contains a domain name, a user ID and the number of times this user has mentioned this domain name in a tweet.
format: domain_name <TAB> user_id <TAB> mention_count
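
As a minimal sketch of how this file could be read, the following groups mention counts by user. It assumes the file has no header row and that `mention_count` is an integer. The function name `load_domain_mentions` is illustrative and not part of the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable


def load_domain_mentions(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Map each user ID to a {domain_name: mention_count} dictionary.

    Assumes one record per line: domain_name <TAB> user_id <TAB> mention_count,
    with no header row (an assumption, not stated in the description).
    """
    mentions: Dict[str, Dict[str, int]] = defaultdict(dict)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        domain, user_id, count = line.split("\t")
        # Sum counts in case the same (user, domain) pair appears twice.
        per_user = mentions[user_id]
        per_user[domain] = per_user.get(domain, 0) + int(count)
    return dict(mentions)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, for example `load_domain_mentions(open("domain_mentions.txt", encoding="utf-8"))`.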

domains_ideology_score.txt: Domain names and their ideology scores, estimated as described in Lahoti et al. (WSDM 2018). Note: missing scores can be retrieved from the supplementary data of https://doi.org/10.1093/poq/nfw006
format: domain_name <TAB> ideology_score
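
One possible way to combine these scores with per-user mention counts is a mention-weighted average, sketched below. This aggregation is only an illustration and is not claimed to be the procedure used in the paper. The sketch assumes scores are decimal numbers, that there is no header row, and that a missing score appears either as an absent domain or as an empty second field. Both function names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def load_ideology_scores(lines: Iterable[str]) -> Dict[str, float]:
    """Read domain_name <TAB> ideology_score lines into a dictionary.

    Lines without a usable score are skipped (assumed representation of
    a missing score).
    """
    scores: Dict[str, float] = {}
    for raw in lines:
        parts = raw.rstrip("\r\n").split("\t")
        if len(parts) < 2 or not parts[1].strip():
            continue  # blank line or missing score
        scores[parts[0]] = float(parts[1])
    return scores


def weighted_user_score(
    domain_counts: Dict[str, int], scores: Dict[str, float]
) -> Optional[float]:
    """Mention-weighted mean ideology score over the domains a user mentioned.

    Domains without a score are ignored. Returns None if no mentioned
    domain has a score.
    """
    total_mentions = 0
    weighted_sum = 0.0
    for domain, count in domain_counts.items():
        if domain in scores:
            total_mentions += count
            weighted_sum += count * scores[domain]
    if total_mentions == 0:
        return None
    return weighted_sum / total_mentions
```

Returning `None` instead of `0.0` for users with no scored domains keeps "no information" distinct from a neutral score.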

follow_graph.txt: The Twitter follower graph. Each line contains a user ID and the user ID of one of that user's followers.
format: user_id <TAB> follower_user_id
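
Since this file is large, it may be preferable to stream it line by line rather than load it whole. The sketch below yields directed edges pointing from follower to followed user and counts followers per user. It assumes no header row, and the function names are illustrative.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def iter_follow_edges(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (follower_id, followed_id) pairs, one per non-empty line.

    Each input line is user_id <TAB> follower_user_id, so the columns are
    swapped to give an edge that points from the follower to the followed user.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        user_id, follower_id = line.split("\t")
        yield follower_id, user_id


def follower_counts(lines: Iterable[str]) -> Counter:
    """Count how many followers each user has, reading lines lazily."""
    counts: Counter = Counter()
    for _follower, followed in iter_follow_edges(lines):
        counts[followed] += 1
    return counts
```

Passing an open file handle, for example `follower_counts(open("follow_graph.txt", encoding="utf-8"))`, keeps memory use proportional to the number of distinct users rather than the number of edges.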

representatives.txt: US Congress representatives, each with a Twitter handle and a polarity score computed using Barberá's method (Barberá, 2015).
format: rep_name <TAB> website_url <TAB> district <TAB> twitter_handle <TAB> party <TAB> barbera_polarity_score
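
The six-column layout can be read into typed records, as in the following sketch. It assumes no header row, no quoting inside fields, and a decimal polarity score. The `Representative` type and `load_representatives` function are hypothetical names introduced for illustration.

```python
import csv
from typing import Iterable, List, NamedTuple


class Representative(NamedTuple):
    name: str
    website_url: str
    district: str
    twitter_handle: str
    party: str
    polarity: float


def load_representatives(lines: Iterable[str]) -> List[Representative]:
    """Parse tab-separated representative records.

    Rows that do not have exactly six fields are skipped (an assumption
    about how to treat malformed lines).
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    reps: List[Representative] = []
    for row in reader:
        if len(row) != 6:
            continue
        name, url, district, handle, party, score = row
        reps.append(Representative(name, url, district, handle, party, float(score)))
    return reps
```

Named fields make later filtering readable, for example `[r for r in reps if r.polarity > 0]`. The interpretation of the score's sign is not specified in this description.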

user_polarity.txt: User IDs and their polarity scores computed using Barberá's method (Barberá, 2015).
format: user_id <TAB> barbera_polarity_score

Files (317.2 MB)

Checksum	Size
md5:9f0c493d354ab89969bccc9f325a2934	6.5 MB
md5:8457ce5f72bfdefc4e1cf41b321661ac	11.4 kB
md5:a4ef3a277ad195f695d3c347496d0c58	310.6 MB
md5:4aa60146cb58a6158e6370caa77d3eec	33.4 kB
md5:977ae3533f6105821efcb6b48529211b	101.5 kB
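
A downloaded file can be checked against its listed MD5 checksum with a sketch like the following. The listing here does not state which checksum belongs to which file, so the path and expected value are supplied by the caller. The function names are illustrative.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with a value such as 'md5:9f0c...'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

MD5 is used here only because it is the checksum the record lists. It detects corrupted downloads but is not suitable as a security guarantee.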

Additional details

Funding

European Commission
SoBigData: SoBigData Research Infrastructure (grant 654024)

References

  • Lahoti, Preethi, Kiran Garimella, and Aristides Gionis. "Joint non-negative matrix factorization for learning ideological leaning on twitter." Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 2018.
  • Barberá, Pablo. "Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data." Political Analysis 23.1 (2015): 76-91.