Scored.co Hypernetwork Dataset
Description
Scored.co is a Reddit-like social platform that became a refuge for many far-right and alt-right communities after Reddit's mass bans in 2020. Some of them, e.g., c/TheDonald and c/GreatAwakening, kept their names and key figures, effectively continuing the original communities. The migration has allowed these communities to persist, and even thrive, in an environment where they can share their views without the risk of being banned. Scored.co is an emerging yet understudied platform that potentially hosts dangerous content.
This dataset contains higher-order interactions between Scored.co users, along with aggregated user profiles/statistics.
All data is anonymized, and no personally identifiable information is released.
Note on Versions
Version 1.0.0 (current) contains only data from the c/TheDonald community during 2023. Future updates will include data from other communities and time spans.
Dataset
The dataset comprises the following files:
- 2023-thedonald-threads.csv contains information on discussion threads (higher-order interactions), including temporal information.
- 2023-thedonald-sent.csv contains the average sentiment of each user in each month of 2023. Sentiment is computed via the VADER algorithm.
- 2023-thedonald-tox.csv contains the average toxicity of each user in each month of 2023. Toxicity is computed via the Detoxify-unbiased-small model.
- 2023-thedonald-score.csv contains the average post score of each user in each month of 2023. Post scores are obtained via the API during collection.
- 2023-thedonald-other.csv contains several other language-related statistics for each user in each month of 2023. These include pleasure/arousal/dominance from the PAD psychological model, Plutchik's wheel of emotions, and Moral Foundations scores.
- collection.py contains the Python script used to gather data from Scored.co's official API.
For further information on fields and volumes, please refer to the data paper.
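As an illustration of what "higher-order interactions" means here, the following minimal sketch treats each discussion thread as a hyperedge, i.e., the set of users who took part in it. The column names `thread_id`, `user_id`, and `timestamp`, as well as the sample rows, are assumptions made for the example and not the actual schema of 2023-thedonald-threads.csv; check the data paper for the real fields before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the column names and values are illustrative only
# and do not reflect the real schema of 2023-thedonald-threads.csv.
SAMPLE = """thread_id,user_id,timestamp
t1,u1,2023-01-03
t1,u2,2023-01-03
t1,u3,2023-01-04
t2,u2,2023-02-10
t2,u3,2023-02-10
t3,u1,2023-02-11
"""


def build_hyperedges(rows):
    """Map each thread to the set of users who took part in it."""
    edges = defaultdict(set)
    for row in rows:
        edges[row["thread_id"]].add(row["user_id"])
    return dict(edges)


def node_degrees(edges):
    """Count how many hyperedges (threads) each user belongs to."""
    degrees = defaultdict(int)
    for members in edges.values():
        for user in members:
            degrees[user] += 1
    return dict(degrees)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hyperedges = build_hyperedges(rows)
degrees = node_degrees(hyperedges)
```

To apply the same idea to the real file, replace `io.StringIO(SAMPLE)` with an open file handle and adjust the two column names to match the published schema.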
Citation
If used for research purposes, please cite the following paper describing the dataset details:
TBD
Acknowledgements
This work is supported by:
- the European Union – Horizon 2020 Program under the scheme “INFRAIA-01-2018-2019 – Integrating Activities for Advanced Communities”, Grant Agreement n.871042, “SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics” (http://www.sobigdata.eu);
- SoBigData.it, which receives funding from the European Union – NextGenerationEU – National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) – Project: “SoBigData.it – Strengthening the Italian RI for Social Mining and Big Data Analytics” – Prot. IR0000013 – Avviso n. 3264 del 28/12/2021;
- the EU NextGenerationEU programme under the funding scheme PNRR-PE-AI FAIR (Future Artificial Intelligence Research).
Files
Total size: 71.8 MB

MD5 checksum | Size |
---|---|
md5:c7cd6e08fd1e1b055d7401376560344c | 14.8 MB |
md5:cda2774b2b6f1d111b77dc262a63a86d | 2.2 MB |
md5:096d76adaaabed09048afd0395b07f63 | 2.4 MB |
md5:29ad329dffc0aab9d51c199eebaa6de4 | 50.1 MB |
md5:786aa821a099a8144992ceee093ec790 | 2.3 MB |
md5:1f6e180efd94eaed8aa3a21dc70d7817 | 3.9 kB |
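The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below shows one way to do this with Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and pairing a checksum with the right file is left to the reader, since that mapping is not stated here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large CSV files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("some-downloaded-file.csv", "md5:<hex digest from the table>")` returns `True` only when the local file hashes to the listed value.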