Domain shares on Twitter containing news and misinformation
Contributors
Researcher:
Description
This dataset contains domain-sharing actions that occurred on Twitter during June 2017. Each domain-sharing action can be thought of as a triple (user_id, action_id, domain), where user_id is an anonymized Twitter account ID and action_id is an anonymized tweet ID. As the format below shows, a tweet that links to several domains lists all of them under one action_id, so it contributes one triple per domain. The source tweets were collected through Twitter's Decahose API. Each user in the dataset shared at least one news article and at least one article that can be labeled as misinformation.
The data is distributed in domain-shares.data in the following JSON format:
{
  "<user-id>": {
    "<action-id>": ["<domain>", "<domain>", ...],
    ...
  },
  ...
}
For example:
{
  "22359b28-93e1-4c13-a3eb-e72357b77c65": {
    "1": ["palmerreport.com"],
    "2": ["reuters.com", "abcn.ws"],
    "3": ["mobile.nytimes.com"]
  },
  "ffe79a32-d49d-4780-87b7-bb6417106067": {
    "4": ["dallasnews.com"]
  }
}
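The nested JSON structure can be flattened into (user_id, action_id, domain) triples with a few lines of Python. The following is a minimal sketch, not code from the dataset's repository: the helper name `iter_triples` is hypothetical, and the inline `SAMPLE` string stands in for the contents of domain-shares.data.

```python
import json

# Inline sample copied from the example above; it stands in for the
# contents of domain-shares.data.
SAMPLE = """
{
  "22359b28-93e1-4c13-a3eb-e72357b77c65": {
    "1": ["palmerreport.com"],
    "2": ["reuters.com", "abcn.ws"],
    "3": ["mobile.nytimes.com"]
  },
  "ffe79a32-d49d-4780-87b7-bb6417106067": {
    "4": ["dallasnews.com"]
  }
}
"""


def iter_triples(shares):
    """Yield one (user_id, action_id, domain) triple per shared domain.

    Hypothetical helper: `shares` is the parsed JSON object, i.e. a dict
    mapping user IDs to dicts that map action IDs to lists of domains.
    """
    for user_id, actions in shares.items():
        for action_id, domains in actions.items():
            for domain in domains:
                yield user_id, action_id, domain


shares = json.loads(SAMPLE)
triples = list(iter_triples(shares))

for triple in triples:
    print(triple)
```

To process the real file, `json.loads(SAMPLE)` would be replaced by `json.load` on an opened domain-shares.data, assuming the file holds a single JSON object as described.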
In addition, a TAB-separated version with (user id, action id, domain) triples is also available.
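The TAB-separated triples can be regrouped into the same nested structure as the JSON file. The sketch below makes several assumptions: the columns appear in the order (user id, action id, domain), the file has no header row, and the helper name `group_triples` is hypothetical. The inline sample stands in for the real file, whose name is not given here.

```python
import csv
import io
from collections import defaultdict

# Inline sample standing in for the TAB-separated file.
# Assumed column order: user id, action id, domain; no header row.
SAMPLE_TSV = (
    "22359b28-93e1-4c13-a3eb-e72357b77c65\t2\treuters.com\n"
    "22359b28-93e1-4c13-a3eb-e72357b77c65\t2\tabcn.ws\n"
    "ffe79a32-d49d-4780-87b7-bb6417106067\t4\tdallasnews.com\n"
)


def group_triples(lines):
    """Group (user id, action id, domain) rows into a nested dict.

    Hypothetical helper: returns {user_id: {action_id: [domain, ...]}}.
    """
    shares = defaultdict(lambda: defaultdict(list))
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        user_id, action_id, domain = row
        shares[user_id][action_id].append(domain)
    return shares


shares_from_tsv = group_triples(io.StringIO(SAMPLE_TSV))
print(len(shares_from_tsv), "users")
```

If the real file does start with a header row, that row would need to be skipped before grouping.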
For further information on how the dataset was constructed and on the analyses that have been conducted on it, please refer to the accompanying GitHub repository at https://github.com/dimitargnikolov/twitter-bias.
Files
(119.1 MB)
| Checksum | Size |
|---|---|
| md5:08dfdf3fcf88553b75253f4b9c4c1565 | 39.8 MB |
| md5:1809c60458e5ce27eca2602d87b654e9 | 79.3 MB |
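A downloaded file can be checked against the md5 values listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names (`md5_hex`, `file_chunks`, `matches_listing`) are hypothetical, and the listing does not say which checksum belongs to which file.

```python
import hashlib


def md5_hex(chunks):
    """Return the hexadecimal MD5 digest of an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_chunks(path, size=1 << 20):
    """Yield the contents of the file at `path` in chunks of `size` bytes."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:
                break
            yield chunk


def matches_listing(listed, actual_hex):
    """Compare a listed value such as 'md5:<hex>' with a computed digest."""
    listed_hex = listed.split(":", 1)[-1]  # drop the 'md5:' prefix if present
    return listed_hex.lower() == actual_hex.lower()
```

For example, `matches_listing("md5:08dfdf3fcf88553b75253f4b9c4c1565", md5_hex(file_chunks(path)))` would check one downloaded file against the first listed value, where `path` is wherever the file was saved locally.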