580587
doi
10.5281/zenodo.580587
oai:zenodo.org:580587
Asmelash, Teka Hadgu
L3S Research Center
URLs from tweets for a 2014 sample of Twitter users and for a set of computer scientists
Robert Jäschke
University of Sheffield
doi:10.5281/zenodo.154583
doi:10.5281/zenodo.12942
doi:10.1371/journal.pone.0179630
info:eu-repo/semantics/openAccess
Creative Commons Attribution Share Alike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0/legalcode
Twitter
tweets
<p>The files in this dataset are used to analyse the tweeting behaviour of computer scientists on Twitter. They comprise</p>
<ul>
<li>a set of 989,529 tweet-URL pairs (<em>tweets_2014_researcher.tsv.bz2</em>) from 2014, posted by 6,271 users from the computer scientist sample at https://zenodo.org/record/12942, with each pair specified by time, tweet id, user id, and URL,</li>
<li>a set of 300,053,850 tweet ids (<em>tweets_2014_sample.tsv.bz2</em>) from the 1% Twitter stream sample from 2014,</li>
<li>a set of 605,080 tweet-URL pairs (<em>tweets_2014_sample_6694_users.tsv.bz2</em>) from the 2014 1% Twitter stream sample for 6,694 users, with each pair specified by time, tweet id, user id, and URL,</li>
<li>a set of the top 10,000 host names (<em>MAG_hosts_10000.tsv</em>) from the Microsoft Academic Graph data (http://blogs.msdn.com/b/msr_er/archive/2015/06/26/announcing-the-microsoft-academic-graph-let-the-research-begin.aspx), specified by rank, URL count, and host name, and</li>
<li>a set of 340 host names of URL shortening services (<em>url_shortening_services.tsv</em>).</li>
</ul>
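<p>A minimal sketch of how the tweet-URL files could be read, assuming they are tab-separated, have no header row, and list the four fields in the order given above (time, tweet id, user id, URL). The names <em>TweetUrl</em>, <em>read_tweet_urls</em>, and <em>open_bz2_tsv</em> are hypothetical helpers, not part of the dataset, and the time value is kept as a plain string because its format is not stated here.</p>

```python
import bz2
import csv
from typing import IO, Iterator, NamedTuple


class TweetUrl(NamedTuple):
    """One tweet-URL pair; the column order is an assumption."""
    time: str
    tweet_id: str
    user_id: str
    url: str


def read_tweet_urls(stream: IO[str]) -> Iterator[TweetUrl]:
    """Yield one TweetUrl per well-formed line of a text stream."""
    # QUOTE_NONE keeps quote characters inside URLs untouched.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            # Skip lines that do not match the assumed four-column layout.
            continue
        yield TweetUrl(*row)


def open_bz2_tsv(path_or_file) -> IO[str]:
    """Open a .tsv.bz2 file (path or binary file object) as UTF-8 text."""
    return bz2.open(path_or_file, mode="rt", encoding="utf-8", newline="")
```

<p>For example, <em>open_bz2_tsv("tweets_2014_researcher.tsv.bz2")</em> could be passed to <em>read_tweet_urls</em> to stream the pairs without decompressing the whole file to disk.</p>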
<p>In addition, the following rankings (based on the odds ratio) of domains, hosts, and URLs that appear in both the researcher dataset and the sample are included:</p>
<ul>
<li><em>domains_by_odds_ratio.tsv.bz2</em> - a ranking of 61,860 domains,</li>
<li><em>hosts_by_odds_ratio.tsv.bz2</em> - a ranking of 80,384 hosts,</li>
<li><em>publisher_domains_by_odds_ratio.tsv.bz2</em> - a ranking of 924 publisher domains,</li>
<li><em>publisher_urls_by_odds_ratio.tsv.bz2</em> - a ranking of 4,227 publisher URLs.</li>
</ul>
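<p>The idea behind such a ranking can be sketched as follows, assuming the standard odds ratio: for a host seen a times among A researcher URLs and b times among B sample URLs, the ratio is (a / (A - a)) / (b / (B - b)). The exact definition and any preprocessing used for the included files are not stated here, so the formula, the host extraction, and the names <em>host_of</em>, <em>odds_ratio</em>, and <em>rank_hosts</em> are illustrative assumptions.</p>

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL, or "" if none is found."""
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        # urlsplit rejects some malformed URLs; treat them as host-less.
        return ""


def odds_ratio(a: int, total_a: int, b: int, total_b: int) -> float:
    """Standard odds ratio (assumed definition, see lead-in)."""
    return (a / (total_a - a)) / (b / (total_b - b))


def rank_hosts(researcher_urls: Iterable[str],
               sample_urls: Iterable[str]) -> List[Tuple[str, float]]:
    """Rank hosts that occur in both URL collections by odds ratio."""
    r = Counter(host_of(u) for u in researcher_urls)
    s = Counter(host_of(u) for u in sample_urls)
    r.pop("", None)
    s.pop("", None)
    total_r, total_s = sum(r.values()), sum(s.values())

    ranking = []
    for host in r.keys() & s.keys():
        if r[host] == total_r or s[host] == total_s:
            # The odds are undefined when one host accounts for every URL.
            continue
        ranking.append((host, odds_ratio(r[host], total_r, s[host], total_s)))

    # Highest odds ratio first; ties are broken alphabetically.
    ranking.sort(key=lambda item: (-item[1], item[0]))
    return ranking
```

<p>Under this definition, a value above 1 means a host is relatively more common among the researchers' URLs than in the sample.</p>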
This is an updated and extended version of 10.5281/zenodo.154583. A new sample of users was used, resulting in an updated file tweets_2014_sample_6694_users.tsv.bz2. In addition, domain, host, and URL rankings have been added.
Zenodo
2017-05-17
info:eu-repo/semantics/other
785910
1579893962.827935
8120
md5:10e489478e9076e76d158c18e95f51bc
https://zenodo.org/records/580587/files/publisher_domains_by_odds_ratio.tsv.bz2
444954
md5:299e3ec2469d3a91582e592a2fc0aa1e
https://zenodo.org/records/580587/files/domains_by_odds_ratio.tsv.bz2
2295230252
md5:d0ea5705cb86480a0f22a1c7439533b4
https://zenodo.org/records/580587/files/tweets_2014_sample.tsv.bz2
12227572
md5:2dff10a6301cb97c53a653a65019199c
https://zenodo.org/records/580587/files/tweets_2014_sample_6694_users.tsv.bz2
31993560
md5:6c466537064b5a5574734f418893b199
https://zenodo.org/records/580587/files/tweets_2014_researcher.tsv.bz2
2967
md5:1f040245142c7309b9c46f897f79f7ce
https://zenodo.org/records/580587/files/url_shortening_services.tsv
298167
md5:bf92fe9d92a45949d44037a81356b82b
https://zenodo.org/records/580587/files/MAG_hosts_10000.tsv
84262
md5:e5f563f85a2ea56fac3b20109e1c2402
https://zenodo.org/records/580587/files/publisher_urls_by_odds_ratio.tsv.bz2
619682
md5:bd959f2b67bc50e746a4740d8969f18c
https://zenodo.org/records/580587/files/hosts_by_odds_ratio.tsv.bz2
public
10.5281/zenodo.154583
Is new version of
doi
10.5281/zenodo.12942
Is supplement to
doi
10.1371/journal.pone.0179630
Is supplement to
doi
isVersionOf
doi
PLoS ONE
12
6
2017-05-17