Published September 21, 2016 | Version v1
Dataset Open

URLs from tweets for a 2014 sample of Twitter users and for a set of computer scientists

Authors/Creators

  • 1. University of Sheffield

Contributors

Data collector:

  • 1. L3S Research Center

Description

The files in this dataset are used to analyse the tweeting behaviour of computer scientists on Twitter. They comprise:

  • a set of 989,529 tweet-URL pairs (tweets_2014_researcher.tsv.bz2) posted in 2014 by 6,271 users from the computer scientist sample at https://zenodo.org/record/12942, each pair specified by time, tweet id, user id, and URL,
  • a set of 300,053,850 tweet ids (tweets_2014_sample.tsv.bz2) from the 1% Twitter stream sample from 2014,
  • a set of 671,304 tweet-URL pairs (tweets_2014_sample_6271_users.tsv.bz2) from the 1% Twitter stream sample from 2014 for 6,271 users specified by time, tweet id, user id, and URL,
  • a set of the top 10,000 host names (MAG_hosts_10000.tsv) from the Microsoft Academic Graph data (http://blogs.msdn.com/b/msr_er/archive/2015/06/26/announcing-the-microsoft-academic-graph-let-the-research-begin.aspx), specified by rank, URL count, and host name, and
  • a set of 340 host names of URL shortening services (url_shortening_services.tsv).
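
The tweet-URL files listed above can be read without decompressing them to disk. The following is a minimal sketch, not the authors' own tooling. It assumes that each row holds four tab-separated fields in the order given in the description (time, tweet id, user id, URL) and that there is no header row; neither detail is confirmed by this record. The function names iter_tweet_urls and count_hosts are hypothetical.

```python
import bz2
import csv
from collections import Counter
from urllib.parse import urlsplit


def iter_tweet_urls(path):
    """Yield (time, tweet_id, user_id, url) tuples from a bz2-compressed TSV.

    Assumption: four tab-separated columns in the order the description
    lists them, no header row. Rows with another column count are skipped.
    """
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) == 4:
                yield tuple(row)


def count_hosts(rows):
    """Count host names over an iterable of tweet-URL rows."""
    counts = Counter()
    for _time, _tweet_id, _user_id, url in rows:
        try:
            host = urlsplit(url).hostname  # lower-cased, or None if absent
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            counts[host] += 1
    return counts
```

Calling count_hosts(iter_tweet_urls("tweets_2014_researcher.tsv.bz2")) streams the file row by row, so the 2.3 GB sample file would not need to fit in memory either. The resulting host names could then be compared against MAG_hosts_10000.tsv or url_shortening_services.tsv.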

Files (2.3 GB)

Size        MD5 checksum
298.2 kB    md5:bf92fe9d92a45949d44037a81356b82b
32.0 MB     md5:6c466537064b5a5574734f418893b199
2.3 GB      md5:d0ea5705cb86480a0f22a1c7439533b4
13.6 MB     md5:b467076317f32d5fc7a1ebb4e2d25997
3.0 kB      md5:1f040245142c7309b9c46f897f79f7ce
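
A downloaded file can be checked against its listed MD5 checksum. The sketch below is one way to do this with the Python standard library. The helper names md5_of_file and matches_listed_md5 are hypothetical, and because the listing here does not pair checksums with file names, the path and checksum passed in are whatever the reader has matched up themselves.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant, which matters for the 2.3 GB file.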

Additional details

Related works

Is supplement to: 10.5281/zenodo.12942 (DOI)