Dataset Open Access

URLs from tweets for a 2014 sample of Twitter users and for a set of computer scientists

Robert Jäschke


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:contributor>Asmelash, Teka Hadgu</dc:contributor>
  <dc:creator>Robert Jäschke</dc:creator>
  <dc:date>2017-05-17</dc:date>
  <dc:description>The files in this dataset are used to analyse the tweeting behaviour of computer scientists on Twitter. They comprise


	a set of 989,529 tweet-URL pairs (tweets_2014_researcher.tsv.bz2) from 2014, posted by 6,271 users of the computer scientists sample in https://zenodo.org/record/12942, each specified by time, tweet id, user id, and URL,
	a set of 300,053,850 tweet ids (tweets_2014_sample.tsv.bz2) from the 1% Twitter stream sample from 2014,
	a set of 605,080 tweet-URL pairs (tweets_2014_sample_6694_users.tsv.bz2) from the 1% Twitter stream sample from 2014 for 6,694 users, each specified by time, tweet id, user id, and URL,
	a set of the top 10,000 host names (MAG_hosts_10000.tsv) from the Microsoft Academic Graph data (http://blogs.msdn.com/b/msr_er/archive/2015/06/26/announcing-the-microsoft-academic-graph-let-the-research-begin.aspx), specified by rank, URL count, and host name, and
	a set of 340 host names of URL shortening services (url_shortening_services.tsv).


In addition, the following rankings (based on the odds ratio) of domains, hosts, and URLs that appear in both the researcher dataset and the sample are included:


	domains_by_odds_ratio.tsv.bz2 - a ranking of 61,860 domains,
	hosts_by_odds_ratio.tsv.bz2 - a ranking of 80,384 hosts,
	publisher_domains_by_odds_ratio.tsv.bz2 - a ranking of 924 publisher domains,
	publisher_urls_by_odds_ratio.tsv.bz2 - a ranking of 4,227 publisher URLs.
</dc:description>
  <dc:identifier>https://zenodo.org/record/580587</dc:identifier>
  <dc:identifier>10.5281/zenodo.580587</dc:identifier>
  <dc:identifier>oai:zenodo.org:580587</dc:identifier>
  <dc:relation>doi:10.5281/zenodo.154583</dc:relation>
  <dc:relation>doi:10.5281/zenodo.12942</dc:relation>
  <dc:relation>doi:10.1371/journal.pone.0179630</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by-sa/4.0/</dc:rights>
  <dc:source>PLoS ONE 12(6)</dc:source>
  <dc:subject>Twitter</dc:subject>
  <dc:subject>tweets</dc:subject>
  <dc:title>URLs from tweets for a 2014 sample of Twitter users and for a set of computer scientists</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
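
The description above lists bz2-compressed TSV files of tweet-URL pairs and rankings based on the odds ratio. The following Python sketch shows one way such a file might be read and how an odds ratio for a host could be computed. It rests on assumptions that the record does not confirm: that each row has exactly four tab-separated columns in the listed order (time, tweet id, user id, URL), that there is no header row, and that the ranking uses the textbook odds ratio. The function names `read_tweet_urls`, `count_hosts`, and `odds_ratio` are hypothetical helpers, not part of the dataset.

```python
import bz2
import csv
from collections import Counter
from urllib.parse import urlsplit


def read_tweet_urls(path_or_file):
    """Yield (time, tweet_id, user_id, url) tuples from a bz2-compressed TSV.

    Assumption: four tab-separated columns in the order given in the
    dataset description, and no header row.
    """
    with bz2.open(path_or_file, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 4:
                continue  # skip rows that do not match the assumed layout
            yield tuple(row)


def count_hosts(rows):
    """Count how often each (lower-cased) host name occurs among the URLs."""
    counts = Counter()
    for _time, _tweet_id, _user_id, url in rows:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 literal
        if host:
            counts[host] += 1
    return counts


def odds_ratio(a, total_a, b, total_b):
    """Textbook odds ratio (a / (total_a - a)) / (b / (total_b - b)).

    Assumption: a and b are the counts of one host in the researcher data
    and in the sample, total_a and total_b the overall URL counts. The exact
    formula behind the published rankings is not stated in this record.
    Requires b > 0 and a < total_a.
    """
    return (a * (total_b - b)) / (b * (total_a - a))
```

Because the rankings only cover items that appear in both the researcher dataset and the sample, `b > 0` holds for every ranked item under this reading; an item that accounts for all URLs of the researcher data would still need separate handling.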
