Dataset Open Access

Dataset for "On the Origins of Memes by Means of Fringe Web Communities"

Savvas Zannettou; Tristan Caulfield; Jeremy Blackburn; Emiliano De Cristofaro; Michael Sirivianos; Gianluca Stringhini; Guillermo Suarez-Tangil

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Savvas Zannettou</dc:creator>
  <dc:creator>Tristan Caulfield</dc:creator>
  <dc:creator>Jeremy Blackburn</dc:creator>
  <dc:creator>Emiliano De Cristofaro</dc:creator>
  <dc:creator>Michael Sirivianos</dc:creator>
  <dc:creator>Gianluca Stringhini</dc:creator>
  <dc:creator>Guillermo Suarez-Tangil</dc:creator>
  <dc:description>This dataset was collected with research funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No 691025.
The publication in which this dataset was used is: "On the Origins of Memes by Means of Fringe Web Communities". Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, and Guillermo Suarez-Tangil. ACM Internet Measurement Conference (IMC), 2018. DOI:


The dataset consists of the URLs and perceptual hashes (pHashes) of all images posted on Twitter, Reddit, 4chan's /pol/, and Gab between July 2016 and the end of July 2017.

The code related to this research can be found here:, or here: 10.5281/zenodo.1463050

Presentation available here:</dc:description>
  <dc:title>Dataset for "On the Origins of Memes by Means of Fringe Web Communities"</dc:title>