Dataset Open Access

Dataset for "On the Origins of Memes by Means of Fringe Web Communities"

Savvas Zannettou; Tristan Caulfield; Jeremy Blackburn; Emiliano De Cristofaro; Michael Sirivianos; Gianluca Stringhini; Guillermo Suarez-Tangil

This dataset was collected with research funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No 691025.
The publication in which this dataset was used is: "On the Origins of Memes by Means of Fringe Web Communities". Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, and Guillermo Suarez-Tangil. ACM Internet Measurement Conference (IMC), 2018. DOI: https://doi.org/10.5281/zenodo.1323551

The dataset consists of all the URLs and perceptual hashes (phashes) of images posted on Twitter, Reddit, 4chan's /pol/, and Gab between July 2016 and the end of July 2017.
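
Perceptual hashes are typically compared by Hamming distance, that is, the number of bits in which two hashes differ. The following is a minimal sketch of that comparison. It assumes each phash is stored as a fixed-length hexadecimal string, which this page does not state; the function name and sample values are hypothetical and are not taken from the dataset.

```python
def phash_hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two hex-encoded perceptual hashes.

    Assumption: both hashes are hexadecimal strings of equal length
    (for example, 16 hex characters for a 64-bit hash).
    """
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    differing_bits = int(hash_a, 16) ^ int(hash_b, 16)
    return bin(differing_bits).count("1")


if __name__ == "__main__":
    # Hypothetical sample values, not drawn from the dataset.
    first = "ffff0000ffff0000"
    second = "ffff0000ffff0001"
    print(phash_hamming_distance(first, second))
```

A small distance suggests visually similar images. Any threshold for treating two images as the same is a design choice that should be taken from the paper or the linked pipeline code rather than from this sketch.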

The code related to this research can be found here: https://github.com/memespaper/memes_pipeline, or here: https://doi.org/10.5281/zenodo.1463050

Presentation available here: https://doi.org/10.5281/zenodo.1477082

Files (5.3 GB)

Name: shared_data_phashes.tar.gz
Size: 5.3 GB
md5: 0ffbd46979fc8fc1c0198fa936f473f5
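
Because the archive is several gigabytes, it may be worth checking the download against the md5 checksum listed above before unpacking it. The sketch below streams the file in chunks so that it is never loaded into memory all at once. The local path is an assumption (the archive saved under its listed name in the current directory), and the helper names are hypothetical.

```python
import hashlib

# Checksum as listed for shared_data_phashes.tar.gz on this page.
EXPECTED_MD5 = "0ffbd46979fc8fc1c0198fa936f473f5"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Report whether the file at `path` matches the expected md5 digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: the archive saved in the current directory.
    archive_path = "shared_data_phashes.tar.gz"
    print("checksum OK" if verify_archive(archive_path) else "checksum MISMATCH")
```

A mismatch usually means the download was truncated or corrupted and should be repeated.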