Dataset Open Access
Savvas Zannettou; Tristan Caulfield; Jeremy Blackburn; Emiliano De Cristofaro; Michael Sirivianos; Gianluca Stringhini; Guillermo Suarez-Tangil
This dataset was collected with research funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No 691025.
This dataset was used in the following publication: "On the Origins of Memes by Means of Fringe Web Communities". Savvas Zannettou, Tristan Caulfield, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, and Guillermo Suarez-Tangil. ACM Internet Measurement Conference (IMC), 2018. DOI: https://doi.org/10.5281/zenodo.1323551
The dataset consists of all the URLs and perceptual hashes (phashes) for images posted on Twitter, Reddit, 4chan's /pol/, and Gab between July 2016 and the end of July 2017.
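Perceptual hashes are usually compared by counting the bits in which they differ (the Hamming distance), with a small distance suggesting visually similar images. The following is a minimal sketch of that comparison. It assumes the phashes are stored as hexadecimal strings, which this record does not state; the function name `phash_hamming` is hypothetical and is not taken from the dataset or its pipeline.

```python
def phash_hamming(a_hex: str, b_hex: str) -> int:
    """Count the differing bits between two hex-encoded hashes.

    Assumption: both hashes are hexadecimal strings of the same bit
    length. The storage format of the dataset is not documented here.
    """
    # XOR leaves a 1 bit wherever the two hashes disagree.
    differing_bits = int(a_hex, 16) ^ int(b_hex, 16)
    return bin(differing_bits).count("1")
```

Any threshold for treating two images as near-duplicates is a choice that depends on the analysis; none is implied by this sketch.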
The code related to this research is available at https://github.com/memespaper/memes_pipeline or at https://doi.org/10.5281/zenodo.1463050
Presentation available here: https://doi.org/10.5281/zenodo.1477082
Name | Checksum | Size |
---|---|---|
shared_data_phashes.tar.gz | md5:df64270f50f9cb12cd41de15755ee00a | 5.3 GB |
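Because the archive is about 5.3 GB, it can be worth checking a downloaded copy against the MD5 checksum listed above before unpacking it. The sketch below reads the file in chunks so that it never has to fit in memory. The helper names `md5_of_file` and `verify_archive` are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib

# MD5 checksum published on this record for shared_data_phashes.tar.gz.
EXPECTED_MD5 = "df64270f50f9cb12cd41de15755ee00a"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Check whether the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("shared_data_phashes.tar.gz")` would return `True` for an intact download saved under that name in the current directory. MD5 is suitable here only as an integrity check against corruption, not as protection against tampering.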
Metric | All versions | This version |
---|---|---|
Views | 997 | 303 |
Downloads | 179 | 55 |
Data volume | 946.7 GB | 291.1 GB |
Unique views | 841 | 256 |
Unique downloads | 144 | 51 |