Published April 20, 2020 | Version v1
Dataset Open

Cross-platform mentions of the QAnon conspiracy theory

  • 1. University of Amsterdam

Description

This dataset contains mentions of the QAnon conspiracy theory across the Web between 28 October 2017 and 1 November 2018. The following list details the data per platform and its collection process:

  • 4chan: Posts and comments on 4chan/pol/ mentioning "Q" or "QAnon". The data is collected through 4CAT, a data capturing and analysis tool that hosts all posts and comments made on 4chan/pol/ since 2014.
  • 8chan: Posts and comments on the /qresearch/ board and other smaller boards mentioning "Q" or "QAnon". The data is derived from qanon.news, a grassroots archive. Given the archive's amateur nature, the dataset is likely incomplete, but it still includes over 200,000 posts.
  • Reddit: Comments made on politically-oriented subreddits mentioning "Q" or "QAnon". The data is gathered through the Pushshift API.
  • YouTube: Videos mentioning QAnon or "Q" in the title or video description. The data is collected via the YouTube v3 API using the search endpoint. Multiple keywords were queried ("qanon", "qanon 4chan", etc.) to collect a large sample. False positives were then filtered out manually.
  • Breitbart: Disqus comments on Breitbart.com mentioning QAnon or "Q". The data was gathered by crawling all of Breitbart.com in the timeframe and using the Disqus API.
  • Online news media: Articles from English online news sources mentioning QAnon. The data is derived from Nexis Uni and ContextualWeb Search by searching for "QAnon". Irrelevant sources and false positives were filtered manually.
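The keyword criterion used throughout the list above (a mention of "Q" or "QAnon") can be illustrated with a small matcher. This is only a sketch under stated assumptions: the patterns and the function name `mentions_qanon` are hypothetical and are not the queries used to build the dataset, which this description does not reproduce.

```python
import re

# Hypothetical patterns, not the authors' actual query.
# "qanon" is matched case-insensitively as a whole word.
QANON_RE = re.compile(r"\bqanon\b", re.IGNORECASE)
# A standalone capital "Q" is matched case-sensitively, so that
# ordinary words containing the letter q are not picked up.
Q_RE = re.compile(r"\bQ\b")


def mentions_qanon(text: str) -> bool:
    """Return True if the text contains 'QAnon' or a standalone 'Q'."""
    return bool(QANON_RE.search(text) or Q_RE.search(text))


if __name__ == "__main__":
    for sample in ["Who is Q?", "qanon is trending", "quick question", "Q&A session"]:
        print(repr(sample), mentions_qanon(sample))
```

A matcher this simple also flags strings such as "Q&A session", which is the kind of false positive that sampling and manual filtering are meant to catch.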

The datasets include timestamps, text bodies, and platform-specific information such as subreddits and channel titles. To collect data from 4chan, 8chan, Reddit, and Breitbart, we used the same SQL query, sampled 200 comments, and edited the query so that it returned a sufficient share of true positives (> 94%). The YouTube and online news media datasets were filtered manually.
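The sampling check described above can be sketched as follows. This is a minimal illustration, assuming that a reviewer labels each sampled comment as a true or false positive by hand; the function names, the fixed random seed, and the input format are hypothetical and not taken from the original collection process.

```python
import random


def sample_for_review(comments, k=200, seed=0):
    """Draw up to k comments at random for manual review (seed is an assumption)."""
    rng = random.Random(seed)
    return rng.sample(comments, min(k, len(comments)))


def true_positive_share(labels):
    """labels: list of booleans, True where the reviewer judged a true positive."""
    if not labels:
        raise ValueError("no labels to evaluate")
    return sum(labels) / len(labels)


def query_is_acceptable(labels, threshold=0.94):
    """Apply the '> 94%' criterion to the reviewed sample."""
    return true_positive_share(labels) > threshold


if __name__ == "__main__":
    # Example: 190 of 200 reviewed comments judged relevant.
    labels = [True] * 190 + [False] * 10
    print(true_positive_share(labels), query_is_acceptable(labels))
```

If the share falls at or below the threshold, the query would be edited and a fresh sample reviewed.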

For Breitbart and Reddit, the data is anonymised by omitting author information. The online news media article text is omitted because of copyright concerns.

See the article on First Monday for the full collection process.

Files

qanon_4chan.csv

Files (255.9 MB)

Checksum                                Size
md5:f8e7a187ab4aaa18f7cec4bf5e15dc50    26.0 MB
md5:da734e147776349131de2b10295d07cd    134.8 MB
md5:05fb0fb1bddf240550bd52590e5bf2d9    7.1 MB
md5:1815fde1403182faf0a78b78740bca3b    11.5 kB
md5:f2afa1478dc47862475bf61d9b0e2371    107.2 kB
md5:2c9b253b21f4e0f7e0e523f0ed5f19a3    80.0 MB
md5:68df13d29d1a5bc0d1cf6df6502a0a5f    8.0 MB
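
The MD5 checksums listed above can be used to verify a download. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and because this listing does not show which checksum belongs to which file name, the pairing has to be taken from the original record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum(local_path, listed_value)` returns True when a downloaded file is intact, where `local_path` is the downloaded file and `listed_value` is the checksum shown for that file.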