Published June 7, 2025 | Version v1
Dataset Open

SoK: Machine Learning for Misinformation Detection

Contributors

Data curator:

Description

Annotations and replication materials for 'SoK: Machine Learning for Misinformation Detection'

The contents of each file are described below.

annotations_aec.tsv: Contains annotations for our full paper corpus, comprising 248 published works. We annotated these papers for target, dataset curation, model choice, feature selection, and evaluation. 
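As a starting point for working with the annotations, here is a minimal Python sketch for loading a tab-separated file such as annotations_aec.tsv. It makes no assumptions about the column names, which are not listed in this description, and simply returns whatever header the file declares. The function name load_annotations is hypothetical and not part of the released materials.

```python
import csv


def load_annotations(path):
    """Read a tab-separated file and return (column_names, rows).

    Each row is a dict keyed by the header found in the file itself;
    no column names are assumed in advance.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        return list(reader.fieldnames or []), rows


# Example use (assumes the file has been downloaded to the working directory):
#   columns, rows = load_annotations("annotations_aec.tsv")
#   print(columns)
#   print(len(rows))
```

If the file follows the description above, the row count should correspond to the 248 annotated works, though this depends on whether the file has one row per paper.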

paper_selection_criteria.txt: Our criteria for assembling the full and focus coding sets, adapted from pages 3, 5 ('Paper selection') and 6. 

replications.zip: This zip archive contains three subfolders, one for each of the three replication analyses on pages 11-13 of the manuscript. For each dataset and codebase, we give the manuscript subsection in which it is discussed: 

  • articles (5.1): includes the original and modified Reuters and NYTimes texts with accompanying labels (these are new datasets that we introduced for robustness testing). Also includes the FA-KES and ISOT datasets and the classifier (new_RNN_CNN.py) used by the original study authors. 
  • users (5.2): includes per-account summary statistics for troll and non-troll accounts, each with an accompanying label. Also includes the classifier used by the original study author. 
  • sources (5.3): includes splits, classifier, and datasets used by the original author. 
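To check the archive layout before extracting anything, a short Python sketch like the following counts the files under each top-level entry of a zip archive. It uses only the standard zipfile module. The function name files_per_top_level is hypothetical, and the exact folder names inside replications.zip are not assumed by the code.

```python
import zipfile
from collections import Counter


def files_per_top_level(zip_path):
    """Return a dict mapping each top-level entry to its number of files."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top = name.split("/", 1)[0]
            counts[top] += 1
    return dict(counts)


# Example use (assumes the archive has been downloaded to the working directory):
#   for folder, n in sorted(files_per_top_level("replications.zip").items()):
#       print(folder, n)
```

If the archive wraps the three subfolders in a single parent directory, the result will contain only that parent; in that case, splitting on the second path component gives the per-analysis breakdown.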

Notes on open-source availability for each codebase: the source-scoped replication code is freely available online. We received permission from the authors of the article-scoped study to open-source their code. We previously contacted the author of the user-scoped work (TrollMagnifier) and have not received a response. We are sharing their code here for the sake of artifact evaluation; open-source availability is pending an affirmative response from the author. 

Files

Files (1.2 MB)

Checksum Size
md5:174aea89bfa98f6953d982d3833a03a5 179.0 kB
md5:6cc3c374c2709f0f180473f9d70ca6c6 2.7 kB
md5:6e7388122e814f2b20eb72485529680e 972.1 kB
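
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal Python sketch using the standard hashlib module. Because the listing does not pair each checksum with a file name, the sketch only checks whether a file's digest matches any of the three published values. The names PUBLISHED_MD5, md5_of, and matches_published are hypothetical.

```python
import hashlib

# Checksums copied from the file listing on this record.
PUBLISHED_MD5 = {
    "174aea89bfa98f6953d982d3833a03a5",
    "6cc3c374c2709f0f180473f9d70ca6c6",
    "6e7388122e814f2b20eb72485529680e",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's digest equals one of the published values."""
    return md5_of(path) in PUBLISHED_MD5


# Example use (assumes the file has been downloaded to the working directory):
#   print(matches_published("replications.zip"))
```

MD5 is suitable here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.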

Additional details

Dates

Available
2025-06-07