Dataset Open Access

Assessing Vaccine-Related Content for Journalistic Quality: A large-scale dataset and article repository

Bornhoft, Alexandra; Brennan, Christopher; Sehat, Connie Moon

This dataset was produced through a collaboration with the NSF-funded ARTT project (led by Hacks/Hackers) and Overtone. It consists of 1,000 vaccine-related articles, pulled from a wide variety of news media sources, with associated scores based on their journalistic quality. The scores were provided through Overtone’s algorithm, and range from one (low-quality or low informational value add) to five (high-quality or high informational value add). Articles were sourced from traditional journalism outlets (news and news-leaning websites), as well as non-journalistic sources of vaccine information, such as governmental websites, healthcare and NGO websites, and medical journals. Given the algorithm’s focus on editorial content, as opposed to other metrics such as author, outlet, or engagement, analyzing a diverse set of article types allowed the research team to examine how different styles of vaccine-related content measured against traditional journalistic quality standards. Therefore, this dataset provides a unique insight into the spectrum of vaccine reporting, and serves as a contribution to the field of automated quality assessment. 

About ARTT: The Analysis and Response Toolkit for Trust (ARTT) project is focused on helping people engage in trust-building ways when discussing vaccine efficacy and other topics online. 

About Overtone: Overtone has built a Natural Language Processing algorithm that finds and sorts online content by its intrinsic qualities, rather than clicks or shares. Their AI assesses texts for journalistic signals that demonstrate human effort.

For any questions about this dataset, please contact

The creation of this dataset was supported by an award from the National Science Foundation (US), award number 49100421C0037.
Files (278.7 kB)
Criteria, Scoring and Collection Methodology - ARTT_Overtone Dataset.pdf (78.9 kB)
Description and Learnings - ARTT_Overtone Dataset.pdf (86.2 kB)
Final Dataset_ARTT_Overtone.xlsx (113.6 kB)
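Views: 96 (all versions), 96 (this version)
Downloads: 48 (all versions), 48 (this version)
Data volume: 4.1 MB (all versions), 4.1 MB (this version)
Unique views: 80 (all versions), 80 (this version)
Unique downloads: 37 (all versions), 37 (this version)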

