Published October 25, 2025 | Version v2
Dataset Open

Global Claims Dataset

  • 1. LIP - Laboratory of Instrumentation and Experimental Particle Physics
  • 2. Instituto Superior Técnico
  • 3. Laboratório para a Ciência da Computação e Informática

Description

A collection of claims gathered from fact-checking websites, covering multiple languages and topics. The dataset is described in "Global Claims: A Multilingual Dataset of Fact-Checked Claims with Veracity, Topic, and Salience Annotations".

Notes

factcheck_claims.json: a JSON Lines dataset of fact-checked claims. Each entry includes the following fields:

  • factcheck_url: URL of the fact-checking website
  • factcheck_date: Date of the fact-check
  • claim_reviewed: Text of the reviewed claim
  • claim_language: Language of the claim
  • items_reviewed: Source URL of the reviewed claim
  • review_rating: Original rating assigned to the claim
  • review_standardized: Standardized rating (`true`, `false`, `other`, `unknown`)
  • topics: Dictionary of topic probabilities
  • Mean Topic: Inferred main topic of the claim
  • twitter_presence: `True` if the claim’s source URL appears in url_tweets.json, otherwise `False`
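The field list above can be explored with a short script. The following is a minimal sketch, assuming each line of factcheck_claims.json is one JSON object and that `topics` maps topic labels to probabilities, as the field descriptions suggest. The helper names `iter_jsonl`, `rating_counts`, and `top_topic` are hypothetical and not part of the dataset, and the `"<missing>"` bucket for absent ratings is an assumption of this sketch.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Iterator, Optional


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


def rating_counts(records: Iterable[dict]) -> Counter:
    """Tally review_standardized values; absent fields go to '<missing>' (assumption)."""
    return Counter(rec.get("review_standardized", "<missing>") for rec in records)


def top_topic(topics: Optional[Dict[str, float]]) -> Optional[str]:
    """Return the label with the highest probability, or None if there are no topics."""
    if not topics:
        return None
    return max(topics, key=topics.get)
```

For instance, `rating_counts(iter_jsonl("factcheck_claims.json"))` would return a `Counter` keyed by the standardized ratings, provided the file has been downloaded to the working directory.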

url_tweets.json: a JSON Lines file listing the IDs of tweets that shared URLs found in claims. Each entry includes:

  • url: URL shared in the tweets
  • tweet_id: List of IDs of the tweets that shared the URL
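The two files can be linked through the claim's source URL. The sketch below works on already-parsed records and assumes that `items_reviewed` holds either a single URL string or a list of URLs, and that URLs match exactly between the two files; neither point is confirmed by the description above. The function names `build_tweet_index` and `tweets_for_claim` are hypothetical.

```python
from typing import Dict, Iterable, List


def build_tweet_index(url_rows: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each URL to the tweet IDs that shared it (IDs kept as strings)."""
    index: Dict[str, List[str]] = {}
    for row in url_rows:
        ids = row.get("tweet_id") or []
        if not isinstance(ids, list):  # assumption: tolerate a single bare ID
            ids = [ids]
        index.setdefault(row["url"], []).extend(str(i) for i in ids)
    return index


def tweets_for_claim(claim: dict, index: Dict[str, List[str]]) -> List[str]:
    """Collect tweet IDs for every source URL attached to a claim."""
    urls = claim.get("items_reviewed") or []
    if isinstance(urls, str):  # assumption: field may be a single URL
        urls = [urls]
    found: List[str] = []
    for url in urls:
        found.extend(index.get(url, []))
    return found
```

Under these assumptions, a claim with `twitter_presence` set to `True` should yield a non-empty list from `tweets_for_claim`.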

Files

Files (91.2 MB)

Checksum Size
md5:322fbb9e51bb7f5a25b88cefde2b44f3 70.9 MB
md5:8d0eb5c7fa7f41f452c5371156a44b2c 20.3 MB
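A downloaded file can be checked against the MD5 checksums listed above. This is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical. The listing does not state which checksum belongs to which file, so the caller has to pair them.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually means an incomplete download or a checksum paired with the wrong file.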

Additional details

Related works

Is cited by
Conference proceeding: 10.1145/3746275.3762201 (DOI)
Is supplemented by
Computational notebook: 10.5281/zenodo.16942428 (DOI)

Funding

European Commission
FARE - FAKE NEWS AND REAL PEOPLE – USING BIG DATA TO UNDERSTAND HUMAN BEHAVIOUR (grant 853566)
European Commission
FARE_AUDIT: Fake News Recommendations - an Auditing System of Differential Tracking and Search Engine Results (grant 101100653)
Fundação para a Ciência e Tecnologia
Differential tracking on disinformation websites and its impact on search engine results (grant 2022.12547.BD)

References

  • Ana Vranić, José M. Reis, Iris Damião, Paulo Almeida, and Joana Gonçalves-Sa. 2025. Global Claims: A Multilingual Dataset of Fact-Checked Claims with Veracity, Topic, and Salience Annotations. In Proceedings of the 2nd International Workshop on Diffusion of Harmful Content on Online Web (DHOW '25), October 27–28, 2025, Dublin, Ireland. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3746275.3762201