Published May 1, 2024 | Version v1
Dataset
Open
Datasets for the paper: Lost in Translation: Using Global Fact-Checks to Measure Multilingual Misinformation Prevalence, Spread, and Evolution
Authors/Creators
Description
FullData.csv.gz: Contains links to all claims in the dataset, with the following columns:
- publishing_date: Date on which the fact-check was published.
- claim_date: Date on which the claim was made.
- verdict: Rating given by the fact-checking organisation.
- language: Language of the claim.
- cluster_{threshold}: ID of the cluster that the claim belongs to at the given threshold, with one column per threshold. An entry of "0" means that the claim is a singleton and is not clustered with any other claim.
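As a minimal sketch of how the cluster columns might be used, the function below reads a gzipped CSV with the standard library and groups row indices by cluster ID, skipping the "0" entries that mark singletons. The function name `group_by_cluster` is hypothetical, and the example column name `cluster_0.8` in the usage note is an assumption for illustration only; the actual threshold values, and whether IDs are stored exactly as the text "0", should be checked against the file header and contents.

```python
import csv
import gzip
from collections import defaultdict


def group_by_cluster(path, cluster_column):
    """Map each non-singleton cluster ID to the row indices of its claims.

    `cluster_column` is the name of one cluster_{threshold} column; the
    caller supplies it because the available thresholds are not listed here.
    """
    clusters = defaultdict(list)
    # "rt" opens the gzip stream in text mode so csv can parse it directly.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row_index, row in enumerate(csv.DictReader(handle)):
            cluster_id = row[cluster_column].strip()
            # Assumption: singletons are stored as the literal text "0".
            if cluster_id == "0":
                continue
            clusters[cluster_id].append(row_index)
    return dict(clusters)
```

A call such as `group_by_cluster("FullData.csv.gz", "cluster_0.8")` would then return a dictionary from cluster ID to the positions of the claims in that cluster, assuming a column with that hypothetical name exists.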
Embeddings.npy: Contains a dictionary linking each claim to its embedding calculated with LaBSE.
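A dictionary stored in a `.npy` file is typically written with `np.save`, which wraps it in a zero-dimensional object array. The sketch below assumes Embeddings.npy was saved that way; the helper names `load_embedding_dict` and `cosine_similarity` are hypothetical, and the type of the dictionary keys is not specified in the description.

```python
import numpy as np


def load_embedding_dict(path):
    """Load a dictionary that was stored in a .npy file via np.save."""
    # Object arrays are pickled, so allow_pickle=True is required.
    # Only enable it for files from a source you trust.
    loaded = np.load(path, allow_pickle=True)
    # .item() unwraps the zero-dimensional object array into the dict.
    return loaded.item()


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

With the dictionary loaded, `cosine_similarity(embeddings[key_a], embeddings[key_b])` compares two claims. Note that the file is large (see the file sizes below), so loading it requires a corresponding amount of memory.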
Files
(873.5 MB)
| Checksum | Size |
|---|---|
| md5:8a02db004e06ffb0b9ec8b5f63c2a5a2 | 859.2 MB |
| md5:69e95774a818320ba3ffcc85a6f09d77 | 14.3 MB |
Additional details
Software
- Repository URL
- https://github.com/dorianquelle/Lost-In-Translation
- Programming language
- Python