CRED-1: 2,672 domains scored for credibility
An open, reproducible domain credibility dataset for automated pre-bunking of online misinformation. Multi-signal scoring, CC BY 4.0 licensed.
Get started in 3 lines
Load the dataset and query any domain's credibility score.
Python
import json

# Load the full dataset: a JSON object keyed by domain name
with open("data/cred1_current.json") as f:
    cred = json.load(f)

domain = "infowars.com"
score = cred[domain]["credibility_score"]
print(f"Credibility: {score:.2f}")
R
library(jsonlite)
cred <- fromJSON("data/cred1_current.json")
cred[["infowars.com"]]$credibility_score
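In practice you will often start from a full URL rather than a bare domain, and many domains will not be in the dataset at all. The sketch below shows one way to handle both cases in Python. The helper name `lookup_score` is hypothetical, and the normalization (lowercasing, stripping a leading `www.`) assumes the dataset is keyed by bare domains such as `infowars.com`, as in the examples above; subdomains are not resolved to their parent domain here.

```python
from urllib.parse import urlparse


def lookup_score(cred, url_or_domain):
    """Return the credibility score for a URL or bare domain, or None if it is not listed.

    `cred` is the dict loaded from the dataset file. This helper is an
    illustration, not part of CRED-1 itself.
    """
    # urlparse only fills in the host when a scheme is present, so add one if missing
    candidate = url_or_domain if "://" in url_or_domain else "http://" + url_or_domain
    host = (urlparse(candidate).hostname or "").lower()
    # Assumption: dataset keys carry no "www." prefix
    if host.startswith("www."):
        host = host[4:]
    entry = cred.get(host)
    return None if entry is None else entry["credibility_score"]


# Toy stand-in for the loaded dataset; the value is illustrative, not real data
sample = {"example.com": {"credibility_score": 0.5}}
print(lookup_score(sample, "https://www.example.com/some/article"))
print(lookup_score(sample, "unlisted.example"))
```

Returning `None` for unlisted domains keeps "not in the dataset" distinct from "low credibility", which matters if the score drives a warning shown to users.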
Six credibility signals
Each domain is scored across multiple independent dimensions, then combined into a weighted composite.
Source Category
Classification from curated lists (fake, conspiracy, satire, bias, etc.)
Iffy News Index
Factual reporting level and political bias ratings from the Iffy Index
Fact-Check Claims
Number of fact-checked claims associated with the domain via ClaimReview
Domain Age
WHOIS registration age as a trust signal (newer domains score lower)
Web Popularity
Tranco ranking as a proxy for reach and potential impact
Safe Browsing
Google Safe Browsing threat assessment for malware and social engineering
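To make "weighted composite" concrete, here is a minimal sketch of how six signals could be combined into a single score. It is not the CRED-1 pipeline: the signal keys, the weights, and the rule for handling missing signals are all assumptions chosen for illustration, and each signal is assumed to be already normalized to the range 0.0 to 1.0.

```python
# Hypothetical weights for illustration only; these are NOT CRED-1's published weights
WEIGHTS = {
    "category": 0.40,
    "iffy": 0.20,
    "factcheck": 0.15,
    "domain_age": 0.10,
    "popularity": 0.05,
    "safe_browsing": 0.10,
}


def composite(signals):
    """Weighted mean of signals in [0, 1], renormalized over the signals that are present.

    Signals that are missing or None are skipped, so a domain with partial
    coverage still gets a score on the same 0.0 to 1.0 scale.
    """
    present = {k: v for k, v in signals.items() if k in WEIGHTS and v is not None}
    if not present:
        return None
    total_weight = sum(WEIGHTS[k] for k in present)
    return sum(WEIGHTS[k] * v for k, v in present.items()) / total_weight


# Only two signals available: the result is their weight-adjusted mean
print(composite({"category": 0.0, "iffy": 1.0, "domain_age": None}))
```

Renormalizing over the available signals is one design choice among several; an alternative is to substitute a neutral default for missing signals, which pulls sparsely covered domains toward the middle of the scale.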
Dataset schema
credibility_score
Composite score from 0.0 to 1.0
category
fake, conspiracy, satire, bias...
domain_age_years
Years since WHOIS registration
factcheck_claims
ClaimReview count
iffy_factual
VL / L / ML / M / MH / H / VH
iffy_bias
Political bias rating
safe_browsing
Google threat flag
sources
Number of corroborating lists
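If you consume the dataset in an application, a light sanity check per record can catch schema drift early. The sketch below uses only the field names and the `iffy_factual` levels listed above. The function name `check_record` is hypothetical, and the assumption that a missing signal may be stored as `null` is mine, not something the schema states.

```python
EXPECTED_FIELDS = {
    "credibility_score", "category", "domain_age_years", "factcheck_claims",
    "iffy_factual", "iffy_bias", "safe_browsing", "sources",
}
FACTUAL_LEVELS = {"VL", "L", "ML", "M", "MH", "H", "VH"}


def check_record(record):
    """Return a list of human-readable problems; an empty list means the record looks fine."""
    problems = [f"missing field: {name}" for name in sorted(EXPECTED_FIELDS - set(record))]

    score = record.get("credibility_score")
    if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
        problems.append(f"credibility_score out of range: {score!r}")

    # Assumption: a domain without an Iffy rating stores None (JSON null) here
    level = record.get("iffy_factual")
    if level is not None and level not in FACTUAL_LEVELS:
        problems.append(f"unknown iffy_factual level: {level!r}")

    return problems


# Illustrative record with made-up values, not taken from the dataset
example = {
    "credibility_score": 0.12, "category": "conspiracy", "domain_age_years": 20.5,
    "factcheck_claims": 3, "iffy_factual": "VL", "iffy_bias": None,
    "safe_browsing": None, "sources": 2,
}
print(check_record(example))
```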
Cite this dataset
@article{loth2026cred1,
title = {CRED-1: An Open Multi-Signal Domain Credibility Dataset
for Automated Pre-Bunking of Online Misinformation},
author = {Loth, Alexander and Kappes, Martin and Pahl, Marc-Oliver},
year = {2026},
doi = {10.2139/ssrn.6448466},
url = {https://ssrn.com/abstract=6448466}
}
Build with CRED-1
Open data. Reproducible pipeline. Ready for your research or application.