ChronoMedKG: A Temporally-Grounded, Evidence-Graded Biomedical Knowledge Graph and Benchmark for Temporal Clinical Reasoning
Description
ChronoMedKG is a temporally-grounded, evidence-graded biomedical knowledge graph built by running a four-agent disease-autonomous pipeline across 13,431 of PrimeKG's 17,080 diseases (78.6%). The pipeline yields 460,497 validated consensus triples out of 13 million extracted triples; 10,852 diseases produce surviving triples after multi-LLM consensus and Quality Controller filtering. Every edge carries temporal metadata (per-phenotype onset windows, progression stages, clinical milestones), PMID-traceable evidence text, and a six-signal credibility score.
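The headline ratios can be re-derived from the counts quoted above. The short check below uses only those figures; note that "13 million" is a rounded count, so the survival rate is approximate.

```python
# Counts quoted in the description; extracted_triples is the rounded "13 million".
primekg_diseases = 17_080
processed_diseases = 13_431
diseases_with_triples = 10_852
validated_triples = 460_497
extracted_triples = 13_000_000

# Share of PrimeKG diseases the pipeline was run on.
coverage = processed_diseases / primekg_diseases
# Share of extracted triples that survive consensus and Quality Controller filtering.
survival = validated_triples / extracted_triples
# Share of processed diseases that still have at least one triple afterwards.
yielding = diseases_with_triples / processed_diseases

print(f"PrimeKG disease coverage: {coverage:.1%}")
print(f"Triples surviving consensus + QC: {survival:.1%}")
print(f"Processed diseases with surviving triples: {yielding:.1%}")
```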
Unlike static biomedical KGs (PrimeKG, iKraph, Hetionet) that treat associations as timeless, ChronoMedKG records when in a disease course each fact applies. The resource adds onset data for 6,250 diseases absent from every reference resource (HPOA, Orphadata, Phenopackets), including 1,657 Orphanet-coded rare diseases that gain structured onset representation for the first time. Validation against Orphadata reaches 92.7%; a three-LLM judge-panel audit on 100 novel-coverage diseases reaches 87.9%.
Construction uses a disease-autonomous four-agent pipeline (Disease Profiler, Evidence Harvester, Knowledge Extractor, Quality Controller) that runs end-to-end from a disease identifier. Multiple frontier LLMs extract triples in parallel; only relations supported by multi-model consensus survive credibility filtering and PrimeKG schema alignment. Total construction cost across 13,431 diseases: ~$2,400 in LLM API spend.
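The multi-model consensus step can be pictured as a vote over extracted triples. The sketch below is illustrative only and is not the pipeline's actual code: the `(head, relation, tail)` triple shape, the `consensus_triples` function, the `min_models=2` threshold, and the model and entity names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical triple shape: (head, relation, tail).
Triple = tuple[str, str, str]


def consensus_triples(
    extractions: dict[str, set[Triple]], min_models: int = 2
) -> dict[Triple, set[str]]:
    """Keep triples extracted by at least `min_models` distinct models.

    `extractions` maps a model name to the set of triples it produced.
    The threshold is an assumption; the real consensus rule may differ.
    """
    support: dict[Triple, set[str]] = defaultdict(set)
    for model, triples in extractions.items():
        for triple in triples:
            support[triple].add(model)
    return {t: models for t, models in support.items() if len(models) >= min_models}


# Placeholder data: model and entity names are invented for illustration.
extractions = {
    "model_a": {("disease_X", "has_phenotype", "phenotype_Y"),
                ("disease_X", "has_phenotype", "phenotype_Z")},
    "model_b": {("disease_X", "has_phenotype", "phenotype_Y")},
    "model_c": {("disease_X", "has_phenotype", "phenotype_Y"),
                ("disease_X", "treated_by", "drug_W")},
}
kept = consensus_triples(extractions)
for triple, models in kept.items():
    print(triple, sorted(models))
```

Only the triple backed by more than one model survives here; in the described pipeline, credibility filtering and PrimeKG schema alignment would follow this step.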
ChronoMedKG ships paired with ChronoTQA, the first temporal biomedical QA benchmark: 3,341 questions across eight reported task types plus a 12-question supplementary HPOA negative-temporal MCQ probe. Frontier LLMs trail their static-question accuracy by ~30 points on temporal items, and selective retrieval against ChronoMedKG rescues 47-65% of failed long-tail queries (vs 17-29% for HPOA-RAG).
This deposit (v0.0.1) contains:
- validated_triples.jsonl (Gold, 527 MB, 460,497 rows): main product, post-QC
- consensus_triples.jsonl.gz (Silver, 30 MB): pre-QC consensus rows
- raw_triples.jsonl.gz (Bronze, 644 MB): full extraction log, 13M rows
- tqa_benchmark.json (3.2 MB): ChronoTQA, 3,341 questions
- pmc_clinical_cases.json (63 KB): 31 diagnostic-odyssey case reports
- novelty_multi_judge_v2.json (168 KB): three-LLM audit verdicts
- croissant.json: Croissant 1.0 ML metadata
- README.md, LICENSE-DATA, NOTICE
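The triple files are large (hundreds of megabytes), so streaming them line by line is likely more practical than loading them whole. The following is a minimal sketch, assuming each line of the `.jsonl` and `.jsonl.gz` files is one JSON object; the field names are not listed in this description, so the helper `peek_schema` (a hypothetical name, like `iter_jsonl`) only reports whichever keys it finds.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line; reads .gz files transparently."""
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def peek_schema(path: str, n: int = 1000) -> set[str]:
    """Collect the top-level keys seen in the first `n` rows."""
    keys: set[str] = set()
    for i, row in enumerate(iter_jsonl(path)):
        if i >= n:
            break
        keys.update(row)
    return keys


def count_rows(path: str) -> int:
    """Count rows without holding the file in memory."""
    return sum(1 for _ in iter_jsonl(path))
```

For example, `count_rows("validated_triples.jsonl")` should return 460,497 if the download is complete, and `peek_schema("consensus_triples.jsonl.gz")` gives a quick look at the available fields before writing any field-specific code.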
Files
(1.2 GB)
| MD5 | Size |
|---|---|
| a5da17f73ff275e3df99860a246a4d88 | 29.6 MB |
| 773d722830f08ac3382c47a741cffd4e | 14.0 kB |
| 7922b6da3d1cd1718b5828dd83a319c9 | 3.3 kB |
| f6f40f8561cbf1d132b0ce2b9c6190ca | 1.1 kB |
| f57d8eb15920df9fbb902cc4872a70a4 | 63.5 kB |
| 352d774747f21c19e6fe8e523347072b | 643.7 MB |
| e417513cbcdb1b88a4571113d63c270b | 10.7 kB |
| a9d077139730a66b5b14110bc43ad010 | 3.2 MB |
| fcca97df3d02a82595ccacdd9e418d09 | 526.6 MB |
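The MD5 checksums listed above can be used to confirm that a download arrived intact. The sketch below uses Python's standard `hashlib` and reads the file in chunks so that the larger archives do not need to fit in memory; `md5_of_file` and `verify` are hypothetical helper names, and the listing does not show which checksum belongs to which file, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected_hex = expected.strip().lower().removeprefix("md5:")
    return md5_of_file(path) == expected_hex
```

A call such as `verify("tqa_benchmark.json", "md5:<checksum from the listing>")` returns `True` only when the local copy matches the published digest.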