ClinvArbitration data release - October 2025
Creators
Description
This file is a tarball containing the ClinvArbitration re-summary of ClinVar's raw submissions. ClinvArbitration aggregates the individual submissions differently from ClinVar: where submissions do not all agree, it prefers to break the tie instead of defaulting to a rating of "conflicting interpretations of pathogenicity". As a result, more variants are presented as either Benign/Likely Benign (B/LB) or Pathogenic/Likely Pathogenic (P/LP), with a smaller grey area in between.
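As a rough illustration of the difference, the sketch below contrasts an aggregation that reports any disagreement as conflicting with one that breaks ties. The strict-majority rule, the function names, and the label strings are assumptions made for this example; the rules actually used by ClinvArbitration are defined in the repository and may differ.

```python
from collections import Counter

# Hypothetical grouping of submission labels into broad classes.
BROAD_CLASS = {
    "Pathogenic": "P/LP",
    "Likely pathogenic": "P/LP",
    "Benign": "B/LB",
    "Likely benign": "B/LB",
    "Uncertain significance": "VUS",
}
CONFLICTING = "Conflicting"


def default_style_summary(submissions):
    """Report 'Conflicting' whenever submissions span more than one broad class."""
    if not submissions:
        raise ValueError("at least one submission is required")
    classes = {BROAD_CLASS[label] for label in submissions}
    return classes.pop() if len(classes) == 1 else CONFLICTING


def tie_breaking_summary(submissions):
    """Return the broad class held by a strict majority, else 'Conflicting'.

    The strict-majority threshold is an assumption for illustration only.
    """
    if not submissions:
        raise ValueError("at least one submission is required")
    counts = Counter(BROAD_CLASS[label] for label in submissions)
    top_class, top_count = counts.most_common(1)[0]
    return top_class if top_count * 2 > len(submissions) else CONFLICTING


# Made-up submissions for a single variant.
example = ["Pathogenic", "Likely pathogenic", "Pathogenic", "Uncertain significance"]
print(default_style_summary(example))
print(tie_breaking_summary(example))
```

Under these assumed rules, the example variant is conflicting in the first summary because one submitter disagrees, while the tie-breaking summary follows the three concordant submissions.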
This data release contains four items:
- The results of the re-interpretation of ClinVar, presented as a Hail Table, and as a TSV
- All Pathogenic Missense variants in the ClinVar re-interpretation, indexed by Ensembl Transcript and Codon number
This second part approximates the PM5 category of the ACMG criteria, which is not easily applied by existing tools. It is produced as follows:
We annotate each Pathogenic SNV in ClinVar using BCFtools CSQ. For each Pathogenic SNV which is also a Missense variant, we reorganise the data to be indexed on Transcript and Codon number. This index can then be used in reverse to annotate genetic variation: if a variant under investigation is a Missense, and a ClinVar Pathogenic Missense variant exists affecting the same Codon, we annotate the variant with the co-located known Pathogenic ClinVar entries, in case they contribute to its interpretation.
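The indexing and reverse lookup described above can be sketched as follows. The record layout (dicts with `transcript`, `codon`, and `clinvar_id` keys), the function names, and the example identifiers are all hypothetical; the schema of the released tables is defined by ClinvArbitration and is not reproduced here.

```python
from collections import defaultdict


def build_codon_index(pathogenic_missense):
    """Index pathogenic missense records by (transcript, codon)."""
    index = defaultdict(set)
    for record in pathogenic_missense:
        key = (record["transcript"], record["codon"])
        index[key].add(record["clinvar_id"])
    return index


def colocated_pathogenic(index, transcript, codon, own_id=None):
    """Return IDs of known pathogenic missense variants at the same codon.

    `own_id` is excluded so a variant is not reported as co-located with itself.
    """
    hits = index.get((transcript, codon), set())
    return sorted(hits - {own_id})


# Made-up records for illustration only; these are not real ClinVar entries.
known = [
    {"transcript": "ENST_EXAMPLE_1", "codon": 175, "clinvar_id": "example_A"},
    {"transcript": "ENST_EXAMPLE_1", "codon": 175, "clinvar_id": "example_B"},
    {"transcript": "ENST_EXAMPLE_2", "codon": 12, "clinvar_id": "example_C"},
]
index = build_codon_index(known)

# A new missense variant at codon 175 of the first transcript picks up both entries.
print(colocated_pathogenic(index, "ENST_EXAMPLE_1", 175))
# A missense at a codon with no known pathogenic entries gets an empty annotation.
print(colocated_pathogenic(index, "ENST_EXAMPLE_1", 176))
```

Keying on the transcript as well as the codon matters because the same genomic position can fall in different codons of different transcripts.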
I would like to acknowledge that since this side project started, a substantial curation effort has been made in ClinVar, so the gap between the standard and re-interpreted ClinVar results has narrowed considerably. The exact data format presented here is required by Talos, a whole-exome/genome variant prioritisation tool, so despite the increasing consistency between the two result sets, this exact data format should continue to be distributed.
This release builds on the previous release by including Mitochondrial variants. These Mito submissions are not part of the PM5/missense matched dataset, but are present as individual decisions.
Files
| Name | Size |
|---|---|
| md5:acdf61d6fff1619d2911285668e0b61d | 87.8 MB |
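To confirm that a download is intact, the MD5 digest of the tarball can be compared with the checksum listed above. This is a minimal sketch; the local file name is a placeholder, since the file name is not shown on this page.

```python
import hashlib
from pathlib import Path

# Checksum listed for this release.
EXPECTED_MD5 = "acdf61d6fff1619d2911285668e0b61d"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Placeholder path; replace with wherever the tarball was saved.
    tarball = Path("clinvarbitration_release.tar")
    if tarball.exists():
        matches = md5_of_file(tarball) == EXPECTED_MD5
        print("checksum OK" if matches else "checksum mismatch")
    else:
        print(f"{tarball} not found")
```

Reading in chunks keeps memory use flat regardless of the size of the archive.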
Additional details
Dates
- Updated: 2025-10-20
Software
- Repository URL: https://github.com/populationgenomics/ClinvArbitration
- Programming language: Python
- Development Status: Active