Published February 25, 2022
| Version v1
Dataset
Open
AlphScore_final dataset
Creators
- 1. Institute of Human Genetics, University Bonn, and School of Medicine, University Hospital Bonn, Germany
- 2. Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- 3. Institute for Genomic Statistics and Bioinformatics, University Hospital of Bonn, University of Bonn, Bonn, Germany; Institute of Medical Biometry, Informatics and Epidemiology, University Hospital of Bonn, University of Bonn, Bonn, Germany
- 4. Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany; Institut für Humangenetik, Universität zu Lübeck, Lübeck, Germany
Description
This file contains AlphScore_final as described in our associated publication. The file is based on dbNSFP 4.2a, is tab-separated with a header line, and is compressed with bgzip. The columns contain the following content:
Column | Content |
---|---|
#chr | chromosome (hg38) |
pos(1-based) | position (hg38) |
ref | reference allele |
alt | alternative allele |
aaref | reference amino acid |
aaalt | alternative amino acid |
rs_dbSNP | rs number |
hg19_chr | chromosome (hg19) |
hg19_pos(1-based) | position (hg19) |
ID | variant ID in the format chromosome:position:reference amino acid:alternative amino acid |
genename | gene name, taken from dbNSFP |
Uniprot_acc_split | The UniProt IDs of the structural models that were used to create AlphScore_final (multiple entries separated by ;) |
Uniprot_acc | Uniprot_acc as provided by dbNSFP |
HGVSp_VEP_split | The missense variant(s) as used to create AlphScore_final; these variant(s) correspond(s) to the Uniprot_acc_split |
HGVSp_VEP | HGVSp_VEP as provided by dbNSFP |
CADD_raw | CADD_raw as provided by dbNSFP |
REVEL_score | REVEL_score as provided by dbNSFP |
DEOGEN2_score | DEOGEN2_score as provided by dbNSFP |
b_factor | AlphaFold's pLDDT score of the residue (if a variant affects multiple proteins, the values for the proteins listed in Uniprot_acc_split are given, separated by ;) |
SOLVENT_ACCESSIBILITY_core | Solvent accessibility of the residue as calculated for C-alpha by DSSP (if a variant affects multiple proteins, the values for the proteins listed in Uniprot_acc_split are given, separated by ;) |
in_gnomad_train | TRUE if the variant was in the gnomAD set used for training |
in_clinvar_ds | TRUE if the variant was in the ClinVar set used for validation / training of combined scores |
AlphScore | This column corresponds to AlphScore_final |
glm_AlphCadd | This column corresponds to AlphScore_final + CADD |
glm_AlphRevel | This column corresponds to AlphScore_final + REVEL |
glm_RevelCadd | This column corresponds to REVEL + CADD |
glm_AlphRevelCadd | This column corresponds to AlphScore_final + REVEL + CADD |
glm_AlphDeogen | This column corresponds to AlphScore_final + DEOGEN2 |
glm_CaddDeogen | This column corresponds to CADD + DEOGEN2 |
glm_DeogenRevel | This column corresponds to DEOGEN2 + REVEL |
glm_AlphDeogenRevel | This column corresponds to AlphScore_final + DEOGEN2 + REVEL |
glm_AlphCaddDeogen | This column corresponds to AlphScore_final + CADD + DEOGEN2 |
glm_CaddDeogenRevel | This column corresponds to CADD + DEOGEN2 + REVEL |
Note that the Creative Commons license applies only to the values of AlphScore_final. REVEL and CADD scores as well as combined scores containing REVEL and CADD are not licensed for commercial use. The full list of references can be found in our manuscript.
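The layout described above (tab-separated, header line, bgzip compression, ;-separated values for variants that map to several proteins) can be read with Python's standard library, since bgzip output is gzip-compatible. The following is a minimal sketch, not the authors' own tooling. The sample row, the accession names ACC1 and ACC2, and the helper names are invented for illustration, and the sketch assumes that the ;-separated columns of a row are aligned in the same order.

```python
import csv
import gzip
import io


def to_float(text):
    """Return float(text), or None for non-numeric placeholders such as '.' or 'NA'."""
    try:
        return float(text)
    except ValueError:
        return None


def read_variants(handle):
    """Iterate over the rows of the TSV as dicts keyed by the header names."""
    return csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def per_protein(row):
    """Split the ';'-separated columns into one record per structural model.

    Assumption: Uniprot_acc_split, b_factor and SOLVENT_ACCESSIBILITY_core
    list their values in the same order.
    """
    accessions = row["Uniprot_acc_split"].split(";")
    plddt = row["b_factor"].split(";")
    accessibility = row["SOLVENT_ACCESSIBILITY_core"].split(";")
    return [
        {"uniprot": a, "plddt": to_float(p), "solvent_accessibility": to_float(s)}
        for a, p, s in zip(accessions, plddt, accessibility)
    ]


# Tiny invented sample with a subset of the columns (values are NOT real data).
SAMPLE = (
    "#chr\tpos(1-based)\tref\talt\tUniprot_acc_split\tb_factor\t"
    "SOLVENT_ACCESSIBILITY_core\tAlphScore\n"
    "1\t12345\tA\tG\tACC1;ACC2\t91.5;48.2\t12;95\t0.42\n"
)
compressed = gzip.compress(SAMPLE.encode("utf-8"))

# For the real download, pass the local file path to gzip.open instead of BytesIO.
with gzip.open(io.BytesIO(compressed), mode="rt", newline="") as handle:
    records = [(row, per_protein(row)) for row in read_variants(handle)]

for row, models in records:
    print(row["#chr"], row["pos(1-based)"], to_float(row["AlphScore"]))
    for model in models:
        print("  ", model)
```

Reading row by row keeps memory use flat, which matters for a file of this size. Values are left as strings until `to_float` is called, so a missing-value placeholder does not abort the pass.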
Files
(9.4 GB)
Name | Size | Download all |
---|---|---|
md5:cbb22553cd6976f5a8a6e2333e5a9fed | 9.4 GB | Download |
md5:a9fe9eae820a54f3c64d0e3372be9564 | 765.6 kB | Download |
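After downloading, the md5 values listed above can be used to check that a file arrived intact. The sketch below shows one way to do this with Python's `hashlib`, reading in chunks so that the 9.4 GB file never has to fit in memory. The function names are illustrative, and the self-check runs on a throwaway temporary file rather than on the real download.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed checksum such as 'md5:cbb2...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Self-check on a throwaway file, so the sketch runs without the download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
demo_path = tmp.name
demo_listed = "md5:" + hashlib.md5(b"example bytes").hexdigest()
demo_ok = matches_listing(demo_path, demo_listed)
os.remove(demo_path)
print(demo_ok)
```

For the real data, one would call `matches_listing` with the local path of the 9.4 GB file and the string `md5:cbb22553cd6976f5a8a6e2333e5a9fed` from the listing.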