Published July 21, 2024 | Version v2
Dataset Open

Tissue-aware interpretation of genetic variants advances the etiology of rare diseases

Description

Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a
few tissues and organs. However, variant effect prediction tools that aim to identify
pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-
learning framework, denoted ‘Tissue Risk Assessment of Causality by Expression for
variants’ (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/), that offers two advancements.
First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific
tissues. This was achieved by creating 14 tissue-specific models that were trained on over
14,000 variants and combined 84 attributes of genetic variants with 495 attributes derived
from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant
effect prediction tools. Second, the resulting models are interpretable, thereby illuminating
variants' mode-of-action. Application of TRACEvar to variants of 52 rare-disease patients
highlighted pathogenicity mechanisms and relevant disease processes. Lastly, interpretation
of large-scale models revealed that top-ranking determinants of pathogenicity included
attributes of disease-affected tissues, particularly cellular process activities. Hence, tissue
contexts and interpretable machine-learning models can greatly enhance the etiology of rare
diseases.

Article link: https://www.embopress.org/doi/full/10.1038/s44320-024-00061-6

 

Notes

File  Description
full_dataset_2_stars_2022.csv training set
 test set
 full dataset
 Column names for tissue-specific models
 multi-tissue model data
  multi-tissue column names
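
Because the data files in this record are large (up to several hundred MB), it can help to look at the header and a few rows before loading a whole file. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-delimited with a single header row, which this record does not state; the function name `peek_csv` is hypothetical and is not part of the TRACEvar repository.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without reading the whole file.

    Assumption: the file is comma-delimited and its first line is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # stops after n_rows lines
    return header, rows
```

For example, `peek_csv("full_dataset_2_stars_2022.csv", n_rows=3)` would return the column names and the first three records, provided the file has been downloaded to the working directory.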

Files

Files (876.0 MB)

Checksum  Size
md5:f50e1a9824ff5c2d913aa6bf3debdc84  167.8 MB
md5:5d4a9dbbeb18af02d58041054825d017  214.9 MB
md5:0250a53f01e894942c709d84c80d2374  27.9 MB
md5:2edfa72f128e0a5f9c023b41b067ee07  41.4 kB
md5:8a8fc2bd028b01dafbe8e30ebcadcf0c  465.4 MB
md5:f609ffd206871f123f1b5728d71aa639  4.4 kB
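
The MD5 digests listed above can be used to confirm that a download completed without corruption. The sketch below shows one way to do this with Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical. Because the listing does not pair each checksum with a file name, which digest belongs to which file has to be worked out by the reader (for example, by file size).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:f50e1a98...'."""
    # Drop the optional 'md5:' prefix and normalise case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat even for the largest file in the record (465.4 MB). A call such as `matches_listed_md5(local_path, "md5:8a8fc2bd028b01dafbe8e30ebcadcf0c")` returns `True` only if the local copy is byte-identical to the published file.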

Additional details

Software

Repository URL
https://github.com/ChananArgov/TRACEvar
Programming language
Python