There is a newer version of the record available.

Published October 9, 2019 | Version 1
Dataset Open

Rare Disease analysis in Mondo

Description

To answer the question 'How many rare diseases are there?', we analyzed the terms in the Mondo Disease Ontology (Mondo) to obtain a total count of rare diseases as defined there.

 

Methods

This analysis was performed on the Mondo 2019-09-30 release.

1. Get all 'Disease' terms from Mondo

First, we get all the terms in Mondo that are descendants of MONDO:0000001 'Disease'.

There are 21633 Mondo disease terms.
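A minimal sketch of how such a descendant lookup could work, assuming the ontology has been loaded into a plain parent-to-children mapping and is acyclic. The `descendants` helper and the `EX:` identifiers are hypothetical and used only for illustration; this is not necessarily how the original analysis was implemented.

```python
from collections import deque

def descendants(children, root):
    """Return every term reachable from root via child edges.

    Assumes an acyclic hierarchy, so the root itself is never returned.
    """
    seen = set()
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Toy hierarchy: only MONDO:0000001 is a real identifier, the EX: terms are made up.
toy_children = {
    "MONDO:0000001": ["EX:A", "EX:B"],
    "EX:A": ["EX:C"],
    "EX:B": ["EX:C", "EX:D"],  # EX:C has two parents and is still counted once
}
disease_terms = descendants(toy_children, "MONDO:0000001")
print(sorted(disease_terms))
```

Collecting the results in a set means a term with several parents is counted only once, which matters in an ontology where multiple inheritance is common.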

2. Filter out terms that are descendants of 'disease susceptibility'

We then filter out terms that are descendants of MONDO:0042489 'disease susceptibility', to avoid counting ambiguous terms that are related to disease susceptibility and not the actual disease itself.

This gives us a list of 21563 Mondo disease terms.
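As a rough illustration, this exclusion can be expressed as a set difference. The `drop_branch` helper and the `EX:` identifiers below are hypothetical. The last lines only restate the two counts reported in this record (21633 before filtering, 21563 after) to show how many terms the filter removed.

```python
def drop_branch(terms, branch_descendants):
    """Remove every term that falls under the excluded branch."""
    return set(terms) - set(branch_descendants)

# Hypothetical identifiers, for illustration only.
all_disease = {"EX:A", "EX:B", "EX:C", "EX:D"}
susceptibility = {"EX:D"}  # stand-in for the descendants of MONDO:0042489
kept = drop_branch(all_disease, susceptibility)

# Counts reported in this record: 21633 before filtering, 21563 after.
removed_in_record = 21633 - 21563
print(sorted(kept), removed_in_record)
```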

3. Identify terms that are 'rare'

A disease term in Mondo is considered rare if the term, or any of its ancestors, has the modifier MONDO:0021136 'Rare' in the ontology.

There are 12914 Mondo rare disease terms.
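The inheritance rule above can be sketched as follows. This is an assumption-laden toy: the `parents` and `modifiers` dictionaries are a simplified stand-in for however the modifier is actually encoded in the ontology, the `EX:` identifiers are made up, and `is_rare` is a hypothetical helper.

```python
from functools import lru_cache

RARE = "MONDO:0021136"

# Hypothetical toy data: each term's parents, and modifiers asserted directly on a term.
parents = {
    "EX:B": ["EX:A"],
    "EX:C": ["EX:B"],
    "EX:D": ["EX:A"],
    "EX:E": ["EX:D"],
}
modifiers = {"EX:B": {RARE}}

@lru_cache(maxsize=None)
def is_rare(term):
    """A term is rare if it carries the modifier itself or any ancestor does."""
    if RARE in modifiers.get(term, set()):
        return True
    return any(is_rare(parent) for parent in parents.get(term, ()))

all_terms = ["EX:A", "EX:B", "EX:C", "EX:D", "EX:E"]
rare_terms = {term for term in all_terms if is_rare(term)}
print(sorted(rare_terms))
```

Here EX:C is rare only because its parent EX:B carries the modifier. For a hierarchy as deep as a full ontology, an iterative traversal may be preferable to recursion.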

4. Consider terms in 'gard_rare' subset

There are 3176 Mondo disease terms in the gard_rare subset, which contains Mondo terms that are yet to be treated as 'rare'.

We add these terms to our set of Mondo rare disease terms.

This increases the Mondo rare disease term count to 13866.
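The counts reported above imply how much the two sets overlap. The arithmetic below is a sketch that assumes every one of the 3176 gard_rare terms is eligible to be added, so that the combined count is a plain set union; the variable names are illustrative.

```python
# Counts as reported in this record.
rare_by_modifier = 12914   # terms rare via MONDO:0021136
gard_rare = 3176           # terms in the gard_rare subset
combined = 13866           # size of the union

# gard_rare terms that were not already counted as rare.
newly_added = combined - rare_by_modifier
# gard_rare terms that were already rare (the overlap).
already_rare = gard_rare - newly_added

# Inclusion-exclusion check: |A ∪ B| = |A| + |B| - |A ∩ B|
assert rare_by_modifier + gard_rare - already_rare == combined
print(newly_added, already_rare)
```

Under that assumption, 952 gard_rare terms were new to the rare set and 2224 were already counted.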

For this analysis, however, we are interested only in terms that are both rare and leaf nodes in the ontology.

After considering only leaf nodes, we get 10394 as the final count of Mondo rare disease terms.
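A minimal sketch of the leaf restriction, assuming a leaf is a term with no children anywhere in the hierarchy (the record does not spell out the exact definition). The `leaf_terms` helper and the `EX:` identifiers are hypothetical.

```python
def leaf_terms(terms, children):
    """Keep only the terms that have no children in the hierarchy."""
    return {term for term in terms if not children.get(term)}

# Hypothetical toy hierarchy and rare set, for illustration only.
children = {"EX:A": ["EX:B", "EX:D"], "EX:B": ["EX:C"]}
rare = {"EX:B", "EX:C", "EX:D"}
rare_leaves = leaf_terms(rare, children)
print(sorted(rare_leaves))
```

EX:B is dropped because it has a child, so only the most specific rare terms remain.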

 

Results

all-mondo-disease-terms.tsv: As part of our analysis, we generated a TSV file containing the 21633 Mondo disease terms, each with annotations that indicate whether the term is a rare disease term and whether it is a leaf node in the ontology.
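A file of this shape could be tallied along the following lines. The column names (`id`, `is_rare`, `is_leaf`) and the `True`/`False` encoding are assumptions, since the actual header of the TSV is not described here; the sample rows are made up, and `count_rare_leaves` is a hypothetical helper.

```python
import csv
import io

# Toy stand-in for all-mondo-disease-terms.tsv; headers and values are hypothetical.
sample = (
    "id\tis_rare\tis_leaf\n"
    "EX:A\tFalse\tFalse\n"
    "EX:B\tTrue\tFalse\n"
    "EX:C\tTrue\tTrue\n"
)

def count_rare_leaves(handle):
    """Count rows flagged as both rare and leaf in a tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return sum(
        1 for row in reader
        if row["is_rare"] == "True" and row["is_leaf"] == "True"
    )

rare_leaf_count = count_rare_leaves(io.StringIO(sample))
print(rare_leaf_count)
```

To run this against the real file, pass an open file handle instead of the `io.StringIO` object and adjust the column names to match its header.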

Files

README.md

Files (5.4 MB)

Checksum Size
md5:4262ab21d94e577e3ca1029afd15e4cb 5.4 MB
md5:a8a405bec2ec9a230de087ea2a24fe7c 1.9 kB
md5:402b915ddaa6c864b767bac4a7c6ef24 1.8 kB

Additional details

Funding

National Institutes of Health
Semantic LAMHDI: Linking diseases to model organism resources 5R24OD011883-03
