How good is your phenotyping? Methods for quality assessment
Creators
- 1. Lawrence Berkeley National Laboratory, Berkeley, CA
- 2. Oregon Health & Sciences University, Portland, OR
- 3. Charité - Universitätsmedizin Berlin
- 4. Wellcome Trust Sanger Institute, Hinxton, UK
Description
Semantic phenotyping has been shown to aid variant prioritization and characterization by comparison both to known Mendelian diseases and, across species, to animal models (Robinson et al 2013). In this process, symptoms and characteristic phenotypic findings are curated with species-specific ontology terms. It has generated a baseline set of phenotype descriptions for more than 7,000 Mendelian diseases (Kohler et al 2014a), as well as many thousands of descriptions of animal models. By leveraging the knowledge encoded in the ontology graph and methods drawn from information theory, similarities can be computed between any two sets of phenotype descriptions (Washington et al 2009). This technique has the potential to be used for disease diagnosis, particularly for novel and rare diseases where the underlying genetic cause is unknown. The robustness of semantic similarity methods depends heavily on the quality of both the knowledge base and the phenotype profile being studied, so capturing high-quality phenotypic profiles is essential. Until now, these profiles have typically been captured by specialized curators, but moving the technique into the diagnostic setting means putting it into physicians' hands. Acquiring structured phenotype annotations for individual patients may seem daunting and unnecessarily complex to physicians with high demands on their time. Annotation tools such as PhenoTips (Girdea et al 2013) greatly facilitate recording rigorous phenotype annotations in the clinic, but do not themselves provide guidance about what constitutes annotations sufficient for comparative phenotype analysis.
Because clinicians are not used to providing structured phenotype data, it is necessary to measure how a given patient phenotype profile compares against the corpus of available genotype-phenotype annotations, including those for known diseases, animal models, and other patients in the system. A metric that gauges the overall complexity and diagnostic capability of a phenotype profile generated in this way would greatly enhance the ability to use structured phenotyping in the clinical setting for comparative analysis. Such a metric can equally be applied to systematic model organism phenotyping efforts. Here, we present a method to assess the sufficiency of a phenotype profile by investigating the necessary and sufficient information characteristics required to identify disease similarity based on phenotypes alone. This scoring method is being provided as a REST service through the Monarch Initiative API.
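The description mentions computing similarity between two sets of phenotype descriptions from the ontology graph and information theory. The following is a minimal sketch of that general idea, not the scoring method or the REST service presented here. The toy ontology, the term names, the disease annotations, and all function names are hypothetical, and the Resnik-style "most informative common ancestor" measure with a best-match average is one common choice assumed for illustration.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "phenotype": set(),
    "nervous": {"phenotype"},
    "seizure": {"nervous"},
    "ataxia": {"nervous"},
    "skeletal": {"phenotype"},
    "scoliosis": {"skeletal"},
}

# Hypothetical annotation corpus: disease -> directly annotated terms.
CORPUS = {
    "disease_A": {"seizure", "scoliosis"},
    "disease_B": {"seizure", "ataxia"},
    "disease_C": {"scoliosis"},
    "disease_D": {"ataxia"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the graph."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """-log2 of the fraction of diseases annotated to the term or a descendant."""
    n = sum(
        1
        for terms in CORPUS.values()
        if any(term in ancestors(t) for t in terms)
    )
    if n == 0:
        raise ValueError(f"no annotations for term {term!r}")
    return -math.log2(n / len(CORPUS))


def term_similarity(a, b):
    """Information content of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(t) for t in common)


def profile_similarity(query, target):
    """Average, over query terms, of the best-matching target term (asymmetric)."""
    return sum(
        max(term_similarity(q, t) for t in target) for q in query
    ) / len(query)


if __name__ == "__main__":
    patient = {"seizure", "ataxia"}  # hypothetical patient profile
    ranked = sorted(
        CORPUS, key=lambda d: profile_similarity(patient, CORPUS[d]), reverse=True
    )
    for disease in ranked:
        print(disease, round(profile_similarity(patient, CORPUS[disease]), 3))
```

In this sketch, a profile made only of broad terms scores low against every disease because those terms carry little information content. That is the intuition behind asking whether a profile is specific enough for comparison, although how the sufficiency score itself is computed is not specified in this description.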
Files
Name | Size | Checksum |
---|---|---|
PhenoDay_SufficiencyScore_camera_ready_final.pdf | 495.8 kB | md5:edb1a91aab1eda585495f26cda50d9c9 |