Published October 1, 2020 | Version V1.1
Dataset Open

OMOP2OBO Measurement Mappings

  • 1. Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz Medical Campus
  • 2. Translational and Integrative Sciences Lab, University of Colorado Anschutz School of Medicine
  • 3. Department of Pediatrics, Section of Pediatric Critical Care, School of Medicine, University of Colorado Anschutz School of Medicine
  • 4. Adult and Child Consortium for Health Outcomes Research and Delivery Science (ACCORDS), University of Colorado Anschutz School of Medicine
  • 1. Department of Computer Science, Georgia State University
  • 2. Department of Biomedical Informatics, University of Pittsburgh School of Medicine
  • 3. Center for Health AI, University of Colorado Anschutz Medical Campus
  • 4. Department of Biomedical Informatics, Columbia University Medical Center
  • 5. Department of Clinical Pharmacy and Medicine, University of Colorado Anschutz Skaggs School of Pharmacy and Pharmaceutical Sciences and School of Medicine
  • 6. Computational Bioscience Program, Department of Pharmacology, Aurora, CO, 80045, USA, University of Colorado Anschutz Medical Campus
  • 7. Department of Research Informatics, Children's Hospital Colorado
  • 8. The Jackson Laboratory for Genomic Medicine
  • 9. Sema4

Description

OMOP2OBO Measurement Mappings V1.0

The mappings in this repository align OMOP standard measurement concepts (i.e., LOINC) to the Human Phenotype Ontology (HPO), Chemical Entities of Biological Interest (ChEBI), Vaccine Ontology (VO), National Center for Biotechnology Information Taxon Ontology (NCBITaxon), Protein Ontology (PR), Cell Ontology (CL), and the Uber-anatomy Ontology (UBERON).

For each measurement, all levels of the test result (results above, below, and within a reference range) were mapped, not only those deemed clinically relevant. Results outside of a reference range, but not currently deemed clinically relevant (as advised by the literature or consultation with a domain expert), were annotated to the nearest relevant ontology concept ancestor. For example, when annotating the results of a test for Asparagus IgE Ab RAST class [Presence] in Serum (LOINC:15547-3), a result above a reference range would be annotated with an increased anti-plant-based food allergen IgE antibody level (HP:0410228). While a low level of this antibody may not be deemed clinically relevant, it is still outside of the provided reference range and thus was annotated to the nearest applicable concept ancestor, abnormal immunoglobulin level (HP:0010701). There is one exception to this rule: all measured drugs and toxins (entities not normally found in the human body) with normal results (results that were not outside of a given reference range) were annotated to the same HP concept as the clinically relevant result and logically negated. For example, for Amphetamine [Presence] in Urine by Screen (LOINC:19343-3), a positive finding was mapped to a positive urine amphetamine test (HP:0500112), and a negative finding was mapped to the same concept and logically negated (NOT HP:0500112).
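The annotation rule described above can be illustrated with a minimal sketch. It uses only the two examples given in the description; the dictionary layout, the result labels ("high", "low", "positive", "negative"), and the function name annotate are hypothetical and do not reflect the schema of the deposited mapping file.

```python
# Hypothetical lookup built only from the two examples in the description.
# The keys and result labels are illustrative assumptions, not the file schema.
EXAMPLE_MAPPINGS = {
    "LOINC:15547-3": {  # Asparagus IgE Ab RAST class [Presence] in Serum
        "high": "HP:0410228",  # increased anti-plant-based food allergen IgE antibody level
        "low": "HP:0010701",   # nearest applicable ancestor: abnormal immunoglobulin level
    },
    "LOINC:19343-3": {  # Amphetamine [Presence] in Urine by Screen
        "positive": "HP:0500112",  # positive urine amphetamine test
    },
}

# Drug and toxin tests: a normal (negative) result reuses the clinically
# relevant concept and is logically negated.
DRUG_OR_TOXIN_TESTS = {"LOINC:19343-3": "HP:0500112"}


def annotate(loinc_code, result):
    """Return the HP annotation for a test result under the rule sketched above."""
    if result == "negative" and loinc_code in DRUG_OR_TOXIN_TESTS:
        return "NOT " + DRUG_OR_TOXIN_TESTS[loinc_code]
    return EXAMPLE_MAPPINGS[loinc_code][result]


print(annotate("LOINC:19343-3", "negative"))
print(annotate("LOINC:15547-3", "low"))
```

In this sketch the negation is carried as a plain "NOT" prefix; in the mappings themselves it is modeled with an OWL constructor.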

LOINC2HPO currently aligns LOINC to HP. The current work extends existing LOINC2HPO annotations to match the OMOP2OBO mappings in the following two ways: (1) annotations were updated if new and/or more specific concepts had been added to the HP; and (2) existing mappings were expanded to include the measurement substance (body fluids, tissues, and organs via Uberon), the entity being measured (chemicals, metabolites, or hormones via ChEBI; cell types via CL; and proteins and protein complexes via PR), and the species of the measured entities (organism taxonomy via NCBITaxon). Consistent with LOINC2HPO, all measurements lacking sufficient specimen detail (those measured in non-specific body substances) were annotated as “Unspecified Sample” and all measurements without a valid result type were annotated as “Not Mapped test Type”. All modifications to the original LOINC2HPO annotations were meticulously recorded in the mapping evidence field enabling users to easily identify when an original LOINC2HPO annotation had been updated.

For this OMOP domain, the owl:complementOf (“not”, used to model normal test results), owl:intersectionOf (“and”), and owl:unionOf (“or”) constructors were used to construct semantically expressive mappings.


Mapping Details
Mappings included in this set were generated automatically, either using OMOP2OBO or using a bag-of-words embedding model with TF-IDF weighting. Cosine similarity is used to compute similarity scores between all pairwise combinations of OMOP and OBO concepts and ancestor concepts. To improve the efficiency of this process, the algorithm searches only the top 𝑛 most similar results and keeps the top 75th percentile among all pairs with scores >= 0.25. Manually created mappings are also included.
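As a rough illustration of the similarity step, the following sketch scores OMOP labels against OBO labels with TF-IDF bag-of-words vectors and cosine similarity, keeping the top n candidates with scores >= 0.25. It is not the OMOP2OBO implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, and the example labels are all assumptions, and the 75th percentile filter is omitted because the description does not specify how it is computed.

```python
import math
from collections import Counter


def tfidf_vectors(labels):
    """Build sparse TF-IDF vectors (token -> weight) for a list of labels."""
    docs = [label.lower().split() for label in labels]
    n_docs = len(docs)
    doc_freq = Counter(token for doc in docs for token in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # term frequency times a smoothed inverse document frequency (assumed form)
            tok: (cnt / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_mappings(omop_labels, obo_labels, top_n=3, min_score=0.25):
    """For each OMOP label, return up to top_n (score, OBO label) pairs above min_score."""
    vectors = tfidf_vectors(omop_labels + obo_labels)
    omop_vecs = vectors[:len(omop_labels)]
    obo_vecs = vectors[len(omop_labels):]
    results = {}
    for omop, u in zip(omop_labels, omop_vecs):
        scored = [(cosine(u, v), obo) for obo, v in zip(obo_labels, obo_vecs)]
        kept = sorted((pair for pair in scored if pair[0] >= min_score), reverse=True)
        results[omop] = kept[:top_n]
    return results


# Illustrative labels only; these are not taken from the deposited mappings.
print(candidate_mappings(["glucose in serum"], ["glucose", "serum", "femur"]))
```

Candidates produced this way are suggestions; as the mapping categories indicate, cosine-similarity mappings were manually verified.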

Mapping Categories

  • Automatic One-to-One Concept: Exact label or synonym, dbXRef, or expert validated mapping @ concept-level; 1:1
  • Automatic One-to-One Ancestor: Exact label or synonym, dbXRef, or expert validated mapping @ concept ancestor-level; 1:1
  • Automatic One-to-Many Concept: Exact label or synonym, dbXRef, cosine similarity, or expert validated mapping @ concept-level; 1:Many
  • Automatic One-to-Many Ancestor: Exact label or synonym, dbXRef, cosine similarity, or expert validated mapping @ concept ancestor-level; 1:Many
  • Manual One-to-One: Hand mapping created using expert suggested resources; 1:1
  • Manual One-to-Many: Hand mapping created using expert suggested resources; 1:Many
  • Cosine Similarity: score suggested mapping -- manually verified
  • UnMapped: No suitable mapping or not mapped type

Mapping Statistics
Additional statistics have been provided for the mappings and are shown in the table below. This table presents the counts of OMOP concepts by mapping category and ontology:

Mapping Category                   HPO     UBERON   ChEBI   CL      PR      NCBITaxon
Automatic One-to-One Concept       20      1981     268     129     19      286
Automatic One-to-Many Concept      49      5        0       24      0       0
Automatic One-to-One Ancestor      43      426      1149    5       5       207
Automatic Constructor - Ancestor   0       1        12      1       0       0
Cosine Similarity                  113     50       160     35      45      56
Manual                             10663   319      1446    185     1590    2357
Manual One-to-Many                 49      1118     528     18      133     196
UnMapped                           184     184      529     3688    2296    982


Provenance and Versioning: The V1.0 deposited mappings were created by OMOP2OBO v1.0.0 in October 2022 using the OMOP Common Data Model V5.0 and OBO Foundry ontologies downloaded on September 14, 2020.

Caveats: Please note that these are the original mappings that were created for the preprint. They have not been updated to current versions of the ontologies. In our experience, this should result in very few errors, but we do suggest that you check the ontology concepts used against current versions of each ontology before using them.

 

Important Resources and Documentation

Files (3.3 MB)

md5:fb33a8152d4c5e303d0971808a1522b8

Additional details

Related works

Is cited by
Preprint: 10.5281/zenodo.5716421 (DOI)
Is compiled by
Software: https://github.com/callahantiff/OMOP2OBO (URL)
Is published in
Other: http://tiffanycallahan.com/OMOP2OBO_Dashboard (URL)
Is referenced by
Thesis: 10.5281/zenodo.5716401 (DOI)