OMOP2OBO Drug Exposure Ingredient Mappings

Callahan, Tiffany J; Baumgartner, William A; Wyrwa, Jordan M; Vasilevsky, Nicole A; Bennett, Tellen D; Hunter, Lawrence D; Kahn, Michael G

doi:10.5281/zenodo.6949696

Published October 1, 2020 | Version V1.1

Dataset Open

1. Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz Medical Campus
2. Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz School of Medicine
3. Translational and Integrative Sciences Lab, University of Colorado Anschutz School of Medicine
4. Department of Pediatrics, Section of Pediatric Critical Care, School of Medicine, University of Colorado Anschutz School of Medicine

Contributors

Data collector (3):

Data curator:

Stefanski, Adrianne L⁶

Project member:

Deakyne-Davies, Sara H⁷

Researcher (2):

1. Department of Biomedical Informatics, University of Pittsburgh School of Medicine
2. Center for Health AI, University of Colorado Anschutz Medical Campus
3. Keck School of Medicine, University of Southern California
4. Department of Biomedical Informatics, Columbia University Medical Center
5. Department of Clinical Pharmacy and Medicine, University of Colorado Anschutz Skaggs School of Pharmacy and Pharmaceutical Sciences and School of Medicine
6. Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
7. Department of Research Informatics, Children's Hospital Colorado

OMOP2OBO Drug Exposure Ingredient Mappings V1.0

These mappings were created by the OMOP2OBO mapping algorithm (see links below). OMOP2OBO provides the first health system-wide, disease-agnostic mappings between standardized clinical terminologies and eight Open Biomedical Ontology (OBO) Foundry ontologies spanning diseases, phenotypes, anatomical entities, cell types, organisms, chemicals, vaccines, and proteins. These mappings are also the first to be created explicitly from standard terminologies in the Observational Medical Outcomes Partnership (OMOP) common data model (CDM), ensuring both semantic and clinical interoperability across a space of N conditions [and N relationships curated in these ontologies].

The mappings in this repository align OMOP standard drug exposure concepts at the ingredient level (i.e., RxNorm) to the Chemical Entities of Biological Interest (ChEBI) ontology, the National Center for Biotechnology Information Taxon Ontology (NCBITaxon), the Protein Ontology (PRO), and the Vaccine Ontology (VO). All concepts were aligned to at least one ChEBI concept, and the remaining ontologies (NCBITaxon, PRO, and VO) were mapped by drug class and/or type (e.g., biologics versus vaccines). For this OMOP domain, the owl:intersectionOf (“and”) and owl:unionOf (“or”) constructors were used to build semantically expressive mappings.
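As an illustration of how "and"/"or" constructors combine concepts into a single mapping, the sketch below renders a nested mapping as a readable logical expression. It is not taken from OMOP2OBO: the function name `render_mapping`, the tuple-based data structure, and the concept identifiers (`CHEBI_X`, `VO_Y`, `NCBITaxon_Z`) are all hypothetical placeholders.

```python
def render_mapping(node):
    """Render a nested mapping as a logical expression string.

    A node is either a concept identifier (str) or a tuple of the form
    (operator, children), where operator is "and" (owl:intersectionOf)
    or "or" (owl:unionOf) and children is a list of nodes.
    """
    if isinstance(node, str):
        return node
    operator, children = node
    if operator not in ("and", "or"):
        raise ValueError(f"unsupported constructor: {operator}")
    rendered = [render_mapping(child) for child in children]
    return "(" + f" {operator} ".join(rendered) + ")"


# Hypothetical example: an ingredient mapped to one chemical concept AND
# either a vaccine concept OR an organism concept (placeholder identifiers).
example = ("and", ["CHEBI_X", ("or", ["VO_Y", "NCBITaxon_Z"])])
print(render_mapping(example))
```

Keeping the constructor as an explicit operator on each node makes nested combinations unambiguous, which a flat list of target concepts cannot express.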

Mapping Details
Mappings included in this set were generated automatically, either directly by OMOP2OBO or with a bag-of-words embedding model using TF-IDF. Cosine similarity was used to compute similarity scores between all pairwise combinations of OMOP and OBO concepts and their ancestor concepts. To improve efficiency, the algorithm searches only the top 𝑛 most similar results and keeps the top 75th percentile among all pairs with scores >= 0.25. Manually created mappings are also included.
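The similarity step can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the OMOP2OBO implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, the default `top_n=3`, and the nearest-rank percentile are all assumptions. In particular, "keeps the top 75th percentile" is read here as "keep candidates scoring at or above the 75th percentile of the candidates that passed the 0.25 threshold", which is only one possible interpretation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build bag-of-words TF-IDF vectors as dicts of token -> weight."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            tok: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def suggest_mappings(omop_labels, obo_labels, top_n=3, min_score=0.25,
                     percentile=75):
    """Return (omop_label, obo_label, score) candidate mappings."""
    vectors = tfidf_vectors(omop_labels + obo_labels)
    omop_vecs = vectors[:len(omop_labels)]
    obo_vecs = vectors[len(omop_labels):]
    candidates = []
    for i, u in enumerate(omop_vecs):
        scored = sorted(((cosine(u, v), j) for j, v in enumerate(obo_vecs)),
                        reverse=True)
        # only the top_n most similar OBO labels are considered per concept
        for score, j in scored[:top_n]:
            if score >= min_score:
                candidates.append((omop_labels[i], obo_labels[j], score))
    if not candidates:
        return []
    # nearest-rank percentile over the surviving candidate scores
    scores = sorted(score for _, _, score in candidates)
    rank = max(1, math.ceil(percentile / 100 * len(scores)))
    cutoff = scores[rank - 1]
    return [c for c in candidates if c[2] >= cutoff]


if __name__ == "__main__":
    omop = ["acetaminophen", "influenza virus vaccine"]
    obo = ["acetaminophen", "influenza vaccine", "insulin"]
    for omop_label, obo_label, score in suggest_mappings(omop, obo):
        print(f"{omop_label} -> {obo_label} ({score:.3f})")
```

Restricting the search to the `top_n` neighbours before applying the threshold keeps the candidate list small, which matches the efficiency motivation described above.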

Mapping Categories

Automatic One-to-One Concept: Exact label or synonym, dbXRef, or expert-validated mapping at the concept level; 1:1
Automatic One-to-One Ancestor: Exact label or synonym, dbXRef, or expert-validated mapping at the concept ancestor level; 1:1
Automatic One-to-Many Concept: Exact label or synonym, dbXRef, cosine similarity, or expert-validated mapping at the concept level; 1:Many
Automatic One-to-Many Ancestor: Exact label or synonym, dbXRef, cosine similarity, or expert-validated mapping at the concept ancestor level; 1:Many
Manual One-to-One: Hand mapping created using expert-suggested resources; 1:1
Manual One-to-Many: Hand mapping created using expert-suggested resources; 1:Many
Cosine Similarity: Mapping suggested by similarity score and manually verified
UnMapped: No suitable mapping or not a mapped type

Mapping Statistics
Additional statistics have been provided for the mappings and are shown in the table below. This table presents the counts of OMOP concepts by mapping category and ontology:

Mapping category	ChEBI	NCBITaxon	PRO	VO
Automatic One-to-One Concept	3151	155	43	108
Automatic One-to-Many Constructor	404	1	1	0
Automatic One-to-One Ancestor	147	17	20	4
Automatic One-to-Many Ancestor	210	3	2	2
Cosine Similarity	109	4241	18	17
Manual	322	230	157	21
Manual One-to-Many	72	14	8	2
UnMapped	7392	7146	11558	11653
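A quick consistency check on the table: if every OMOP concept is assigned exactly one category per ontology (an assumption, not something stated above), each column should sum to the same total. The sketch below transcribes the counts from the table and computes the column totals and the fraction of concepts with any mapping. The names `COUNTS`, `column_totals`, and `mapped_fraction` are illustrative only.

```python
ONTOLOGIES = ("ChEBI", "NCBITaxon", "PRO", "VO")

# Counts transcribed from the table above, one tuple per mapping category,
# ordered as in ONTOLOGIES.
COUNTS = {
    "Automatic One-to-One Concept": (3151, 155, 43, 108),
    "Automatic One-to-Many Constructor": (404, 1, 1, 0),
    "Automatic One-to-One Ancestor": (147, 17, 20, 4),
    "Automatic One-to-Many Ancestor": (210, 3, 2, 2),
    "Cosine Similarity": (109, 4241, 18, 17),
    "Manual": (322, 230, 157, 21),
    "Manual One-to-Many": (72, 14, 8, 2),
    "UnMapped": (7392, 7146, 11558, 11653),
}


def column_totals(table):
    """Sum the counts for each ontology across all mapping categories."""
    return tuple(sum(column) for column in zip(*table.values()))


def mapped_fraction(table, unmapped_key="UnMapped"):
    """Fraction of concepts per ontology that fall outside the unmapped row."""
    totals = column_totals(table)
    return {
        name: 1 - unmapped / total
        for name, unmapped, total in zip(ONTOLOGIES, table[unmapped_key], totals)
    }


if __name__ == "__main__":
    print(dict(zip(ONTOLOGIES, column_totals(COUNTS))))
    for name, fraction in mapped_fraction(COUNTS).items():
        print(f"{name}: {fraction:.1%} mapped")
```

All four columns sum to 11,807, which is consistent with one category per concept per ontology.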

Provenance and Versioning: The V1.0 deposited mappings were created by OMOP2OBO v1.0.0 in October 2020 using the OMOP Common Data Model V5.0 and OBO Foundry ontologies downloaded on September 14, 2020.

Caveats: Please note that these are the original mappings that were created for the preprint. They have not been updated to current versions of the ontologies. In our experience, this should result in very few errors, but we do suggest that you check the ontology concepts used against current versions of each ontology before using them.
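One way to perform the suggested check is to compare the ontology identifiers used in the mappings against the identifiers in a current ontology release. The sketch below assumes you have already extracted three collections yourself: the identifiers used in the mappings, the identifiers present in the current release, and the identifiers flagged as obsolete. The function name `check_concepts` and the placeholder identifiers are hypothetical, and loading the ontology is left out.

```python
def check_concepts(mapped_ids, current_ids, obsolete_ids=()):
    """Report mapped identifiers that are obsolete or no longer present.

    mapped_ids:   identifiers used in the mapping file
    current_ids:  identifiers found in the current ontology release
    obsolete_ids: identifiers the current release marks as obsolete
    """
    mapped = set(mapped_ids)
    current = set(current_ids)
    obsolete = set(obsolete_ids)
    return {
        "obsolete": sorted(mapped & obsolete),
        "missing": sorted(mapped - current - obsolete),
    }


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    report = check_concepts(
        mapped_ids=["ID_A", "ID_B", "ID_C"],
        current_ids=["ID_A", "ID_B"],
        obsolete_ids=["ID_B"],
    )
    print(report)
```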

Important Resources and Documentation

GitHub: OMOP2OBO
Project Wiki: OMOP2OBO - wiki
Zenodo Community: OMOP2OBO
Preprint Manuscript: 10.5281/zenodo.5716421

Files (9.1 MB)

Name	Size
OMOP2OBO_V1_Drug_Exposure_Mapping_Oct2020.xlsx (md5:b6f4f30e04b66297e094cf753c23f9bf)	9.1 MB
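After downloading, the file can be checked against the published MD5 checksum. The sketch below uses Python's standard `hashlib` module. The helper name `md5_of_file` and the assumption that the file sits in the current working directory are illustrative.

```python
import hashlib
import os

# Checksum and file name as published in the file listing above.
EXPECTED_MD5 = "b6f4f30e04b66297e094cf753c23f9bf"
FILE_NAME = "OMOP2OBO_V1_Drug_Exposure_Mapping_Oct2020.xlsx"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumes the file was downloaded to the current working directory.
    if os.path.exists(FILE_NAME):
        matches = md5_of_file(FILE_NAME) == EXPECTED_MD5
        print("checksum OK" if matches else "checksum mismatch")
    else:
        print(f"{FILE_NAME} not found")
```

A mismatch usually points to an incomplete or corrupted download rather than a changed dataset, since Zenodo files are fixed per version.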

Additional details

Is cited by: Preprint: 10.5281/zenodo.5716421 (DOI)
Is compiled by: Software: https://github.com/callahantiff/OMOP2OBO (URL)
Is published in: Other: http://tiffanycallahan.com/OMOP2OBO_Dashboard (URL)
Is referenced by: Thesis: 10.5281/zenodo.5716401 (DOI)
