Pharmacogenomics datasets for Ontology Matching
Authors/Creators
- 1. Université Côte d'Azur, Inria, CNRS, I3S, France
- 2. Inria Paris, Centre de Recherche des Cordeliers, Inserm, Université Paris Cité, Sorbonne Université, Paris, France
Description
Pharmacogenomics (PGx for short) knowledge is expressed as n-ary tuples, so-called "pharmacogenomic relationships", whose components are of three distinct types: drugs, genetic factors, and phenotypes. Tuples are reified as instances of the class ``pgxo:PharmacogenomicRelationship``. The goal of the matching task is to match these tuples (instance matching).
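To make the reification concrete, here is a minimal sketch in plain Python of one tuple represented as a node typed ``pgxo:PharmacogenomicRelationship`` and linked to its components. Only the class name comes from the description above. The predicate ``ex:hasComponent``, the ``ex:`` identifiers, and the component type names are hypothetical placeholders and are not taken from PGxO or PGxLOD.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
PGX_RELATIONSHIP = "pgxo:PharmacogenomicRelationship"
HAS_COMPONENT = "ex:hasComponent"  # hypothetical predicate, for illustration only

# One reified tuple (ex:rel1) with three components of distinct types.
# All ex: identifiers and type names are made-up placeholders.
triples = [
    ("ex:rel1", RDF_TYPE, PGX_RELATIONSHIP),
    ("ex:rel1", HAS_COMPONENT, "ex:drug_A"),
    ("ex:rel1", HAS_COMPONENT, "ex:genetic_factor_B"),
    ("ex:rel1", HAS_COMPONENT, "ex:phenotype_C"),
    ("ex:drug_A", RDF_TYPE, "ex:Drug"),
    ("ex:genetic_factor_B", RDF_TYPE, "ex:GeneticFactor"),
    ("ex:phenotype_C", RDF_TYPE, "ex:Phenotype"),
]


def relationships(triples):
    """Map each reified relationship node to the set of its components."""
    rel_nodes = {s for s, p, o in triples if p == RDF_TYPE and o == PGX_RELATIONSHIP}
    components = defaultdict(set)
    for s, p, o in triples:
        if s in rel_nodes and p == HAS_COMPONENT:
            components[s].add(o)
    return dict(components)
```

In this representation, instance matching amounts to deciding which relationship nodes from the source and the target (such as ``ex:rel1``) should be aligned, based on their components.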
Motivation: Pharmacogenomic tuples involve drugs, genetic factors, and phenotypes, and state that patients treated with the specified drugs and carrying the specified genetic factors may experience the given phenotypes. Knowledge in pharmacogenomics is scattered across several resources, e.g., reference databases (PharmGKB) or the biomedical literature. Hence, there is a need to build a consolidated view of the knowledge of this domain by aligning tuples from different sources. See [1] for a detailed motivation and [2] for a detailed task description.
Datasets
We provide different subsets of the alignments available in PGxLOD that have been created with the matching rules described in [3].
Task with 10 % of PGx relationships
- Alignments: 1092
- relatedMatch alignments: 66
- sameAs alignments: 498
- closeMatch alignments: 53
- broadMatch alignments: 333
- narrowMatch alignments: 142
- Entities to align in source: 2525
- Triples in source: 816604
- Entities to align in target: 2518
- Triples in target: 816859
Task with 50 % of PGx relationships
- Alignments: 23630
- relatedMatch alignments: 1245
- sameAs alignments: 9219
- closeMatch alignments: 1175
- broadMatch alignments: 7183
- narrowMatch alignments: 4808
- Entities to align in source: 12816
- Triples in source: 894723
- Entities to align in target: 12401
- Triples in target: 889735
Task with 100 % of PGx relationships
- Alignments: 89926
- relatedMatch alignments: 4979
- sameAs alignments: 35499
- closeMatch alignments: 4603
- broadMatch alignments: 26135
- narrowMatch alignments: 18710
- Entities to align in source: 25406
- Triples in source: 982548
- Entities to align in target: 25029
- Triples in target: 980543
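The per-type counts listed above can be cross-checked against the reported totals, and the share of each alignment type compared across tasks. The following is a small sketch that uses only the figures from this page; the dictionary layout and function names are illustrative choices.

```python
# Alignment counts copied from the three task descriptions on this page.
TASKS = {
    "10%": {
        "total": 1092,
        "by_type": {"relatedMatch": 66, "sameAs": 498, "closeMatch": 53,
                    "broadMatch": 333, "narrowMatch": 142},
    },
    "50%": {
        "total": 23630,
        "by_type": {"relatedMatch": 1245, "sameAs": 9219, "closeMatch": 1175,
                    "broadMatch": 7183, "narrowMatch": 4808},
    },
    "100%": {
        "total": 89926,
        "by_type": {"relatedMatch": 4979, "sameAs": 35499, "closeMatch": 4603,
                    "broadMatch": 26135, "narrowMatch": 18710},
    },
}


def is_consistent(task):
    """Return True if the per-type counts add up to the reported total."""
    return sum(task["by_type"].values()) == task["total"]


def shares(task):
    """Return the fraction of alignments of each type within one task."""
    return {kind: count / task["total"] for kind, count in task["by_type"].items()}


if __name__ == "__main__":
    for name, task in TASKS.items():
        print(name, "consistent:", is_consistent(task))
        for kind, fraction in shares(task).items():
            print(f"  {kind}: {fraction:.1%}")
```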
References
- [1] Pierre Monnin, Joël Legrand, Graziella Husson, Patrice Ringot, Andon Tchechmedjiev, Clément Jonquet, Amedeo Napoli, Adrien Coulet: PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison. BMC Bioinformatics 20-S(4): 139:1-139:16 (2019)
- [2] Pierre Monnin, Adrien Coulet: Matching pharmacogenomic knowledge: particularities, results, and perspectives. OM@ISWC 2022: 79-83
- [3] Pierre Monnin, Miguel Couceiro, Amedeo Napoli, Adrien Coulet: Knowledge-Based Matching of n-ary Tuples. ICCS 2020: 48-56
Files
| Name | Size |
|---|---|
| pharmacogenomics-om-v1.0.0.zip (md5:709560077480d850377a906ba68af1d3) | 49.5 MB |
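After downloading, the archive can be checked against the MD5 checksum published above. The following is a minimal sketch using Python's standard `hashlib`; the function names and the assumption that the archive sits in the current working directory are illustrative.

```python
import hashlib

# Checksum published on this page for pharmacogenomics-om-v1.0.0.zip.
EXPECTED_MD5 = "709560077480d850377a906ba68af1d3"


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of(path) == expected
```

For example, `verify("pharmacogenomics-om-v1.0.0.zip")` should return `True` for an intact download, assuming the file was saved under its original name in the current directory.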
Additional details
Related works
- Cites
- Journal article: 10.1186/s12859-019-2693-9 (DOI)
- Conference paper: https://ceur-ws.org/Vol-3324/om2022_STpaper3.pdf (URL)
- Conference paper: 10.1007/978-3-030-57855-8_4 (DOI)