Published December 1, 2025 | Version 1.0
Dataset Open

SIDEKICK: A Semantically Integrated Resource for Drug Effects, Indications, and Contraindications

  • 1. King Abdullah University of Science and Technology
  • 2. Duke University

Description

Pharmacovigilance and clinical decision support systems rely on structured drug safety data to guide medical practice. However, existing datasets frequently depend on legacy terminologies, such as MedDRA, which limit the semantic reasoning and interoperability required for modern computational phenotyping. To address this gap, we developed SIDEKICK, a knowledge graph that standardizes drug indications, contraindications, and adverse reactions from FDA Structured Product Labels. We built a workflow that combines Large Language Model (LLM) extraction with Graph Retrieval-Augmented Generation (Graph RAG) for ontology mapping. We processed over 50,000 labels and mapped the extracted terms to the Human Phenotype Ontology (HPO), the MONDO Disease Ontology, and RxNorm. The resulting semantically integrated resource outperforms the SIDER and ONSIDES baseline databases on the task of drug repurposing by side effects. We serialized the dataset as a Resource Description Framework (RDF) graph and used the Semanticscience Integrated Ontology (SIO) as an upper-level ontology to further improve interoperability. Consequently, SIDEKICK enables automated safety surveillance and phenotype-based similarity analysis for drug repurposing.
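To illustrate what phenotype-based similarity analysis can look like in practice, the following minimal sketch ranks drug pairs by the overlap of their side-effect sets. It is an assumption-laden illustration, not the method used to build or evaluate SIDEKICK: the Jaccard index is a swapped-in, commonly used set-similarity measure, and the drug names and phenotype identifiers are hypothetical placeholders rather than records from the dataset.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar_pairs(effects: dict[str, set[str]]) -> list[tuple[str, str, float]]:
    """Score every unordered drug pair and sort by descending similarity."""
    pairs = [
        (d1, d2, jaccard(effects[d1], effects[d2]))
        for d1, d2 in combinations(sorted(effects), 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical toy input: placeholder drugs mapped to placeholder phenotype IDs.
# In a real analysis these sets would come from the drug-to-phenotype
# associations in the knowledge graph.
toy_effects = {
    "drug_A": {"PHENO:1", "PHENO:2", "PHENO:3"},
    "drug_B": {"PHENO:2", "PHENO:3", "PHENO:4"},
    "drug_C": {"PHENO:5"},
}

if __name__ == "__main__":
    for d1, d2, score in rank_similar_pairs(toy_effects):
        print(f"{d1} ~ {d2}: {score:.2f}")
```

Drugs that share many adverse-effect phenotypes score close to 1 and become candidates for sharing an indication. Because terms are mapped to an ontology, a more refined variant could also credit related (not only identical) phenotypes, which plain set overlap ignores.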

Files

Files (160.0 MB)

Name Size
md5:83c9d39be33028052f12883f84c63ed5
160.0 MB
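The MD5 checksum listed above can be used to confirm that a download completed without corruption. The sketch below shows one way to do this with Python's standard library; the file path passed to the functions is a hypothetical placeholder, since the file name is not shown in this record.

```python
import hashlib

# Checksum as published in the record's file listing.
EXPECTED_MD5 = "83c9d39be33028052f12883f84c63ed5"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True when the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_checksum("path/to/downloaded_file")` (a placeholder path) should return `True` for an intact copy. Reading in chunks keeps memory use flat for a file of this size.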

Additional details

Dates

Available
2025-12-02