Published May 13, 2025 | Version v2
Dataset Open

Traditional Chinese Medicine Multidimensional Knowledge Graph

  • 1. School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, PR China
  • 2. State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, PR China

Contributors

Project leader:

Supervisor:

Description

Overview of the Traditional Chinese Medicine Multi-dimensional Knowledge Graph (TCM-MKG)

The Traditional Chinese Medicine Multi-dimensional Knowledge Graph (TCM-MKG) is an open-source data platform developed by Jingqi Zeng in November 2024. It integrates and standardizes data from multiple sources spanning traditional Chinese medicine (TCM) and modern biomedical science. By organizing and linking this information, TCM-MKG connects TCM knowledge with contemporary medical research and applications.

Key Features and Objectives:

  • Multi-source Data Integration: TCM-MKG consolidates data from over 30 authoritative resources covering TCM terminology, Chinese patent medicines (CPM), Chinese herbal pieces (CHP), natural products (NP), chemical components, disease targets, and more. The sources are curated and interlinked to give a multi-dimensional view of TCM in relation to modern biomedical research. They include established databases such as DrugBank, BioGRID, DisGeNET, and STRING, so that TCM knowledge is cross-referenced against widely used biomedical resources.

  • Standardized Design for Global Interoperability: TCM-MKG adheres to international data standards and integrates with widely used global medical classification systems such as ICD-11, UMLS, MeSH, and DOID. This makes the platform's data globally comparable and easier to integrate with international research efforts, promoting collaboration and knowledge exchange between TCM and modern medicine.

  • Open Source and Collaborative: In line with its mission to enhance transparency and accessibility, TCM-MKG is open-sourced in a structured tabular format. This allows researchers worldwide to freely access, contribute to, and expand upon the data, fostering interdisciplinary collaboration and accelerating innovation in both TCM research and modern medicine.

  • Advanced Analytical Capabilities: By leveraging the power of knowledge graph technology and graph-based intelligence algorithms, TCM-MKG supports deep data mining and relational reasoning. Researchers can uncover hidden associations between TCM components, diseases, and targets, providing insights into the mechanisms of herbal interactions and offering new pathways for drug discovery and therapeutic research.
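Because the data are distributed as tables, the relational reasoning described in the list above can be illustrated with a minimal sketch. The table contents, the column names (`herb`, `compound`, `target`, `disease`), and the sample rows below are hypothetical placeholders, not actual TCM-MKG files or schema. The sketch only shows how several edge tables can be merged into one graph and searched for an indirect herb-to-disease link using breadth-first search.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical edge tables; real TCM-MKG file names and columns may differ.
HERB_COMPOUND_CSV = """herb,compound
HerbA,CompoundX
HerbA,CompoundY
HerbB,CompoundY
"""
COMPOUND_TARGET_CSV = """compound,target
CompoundX,GeneP
CompoundY,GeneQ
"""
TARGET_DISEASE_CSV = """target,disease
GeneQ,Disease1
"""


def load_edges(text, source_col, target_col):
    """Parse CSV text into a list of (source, target) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row[source_col], row[target_col]) for row in reader]


def build_graph(edge_lists):
    """Merge several edge lists into one undirected adjacency map."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for source, target in edges:
            graph[source].add(target)
            graph[target].add(source)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbors so the traversal order is deterministic.
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph([
    load_edges(HERB_COMPOUND_CSV, "herb", "compound"),
    load_edges(COMPOUND_TARGET_CSV, "compound", "target"),
    load_edges(TARGET_DISEASE_CSV, "target", "disease"),
])
path = shortest_path(graph, "HerbA", "Disease1")
print(" -> ".join(path))
```

In this toy data the only route from `HerbA` to `Disease1` passes through `CompoundY` and `GeneQ`. A path of this kind is a candidate association to examine further, not evidence of a mechanism.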

Personal Research Application:

Using the TCM-MKG platform, I conducted a study titled "Graph Neural Networks for Quantifying Compatibility Mechanisms in Traditional Chinese Medicine." This research applied advanced graph intelligence algorithms to quantitatively assess the compatibility mechanisms of Chinese herbal formulas (CHF). The study provides fresh insights into the underlying principles of TCM herbal combinations.

This research has been published:

Zeng, J., & Jia, X. (2025). Quantifying compatibility mechanisms in traditional Chinese medicine with interpretable graph neural networks. Journal of Pharmaceutical Analysis, 101342. https://doi.org/10.1016/j.jpha.2025.101342

The code and methodology for this research have been open-sourced and are available on GitHub.

Acknowledgments:

This work benefited from the integration of data from numerous open-access and authoritative databases. We acknowledge the valuable contributions of resources such as DrugBank, BindingDB, BioGRID, DisGeNET, and many others. These datasets provided essential insights into TCM, modern drug chemistry, genetics, diseases, and related fields, forming the foundation for the traditional Chinese medicine multi-dimensional knowledge graph (TCM-MKG) used in this study. Furthermore, we utilized the PSICHIC model (https://github.com/huankoh/PSICHIC) to analyze the binding interactions between components and targets. Full citations for these resources are included.

Contact Information:

For further inquiries or more detailed information, please feel free to contact:
Email: zjingqi@163.com

 

Notes (English)

 

Reference Databases

This database integrates data from a diverse range of authoritative sources, encompassing key areas such as Traditional Chinese Medicine (TCM), modern drug chemistry, genetics, diseases, and more. The data comes from over 30 reputable databases, which include, but are not limited to, DrugBank, BindingDB, BioGRID, DisGeNET, NCBI Taxonomy, and others. These sources offer high-quality, up-to-date information that spans both traditional and contemporary medical research, providing a comprehensive view of the relationships between TCM and modern biomedical science.

  1. World Health Organization International Standard Terminologies on Traditional Chinese Medicine (WHO IST TCM) – Provides standard terminology for Traditional Chinese Medicine.
    Version: 3-Mar-22 | Link

  2. NCBI Taxonomy – Detailed classification of biological species.
    Version: 22-May-24 | Link

  3. Collective Molecular Activities of Useful Plants (CMAUP) – A database of the molecular activities of useful plants and their ingredients.
    Version: V2.0 | Link

  4. RDKit – An open-source toolkit for cheminformatics.
    Version: Apr-24 | Link

  5. Binding Database (BindingDB) – Provides data on the interactions between small molecules and biological macromolecules.
    Version: 28-Apr-24 | Link

  6. DrugBank – A comprehensive resource for drug and drug target information.
    Version: V5.1.12 | Link

  7. Search Tool for Interactions of Chemicals (STITCH) – A database of known and predicted chemical-protein interactions.
    Version: V5.0 | Link

  8. Therapeutic Target Database (TTD) – A database of therapeutic targets and their associated diseases.
    Version: 10-Jan-24 | Link

  9. Natural Product Classifier (NPClassifier) – A classifier for natural products based on their molecular properties.
    Version: V1.5 | Link

  10. Biological General Repository for Interaction Datasets (BioGRID) – A database of protein and genetic interactions.
    Version: V4.4.233 | Link

  11. IntAct – A database for protein interaction data.
    Version: 15-Feb-24 | Link

  12. Molecular INTeraction Database (MINT) – A database for molecular interaction data.
    Version: 22-May-24 | Link

  13. Signaling Network Open Resource (SIGNOR) – A database for signaling networks and pathways.
    Version: V3.0 | Link

  14. Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) – A tool for retrieving interaction data between genes/proteins.
    Version: V12.0 | Link

  15. Ensembl – A genome browser for accessing genomic data.
    Version: GRCh37.p13 | Link

  16. UniProt – A comprehensive resource for protein sequence and functional data.
    Version: Release 2024_02 | Link

  17. UMLS (Unified Medical Language System) – A system that integrates biomedical terminologies.
    Version: 6-May-24 | Link

  18. China Medical Information Platform (CMIP) – A medical information platform for China.
    Version: 22-May-24 | Link

  19. Pharmacopoeia of the People’s Republic of China 2020 (PPRC 2020) – The official pharmacopoeia of China.
    Version: 2020 | Link

  20. Disease Ontology (DO) – A comprehensive resource for disease classification and terminology.
    Version: Release 2024_04 | Link

  21. DisGeNET – A database of human disease-gene associations.
    Version: V7.0 | Link

  22. Comparative Toxicogenomics Database (CTD) – A database for toxicology and genomics data.
    Version: Apr-24 | Link

  23. Diseases – A comprehensive collection of disease-related data.
    Version: V2.0 | Link

  24. International Classification of Diseases, 11th Revision (ICD-11) – The latest revision of the WHO’s disease classification system.
    Version: Jan-24 | Link

  25. Encyclopedia of Traditional Chinese Medicine (ETCM) – An encyclopedia of TCM knowledge.
    Version: V2.0 | Link

  26. HERB (BenCaoZuJian) – A high-throughput experiment- and reference-guided database of traditional Chinese medicine, including herb, ingredient, and target associations.
    Version: V2.0 | Link

  27. Herbal Ingredients Targets Database (HIT) – A comprehensive database of herbal ingredient-target interactions.
    Version: V2.0 | Link

  28. Symptom Mapping Database (SymMap) – A database mapping symptoms to diseases and treatments.
    Version: V2.0 | Link

  29. Traditional Chinese Medicine Bank (TCMbank) – A comprehensive database of Traditional Chinese Medicine.
    Version: V1.0 | Link

  30. Traditional Chinese Medicine Integrated Database (TCMID) – An integrated database of Traditional Chinese Medicine resources.
    Version: V2.0 | Link

  31. Traditional Chinese Medicine Systems Pharmacology Database (TCMSP) – A database focused on the pharmacology of Traditional Chinese Medicine.
    Version: V1.0 | Link

These rich and varied data sources collectively create a robust and interconnected framework that enhances our understanding of both TCM and its potential integration into current scientific research. By combining diverse knowledge from various fields, the platform supports interdisciplinary exploration and provides valuable insights for researchers, clinicians, and practitioners in the study and application of TCM.

Files

TCM-MKG_Open_Source_Documentation.pdf

Files (1.1 GB)

MD5 checksum | Size
md5:2a4ff7b94561063048c674085b82ccbc | 671.4 kB
md5:41ab5677f198c40a4dee20b76c25579d | 18.2 MB
md5:c665b9efff68668ce8ca062fb956fbcf | 60.4 MB
md5:296221f21145c337ad711c57647d551e | 74.8 MB
md5:dc572af7e06e6dc6d785e368ac60403c | 6.4 MB
md5:d0bdd7bed39f71fc814aaa150f13e4d4 | 16.1 MB
md5:35cd485ad7fbda22449b0b47ab6a2788 | 83.5 MB
md5:41d9ffee75f8ffc1f2710011228ffd48 | 12.1 MB
md5:092ea2429120c3419a7d575f06ee028d | 2.7 MB
md5:3816372c4f7dd851c064c884abec668d | 5.3 MB
md5:202001bab5f23d22fe1acdbf0c633714 | 1.0 MB
md5:d5013aa4e68671e8b9610fad4bd95622 | 107.2 kB
md5:af706ebaf67a6fe6f8a56b0b114f57dd | 88.1 kB
md5:4b99ea5003b88d122d2e2ade4c50ea2a | 8.7 MB
md5:9028e1288393564dda70a2ce17b7ef2a | 356.4 MB
md5:7819adcef68125ad3aeb15036e15c014 | 111.9 MB
md5:348f72b83f83156f2318e7dd9b01fd1c | 457.7 kB
md5:87ac3267f672ffa79e5d89f153ac5c1e | 1.5 MB
md5:d2c98b6c7a9015878edcc07544dcaa4b | 1.7 MB
md5:025a9775783bc9fb42a7f54f118ab935 | 1.2 MB
md5:853e25ec51cd3c657646ce3254d133a8 | 457.5 kB
md5:4dfba51deaba06744185c6d58553a538 | 1.2 MB
md5:0a71a613fd86dc99f1ebaef4d1590cba | 354.3 kB
md5:8524a67c1c50b81f0eae313c1706daa7 | 10.5 MB
md5:b46ceb5e163fcf78bfe2c655316e77f6 | 331.1 MB
md5:725b5919749356d2274dbb0c7902baa6 | 835.5 kB
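
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_download` are illustrative, and the demonstration runs on a temporary file rather than on an actual TCM-MKG file, since file names are not shown in this listing.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 with a listed value such as 'md5:2a4f...'."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Demonstration on a temporary file (a stand-in, not a real TCM-MKG file).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.tsv"
    sample.write_bytes(b"herb\tcompound\n")
    listed = "md5:" + md5_of_file(sample)
    demo_ok = verify_download(sample, listed)
    demo_bad = verify_download(sample, "md5:" + "0" * 32)
print(demo_ok, demo_bad)
```

For a real download, pass the local file path together with the matching checksum string from the list, for example `verify_download(local_path, "md5:2a4ff7b94561063048c674085b82ccbc")`, where `local_path` points to the corresponding file. MD5 is adequate for detecting accidental corruption but does not protect against deliberate tampering.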

Additional details

Additional titles

Translated title (Mandarin Chinese)
中医药多维知识图谱

Related works

Is described by
Thesis: 10.1016/j.jpha.2025.101342 (DOI)

Dates

Available
2024-11-10

Software

Repository URL
https://github.com/ZENGJingqi/GraphAI-for-TCM
Programming language
Python
Development Status
Active

References

  • Zeng, J., & Jia, X. (2025). Quantifying compatibility mechanisms in traditional Chinese medicine with interpretable graph neural networks. Journal of Pharmaceutical Analysis, 101342. https://doi.org/10.1016/j.jpha.2025.101342