Published March 5, 2026 | Version 1.0
Dataset Open

TCM4NSCLC: A Real-world Dataset for Chinese Medicine Reasoning on Non-small-cell Lung Cancer

Description

TCM4NSCLC is a real-world dataset designed for Chinese medicine reasoning in non-small-cell lung cancer (NSCLC). The dataset contains structured clinical cases derived from real-world medical records and expert annotations.

Each case includes patient clinical information, syndrome differentiation, treatment principles, herbal prescriptions, and Chinese patent medicine recommendations.

The dataset is designed to support research on large language models (LLMs) for traditional Chinese medicine reasoning, clinical decision support, and explainable medical AI.

The dataset includes:
- structured case descriptions
- TCM syndrome differentiation labels
- treatment principles
- herbal prescriptions
- Chinese patent medicine recommendations
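The record does not document the JSON schema, so a reasonable first step is to load a file and see which fields each case actually carries. The following is a minimal sketch, not the authors' tooling: it assumes the file holds either a single case object or a list of case objects, and the helper names `load_cases` and `summarize_fields` are hypothetical.

```python
import json
from pathlib import Path


def load_cases(path):
    """Load a JSON file and return its records as a list.

    Assumption: the file holds either one case object or a list of them.
    """
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    return data if isinstance(data, list) else [data]


def summarize_fields(cases):
    """Count how often each top-level key appears across the cases."""
    counts = {}
    for case in cases:
        if not isinstance(case, dict):
            continue  # skip anything that is not a JSON object
        for key in case:
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # File name as listed on the record page; adjust to the local download path.
    path = Path("example_en.json")
    if path.exists():
        cases = load_cases(path)
        print(f"{len(cases)} case(s) loaded")
        for key, n in sorted(summarize_fields(cases).items()):
            print(f"{key}: present in {n} case(s)")
```

Counting keys rather than hard-coding field names keeps the sketch independent of the actual schema, which should be read from the downloaded files.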

Files

example_en.json

Files (4.5 MB)

Checksum (MD5) Size
md5:7923b72b251856912dbbc6c99d91a8ac 2.5 kB
md5:41c6a6870665c03896dedab1090c48ad 2.8 kB
md5:7854c0241d5d8cb743597dc335c9cb9b 2.1 kB
md5:cc582b560c9d7e52ecdbe8bc363c1160 450.3 kB
md5:2c03a1dc09dd7f80d4dd94ca7244293d 3.6 MB
md5:4e24602de8c40b675c6a1358330c7bc8 448.3 kB
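
The MD5 values listed for the files can be used to check that a download is intact. The sketch below uses Python's standard `hashlib`; the helper names `md5_of_file` and `matches_checksum` are hypothetical. The listing does not show which file name belongs to which checksum, so that pairing has to be taken from the record page itself.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:7923b7...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]  # drop the 'md5:' prefix used in the listing
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:…")` returns `True` when the local copy is byte-identical to the published file. MD5 is adequate for detecting corrupted transfers, though it is not a defence against deliberate tampering.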

Additional details

Related works