PolyMed: A Medical Dataset Addressing Disease Imbalance for Robust Automatic Diagnosis Systems
Creators
- 1. Department of Applied Artificial Intelligence, Major in Bio Artificial Intelligence, Hanyang University, Ansan, Republic of Korea
Description
We introduce the PolyMed dataset, designed to address the limitations of existing medical case data for Automatic Diagnosis Systems (ADS). ADS assist doctors by predicting diseases from patients' basic information, such as age, gender, and symptoms. However, these systems are hampered by imbalanced disease labels and by the difficulty of accessing or collecting medical data. To tackle these issues, PolyMed combines medical knowledge graph data with diagnosis case data to improve the evaluation of ADS. The dataset aims to enable comprehensive evaluation, cover diverse disease information, make effective use of external knowledge, and support tasks closer to real-world scenarios.
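To illustrate why imbalanced disease labels complicate evaluation, the following minimal sketch compares plain accuracy with macro-averaged recall. The labels are invented toy values, not drawn from PolyMed, and the function names are illustrative assumptions rather than part of the published code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted disease matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Average of per-disease recall, so each disease counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)


# Toy labels (hypothetical): 8 cases of a common disease, 2 rare diseases.
y_true = ["flu"] * 8 + ["rare_a", "rare_b"]
# A degenerate model that always predicts the majority disease.
y_pred = ["flu"] * 10

print(accuracy(y_true, y_pred))      # dominated by the common disease
print(macro_recall(y_true, y_pred))  # penalizes the missed rare diseases
```

On these toy labels the always-majority model looks strong on accuracy but weak on macro recall, which is the kind of gap an imbalance-aware evaluation is meant to expose.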
We have also made the data collection tools publicly available to enable researchers and other interested parties to contribute additional data in a standardized format. These tools feature a range of customizable input fields that can be selectively utilized according to the user's specific requirements, ensuring consistency and professionalism in the data collection process.
All training and test code for our data is available at https://github.com/krchanyang/PolyMed
Files
data annotation tool.zip
Total size: 58.2 MB

| md5 | Size |
|---|---|
| f5b6cae4c27c51c457cfe121174e5fd6 | 57.4 MB |
| df94c9c2995afba4cdc79df19c33fa66 | 786.0 kB |
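The md5 values listed above can be used to check that a download completed intact. The sketch below uses only Python's standard `hashlib`. The file path is a placeholder, because the record does not show which file name belongs to which checksum, and the helper names are illustrative.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, expected_hex):
    """Compare a file's md5 with a checksum copied from the file listing."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Placeholder path: substitute the file you actually downloaded.
    path = "downloaded_file.zip"
    try:
        ok = matches_listed_md5(path, "f5b6cae4c27c51c457cfe121174e5fd6")
        print("checksum matches" if ok else "checksum differs")
    except FileNotFoundError:
        print(f"{path} not found; download the file first")
```

A mismatch usually indicates an incomplete or corrupted download, or that the file was compared against the checksum of the other listed file.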
Additional details
Related works
- Is supplemented by: https://github.com/krchanyang/PolyMed (URL)