MedPix-2.0
Description
MedPix 2.0: A Comprehensive Multimodal Biomedical Dataset for Advanced AI Applications.
Please cite our work as follows if you use MedPix 2.0:
```
@misc{siragusa2025medpix20comprehensivemultimodal,
title={MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications with Retrieval Augmented Generation and Knowledge Graphs},
author={Irene Siragusa and Salvatore Contino and Massimo La Ciura and Rosario Alicata and Roberto Pirrone},
year={2025},
eprint={2407.02994},
archivePrefix={arXiv},
primaryClass={cs.DB},
url={https://arxiv.org/abs/2407.02994},
}
```
A description of Case_topic.json and Descriptions.json is provided below. The images folder contains all the images of the dataset, while the splitted_dataset folder provides a split of the dataset; please refer to /splitted_dataset/README.md for further information.
Case_topic.json
Contains a list of JSON objects, each of which provides the information for a single clinical case. The structure of each element is reported below:
- U_id: the UID code that identifies a clinical case.
- TAC: list of names of the .png files containing the CT scans (if present). Images are under the image folder.
- MRI: list of names of the .png files containing the MR scans (if present). Images are under the image folder.
- Case: dictionary with the information of the clinical case. It contains the following information:
  - Title: the diagnosis
  - History: patient's history
  - Exam
  - Findings
  - Differential Diagnosis
  - Case Diagnosis
  - Diagnosis By
- Topic: dictionary with the general information about the disease. It contains the following information:
  - Title: the diagnosis
  - Disease Discussion
  - ACR Code
  - Category
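The structure above can be read with Python's standard json module. The following is a minimal sketch, assuming that Case_topic.json is a JSON array at the top level and that the keys are spelled exactly as listed (U_id, TAC, MRI, Case, Topic). The helper names load_cases and summarize_case are hypothetical and are not part of the dataset or its repository.

```python
import json
from pathlib import Path


def load_cases(path):
    """Load the list of clinical cases from a Case_topic.json-style file."""
    with Path(path).open(encoding="utf-8") as f:
        cases = json.load(f)
    if not isinstance(cases, list):
        raise ValueError("expected a top-level JSON list of cases")
    return cases


def summarize_case(case):
    """Return the UID, the case title, and the image counts for one case.

    TAC and MRI are described as present only when scans exist; how a
    missing list is represented (absent key, null, or empty list) is an
    assumption, so all three are treated as "no images".
    """
    ct_images = case.get("TAC") or []
    mr_images = case.get("MRI") or []
    return {
        "U_id": case.get("U_id"),
        "title": (case.get("Case") or {}).get("Title"),
        "n_ct": len(ct_images),
        "n_mr": len(mr_images),
    }
```

Under these assumptions, calling summarize_case on the first element returned by load_cases gives a quick check that the file was parsed as expected before any image is opened.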
Descriptions.json
Contains a list of JSON objects, each of which provides the textual information about a single image stored in the image folder. The structure of each element is reported below:
- Type: can be CT or MR; identifies the scanning modality of the image.
- U_id: the UID code of the clinical case the image belongs to.
- image: name of the image file
- location: fine-grained information about the body part location of the given image
- location category: macro-location of the body part shown in the given image
- Description: dictionary with the descriptive information of the image. It contains the following information:
  - ACR codes
  - Age: age of the patient
  - Sex: sex of the patient
  - Caption: refers to the specific caption of the image
  - Figure part
  - Modality: scanning modality of the image
  - Plane
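Since every image description carries the U_id of its clinical case, the two files can be linked on that key. The sketch below assumes the entries have already been loaded into a Python list of dictionaries and that the keys (Type, U_id, image, Description, Caption) and the Type values (CT, MR) are spelled as listed above. The function names are hypothetical.

```python
from collections import defaultdict


def group_descriptions_by_case(descriptions):
    """Map each case UID to the list of image descriptions that reference it."""
    by_case = defaultdict(list)
    for entry in descriptions:
        by_case[entry["U_id"]].append(entry)
    return dict(by_case)


def captions_for_case(descriptions, u_id, modality=None):
    """Return (image name, caption) pairs for one case.

    If modality is given (assumed to be "CT" or "MR"), only entries whose
    Type matches are kept. A missing caption is returned as None.
    """
    pairs = []
    for entry in group_descriptions_by_case(descriptions).get(u_id, []):
        if modality is not None and entry.get("Type") != modality:
            continue
        caption = (entry.get("Description") or {}).get("Caption")
        pairs.append((entry.get("image"), caption))
    return pairs
```

Grouping once and reusing the resulting dictionary is preferable when many cases are queried; captions_for_case regroups on each call only to keep the sketch short.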
Files
splitted_dataset.zip
Additional details
Identifiers
- arXiv: arXiv:2407.02994
Dates
- Submitted: 2024-06-30
Software
- Repository URL: https://github.com/CHILab1/MedPix-2.0.git
- Development Status: Active