Published November 7, 2023 | Version 1.0
Dataset Open

The MultiCaRe Dataset: A Multimodal Case Report Dataset with Clinical Cases, Labeled Images and Captions from Open Access PMC Articles

  • 1. Universidad Nacional del Sur

Description

The dataset contains multimodal data from over 75,000 open-access, de-identified case reports, including metadata, clinical cases, image captions and more than 130,000 images. Images and clinical cases belong to different medical specialties, such as oncology, cardiology, surgery and pathology. The structure of the dataset makes it easy to map images to their corresponding article metadata, clinical cases, captions and image labels. Details of the data structure can be found in the file data_dictionary.csv.
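The mapping described above can be pictured as a join on a shared identifier. The following is a minimal sketch only: the field names (`case_id`, `file`, `caption`, `case_text`), the helper `attach_cases` and the toy records are hypothetical and made up for illustration. The actual column names are documented in data_dictionary.csv.

```python
def attach_cases(image_rows, case_rows, key="case_id"):
    """Pair each image record with the clinical case that shares its key.

    `key` is a hypothetical identifier name; check data_dictionary.csv
    for the real columns before using this on the dataset.
    """
    cases_by_key = {row[key]: row for row in case_rows}
    joined = []
    for image in image_rows:
        case = cases_by_key.get(image[key])
        if case is None:
            continue  # skip images that have no matching case
        # Image fields win if both records share a field name.
        joined.append({**case, **image})
    return joined


# Toy records standing in for rows loaded from the dataset files (invented).
images = [
    {"file": "img_001.jpg", "case_id": "A1", "caption": "Chest X-ray."},
    {"file": "img_002.jpg", "case_id": "B7", "caption": "Abdominal CT."},
    {"file": "img_003.jpg", "case_id": "Z9", "caption": "Unmatched image."},
]
cases = [
    {"case_id": "A1", "case_text": "A patient presented with cough."},
    {"case_id": "B7", "case_text": "A patient presented with pain."},
]

joined_rows = attach_cases(images, cases)
```

Building the lookup dictionary once keeps the join linear in the number of rows, which matters for a collection with more than 130,000 images.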

Almost 100,000 patients and almost 400,000 medical doctors and researchers were involved in the creation of the articles included in this dataset. The citation data of each article can be found in the metadata.parquet file.

Refer to the examples in this GitHub repository to see how to make the most of this dataset.

For a detailed description of the contents of this dataset, please refer to this data article published in Data in Brief.

Files

Files (8.8 GB)

MD5 checksum                            Size
md5:db66ace65fd73cd25861f2b541ec8b8a    44.9 MB
md5:190cc54f77e38f3872ac268417f94c31    60.1 MB
md5:bacaf73b1a0c0b6aca030d62559bb472    52.0 MB
md5:b1f7f7a14844e6ea66b5f14f52197e4a    159.6 MB
md5:33ff266f8e177f2c0ec813a40fdb067c    5.9 kB
md5:d0db089d8e4fe5deb998ede53abcc0b9    20.3 MB
md5:b5683e60f25bd8baee9fcb56517050a2    573.1 MB
md5:91dfb43f52013d66887ae7137a3e45f0    194.9 MB
md5:a38e4bdf7b30b99f2c816b8d482ff82d    1.5 GB
md5:d54b5c7eec18e0ceeb45a3c3fe22053c    1.6 GB
md5:112ecfa754ce4bab6b5c7fadfac483b8    1.5 GB
md5:2fffc03cd9927478b04f50f94a4c3a88    1.2 GB
md5:da2d115b48842d9fdf4eccfe9705fcbe    968.4 MB
md5:16e4881ec92662f38457b3a17c63ad84    837.6 MB
md5:9d0fb956d8ad8e009c039bbc3e6612b8    186.9 MB
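
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using only the Python standard library. The helper names `file_md5` and `md5_matches` are hypothetical, and because this listing does not show which checksum belongs to which file, the pairing has to be taken from the record page itself.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, listed_checksum):
    """Compare a file's digest with a listed value such as 'md5:abc123...'."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return file_md5(path) == expected
```

Reading in chunks keeps memory use flat even for the archives larger than 1 GB. A call would look like `md5_matches(local_path, "md5:<value from the listing>")`, where `local_path` is wherever the file was saved.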

Additional details

Related works

Is published in
Data paper: 10.1016/j.dib.2023.110008 (DOI)