There is a newer version of the record available.

Published December 13, 2024 | Version 2.0
Dataset Open

MultiCaRe: An open-source clinical case dataset for medical image classification and multimodal AI applications

  • 1. Universidad Nacional del Sur

Description

The dataset contains multimodal data from over 85,000 open-access, de-identified case reports, including metadata, clinical cases, image captions and more than 160,000 images. The images and clinical cases span different medical specialties, such as oncology, cardiology, surgery and pathology. The structure of the dataset makes it easy to map each image to its corresponding article metadata, clinical case, captions and image labels. Details of the data structure can be found in the file data_dictionary.csv.
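Since the data structure is documented in data_dictionary.csv, a quick way to get oriented is to print that file's header and first few rows. The following is a minimal sketch using only the Python standard library. It makes no assumption about the column names in the file. The local path and the UTF-8 encoding are assumptions, and the helper name read_data_dictionary is hypothetical.

```python
import csv
from pathlib import Path


def read_data_dictionary(path):
    """Return (column_names, rows) from a CSV file.

    No particular column layout is assumed; each row is a dict keyed by
    whatever header the file provides.
    """
    # "utf-8-sig" reads plain UTF-8 and also strips a byte-order mark if present
    # (the encoding itself is an assumption about the file).
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    # Assumed location of the downloaded file; adjust as needed.
    dictionary_path = Path("data_dictionary.csv")
    if dictionary_path.exists():
        columns, rows = read_data_dictionary(dictionary_path)
        print("columns:", columns)
        for row in rows[:5]:
            print(row)
```

Reading the dictionary first shows which files and fields exist before any of the larger files are opened.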

More than 110,000 patients and 300,000 medical doctors and researchers were involved in the creation of the articles included in this dataset. The citation data of each article can be found in the metadata.parquet file.

Refer to the examples in this GitHub repository to learn how to make the most of this dataset.

Files (3.7 GB)

MD5 checksum                           Size
md5:4aa38f8f0d23662c7e45179d9909ec94   51.8 MB
md5:979205f4f6807d64b602b85626b83506   60.3 MB
md5:a08cca95204340f7946368ce421ad439   60.3 MB
md5:614ecfc5f13c9b7b9959565212794896   184.4 MB
md5:2de670f0f631189192835ee17830c4e3   6.4 kB
md5:4216c8d202fb10ba5440bbdf4a20fb85   23.2 MB
md5:d0e88683716e932fb21654a757907791   696.7 MB
md5:145a647fbdc1a8f4488fa699b25bbcd5   64.9 MB
md5:235d9b20b8af200b1f406342d2fa033c   454.9 MB
md5:11c95726aee91bd7d5d182d0f044e2ff   515.3 MB
md5:87f1620b14a657c843b102877b7c6ad5   491.9 MB
md5:20d337078c9289bc0544d73602a1f575   384.8 MB
md5:b4aab3fd4f1a4b58948cf9631a601eff   316.3 MB
md5:127af1527cdf52820d79fe0d6beafc15   283.9 MB
md5:a7b8a1efc9953d9af9480dc2299ec5fb   63.7 MB
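Each file in the listing is accompanied by an MD5 checksum, which can be used to confirm that a download completed without corruption. The sketch below uses only Python's standard hashlib module. The helper names md5_of_file and matches_listing are hypothetical, and the sketch assumes the "md5:" prefix format shown in the listing.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file's digest with a listed value such as 'md5:<32 hex digits>'.

    The 'md5:' prefix is optional; comparison is case-insensitive.
    """
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as matches_listing(local_path, "md5:...") returns True when the downloaded file matches the listed value. Which checksum belongs to which file has to be taken from the record page itself, since the listing does not pair them with file names.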

Additional details

Related works

Is published in
Data paper: 10.1016/j.dib.2023.110008 (DOI)