Published January 16, 2024 | Version 3.0
Dataset Open

[MedMNIST+] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification with Multiple Size Options: 28 (MNIST-Like), 64, 128, and 224

  • 1. Shanghai Jiao Tong University
  • 2. Harvard University
  • 3. Boston College
  • 4. RWTH Aachen University
  • 5. Zhongshan Hospital Affiliated to Fudan University
  • 6. Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine

Description

Code [GitHub] | Publication [Nature Scientific Data'23 / ISBI'21] | Preprint [arXiv]

 

Abstract

We introduce MedMNIST, a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into 28x28 (2D) or 28x28x28 (3D) with the corresponding classification labels, so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST is designed to perform classification on lightweight 2D and 3D images with various data scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression and multi-label). The resulting dataset, consisting of approximately 708K 2D images and 10K 3D images in total, could support numerous research and educational purposes in biomedical image analysis, computer vision and machine learning. We benchmark several baseline methods on MedMNIST, including 2D / 3D neural networks and open-source / commercial AutoML tools. The data and code are publicly available at https://medmnist.com/.

Disclaimer: The only official distribution link for the MedMNIST dataset is Zenodo. We kindly request users to refer to this original dataset link for accurate and up-to-date data.

Update: We are thrilled to release MedMNIST+ with larger sizes: 64x64, 128x128, and 224x224 for 2D, and 64x64x64 for 3D. As a complement to the previous 28-size MedMNIST, the large-size version could serve as a standardized benchmark for medical foundation models. Install the latest API to try it out!

 

Python Usage

We recommend our official code to download, parse and use the MedMNIST dataset:

% pip install medmnist
% python

To use the standard 28-size (MNIST-like) version with files that have already been downloaded:

>>> from medmnist import PathMNIST
>>> train_dataset = PathMNIST(split="train")

To enable automatic downloading, set `download=True`:

>>> from medmnist import NoduleMNIST3D
>>> val_dataset = NoduleMNIST3D(split="val", download=True)

Alternatively, you can access MedMNIST+ with larger image sizes by specifying the `size` parameter:

>>> from medmnist import ChestMNIST
>>> test_dataset = ChestMNIST(split="test", download=True, size=224)
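
The size options above (28, 64, 128, and 224 for 2D; 28 and 64 for 3D) can be checked before requesting a download. The following is a minimal sketch and not part of the official `medmnist` API: the helper name `npz_filename` is hypothetical, the rule that 3D dataset names end in "3d" is a heuristic, and the `_<size>` file-name suffix for larger sizes is an assumption extrapolated from the file name `nodulemnist3d.npz` mentioned in the changelog. Verify the names against the actual file list before relying on them.

```python
# Sizes taken from the MedMNIST+ release notes on this page.
SIZES_2D = (28, 64, 128, 224)
SIZES_3D = (28, 64)


def npz_filename(flag: str, size: int = 28) -> str:
    """Return the assumed .npz file name for a dataset flag and image size.

    `flag` is the dataset name, e.g. "pathmnist" or "NoduleMNIST3D".
    The "_<size>" suffix for sizes other than 28 is an assumption,
    not a documented naming rule.
    """
    flag = flag.lower()
    # Heuristic: treat names ending in "3d" as 3D datasets.
    allowed = SIZES_3D if flag.endswith("3d") else SIZES_2D
    if size not in allowed:
        raise ValueError(
            f"size {size} is not available for {flag}; choose from {allowed}"
        )
    return f"{flag}.npz" if size == 28 else f"{flag}_{size}.npz"
```

Calling `npz_filename("ChestMNIST", 224)` returns a candidate file name for the 224x224 version, while an unsupported combination such as a 3D dataset at size 128 raises `ValueError` instead of failing later during download.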

 

Citation

If you find this project useful, please cite both the v1 and v2 papers as:

Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni. "MedMNIST v2-A large-scale lightweight benchmark for 2D and 3D biomedical image classification." Scientific Data, 2023.

Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis". IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.

or using bibtex:

@article{medmnistv2,
    title={MedMNIST v2-A large-scale lightweight benchmark for 2D and 3D biomedical image classification},
    author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing},
    journal={Scientific Data},
    volume={10},
    number={1},
    pages={41},
    year={2023},
    publisher={Nature Publishing Group UK London}
}

@inproceedings{medmnistv1,
    title={MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis},
    author={Yang, Jiancheng and Shi, Rui and Ni, Bingbing},
    booktitle={IEEE 18th International Symposium on Biomedical Imaging (ISBI)},
    pages={191--195},
    year={2021}
}

Please also cite the corresponding paper(s) of source data if you use any subset of MedMNIST as per the description on the project website.

 

License

The MedMNIST dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0), except DermaMNIST, which is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

The code is released under the Apache-2.0 License.

 

Changelog

v3.0 (this repository): Released MedMNIST+ featuring larger sizes: 64x64, 128x128, and 224x224 for 2D, and 64x64x64 for 3D.

v2.2: Removed a small number of mistakenly included blank samples in OrganAMNIST, OrganCMNIST, OrganSMNIST, OrganMNIST3D, and VesselMNIST3D. 

v2.1: Addressed an issue in the NoduleMNIST3D file (i.e., nodulemnist3d.npz). Further details can be found in this issue.

v2.0: Launched the initial repository of MedMNIST v2, adding 6 datasets for 3D and 2 for 2D.

v1.0: Established the initial repository (in a separate repository) of MedMNIST v1, featuring 10 datasets for 2D.

 

Note: This dataset is NOT intended for clinical use.

Files

Files (46.1 GB)

Checksum                              Size
md5:bbd3c5a5576322bc4cdfea780653b1ce  276.8 kB
md5:17721accfe9fb005146a47d33bc54b2f  1.9 MB
md5:7053d0359d879ad8a5505303e11de1dc  35.5 MB
md5:adace1e0ed228fccda1f39692059dd4c  569.1 MB
md5:b718ff6835fcbdb22ba9eacccd7b2601  1.5 GB
md5:2b94928a2ae4916078ca51e05b6b800b  156.3 MB
md5:750601b1f35ba3300ea97c75c52ff8f6  559.6 kB
md5:363e4b3f8d712e9b5de15470a2aaadf1  11.0 MB
md5:b56378a6eefa9fed602bb16d192d4c8b  30.9 MB
md5:742edef2a1fd1524b2efff4bd7ba9364  2.8 MB
md5:02c8a6516a18b556561a56cbdd36c4a8  82.8 MB
md5:db107e5590b27930b62dbaf558aebee3  1.4 GB
md5:45bd33e6f06c3e8cdb481c74a89152aa  3.9 GB
md5:9de6cd0b934ebb5b7426cfba5efbae16  401.6 MB
md5:0744692d530f8e62ec473284d019b0c7  19.7 MB
md5:2defd784463fa5243564e855ed717de1  372.6 MB
md5:8974907d8e169bef5f5b96bc506ae45d  1.1 GB
md5:b70a2f5635c6199aeaa28c31d7202e1f  100.1 MB
md5:6aa7b0143a6b42da40027a9dda61302f  3.3 MB
md5:f01d7e6316aedf4210da0da5b7437b42  26.6 MB
md5:8755a7e9e05a4d9ce80a24c3e7a256f3  29.3 MB
md5:c47c5b7d457bf6332200d2ea6d64ecd8  289.8 MB
md5:c68d92d5b585d8d81f7112f81e2d0842  54.9 MB
md5:0a97e76651ace45c5d943ee3f65b63ae  1.3 GB
md5:abc493b6d529d5de7569faaef2773ba3  4.0 GB
md5:e229e9440236b774d9f0dfef9d07bdaf  311.8 MB
md5:68e3f8846a6bd62f0c9bf841c0d9eacc  38.2 MB
md5:eeae80d0a227a8d099027e1b3cfd3b60  707.9 MB
md5:50747347e05c87dd3aaf92c49f9f3170  1.8 GB
md5:2dcccc29b88e6da5a01161ef20cda288  200.4 MB
md5:b9ceb9546e10131b32923c5bbeaea2b1  15.5 MB
md5:773c1f009daa3fe5d9a2a201b2a7ed94  287.5 MB
md5:050f5e875dc056f6768abf94ec9995d1  760.2 MB
md5:3ce34a8724ea6f548e6db4744d03b6a9  80.3 MB
md5:a0c5a1ff56af4f155c46d46fbb45a2fe  32.7 MB
md5:58a2205adf14a9d0a189cb06dc78bf10  361.5 MB
md5:9ab87b696fb54e2a387ebe992d6ed5f1  16.5 MB
md5:ded0c5fa01a95dc4978b956f613e9b8e  305.2 MB
md5:b354719e553fbbb2513d5533f52a4cb1  802.7 MB
md5:53a6d115339d874c25e309a994ff46d3  85.9 MB
md5:a8b06965200029087d5bd730944a56c1  205.6 MB
md5:ac42d08fb904d92c244187169d1fd1d9  4.3 GB
md5:2c51a510bcdc9cf8ddb2af93af1eadec  12.6 GB
md5:55aa9c1e0525abe5a6b9d8343a507616  1.1 GB
md5:28209eda62fecd6e6a2d98b1501bb15f  4.2 MB
md5:05b46931834c231683c68f40c47b2971  75.5 MB
md5:d6a3c71de1b945ea11211b03746c1fe1  214.4 MB
md5:8f4eceb4ccffa70c672198ea285246c6  20.6 MB
md5:bd4c0672f1bba3e3a89f0e4e876791e4  3.3 MB
md5:e48e916a24454daf90583d4e6efb1a18  46.5 MB
md5:eae7e3b6f3fcbda4ae613ebdcbe35348  128.0 MB
md5:afda852cc34dcda56f86ad2b2457dbcc  13.2 MB
md5:1235b78a3cd6280881dd7850a78eadb6  38.0 MB
md5:43bd14ebf3af9d3dd072446fedc14d5e  452.8 MB
md5:ebe78ee8b05294063de985d821c1c34b  125.0 MB
md5:61b955355d7425a89687b06cca3ce0c2  1.6 GB
md5:b077128c4a949f0a4eb01517f9037b9c  3.4 GB
md5:123ece2eba09d0aa5d698fda57103344  555.3 MB
md5:b41fd4f7e7e2feedddb201585ecafa1b  398.3 kB
md5:6bb274a8846e1097066dcd64e2c4520f  2.7 MB
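
The MD5 checksums listed above can be used to confirm that a manually downloaded file is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical and not part of the `medmnist` package, and the pairing of each checksum with a file name has to be taken from the Zenodo record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value, with or without the 'md5:' prefix."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

To use it, pass the path of a downloaded file together with its checksum entry from the list; a `False` result suggests an incomplete or corrupted download.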

Additional details

Related works

Is published in
Journal: 10.1038/s41597-022-01721-8 (DOI)
Conference paper: 10.1109/ISBI48211.2021.9434062 (DOI)