Histopathology data of bone marrow biopsies (HistBMP or HistMNIST)

doi:10.5281/zenodo.1205024

Published March 21, 2018 | Version v1

Dataset Open

Jakub Tomczak¹

1. University of Amsterdam

Contributors

Researcher:

Jakub Tomczak¹

1. University of Amsterdam

Data information

We prepared the dataset from histopathological images freely available online (http://www.enjoypath.com/). We selected 16 patients (patient IDs: 272, 274, 283, 289, 290, 291, 292, 295, 297, 298, 299). Each histopathological image shows a bone marrow biopsy. The diagnoses of the chosen cases were associated with different kinds of cancer (e.g., lymphoma, leukemia) or anemia. All original images were acquired with hematoxylin and eosin (HE) staining at 40× magnification, and each image was 336 × 448 pixels.

Data preparation

The original RGB images were converted to grayscale. We then divided each image into small patches of size 28 × 28. Finally, we picked 10 patients for training, 3 patients for validation, and 3 patients for testing, which resulted in 6,800 training images, 2,000 validation images, and 2,000 test images. Patients were selected so that each subset contained representative images with different diagnoses and amounts of fat.
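The preparation steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the grayscale weights (ITU-R BT.601 luma coefficients), the non-overlapping tiling, and the function names `rgb_to_gray` and `extract_patches` are all assumptions, and a random array stands in for a real 336 × 448 image.

```python
import numpy as np


def rgb_to_gray(image):
    """Convert an (H, W, 3) RGB array to grayscale.

    The BT.601 luma weights used here are an assumption; the dataset
    description does not say which conversion was applied.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return image[..., :3] @ weights


def extract_patches(gray, patch=28):
    """Tile a 2-D array into non-overlapping patch x patch blocks.

    Non-overlapping tiling is an assumption. Patches are returned in
    row-major order with shape (num_patches, patch, patch).
    """
    height, width = gray.shape
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    rows, cols = height // patch, width // patch
    return (
        gray.reshape(rows, patch, cols, patch)
        .swapaxes(1, 2)
        .reshape(rows * cols, patch, patch)
    )


# Synthetic stand-in for one original 336 x 448 RGB image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(336, 448, 3), dtype=np.uint8)

patches = extract_patches(rgb_to_gray(image))
print(patches.shape)  # (192, 28, 28)
```

Under the non-overlapping assumption, a 336 × 448 image yields 12 × 16 = 192 patches of size 28 × 28.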

Since the small patches resemble MNIST, a widely used benchmark in the machine learning/AI community, the dataset is referred to as HistMNIST.

First usage

The dataset was used to train deep generative models (VAEs):

Tomczak, J. M., & Welling, M. (2016). Improving variational auto-encoders using householder flow. arXiv preprint arXiv:1611.09630.

Notes

The dataset was originally used in the following paper: J.M. Tomczak & M. Welling, "Improving Variational Auto-Encoders using Householder Flow", NIPS Workshop on Bayesian Deep Learning 2016, arXiv:1611.09630

Files

Files (59.1 MB)

Name	Size
histopathology.pkl.tar.gz (md5:c6be7a98aeee14eb32fc8db2d2298bce)	59.1 MB
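After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_download`, and the assumption that the file sits in the current directory under its original name, are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file entry of this record.
EXPECTED_MD5 = "c6be7a98aeee14eb32fc8db2d2298bce"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("histopathology.pkl.tar.gz")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found")
```

The internal layout of the pickle inside the archive is not described in this record, so it is best inspected after extraction rather than assumed.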

Additional details

Funding: European Commission, grant 702666, DeeBMED – Deep learning and Bayesian inference for medical imaging

