Published March 21, 2018 | Version v1
Dataset Open

Histopathology data of bone marrow biopsies (HistBMP or HistMNIST)

Creators

  • 1. University of Amsterdam

Contributors

Researcher:

  • 1. University of Amsterdam

Description

Data information

We prepared a dataset based on histopathological images freely available online (http://www.enjoypath.com/). We selected 16 patients (patient IDs: 272, 274, 283, 289, 290, 291, 292, 295, 297, 298, 299). Each histopathological image represents a bone marrow biopsy. The diagnoses of the chosen cases were associated with different kinds of cancer (e.g., lymphoma, leukemia) or anemia. All original images were acquired with HE (hematoxylin and eosin) staining at 40× magnification, and each image was 336 × 448 pixels in size.

Data preparation

The original RGB representation was converted to grayscale. We then divided each image into small patches of 28 × 28 pixels. Finally, we picked 10 patients for training, 3 patients for validation and 3 patients for testing, which resulted in 6,800 training images, 2,000 validation images and 2,000 test images. Patients were assigned so that each subset contained representative images with different diagnoses and amounts of fat.
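
The two preparation steps above (grayscale conversion, then cutting into 28 × 28 patches) can be sketched as follows. This is a minimal illustration and not the authors' actual pipeline: the record does not state which grayscale conversion was used, so the ITU-R BT.601 luma weights below are an assumption, and the function names `to_grayscale` and `split_into_patches` are hypothetical. Plain Python lists are used to keep the sketch free of dependencies, and a random image stands in for a real biopsy image.

```python
import random

PATCH = 28  # patch side length stated in the description


def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of gray values.

    Assumption: ITU-R BT.601 luma weights; the record does not say
    which RGB-to-gray conversion was actually applied.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]


def split_into_patches(gray, patch=PATCH):
    """Cut a 2-D image into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order (left to right, top to bottom).
    """
    height, width = len(gray), len(gray[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([row[left:left + patch]
                            for row in gray[top:top + patch]])
    return patches


# Synthetic stand-in for one 336 x 448 RGB image (not real data).
random.seed(0)
image = [[(random.randrange(256), random.randrange(256), random.randrange(256))
          for _ in range(448)]
         for _ in range(336)]

patches = split_into_patches(to_grayscale(image))
print(len(patches), len(patches[0]), len(patches[0][0]))  # 192 28 28
```

A 336 × 448 image splits evenly into 12 × 16 = 192 patches of 28 × 28 pixels. How patches were then selected to reach the reported subset sizes is not described in this record, so the sketch stops at tiling.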

Since the small patches resemble MNIST, a widely used benchmark in the machine learning/AI community, the dataset is referred to as HistMNIST.

First usage

The dataset was used to train deep generative models (VAEs):

  • Tomczak, J. M., & Welling, M. (2016). Improving variational auto-encoders using householder flow. arXiv preprint arXiv:1611.09630.

Notes

The dataset was originally used in the following paper: J.M. Tomczak & M. Welling, "Improving Variational Auto-Encoders using Householder Flow", NIPS Workshop on Bayesian Deep Learning 2016, arXiv:1611.09630

Files

Files (59.1 MB)

Checksum: md5:c6be7a98aeee14eb32fc8db2d2298bce
Size: 59.1 MB
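
The MD5 checksum listed above can be used to confirm that a downloaded copy is intact. The sketch below uses only Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_record` are hypothetical, and the local file path is left to the caller because this record does not show the file name.

```python
import hashlib

# Checksum as listed in this record.
EXPECTED_MD5 = "c6be7a98aeee14eb32fc8db2d2298bce"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the checksum listed in the record."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record(path_to_download)` with the path of the local copy returns `True` only when the file matches the listed checksum. Reading in chunks keeps memory use flat regardless of file size.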

Additional details

Funding

DeeBMED – Deep learning and Bayesian inference for medical imaging 702666
European Commission

References

  • Tomczak, J. M., & Welling, M. (2016). Improving variational auto-encoders using householder flow. arXiv preprint arXiv:1611.09630.