Published May 18, 2020 | Version v1
Dataset Open

Histological image tiles for TCGA-CRC-DX, color-normalized, sorted by MSI status, train/test split

  • 1. RWTH Aachen University

Description

These are histological images of colorectal cancer, derived from the TCGA database at https://portal.gdc.cancer.gov. Tumor tissue was outlined manually and the tumor region was cut into tiles of 256 µm edge length, saved as 512 px images (corresponding to 0.5 µm/px). All image tiles were color-normalized with the Macenko method. Patients were split into a training set and a test set in a 2:1 ratio. MSI status was acquired for all patients (patients with MSI-H = MSIH; patients with MSI-L and MSS = NonMSIH), and every tile inherited the label of its parent patient. Tiles in the training set were then randomly undersampled to equalize the two classes. The test set was not undersampled. Further info: www.kather.ai
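The labeling and class-balancing steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `label_tiles` and `undersample`, the dictionary inputs, and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Mapping from patient-level MSI status to the two class names in the description.
LABELS = {"MSI-H": "MSIH", "MSI-L": "NonMSIH", "MSS": "NonMSIH"}


def label_tiles(tiles_by_patient, status_by_patient):
    """Give every tile the class label of its parent patient (hypothetical helper)."""
    labeled = []
    for patient, tiles in tiles_by_patient.items():
        label = LABELS[status_by_patient[patient]]
        labeled.extend((tile, label) for tile in tiles)
    return labeled


def undersample(labeled, seed=0):
    """Randomly drop tiles so that every class keeps as many tiles as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for tile, label in labeled:
        by_class[label].append(tile)
    n = min(len(tiles) for tiles in by_class.values())
    balanced = []
    for label, tiles in by_class.items():
        balanced.extend((tile, label) for tile in rng.sample(tiles, n))
    return balanced
```

In this sketch, only the training tiles would be passed through `undersample`; the test tiles would be labeled but left at their natural class ratio, as the description states.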

Files

TEST.zip

Files (3.4 GB)

md5:cf2f738a256da1a9353decb79663ebba (2.1 GB)
md5:b93b040fb6057b91f0524dd71a73273e (1.3 GB)
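
After downloading, an archive can be checked against the md5 checksums listed above. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_download` and the example path in the comment are assumptions, and since the listing does not pair file names with checksums, the expected value has to be taken from the entry of the matching size.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Return True if the file matches the expected checksum (an 'md5:' prefix is accepted)."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (hypothetical local path; checksum copied from the file listing):
# verify_download("downloads/archive.zip", "md5:cf2f738a256da1a9353decb79663ebba")
```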