Training data for the publication "Computational textural mapping harmonises sampling variation and reveals multidimensional histopathological fingerprints"
Two ZIP files contain the small histological image tiles that were used to detect and quantify distinct tissue textures and lymphocyte proportions in H&E-stained clear cell renal cell carcinoma (KIRC) digital tissue sections from the Cancer Genome Atlas (TCGA) image archive and the Helsinki dataset.
The tissue_classification file contains 300x300px tissue texture image tiles (n=52,713) representing renal cancer (“cancer”; n=13,057, 24.8%); normal renal (“normal”; n=8,652, 16.4%); stromal (“stroma”; n=5,460, 10.4%) including smooth muscle, fibrous stroma and blood vessels; red blood cells (“blood”; n=996, 1.9%); empty background (“empty”; n=16,026, 30.4%); and other textures including necrotic, torn and adipose tissue (“other”; n=8,522, 16.2%). Image tiles have been randomly selected from the TCGA-KIRC WSI and Helsinki datasets.
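The class sizes above are uneven ("blood" is under 2% of the tiles, "empty" over 30%), which matters when training on this archive. The sketch below recomputes the class shares from the listed counts and derives inverse-frequency class weights. The weighting scheme is a common general-purpose choice and an assumption here; it is not stated to be what the authors used, and the function names are hypothetical.

```python
# Tile counts per class, copied from the description of the
# tissue_classification archive.
TISSUE_COUNTS = {
    "cancer": 13_057,
    "normal": 8_652,
    "stroma": 5_460,
    "blood": 996,
    "empty": 16_026,
    "other": 8_522,
}


def class_shares(counts):
    """Return each class's fraction of the total tile count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset would give every class a weight of 1.0;
    rare classes get weights above 1.0 and common classes below 1.0.
    This is an illustrative assumption, not the authors' documented method.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    shares = class_shares(TISSUE_COUNTS)
    weights = inverse_frequency_weights(TISSUE_COUNTS)
    for name in TISSUE_COUNTS:
        print(f"{name:>7}: share={shares[name]:.1%}  weight={weights[name]:.2f}")
```

Weights of this kind could be passed to a weighted loss function, so that the rare "blood" class is not swamped by "empty" tiles.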
The binary_lymphocytes file contains image tiles (n=25,095), mostly 256x256px in size but some smaller, labelled as Low (n=20,092, 80.1%) or High (n=5,003, 19.9%) lymphocyte density. Image tiles have been randomly selected from the TCGA-KIRC WSI dataset.
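Because some of these tiles are smaller than 256x256px, they need to be brought to a common size before they can be batched. One possible approach, shown as a sketch below, is to centre each smaller tile on a 256x256 canvas by padding. This is an assumption for illustration: the description does not say how the authors handled the smaller tiles (resizing would be an equally plausible option), and `center_padding` is a hypothetical helper.

```python
def center_padding(width, height, target=256):
    """Return (left, top, right, bottom) padding in pixels.

    The padding centres a width x height tile on a target x target canvas.
    When the missing amount is odd, the extra pixel goes to the right or
    bottom edge.
    """
    if width <= 0 or height <= 0:
        raise ValueError("tile dimensions must be positive")
    if width > target or height > target:
        raise ValueError("tile is larger than the target size")
    pad_w = target - width
    pad_h = target - height
    left = pad_w // 2
    top = pad_h // 2
    return left, top, pad_w - left, pad_h - top


if __name__ == "__main__":
    # A full-size tile needs no padding; a smaller one is padded on all sides.
    print(center_padding(256, 256))
    print(center_padding(200, 251))
```

The (left, top, right, bottom) order matches the four-value padding convention used by torchvision's padding transform, so the result could be passed on directly if that library is used.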
The accuracy of all annotations has been double-checked. However, the classification between multiple tissue textures or lymphocyte densities can sometimes be ambiguous.
The deep learning model parameters trained with the ResNet-18 architecture for (1) lymphocyte and (2) texture classification are named (1) resnet18_binary_lymphocytes.pth and (2) resnet18_tissue_classification.pth. Code and instructions for using these are found in
If you use either work, please cite the publication by Brummer O et al. (1) AND the TCGA Research Network (2):
(1) Brummer, O., Pölönen, P., Mustjoki, S. et al. Computational textural mapping harmonises sampling variation and reveals multidimensional histopathological fingerprints. Br J Cancer 129, 683–695 (2023).
(2) The results shown here are in whole or part based upon data generated by the TCGA Research Network: