There is a newer version of the record available.

Published August 15, 2022 | Version v1
Dataset Open

Training data for the "Integrative Analysis of Histological Textures and Lymphocyte Infiltration in Renal Cell Carcinoma using Deep Learning"

  • 1. Helsinki University Hospital
  • 2. University of Helsinki

Description

Two ZIP files contain small histological image tiles that were used to detect and quantify distinct tissue textures and lymphocyte proportions in the clear cell renal cell carcinoma (KIRC) samples of The Cancer Genome Atlas (TCGA) image archive.

The tissue_classification file contains 300x300 px tissue texture image tiles (n=39,458) in six classes: renal cancer (“cancer”; n=11,755, 29.8%); normal renal tissue (“normal”; n=6,313, 16.0%); stroma (“stroma”; n=3,027, 7.7%), including smooth muscle, fibrous stroma and blood vessels; red blood cells (“blood”; n=544, 1.4%); empty background (“empty”; n=11,609, 29.4%); and other textures (“other”; n=6,210, 15.7%), including necrotic, torn and adipose tissue. Tiles were randomly selected from the TCGA-KIRC WSI collection.
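The class counts above are clearly imbalanced, which matters when training a classifier on these tiles. The following is a minimal sketch that recomputes each class's share from the stated counts and derives inverse-frequency class weights. The weighting scheme and the function names are illustrative assumptions; the description does not say whether the authors used class weighting.

```python
# Tile counts per class, copied from the dataset description.
TISSUE_COUNTS = {
    "cancer": 11_755,
    "normal": 6_313,
    "stroma": 3_027,
    "blood": 544,
    "empty": 11_609,
    "other": 6_210,
}


def class_shares(counts):
    """Return each class's share of all tiles, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Return weights proportional to 1/count.

    The weights are scaled so that a perfectly balanced dataset
    would give every class a weight of 1.0. This is a common
    heuristic, not a method taken from the dataset description.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    shares = class_shares(TISSUE_COUNTS)
    weights = inverse_frequency_weights(TISSUE_COUNTS)
    for name in TISSUE_COUNTS:
        print(f"{name:>6}: {shares[name]:5.1f}%  weight {weights[name]:.2f}")
```

Rare classes such as “blood” receive the largest weights under this scheme, so it may be worth checking validation metrics per class rather than relying on overall accuracy alone.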

The binary_lymphocytes file contains image tiles (n=25,095) labelled as Low (n=20,092, 80.1%) or High (n=5,003, 19.9%) lymphocyte density. Most tiles are 256x256 px, but some are smaller. Tiles were randomly selected from the TCGA-KIRC WSI collection.
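Because some tiles are smaller than 256x256 px, any pipeline that batches them needs to bring them to a common size first. The following is a minimal sketch of one option, centred padding, which computes how many pixels to add on each side. It is an assumption for illustration only: the description does not state how the authors handled smaller tiles, and resizing is an equally plausible choice. The function name `padding_for` is hypothetical.

```python
TARGET = 256  # Nominal tile edge length in pixels, per the description.


def padding_for(width, height, target=TARGET):
    """Return (left, top, right, bottom) padding in pixels.

    The padding centres a width x height tile on a target x target
    canvas. When the missing pixel count is odd, the extra pixel
    goes to the right or bottom edge.
    """
    if width > target or height > target:
        raise ValueError("tile is larger than the target size")
    pad_w = target - width
    pad_h = target - height
    left = pad_w // 2
    top = pad_h // 2
    return left, top, pad_w - left, pad_h - top


if __name__ == "__main__":
    # A full-size tile needs no padding; a smaller one is centred.
    print(padding_for(256, 256))
    print(padding_for(200, 255))
```

The returned tuple can be passed to whatever image library is in use to pad the tile with a background colour before batching.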

The accuracy of all annotations has been double-checked. However, the distinction between tissue textures, or between lymphocyte density levels, can sometimes be ambiguous.

The deep learning model parameters trained with the ResNet-18 architecture for (1) lymphocyte and (2) texture classification are named (1) resnet18_binary_lymphocytes_final.pth and (2) resnet18_tissue_classification_dataset_final.pth. Code and instructions for using them are available at https://github.com/vahvero/RCC_textures_and_lymphocytes_publication_image_analysis.

 

If you use either dataset, please cite the publication by Brummer O et al. (1) and the TCGA Research Network (2):
(1) Brummer O et al (unpublished)
(2) The results shown here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

Files

Files (7.8 GB)

Size       MD5 checksum
4.2 GB     md5:acc12b93f8bf0e9c8c2b7de040dd5af9
45.8 MB    md5:ba3f4370c9eba507c6b08582e592a5a4
45.8 MB    md5:cf9742d67bf5111d749a2b4f794afbe0
3.5 GB     md5:837683969dc6d77911af86a7baaacbc2
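
The MD5 checksums listed above can be used to confirm that a download completed without corruption, which is worth doing for multi-gigabyte archives. The following is a minimal sketch using only the Python standard library. The function names and the example path are hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed checksum.

    `expected` may carry the "md5:" prefix used in the file listing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Example (hypothetical local path; use the checksum listed for that file):
# ok = matches_checksum("downloads/archive.zip",
#                       "md5:acc12b93f8bf0e9c8c2b7de040dd5af9")
```

If the comparison fails, the usual remedy is to download the file again before attempting to extract it.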