Published June 27, 2020 | Version v1
Dataset Open

TCGA@Focus Dataset

Creators

  • University of Toronto - Multimedia Lab

Description

Content

A dataset of 1,000 whole slide images (WSIs) spanning 52 organ types was gathered from The Cancer Genome Atlas (TCGA) repository, provided by the National Cancer Institute (NCI) / National Institutes of Health (NIH). Each region of interest in these WSIs was annotated with one of two focus categories, "in-focus" or "out-focus", with binary ground truth scores of "1" and "0", respectively. From these regions of interest, a dataset of 14,371 image patches was created: 11,328 are labelled in-focus and 3,043 are labelled out-focus.

The organ types were selected to be diverse in order to include a wide spectrum of tissue textures and colour information to enrich the dataset.
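
The patch counts above imply a roughly 79% / 21% split between the two labels. The following Python sketch works through that arithmetic and derives inverse-frequency class weights. The weighting scheme and the function names are illustrative assumptions, not something the dataset or the accompanying paper prescribes.

```python
# Patch counts as stated in the dataset description.
IN_FOCUS = 11_328   # label 1
OUT_FOCUS = 3_043   # label 0
TOTAL = IN_FOCUS + OUT_FOCUS

def class_fractions(counts):
    """Return the share of each label in the dataset."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def balanced_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count).

    This is one common heuristic for imbalanced labels; it is an
    assumption here, not the method used by the dataset authors.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

counts = {1: IN_FOCUS, 0: OUT_FOCUS}
fractions = class_fractions(counts)
weights = balanced_weights(counts)

for label in (1, 0):
    print(f"label {label}: {counts[label]} patches, "
          f"{fractions[label]:.1%} of total, weight {weights[label]:.3f}")
```

Whether any reweighting is appropriate depends on the downstream task; the sketch only makes the imbalance explicit.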

More Information

For more information, please refer to the following paper. Please cite this paper when using the dataset.

@InProceedings{wang2020focuslitenn,
    title={FocusLiteNN: High Efficiency Focus Quality Assessment for Digital Pathology},
    author={Wang, Zhongling and Hosseini, Mahdi and Miles, Adyn and Plataniotis, Konstantinos and Wang, Zhou},
    booktitle={Medical Image Computing and Computer Assisted Intervention -- MICCAI 2020},
    year={2020},
    publisher={Springer International Publishing}
}

For the full code released on GitHub, please visit the repository at: https://github.com/icbcbicc/FocusLiteNN/

Contact

For questions, please contact:

Mahdi Hosseini

mahdi.hosseini@utoronto.ca

http://orcid.org/0000-0002-9147-0731

Files

Files (23.7 GB)

Name            Size     Checksum
TCGA@Focus.zip  23.7 GB  md5:295ab3858cd72bae812f48b5f45bce6a
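
Because the archive is large, it may be worth checking the download against the published MD5 checksum before extracting it. The following is a minimal Python sketch using only the standard library. The local file path in the usage comment is an assumption, and the helper names `md5_of_file` and `verify_download` are hypothetical.

```python
import hashlib

# Checksum as listed for TCGA@Focus.zip in the Files section.
EXPECTED_MD5 = "295ab3858cd72bae812f48b5f45bce6a"

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that the 23.7 GB archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()

# Example usage (assumes the archive was saved in the current directory):
# if not verify_download("TCGA@Focus.zip"):
#     raise SystemExit("Checksum mismatch: the download may be corrupted.")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the file again is the simplest remedy.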