TCGA@Focus Dataset
Description
Content
A dataset of 1000 whole slide images (WSIs) from 52 organ types was gathered from The Cancer Genome Atlas (TCGA) repository provided by the National Cancer Institute (NCI) / National Institutes of Health (NIH). Every region of interest in these WSIs was annotated with one of two focus categories, "in-focus" or "out-focus", which carry binary ground truth scores of "1" and "0" respectively. From these regions of interest, a dataset of 14,371 image patches was created. Of these patches, 11,328 are labelled in-focus and 3,043 are labelled out-focus.
The organ types were selected to be diverse in order to include a wide spectrum of tissue textures and colour information to enrich the dataset.
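The patch counts above imply a noticeable class imbalance, which is worth keeping in mind when splitting or evaluating on this data. The following is a minimal sketch that only restates the arithmetic from the description; the constant and function names are illustrative and are not part of the released code.

```python
# Patch counts exactly as stated in the dataset description.
IN_FOCUS_PATCHES = 11_328   # ground truth score 1
OUT_FOCUS_PATCHES = 3_043   # ground truth score 0
TOTAL_PATCHES = IN_FOCUS_PATCHES + OUT_FOCUS_PATCHES


def class_fractions():
    """Return the share of patches per ground truth score (1 or 0)."""
    return {
        1: IN_FOCUS_PATCHES / TOTAL_PATCHES,
        0: OUT_FOCUS_PATCHES / TOTAL_PATCHES,
    }


if __name__ == "__main__":
    fractions = class_fractions()
    print(f"total patches: {TOTAL_PATCHES}")
    print(f"in-focus  (1): {fractions[1]:.1%}")
    print(f"out-focus (0): {fractions[0]:.1%}")
```

Roughly four out of five patches are in-focus, so plain accuracy can look high even for a model that rarely predicts out-focus.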
More Information
For more information, please refer to the following paper. Please cite this paper when using the dataset.
@InProceedings{wang2020focuslitenn,
title={FocusLiteNN: High Efficiency Focus Quality Assessment for Digital Pathology},
author={Wang, Zhongling and Hosseini, Mahdi and Miles, Adyn and Plataniotis, Konstantinos and Wang, Zhou},
booktitle={Medical Image Computing and Computer Assisted Intervention -- MICCAI 2020},
year={2020},
publisher={Springer International Publishing}
}
For the full code released on GitHub, please visit the repository at: https://github.com/icbcbicc/FocusLiteNN/
Contact
For questions, please contact:
Mahdi Hosseini
mahdi.hosseini@utoronto.ca
http://orcid.org/0000-0002-9147-0731
Files
Name | Size | MD5 |
---|---|---|
TCGA@Focus.zip | 23.7 GB | 295ab3858cd72bae812f48b5f45bce6a |
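Because the archive is large, it may be worth checking the download against the published MD5 before extracting it. The sketch below uses only the Python standard library; the local path `TCGA@Focus.zip` and the helper names `md5_of_file` and `verify_archive` are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# MD5 checksum as published for TCGA@Focus.zip.
EXPECTED_MD5 = "295ab3858cd72bae812f48b5f45bce6a"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("TCGA@Focus.zip")  # assumed download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use flat regardless of archive size. A mismatch usually indicates an incomplete or corrupted download.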