Data for: Tang et al., Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. bioRxiv 2018.

Tang, Ziqi; Chuang, Kangway; DeCarli, Charles; Jin, Lee-Way; Beckett, Laurel; Keiser, Michael; Dugger, Brittany

doi:10.5281/zenodo.1470797

Published November 1, 2018 | Version v1.0

Dataset Open

1. Institute for Neurodegenerative Diseases, University of California, San Francisco
2. Department of Neurology, University of California, Davis School of Medicine
3. Department of Pathology and Laboratory Medicine, University of California, Davis School of Medicine
4. Department of Public Health Sciences, University of California, Davis

This record contains 63 whole slide images (WSIs) and segmented 256x256 pixel tiles, with approximately 80,000 tile-level expert annotations of amyloid-β pathology.

Paper: "Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline", bioRxiv 454793; DOI: https://doi.org/10.1101/454793.

Details: The collection comprises 63 WSIs from 63 unique decedent cases, spanning Alzheimer’s disease (AD) to non-AD and covering a range of CERAD scores. The WSIs are divided into three datasets:

1. Development (Phases I-II). 33 WSIs used for convolutional neural network (CNN) model development (29 training, 4 validation).
2. Hold-out (Phase III). 10 WSIs selected by an expert neuropathologist as a held-out test set to assess the generalizability of the CNN model.
3. CERAD-like hold-out. 20 blinded WSIs collected solely for use in a CERAD-like scoring comparison study.

Datasets 1 and 2 were color-normalized and segmented into 256x256 pixel image tiles, yielding the model training set (61,370 tiles), the validation set (8,630 tiles), and the hold-out test set (10,873 tiles). Dataset 3 was color-normalized but not segmented.
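As a quick consistency check, the slide and tile counts stated above can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary keys are hypothetical labels chosen here, not file or folder names from the archives.

```python
# Slide and tile counts exactly as stated in this record.
# The keys are hypothetical labels, not names used inside the archives.
SLIDES = {
    "development_train": 29,
    "development_validation": 4,
    "hold_out": 10,
    "cerad_like_hold_out": 20,
}
TILES = {
    "development_train": 61_370,
    "development_validation": 8_630,
    "hold_out": 10_873,
}

# Dataset 1 (development) is the union of training and validation slides.
development_slides = SLIDES["development_train"] + SLIDES["development_validation"]
total_slides = sum(SLIDES.values())
total_tiles = sum(TILES.values())

print(f"Development slides: {development_slides}")
print(f"Total slides: {total_slides}")
print(f"Total tiles: {total_tiles}")
```

The tile total comes to 80,873, which matches the "approximately 80,000" annotated tiles quoted in the description.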

Expert plaque labels for the Dataset 1 and Dataset 2 tiles are included in the corresponding CSV files.

Slide source and preparation: All samples were retrieved from the archives of the University of California, Davis Alzheimer’s Disease Center Brain Bank (https://www.ucdmc.ucdavis.edu/alzheimers/). The archival samples analyzed in this study were 5 μm formalin-fixed, paraffin-embedded sections of the superior and middle temporal gyrus from human brain. The sections had first been pretreated with formic acid to rid samples of endogenous protein and were then stained with an amyloid-β antibody (4G8, recognizing residues 17-24; BioLegend, formerly Covance). All slides were digitized using an Aperio AT2 at up to 40x magnification.

Code: Please visit https://github.com/keiserlab/plaquebox-paper

Notes

This study was funded by an NIH P30 AG010129 grant (BND, CD, LWJ, and LB), a Paul G. Allen Family Foundation Distinguished Investigator Award (MJK), and the China Scholarship Council (ZT). These agencies had no role in any aspect of the study, including study design, data collection, analysis, or writing.

Files

Files (110.0 GB)

Name	MD5	Size
Dataset 1a Development_train.zip	f1b8413b61799a3350f7b431ecf2026f	35.3 GB
Dataset 1b Development_validation.zip	ffd0c30e55154901621972c16c259efa	3.9 GB
Dataset 2 Hold-out.zip	f0f69ccc39fe9e3072909ec48a1c057a	41.2 GB
Dataset 3 CERAD-like hold-out.zip	2200d5d0209fb35e77dfa0692eece03f	26.3 GB
Tiles.zip	1420e454def8f09eb945643ba5cfac53	3.3 GB
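
Because the archives are large, it may be worth verifying each download against the MD5 checksums listed above before unpacking. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_downloads` and the download directory are assumptions made for illustration, not part of the dataset or its accompanying code.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "Dataset 1a Development_train.zip": "f1b8413b61799a3350f7b431ecf2026f",
    "Dataset 1b Development_validation.zip": "ffd0c30e55154901621972c16c259efa",
    "Dataset 2 Hold-out.zip": "f0f69ccc39fe9e3072909ec48a1c057a",
    "Dataset 3 CERAD-like hold-out.zip": "2200d5d0209fb35e77dfa0692eece03f",
    "Tiles.zip": "1420e454def8f09eb945643ba5cfac53",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each expected file name to True if it exists and its checksum matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results
```

For example, `verify_downloads("downloads")` (with a hypothetical `downloads` folder) returns a dictionary in which any missing or corrupted archive maps to `False`.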

Additional details

Is supplement to: 10.1101/454793 (DOI); https://github.com/keiserlab/plaquebox-paper (URL)
