Published November 1, 2018 | Version v1.0
Dataset Open

Data for: Tang et al., Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. bioRxiv 2018.

  • 1. Institute for Neurodegenerative Diseases, University of California, San Francisco
  • 2. Department of Neurology, University of California, Davis School of Medicine
  • 3. Department of Pathology and Laboratory Medicine, University of California, Davis School of Medicine
  • 4. Department of Public Health Sciences, University of California, Davis

Description

Datasets containing 63 whole slide images (WSIs) and their segmented 256x256 pixel tiles with approximately 80,000 tile-level amyloid-β pathology expert annotations.

Paper: "Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline", bioRxiv 454793; DOI: https://doi.org/10.1101/454793.

Details: The collection comprises 63 WSIs from 63 unique decedent cases, spanning Alzheimer’s disease (AD) to non-AD and covering a variety of CERAD scores. The WSIs are organized into three datasets:

  1. Development (Phases I-II). 33 WSIs used for convolutional neural network (CNN) model development (29 training, 4 validation).
  2. Hold-out (Phase III). 10 WSIs selected by an expert neuropathologist as a held-out test set to assess the generalizability of the CNN model.
  3. CERAD-like hold-out. 20 blinded WSIs collected solely for use in a CERAD-like scoring comparison study.

Datasets 1 and 2 were color-normalized and segmented into 256x256 pixel image tiles, yielding the model training set (61,370 images), the validation set (8,630 images), and the hold-out test set (10,873 images). Dataset 3 was color-normalized but not segmented.
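The segmentation step can be pictured with a minimal sketch. It assumes a non-overlapping grid and that partial tiles at the right and bottom edges are discarded; neither detail is stated in this record, and `tile_boxes` is a hypothetical helper, not a function from the plaquebox-paper repository.

```python
from typing import Iterator, Tuple

TILE = 256  # tile edge length in pixels, as stated in the description


def tile_boxes(width: int, height: int, tile: int = TILE) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, right, bottom) for every full tile in a width x height image.

    Assumption: tiles do not overlap and incomplete edge tiles are dropped.
    """
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (left, top, left + tile, top + tile)


# Toy image size for illustration only; real WSIs are far larger.
boxes = list(tile_boxes(1000, 600))
```

Each box can be passed to an image library's crop routine to extract the corresponding tile.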

Expert plaque labels for the tiles of Datasets 1 and 2 are included in the corresponding CSV files.
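As a rough illustration of working with the label files, the sketch below tallies positive labels per column. The CSV layout is not described in this record, so the column names (`tile`, `label_a`, `label_b`), the threshold, and the helper `count_positive_labels` are all hypothetical; check the actual headers before adapting it.

```python
import csv
import io
from collections import Counter


def count_positive_labels(csv_file, label_columns, threshold=0.5):
    """Count rows whose value in each label column exceeds the threshold."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        for col in label_columns:
            if float(row[col]) > threshold:
                counts[col] += 1
    return counts


# Hypothetical layout: one row per tile, one numeric column per label.
example = io.StringIO(
    "tile,label_a,label_b\n"
    "tile_001.jpg,1,0\n"
    "tile_002.jpg,0,0\n"
    "tile_003.jpg,1,1\n"
)
counts = count_positive_labels(example, ["label_a", "label_b"])
```

For a real file, replace the in-memory example with `open(path, newline="")` and pass the actual label column names.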

Slide source and preparation: All samples were retrieved from the archives of the University of California, Davis Alzheimer’s Disease Center Brain Bank (https://www.ucdmc.ucdavis.edu/alzheimers/). Archival samples analyzed in this study were 5 μm formalin-fixed, paraffin-embedded sections of the superior and middle temporal gyrus from human brain. The tissue had previously been pretreated with formic acid to rid samples of endogenous protein and then stained with an amyloid-β antibody (4G8, recognizing residues 17-24, BioLegend, formerly Covance). All slides were digitized using an Aperio AT2 at up to 40x magnification.

Code: Please visit https://github.com/keiserlab/plaquebox-paper

Notes

This study was funded by an NIH P30 AG010129 grant (BND, CD, LWJ, and LB), a Paul G. Allen Family Foundation Distinguished Investigator Award (MJK), and the China Scholarship Council (ZT). These agencies had no role in any aspect of the study, including study design, data collection, analysis, or writing.

Files

Dataset 1a Development_train.zip

Files (110.0 GB)

Checksum Size
md5:f1b8413b61799a3350f7b431ecf2026f 35.3 GB
md5:ffd0c30e55154901621972c16c259efa 3.9 GB
md5:f0f69ccc39fe9e3072909ec48a1c057a 41.2 GB
md5:2200d5d0209fb35e77dfa0692eece03f 26.3 GB
md5:1420e454def8f09eb945643ba5cfac53 3.3 GB
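Because the archives are tens of gigabytes each, it may be worth verifying a download against the listed MD5 value. The following is a minimal sketch using Python's standard `hashlib`; the helper names and the local file path in the usage comment are hypothetical, and the file is read in chunks so it never has to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:f1b8...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (hypothetical local path; pair each archive with its own listed checksum):
# matches_checksum("downloads/archive.zip", "md5:f1b8413b61799a3350f7b431ecf2026f")
```

A `False` result usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.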
