Published 2014 | Version 1.0.0
Dataset Open

Automated Cell Nuclei detection for Large-Volume Electron Microscopy of Neural Tissue

  • 1. Istanbul Technical University
  • 2. HCI, University of Heidelberg, Germany

Description

Volumetric electron microscopy techniques, such as serial block-face electron microscopy (SBEM), generate massive amounts of image data that are used for reconstructing neural circuits. Typically, this requires time-intensive manual annotation of cells and their connections. To facilitate this analysis, we study the problem of automated detection of cell nuclei in a new SBEM dataset that contains cerebral cortex, white matter, and striatum from an adult mouse brain. The dataset was manually annotated to identify the locations of all 3309 cell nuclei in the volume. We make both dataset and annotations available here.

This is supplementary material for the ISBI 2014 paper "Automated Cell Nucleus Detection for Large-Volume Electron Microscopy of Neural Tissue".

Technical info

Dataset

DATA ACQUISITION 

A 20-week-old male mouse brain was prepared in its entirety for electron microscopy. A sub-volume containing cerebral cortex, white matter, and striatum was extracted from the epoxy-embedded whole brain with a trimmer (Leica) and a scalpel blade, and mounted on an aluminum stub. Backscattered electrons were imaged at 40 nm pixel size in high vacuum with SBEM on a QuantaFEG 200 (FEI), using a heuristic-based algorithm for automated aberration correction. The final stack size for the cortico-striatal dataset was 4382 × 3435 × 30464 voxels, which was subsequently downscaled by a factor of 4 for cell nucleus detection.

DATA FORMAT 

The downscaled and cropped SBEM volume is 8-bit grayscale and 1024 × 768 × 7552 voxels (x, y, z) in size, where each voxel is 160 × 160 × 200 nm. It is available for download as an HDF5 file, which can be viewed with HDFView or read using Python, Matlab or C.
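The figures above imply the physical extent of the volume and its uncompressed size in memory. The following is a small worked calculation using only the numbers stated in this description; the variable names are illustrative.

```python
# Figures taken from the description above.
SHAPE_XYZ = (1024, 768, 7552)   # voxels along x, y, z
VOXEL_NM = (160, 160, 200)      # voxel size in nm along x, y, z

# Physical extent along each axis, in micrometres.
extent_um = tuple(n * s / 1000 for n, s in zip(SHAPE_XYZ, VOXEL_NM))

# Uncompressed size: one byte per voxel for 8-bit grayscale data.
n_voxels = SHAPE_XYZ[0] * SHAPE_XYZ[1] * SHAPE_XYZ[2]
size_gib = n_voxels / 2**30

print(extent_um)                      # extent in micrometres (x, y, z)
print(n_voxels, round(size_gib, 2))   # voxel count and size in GiB
```

The whole volume therefore needs roughly 5.5 GiB of memory once decompressed, which is one reason to read regions of interest rather than the full array.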

GROUND TRUTH 

Ground-truth annotations are provided as CSV files (separator: comma) with x, y, z and r columns. x, y, z specify the manually marked approximate center position of each neuronal or glial nucleus; r is the manually estimated radius. Because nuclei are three-dimensional and not always spherical, these are only rough estimates. Two files are provided. The first contains the complete set of neuronal and glial nuclei annotations. In the second, we have removed those annotations for which the sphere touches the volume border, in order to simplify the evaluation of automated detection algorithms (see paper).
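A minimal sketch of reading such an annotation file in Python is given below. It assumes the column order x, y, z, r stated above; whether the files carry a header line is not stated here, so non-numeric rows are simply skipped. The function name and the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

def load_annotations(fileobj):
    """Parse rows of x,y,z,r into a list of float 4-tuples.

    Rows whose first four fields are not numeric (for example a header
    line, if the files have one) or that are too short are skipped.
    """
    nuclei = []
    for row in csv.reader(fileobj):
        try:
            x, y, z, r = (float(v) for v in row[:4])
        except ValueError:
            continue
        nuclei.append((x, y, z, r))
    return nuclei

# Made-up rows for illustration only; not taken from the dataset.
sample = io.StringIO("x,y,z,r\n10,20,30,4.5\n100.5,200,300,6\n")
nuclei = load_annotations(sample)
print(len(nuclei), nuclei[0])
```

For a real file, pass a handle opened with `open(path, newline="")`, where `path` points to one of the CSV files extracted from the archive.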

Data and ground-truth annotations

Raw data

The raw data is an HDF5 file of about 4.2 GB, md5sum 8b1f88fd0cd57874dd8f0f0b74be61ba.
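After downloading, the checksum can be verified without loading the whole file into memory. The sketch below uses Python's standard hashlib module; the function name is illustrative, and the file path has to be supplied by the reader.

```python
import hashlib

EXPECTED_MD5 = "8b1f88fd0cd57874dd8f0f0b74be61ba"  # from the description above

def md5sum(path, block_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

A download is intact if `md5sum(path) == EXPECTED_MD5` holds for the path of the downloaded HDF5 file.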

This file uses the HDF5 data format for chunked and compressed storage. Regions of interest can be read from the file with Matlab's h5read command, with Python and the h5py library, or from many other languages (see the Wikipedia article on HDF5 for a list).

The data is stored in the group G1/20130722_132814 as a dataset of size 1024 × 768 × 7552. It was written in chunks of 64 × 64 × 64 using the deflate-1 OPT compression filter. Note that the data is in x,y,z axis order, which requires a transpose when read from C or Python, and also from Matlab (because it uses y,x,z order).
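The axis-order caveat can be handled defensively when reading a region of interest with h5py. The sketch below is not the authors' code. It assumes that "G1/20130722_132814" can be used as the full path of the dataset, and it inspects the shape reported by the library instead of assuming which order h5py will see. The helper names are hypothetical.

```python
FULL_XYZ = (1024, 768, 7552)  # dataset size stated in the description

def roi_index(shape, x, y, z):
    """Return (index, needs_transpose) for an ROI given as (start, stop) pairs.

    shape is the dataset shape as reported by the HDF5 library. If it is
    reported as (z, y, x), the slices are reordered and the caller must
    transpose the result to get an array indexed [x, y, z].
    """
    sx, sy, sz = slice(*x), slice(*y), slice(*z)
    if tuple(shape) == FULL_XYZ:
        return (sx, sy, sz), False
    if tuple(shape) == FULL_XYZ[::-1]:
        return (sz, sy, sx), True
    raise ValueError(f"unexpected dataset shape {shape}")

def read_roi(path, x, y, z, dataset="G1/20130722_132814"):
    """Read an ROI and return a NumPy array indexed [x, y, z]."""
    import h5py  # third-party; imported here so roi_index works without it
    with h5py.File(path, "r") as f:
        dset = f[dataset]  # assumed to be the dataset path (see lead-in)
        index, needs_transpose = roi_index(dset.shape, x, y, z)
        block = dset[index]  # only the chunks overlapping the ROI are read
    return block.transpose(2, 1, 0) if needs_transpose else block
```

For example, `read_roi(path, (0, 128), (0, 128), (0, 64))` would return a 128 × 128 × 64 block. Choosing ROI bounds that are multiples of 64 keeps reads aligned with the 64 × 64 × 64 chunks.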

Ground-truth

  • Original ground-truth (all neuronal and glial nuclei)
  • Edge clean ground-truth (annotations touching border removed)
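The idea behind the edge-clean file can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: the exact border criterion is described in the paper, and the units of x, y, z and r are not stated here. The sketch assumes that centers, radii and the volume extent are all expressed in the same units; because the voxels are anisotropic (160 × 160 × 200 nm), converting everything to nanometres first may be appropriate. Function names and sample values are made up.

```python
def touches_border(x, y, z, r, extent):
    """True if the sphere at (x, y, z) with radius r reaches the volume border.

    extent is the volume size along (x, y, z), in the same units as the
    annotation; the volume is taken to span [0, extent] on each axis.
    """
    return any(c - r <= 0 or c + r >= e for c, e in zip((x, y, z), extent))

def edge_clean(nuclei, extent):
    """Keep only annotations whose sphere lies strictly inside the volume."""
    return [n for n in nuclei if not touches_border(*n, extent)]

# Made-up annotations (x, y, z, r) for illustration only.
nuclei = [
    (50, 50, 50, 10),      # well inside
    (5, 50, 50, 10),       # crosses the lower x border
    (1020, 50, 50, 10),    # crosses the upper x border
    (500, 400, 7550, 5),   # crosses the upper z border
]
kept = edge_clean(nuclei, extent=(1024, 768, 7552))
print(kept)
```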

Source Code

  • Code for block-wise thresholding and connected-component labeling can be found in the blockedarray GitHub repository.
    Example usage from C++ can be found in the file ccpipeline.cpp.
  • Code for the block-wise component accumulator can be found in the file blockWiseComponentAccumulator.m.
    This Matlab function works on a labelled volume dataset stored in an HDF5 file (created by blockedarray). It calculates the bounding box, centroid, and coordinate list for all components except label zero (assumed to be background). The output is written to a .mat file.
  • Code for component-wise morphological filtering can be found in the file componentWiseFiltering.m.
    This Matlab function works on a component list (created by blockWiseComponentAccumulator.m) stored in a .mat file. It calculates the new components and writes them to another .mat file.
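To illustrate what the component accumulator computes, here is a minimal Python sketch that derives the bounding box, centroid, and coordinate list for every non-zero label in a small labelled volume. It is not a port of blockWiseComponentAccumulator.m: it works on an in-memory nested list rather than block-wise on an HDF5 file, and the function name and output layout are assumptions made for the example.

```python
def accumulate_components(labels):
    """Collect per-component statistics from a labelled volume.

    labels is a nested list indexed [z][y][x] of non-negative integers.
    Returns {label: {"bbox": ((xmin, ymin, zmin), (xmax, ymax, zmax)),
                     "centroid": (cx, cy, cz),
                     "coords": [(x, y, z), ...]}}
    for every label except 0, which is treated as background.
    """
    coords = {}
    for z, plane in enumerate(labels):
        for y, row in enumerate(plane):
            for x, label in enumerate(row):
                if label != 0:
                    coords.setdefault(label, []).append((x, y, z))

    components = {}
    for label, pts in coords.items():
        mins = tuple(min(p[i] for p in pts) for i in range(3))
        maxs = tuple(max(p[i] for p in pts) for i in range(3))
        centroid = tuple(sum(p[i] for p in pts) / len(pts) for i in range(3))
        components[label] = {"bbox": (mins, maxs), "centroid": centroid, "coords": pts}
    return components

# Tiny made-up volume: 2 planes (z), 2 rows (y), 3 columns (x).
labels = [
    [[0, 1, 1], [0, 0, 2]],
    [[0, 1, 0], [0, 0, 2]],
]
components = accumulate_components(labels)
print(components[2]["bbox"], components[2]["centroid"])
```

For a volume of the size described here, a block-wise variant that merges per-block results would be needed, since the full label volume does not fit comfortably in memory as a Python list.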

Other

Acknowledgements

Shawn Mikula, Sarah Mikula, Ivo Sonntag, Winfried Denk

F. Boray Tek's work in the Heidelberg Collaboratory for Image Processing was supported by Işık University and by The Scientific and Technological Research Council of Turkey through a BIDEB 2219 grant. We thank Winfried Denk for serial block-face electron microscopy resources, and Sarah Mikula and Ivo Sonntag for manually annotating neuronal and glial nuclear sizes and locations in our SBEM dataset. Supported by the Max Planck Society.

Files

soma_groundtruth.zip

Files (4.2 GB)

Checksum (md5)                      Size
8b1f88fd0cd57874dd8f0f0b74be61ba    4.2 GB
ff31890deb62f398de6b06c9a5232a3f    39.4 kB

Additional details

Funding

International Postdoctoral Research Fellowship Program for Turkish Citizens BIDEB 2219
Scientific and Technological Research Council of Turkey

Software

Repository URL
https://github.com/btekgit/soma-nuclei-matlab
Programming language
MATLAB
Development Status
Unsupported