COOS-7 (Cells Out Of Sample 7-Class)

Lu, Alex X; Lu, Amy X; Schormann, Wiebke; Andrews, David W; Moses, Alan M

doi:10.5281/zenodo.3355599

Published July 30, 2019 | Version 0.2

Dataset Open


1. University of Toronto
2. Sunnybrook Research Center

This repository contains the preliminary preprint version (v0.2) of the COOS-7 dataset (see the preprint at https://arxiv.org/abs/1906.07282). COOS-7 contains 132,209 crops of mouse cells, split into a training dataset and four test datasets that represent increasing degrees of covariate shift from the training dataset. The associated classification task is to build a classifier that is robust to the covariate shifts typically seen in microscopy. Method developers must train and optimize machine learning models using the training dataset exclusively, then evaluate performance on each of the four test datasets.
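The evaluation protocol described above can be sketched as follows. This is a minimal illustration rather than code from the dataset authors: accuracy, evaluate_on_test_sets, and the predict callable are hypothetical names, and plain accuracy is assumed as the metric (consult the preprint for the metrics it actually reports).

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot score an empty test set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def evaluate_on_test_sets(predict, test_sets):
    """Score one already-trained model on every test set separately.

    predict   -- hypothetical callable: images -> list of predicted labels
    test_sets -- maps a test-set name to an (images, labels) pair
    """
    # The model is never refit here: the test sets are used for scoring only.
    return {
        name: accuracy(predict(images), labels)
        for name, (images, labels) in test_sets.items()
    }
```

Reporting one score per test set, instead of pooling them, keeps the effect of each degree of covariate shift visible.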

Important: This version contains preliminary data only. Depending on the usage of these images, you may need to re-run your method once the full version of the dataset is released.

Each HDF5 file contains two entries, accessed by key:
'data' - contains all of the images in a four-dimensional array (images, channels, height, width)
'labels' - contains the labels for each image, in the same order as the images in 'data'

The value for labels indicates the class of the image, which can be one of seven values:
0 - Endoplasmic Reticulum (ER)
1 - Inner Mitochondrial Membrane (IMM)
2 - Golgi
3 - Peroxisomes
4 - Early Endosome
5 - Cytosol
6 - Nuclear Envelope
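For convenience, the label values listed above can be written as a lookup table. The snippet below is a small sketch: CLASS_NAMES and class_counts are hypothetical names chosen for illustration, and the labels are assumed to be a flat sequence of integers.

```python
from collections import Counter

# Label values and class names exactly as listed above.
CLASS_NAMES = {
    0: "Endoplasmic Reticulum (ER)",
    1: "Inner Mitochondrial Membrane (IMM)",
    2: "Golgi",
    3: "Peroxisomes",
    4: "Early Endosome",
    5: "Cytosol",
    6: "Nuclear Envelope",
}


def class_counts(labels):
    """Return {class name: number of images} for an iterable of integer labels."""
    counts = Counter(int(label) for label in labels)
    unknown = set(counts) - set(CLASS_NAMES)
    if unknown:
        raise ValueError(f"unexpected label values: {sorted(unknown)}")
    # Classes that do not occur are reported with a count of zero.
    return {CLASS_NAMES[value]: counts.get(value, 0) for value in CLASS_NAMES}
```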

The h5py package is required to read these files with Python.
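A minimal sketch of reading one archive with h5py, assuming the layout described above ('data' and 'labels' as top-level keys). The function names check_archive and load_coos7 are hypothetical and are not part of the dataset's own tooling.

```python
def check_archive(archive):
    """Check the two expected keys and return the number of images.

    `archive` is an open h5py.File, or any mapping exposing the same keys.
    """
    for key in ("data", "labels"):
        if key not in archive:
            raise KeyError(f"missing entry: {key!r}")
    if len(archive["data"]) != len(archive["labels"]):
        raise ValueError("'data' and 'labels' differ in length")
    return len(archive["data"])


def load_coos7(path):
    """Read one COOS-7 archive fully into memory and return (images, labels)."""
    import h5py  # third-party package required to read the files

    with h5py.File(path, "r") as archive:
        check_archive(archive)
        images = archive["data"][:]    # (images, channels, height, width)
        labels = archive["labels"][:]  # same order as the images
    return images, labels
```

For example, load_coos7("./COOS7_v0.2_test1.hdf5") would return both arrays at once. The larger archives may not fit comfortably in memory; in that case, keeping the file open and slicing archive["data"] in batches is an alternative.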

We provide a Python script, unpackage_COOS.py, that unpacks an archive into directories of TIFF files, organized by class. The two channels of each image are saved as separate images, with the suffixes "_protein.tif" and "_nucleus.tif", respectively.

To run the unpackaging script, use the following command:
python unpackage_COOS.py [path of HDF5 file] [path of directory to save images to]
e.g. python unpackage_COOS.py ./COOS7_v0.2_training.hdf5 ./COOS7_v0.2_training_images/
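To unpack all five archives in one go, the command above can be wrapped in a short loop. This is a sketch under stated assumptions: unpack_command and unpack_all are hypothetical helpers, the script and archives are assumed to sit in the same directory, and the output directory names for the test sets simply follow the pattern of the training example.

```python
import subprocess
import sys
from pathlib import Path

# The five archives distributed with v0.2.
SPLITS = ("training", "test1", "test2", "test3", "test4")


def unpack_command(split, data_dir=".", script="unpackage_COOS.py"):
    """Build the command line for one archive, mirroring the example above."""
    archive = Path(data_dir) / f"COOS7_v0.2_{split}.hdf5"
    out_dir = Path(data_dir) / f"COOS7_v0.2_{split}_images"
    # Trailing slash kept to match the form used in the example command.
    return [sys.executable, script, str(archive), str(out_dir) + "/"]


def unpack_all(data_dir="."):
    """Run the unpackaging script once per archive, stopping on the first error."""
    for split in SPLITS:
        subprocess.run(unpack_command(split, data_dir), check=True)
```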

Full information about the test sets and the images can be found at https://arxiv.org/abs/1906.07282.

Files (4.3 GB)

Name	MD5	Size
COOS7_v0.2_test1.hdf5	f1d191fc836691dc6a572c8bb8cad412	339.7 MB
COOS7_v0.2_test2.hdf5	557d403b154c0f562aa80cb278040b9a	557.9 MB
COOS7_v0.2_test3.hdf5	00144a6c85e1c30ee7fd4fc857196c52	1.1 GB
COOS7_v0.2_test4.hdf5	b4779fbd2895199221bef1d184fc5aee	1.0 GB
COOS7_v0.2_training.hdf5	67daabb803ccc0f00607200c1d974986	1.4 GB
README.txt	955f9c0919e50b470c314c19c0e48e23	1.2 kB
unpackage_COOS.py	ff5be3fe38cad9e15972c9487ab51cdb	1.3 kB
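
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; md5_of_file and verify_download are hypothetical helper names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading avoids loading a multi-gigabyte archive into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """True if the file's MD5 digest matches the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, verify_download("COOS7_v0.2_test1.hdf5", "f1d191fc836691dc6a572c8bb8cad412") should return True for an intact copy of the first test set.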

Additional details

Is documented by: arXiv:1906.07282v1 (arXiv)

