Correspondence: Alex Lu - alexlu@cs.toronto.edu

Each HDF5 file contains two datasets, accessed by key:

  'data'   - all of the images, in a four-dimensional array (images, channels, height, width)
  'labels' - the label for each image, in the same order as the images in 'data'

The h5py package is required to read these files into Python. Loading is as simple as:

    import h5py

    archive = h5py.File(archive_name, "r")
    images = archive['data']      # indexed as (images, channels, height, width)
    labels = archive['labels']    # same order as the images in 'data'

We also provide a script, unpackage_COOS.py, that automatically saves an archive as directories of TIFF files, organized by class. The two channels for each cell are saved as separate images, with the suffixes _protein.tif and _nucleus.tif.

To use:

    python unpackage_COOS.py [path of HDF5 file] [path of directory to save images to]

e.g.

    python unpackage_COOS.py ./COOS7_v1.0_training.hdf5 ./COOS7_v1.0_training_images/
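
As a complement to the loading instructions above, the following is a minimal sketch of reading a whole archive into memory and counting the images in each class. The helper names load_archive, class_counts and write_toy_archive are hypothetical and are not part of the dataset tooling. The toy archive only mimics the 'data' / 'labels' layout described above; its image size, dtype and integer labels are assumptions made for the demo, not properties of the real files.

```python
import os
import tempfile

import h5py
import numpy as np


def load_archive(archive_name):
    """Read the 'data' and 'labels' datasets fully into NumPy arrays."""
    # The context manager closes the file once the arrays have been copied out.
    with h5py.File(archive_name, "r") as archive:
        images = archive["data"][:]    # (images, channels, height, width)
        labels = archive["labels"][:]  # one label per image, same order
    return images, labels


def class_counts(labels):
    """Return a {label: number of images} mapping."""
    values, counts = np.unique(labels, return_counts=True)
    return {v.item(): int(c) for v, c in zip(values, counts)}


def write_toy_archive(path):
    """Hypothetical helper: write a tiny file with the same two keys.

    Shapes, dtype and labels here are illustrative assumptions only.
    """
    with h5py.File(path, "w") as archive:
        archive.create_dataset("data", data=np.zeros((5, 2, 8, 8), dtype=np.uint8))
        archive.create_dataset("labels", data=np.array([0, 1, 1, 2, 1]))


if __name__ == "__main__":
    toy_path = os.path.join(tempfile.mkdtemp(), "toy.hdf5")
    write_toy_archive(toy_path)
    images, labels = load_archive(toy_path)
    print(images.shape)
    print(class_counts(labels))
```

For a real archive, pass its path to load_archive in place of the toy file. Reading with [:] copies the full dataset into memory; for large archives it may be preferable to keep the file open and slice archive['data'] in batches instead.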