Published October 21, 2020 | Version v1
Dataset Open

Data from: On the objectivity, reliability, and validity of deep learning enabled bioimage analyses

Description

Bioimage analysis of fluorescent labels is widely used in the life sciences. Recent advances in deep learning (DL) allow automating time-consuming manual image analysis processes based on annotated training data. However, manual annotation of fluorescent features with a low signal-to-noise ratio is somewhat subjective. Training DL models on subjective annotations may be unstable or yield biased models. In turn, these models may be unable to reliably detect biological effects. An analysis pipeline integrating data annotation, ground truth estimation, and model training can mitigate this risk. To evaluate this integrated process, we compared different DL-based analysis approaches. With data from two model organisms (mice, zebrafish) and five laboratories, we show that ground truth estimation from multiple human annotators helps to establish objectivity in fluorescent feature annotations. Furthermore, ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible DL-based bioimage analyses.

Notes

This data repository contains the source code and source data of our study. Raw bioimages represent cFOS labeling in different brain areas of mice after behavioral analyses (Pavlovian fear conditioning paradigms). We provide the code and training datasets that we used to generate expert and consensus models and ensembles, a model library that contains our validated consensus ensembles, the source data and the code used for the analyses, and the complete bioimage datasets of two laboratories (Lab-Wue1 [283 images] and Lab-Mue [24 images]).

Official repository of our study "On the objectivity, reliability, and validity of deep learning enabled bioimage analyses." You can find our paper at eLife. In addition, we also provide all code in our GitHub repository.

File organization:

bioimage_data.zip:

This folder contains the raw image data of all laboratories and an Excel sheet ("image_mapping.xlsx") that contains all metadata needed to associate the images with experimental data, such as genotype, treatment condition (see the codes below), or whether the image was used for model training.

Treatment condition code:

    - lab-wue1: homecage (H), context control (-), context conditioned (+)
    - lab-mue: early retrieval (Ext), late retrieval (Ret)
    - lab-inns1: control (Ctrl), extinction (Ext)
    - lab-inns2: Saline, L-DOPA responder, L-DOPA non-responder
    - lab-wue2: wildtype (WT), gad1b knock-down (KO)
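
As a minimal sketch, the treatment condition codes listed above can be written as a lookup table keyed by laboratory. The names `TREATMENT_CODES` and `decode_treatment` are hypothetical and not part of the published code, and how the codes are actually spelled inside "image_mapping.xlsx" is an assumption that should be checked against the sheet itself.

```python
# Treatment condition codes per laboratory, transcribed from the list above.
# Assumption: the metadata sheet uses exactly these spellings.
TREATMENT_CODES = {
    "lab-wue1": {
        "H": "homecage",
        "-": "context control",
        "+": "context conditioned",
    },
    "lab-mue": {
        "Ext": "early retrieval",
        "Ret": "late retrieval",
    },
    "lab-inns1": {
        "Ctrl": "control",
        "Ext": "extinction",
    },
    # lab-inns2 conditions are listed without abbreviations, so they map to themselves.
    "lab-inns2": {
        "Saline": "Saline",
        "L-DOPA responder": "L-DOPA responder",
        "L-DOPA non-responder": "L-DOPA non-responder",
    },
    "lab-wue2": {
        "WT": "wildtype",
        "KO": "gad1b knock-down",
    },
}


def decode_treatment(lab: str, code: str) -> str:
    """Return the treatment condition for a laboratory-specific code."""
    try:
        return TREATMENT_CODES[lab][code]
    except KeyError:
        raise KeyError(f"unknown treatment code {code!r} for laboratory {lab!r}") from None


print(decode_treatment("lab-wue1", "+"))
```

The lookup is keyed by laboratory first because the same abbreviation can occur in more than one dataset: "Ext" is listed for both lab-mue and lab-inns1 with different meanings.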

For each laboratory, we provide all labels predicted by the different models or ensembles as indicated with the path names: "*/labels/initialization_variant/model_type/model_or_ensemble/identifier/", and all regions in which bioimage analysis was performed. For two laboratories (lab-wue1 and lab-mue), we also provide all microscopy images.
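
The path convention for predicted labels can be illustrated with a short sketch that splits a path into the four components named in the pattern above. The `LabelPath` type and the `parse_label_path` function are hypothetical helpers, and the example path uses made-up placeholder values; the actual folder names in the archive may differ.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class LabelPath(NamedTuple):
    """The four path components that follow the "labels" folder."""
    initialization_variant: str
    model_type: str
    model_or_ensemble: str
    identifier: str


def parse_label_path(path: str) -> LabelPath:
    """Split '*/labels/<variant>/<type>/<model_or_ensemble>/<identifier>/'."""
    parts = PurePosixPath(path).parts
    if "labels" not in parts:
        raise ValueError(f"no 'labels' folder in path: {path!r}")
    start = parts.index("labels") + 1
    fields = parts[start:start + 4]
    if len(fields) < 4:
        raise ValueError(f"expected four components after 'labels': {path!r}")
    return LabelPath(*fields)


# Placeholder values only; they do not correspond to real folders in the archive.
example = parse_label_path("some_lab/labels/variant_a/type_b/ensemble_c/id_d/")
print(example.model_type)
```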

model_library.zip:

This folder contains one validated consensus ensemble for each of the five bioimage datasets.

source_data.zip:

This folder contains the source data of our study and is organized according to the individual figures in which the data is presented. In each figure folder, you find a readme file that provides more detailed information about the respective files and which notebook was used to perform the analysis.

test_data.zip:

This folder contains the test dataset of lab-wue1.

train_data.zip:

This folder contains all training datasets that were generated in the course of this study. This includes all microscopy images, the labels of the individual experts, and the computed consensus labels.

requirements.txt:

This file lists all packages, with their versions, that are required to install and run our code locally.

Funding provided by: Deutsche Forschungsgemeinschaft
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001659
Award Number: ID 44541416 - TRR58, A10 to Robert Blum

Funding provided by: Deutsche Forschungsgemeinschaft
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001659
Award Number: ID 44541416 - TRR58, A03 to Hans-Christian Pape

Funding provided by: Deutsche Forschungsgemeinschaft
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001659
Award Number: ID 44541416 - TRR58, B08 to Maren Lange

Funding provided by: Graduate School of Life Sciences, Würzburg
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100009379
Award Number: Fellowships to Rohini Gupta and Manju Sasi

Funding provided by: Austrian Science Fund
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100002428
Award Number: P29952 & P25851 to Ramon O. Tasan

Funding provided by: Austrian Science Fund
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100002428
Award Number: I2433-B26, DKW-1206, and SFB F4410 to Nicolas Singewald

Funding provided by: Interdisziplinäres Zentrum für Klinische Forschung, Universitätsklinikum Würzburg
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100009379
Award Number: N-320 to Christina Lillesaar

Funding provided by: Deutsche Forschungsgemeinschaft
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001659
Award Number: ID 424778381 to Robert Blum

Files

Files (7.8 GB)

md5:58a848ff796b02b5f7b2afa6d06351cf 471.1 MB
md5:b93c773e1d38b9eb0df38d1b2f32ed80 7.3 GB
md5:bb360c0659d4f16952b1c62e30858e5e 20.9 kB
md5:3431202f39813577235fef73bf7e8a35 2.3 kB
md5:3488bb2d56ced9b4ed09087c1706fa85 235 Bytes
md5:abdaf745746b537887ab9c29825346b8 22.3 MB
md5:1b059bd7a20158840b831aea5d9cef51 151.3 kB
md5:2926264fc84bf7523fa68ac7a5933da1 39.9 MB
md5:76f95da5c184ae27fc91fa001291651f 23.7 kB
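
The md5 checksums listed above can be used to confirm that a download is complete and uncorrupted. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_checksum` are hypothetical, and because this listing does not show which checksum belongs to which file, the expected value has to be taken from the entry of the file you actually downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_checksum("downloaded_file.zip", "md5:<value from the listing>")` returns `True` only if the local file is byte-identical to the published one; the file name here is a placeholder.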
