Published November 20, 2024 | Version v3
Dataset Open

CellBinDB: A Large-Scale Multimodal Annotated Dataset

Description

CellBinDB is a large-scale, multimodal annotated dataset for cell segmentation. It contains 1,044 annotated microscope images and 109,083 cell annotations, covering four staining types: DAPI, ssDNA, H&E, and mIF. The samples come from two species, human and mouse, and span more than 30 histologically distinct tissue types, including disease-related tissues. The images come from two sources: 844 mouse images from internal experiments and 200 human images from the open-access platform 10x Genomics. All images are annotated, and two types of annotation are provided for each: semantic masks and instance masks. An accompanying xlsx file records detailed information for each image.
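The relationship between the two annotation types can be illustrated with a minimal sketch. It assumes the common convention that an instance mask stores 0 for background and a distinct positive integer for each cell, and that a semantic mask is binary (cell vs. background). This record does not state the encoding used in CellBinDB, so check the files before relying on it. The function names and the toy array are hypothetical and are not part of the dataset.

```python
import numpy as np


def instance_to_semantic(instance_mask: np.ndarray) -> np.ndarray:
    """Collapse an instance mask into a binary semantic mask.

    Assumption: 0 is background and every cell has its own positive label.
    """
    return (instance_mask > 0).astype(np.uint8)


def count_cells(instance_mask: np.ndarray) -> int:
    """Count distinct non-background labels in an instance mask."""
    labels = np.unique(instance_mask)
    return int(np.count_nonzero(labels))


# Toy 3x4 instance mask with three labelled cells (hypothetical data).
toy = np.array(
    [
        [0, 1, 1, 0],
        [0, 0, 2, 2],
        [3, 0, 0, 2],
    ],
    dtype=np.uint16,
)

semantic = instance_to_semantic(toy)
n_cells = count_cells(toy)
```

Under the same assumption, summing `count_cells` over every instance mask would give a total comparable to the annotation count reported above.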

In addition, we provide the images and annotations of nine other widely used, publicly available cell segmentation datasets. They were downloaded from their original sources and are kept in their original formats for ease of use.

The file 'mixed_licenses.txt' lists the original accessions of the public datasets used in our project and their associated licenses. Please refer to those links for more information about each dataset, and use each dataset in accordance with its licensing terms.

Files

bbbc038.zip

Files (8.2 GB)

Checksum Size
md5:8806dc3719cfde3b8c0c93408c5f82d6 372.7 MB
md5:5af00c79c54b7ece852f030e26bed536 80.6 MB
md5:e770f1287619eb45e74d131430e20fe5 286.0 MB
md5:83e48f9827f475cb65425a3142046ff2 73.9 kB
md5:2d3323e98c5b1f05dcb5283114391e15 360.6 MB
md5:af17e56a611818cd6d9a734e5bc46b61 25.2 MB
md5:62453f9471d84dbd62424660acec8ab8 76.2 MB
md5:4a157c786cb80097f15afc5950e88807 786.5 MB
md5:80ef751e5829b781b2036efa8463a75b 1.1 kB
md5:d2e67d14d86f9452396ab14c5752f35d 207.5 MB
md5:08f9a33e23e983d82ee93aa58e69d095 1.6 GB
md5:42fd53473eb043e968815c03753d0ed1 4.4 GB
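
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listing` are hypothetical, and the mapping from each checksum to its file name has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the largest archive is several gigabytes. A call such as `matches_listing(local_path, "md5:<hex>")` returns `True` only when the digest of the local file equals the listed value.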

Additional details

Related works

Is published in
Journal article: 10.1093/gigascience/giaf069 (DOI)
Dataset: 10.26036/CNP0006370 (DOI)
Dataset: 10.6019/S-BIAD1538 (DOI)

References

  • Caicedo, J.C., Goodman, A., Karhohs, K.W. et al. Nucleus segmentation across imaging experiments: the 2018 Data Science Bowl. Nat Methods 16, 1247–1253 (2019). https://doi.org/10.1038/s41592-019-0612-7
  • Ljosa, V., Sokolnicki, K. & Carpenter, A. Annotated high-throughput microscopy image sets for validation. Nat Methods 9, 637 (2012). https://doi.org/10.1038/nmeth.2083
  • Stringer, C., Pachitariu, M. Cellpose3: one-click image restoration for improved cellular segmentation. Nat Methods 22, 592–599 (2025). https://doi.org/10.1038/s41592-025-02595-5
  • Naylor, P.J., Walter, T., Laé, M., & Reyal, F. (2018). Segmentation of Nuclei in Histopathology Images by deep regression of the distance map (1.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.2579118
  • Kromp, F., Bozsaky, E., Rifatbegovic, F. et al. An annotated fluorescence image dataset for training nuclear segmentation methods. Sci Data 7, 262 (2020). https://doi.org/10.1038/s41597-020-00608-w
  • S. Graham et al., "Lizard: A Large-Scale Dataset for Colonic Nuclear Instance Segmentation and Classification," 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada, 2021, pp. 684-693, doi: 10.1109/ICCVW54120.2021.00082.
  • N. Kumar et al., "A Multi-Organ Nucleus Segmentation Challenge," in IEEE Transactions on Medical Imaging, vol. 39, no. 5, pp. 1380-1391, May 2020, doi: 10.1109/TMI.2019.2947628.
  • Mahbod, A., Polak, C., Feldmann, K., Khan, R., Gelles, K., Dorffner, G., Woitek, R., Hatamikia, S., & Ellinger, I. (2024). NuInsSeg: A fully annotated dataset for nuclei instance segmentation in H&E-stained histological images (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10518968
  • Greenwald, N.F., Miller, G., Moen, E. et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat Biotechnol 40, 555–565 (2022). https://doi.org/10.1038/s41587-021-01094-0