Published March 12, 2024 | Version v1
Dataset Open

HistoArtifacts

  • 1. University of Stavanger

Contributors

Data collector:

  • 1. Erasmus MC Cancer Institute
  • 2. University of Stavanger

Description

This dataset contains five notable histological artifacts: blur, blood (hemorrhage), air bubbles, folded tissue, and damaged tissue. It is used in the works listed below, and a description of the dataset can be found at https://arxiv.org/abs/2403.07743.

The full dataset is explained and used in the article "Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs": https://arxiv.org/abs/2403.07743

For a detailed explanation of the motivation behind artifact detection in computational pathology, see the video paper "Extract, detect, eliminate: Enhancing reliability and performance of computational pathology through artifact processing pipelines": https://www.sciencetalks-journal.com/article/S2772-5693(24)00013-6/fulltext

Please cite the following papers when using the dataset, in full or in part:

A sub-dataset containing folded tissue extracted at 20x and the blur class is used in the paper "Are you sure it’s an artifact? Artifact detection and uncertainty quantification in histological images": https://www.sciencedirect.com/science/article/pii/S0895611123001398

A sub-dataset of air bubbles is used in the paper "Vision transformers for small histological datasets learned through knowledge distillation": https://link.springer.com/chapter/10.1007/978-3-031-33380-4_13
https://arxiv.org/abs/2305.17370

A sub-dataset of blood and damaged tissue is used in the paper "Quantifying the effect of color processing on blood and damaged tissue detection in whole slide images": https://ieeexplore.ieee.org/abstract/document/9816283

"Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs": https://arxiv.org/abs/2403.07743

"The Devil is in the Details: Whole Slide Image Acquisition and Processing for Artifacts Detection, Color Variation, and Data Augmentation: A Review": https://ieeexplore.ieee.org/document/9777677

 

Files

Files (4.8 GB)

Name: multiclass_artifact_data.zip
Size: 4.8 GB
md5:a3f0cb9dae5f2bd04cbba707eb69edd4
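
Because the archive is several gigabytes, it can be worth checking the download against the MD5 checksum listed above before extracting it. The following is a minimal sketch, not part of the dataset's own tooling: the function names `md5_of_file` and `verify_download` are hypothetical, and the script assumes the archive was saved as multiclass_artifact_data.zip in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum published on the dataset page for multiclass_artifact_data.zip.
EXPECTED_MD5 = "a3f0cb9dae5f2bd04cbba707eb69edd4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a 4.8 GB archive never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = Path("multiclass_artifact_data.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an interrupted or corrupted download, in which case fetching the archive again is the simplest remedy.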

Additional details

Identifiers

Related works

Has part
Conference paper: 10.1109/IVMSP54334.2022.9816283 (DOI)
Book chapter: 10.1007/978-3-031-33380-4_13 (DOI)
Journal: 10.1016/j.compmedimag.2023.102321 (DOI)
Is described by
Journal article: arXiv:2403.07743 (arXiv)

Funding

European Research Council
CLARIFY 860627

Dates

Available
2024-03-12
Submitted as journal paper to BMC MIDM

References

  • @misc{kanwal2024equipping, title={Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs}, author={Neel Kanwal and Farbod Khoraminia and Umay Kiraz and Andres Mosquera-Zamudio and Carlos Monteagudo and Emiel A. M. Janssen and Tahlita C. M. Zuiverloon and Chunming Rong and Kjersti Engan}, year={2024}, eprint={2403.07743}, archivePrefix={arXiv}, primaryClass={eess.IV} }