Other Open Access
Bjoern Menze; Leo Joskowicz; Spyridon Bakas; Andras Jakab; Ender Konukoglu; Anton Becker; Christoph Berger
This is the challenge design document for the "Quantification of Uncertainties in Biomedical Image Quantification" Challenge, accepted for MICCAI 2020.
A preliminary study on the inter-observer variability of manual contour delineation of structures was carried out by L. Joskowicz et al. in 2019 and published in the journal 'European Radiology'. Its objective was to quantify the inter-observer variability of manual delineations of lesions and organ contours in CT images, in order to establish a reference standard for volumetric measurements in clinical decision making and for the evaluation of automatic segmentation algorithms. The study found that the variability of manual delineations is large and spans a wide range across structures, pathologies, and observers. Two or even three observers may not suffice to establish the full range of potential variability in the outlines of the structures of interest. This variability, which is a property of the biological problem, the imaging modality, and the expert annotators, is as yet not sufficiently considered in the design of computerized algorithms for medical image quantification.
So far, uncertainties in predicted image segmentations have been derived from general considerations of the statistical model, from resampling training data sets in ensemble approaches, or from systematic modifications of the predictive algorithm, as in the ‘drop-out’ procedures of deep learning. At the same time, the definition of when the outline of an image structure to be quantified is ‘uncertain’ is a task- and data-dependent property of the quantification that can, and perhaps must, be inferred directly from human expert annotations. To date, no data sets are available for evaluating the accuracy of probabilistic model predictions against such expert-generated truth, and there is no consensus on which procedures for uncertainty quantification return realistic estimates and which do not.
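To make the ‘drop-out’ idea above concrete: in Monte Carlo dropout, the dropout mask is kept active at test time and the network is run several times, so that the spread of the sampled predictions serves as a per-pixel uncertainty estimate. The following NumPy sketch illustrates this on a toy, made-up model; the architecture, shapes, and dropout rate are illustrative assumptions, not part of the challenge.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "segmentation network": one linear layer plus a sigmoid over per-pixel
# features. The weights are fixed; the dropout mask on the inputs is kept
# active at inference time (Monte Carlo dropout), so repeated forward passes
# sample from an approximate predictive distribution. Everything here is a
# hypothetical stand-in for a real segmentation model.

def forward(x, w, drop_p, rng):
    mask = rng.random(x.shape) > drop_p        # Bernoulli dropout mask
    h = x * mask / (1.0 - drop_p)              # inverted-dropout scaling
    logits = h @ w
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid -> foreground prob.

x = rng.normal(size=(16, 8))    # 16 pixels, 8 features each (made up)
w = rng.normal(size=(8,))       # fixed weights of the toy model

# Draw T stochastic forward passes and summarise them per pixel.
T = 200
samples = np.stack([forward(x, w, 0.5, rng) for _ in range(T)])
mean_prob = samples.mean(axis=0)     # soft segmentation
uncertainty = samples.std(axis=0)    # per-pixel predictive spread

print(mean_prob.shape, uncertainty.shape)
```

Ensemble approaches yield the same kind of summary, with the samples coming from separately trained models rather than from dropout masks.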
The purpose of the challenge is to benchmark algorithms that return uncertainty estimates (e.g., probability scores or variability regions) for structures in medical image segmentation tasks. Specifically, the algorithmic output will be compared against the uncertainties that human annotators attribute to the local delineation of image structures of diagnostic relevance, such as lesions or anatomical structures. Structures in several CT and MR image data sets have been annotated repeatedly by a group of experts to quantify the variability of boundary delineations. Tasks include the segmentation of lesions, such as brain tumors, lung tumors, liver tumors, and brain hemorrhages, as well as anatomical structures, such as kidneys and the prenatal brain.
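One simple way to turn repeated expert annotations into a reference for probabilistic predictions is a per-pixel agreement map: the fraction of annotators who labelled each pixel as foreground. The sketch below, with a made-up three-expert example and a Brier-style squared-error score, illustrates the idea; the masks, shapes, and scoring rule are assumptions for illustration, not the challenge's actual data or evaluation protocol.

```python
import numpy as np

# Hypothetical setup: three experts each produce a binary mask of the same
# structure on a tiny 3x4 image. Values and shapes are illustrative only.
masks = np.array([
    [[0, 1, 1, 0],
     [0, 1, 1, 1],
     [0, 0, 1, 0]],
    [[0, 1, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 0]],
    [[0, 1, 1, 1],
     [0, 1, 1, 1],
     [0, 0, 1, 0]],
], dtype=float)                      # shape: (experts, H, W)

# Per-pixel agreement: fraction of experts labelling the pixel foreground.
# This soft map is one possible reference against which a probabilistic
# segmentation can be scored.
agreement = masks.mean(axis=0)       # values in {0, 1/3, 2/3, 1}

def brier_score(pred, target):
    """Mean squared error between predicted probabilities and the target map."""
    return float(np.mean((pred - target) ** 2))

# A (made-up) maximally uncertain prediction, scored against the agreement map:
pred = np.full(agreement.shape, 0.5)
print(round(brier_score(pred, agreement), 4))
```

Pixels where the experts disagree contribute less error to a hedged 0.5 prediction than pixels of unanimous agreement, which is exactly the behaviour a benchmark of uncertainty estimates needs to reward.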
Joskowicz, Leo, et al. "Inter-observer variability of manual contour delineation of structures in CT." European Radiology 29.3 (2019): 1391-1399.