Published June 6, 2018 | Version 1.0.0
Dataset (Open Access)

Is the winner really the best? A critical analysis of common research practice in biomedical image analysis competitions

  • 1. Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ), Heidelberg, Germany

Description

This data set corresponds to the paper: Is the winner really the best? A critical analysis of common research practice in biomedical image analysis competitions [1] (Experiment: Comprehensive reporting).

The key research questions corresponding to this data set were:

RQ1: What is the role of challenges for the field of biomedical image analysis (e.g. how many challenges have been conducted to date? In which fields? For which algorithm categories? Based on which modalities?)?

RQ2: What is common practice related to challenge design (e.g. choice of metric(s) and ranking methods, number of training/test images, annotation practice, etc.)? Are there common standards?

RQ3: Does common practice related to challenge reporting allow for reproducibility and adequate interpretation of results?

To address these research questions, we aimed to capture all biomedical image analysis challenges that have been conducted up to 2016. To acquire the data, we analyzed the websites hosting/representing biomedical image analysis challenges, namely grand-challenge.org, dreamchallenges.org and kaggle.com as well as websites of main conferences in the field of biomedical image analysis, namely Medical Image Computing and Computer Assisted Intervention (MICCAI), International Symposium on Biomedical Imaging (ISBI), International Society for Optics and Photonics (SPIE) Medical Imaging, Cross Language Evaluation Forum (CLEF), International Conference on Pattern Recognition (ICPR), The American Association of Physicists in Medicine (AAPM), the Single Molecule Localization Microscopy Symposium (SMLMS) and the BioImage Informatics Conference (BII). This yielded a list of 150 challenges with 549 tasks.

Next, a tool for instantiating the challenge parameter list introduced in [1] was used by some of the authors (engineers and a medical student) to formalize all challenges that met our inclusion criteria, as follows: (1) Initially, each challenge was independently formalized by two different observers. (2) The formalization results were automatically compared. (3) In ambiguous cases, i.e. when the observers could not agree on the instantiation of a parameter, a third observer was consulted and a decision was made. When refinements to the parameter list were made, the process was repeated for the missing values. Based on the formalized challenge data set, a descriptive statistical analysis was performed to characterize common practice related to challenge design and reporting.
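The automatic comparison step described above can be sketched as follows. This is a minimal illustration, not the authors' actual tool: it assumes each observer's formalization is stored as a mapping from parameter names to values, and it returns the parameters on which the two observers disagree (which would then be passed to a third observer for adjudication).

```python
def compare_formalizations(obs1, obs2):
    """Return the parameters on which two observers' challenge
    formalizations disagree, as {parameter: (value1, value2)}.

    Hypothetical data layout: each formalization is a plain dict
    mapping parameter names to their instantiated values.
    """
    disagreements = {}
    # Consider every parameter touched by either observer, so that
    # a value missing from one formalization also counts as a conflict.
    for param in obs1.keys() | obs2.keys():
        v1, v2 = obs1.get(param), obs2.get(param)
        if v1 != v2:
            disagreements[param] = (v1, v2)
    return disagreements
```

For example, if both observers record the same metric but different test-set sizes, only the test-set parameter is flagged for the third observer.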

[1] Maier-Hein, L., Eisenmann, M., Reinke, A., Onogur, S., Stankovic, M., Scholz, P., Arbel, T., Bogunovic, H., Bradley, A. P., Carass, A., Feldmann, C., Frangi, A. F., Full, P. M., van Ginneken, B., Hanbury, A., Honauer, K., Kozubek, M., Landman, B. A., März, K., Maier, O., Maier-Hein, K., Menze, B. H., Müller, H., Neher, P. F., Niessen, W., Rajpoot, N., Sharp, G. C., Sirinukunwattana, K., Speidel, S., Stock, C., Stoyanov, D., Aziz Taha, A., van der Sommen, F., Wang, C.-W., Weber, M.-A., Zheng, G., Jannin, P., Kopp-Schneider, A.: Is the winner really the best? A critical analysis of common research practice in biomedical image analysis competitions. arXiv preprint arXiv:1806.02051 (2018).

Notes

We would like to acknowledge support from the European Research Council (ERC) (ERC starting grant COMBIOSCOPY under the New Horizon Framework Programme grant agreement ERC-2015-StG-37960).

Files

Files (2.8 MB)

md5:193fb461cb6a4f44a0141680dab990ed (2.8 MB)
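A downloaded copy of the file can be verified against the MD5 checksum listed above. This is a generic sketch using only the Python standard library; the filename below is a hypothetical placeholder, since the record does not state it here.

```python
import hashlib

def md5_of_file(path, chunk_size=8192):
    """Compute the MD5 checksum of a file, reading it in chunks
    so that large files do not need to fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical filename; compare against the checksum listed in "Files":
# md5_of_file("challenge_dataset_file") == "193fb461cb6a4f44a0141680dab990ed"
```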
