Published October 23, 2024 | Version v1

MICCAI 2025 Lighthouse Challenge: Unified Benchmarks for Imaging in Computational pathology, Radiology and Natural language (UNICORN)

Description

* Shared first authors: Clément Grisi, Michelle Stegeman, Judith Lefkes, Marina D'Amato, Luc Builtjes, Lena Philipp, Fennie van der Graaf, Joeran Bosma

** Shared last authors: Alessa Hering, Francesco Ciompi 

Challenges have undoubtedly shaped the field of medical image analysis and deep learning over the last decade, following a "many-to-one" approach, where numerous models competed on a single (narrow) task. But the emergence of (multimodal) foundation models powered by (vision) transformers brings a paradigm change in benchmarking towards a "one-to-many" approach, where a single model is benchmarked across a selection of (multimodal) tasks.

Language models are benchmarked on a variety of tests, including math, medicine and geometry, while vision models are benchmarked via zero-shot or few-shot learning on classification, detection and segmentation tasks, also in the field of medical imaging. One of the main promises of these large models, pre-trained at very large scale, is to be "generalist" models, able to tackle multiple tasks without being specifically trained on large-scale data for each task. Instead, paradigms of fine-tuning and few-shot learning are increasingly being adopted and explored, envisioning a future where users can (re-)purpose foundation models to address different tasks with minimal interaction via prompts, similar to what is done via text with models such as ChatGPT. For medical image analysis, however, the question is how far we are with the development and usability of multimodal foundation models, and to answer this question the field lacks a comprehensive, publicly available benchmark that encompasses these new forms of evaluation. We believe it is the right time to introduce such a benchmark, which we envision in the form of a challenge.

We propose UNICORN, a challenge that provides a unified set of benchmarks to assess the performance of multimodal foundation models. We focus on image data in the fields of radiology and digital pathology, text data in the form of medical reports, and combined images and text for multimodal approaches. We release multimodal public data that participants can use to fine-tune existing (pre-trained) foundation models or to develop a strategy based on few-shot learning, and we establish a battery of benchmarks based on sequestered test data.

Covering vision, language and vision-language tasks, UNICORN is not just about the performance of models on individual tasks, but about understanding how well integrated vision-language models can adapt to the complexities of medical data. In UNICORN, we bring together multiple teams from one Dutch medical center (Radboudumc), one Dutch oncology center (Maastro), and a precision oncology group from Germany (TU Dresden), with multidisciplinary expertise and a strong track record in challenge organization*. We aim to make UNICORN a unified evaluation platform for fair and uniform comparison of multimodal foundation models in medical imaging.

* We are currently discussing the inclusion of additional medical centers in the Netherlands, which have not yet fully confirmed their commitment due to time constraints and are therefore not included in this proposal, but will likely join and contribute data to existing tasks as well as new tasks, should this proposal be accepted.

The final version of this document, including further revisions, will be released soon.


Files (603.3 kB)

Lighthouse -- UNICORN Proposal.pdf (603.3 kB; md5:dccf0f43e1a2acd4d7bac9aa4222683c)