Benchmarking of Artificial Intelligence and Radiologists for Lung Cancer Screening in CT: The LUNA25 Challenge
- 1. Department of Medical Imaging, Radboud University Medical Center, The Netherlands
Description
Lung cancer is the leading cause of cancer-related deaths worldwide, with 2.5 million people diagnosed and 1.8 million deaths annually (Bray et al., 2024). The National Lung Screening Trial (NLST) and the Dutch-Belgian lung cancer screening trial (NELSON) provided evidence that lung cancer mortality can be reduced by repeated screening of high-risk individuals with low-dose chest CT (Aberle et al., 2011; de Koning et al., 2020). Lung cancer screening could therefore play an important role in reducing lung cancer mortality. However, implementing lung cancer screening will further increase the already high workload of radiologists.
The imminent implementation of lung cancer screening and the growing workload for radiologists demonstrate the need for safe and validated artificial intelligence (AI) algorithms, which have already shown human-level performance for lung nodule malignancy risk estimation (Venkadesh et al., 2021). However, it remains challenging to adequately validate and benchmark the increasing number of AI algorithms being developed. Grand challenges, which are international public competitions, offer a means to compare and validate AI algorithms. LUNA16 (Setio et al., 2017) and the Kaggle 2017 Data Science Bowl are examples of such competitions used to train and validate AI algorithms. Since then, the field of AI has progressed considerably, the interest of radiologists has increased, and better datasets have been collected, which opens the door to a more reliable comparison of AI algorithms and radiologists in lung cancer screening.
The LUNA25 challenge is a new grand challenge designed to evaluate the performance of AI algorithms and radiologists in lung nodule malignancy risk estimation in screening CT. LUNA25 aims to establish: 1) state-of-the-art AI performance for lung nodule malignancy risk estimation, 2) the performance of radiologists at lung nodule malignancy risk estimation through a large-scale international reader study, and 3) a comparison between the performance of AI algorithms and that of radiologists with varying levels of experience. This study hypothesizes that state-of-the-art AI is non-inferior to the average radiologist in estimating lung nodule malignancy risk on screening CT scans. This study does not address AI workflow integration or radiologist-AI interaction.
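To make the non-inferiority hypothesis concrete, the sketch below shows one common way such a comparison could be set up. It is not the challenge's evaluation protocol: the choice of AUC as the metric, the case-level bootstrap, the margin of 0.05, the function names, and the toy scores are all assumptions made for illustration only.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def noninferiority_lower_bound(labels, ai_scores, reader_scores,
                               n_boot=2000, alpha=0.05, seed=0):
    """Lower percentile bound of a bootstrap interval for AUC(AI) - AUC(reader)."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        # Resample cases with replacement, keeping AI and reader scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample lacked one class; draw again
        a = [ai_scores[i] for i in idx]
        r = [reader_scores[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, r))
    diffs.sort()
    return diffs[int((alpha / 2) * n_boot)]


# Hypothetical toy data: 1 = malignant nodule, 0 = benign nodule.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
ai = [0.9, 0.8, 0.7, 0.2, 0.3, 0.1, 0.4, 0.6]
reader = [0.8, 0.6, 0.7, 0.3, 0.5, 0.2, 0.4, 0.45]

margin = 0.05  # assumed non-inferiority margin, not taken from the challenge
lower = noninferiority_lower_bound(labels, ai, reader)
non_inferior = lower > -margin
```

Under these assumptions, AI would be declared non-inferior when the lower bound of the interval for the AUC difference lies above the negative margin. A real reader study would also need to account for variability across readers, which this case-level bootstrap ignores.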
Files
| Name | Size |
|---|---|
| 181-Benchmarking_of_Artificial_Intelligence_and_Radiologists_for_Lung_2025-03-21T14-06-01.pdf (md5:11c6b6cc6f45a0654a1073ee3a1830e0) | 109.0 kB |