Image segmentations produced by BAMF under the AIMI Annotations initiative

Van Oss, Jeff; Murugesan, Gowtham Krishnan; McCrumb, Diana; Soni, Rahul

doi:10.5281/zenodo.13244892

Published August 6, 2024 | Version v2.0.2

Dataset Open

1. BAMF Health

The Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov/) [1] connects researchers with publicly available cancer imaging data, often linked with other types of cancer data. Many of the collections have limited annotations because of the expense and effort required to create them manually. The increased capabilities of AI analysis of radiology images provide an opportunity to augment existing IDC collections with new annotation data. To further this goal, we trained several nnU-Net [2] based models for a variety of radiology segmentation tasks from public datasets and used them to generate segmentations for IDC collections.

To validate the models' performance, roughly 10% of the AI predictions were assigned to a validation set. For this set, a board-certified radiologist graded the quality of the AI predictions on a Likert scale. If the radiologist did not 'strongly agree' with the AI output, they corrected the segmentation.

This record provides the AI segmentations, the manually corrected segmentations, and the manual scores for the inspected IDC collection images.

Only about 10% of the AI-derived annotations provided in this dataset have been verified by expert radiologists. More details on model training and annotations are provided in the associated manuscript to ensure transparency and reproducibility.

This work was done in two stages. Versions 1.x of this record came from the first stage; versions 2.x added further records. In the version 1.x collections, a medical student (non-expert) reviewed all the AI predictions and rated them on a 5-point Likert scale. For any AI prediction in the validation set that they did not 'strongly agree' with, the non-expert provided a corrected segmentation. No non-expert reviewer was used for the records added in versions 2.x.

Likert Score Definition:

Guidelines for reviewers to grade the quality of AI segmentations.

5 Strongly Agree - Use-as-is (i.e., clinically acceptable, and could be used for treatment without change)
4 Agree - Minor edits that are not necessary. Stylistic differences, but not clinically important. The current segmentation is acceptable
3 Neither agree nor disagree - Minor edits that are necessary. Minor edits are those that the reviewer judges can be made in less time than starting from scratch or are expected to have minimal effect on treatment outcome
2 Disagree - Major edits. This category indicates that the necessary edit is required to ensure correctness and is sufficiently significant that the user would prefer to start from scratch
1 Strongly disagree - Unusable. This category indicates that the quality of the automatic annotations is so bad that they are unusable.
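
The scale above can be captured in a small lookup for downstream analysis. This is a minimal sketch, not part of the dataset tooling: the names `LikertScore` and `correction_expected` are hypothetical, and it assumes the ratings are available as integers 1 to 5 (the exact encoding used in qa-results.csv should be checked before relying on it).

```python
from enum import IntEnum


class LikertScore(IntEnum):
    """Reviewer ratings, following the guideline list above."""
    STRONGLY_DISAGREE = 1  # unusable
    DISAGREE = 2           # major edits
    NEUTRAL = 3            # minor edits that are necessary
    AGREE = 4              # minor edits that are not necessary
    STRONGLY_AGREE = 5     # use as-is


def correction_expected(score: int) -> bool:
    """Return True when a corrected segmentation would be expected.

    Per the description, reviewers corrected the segmentation whenever
    they did not 'strongly agree' with the AI output.
    Raises ValueError for scores outside 1..5.
    """
    return LikertScore(score) < LikertScore.STRONGLY_AGREE
```

For example, `correction_expected(4)` is true (a rating of Agree still falls short of Strongly Agree), while `correction_expected(5)` is false.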

Zip File Folder Structure

Each zip file in the collection corresponds to a specific segmentation task. The common folder structure is:

ai-segmentations-dcm This directory contains the AI model predictions in DICOM-SEG format for all analyzed IDC collection files
qa-segmentations-dcm This directory contains manually corrected segmentation files, based on the AI prediction, in DICOM-SEG format. Only a fraction, ~10%, of the AI predictions were corrected. Corrections were performed by radiologists (rad*) and non-experts (ne*)
qa-results.csv CSV file linking the study/series UIDs with the AI segmentation file, the radiologist-corrected segmentation file, and the radiologist ratings of AI performance.
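
As an illustration of how this layout might be navigated after extraction, the sketch below builds the paths for one case. It assumes the three entries sit directly under the extracted task folder (the archives may nest them one level deeper), and the helper name `locate_segmentations` is hypothetical.

```python
from pathlib import Path

AI_DIR = "ai-segmentations-dcm"   # AI predictions (DICOM-SEG)
QA_DIR = "qa-segmentations-dcm"   # reviewer-corrected files (DICOM-SEG)
QA_CSV = "qa-results.csv"         # links UIDs, files, and ratings


def locate_segmentations(task_root: Path, ai_name: str, corrected_name: str = "") -> dict:
    """Build the expected paths for one case inside an extracted task folder.

    Assumption: the folder names above are direct children of task_root.
    corrected_name may be empty, since only ~10% of cases were reviewed.
    """
    corrected = (task_root / QA_DIR / corrected_name) if corrected_name else None
    return {
        "csv": task_root / QA_CSV,
        "ai": task_root / AI_DIR / ai_name,
        "corrected": corrected,
    }
```

The file names passed in would come from the AISegmentation and CorrectedSegmentation columns of qa-results.csv.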

qa-results.csv Columns

The qa-results.csv file contains metadata about the segmentations, their related IDC case image, as well as the Likert ratings and comments by the reviewers.

Column	Description
Collection	The name of the IDC collection for this case
PatientID	PatientID in DICOM metadata of scan. Also called Case ID in the IDC
StudyInstanceUID	StudyInstanceUID in the DICOM metadata of the scan
SeriesInstanceUID	SeriesInstanceUID in the DICOM metadata of the scan
Validation	true/false: whether this scan was manually reviewed
Reviewer	Coded ID of the reviewer. Radiologist IDs start with ‘rad’; non-expert IDs start with ‘ne’
AimiProjectYear	2023 or 2024. This work was split over two years. The main methodological difference between the two is that in 2023 a non-expert also reviewed the AI output, whereas no non-expert was used in 2024.
AISegmentation	The filename of the AI prediction file in DICOM-SEG format. This file is in the ai-segmentations-dcm folder.
CorrectedSegmentation	The filename of the reviewer-corrected prediction file in DICOM-SEG format. This file is in the qa-segmentations-dcm folder. If the reviewer strongly agreed with the AI for all segments, they did not provide any correction file.
Was the AI predicted ROIs accurate?	This column is used for images from AimiProjectYear 2023. The reviewer rates segmentation quality on a Likert scale. In tasks that have multiple labels in the output, a single rating covers them all.
Was the AI predicted {SEGMENT_NAME} label accurate?	This column appears once for each segment in the task for images from AimiProjectYear 2024. The reviewer rates each segment for its quality on a Likert scale.
Do you have any comments about the AI predicted ROIs?	Open ended question for the reviewer
Do you have any comments about the findings from the study scans?	Open ended question for the reviewer
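
A minimal sketch of reading these columns with the Python standard library follows. It assumes the file is a comma-separated CSV with a header row matching the column names above and that Validation holds the text true or false; the function name `reviewed_rows` is hypothetical, and how ratings are encoded in the cells is not specified here, so they are passed through as raw strings.

```python
import csv
import io

# Both the 2023 and 2024 rating columns start with this text.
RATING_PREFIX = "Was the AI predicted"


def reviewed_rows(csv_text: str) -> list:
    """Return the manually reviewed rows together with their rating columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rating_cols = [c for c in (reader.fieldnames or []) if c.startswith(RATING_PREFIX)]
    selected = []
    for row in reader:
        # Assumption: Validation is the string "true"/"false" in some casing.
        if (row.get("Validation") or "").strip().lower() != "true":
            continue
        # Keep only the rating cells that are filled in for this row.
        ratings = {c: row[c] for c in rating_cols if row.get(c)}
        selected.append({
            "PatientID": row.get("PatientID"),
            "Reviewer": row.get("Reviewer"),
            "AimiProjectYear": row.get("AimiProjectYear"),
            "AISegmentation": row.get("AISegmentation"),
            "CorrectedSegmentation": row.get("CorrectedSegmentation"),
            "ratings": ratings,
        })
    return selected
```

Matching on the shared prefix picks up both the single 2023 rating column and the per-segment 2024 columns without hard-coding segment names.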

File Overview

brain-mr.zip

Segment Description: brain tumor regions: necrosis, edema, enhancing
IDC Collection: UPENN-GBM
Links: model weights, github

breast-fdg-pet-ct.zip

Segment Description: FDG-avid lesions in breast from FDG PET/CT scans
IDC Collection: QIN-Breast
Links: model weights, github

breast-mr.zip

Segment Description: Breast, Fibroglandular tissue, structural tumor
IDC Collection: duke-breast-cancer-mri
Links: model weights, github

kidney-ct.zip

Segment Description: Kidney, Tumor, and Cysts from contrast enhanced CT scans
IDC Collection: TCGA-KIRC, TCGA-KIRP, TCGA-KICH, CPTAC-CCRCC
Links: model weights, github

liver-ct.zip

Segment Description: Liver from CT scans
IDC Collection: TCGA-LIHC
Links: model weights, github

liver2-ct.zip

Segment Description: Liver and Lesions from CT scans
IDC Collection: HCC-TACE-SEG, COLORECTAL-LIVER-METASTASES
Links: model weights, github

liver-mr.zip

Segment Description: Liver from T1 MRI scans
IDC Collection: TCGA-LIHC
Links: model weights, github

lung-ct.zip

Segment Description: Lung and Nodules (3mm-30mm) from CT scans
IDC Collections:
Links: model weights 1, model weights 2, github

lung2-ct.zip

Improved model version
Segment Description: Lung and Nodules (3mm-30mm) from CT scans
IDC Collections:
- QIN-LUNG-CT, SPIE-AAPM Lung CT Challenge
Links: model weights, github

lung-fdg-pet-ct.zip

Segment Description: Lungs and FDG-avid lesions in the lung from FDG PET/CT scans
IDC Collections:
Links: model weights, github

prostate-mr.zip

Segment Description: Prostate from T2 MRI scans
IDC Collection: ProstateX, Prostate-MRI-US-Biopsy
Links: model weights, github

Changelog

2.0.2 - fixed the brain-mr segmentations so they are transformed correctly
2.0.1 - added AIMI 2024 radiologist comments to qa-results.csv
2.0.0 - added AIMI 2024 segmentations
1.X - AIMI 2023 segmentations and reviewer scores

Files

Files (686.3 MB)

Name	MD5	Size
brain-mr.zip	61ea0e3e3bca0ed181421b92c7d51a64	40.1 MB
breast-fdg-pet-ct.zip	7a0e9dca5934f7c448bdfdd7e5fe5c3b	805.1 kB
breast-mr.zip	5ec911c56465b33d5db65cafb87bd98b	248.6 MB
kidney-ct.zip	dcb82c33ff78a726d28808710842b419	11.1 MB
liver-ct.zip	160b441da41631893e09ea3ee77d7688	4.7 MB
liver-mr.zip	68762aac6c048b75cfd920dbfa494744	1.7 MB
liver2-ct.zip	2826f1d85425e5245bde9005b3872c9c	31.2 MB
lung-ct.zip	54591ab911a8d41e328cd768b240dce9	72.7 MB
lung-fdg-pet-ct.zip	ca277d2723474c4842f4ca3ec89eac67	9.7 MB
lung2-ct.zip	8cf4858149a1fbef0c6718e93b5c6b4a	258.2 MB
prostate-mr.zip	753b6c7b5b281aece03724298bacbdb7	7.5 MB
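
The MD5 values listed with each zip file can be used to check a download for corruption. The following is a small sketch using Python's hashlib; the helper name `md5_matches` is hypothetical, and the local path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def md5_matches(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Compare a file's MD5 digest with the published hex string.

    The file is read in chunks so large archives need not fit in memory.
    """
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For instance, `md5_matches(Path("liver-mr.zip"), "68762aac6c048b75cfd920dbfa494744")` should return true for an intact copy of that archive. MD5 is suitable here as an integrity check only, not as protection against tampering.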

Additional details

Is derived from: Dataset: 10.7937/K9/TCIA.2016.21JUEBH0 (DOI); Dataset: 10.7937/K9/TCIA.2016.V6PBVTDR (DOI); Dataset: 10.7937/K9/TCIA.2016.IMMQW8UQ (DOI); Dataset: 10.7937/K9/TCIA.2016.IMMQW8UQ (DOI); Dataset: 10.7937/K9TCIA.2017.MURS5CL (DOI); Dataset: 10.7937/tcia.2019.30ilqfcl (DOI); Dataset: 10.7937/tcia.2019.zjjwb9ip (DOI); Dataset: 10.7937/TCIA.2020.NNC2-0461 (DOI); Dataset: 10.7937/K9/TCIA.2017.7hs46erv (DOI); Dataset: 10.7937/k9/tcia.2015.ofip7tvm (DOI); Dataset: 10.7937/K9/TCIA.2016.JGNIHEP5 (DOI); Dataset: 10.7937/K9/TCIA.2016.TYGKKFMQ (DOI); Dataset: 10.7937/K9/TCIA.2015.NPGZYZBZ (DOI); Dataset: 10.7937/K9/TCIA.2015.1BUVFJR7 (DOI); Dataset: 10.7937/K9/TCIA.2015.UZLSU3FL (DOI); Dataset: 10.7937/TCIA.HMQ8-J677 (DOI); Dataset: 10.7937/K9/TCIA.2017.KLXWJJ1Q (DOI); Dataset: 10.7937/K9/TCIA.2017.GJQ7R0EF (DOI); Dataset: 10.1016/j.media.2022.102680 (DOI); Dataset: 10.1038/s41467-022-30695-9 (DOI); Dataset: 10.7937/TCIA.XC7A-QT20 (DOI); Dataset: 10.7937/TCIA.e3sv-re93 (DOI); Dataset: 10.7937/TCIA.2020.A61IOC1A (DOI); Dataset: 10.7937/TCIA.709X-DN49 (DOI); Dataset: 10.7937/K9/TCIA.2016.ACWOGBEF (DOI); Dataset: 10.7937/K9/TCIA.2016.YU3RBCZN (DOI); Dataset: 10.7937/k9/tcia.2018.oblamn27 (DOI); Dataset: 10.7937/TCIA.5FNA-0924 (DOI); Dataset: 10.7937/QXK2-QG03 (DOI)
Is published in: Preprint: 10.48550/arXiv.2310.14897 (DOI); Other: 10.25504/FAIRsharing.0b5a1d (DOI); Other: https://portal.imaging.datacommons.cancer.gov/ (URL)
References: Dataset: 10.1038/sdata.2017.117 (DOI); Journal article: 10.1109/TMI.2014.2377694 (DOI); Preprint: arXiv:2107.02314 (arXiv); Data paper: 10.1038/s41597-022-01555-4 (DOI)

National Cancer Institute
AIMI Annotations 75N91019D00024

[1] Fedorov A, Longabaugh WJ, Pot D, Clunie DA, Pieper S, Aerts HJ, Homeyer A, Lewis R, Akbarzadeh A, Bontempi D, Clifford W. NCI imaging data commons. Cancer research. 2021 Aug 8;81(16):4188.
[2] Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.

