There is a newer version of the record available.

Published September 14, 2023 | Version v1
Dataset Open

TotalSegmentator segmentations and radiomics features for NCI Imaging Data Commons CT images

  • 1. Brigham and Women's Hospital
  • 2. Harvard Medical School
  • 3. Pixelmed Publishing
  • 4. Brigham and Women's Hospital; Harvard Medical School

Description

This dataset contributes volumetric segmentations of anatomic regions for a subset of the CT images available from NCI Imaging Data Commons [1] (https://imaging.datacommons.cancer.gov/), generated automatically using the TotalSegmentator model v1.5.6 [2]. The initial release includes segmentations for the majority of the CT scans in the National Lung Screening Trial (NLST) collection [3], [4] already available in IDC. Direct link to open this analysis result dataset in IDC (available after release of IDC v18): https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=TotalSegmentator-CT-Segmentations

Specifically, for each CT series analyzed, we include the segmentations generated by TotalSegmentator, converted into DICOM Segmentation objects using dcmqi v1.3.0 [5], together with first-order and shape features for each segmented region, as produced by pyradiomics v3.0.1 [6]. Radiomics features were converted to DICOM Structured Reporting documents following template TID1500 using dcmqi. The TotalSegmentator analysis of the NLST cohort was executed on the Terra platform [7]. The workflow implementation used for the analysis is available at https://github.com/ImagingDataCommons/CloudSegmentator [8].

Due to the large size of the files, they are stored in the cloud buckets maintained by IDC, and the attached files are the manifests that can be used to download the actual files.

The GCP and AWS manifests provided with this dataset record can be used to download the corresponding files from the IDC Google Cloud Storage (GCS) or Amazon S3 (AWS) buckets free of charge following the instructions available in IDC documentation here: https://learn.canceridc.dev/data/downloading-data. Specifically, you will need to install the s5cmd command line tool on your computer (see instructions at https://github.com/peak/s5cmd#installation), and follow the manifest-specific download instructions accompanying the file list below.

If you use the files referenced in the attached manifests, we ask you to cite this dataset and the preprint describing how it was generated [9].

Specific files included in the record are:

  1. totalsegmentator_ct_segmentations_aws.s5cmd.zip: compressed AWS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run totalsegmentator_ct_segmentations_aws.s5cmd)

  2. totalsegmentator_ct_segmentations_gcs.s5cmd.zip: compressed GCS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com run totalsegmentator_ct_segmentations_gcs.s5cmd)

  3. totalsegmentator_ct_segmentations_dcf.csv: Gen3-based manifest (see details in https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids).
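
The two s5cmd invocations above differ only in the endpoint URL and the manifest name, so they can be assembled programmatically. The following is a minimal sketch rather than an official IDC tool: the helper names `build_s5cmd_command` and `download` are hypothetical, and it assumes the .zip archive has already been extracted and that s5cmd is installed and on the PATH. The flags and endpoint URLs are copied from the commands listed above.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path

# Endpoint URLs copied from the download commands in the file list above.
ENDPOINTS = {
    "aws": "https://s3.amazonaws.com",
    "gcs": "https://storage.googleapis.com",
}


def build_s5cmd_command(manifest: str, provider: str) -> list[str]:
    """Return the s5cmd argument list for an extracted manifest.

    Hypothetical helper: `provider` must be "aws" or "gcs".
    """
    if provider not in ENDPOINTS:
        raise ValueError(f"unknown provider: {provider!r}")
    return [
        "s5cmd",
        "--no-sign-request",
        "--endpoint-url",
        ENDPOINTS[provider],
        "run",
        manifest,
    ]


def download(manifest: str, provider: str, dest: str = ".") -> None:
    """Run s5cmd from inside `dest`.

    Assumption: the manifest entries use relative destination paths, so
    the working directory decides where the files end up.
    """
    if shutil.which("s5cmd") is None:
        raise RuntimeError("s5cmd not found on PATH; install it first")
    Path(dest).mkdir(parents=True, exist_ok=True)
    command = build_s5cmd_command(str(Path(manifest).resolve()), provider)
    subprocess.run(command, cwd=dest, check=True)
```

For example, `download("totalsegmentator_ct_segmentations_aws.s5cmd", "aws", dest="idc_data")` would run the first command above from a directory named `idc_data` (a hypothetical name).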

Files

totalsegmentator_ct_segmentations_dcf.csv

Files (67.7 MB)

Checksum Size
md5:efe63194c29da150771f498b5353498b 24.2 MB
md5:c7f1367d087d4dc578f7677320579b9e 17.0 MB
md5:c4499b042f78cd063a81a33fb109efca 26.5 MB
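
The MD5 checksums listed above can be used to confirm that a manifest file was downloaded intact. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and because the listing here does not show which checksum belongs to which file, the sketch only checks membership in the set of listed values.

```python
import hashlib

# MD5 values copied from the file listing above. The listing does not
# state which value belongs to which file, so they are kept as a set.
LISTED_MD5 = {
    "efe63194c29da150771f498b5353498b",
    "c7f1367d087d4dc578f7677320579b9e",
    "c4499b042f78cd063a81a33fb109efca",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 digest is one of the listed values."""
    return md5_of_file(path) in LISTED_MD5
```

A call such as `matches_listed_checksum("totalsegmentator_ct_segmentations_aws.s5cmd.zip")` returning False would suggest a corrupted download, or a file from a different version of this record.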

Additional details

Related works

Is derived from
Dataset: 10.7937/TCIA.HMQ8-J677 (DOI)
Is described by
Other: 10.21203/rs.3.rs-4351526/v1 (DOI)
Is published in
Other: 10.25504/FAIRsharing.0b5a1d (DOI)
Other: https://portal.imaging.datacommons.cancer.gov/ (URL)

References

  • [1] A. Fedorov et al., "National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence," Radiographics, vol. 43, no. 12, Dec. 2023, doi: 10.1148/rg.230180.
  • [2] J. Wasserthal et al., "TotalSegmentator: Robust segmentation of 104 anatomic structures in CT images," Radiol. Artif. Intell., Jul. 2023, doi: 10.1148/ryai.230024.
  • [3] National Lung Screening Trial Research Team et al., "The National Lung Screening Trial: overview and study design," Radiology, vol. 258, no. 1, pp. 243–253, Jan. 2011, doi: 10.1148/radiol.10091808.
  • [4] National Lung Screening Trial Research Team, "Data from the National Lung Screening Trial (NLST) (Version 3) [dataset]." 2013. doi: 10.7937/TCIA.HMQ8-J677.
  • [5] C. Herz et al., "dcmqi: An Open Source Library for Standardized Communication of Quantitative Image Analysis Results Using DICOM," Cancer Res., vol. 77, no. 21, pp. e87–e90, Nov. 2017, doi: 10.1158/0008-5472.CAN-17-0336.
  • [6] J. J. M. van Griethuysen et al., "Computational Radiomics System to Decode the Radiographic Phenotype," Cancer Res., vol. 77, no. 21, pp. e104–e107, Nov. 2017, doi: 10.1158/0008-5472.CAN-17-0339.
  • [7] C. Birger et al., "FireCloud, a scalable cloud-based platform for collaborative genome analysis: Strategies for reducing and controlling costs," bioRxiv, p. 209494, Nov. 03, 2017. doi: 10.1101/209494.
  • [8] V. Thiriveedhi and A. Fedorov, ImagingDataCommons/CloudSegmentator: v1.2.0. Zenodo, 2024. doi: 10.5281/ZENODO.10712897.
  • [9] V. K. Thiriveedhi, D. Krishnaswamy, D. Clunie, S. Pieper, R. Kikinis, and A. Fedorov, "Cloud-based large-scale curation of medical imaging data using AI segmentation," Research Square, 2024, doi: 10.21203/rs.3.rs-4351526/v1.