TotalSegmentator segmentations and radiomics features for NCI Imaging Data Commons CT images
Description
This dataset contributes volumetric segmentations of the anatomic regions in a subset of CT images available from NCI Imaging Data Commons [1] (https://imaging.datacommons.cancer.gov/), automatically generated using the TotalSegmentator model v1.5.6 [2]. The initial release includes segmentations for the majority of the CT scans included in the National Lung Screening Trial (NLST) collection [3], [4] already available in IDC. Direct link to open this analysis result dataset in IDC (available after release of IDC v18): https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=TotalSegmentator-CT-Segmentations.
Specifically, for each of the CT series analyzed, we include the segmentations generated by TotalSegmentator, converted into DICOM Segmentation object format using dcmqi v1.3.0 [5], and the first order and shape features for each of the segmented regions, as produced by pyradiomics v3.0.1 [6]. Radiomics features were converted to DICOM Structured Reporting documents following template TID1500 using dcmqi. The TotalSegmentator analysis of the NLST cohort was executed on the Terra platform [7]. The implementation of the workflow used to perform the analysis is available at https://github.com/ImagingDataCommons/CloudSegmentator [8].
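To give a sense of what "first order and shape features for each of the segmented regions" means, the sketch below computes one shape feature (voxel volume) and one first order feature (mean intensity) per region. It is a toy illustration and not the pyradiomics implementation: the function name `per_label_features`, the nested-list inputs, and the convention that each region is an integer label with 0 as background are all assumptions made for this example. pyradiomics exposes comparable quantities under the names VoxelVolume and Mean.

```python
from collections import defaultdict


def per_label_features(image, labels, spacing_mm):
    """Compute voxel volume (mm^3) and mean intensity for each nonzero label.

    image and labels are nested lists indexed [slice][row][column];
    spacing_mm is the (slice, row, column) voxel spacing in millimetres.
    This is an illustrative helper, not part of pyradiomics.
    """
    voxel_volume = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    counts = defaultdict(int)
    sums = defaultdict(float)
    for image_slice, label_slice in zip(image, labels):
        for image_row, label_row in zip(image_slice, label_slice):
            for value, label in zip(image_row, label_row):
                if label != 0:  # assumption: 0 marks background
                    counts[label] += 1
                    sums[label] += value
    return {
        label: {
            "voxel_volume_mm3": counts[label] * voxel_volume,
            "mean": sums[label] / counts[label],
        }
        for label in counts
    }
```

In the dataset itself these values are already computed and stored in the DICOM Structured Reporting documents, so a helper like this is only useful for understanding or spot-checking them.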
Due to the large size of the files, they are stored in the cloud buckets maintained by IDC, and the attached files are the manifests that can be used to download the actual files.
The GCP and AWS manifests provided with this dataset record can be used to download the corresponding files from the IDC Google Cloud Storage (GCS) or Amazon S3 (AWS) buckets free of charge following the instructions available in IDC documentation here: https://learn.canceridc.dev/data/downloading-data. Specifically, you will need to install the s5cmd command line tool on your computer (see instructions at https://github.com/peak/s5cmd#installation), and follow the manifest-specific download instructions accompanying the file list below.
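The download procedure can also be scripted. The following is a minimal sketch, assuming that s5cmd is already installed and on the PATH and that each zipped manifest contains a single `.s5cmd` file. The helper names (`extract_manifest`, `build_s5cmd_command`, `download`) are hypothetical; the s5cmd flags and endpoint URLs are the ones given in this record.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path

# Endpoints quoted in this record's download instructions.
ENDPOINTS = {
    "aws": "https://s3.amazonaws.com",
    "gcs": "https://storage.googleapis.com",
}


def extract_manifest(zip_path, out_dir):
    """Unzip a manifest archive and return paths of the extracted .s5cmd files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        names = archive.namelist()
    return [out_dir / name for name in names if name.endswith(".s5cmd")]


def build_s5cmd_command(manifest_path, cloud):
    """Assemble the s5cmd invocation given in the record for the chosen cloud."""
    return [
        "s5cmd",
        "--no-sign-request",
        "--endpoint-url",
        ENDPOINTS[cloud],
        "run",
        str(manifest_path),
    ]


def download(zip_path, cloud, work_dir="."):
    """Extract the manifest and run s5cmd on it (hypothetical helper)."""
    if shutil.which("s5cmd") is None:
        raise RuntimeError("s5cmd is not on PATH; install it first")
    for manifest in extract_manifest(zip_path, work_dir):
        command = build_s5cmd_command(manifest.resolve(), cloud)
        # s5cmd executes each line of the manifest; run it from work_dir.
        subprocess.run(command, cwd=work_dir, check=True)
```

For example, `download("totalsegmentator_ct_segmentations_aws.s5cmd.zip", "aws", "idc_download")` would unpack the AWS manifest into a directory named `idc_download` (a name chosen here for illustration) and start the transfer from there.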
If you use the files referenced in the attached manifests, we ask you to cite this dataset and the preprint describing how it was generated [9].
Specific files included in the record are:
- totalsegmentator_ct_segmentations_aws.s5cmd.zip: compressed AWS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run totalsegmentator_ct_segmentations_aws.s5cmd)
- totalsegmentator_ct_segmentations_gcs.s5cmd.zip: GCS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com run totalsegmentator_ct_segmentations_gcs.s5cmd)
- totalsegmentator_ct_segmentations_dcf.csv: Gen3-based manifest (see details in https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids).
Files (67.7 MB)
Checksum | Size |
---|---|
md5:efe63194c29da150771f498b5353498b | 24.2 MB |
md5:c7f1367d087d4dc578f7677320579b9e | 17.0 MB |
md5:c4499b042f78cd063a81a33fb109efca | 26.5 MB |
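The MD5 checksums listed with the files can be used to confirm that a manifest was downloaded intact. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical and chosen for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' in the listing."""
    algorithm, _, expected = listed.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported checksum type: {algorithm!r}")
    return md5_of_file(path) == expected.strip().lower()
```

To use it, pass the path of a downloaded manifest together with the `md5:` value listed for that file; a result of `False` suggests the download is incomplete or corrupted and should be repeated.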
Additional details
Related works
- Is derived from
- Dataset: 10.7937/TCIA.HMQ8-J677 (DOI)
- Is described by
- Other: 10.21203/rs.3.rs-4351526/v1 (DOI)
- Is published in
- Other: 10.25504/FAIRsharing.0b5a1d (DOI)
- Other: https://portal.imaging.datacommons.cancer.gov/ (URL)
References
- [1] A. Fedorov et al., "National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence," Radiographics, vol. 43, no. 12, Dec. 2023, doi: 10.1148/rg.230180.
- [2] J. Wasserthal et al., "TotalSegmentator: Robust segmentation of 104 anatomic structures in CT images," Radiol. Artif. Intell., Jul. 2023, doi: 10.1148/ryai.230024.
- [3] National Lung Screening Trial Research Team et al., "The National Lung Screening Trial: overview and study design," Radiology, vol. 258, no. 1, pp. 243–253, Jan. 2011, doi: 10.1148/radiol.10091808.
- [4] National Lung Screening Trial Research Team, "Data from the National Lung Screening Trial (NLST) (Version 3) [dataset]." 2013. doi: 10.7937/TCIA.HMQ8-J677.
- [5] C. Herz et al., "dcmqi: An Open Source Library for Standardized Communication of Quantitative Image Analysis Results Using DICOM," Cancer Res., vol. 77, no. 21, pp. e87–e90, Nov. 2017, doi: 10.1158/0008-5472.CAN-17-0336.
- [6] J. J. M. van Griethuysen et al., "Computational Radiomics System to Decode the Radiographic Phenotype," Cancer Res., vol. 77, no. 21, pp. e104–e107, Nov. 2017, doi: 10.1158/0008-5472.CAN-17-0339.
- [7] C. Birger et al., "FireCloud, a scalable cloud-based platform for collaborative genome analysis: Strategies for reducing and controlling costs," bioRxiv, p. 209494, Nov. 03, 2017. doi: 10.1101/209494.
- [8] V. Thiriveedhi and A. Fedorov, ImagingDataCommons/CloudSegmentator: v1.2.0. Zenodo, 2024. doi: 10.5281/ZENODO.10712897.
- [9] V. K. Thiriveedhi, D. Krishnaswamy, D. Clunie, S. Pieper, R. Kikinis, and A. Fedorov, "Cloud-based large-scale curation of medical imaging data using AI segmentation," Research Square, 2024, doi: 10.21203/rs.3.rs-4351526/v1.