Published August 23, 2023 | Version v1
Dataset Open

DICOM converted whole slide hematoxylin and eosin images of rhabdomyosarcoma from Children's Oncology Group trials

  • 1. PixelMed Publishing
  • 2. National Cancer Institute
  • 3. Frederick National Laboratory for Cancer Research
  • 4. Nationwide Children's Hospital
  • 5. KnowledgeVis, LLC
  • 6. Duke University School of Medicine
  • 7. Seattle Children's Hospital
  • 8. Texas Children's Cancer Center
  • 9. Institute for Systems Biology
  • 10. General Dynamics Information Technology
  • 11. Brigham and Women's Hospital

Description

Rhabdomyosarcoma (RMS) is an aggressive soft-tissue sarcoma that occurs primarily in children and young adults. This dataset contains manifests referring to the hematoxylin and eosin (H&E) stained images in Digital Imaging and Communications in Medicine (DICOM) format available from the National Cancer Institute Imaging Data Commons (IDC) [1] (see also the IDC Portal at https://imaging.datacommons.cancer.gov) as of data release v16. The original images, in vendor-specific format, were collected during IRB-approved clinical trials or tissue banking studies from Children’s Oncology Group (COG) patients enrolled in the ARST0331, ARST0431, D9602, D9803, and D9902 trials, as described in [2]. Those images, augmented with metadata describing their content, were provided to the IDC team for archival and were converted into the DICOM Whole Slide Microscopy (SM) representation [3], [4] using the custom open-source scripts and tools described in [5]. The converted images were released in the IDC RMS-Mutation-Prediction collection with data release v16.

To conveniently explore the data available for this dataset, please use this dashboard: https://lookerstudio.google.com/reporting/7f267400-8774-42e1-b5d1-ca11863c52a9.

Notebooks demonstrating how to use this data are available here: https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/collections_demos/rms_mutation_prediction.

Clinical data accompanying the images are available through a SQL interface in IDC BigQuery tables. For details on accessing IDC clinical data, see the tutorial at https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/clinical_data_intro.ipynb.

The images referred to by the accompanying manifests can be explored and visualized in the IDC Portal at https://portal.imaging.datacommons.cancer.gov/explore/. A direct link to the collection is https://portal.imaging.datacommons.cancer.gov/explore/filters/?collection_id=rms_mutation_prediction.

The GCS and AWS manifests provided with this dataset record can be used to download the corresponding files, free of charge, from the IDC Google Cloud Storage (GCS) or Amazon S3 (AWS) buckets, following the instructions in the IDC documentation at https://learn.canceridc.dev/data/downloading-data. Specifically, you will need to install the s5cmd command line tool on your computer (see instructions at https://github.com/peak/s5cmd#installation) and then follow the manifest-specific download instructions accompanying the file list below.

If you use the files referenced in the attached manifests, please cite this dataset, the publication describing the original dataset [2], and the publication acknowledging IDC [1].

Specific files included in the record are:

  1. rms_mutation_prediction_gcs.s5cmd: GCS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com run rms_mutation_prediction_gcs.s5cmd)

  2. rms_mutation_prediction_aws.s5cmd: AWS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run rms_mutation_prediction_aws.s5cmd)

  3. rms_mutation_prediction_dcf.csv: Gen3-based manifest (see details in https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids).
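
The two download commands above differ only in the manifest name and the endpoint URL, so they can be wrapped in a small script. The following is a minimal Python sketch, not part of the dataset record: the helper names `build_s5cmd_command` and `download` are hypothetical, and it assumes that s5cmd is already installed and on your PATH and that the manifest has been saved to the current working directory. The flags and endpoint URLs are copied verbatim from the commands in the list above.

```python
import shutil
import subprocess
from pathlib import Path

# Endpoint URLs copied from the download commands in the file list above.
ENDPOINTS = {
    "gcs": "https://storage.googleapis.com",
    "aws": "https://s3.amazonaws.com",
}


def build_s5cmd_command(manifest: str, provider: str) -> list[str]:
    """Return the s5cmd argument list for a GCS or AWS manifest (hypothetical helper)."""
    if provider not in ENDPOINTS:
        raise ValueError(f"unknown provider: {provider!r}")
    return [
        "s5cmd",
        "--no-sign-request",
        "--endpoint-url",
        ENDPOINTS[provider],
        "run",
        manifest,
    ]


def download(manifest: str, provider: str) -> None:
    """Run s5cmd on the manifest, failing early if a prerequisite is missing."""
    if shutil.which("s5cmd") is None:
        raise RuntimeError("s5cmd not found on PATH; install it first")
    if not Path(manifest).is_file():
        raise FileNotFoundError(manifest)
    # check=True raises CalledProcessError if s5cmd exits with a non-zero status.
    subprocess.run(build_s5cmd_command(manifest, provider), check=True)
```

For example, calling `download("rms_mutation_prediction_gcs.s5cmd", "gcs")` would run the same command as item 1, and `download("rms_mutation_prediction_aws.s5cmd", "aws")` the same command as item 2. Passing the arguments as a list avoids shell quoting issues.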

References

[1] A. Fedorov et al., "NCI Imaging Data Commons," Cancer Res., vol. 81, no. 16, pp. 4188–4193, Aug. 2021, doi: 10.1158/0008-5472.CAN-21-0950.

[2] D. Milewski et al., "Predicting molecular subtype and survival of rhabdomyosarcoma patients using deep learning of H&E images: A report from the Children's Oncology Group," Clin. Cancer Res., vol. 29, no. 2, pp. 364–378, Jan. 2023, doi: 10.1158/1078-0432.CCR-22-1663.

[3] National Electrical Manufacturers Association (NEMA), "DICOM PS3.3 - Information Object Definitions: A.32.8 VL Whole Slide Microscopy Image IOD." Accessed: Aug. 11, 2023. [Online]. Available: https://dicom.nema.org/medical/dicom/current/output/html/part03.html#sect_A.32.8

[4] M. D. Herrmann et al., "Implementing the DICOM standard for digital pathology," J. Pathol. Inform., vol. 9, no. 1, p. 37, Jan. 2018, doi: 10.4103/jpi.jpi_42_18.

[5] D. Clunie, A. Fedorov, and M. D. Herrmann, ImagingDataCommons/idc-wsi-conversion: Initial release. Zenodo, 2023. doi: 10.5281/zenodo.8240154.

Files

Files (183.2 kB)

Checksum | Size
md5:05606a307cad4ffc4f2aaa0dd9dbd77b | 27.1 kB
md5:57587c9e2ea401a5d6d63b2a715b3d14 | 126.5 kB
md5:743373611654b580033ba8aa4da92d84 | 29.6 kB

Additional details

Related works

Is published in
Other: 10.25504/FAIRsharing.0b5a1d (DOI)
Is supplement to
Journal article: 10.1158/1078-0432.CCR-22-1663 (DOI)
Is supplemented by
Software: 10.5281/zenodo.8240154 (DOI)
Dataset: 10.5281/zenodo.10462857 (DOI)
References
Journal article: 10.1158/0008-5472.CAN-21-0950 (DOI)
