Expert annotations of the tissue types for the RMS-Mutation-Prediction microscopy images
Creators
Description
This dataset contains annotations of multiple regions of interest performed by an expert pathologist with eight years of experience for a subset of hematoxylin and eosin (H&E) stained images from the RMS-Mutation-Prediction image collection [1,2]. Annotations were generated manually, using the Aperio ImageScope tool, to delineate regions of alveolar rhabdomyosarcoma (ARMS), embryonal rhabdomyosarcoma (ERMS), stroma, and necrosis [3]. The resulting planar contour annotations were originally stored in ImageScope-specific XML format, and subsequently converted into Digital Imaging and Communications in Medicine (DICOM) Structured Report (SR) representation using the open source conversion tool [4].
Many pixels from the whole slide images annotated by this dataset are not contained inside any annotation contours and are considered to belong to the background class. Other pixels are contained inside only one annotation contour and are assigned to a single class. However, cases also exist in this dataset where annotation contours overlap. In these cases, the pixels contained in multiple contours could be assigned membership in multiple classes. One example is a necrotic tissue contour overlapping an internal subregion of an area designated by a larger ARMS or ERMS annotation. The ordering of annotations in this DICOM dataset preserves the order in the original XML generated using ImageScope. These annotations were converted, in sequence, into segmentation masks and used in the training of several machine learning models. Details on the training methods and model results are presented in [1]. In the case of overlapping contours, the order in which annotations are processed may affect the generated segmentation mask if prior contours are overwritten by later contours in the sequence. It is up to the application consuming this data to decide how to interpret tissues regions annotated with multiple classes. The annotations included in this dataset are available for visualization and exploration from the National Cancer Institute Imaging Data Commons (IDC) [5] (also see IDC Portal at https://imaging.datacommons.cancer.gov) as of data release v18. Direct link to open the collection in IDC Portal: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations.
The GCP and AWS manifests provided with this dataset record can be used to download the corresponding files from the IDC Google Cloud Storage (GCS) or Amazon S3 (AWS) buckets free of charge following the instructions available in IDC documentation here: https://learn.canceridc.dev/data/downloading-data.
If you use the files referenced in the attached manifests, we ask you to cite this dataset, as well as the publication describing the original dataset [2] and publication acknowledging IDC [5].
Specific files included in the record are:
-
rms_mutation_prediction_expert_annotations.zip: DICOM SR instances containing the expert annotations encoded as closed planar contours
-
rms_mutation_prediction_expert_annotations_gcs.s5cmd: GCS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com run rms_mutation_prediction_gcs.s5cmd)
-
rms_mutation_prediction_expert_annotations_aws.s5cmd AWS-based manifest (to download the files described in the manifest, execute this command: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run rms_mutation_prediction_aws.s5cmd)
- rms_mutation_prediction_expert_annotations_dcf.csv: Gen3-based manifest (see details in https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids).
Files
rms_mutation_prediction_expert_annotations_dcf.csv
Files
(18.2 kB)
Name | Size | Download all |
---|---|---|
md5:d1cbac7207e8c217972c0553639f300a
|
6.4 kB | Download |
md5:21042d955daa85596dd031caaed2f6f3
|
4.7 kB | Preview Download |
md5:f0b08790c3986614320973c0996a264b
|
7.0 kB | Download |
Additional details
Related works
- Is derived from
- Dataset: 10.5281/zenodo.8225132 (DOI)
- Is published in
- Other: 10.25504/FAIRsharing.0b5a1d (DOI)
- Is supplemented by
- Software: 10.5281/zenodo.8240154 (DOI)
- References
- Publication: 10.1158/0008-5472.CAN-21-0950 (DOI)
- Publication: 10.1148/rg.230180 (DOI)
References
- [1] D. Milewski et al., "Predicting molecular subtype and survival of rhabdomyosarcoma patients using deep learning of H&E images: A report from the Children's Oncology Group," Clin. Cancer Res., vol. 29, no. 2, pp. 364–378, Jan. 2023, doi: 10.1158/1078-0432.CCR-22-1663.
- [2] Clunie, D., Khan, J., Milewski, D., Jung, H., Bowen, J., Lisle, C., Brown, T., Liu, Y., Collins, J., Linardic, C. M., Hawkins, D. S., Venkatramani, R., Clifford, W., Pot, D., Wagner, U., Farahani, K., Kim, E., & Fedorov, A. (2023). DICOM converted whole slide hematoxylin and eosin images of rhabdomyosarcoma from Children's Oncology Group trials [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8225132
- [3] Agaram NP. Evolving classification of rhabdomyosarcoma. Histopathology. 2022 Jan;80(1):98-108. doi: 10.1111/his.14449. PMID: 34958505; PMCID: PMC9425116,https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9425116/
- [4] Chris Bridge. (2024). ImagingDataCommons/idc-sm-annotations-conversion: v1.0.0 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.10632182
- [5] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).