Published April 18, 2025 | Version 1.0.0
Other Open

Sample "LABELMAP" DICOM Segmentation Files

  • 1. Athinoula A. Martinos Center for Biomedical Imaging
  • 2. Brigham and Women's Hospital Department of Radiology

Contributors

Project leader:

Project member:

  • 1. Athinoula A. Martinos Center for Biomedical Imaging
  • 2. Brigham and Women's Hospital Department of Radiology

Description

This dataset contains four sample DICOM Segmentation files that use the new "LABELMAP" SegmentationType, which was added to the DICOM standard in version 2024c. The format is described in DICOM Standard Supplement 243. Briefly, it stores one or more non-overlapping segments in a single pixel array, where the integer value of each pixel identifies the segment to which that pixel belongs. The primary purpose of releasing these files is to provide test objects for other libraries and software tools, including viewers.
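To make the encoding concrete, the following is a minimal NumPy sketch of how a stack of non-overlapping binary masks could be collapsed into a single labelmap array. It is an illustration only, not the highdicom implementation: the function name `binary_to_labelmap` and the toy 3x3 masks are hypothetical, and it assumes that pixel value 0 denotes background and that segments are numbered from 1.

```python
import numpy as np


def binary_to_labelmap(masks: np.ndarray) -> np.ndarray:
    """Collapse binary masks of shape (segments, rows, cols) into a labelmap.

    Pixel value k in the result means "belongs to segment k"; 0 is assumed
    to mean background. This helper is hypothetical and for illustration.
    """
    masks = np.asarray(masks).astype(bool)
    if masks.shape[0] > 255:
        raise ValueError("More than 255 segments do not fit into 8 bits")
    # A labelmap can hold only one value per pixel, so overlaps are an error.
    if (masks.sum(axis=0) > 1).any():
        raise ValueError("Segments overlap; a labelmap cannot represent this")

    labelmap = np.zeros(masks.shape[1:], dtype=np.uint8)
    for segment_number, mask in enumerate(masks, start=1):
        labelmap[mask] = segment_number
    return labelmap


# Two toy segments on a 3x3 frame.
masks = np.zeros((2, 3, 3), dtype=bool)
masks[0, 0, :2] = True  # segment 1: first two pixels of the top row
masks[1, 2, 1:] = True  # segment 2: last two pixels of the bottom row

labelmap = binary_to_labelmap(masks)
print(labelmap)
```

A BINARY segmentation needs one frame per segment for each slice, whereas the labelmap above needs a single frame regardless of the number of segments.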

The provided segmentations are "re-encodings" of an existing BINARY segmentation that is publicly available in the Imaging Data Commons (IDC) repository. That segmentation contains 74 segments of various abdominal organs in an abdominal CT. The CT image series that was segmented to produce it is also available in the IDC as case "100002" from the NLST collection. The identifiers for this series are:

StudyInstanceUID: 1.2.840.113654.2.55.68425808326883186792123057288612355322
SeriesInstanceUID: 1.2.840.113654.2.55.229650531101716203536241646069123704792

Both the BINARY segmentation and its source images are downloaded from the IDC by the Python code snippet below. Alternatively, you can use the idc-index tool to download, for example, the source image series:

idc download 1.2.840.113654.2.55.229650531101716203536241646069123704792

You can view this series together with the existing BINARY segmentation in the IDC OHIF instance. Please consult the IDC project documentation for further ways to access these files.

Four variants are provided using two different values for Photometric Interpretation ("MONOCHROME2" and "PALETTE COLOR") and two different transfer syntaxes (ExplicitVRLittleEndian ("native") and JPEG2000Lossless):

  • ct_labelmap_monochrome_native.dcm
    • StudyInstanceUID: 1.2.840.113654.2.55.68425808326883186792123057288612355322
    • SeriesInstanceUID: 1.2.826.0.1.3680043.10.511.3.39163641237923761484492278922466793
    • SOPInstanceUID: 1.2.826.0.1.3680043.10.511.3.54215185833283589552918444094454937
    • Size: 32MB
  • ct_labelmap_monochrome_jpeg2000.dcm
    • StudyInstanceUID: 1.2.840.113654.2.55.68425808326883186792123057288612355322
    • SeriesInstanceUID: 1.2.826.0.1.3680043.10.511.3.78021852011276185743765835838835269
    • SOPInstanceUID: 1.2.826.0.1.3680043.10.511.3.54215185833283589552918444094454937
    • Size: 1.7MB
  • ct_labelmap_palette_color_native.dcm
    • StudyInstanceUID: 1.2.840.113654.2.55.68425808326883186792123057288612355322
    • SeriesInstanceUID: 1.2.826.0.1.3680043.10.511.3.1928618149102457371215393385865002
    • SOPInstanceUID: 1.2.826.0.1.3680043.10.511.3.49665993503894378485170565809581412
    • Size: 32MB
  • ct_labelmap_palette_color_jpeg2000.dcm
    • StudyInstanceUID: 1.2.840.113654.2.55.68425808326883186792123057288612355322
    • SeriesInstanceUID: 1.2.826.0.1.3680043.10.511.3.60058092962333326814689785333007126
    • SOPInstanceUID: 1.2.826.0.1.3680043.10.511.3.35953210903734988696114857448634090
    • Size: 1.8MB
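The practical difference between the two Photometric Interpretation values can be sketched with plain NumPy. This is a simplified illustration, not a DICOM decoder: the four-entry lookup table and the 2x2 frame are made-up values, and the sketch assumes the stored pixel values are identical in both variants, so that only the display mapping differs.

```python
import numpy as np

# Hypothetical lookup table: row k holds the RGB color assigned to label k.
lut = np.array(
    [
        [0, 0, 0],    # 0: background, black
        [255, 0, 0],  # 1: red
        [0, 255, 0],  # 2: green
        [0, 0, 255],  # 3: blue
    ],
    dtype=np.uint8,
)

# A tiny labelmap frame whose stored values are segment numbers.
frame = np.array([[0, 1], [2, 3]], dtype=np.uint8)

# "PALETTE COLOR": each stored value is an index into the lookup table,
# so a viewer can show every segment in its own color.
rgb = lut[frame]  # shape (rows, cols, 3)

# "MONOCHROME2": the stored values are shown directly as grey levels,
# with the lowest value displayed as black.
grey = frame

print(rgb.shape)
print(rgb[1, 0])
```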

Anyone can produce these example files (although the UIDs will change) using the highdicom Python library (version 0.25.0 or higher) and the following Python code:

from copy import deepcopy
from matplotlib import colormaps
import numpy as np
from pathlib import Path
import highdicom as hd
from pydicom.sr.codedict import codes

from idc_index import IDCClient
from pydicom.uid import JPEG2000Lossless, ExplicitVRLittleEndian


# Need this guard because of possibility of threading (when using workers)
if __name__ == '__main__':

    # The series instance UID of an existing IDC segmentation (created using
    # TotalSegmentator on the NLST collection). We will download this and
    # re-encode it in a new LABELMAP segmentation
    segmentation_series_uid = "1.2.276.0.7230010.3.1.3.313263360.31993.1706319455.429793"

    # Temporary download directory for downloads and new files
    download_dir = Path('total_segmentator_downloads')
    download_dir.mkdir(exist_ok=True)

    # Download the original (binary) segmentation and load it
    client = IDCClient()
    client.download_dicom_series(
        segmentation_series_uid,
        downloadDir=download_dir,
        dirTemplate='%SeriesInstanceUID'
    )

    seg_file = list((download_dir / segmentation_series_uid).glob('*.dcm'))[0]
    seg = hd.seg.segread(seg_file)

    # Download the files of the segmented CT series and load them
    segmented_series_uid = seg.get_source_image_uids()[0][1]
    client.download_dicom_series(
        segmented_series_uid,
        downloadDir=download_dir,
        dirTemplate='%SeriesInstanceUID'
    )
    ct_files = list((download_dir / segmented_series_uid).glob('*.dcm'))
    images = [hd.imread(p) for p in ct_files]

    # Sort the image datasets so that they match the eventual segmentation
    # volume
    images = hd.spatial.sort_datasets(images)

    # Get the original segmentation pixel array as a volume, and combine into
    # "labelmap" form at this point
    seg_vol = seg.get_volume(combine_segments=True)

    # The original segmentation is rotated relative to the source images, for
    # whatever reason. So we either have to a) pass the segmentation pixel
    # array to the Segmentation constructor as a Volume object, or b) do the
    # spatial operations necessary to align the segmentation pixel array to the
    # source images. Both will make the spatial metadata correct in the new
    # segmentation, but if we do option a), highdicom will not establish
    # per-frame correspondences in the PerFrameFunctionalGroupsSequence. So we
    # opt for option b) using the match_geometry method
    im_vol = hd.get_volume_from_series(images)
    seg_vol = seg_vol.match_geometry(im_vol)

    # Describe the algorithm that created the segmentation (though this is not
    # strictly required in the standard and not present in the original
    # segmentation, highdicom requires it)
    algorithm_identification = hd.AlgorithmIdentificationSequence(
        name='TotalSegmentator',
        version='v1.5.6',
        family=codes.cid7162.ArtificialIntelligence
    )
    orig_segment_descriptions = [
        seg.get_segment_description(i) for i in seg.segment_numbers
    ]
    for desc in orig_segment_descriptions:
        desc.SegmentationAlgorithmIdentificationSequence = algorithm_identification

    # Create an 8-bit palette color LUT using matplotlib's built-in
    # 'gist_rainbow_r' colormap
    cmap = colormaps['gist_rainbow_r']
    num_entries = seg.number_of_segments  # e.g. number of classes in a segmentation
    lut_data = cmap(np.arange(num_entries) / (num_entries + 1), bytes=True)
    lut_data = np.vstack([np.zeros((1, 4), dtype=lut_data.dtype), lut_data])
    lut = hd.PaletteColorLUTTransformation.from_combined_lut(
        lut_data=lut_data[:, :3],  # remove alpha channel
        palette_color_lut_uid=hd.UID(),
    )

    i = 1
    for compress in [False, True]:
        for palette_color in [False, True]:

            if palette_color:
                # If using palette color, the recommended CIELab display
                # values are disallowed, so remove them from a copy
                segment_descriptions = deepcopy(orig_segment_descriptions)
                for desc in segment_descriptions:
                    del desc.RecommendedDisplayCIELabValue
            else:
                segment_descriptions = orig_segment_descriptions

            transfer_syntax_uid = (
                JPEG2000Lossless if compress
                else ExplicitVRLittleEndian
            )

            pi_str = 'Palette Color' if palette_color else 'Monochrome'
            syntax_str = 'JPEG2000' if compress else 'Native'

            series_description = (
                f"TotalSegmentator(v.1.5.6) Labelmap Seg, {pi_str} {syntax_str}"
            )

            labelmap_seg = hd.seg.Segmentation(
                source_images=images,
                # Pass the plain array rather than the Volume, otherwise
                # per-frame spatial correspondences are not preserved
                pixel_array=seg_vol.array,
                segment_descriptions=segment_descriptions,
                segmentation_type="LABELMAP",
                manufacturer=seg.Manufacturer,
                manufacturer_model_name=seg.ManufacturerModelName,
                software_versions=seg.SoftwareVersions,
                device_serial_number=seg.DeviceSerialNumber,
                series_instance_uid=hd.UID(),
                sop_instance_uid=hd.UID(),
                series_number=seg.SeriesNumber + i,
                instance_number=1,
                workers=12 if compress else 0,
                palette_color_lut_transformation=lut if palette_color else None,
                transfer_syntax_uid=transfer_syntax_uid,
                series_description=series_description,
            )

            fname = (
                download_dir / f'ct_labelmap_{pi_str}_{syntax_str}.dcm'.lower().replace(' ', '_')
            )
            labelmap_seg.save_as(fname)
            i += 1

This data was encoded by the NCI Imaging Data Commons project, which has been funded in whole or in part with Federal funds from the NCI, NIH, under task order no. HHSN26110071 under contract no. HHSN261201500003l.

Files

Files (70.0 MB)

Checksum Size
md5:d92237c74b343722a195952f74694cf3 1.8 MB
md5:07fd0d8b6b791a3dadb0c36eea95e7fd 33.2 MB
md5:496b6f2819b90d9977cff7586a544dff 1.8 MB
md5:4542e7ee611e5c3b35977f8e7773c234 33.2 MB
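After downloading, a file can be checked against one of the MD5 checksums listed for this record. The following is a small sketch using only the Python standard library; the helper name `md5_matches` is hypothetical, and because the listing does not show which checksum belongs to which file name, the expected value has to be taken from the record page for the file in question.

```python
import hashlib


def md5_matches(path, expected, chunk_size=1 << 20):
    """Return True if the file at `path` has the MD5 digest `expected`.

    `expected` may be given with or without the leading "md5:" prefix.
    The file is read in chunks so that large files need little memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower().removeprefix("md5:")
```

For example, `md5_matches(downloaded_path, "md5:<checksum from the record>")` returns `False` if the download was truncated or corrupted.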

Additional details

Related works

Is derived from
Journal: 10.1007/s10278-022-00683-y (DOI)
Journal: 10.1148/rg.230180 (DOI)

Funding

National Institutes of Health
National Cancer Institute Imaging Data Commons Task order no. HHSN26110071 under contract no. HHSN261201500003l

Dates

Available
2025-04-18