Published October 1, 2024 | Version v2
Image Open

Melanoma Histopathology Dataset with Tissue and Nuclei Annotations

  • 1. University Medical Center Utrecht

Contributors

  • 1. University Medical Center Utrecht

Description

This dataset is designed for the development of deep learning models that segment nuclei and tissue in H&E-stained melanoma histopathology images. Existing nuclei segmentation models trained on non-melanoma datasets perform poorly because melanocytes can mimic other cell types, whereas existing melanoma-specific models rely on older, sub-optimal techniques. Moreover, these models do not provide the tissue annotations needed to determine the localization of tumor-infiltrating lymphocytes, which may hold value for predictive and prognostic tasks. To address this, we created a melanoma-specific dataset with nuclei and tissue annotations.

Methodology:

Sample Collection:

Regions of interest (ROIs) were sampled from H&E-stained slides of 103 primary melanoma specimens and 103 metastatic melanoma specimens, scanned with a Hamamatsu scanner at 40× magnification (0.23 μm per pixel). All slides were obtained from regular diagnostic procedures.
From each specimen, one ROI of 1024×1024 pixels at 40× magnification was selected for annotation. In addition, a context ROI of 5120×5120 pixels was sampled to show the surrounding tissue during annotation. Selection was performed by a trained medical expert (M.S.) and subsequently verified by a dermatopathologist (W.B.). Manual ROI selection ensured the inclusion of diverse tissue and nuclei types.
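
As a quick sanity check on the ROI geometry described above, the following minimal sketch converts the stated pixel sizes to physical sizes. The pixel size and ROI dimensions come from the description; the helper names are illustrative, and the idea that the annotated ROI sits at the centre of its context ROI is an assumption that the description does not confirm.

```python
# Values stated in the dataset description.
MICRONS_PER_PIXEL = 0.23   # scan resolution at 40x
ROI_SIDE_PX = 1024         # annotated ROI side length
CONTEXT_SIDE_PX = 5120     # context ROI side length


def side_in_microns(side_px, mpp=MICRONS_PER_PIXEL):
    """Physical side length of a square region, in micrometres."""
    return side_px * mpp


def centered_offset(outer_px, inner_px):
    """Top-left offset of an inner square centred in an outer square.

    ASSUMPTION: the annotated ROI is centred in its context ROI.
    The description does not state this; verify against the files.
    """
    return (outer_px - inner_px) // 2


roi_um = side_in_microns(ROI_SIDE_PX)
context_um = side_in_microns(CONTEXT_SIDE_PX)
offset_px = centered_offset(CONTEXT_SIDE_PX, ROI_SIDE_PX)

print(f"ROI side: {roi_um:.2f} um, context side: {context_um:.2f} um")
print(f"Offset under the centring assumption: {offset_px} px")
```

Under these numbers the annotated ROI covers roughly 0.24 mm per side and the context ROI is five times wider.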

Annotation Process:

  • Nuclei segmentation
    Initial nuclei segmentations were generated with HoVer-Net pretrained on the PanNuke dataset. They were then adjusted manually by author M.S. in QuPath, using the following nuclei categories: tumor, stroma, vascular endothelium, histiocyte, melanophage, lymphocyte, plasma cell, neutrophil, apoptotic cell, and epithelium. All annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).
  • Tissue segmentation
    Tissue segmentations were created manually using QuPath by M.S., with the following categories: tumor, stroma, epidermis, necrosis, blood vessel, and background. Annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).
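
The nuclei annotations appear to be distributed as GeoJSON (judging by the file name 01_training_dataset_geojson_nuclei.zip). The sketch below counts annotated objects per class. It assumes a QuPath-style export in which each feature stores its class under properties.classification.name; that layout, the class label strings, and the sample data are assumptions for illustration and should be checked against the actual files.

```python
import json
from collections import Counter


def count_classes(geojson_text):
    """Count annotated objects per class name in a GeoJSON document."""
    data = json.loads(geojson_text)
    # Accept either a FeatureCollection or a bare list of features.
    features = data["features"] if isinstance(data, dict) else data
    counts = Counter()
    for feature in features:
        props = feature.get("properties") or {}
        # ASSUMED layout: properties.classification.name (QuPath-style).
        classification = props.get("classification") or {}
        counts[classification.get("name", "unclassified")] += 1
    return counts


def square(x, y, size=10):
    """Closed polygon ring for a small square (hypothetical geometry)."""
    return [[[x, y], [x + size, y], [x + size, y + size], [x, y + size], [x, y]]]


def feature(name, x, y):
    """Build one hypothetical annotation feature."""
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": square(x, y)},
        "properties": {"classification": {"name": name}},
    }


# Tiny made-up example, not taken from the dataset.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        feature("tumor", 0, 0),
        feature("tumor", 20, 0),
        feature("lymphocyte", 40, 0),
    ],
})

class_counts = count_classes(sample)
print(class_counts.most_common())
```

For a real file, the text passed to count_classes would be read from one of the extracted GeoJSON files instead of the made-up sample.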

Quality Control:
To assess the reliability of the annotations, intra- and interobserver agreement (by pathologist G.B.) was determined on 12 randomly selected ROIs.

  • Nuclei segmentation
    Intraobserver overall precision was 84.89%, with a recall of 86.45% and an F1 score of 85.66%. Interobserver overall precision was 80.34%, with a recall of 80.62% and an F1 score of 80.20%. These values were computed from the true positive, false positive, and false negative counts summed over the 12 ROIs.
  • Tissue segmentation
    The Dice score was determined on the same 12 randomly selected ROIs. The average intraobserver Dice score was 0.90, and the average interobserver Dice score was also 0.90.
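
The two kinds of agreement measure used above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the counts and masks are made up, the rule for matching nuclei between annotators (which decides what counts as a true positive) is not specified in the description, and how Dice was averaged over classes and ROIs is likewise an assumption left open here.

```python
def pooled_scores(per_roi_counts):
    """Precision, recall and F1 from counts summed over all ROIs.

    per_roi_counts: list of (true_pos, false_pos, false_neg) tuples,
    one tuple per ROI.
    """
    tp = fp = fn = 0
    for roi_tp, roi_fp, roi_fn in per_roi_counts:
        tp += roi_tp
        fp += roi_fp
        fn += roi_fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def dice_for_label(mask_a, mask_b, label):
    """Dice score 2|A n B| / (|A| + |B|) for one label.

    mask_a, mask_b: equally long flat sequences of per-pixel labels.
    """
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a == label and b == label)
    size_a = sum(1 for a in mask_a if a == label)
    size_b = sum(1 for b in mask_b if b == label)
    if size_a + size_b == 0:
        # Convention chosen here: label absent from both masks counts as agreement.
        return 1.0
    return 2 * overlap / (size_a + size_b)


# Hypothetical counts for two ROIs (not from the dataset).
example_counts = [(90, 10, 12), (80, 20, 18)]
precision, recall, f1 = pooled_scores(example_counts)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")

# Hypothetical four-pixel tissue masks from two annotation rounds.
first_round = ["tumor", "tumor", "stroma", "stroma"]
second_round = ["tumor", "stroma", "stroma", "stroma"]
tumor_dice = dice_for_label(first_round, second_round, "tumor")
stroma_dice = dice_for_label(first_round, second_round, "stroma")
print(f"tumor Dice={tumor_dice:.3f}, stroma Dice={stroma_dice:.3f}")
```

Summing counts before computing the ratios weights each nucleus equally, so ROIs with many nuclei contribute more than sparse ones.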

 

Version 2:
Changed the image file format to .tiff, with magnification metadata embedded in the files.
Added 5 ROIs containing the necrosis and epidermis tissue categories.

Files

01_training_dataset_geojson_nuclei.zip

Files (14.9 GB)

Checksum                               Size
md5:595d8ec2f8bb8e09e1e9fc1df2c4c09e   16.0 MB
md5:152fd0689d3941ea7ed309019206a083   3.1 MB
md5:98494b72d34d99f79dd055a4c4a05e58   14.2 GB
md5:0310c5a303b986eb4ff2dba4ddbfb70e   647.2 MB
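
A downloaded archive can be checked against its listed MD5 checksum before use. The sketch below uses only the Python standard library; the function names and the file path in the usage comment are hypothetical, and the pairing of a checksum with a particular file has to be taken from the record's file listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical local path; checksum copied from the listing):
# ok = matches_listing("downloads/some_archive.zip",
#                      "md5:595d8ec2f8bb8e09e1e9fc1df2c4c09e")
```

Reading in chunks matters here because the largest archive is 14.2 GB and should not be loaded into memory at once.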