Published October 1, 2024 | Version v3
Image Open

Melanoma Histopathology Dataset with Tissue and Nuclei Annotations

  • 1. University Medical Center Utrecht

Contributors

  • 1. University Medical Center Utrecht

Description

Description:

This dataset is designed for developing deep learning models that segment nuclei and tissue in H&E-stained melanoma histopathology. Existing nuclei segmentation models trained on non-melanoma-specific datasets perform poorly on melanoma because melanocytes can mimic other cell types, whereas existing melanoma-specific models rely on older, suboptimal techniques. Moreover, these models do not provide the tissue annotations needed to determine the localization of tumor-infiltrating lymphocytes, which may hold value for predictive and prognostic tasks. To address this, we created a melanoma-specific dataset with nuclei and tissue annotations.

Methodology:

Sample Collection:

Regions of interest (ROIs) were sampled from H&E-stained slides of 103 primary melanoma specimens and 102 metastatic melanoma specimens, scanned using a Hamamatsu scanner at 40× magnification (0.23 μm per pixel). All slides were obtained from regular diagnostic procedures.
From each specimen, an ROI of 1024×1024 pixels at 40× magnification was selected for annotation. Additionally, a context ROI of 5120×5120 pixels was sampled to show the broader surroundings during the annotation process. Selection was performed by a trained medical expert (M.S.) and subsequently verified by a dermatopathologist (W.B.). Manual ROI selection ensured the inclusion of diverse tissue and nuclei types.
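
The pixel dimensions above can be converted to physical dimensions using the reported resolution of 0.23 μm per pixel. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and are not part of the dataset.

```python
# Pixel size reported for the scans, in micrometers per pixel at 40x.
MICRONS_PER_PIXEL = 0.23


def roi_side_length_um(side_px: int, mpp: float = MICRONS_PER_PIXEL) -> float:
    """Return the physical side length of a square ROI in micrometers."""
    return side_px * mpp


# Annotation ROI (1024 px per side) and context ROI (5120 px per side).
annotation_side_um = roi_side_length_um(1024)
context_side_um = roi_side_length_um(5120)

print(f"annotation ROI side: {annotation_side_um:.2f} um")
print(f"context ROI side: {context_side_um:.2f} um")
```

Under these figures, each annotation ROI covers roughly 0.24 mm per side and each context ROI roughly 1.18 mm per side.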

Annotation Process:

  • Nuclei segmentation
    Initial nuclei segmentations were generated using HoVer-Net pretrained on the PanNuke dataset. These were then adjusted manually by author M.S. using QuPath, with the following nuclei categories: tumor, stroma, vascular endothelium, histiocyte, melanophage, lymphocyte, plasma cell, neutrophil, apoptotic cell, and epithelium. All annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).
  • Tissue segmentation
    Tissue segmentations were created manually using QuPath by M.S., with the following categories: tumor, stroma, epidermis, necrosis, blood vessel, and background. Annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).
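
As an illustration of how such annotations might be inspected, the sketch below counts annotated objects per category in a GeoJSON document. It assumes the layout that QuPath typically uses when exporting objects, where the class label sits under `properties.classification.name`; the exact layout and the class label spellings in this dataset's files are not confirmed here, and the example data is synthetic.

```python
import json
from collections import Counter


def count_objects_by_class(geojson_text: str) -> Counter:
    """Count GeoJSON features per class label.

    Assumes the label is stored at properties.classification.name
    (the usual QuPath export layout); features without a label are
    counted as "unclassified".
    """
    data = json.loads(geojson_text)
    # Accept either a FeatureCollection or a bare list of features.
    features = data["features"] if isinstance(data, dict) else data
    counts = Counter()
    for feature in features:
        properties = feature.get("properties") or {}
        classification = properties.get("classification") or {}
        counts[classification.get("name", "unclassified")] += 1
    return counts


def _square(x: int, y: int) -> dict:
    """Build a small square polygon geometry for the synthetic example."""
    ring = [[x, y], [x + 4, y], [x + 4, y + 4], [x, y + 4], [x, y]]
    return {"type": "Polygon", "coordinates": [ring]}


# Synthetic example; the labels are hypothetical placeholders.
example = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": _square(0, 0),
         "properties": {"classification": {"name": "tumor"}}},
        {"type": "Feature", "geometry": _square(10, 0),
         "properties": {"classification": {"name": "tumor"}}},
        {"type": "Feature", "geometry": _square(20, 0),
         "properties": {"classification": {"name": "lymphocyte"}}},
    ],
})

nuclei_counts = count_objects_by_class(example)
print(dict(nuclei_counts))
```

For a real file, the text passed to the function would be read from one of the extracted annotation files instead of the synthetic string.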

Quality Control:
To assess the reliability of the annotations, intra- and interobserver agreement (by pathologist G.B.) was determined on 12 randomly selected ROIs.

  • Nuclei segmentation
    The intraobserver overall precision was 84.89%, with a recall of 86.45%, and an F1 score of 85.66%. Interobserver overall precision was 80.34%, with a recall of 80.62%, and an F1 score of 80.20%. These results are based on the sum of all true positive, false positive, and false negative counts for the 12 ROIs.
  • Tissue segmentation
    The Dice score was determined on the same 12 randomly selected ROIs. The average intraobserver Dice score was 0.90, and the average interobserver Dice score was also 0.90.
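
The agreement metrics above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it pools true positive, false positive, and false negative counts over ROIs before computing precision, recall, and F1, and it computes a Dice score for a pair of binary masks. How nuclei were matched between observers and how Dice was averaged over tissue categories are not specified in this record, so the counts and masks below are made-up inputs.

```python
def pooled_precision_recall_f1(counts_per_roi):
    """Compute precision, recall and F1 from counts summed over ROIs.

    counts_per_roi is an iterable of (tp, fp, fn) tuples, one per ROI.
    """
    counts = list(counts_per_roi)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return precision, recall, f1


def dice_score(mask_a, mask_b):
    """Dice score 2|A and B| / (|A| + |B|) for two flat binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 2 * intersection / total if total else 1.0


# Made-up (tp, fp, fn) counts for two ROIs.
precision, recall, f1 = pooled_precision_recall_f1([(8, 2, 1), (12, 1, 3)])
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")

# Made-up flattened binary masks from two annotation passes.
dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
print(f"dice={dice:.4f}")
```

Pooling the counts before dividing weights each nucleus equally, so ROIs with many nuclei influence the result more than sparse ones.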

 

Version 3:
Removed sample "training_set_metastatic_roi_103" due to inconsistencies in its annotation file.

Files

01_training_dataset_geojson_nuclei.zip

Files (14.8 GB)

Checksum (MD5)                          Size
md5:161682f2726b15346d9fcc34df8b9192    16.5 MB
md5:bc8bc686c77935380c53aa9b009742be    3.2 MB
md5:514781690f3a2c4c1f03cb4c28dbee68    14.1 GB
md5:6311f46efc043caf78b926f2942c4b1c    643.4 MB
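
A downloaded archive can be checked against its listed `md5:` value before extraction. The sketch below uses Python's standard `hashlib` and reads the file in chunks so that the 14.1 GB archive does not need to fit in memory. The helper names are illustrative, and the demo runs on a throwaway temporary file rather than on a dataset archive.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Demo on a temporary file containing the bytes b"hello".
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    demo_path = handle.name

demo_ok = matches_listed_checksum(
    demo_path, "md5:5d41402abc4b2a76b9719d911017c592"
)
os.remove(demo_path)
print(demo_ok)
```

For the dataset itself, the path of a downloaded archive and the `md5:` value shown for that file would be passed to `matches_listed_checksum`.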