Published December 17, 2021 | Version v1.0.0
Dataset Open

PatchGastricADC22

  • 1. Medmain Research

Description

Computational histopathology has made significant strides in the past few years and is slowly getting closer to clinical adoption. One area that would benefit is the automatic generation of diagnostic reports from H&E-stained whole slide images, which would further increase the efficiency of pathologists' routine diagnostic workflows. In this study, we compiled a dataset (PatchGastricADC22) of histopathological captions of stomach adenocarcinoma endoscopic biopsy specimens, which we extracted from diagnostic reports and paired with patches extracted from the associated whole slide images. The dataset contains a variety of gastric adenocarcinoma subtypes. We trained a baseline attention-based model to predict the captions from features extracted from the patches and obtained promising results. We make the captioned dataset of 262K patches publicly available.
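As a rough illustration of what "attention-based" can mean when a caption is predicted from many patch features, the sketch below pools a set of patch feature vectors into one context vector using softmax attention weights. It is a generic dot-product attention example, not the baseline model described above; the function names, the dot-product scoring, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool patch features into one context vector (hypothetical helper).

    features: list of N feature vectors, one per patch, each of length D
    query:    vector of length D (e.g. a decoder state); an assumption here
    Returns (context, weights), where context is the weighted mean of the
    features and weights are the per-patch attention weights.
    """
    # Score each patch by its dot product with the query.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return context, weights


# Toy example: three 2-dimensional "patch features" (made-up numbers).
patch_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(patch_features, query=[2.0, 0.0])
```

In a captioning setting, a context vector of this kind would typically be recomputed at each decoding step so that different patches can influence different words.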

Files

captions.csv

Files (7.2 GB)

Checksum Size
md5:a1497c5081fbe43c61fe914d50c3208b 264.5 kB
md5:e02010e8a07aa26b8b006c6fd79628eb 7.2 GB
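
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_md5` are made up for this example, and the local file path in the commented usage line is an assumption about where the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive does not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 digest with an expected value.

    The expected value may carry the 'md5:' prefix used in the file list.
    """
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected.lower()


# Example usage (the path is an assumption about the local download location):
# verify_md5("captions.csv", "md5:a1497c5081fbe43c61fe914d50c3208b")
```

Which checksum belongs to which file should be confirmed against the record itself, since the file names are not shown next to the checksums here.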

Additional details

Related works

Is published in
Preprint: arXiv:2202.03432 (arXiv)