Published February 27, 2026 | Version v1
Dataset Open

HANCOCK CD3/CD8 Tutorial Dataset for GLEAM Multimodal Learner (3GB JPEG Derivative)

  • 1. Moffitt Cancer Center

Description

This Zenodo record provides a tutorial-focused derivative of the original HANCOCK TMA CD3/CD8 dataset, created to demonstrate the GLEAM Multimodal Learner workflow with a lightweight example. All image files in the original archive were converted from PNG to JPEG (lossy compression, quality 67) at their original dimensions, yielding a compact archive of about 2.89 GB (CD3_CD8_images_3GB_jpeg_q67.zip, 3037 images). Because the file extensions changed from .png to .jpg, the accompanying train/test metadata files were updated to match the new paths (HANCOCK_train_split_3GB_jpeg.csv and HANCOCK_test_split_3GB_jpeg.csv; 653 matched records, ~80/20 split).
This modified dataset is intended only for tutorial and tool demonstration purposes and is not a replacement for the original HANCOCK dataset for research-grade analysis or final benchmarking.
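
The metadata update described above (pointing each record at a .jpg file instead of a .png file) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script used to build this record: the column name image_path and the sample rows are hypothetical, and the real split files may use a different layout.

```python
import csv
import io


def png_to_jpg_path(path: str) -> str:
    """Replace a trailing .png (any letter case) with .jpg; leave other paths alone."""
    if path.lower().endswith(".png"):
        return path[: -len(".png")] + ".jpg"
    return path


def rewrite_split(src, dst, path_column="image_path"):
    """Copy CSV rows from src to dst, rewriting the extension in one column.

    path_column is a hypothetical column name chosen for this sketch.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[path_column] = png_to_jpg_path(row[path_column])
        writer.writerow(row)


# Made-up sample rows, used only to exercise the functions above.
sample = io.StringIO("patient_id,image_path\n1,TMA/core_01.png\n2,TMA/core_02.PNG\n")
updated = io.StringIO()
rewrite_split(sample, updated)
print(updated.getvalue())
```

Matching on the suffix only, rather than replacing every occurrence of "png" in the string, avoids touching directory or file names that happen to contain those letters.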

Files

CD3_CD8_images_3GB_jpeg.zip

Files (2.9 GB)

MD5 checksum                            Size
md5:971bcd42f8e469cafacd95b6cc2d6b64    2.9 GB
md5:885c79e92ed07078883372cbd7458072    58.3 kB
md5:9b10b2f36833f6ff9280fe6fd831e76e    193.8 kB
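
After downloading, a file can be checked against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library; the helper names md5_of_file and matches_checksum are illustrative and are not part of this record or of GLEAM.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a value such as 'md5:971b...' or a bare hex string."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum(local_zip_path, "md5:971bcd42f8e469cafacd95b6cc2d6b64") should return True for an intact copy of the 2.9 GB file, where local_zip_path is wherever the download was saved. MD5 is adequate for detecting a corrupted or truncated download, but it is not a safeguard against deliberate tampering.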