Published July 23, 2025 | Version v2
Publication Open

UniFORM: Towards Universal ImmunoFluorescence Normalization for Multiplex Tissue Imaging

  • 1. Oregon Health & Science University
  • 2. OHSU Knight Cancer Institute

Description

Dataset Title:

PRAD‑CyCIF: Prostate Cancer Single‑Cell Mean Fluorescence Intensity Dataset from Cyclic Immunofluorescence (CyCIF) — AnnData & Pickle Versions

Description:

This publicly released PRAD‑CyCIF dataset is a curated, single‑cell feature representation of prostate cancer tissues imaged on the Cyclic Immunofluorescence (CyCIF) platform. It underpins the benchmarking experiments in the manuscript

“UniFORM: Towards Universal ImmunoFluorescence Normalization for Multiplex Tissue Imaging.”

Two parallel data formats are provided to accommodate common workflows:

  1. AnnData Version (.h5ad)

    • Expression matrix (.X):
      – Shape: 1,760,530 cells × 20 markers
      – Each row = one cell’s mean intensity across 20 markers

    • Cell metadata (.obs):

      • cell_id: unique cell identifier

      • sample_id: patient sample (PRAD‑01 … PRAD‑20)

      • scene_id: imaging scene label (e.g., scene000)

      • batch_id: staining/imaging batch (e.g., batch1, batch7)

      • x, y: cell coordinates within each scene

    • Marker metadata (.var):

      • marker_name: the 20-marker panel (e.g., DAPI_R1, EPCAM, CD45, Ki67)

    • Unstructured info (.uns['image_dimensions']):
      – Pixel height/width of each scene

  2. Pickle Version (.pkl per sample)

    • Structure: a directory of 20 files, one per patient sample, named PRAD‑XX_mean_intensity.pkl

    • Sample order: the pickle files are saved in exactly this order: PRAD_sample_names = ['PRAD-01', 'PRAD-02', 'PRAD-03', 'PRAD-04', 'PRAD-05', 'PRAD-06', 'PRAD-07', 'PRAD-08', 'PRAD-09', 'PRAD-10', 'PRAD-11', 'PRAD-12', 'PRAD-13', 'PRAD-14', 'PRAD-15', 'PRAD-16', 'PRAD-17', 'PRAD-18', 'PRAD-19', 'PRAD-20']
    • Marker order: the channels are stacked in exactly this order: PRAD_markers = ['DAPI_R1', 'EPCAM', 'CD56', 'CD45', 'aSMA', 'ChromA', 'CK14', 'Ki67', 'GZMB', 'ECAD', 'PD1', 'CD31', 'CD45RA', 'HLADRB1', 'CD3', 'p53', 'FOXA1', 'CDX2', 'CD20', 'NOTCH1']
    • Contents (per .pkl):

      {
        'intensity_mean':  np.ndarray of shape (20 markers, N_cells),
      }
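
The two formats described above could be read along the following lines. This is a minimal sketch, not code from the UniFORM repository: the helper names (load_sample, marker_row, load_anndata, cells_per_group) are hypothetical, the file names are assumed to use plain ASCII hyphens as in PRAD_sample_names, and reading the .h5ad version assumes the third-party anndata package is installed (unpickling the real files also needs NumPy).

```python
import pickle
from collections import Counter
from pathlib import Path

# Orders copied from the record; ASCII hyphens in file names are an assumption.
PRAD_sample_names = [f"PRAD-{i:02d}" for i in range(1, 21)]
PRAD_markers = ['DAPI_R1', 'EPCAM', 'CD56', 'CD45', 'aSMA', 'ChromA', 'CK14',
                'Ki67', 'GZMB', 'ECAD', 'PD1', 'CD31', 'CD45RA', 'HLADRB1',
                'CD3', 'p53', 'FOXA1', 'CDX2', 'CD20', 'NOTCH1']


def load_sample(pickle_dir, sample):
    """Return the (20 markers, N_cells) array stored for one sample."""
    path = Path(pickle_dir) / f"{sample}_mean_intensity.pkl"
    with open(path, "rb") as fh:
        return pickle.load(fh)["intensity_mean"]


def marker_row(intensity, marker):
    """Select one marker's per-cell intensities using the documented row order."""
    return intensity[PRAD_markers.index(marker)]


def load_anndata(h5ad_path):
    """Read the .h5ad version; requires the third-party anndata package."""
    import anndata  # imported lazily so the pickle helpers work without it
    return anndata.read_h5ad(h5ad_path)


def cells_per_group(adata, key="sample_id"):
    """Count cells per value of an .obs column such as sample_id or batch_id."""
    return Counter(adata.obs[key])
```

For example, marker_row(load_sample(pickle_dir, "PRAD-01"), "Ki67") would give the Ki67 mean intensity of every cell in the first sample, where pickle_dir is wherever the archive was extracted. Note that the pickle arrays are markers × cells, whereas the AnnData matrix is cells × markers, so one is the transpose of the other in layout.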

Intended Use:

This dataset is intended for researchers who wish to replicate the results reported in our manuscript, and for further benchmarking and method development.

Acknowledgments:

Please cite the dataset and related publications appropriately when using this data in your research.

Files

PRAD_pickle_data.zip

Files (639.3 MB)

md5:2b9d9cd263e1649b752ba4d60ae5ea6a (392.8 MB)
md5:bcf22d067a9ed324b2c9e9d7beaad122 (246.4 MB)
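
A downloaded file can be checked against the MD5 values listed above. The following is a minimal sketch using only the Python standard library; the function names md5_of_file and matches_listed_checksum are hypothetical, and the record does not state here which checksum belongs to which file name, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat for archives of several hundred megabytes. A False result suggests either an incomplete download or that the checksum belongs to the other file.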

Additional details

Software

Repository URL
https://github.com/kunlunW/UniFORM
Programming language
Python
Development Status
Active