Title: GeoMx Digital Spatial Profiling Data from Human Intestinal Tissue

Description:
This dataset accompanies the manuscript investigating spatial gene expression in human intestinal tissue with dysplasia using NanoString GeoMx Digital Spatial Profiling (DSP). The data includes gene expression measurements and corresponding sample metadata from regions of interest (ROIs) across patient biopsies.

Files Included:
- GeoMx_DSP_data.rds: R serialized object containing the spatial transcriptomics dataset analyzed in this manuscript, including expression (for CD14, CEBPB, MPO, OSM, CCL2, ICAM1, IL1B and VEGFA) and metadata used in analysis.
- GeoMx_DSP_expression_data.xlsx: Excel file containing the gene expression matrix generated from GeoMx Digital Spatial Profiling (DSP).
- GeoMx_DSP_metadata.xlsx: Excel file providing metadata for each area of interest (AOI), including disease phenotype, subject ID, and sample parameters.

Methods:
Hematoxylin and eosin (H&E) slides from each patient block were annotated by a dedicated IBD pathologist to define areas of dysplasia. A 2 mm core was extracted from each block and embedded into a new paraffin block to create a tissue microarray (TMA). TMA sections (4 µm thick) were incubated with the Human Whole Transcriptome Atlas panel (NanoString, USA) and stained with anti-pan-cytokeratin and anti-CD45 antibodies and the nuclear stain SYTO13, following the manufacturer's protocols. Slides were scanned on the GeoMx instrument, and 1 to 4 regions of interest (ROIs) were selected per core. CD45+ immune cells were segmented using the GeoMx software.

Quality control (QC) of segments and biological probes was conducted using a publicly available pipeline. Pre-processing and normalization were performed in R (v4.2.0) using NanoStringNCTools (v1.6.1). Counts were normalized with the third-quartile (Q3) method and then log₂-transformed. For comparative analysis between IBD-associated dysplasia and sporadic dysplasia, mean Z-scores were calculated using the formula Z = (x - μ) / σ, where x is an individual value, μ is the dataset mean, and σ is the dataset standard deviation.
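
The normalization and Z-score steps described above can be illustrated with a minimal Python sketch. This is not the code used for the manuscript (that analysis was run in R), and several details are assumptions made only for illustration: Q3 normalization is written in one common form (each segment is divided by its 75th-percentile count relative to the geometric mean of those percentiles across segments), the `pseudocount` argument is hypothetical, σ is taken as the sample standard deviation, and all function names and toy values are invented.

```python
import math
import statistics


def q3_normalize(counts):
    """Q3-normalize a genes x segments matrix (list of rows of raw counts).

    Assumed formulation: each segment (column) is divided by its third
    quartile relative to the geometric mean of all segments' third quartiles.
    """
    n_segments = len(counts[0])
    columns = [[row[j] for row in counts] for j in range(n_segments)]
    # 75th percentile per segment; "inclusive" corresponds to R's default quantile type.
    q3 = [statistics.quantiles(col, n=4, method="inclusive")[2] for col in columns]
    geo_mean = math.exp(sum(math.log(q) for q in q3) / len(q3))
    factors = [q / geo_mean for q in q3]
    return [[row[j] / factors[j] for j in range(n_segments)] for row in counts]


def log2_transform(matrix, pseudocount=1.0):
    """log2-transform every value; the pseudocount is an assumption, not from the source."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]


def z_scores(values):
    """Z = (x - mu) / sigma, with sigma assumed to be the sample standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(x - mu) / sigma for x in values]


def mean_z_by_group(values, groups):
    """Average the Z-scores of one gene within each group label."""
    by_group = {}
    for z, label in zip(z_scores(values), groups):
        by_group.setdefault(label, []).append(z)
    return {label: statistics.mean(zs) for label, zs in by_group.items()}


# Toy data: 5 hypothetical genes x 2 segments; the second segment has twice the depth.
raw = [[10, 20], [20, 40], [30, 60], [40, 80], [50, 100]]
normalized = log2_transform(q3_normalize(raw))

# Toy log2 expression of one gene across four segments, with invented group labels.
gene_values = [1.0, 2.0, 3.0, 4.0]
labels = ["IBD", "IBD", "sporadic", "sporadic"]
print(mean_z_by_group(gene_values, labels))
```

In this toy example the two segments differ only by sequencing depth, so Q3 normalization brings their values into agreement before the log₂ transformation. The per-group mean Z-scores then summarize how far each dysplasia group sits from the overall mean for a given gene.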
