Published February 27, 2026 | Version v7
Workflow Open

Unsupervised Spatial Segmentation of the Tumor Microenvironment via Multi-Scale Z-Axis Stratification and Molecular Density Fields: A Label-Free Grid-Level Framework with MRF Smoothing and Interface Gradient Analysis

Authors/Creators

Description

Overview

Using public 10x Xenium spatial transcriptomics data from breast cancer, we treat the cell-stacking phenomenon along the Z-axis, traditionally regarded as a technical artifact, as a genuine physical signal. We develop a grid-based, multi-scale texture analysis framework built on Z-axis stratification statistics.

The pipeline proceeds as follows:

  1. Transcripts are binned into spatial grids after quality control. A robust baseline correction (RANSAC + Huber regression with automatic linear/quadratic model selection via AIC, validated by Moran's I on residuals) removes global geometric trends from the Z-axis.
  2. At multiple Gaussian kernel scales (σ = 15, 30, 45 μm), three continuous physical fields are computed per grid: transcript density (ρ), overall Z-dispersion (z_std_all), and the imbalance-enhanced difference between upper and lower Z-dispersion (z_std_diff_enhanced). Edge correction and confidence weighting are applied throughout.
  3. Using only these geometry-derived features, with no pathology region annotations, cell-type labels, or gene expression input, we perform unsupervised classification with a diagonal-covariance Gaussian mixture model (GMM) with Potts-model Markov random field (MRF) spatial smoothing. The number of clusters (K) is selected by bootstrap stability combined with the integrated completed likelihood (ICL), and the smoothing strength (λ) is chosen by a stability–boundary-ratio objective over a σ × λ sensitivity grid.
  4. A leakage guard formally verifies that no biological or expression features enter the classification stage.
  5. Post-classification biological validation is conducted entirely downstream: grid-level count matrices and counts per million (CPM) are constructed; per-cluster marker ranking (vectorized one-vs-rest Wilcoxon test) and pairwise differential gene expression (Mann–Whitney U test with Benjamini–Hochberg (BH) correction) are performed; and pathway enrichment (MSigDB Hallmark, GO BP, KEGG) is run via gseapy. Marker-group scoring (log1p mean CPM of curated gene panels) with Cohen's d effect sizes quantifies functional differences between clusters.
  6. Spatial interface analysis computes signed distances to the cluster boundary, constructs interface gradient heatmaps (z-scored feature profiles binned by distance), and derives interface strength/sharpness metrics (contrast Cohen's d, near-boundary slope, maximum gradient, AUC separation). A radius sensitivity sweep with partial Spearman correlations (controlling for transcript density) confirms that the Z-dispersion–density and Z-dispersion–heterogeneity associations are robust across neighborhood scales.
  7. A panel-restricted differential gene expression analysis, with its own pathway enrichment, provides a focused validation on biologically curated gene sets.
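
To make steps 1 and 2 more concrete, the following is a minimal sketch of how transcripts might be binned into grids and summarized along Z. It is not the repository's implementation: the function name `grid_z_features`, the bin size, the minimum count, and the median split into upper and lower layers are all illustrative assumptions. Baseline correction, Gaussian smoothing, edge correction, and the "imbalance enhancement" of the dispersion difference are omitted because the description does not specify them.

```python
from collections import defaultdict
from statistics import median, pstdev


def _sd(values):
    """Population standard deviation, or 0.0 when fewer than two values exist."""
    return pstdev(values) if len(values) > 1 else 0.0


def grid_z_features(transcripts, bin_um=10.0, min_count=4):
    """Bin (x, y, z) transcripts into square grids and summarize Z per grid.

    bin_um and min_count are hypothetical defaults, not values from the source.
    """
    cells = defaultdict(list)
    for x, y, z in transcripts:
        # Integer grid index from floor division of the planar coordinates.
        cells[(int(x // bin_um), int(y // bin_um))].append(z)

    features = {}
    for key, zs in cells.items():
        if len(zs) < min_count:
            continue  # too few molecules for a stable dispersion estimate
        mid = median(zs)
        upper = [z for z in zs if z > mid]   # assumed "upper layer": above the grid median
        lower = [z for z in zs if z <= mid]  # assumed "lower layer": at or below the median
        features[key] = {
            "rho": len(zs) / bin_um ** 2,          # molecules per square micron
            "z_std_all": _sd(zs),                  # overall Z-dispersion
            "z_std_diff": _sd(upper) - _sd(lower), # plain upper-minus-lower difference
        }
    return features
```

In a full pipeline these per-grid values would then be smoothed at each kernel scale before clustering; a NumPy or SciPy implementation would be the more natural choice at realistic data sizes.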

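The Potts-model smoothing in step 3 can be illustrated with a small sketch. The description does not say which optimizer is used, so this example assumes iterated conditional modes (ICM) on a 4-connected grid; the function name `potts_icm` and the input layout are hypothetical. Each grid carries a per-label cost (for instance, a negative GMM log-likelihood), and λ adds a penalty for every neighbor with a different label.

```python
def potts_icm(unary, lam=1.0, n_iter=10):
    """Smooth a label map with a Potts prior using iterated conditional modes.

    unary[r][c][k] is the cost of assigning label k to grid (r, c).
    ICM and 4-connectivity are assumptions made for this sketch.
    """
    rows, cols = len(unary), len(unary[0])
    n_labels = len(unary[0][0])

    # Start from the labeling that minimizes the unary cost alone.
    labels = [
        [min(range(n_labels), key=lambda k: unary[r][c][k]) for c in range(cols)]
        for r in range(rows)
    ]

    for _ in range(n_iter):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Labels of the in-bounds 4-connected neighbors.
                neigh = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def energy(k):
                    # Unary cost plus lam for each neighbor that disagrees with k.
                    return unary[r][c][k] + lam * sum(k != n for n in neigh)

                best = min(range(n_labels), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # reached a local minimum of the Potts energy
    return labels
```

With λ = 0 the result is the unsmoothed per-grid assignment; larger λ removes isolated labels, which is the trade-off the σ × λ sensitivity grid is meant to explore.
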

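Two of the statistics named in step 5, BH correction and Cohen's d, are standard and can be sketched with the Python standard library. The function names below are illustrative and not taken from the repository, and the pooled-standard-deviation form of Cohen's d is an assumption, since the description does not state which variant is used.

```python
from math import sqrt
from statistics import mean, variance


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the ranking.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant).

    Requires at least two values per group and non-zero pooled variance.
    """
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled
```

For example, `benjamini_hochberg` would be applied to the per-gene Mann–Whitney p-values of one cluster pair, and `cohens_d` to the marker-group scores of the two clusters being compared.
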
Files

cluster.png

Files (28.6 MB)

Checksum Size
md5:c2b63feb1a87626976f96ccaa695e344 1.5 MB
md5:88001adb709381d41363129a93d37e3f 211.6 kB
md5:83f43c7f95d38bd0316988f84937a275 164.5 kB
md5:5caabc421ac16ef330fc0038087332c8 1.5 MB
md5:bf7b76128906d41b6a5469cd639b62b5 58.2 kB
md5:8547fc41e437a0c9a14c0f9650bd186d 11.5 kB
md5:2cdfef61587019e10282bb5f355cf007 13.1 MB
md5:b01fb21d2fd10f641a23d5d4904e8386 12.1 MB
md5:6b7ca0d3c883f407b988da31e4471e1d 63.7 kB

Additional details

Additional titles

Translated title (Mandarin Chinese)
基于多尺度Z轴分层与分子密度场的肿瘤微环境无监督空间分割:一种无标签网格级MRF平滑与界面梯度分析框架

Software

Repository URL
https://github.com/Guquan-2002/GridZ-ST
Programming language
Python
Development Status
Active