Published May 17, 2026 | Version v3.0.0
Software
Open
Forest-Guided Clustering — Shedding Light into the Random Forest Black Box
Authors/Creators
- 1. Helmholtz AI
Description
Random Forests (RF), despite their widespread use and strong performance on tabular data, remain difficult to interpret due to their ensemble nature. We present Forest-Guided Clustering (FGC), a model-specific explainability method that reveals both local and global structure in RFs by grouping instances according to shared decision paths. FGC produces human-interpretable clusters aligned with the model's internal logic and computes cluster-specific and global feature importance scores to derive decision rules underlying RF predictions. FGC accurately recovered latent subclass structure on a benchmark dataset and outperformed classical clustering and post-hoc explanation methods. Applied to an acute myeloid leukemia (AML) transcriptomic dataset, FGC uncovered biologically coherent subpopulations, disentangled disease-relevant signals from confounders, and recovered known and novel gene expression patterns. FGC bridges the gap between performance and interpretability by providing structure-aware insights that go beyond feature-level attribution.
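The idea of grouping instances by shared decision paths can be illustrated with a minimal sketch. It does not use the fg-clustering package or its API. It assumes scikit-learn and SciPy, approximates "shared decision paths" by the fraction of trees in which two samples reach the same leaf, and uses average-linkage hierarchical clustering as a stand-in for the clustering step, which the description does not specify. The helper names `forest_proximity` and `cluster_by_forest` and the Iris example data are hypothetical choices for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def forest_proximity(forest, X):
    """Fraction of trees in which each pair of samples lands in the same leaf."""
    # apply() returns the leaf index reached in every tree: (n_samples, n_trees)
    leaves = forest.apply(X)
    # Compare all sample pairs tree by tree, then average over the trees
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def cluster_by_forest(forest, X, n_clusters):
    """Cluster samples on the distance 1 - proximity (illustrative stand-in)."""
    dist = 1.0 - forest_proximity(forest, X)
    np.fill_diagonal(dist, 0.0)
    # linkage() expects the condensed (upper-triangle) form of the distances
    tree = linkage(squareform(dist, checks=False), method="average")
    # Cut the dendrogram so that at most n_clusters flat clusters remain
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Example data chosen for the sketch only; it is not from the record
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
proximity = forest_proximity(forest, X)
labels = cluster_by_forest(forest, X, n_clusters=3)

# Per-cluster feature means give a crude view of what separates the clusters
for cluster_id in np.unique(labels):
    print(cluster_id, X[labels == cluster_id].mean(axis=0).round(2))
```

Because the distance is derived from the trained forest rather than from raw feature values, samples end up close together when the model treats them alike, which is the property the description attributes to FGC clusters. The pairwise comparison uses memory proportional to n_samples² × n_trees, so this sketch only suits small datasets.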
Files
| Name | Size | Checksum |
|---|---|---|
| HelmholtzAI-Consultants-Munich/fg-clustering-v3.0.0.zip | 64.7 MB | md5:85ed7b57dcc69fe96fabfc927d0ae2b3 |
Additional details
Related works
- Is supplement to
- Software: https://github.com/HelmholtzAI-Consultants-Munich/fg-clustering/tree/v3.0.0 (URL)