Semantic Axis Decomposition of Transformer Embeddings
Description
This work introduces a novel method for interpreting sentence-transformer embeddings by identifying and labeling the most semantically meaningful latent dimensions. Random Forest classifiers rank the embedding coordinates by influence; the top N are extracted and assigned human-interpretable labels such as "emotionality", "scientific intent", or "question structure".
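As a minimal sketch of this procedure (not the original implementation), assuming the all-MiniLM-L6-v2 model and an illustrative binary labeling task of emotional vs. neutral sentences; the sentences, labels, and value of N below are placeholders:

```python
# Sketch: rank embedding dimensions by Random Forest feature importance.
# Assumes sentence-transformers and scikit-learn; data and N are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import RandomForestClassifier

sentences = [
    "I am thrilled about this!",
    "The protein folds at 37 degrees Celsius.",
    "What time does the lecture start?",
    "This is utterly heartbreaking.",
]
labels = [1, 0, 0, 1]  # hypothetical labels: 1 = emotional, 0 = neutral

model = SentenceTransformer("all-MiniLM-L6-v2")
X = model.encode(sentences)  # shape: (n_sentences, 384)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, labels)

N = 10  # top-N influential coordinates
top_dims = np.argsort(clf.feature_importances_)[::-1][:N]
print("Most influential embedding dimensions:", top_dims)
```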
The result is a semantic heatmap showing how individual sentences activate specific dimensions of meaning, giving researchers and practitioners a clearer picture of what transformer-based models encode and how they behave.
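A heatmap of this kind could be rendered as follows; this is a sketch only, where the activation matrix and axis labels are placeholders standing in for X[:, top_dims] from the sketch above:

```python
# Sketch: render a semantic heatmap of sentence activations over labeled axes.
# Activations and axis labels are placeholders; in practice use X[:, top_dims].
import numpy as np
import matplotlib.pyplot as plt

sentences = [
    "I am thrilled about this!",
    "The protein folds at 37 degrees Celsius.",
    "What time does the lecture start?",
    "This is utterly heartbreaking.",
]
axis_labels = ["emotionality", "scientific intent", "question structure"]
activations = np.random.default_rng(0).normal(size=(len(sentences), len(axis_labels)))

fig, ax = plt.subplots(figsize=(6, 3))
im = ax.imshow(activations, aspect="auto", cmap="coolwarm")
ax.set_xticks(range(len(axis_labels)), labels=axis_labels, rotation=30, ha="right")
ax.set_yticks(range(len(sentences)), labels=sentences)
fig.colorbar(im, ax=ax, label="activation")
fig.tight_layout()
plt.show()
```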
This record is a conceptual and visual demonstration; the code is not attached here but is available in the repository linked below.
Keywords: transformer, embeddings, interpretability, latent space, semantic axis, XAI, sentence-transformers
Interactive prototype, source code, and PCA visualizations available here:
https://github.com/kexi-bq/embedding-explainer
Files
semantic_floors_expanded (1).pdf (4.7 kB, md5:35ce8ea7f84925c1e5fc6957d49bfff4)
Additional details
Dates
- Accepted: 2025-05-24