Semantic Axis Decomposition of Transformer Embeddings
Description
This work introduces a novel method for interpreting sentence-transformer embeddings by identifying and labeling their most semantically meaningful latent dimensions. Using Random Forest classifiers, we extract the top-N most influential coordinates and assign them human-interpretable meanings such as "emotionality", "scientific intent", or "question structure".
The result is a semantic heatmap that shows how individual sentences activate specific dimensions of meaning. This allows researchers and practitioners to better understand what transformer-based models are encoding and how they behave.
This record is a conceptual and visual demonstration; code is not included in the record itself, but an implementation is available in the repository linked below.
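For orientation, here is a minimal sketch of such a pipeline, assuming the sentence-transformers and scikit-learn packages. The encoder name, toy sentences, labels, and top-N value are illustrative placeholders, not taken from this record or the linked repository:

```python
# Minimal sketch of the described approach (illustrative only; the model name,
# toy sentences, labels, and top_n below are assumptions, not from this record).
import numpy as np
import matplotlib.pyplot as plt
from sentence_transformers import SentenceTransformer
from sklearn.ensemble import RandomForestClassifier

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

sentences = [
    "I am absolutely thrilled about these results!",
    "The protein folds into an antiparallel beta sheet.",
    "What time does the experiment start?",
    "This news is utterly heartbreaking.",
]
labels = ["emotionality", "scientific intent", "question structure", "emotionality"]

# Encode sentences, then train a Random Forest to separate the semantic classes.
X = model.encode(sentences)  # shape: (n_sentences, embedding_dim)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)

# Rank embedding coordinates by how strongly they discriminate the classes.
top_n = 10
top_dims = np.argsort(clf.feature_importances_)[::-1][:top_n]

# Semantic heatmap: how strongly each sentence activates the top dimensions.
plt.imshow(X[:, top_dims], aspect="auto", cmap="coolwarm")
plt.yticks(range(len(sentences)), sentences)
plt.xticks(range(top_n), top_dims)
plt.xlabel("embedding dimension (ranked by importance)")
plt.colorbar(label="activation")
plt.tight_layout()
plt.show()
```

The Random Forest here stands in for any classifier that exposes per-feature importances; the dimensions it ranks highest are candidate "semantic axes", and the heatmap shows each sentence's activation along them.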
Keywords: transformer, embeddings, interpretability, latent space, semantic axis, XAI, sentence-transformers
Interactive prototype, source code, and PCA visualizations available here:
https://github.com/kexi-bq/embedding-explainer
Files
Name | Size
---|---
Semantic_Embedding_Final.pdf (md5:4b2250234fafb52f97380140b7620725) | 3.4 kB
Additional details
Dates
- Accepted: 2025-05-24