Published May 3, 2026 | Version v1
Publication
Open
Two Refusals or One? Disentangling Safety and Epistemic Abstention Directions in Language Model Activations
Authors/Creators
Files
main.pdf
| Name | Size |
|---|---|
| main.pdf (md5:36f82a48f988c32ff83b286d70a5d87b) | 263.2 kB |