Published May 3, 2026 | Version v1

Two Refusals or One? Disentangling Safety and Epistemic Abstention Directions in Language Model Activations

Authors/Creators

Files

main.pdf (263.2 kB)
md5:36f82a48f988c32ff83b286d70a5d87b