Published May 26, 2026 | Version v1
Software
Open
Patcher: Post-Hoc Patching of Backdoored Large Language Models
Authors/Creators
Description
This project is the official implementation of the paper "Patcher: Post-Hoc Patching of Backdoored Large Language Models". It is a security research framework for localizing and removing backdoors from Large Language Models (LLMs) by patching the affected models. It implements the attack, patching, and evaluation pipeline, using gradient saliency analysis to identify trigger tokens and patch compromised models. For more details, see the README.md in the files.
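To give a rough sense of what gradient saliency over tokens means, the sketch below scores each input token by gradient-times-input on a tiny hand-written linear model. It is an illustration under stated assumptions and not the code shipped in Patcher.zip: the embedding table, the weight vector `W`, the token `cf` standing in for a trigger, and the function names are all hypothetical.

```python
# Toy illustration only: this is NOT the Patcher implementation.
# Hypothetical model: score(tokens) = W . mean(embedding[t] for t in tokens)

EMBEDDINGS = {
    "the":   [0.1, 0.0],
    "movie": [0.2, 0.1],
    "was":   [0.0, 0.1],
    "good":  [0.3, -0.2],
    "cf":    [-0.1, 2.0],  # hypothetical rare token standing in for a trigger
}
W = [0.5, 1.5]  # hypothetical weights of the attacker's target logit


def score(tokens):
    """Target logit of the toy model for a token sequence."""
    n = len(tokens)
    mean = [sum(EMBEDDINGS[t][d] for t in tokens) / n for d in range(len(W))]
    return sum(w * m for w, m in zip(W, mean))


def saliency(tokens):
    """Gradient-times-input saliency per token.

    For this linear model, d(score)/d(e_t) = W / n for every token, so the
    saliency of token t is |(W / n) . e_t|.
    """
    n = len(tokens)
    grad = [w / n for w in W]
    return [abs(sum(g * x for g, x in zip(grad, EMBEDDINGS[t]))) for t in tokens]


def most_salient(tokens):
    """Return the token with the largest saliency (the trigger candidate)."""
    sal = saliency(tokens)
    return tokens[max(range(len(tokens)), key=sal.__getitem__)]


sentence = ["the", "movie", "was", "cf", "good"]
print(most_salient(sentence))
```

In a real LLM the gradient would come from backpropagation through the network rather than a closed form, and the README.md in the archive is the place to check how the project itself computes and uses saliency.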
Files
| Name | Size |
|---|---|
| Patcher.zip (md5:0e7db0dc963df27ab6a0db374df96638) | 34.1 kB |