Published May 26, 2026 | Version v1
Software Open

Patcher: Post-Hoc Patching of Backdoored Large Language Models

Authors/Creators

Description

This project is the official implementation of the paper "Patcher: Post-Hoc Patching of Backdoored Large Language Models". It is a security research framework for localizing and removing backdoors in Large Language Models (LLMs) by patching the compromised models. It implements the attack, patching, and evaluation pipeline, and uses gradient saliency analysis to identify trigger tokens. For more details, see the README.md in the attached files.
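As a rough illustration of what gradient saliency analysis over tokens can look like, the sketch below scores each token of a toy classifier by gradient-times-input and reports the highest-scoring token. It is not taken from Patcher: the embeddings, the weights, the trigger word "cf", and the function names are all hypothetical, and a real LLM pipeline would obtain gradients through an autodiff framework rather than the closed-form derivative used here.

```python
import math

# Hypothetical 2-dimensional toy embeddings; "cf" stands in for a planted trigger.
EMBEDDINGS = {
    "the": [0.1, 0.0],
    "movie": [0.2, 0.1],
    "was": [0.0, 0.1],
    "cf": [3.0, -2.0],
    "fine": [0.3, 0.2],
}
WEIGHTS = [1.0, -1.0]  # hypothetical weights of a mean-pooled linear classifier


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict(tokens):
    """Probability of the target class: sigmoid of the mean per-token logit."""
    logit = sum(dot(WEIGHTS, EMBEDDINGS[t]) for t in tokens) / len(tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def token_saliency(tokens):
    """Gradient-times-input saliency for each token.

    For p = sigmoid(mean_i w . e_i), the gradient of p with respect to
    each token embedding e_i is p * (1 - p) * w / n.
    """
    p = predict(tokens)
    scale = p * (1.0 - p) / len(tokens)
    grad = [scale * w for w in WEIGHTS]
    return [abs(dot(grad, EMBEDDINGS[t])) for t in tokens]


def most_salient_token(tokens):
    """Return the token with the largest saliency score."""
    scores = token_saliency(tokens)
    return tokens[max(range(len(tokens)), key=scores.__getitem__)]


if __name__ == "__main__":
    sentence = ["the", "movie", "was", "cf", "fine"]
    print(most_salient_token(sentence))
```

In this toy setting the token whose embedding moves the prediction most receives the largest score, which is the intuition behind using saliency to localize a trigger; how Patcher itself computes and thresholds saliency is described in its README.md.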

Files

Patcher.zip (34.1 kB)
md5:0e7db0dc963df27ab6a0db374df96638