End-to-end de novo design of Zn²⁺ metallohydrolase binders: an open-source canonical pipeline anchored by LigandMPNN's metal-coordination recovery
Authors/Creators
- 1. Genesis_Medicine Lab; HAN PREDICT, Inc.; Recover Korean Medicine Clinic
Description
De novo design of binders against Zn²⁺ metallohydrolases (matrix metalloproteinases, carbonic anhydrases, thermolysins, and related catalytic-metal enzymes) remains one of the most demanding stress tests for modern generative protein modeling. The catalytic geometry of these enzymes depends on a small set of coordinating residues (typically His/Asp/Glu/Cys) whose identity and side-chain rotamer states must be preserved through every stage of an end-to-end design pipeline. We present a fully open-source canonical pipeline that chains four stages built from publicly released tools (RFdiffusion3 for backbone generation, LigandMPNN for metal-aware inverse folding, FlowPacker for side-chain refinement, and AlphaFold3 / Boltz-2x / Chai-1 for cofold validation) into a reproducible workflow, which we apply to the matrix metalloproteinase-1 (MMP-1) catalytic domain. The pivotal stage is sequence design: on the 1HFC reference scaffold (157 residues, two Zn²⁺ ions and one Ca²⁺ ion), LigandMPNN achieves 95.3% recovery at the six Zn-coordinating positions versus 46.4% for plain ProteinMPNN. The disparity is most pronounced at the structural-Zn triad (His183/Asp185/His196), where ProteinMPNN recovers 0% and LigandMPNN recovers 90.6%. An orthogonal ESM-C 600M zero-shot likelihood oracle independently confirms that LigandMPNN sequences are more native-like than ProteinMPNN sequences (mean perplexity 2.85 versus 3.03). We also document a silent failure mode: when HETATM lines are stripped during preprocessing, LigandMPNN still reports use_ligand_context=True but quietly degenerates to ProteinMPNN behavior. We provide a preflight check that detects this condition. The pipeline composes naturally with neural network potential (NNP) ranking (paper_A) and physicality-steered cofold validation (paper_B, --use_potentials). We argue that this open canonical stack now matches or exceeds the design quality of closed alternatives (AlphaProteo) at zero licensing cost for academic users.
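The recovery figures quoted above are restricted to a handful of coordinating positions rather than the full sequence. As a minimal sketch of how such a site-restricted recovery could be computed, the function below counts how often a set of designed sequences keeps the native residue at chosen positions. The name `site_recovery`, the 0-based indexing, and the pooling over all designs are assumptions made for illustration; the record does not state how the authors compute their numbers, and mapping PDB residue numbers such as His183 to sequence indices is left to the caller.

```python
from typing import Iterable, Sequence


def site_recovery(native: str, designs: Sequence[str], positions: Iterable[int]) -> float:
    """Fraction of (design, position) pairs that keep the native residue.

    `positions` are 0-based indices into sequences that are already aligned
    to `native` (same length, no gaps). This is a hypothetical helper, not
    part of LigandMPNN or ProteinMPNN.
    """
    pos = list(positions)
    if not designs or not pos:
        raise ValueError("need at least one design and one position")
    for design in designs:
        if len(design) != len(native):
            raise ValueError("designs must have the same length as the native sequence")
    # Count matches over every design at every selected position.
    hits = sum(design[i] == native[i] for design in designs for i in pos)
    return hits / (len(designs) * len(pos))
```

For example, with a toy native sequence `"AHADAH"`, designs `["AHADAH", "AHAEAH"]`, and positions `[1, 3, 5]`, five of the six design-position pairs match the native residue, so the function returns 5/6.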
Keywords: de novo enzyme design, metalloenzyme, matrix metalloproteinase, LigandMPNN, RFdiffusion, FlowPacker, AlphaFold3, Boltz-2x, open source, reproducibility.
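The silent failure mode described in the abstract arises when HETATM records are missing from the input structure. The sketch below shows one way a preflight check might look: it counts metal-ion HETATM records in PDB-format text and raises an error if fewer are found than expected. It is not the authors' check, and it inspects only the input file, not LigandMPNN itself. The function names and the default expectation of two Zn and one Ca (taken from the 1HFC description above) are assumptions for illustration.

```python
from typing import Dict, Mapping, Optional


def count_metal_hetatms(pdb_text: str, metals=("ZN", "CA")) -> Dict[str, int]:
    """Count HETATM records whose element (or residue name) is a listed metal."""
    counts = {m: 0 for m in metals}
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        # PDB fixed columns: element symbol in columns 77-78, residue name in 18-20.
        element = line[76:78].strip().upper()
        if not element:
            # Fall back to the residue name when the element field is blank.
            element = line[17:20].strip().upper()
        if element in counts:
            counts[element] += 1
    return counts


def preflight_metals(pdb_text: str, expected: Optional[Mapping[str, int]] = None) -> Dict[str, int]:
    """Raise ValueError if the structure has fewer metal ions than expected.

    The default expectation (2 Zn, 1 Ca) is an assumption based on the
    1HFC scaffold described in the abstract.
    """
    if expected is None:
        expected = {"ZN": 2, "CA": 1}
    counts = count_metal_hetatms(pdb_text, metals=tuple(expected))
    missing = {m: (counts[m], n) for m, n in expected.items() if counts[m] < n}
    if missing:
        raise ValueError(f"metal HETATM records missing (found, expected): {missing}")
    return counts
```

Running `preflight_metals` on the preprocessed file immediately before sequence design would catch the case where a cleaning step has dropped the ions, which the abstract reports is not visible from the use_ligand_context=True setting alone.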
Files
22_paper_C.md
Total size: 135.7 kB
| Checksum | Size |
|---|---|
| md5:a84affa4bfeedeb64ad4fe7597dab7e9 | 44.0 kB |
| md5:e4669156e17838d0791ed91b42809408 | 91.8 kB |