Motif-Upcycling: Structure-Preserving Adaptation of Transformer Models
Description
We introduce Motif-Upcycling, a structure-preserving framework for adapting pretrained Transformer models. The key idea is that common feed-forward modules, including SwiGLU FFNs, can be exactly factorized along their intermediate channel axis into motif-aligned components. With neutral routing, the factorized module computes the same function as the original pretrained block at initialization. We further introduce Scale-Aware Residual Control (SARC), an identity-preserving control motif that modulates the magnitude of trainable residual interventions relative to the residual stream. We also propose Emergence as Coupled Budget Thresholds (ECBT), a conditional model showing that apparent capability cliffs can arise from multiplicatively coupled motif effectiveness curves under uniform budget allocation.
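To illustrate the exact channel-axis factorization claimed above, here is a minimal NumPy sketch, not the authors' implementation: a SwiGLU FFN is split along its intermediate dimension into slices, and with neutral routing (every slice weighted 1) the summed slice outputs reproduce the original block's output exactly. The function name `swiglu_ffn` and the uniform slicing are illustrative assumptions.

```python
import numpy as np

def swish(x):
    # swish / SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, W_gate, W_up, W_down):
    # Standard SwiGLU FFN: down( swish(gate(x)) * up(x) )
    return (swish(x @ W_gate.T) * (x @ W_up.T)) @ W_down.T

rng = np.random.default_rng(0)
d_model, d_ff, n_motifs = 8, 12, 3
x = rng.normal(size=(5, d_model))
W_gate = rng.normal(size=(d_ff, d_model))
W_up = rng.normal(size=(d_ff, d_model))
W_down = rng.normal(size=(d_model, d_ff))

y_ref = swiglu_ffn(x, W_gate, W_up, W_down)

# Factorize along the intermediate channel axis into motif-aligned slices.
# Because the activation is elementwise and the down-projection is linear
# in its input channels, summing the slice outputs is exact; neutral
# routing (weight 1 per slice) therefore preserves the pretrained function.
slices = np.array_split(np.arange(d_ff), n_motifs)
y_fact = sum(
    swiglu_ffn(x, W_gate[idx], W_up[idx], W_down[:, idx])
    for idx in slices
)

assert np.allclose(y_ref, y_fact)
```

The equality holds for any partition of the intermediate channels, which is what makes the factorization structure-preserving at initialization.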
Files
| Name | Size |
|---|---|
| motif_upcycling_structure_preserving_paper.pdf (md5:db6cbd788c61e24c9b19547de6faf72e) | 917.1 kB |
Additional details
Software
- Repository URL: https://github.com/kharkilirov1/motif_upcycling.git
- Programming language: Python