Published April 1, 2026 | Version v1
Preprint Open

DeltaLens: Selective Reading from Compressed Memory via Cross-Attention

Authors/Creators

Description

We propose DeltaLens, which replaces the read operation of linear attention with cross-attention over the compressed state matrix. At the 1.36B scale, DeltaLens reaches a perplexity (PPL) of 19.01 with 751M parameters, about 25% lower than the PPL of 25.4 obtained by a full 1.36B-parameter Transformer.
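To make the contrast between the two read operations concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method from the paper: the description above does not specify how the state is written, how queries and memory slots are projected, or how heads are arranged. The sketch assumes a plain additive write `S += k v^T`, treats each row of `S` as one memory slot, and omits all learned projections. The function names `write`, `linear_read`, and `cross_attention_read` are hypothetical.

```python
import math


def write(state, k, v):
    """Assumed additive linear-attention write: S += k v^T (in place)."""
    for i, k_i in enumerate(k):
        for j, v_j in enumerate(v):
            state[i][j] += k_i * v_j


def linear_read(state, q):
    """Standard linear-attention read: o = q^T S, a fixed linear map of S."""
    d_v = len(state[0])
    return [sum(q[i] * state[i][j] for i in range(len(q))) for j in range(d_v)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_read(state, q):
    """Hypothetical selective read: the query attends over the rows of S.

    Each row of S acts as both the key and the value of one memory slot
    (an assumption; a real model would likely use learned projections).
    """
    d_v = len(state[0])
    scale = 1.0 / math.sqrt(d_v)
    scores = [scale * sum(q_j * s_j for q_j, s_j in zip(q, row)) for row in state]
    weights = softmax(scores)
    out = [sum(w * row[j] for w, row in zip(weights, state)) for j in range(d_v)]
    return out, weights


# Toy usage: a 2x2 state holding two key/value associations.
state = [[0.0, 0.0], [0.0, 0.0]]
write(state, k=[1.0, 0.0], v=[2.0, 0.0])
write(state, k=[0.0, 1.0], v=[0.0, 3.0])

dense = linear_read(state, q=[1.0, 1.0])
selective, weights = cross_attention_read(state, q=[1.0, 0.0])
```

In this toy setup the linear read mixes every slot with fixed coefficients given by the query, while the cross-attention read normalizes its weights with a softmax, so slots that match the query more closely contribute more to the output.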

Files

DeltaLens__Selective_Reading_from_Compressed_Memory_via_Cross_Attention__2_.pdf