There is a newer version of the record available.

Published February 17, 2026 | Version v3.2.000
Software Open

VRAXION

Authors/Creators

Description

v3.2.000 — Architecture Complete

All 10 LCX bottleneck projection levers are now locked: eight via deterministic GPU/CPU probes and two by analysis. The architecture is fully specified and ready for training.

Bottleneck Design (LOCKED)

lcx_read [D] → Lin(D→D/10) → C19 → Lin(D/10→D/10) → C19 → Lin(D/10→D)
  × zoom_gate(sigmoid=0.5) → + hidden
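The data flow in the diagram above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the definition of C19 is not given in this record, so `placeholder_c19` uses tanh purely as a stand-in, and the toy width `D = 20`, the helper names, and the small random weights are all assumptions chosen to keep the example fast.

```python
import math
import random

def linear(x, w, b):
    """Dense layer: w has shape [out][in], b has shape [out]."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def placeholder_c19(v):
    """Stand-in for C19, whose definition is not given here (tanh is an assumption)."""
    return [math.tanh(t) for t in v]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_linear(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative only, not orthogonal init)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def bottleneck_step(hidden, lcx_read, layers, zoom_gate_param=0.0):
    """Compute hidden + sigmoid(zoom_gate) * proj(lcx_read), following the diagram."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h = placeholder_c19(linear(lcx_read, w1, b1))   # Lin(D -> D/10) -> C19
    h = placeholder_c19(linear(h, w2, b2))          # Lin(D/10 -> D/10) -> C19
    proj = linear(h, w3, b3)                        # Lin(D/10 -> D)
    gate = sigmoid(zoom_gate_param)                 # init 0.0 gives a gate of 0.5
    return [hv + gate * pv for hv, pv in zip(hidden, proj)]

rng = random.Random(0)
D = 20          # toy width for the example; the release locks the bottleneck at 618
B = D // 10     # 10:1 squeeze
layers = [make_linear(D, B, rng), make_linear(B, B, rng), make_linear(B, D, rng)]
hidden = [0.0] * D
lcx_read = [rng.uniform(-1.0, 1.0) for _ in range(D)]
out = bottleneck_step(hidden, lcx_read, layers)
print(len(out), sigmoid(0.0))   # prints: 20 0.5
```

The gated projection is added to `hidden` without adding `lcx_read` back to its own projection, which is how this sketch reads the "proj(x) only" setting of Lever 9.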

10 Levers — All Resolved

| Lever | Decision | Method |
|-------|----------|--------|
| 1. Bottleneck dim | 618 (10:1 squeeze) | CPU probe |
| 2. Architecture | 2x618 (3 linears, 2 C19s) | CPU probe |
| 3. Activation | C19 everywhere | CPU probe |
| 4. Placement | Both input + think-tick | GPU probe |
| 5. Read/Write | Read only | Analysis |
| 6. Zoom gate init | 0.0 (balanced) | GPU probe |
| 7. Weight init | Orthogonal | GPU probe |
| 8. Normalization | None | GPU probe |
| 9. Residual skip | OFF (proj(x) only) | GPU probe |
| 10. Shared/per-tick | Shared | Analysis |
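To get a feel for what the 10:1 squeeze costs in parameters, the arithmetic can be worked through as below. Two things here are assumptions rather than facts from the release: the model width `D = 6180` is inferred from "618 (10:1 squeeze)", and each Linear layer is assumed to carry a bias term. The helper `linear_params` is a hypothetical name used only for this example.

```python
def linear_params(n_in, n_out, bias=True):
    """Parameter count of one dense layer (bias presence is an assumption)."""
    return n_in * n_out + (n_out if bias else 0)

BOTTLENECK = 618        # Lever 1
D = BOTTLENECK * 10     # inferred from the 10:1 squeeze; not stated in the release

# Lever 2: three linears, D -> 618 -> 618 -> D
bottleneck_total = (
    linear_params(D, BOTTLENECK)
    + linear_params(BOTTLENECK, BOTTLENECK)
    + linear_params(BOTTLENECK, D)
)
# For comparison: a single full-width D -> D linear layer
full_linear = linear_params(D, D)

print(bottleneck_total)   # 8027820
print(full_linear)        # 38198580
```

Under these assumptions the three-layer bottleneck uses roughly a fifth of the parameters of one full-width layer, and because Lever 10 is "Shared", that single set would be reused across ticks.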

Changes since v3.1.002

  • Orthogonal init for all 3 bottleneck Linear layers (probe: +0.8% vs kaiming)
  • GPU probe for Levers 4, 7, and 8 validated the remaining design choices
  • Levers 5+10 resolved analytically
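For readers unfamiliar with orthogonal initialization, here is a minimal standard-library sketch based on Gram-Schmidt orthogonalization of Gaussian vectors. It is an illustration of the general technique, not the project's code; the function names are hypothetical. Frameworks ship this as a built-in (for example `torch.nn.init.orthogonal_` in PyTorch), and this record does not say which routine the release uses.

```python
import math
import random

def orthogonal_rows(n_rows, n_cols, rng):
    """Return n_rows orthonormal vectors of length n_cols (needs n_rows <= n_cols)."""
    rows = []
    while len(rows) < n_rows:
        v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
        for r in rows:  # subtract the components along rows already accepted
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip the (unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows

def orthogonal_init(n_out, n_in, rng):
    """Weight matrix [n_out][n_in] with orthonormal rows (wide) or columns (tall)."""
    if n_out <= n_in:
        return orthogonal_rows(n_out, n_in, rng)
    cols = orthogonal_rows(n_in, n_out, rng)      # orthonormal columns
    return [list(row) for row in zip(*cols)]      # transpose to [n_out][n_in]

# Example: a wide 3 x 8 matrix whose rows are orthonormal
w = orthogonal_init(3, 8, random.Random(1))
gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in w] for ri in w]
```

In this example `gram` is the 3 x 3 identity up to floating-point error. That norm-preserving property is the usual motivation for orthogonal initialization.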

Key Probe Results

  • Placement (L4): Removing the input-time bottleneck (BN): -4.2%. Removing the think-tick BN: -10.4%.
  • Init (L7): Orthogonal +0.8%, xavier -2.4%, zero-last -0.6%.
  • Norm (L8): LN-before -0.4%, LN-after -2.4%. No norm is best.
  • Architecture proven: +13.3% bit accuracy from LCX at tt=1 (mini-model).

What's Next

Resume real training at tt=1 with the fully-locked architecture.

Notes

If you use this software, please cite it as below.

Files

VRAXION/VRAXION-v3.2.000.zip

Files (1.3 MB)

md5:1abf09d82d4266d55fd207ce10522473
1.3 MB
