VRAXION
v3.2.000 — Architecture Complete
All 10 LCX bottleneck projection levers are now locked via deterministic GPU/CPU probes. The architecture is fully specified and ready for training.
Bottleneck Design (LOCKED)
lcx_read [D] → Lin(D→D/10) → C19 → Lin(D/10→D/10) → C19 → Lin(D/10→D)
× zoom_gate(sigmoid=0.5) → + hidden
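The locked read path above can be sketched as a small PyTorch module. This is a hypothetical reconstruction from the spec, not the project's actual code: the model width `D` is assumed from the 618 bottleneck at a 10:1 squeeze, and `C19` is VRAXION's custom activation whose definition is not given in this document, so `torch.tanh` stands in as a placeholder.

```python
import torch
import torch.nn as nn

class LCXBottleneck(nn.Module):
    """Sketch of the locked bottleneck read path:
    Lin(D->D/10) -> C19 -> Lin(D/10->D/10) -> C19 -> Lin(D/10->D),
    scaled by a sigmoid zoom gate; the caller adds the result to `hidden`.
    """

    def __init__(self, d: int):
        super().__init__()
        assert d % 10 == 0, "10:1 squeeze requires D divisible by 10"
        b = d // 10  # lever 1: e.g. D=6180 -> bottleneck dim 618
        self.lin1 = nn.Linear(d, b)
        self.lin2 = nn.Linear(b, b)
        self.lin3 = nn.Linear(b, d)
        # Lever 7: orthogonal init for all three linears.
        for lin in (self.lin1, self.lin2, self.lin3):
            nn.init.orthogonal_(lin.weight)
            nn.init.zeros_(lin.bias)
        # Lever 6: zoom-gate logit starts at 0.0, so sigmoid(0) = 0.5.
        self.zoom = nn.Parameter(torch.zeros(1))

    def forward(self, lcx_read: torch.Tensor) -> torch.Tensor:
        c19 = torch.tanh  # placeholder: C19's definition is not in this document
        x = c19(self.lin1(lcx_read))
        x = c19(self.lin2(x))
        x = self.lin3(x)  # lever 9: no residual skip, proj(x) only
        # Lever 8: no normalization anywhere in the projection.
        return x * torch.sigmoid(self.zoom)
```

Note how levers 6, 8, and 9 show up as absences: no norm layers, no internal skip, and a gate that starts perfectly balanced at 0.5.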
10 Levers — All Resolved
| Lever | Decision | Method |
|-------|----------|--------|
| 1. Bottleneck dim | 618 (10:1 squeeze) | CPU probe |
| 2. Architecture | 2x618 (3 linears, 2 C19s) | CPU probe |
| 3. Activation | C19 everywhere | CPU probe |
| 4. Placement | Both input + think-tick | GPU probe |
| 5. Read/Write | Read only | Analysis |
| 6. Zoom gate init | 0.0 (balanced) | GPU probe |
| 7. Weight init | Orthogonal | GPU probe |
| 8. Normalization | None | GPU probe |
| 9. Residual skip | OFF (proj(x) only) | GPU probe |
| 10. Shared/per-tick | Shared | Analysis |
Changes since v3.1.002
- Orthogonal init for all 3 bottleneck Linear layers (probe: +0.8% vs kaiming)
- A combined GPU probe over levers 4, 7, and 8 validated the remaining design choices
- Levers 5+10 resolved analytically
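The orthogonal-init change can be sketched as follows. This is an illustrative snippet, not the project's code: layer shapes assume a model width of 6180 with the 618 bottleneck, and standard PyTorch initializers are used.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: initializing the three bottleneck linears orthogonally
# instead of with Kaiming init (the probe reported +0.8% vs. Kaiming).
d, b = 6180, 618  # assumed model dim and its 10:1 bottleneck (lever 1)
layers = [nn.Linear(d, b), nn.Linear(b, b), nn.Linear(b, d)]
for lin in layers:
    nn.init.orthogonal_(lin.weight)  # semi-orthogonal when the weight is non-square
    nn.init.zeros_(lin.bias)
```

Orthogonal (or semi-orthogonal) weights keep the rows or columns of each map orthonormal, which keeps activations well-conditioned at init and pairs naturally with lever 8's decision to use no normalization layer.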
Key Probe Results
- Placement (lever 4): removing the input-time bottleneck costs -4.2%; removing the think-tick bottleneck costs -10.4%. Both placements are kept.
- Init (lever 7): orthogonal +0.8%, xavier -2.4%, zero-last -0.6%.
- Norm (lever 8): LN-before -0.4%, LN-after -2.4%. No norm is best.
- Architecture proven: +13.3% bit accuracy from LCX at tt=1 (mini-model).
What's Next
Resume real training at tt=1 with the fully-locked architecture.
Files
| Name | Size |
|---|---|
| VRAXION/VRAXION-v3.2.000.zip (md5:1abf09d82d4266d55fd207ce10522473) | 1.3 MB |
Additional details
Related works
- Is supplement to: Software — https://github.com/VRAXION/VRAXION/tree/v3.2.000
Software
- Repository URL: https://github.com/VRAXION/VRAXION