A Discrete Weight Language for Large Language Models: Compressing Gemma 4 31B with a 5-Step Ternary Route

Aubakirov, Arman

doi:10.5281/zenodo.19632982

Published April 17, 2026 | Version v1

Preprint Open

Aubakirov, Arman (Researcher)¹

1. Independent Research

We present Route B, a weight-compression scheme that treats every weight tensor of a pretrained language model as a sequence of short codes drawn from a shared, per-family codebook. Each scalar weight is spelled out by a 5-step ternary path through a learned ladder of amplitudes. Applied to google/gemma-4-31B-it, the method re-encodes the 60 GB of bf16 weights in the model's 410 linear layers at 7.9 bits per weight, reducing the working memory of inference to about 31 GB with no change to the model's input/output interface...
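The 7.9 bits-per-weight figure is consistent with simple counting: a 5-step ternary path has 3^5 = 243 possible values, and log2(243) is about 7.92 bits. The sketch below illustrates that arithmetic and one plausible way such a code could be unpacked. It is not taken from the paper: the decoding rule (a signed sum of ladder amplitudes), the function names, and the example ladder values are all assumptions made for illustration only.

```python
import math

STEPS = 5                                # path length stated in the abstract
NUM_CODES = 3 ** STEPS                   # 243 distinct ternary paths
BITS_PER_WEIGHT = math.log2(NUM_CODES)   # about 7.92 bits per code


def index_to_path(code):
    """Unpack an integer code in [0, 243) into 5 ternary digits in {-1, 0, +1}."""
    if not 0 <= code < NUM_CODES:
        raise ValueError("code must be in [0, 243)")
    path = []
    for _ in range(STEPS):
        code, digit = divmod(code, 3)    # peel off the lowest base-3 digit
        path.append(digit - 1)           # map {0, 1, 2} to {-1, 0, +1}
    return path


def path_to_index(path):
    """Pack 5 ternary digits back into a single integer code."""
    code = 0
    for digit in reversed(path):
        code = code * 3 + (digit + 1)
    return code


def decode_weight(code, ladder):
    """ASSUMED decoding rule: the weight is a signed sum of ladder amplitudes."""
    return sum(t * a for t, a in zip(index_to_path(code), ladder))


# Hypothetical ladder of amplitudes; a real ladder would be learned per family.
EXAMPLE_LADDER = [0.5, 0.25, 0.125, 0.0625, 0.03125]
```

For example, under these assumptions `decode_weight(path_to_index([1, -1, 0, 0, 1]), EXAMPLE_LADDER)` adds the first amplitude, subtracts the second, skips the next two, and adds the last. Since 243 codes fit in one byte, each code could also be stored as a single `uint8`, though whether Route B does so is not stated here.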

Files (562.4 kB)

Name	MD5	Size
A_Discrete_Weight_Language_for_Large_Language_Models_Compressing_Gemma_4_31B_with_a_5_Step_Ternary_Route.pdf	fcf987f36643179edb510d232286a01b	328.9 kB
archiv_org_zenodo_source_20260417.zip	e78aa914293f479ab1e75f1bf494b489	233.5 kB

Additional details

Repository URL: https://github.com/AubakirovArman/discrete-weight-language-gemma4
Programming language: Python
Development Status: Active

	All versions	This version
Views	20	20
Downloads	14	14
Data volume	4.9 MB	4.9 MB
