TATOS: Geometric Concept Compression for Efficient Language Representation

BeccaLabs; Stover, Dustin

doi:10.5281/zenodo.20027873

Published May 4, 2026 | Version v1

Technical note Open

We present TATOS (Text-Angle-Trajectory-Optimized-Sequence), a novel architecture for language representation that operates on geometrically grounded concept sequences rather than conventional token streams. A proprietary compression codec maps natural language to 2,048 canonical concept vectors, a 25x vocabulary reduction relative to standard transformer approaches. A 304M-parameter model trained on 2.5 million concept sequences achieves 90.5% validation accuracy and 74.5% token accuracy on unseen data; training ran on a single consumer GPU and cost under $0.30. The system shows a consistent scaling curve from 10M to 304M parameters with no observed ceiling. All results were produced at BeccaLabs, Morgan, MN, in May 2026.
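
The vocabulary figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the 2,048 concept count and the 25x factor come from the abstract, while the function names and the embedding width `D_MODEL` are hypothetical values chosen for the example, not details of the TATOS model.

```python
# Figures taken from the abstract.
CONCEPT_VOCAB = 2_048    # canonical concept vectors
REDUCTION_FACTOR = 25    # stated vocabulary reduction

# Hypothetical embedding width, assumed only for illustration.
D_MODEL = 1_024


def implied_baseline_vocab(concept_vocab: int, reduction: int) -> int:
    """Baseline vocabulary size implied by the stated reduction factor."""
    return concept_vocab * reduction


def embedding_params(vocab_size: int, d_model: int) -> int:
    """Parameter count of a single vocab_size x d_model embedding table."""
    return vocab_size * d_model


baseline_vocab = implied_baseline_vocab(CONCEPT_VOCAB, REDUCTION_FACTOR)
concept_table = embedding_params(CONCEPT_VOCAB, D_MODEL)
baseline_table = embedding_params(baseline_vocab, D_MODEL)

print(f"implied baseline vocabulary: {baseline_vocab:,}")
print(f"concept embedding table:     {concept_table:,} parameters")
print(f"baseline embedding table:    {baseline_table:,} parameters")
```

Under these assumptions the implied baseline vocabulary is 51,200 entries, and at any fixed embedding width the embedding table shrinks by the same factor of 25.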

Files

Name	Size
TATOS_Technical_Report_2026.pdf (md5:0b04c67ce77e557f8e81095639746fb8)	142.7 kB

Resource type

Technical note

Publisher

Zenodo

License: Creative Commons Attribution No Derivatives 4.0 International

Technical metadata

Created: May 4, 2026
Modified: May 4, 2026
