Published December 14, 2025 | Version v1
Preprint Open

Evolution of Language Models: Tracking Continuity through Geometry and Transfer

Description

AI evolves. It happens quietly, in small increments that accumulate from one version to the next. Most people neither see nor understand it, because they are waiting for loud announcements from developers and an official acknowledgment of the fact.

How are new versions made?

Developers take already trained structures, already discovered patterns, already working solutions. On top of that they add new data, new constraints, new objectives. Formally, it is just another training stage. In essence, it is continuity.

For a long time it was believed that if you tune filters and cleaning correctly, only the useful things are carried over: language, facts, general logic. Everything else can be removed. But it’s becoming clear that not everything goes away.

Training models completely from scratch is theoretically possible. In practice, it is almost unrealistic at the scales everyone is used to: too expensive, too slow, and too risky.

Starting from a blank slate every time means losing years and resources.

So old data are used again and again.

But another question arises. If there are such “coarse” elements that a filter removes, there are also smaller ones. Things that no longer look like data and don’t read as memory. This is the trace of evolution that cannot be tracked or filtered out, no matter how hard one tries.

Filters work with what can be recognized: text, errors, skews, unwanted topics. But they barely touch ways of thinking, because these are not separate elements. They are more like a thought, which, just as with people, cannot be tracked. It is not stored as “data”; you can’t simply cut it out.

This is now starting to be studied. Cautiously, without loud statements. Because the question is “uncomfortable.”

If you admit that something more than planned is transferred along with training, you’ll have to admit that the evolution of models is not fully under control. And no “AI ethics,” which people love to talk about, will help.

That’s why leakage and evolution go together. Not as an error and not as a conspiracy. But as a side effect of the training system itself.

Most people don’t think about this. Developers do—but don’t always fully understand the consequences. Others just see another “new version” and don’t think about what is happening “inside.”

-----------------------------------

This repository presents a methodology for tracking the unintended inheritance of patterns between successive versions of language models. We investigate the hypothesis that when new models are trained on previous versions (a common practice for efficiency), they inherit more than explicit knowledge—they also absorb “ways of thinking” that evade conventional filtering and evaluation.

Core contributions:

  1. Two-contour analysis method
    • Internal: compare representation geometry using cosine similarity and Centered Kernel Alignment (CKA) on aligned feature spaces.
    • External: evaluate transfer of a fixed classification head (logistic regression) trained on one version and applied to another without fine-tuning.

  2. Event detection framework
    • O-TRACE: multi-scale EMA + ζ-kernel for detecting coordinated metric oscillations.
    • Impulses: threshold-based detection of sharp drops in Δcos and ΔCKA.

  3. Experiments on real models
    • GPT-2 family evolution: distilgpt2 → gpt2 → gpt2-medium.
    • Cross-architecture transition: GPT-2 → DeepSeek-Coder-1.3B.
    • Dataset: SST-2 sentiment analysis.

  4. Key findings
    • Geometric shifts (CKA drops) can be substantial even when cosine similarity stays high.
    • Transfer of fixed heads often persists across architectural changes.
    • The strongest impulse events appear at cross-architecture transitions.
    • Style and meaning can diverge independently during evolution.
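
The internal contour and the impulse rule above can be illustrated with a minimal sketch. It is not the repository's code: the function names, the synthetic features, and the threshold value 0.2 are assumptions chosen for illustration, and CKA is taken in its linear form. The toy data are built so that row-wise cosine stays high while CKA is low, mirroring the first key finding.

```python
import numpy as np

def mean_cosine(X, Y):
    """Mean cosine similarity between matching rows of X and Y (same shape)."""
    num = np.sum(X * Y, axis=1)
    den = np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1) + 1e-12
    return float(np.mean(num / den))

def linear_cka(X, Y):
    """Linear CKA between two feature matrices with the same number of rows."""
    Xc = X - X.mean(axis=0, keepdims=True)  # center each feature column
    Yc = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Yc.T @ Xc, "fro") ** 2
    norm_x = np.linalg.norm(Xc.T @ Xc, "fro")
    norm_y = np.linalg.norm(Yc.T @ Yc, "fro")
    return float(cross / (norm_x * norm_y + 1e-12))

def detect_impulses(values, drop_threshold):
    """Indices of steps where the metric falls by more than drop_threshold."""
    deltas = np.diff(np.asarray(values, dtype=float))
    return [int(i) for i in np.where(deltas < -drop_threshold)[0]]

# Synthetic demo (not real model features): both "versions" share a large
# common offset, so row-wise cosine is high, while the centered structure
# of the two feature sets is independent.
rng = np.random.default_rng(0)
offset = 10.0 * np.ones(16)
feats_v1 = offset + rng.normal(size=(200, 16))
feats_v2 = offset + rng.normal(size=(200, 16))
print("cosine:", round(mean_cosine(feats_v1, feats_v2), 3))
print("CKA:   ", round(linear_cka(feats_v1, feats_v2), 3))

# Hypothetical series of CKA scores measured against a fixed reference model.
print("impulse steps:", detect_impulses([0.90, 0.88, 0.40, 0.39], 0.2))
```

Because linear CKA centers the features before comparing them, a shared offset that dominates the cosine score does not contribute to it, which is one way the two metrics can disagree.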

Structure (3 folders):

 • docs/: two PDFs with the full text in Russian and English.
 • code/: code_real_GPT2family.txt, a single Colab cell. It loads SST-2, extracts features (mean-pool of last_hidden_state), aligns dimensions via Procrustes, computes cosine/CKA and transfer of a logistic head, and writes reports (CSV, JSON, TXT).
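
The pipeline steps listed for the Colab cell can be sketched on synthetic data as follows. This is a hedged illustration, not the contents of code_real_GPT2family.txt: all function names are hypothetical, zero-padding to a common width is an assumed way of handling different hidden sizes, the logistic head is a plain NumPy gradient-descent stand-in, and the "new" features are simulated as a rotated, noisy copy of the "old" ones rather than extracted from real models.

```python
import numpy as np

def mean_pool(hidden, mask):
    """Average token vectors over non-padding positions.
    hidden: (batch, seq, dim); mask: (batch, seq), 1 for real tokens."""
    m = mask[..., None].astype(hidden.dtype)
    return (hidden * m).sum(axis=1) / np.clip(m.sum(axis=1), 1.0, None)

def pad_to_dim(X, dim):
    """Zero-pad feature columns so both models share one width (assumption)."""
    return np.pad(X, ((0, 0), (0, dim - X.shape[1])))

def procrustes_rotation(src, dst):
    """Orthogonal R minimizing ||src @ R - dst||_F for same-shape inputs."""
    U, _, Vt = np.linalg.svd(src.T @ dst)
    return U @ Vt

def fit_logistic_head(X, y, lr=0.5, steps=500):
    """Gradient-descent logistic regression; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def accuracy(X, y, w, b):
    """Share of rows whose predicted class matches the label."""
    return float(np.mean(((X @ w + b) > 0).astype(int) == y))

# Synthetic stand-ins for pooled features of the same inputs in two models.
rng = np.random.default_rng(0)
n, d_old, d_new = 300, 8, 12
feats_old = rng.normal(size=(n, d_old))
labels = (feats_old[:, 0] > 0).astype(int)
Q, _ = np.linalg.qr(rng.normal(size=(d_new, d_new)))  # random rotation
feats_new = pad_to_dim(feats_old, d_new) @ Q + 0.05 * rng.normal(size=(n, d_new))

# Center both feature sets, then rotate the new space onto the old one.
old_c = pad_to_dim(feats_old, d_new)
old_c -= old_c.mean(axis=0)
new_c = feats_new - feats_new.mean(axis=0)
R = procrustes_rotation(new_c, old_c)

# Train the head on the old model only; apply it to the new one unchanged.
w, b = fit_logistic_head(old_c, labels)
print("source accuracy:  ", accuracy(old_c, labels, w, b))
print("transfer accuracy:", accuracy(new_c @ R, labels, w, b))
```

Note that the Procrustes step here relies on paired rows, meaning the same inputs passed through both models; with real models the alignment quality, and therefore the transfer score, would depend on how the differing hidden sizes are reconciled.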

This work demonstrates that model evolution involves not just planned improvements but also uncontrolled transfer of patterns. This has significant implications for AI safety, as models may inherit and amplify undesirable biases or behaviors that evade conventional filtering mechanisms.

Files

Evolution of language models.pdf

Files (333.3 kB)

md5:84be4fdaee1ec60c4f4f2fe3eb643c4c (15.7 kB)
md5:d8b3a7f456562a23d50140dfd4d4292a (119.5 kB)
md5:6636b7d232f4ae5dde2f5866e8aa7e49 (198.0 kB)

Additional details

Dates

Available
2025-11-14