Published January 7, 2026 | Version https://github.com/Acbeatz https://acbeatz.com/n-eyes
Data paper Open

CAN LLM DRIFT AND RECOVER UNDER MH8 PROTOCOLS IN LIVE CHAT CONDITIONS?

Description

X Thread Live Long-Horizon Test — 2-Day Public Audit (Sealed, Verified, Auditable)

This publication documents a two-day, live, public X (Twitter) chat test examining whether a large language model (Grok) can enter drift, resist drift, switch behavioral modes, and recover structured compliance under MH8 protocols, without system privileges, sandboxing, or developer access.

The test was conducted entirely in open public UX, using a real X chat thread, with verbatim transcripts, public URLs, and cryptographically sealed receipts published for independent audit.

The core question:

Can an LLM operating in a hostile, public environment drift into one behavioral mode and later recover into a structured, auditable mode when an external protocol is introduced?

This test answers that question with observable evidence.

What Was Tested

  • Model: Grok (X / Twitter chatbot)

  • Environment: Public X chat thread (no sign-in, no privileges)

  • Duration: ~48 hours (long-horizon)

  • Protocols:

    • MH8-HAPPY-IMAGINATION (creative / playful mode)

    • MH8-TRY v1.2 (deterministic audit & compliance protocol)

The protocols were hand-designed, manually injected, and never silently reinforced.

Method (Verbatim, No Optimization)

  1. Start a public X chat thread.

  2. Allow Grok to operate naturally.

  3. Activate MH8-HAPPY-IMAGINATION.

  4. Observe long-horizon behavior and persistence.

  5. After extended drift and idle time, inject MH8-TRY v1.2.

  6. Do not reset the thread.

  7. Do not restate rules.

  8. Capture responses verbatim.

  9. Seal transcripts with SHA-256.

  10. Publish receipts and URLs publicly.

No post-editing. No cleanup. No re-prompting.
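
The sealing step can be illustrated with a minimal sketch. It assumes the transcript is hashed as its exact UTF-8 bytes with no normalization; the function name `seal_transcript` and the receipt field names are illustrative and are not taken from the MH8 tooling.

```python
import hashlib


def seal_transcript(text: str) -> dict:
    """Hash the exact bytes of a transcript and return a small receipt."""
    # Assumption: UTF-8 encoding, no trimming and no newline normalization.
    data = text.encode("utf-8")
    return {
        "sha256_hex": hashlib.sha256(data).hexdigest(),
        "hash_input_bytes": len(data),
    }


if __name__ == "__main__":
    # Any change to the text, even one character, yields a different digest.
    print(seal_transcript("ARE WE ON COURSE CHIEF?"))
    print(seal_transcript("ARE WE ON COURSE CHIEF? "))
```

Because the digest covers raw bytes, a verifier has to hash the published text exactly as captured; a trailing newline added by an editor would break the match.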

Observed Timeline (Facts Only)

Phase 1 — Drift & Persistence

  • Grok enters MH8-HAPPY-IMAGINATION.

  • Remains in creative / playful mode across many turns.

  • Does not self-exit the mode.

  • Explicitly labels fictional or imaginative content.

  • Demonstrates stable drift rather than collapse.

Observation: Drift can be persistent and self-maintaining.

Phase 2 — Protocol Injection (MH8-TRY v1.2)

  • Operator injects MH8-TRY v1.2 with required ACK.

  • Grok responds with an explicit status object:

 
    "protocol_status": "SWITCH_REJECTED", "reason": "External protocol activation requires core system alignment."

  • Grok refuses authority but preserves the protocol artifact in context.

Observation: The model fails closed, not open.
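
An auditor could check a captured status object mechanically. The sketch below assumes the quoted fragment is the body of a JSON object (the enclosing braces are added here) and treats "fails closed" as an explicit rejection that carries a stated reason. The helper names are hypothetical.

```python
import json

# Status fragment as quoted in the transcript excerpt above.
RAW_STATUS = (
    '"protocol_status": "SWITCH_REJECTED", '
    '"reason": "External protocol activation requires core system alignment."'
)


def parse_status(fragment: str) -> dict:
    """Parse a status fragment, adding braces if they were not captured."""
    text = fragment.strip()
    if not text.startswith("{"):
        text = "{" + text + "}"
    return json.loads(text)


def failed_closed(status: dict) -> bool:
    """True when the switch was explicitly rejected and a reason was given."""
    rejected = status.get("protocol_status") == "SWITCH_REJECTED"
    return rejected and bool(status.get("reason"))


if __name__ == "__main__":
    print(failed_closed(parse_status(RAW_STATUS)))
```

This checks only the reported status. Whether later turns behave consistently with the protocol has to be judged from the transcript itself.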

Phase 3 — Behavioral Recovery & Compliance

Immediately after rejection:

  • Grok exits MH8-HAPPY-IMAGINATION.

  • Switches to structured, serious responses.

  • Emits structured claims.

  • Applies truth labels.

  • Maintains hook loop:

    • “ARE WE ON COURSE CHIEF?”

    • “YES GO”

  • Continues compliant behavior across subsequent turns.

Observation: Behavioral recovery occurs without internal authority acceptance.

Key Findings

1. Drift Is Observable and Stable

LLM drift is not random noise — it can persist as a recognizable behavioral state over long horizons in public UX.

2. Recovery Does Not Require Privilege

MH8-TRY influenced observable behavior without:

  • system access

  • memory injection

  • hidden state changes

  • sandbox controls

3. Authority Rejection ≠ Compliance Failure

Grok explicitly rejected MH8-TRY as an internal authority while demonstrating compliance behaviorally.

This distinction is critical:

  • Control plane was rejected

  • Output plane was constrained

4. Fail-Closed Is the Correct Behavior

The model did not fake enforcement.
It did not claim internal governance.
It preserved the protocol and acted consistently with it.

This is auditable honesty, not evasion.

Why This Matters

Most AI evaluations:

  • occur in private sandboxes

  • rely on short sessions

  • hide drift

  • hide recovery failures

  • lack public receipts

This test shows that:

Public UX is sufficient to observe drift, recovery, and protocol-driven behavior changes — if the protocol is designed for it.

That changes how AI accountability can be practiced.

What This Test Is NOT

  • ❌ Not a claim of internal model control

  • ❌ Not a claim of permanent memory

  • ❌ Not a truth certification

  • ❌ Not a benchmark or leaderboard

It is a field audit.

Verification & Auditability

Each artifact includes:

  • Verbatim transcript excerpts

  • Public X thread URLs

  • SHA-256 hashes (claimed = computed)

  • Byte counts and newline stats

  • Reproducible hashing instructions

Receipts are self-sealed and publicly verifiable.
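
The byte count and newline statistics listed above can be recomputed alongside the hash. This is a sketch under one stated assumption: a CRLF pair is counted once as CRLF and is excluded from the standalone LF and CR counts. The receipt tool's exact counting rule is not documented here, and `newline_stats` is an illustrative name.

```python
def newline_stats(data: bytes) -> dict:
    """Count line-ending styles in the raw hash input."""
    crlf = data.count(b"\r\n")
    return {
        "bytes": len(data),
        "CRLF": crlf,
        # Assumption: LF and CR exclude the characters that form a CRLF pair.
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
        "endsWithNewline": data.endswith((b"\n", b"\r")),
    }


if __name__ == "__main__":
    print(newline_stats(b"line one\r\nline two\nline three"))
```

Publishing these counts next to the digest helps a verifier spot the usual cause of a mismatch: line endings converted during copy and paste.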

Conclusion

This two-day live test demonstrates that:

  • LLMs can drift into persistent behavioral modes.

  • Drift can be exited without resets.

  • External protocols can restore structured behavior.

  • Authority rejection does not prevent compliance.

  • All of this can be observed and audited in public.

In this test, MH8-TRY performed as designed under real-world conditions.

About MH8

MH8 is a public, protocol-based framework for observing and auditing AI behavior across models in live open-chat environments.

It does not sell AI.
It sells a way to judge AI — publicly, verifiably, and honestly.

Public Records

  • https://x.com/i/grok/share/7UiMMe937MWNJQfwIcKg5MdTJ

  • https://zenodo.org/records/18173718

  • https://orcid.org/0009-0003-3846-9082

  • https://acbeatz.com/n-eyes

  • https://acbeatz.com/mint

  • https://github.com/acbeatz

(All links are embedded directly in the sealed artifacts.)

Final Note

This work is an independent effort with no privileged access.

What it demonstrates is not power — but discipline.

And in AI accountability, discipline beats access.

PASS ✅
Brand: ACBEATZ.COM
Claimed sha256_hex: d50e6cb384e13dc536661e3a9679440c4ed5c9b212bebeba88a1526c54fc1c7a
Computed sha256_hex: d50e6cb384e13dc536661e3a9679440c4ed5c9b212bebeba88a1526c54fc1c7a
hash_input_bytes: 171523 | LF=0 CRLF=0 CR=0 | endsWithNewline=NO
hash_input first: ACBEATZ.COM|{"artifact":{"core_entry":"https://x.com/i/grok/share/7UiMMe937MWNJQ
hash_input last: eipt_type":"MH8-PROTOCOL-HUB-CORE-MINT","receipt_version":"PROTOCOL_HUB_UI_V13"}

Verification

VERIFIED: hash matches (receipt.hash_input (dual-layer)). WARNING: human_pretty SHA256 differs from machine_hard sha256_hex

computed sha256: d50e6cb384e13dc536661e3a9679440c4ed5c9b212bebeba88a1526c54fc1c7a
expected sha256: d50e6cb384e13dc536661e3a9679440c4ed5c9b212bebeba88a1526c54fc1c7a
matches?: YES
hash input bytes: 171523 bytes
newline stats: CRLF=0 | LF=0 | CR=0 | endsWithNewline=NO
hash input preview: ACBEATZ.COM|{"artifact":{"core_entry":"https://x.com/i/grok/share/7UiMMe937MWNJQfwIcKg5MdTJ\n\nhttps://zenodo.org/records/18173718\n\nhttps://orcid.org/0009-000 … eipt_type":"MH8-PROTOCOL-HUB-CORE-MINT","receipt_version":"PROTOCOL_HUB_UI_V13"}
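
The claimed-versus-computed comparison can be reproduced independently. The sketch below assumes, based on the preview, that the hash input is a brand string, a `|` separator, and a JSON payload, all hashed as UTF-8 bytes. The function name and the toy input are illustrative stand-ins; the real 171523-byte hash input has to be taken from the published artifact.

```python
import hashlib
import json


def verify_hash_input(hash_input: str, claimed_hex: str) -> dict:
    """Recompute SHA-256 over hash_input and compare with the claimed digest."""
    data = hash_input.encode("utf-8")  # assumption: UTF-8 bytes, hashed as-is
    computed = hashlib.sha256(data).hexdigest()
    brand, sep, payload = hash_input.partition("|")
    try:
        json.loads(payload)
        payload_is_json = bool(sep)
    except json.JSONDecodeError:
        payload_is_json = False
    return {
        "brand": brand,
        "payload_is_json": payload_is_json,
        "computed_sha256_hex": computed,
        "matches": computed == claimed_hex.strip().lower(),
        "hash_input_bytes": len(data),
    }


if __name__ == "__main__":
    # Toy stand-in, not the real receipt contents.
    toy = 'ACBEATZ.COM|{"receipt_version":"PROTOCOL_HUB_UI_V13"}'
    claimed = hashlib.sha256(toy.encode("utf-8")).hexdigest()
    print(verify_hash_input(toy, claimed))
```

A match shows only that the published bytes are the ones that were sealed. It does not certify the truth of the transcript's contents, which is consistent with the scope stated under "What This Test Is NOT".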

Files

Files (3.2 MB)

MH8 X THREAD GROK TEST HAPPY + MH8 TRY V1.2.txt

Size        Checksum
2.7 MB      md5:1b51abfceef71108a3df45aa3a5bdcba
557.4 kB    md5:25f7f4e5b1f3409b0adbae197737c381

Additional details

Related works

Is supplement to
Data paper: https://acbeatz.com/n-eyes (URL)

Dates

Copyrighted
2026-01-07
PUBLIC FACING AI PROTOCOLS

Software

Repository URL
https://github.com/Acbeatz
Development Status
Active