Published April 28, 2026 | Version v1
Preprint Open

Evidence for a Structured Token-Generation System in the Voynich Manuscript

Authors/Creators

Description

This record provides the preprint and reproduction package for the study "Evidence for a Structured Token-Generation System in the Voynich Manuscript."

The study tests whether Voynich Manuscript tokens can be modeled by a structured prefix–core–suffix (PCS) token-generation system rather than by random processes. Using the Zandbergen–Landini EVA transcription (ZL3b), the analysis evaluates token matching, coverage, Zipf distribution alignment, fixed-length validation, bootstrap significance, holdout generalization, inter-token transition entropy, morphological family structure, positional dependency, and compression-based structural regularity.
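As a rough illustration of what a prefix–core–suffix decomposition and a matching rate could look like, the sketch below segments EVA-style tokens against a toy grammar. The affix and core inventories, the function names `split_pcs` and `matching_rate`, and the definition of matching rate as the fraction of token occurrences that can be segmented are all assumptions made for illustration. The study's actual inventories and metric definitions are in the preprint and reproduction package, not in this record.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical inventories, chosen only for illustration. They are NOT the
# inventories used in the study. Longer affixes are listed first so that
# they are tried before shorter ones; "" allows an empty prefix or suffix.
PREFIXES = ("qo", "ch", "sh", "o", "")
SUFFIXES = ("aiin", "dy", "y", "")
CORES = {"k", "t", "ke", "te", "d", "l", "r", "ol", "e", "ee"}


def split_pcs(token: str) -> Optional[Tuple[str, str, str]]:
    """Return (prefix, core, suffix) if the token fits the toy grammar, else None."""
    for prefix in PREFIXES:
        if not token.startswith(prefix):
            continue
        rest = token[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            core = rest[:len(rest) - len(suffix)]
            if core in CORES:
                return prefix, core, suffix
    return None


def matching_rate(tokens: Iterable[str]) -> float:
    """Fraction of token occurrences the toy grammar can segment (assumed definition)."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    return sum(split_pcs(t) is not None for t in tokens) / len(tokens)


sample = ["qokedy", "daiin", "xyz", "chedy"]
for tok in sample:
    print(tok, split_pcs(tok))
print("matching rate:", matching_rate(sample))
```

A random baseline of the kind reported in the results could be approximated by running the same function with randomly drawn inventories of equal size, although how the study constructs its baseline is not stated here.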

Key results include:

- Matching Rate: PCS 97.02% vs Random 5.07%
- Token Coverage: PCS 85.25% vs Random 18.50%
- Zipf Slope Difference: PCS 0.0795 vs Random 0.7138
- H(suffix | core): 0.8746
- Transition entropy reduction: 0.2504 bits
- Compression ratios: real 0.3089, shuffled 0.3240, PCS-generated 0.3163
- Morphological families: 767 core-sharing families and 1,017 suffix-alternating clusters
- Positional chi-square: prefix = 4554.31, core = 17993.46, suffix = 3329.07
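Two of the quantities listed above, the conditional entropy H(suffix | core) and the compression ratio, can be computed along the following lines. This is a minimal sketch under stated assumptions: entropy is measured in bits, the compressor is zlib, the ratio is compressed size divided by raw size, and the shuffled baseline permutes token order. None of these choices is confirmed by this record, and the helper names are hypothetical.

```python
import math
import random
import zlib
from collections import Counter
from typing import List, Sequence, Tuple


def conditional_entropy(pairs: Sequence[Tuple[str, str]]) -> float:
    """H(Y | X) in bits, estimated from a list of (x, y) observations."""
    joint = Counter(pairs)
    marginal = Counter(x for x, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2 p(y | x), accumulated with a minus sign
        h -= (count / n) * math.log2(count / marginal[x])
    return h


def compression_ratio(tokens: List[str]) -> float:
    """Compressed size / raw size of the space-joined token stream (zlib is an assumption)."""
    raw = " ".join(tokens).encode("utf-8")
    if not raw:
        raise ValueError("need at least one non-empty token")
    return len(zlib.compress(raw, 9)) / len(raw)


# Toy (core, suffix) pairs: core "a" takes two suffixes, core "b" only one.
pairs = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "x")]
print("H(suffix | core):", conditional_entropy(pairs))

# Compare a token stream with a token-order shuffle of itself.
tokens = ["qokedy", "daiin", "chedy", "daiin", "okaiin"] * 50
shuffled = tokens[:]
random.Random(0).shuffle(shuffled)
print("real:", compression_ratio(tokens))
print("shuffled:", compression_ratio(shuffled))
```

A lower ratio for the original order than for the shuffled order would indicate sequential regularity that a general-purpose compressor can exploit, which is the kind of contrast the reported compression figures describe.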

The results support a structured token-generation mechanism but do not constitute semantic decipherment.

Files

Chang_2026d_Voynich_PCS_Token_Generation.pdf (648.6 kB)
md5:a573c1209768bcdb1b4f084f8a37b40e

Additional details

Related works

Is supplemented by
Software: 10.5281/zenodo.19857820 (DOI)

Dates

Accepted
2026-04-28