Published May 19, 2026
| Version v1.0.0
Software
Open
Mark1999/latent-structure-benchmark: v1.0.0 — Initial public release
Authors/Creators
Description
Initial public release of the Latent Structure Benchmark (LSB).
LSB applies Cultural Domain Analysis (CDA) elicitation protocols — free listing, pile sorting, pile interview — to large language models as if they were informants. It surfaces the corpus lens: the latent categorical structure of a training corpus, refracted through training and alignment, made visible by structured elicitation.
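As a rough illustration of how pile-sort data is commonly aggregated in Cultural Domain Analysis, the sketch below computes, for each pair of items, the proportion of sorts in which the two items landed in the same pile. This is a generic CDA technique and not necessarily how LSB processes its data. The function name `pile_sort_similarity` and the input layout (one list of piles per model run) are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pile_sort_similarity(sorts):
    """Return pairwise co-occurrence proportions for a set of pile sorts.

    `sorts` is a list of pile sorts (hypothetical layout: one per informant
    or model run). Each pile sort is a list of piles, and each pile is a
    list of item names. The result maps an alphabetically ordered pair
    (a, b) to the fraction of sorts in which a and b shared a pile.
    """
    counts = Counter()
    for piles in sorts:
        for pile in piles:
            # Deduplicate and order items so each pair has one canonical key.
            for a, b in combinations(sorted(set(pile)), 2):
                counts[(a, b)] += 1
    total = len(sorts)
    return {pair: n / total for pair, n in counts.items()}


# Two illustrative sorts of the same three items (made-up data).
example_sorts = [
    [["apple", "pear"], ["carrot"]],
    [["apple", "pear", "carrot"]],
]
similarity = pile_sort_similarity(example_sorts)
```

Pairs that never share a pile are simply absent from the result, so a caller that needs a full matrix would treat missing pairs as 0.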
LSB is not a capability benchmark, not a leaderboard, and not a ranking. This release includes:
- The open-data bundle (CC0 1.0 Universal, 1.55 GB): https://huggingface.co/datasets/AILLM1999/latent-structure-benchmark
- The reproducible build script (scripts/build_db.py) and full data dictionary (docs/DATA_DICTIONARY.md)
- The dashboard at https://cogstructurelab.com
- Every method-defining document under docs/ and ARCHITECTURE.md
Files
| Name | Size |
|---|---|
| Mark1999/latent-structure-benchmark-v1.0.0.zip (md5:cfe642424e81eac7d6315fc75172d2f4) | 6.2 MB |
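To confirm that a downloaded archive matches the MD5 checksum listed for this release, a minimal sketch using Python's standard `hashlib` module could look like the following. The helper names `md5_of_file` and `matches_published_md5` are hypothetical, and the local file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# MD5 checksum as listed in the Files section of this record.
EXPECTED_MD5 = "cfe642424e81eac7d6315fc75172d2f4"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_published_md5("latent-structure-benchmark-v1.0.0.zip")` would return `True` for an intact download saved under that (assumed) name. MD5 is adequate for detecting transfer corruption but is not a defense against deliberate tampering.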
Additional details
Related works
- Is supplement to
- Software: https://github.com/Mark1999/latent-structure-benchmark/tree/v1.0.0 (URL)
Software
- Repository URL
- https://github.com/Mark1999/latent-structure-benchmark