There is a newer version of the record available.

Published May 12, 2026 | Version 0.6.0
Software Open

gtheory-app: a Shiny app for generalizability theory analyses

  • 1. Unidade de Educação Médica, Faculdade de Medicina e Ciências Biomédicas, Universidade do Algarve
  • 2. Faculdades Pequeno Príncipe
  • 3. Inspirali Educação
  • 4. European Board of Medical Assessors

Description

Generalizability theory (G-theory) extends classical reliability theory by partitioning measurement error into multiple, separable sources, thereby supporting the design of more dependable measurement procedures. Existing software for G-theory, most notably the GENOVA Suite and EduG, is widely used but predates contemporary R-based reproducible workflows and lacks interactive exploration of decision (D) study trade-offs. We introduce gtheory-app, a free and open-source Shiny application that brings G-theory analysis into the browser. The application accepts long-format data uploads with auto-detected delimiters or score-only files paired with a user-specified design, supports balanced and unbalanced data, and handles fully crossed, nested, and mixed (random + fixed) designs with an arbitrary number of facets. Variance components are estimated via lme4 (REML), with a classical ANOVA cross-check available for balanced data. The application then reports universe-score variance, relative and absolute error variances, the generalizability coefficient (Eρ²), the index of dependability (Φ), Φ(λ) for criterion-referenced decisions, a G-facets sensitivity table, and an interactive 3D surface plot of the resulting D-study over any pair of user-selected facets. Every numerical output is paired with a context-sensitive, plain-language interpretation that names the user's actual facets, places reliability values in conventional bands, and identifies the largest single error term as a target for redesign. We illustrate the application on three classes of designs: a fully crossed three-facet design, a nested-items design, and a multi-facet chain-nested design. We argue that gtheory-app fills a clear gap in the open-source psychometric software ecosystem and lowers the barrier to G-theory for applied researchers, educators, and assessment professionals.

CHANGELOG:


0.6.0 (May 2026)

Added

- Universe-size column for partially-fixed facets. The facets
  specification table in section 2 of the sidebar now has a sixth
  column, "Universe size". For a fixed facet whose universe is
  larger than the sample and finite (e.g. EduG's `D = 2 of 17` in
  example 16), enter the universe N here; the app then applies the
  Cardinet `(1 − n/N)` factor when computing the Mixed estimates,
  the Corrected estimates, the G-Study contributions, and σ²(X̄..).
  For random facets and fully-sampled-fixed facets you can leave the
  field blank and the app defaults to N = n.

  This closes the last EduG-compatibility gap. With v0.6.0, example
  16 reproduces EduG's σ²(X̄..) = 0.015 and D's absolute-error
  contribution = 0.005 (27.3%) exactly.
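
As a rough illustration of the `(1 − n/N)` factor described above, the following minimal sketch computes it for a single fixed facet. The function name `cardinet_factor()` and its arguments are hypothetical and are not taken from the app's source; the sketch only assumes that a blank universe size is treated as N = n.

```r
# Hypothetical helper (not the app's own code): Cardinet (1 - n/N) factor
# for one fixed facet.
#   n = number of sampled levels
#   N = universe size; NA stands for a blank field and is treated as N = n
cardinet_factor <- function(n, N = NA) {
  N <- ifelse(is.na(N), n, N)          # blank universe size defaults to N = n
  stopifnot(all(n > 0), all(N >= n))   # the universe cannot be smaller than the sample
  1 - n / N
}

cardinet_factor(2, 17)  # partially fixed facet: 2 levels sampled out of 17
cardinet_factor(2)      # fully sampled fixed facet: the factor is 0
```

With n = 2 and N = 17 the factor is 15/17, which rounds to the 0.882 quoted for EduG's `D = 2 of 17`.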

Fixed (inherited from v0.5.0)

- All v0.5.0 patches are carried forward:
  - τ classification for "D + only-fixed-I" components (→ Δ_only).
  - Corrected-column sign behavior (negatives preserved).
  - Cardinet finite-population correction applied to every fully-
    fixed component, not just D-only ones.
  - σ²(X̄..) finite-population skip rule (now refined to use
    the user-specified N rather than assuming n = N).
  - Estimation-method toggle (REML vs. ANOVA Method-1).


0.5.0 (May 2026)

Fixed

- τ classification for `D + only-fixed-I` components. EduG puts
  components consisting of a differentiation facet combined with
  only fully-sampled-fixed instrumentation facets (e.g. MO, MK:O,
  DSM, DMO, SMO on the 16-Language-Learning design) into the
  absolute-error column, not into universe-score variance σ²(τ).
  v0.4.0 was wrongly adding them to σ²(τ) (averaged over the fixed
  I's), which inflated Coef_G. v0.5.0 routes them to Δ_only,
  matching EduG. For example 16 this brings σ²(τ) from 0.0406 down
  to 0.035 and Coef_G relative from 0.87 up to 0.90.

- σ²(X̄..) in mixed designs. The error-variance-of-the-grand-mean
  helper now skips components that contain any fully-sampled-fixed
  facet (the Cardinet (1 − n/N) factor is 0 for those, so they
  contribute nothing to the grand-mean variance under the universe
  of generalization). For purely random designs this is identical
  to the v0.4.0 formula (verified: brennan3E gives 0.2195 either
  way, so Φ(6) still matches EduG's 0.659).

  Known limitation: the design table only flags facets as
  "random" or "fixed". It does not yet let the user specify a
  finite-but-larger universe N for partially-fixed facets such as
  EduG's `D = 2 of 17`. For those facets the app treats n = N, so
  the (1 − n/N) factor is 0 and the partially-fixed facet does not
  contribute to σ²(X̄..). EduG, which knows N = 17 for D, applies
  (1 − 2/17) = 0.882 and gets a non-zero contribution. Adding an
  `n_universe` column to the design table is on the v0.6.0 roadmap.

Added

- Estimation method toggle. A new radio control in the sidebar
  (section 1, just below the data input) lets the user choose between:
  - REML (lme4) — default; works for any data, balanced or
    unbalanced. Slow on saturated nested designs with many random-
    effect levels.
  - ANOVA Method-1 (closed form) — balanced data only; runs in
    milliseconds because it computes SS / df / MS by inclusion-
    exclusion and inverts the EMS matrix in one shot. Numerically
    identical to EduG on every balanced design the application can
    represent. The toggle automatically falls back to REML if the
    data is unbalanced.

  Use the ANOVA path when you want EduG-style instant results on
  large nested designs like the EduG "16 Variations of attitudes
  towards learning of German" example, which previously took
  lme4 several minutes to fit.
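
The fallback behaviour of the toggle can be pictured with the sketch below. It is an assumption-laden illustration, not the app's implementation: `is_balanced_crossed()` and `resolve_method()` are hypothetical names, and the balance check only covers fully crossed facets (nested designs would need a check within each parent level).

```r
# Hypothetical helper: TRUE when every cell of the full crossing holds the
# same, non-zero number of observations (crossed facets only).
is_balanced_crossed <- function(data, facets) {
  counts <- as.vector(table(data[facets]))
  all(counts > 0) && all(counts == counts[1])
}

# Hypothetical resolver: honour the requested method, but fall back to
# REML when the ANOVA path is requested on unbalanced data.
resolve_method <- function(data, facets, requested = c("reml", "anova")) {
  requested <- match.arg(requested)
  if (requested == "anova" && !is_balanced_crossed(data, facets)) {
    message("Data are unbalanced; falling back to REML.")
    return("reml")
  }
  requested
}

balanced   <- expand.grid(persons = 1:3, items = 1:2)
unbalanced <- balanced[-1, ]   # drop one cell to break the balance

resolve_method(balanced,   c("persons", "items"), "anova")
resolve_method(unbalanced, c("persons", "items"), "anova")
```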

Fixed

- Corrected column now matches EduG's sign behavior. v0.4.0
  truncated negative values in the Corrected column to zero with
  `pmax(Mixed, 0)`. EduG actually keeps the sign on the Corrected
  estimate (e.g. MI = −0.006 in example 10) and only applies the
  finite-population correction (n−1)/n when the component consists
  entirely of fully-sampled-fixed facets. v0.5.0's `.anova_decomp()`
  now follows the EduG rule exactly:
  - If every facet in the component is fixed AND fully sampled,
    multiply Mixed by Π (n_f − 1) / n_f.
  - Otherwise Corrected = Mixed, including any negative sampling
    artifacts.

  The downstream G coefficient calculation still treats negative
  variance estimates as zero when summing into σ²(δ) and σ²(Δ), in
  line with EduG's G-Study Table convention. So the display shows
  the negative (matching EduG's report), but the coefficient
  arithmetic uses the truncated value.
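
The two-branch rule above, and the separate truncation used in the coefficient arithmetic, might be sketched as follows. The function `corrected_estimate()` and its argument layout are hypothetical, and the numbers are illustrative rather than taken from any EduG example.

```r
# Hypothetical sketch of the Corrected-column rule (not the app's own code).
#   mixed       = Mixed estimate of one variance component
#   n_levels    = sample sizes of the facets that make up the component
#   fully_fixed = TRUE for each facet that is fixed AND fully sampled
corrected_estimate <- function(mixed, n_levels, fully_fixed) {
  if (all(fully_fixed)) {
    mixed * prod((n_levels - 1) / n_levels)   # finite-population correction
  } else {
    mixed                                     # sign preserved, even if negative
  }
}

# Component made only of a fully sampled fixed facet with 2 levels:
corrected_estimate(0.4, n_levels = c(m = 2), fully_fixed = c(m = TRUE))

# Component that also involves a random facet: left untouched.
corrected_estimate(-0.006, n_levels = c(m = 2, i = 10),
                   fully_fixed = c(m = TRUE, i = FALSE))

# Displayed values keep their sign; the error-variance sum uses truncation.
displayed <- c(0.10, -0.006)
sum(pmax(displayed, 0))
```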


0.4.0 (May 2026)

This release consolidates all of the v0.3.x patches into a single tagged
version. 

Fixed

- SS / df / MS now populated for every component in saturated
  nested designs. v0.3.0's `.anova_decomp` passed a single saturated
  aov formula to R, and R silently dropped aliased terms in saturated
  nested designs (e.g. the EduG "10 Dependability of change of
  attitude" example), leaving the SS, df, MS, Random, Mixed, Corrected
  and SE columns blank for higher-order components. v0.4.0 computes
  SS and df directly from cell means via inclusion-exclusion, which
  is an exact algebraic identity on balanced data and works the same
  way for crossed, nested, and mixed designs. Every component in
  EduG's output now has its SS/df/MS reported by the app.
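
To make the inclusion-exclusion idea concrete, here is a minimal sketch restricted to balanced, fully crossed designs. It is not the app's `.anova_decomp()`; the helper names (`subsets()`, `t_term()`, `ss_table()`) and the `*` separator in component labels are assumptions made for the illustration. For a component α, the sum of squares is obtained as a signed sum of T(β) over all subsets β of α, where T(β) is the sum over observations of the squared cell mean under grouping by β.

```r
# All subsets of a character vector, including the empty set.
subsets <- function(x) {
  k <- length(x)
  lapply(0:(2^k - 1), function(m) x[(m %/% 2^(seq_len(k) - 1)) %% 2 == 1])
}

# T(beta): sum over observations of the squared mean of the cell each
# observation falls in when grouping by the facets in beta.
t_term <- function(data, beta, score) {
  y <- as.numeric(data[[score]])
  if (length(beta) == 0) return(length(y) * mean(y)^2)
  sum(ave(y, interaction(data[beta], drop = TRUE))^2)
}

# SS, df and MS for every component of a balanced, fully crossed design.
ss_table <- function(data, facets, score) {
  comps <- Filter(length, subsets(facets))   # non-empty subsets = components
  rows <- lapply(comps, function(alpha) {
    terms <- vapply(subsets(alpha), function(beta) {
      (-1)^(length(alpha) - length(beta)) * t_term(data, beta, score)
    }, numeric(1))
    df <- prod(vapply(alpha, function(f) length(unique(data[[f]])) - 1,
                      numeric(1)))
    data.frame(component = paste(alpha, collapse = "*"),
               df = df, SS = sum(terms))
  })
  out <- do.call(rbind, rows)
  out$MS <- out$SS / out$df
  out
}

toy <- data.frame(persons = rep(1:3, each = 2),
                  items   = rep(1:2, times = 3),
                  scores  = c(2, 4, 3, 7, 5, 9))
ss_table(toy, c("persons", "items"), "scores")
```

On this toy 3 × 2 data set the sums of squares are 16 for persons, 50/3 for items, and 4/3 for the interaction, which add up to the total sum of squares of 34.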

- G coefficients match EduG when the differentiation facet is
  fully fixed. v0.3.0 used the lme4 / Mixed variance estimate
  directly for σ²(τ), giving Eρ² and Φ values that drifted from EduG
  whenever a differentiation facet was fully sampled and declared
  fixed. v0.4.0 applies the Cardinet finite-population correction
  σ²(τ) ← σ²(τ) × Π (n_f − 1)/n_f for every D-only component made
  up entirely of fully-sampled-fixed facets, reproducing EduG's
  "Corrected" column when it is relevant for τ. For example 10
  (M differentiating, n_M = N_M = 2), Coef_G relative now reads 0.78
  (matching EduG); v0.3.0 reported 0.879.

- Upload header / data column-count mismatch. EduG-exported
  files frequently leave the score column un-named in the header
  (the header has one fewer field than every data row). R's
  `read.table(header = TRUE)` silently treats the first data field
  as a row name and shifts every column one position to the left;
  the downstream model fit then fails with a generic "An error has
  occurred". v0.4.0 detects the mismatch up-front in `read_uploaded()`,
  auto-appends `scores` (or `scores`, `V2`, ... for larger gaps),
  reads the file with explicit `col.names`, and shows a yellow
  "Heads up" notice in the Data preview tab explaining what was
  auto-fixed.
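
The detection step can be approximated with base R alone. The sketch below is an illustration under stated assumptions, not the body of `read_uploaded()`: the function name `read_with_header_fix()` is hypothetical, the separator is passed in rather than auto-detected, and quoting is ignored.

```r
# Hypothetical sketch: detect a header that has fewer fields than the data
# rows, append the missing names, and read the file with explicit col.names.
read_with_header_fix <- function(path, sep = ";") {
  n_fields <- count.fields(path, sep = sep)              # fields per line
  header   <- strsplit(readLines(path, n = 1), sep, fixed = TRUE)[[1]]
  gap      <- max(n_fields[-1]) - length(header)         # missing header fields
  if (gap > 0) {
    extra  <- c("scores", if (gap > 1) paste0("V", 2:gap))
    header <- c(header, extra)
    message("Header had ", gap, " fewer field(s) than the data; appended: ",
            paste(extra, collapse = ", "))
  }
  read.table(path, sep = sep, skip = 1, header = FALSE, col.names = header)
}

# Example: the score column is un-named in the header.
tmp <- tempfile(fileext = ".csv")
writeLines(c("persons;items", "1;1;5", "1;2;6"), tmp)
fixed <- read_with_header_fix(tmp)
names(fixed)
```

Reading with `skip = 1` and explicit `col.names` sidesteps the row-name behaviour of `read.table(header = TRUE)` described above.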

- Case-insensitive facet / parent matching. Typing `T` as a
  facet label and `t` as its parent letter (or vice versa) no longer
  raises *"facet 'J' is nested in unknown facet(s): t"*. A new
  internal helper `.normalize_design()` lowercases and trims both
  facet labels and parent references at every entry point of the
  math layer; data column names are renamed to the canonical
  lowercase form inside `prepare_data()`.

- No more hard-coded facet-letter meanings. The interpretation
  layer no longer assumes `t = "tasks"`, `p = "persons"`, etc. It now
  always reads the user's actual column name as the descriptive name
  in the plain-language interpretations; if the user labelled their
  column `T` for teachers, the text says "T", not "tasks". The
  bundled synthetic-example datasets were renamed to use full-word
  data columns (`persons`, `items`, `raters`, etc.) so the
  built-in dropdown still produces readable plain-language output.

Improved

- Facets section help text is now shorter: four short bullets
  plus one expandable worked example using EduG's "07 ClassObservation"
  T/JO design replace the previous long block.


0.3.0 (May 2026)

Fixed (post-release patches)

- Facet-name resolution audit. Removed the hard-coded mapping
  that assumed single-letter facet labels had conventional meanings
  (`t = "tasks"`, `p = "persons"`, etc.). The interpretation layer now
  uses the user's actual column name as the descriptive name; if the
  user labelled their column `T` for teachers, the text says "T", not
  "tasks". Bundled synthetic examples were renamed to use full-word
  data column names (`persons`, `items`, `raters`, etc.) so the
  built-in dropdown still produces readable plain-language output.
- Case-insensitive facet/parent matching. Typing `T` as a facet
  label and `t` as its parent letter (or vice versa) no longer raises
  "facet 'J' is nested in unknown facet(s): t". A new internal helper
  `.normalize_design()` lowercases and trims both facet labels and
  parent references at every entry point of the math layer; data
  column names are renamed to the canonical lowercase form inside
  `prepare_data()`.

Fixed

- Φ(λ) bias correction (matches EduG). The criterion-referenced
  dependability coefficient Φ(λ) now uses the Brennan/Cardinet
  bias-corrected formula, subtracting σ²(X̄..) — the error variance of
  the grand mean — from (X̄..-λ)² in both the numerator and the
  denominator. On the bundled `brennan3E.csv` dataset at λ = 6 the
  application now reports Φ(6) = 0.659 (matching EduG); the previous
  v0.2.x naive formula gave 0.708. The old formula is preserved as
  `compute_phi_lambda_naive()` for backwards reproducibility.
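
The difference between the two formulas can be seen in a few lines. The sketch below is a hedged illustration, not the app's code: `phi_lambda_sketch()` and its argument names are hypothetical, and the input values are invented for the example rather than taken from `brennan3E.csv`.

```r
# Hypothetical sketch of Phi(lambda).
#   var_tau        = universe-score variance
#   var_abs        = absolute error variance
#   grand_mean     = observed grand mean
#   lambda         = cut score
#   var_grand_mean = error variance of the grand mean; 0 gives the naive formula
phi_lambda_sketch <- function(var_tau, var_abs, grand_mean, lambda,
                              var_grand_mean = 0) {
  # Squared distance from the cut score, corrected for the sampling error
  # of the grand mean, enters both numerator and denominator.
  signal <- var_tau + (grand_mean - lambda)^2 - var_grand_mean
  signal / (signal + var_abs)
}

# Illustrative (made-up) inputs:
phi_lambda_sketch(1, 0.5, grand_mean = 5, lambda = 4, var_grand_mean = 0.2)
phi_lambda_sketch(1, 0.5, grand_mean = 5, lambda = 4)   # naive version
```

Because the correction shrinks the squared distance to the cut score, the bias-corrected value is never larger than the naive one, which is the direction of the 0.708 to 0.659 change reported above.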

Added

- EduG-style ANOVA columns in the Variance components tab.
  On balanced data the components table now reports SS, df, MS, an
  analytic SE on each variance component, and three flavours of the
  estimate matching EduG's columns:
  - Random: naive Method-1 ANOVA recovery (treats every facet as
    random).
  - Mixed: Cardinet-style adjustment for finite-universe fixed
    facets. This is the value the application uses downstream for G,
    Φ, and Φ(λ).
  - Corrected: Mixed with negative estimates truncated to zero.
  On purely random designs Random ≡ Mixed ≡ Corrected and all three
  match the REML estimate to numerical precision.

- New G-Study tab mirroring EduG's "G Study Table" output, with
  one row per variance component split into differentiation,
  relative-error, and absolute-error contributions (with their
  percentage shares), and a footer block reporting:
  - sum of variances (differentiation / relative / absolute)
  - relative and absolute SEMs
  - generalizability coefficients (relative Eρ² and absolute Φ)
  - grand mean
  - σ²(X̄..) (error variance of the mean for levels used)
  - SE of the grand mean
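
For orientation, the SEM and coefficient rows of the footer follow directly from the three variance sums. The sketch below assumes the sums are already available; `g_study_footer()` is a hypothetical name and the inputs are illustrative.

```r
# Hypothetical sketch of the footer arithmetic.
#   var_tau = sum of differentiation variance
#   var_rel = sum of relative error variance
#   var_abs = sum of absolute error variance (never smaller than var_rel)
g_study_footer <- function(var_tau, var_rel, var_abs) {
  stopifnot(var_abs >= var_rel)
  list(
    sem_rel = sqrt(var_rel),                 # relative SEM
    sem_abs = sqrt(var_abs),                 # absolute SEM
    e_rho2  = var_tau / (var_tau + var_rel), # relative coefficient
    phi     = var_tau / (var_tau + var_abs)  # absolute coefficient
  )
}

footer <- g_study_footer(var_tau = 4, var_rel = 1, var_abs = 2)
footer$e_rho2
footer$phi
```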

- σ²(X̄..) helper (`sigma2_grand_mean()`) in `R/g_coefficients.R`
  that returns the error variance of the estimated grand mean over
  replications of the study. This is the quantity EduG reports as
  *Error variance of the mean for levels used* and is the basis of the
  Φ(λ) bias correction.
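
For a purely random, fully crossed design, the error variance of the grand mean is the sum of each variance component divided by the product of the sample sizes of the facets it involves. The sketch below illustrates that special case only; it is not the body of `sigma2_grand_mean()`, and the name `sigma2_grand_mean_sketch()`, the `*` label separator, and the numbers are assumptions.

```r
# Hypothetical sketch for a purely random, fully crossed design.
#   vc       = named variance components; names join facets with "*"
#   n_levels = named sample sizes, one per facet
sigma2_grand_mean_sketch <- function(vc, n_levels) {
  contrib <- vapply(names(vc), function(comp) {
    facets <- strsplit(comp, "*", fixed = TRUE)[[1]]
    vc[[comp]] / prod(n_levels[facets])   # divide by sizes of involved facets
  }, numeric(1))
  sum(contrib)
}

vc <- c("p" = 3.0, "i" = 1.2, "p*i" = 0.6)   # illustrative values
n  <- c(p = 10, i = 4)
sigma2_grand_mean_sketch(vc, n)              # 3/10 + 1.2/4 + 0.6/40
```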

- Multi-parent / chain-nesting support. The `nested_in` field
  now fully supports comma-separated parent lists (e.g. `d,s` for
  classes nested in DxS combinations, or `c,d,s` for pupils nested in
  classes within DxS). v0.3.0 also computes the **transitive closure**
  of the parent graph, so users can supply only the immediate parent
  (`c` for *p*, `d,s` for *c*) and the closure rule will still
  correctly require all transitive ancestors to be present. The
  EduG-style "16 Variations of attitudes towards learning of German"
  example with design `(p:c:DS) × m × (i:k:o)` now works in the app.
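
The closure rule can be sketched as a small fixed-point iteration over the parent lists. This is an illustration under assumptions, not the app's code: `parse_parents()` and `transitive_parents()` are hypothetical names, and the sketch also lowercases and trims labels and reports cycles, in the spirit of the normalization and validation steps mentioned elsewhere in this changelog.

```r
# Split comma-separated parent lists, lowercasing and trimming each label.
parse_parents <- function(nested_in) {
  lapply(strsplit(tolower(nested_in), ",", fixed = TRUE), function(x) {
    x <- trimws(x)
    x[nzchar(x)]
  })
}

# Hypothetical sketch: expand immediate parents to all transitive ancestors.
transitive_parents <- function(facets, nested_in) {
  facets  <- tolower(trimws(facets))
  closure <- setNames(parse_parents(nested_in), facets)
  repeat {
    changed <- FALSE
    for (f in facets) {
      # Ancestors of f's current ancestors are ancestors of f as well.
      grand <- unlist(closure[closure[[f]]], use.names = FALSE)
      new   <- union(closure[[f]], grand)
      if (length(new) > length(closure[[f]])) {
        closure[[f]] <- new
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  cyclic <- facets[vapply(facets, function(f) f %in% closure[[f]], logical(1))]
  if (length(cyclic) > 0) {
    stop("cycle in nesting graph involving: ", paste(cyclic, collapse = ", "))
  }
  closure
}

# Design (p:c:DS) x m x (i:k:o), giving only the immediate parents:
facets    <- c("p", "c", "d", "s", "m", "i", "k", "o")
nested_in <- c("c", "d,s", "", "", "", "k", "o", "")
closure   <- transitive_parents(facets, nested_in)
closure$p   # c, d, s
closure$i   # k, o
```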

- `σ²(X̄..)` row in the G-coefficients summary panel so users can
  read off the same number EduG prints as "Error variance of the mean
  for levels used".

- Updated legends documenting the new columns, the new tab, and
  the bias-corrected Φ(λ).

Internal

- `prepare_data()` uses transitive parents when safe-encoding nested
  factor levels, so chains of arbitrary depth are no longer ambiguous
  to lme4.
- `validate_design()` now detects cycles in the (transitive) parent
  graph.
- `.anova_decomp()` is a new private helper that returns SS/df/MS plus
  the EMS coefficient matrices for both the Random and Mixed paths in
  one place; both `estimate_vc_reml()` and `estimate_vc_anova()` use
  it.


0.2.1 (May 2026)

Changed

- Documentation DOI in the in-app introductory panel updated to
  the concept DOI <https://doi.org/10.5281/zenodo.20086375>, which
  resolves to the latest version of the Zenodo deposit rather than to
  a specific frozen version.
- Long-format example in the data-upload help block now uses
  plural column names (`persons, items, raters, scores`) and shows
  additional rows so the long-format pattern is clearer to first-time
  users.

Added

- App version visible in the UI. The title bar shows a `v0.2.1`
  chip next to the title; a small footer at the bottom of every page
  repeats the version and links to the documentation DOI.


0.2.0 (May 2026)

Added

- Always-visible introductory panel above section 1 that explains
  Generalizability Theory for newcomers, ending with the
  full documentation DOI: <https://doi.org/10.5281/zenodo.20115464>.
- New file `AUDIT.md` documenting the full code audit performed for
  this release: REML/lme4 correctness, G-coefficient assembly,
  plain-language interpretations, and comparability with EduG and the
  GENOVA Suite.
- `tests/test_vc.R` now asserts REML/ANOVA agreement to 1e-4 on
  balanced data — a regression test for the v0.1.x ANOVA bug.

Fixed

- ANOVA-path EMS matrix. The expected-mean-square coefficient rule
  in `estimate_vc_anova()` was inverted in v0.1.x — it computed
  "facets in B but not in A" rather than the correct "facets not in
  B" (Henderson Method 1 / Cornfield–Tukey rule). The default REML
  path used by the UI was never affected, but ANOVA-based variance
  estimates were systematically wrong for any design with unequal
  facet sizes. See `AUDIT.md` §4 for the full derivation and fix.
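
The corrected rule can be illustrated for a fully random, fully crossed, balanced design: the coefficient of σ²(B) in the expected mean square of component A is the product of the sample sizes of the facets not in B when every facet of A appears in B, and zero otherwise. The sketch below is not `estimate_vc_anova()`; the name `ems_matrix()` and the `*` label separator are assumptions, and nested or mixed designs are out of scope.

```r
# Hypothetical sketch: EMS coefficient matrix for a fully random, fully
# crossed, balanced design. Rows are mean squares, columns are components.
ems_matrix <- function(n_levels) {
  facets <- names(n_levels)
  k      <- length(facets)
  comps  <- lapply(1:(2^k - 1), function(m) {
    facets[(m %/% 2^(seq_len(k) - 1)) %% 2 == 1]
  })
  labels <- vapply(comps, paste, character(1), collapse = "*")
  C <- matrix(0, length(comps), length(comps), dimnames = list(labels, labels))
  for (a in seq_along(comps)) {
    for (b in seq_along(comps)) {
      if (all(comps[[a]] %in% comps[[b]])) {
        # Product over the facets NOT in B (empty product is 1).
        C[a, b] <- prod(n_levels[setdiff(facets, comps[[b]])])
      }
    }
  }
  C
}

n  <- c(p = 3, i = 2)
C  <- ems_matrix(n)
ms <- c(8, 50 / 3, 2 / 3)   # illustrative mean squares for p, i, p*i
solve(C, ms)                # variance components in the same order
```

For p × i with 3 persons and 2 items, the rows read EMS(p) = 2σ²(p) + σ²(pi), EMS(i) = 3σ²(i) + σ²(pi), and EMS(pi) = σ²(pi), so unequal facet sizes show up directly in the coefficients.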

Removed

- "Score column only" upload mode. The expand-grid-based row order
  was fragile and produced incorrect facet assignments when the user's
  row order did not match R's `expand.grid` convention. The long-format
  upload (with automatic separator and score-column detection) is the
  single supported path.


0.1.x (prior development)

Initial public preview. Three-facet crossed and nested designs,
four-facet crossed designs, chain-nested designs, mixed-model designs
with fixed facets, interactive 3D D-study surface, plain-language
interpretations on every output, automatic delimiter and score-column
detection.

Files

g-theory-app-0.6.0.zip (40.6 kB)
md5:cd5d3b25687367f404b5dd96885fa2b8

Additional details

Dates

Updated: 2026-05-12 (bug fixes)

Software

Programming language: R
Development status: Concept