Published April 28, 2026 | Version 1
Journal article Open

M.A.R.V.I.N. – A Software-Led Infrastructure for Thermodynamically Grounded Autonomous Materials Discovery

  • 1. Arizona State University

Description

MARVIN (Materials Agentic Research, Validation, and Inference Navigation) is a software-led infrastructure for materials discovery that treats the CALPHAD-derived Gibbs energy function as the universal currency tying every step of an autonomous loop together. The motivation is a concrete gap in the current self-driving-laboratory (SDL) literature: systems like the A-Lab, Ada, and Burger's mobile robotic chemist propose and synthesize candidate compounds without explicit reference to assessed thermodynamic data, which makes it hard to tell whether a product is a genuinely stable equilibrium phase, a kinetically trapped artifact, or simply unreacted precursor. Leeman et al. and Cheetham & Seshadri have spelled out this problem; MARVIN is a direct response to it.

The architecture is a four-layer pipeline. The first layer ingests data from literature, DFT, NIST-JANAF, and (eventually) live experiments. The second is the CALPHAD core itself: Bayesian thermodynamic database construction with ESPEI, multicomponent equilibrium calculations through pycalphad, precipitation kinetics via kawin, and the self-contained SGTE Gibbs evaluator built for this work. The third is a reasoning engine: a SQLite + NetworkX Knowledge Graph that records every datum with full provenance, and an MCP-tooled multi-agent layer that draws on the Coscientist / ChemCrow / SciAgents lineage for closed-loop reasoning. The fourth layer is the decision-action surface, comprising a Digital Twin dashboard, an RL synthesis optimizer, an Economic Resilience module for supply-chain criticality, and the experiment-dispatch interface. Solid borders in Fig. 1 mark what is actually implemented; dashed borders mark components that are designed but not yet operational.
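As an illustration of what a self-contained SGTE-style Gibbs evaluator involves, the sketch below evaluates the standard piecewise polynomial G(T) = a + bT + cT ln T + dT² + eT³ + f/T over temperature segments. It is not the evaluator shipped with the deposit: the class name `SgteSegment`, the function `evaluate_gibbs`, and every coefficient in the demo list are hypothetical placeholders, and the higher-order T⁷ and T⁻⁹ terms that some SGTE unaries carry are left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SgteSegment:
    """One temperature segment of an SGTE-style Gibbs polynomial (J/mol)."""
    t_min: float  # lower bound in K, inclusive
    t_max: float  # upper bound in K, exclusive
    a: float = 0.0
    b: float = 0.0
    c: float = 0.0
    d: float = 0.0
    e: float = 0.0
    f: float = 0.0

    def gibbs(self, t: float) -> float:
        # G(T) = a + b*T + c*T*ln(T) + d*T^2 + e*T^3 + f/T
        return (self.a + self.b * t + self.c * t * math.log(t)
                + self.d * t ** 2 + self.e * t ** 3 + self.f / t)


def evaluate_gibbs(segments, t):
    """Return G(T) from the segment whose range [t_min, t_max) contains t."""
    for seg in segments:
        if seg.t_min <= t < seg.t_max:
            return seg.gibbs(t)
    raise ValueError(f"T = {t} K lies outside every segment")


# Placeholder coefficients for demonstration only, not assessed values.
DEMO_PHASE = [
    SgteSegment(298.15, 1000.0, a=-8000.0, b=120.0, c=-22.0, d=-3.0e-3),
    SgteSegment(1000.0, 3000.0, a=-9500.0, b=135.0, c=-24.0),
]

g_demo = evaluate_gibbs(DEMO_PHASE, 800.0)
```

An anchoring check of the kind described in the text would compare such an evaluator against tabulated unary values at a set of temperatures and confirm that the largest absolute difference stays below the stated tolerance.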

The case study is garnet-type Li₇La₃Zr₂O₁₂ (LLZO), a leading solid-state-electrolyte candidate that braids together four classic problems: phase metastability, dopant-vacancy charge compensation, surface-chemistry contamination, and supply-chain vulnerability of tantalum. Section 7 reports actual computational results from the implemented half of the architecture: a two-panel Li-O-H Kellogg atmosphere diagram (computed from the Chang & Hallstedt 2011 Li-O TDB plus NIST-JANAF supplements, anchored to Dinsdale 1991 SGTE91 unaries to within 0.001 J/mol); a (T, X(O)) ternary isothermal section by closed-system Gibbs minimization; a Li-O-H-C atmosphere section that sets the Li₂CO₃-formation boundary at log p_CO₂ = −5.6 ± 0.6 dex (NIST-JANAF / Knacke spread), three orders of magnitude below atmospheric; and a La₂Zr₂O₇ pyrochlore driving-force map that quantifies the thermodynamic envelope for the most relevant competing phase across the LLZO sintering window.
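The link between a reaction Gibbs energy and a boundary on a log p_CO₂ axis can be sketched as follows. The sketch assumes the simplest carbonation reaction, Li₂O + CO₂ → Li₂CO₃, with both condensed phases pure and at unit activity, so the boundary sits where ΔG = ΔG° − RT ln(p_CO₂/p°) = 0. The actual Li-O-H-C section may use a different reaction set; the function names, the demo temperature, and the demo ΔG° are placeholders and not values from the deposit.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def log10_p_co2_boundary(delta_g0, temperature):
    """log10(p_CO2 / p0) at which carbonate formation has zero driving force.

    delta_g0: standard Gibbs energy of Li2O + CO2 -> Li2CO3 in J/mol
              (negative when the carbonate is favored at p_CO2 = p0).
    temperature: absolute temperature in K.
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    # 0 = dG0 - R*T*ln(p/p0)  =>  log10(p/p0) = dG0 / (R*T*ln 10)
    return delta_g0 / (R * temperature * math.log(10.0))


def dex_to_energy(delta_log10_p, temperature):
    """Gibbs-energy spread (J/mol) equivalent to a spread in log10(p)."""
    return R * temperature * math.log(10.0) * delta_log10_p


# Placeholder inputs for demonstration only (not assessed data).
T_DEMO = 500.0          # K
DG0_DEMO = -50_000.0    # J/mol

boundary_demo = log10_p_co2_boundary(DG0_DEMO, T_DEMO)
spread_demo = dex_to_energy(0.6, T_DEMO)
```

Read this way, a spread of ±0.6 dex between two thermochemical sources corresponds to a disagreement of 0.6 · RT ln 10 in the reaction Gibbs energy at the temperature of the section, which `dex_to_energy` converts back to J/mol.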

The conductivity bridge is where the CEF sublattice model proves itself. A real Gibbs minimization over the (Li,Va)₃(Li,Va)₄(La)₃(Zr,Ta)₂(O)₁₂ sublattice product, with configurational-entropy contributions and the charge-neutrality coupling between Ta⁵⁺ on the 16a site and Li-vacancies, correctly partitions all the resulting vacancies onto the octahedral 48g/96h sublattice and leaves the tetrahedral 24d sublattice fully Li-occupied, which is exactly the experimental site preference that Rettenwander et al. observed by neutron diffraction. Coupling those equilibrium vacancies to a Nernst-Einstein transport bridge (E_a = 0.34 eV from Jalem et al., D₀ calibrated to the Allen 2012 anchor at x(Ta) = 0.125) reproduces the rising branch of σ vs x(Ta) up to the experimental peak. The work is candid that the falling branch beyond x ≈ 0.125 is driven by tetragonal ordering and grain-boundary effects, which the current model does not capture; it could be captured within CALPHAD by introducing a subregular L₁ interaction parameter on the octahedral sublattice, a natural target for the production ESPEI assessment.
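A minimal sketch of such a transport bridge is given below, under several loudly stated assumptions. It takes x(Ta) to be the Ta site fraction on the two-site Zr sublattice, so charge neutrality creates 2·x(Ta) Li vacancies per formula unit, and it places all of them on the four-site octahedral sublattice, as the minimization result in the text implies. Conductivity is taken as proportional to that vacancy fraction through σ = n q² D / (k_B T) with D = D₀ exp(−E_a / k_B T), and all constant factors are lumped into one prefactor fitted to a single anchor point. Only E_a = 0.34 eV comes from the text; the function names and the anchor conductivity are hypothetical placeholders.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K
E_A_EV = 0.34            # activation energy quoted in the text, eV


def octahedral_vacancy_fraction(y_ta):
    """Vacancy site fraction on the 4-site octahedral Li sublattice.

    Assumes y_ta is the Ta site fraction on the 2-site Zr sublattice, so
    there are 2*y_ta vacancies per formula unit, all on octahedral sites.
    """
    if not 0.0 <= y_ta <= 1.0:
        raise ValueError("site fraction must lie in [0, 1]")
    return 2.0 * y_ta / 4.0


def sigma(y_ta, temperature, prefactor):
    """Nernst-Einstein conductivity with all constants lumped in prefactor."""
    arrhenius = math.exp(-E_A_EV / (K_B_EV * temperature))
    return prefactor * octahedral_vacancy_fraction(y_ta) * arrhenius / temperature


def calibrate_prefactor(sigma_anchor, y_ta_anchor, t_anchor):
    """Choose the lumped prefactor (carrier density * q^2 * D0 / k_B) so that
    the model passes through one anchor point."""
    return sigma_anchor / sigma(y_ta_anchor, t_anchor, 1.0)


# Placeholder anchor conductivity; NOT the value reported by Allen 2012.
PREFACTOR = calibrate_prefactor(sigma_anchor=1.0e-3, y_ta_anchor=0.125,
                                t_anchor=298.15)
sigma_half = sigma(0.0625, 298.15, PREFACTOR)
```

Because the vacancy fraction enters linearly, a model of this form can only rise with x(Ta), which is consistent with the statement that the falling branch needs additional physics.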

What sets the manuscript apart is its transparency about the gap between implemented and designed components, and a CC-BY 4.0 Zenodo deposit that ships the assessed TDB, ESPEI YAML run-config, four ESPEI JSON datasets (Chang-Hallstedt Li-O ZPF, Rettenwander 2016 site occupancies, Allen 2012 conductivity, Bolech 1996 pyrochlore), the self-contained SGTE Gibbs evaluator and CEF Gibbs solver, the entire figure pipeline, and a real Metropolis-Hastings MCMC trace with proper Gelman-Rubin diagnostics (R̂ < 1.05 across all parameters, ~20% acceptance) on a 3-parameter LiOH thermochemistry demonstration model. The garnet phase is deliberately kept out of the v1 deposit — the (Li,Al,Ga,Va)₃(Li,Al,Ga,Va)₄(La)₃(Zr,Al,Ga,Ta)₂(O)₁₂ model is presented as a proposal awaiting the converged production assessment, with an explicit strategy for managing its 64 endmember corners (eight realizable Ta-system corners + 16 Al/Ga analogues fitted to ND data, 40 fictive corners constrained by reciprocal relations or suppressed by a +500 kJ/mol charge-impossibility penalty). The result is a paper that anchors every autonomous decision to an assessed Gibbs energy function with full provenance — turning autonomous materials discovery from a black-box synthesis campaign into a thermodynamically auditable workflow — while being explicit about what's done, what's deferred, and exactly what reproducible artifacts back the claim.
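The endmember count quoted above follows directly from the sublattice formula, and the short sketch below enumerates the corners and computes the formal net charge of each. The charges are standard oxidation states with Va as a neutral vacancy; the names `SUBLATTICES`, `endmembers`, and `net_charge` are illustrative, and the sketch does not reproduce the authors' classification into realizable, fitted, and fictive corners or their penalty scheme.

```python
from itertools import product

# Formal charges (standard oxidation states; Va denotes a vacancy).
CHARGE = {"Li": 1, "Al": 3, "Ga": 3, "Va": 0,
          "La": 3, "Zr": 4, "Ta": 5, "O": -2}

# (constituents, site multiplicity) for each sublattice, as in the text.
SUBLATTICES = [
    (("Li", "Al", "Ga", "Va"), 3),
    (("Li", "Al", "Ga", "Va"), 4),
    (("La",), 3),
    (("Zr", "Al", "Ga", "Ta"), 2),
    (("O",), 12),
]


def endmembers(sublattices=SUBLATTICES):
    """All corners: one constituent chosen per sublattice."""
    return list(product(*(species for species, _ in sublattices)))


def net_charge(endmember, sublattices=SUBLATTICES):
    """Formal net charge per formula unit of one endmember."""
    return sum(CHARGE[sp] * mult
               for sp, (_, mult) in zip(endmember, sublattices))


ALL = endmembers()
# Corners that contain no Al or Ga, i.e. the Ta-doped subsystem.
TA_SYSTEM = [em for em in ALL
             if set(em) <= {"Li", "Va", "La", "Zr", "Ta", "O"}]
# Corners that are charge neutral on their own.
NEUTRAL = [em for em in ALL if net_charge(em) == 0]
```

Here `len(ALL)` gives the 4 × 4 × 1 × 4 × 1 = 64 corners and `len(TA_SYSTEM)` gives the eight Ta-system corners mentioned in the text. A charge-based penalty such as the one described could key off `net_charge`, although how the authors assign it is not shown here.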

Files

Submission PDF - MARVIN (acronym-updated).pdf (1.4 MB)
md5:812ca881756a7fca220ac6cadf329a24
