Materials Property Axiom: Scaling Foundation Models to Experimental Property Generalists via Multi-phase Training
Authors/Creators
Description
Large-scale experimental property prediction is becoming a central bottleneck in discovering novel materials, where models must learn from heterogeneous experimental data and adapt rapidly to new assays, measurements, and product-relevant endpoints. However, current materials foundation models focus primarily on computational properties, especially thermodynamics and stability. Inspired by the training paradigm of large language models, in which broad pre-training on all available corpora is followed by mid-training alignment on multiple high-quality sources and finally by post-training with task-specific supervised fine-tuning, we ask whether the same philosophy can improve experimental materials property prediction. Here we introduce Materials Property Axiom (MPA), a three-phase framework comprising general pre-training, domain-aligned mid-training, and downstream post-training. In a validation spanning 40 experimental properties, MPA consistently achieves state-of-the-art performance relative to direct fine-tuning from a pretrained model, reducing mean absolute error by 15% on average and by up to 55% on select individual properties. This gap widens under a mid-training strategy that aligns high-quality subsets from various experimental sources with increasing amounts of first-principles computational data, suggesting that mid-training captures a common structure across heterogeneous materials data that scales. Together, these findings establish multi-phase training as a general strategy for transforming materials foundation models into reliable surrogates of experimental properties for accelerating real-world materials discovery. MPA is available at https://sciclaw.ai.
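To make the headline metric concrete, the following is a minimal sketch of how a relative reduction in mean absolute error could be summarized across properties. It assumes the improvement is computed per property as (baseline MAE - model MAE) / baseline MAE and then averaged; the description does not state the exact aggregation. The function names and the toy numbers are hypothetical and are not results from the paper.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_improvement(baseline_mae, model_mae):
    """Fractional MAE reduction of the model relative to the baseline."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - model_mae) / baseline_mae


def summarize(per_property):
    """Return (average, maximum) relative improvement.

    per_property maps a property name to (baseline_mae, model_mae).
    Averaging per-property ratios is an assumption of this sketch.
    """
    gains = [relative_improvement(b, m) for b, m in per_property.values()]
    return mean(gains), max(gains)


# Toy, made-up MAE values purely for illustration.
toy = {
    "property_a": (2.0, 1.0),  # direct fine-tuning MAE, multi-phase MAE
    "property_b": (4.0, 3.0),
}
average_gain, best_gain = summarize(toy)
```

Averaging per-property ratios weights every property equally regardless of its units or scale, which is one reason a ratio is a natural choice when properties are measured in different units.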
Files
| Name | Size | Checksum |
|---|---|---|
| mpa.pdf | 6.3 MB | md5:b1564216d42462888c9d864b79823d2c |