Published November 3, 2025 | Version v1
Journal article Open

Navigating materials design spaces with efficient Bayesian optimization: a case study in functionalized nanoporous materials

  • 1. National Centre of Scientific Research "Demokritos"
  • 2. National and Kapodistrian University of Athens
  • 3. SciFY PNPC
  • 4. NCSR "Demokritos"

Description

Machine learning (ML) has the potential to accelerate the discovery of high-performance materials by learning complex structure–property relationships and prioritizing candidates for costly experiments or simulations. However, this efficiency is often offset by the need for large, high-quality training datasets, which motivates strategies that intelligently select the most informative samples. Here, we formulate the search for top-performing functionalized nanoporous materials (metal–organic and covalent–organic frameworks) as a global optimization problem and apply Bayesian optimization (BO) to identify regions of interest and rank candidates with minimal evaluations. We highlight the importance of a proper and efficient initialization scheme for the BO process, and we demonstrate how BO-acquired samples can also be used to train an XGBoost regression model that further enriches the mapping of the high-performing region of the design space. Across multiple literature-derived adsorption and diffusion datasets containing thousands of structures, our BO framework identifies two to three times more materials within a top-100 or top-10 ranking list than random-sampling-based ML pipelines, and it achieves significantly higher ranking quality. Moreover, the surrogate enrichment strategy further boosts top-N recovery while maintaining high ranking fidelity. By shifting the evaluation focus from average predictive metrics (e.g., R², MSE) to task-specific criteria (e.g., recall@N and nDCG), our approach offers a practical, data-efficient, and computationally accessible route to guide experimental and computational campaigns toward the most promising materials.
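The task-specific criteria named in the description, recall@N and nDCG, can be illustrated with a minimal sketch. The definitions below are common textbook forms and are an assumption here; the article may define or normalize them differently. The function names `recall_at_n` and `ndcg_at_n` and the toy arrays are hypothetical, and the target property is assumed to be non-negative with larger values being better.

```python
import numpy as np

def recall_at_n(y_true, y_score, n):
    """Fraction of the true top-n items that also appear in the predicted top-n."""
    n = min(n, len(y_true))
    true_top = set(np.argsort(y_true)[::-1][:n].tolist())
    pred_top = set(np.argsort(y_score)[::-1][:n].tolist())
    return len(true_top & pred_top) / n

def ndcg_at_n(y_true, y_score, n):
    """Normalized discounted cumulative gain over the top-n predicted items.

    Assumption: the true property value itself is used as the gain.
    """
    y_true = np.asarray(y_true, dtype=float)
    n = min(n, len(y_true))
    pred_order = np.argsort(y_score)[::-1][:n]   # ranking proposed by the model
    ideal_order = np.argsort(y_true)[::-1][:n]   # best possible ranking
    discounts = 1.0 / np.log2(np.arange(2, n + 2))
    dcg = np.sum(y_true[pred_order] * discounts)
    idcg = np.sum(y_true[ideal_order] * discounts)
    return dcg / idcg

# Toy example (made-up numbers, not data from the article)
y_true = np.array([0.1, 0.9, 0.5, 0.7, 0.3])    # "ground truth" property values
y_score = np.array([0.2, 0.8, 0.6, 0.1, 0.4])   # model predictions

print("recall@2:", recall_at_n(y_true, y_score, 2))
print("nDCG@2:  ", ndcg_at_n(y_true, y_score, 2))
```

Unlike R² or MSE, both quantities depend only on how well the highest-performing candidates are identified and ordered, which is the sense in which the description calls them task-specific.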

Files

d5dd00237k.pdf

Files (2.4 MB)

md5:c42cb7b4016dad21c69f45e959e360a4
2.4 MB

Additional details

Dates

Issued
2025-11-03