AI-assisted screening of MOFs for formaldehyde removal: Combining GCMC simulation and machine learning
Description
Formaldehyde (FA) is a hazardous indoor volatile organic compound that poses serious risks to human health, which demands the development of efficient and selective adsorbent materials. Metal–Organic Frameworks (MOFs) offer exceptional structural tunability for FA capture. However, the vast chemical space of existing and hypothetical MOFs makes experimental and conventional computational screening resource-intensive. In this work, we present an AI-assisted high-throughput screening framework that combines Generative Artificial Intelligence (GAI)-based text mining, Grand Canonical Monte Carlo (GCMC) simulations, and Machine Learning (ML) to accelerate the discovery of MOFs for indoor FA removal. GAI models were employed to extract the key features that support FA removal in MOFs and to screen the database without prior programming expertise. A total of 1208 MOFs from the CoRE MOF 2019 database were evaluated using GCMC simulations to compute FA uptake and Henry's coefficients under ambient conditions. ML models trained on atomic, geometric, chemical, and thermodynamic descriptors initially exhibited weak correlations due to data noise. This limitation was resolved using a support vector regression (SVR)-based denoising strategy, resulting in predictive accuracies with R² above 0.90. Among the seven trained ML models, the SVR model performed best and successfully predicted FA adsorption in 500 hypothetical MOFs. Validation against experimental data for HKUST-1 and MIL-101(Cr) further confirmed the reliability of the approach. This workflow can significantly accelerate the identification and high-throughput screening of MOFs for gas adsorption.
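The description does not say how the SVR-based denoising works. One plausible reading is residual-based filtering: fit a regressor, discard samples whose residual exceeds a tolerance, then refit and compare R². The minimal sketch below illustrates only that idea and the R² metric. It is not the authors' implementation: an ordinary least-squares line stands in for SVR so the example runs with the Python standard library alone, and the data, the `tol` threshold, and all function names are hypothetical.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Least-squares line; a stand-in for the SVR model (assumption)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda v: slope * v + intercept


def filter_by_residual(x, y, model, tol):
    """Keep samples whose absolute residual is within tol (hypothetical threshold)."""
    kept = [(a, b) for a, b in zip(x, y) if abs(b - model(a)) <= tol]
    return [a for a, _ in kept], [b for _, b in kept]


# Synthetic data (not taken from the dataset): y = 2x with two corrupted points.
x = [float(i) for i in range(10)]
y = [2.0 * v for v in x]
y[3] += 8.0
y[7] -= 8.0

# Fit on the noisy data and score it.
model = fit_line(x, y)
r2_before = r_squared(y, [model(v) for v in x])

# Drop high-residual samples, refit, and score again.
x_clean, y_clean = filter_by_residual(x, y, model, tol=4.0)
model_clean = fit_line(x_clean, y_clean)
r2_after = r_squared(y_clean, [model_clean(v) for v in x_clean])
```

On this toy data the two corrupted points are removed and R² rises after refitting. With real GCMC outputs the tolerance would need to be chosen carefully, since filtering on residuals can also discard genuine outliers and inflate the reported accuracy.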
Files
| Name | Size |
|---|---|
| mof4fa.zip (md5:0b4c8b680cb29a0c33dea6e0b4c8d9f9) | 472.0 kB |
Additional details
Dates
- Created: 2026-01-22