Published March 26, 2026 | Version v2
Dataset Open

Machine Learning and GenAI datasets for the accelerated design of homogeneous catalysts for CO2 reduction - HPCvsCO2

Description

The HPCvsCO2 project proposes a protocol to accelerate the discovery of new catalysts for CO2 capture, addressing the computational resource limitations of traditional quantum chemistry methods.

The core idea is to integrate Machine Learning (ML) techniques with computational chemistry to predict chemical properties, specifically the HOMO (Highest Occupied Molecular Orbital) and LUMO (Lowest Unoccupied Molecular Orbital) energies, drastically reducing the need to screen vast numbers of molecular configurations computationally. Furthermore, the project investigates the use of Generative AI (GenAI) to boost the performance of the ML algorithms within the workflow and to generate molecules with a target structure.

As part of the HPCvsCO2 project, two datasets of metal-centered catalyst complexes were produced:

- The first dataset was used to train and test the UniMol machine learning model for predicting HOMO/LUMO values.

- The second dataset was used for fine-tuning the REINVENT4 generative model for generating structurally valid complexes for the project.

- The top 50 candidates were selected by minimizing the gap between the epoxide HOMO and the catalyst LUMO (the HOMO_epox-LUMO_catalyst gap). The folder contains the xyz files and an .xlsx file with the HOMO and LUMO values of the molecules. Note: the HOMO and LUMO values and the HOMO-LUMO gap in that file refer to the catalysts themselves, not to the HOMO_epox-LUMO_catalyst gap.
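The selection criterion above can be illustrated with a minimal sketch. It assumes the gap is computed as LUMO_catalyst minus HOMO_epox and that this difference is positive. The `Catalyst` record, the function name, the example names, and all energy values below are hypothetical placeholders and are not taken from the dataset. The epoxide HOMO energy in particular is not part of the .xlsx file described above and would have to be supplied separately.

```python
from typing import List, NamedTuple


class Catalyst(NamedTuple):
    """Hypothetical record holding one catalyst's orbital energies (eV)."""
    name: str
    homo_ev: float
    lumo_ev: float


def rank_by_epoxide_gap(
    catalysts: List[Catalyst], epoxide_homo_ev: float, top_n: int = 50
) -> List[Catalyst]:
    """Return the top_n catalysts with the smallest LUMO_catalyst - HOMO_epox gap.

    Assumption: the gap is the plain difference LUMO_catalyst - HOMO_epox.
    """
    def gap(catalyst: Catalyst) -> float:
        return catalyst.lumo_ev - epoxide_homo_ev

    return sorted(catalysts, key=gap)[:top_n]


# Placeholder values for illustration only; not data from the record.
example_catalysts = [
    Catalyst("cat_A", homo_ev=-5.5, lumo_ev=-2.0),
    Catalyst("cat_B", homo_ev=-6.0, lumo_ev=-3.5),
    Catalyst("cat_C", homo_ev=-5.0, lumo_ev=-1.0),
]
example_epoxide_homo_ev = -7.0  # hypothetical epoxide HOMO energy

best_two = rank_by_epoxide_gap(example_catalysts, example_epoxide_homo_ev, top_n=2)
```

Note that the catalyst's own HOMO plays no role in this ranking, which is consistent with the remark that the tabulated HOMO-LUMO gap is a different quantity from the HOMO_epox-LUMO_catalyst gap.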

Files

[GenAI] Reinvent data.zip

Files (694.6 MB)

Checksum Size
md5:720339f605cd04cf99be6c79eb1856d1 11.8 MB
md5:e46cfd0513b69d7e5ca4497285c64a32 682.7 MB
md5:239e5363fdf265e0aa54925950011a08 98.1 kB
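The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only Python's standard `hashlib` module. The function names are illustrative, and because the file names are not shown in this listing, the check only tests whether a local file's digest matches any of the three checksums in the record.

```python
import hashlib
from typing import BinaryIO

# MD5 checksums as listed in this record.
RECORD_MD5 = {
    "720339f605cd04cf99be6c79eb1856d1",
    "e46cfd0513b69d7e5ca4497285c64a32",
    "239e5363fdf265e0aa54925950011a08",
}


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a binary stream, reading in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str) -> bool:
    """Return True if the file at `path` has one of the record's checksums."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) in RECORD_MD5
```

Reading in chunks keeps memory use low for the largest file (682.7 MB). A call such as `matches_record("downloaded_file.zip")`, where the path is whatever name the file was saved under, returns `False` if the download is incomplete or altered.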

Additional details

Funding

Fondazione ICSC Centro Nazionale di Ricerca in High Performance Computing, Big Data e Quantum Computing
HPC workflows based on ML and quantum chemistry methods for CO2 capture and conversion (HPCvsCO2) B93C22000620006