Published May 3, 2024 | Version 1.1
Dataset Open

DFT Calculated xyz and log Files as well as csv Files for Machine Learning in Support of "Tailoring Phosphine Ligands for Improved C–H Activation: Insights from Δ-Machine Learning"

  • 1. Friedrich Schiller University Jena
  • 2. Technische Universität Ilmenau
  • 3. Friedrich-Schiller-Universität Jena

Description

Transition metal complexes play crucial roles in many homogeneous catalytic processes owing to their exceptional versatility. This adaptability stems not only from the central metal ion but also from the vast range of possible ligand spheres, which span an enormous chemical space. For example, Rh complexes with a well-designed ligand sphere are known to catalyze C-H activation in alkanes efficiently. To investigate the structure-property relationship of the Rh complexes and to identify the ligand that minimizes the calculated reaction energy ΔE of an alkane C-H activation, we applied a Δ-Machine Learning method trained on various features to 1,743 pairs of reactants (Rh(PLP)(Cl)(CO)) and intermediates (Rh(PLP)(Cl)(CO)(H)(propyl)). The models show robust predictive performance when trained on features derived from the electron density (R² = 0.816) and on SOAPs (R² = 0.819), a set of position-based descriptors. Using the model trained on xTB-SOAPs, which depend only on the xTB equilibrium structures, we propose an efficient and accurate screening procedure for exploring the extensive chemical space of bisphosphine ligands. Applying this procedure, we identify ten new reactant-intermediate pairs with an average ΔE of 33.2 kJ mol⁻¹, well below the average ΔE of 68.0 kJ mol⁻¹ for the original data set. This underscores the efficacy of the screening procedure in pinpointing structures with significantly lower reaction energies.
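As a rough illustration of the general Δ-learning idea mentioned above, the sketch below fits a model to the difference between an inexpensive baseline estimate and a more expensive reference value, then adds the learned correction back onto the baseline. This record does not state which baseline, features, or regression model the authors used, so everything here is an assumption: the single feature, the one-variable least-squares fit, and the numbers, which are made-up toy values and not taken from the dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (NOT from the dataset), in kJ/mol.
feature = [0.0, 1.0, 2.0, 3.0]            # hypothetical per-molecule descriptor
baseline_dE = [50.0, 60.0, 70.0, 80.0]    # hypothetical cheap estimate of ΔE
reference_dE = [55.0, 67.0, 79.0, 91.0]   # hypothetical expensive reference ΔE

# Δ-learning: the regression target is the correction, not ΔE itself.
delta = [ref - base for ref, base in zip(reference_dE, baseline_dE)]
slope, intercept = fit_line(feature, delta)

# Prediction = baseline + learned correction.
predicted_dE = [
    base + slope * x + intercept for base, x in zip(baseline_dE, feature)
]
score = r_squared(reference_dE, predicted_dE)
```

In practice the correction model would be trained on many descriptors per molecule and evaluated on held-out pairs; the toy correction here is exactly linear, so the fit is perfect by construction.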

_______________________________________________________________________

The dataset contains three file types:

Version 1.0:

  1. xyz files of the final optimized Rh-phosphine complexes; one set for the starting materials denoted as "molecule-XXXX_4-times" and one set for the intermediates after C-H activation denoted as "molecule-XXXX_6-times"
  2. Gaussian16 log files for the optimization process; one set for the starting materials denoted as "molecule-XXXX_4-times" and one set for the intermediates after C-H activation denoted as "molecule-XXXX_6-times"
  3. csv files containing the per-molecule features used for training the different machine learning models. The name of each csv file indicates which property was predicted and which model was used
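The naming scheme above makes it straightforward to match each starting material with its intermediate. The following is a minimal sketch of such a pairing, assuming that file names follow "molecule-XXXX_4-times" and "molecule-XXXX_6-times" with an optional extension. The function name, the regular expression, and the example IDs ("0001", "0002") are hypothetical, since the exact form of XXXX is not specified in this record.

```python
import re
from collections import defaultdict

# Assumed pattern: "molecule-XXXX_4-times" (starting material) or
# "molecule-XXXX_6-times" (intermediate), optionally with an extension
# such as ".xyz" or ".log". The shape of XXXX is an assumption.
NAME_RE = re.compile(r"^molecule-(?P<id>[^_]+)_(?P<coord>[46])-times(?:\.\w+)?$")

ROLE = {"4": "reactant", "6": "intermediate"}


def pair_by_molecule(filenames):
    """Group file names by molecule ID and keep only complete pairs."""
    groups = defaultdict(dict)
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        groups[match["id"]][ROLE[match["coord"]]] = name
    return {
        mol_id: roles
        for mol_id, roles in groups.items()
        if set(roles) == {"reactant", "intermediate"}
    }


# Hypothetical file names for illustration only.
example = [
    "molecule-0001_4-times.xyz",
    "molecule-0001_6-times.xyz",
    "molecule-0002_4-times.xyz",  # no matching intermediate in this toy list
    "README.txt",
]
pairs = pair_by_molecule(example)
```

Dropping incomplete pairs is a design choice of this sketch; a stricter variant could raise an error instead so that missing files are noticed early.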

New in version 1.1 (other data is unchanged):

  1. Gaussian16 log files for the ten newly identified bisphosphine ligands; one set for the product material denoted as "LXX_6-times-axial" and one set for the transition state for the C-H activation denoted as "LXX_C-H-activation_TS"

Files

logs_ten-new-ligands.zip (10.7 MB)
md5:2d07b23eae74eb17b05a6f101337befb
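The md5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The sketch below uses Python's standard hashlib module; the helper names and the local path passed to them are assumptions, and only the expected digest comes from this record.

```python
import hashlib

# Checksum as listed for logs_ten-new-ligands.zip in this record.
EXPECTED_MD5 = "2d07b23eae74eb17b05a6f101337befb"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the checksum given in the record."""
    return md5_of_file(path) == expected
```

For example, matches_record("logs_ten-new-ligands.zip") should return True for an uncorrupted download saved under that name in the working directory.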

Additional details

Dates

Created: 2024-01-18 (First Upload)
Updated: 2024-05-03 (Addition of log files for ten new ligands)