Multiple-Kernel Ridge Regression for Learning the Structure-Electronic Property Relationships of Pyranoazacoronene COFs
Authors/Creators
Description
This dataset accompanies the paper "Multiple-Kernel Ridge Regression for Learning the Structure-Electronic Property Relationships of Pyranoazacoronene COFs". It includes:
- A .zip file with all 232 .xyz files
- .cjson files for the custom node and linker precursors to be used with pyCOFBuilder to generate the COF structures
- A Python script for executing pyCOFBuilder with example strings for each node-linker combination
- One .csv file with the Fermi level, valence band maximum, conduction band minimum, and band gap of each structure
- Four .csv files with the SOAP similarity kernels for the SOAP KRR and SOAP MKRR models, and one .csv file with the normalized stoichiometric features
- An Excel spreadsheet with tabulated elemental properties used to calculate the stoichiometric features
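As a minimal sketch of how the electronic-property table might be read, the snippet below parses CSV text and computes each band gap as the conduction band minimum minus the valence band maximum. The column names (`structure`, `vbm_eV`, `cbm_eV`) and the function name are hypothetical and are not taken from the dataset; check the actual header of the .csv file and adjust them before use.

```python
import csv
import io


def read_band_gaps(csv_text, name_col="structure", vbm_col="vbm_eV", cbm_col="cbm_eV"):
    """Return {structure name: CBM - VBM} parsed from CSV text.

    The default column names are hypothetical placeholders, not the
    dataset's real headers.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row[name_col]: float(row[cbm_col]) - float(row[vbm_col])
        for row in reader
    }


# Possible usage, assuming the properties file has been downloaded locally
# (the path is an assumption):
# with open("all-data.csv", newline="") as handle:
#     gaps = read_band_gaps(handle.read())
```

Recomputing the gap this way can serve as a consistency check against the band gap column already provided in the file.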
Abstract
Covalent organic frameworks (COFs) are highly ordered, porous organic materials whose reticular construction from tailored nodes and linkers enables atomic-level control over structure and function. The design space of COFs is vast, with virtually unlimited combinations of nodes, linkers, and functional groups. Interpretable machine learning (ML) offers a pathway to navigate this complexity by identifying the structural features that govern materials performance, yet interpretability often comes at the cost of predictive accuracy. In this work, we introduce a novel multiple-kernel learning framework that achieves both accuracy and mechanistic insight. A multiple-kernel ridge regression (MKRR) model was trained on band gaps predicted at the GFN1-xTB level of theory for a dataset of 232 theoretical pyranoazacoronene (PAC) COFs produced from eight different conjugated linkers and 29 functional groups. Modifying these building units alone produced band gaps ranging from 0.4 to 2 eV. Manual analysis of the theoretical band gaps versus the linker indicates that breaking the conjugation pathway, either by altering the bond angle or by introducing a sigma bond, increases the band gap, whereas increasing the length of the linker decreases it. All functional groups appear to reduce the band gap, with three specific electron-withdrawing groups reducing it to near 0.4 eV. For the ML model, the building units were represented with three independent kernels that encoded the local environments of each node, linker, and functional group, calculated from the Smooth Overlap of Atomic Positions (SOAP) descriptor. After decomposing each kernel's contribution to the model's global predictions, we found that the MKRR model successfully captures the underlying structure–property relationships that influence the band gap. These results demonstrate that MKRR is an effective and interpretable framework for understanding and designing functional COFs.
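The idea of combining several kernels and then splitting a prediction into per-kernel contributions can be sketched as follows. This is an illustrative outline only, not the authors' implementation: it assumes precomputed kernel matrices (for example, one each for node, linker, and functional group), fixed non-negative kernel weights, and a single ridge parameter `lam`. All function names are hypothetical, and how the paper chooses weights or normalizes kernels is not stated here.

```python
import numpy as np


def combine_kernels(kernels, weights):
    """Weighted sum of same-shaped kernel matrices."""
    return sum(w * K for w, K in zip(weights, kernels))


def fit_mkrr(train_kernels, weights, y, lam=1e-3):
    """Solve (K + lam * I) alpha = y, where K is the combined training kernel."""
    K = combine_kernels(train_kernels, weights)
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)


def kernel_contributions(query_kernels, weights, alpha):
    """Per-kernel share of each prediction.

    Each query kernel has shape (n_query, n_train). The result has one row
    per kernel and one column per query point.
    """
    return np.array([w * (K @ alpha) for w, K in zip(weights, query_kernels)])


def predict(query_kernels, weights, alpha):
    """Total prediction: the sum of the per-kernel contributions."""
    return kernel_contributions(query_kernels, weights, alpha).sum(axis=0)
```

Because the combined kernel is a weighted sum, the prediction is linear in the individual kernels, so the rows returned by `kernel_contributions` add up exactly to the output of `predict`. That additivity is what makes a per-building-unit decomposition possible in this kind of model.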
Files
all-data.csv
(6.3 MB)
| Name | Size |
|---|---|
| md5:58ed2dbf834401b91f55086ca47f441f | 12.0 kB |
| md5:39b8e57495ce323527d634d50968f654 | 691.9 kB |
| md5:4f16b68b291127bb60011f9e61b2d3b9 | 3.3 kB |
| md5:7fc6e741a22f421de086a46e63d97e74 | 1.7 kB |
| md5:99a0dd53ada09a97dda05dd1ccecb4c8 | 2.9 kB |
| md5:f3bb571de0f844739d54216564e7bebb | 2.4 kB |
| md5:bb0412e1dc38e64312727e103cae9ddd | 4.4 kB |
| md5:e402bbcaceb1a669a2851ab89289281f | 10.0 kB |
| md5:bbef0f600dfc08682e15e3c12135d66c | 3.3 kB |
| md5:4358e020ac8bcb3b55d858dc622e70cf | 2.4 kB |
| md5:1b7b9dc41334cf5e164e480c4d9c404f | 4.4 kB |
| md5:92eddd5b1299d36bd0add35a6ae475d6 | 641 Bytes |
| md5:3466e3d3eae4c335be709236c3e0fb56 | 7.9 kB |
| md5:8e275b1102e57f5efa43f296530807a5 | 7.9 kB |
| md5:12b8e657f60fae91cef72781391ef018 | 1.3 MB |
| md5:ea99203bbc7d7090f00d8abfb790651e | 1.3 MB |
| md5:1fb3a1362ce4f2d24fc20a7c04f05aa5 | 1.3 MB |
| md5:f5fb8d6c96e10fce7b354adad66c6e7e | 1.3 MB |
| md5:1b7d0348cba3a2ccb9e686e02c86e62c | 119.3 kB |
| md5:1b2ec02e5c7f0660a01762a4ab50aed2 | 4.1 kB |
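
The `md5:` values listed for each file can be used to confirm that a download is intact. A minimal sketch using only the Python standard library is shown below. The function names are hypothetical, and the path in the usage comment is an assumption, since the listing above does not pair file names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Possible usage (the local path is an assumption):
# ok = matches_listed_md5("downloads/some-file.csv", "md5:<value from the table>")
```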
Additional details
Related works
- Continues
- Publication: 10.1021/jacs.4c10529 (DOI)