Datasets for understanding the importance of conformation in property prediction models
Authors/Creators
Description
Descriptor and conformer data sets for molecular property and reaction selectivity prediction tasks. The PQC data set was created from a subset of the PubChemQC PM6 dataset (J. Chem. Inf. Model. 2020, 60, 12, 5891–5899), which contains two- and three-dimensional descriptors and conformers. The APTC data sets are based on the enantioselectivity data sets for asymmetric phase-transfer catalysts (https://github.com/Laboratoire-de-Chemoinformatique/3D-MIL-QSSR/tree/main/datasets). The melting point data set was created from the Jean-Claude Bradley Double Plus Good (Highly Curated and Validated) Melting Points Dataset (https://doi.org/10.6084/m9.figshare.1031638.v1).
All of these data sets contain descriptors and conformers for training and validating machine learning models.
Detailed instructions for using these data sets are available in the GitHub repository: https://github.com/YuHamakawa/Conformation-Importance-ML-Models.
Files
| Name | Size | MD5 |
|---|---|---|
| dataset.zip | 9.3 GB | 34959746f624327b7f79813d5a3cc4ca |
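Because the archive is large (9.3 GB), it can be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch, not part of the original record: the helper name `md5_of_file` and the assumption that `dataset.zip` sits in the current working directory are illustrative only.

```python
import hashlib
from pathlib import Path

# MD5 checksum as listed for dataset.zip on this record.
EXPECTED_MD5 = "34959746f624327b7f79813d5a3cc4ca"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use.

    Hypothetical helper for illustration; it is not provided by the data set
    or by the accompanying GitHub repository.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("dataset.zip")
    if archive.exists():
        actual = md5_of_file(archive)
        if actual == EXPECTED_MD5:
            print("Checksum matches.")
        else:
            print(f"Checksum mismatch: got {actual}")
    else:
        print("dataset.zip not found in the current directory.")
```

Reading the file in chunks avoids loading the whole archive into memory; a mismatch usually indicates an incomplete or corrupted download.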