Datasets for understanding the importance of conformation in property prediction models

Hamakawa, Yu; Miyao, Tomoyuki

doi:10.5281/zenodo.14575682

Published December 31, 2024 | Version v2

Dataset Open

1. Nara Institute of Science and Technology
2. Nara Institute of Science and Technology (NAIST)

Descriptor and conformer data sets for molecular property and reaction selectivity prediction tasks. The PQC data set was derived from a subset of the PubChemQC PM6 dataset (J. Chem. Inf. Model. 2020, 60, 12, 5891–5899) and contains two- and three-dimensional descriptors and conformers. The APTC data sets are based on the enantioselectivity data sets for asymmetric phase-transfer catalysts (https://github.com/Laboratoire-de-Chemoinformatique/3D-MIL-QSSR/tree/main/datasets). The melting point data set was created from the Jean-Claude Bradley Double Plus Good (Highly Curated and Validated) Melting Points Dataset (https://doi.org/10.6084/m9.figshare.1031638.v1).

All of the data sets contain descriptors and conformers for training and validating machine learning models.

Detailed instructions for using these data sets can be found in the GitHub repository: https://github.com/YuHamakawa/Conformation-Importance-ML-Models.
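Because the archive is large, it may help to look at its contents before extracting anything. The following is a minimal sketch, assuming the archive has been downloaded locally; the function name `summarize_archive` and the `limit` parameter are illustrative and do not come from the repository, and the internal layout of the archive is not assumed here.

```python
import zipfile


def summarize_archive(path, limit=20):
    """Return (member name, uncompressed size in bytes) for the first `limit` members.

    Only the archive's central directory is read, so nothing is extracted.
    `path` is assumed to point at a locally downloaded copy of the archive.
    """
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()[:limit]]
```

Calling `summarize_archive("dataset.zip")` on a local copy would list the first twenty entries; the repository documentation remains the authoritative description of what each file contains.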

Files

Files (9.3 GB)

Name	Size	Checksum
dataset.zip	9.3 GB	md5:34959746f624327b7f79813d5a3cc4ca
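The MD5 checksum listed for dataset.zip can be used to confirm that a 9.3 GB download completed without corruption. The following is a minimal sketch, assuming the file has been saved locally; the helper names `md5_of_file` and `verify_download` are illustrative, and the file is read in chunks so that it never has to fit in memory.

```python
import hashlib

# MD5 checksum listed for dataset.zip on the record page.
EXPECTED_MD5 = "34959746f624327b7f79813d5a3cc4ca"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks by default."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("dataset.zip")` should return `True` for an intact copy, assuming the local file name matches the one on the record page.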
