Published June 11, 2024 | Version nc_1000_v0
Dataset Open

modelforge curated dataset: QM9 1000 conformer test dataset

  • 1. Memorial Sloan Kettering Cancer Center

Description

Curated QM9 1000 Conformer Test Dataset:

This record provides a curated HDF5 file containing a subset of the QM9 dataset, intended for testing purposes. It is designed to be compatible with modelforge, an infrastructure for implementing and training neural network potentials (NNPs). The dataset contains 1000 conformers in total (one conformer per unique molecule).
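Because the data ship as a single HDF5 file, one quick way to see what it contains is to walk its groups and datasets. The following is a minimal sketch, assuming the h5py package is installed; the function name `summarize_hdf5` is illustrative and not part of modelforge, and no particular group layout or attribute naming is assumed.

```python
import h5py


def summarize_hdf5(path):
    """Return (name, shape, dtype, attributes) for every dataset in an HDF5 file.

    Illustrative helper, not part of modelforge. It makes no assumption
    about how the file is organized; it only reports what is stored.
    """
    entries = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, str(obj.dtype), dict(obj.attrs)))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return entries
```

To use it, call `summarize_hdf5` with the path to the downloaded file and print each returned tuple; any attributes stored alongside a dataset are shown exactly as they appear in the file.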

When applicable, the units of properties are provided in the data file, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, please see the following:

This curated dataset was generated using the modelforge software at commit c5c7153:

 

Original QM9 Dataset:

The QM9 dataset includes 133,885 organic molecules with up to nine heavy atoms (C, O, N, or F; excluding H), originally published by Ramakrishnan et al. Properties in the QM9 dataset were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry.

Citations:

Original publication:

Source dataset, released with CC0 1.0 Universal license:

Files

Files (1.7 MB)

md5:dc8ada0d808d02c699daf2000aff1fe9
Size: 1.7 MB
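The MD5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the function name `md5_of_file` is illustrative, and the path passed to it is whatever local file name the download was saved under.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "dc8ada0d808d02c699daf2000aff1fe9"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `md5_of_file(path)` with `EXPECTED_MD5` indicates whether the local copy matches the published file; a mismatch usually means the download was truncated or corrupted.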

Additional details

Related works

Is derived from
Journal: 10.1038/sdata.2014.22 (DOI)
Dataset: 10.6084/m9.figshare.c.978904.v5 (DOI)

Software

Repository URL
https://github.com/choderalab/modelforge
Programming language
Python
Development Status
Active