There is a newer version of the record available.

Published June 11, 2024 | Version nc_1000_v0
Dataset Open

modelforge curated dataset: QM9

  • 1. Memorial Sloan Kettering Cancer Center
  • 2. Chodera Lab

Description

Curated QM9 Dataset:

1000 conformer test set, version nc_1000_v0:

This record provides a curated HDF5 file containing a subset of the QM9 dataset, intended for testing. The file is designed to be compatible with modelforge, an infrastructure for implementing and training neural network potentials (NNPs). This test dataset contains 1000 conformers, one for each unique molecule.
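As a rough way to see what such a file contains, the sketch below walks an arbitrary HDF5 file with h5py and collects the name, shape, dtype, and attributes of each dataset. It does not assume any particular group layout or attribute names, since those are not described here; the helper name `summarize_hdf5` and the example path are hypothetical.

```python
import h5py


def summarize_hdf5(path):
    """Return (name, shape, dtype, attrs) for every dataset in an HDF5 file.

    Hypothetical helper: it makes no assumption about how modelforge
    organizes groups, datasets, or attributes inside the file.
    """
    entries = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, str(obj.dtype), dict(obj.attrs)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return entries
```

A call such as `summarize_hdf5("path/to/downloaded_file.hdf5")` (placeholder path) returns a list that can be printed entry by entry to see which properties are stored and what metadata accompanies them.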

When applicable, the units of properties are provided in the data file, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, please see the following:

This curated dataset was generated using the modelforge software at commit c5c7153:

 

Original Source:

The QM9 dataset includes 133,885 organic molecules with up to nine total heavy atoms (C, O, N, or F; excluding H), originally published by Ramakrishnan et al. Properties in the QM9 dataset were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry.

Citations:

Original publication:

Source dataset, released with CC0 1.0 Universal license:

Files

Files (1.7 MB)

md5:dc8ada0d808d02c699daf2000aff1fe9
Size: 1.7 MB
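The MD5 checksum listed above can be used to confirm that a download completed intact. The following is a minimal sketch using only the Python standard library; the function names and the example path are hypothetical, and the expected digest is the one shown in this record.

```python
import hashlib

# Checksum as listed in this record's file entry.
EXPECTED_MD5 = "dc8ada0d808d02c699daf2000aff1fe9"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_record("path/to/downloaded_file.hdf5")` (placeholder path) should return `True` for an uncorrupted copy of the file attached to this version of the record.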

Additional details

Related works

Is derived from
Journal article: 10.1038/sdata.2014.22 (DOI)
Dataset: 10.6084/m9.figshare.c.978904.v5 (DOI)

Software

Repository URL
https://github.com/choderalab/modelforge
Programming language
Python
Development Status
Active