modelforge curated dataset: tmQM
Authors/Creators
Description
Curated tmQM Dataset:
Full dataset, version "full_dataset_v1":
This record provides a curated HDF5 file for the tmQM dataset (release 13Aug2024), designed to be compatible with modelforge, an infrastructure for implementing and training neural network potentials (NNPs). The data file includes 108541 unique molecules, with only a single configuration provided per molecule.
Changes from full_dataset_v0: fixed a minor labeling bug and a scaling issue in the scaled version of the computed dipole moment.
When applicable, the units of properties are provided in the datafile, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, please see the following:
This curated dataset was generated using the modelforge software at commit <add commit>:
- Link to the source code at this commit: <add commit>
- Link to the script file used to generate the dataset: <add commit>
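Because the internal layout of the file is not spelled out in this description, a quick way to see how properties and their unit strings are stored is to walk the file with h5py. The sketch below is a minimal, generic inspection helper and not part of modelforge: the function name `summarize_hdf5` and the example file name are hypothetical, and no particular group or attribute naming scheme is assumed. It simply lists whatever datasets and attributes it finds.

```python
import h5py


def summarize_hdf5(path, max_items=20):
    """Return (name, shape, dtype, attrs) for the first `max_items` datasets.

    Generic helper (hypothetical, not part of modelforge): it makes no
    assumption about how the file names its groups or unit attributes.
    """
    rows = []

    def visit(name, obj):
        # Only datasets carry array data; groups are skipped.
        if isinstance(obj, h5py.Dataset):
            rows.append((name, obj.shape, str(obj.dtype), dict(obj.attrs)))
            if len(rows) >= max_items:
                # Any non-None return value stops visititems early, which
                # matters for a file with more than 100k molecules.
                return True
        return None

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return rows


# Example usage (the file name below is a placeholder, not the real one):
# for name, shape, dtype, attrs in summarize_hdf5("tmqm_dataset.hdf5"):
#     print(name, shape, dtype, attrs)
```

Any unit strings found this way should be parseable with the openff-units package, as noted in the description above.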
Files
| Checksum | Size |
|---|---|
| md5:c584662a02964d78b0d5c6bc28960867 | 310.6 MB |
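After downloading, the file can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library; the helper names and the example file name are hypothetical, and the file is read in chunks so the 310.6 MB download does not have to fit in memory at once.

```python
import hashlib

# Checksum published in the Files table of this record.
EXPECTED_MD5 = "c584662a02964d78b0d5c6bc28960867"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()


# Example usage (the file name below is a placeholder, not the real one):
# print(matches_published_checksum("tmqm_dataset.hdf5"))
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.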
Additional details
Related works
- Is derived from
  - Dataset: https://github.com/bbskjelstad/tmqm (URL)
- Is published in
  - Publication: 10.1021/acs.jcim.0c01041 (DOI)
Software
- Repository URL: https://github.com/choderalab/modelforge
- Programming language: Python
- Development status: Active