modelforge curated dataset: ANI-2x
Description
Modelforge Curated ANI-2x Dataset:
- 1000 configuration test set
- Version: nc_1000_v1.1
This record provides a curated hdf5 file containing a subset of the ANI-2x dataset, designed to be compatible with modelforge, an infrastructure for implementing and training neural network potentials (NNPs). The dataset contains 101 unique records comprising 1000 configurations in total, with a maximum of 10 configurations per system. Note that configurations are partitioned into records based on the array of atomic species, as it appears in sequence in the source data file.
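The partitioning described above can be illustrated with a minimal sketch. It assumes that a record is keyed by the ordered sequence of atomic numbers and that the cap of 10 applies per key; the function name `partition_into_records`, the input layout, and the toy data are hypothetical and are not taken from the modelforge curation code.

```python
from collections import defaultdict


def partition_into_records(configurations, max_per_record=10):
    """Group configurations by their ordered atomic-number sequence.

    `configurations` is an iterable of (atomic_numbers, payload) pairs.
    Returns a dict mapping each species tuple to at most
    `max_per_record` payloads, kept in first-seen order.
    """
    records = defaultdict(list)
    for atomic_numbers, payload in configurations:
        # Assumption: atom order matters, so (8, 1, 1) and (1, 8, 1)
        # land in different records.
        key = tuple(atomic_numbers)
        if len(records[key]) < max_per_record:
            records[key].append(payload)
    return dict(records)


# Toy input with placeholder payload labels (not real ANI-2x data).
toy = [
    ((8, 1, 1), "water_a"),
    ((8, 1, 1), "water_b"),
    ((1, 8, 1), "water_reordered"),
    ((6, 1, 1, 1, 1), "methane_a"),
]
records = partition_into_records(toy, max_per_record=10)
n_records = len(records)
n_configurations = sum(len(v) for v in records.values())
```

Under this keying, the number of records can be smaller than the number of configurations, which is consistent with 101 records holding 1000 configurations.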
When applicable, the units of properties are provided in the data file, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, please see the following:
This dataset is compatible with modelforge hdf5 schema 2.
Properties Included:
- atomic_numbers
- positions
  - "per_atom"
  - "nanometer"
- forces
  - "per_atom"
  - "kilojoule_per_mole / nanometer"
- energies
  - "per_system"
  - "kilojoule_per_mole"
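For readers who work in ångström and kcal/mol, the following sketch converts values out of the units listed above. It is a plain-Python illustration only: the `CONVERSIONS` table, the `convert` helper, and the target unit labels are hypothetical, and in practice a units library such as the openff-units package mentioned earlier would normally handle this. The two factors used (1 nm = 10 Å, 1 kcal = 4.184 kJ) are exact by definition.

```python
KJ_PER_KCAL = 4.184      # thermochemical calorie, exact by definition
ANGSTROM_PER_NM = 10.0   # exact

# Keys are the unit strings listed for this dataset; the target labels
# are illustrative and not guaranteed to be valid openff-units names.
CONVERSIONS = {
    "nanometer": ("angstrom", ANGSTROM_PER_NM),
    "kilojoule_per_mole": ("kilocalorie_per_mole", 1.0 / KJ_PER_KCAL),
    "kilojoule_per_mole / nanometer": (
        "kilocalorie_per_mole / angstrom",
        1.0 / (KJ_PER_KCAL * ANGSTROM_PER_NM),
    ),
}


def convert(values, unit_string):
    """Return (converted_values, target_unit) for a flat list of floats."""
    target_unit, factor = CONVERSIONS[unit_string]
    return [v * factor for v in values], target_unit


# Made-up numbers chosen so the results are easy to check by hand.
positions_A, pos_unit = convert([0.1, 0.25], "nanometer")
energies_kcal, e_unit = convert([4.184, -41.84], "kilojoule_per_mole")
forces_kcal_A, f_unit = convert([418.4], "kilojoule_per_mole / nanometer")
```

Note that the force factor divides by both constants, because forces carry energy in the numerator and length in the denominator.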
Source Dataset:
The ANI-2x data set includes properties for small organic molecules that contain H, C, N, O, S, F, and Cl. The full source dataset contains 9,651,712 conformers. The data were generated at the ωB97X/6-31G(d) level of theory (labelled wB97X/631Gd) used in the original ANI-2x paper, with calculations performed in Gaussian 09.
Citations:
ANI-2x publication:
- Devereux, C., Zubatyuk, R., Smith, J. et al. "Extending the applicability of the ANI deep learning molecular potential to sulfur and halogens." Journal of Chemical Theory and Computation 16.7 (2020): 4192-4202. https://doi.org/10.1021/acs.jctc.0c00121
Source dataset, released with CC Attribution 4.0 International license:
- Huddleston, K., Zubatyuk, R., Smith, J., Roitberg, A., Isayev, O., Pickering, I., Devereux, C., & Barros, K. (2023). ANI-2x Release [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10108942
Files
| Name | Size |
|---|---|
| md5:574cb77f5089019965607f7c7110d4c2 | 179.3 kB |
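The MD5 checksum listed above can be used to confirm that a download is intact. The sketch below uses only the Python standard library; the helper names `md5_of_file` and `verify_download` are hypothetical, and the path of the downloaded file is left to the caller because the file name is not given here. The throwaway-file check at the end exists only so the sketch runs without the real download.

```python
import hashlib
import os
import tempfile

EXPECTED_MD5 = "574cb77f5089019965607f7c7110d4c2"  # checksum listed above


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


# Self-check on a throwaway file (not the real dataset).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"modelforge")
    tmp_path = tmp.name
self_check_ok = md5_of_file(tmp_path) == hashlib.md5(b"modelforge").hexdigest()
dummy_matches_dataset = verify_download(tmp_path)  # expected to be False
os.remove(tmp_path)
```

For the real file, calling `verify_download` with the local path of the downloaded hdf5 file should return `True` if the transfer was not corrupted.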
Additional details
Related works
- Is derived from: Dataset, 10.5281/zenodo.10108942 (DOI)
- Is described by: Journal article, 10.1021/acs.jctc.0c00121 (DOI)
Software
- Repository URL: https://github.com/choderalab/modelforge
- Programming language: Python
- Development Status: Active