Published April 22, 2025 | Version nc_1000_v1.1
Dataset Open

modelforge curated dataset: Fe (II) dataset

  • 1. Memorial Sloan Kettering Cancer Center
  • 2. Vanderbilt University

Description

Modelforge Curated Fe (II) Dataset
- 1000 conformer test set
- Version: nc_1000_v1.1

This record provides a modelforge-curated HDF5 file for the Fe(II) dataset. This test set contains 102 unique systems with a total of 1000 configurations (at most 10 configurations per system). The full dataset contains systems with the elements H, C, N, O, P, S, Cl, and Fe.

This dataset is compatible with modelforge HDF5 schema 2.

The full Fe(II) dataset includes 28834 total configurations of Fe(II) organometallic complexes. Specifically, this includes 15568 high-spin (HS) geometries and 13266 low-spin (LS) geometries. These complexes originate from the Cambridge Structural Database (CSD) as curated by Nandy et al. (Journal of Physical Chemistry Letters (2023), 14 (25), 10.1021/acs.jpclett.3c01214) and were filtered into “computation-ready” complexes (those where both oxidation states and charges are already specified and no hydrogen atoms are missing from the structures), following the procedure outlined by Arunachalam et al. (Journal of Chemical Physics (2022), 157 (18), 10.1063/5.0125700).

The original Fe(II) dataset is available from GitHub:
https://github.com/Neon8988/Iron_NNPs

The modelforge dataset was curated from a forked release (no modifications were made to the dataset; it was forked and released to ensure provenance):
https://github.com/chrisiacovella/Iron_NNPs/releases/tag/2024Jan16

Citation to the original dataset:

Modeling Fe(II) Complexes Using Neural Networks
Hongni Jin and Kenneth M. Merz Jr.
Journal of Chemical Theory and Computation 2024 20 (6), 2551-2558
DOI: 10.1021/acs.jctc.4c00063

 

Properties included in the dataset:

  • atomic_numbers
  • positions
    • "per_atom"
    • "nanometer"
  • forces
    • "per_atom"
    • "kilojoule_per_mole / nanometer"
  • energies
    • "per_system"
    • "kilojoule_per_mole"
  • total_charge
    • "per_system"
    • "elementary_charge"
  • spin_multiplicities
    • "per_system"
  • mol_id
    • "meta_data"
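The units listed above (nanometer, kilojoule_per_mole, and kilojoule_per_mole / nanometer) differ from the angstrom and kcal/mol conventions used by some other tools. The following is a minimal sketch of the conversions, assuming the values have already been read from the file into plain nested Python lists; the function names are hypothetical and are not part of modelforge.

```python
# Conversion factors (exact by definition): 1 nm = 10 angstrom, 1 kcal = 4.184 kJ.
NM_TO_ANGSTROM = 10.0
KJ_TO_KCAL = 1.0 / 4.184


def positions_to_angstrom(positions_nm):
    """Convert per-atom positions from nanometer to angstrom."""
    return [[c * NM_TO_ANGSTROM for c in xyz] for xyz in positions_nm]


def energy_to_kcal_per_mol(energy_kj_per_mol):
    """Convert a per-system energy from kJ/mol to kcal/mol."""
    return energy_kj_per_mol * KJ_TO_KCAL


def forces_to_kcal_per_mol_angstrom(forces_kj_per_mol_nm):
    """Convert per-atom forces from kJ/mol/nm to kcal/mol/angstrom."""
    # Energy in the numerator, length in the denominator:
    # multiply by the energy factor and divide by the length factor.
    factor = KJ_TO_KCAL / NM_TO_ANGSTROM
    return [[c * factor for c in xyz] for xyz in forces_kj_per_mol_nm]
```

Note that forces scale inversely with the length unit, so a force of 41.84 kJ/mol/nm corresponds to 1 kcal/mol/angstrom.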

 

Files

Size: 1.4 MB
md5:5337732f01cc99fac8c500c1df7a4b39
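
The MD5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and the local file path is an assumption to be supplied by the reader, since the file name is not shown in this record.

```python
import hashlib

# Checksum as listed in this record.
EXPECTED_MD5 = "5337732f01cc99fac8c500c1df7a4b39"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the path to the downloaded HDF5 file should return `True` for an uncorrupted copy of this version.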

Additional details

Related works

Is derived from
Dataset: https://github.com/Neon8988/Iron_NNPs (URL)
Is described by
Journal: 10.1021/acs.jctc.4c00063 (DOI)

Software

Repository URL
https://github.com/choderalab/modelforge
Programming language
Python
Development Status
Active