1. Original Data:

Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).

This dataset consists of configurations characterizing the decomposition process of formate on Cu(110), focusing on C-H bond cleavage.
It includes initial states (monodentate and bidentate formate), intermediate configurations, and final states (an H adatom with gas-phase CO2).
Reaction pathways were generated with the Nudged Elastic Band (NEB) method, followed by 12 ab initio molecular dynamics (AIMD) simulations run with the CP2K code.
These simulations produced 6,855 DFT structures, using a 0.5 fs time step and 500-step trajectories, and capture the dynamic evolution across reaction coordinates.
The systematically sampled configurations give an atomistic view of the catalytic decomposition mechanism.

2. Data Split:

The full dataset was partitioned by uniform random sampling into training (2,500 structures), validation (250 structures), and test (the remaining 4,105 structures) sets. See `data/downsample/split.py` for details.
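
For orientation, a split of this kind can be sketched as below. This is a minimal illustration only, not the contents of `data/downsample/split.py`: the function name `split_indices`, the seed value, and the use of Python's standard `random` module are assumptions made for the example. Only the set sizes (2,500 / 250 / 4,105 out of 6,855) come from the description above.

```python
import random

# Sizes taken from the description above.
N_TOTAL = 6855
N_TRAIN = 2500
N_VAL = 250


def split_indices(n_total=N_TOTAL, n_train=N_TRAIN, n_val=N_VAL, seed=0):
    """Return disjoint (train, val, test) index lists from one random permutation.

    Hypothetical helper: the seed and the index-based approach are assumptions,
    not the behaviour of the actual split script.
    """
    if n_train + n_val > n_total:
        raise ValueError("train + validation size exceeds dataset size")

    indices = list(range(n_total))
    # Shuffling once and slicing gives a uniform random split without overlap.
    random.Random(seed).shuffle(indices)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # everything that is left
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed makes the split reproducible. The resulting index lists can then be used to select structures from whatever container holds the 6,855 configurations.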

