Data for the manuscript "Spatially resolved uncertainties for machine learning potentials"
Description
This repository accompanies the manuscript "Spatially resolved uncertainties for machine learning potentials" by E. Heid, J. Schörghuber, R. Wanzenböck, and G. K. H. Madsen. The following files are available:
- mc_experiment.ipynb is a Jupyter notebook for the Monte Carlo experiment described in the study (artificial model with only variance as error source).
- aggregate_cut_relax.py contains code to cut and relax boxes for the water active learning cycle.
- data_t1x.tar.gz contains reaction pathways for 10,073 reactions from a subset of the Transition1x dataset, split into training, validation and test sets. The training and validation sets contain the indices 1, 2, 9, and 10 from a 10-image nudged-elastic band search (40k datapoints), while the test set contains indices 3-8 (60k datapoints). The test set is ordered according to the reaction and index, i.e. rxn1_index3, rxn1_index4, [...] rxn1_index8, rxn2_index3, [...].
- data_sto.tar.gz contains surface reconstructions of SrTiO3, randomly split into a training and validation set, as well as a test set.
- data_h2o.tar.gz contains:
  - full_db.extxyz: The full dataset of 1.5k structures.
  - iter00_train.extxyz and iter00_validation.extxyz: The initial training and validation set for the active learning cycle.
  - The subfolders in the folders random, uncertain, and atomic contain the training and validation sets for the random and uncertainty-based (local or atomic) active learning loops.
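
The ordering of the data_t1x test set (rxn1_index3, rxn1_index4, [...] rxn1_index8, rxn2_index3, [...]) implies a simple mapping from a position in the test set to a reaction and image index. The following is a minimal sketch of that mapping, not code from the repository. It assumes that every reaction contributes exactly the six images 3-8 with none missing, that reactions are numbered from 1 as in the rxn1 label, and that positions are counted from 0. The function name label_for_position is hypothetical.

```python
# Illustrative only: assumes each reaction contributes exactly six test
# images (indices 3-8, none missing) and that reactions are numbered from 1.
IMAGES_PER_REACTION = 6  # image indices 3, 4, 5, 6, 7, 8
FIRST_TEST_INDEX = 3


def label_for_position(position):
    """Map a 0-based position in the test set to (reaction number, image index)."""
    if position < 0:
        raise ValueError("position must be non-negative")
    reaction = position // IMAGES_PER_REACTION + 1
    image_index = position % IMAGES_PER_REACTION + FIRST_TEST_INDEX
    return reaction, image_index


for pos in (0, 5, 6):
    rxn, idx = label_for_position(pos)
    print(f"position {pos}: rxn{rxn}_index{idx}")
```

If the archive turns out to omit images for some reactions, this arithmetic no longer holds and the labels should be read from the files themselves.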