Published September 22, 2025 | Version v1
Dataset Open

CatBench Benchmark Dataset Collection: Adsorption Energy Datasets for MLIP Evaluation

  • 1. Seoul National University

Description

Dataset collection supporting the CatBench framework for evaluating machine learning interatomic potentials (MLIPs) in heterogeneous catalysis. This repository contains the five adsorption energy datasets used to benchmark 13 state-of-the-art MLIPs.

File Contents:
1. KHLOHC_origin_adsorption.json (10.69 MB) - Liquid organic hydrogen carrier (LOHC) adsorption dataset including methylcyclohexane and toluene systems on transition metal surfaces, used for fine-tuning demonstrations

2. ComerGeneralized2024_adsorption.json (1.85 MB) - 325 adsorption reactions on metal oxide surfaces, enabling assessment of MLIP performance on oxides with distinct electronic structures and bonding characteristics

3. MamunHighT2019_adsorption.json (195.41 MB) - 45,130 small molecule adsorption reactions on 2,035 bimetallic alloy surfaces covering 37 metals with H, C, N, O, S atoms and molecular fragments (CHx, OH, NH, SH)

4. FG_dataset_adsorption.json (15.35 MB) - 2,651 large organic molecule adsorption reactions containing C1-C10 molecules with diverse functional groups (alcohols, amines, thiols, aromatics) on 14 transition metals

5. BM_dataset_adsorption.json (445.55 KB) - 32 extended large molecule adsorption systems with up to 30 heteroatoms, covering industrial applications: biomass conversion (lignin-derived on Ni/Ru), polyurethane synthesis (diaminotoluene on Ag/Au), and plastic recycling (polymer segments on Pt/Ru)

Each JSON file contains DFT-optimized structures, total energies, and reaction definitions formatted for direct use with the CatBench framework. The data supports all results reported in "CatBench Framework for Benchmarking Machine Learning Interatomic Potentials in Adsorption Energy Predictions for Heterogeneous Catalysis" accepted at Cell Reports Physical Science.
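Since this record does not spell out the JSON schema, a first step before using a file outside the CatBench framework might be to look at its shape. The following is a minimal sketch using only the Python standard library; `load_dataset` and `summarize` are hypothetical helper names and are not part of CatBench, and no key names are assumed.

```python
import json
from pathlib import Path


def load_dataset(path):
    """Parse one of the dataset JSON files into plain Python objects."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(node, max_depth=2, max_items=5, _depth=0):
    """Return text lines describing the shape of a parsed JSON value.

    Only the first `max_items` keys of a dict and the first element of a
    list are expanded, so the output stays short even for large files.
    """
    pad = "  " * _depth
    if isinstance(node, dict):
        lines = [f"{pad}dict ({len(node)} keys)"]
        if _depth < max_depth:
            for key in list(node)[:max_items]:
                child = summarize(node[key], max_depth, max_items, _depth + 1)
                lines.append(f"{pad}  {key!r}: {child[0].strip()}")
                lines.extend(child[1:])
        return lines
    if isinstance(node, list):
        lines = [f"{pad}list ({len(node)} items)"]
        if node and _depth < max_depth:
            child = summarize(node[0], max_depth, max_items, _depth + 1)
            lines.append(f"{pad}  [0]: {child[0].strip()}")
            lines.extend(child[1:])
        return lines
    # Scalars (str, int, float, bool, None) are reported by type name only.
    return [f"{pad}{type(node).__name__}"]


if __name__ == "__main__":
    # Assumes the smallest file has been downloaded to the working directory.
    target = Path("BM_dataset_adsorption.json")
    if target.is_file():
        print("\n".join(summarize(load_dataset(target))))
```

Starting with BM_dataset_adsorption.json keeps the inspection quick, since it is the smallest file; MamunHighT2019_adsorption.json is roughly 195 MB and will take noticeably more memory to parse in one call.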

Related code repository: https://github.com/JinukMoon/CatBench (DOI: 10.5281/zenodo.17172022)

Files

Files (223.7 MB)

Name                                    Size       Checksum
BM_dataset_adsorption.json              445.5 kB   md5:ef2d57ef9f92047dec62c3222c33aeb6
ComerGeneralized2024_adsorption.json    1.9 MB     md5:0b6ad7380c19ab8d4b8d7cb016487f2e
FG_dataset_adsorption.json              15.3 MB    md5:f7c22d01325e4171056aa60e73e1eca9
KHLOHC_origin_adsorption.json           10.7 MB    md5:9e98a45f36d15604dbc8e9dfe9c2f0b5
MamunHighT2019_adsorption.json          195.4 MB   md5:8a208683009546b4d0f9821113ab1cc6
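The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch with Python's standard `hashlib` module. The pairing of each checksum with a file name is an assumption, inferred by matching the listed sizes against the sizes given in the description; `md5_of_file` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# Assumed pairing of file names and checksums, inferred from the file sizes.
EXPECTED_MD5 = {
    "BM_dataset_adsorption.json": "ef2d57ef9f92047dec62c3222c33aeb6",
    "ComerGeneralized2024_adsorption.json": "0b6ad7380c19ab8d4b8d7cb016487f2e",
    "FG_dataset_adsorption.json": "f7c22d01325e4171056aa60e73e1eca9",
    "KHLOHC_origin_adsorption.json": "9e98a45f36d15604dbc8e9dfe9c2f0b5",
    "MamunHighT2019_adsorption.json": "8a208683009546b4d0f9821113ab1cc6",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are not read into memory at once."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each file name to True (match), False (mismatch), or None (not found)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```

A mismatch most often indicates a truncated download, so fetching the file again is a reasonable first response.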