Published March 24, 2026 | Version v1
Dataset Open

SPICE-alpha - MACE-MDP

  • 1. University of Bayreuth
  • 2. Fritz Haber Institute of the Max Planck Society
  • 3. Fritz-Haber-Institut der Max-Planck-Gesellschaft
  • 4. University College London

Description

This repository contains the datasets used to develop and benchmark MACE-MDP, a general dipole and polarizability model for organic molecules and materials. It comprises three complementary datasets that collectively support large-scale training, benchmarking of vibrational spectroscopy, and evaluation of transferability to non-covalent molecular clusters.

Datasets

1. SPICE-α

SPICE-α is a large-scale extension of the SPICE v2.0 dataset comprising approximately 1.8 million molecular configurations.

In addition to energies, forces, and dipole moments, the dataset includes DFT-level molecular polarizability tensors computed at the ωB97M-D3(BJ)/def2-TZVPPD level of theory. It spans isolated molecules, non-covalent dimers, solvated systems, and biomolecular fragments.
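As a minimal sketch of how a polarizability tensor of this kind is commonly summarized, the example below computes the two standard rotational invariants of a symmetric 3×3 tensor: the isotropic (mean) polarizability and the squared anisotropy. The function names and the example tensor are hypothetical, and the storage format and units used in SPICE-α are not assumed here.

```python
def isotropic_polarizability(alpha):
    """Return the isotropic (mean) polarizability, trace(alpha) / 3."""
    return (alpha[0][0] + alpha[1][1] + alpha[2][2]) / 3.0


def anisotropy_squared(alpha):
    """Return the squared anisotropy of a symmetric 3x3 polarizability tensor."""
    diagonal_part = 0.5 * (
        (alpha[0][0] - alpha[1][1]) ** 2
        + (alpha[1][1] - alpha[2][2]) ** 2
        + (alpha[2][2] - alpha[0][0]) ** 2
    )
    off_diagonal_part = 3.0 * (
        alpha[0][1] ** 2 + alpha[1][2] ** 2 + alpha[0][2] ** 2
    )
    return diagonal_part + off_diagonal_part


# Illustrative tensor in arbitrary units (made-up values, not taken from SPICE-α).
example_alpha = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
]
```

Both quantities are unchanged when the molecule is rotated, which makes them convenient scalar targets for sanity-checking predicted tensors against reference values.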

2. IR-R-7193

IR-R-7193 contains 7,193 isolated organic molecules with reference harmonic IR and Raman spectra computed at the same ωB97M-D3(BJ)/def2-TZVPPD level of theory as SPICE-α.

This dataset is used to benchmark the accuracy of machine-learning predictions of vibrational spectra derived from model-predicted dipole moments and polarizabilities.
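To illustrate how harmonic spectra can follow from dipole moments and polarizabilities, the sketch below uses the standard textbook double-harmonic expressions: the IR intensity of a normal mode is proportional to the squared norm of the dipole derivative along that mode, and the Raman activity is 45ᾱ'² + 7γ'², where ᾱ' and γ'² are the isotropic part and squared anisotropy of the polarizability derivative. This is an assumption about the general approach, not a description of the exact workflow used for IR-R-7193. The function names are hypothetical, and unit-dependent prefactors are omitted.

```python
def ir_intensity(dmu_dq):
    """Relative IR intensity: squared norm of the dipole derivative (3-vector)."""
    return sum(component ** 2 for component in dmu_dq)


def raman_activity(dalpha_dq):
    """Raman activity 45*a'^2 + 7*g'^2 from a symmetric 3x3 polarizability derivative."""
    d = dalpha_dq
    # Isotropic part of the polarizability derivative.
    mean_derivative = (d[0][0] + d[1][1] + d[2][2]) / 3.0
    # Squared anisotropy of the polarizability derivative.
    anisotropy_sq = 0.5 * (
        (d[0][0] - d[1][1]) ** 2
        + (d[1][1] - d[2][2]) ** 2
        + (d[2][2] - d[0][0]) ** 2
    ) + 3.0 * (d[0][1] ** 2 + d[1][2] ** 2 + d[0][2] ** 2)
    return 45.0 * mean_derivative ** 2 + 7.0 * anisotropy_sq


# Made-up derivatives along one normal mode, in arbitrary units.
example_dmu_dq = [1.0, 2.0, 2.0]
example_dalpha_dq = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
]
```

In practice the derivatives would be obtained by differentiating model-predicted dipoles and polarizabilities with respect to the normal-mode coordinates, for example by finite differences.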

3. R-3B69

R-3B69 is a dataset of 69 molecular trimers derived from crystal structures and annotated with reference IR and Raman spectra.

It is designed to evaluate model transferability to non-covalent molecular clusters and intermolecular interactions.

Reference electronic structure method

All reported reference properties were computed consistently at the ωB97M-D3(BJ)/def2-TZVPPD level of theory.

Citations

[1] Gönnheimer, N.; Reuter, K.; Kapil, V.; Margraf, J. T. MACE-MDP: A General Dipole and Polarizability Model for Organic Molecules and Materials. ChemRxiv 2025. https://chemrxiv.org/doi/full/10.26434/chemrxiv.15000716

[2] Eastman, P.; Behara, P. K.; Dotson, D. L.; Galvelis, R.; Herr, J. E.; Horton, J. T.; Mao, Y.; Chodera, J. D.; Pritchard, B. P.; Wang, Y.; De Fabritiis, G.; Markland, T. E. SPICE, A Dataset of Drug-like Molecules and Peptides for Training Machine Learning Potentials. Sci. Data 2023, 10, 11.

[3] Pracht, P.; Pillai, Y.; Kapil, V.; Csányi, G.; Gönnheimer, N.; Vondrák, M.; Margraf, J. T.; Wales, D. J. Efficient Composite Infrared Spectroscopy: Combining the Double-Harmonic Approximation with Machine Learning Potentials. J. Chem. Theory Comput. 2024, 20, 10986–11004.

[4] Řezáč, J.; Huang, Y.; Hobza, P.; Beran, G. J. O. Benchmark Calculations of Three-Body Intermolecular Interactions and the Performance of Low-Cost Electronic Structure Methods. J. Chem. Theory Comput. 2015, 11, 3065–3079.

Files

SPICE-alpha.zip

Files (12.5 GB)

Name: SPICE-alpha.zip
Size: 12.5 GB
md5:afcba5a263030b61deaeb79c660c2efd
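Given the size of the archive, it may be worth checking the download against the MD5 checksum listed above before unpacking. The following is a minimal sketch using Python's standard hashlib module. The helper names are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# MD5 checksum as listed on this record for SPICE-alpha.zip.
EXPECTED_MD5 = "afcba5a263030b61deaeb79c660c2efd"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the 12.5 GB archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_expected("SPICE-alpha.zip") on the downloaded file should return True for an intact download; a False result suggests a truncated or corrupted transfer.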

Additional details

Software

Repository URL
https://github.com/Nilsgoe/Benchmark-MACE-MDP
Programming language
Python
