Published June 5, 2025 | Version 2a
Dataset Open

Code and Data for the Publication: "Solvation free energies of anions: from curated reference data to predictive models"

Description

This compilation includes 8241 experimental pKa values across 8 solvents, 5536 computed gas-phase acidities, 6090 solvation energies of anions, and 6088 solvation energies of neutral compounds computed using COSMO-RS. All citations should refer to the manuscript:

Nevolianis, T., Zheng, J.W., Müller, S., Baumann, M., Tshepelevitsh, S., Kaljurand, I., Leito, I., Smirnova, I., Green, W.H., Leonhard, K. Solvation free energies of anions: from curated reference data to predictive models. Submitted in 2025.

Until the publication is available, citations to the ChemRxiv preprint are also acceptable:

Nevolianis, T., Zheng, J.W., Müller, S., Baumann, M., Tshepelevitsh, S., Kaljurand, I., Leito, I., Smirnova, I., Green, W.H., Leonhard, K. Solvation free energies of anions: from curated reference data to predictive models. ChemRxiv. 2025; doi:10.26434/chemrxiv-2025-8bj2t-v2

Note: To load the solvation and pKa models with the Chemprop Python API, you need to instantiate them as MulticomponentMPNN models. From there, we advise following the Chemprop documentation (https://chemprop.readthedocs.io/en/latest/predicting_regression_reaction.html).
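
As a rough illustration of the note above, the following Python sketch locates saved model files and loads one as a MulticomponentMPNN. It assumes Chemprop v2 is installed and that the models are stored as .pt or .ckpt files somewhere under an extracted models directory; the directory layout, the file suffixes, and the helper names find_model_files and load_multicomponent_model are assumptions made for this example, not part of the dataset or of Chemprop.

```python
from pathlib import Path


def find_model_files(model_dir, suffixes=(".pt", ".ckpt")):
    """Return saved-model files under model_dir, sorted by path.

    The suffixes are an assumption about how the models were saved.
    """
    root = Path(model_dir)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


def load_multicomponent_model(path):
    """Load one saved model as a MulticomponentMPNN (assumes Chemprop v2)."""
    # Imported lazily so that find_model_files works without Chemprop installed.
    from chemprop.models import MulticomponentMPNN

    model = MulticomponentMPNN.load_from_file(path)
    model.eval()  # switch to inference mode before predicting
    return model
```

A call such as load_multicomponent_model(find_model_files("models")[0]) would then return a model ready for the prediction workflow described in the Chemprop documentation, where "models" is a hypothetical path to the extracted model directory.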

Notes

Version V2a: includes references for pKa data

Version V2: split models into separate directories, updated test splits to remove erroneously repeated species (stereoisomers)

Version V1a: fixed broken models file

Version V1: initial upload


Files

data.zip

Files (652.0 MB)

Checksum | Size
md5:b47a163bda1e7f858202b5b382902dba | 6.0 MB
md5:61530fb6b3896a61eafedb51f2d0e1ce | 71.4 MB
md5:d8edd89554f5afd9b2d08971faf0035e | 218.6 MB
md5:759a59927188103b537de424c972e5d6 | 3.1 kB
md5:f4ce332425c663f77b6e3d77fe84f28d | 356.0 MB
md5:c623bb7dc4b06f8e103a838d27d0dea3 | 6.4 kB
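
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names md5_of_file and matches_checksum are chosen for this example, and the file path passed in is whatever local path the file was saved to.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for the larger archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

Calling matches_checksum with the local path of a downloaded file and the checksum from that file's row returns True when the two agree.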