Published February 16, 2026 | Version v3
Dataset Open

MixtureSolDB, dataset of solubility values for organic compounds in binary mixtures of solvents at various temperatures

  • 1. N.S. Kurnakov Institute of General and Inorganic Chemistry, Moscow, 119991, Russia
  • 2. Department of Chemistry, Lomonosov Moscow State University, 119991 Moscow, 1 Leninskiye Gory, Russia

Description

MixtureSolDB contains 175166 experimental solubility values for 810 organic compounds over a temperature range of 252 to 383 K. The data cover 3001 unique solute-binary solvent systems and 750 unique binary solvent mixtures, and were extracted from 1115 peer-reviewed articles.
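The summary counts above can in principle be recomputed from MixtureSolDB.csv using the column names documented in this record. The following is a minimal sketch, not the authors' procedure: it assumes that a binary solvent mixture is identified by the unordered pair of solvent SMILES and that a system is the solute SMILES plus that pair. The record does not state the exact definitions, so the results may differ from the published figures. The function names `summarize` and `summarize_file` are hypothetical.

```python
import csv


def summarize(rows):
    """Summarize an iterable of dict-like records keyed by column name."""
    solutes, mixtures, systems = set(), set(), set()
    t_min, t_max = float("inf"), float("-inf")
    n_values = 0
    for row in rows:
        n_values += 1
        solute = row["SMILES_Solute"]
        # Assumption: solvent order does not matter, so sort the pair.
        mixture = tuple(sorted((row["SMILES_Solvent1"], row["SMILES_Solvent2"])))
        solutes.add(solute)
        mixtures.add(mixture)
        systems.add((solute, mixture))
        t = float(row["Temperature_K"])
        t_min, t_max = min(t_min, t), max(t_max, t)
    return {
        "values": n_values,
        "solutes": len(solutes),
        "mixtures": len(mixtures),
        "systems": len(systems),
        "t_min": t_min,
        "t_max": t_max,
    }


def summarize_file(path="MixtureSolDB.csv"):
    # Assumption: the CSV is UTF-8 encoded and starts with a header row.
    with open(path, newline="", encoding="utf-8") as handle:
        return summarize(csv.DictReader(handle))
```

Calling `summarize_file()` in the directory containing the downloaded CSV returns a dictionary of counts and the temperature range that can be compared with the figures quoted above.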

If you use this dataset, please cite our paper: https://doi.org/10.1038/s41597-026-07047-z

If you need a dataset for mono-solvents, BigSolDB 2.0 is available here: https://doi.org/10.5281/zenodo.15094978

The 20 columns of this dataset are explained as follows:

  1. RecordID — stable unique identifier for each dataset row
  2. SMILES_Solute — SMILES representation of the solute molecule
  3. Temperature_K — temperature for the reported solubility value, K
  4. Solubility(mole_fraction) — the reported solubility value expressed as mole fraction of solute
  5. LogS(mole_fraction) — decimal logarithm of the solubility expressed as mole fraction of solute
  6. Solubility(g/100g) — the recalculated solubility value expressed as grams of solute per 100 g of solvent
  7. LogS(g/100g) — decimal logarithm of the solubility expressed as grams of solute per 100 g of solvent
  8. Solvent1 — name of the first solvent component in the solvent mixture
  9. Solvent2 — name of the second solvent component in the solvent mixture
  10. SMILES_Solvent1 — SMILES representation of the first solvent component
  11. SMILES_Solvent2 — SMILES representation of the second solvent component
  12. Fraction_Solvent1 — initial fraction of the first solvent component in the solvent mixture (before solute addition), expressed according to Fraction_Type
  13. Fraction_Solvent2 — initial fraction of the second solvent component in the solvent mixture (before solute addition), expressed according to Fraction_Type
  14. Fraction_Type — fraction type for Fraction_Solvent1 and Fraction_Solvent2 ('mole' for mole fraction, 'mass' for mass fraction)
  15. Compound_Name — solute name
  16. CAS — solute CAS number
  17. PubChem_CID — solute PubChem compound identifier (CID)
  18. FDA_Approved — indicates whether the solute is approved by the U.S. Food and Drug Administration (FDA)
  19. Source — DOI of the source article for the reported values
  20. IsPureSolventEndpoint — flag indicating whether the solvent mixture corresponds to a pure-solvent endpoint (Fraction_Solvent1 = 0 or 1)
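The column definitions above imply a few relationships that can be checked row by row. The sketch below is illustrative only and makes several assumptions that the record does not confirm: that the two solvent fractions sum to 1, that IsPureSolventEndpoint is stored as text such as "True"/"False" or "1"/"0", and that a tolerance of 1e-3 is adequate for rounded values. The function name `check_row` is hypothetical.

```python
import math


def check_row(row, tol=1e-3):
    """Return a list of problems found in one record (empty if none)."""
    problems = []

    # LogS(mole_fraction) is documented as the decimal log of the mole fraction.
    x = float(row["Solubility(mole_fraction)"])
    log_x = float(row["LogS(mole_fraction)"])
    if not 0.0 < x <= 1.0:
        problems.append("mole fraction outside (0, 1]")
    elif abs(math.log10(x) - log_x) > tol:
        problems.append("LogS(mole_fraction) does not match log10 of solubility")

    # Assumption: fractions of a binary mixture sum to 1 within tolerance.
    f1 = float(row["Fraction_Solvent1"])
    f2 = float(row["Fraction_Solvent2"])
    if abs(f1 + f2 - 1.0) > tol:
        problems.append("solvent fractions do not sum to 1")

    if row["Fraction_Type"] not in ("mole", "mass"):
        problems.append("unexpected Fraction_Type")

    # Documented rule: endpoint means Fraction_Solvent1 = 0 or 1.
    # Assumption: the flag is serialized as 'True'/'False' or '1'/'0'.
    flag = row["IsPureSolventEndpoint"].strip().lower() in ("true", "1")
    if flag != (f1 in (0.0, 1.0)):
        problems.append("IsPureSolventEndpoint inconsistent with Fraction_Solvent1")

    return problems
```

Applied to each row produced by `csv.DictReader`, the function returns an empty list for consistent records, which makes it easy to collect the RecordID of any row that needs a closer look.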

Online visualization and search across the dataset are available here: https://mixturesoldb.streamlit.app/

Files

MixtureSolDB.csv

Files (41.1 MB)

  • md5:597e31685fe36c78d568f3d37b6e1336 (41.1 MB)
  • md5:afc090e75d79fd87e8ddb3648568b187 (459 Bytes)
  • md5:057d5246d0b004f23e6f6d60b438aee1 (14.2 kB)
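The md5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. It assumes that the checksum of the 41.1 MB entry belongs to MixtureSolDB.csv, which this listing does not state explicitly; the function names `md5_of_file` and `verify_download` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:597e...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (assumes the 41.1 MB checksum corresponds to the CSV file):
# verify_download("MixtureSolDB.csv", "md5:597e31685fe36c78d568f3d37b6e1336")
```

Reading in chunks keeps memory use low for the 41.1 MB file; a result of `False` suggests a truncated or corrupted download, or that the checksum belongs to a different file in the record.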

Additional details

Funding

N.S. Kurnakov Institute of General and Inorganic Chemistry
Program for Fundamental Research of the N.S. Kurnakov Institute of General and Inorganic Chemistry of the Russian Academy of Sciences 1021071612866-5-1.4.7