Published July 25, 2025 | Version v1
Dataset Open

Generated tautomeric forms for Drug Bank structure dataset

  • 1. University of Plovdiv "Paisii Hilendarski"
  • 2. Plovdiv University
  • 3. Ideaconsult Ltd

Description

All structures were pre-processed with ChemAxon Standardizer version 5.12.2: SMILES linear notations were extracted from the SDF files, aromatic structures were kekulized, explicit hydrogen atoms were converted to implicit ones, and stereo information was removed. All tautomeric forms of the test structures were generated with the Ambit-Tautomer software [https://doi.org/10.1002/minf.201200133], using the IA-DFS algorithm (incremental approach based on depth-first search) with tautomeric rules for 1,3 and 1,5 hydrogen shifts and with removal of topologically equivalent atoms and allene atoms. In total, 174 777 tautomeric forms were generated for the 5 550 structures from DrugBank.
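As a rough sense of scale, the two counts reported in the description can be turned into an average number of tautomeric forms per structure. The snippet below is only an illustrative calculation in Python; the constant and function names are hypothetical, and the record does not state whether the total of 174 777 includes the input structures themselves, so the figure should be read as approximate.

```python
# Counts taken from the description of this record.
N_STRUCTURES = 5550    # structures processed
N_TAUTOMERS = 174777   # tautomeric forms generated in total


def mean_forms_per_structure(n_forms: int, n_structures: int) -> float:
    """Return the average number of tautomeric forms per input structure."""
    return n_forms / n_structures


mean_forms = mean_forms_per_structure(N_TAUTOMERS, N_STRUCTURES)
print(f"{mean_forms:.1f} tautomeric forms per structure on average")
# prints "31.5 tautomeric forms per structure on average"
```

An average says nothing about the distribution: individual structures may have a single form or far more than the mean.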

Files

dataset_DrugBank.csv

Files (10.7 MB)

Checksum                                 Size
md5:6d8a9d9498f320ab65972a41f6f55c25     240.3 kB
md5:ffbdc6408ec71ee5c32796771fb375c3     10.5 MB
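A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal Python sketch using only the standard library; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing does not say which checksum belongs to which file, the check only tests membership in the listed set.

```python
import hashlib

# Checksums as listed in this record (file-to-checksum mapping is not stated).
LISTED_MD5 = {
    "6d8a9d9498f320ab65972a41f6f55c25",
    "ffbdc6408ec71ee5c32796771fb375c3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `matches_listed_checksum("dataset_DrugBank.csv")` would return `True` only if the local copy is byte-identical to one of the two published files.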