Published January 7, 2024 | Version v1
Dataset Open

Data from: Compromise docking power evaluation of liganded crystal structures of Mpro SARS-CoV-2

  • 1. Slovak University of Technology in Bratislava

Description

A set of 406 liganded SARS-CoV-2 Mpro crystal structures, originally downloaded from the RCSB PDB database, is provided. Ligand and protein files have been processed and corrected for various types of structural errors and are provided in pdbqt and mol2 formats for immediate use in the molecular docking programs AutoDock, AutoDock Vina, and PLANTS. The data are used to calculate the newly defined compromise docking power, which monitors the performance of these programs. The dataset can also be used to benchmark other software and molecular docking protocols on liganded SARS-CoV-2 Mpro systems.

Notes

Funding provided by: Science and Technology Assistance Agency*
Crossref Funder Registry ID:
Award Number: APVV-20-0213

Funding provided by: Slovak Grant Agency VEGA*
Crossref Funder Registry ID:
Award Number: 1/0718/19

Funding provided by: Slovak Grant Agency VEGA*
Crossref Funder Registry ID:
Award Number: 1/0139/20

Funding provided by: Slovak Grant Agency VEGA*
Crossref Funder Registry ID:
Award Number: 1/0175/23

Funding provided by: ERDF
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100008530
Award Number: ITMS code 26210120002

Funding provided by: European Regional Development Fund, EU Structural Funds Informatization of society*
Crossref Funder Registry ID:
Award Number: 311070AKF2

Methods

The provided dataset comprises 406 SARS-CoV-2 Mpro crystal structures (169 noncovalent and 247 covalent), supplied in pdbqt and mol2 formats and ready to use in the molecular docking programs AutoDock, AutoDock Vina, and PLANTS.

The initial dataset of 671 SARS-CoV-2 Mpro crystal structures was downloaded from the RCSB PDB database on 10 February 2022. A total of 161 unliganded structures and two structures whose ligands contain unparametrized atoms (Se, Zn) were discarded. The remaining 508 crystal structures were stripped of disordered atoms, crystal waters, ions, and cosolvents and aligned to a reference structure (PDB ID: 6wqf). Each crystal structure was then split into separate files, one per monomer, and the first chain containing a ligand-protein pair was selected for further processing. The integrity of each ligand structure was validated by generating its InChIKey with OpenBabel 2.3.2 and comparing it with the InChIKey of its RCSB entry. Discrepancies were recorded and resolved manually: the ligand pdb structure from the respective monomer unit was converted to mol2 format with OpenBabel, and individual atoms and bond orders were edited to produce the desired ligand structure. Ligands with a missing fragment, i.e., missing more than hydrogen atoms (102 in total), were omitted, leaving 406 structures. OpenBabel was then used to write both ligand and protein structures in the required file formats: pdbqt with only polar hydrogens present for AutoDock and AutoDock Vina, and mol2 with added Gasteiger charges for PLANTS.
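The InChIKey validation step can be sketched as follows. This is a minimal illustration, not the authors' script: the exact Open Babel invocation used for the dataset is not stated, so the `obabel ... -oinchi -xK` command line (InChI writer, InChIKey-only output) is an assumption, and the helper names `inchikey_command`, `parse_inchikey`, `compute_inchikey`, and `find_discrepancies` are hypothetical.

```python
import subprocess
from typing import Dict, List


def inchikey_command(structure_path: str) -> List[str]:
    """Build an Open Babel command line that prints the InChIKey of a structure.

    Assumption: the `obabel` executable is on PATH; "-oinchi -xK" asks the
    InChI writer to output the InChIKey only.
    """
    return ["obabel", structure_path, "-oinchi", "-xK"]


def parse_inchikey(stdout: str) -> str:
    """Return the first whitespace-delimited token of the first non-empty line."""
    for line in stdout.splitlines():
        tokens = line.split()
        if tokens:
            return tokens[0]
    raise ValueError("no InChIKey found in Open Babel output")


def compute_inchikey(structure_path: str) -> str:
    """Run Open Babel on one ligand file and return its InChIKey."""
    result = subprocess.run(
        inchikey_command(structure_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_inchikey(result.stdout)


def find_discrepancies(computed: Dict[str, str], reference: Dict[str, str]) -> List[str]:
    """List entry IDs whose computed InChIKey differs from the reference key.

    Entries missing from the reference are also reported, since they cannot
    be confirmed.
    """
    return sorted(
        entry_id
        for entry_id, key in computed.items()
        if reference.get(entry_id) != key
    )
```

Entries returned by `find_discrepancies` would then be the candidates for the manual correction described above; the reference keys are assumed to have been collected separately from the corresponding RCSB entries.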

Files

Compounds_structures.zip

Files (59.1 MB)

Name Size
md5:82d330a45646990d85c04e29fbae17de 59.1 MB
md5:12710522c8d8a33017f9f146909562be 1.3 kB
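A downloaded file can be checked against the md5 checksums listed above. The sketch below uses only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and because the listing does not show which checksum belongs to which file name, the pairing used in the usage note is an assumption.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's md5 with an expected value, with or without an 'md5:' prefix."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("Compounds_structures.zip", "md5:82d330a45646990d85c04e29fbae17de")` would return `True` for an intact download, assuming that checksum is the one the record shows for the archive.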