There is a newer version of the record available.

Published July 24, 2023 | Version 2
Dataset Open

MBC and ECBL Libraries: outstanding tools for drug discovery

  • 1. CIB-CSIC

Description

UPDATE. New in this revision: Python scripts to process the DBs and calculate the percentage of molecules that pass the Veber and Ghose filters.

- Veber_filter.py and Ghose filter.py (data presented in Table 1 of the MS).
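As an illustration of what such a filter calculation involves, the sketch below applies the commonly cited Veber and Ghose criteria to a list of molecules described by precomputed properties. It is not the deposited code: the dictionary keys (`rotatable_bonds`, `psa`, `mol_weight`, `logp`, `molar_refractivity`, `n_atoms`) and the two toy molecules are hypothetical, and the column names and exact thresholds used by the authors' scripts may differ.

```python
from typing import Callable, Iterable, Mapping

Molecule = Mapping[str, float]  # hypothetical: one dict of descriptors per molecule


def passes_veber(mol: Molecule) -> bool:
    """Veber criteria: at most 10 rotatable bonds and polar surface area <= 140 A^2."""
    return mol["rotatable_bonds"] <= 10 and mol["psa"] <= 140.0


def passes_ghose(mol: Molecule) -> bool:
    """Ghose criteria: MW 160-480, logP -0.4 to 5.6, molar refractivity 40-130, 20-70 atoms."""
    return (
        160.0 <= mol["mol_weight"] <= 480.0
        and -0.4 <= mol["logp"] <= 5.6
        and 40.0 <= mol["molar_refractivity"] <= 130.0
        and 20 <= mol["n_atoms"] <= 70
    )


def percent_passing(mols: Iterable[Molecule], rule: Callable[[Molecule], bool]) -> float:
    """Percentage of molecules for which `rule` returns True (0.0 for an empty input)."""
    mols = list(mols)
    if not mols:
        return 0.0
    return 100.0 * sum(1 for m in mols if rule(m)) / len(mols)


# Toy input with invented values, only to show the call pattern.
toy_library = [
    {"rotatable_bonds": 4, "psa": 75.0, "mol_weight": 310.0,
     "logp": 2.1, "molar_refractivity": 85.0, "n_atoms": 40},
    {"rotatable_bonds": 14, "psa": 160.0, "mol_weight": 620.0,
     "logp": 6.3, "molar_refractivity": 160.0, "n_atoms": 90},
]

if __name__ == "__main__":
    print("Veber pass rate (%):", percent_passing(toy_library, passes_veber))
    print("Ghose pass rate (%):", percent_passing(toy_library, passes_ghose))
```

Keeping each rule as a separate predicate makes it easy to report the two pass rates independently for each DB.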

 

Data and scripts to reproduce all the graphics reported in the manuscript entitled "MBC and ECBL Libraries: outstanding tools for drug discovery".

List of analyzed DBs:

  1. MBC2016 (Total entries: 1,096 compounds; 7.39% excluded from properties analysis - QikProp failure).
  2. MBC2022 (Total entries: 2,577 compounds; 3.14% excluded from properties analysis - QikProp failure).
  3. ECBL (Total entries: 101,021 compounds; 0.20% excluded from properties analysis - QikProp failure).
  4. ChEMBL v.31 (Total entries: 1,908,325 compounds; 2.97% excluded from properties analysis - QikProp failure).
  5. DrugBank v.5.0 (Total entries: 10,981 compounds; 4.13% excluded from properties analysis - QikProp failure).
  6. ZINC20 (Total entries: 10,723,360 compounds; 0.61% excluded from properties analysis - QikProp failure).
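The totals and exclusion percentages above can be turned into approximate compound counts. The sketch below does that arithmetic; because the listed percentages are rounded to two decimals, the resulting counts are estimates rather than exact figures from the record, and the function name `estimate_counts` is introduced here for illustration only.

```python
# (total entries, % excluded from the properties analysis), as listed above.
LIBRARIES = {
    "MBC2016": (1_096, 7.39),
    "MBC2022": (2_577, 3.14),
    "ECBL": (101_021, 0.20),
    "ChEMBL v.31": (1_908_325, 2.97),
    "DrugBank v.5.0": (10_981, 4.13),
    "ZINC20": (10_723_360, 0.61),
}


def estimate_counts(total: int, pct_excluded: float) -> tuple:
    """Return (estimated excluded, estimated retained) compound counts."""
    excluded = round(total * pct_excluded / 100)
    return excluded, total - excluded


if __name__ == "__main__":
    for name, (total, pct) in LIBRARIES.items():
        excluded, retained = estimate_counts(total, pct)
        print(f"{name}: ~{excluded} excluded, ~{retained} retained")
```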

Files:

QikProp_properties.docx: Word document containing the full list of QikProp properties calculated for each analyzed DB.

DATA_comparison.xlsx: Excel file containing the data used to reproduce the plots in Figure 4 of the MS.

  • Murcko_scaffold_percentages: distribution (%) of the 50 most populated Murcko scaffolds for MBC2016, MBC2022 and ECBL.
  • Murcko_scaffolds_comparison: distribution (count) of the first 94 common Murcko scaffolds for MBC2016, MBC2022 and ECBL.
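To make the two sheets concrete, the sketch below shows how scaffold percentages and shared scaffolds could be tabulated once a Murcko scaffold SMILES has been derived for every compound. It assumes the scaffolds are already available as plain strings (the scaffold extraction itself, which needs a cheminformatics toolkit, is not shown), and the toy lists and function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def scaffold_percentages(scaffolds: Iterable[str], top_n: int = 50) -> List[Tuple[str, float]]:
    """Return the `top_n` most populated scaffolds with their share (%) of the library."""
    counts = Counter(scaffolds)
    total = sum(counts.values())
    return [(smi, 100.0 * n / total) for smi, n in counts.most_common(top_n)]


def common_scaffolds(*libraries: Iterable[str]) -> Set[str]:
    """Return the scaffolds that occur in every library passed in."""
    sets = [set(lib) for lib in libraries]
    return set.intersection(*sets) if sets else set()


# Toy scaffold lists (one entry per compound), invented for illustration.
lib_a = ["c1ccccc1", "c1ccccc1", "c1ccncc1", "C1CCCCC1"]
lib_b = ["c1ccccc1", "c1ccoc1", "c1ccoc1"]

if __name__ == "__main__":
    print(scaffold_percentages(lib_a, top_n=2))
    print(common_scaffolds(lib_a, lib_b))
```

Counting per library first and intersecting afterwards keeps the percentage view and the comparison view independent of each other.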

QikProp properties for all the analyzed DBs (6 files; CSV format).

SMILES codes for all the analyzed DBs (6 files; SMI format). 

joinplots.py: Python script to generate the 2D plots in Figure 2 of the MS.

fingerprint_similarity.py: Python script to compute and plot the Tanimoto similarities shown in Figure 3 of the MS.
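For readers unfamiliar with the metric, the Tanimoto similarity between two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below is a minimal pure-Python version that represents each fingerprint as a set of on-bit indices; this representation, the function names, and the toy fingerprints are assumptions, and the deposited script may use a cheminformatics toolkit and a different fingerprint type.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def nearest_neighbour_similarity(query: Set[int], library: List[Set[int]]) -> float:
    """Highest Tanimoto similarity between `query` and any fingerprint in `library`."""
    return max((tanimoto(query, fp) for fp in library), default=0.0)


# Toy fingerprints with invented bit positions.
fp1 = {1, 2, 3}
fp2 = {2, 3, 4}
fp3 = {10, 11}

if __name__ == "__main__":
    print(tanimoto(fp1, fp2))
    print(nearest_neighbour_similarity(fp1, [fp2, fp3]))
```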

calc_kde.py: Python script to run the kernel density analysis reported in Figure 5 of the MS.
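As background, a kernel density estimate smooths a set of observed values into a continuous density by placing a kernel on each observation and averaging. The sketch below implements a one-dimensional Gaussian kernel density estimate with a fixed, user-supplied bandwidth using only the standard library. The fixed bandwidth, the function name, and the sample values are assumptions; the deposited script may rely on a numerical library with automatic bandwidth selection.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        # Sum one Gaussian bump per sample, then normalise so the curve integrates to 1.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


if __name__ == "__main__":
    # Invented property values, only to show the call pattern.
    kde = gaussian_kde([250.0, 310.0, 330.0, 480.0], bandwidth=40.0)
    for x in (200.0, 300.0, 400.0, 500.0):
        print(x, kde(x))
```

A smaller bandwidth follows the data more closely, while a larger one gives a smoother curve.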

Notes

Revised version of the original publication.

Files (4.6 GB)

MD5 checksum                           Size
md5:1795bb16a4363baaf3386a38dab63e4e   271 Bytes
md5:f75e079cee0a25c24b544598317e8db9   165.5 MB
md5:72be51283fdf942a9bcf816f98d48a08   598.1 MB
md5:11c8932895a0a63b75dd120112aa70d6   43.8 kB
md5:2c484664d289399b39cd176e6acf7cec   701.1 kB
md5:2c599b63c39e5778bcbf9140868aa891   3.2 MB
md5:0d0bde6e17ce0fcd5140146be62a0d75   5.7 MB
md5:446c14dec5d1b670a67aa88c80418400   29.2 MB
md5:a177a1b263ca8c463cefcdf95689338f   3.2 kB
md5:b289acb7dc3202dba05d0a36830ba263   863 Bytes
md5:1870823a09ab993130a5f56e8bbe8005   283 Bytes
md5:07891db838c343dbd6219b6470d01757   53.8 kB
md5:dab2b5b660097f2c0d76099397b1219f   353.2 kB
md5:efca0f4372e0e500f67544be5c1559a0   134.5 kB
md5:bdce23f4f5b0fb0a8312669641ed401c   837.2 kB
md5:6d658acf6f7daa6b123f41f025dd181b   17.3 kB
md5:a5033a568d6f7dd282496bab6f7044bb   16.1 kB
md5:d48867e061c1a78259580fc7bac8c7a8   827 Bytes
md5:46d74fd535a734af9c74514c9776e7eb   641.0 MB
md5:9a8e4f151ff6dc0928bb7444f0f189df   3.1 GB
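
The MD5 checksums listed above can be used to confirm that a downloaded file is complete. The sketch below hashes a file in chunks, which matters for the multi-gigabyte files, and compares the result with a listed value. The function names and the example path are hypothetical, and which checksum belongs to which file has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks by default."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:1795bb16...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python verify_md5.py downloaded_file md5:<hex digest>
    file_path, listed_value = sys.argv[1], sys.argv[2]
    print("OK" if matches_checksum(file_path, listed_value) else "MISMATCH")
```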