There is a newer version of the record available.

Published November 4, 2023 | Version v2
Dataset Open

Morphing libraries, QSAR models, and compounds predicted to be active on the Glucocorticoid receptor (GR)

  • 1. University of Chemistry and Technology

Description

This repository contains the datasets and files from a computational drug discovery project that explores the chemical space of the Glucocorticoid receptor (GR).

Morphing Libraries:

  • GRML_library.csv: The GRML library is the collection of virtual compounds generated by Molpher [1] (molpher-lib) starting from GR ligands with unique Bemis-Murcko scaffolds collected from the ChEMBL17 and IMG libraries.
  • RML_library.csv: The RML library is the collection of virtual compounds generated by Molpher starting from compounds with unique Bemis-Murcko scaffolds randomly selected from the ZINC database.

Molpher inputs:

  • GR_inputs.csv: The GR inputs are the ligands used as starting points to create the GRML library. The ligands from the IMG library are not included in this file.
  • Random_inputs.csv: The random inputs are the randomly selected ZINC compounds used as starting points to create the RML library.

Model training sets:

  • Model33_training_set.csv: Training set of the random forest classification model. It includes known GR actives and inactives from ChEMBL33 and does not contain the ligands from the IMG library.
  • Model17_training_set.csv: Training set of the random forest classification model. It includes known GR actives and inactives from ChEMBL17 and does not contain the ligands from the IMG library.
  • RFR_training_set.csv: Training set of the random forest regression model. It includes known GR actives and inactives from ChEMBL33 that fit the four-feature GR pharmacophore described in our paper.
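
The column layout of the CSV files is not documented in this description, so a reasonable first step is to look at the header and a few rows before building anything on top of them. The following is a minimal sketch that uses only the Python standard library. The helper name `peek_csv` is hypothetical, and the comma delimiter and UTF-8 encoding are assumptions that may need adjusting for the actual files.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5, delimiter=","):
    """Return (header, first n_rows data rows) of a delimited text file.

    Assumption: the first line is a header row. If the files turn out to
    have no header, the first record will appear in `header` instead.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])            # empty list for an empty file
        rows = list(islice(reader, n_rows))  # read only the first few rows
    return header, rows


# Example call (the path is wherever the file was downloaded to):
# header, rows = peek_csv("GR_inputs.csv")
# print(header)
```

Reading only the first rows keeps the check cheap for the larger library files, which are tens of megabytes in size.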

Models:

  • Model33.pkl: Python pickle file containing the trained random forest classification models used along with Mondrian cross-conformal prediction to classify compounds as GR actives or inactives. The models were trained with the ChEMBL33 and IMG libraries.
  • Model17.pkl: Python pickle file containing the trained random forest classification models used along with Mondrian cross-conformal prediction to classify compounds as GR actives or inactives. The models were trained with the ChEMBL17 and IMG libraries.
  • RFR_model.pkl: Python pickle file containing the trained random forest regression model used to predict GR pEC50. The model was trained with RFR_training_set.csv.
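
For readers unfamiliar with Mondrian cross-conformal prediction, the sketch below illustrates the general idea in plain Python. It is not the code used to build the models in this record. In the Mondrian (class-conditional) setting, a p-value for each candidate label is computed only against calibration examples of that same class, and in the cross-conformal setting the p-values obtained from several folds are aggregated. The function names, the use of the median for aggregation, and the significance level are assumptions made for illustration. A common choice of nonconformity score, also assumed here, is one minus the predicted probability of the label.

```python
from statistics import median


def mondrian_p_values(calib_scores, calib_labels, test_scores_by_label):
    """Class-conditional conformal p-values for one test compound.

    calib_scores: nonconformity scores of the calibration examples
    calib_labels: true labels of the calibration examples
    test_scores_by_label: {label: nonconformity score the test compound
                           would have if it carried that label}
    """
    p_values = {}
    for label, test_score in test_scores_by_label.items():
        # Mondrian step: compare only against calibration examples of this class.
        same_class = [s for s, y in zip(calib_scores, calib_labels) if y == label]
        n_at_least = sum(1 for s in same_class if s >= test_score)
        p_values[label] = (n_at_least + 1) / (len(same_class) + 1)
    return p_values


def aggregate_folds(per_fold_p_values):
    """Combine per-fold p-values label by label (median is an assumption)."""
    labels = per_fold_p_values[0].keys()
    return {label: median(p[label] for p in per_fold_p_values) for label in labels}


def prediction_set(p_values, significance=0.2):
    """Labels that cannot be rejected at the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}
```

With this formulation a compound can receive one label, both labels, or none, depending on the significance level chosen.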

Morphs predicted to be active:

  • all_morphs_actives_predicted.xlsx: An Excel workbook containing two sheets: 1) all GRML morphs predicted to be active, and 2) all RML morphs predicted to be active. The QED, NIBR Severity Score, and Molskill Score are given for each morph.

Researchers and professionals in the field of drug discovery and cheminformatics may find these resources useful for further analysis and investigations.

Bibliography

[1] Hoksza, D., Škoda, P., Voršilák, M. et al. Molpher: a software framework for systematic chemical space exploration. J Cheminform 6, 7 (2014). https://doi.org/10.1186/1758-2946-6-7

Files

Files (418.0 MB)

MD5 checksum                        Size
db5e196e1a792b4a31bbaf6bc173125c    1.5 MB
82a0f397c9a174dabfef066b4099a2c4    6.2 kB
7fe280d66c663355822fc89cd13292b3    59.1 MB
afaca101c56c44eac9b4d890f081b76b    92.2 MB
182962eed06043e417abfbf0a4a5cf39    39.8 kB
044291f49591a46673627a1aa8caf50b    188.1 MB
5b6e040d276794c57cb8bd0cbadc8aca    74.8 kB
d576cebc82e8d4d4a7a3127506832e70    11.5 kB
56890ff6748ed0048bcb2aadf25f4162    168.0 kB
690917d111628ded2487c07ae354665a    9.0 kB
d7d9a147ffef05b5de8b4e3fd29b28cd    76.9 MB
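
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_download` are hypothetical, and the file path in the commented example is wherever the file was saved locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the expected checksum.

    The expected value may carry the "md5:" prefix used in the listing.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call; substitute the checksum that belongs to the downloaded file:
# ok = verify_download("Model33.pkl", "md5:<checksum from the listing>")
```

Checking integrity first is a sensible precaution for the `.pkl` files in particular, because loading a pickle file can execute arbitrary code and should only be done with files from a trusted source.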