Morphing libraries, QSAR models, and compounds predicted to be active on the Glucocorticoid receptor (GR)
Description
This repository contains the datasets, models, and result files from a computational drug discovery project that explores the chemical space of the Glucocorticoid receptor (GR).
Morphing Libraries:
- GRML_library.csv: The GRML library is the collection of virtual compounds generated by Molpher [1] (molpher-lib) starting from GR ligands with unique Bemis-Murcko scaffolds collected from the ChEMBL17 and IMG libraries.
- RML_library.csv: The RML library is the collection of virtual compounds generated by Molpher starting from compounds with unique Bemis-Murcko scaffolds randomly selected from the ZINC database.
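Some files in this record are large (the biggest is 188.1 MB), so it can help to inspect a CSV before loading it in full. The following is a minimal sketch using only the Python standard library. It assumes the files are comma-delimited, UTF-8 encoded, and start with a header row; none of this is stated in the record, and the helper names `peek_csv` and `count_rows` are hypothetical.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without reading the whole file.

    Assumes a comma-delimited, UTF-8 file whose first row is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))
    return header, rows


def count_rows(path):
    """Count data rows (header excluded) by streaming the file line by line."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)                 # skip the header row
        return sum(1 for _ in reader)
```

For example, `peek_csv("GRML_library.csv")` would return the column names and the first five rows, which is enough to find out which column holds the structures before choosing a heavier tool.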
Molpher inputs:
- GR_inputs.csv: The GR inputs are the ligands used to create the GRML library. However, the ligands from the IMG library are not present.
- Random_inputs.csv: The random inputs are the random ZINC compounds used to create the RML library.
Model training sets:
- Model33_training_set.csv: Training set for the random forest classification model. It includes known GR actives and inactives from ChEMBL33 and does not contain the ligands from the IMG library.
- Model17_training_set.csv: Training set for the random forest classification model. It includes known GR actives and inactives from ChEMBL17 and does not contain the ligands from the IMG library.
- RFR_training_set.csv: Training set for the random forest regression model. It includes known GR actives and inactives from ChEMBL33 that fit the four-feature GR pharmacophore described in our paper.
Models:
- Model33.pkl: Python pickle file containing the trained random forest classification models that are used together with Mondrian cross-conformal prediction to classify compounds as GR actives or inactives. This model was trained on the ChEMBL33 and IMG libraries.
- Model17.pkl: Python pickle file containing the trained random forest classification models that are used together with Mondrian cross-conformal prediction to classify compounds as GR actives or inactives. This model was trained on the ChEMBL17 and IMG libraries.
- RFR_model.pkl: Python pickle file containing the trained random forest regression model used to predict GR pEC50. This model was trained on RFR_training_set.csv.
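To illustrate how Mondrian (class-conditional) conformal prediction turns classifier outputs into active/inactive calls, here is a minimal pure-Python sketch. It is not the code used to build the models above. The nonconformity measure (one minus the predicted class probability), the averaging of p-values across folds, and the function names are all assumptions made for illustration; the record does not specify them.

```python
from statistics import mean


def mondrian_p_value(cal_scores, test_score):
    """p-value of one candidate label, using only calibration examples
    whose true class is that same label (the Mondrian condition)."""
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def predict_set(cal_probs_by_class, test_probs, significance=0.2):
    """Return (p_values, predicted_labels) for one test compound.

    cal_probs_by_class: {label: [P(label) for calibration examples of that class]}
    test_probs:         {label: P(label) for the test compound}
    Nonconformity is assumed to be 1 - P(label).
    """
    p_values = {}
    for label, cal_probs in cal_probs_by_class.items():
        cal_scores = [1.0 - p for p in cal_probs]
        test_score = 1.0 - test_probs[label]
        p_values[label] = mondrian_p_value(cal_scores, test_score)
    # A label is kept when its p-value exceeds the significance level.
    labels = {label for label, p in p_values.items() if p > significance}
    return p_values, labels


def aggregate_folds(p_values_per_fold):
    """Cross-conformal step (assumed): average each label's p-value over folds."""
    labels = p_values_per_fold[0].keys()
    return {label: mean(fold[label] for fold in p_values_per_fold)
            for label in labels}
```

Because each label is calibrated against its own class, the prediction set for a compound can contain one label, both labels, or none, depending on the chosen significance level.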
Morphs predicted to be active:
- all_morphs_actives_predicted.xlsx: An Excel spreadsheet with two sheets: 1) all GRML morphs predicted to be active, and 2) all RML morphs predicted to be active. The QED, NIBR Severity Score, and Molskill Score are given for each morph.
Researchers and professionals in the field of drug discovery and cheminformatics may find these resources useful for further analysis and investigations.
Bibliography
[1] Hoksza, D., Škoda, P., Voršilák, M. et al. Molpher: a software framework for systematic chemical space exploration. J Cheminform 6, 7 (2014). https://doi.org/10.1186/1758-2946-6-7
Files (418.0 MB)
MD5 checksum | Size |
---|---|
db5e196e1a792b4a31bbaf6bc173125c | 1.5 MB |
82a0f397c9a174dabfef066b4099a2c4 | 6.2 kB |
7fe280d66c663355822fc89cd13292b3 | 59.1 MB |
afaca101c56c44eac9b4d890f081b76b | 92.2 MB |
182962eed06043e417abfbf0a4a5cf39 | 39.8 kB |
044291f49591a46673627a1aa8caf50b | 188.1 MB |
5b6e040d276794c57cb8bd0cbadc8aca | 74.8 kB |
d576cebc82e8d4d4a7a3127506832e70 | 11.5 kB |
56890ff6748ed0048bcb2aadf25f4162 | 168.0 kB |
690917d111628ded2487c07ae354665a | 9.0 kB |
d7d9a147ffef05b5de8b4e3fd29b28cd | 76.9 MB |
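The MD5 checksums in the file listing can be used to confirm that a download is complete and unmodified. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that the larger files do not need to fit in memory. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the expected checksum for a given file has to be taken from the record's file listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file against an expected checksum.

    Accepts the value with or without the 'md5:' prefix used in the listing.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Checking the checksum before opening the `.pkl` files is a sensible precaution, since unpickling executes code stored in the file and should only be done with files from a trusted, verified source.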