Morphing libraries, QSAR models, and compounds predicted to be active on the Glucocorticoid receptor (GR)
Description
This repository contains datasets and files from a computational drug discovery project exploring the chemical space of the glucocorticoid receptor (GR). The accompanying Python code is freely available on GitHub (https://github.com/Iagea/GRML_analyses).
Morphing Libraries:
- GRML_library.csv: The GRML library is the collection of 999,015 virtual compounds generated by Molpher [1-2] starting from GR ligands with unique Bemis-Murcko scaffolds collected from the ChEMBL17 and IMG libraries.
- RML_library.csv: The RML library is the collection of 1,346,310 virtual compounds generated by Molpher starting from compounds with unique Bemis-Murcko scaffolds randomly selected from the ZINC database.
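Since each library file holds on the order of a million compounds, it can be convenient to stream a file instead of loading it whole. The following is a minimal sketch using only the Python standard library. It assumes the files are comma-separated with a single header row, which this record does not state; `count_data_rows` is a hypothetical helper name and the demo columns are made up.

```python
import csv
import io


def count_data_rows(handle, has_header=True):
    """Count CSV records in an open text handle without loading the whole file.

    Assumption: the file has one header row (not confirmed by the record).
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if present
    return sum(1 for row in reader if row)  # blank lines are ignored


# Self-contained demo on an in-memory stand-in; the column names are invented.
demo = io.StringIO("smiles,id\nCCO,1\nc1ccccc1,2\n")
print(count_data_rows(demo))
```

For a real file, pass a handle such as `open("GRML_library.csv", newline="")` and compare the result with the compound count stated in the list above.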
IMG library:
- IMG_non_proprietary.csv: The non-proprietary IMG library subset containing 12,956 compounds and their corresponding B-scores from the primary screen.
Molpher inputs:
- GR_inputs.csv: The GR inputs are the 204 ligands used to create the GRML library: 95 compounds from ChEMBL17 and 109 compounds from the non-proprietary IMG dataset.
- Random_inputs.csv: The random inputs are the 249 randomly selected ZINC compounds used to create the RML library.
Model training sets:
- Model33_training_set.csv: Training set for the random forest classification model. It contains 865 compounds: known GR actives and inactives from ChEMBL33 (738 compounds) and non-proprietary active ligands from the IMG library (127 compounds).
- Model17_training_set.csv: Training set for the random forest classification model. It contains 601 compounds: known GR actives and inactives from ChEMBL17 (474 compounds) and non-proprietary active ligands from the IMG library (127 compounds).
- RFR_training_set.csv: Training set for the random forest regression model. It contains 89 compounds: known GR actives and inactives from ChEMBL33 that fit the four-feature GR pharmacophore described in our paper.
Models:
- Model33.pkl: Python pickle file containing the trained random forest classification models used together with Mondrian cross-conformal prediction to classify GR actives/inactives. These models were trained on the ChEMBL33 and IMG libraries.
- Model17.pkl: Python pickle file containing the trained random forest classification models used together with Mondrian cross-conformal prediction to classify GR actives/inactives. These models were trained on the ChEMBL17 and IMG libraries.
- RFR_model.pkl: Python pickle file containing the trained random forest regression model used to predict GR pEC50. This model was trained on RFR_training_set.csv.
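The classification models are described as being used with Mondrian cross-conformal prediction. As a rough illustration of the Mondrian (class-conditional) idea only, and not the authors' implementation, the sketch below computes a conformal p-value for one candidate label using only calibration examples of that same class. The function names, the nonconformity scores (for instance, one minus the predicted class probability), and the toy numbers are all assumptions. In a cross-conformal setup, p-values from several folds would typically be aggregated afterwards.

```python
def mondrian_p_value(cal_scores, cal_labels, test_score, label):
    """Class-conditional conformal p-value for assigning `label` to a test compound.

    Only calibration examples whose true class equals `label` are used,
    which is what makes the procedure "Mondrian".
    """
    same_class = [s for s, y in zip(cal_scores, cal_labels) if y == label]
    at_least_as_strange = sum(1 for s in same_class if s >= test_score)
    return (at_least_as_strange + 1) / (len(same_class) + 1)


def prediction_set(p_values, significance):
    """Keep every label whose p-value exceeds the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Toy calibration data (made up): nonconformity scores with their true labels.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
cal_labels = ["active"] * 4 + ["inactive"] * 4

# Nonconformity of one test compound under each candidate label (made up).
test_scores = {"active": 0.25, "inactive": 0.95}
p_values = {
    label: mondrian_p_value(cal_scores, cal_labels, score, label)
    for label, score in test_scores.items()
}
print(p_values, prediction_set(p_values, significance=0.2))
```

Calibrating per class keeps the error rate controlled separately for actives and inactives, which matters when the two classes are imbalanced.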
Active predicted morphs:
- all_morphs_actives_predicted.xlsx: An Excel spreadsheet containing two sheets. 1) All 22,524 GRML morphs predicted to be active. 2) All 4,341 RML morphs predicted to be active. The QED, NIBR severity score, and MolSkill score are given for each morph.
Proposed GR active ligands:
- designed_ligands.xlsx: An Excel spreadsheet containing two sheets. 1) All 54 designed GR ligands with their QED, NIBR severity score, MolSkill score, predicted activity (pEC50 value), and the result of the manual annotation with remarks, if available. 2) The structure of the 54 ligands based on their manual annotation and on whether they are present in the ChEMBL33 database.
Researchers and professionals in the field of drug discovery and cheminformatics may find these resources useful for further analysis and investigations.
Bibliography
[1] Hoksza, D., Škoda, P., Voršilák, M. et al. Molpher: a software framework for systematic chemical space exploration. J Cheminform 6, 7 (2014). https://doi.org/10.1186/1758-2946-6-7
Files (419.2 MB)
MD5 checksum | Size |
---|---|
md5:db5e196e1a792b4a31bbaf6bc173125c | 1.5 MB |
md5:91822a0ac10263631789fcc3811670a5 | 228.0 kB |
md5:452d6878510a123706f0cb6c30dd0aa8 | 12.1 kB |
md5:7fe280d66c663355822fc89cd13292b3 | 59.1 MB |
md5:5f3fad7d86c072405d50d30aa9f4ad06 | 928.6 kB |
md5:afaca101c56c44eac9b4d890f081b76b | 92.2 MB |
md5:57f1d7fabab3ec26e6d3c8d5217b2fff | 46.1 kB |
md5:044291f49591a46673627a1aa8caf50b | 188.1 MB |
md5:8d4a748810adecf0e6ba03b65565bf65 | 81.1 kB |
md5:d576cebc82e8d4d4a7a3127506832e70 | 11.5 kB |
md5:56890ff6748ed0048bcb2aadf25f4162 | 168.0 kB |
md5:690917d111628ded2487c07ae354665a | 9.0 kB |
md5:d7d9a147ffef05b5de8b4e3fd29b28cd | 76.9 MB |
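The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The sketch below relies only on the Python standard library; `md5_of_file` and `matches_listing` are hypothetical helper names. The listing here does not show which checksum belongs to which file, so that pairing has to be taken from the record page itself.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:db5e...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Self-contained demo on a temporary file rather than a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
print(matches_listing(tmp.name, "md5:5d41402abc4b2a76b9719d911017c592"))
os.remove(tmp.name)
```

Reading in chunks keeps memory use flat, which is relevant for the larger files in this record.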