Exploiting Pretrained Biochemical Language Models for Targeted Drug Design
- 1. Boğaziçi University
- 2. F. Hoffmann-La Roche AG
- 3. İstanbul University
Description
This repository contains materials for the paper "Exploiting Pretrained Biochemical Language Models for Targeted Drug Design", which has been accepted for publication in Bioinformatics, published by Oxford University Press.
data.zip contains vocabulary files for the pretrained models, additional information on proteins (PFAM family, protein similarity), and interactions filtered from BindingDB. The interactions are split into train, validation and test sets and used to train the target-specific molecule generation models.
models.zip includes files for the models trained in this study.
predictions.zip contains the compounds generated with the targeted models and the results of their evaluation on benchmarking metrics.
docking.zip contains three directories: targets/, with PDB files of the test proteins selected for docking evaluation; ligands/, with SDF files for molecules generated with the targeted models under two decoding strategies (beam search and sampling); and complex/, with docking outputs.
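As a quick way to check a downloaded archive against the description above, the following minimal sketch counts the files under each top-level directory of a zip file using only the Python standard library. The helper name count_top_level_entries is hypothetical and not part of the released code. The exact internal layout of the archives is an assumption: if an archive wraps its contents in a single root folder, that folder is what the function will report.

```python
import zipfile
from collections import Counter


def count_top_level_entries(archive_path):
    """Count the files under each top-level directory of a zip archive.

    Files stored at the archive root are counted under the key ".".
    """
    counts = Counter()
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            # Take the first path component, or "." for root-level files.
            top = name.split("/", 1)[0] if "/" in name else "."
            counts[top] += 1
    return dict(counts)


# Example call, assuming docking.zip has been downloaded to the working directory:
# print(count_top_level_entries("docking.zip"))
```

For docking.zip, the reported keys would be expected to include targets, ligands and complex if the archive is laid out as described, possibly nested under one enclosing folder.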
Files
data.zip
Additional details
Related works
- Is cited by
- Software: https://github.com/boun-tabi/biochemical-lms-for-drug-design (URL)
- Is supplement to
- Journal article: 10.1093/bioinformatics/btac482 (DOI)