ConforMine, a predictor of protein Conformational Variability from amino acid sequence
- 1. Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels 1050, Belgium
Description
Proteins are dynamic and change conformation over time. We introduce a knowledge-based metric that describes the conformational regions of the protein backbone at the residue level and quantifies how often an amino acid residue moves between these regions. This metric, Conformational Variability, complements other protein dynamics metrics such as the Random Coil Index. To calculate it, we ran Molecular Dynamics simulations of 100 proteins with diverse levels of disorder and assigned each amino acid to one of five secondary structure categories at every sampled time point of the simulation. Conformational Variability was then derived from how often this secondary structure category changed, computed with the Frobenius matrix norm.
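The abstract does not spell out how the Frobenius norm is applied, so the following is only a minimal sketch of one plausible construction, not the authors' procedure. It assumes that each residue has a list of integer category labels (0 to 4), one per sampled frame, that transitions between consecutive frames are counted in a 5 x 5 matrix and divided by the number of transitions, and that the score is the Frobenius norm of the off-diagonal entries. The function names `transition_fractions` and `variability_score` are hypothetical.

```python
import math

N_CATEGORIES = 5  # five secondary structure categories, as stated in the text


def transition_fractions(labels, n_categories=N_CATEGORIES):
    """Return an n x n matrix with the fraction of frame-to-frame transitions i -> j."""
    if len(labels) < 2:
        raise ValueError("need at least two frames to observe a transition")
    counts = [[0] * n_categories for _ in range(n_categories)]
    for prev, curr in zip(labels, labels[1:]):
        counts[prev][curr] += 1
    total = len(labels) - 1  # number of consecutive frame pairs
    return [[c / total for c in row] for row in counts]


def variability_score(labels, n_categories=N_CATEGORIES):
    """ASSUMPTION: Frobenius norm of the off-diagonal (category-changing) fractions."""
    fractions = transition_fractions(labels, n_categories)
    squared = sum(
        fractions[i][j] ** 2
        for i in range(n_categories)
        for j in range(n_categories)
        if i != j  # diagonal entries mean "stayed in the same category"
    )
    return math.sqrt(squared)


# Toy trajectories for a single residue (hypothetical data).
rigid = [0, 0, 0, 0, 0, 0]   # never leaves category 0
flipping = [0, 1, 0, 1, 0]   # alternates between categories 0 and 1

print(variability_score(rigid))
print(variability_score(flipping))
```

Under these assumptions a residue that never changes category scores zero, and the score grows as more of its transitions cross between categories.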
We then trained an estimator to predict Conformational Variability from amino acid sequence alone. The estimator is a Long Short-Term Memory (LSTM) neural network trained in a multitask setting, predicting Conformational Variability together with two related tasks: secondary structure propensity and the ShiftCrypt index. We trained on 118 sequences for the main task and the secondary structure propensities, and on 4500 sequences for the ShiftCrypt index. A 5-fold cross-validation gave a Pearson’s correlation of approximately 0.7 for the main task, with a corresponding p-value near 0.
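The evaluation described above can be illustrated with a small sketch of a 5-fold split over sequences and a Pearson correlation between predicted and reference values. This is not the authors' evaluation code: the helper names `k_fold_indices` and `pearson_r` are hypothetical, the interleaved fold assignment is an assumption, and the model itself is left out.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def k_fold_indices(n_items, k=5):
    """Yield (train, test) index lists; fold i holds out every k-th item starting at i."""
    indices = list(range(n_items))
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# 118 sequences for the main task, as stated in the text.
folds = list(k_fold_indices(118, k=5))
for train, test in folds:
    print(len(train), len(test))
```

Splitting at the level of whole sequences, as sketched here, keeps residues from one protein out of both the training and the test fold at the same time; whether the original study split this way is not stated.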
We present this estimator under the name ConforMine, a predictor of protein Conformational Variability from amino acid sequence. It will be incorporated into our tool suite “b2bTools”, available on PyPI, Anaconda and Bioconda under the same name.
Files
Name | Size |
---|---|
2022_eccb_poster_jgg.pdf (md5:fa14c3eea8ed357cf0d12ce6ab305a29) | 751.9 kB |