Published February 7, 2023 | Version 1.0.0
Poster Open

ConforMine, a predictor of protein Conformational Variability from amino acid sequence

  • 1. Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels 1050, Belgium

Description

Proteins are dynamic and change conformation over time. We introduce a knowledge-based metric that describes conformational regions of the protein backbone at the residue level and quantifies how often an amino acid residue moves between these regions. This metric, Conformational Variability, complements other protein dynamics metrics such as the Random Coil Index. To calculate it, we ran Molecular Dynamics simulations of 100 proteins with diverse levels of disorder and assigned each amino acid to one of five secondary structure categories at every sampled time point of the simulation. Conformational Variability was then determined from how often this secondary structure category changed, calculated with the Frobenius matrix norm.
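As an illustration of how a per-residue score could be derived from category changes with a Frobenius norm, the sketch below counts transitions between consecutive simulation frames and takes the norm of the state-changing part of the resulting matrix. The abstract does not give the exact formula, so several things here are assumptions rather than the authors' definition: the category labels in `STATES`, the normalisation by the number of transitions, and the choice to drop the diagonal.

```python
import math
from collections import Counter

# Hypothetical labels: the abstract states there are five secondary
# structure categories but does not name them.
STATES = ("helix", "sheet", "turn", "coil", "other")


def transition_frequencies(trajectory):
    """Return a 5x5 matrix of joint frequencies of (state at t, state at t+1).

    `trajectory` is the sequence of category labels assigned to a single
    residue at each sampled time point of a simulation.
    """
    index = {state: i for i, state in enumerate(STATES)}
    matrix = [[0.0] * len(STATES) for _ in STATES]
    total = len(trajectory) - 1
    if total <= 0:
        return matrix
    counts = Counter(zip(trajectory, trajectory[1:]))
    for (before, after), n in counts.items():
        matrix[index[before]][index[after]] = n / total
    return matrix


def variability_score(trajectory):
    """Frobenius norm of the off-diagonal part of the transition matrix.

    Assumption: only entries where the category changes contribute, so a
    residue that never leaves its category scores exactly 0.
    """
    m = transition_frequencies(trajectory)
    size = len(STATES)
    return math.sqrt(
        sum(m[i][j] ** 2 for i in range(size) for j in range(size) if i != j)
    )
```

With this reading, a residue that stays in one category throughout the simulation scores 0, and the score grows as a larger share of consecutive frames differ in category.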

We then trained an estimator to predict Conformational Variability from amino acid sequence alone. The estimator is a Long Short-Term Memory neural network that predicts Conformational Variability in a multitask setting together with two synergistic tasks: secondary structure propensity and the ShiftCrypt index. We trained on 118 sequences for the main task and the secondary structure propensities, and on 4500 sequences for the ShiftCrypt index. A 5-fold cross-validation gave a Pearson’s correlation of approximately 0.7 for the main task, with a corresponding p-value near 0.
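The evaluation protocol can be sketched as a 5-fold cross-validation that reports one Pearson correlation per held-out fold. This is a generic illustration, not the authors' pipeline: `fit` is a hypothetical callable standing in for model training, the split is made per sequence (an assumption, so that residues of one protein never appear in both training and test data), and the per-fold pooling of residues is likewise assumed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k disjoint folds."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validated_pearson(sequences, fit, k=5, seed=0):
    """Return one Pearson correlation per held-out fold.

    `sequences` is a list of (features, targets) pairs, one per protein,
    where `targets` holds the per-residue values to predict.
    `fit` (hypothetical) takes the training pairs and returns a function
    mapping features to per-residue predictions.
    """
    scores = []
    for held_out in k_fold_indices(len(sequences), k, seed):
        held = set(held_out)
        train = [pair for i, pair in enumerate(sequences) if i not in held]
        predict = fit(train)
        observed, predicted = [], []
        for i in held_out:
            features, targets = sequences[i]
            observed.extend(targets)
            predicted.extend(predict(features))
        scores.append(pearson(observed, predicted))
    return scores
```

Splitting by sequence rather than by residue is the more conservative choice for per-residue predictors, because neighbouring residues of one protein are strongly correlated.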

We present this estimator under the name ConforMine, a predictor of protein Conformational Variability from amino acid sequence. It will be incorporated into our tool suite “b2bTools”, which is available on PyPI, Anaconda and Bioconda under the same name.

Notes

This work was presented at the 21st European Conference on Computational Biology (ECCB 2022) in Sitges, Spain.

Files

2022_eccb_poster_jgg.pdf (751.9 kB)
md5:fa14c3eea8ed357cf0d12ce6ab305a29

Additional details

Funding

European Commission
RNAct - Enabling proteins with RNA recognition motifs for synthetic biology and bio-analytics (grant 813239)