Deep Learning of Protein Structural Classes: Any Evidence for an 'Urfold'?
Authors/Creators
- 1. University of Virginia
Description
Recent advances in protein structure determination and prediction offer new opportunities to decipher relationships amongst proteins, a task that entails 3D structure comparison and classification. Historically, protein domain classification has been somewhat manual and heuristic. While CATH and related resources represent significant steps towards a more systematic (and automatable) approach, more scalable and objective classification methods, e.g., ones grounded in machine learning, could be informative. Indeed, comparative analyses of protein structures via Deep Learning (DL) could uncover distant relationships, though they may entail large-scale restructuring of classification schemes. We have developed new DL models for domain structures (including physicochemical properties), focusing initially on CATH's homologous superfamily (SF) level. Adopting DL approaches to image classification and segmentation, we have devised and applied a hybrid convolutional autoencoder architecture that allows SF-specific models to learn features that, in a sense, 'define' the various homologous SFs. We quantitatively evaluate pairwise 'distances' between SFs by building one model per SF and comparing the loss functions of the models. Clustering on these distance matrices provides a new view of protein interrelationships: a view that extends beyond simple structural/geometric similarity, towards the realm of structure/function properties, and that is consistent with a recently proposed 'Urfold' concept.
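The distance-and-clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the description does not specify how losses are compared, so the sketch assumes a hypothetical cross-loss matrix in which `loss[i][j]` is the mean reconstruction loss of the model trained on SF `i` when evaluated on domains from SF `j`, and it assumes a simple symmetrized "excess loss" as the distance. The toy values, the function names `loss_to_distance` and `average_linkage`, and the choice of average-linkage clustering are all illustrative assumptions.

```python
from itertools import combinations


def loss_to_distance(loss):
    """Turn a cross-model loss matrix into a symmetric distance matrix.

    ASSUMPTION: distance is the mean excess loss, i.e. how much worse each
    model reconstructs the other SF's domains than its own.
    """
    n = len(loss)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        excess_ij = loss[i][j] - loss[i][i]  # model i on SF j vs. on its own SF
        excess_ji = loss[j][i] - loss[j][j]  # model j on SF i vs. on its own SF
        d = max(0.0, 0.5 * (excess_ij + excess_ji))  # clip so distances stay >= 0
        dist[i][j] = dist[j][i] = d
    return dist


def average_linkage(dist, n_clusters):
    """Agglomerative clustering with average linkage (plain-Python sketch)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or mean_d < best[0]:
                best = (mean_d, a, b)
        _, a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy cross-loss matrix for four hypothetical superfamilies (made-up values).
loss = [
    [0.10, 0.15, 0.60, 0.70],
    [0.20, 0.12, 0.65, 0.55],
    [0.70, 0.60, 0.08, 0.14],
    [0.65, 0.75, 0.18, 0.10],
]

dist = loss_to_distance(loss)
clusters = average_linkage(dist, n_clusters=2)
print(clusters)
```

In this toy input, superfamilies 0 and 1 reconstruct each other's domains almost as well as their own, as do 2 and 3, so the two pairs end up in separate clusters. A real analysis would replace the toy matrix with losses measured from trained SF-specific models and would likely use a library implementation of hierarchical clustering.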
Files
| Name | Size |
|---|---|
| Draizen3DSig2020DeepUrfold.pdf (md5:91600b7ab8b38af9bf9d440e90c8ef4b) | 5.1 MB |