Published December 7, 2021 | Version v1
Journal article Open

HIV protease and integrase empirical substitution models of evolution: Protein-specific models outperform generalist models

  • 1. Del Amparo
  • 2. Arenas

Description

Diverse phylogenetic methods require a substitution model of evolution that should mimic, as accurately as possible, the real substitution process. At the protein level, empirical substitution models have traditionally been based on a large number of different proteins from particular taxonomic levels. However, these models assume that all the proteins of a taxonomic level evolve under the same substitution patterns. We believe that this assumption is highly unrealistic and should be relaxed by considering protein-specific substitution models that account for protein-specific selection processes. To test this hypothesis, we inferred and evaluated four new empirical substitution models for the protease and integrase of HIV and other viruses. We found that these models fit the evolutionary process of these proteins more accurately than any of the currently available empirical substitution models. We conclude that evolutionary inferences from protein sequences are more accurate if they are based on protein-specific substitution models rather than taxonomic-specific (generalist) substitution models. We also present four new empirical substitution models of protein evolution that can be useful for phylogenetic inferences of viral protease and integrase.

Files

Method data.zip

Files (1.2 MB)

Name Size
md5:3fd05474b7e413563924bc5a3f7f0530 1.1 MB
md5:35000c13f049908adb6e5feb86dd158a 655 Bytes
md5:d98725037e3f0c1b4514143748f574e2 7.1 kB
md5:daecc23eced05704104d5aebe0cd5418 125.8 kB