The emergence of high-fitness variants accelerates the slowdown of genome heterogeneity in the coronavirus
Creators
- 1. Department of Genetics, Faculty of Sciences, University of Granada, 18071, Granada, Spain
- 2. Department of Applied Physics II and Institute Carlos I for Theoretical and Computational Physics, University of Málaga, 29071, Málaga, Spain
- 3. Dipartimento di Scienze della Terra, dell'Ambiente e delle Risorse, Università di Napoli Federico II, 80126, Napoli, Italy
- 4. Centro de Investigaciones sobre Desertificación, Consejo Superior de Investigaciones Científicas (CSIC), University of València and Generalitat Valenciana, 46113, Valencia, Spain
- 5. Institute of Integrative Systems Biology (I2Sysbio), University of València and Consejo Superior de Investigaciones Científicas (CSIC), 46980, Valencia, Spain
Description
Supplement to the paper
“The emergence of high-fitness variants accelerates the slowdown of genome heterogeneity in the coronavirus”
Since the outbreak of the COVID-19 pandemic, the SARS-CoV-2 coronavirus has accumulated a substantial amount of genome variability through mutation and recombination. To test for evolutionary trends that could inform us about the adaptive process of the virus to its human host, we compute a genome-wide measure of Sequence Compositional Complexity (SCC) in high-quality coronavirus genomes from across the globe, covering the full span of the pandemic. Using phylogenetic ridge regression, a method able to reveal both macro- and microevolutionary trends, we present evidence for a long-term tendency of decreasing genome sequence heterogeneity in SARS-CoV-2. In early samples, we find no statistical support for any trend in SCC values over time, although the virus genome appears to evolve faster than expected under Brownian motion. However, in samples taken after the emergence of Variants of Concern with higher transmissibility, and after controlling for phylogenetic and sampling effects, we detect a declining trend for SCC and an increasing trend for its absolute evolutionary rate. This means that the decline in SCC itself accelerated over time, and that the increasing fitness of variant genomes led to a reduction in their genome sequence heterogeneity.
Supplementary files
File | Description |
---|---|
SupplementaryTables S1-S18.xlsx | The strain name, the collection date, and the SCC values for each analyzed genome. |
SupplementaryTableS19.pdf | A complete list acknowledging all originating and submitting laboratories for the sequence data in GISAID EpiCoV on which these analyses are based. |
SupplementaryTable S20.pdf | A complete list acknowledging the authors and the originating and submitting laboratories of the genetic sequences we used for the analysis of the Nextstrain sample. |
PhylogeneticTimetrees_NexusFormat.zip | Phylogenetic timetrees (Nexus format). |
PhylogeneticTimetrees_NewickFormat.zip | Phylogenetic timetrees (Newick format). |
Files
(4.3 MB)
MD5 checksum | Size |
---|---|
md5:543413cc46daf993e3a9b9e3c712dc86 | 612.8 kB |
md5:1c2b12e84b5a15b846f7d017bead88a1 | 551.0 kB |
md5:274d1a30fded0727fc1704d7b1130235 | 2.1 kB |
md5:5cc08059138ed7dd3868f601bd5c6e89 | 760.2 kB |
md5:a9d588d7b8c39c28201c5fb3ff271641 | 679.4 kB |
md5:da50514205e76a01bf3c42290df9b332 | 1.7 MB |
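The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of this record. Which checksum belongs to which file is not shown in the listing above, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large archives do not need
    to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 digest with a listed value.

    `listed` may be given with or without the "md5:" prefix used
    in the file listing.
    """
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file.zip", "md5:...")` returns `True` only if the local copy has the stated digest; here `downloaded_file.zip` is a placeholder path and the checksum string should be copied from the entry for that file.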