Investigating evolution at the catalytic site of the main SARS-CoV-2 protease using over 15,000 genomes
Creators
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridgeshire, United Kingdom
Description
We investigated evolution and genomic variation of SARS-CoV-2 within the current pandemic at the catalytic site of the main SARS-CoV-2 protease (see https://zenodo.org/record/3834875#.Xs1IHsZ7nyk and https://openlabnotebooks.org/mapping-the-genetic-variations-of-sars-cov-2-onto-its-proteins-crystal-structures-post-1/ ).
We used more than 15,000 genomic sequences from GISAID (https://www.epicov.org/) available on the 17th of May 2020.
We applied a new approach, based on phylogenetic inference of homoplasy, clustering of mutations, and ambiguous consensus sequence characters, to identify sites that are likely affected by sequencing artefacts.
We found that the catalytic-site residues are mostly conserved: the only amino acid variants observed are M49I, P52S, N142S, and P168S, each of which appears at extremely low frequency (in at most two samples).
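As an illustration of how variant labels such as M49I and their sample counts can be tallied, the following is a minimal sketch. It is not the pipeline used for this record: it assumes protein sequences that are already aligned to a reference, it uses toy data, and the function name `call_variants` and the set of characters treated as ambiguous are hypothetical choices. It does not attempt the homoplasy inference or mutation clustering described above; it only skips ambiguous characters so that they are not counted as substitutions.

```python
from collections import Counter

# Characters treated as uninformative (an assumption for this sketch).
AMBIGUOUS = {"X", "-", "?"}


def call_variants(reference, samples, sites):
    """Count amino-acid substitutions at 1-based `sites` relative to `reference`.

    Returns a Counter keyed by labels of the form <ref><position><alt>,
    for example "M49I".
    """
    counts = Counter()
    for seq in samples:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference")
        for pos in sites:
            ref_aa = reference[pos - 1]
            alt_aa = seq[pos - 1]
            # Ignore ambiguous calls and positions identical to the reference.
            if alt_aa in AMBIGUOUS or alt_aa == ref_aa:
                continue
            counts[f"{ref_aa}{pos}{alt_aa}"] += 1
    return counts


# Toy data only: a four-residue "protein" and five samples.
reference = "MPNP"
samples = ["MPNP", "IPNP", "MSNP", "MPXP", "IPNP"]
variants = call_variants(reference, samples, sites=[1, 2, 3, 4])

for label, n in sorted(variants.items()):
    print(label, n, f"{n / len(samples):.1%}")
```

On real data, `sites` would be the catalytic-site positions of interest, and a variant seen in only one or two samples would show up here with a count of 1 or 2.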
Files
Checksum | Size |
---|---|
md5:ea3d219241a7902892a7a7d03e6c7c71 | 27.8 kB |
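A downloaded copy of the file can be checked against the MD5 checksum listed above. The sketch below uses Python's standard `hashlib`; the helper names `md5_of_file` and `matches_record` are hypothetical, and the local file path must be supplied by the user, since the file name is not shown in this listing.

```python
import hashlib

# Checksum as listed in this record.
EXPECTED_MD5 = "ea3d219241a7902892a7a7d03e6c7c71"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the path of the local copy returns `True` only if the download is byte-for-byte identical to the deposited file.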
Additional details
Related works
- Is supplemented by
- https://openlabnotebooks.org/mapping-the-genetic-variations-of-sars-cov-2-onto-its-proteins-crystal-structures-post-1/ (URL)
- 10.5281/zenodo.3834875 (DOI)
- Other: https://www.epicov.org/ (URL)