This report was automatically generated on November 7, 2022.
Katherine Eaton
| National Microbiology Laboratory, PHAC
| katherine.eaton@phac-aspc.gc.ca
The ncov-recombinant update from v0.5.1 to v0.6.0 has two major changes.
The first change is a Nextclade upgrade to the sars-cov-2 2022-10-27 dataset, which introduces recombinant sublineages for the first time (e.g. XBB.1) and two new lineages: XBD and XBE.
The second major change is the calculation and visualization of immune-related statistics. In v0.6.0, the number of key receptor binding domain (RBD) mutations is calculated for every sample by comparing the aaSubstitutions column (amino acid substitutions) produced by Nextclade to the list of 12 key RBD mutations provided by Nextstrain. In addition, the statistics immune_escape and ace2_binding from Nextclade are included in the final linelists.
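The RBD mutation count described above can be sketched as a set intersection. This is a minimal illustration, not the pipeline's implementation: the function name `count_key_rbd_mutations` is hypothetical, the input is assumed to be a comma-separated string of `GENE:RefPosAlt` entries, and `EXAMPLE_KEY_RBD` is a small placeholder set rather than the actual Nextstrain list of 12.

```python
def count_key_rbd_mutations(aa_substitutions, key_mutations):
    """Count how many key RBD mutations occur in one aaSubstitutions value.

    aa_substitutions: comma-separated string such as "S:L452R,S:D614G"
    (assumed format); empty or None means no substitutions were reported.
    """
    if not aa_substitutions:
        return 0
    observed = {m.strip() for m in aa_substitutions.split(",") if m.strip()}
    return len(observed & set(key_mutations))


# Illustrative placeholder only, NOT the Nextstrain list of 12 key RBD mutations.
EXAMPLE_KEY_RBD = {"S:R346T", "S:L452R", "S:F486V"}

sample = "ORF1a:T3255I,S:L452R,S:F486V,S:D614G"
print(count_key_rbd_mutations(sample, EXAMPLE_KEY_RBD))  # prints 2
```

Exact string matching is used here, so a different amino acid at the same position would not be counted; matching on position alone would be a different design choice.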
Between v0.5.1 and v0.6.0, 14.2% of sequences in the controls-gisaid dataset had different detection results. 4.0% of sequences were newly classified (NA → X*) and represent lineages not present in the v0.5.1 model. 10.2% of sequences had sublineage assignment changes as a result of the Nextclade dataset upgrade. 0.0% of positive controls were dropped (X* → NA), indicating no observed loss in sensitivity.
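The three categories reported above (new detections, lineage changes, dropped positives) can be tallied along the following lines. This is a hedged sketch with toy data, not the logic of `compare_positives.py`: the function names are hypothetical, `None` stands in for NA, and any difference between two non-NA calls is counted as a lineage change.

```python
from collections import Counter


def classify_change(old, new):
    """Classify one sequence's call between two versions (None means NA)."""
    if old is None and new is None:
        return "unchanged"
    if old is None:
        return "new_detection"   # NA -> X*
    if new is None:
        return "dropped"         # X* -> NA
    if old != new:
        return "lineage_change"  # e.g. XBB -> XBB.1
    return "unchanged"


def summarize(old_calls, new_calls):
    """Return the percentage of sequences in each category, rounded to 0.1."""
    strains = sorted(set(old_calls) | set(new_calls))
    counts = Counter(
        classify_change(old_calls.get(s), new_calls.get(s)) for s in strains
    )
    return {k: round(100 * v / len(strains), 1) for k, v in counts.items()}


# Toy data for illustration only; these are not the report's sequences.
old = {"s1": "XBB", "s2": None, "s3": "XE", "s4": None}
new = {"s1": "XBB.1", "s2": "XBD", "s3": "XE", "s4": None}
print(summarize(old, new))
```

A category with no members is simply absent from the result, so a report of 0.0% dropped would correspond to a missing `dropped` key in this sketch.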
ncov-recombinant v0.6.0 is a recommended upgrade for recombinant surveillance to enable sublineage classification and to access enhanced statistics regarding immune escape.
For a comprehensive summary of the methodological changes, please see the release notes for v0.6.0.
Verify that the update of the ncov-recombinant pipeline from version 0.5.1 to 0.6.0:
controls-gisaid
This dataset includes SARS-CoV-2 genomes from GISAID that reflect the known diversity of recombinant sequences to date. These include 501 positive controls (recombinants), representing lineages XA - XBE, and 186 negative controls (non-recombinants) selected from the Nextstrain Reference Phylogeny.
In total, 687 control sequences were used as input and a strain list is available here.
The snakemake pipelines for v0.5.1 and v0.6.0 were run independently on the same dataset (controls-gisaid). Please see the Procedure section in the Supplementary for detailed command-line instructions.
controls-gisaid
(XBB → XBB.1)
(NA)
Note: Lineage assignments in v0.6.0 are identical to those in pango-designation and are the expected values.
New detections (NA → X*) result from the following changes in v0.6.0:
Sublineage changes result from the following updates in v0.6.0:
The following plots report recombinant sequences over the last 16 weeks.
Note: Download the GISAID sequences and metadata in the strains list to data/controls-gisaid/.
Download the pipeline.
git clone https://github.com/ktmeaton/ncov-recombinant.git 0.5.1
cd 0.5.1
git checkout v0.5.1
Symlink the controls-gisaid data.
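rm -rf data/controls-gisaid
# Assumes the downloaded data/ directory sits beside this clone; a relative
# symlink target resolves from the link's own directory (data/), hence ../../
ln -s ../../data/controls-gisaid data/controls-gisaid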
Create a version-controlled conda environment.
# Local
mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.5.1
# HPC
sbatch -J conda-ncov-recombinant-0.5.1 --wrap="mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.5.1"
Run the pipeline.
# Local
conda activate ncov-recombinant-0.5.1
snakemake --profile profiles/controls-gisaid-hpc
# HPC
scripts/slurm.sh --profile profiles/controls-gisaid-hpc --conda-env ncov-recombinant-0.5.1
Download the pipeline.
git clone https://github.com/ktmeaton/ncov-recombinant.git 0.6.0
cd 0.6.0
git checkout v0.6.0
Symlink the controls-gisaid data.
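rm -rf data/controls-gisaid
# Assumes the downloaded data/ directory sits beside this clone; a relative
# symlink target resolves from the link's own directory (data/), hence ../../
ln -s ../../data/controls-gisaid data/controls-gisaid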
Create a version-controlled conda environment.
# Local
mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.6.0
# HPC
sbatch -J conda-ncov-recombinant-0.6.0 --wrap="mamba env create -f workflow/envs/environment.yaml -n ncov-recombinant-0.6.0"
Run the pipeline.
# Local
conda activate ncov-recombinant-0.6.0
snakemake --profile profiles/controls-gisaid-hpc
# HPC
scripts/slurm.sh --profile profiles/controls-gisaid-hpc --conda-env ncov-recombinant-0.6.0
After the pipelines are complete for each version, run the following to compare lineage assignments.
python3 0.6.0/scripts/compare_positives.py \
--positives-1 0.5.1/results/controls-gisaid/linelists/positives.tsv \
--positives-2 0.6.0/results/controls-gisaid/linelists/positives.tsv \
--ver-1 "v0.5.1" \
--ver-2 "v0.6.0" \
--outdir compare/controls-gisaid \
--node-order alphabetical \
--min-link-size 1
csvtk cut -t -f "strain" 0.5.1/results/controls-gisaid/linelists/positives.tsv \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - -v 0.6.0/results/controls-gisaid/linelists/positives.tsv \
| csvtk cut -t -f "strain" \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - 0.5.1/results/controls-gisaid/linelists/linelist.tsv \
| csvtk pretty -t \
| less -S
csvtk cut -t -f "strain" 0.6.0/results/controls-gisaid/linelists/positives.tsv \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - -v 0.5.1/results/controls-gisaid/linelists/positives.tsv \
| csvtk cut -t -f "strain" \
| tail -n+2 \
| csvtk grep -t -f "strain" -P - 0.6.0/results/controls-gisaid/linelists/linelist.tsv \
| csvtk pretty -t \
| less -S
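The two csvtk pipelines above appear to compute a set difference on the strain column: strains present in one version's positives.tsv but absent from the other's are looked up in a linelist. The same idea can be sketched in Python with the standard csv module. The function names are hypothetical, and the only assumption about the files is that they are tab-separated with a `strain` column, as the csvtk commands imply.

```python
import csv


def read_strains(path, column="strain"):
    """Return the set of values in one column of a tab-separated file."""
    with open(path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle, delimiter="\t")}


def rows_only_in_second(positives_1, positives_2, linelist, column="strain"):
    """Return linelist rows for strains in positives_2 but not in positives_1."""
    gained = read_strains(positives_2, column) - read_strains(positives_1, column)
    with open(linelist, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row for row in reader if row[column] in gained]
```

Under this reading, the first csvtk pipeline would correspond to `rows_only_in_second` called with the v0.5.1 positives, the v0.6.0 positives, and the v0.5.1 linelist; the second swaps the two versions.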
The following plots report all recombinant sequences.
controls-gisaid