Dataset Open Access

Training data for "Identification of allelic variants in SARS-CoV-2 from deep sequencing reads"

Bérénice Batut; Wolfgang Maier

Effectively monitoring global infectious disease crises, such as the COVID-19 pandemic, requires capacity to generate and analyze large volumes of sequencing data in near real time. These data have proven essential for monitoring the emergence and spread of new variants, and for understanding the evolutionary dynamics of the virus.

Two sequencing platforms in combination with several established library preparation strategies are predominantly used to generate SARS-CoV-2 sequence data. However, data alone do not equal knowledge: they need to be analyzed. The Galaxy community developed analysis workflows to support the identification of allelic variants (AVs) in SARS-CoV-2 from deep sequencing reads.

These workflows allow one to identify AVs and lineages in SARS-CoV-2 genomes with variant allele frequencies ranging from 5% to 100% (i.e., they also detect variants with intermediate frequencies).
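As a rough illustration of what the 5% lower bound means, the allele frequency of a variant can be thought of as the fraction of reads covering a position that support the alternate allele. The sketch below is not how the Galaxy workflows compute or filter frequencies; the function names, the default threshold argument, and the read counts in the usage note are hypothetical and only mirror the 5% to 100% range stated above.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that support the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def within_reported_range(alt_reads: int, total_reads: int,
                          min_af: float = 0.05) -> bool:
    """True if the variant's allele frequency is at least min_af.

    min_af=0.05 is an assumption taken from the 5% figure in the text.
    """
    return allele_frequency(alt_reads, total_reads) >= min_af
```

For example, a variant supported by 120 of 1,000 reads has an allele frequency of 12%, so `within_reported_range(120, 1000)` is true, whereas one supported by 20 of 1,000 reads (2%) falls below the assumed threshold.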

In this tutorial we will see how to run these workflows for the different types of input data:

  • Single-end data derived from Illumina-based RNAseq experiments
  • Paired-end data derived from Illumina-based RNAseq experiments
  • Paired-end data generated with Illumina-based ampliconic (ARTIC) protocols
  • ONT fastq files generated with Oxford Nanopore (ONT)-based ampliconic (ARTIC) protocols

To illustrate the tutorial, we took some example datasets (paired-end data generated with Illumina-based Ampliconic (ARTIC) protocols) from COG-UK, the COVID-19 Genomics UK Consortium.
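The paired-end example files in this record are named `<run accession>_1.fastqsanger.gz` and `<run accession>_2.fastqsanger.gz`. Before building a paired collection from them, it can help to group the forward and reverse files by run accession. The following is a minimal sketch of that grouping; the helper name `pair_reads` and the regular expression are assumptions made for illustration and are not part of the tutorial or of Galaxy.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <run accession>_<1|2>.fastqsanger.gz
PAIR_PATTERN = re.compile(r"^(?P<run>[A-Z]+\d+)_(?P<mate>[12])\.fastqsanger\.gz$")


def pair_reads(filenames):
    """Group file names into {run: (forward, reverse)} for complete pairs only."""
    mates_by_run = defaultdict(dict)
    for name in filenames:
        match = PAIR_PATTERN.match(name)
        if match is None:
            continue  # skip files that are not read files (reference, BED, ...)
        mates_by_run[match["run"]][match["mate"]] = name

    pairs = {}
    for run, mates in sorted(mates_by_run.items()):
        if set(mates) == {"1", "2"}:  # keep runs with both mates present
            pairs[run] = (mates["1"], mates["2"])
    return pairs
```

Runs for which only one of the two files is present are left out of the result, which makes an incomplete download easy to spot.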

Files (1.8 GB)
Name                                    Size         MD5
ARTIC_amplicon_info_v3.tabular          4.1 kB       70bcc8b5ebee4a69fe780d07eab9ecca
ARTIC_nCoV-2019_v3.bed6                 12.0 kB      c3507dd20502bea58cd3f410267d8478
ERR5931005_1.fastqsanger.gz             93.2 MB      3bbc7c4923351b787c3200fae5a2a3c2
ERR5931005_2.fastqsanger.gz             103.3 MB     e7684604bea5c276fc865d9b4a04c27a
ERR5931006_1.fastqsanger.gz             74.4 MB      8a6e61185b4f7db4067f1472b51df4c1
ERR5931006_2.fastqsanger.gz             83.5 MB      79f9ded7820933caeb41a5421ac98c5c
ERR5931007_1.fastqsanger.gz             66.3 MB      e3237fa0c1799c0532417902f3378887
ERR5931007_2.fastqsanger.gz             75.9 MB      d984844a645409c429000c1c024ac927
ERR5931008_1.fastqsanger.gz             42.5 MB      64da62827b0f44447299fad50ff44dcd
ERR5931008_2.fastqsanger.gz             48.2 MB      c6d84e1a9c74dc844141de70ef83de47
ERR5949456_1.fastqsanger.gz             70.0 MB      7e1062e5bab35025d37d2cda35fce46a
ERR5949456_2.fastqsanger.gz             79.1 MB      5737abf02e431d9056f6ef77403da672
ERR5949457_1.fastqsanger.gz             52.5 MB      3a46fc92c376d9ac1cc52b03ac598f98
ERR5949457_2.fastqsanger.gz             59.6 MB      615ce341745eee6e28324ae14046398e
ERR5949458_1.fastqsanger.gz             69.9 MB      9dbdc343b7858343956c72f8cbfb83b6
ERR5949458_2.fastqsanger.gz             80.4 MB      8980ac7566388b40d081ff823c671789
ERR5949459_1.fastqsanger.gz             64.7 MB      fc5c4b6f7dd0453a68446d886f05266d
ERR5949459_2.fastqsanger.gz             74.0 MB      1fc2c78bbbf4c87fca84161616f10e7d
ERR5949460_1.fastqsanger.gz             42.1 MB      f9d4b3efd5263d00612ab8b9f136558c
ERR5949460_2.fastqsanger.gz             47.8 MB      74242af479bdc898959dbdd1f270965a
ERR5949461_1.fastqsanger.gz             28.7 MB      a97d3ba8ea69740c347af547f3c8b411
ERR5949461_2.fastqsanger.gz             32.7 MB      2c7a02d79337c80c5389fa88f7400926
ERR5949462_1.fastqsanger.gz             15.3 MB      c67da720da6714062e72041aee313252
ERR5949462_2.fastqsanger.gz             17.5 MB      ef6b97063f81be9f7d2418166a687bdd
ERR5949463_1.fastqsanger.gz             41.0 MB      2a55dc9239fc55c428bcf780de837a1e
ERR5949463_2.fastqsanger.gz             47.9 MB      425c7dae491a2cc4d974ce44eb0bffe6
ERR5949464_1.fastqsanger.gz             36.0 MB      0457441e1b6a5bd120af341b39ba39c9
ERR5949464_2.fastqsanger.gz             39.9 MB      de8465315d3a2fe84e27ffb5dcf21d2f
ERR5949465_1.fastqsanger.gz             13.9 MB      24182dd42307f74370f9f190144b4fb7
ERR5949465_2.fastqsanger.gz             15.3 MB      a73ca480fb9029e056982f65c430930f
ERR5949466_1.fastqsanger.gz             63.7 MB      a00089267f1ed27e9944c67197dfdc26
ERR5949466_2.fastqsanger.gz             70.8 MB      4d9e1daa7fe42846814f7d4cf39ec97f
ERR5949467_1.fastqsanger.gz             43.3 MB      63b212b4cbfe5cfebeaebde2caebf6ff
ERR5949467_2.fastqsanger.gz             47.5 MB      579c44ca4abae971cb5822e013a70301
ERR5949468_1.fastqsanger.gz             15.4 MB      c9d647e2904a10b4afd60c49ce8b3fce
ERR5949468_2.fastqsanger.gz             17.7 MB      23b67b676e32aaea7fcaf91486d46acd
ERR5949469_1.fastqsanger.gz             18.8 MB      3016598154380bf250eeeb73c7259eb7
ERR5949469_2.fastqsanger.gz             20.9 MB      e9348a9fb2a6ee4f0d3b161ee6187e66
NC_045512.2_feature_mapping.tabular     488 Bytes    2e02be0c1d258e030361ae94de6313a6
NC_045512.2_reference_sequence.fasta    29.9 kB      b915d0b4dd6af6f06d3f586fbc0efdba
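
Each file above is listed with an md5 value, which can be used to check that a download completed without corruption. A minimal sketch of such a check follows, using only the Python standard library; the function names `md5_of_file` and `matches_md5` are made up for this example, and the local file path is assumed to be wherever the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large FASTQ archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Compare a file's MD5 with a listed value (an 'md5:' prefix is tolerated)."""
    expected_hex = expected.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected_hex
```

For instance, after downloading `ARTIC_nCoV-2019_v3.bed6` into the working directory, `matches_md5("ARTIC_nCoV-2019_v3.bed6", "c3507dd20502bea58cd3f410267d8478")` should return `True` if the file is intact.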