Dataset Open Access

Training data for "Identification of allelic variants in SARS-CoV-2 from deep sequencing reads"

Bérénice Batut; Wolfgang Maier


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Bérénice Batut</dc:creator>
  <dc:creator>Wolfgang Maier</dc:creator>
  <dc:date>2021-06-28</dc:date>
  <dc:description>Effectively monitoring global infectious disease crises, such as the COVID-19 pandemic, requires capacity to generate and analyze large volumes of sequencing data in near real time. These data have proven essential for monitoring the emergence and spread of new variants, and for understanding the evolutionary dynamics of the virus.

Two sequencing platforms in combination with several established library preparation strategies are predominantly used to generate SARS-CoV-2 sequence data. However, data alone do not equal knowledge: they need to be analyzed. The Galaxy community developed analysis workflows to support the identification of allelic variants (AVs) in SARS-CoV-2 from deep sequencing reads.

These workflows allow one to identify AVs and lineages in SARS-CoV-2 genomes with variant allele frequencies ranging from 5% to 100% (i.e., they detect variants with intermediate frequencies as well).

In this tutorial we will see how to run these workflows for the different types of input data:


	Single-end data derived from Illumina-based RNAseq experiments
	Paired-end data derived from Illumina-based RNAseq experiments
	Paired-end data generated with Illumina-based Ampliconic (ARTIC) protocols
	ONT FASTQ files generated with Oxford Nanopore (ONT)-based Ampliconic (ARTIC) protocols


To illustrate the tutorial, we took some example datasets (paired-end data generated with Illumina-based Ampliconic (ARTIC) protocols) from COG-UK, the COVID-19 Genomics UK Consortium.</dc:description>
  <dc:identifier>https://zenodo.org/record/5036687</dc:identifier>
  <dc:identifier>10.5281/zenodo.5036687</dc:identifier>
  <dc:identifier>oai:zenodo.org:5036687</dc:identifier>
  <dc:relation>doi:10.5281/zenodo.5036686</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/galaxy-training</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:title>Training data for "Identification of allelic variants in SARS-CoV-2 from deep sequencing reads"</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
                   All versions   This version
Views              165            165
Downloads          5,978          5,978
Data volume        301.1 GB       301.1 GB
Unique views       138            138
Unique downloads   285            285