Dataset Open Access

Training data for "Identification of allelic variants in SARS-CoV-2 from deep sequencing reads"

Bérénice Batut; Wolfgang Maier


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.5036687</identifier>
  <creators>
    <creator>
      <creatorName>Bérénice Batut</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-9852-1987</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Wolfgang Maier</creatorName>
    </creator>
  </creators>
  <titles>
    <title>Training data for "Identification of allelic variants in SARS-CoV-2 from deep sequencing reads"</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2021</publicationYear>
  <dates>
    <date dateType="Issued">2021-06-28</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/5036687</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.5036686</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/galaxy-training</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;Effectively monitoring global infectious disease crises, such as the COVID-19 pandemic, requires capacity to generate and analyze large volumes of sequencing data in near real time. These data have proven essential for monitoring the emergence and spread of new variants, and for understanding the evolutionary dynamics of the virus.&lt;/p&gt;

&lt;p&gt;Two sequencing platforms in combination with several established library preparation strategies are predominantly used to generate SARS-CoV-2 sequence data. However, data alone do not equal knowledge: they need to be analyzed. The Galaxy community developed analysis workflows to support the &lt;strong&gt;identification of allelic variants (AVs) in SARS-CoV-2 from deep sequencing reads&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These workflows allow one to identify AVs and lineages in SARS-CoV-2 genomes, with variant allele frequencies ranging from 5% to 100% (i.e., they also detect variants at intermediate frequencies).&lt;/p&gt;

&lt;p&gt;In this tutorial we will see how to run these workflows for the different types of input data:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Single-end data derived from Illumina-based RNAseq experiments&lt;/li&gt;
	&lt;li&gt;Paired-end data derived from Illumina-based RNAseq experiments&lt;/li&gt;
	&lt;li&gt;Paired-end data generated with Illumina-based ampliconic (ARTIC) protocols&lt;/li&gt;
	&lt;li&gt;ONT FASTQ files generated with Oxford Nanopore (ONT)-based ampliconic (ARTIC) protocols&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To illustrate the tutorial, we use example datasets (paired-end data generated with Illumina-based ampliconic (ARTIC) protocols) from COG-UK, the COVID-19 Genomics UK Consortium.&lt;/p&gt;</description>
  </descriptions>
</resource>
                  All versions   This version
Views             153            153
Downloads         5,897          5,897
Data volume       296.9 GB       296.9 GB
Unique views      128            128
Unique downloads  275            275
