Published February 6, 2025 | Version v1
Dataset Open

Viridian consensus sequences and metadata (2024-10-14, batch 1 + 2), aligned with MAFFT and converted to VCF Zarr format

  • 1. University of Oxford

Description

This is a repackaging of the Viridian SARS-CoV-2 dataset produced by Hunt et al. For details on the dataset and how it was produced, see the original publication: https://www.biorxiv.org/content/10.1101/2024.04.29.591666v1

We have done two things:

  • Aligned the consensus sequences provided by Viridian to the Wuhan-Hu-1/2019 reference using MAFFT v7.475
  • Converted the alignments and metadata to VCF Zarr format. See the paper describing the format for more information: https://doi.org/10.1093/gigascience/giaf049

The result is a single Zip file that contains all of the alignment data and metadata in a compact and easily accessible format.
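Because the file is an ordinary Zip archive, its layout can be inspected with the Python standard library alone, before committing to a particular Zarr reader. The sketch below lists the top-level entries in the archive. The helper name `top_level_entries` is hypothetical, and the idea that each top-level entry corresponds to one array or metadata file of the store is an assumption about how the archive was packed, not something stated on this page.

```python
import zipfile


def top_level_entries(zip_path):
    """Return the sorted, de-duplicated top-level names in a Zip archive.

    Hypothetical helper for illustration only; it makes no assumption
    about which arrays the archive actually contains.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    # Keep only the first path component of each member,
    # e.g. "some_array/0.0" -> "some_array".
    return sorted({name.split("/", 1)[0] for name in names if name})
```

Calling `top_level_entries("viridian_mafft_2024-10-14_v1.vcz.zip")` on the downloaded file should give a quick overview of what the store holds without extracting it.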

The Snakemake pipeline for producing this file is fully reproducible and available on GitHub: https://github.com/jeromekelleher/sc2ts-paper/tree/main/viridian_dataset

 

Files

viridian_mafft_2024-10-14_v1.vcz.zip
Size: 419.7 MB
md5:7c4a01379391dec0e38afaee3563fe63
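
The MD5 checksum listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `matches_expected` are hypothetical, and the file is read in chunks so that the whole archive is never held in memory.

```python
import hashlib

# Checksum as listed on this page.
EXPECTED_MD5 = "7c4a01379391dec0e38afaee3563fe63"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("viridian_mafft_2024-10-14_v1.vcz.zip")` should return `True` for an intact download, assuming the file was saved under its original name in the current directory.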