Published January 20, 2025 | Version v1
Dataset
Open
Multiple sequence alignment of 1,870,492 SARS-CoV-2 genomes assembled by the Viridian project
Description
The assembled genomes were obtained from the publication by Hunt et al. (10.1101/2024.04.29.591666). They were filtered to remove any sequence containing at least 100 non-ACGT nucleotides or a run of at least two consecutive Ns anywhere except at the ends of the sequence. The unaligned filtered sequences are available at https://zenodo.org/records/14698684.
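The filtering rule above can be sketched in Python as follows. This is a minimal illustration, not the script actually used to build the dataset. The function names (`keep_sequence`, `read_fasta`, `filter_fasta`) are hypothetical, and the sketch assumes that sequences are compared case-insensitively, that the non-ACGT count covers the whole sequence, and that "except at the ends" means leading and trailing runs of N are ignored only for the consecutive-N criterion.

```python
import re

# Assumption: "at least 100 non-ACGT nucleotides" means count >= 100.
MAX_NON_ACGT = 100
# Two or more consecutive Ns always contain the substring "NN".
CONSECUTIVE_N = re.compile(r"NN")


def keep_sequence(seq: str) -> bool:
    """Return True if the sequence passes both filtering criteria."""
    seq = seq.upper()
    non_acgt = sum(1 for base in seq if base not in "ACGT")
    if non_acgt >= MAX_NON_ACGT:
        return False
    # Ignore N runs at either end before looking for internal runs.
    core = seq.strip("N")
    return CONSECUTIVE_N.search(core) is None


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_fasta(lines):
    """Return the records that survive the filter, in input order."""
    return [(name, seq) for name, seq in read_fasta(lines) if keep_sequence(seq)]
```

For a file on disk, `filter_fasta(open(path))` would apply the same rule record by record. For an input of this size, a streaming variant that writes each kept record immediately would be preferable to building a list.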
The 1,870,492 remaining sequences were then aligned using Halign3 (10.1093/molbev/msac166), which required 1.2 TB of RAM.
The computation was performed on the IFB Core cluster managed by the Institut Français de Bioinformatique.
Files
Checksum | Size |
---|---|
md5:75db04a474804bb3b2cd583a3b9b24c8 | 63.2 MB |
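After downloading, the file can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names (`md5_of_file`, `matches_record`) are hypothetical, and the path passed in is whatever local name the downloaded file was saved under, since the record text here does not show the file name.

```python
import hashlib

# Checksum as listed in the record (without the "md5:" prefix).
EXPECTED_MD5 = "75db04a474804bb3b2cd583a3b9b24c8"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record(local_path)` returns `False` for a truncated or corrupted download, in which case the file should be fetched again.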
Additional details
Funding
- Agence Nationale de la Recherche: INSSANE - Integrated Sequencing and Structural Analysis of RNA Probing Experiments (ANR-21-CE45-0034)
- Agence Nationale de la Recherche: IFB (ex Renabi-IFB) - Institut français de bioinformatique (ANR-11-INBS-0013)
Dates
- Submitted: 2025-01-20