Published January 6, 2022 | Version v1
Dataset
Open
AGC archives of human and SARS-CoV-2 genomes
- 1. Silesian University of Technology
- 2. DFCI & Harvard
Description
AGC is a tool to compress a collection of similar genomes. This Zenodo record provides pre-built AGC archives of several datasets:
- File "HPRC-yr1.agc" contains CHM13 and 94 haploid human assemblies released by HPRC in 2021. The telomere-to-telomere CHM13 v1.1 plus chrY from GRCh38 is used as the reference genome.
- File "sars-cov-2_ncbi-620k.agc" contains 619,750 complete SARS-CoV-2 genomes with NC_045512.2 as the reference. It was created with the AGC command line "agc create -cb10000 -s5000". The SARS-CoV-2 genomes were downloaded from NCBI at the end of 2021. The original FASTA is provided as "sars-cov-2_ncbi-620k.fa.xz".
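As a rough illustration of how one of these archives might be queried, the sketch below wraps the `agc` command-line tool from Python. It assumes `agc` is installed and on `PATH`, and that the `listset` and `getset` subcommands behave as described in the AGC documentation (list sample names; write one sample as FASTA to standard output). The helper names `agc_command`, `list_samples`, and `extract_sample` are hypothetical and not part of AGC.

```python
import os
import shutil
import subprocess


def agc_command(subcommand, archive, *args):
    """Build an argv list for the agc CLI (hypothetical helper, no shell involved)."""
    return ["agc", subcommand, archive, *args]


def list_samples(archive):
    """Return the sample names that `agc listset` prints, one per line (assumed format)."""
    result = subprocess.run(
        agc_command("listset", archive),
        check=True,
        capture_output=True,
        text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def extract_sample(archive, sample, out_fasta):
    """Write one sample to a FASTA file using `agc getset` (assumed to print to stdout)."""
    with open(out_fasta, "wb") as handle:
        subprocess.run(agc_command("getset", archive, sample), check=True, stdout=handle)


if __name__ == "__main__":
    archive = "sars-cov-2_ncbi-620k.agc"
    if shutil.which("agc") is None or not os.path.exists(archive):
        print("agc or the archive is not available; nothing to do")
    else:
        names = list_samples(archive)
        print(len(names), "samples in", archive)
```

Running `agc` with no arguments prints its usage text, which is the place to confirm the subcommand names before relying on this sketch.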
Files
(1.5 GB)
MD5 checksum | Size |
---|---|
md5:4e9608ee808a9728bc1c7ee8496f7c30 | 1.5 GB |
md5:4690b70401de2c2f08e2c64fcd0c3a31 | 28.0 MB |
md5:0a3140edb6a67e7133be748b5af422a4 | 56.3 MB |
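The MD5 checksums listed above can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library. Because the listing does not pair each checksum with a file name, the hypothetical helper `is_listed` only tests whether a file's digest matches any of the three published values. The names `EXPECTED_MD5`, `md5_of_file`, and `is_listed` are illustrative.

```python
import hashlib

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "4e9608ee808a9728bc1c7ee8496f7c30",
    "4690b70401de2c2f08e2c64fcd0c3a31",
    "0a3140edb6a67e7133be748b5af422a4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 matches one of the published checksums."""
    return md5_of_file(path) in EXPECTED_MD5
```

For example, `is_listed("HPRC-yr1.agc")` should return `True` for an uncorrupted download, assuming the file was saved under that name in the current directory. Reading in chunks keeps memory use low even for the 1.5 GB archive.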