A collection of high-quality human assemblies
Authors/Creators
Description
A collection of high-quality human assemblies, including:
- T2T-CHM13 v2.0 analysis set with HG002 chrY and rCRS chrM
- GRCh38 no-alt analysis set with rCRS chrM
- HG002 v1.1
- CN1 v1.0.1
- YAO v1.1
- 156 HPRC "r2-v1" samples (312 assemblies)
Use AGC to extract individual genomes and ropebwt3 to query the FM-index:
agc listset human320.agc # list genomes
agc getset human320.agc 400131_HG02615.pat > HG02615.pat.fa # extract one genome
gzip -d human320.fmr.gz # decompress the incremental index
ropebwt3 build -i human320.fmr -do human320.fmd # convert to a faster query format
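To extract several genomes at once, the two agc commands above can be combined in a small loop. The following is only a sketch, not part of the dataset's documented workflow: the helper names `out_name` and `extract_matching` are hypothetical, and it assumes that `agc listset` prints one sample name per line and that dropping everything up to the first underscore (as in `400131_HG02615.pat` to `HG02615.pat`) gives a suitable file name.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of agc or ropebwt3.
ARCHIVE=${ARCHIVE:-human320.agc}

# Derive an output file name from a sample name by dropping the prefix
# up to the first underscore, e.g. 400131_HG02615.pat -> HG02615.pat.fa
out_name() {
    printf '%s.fa\n' "${1#*_}"
}

# Extract every genome whose name contains the given pattern.
# Assumption: `agc listset` prints one sample name per line.
extract_matching() {
    pattern=$1
    agc listset "$ARCHIVE" | grep -- "$pattern" | while read -r name; do
        agc getset "$ARCHIVE" "$name" > "$(out_name "$name")"
    done
}
```

After sourcing these definitions, `extract_matching HG02615` would write one FASTA file per matching sample name into the current directory. Check the names reported by `agc listset` first, since the naming pattern assumed here may not hold for every genome in the collection.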
Note: HPRC samples are already available from GenBank but are not formally published. You may use the data for algorithm development or performance evaluation. If you want to use the genomes for biological discovery, please contact HPRC.
Files
(16.3 GB)
Additional details
Related works
- Continues: Dataset, 10.5281/zenodo.11533210 (DOI)
- Is published in: Journal article, 10.1093/bioinformatics/btae717 (DOI)