There is a newer version of the record available.

Published April 26, 2021 | Version v0.2
Journal article Open

Assemblies and data generated for "SEGMENTAL DUPLICATIONS AND THEIR VARIATION IN A COMPLETE HUMAN GENOME"

  • 1. University of Washington School of Medicine Department of Genome Sciences

Description

This repository contains all assemblies used in the paper "SEGMENTAL DUPLICATIONS AND THEIR VARIATION IN A COMPLETE HUMAN GENOME". With the exception of T2T-CHM13 v1.0 and GRCh38, all assemblies in this repository were assembled with Hifiasm v0.12 using default parameters. The human samples, with the exception of CHM1, were assembled using parental short-read data for phasing. All nonhuman primates and CHM1 were assembled without parental phasing information because none exists. The GRCh38 assembly contains only the chromosome-level sequences; all other contigs were removed. The T2T assembly (v1.0) has the GRCh38 chrY added to facilitate identifying interchromosomal duplications shared with chrY.
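The chromosome-level filtering described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `keep_chromosomes` is hypothetical, and the name pattern (UCSC-style `chr1` to `chr22`, `chrX`, `chrY`, `chrM`) is an assumption, since this record does not list the exact sequence names that were kept.

```python
import re
from typing import Iterable, Iterator

# Assumption: chromosome-level records use UCSC-style names
# (chr1..chr22, chrX, chrY, chrM). Adjust the pattern if the
# assembly uses different names or if chrM should be excluded.
CHROM_NAME = re.compile(r"chr(?:[1-9]|1[0-9]|2[0-2]|X|Y|M)")


def keep_chromosomes(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the FASTA lines that belong to chromosome-level records."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # The record name is the first whitespace-delimited token
            # after the '>' character.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            keep = CHROM_NAME.fullmatch(name) is not None
        if keep:
            yield line


example = [
    ">chr1 primary\n", "ACGT\n",
    ">chr1_KI270706v1_random\n", "TT\n",
    ">chrUn_KI270302v1\n", "GGGG\n",
    ">chrY\n", "CC\n",
]
print("".join(keep_chromosomes(example)), end="")
```

Because the function streams lines, it can be applied to an open file handle without loading a whole genome into memory.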

The zip file annotation_bed_files.zip contains the SD annotations and masking files used in the paper.

The tarball "data.plots.tar.gz" contains information used for figure generation, including SD annotations, methylation data, Liftoff gene models, WSSD copy number estimates, and RepeatMasker annotations.

The tarball "important_biomedical_and _evolutionary_loci.tar" contains sequences and annotations for the 10 loci highlighted in the paper.

The raw HiFi sequence data used for assembly is on the NCBI SRA:

  • HiFi data for Chimpanzee, Macaque, and Orangutan can be found under NCBI BioProject PRJNA659034; HiFi data for Bonobo and Gorilla can be found under PRJNA691628.
  • HiFi data for all the human samples can be found under the following accessions and BioProjects:
    • CHM13: NCBI SRA SRR11292120-SRR11292123 (PRJNA530776)
    • HG00733: NCBI SRA ERX3831682
    • HG002: NCBI SRA SRR10382244, SRR10382245, SRR10382248 and SRR10382249
    • HG00514: NCBI SRA ERX4795966
    • NA19240: NCBI SRA ERX4787609, ERX4787607, ERX4787606, ERX4782632, and ERX4781730
    • CHM1: 
    • HiFi data for all remaining human samples can be found under NCBI BioProject PRJNA701308
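The CHM13 entry above is written as a hyphenated range (SRR11292120-SRR11292123), which most download tools will not accept directly. The following is a small sketch for expanding such a range into individual run accessions. It assumes the hyphen denotes an inclusive range of consecutive accessions, which is the usual reading but is not stated in this record; the function name `expand_accession_range` is hypothetical.

```python
import re
from typing import List


def expand_accession_range(spec: str) -> List[str]:
    """Expand 'SRR11292120-SRR11292123' into individual accessions.

    A plain accession (no hyphenated range) is returned unchanged
    as a one-element list.
    """
    match = re.fullmatch(r"([A-Z]+)(\d+)-([A-Z]+)(\d+)", spec)
    if match is None:
        return [spec]
    prefix, start, end_prefix, end = match.groups()
    if prefix != end_prefix or int(end) < int(start):
        raise ValueError(f"not a valid accession range: {spec!r}")
    # Preserve the digit width of the first accession.
    width = len(start)
    return [f"{prefix}{n:0{width}d}" for n in range(int(start), int(end) + 1)]


print(expand_accession_range("SRR11292120-SRR11292123"))
# ['SRR11292120', 'SRR11292121', 'SRR11292122', 'SRR11292123']
```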

 

Notes

Upload for second submission.

Files

Files (36.0 GB)

Checksum  Size
md5:1ac84db26e6eec30dfddf37696aa67dc  492.2 MB
md5:66a292ceef9919b9fe7bf29235ae7ac0  12.2 MB
md5:b3b2137c579fd144301a5f1a5c1074f5  10.6 MB
md5:7b6dc98b9dad8fc899a45d654d07c02b  328.7 MB
md5:034dca899dc432c8406de762a11321d0  59.5 MB
md5:ab153a283bfc96b6f7e330eff1593ff6  19.4 MB
md5:682db570cbe7bbf399ad8156b95254a8  22.6 MB
md5:dcaa88a341258967d44e4055d56b9a84  326.0 MB
md5:b2eb5d41b51087ffa8a1da10d2092380  126.7 MB
md5:7ff7c8a3ded2fbcc12d19dd04ee903ac  20.4 MB
md5:5ce8e6ea766ecb98e0b1b29e6331794e  22.8 MB
md5:8e461327783d592df43c59fe43ffe6cb  327.6 MB
md5:f97e6c92bad56c8d029db2e3450cd26c  130.8 MB
md5:fc913273228a7bf6a4d220cc7984f737  1.9 GB
md5:27901368807c3b8287d48f8245297d51  16.9 MB
md5:f637d919bb185d6a85fae543c2f068b9  4.9 MB
md5:4b9731d8348515cc82b0cbef67c60842  320.1 MB
md5:d9d1d2daf824409453a4ca028f14aaf6  67.3 MB
md5:c9619b446cd957dace04738421dc84af  22.3 GB
md5:461536fab5333da3e230e6ddd818c70e  693.3 MB
md5:fc8a4c4953bc81dc6542850cc8065ce6  8.8 GB
md5:5a766488df4bce89fe0b93308c9a97f4  132.1 kB
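
The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and because the file names are not shown next to the checksums in this listing, the path in the usage note is a placeholder.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte
    assemblies.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a listed value such as 'md5:5a76...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listing("downloaded_file", "md5:5a766488df4bce89fe0b93308c9a97f4")` would return `True` only if the placeholder path `downloaded_file` holds the 132.1 kB file from the listing, byte for byte.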