There is a newer version of the record available.

Published February 18, 2023 | Version v0.1
Dataset Open

Supplemental data for: Increased mutation and gene conversion within human segmental duplications

Description

Data used for figure generation and analysis in: Increased mutation and gene conversion within human segmental duplications

  1. new-assemblies.zip contains all the new assemblies added in this work beyond the HPRC assemblies (Clint PTR, CHM1, HG00514, NA12878, HG03125). All other assemblies used in this analysis are available through the HPRC: assembly_index/Year1_assemblies_v2_genbank.index.
  2. all-sample.vcf is a VCF file containing all the variant calls used in this analysis.
  3. alignments.zip contains all the syntenic alignments used for analysis.
  4. data.zip contains annotation data and other information used in analysis and figure generation.
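
As a quick sanity check after download, the variant calls can be tallied per chromosome. The following is a minimal sketch using only the Python standard library. It assumes all-sample.vcf is an uncompressed, tab-delimited VCF in the working directory; the function name and the local path are illustrative and not part of the record.

```python
import os
from collections import Counter
from typing import Iterable


def count_records_by_chrom(lines: Iterable[str]) -> Counter:
    """Count VCF data records per CHROM value, skipping header lines."""
    counts: Counter = Counter()
    for line in lines:
        # Meta and header lines start with '#'; blank lines carry no record.
        if not line.strip() or line.startswith("#"):
            continue
        # CHROM is the first tab-delimited column of a VCF data line.
        chrom = line.split("\t", 1)[0]
        counts[chrom] += 1
    return counts


if __name__ == "__main__":
    path = "all-sample.vcf"  # assumed local path to the downloaded file
    if os.path.exists(path):
        with open(path) as handle:
            for chrom, n in count_records_by_chrom(handle).most_common():
                print(chrom, n)
```

The function takes any iterable of lines, so it streams the file instead of loading it into memory. If the file turns out to be compressed, it would need to be opened with gzip.open in text mode instead.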

Code used in figure making and analysis is on GitHub.

Snakemake pipelines used in the analysis are also on GitHub:

  • Assembly alignment and IGC calling: https://github.com/mrvollger/asm-to-reference-alignment
  • Variant calling from assembly alignments: https://github.com/mrvollger/sd-divergence
  • Analysis of the triplet content of SNVs: https://github.com/mrvollger/mutyper_workflow

Files

Files (33.2 GB)

Checksum                              Size
md5:171a63782b446553134777c69f2502a7  1.0 GB
md5:3fdc3e6ecc19227114c6427f5bb8865b  584.1 MB
md5:3e00331d2702b6ad495d06d40a55971a  1.7 MB
md5:f3f609347515aadf2c316e88b97f5d39  2.2 MB
md5:f1b693b3d3b46f2060fcbb13f03089f1  22.9 GB
md5:4c113bceff5dc90cd76427b434a89dd0  8.8 GB
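
Each file in the listing carries an MD5 checksum, which can be used to confirm that a download completed intact. The following is a minimal sketch using Python's hashlib. The helper names are illustrative, and the path and checksum passed in are whatever the reader downloaded and copied from the record.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so multi-gigabyte archives fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a listed value, with or without 'md5:'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Calling matches_listing with the local path of a downloaded file and the checksum string shown for that file returns True when the two agree. Chunked reading matters here because the largest files in this record are tens of gigabytes.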