Published May 30, 2023 | Version 0.0.2
Dataset Open

Repeat catalogs for TRGT

  • 1. Pacific Biosciences
  • 2. Baylor College of Medicine

Description

This dataset contains various repeat catalogs for the Tandem Repeat Genotyping Tool (TRGT):

  • pathogenic_repeats.hg38.bed contains annotations of 56 known pathogenic repeats.
  • polymorphic_repeats.hg38.bed contains 171,146 polymorphic repeats. The original version of this catalog was made for short reads and is distributed under the CC BY-SA 4.0 license.
  • adotto_repeats.hg38.bed contains 937,122 repeats originally released by the Genome in a Bottle tandem repeat benchmarking project (DOI: 10.5281/zenodo.7226352).
  • adotto_hprc.tdb.tar is a TRGTdb file containing alleles of repeats from the adotto_repeats.hg38.bed catalog across 100 HPRC samples.
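
As a quick sanity check after download, the repeat counts quoted above can be compared against the number of records in each BED file. The sketch below is illustrative only: it assumes each repeat occupies exactly one non-comment line, and the helper names count_bed_records and check_catalogs are hypothetical, not part of TRGT.

```python
from pathlib import Path

# Repeat counts as stated in the dataset description.
EXPECTED_COUNTS = {
    "pathogenic_repeats.hg38.bed": 56,
    "polymorphic_repeats.hg38.bed": 171_146,
    "adotto_repeats.hg38.bed": 937_122,
}


def count_bed_records(path):
    """Count data lines in a BED file, skipping blank and header/comment lines.

    Assumption: one repeat per line (not confirmed by the dataset page).
    """
    n = 0
    with open(path, "rt") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped or stripped.startswith(("#", "track", "browser")):
                continue
            n += 1
    return n


def check_catalogs(directory="."):
    """Return {filename: (observed, expected)} for catalogs found in directory."""
    results = {}
    for name, expected in EXPECTED_COUNTS.items():
        path = Path(directory) / name
        if path.exists():
            results[name] = (count_bed_records(path), expected)
    return results


if __name__ == "__main__":
    for name, (observed, expected) in check_catalogs().items():
        status = "OK" if observed == expected else "MISMATCH"
        print(f"{name}: {observed} records (expected {expected}) {status}")
```

A mismatch does not necessarily mean a corrupted download; it may simply mean the one-line-per-repeat assumption does not hold for that file.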

Please consider citing the TRGT preprint if you use these data.

Files

Files (1.1 GB)

Checksum                                Size
md5:b56e36bdda15a34b8c3d0ec63fcc8cff    1.1 GB
md5:9cfe0da4028cf7f7502162ee090199e4    18.2 MB
md5:06ff62717cad4320878c9f209c01b253    3.5 kB
md5:e9402d6b02b489cb139693891e959507    12.8 MB
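
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard hashlib module; because the listing does not pair checksums with file names, it only checks whether a file's digest appears among the four listed values. The function names md5_of_file and is_listed are hypothetical.

```python
import hashlib
import sys

# MD5 digests as listed in the Files section. File names are not shown
# there, so files are matched by digest only.
LISTED_MD5 = {
    "b56e36bdda15a34b8c3d0ec63fcc8cff",
    "9cfe0da4028cf7f7502162ee090199e4",
    "06ff62717cad4320878c9f209c01b253",
    "e9402d6b02b489cb139693891e959507",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest is one of the listed values."""
    return md5_of_file(path) in LISTED_MD5


if __name__ == "__main__":
    # Usage: python verify_md5.py FILE [FILE ...]
    for path in sys.argv[1:]:
        verdict = "matches a listed checksum" if is_listed(path) else "NOT listed"
        print(f"{path}: {md5_of_file(path)} ({verdict})")
```

Reading in chunks keeps memory use low for the 1.1 GB archive.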

Additional details

Related works

Is cited by
Preprint: 10.1101/2023.05.12.540470 (DOI)
Is derived from
Dataset: 10.5281/zenodo.7226352 (DOI)