Published January 23, 2026 | Version v2
Dataset Open

TRcompDB v2.0, a global reference of human tandem repeat composition and mutation rate from long-read assemblies

  • 1. University of Southern California

Description

The following motif catalogs are provided:

  1. the vamos genomic catalog expanded by TRExplore v1.0 (4,422,661 loci) for GRCh38
  2. the vamos genomic catalog for CHM13 (1,175,989 loci)
  3. the vamos genomic catalog expanded by TRExplore v1.0 (4,155,246 loci) for CHM13
  4. the curated functional catalog for GRCh38 (72 loci)

All motif catalogs are generated under the v3.0 infrastructure of the catalog construction pipeline. “oriMotifs” stands for the full motif set before efficient motif selection. “effMotifs-0.1” stands for the efficient motifs selected under q=0.1.

TR annotations are provided as a combined VCF for 416 publicly available diploid long-read assemblies.
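To check that a downloaded copy of the combined VCF covers the expected assemblies, one option is to read the sample names from its column header line. The sketch below relies only on the standard VCF layout, in which the `#CHROM` line carries nine fixed columns (including FORMAT) followed by one column per sample. The function names are illustrative, and this record does not state whether each assembly appears as one column or as one column per haplotype, so the count should be interpreted with that in mind.

```python
import gzip

# CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT (standard VCF layout)
FIXED_VCF_COLUMNS = 9


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def vcf_sample_names(lines):
    """Return the sample column names from an iterable of VCF lines."""
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            return fields[FIXED_VCF_COLUMNS:]
        if not line.startswith("#"):
            break  # reached data records without seeing the column header
    raise ValueError("no #CHROM header line found")


# Tiny in-memory demonstration with made-up sample names.
demo_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n",
]
print(vcf_sample_names(demo_lines))
```

For a real file, `with open_text(path) as handle: names = vcf_sample_names(handle)` reads only the header lines, so it stays fast even for a large VCF.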

Locus-specific ANOVAs by allelic length or composition are provided in "anova.annoLen.tsv.gz" and "anova.ed2Major.tsv.gz".

Estimated locus-specific mutation rates for GRCh38 are provided in a master table with locus features (mutRate_GRCh38_masterFile.tsv.gz). Columns are explained in the txt file (mutRate_GRCh38_masterFile.txt).
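The gzip-compressed TSV tables can be inspected without decompressing them to disk. The following is a minimal sketch that assumes the first line of the file is a tab-separated header row; that assumption should be checked against mutRate_GRCh38_masterFile.txt, since the column names are not listed in this record. The function name `read_tsv_gz` is illustrative.

```python
import csv
import gzip
from itertools import islice


def read_tsv_gz(path, max_rows=5):
    """Return (header, rows) for the first max_rows records of a gzipped TSV.

    Assumes the first line is a header row (not confirmed by this record).
    Each row is returned as a dict keyed by the header names.
    """
    with gzip.open(path, "rt", newline="") as handle:
        # QUOTE_NONE keeps any quote characters in the data as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)
        rows = [dict(zip(header, record)) for record in islice(reader, max_rows)]
    return header, rows
```

Calling `read_tsv_gz("mutRate_GRCh38_masterFile.tsv.gz")` on a downloaded copy returns the column names and a few rows, which is usually enough to confirm that the table matches the column descriptions before loading it in full.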

Files

Files (2.7 GB)

MD5 checksum Size
md5:0c052e2e868111f0f7505992aa2ca22b 58.2 MB
md5:86b7b5caf708f8ec58e9a0c69ef92692 80.0 MB
md5:6e228be6ef02122f13e170e2efd34659 248.2 MB
md5:a89d973e951d61d92e6f9a1ed56d756d 3.3 kB
md5:3e076399a34ee221563f215068c34a26 2.7 kB
md5:3f9f24a270a99616558376ce43d6318d 100.2 kB
md5:1f684d1c6efcef23ac826f644253cf24 5.3 kB
md5:d79cc33298a815b1a6a0597465149a0f 124.0 kB
md5:1bced7f4b12c3f0b48c03ceef75b5a83 75.5 MB
md5:09fddf7a584f2dd7859ead84a545b0d6 83.6 MB
md5:54f07072870570ed92bdbb8f89fafb7f 80.7 MB
md5:da15e2a94f8a86d1da448bc7482ff43a 554.4 MB
md5:2b350a7b70b417f515fc318a1e3850bb 89.4 MB
md5:596a06fc3d7852d82a907d4c6a67d13f 610.9 MB
md5:f0afc5956826838564c8280c1af9c208 39.7 MB
md5:36163673b27cb655267359fde1f6ea2f 322.6 MB
md5:7644aee08b2e5d7451875e7bf48a1c92 46.6 MB
md5:46bccacd02fe82261be51f84af5d841e 360.9 MB
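
Because several of the files are hundreds of megabytes, it is worth confirming each download against its listed MD5 checksum. The sketch below uses Python's standard `hashlib` and reads the file in chunks so that large archives do not need to fit in memory. The function names are illustrative, and because the listing here does not pair checksums with file names, the expected value has to be taken from the record page for the specific file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:0c05...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listing(local_path, "md5:a89d973e951d61d92e6f9a1ed56d756d")` returns `True` only when the local file is byte-identical to the file that carries that checksum on the record page.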

Additional details

Funding

National Human Genome Research Institute
Computational methods for detecting and genotyping human genetic variation using single-molecule sequencing and high-throughput short read data (R01HG010756)