There is a newer version of the record available.

Published May 17, 2020 | Version 1.2.0
Dataset Open

Global consensus map and dynamics of human transcription factor footprints

  • 1. Altius Institute for Biomedical Sciences

Description

Data associated with the publication "Global reference mapping and dynamics of human transcription factor footprints". https://doi.org/10.1101/2020.01.31.927798

These data describe digital genomic footprints derived from 243 human biosamples.

Please see http://vierstra.org/resources/dgf for additional information and a complete set of raw DNase I data.

Metadata with biosample annotation and ENCODE project accessions in Excel format (Extended Data Table 1):

  • Consensus_footprints_metadata.xlsx

Motif clustering metadata in Excel format (Extended Data Table 2):

  • Motif_clustering_metadata.xlsx

Raw footprint calls within individual datasets at an FDR cutoff of 0.01 (additional levels of FDR thresholding can be made available by request). The following tarball contains 243 BED-formatted files, each corresponding to an individual dataset (Extended Data File 1):

  • Footprints_per_sample.0q01.tar.gz
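
As a starting point for working with the per-sample tarball, the sketch below counts intervals in each archive member without unpacking it to disk. It assumes only that the first three tab-delimited columns follow the standard BED layout (chromosome, start, end) and that the members are stored as uncompressed text. The internal layout of the archive is not described here, so both points are assumptions, and the function names are illustrative.

```python
import io
import tarfile


def iter_bed_intervals(lines):
    """Yield (chrom, start, end) from an iterable of BED text lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2])


def count_footprints_per_sample(tar_path):
    """Map each regular file in a .tar.gz archive to its number of BED intervals.

    Assumption: members are plain-text BED files, not individually gzipped.
    """
    counts = {}
    with tarfile.open(tar_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            with archive.extractfile(member) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8")
                counts[member.name] = sum(1 for _ in iter_bed_intervals(text))
    return counts
```

Calling `count_footprints_per_sample("Footprints_per_sample.0q01.tar.gz")` on a downloaded copy should return one entry per archive member if those assumptions hold. If the members turn out to be gzip-compressed, each stream would need to be wrapped with `gzip.open` first.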

Tab-delimited files with consensus footprint coordinates and overlaps with matches to motif model clusters in human genome build GRCh38/hg38. The legend file contains column definitions in detail (Extended Data File 2).

  • Consensus_footprints_and_motifs_hg38.bed.gz
  • Consensus_footprints_and_motifs_legend.txt

BED-formatted file containing the coordinates for clustered motif model matches that overlap consensus footprints (Extended Data File 3):

  • Collapsed_motifs_overlaping_consensus_footprints.bed.gz
  • Collapsed_motifs_overlaping_consensus_footprints_legend.txt
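
For readers who want to recompute overlaps themselves, recall that BED coordinates are zero-based and half-open, so two intervals that merely touch (one ends where the next starts) share no bases. The sketch below illustrates that convention with hypothetical helper names and made-up coordinates. It is not the procedure used to build the file above.

```python
def overlap_length(start_a, end_a, start_b, end_b):
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def motifs_overlapping(footprint, motifs):
    """Return the motif intervals sharing at least one base with a footprint.

    Each interval is a (chrom, start, end) tuple in BED coordinates.
    """
    chrom, start, end = footprint
    return [
        m for m in motifs
        if m[0] == chrom and overlap_length(start, end, m[1], m[2]) > 0
    ]


# Illustrative coordinates only; they are not taken from the dataset.
footprint = ("chr1", 100, 120)
motifs = [("chr1", 90, 101), ("chr1", 120, 130), ("chr2", 100, 120)]
hits = motifs_overlapping(footprint, motifs)
```

Here only the first motif is a hit: the second begins exactly where the footprint ends, and the third lies on a different chromosome.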

Footprint occupancy matrix of index footprints (rows) vs. biosamples (columns). Rows are in the same order as in the consensus footprint file, and columns are in the same order as in the metadata files (Extended Data Files 4 and 5).

  • Consensus_footprints_and_motifs_matrix_full_hg38.txt.gz (full matrix; values are -log(1 - posterior), where posterior is the posterior footprint probability)
  • Consensus_footprints_and_motifs_matrix_binary_hg38.txt.gz (binary occupancy matrix, where footprints with posterior footprint probability >0.99 are considered occupied)
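
The relationship between the two matrices can be sketched as follows. A full-matrix value s = -log(1 - p) inverts to p = 1 - base^(-s), so the rule "posterior > 0.99" becomes a cutoff of s > -log(0.01). The base of the logarithm is not stated here: the code assumes the natural log by default (cutoff about 4.605) and exposes a `base` parameter for the base-10 case (cutoff 2). The function names are illustrative, and this is not necessarily how the binary matrix was produced.

```python
import math


def posterior_from_score(score, base=math.e):
    """Invert score = -log_base(1 - posterior) to recover the posterior."""
    return 1.0 - base ** (-score)


def score_threshold(posterior_cutoff=0.99, base=math.e):
    """Score that corresponds to a posterior cutoff under the same transform."""
    return -math.log(1.0 - posterior_cutoff, base)


def binarize_row(scores, posterior_cutoff=0.99, base=math.e):
    """Return 1 where the implied posterior exceeds the cutoff, else 0."""
    cutoff = score_threshold(posterior_cutoff, base)
    return [1 if s > cutoff else 0 for s in scores]


# Illustrative scores only; they are not taken from the dataset.
row = [0.0, 1.0, 5.0, 10.0]
occupied = binarize_row(row)
```

Comparing a few binarized rows of the full matrix against the corresponding rows of the binary matrix is a quick way to confirm which log base applies.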

Single nucleotide variants tested for allelic imbalance. The legend file contains column definitions in detail (Extended Data File 6).

  • De_novo_SNVs_tested_for_imbalance.bed.gz
  • De_novo_SNVs_tested_for_imbalance.legend.txt

Contact: Jeff Vierstra (jvierstra@altius.org)

Notes

This work was supported by NHGRI grant U54HG007010.

Files

Files (3.2 GB)

Checksum                               Size
md5:c9abbcb6eab93b7912c6a033325e66d1   427.6 MB
md5:cbff0b2a15545427e0077ff73311c4ea   872 Bytes
md5:2a8cd65dceb96fc0fc3a2a1346c1387a   140.7 MB
md5:15f3da13cd57217af1407e0271251ab6   65.5 MB
md5:0ee7b596e3ae34d175ef052edbe21004   1.5 GB
md5:4b1120c21e68c5a60e316c8b9f15c942   42.4 kB
md5:a8deccf73a5d4ee45bf03cfa8aed37de   35.3 MB
md5:b0eec98b2909f5bc6f88d947dd01333d   958 Bytes
md5:882afc8dce3963aed668e9df1df353d8   1.1 GB
md5:39a8a566036f6c0b1d97daca0cbc02b3   143.9 kB
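
A downloaded file can be checked against the MD5 values listed above with a few lines of standard-library Python. This is a minimal sketch with illustrative function names. It reads the file in chunks so that the multi-gigabyte archives do not have to fit in memory. The listing here does not pair checksums with file names, so the caller supplies the expected value.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:<hex digest from the list>")` returns `True` only if the local copy is byte-identical to the deposited file.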

Additional details

References

  • Vierstra et al. (2020). Global reference mapping and dynamics of human transcription factor footprints. bioRxiv. https://doi.org/10.1101/2020.01.31.927798