Global consensus map of human transcription factor footprints
Description
Vierstra, J. et al. Global reference mapping of human transcription factor footprints. Nature 583, 729–736 (2020). https://doi.org/10.1038/s41586-020-2528-x
Preprint @ bioRxiv: https://doi.org/10.1101/2020.01.31.927798
Contact: Jeff Vierstra (jvierstra@altius.org)
Genomic DNase I footprinting enables quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin. We combined sampling of >67 billion uniquely mapping DNase I cleavages from >240 human cell types and states to index, with unprecedented accuracy and resolution, human genomic footprints and thereby the sequence elements that encode transcription factor recognition sites.
Please see http://vierstra.org/resources/dgf for additional information and a complete set of raw DNase I data for individual datasets. Raw data can also be accessed via the ENCODE data portal (http://encodeproject.org) using the dataset accessions listed in Supplementary Table 1.
Code for footprint analysis and tutorials on how to access and manipulate digital genomic footprint data can be found at https://footprint-tools.readthedocs.io/en/latest/.
All files herein correspond to human genome build version GRCh38 (UCSC hg38).
Dataset contents:
- Biosample metadata – Supplementary_Table_1.xlsx
- Motif clustering metadata – Supplementary_Table_2.xlsx
- ChIP-seq validation metadata – Supplementary_Table_3.xlsx
- Consensus footprint coordinates and assigned motif archetypes
  TSV file (BED-format) with consensus footprint (posterior probability > 0.99) coordinates and overlaps with matches to motif model clusters. The legend file contains column definitions in detail.
  - consensus_footprints_and_motifs_hg38.bed.gz
  - consensus_footprints_and_motifs_legend.txt
- Motif archetype matches overlapping consensus footprints
  TSV file (BED-format) containing the coordinates for clustered motif model matches that overlap consensus footprints.
  - collapsed_motifs_overlaping_consensus_footprints.bed.gz
  - collapsed_motifs_overlaping_consensus_footprints_legend.txt
- Footprint occupancy matrix of consensus footprints
  Rows are in the same order as in the consensus footprint file, and columns are in the same order as in the metadata files.
  - consensus_index_matrix_full_hg38.txt.gz (values are –log(1-posterior))
  - consensus_index_matrix_binary_hg38.txt.gz (binary occupancy matrix, where footprints with posterior footprint probability > 0.99 are considered occupied)
- Single nucleotide variants tested for allelic imbalance
  The legend file contains column definitions in detail.
  - genotypes.vcf.gz - Genotyping and allelic read depth for each biosample (see header for more information)
  - tested_snvs_padj.bed.gz - SNVs tested for imbalance (TSV, BED-format)
  - tested_snvs_padj_legend.txt
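The full occupancy matrix listed above stores –log(1-posterior) rather than the posterior itself. As a minimal sketch, the values could be converted back to posterior probabilities and thresholded at 0.99 as shown here. Two things are assumptions, not facts stated in this description: that the logarithm is natural, and that the matrix is whitespace-delimited numeric text with no header line. The function names are illustrative and are not part of footprint-tools.

```python
import gzip
import math
from typing import Iterator, List

# Threshold quoted in the dataset description for calling a footprint occupied.
POSTERIOR_THRESHOLD = 0.99


def score_to_posterior(score: float) -> float:
    """Invert score = -log(1 - posterior). Assumes a natural logarithm."""
    return 1.0 - math.exp(-score)


def posterior_to_score(posterior: float) -> float:
    """Forward transform: -log(1 - posterior). Assumes a natural logarithm."""
    return -math.log(1.0 - posterior)


def binarize_scores(scores: List[float],
                    threshold: float = POSTERIOR_THRESHOLD) -> List[int]:
    """Return 1 where the implied posterior exceeds the threshold, else 0."""
    cutoff = posterior_to_score(threshold)
    return [1 if s > cutoff else 0 for s in scores]


def iter_matrix_rows(path: str) -> Iterator[List[float]]:
    """Stream rows from a gzipped text matrix without loading it all at once.

    Assumption: whitespace-delimited numeric fields and no header line.
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                yield [float(x) for x in fields]
```

Under the natural-log assumption, a posterior above 0.99 corresponds to a stored value above –ln(0.01) ≈ 4.605. Streaming row by row keeps memory use low, which may matter given the size of the full matrix file.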
Files
Total size: 2.9 GB
Checksum | Size |
---|---|
md5:c9abbcb6eab93b7912c6a033325e66d1 | 427.6 MB |
md5:971d028d1649dd7d7c4320f95fd54dd5 | 799 Bytes |
md5:1ab941adcda42e7b54cd93da89dd9723 | 584.0 MB |
md5:35adfcbdf0e4d4fb6b2a15e2ad6deee4 | 1.3 kB |
md5:15f3da13cd57217af1407e0271251ab6 | 65.5 MB |
md5:099f7fc5a080afde2002646c8bfa3e7d | 1.4 GB |
md5:52096331fbfec008a1a198f888e81ca5 | 354.8 MB |
md5:920e1eee2547cef2c4f235ae8eaba3ff | 42.4 kB |
md5:920e1eee2547cef2c4f235ae8eaba3ff | 42.4 kB |
md5:920e1eee2547cef2c4f235ae8eaba3ff | 42.4 kB |
md5:a8deccf73a5d4ee45bf03cfa8aed37de | 35.3 MB |
md5:29db108b5dd59232e8e7ad3fa77d0634 | 899 Bytes |
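The MD5 checksums in the file listing can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and the pairing of each checksum with its file name has to be taken from the record page, since the listing here does not show it.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed_checksum: str) -> bool:
    """Compare a file against a listed value such as 'md5:c9ab...'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks avoids holding multi-gigabyte files in memory. A call might look like `matches_listing("some_download.bed.gz", "md5:<value from the listing>")`, where both arguments are placeholders.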
Additional details
References
- Vierstra, J. et al. Global reference mapping and dynamics of human transcription factor footprints. bioRxiv (2020). https://doi.org/10.1101/2020.01.31.927798