Suplemental_methods.docx, Supplemental_tables.xlsx
------------------------------------------------------------------
- Main supplemental methods/tables for the manuscript.

singularity_container.sif, singularity_container_definition.txt:
------------------------------------------------------------------
- Singularity container and definition files used for training CNN models with Keras.

models_trained.tar.gz:
------------------------------------------------------------------
- CNN_models/
    Trained CNN models.
- lsgkm_models/
    LS-GKM SVM models trained on DNA sequences.
- feature_models/
    Models trained on sequence features (CG, GC, CpG, CGI).
    SVM model files are SVM.*
    Logistic regression files are LogReg.*

supplemental_files.tar.gz:
------------------------------------------------------------------
- ENCFF050EKS.bedpe.hg19.gz
    ENCODE Hi-C chromatin loop file for HepG2.
- FitHiChIP.interactions_FitHiC_Q0.0001.hg19.bed.gz
    Long-range chromatin contacts generated using the FitHiChIP program for HepG2.
- GWAS_enrichments.p_adj.txt.gz, GWAS_enrichments.txt.gz
    GWAS traits enriched in HepG2 HOT loci, based on adjusted and unadjusted p-values, respectively.
- HOTs_with_common_tfs/
    HOT loci generated by applying our definition using only DAPs for which ChIP-seq datasets are available in all three cell lines (HepG2, K562 and H1).
- HOTs_with_subsampled_to_H1/
    HOT loci generated by applying our definition using DAPs randomly subsampled to the number of H1 DAPs.
- HepG2_enhancers_DHS_H3K27ac.bed.gz, HepG2_enhancers_DHS_H3K27ac.bed.vertebrate.phastcons.gz, K562_enhancers_DHS_H3K27ac.bed.gz
    Enhancer regions in HepG2 and K562, defined using DHS regions and H3K27ac, and their phastCons conservation scores. Referred to as "regular enhancers" in the text.
- HepG2_superenhancers.bed.gz
    Super-enhancer regions in HepG2.
- Housekeeping_GenesHuman.csv
    List of housekeeping genes.
- TF_binwise_signal_values.txt
    ChIP-seq signal values in loci binned by number of bound DAPs.
- chipseq_signal_values_by_tfs.csv.gz
    ChIP-seq signal values of individual DAPs in HOT loci.
- Tau_gene_V8.csv.gz
    Tau metric of genes, measuring the tissue specificity of their expression.
- all_dhs_merged.bed.gz
    Globally merged DHS regions of all available ENCODE DHS datasets. Used as a background when necessary.
- classification_results.txt
    AUC values of classification experiments.
- hg19_files/
    Various annotation files for the hg19 genome (promoters, exons, CpG islands, etc.) downloaded from the UCSC Genome Browser database.
- human_transcription_factors.txt
    Catalogue of transcription factor annotations, used for analyzing sequence-specific and non-sequence-specific DAPs.
- metadata_HepG2_K569_H1.txt
    Metadata of ENCODE datasets used in the study.
- peaks/
    ChIP-seq peak files lifted over to hg19.
- singularity_container.sif, singularity_container_definition.txt
    Singularity container (and definition file) used for running deep learning models.
- tf_stats_summary_table.txt
    Summary statistics of individual DAPs' involvement in HOT loci.
- variant_density_values.txt
    Enrichment values of various types of variants (GWAS, eQTLs, ClinVar, raQTLs, caQTLs) in HOT loci and other compared regions (DHS, enhancers, promoters).
- windows_400_hg19.bed.gz
    File with all the tiled 400bp regions of the hg19 genome used in the study.
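
Reading the BED files:
Several of the files listed above are gzip-compressed BED files (for example windows_400_hg19.bed.gz and all_dhs_merged.bed.gz). The Python sketch below shows one way to iterate over such a file. It assumes the standard BED layout, in which the first three tab-separated columns are chromosome, start and end; any extra columns are ignored, and the exact column layout of each file has not been verified here. The function name read_bed is a hypothetical helper, not part of the distributed code.

```python
import gzip
from typing import Iterator, Tuple


def read_bed(path: str) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) from a plain or gzip-compressed BED file.

    Assumption: columns 1-3 are chrom, start, end (standard BED);
    additional columns, if present, are ignored.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and common BED header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            yield fields[0], int(fields[1]), int(fields[2])
```

For example, `sum(1 for _ in read_bed("windows_400_hg19.bed.gz"))` would count the regions in the tiled-windows file once the archive has been extracted.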