Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2
Creators
- 1. Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- 2. Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA 92037, USA
Description
Fit-Hi-C is a software tool that computes statistical confidence estimates for Hi-C contact maps in order to identify significant chromatin contacts. By fitting a monotonically non-increasing spline, Fit-Hi-C captures the relationship between genomic distance and contact probability without any parametric assumption. The spline fit, together with the correction of contact probabilities for bin- or locus-specific biases, accounts for previously characterized covariates that affect Hi-C contact counts. Fit-Hi-C is best applied to the study of mid-range (e.g., 20 kb to 2 Mb for the human genome) intra-chromosomal contacts. However, with the latest reimplementation, named FitHiC2, it is possible to perform genome-wide analysis of high-resolution Hi-C data, including all intra-chromosomal distances and inter-chromosomal contacts. FitHiC2 also offers a merging filter module, which eliminates indirect/bystander interactions, leading to a significant reduction in the number of reported contacts without sacrificing recovery of key loops such as those between convergent CTCF binding sites. Here we describe how to apply the FitHiC2 protocol to three use cases: (i) 5 kb resolution Hi-C data of chromosome 5 from GM12878 (a human lymphoblastoid cell line), (ii) 40 kb resolution whole-genome Hi-C data from IMR90 (a human lung fibroblast cell line), and (iii) budding yeast whole-genome Hi-C data at single restriction cut site (EcoRI) resolution. The procedure takes ~10 h when all use cases are run sequentially (~4 h when run in parallel). With the recent improvements in its implementation, FitHiC2 (8 processors and 16 GB RAM) is also scalable to genome-wide analysis of the highest-resolution (1 kb) Hi-C data available to date (~48 h with 32 GB peak memory). FitHiC2 is available through Bioconda, GitHub and the Python Package Index.
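The idea described above (a non-increasing distance-to-probability fit, a bias correction, and a confidence estimate per contact) can be illustrated with a small sketch. This is not the FitHiC2 implementation: it assumes a weighted pool-adjacent-violators step fit in place of the spline, assumes a binomial upper-tail test for the p-value, and uses made-up toy numbers. All function names (`fit_non_increasing`, `binomial_upper_tail`, `contact_p_value`) are hypothetical.

```python
import math


def fit_non_increasing(values, weights):
    """Weighted least-squares fit constrained to be non-increasing.

    Assumption: pool-adjacent-violators stands in for the spline here.
    """
    blocks = []  # each block holds [mean, total_weight, n_points]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge while an earlier block is lower than the one after it.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0 if k <= n else 0.0
    tail = 0.0
    for i in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        tail += math.exp(log_term)
    return min(1.0, tail)


def contact_p_value(count, total_contacts, prior, bias_i=1.0, bias_j=1.0):
    """P-value of one locus pair; the prior is scaled by the two bin biases."""
    p = min(1.0, prior * bias_i * bias_j)
    return binomial_upper_tail(count, total_contacts, p)


# Toy input (made up): mean contact probability per genomic-distance bin.
distances = [20_000, 40_000, 60_000, 80_000]
observed = [0.010, 0.004, 0.005, 0.002]
pairs_per_bin = [1, 1, 1, 1]

fitted = fit_non_increasing(observed, pairs_per_bin)
prior_at = dict(zip(distances, fitted))

# A pair 60 kb apart observed 5 times among 200 total contacts.
p_value = contact_p_value(5, 200, prior_at[60_000])
```

In the toy input the 40 kb and 60 kb bins violate monotonicity, so the fit pools them to a shared value before the p-value is computed. A real analysis would additionally correct the resulting p-values for multiple testing.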
Files
Name | Size |
---|---|
Submitted_Data_FitHiC2.zip (md5:eb551652a4d293df3c7c218658da6f34) | 172.4 MB |
Additional details
Related works
- Is supplemented by: 10.24433/CO.5589539.v2 (DOI)