Datasets for predicting TF binding using Virtual ChIP-seq
Creators
- 1. Department of Medical Biophysics, University of Toronto
Description
This repository contains datasets necessary for using the Virtual ChIP-seq software.
Virtual ChIP-seq requires the following datasets to predict transcription factor binding:
- chipExpDir_AtoH_V1.0.0.tar.gz: Reference matrices of correlation between TF binding and gene expression, for TFs whose names start with the letters A-H.
- chipExpDir_ItoZ_V1.0.0.tar.gz: Reference matrices of correlation between TF binding and gene expression, for TFs whose names start with the letters I-Z.
- refTables_V1.1.0.tar.gz: PhastCons genomic conservation, FIMO PWM scores for JASPAR motifs, and ChIP-seq data from ENCODE and the Cistrome database.
- hg38_chrsize.tsv: Lengths of the chromosomes in hg38.
- trainedModels_V1.0.0.tar.gz: Virtual ChIP-seq trained scikit-learn models, saved in joblib format.
- <CellType>.tar.gz: Pre-calculated matrices suitable for training other algorithms or for re-training Virtual ChIP-seq.
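Once downloaded, the archives listed above need to be unpacked before use. The following is a minimal sketch using only the Python standard library, not part of Virtual ChIP-seq itself. The function name `extract_archive` and the destination directory `virchip_data` are hypothetical, and the internal layout of each archive is not assumed.

```python
import tarfile
from pathlib import Path

# Archive names taken from the list above; the destination directory is hypothetical.
ARCHIVES = [
    "chipExpDir_AtoH_V1.0.0.tar.gz",
    "chipExpDir_ItoZ_V1.0.0.tar.gz",
    "refTables_V1.1.0.tar.gz",
    "trainedModels_V1.0.0.tar.gz",
]


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse members that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions provide a built-in safety filter.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return [m.name for m in members]


# Unpack whichever archives are present in the current directory.
for name in ARCHIVES:
    if Path(name).exists():
        extracted = extract_archive(name, "virchip_data")
        print(f"{name}: {len(extracted)} entries extracted")
```

The largest archives are tens of gigabytes, so it is worth checking free disk space before extracting.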
Some predictive features of TF binding are identical across cell types and are stored together, for simplicity, in refTables_V1.1.0.tar.gz. To re-train the model, you can use the datasets from other cell types (named here as <CellType>.tar.gz). Each <CellType>.tar.gz file contains pre-calculated predictive features of transcription factor binding for four chromosomes (5, 10, 15, and 20).
These features include:
- PhastCons genomic conservation
- FIMO score for sequence motifs of the TF in the JASPAR database
- Chromatin accessibility
- TF binding in ENCODE + Cistrome DB datasets
- Virtual ChIP-seq expression score
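When working with the per-cell-type features, it can help to look up the lengths of the four chromosomes they cover (5, 10, 15, and 20). The sketch below assumes, without confirmation from the dataset, that hg38_chrsize.tsv has two tab-separated columns (chromosome name, then length) and that names carry a `chr` prefix. The function names are hypothetical, and the example runs on toy values rather than real chromosome lengths.

```python
import csv
import io

# Chromosomes covered by the <CellType>.tar.gz files; the "chr" prefix is an assumption.
CELLTYPE_CHROMS = ("chr5", "chr10", "chr15", "chr20")


def read_chrom_sizes(handle):
    """Parse a two-column TSV (name, length) into a dict, skipping non-numeric rows."""
    sizes = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or not row[1].strip().isdigit():
            continue  # blank line or header row
        sizes[row[0]] = int(row[1])
    return sizes


def select_chroms(sizes, chroms=CELLTYPE_CHROMS):
    """Keep only the requested chromosomes that are present in sizes."""
    return {c: sizes[c] for c in chroms if c in sizes}


# Toy input standing in for hg38_chrsize.tsv (values are NOT real lengths).
toy = io.StringIO("chr1\t1000\nchr5\t500\nchr10\t400\nchrX\t300\n")
print(select_chroms(read_chrom_sizes(toy)))  # {'chr5': 500, 'chr10': 400}
```

For the real file, pass an open handle instead, for example `open("hg38_chrsize.tsv", newline="")`.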
Files
(127.7 GB)
MD5 checksum | Size |
---|---|
md5:b94ddbdd23c9c3f3bf19da32a1c81386 | 5.6 GB |
md5:8d32e610ddaaa3dc51c1160a4412854a | 576.8 MB |
md5:73cd1b537594d6c5292bcf565bed53b5 | 28.3 GB |
md5:1cab4d28f8b19590f0d54df34eabc5f7 | 28.6 GB |
md5:fd571b8c85e838c1e0376473ec64377f | 8.0 GB |
md5:1b58bcb51013ddc0af22f8e675348404 | 5.7 GB |
md5:77790b28dbe2d26a2b9d794dff8b0007 | 2.7 GB |
md5:a0865de52384e0bbc1469d122b41d12d | 5.2 GB |
md5:545563edd74c082b23d2560b8a71431f | 7.3 GB |
md5:50e491e8a8e9b0019a6c15c5d9a890ff | 365 Bytes |
md5:69bee0c45cdbb862b236c396cedb8a6e | 1.1 kB |
md5:501962ea3cb66864a67f500fde72ef63 | 9.7 GB |
md5:3f81f671584da330deabe726cf003ea4 | 1.9 GB |
md5:2db7b0208a0bd2b550051a9eea9c322b | 685.0 MB |
md5:2772ebf82e301b61cbf12b339d346416 | 8.8 GB |
md5:0031fd1d55911a5f47da2a7d4fc13d39 | 1.5 GB |
md5:d5eb27535bdd55ebbd654551fe36a3c7 | 1.9 GB |
md5:67d5243c981b7c2a97360b9495dd78ec | 6.4 GB |
md5:6844458b2fe22225b5f4118f84d52580 | 702.4 MB |
md5:78f721adbe6c570394cdd86c97c95b23 | 559.6 MB |
md5:47a5ae471179ddc52992df6129d3a947 | 136.9 MB |
md5:5ecabb174c6f6bc5eae70a6e31b18859 | 2.3 GB |
md5:4d2cc333adc6ccba713f4b4d0253f4e1 | 987.2 MB |
md5:e3a24d6c6e0428f4036ed58ca6f9735f | 70.3 MB |
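Because several of these files are tens of gigabytes, it is sensible to check each download against its MD5 checksum. The sketch below uses only `hashlib` from the Python standard library. The helper names `md5_of_file` and `verify_md5` are hypothetical, and which checksum belongs to which file has to be read from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches expected (with or without 'md5:' prefix)."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call would look like `verify_md5("hg38_chrsize.tsv", "md5:...")`, with the checksum copied from the matching row of the file listing. Reading in chunks keeps memory use constant regardless of file size.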