Multi-omics identification of transcribed enhancers
Description
A method for identifying transcribed enhancers (TEn) from a multi-omics data set. FANTOM5 enhancers that overlapped open chromatin regions (OCRs), together with the same number of bidirectionally transcribed regions (determined by CAGE tag density), were used as the positive set, and a comparable number of non-transcribed OCRs were used as the negative set. We collected tag counts from ATAC-seq, ChIP-seq, and strand-specific RNA-seq BAM files for the central 200 bp and the flanking 400 bp of each region in both the testing and training sets. To control for the potential effect of nearby gene expression, we also annotated each region's position relative to genes as upstream (5 kb upstream of the gene 5′ end), downstream (5 kb downstream of the gene 3′ end), intronic, or intergenic (all others). The BAM counts and genomic annotation of the training set were used as the input to a random forest model, with parameters fine-tuned by a grid search with 10-fold cross-validation.
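The training step described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code shipped with this record: the data are synthetic, the six count columns, the parameter grid, the ROC AUC scoring, and the helper names `build_features` and `fit_forest` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# The four genomic annotation classes named in the description.
ANNOTATIONS = ["upstream", "downstream", "intronic", "intergenic"]


def build_features(counts, annotations):
    """Concatenate read-count columns with a one-hot genomic annotation."""
    onehot = np.zeros((len(annotations), len(ANNOTATIONS)))
    for row, label in enumerate(annotations):
        onehot[row, ANNOTATIONS.index(label)] = 1.0
    return np.hstack([np.asarray(counts, dtype=float), onehot])


def fit_forest(X, y, param_grid, seed=0):
    """Tune a random forest with a 10-fold cross-validated grid search."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        cv=cv,
        scoring="roc_auc",  # assumed metric; the description does not name one
    )
    search.fit(X, y)
    return search


# Synthetic stand-in for the real training set (assumption): 100 positive and
# 100 negative regions, each with six hypothetical count columns, e.g. centre
# and flank counts for three assays.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([1, 0], n // 2)
counts = rng.poisson(lam=np.where(y[:, None] == 1, 8.0, 4.0), size=(n, 6))
annotations = rng.choice(ANNOTATIONS, size=n)

X = build_features(counts, annotations)
search = fit_forest(X, y, {"n_estimators": [25, 50], "max_depth": [3, None]})
print(search.best_params_)
```

With real data, `counts` would hold the per-region tag counts extracted from the BAM files, and the grid would likely cover more parameters than the two shown here.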
Files
| Name | Size |
|---|---|
| md5:ddf2b6da99c6773e0f89bfaac4f6c3f6 | 11.0 kB |