Published May 30, 2019 | Version v1
Journal article Open

MTTFsite: cross-cell type TF binding site prediction by using multi-task learning

  • 1. Harbin Institute of Technology
  • 2. Hong Kong Polytechnic University
  • 3. University of Warwick

Description

In this paper, a multi-task learning framework called MTTFsite is proposed to address the lack of labeled data by leveraging labeled data available across cell types. MTTFsite contains a shared CNN that learns features common to all cell types and a private CNN for each cell type that learns cell-type-specific features. The common features are intended to help predict transcription factor binding sites (TFBSs) for all cell types, especially those that lack labeled data. MTTFsite is evaluated on 241 cell type-TF pairs and compared with a baseline method that uses no multi-task learning and with a fully shared multi-task model that uses only a shared CNN and no private CNNs. For cell types with insufficient labeled data, results show that MTTFsite performs better than the baseline method and the fully shared model on more than 89% of pairs. For cell types without any labeled data, MTTFsite outperforms the baseline method and the fully shared model on more than 80% and 93% of pairs, respectively. A novel gene expression prediction method called TFChrome, which uses both MTTFsite and histone modification features, is also presented. Results show that TFBSs predicted by MTTFsite alone can achieve good performance. When MTTFsite is combined with histone modification features, a significant 5.7% performance improvement is obtained.
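The shared-plus-private arrangement described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the filter count, filter width, random weights, one-hot encoding, global max pooling, and the cell type names used in the example are all assumptions chosen for illustration, and training is omitted entirely. The only point shown is that every cell type reuses one shared set of convolutional filters while keeping its own private filters and output layer.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (unknown bases become all zeros)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def init_filters(n_filters, width, rng):
    """Random filters shaped [n_filters][width][4]; stand-ins for trained weights."""
    return [[[rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(width)]
            for _ in range(n_filters)]

def conv_max_pool(x, filters):
    """1D convolution, ReLU, then global max pooling: one feature per filter."""
    feats = []
    for f in filters:
        width = len(f)
        best = 0.0  # starting at 0 acts as the ReLU floor
        for start in range(len(x) - width + 1):
            score = sum(x[start + i][c] * f[i][c]
                        for i in range(width) for c in range(4))
            best = max(best, score)
        feats.append(best)
    return feats

class SharedPrivateModel:
    """Hypothetical shared-private model: one shared CNN, one private CNN per cell type."""

    def __init__(self, cell_types, n_filters=8, width=6, seed=0):
        rng = random.Random(seed)
        # Shared filters are used for every cell type.
        self.shared = init_filters(n_filters, width, rng)
        # Each cell type gets its own private filters and output weights.
        self.private = {ct: init_filters(n_filters, width, rng) for ct in cell_types}
        self.out_w = {ct: [rng.gauss(0.0, 0.1) for _ in range(2 * n_filters)]
                      for ct in cell_types}

    def features(self, seq, cell_type):
        """Concatenate shared features with the cell type's private features."""
        x = one_hot(seq)
        return conv_max_pool(x, self.shared) + conv_max_pool(x, self.private[cell_type])

    def predict(self, seq, cell_type):
        """Binding probability for one sequence in one cell type (sigmoid output)."""
        feats = self.features(seq, cell_type)
        logit = sum(w * f for w, f in zip(self.out_w[cell_type], feats))
        return 1.0 / (1.0 + math.exp(-logit))

# Example with two illustrative cell type names (assumed, not from the abstract).
model = SharedPrivateModel(["GM12878", "K562"])
prob = model.predict("ACGTACGTTTGACGTCAACG", "K562")
```

In a trained model of this kind, gradients from every cell type would update the shared filters, which is how a cell type with little or no labeled data could still benefit from the others; the fully shared variant mentioned above would correspond to dropping the private filters.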

Files

bioinformatics.pdf (598.8 kB)
md5:0b1ccfd3fa467ce1c25ef5b177388339

Additional details

Funding

European Commission
DeepPatient – Deep Understanding of Patient Experience of Healthcare from Social Media (grant no. 794196)