Published February 24, 2020 | Version v1
Journal article Open

Prediction of new associations between ncRNAs and diseases exploiting multi-type hierarchical clustering

  • 1. University of Bari Aldo Moro - Department of Computer Science, Via Orabona, 4, Bari, 70125, Italy
  • 2. CNR, Institute for Biomedical Technologies, Bari, 70126, Italy

Description

Background: The study of functional associations between ncRNAs and human diseases is a pivotal task of modern research aimed at developing new and more effective therapeutic approaches. Nevertheless, it is not a trivial task, since it involves entities of different types, such as microRNAs, lncRNAs or target genes, whose expression also depends on endogenous or exogenous factors. Such complexity can be addressed by representing the involved biological entities and their relationships as a network and by exploiting network-based computational approaches able to identify new associations. However, existing methods are limited to homogeneous networks (i.e., consisting of only one type of object and relationship) or can exploit only a small subset of the features of biological entities, such as the presence of a particular binding domain, enzymatic properties or their involvement in specific diseases.

Results: To overcome the limitations of existing approaches, we propose the system LP-HCLUS, which exploits a multi-type hierarchical clustering method to predict possibly unknown ncRNA-disease relationships. In particular, LP-HCLUS analyzes heterogeneous networks consisting of several types of objects and relationships, each possibly described by a set of features, and extracts multi-type clusters that are subsequently exploited to predict new ncRNA-disease associations. The extracted clusters are overlapping, hierarchically organized, and involve entities of different types; they allow LP-HCLUS to capture multiple roles of ncRNAs in diseases at different levels of granularity. Our experimental evaluation, performed on heterogeneous attributed networks consisting of microRNAs, lncRNAs, diseases, genes and their known relationships, shows that LP-HCLUS obtains better results than existing approaches. The biological relevance of the obtained results was evaluated according to both quantitative criteria (i.e., TPR@k and the areas under the TPR@k, ROC and Precision-Recall curves) and qualitative criteria (i.e., consultation of the existing literature).
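
As an illustration of the TPR@k criterion mentioned above, the following is a minimal sketch, not the authors' evaluation code. It assumes TPR@k is computed as the fraction of known (held-out) ncRNA-disease associations that appear among the k highest-ranked predictions. The function name tpr_at_k and all identifiers in the toy data are hypothetical placeholders, not results from the study.

```python
from typing import List, Set, Tuple

# Assumption: an association is an (ncRNA identifier, disease identifier) pair.
Pair = Tuple[str, str]


def tpr_at_k(ranked: List[Pair], known: Set[Pair], k: int) -> float:
    """Fraction of known associations found among the top-k ranked predictions.

    `ranked` is assumed to be sorted by decreasing prediction score.
    """
    if not known:
        raise ValueError("known must contain at least one association")
    top_k = set(ranked[:max(k, 0)])  # a set avoids counting duplicates twice
    return len(top_k & known) / len(known)


# Hypothetical toy data (placeholder identifiers only).
ranked = [
    ("ncRNA_A", "disease_X"),
    ("ncRNA_B", "disease_Y"),
    ("ncRNA_C", "disease_Z"),
    ("ncRNA_D", "disease_X"),
]
known = {("ncRNA_A", "disease_X"), ("ncRNA_C", "disease_Z")}

# TPR@k for k = 1..4; plotting this list against k gives a TPR@k curve.
curve = [tpr_at_k(ranked, known, k) for k in range(1, len(ranked) + 1)]
print(curve)
```

Under this assumed definition, the area under the resulting curve summarizes how early in the ranking the known associations are recovered.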

Conclusions: The obtained results demonstrate the utility of LP-HCLUS for conducting robust predictive studies on the biological role of ncRNAs in human diseases. The produced predictions can therefore be reliably considered new, previously unknown relationships between ncRNAs and diseases.

Files

12859_2020_3392_MOESM1_ESM.pdf

Files (28.2 MB)

Checksum Size
md5:df5384e5a7023d04fa5d73f3e1c00fe0 139.8 kB
md5:872ee4186c2d59c57434a7b0dafca839 85.6 kB
md5:02befdbaddb0c40b70d30f019cbdf573 24.0 MB
md5:9487e4b1ec94c7838480261321e45c73 12.1 kB
md5:a41a96453a4dc0aeaea1dda58ed396f6 556.4 kB
md5:59b9a158a8a62f49393c2b79bfde1258 3.4 MB
md5:59b94017f6b0db1ec6481cf3ca097ea4 14.7 kB

Additional details

Funding

MAESTRA – Learning from Massive, Incompletely annotated, and Structured Data 612944
European Commission