Published January 23, 2018 | Version 2.0.0
Dataset Open

A comprehensive evaluation of module detection methods for gene expression data

  • 1. Ghent University - VIB

Description

A critical step in the analysis of large genome-wide gene expression datasets is the use of module detection methods to group genes into co-expression modules. Because of the limitations of classical clustering methods, numerous alternative module detection methods have been proposed, which improve upon clustering by handling co-expression in only a subset of samples, modeling the regulatory network, or allowing overlap between modules. In this study we use known regulatory networks to perform a comprehensive and robust evaluation of these different methods. Overall, decomposition methods outperform all other strategies, while we could not find any clear advantages of biclustering and network inference-based approaches. Using our evaluation workflow, we also investigate several practical aspects of module detection, such as parameter estimation and the use of alternative similarity measures, and conclude with recommendations for the further development of these methods.

Notes

Changes since V1:
- Added the baselines, which were missing in the first version
- Added a new E. coli dataset (https://github.com/SBRG/precise2/tree/b58057a42cc620985ce92df7ff51cfdb9260860c/data/precise2)
- Added a much-needed update of the RegulonDB database (RegulonDB 10.9, 06/29/2021)
- Added results from the top methods

Files

data.zip

Files (1.8 GB)

Checksum (MD5)  Size
md5:f516d53b13a4d359c1a5cdc14438e705  1.3 GB
md5:4c11cf0dec6fc44cf918736899864ca3  520.4 MB
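
The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch of such a check, not part of the original record: the function names `md5_of_file` and `matches_checksum` are hypothetical, and the local path in the usage comment is an assumption, since the listing does not preserve which file each checksum belongs to.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:<hex>' or '<hex>'."""
    # Drop an optional 'md5:' prefix and normalize case before comparing.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example usage (hypothetical local path; substitute the file you downloaded
# and the checksum shown next to it on the record page):
#   ok = matches_checksum("data.zip", "md5:<checksum from the listing>")
#   print("checksum ok" if ok else "checksum mismatch, re-download the file")
```

Reading in chunks keeps memory use flat for gigabyte-sized archives such as the ones listed here.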