Published January 7, 2015 | Version v1
Dataset Open

Data from: Machine learning-based differential network analysis: a study of stress-responsive transcriptomes in Arabidopsis thaliana

  • 1. University of Arizona

Description

Machine learning (ML) is a data mining technique that builds a prediction model by learning from prior knowledge in order to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA), and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. mlDNA first used an ML-based filtering process to remove "noninformative" genes (nonexpressed, constitutively expressed, or non-stress-responsive) prior to network construction, by learning the patterns of 32 expression characteristics of known stress-related genes. The retained "informative" genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes.
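The network-comparison idea in the description can be illustrated with a toy sketch. This is not the mlDNA implementation (mlDNA is an R package and uses ML models over 33 topological characteristics); the sketch below assumes a simple Pearson-correlation network with a hypothetical cutoff of 0.8, and compares only one topological feature, node degree, between a control and a stress network. All gene names and expression values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / sqrt(vx * vy)


def coexpression_edges(expr, threshold=0.8):
    """Gene pairs whose absolute correlation meets the (assumed) threshold."""
    return {
        frozenset((g1, g2))
        for g1, g2 in combinations(sorted(expr), 2)
        if abs(pearson(expr[g1], expr[g2])) >= threshold
    }


def degree(edges, genes):
    """Number of network neighbours for each gene."""
    deg = {g: 0 for g in genes}
    for edge in edges:
        for g in edge:
            deg[g] += 1
    return deg


def degree_change(control, stress, threshold=0.8):
    """Stress-minus-control degree for genes measured in both conditions."""
    genes = sorted(set(control) & set(stress))
    ctrl = degree(coexpression_edges({g: control[g] for g in genes}, threshold), genes)
    strs = degree(coexpression_edges({g: stress[g] for g in genes}, threshold), genes)
    return {g: strs[g] - ctrl[g] for g in genes}


# Invented toy expression profiles (four samples per condition).
control = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 1, 3, 2]}
stress = {"geneA": [1, 2, 3, 4], "geneB": [8, 1, 5, 2], "geneC": [2, 4, 6, 9]}

print(degree_change(control, stress))
```

In this toy data, geneB loses its link to geneA under stress while geneC gains one, so their degrees shift in opposite directions. A feature table of such per-gene differences is the kind of input a classifier could be trained on, though which features and models mlDNA actually uses should be taken from the package and the cited paper.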

Files (27.6 MB)

MD5 checksum                              Size
md5:0f64146e22e021caaf7686710ce6dd58      118.8 kB
md5:449629669cc716952253943e18c0cee5      5.0 MB
md5:3807a6b83be494a4e37d1c8796db91ea      21.3 MB
md5:5558e6aed4fc418a8c9d44ae04d4cb32      111.6 kB
md5:c96910e4df0a231873cab4ad3fe3c2c9      239.1 kB
md5:b9eeff79c6eb0c1a52432004917f6a67      104.4 kB
md5:7781ba097239b907fc59a131c82aa57d      781.8 kB
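The md5: values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard hashlib module; the helper names and the example path are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:0f64...dd58'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example call (hypothetical local path; substitute the file you downloaded):
# matches_listing("downloads/some_file", "md5:0f64146e22e021caaf7686710ce6dd58")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy. MD5 is adequate for detecting accidental corruption but should not be relied on as a security guarantee.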

Additional details

Related works

Is cited by
10.1105/tpc.113.121913 (DOI)