Published January 10, 2024 | Version v1
Dataset Open

Aggregation of recount3 RNA-seq data improves inference of consensus and tissue-specific gene co-expression networks

  • 1. ROR icon Johns Hopkins University School of Medicine

Description

Data and Inferred Networks accompanying the manuscript entitled - “Aggregation of recount3 RNA-seq data improves the inference of consensus and context-specific gene co-expression networks” 

Authors: Prashanthi Ravichandran, Princy Parsana, Rebecca Keener, Kaspar Hansen, Alexis Battle 

Affiliations: Johns Hopkins University School of Medicine, Johns Hopkins University Department of Computer Science, Johns Hopkins University Bloomberg School of Public Health

Description: 

This folder includes data produced in the analysis contained in the manuscript and inferred consensus and context-specific networks from graphical lasso and WGCNA with varying numbers of edges. Contents include:

  • all_metadata.rds: File including meta-data columns of study accession ID, sample ID, assigned tissue category, cancer status and disease status obtained through manual curation for the 95,484 RNA-seq samples used in the study. 

  • all_counts.rds: log2 transformed RPKM normalized read counts for 5999 genes and 95,484 RNA-seq samples which was utilized for dimensionality reduction and data exploration 

  • precision_matrices.zip: Zipped folder including networks inferred by graphical lasso for different experiments presented in the paper using weighted covariance aggregation following PC correction.

    • The networks can be found as follows. First, select the folder corresponding to the network of interest - for example, Blood, this will then include two or more folders which indicate the data aggregation utilized, select the folder corresponding appropriate level of data aggregation - either all samples/ GTEx for blood-specific networks, this includes precision matrices inferred across a range of penalization parameters. To view the precision matrix inferred for a particular value of the penalization parameter X, select the file labeled lambda_X.rds

    • For select networks, we have included the computed centrality measures which can be accessed at centrality_X.rds for a particular value of the penalization parameter X. 

    • We have also included .rds files that list the hub genes from the consensus networks inferred from non-cancerous samples at “normal_hubs.rds”, and the consensus networks inferred from cancerous samples at “cancer_hubs.rds”

    • The file “context_specific_selected_networks.csv” includes the networks that were selected for downstream biological interpretation based on the scale-free criterion which is also summarized in the Supplementary Tables. 

  • WGCNA.zip: A zipped folder containing gene modules inferred from WGCNA for sequentially aggregated GTEx, SRA, and blood studies. Select the data aggregated, and the number of studies based on folder names. For example, blood networks inferred from 20 studies can be accessed at blood/consensus/net_20. The individual networks correspond to distinct cut heights, and include information on the cut height used, the genes that the network was inferred over merged module labels, and merged module colors. 

Files

Zenodo README.txt

Files (29.4 GB)

Name Size Download all
md5:31781122bda1fd585b91d4e8bb9e162c
4.3 GB Download
md5:84c34a64c6b01dc9e9a755d25a002035
732.4 kB Download
md5:2fc266e503ca6ddeccc51f2c5d6ace99
7.9 GB Preview Download
md5:38b09e8396f123e402a6357e4eb6bc4a
17.2 GB Preview Download
md5:f3028f80e63ec8ca7e7b99ad707c0105
3.0 kB Preview Download

Additional details

Software