Dataset Open Access

# Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data

Christian H. Holland; Jovan Tanevski; Javier Perales-Patón; Jan Gleixner; Manu P. Kumar; Elisabetta Mereu; Brian A. Joughin; Oliver Stegle; Douglas A. Lauffenburger; Holger Heyn; Bence Szalai; Julio Saez-Rodriguez

### Citation Style Language JSON Export

{
"publisher": "Zenodo",
"DOI": "10.5281/zenodo.3564179",
"title": "Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data",
"issued": {
"date-parts": [
[
2019,
12,
10
]
]
},
"abstract": "<p>Data used to test the robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data, described in <a href=\"https://doi.org/10.1186/s13059-020-1949-z\">Holland et al. 2020</a>.</p>\n\n<p>The folder <em>data</em> contains raw data and the folder <em>output</em> contains intermediate and final results of all analyses.</p>\n\n<p>The associated analysis code and more information are available on <a href=\"https://github.com/saezlab/FootprintMethods_on_scRNAseq\">GitHub</a>.</p>\n\n<p>&nbsp;</p>\n\n<p><strong>Abstract</strong></p>\n\n<p><strong>Background</strong></p>\n\n<p>Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to do such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear if functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.</p>\n\n<p><strong>Results</strong></p>\n\n<p>To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy, GO enrichment, and DoRothEA that estimate pathway and transcription factor (TF) activities, respectively, and compare them against the tools SCENIC/AUCell and metaVIPER, designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal comparable performance to the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.</p>\n\n<p><strong>Conclusions</strong></p>\n\n<p>Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.</p>\n\n<p>&nbsp;</p>\n\n<p>For questions related to the data, please write an email to christian.holland@bioquant.uni-heidelberg.de or use the <a href=\"https://github.com/saezlab/FootprintMethods_on_scRNAseq/issues\">GitHub issue system</a>.</p>",
"author": [
{
"family": "Holland",
"given": "Christian H."
},
{
"family": "Tanevski",
"given": "Jovan"
},
{
"family": "Perales-Pat\u00f3n",
"given": "Javier"
},
{
"family": "Gleixner",
"given": "Jan"
},
{
"family": "Kumar",
"given": "Manu P."
},
{
"family": "Mereu",
"given": "Elisabetta"
},
{
"family": "Joughin",
"given": "Brian A."
},
{
"family": "Stegle",
"given": "Oliver"
},
{
"family": "Lauffenburger",
"given": "Douglas A."
},
{
"family": "Heyn",
"given": "Holger"
},
{
"family": "Szalai",
"given": "Bence"
},
{
"family": "Saez-Rodriguez",
"given": "Julio"
}
],
"version": "Version 2019-12-10",
"type": "dataset",
"id": "3564179"
}
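
A CSL JSON record like the one above can be read with any JSON parser. The following is a minimal Python sketch, not part of the original record, that loads a trimmed stand-in for this export (abstract and most authors omitted) and assembles a one-line citation. The helper names `author_name` and `format_citation` and the citation layout are illustrative assumptions rather than an official CSL style. The name helper accepts authors given either as a single `"family"` string or split into `"given"` and `"family"`.

```python
import json

# Trimmed stand-in for the record above: abstract and most authors omitted.
CSL_RECORD = """
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.3564179",
  "title": "Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data",
  "issued": {"date-parts": [[2019, 12, 10]]},
  "author": [
    {"family": "Christian H. Holland"},
    {"family": "Julio Saez-Rodriguez"}
  ],
  "type": "dataset",
  "id": "3564179"
}
"""


def author_name(author):
    """Join the optional "given" part with "family" into one display name."""
    parts = [author.get("given", ""), author.get("family", "")]
    return " ".join(p for p in parts if p)


def format_citation(record):
    """Build a one-line citation (illustrative layout, not an official CSL style)."""
    authors = "; ".join(author_name(a) for a in record["author"])
    year = record["issued"]["date-parts"][0][0]  # first element of the first date-parts entry
    doi_url = "https://doi.org/" + record["DOI"]
    return f'{authors} ({year}). {record["title"]}. {record["publisher"]}. {doi_url}'


record = json.loads(CSL_RECORD)
citation = format_citation(record)
print(citation)
```

For production use, a dedicated CSL processor would be a better choice than hand-built string formatting, since citation styles differ in author ordering, punctuation, and date handling.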