Dataset Open Access
Christian H. Holland;
Jovan Tanevski;
Javier Perales-Patón;
Jan Gleixner;
Manu P. Kumar;
Elisabetta Mereu;
Brian A. Joughin;
Oliver Stegle;
Douglas A. Lauffenburger;
Holger Heyn;
Bence Szalai;
Julio Saez-Rodriguez
<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3564179</identifier>
  <creators>
    <creator>
      <creatorName>Christian H. Holland</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-3060-5786</nameIdentifier>
      <affiliation>Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Jovan Tanevski</creatorName>
      <affiliation>Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Javier Perales-Patón</creatorName>
      <affiliation>Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Jan Gleixner</creatorName>
      <affiliation>German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Manu P. Kumar</creatorName>
      <affiliation>Department of Biological Engineering, MIT, Cambridge MA</affiliation>
    </creator>
    <creator>
      <creatorName>Elisabetta Mereu</creatorName>
      <affiliation>CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain</affiliation>
    </creator>
    <creator>
      <creatorName>Brian A. Joughin</creatorName>
      <affiliation>Department of Biological Engineering, MIT, Cambridge MA</affiliation>
    </creator>
    <creator>
      <creatorName>Oliver Stegle</creatorName>
      <affiliation>German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Douglas A. Lauffenburger</creatorName>
      <affiliation>Department of Biological Engineering, MIT, Cambridge MA</affiliation>
    </creator>
    <creator>
      <creatorName>Holger Heyn</creatorName>
      <affiliation>CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain</affiliation>
    </creator>
    <creator>
      <creatorName>Bence Szalai</creatorName>
      <affiliation>Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary</affiliation>
    </creator>
    <creator>
      <creatorName>Julio Saez-Rodriguez</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-8552-8976</nameIdentifier>
      <affiliation>Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2019</publicationYear>
  <subjects>
    <subject>scRNA-seq</subject>
    <subject>functional analysis</subject>
    <subject>transcription factor analysis</subject>
    <subject>pathway analysis</subject>
    <subject>benchmark</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2019-12-10</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3564179</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3564178</relatedIdentifier>
  </relatedIdentifiers>
  <version>Version 2019-12-10</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">
      <p>Data used to test the robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data, described in <a href="https://doi.org/10.1186/s13059-020-1949-z">Holland et al. 2020</a>.</p>
      <p>The folder <em>data</em> contains raw data and the folder <em>output</em> contains intermediate and final results of all analyses.</p>
      <p>The associated analysis code and more information are available on <a href="https://github.com/saezlab/FootprintMethods_on_scRNAseq">GitHub</a>.</p>
      <p><strong>Abstract</strong></p>
      <p><strong>Background</strong></p>
      <p>Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to do such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear if functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.</p>
      <p><strong>Results</strong></p>
      <p>To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy, GO enrichment, and DoRothEA that estimate pathway and transcription factor (TF) activities, respectively, and compare them against the tools SCENIC/AUCell and metaVIPER, designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal comparable performance to the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.</p>
      <p><strong>Conclusions</strong></p>
      <p>Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.</p>
      <p>For questions related to the data, please email christian.holland@bioquant.uni-heidelberg.de or use the <a href="https://github.com/saezlab/FootprintMethods_on_scRNAseq/issues">GitHub issue system</a>.</p>
    </description>
  </descriptions>
</resource>
Metric | All versions | This version |
---|---|---|
Views | 992 | 992 |
Downloads | 2,155 | 2,155 |
Data volume | 11.6 TB | 11.6 TB |
Unique views | 913 | 913 |
Unique downloads | 533 | 533 |