Dataset Open Access

Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data

Christian H. Holland; Jovan Tanevski; Javier Perales-Patón; Jan Gleixner; Manu P. Kumar; Elisabetta Mereu; Brian A. Joughin; Oliver Stegle; Douglas A. Lauffenburger; Holger Heyn; Bence Szalai; Julio Saez-Rodriguez


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/dc049add-b231-4dd6-ba3d-a5f69ce57c11/data.zip"
      }, 
      "checksum": "md5:a2f6387a668c204d61b5e47c402e745d", 
      "bucket": "dc049add-b231-4dd6-ba3d-a5f69ce57c11", 
      "key": "data.zip", 
      "type": "zip", 
      "size": 5485871322
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/dc049add-b231-4dd6-ba3d-a5f69ce57c11/output.zip"
      }, 
      "checksum": "md5:d86f800b0f9ffae88858405e7517d4ba", 
      "bucket": "dc049add-b231-4dd6-ba3d-a5f69ce57c11", 
      "key": "output.zip", 
      "type": "zip", 
      "size": 5378455338
    }
  ], 
  "owners": [
    59833
  ], 
  "doi": "10.5281/zenodo.3564179", 
  "stats": {
    "version_unique_downloads": 533.0, 
    "unique_views": 918.0, 
    "views": 997.0, 
    "version_views": 997.0, 
    "unique_downloads": 533.0, 
    "version_unique_views": 918.0, 
    "volume": 11624299872366.0, 
    "version_downloads": 2155.0, 
    "downloads": 2155.0, 
    "version_volume": 11624299872366.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.3564179", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.3564178", 
    "bucket": "https://zenodo.org/api/files/dc049add-b231-4dd6-ba3d-a5f69ce57c11", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.3564178.svg", 
    "html": "https://zenodo.org/record/3564179", 
    "latest_html": "https://zenodo.org/record/3564179", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.3564179.svg", 
    "latest": "https://zenodo.org/api/records/3564179"
  }, 
  "conceptdoi": "10.5281/zenodo.3564178", 
  "created": "2019-12-10T16:39:03.831386+00:00", 
  "updated": "2020-02-14T12:26:53.172299+00:00", 
  "conceptrecid": "3564178", 
  "revision": 6, 
  "id": 3564179, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.3564179", 
    "description": "<p>Data used to test the robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data, described in <a href=\"https://doi.org/10.1186/s13059-020-1949-z\">Holland et al. 2020</a>.</p>\n\n<p>The folder&nbsp;<em>data </em>contains<em>&nbsp;</em>raw data and the folder <em>output</em> contains intermediate and final results of all analyses.&nbsp;</p>\n\n<p>The associated analyses code and more information are available on&nbsp;<a href=\"https://github.com/saezlab/FootprintMethods_on_scRNAseq\">GitHub</a>.</p>\n\n<p>&nbsp;</p>\n\n<p><strong>Abstract</strong></p>\n\n<p><strong>Background</strong></p>\n\n<p>Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to do such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear if functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.</p>\n\n<p><strong>Results</strong></p>\n\n<p>To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy, GO enrichment, and DoRothEA that estimate pathway and transcription factor (TF) activities, respectively, and compare them against the tools SCENIC/AUCell and metaVIPER, designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal comparable performance to the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. 
We also provide the benchmark data for further use by the community.</p>\n\n<p><strong>Conclusions</strong></p>\n\n<p>Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.</p>\n\n<p>&nbsp;</p>\n\n<p>For questions related to the data please write an email to christian.holland@bioquant.uni-heidelberg.de or use the <a href=\"https://github.com/saezlab/FootprintMethods_on_scRNAseq/issues\">GitHub issue system</a>.</p>", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "title": "Robustness and applicability of  transcription factor and pathway analysis tools on single-cell RNA-seq data", 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "3564178"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "3564179"
          }
        }
      ]
    }, 
    "version": "Version 2019-12-10", 
    "keywords": [
      "scRNA-seq", 
      "functional analysis", 
      "transcription factor analysis", 
      "pathway analysis", 
      "benchmark"
    ], 
    "publication_date": "2019-12-10", 
    "creators": [
      {
        "orcid": "0000-0002-3060-5786", 
        "affiliation": "Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany", 
        "name": "Christian H. Holland"
      }, 
      {
        "affiliation": "Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany", 
        "name": "Jovan Tanevski"
      }, 
      {
        "affiliation": "Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany", 
        "name": "Javier Perales-Pat\u00f3n"
      }, 
      {
        "affiliation": "German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany", 
        "name": "Jan Gleixner"
      }, 
      {
        "affiliation": "Department of Biological Engineering, MIT, Cambridge MA", 
        "name": "Manu P. Kumar"
      }, 
      {
        "affiliation": "CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain", 
        "name": "Elisabetta Mereu"
      }, 
      {
        "affiliation": "Department of Biological Engineering, MIT, Cambridge MA", 
        "name": "Brian A. Joughin"
      }, 
      {
        "affiliation": "German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany", 
        "name": "Oliver Stegle"
      }, 
      {
        "affiliation": "Department of Biological Engineering, MIT, Cambridge MA", 
        "name": "Douglas A. Lauffenburger"
      }, 
      {
        "affiliation": "CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain", 
        "name": "Holger Heyn"
      }, 
      {
        "affiliation": "Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary", 
        "name": "Bence Szalai"
      }, 
      {
        "orcid": "0000-0002-8552-8976", 
        "affiliation": "Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant - Im Neuenheimer Feld 267, 69120 Heidelberg, Germany", 
        "name": "Julio Saez-Rodriguez"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3564178", 
        "relation": "isVersionOf"
      }
    ]
  }
}
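
The `files` entries in the record above list an MD5 checksum and a byte size for each archive. Since both archives are over 5 GB, it can be worth checking a download against these values before unpacking it. The following is a minimal sketch, assuming the JSON export has been saved locally and parsed into a dictionary, and that the archives were downloaded into a single directory under their `key` names. The function names `md5_of_file` and `verify_files` are hypothetical helpers introduced here for illustration; they are not part of Zenodo or of the dataset's code.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(record, download_dir):
    """Check local files against the size and checksum listed in the record.

    `record` is the parsed JSON export (assumed to have a "files" list whose
    entries carry "key", "size" and "checksum" in the form "md5:<hex>").
    Returns a dict mapping each file key to "ok", "missing",
    "size mismatch" or "checksum mismatch".
    """
    results = {}
    for entry in record["files"]:
        path = Path(download_dir) / entry["key"]
        algorithm, _, expected = entry["checksum"].partition(":")
        if algorithm != "md5":
            raise ValueError(f"unsupported checksum algorithm: {algorithm}")
        if not path.is_file():
            results[entry["key"]] = "missing"
        elif path.stat().st_size != entry["size"]:
            # Cheap check first: a truncated download fails here without hashing.
            results[entry["key"]] = "size mismatch"
        elif md5_of_file(path) != expected:
            results[entry["key"]] = "checksum mismatch"
        else:
            results[entry["key"]] = "ok"
    return results
```

With the export saved as, say, `record.json`, calling `verify_files(json.load(open("record.json")), "downloads")` would report the status of `data.zip` and `output.zip`. The size comparison runs before the hash so that an incomplete download is flagged without reading several gigabytes.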