Published July 11, 2022 | Version 0.0.1
Workflow Open

Inferring cellular and molecular processes in single-cell data with non-negative matrix factorization using Python, R, and GenePattern Notebook implementations of CoGAPS

Description

This record contains all code and data needed to reproduce the results of our workflow paper, "Inferring cellular and molecular processes in single-cell data with non-negative matrix factorization using Python, R, and GenePattern Notebook implementations of CoGAPS".

These materials are also available in our CoGAPS and PyCoGAPS GitHub repositories.

Abstract:

Non-negative matrix factorization (NMF) is an unsupervised learning method well suited to high-throughput biology. Still, inferring biological processes from the features learned by NMF software packages requires additional post hoc statistics and annotation. Here, we introduce a suite of computational tools that implement NMF and provide methods for accurate, clear biological interpretation and analysis. A generalized discussion of NMF covering its benefits, limitations, and open questions in the field is followed by three procedures for the Bayesian NMF algorithm CoGAPS (Coordinated Gene Activity across Pattern Subsets). Each procedure demonstrates NMF analysis to quantify cell state transitions in public-domain single-cell RNA-sequencing (scRNA-seq) data of 25,422 epithelial cells from pancreatic ductal adenocarcinoma (PDAC) tumors and control samples. The first demonstrates PyCoGAPS, our new Python implementation of CoGAPS, which improves the runtime of Bayesian NMF for large datasets. The second procedure steps through the same single-cell NMF analysis using our R CoGAPS interface, and the third introduces a beginner-friendly CoGAPS platform using GenePattern Notebook. By providing Python support, cloud-based computing options, and relevant example workflows, we facilitate user-friendly interpretation and implementation of NMF for single-cell analyses.
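For readers new to NMF, the core idea in the abstract (approximating a non-negative data matrix by the product of two smaller non-negative matrices) can be illustrated with a minimal sketch. This is not CoGAPS or PyCoGAPS: it uses the classical multiplicative-update rule on a small synthetic matrix rather than the Bayesian algorithm described in the paper, and the function name `nmf_multiplicative`, the toy dimensions, and all parameters are assumptions made for illustration only.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (features x samples) as W @ H, with W, H >= 0.

    Illustrative multiplicative-update NMF; NOT the Bayesian CoGAPS algorithm.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    # Random strictly positive starting points for both factors.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors


# Synthetic toy data (hypothetical): 20 "genes" x 30 "cells" built from 3 patterns.
toy_rng = np.random.default_rng(1)
V = toy_rng.random((20, 3)) @ toy_rng.random((3, 30))

W, H, errors = nmf_multiplicative(V, rank=3)
print("first / last reconstruction error:", errors[0], errors[-1])
```

In this sketch the columns of `W` play the role of gene weights and the rows of `H` the role of patterns across cells; the post hoc statistics and annotation mentioned in the abstract would operate on factors like these.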

Files

ModSimBases.txt

Files (4.4 GB)

MD5 checksum                            Size
md5:5082d45074f6584e6de68ffc723084bd    1.5 GB
md5:d05abdc4010ca116388c41dc28f1b845    44.2 MB
md5:6c9ca83a58b708eb96f32c7bc48ab117    1.2 GB
md5:b6844648e16325b35cc362f5236ef0b4    166.6 MB
md5:735ddbd82aef34c016cd93c31e68c864    433.3 MB
md5:a6110c3cc963f1779ac216a85dce7e1f    471 Bytes
md5:520ebe0dc39876ca7dda19ebe2af35ea    3.7 kB
md5:673d6ee43def4c85c5e55fd535462f22    32.8 kB
md5:aaf6c284cfe2f23ecc1e68c5e28c4ff5    1.0 GB
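Because several of these files exceed 1 GB, it can be worth confirming that a download completed intact by comparing its MD5 digest with the value listed above. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the path passed in must be whatever local name the downloaded file was saved under, since file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Stream the file so multi-gigabyte downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value such as 'md5:5082d4...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("path/to/downloaded_file", "md5:5082d45074f6584e6de68ffc723084bd")` would return `True` only if the local file (placeholder path) is byte-identical to the first 1.5 GB entry.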