Standard BESCA workflow

The current workflow works with Scanpy version 1.6.

Changes:

Import section

Setup Standard Workflow

Parameters to be set - on the command line or here

Define Input

Standard parameters - these should be kept as stable as possible

The filtering thresholds defined above should be a good starting point for most samples. In some cases, the QC plots shown below may justify changing a threshold.

For example, for the pbmc3k dataset we advise lowering those thresholds (see the besca tutorial here: https://bedapub.github.io/besca/tutorials/notebook1_data_processing_pbmc3k.html).

For the test dataset used here (Kotliarov2020), one might argue that the max_counts threshold could be lowered (based on the distribution shown in code chunk 12).
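One way to make the choice of max_counts less arbitrary is to read it off a percentile of the per-cell count distribution. The following is a minimal sketch in plain Python, not part of besca: the function name `suggest_max_counts`, the toy counts, and the choice of percentile are all assumptions made for illustration.

```python
import statistics


def suggest_max_counts(counts_per_cell, percentile=99):
    """Return the given percentile of per-cell total counts.

    Hypothetical helper: the value is only a candidate for max_counts
    and should still be checked against the QC plots.
    """
    # quantiles() with n=100 returns the 99 cut points between percentiles.
    cut_points = statistics.quantiles(counts_per_cell, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Toy per-cell total counts with one obvious outlier (assumed data).
counts_per_cell = [1200, 1500, 1800, 2100, 2500, 3000, 3600, 4200, 5000, 40000]

candidate = suggest_max_counts(counts_per_cell, percentile=90)
n_kept = sum(c <= candidate for c in counts_per_cell)
print(f"candidate max_counts: {candidate:.0f}, cells kept: {n_kept}")
```

A percentile is only a starting point; the final threshold should remain a judgment made from the distribution itself.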

Standard Pipeline

(Note: nothing below this point should be modified!)

!!! Remove low quality samples !!!

Read patient HBsAg

Visualization of quality control plots and selected filtering parameters

Count occurrence

This plot shows the number of cells per sample.
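The numbers behind such a plot are simply a tally of cells per sample label. As a small sketch, assuming each cell carries a sample identifier (the list `sample_of_cell` below is made-up data):

```python
from collections import Counter

# One sample label per cell (assumed toy data).
sample_of_cell = ["S1", "S1", "S2", "S3", "S3", "S3"]

# Tally how many cells belong to each sample.
cells_per_sample = Counter(sample_of_cell)

for sample, n_cells in sorted(cells_per_sample.items()):
    print(sample, n_cells)
```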

Transcript capture efficiency

This plot indicates the sequencing depth and whether sequencing has reached saturation.

Library size distribution

This plot shows the library size distribution in your dataset before processing.
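Here the library size of a cell is taken to be its total count over all genes. A minimal sketch of that quantity, using a made-up cells-by-genes matrix stored as nested lists:

```python
import statistics

# Rows are cells, columns are genes (assumed toy counts).
counts = [
    [3, 0, 5, 2],
    [0, 0, 1, 0],
    [10, 4, 0, 6],
]

# Library size of a cell = sum of its counts over all genes.
library_sizes = [sum(row) for row in counts]

summary = {
    "min": min(library_sizes),
    "median": statistics.median(library_sizes),
    "max": max(library_sizes),
}
print(library_sizes, summary)
```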

The effect of filtering parameters

Scanpy plots of genes, counts, and mitochondrial gene counts

Mitochondrial genes, genes, and counts by samples grouped by the split condition

Filtering

Filtering with thresholds of gene and cell counts

Filtering with thresholds of proportion of mitochondrial genes and the upper limit of gene counts
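The logic of this filter can be sketched in plain Python. This is an illustration rather than the besca implementation: the helper names, the "MT-" gene prefix, the inclusive comparisons, and the threshold values are assumptions.

```python
def qc_metrics(cell_counts, gene_names, mito_prefix="MT-"):
    """Compute total counts, detected genes and mitochondrial fraction for one cell."""
    total = sum(cell_counts)
    mito = sum(
        count
        for count, gene in zip(cell_counts, gene_names)
        if gene.upper().startswith(mito_prefix)
    )
    n_genes = sum(1 for count in cell_counts if count > 0)
    percent_mito = mito / total if total else 0.0
    return {"n_counts": total, "n_genes": n_genes, "percent_mito": percent_mito}


def passes_filter(metrics, max_genes, max_mito):
    """Keep a cell only if both values are at or below their thresholds (assumed)."""
    return metrics["n_genes"] <= max_genes and metrics["percent_mito"] <= max_mito


gene_names = ["MT-CO1", "MT-ND1", "CD3E", "ACTB"]
cells = [
    [1, 0, 5, 4],  # 10% mitochondrial counts
    [6, 4, 5, 5],  # 50% mitochondrial counts
]

metrics = [qc_metrics(cell, gene_names) for cell in cells]
kept = [passes_filter(m, max_genes=4, max_mito=0.2) for m in metrics]
print(kept)
```

Cells with a high mitochondrial fraction are often damaged or dying, which is why the second toy cell is dropped.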

Visualising QC metrics of the filtered dataset

Per-cell normalization and output of the normalized data
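As a rough illustration of per-cell normalization, the sketch below scales each cell to a common total and applies a log(1 + x) transform. The target sum of 10,000 and the log step are assumptions chosen because they are common defaults; they are not taken from this notebook.

```python
import math


def normalize_cell(counts, target_sum=10_000):
    """Scale one cell's counts to target_sum, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        # An empty cell stays all-zero instead of dividing by zero.
        return [0.0] * len(counts)
    return [math.log1p(count * target_sum / total) for count in counts]


normalized = normalize_cell([1, 1, 2])

# Undoing the log shows that the scaled counts add up to the target sum.
scaled_total = sum(math.expm1(value) for value in normalized)
print(round(scaled_total))
```

After this step, cells with different sequencing depths become comparable, because each cell's scaled counts sum to the same total.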

We perform an additional QC, which checks the dynamic range of ubiquitously expressed marker genes.

Feature selection (highly variable genes) for clustering
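The idea behind highly variable gene selection can be shown with a simplified dispersion ranking. This sketch uses variance divided by mean per gene and keeps the top-ranked genes; it is a stand-in for illustration, not the binned, normalized dispersion that Scanpy computes, and the gene values are made up.

```python
import statistics


def dispersion(values):
    """Variance-to-mean ratio of one gene across cells (0 for unexpressed genes)."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.pvariance(values) / mean


def top_variable_genes(expression_by_gene, n_top):
    """Return the n_top gene names with the highest dispersion."""
    ranked = sorted(
        expression_by_gene,
        key=lambda gene: dispersion(expression_by_gene[gene]),
        reverse=True,
    )
    return ranked[:n_top]


# Expression of each gene across four cells (assumed toy values).
expression_by_gene = {
    "ACTB": [5, 5, 5, 5],   # constant, so dispersion is 0
    "CD3E": [0, 8, 0, 8],   # strongly variable
    "MS4A1": [1, 2, 1, 2],  # mildly variable
}

selected = top_variable_genes(expression_by_gene, n_top=2)
print(selected)
```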

Regression steps, and output of regressed data

PCA-based neighborhood analysis and UMAP with optional batch correction

We use Batch Balanced K-Nearest Neighbours (BBKNN, Teichlab/bbknn) as the batch correction method.
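The core idea of batch-balanced neighbours is that each cell picks its nearest neighbours within every batch separately, so that no batch dominates the neighbourhood graph. The sketch below illustrates this with brute-force Euclidean distances on toy 2D points. It is not the bbknn package: the function name is hypothetical, the real method works on PCA coordinates, and excluding the query cell from its own neighbours is a simplification made here.

```python
import math


def batch_balanced_neighbors(points, batches, query_index, k_per_batch):
    """Return the k nearest neighbours of one cell within each batch."""
    query = points[query_index]
    neighbors = {}
    for batch in sorted(set(batches)):
        # Distances from the query to every other cell of this batch.
        candidates = [
            (math.dist(query, point), index)
            for index, (point, label) in enumerate(zip(points, batches))
            if label == batch and index != query_index
        ]
        candidates.sort()
        neighbors[batch] = [index for _, index in candidates[:k_per_batch]]
    return neighbors


# Toy coordinates standing in for PCA space (assumed data).
points = [(0, 0), (0.1, 0), (0.2, 0), (5, 5), (5.1, 5), (0.3, 0.1)]
batches = ["A", "A", "A", "B", "B", "B"]

result = batch_balanced_neighbors(points, batches, query_index=0, k_per_batch=1)
print(result)
```

A plain k-nearest-neighbour search with k=2 would link cell 0 only to cells from batch A; the per-batch search also links it to the closest cell from batch B.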

Clustering

Additional Labeling

If labeling_to_use is specified, additional labels are taken from the annotations in "metadata.tsv", and the data associated with the additional labeling is exported to files. The fract_pos.gct and average.gct files are also generated.

CiteSeq Standard Workflow (only executed if applicable)

Complete the log-file

Write QC Report

Session info

Finally, we report the session info with the package sinfo.
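For readers without sinfo installed, a similar record can be approximated with the standard library alone. This is a hedged stand-in, not what the notebook runs: the helper `session_info` is hypothetical, and the package names passed to it are examples.

```python
import platform
from importlib import metadata


def session_info(package_names):
    """Collect the Python version, platform and versions of selected packages."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in package_names:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


info = session_info(["scanpy", "besca"])
for key, value in info.items():
    print(f"{key}: {value}")
```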

Additional plots

Convert to html