Normalization methods for single-cell RNA-Seq data (high-level overview)
Description
In this video, I provide a high-level overview of different scRNA-Seq normalization methods. In particular, I discuss the differences between log transforms, square root transforms, and Pearson residuals.
While discussing the scaling step, I forgot to mention that each cell should be scaled to the median transcript count of all cells in the dataset (approx. 9,000 in the example), not to an arbitrary number like 1 or 1,000,000. Scaling to an arbitrary number can really throw off the subsequent transformation step and lead to completely useless analysis results.
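The scaling and transformation steps can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the exact procedure shown in the video: the function names and the toy count matrix are made up, the log transform is assumed to be log2(x + 1), the square root transform is assumed to be a plain square root, and the Pearson residuals follow the analytic form of Lause et al. (2021) with an assumed overdispersion of theta = 100 and clipping to the square root of the number of cells. Every cell is assumed to contain at least one transcript.

```python
import math
from statistics import median


def scale_to_median(counts):
    """Scale each cell (row) so that its total equals the median total of all cells."""
    totals = [sum(cell) for cell in counts]
    target = median(totals)  # median transcript count, not an arbitrary constant
    return [[x * target / t for x in cell] for cell, t in zip(counts, totals)]


def log_transform(scaled):
    """Assumed variant: log2(x + 1), applied after median scaling."""
    return [[math.log2(x + 1) for x in cell] for cell in scaled]


def sqrt_transform(scaled):
    """Assumed variant: plain square root, applied after median scaling."""
    return [[math.sqrt(x) for x in cell] for cell in scaled]


def pearson_residuals(counts, theta=100.0):
    """Analytic Pearson residuals, computed on raw (unscaled) counts.

    theta and the clipping threshold sqrt(n_cells) are assumptions
    taken from Lause et al. (2021).
    """
    cell_totals = [sum(cell) for cell in counts]
    gene_totals = [sum(gene) for gene in zip(*counts)]
    grand_total = sum(cell_totals)
    clip = math.sqrt(len(counts))
    residuals = []
    for cell, n_cell in zip(counts, cell_totals):
        row = []
        for x, n_gene in zip(cell, gene_totals):
            mu = n_cell * n_gene / grand_total  # expected count under the null model
            r = (x - mu) / math.sqrt(mu + mu * mu / theta) if mu > 0 else 0.0
            row.append(max(-clip, min(clip, r)))
        residuals.append(row)
    return residuals


# Toy UMI count matrix (made-up numbers): 3 cells (rows) x 3 genes (columns).
counts = [[4, 0, 6], [10, 5, 5], [1, 1, 3]]

scaled = scale_to_median(counts)      # every cell now sums to the median total
logged = log_transform(scaled)
rooted = sqrt_transform(scaled)
residuals = pearson_residuals(counts)
```

Note that in this sketch the log and square root transforms operate on the median-scaled values, whereas the Pearson residuals are computed directly from the raw counts, since the expected value already accounts for each cell's total. For real datasets, an array library would be a more practical choice than nested lists.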
Further reading
-------------------------
1. "Validation of noise models for single-cell transcriptomics" (Grün et al., 2015) https://doi.org/10.1038/nmeth.2930
2. "Comprehensive Integration of Single-Cell Data" (Stuart et al., 2019) https://doi.org/10.1016/j.cell.2019.05.031
3. "K-nearest neighbor smoothing for high-throughput single-cell RNA-Seq data" (Wagner et al., 2018) https://doi.org/10.1101/217737
4. "Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression" (Hafemeister and Satija, 2019) https://doi.org/10.1186/s13059-019-1874-1
5. "Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data" (Lause et al., 2021) https://doi.org/10.1101/2020.12.01.405886
Data sources
-------------------------
1. Technical noise experiment: "Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells" (Klein et al., 2015) https://doi.org/10.1016/j.cell.2015.04.044
2. PBMC data: "10k PBMCs from a Healthy Donor (v3 chemistry)" (10x Genomics) https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_10k_v3
Files
Transforms.mp4 (382.5 MB)
md5:c1ce73936b7005cce05936a878f76e03
Additional details
Related works
- Is identical to: Video/Audio: https://www.youtube.com/watch?v=huxkc2GH4lk