Benchmarking computational doublet-detection methods for single-cell RNA sequencing data
Description
This repository contains the real and synthetic datasets used in the papers "Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data" and "Protocol for Benchmarking Computational Doublet-Detection Methods in Single-Cell RNA Sequencing Data Analysis". Please see the full texts published in Cell Systems and STAR Protocols, respectively.
1. real_datasets.zip: 16 real scRNA-seq datasets with experimentally annotated doublets. This collection covers a variety of cell types, droplet and gene numbers, doublet rates, and sequencing depths, and so represents varying levels of difficulty in detecting doublets from scRNA-seq data. The data collection and preprocessing details are described in our Cell Systems paper. Each file name matches the dataset name used in the paper.
2. synthetic_datasets.zip: synthetic datasets used in the paper, including datasets with varying doublet rates (i.e., percentages of doublets among all droplets), sequencing depths, cell types, and between-cell-type heterogeneity levels. The synthetic datasets contain ground-truth doublets, cell types, differentially expressed (DE) genes, and cell trajectories. The simulation details are described in our Cell Systems paper.
3. A detailed description of how to use these datasets is available in our STAR Protocols paper.
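The doublet rate defined above (the percentage of doublets among all droplets) can be computed from a vector of per-droplet annotations. The following is a minimal sketch only: the function name `doublet_rate` and the label strings `"singlet"` and `"doublet"` are assumptions made for illustration and may not match the labels stored in the dataset files.

```python
from typing import Sequence


def doublet_rate(labels: Sequence[str], doublet_label: str = "doublet") -> float:
    """Return the percentage of droplets annotated as doublets.

    `doublet_label` is an assumed label string; adjust it to whatever
    annotation the loaded dataset actually uses.
    """
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    n_doublets = sum(1 for label in labels if label == doublet_label)
    return 100.0 * n_doublets / len(labels)


# Toy example: 2 doublets among 10 droplets.
toy_labels = ["singlet"] * 8 + ["doublet"] * 2
print(doublet_rate(toy_labels))  # 20.0
```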
Files
Total size: 1.3 GB

MD5 checksum | Size |
---|---|
md5:72d393ecc0fecf5bb91571ccd985f233 | 749.9 MB |
md5:4fd8722ba9868d8ba044c84dc947b28c | 588.4 MB |
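The MD5 checksums listed above can be used to confirm that a downloaded archive is complete. The sketch below uses only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` and the local file paths are assumptions for illustration. Because the listing does not pair each checksum with a file name, the check accepts a match against either listed value.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing; which archive each one
# belongs to is not stated there, so both are accepted.
LISTED_CHECKSUMS = {
    "72d393ecc0fecf5bb91571ccd985f233",
    "4fd8722ba9868d8ba044c84dc947b28c",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed=LISTED_CHECKSUMS):
    """Return True if the file's MD5 digest is one of the listed values."""
    return md5_of_file(path) in listed


if __name__ == "__main__":
    # Hypothetical local paths: assumes the archives were saved in the
    # current working directory under their original names.
    for name in ("real_datasets.zip", "synthetic_datasets.zip"):
        archive = Path(name)
        if archive.exists():
            status = "OK" if matches_listed_checksum(archive) else "MISMATCH"
            print(f"{name}: {status}")
        else:
            print(f"{name}: not found, skipping")
```

Reading in fixed-size chunks keeps memory use low for archives of several hundred megabytes.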
Additional details
Related works
- Is supplement to: 10.1016/j.cels.2020.11.008 (DOI)
References
- Xi, N. M. and Li, J. J. (2020) 'Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data', Cell Systems. doi: 10.1016/j.cels.2020.11.008.