Published May 1, 2021 | Version v3
Dataset Open

Benchmarking computational doublet-detection methods for single-cell RNA sequencing data

  • 1. UCLA

Description

This repository contains the real and synthetic datasets used in the papers "Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data" and "Protocol for Benchmarking Computational Doublet-Detection Methods in Single-Cell RNA Sequencing Data Analysis". Please see the full texts published in Cell Systems and STAR Protocols.

1. real_datasets.zip: 16 real scRNA-seq datasets with experimentally annotated doublets. This collection covers a variety of cell types, droplet and gene numbers, doublet rates, and sequencing depths, and so represents varying levels of difficulty in detecting doublets from scRNA-seq data. The data collection and preprocessing details are described in our Cell Systems paper. Each file name corresponds to the dataset name used in the paper.

2. synthetic_datasets.zip: synthetic datasets used in the paper, including datasets with varying doublet rates (i.e., percentages of doublets among all droplets), sequencing depths, cell types, and between-cell-type heterogeneity levels. The synthetic datasets contain ground-truth doublets, cell types, differentially expressed (DE) genes, and cell trajectories. The simulation details are described in our Cell Systems paper.

3. A detailed description of how to use these datasets is available in our STAR Protocols paper.
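The doublet rate defined in item 2 (the percentage of doublets among all droplets) can be computed from per-droplet annotations. The following is a minimal sketch in Python, not the authors' code: the function name `doublet_rate`, the label strings "singlet" and "doublet", and the example annotations are all assumptions made for illustration, and the labels stored in the actual files may be encoded differently.

```python
from typing import Sequence


def doublet_rate(labels: Sequence[str], doublet_label: str = "doublet") -> float:
    """Return the percentage of droplets annotated as doublets.

    The label strings are an assumption for this sketch; adapt
    `doublet_label` to whatever encoding the loaded dataset uses.
    """
    if len(labels) == 0:
        raise ValueError("labels must contain at least one droplet")
    n_doublets = sum(1 for label in labels if label == doublet_label)
    return 100.0 * n_doublets / len(labels)


# Hypothetical annotations for ten droplets, two of which are doublets.
example_labels = ["singlet"] * 8 + ["doublet"] * 2
example_rate = doublet_rate(example_labels)
print(f"doublet rate: {example_rate:.1f}%")
```

With the made-up annotations above, two of ten droplets are doublets, so the computed rate is 20%.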

Files

Files (1.3 GB)

Checksum Size
md5:72d393ecc0fecf5bb91571ccd985f233 749.9 MB
md5:4fd8722ba9868d8ba044c84dc947b28c 588.4 MB
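The MD5 checksums listed above can be used to check that a downloaded archive is complete. The following is a minimal sketch in Python using only the standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and because the listing does not state which checksum belongs to which archive, a download should be compared against both values.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'
    (the 'md5:' prefix is optional)."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Checksums copied from the file listing above.
LISTED_CHECKSUMS = [
    "md5:72d393ecc0fecf5bb91571ccd985f233",
    "md5:4fd8722ba9868d8ba044c84dc947b28c",
]
```

For example, after downloading real_datasets.zip to the working directory (the local path is an assumption), `any(matches_listed_md5("real_datasets.zip", c) for c in LISTED_CHECKSUMS)` should evaluate to True if the download is intact.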

Additional details

Related works

Is supplement to
10.1016/j.cels.2020.11.008 (DOI)

References

  • Xi, N. M. and Li, J. J. (2020) 'Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data', Cell Systems. doi: 10.1016/j.cels.2020.11.008.