Published April 13, 2022 | Version 0.0.1
Dataset Open

Evaluating the influence of structural properties on proximity metric performance in single cell RNA-seq data - Datasets

  • 1. Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
  • 2. School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia

Description

Includes raw and processed copies of the scRNA-seq datasets used for the paper: 'How does data structure impact cell-cell similarity? Evaluating the influence of structural properties on proximity metric performance in single cell RNA-seq data.'

Real scRNA-seq.zip contains the Abundant (subset 1) and Rare (subset 2) subsets generated to represent discretely structured datasets (sourced from Wegmann et al. 2019) and the continuously structured data (sourced from Popescu et al. 2019).

Simulated scRNA-seq.zip contains the Abundant, Moderately-Rare and Ultra-Rare subsets for discretely and continuously structured datasets, as well as the dataset containing the labels needed to reproduce Figure 3 of the manuscript. All data were simulated using the PROSSTT package in Python 3.8.

Results.zip contains the results of the full analysis for all datasets, stored as a pickled Python dictionary. Code to read in and visualise the results is available on the project's GitHub.
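As a rough illustration of reading the pickled results, the sketch below uses only the Python standard library to unpickle every pickle file found inside the archive. The file extensions it looks for and the layout of the archive are assumptions, not details confirmed by this record, and `load_pickled_results` is a hypothetical helper rather than part of the project's code. The project's own loading and visualisation code remains the authoritative route.

```python
import pickle
import zipfile
from pathlib import Path

# Assumed extensions for pickle files; the actual names inside Results.zip
# are not stated in this record.
PICKLE_SUFFIXES = {".pkl", ".pickle", ".p"}


def load_pickled_results(zip_path):
    """Return {archive member name: unpickled object} for each pickle file.

    Hypothetical helper for illustration only. Unpickling can execute
    arbitrary code, so use it only on archives from a trusted source.
    """
    results = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if Path(name).suffix.lower() in PICKLE_SUFFIXES:
                with archive.open(name) as handle:
                    results[name] = pickle.load(handle)
    return results


# Example call, assuming the archive has been downloaded to the working directory:
# results = load_pickled_results("Results.zip")
# for member, obj in results.items():
#     print(member, type(obj))
```

If an object cannot be unpickled, it may depend on library versions installed when the results were written; in that case the project's own environment is the safer starting point.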

The scripts for dataset generation, processing and visualisation of results are available in the GitHub repository for the scProximitE package, and documentation for the package is also available online.

Files

Files (764.9 MB)

md5:d74a74a4f0ef1360d133b61d9d73a601 (1.3 MB)
md5:da9cd31b0c29740c3d5848bb861c1f47 (255.2 MB)
md5:2d478bdb37db26602652949181f0a8c5 (508.4 MB)
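The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below is one way to do this with Python's standard `hashlib` module; the helper names are hypothetical, and because the listing does not show which checksum belongs to which archive, the pairing in the example comment is an assumption that needs checking against the record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:<hex digest>'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call; which checksum belongs to which archive is an assumption here:
# matches_checksum("Results.zip", "md5:d74a74a4f0ef1360d133b61d9d73a601")
```

Reading in chunks keeps memory use low for the larger archives, which are several hundred megabytes each.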

Additional details

References

  • Popescu D-M, Botting RA, Stephenson E, et al. Decoding human fetal liver haematopoiesis: Dataset. 2019.
  • Wegmann R, Neri M, Schuierer S, et al. CellSIUS provides sensitive and specific detection of rare cell populations from complex single-cell RNA-seq data. Genome Biology 2019; 20:142.
  • Papadopoulos N, Gonzalo PR, Söding J. PROSSTT: probabilistic simulation of single-cell RNA-seq data for complex differentiation processes. Bioinformatics 2019; 35:3517–3519.