Evaluating the influence of structural properties on proximity metric performance in single cell RNA-seq data - Datasets
Authors/Creators
- 1. Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- 2. School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
Description
Includes raw and processed copies of the scRNA-seq datasets used for the paper: 'How does data structure impact cell-cell similarity? Evaluating the influence of structural properties on proximity metric performance in single cell RNA-seq data.'
Real scRNA-seq.zip contains the Abundant (subset 1) and Rare (subset 2) subsets generated to represent discretely structured datasets (sourced from Wegmann et al. 2019), and the continuously structured data (sourced from Popescu et al. 2019).
Simulated scRNA-seq.zip contains the Abundant, Moderately-Rare and Ultra-Rare subsets for discretely and continuously structured datasets, as well as the dataset containing the labels needed to reproduce Figure 3 of the manuscript. All data were simulated using the PROSSTT package in Python 3.8.
Results.zip contains the results of the full analysis for all datasets, stored as a pickled Python dictionary. Code to read in and visualise the results is available on the project's GitHub.
The scripts for dataset generation, processing and visualisation of results are available in the GitHub repository for the scProximitE package, and documentation is available here.
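For readers who want a quick look at the results before installing the package, the pickled dictionary could be read directly from the archive along the following lines. This is a minimal sketch using only the Python standard library, not the project's own loading code. The helper name load_pickled_results is hypothetical, and the sketch assumes the pickle inside Results.zip has a .pkl or .pickle extension, which this record does not state; pass the member name explicitly if it differs.

```python
import pickle
import zipfile


def load_pickled_results(zip_path, member=None):
    """Return the object stored in a pickle file inside a zip archive.

    `member` is the file name inside the archive. If it is omitted, the first
    entry ending in .pkl or .pickle is used (an assumption about the naming,
    not something the dataset record specifies).
    """
    with zipfile.ZipFile(zip_path) as archive:
        if member is None:
            candidates = [
                name for name in archive.namelist()
                if name.lower().endswith((".pkl", ".pickle"))
            ]
            if not candidates:
                raise FileNotFoundError(f"no pickle file found in {zip_path}")
            member = candidates[0]
        # Unpickling can execute arbitrary code, so only load trusted files.
        return pickle.loads(archive.read(member))


# Example usage, assuming Results.zip has been downloaded to the working
# directory:
#     results = load_pickled_results("Results.zip")
#     print(type(results), list(results)[:5])
```

Reading the archive member into memory avoids extracting files to disk. If the results were pickled with third-party objects such as pandas or NumPy structures, those libraries must be installed for unpickling to succeed.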
Files
Results.zip
Additional details
References
- Popescu D-M, Botting RA, Stephenson E, et al. Decoding human fetal liver haematopoiesis: Dataset. 2019.
- Wegmann R, Neri M, Schuierer S, et al. CellSIUS provides sensitive and specific detection of rare cell populations from complex single-cell RNA-seq data. Genome Biology 2019; 20:142.
- Papadopoulos N, Gonzalo PR, Söding J. PROSSTT: probabilistic simulation of single-cell RNA-seq data for complex differentiation processes. Bioinformatics 2019; 35:3517–3519.