Datasets for evaluating SCEMENT: Scalable and Memory Efficient Integration of Large-scale Single Cell RNA-sequencing Data
Description
This resource contains the pre-processed A. thaliana root, H. sapiens aortic valve, PBMC COVID atlas, and public 10x datasets used in the paper SCEMENT: Scalable and Memory Efficient Integration of Large-scale Single Cell RNA-sequencing Data. The raw datasets available at the links below were pre-processed for quality control with respect to both cells and genes.
A. thaliana datasets are sourced from the following locations at the Single-cell Gene Expression Atlas and the Gene Expression Omnibus (GEO):
- E-GEOD-121619 : https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-121619/results
- E-GEOD-152766 : https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-152766/results
- E-GEOD-158761 : https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-158761/results
- E-GEOD-123013 : https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-123013/results
- GSE152766: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE152766
- GSE158761: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE158761
H. sapiens datasets are obtained from the NCBI database: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA562645/
All COVID atlas datasets are from: http://covid19.cancer-pku.cn. The archive covid_atlas_data1.zip contains the h5ad files, and covid_atlas_data2.zip contains the Seurat rds files.
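As a minimal sketch of how the h5ad archive might be unpacked, the snippet below extracts only the `.h5ad` members of a zip file using the Python standard library. The helper names `extract_h5ad_files` and `load_h5ad` are hypothetical, the internal layout of the archive is not documented here, and loading assumes the `anndata` package is installed.

```python
import zipfile
from pathlib import Path


def extract_h5ad_files(archive_path, out_dir):
    """Extract only the .h5ad members of a zip archive and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as zf:
        # Sort member names so the returned order is deterministic.
        for name in sorted(zf.namelist()):
            if name.endswith(".h5ad"):
                extracted.append(Path(zf.extract(name, path=out_dir)))
    return extracted


def load_h5ad(path):
    """Load one extracted file as an AnnData object (assumes anndata is installed)."""
    import anndata  # imported lazily so extraction works without it

    return anndata.read_h5ad(path)
```

For example, `extract_h5ad_files("covid_atlas_data1.zip", "covid_atlas")` would return the extracted paths, each of which could then be passed to `load_h5ad`. The Seurat rds files in covid_atlas_data2.zip are intended for R and are not covered by this sketch.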
PBMC datasets are from the following public sources:
References for the datasets:
- H. sapiens dataset: Kang Xu, Shangbo Xie, Yuming Huang, Tingwen Zhou, Ming Liu, Peng Zhu, Chunli Wang, Jiawei Shi, Fei Li, Frank W. Sellke and Nianguo Dong (2020) Cell-Type Transcriptome Atlas of Human Aortic Valves Reveal Cell Heterogeneity and Endothelial to Mesenchymal Transition Involved in Calcific Aortic Valve Disease.
- E-GEOD-152766: Shahan R, Hsu C, Nolan TM, Cole BJ, Taylor IW et al. (2020) A single cell Arabidopsis root atlas reveals developmental trajectories in wild type and cell identity mutants.
- E-GEOD-121619: Jean-Baptiste K, McFaline-Figueroa JL, Alexandre CM, Dorrity MW, Saunders L et al. (2019) Dynamics of Gene Expression in Single Root Cells of Arabidopsis thaliana.
- E-GEOD-123013: Ryu KH, Huang L, Kang HM, Schiefelbein J. (2019) Single-Cell RNA Sequencing Resolves Molecular Relationships Among Individual Plant Cells.
- E-GEOD-158761: Gala HP, Lanctot A, Jean-Baptiste K, Guiziou S, Chu JC et al. (2020) A single cell view of the transcriptome during lateral root initiation in Arabidopsis thaliana.
- COVID Atlas Reference: Xianwen Ren, Wen Wen, Xiaoying Fan et al. (2021) COVID-19 immune features revealed by a large-scale single-cell transcriptome atlas.
- PBMC data were downloaded from their respective links.
Files
aortic_valve_datasets.zip
Total size of all files: 21.7 GB
| MD5 | Size |
|---|---|
| de580ff20b86e500eee4ac5db9ce5e61 | 901.2 MB |
| 603292c25d2424c9d7203262442f033a | 3.0 GB |
| 67f4d08091c123fbfccfdff09772ae41 | 5.1 GB |
| c7a886c681f67636fb5185f34af48f16 | 10.6 GB |
| 19ecd11793f062ef2fb581e9ad889606 | 2.0 GB |
| 17c01656132de9f22fe0cf819e15a783 | 56.8 MB |
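The MD5 checksums listed for the files can be used to confirm that a download completed without corruption, which matters for archives of this size. The following is a minimal sketch using Python's standard `hashlib` module. The function names `md5_of_file` and `matches_checksum` are hypothetical, and the file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the expected value from the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call such as `matches_checksum("downloaded_file.zip", "de580ff20b86e500eee4ac5db9ce5e61")` returns `True` only when the digests agree. The file name here is a placeholder for whichever archive was downloaded.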
Additional details
Funding
- U.S. National Science Foundation
- A scalable integrated multi-modal single cell analysis framework for gene regulatory and cell-cell interaction networks 2233887
Software
- Repository URL
- https://github.com/AluruLab/scement