Joint embedding of vertebrate brain single-cell RNA-Seq using sequence or structure
Description
Embeddings of single-cell RNA-Seq data from three adult vertebrate brain datasets into either an Orthogroup feature space or a Structural cluster feature space. Orthogroups were generated using OrthoFinder v5.5.0; Structural clusters were assigned by clustering AlphaFold v4 structure predictions with FoldSeek.
The three datasets used as the basis for these embeddings were:
- sample "Brain8" from the Jiang et al. 2021 zebrafish cell atlas (files beginning with GSM3768152)
- sample "Brain1" from the Han et al. 2018 mouse cell atlas (files beginning with GSM2906405)
- sample "Xenopus_brain_COL65" from the Liao et al. 2022 Xenopus laevis adult cell atlas (files beginning with GSM6214268)
For each dataset, we also generated a standardized cell type annotation file based on the cell type annotations originally provided by the authors. The first column is the cell barcode from that species' dataset and the second column is the original study's cell type annotation for that cell.
For the Xenopus brain data, we removed approximately 18k cells that were not annotated in the original data to simplify downstream analyses. The resulting data are in the files with the "subsampled" suffix. Subsampled versions of the data are also available for the joint embedding space (prefixed with "DrerMmusXlae").
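The two-column annotation files described above can be read into a barcode-to-cell-type lookup. The following is a minimal sketch that assumes the files are tab-delimited with no header row; neither detail is stated in this record, so adjust `delimiter` and `has_header` after inspecting a file. The function name `load_annotations` is hypothetical.

```python
import csv


def load_annotations(path, delimiter="\t", has_header=False):
    """Return {cell_barcode: cell_type} from a two-column annotation file.

    Assumption: column 1 is the barcode and column 2 is the cell type,
    as described in the record; delimiter and header are guesses.
    """
    annotations = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(reader, None)  # skip the header row if one is present
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            annotations[row[0]] = row[1]
    return annotations
```

A dictionary keeps lookups by barcode cheap when the annotations are later joined to cells in an embedding.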
For the final datasets used in our analyses, we also provide feature x cell matrices as .h5ad files, which are smaller and load faster with Scanpy.
To reproduce the UMAP plots of our top200 embedding space, we provide ".tsv" files containing the x and y position of each cell in the UMAP along with a variety of per-cell metrics. See "DrerMmusXlae_adultbrain_FoldSeek_plotlydata.tsv" and "DrerMmusXlae_adultbrain_OrthoFinder_plotlydata.tsv".
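As a rough sketch of loading one of the .h5ad files with Scanpy: `scanpy.read_h5ad` returns an AnnData object, and Scanpy's tools generally expect cells as rows. Because the record describes the matrices as features x cells, the helper below optionally transposes. Whether the stored orientation actually needs transposing is an assumption, so check `adata.shape` against the expected number of cells first. The helper name `load_cells_by_features` is hypothetical.

```python
def load_cells_by_features(path, features_in_rows=True):
    """Read an .h5ad file and return an AnnData with cells as rows.

    Assumption: the file stores features in rows and cells in columns,
    as the record's wording suggests. Pass features_in_rows=False if the
    loaded object is already cells x features.
    """
    import scanpy as sc  # imported lazily; requires scanpy to be installed

    adata = sc.read_h5ad(path)
    # AnnData.T returns the transposed object (obs and var swapped).
    return adata.T if features_in_rows else adata
```

After loading, `adata.obs_names` should contain the cell barcodes and `adata.var_names` the Orthogroup or Structural cluster identifiers if the orientation is correct.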
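The plot-data files can be parsed into per-group coordinate lists that are ready to hand to a plotting library. This sketch assumes the files have a header row; the column names are not documented in this record, so `x_col`, `y_col`, and `group_col` are caller-supplied placeholders rather than real column names.

```python
import csv
from collections import defaultdict


def read_umap_points(path, x_col, y_col, group_col=None):
    """Return {group: [(x, y), ...]} from a tab-separated plot-data file.

    x_col, y_col and group_col are header names chosen by the caller
    after inspecting the file; they are not taken from the record.
    """
    points = defaultdict(list)
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            group = row[group_col] if group_col else "all"
            points[group].append((float(row[x_col]), float(row[y_col])))
    return dict(points)
```

Grouping by a column such as species or cell type makes it straightforward to draw one scatter trace per group.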
These data are part of the Arcadia Science Pub titled "Comparing gene expression across species based on protein structure instead of sequence".
Files
(750.4 MB)
MD5 checksum | Size |
---|---|
md5:4dc9e54e572b44e744fb36fbaf301585 | 570.0 kB |
md5:9cd4860496d1b22ae9468c79b83770d6 | 31.2 MB |
md5:2332f343901aee6dc0bc00f10b63ccaa | 326.2 MB |
md5:673af312c7b18ea2711cae325dc0e31c | 29.2 MB |
md5:239bc5e187cda4c9b371c80acce3af0b | 184.4 MB |
md5:d3ab3dc83308cd6216daf58b5df1d1e0 | 2.0 MB |
md5:64e795e69280c651d1959bec8ae1336c | 2.0 MB |
md5:016b5bdbdd509283b4ddff3eb38d9209 | 2.9 MB |
md5:73ddd61f9087b84a8974770329f4043a | 1.4 MB |
md5:2495e96e1bc582dafca23a391b557b86 | 1.4 MB |
md5:8c67af8b8d875e734716c8799780d287 | 11.7 MB |
md5:22d0fe8829c2c6fd77c8596b233b9ce3 | 9.6 MB |
md5:68d3fc3973e6dd990d5eee0d93490f93 | 11.7 MB |
md5:8c4be3dd33415badfa1f5e52e9845036 | 25.6 MB |
md5:eb07599f82e1425f7c3d0e6619da7492 | 18.2 MB |
md5:2b059574f0f05ac702b27b8f65d669fc | 15.6 MB |
md5:0dd1c7205c648ab8588b37e8f5ca22eb | 22.8 MB |
md5:90969c318f0d0ed665e962238547fbbe | 15.8 MB |
md5:0916c212573c093f9ec78c801a3b557f | 37.6 MB |
md5:0652a93c123d5b5b359a6edeeb72c317 | 178.3 kB |
md5:75483fd9275d7d9acf341f4307caee09 | 356.7 kB |
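The md5 values listed above can be used to check that a download completed intact. This is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the file path passed in is whatever name the file was saved under locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large matrices need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a listed value such as 'md5:4dc9...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

Reading in chunks matters here because the largest file in the record is over 300 MB.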