Published April 17, 2023 | Version 0.1
Dataset Open

Joint embedding of vertebrate brain single-cell RNA-Seq using sequence or structure

  • 1. Arcadia Science

Description

Embeddings of single-cell RNA-Seq data from three adult vertebrate brain datasets into Orthogroup feature space or Structural cluster feature space. Orthogroups were generated using OrthoFinder v5.5.0; Structural clusters were assigned by using FoldSeek to cluster AlphaFold-v4 structural predictions.

The three datasets used as the basis for these embeddings were:

For each dataset, we also generated a standardized cell type annotation file based on the cell type annotations originally provided by the authors. The first column is the cell barcode for that species and the second column is the original study's cell type annotation for that cell.
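As a minimal sketch of how such an annotation file could be read, the snippet below parses a two-column table into a barcode-to-cell-type mapping and tallies cells per type. The tab delimiter, the absence of a header row, and the function name `read_annotations` are assumptions, not details confirmed by this record; adjust them to match the actual files.

```python
import csv
import io
from collections import Counter


def read_annotations(handle, delimiter="\t", has_header=False):
    """Map each cell barcode (column 1) to its cell type (column 2)."""
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    annotations = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        barcode, cell_type = row[0], row[1]
        annotations[barcode] = cell_type
    return annotations


# Toy input standing in for a real annotation file (values are made up).
example = io.StringIO("AAAC-1\tneuron\nAAAG-1\tastrocyte\nAAAT-1\tneuron\n")
annotations = read_annotations(example)
cells_per_type = Counter(annotations.values())
print(cells_per_type.most_common())
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`.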

For the Xenopus brain data, we removed approximately 18,000 cells that were not annotated in the original data to simplify analysis; the resulting data are in the files with the "subsampled" suffix. Subsampled versions of the data are also available for the joint embedding space (prefixed with "DrerMmusXlae").

For the final datasets used in our analyses, we also provide feature x cell matrices as .h5ad files, which offer smaller file sizes and faster loading with Scanpy.
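Loading one of these files might look like the sketch below. `scanpy.read_h5ad` is the standard Scanpy reader; the function name `load_embedding`, the `transpose` flag, and the path in the usage comment are hypothetical. Because the matrices are described as features x cells while Scanpy conventionally works with cells x features, a transpose may be needed, but check the orientation of each file before relying on it.

```python
def load_embedding(path, transpose=False):
    """Load an .h5ad file and optionally swap its two axes."""
    import scanpy as sc  # imported lazily so this module loads without Scanpy

    adata = sc.read_h5ad(path)
    if transpose:
        # AnnData.T swaps observations (rows) and variables (columns).
        adata = adata.T
    return adata


# Example usage (hypothetical path; requires Scanpy to be installed):
# adata = load_embedding("path/to/file.h5ad")
# print(adata.shape)       # (n_obs, n_vars)
# print(adata.obs.head())  # per-observation metadata
```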

To visualize UMAP plots of our top200 embedding space, we provide .tsv files containing a variety of metrics along with the x and y positions of each cell in the UMAP. See "DrerMmusXlae_adultbrain_FoldSeek_plotlydata.tsv" and "DrerMmusXlae_adultbrain_OrthoFinder_plotlydata.tsv".
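A sketch of reading one of these tables with the Python standard library follows. The column names `x`, `y`, and `cell_type` used here are placeholders, since this record does not list the actual headers; inspect the first line of the file and substitute the real names.

```python
import csv
import io


def read_umap_points(handle, x_col, y_col, label_col=None):
    """Return (x, y, label) tuples from a tab-separated table with a header."""
    reader = csv.DictReader(handle, delimiter="\t")
    points = []
    for row in reader:
        label = row[label_col] if label_col else None
        points.append((float(row[x_col]), float(row[y_col]), label))
    return points


# Toy table; the header names below are assumed, not taken from the real files.
toy = io.StringIO("x\ty\tcell_type\n1.5\t-2.0\tneuron\n0.25\t3.0\tglia\n")
points = read_umap_points(toy, x_col="x", y_col="y", label_col="cell_type")
xs = [p[0] for p in points]
ys = [p[1] for p in points]
print(min(xs), max(xs), min(ys), max(ys))  # bounding box of the embedding
```

The resulting tuples can be passed to any plotting library to draw the scatter plot, colored by the label column if one is given.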

These data are part of the Arcadia Science Pub titled "Comparing gene expression across species based on protein structure instead of sequence".

Files

Files (750.4 MB)

MD5 checksum                           Size
md5:4dc9e54e572b44e744fb36fbaf301585   570.0 kB
md5:9cd4860496d1b22ae9468c79b83770d6   31.2 MB
md5:2332f343901aee6dc0bc00f10b63ccaa   326.2 MB
md5:673af312c7b18ea2711cae325dc0e31c   29.2 MB
md5:239bc5e187cda4c9b371c80acce3af0b   184.4 MB
md5:d3ab3dc83308cd6216daf58b5df1d1e0   2.0 MB
md5:64e795e69280c651d1959bec8ae1336c   2.0 MB
md5:016b5bdbdd509283b4ddff3eb38d9209   2.9 MB
md5:73ddd61f9087b84a8974770329f4043a   1.4 MB
md5:2495e96e1bc582dafca23a391b557b86   1.4 MB
md5:8c67af8b8d875e734716c8799780d287   11.7 MB
md5:22d0fe8829c2c6fd77c8596b233b9ce3   9.6 MB
md5:68d3fc3973e6dd990d5eee0d93490f93   11.7 MB
md5:8c4be3dd33415badfa1f5e52e9845036   25.6 MB
md5:eb07599f82e1425f7c3d0e6619da7492   18.2 MB
md5:2b059574f0f05ac702b27b8f65d669fc   15.6 MB
md5:0dd1c7205c648ab8588b37e8f5ca22eb   22.8 MB
md5:90969c318f0d0ed665e962238547fbbe   15.8 MB
md5:0916c212573c093f9ec78c801a3b557f   37.6 MB
md5:0652a93c123d5b5b359a6edeeb72c317   178.3 kB
md5:75483fd9275d7d9acf341f4307caee09   356.7 kB
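
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's `hashlib`; the function names and the chunk size are illustrative choices, and the file path in the usage comment is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example usage (hypothetical local path):
# ok = matches_listing("downloads/some_file.tsv",
#                      "md5:4dc9e54e572b44e744fb36fbaf301585")
```

Reading in chunks keeps memory use low for the larger files in this record.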