Published December 16, 2024 | Version v1
Dataset Open

Processed GTEx Data From Experimental and Computational Methods for Allelic Imbalance Analysis from Single-Nucleus RNA-seq Data

Description

Preprint abstract (from https://www.biorxiv.org/content/10.1101/2024.08.13.607784v1): Single-cell RNA-seq (scRNA-seq) is emerging as a powerful tool for understanding gene function across diverse cells. Recently, this has included the use of allele-specific expression (ASE) analysis to better understand how variation in the human genome affects RNA expression at the single-cell level. We reasoned that because intronic reads are more prevalent in single-nucleus RNA-Seq (snRNA-Seq), and introns are under lower purifying selection and thus enriched for genetic variants, that snRNA-seq should facilitate single-cell analysis of ASE. Here we demonstrate how experimental and computational choices can improve the results of allelic imbalance analysis. We explore how experimental choices, such as RNA source, read length, sequencing depth, genotyping, etc., impact the power of ASE-based methods. We developed a new suite of computational tools to process and analyze scRNA-seq and snRNA-seq for ASE. As hypothesized, we extracted more ASE information from reads in intronic regions than those in exonic regions and show how read length can be set to increase power. Additionally, hybrid selection improved our power to detect allelic imbalance in genes of interest. We also explored methods to recover allele-specific isoform expression levels from both long- and short-read snRNA-seq. To further investigate ASE in the context of human disease, we applied our methods to a Parkinson’s disease cohort of 94 individuals and show that ASE analysis had more power than eQTL analysis to identify significant SNP/gene pairs in our direct comparison of the two methods. Overall, we provide an end-to-end experimental and computational approach for future studies.

 

This repository contains the clustering of the single-nucleus cortex GTEx data presented in Figure 2 of the above preprint. The data are also available in the Broad single cell repository under accession SCP2873.

Files

cells_counts_Samp.txt

Files (394.4 MB)

MD5 checksum | Size
md5:e5798977926ac3e2e203407e59fd4990 | 518.4 kB
md5:e5798977926ac3e2e203407e59fd4990 | 518.4 kB
md5:731f031cd64c1ebc47658ceab6a20e96 | 174.7 MB
md5:241c16f164b7d63b6ded3c31db333f3c | 212.0 MB
md5:37c96fc48e1b0c9270295f3db40cd25f | 1.1 MB
md5:37c96fc48e1b0c9270295f3db40cd25f | 1.1 MB
md5:3e91d492acd3ad3cbb9ba439edb05972 | 3.2 MB
md5:0695b292d086fbd4a7172066f9f6ddf9 | 1.2 MB
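
Because each file is listed with an MD5 checksum, a download can be checked against the record before use. The following is a minimal sketch, not part of the dataset's own tooling: the function names `md5_of_file` and `matches_listed_checksum` and the local path in the usage comment are hypothetical, and the record does not state which checksum belongs to which file name, so the pairing must be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that the larger files (hundreds of MB) are not loaded at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Hypothetical usage: the path is a placeholder for wherever the file
# was saved, and the checksum is copied from the file listing.
# ok = matches_listed_checksum("path/to/downloaded_file",
#                              "md5:731f031cd64c1ebc47658ceab6a20e96")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.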

Additional details

Funding

Aligning Science Across Parkinson's
Parkinson5D: deconstructing proximal disease mechanisms across cells, space, and progression ASAP-000301