Published October 25, 2021 | Version 1.0
Dataset Open

Gene expression and splicing counts from 49 tissues from GTEx v6p genome build hg19 - non-strand specific

Description

Dataset description:

49 folders, each corresponding to one tissue from GTEx v6p and containing the following files:

  1. geneCounts: gene-level counts 

  2. k_j: split counts spanning from one exon to another

  3. k_theta: non-split counts covering a splice site

  4. n_psi3: total split counts from a given acceptor site

  5. n_psi5: total split counts from a given donor site

  6. n_theta: total split and non-split counts for a given splice site

  7. Sample annotation describing each sample from the dataset

  8. Description file with global information from the dataset
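As a rough illustration of how the count files in the list above relate to each other, the sketch below combines them into per-junction and per-site ratios. The formulas (k_j divided by n_psi5 or n_psi3, and the split fraction of n_theta) are an assumption based on how such counts are commonly combined in splicing outlier analysis; this record does not state them, and the function names are hypothetical.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when there is no coverage."""
    return numerator / denominator if denominator > 0 else None


def splice_metrics(k_j, n_psi5, n_psi3, k_theta, n_theta):
    """Combine the counts for one junction and its splice site (assumed formulas).

    k_j     : split counts for the junction
    n_psi5  : total split counts from the junction's donor site
    n_psi3  : total split counts from the junction's acceptor site
    k_theta : non-split counts covering the splice site
    n_theta : total split and non-split counts for the splice site
    """
    return {
        "psi5": ratio(k_j, n_psi5),                 # junction usage at the donor
        "psi3": ratio(k_j, n_psi3),                 # junction usage at the acceptor
        "theta": ratio(n_theta - k_theta, n_theta), # fraction of reads that are split
    }


# Made-up counts for a single junction in a single sample
example = splice_metrics(k_j=30, n_psi5=40, n_psi3=60, k_theta=10, n_theta=50)
```

Returning None for zero coverage keeps uncovered sites distinguishable from sites with a ratio of zero.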

The gene counts were generated using the GTF file from release 29 of GENCODE, and the split and non-split counts contain only the annotated junctions from the same release. Statistics are reported only for GENCODE-annotated introns and splice sites, in compliance with the regulations of the GTEx consortium. For a description of the samples, methods, and protocols, see the GTEx publication specified below.

Use: The count matrices are intended for researchers who want to use RNA-Seq data for diagnostics. Researchers can merge their own dataset with the downloaded one, provided that the tissue, genome build, strand specificity, and paired-end specifications match. Afterwards, the Detection of RNA Outliers Pipeline (DROP) can be used to compute gene expression and splicing outliers.
Organism: Homo sapiens
Genome assembly: hg19
Gene annotation: gencode29
Strand specific: FALSE
Paired end: TRUE
Protocol: poly(A) enrichment
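The matching requirement described under "Use" can be checked before merging. The following is a minimal sketch, not part of DROP: the field names, the tissue label, and the in-memory table layout ({feature_id: {sample_id: count}}) are all hypothetical choices made for illustration, and the actual file layout of this dataset is not assumed.

```python
# Fields that must agree before merging, taken from the "Use" paragraph.
REQUIRED_FIELDS = ("tissue", "genome_assembly", "strand_specific", "paired_end")


def mismatched_fields(own_meta, external_meta):
    """Return the required fields whose values differ between two datasets."""
    return [f for f in REQUIRED_FIELDS if own_meta.get(f) != external_meta.get(f)]


def merge_counts(own_counts, external_counts):
    """Merge two {feature_id: {sample_id: count}} tables on their shared features."""
    merged = {}
    for feature in own_counts.keys() & external_counts.keys():
        own_row, ext_row = own_counts[feature], external_counts[feature]
        duplicated = own_row.keys() & ext_row.keys()
        if duplicated:
            raise ValueError(f"duplicate sample IDs: {sorted(duplicated)}")
        merged[feature] = {**own_row, **ext_row}
    return merged


# Hypothetical metadata; "Whole_Blood" is an illustrative tissue label.
external_meta = {"tissue": "Whole_Blood", "genome_assembly": "hg19",
                 "strand_specific": False, "paired_end": True}
own_meta = dict(external_meta, strand_specific=True)

problems = mismatched_fields(own_meta, external_meta)  # fields that block a merge

# Made-up counts: only features present in both tables are kept.
own_counts = {"geneA": {"S1": 5}, "geneB": {"S1": 2}}
external_counts = {"geneA": {"G1": 7}, "geneC": {"G1": 1}}
merged = merge_counts(own_counts, external_counts)
```

Keeping only shared features is one possible design choice; it avoids filling in counts for features that one of the two datasets never measured.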

Contact: Vicente A. Yepez, yepez at in.tum.de; Christian Mertes, mertes at in.tum.de; Julien Gagneur, gagneur at in.tum.de

Citation: Include the following text in the "Data availability" section of the manuscript (or a similar section), replacing the three citation placeholders with the corresponding entries from the References section below:

The count matrices for the GTEx samples <cite GTEx publication, see below> were downloaded from Zenodo (doi: 10.5281/zenodo.5596755) and were generated through DROP <cite DROP, see below> using release 29 of the GENCODE annotation <cite GENCODE, see below>.

Also, include the following in the Acknowledgements section:

The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The raw data used for the analyses described in this manuscript were obtained from the GTEx Portal on June 12, 2017, under dbGaP accession number phs000424.v6.p1.

Files

Files (8.9 GB)

Checksum (md5): 4eba0413b21c2fd5512b51ca13ca19d2
Size: 8.9 GB
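Given the size of the download, it may be worth comparing the downloaded file against the md5 checksum listed above. The sketch below uses only the Python standard library; the file path is whatever name the archive has locally, which this record does not specify.

```python
import hashlib

# Checksum listed for this record
EXPECTED_MD5 = "4eba0413b21c2fd5512b51ca13ca19d2"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's md5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use constant, which matters for a file of several gigabytes.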

Additional details

Funding

National Institutes of Health

References

  • Yépez, V.A.; Mertes, C.; Müller, M.F. et al. Detection of aberrant gene expression events in RNA sequencing data. Nat Protoc 16, 1276–1296 (2021). https://doi.org/10.1038/s41596-020-00462-5
  • GTEx Consortium. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017). https://doi.org/10.1038/nature24277
  • Frankish, A.; Diekhans, M.; Ferreira, A-M. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res 47, D766–D773 (2019). https://doi.org/10.1093/nar/gky955