There is a newer version of the record available.

Published October 29, 2020 | Version v3
Dataset Open

The impact of biological sex on alternative splicing

  • 1. The Jackson Laboratory
  • 2. Science and Technology Consulting LLC
  • 3. Lifebit Biotech Ltd
  • 4. University of Iceland
  • 5. Berlin Institute of Health, Charité-Universitätsmedizin Berlin
  • 6. Linus Pauling Institute
  • 7. Translational Informatics Division, Department of Internal Medicine, The University of New Mexico Health Science Center
  • 8. The Jackson Laboratory, Institute for Systems Genomics, University of Connecticut, Department of Genetics and Genome Sciences, UConn Health

Description

These files were pulled from the results obtained by executing the Nextflow workflow rmats-nf and a number of notebooks. How each file was generated is noted below.

  1. as.tar.gz - This file contains the output of executing differentialSplicingJunctionAnalysis.ipynb. When untarred and unzipped, it holds results for each tissue and each of the 5 alternative splicing types. Both the statistically significant results (FC > 1.5, p-value < 0.05) and the entirety of the results are included. To do this efficiently, the notebook was run with the help of Papermill as a Nextflow workflow.
  2. fromGTF.tar.gz - One file for each splicing type, generated within rmats-nf: fromGTF.A3SS.txt, fromGTF.A5SS.txt, fromGTF.MXE.txt, fromGTF.RI.txt, fromGTF.SE.txt.
  3. dge.tar.gz - This file contains the output of executing differentialGeneExpressionAnalysis.ipynb. Both the statistically significant results (FC > 1.5, p-value < 0.05) and the entirety of the results are included.
  4. srr.tar.gz - This file contains the Sequence Run (SRR) data merged with phenotype data, generated by differentialGeneExpressionAnalysis.ipynb.
  5. rmats_final.tar.gz - For each splicing type, there are 5 files, each a matrix with one value for each junction and each sample (SRR), generated by rmats-nf: included junction counts (ijc), inclusion lengths (inclen), percent spliced in as calculated by rMATS 3.2.5, skipped junction counts (sjc), and skipped junction lengths (skiplen).
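
The thresholds and matrices described above can be illustrated with a minimal Python sketch. It is not taken from the notebooks or from rmats-nf. It assumes that fold changes are stored on a log2 scale and that the FC > 1.5 cutoff applies in both directions, neither of which is stated in this record. The function names are hypothetical. The percent-spliced-in function uses the length-normalised inclusion level commonly given for rMATS; the values in rmats_final.tar.gz are those computed by rMATS itself.

```python
import math

FC_THRESHOLD = 1.5   # fold-change cutoff quoted in the description
P_THRESHOLD = 0.05   # p-value cutoff quoted in the description


def is_significant(log2_fc, p_value):
    """Return True if a result passes both cutoffs.

    Assumption: the fold change is given as log2(FC) and the cutoff
    is applied to its absolute value (either direction counts).
    """
    return abs(log2_fc) > math.log2(FC_THRESHOLD) and p_value < P_THRESHOLD


def percent_spliced_in(ijc, sjc, inclen, skiplen):
    """Length-normalised inclusion level for one junction in one sample.

    ijc / sjc are the included / skipped junction counts and
    inclen / skiplen the corresponding effective lengths.
    Returns None when there are no supporting reads at all.
    """
    included = ijc / inclen
    skipped = sjc / skiplen
    total = included + skipped
    if total == 0:
        return None
    return included / total
```

For example, a junction with ijc = 10, sjc = 10, inclen = 2 and skiplen = 1 has normalised counts of 5 and 10, giving an inclusion level of one third.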

Notes

https://github.com/lifebit-ai/rmats-nf was run on Genotype-Tissue Expression (GTEx) RNAseq data (an application to dbGaP for access to the dataset phs000424.v8.v2 is required). These raw data are under controlled access and subject to the data use agreement thereunder. Public posting of Genomic Summary Results is permitted. In this release we post genomic summary results from our analysis using rMATS 3.2.5 in the aforementioned Nextflow workflow.
We acknowledge the Genotype-Tissue Expression (GTEx) Project. The GTEx Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health (commonfund.nih.gov/GTEx). Additional funds were provided by the NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. Donors were enrolled at Biospecimen Source Sites funded by NCI\Leidos Biomedical Research, Inc. subcontracts to the National Disease Research Interchange (10XS170), Roswell Park Cancer Institute (10XS171), and Science Care, Inc. (X10S172). The Laboratory, Data Analysis, and Coordinating Center (LDACC) was funded through a contract (HHSN268201000029C) to The Broad Institute, Inc. Biorepository operations were funded through a Leidos Biomedical Research, Inc. subcontract to Van Andel Research Institute (10ST1035). Additional data repository and project management were provided by Leidos Biomedical Research, Inc. (HHSN261200800001E). The Brain Bank was supported by supplements to University of Miami grant DA006227. Statistical Methods development grants were made to the University of Geneva (MH090941 & MH101814), the University of Chicago (MH090951, MH090937, MH101825, & MH101820), the University of North Carolina - Chapel Hill (MH090936), North Carolina State University (MH101819), Harvard University (MH090948), Stanford University (MH101782), Washington University (MH101810), and the University of Pennsylvania (MH101822).
The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/gap through dbGaP accession number phs000424.v8.v2.

Files

Files (2.7 GB)

Checksum                              Size
md5:1d8fed756af4999370cfaac8d87611b9  60.0 MB
md5:0cf77d2f53edb2533a013fb55795e1f2  34.5 MB
md5:959c396f5068e8dd7dec8fa3f7831282  1.7 MB
md5:855fe928319779a1fb44e1c7ffc6ec41  1.4 GB
md5:3f22acfea45c16757ae6776c882337f2  1.2 GB
md5:599d50910deb418f1d0fcf4c2065a658  1.3 MB
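
The checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and because the listing does not pair checksums with file names, the check only tests whether a file's MD5 matches any of the listed values.

```python
import hashlib

# MD5 values as listed on this record (the "md5:" prefix removed).
# The listing does not say which archive each value belongs to.
LISTED_MD5 = {
    "1d8fed756af4999370cfaac8d87611b9",
    "0cf77d2f53edb2533a013fb55795e1f2",
    "959c396f5068e8dd7dec8fa3f7831282",
    "855fe928319779a1fb44e1c7ffc6ec41",
    "3f22acfea45c16757ae6776c882337f2",
    "599d50910deb418f1d0fcf4c2065a658",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte archives are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 equals one of the listed values."""
    return md5_of_file(path) in LISTED_MD5
```

Calling matches_listed_checksum on a downloaded archive, for instance a local copy of dge.tar.gz, returns False if the file was truncated or corrupted in transit.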