Published February 14, 2018 | Version v1
Dataset Open

Datasets and Scripts: Full-length mRNA sequencing uncovers a widespread coupling between transcription and mRNA processing

  • 1. Leiden University Medical Center, Netherlands
  • 2. Pacific Biosciences, USA
  • 3. Dana-Farber Cancer Institute, USA
  • 4. LGC Bioresearch Technologies, USA

Description

The multilayered control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. In this study, we examined the interdependence of transcription, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. In MCF-7 breast cancer cells and three human tissues, we found an unforeseen number of genes that demonstrate mutually inclusive or exclusive alternative transcription and mRNA processing events, which can span the entire length of mRNA molecules. Furthermore, alternative poly(A) sites that are coupled with alternative splicing events are depleted for known poly(A) signals and enriched for MBNL binding motifs, supporting a dual role of MBNL proteins in regulating splicing and polyadenylation. We predict thousands of open-reading frames from the sequence of full-length mRNAs, allowing for a more sensitive proteogenomics analysis of MCF-7 mass-spectrometry data. Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a framework to reveal largely unresolved mechanisms that coordinate transcription and mRNA processing.

Notes

This repository hosts all datasets that were used during the course of the study.

Files

Files (146.5 MB)

Checksum: md5:7af62deea3ce8323c02caa46485b5db7
Size: 146.5 MB
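
After downloading the archive, its integrity can be checked against the MD5 checksum listed above. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_record` are illustrative and not part of the deposited scripts, and the local file path is whatever name the download was saved under (the record does not state a file name here).

```python
import hashlib

# MD5 checksum as listed on this record.
EXPECTED_MD5 = "7af62deea3ce8323c02caa46485b5db7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a large archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the path of the downloaded file returns `True` only if the file is byte-for-byte identical to the deposited one; a `False` result usually indicates an incomplete or corrupted download.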