Published August 28, 2022 | Version v1
Thesis Open

Whole-transcriptome analysis of protein-coding potential in the model plant Medicago truncatula

  • 1. Bogazici University

Contributors

  • 1. UAE University
  • 2. Bogazici University

Description

How many different proteins can be produced from a single spliced transcript? Genome annotation projects typically annotate only the longest open reading frame (ORF) in each transcript and call it the reference ORF (refORF). Non-reference ORFs in the same transcript that are predicted to encode proteins are referred to as alternative ORFs (altORFs), and the proteins translated from them are termed alternative proteins (altProts). Genome annotation projects usually do not consider the coding potential of altORFs, even though many altProts have been shown to carry out essential functions in various organisms. In addition to the protein-coding potential present in all three reading frames, spliced eukaryotic transcripts may undergo one or more programmed ribosomal frameshifting events. A protein produced through a single frameshifting event is called a chimeric protein, whereas a protein produced through several such events is called a mosaic protein. Proteins produced via single ribosomal frameshifting events have long been known in viruses and, more recently, have also been found in higher eukaryotes. In contrast, mosaic proteins have so far remained elusive, with only one example found in viruses. Detecting altORFs can help identify these unusual proteins because altORFs may act as building blocks for chimeric and mosaic proteins. Extracting and combining genetic information from different reading frames in this way may significantly increase proteome diversity, thus promoting organisms' flexibility and adaptability to various environmental conditions. This project aims to identify altProts based on conservation evidence or detection by mass spectrometry (MS) analysis, and to find proteins produced via single and multiple ribosomal frameshifting events in order to demonstrate the existence of mosaic translation. Our study in Medicago truncatula, a well-established model legume, detected 715 translated altProts and 146 chimeric proteins. Two transcripts support the existence of mosaic proteins and mosaic translation, which has never been detected in non-viral organisms before. In addition, we have found evidence for many thousands of conserved altProts. This work pioneers a new field of proteomics and is of immense value for plant biologists and specialists interested in translation. It also paves the way towards a major shift in the current understanding of proteome complexity and diversity.
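The refORF/altORF distinction described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the thesis: the names `Orf`, `find_orfs`, and `split_ref_and_alt` are hypothetical, the scan assumes the standard genetic code (ATG start; TAA, TAG, TGA stops), covers only the three forward reading frames, and keeps just the first ATG of each stop-delimited stretch.

```python
from typing import List, NamedTuple, Optional, Tuple

# Assumption: standard genetic code start and stop codons.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


class Orf(NamedTuple):
    frame: int  # reading-frame offset on the forward strand: 0, 1, or 2
    start: int  # 0-based index of the first base of the start codon
    end: int    # index just past the last base of the stop codon


def find_orfs(transcript: str, min_codons: int = 2) -> List[Orf]:
    """Scan all three forward frames for ATG...stop ORFs.

    min_codons counts codons before the stop codon (start codon included).
    """
    seq = transcript.upper().replace("U", "T")
    orfs: List[Orf] = []
    for frame in range(3):
        start: Optional[int] = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append(Orf(frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def split_ref_and_alt(orfs: List[Orf]) -> Tuple[Optional[Orf], List[Orf]]:
    """Label the longest ORF as the refORF and all others as altORFs."""
    if not orfs:
        return None, []
    ref = max(orfs, key=lambda o: o.end - o.start)
    alts = [o for o in orfs if o != ref]
    return ref, alts
```

For the toy sequence `ATGCATGGCCTAAGGTTTTAA`, `find_orfs` reports one ORF in frame 0 spanning the whole sequence and a shorter ORF in frame 1 nested inside it; `split_ref_and_alt` then labels the first as the refORF and the second as an altORF. Deciding whether an altORF is actually translated requires the conservation or MS evidence described above, which this sketch does not model.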

This study was funded by the Scientific and Technological Research Council of Turkey (TUBITAK) 1002 Short Term R&D Funding Program (No. 120Z247) and Boğaziçi University Scientific Research Projects, BAP, Funding Program (No. 18841). I am grateful for the funding and hope that these programs can support many young scientists in the future as well. I would like to thank TUBITAK ULAKBIM, High Performance and Grid Computing Center, TRUBA, for offering the opportunity to analyse a huge amount of data. 

Files

Files (1.4 GB)

Checksum                              Size
md5:fb715c0f7a6b6dfbb8e7bfe7d1b8a213  77.7 MB
md5:9ee4c085c0868bf0347fd6e9aeecca39  78.2 MB
md5:e989bb230e31cd50ed2987f58931d71d  159.9 MB
md5:11fff76fe017f9284988579810d0b867  379.5 MB
md5:e660086e85c2abdbda19e1c6dc6a338e  695.6 MB
md5:ff6e6776590f7fb13543eac0fb10ae0b  41.4 kB
md5:399c902026003d6d1a1e7e805e90a640  42.2 kB