Published January 8, 2021 | Version v1
Dataset Open

The genetic basis of cytoplasmic male sterility and fertility restoration in wheat

Description

Hybrid wheat varieties give higher yields than conventional lines but are difficult to produce due to a lack of effective control of male fertility in breeding lines. One promising system involves the Rf1 and Rf3 genes that restore fertility of wheat plants carrying Triticum timopheevii-type cytoplasmic male sterility (T-CMS). By genetic mapping and comparative sequence analyses we identified Rf1 and Rf3 candidates that could restore normal pollen production in transgenic wheat plants carrying T-CMS. We show that Rf1 and Rf3 bind to the mitochondrial orf279 transcript and induce cleavage, preventing expression of the CMS trait. The identification of restorer genes in wheat is an important step towards the development of hybrid wheat varieties based on a CMS-Rf system. The characterisation of their mode of action brings new insights into the molecular basis of CMS and fertility restoration in plants.

This dataset includes transcript count and coverage data from two RNA-seq experiments that measured gene expression in the male-sterile and male-fertile wheat lines examined in the course of this research.

Notes

For dataset 1, the files included here are:

  • experimental_design.xlsx — lists the samples and genotypes
  • references — folder of fasta files containing reference transcripts for the respective genotypes (input to Salmon)
  • quants — folder of quant.sf files containing nuclear/cytosolic transcript counts (output from Salmon)
  • mt_quants — folder of quant.sf files containing mitochondrial transcript counts (output from Salmon)
  • rnaseq.ipynb — Jupyter notebook (Python code) to reproduce Fig. 2b and Fig. S2 from the paper using the quants files; requires the Python packages pandas, numpy, matplotlib, seaborn, scikit-learn (imported as sklearn) and diffexpr (https://github.com/wckdouglas/diffexpr)
  • mt.ipynb — Jupyter notebook (Python code) to reproduce Figs. 2c and 2d from the paper using the mt_quants files
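For readers who want to inspect the Salmon output without running the notebooks, the following is a minimal sketch of how the quant.sf files could be gathered into one table. It is not taken from rnaseq.ipynb or mt.ipynb. The column names (Name, Length, EffectiveLength, TPM, NumReads) are the standard Salmon quant.sf header. The folder layout is an assumption, and the function names `read_quant_sf` and `collect_quants` are hypothetical.

```python
import csv
from pathlib import Path


def read_quant_sf(path, column="NumReads"):
    """Return {transcript name: value} for one Salmon quant.sf file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["Name"]: float(row[column]) for row in reader}


def collect_quants(folder, column="NumReads"):
    """Read every *.sf file below `folder` into {sample: {transcript: value}}.

    Assumption (not stated in the record): a file named quant.sf sits in a
    per-sample subfolder, so the subfolder name is used as the sample label;
    any other *.sf file is labelled by its own file name without extension.
    """
    table = {}
    for path in sorted(Path(folder).rglob("*.sf")):
        sample = path.parent.name if path.name == "quant.sf" else path.stem
        table[sample] = read_quant_sf(path, column)
    return table
```

Calling `collect_quants("quants")` or `collect_quants("mt_quants", column="TPM")` from the unpacked dataset 1 folder would then give one dictionary per sample, which can be passed to `pandas.DataFrame` if a transcript-by-sample matrix is preferred.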

For dataset 2, the files included here are:

  • references — folder with a fasta file of reference transcripts (input to Salmon)
  • RNASeq_quants.xlsx — table of read counts extracted from Salmon output
  • mt_cov — folder of strand-specific read coverage files (generated by genomeCoverageBed from the bedtools2 package)
  • Transgene_TPM.ipynb — Jupyter notebook (Python code) to reproduce Fig. 3c from the paper using the quants files
  • mt_coverage.ipynb — Jupyter notebook (Python code) to reproduce Figs. 5 and S5 from the paper using the mt_cov files
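The coverage files in mt_cov can also be read directly. The sketch below is not the code from mt_coverage.ipynb, and the record does not say which genomeCoverageBed output mode was used. It therefore assumes one of two common layouts: per-base output (chromosome, 1-based position, depth) or BedGraph output (chromosome, 0-based start, end, depth). The function names `parse_coverage` and `mean_depth` are hypothetical.

```python
def parse_coverage(lines):
    """Yield (chrom, position, depth) tuples with 1-based positions.

    Assumed layouts (check the actual files before relying on this):
      3 columns: chrom, 1-based position, depth        (per-base output)
      4 columns: chrom, 0-based start, end, depth      (BedGraph output)
    """
    for line in lines:
        if not line.strip() or line.startswith(("track", "#")):
            continue  # skip blank lines and optional header lines
        fields = line.split()
        if len(fields) == 3:
            chrom, pos, depth = fields
            yield chrom, int(pos), float(depth)
        elif len(fields) == 4:
            chrom, start, end, depth = fields
            # Expand the half-open interval into individual 1-based positions.
            for pos in range(int(start) + 1, int(end) + 1):
                yield chrom, pos, float(depth)
        else:
            raise ValueError(f"unexpected coverage line: {line!r}")


def mean_depth(records, chrom, start, end):
    """Mean depth over 1-based positions start..end (inclusive).

    Positions absent from `records` count as zero coverage.
    """
    total = sum(d for c, p, d in records if c == chrom and start <= p <= end)
    return total / (end - start + 1)
```

Because the files are strand-specific, each strand's file would be parsed separately, for example with `records = list(parse_coverage(open(path)))`, before comparing depth across a region of interest.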

The source data underlying Figs 2a, 3b-f, 4c-e, 6c, 7b and Supplementary Figs S3b, S4b-e, S6a and S7b-d are provided as a Source Data zip file.

Funding provided by: Australian Research Council
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100000923
Award Number: CE140100008

Files

Source_Data.zip

Files (604.5 MB)

Checksum (size)
md5:c760fb0a8cd9786d57b0be26c5f979d9 (386.0 MB)
md5:8efbaba4adfbc7cdb3a1f92e17926050 (100.1 MB)
md5:b44a28302ec38d7d1d835f9dd85b3985 (118.4 MB)
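A downloaded archive can be checked against the md5 values listed above. The following is a minimal sketch using Python's standard hashlib module. The record does not state which file name belongs to which checksum, so the path passed in is left to the reader, and the function names `md5_of_file` and `matches_checksum` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(path_to_download, "md5:c760fb0a8cd9786d57b0be26c5f979d9")` returns True only if the file at that path has the first checksum in the list.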