Data from: Highly efficient and comprehensive identification of ethyl methanesulfonate-induced mutations in Nicotiana tabacum L. by whole-genome and whole-exome sequencing
Authors/Creators
- 1. Leaf Tobacco Research Center, Japan Tobacco Inc.
- 2. RIKEN Nishina Center for Accelerator-Based Science
Description
Tobacco (Nicotiana tabacum L.) is a complex allotetraploid species with a large 4.5-Gb genome that carries duplicated gene copies. In this study, we describe the development of a whole-exome sequencing (WES) procedure in tobacco and its application to characterize a test population of ethyl methanesulfonate (EMS)-induced mutations. A probe set covering 50.3-Mb protein coding regions was designed from a reference tobacco genome. The EMS-induced mutations in 19 individual M2 lines were analyzed using our mutation analysis pipeline optimized to minimize false positives/negatives. In the target regions, the on-target rate of WES was approximately 75%, and 61,146 mutations were detected in the 19 M2 lines. Most of the mutations (98.8%) were single nucleotide variants, and 95.6% of them were C/G to T/A transitions. The number of mutations detected in the target coding sequences by WES was 93.5% of the mutations detected by whole-genome sequencing (WGS). The amount of sequencing data necessary for efficient mutation detection was significantly lower in WES (11.2 Gb), which is only 6.2% of the required amount in WGS (180 Gb). Thus, WES was almost comparable to WGS in performance but is more cost effective. Therefore, the developed target exome sequencing, which could become a fundamental tool in high-throughput mutation identification, renders the genome-wide analysis of tobacco highly efficient.
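The "C/G to T/A transitions" figure above can be illustrated with a small sketch. It assumes each single nucleotide variant is given as a (reference base, alternate base) pair on the reference strand, so that an EMS-type change appears as either C to T or G to A. The function names are hypothetical, and this is not the authors' mutation analysis pipeline.

```python
# Hypothetical helper names; a minimal sketch, not the pipeline used in the study.
# Assumption: variants are (ref, alt) base pairs reported on the reference strand.
EMS_TYPE_CHANGES = {("C", "T"), ("G", "A")}


def is_ems_type_transition(ref, alt):
    """Return True if the SNV is a C-to-T or G-to-A transition."""
    return (ref.upper(), alt.upper()) in EMS_TYPE_CHANGES


def ems_fraction(snvs):
    """Return the fraction of SNVs that are EMS-type transitions."""
    snvs = list(snvs)
    if not snvs:
        return 0.0
    hits = sum(1 for ref, alt in snvs if is_ems_type_transition(ref, alt))
    return hits / len(snvs)
```

Applied to the SNV calls of a mutant line, `ems_fraction` would give a per-line value comparable to the 95.6% reported across the 19 M2 lines, provided the calls use the same strand convention.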
Files
(850.4 MB)
| Checksum | Size |
|---|---|
| md5:d20b657423d6f09a0323f678cd08f977 | 836.5 MB |
| md5:c897d8286a4f5245ccb96f6eed276d52 | 11.3 MB |
| md5:6e1f6e2114f06bc615bf54ad41183bf5 | 2.6 MB |
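
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the file path in the usage note are hypothetical, since the file names are not given in this record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:d20b...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file", "md5:d20b657423d6f09a0323f678cd08f977")` should return `True` for an uncorrupted copy of the 836.5 MB file, where `"downloaded_file"` stands in for wherever that file was saved.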