Rice Grain Quality Benchmarking Through Profiling of Volatiles and Metabolites in Grains Using Gas Chromatography Mass Spectrometry.

Gas chromatography coupled with mass spectrometry is widely used to profile volatiles and metabolites in homogenized rice flour obtained from mature grains. The rice grain consists of a central endosperm, which stores mainly starch and also accumulates various storage proteins as reserves. The outer, nutritious aleurone layer stores lipids, sugar alcohols, volatiles, antioxidants, vitamins, and various micronutrients. Once a paddy sample is dehulled, milled, and ground cryogenically, the brown rice flour is subjected to extraction of primary metabolites and volatiles using an appropriate extraction method. For metabolite profiling, the liquid extract obtained from the rice sample is first subjected to methoxyamination and then silylation before untargeted profiling. The peaks obtained are processed for noise reduction and specific signal selection. Volatile compounds are first extracted using a solid-phase adsorbent prior to analysis. All these compounds, both metabolites and volatiles, are detected in the mass selective detector after fragmentation at an ionization energy of 70 eV, and the resulting mass spectra are compared with a built-in library of compounds. Data mined from the gas chromatography mass spectrometry analysis are then subjected to post-processing statistical analysis.


Introduction
The chemical composition of seeds comprises a wide array of storage products, including starch, storage proteins, lipids, various intermediates of primary metabolism, and secondary metabolites with human health benefits [1,2]. Various metabolite profiling techniques, such as gas chromatography mass spectrometry (GC-MS), liquid chromatography mass spectrometry (LC-MS), capillary electrophoresis mass spectrometry (CE-MS), and chromatography-coupled nuclear magnetic resonance (NMR), are available with varying degrees of accuracy and sensitivity [3]. By combining the information obtained from several of these technologies, it is possible to cover the global metabolome with thousands of metabolites. Among these platforms, GC-MS is one of the key technologies applied routinely to profile several hundred metabolites, and it has helped reveal the metabolic signatures of different plant species that possess diverse seed storage products [4][5][6]. Over the last decade, GC-MS has been the most widely used platform owing to its superior chromatographic resolution, better reproducibility, repeatable mass fragmentation through electron impact ionization, and low operational and maintenance costs [7]. These attributes make GC-MS reliable for characterizing numerous compounds in different biological systems, especially at various stages of their development.
Rice grains contain up to 90% starch and 6-8% protein, preferentially in the endosperm, while the outer nutritious bran layer and endosperm contain lipids, antioxidants, dietary fiber, and other micronutrients [8]. Thus, rice remains an important cereal crop that provides a valuable source of energy for human consumption. GC-MS profiling has been instrumental in acquiring biological information for identifying varietal differences in rice grain quality [9] and starch quality [10], and for distinguishing aroma through volatile profiling [11][12][13]. A total of 759 metabolic signatures measured in the grains of 85 lines from a backcrossed inbred population were investigated for association mapping using metabolome quantitative trait loci analysis [14,15]. Additional applications include the identification of genotypic and phenotypic differences in germinating rice seeds [16]. The elucidation of varietal differences in seed storage composition [17], the implications of drought for seed storage composition [18], and the application of metabolomic techniques as a screening method for predicting rice quality traits have also been reported [19][20][21]. Metabolomic studies have also helped in investigating the molecular basis of rice quality [22] and in identifying metabolic variation, due to higher accumulation of sucrose, mannitol, and glutamic acid, between transgenic lines and their wild types and in mapping populations created from breeding lines [23]. The GC-MS platform has also been shown to be useful in elucidating the type of mutation in rice as part of food safety assessments [24].
Metabolites in rice grains are widely studied using the GC-MS platform for separation, identification, and quantification [7,25]. With extraction procedures tailored to the compounds of interest, GC-MS aids not only in identifying more than 100 primary metabolites but also in unraveling the volatile signatures of the sample being analyzed [11,26,27]. In this chapter, we describe the GC-MS protocols for volatiles and metabolites found in ground rice grains as performed at the Grain Quality and Nutrition Center, IRRI. The rice paddy is dehulled, milled, and ground with liquid nitrogen using a cryomill prior to extraction and GC-MS analysis. The resulting chromatogram is first processed for noise reduction and peak alignment before being subjected to statistical analysis (Fig. 1).
1. Analytical balance (10 mg-220 g ± 0.1 mg capacity).
4. Ball grinder (equipped with grinding jar with 50 g capacity for handling cryo-cooled samples).

Methods
Rice samples must be handled meticulously and, prior to extraction, kept under cryogenic conditions.
7. Centrifuge samples at 12,000 × g for 5 min.
8. Aliquot 1 mL of the polar supernatant into properly labeled 1.5 mL Eppendorf® Safe-Lock microcentrifuge tubes.
9. Dry the extracts overnight in a speed vacuum dryer or its equivalent.
10. Once the extracts are dried, remove the vials from the speed vacuum dryer (see Note 6). Store the dried extracts temporarily in a desiccator.
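Centrifugation steps in this protocol are given as relative centrifugal force (× g), whereas many benchtop centrifuges are set in rpm. The following minimal sketch converts between the two using the standard relation RCF = 1.118 × 10^-5 × r × rpm^2, with r in cm. The rotor radius used in the example is a hypothetical value and must be replaced with the one stated for the actual rotor.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_for_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) reached at a given rotor speed."""
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, rotor_radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * rotor_radius_cm))


if __name__ == "__main__":
    radius_cm = 8.4  # hypothetical rotor radius, not taken from the protocol
    print(f"Set the centrifuge to about {rpm_for_rcf(12000, radius_cm):.0f} rpm")
```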
To improve the volatility of some compounds, the analytes of interest must undergo methoxyamination followed by derivatization, in which the polar functional groups are modified to decrease their polarity so that the compounds can be separated on the GC column.
1. Cool the vials to room temperature.
4. Heat the mixture with shaking for 30 min at 37 °C to ensure that derivatization is complete.
6. Centrifuge samples at 12,000 × g for 30 s to settle any solid particles at the bottom of the glass insert. This also avoids clogging the GC syringe.
7. Once done, transfer the glass inserts into properly labeled 2 mL GC vials and analyze the samples in the GC-MS.

Derivatization of Compounds (See Note 7)
The method described in this procedure employs silylation to alter the targeted functional groups using N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) with 1% trimethylchlorosilane (TMCS). With BSTFA, the active hydrogens of compounds containing -SH, -OH, -NH, and -COOH groups are replaced with the trimethylsilyl (TMS) group. This process is catalyzed by TMCS, which is already mixed with BSTFA at 1%. Pyridine is used as the solvent and contributes to the reactivity by accepting the protons (H+) released during the derivatization process. Many published methods use different reagents to derivatize compounds of different classes, such as MSTFA with 1% TMCS [25] in rice and in biological sample extracts [28]; BSTFA with 1% TMCS [23] in transgenic rice; MSTFA [24] in rice; and hydroxylamine hydrochloride, hexamethylsilylimidazole, and trifluoroacetic acid [29] for sugars in carrots. A comprehensive discussion of the derivatization of sugars [30] offers a selection of procedures for the analysis of carbohydrates that may be useful in rice research.

GC-MS Run (See Note 8)
For every batch run on the GC, include a reagent blank and a set of standards in a quality control (QC) mix. Both the blank and the QC mix should undergo the same extraction and derivatization procedure as the samples.
Variability in samples can arise from multiple sources, including physiological differences and the analytical method itself. Measuring metabolites by mass spectrometry to explore natural variation in a diversity panel requires care in obtaining a homogeneous representation of tissue samples, including sufficient biological and technical replication. In addition, analytical variation caused by suboptimal performance of the chosen apparatus and instrument drift over time are major issues in large-scale metabolomics studies that require further attention. Batch-to-batch variation is another technical source of variation, arising from the sum of both manual and robotic sample handling. The presence of batch-to-batch variation makes it difficult to integrate data from independent batches of samples. This issue is particularly problematic when dealing with a large number of samples, as is the case when analyzing structured plant populations. To counter this, several normalization methods have been developed to minimize nonbiological variation. For example, normalization by single or multiple internal or external standard compounds [7, 31] has been considered. Similarly, isotope-labeled internal standard approaches [32] were established to monitor analytical error. While there is no single best way to conduct metabolomic studies, there are a number of pitfalls and known problems that need to be carefully avoided. Detailed guidelines, practices, and normalization protocols [33] have been published for this purpose. As the number of samples in the data set increases, there is a corresponding time-dependent variation in the metabolite data. Removing platform-specific sources of variability such as systematic errors is one of the top priorities in metabolomics data preprocessing. However, metabolite diversity leads to different responses to variations under given experimental conditions, making normalization a very demanding task.
Quality control (QC) samples are of key importance, and these are best prepared by pooling equal volumes of material from all of the biological samples to be analyzed [34,35]. Alternatively, a chemically defined mixture of authenticated reference compounds that mimics the metabolic composition of the investigated biological material can be employed. These synthetic mixtures are then subjected to the same sample extraction, instrumental analysis (ideally distributed across the analytical run), and data processing, thus providing quality checks for technical and analytical error as well as quantitative calibration to eliminate batch effects in the final processed data. This normalization is a crucial step for minimizing batch-to-batch data variability across extended periods. As such, it is a crucial requirement for large-scale phenotyping, because it facilitates inter-batch data integration.
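The two normalization ideas discussed above, scaling to an internal standard and using QC injections to remove batch effects, can be illustrated with a minimal sketch. It is not the procedure of the cited references: the median-based batch factor is one simple choice among several, the data layout (dictionaries with `batch`, `is_qc`, and `values` keys) and the label `ISTD` are assumptions made for illustration, and every batch is assumed to contain at least one QC injection.

```python
from statistics import median


def normalize_to_internal_standard(peak_areas, istd_name):
    """Divide every peak area of one injection by the internal standard area."""
    istd_area = peak_areas[istd_name]
    if istd_area <= 0:
        raise ValueError("internal standard not detected in this injection")
    return {name: area / istd_area
            for name, area in peak_areas.items() if name != istd_name}


def qc_batch_correction(injections):
    """Scale each batch so that its QC median equals the overall QC median.

    injections: list of dicts with keys 'batch', 'is_qc', and 'values'
    (metabolite name -> response). Returns corrected copies.
    """
    corrected = [dict(inj, values=dict(inj["values"])) for inj in injections]
    batches = {inj["batch"] for inj in injections}
    for metabolite in injections[0]["values"]:
        qc_all = [inj["values"][metabolite] for inj in injections if inj["is_qc"]]
        overall_qc = median(qc_all)
        for batch in batches:
            qc_batch = [inj["values"][metabolite] for inj in injections
                        if inj["is_qc"] and inj["batch"] == batch]
            factor = overall_qc / median(qc_batch)
            # Apply the same factor to QC and study samples of this batch.
            for inj in corrected:
                if inj["batch"] == batch:
                    inj["values"][metabolite] *= factor
    return corrected
```

Because the factor is computed per metabolite, compounds that respond differently to drift are corrected independently, which reflects the point made above that metabolite diversity complicates normalization.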
Depending on the objectives of the study, a single or a triple quadrupole MS can be employed to profile both volatiles and metabolites. A single quadrupole MS is very useful for giving a full untargeted scan of the compounds, while in a triple quadrupole the transitions from precursor ion to product ion become the fingerprint of the targeted analytes. In a single quadrupole, compounds are first fragmented in the ionization chamber using electron impact ionization set at 70 eV to produce fragment ions. These fragments are then separated by mass-to-charge ratio (m/z) in the quadrupole, so that the ions are detected at slightly different times. The resulting mass spectrum can be matched against a commercially available built-in library of compounds to identify the analytes.
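To illustrate what matching a spectrum against a library involves, the sketch below scores a query spectrum against reference spectra with a plain cosine (dot-product) similarity. This is a simplification chosen for illustration: commercial library search algorithms typically use more elaborate, weighted scores, and the spectra and compound names in the usage example are invented placeholders, not real library entries.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(i * i for i in spec_a.values()))
    norm_b = math.sqrt(sum(i * i for i in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query."""
    scored = [(cosine_similarity(query, ref), name) for name, ref in library.items()]
    score, name = max(scored)
    return name, score


if __name__ == "__main__":
    # Invented spectra for illustration only.
    library = {
        "compound_A": {73: 100, 147: 30, 217: 50},
        "compound_B": {73: 100, 191: 20, 204: 80},
    }
    query = {73: 200, 147: 60, 217: 100}
    print(best_library_match(query, library))
```

A high score alone does not confirm identity; isomers can give nearly identical electron impact spectra, so retention information and authentic standards remain necessary.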
In the triple quadrupole MS system, a second quadrupole, or collision cell, further fragments selected fragment ions coming from the first quadrupole; these selected ions are known as the precursor ions. Prior to collision-induced dissociation, the precursor ions are selected based on their selectivity and sensitivity in producing the product ions. The product ions then pass to the third quadrupole for another mass filtering step. Instead of detecting a specific ion, the transition from a precursor ion to a product ion is recorded. This type of detection allows trace-level compounds to be properly identified and quantified in a complex matrix such as that of rice (Fig. 2). Matrix compounds might have the same precursor ions as the target analytes, but the chance of their having the same precursor-to-product transition is low [36,37]. Targeted metabolites need to be accurately quantified to use them as biomarkers to discriminate differences in traits of diverse rice species. Profiling with GC-MS/MS (GC triple quadrupole MS) has proven to be an effective tool for complex matrices, following two phases. Phase I requires discovery of the compounds present in the sample using the full scan mode. Phase II involves the analysis of selected secondary fragmentation in the multiple reaction monitoring (MRM) mode. Once these MRM transitions are established, they can be used to create targeted screening methods that scan the whole sample for the identified transitions of each specific analyte. This type of workflow allows the researcher to identify and quantify metabolites in a sample without worrying about a varying matrix. A metabolomic workflow showing the use of full scan and MRM measurements for discovery and targeted analyses, respectively, was described earlier [38]. Different systems will have different ways of creating the method for the targeted phase of metabolite profiling using GC-MS/MS.
Generally, compounds of interest, particularly those that may be considered potential biomarkers, are chosen for the precursor ion study based on the intensity and selectivity of their fragments. For each target compound, product ions are selected and the transitions from precursor ions to product ions are optimized. That is, the collision energy that produces the maximum ion intensity for the compound is then used as the collision energy for the actual analysis of the samples.
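The collision energy optimization described above amounts to picking, for each transition, the energy that gave the highest product ion intensity in a series of test injections. A minimal sketch follows; the data layout and all m/z values, energies, and intensities in the usage example are hypothetical, and vendor software normally automates this step in its own way.

```python
def optimize_collision_energy(ce_scan):
    """Pick the collision energy giving the highest intensity per transition.

    ce_scan: {(precursor_mz, product_mz): {collision_energy_eV: intensity}}
    Returns {(precursor_mz, product_mz): (best_energy_eV, intensity)}.
    """
    best = {}
    for transition, response in ce_scan.items():
        best_energy = max(response, key=response.get)
        best[transition] = (best_energy, response[best_energy])
    return best


if __name__ == "__main__":
    # Hypothetical optimization data for two transitions of one compound.
    scan = {
        (204, 189): {5: 1200, 10: 4800, 15: 3100},
        (204, 101): {5: 900, 10: 1500, 15: 2600},
    }
    for transition, (energy, intensity) in optimize_collision_energy(scan).items():
        print(transition, "->", energy, "eV, intensity", intensity)
```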

Application Box: GC-MS/MS Analysis of Volatiles and Metabolites in Rice Using Triple Quadrupole GC
In metabolite analysis, common fragments such as m/z 147 and 73 may not be good precursor ions, as these fragments of the trimethylsilyl (TMS) group are common to all derivatized compounds. Once the transition parameters are identified for each compound, the GC-MS/MS system uses this information to identify and quantify the compounds (see Fig. 3).
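The exclusion of TMS-derived fragments when choosing precursor ions can be sketched as a simple filter over a full-scan spectrum. The example below ranks the remaining fragments by intensity only, which is an assumption made for brevity; in practice selectivity also has to be judged, for instance by checking co-eluting matrix compounds. The spectrum in the usage example is invented.

```python
# Fragments named in the text as common to all TMS-derivatized compounds.
TMS_COMMON_IONS = frozenset({73, 147})


def rank_precursor_candidates(spectrum, excluded=TMS_COMMON_IONS, top_n=3):
    """Return up to top_n fragment m/z values, most intense first,
    skipping fragments listed in `excluded`.

    spectrum: {m/z: intensity} from a full-scan acquisition.
    """
    candidates = [(intensity, mz) for mz, intensity in spectrum.items()
                  if mz not in excluded]
    candidates.sort(reverse=True)
    return [mz for _, mz in candidates[:top_n]]


if __name__ == "__main__":
    # Invented full-scan spectrum of a derivatized compound.
    spectrum = {73: 999, 103: 80, 147: 450, 191: 120, 204: 600, 217: 300}
    print(rank_precursor_candidates(spectrum))
```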
Aside from volatiles, rice metabolites have already been investigated, demonstrating the capacity of the GC triple quadrupole MS system to identify useful mass fragment transitions of compounds. These can be used as a guide in selecting transitions for the analytes of interest.

Post-Run Data Analysis
Raw data obtained after every GC run can be analyzed one at a time or as a batch. Processing data in batches ensures that the same data treatment methods are applied to all the chromatograms, as suggested by Fig. 4. Several chromatography data systems are available, such as LabSolutions for Shimadzu and ChemStation for Agilent. Generally, integration parameters are set up and applied automatically to most of the chromatograms. Because the samples can vary widely in their matrices, the chromatograms may differ slightly owing to retention time shifts. While the built-in library provides the spectrum of the desired compound, running the actual standard compound and comparing the analyte's mass spectrum with that of the standard from pooled libraries around the world [39] remains important for correct identification of the compound. Additional software tools for deconvolution, baseline correction, and peak alignment, such as AMDIS [40], could be useful during the data mining process. The TargetSearch package from Bioconductor is a preprocessing package for searching and identifying metabolites using corrected retention time indices [41]. The identification process requires retention index markers or standards to obtain aligned peaks and to identify outliers throughout the sample runs (Fig. 4). Moreover, several data pretreatment methods further down the pipeline, such as centering, scaling, and transformation, are in place to improve the biological interpretability of the data set [42]. After preprocessing, multivariate statistical analysis is often used to initially describe and explain the profiles obtained in the analysis.

Notes
2. Ultra-cooled samples and an ultra-cooled environment during extraction ensure that any enzymes present are quenched and that no further metabolic processes occur.
3. If the researcher's interest includes the bran, there is no need to mill or polish the rice sample. Cryo-grinding of the sample must be done immediately.
4. Make sure that the septum used for the vials is suitable for SPME analysis to avoid breakage of the SPME fiber.
5. For mature grains, and depending on the sensitivity of the equipment, the recommended amount of sample for extraction is 300 mg, while for developing grains and germinating seeds a smaller amount (20 mg) is preferred.
6. Moisture hydrolyzes the derivatizing reagents, rendering them unfit for sample derivatization. Make sure that no water or moisture adheres to the surfaces of the vials, glass inserts, and pipette tips throughout the process, especially right after freeze-drying. If the extracts will not be derivatized immediately, store the vials in a desiccator. Store the derivatizing reagents in a cool, dry place when not in use.
7. Derivatize only extracts that will be analyzed in the GC on the same day.
8. Make sure that the GC-MS system is free of leaks and moisture. Perform an auto-tune and replace filaments, septa, and liners as needed.
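The data pretreatment methods mentioned under Post-Run Data Analysis (transformation, centering, and scaling) can be sketched for a single metabolite column as follows. The function names, the log offset, and the choice of autoscaling and Pareto scaling as examples are assumptions for illustration rather than the exact pipeline of the cited references; each function expects at least two values, and the scaling functions fail on a column with zero variance.

```python
import math
from statistics import mean, stdev


def log_transform(column, offset=1.0):
    """Log10 transform; the offset (an assumed value) avoids log of zero."""
    return [math.log10(x + offset) for x in column]


def center(column):
    """Mean centering: subtract the column mean from every value."""
    mu = mean(column)
    return [x - mu for x in column]


def autoscale(column):
    """Unit-variance scaling: center, then divide by the standard deviation."""
    mu, sd = mean(column), stdev(column)
    return [(x - mu) / sd for x in column]


def pareto_scale(column):
    """Pareto scaling: center, then divide by the square root of the
    standard deviation, which damps large peaks less strongly than autoscaling."""
    mu, sd = mean(column), stdev(column)
    return [(x - mu) / math.sqrt(sd) for x in column]


if __name__ == "__main__":
    responses = [2.0, 4.0, 6.0]  # invented responses of one metabolite
    print(center(responses))
    print(autoscale(responses))
    print(pareto_scale(responses))
```

Applying the same function to every metabolite column before multivariate analysis keeps abundant compounds from dominating the result purely because of their scale.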