Presentation Open Access
Bittremieux, Wout; Meysman, Pieter; Martens, Lennart; Goethals, Bart; Valkenborg, Dirk; Laukens, Kris
Analysis of mass spectrometry quality control metrics
Mass-spectrometry-based proteomics is a powerful analytical technique to identify complex protein samples; however, its results are still subject to large variability. Recently, several quality control metrics have been introduced to assess the performance of a mass spectrometry experiment. Unfortunately, these metrics are generally not yet sufficiently well understood. For this reason, we present powerful techniques to analyze multiple experiments based on quality control metrics, identify low-performing experiments, and provide an interpretation of these outlying experiments.
Mass-spectrometry-based proteomics is a powerful analytical technique that can be used to identify complex protein samples. Despite many technological and computational advances, performing a mass spectrometry experiment remains a highly complicated task, and its results are subject to large variability. To understand and evaluate how technical variability affects the results of an experiment, several quality control (QC) and performance metrics have recently been introduced. Unfortunately, despite the availability of QC metrics covering a wide range of qualitative information, a systematic approach to quality control is often still lacking.
As most quality control tools are able to generate several dozen metrics, any single experiment can be characterized by multiple QC metrics. Therefore, it is often unclear which metrics are most interesting in general, or even which metrics are relevant in a specific situation. To take into account the multidimensional data space formed by these numerous metrics, we have applied advanced techniques to visualize, analyze, and interpret the QC metrics.
Outlier detection can be used to identify deviating experiments that exhibit low performance or a high level of (unexplained) variability. These outlying experiments can subsequently be analyzed to discover the source of the reduced performance and to improve the quality of future experiments.
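As a minimal illustration of this idea (a simplified sketch on synthetic data, not the specific outlier-detection method used in this work), deviating experiments can be flagged by computing per-metric robust z-scores (median and MAD) and marking experiments where any metric deviates strongly:

```python
import numpy as np

def robust_outliers(metrics, threshold=3.5):
    """Flag experiments whose QC metrics deviate strongly from the majority.

    metrics: array of shape (n_experiments, n_metrics).
    Computes per-metric robust z-scores (median / MAD) and flags an
    experiment when any of its metrics exceeds the threshold.
    """
    median = np.median(metrics, axis=0)
    mad = np.median(np.abs(metrics - median), axis=0)
    mad = np.where(mad == 0, 1e-9, mad)       # guard against zero spread
    z = 0.6745 * (metrics - median) / mad     # scaled for consistency with the normal case
    return np.max(np.abs(z), axis=1) > threshold

# Synthetic example: 10 experiments, 5 QC metrics; experiment 7 deviates.
rng = np.random.default_rng(0)
qc = rng.normal(size=(10, 5))
qc[7, 2] += 15.0                              # inject a strong deviation
mask = robust_outliers(qc)
print(mask[7])  # True: experiment 7 is flagged as an outlier
```

Robust statistics (median/MAD) are used here rather than mean and standard deviation so that the outliers themselves do not distort the reference distribution.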
However, it is insufficient to know that a specific experiment is an outlier; it is also of vital importance to know why the experiment is an outlier. To answer this question, we identify the subspace of QC metrics in which the outlying experiment can be differentiated from the other experiments. This subspace provides crucial information on how to interpret an outlier, which domain experts can use to investigate the performance of the experiment.
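The notion of an explanatory subspace can be sketched as follows. This is a simplified illustration with hypothetical metric names and toy data, not the exact subspace-search procedure used in this work: the metrics in which the outlying experiment deviates most strongly from the other experiments are taken as its explanation.

```python
import numpy as np

def explanatory_subspace(metrics, outlier_idx, metric_names, k=2):
    """Return the k QC metrics in which the given experiment deviates most.

    metrics: array of shape (n_experiments, n_metrics).
    Deviation is measured per metric as the distance from the median,
    scaled by the MAD, so metrics on different scales are comparable.
    """
    median = np.median(metrics, axis=0)
    mad = np.median(np.abs(metrics - median), axis=0)
    mad = np.where(mad == 0, 1e-9, mad)  # guard against zero spread
    deviation = np.abs(metrics[outlier_idx] - median) / mad
    top = np.argsort(deviation)[::-1][:k]
    return [metric_names[i] for i in top]

# Hypothetical metric names and toy data; experiment 4 deviates in two metrics.
names = ["ms1_density", "rt_drift", "charge_ratio", "peak_width"]
qc = np.array([[1.0, 0.5, 0.2, 1.1],
               [1.1, 0.4, 0.3, 1.0],
               [0.9, 0.6, 0.2, 1.2],
               [1.0, 0.5, 0.3, 1.1],
               [5.0, 0.5, 0.9, 1.1]])
subspace = explanatory_subspace(qc, 4, names, k=2)
print(subspace)  # ['ms1_density', 'charge_ratio']
```

The returned two-metric subspace is exactly the kind of low-dimensional view that a domain expert can inspect, as in the Figure 1 example below.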
Results & Discussion
Figure 1 shows an example of interpreting a specific experiment that has been identified as an outlier. As can be seen, two QC metrics mainly contribute to this experiment being an outlier. The explanatory subspace formed by these QC metrics can be extracted and subsequently interpreted by domain experts, resulting in insights into the relationships between various QC metrics.
Next, by combining the explanatory subspaces for all individual outliers, it is possible to get a general view of which QC metrics are most relevant when detecting deviating experiments. When taking the various explanatory subspaces for all different outliers into account, a distinction between several of the outliers can be made in terms of the number of identified spectra (PSMs). As can be seen in Figure 2, for some specific QC metrics (highlighted in italics) the outliers result in a notably lower number of PSMs compared to the non-outlying experiments.
Because monitoring a large number of QC metrics on a regular basis is often impractical, it is more convenient to focus on a small number of user-friendly, well-understood, and discriminating metrics. As the QC metrics highlighted in Figure 2 have been shown to indicate low-performance experiments, these metrics are prime candidates for continuous monitoring to quickly detect faulty experiments.
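One simple way to derive such a shortlist is to count how often each QC metric appears in the explanatory subspaces of the detected outliers. The sketch below uses made-up subspaces and metric names purely for illustration; it is not the aggregation procedure used in this work.

```python
from collections import Counter

# Hypothetical explanatory subspaces for several detected outliers.
subspaces = [
    ["ms1_density", "rt_drift"],
    ["ms1_density", "charge_ratio"],
    ["rt_drift", "ms1_density"],
]

# Metrics that explain the most outliers are the strongest candidates
# for routine, continuous monitoring.
counts = Counter(metric for subspace in subspaces for metric in subspace)
ranking = [metric for metric, _ in counts.most_common()]
print(ranking)  # ['ms1_density', 'rt_drift', 'charge_ratio']
```

A short ranked list like this keeps routine monitoring focused on a handful of metrics while still being grounded in the full multidimensional analysis.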