Presentation Open Access

Analysis of mass spectrometry quality control metrics

Bittremieux, Wout; Meysman, Pieter; Martens, Lennart; Goethals, Bart; Valkenborg, Dirk; Laukens, Kris

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <controlfield tag="005">20200120134126.0</controlfield>
  <controlfield tag="001">56001</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">7-8 December 2015</subfield>
    <subfield code="g">BBC</subfield>
    <subfield code="a">Benelux Bioinformatics Conference</subfield>
    <subfield code="c">Antwerp, Belgium</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Antwerp, Antwerp, Belgium</subfield>
    <subfield code="a">Meysman, Pieter</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Ghent University, Ghent, Belgium</subfield>
    <subfield code="a">Martens, Lennart</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Antwerp, Antwerp, Belgium</subfield>
    <subfield code="a">Goethals, Bart</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">VITO, Mol, Belgium</subfield>
    <subfield code="a">Valkenborg, Dirk</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Antwerp, Antwerp, Belgium</subfield>
    <subfield code="a">Laukens, Kris</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">8919544</subfield>
    <subfield code="z">md5:c9aa03214aced3f0c1fdc3bbcffa56e3</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2015-12-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Antwerp, Antwerp, Belgium</subfield>
    <subfield code="a">Bittremieux, Wout</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Analysis of mass spectrometry quality control metrics</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution Share Alike 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;&lt;strong&gt;Analysis of mass spectrometry quality control metrics&lt;/strong&gt;&lt;/p&gt;


&lt;p&gt;Mass-spectrometry-based proteomics is a powerful analytical technique that can be used to identify the proteins in complex samples. Despite many technological and computational advances, performing a mass spectrometry experiment remains a highly complicated task, and its results are subject to large variability. To understand and evaluate how technical variability affects the results of an experiment, several quality control (QC) and performance metrics have recently been introduced. Unfortunately, despite the availability of such QC metrics covering a wide range of qualitative information, a systematic approach to quality control is often still lacking.&lt;/p&gt;

&lt;p&gt;As most quality control tools can generate several dozen metrics, any single experiment can be characterized by multiple QC metrics. It is therefore often unclear which metrics are most interesting in general, or even which metrics are relevant in a specific situation. To take into account the multidimensional data space formed by the numerous metrics, we have applied advanced techniques to visualize, analyze, and interpret the QC metrics.&lt;/p&gt;


&lt;p&gt;Outlier detection can be used to detect deviating experiments with low performance or a high level of (unexplained) variability. These outlying experiments can subsequently be analyzed to discover the source of the reduced performance and to enhance the quality of future experiments.&lt;/p&gt;

&lt;p&gt;However, it is insufficient to know that a specific experiment is an outlier; it is also vitally important to know why the experiment is an outlier. To understand this, we have used the subspace of QC metrics in which the outlying experiment can be differentiated from the other experiments. This provides crucial information on how to interpret an outlier, which domain experts can use to investigate the performance of the experiment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Results &amp;amp; Discussion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Figure 1 shows an example of interpreting a specific experiment that has been identified as an outlier. As can be seen, two QC metrics mainly contribute to this experiment being an outlier. The explanatory subspace formed by these QC metrics can be extracted and then interpreted by domain experts, resulting in insights into the relationships between various QC metrics.&lt;/p&gt;

&lt;p&gt;Next, by combining the explanatory subspaces for all individual outliers, it is possible to get a general view of which QC metrics are most relevant when detecting deviating experiments. When the explanatory subspaces for all outliers are taken into account, a distinction between several of the outliers can be made in terms of the number of identified spectra (peptide-spectrum matches, PSMs). As can be seen in Figure 2, for some specific QC metrics (highlighted in italics) the outliers result in a notably lower number of PSMs compared to the non-outlying experiments.&lt;/p&gt;

&lt;p&gt;Because monitoring a large number of QC metrics on a regular basis is often impractical, it is more convenient to focus on a small number of user-friendly, well-understood, and discriminating metrics. As the QC metrics highlighted in Figure 2 are shown to indicate low-performance experiments, these metrics are prime candidates to monitor on a continuous basis to quickly detect faulty experiments.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.56001</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">presentation</subfield>
  </datafield>
</record>