Poster Open Access
Bittremieux, Wout; Kelchtermans, Pieter; Valkenborg, Dirk; Martens, Lennart; Laukens, Kris
jqcML: A Java API for quality control for mass spectrometry experiments
qcML is a new approach towards a standardized format for quality control metrics for mass spectrometry experiments. Here we present jqcML, an open-source Java API for working with qcML data.
To provide a widely applicable, standardized means of reporting quality control information for mass spectrometry experiments, the qcML standard [1] has recently been developed. The standard aims to support an automated quality control pipeline by defining a set of useful metrics that can be calculated on the acquired data. It also aims to provide a standard format for exchanging these metrics.
To exchange qcML data, an XML-based file format has been developed. This general-purpose format captures metrics and metadata for all kinds of mass spectrometry experiments. As such, a qcML file can serve as a container that keeps quality control information separate from the actual data analysis.
Here we present jqcML, a fully functional Java API for working with qcML data. First, jqcML provides a complete object model to interpret and manipulate qcML data, while keeping a small memory footprint without sacrificing the speed of data access. Second, jqcML can interact with both XML-based qcML files and a qcDB relational database. This interaction is abstracted, so the user can work with both sources of qcML data in a consistent way.
The primary way to exchange qcML data will be through XML-based qcML files. Accordingly, jqcML can read a full qcML file or only a specific part of one, and it can create and write qcML files.
Input and output on qcML files are handled with the Java Architecture for XML Binding (JAXB). Annotating specific elements of the object model constructs a mapping between the object model and the XML structure defined by the XML schema. This mapping allows translation from qcML files to the object model, and vice versa. With JAXB we can therefore both interpret existing qcML files and create new ones.
When interpreting data from a qcML file, special care is taken so that arbitrarily large files can be handled. An XML indexer component makes it possible to read only a specific part of a file. This avoids loading the full, possibly very large, qcML file into memory, while still allowing the required content to be retrieved.
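The idea of retrieving one part of a large file without loading all of it can be illustrated with a minimal sketch. Note that this is not jqcML's indexer: it uses a plain streaming scan with the JDK's StAX parser instead of an index, and the class name `PartialQcReader` is hypothetical. The element and attribute names (`runQuality`, `qualityParameter`, `ID`, `name`, `value`) are assumptions made for illustration and should be checked against the qcML schema.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of jqcML: streams through a qcML-like
// document and collects the parameters of a single run.
class PartialQcReader {

    // Returns parameter name -> value for the run with the given ID.
    // Only one parser event is held in memory at a time.
    static Map<String, String> readRun(Reader source, String runId) {
        Map<String, String> parameters = new LinkedHashMap<>();
        try {
            XMLStreamReader xml =
                    XMLInputFactory.newInstance().createXMLStreamReader(source);
            boolean inTargetRun = false;
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String element = xml.getLocalName();
                    if (element.equals("runQuality")) {
                        // Assumed element and attribute names.
                        inTargetRun = runId.equals(xml.getAttributeValue(null, "ID"));
                    } else if (inTargetRun && element.equals("qualityParameter")) {
                        parameters.put(xml.getAttributeValue(null, "name"),
                                       xml.getAttributeValue(null, "value"));
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && inTargetRun
                        && xml.getLocalName().equals("runQuality")) {
                    break; // requested run is complete, skip the rest of the file
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return parameters;
    }

    public static void main(String[] args) {
        // Illustrative document with made-up metric names and values.
        String doc = "<qcML>"
                + "<runQuality ID=\"run1\">"
                + "<qualityParameter name=\"ms1-count\" value=\"1200\"/>"
                + "</runQuality>"
                + "<runQuality ID=\"run2\">"
                + "<qualityParameter name=\"ms2-count\" value=\"3400\"/>"
                + "</runQuality>"
                + "</qcML>";
        System.out.println(readRun(new StringReader(doc), "run2"));
    }
}
```

A streaming scan still has to pass over everything that precedes the requested run, whereas an index can jump straight to it. The sketch therefore shows only the memory benefit, not the access speed an indexer provides.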
Besides the XML-based file format, qcML data can also be stored in a relational database, called qcDB. As for the file format, jqcML provides an application layer to read qcML data from, and write it to, a qcDB. This interaction is abstracted, enabling the user to interface with an XML-based file or a qcDB in a consistent way. Consequently, jqcML can also be used as a converter between the XML-based file format and a qcDB, in either direction.
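The abstraction described here can be sketched as a single interface with interchangeable back ends. Everything named in this sketch (`QcStore`, `InMemoryStore`, `QcConverter`) is hypothetical and does not reflect jqcML's actual classes. The in-memory store merely stands in for a file-backed and a database-backed implementation, and a run is reduced to a map of metric names to values.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical common interface: callers do not need to know whether the
// data lives in an XML file or in a relational database.
interface QcStore {
    Map<String, Double> readRun(String runId);
    void writeRun(String runId, Map<String, Double> metrics);
    Set<String> runIds();
}

// Stand-in back end. A real implementation would wrap file I/O or SQL.
class InMemoryStore implements QcStore {
    private final Map<String, Map<String, Double>> runs = new LinkedHashMap<>();

    @Override
    public Map<String, Double> readRun(String runId) {
        Map<String, Double> metrics = runs.get(runId);
        return metrics == null
                ? Collections.emptyMap()
                : Collections.unmodifiableMap(metrics);
    }

    @Override
    public void writeRun(String runId, Map<String, Double> metrics) {
        runs.put(runId, new LinkedHashMap<>(metrics)); // defensive copy
    }

    @Override
    public Set<String> runIds() {
        return Collections.unmodifiableSet(runs.keySet());
    }
}

// Because both sides share one interface, conversion is a plain copy loop.
class QcConverter {
    static void copy(QcStore from, QcStore to) {
        for (String runId : from.runIds()) {
            to.writeRun(runId, from.readRun(runId));
        }
    }

    public static void main(String[] args) {
        QcStore fileLike = new InMemoryStore();     // stands in for a qcML file
        QcStore databaseLike = new InMemoryStore(); // stands in for a qcDB
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("ms1-count", 1200.0);           // made-up metric and value
        fileLike.writeRun("run1", metrics);
        copy(fileLike, databaseLike);
        System.out.println(databaseLike.readRun("run1"));
    }
}
```

With this design, converting in the opposite direction needs no extra code: the arguments to `copy` are simply swapped.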
Results & Discussion
OpenMS [2], an open-source library for LC/MS data management and analysis, provides a modular tool to calculate qcML data. Using this pipeline, raw files from mass spectrometry experiments can easily be processed to produce a qcML file.
Using this data, it is possible to perform advanced analyses across different runs. A large-scale analysis of specific quality metrics makes it possible to classify experiments. Based on this classification, quality boundaries can be determined. Finally, these boundaries can be turned into thresholds that flag bad experiments.
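As one possible illustration of turning metric values into thresholds, the sketch below applies Tukey's fences (1.5 times the interquartile range beyond the quartiles) to a single metric across runs. The text does not specify which method is used, so this choice is an assumption, as are the class name `MetricThresholds` and the example values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: derives lower and upper thresholds for one metric
// and reports which runs fall outside them.
class MetricThresholds {

    // Quantile by linear interpolation between the two closest ranks.
    static double quantile(double[] sorted, double q) {
        double position = q * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    // Returns {lowerFence, upperFence} using Tukey's 1.5 * IQR rule.
    static double[] fences(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("At least one value is required");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        return new double[] {q1 - 1.5 * iqr, q3 + 1.5 * iqr};
    }

    // Returns the indices of the runs whose value lies outside the fences.
    static List<Integer> flagOutliers(double[] values) {
        double[] bounds = fences(values);
        List<Integer> flagged = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] < bounds[0] || values[i] > bounds[1]) {
                flagged.add(i);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Made-up values of one metric for six runs; the last one deviates.
        double[] metricPerRun = {10, 11, 12, 13, 14, 50};
        System.out.println(Arrays.toString(fences(metricPerRun)));
        System.out.println(flagOutliers(metricPerRun));
    }
}
```

In practice the boundaries would be derived from a large collection of runs, and a different statistic may suit a given metric better than a rule based on the interquartile range.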
The qcML standard will be finalized by the end of the year. Libraries such as jqcML make the capabilities of the qcML standard easy to harness. This will enable users to perform quality control for their mass spectrometry experiments more easily and more thoroughly, leading to more high-quality results.
1. qcml – A XML format for quality related data of mass spectrometry instruments. https://code.google.com/p/qcml/ (2013).
2. Kohlbacher, O. et al. TOPP--The OpenMS Proteomics Pipeline. Bioinformatics 23, e191–e197 (2007).