There is a newer version of the record available.

Published February 26, 2025 | Version v1
Preprint Open

Seeking community input for: mzPeak - a modern, scalable, and interoperable mass spectrometry data format for the future

  • 1. VIB - UGent Center for Medical Biotechnology
  • 2. OpenMS Inc
  • 3. University of Tübingen
  • 4. European Molecular Biology Laboratory
  • 5. University of California San Diego Skaggs School of Pharmacy and Pharmaceutical Sciences
  • 6. Pacific Northwest National Laboratory
  • 7. University of California, San Diego
  • 8. University of Antwerp
  • 9. University of Washington
  • 10. Institute for Systems Biology
  • 11. University of Bristol
  • 12. The Alan Turing Institute
  • 13. RECETOX
  • 14. Eli Lilly (United States)
  • 15. Thermo Fisher Scientific Inc
  • 16. Ruhr University Bochum
  • 17. European Bioinformatics Institute
  • 18. University of California, Riverside

Description

Abstract

Advancements in mass spectrometry (MS) instrumentation, including higher resolution, faster scan speeds, increased throughput, and improved sensitivity, along with the growing adoption of imaging and ion mobility, have dramatically increased the volume and complexity of data produced in fields like proteomics, metabolomics, and lipidomics. While these technologies unlock new possibilities, they also present significant challenges in data management, storage, and accessibility. Existing formats, such as the XML-based community standards mzML and imzML, struggle to meet the demands of modern MS workflows due to their large file sizes, slow data access, and limited metadata support. Vendor-specific formats, while optimized for proprietary instruments, lack interoperability, comprehensive metadata support, and long-term archival reliability.

This white paper lays the groundwork for mzPeak, a next-generation data format designed to address these challenges and support high-throughput, multi-dimensional MS workflows. By adopting a hybrid model that combines efficient binary storage for numerical data and human-readable metadata storage, mzPeak will reduce file sizes, accelerate data access, and offer a scalable, adaptable solution for evolving MS technologies.
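As a rough illustration of the hybrid idea described above (this is not the mzPeak format, which this record does not specify), the sketch below stores peak arrays as packed binary and keeps a human-readable JSON index beside them. The function names `pack_spectra` and `read_spectrum`, the field names, and the layout (float64 m/z values followed by float32 intensities, native byte order) are all assumptions made for this example only.

```python
import json
from array import array


def pack_spectra(spectra):
    """Return (metadata_json, binary_blob) for a list of spectra.

    Each spectrum is a dict with 'id', 'mz', and 'intensity' keys.
    Hypothetical layout: per spectrum, all m/z values as float64,
    then all intensities as float32, in native byte order.
    """
    blob = bytearray()
    index = []
    for spectrum in spectra:
        mz = array("d", spectrum["mz"])
        intensity = array("f", spectrum["intensity"])
        # Record where this spectrum starts so it can be read on its own.
        index.append(
            {"id": spectrum["id"], "offset": len(blob), "n_points": len(mz)}
        )
        blob += mz.tobytes() + intensity.tobytes()
    # Metadata stays human-readable; numerical data stays compact.
    metadata = json.dumps({"spectra": index}, indent=2)
    return metadata, bytes(blob)


def read_spectrum(metadata, blob, spectrum_id):
    """Read one spectrum by id without decoding any other spectrum."""
    entries = json.loads(metadata)["spectra"]
    entry = next(e for e in entries if e["id"] == spectrum_id)
    mz = array("d")
    intensity = array("f")
    start = entry["offset"]
    mz_end = start + entry["n_points"] * mz.itemsize
    intensity_end = mz_end + entry["n_points"] * intensity.itemsize
    mz.frombytes(blob[start:mz_end])
    intensity.frombytes(blob[mz_end:intensity_end])
    return list(mz), list(intensity)
```

Because the index records a byte offset for each spectrum, a reader can jump straight to one spectrum, which is the kind of random access an XML document with inline encoded arrays makes costly. A production format would also need to fix byte order, compression, and a controlled vocabulary for metadata, none of which this sketch attempts.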

For researchers, mzPeak will enable faster data access (including random access), enhanced interoperability across platforms, and support for complex workflows, including ion mobility and imaging. Its design will ensure that data is managed in compliance with regulatory standards, which is essential for applications such as precision medicine and chemical safety, where long-term data integrity and accessibility are critical.

For vendors, mzPeak will provide a streamlined, open alternative to proprietary formats, reducing the burden of regulatory compliance while aligning with the industry's push for transparency and standardization. By offering a high-performance, interoperable solution, mzPeak will position vendors to meet customer demands for sustainable data management tools that can handle emerging and future data types and workflows.

mzPeak aspires to become the cornerstone of MS data management, empowering researchers, vendors, and developers to innovate and collaborate more effectively. We invite the MS community to join the discussion on PREreview.org and collaborate in developing and adopting mzPeak to meet the challenges of today and tomorrow.

Files

mzPeak_preprint_v02.pdf (201.7 kB)
md5:e08b6983355512da673b7f45ecb0a16c