Poster Open Access

The HUPO-PSI standardized spectral library format

Gabriels, Ralf; Bandeira, Nuno; Bittremieux, Wout; Carver, Jeremy J; Chambers, Matthew; Kawano, Shin; Lam, Henry; Mak, Tytus; Perez-Riverol, Yasset; Pullman, Benjamin J; Sharma, Vagisha; Shofstahl, Jim; Van Den Bossche, Tim; Vizcaino, Juan Antonio; Zhu, Yunping; Deutsch, Eric W

More and more proteomics datasets are becoming available in public repositories. The knowledge embedded in these datasets can be used to improve peptide identification workflows. Spectral library searching provides a straightforward method to boost identification rates using previously identified spectra. Alternatively, machine learning methods can learn from these spectra to accurately predict the behavior of peptides in a liquid chromatography-mass spectrometry system.

At the basis of both approaches are spectral libraries: unified collections of previously identified spectra. Organizations and projects such as the National Institute of Standards and Technology (NIST), the Global Proteome Machine, PeptideAtlas, PRIDE Archive and MassIVE have all compiled spectral libraries for a multitude of species and experimental setups. A major obstacle, however, is that each organization provides its libraries in a different file format. The problem propagates (if not expands) at the software level, as different software tools require different file formats.

The solution is a standardized spectral library format that is flexible enough to meet all users' demands, yet standardized enough to be usable across environments and software packages. This balance is achieved by setting up a standardized framework and a controlled vocabulary of metadata terms, and by allowing the format to be represented in different forms, such as plain text, JSON and HDF.
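To illustrate the idea of one framework with several representations, the following minimal Python sketch stores a single library spectrum as a list of controlled-vocabulary attributes plus peaks, and writes it out as both JSON and plain text. It is an illustration under stated assumptions, not the drafted format: the class names, the JSON keys and the `accession|name=value` text layout are hypothetical, and the two PSI-MS terms (MS:1000744, selected ion m/z; MS:1000041, charge state) and all values are used only as example metadata.

```python
import json
from dataclasses import dataclass


@dataclass
class Attribute:
    """One metadata item, identified by a controlled-vocabulary term."""
    accession: str  # e.g. a PSI-MS accession such as "MS:1000041"
    name: str       # human-readable term name
    value: object   # the value attached to the term


@dataclass
class LibrarySpectrum:
    """Hypothetical in-memory model shared by all representations."""
    attributes: list  # list of Attribute
    peaks: list       # list of (m/z, intensity) tuples


def to_json(spectrum: LibrarySpectrum) -> str:
    """Serialize to JSON (key names are assumptions, not the real schema)."""
    return json.dumps(
        {
            "attributes": [
                {"accession": a.accession, "name": a.name, "value": a.value}
                for a in spectrum.attributes
            ],
            "peaks": [list(p) for p in spectrum.peaks],
        },
        indent=2,
    )


def from_json(text: str) -> LibrarySpectrum:
    """Rebuild the in-memory model from the JSON form above."""
    data = json.loads(text)
    return LibrarySpectrum(
        attributes=[Attribute(**a) for a in data["attributes"]],
        peaks=[tuple(p) for p in data["peaks"]],
    )


def to_text(spectrum: LibrarySpectrum) -> str:
    """Serialize to a line-based text form (layout is illustrative only)."""
    lines = [f"{a.accession}|{a.name}={a.value}" for a in spectrum.attributes]
    lines += [f"{mz}\t{intensity}" for mz, intensity in spectrum.peaks]
    return "\n".join(lines)


# Example spectrum with made-up values.
spectrum = LibrarySpectrum(
    attributes=[
        Attribute("MS:1000744", "selected ion m/z", 523.77),
        Attribute("MS:1000041", "charge state", 2),
    ],
    peaks=[(175.119, 1200.0), (262.151, 830.5)],
)

print(to_text(spectrum))
print(to_json(spectrum))
```

Because both writers read the same attribute list, the metadata terms stay identical across representations; only the container changes.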

So far, the required (and optional) metadata terms have been compiled and added to the PSI-MS ontology, and versions of the text and JSON representations have been drafted. The tabular and HDF representations of the format are in development, as are converters and validators in various programming languages.

Files (168.6 kB)
2020-01 EuBIC Dev Meeting - SpecLibFormat.pdf (168.6 kB, md5:216e71dcee8bf44d25d3e058fc703439)
  • Deutsch EW et al. Expanding the Use of Spectral Libraries in Proteomics. J Proteome Res. 2018;17(12):4051–4060. doi:10.1021/acs.jproteome.8b00485
