Published December 1, 2020 | Version v1


Repository-scale propagated spectral library of suspects


In untargeted metabolomics, the majority of acquired MS/MS spectra typically remain unidentified. This is partly because the molecular coverage of reference spectral libraries is necessarily incomplete. However, even when no exact match can be found in a spectral library, it is often possible to propagate annotations to structurally related molecules. We have created a "suspect" reference spectral library in which each entry pairs a putative molecular identity with a loss or addition characterized by a difference in precursor mass. By propagating library annotations to previously unidentified spectra using repository-scale molecular networking, we extracted 79,461 high-quality suspect spectra in a data-driven fashion from 92,063 raw files originating from 1,289 heterogeneous public datasets deposited to GNPS. This forms a rich data resource of novel, real-life reference spectra that will be freely available for the community to use via GNPS.
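The idea of a suspect annotation can be illustrated with a minimal Python sketch. It is not the pipeline used to build the library: the `Suspect` class, the `propagate` function, the naming format, and the tolerance value are all hypothetical, and the sketch assumes that the library spectrum and the unidentified spectrum are already linked in a molecular network and share the same charge state, so that the precursor m/z difference times the charge gives the mass difference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suspect:
    """Hypothetical record for one propagated annotation."""
    name: str          # putative identity derived from the library match
    delta_mass: float  # signed precursor mass difference in Da


def propagate(library_name: str, library_mz: float, query_mz: float,
              charge: int = 1, tolerance: float = 0.01) -> Optional[Suspect]:
    """Derive a suspect annotation for an unidentified spectrum.

    Assumes both spectra have the same charge; `tolerance` (Da) is an
    illustrative cutoff below which no mass shift is reported.
    """
    delta = (query_mz - library_mz) * charge
    if abs(delta) < tolerance:
        # Same precursor mass within tolerance: not a suspect in this sketch.
        return None
    kind = "addition" if delta > 0 else "loss"
    name = f"Suspect related to {library_name} (putative {kind} of {abs(delta):.3f} Da)"
    return Suspect(name=name, delta_mass=delta)


# Illustrative values only: a neighbor about 15.995 Da heavier than the library hit.
suspect = propagate("caffeine", library_mz=195.088, query_mz=211.083)
print(suspect)
```

A positive difference is reported as an addition and a negative one as a loss; interpreting the shift chemically (for example, as an oxidation) is left to the user.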


