Published November 13, 2025 | Version 25.7
Dataset Open

Collective Spectral Library

Description

The Collective Spectral Library (CSL) is a database containing a collection of reference spectra generated using high-resolution tandem mass spectrometry (HRMS²).

The CSL is built collaboratively and currently includes reference spectra provided by:

The CSL enables retrospective screening of environmental samples as part of Non-Target Screening (NTS) efforts. It is integrated into the open analysis workflow ntsworkflow [1], which supports matching of experimental MS² data with verified reference spectra. It is also connected to NTSPortal [2], a platform for processing, archiving and visualizing NTS data to support the identification and assessment of trace contaminants in surface waters.

We continuously expand the CSL to improve its utility, for instance in the retrospective analysis of historical data in NTSPortal.

For more information, to give feedback, or to contribute spectral reference data, feel free to contact us at ntsportal@bafg.de.


References

[1] Jewell, K. S., et al. (2020). Rapid Commun. Mass Spectrom., 34, e8541. https://doi.org/10.1002/rcm.8541
[2] Jewell, K. S., et al. (2025). Online-Portal „Non-Target Screening für die Umweltüberwachung der Zukunft“, Umweltbundesamt, Dessau-Roßlau. https://www.umweltbundesamt.de/sites/default/files/medien/11850/publikationen/21_2025_texte.pdf

 

Files

CSL_v<version>.db
The SQLite database containing all reference spectra and related records.
This file can be opened with any SQLite-compatible software, such as the open-source DB Browser for SQLite (available at https://sqlitebrowser.org), or accessed programmatically using standard SQLite libraries.
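As a minimal sketch of programmatic access, the following Python snippet uses only the standard-library sqlite3 module to list the tables and columns of an SQLite file. It deliberately does not assume any CSL table or column names (those are documented in CSL-er-diagram.png and CSL-data-dictionary.xlsx); the function name describe_database is hypothetical and chosen for illustration only.

```python
import sqlite3
from pathlib import Path


def describe_database(db_path):
    """Return {table_name: [column names]} for every user table in an SQLite file."""
    # Open the file read-only so the library cannot be modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()
```

Assuming the database for this version was saved locally as CSL_v25.7.db (a file name inferred from the pattern above, not confirmed), calling describe_database("CSL_v25.7.db") would return the table layout, which can then be compared against the ER diagram before writing queries.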

thermo-<chrom_method>CSLv<version>-<export_date>.msp
All reference spectra exported in NIST MSP text format.
This format is readable by software such as mzVault (tested with mzVault Version 2.3 SP1, Thermo Fisher Scientific), and can also be opened with any text editor.
A separate file is provided for each chromatographic method for which experimental or modelled retention time data are available. The field PREDICTED_RT indicates the origin of the retention time for each record: TRUE means the retention time is modelled (predicted); FALSE means it is experimental.
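Because the .msp files are plain text, records can be separated by retention time origin with a few lines of Python. The sketch below assumes the common NIST MSP layout ("KEY: value" metadata lines, peak lines of "m/z intensity", records separated by blank lines); the exact layout of the CSL exports has not been checked here, and the function names parse_msp and split_by_rt_origin are hypothetical.

```python
def parse_msp(text):
    """Parse NIST-style MSP text into a list of records.

    Each record is a dict with 'fields' (metadata keyed by upper-cased name)
    and 'peaks' (a list of (mz, intensity) tuples).
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current record, if one is open.
            if current is not None:
                records.append(current)
                current = None
            continue
        if current is None:
            current = {"fields": {}, "peaks": []}
        if line[0].isdigit():
            # Peak line: m/z and intensity separated by whitespace (assumed layout).
            parts = line.replace(";", " ").split()
            if len(parts) >= 2:
                current["peaks"].append((float(parts[0]), float(parts[1])))
        elif ":" in line:
            key, _, value = line.partition(":")
            current["fields"][key.strip().upper()] = value.strip()
    if current is not None:
        records.append(current)
    return records


def split_by_rt_origin(records):
    """Return (experimental, predicted) record lists based on PREDICTED_RT."""
    experimental, predicted = [], []
    for rec in records:
        flag = rec["fields"].get("PREDICTED_RT", "").upper()
        if flag == "TRUE":
            predicted.append(rec)
        elif flag == "FALSE":
            experimental.append(rec)
    return experimental, predicted
```

Records without a PREDICTED_RT field are left out of both lists, so the two counts can be compared against the total to spot entries with no retention time information.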

CSL-er-diagram.png
Entity-Relationship diagram of the CSL database structure.

CSL-data-dictionary.xlsx
Data dictionary providing detailed descriptions of all columns in each table.

CSL-changelog.md
Detailed record of changes for each version.

 

Versions

25.7

  • Unified the naming scheme of all tables and columns
  • Restructured tables, columns, and links (see CSL-er-diagram.png)
  • Expanded the data sources table to include additional information
  • General cleanup and corrections

25.6

  • Added spectra for 41 new compounds
  • Corrected some compound names and standardized adduct notation
  • Fixed incorrectly linked experiments and removed duplicate entries

25.5

  • Initial version containing 1,721 compounds and 35,523 spectra

For more details, see CSL-changelog.md.

Files (245.4 MB)

Checksum                                Size
md5:da7d2de57eb68cb653de559eaeaada77    5.8 kB
md5:29c5dd54064988e7dfd1ed109da6d802    13.5 kB
md5:87f3cf3020b22f224135b9da6dd879ea    22.9 kB
md5:a9baead136da1b7232144247195628c0    25.3 MB
md5:65d497abce68d456db12bd5086afa816    55.0 MB
md5:ea8b0c36c7287a9b229c9bdc543f6fa7    55.0 MB
md5:0694a0723c6706091969d0fc6b2af63b    55.1 MB
md5:6e092798cead311890b56af0e7b1198e    55.0 MB
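
The MD5 checksums listed above can be used to confirm that a download is complete and unaltered. A minimal sketch using Python's standard-library hashlib follows; the function names are hypothetical, and the file is read in chunks so that the larger files do not need to fit in memory at once.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For a downloaded file, matches_checksum(local_path, "md5:...") with the corresponding value from the list should return True; a False result suggests a truncated or corrupted download.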

Additional details

Related works

Is required by
Software: https://github.com/bafg-bund/csl-tools (URL)

Funding

German Federal Environment Agency
3723 22 202 0