Published October 15, 2024 | Version v1
Dataset Open

Training split MS2Deepscore

  • 1. Wageningen University & Research

Description

This dataset contains the spectra, split into training, validation and test sets, that were used to train the ms2deepscore model found at: https://doi.org/10.5281/zenodo.13897744
The spectra are a combination of GNPS, MassBank and MoNA spectra and a library created by Corinna Brungs (see "MSnLib Mass spectral libraries (.mgf and .json)" on zenodo.org), preprocessed using matchms to harmonize the metadata. If you use this dataset for academic work, please make sure to cite all relevant work.

The data was split by compound: 1/20th of the unique InChIKeys was selected for the validation set and 1/20th for the test set, so no compound in the training set appears in the validation or test set. Apart from this constraint, the split is random.
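A compound-level split of this kind can be sketched as follows. This is only an illustration, not the code used to produce the files: it assumes that 1/20th of the unique InChIKeys is held out for each of the validation and test sets, and the function name `split_by_inchikey`, the dictionary layout of a spectrum and the random seed are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_inchikey(spectra, fraction=1 / 20, seed=0):
    """Split spectra so that every InChIKey ends up in exactly one subset.

    `spectra` is a list of dicts with an "inchikey" entry (hypothetical layout).
    Returns (train, validation, test) lists of spectra.
    """
    # Group the spectra by compound identifier.
    by_key = defaultdict(list)
    for spectrum in spectra:
        by_key[spectrum["inchikey"]].append(spectrum)

    # Shuffle the unique InChIKeys reproducibly.
    keys = sorted(by_key)
    random.Random(seed).shuffle(keys)

    # Reserve `fraction` of the unique InChIKeys for each held-out set.
    n_held_out = max(1, round(len(keys) * fraction))
    val_keys = keys[:n_held_out]
    test_keys = keys[n_held_out:2 * n_held_out]
    train_keys = keys[2 * n_held_out:]

    def collect(selected):
        # Gather all spectra that belong to the selected compounds.
        return [s for key in selected for s in by_key[key]]

    return collect(train_keys), collect(val_keys), collect(test_keys)


# Toy example: 40 made-up compounds with 2 spectra each.
toy_spectra = [
    {"inchikey": f"KEY{i:03d}", "scan": j} for i in range(40) for j in range(2)
]
train, val, test = split_by_inchikey(toy_spectra)
```

Because the split is made on InChIKeys rather than on individual spectra, all spectra of one compound always stay together in the same subset.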

Files

Files (1.5 GB)

Checksum Size
md5:f6d7a4a83f1801bf860fff6e043db6c2 15.6 MB
md5:d96f0c95e5febb293743747586cf6ffb 296.8 MB
md5:305c02cb261f0425af2b54c0c052d0fe 16.2 MB
md5:86d8d7ab35dd68e5c47877ff580b4270 57.9 MB
md5:c2032ec39c3509f3539d0e216b230a10 1.1 GB
md5:51aee6fd0d060cb9c013ddbc12748e75 58.9 MB
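
The md5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the path passed in has to be whatever name the downloaded file has locally, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal md5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:f6d7a4a83f1801bf860fff6e043db6c2")` would return `True` only if the file at `local_path` is an intact copy of the 15.6 MB file.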

Additional details

Dates

Submitted
2024-10-15