Published August 10, 2020 | Version v1
Dataset Open

Cleaned and pre-processed MS/MS dataset (built from all positive ion mode spectra in GNPS)

Authors/Creators

  • 1. Netherlands eScience Center

Description

MS/MS dataset built from data obtained from GNPS (accessed on 2020-05-11): https://gnps-external.ucsd.edu/gnpslibrary/ALL_GNPS.json

The data was cleaned and pre-processed using notebooks provided here: https://github.com/iomega/spec2vec_gnps_data_analysis/tree/master/notebooks

  • 112,956 spectra
  • metadata was cleaned and corrected using matchms (https://github.com/matchms/matchms) and lookup routines using PubChem
  • 92,954 of the spectra have SMILES and InChIKey annotations (13,717 unique InChIKeys when compared on the first 14 characters)
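The "unique InChIKey in first 14 characters" count above can be reproduced with a few lines of Python. The first 14 characters of an InChIKey encode the molecular connectivity, so truncating to that block groups spectra of the same structure regardless of stereochemistry. The following is a minimal sketch, not the notebooks' own code: it assumes each spectrum's metadata is available as a dictionary, and the field name `inchikey` is a hypothetical choice that may differ in the actual file. The example records are made-up placeholders in InChIKey format, not real compounds.

```python
def count_unique_inchikey14(records, key="inchikey"):
    """Return (number of records with an InChIKey, number of distinct 14-character prefixes).

    `records` is any iterable of dict-like metadata entries. The field name
    `key` is an assumption for illustration and may differ in the real data.
    """
    annotated = 0
    prefixes = set()
    for record in records:
        value = record.get(key)
        if not value:
            continue  # skip spectra without an InChIKey annotation
        annotated += 1
        prefixes.add(value[:14].upper())  # first block = connectivity layer
    return annotated, len(prefixes)


# Placeholder records (made-up strings, for illustration only)
toy_records = [
    {"inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N"},
    {"inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N"},  # same first block, different second block
    {"inchikey": "CCCCCCCCCCCCCC-UHFFFAOYSA-N"},
    {"inchikey": None},                            # unannotated spectrum
]

annotated, unique14 = count_unique_inchikey14(toy_records)
print(annotated, unique14)
```

Applied to the full dataset, the two returned numbers would correspond to the annotated-spectra count and the unique-structure count reported in the list above, provided the metadata field is named as assumed.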

Files

Files (277.9 MB)

md5:ff1592f3b9c57538e1ba0387045c859a
277.9 MB
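Because the download is fairly large, it can be worth checking the file against the md5 checksum listed above before using it. The following is a minimal sketch using only the Python standard library. The function names and the local file path in the usage note are hypothetical; only the checksum value comes from this record.

```python
import hashlib

# Checksum as listed in the Files section of this record
EXPECTED_MD5 = "ff1592f3b9c57538e1ba0387045c859a"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_checksum("path/to/downloaded_file")` (with the path replaced by wherever the file was saved) should return `True` for an intact download. A `False` result suggests a truncated or corrupted file.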