There is a newer version of the record available.

Published June 5, 2023 | Version v1
Dataset Open

Predicting glycan structure from tandem mass spectrometry via deep learning

Description

A curated set of LC-MS/MS data from glycomics studies, used to train and apply CandyCrunch, a deep learning model that predicts glycan structure from LC-MS/MS data. CandyCrunch is described in Urban et al., bioRxiv, 2023 and at https://github.com/BojarLab/CandyCrunch.

Files:

full_dataset.xlsx: Full dataset with all annotated LC-MS/MS glycan spectra

X_train.pkl: spectra and metadata from our training set

y_train.pkl: labels from our training set

X_test.pkl: spectra and metadata from our independent test set

y_test.pkl: labels from our independent test set
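
A minimal sketch of how the four .pkl files listed above might be loaded, assuming they are standard Python pickles sitting together in one local directory. The record does not state what object types the files contain, so the helpers below (`load_pickle` and `load_split`, both hypothetical names) simply return whatever was pickled. If the files hold objects from a third-party library, that library must be installed for unpickling to succeed.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load one pickled object. Only unpickle files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def load_split(data_dir, split):
    """Return (X, y) for the 'train' or 'test' split stored in data_dir.

    Assumes the file names from the record: X_<split>.pkl and y_<split>.pkl.
    The types of the returned objects are not specified by the record.
    """
    if split not in ("train", "test"):
        raise ValueError(f"split must be 'train' or 'test', got {split!r}")
    data_dir = Path(data_dir)
    X = load_pickle(data_dir / f"X_{split}.pkl")
    y = load_pickle(data_dir / f"y_{split}.pkl")
    return X, y
```

For example, `X_train, y_train = load_split("path/to/download", "train")` would read X_train.pkl and y_train.pkl from a directory of your choosing.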

Files (6.6 GB)

md5:fb3bfb09259d850af046051f93edcfff (1.6 GB)
md5:42d9db11c876afd1f82c01696b1db10b (682.7 MB)
md5:afde804c819f5e1d5b85058de41cbd16 (4.3 GB)
md5:5d437e8c073c2ae173ff6355f2c31a5d (121.4 kB)
md5:6f717f8c6ade3d51d27a923f964c9715 (756.1 kB)
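
The md5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and because this listing does not show which checksum belongs to which file, the expected value has to be taken from the record page for the specific file being checked.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 against an expected value, with or without an 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the largest file in the record is several gigabytes. A call such as `matches_checksum("path/to/file", "md5:<checksum from the record>")` returns `True` only when the local file matches.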