Public Data files for MassFormer
Description
Public data files for the MassFormer experiments. See the GitHub repository for instructions on using this data.
Raw Data:
casmi_2016.tgz - Critical Assessment of Small Molecule Identification 2016, used for model evaluation.
casmi_2022.tgz - Critical Assessment of Small Molecule Identification 2022, used for model evaluation.
mb_na_msms.msp.gz - MassBank of North America (MoNA) export of LC-MS/MS spectra, used for model evaluation.
cid_smiles.tsv.gz - Mapping of CID to SMILES strings, obtained from PubChem.
Processed Data:
proc_casmi_2016.tgz - Processed spectrum and molecule data for the CASMI 2016 benchmark.
proc_casmi_2022.tgz - Processed spectrum and molecule data for the CASMI 2022 benchmark.
proc_nist20_outlier.tgz - Processed spectrum and molecule data for the NIST20 Outlier benchmark (formerly called pseudo-CASMI).
proc_demo.tgz - Processed spectrum and molecule data for the demo (refer to code repository for more information).
cfm.tgz - Predicted spectra for the Competitive Fragmentation Modelling (CFM) baseline.
Model Checkpoints:
demo.pkl - Checkpoint of a MassFormer model trained on MoNA data, for the purposes of running the demo.
checkpoint_best_pcqm4mv2.pt - Checkpoint of a Graphormer model pretrained on the PCQM4M dataset, used to initialize some MassFormer models. Copied from this URL. Please refer to the Graphormer repository for more information.
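The gzip-compressed mapping cid_smiles.tsv.gz listed under Raw Data can be read directly, without unpacking it to disk first. The following is a minimal sketch, not the MassFormer loading code. It assumes each row holds two tab-separated fields (CID, then SMILES) with no header row, which is not confirmed here, and the function name load_cid_smiles is hypothetical.

```python
import csv
import gzip


def load_cid_smiles(path):
    """Return a dict mapping CID (str) to SMILES (str).

    Assumption: each row is `CID<TAB>SMILES` with no header row.
    Rows with fewer than two fields (e.g. blank lines) are skipped.
    """
    mapping = {}
    # "rt" opens the gzip stream in text mode, so rows are decoded lazily.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue
            mapping[row[0]] = row[1]
    return mapping
```

If the file turns out to have a header row or a different column order, the first rows of the returned mapping will show it, and the indexing above would need adjusting.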
Files
(4.5 GB)
Checksum | Size |
---|---|
md5:c31160256365c58735e24a6cf0cb7eb9 | 36.5 MB |
md5:a49055f71d4a6983c20f665c41c59eb6 | 23.9 MB |
md5:9091cbc446124dc947cb2a7b01fe9c98 | 1.2 GB |
md5:9d7a2bd77bd02d3ed45cea66646aee81 | 193.2 MB |
md5:6e17ad47e5dc9a18404274beeae06484 | 1.4 GB |
md5:4a00ad8d88ced0a0cda1d5c9288d82d2 | 772.8 MB |
md5:f7dd1827a013bf05c3e7f7b07b946083 | 277.4 MB |
md5:c24bae2a10e929d8a1570ae810d45278 | 24.4 MB |
md5:82364fafaff0ead065c0b7680da34215 | 199.7 MB |
md5:cfa1ee20d88fb91d3d664f2694db0ed2 | 348.3 MB |
md5:dab95dd818a5bab14a22bee5551c861e | 98.0 MB |
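The md5: values in the file listing can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names md5_of_file and matches_listing are hypothetical, and the file is read in chunks because several of the archives are over 1 GB.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in 1 MiB chunks so large archives are never
    loaded into memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum string such as 'md5:<hex>'.

    The optional 'md5:' prefix is stripped and case is ignored.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

To use it, pass the path of a downloaded file together with the checksum string shown for that file in the listing. A result of False suggests the download is incomplete or damaged and should be repeated.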