TransExtION: Transformer based Explainable similarity metric for IONS
- 1. Adrem Data Lab, Department of Computer Science, University of Antwerp, Belgium
- 2. Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340 Beerse, Belgium.
Description
TransExtION is a supervised learning method that estimates a spectral similarity between MS/MS spectra that correlates strongly with the structural similarity of the corresponding compounds. It can be used in spectral library searches to find structural analogues. TransExtION is based on the Transformer architecture and provides a post hoc explanation of its output in order to reveal the relationships between fragments.
Here we provide a pretrained Transformer model, "GNPS_MassBank.ms.model". The model was trained on (+)ESI GNPS/MassBank spectra of 9,996 unique compounds ("GNPS_MassBank_train.mgf"). Query spectra should be supplied in mgf format (examples: "GNPS_MassBank_test.mgf" and "test_urine.mgf") and can be annotated by searching a spectral library after format conversion of the library (example: "ALL_PUBLIC_LIBRARY_POS_CONSENSUS_2022.mgf" converted to "ALL_PUBLIC_LIBRARY_POS_CONSENSUS_2022.db"). "GNPS_MassBank.ms.model", together with the converted "ALL_PUBLIC_LIBRARY_POS_CONSENSUS_2022.db" (covering over 15,000 metabolites, natural products, and drugs), can be used directly for positive ion mode library search, compound annotation, and post hoc explanation.
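Because query spectra have to be supplied in mgf format, it can help to check that a file parses before using it. The following is a minimal sketch of a reader for the generic MGF layout (a "BEGIN IONS" line, KEY=VALUE parameter lines, "m/z intensity" peak lines, and an "END IONS" line). It is not part of TransExtION. The function name `read_mgf` and the toy spectrum are illustrative assumptions, and the parameter keys in the provided files may differ from those shown here.

```python
import io


def read_mgf(handle):
    """Yield one dict per spectrum from an open MGF text stream.

    Hypothetical helper for illustration only; not the TransExtION loader.
    """
    spectrum = None
    for raw in handle:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as PEPMASS=180.0634
                key, value = line.split("=", 1)
                spectrum["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z followed by intensity
                parts = line.split()
                if len(parts) >= 2:
                    spectrum["peaks"].append((float(parts[0]), float(parts[1])))


# Toy spectrum (made-up values) to exercise the parser.
example = """BEGIN IONS
TITLE=toy spectrum
PEPMASS=180.0634
CHARGE=1+
85.0284 120.0
127.0390 45.5
163.0601 999.0
END IONS
"""

spectra = list(read_mgf(io.StringIO(example)))
print(len(spectra), spectra[0]["params"]["PEPMASS"], len(spectra[0]["peaks"]))
```

For a file on disk, the same function can be applied to an open text handle, for example `open("test_urine.mgf")`, assuming the file follows this generic layout.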
Files
Total size: 126.2 MB
MD5 checksum | Size |
---|---|
md5:279fff077afa610c851e8b88014d780d | 20.8 MB |
md5:3234fa3d00f61a39ff44548b2f99b04c | 24.8 MB |
md5:e2ed38093d3bcab472ab2142a4009774 | 8.4 MB |
md5:90e7cc149b95a6792abf2a96bdea81ae | 1.7 MB |
md5:3f03c5cb925091203ac015d089ce0bd8 | 20.8 MB |
md5:10607ae83c06c84e9dbaac015c79b52c | 33.5 MB |
md5:e77d509dd95677e7cbcb639c80789e3f | 16.2 MB |
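The md5 values listed above can be used to check that a download completed without corruption. The sketch below uses Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative assumptions. The listing here does not show which checksum belongs to which file name, so that pairing has to be taken from the record page itself. The demo checks a temporary file containing the bytes `hello` rather than one of the actual downloads.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:279f...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demo on a temporary file (not one of the record's files).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_ok = matches_checksum(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
demo_bad = matches_checksum(demo_path, "md5:279fff077afa610c851e8b88014d780d")
os.remove(demo_path)
print(demo_ok, demo_bad)
```

For a real download, pass the local path of the file and the md5 value shown next to it on the record page.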