Published December 21, 2022 | Version 2
Dataset Open

Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing

  • 1. Computational Molecular Medicine, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
  • 2. Computational Mass Spectrometry, School of Life Sciences, Technical University of Munich, Munich, Germany

Description

This Zenodo record contains the dataset and model weights for "Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing".

 

This repository contains the following files:

  • For the human dataset by Wang et al.:

    • train_val_test_split.csv, which maps each correct peptide (as identified by MaxQuant) to the training, validation, or test set

    • psms_train_val_test.csv, which maps each correct PSM (scan number, raw file, and correct peptide as identified by MaxQuant) to the training, validation, or test set

    • updated_spectralis_test_out.csv, which, as before, contains Spectralis-EA predictions and scores on the test set together with the initial peptides and scores from Casanovo and Novor. It now also contains the correct peptides identified by MaxQuant and the Spectralis-scores for the combination of Casanovo and Novor sequences (column spectralis_score_onlyRescoring)

    • spectralis_test_out_heart_analysis.csv, a subset of 20220822_spectralis_test_out.csv that contains only the PSMs from heart tissue, together with the computed precision and recall values

    • spectralis_test_out_pointnovo_deepnovo.csv, which contains predictions by DeepNovo and PointNovo with their original scores and Spectralis-scores, as well as the correct peptides identified by MaxQuant

 

  • For the nine-species dataset by Tran et al.:

    • spectralis_ninespecies_out.csv, which contains spectrum identifiers, the correct peptides identified by PEAKSDB, the peptides predicted by the different de novo sequencing tools, and the original scores and Spectralis-scores for the corresponding PSMs.
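
The heart-tissue file listed above includes precision and recall values. This record does not state how they were computed, so the following is only a rough sketch under an assumed definition: at a given score threshold, precision is the fraction of retained predictions that match the correct peptide, and recall is the number of retained correct predictions divided by the total number of PSMs. The function name, the tuple layout, and the toy peptides are hypothetical and are not taken from the CSV files, whose column names are not documented here.

```python
def precision_recall_at(threshold, scored_psms):
    """Compute (precision, recall) for predictions scoring at or above `threshold`.

    `scored_psms` is a list of (score, predicted_peptide, correct_peptide)
    tuples. This layout is an assumption made for illustration only.
    """
    # Keep only the predictions whose score passes the threshold.
    kept = [(pred, true) for score, pred, true in scored_psms if score >= threshold]
    # Exact string comparison is a simplification; real evaluations may
    # treat some residues or modifications as equivalent.
    n_correct = sum(1 for pred, true in kept if pred == true)
    precision = n_correct / len(kept) if kept else 0.0
    recall = n_correct / len(scored_psms) if scored_psms else 0.0
    return precision, recall


# Toy data, invented for this example.
toy_psms = [
    (0.9, "PEPTIDEK", "PEPTIDEK"),
    (0.8, "ELVISK", "ELVISK"),
    (0.4, "PEPTGDEK", "PEPTIDEK"),
    (0.2, "SAMPLER", "SAMPLER"),
]

print(precision_recall_at(0.5, toy_psms))  # (1.0, 0.5)
```

Sweeping the threshold over all observed scores would trace out a precision-recall curve, which is one way to compare original tool scores with rescored ones.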

Files

Files (2.5 GB)

Checksum                               Size
md5:92b8f1e44e5f20d8b2d3107c84d112d6   327.6 MB
md5:2f23385782115f277e3d9167ad0c5ff3   5.6 MB
md5:c33d003f7d0450243bbd7410f0b4ebee   761.8 MB
md5:3b456a9ecc1e33144956b0bcf6ca8a9a   1.3 GB
md5:91a485e814b474d29e99cdaa95173d6b   7.3 MB
md5:178afd50656122a246d3ff4423cc162f   8.0 MB
md5:91b709900c2b3f5d2ab8783643a9ac57   6.8 MB
md5:46e682c38b9369b817c10b7efcc79a99   146.1 MB
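
Each file in the listing is published with an MD5 checksum, so a downloaded copy can be checked for corruption. The following is a minimal sketch using Python's standard hashlib module. The function names and the local file path in the usage comment are hypothetical and are not part of this record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or as plain hex."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Hypothetical usage; "downloaded_file.csv" stands in for a local copy of
# whichever file carries this checksum in the listing:
# matches_checksum("downloaded_file.csv", "md5:92b8f1e44e5f20d8b2d3107c84d112d6")
```

Reading in chunks matters here because the largest file in the listing is 1.3 GB, which is better not loaded into memory at once.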