Re-analysis of the data from synthetic peptides run on an Orbitrap Fusion (Thermo Scientific) with different acquisition methods (Ferries et al. 2017). The aim is to compare the localisation scores from different bioinformatic pipelines on a low-complexity mix of phospho-peptides with known sequences and phospho-localisation.
The synthetic peptides were spiked into phospho-enriched peptides (Ferries et al. 2017):
Synthetic phosphopeptide standards (∼10 pmol, split into five pools to separate phosphoisomers and thus ensure confidence in phosphosite localization) and 6 μL of enriched phosphopeptides (equivalent to 100 μg of digested cell lysate) were loaded onto the trapping column (PepMap100, C18, 300 μm × 5 mm)
I find 179 unique phospho-peptides in the pools. This is not in line with what is described in Ferries et al. (2017).
All related information can be found in the document “01_syntheticPeptides_Description.html”.
The search was run with Andromeda integrated in MaxQuant (v1.6.1.0), using default settings unless otherwise specified. The database is Uniprot_human_20170912, including the contaminants database. The search is performed on each pool separately.
Search parameters with Mascot integrated in Proteome Discoverer 2.3 (PhosphoRS 3.0) using default settings unless otherwise specified:
The databases are:
A search is performed for each raw file separately. We use either Mascot, SequestHT or MSAmanda to compare the impact of different search engines on phosphorylation site identification.
Percolator validation cannot be performed because there are not enough PSMs from the target and decoy searches. Therefore, the validation was performed with the Target Decoy method only.
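As a rough illustration of what a target-decoy validation does, the sketch below ranks PSMs by score and estimates the FDR at each rank as the ratio of decoy to target hits. The data frame `psms`, its columns `score` and `is_decoy`, and the 1% cut-off are all hypothetical; this is the simple decoy/target ratio, not necessarily the calculation performed inside Proteome Discoverer.

```r
# Hypothetical PSM table: one row per PSM, higher score = better match.
psms <- data.frame(
  score    = c(95, 90, 82, 75, 60, 55, 40, 31),
  is_decoy = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE)
)

# Rank PSMs from best to worst score.
psms <- psms[order(psms$score, decreasing = TRUE), ]

# Running counts of decoy and target hits down the ranked list.
n_decoy  <- cumsum(psms$is_decoy)
n_target <- cumsum(!psms$is_decoy)

# Simple FDR estimate at each rank: decoys / targets (assumed formula).
psms$fdr <- n_decoy / pmax(n_target, 1)

# q-value: the lowest FDR reachable when accepting this PSM and all better ones,
# or any longer list further down the ranking.
psms$q_value <- rev(cummin(rev(psms$fdr)))

# Keep the target PSMs that pass the (assumed) 1% threshold.
accepted <- psms[!psms$is_decoy & psms$q_value <= 0.01, ]
```

With so few PSMs, a single decoy hit moves the estimate by a large step, which is one reason a low-complexity sample like this one is hard to validate with more elaborate models.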
.mgf peaklists were exported to serve as input for PeptideShaker.
The search results were parsed with the markdown scripts in “ParseInput/”. All information on filters and other settings is in the corresponding .html documents.
Number of PSMs per strategy (all pools combined):
These results are not entirely in line with what was published previously (Ferries et al. 2017):
Number of PSMs per strategy (all pools combined) that correspond to a synthetic sequence (without taking into account the localisation):
Number of sequence IDs per strategy (all pools combined) that correspond to a synthetic sequence (without taking into account the localisation):
| | Method | NumPSMs | Replicate | Software | SearchEngine | Scoretype | Validation | colour |
|---|---|---|---|---|---|---|---|---|
| rep1_HCDOT_MQ_Andromeda_SearchEngine_TargetDecoy | HCDOT | 112 | rep1 | MQ | Andromeda | SearchEngine | TargetDecoy | MQ Andromeda |
| rep1_HCDOT_PD_Mascot_ptmRS_TargetDecoy | HCDOT | 100 | rep1 | PD | Mascot | ptmRS | TargetDecoy | PD Mascot |
| rep1_HCDOT_PD_Mascot_SearchEngine_TargetDecoy | HCDOT | 100 | rep1 | PD | Mascot | SearchEngine | TargetDecoy | PD Mascot |
| rep1_HCDOT_PD_MSAmanda_ptmRS_TargetDecoy | HCDOT | 107 | rep1 | PD | MSAmanda | ptmRS | TargetDecoy | PD MSAmanda |
| rep1_HCDOT_PD_MSAmanda_SearchEngine_TargetDecoy | HCDOT | 107 | rep1 | PD | MSAmanda | SearchEngine | TargetDecoy | PD MSAmanda |
| rep1_HCDOT_PD_SequestHT_ptmRS_TargetDecoy | HCDOT | 101 | rep1 | PD | SequestHT | ptmRS | TargetDecoy | PD SequestHT |
| rep1_HCDOT_PD_SequestHT_SearchEngine_TargetDecoy | HCDOT | 101 | rep1 | PD | SequestHT | SearchEngine | TargetDecoy | PD SequestHT |
| rep1_HCDOT_PS_Comet_Ascore_TargetDecoy | HCDOT | 103 | rep1 | PS | Comet | Ascore | TargetDecoy | PS Comet |
| rep1_HCDOT_PS_Comet_ptmRS_TargetDecoy | HCDOT | 103 | rep1 | PS | Comet | ptmRS | TargetDecoy | PS Comet |
| rep1_HCDOT_PS_Comet_SearchEngine_TargetDecoy | HCDOT | 103 | rep1 | PS | Comet | SearchEngine | TargetDecoy | PS Comet |
| rep1_HCDOT_PS_MSAmanda_Ascore_TargetDecoy | HCDOT | 108 | rep1 | PS | MSAmanda | Ascore | TargetDecoy | PS MSAmanda |
| rep1_HCDOT_PS_MSAmanda_ptmRS_TargetDecoy | HCDOT | 108 | rep1 | PS | MSAmanda | ptmRS | TargetDecoy | PS MSAmanda |
| rep1_HCDOT_PS_MSAmanda_SearchEngine_TargetDecoy | HCDOT | 108 | rep1 | PS | MSAmanda | SearchEngine | TargetDecoy | PS MSAmanda |
| rep1_HCDOT_PS_X!Tandem_Ascore_TargetDecoy | HCDOT | 80 | rep1 | PS | X!Tandem | Ascore | TargetDecoy | PS X!Tandem |
| rep1_HCDOT_PS_X!Tandem_ptmRS_TargetDecoy | HCDOT | 80 | rep1 | PS | X!Tandem | ptmRS | TargetDecoy | PS X!Tandem |
| rep1_HCDOT_PS_X!Tandem_SearchEngine_TargetDecoy | HCDOT | 80 | rep1 | PS | X!Tandem | SearchEngine | TargetDecoy | PS X!Tandem |
| rep2_HCDOT_MQ_Andromeda_SearchEngine_TargetDecoy | HCDOT | 114 | rep2 | MQ | Andromeda | SearchEngine | TargetDecoy | MQ Andromeda |
| rep2_HCDOT_PD_Mascot_ptmRS_TargetDecoy | HCDOT | 99 | rep2 | PD | Mascot | ptmRS | TargetDecoy | PD Mascot |
| rep2_HCDOT_PD_Mascot_SearchEngine_TargetDecoy | HCDOT | 99 | rep2 | PD | Mascot | SearchEngine | TargetDecoy | PD Mascot |
| rep2_HCDOT_PD_MSAmanda_ptmRS_TargetDecoy | HCDOT | 106 | rep2 | PD | MSAmanda | ptmRS | TargetDecoy | PD MSAmanda |
| rep2_HCDOT_PD_MSAmanda_SearchEngine_TargetDecoy | HCDOT | 106 | rep2 | PD | MSAmanda | SearchEngine | TargetDecoy | PD MSAmanda |
| rep2_HCDOT_PD_SequestHT_ptmRS_TargetDecoy | HCDOT | 106 | rep2 | PD | SequestHT | ptmRS | TargetDecoy | PD SequestHT |
| rep2_HCDOT_PD_SequestHT_SearchEngine_TargetDecoy | HCDOT | 106 | rep2 | PD | SequestHT | SearchEngine | TargetDecoy | PD SequestHT |
| rep2_HCDOT_PS_Comet_Ascore_TargetDecoy | HCDOT | 85 | rep2 | PS | Comet | Ascore | TargetDecoy | PS Comet |
| rep2_HCDOT_PS_Comet_ptmRS_TargetDecoy | HCDOT | 99 | rep2 | PS | Comet | ptmRS | TargetDecoy | PS Comet |
| rep2_HCDOT_PS_Comet_SearchEngine_TargetDecoy | HCDOT | 99 | rep2 | PS | Comet | SearchEngine | TargetDecoy | PS Comet |
| rep2_HCDOT_PS_MSAmanda_Ascore_TargetDecoy | HCDOT | 106 | rep2 | PS | MSAmanda | Ascore | TargetDecoy | PS MSAmanda |
| rep2_HCDOT_PS_MSAmanda_ptmRS_TargetDecoy | HCDOT | 106 | rep2 | PS | MSAmanda | ptmRS | TargetDecoy | PS MSAmanda |
| rep2_HCDOT_PS_MSAmanda_SearchEngine_TargetDecoy | HCDOT | 106 | rep2 | PS | MSAmanda | SearchEngine | TargetDecoy | PS MSAmanda |
| rep2_HCDOT_PS_X!Tandem_Ascore_TargetDecoy | HCDOT | 96 | rep2 | PS | X!Tandem | Ascore | TargetDecoy | PS X!Tandem |
| rep2_HCDOT_PS_X!Tandem_ptmRS_TargetDecoy | HCDOT | 96 | rep2 | PS | X!Tandem | ptmRS | TargetDecoy | PS X!Tandem |
| rep2_HCDOT_PS_X!Tandem_SearchEngine_TargetDecoy | HCDOT | 96 | rep2 | PS | X!Tandem | SearchEngine | TargetDecoy | PS X!Tandem |
Number of unique synthetic phospho-sequences: 133
In these data, we find 219 unique sequences (without taking into account modifications: Sequence field) that do not correspond to any of the synthetic peptides expected.
Number of unique synthetic phospho-peptides per strategy (all pools combined):
I work on the basis of the 179 synthetic phospho-peptides used for the analysis. Of these 179, 157 are identified in the data by at least one method.
The synthetic peptides that are not identified by any of the methods are the following: pool_1_ADENYYK_5, pool_1_ESKSSPRPTAEK_4, pool_1_SQSTSEQEK_1, pool_2_ADENYYK_6, pool_2_EDAANNYAR_7, pool_2_ETTTSPKKYYLAEK_2, pool_2_NIDQSEFEGFSFVNSEFLKPEVK_11, pool_3_AGGKPSQSPSQEAAGEAVLGAK_10, pool_3_ETTTSPKKYYLAEK_3, pool_3_MPSHEAR_3, pool_3_SRTPPSAPSQSR_6, pool_3_TAPTPPKR_4, pool_3_VYELMR_2, pool_3_VYHYR_2, pool_4_ILSDVTHSAVFGVPASK_8, pool_4_SFNGSLKNVAVDELSR_1, pool_4_SQSDIFSR_3, pool_4_STVASMMHR_5, pool_4_YELTGLPEQDR_1, pool_5_ILSDVTHSAVFGVPASK_6, pool_5_TIYVRDPTSNK_3, pool_5_TSSFAEPGGGGGGGGGGPGGSASGPGGTGGGK_2&3
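A list like the one above can be obtained with a set difference between the expected peptide IDs and the union of the IDs recovered by each strategy. The sketch below is a minimal version of that idea; the peptide IDs, the `identified` list and the strategy names are made up for illustration and are not taken from the parsed results.

```r
# Hypothetical expected synthetic peptides, named pool_<n>_<sequence>_<site>.
synthetic_ids <- c(
  "pool_1_PEPTSIDEK_5", "pool_2_SAMPLER_1",
  "pool_3_TESTYK_5",    "pool_4_MISSINGK_4"
)

# Hypothetical IDs recovered by each strategy.
identified <- list(
  MQ_Andromeda = c("pool_1_PEPTSIDEK_5", "pool_2_SAMPLER_1"),
  PD_Mascot    = c("pool_1_PEPTSIDEK_5"),
  PS_Comet     = c("pool_2_SAMPLER_1", "pool_3_TESTYK_5")
)

# Union of everything seen by at least one strategy.
found_by_any <- unique(unlist(identified, use.names = FALSE))

# Synthetic peptides never identified, and the count of those that were.
never_identified <- setdiff(synthetic_ids, found_by_any)
n_identified     <- length(intersect(synthetic_ids, found_by_any))
```

Here `never_identified` contains the one peptide absent from every strategy, and `n_identified` plus `length(never_identified)` equals the number of expected peptides.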
In the following density plots, the SyntheticPeptideID field indicates whether a PSM is attributed to a synthetic peptide with the correct phosphorylation localisation (TRUE) or with an incorrect one (FALSE).
I work on mono-phosphorylated peptides only.
I find the spectra that are matched to an ID in all the pipelines.
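One way to select the spectra shared by all pipelines is to intersect the spectrum identifiers of every result table. The sketch below assumes each pipeline's output has been reduced to a data frame with a `spectrum` key (for example raw file plus scan number) and a `loc_score` column; these names and values are hypothetical.

```r
# Hypothetical per-pipeline results keyed by a spectrum identifier.
pipelines <- list(
  MQ = data.frame(spectrum  = c("f1_101", "f1_102", "f1_103"),
                  loc_score = c(0.99, 0.80, 0.51),
                  stringsAsFactors = FALSE),
  PD = data.frame(spectrum  = c("f1_101", "f1_103", "f1_104"),
                  loc_score = c(1.00, 0.50, 0.97),
                  stringsAsFactors = FALSE),
  PS = data.frame(spectrum  = c("f1_100", "f1_101", "f1_103"),
                  loc_score = c(0.90, 0.95, 0.66),
                  stringsAsFactors = FALSE)
)

# Spectra that received an ID in every pipeline.
common <- Reduce(intersect, lapply(pipelines, function(d) d$spectrum))

# Restrict each pipeline's table to those shared spectra.
common_only <- lapply(pipelines, function(d) d[d$spectrum %in% common, ])
```

Restricting every table to the same spectra makes the score distributions comparable, since differences then reflect the scoring rather than which spectra each pipeline happened to identify.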
I plot the histograms of the localisation scores on the PSMs matched in all the pipelines (325 PSMs).
I keep only the mono-phosphorylated spectra with a localisation score > 50% for ptmRS, and ≥ 0.75 for MQ.
I rank the localisation scores from the highest to lowest and go through them. For each step, I count the number of errors.
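The ranking procedure can be sketched as a cumulative count of wrong localisations down the sorted list, which gives the true false localisation rate (FLR) at each score. The data frame `psm` and its columns `loc_score` and `correct` are hypothetical stand-ins for the parsed results, and ties between equal scores are not treated specially here.

```r
# Hypothetical mono-phosphorylated PSMs: localisation score plus a flag saying
# whether the reported site matches the known synthetic peptide.
psm <- data.frame(
  loc_score = c(1.00, 0.99, 0.98, 0.90, 0.85, 0.75, 0.60, 0.55),
  correct   = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)

# Rank from the highest to the lowest localisation score.
psm <- psm[order(psm$loc_score, decreasing = TRUE), ]

# At each step, count the errors accumulated so far and the PSMs accepted.
psm$n_errors <- cumsum(!psm$correct)
psm$n_psms   <- seq_len(nrow(psm))

# True FLR when accepting every PSM down to this rank.
psm$true_flr <- psm$n_errors / psm$n_psms

# True FLR observed at a localisation threshold of 0.75.
flr_at_075 <- tail(psm$true_flr[psm$loc_score >= 0.75], 1)
```

In this toy example, one of the six PSMs scoring at least 0.75 is wrongly localised, so `flr_at_075` is 1/6.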
The black dots correspond to a localisation threshold of 0.75. In the PhosphoRS scoring scheme, there is no value between 0.75 and 0.5, which explains the step at this threshold value.
The “x” marks correspond to a localisation score threshold of 0.2.
I save the table with the true FLR and corresponding localisation scores:
write.table(plotval, "FLR.txt", sep = "\t", row.names = FALSE)
Here, I use all the mono-phosphorylated PSMs, even with scores <= 0.5. There are 325 PSMs common to all the pipelines. I use them for the following figures:
The figure is saved in “Figures/Score_realign.pdf”.
Ferries, Samantha, Simon Perkins, Philip J. Brownridge, Amy Campbell, Patrick A. Eyers, Andrew R. Jones, and Claire E. Eyers. 2017. “Evaluation of Parameters for Confident Phosphorylation Site Localization Using an Orbitrap Fusion Tribrid Mass Spectrometer.” Journal of Proteome Research 16 (9): 3448–59. https://doi.org/10.1021/acs.jproteome.7b00337.