Path to the search results parsed with this script: ~/Documents/Boulot/CPR/Projets/PhosphoLocalisations/Data_C.Eyers/MQ/.
I manually add a field recording which search engine was used for each search.
I keep only the phosphorylated ions.
I remove reverse hits and potential contaminants.
I keep only the files corresponding to the following acquisition methods: HCDOT.
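The three filtering steps above can be sketched as follows. This is only an illustration under assumptions, not the script itself: the column names (`Phospho (STY)`, `Reverse`, `Potential contaminant`, `Raw file`), the `+` marker, and the idea that the acquisition method can be read from the raw file name are all assumed here and may not match the actual tables.

```python
def keep_row(row, method="HCDOT"):
    """Return True for a phosphorylated, non-reverse, non-contaminant row
    acquired with the requested method.

    All column names are assumptions made for this sketch.
    """
    # Assumed: a count column gives the number of phosphorylations.
    is_phospho = int(row.get("Phospho (STY)") or 0) > 0
    # Assumed: reverse hits and contaminants are flagged with "+".
    is_reverse = row.get("Reverse", "") == "+"
    is_contaminant = row.get("Potential contaminant", "") == "+"
    # Assumed: the acquisition method appears in the raw file name.
    right_method = method in row.get("Raw file", "")
    return is_phospho and not is_reverse and not is_contaminant and right_method


# Hypothetical rows, for illustration only.
rows = [
    {"id": 1, "Phospho (STY)": "1", "Reverse": "", "Potential contaminant": "", "Raw file": "pool1_HCDOT"},
    {"id": 2, "Phospho (STY)": "0", "Reverse": "", "Potential contaminant": "", "Raw file": "pool1_HCDOT"},
    {"id": 3, "Phospho (STY)": "1", "Reverse": "+", "Potential contaminant": "", "Raw file": "pool1_HCDOT"},
    {"id": 4, "Phospho (STY)": "2", "Reverse": "", "Potential contaminant": "+", "Raw file": "pool1_HCDOT"},
    {"id": 5, "Phospho (STY)": "1", "Reverse": "", "Potential contaminant": "", "Raw file": "pool1_EThcD"},
]

kept = [r for r in rows if keep_row(r)]
```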
Distribution of the scores:
I apply a threshold on the localisation score: it has to be above 0.75. The data are not filtered yet; the field LocalisationsFilter indicates whether the localisation score passes the threshold.
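A minimal sketch of this flagging step, assuming the score sits in a field called `LocalisationScore` (a hypothetical name) and that "above" means strictly greater than 0.75. No row is dropped; only the LocalisationsFilter field is added.

```python
THRESHOLD = 0.75


def add_localisation_filter(rows, score_field="LocalisationScore", threshold=THRESHOLD):
    """Flag each row in place without removing any of them.

    `score_field` is a hypothetical field name used for this sketch.
    """
    for row in rows:
        # Strict comparison: a score equal to the threshold does not pass.
        row["LocalisationsFilter"] = row[score_field] > threshold
    return rows


# Hypothetical scores, for illustration only.
scored = add_localisation_filter([
    {"LocalisationScore": 0.99},
    {"LocalisationScore": 0.75},
    {"LocalisationScore": 0.40},
])
```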
NOTE: In the field Modified.sequence of the input tables, there are two ways the phosphorylation site localisation can be indicated:
This is weird, and I take both possible options into consideration when parsing the localisations.
For all the different inputs, I create IDs of the phospho-peptides:
PhosphopeptideID: concatenation of pool, sequence and localisation of the phosphorylation (separated with "_").

PhosphosequenceID: concatenation of pool, sequence and number of phosphorylations on the peptide (separated with "_").

When there are several scores for the phosphorylation localisations, I create one ID for each scoring. I define the scorings as “ptmRS” when it is the phosphoRS algorithm, or “SearchEngine” when it is the default localisation score of the pipeline.
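The ID construction described above could look like the sketch below. The function names, the way a site is written (`S3`), and the use of ";" to join several sites on the same peptide are assumptions made for the illustration; only the "_" separator and the scoring names come from the description.

```python
def phosphopeptide_id(pool, sequence, sites):
    """Pool, sequence and localisation(s), separated with "_".

    Joining several sites with ";" is an assumption of this sketch.
    """
    return "_".join([pool, sequence, ";".join(sites)])


def phosphosequence_id(pool, sequence, n_phospho):
    """Pool, sequence and number of phosphorylations, separated with "_"."""
    return "_".join([pool, sequence, str(n_phospho)])


def ids_per_scoring(pool, sequence, sites_by_scoring):
    """One PhosphopeptideID per scoring ("ptmRS" or "SearchEngine")."""
    return {
        scoring: phosphopeptide_id(pool, sequence, sites)
        for scoring, sites in sites_by_scoring.items()
    }


# Hypothetical peptide where the two scorings disagree on the site.
example_ids = ids_per_scoring(
    "Pool1", "ASSPEK", {"ptmRS": ["S3"], "SearchEngine": ["S2"]}
)
example_seq_id = phosphosequence_id("Pool1", "ASSPEK", 1)
```

Keeping one ID per scoring makes it possible to compare the localisations reported by the two scorings for the same peptide.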