Computational identification of Brassica napus pollen specific protein Bnm1 as an allergen

Bnm1 is a pollen specific protein from Brassica napus (oilseed rape) that is expressed specifically in the bicellular and tricellular stages of pollen development. The incidence of pollinosis due to oilseed rape is rising along with its expanding cultivation, so identifying its allergens is a timely prerequisite for developing effective immune therapy. In the present study, different computational tools were used to assess the potential of Bnm1 as a candidate allergen. The predicted physicochemical properties of Bnm1, including its molecular weight (~20 kD) and theoretical pI (5.27), fall within the ranges typical of allergenic proteins. Because allergens can induce both humoral and cell mediated immune responses, we examined candidate B cell and T cell epitopes of Bnm1 using different immune-informatics tools hosted at the IEDB analysis resource. For B cell epitope prediction, potential antigenic sites on the protein surface were predicted by both propensity scale and machine learning methods and then mapped onto a 3D structure of Bnm1 predicted by homology modeling. T cell epitope prediction was based on the interaction of the core sequence with seven abundant MHC-II alleles (DRB1*0101, DRB1*0301, DRB1*0401, DRB1*0701, DRB1*1101, DRB1*1301 and DRB1*1501) at a predicted IC50 below 25. We observed multiple epitope interactions with the DRB1*0101 allele, suggesting that these interactions might evoke a strong Th2 response leading to increased IgE production. Based on our predictions, we propose that Bnm1 is a potential allergen capable of inducing both humoral and cell mediated allergic responses. In vitro analysis was beyond the scope of this study and remains necessary to validate the potency of Bnm1 as an allergen.


Introduction
Allergens are mostly small proteins or protein-bound small substances with a molecular weight ranging from 15 to 40 kD [1]. These proteins or protein-associated substances can induce not only an IgE antibody mediated immune response but also other antibody responses such as IgA, IgM, and IgG [2]. Moreover, they can elicit a T cell response in the human body, which is mostly Th2 cell mediated [3,4]. The sources of these allergens include food items, foreign serums, insect venoms, pollens, and insect products [5]. Most allergens have a biological function, but little is known about the relevance of that function to allergenicity. A functional property of an allergen, such as enzyme activity, may contribute to the induction of allergenicity [6]. Pollen allergy (pollinosis) has attracted considerable attention from the scientific community around the world, and its incidence is increasing steadily [7,8]. The symptoms of the immediate response to pollen allergens include allergic rhinitis, asthma and hay-fever [9][10][11]. Individuals who are hypersensitive to pollen often show allergy-like symptoms.
Brassica napus (oilseed rape), a member of the Cruciferae family, is one of the most widely cultivated crop plants in the world. The importance of this plant as a source of oil and medicinal components is reviewed elsewhere. Cultivation of oilseed rape around the globe has been increasing over the last 10 years, reaching about 31.5 Mha of cultivated land in 2010 [12]. The recent introduction of genetically engineered oilseed rape has accelerated cultivation even further [13,14].
As cultivation of this plant expands, concern about allergenicity is growing, since pollen from oilseed rape has been reported as a potential allergen. In regions where the crop is most abundant, most patients with hay-fever blame the blooming of Brassica napus for their symptoms during its flowering season [15]. During the flowering season the microscopic male cells of the plant form round or oval grains that are carried by the wind to aid pollination. Wind-borne pollen of Brassica napus spreads around the cultivated land, and exposed people, mostly farmers, suffer from seasonal diseases such as hay-fever and allergic rhinitis [15]. Although some of the high molecular mass proteins in oilseed rape pollen have been identified as allergens [16][17][18][19][20], the search for other allergens continues in different laboratories. In modern biology, in silico approaches offer a straightforward way, saving both time and money, to address many biological problems such as vaccine design [21] and prediction of the deleterious effects of mutations [22]. Such approaches can also predict potential allergens from a given whole proteome.
Bnm1 is a pollen specific protein in Brassica napus that is expressed specifically in the bicellular and tricellular stages of pollen development [23]. In the present study we have attempted to analyze the potential of Bnm1 to be an allergen using different computational tools and immune-informatics databases. To our knowledge, this is the first immune-informatics approach to examine the potency of Bnm1 from oilseed rape as an allergen.

Materials and Methods

Protein sequence retrieval
The Bnm1 (pollen specific protein) sequence (P93760) of Brassica napus was retrieved from the UniProt database (http://www.uniprot.org/). This linear amino acid sequence was the basis for the different computational predictions.

Prediction of physico-chemical properties
Different physico-chemical properties of Bnm1 protein were predicted from its linear amino acid sequence. The ProtParam tool was employed to predict various physical and chemical properties of Bnm1, including the molecular weight, theoretical pI, atomic composition, amino acid composition, instability index, extinction coefficient, grand average of hydropathicity (GRAVY), estimated half-life, and aliphatic index [24].
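Two of the sequence-derived quantities listed above can be illustrated with a minimal Python sketch. It is not the ProtParam implementation. It assumes the Kyte-Doolittle hydropathy scale for GRAVY and the commonly quoted aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)), and it counts Asp/Glu as negatively charged and Arg/Lys as positively charged residues. The input is only the 36-residue peptide quoted later in this paper (residues 4 to 39), not the full Bnm1 sequence, so the numbers it produces do not correspond to Table 1.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values, the scale assumed here for GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition_percent(seq):
    """Mole percent of each amino acid present in the sequence."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    comp = composition_percent(seq)
    return (comp.get("A", 0.0)
            + 2.9 * comp.get("V", 0.0)
            + 3.9 * (comp.get("I", 0.0) + comp.get("L", 0.0)))


def charged_residue_counts(seq):
    """Return (negative, positive) counts: Asp+Glu and Arg+Lys."""
    negative = sum(seq.count(aa) for aa in "DE")
    positive = sum(seq.count(aa) for aa in "RK")
    return negative, positive


# Illustrative input: a peptide quoted in this paper, NOT the whole protein.
fragment = "FSVLSTFAAAAITLQLLLVPASASPHMKYIDAICDR"

if __name__ == "__main__":
    print("GRAVY:", round(gravy(fragment), 3))
    print("Aliphatic index:", round(aliphatic_index(fragment), 2))
    print("(negative, positive):", charged_residue_counts(fragment))
```

Running the same functions on the complete P93760 sequence would be the natural way to compare against the ProtParam output, although small differences in rounding are possible.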

Potential antigenic sites prediction
Antigenicity of the protein was predicted by identifying its hydrophobic and hydrophilic regions and thereby determining which regions are exposed on the outer surface and can react with B cells. To predict the potential antigenic sites, two prediction methods, Kolaskar-Tongaonkar antigenicity and Parker's hydrophilicity, were applied, and antigenic propensity and hydrophilicity respectively were determined from the plots generated [25,26].
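Both methods are propensity-scale approaches: each residue receives a score from a fixed table, and the scores are averaged over a sliding window to produce the plotted profile. The Python sketch below shows that general mechanism only. The window size of 7 is an assumption, and because the Kolaskar-Tongaonkar and Parker tables are not reproduced in this paper, the negated Kyte-Doolittle hydropathy scale is swapped in as a stand-in hydrophilicity scale. The input is the peptide quoted in the Results for residues 55 to 75, so positions are relative to that peptide.

```python
# Kyte-Doolittle hydropathy; its negation is used below as a STAND-IN
# hydrophilicity scale (this is not the Parker or Kolaskar-Tongaonkar table).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
STAND_IN_HYDROPHILICITY = {aa: -value for aa, value in HYDROPATHY.items()}


def sliding_window_profile(seq, scale, window=7):
    """Return (center_position, mean_score) pairs; positions are 1-based."""
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        score = sum(scale[aa] for aa in segment) / window
        profile.append((start + half + 1, score))
    return profile


def positions_above_mean(profile):
    """Window centers whose score exceeds the average score of the profile."""
    mean_score = sum(score for _, score in profile) / len(profile)
    return [pos for pos, score in profile if score > mean_score]


# Illustrative input: a peptide quoted in this paper, not the full protein.
peptide = "PTAAPIGLNPLAEVMALTIAH"
profile = sliding_window_profile(peptide, STAND_IN_HYDROPHILICITY)

if __name__ == "__main__":
    for position, score in profile:
        print(position, round(score, 3))
    print("Above average:", positions_above_mean(profile))
```

Replacing the scale dictionary with the published Parker or Kolaskar-Tongaonkar values would be the first step toward reproducing the corresponding plot.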

Potential B cell epitope prediction
Not all regions exposed on the outer surface are potent sites for reaction with B cells, which is why B cell epitopes need to be predicted separately. The machine learning tool BepiPred, hosted at the IEDB analysis tools, was used to predict B cell epitopes [27]. This tool uses a combined algorithm comprising both a hidden Markov model and antigenic propensity, and it was therefore used to cross-check the results predicted by the Parker's hydrophilicity and Kolaskar-Tongaonkar antigenicity methods [25,26].
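The cross-check between methods amounts to asking how much of each predicted epitope falls inside regions predicted by another method. A minimal sketch of that comparison follows. The interval lists are hypothetical placeholders chosen for illustration and are not the predictions reported in this study, and the regions are assumed not to overlap one another.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive, 1-based intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def supported_fraction(epitope, regions):
    """Fraction of an epitope covered by a list of non-overlapping regions."""
    length = epitope[1] - epitope[0] + 1
    covered = sum(overlap(epitope, region) for region in regions)
    return covered / length


# HYPOTHETICAL intervals (start, end), for illustration only.
hydrophilic_regions = [(10, 20), (40, 55)]
predicted_epitopes = [(12, 18), (50, 60), (70, 75)]

fractions = {ep: supported_fraction(ep, hydrophilic_regions)
             for ep in predicted_epitopes}

if __name__ == "__main__":
    for epitope, fraction in fractions.items():
        print(epitope, round(fraction, 2))
```

A fraction of 1.0 means the epitope lies entirely within regions supported by the second method, and 0.0 means there is no agreement.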

Prediction of 3D structure of Bnm1 and mapping of B cell epitopes on the structure
For 3-D structure prediction we followed template based modeling of the Bnm1 protein structure using the RaptorX server (http://raptorx.uchicago.edu/), which makes several alignments of the target protein sequence with different protein templates having sparse protein sequence to predict the ab initio model of the protein, which is essential for a better structure prediction [28]. Based on the alignment score, the top ten ranked alignments were predicted by a probabilistic-consistency algorithm and a novel nonlinear scoring function. We chose the Bnm1 structure built from the alignment that showed the maximum alignment score between the Bnm1 protein and its template (RCSB PDB ID 1x8z, chain A). Energy minimization of the structure was performed with the Swiss PDB viewer tool [29]. To validate the structure, a Ramachandran plot was generated using the ProCheck program [30], which measures the stereo-chemical properties of the protein structure.
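A Ramachandran plot places each residue according to its two backbone dihedral angles, phi (defined by atoms C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below shows how one such dihedral angle can be computed from four atomic coordinates with the standard atan2 formulation. It illustrates the geometry only and is not the ProCheck code. The coordinates are made-up points chosen to give easily checked angles, not atoms from the Bnm1 model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle (degrees, range -180 to 180) defined by four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane through p1, p2, p3
    b2_length = math.sqrt(_dot(b2, b2))
    y = b2_length * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates for illustration (not taken from the Bnm1 model).
cis = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
quarter_turn = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))

if __name__ == "__main__":
    print(cis, trans, quarter_turn)
```

Applying the function to the backbone atoms of every residue in a model gives the (phi, psi) pairs that a validation program assigns to favored, allowed or disallowed regions.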

Potential T cell epitope prediction
To check whether the Bnm1 protein can elicit a T cell response, we predicted potential T cell epitopes using the "Peptide binding to MHC class II molecules" program under the "MHCII binding prediction" tool in the IEDB analysis resource. For our prediction we followed the NetMHCIIpan prediction method, choosing the seven abundant HLA class II alleles DRB1*0101, DRB1*0301, DRB1*0401, DRB1*0701, DRB1*1101, DRB1*1301 and DRB1*1501 from the selection panel [31,32]. From the predicted T cell epitopes, only those with an IC50 score less than 25 were selected as potential T cell epitope candidates, since epitopes with an IC50 value < 25 show higher binding affinity for MHC alleles [33].
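The selection step described above can be sketched in Python as a simple filter over prediction records. The peptide length of 15 used to enumerate overlapping candidates is an assumption, and the (allele, peptide, IC50) records are invented placeholders that only show the data flow. They are not output from NetMHCIIpan and not results of this study.

```python
from collections import defaultdict

IC50_CUTOFF = 25.0  # selection threshold stated in the text


def overlapping_peptides(seq, length=15):
    """All overlapping peptides of the given length with 1-based start positions."""
    return [(i + 1, seq[i:i + length]) for i in range(len(seq) - length + 1)]


def select_strong_binders(predictions, cutoff=IC50_CUTOFF):
    """Group peptides with IC50 below the cutoff by allele."""
    selected = defaultdict(list)
    for allele, peptide, ic50 in predictions:
        if ic50 < cutoff:
            selected[allele].append((peptide, ic50))
    return dict(selected)


# Peptide quoted in this paper (residues 4 to 39), used only to cut 15-mers.
fragment = "FSVLSTFAAAAITLQLLLVPASASPHMKYIDAICDR"
candidates = overlapping_peptides(fragment)

# INVENTED IC50 values, for illustration only.
example_predictions = [
    ("DRB1*0101", candidates[0][1], 12.4),
    ("DRB1*0101", candidates[1][1], 19.8),
    ("DRB1*0301", candidates[0][1], 310.0),
    ("DRB1*0701", candidates[5][1], 24.9),
    ("DRB1*1501", candidates[7][1], 25.0),
]

strong = select_strong_binders(example_predictions)

if __name__ == "__main__":
    for allele, hits in strong.items():
        print(allele, hits)
```

Because the comparison is strict, a peptide with an IC50 of exactly 25 is excluded, matching the "less than 25" criterion.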

Physico-chemical properties of Bnm1 resemble those of conventional allergens
The physico-chemical properties of an intact protein can sometimes predict its allergenic potential [34]. Bnm1 consists of 182 amino acids, and its molecular weight was predicted to be approximately 20 kD (Table 1). The amino acid distribution of the Bnm1 protein (Figure 1) shows that no tryptophan residue is present and that six of the 20 amino acids, namely Ala, Asp, Leu, Ser, Thr and Val, make up almost 60% of the total composition. All of these abundant residues have isoelectric points in the acidic range (from 2.77 for Asp to 6.0 for Ala), which suggests that the theoretical pI of the protein, a composite of the contributions of all its residues, is also acidic. The ProtParam prediction shows that the protein is acidic (pI 5.27) and carries a net negative charge, as the total number of negatively charged residues exceeds the total number of positively charged residues (Table 1); hence it may preferentially be processed by dendritic cells [34]. Based on the half-lives associated with N-terminal amino acid residues, the estimated half-life of Bnm1 [35,36] was as high as 30 hours. The predicted instability index indicates that the protein is slightly unstable. The grand average of hydropathicity predicted from the linear sequence of Bnm1 was negative, and hence most of its amino acid residues are likely to be present on the surface of the folded protein.

Figure 1 Amino acid composition of Bnm1
Among all the amino acids present in Bnm1, alanine (Ala) accounts for ~16.5% of the total composition and thus outnumbers every other amino acid. Cysteine (Cys) has the lowest percentage in the Bnm1 protein. Note that there is no tryptophan residue in the composition.

Potential antigenic sites are on the surface of the Bnm1 protein
To elicit an allergenic response the whole protein does not need to be antigenic; rather, certain antigenic determinants, called epitopes, account for the immunogenic reactions. It is hypothesized that an antigenic epitope must reside in the outer layer or hydrophilic region of the protein so that it can interact with other molecules. The hydrophobic or hydrophilic profile derived from the linear amino acid sequence can therefore sometimes predict the potential of a protein as an allergen [37]. For prediction of Bnm1 allergenicity, the Kolaskar and Tongaonkar prediction method was employed, which is based on the physico-chemical properties of amino acids in proteins and their abundances in experimentally known epitopes [25]. In the Kolaskar scale the X axis represents the amino acid residues whereas the Y axis represents the antigenic propensity of the protein. The average antigenic propensity of the Bnm1 protein is 1.037, so all residues having a value greater than 1.037 are potential antigenic determinants (Figure 2). Nine peptides were found to satisfy the threshold value (1.00) set prior to analysis; they are thus potential antigenic sites of the protein and have the potential to evoke a B cell response. The details of the individual peptides are summarized in Table 2. The peptide region from amino acid residue 4 to 39, with the sequence "FSVLSTFAAAAITLQLLLVPASASPHMKYIDAICDR", and the region from residue 55 to 75, with the sequence "PTAAPIGLNPLAEVMALTIAH", are predicted to have the highest antigenic propensity scores, and together they comprise about 38% of all the predicted allergenic sites in the Bnm1 protein. Hydrophilic regions of the protein are likely to be exposed on the outer surface and are most likely to evoke a B cell response. To predict the hydrophilic regions of Bnm1, we applied the Parker hydrophilicity prediction method [26]; the predicted hydrophilicity plot shows the hydrophilic regions of the Bnm1 protein shaded in yellow (Figure 3).
Single-scale amino acid propensity profiles cannot always predict B cell epitope locations reliably, as even the best such methods perform only marginally better than random when assessed by ROC (receiver operating characteristic) analysis [38]. We therefore also applied a machine learning method, BepiPred, to predict the antigenic sites, an approach that can increase prediction success and is reviewed elsewhere [39,40].

Potential B cell epitopes overlap the antigenic sites of Bnm1
Exposure of a peptide on the outer surface of the protein does not by itself mean that it will react with B cells.
To predict the B cell epitopes we employed the BepiPred tool, an antibody epitope prediction program that combines a hidden Markov model with an antigenic propensity scale method, making the prediction more reliable [27]. BepiPred predicted seven potential B cell epitopes in the Bnm1 protein sequence, highlighted as yellow regions (Figure 4), and the maximum score predicted is 2.034. The predicted epitopes fit exactly within the hydrophilic regions predicted from the Parker hydrophilicity plot and are thus likely to be exposed on the surface of the protein. The details of the individual B cell epitopes are summarized in Table 3.

Table 3 Predicted B cell epitope sequences and their position along with their length.
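Turning a per-residue score track into discrete epitopes means collecting consecutive residues whose score exceeds the threshold (0.350 for the BepiPred plot in Figure 4). The sketch below illustrates that step. The score list is a made-up example rather than BepiPred output for Bnm1, and the optional minimum length parameter is a hypothetical convenience that is not described in this paper.

```python
def segments_above_threshold(scores, threshold, min_length=1):
    """Return inclusive, 1-based (start, end) runs with score > threshold."""
    segments = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position          # a new run begins here
        elif start is not None:
            if position - start >= min_length:
                segments.append((start, position - 1))
            start = None                  # the run has ended
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        segments.append((start, len(scores)))
    return segments


# MADE-UP per-residue scores, for illustration only.
example_scores = [0.10, 0.40, 0.50, 0.20, 0.36, 0.90, 1.20]
epitope_segments = segments_above_threshold(example_scores, threshold=0.350)

if __name__ == "__main__":
    print(epitope_segments)
```

Each returned pair can then be compared with the hydrophilic regions of the Parker plot or highlighted on the modeled structure.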

Mapping of the B cell epitopes in the modeled structure confirms their presence on the surface of Bnm1
The predicted 3-D structure of Bnm1 (Figure 5) was visualized using the PyMOL molecular visualization system [41]. The Ramachandran plot generated to validate the predicted structure (Figure 5) shows the distribution of amino acid residues across the different regions of the plot, and the details of this distribution are summarized in Table 4. Once the structure was found to be reliable, it was used as the template for mapping the predicted B cell epitopes of the Bnm1 protein (Figure 6). The yellow balls (Figure 6) represent the predicted B cell epitopes, and all of these yellow regions lie on the surface of the protein. Thus, what we observed from the primary structure of Bnm1 is consistent with its 3D structure.

Table 4 Distribution of different amino acid residues in Ramachandran plot
Region                        Percentage of distribution
Most favored regions          92.9
Additional allowed regions    7.1
Generously allowed regions    0.0
Disallowed regions            0.0

Predicted potential T cell epitopes of Bnm1 show multiple interactions with DRB1*0101
We also examined whether Bnm1 can induce a T cell response, since some allergens induce both humoral and cell mediated immune responses that drive immune inflammation [42][43][44]. MHC class II epitope prediction was performed for the highly abundant selected alleles e.

Conclusion
To design an effective immune therapy against the allergic response to oilseed rape pollen, it is essential to know the array of allergens in that pollen. Our computational approaches strongly suggest that Bnm1, the pollen specific protein, is likely to be an allergen. However, in vitro analyses are warranted to validate the allergenicity of Bnm1.

Figure 2 Kolaskar and Tongaonkar antigenicity plot. The peptide sequences that satisfy the set threshold value (antigenic propensity threshold 1.00) are predicted to be potential antigenic sites against which an antibody response can be elicited.

Figure 3 Parker hydrophilicity plot. In the Parker hydrophilicity prediction scale the X axis represents the sequence position whereas the Y axis represents the hydrophilicity score. The average hydrophilicity score is 1.887, and thus the peptide regions having a hydrophilicity score above the average are hydrophilic and are likely to be present on the surface of the Bnm1 protein. The nine potential hydrophilic regions predicted here are highlighted in yellow, whereas the green shaded regions did not reach the minimum hydrophilicity score and are thus likely to be present within the core of the Bnm1 protein.

Figure 4 Potential B cell epitopes predicted by the BepiPred tool. The amino acid sequences having a score above the threshold (0.350) are predicted to be potential B cell epitopes and are highlighted in yellow. The maximum score predicted here is 2.034.

Figure 5 3D structure of the Bnm1 protein and its validation using a Ramachandran plot. (A) Cartoon representation of the predicted structure of the Bnm1 protein. This image was produced using the PyMOL molecular visualization system. (B) Distribution of the amino acid residues in the Ramachandran plot. All the amino acid residues lie in the allowed regions of the plot, making the prediction reliable.

Figure 6 Mapping of B cell epitopes on the 3D structure of Bnm1. All the B cell epitopes predicted by the BepiPred tool are mapped on the surface of the Bnm1 protein structure. The B cell epitopes are highlighted in yellow and the rest of the Bnm1 protein is highlighted in pink.

Table 1 Details of the physico-chemical properties of Bnm1 protein from oilseed rape