AlgPred 2.0: An improved method for predicting allergenic proteins and mapping of IgE epitopes
Description
Title: AlgPred 2.0 Dataset – Experimentally validated allergenic and non‑allergenic proteins with IgE epitopes
Description:
Project: AlgPred 2.0 – An improved method for predicting allergenic proteins and mapping of IgE epitopes
Publication: Sharma, N., Patiyal, S., Dhall, A., Pande, A., Arora, C., & Raghava, G.P.S. (2021). AlgPred 2.0: an improved method for predicting allergenic proteins and mapping of IgE epitopes. Briefings in Bioinformatics, 22(5), bbaa294. https://doi.org/10.1093/bib/bbaa294
Overview: This dataset accompanies AlgPred 2.0, an update of AlgPred (2006) for predicting allergenic proteins and mapping IgE epitopes. It is reported as the largest dataset compiled for allergen prediction. Redundancy was reduced stringently, so that no protein in the training set is more than 40% similar to any protein in the validation set. A set of 10,451 IgE epitopes is provided for epitope mapping.
Content:
| Dataset | Positive examples | Negative examples | Source |
|---|---|---|---|
| Main (allergens / non‑allergens) | 10,075 | 10,075 | COMPARE, AllergenOnline, Swiss‑Prot, AllerTOP, AlgPred |
| IgE epitopes | 10,451 | 307,866 | IEDB, AllerBase, IgPred |
Key Findings – Compositional Analysis:
- Allergens enriched in: C, G, R, S
- Non‑allergens enriched in: A, E, K, L, Q
- MEME motifs and MERCI motifs identified exclusively in allergens (provided in supplementary data)
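The letters above are one-letter amino acid codes, and the comparison rests on amino acid composition: the share of each of the 20 standard residues in a sequence. A minimal sketch of that calculation is given here. The function name `amino_acid_composition` and the toy sequence are illustrative assumptions and are not taken from the AlgPred 2.0 code.

```python
from collections import Counter

# The 20 standard amino acids, as one-letter codes.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the percentage of each standard amino acid in `sequence`.

    Non-standard symbols (for example X or gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(r for r in seq if r in STANDARD_AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur.
    return {aa: 100.0 * counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


# Hypothetical toy sequence, chosen only to illustrate the calculation.
composition = amino_acid_composition("CGRSCGRSAA")
```

Averaging such 20-value vectors over each class is one way to compare residue usage between allergens and non-allergens.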
Best Model Performance (Hybrid – RF + BLAST + MERCI, validation set):
| Metric | Value |
|---|---|
| AUC | 0.98 |
| MCC | 0.85 |
| Accuracy | 92.3% |
| Sensitivity | 89.4% |
| Specificity | 95.1% |
Alternative model (RF trained on amino acid composition, AAC, alone): AUC = 0.92, MCC = 0.68
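The hybrid model's accuracy and MCC can be cross-checked against its sensitivity and specificity. The sketch below assumes a balanced validation set (equal numbers of allergens and non-allergens, as in the main dataset); that balance is an assumption, and the function names are illustrative.

```python
import math


def accuracy_from_rates(sensitivity: float, specificity: float,
                        positive_fraction: float = 0.5) -> float:
    """Accuracy implied by sensitivity, specificity and class balance."""
    return (sensitivity * positive_fraction
            + specificity * (1.0 - positive_fraction))


def mcc_from_rates(sensitivity: float, specificity: float,
                   positive_fraction: float = 0.5) -> float:
    """Matthews correlation coefficient from the same three quantities."""
    # Confusion-matrix cells expressed as fractions of the whole set.
    tp = sensitivity * positive_fraction
    fn = (1.0 - sensitivity) * positive_fraction
    tn = specificity * (1.0 - positive_fraction)
    fp = (1.0 - specificity) * (1.0 - positive_fraction)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Reported sensitivity and specificity; balanced classes are assumed.
acc = accuracy_from_rates(0.894, 0.951)
mcc = mcc_from_rates(0.894, 0.951)
```

Under the balanced-set assumption, both derived values agree with the reported accuracy and MCC to within rounding.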
Comparison with existing methods (as reported in literature):
| Method | AUC | MCC | Web server |
|---|---|---|---|
| AlgPred 2.0 | 0.99 | 0.88 | Yes |
| AllerTOPv2 | — | 0.775 | Yes |
| AllerCatPro | — | 0.84 | Yes |
| AllerHunter | 0.928 | 0.738 | No |
| AlgPred (2006) | — | 0.705 | Yes |
Independent dataset validation: 280 of 297 newly added allergens were predicted correctly (accuracy = 94.3%), as were 51 of the 56 allergens in the non‑redundant set (accuracy = 91.1%).
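The two percentages follow directly from the reported counts. Because both sets contain only allergens, accuracy here is the fraction of allergens recovered. A short check, with an illustrative helper name:

```python
def percent_correct(correct: int, total: int) -> float:
    """Percentage of correct predictions, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 1)


full_set = percent_correct(280, 297)      # newly added allergens
non_redundant = percent_correct(51, 56)   # non-redundant set
```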
Usage: Predicting allergenic proteins, mapping IgE epitopes in protein sequences, scanning proteins for allergen‑specific motifs, and running similarity searches against allergen/IgE databases.
Related Resources: Web server: https://webs.iiitd.edu.in/raghava/algpred2/ | GitHub: https://github.com/raghavagps/algpred2
Contact: raghava@iiitd.ac.in (Gajendra P. S. Raghava)
Files
| Name | Size | MD5 |
|---|---|---|
| raghavagps/algpred2-v1.0.zip | 7.0 MB | 4fa4624a12741aa05a549a3c8089a18b |
Additional details
Related works
- Is supplement to: Software, https://github.com/raghavagps/algpred2/tree/v1.0
Software
- Repository URL: https://github.com/raghavagps/algpred2