Pfeature – A comprehensive tool for computing protein/peptide features and building prediction models
Description
Project: Pfeature – A tool for computing wide range of protein features and building prediction models
Publication: Pande, A., Patiyal, S., Lathwal, A., Arora, C., Kaur, D., Dhall, A., Mishra, G., Kaur, H., Sharma, N., Jain, S., Usmani, S.S., Agrawal, P., Kumar, R., Kumar, V., & Raghava, G.P.S. (2023). Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. Journal of Computational Biology, 30(2), 204–222. https://doi.org/10.1089/cmb.2022.0241
Overview: Pfeature is a comprehensive software platform for computing a wide range of protein/peptide features (more than 200,000 descriptors) and for building machine learning prediction models. It addresses limitations of existing tools by integrating novel features (Shannon entropy, residue repeats, distance distribution, and atom/bond composition) and by supporting chemically modified peptides through structural descriptors. The tool is available as a web server, a Python library, and a standalone package.
Key Modules:
| Module | Description |
|---|---|
| Composition | AAC, DPC, TPC, atom/bond composition, AAIndex, autocorrelation, entropy, repeats, PSSM‑400 |
| Binary Profiles | Amino acid, dipeptide, property, AAIndex, atom/bond profiles (residue‑level annotation) |
| Evolutionary Info | PSSM profile generation (raw + 4 normalization methods) |
| Structural Descriptors | Fingerprints (14,532), SMILES, surface accessibility, secondary structure (for chemically modified peptides) |
| Patterns | Overlapping windows, terminal regions (N‑term, C‑term, split, SAAP) |
| Model Building | Feature merging, selection (mRMR, etc.), normalization, classification (RF, ET, XGB, SVC, etc.), 5‑fold CV |
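To make the Composition, Binary Profiles, and Patterns rows more concrete, here is a minimal plain-Python sketch of amino acid composition (AAC), dipeptide composition (DPC), an amino acid binary profile, and an N-terminal window. It does not use the Pfeature library: the function names `aac`, `dpc`, `binary_profile`, and `n_terminal` are hypothetical, and the percentage scaling and the handling of non-standard residues are assumptions rather than a description of Pfeature's implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aac(sequence):
    """Percent composition of each standard amino acid (20 values)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def dpc(sequence):
    """Percent composition of each adjacent residue pair (400 values)."""
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    total = len(seq) - 1  # number of overlapping pairs
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: 100.0 * n / total for pair, n in counts.items()}


def binary_profile(sequence):
    """One-hot encode each residue as 20 binary values, concatenated."""
    profile = []
    for residue in sequence.upper():
        profile.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return profile


def n_terminal(sequence, n):
    """Return the first n residues (an N-terminal pattern)."""
    return sequence[:n]
```

For example, `aac(n_terminal(seq, 10))` would give the composition of the first ten residues only, which is the idea behind region-specific descriptors.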
Feature Comparison – Unique to Pfeature:
- Shannon entropy (protein + residue level)
- Distance distribution of residues
- Residue repeats (homo‑ and hetero‑repeats)
- Physicochemical property repeats
- Atom and bond composition
- Dipeptide binary profiles
- AAIndex binary profiles
- Structural descriptors for chemically modified peptides
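Two of the features listed above can be sketched in a few lines of standard-library Python. The block below computes protein-level Shannon entropy with the textbook formula H = -Σ p_i log2(p_i) over residue frequencies and finds runs of identical residues (homo-repeats). The function names and the `min_len` parameter are hypothetical, and Pfeature's exact definitions (for example, how residue-level entropy or hetero-repeats are scored) may differ.

```python
import math
from collections import Counter
from itertools import groupby


def shannon_entropy(sequence):
    """Protein-level Shannon entropy in bits over observed residue frequencies."""
    counts = Counter(sequence.upper())
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def homo_repeats(sequence, min_len=2):
    """Return (residue, start, length) for each run of identical residues.

    min_len is an illustrative threshold, not a Pfeature parameter.
    """
    runs = []
    pos = 0
    for residue, group in groupby(sequence.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((residue, pos, length))
        pos += length
    return runs
```

A sequence made of a single residue type has entropy 0, while a sequence using all 20 residues equally reaches the maximum of log2(20) bits.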
Total descriptors (whole protein, λ=5): 11,879 at the protein level; adding terminal and split regions brings the total to 95,137.
Usage: Computing protein/peptide features for classification/regression, residue‑level annotation (secondary structure, binding sites), chemically modified peptide analysis (FDA‑approved therapeutics), model building without programming expertise.
Case Studies Citing Pfeature:
- IL6pred, IL13pred, AlgPred2.0, HLAncPred, ABCRpred
- SARS‑CoV‑2 ACE2 receptor analysis (Hassan et al., 2020)
- Amyloid protein prediction (Sofi & Arif Wani, 2021)
- Metagenomic biocatalyst discovery (Shahraki et al., 2022)
Related Resources: Web server: https://webs.iiitd.edu.in/raghava/pfeature/ | GitHub: https://github.com/raghavagps/Pfeature
Contact: raghava@iiitd.ac.in (Gajendra P. S. Raghava)
Files
| Name | Size | Checksum |
|---|---|---|
| raghavagps/Pfeature-v1.0.zip | 3.3 MB | md5:80022db650ef1c57c5fa368a841520a5 |
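After downloading the archive, its integrity can be checked against the MD5 checksum listed for the file. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_expected` are hypothetical, and the local file path is whatever name the download was saved under.

```python
import hashlib

# Checksum copied from the file listing of this record.
EXPECTED_MD5 = "80022db650ef1c57c5fa368a841520a5"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("Pfeature-v1.0.zip")` (assuming that is the local file name) returns `True` only if the download is byte-identical to the published archive.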
Additional details
Related works
- Is supplement to: Software, https://github.com/raghavagps/Pfeature/tree/v1.0
Software
- Repository URL: https://github.com/raghavagps/Pfeature