Published April 30, 2026 | Version v1.0
Software Open

Pfeature – A comprehensive tool for computing protein/peptide features and building prediction models

  • 1. Indraprastha Institute of Information Technology Delhi
  • 2. National Cancer Institute

Description

Project: Pfeature – A tool for computing a wide range of protein features and building prediction models

Publication: Pande, A., Patiyal, S., Lathwal, A., Arora, C., Kaur, D., Dhall, A., Mishra, G., Kaur, H., Sharma, N., Jain, S., Usmani, S.S., Agrawal, P., Kumar, R., Kumar, V., & Raghava, G.P.S. (2023). Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. Journal of Computational Biology, 30(2), 204–222. https://doi.org/10.1089/cmb.2022.0241

Overview: Pfeature is a comprehensive software platform for computing a wide range of protein/peptide features (>200,000 descriptors) and building machine learning prediction models. It addresses limitations of existing tools by integrating novel features (Shannon entropy, residue repeats, distance distribution, atom/bond composition) and by supporting chemically modified peptides through structural descriptors. The tool is available as a web server, a Python library, and a standalone package.

Key Modules:

 
 
Module | Description
Composition | AAC, DPC, TPC, atom/bond composition, AAIndex, autocorrelation, entropy, repeats, PSSM‑400
Binary Profiles | Amino acid, dipeptide, property, AAIndex, atom/bond profiles (residue‑level annotation)
Evolutionary Info | PSSM profile generation (raw + 4 normalization methods)
Structural Descriptors | Fingerprints (14,532), SMILES, surface accessibility, secondary structure (for chemically modified peptides)
Patterns | Overlapping windows, terminal regions (N‑term, C‑term, split, SAAP)
Model Building | Feature merging, selection (mRMR, etc.), normalization, classification (RF, ET, XGB, SVC, etc.), 5‑fold CV
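
To make the Composition module concrete, the following is a minimal sketch of amino acid composition (AAC) and dipeptide composition (DPC) as they are commonly defined: the percentage of each residue, and of each adjacent residue pair, in a sequence. It uses only the Python standard library and does not call the Pfeature package; the function names `aac` and `dpc`, the example peptide, and the handling of non-standard residues are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq: str) -> dict[str, float]:
    """Amino acid composition: percentage of each of the 20 standard residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    # Non-standard residues count toward the length but get no column (assumption).
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


def dpc(seq: str) -> dict[str, float]:
    """Dipeptide composition: percentage of each of the 400 adjacent residue pairs."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1  # number of overlapping dipeptides
    return {a + b: 100.0 * pairs[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


if __name__ == "__main__":
    peptide = "ACDAAC"  # hypothetical example peptide
    print(aac(peptide)["A"])   # share of alanine, in percent
    print(dpc(peptide)["AC"])  # share of the dipeptide AC, in percent
```

AAC yields a fixed vector of 20 values and DPC a vector of 400, regardless of sequence length, which is what makes composition features convenient inputs for the classifiers listed under Model Building.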

Feature Comparison – Unique to Pfeature:

  • Shannon entropy (protein + residue level)

  • Distance distribution of residues

  • Residue repeats (homo‑ and hetero‑repeats)

  • Physicochemical property repeats

  • Atom and bond composition

  • Dipeptide binary profiles

  • AAIndex binary profiles

  • Structural descriptors for chemically modified peptides
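
Two of the features listed above can be illustrated with a short sketch: protein-level Shannon entropy, computed here with the standard formula (the sum of p · log2(1/p) over observed residues), and homo-repeats, taken here to be maximal runs of one identical residue. These definitions and the names `shannon_entropy`, `homo_repeats`, and `min_len` are assumptions for illustration; Pfeature's own formulas and output format may differ.

```python
import math
from collections import Counter
from itertools import groupby


def shannon_entropy(seq: str) -> float:
    """Protein-level Shannon entropy in bits over the residues observed in seq."""
    seq = seq.upper()
    n = len(seq)
    counts = Counter(seq)
    # p * log2(1/p) with p = c/n; an empty sequence yields 0.0.
    return sum(((c / n) * math.log2(n / c) for c in counts.values()), 0.0)


def homo_repeats(seq: str, min_len: int = 2) -> list[tuple[str, int]]:
    """Maximal runs of a single residue with length >= min_len, in sequence order."""
    runs = []
    for residue, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((residue, length))
    return runs


if __name__ == "__main__":
    print(shannon_entropy("AACC"))   # two equally frequent residues
    print(homo_repeats("AAGGGT"))    # runs of A and G
```

A sequence made of a single residue type has entropy 0, while a sequence using all 20 residues equally often reaches the maximum of log2(20) bits, so the value summarizes how diverse the composition is.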

Total descriptors (whole protein, λ=5): 11,879 (protein level) + terminal/split regions → 95,137 total

 

Usage: Computing protein/peptide features for classification and regression; residue‑level annotation (secondary structure, binding sites); analysis of chemically modified peptides (FDA‑approved therapeutics); and model building without programming expertise.
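
For the residue-level annotation use case, a typical approach is to cut the sequence into overlapping windows centred on each residue and encode each window as a binary (one-hot) profile. The sketch below shows that idea with the standard library only. The padding character `X`, the window size, and the function names are assumptions for illustration and are not taken from the Pfeature API.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def overlapping_windows(seq: str, size: int, pad: str = "X") -> list[str]:
    """One window per residue, centred on it; sequence ends are padded with `pad`."""
    if size < 1 or size % 2 == 0:
        raise ValueError("window size must be a positive odd number")
    half = size // 2
    padded = pad * half + seq.upper() + pad * half
    return [padded[i:i + size] for i in range(len(seq))]


def binary_profile(seq: str) -> list[list[int]]:
    """One 20-element 0/1 row per residue; non-standard residues give all zeros."""
    return [[1 if aa == res else 0 for aa in AMINO_ACIDS] for res in seq.upper()]


def encode_window(window: str) -> list[int]:
    """Flatten the binary profile of a window into a single feature vector."""
    return [bit for row in binary_profile(window) for bit in row]


if __name__ == "__main__":
    windows = overlapping_windows("ACDE", size=3)  # hypothetical example sequence
    print(windows)
    print(len(encode_window(windows[0])))  # 3 positions x 20 residues
```

Each residue then has its own fixed-length vector (window size × 20), which can be paired with a per-residue label such as a secondary structure state or a binding-site flag.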

Case Studies Citing Pfeature:

  • IL6pred, IL13pred, AlgPred2.0, HLAncPred, ABCRpred

  • SARS‑CoV‑2 ACE2 receptor analysis (Hassan et al., 2020)

  • Amyloid protein prediction (Sofi & ArifWani, 2021)

  • Metagenomic biocatalyst discovery (Shahraki et al., 2022)

Related Resources: Web server: https://webs.iiitd.edu.in/raghava/pfeature/ | GitHub: https://github.com/raghavagps/Pfeature

Contact: raghava@iiitd.ac.in (Gajendra P. S. Raghava)

 

Files

raghavagps/Pfeature-v1.0.zip

Files (3.3 MB)

md5:80022db650ef1c57c5fa368a841520a5 (3.3 MB)

Additional details

Related works