NTxPred2: A large language model for predicting neurotoxic peptides and neurotoxins
Authors/Creators
- 1. Indraprastha Institute of Information Technology, New Delhi, India
Contributors
Researcher (3):
Supervisor:
- 1. Indraprastha Institute of Information Technology, New Delhi, India.
- 2. Indraprastha Institute of Information Technology, New Delhi, India
Description
NTxPred2 is a computational tool for predicting the neurotoxic activity of peptides and proteins. It helps researchers in therapeutic peptide and protein development by quantifying and classifying sequences that target the central nervous system. The method uses large language model (LLM) embeddings as features and provides Prediction, Protein Scanning, and Design modules.
🔗 Web server: http://webs.iiitd.edu.in/raghava/ntxpred2
📖 Reference: Rathore, A.S., Jain, S., Choudhury, S., & Raghava, G.P.S. (2025). A large language model for predicting neurotoxic peptides and neurotoxins. Protein Science, 34(8), e70200. https://doi.org/10.1002/pro.70200
🖼️ Workflow
https://webs.iiitd.edu.in/raghava/ntxpred2/download/NTXPred_flowchart.png
⚙️ Installation
🔹 PIP Installation
pip install ntxpred2
To check available options:
ntxpred2 -h
🔹 Standalone Installation
NTxPred2 is written in Python 3 and requires the following dependencies.
Required Python version: 3.10.7
Install core libraries:
pip install scikit-learn==1.5.2
pip install pandas==1.5.3
pip install numpy==1.25.2
pip install torch==2.1.0
pip install transformers==4.34.0
pip install joblib==1.4.2
pip install onnxruntime==1.15.1
pip install biopython==1.81
pip install tqdm==4.64.1
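The pinned versions above can be checked against the active environment before running the tool. The following is a minimal sketch and not part of NTxPred2: `PINNED` copies the versions listed above, while `find_mismatches` and its `get_version` parameter are hypothetical names introduced here for illustration.

```python
from importlib import metadata

# Versions copied from the pip commands listed above.
PINNED = {
    "scikit-learn": "1.5.2",
    "pandas": "1.5.3",
    "numpy": "1.25.2",
    "torch": "2.1.0",
    "transformers": "4.34.0",
    "joblib": "1.4.2",
    "onnxruntime": "1.15.1",
    "biopython": "1.81",
    "tqdm": "4.64.1",
}


def find_mismatches(pinned=PINNED, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every package
    whose installed version differs from the pinned one."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in find_mismatches().items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

An empty result means every listed package is installed at the pinned version. Whether other versions also work is not stated in this README.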
🔹 Conda Installation (using environment.yml)
conda env create -f environment.yml
conda activate NTxPred2
⚠️ Important Note
Due to the large size of the model file, the model directory has been compressed and uploaded.
- Download the zip file from the Download Page
- Extract the file before using the code or model
🔬 Classification
NTxPred2 classifies peptides and proteins as neurotoxic or non‑neurotoxic based on their primary sequence.
| Model | Description |
|---|---|
| ESM2‑t30 (Peptide Model) | For sequences 7–50 amino acids |
| ET (Protein Model) | For sequences ≥ 51 amino acids |
| ET (Combined Model) | For sequences of mixed length |
| Default Model | ESM2‑t30 (Peptide Model) – best performance and efficiency |
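The length ranges in the table can be turned into a small helper that suggests a value for the `-m` option. This is a sketch under stated assumptions: `choose_model` is a hypothetical helper, not a function of NTxPred2, and rejecting sequences shorter than 7 residues is an assumption based only on the ranges listed above.

```python
def choose_model(sequences):
    """Suggest a value for -m (1, 2, or 3) from the sequence lengths.

    1 = ESM2-t30 peptide model (7-50 residues)
    2 = ET protein model (51 residues or more)
    3 = ET combined model (mixed lengths)
    """
    lengths = [len(seq) for seq in sequences]
    if not lengths:
        raise ValueError("no sequences given")
    if min(lengths) < 7:
        # Assumption: lengths below 7 fall outside the documented ranges.
        raise ValueError("sequences shorter than 7 residues are not covered")
    if max(lengths) <= 50:
        return 1
    if min(lengths) >= 51:
        return 2
    return 3


if __name__ == "__main__":
    print(choose_model(["ACDEFGHIKLMN"]))            # peptides only
    print(choose_model(["A" * 120]))                 # proteins only
    print(choose_model(["ACDEFGHIKLMN", "A" * 120])) # mixed lengths
```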
🚀 Usage
Minimum Usage
ntxpred2.py -i example.fasta
Full Usage
ntxpred2.py [-h] [-i INPUT] [-o OUTPUT] [-t THRESHOLD] [-j {1,2,3,4}] [-m {1,2,3}] [-d {1,2}] [-wd WORKING_DIRECTORY]
Required Arguments
| Argument | Description |
|---|---|
| -i INPUT | Input file (FASTA or simple format, each sequence on a new line) |
| -o OUTPUT | Output file name (default: outfile.csv) |
| -t THRESHOLD | Classification threshold between 0 and 1 (default: 0.5) |
| -j {1,2,3,4} | Job type (see table below) |
| -m {1,2,3} | Model: 1 = ESM2‑t30 (peptides), 2 = ET (proteins), 3 = ET (combined) |
| -wd WORKING_DIR | Working directory for saving results |
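When the tool is called from a script, the argument list can be assembled and validated in Python first. The sketch below only builds the command; `build_command` and its parameter names are hypothetical, the flags and defaults are the ones documented above, and the default job and model values of 1 are an assumption about what the tool uses when they are omitted.

```python
def build_command(input_file, output="outfile.csv", threshold=0.5,
                  job=1, model=1, working_dir=None, program="ntxpred2.py"):
    """Return the argument list for an NTxPred2 run as a list of strings."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must lie between 0 and 1")
    if job not in (1, 2, 3, 4):
        raise ValueError("job must be 1, 2, 3, or 4")
    if model not in (1, 2, 3):
        raise ValueError("model must be 1, 2, or 3")
    cmd = [program, "-i", input_file, "-o", output,
           "-t", str(threshold), "-j", str(job), "-m", str(model)]
    if working_dir is not None:
        cmd += ["-wd", working_dir]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where NTxPred2 is not installed.
    print(" ".join(build_command("example.fasta", threshold=0.6)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. For a pip installation, `program="ntxpred2"` matches the command shown in the installation section.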
Job Types (-j)
| Job | Description |
|---|---|
| 1 | Prediction – Classify input peptide/protein as neurotoxic or non‑neurotoxic |
| 2 | Protein Scanning – Identify neurotoxic regions within a protein sequence |
| 3 | Design – Generate mutants with a single amino acid/dipeptide at a specified position |
| 4 | Design All Possible Mutants – Generate and predict all possible single‑residue mutants |
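To make the two Design jobs concrete, the sketch below generates the mutant sequences they describe, without any prediction step. It is an illustration and not the tool's implementation: `substitute` and `all_single_mutants` are hypothetical names, positions are treated as 1-indexed, a dipeptide is assumed to overwrite two consecutive residues, and mutants identical to the original sequence are skipped.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def substitute(sequence, position, residues):
    """Overwrite one or two residues starting at a 1-indexed position."""
    if len(residues) not in (1, 2):
        raise ValueError("residues must be one or two letters")
    start = position - 1
    if start < 0 or start + len(residues) > len(sequence):
        raise ValueError("position is outside the sequence")
    return sequence[:start] + residues + sequence[start + len(residues):]


def all_single_mutants(sequence):
    """Return (position, new_residue, mutant) for every single substitution."""
    mutants = []
    for index, original in enumerate(sequence):
        for residue in AMINO_ACIDS:
            if residue != original:
                mutant = sequence[:index] + residue + sequence[index + 1:]
                mutants.append((index + 1, residue, mutant))
    return mutants


if __name__ == "__main__":
    print(substitute("ACDEFGHIK", 3, "W"))
    print(len(all_single_mutants("ACDEFGHIK")))
```

A sequence of length n made of standard residues yields 19 × n single-residue mutants, so the number of predictions for job 4 grows linearly with sequence length.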
Additional Options (for Design & Scan jobs)
| Option | Description |
|---|---|
| -p POSITION | Position to mutate (1‑indexed) |
| -r RESIDUES | Mutated residues (single or double‑letter amino acid codes) |
| -w {8-20} | Window length for Protein Scan (default: 12) |
| -d {1,2} | Display: 1 = neurotoxic only, 2 = all peptides (default) |
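The window length option suggests that Protein Scanning scores fixed-length fragments of the input protein. A minimal sketch of that fragmenting step follows. It assumes overlapping windows that advance one residue at a time, which this README does not state, and `scan_windows` is a hypothetical helper name.

```python
def scan_windows(sequence, window=12):
    """Return (start_position, fragment) pairs, with 1-indexed starts."""
    if not 8 <= window <= 20:
        raise ValueError("window length must be between 8 and 20")
    # Assumption: windows overlap and advance by one residue.
    return [(start + 1, sequence[start:start + window])
            for start in range(len(sequence) - window + 1)]


if __name__ == "__main__":
    for start, fragment in scan_windows("ACDEFGHIKLMNPQR"):
        print(start, fragment)
```

A sequence shorter than the window yields no fragments under this scheme.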
📂 Input & Output
Input Formats
| Format | Example | Description |
|---|---|---|
| FASTA | example.fasta | Standard FASTA format with >ID headers |
| Simple | example.seq | One sequence per line, no headers |
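The difference between the two input formats can be illustrated with a small parser that accepts either. This is a sketch only and does not reproduce how NTxPred2 reads its input: `parse_sequences` is a hypothetical function, and the generated identifiers (`seq_1`, `seq_2`, ...) for the simple format are an assumption.

```python
def parse_sequences(text):
    """Parse FASTA or one-sequence-per-line text into (id, sequence) pairs."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    if not lines[0].startswith(">"):
        # Simple format: every non-empty line is one sequence.
        return [(f"seq_{number}", line.upper())
                for number, line in enumerate(lines, start=1)]
    records = []
    seq_id, chunks = None, []
    for line in lines:
        if line.startswith(">"):
            if seq_id is not None:
                records.append((seq_id, "".join(chunks)))
            seq_id, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())  # sequences may span several lines
    records.append((seq_id, "".join(chunks)))
    return records


if __name__ == "__main__":
    print(parse_sequences(">pep1\nACDEFGH\nIKLMN\n>pep2\nPQRSTVWY\n"))
    print(parse_sequences("ACDEFGHIK\nLMNPQRSTV\n"))
```

To parse a file, pass its contents, for example `parse_sequences(open("example.fasta").read())`.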
Output
- Results are saved in CSV format
- Default output file: outfile.csv
- Columns include sequence ID, prediction label, and probability score
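Because the exact column names are not listed here, a cautious first step is to load the CSV and inspect its header before relying on any particular name. The sketch below does that with the standard library; `read_results` is a hypothetical helper and makes no assumption about what the columns are called.

```python
import csv
import io


def read_results(text):
    """Parse CSV text and return (header, rows), with rows as dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows


if __name__ == "__main__":
    # Assumes a finished run has written outfile.csv to the current directory.
    with open("outfile.csv", newline="") as handle:
        header, rows = read_results(handle.read())
    print("columns:", header)
    print("number of sequences:", len(rows))
```

Once the header is known, rows can be filtered on the probability column, keeping in mind that `csv` returns every value as a string.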
📦 Package Contents
| File | Description |
|---|---|
| INSTALLATION | Installation instructions |
| LICENSE | License information |
| README.md | This file |
| ntxpred2.py | Main Python program |
| example.fasta | Example input file (FASTA format) |
📬 Contact
For questions or suggestions, please contact:
Prof. Gajendra P. S. Raghava – raghava@iiitd.ac.in
Additional details
Related works
- Is referenced by
- Software documentation: 10.1101/2025.03.01.640936 (DOI)
Dates
- Available
- 2025-03-10
Software
- Repository URL
- https://github.com/raghavagps/ntxpred2
- Programming language
- Python
- Development Status
- Active