Published March 2025 | Version v2
Software documentation Open

NTxPred2: A large language model for predicting neurotoxic peptides and neurotoxins

  • 1. Indraprastha Institute of Information Technology, New Delhi, India
  • 2. Indraprastha Institute of Information Technology, New Delhi, India

Description

NTxPred2 is a computational tool for predicting the neurotoxic activity of peptides and proteins. It helps researchers in therapeutic peptide and protein development by quantifying and classifying sequences that target the central nervous system. The method uses large language model (LLM) embeddings as features and provides Prediction, Protein Scanning, and Design modules.

🔗 Web server: http://webs.iiitd.edu.in/raghava/ntxpred2

📖 Reference: Rathore, A.S., Jain, S., Choudhury, S., & Raghava, G.P.S. (2025). A large language model for predicting neurotoxic peptides and neurotoxins. Protein Science, 34(8), e70200. https://doi.org/10.1002/pro.70200

🖼️ Workflow

https://webs.iiitd.edu.in/raghava/ntxpred2/download/NTXPred_flowchart.png

⚙️ Installation

🔹 PIP Installation

```bash
pip install ntxpred2
```

To check available options:

```bash
ntxpred2 -h
```

🔹 Standalone Installation

NTxPred2 is written in Python 3 and requires the following dependencies.

Required Python version: 3.10.7

Install core libraries:

```bash
pip install scikit-learn==1.5.2
pip install pandas==1.5.3
pip install numpy==1.25.2
pip install torch==2.1.0
pip install transformers==4.34.0
pip install joblib==1.4.2
pip install onnxruntime==1.15.1
pip install biopython==1.81
pip install tqdm==4.64.1
```
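After installing, it can help to confirm that the environment matches these pins. The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of NTxPred2, and the pinned versions are copied from the commands above.

```python
from importlib import metadata

# Pins copied from the install commands in this README.
PINNED = {
    "scikit-learn": "1.5.2",
    "pandas": "1.5.3",
    "numpy": "1.25.2",
    "torch": "2.1.0",
    "transformers": "4.34.0",
    "joblib": "1.4.2",
    "onnxruntime": "1.15.1",
    "biopython": "1.81",
    "tqdm": "4.64.1",
}


def check_pins(pins=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    Hypothetical helper for illustration; not shipped with NTxPred2.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed}")
```

The comparison is an exact string match, so a build that reports a local version suffix (as some torch wheels may) would be listed as a mismatch even if it is otherwise compatible.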

🔹 Conda Installation (using environment.yml)

```bash
conda env create -f environment.yml
conda activate NTxPred2
```

⚠️ Important Note

Due to the large size of the model file, the model directory has been compressed and uploaded.

  • Download the zip file from the Download Page

  • Extract the file before using the code or model
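The extraction step can also be scripted. This is a sketch under stated assumptions: the helper `extract_model` is hypothetical, and the archive name and target directory depend on what the Download Page provides, so both are passed in by the caller rather than hard-coded.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path, target_dir="."):
    """Extract a downloaded model archive and return the names it contained.

    Hypothetical helper; the real archive name is not stated in this README.
    """
    zip_path = Path(zip_path)
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)  # creates target_dir if it is missing
        return archive.namelist()
```

For example, `extract_model("model.zip")` would unpack a file named `model.zip` (an assumed name) into the current directory.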

🔬 Classification

NTxPred2 classifies peptides and proteins as neurotoxic or non‑neurotoxic based on their primary sequence.

 
 
| Model | Description |
|---|---|
| ESM2‑t30 (Peptide Model) | For sequences 7–50 amino acids |
| ET (Protein Model) | For sequences ≥ 51 amino acids |
| ET (Combined Model) | For sequences of mixed length |
| Default Model | ESM2‑t30 (Peptide Model) – best performance and efficiency |
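The length ranges above suggest a simple rule for picking a model before running the tool. The sketch below is illustrative only: `suggest_model` is a hypothetical helper, the returned numbers follow the `-m` codes listed under Usage, and rejecting sequences shorter than 7 residues is an assumption, since this README does not say how the tool treats them.

```python
def suggest_model(lengths):
    """Map sequence lengths to a model code: 1 peptide, 2 protein, 3 combined.

    Hypothetical helper based on the documented length ranges.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no sequences given")
    if min(lengths) < 7:
        # Assumption: lengths below 7 fall outside every documented range.
        raise ValueError("sequences shorter than 7 residues are not covered")
    if max(lengths) <= 50:
        return 1  # every sequence is a peptide (7-50 residues)
    if min(lengths) >= 51:
        return 2  # every sequence is a protein (51 residues or more)
    return 3  # peptides and proteins mixed in one input
```

For an input holding a 20-residue peptide and a 120-residue protein, `suggest_model([20, 120])` returns 3, the combined model.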

🚀 Usage

Minimum Usage

```bash
ntxpred2.py -i example.fasta
```

Full Usage

```bash
ntxpred2.py [-h] [-i INPUT] [-o OUTPUT] [-t THRESHOLD] [-j {1,2,3,4}]
            [-m {1,2,3}] [-d {1,2}] [-wd WORKING_DIRECTORY]
```

Arguments

 
 
| Argument | Description |
|---|---|
| -i INPUT | Input file (FASTA or simple format, each sequence on a new line) |
| -o OUTPUT | Output file name (default: outfile.csv) |
| -t THRESHOLD | Classification threshold between 0 and 1 (default: 0.5) |
| -j {1,2,3,4} | Job type (see table below) |
| -m {1,2,3} | Model: 1 = ESM2‑t30 (peptides), 2 = ET (proteins), 3 = ET (combined) |
| -wd WORKING_DIRECTORY | Working directory for saving results |
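When the tool is driven from another script, the command line can be assembled programmatically. The sketch below builds an argument list from the flags shown in the usage line; `build_command` and `CHOICES` are hypothetical names, and the `program` value is an assumption because the executable may be `ntxpred2.py` or `ntxpred2` depending on how it was installed. Options left unset are omitted so the tool applies its own defaults.

```python
import shlex

# Allowed values taken from the usage line in this README.
CHOICES = {"job": (1, 2, 3, 4), "model": (1, 2, 3), "display": (1, 2)}


def build_command(input_file, output_file=None, threshold=None, job=None,
                  model=None, display=None, working_dir=None,
                  program="ntxpred2.py"):
    """Return the argument list for one run; unset options are left out.

    Hypothetical helper for illustration; not part of NTxPred2.
    """
    if threshold is not None and not 0 <= threshold <= 1:
        raise ValueError("threshold must lie between 0 and 1")
    for name, value in (("job", job), ("model", model), ("display", display)):
        if value is not None and value not in CHOICES[name]:
            raise ValueError(f"{name} must be one of {CHOICES[name]}")
    command = [program, "-i", str(input_file)]
    for flag, value in (("-o", output_file), ("-t", threshold), ("-j", job),
                        ("-m", model), ("-d", display), ("-wd", working_dir)):
        if value is not None:
            command += [flag, str(value)]
    return command


if __name__ == "__main__":
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(build_command("example.fasta", threshold=0.6, job=1, model=1)))
```

The returned list can be handed to `subprocess.run(command, check=True)` once the program is on the path.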

Job Types (-j)

 
 
| Job | Description |
|---|---|
| 1 | Prediction – Classify input peptide/protein as neurotoxic or non‑neurotoxic |
| 2 | Protein Scanning – Identify neurotoxic regions within a protein sequence |
| 3 | Design – Generate mutants with a single amino acid/dipeptide at a specified position |
| 4 | Design All Possible Mutants – Generate and predict all possible single‑residue mutants |
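To give a sense of what job 4 involves, the sketch below enumerates every single-residue substitution of a sequence over the 20 standard amino acids. It illustrates the idea only and is not the tool's own code; `single_residue_mutants` is a hypothetical name, and restricting substitutions to the 20 standard residues is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_residue_mutants(seq):
    """Yield (position, original, replacement, mutant) for each substitution.

    Positions are 1-indexed. Hypothetical helper for illustration.
    """
    seq = seq.upper()
    for index, original in enumerate(seq):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                mutant = seq[:index] + replacement + seq[index + 1:]
                yield index + 1, original, replacement, mutant
```

A sequence of L standard residues yields 19 × L mutants, so a 12-residue peptide produces 228 candidates to score.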

Additional Options (for Design & Scan jobs)

 
 
| Option | Description |
|---|---|
| -p POSITION | Position to mutate (1‑indexed) |
| -r RESIDUES | Mutated residues (single or double‑letter amino acid codes) |
| -w {8-20} | Window length for Protein Scan (default: 12) |
| -d {1,2} | Display: 1 = neurotoxic only, 2 = all peptides (default) |
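The position, residue, and window options can be pictured with two small functions. This is a sketch under stated assumptions, not the tool's implementation: `scan_windows` assumes overlapping windows that advance one residue at a time, and `mutate_at` assumes that a two-letter value replaces the residue at the given position and the one after it. Both function names are hypothetical.

```python
def scan_windows(seq, window=12):
    """Return (start, fragment) pairs for overlapping windows; start is 1-indexed.

    Assumption: windows advance by one residue at a time.
    """
    if not 8 <= window <= 20:
        raise ValueError("window length must be between 8 and 20")
    seq = seq.upper()
    return [(start + 1, seq[start:start + window])
            for start in range(len(seq) - window + 1)]


def mutate_at(seq, position, residues):
    """Replace one or two residues starting at a 1-indexed position.

    Assumption: a dipeptide overwrites two consecutive residues.
    """
    if len(residues) not in (1, 2):
        raise ValueError("residues must be one or two letters")
    start = position - 1
    if start < 0 or start + len(residues) > len(seq):
        raise ValueError("position is outside the sequence")
    return seq[:start] + residues + seq[start + len(residues):]
```

A protein of length L gives L − w + 1 windows of length w, and none when it is shorter than the window.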

📂 Input & Output

Input Formats

 
 
| Format | Example | Description |
|---|---|---|
| FASTA | example.fasta | Standard FASTA format with >ID headers |
| Simple | example.seq | One sequence per line, no headers |
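The two input formats can be told apart by whether the first non-empty line starts with `>`. The reader below is a minimal sketch of that idea using only the standard library; `read_sequences` is a hypothetical helper, and the generated `seq_N` identifiers for headerless input are an assumption rather than the tool's naming scheme.

```python
def read_sequences(path):
    """Return a list of (identifier, sequence) pairs from FASTA or simple input.

    Hypothetical helper; NTxPred2 has its own input handling.
    """
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    records = []
    if lines and lines[0].startswith(">"):
        # FASTA: a header line opens a record, later lines extend its sequence.
        for line in lines:
            if line.startswith(">"):
                header = line[1:].strip()
                records.append([header or f"seq_{len(records) + 1}", ""])
            else:
                records[-1][1] += line.upper()
    else:
        # Simple format: every non-empty line is one complete sequence.
        for number, line in enumerate(lines, start=1):
            records.append([f"seq_{number}", line.upper()])
    return [(name, sequence) for name, sequence in records]
```

Calling `read_sequences("example.fasta")` on a two-record FASTA file returns two pairs, with multi-line sequences joined into one string.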

Output

  • Results are saved in CSV format

  • Default output file: outfile.csv

  • Columns include sequence ID, prediction label, and probability score
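The CSV output can be post-processed with the standard library. The sketch below keeps rows whose score reaches a chosen threshold. Because the exact column headers are not listed here, the score column name is a required argument rather than a guessed default, and treating a score equal to the threshold as a hit is an assumption. `rows_at_or_above` is a hypothetical helper.

```python
import csv


def rows_at_or_above(csv_path, score_column, threshold=0.5):
    """Return rows (as dicts) whose score is greater than or equal to threshold.

    Hypothetical helper; check the header of your own output file for the
    actual name of the probability column.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if score_column not in (reader.fieldnames or []):
            raise KeyError(f"column {score_column!r} not found in {csv_path}")
        return [row for row in reader if float(row[score_column]) >= threshold]
```

Inspecting the first line of `outfile.csv` shows which header to pass as `score_column`.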

📦 Package Contents

 
 
| File | Description |
|---|---|
| INSTALLATION | Installation instructions |
| LICENSE | License information |
| README.md | This file |
| ntxpred2.py | Main Python program |
| example.fasta | Example input file (FASTA format) |

📬 Contact

For questions or suggestions, please contact:
Prof. Gajendra P. S. Raghava – raghava@iiitd.ac.in

 


Additional details

Related works

Is referenced by: Software documentation, DOI 10.1101/2025.03.01.640936

Dates

Available: 2025-03-10

Software

Repository URL: https://github.com/raghavagps/ntxpred2
Programming language: Python
Development status: Active