Catching the Rare: Ensemble and Linear Models for Imbalanced Network Intrusion Detection
Authors/Creators
Contributors
Researcher:
Description
This record contains the materials for the study “Catching the Rare: Ensemble and Linear Models for Imbalanced Network Intrusion Detection.” The work investigates supervised machine learning approaches for binary network intrusion detection under severe class imbalance, with a focus on identifying rare attack instances.
The study is motivated by attack behaviors commonly analyzed in honeypot environments but does not involve live honeypot deployment. Experiments are conducted using the publicly available CICIDS 2017 dataset, generated in a controlled environment that simulates realistic benign and malicious network traffic.
Three supervised learning models (Logistic Regression, Random Forest, and XGBoost) are evaluated to compare linear and ensemble-based approaches. Model performance is assessed using imbalance-aware metrics, including precision–recall curves, ROC analysis, and balanced accuracy, rather than accuracy alone. Feature importances and model coefficients are analyzed to provide interpretable insights into the network flow characteristics associated with malicious activity.
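To illustrate why the evaluation relies on imbalance-aware metrics rather than accuracy alone, the following is a minimal sketch in plain Python. The toy labels, the 1% attack rate, and the two detectors are hypothetical and are not taken from the study or from CICIDS 2017; the helper function names are likewise illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = attack, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all flows labeled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the attack class and recall on the benign class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return (tpr + tnr) / 2


def precision_recall(y_true, y_pred):
    """Precision and recall for the attack (positive) class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 990 benign flows followed by 10 attacks.
y_true = [0] * 990 + [1] * 10

# A degenerate detector that labels every flow as benign.
always_benign = [0] * 1000

# A detector that raises 20 false alarms and catches 8 of the 10 attacks.
detector = [1] * 20 + [0] * 970 + [1] * 8 + [0] * 2

for name, y_pred in [("always_benign", always_benign), ("detector", detector)]:
    p, r = precision_recall(y_true, y_pred)
    print(
        f"{name}: accuracy={accuracy(y_true, y_pred):.3f} "
        f"balanced_accuracy={balanced_accuracy(y_true, y_pred):.3f} "
        f"precision={p:.3f} recall={r:.3f}"
    )
```

In this toy setting the always-benign detector scores higher on plain accuracy than the detector that actually finds attacks, while balanced accuracy and recall rank them the other way around. In practice, scikit-learn's `balanced_accuracy_score` and `precision_recall_curve` would likely be used instead of hand-written helpers.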
The study provides a transparent and reproducible baseline for intrusion detection research inspired by honeypot traffic analysis. While results are based on simulated network data, the methodology and findings may inform future work involving live network traffic, deployed honeypots, or adaptive learning approaches.
Files
| Name | Checksum | Size |
|---|---|---|
| Catching the Rare Ensemble and Linear Models for Imbalanced Network Intrusion Detection.pdf | md5:43cfad9cbe0de07f9038fc5a9bc60419 | 402.4 kB |
Additional details
Identifiers
Dates
- Created: 2025-10-20
Software
- Repository URL: https://github.com/harddikk/Catching-the-Rare
- Programming language: Python, Jupyter Notebook
- Development Status: Active