Network Packet Level-Based Intelligent Phishing Intrusion Detection System
Creators
- 1. INSTITUTE OF SCIENCE, TECHNOLOGY AND INNOVATION (IST&I) UM6P-CS: School of Computer Science
Description
The Internet has become an indispensable component of our daily social and financial lives. Nonetheless, internet users may be subject to a variety of web threats that can result in financial losses, identity theft, data loss, and brand reputation damage, in both the public and private sectors. Phishing is a type of web threat and cybercrime in which an attacker imitates a legitimate company's website in order to steal sensitive information such as usernames, passwords, and social security numbers.
So far, no single solution can capture every phishing attack at the network level for real HTTPS data flows. According to a report [1] by the Anti-Phishing Working Group (APWG) and contributor PhishLabs, 83% of phishing sites had SSL encryption enabled in the first quarter of 2021. In this study, we introduce an intelligent model for predicting phishing attacks. Network-based intrusion detection systems that monitor high-volume network links need greater computing resources than ordinary computer hardware can provide. Using this model, we describe possible use cases for combining traditional packet-based intrusion detection with newer flow-based intrusion detection. An increasing amount of web traffic is currently encrypted using HTTPS. While most HTTPS traffic is legitimate, a growing share is generated by malware. The use of HTTPS by malware and phishing attacks makes them more challenging to detect.
The current strategy in the literature is to use HTTPS interceptor proxies to identify HTTPS malware traffic [4]. This approach requires on-the-fly decryption of traffic, which poses risks to data and communication security and privacy.
The goal of this research project is to detect malicious HTTPS/HTTP phishing traffic without decryption, by analyzing real-time device traffic and large network captures in the form of PCAP files to extract network traffic characteristics. We propose a novel detection model that feeds features of the underlying DNS traffic into a machine learning classifier. A solid feature engineering mechanism plays a pivotal role in boosting the performance of any machine learning model. Therefore, we extracted effective and practical features from DNS traffic and categorized them into lexical-based and third-party-based groups. Third-party features are biographical information about a specific domain extracted from third-party APIs [2] [3].
Experimental evaluation is conducted using a public CSV dataset that we preprocessed and generated, while the model training data is taken from the CIC-Bell-DNS 2021 dataset, a collaborative project with Bell Canada (BC) and Cyber Threat Intelligence (CTI) [18]. The dataset's authors generated and released a large DNS feature dataset of 400,000 benign and 13,011 malicious samples, processed from one million benign and 51,453 known-malicious domains drawn from publicly available datasets. The malicious samples span three categories: spam, phishing, and malware. For our research project, we worked only with the phishing samples. CIC-Bell-DNS 2021 was the best choice for this research because it replicates real-world scenarios with frequent benign traffic and malicious domain types [18].
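To make the idea of lexical-based features concrete, the following is a minimal sketch of how a few such features could be computed from a domain name observed in DNS traffic. It is an illustration under assumptions, not the feature set used in this work or in CIC-Bell-DNS 2021: the function names `shannon_entropy` and `lexical_features`, and the particular features chosen (length, label count, digit ratio, hyphen count, character entropy), are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the characters in text."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Compute a few illustrative lexical features of a domain name.

    The feature selection here is an assumption made for illustration;
    it does not reproduce the feature list of any specific dataset.
    """
    # Normalize: trim whitespace, drop the trailing root dot, lowercase.
    name = domain.strip().rstrip(".").lower()
    labels = name.split(".") if name else []
    chars = name.replace(".", "")  # characters excluding label separators
    n = len(chars)
    return {
        "length": len(name),
        "label_count": len(labels),
        "longest_label": max((len(label) for label in labels), default=0),
        "digit_ratio": sum(c.isdigit() for c in chars) / n if n else 0.0,
        "hyphen_count": chars.count("-"),
        "entropy": shannon_entropy(chars),
    }


if __name__ == "__main__":
    for d in ("example.com", "paypa1-secure-login.example.com"):
        print(d, lexical_features(d))
```

Each resulting dictionary could serve as one row of a feature table passed to a classifier. Third-party features would require external API lookups and are left out of this sketch.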
Files
Name | Size | Checksum |
---|---|---|
RAPPORT _FRSI_Khaoula_Hidawi.pdf | 6.5 MB | md5:ef5af3b44c3ed70fce8cba6a2df0291f |