DETECTION OF BOTNETS USING INVARIANT REPRESENTATION.

Moinak Bhattacharya 1 and V. Bhattacharya 2.
1. SRM Institute of Science and Technology, Kattankulathur.
2. Birla Institute of Technology, Mesra, Ranchi.

Manuscript History: Received: 14 December 2018; Final Accepted: 16 January 2019; Published: February 2019

Abstract:-

Over the past few decades, botnets have been known to be a serious threat to cyber security. A botnet consists of systems in a network environment that are commanded by an attacker, also known as the bot herder, through a command and control (C & C) channel, and that in turn target neighbouring systems. As a result, several anomalies (such as DDoS, spamming and key-logging) arise, which lead to system failure, information breaches and threats to security. With the advancement of technology, botnets tend to change their features and patterns of attack and become harder to defeat. In the proposed architecture, we derive a methodology to effectively detect problematic botnets irrespective of the variance in their features and attack patterns. An invariant representation is implemented to detect the botnets while preserving this invariance, and the architecture is evaluated using binned histogram representations and a two-class SVM (Support Vector Machine).

Introduction:-
Over the last 15 years, botnets have been among the most vexing cyber-security threats and have caused many devastating and costly attacks on Internet security. About 15 to 20 percent of the computers connected to the Internet are infected and are used by botnets [7]. A victim host becomes a bot in the botnet and is controlled by a human (the bot herder) and numerous controllers through a Command and Control (C & C) communication channel. The attacker, also known as the "bot herder", "botmaster" or "controller", commands the vulnerable victim host to perform attacks such as Distributed Denial of Service (DDoS) attacks, along with fraudulent activities including spamming of other hosts in the system, security breaches, identity theft and information exfiltration, malware dissemination, click fraud and many more [3][4][5][6]. All these activities are performed in a specific manner through the C & C channel. The vast majority of botnets [2] use centralized C & C structures based on the Internet Relay Chat (IRC) protocol, which gives attackers more flexibility because it provides instant and efficient interaction with a large number of bots [20,26]. Others use P2P (Peer to Peer) protocols, which have no central C & C server; instead, all the bots are connected to each other to receive commands. That is why a P2P botnet does not suffer from a single point of failure [14].
Some botnets use the HTTP/HTTPS protocols [3,16,17,18], as HTTP-based C & C communications are allowed in most networks. It has been documented that a single botnet is capable of infecting about 400,000 systems and simultaneously keeping them under its control [8]. Several methodologies have been proposed to detect the existence of botnets in a monitored network. Most of these approaches detect botnets that use an IRC or HTTP-based C & C communication channel [9][10][11][12][13]. BotHunter [19] detects bots by categorizing bot behaviour that follows a predefined infection life-cycle dialog model. According to recent studies, botnets can change their C & C server address using fast-flux service networks, so some of these approaches may prove ineffective against such changes. Therefore, the features extracted from raw samples must be robust and invariant, so that changes in botnet behaviour, which appear as differences between the features of training samples and test samples, do not defeat detection. In this work, an invariant representation of sample traffic is implemented which is invariant under shifting and scaling of the features and under permutation of the channelized representation. This is studied using histogram representations built from a self-similarity matrix for each channel.

Related Works:-
A botnet is a group of targeted hosts that execute the specified commands of the botmaster. Botnets target nearby hosts by exploiting security vulnerabilities of the victim host. Botnets tend to borrow strategies from different types of malware, including self-replicating worms and email viruses, and tend to diversify their features, which makes them more dangerous. In this section we discuss the strategies that have been attempted to detect botnets and to efficiently detect their features.
Researchers have proposed several approaches [9,10,11,12,13,19,20,21] to detect the existence of botnets in networks. Dewes et al. [22] introduced a concept for identifying chat traffic. Sen et al. [23] used a signature-based scheme to discern traffic produced by several well-known P2P applications. Rishi [10] uses known IRC bot nickname patterns as signatures to detect IRC botnets. BotSniffer [20] detects C & C activities over protocols such as HTTP and IRC [24].
Other work has classified botnet detection into active and passive detection, as proposed by Daniel et al. [25], and has characterized behaviours as network-based behaviour, host-based behaviour and global correlated behaviour, based on Trend Micro's report [26].
Explanation:-According to Figure 1, the congestion of the network traffic with problematic elements reflects the fact that the bots that simultaneously attack neighbouring systems are commanded by a specific, channelized instruction sequence. This particular behaviour of the botnets and the nature of their attacks are organized and subsequently arranged into different channels.
The idea of channelizing the information leads to a different approach, in which the features are expressed in a specific format. The key design requirement is that the features must show invariant properties. This is attained by representing the features in invariant form. The invariant representation specifies how the features respond to changes in either scale or shift. The representation is also invariant to changes in the arrangement of the channels.
From the initial stage, when the attacker issues commands to the vulnerable bots, through the classification of problematic and non-problematic botnets, to the efficient detection of the problematic botnets, the proposed architecture is robust and works efficiently.
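The three invariance properties described above can be illustrated with a minimal sketch. It assumes, as the Introduction suggests, that each feature of a channel is summarized by a histogram of its self-similarity matrix; the function names, the bin count and the sample values are hypothetical and are not taken from the paper. Pairwise differences cancel a constant shift, division by the largest difference cancels a constant scale, and the histogram discards the ordering of the samples.

```python
def self_similarity(values):
    """Matrix of pairwise absolute differences for one feature.

    Adding the same constant to every value leaves the matrix unchanged,
    which gives shift invariance.
    """
    return [[abs(a - b) for b in values] for a in values]


def invariant_histogram(values, bins=4):
    """Normalised histogram of the scaled self-similarity matrix."""
    flat = [d for row in self_similarity(values) for d in row]
    largest = max(flat)
    # Dividing by the largest difference removes any common scale factor.
    scaled = [d / largest if largest else 0.0 for d in flat]
    counts = [0] * bins
    for d in scaled:
        # The value 1.0 is placed in the last bin.
        counts[min(int(d * bins), bins - 1)] += 1
    # Counting entries ignores their order, so permuting the samples
    # does not change the result.
    return [c / len(scaled) for c in counts]


# Hypothetical values of one feature (e.g. bytes per flow) in one channel.
base = [120, 300, 180, 960, 480]
# The same channel after scaling by 3, shifting by 50 and reordering.
changed = [3 * v + 50 for v in reversed(base)]

print(invariant_histogram(base))
print(invariant_histogram(changed))
```

Under these assumptions the two printed histograms are identical, which is the property the representation is meant to provide.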

Experimental Characteristics:-The proposed architecture in the figure addresses the problem of presenting a robust representation of network traffic communication that is sufficiently invariant against the modifications a botmaster can deploy to evade the detection system. The invariant representation divides the network traffic into specified channels that contain the malicious and non-malicious samples.
To evaluate the prototype, we tested its performance several times on real network traffic traces, including normal data from the BIT Mesra campus network. Each collected sample is represented as an N-dimensional feature vector. The samples are then channelized into M channels, denoted C.
Each channel is then passed through a transformation, denoted τ, with three stages: scale invariance of the features, shift invariance of the features, and invariance to permutation of the channel (the size of the channel is assumed to be predetermined). The invariant representation is explained under three major heads corresponding to these stages.
The accompanying figure indicates that the proposed model will achieve higher efficiency when the classifier is properly trained. In it, squares represent non-malicious samples and circles represent malicious samples.
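The channelization step can be sketched as follows. The paper does not state which key defines a channel, so this example assumes, purely for illustration, that samples sharing the same source and destination host belong to one channel; the field names `src`, `dst` and `features` and all sample values are hypothetical.

```python
from collections import defaultdict


def channelize(samples, key_fields=("src", "dst")):
    """Group N-dimensional feature vectors into channels.

    The grouping key (source and destination host) is an assumption made
    for this sketch, not a detail given in the paper.
    """
    channels = defaultdict(list)
    for sample in samples:
        key = tuple(sample[field] for field in key_fields)
        channels[key].append(sample["features"])
    return dict(channels)


# Hypothetical samples, each with a feature vector of dimension N = 3.
samples = [
    {"src": "10.0.0.5", "dst": "203.0.113.9", "features": [120, 4, 0.8]},
    {"src": "10.0.0.5", "dst": "203.0.113.9", "features": [300, 6, 1.1]},
    {"src": "10.0.0.7", "dst": "198.51.100.2", "features": [960, 2, 0.3]},
    {"src": "10.0.0.5", "dst": "203.0.113.9", "features": [180, 5, 0.9]},
]

channels = channelize(samples)
M = len(channels)  # number of channels found in this toy trace
for key, vectors in channels.items():
    print(key, len(vectors))
```

A different grouping key (for example a time window) could be substituted by changing `key_fields`, provided every sample carries those fields.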

Evaluation And Result:-
The proposed architecture was implemented on a real network at the BIT Mesra campus to detect botnet samples at different intervals of time. A two-class classification method is deployed that transforms the features as τ(C, δ, Ф). Different studies were made by regularizing the specified bins of the histogram at different intervals, as shown in Figure 4 and Figure 6. The training data and test data were evaluated using a two-class SVM classifier. The results on the test data are shown in Figure 5 and Figure 7. With these specifications, the proposed method achieves 90% efficiency in detecting botnets despite changes in their features at different intervals of time.
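The classification stage can be illustrated with a small two-class example. The paper does not state the SVM kernel, library or parameters, so this sketch substitutes a linear soft-margin SVM trained by full-batch sub-gradient descent on the hinge loss; the histogram vectors, labels and hyperparameters (`lam`, `lr`, `epochs`) are hypothetical and do not reproduce the reported results.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * |w|^2 + mean hinge loss by sub-gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (malicious) or -1 (non-malicious)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical 4-bin histogram representations of six channels.
X_train = [
    [0.70, 0.20, 0.10, 0.00],  # non-malicious
    [0.60, 0.30, 0.10, 0.00],  # non-malicious
    [0.80, 0.10, 0.05, 0.05],  # non-malicious
    [0.10, 0.10, 0.30, 0.50],  # malicious
    [0.00, 0.20, 0.30, 0.50],  # malicious
    [0.05, 0.15, 0.20, 0.60],  # malicious
]
y_train = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X_train, y_train)

# Hypothetical unseen channels, one of each class.
X_test = [[0.75, 0.15, 0.075, 0.025], [0.075, 0.125, 0.25, 0.55]]
print([predict(w, b, x) for x in X_test])
```

In practice a library implementation with a tuned kernel would normally replace the hand-written trainer; the sketch only shows how histogram vectors feed a two-class decision.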

Conclusion and Future Works:-
The proposed architecture is capable of detecting botnets in congested network traffic and is able to classify the activities of the botnets irrespective of changes in their features and behaviour. It works by channelizing the flows and representing them so that their properties remain invariant. The results were then analyzed, and the method was found to be 90% efficient.
Furthermore, as technology advances, the threats to cyber security increase, which opens a path for deeper research into the malicious behaviour of botnets. Studying how botnet features change may lead to the early detection of security breaches and help to eliminate potential threats to the network.