Published March 2, 2018
| Version 10008916
Journal article
Open
Off-Policy Q-learning Technique for Intrusion Response in Network Security
Authors/Creators
Description
With our increasing dependency on computer devices, we need
adequate, efficient, and effective mechanisms for protecting our
networks. Intrusion Detection Systems (IDS) attempt to solve two
main problems: 1) detecting an attack by analyzing the incoming
traffic and inspecting the network (intrusion detection), and
2) producing a prompt response when the attack occurs (intrusion
prevention). It is critical to create an intrusion detection model
that detects a breach in the system on time, and it is challenging
to make that model respond automatically, with an acceptable delay,
at every stage of the monitoring process. We cannot afford security
measures that demand high computational power, nor can we accept a
mechanism that reacts with a delay. In this paper, we propose an
intrusion response mechanism based on artificial intelligence, and
more precisely on reinforcement learning techniques (RLT). The RLT
help us create a decision agent that controls the process of
interacting with an undetermined environment. The goal is to find an
optimal policy, which represents the intrusion response, and thereby
to solve the reinforcement learning problem using a Q-learning
approach. Our agent produces an optimal immediate response while
evaluating the network traffic. This Q-learning approach establishes
a balance between exploration and exploitation and provides a
unique, self-learning, and strategic artificial intelligence
response mechanism for IDS.
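
As a rough illustration of the Q-learning idea named in the description, the following minimal sketch trains a tabular agent with an epsilon-greedy behavior policy. It is not the paper's implementation: the traffic states, response actions, reward table, and hyperparameters are all hypothetical, chosen only to show how the off-policy update and the exploration/exploitation trade-off fit together.

```python
import random

# Hypothetical traffic states and response actions (not taken from the paper).
STATES = ["normal", "suspicious", "attack"]
ACTIONS = ["allow", "alert", "block"]

# Hypothetical reward table: REWARD[state][action].
REWARD = {
    "normal":     {"allow": 1.0,  "alert": -0.5, "block": -1.0},
    "suspicious": {"allow": -0.5, "alert": 1.0,  "block": -0.2},
    "attack":     {"allow": -2.0, "alert": -0.5, "block": 1.0},
}


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run tabular Q-learning on the toy environment and return the Q-table."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    state = rng.choice(STATES)
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        reward = REWARD[state][action]
        # Assumption: the next traffic state is drawn uniformly at random,
        # independent of the response taken.
        next_state = rng.choice(STATES)
        # Off-policy target: uses the best next action, whatever the
        # behavior policy actually does next.
        best_next = max(q[next_state].values())
        q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Extract the response with the highest learned value for each state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in STATES}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With this made-up reward table, the learned greedy policy should map normal traffic to `allow`, suspicious traffic to `alert`, and attack traffic to `block`. The update is off-policy because the agent acts epsilon-greedily while the target always uses the maximum value over next actions.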
Files
| Name | Size |
|---|---|
| 10008916.pdf (md5:306ee6e5eb2aaeee517431cb98effa49) | 405.4 kB |