Anomaly Detection in Network Traffic Using an Unsupervised Machine Learning Approach
Shraddha D. Shegokar1, Prof. P. P. Rane2, Prof. S. A. Vyawhare3
1Department of Computer Science and Engineering, Rajarshi Shahu College of Engineering, Buldhana 443001, Maharashtra, India
2Department of Computer Science and Engineering, Rajarshi Shahu College of Engineering, Buldhana 443001, Maharashtra, India
3Department of Computer Science and Engineering, Rajarshi Shahu College of Engineering, Buldhana 443001, Maharashtra, India
ABSTRACT - The advent of IoT technology and the growing number of wireless networking devices have led to an enormous increase in network attacks from different sources. To keep networks safe and secure, the Intrusion Detection System (IDS) has become critical. An IDS is designed to protect the network by identifying anomalous behavior or improper use. It provides more thorough security than access control barriers by detecting attempted and successful attacks at the endpoint or within the network. Intrusion prevention systems are the next logical step in this approach, as they can take real-time action against breaches. An accurate IDS requires detailed visibility into the network traffic, and it should be able to detect threats inside the network as well as access control breaches. IDSs have been in use for a long time. Traditional IDSs were rule-based and signature-based: although they were able to reduce false positives, they could not detect new attacks. Today, with the growth of connectivity, attacks have increased at an exponential rate, and a data-driven approach has become essential to tackle these issues. In this paper, the KDD dataset was used to train the unsupervised machine learning algorithm called Isolation Forest. The dataset is highly imbalanced and contains various attacks such as DoS, Probe, U2R, and R2L. Because the dataset suffers from redundant values and class imbalance, data preprocessing was performed first, and unsupervised learning was then applied. For this network-traffic-based anomaly detection model, Isolation Forest was used to detect outliers and probable attacks, and the results were evaluated using the anomaly score.
Keywords - anomaly detection, isolation forest, machine learning, intrusion detection system, KDD Cup, NSL-KDD.
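To make the anomaly score mentioned in the abstract concrete, the following is a minimal, standard-library sketch of the Isolation Forest idea. It is not the implementation used in this paper. It assumes that records have already been preprocessed into purely numeric feature tuples, and every name in it (c_factor, build_tree, path_length, fit_forest, anomaly_score) is hypothetical. The score follows the standard Isolation Forest definition s(x, n) = 2^(-E[h(x)] / c(n)), where h(x) is the path length of record x in a tree and c(n) is the average path length for a sample of n records. The demo data at the end is synthetic and is not drawn from the KDD dataset.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop when the chosen feature is constant
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Number of splits needed to reach a leaf, plus a correction for unsplit leaves."""
    while "feature" in node:
        node = node["left"] if x[node["feature"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample (needs >= 2 rows)."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    forest = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return forest, sample_size


def anomaly_score(x, forest, sample_size):
    """Score in (0, 1]; values closer to 1 indicate records that are easier to isolate."""
    mean_path = sum(path_length(x, tree) for tree in forest) / len(forest)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Synthetic stand-in for preprocessed numeric traffic features (not KDD data):
# 200 points clustered around the origin plus one distant point.
demo_rng = random.Random(1)
demo_rows = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(200)]
demo_rows.append((10.0, 10.0))
demo_forest, demo_n = fit_forest(demo_rows)
print("distant point:", anomaly_score((10.0, 10.0), demo_forest, demo_n))
print("central point:", anomaly_score((0.0, 0.0), demo_forest, demo_n))
```

Because outliers are separated from the rest of the data after only a few random splits, their average path length is short and their score moves toward 1, while records inside dense regions need more splits and score lower. No labels are used at any point, which is why the approach suits a highly imbalanced dataset.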