Examining The Impact of Traffic Sampling on Methods for Network Intrusion Detection Based on Machine Learning
M. Yashwant Ravi Kumar, Gajula Pavan Kumar, Saroj Kumar Bharti
Dept. of Computer Science and
Engineering
Bharath Institute of Higher Education and
Research
Chennai, India
yaswanthravi11@gmail.com
gajulapavannaidu666@gmail.com
Challa Apoorva
Dept. of Computer Science and
Engineering
Bharath Institute of Higher Education and
Research
Hyderabad, India
apoorvachalla14@gmail.com
Guide:
Mr. Krishnamoorthy
Assistant Professor
Dept. of Computer Science and Engineering
Bharath Institute of Higher Education and Research
Krishnamoorthy.cse@bharathuniv.ac.in
Abstract
Network intrusion detection is a key element of cybersecurity that identifies and helps prevent unauthorized entry into computer networks. Machine learning (ML) is an emerging solution for intrusion detection because it can process huge volumes of network traffic data and recognize sophisticated attack patterns. Yet the success of ML-based intrusion detection systems (IDS) depends heavily on the quality and quantity of the input data.
This research evaluates the effect of traffic sampling on the performance of ML-based IDS. Two popular ML algorithms—Random Forest and Multi-Layer Perceptron—are tested using datasets with both unsampled and sampled network traffic data with varying sampling rates. IDS performance is measured in terms of accuracy, precision, recall, and F1-score.
The findings show that traffic sampling has a considerable impact on ML-based IDS, with increased sampling rates tending to improve performance. Nevertheless, the best sampling rate depends on the ML algorithm and dataset employed.
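To make the evaluation setup concrete, the following is a minimal sketch of two ingredients mentioned above: reducing a traffic trace by sampling, and scoring predictions with accuracy, precision, recall, and F1-score. The abstract does not state which sampling scheme was used, so 1-in-n systematic sampling is an assumption made here for illustration, and the function names `systematic_sample` and `binary_metrics` are hypothetical. Labels are assumed to be binary (1 = attack, 0 = benign).

```python
from typing import Dict, List, Sequence


def systematic_sample(packets: Sequence, n: int) -> List:
    """Keep every n-th item (1-in-n systematic sampling); n=1 keeps everything.

    Assumption: the study's actual sampling scheme is not given in the abstract.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return list(packets[::n])


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = attack)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy data only; these are not results from the study.
    trace = list(range(10))
    print(systematic_sample(trace, 4))

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

In a full experiment, a classifier such as Random Forest or a Multi-Layer Perceptron would be trained on features derived from the sampled trace, and its predictions on held-out traffic would be passed to a metric function like the one above for each sampling rate.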