Analyzing Resampling Techniques for Addressing the Class Imbalance in NIDS using SVM with Random Forest Feature Selection

Journal Title: International Journal of Experimental Research and Review - Year 2024, Vol 43, Issue 7

Abstract

Network Intrusion Detection Systems (NIDS) protect computer networks from harmful activity. A major concern in NIDS development is class imbalance: normal traffic far outnumbers intrusion attempts in the captured data. This imbalance can reduce the effectiveness of detection algorithms, particularly for less frequent but still highly dangerous intrusions. This paper applies resampling techniques to address class imbalance in NIDS, using a Support Vector Machine (SVM) classifier together with features selected by Random Forest. The analysis compares the sampling methods, offering insights into their efficiency and practicality for real-world applications. The resampling techniques analyzed are Synthetic Minority Over-sampling Technique (SMOTE), Random Under-sampling (RUS), Random Over-sampling (ROS), and two combinations, RUS+SMOTE and RUS+ROS. Feature selection uses Random Forest, improved with Bayesian methods, to create feature subsets whose rankings are determined by the Cumulative Feature Importance Score (CFIS). The CIDDS-2017 dataset is used for the performance evaluation, and the metrics are accuracy, precision, recall, F-measure and CPU time. Across the CFIS feature subsets, SMOTE performs best overall; its best result comes at the 90% CFIS level, which contains 25 features. This subset achieves a relative accuracy improvement of 0.08% over the other approaches. RUS+ROS also performs well but is somewhat slower than SMOTE. RUS+SMOTE, by contrast, gives relatively poor results, at about 50% of the performance of the other methods, although it requires less computation time.
This paper's novelty is adapting the RUS method as a standalone test for screening new and potentially contaminated datasets. The standalone RUS method is computationally more efficient; its best result was 98.13% accuracy at the 85% CFIS level (34 features), with a computation time of 137.812 s. SMOTE is also found to be the most proficient of the resampling techniques studied for handling class imbalance in NIDS, using the 90% CFIS feature subset. Future research could apply these techniques to other datasets and to other machine learning and deep learning methods, together with ROC curve analysis, to give NIDS designers useful guidance on selecting the right data mining tools and strategies for their projects.
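
The abstract does not spell out how a CFIS level maps to a feature subset or how RUS balances the classes, so the following is only a minimal sketch under stated assumptions. It assumes that a CFIS level (for example 90%) means keeping the top-ranked features until their cumulative normalized importance reaches that threshold, and that RUS shrinks every class to the size of the smallest one. The function names `cfis_subset` and `random_under_sample` and all toy values are hypothetical; in the paper the importance scores would come from the Random Forest rather than a hand-written list.

```python
import random
from collections import defaultdict


def cfis_subset(importances, threshold):
    """Return indices of the top-ranked features whose cumulative
    normalized importance first reaches `threshold` (e.g. 0.90).
    Assumed interpretation of a CFIS level, not the paper's exact rule."""
    total = sum(importances)
    # Rank feature indices from most to least important.
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    selected, cumulative = [], 0.0
    for idx in ranked:
        selected.append(idx)
        cumulative += importances[idx] / total
        if cumulative >= threshold - 1e-12:  # small tolerance for float error
            break
    return selected


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows from the larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy importance scores (made up for illustration, not from the paper).
toy_importances = [0.40, 0.05, 0.30, 0.15, 0.10]
subset_90 = cfis_subset(toy_importances, 0.90)

# Toy imbalanced data: 8 "normal" rows and 2 "attack" rows.
toy_rows = [[i, i * 2] for i in range(10)]
toy_labels = ["normal"] * 8 + ["attack"] * 2
balanced_rows, balanced_labels = random_under_sample(toy_rows, toy_labels)
```

In a fuller pipeline, the selected feature indices would be used to slice the resampled training data before fitting the SVM, and resampling would be applied to the training split only so that the test set keeps its original class distribution.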

Authors and Affiliations

K. Swarnalatha, Nirmalajyothi Narisetty, Gangadhara Rao Kancherla, Basaveswararao Bobba

  • EP ID EP747173
  • DOI 10.52756/ijerr.2024.v43spl.004

How To Cite

K. Swarnalatha, Nirmalajyothi Narisetty, Gangadhara Rao Kancherla, Basaveswararao Bobba (2024). Analyzing Resampling Techniques for Addressing the Class Imbalance in NIDS using SVM with Random Forest Feature Selection. International Journal of Experimental Research and Review, 43(7), -. https://europub.co.uk/articles/-A-747173