Performance Enhancement in Machine Learning Approaches using Recursive Feature Extraction, Maximum Variance Method and Min Max Method

K. Pramilarani, Vasanthi Kumari P

Abstract

This paper presents well-known techniques for selecting and extracting features from a dataset to improve the performance of machine learning approaches in identifying DoS/DDoS attack types. Intrusion detection is an essential part of an organization's cybersecurity defense strategy. It complements other security measures such as firewalls, antivirus software, and access controls, helping to identify and mitigate threats that may have bypassed other security layers. By identifying and addressing security breaches promptly, companies can secure their confidential information, ensure uninterrupted operations, and protect their reputation against digital threats. Machine learning approaches are used to identify threats in the network. The input to a machine learning model should be preprocessed to enhance the model's overall performance. Recursive Feature Elimination is one such method; it reduces the total number of features based on an importance ranking. In the maximum variance method, the variances computed over the data samples are collected, and their mean is used as the threshold value. In the min-max method, data normalization brings all the data into a similar range. These algorithms are used to preprocess the data so that the overall performance of the classifier is enhanced. The dataset considered in this research work is the KDD+ dataset, which is mainly used to train and test models for identifying different types of network attacks. It contains 41 features describing network connections, status, protocols, and services, along with an associated label for the attack category. Reducing the number of features in the dataset can increase the performance of a machine learning approach. The research work shows that accuracy improves by 10% to 20% after applying the preprocessing algorithms for feature selection and noise removal.
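The two simpler preprocessing steps named in the abstract, min-max normalization and variance-based selection with the mean variance as threshold, can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not specify several details, so the sketch makes assumptions: variance is computed per feature (column), min-max scaling is applied before the variance step, and features whose variance is at or above the mean are kept. The function names and the toy data are hypothetical, and Recursive Feature Elimination is not covered here because it requires a trained model to rank features.

```python
from statistics import mean, pvariance


def min_max_scale(rows):
    """Rescale every column to the range [0, 1].

    Constant columns (max == min) are mapped to 0.0 to avoid division by zero.
    """
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_cols.append([(v - lo) / span if span else 0.0 for v in col])
    # Transpose back from columns to rows.
    return [list(r) for r in zip(*scaled_cols)]


def select_by_mean_variance(rows):
    """Return indices of columns whose variance >= mean of all column variances.

    Assumption: the threshold rule (keep if at or above the mean) is one
    plausible reading of the abstract, not a confirmed detail of the paper.
    """
    variances = [pvariance(col) for col in zip(*rows)]
    threshold = mean(variances)
    keep = [i for i, v in enumerate(variances) if v >= threshold]
    return keep, threshold


# Toy data invented for illustration: 4 samples, 3 features.
rows = [
    [0.0, 10.0, 5.0],
    [1.0, 20.0, 5.0],
    [0.0, 30.0, 5.0],
    [1.0, 40.0, 6.0],
]

scaled = min_max_scale(rows)
keep, threshold = select_by_mean_variance(scaled)
reduced = [[row[i] for i in keep] for row in scaled]
```

Scaling first puts all features on a common range, so their variances are comparable before the threshold is applied; on raw data, a feature measured in large units would dominate the mean variance.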
