Genetic Algorithm based Clustering for Intrusion Detection

Abstract

Clustering algorithms have recently gained attention in the literature because they can support current intrusion detection systems in several respects. This paper proposes genetic algorithm (GA) based clustering to separate patterns extracted from network traffic packets into normal and attack classes. Two GA-based clustering models for the intrusion detection problem are introduced. The first model, referred to as GA #1, handles only the numeric features of a network packet, whereas the second, referred to as GA #2, handles all features of the packet. Moreover, a new mutation operator designed for binary and symbolic features is proposed. The operator is built on the mode, that is, the most frequent value of each such feature. The proposed GA-based clustering models are evaluated on the Network Security Laboratory-Knowledge Discovery and Data Mining (NSL-KDD) benchmark dataset. They are also compared with two baseline methods, namely k-means and k-prototype, to judge their performance and to confirm the value of the obtained clustering structures. The experiments demonstrate the effectiveness of the proposed models for the intrusion detection problem: both GA #1 and GA #2 outperform the two baseline methods in accuracy (Acc), detection rate (DR) and true negative rate (TNR). Moreover, the results show that the proposed mutation operator strengthens the GA #2 model on all evaluation metrics. GA #2 attains relative improvements in Acc of 6.4%, 5.463% and 3.279% over GA #1 and the two baseline models, respectively.
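To make the idea of a mode-based mutation concrete, the following is a minimal Python sketch rather than the operator as defined in this paper. It assumes that a chromosome gene is a cluster centre stored as a list of feature values, and that mutation replaces each binary or symbolic entry, with some probability, by the most frequent value of that feature among the records currently assigned to the cluster. The names `feature_mode`, `mode_mutation`, `categorical_idx` and `rate`, as well as the toy records, are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def feature_mode(values):
    """Return the most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def mode_mutation(center, members, categorical_idx, rate, rng=random):
    """Return a mutated copy of `center` (assumed encoding: one list per centre).

    Each categorical position in `categorical_idx` is replaced, with
    probability `rate`, by the mode of that feature over `members`.
    Numeric positions are left untouched in this sketch.
    """
    mutated = list(center)
    if not members:
        # No assigned records, so there is no mode to draw from.
        return mutated
    for j in categorical_idx:
        if rng.random() < rate:
            mutated[j] = feature_mode([row[j] for row in members])
    return mutated


# Toy records (illustrative only): (duration, protocol_type, logged_in)
members = [
    (0.1, "tcp", 1),
    (0.4, "tcp", 0),
    (0.3, "udp", 1),
]
center = [0.2, "icmp", 0]

# rate=1.0 forces every categorical gene to mutate.
print(mode_mutation(center, members, categorical_idx=[1, 2], rate=1.0))
# prints [0.2, 'tcp', 1]
```

Drawing the replacement from the mode keeps a mutated symbolic value inside the set of values actually observed in the cluster, which a random perturbation of a numeric encoding would not guarantee.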