Evaluation of Different Data Mining Algorithms with KDD CUP 99 Data Set

Abstract

Data mining is a modern technique for analyzing large volumes of data, such as the KDD CUP 99 data set used in network intrusion detection. Data mining technology can handle large amounts of data; it is still developing and is becoming more effective as it grows rapidly. This paper surveys most of the data mining algorithms that have been applied to the KDD CUP 99 data set for attack classification and compares their reported results. The results are presented using the following performance measures: True Positive Rate (TP), False Alarm Rate (FP), Percentage of Successful Prediction (PSP), and training time (TT). The purpose of the survey is to compare these results and select the best system for intrusion detection (classification). The results show that data mining algorithms differ in their detection rates depending on the attack type. The Random Forest classifier achieved the highest detection rate for DoS attacks, while the Fuzzy Logic algorithm achieved the highest detection rate for Probe attacks. The U2R and R2L attack categories were identified well by the MARS, Fuzzy Logic, and Random Forest classifiers. MARS achieved the highest classification accuracy, while the PART algorithm achieved the lowest. OneR had the shortest training time, whereas the Fuzzy Logic and MLP algorithms had the highest training times.
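
The performance measures named above can be illustrated with a minimal sketch. It assumes a binary setting (attack versus normal) and the usual definitions: TP as the fraction of attacks correctly flagged, FP as the fraction of normal records falsely flagged, and PSP as the percentage of all records classified correctly. The abstract does not give these formulas, so the PSP definition in particular is an assumption, and the function name `detection_metrics` and the sample labels are hypothetical.

```python
def detection_metrics(y_true, y_pred, positive="attack"):
    """Compute TP rate, false alarm rate, and PSP for a binary classifier.

    Assumed definitions (not taken from the paper):
      TP rate          = TP / (TP + FN)
      false alarm rate = FP / (FP + TN)
      PSP              = 100 * correct predictions / all predictions
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # attack correctly detected
        elif truth == positive:
            fn += 1  # attack missed
        elif pred == positive:
            fp += 1  # normal record flagged as attack (false alarm)
        else:
            tn += 1  # normal record correctly passed

    total = tp + fp + tn + fn
    return {
        "tp_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "psp": 100.0 * (tp + tn) / total if total else 0.0,
    }


# Hypothetical labels for eight connection records.
y_true = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "attack"]
y_pred = ["attack", "attack", "normal", "normal",
          "attack", "normal", "normal", "attack"]

metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

For a per-category comparison (DoS, Probe, U2R, R2L), the same function could be applied once per category by treating that category as the positive class and all other records as negative.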