Data Mining for Network Intrusion Detection: A comparison of data mining algorithms and an analysis of relevant features for detecting cyber-attacks
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
2015 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from data. Numerous researchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and the modified versions of this dataset, KDDCup99 and NSL-KDD, but until now no one has examined the performance of the Top 10 data mining algorithms selected by experts in data mining. The classification learning algorithms compared in this thesis are C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms is compared in terms of accuracy, error rate and average cost on modified versions of the NSL-KDD train and test datasets, where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally, the most important features for detecting cyber-attacks, both across all categories and within each category, are evaluated with Weka’s Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with the best performance on the dataset is k-NN. The most important features for detecting cyber-attacks are basic features such as the duration of a network connection in seconds, the protocol used for the connection, the network service used, the normal or error status of the connection, and the number of data bytes sent. For DoS, Probing and R2L attacks, the most important features are basic features and the least important are content features. For U2R attacks, by contrast, the content features are the most important.
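The two measurements named in the abstract, ranking features by Information Gain and scoring a classifier by accuracy, error rate and average cost, can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which used Weka; the function names, the toy records and the cost values below are all hypothetical and chosen only for illustration. `protocol_type` and `flag` are used as example feature names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected entropy after splitting on a nominal feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def evaluate(actual, predicted, cost):
    """Return (accuracy, error rate, average cost) for paired label lists.

    `cost` maps (actual, predicted) pairs to a misclassification cost.
    Assumption: unlisted errors cost 1 and correct predictions cost 0.
    """
    n = len(actual)
    accuracy = sum(a == p for a, p in zip(actual, predicted)) / n
    total_cost = sum(cost.get((a, p), 0.0 if a == p else 1.0)
                     for a, p in zip(actual, predicted))
    return accuracy, 1.0 - accuracy, total_cost / n


# Toy data (hypothetical, not taken from NSL-KDD).
labels = ["normal", "normal", "DoS", "DoS"]
features = {
    "protocol_type": ["tcp", "tcp", "tcp", "tcp"],  # constant: tells us nothing
    "flag": ["SF", "SF", "S0", "S0"],               # separates the classes exactly
}

# Rank features from most to least informative.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)

# Hypothetical cost values: missing an attack is costlier than a generic error.
cost = {("DoS", "normal"): 2.0, ("U2R", "normal"): 3.0}
actual = ["normal", "DoS", "DoS", "U2R"]
predicted = ["normal", "DoS", "normal", "normal"]
accuracy, error_rate, average_cost = evaluate(actual, predicted, cost)

print(ranking)
print(accuracy, error_rate, average_cost)
```

Average cost differs from error rate in that each kind of mistake is weighted, so a classifier that misses rare but serious attacks can score worse than one with a slightly higher error rate.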

Place, publisher, year, edition, pages
2015. 61 p.
Keyword [en]
Data mining, machine learning, cyber-attack, NSL-KDD, features, DoS, Probing, R2L, U2R
National Category
Other Engineering and Technologies not elsewhere specified
URN: urn:nbn:se:miun:diva-28002
OAI: diva2:939697
Educational program
Master of Science in Industrial Engineering and Management TINDA 300 higher education credits
Available from: 2016-06-20 Created: 2016-06-20 Last updated: 2016-06-20 Bibliographically approved


Author: Petersen, Rebecca
