Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Evaluation of Machine Learning Classification Methods: Support Vector Machines, Nearest Neighbour and Decision Tree
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2017 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesisAlternative title
Utvärdering av Klassificeringsmetoder inom Maskininlärning : Stödvektormaskiner, Närmaste Granne och Beslutsträd (Swedish)
Abstract [en]

With more and more data available, the interest and use for machine learning is growing and so does the need for classification. Classifica- tion is an important method within machine learning for data simpli- fication and prediction.

This report evaluates three classification methods for supervised learn- ing: Support Vector Machines (SVM) with several kernels, Nearest Neighbor (k-NN) and Decision Tree (DT). The methods were evalu- ated based on the factors accuracy, precision, recall and time.

The experiments were conducted on artificial data created to represent a variation of distributions with a limitation of only 2 features and 3 classes. Different distributions of data were chosen to challenge each classification method. The results show that the measurements for ac- curacy and time vary considerably for the different distributed dataset. SVM with RBF kernel performed better for accuracy in comparison to the other classification methods. k-NN scored slightly lower accuracy values than SVM with RBF kernel in general, but performed better on the challenging dataset. DT is the less time consuming algorithm and was significally faster than the other classification methods. The only method that could compete with DT on time was k-NN that was faster than DT for the dataset with small spread and coinciding classes.

Although a clear trend can be seen in the results the area needs to be studied further to draw a comprehensive conclusion due to limitation of the artificially generated datasets in this study. 

Abstract [sv]

Med växande data och tillgänglighet ökar intresset och användning- en för maskininlärning, tillsammans med behovet för klassificering. Klassificering är en viktig metod inom maskininlärning för att förenk- la data och göra förutstägelser.

Denna rapport utvärderar tre klassificeringsmetoder för övervakad in- lärning: Stödvektormaskiner (SVM) med olika kärnor, Närmaste Gran- ne (k-NN) och Beslutsträd (DT). Metoderna utvärderades baserat på nogrannhet, precision, återkallelse och tid.

Experimenten utfördes på artificiell data skapad för att representera en variation av fördelningar med en begränsning av endast 2 egenskaper och 3 klasser. Resultaten visar att mätningarna för noggrannhet och tid varierar avsevärt för olika variationer av dataset. SVM med RBF-kärna gav generellt högre värden för noggrannhet i jämförelse med de and- ra klassificeringsmetoderna. k-NN visade något lägre noggrannhet än SVM med RBF-kärna i allmänhet, men presterade bättre på det mest utmanande datasetet. DT är den minst tidskrävande algoritmen och var signifikant snabbare än de andra klassificeringsmetoderna. Den enda metoden som kunde konkurrera med DT i tid var SVM med k- NN som var snabbare än DT för det dataset som hade liten spridning och sammanfallande klasser.

Även om en tydlig trend kan ses i resultaten behöver området studeras ytterligare för att dra en omfattande slutsats på grund av begränsning av dataset i denna studie. 

Place, publisher, year, edition, pages
2017.
Series
Kandidatexjobb CSC
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-209119OAI: oai:DiVA.org:kth-209119DiVA: diva2:1110023
Subject / course
Computer Science
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
Available from: 2017-06-17 Created: 2017-06-15 Last updated: 2017-06-17Bibliographically approved

Open Access in DiVA

fulltext(3028 kB)31 downloads
File information
File name FULLTEXT01.pdfFile size 3028 kBChecksum SHA-512
8960121bc6c07b45fbd2951e3665e90b04559a861827d9db6585e2c5dc95abdf8800f4062849feff58bf7688ed8153cb62159d9ad45ecb7e3d43a68569f8dca6
Type fulltextMimetype application/pdf

By organisation
School of Computer Science and Communication (CSC)
Computer Science

Search outside of DiVA

GoogleGoogle Scholar
Total: 31 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 155 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf