Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Analyzing the ability of Naive-Bayes and Label Spreading to predict labels with varying quantities of training data: Classifier Evaluation
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2016 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesisAlternative title
Prestandaanalys av två metoder inom semisupervised och supervised maskininlärning (Swedish)
Abstract [en]

A study was performed on Naive-Bayes and Label Spread- ing methods applied in a spam filter as classifiers. In the testing procedure their ability to predict was observed and the results were compared in a McNemar test; leading to the discovery of the strengths and weaknesses of the chosen methods in a environment of varying training data. Though the results were inconclusive due to resource restrictions, the theory is discussed from various angles in order to pro- vide a better understanding of the conditions that can lead to potentially different results between the chosen meth- ods; opening up for improvement and further studies. The conclusion made of this study is that a significant differ- ence exists in terms of ability to predict labels between the two classifiers. On a secondary note it is recommended to choose a classifier depending on available training data and computational power. 

Abstract [sv]

En studie utfördes på klassifieringsmetoderna Naive-Bayes och Label Spreading applicerade i ett spam filter. Meto- dernas förmåga att predicera observerades och resultaten jämfördes i ett McNemar test, vilket ledde till upptäckten av styrkorna och svagheterna av de valda metoderna i en miljö med varierande träningsdata. Fastän resultaten var ofullständiga på grund av bristfälliga resurser, så diskute- ras den bakomliggande teorin utifrån flera vinklar. Denna diskussion har målet att ge en bättre förståelse kring de bakomliggande förutsättningarna som kan leda till poten- tiellt annorlunda resultat för de valda metoderna. Vidare öppnar detta möjligheter för förbättringar och framtida stu- dier. Slutsatsen som dras av denna studie är att signifikanta skillnader existerar i förmågan att kunna predicera klasser mellan de två valda klassifierarna. Den slutgiltiga rekom- mendationen blir att välja en klassifierare utifrån utbudet av träningsdata och tillgängligheten av datorkraft. 

Place, publisher, year, edition, pages
2016.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-188132OAI: oai:DiVA.org:kth-188132DiVA: diva2:933512
Subject / course
Computer Science
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
Available from: 2016-06-09 Created: 2016-06-06 Last updated: 2016-06-09Bibliographically approved

Open Access in DiVA

fulltext(528 kB)88 downloads
File information
File name FULLTEXT01.pdfFile size 528 kBChecksum SHA-512
05503089f3b9c2bcc1095ef09fdcdb61ad1768d9d34ff983904ea159a9eb38f82846f1c345e9c6a06a4dec01136058b63cf2c8cb1717adf5da82ef1e4829b520
Type fulltextMimetype application/pdf

By organisation
School of Computer Science and Communication (CSC)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 88 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 47 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf