Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
MLID: A multilabelextension of the ID3 algorithm
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
2016 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

AbstractMachine learning is a subfield within artificial intelligence that revolves around constructingalgorithms that can learn from, and make predictions on data. Instead of following strict andstatic instruction, the system operates by adapting and learning from input data in order tomake predictions and decisions. This work will focus on a subcategory of machine learningcalled “MultilabelClassification”, which is the concept of where items introduced to thesystem is categorized by an analytical model, learned through supervised learning, whereeach instance of the dataset can belong to multiple labels, or classes.This paper presents the task of implementing a multilabelclassifier based on the ID3algorithm, which we call MLID (MultilabelIterative Dichotomiser). The solution is presentedboth in a sequentially executed version as well as an parallelized one.We also presents acomparison based on accuracy and execution time, that is performed against algorithms of asimilar nature in order to evaluate the viability of using ID3 as a base to further expand andbuild upon in regards of multi label classification.In order to evaluate the performance of the MLID algorithm, we have measured theexecution time, accuracy, and made a summarization of precision and recall into what iscalled Fmeasure,which is the harmonic mean of both precision and sensitivity of thealgorithm. These results are then compared to already defined and established algorithms,on a range of datasets of varying sizes, in order to assess the viability of the MLID algorithm.The results produced when comparing MLID against other multilabelalgorithms such asBinary relevance, Classifier Chains and Random Trees shows that MLID can compete withother classifiers in term of accuracy and Fmeasure,but in terms of training the algorithm,the time required is proven inferior. Through these results, we can conclude that MLID is aviable option to use as a multilabelclassifier. Although, some constraints inherited from theoriginal ID3 algorithm does impede the full utility of the algorithm, we are certain thatfollowing the same path of development and improvement as ID3 experienced would allowMLID to develop towards a suitable choice of algorithm for a diverse range of multilabelclassification problems.

Place, publisher, year, edition, pages
2016. , 17 p.
Keyword [en]
ID3, Multilabel, Machine learning
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-13667OAI: oai:DiVA.org:bth-13667DiVA: diva2:1059861
Subject / course
PA1418 Bachelor's Thesis - Large Team Software Engineering Project; PA1418 Bachelor's Thesis - Large Team Software Engineering Project
Educational program
PAGPT Software Engineering; PAGSE Software Engineering
Supervisors
Examiners
Available from: 2017-05-29 Created: 2016-12-25 Last updated: 2017-05-29Bibliographically approved

Open Access in DiVA

BTH2016Starefors(280 kB)29 downloads
File information
File name FULLTEXT02.pdfFile size 280 kBChecksum SHA-512
01d051a8ab1844663517dcaca4be2d4bad33aa5945cf7cf87e0847af56420fd95a7820c9523b5bcb6923ad62d0310495d744d8b7ceb610e0738fbf7bdcce29cc
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Starefors, HenrikPersson, Rasmus
By organisation
Department of Software Engineering
Software Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 29 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 38 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf