Enhanced Similarity Matching by Grouping of Features
Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Computer and Information Science.
2012 (English). Master's thesis (student thesis)
Abstract [en]

In this report we introduce a classification system named Grouping of Features (GoF), together with a theoretical exploration of some of the important concepts in the field of Instance-Based Learning (IBL) that relate to this system. The GoF system groups a dataset's original features into abstract features. Each of these groups may capture inherent structures in one of the classes in the data. A genetic algorithm is used to extract a tree of such groups that can be used to measure similarity between samples. As each class may have different inherent structures, different trees of groups are found for the different classes. To adjust the importance of a group with regard to the classifier, the concept of the power average is used. A group's power average may let either the smallest or the largest value in the group dominate, or take any value in between. Tests show that the GoF system outperforms kNN on many classification tasks.

The system started as a research project by Verdande Technology, and a set of algorithms had been fully or partially implemented before the start of this thesis project. No documentation existed, however, so we have built an understanding of the fields on which the system relies, analyzed their properties, documented this understanding in explicit method descriptions, and tested, modified and extended the original system.

During this project we found that scaling or weighting features, either as a data pre-processing step or during classification, is often crucial for the performance of the classification algorithm. Our hypothesis was then that by letting the weights vary between features and between groups of features, more complex structures could be captured. This would also make the classifier less dependent on how the features are originally scaled. We therefore implemented Weighted Grouping of Features (WGoF), an extension of the GoF system.
Notable results in this thesis include 95.48 percent and 100.00 percent correct classification on the non-scaled UCI Wine dataset using the GoF and WGoF systems, respectively.
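The abstract does not give the formula for the power average, so the following is only a minimal sketch, assuming it corresponds to the standard generalized (power) mean. The function name `power_mean`, the exponent `p`, and the sample values are hypothetical and are not taken from the thesis. The sketch illustrates how one exponent can move a group's aggregate between its smallest and its largest member value.

```python
import math


def power_mean(values, p):
    """Generalized (power) mean of strictly positive values.

    Assumption: this is the textbook power mean, which may differ from
    the exact definition used in the GoF system.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    if p == math.inf:
        # Limit p -> +infinity: the largest value dominates completely.
        return max(values)
    if p == -math.inf:
        # Limit p -> -infinity: the smallest value dominates completely.
        return min(values)
    if p == 0:
        # Limit p -> 0: the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


# Hypothetical per-feature similarity scores inside one group.
group_scores = [0.2, 0.5, 0.9]

for p in (-math.inf, -10, 1, 10, math.inf):
    print(f"p = {p}: {power_mean(group_scores, p):.3f}")
```

With a strongly negative exponent the result stays close to the smallest score, with `p = 1` it is the ordinary arithmetic mean, and with a strongly positive exponent it approaches the largest score. Under this assumption, tuning the exponent per group is one way a classifier could control how much a single weak or strong feature match affects the group's contribution.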

Place, publisher, year, edition, pages
Institutt for datateknikk og informasjonsvitenskap, 2012. 106 p.
Keyword [no]
ntnudaim:6964, MTDT datateknikk, Intelligente systemer
URN: urn:nbn:no:ntnu:diva-20114
Local ID: ntnudaim:6964
OAI: diva2:603574
Available from: 2013-02-06 Created: 2013-02-06

Open Access in DiVA

fulltext (1493 kB), 179 downloads
File information
File name: FULLTEXT01.pdf
File size: 1493 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

cover (315 kB), 15 downloads
File information
File name: COVER01.pdf
File size: 315 kB
Checksum: SHA-512
Type: cover
Mimetype: application/pdf
