Efficient K-means clustering and the importance of seeding
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2013 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Data clustering is the process of grouping data elements based on some aspect of similarity between the elements in the group. Clustering has many applications, such as data compression, data mining, pattern recognition, and machine learning, and there are many different clustering methods. This paper examines the k-means method of clustering and how the choice of initial seeding affects the result. Lloyd’s algorithm is used as a baseline and is compared to an improved algorithm utilizing kd-trees. Two different methods of seeding are compared: random seeding and partial clustering seeding.
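
As a rough illustration of the baseline named in the abstract, the following is a minimal sketch of Lloyd's algorithm started from random seeding, written in plain Python. It is not taken from the thesis: the function names (`squared_distance`, `random_seeding`, `lloyd`, `cost`), the `max_iter` cutoff, the handling of empty clusters, and the sample points are all assumptions made for this example. The kd-tree variant and partial clustering seeding are not shown.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def random_seeding(points, k, rng):
    """Random seeding (assumed form): pick k distinct input points as centers."""
    return [list(p) for p in rng.sample(points, k)]


def lloyd(points, centers, max_iter=100):
    """Iterate assignment and update steps until assignments stop changing."""
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assignment


def cost(points, centers, assignment):
    """Sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(p, centers[a]) for p, a in zip(points, assignment))


# Hypothetical sample data: two small, well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = random_seeding(points, 2, random.Random(0))
centers, assignment = lloyd(points, seeds)
print(centers, assignment, cost(points, centers, assignment))
```

Because Lloyd's iterations only converge to a local optimum, running the sketch with different seeds and comparing the value returned by `cost` is one simple way to observe how the choice of seeding can affect the result.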

Abstract [sv]

Clustering data means grouping data elements based on some kind of similarity between the grouped elements. Clustering has many different applications, such as data compression, data mining, pattern recognition, and machine learning, and there are many different clustering methods. This thesis examines the k-means clustering method and how the choice of starting values for the method affects the result. Lloyd's algorithm is used as a starting point and is compared with an improved algorithm that uses kd-trees. Two different methods of choosing starting values are compared: random selection of starting values and partial clustering.

Place, publisher, year, edition, pages
Kandidatexjobb CSC, K13021
National Category
Computer Science
URN: urn:nbn:se:kth:diva-134910
OAI: diva2:668713
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2013-12-13 Created: 2013-12-02 Last updated: 2013-12-13 Bibliographically approved

Open Access in DiVA

Efficient K-means clustering and the importance (430 kB), 199 downloads
File information
File name: FULLTEXT01.pdf
File size: 430 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

Total: 199 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 149 hits