A sliding window BIRCH algorithm with performance evaluations
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

A growing number of applications across various fields generate transactional or other time-stamped data, all of which are time series data. Time series data mining is a popular topic in the data mining field, and it poses challenges for improving the accuracy and efficiency of algorithms that operate on such data. Time series data are dynamic, large-scale and highly complex, which makes it difficult to discover patterns in them with common methods designed for static data. BIRCH, a hierarchical clustering method, was proposed to address the problems of large datasets; it minimizes I/O and time costs. While it runs, BIRCH builds a CF (clustering feature) tree, and the clusters are produced after the four phases of the full BIRCH procedure. A drawback of BIRCH is that it is not very scalable. This thesis aims to improve the accuracy and efficiency of the BIRCH algorithm. A sliding window BIRCH (SW BIRCH) algorithm is implemented on the basis of BIRCH. At the end of the thesis, the accuracy and efficiency of SW BIRCH are evaluated, and a performance comparison among SW BIRCH, BIRCH and K-means is presented using the Silhouette Coefficient and the Calinski-Harabasz index. The preliminary results indicate that SW BIRCH may achieve better performance than BIRCH in some cases.
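The abstract does not describe how the sliding window is combined with BIRCH, so the following is only a rough sketch of one way the idea could work, not the thesis's algorithm. It relies on the standard BIRCH clustering feature, the triple (N, LS, SS) of point count, linear sum and sum of squared norms, which is additive and can therefore also have expired points subtracted from it. The class name `SlidingCF` and its parameters `dim` and `window` are hypothetical and chosen for illustration.

```python
from collections import deque
from math import sqrt


class SlidingCF:
    """Hypothetical clustering feature (N, LS, SS) over the last `window` points."""

    def __init__(self, dim, window):
        self.window = window
        self.points = deque()      # raw points still inside the window
        self.n = 0                 # N: number of points summarised
        self.ls = [0.0] * dim      # LS: per-dimension linear sum
        self.ss = 0.0              # SS: sum of squared norms

    def add(self, point):
        # Absorb the new point into the summary (CF additivity).
        self.points.append(point)
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)
        # Once the window overflows, subtract the oldest point again.
        if self.n > self.window:
            old = self.points.popleft()
            self.n -= 1
            self.ls = [a - b for a, b in zip(self.ls, old)]
            self.ss -= sum(x * x for x in old)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root-mean-square distance of the summarised points from the centroid.
        c = self.centroid()
        variance = self.ss / self.n - sum(x * x for x in c)
        return sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


cf = SlidingCF(dim=2, window=3)
for p in [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]:
    cf.add(p)
print(cf.n, cf.centroid(), cf.radius())
```

In this sketch the summary tracks only the three most recent points, so the first point no longer influences the centroid or the radius. A full BIRCH implementation would keep many such summaries in the nodes of a CF tree; that part is omitted here.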

Place, publisher, year, edition, pages
2017. p. 70
Keyword [en]
Clustering, time series data
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:miun:diva-32397
Local ID: DT-H16-A2-001
OAI: oai:DiVA.org:miun-32397
DiVA, id: diva2:1164506
Subject / course
Computer Engineering DT1
Supervisors
Examiners
Available from: 2017-12-11
Created: 2017-12-11
Last updated: 2017-12-11
Bibliographically approved

Open Access in DiVA

fulltext (1724 kB), 17 downloads
File information
File name: FULLTEXT01.pdf
File size: 1724 kB
Checksum (SHA-512): 46775b77521c7b38bf942c73d4242a9e31546c7e2a59deca1f053b40984e5688d27c21b9190c3dabd5c9c8697df72da341f775fa354a7ecb9743ef97d7c1607d
Type: fulltext
Mimetype: application/pdf

Author
Li, Chuhe