Cost-efficient mining techniques for data streams
2004 (English). In: Proceedings of the Second Australasian Information Security Workshop (AISW2004), the Australasian Workshop on Data Mining and Web Intelligence (DMWI2004), and the Australasian Workshop on Software Internationalisation (AWSI2004), Australian Computer Society, 2004, Vol. 32, 109-114 p. Conference paper (Refereed)
A data stream is a continuous, high-speed flow of data items. "High speed" means that the data rate is high relative to the available computational power. The increasing focus on applications that generate and receive data streams stimulates the need for online data stream analysis tools. Mining data streams is the real-time process of extracting interesting patterns from high-speed data streams. It raises new problems for the data mining community: how to mine continuous, high-speed data items that can be examined only once. In this paper, we propose algorithm output granularity as a solution for mining data streams. Algorithm output granularity is the amount of mining results that fits in main memory before any incremental integration. We show how the proposed strategy can be applied to build efficient clustering, frequent-items, and classification techniques. The empirical results for our clustering algorithm are presented and discussed; they demonstrate acceptable accuracy coupled with efficient running time.
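The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a one-pass, distance-threshold clustering of one-dimensional values in which the number of cluster centers held in memory is capped, and the two closest clusters are merged (standing in for "incremental integration") whenever the cap is exceeded. The names `Cluster`, `merge_closest`, `cluster_stream`, `threshold`, and `max_clusters` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    center: float  # running mean of the items absorbed so far
    weight: int    # number of items absorbed so far


def merge_closest(clusters):
    """Merge the two nearest clusters into one weighted-mean cluster (in place)."""
    clusters.sort(key=lambda c: c.center)
    # In one dimension the closest pair is always adjacent after sorting.
    i = min(range(len(clusters) - 1),
            key=lambda k: clusters[k + 1].center - clusters[k].center)
    a, b = clusters[i], clusters[i + 1]
    total = a.weight + b.weight
    merged = Cluster((a.center * a.weight + b.center * b.weight) / total, total)
    clusters[i:i + 2] = [merged]


def cluster_stream(stream, threshold, max_clusters):
    """Cluster a stream in one pass while keeping at most max_clusters results."""
    if max_clusters < 1:
        raise ValueError("max_clusters must be at least 1")
    clusters = []
    for x in stream:  # each item is examined exactly once
        nearest = min(clusters, key=lambda c: abs(c.center - x), default=None)
        if nearest is not None and abs(nearest.center - x) <= threshold:
            # Absorb the item and update the running mean incrementally.
            nearest.weight += 1
            nearest.center += (x - nearest.center) / nearest.weight
        else:
            clusters.append(Cluster(x, 1))
            # Assumed stand-in for incremental integration: shrink the
            # output back under the memory cap by merging close clusters.
            while len(clusters) > max_clusters:
                merge_closest(clusters)
    return clusters


if __name__ == "__main__":
    for c in cluster_stream([1.0, 1.2, 10.0, 10.4, 20.0],
                            threshold=1.0, max_clusters=2):
        print(c)
```

In this sketch `max_clusters` plays the role of the memory budget: raising it keeps finer-grained results at the cost of more memory, while lowering it forces earlier merging and coarser output.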
Place, publisher, year, edition, pages
Australian Computer Society, 2004. Vol. 32, 109-114 p.
ACM International Conference Proceeding Series, 54
Identifiers
URN: urn:nbn:se:ltu:diva-40369
Local ID: f7585900-ce9a-11dc-91eb-000ea68e967b
ISBN: 1-920682-14-7
OAI: oai:DiVA.org:ltu-40369
DiVA: diva2:1013891
Australasian Workshop on Data Mining and Web Intelligence : 17/12/2004
Created; 2004; 20080129 (ysko) 2016-10-03 2016-10-03