Distributed Graph Mining: A study of performance advantages in distributed data mining paradigms when processing graphs using PageRank on a single node cluster
KTH, School of Computer Science and Communication (CSC).
2015 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Distributed data mining is a relatively new and steadily growing area within computer science, emerging from the need to gather and process distributed data using clusters. This report presents the properties of graph-structured data and the paradigms suited to processing this data type efficiently, based on comprehensive theoretical studies applied to practical tests performed on a single-node cluster. The results showcase various performance aspects of processing graph data with different open-source paradigm frameworks and different numbers of input shards. One conclusion to be drawn from this study is that there are no real performance advantages to using distributed data mining paradigms specifically developed for graph data on single machines.

National Category
Computer Science
URN: urn:nbn:se:kth:diva-166449
OAI: diva2:811098
Available from: 2015-05-12. Created: 2015-05-10. Last updated: 2015-05-12. Bibliographically approved.

