Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Scalable Streaming Graph Partitioning
KTH, School of Information and Communication Technology (ICT).
2017 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Large-scale graph-structured datasets are growing at an increasing rate. Social network graphs are an example of these datasets. Processing large-scale graphstructured datasets are central to many applications ranging from telecommunication to biology and has led to the development of many parallel graph algorithms. Performance of parallel graph algorithms largely depends on how the underlying graph is partitioned.

In this work, we focus on studying streaming vertex-cut graph partitioning algorithms where partitioners receive a graph as a stream of vertices and edges and assign partitions to them on their arrival once and for all. Some of these algorithms maintain a state during partitioning. In some cases, the size of the state is so huge that it cannot be kept in a single machine memory. In many real world scenarios, several instances of a streaming graph partitioning algorithm are run simultaneously to improve the system throughput. However, running several instances of a partitioner drops the partitioning quality considerably due to the incomplete information of partitioners. Even frequently sharing states and its combination with buffering mechanisms does not completely solves the problem because of the heavy communication overhead produced by partitioners.

In this thesis, we propose an algorithm which tackles the problem of low scalability and performance of existing streaming graph partitioning algorithms by providing an efficient way of sharing states and its combination with windowing mechanism. We compare state-of-the-art streaming graph partitioning algorithms with our proposed solution concerning performance and efficiency.

Our solution combines a batch processing method with a shared-state mechanism to achieve both an outstanding performance and a high partitioning quality. Shared state mechanism is used for sharing states of partitioners. We provide a robust implementation of our method in a PowerGraph framework. Furthermore, we empirically evaluate the impact of partitioning quality on how graph algorithms perform in a real cloud environment.

The results show that our proposed method outperforms other algorithms in terms of partitioning quality and resource consumption and improves partitioning time considerably. On average our method improves partitioning time by 23%, decreases communication load by 15% and increase memory consumption by only 5% compared to the state-of-the-art streaming graph partitioning.

Place, publisher, year, edition, pages
2017. , p. 54
Series
TRITA-ICT-EX ; 2017:28
Keywords [en]
streaming graph, vertex-cut partitioning, graph partitioning, distributed hash table
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-206113OAI: oai:DiVA.org:kth-206113DiVA, id: diva2:1091194
Subject / course
Computer Science
Educational program
Master of Science - Software Engineering of Distributed Systems
Presentation
2017-02-09, KTH Electrum Building, Kistagången 16, Kista, 10:06 (English)
Supervisors
Examiners
Available from: 2017-04-26 Created: 2017-04-26 Last updated: 2017-04-26Bibliographically approved

Open Access in DiVA

fulltext(1066 kB)163 downloads
File information
File name FULLTEXT02.pdfFile size 1066 kBChecksum SHA-512
017fd7c989f85f3847961d471f2640df971ecc4ae7c006364ad0759d55c01c6f94d29c6e7a6a0ccd6be429cf58fc2e4f1554585b270e73c3fa18c8e44186bead
Type fulltextMimetype application/pdf

By organisation
School of Information and Communication Technology (ICT)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 163 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 561 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf