Efficient reduction over threads
KTH, School of Engineering Sciences (SCI), Theoretical Physics.
2011 (English). Independent thesis, Basic level (university diploma), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The increasing number of cores in both desktops and servers creates a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element-wise. Several reduce algorithms are evaluated in terms of performance and scalability, and a novel algorithm is introduced that takes advantage of shared memory and exploits load imbalance. To do so, the concept of dynamic pair generation is introduced: a binary reduce tree is constructed dynamically, based on the order in which threads arrive, and pairs are formed in a lock-free manner. We conclude that the dynamic algorithm, given enough spread in the arrival times, can outperform the reference algorithms for some or all array sizes.
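
The idea of pairing threads in arrival order can be illustrated with a minimal C++ sketch. It is not the thesis implementation: the names `DynamicReducer`, `arrive`, `reduce_sum`, the single atomic waiting slot, and the use of addition as the binary operation are all assumptions made for illustration. A thread that finds the slot empty publishes its partial array and leaves; a thread that finds a waiting partner claims it with a compare-and-swap, merges the partner's array into its own element-wise, and continues until its array covers all inputs.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Illustrative sketch only; all names here are hypothetical.
constexpr int kEmpty = -1;

struct DynamicReducer {
    std::vector<std::vector<long>> data;  // data[i]: partial array of thread i
    std::vector<int> merged;              // number of inputs data[i] represents
    std::atomic<int> slot{kEmpty};        // id of the thread waiting for a partner
    std::atomic<int> winner{kEmpty};      // id of the thread holding the result

    explicit DynamicReducer(std::vector<std::vector<long>> inputs)
        : data(std::move(inputs)), merged(data.size(), 1) {}

    // Called by thread `id` once its local array is ready.
    void arrive(int id) {
        const int n = static_cast<int>(data.size());
        while (merged[id] < n) {
            int seen = kEmpty;
            // Nobody is waiting: publish our partial array and leave.
            if (slot.compare_exchange_strong(seen, id)) return;
            // Thread `seen` is waiting: try to claim it as our partner.
            if (slot.compare_exchange_strong(seen, kEmpty)) {
                for (std::size_t k = 0; k < data[id].size(); ++k)
                    data[id][k] += data[seen][k];  // element-wise binary op
                merged[id] += merged[seen];
            }
            // If the claim failed, another thread took the partner; retry.
        }
        winner.store(id);
    }
};

// Element-wise sum of equally sized arrays, one thread per input array.
std::vector<long> reduce_sum(std::vector<std::vector<long>> inputs) {
    if (inputs.empty()) return {};
    DynamicReducer r(std::move(inputs));
    std::vector<std::thread> threads;
    for (int id = 0; id < static_cast<int>(r.data.size()); ++id)
        threads.emplace_back([&r, id] { r.arrive(id); });
    for (auto& t : threads) t.join();
    return r.data[r.winner.load()];
}
```

In this sketch the shape of the reduce tree is not fixed in advance: a thread that arrives late simply merges with whichever partial result is waiting at that moment, so early threads are not held up by a predetermined partner. Each thread parks at most once, so the id stored in the slot is never reused and the compare-and-swap is not exposed to the ABA problem.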

Place, publisher, year, edition, pages
2011. 43 p.
Trita-FYS, ISSN 0280-316X ; 57
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-49818
OAI: diva2:460393
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2011-11-30 Created: 2011-11-30 Last updated: 2011-11-30 Bibliographically approved

Open Access in DiVA

fulltext (2413 kB)