Parallel Reduction from Block Hessenberg to Hessenberg using MPI
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
In many scientific applications, eigenvalues of a matrix have to be computed. By first reducing a matrix from fully dense to Hessenberg form, eigenvalue computations with the QR algorithm become more effcient. Previous work on shared memory architectures has shown that the Hessenberg reduction is in some cases most effcient when performed in two stages: First reduce the matrix to block Hessenberg form, then reduce the block Hessenberg matrix to true Hessenberg form. This Thesis concerns the adaptation of an existing parallel reduction algorithm implementing the second stage to a distributed memory architecture. Two algorithms were designed using the Message Passing Interface (MPI) for communication between processes. The algorithms have been evaluated by an analyze of trace and run-times for different problem sizes and processes. Results show that the two adaptations are not effcient compared to a shared memory algorithm, but possibilities for further improvement have been identified. We found that an uneven distribution of work, a large sequential part, and significant communication overhead are the main bottlenecks in the distributed algorithms. Suggested further improvements are dynamic load balancing, sequential computation overlap, and hidden communication.
Place, publisher, year, edition, pages
, UMNAD, 949
Engineering and Technology
IdentifiersURN: urn:nbn:se:umu:diva-80687OAI: oai:DiVA.org:umu-80687DiVA: diva2:650907
Master of Science Programme in Computing Science and Engineering