Digitala Vetenskapliga Arkivet

Spark for HPC: a comparison with MPI on compute-intensive applications using Monte Carlo method
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2018 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

With the emergence of various big data platforms in recent years, Apache Spark, a distributed large-scale computing platform, is perceived as a potential substitute for the Message Passing Interface (MPI) in High Performance Computing (HPC). Due to its limitations in fault tolerance, dynamic resource handling and ease of use, MPI, the dominant method for achieving parallel computing in HPC, is often associated with higher development time and costs in enterprises such as Scania IT. This thesis project examines Apache Spark as an alternative to MPI on HPC clusters and compares their performance in various aspects. The test results are obtained by running a compute-intensive application on both platforms to solve a Bayesian inference problem for an extended Lotka-Volterra model using particle Markov chain Monte Carlo methods. The tests confirm that Spark is superior in fault tolerance, dynamic resource handling and ease of use, while falling short of MPI in performance and resource consumption. Overall, Spark proves to be a promising alternative to MPI on HPC clusters. As a result, Scania IT continues to explore Spark on HPC clusters for use in different departments.

Place, publisher, year, edition, pages
2018. p. 65
Series
IT ; 18048
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-392311
OAI: oai:DiVA.org:uu-392311
DiVA, id: diva2:1347863
Educational program
Master Programme in Computer Science
Supervisors
Examiners
Available from: 2019-09-02 Created: 2019-09-02 Last updated: 2019-09-02 Bibliographically approved

Open Access in DiVA

fulltext (2947 kB), 2069 downloads
File information
File name: FULLTEXT01.pdf
File size: 2947 kB
Checksum (SHA-512): 541d3aa9088354c9aa3e5ec426deda879667d16b05780dad79b74db7cd078a30aad167d03c4f2fb2208beb70a84d58ef563d2db3b164137e791e8cf90fe21826
Type: fulltext
Mimetype: application/pdf
