Change search
ReferencesLink to record
Permanent link

Direct link
Optimizing Strassen's multiplication algorithm for modern processors: A study in optimizing matrix multiplications for large matrices on modern CPUs
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2016 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

This paper examines how to write code to gain high performance on modern computers as well as the importance of well planned data structures. The experiments were run on a computer with an Intel i5-5200U CPU, 8GB of RAM running Linux Mint 17.For the measurements Winograd's variant of Strassen's matrix multiplication algorithm was implemented and eventually compared to Intel's math kernel library (MKL). A quadtree data structure was implemented to ensure good cache locality. Loop unrolling and tiling was combined to improve cache performance on both L1 and L2 cache taking into regard the out of order behavior of modern CPUs. Compiler hints were partially used but a large part of the time critical code was written in pure assembler. Measurements of the speed performance of both floats and doubles were performed and substantial differences in running times were found.While there was a substantial difference between the best implementation and MKL for both doubles and floats at smaller sizes, a difference of only 1\% in execution time was achieved for floats at the size of 2^14. This was achieved without any specific tuning and could be expected to be improved if more time was spent on the implementation.

Place, publisher, year, edition, pages
National Category
Computer Science
URN: urn:nbn:se:kth:diva-186418OAI: diva2:927258
Subject / course
Computer Science
Educational program
Bachelor of Science in Engineering - Computer Engineering
Available from: 2016-05-18 Created: 2016-05-11 Last updated: 2016-05-18Bibliographically approved

Open Access in DiVA

fulltext(672 kB)46 downloads
File information
File name FULLTEXT01.pdfFile size 672 kBChecksum SHA-512
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Welin-Berger, RobertBäckström, Anton
By organisation
School of Computer Science and Communication (CSC)
Computer Science

Search outside of DiVA

GoogleGoogle Scholar
Total: 46 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 31 hits
ReferencesLink to record
Permanent link

Direct link