Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, The Institute of Technology.
2014 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis we describe techniques for code generation and global optimization for a PRAM-NUMA multicore architecture. We focus specifically on the REPLICA architecture, a family of massively multithreaded very long instruction word (VLIW) chip multiprocessors with chained functional units and a reconfigurable emulated shared on-chip memory. The on-chip memory system supports two execution modes, PRAM and NUMA, which can be switched between at run-time. PRAM mode is considered the standard execution mode and mainly targets applications with very high thread-level parallelism (TLP). In contrast, NUMA mode is optimized for sequential legacy applications and applications with a low amount of TLP. Different versions of the REPLICA architecture have different numbers of cores, hardware threads and functional units. In order to utilize the REPLICA architecture efficiently, we have made several contributions to the development of a compiler for REPLICA target code generation. The compiler supports code generation for both PRAM mode and NUMA mode and can generate code for different versions of the processor pipeline (i.e., for different numbers of functional units). It includes optimization phases to increase the utilization of the available functional units. We have also contributed to the quantitative evaluation of PRAM and NUMA mode. The results show that PRAM mode often suits programs with irregular memory access patterns and control flow best, while NUMA mode suits regular programs better. However, for a particular program it is not always obvious which mode, PRAM or NUMA, will show the best performance. To tackle this, we contributed a case study for generic stencil computations, using machine-learning-derived cost models to automatically select at run-time which mode to execute in. We extended this to also include a sequence of kernels.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2014. 101 p.
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 1688
Keyword [en]
PRAM; NUMA; multicore; reconfigurable; code generation; optimized composition;
National Category
Computer and Information Science
URN: urn:nbn:se:liu:diva-111333
DOI: 10.3384/lic.diva-111333
ISBN: 978-91-7519-189-8 (print)
OAI: diva2:761347
2014-12-16, Alan Turing, Hus E, Campus Valla, Linköpings universitet, Linköping, 13:15 (English)
Available from: 2014-11-17 Created: 2014-10-14 Last updated: 2014-11-18
Bibliographically approved

Open Access in DiVA

Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture (5381 kB), 370 downloads
File information
File name: FULLTEXT02.pdf
File size: 5381 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

Cover (omslag) (1014 kB), 19 downloads
File information
File name: COVER01.pdf
File size: 1014 kB
Checksum: SHA-512
Type: cover
Mimetype: application/pdf

By author/editor
Hansson, Erik