Dynamic and speculative polyhedral parallelization of loop nests using binary code patterns
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
2013 (English) In: ICCS 2013, 2013, pp. 2575-2578. Conference paper (Refereed)
Abstract [en]

Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile time due to the use of dynamic data and control structures. Speculation is also motivated by the need to adapt the code to the current execution context, by selecting an efficient parallel schedule at run time. However, since this parallelization scheme requires on-the-fly semantics verification, it is in general difficult to perform advanced transformations for optimization and parallelism extraction. We propose a framework dedicated to speculative parallelization of scientific nested loop kernels, able to transform the code at runtime by re-scheduling the iterations to exhibit parallelism and data locality. The run-time process includes a transformation selection guided by profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are interpolated to build a predictor of the forthcoming accesses. The collected addresses are also used to compute on-the-fly dependence distance vectors by tracking accesses to common addresses. Interpolating functions and distance vectors are then employed in dynamic dependence analysis and in selecting a parallelizing transformation that, if the prediction is correct, does not induce any rollback during execution. To keep the rollback time overhead low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be either a parallelized version, a sequential original version, or an instrumented version. Moreover, such slicing of the execution provides the opportunity to transform the code differently to adapt to the observed execution phases. Parallel code generation is achieved almost at no cost by using binary code patterns that are generated at compile time and simply patched at run time to produce the transformed code.
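The profiling phase described above, which interpolates traced addresses into access functions and derives dependence distance vectors from accesses to common addresses, can be illustrated with a minimal sketch. This is a toy reduced to a single loop index, and all function names are hypothetical; the actual framework operates on multi-dimensional iteration vectors at the binary level.

```python
# Hypothetical sketch: fit an affine access function addr = base + stride*i
# from traced (iteration, address) samples, and collect dependence distances
# between writes and reads hitting the same address. Single-index toy model.

def interpolate_affine(samples):
    """Return (base, stride) if the samples follow an affine pattern, else None."""
    (i0, a0), (i1, a1) = samples[0], samples[1]
    stride = (a1 - a0) // (i1 - i0)
    base = a0 - stride * i0
    for i, a in samples:
        if base + stride * i != a:
            return None  # accesses are not affine; speculation would not apply
    return base, stride

def distance_vectors(writes, reads):
    """Distances (reader iteration - writer iteration) for common addresses."""
    last_write = {}
    for i, addr in writes:
        last_write[addr] = i
    dists = set()
    for i, addr in reads:
        if addr in last_write:
            dists.add(i - last_write[addr])
    return dists
```

For instance, a loop writing `A[i]` and reading `A[i-2]` yields the single distance 2, which would allow a parallelizing transformation that respects that dependence.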
The framework has been implemented with extensions of the LLVM compiler and an x86-64 runtime system. Significant speed-ups are shown on a set of benchmarks that could not have been handled efficiently by a compiler.
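The sliced execution scheme from the abstract, where each chunk of the outermost loop runs as an instrumented, parallel, or sequential version, can be sketched roughly as follows. The chunk size, function names, and the misprediction model are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of sliced speculative execution: the outermost loop
# runs in fixed-size chunks; a misspeculated chunk is replayed sequentially
# and triggers re-profiling. Names and the failure model are assumptions.

SLICE = 64  # iterations per chunk (arbitrary choice for this sketch)

def run_instrumented(lo, hi, state):
    """Profile a short sample; return a predictor of forthcoming accesses."""
    return "affine-predictor"  # stand-in for interpolated access functions

def run_parallel(lo, hi, state, predictor):
    """Speculatively run the transformed parallel version; False on a
    prediction violation (toy model: chunks in state["mispredicted"] fail)."""
    return lo not in state.get("mispredicted", set())

def run_sequential(lo, hi, state):
    """Replay the chunk with the original sequential code (safe fallback)."""
    pass

def run(n, state):
    i, predictor, log = 0, None, []
    while i < n:
        end = min(i + SLICE, n)
        if predictor is None:
            predictor = run_instrumented(i, end, state)
            log.append("instrumented")
        elif run_parallel(i, end, state, predictor):
            log.append("parallel")
        else:
            run_sequential(i, end, state)  # rollback path: sequential replay
            log.append("sequential")
            predictor = None               # re-profile on the next chunk
        i = end
    return log
```

Running four chunks with one injected misprediction shows the intended behavior: profile, speculate, fall back sequentially on violation, then re-profile before speculating again.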

Place, publisher, year, edition, pages
2013. pp. 2575-2578.
Series: Procedia Computer Science, ISSN 1877-0509 ; 18
National Category
Computer Science
URN: urn:nbn:se:uu:diva-206798
DOI: 10.1016/j.procs.2013.05.443
ISI: 000321051200280
OAI: diva2:645481
International Conference on Computational Science (ICCS) 2013
Available from: 2013-06-01 Created: 2013-09-04 Last updated: 2013-09-04 Bibliographically approved

Open Access in DiVA

JimboreanICCS13.pdf (206 kB, application/pdf)

By author/editor: Jimborean, Alexandra
By organisation: Computing Science