Early evaluation of branches via decoupled access-execute to enable super-block optimizations
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Modern CPUs rely on expensive branch predictors to speed up execution. Prediction nevertheless implies speculation, which is inherently costly: mispredictions and the re-execution of instructions not only slow down execution but also require extra energy. From the compiler's perspective, the presence of branches complicates static analysis and hinders compile-time optimizations. This work evaluates a software-only technique that removes branches and builds super-blocks, thus enabling more powerful compile-time optimizations without hardware support for dynamic branch prediction. Our technique eliminates branches and builds larger basic blocks using the Decoupled Access-Execute approach. Selected branches are hoisted and evaluated early in a so-called Access phase. If all branches are taken (or not taken, respectively), a simplified version of the code is run in which these branches have been safely removed. Otherwise, the original version of the code is run. The end goal of this transformation is to enable optimizations on the simplified version. Within the scope of this thesis, we evaluated the benchmarks without enabling any additional optimizations and observed performance improvements in two out of eight benchmarks and performance penalties ranging from 4% to 27% on the remaining six. Based on these promising results, we expect the optimizations triggered on the super-blocks to hide the small overhead and lead to significant performance improvements.
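The control flow described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which operates at the compiler level; Python is used here only for readability, and the function names (`original`, `access_phase`, `simplified`, `run`), the loop body, and the `threshold` parameter are all hypothetical. The sketch covers only the "all branches taken" case.

```python
def original(chunk, threshold):
    """Original code: one data-dependent branch per element."""
    total = 0
    for x in chunk:
        if x > threshold:   # branch that the transformation targets
            total += x * 2
        else:
            total -= x
    return total


def access_phase(chunk, threshold):
    """Hoisted branch conditions, evaluated early for the whole chunk.

    Returns True only if every branch in the chunk would be taken.
    """
    return all(x > threshold for x in chunk)


def simplified(chunk):
    """Branch-free version; valid only when access_phase() returned True."""
    total = 0
    for x in chunk:
        total += x * 2      # the taken path, with the branch removed
    return total


def run(chunk, threshold):
    """Dispatch: simplified version if all branches agree, else original."""
    if access_phase(chunk, threshold):
        return simplified(chunk)
    return original(chunk, threshold)
```

In this sketch, `run` always returns the same value as `original`; the difference is that `simplified` forms a single straight-line block on which a compiler could apply further optimizations, while the Access phase is the overhead paid up front.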
Place, publisher, year, edition, pages
2016. 40 p.
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-304758
OAI: oai:DiVA.org:uu-304758
DiVA: diva2:1033917
Bachelor Programme in Computer Science
Kaxiras, Stefanos
Gällmo, Olle