Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Towards Power Efficiency on Task-Based, Decoupled Access-Execute Models
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. (Uppsala Architecture Research Team)ORCID iD: 0000-0002-9460-1290
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. (Uppsala Architecture Research Team)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. (Uppsala Architecture Research Team)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. (Uppsala Architecture Research Team)
2013 (English)In: PARMA 2013, 4th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, 2013Conference paper, Published paper (Refereed)
Abstract [en]

This work demonstrates the potential of hardware and software optimization to improve theeffectiveness of dynamic voltage and frequency scaling (DVFS). For software, we decouple data prefetch (access) and computation (execute) to enable optimal DVFS selectionfor each phase. For hardware, we use measurements from state-of-the-art multicore processors to accurately model the potential of per-core, zero-latency DVFS. We demonstrate that the combinationof decoupled access-execute and precise DVFS has the potential to decrease EDP by 25-30% without reducing performance.

The underlying insight in this work is that by decoupling access and execute we can take advantageof the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency. For the memory-bound access phase, where we prefetch data into the cachefrom main memory, we can run at a reduced frequency and voltage without hurting performance. Thereafter, the execute phase can run much faster, thanks to the prefetching of the access phase, and achieve higher performance. This decoupled program behavior allows us to achieve more effective use of DVFS than standard coupled executions which mix data access and compute.

To understand the potential of this approach, we measure application performance and power consumption on a modern multicore system across a range of frequencies and voltages. From this data we build a model that allows us to analyze the effects of per-core, zero-latency DVFS. The results of this work demonstrate the significant potential for finer-grain DVFS in combination with DVFS-optimized software.

Place, publisher, year, edition, pages
2013.
Keyword [en]
DVFS, energy, decoupled execution, performance, task-based execution
National Category
Computer Systems
Research subject
Computer Systems
Identifiers
URN: urn:nbn:se:uu:diva-203249OAI: oai:DiVA.org:uu-203249DiVA: diva2:635874
Conference
PARMA 2013, 4th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, Berlin, Germany, January 23, 2013
Available from: 2013-07-06 Created: 2013-07-06 Last updated: 2013-07-09Bibliographically approved

Open Access in DiVA

fulltext(479 kB)