Improving Energy Efficiency with Special-Purpose Accelerators
Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Computer and Information Science.
2013 (English). Master's thesis (Student thesis)
Abstract [en]

The number of transistors per chip and their speed grow exponentially, but the power dissipation per transistor decreases only slightly with each process generation. This leads to increased power density and heat generation, meaning that only a fraction of the chip can be active at any given time. To attack this problem, heterogeneous systems-on-chip are being developed. They consist of multiple specialized cores, each optimized to perform a particular set of tasks. Delegating parts of the application to specific, energy-efficient cores allows more computations to execute within the given power budget, increasing the overall performance of the system. This thesis proposes a methodology for developing a special-purpose accelerator for a given application in order to create an energy-efficient heterogeneous system-on-chip based on the Xilinx Zynq platform. It introduces the Xilinx tool suite used during development and defines the complete design workflow for implementing the accelerator and running the application on the accelerated system. It also evaluates the optimization techniques that lead to the most energy-efficient implementation. The simulations show that pipelining, separate ports for reading and writing data, and a small, fast local memory improve the performance of the accelerator by a factor of 44.4 and its energy efficiency by a factor of 379. The accelerator is physically implemented on the Xilinx Zynq SoC and acts as a co-processor for the ARM CPU available on the system. The thesis further proposes a methodology for measuring the physical power consumption and performance of various configurations of the system. For the given application, the system with the accelerator running at 125 MHz is 1.5x faster and 2.15x more energy-efficient than the application executing only on the CPU at 666 MHz. If the clock frequencies are matched at 100 MHz, the accelerated system is 3.6x faster and 3x more energy-efficient.
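
The speedup and energy-efficiency figures above together imply how average power changed. The following is a minimal sketch of that arithmetic, not a calculation taken from the thesis. It assumes that energy equals average power times run time for a fixed workload, and that "N x more energy-efficient" means N times less energy for that workload; the thesis may define the metric differently. The function name and dictionary keys are hypothetical.

```python
def implied_power_ratio(speedup: float, energy_gain: float) -> float:
    """Return baseline average power divided by improved average power.

    Assumption (not from the abstract): E = P_avg * T for a fixed workload,
    so P_base / P_new = (E_base / E_new) / (T_base / T_new).
    """
    if speedup <= 0 or energy_gain <= 0:
        raise ValueError("ratios must be positive")
    return energy_gain / speedup


# (speedup, energy-efficiency gain) pairs quoted in the abstract.
reported = {
    "simulated accelerator optimizations": (44.4, 379.0),
    "accelerator at 125 MHz vs CPU at 666 MHz": (1.5, 2.15),
    "both clocks matched at 100 MHz": (3.6, 3.0),
}

for label, (speedup, energy_gain) in reported.items():
    ratio = implied_power_ratio(speedup, energy_gain)
    print(f"{label}: baseline/improved average power = {ratio:.2f}")
```

Under these assumptions the ratios come out at roughly 8.5, 1.43, and 0.83. A value below 1 in the matched-clock case would mean the accelerated system draws more average power than the CPU-only system, and that its energy advantage comes entirely from finishing sooner.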

Place, publisher, year, edition, pages
Institutt for datateknikk og informasjonsvitenskap, 2013. 77 p.
URN: urn:nbn:no:ntnu:diva-23451
Local ID: ntnudaim:8953
OAI: diva2:664087
Available from: 2013-11-13
Created: 2013-11-13
Last updated: 2013-11-13
Bibliographically approved
