Programming Abstractions and Optimization Techniques for GPU-based Heterogeneous Systems
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering. (PELAB) ORCID iD: 0000-0001-8976-0484
2018 (English) Doctoral thesis, monograph (Other academic)
Abstract [en]

CPU/GPU heterogeneous systems have shown remarkable advantages in performance and energy consumption compared to homogeneous ones such as standard multi-core systems. Such heterogeneity represents one of the most promising trends for the near-future evolution of high performance computing hardware. However, as a double-edged sword, the heterogeneity also brings significant programming complexities that prevent the easy and efficient use of such heterogeneous systems. In this thesis, we are interested in four kinds of fundamental complexity associated with these heterogeneous systems: measurement complexity (the effort required to measure a metric, e.g., energy), CPU-GPU selection complexity, platform complexity, and data management complexity. We explore new low-cost programming abstractions to hide these complexities, and propose new optimization techniques that can be performed under the hood.

For the measurement complexity, although measuring time is trivial with native library support, measuring energy consumption, especially for systems with GPUs, is complex because of the low-level details involved, such as choosing the right measurement methods, handling the trade-off between sampling rate and accuracy, and switching between different measurement metrics. We propose a clean interface, with its implementation, that not only hides the complexity of energy measurement but also unifies different kinds of measurements. The unification bridges the gap between time measurement and energy measurement: if the time optimization techniques make no metric-specific assumptions, energy optimization can be performed by blindly reusing them.
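As an illustration of what a unified measurement interface can look like, the sketch below uses a single start/stop/value protocol for every metric, so that selection code written against time works unchanged against energy. It is not the interface proposed in the thesis: `Meter`, `TimeMetric`, `FakeEnergyMetric`, and `pick_cheapest` are hypothetical names, and the energy metric is a stand-in that multiplies elapsed time by a fixed power figure instead of reading a real sensor.

```python
import time


class TimeMetric:
    """Reads a monotonic clock, in seconds."""
    unit = "s"

    def read(self):
        return time.perf_counter()


class FakeEnergyMetric:
    """Hypothetical stand-in for an energy counter: clock value * constant power.

    A real backend would query a hardware counter or sample a power sensor.
    """
    unit = "J"

    def __init__(self, watts=50.0):
        self.watts = watts

    def read(self):
        return time.perf_counter() * self.watts


class Meter:
    """Same start/stop/value interface for every metric."""

    def __init__(self, metric):
        self.metric = metric
        self._begin = None
        self._value = None

    def start(self):
        self._begin = self.metric.read()

    def stop(self):
        self._value = self.metric.read() - self._begin

    def value(self):
        return self._value


def pick_cheapest(variants, metric):
    """Metric-agnostic selection: run each variant once, keep the lowest cost."""
    best_name, best_cost = None, float("inf")
    for name, fn in variants.items():
        meter = Meter(metric)
        meter.start()
        fn()
        meter.stop()
        if meter.value() < best_cost:
            best_name, best_cost = name, meter.value()
    return best_name
```

Because `pick_cheapest` only compares the values a `Meter` reports, passing `TimeMetric()` or `FakeEnergyMetric()` changes what is optimized without changing the selection logic, which is the kind of reuse the paragraph above describes.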

For the CPU-GPU selection complexity, which relates to efficient utilization of heterogeneous hardware, we propose a new adaptive-sampling-based mechanism for constructing predictors for such selections. The mechanism adapts to different hardware platforms automatically and shows non-trivial advantages over random sampling.

For the platform complexity, we propose a new modular platform modeling language and its implementation to formally and systematically describe a computer system, enabling zero-overhead platform information queries for high-level software tool chains and for programmers as a basis for making software adaptive.

For the data management complexity, we propose a new mechanism to enable a unified memory view on heterogeneous systems that have separate memory spaces. This mechanism enables programmers to write significantly less code, which runs as fast as expert-written code and outperforms the current commercially available solution, Nvidia's Unified Memory. We further propose two data movement optimization techniques: lazy allocation and transfer fusion optimization. The two techniques are based on adaptively merging messages to reduce data transfer latency. We show that these techniques can be beneficial, and we prove that our greedy fusion algorithm is optimal.
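One way to picture a unified memory view over separate memory spaces is a container that tracks which copy of the data is current and copies only when an access needs it. The sketch below is a toy model under that assumption, not the mechanism from the thesis: `UnifiedVector`, its `read`/`write` methods, and the `"host"`/`"device"` labels are hypothetical, and the two memory spaces are simulated with two Python lists.

```python
class UnifiedVector:
    """Toy unified view over two separate 'memory spaces' (two Python lists).

    Validity flags record which copy is up to date; a copy is made only when
    an access needs data that is stale on the requested side.
    """

    def __init__(self, data):
        self._host = list(data)
        self._device = None        # created on first device access
        self._host_valid = True
        self._device_valid = False
        self.transfers = 0         # number of simulated host<->device copies

    def _sync_to(self, side):
        if side == "device" and not self._device_valid:
            self._device = list(self._host)
            self._device_valid = True
            self.transfers += 1
        elif side == "host" and not self._host_valid:
            self._host = list(self._device)
            self._host_valid = True
            self.transfers += 1

    def read(self, side):
        """Return the buffer on `side` for reading; both copies stay valid."""
        self._sync_to(side)
        return self._device if side == "device" else self._host

    def write(self, side):
        """Return the buffer on `side` for read-write access.

        The copy on the other side is marked stale.
        """
        self._sync_to(side)
        if side == "device":
            self._host_valid = False
            return self._device
        self._device_valid = False
        return self._host
```

In this model the caller never issues a copy explicitly: requesting `write("device")` and later `read("host")` triggers exactly the two transfers that are needed, and repeated reads on a side that is already current trigger none.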

Finally, we show that our approaches to handle different complexities can be combined so that programmers could use them simultaneously.

This research was partly funded by two EU FP7 projects (PEPPHER and EXCESS) and SeRC.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2018. p. 177
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1903
Keywords [en]
CPU, GPU, GPGPU, heterogeneous systems, programming abstraction, performance optimization, energy optimization, adaptive sampling, MeterPU, TunerPU, XPDL, VectorPU
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-145304
DOI: 10.3384/diss.diva-145304
ISBN: 9789176853702 (print)
OAI: oai:DiVA.org:liu-145304
DiVA, id: diva2:1184739
Public defence
2018-04-04, Ada Lovelace, B-huset, Campus Valla, Linköping, 10:15 (English)
Funder
EU, FP7, Seventh Framework Programme, PEPPHER and EXCESS
Available from: 2018-02-28 Created: 2018-02-22 Last updated: 2018-02-28 Bibliographically approved

Open Access in DiVA

Programming Abstractions and Optimization Techniques for GPU-based Heterogeneous Systems (2503 kB), 161 downloads
File information
File name: FULLTEXT01.pdf
File size: 2503 kB
Checksum (SHA-512): bcc77d7f439b43a97bebe8b3c303c043e6af299134894aea539b4ff66482052afc9d2fd1074dbcf1a000280cde1053eb0b97774200b42ff49d9357f4b6f9fcb3
Type: fulltext
Mimetype: application/pdf

Cover (3230 kB), 11 downloads
File information
File name: COVER01.pdf
File size: 3230 kB
Checksum (SHA-512): 0d76f7cc4d71a75f48dd388756c06a007d02c1271634fb662322d52e339b4a3f84a9f507dba63534e970cd1ff901ee37dab1a72d6eb4649514680e076ee929e6
Type: cover
Mimetype: application/pdf

By author/editor
Li, Lu