Acceleration of deep convolutional neural networks on multiprocessor system-on-chip
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

In this master's thesis, some of the most promising existing frameworks and implementations of deep convolutional neural networks on multiprocessor system-on-chips (MPSoCs) are surveyed and evaluated. The starting point was a previous thesis, conducted in the spring of 2018, which evaluated possible deep learning models and frameworks for object detection on infrared images. To fit an existing deep convolutional neural network (DCNN) on an MPSoC, the network needs modifications. Most DCNNs are trained on graphics processing units (GPUs) with a bit width of 32 bits. This is not optimal for a platform with hard memory constraints such as an MPSoC, so the bit width must be reduced. The optimal bit width depends on the network structure and on the requirements in terms of throughput and accuracy, although most currently available object detection networks lose significant accuracy when the bit width is reduced below 6 bits. After the bit width is reduced, the network needs to be quantized and pruned for better memory usage. After quantization, it can be implemented using one of several existing frameworks. This thesis focuses on Xilinx CHaiDNN and DNNWeaver V2, and also briefly covers Xilinx reVISION, HLS4ML, and DNNWeaver V1. In conclusion, implementations of two network models on the Xilinx Zynq UltraScale+ ZCU102 using CHaiDNN were evaluated. Existing networks were converted, and quantization was tested, although it was not fully working. The resulting implementation was two to six times more power efficient than GPU inference.
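The quantization step described above can be illustrated with a minimal sketch of uniform symmetric quantization, where 32-bit float weights are mapped onto a signed n-bit integer grid and back. This is an illustrative example under simple assumptions (per-tensor symmetric scaling), not the exact procedure used by CHaiDNN or DNNWeaver:

```python
import numpy as np

def quantize_weights(w, bits):
    """Uniformly quantize a float weight array to a signed n-bit grid.

    Symmetric per-tensor quantization: the scale maps the
    largest-magnitude weight onto the largest representable integer,
    values are rounded to that grid, then de-quantized back to floats
    so the accuracy loss can be simulated.
    """
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax      # real-valued step size
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return (q * scale).astype(w.dtype)    # "fake-quantized" weights

# Example: 32-bit float weights reduced to an 8-bit grid
w = np.array([0.51, -1.27, 0.03, 0.90], dtype=np.float32)
w8 = quantize_weights(w, bits=8)
```

Running the same sketch with `bits=4` or lower makes the grid much coarser, which is one way to see why accuracy drops sharply below roughly 6 bits for the networks discussed above.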

Place, publisher, year, edition, pages
2019, p. 48
Series
UPTEC E, ISSN 1654-7616 ; 19006
Keywords [sv]
Neurala nätverk (Neural networks), MPSoC, FPGA, DCNN
National Category
Embedded Systems
Identifiers
URN: urn:nbn:se:uu:diva-385904
OAI: oai:DiVA.org:uu-385904
DiVA id: diva2:1326323
Educational program
Master Programme in Electrical Engineering
Presentation
2019-06-10, 2003, Lägerhyddsvägen 1, Uppsala, 08:06 (English)
Available from: 2019-06-26. Created: 2019-06-18. Last updated: 2019-06-26. Bibliographically approved.

Open Access in DiVA

Acceleration of deep convolutional neural networks on multiprocessor system-on-chip (6529 kB), 201 downloads
File information
File name: FULLTEXT01.pdf
File size: 6529 kB
Checksum (SHA-512): 72e06d3235d45132292efba0110c94c950188919105b6fd700b19317fabe0e41ca9afd983c06ec7d22cc1e2a69d8168a87f54f8e0c8ab850213fc328f3e4baa6
Type: fulltext
Mimetype: application/pdf

