Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Sparsity Analysis of Deep Learning Models and Corresponding Accelerator Design on FPGA
KTH, School of Information and Communication Technology (ICT).
2016 (English)Independent thesis Basic level (professional degree), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

Machine learning has achieved great success in recent years, especially the deep learning algorithms based on Artificial Neural Network. However, high performance and large memories are needed for these models , which makes them not suitable for IoT device, as IoT devices have limited performance and should be low cost and less energy-consuming. Therefore, it is necessary to optimize the deep learning models to accommodate the resource-constrained IoT devices.

This thesis is to seek for a possible solution of optimizing the ANN models to fit into the IoT devices and provide a hardware implementation of the ANN accelerator on FPGA. The contribution of this thesis mainly lies in two aspects: 1). analyze the sparsity in the two mainstream deep learning models – DBN and CNN. The DBN model consists of two hidden layers with Restricted Boltzmann Machines while the CNN model consists of 2 convolutional layers and 2 sub-sampling layer. Experiments have been done on the MNIST data set with the sparsity of 75%. The ratio of the multiplications resulting in near-zero values has been tested. 2). FPGA implementation of an ANN accelerator. This thesis designed a hardware accelerator for the inference process in ANN models on FPGA (Stratix IV: EP4SGX530KH40C2). The main part of hardware design is the processing array consists of 256 Multiply-Accumulators array, which can conduct multiply-accumulate operations of 256 synaptic connections simultaneously. 16-bit fixed point computation is used to reduce the hardware complexity, thus saving power and area.

Based on the evaluation results, it is found that the ratio of the multiplications under the threshold of 2-5 is 75% for CNN with ReLU activation function, and is 83% for DBN with sigmoid activation function, respectively. Therefore, there still exists large space for complex ANN models to be optimized if the sparsity of data is fully utilized. Meanwhile, the implemented hardware accelerator is verified to provide correct results through 16-bit fixed point computation, which can be used as a hardware testing platform for evaluating the ANN models.

Place, publisher, year, edition, pages
2016. , p. 26
Series
TRITA-ICT-EX ; 2016:33
Keywords [en]
Deep machine learning, DBN, CNN, Multiplication-avoiding, FPGA, MNIST
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-204409OAI: oai:DiVA.org:kth-204409DiVA, id: diva2:1084868
Subject / course
Electronic- and Computer Systems
Educational program
Bachelor of Science in Engineering - Computer Engineering
Supervisors
Examiners
Available from: 2017-03-27 Created: 2017-03-27 Last updated: 2018-01-13Bibliographically approved

Open Access in DiVA

fulltext(2153 kB)153 downloads
File information
File name FULLTEXT01.pdfFile size 2153 kBChecksum SHA-512
81d02292b085782d72518af772746a2f61fbd62b4bc0f0750b07cae73c12a0fcc98d91c47c5a220427e3f4b40e2f6e2732fd73e2612210ee34a283a7d8bef16a
Type fulltextMimetype application/pdf

By organisation
School of Information and Communication Technology (ICT)
Computer and Information Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 153 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 485 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf