Compact ConvNets with Ternary Weights and Binary Activations
KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
2017 (English). Independent thesis, Advanced level (degree of Master, Two Years), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Compact architectures on the one hand, and ternary weights with binary activations on the other, are two methods suitable for making neural networks more efficient. We introduce a) a dithering binary activation, which improves the accuracy of ternary-weight networks with binary activations by randomizing the quantization error, and b) a method of implementing ternary-weight networks with binary activations using only binary operations. Despite these new approaches, training a compact SqueezeNet architecture with ternary weights and full-precision activations on ImageNet degrades classification accuracy significantly more than training a less compact architecture the same way. Ternary weights in their current form therefore cannot be called the best method for reducing network size. However, the effect of weight decay on ternary-weight network training should be investigated further before this finding can be stated with confidence.
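
As a rough illustration of the two ideas named in the abstract, here is a minimal NumPy sketch (not the thesis code): dithering_binary_activation adds uniform noise before taking the sign, and ternary_times_binary evaluates a ternary-weight/binary-activation dot product with bitwise XNOR and popcount-style sums. The function names, the dither range, and the sign/nonzero bitmask decomposition are all assumptions; the abstract only states that such a binary-operation implementation exists.

import numpy as np

def dithering_binary_activation(x, rng):
    # Binarize to {-1, +1} after adding uniform dither; the added noise
    # randomizes the quantization error instead of leaving it correlated
    # with the input (assumed dither range: [-1, 1)).
    dither = rng.uniform(-1.0, 1.0, size=x.shape)
    return np.where(x + dither >= 0.0, 1.0, -1.0)

def ternary_times_binary(w_ternary, a_binary):
    # Dot product of ternary weights {-1, 0, +1} with binary activations
    # {-1, +1} using only bitwise logic and counting, one way such a
    # kernel can be built: split the weights into a sign bitmask and a
    # nonzero bitmask, then use XNOR + popcount to count sign agreements.
    w_sign = w_ternary > 0                # True where weight is +1
    w_mask = w_ternary != 0               # True where weight is nonzero
    a_sign = a_binary > 0                 # True where activation is +1
    agree = ~(w_sign ^ a_sign) & w_mask   # XNOR, masked to nonzero weights
    n_active = int(w_mask.sum())          # popcount of nonzero weights
    n_agree = int(agree.sum())            # popcount of sign agreements
    # Agreements contribute +1, disagreements -1, zero weights nothing:
    return 2 * n_agree - n_active

rng = np.random.default_rng(0)
w = rng.integers(-1, 2, size=64).astype(float)            # ternary weights
a = dithering_binary_activation(rng.standard_normal(64), rng)
assert ternary_times_binary(w, a) == int(w @ a)

In a real kernel the two bitmasks would be packed into machine words so that XNOR, AND, and popcount each process one word's worth of weights per instruction, which is the usual source of the speedup over full-precision multiply-accumulates.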

Place, publisher, year, edition, pages
2017. 55 p.
Keyword [en]
convolutional neural networks, compact architectures, weight quantization, activation quantization, dithering activations
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-216389
OAI: oai:DiVA.org:kth-216389
DiVA: diva2:1150748
Educational program
Master of Science - Systems, Control and Robotics
Available from: 2017-11-06. Created: 2017-10-19. Last updated: 2017-11-06. Bibliographically approved.

Open Access in DiVA

fulltext (1187 kB), 13 downloads
File information
File name: FULLTEXT01.pdf
File size: 1187 kB
Checksum (SHA-512): 44b661e4c4872c5d0f56c176e3caf06249aebf834d2b227d688064202c7b4f039cd5da64635b6ad84fa648bd5d45e992d31a1e1f7a22f90954b757b12d48d96b
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Holesovsky, Ondrej
By organisation
Robotics, perception and learning, RPL
Computer Science

Total: 13 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 34 hits