Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
ASIC implementation of LSTM neural network algorithm
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

LSTM neural networks have been used for speech recognition, image recognition and other artificial intelligence applications for many years. Most applications perform the LSTM algorithm and the required calculations on cloud computers. Off-line solutions include the use of FPGAs and GPUs but the most promising solutions include ASIC accelerators designed for this purpose only. This report presents an ASIC design capable of performing the multiple iterations of the LSTM algorithm on a unidirectional and without peepholes neural network architecture. The proposed design provides arithmetic level parallelism options as blocks are instantiated based on parameters. The internal structure of the design implements pipelined, parallel or serial solutions depending on which is optimal in every case. The implications concerning these decisions are discussed in detail in the report. The design process is described in detail and the evaluation of the design is also presented to measure accuracy and error of the design output.This thesis work resulted in a complete synthesizable ASIC design implementing an LSTM layer, a Fully Connected layer and a Softmax layer which can perform classification of data based on trained weight matrices and bias vectors. The design primarily uses 16-bit fixed point format with 5 integer and 11 fractional bits but increased precision representations are used in some blocks to reduce error output. Additionally, a verification environment has also been designed and is capable of performing simulations, evaluating the design output by comparing it with results produced from performing the same operations with 64-bit floating point precision on a SystemVerilog testbench and measuring the encountered error. The results concerning the accuracy and the design output error margin are presented in this thesis report. The design went through Logic and Physical synthesis and successfully resulted in a functional netlist for every tested configuration. Timing, area and power measurements on the generated netlists of various configurations of the design show consistency and are reported in this report.

Abstract [sv]

LSTM neurala nätverk har använts för taligenkänning, bildigenkänning och andra artificiella intelligensapplikationer i många år. De flesta applikationer utför LSTM-algoritmen och de nödvändiga beräkningarna i digitala moln. Offline lösningar inkluderar användningen av FPGA och GPU men de mest lovande lösningarna inkluderar ASIC-acceleratorer utformade för endast dettaändamål. Denna rapport presenterar en ASIC-design som kan utföra multipla iterationer av LSTM-algoritmen på en enkelriktad neural nätverksarkitetur utan peepholes. Den föreslagna designed ger aritmetrisk nivå-parallellismalternativ som block som är instansierat baserat på parametrar. Designens inre konstruktion implementerar pipelinerade, parallella, eller seriella lösningar beroende på vilket anternativ som är optimalt till alla fall. Konsekvenserna för dessa beslut diskuteras i detalj i rapporten. Designprocessen beskrivs i detalj och utvärderingen av designen presenteras också för att mäta noggrannheten och felmarginal i designutgången. Resultatet av arbetet från denna rapport är en fullständig syntetiserbar ASIC design som har implementerat ett LSTM-lager, ett fullständigt anslutet lager och ett Softmax-lager som kan utföra klassificering av data baserat på tränade viktmatriser och biasvektorer. Designen använder huvudsakligen 16bitars fast flytpunktsformat med 5 heltal och 11 fraktions bitar men ökade precisionsrepresentationer används i vissa block för att minska felmarginal. Till detta har även en verifieringsmiljö utformats som kan utföra simuleringar, utvärdera designresultatet genom att jämföra det med resultatet som produceras från att utföra samma operationer med 64-bitars flytpunktsprecision på en SystemVerilog testbänk och mäta uppstådda felmarginal. Resultaten avseende noggrannheten och designutgångens felmarginal presenteras i denna rapport.Designen gick genom Logisk och Fysisk syntes och framgångsrikt resulterade i en funktionell nätlista för varje testad konfiguration. Timing, area och effektmätningar på den genererade nätlistorna av olika konfigurationer av designen visar konsistens och rapporteras i denna rapport.

Place, publisher, year, edition, pages
2018. , p. 79
Series
TRITA-EECS-EX ; 2018:153
Keywords [en]
Neural networks, LSTM, Long Short Term Memory, ASIC, VLSI
Keywords [sv]
Neural networks, LSTM, Long Short Term Memory, ASIC, VLSI
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-254290OAI: oai:DiVA.org:kth-254290DiVA, id: diva2:1330224
Subject / course
Electrical Engineering
Educational program
Master of Science - Embedded Systems
Supervisors
Examiners
Available from: 2019-06-25 Created: 2019-06-25 Last updated: 2019-06-25Bibliographically approved

Open Access in DiVA

fulltext(3520 kB)44 downloads
File information
File name FULLTEXT01.pdfFile size 3520 kBChecksum SHA-512
8d940b078b8bc24235eb67122bd09752c376e426d1c8c07a7641b6b4f58fca90727e767c23b033b9155d234050715def03d1fc4dc97aba388989eb0e7ddd242c
Type fulltextMimetype application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 44 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 220 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf