Asynchronous Advantage Actor-Critic with Adam Optimization and a Layer Normalized Recurrent Network
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Optimization and Systems Theory.
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

State-of-the-art deep reinforcement learning models rely on asynchronous training, in which multiple learner agents send their collective updates to a central neural network. This thesis examines one of the most recent asynchronous policy-gradient-based reinforcement learning methods, asynchronous advantage actor-critic (A3C), and improves it using prior research from the machine learning community. Applying the Adam optimization method and adding a long short-term memory (LSTM) network with layer normalization is shown to increase the performance of A3C.

Abstract [sv]

Modern models in deep reinforcement learning rely on asynchronous training using multiple learner agents and their collective updates to a central neural network. This study examines one of the most recent policy-gradient-based reinforcement learning methods, asynchronous advantage actor-critic (A3C), with the aim of improving its performance using prior research from the machine learning community. By applying the Adam optimization method together with a long short-term memory (LSTM) with layer normalization, the performance of A3C is shown to increase.

Place, publisher, year, edition, pages
2017.
Series
TRITA-MAT-E ; 2017:81
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-220698
OAI: oai:DiVA.org:kth-220698
DiVA, id: diva2:1169944
External cooperation
EA SEED
Subject / course
Optimization and Systems Theory
Educational program
Master of Science - Applied and Computational Mathematics
Supervisors
Examiners
Available from: 2017-12-31. Created: 2017-12-31. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

Full text (1203 kB), 140 downloads
File information
File name: FULLTEXT01.pdf
File size: 1203 kB
Checksum (SHA-512): 7a25644aafa7dc24f1880b76475756ef87fb43ae10a41ce7d42be245015f6bde682ddf4f9a0bf91c57e2fcbc68735c1d1de0bc18bf05d8a1ca441fb364749718
Type: fulltext
Mimetype: application/pdf
