Learning Operational Goals for Propulsion System Using Reinforcement Learning
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

This degree project, conducted at ABB, aims to analyze and solve different situations that a crew on board a vessel might face by controlling its propulsion system. The propulsion system is modelled as static and transition-deterministic, with stochastic measurement data. The system is then used to formulate a decision problem as a finite Markov Decision Process, which is tackled using Q-learning, Speedy Q-learning and Double Q-learning for three different objectives relevant to the system's behaviour and performance. The policies found in the experiments clearly work as intended, and the experiments indicate that additional training has a marked effect on performance, as expected given the existing convergence proofs for Q-learning-based algorithms. The convergence rates of the three algorithms are then compared against a solution regarded as optimal, to see how fast they converge and to estimate the time needed to solve problems similar to those stated in this thesis.
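
As a rough illustration of the setting described in the abstract, the following is a minimal sketch of tabular Q-learning and Double Q-learning on a finite Markov Decision Process with deterministic transitions and noisy reward observations. It is not the thesis's propulsion model or code: the chain environment, the noise level, the hyperparameters and every name in it (`step`, `greedy`, `q_learning`, `double_q_learning`) are assumptions chosen for illustration only. Speedy Q-learning is not shown.

```python
import random

# Hypothetical toy problem (NOT the thesis's propulsion model): a chain of
# states where action 0 moves left and action 1 moves right. Transitions are
# deterministic; only the observed reward is noisy.
N_STATES = 6
N_ACTIONS = 2
GOAL = N_STATES - 1
NOISE_STD = 0.02  # assumed measurement noise on the reward


def step(state, action, rng):
    """Deterministic transition with a noisy reward observation."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = (1.0 if next_state == GOAL else 0.0) + rng.gauss(0.0, NOISE_STD)
    return next_state, reward, next_state == GOAL


def greedy(values):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                action = greedy(q[state])
            next_state, reward, done = step(state, action, rng)
            # Bootstrap from the best next-state value unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def double_q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Double Q-learning: one table selects, the other evaluates."""
    rng = random.Random(seed)
    qa = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    qb = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Act epsilon-greedily with respect to the sum of both tables.
            summed = [a + b for a, b in zip(qa[state], qb[state])]
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                action = greedy(summed)
            next_state, reward, done = step(state, action, rng)
            # Pick at random which table is updated on this step.
            upd, other = (qa, qb) if rng.random() < 0.5 else (qb, qa)
            if done:
                target = reward
            else:
                best = greedy(upd[next_state])
                target = reward + gamma * other[next_state][best]
            upd[state][action] += alpha * (target - upd[state][action])
            state = next_state
            if done:
                break
    return qa, qb


if __name__ == "__main__":
    q = q_learning()
    print("Greedy policy (Q-learning):", [greedy(row) for row in q[:GOAL]])
```

In this toy chain the greedy policy should learn to move right towards the goal state. Comparing how quickly each table approaches a reference solution is one way to study convergence rates, although the thesis's own comparison procedure is not described on this page.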

Abstract [sv] (translated from Swedish)

This degree project, carried out at ABB, aims to analyze and solve different situations that a vessel may encounter by controlling its propulsion system as well as possible. The propulsion system is viewed as transition-deterministic, stochastic in its measurements, and static. This system is then formulated as a Markov Decision Process, which is solved using Q-learning, Speedy Q-learning and Double Q-learning for three different objectives that are relevant to the system's behaviour and performance. For these objectives, policies are derived that clearly work as intended. They are considered near-optimal based on existing proofs of convergence for Q-learning-based algorithms. The convergence rates of the three algorithms are then measured against a solution regarded as near-optimal, to see how quickly they converge, so that the time needed to solve problems similar to those handled in this report can be determined.

Place, publisher, year, edition, pages
2018. p. 45
Series
TRITA-EECS-EX ; 2018:717
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-246038
OAI: oai:DiVA.org:kth-246038
DiVA, id: diva2:1295380
External cooperation
ABB
Educational program
Master of Science - Systems, Control and Robotics
Supervisors
Examiners
Available from: 2019-03-11 Created: 2019-03-11 Last updated: 2019-03-11 Bibliographically approved

Open Access in DiVA

fulltext (883 kB), 15 downloads
File information
File name: FULLTEXT01.pdf
File size: 883 kB
Checksum (SHA-512): d41f218d54612b0e20de8a246004f41672c897f3549c042e4d7921bafb5aa7fb8c84be530e3b4466ad192754fcdde2ccca873aba6d7e52670203fd51eabee095
Type: fulltext
Mimetype: application/pdf

