Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
SAFE AND EFFICIENT REINFORCEMENT LEARNING
Örebro University, School of Science and Technology.
Örebro University, School of Science and Technology.
2019 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesisAlternative title
Säker och effektiv reinforcement learning (Swedish)
Abstract [en]

Pre-programming a robot may be efficient to some extent, but since a human has code the robot it will only be as efficient as the programming. The problem can solved by using machine learning, which lets the robot learn the most efficient way by itself. This thesis is continuation of a previous work that covered the development of the framework ​Safe-To-Explore-State-Spaces​ (STESS) for safe robot manipulation. This thesis evaluates the efficiency of the ​Q-Learning with normalized advantage function ​ (NAF), a deep reinforcement learning algorithm, when integrated with the safety framework STESS. It does this by performing a 2D task where the robot moves the tooltip on a plane from point A to point B in a set workspace. To test the viability different scenarios was presented to the robot. No obstacles, sphere obstacles and cylinder obstacles. The reinforcement learning algorithm only knew the starting position and the STESS pre-defined the workspace constraining the areas which the robot could not enter. By satisfying these constraints the robot could explore and learn the most efficient way to complete its task. The results show that in simulation the NAF-algorithm learns fast and efficient, while avoiding the obstacles without collision.

Abstract [sv]

Förprogrammering av en robot kan vara effektiv i viss utsträckning, men eftersom en människa har programmerat roboten kommer den bara att vara lika effektiv som programmet är skrivet. Problemet kan lösas genom att använda maskininlärning. Detta gör att roboten kan lära sig det effektivaste sättet på sitt sätt. Denna avhandling är fortsättning på ett tidigare arbete som täckte utvecklingen av ramverket Safe-To-Explore-State-Spaces (STESS) för säker robot manipulation. Denna avhandling utvärderar effektiviteten hos ​Q-Learning with normalized advantage function (NAF)​, en deep reinforcement learning algoritm, när den integreras med ramverket STESS. Det gör detta genom att utföra en 2D-uppgift där roboten flyttar sitt verktyg på ett plan från punkt A till punkt B i en förbestämd arbetsyta. För att testa effektiviteten presenterades olika scenarier för roboten. Inga hinder, hinder med sfärisk form och hinder med cylindrisk form. Deep reinforcement learning algoritmen visste bara startpositionen och STESS-fördefinierade arbetsytan och begränsade de områden som roboten inte fick beträda. Genom att uppfylla dessa hinder kunde roboten utforska och lära sig det mest effektiva sättet att utföra sin uppgift. Resultaten visar att NAF-algoritmen i simulering lär sig snabbt och effektivt, samtidigt som man undviker hindren utan kollision. 

Place, publisher, year, edition, pages
2019. , p. 47
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-76588OAI: oai:DiVA.org:oru-76588DiVA, id: diva2:1353018
Subject / course
Computer Engineering
Supervisors
Examiners
Available from: 2019-09-20 Created: 2019-09-20 Last updated: 2019-09-20Bibliographically approved

Open Access in DiVA

fulltext(2677 kB)9 downloads
File information
File name FULLTEXT01.pdfFile size 2677 kBChecksum SHA-512
9a4243919a4c2827053edd63b1c04f702cef14338e5eb4b3db06c3bbb705b178526525f03f15484f75ce88fa5c5a5cc466b366cf688d7dfb5341ab1c7bcb2757
Type fulltextMimetype application/pdf

By organisation
School of Science and Technology
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 9 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 2 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf