Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Performance Evaluation of Imitation Learning Algorithms with Human Experts
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

The purpose of this thesis was to compare the performance of three different imitation learning algorithms with human experts, with limited expert time. The central question was, ”How should one implement imitation learning in a simulated car racing environment, using human experts, to achieve the best performance when access to the experts is limited?”. We limited the work to only consider the three algorithms Behavior Cloning, DAGGER, and HG-DAGGER and limited the implementation to the car racing simulator TORCS. The agents consisted of the same type of feedforward neural network that utilized sensor data provided by TORCS. Through comparison in the performance of the different algorithms on a different amount of expert time, we can conclude that HGDAGGER performed the best. In this case, performance is regarded as a distance covered given set time. Its performance also seemed to scale well with more expert time, which the others did not. This result confirmed previously published results when comparing these algorithms.

Abstract [sv]

Målet med detta examensarbete var att jämföra prestandan av tre olika algoritmer inom området imitationinlärning med mänskliga experter, där experttiden är begränsad. Arbetets frågeställning var, ”Hur ska man implementera imitationsinlärning i en bilsimulator, för att få bäst prestanda, med mänskliga experter där experttiden är begränsad?”. Vi begränsade arbetet till att endast omfatta de tre algoritmerna, Behavior Cloning, DAGGER och HG-DAGGER, och begränsade implementationsmiljön till bilsimulatorn TORCS. Alla agenterna bestod av samma sorts feedforward neuralt nätverk som använde sig av sensordata från TROCS. Genom jämförelse i prestanda på olika mängder experttid kan vi dra slutsatsen att HG-DAGGER gav bäst resultat. I detta fall motsvarar prestanda körsträcka, givet en viss tid. Dess prestanda verkar även utvecklas väl med ytterligare experttid, vilket de övriga inte gjorde. Detta resultat bekräftar tidigare publicerade resultat om jämförelse av de tre olika algoritmerna.

Place, publisher, year, edition, pages
2019. , p. 50
Series
TRITA-EECS-EX ; 2019:255
Keywords [en]
Imitation Learning, DAGGER, HG-DAGGER, Behavior Cloning, Machine Learning, TORCS
Keywords [sv]
Imitationsinlärning, DAGGER, HG-DAGGER, Behavior Cloning, Maskininlärning, TORCS
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-254637OAI: oai:DiVA.org:kth-254637DiVA, id: diva2:1334302
Supervisors
Examiners
Available from: 2019-07-02 Created: 2019-07-02 Last updated: 2019-07-02Bibliographically approved

Open Access in DiVA

fulltext(1182 kB)20 downloads
File information
File name FULLTEXT01.pdfFile size 1182 kBChecksum SHA-512
c7e6de5824fc1a1668cc81c0f28815868dde315a7c05382bb94e0ceb7f5054ab6becff71d7e14fc5399ecdd9d172df3ee451ed9c64b2d5833b91a2b96c972cab
Type fulltextMimetype application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Computer and Information Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 20 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 91 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf