Digitala Vetenskapliga Arkivet

Reinforcement Learning for a Hunter and Prey Robot
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2018 (English). Independent thesis, Advanced level (degree of Master, Two Years), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The surge in the use of adaptive Artificial Intelligence (AI) systems has been made possible by leveraging the increasing processing and storage power that modern computers provide. These systems are designed to make quality decisions that assist in making predictions in a wide variety of application fields. When such a system is fueled by data, the foundation for a Machine Learning (ML) approach can be modelled. Reinforcement Learning (RL) is an active branch of ML going beyond the traditional supervised or unsupervised ML methods. RL studies algorithms that take actions so that the resulting expected reward is optimal. This thesis investigates the use of RL methods in a context where the reward is highly time-varying: a setup is studied where two agents compete for a common resource. Specifically, we study a robotic setting inspired by the "cat-and-mouse" (hunter-prey) game. We refer to the hunter robot as Tom, and to the competing prey robot as Jerry. To study this problem, two practical setups are considered. The first is based on a LEGO platform, enabling us to run a number of actual experiments. The second is based on a known RL simulator environment, enabling us to run many virtual experiments. We use these environments to explore the setting: demonstrate the non-stationary behaviour it generates, evaluate a number of de-facto standard approaches to RL, and identify key future avenues to be addressed.
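The class of algorithms the abstract refers to can be illustrated with tabular Q-learning, one of the de-facto standard RL approaches. The following is a minimal sketch on a simplified one-dimensional chase (hunter learns to close the distance to a stationary prey); the state space, reward values, and hyperparameters here are illustrative assumptions, not the setups (LEGO platform, RL simulator) used in the thesis.

```python
import random

# Illustrative tabular Q-learning on a 1-D chase: the state is the
# hunter's distance to a stationary prey (0..4, where 0 means "caught").
# Rewards and hyperparameters below are assumed for the sketch.
N_STATES = 5
ACTIONS = (-1, +1)                 # move toward / away from the prey
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def step(dist, action):
    """Apply an action; reward 1 on capture, a small cost otherwise."""
    new = min(max(dist + action, 0), N_STATES - 1)
    return new, (1.0 if new == 0 else -0.01), new == 0

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action index]
    for _ in range(episodes):
        dist = rng.randrange(1, N_STATES)
        for _ in range(20):
            # Epsilon-greedy action selection
            if rng.random() < EPS:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[dist][i])
            nxt, r, done = step(dist, ACTIONS[a])
            # Q-learning update: Q(s,a) += alpha*(r + gamma*max Q(s',.) - Q(s,a))
            q[dist][a] += ALPHA * (r + GAMMA * max(q[nxt]) - q[dist][a])
            dist = nxt
            if done:
                break
    return q

q = train()
# After training, moving toward the prey (action index 0) should score
# higher than moving away in every non-terminal state.
policy_ok = all(q[s][0] > q[s][1] for s in range(1, N_STATES))
```

In the thesis's competitive setting the prey also moves, which makes the hunter's reward non-stationary; a fixed Q-table like the one above would then have to be re-learned continually, which is the core difficulty the abstract points at.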

Place, publisher, year, edition, pages
2018. , p. 67
Series
IT ; 18068
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-389998
OAI: oai:DiVA.org:uu-389998
DiVA, id: diva2:1340177
Educational program
Master Programme in Computer Science
Available from: 2019-08-02. Created: 2019-08-02. Last updated: 2019-08-02. Bibliographically approved.

Open Access in DiVA

fulltext (19101 kB), 1639 downloads
File information
File name: FULLTEXT01.pdf
File size: 19101 kB
Checksum: SHA-512
770c3c7f65447b5e040dbf31426839fc2753bd35fce323853dc3fd9567221e211084be8db0c326d8913bede3dd6bac2be2ed64c94e1d130238f9c0b3adf3cc2e
Type: fulltext. Mimetype: application/pdf


Total: 1640 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 704 hits