Digitala Vetenskapliga Arkivet

Applying Multi-Agent Reinforcement Learning as Game-AI in Football-like Environments
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

This thesis investigates Multi-Agent Reinforcement Learning (MARL) as an alternative to traditionally scripted game-AI. It focuses on football, a complex and dynamic environment that requires strategic interaction and calls for more sophisticated game-AI techniques.

The study provides an extensive review of the MARL literature and identifies TiZero, a recent state-of-the-art MARL method that has demonstrated significant potential, as the base for the experiments conducted in this work. However, like most MARL methods, TiZero suffers from sample inefficiency: it requires a large number of training samples to achieve good performance, which makes training time-consuming and resource-intensive.

To address this, the thesis proposes modifications to the TiZero method, most significantly to its reward scheme. These apply concepts from reward shaping, a technique that has shown promise in improving the learning efficiency of reinforcement learning algorithms. More specifically, we add a self-supervised, online-learned intrinsic reward and an exploration bonus to the original extrinsic reward function of TiZero. We study the effects of these additions by evaluating them individually as well as collectively. Additionally, we make minor implementation adjustments to the original TiZero method, namely normalizing the value estimate targets, replacing the recurrent architecture with a non-recurrent one, using fixed positional encoding instead of Multi-Layer Perceptrons (MLPs), and trying different exploration rates and methods.
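As a minimal sketch of the reward-shaping idea described above, the following Python snippet combines an extrinsic reward with a weighted intrinsic reward and an exploration bonus. Everything in it is an assumption made for illustration: the class name `ShapedReward`, the weights, and the count-based bonus of the form 1/sqrt(N) are hypothetical and are not taken from the thesis or from TiZero. The intrinsic reward is passed in as a plain number, standing in for whatever the self-supervised, online-learned model would produce.

```python
import math
from collections import defaultdict


class ShapedReward:
    """Hypothetical combiner: extrinsic + weighted intrinsic + exploration bonus."""

    def __init__(self, intrinsic_weight=0.1, bonus_weight=0.05):
        # Both weights are illustrative defaults, not values from the thesis.
        self.intrinsic_weight = intrinsic_weight
        self.bonus_weight = bonus_weight
        self.visit_counts = defaultdict(int)

    def exploration_bonus(self, state_key):
        # Count-based bonus (an assumed form): decays as a state is revisited.
        self.visit_counts[state_key] += 1
        return 1.0 / math.sqrt(self.visit_counts[state_key])

    def __call__(self, extrinsic, intrinsic, state_key):
        # `intrinsic` stands in for the output of a learned intrinsic-reward model.
        bonus = self.exploration_bonus(state_key)
        return (
            extrinsic
            + self.intrinsic_weight * intrinsic
            + self.bonus_weight * bonus
        )


shaper = ShapedReward(intrinsic_weight=0.5, bonus_weight=1.0)
first_visit = shaper(extrinsic=1.0, intrinsic=2.0, state_key="s0")
second_visit = shaper(extrinsic=1.0, intrinsic=2.0, state_key="s0")
```

In this sketch the second visit to the same state yields a smaller total reward than the first, because only the exploration bonus shrinks. Setting either weight to zero switches off that term, which mirrors evaluating the additions individually as well as collectively.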

The modified versions of TiZero, alongside the base method, are tested extensively in an open-source football environment. The results reveal that while MARL methods, including the improved versions of TiZero, can learn meaningful gameplay strategies in complex environments such as football, they still struggle with sample inefficiency and learn slowly, which may make them infeasible for practical use in their current form.

The thesis concludes with a discussion of the current limitations of MARL in game-AI and potential directions for future research. It emphasizes the need for more efficient models or demonstration-based learning algorithms, and for the investigation of other forms of exploration and reward shaping, to further improve the performance of MARL methods in such complex game environments.

Place, publisher, year, edition, pages
2024, p. 90
Series
IT ; mDV 24 032
Keywords [en]
Reinforcement Learning, Multi-Agent Reinforcement Learning, Game-AI, Artificial Intelligence, Machine Learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-539832; OAI: oai:DiVA.org:uu-539832; DiVA, id: diva2:1903668
External cooperation
Electronic Arts, SEED
Presentation
2024-08-22, Zoom, Uppsala, 13:15 (English)
Supervisors
Examiners
Available from: 2024-10-07. Created: 2024-10-05. Last updated: 2024-10-07. Bibliographically approved.

Open Access in DiVA

marl-game-ai-football-like-environments-master-thesis-amir-baghi (9805 kB), 698 downloads
File information
File name: FULLTEXT01.pdf
File size: 9805 kB
Checksum (SHA-512): e0c6b5188dbdb178259ce397bf18bcd9a1264693ff0a1baade83a7128db7b9db945e92fb58f9827868a692a1f9842eb859bb6d23015549d7548b733c2d144a45
Type: fulltext
Mimetype: application/pdf

Total: 900 hits