Applying Multi-Agent Reinforcement Learning as Game-AI in Football-like Environments
2024 (English). Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
This thesis investigates the potential of Multi-Agent Reinforcement Learning (MARL) as an alternative to traditionally scripted game-AI, focusing on the domain of football, a complex and dynamic environment that requires strategic interactions and is in need of more sophisticated game-AI techniques.
The study provides an extensive review of the MARL literature and identifies TiZero as a recent state-of-the-art method that has demonstrated significant potential; TiZero therefore serves as the base for the studies conducted in this work. However, like most MARL methods, TiZero suffers from sample inefficiency: it requires a large number of training samples to achieve good performance, which makes the training process time-consuming and resource-intensive.
To address this, the thesis proposes modifications to the TiZero method, most significantly to its reward scheme. These modifications apply concepts from reward shaping, a technique that has shown promise in improving the learning efficiency of reinforcement learning algorithms. More specifically, we add a self-supervised, online-learned intrinsic reward and an exploration bonus to the original extrinsic reward function of TiZero. We study the effects of these additions by evaluating them individually as well as collectively. Additionally, we make minor implementation adjustments to the original TiZero method: normalizing the value estimate targets, replacing the recurrent architecture with a non-recurrent one, using fixed positional encoding instead of Multi-Layer Perceptrons (MLPs), and trying different exploration rates and methods.
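As a rough illustration of how an extrinsic reward, an intrinsic reward, and an exploration bonus might be combined into a single training signal, consider the minimal sketch below. It is not the formulation used in the thesis: the class name `ShapedReward`, the weights, and the count-based bonus of the form 1/sqrt(N(s)) are all assumptions chosen for illustration, and the intrinsic reward is passed in as a plain number rather than produced by a learned model.

```python
import math
from collections import defaultdict


class ShapedReward:
    """Hypothetical helper that sums three reward terms (illustrative only)."""

    def __init__(self, intrinsic_weight=0.1, bonus_weight=0.05):
        # Both weights are assumed values, not taken from the thesis.
        self.intrinsic_weight = intrinsic_weight
        self.bonus_weight = bonus_weight
        self.visit_counts = defaultdict(int)

    def exploration_bonus(self, state_key):
        # Count-based bonus (an assumption): decays as a state is revisited.
        self.visit_counts[state_key] += 1
        return 1.0 / math.sqrt(self.visit_counts[state_key])

    def total(self, extrinsic, intrinsic, state_key):
        # Shaped reward = extrinsic + weighted intrinsic + weighted bonus.
        bonus = self.exploration_bonus(state_key)
        return (extrinsic
                + self.intrinsic_weight * intrinsic
                + self.bonus_weight * bonus)


shaper = ShapedReward(intrinsic_weight=0.1, bonus_weight=0.05)
# state_key stands in for any hashable summary of the game state.
first = shaper.total(extrinsic=1.0, intrinsic=0.5, state_key=(3, 4))
second = shaper.total(extrinsic=1.0, intrinsic=0.5, state_key=(3, 4))
```

In this sketch the second visit to the same state yields a smaller total than the first, because only the exploration bonus shrinks. Setting either weight to zero disables that term, which mirrors evaluating the additions individually as well as collectively.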
The modified versions of TiZero, alongside the base method, are tested extensively in an open-source football environment. The results reveal that while MARL methods, including the improved versions of TiZero, can learn meaningful gameplay strategies in complex environments such as football, they still struggle with sample inefficiency and learn slowly, which may make them infeasible for practical use in their current form.
The thesis concludes with a discussion on the current limitations of MARL in game-AI and potential directions for future research, emphasizing the need for more efficient models or demonstration-based learning algorithms and the investigation of other forms of exploration and reward shaping to further improve the performance of MARL methods in such complex game environments.
Place, publisher, year, edition, pages
2024. p. 90
Series
IT ; mDV 24 032
Keywords [en]
Reinforcement Learning, Multi-Agent Reinforcement Learning, Game-AI, Artificial Intelligence, Machine Learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-539832
OAI: oai:DiVA.org:uu-539832
DiVA, id: diva2:1903668
External cooperation
Electronic Arts, SEED
Presentation
2024-08-22, Zoom, Uppsala, 13:15 (English)
Supervisors
Examiners
2024-10-07, 2024-10-05, 2024-10-07. Bibliographically approved