
Jaakko Tuomisto


Topic
Optimizing multi-robot behavior using reinforcement learning
Supervisor(s)
Coordinating multiple robots in industrial settings is essential for maximizing productivity and minimizing disruptions. However, optimizing multi-agent behavior is difficult because the number of possible joint action sequences grows combinatorially with both the number of agents and the planning horizon. The difficulty increases in industrial environments with low-latency requirements and frequent changes. Real-world multi-robot systems operate in dynamic and stochastic environments where task priorities shift unexpectedly, robots experience unforeseen issues, and workflows change rapidly.
This research addresses the need for improved efficiency and adaptability in multi-robot decision-making. Recognizing the limitations of conventional approaches, we aim to develop and validate new methods for multi-robot coordination that can handle complex and stochastic scenarios. This includes exploring planning algorithms such as Monte Carlo search techniques for continuous action spaces and multi-agent settings. To improve computational efficiency, we will investigate incorporating learning components, inspired by approaches such as AlphaZero, that are trained on both online and offline data. We will validate these algorithms through simulation studies and aim to deploy them on real robotic systems to demonstrate both their academic and industrial significance.