Xinyi Sheng


Topic
Enhancing Control Reliability in Reinforcement Learning
Many reinforcement learning (RL) systems are designed to maximize average performance, but in real-world settings, what happens in a single run matters more than the theoretical average. A simple example: if an agent repeatedly invests based on a coin toss, the expected outcome may look good on paper, yet in reality most runs end in failure. This highlights a major gap between expected results and actual outcomes, especially in complex, uncertain environments. We need learning systems that are reliable not just in theory, but in every run.
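A short simulation makes this gap concrete. The payoffs and horizon below are assumed purely for illustration (they are not taken from any specific experiment): the expected value of the gamble grows each round, while the typical individual run shrinks toward zero.

```python
import numpy as np

# Minimal sketch of the coin-toss gamble described above. Payoffs are assumed:
# each round, wealth is multiplied by 1.5 on heads and by 0.6 on tails (50/50).
rng = np.random.default_rng(0)
up, down, p = 1.5, 0.6, 0.5
n_runs, n_rounds = 100_000, 100

# Ensemble (expected-value) growth per round: 0.5*1.5 + 0.5*0.6 = 1.05 > 1,
# so "on paper" the average outcome grows exponentially.
expected_final = (p * up + (1 - p) * down) ** n_rounds

# Time-average growth rate per round: 0.5*ln(1.5) + 0.5*ln(0.6) ~ -0.053 < 0,
# so a typical single run decays toward zero despite the positive expectation.
time_avg_growth = p * np.log(up) + (1 - p) * np.log(down)

factors = rng.choice([up, down], size=(n_runs, n_rounds), p=[p, 1 - p])
final_wealth = factors.prod(axis=1)

print(f"expected final wealth (ensemble average): {expected_final:.1f}")
print(f"time-average growth rate per round:       {time_avg_growth:+.3f}")
print(f"median final wealth across runs:          {np.median(final_wealth):.4f}")
print(f"fraction of runs that lost money:         {(final_wealth < 1.0).mean():.2%}")
```

The median run ends far below its starting wealth even though the expectation grows, which is exactly the mismatch between average-case and single-run behavior that motivates this work.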
We propose a new way to train RL agents that shifts the focus from optimizing only the expected return to also accounting for long-term, trajectory-level performance. Our method estimates a “time-average growth rate” directly from what the agent experiences, without needing a model of the environment. It works as a simple add-on to standard RL methods and adds little extra complexity.
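To make the add-on idea concrete, here is a minimal sketch of one way such a signal could be estimated and blended into a standard objective. The function names, the wealth-like trajectory variable `x`, and the blending weight `eta` are illustrative assumptions, not the method's actual interface.

```python
import numpy as np

def time_average_growth_rate(x: np.ndarray) -> float:
    """Empirical per-step growth rate of a positive, multiplicative quantity x_t:
    (1/T) * sum_t log(x_{t+1}/x_t) = (1/T) * log(x_T / x_0),
    estimated purely from observed data, with no model of the environment."""
    x = np.asarray(x, dtype=float)
    T = len(x) - 1
    return float(np.log(x[-1] / x[0]) / T)

def discounted_return(rewards: np.ndarray, gamma: float = 0.99) -> float:
    """Standard expected-return objective used by most RL algorithms."""
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * rewards))

def augmented_objective(rewards, x, gamma: float = 0.99, eta: float = 1.0) -> float:
    """Blend the usual episode return with the trajectory-level growth-rate estimate.
    The resulting scalar can replace the episode return in any policy-gradient
    update (e.g., REINFORCE), so it acts as a drop-in add-on, not a new algorithm."""
    return discounted_return(rewards, gamma) + eta * time_average_growth_rate(x)

# Example: per-step rewards look fine, but the wealth-like signal is shrinking,
# so the growth-rate term pulls the objective down.
rewards = np.array([1.0, 1.0, 1.0, 1.0])
wealth = np.array([1.0, 1.2, 0.9, 0.7, 0.5])
print(augmented_objective(rewards, wealth))
```

In this sketch the growth-rate term penalizes trajectories whose long-run multiplicative behavior is poor even when step-wise rewards appear acceptable, which is the kind of trajectory-level correction described above.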
This approach leads to more reliable and stable behavior, especially in noisy or unpredictable environments. It aligns learning with how systems actually behave over time, not just on average. The method is easy to integrate into existing RL frameworks, with no need for additional modeling or assumptions.
Other techniques that improve reliability often need detailed knowledge of the environment or are limited to very specific types of problems. Our method is general, flexible, and works directly from data, which makes it a practical and scalable solution for real-world control systems.

