My group at Huawei Technologies Research and Development UK Ltd. develops reinforcement learning algorithms applicable to real-world environments, including but not limited to self-driving cars, robotics, logistics, and communication networks. Realising that current decision-making algorithms lack robustness, safety, efficiency, and generalisability, we are assembling a world-leading team to tackle these challenges. My team currently includes 15 colleagues, and we plan to grow. If any of the below sounds interesting, don't hesitate to contact me.
Challenge I: Robust Reinforcement Learning
Though they perform well on specific tasks, reinforcement learning algorithms overfit to the training environment. In fact, recent research has shown that reinforcement learning algorithms tend to fail atrociously when tested on even slightly modified systems. In other words, current reinforcement learning methods either require a perfect simulator, which is extremely challenging to design, or need to learn while interacting with the real world, which is inefficient because current algorithms require millions of agent-environment interactions.
To obtain agents that adapt to simulator mismatches, our goal is to design reinforcement learning algorithms that are robust to changes in the environment. Interestingly, this topic has been extensively studied in robust optimal control, where the goal is to find a controller robust to worst-case disturbances (typically relatively simple ones, e.g., additive) applied to the transition model. As in robust control, our goal is to design decision-making algorithms that can handle such disturbances. In contrast to robust control, however, we consider a broader range of possible disturbances, e.g., changes in the dynamics of the transition model. Moreover, we also question whether worst-case loss functions are the correct objectives to optimise.
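As a toy illustration of the difference between a worst-case and an average-case objective, consider the sketch below. It is not one of our algorithms: the scalar system, the quadratic cost, the set of drift values, and all names (`rollout_return`, `evaluate`, `DRIFTS`, `GAINS`) are assumptions made purely for this example. A linear policy is scored on several perturbed transition models, and we then pick the gain that maximises either the worst return or the mean return.

```python
from statistics import mean


def rollout_return(gain, drift, horizon=50, x0=1.0):
    """Return (negative quadratic cost) of the linear policy u = -gain * x
    on the hypothetical scalar system x_next = drift * x + u."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -gain * x
        total -= x * x + 0.1 * u * u  # penalise state and control effort
        x = drift * x + u             # transition model, perturbed via `drift`
    return total


def evaluate(gain, drifts):
    """Score one policy on every model in the (assumed) uncertainty set.
    Returns (worst-case return, average return)."""
    returns = [rollout_return(gain, d) for d in drifts]
    return min(returns), mean(returns)


# Assumed uncertainty set over the dynamics and a small grid of candidate gains.
DRIFTS = [0.8, 1.0, 1.2]
GAINS = [0.2 * i for i in range(1, 10)]

# Robust choice: maximise the worst-case return over the uncertainty set.
robust_gain = max(GAINS, key=lambda g: evaluate(g, DRIFTS)[0])
# Alternative choice: maximise the average return over the same set.
average_gain = max(GAINS, key=lambda g: evaluate(g, DRIFTS)[1])

if __name__ == "__main__":
    for name, g in [("worst-case", robust_gain), ("average", average_gain)]:
        worst, avg = evaluate(g, DRIFTS)
        print(f"{name:>10} objective: gain={g:.1f} worst={worst:.3f} mean={avg:.3f}")
```

By construction, the gain selected under the worst-case objective has a worst-case return at least as high as the one selected under the average objective, and the reverse holds for the mean return. Whether the two choices actually differ depends on the uncertainty set, which is one reason to ask whether the worst case is the right quantity to optimise.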