Fundamentals of RL & Agents
Overview
Reinforcement Learning (RL) has achieved remarkable success, yet fundamental challenges remain in making agents sample-efficient, scalable, and capable of long-term reasoning. Our research delves into the theoretical underpinnings of RL to build more robust autonomous agents.
Active Projects
1. Scalable Environments with JAX
Goal: Accelerate RL research by orders of magnitude. Details: Building on our work on Navix, we leverage JAX to create vectorised grid-world environments that compile directly to XLA. This allows for massive parallelisation, enabling us to train agents in seconds rather than hours and to explore meta-learning frontiers previously out of reach.
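The core trick is to write the environment step as a pure, batched array operation rather than a Python loop over environments. Here is a minimal sketch of that batched-step idea in NumPy (all names and the toy dynamics are hypothetical, not Navix's actual API); in JAX the same per-environment step would be written once and lifted with `jax.vmap` and `jax.jit` so XLA compiles the whole batch.

```python
import numpy as np

def step_batch(positions, actions, grid_size=8):
    """Advance a whole batch of grid-world agents in one vectorised call.

    positions: (N, 2) int array of (row, col) agent coordinates.
    actions:   (N,) int array in {0: up, 1: down, 2: left, 3: right}.
    Returns new positions, clipped to the grid, and a reward of 1.0
    for any agent that reaches the goal corner.
    """
    moves = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]])
    new_pos = np.clip(positions + moves[actions], 0, grid_size - 1)
    goal = np.array([grid_size - 1, grid_size - 1])
    rewards = (new_pos == goal).all(axis=1).astype(np.float32)
    return new_pos, rewards

# One call steps four environments at once -- no Python loop over envs.
pos = np.zeros((4, 2), dtype=int)
acts = np.array([1, 1, 3, 0])
pos, rew = step_batch(pos, acts)
```

Because every operation is a fixed-shape array op, the batch dimension scales to thousands of environments essentially for free on an accelerator.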
2. Temporal Credit Assignment
Goal: Solve the “needle in a haystack” problem in long-horizon tasks. Details: When a reward is delayed, how does the agent know which past action caused it? We are developing new mechanisms for credit assignment that go beyond simple backpropagation through time, allowing agents to connect cause and effect over thousands of steps.
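For context, the classical mechanism we aim to improve on is the eligibility trace: each visited state keeps a decaying marker, and when a delayed reward finally produces a TD error, the error is spread backwards over all recently marked states at once. A minimal tabular TD(λ) sketch (illustrative only; function names and the toy chain task are our own, not the project's method):

```python
import numpy as np

def td_lambda_update(values, trajectory, rewards,
                     alpha=0.1, gamma=0.99, lam=0.9):
    """One episode of tabular TD(lambda). Eligibility traces spread each
    TD error over recently visited states, so a delayed reward still
    credits the early states that led to it."""
    traces = np.zeros_like(values)
    for t in range(len(trajectory) - 1):
        s, s_next = trajectory[t], trajectory[t + 1]
        delta = rewards[t] + gamma * values[s_next] - values[s]  # TD error
        traces *= gamma * lam   # decay all existing traces
        traces[s] += 1.0        # mark the current state as eligible
        values += alpha * delta * traces
    return values

# A reward arriving only at the end of a 4-state chain still updates
# the value of state 0, three steps before the reward.
V = td_lambda_update(np.zeros(4), trajectory=[0, 1, 2, 3],
                     rewards=[0.0, 0.0, 1.0])
```

Traces decay geometrically in (γλ), so credit effectively vanishes after a few hundred steps; bridging thousands of steps is exactly where new mechanisms are needed.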
3. Sample Efficiency via Invariances
Goal: Learn faster by understanding symmetries. Details: We incorporate group theory into RL agents. By explicitly encoding known invariances (e.g., rotation, translation) into the network structure or the learning objective, we drastically reduce the number of samples needed to master a task.
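One simple way to encode a known invariance is group averaging: evaluate the network on every transformed copy of the observation and average, which makes the output exactly invariant by construction. A hedged sketch for the planar rotation group C4 on grid observations (toy value function and names are ours, not the project's architecture):

```python
import numpy as np

def q_raw(obs, weights):
    """An arbitrary, non-invariant value estimate of a grid observation."""
    return float(np.sum(obs * weights))

def q_invariant(obs, weights):
    """Average the estimate over the four planar rotations (group C4).
    Rotating the observation cannot change the value, so the agent
    never has to relearn rotated views of the same situation."""
    return sum(q_raw(np.rot90(obs, k), weights) for k in range(4)) / 4.0

rng = np.random.default_rng(0)
obs = rng.random((5, 5))
w = rng.random((5, 5))
# Invariant by construction: the rotated observation gets the same value.
assert np.isclose(q_invariant(obs, w), q_invariant(np.rot90(obs), w))
```

Averaging over a 4-element group costs four forward passes; equivariant network layers achieve the same effect at roughly the cost of one, which is why they are the preferred way to bake symmetries into the architecture.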
Related Publications
A Kayal, S Vakili, L Toni, A Bernacchia. AISTATS 2025.
A Kayal, S Vakili, L Toni, A Bernacchia. ICML 2024.
E Pignatelli, J Liesen, RT Lange, C Lu, PS Castro, L Toni. NeurIPS 2025 Datasets and Benchmarks Track.
E Pignatelli, J Ferret, T Rocktäschel, E Grefenstette, D Paglieri, S Coward, et al. arXiv preprint 2024.