<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>UCL LASP | Learning And Signal Processing</title><link>https://ucl-lasp.github.io/</link><atom:link href="https://ucl-lasp.github.io/index.xml" rel="self" type="application/rss+xml"/><description>UCL LASP</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 25 Jan 2026 00:00:00 +0000</lastBuildDate><image><url>https://ucl-lasp.github.io/media/icon_hu488c70cfa50b07216f285734af4abcd1_22080_512x512_fill_lanczos_center_3.png</url><title>UCL LASP</title><link>https://ucl-lasp.github.io/</link></image><item><title>Agents for Optimization</title><link>https://ucl-lasp.github.io/project/optimization-agents/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/optimization-agents/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Many industrial problems (routing, scheduling, circuit design) are NP-hard combinatorial optimization challenges. We investigate whether learning-based agents can &amp;ldquo;outsmart&amp;rdquo; or accelerate classical solvers.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-neural-combinatorial-optimization">1. Neural Combinatorial Optimization&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Learn heuristics from data.
&lt;strong>Details:&lt;/strong> Instead of hand-crafting heuristics for every new problem, we train &lt;strong>RL agents&lt;/strong> to learn construction and improvement heuristics automatically. We focus on graph-based problems where the agent learns to traverse the graph to build a valid solution.&lt;/p>
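&lt;p>To make the idea of a construction heuristic concrete, the sketch below builds a tour on a small graph one node at a time. The nearest-neighbour rule stands in for a hand-crafted heuristic; in a learned solver, a trained policy would take the place of &lt;code>choose_next&lt;/code>. All function names and the example coordinates are illustrative assumptions, not code from the project.&lt;/p>

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def choose_next(points, current, unvisited):
    """Hand-crafted rule: pick the closest unvisited node.

    A learned agent would replace this function with a policy that
    scores candidate nodes (hypothetical interface, for illustration).
    """
    return min(unvisited, key=lambda j: math.dist(points[current], points[j]))


def construct_tour(points, start=0):
    """Build a valid tour by repeatedly extending a partial solution."""
    tour = [start]
    unvisited = set(range(len(points))) - {start}
    while unvisited:
        nxt = choose_next(points, tour[-1], unvisited)
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Four corners of a unit square plus its centre (made-up instance).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
tour = construct_tour(points)
```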
&lt;h3 id="2-generalizable-solvers">2. Generalizable Solvers&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Agents that generalize across problem sizes.
&lt;strong>Details:&lt;/strong> A major limitation of neural solvers is poor generalization to instance sizes not seen during training. We are designing architectures (based on GNNs and attention) that allow an agent trained on small graphs (e.g., 20 nodes) to generalize zero-shot to large-scale instances (e.g., 1000 nodes) without retraining.&lt;/p>
&lt;h2 id="works-done">Works Done&lt;/h2></description></item><item><title>AI for Sustainable Power Grids</title><link>https://ucl-lasp.github.io/project/sustainable-grids/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/sustainable-grids/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>The transition to renewable energy requires a smarter, more resilient grid. We apply graph-based learning to manage the combinatorial complexity of power networks and critical infrastructure.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-neural-unit-commitment">1. Neural Unit Commitment&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Optimize power dispatch in real time.
&lt;strong>Details:&lt;/strong> The Unit Commitment (UC) problem—deciding which power plants to turn on—is a hard combinatorial problem. We are designing &lt;strong>Graph Neural Networks&lt;/strong> that can approximate optimal solutions for UC faster than classical solvers, facilitating the integration of fluctuating renewable sources like wind and solar.&lt;/p>
&lt;h3 id="2-resilient-infrastructure-monitoring">2. Resilient Infrastructure Monitoring&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Detect failures before they become disasters.
&lt;strong>Details:&lt;/strong> Building on our work in water distribution networks, we develop graph-based anomaly detection systems. These models learn the topology of the infrastructure to localize leaks, faults, or attacks in complex sensor networks.&lt;/p>
&lt;h2 id="works-done">Works Done&lt;/h2></description></item><item><title>Fundamentals of RL &amp; Agents</title><link>https://ucl-lasp.github.io/project/rl-fundamentals/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/rl-fundamentals/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Reinforcement Learning (RL) has achieved remarkable success, yet fundamental challenges remain in making agents sample-efficient, scalable, and capable of long-term reasoning. Our research delves into the theoretical underpinnings of RL to build more robust autonomous agents.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-scalable-environments-with-jax">1. Scalable Environments with JAX&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Accelerate RL research by orders of magnitude.
&lt;strong>Details:&lt;/strong> Building on our work on &lt;strong>Navix&lt;/strong>, we leverage JAX to create vectorised grid-world environments that compile directly to XLA. This allows for massive parallelisation, enabling us to train agents in seconds rather than hours and explore meta-learning frontiers previously out of reach.&lt;/p>
&lt;h3 id="2-temporal-credit-assignment">2. Temporal Credit Assignment&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Solve the &amp;ldquo;needle in a haystack&amp;rdquo; problem in long-horizon tasks.
&lt;strong>Details:&lt;/strong> When a reward is delayed, how does the agent know which past action caused it? We are developing new mechanisms for &lt;strong>credit assignment&lt;/strong> that go beyond simple backpropagation through time, allowing agents to connect cause and effect over thousands of steps.&lt;/p>
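&lt;p>As a point of reference for the problem described above, the sketch below computes discounted returns, the simplest baseline for spreading a delayed reward over earlier steps. It credits every earlier action purely by its distance in time from the reward, whether or not that action mattered, which is the limitation that more refined credit-assignment mechanisms try to address. The function name and the episode are illustrative assumptions.&lt;/p>

```python
def discounted_returns(rewards, gamma=0.99):
    """Return-to-go for every step, computed backwards from the last step."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]


# A single reward arrives only at the final step of a 5-step episode.
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
```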
&lt;h3 id="3-sample-efficiency-via-invariances">3. Sample Efficiency via Invariances&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Learn faster by understanding symmetries.
&lt;strong>Details:&lt;/strong> We incorporate group theory into RL agents. By explicitly encoding known invariances (e.g., rotation, translation) into the network structure or the learning objective, we drastically reduce the number of samples needed to master a task.&lt;/p>
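&lt;p>One simple way to encode a known invariance is to average a model's output over the symmetry group. The sketch below does this for the four 90-degree rotations of a square grid. Here &lt;code>raw_score&lt;/code> is a made-up, non-invariant function standing in for a network, and group averaging is only one of several possible approaches, not necessarily the one used in this project.&lt;/p>

```python
def rotate90(grid):
    """Rotate a square grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def orbit(grid):
    """All four rotations of the grid (its orbit under 90-degree rotations)."""
    grids = [grid]
    for _ in range(3):
        grids.append(rotate90(grids[-1]))
    return grids


def raw_score(grid):
    """Toy, non-invariant scorer: weights each cell by its position."""
    n = len(grid)
    return sum(grid[r][c] * (r * n + c) for r in range(n) for c in range(n))


def invariant_score(grid):
    """Average the raw score over the orbit, giving a rotation-invariant score."""
    scores = [raw_score(g) for g in orbit(grid)]
    return sum(scores) / len(scores)


grid = [[1, 0], [0, 0]]
rotated = rotate90(grid)
```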
&lt;h2 id="related-publications">Related Publications&lt;/h2>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning&lt;/span>&lt;br>
A Kayal, S Vakili, L Toni, A Bernacchia. &lt;em>AISTATS 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/kayal-2025-sample/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2502.07715.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Reward-Free Kernel-Based Reinforcement Learning&lt;/span>&lt;br>
A Kayal, S Vakili, L Toni, A Bernacchia. &lt;em>ICML 2024&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/kayal-2024-reward/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2502.07715.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Navix: Scaling MiniGrid Environments with JAX&lt;/span>&lt;br>
E Pignatelli, J Liesen, RT Lange, C Lu, PS Castro, L Toni. &lt;em>NeurIPS 2025 Dataset Track&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/pignatelli-2025-navix/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/abs/2407.19396" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Assessing the zero-shot capabilities of LLMs for action evaluation in RL&lt;/span>&lt;br>
E Pignatelli, J Ferret, T Rocktäschel, E Grefenstette, D Paglieri, S Coward, et al. &lt;em>arXiv preprint 2024&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/pignatelli-2024-assessing/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2409.12798.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">A survey of temporal credit assignment in deep reinforcement learning&lt;/span>&lt;br>
E Pignatelli, J Ferret, M Geist, T Mesnard, H van Hasselt, O Pietquin, L Toni. &lt;em>arXiv preprint 2023&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/pignatelli-2023-survey/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2312.01072.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds&lt;/span>&lt;br>
A Kayal, S Vakili, L Toni, D Shiu, A Bernacchia. &lt;em>ICML 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/kayal-2025-bayesian/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2505.23673.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div></description></item><item><title>Geometric &amp; Graph Generative AI</title><link>https://ucl-lasp.github.io/project/generative-ai/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/generative-ai/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>We investigate the fundamental limits of learning and information processing for geometric data. Our goal is to develop theoretically grounded generative models that can handle the complexity of 3D structures and molecular graphs.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-autoregressive-expansion-for-latent-graph-diffusion">1. Autoregressive Expansion for Latent Graph Diffusion&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Extend Latent Graph Diffusion (LGDC) by introducing an autoregressive expansion mechanism.
&lt;strong>Details:&lt;/strong> Instead of expanding all nodes in a single step, this project generates fine-level structure iteratively, allowing local decisions to be conditioned on previously generated substructures.&lt;/p>
&lt;h3 id="2-grounding-geometric-generative-models">2. Grounding Geometric Generative Models&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Leverage discrete differential geometry to build better generative models.
&lt;strong>Details:&lt;/strong> We view graphs as samples from an underlying manifold. This project derives new families of diffusion and flow-based models grounded in curvature approximations and stochastic differential equations on manifolds.&lt;/p>
&lt;h2 id="related-publications">Related Publications&lt;/h2>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">LGDC: Latent Graph Diffusion via Spectrum-Preserving Coarsening&lt;/span>&lt;br>
N Osman, K Jiang, D Buffelli, X Dong, L Toni. &lt;em>NeurIPS 2025 Workshop&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/osman-2025-lgdc/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2512.01190.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Effects of Random Edge-Dropping on Over-Squashing in Graph Neural Networks&lt;/span>&lt;br>
J Singh, K Jiang, B Paige, L Toni. &lt;em>NeurIPS 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/singh-2025-dropping/" target="_blank" rel="noopener">Details&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Bures-Wasserstein Flow Matching for Graph Generation&lt;/span>&lt;br>
K Jiang, J Cui, X Dong, L Toni. &lt;em>Submitted to ICLR, 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/jiang-2025-flow/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2506.14020.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">From In Silico to In Vitro: Evaluating Molecule Generative Models&lt;/span>&lt;br>
N Osman, V Lembo, G Bottegoni, L Toni. &lt;em>NeurIPS 2025 AI4Science Workshop&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/osman-2025-insilico/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2512.22031.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation&lt;/span>&lt;br>
C Vignac, N Osman, L Toni, P Frossard. &lt;em>ECML PKDD 2023&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/vignac-2023-midi/" target="_blank" rel="noopener">Details&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Heterogeneous Graph Structure Learning through the Lens of Data-generating Processes&lt;/span>&lt;br>
K Jiang, B Tang, X Dong, L Toni. &lt;em>AISTATS 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/jiang-2025-heterogeneous/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2503.08760.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div></description></item><item><title>Graph ML for Science</title><link>https://ucl-lasp.github.io/project/graph-science/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/graph-science/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Biology and chemistry are fundamentally relational—molecules are graphs of atoms, and cellular functions rely on complex interaction networks. We develop geometric deep learning methods to model, generate, and understand these structures.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-generative-biology--drug-discovery">1. Generative Biology &amp;amp; Drug Discovery&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Move from &amp;ldquo;In Silico&amp;rdquo; generation to &amp;ldquo;In Vitro&amp;rdquo; validation.
&lt;strong>Details:&lt;/strong> We are building generative models (like &lt;strong>MiDi&lt;/strong> and &lt;strong>LGDC&lt;/strong>) that can design novel molecules with specific 3D geometries and chemical properties. A key focus is bridging the gap between computational metrics and actual wet-lab success rates.&lt;/p>
&lt;h3 id="2-transcriptomics--interaction-networks">2. Transcriptomics &amp;amp; Interaction Networks&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Decode the language of the cell.
&lt;strong>Details:&lt;/strong> Using Graph Neural Networks (GNNs) and Graph Signal Processing, we model gene regulatory networks and protein-protein interactions. Our aim is to infer causal relationships in transcriptomic data to identify potential therapeutic targets.&lt;/p>
&lt;h2 id="related-publications">Related Publications&lt;/h2>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">From In Silico to In Vitro: Evaluating Molecule Generative Models&lt;/span>&lt;br>
N Osman, V Lembo, G Bottegoni, L Toni. &lt;em>NeurIPS 2025 AI4Science Workshop&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/osman-2025-insilico/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2512.22031.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">LGDC: Latent Graph Diffusion via Spectrum-Preserving Coarsening&lt;/span>&lt;br>
N Osman, K Jiang, D Buffelli, X Dong, L Toni. &lt;em>NeurIPS 2025 Workshop&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/osman-2025-lgdc/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2512.01190.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Heterogeneous Graph Structure Learning through the Lens of Data-generating Processes&lt;/span>&lt;br>
K Jiang, B Tang, X Dong, L Toni. &lt;em>AISTATS 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/jiang-2025-heterogeneous/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2503.08760.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div></description></item><item><title>Life at LASP</title><link>https://ucl-lasp.github.io/lab-life/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/lab-life/</guid><description>&lt;h2 id="our-team-in-action">Our Team in Action&lt;/h2>
&lt;p>We work hard, but we also enjoy our time together! Here are some highlights from recent conferences and social events.&lt;/p>
&lt;div class="gallery-grid">
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/220516_J-Tye_UCL_99_EEE_Festival_of_Research-6127.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/220516_J-Tye_UCL_99_EEE_Festival_of_Research-6127_hu00bfd3855d424a07c9e10f392f19e215_6414172_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="220516_J-Tye_UCL_99_EEE_Festival_of_Research-6127.jpg" width="537" height="750">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/2W3A0419_9c279.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/2W3A0419_9c279_hu0f1b02e974bece61557d8c1e3451ca7e_146397_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="2W3A0419_9c279.jpg" width="640" height="360">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/Image%20from%20iOS%20-%201.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/Image%20from%20iOS%20-%201_hu9b023497892eae677a24b502f5be28a7_1745066_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="Image from iOS - 1.jpg" width="750" height="478">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/IMG_7829%20%281%29.png" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/IMG_7829%20%281%29_hu96b93ebf4a6b82460fb34f865bee1170_14355259_750x750_fit_q75_h2_lanczos_3.webp" loading="lazy" alt="IMG_7829 (1).png" width="750" height="563">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/IMG_8250%20%281%29.png" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/IMG_8250%20%281%29_hu164232383358bac7ee5ea783453f2e09_11035857_750x750_fit_q75_h2_lanczos_3.webp" loading="lazy" alt="IMG_8250 (1).png" width="750" height="563">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/LASP%20group%20photo%20-%20January%202022.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/LASP%20group%20photo%20-%20January%202022_hu493077aa3318a1d6ec645d06b297c9d4_632864_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="LASP group photo - January 2022.jpg" width="750" height="445">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/Snow3Feb09_42_90650.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/Snow3Feb09_42_90650_huca90780f538a176ff1d043d8f8032ae9_135785_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="Snow3Feb09_42_90650.jpg" width="640" height="466">
&lt;/a>
&lt;/div>
&lt;div class="gallery-item gallery-item--medium">
&lt;a data-fancybox="gallery-lab-life" href="https://ucl-lasp.github.io/media/albums/lab-life/UCL_Welcome_2024_15_ef6aa.jpg" >
&lt;img src="https://ucl-lasp.github.io/media/albums/lab-life/UCL_Welcome_2024_15_ef6aa_hu091ed913fe0fc99eefb1033067aea074_97173_750x750_fit_q75_h2_lanczos.webp" loading="lazy" alt="UCL_Welcome_2024_15_ef6aa.jpg" width="640" height="427">
&lt;/a>
&lt;/div>
&lt;/div></description></item><item><title>LLM Alignment &amp; Exploration</title><link>https://ucl-lasp.github.io/project/llm-alignment/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/project/llm-alignment/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Large Language Models (LLMs) are powerful, but aligning them with human preferences and encouraging them to explore novel solutions remains difficult. We bring techniques from control theory and exploration research to LLMs.&lt;/p>
&lt;h2 id="active-projects">Active Projects&lt;/h2>
&lt;h3 id="1-bayesian-optimization-from-human-feedback">1. Bayesian Optimization from Human Feedback&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Optimize LLM outputs with minimal human labelling.
&lt;strong>Details:&lt;/strong> We treat alignment as a Bayesian Optimization problem. By efficiently querying human preferences, we aim to find optimal prompts or model weights with theoretical regret bounds, minimizing the cost of human annotation.&lt;/p>
&lt;h3 id="2-post-training-exploration">2. Post-Training Exploration&lt;/h3>
&lt;p>&lt;strong>Goal:&lt;/strong> Encourage LLMs to think &amp;ldquo;outside the box.&amp;rdquo;
&lt;strong>Details:&lt;/strong> Standard RLHF can lead to mode collapse (repetitive answers). We are investigating the impact of &lt;strong>intrinsic rewards&lt;/strong> on LLMs, encouraging the model to explore diverse reasoning paths and discover creative solutions during the fine-tuning phase.&lt;/p>
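&lt;p>As an illustration of an intrinsic reward, the sketch below implements a count-based novelty bonus that shrinks each time the same output is produced again. This is a generic exploration bonus chosen for brevity. The class name, the &lt;code>scale&lt;/code> parameter and the use of whole strings as keys are assumptions, not the method studied in this project.&lt;/p>

```python
import math
from collections import Counter


class CountBonus:
    """Count-based novelty bonus: scale / sqrt(times seen so far)."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = Counter()

    def __call__(self, item):
        # Record the visit first so the first occurrence gets bonus = scale.
        self.counts[item] += 1
        return self.scale / math.sqrt(self.counts[item])


bonus = CountBonus()
answers = ["plan A", "plan A", "plan B", "plan A", "plan A"]
rewards = [bonus(a) for a in answers]
```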
&lt;h2 id="related-publications">Related Publications&lt;/h2>
&lt;div class="pub-list-item" style="margin-bottom: 1rem;">
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true">&lt;/i>
&lt;span style="font-weight: bold;">Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds&lt;/span>&lt;br>
A Kayal, S Vakili, L Toni, D Shiu, A Bernacchia. &lt;em>ICML 2025&lt;/em>.&lt;br>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://ucl-lasp.github.io/publication/kayal-2025-bayesian/" target="_blank" rel="noopener">Details&lt;/a>
&lt;a class="btn btn-outline-primary btn-page-header btn-sm" href="https://arxiv.org/pdf/2505.23673.pdf" target="_blank" rel="noopener">PDF&lt;/a>
&lt;/div></description></item><item><title/><link>https://ucl-lasp.github.io/people/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/people/</guid><description/></item><item><title>Alan Guedes</title><link>https://ucl-lasp.github.io/author/alan-guedes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/author/alan-guedes/</guid><description>&lt;meta http-equiv="refresh" content="0; url=https://www.reading.ac.uk/computer-science/staff/dr-alan-guedes" />
&lt;script>window.location = "https://www.reading.ac.uk/computer-science/staff/dr-alan-guedes";&lt;/script></description></item><item><title>Publications</title><link>https://ucl-lasp.github.io/publication/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://ucl-lasp.github.io/publication/</guid><description>&lt;p>&lt;strong>Bures-Wasserstein Flow Matching for Graph Generation&lt;/strong>
K Jiang, J Cui, X Dong, L Toni&lt;br>
2026. Submitted to ICLR [arXiv:2506.14020]&lt;/p>
&lt;p>&lt;strong>Reinforcement Learning Using Known Invariances&lt;/strong>
A Cioba, A Kayal, L Toni, S Vakili, A Bernacchia&lt;br>
2026. AISTATS [arXiv:2511.03473]&lt;/p>
&lt;p>&lt;strong>GT-MilliNoise: Graph transformer for point-wise denoising of indoor millimeter-wave point clouds&lt;/strong>
P Gomes, W Brescia, S Mascolo, L Toni, L De Cicco&lt;br>
2025. Signal Processing: Image Communication&lt;/p>
&lt;p>&lt;strong>Effects of Random Edge-Dropping on Over-Squashing in Graph Neural Networks&lt;/strong>
J Singh, K Jiang, B Paige, L Toni&lt;br>
2025. NeurIPS&lt;/p>
&lt;p>&lt;strong>Navix: Scaling MiniGrid Environments with JAX&lt;/strong>
E Pignatelli, J Liesen, RT Lange, C Lu, PS Castro, L Toni&lt;br>
2025. NeurIPS Dataset Track [preprint arXiv:2407.19396]&lt;/p>
&lt;p>&lt;strong>From In Silico to In Vitro: Evaluating Molecule Generative Models for Hit Generation&lt;/strong>
N Osman, V Lembo, G Bottegoni, L Toni&lt;br>
2025. AI4Science Workshop @ NeurIPS 2025&lt;/p>
&lt;p>&lt;strong>LGDC: Latent Graph Diffusion via Spectrum-Preserving Coarsening&lt;/strong>
N Osman, K Jiang, D Buffelli, X Dong, L Toni&lt;br>
2025. New Perspective in Graph Machine Learning Workshop @ NeurIPS 2025&lt;/p>
&lt;p>&lt;strong>MERINA+: Improving Generalization for Neural Video Adaptation via Information-Theoretic Meta-Reinforcement Learning&lt;/strong>
N Kan, C Li, Y Jiang, W Dai, J Zou, H Xiong, L Toni&lt;br>
2025. IEEE Transactions on Circuits and Systems for Video Technology&lt;/p>
&lt;p>&lt;strong>The impact of intrinsic rewards on exploration in Reinforcement Learning&lt;/strong>
A Kayal, E Pignatelli, L Toni&lt;br>
2025. Neural Computing and Applications&lt;/p>
&lt;p>&lt;strong>Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds&lt;/strong>
A Kayal, S Vakili, L Toni, D Shiu, A Bernacchia&lt;br>
2025. ICML [arXiv:2505.23673]&lt;/p>
&lt;p>&lt;strong>Heterogeneous Graph Structure Learning through the Lens of Data-generating Processes&lt;/strong>
K Jiang, B Tang, X Dong, L Toni&lt;br>
2025. AISTATS [arXiv:2503.08760]&lt;/p>
&lt;p>&lt;strong>Reward-Free Kernel-Based Reinforcement Learning&lt;/strong>
A Kayal, S Vakili, L Toni, A Bernacchia&lt;br>
2024. ICML [arXiv:2502.07715]&lt;/p>
&lt;p>&lt;strong>Assessing the zero-shot capabilities of LLMs for action evaluation in RL&lt;/strong>
E Pignatelli, J Ferret, T Rocktäschel, E Grefenstette, D Paglieri, S Coward, et al.&lt;br>
2024. arXiv preprint arXiv:2409.12798&lt;/p>
&lt;p>&lt;strong>AGAR: Attention Graph-RNN for Adaptative Motion Prediction of Point Clouds of Deformable Objects&lt;/strong>
PM Gomes, S Rossi, L Toni&lt;br>
2024. ACM Transactions on Multimedia Computing, Communications and Applications&lt;/p>
&lt;p>&lt;strong>Learning algorithm generalization error bounds via auxiliary distributions&lt;/strong>
G Aminian, S Masiha, L Toni, MRD Rodrigues&lt;br>
2024. IEEE Journal on Selected Areas in Information Theory&lt;/p>
&lt;p>&lt;strong>MilliNoise: a millimeter-wave radar sparse point cloud dataset in indoor scenarios&lt;/strong>
W Brescia, P Gomes, L Toni, S Mascolo, L De Cicco&lt;br>
2024. Proceedings of the 15th ACM Multimedia Systems Conference&lt;/p>
&lt;p>&lt;strong>Conditional Meta-Reinforcement Learning with State Representation&lt;/strong>
Y Sun, L Toni, Y Andreopoulos&lt;br>
2024. Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs&lt;/p>
&lt;p>&lt;strong>A survey of temporal credit assignment in deep reinforcement learning&lt;/strong>
E Pignatelli, J Ferret, M Geist, T Mesnard, H van Hasselt, O Pietquin, L Toni&lt;br>
2023. arXiv preprint arXiv:2312.01072&lt;/p>
&lt;p>&lt;strong>Information-theoretic characterizations of generalization error for the Gibbs algorithm&lt;/strong>
G Aminian, Y Bu, L Toni, MRD Rodrigues, GW Wornell&lt;br>
2023. IEEE Transactions on Information Theory&lt;/p></description></item></channel></rss>