- research-article, January 2025
Model-based RL for mean-field games is not statistically harder than single-agent RL
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 798, Pages 19816–19870. We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension ...
- research-article, January 2025
Global reinforcement learning: beyond linear and convex rewards via submodular semi-gradient methods
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 408, Pages 10235–10266. In classic Reinforcement Learning (RL), the agent maximizes an additive objective of the visited states, e.g., a value function. Unfortunately, objectives of this type cannot model many real-world applications such as experiment design, exploration, ...
- research-article, January 2025
Geometric active exploration in Markov decision processes: the benefit of abstraction
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 407, Pages 10206–10234. How can a scientist use a Reinforcement Learning (RL) algorithm to design experiments over a dynamical system's state space? In the case of finite and Markovian systems, an area called Active Exploration (AE) relaxes the optimization problem of ...
- research-article, May 2024
Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 973–982. Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite ...
- research-article, May 2024
Provably Learning Nash Policies in Constrained Markov Potential Games
Multi-agent reinforcement learning addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world scenarios, agents not only aim to maximize their goals but also need to ensure safe ...
- research-article, May 2024
Contextual stochastic bilevel optimization
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 3428, Pages 78412–78434. We introduce contextual stochastic bilevel optimization (CSBO) - a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This ...
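The bilevel structure described in this abstract can be illustrated with a minimal sketch: an inner loop approximately solves the lower-level stochastic problem, and the outer loop updates the upper-level variable through the chain rule. The toy objective, constants, and the analytic derivative `dy*/dx = a` are all made up for illustration and are not the paper's CSBO algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
a, target = 2.0, 4.0  # illustrative problem constants

def inner_solve(x, steps=50, lr=0.1):
    """Lower level: min_y E[0.5*(y - a*x + noise)^2], so y*(x) = a*x."""
    y = 0.0
    for _ in range(steps):
        noise = rng.normal(scale=0.1)
        y -= lr * (y - a * x + noise)  # stochastic gradient of the inner loss
    return y

x = 0.0
for _ in range(200):
    y_star = inner_solve(x)
    # Upper level: min_x 0.5*(y*(x) - target)^2, with dy*/dx = a known here.
    x -= 0.05 * (y_star - target) * a
# x converges toward target / a = 2.0 on this toy problem.
```

In the contextual setting studied by the paper, the lower-level solution additionally depends on random contextual information, which is what makes plain two-loop schemes like this sketch insufficient.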
- research-article, May 2024
Implicit manifold Gaussian process regression
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 2961, Pages 67701–67720. Gaussian process regression is widely used because of its ability to provide well-calibrated uncertainty estimates and handle small or sparse datasets. However, it struggles with high-dimensional data. One possible way to scale this technique to higher ...
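The well-calibrated uncertainty the abstract refers to is visible even in a bare-bones exact GP regression sketch: posterior variance is small near training data and reverts to the prior far away. This is a generic illustration with an RBF kernel, not the manifold-aware method of the paper; the kernel and noise level are arbitrary choices.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Exact GP posterior mean and pointwise variance at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kss = rbf(Xs, Xs)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X).ravel()
mean, var = gp_posterior(X, y, np.array([[1.0], [5.0]]))
# var is small at the training point x=1 and near the prior value at x=5.
```

The cubic cost of the `solve` on the n-by-n kernel matrix is one reason scaling GPs to high-dimensional or large datasets, as the paper discusses, requires further structure.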
- research-article, May 2024
Stochastic approximation algorithms for systems of interacting particles
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 2436, Pages 55826–55847. Interacting particle systems have proven highly successful in various machine learning tasks, including approximate Bayesian inference and neural network optimization. However, the analysis of these systems often relies on the simplifying assumption of ...
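As a rough picture of the object of study, the following sketch runs a noisy interacting-particle update with a decreasing stochastic-approximation step size: each particle feels a confining potential plus an attraction toward the empirical mean. The potential, interaction term, and step-size schedule are invented for illustration and do not correspond to the algorithms analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, steps = 100, 500
x = rng.normal(size=n)  # particle positions

for t in range(1, steps + 1):
    grad_v = x                       # confining potential V(x) = x^2 / 2
    interaction = x - x.mean()       # attraction toward the empirical mean
    noise = rng.normal(size=n)
    step = 1.0 / (10 + t)            # decreasing stochastic-approximation step
    x = x - step * (grad_v + 0.5 * interaction) + np.sqrt(2 * step) * noise
# The empirical distribution of the particles settles near a stationary law.
```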
- research-article, May 2024
Efficient exploration in continuous-time model-based reinforcement learning
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1825, Pages 42119–42147. Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time ...
- research-article, May 2024
A dynamical system view of Langevin-based non-convex sampling
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1787, Pages 41051–41075. Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain some important challenges: Existing ...
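A minimal example of the Langevin-based sampling the title refers to is the unadjusted Langevin algorithm on a non-convex double-well potential: gradient steps plus injected Gaussian noise. The potential U(x) = (x² - 1)² and the step size are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_U(x):
    """Gradient of the non-convex double-well potential U(x) = (x^2 - 1)^2."""
    return 4 * x * (x**2 - 1)

step, n_iters = 0.01, 20000
x, samples = 0.0, []
for _ in range(n_iters):
    # Unadjusted Langevin step: gradient drift plus scaled Gaussian noise.
    x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.normal()
    samples.append(x)
samples = np.array(samples)
# After burn-in, samples concentrate around the two wells at x = -1 and x = +1.
```

Quantifying how fast such chains mix between the two wells is exactly the kind of question a dynamical-systems analysis of Langevin methods addresses.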
- research-article, May 2024
Optimistic active exploration of dynamical systems
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1656, Pages 38122–38153. Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple ...
- research-article, May 2024
Learning to dive in branch and bound
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1485, Pages 34260–34277. Primal heuristics are important for solving mixed integer linear programs, because they find feasible solutions that facilitate branch and bound search. A prominent group of primal heuristics are diving heuristics. They iteratively modify and resolve ...
- research-article, May 2024
Riemannian stochastic optimization methods avoid strict saddle points
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1286, Pages 29580–29601. Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, typically solved with a Riemannian ...
- research-article, May 2024
Anytime model selection in linear bandits
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1282, Pages 29478–29514. Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online learning ...
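The exploration-exploitation balance in linear bandits that the abstract mentions is conventionally handled by optimism: score each arm by its estimated reward plus an uncertainty bonus. The sketch below is a plain LinUCB-style loop on a synthetic problem; the dimensions, bonus coefficient, and `theta_true` are invented for illustration, and the model-selection layer the paper studies sits on top of a base learner like this one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 2000
theta_true = np.array([1.0, 0.5, -0.5])  # unknown reward parameter (synthetic)

A = np.eye(d)        # regularized design matrix sum x x^T + I
b = np.zeros(d)      # sum of reward-weighted features
rewards = []
for t in range(T):
    arms = rng.normal(size=(10, d))          # fresh action set each round
    theta_hat = np.linalg.solve(A, b)        # ridge estimate of theta
    A_inv = np.linalg.inv(A)
    # Optimistic score: estimated reward plus an exploration bonus
    bonus = np.sqrt(np.einsum('id,dk,ik->i', arms, A_inv, arms))
    x = arms[np.argmax(arms @ theta_hat + 0.5 * bonus)]
    r = x @ theta_true + 0.1 * rng.normal()  # noisy observed reward
    A += np.outer(x, x)
    b += r * x
    rewards.append(r)
# Late-round rewards approach the per-round optimum as theta_hat sharpens.
```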
- research-article, May 2024
Likelihood ratio confidence sets for sequential decision making
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1159, Pages 26686–26698. Certifiable, adaptive uncertainty estimates for unknown quantities are an essential ingredient of sequential decision-making algorithms. Standard approaches rely on problem-dependent concentration results and are limited to a specific combination of ...
- research-article, May 2024
Multitask learning with no regret: from improved confidence bounds to active learning
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 297, Pages 6770–6781. Multitask learning is a powerful framework that enables one to simultaneously learn multiple related tasks by sharing information between them. Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such ...
- research-article, July 2023
Aligned diffusion Schrödinger bridges
- Vignesh Ram Somnath,
- Matteo Pariset,
- Ya-Ping Hsieh,
- Maria Rodriguez Martinez,
- Andreas Krause,
- Charlotte Bunne
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 186, Pages 1985–1995. Diffusion Schrödinger bridges (DSB) have recently emerged as a powerful framework for recovering stochastic dynamics via their marginal observations at different time points. Despite numerous successful applications, existing algorithms for solving DSBs ...
- research-article, July 2023
Lifelong bandit optimization: no prior and no regret
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 173, Pages 1847–1857. Machine learning algorithms are often repeatedly applied to problems with similar structure over and over again. We focus on solving a sequence of bandit optimization tasks and develop LIBO, an algorithm which adapts to the environment by learning from ...
- research-article, July 2023
Hallucinated adversarial control for conservative offline policy evaluation
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 166, Pages 1774–1784. We study the problem of conservative off-policy evaluation (COPE) where, given an offline dataset of environment interactions collected by other agents, we seek to obtain a (tight) lower bound on a policy's performance. This is crucial when deciding ...
- research-article, July 2023
A scalable Walsh-Hadamard regularizer to overcome the low-degree spectral bias of neural networks
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 68, Pages 723–733. Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions. Various notions of simplicity have been introduced to characterize this behavior. Here, we focus on ...