- research-article, January 2025
Model-based RL for mean-field games is not statistically harder than single-agent RL
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 798, Pages 19816–19870. We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension ...
- research-article, January 2025
Global reinforcement learning: beyond linear and convex rewards via submodular semi-gradient methods
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 408, Pages 10235–10266. In classic Reinforcement Learning (RL), the agent maximizes an additive objective of the visited states, e.g., a value function. Unfortunately, objectives of this type cannot model many real-world applications such as experiment design, exploration, ...
- research-article, January 2025
Geometric active exploration in Markov decision processes: the benefit of abstraction
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No. 407, Pages 10206–10234. How can a scientist use a Reinforcement Learning (RL) algorithm to design experiments over a dynamical system's state space? In the case of finite and Markovian systems, an area called Active Exploration (AE) relaxes the optimization problem of ...
- research-article, May 2024
Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 973–982. Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite ...
- research-article, May 2024
Provably Learning Nash Policies in Constrained Markov Potential Games
Multi-agent reinforcement learning addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world scenarios, agents not only aim to maximize their goals but also need to ensure safe ...
- research-article, May 2024
Contextual stochastic bilevel optimization
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 3428, Pages 78412–78434. We introduce contextual stochastic bilevel optimization (CSBO) - a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This ...
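The bilevel structure described in this abstract can be illustrated with a minimal sketch: an inner loop approximately solves the lower-level stochastic problem, and the outer loop updates the upper-level variable through the chain rule. The toy objective, constants, and the analytic derivative `dy*/dx = a` are all made up for illustration and are not the paper's CSBO algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
a, target = 2.0, 4.0  # illustrative problem constants

def inner_solve(x, steps=50, lr=0.1):
    """Lower level: min_y E[0.5*(y - a*x + noise)^2], so y*(x) = a*x."""
    y = 0.0
    for _ in range(steps):
        noise = rng.normal(scale=0.1)
        y -= lr * (y - a * x + noise)  # stochastic gradient of the inner loss
    return y

x = 0.0
for _ in range(200):
    y_star = inner_solve(x)
    # Upper level: min_x 0.5*(y*(x) - target)^2, with dy*/dx = a known here.
    x -= 0.05 * (y_star - target) * a
# x converges toward target / a = 2.0 on this toy problem.
```

In the contextual setting studied by the paper, the lower-level solution additionally depends on random contextual information, which is what makes plain two-loop schemes like this sketch insufficient.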
- research-article, May 2024
Implicit manifold Gaussian process regression
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 2961, Pages 67701–67720. Gaussian process regression is widely used because of its ability to provide well-calibrated uncertainty estimates and handle small or sparse datasets. However, it struggles with high-dimensional data. One possible way to scale this technique to higher ...
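The well-calibrated uncertainty the abstract refers to is visible even in a bare-bones exact GP regression sketch: posterior variance is small near training data and reverts to the prior far away. This is a generic illustration with an RBF kernel, not the manifold-aware method of the paper; the kernel and noise level are arbitrary choices.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Exact GP posterior mean and pointwise variance at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kss = rbf(Xs, Xs)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X).ravel()
mean, var = gp_posterior(X, y, np.array([[1.0], [5.0]]))
# var is small at the training point x=1 and near the prior value at x=5.
```

The cubic cost of the `solve` on the n-by-n kernel matrix is one reason scaling GPs to high-dimensional or large datasets, as the paper discusses, requires further structure.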
- research-article, May 2024
Stochastic approximation algorithms for systems of interacting particles
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 2436, Pages 55826–55847. Interacting particle systems have proven highly successful in various machine learning tasks, including approximate Bayesian inference and neural network optimization. However, the analysis of these systems often relies on the simplifying assumption of ...
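As a rough picture of the object of study, the following sketch runs a noisy interacting-particle update with a decreasing stochastic-approximation step size: each particle feels a confining potential plus an attraction toward the empirical mean. The potential, interaction term, and step-size schedule are invented for illustration and do not correspond to the algorithms analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, steps = 100, 500
x = rng.normal(size=n)  # particle positions

for t in range(1, steps + 1):
    grad_v = x                       # confining potential V(x) = x^2 / 2
    interaction = x - x.mean()       # attraction toward the empirical mean
    noise = rng.normal(size=n)
    step = 1.0 / (10 + t)            # decreasing stochastic-approximation step
    x = x - step * (grad_v + 0.5 * interaction) + np.sqrt(2 * step) * noise
# The empirical distribution of the particles settles near a stationary law.
```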
- research-article, May 2024
Efficient exploration in continuous-time model-based reinforcement learning
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1825, Pages 42119–42147. Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time ...
- research-article, May 2024
A dynamical system view of Langevin-based non-convex sampling
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1787, Pages 41051–41075. Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain some important challenges: Existing ...
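A minimal example of the Langevin-based sampling the title refers to is the unadjusted Langevin algorithm on a non-convex double-well potential: gradient steps plus injected Gaussian noise. The potential U(x) = (x² - 1)² and the step size are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_U(x):
    """Gradient of the non-convex double-well potential U(x) = (x^2 - 1)^2."""
    return 4 * x * (x**2 - 1)

step, n_iters = 0.01, 20000
x, samples = 0.0, []
for _ in range(n_iters):
    # Unadjusted Langevin step: gradient drift plus scaled Gaussian noise.
    x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.normal()
    samples.append(x)
samples = np.array(samples)
# After burn-in, samples concentrate around the two wells at x = -1 and x = +1.
```

Quantifying how fast such chains mix between the two wells is exactly the kind of question a dynamical-systems analysis of Langevin methods addresses.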
- research-article, May 2024
Optimistic active exploration of dynamical systems
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1656, Pages 38122–38153. Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple ...
- research-article, May 2024
Learning to dive in branch and bound
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1485, Pages 34260–34277. Primal heuristics are important for solving mixed integer linear programs, because they find feasible solutions that facilitate branch and bound search. A prominent group of primal heuristics are diving heuristics. They iteratively modify and resolve ...
- research-article, May 2024
Riemannian stochastic optimization methods avoid strict saddle points
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1286, Pages 29580–29601. Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, typically solved with a Riemannian ...
- research-article, May 2024
Anytime model selection in linear bandits
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1282, Pages 29478–29514. Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online learning ...
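The exploration-exploitation balance in linear bandits that the abstract mentions is conventionally handled by optimism: score each arm by its estimated reward plus an uncertainty bonus. The sketch below is a plain LinUCB-style loop on a synthetic problem; the dimensions, bonus coefficient, and `theta_true` are invented for illustration, and the model-selection layer the paper studies sits on top of a base learner like this one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 2000
theta_true = np.array([1.0, 0.5, -0.5])  # unknown reward parameter (synthetic)

A = np.eye(d)        # regularized design matrix sum x x^T + I
b = np.zeros(d)      # sum of reward-weighted features
rewards = []
for t in range(T):
    arms = rng.normal(size=(10, d))          # fresh action set each round
    theta_hat = np.linalg.solve(A, b)        # ridge estimate of theta
    A_inv = np.linalg.inv(A)
    # Optimistic score: estimated reward plus an exploration bonus
    bonus = np.sqrt(np.einsum('id,dk,ik->i', arms, A_inv, arms))
    x = arms[np.argmax(arms @ theta_hat + 0.5 * bonus)]
    r = x @ theta_true + 0.1 * rng.normal()  # noisy observed reward
    A += np.outer(x, x)
    b += r * x
    rewards.append(r)
# Late-round rewards approach the per-round optimum as theta_hat sharpens.
```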
- research-article, May 2024
Likelihood ratio confidence sets for sequential decision making
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 1159, Pages 26686–26698. Certifiable, adaptive uncertainty estimates for unknown quantities are an essential ingredient of sequential decision-making algorithms. Standard approaches rely on problem-dependent concentration results and are limited to a specific combination of ...
- research-article, May 2024
Multitask learning with no regret: from improved confidence bounds to active learning
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No. 297, Pages 6770–6781. Multitask learning is a powerful framework that enables one to simultaneously learn multiple related tasks by sharing information between them. Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such ...
- research-article, July 2023
Aligned diffusion Schrödinger bridges
- Vignesh Ram Somnath,
- Matteo Pariset,
- Ya-Ping Hsieh,
- Maria Rodriguez Martinez,
- Andreas Krause,
- Charlotte Bunne
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 186, Pages 1985–1995. Diffusion Schrödinger bridges (DSB) have recently emerged as a powerful framework for recovering stochastic dynamics via their marginal observations at different time points. Despite numerous successful applications, existing algorithms for solving DSBs ...
- research-article, July 2023
Lifelong bandit optimization: no prior and no regret
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 173, Pages 1847–1857. Machine learning algorithms are often repeatedly applied to problems with similar structure over and over again. We focus on solving a sequence of bandit optimization tasks and develop LIBO, an algorithm which adapts to the environment by learning from ...
- research-article, July 2023
Hallucinated adversarial control for conservative offline policy evaluation
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 166, Pages 1774–1784. We study the problem of conservative off-policy evaluation (COPE) where, given an offline dataset of environment interactions collected by other agents, we seek to obtain a (tight) lower bound on a policy's performance. This is crucial when deciding ...
- research-article, July 2023
A scalable Walsh-Hadamard regularizer to overcome the low-degree spectral bias of neural networks
UAI '23: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, Article No. 68, Pages 723–733. Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions. Various notions of simplicity have been introduced to characterize this behavior. Here, we focus on ...