- research-article · January 2025
DPZero: private fine-tuning of language models without backpropagation
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 2446, Pages 59210–59246. The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs continues to grow, the memory demands of gradient-based training methods via ...
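For orientation, here is a minimal sketch of the two-point zeroth-order gradient estimate that backpropagation-free fine-tuning methods in this line of work build on; the quadratic loss and step size are illustrative stand-ins, and DPZero's actual private mechanism (clipping and noising the scalar finite difference) is omitted.

```python
import numpy as np

def zo_gradient(loss, theta, eps=1e-3, rng=np.random.default_rng(0)):
    """Two-point zeroth-order gradient estimate: two forward passes
    along a random direction, no backpropagation."""
    z = rng.standard_normal(theta.shape)                   # random direction
    scalar = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    return scalar * z

# Illustrative quadratic "loss"; in fine-tuning this would be a model's
# forward pass on a batch.
loss = lambda w: float(np.sum(w ** 2))
theta = np.ones(5)
for _ in range(200):
    theta -= 0.05 * zo_gradient(loss, theta)               # plain ZO-SGD step
```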
- research-article · January 2025
Truly no-regret learning in constrained MDPs
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 1489, Pages 36605–36653. Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known ...
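As background on the primal-dual template the abstract refers to (a generic sketch, not the specific algorithm analyzed in the paper): for reward value V_r(π), constraint value V_c(π), and threshold b, one alternates a policy improvement step on the Lagrangian with a projected dual step.

```latex
% Generic Lagrangian primal-dual iteration for
%   max_pi V_r(pi)  subject to  V_c(pi) >= b:
\[
  \pi_{k+1} \approx \arg\max_{\pi}\; V_r(\pi) + \lambda_k\, V_c(\pi),
  \qquad
  \lambda_{k+1} = \big[\lambda_k - \eta\,\big(V_c(\pi_{k+1}) - b\big)\big]_{+}.
\]
```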
- research-article · January 2025
Model-based RL for mean-field games is not statistically harder than single-agent RL
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 798, Pages 19816–19870. We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (...
- research-article · May 2024
When is Mean-Field Reinforcement Learning Tractable and Relevant?
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 2038–2046. Mean-field reinforcement learning has become a popular theoretical framework for efficiently approximating large-scale multi-agent reinforcement learning (MARL) problems exhibiting symmetry. However, questions remain regarding the applicability of mean-...
- research-article · May 2024
Provably Learning Nash Policies in Constrained Markov Potential Games
Multi-agent reinforcement learning addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world scenarios, agents not only aim to maximize their goals but also need to ensure safe ...
- research-article · May 2024
Two sides of one coin: the limits of untuned SGD and the power of adaptive methods
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 3248, Pages 74257–74288. The classical analysis of Stochastic Gradient Descent (SGD) with polynomially decaying stepsize η_t = η/√t relies on a well-tuned η depending on problem parameters such as the Lipschitz smoothness constant, which is often unknown in practice. In this work, we ...
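A minimal sketch of the untuned schedule in question, assuming an illustrative quadratic objective; the point of the paper is what happens when η is fixed a priori rather than matched to problem constants such as smoothness.

```python
import numpy as np

def untuned_sgd(grad, x0, eta=1.0, steps=1000, rng=np.random.default_rng(0)):
    """SGD with the polynomially decaying stepsize eta_t = eta / sqrt(t),
    where eta is not tuned to problem-specific constants."""
    x = x0.copy()
    for t in range(1, steps + 1):
        g = grad(x) + 0.01 * rng.standard_normal(x.shape)  # stochastic oracle
        x -= (eta / np.sqrt(t)) * g
    return x

x_final = untuned_sgd(lambda x: x, np.ones(3))  # f(x) = ||x||^2 / 2
```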
- research-article · May 2024
Robust knowledge transfer in tiered RL
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 2268, Pages 52073–52085. In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task to reduce the exploration risk of the latter ...
- research-article · May 2024
On imitation in mean-field games
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 1757, Pages 40426–40437. We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs), where the goal is to imitate the behavior of a population of agents following a Nash equilibrium policy according to some unknown payoff function. IL in MFGs ...
- research-article · May 2024
Optimal guarantees for algorithmic reproducibility and gradient complexity in convex optimization
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 769, Pages 17527–17566. Algorithmic reproducibility measures the deviation in outputs of machine learning algorithms upon minor changes in the training process. Previous work suggests that first-order methods would need to trade off convergence rate (gradient complexity) for ...
- research-article · July 2023
Policy mirror ascent for efficient and independent learning in mean field games
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 1658, Pages 39722–39754. Mean-field games have been used as a theoretical tool to obtain an approximate Nash equilibrium for symmetric and anonymous N-player games. However, limiting applicability, existing theoretical results assume variations of a "population generative model", ...
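For context, a hedged sketch of the softmax/KL form of a policy mirror ascent update; the paper's mean-field population dynamics and independent-learning aspects are not modeled here.

```python
import numpy as np

def policy_mirror_ascent_step(pi, Q, eta=0.1):
    """One KL-regularized mirror ascent step:
    pi'(a|s) proportional to pi(a|s) * exp(eta * Q(s, a))."""
    logits = np.log(pi) + eta * Q                   # (n_states, n_actions)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)

pi = np.full((2, 3), 1.0 / 3.0)                     # uniform initial policy
Q = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])    # illustrative Q-values
pi = policy_mirror_ascent_step(pi, Q)
```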
- research-article · July 2023
Stochastic policy gradient methods: improved sample complexity for Fisher-non-degenerate policies
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 393, Pages 9827–9869. Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations. Despite the huge efforts directed at the design of efficient stochastic PG-type algorithms, the understanding of ...
- research-article · July 2023
Reinforcement learning with general utilities: simpler variance reduction and large state-action space
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 75, Pages 1753–1800. We consider the reinforcement learning (RL) problem with general utilities which consists in maximizing a function of the state-action occupancy measure. Beyond the standard cumulative reward RL setting, this problem includes as particular cases ...
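Written out, the objective class the abstract describes is the following (a standard formulation; the paper's precise normalization may differ):

```latex
% RL with general utilities: maximize a (possibly nonlinear) function F of
% the discounted state-action occupancy measure
%   lambda^pi(s, a) = (1 - gamma) * sum_{t >= 0} gamma^t Pr(s_t = s, a_t = a | pi).
\[
  \max_{\pi}\; F\big(\lambda^{\pi}\big),
  \qquad
  F(\lambda) = \langle r, \lambda\rangle
  \text{ recovers the standard cumulative-reward objective.}
\]
```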
- research-article · April 2024
Bring your own algorithm for optimal differentially private stochastic minimax optimization
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 2549, Pages 35174–35187. We study differentially private (DP) algorithms for smooth stochastic minimax optimization, with stochastic minimization as a byproduct. The holy grail of these settings is to guarantee the optimal trade-off between the privacy and the excess population ...
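As a reference point, a minimal sketch of the clip-and-noise gradient oracle that DP stochastic optimization methods are typically built from; in the minimax setting the same privatized oracle would feed both the min and the max updates. The clip norm and noise scale below are illustrative, not the paper's calibration.

```python
import numpy as np

def dp_gradient(per_sample_grads, clip=1.0, sigma=1.0,
                rng=np.random.default_rng(0)):
    """Clip each per-sample gradient to norm <= clip, average, then add
    Gaussian noise scaled to the clipping norm (DP-SGD style)."""
    n = len(per_sample_grads)
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    noise = rng.normal(0.0, sigma * clip / n, size=clipped[0].shape)
    return np.mean(clipped, axis=0) + noise
```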
- research-article · April 2024
Sharp analysis of stochastic optimization under global Kurdyka-Łojasiewicz inequality
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 1152, Pages 15836–15848. We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies the global Kurdyka-Łojasiewicz (KŁ) inequality and the queries from stochastic gradient oracles satisfy mild expected smoothness ...
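For reference, one common power form of the global KŁ condition the abstract invokes (constants and exponent conventions vary across papers):

```latex
% Global Kurdyka-Lojasiewicz inequality in power form: for some mu > 0,
% exponent alpha in [1, 2], and minimum value f*,
\[
  \|\nabla f(x)\|^{\alpha} \;\ge\; 2\mu\,\big(f(x) - f^{*}\big)
  \quad \text{for all } x,
\]
% with alpha = 2 recovering the Polyak-Lojasiewicz (PL) condition.
```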
- research-article · April 2024
Nest your adaptive algorithm for parameter-agnostic nonconvex minimax optimization
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 814, Pages 11202–11216. Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic ability: requiring no a priori knowledge about problem-specific parameters nor tuning of learning rates. However, when it comes to ...
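A minimal sketch of what "parameter-agnostic" means operationally for AdaGrad: the effective stepsize is normalized by accumulated squared gradients, so no smoothness constant enters the update (the objective below is illustrative).

```python
import numpy as np

def adagrad(grad, x0, eta=1.0, steps=1000, eps=1e-8):
    """AdaGrad: per-coordinate stepsize eta / sqrt(accumulated g^2).
    eta needs no problem-specific tuning for stability."""
    x = x0.copy()
    accum = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        accum += g ** 2
        x -= eta * g / (np.sqrt(accum) + eps)
    return x

x_final = adagrad(lambda x: x, np.ones(3))  # f(x) = ||x||^2 / 2
```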
- research-article · April 2024
Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 789, Pages 10862–10875. We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying the gradient dominance property with 1 ≤ α ≤ 2, which holds in a wide range of applications in machine learning and signal processing. This condition ...
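For context, the cubic-regularized Newton subproblem that SCRN solves at each iteration with stochastic gradient and Hessian estimates (a standard sketch following Nesterov and Polyak; the paper's sampling scheme is omitted):

```latex
% At iterate x_k, with gradient estimate g_k and Hessian estimate H_k:
\[
  x_{k+1} = \arg\min_{x}\;
  \langle g_k,\, x - x_k\rangle
  + \tfrac{1}{2}\,(x - x_k)^{\top} H_k\, (x - x_k)
  + \tfrac{M}{6}\,\|x - x_k\|^{3},
\]
% where M upper-bounds the Lipschitz constant of the Hessian.
```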
- research-article · June 2024
On the bias-variance-cost tradeoff of stochastic optimization
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 1694, Pages 22119–22131. We consider stochastic optimization when one only has access to biased stochastic oracles of the objective, and obtaining stochastic gradients with low biases comes at high costs. This setting captures a variety of optimization paradigms widely used in ...
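The tradeoff in the title can be made concrete with the standard mean-squared-error decomposition; the cost function c(delta) below is illustrative notation for the idea that lower-bias queries are more expensive.

```latex
% For a biased gradient oracle g(x) with bias b(x) = E[g(x)] - grad f(x):
\[
  \mathbb{E}\,\|g(x) - \nabla f(x)\|^{2}
  = \|b(x)\|^{2} + \operatorname{Var}\big(g(x)\big),
\]
% while driving \|b(x)\| <= delta incurs a per-query cost c(delta)
% that grows as delta shrinks.
```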
- research-article · December 2020
A unified switching system perspective and convergence analysis of Q-learning algorithms
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 1305, Pages 15556–15567. This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective. We show that the nonlinear ODE models associated with Q-learning and many of its variants can ...
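For orientation, one asynchronous tabular Q-learning update; the max over next actions is exactly the piecewise-linear nonlinearity that a switching-system view of the associated ODE captures.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step; Q is an (n_states, n_actions) array."""
    td_target = r + gamma * Q[s_next].max()   # greedy bootstrap (the "max")
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```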
- research-article · December 2020
The devil is in the detail: a framework for macroscopic prediction via microscopic models
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 671, Pages 8006–8016. Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data. Yet, many existing approaches for predicting macroscopic behavior only use aggregated ...
- research-article · December 2020
The mean-squared error of double Q-learning
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 572, Pages 6815–6826. In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies to both ...
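For comparison with the Q-learning sketch above, a minimal sketch of the double Q-learning update being analyzed: one table selects the greedy next action and the other evaluates it, which is what changes the estimator's bias and, as the paper studies, its mean-squared error.

```python
import numpy as np

def double_q_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.99,
                    rng=np.random.default_rng(0)):
    """One double Q-learning step: update a randomly chosen table,
    using the other table to evaluate the greedy next action."""
    if rng.random() < 0.5:
        a_star = QA[s_next].argmax()
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])
    else:
        a_star = QB[s_next].argmax()
        QB[s, a] += alpha * (r + gamma * QA[s_next, a_star] - QB[s, a])
    return QA, QB
```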