- research-article · January 2025
DPZero: private fine-tuning of language models without backpropagation
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 2446, Pages 59210–59246. The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs continues to grow, the memory demands of gradient-based training methods via ...
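For orientation, here is a minimal sketch of the two-point zeroth-order gradient estimate that backpropagation-free fine-tuning methods in this line of work build on; the quadratic loss and step size are illustrative stand-ins, and DPZero's actual private mechanism (clipping and noising the scalar finite difference) is omitted.

```python
import numpy as np

def zo_gradient(loss, theta, eps=1e-3, rng=np.random.default_rng(0)):
    """Two-point zeroth-order gradient estimate: two forward passes
    along a random direction, no backpropagation."""
    z = rng.standard_normal(theta.shape)                   # random direction
    scalar = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    return scalar * z

# Illustrative quadratic "loss"; in fine-tuning this would be a model's
# forward pass on a batch.
loss = lambda w: float(np.sum(w ** 2))
theta = np.ones(5)
for _ in range(200):
    theta -= 0.05 * zo_gradient(loss, theta)               # plain ZO-SGD step
```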
- research-article · January 2025
Truly no-regret learning in constrained MDPs
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 1489, Pages 36605–36653. Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known ...
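As background on the primal-dual template the abstract refers to (a generic sketch, not the specific algorithm analyzed in the paper): for reward value V_r(π), constraint value V_c(π), and threshold b, one alternates a policy improvement step on the Lagrangian with a projected dual step.

```latex
% Generic Lagrangian primal-dual iteration for
%   max_pi V_r(pi)  subject to  V_c(pi) >= b:
\[
  \pi_{k+1} \approx \arg\max_{\pi}\; V_r(\pi) + \lambda_k\, V_c(\pi),
  \qquad
  \lambda_{k+1} = \big[\lambda_k - \eta\,\big(V_c(\pi_{k+1}) - b\big)\big]_{+}.
\]
```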
- research-article · January 2025
Model-based RL for mean-field games is not statistically harder than single-agent RL
ICML'24: Proceedings of the 41st International Conference on Machine Learning, Article No.: 798, Pages 19816–19870. We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (...
- research-article · May 2024
When is Mean-Field Reinforcement Learning Tractable and Relevant?
AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, Pages 2038–2046. Mean-field reinforcement learning has become a popular theoretical framework for efficiently approximating large-scale multi-agent reinforcement learning (MARL) problems exhibiting symmetry. However, questions remain regarding the applicability of mean-...
- research-article · May 2024
Provably Learning Nash Policies in Constrained Markov Potential Games
Multi-agent reinforcement learning addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world scenarios, agents not only aim to maximize their goals but also need to ensure safe ...
- research-article · May 2024
Two sides of one coin: the limits of untuned SGD and the power of adaptive methods
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 3248, Pages 74257–74288. The classical analysis of Stochastic Gradient Descent (SGD) with polynomially decaying stepsize η_t = η/√t relies on a well-tuned η depending on problem parameters such as the Lipschitz smoothness constant, which is often unknown in practice. In this work, we ...
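A minimal sketch of the untuned schedule in question, assuming an illustrative quadratic objective; the point of the paper is what happens when η is fixed a priori rather than matched to problem constants such as smoothness.

```python
import numpy as np

def untuned_sgd(grad, x0, eta=1.0, steps=1000, rng=np.random.default_rng(0)):
    """SGD with the polynomially decaying stepsize eta_t = eta / sqrt(t),
    where eta is not tuned to problem-specific constants."""
    x = x0.copy()
    for t in range(1, steps + 1):
        g = grad(x) + 0.01 * rng.standard_normal(x.shape)  # stochastic oracle
        x -= (eta / np.sqrt(t)) * g
    return x

x_final = untuned_sgd(lambda x: x, np.ones(3))  # f(x) = ||x||^2 / 2
```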
- research-article · May 2024
Robust knowledge transfer in tiered RL
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 2268, Pages 52073–52085. In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task to reduce the exploration risk of the latter ...
- research-article · May 2024
On imitation in mean-field games
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 1757, Pages 40426–40437. We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs), where the goal is to imitate the behavior of a population of agents following a Nash equilibrium policy according to some unknown payoff function. IL in MFGs ...
- research-article · May 2024
Optimal guarantees for algorithmic reproducibility and gradient complexity in convex optimization
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 769, Pages 17527–17566. Algorithmic reproducibility measures the deviation in outputs of machine learning algorithms upon minor changes in the training process. Previous work suggests that first-order methods would need to trade off convergence rate (gradient complexity) for ...
- research-article · July 2023
Policy mirror ascent for efficient and independent learning in mean field games
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 1658, Pages 39722–39754. Mean-field games have been used as a theoretical tool to obtain an approximate Nash equilibrium for symmetric and anonymous N-player games. However, limiting applicability, existing theoretical results assume variations of a "population generative model", ...
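For context, a hedged sketch of the softmax/KL form of a policy mirror ascent update; the paper's mean-field population dynamics and independent-learning aspects are not modeled here.

```python
import numpy as np

def policy_mirror_ascent_step(pi, Q, eta=0.1):
    """One KL-regularized mirror ascent step:
    pi'(a|s) proportional to pi(a|s) * exp(eta * Q(s, a))."""
    logits = np.log(pi) + eta * Q                   # (n_states, n_actions)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)

pi = np.full((2, 3), 1.0 / 3.0)                     # uniform initial policy
Q = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])    # illustrative Q-values
pi = policy_mirror_ascent_step(pi, Q)
```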
- research-article · July 2023
Stochastic policy gradient methods: improved sample complexity for Fisher-non-degenerate policies
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 393, Pages 9827–9869. Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations. Despite the huge efforts directed at the design of efficient stochastic PG-type algorithms, the understanding of ...
- research-article · July 2023
Reinforcement learning with general utilities: simpler variance reduction and large state-action space
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 75, Pages 1753–1800. We consider the reinforcement learning (RL) problem with general utilities which consists in maximizing a function of the state-action occupancy measure. Beyond the standard cumulative reward RL setting, this problem includes as particular cases ...
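Written out, the objective class the abstract describes is the following (a standard formulation; the paper's precise normalization may differ):

```latex
% RL with general utilities: maximize a (possibly nonlinear) function F of
% the discounted state-action occupancy measure
%   lambda^pi(s, a) = (1 - gamma) * sum_{t >= 0} gamma^t Pr(s_t = s, a_t = a | pi).
\[
  \max_{\pi}\; F\big(\lambda^{\pi}\big),
  \qquad
  F(\lambda) = \langle r, \lambda\rangle
  \text{ recovers the standard cumulative-reward objective.}
\]
```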
- research-article · April 2024
Bring your own algorithm for optimal differentially private stochastic minimax optimization
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 2549, Pages 35174–35187. We study differentially private (DP) algorithms for smooth stochastic minimax optimization, with stochastic minimization as a byproduct. The holy grail of these settings is to guarantee the optimal trade-off between the privacy and the excess population ...
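As a reference point, a minimal sketch of the clip-and-noise gradient oracle that DP stochastic optimization methods are typically built from; in the minimax setting the same privatized oracle would feed both the min and the max updates. The clip norm and noise scale below are illustrative, not the paper's calibration.

```python
import numpy as np

def dp_gradient(per_sample_grads, clip=1.0, sigma=1.0,
                rng=np.random.default_rng(0)):
    """Clip each per-sample gradient to norm <= clip, average, then add
    Gaussian noise scaled to the clipping norm (DP-SGD style)."""
    n = len(per_sample_grads)
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    noise = rng.normal(0.0, sigma * clip / n, size=clipped[0].shape)
    return np.mean(clipped, axis=0) + noise
```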
- research-article · April 2024
Sharp analysis of stochastic optimization under global Kurdyka-Łojasiewicz inequality
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 1152, Pages 15836–15848. We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies the global Kurdyka-Łojasiewicz (KŁ) inequality and the queries from stochastic gradient oracles satisfy mild expected smoothness ...
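For reference, one common power form of the global KŁ condition the abstract invokes (constants and exponent conventions vary across papers):

```latex
% Global Kurdyka-Lojasiewicz inequality in power form: for some mu > 0,
% exponent alpha in [1, 2], and minimum value f*,
\[
  \|\nabla f(x)\|^{\alpha} \;\ge\; 2\mu\,\big(f(x) - f^{*}\big)
  \quad \text{for all } x,
\]
% with alpha = 2 recovering the Polyak-Lojasiewicz (PL) condition.
```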
- research-article · April 2024
Nest your adaptive algorithm for parameter-agnostic nonconvex minimax optimization
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 814, Pages 11202–11216. Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic ability: requiring no a priori knowledge about problem-specific parameters nor tuning of learning rates. However, when it comes to ...
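A minimal sketch of what "parameter-agnostic" means operationally for AdaGrad: the effective stepsize is normalized by accumulated squared gradients, so no smoothness constant enters the update (the objective below is illustrative).

```python
import numpy as np

def adagrad(grad, x0, eta=1.0, steps=1000, eps=1e-8):
    """AdaGrad: per-coordinate stepsize eta / sqrt(accumulated g^2).
    eta needs no problem-specific tuning for stability."""
    x = x0.copy()
    accum = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        accum += g ** 2
        x -= eta * g / (np.sqrt(accum) + eps)
    return x

x_final = adagrad(lambda x: x, np.ones(3))  # f(x) = ||x||^2 / 2
```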
- research-article · April 2024
Stochastic second-order methods improve best-known sample complexity of SGD for gradient-dominated functions
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, Article No.: 789, Pages 10862–10875. We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying the gradient dominance property with 1 ≤ α ≤ 2, which holds in a wide range of applications in machine learning and signal processing. This condition ...
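For context, the cubic-regularized Newton subproblem that SCRN solves at each iteration with stochastic gradient and Hessian estimates (a standard sketch following Nesterov and Polyak; the paper's sampling scheme is omitted):

```latex
% At iterate x_k, with gradient estimate g_k and Hessian estimate H_k:
\[
  x_{k+1} = \arg\min_{x}\;
  \langle g_k,\, x - x_k\rangle
  + \tfrac{1}{2}\,(x - x_k)^{\top} H_k\, (x - x_k)
  + \tfrac{M}{6}\,\|x - x_k\|^{3},
\]
% where M upper-bounds the Lipschitz constant of the Hessian.
```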
- research-article · June 2024
On the bias-variance-cost tradeoff of stochastic optimization
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 1694, Pages 22119–22131. We consider stochastic optimization when one only has access to biased stochastic oracles of the objective, and obtaining stochastic gradients with low biases comes at high costs. This setting captures a variety of optimization paradigms widely used in ...
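The tradeoff in the title can be made concrete with the standard mean-squared-error decomposition; the cost function c(delta) below is illustrative notation for the idea that lower-bias queries are more expensive.

```latex
% For a biased gradient oracle g(x) with bias b(x) = E[g(x)] - grad f(x):
\[
  \mathbb{E}\,\|g(x) - \nabla f(x)\|^{2}
  = \|b(x)\|^{2} + \operatorname{Var}\big(g(x)\big),
\]
% while driving \|b(x)\| <= delta incurs a per-query cost c(delta)
% that grows as delta shrinks.
```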
- research-article · December 2020
A unified switching system perspective and convergence analysis of Q-learning algorithms
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 1305, Pages 15556–15567. This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective. We show that the nonlinear ODE models associated with Q-learning and many of its variants can ...
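For orientation, one asynchronous tabular Q-learning update; the max over next actions is exactly the piecewise-linear nonlinearity that a switching-system view of the associated ODE captures.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step; Q is an (n_states, n_actions) array."""
    td_target = r + gamma * Q[s_next].max()   # greedy bootstrap (the "max")
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```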
- research-article · December 2020
The devil is in the detail: a framework for macroscopic prediction via microscopic models
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 671, Pages 8006–8016. Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data. Yet, many existing approaches for predicting macroscopic behavior only use aggregated ...
- research-article · December 2020
The mean-squared error of double Q-learning
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 572, Pages 6815–6826. In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies to both ...
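For comparison with the Q-learning sketch above, a minimal sketch of the double Q-learning update being analyzed: one table selects the greedy next action and the other evaluates it, which is what changes the estimator's bias and, as the paper studies, its mean-squared error.

```python
import numpy as np

def double_q_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.99,
                    rng=np.random.default_rng(0)):
    """One double Q-learning step: update a randomly chosen table,
    using the other table to evaluate the greedy next action."""
    if rng.random() < 0.5:
        a_star = QA[s_next].argmax()
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])
    else:
        a_star = QB[s_next].argmax()
        QB[s, a] += alpha * (r + gamma * QA[s_next, a_star] - QB[s, a])
    return QA, QB
```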