Showing 1–15 of 15 results for author: Grand-Clément, J

Searching in archive cs.
  1. arXiv:2406.10631  [pdf, other]

    cs.GT cs.LG math.OC

    Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

    Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

    Abstract: Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several adva…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 27 pages, 4 figures

  2. arXiv:2403.04680  [pdf, other]

    cs.GT

    Extensive-Form Game Solving via Blackwell Approachability on Treeplexes

    Authors: Darshan Chakrabarti, Julien Grand-Clément, Christian Kroer

    Abstract: In this paper, we introduce the first algorithmic framework for Blackwell approachability on the sequence-form polytope, the class of convex polytopes capturing the strategies of players in extensive-form games (EFGs). This leads to a new class of regret-minimization algorithms that are stepsize-invariant, in the same sense as the Regret Matching and Regret Matching$^+$ algorithms for the simplex.…

    Submitted 7 March, 2024; originally announced March 2024.

  3. arXiv:2312.03618  [pdf, other]

    math.OC cs.GT

    Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality

    Authors: Julien Grand-Clement, Marek Petrik, Nicolas Vieille

    Abstract: Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known for average optimality (optimizing the long-run average of the rewards obtained over time) and Blackwell optimality (remaining discount optimal for all discou…

    Submitted 7 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  4. arXiv:2311.00676  [pdf, other]

    cs.GT cs.LG

    Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

    Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

    Abstract: Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$), and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties…

    Submitted 1 November, 2023; originally announced November 2023.

  5. arXiv:2305.14709  [pdf, ps, other]

    cs.GT cs.LG

    Regret Matching+: (In)Stability and Fast Convergence in Games

    Authors: Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo

    Abstract: Regret Matching+ (RM+) and its variants are important algorithms for solving large-scale games. However, a theoretical understanding of their success in practice is still a mystery. Moreover, recent advances on fast convergence in games are limited to no-regret algorithms such as online mirror descent, which satisfy stability. In this paper, we first give counterexamples showing that RM+ and its p…

    Submitted 24 May, 2023; originally announced May 2023.

  6. arXiv:2302.00036  [pdf, other]

    cs.LG

    Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor

    Authors: Julien Grand-Clément, Marek Petrik

    Abstract: We introduce the Blackwell discount factor for Markov Decision Processes (MDPs). Classical objectives for MDPs include discounted, average, and Blackwell optimality. Many existing approaches to computing average-optimal policies solve for discounted optimal policies with a discount factor close to $1$, but they only work under strong or hard-to-verify assumptions such as ergodicity or weakly commu…

    Submitted 31 January, 2023; originally announced February 2023.

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS), 2023

  7. arXiv:2209.10187  [pdf, other]

    math.OC cs.LG

    On the convex formulations of robust Markov decision processes

    Authors: Julien Grand-Clément, Marek Petrik

    Abstract: Robust Markov decision processes (MDPs) are used for applications of dynamic optimization in uncertain environments and have been studied extensively. Many of the main properties and algorithms of MDPs, such as value iteration and policy iteration, extend directly to RMDPs. Surprisingly, there is no known analog of the MDP convex optimization formulation for solving RMDPs. This work describes the…

    Submitted 13 December, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

  8. arXiv:2209.01874  [pdf, other]

    cs.HC cs.AI

    The Best Decisions Are Not the Best Advice: Making Adherence-Aware Recommendations

    Authors: Julien Grand-Clément, Jean Pauphilet

    Abstract: Many high-stake decisions follow an expert-in-loop structure in that a human operator receives recommendations from an algorithm but is the ultimate decision maker. Hence, the algorithm's recommendation may differ from the actual decision implemented in practice. However, most algorithmic recommendations are obtained by solving an optimization problem that assumes recommendations will be perfectly…

    Submitted 9 December, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

  9. arXiv:2202.12277  [pdf, other]

    math.OC cs.LG

    Solving optimization problems with Blackwell approachability

    Authors: Julien Grand-Clément, Christian Kroer

    Abstract: We introduce the Conic Blackwell Algorithm$^+$ (CBA$^+$) regret minimizer, a new parameter- and scale-free regret minimizer for general convex sets. CBA$^+$ is based on Blackwell approachability and attains $O(\sqrt{T})$ regret. We show how to efficiently instantiate CBA$^+$ for many decision sets of interest, including the simplex, $\ell_{p}$ norm balls, and ellipsoidal confidence regions in the…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.13203

  10. arXiv:2110.10994  [pdf, ps, other]

    cs.LG cs.CY

    Interpretable Machine Learning for Resource Allocation with Application to Ventilator Triage

    Authors: Julien Grand-Clément, Carri Chan, Vineet Goyal, Elizabeth Chuang

    Abstract: Rationing of healthcare resources is a challenging decision that policy makers and providers may be forced to make during a pandemic, natural disaster, or mass casualty event. Well-defined guidelines to triage scarce life-saving resources must be designed to promote transparency, trust, and consistency. To facilitate buy-in and use during high-stress situations, these guidelines need to be interpr…

    Submitted 21 October, 2021; originally announced October 2021.

  11. arXiv:2105.13203  [pdf, other]

    cs.LG cs.GT

    Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving

    Authors: Julien Grand-Clément, Christian Kroer

    Abstract: We develop new parameter-free and scale-free algorithms for solving convex-concave saddle-point problems. Our results are based on a new simple regret minimizer, the Conic Blackwell Algorithm$^+$ (CBA$^+$), which attains $O(1/\sqrt{T})$ average regret. Intuitively, our approach generalizes to other decision sets of interest ideas from the Counterfactual Regret minimization (CFR$^+$) algorithm, whi…

    Submitted 14 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

  12. arXiv:2009.06790  [pdf, other]

    math.OC cs.GT

    First-Order Methods for Wasserstein Distributionally Robust MDP

    Authors: Julien Grand-Clément, Christian Kroer

    Abstract: Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \emph{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving Distributionally robust MDPs via…

    Submitted 3 May, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

  13. arXiv:2005.05434  [pdf, other]

    math.OC cs.LG

    Scalable First-Order Methods for Robust MDPs

    Authors: Julien Grand-Clément, Christian Kroer

    Abstract: Robust Markov Decision Processes (MDPs) are a powerful framework for modeling sequential decision-making problems with model uncertainty. This paper proposes the first first-order framework for solving robust MDPs. Our algorithm interleaves primal-dual first-order updates with approximate Value Iteration updates. By carefully controlling the tradeoff between the accuracy and cost of Value Iteratio…

    Submitted 14 January, 2021; v1 submitted 11 May, 2020; originally announced May 2020.

  14. arXiv:2002.06247  [pdf, other]

    cs.LG eess.SY stat.ML

    Robust Policies For Proactive ICU Transfers

    Authors: Julien Grand-Clement, Carri W. Chan, Vineet Goyal, Gabriel Escobar

    Abstract: Patients whose transfer to the Intensive Care Unit (ICU) is unplanned are prone to higher mortality rates than those who were admitted directly to the ICU. Recent advances in machine learning to predict patient deterioration have introduced the possibility of \emph{proactive transfer} from the ward to the ICU. In this work, we study the problem of finding \emph{robust} patient transfer policies wh…

    Submitted 22 January, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

  15. The operator approach to entropy games

    Authors: Marianne Akian, Stéphane Gaubert, Julien Grand-Clément, Jérémie Guillaud

    Abstract: Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: 29 pages. This is an extended version of the article with the same title and authors published in the Proceedings of the 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), Leibniz International Proceedings in Informatics (LIPIcs), volume 66, pages 6:1--6:14, 2017

    MSC Class: 91A15; 47H05; 93E20

    Journal ref: Theor. Comp. Sys., 63(5):1089--1130, July 2019