Showing 1–50 of 97 results for author: He, N

Searching in archive cs.
  1. arXiv:2408.15173  [pdf, other]

    cs.GT cs.LG math.OC stat.ML

    Exploiting Approximate Symmetry for Efficient Multi-Agent Reinforcement Learning

    Authors: Batuhan Yardim, Niao He

    Abstract: Mean-field games (MFG) have become significant tools for solving large-scale multi-agent reinforcement learning problems under symmetry. However, the assumption of exact symmetry limits the applicability of MFGs, as real-world scenarios often feature inherent heterogeneity. Furthermore, most works on MFG assume access to a known MFG model, which might not be readily available for real-world finite…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 5 figures

  2. arXiv:2408.11084  [pdf, other]

    math.OC cs.LG

    Multi-level Monte-Carlo Gradient Methods for Stochastic Optimization with Biased Oracles

    Authors: Yifan Hu, Jie Wang, Xin Chen, Niao He

    Abstract: We consider stochastic optimization when one only has access to biased stochastic oracles of the objective and the gradient, and obtaining stochastic gradients with low biases comes at high costs. This setting captures various optimization paradigms, such as conditional stochastic optimization, distributionally robust optimization, shortfall risk optimization, and machine learning paradigms, such…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: A preliminary version of this manuscript has appeared in a conference proceeding. Please refer to Yifan Hu, Xin Chen, and Niao He. On the bias-variance-cost tradeoff of stochastic optimization. Advances in Neural Information Processing Systems, 2021

  3. arXiv:2408.08537  [pdf, other]

    cs.CR cs.SE

    SeeWasm: An Efficient and Fully-Functional Symbolic Execution Engine for WebAssembly Binaries

    Authors: Ningyu He, Zhehao Zhao, Hanqin Guan, Jikai Wang, Shuo Peng, Ding Li, Haoyu Wang, Xiangqun Chen, Yao Guo

    Abstract: WebAssembly (Wasm), as a compact, fast, and isolation-guaranteed binary format, can be compiled from more than 40 high-level programming languages. However, vulnerabilities in Wasm binaries could lead to sensitive data leakage and even threaten their hosting environments. To identify them, symbolic execution is widely adopted due to its soundness and the ability to automatically generate exploitat…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: Accepted by ISSTA'24 Demo Track, the tool can be accessed at https://github.com/PKU-ASAL/SeeWasm

  4. arXiv:2408.08075  [pdf, ps, other]

    cs.LG cs.GT cs.MA

    Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players

    Authors: Pragnya Alatur, Anas Barakat, Niao He

    Abstract: Markov Potential Games (MPGs) form an important sub-class of Markov games, which are a common framework to model multi-agent reinforcement learning problems. In particular, MPGs include as a special case the identical-interest setting where all the agents share the same reward function. Scaling the performance of Nash equilibrium learning algorithms to a large number of agents is crucial for multi…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 16 pages, CDC 2024

    Journal ref: CDC 2024 - Proceedings of the 63rd IEEE Conference on Decision and Control

  5. arXiv:2408.01839  [pdf, ps, other]

    math.OC cs.LG

    Complexity of Minimizing Projected-Gradient-Dominated Functions with Stochastic First-order Oracles

    Authors: Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran

    Abstract: This work investigates the performance limits of projected stochastic first-order methods for minimizing functions under the $(α,τ,\mathcal{X})$-projected-gradient-dominance property, that asserts the sub-optimality gap $F(\mathbf{x})-\min_{\mathbf{x}'\in \mathcal{X}}F(\mathbf{x}')$ is upper-bounded by $τ\cdot\|\mathcal{G}_{η,\mathcal{X}}(\mathbf{x})\|^α$ for some $α\in[1,2)$ and $τ>0$ and…

    Submitted 3 August, 2024; originally announced August 2024.

  6. arXiv:2407.10207  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Learning to Steer Markovian Agents under Model Uncertainty

    Authors: Jiawei Huang, Vinzenz Thoma, Zebang Shen, Heinrich H. Nax, Niao He

    Abstract: Designing incentives for an adapting population is a ubiquitous problem in a wide array of economic applications and beyond. In this work, we study how to design additional rewards to steer multi-agent systems towards desired policies \emph{without} prior knowledge of the agents' underlying learning dynamics. We introduce a model-based non-episodic Reinforcement Learning (RL) formulation for our s…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 33 Pages

  7. arXiv:2407.06654  [pdf, other]

    cs.CL cs.AI

    SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training

    Authors: Nan He, Weichen Xiong, Hanwen Liu, Yi Liao, Lei Ding, Kai Zhang, Guohua Tang, Xiao Han, Wei Yang

    Abstract: The effectiveness of large language models (LLMs) is often hindered by duplicated data in their extensive pre-training datasets. Current approaches primarily focus on detecting and removing duplicates, which risks the loss of valuable information and neglects the varying degrees of duplication. To address this, we propose a soft deduplication method that maintains dataset integrity while selective…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 12 pages, 7 figures

  8. arXiv:2406.02939  [pdf, ps, other]

    math.OC cs.DC cs.LG

    Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes

    Authors: Yan Huang, Xiang Li, Yipeng Shen, Niao He, Jinming Xu

    Abstract: In this paper, we show that applying adaptive methods directly to distributed minimax problems can result in non-convergence due to inconsistency in locally computed adaptive stepsizes. To address this challenge, we propose D-AdaST, a Distributed Adaptive minimax method with Stepsize Tracking. The key strategy is to employ an adaptive stepsize tracking protocol involving the transmission of two ex…

    Submitted 5 June, 2024; originally announced June 2024.

  9. arXiv:2405.20561  [pdf, other]

    cs.CR cs.SE

    All Your Tokens are Belong to Us: Demystifying Address Verification Vulnerabilities in Solidity Smart Contracts

    Authors: Tianle Sun, Ningyu He, Jiang Xiao, Yinliang Yue, Xiapu Luo, Haoyu Wang

    Abstract: In Ethereum, verifying the validity of passed addresses is a common practice and a crucial step to ensure the secure execution of smart contracts. Vulnerabilities in the process of address verification can lead to great security issues, and anecdotal evidence has been reported by our community. However, this type of vulnerability has not been well studied. To fill the voi…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by USENIX Security 2024

  10. arXiv:2405.18373  [pdf, other]

    stat.ML cs.LG math.OC

    A Hessian-Aware Stochastic Differential Equation for Modelling SGD

    Authors: Xiang Li, Zebang Shen, Liang Zhang, Niao He

    Abstract: Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool to study its escaping behaviors from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture these behaviors, even for simple quadratic objectives. Built on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equatio…

    Submitted 5 August, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  11. arXiv:2405.17944  [pdf, other]

    cs.CR

    Remeasuring the Arbitrage and Sandwich Attacks of Maximal Extractable Value in Ethereum

    Authors: Tianyang Chi, Ningyu He, Xiaohui Hu, Haoyu Wang

    Abstract: Maximal Extractable Value (MEV) drives the prosperity of the blockchain ecosystem. By strategically including, excluding, or reordering transactions within blocks, block producers/validators can extract additional value, which in turn incentivizes them to keep the decentralization of the whole blockchain platform. Before The Merge of Ethereum in Sep. 2022, around \$675M was extracted in terms of M…

    Submitted 28 May, 2024; originally announced May 2024.

  12. arXiv:2405.04332  [pdf, other]

    cs.CR

    WALLETRADAR: Towards Automating the Detection of Vulnerabilities in Browser-based Cryptocurrency Wallets

    Authors: Pengcheng Xia, Yanhui Guo, Zhaowen Lin, Jun Wu, Pengbo Duan, Ningyu He, Kailong Wang, Tianming Liu, Yinliang Yue, Guoai Xu, Haoyu Wang

    Abstract: Cryptocurrency wallets, acting as fundamental infrastructure to the blockchain ecosystem, have seen significant user growth, particularly among browser-based wallets (i.e., browser extensions). However, this expansion accompanies security challenges, making these wallets prime targets for malicious activities. Despite a substantial user base, there is not only a significant gap in comprehensive se…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Just accepted by the Automated Software Engineering Journal

  13. arXiv:2403.12859  [pdf, other]

    math.OC cs.LG stat.ML

    Primal Methods for Variational Inequality Problems with Functional Constraints

    Authors: Liang Zhang, Niao He, Michael Muehlebach

    Abstract: Constrained variational inequality problems are recognized for their broad applications across various fields including machine learning and operations research. First-order methods have emerged as the standard approach for solving these problems due to their simplicity and scalability. However, they typically rely on projection or linear minimization oracles to navigate the feasible set, which be…

    Submitted 19 March, 2024; originally announced March 2024.

  14. arXiv:2402.17885  [pdf, other]

    cs.LG cs.GT cs.MA

    Independent Learning in Constrained Markov Potential Games

    Authors: Philip Jordan, Anas Barakat, Niao He

    Abstract: Constrained Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems where the behavior of the agents is subject to constraints. In this work, we focus on the recently introduced class of constrained Markov Potential Games. While centralized algorithms have been proposed for solving such constrained games, the design of converging independent lear…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: AISTATS 2024

  15. arXiv:2402.17722  [pdf, other]

    math.OC cs.LG

    Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence

    Authors: Ilyas Fatkhullin, Niao He

    Abstract: This paper revisits the convergence of Stochastic Mirror Descent (SMD) in the contemporary nonconvex optimization setting. Existing results for batch-free nonconvex SMD restrict the choice of the distance generating function (DGF) to be differentiable with Lipschitz continuous gradients, thereby excluding important setups such as Shannon entropy. In this work, we present a new convergence analysis…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted for publication at AISTATS 2024

    MSC Class: 90C15; 90C26; 90C15

    ACM Class: G.1.6

  16. arXiv:2402.15776  [pdf, other]

    cs.LG stat.ML

    Truly No-Regret Learning in Constrained MDPs

    Authors: Adrian Müller, Pragnya Alatur, Volkan Cevher, Giorgia Ramponi, Niao He

    Abstract: Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known regret bounds allow for error cancellations -- one can compensate for a constraint violation in one round with a strict constraint satisfaction in a…

    Submitted 19 July, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  17. arXiv:2402.08129  [pdf, ps, other]

    cs.GT

    Automated Design of Affine Maximizer Mechanisms in Dynamic Settings

    Authors: Michael Curry, Vinzenz Thoma, Darshan Chakrabarti, Stephen McAleer, Christian Kroer, Tuomas Sandholm, Niao He, Sven Seuken

    Abstract: Dynamic mechanism design is a challenging extension to ordinary mechanism design in which the mechanism designer must make a sequence of decisions over time in the face of possibly untruthful reports of participating agents. Optimizing dynamic mechanisms for welfare is relatively well understood. However, there has been less work on optimizing for other goals (e.g. revenue), and without restrictiv…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: To be published in the Thirty-Eighth Proceedings of the AAAI Conference on Artificial Intelligence 2024

  18. arXiv:2402.05757  [pdf, other]

    cs.GT cs.MA math.OC

    When is Mean-Field Reinforcement Learning Tractable and Relevant?

    Authors: Batuhan Yardim, Artur Goldman, Niao He

    Abstract: Mean-field reinforcement learning has become a popular theoretical framework for efficiently approximating large-scale multi-agent reinforcement learning (MARL) problems exhibiting symmetry. However, questions remain regarding the applicability of mean-field approximations: in particular, their approximation accuracy of real-world systems and conditions under which they become computationally trac…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 26 pages, 1 figure

  19. arXiv:2402.05724  [pdf, other]

    cs.LG cs.AI cs.GT stat.ML

    Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL

    Authors: Jiawei Huang, Niao He, Andreas Krause

    Abstract: We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity. Notably, P-MBED measures the complexity of the single-agent model cl…

    Submitted 3 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: ICML 2024; 55 Pages

  20. arXiv:2401.00108  [pdf, other]

    math.OC cs.CC

    Stochastic Optimization under Hidden Convexity

    Authors: Ilyas Fatkhullin, Niao He, Yifan Hu

    Abstract: In this work, we consider constrained stochastic optimization problems under hidden convexity, i.e., those that admit a convex reformulation via non-linear (but invertible) map $c(\cdot)$. A number of non-convex problems ranging from optimal control, revenue and inventory management, to convex reinforcement learning all admit such a hidden convex structure. Unfortunately, in the majority of applic…

    Submitted 29 December, 2023; originally announced January 2024.

    MSC Class: 90C06; 90C15; 90C26

  21. arXiv:2312.13232  [pdf, other]

    cs.GT

    Learning Best Response Policies in Dynamic Auctions via Deep Reinforcement Learning

    Authors: Vinzenz Thoma, Michael Curry, Niao He, Sven Seuken

    Abstract: Many real-world auctions are dynamic processes, in which bidders interact and report information over multiple rounds with the auctioneer. The sequential decision making aspect paired with imperfect information renders analyzing the incentive properties of such auctions much more challenging than in the static case. It is clear that bidders often have incentives for manipulation, but the full scop…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 14 pages, 4 figures

  22. arXiv:2312.10588  [pdf, other]

    cs.CV cs.AI

    Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting

    Authors: Dawei Yang, Ning He, Xing Hu, Zhihang Yuan, Jiangyong Yu, Chen Xu, Zhe Jiang

    Abstract: Although neural networks have made remarkable advancements in various applications, they require substantial computational and memory resources. Network quantization is a powerful technique to compress neural networks, allowing for more efficient and scalable AI deployments. Recently, Re-parameterization has emerged as a promising technique to enhance model performance while simultaneously allevia…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 23 pages

    MSC Class: I.2.0

    ACM Class: I.2.0

  23. arXiv:2312.10456  [pdf, other]

    cs.SE

    WRTester: Differential Testing of WebAssembly Runtimes via Semantic-aware Binary Generation

    Authors: Shangtong Cao, Ningyu He, Xinyu She, Yixuan Zhang, Mu Zhang, Haoyu Wang

    Abstract: The Wasm runtime is a fundamental component in the Wasm ecosystem, as it directly impacts whether Wasm applications can be executed as expected. Bugs in Wasm runtimes are frequently reported, thus our research community has made a few attempts to design automated testing frameworks for detecting bugs in Wasm runtimes. However, existing testing frameworks are limited by the quality of test cases, i…

    Submitted 16 December, 2023; originally announced December 2023.

  24. arXiv:2312.08000  [pdf, other]

    cs.CR

    SoK: On the Security of Non-Fungible Tokens

    Authors: Kai Ma, Jintao Huang, Ningyu He, Zhuo Wang, Haoyu Wang

    Abstract: Non-fungible tokens (NFTs) drive the prosperity of the Web3 ecosystem. By November 2023, the total market value of NFT projects reached approximately 16 billion USD. Accompanying the success of NFTs are various security issues, i.e., attacks and scams are prevalent in the ecosystem. While NFTs have attracted significant attention from both industry and academia, there is a lack of understanding o…

    Submitted 13 December, 2023; originally announced December 2023.

  25. arXiv:2311.08914  [pdf, other]

    cs.LG math.OC

    Efficiently Escaping Saddle Points for Non-Convex Policy Optimization

    Authors: Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Niao He, Matthias Grossglauser

    Abstract: Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance. In recent years, several variance-reduced PG methods have been proposed with a theoretical guarantee of converging to an approximate first-order stationary point (FOSP) with the sample complexity of $O(ε^{-3})$. However, FOSPs could be bad local optima or saddle points. Moreover, these algori…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: text overlap with arXiv:2205.08253

    MSC Class: ACM-class:I.2.6

  26. arXiv:2311.03252  [pdf, other]

    math.OC cs.LG stat.ML

    Parameter-Agnostic Optimization under Relaxed Smoothness

    Authors: Florian Hübler, Junchi Yang, Xiang Li, Niao He

    Abstract: Tuning hyperparameters, such as the stepsize, presents a major challenge of training machine learning models. To address this challenge, numerous adaptive optimization algorithms have been developed that achieve near-optimal complexities, even when stepsizes are independent of problem-specific parameters, provided that the loss function is $L$-smooth. However, as the assumption is relaxed to the m…

    Submitted 6 November, 2023; originally announced November 2023.

  27. arXiv:2310.19019  [pdf, other]

    cs.CL cs.AI

    TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise

    Authors: Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan

    Abstract: Large Language Models (LLMs) exhibit impressive reasoning and data augmentation capabilities in various NLP tasks. However, what about small models? In this work, we propose TeacherLM-7.1B, capable of annotating relevant fundamentals, chain of thought, and common mistakes for most NLP samples, which makes annotation more than just an answer, thus allowing other models to learn "why" instead of jus…

    Submitted 15 July, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 5 figures, 15 pages

  28. arXiv:2310.17759  [pdf, other]

    cs.LG math.OC stat.ML

    Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization

    Authors: Liang Zhang, Junchi Yang, Amin Karbasi, Niao He

    Abstract: Algorithmic reproducibility measures the deviation in outputs of machine learning algorithms upon minor changes in the training process. Previous work suggests that first-order methods would need to trade off convergence rate (gradient complexity) for better reproducibility. In this work, we challenge this perception and demonstrate that both optimal reproducibility and near-optimal convergence gu…

    Submitted 9 January, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 Spotlight

  29. arXiv:2310.09639  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    DPZero: Private Fine-Tuning of Language Models without Backpropagation

    Authors: Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He

    Abstract: The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs continues to grow, the memory demands of gradient-based training methods via backpropagation become prohibitively high. Second, given the tendency of LLMs to memorize training data, it is important to protect potentially sensitive…

    Submitted 6 June, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: ICML 2024

  30. arXiv:2309.12450  [pdf, other]

    stat.ML cs.LG

    A Convex Framework for Confounding Robust Inference

    Authors: Kei Ishikawa, Niao He, Takafumi Kanamori

    Abstract: We study policy evaluation of offline contextual bandits subject to unobserved confounders. Sensitivity analysis methods are commonly used to estimate the policy value under the worst-case confounding over a given uncertainty set. However, existing work often resorts to some coarse relaxation of the uncertainty set for the sake of tractability, leading to overly conservative estimation of the poli…

    Submitted 1 November, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: This is an extended version of the following work https://proceedings.mlr.press/v206/ishikawa23a.html. arXiv admin note: text overlap with arXiv:2302.13348

  31. arXiv:2309.04272  [pdf, other]

    eess.SY cs.GT cs.LG

    Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

    Authors: Jiduan Wu, Anas Barakat, Ilyas Fatkhullin, Niao He

    Abstract: Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control and can be used (i) as a dynamic game formulation for risk-sensitive or robust control and (ii) as a benchmark setting for multi-agent reinforcement learning with two competing agents in continuous state-control spaces. In contrast to the well-studied single-agent linear quadratic regulator problem, zero-sum LQ games entail so…

    Submitted 31 October, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

  32. arXiv:2308.03123  [pdf, other]

    cs.CR

    WASMixer: Binary Obfuscation for WebAssembly

    Authors: Shangtong Cao, Ningyu He, Yao Guo, Haoyu Wang

    Abstract: WebAssembly (Wasm) is an emerging binary format that draws great attention from our community. However, Wasm binaries are weakly protected, as they can be read, edited, and manipulated by adversaries using either the officially provided readable text format (i.e., wat) or some advanced binary analysis tools. Reverse engineering of Wasm binaries is often used for nefarious intentions, e.g., identif…

    Submitted 6 August, 2023; originally announced August 2023.

  33. arXiv:2307.00549  [pdf, other]

    cs.CR

    Abusing the Ethereum Smart Contract Verification Services for Fun and Profit

    Authors: Pengxiang Ma, Ningyu He, Yuhua Huang, Haoyu Wang, Xiapu Luo

    Abstract: Smart contracts play a vital role in the Ethereum ecosystem. Due to the prevalence of various kinds of security issues in smart contracts, smart contract verification is urgently needed, which is the process of matching a smart contract's source code to its on-chain bytecode for gaining mutual trust between smart contract developers and users. Although smart contract verification services are embedded…

    Submitted 2 July, 2023; originally announced July 2023.

  34. arXiv:2306.14799  [pdf, other]

    cs.LG cs.GT

    On Imitation in Mean-field Games

    Authors: Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist

    Abstract: We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs), where the goal is to imitate the behavior of a population of agents following a Nash equilibrium policy according to some unknown payoff function. IL in MFGs presents new challenges compared to single-agent IL, particularly when both the reward function and the transition kernel depend on the population di…

    Submitted 26 June, 2023; originally announced June 2023.

  35. arXiv:2306.14133  [pdf, other]

    cs.LG cs.AI math.OC

    Provably Convergent Policy Optimization via Metric-aware Trust Region Methods

    Authors: Jun Song, Niao He, Lijun Ding, Chaoyue Zhao

    Abstract: Trust-region methods based on Kullback-Leibler divergence are pervasively used to stabilize policy optimization in reinforcement learning. In this paper, we exploit more flexible metrics and examine two natural extensions of policy optimization with Wasserstein and Sinkhorn trust regions, namely Wasserstein policy optimization (WPO) and Sinkhorn policy optimization (SPO). Instead of restricting th…

    Submitted 25 June, 2023; originally announced June 2023.

    Journal ref: Transactions on Machine Learning Research, 2023

  36. arXiv:2306.07749  [pdf, other]

    cs.LG cs.GT cs.MA

    Provably Learning Nash Policies in Constrained Markov Potential Games

    Authors: Pragnya Alatur, Giorgia Ramponi, Niao He, Andreas Krause

    Abstract: Multi-agent reinforcement learning (MARL) addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world instances, the agents may not only want to optimize their objectives, but also ensure safe behavior. For example, in traffic routing, each car (agent) aims to reach its destination quickly (objective) while avoiding collision…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 30 pages

  37. arXiv:2306.07001  [pdf, ps, other]

    cs.LG stat.ML

    Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes

    Authors: Adrian Müller, Pragnya Alatur, Giorgia Ramponi, Niao He

    Abstract: Constrained Markov Decision Processes (CMDPs) are one of the common ways to model safe reinforcement learning problems, where constraint functions model the safety objectives. Lagrangian-based dual or primal-dual algorithms provide efficient methods for learning in CMDPs. For these algorithms, the currently known regret bounds in the finite-horizon setting allow for a "cancellation of errors"; one…

    Submitted 30 August, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

  38. arXiv:2306.02673  [pdf, other]

    eess.IV cs.CV cs.LG

    Cross-Modal Vertical Federated Learning for MRI Reconstruction

    Authors: Yunlu Yan, Hong Wang, Yawen Huang, Nanjun He, Lei Zhu, Yuexiang Li, Yong Xu, Yefeng Zheng

    Abstract: Federated learning enables multiple hospitals to cooperatively learn a shared model without privacy disclosure. Existing methods often take a common assumption that the data from different hospitals have the same modalities. However, such a setting is difficult to fully satisfy in practical applications, since the imaging guidelines may be different between hospitals, which makes the number of ind…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 12 pages, 7 figures

  39. arXiv:2306.01854  [pdf, other]

    cs.LG math.OC

    Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

    Authors: Anas Barakat, Ilyas Fatkhullin, Niao He

    Abstract: We consider the reinforcement learning (RL) problem with general utilities which consists in maximizing a function of the state-action occupancy measure. Beyond the standard cumulative reward RL setting, this problem includes as particular cases constrained RL, pure exploration and learning from demonstrations among others. For this problem, we propose a simpler single-loop parameter-free normaliz…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 48 pages, 2 figures, ICML 2023, this paper was initially submitted on January 26th, 2023

    Journal ref: Proceedings of the Fortieth International Conference on Machine Learning (ICML 2023)

  40. arXiv:2305.12475  [pdf, other]

    math.OC cs.LG stat.ML

    Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods

    Authors: Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He

    Abstract: The classical analysis of Stochastic Gradient Descent (SGD) with polynomially decaying stepsize $η_t = η/\sqrt{t}$ relies on well-tuned $η$ depending on problem parameters such as Lipschitz smoothness constant, which is often unknown in practice. In this work, we prove that SGD with arbitrary $η> 0$, referred to as untuned SGD, still attains an order-optimal convergence rate…

    Submitted 21 May, 2023; originally announced May 2023.

  41. arXiv:2305.11283  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation

    Authors: Jiawei Huang, Batuhan Yardim, Niao He

    Abstract: In this paper, we study the fundamental statistical efficiency of Reinforcement Learning in Mean-Field Control (MFC) and Mean-Field Game (MFG) with general model-based function approximation. We introduce a new concept called Mean-Field Model-Based Eluder Dimension (MF-MBED), which characterizes the inherent complexity of mean-field model classes. We show that low MF-MBED subsumes a rich family of…

    Submitted 13 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: 38 Pages

  42. arXiv:2305.06108  [pdf, other]

    cs.CR

    A Deep Dive into NFT Rug Pulls

    Authors: Jintao Huang, Ningyu He, Kai Ma, Jiang Xiao, Haoyu Wang

    Abstract: The NFT rug pull is one of the most prominent types of scam, in which the developers of a project abandon it and then run away with investors' funds. Although they have drawn attention from our community, to the best of our knowledge, the NFT rug pulls have not been systematically explored. To fill the void, this paper presents the first in-depth study of NFT rug pulls. Specifically, we first compile a list…

    Submitted 10 May, 2023; originally announced May 2023.

  43. arXiv:2305.01454  [pdf, other]

    cs.SE

    A General Static Binary Rewriting Framework for WebAssembly

    Authors: Shangtong Cao, Ningyu He, Yao Guo, Haoyu Wang

    Abstract: Binary rewriting is a widely adopted technique in software analysis. WebAssembly (Wasm), as an emerging bytecode format, has attracted great attention from our community. Unfortunately, there is no general-purpose binary rewriting framework for Wasm, and existing effort on Wasm binary modification is error-prone and tedious. In this paper, we present BREWasm, the first general purpose static binar…

    Submitted 2 May, 2023; originally announced May 2023.

  44. arXiv:2304.07204  [pdf, other]

    cs.SE cs.CR

    Eunomia: Enabling User-specified Fine-Grained Search in Symbolically Executing WebAssembly Binaries

    Authors: Ningyu He, Zhehao Zhao, Jikai Wang, Yubin Hu, Shengjian Guo, Haoyu Wang, Guangtai Liang, Ding Li, Xiangqun Chen, Yao Guo

    Abstract: Although existing techniques have proposed automated approaches to alleviate the path explosion problem of symbolic execution, users still need to optimize symbolic execution by applying various searching strategies carefully. As existing approaches mainly support only coarse-grained global searching strategies, they cannot efficiently traverse through complex code structures. In this paper, we pr…

    Submitted 18 June, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: !!!NOTE HERE!!! In arxiv v2 version, I have replaced the original repo link to a new one, because the original one is hijacked to a extremely frightening and jump-scare webpage. PLEASE REFER TO https://github.com/HNYuuu/Eunomia-ISSTA23 NOT THE ORIGINAL shorturl ONE!

  45. arXiv:2304.07166  [pdf, other]

    cs.CR

    Fuzzing the Latest NTFS in Linux with Papora: An Empirical Study

    Authors: Edward Lo, Ningyu He, Yuejie Shi, Jiajia Xu, Chiachih Wu, Ding Li, Yao Guo

    Abstract: Recently, the first feature-rich NTFS implementation, NTFS3, has been upstreamed to Linux. Although ensuring the security of NTFS3 is essential for the future of Linux, it remains unclear, however, whether the most recent version of NTFS for Linux contains 0-day vulnerabilities. To this end, we implemented Papora, the first effective fuzzer for NTFS3. We have identified and reported 3 CVE-assigned…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: Accepted by 17th IEEE Workshop on Offensive Technologies

  46. arXiv:2303.01559  [pdf, other]

    cs.CV cs.AI

    Improving GAN Training via Feature Space Shrinkage

    Authors: Haozhe Liu, Wentian Zhang, Bing Li, Haoqian Wu, Nanjun He, Yawen Huang, Yuexiang Li, Bernard Ghanem, Yefeng Zheng

    Abstract: Due to the outstanding capability for data generation, Generative Adversarial Networks (GANs) have attracted considerable attention in unsupervised learning. However, training GANs is difficult, since the training distribution is dynamic for the discriminator, leading to unstable image representation. In this paper, we address the problem of training GANs from a novel perspective, i.e., rob…

    Submitted 8 April, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR'2023. Code and Demo are available at https://github.com/WentianZhang-ML/AdaptiveMix

  47. arXiv:2302.13348  [pdf, other]

    stat.ML cs.LG

    Kernel Conditional Moment Constraints for Confounding Robust Inference

    Authors: Kei Ishikawa, Niao He

    Abstract: We study policy evaluation of offline contextual bandits subject to unobserved confounders. Sensitivity analysis methods are commonly used to estimate the policy value under the worst-case confounding over a given uncertainty set. However, existing work often resorts to some coarse relaxation of the uncertainty set for the sake of tractability, leading to overly conservative estimation of the poli…

    Submitted 14 September, 2023; v1 submitted 26 February, 2023; originally announced February 2023.

    Journal ref: AISTATS 2023

  48. arXiv:2302.05534  [pdf, other]

    cs.LG cs.AI stat.ML

    Robust Knowledge Transfer in Tiered Reinforcement Learning

    Authors: Jiawei Huang, Niao He

    Abstract: In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task to reduce the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume the low-tier and high-tier tasks share the same dynamics or rewar…

    Submitted 13 June, 2024; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: 47 Pages; 1 Figure; NeurIPS 2023

  49. arXiv:2302.01734  [pdf, other]

    cs.LG math.OC

    Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

    Authors: Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He

    Abstract: Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations. Despite the huge efforts directed at the design of efficient stochastic PG-type algorithms, the understanding of their convergence to a globally optimal policy is still limited. In this work, we develop improved global convergence guarantees for a general class…

    Submitted 8 November, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: This work was initially submitted in October 2022

    MSC Class: 90C26; 90C15 ACM Class: G.1.6

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:9827-9869, 2023

  50. arXiv:2301.00989  [pdf, ps, other]

    cs.CV cs.AI

    A New Perspective to Boost Vision Transformer for Medical Image Classification

    Authors: Yuexiang Li, Yawen Huang, Nanjun He, Kai Ma, Yefeng Zheng

    Abstract: Transformer has achieved impressive successes for various computer vision tasks. However, most of existing studies require to pretrain the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) for achieving satisfactory performance, which is usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement generated by the Ima…

    Submitted 3 January, 2023; originally announced January 2023.