-
QoS-Aware and Routing-Flexible Network Slicing for Service-Oriented Networks
Authors:
Wei-Kun Chen,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
In this paper, we consider the network slicing (NS) problem which attempts to map multiple customized virtual network requests (also called services) to a common shared network infrastructure and manage network resources to meet diverse quality of service (QoS) requirements. We propose a mixed-integer nonlinear programming (MINLP) formulation for the considered NS problem that can flexibly route the traffic flow of the services on multiple paths and provide end-to-end delay and reliability guarantees for all services. To overcome the computational difficulty due to the intrinsic nonlinearity in the MINLP formulation, we transform the MINLP formulation into an equivalent mixed-integer linear programming (MILP) formulation and further show that their continuous relaxations are equivalent. In sharp contrast to the continuous relaxation of the MINLP formulation which is a nonconvex nonlinear programming problem, the continuous relaxation of the MILP formulation is a polynomial-time solvable linear programming problem, which significantly facilitates the algorithmic design. Based on the newly proposed MILP formulation, we develop a customized column generation (cCG) algorithm for solving the NS problem. The proposed cCG algorithm is a decomposition-based algorithm and is particularly suitable for solving large-scale NS problems. Numerical results demonstrate the efficacy of the proposed formulations and the proposed cCG algorithm.
Submitted 20 September, 2024;
originally announced September 2024.
-
Local discontinuous Galerkin method for nonlinear BSPDEs of Neumann boundary conditions with deep backward dynamic programming time-marching
Authors:
Yixiang Dai,
Yunzhang Li,
Jing Zhang
Abstract:
This paper aims to present a local discontinuous Galerkin (LDG) method for solving backward stochastic partial differential equations (BSPDEs) with Neumann boundary conditions. We establish the $L^2$-stability and optimal error estimates of the proposed numerical scheme. Two numerical examples are provided to demonstrate the performance of the LDG method, where we incorporate a deep learning algorithm to address the challenge of the curse of dimensionality in backward stochastic differential equations (BSDEs). The results show the effectiveness and accuracy of the LDG method in tackling BSPDEs with Neumann boundary conditions.
Submitted 17 September, 2024;
originally announced September 2024.
-
Presolving and cutting planes for the generalized maximal covering location problem
Authors:
Wei Lv,
Cheng-Yang Yu,
Jie Liang,
Wei-Kun Chen,
Yu-Hong Dai
Abstract:
This paper considers the generalized maximal covering location problem (GMCLP), which establishes a fixed number of facilities to maximize the weighted sum of the covered customers, allowing customers' weights to be positive or negative. The GMCLP can be modeled as a mixed integer programming (MIP) formulation and solved by off-the-shelf MIP solvers. However, due to the large problem size and, in particular, the poor linear programming (LP) relaxation, the GMCLP is extremely difficult to solve by state-of-the-art MIP solvers. To improve the computational performance of MIP-based approaches for solving GMCLPs, we propose customized presolving and cutting plane techniques, namely isomorphic aggregation, dominance reduction, and two-customer inequalities. The isomorphic aggregation and dominance reduction can not only reduce the problem size but also strengthen the LP relaxation of the MIP formulation of the GMCLP. The two-customer inequalities can be embedded into a branch-and-cut framework to further strengthen the LP relaxation of the MIP formulation on the fly. By extensive computational experiments, we show that all three proposed techniques can substantially improve the capability of MIP solvers in solving GMCLPs. In particular, for a testbed of 40 instances with identical numbers of customers and facilities in the literature, the proposed techniques provide optimal solutions for 13 previously unsolved benchmark instances; for a testbed of 56 instances where the number of customers is much larger than the number of facilities, the proposed techniques can turn most of them from intractable to easily solvable.
Submitted 15 September, 2024;
originally announced September 2024.
-
Serial and Parallel Two-Column Probing for Mixed-Integer Programming
Authors:
Yongzheng Dai,
Chen Chen
Abstract:
Probing in mixed-integer programming (MIP) is a technique of temporarily fixing variables to discover implications that are useful to branch-and-cut solvers. Such fixing is typically performed one variable at a time; this paper instead develops a two-column probing scheme that fixes a pair of variables per iteration. Although the scheme involves more work per iteration compared to the one-column approach, stronger implied bounds as well as more conflicts identified may compensate. Indeed, our prototype implementation was awarded first prize at the MIP Workshop 2024 Computational Competition on novel presolving approaches. This paper presents the aforementioned (serial) prototype and additionally develops an efficient parallelization, leveraging hardware acceleration to further improve overall solve times. Compared to serial two-column probing, our parallel version sacrifices some strength per pair probed in exchange for greatly increasing the total number of such probings; computational experiments demonstrate its promise.
Submitted 29 August, 2024;
originally announced August 2024.
-
Adversarial Network Optimization under Bandit Feedback: Maximizing Utility in Non-Stationary Multi-Hop Networks
Authors:
Yan Dai,
Longbo Huang
Abstract:
Stochastic Network Optimization (SNO) concerns scheduling in stochastic queueing systems. It has been widely studied in network theory. Classical SNO algorithms require network conditions to be stationary over time, which fails to capture the non-stationary components in many real-world scenarios. Many existing algorithms also assume knowledge of network conditions before decisions are made, which rules out applications where unpredictability is present.
Motivated by these issues, we consider Adversarial Network Optimization (ANO) under bandit feedback. Specifically, we consider the task of *i)* maximizing some unknown and time-varying utility function associated with the scheduler's actions, where *ii)* the underlying network is a non-stationary multi-hop one whose conditions change arbitrarily with time, and *iii)* only bandit feedback (the effect of actually deployed actions) is revealed after decisions. Our proposed `UMO2` algorithm ensures network stability and also matches the utility maximization performance of any "mildly varying" reference policy up to a polynomially decaying gap. To our knowledge, no previous ANO algorithm handled multi-hop networks or achieved utility guarantees under bandit feedback, whereas ours does both.
Technically, our method builds upon a novel integration of online learning into Lyapunov analyses: To handle complex inter-dependencies among queues in multi-hop networks, we propose meticulous techniques to balance online learning and Lyapunov arguments. To tackle the learning obstacles due to potentially unbounded queue sizes, we design a new online linear optimization algorithm that automatically adapts to loss magnitudes. To maximize utility, we propose a bandit convex optimization algorithm with novel queue-dependent learning rate scheduling that suits drastically varying queue lengths. Our new insights in online learning can be of independent interest.
Submitted 28 August, 2024;
originally announced August 2024.
-
Enhanced Barrier-Smoothing Technique for Bilevel Optimization with Nonsmooth Mappings
Authors:
Mengwei Xu,
Yu-Hong Dai,
Xin-Wei Liu,
Bo Wang
Abstract:
Bilevel optimization problems, encountered in fields such as economics, engineering, and machine learning, pose significant computational challenges due to their hierarchical structure and constraints at both upper and lower levels. Traditional gradient-based methods are effective for unconstrained bilevel programs with unique lower level solutions, but struggle with constrained bilevel problems due to the nonsmoothness of lower level solution mappings. To overcome these challenges, this paper introduces the Enhanced Barrier-Smoothing Algorithm (EBSA), a novel approach that integrates gradient-based techniques with an augmented Lagrangian framework. EBSA utilizes innovative smoothing functions to approximate the primal-dual solution mapping of the lower level problem, and then transforms the bilevel problem into a sequence of smooth single-level problems. This approach not only addresses the nonsmoothness but also enhances convergence properties. Theoretical analysis demonstrates its superiority in achieving Clarke and, under certain conditions, Bouligand stationary points for bilevel problems. Both theoretical analysis and preliminary numerical experiments confirm the robustness and efficiency of EBSA.
Submitted 20 August, 2024; v1 submitted 18 August, 2024;
originally announced August 2024.
-
Exploiting Overlap Information in Chance-constrained Program with Random Right-hand Side
Authors:
Wei Lv,
Wei-Kun Chen,
Yu-Hong Dai,
Xiao-Jiao Tong
Abstract:
We consider the chance-constrained program (CCP) with random right-hand side under a finite discrete distribution. It is known that the standard mixed integer linear programming (MILP) reformulation of the CCP is generally difficult to solve by general-purpose solvers as the branch-and-cut search trees are enormously large, partly due to the weak linear programming relaxation. In this paper, we identify another reason for this phenomenon: the intersection of the feasible regions of the subproblems in the search tree could be nonempty, leading to a wasteful duplication of effort in exploring the uninteresting overlap in the search tree. To address the newly identified challenge and enhance the capability of the MILP-based approach in solving CCPs, we first show that the overlap in the search tree can be completely removed by a family of valid nonlinear if-then constraints, and then propose two practical approaches to tackle the highly nonlinear if-then constraints. In particular, we use the concept of dominance relations between different scenarios of the random variables, and propose a novel branching, called dominance-based branching, which is able to create a valid partition of the problem with a much smaller overlap than the classic variable branching. Moreover, we develop overlap-oriented node pruning and variable fixing techniques, applied at each node of the search tree, to remove more overlaps in the search tree. Computational results demonstrate the effectiveness of the proposed dominance-based branching and overlap-oriented node pruning and variable fixing techniques in reducing the search tree size and improving the overall solution efficiency.
Submitted 14 June, 2024;
originally announced June 2024.
-
An efficient branch-and-cut approach for large-scale competitive facility location problems with limited choice rule
Authors:
Wei-Kun Chen,
Wei-Yang Zhang,
Yan-Ru Wang,
Shahin Gelareh,
Yu-Hong Dai
Abstract:
In this paper, we consider the competitive facility location problem with limited choice rule (CFLPLCR), which attempts to open a subset of facilities to maximize the net profit of a newcomer company, requiring customers to patronize only a limited number of open facilities and an outside option. We propose an efficient branch-and-cut (B&C) approach for the CFLPLCR based on newly proposed mixed integer linear programming (MILP) formulations. Specifically, by establishing the submodularity of the probability function, we develop an MILP formulation for the CFLPLCR using the submodular inequalities. For the special case where each customer patronizes at most one open facility and the outside option, we show that the submodular inequalities can characterize the convex hull of the considered set and provide a compact MILP formulation. Moreover, for the general case, we strengthen the submodular inequalities by sequential lifting, resulting in a class of facet-defining inequalities. The proposed lifted submodular inequalities are shown to be stronger than the classic submodular inequalities, which yields another MILP formulation with a tighter linear programming (LP) relaxation. By extensive numerical experiments, we show that the proposed B&C approach outperforms the state-of-the-art generalized Benders decomposition approach by at least one order of magnitude. Furthermore, it can solve CFLPLCR instances with 10000 customers and 2000 facilities.
Submitted 9 June, 2024;
originally announced June 2024.
-
Stability for Nash Equilibrium Problems
Authors:
Ruoyu Diao,
Yu-Hong Dai,
Liwei Zhang
Abstract:
This paper is devoted to studying the stability properties of the Karush-Kuhn-Tucker (KKT) solution mapping $S_{\rm KKT}$ for Nash equilibrium problems (NEPs) with canonical perturbations. Firstly, we obtain an exact characterization of the strong regularity of $S_{\rm KKT}$ and a sufficient condition that is easy to verify. Secondly, we propose equivalent conditions for the continuously differentiable single-valued localization of $S_{\rm KKT}$. Thirdly, the isolated calmness of $S_{\rm KKT}$ is studied based on two conditions, Property A and Property B, and Property B proves to be sufficient for the robustness of both $E(p)$ and $S_{\rm KKT}$ under the convex assumptions, where $E(p)$ denotes the Nash equilibria at perturbation $p$. Furthermore, based on the prior discussions, we establish that studying the stability properties of the NEP with canonical perturbations is equivalent to studying those of the NEP with only tilt perturbations. Finally, we provide detailed characterizations of stability for NEPs in which each individual player solves a quadratic programming (QP) problem.
Submitted 18 May, 2024;
originally announced May 2024.
-
An inexact augmented Lagrangian algorithm for unsymmetric saddle-point systems
Authors:
N. Huang,
Y. -H. Dai,
D. Orban,
M. A. Saunders
Abstract:
Augmented Lagrangian (AL) methods are a well-known class of algorithms for solving constrained optimization problems. They have been extended to the solution of saddle-point systems of linear equations. We study an AL (SPAL) algorithm for unsymmetric saddle-point systems and derive convergence and semi-convergence properties, even when the system is singular. At each step, our SPAL requires the exact solution of a linear system of the same size but with an SPD (2,2) block. To improve efficiency, we introduce an inexact SPAL algorithm. We establish its convergence properties under reasonable assumptions. Specifically, we use a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system at each iteration. We call the result the augmented Lagrangian BB (SPALBB) algorithm and study its convergence. Numerical experiments on test problems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB is more robust and efficient than BICGSTAB and GMRES. SPALBB often requires the least CPU time, especially on large systems.
Submitted 22 April, 2024;
originally announced April 2024.
-
A cut-and-project perspective for linearized Bregman iterations
Authors:
Yu-Hong Dai,
Kangkang Deng,
Hui Zhang
Abstract:
The linearized Bregman iterations (LBreI) and their variants are powerful tools for finding sparse or low-rank solutions to underdetermined linear systems. In this study, we propose a cut-and-project perspective for the linearized Bregman method via a bilevel optimization formulation, along with a new unified algorithmic framework. The new perspective not only encompasses various existing linearized Bregman iteration variants as specific instances, but also allows us to extend the linearized Bregman method to solve more general inverse problems. We provide a complete convergence result for the proposed algorithmic framework, including convergence guarantees to feasible points and optimal solutions, and the sublinear convergence rate. Moreover, we introduce the Bregman distance growth condition to ensure linear convergence. Finally, our findings are illustrated via numerical tests.
Submitted 15 April, 2024;
originally announced April 2024.
-
A Proximal-Gradient Method for Constrained Optimization
Authors:
Yutong Dai,
Xiaoyi Qu,
Daniel P. Robinson
Abstract:
We present a new algorithm for solving optimization problems with objective functions that are the sum of a smooth function and a (potentially) nonsmooth regularization function, and nonlinear equality constraints. The algorithm may be viewed as an extension of the well-known proximal-gradient method that is applicable when constraints are not present. To account for nonlinear equality constraints, we combine a decomposition procedure for computing trial steps with an exact merit function for determining trial step acceptance. Under common assumptions, we show that both the proximal parameter and merit function parameter eventually remain fixed, and then prove a worst-case complexity result for the maximum number of iterations before an iterate satisfying approximate first-order optimality conditions for a given tolerance is computed. Our preliminary numerical results indicate that our approach has great promise, especially in terms of returning approximate solutions that are structured (e.g., sparse solutions when a one-norm regularizer is used).
Submitted 10 April, 2024;
originally announced April 2024.
-
Stochastic Approximation Proximal Subgradient Method for Stochastic Convex-Concave Minimax Optimization
Authors:
Yu-Hong Dai,
Jiani Wang,
Liwei Zhang
Abstract:
This paper presents a stochastic approximation proximal subgradient (SAPS) method for stochastic convex-concave minimax optimization. By accessing unbiased and variance-bounded approximate subgradients, we show that this algorithm exhibits an ${\rm O}(N^{-1/2})$ expected convergence rate of the minimax optimality measure if the parameters in the algorithm are properly chosen, where $N$ denotes the number of iterations. Moreover, we show that the algorithm has an ${\rm O}(\log(N)N^{-1/2})$ minimax optimality measure bound with high probability. Further, we study a specific class of stochastic convex-concave minimax optimization problems arising from stochastic convex conic optimization problems, for which the bounded subgradient condition fails. To overcome the lack of the bounded subgradient condition in these convex-concave minimax problems, we propose a linearized stochastic approximation augmented Lagrange (LSAAL) method and prove that this algorithm exhibits an ${\rm O}(N^{-1/2})$ expected convergence rate for the minimax optimality measure and an ${\rm O}(\log^2(N)N^{-1/2})$ minimax optimality measure bound with high probability as well. Preliminary numerical results demonstrate the effectiveness of the SAPS and LSAAL methods.
Submitted 29 March, 2024;
originally announced March 2024.
-
Towards Large-scale Probabilistic Set Covering Problem: An Efficient Benders Decomposition Approach
Authors:
Wei-Kun Chen,
Yi-Long Chen,
Yu-Hong Dai,
Wei Lv
Abstract:
In this paper, we investigate the probabilistic set covering problem (PSCP), in which the right-hand side is a random vector ξ and the covering constraint is required to be satisfied with a prespecified probability. We consider the case arising from sample average approximation (or finite discrete distributions). We develop an effective Benders decomposition (BD) algorithm for solving large-scale PSCPs, which enjoys two key advantages: (i) the number of variables in the underlying Benders reformulation is independent of the scenario size; and (ii) the Benders cuts can be separated by an efficient combinatorial algorithm. For the special case that ξ is a combination of several independent random blocks/subvectors, we explicitly take this kind of block structure into consideration and develop a more efficient BD algorithm. Numerical results on instances with up to one million scenarios demonstrate the effectiveness of the proposed BD algorithms over a black-box MIP solver's branch-and-cut and automatic BD algorithms and a state-of-the-art algorithm in the literature.
Submitted 28 February, 2024;
originally announced February 2024.
-
Tensor Completion via Integer Optimization
Authors:
Xin Chen,
Sukanya Kudva,
Yongzheng Dai,
Anil Aswani,
Chen Chen
Abstract:
The main challenge with the tensor completion problem is a fundamental tension between computation power and the information-theoretic sample complexity rate. Past approaches either achieve the information-theoretic rate but lack practical algorithms to compute the corresponding solution, or have polynomial-time algorithms that require an exponentially larger number of samples for low estimation error. This paper develops a novel tensor completion algorithm that resolves this tension by achieving both provable convergence (in numerical tolerance) in a linear number of oracle steps and the information-theoretic rate. Our approach formulates tensor completion as a convex optimization problem constrained using a gauge-based tensor norm, which is defined in a way that allows the use of integer linear optimization to solve linear separation problems over the unit ball in this new norm. Adaptations based on this insight are incorporated into a Frank-Wolfe variant to build our algorithm. We show our algorithm scales well using numerical experiments on tensors with up to ten million entries.
Submitted 3 April, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Zeroth-Order primal-dual Alternating Projection Gradient Algorithms for Nonconvex Minimax Problems with Coupled linear Constraints
Authors:
Huiling Zhang,
Zi Xu,
Yuhong Dai
Abstract:
In this paper, we study zeroth-order algorithms for nonconvex minimax problems with coupled linear constraints under the deterministic and stochastic settings, which have attracted wide attention in machine learning, signal processing and many other fields in recent years, e.g., adversarial attacks in resource allocation problems and network flow problems. We propose two single-loop algorithms, namely the zero-order primal-dual alternating projected gradient (ZO-PDAPG) algorithm and the zero-order regularized momentum primal-dual projected gradient (ZO-RMPDPG) algorithm, for solving deterministic and stochastic nonconvex-(strongly) concave minimax problems with coupled linear constraints. The iteration complexity of the two proposed algorithms to obtain an $\varepsilon$-stationary point is proved to be $\mathcal{O}(\varepsilon ^{-2})$ (resp. $\mathcal{O}(\varepsilon ^{-4})$) for solving nonconvex-strongly concave (resp. nonconvex-concave) minimax problems with coupled linear constraints under deterministic settings and $\tilde{\mathcal{O}}(\varepsilon ^{-3})$ (resp. $\tilde{\mathcal{O}}(\varepsilon ^{-6.5})$) under stochastic settings, respectively. To the best of our knowledge, they are the first two zeroth-order algorithms with iteration complexity guarantees for solving nonconvex-(strongly) concave minimax problems with coupled linear constraints under the deterministic and stochastic settings.
Submitted 26 January, 2024;
originally announced February 2024.
-
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Authors:
Kwangjun Ahn,
Zhiyu Zhang,
Yunbum Kook,
Yan Dai
Abstract:
Despite the success of the Adam optimizer in practice, the theoretical understanding of its algorithmic components still remains limited. In particular, most existing analyses of Adam show a convergence rate that can be achieved simply by non-adaptive algorithms like SGD. In this work, we provide a different perspective based on online learning that underscores the importance of Adam's algorithmic components. Inspired by Cutkosky et al. (2023), we consider the framework called online learning of updates/increments, where we choose the updates/increments of an optimizer based on an online learner. With this framework, the design of a good optimizer is reduced to the design of a good online learner. Our main observation is that Adam corresponds to a principled online learning framework called Follow-the-Regularized-Leader (FTRL). Building on this observation, we study the benefits of its algorithmic components from the online learning perspective.
Submitted 30 May, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Parallelized Conflict Graph Cut Generation
Authors:
Yongzheng Dai,
Chen Chen
Abstract:
A conflict graph represents logical relations between binary variables, and effective use of the graph can significantly accelerate branch-and-cut solvers for mixed-integer programming (MIP). In this paper we develop efficient parallel conflict graph management: conflict detection; maximal clique generation; clique extension; and clique merging. We leverage parallel computing in order to intensify computational effort on the conflict graph, thereby generating a much larger pool of cutting planes than what can be practically achieved in serial. Computational experiments demonstrate that the expanded pool of cuts enabled by parallel computing lead to substantial reductions in total MIP solve time, especially for more challenging cases.
△ Less
Submitted 27 May, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Exact penalty method for D-stationary point of nonlinear optimization
Authors:
Xin-Wei Liu,
Yu-Hong Dai
Abstract:
We consider the nonlinear optimization problem with least $\ell_1$-norm measure of constraint violations and introduce the concepts of the D-stationary point, the DL-stationary point and the DZ-stationary point with the help of an exact penalty function. If the stationary point is feasible, they correspond to the Fritz-John stationary point, the KKT stationary point and the singular stationary point, respectively. In order to show the usefulness of the new stationary points, we propose a new exact penalty sequential quadratic programming (SQP) method with inner and outer iterations and analyze its global and local convergence. The proposed method admits convergence to a D-stationary point and rapid infeasibility detection without driving the penalty parameter to zero, which demonstrates the commentary given in [SIAM J. Optim., 20 (2010), 2281--2299] and can be viewed as a supplement to the theory of nonlinear optimization on rapid detection of infeasibility. Some illustrative examples and preliminary numerical results demonstrate that the proposed method is robust and efficient in solving infeasible nonlinear problems and a degenerate problem without LICQ in the literature.
Submitted 24 September, 2023;
originally announced September 2023.
-
Lipschitz Transport Maps via the Follmer Flow
Authors:
Yin Dai,
Yuan Gao,
Jian Huang,
Yuling Jiao,
Lican Kang,
Jin Liu
Abstract:
Inspired by the construction of the Föllmer process, we construct a unit-time flow on the Euclidean space, termed the Föllmer flow, whose flow map at time 1 pushes forward a standard Gaussian measure onto a general target measure. We study the well-posedness of the Föllmer flow and establish the Lipschitz property of the flow map at time 1. We apply the Lipschitz mapping to several rich classes of probability measures on deriving dimension-free functional inequalities and concentration inequalities for the empirical measure.
Submitted 7 September, 2023;
originally announced September 2023.
-
Solving Elliptic Optimal Control Problems via Neural Networks and Optimality System
Authors:
Yongcheng Dai,
Bangti Jin,
Ramesh Sau,
Zhi Zhou
Abstract:
In this work, we investigate a neural network based solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. It utilizes a coupled system derived from the first-order optimality system of the optimal control problem, and employs deep neural networks to represent the solutions to the reduced system. We present an error analysis of the scheme, and provide $L^2(Ω)$ error bounds on the state, control and adjoint in terms of neural network parameters (e.g., depth, width, and parameter bounds) and the numbers of sampling points. The main tools in the analysis include offset Rademacher complexity and boundedness and Lipschitz continuity of neural network functions. We present several numerical examples to illustrate the method and compare it with two existing ones.
Submitted 8 May, 2024; v1 submitted 23 August, 2023;
originally announced August 2023.
-
An Efficient Benders Decomposition Approach for Optimal Large-Scale Network Slicing
Authors:
Wei-Kun Chen,
Zheyu Wu,
Rui-Jin Zhang,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
This paper considers the network slicing (NS) problem which attempts to map multiple customized virtual network requests to a common shared network infrastructure and allocate network resources to meet diverse service requirements. This paper proposes an efficient customized Benders decomposition algorithm for globally solving the large-scale NP-hard NS problem. The proposed algorithm decomposes the hard NS problem into two relatively easy function placement (FP) and traffic routing (TR) subproblems and iteratively solves them while enabling information feedback between them, which makes it particularly suitable for solving large-scale problems. Specifically, the FP subproblem is to place service functions into cloud nodes in the network, and solving it can return a function placement strategy based on which the TR subproblem is defined; and the TR subproblem is to find paths connecting two nodes hosting two adjacent functions in the network, and solving it can either verify that the solution of the FP subproblem is an optimal solution of the original problem, or return a valid inequality to the FP subproblem that cuts off the current infeasible solution. The proposed algorithm is guaranteed to find the globally optimal solution of the NS problem. By taking the special structure of the NS problem into consideration, we successfully develop two families of valid inequalities that make the proposed algorithm converge much more quickly and thus be much more efficient. Numerical results demonstrate that the proposed valid inequalities effectively accelerate the convergence of the decomposition algorithm, and the proposed algorithm significantly outperforms the existing algorithms in terms of both solution efficiency and quality.
Submitted 25 September, 2024; v1 submitted 27 June, 2023;
originally announced June 2023.
-
A Variance-Reduced and Stabilized Proximal Stochastic Gradient Method with Support Identification Guarantees for Structured Optimization
Authors:
Yutong Dai,
Guanyi Wang,
Frank E. Curtis,
Daniel P. Robinson
Abstract:
This paper introduces a new proximal stochastic gradient method with variance reduction and stabilization for minimizing the sum of a convex stochastic function and a group sparsity-inducing regularization function. Since the method may be viewed as a stabilized version of the recently proposed algorithm PStorm, we call our algorithm S-PStorm. Our analysis shows that S-PStorm has strong convergence results. In particular, we prove an upper bound on the number of iterations required by S-PStorm before its iterates correctly identify (with high probability) an optimal support (i.e., the zero and nonzero structure of an optimal solution). Most algorithms in the literature with such a support identification property use variance reduction techniques that require either periodically evaluating an exact gradient or storing a history of stochastic gradients. Unlike these methods, S-PStorm achieves variance reduction without requiring either of these, which is advantageous. Moreover, our support-identification result for S-PStorm shows that, with high probability, an optimal support will be identified correctly in all iterations with the index above a threshold. We believe that this type of result is new to the literature since the few existing other results prove that the optimal support is identified with high probability at each iteration with a sufficiently large index (meaning that the optimal support might be identified in some iterations, but not in others). Numerical experiments on regularized logistic loss problems show that S-PStorm outperforms existing methods in various metrics that measure how efficiently and robustly iterates of an algorithm identify an optimal support.
Submitted 13 February, 2023;
originally announced February 2023.
-
A Cut-and-solve Algorithm for Virtual Machine Consolidation Problem
Authors:
Jiang-Yao Luo,
Liang Chen,
Wei-Kun Chen,
Jian-Hua Yuan,
Yu-Hong Dai
Abstract:
The virtual machine consolidation problem (VMCP) attempts to determine which servers to activate, how to allocate virtual machines (VMs) to the activated servers, and how to migrate VMs among servers such that the sum of activation, allocation, and migration costs is minimized subject to the resource constraints of the servers and other practical constraints. In this paper, we first propose a new mixed integer linear programming (MILP) formulation for the VMCP. We show that compared with existing formulations, the proposed formulation is much more compact, with smaller numbers of variables or constraints, which makes it suitable for solving large-scale problems. We then develop a cut-and-solve (C&S) algorithm, a tree search algorithm to efficiently solve the VMCP to optimality. The proposed C&S algorithm is based on a novel relaxation of the VMCP that provides a stronger lower bound than the natural continuous relaxation of the VMCP, yielding a smaller search tree. By extensive computational experiments, we show that (i) the proposed formulation significantly outperforms existing formulations in terms of solution efficiency; and (ii) compared with standard MILP solvers, the proposed C&S algorithm is much more efficient.
Submitted 23 December, 2022;
originally announced December 2022.
-
A mechanism of three-dimensional quadratic termination for the gradient method with applications
Authors:
Yakui Huang,
Yu-Hong Dai,
Xin-Wei Liu
Abstract:
Recent studies show that the two-dimensional quadratic termination property has great potential in improving performance of the gradient method. However, it is not clear whether higher-dimensional quadratic termination leads to further benefits. In this paper, we provide an affirmative answer by introducing a mechanism of three-dimensional quadratic termination for the gradient method. A novel stepsize is derived from the mechanism such that a family of delayed gradient methods equipped with the novel stepsize have the three-dimensional quadratic termination property. When applied to the Barzilai--Borwein (BB) method, the novel stepsize does not require the use of any exact line search or the Hessian, and can be computed by stepsizes and gradient norms in previous iterations. Using long BB steps and some short steps associated with the novel stepsize in an adaptive manner, we develop an efficient gradient method for quadratic optimization and further extend it to general unconstrained optimization. Numerical experiments show that the three-dimensional quadratic termination property can significantly improve performance of the BB method, and the proposed method outperforms gradient methods that use stepsizes with the two-dimensional quadratic termination property.
Submitted 19 June, 2024; v1 submitted 14 December, 2022;
originally announced December 2022.
-
Primal Dual Alternating Proximal Gradient Algorithms for Nonsmooth Nonconvex Minimax Problems with Coupled Linear Constraints
Authors:
Huiling Zhang,
Junlin Wang,
Zi Xu,
Yu-Hong Dai
Abstract:
Nonconvex minimax problems have attracted wide attention in machine learning, signal processing and many other fields in recent years. In this paper, we propose a primal-dual alternating proximal gradient (PDAPG) algorithm for solving nonsmooth nonconvex-(strongly) concave minimax problems with coupled linear constraints, respectively. The iteration complexities of the two algorithms are proved to be $\mathcal{O}\left( \varepsilon ^{-2} \right)$ (resp. $\mathcal{O}\left( \varepsilon ^{-4} \right)$) under the nonconvex-strongly concave (resp. nonconvex-concave) setting to reach an $\varepsilon$-stationary point. To our knowledge, it is the first algorithm with iteration complexity guarantees for solving nonconvex minimax problems with coupled linear constraints.
Submitted 27 April, 2024; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
Authors:
Zi Xu,
Zi-Qi Wang,
Jun-Lin Wang,
Yu-Hong Dai
Abstract:
In this paper, we consider a class of nonconvex-nonconcave minimax problems, i.e., NC-PL minimax problems, whose objective functions satisfy the Polyak-Łojasiewicz (PL) condition with respect to the inner variable. We propose a zeroth-order alternating gradient descent ascent (ZO-AGDA) algorithm and a zeroth-order variance reduced alternating gradient descent ascent (ZO-VRAGDA) algorithm for solving NC-PL minimax problems under the deterministic and the stochastic setting, respectively. The total number of function value queries needed by the ZO-AGDA and ZO-VRAGDA algorithms to obtain an $\varepsilon$-stationary point of the NC-PL minimax problem is upper bounded by $\mathcal{O}(\varepsilon^{-2})$ and $\mathcal{O}(\varepsilon^{-3})$, respectively. To the best of our knowledge, they are the first two zeroth-order algorithms with an iteration complexity guarantee for solving NC-PL minimax problems.
Submitted 26 May, 2023; v1 submitted 24 November, 2022;
originally announced November 2022.
-
Inexact Proximal-Gradient Methods with Support Identification
Authors:
Yutong Dai,
Daniel P. Robinson
Abstract:
We consider the proximal-gradient method for minimizing an objective function that is the sum of a smooth function and a non-smooth convex function. A feature that distinguishes our work from most in the literature is that we assume that the associated proximal operator does not admit a closed-form solution. To address this challenge, we study two adaptive and implementable termination conditions that dictate how accurately the proximal-gradient subproblem is solved. We prove that the number of iterations required for the inexact proximal-gradient method to reach a $τ> 0$ approximate first-order stationary point is $\mathcal{O}(τ^{-2})$, which matches the similar result that holds when exact subproblem solutions are computed. Also, by focusing on the overlapping group $\ell_1$ regularizer, we propose an algorithm for approximately solving the proximal-gradient subproblem, and then prove that its iterates identify (asymptotically) the support of an optimal solution. If one imposes additional control over the accuracy to which each subproblem is solved, we give an upper bound on the maximum number of iterations before the support of an optimal solution is obtained.
Submitted 3 November, 2022;
originally announced November 2022.
-
Efficient Quantized Constant Envelope Precoding for Multiuser Downlink Massive MIMO Systems
Authors:
Zheyu Wu,
Ya-Feng Liu,
Bo Jiang,
Yu-Hong Dai
Abstract:
Quantized constant envelope (QCE) precoding, a new transmission scheme in which only discrete QCE transmit signals are allowed at each antenna, has gained growing research interest due to its ability to reduce the hardware cost and the energy consumption of massive multiple-input multiple-output (MIMO) systems. However, the discrete nature of QCE transmit signals greatly complicates the precoding design. In this paper, we consider the QCE precoding problem for a massive MIMO system with phase shift keying (PSK) modulation and develop an efficient approach for solving the constructive interference (CI) based problem formulation. Our approach is based on a custom-designed (continuous) penalty model that is equivalent to the original discrete problem. Specifically, the penalty model relaxes the discrete QCE constraint and penalizes it in the objective with a negative $\ell_2$-norm term, which leads to a non-smooth non-convex optimization problem. To tackle it, we resort to our recently proposed alternating optimization (AO) algorithm. We show that the AO algorithm admits closed-form updates at each iteration when applied to our problem and thus can be efficiently implemented. Simulation results demonstrate the superiority of the proposed approach over the existing algorithms.
Submitted 20 February, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
On GSOR, the Generalized Successive Overrelaxation Method for Double Saddle-Point Problems
Authors:
Na Huang,
Yu-Hong Dai,
Dominique Orban,
Michael A. Saunders
Abstract:
We consider the generalized successive overrelaxation (GSOR) method for solving a class of block three-by-three saddle-point problems. Based on the necessary and sufficient conditions for all roots of a real cubic polynomial to have modulus less than one, we derive convergence results under reasonable assumptions. We also analyze a class of block lower triangular preconditioners induced from GSOR and derive explicit and sharp spectral bounds for the preconditioned matrices. We report numerical experiments on test problems from the liquid crystal director model and the coupled Stokes-Darcy flow, demonstrating the usefulness of GSOR.
Submitted 15 August, 2022;
originally announced August 2022.
-
A primal-dual majorization-minimization method for large-scale linear programs
Authors:
Xin-Wei Liu,
Yu-Hong Dai,
Ya-Kui Huang
Abstract:
We present a primal-dual majorization-minimization method for solving large-scale linear programs. A smooth barrier augmented Lagrangian (SBAL) function with strict convexity for the dual linear program is derived. The majorization-minimization approach is naturally introduced to develop the smoothness and convexity of the SBAL function. Our method only depends on a factorization of the constant matrix independent of iterations and does not need any computation on step sizes, and thus can be expected to be particularly appropriate for large-scale linear programs. The method shares some properties with first-order methods for linear programs, but its convergence analysis is established on the differentiability and convexity of our SBAL function. The global convergence is analyzed without requiring in advance that either the primal or dual linear program be feasible. Under the regular conditions, our method is proved to be globally linearly convergent, and a new iteration complexity result is given.
Submitted 7 August, 2022;
originally announced August 2022.
-
A semi-conjugate gradient method for solving unsymmetric positive definite linear systems
Authors:
Na Huang,
Yu-Hong Dai,
Dominique Orban,
Michael A Saunders
Abstract:
The conjugate gradient (CG) method is a classic Krylov subspace method for solving symmetric positive definite linear systems. We introduce an analogous semi-conjugate gradient (SCG) method for unsymmetric positive definite linear systems. Unlike CG, SCG requires the solution of a lower triangular linear system to produce each semi-conjugate direction. We prove that SCG is theoretically equivalent to the full orthogonalization method (FOM), which is based on the Arnoldi process and converges in a finite number of steps. Because SCG's triangular system increases in size each iteration, we study a sliding window implementation (SWI) to improve efficiency, and show that the directions produced are still locally semi-conjugate. A counterexample illustrates that SWI is different from the direct incomplete orthogonalization method (DIOM), which is FOM with a sliding window. Numerical experiments from the convection-diffusion equation and other applications show that SCG is robust and that the sliding window implementation SWI allows SCG to solve large systems efficiently.
Submitted 8 June, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
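For orientation, the classic CG iteration that the abstract above takes as its starting point can be sketched as follows. This is a minimal textbook CG for symmetric positive definite systems, not the authors' SCG method; the function name `conjugate_gradient`, the dense list-of-lists matrix format, and the default tolerance are assumptions made for illustration.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Textbook CG for A x = b with A symmetric positive definite.

    A is a dense list of rows and b a list of floats (illustrative format).
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n  # generous cap; exact arithmetic needs at most n steps

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def matvec(M, v):
        return [dot(row, v) for row in M]

    x = [0.0] * n          # start from the zero vector
    r = list(b)            # residual b - A x equals b at x = 0
    p = list(r)            # first search direction is the residual
    rs_old = dot(r, r)

    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)          # exact step along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old               # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Small SPD example; the exact solution is [1/11, 7/11].
    print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

Each CG direction is obtained from a short two-term recurrence, which is the property the abstract says is lost in the unsymmetric case, where a growing lower triangular system must be solved instead.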
-
Optimality Conditions and Numerical Algorithms for A Class of Linearly Constrained Minimax Optimization Problems
Authors:
Yu-Hong Dai,
Jiani Wang,
Liwei Zhang
Abstract:
Although many numerical algorithms exist for solving nonsmooth minimax problems, numerical algorithms for nonsmooth minimax problems with joint linear constraints are very rare. This paper aims to discuss optimality conditions and develop practical numerical algorithms for minimax problems with joint linear constraints. First of all, we use the properties of proximal mapping and KKT system to establish optimality conditions. Secondly, we propose a framework of alternating coordinate algorithm for the minimax problem and analyze its convergence properties. Thirdly, we develop a proximal gradient multi-step ascent descent method (PGmsAD) as a numerical algorithm and demonstrate that the method can find an $ε$-stationary point for this kind of nonsmooth nonconvex-nonconcave problem in ${\cal O}(ε^{-2}\logε^{-1})$ iterations. Finally, we apply PGmsAD to generalized absolute value equations, generalized linear projection equations and linear regression problems and report the efficiency of PGmsAD on large-scale optimization.
Submitted 19 April, 2022;
originally announced April 2022.
-
Sparsity-Exploiting Distributed Projections onto a Simplex
Authors:
Yongzheng Dai,
Chen Chen
Abstract:
Projecting a vector onto a simplex is a well-studied problem that arises in a wide range of optimization problems. Numerous algorithms have been proposed for determining the projection; however, the primary focus of the literature has been on serial algorithms. We present a parallel method that decomposes the input vector and distributes it across multiple processors for local projection. Our method is especially effective when the resulting projection is highly sparse, which is the case, for instance, in large-scale problems with i.i.d. entries. Moreover, the method can be adapted to parallelize a broad range of serial algorithms from the literature. We fill in theoretical gaps in serial algorithm analysis, and develop similar results for our parallel analogues. Numerical experiments conducted on a wide range of large-scale instances, both real-world and simulated, demonstrate the practical effectiveness of the method.
Submitted 10 October, 2023; v1 submitted 17 April, 2022;
originally announced April 2022.
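As background for the abstract above, one of the classic serial approaches to simplex projection sorts the input and then thresholds it. The sketch below illustrates that serial baseline only; it is not the parallel method proposed in the paper, and the function name `project_onto_simplex` and the `radius` parameter are assumptions made for illustration.

```python
def project_onto_simplex(v, radius=1.0):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = radius}.

    Classic serial sort-and-threshold scheme, O(n log n) because of the sort.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")

    u = sorted(v, reverse=True)   # entries in decreasing order
    cumulative = 0.0
    threshold = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - radius) / k
        # The k largest entries stay positive after shifting by candidate;
        # the last k for which this holds gives the final threshold.
        if u_k - candidate > 0:
            threshold = candidate

    # Shift every entry by the threshold and clip at zero.
    return [max(x - threshold, 0.0) for x in v]


if __name__ == "__main__":
    print(project_onto_simplex([2.0, 0.0]))
    print(project_onto_simplex([0.2, 0.2]))
```

When most entries fall below the threshold they are clipped to zero, which is the sparse regime the abstract describes as favourable for the parallel method.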
-
Rotational hypersurfaces with constant Gauss-Kronecker curvature
Authors:
Yuhang Liu,
Yunchu Dai
Abstract:
We study rotational hypersurfaces with constant Gauss-Kronecker curvature. We solve the ODE for the generating curves of such hypersurfaces and analyze several geometric properties of such hypersurfaces. In particular, we discover a class of non-compact rotational hypersurfaces with constant and negative Gauss-Kronecker curvature and finite volume, which can be seen as the higher-dimensional generalization of the pseudo-sphere. Finally we investigate other types of rotational hypersurfaces with similar curvature constraints, including those with prescribed Gauss-Kronecker curvature.
Submitted 18 January, 2022;
originally announced January 2022.
-
The Augmented Lagrangian Method Can Approximately Solve Convex Optimization with Least Constraint Violation
Authors:
Yu-Hong Dai,
Liwei Zhang
Abstract:
There are many important practical optimization problems whose feasible regions are not known to be nonempty or not, and optimizers of the objective function with the least constraint violation prefer to be found. A natural way for dealing with these problems is to extend the nonlinear optimization problem as the one optimizing the objective function over the set of points with the least constrain…
▽ More
There are many important practical optimization problems whose feasible regions are not known to be nonempty or not, and optimizers of the objective function with the least constraint violation prefer to be found. A natural way for dealing with these problems is to extend the nonlinear optimization problem as the one optimizing the objective function over the set of points with the least constraint violation. This leads to the study of the shifted problem. This paper focuses on the constrained convex optimization problem. The sufficient condition for the closedness of the set of feasible shifts is presented and the continuity properties of the optimal value function and the solution mapping for the shifted problem are studied. Properties of the conjugate dual of the shifted problem are discussed through the relations between the dual function and the optimal value function. The solvability of the dual of the optimization problem with the least constraint violation is investigated. It is shown that, if the least violated shift is in the domain of the subdifferential of the optimal value function, then this dual problem has an unbounded solution set. Under this condition, the optimality conditions for the problem with the least constraint violation are established in term of the augmented Lagrangian. It is shown that the augmented Lagrangian method has the properties that the sequence of shifts converges to the least violated shift and the sequence of multipliers is unbounded. Moreover, it is proved that the augmented Lagrangian method is able to find an approximate solution to the problem with the least constraint violation.
Submitted 11 November, 2021;
originally announced November 2021.
-
Stability for Constrained Minimax Optimization
Authors:
Yu-Hong Dai,
Liwei Zhang
Abstract:
Minimax optimization problems are an important class of optimization problems arising from both modern machine learning and traditional research areas. We focus on the stability of constrained minimax optimization problems based on the notion of local minimax point by Dai and Zhang (2020). Firstly, we extend the classical Jacobian uniqueness conditions of nonlinear programming to the constrained minimax problem and prove that this set of properties is stable with respect to small $C^2$ perturbations. Secondly, we provide a set of conditions, called Property A, which does not require the strict complementarity condition for the upper level constraints. Finally, we prove that Property A is a sufficient condition for the strong regularity of the Karush-Kuhn-Tucker (KKT) system at the KKT point, and it is also a sufficient condition for the local Lipschitzian homeomorphism of the Kojima mapping near the KKT point.
Submitted 10 November, 2021;
originally announced November 2021.
-
Global Optimization via Schrödinger-Föllmer Diffusion
Authors:
Yin Dai,
Yuling Jiao,
Lican Kang,
Xiliang Lu,
Jerry Zhijian Yang
Abstract:
We study the problem of finding global minimizers of $V(x):\mathbb{R}^d\rightarrow\mathbb{R}$ approximately via sampling from a probability distribution $μ_σ$ with density $p_σ(x)=\dfrac{\exp(-V(x)/σ)}{\int_{\mathbb R^d} \exp(-V(y)/σ) dy }$ with respect to the Lebesgue measure for $σ\in (0,1]$ small enough.
We analyze a sampler based on the Euler-Maruyama discretization of the Schrödinger-Föllmer diffusion processes with stochastic approximation under appropriate assumptions on the step size $s$ and the potential $V$.
We prove that the output of the proposed sampler is an approximate global minimizer of $V(x)$ with high probability at the cost of sampling $\mathcal{O}(d^{3})$ standard normal random variables.
Numerical studies illustrate the effectiveness of the proposed method and its superiority to the Langevin method.
Submitted 17 August, 2022; v1 submitted 30 October, 2021;
originally announced November 2021.
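As a rough illustration of why sampling from $p_σ$ helps locate global minimizers, the sketch below evaluates the normalized weights $\exp(-V(x)/σ)$ on a one-dimensional grid and measures how much probability mass lies near the global minimizer for a large and a small $σ$. It is not the Schrödinger-Föllmer sampler analyzed in the abstract; the potential, the grid, the window radius, and all function names are hypothetical choices made only for this example.

```python
import math

def potential(x):
    # Hypothetical tilted double-well potential; its global minimizer is near x = -1.04.
    return (x * x - 1.0) ** 2 + 0.3 * x

def gibbs_weights(grid, sigma):
    """Normalized weights proportional to exp(-V(x)/sigma) on a uniform grid."""
    values = [potential(x) for x in grid]
    v_min = min(values)  # shift by the minimum for numerical stability
    raw = [math.exp(-(v - v_min) / sigma) for v in values]
    total = sum(raw)
    return [w / total for w in raw]

def mass_near(grid, weights, center, radius):
    """Total weight of grid points within `radius` of `center`."""
    return sum(w for x, w in zip(grid, weights) if abs(x - center) <= radius)

# Uniform grid on [-2, 2] (assumed search interval).
grid = [-2.0 + 4.0 * i / 4000 for i in range(4001)]
x_best = min(grid, key=potential)  # grid approximation of the global minimizer

mass_large_sigma = mass_near(grid, gibbs_weights(grid, 1.0), x_best, 0.25)
mass_small_sigma = mass_near(grid, gibbs_weights(grid, 0.05), x_best, 0.25)
print(x_best, mass_large_sigma, mass_small_sigma)
```

As $σ$ decreases, the weights concentrate around the global minimizer, so a sample drawn from $μ_σ$ is likely to land close to it; in higher dimensions a grid is no longer feasible, which is where a diffusion-based sampler becomes relevant.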
-
Efficient CI-Based One-Bit Precoding for Multiuser Downlink Massive MIMO Systems with PSK Modulation
Authors:
Zheyu Wu,
Bo Jiang,
Ya-Feng Liu,
Mingjie Shao,
Yu-Hong Dai
Abstract:
In this paper, we consider the one-bit precoding problem for the multiuser downlink massive multiple-input multiple-output (MIMO) system with phase shift keying (PSK) modulation. We focus on the celebrated constructive interference (CI)-based problem formulation. We first establish the NP-hardness of the problem (even in the single-user case), which reveals the intrinsic difficulty of globally solving the problem. Then, we propose a novel negative $\ell_1$ penalty model for the considered problem, which penalizes the one-bit constraint into the objective by a negative $\ell_1$-norm term, and show the equivalence between (global and local) solutions of the original problem and the penalty problem when the penalty parameter is sufficiently large. We further transform the penalty model into an equivalent min-max problem and propose an efficient alternating proximal/projection gradient descent ascent (APGDA) algorithm for solving it, which alternately performs a proximal gradient descent step over one block of variables and a projection gradient ascent step over the other block of variables. The APGDA algorithm enjoys a low per-iteration complexity and is guaranteed to converge to a stationary point of the min-max problem and a local minimizer of the penalty problem. To further reduce the computational cost, we also propose a low-complexity implementation of the APGDA algorithm, in which the values of the variables are fixed in later iterations once they satisfy the one-bit constraint. Numerical results show that, compared to the state-of-the-art CI-based algorithms, both of the proposed algorithms generally achieve better bit-error-rate (BER) performance with lower computational cost.
Submitted 10 October, 2023; v1 submitted 22 October, 2021;
originally announced October 2021.
-
A Novel Negative $\ell_1$ Penalty Approach for Multiuser One-Bit Massive MIMO Downlink with PSK Signaling
Authors:
Zheyu Wu,
Bo Jiang,
Ya-Feng Liu,
Yu-Hong Dai
Abstract:
This paper considers the one-bit precoding problem for the multiuser downlink massive multiple-input multiple-output (MIMO) system with phase shift keying (PSK) modulation and focuses on the celebrated constructive interference (CI)-based problem formulation. The discrete one-bit constraint makes the problem generally hard to solve. In this paper, we propose an efficient negative $\ell_1$ penalty approach for finding a high-quality solution of the considered problem. Specifically, we first propose a novel negative $\ell_1$ penalty model, which penalizes the one-bit constraint into the objective with a negative $\ell_1$-norm term, and show the equivalence between (global and local) solutions of the original problem and the penalty problem when the penalty parameter is sufficiently large. We further transform the penalty model into an equivalent min-max problem and propose an efficient alternating optimization (AO) algorithm for solving it. The AO algorithm enjoys low per-iteration complexity and is guaranteed to converge to a stationary point of the min-max problem. Numerical results show that, compared with the state-of-the-art CI-based algorithms, the proposed algorithm generally achieves better bit-error-rate (BER) performance with lower computational cost.
Submitted 7 February, 2022; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Optimal QoS-Aware Network Slicing for Service-Oriented Networks with Flexible Routing
Authors:
Wei-Kun Chen,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
In this paper, we consider the network slicing problem which attempts to map multiple customized virtual network requests (also called services) to a common shared network infrastructure and allocate network resources to meet diverse quality of service (QoS) requirements. We first propose a mixed integer nonlinear program (MINLP) formulation for this problem that optimizes the network resource consumption while jointly considering QoS requirements, flow routing, and resource budget constraints. In particular, the proposed formulation is able to flexibly route the traffic flow of the services on multiple paths and provide end-to-end (E2E) delay and reliability guarantees for all services. Due to the intrinsic nonlinearity, the MINLP formulation is computationally difficult to solve. To overcome this difficulty, we then propose a mixed integer linear program (MILP) formulation and show that the two formulations and their continuous relaxations are equivalent. Unlike the continuous relaxation of the MINLP formulation, which is a nonconvex nonlinear programming problem, the continuous relaxation of the MILP formulation is a polynomial-time solvable linear programming problem, which makes the MILP formulation computationally much more tractable. Numerical results demonstrate the effectiveness and efficiency of the proposed formulations over existing ones.
Submitted 27 February, 2022; v1 submitted 8 October, 2021;
originally announced October 2021.
-
Mirror frameworks for relatively Lipschitz and monotone-like variational inequalities
Authors:
Hui Zhang,
Yu-Hong Dai
Abstract:
Nonconvex-nonconcave saddle-point optimization in machine learning has triggered a great deal of research on non-monotone variational inequalities (VIs). In this work, we introduce two mirror frameworks, called the mirror extragradient method and the mirror extrapolation method, for approximating solutions to relatively Lipschitz and monotone-like VIs. The former covers the well-known Nemirovski's mirror prox method, Nesterov's dual extrapolation method, and the recently proposed Bregman extragradient method; all of them can be reformulated into a scheme that is very similar to the original form of the extragradient method. The latter includes the operator extrapolation method and the Bregman extrapolation method as special cases. The proposed mirror frameworks allow us to present a unified and improved convergence analysis for all these existing methods under relative Lipschitzness and monotone-like conditions, which may currently be the weakest assumptions.
Submitted 29 December, 2022; v1 submitted 26 August, 2021;
originally announced August 2021.
-
Derivative-free Alternating Projection Algorithms for General Nonconvex-Concave Minimax Problems
Authors:
Zi Xu,
Ziqi Wang,
Jingjing Shen,
Yuhong Dai
Abstract:
In this paper, we study zeroth-order algorithms for nonconvex-concave minimax problems, which have attracted wide attention in machine learning, signal processing and many other fields in recent years. We propose a zeroth-order alternating randomized gradient projection (ZO-AGP) algorithm for smooth nonconvex-concave minimax problems; its iteration complexity to obtain an $\varepsilon$-stationary point is bounded by $\mathcal{O}(\varepsilon^{-4})$, and the number of function value estimations is bounded by $\mathcal{O}(d_{x}+d_{y})$ per iteration. Moreover, we propose a zeroth-order block alternating randomized proximal gradient algorithm (ZO-BAPG) for solving block-wise nonsmooth nonconvex-concave minimax optimization problems; its iteration complexity to obtain an $\varepsilon$-stationary point is bounded by $\mathcal{O}(\varepsilon^{-4})$, and the number of function value estimations per iteration is bounded by $\mathcal{O}(K d_{x}+d_{y})$. To the best of our knowledge, this is the first time that zeroth-order algorithms with iteration complexity guarantees have been developed for solving both general smooth and block-wise nonsmooth nonconvex-concave minimax problems. Numerical results on a data poisoning attack problem and a distributed nonconvex sparse principal component analysis problem validate the efficiency of the proposed algorithms.
Submitted 25 January, 2024; v1 submitted 1 August, 2021;
originally announced August 2021.
-
Towards Efficient Large-Scale Network Slicing: An LP Dynamic Rounding-and-Refinement Approach
Authors:
Wei-Kun Chen,
Ya-Feng Liu,
Fan Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
In this paper, we propose an efficient algorithm for the network slicing problem which attempts to map multiple customized virtual network requests (also called services) to a common shared network infrastructure and allocate network resources to meet diverse service requirements. The problem has been formulated as a mixed integer linear programming (MILP) problem in the literature. We first propose a novel linear programming (LP) relaxation of the MILP formulation. We show that, compared with a natural LP relaxation of the MILP formulation, the novel LP relaxation is much more compact, in that it has fewer variables and constraints, and much stronger, in that it provides a better LP bound, which makes it particularly suitable to be embedded in an LP relaxation based algorithm. Then we design an efficient two-stage LP dynamic rounding-and-refinement algorithm based on this novel LP relaxation. In the first stage, the proposed algorithm uses an LP dynamic rounding procedure to place the virtual network functions of all services into cloud nodes while taking traffic routing of all services into consideration; in the second stage, the proposed algorithm uses an iterative LP refinement procedure to obtain a solution for traffic routing of all services with their end-to-end delay constraints being satisfied. Compared with the existing algorithms, which either have an exponential complexity or return a low-quality solution, our proposed algorithm achieves a better trade-off between the solution quality and the computational complexity. In particular, the worst-case complexity of our proposed algorithm is polynomial, which makes it suitable for solving large-scale problems. Numerical results demonstrate the effectiveness and efficiency of our proposed algorithm.
Submitted 11 February, 2023; v1 submitted 29 July, 2021;
originally announced July 2021.
-
A novel augmented Lagrangian method of multipliers for optimization with general inequality constraints
Authors:
Xin-Wei Liu,
Yu-Hong Dai,
Ya-Kui Huang,
Jie Sun
Abstract:
We introduce a twice differentiable augmented Lagrangian for nonlinear optimization with general inequality constraints and show that a strict local minimizer of the original problem is an approximate strict local solution of the augmented Lagrangian. A novel augmented Lagrangian method of multipliers (ALM) is then presented. Our method originates from a generalization of the Hestenes-Powell augmented Lagrangian, and is a combination of the augmented Lagrangian and the interior-point technique. It shares a similar algorithmic framework with existing ALMs for optimization with inequality constraints, but it can use the second derivatives and does not depend on projections on the set of inequality constraints. In each iteration, our method solves a twice continuously differentiable unconstrained optimization subproblem on primal variables. The dual iterates, penalty and smoothing parameters are updated adaptively. The global and local convergence are analyzed. Without assuming any constraint qualification, it is proved that the proposed method has strong global convergence. The method may converge to either a Karush-Kuhn-Tucker (KKT) point or a singular stationary point when the converging point is a minimizer. It may also converge to an infeasible stationary point of the nonlinear program when the problem is infeasible. Furthermore, our method is capable of rapidly detecting the possible infeasibility of the solved problem. Under suitable conditions, it is locally linearly convergent to the KKT point, which is consistent with ALMs for optimization with equality constraints. The preliminary numerical experiments on some small benchmark test problems demonstrate our theoretical results.
Submitted 28 June, 2021;
originally announced June 2021.
-
Explicit generators and relations for the centre of the quantum group
Authors:
Yanmin Dai,
Yang Zhang
Abstract:
For the standard Drinfeld-Jimbo quantum group ${\rm U}_q(\mathfrak{g})$ associated with a simple Lie algebra $\mathfrak{g}$, we construct explicit generators of the centre $Z({\rm U}_q(\mathfrak{g}))$, and determine the relations satisfied by the generators. For $\mathfrak{g}$ of type $A_n(n\geq 2)$, $D_{2k+1}(k\geq 2)$ or $E_6$, the centre $Z({\rm U}_q(\mathfrak{g}))$ is isomorphic to a quotient of a polynomial algebra in multiple variables, which is described in a uniform manner for all cases. For $\mathfrak{g}$ of any other type, $Z({\rm U}_q(\mathfrak{g}))$ is generated by $n=$rank$(\mathfrak{g})$ algebraically independent elements.
Submitted 15 February, 2021;
originally announced February 2021.
-
An efficient linear programming rounding-and-refinement algorithm for large-scale network slicing problem
Authors:
Wei-Kun Chen,
Ya-Feng Liu,
Yu-Hong Dai,
Zhi-Quan Luo
Abstract:
In this paper, we consider the network slicing problem which attempts to map multiple customized virtual network requests (also called services) to a common shared network infrastructure and allocate network resources to meet diverse service requirements, and propose an efficient two-stage algorithm for solving this NP-hard problem. In the first stage, the proposed algorithm uses an iterative linear programming (LP) rounding procedure to place the virtual network functions of all services into cloud nodes while taking traffic routing of all services into consideration; in the second stage, the proposed algorithm uses an iterative LP refinement procedure to obtain a solution for traffic routing of all services with their end-to-end delay constraints being satisfied. Compared with the existing algorithms which either have an exponential complexity or return a low-quality solution, our proposed algorithm achieves a better trade-off between solution quality and computational complexity. In particular, the worst-case complexity of our proposed algorithm is polynomial, which makes it suitable for solving large-scale problems. Numerical results demonstrate the effectiveness and efficiency of our proposed algorithm.
Submitted 4 February, 2021;
originally announced February 2021.
-
Efficient presolving methods for the influence maximization problem
Authors:
Sheng-Jie Chen,
Wei-Kun Chen,
Yu-Hong Dai,
Jian-Hua Yuan,
Hou-Shan Zhang
Abstract:
We consider the influence maximization problem (IMP) which asks for identifying a limited number of key individuals to spread influence in a network such that the expected number of influenced individuals is maximized. The stochastic maximal covering location problem (SMCLP) formulation is a mixed integer programming formulation that effectively approximates the IMP by Monte Carlo sampling. For IMPs with a large-scale network or a large number of samples, however, the SMCLP formulation cannot be efficiently solved by existing exact algorithms due to its large problem size. In this paper, we attempt to develop presolving methods to reduce the problem size and hence enhance the capability of employing exact algorithms in solving large-scale IMPs. In particular, we propose two effective presolving methods, called strongly connected nodes aggregation (SCNA) and isomorphic nodes aggregation (INA), respectively. The SCNA makes it possible to build a new SMCLP formulation that is potentially much more compact than the existing one, and the INA further eliminates variables and constraints in the SMCLP formulation. A theoretical analysis on two special cases of the IMP is provided to demonstrate the strength of the SCNA and INA in reducing the problem size of the SMCLP formulation. We integrate the proposed presolving methods, SCNA and INA, into the Benders decomposition algorithm, which is recognized as one of the state-of-the-art exact algorithms for solving the IMP. We show that the proposed SCNA and INA make it possible to develop a much faster separation algorithm for the Benders cuts. Numerical results demonstrate that with the SCNA and INA, the Benders decomposition algorithm is much more effective in solving the IMP in terms of solution time.
Submitted 5 July, 2023; v1 submitted 2 January, 2021;
originally announced January 2021.
-
Majorized Semi-proximal Alternating Coordinate Method for Nonsmooth Convex-Concave Minimax Optimization
Authors:
Yu-Hong Dai,
Jiani Wang,
Liwei Zhang
Abstract:
Minimax optimization problems are an important class of optimization problems arising from modern machine learning and traditional research areas. While there have been many numerical algorithms for solving smooth convex-concave minimax problems, numerical algorithms for nonsmooth convex-concave minimax problems are very rare. This paper aims to develop an efficient numerical algorithm for a structured nonsmooth convex-concave minimax problem. A majorized semi-proximal alternating coordinate method (mspACM) is proposed, in which a majorized quadratic convex-concave function is adopted for approximating the smooth part of the objective function and semi-proximal terms are added in each subproblem. This construction ensures that the subproblems at each iteration are solvable, and even easily solved when the semi-proximal terms are cleverly chosen. We prove the global convergence of the algorithm mspACM under mild assumptions, without requiring the strong convexity-concavity condition. Under the local metric subregularity of the solution mapping, we prove that the algorithm mspACM has a linear rate of convergence. Preliminary numerical results are reported to verify the efficiency of the algorithm mspACM.
Submitted 21 December, 2020;
originally announced December 2020.
-
Equipping Barzilai-Borwein method with two dimensional quadratic termination property
Authors:
Yakui Huang,
Yu-Hong Dai,
Xin-Wei Liu
Abstract:
A novel gradient stepsize is derived with the aim of equipping the Barzilai-Borwein (BB) method with the two-dimensional quadratic termination property. A remarkable feature of the novel stepsize is that its computation only depends on the BB stepsizes in previous iterations and does not require any exact line search or the Hessian, and hence it can easily be extended to nonlinear optimization. By adaptively taking long BB steps and some short steps associated with the new stepsize, we develop an efficient gradient method for quadratic optimization and general unconstrained optimization and extend it to solve extreme eigenvalue problems. The proposed method is further extended to box-constrained optimization and singly linearly box-constrained optimization by incorporating gradient projection techniques. Numerical experiments demonstrate that the proposed method outperforms the most successful gradient methods in the literature.
Submitted 9 January, 2021; v1 submitted 22 October, 2020;
originally announced October 2020.
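For readers unfamiliar with the BB stepsizes that the abstract above builds on, the following is a minimal sketch of the classical gradient method with the long BB stepsize $α_k = s^T s / s^T y$, where $s$ is the difference of consecutive iterates and $y$ the difference of consecutive gradients. It does not implement the novel stepsize of the paper; the test problem, the initial stepsize, the tolerance, and all names are assumptions made for illustration.

```python
def bb_gradient_method(grad, x0, alpha0=1e-3, tol=1e-8, max_iter=10000):
    """Classical gradient method with the long Barzilai-Borwein stepsize."""
    x = list(x0)
    g = grad(x)
    alpha = alpha0  # assumed initial stepsize; BB needs one ordinary step to start
    for k in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 <= tol:
            return x, k
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # iterate difference
        y = [a - b for a, b in zip(g_new, g)]  # gradient difference
        sy = sum(si * yi for si, yi in zip(s, y))
        ss = sum(si * si for si in s)
        # Long BB stepsize; keep the previous stepsize if the curvature
        # estimate s^T y is not positive (a simple safeguard, assumed here).
        if sy > 0.0:
            alpha = ss / sy
        x, g = x_new, g_new
    return x, max_iter

# Hypothetical test problem: f(x) = 0.5 * sum(D_i * x_i^2) - sum(B_i * x_i),
# a strictly convex quadratic whose minimizer is x_i = B_i / D_i.
D = [1.0, 10.0, 100.0]
B = [1.0, 1.0, 1.0]

def grad_quadratic(x):
    return [d * xi - b for d, xi, b in zip(D, x, B)]

x_star, iters = bb_gradient_method(grad_quadratic, [0.0, 0.0, 0.0])
print(x_star, iters)
```

Note that the stepsize uses only quantities already available from the previous iteration, with no line search and no Hessian, which is the property the abstract highlights for the BB family.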