Search | arXiv e-print repository

arXiv:2406.19377 [pdf, ps, other]

Grassmannian optimization is NP-hard

Abstract: We show that unconstrained quadratic optimization over a Grassmannian $\operatorname{Gr}(k,n)$ is NP-hard. Our results cover all scenarios: (i) when $k$ and $n$ are both allowed to grow; (ii) when $k$ is arbitrary but fixed; (iii) when $k$ is fixed at its lowest possible value $1$. We then deduce the NP-hardness of unconstrained cubic optimization over the Stiefel manifold $\operatorname{V}(k,n)$ and the orthogonal group $\operatorname{O}(n)$. As an addendum we demonstrate the NP-hardness of unconstrained quadratic optimization over the Cartan manifold, i.e., the positive definite cone $\mathbb{S}^n_{\scriptscriptstyle++}$ regarded as a Riemannian manifold, another popular example in manifold optimization. We will also establish the nonexistence of $\mathrm{FPTAS}$ in all cases.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 19 pages

MSC Class: 03D15; 90C26; 90C23; 65K10; 68Q25; 90C60

arXiv:2406.11821 [pdf, ps, other]

Simple matrix expressions for the curvatures of Grassmannian

Authors: Zehua Lai, Lek-Heng Lim, Ke Ye

Abstract: We show that modeling a Grassmannian as symmetric orthogonal matrices $\operatorname{Gr}(k,\mathbb{R}^n) \cong\{Q \in \mathbb{R}^{n \times n} : Q^{\scriptscriptstyle\mathsf{T}} Q = I, \; Q^{\scriptscriptstyle\mathsf{T}} = Q,\; \operatorname{tr}(Q)=2k - n\}$ yields exceedingly simple matrix formulas for various curvatures and curvature-related quantities, both intrinsic and extrinsic. These include Riemann, Ricci, Jacobi, sectional, scalar, mean, principal, and Gaussian curvatures; Schouten, Weyl, Cotton, Bach, Plebański, cocurvature, nonmetricity, and torsion tensors; first, second, and third fundamental forms; Gauss and Weingarten maps; and upper and lower delta invariants. We will derive explicit, simple expressions for the aforementioned quantities in terms of standard matrix operations that are stably computable with numerical linear algebra. Many of these aforementioned quantities have never before been presented for the Grassmannian.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 25 pages

MSC Class: 15A75; 14M15

arXiv:2405.08047 [pdf, other]

Autonomous Sparse Mean-CVaR Portfolio Optimization

Authors: Yizun Lin, Yangyu Zhang, Zhao-Rong Lai, Cheng Li

Abstract: The $\ell_0$-constrained mean-CVaR model poses a significant challenge due to its NP-hard nature, typically tackled through combinatorial methods characterized by high computational demands. From a markedly different perspective, we propose an innovative autonomous sparse mean-CVaR portfolio model, capable of approximating the original $\ell_0$-constrained mean-CVaR model with arbitrary accuracy. The core idea is to convert the $\ell_0$ constraint into an indicator function and subsequently handle it through a tailed approximation. We then propose a proximal alternating linearized minimization algorithm, coupled with a nested fixed-point proximity algorithm (both convergent), to iteratively solve the model. Autonomy in sparsity refers to retaining a significant portion of assets within the selected asset pool during adjustments in pool size. Consequently, our framework offers a theoretically guaranteed approximation of the $\ell_0$-constrained mean-CVaR model, improving computational efficiency while providing a robust asset selection scheme.

Submitted 13 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2306.11286 [pdf, other]

Globally Optimal Solutions to a Class of Fractional Optimization Problems Based on Proximal Gradient Algorithm

Authors: Yizun Lin, Jian-Feng Cai, Zhao-Rong Lai, Cheng Li

Abstract: In this paper, we investigate a category of constrained fractional optimization problems that emerge in various practical applications. The objective function for this category is characterized by the ratio of a numerator and denominator, both being convex, semi-algebraic, Lipschitz continuous, and differentiable with Lipschitz continuous gradients over the constraint sets. The constraint sets associated with these problems are closed, convex, and semi-algebraic. We propose an efficient algorithm that is inspired by the proximal gradient method, and we provide a thorough convergence analysis. Our algorithm offers several benefits compared to existing methods. It requires only a single proximal gradient operation per iteration, thus avoiding the complicated inner-loop concave maximization usually required. Additionally, our method converges to a critical point without the typical need for a nonnegative numerator, and this critical point becomes a globally optimal solution under an appropriate condition. Our approach is adaptable to unbounded constraint sets as well. Therefore, our approach is viable for many more practical models. Numerical experiments show that our method not only reliably reaches ground-truth solutions in some model problems but also outperforms several existing methods in maximizing the Sharpe ratio with real-world financial data.

Submitted 15 May, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: 29 pages, 2 figures

MSC Class: 90C26; 90C32; 65K05

arXiv:2303.11730 [pdf, other]

Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices

Authors: Jingyi Xu, Tushar Vaidya, Yufei Wu, Saket Chandra, Zhangsheng Lai, Kai Fong Ernest Chong

Abstract: We introduce algebraic machine reasoning, a new reasoning framework that is well-suited for abstract reasoning. Effectively, algebraic machine reasoning reduces the difficult process of novel problem-solving to routine algebraic computation. The fundamental algebraic objects of interest are the ideals of some suitably initialized polynomial ring. We shall explain how solving Raven's Progressive Matrices (RPMs) can be realized as computational problems in algebra, which combine various well-known algebraic subroutines that include computing the Gröbner basis of an ideal, checking for ideal containment, etc. Crucially, the additional algebraic structure satisfied by ideals allows for more operations on ideals beyond set-theoretic operations. Our algebraic machine reasoning framework is not only able to select the correct answer from a given answer set, but also able to generate the correct answer with only the question matrix given. Experiments on the I-RAVEN dataset yield an overall $93.2\%$ accuracy, which significantly outperforms the current state-of-the-art accuracy of $77.0\%$ and exceeds human performance at $84.4\%$ accuracy.

Submitted 21 March, 2023; originally announced March 2023.

Comments: Accepted at IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023. 30 pages, 7 figures (including supplementary material). First three authors contributed equally. Code is available at: https://github.com/Xu-Jingyi/AlgebraicMR

MSC Class: 13P25; 68W30

ACM Class: I.1; I.2.4; I.2.6; I.5.1

arXiv:2212.12677 [pdf, other]

Towards a Multimodal Charging Network: Joint Planning of Charging Stations and Battery Swapping Stations for Electrified Ride-Hailing Fleets

Authors: Zhijie Lai, Sen Li

Abstract: This paper considers a multimodal charging network in which charging stations and battery swapping stations are jointly built to support an electric ride-hailing fleet synergistically. Our argument is based on the observation that charging an EV is a time-consuming burden, and battery swapping faces scaling issues due to its deployment costs. However, charging stations are cost-effective, making them ideal for scaling up EV fleets, while battery swapping stations offer quick turnaround and can be deployed in tandem with charging stations to improve fleet utilization and reduce operational costs. To fulfill this vision, we consider a ride-hailing platform that jointly builds charging and battery swapping stations to support an EV fleet. An optimization model is proposed to capture the platform's planning and operational decisions. In particular, the model incorporates essential components such as elastic passenger demand, spatial charging equilibrium, charging and swapping congestion, etc. The overall problem is formulated as a nonconcave program. Instead of pursuing the globally optimal solution, we establish a tight upper bound through relaxation and decomposition, allowing us to evaluate the solution optimality even in the absence of concavity. Through case studies for Manhattan, New York City, we find that joint planning of charging and battery swapping stations outperforms deploying only one of them, yielding a total profit that is 11.7% higher than swapping-only deployment under a limited budget, and 17.5% higher than charging-only deployment under a sufficient budget. These results underscore the complementary benefit between charging and battery swapping facilities.

Submitted 1 April, 2024; v1 submitted 24 December, 2022; originally announced December 2022.

arXiv:2212.00212 [pdf, other]

Simpler flag optimization

Authors: Zehua Lai, Lek-Heng Lim, Ke Ye

Abstract: We study the geometry of flag manifolds under different embeddings into a product of Grassmannians. We show that differential geometric objects and operations -- tangent vector, metric, normal vector, exponential map, geodesic, parallel transport, gradient, Hessian, etc -- have closed-form analytic expressions that are computable with standard numerical linear algebra. Furthermore, we are able to derive a coordinate descent method in the flag manifold that performs well compared to other gradient descent methods.

Submitted 30 November, 2022; originally announced December 2022.

Comments: 26 pages

MSC Class: 14M15; 90C30; 90C53; 49Q12; 65F25; 62H12

arXiv:2212.00208 [pdf, other]

Haagerup bound for quaternionic Grothendieck inequality

Authors: Shmuel Friedland, Zehua Lai, Lek-Heng Lim

Abstract: We present here several versions of the Grothendieck inequality over the skew field of quaternions: The first one is the standard Grothendieck inequality for rectangular matrices, and two additional inequalities for self-adjoint matrices, as introduced by the first and the last authors in a recent paper. We give several results on the ``conic Grothendieck inequality'', such as the Nesterov $\pi/2$-Theorem, which corresponds to the cones of positive semidefinite matrices; the Goemans--Williamson inequality, which corresponds to the cones of weighted Laplacians; and the diagonally dominant matrices. The most challenging technical part of this paper is the proof of the analog of Haagerup's result that the inverse of the hypergeometric function $x {}_2F_1(\frac{1}{2}, \frac{1}{2}; 3; x^2)$ has first positive Taylor coefficient and all other Taylor coefficients are nonpositive.

Submitted 30 November, 2022; originally announced December 2022.

Comments: 36 pages

MSC Class: 47A07; 46B28; 46B85; 81P40; 81P45; 03D15; 97K30; 47N10; 90C27

arXiv:2211.15310 [pdf, other]

Stochastic Steffensen method

Authors: Minda Zhao, Zehua Lai, Lek-Heng Lim

Abstract: Is it possible for a first-order method, i.e., only first derivatives allowed, to be quadratically convergent? For univariate loss functions, the answer is yes -- the Steffensen method avoids second derivatives and is still quadratically convergent like the Newton method. By incorporating an optimal step size we can even push its convergence order beyond quadratic to $1+\sqrt{2} \approx 2.414$. While such high convergence orders are a pointless overkill for a deterministic algorithm, they become rewarding when the algorithm is randomized for problems of massive sizes, as randomization invariably compromises convergence speed. We will introduce two adaptive learning rates inspired by the Steffensen method, intended for use in a stochastic optimization setting and requiring no hyperparameter tuning aside from batch size. Extensive experiments show that they compare favorably with several existing first-order methods. When restricted to a quadratic objective, our stochastic Steffensen methods reduce to the randomized Kaczmarz method -- note that this is not true for SGD or SLBFGS -- and thus we may also view our methods as a generalization of randomized Kaczmarz to arbitrary objectives.

Submitted 28 November, 2022; originally announced November 2022.

Comments: 22 pages, 3 figures

MSC Class: 65K10; 65B05; 65C05; 68W20
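
The univariate claim in the abstract above can be illustrated with a minimal sketch of the classical, deterministic Steffensen iteration applied to the derivative of a loss. This is not the paper's stochastic method or its adaptive learning rates; the function names, the test loss $f(x) = e^x - 2x$, and the tolerances are illustrative assumptions.

```python
import math


def steffensen_minimize(grad, x0, tol=1e-12, max_iter=50):
    """Find a stationary point of a univariate loss using only its first derivative.

    Classical Steffensen update applied to g = f':
        x <- x - g(x)^2 / (g(x + g(x)) - g(x))
    The difference quotient stands in for the second derivative that
    Newton's method would need.
    """
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:          # gradient small enough: stop
            break
        denom = grad(x + g) - g   # finite-difference surrogate for f''(x) * g
        if denom == 0.0:          # guard against division by zero
            break
        x -= g * g / denom
    return x


def grad_f(x):
    # Hypothetical test loss f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2.
    # Its unique minimizer is x = ln 2.
    return math.exp(x) - 2.0


x_star = steffensen_minimize(grad_f, x0=0.5)
print(x_star, math.log(2.0))
```

Each step costs two derivative evaluations and no second derivative, which is the sense in which the iteration is first-order.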

arXiv:2206.03298 [pdf, other]

doi 10.1109/TITS.2023.3340253

Spatiotemporal Pricing and Fleet Management of Autonomous Mobility-on-Demand Networks: A Decomposition and Dynamic Programming Approach with Bounded Optimality Gap

Authors: Zhijie Lai, Sen Li

Abstract: This paper studies spatiotemporal pricing and fleet management for autonomous mobility-on-demand (AMoD) systems while taking elastic demand into account. We consider a platform that offers ride-hailing services using a fleet of autonomous vehicles and makes pricing, rebalancing, and fleet sizing decisions in response to demand fluctuations. A network flow model is developed to characterize the evolution of system states over space and time, which captures the vehicle-passenger matching process and demand elasticity with respect to price and waiting time. The platform's objective of maximizing profit is formulated as a constrained optimal control problem, which is highly nonconvex due to the nonlinear demand model and complex supply-demand interdependence. To address this challenge, an integrated decomposition and dynamic programming approach is proposed, where we first relax the problem through a change of variable, then separate the relaxed problem into a few small-scale subproblems via dual decomposition, and finally solve each subproblem using dynamic programming. Despite the nonconvexity, our approach establishes a theoretical upper bound to evaluate the solution optimality. The proposed model and methodology are validated in numerical studies for Manhattan. We find that compared to the benchmark case, the proposed upper bound is significantly tighter. We also find that compared to pricing alone, joint pricing and fleet rebalancing can only offer a minor profit improvement when demand can be accurately predicted. However, during unanticipated demand surges, joint pricing and rebalancing can lead to substantially improved profits, and the impacts of demand shocks, despite being more widespread, can dissipate faster.

Submitted 5 December, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

Journal ref: IEEE Transactions on Intelligent Transportation Systems (2023)

arXiv:2203.09762 [pdf, ps, other]

doi 10.1007/s10957-024-02403-8

Riemannian Interior Point Methods for Constrained Optimization on Manifolds

Authors: Zhijian Lai, Akiko Yoshise

Abstract: We extend the classical primal-dual interior point method from the Euclidean setting to the Riemannian one. Our method, named the Riemannian interior point method, is for solving Riemannian constrained optimization problems. We establish its local superlinear and quadratic convergence under the standard assumptions. Moreover, we show its global convergence when it is combined with a classical line search. Our method is a generalization of the classical framework of primal-dual interior point methods for nonlinear nonconvex programming. Numerical experiments show the stability and efficiency of our method.

Submitted 7 February, 2024; v1 submitted 18 March, 2022; originally announced March 2022.

Comments: 25 pages

Report number: Department of Policy and Planning Sciences Discussion Paper Series No.1381

MSC Class: 90C51; 90C53; 90C46

Journal ref: Journal of Optimization Theory and Applications, 2024

arXiv:2112.07218 [pdf, ps, other]

doi 10.1016/j.tra.2024.103975

Regulating Transportation Network Companies with a Mixture of Autonomous Vehicles and For-Hire Human Drivers

Authors: Di Ao, Jing Gao, Zhijie Lai, Sen Li

Abstract: This paper investigates the equity impacts of autonomous vehicles (AV) on for-hire human drivers and passengers in a ride-hailing market, and examines regulation policies that protect human drivers and improve transport equity for ride-hailing passengers. We consider a transportation network company (TNC) that employs a mixture of AVs and human drivers to provide ride-hailing services. The TNC platform determines the spatial prices, fleet size, human driver payments, and vehicle relocation strategies to maximize its profit, while individual passengers choose between different transport modes to minimize their travel costs. A market equilibrium model is proposed to capture the interactions among passengers, human drivers, AVs, and TNC over the transportation network. The overall problem is formulated as a non-concave program, and an algorithm is developed to derive its approximate solution with a theoretical performance guarantee. Our study shows that TNC prioritizes AV deployment in higher-demand areas to make a higher profit. As AVs flood into these higher-demand areas, they compete with human drivers in the urban core and push them to relocate to suburbs. This leads to reduced earning opportunities for human drivers and increased spatial inequity for passengers. To mitigate these concerns, we consider: (a) a minimum wage for human drivers; and (b) a restrictive pickup policy that prohibits AVs from picking up passengers in higher-demand areas. In the former case, we show that a minimum wage for human drivers will protect them from the negative impact of AVs with negligible impacts on passengers. However, there exists a threshold beyond which the minimum wage will trigger the platform to replace the majority of human drivers with AVs.

Submitted 17 December, 2023; v1 submitted 14 December, 2021; originally announced December 2021.

Report number: Volume 180, February 2024, 103975

Journal ref: Transportation Research Part A: Policy and Practice, Volume 180, 2024, 103975, ISSN 0965-8564

arXiv:2107.01538 [pdf, other]

Completely Positive Factorization by a Riemannian Smoothing Method

Authors: Zhijian Lai, Akiko Yoshise

Abstract: Copositive optimization is a special case of convex conic programming, and it consists of optimizing a linear function over the cone of all completely positive matrices under linear constraints. Copositive optimization provides powerful relaxations of NP-hard quadratic problems or combinatorial problems, but there are still many open problems regarding copositive or completely positive matrices. In this paper, we focus on one such problem: finding a completely positive (CP) factorization for a given completely positive matrix. We treat it as a nonsmooth Riemannian optimization problem, i.e., a minimization problem of a nonsmooth function over a Riemannian manifold. To solve this problem, we present a general smoothing framework for solving nonsmooth Riemannian optimization problems and show convergence to a stationary point of the original problem. An advantage is that we can implement it quickly with minimal effort by directly using the existing standard smooth Riemannian solvers, such as Manopt. Numerical experiments show the efficiency of our method especially for large-scale CP factorizations.

Submitted 4 October, 2022; v1 submitted 4 July, 2021; originally announced July 2021.

Comments: 25 pages

Report number: Department of Policy and Planning Sciences Discussion Paper Series No. 1377, University of Tsukuba

Journal ref: Computational Optimization and Applications, 2022

arXiv:2106.08162 [pdf, other]

doi 10.1016/j.trc.2022.103669

On-Demand Valet Charging for Electric Vehicles: Economic Equilibrium, Infrastructure Planning and Regulatory Incentives

Authors: Zhijie Lai, Sen Li

Abstract: Many city residents cannot install their private electric vehicle (EV) chargers due to the lack of dedicated parking spaces or insufficient grid capacity. This presents a significant barrier towards large-scale EV adoption. To address this concern, this paper considers a novel business model, on-demand valet charging, that unlocks the potential of under-utilized public charging infrastructure to promise higher EV penetration. In the proposed model, a platform recruits a fleet of couriers that shuttle between customers and public charging stations to provide on-demand valet charging services to EV owners at an affordable price. Couriers are dispatched to pick up low-battery EVs from customers, deliver the EVs to charging stations, plug them in, and then return the fully-charged EVs to customers. To depict the proposed business model, we develop a queuing network to represent the stochastic matching dynamics, and further formulate an economic equilibrium model to capture the incentives of couriers, customers as well as the platform. These models are used to examine how charging infrastructure planning and regulatory intervention will affect the market outcome. First, we find that the optimal charging station densities for distinct stakeholders are different: couriers prefer a lower density; the platform prefers a higher density; while the density in-between leads to the highest EV penetration as it balances the time traveling to and queuing at charging stations. Second, we evaluate a regulatory policy that imposes a tax on the platform and invests the tax revenue in public charging infrastructure. Numerical results suggest that this regulation can suppress the platform's market power associated with monopoly pricing, increase social welfare, and facilitate the market expansion.

Submitted 9 November, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Journal ref: Transportation Research Part C: Emerging Technologies, 140, p.103669 (2022)

arXiv:2102.03389 [pdf, other]

Online Statistical Inference for Stochastic Optimization via Kiefer-Wolfowitz Methods

Authors: Xi Chen, Zehua Lai, He Li, Yichen Zhang

Abstract: This paper investigates the problem of online statistical inference of model parameters in stochastic optimization problems via the Kiefer-Wolfowitz algorithm with random search directions. We first present the asymptotic distribution for the Polyak-Ruppert-averaging type Kiefer-Wolfowitz (AKW) estimators, whose asymptotic covariance matrices depend on the distribution of search directions and the function-value query complexity. The distributional result reflects the trade-off between statistical efficiency and function query complexity. We further analyze the choice of random search directions to minimize certain summary statistics of the asymptotic covariance matrix. Based on the asymptotic distribution, we conduct online statistical inference by providing two construction procedures of valid confidence intervals.

Submitted 9 December, 2023; v1 submitted 5 February, 2021; originally announced February 2021.

arXiv:2009.13502 [pdf, other]

Simpler Grassmannian optimization

Authors: Zehua Lai, Lek-Heng Lim, Ke Ye

Abstract: There are two widely used models for the Grassmannian $\operatorname{Gr}(k,n)$, as the set of equivalence classes of orthogonal matrices $\operatorname{O}(n)/(\operatorname{O}(k) \times \operatorname{O}(n-k))$, and as the set of trace-$k$ projection matrices $\{P \in \mathbb{R}^{n \times n} : P^{\mathsf{T}} = P = P^2,\; \operatorname{tr}(P) = k\}$. The former, standard in manifold optimization, has the advantage of giving numerically stable algorithms but the disadvantage of having to work with equivalence classes of matrices. The latter, widely used in coding theory and probability, has the advantage of using actual matrices (as opposed to equivalence classes) but working with projection matrices is numerically unstable. We present an alternative that has both advantages and suffers from neither of the disadvantages; by representing $k$-dimensional subspaces as symmetric orthogonal matrices of trace $2k-n$, we obtain \[ \operatorname{Gr}(k,n) \cong \{Q \in \operatorname{O}(n) : Q^{\mathsf{T}} = Q, \; \operatorname{tr}(Q) = 2k -n\}. \] As with the other two models, we show that differential geometric objects and operations -- tangent vector, metric, normal vector, exponential map, geodesic, parallel transport, gradient, Hessian, etc -- have closed-form analytic expressions that are computable with standard numerical linear algebra. In the proposed model, these expressions are considerably simpler, a result of representing $\operatorname{Gr}(k,n)$ as a linear section of a compact matrix Lie group $\operatorname{O}(n)$, and can be computed with at most one QR decomposition and one exponential of a special skew-symmetric matrix that takes only $O(nk(n-k))$ time. In particular, we completely avoid eigen- and singular value decompositions in our steepest descent, conjugate gradient, quasi-Newton, and Newton methods for the Grassmannian.

Submitted 28 September, 2020; originally announced September 2020.

Comments: 34 pages, 4 figures

MSC Class: 14M15; 90C30; 90C53; 49Q12; 65F25; 62H12
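
The two matrix models named in the abstract above can be related by a small numerical check. The sketch below assumes the identification $Q = 2P - I$ between a trace-$k$ projection matrix $P$ and a symmetric orthogonal matrix $Q$; that map is an assumption here, chosen because it is consistent with the stated trace condition $\operatorname{tr}(Q) = 2k - n$. All function names are hypothetical, and the code uses plain Python lists rather than a linear algebra library.

```python
import random


def orthonormal_basis(n, k, seed=0):
    """Return k orthonormal vectors in R^n (Gram-Schmidt on random Gaussian vectors)."""
    rng = random.Random(seed)
    basis = []
    while len(basis) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for b in basis:  # remove components along the vectors found so far
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > 1e-8:  # skip (nearly) dependent draws
            basis.append([vi / norm for vi in v])
    return basis


def projection(basis, n):
    """Orthogonal projection P = sum_b b b^T onto span(basis); tr(P) = k."""
    return [[sum(b[i] * b[j] for b in basis) for j in range(n)] for i in range(n)]


def involution(P):
    """Assumed identification Q = 2P - I."""
    n = len(P)
    return [[2.0 * P[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


n, k = 6, 2
P = projection(orthonormal_basis(n, k), n)
Q = involution(P)
Qt = [[Q[j][i] for j in range(n)] for i in range(n)]
QtQ = matmul(Qt, Q)

# Deviations from the three defining conditions of the symmetric orthogonal model.
sym_err = max(abs(Q[i][j] - Q[j][i]) for i in range(n) for j in range(n))
orth_err = max(abs(QtQ[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
trace_Q = sum(Q[i][i] for i in range(n))

print(sym_err, orth_err, trace_Q, 2 * k - n)
```

Under this assumption, $Q^{\mathsf{T}} Q = (2P - I)^2 = 4P^2 - 4P + I = I$ follows from $P^2 = P$, so the three printed quantities should show a symmetric, orthogonal matrix of trace $2k - n$ up to rounding error.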

arXiv:2006.01510 [pdf, ps, other]

Recht-Ré Noncommutative Arithmetic-Geometric Mean Conjecture is False

Authors: Zehua Lai, Lek-Heng Lim

Abstract: Stochastic optimization algorithms have become indispensable in modern machine learning. An unresolved foundational question in this area is the difference between with-replacement sampling and without-replacement sampling -- does the latter have superior convergence rate compared to the former? A groundbreaking result of Recht and Ré reduces the problem to a noncommutative analogue of the arithmetic-geometric mean inequality where $n$ positive numbers are replaced by $n$ positive definite matrices. If this inequality holds for all $n$, then without-replacement sampling indeed outperforms with-replacement sampling. The conjectured Recht-Ré inequality has so far only been established for $n = 2$ and a special case of $n = 3$. We will show that the Recht-Ré conjecture is false for general $n$. Our approach relies on the noncommutative Positivstellensatz, which allows us to reduce the conjectured inequality to a semidefinite program and the validity of the conjecture to certain bounds for the optimum values, which we show are false as soon as $n = 5$.

Submitted 2 June, 2020; originally announced June 2020.

Comments: 10 pages

MSC Class: 15A45; 47A13; 90C22; 13J30; 15B48; 68W20

Journal ref: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 108, 2020

Showing 1–17 of 17 results for author: Lai, Z