INFORMS Journal on Optimization
Published by Institute for Operations Research and the Management Sciences

2575-1492, 2575-1484

Vedat Bayram ◽  
Gohram Baloch ◽  
Fatma Gzara ◽  
Samir Elhedhli

Optimizing warehouse processes has a direct impact on supply chain responsiveness, timely order fulfillment, and customer satisfaction. In this work, we focus on the picking process in warehouse management and study it from a data perspective. Using historical data from an industrial partner, we introduce, model, and study the robust order batching problem (ROBP), which groups orders into batches to minimize total order processing time while accounting for uncertainty caused by system congestion and human behavior. We provide a generalizable, data-driven approach that overcomes the warehouse-specific assumptions characterizing most of the work in the literature. We analyze historical data to understand the processes in the warehouse, to predict processing times, and to improve order processing. We introduce the ROBP and develop an efficient learning-based branch-and-price algorithm based on simultaneous column and row generation, embedded with alternative prediction models, such as linear regression and random forest, that predict the processing time of a batch. We conduct extensive computational experiments to test the performance of the proposed approach and to derive managerial insights based on real data. The data-driven prescriptive analytics tool we propose achieves savings of seven to eight minutes per order, which translates into a 14.8% increase in the daily picking operations capacity of the warehouse.

Edward Anderson ◽  
Andy Philpott

Sample average approximation is a popular approach to solving stochastic optimization problems. It has been widely observed that some form of robustification of these problems often improves the out-of-sample performance of the solution estimators. In estimation problems, this improvement boils down to a trade-off between the opposing effects of bias and shrinkage. This paper aims to characterize the features of more general optimization problems that exhibit this behaviour when a distributionally robust version of the sample average approximation problem is used. The paper restricts attention to quadratic problems for which sample average approximation solutions are unbiased and shows that expected out-of-sample performance can be calculated for small amounts of robustification and depends on the type of distributionally robust model used and properties of the underlying ground-truth probability distribution of random variables. The paper was written as part of a New Zealand funded research project that aimed to improve stochastic optimization methods in the electric power industry. The authors of the paper have worked together in this domain for the past 25 years.
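
As a rough illustration of the bias and shrinkage trade-off mentioned above, the following sketch solves a one-dimensional quadratic problem, minimizing the expected value of (x - xi)^2, by sample average approximation. The SAA solution is the sample mean, which is unbiased. The function `shrunk_solution` is a hypothetical stand-in for robustification: it adds a ridge-style penalty delta * x^2, which is an assumption made here for illustration and not one of the distributionally robust models studied in the paper.

```python
from statistics import fmean


def saa_objective(x, samples):
    """Sample average of the quadratic loss (x - xi)^2."""
    return fmean([(x - xi) ** 2 for xi in samples])


def saa_solution(samples):
    """Minimizer of the sample-average objective: the sample mean."""
    return fmean(samples)


def shrunk_solution(samples, delta):
    """Minimizer of the sample average of (x - xi)^2 plus delta * x^2.

    Assumption for illustration only: a ridge-style penalty, which
    shrinks the SAA solution toward zero and so introduces bias.
    """
    return saa_solution(samples) / (1.0 + delta)


samples = [1.0, 2.0, 3.0, 6.0]
x_saa = saa_solution(samples)
x_shrunk = shrunk_solution(samples, delta=0.5)
```

Whether the shrunk estimator performs better out of sample depends on the true distribution and on delta, which is the kind of dependence the abstract describes.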

Maher Nouiehed ◽  
Meisam Razaviyayn

With the increasing popularity of nonconvex deep models, developing a unifying theory for studying the optimization problems that arise from training these models has become important. Toward this end, we present in this paper a unifying landscape analysis framework that can be used when the training objective function is the composite of simple functions. Using the local openness property of the underlying training models, we provide simple sufficient conditions under which any local optimum of the resulting optimization problem is globally optimal. We first completely characterize the local openness of the symmetric and nonsymmetric matrix multiplication mapping. Then we use our characterization to (1) provide a simple proof for the classical result of Burer-Monteiro and extend it to noncontinuous loss functions; (2) show that every local optimum of two-layer linear networks is globally optimal (unlike many existing results in the literature, our result requires no assumption on the target data matrix [Formula: see text] or the input data matrix [Formula: see text]); (3) develop a complete characterization of the local/global optima equivalence of multilayer linear neural networks (we provide various counterexamples to show the necessity of each of our assumptions); and (4) show global/local optima equivalence of overparameterized nonlinear deep models having a certain pyramidal structure. In contrast to existing works, our result requires no assumption on the differentiability of the activation functions and can go beyond “full-rank” cases.

Junyu Zhang ◽  
Lin Xiao ◽  
Shuzhong Zhang

The cubic regularized Newton method of Nesterov and Polyak has become increasingly popular for nonconvex optimization because of its capability of finding an approximate local solution with a second-order guarantee and its low iteration complexity. Several recent works extend this method to the setting of minimizing the average of N smooth functions by replacing the exact gradients and Hessians with subsampled approximations. It is shown that the total Hessian sample complexity can be reduced to be sublinear in N per iteration by leveraging stochastic variance reduction techniques. We present an adaptive variance reduction scheme for a subsampled Newton method with cubic regularization and show that the expected Hessian sample complexity is [Formula: see text] for finding an [Formula: see text]-approximate local solution (in terms of first- and second-order guarantees, respectively). Moreover, we show that the same Hessian sample complexity is retained with fixed sample sizes if exact gradients are used. The techniques of our analysis differ from previous works in that we do not rely on high-probability bounds based on matrix concentration inequalities. Instead, we derive and utilize new bounds on the third- and fourth-order moments of the average of random matrices, which are of independent interest.
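
For readers unfamiliar with cubic regularization, the following is a minimal sketch of a single exact step in one dimension, where the cubic model m(s) = g*s + 0.5*h*s^2 + (M/6)*|s|^3 has a closed-form global minimizer. It uses exact derivatives and does not implement the subsampling or variance reduction scheme of the paper; the function names and the scalar setting are assumptions made for illustration.

```python
import math


def cubic_model(s, g, h, M):
    """Cubic-regularized second-order model of the objective change."""
    return g * s + 0.5 * h * s * s + (M / 6.0) * abs(s) ** 3


def cubic_newton_step(g, h, M):
    """Global minimizer of cubic_model over scalar s, for M > 0.

    The step length r = |s| is the nonnegative root of
    (M/2) * r^2 + h * r - |g| = 0, and the step points against the
    gradient. When g == 0 either sign is optimal; +1 is chosen here.
    """
    r = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    direction = -1.0 if g > 0 else 1.0
    return direction * r


# At a stationary point with negative curvature (g = 0, h < 0), a plain
# Newton step is zero, while the cubic step still decreases the model.
s_escape = cubic_newton_step(0.0, -1.0, 2.0)
```

The nonzero step at a point with zero gradient and negative curvature is what underlies the second-order guarantee mentioned in the abstract.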

Zhongruo Wang ◽  
Bingyuan Liu ◽  
Shixiang Chen ◽  
Shiqian Ma ◽  
Lingzhou Xue

Spectral clustering is one of the fundamental unsupervised learning methods and is widely used in data analysis. Sparse spectral clustering (SSC) imposes sparsity on spectral clustering, which improves the interpretability of the model. One widely adopted model for SSC in the literature is an optimization problem over the Stiefel manifold with a nonsmooth and nonconvex objective. Such an optimization problem is very challenging to solve. Existing methods usually solve its convex relaxation or smooth its nonsmooth objective using certain smoothing techniques. Therefore, they do not solve the original formulation of SSC. In this paper, we propose a manifold proximal linear method (ManPL) that solves the original SSC formulation without altering the model. We also extend the algorithm to solve multiple-kernel SSC problems, for which an alternating ManPL algorithm is proposed. Convergence and iteration complexity results of the proposed methods are established. We demonstrate the advantage of our proposed methods over existing methods via clustering of several data sets, including University of California Irvine and single-cell RNA sequencing data sets.

Xiaoyue Li ◽  
John M. Mulvey

The contributions of this paper are threefold. First, by combining dynamic programs and neural networks, we provide an efficient numerical method to solve a large multiperiod portfolio allocation problem under a regime-switching market with transaction costs. Second, the performance of our combined method is shown to be close to optimal in a stylized case. To our knowledge, this is the first paper to carry out such a comparison. Last, the superiority of the combined method opens up the possibility for more research on financial applications of generic methods, such as neural networks, provided that solutions to simplified subproblems are available via traditional methods. The research on combining fast starts with neural networks began about four years ago. We observed that Professor Weinan E’s approach for solving systems of differential equations by neural networks had much improved performance when starting close to an optimal solution and could stall if the current iterate was far from an optimal solution. This behavior is common with Newton-based algorithms. As a consequence, we discovered that combining a system of differential equations with a feedforward neural network could much improve overall computational performance. In this paper, we follow a similar direction for dynamic portfolio optimization within a regime-switching market with transaction costs. We investigate how to improve efficiency by combining dynamic programming with a recurrent neural network. Traditional methods face the curse of dimensionality. In contrast, the running time of our combined approach grows approximately linearly with the number of risky assets. It is inspiring to explore the possibilities of combined methods in financial management, and we believe that a careful linkage of existing dynamic optimization algorithms and machine learning will be an active domain going forward. Relationship of the authors: Professor John M. Mulvey is Xiaoyue Li’s doctoral advisor.

Zichong Li ◽  
Yangyang Xu

First-order methods (FOMs) have been widely used for solving large-scale problems. A majority of existing works focus on problems without constraints or with simple constraints. Several recent works have studied FOMs for problems with complicated functional constraints. In this paper, we design a novel augmented Lagrangian (AL)–based FOM for solving problems with nonconvex objective and convex constraint functions. The new method follows the framework of the proximal point (PP) method. To approximately solve the PP subproblems, it mixes the inexact AL method (iALM) and the quadratic penalty method, where the latter is always fed with multipliers estimated by the iALM. The proposed method achieves the best-known complexity result to produce a near Karush–Kuhn–Tucker (KKT) point. Theoretically, the hybrid method has a lower iteration-complexity requirement than its counterpart that only uses iALM to solve PP subproblems; numerically, it can perform significantly better than a pure-penalty-based method. Numerical experiments are conducted on nonconvex linearly constrained quadratic programs. The numerical results demonstrate the efficiency of the proposed method relative to existing ones.
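
To make the augmented Lagrangian idea concrete, here is a minimal sketch of a plain inexact AL loop on a toy problem: minimize (x - 2)^2 subject to x - 1 <= 0, whose solution is x = 1 with multiplier 2. The inner subproblem is solved approximately by gradient descent, a first-order method. This is not the hybrid iALM and penalty scheme of the paper, and the toy objective is convex rather than nonconvex; the problem, the parameter values, and the function name are all assumptions chosen for illustration.

```python
def augmented_lagrangian(beta=10.0, outer_iters=20, inner_iters=200):
    """Inexact augmented Lagrangian method on a one-dimensional toy problem.

    Problem (assumed for illustration): min (x - 2)^2  s.t.  x - 1 <= 0.
    For an inequality constraint c(x) <= 0, the AL function in x has
    gradient f'(x) + max(0, lam + beta * c(x)) * c'(x).
    """
    x, lam = 0.0, 0.0
    # The AL gradient is Lipschitz with constant at most 2 + beta.
    step = 1.0 / (2.0 + beta)
    for _ in range(outer_iters):
        # Inner loop: gradient descent on the AL function, warm-started at x.
        for _ in range(inner_iters):
            grad = 2.0 * (x - 2.0) + max(0.0, lam + beta * (x - 1.0))
            x -= step * grad
        # Outer loop: multiplier update, projected onto lam >= 0.
        lam = max(0.0, lam + beta * (x - 1.0))
    return x, lam


x_star, lam_star = augmented_lagrangian()
```

At the returned point the stationarity condition 2 * (x - 2) + lam = 0 and the constraint x <= 1 hold approximately, which is the sense in which such methods target a near-KKT point.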

Naveed Haghani ◽  
Claudio Contardo ◽  
Julian Yarkony

We address the problem of accelerating column generation (CG) for set-covering formulations via dual optimal inequalities (DOIs). We study two novel classes of DOIs, which are referred to as Flexible DOIs (F-DOIs) and Smooth-DOIs (S-DOIs), respectively (and jointly as SF-DOIs). F-DOIs provide rebates for covering items more than necessary. S-DOIs describe the payment of a penalty to permit the undercoverage of items in exchange for the overinclusion of other items. Unlike other classes of DOIs from the literature, the S-DOIs and F-DOIs rely on very little problem-specific knowledge and, as such, have the potential to be applied to a vast number of problem domains. In particular, we discuss the application of the new DOIs to three relevant problems: the single-source capacitated facility location problem, the capacitated p-median problem, and the capacitated vehicle-routing problem. We provide computational evidence of the strength of the new inequalities by embedding them within a column-generation solver for these problems. Substantial speedups in reaching the linear-relaxation lower bound are observed, compared with a nonstabilized variant of the same CG procedure, on problems with dense columns and structured assignment costs.

Xi Chen ◽  
Qihang Lin ◽  
Guanglin Xu

Distributionally robust optimization (DRO) has been introduced for solving stochastic programs in which the distribution of the random variables is unknown and must be estimated from samples of that distribution. A key element of DRO is the construction of the ambiguity set, which is a set of distributions that contains the true distribution with high probability. Assuming that the true distribution has a probability density function, we propose a class of ambiguity sets based on confidence bands of the true density function. As examples, we consider the shape-restricted confidence bands and the confidence bands constructed with a kernel density estimation technique. The former allows us to incorporate prior knowledge of the shape of the underlying density function (e.g., unimodality and monotonicity), and the latter enables us to handle multidimensional cases. Furthermore, we establish the convergence of the optimal value of DRO to that of the underlying stochastic program as the sample size increases. The DRO with our ambiguity set involves functional decision variables and infinitely many constraints. To address this challenge, we apply duality theory to reformulate the DRO into a finite-dimensional stochastic program, which is amenable to a stochastic subgradient scheme as a solution method.
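
A discretized version of the inner worst-case problem can be sketched as follows. Assume, for illustration only, that the density is replaced by cell probabilities p[i] on a finite grid and that the confidence band gives bounds lower[i] <= p[i] <= upper[i]. The worst-case expected loss is then a small linear program that a greedy pass solves exactly: start from the lower band and assign the remaining mass to the cells with the largest losses. The discretization and all names here are assumptions; the paper works with functional variables and a duality-based reformulation instead.

```python
def worst_case_expectation(losses, lower, upper):
    """Maximize sum(p[i] * losses[i]) over a discretized band.

    Constraints: lower[i] <= p[i] <= upper[i] and sum(p) == 1.
    Returns the worst-case value and the maximizing probabilities.
    """
    if not (sum(lower) <= 1.0 <= sum(upper)):
        raise ValueError("band contains no probability vector")
    p = list(lower)
    remaining = 1.0 - sum(lower)
    # Visit cells from largest to smallest loss, filling each up to its
    # upper bound until all probability mass has been assigned.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    for i in order:
        add = min(upper[i] - lower[i], remaining)
        p[i] += add
        remaining -= add
    value = sum(pi * li for pi, li in zip(p, losses))
    return value, p


value, p = worst_case_expectation(
    losses=[1.0, 3.0, 2.0],
    lower=[0.2, 0.2, 0.2],
    upper=[0.5, 0.5, 0.5],
)
```

A narrower band leaves less mass to move toward high-loss cells, so the worst-case value approaches the nominal expectation, mirroring the convergence statement in the abstract.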

Jacob Mays

Summary of Contribution: This article was inspired by price formation changes recently proposed and implemented in several U.S. wholesale electricity markets. The analysis draws from and contributes to three lines of literature. First, the paper specifies two mechanisms that lead to inefficient and inconsistent prices in real-world markets. Second, the article illustrates the importance of considering uncertainty in evaluating policies for pricing in nonconvex markets and observes that convex hull pricing, sometimes described as an “ideal” due to its uplift-minimizing property in deterministic analyses, can perform poorly in settings with uncertainty. Lastly, the paper strengthens the theoretical basis for operating reserve demand curves by connecting their parameterization to outcomes expected in efficient stochastic markets.

