
Computers and Chemical Engineering 125 (2019) 434–448


Optimization under uncertainty in the era of big data and deep


learning: When machine learning meets mathematical programming
Chao Ning, Fengqi You∗
Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY 14853, USA

Article info

Article history: Received 17 January 2019; Revised 6 March 2019; Accepted 27 March 2019; Available online 28 March 2019.

Keywords: Data-driven optimization; Decision making under uncertainty; Big data; Machine learning; Deep learning

Abstract

This paper reviews recent advances in the field of optimization under uncertainty via a modern data lens, highlights key research challenges and the promise of data-driven optimization that organically integrates machine learning and mathematical programming for decision-making under uncertainty, and identifies potential research opportunities. A brief review of classical mathematical programming techniques for hedging against uncertainty is first presented, along with their wide spectrum of applications in Process Systems Engineering. A comprehensive review and classification of the relevant publications on data-driven distributionally robust optimization, data-driven chance constrained program, data-driven robust optimization, and data-driven scenario-based optimization is then presented. This paper also identifies fertile avenues for future research that focus on a closed-loop data-driven optimization framework, which allows feedback from mathematical programming to machine learning, as well as scenario-based optimization leveraging the power of deep learning techniques. Perspectives on online learning-based data-driven multistage optimization with a learning-while-optimizing scheme are presented.

© 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Optimization applications abound in many areas of science and engineering (Biegler and Grossmann, 2004; Grossmann and Biegler, 2004; Sakizlis et al., 2004). In real practice, some parameters involved in optimization problems are subject to uncertainty due to a variety of reasons, including estimation errors and unexpected disturbances (Sahinidis, 2004). Such uncertain parameters can be product demands in process planning (Liu and Sahinidis, 1996), kinetic constants in reaction-separation-recycling system design (Acevedo and Pistikopoulos, 1998), and task durations in batch process scheduling (Li and Ierapetritou, 2008), among others. The issue of uncertainty could unfortunately render the solution of a deterministic optimization problem (i.e., the one disregarding uncertainty) suboptimal or even infeasible (Ben-Tal and Nemirovski, 2002). The infeasibility, i.e., the violation of constraints in optimization problems, has a disastrous consequence on the solution quality. Motivated by this practical concern, optimization under uncertainty has attracted tremendous attention from both academia and industry (Sahinidis, 2004; Pistikopoulos, 1995).

In the era of big data and deep learning, intelligent use of data has great potential to benefit many areas. Although there is no rigorous definition of big data (John Walker, 2014), people typically characterize big data with five Vs, namely volume, velocity, variety, veracity and value (Yin and Kaynak, 2015). Torrents of data are routinely collected and archived in process industries, and these data are becoming an increasingly important asset in process control, operations and design (Qin, 2014; Yin et al., 2015; Venkatasubramanian, 2019). Nowadays, a wide array of emerging machine learning tools can be leveraged to analyze data and extract accurate, relevant, and useful information to facilitate knowledge discovery and decision-making. Deep learning, one of the most rapidly growing machine learning subfields, demonstrates remarkable power in deciphering multiple layers of representations from raw data without any domain expertise in designing feature extractors (Goodfellow et al., 2016). More recently, dramatic progress in mathematical programming (Grossmann, 2012), coupled with recent advances in machine learning (Jordan and Mitchell, 2015), especially in deep learning over the past decade (LeCun et al., 2015), has sparked a flurry of interest in data-driven optimization (Bertsimas et al., 2018; Bertsimas and Thiele, 2006; Calfa et al., 2015; Campbell and How, 2015; Jiang and Guan, 2015; Ning and You, 2017a; Levi et al., 2015). In the data-driven optimization paradigm, the uncertainty model is formulated based on data, thus allowing uncertainty data to "speak" for themselves in the optimization algorithm. In this way, rich

∗ Corresponding author.
E-mail address: fengqi.you@cornell.edu (F. You).

https://doi.org/10.1016/j.compchemeng.2019.03.034
0098-1354/© 2019 Elsevier Ltd. All rights reserved.

information underlying uncertainty data can be harnessed in an automatic manner for smart and data-driven decision making.

In this review paper, we summarize and classify the existing contributions of data-driven optimization under uncertainty, highlight the current research trends, point out the research challenges, and introduce promising methodologies that can be used to tackle these challenges. We briefly review conventional mathematical programming techniques for hedging against uncertainty, alongside their wide spectrum of applications in Process Systems Engineering (PSE). We then summarize the existing research papers on data-driven optimization under uncertainty and classify them into four categories according to their unique approaches for uncertainty modeling and distinct optimization structures. Based on the literature survey, we identify three promising future research directions on optimization under uncertainty in the era of big data and deep learning and highlight the respective research challenges and potential methodologies.

The rest of this paper is organized as follows. In Section 2, background on mathematical programming techniques for decision making under uncertainty is given. Section 3 presents a comprehensive literature review, where relevant research papers are summarized and classified into four categories. Section 4 discusses promising future research directions to further advance the area of data-driven optimization. Conclusions are provided in Section 5.

2. Background on optimization under uncertainty

In recent years, mathematical programming techniques for decision making under uncertainty have gained tremendous popularity in the PSE community, as witnessed by various successful applications in process synthesis and design (Pistikopoulos, 1995; Rooney and Biegler, 2003), production scheduling and planning (Li and Ierapetritou, 2008; Verderame et al., 2010), and process control (Mesbah, 2016; Krieger and Pistikopoulos, 2014; Chiu and Christofides, 2000). In this section, we present some background knowledge of methodologies for optimization under uncertainty, along with computational algorithms and applications in PSE. Specifically, we briefly review three leading modeling paradigms for optimization under uncertainty, namely stochastic programming, chance-constrained programming, and robust optimization.

2.1. Stochastic programming

Stochastic programming is a powerful modeling paradigm for decision making under uncertainty that aims to optimize the expected objective value across all the uncertainty realizations (Birge and Louveaux, 2011). The key idea of the stochastic programming approach is to model the randomness in uncertain parameters with probability distributions (Birge, 1997; Kall and Wallace, 1994). In general, the stochastic programming approach can effectively accommodate decision-making processes with various time stages. In single-stage stochastic programs, there are no recourse variables and all the decisions must be made before knowing uncertainty realizations. By contrast, stochastic programming with recourse can take corrective actions after uncertainty is revealed. Among the stochastic programming approaches with recourse, the most widely used one is the two-stage stochastic program, in which decisions are partitioned into "here-and-now" decisions and "wait-and-see" decisions.

The general mathematical formulation of a two-stage stochastic programming problem is given as follows (Birge and Louveaux, 2011).

min_{x∈X} c^T x + E_ω[Q(x, ω)]
s.t. Ax ≥ d    (1)

The recourse function Q(x, ω) is defined by,

Q(x, ω) = min_{y(ω)∈Y} b(ω)^T y(ω)
s.t. W(ω)y(ω) ≥ h(ω) − T(ω)x    (2)

where x represents first-stage decisions made "here-and-now" before the uncertainty ω is realized, while the second-stage decisions y are postponed in a "wait-and-see" manner after observing the uncertainty realization. The objective of the two-stage stochastic programming model includes two parts: the first-stage objective c^T x and the expectation of the second-stage objective b(ω)^T y(ω). The constraints associated with the first-stage decisions are Ax ≥ d, x ∈ X, and the constraints of the second-stage decisions are W(ω)y(ω) ≥ h(ω) − T(ω)x and y(ω) ∈ Y. Sets X and Y can include nonnegativity, continuity or integrality restrictions.

The resulting two-stage stochastic programming problem is computationally expensive to solve because of the growth of computational time with the number of scenarios. To this end, decomposition-based algorithms have been developed in the existing literature, including Benders decomposition or the L-shaped method (Vanslyke and Wets, 1969; Laporte and Louveaux, 1993), and Lagrangean decomposition (Oliveira et al., 2013). The location of binary decision variables is critical for the design of computational algorithms. For stochastic programs with integer recourse, the expected recourse function is no longer convex, and can even be discontinuous, thus hindering the employment of the conventional L-shaped method. As a result, research efforts have been made on computational algorithms for the efficient solution of two-stage stochastic mixed-integer programs (Küçükyavuz and Sen, 2017), such as Lagrangian relaxation (Caroe and Schultz, 1999), a branch-and-bound scheme (Ahmed et al., 2004), and an improved L-shaped method (Li and Grossmann, 2018; You and Grossmann, 2013).

Stochastic programming has demonstrated various applications in PSE, such as design and operations of batch processes (Ierapetritou and Pistikopoulos, 1995; Bonfill et al., 2004; Bonfill et al., 2005; Chu and You, 2013), process flowsheet optimization (Steimel and Engell, 2015), energy systems (Liu et al., 2010; Peng et al., 2018; Yue and You, 2016; Gao and You, 2017), and supply chain management (Zeballos et al., 2016; Gao and You, 2015; You et al., 2009; Gebreslassie et al., 2012; Tong et al., 2014; You et al., 2011; Gupta and Maranas, 2003). Due to its wide applicability, immense research efforts have been made on variants of the stochastic programming approach. For instance, the two-stage formulation in (1) can be readily extended to a multi-stage stochastic programming setup by utilizing scenario trees. Other extensions include stochastic nonlinear programming (Li et al., 2011), and stochastic programs with endogenous uncertainties (Gupta and Grossmann, 2014; Goel and Grossmann, 2007).

2.2. Chance constrained optimization

As another powerful paradigm for optimization under uncertainty, chance constrained programming aims to optimize an objective while ensuring constraints are satisfied with a specified probability in an uncertain environment (Prékopa, 1995; Uryasev, 2000). As in the stochastic programming approach, the probability distribution is the key uncertainty model to capture the randomness of uncertain parameters in chance constrained optimization. The chance constrained program was first introduced in the seminal work of Charnes and Cooper (1959), and has attracted considerable attention ever since. Such chance constraints, or probabilistic constraints, are flexible enough to quantify the trade-off between objective performance and system reliability (Li et al., 2008).
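The notion of probabilistic constraint satisfaction can be illustrated numerically. The sketch below (illustrative only; the decision, distribution, and bound are hypothetical, not taken from the paper) estimates the satisfaction probability of a single linear constraint by Monte Carlo sampling and checks it against the closed-form value available when the uncertainty is normally distributed:

```python
import math

import numpy as np

# Illustrative sketch (numbers are hypothetical): for a fixed decision x and
# a single linear constraint xi^T x <= b with xi ~ N(mu, Sigma), estimate the
# satisfaction probability P{xi^T x <= b} by Monte Carlo sampling.
rng = np.random.default_rng(0)

x = np.array([1.0, 2.0])        # fixed candidate decision
mu = np.array([0.5, 0.5])       # mean of the uncertain vector xi
Sigma = np.array([[0.1, 0.0],
                  [0.0, 0.2]])  # covariance of xi
b = 3.0                         # right-hand side

samples = rng.multivariate_normal(mu, Sigma, size=100_000)
p_hat = float(np.mean(samples @ x <= b))

# Closed form: xi^T x ~ N(mu^T x, x^T Sigma x), so the probability is the
# normal CDF value Phi((b - mu^T x) / sqrt(x^T Sigma x)).
z = (b - mu @ x) / math.sqrt(x @ Sigma @ x)
p_exact = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(p_hat, 3), round(p_exact, 3))  # both close to 0.94
```

A chance constraint would then accept this x whenever the satisfaction probability is at least 1 − ε; replacing the sampling step with such a closed form is the essence of the convex reformulations available for normal distributions.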

The generic formulation of a chance constrained optimization problem is presented as follows,

min_{x∈X} f(x)
s.t. P{ξ ∈ Ω | G(x, ξ) ≤ 0} ≥ 1 − ε    (3)

where x represents the vector of decision variables, X denotes the deterministic feasible region, f is the objective function to be minimized, ξ is a random vector following a known probability distribution P with the support set Ω, G = (g_1, ..., g_m) represents a constraint mapping, 0 is a vector of all zeros, and parameter ε is a pre-specified risk level.

The chance constraint P{ξ ∈ Ω | G(x, ξ) ≤ 0} ≥ 1 − ε guarantees that decision x satisfies the constraints with a probability of at least 1 − ε. Note that when the number of constraints m = 1, the above optimization model is an individual chance constrained program; for m > 1, it is called a joint chance constrained program (Miller and Wagner, 1965). A salient merit of chance constrained programs is that they allow decision makers to choose their own risk levels for the improvement in objectives. To model sequential decision-making processes, two-stage chance constrained optimization with recourse was recently studied and has had various applications (Liu et al., 2016; Quddus et al., 2018).

Despite its promising modeling power, the resulting chance constrained program is generally computationally intractable for two main reasons. First, calculating the probability of constraint satisfaction for a given x involves a multivariate integral, which is believed to be computationally prohibitive. Second, the feasible region is not convex even if the set X is convex and G(x, ξ) is convex in x for any realization of the uncertain vector ξ (Prékopa, 1995). In light of these computational challenges, a large body of related literature is devoted to the development of solution algorithms for chance constrained optimization problems, such as sample average approximation (Luedtke and Ahmed, 2008), sequential approximation (Hong et al., 2011), and convex conservative approximation schemes (Nemirovski and Shapiro, 2006). Note that chance constrained programs admit convex reformulations for some very special cases. For example, individual chance constrained programs are endowed with tractable convex reformulations for normal distributions (Birge and Louveaux, 2011). Chance constraints with right-hand-side uncertainty are convex if uncertain parameters are independent and follow log-concave distributions (Prékopa, 1995).

In the PSE community, chance constraints are usually employed for customer demand satisfaction, product quality specification, and service level of process systems (Maranas, 1997; Yue and You, 2013; Gupta et al., 2000; You and Grossmann, 2011; Chu et al., 2015; Zipkin, 2000). Due to its practical relevance, chance constrained optimization has been applied in numerous applications, including model predictive control (Shen et al., 2018; Cannon et al., 2009), process design and operations (Li et al., 2008; Chu et al., 2015; Gong et al., 2017), refinery blend planning (Yang et al., 2017), biopharmaceutical manufacturing (Liu et al., 2016), and supply chain optimization (Mitra et al., 2008; You and Grossmann, 2011; You and Grossmann, 2008; Ye and You, 2016).

2.3. Robust optimization

As a promising alternative paradigm, robust optimization does not require accurate knowledge of the probability distributions of uncertain parameters (Ben-Tal and Nemirovski, 2000; Bertsimas and Sim, 2004; Ben-Tal et al., 2009). Instead, it models uncertain parameters using an uncertainty set, which includes possible uncertainty realizations. It is worth noting that the uncertainty set is a paramount ingredient in the robust optimization framework (Ben-Tal et al., 2009). Given a specific uncertainty set, the idea of robust optimization is to hedge against the worst case within the uncertainty set. The worst-case uncertainty realization is defined based on different contexts: it could be the realization giving rise to the largest constraint violation, the realization leading to the lowest asset return (Gregory et al., 2011), or the one resulting in the highest regret (Assavapokee et al., 2008).

The conventional box uncertainty set is often not a good choice, since it includes the unlikely-to-happen scenario in which all uncertain parameters simultaneously increase to their highest values. The conventional box uncertainty set is defined as follows (Soyster, 1973).

U_box = {u | u_i^L ≤ u_i ≤ u_i^U, ∀i}    (4)

where U_box is a box uncertainty set, u is a vector of uncertain parameters, and u_i is the i-th component of the uncertainty vector u. u_i^L and u_i^U represent the lower bound and the upper bound of uncertain parameter u_i, respectively. The box uncertainty set simply defines the range of each uncertain parameter in vector u. One cannot easily control the size of this uncertainty set to match his or her risk-averse attitude. To this end, researchers proposed the following budgeted uncertainty set (Bertsimas and Sim, 2004).

U_budget = {u | u_i = ū_i + Δu_i·z_i, −1 ≤ z_i ≤ 1, ∑_i |z_i| ≤ Γ, ∀i}    (5)

where U_budget denotes a budgeted uncertainty set, u and u_i have the same definitions as in (4), ū_i is the nominal value of u_i, Δu_i is the largest possible deviation of uncertain parameter u_i, z_i denotes the extent and direction of parameter deviation, and Γ is an uncertainty budget.

Traditional robust optimization approaches, also known as static robust optimization (Bertsimas et al., 2011), make all the decisions at once. This modeling framework cannot well represent sequential decision-making problems (Lorca et al., 2016; Atamtürk and Zhang, 2007; Bertsimas et al., 2013). Adaptive robust optimization (ARO) was proposed to offer a new paradigm for optimization under uncertainty by incorporating recourse decisions (Ben-Tal et al., 2004). Due to the flexibility of adjusting recourse decisions after observing uncertainty realizations, ARO typically generates less conservative solutions than static robust optimization (Ben-Tal et al., 2009). The general form of a two-stage adaptive robust mixed-integer programming model is given as follows:

min_x c^T x + max_{u∈U} min_{y∈Ω(x,u)} b^T y
s.t. Ax ≥ d, x ∈ R_+^{n1} × Z^{n2}
Ω(x, u) = {y ∈ R_+^{n3} : Wy ≥ h − Tx − Mu}    (6)

where x is the first-stage decision made before uncertainty u is realized, while the second-stage decision y is postponed in a "wait-and-see" manner. x includes both continuous and integer variables, while y only includes continuous variables. c and b are the vectors of cost coefficients. U is an uncertainty set that characterizes the region of uncertainty realizations. ARO approaches can be applied to address uncertainty in a variety of applications, including process design (Gong and You, 2018; Gong and You, 2017; Gong et al., 2016), process scheduling (Shi and You, 2016), and supply chain optimization (Gong et al., 2016; Tong et al., 2014), among others. Besides the two-stage ARO framework, the multistage ARO method has attracted immense attention due to its unique feature of reflecting sequential realizations of uncertainties over time (Delage and Iancu, 2015). In multistage ARO, decisions are made sequentially, and uncertainties are revealed gradually over stages. Note that the additional value delivered by ARO over static robust optimization is its adjustability of recourse decisions based on uncertainty realizations (Ben-Tal et al., 2004). Accordingly, the multistage ARO method has demonstrated applications in process scheduling and planning (Lappas and Gounaris, 2016; Ning and You, 2017b).

Despite the popularity of the above three leading paradigms for optimization under uncertainty, these approaches have their own limitations and specific application scopes. To this end, research efforts have been made on "hybrid" methods that leverage the synergy of different optimization approaches to inherit their corresponding strengths and complement their respective weaknesses (McLean and Li, 2013; Liu et al., 2016; Baringo and Baringo, 2018; Zhao and Guan, 2013; Keyvanshokooh et al., 2016; Parpas et al., 2009). For instance, stochastic programming was integrated with robust optimization for supply chain design and operation under multi-scale uncertainties (Yue and You, 2016). Robust chance constrained optimization along with global solution algorithms was developed and applied to process design under price and demand uncertainties (Parpas et al., 2009).

3. Existing methods for data-driven optimization under uncertainty

In this section, we review the recent advances in optimization under uncertainty in the era of big data and deep learning. Recent years have witnessed a rapidly growing number of publications on data-driven optimization under uncertainty, an active area integrating machine learning and mathematical programming. These publications cover various topics and can be roughly classified into four categories, namely data-driven stochastic program, data-driven chance constrained program, data-driven robust optimization, and data-driven scenario-based optimization. Unlike the conventional mathematical programming techniques, these data-driven approaches do not presume that the uncertainty model is perfectly given a priori; rather, they all focus on the practical setting where only uncertainty data are available.

3.1. Data-driven stochastic program and distributionally robust optimization

The literature on data-driven stochastic programs, also known as distributionally robust optimization (DRO), is reviewed in detail in this subsection. The motivation of this emerging paradigm for data-driven optimization under uncertainty is first presented, followed by its model formulation. In this modeling paradigm, the uncertainty is modeled via a family of probability distributions that well capture the uncertainty data at hand. This set of probability distributions is referred to as the ambiguity set. We then present and analyze various types of ambiguity sets alongside their corresponding strengths and weaknesses. Finally, the extension of DRO to the multistage decision-making setting is discussed, as well as recent applications in PSE.

In the stochastic programming approach, it is assumed that the probability distribution of uncertain parameters is perfectly known. However, such precise information on the uncertainty distribution is rarely available in practice. Instead, what the decision maker has is a set of historical and/or real-time uncertainty data and possibly some prior structural knowledge of the probability distribution. Moreover, the distribution assumed in conventional stochastic programming might deviate from the true distribution. Therefore, relying on a single probability distribution could result in sub-optimal solutions, or even lead to deterioration in out-of-sample performance (Smith and Winkler, 2006). Motivated by these weaknesses of stochastic programming, DRO emerges as a new data-driven optimization paradigm which hedges against the worst-case distribution in an ambiguity set. Rather than assuming a single uncertainty distribution, the DRO approach constructs an uncertainty set of probability distributions from uncertainty data through statistical inference and big data analytics. In this way, DRO is capable of hedging against distribution errors, and accounts for the input of uncertainty data.

The general model formulation of data-driven stochastic programming is presented as follows (Delage and Ye, 2010).

min_{x∈X} max_{P∈D} E_P[l(x, ξ)]    (7)

where x is the vector of decision variables, X is the feasible set, l is the objective function, and ξ represents a random vector whose probability distribution P is only known to reside in an ambiguity set D. The DRO approach aims for optimal decisions under the worst-case distribution, and as a result offers a performance guarantee over the whole family of distributions.

The DRO or data-driven stochastic optimization framework enjoys two salient merits compared with the conventional stochastic programming approach. First, it allows the decision maker to incorporate partial distribution information learned from uncertainty data into the optimization. As a result, the data-driven stochastic programming approach greatly mitigates the issue of the optimizer's curse and improves the out-of-sample performance. Second, data-driven stochastic programming inherits computational tractability from robust optimization, and some resulting problems can be solved exactly in polynomial time without resorting to approximation schemes via sampling or discretization. For example, optimization problem (7) for a convex program with continuous variables and a moment-based ambiguity set is proved to be solvable in polynomial time (Delage and Ye, 2010).

The choice of ambiguity set plays a critical role in the performance of DRO. When choosing an ambiguity set, the decision maker needs to consider three factors, namely tractability, statistical meaning, and performance (Hanasusanto et al., 2015). First, the data-driven stochastic programming problem with the ambiguity set should be computationally tractable, meaning the resulting optimization could be formulated as linear, conic quadratic or semidefinite programs. Second, the derived ambiguity set should have a clear statistical meaning. Therefore, various ways of constructing ambiguity sets based on uncertainty data were extensively studied (Delage and Ye, 2010; Esfahani and Kuhn, 2018; Shang and You, 2018). Third, the devised ambiguity set should be tight to increase the performance of the resulting decisions.

One commonly used approach to constructing ambiguity sets is the moment-based approach, in which first- and second-order moment information is extracted from uncertainty data using statistical inference (Calafiore and El Ghaoui, 2006). The ambiguity set that specifies the support, first and second moment information is shown as follows,

D = {P ∈ M_+ : P[ξ ∈ Ω] = 1, E_P[ξ] = μ, E_P[(ξ − μ)(ξ − μ)^T] = Σ}    (8)

where ξ represents the uncertainty vector, Ω is the support, P represents the probability distribution of ξ, M_+ denotes the set of all probability measures, and E_P denotes the expectation with respect to distribution P. Parameters μ and Σ represent the mean vector and covariance matrix estimated from uncertainty data, respectively.

The ambiguity set in (8) fails to account for the fact that the mean and covariance matrix are themselves subject to uncertainty. To this end, an ambiguity set was proposed based on the distribution's support information as well as confidence regions for the mean and second-moment matrix in the work of Delage and Ye (2010). The resulting DRO problem could be solved efficiently in polynomial time.

D = {P ∈ M_+ : P[ξ ∈ Ω] = 1, (E_P[ξ] − μ)^T Σ^{−1} (E_P[ξ] − μ) ≤ ψ_1, E_P[(ξ − μ)(ξ − μ)^T] ⪯ ψ_2·Σ}    (9)

where ξ represents the uncertainty vector, Ω is the support, and P represents the probability distribution of ξ. The equality constraint P[ξ ∈ Ω] = 1 enforces that all uncertainty realizations reside in the support set Ω. Parameters ψ_1 and ψ_2 are used to define the sizes of the confidence regions for the first and second moment information, respectively.

The moment-based ambiguity sets typically enjoy the advantage of computational tractability. For example, DRO with an ambiguity set based on principal component analysis and first-order deviation functions was developed (Shang and You, 2018). Additionally, the computational effectiveness of this data-driven DRO method was demonstrated via process network planning and batch production scheduling. Recently, a data-driven DRO model was developed for the optimal design and operations of shale gas supply chains to hedge against uncertainties associated with shale well estimated ultimate recovery and product demand (Gao et al., 2019). However, the moment-based ambiguity set is not guaranteed to converge to the true probability distribution as the number of uncertainty data points goes to infinity. Consequently, this type of ambiguity set suffers from conservatism given a moderate amount of uncertainty data. To address this issue with moment-based methods, ambiguity sets based on the statistical distance between probability distributions were developed, as shown below,

D = {P ∈ M_+ | d(P, P_0) ≤ θ}    (10)

where P is the probability distribution of uncertain parameters, P_0 represents a reference distribution such as the empirical distribution, d denotes some statistical distance between two distributions, and θ stands for the confidence level.

The ambiguity set in (10) can be further classified based on the adopted distance metric, such as the Kullback–Leibler divergence (Hu and Hong, 2013) and the Wasserstein distance (Esfahani and Kuhn, 2018). For example, a DRO model was proposed for the lot-sizing problem, in which the chi-square goodness-of-fit test and robust optimization were combined. The ambiguity set of demand was constructed from uncertainty data by using a hypothesis test in statistics, called the chi-square goodness-of-fit test (Klabjan et al., 2013). This set is well defined by linear constraints and second-order cone constraints. It is worth noting that the input of their model is histograms, which makes it possible to use a finite-dimensional probability vector to characterize the distribution. The adopted statistic belongs to the phi-divergences, which motivated researchers to construct distributional uncertainty sets by using the phi-divergences (Bayraksan and Love, 2015).

To account for the sequential decision-making process, researchers recently developed the adaptive DRO method by incorporating recourse decision variables (Hanasusanto and Kuhn, 2018; Bertsimas et al., 2019). A general two-stage data-driven stochastic programming model is presented in the following form:

min_{x∈X} c^T x + max_{P∈D} E_P[Q(x, ξ)]
s.t. Ax ≤ d
Q(x, ξ) = min_{y∈Y} b(ξ)^T y
          s.t. T(ξ)x + W(ξ)y ≤ h(ξ)    (11)

where x represents the vector of first-stage decision variables that need to be determined before observing uncertainty realizations.

Data-driven stochastic programming has several salient merits over the conventional stochastic programming approach. However, there are few papers on its PSE applications in the existing literature (Shang and You, 2018; Gao et al., 2019). As the trend of big data has fueled the increasing popularity of data-driven stochastic programming in many areas, DRO emerges as a new data-driven optimization paradigm which hedges against the worst-case distribution in an ambiguity set, and has various applications in power systems, such as unit commitment problems (Xiong et al., 2017; Chen et al., 2018; Duan et al., 2018; Zhao and Guan, 2016), and optimal power flow (Wang et al., 2018; Guo et al., 2018).

3.2. Data-driven chance constrained program

In contrast to the data-driven stochastic programming approach reviewed in Section 3.1, data-driven chance constrained programming is another paradigm focusing on chance constraint satisfaction under the worst-case probability distribution instead of optimizing the worst-case expected objective. Although both data-driven chance constrained programs and DRO adopt ambiguity sets in their uncertainty models, they have distinct model structures. Specifically, the data-driven chance constrained program features constraints subject to uncertainty in probability distributions, while DRO typically only involves the worst-case expectation of an objective function with respect to a family of probability distributions. As introduced in Section 2, the chance constrained programming approach assumes the complete distribution information is perfectly known. However, the decision maker often only has access to a finite number of uncertainty realizations, i.e., uncertainty data. On the one hand, such complete knowledge of the distribution is usually estimated from a limited number of uncertainty data or obtained from expert knowledge. On the other hand, even if the probability distribution is available, the chance constrained program is computationally cumbersome. In practice, one can only have partial information on the probability distribution of uncertainty. Therefore, data-driven chance constrained optimization emerges as another paradigm for hedging against uncertainty in the era of big data.

The general form of the data-driven chance constrained program is given by,

min_{x∈X} f(x)
s.t. min_{P∈D} P{ξ ∈ Ω | G(x, ξ) ≤ 0} ≥ 1 − ε    (12)

where x represents the vector of decision variables, X denotes the deterministic feasible region, f is the objective function, ξ is a random vector following a probability distribution P that belongs to an ambiguity set D, G = (g_1, ..., g_m) represents a constraint mapping, 0 is a vector of all zeros, and parameter ε is a pre-specified risk level. The data-driven chance constraints enforce classical chance constraints to be satisfied for every probability distribution within the ambiguity set.

The computational tractability of the resulting data-driven chance constrained program can vary depending on both the ambiguity set and the structure of the optimization problem. In the following, we summarize the relevant papers according to the adopted uncertainty set of distributions and optimization structures.

Distributionally robust individual linear chance constraints un-
der the ambiguity set comprised of all distributions sharing the
y denotes the vector of second-stage decision variables that can be
same known mean and covariance were reformulated as convex
adjustable based on the realized uncertain parameters ξ , sets X and
second-order cone constraints (Calafiore and El Ghaoui, 2006).
Y can include nonnegativity, continuity or integrality restrictions,
The deterministic convex conditions to enforce distributionally ro-
and Q represents the recourse function. The objective of the above
bust chance constraints were provided under distribution families
data-driven stochastic program is to minimize the worst-case ex-
of (a) independent random variables with box-type support and
pected cost with respect to all possible uncertainty distributions
(b) radially symmetric non-increasing distributions over the or-
P within the ambiguity set D. Based on the literature, multistage
thotope support. The worst-case conditional value-at-risk (CVaR)
data-driven DRO is becoming a rapidly evolving research direction.
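To make the distance-based ambiguity set in (10) and the worst-case expectation appearing in (11) concrete, the following minimal sketch replaces the general metric d with an L1 (total-variation-style) distance on a finite support, so that the inner maximization becomes a small linear program. The cost vector, reference distribution, and the `worst_case_expectation` helper are our own illustrative choices, not constructs from the cited works.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_expectation(q, p0, theta):
    """Worst-case expectation max_p q.p over the ambiguity ball
    D = {p >= 0, sum(p) = 1, ||p - p0||_1 <= theta}, a finite-support
    instance of the distance-based set (10).

    Decision vector z = [p, s], where s_i >= |p_i - p0_i|.
    """
    n = len(q)
    c = np.concatenate([-np.asarray(q, float), np.zeros(n)])  # maximize q.p
    # p_i - s_i <= p0_i and -p_i - s_i <= -p0_i linearize |p_i - p0_i| <= s_i
    A_ub = np.block([[np.eye(n), -np.eye(n)],
                     [-np.eye(n), -np.eye(n)],
                     [np.zeros((1, n)), np.ones((1, n))]])    # sum(s) <= theta
    b_ub = np.concatenate([np.asarray(p0, float), -np.asarray(p0, float), [theta]])
    A_eq = np.concatenate([np.ones(n), np.zeros(n)]).reshape(1, -1)  # sum(p) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (2 * n), method="highs")
    return -res.fun

# Example: three scenarios; the adversary moves theta/2 probability mass
# from the cheapest scenario to the costliest one.
q = [1.0, 2.0, 4.0]           # second-stage costs Q(x, xi_i) for a fixed x
p0 = [0.5, 0.3, 0.2]          # empirical (reference) distribution
print(worst_case_expectation(q, p0, 0.2))  # 2.2 versus the nominal 1.9
```

For θ = 0 the ambiguity ball collapses to the reference distribution and the nominal expectation is recovered, which is a quick sanity check on the reformulation.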
The worst-case conditional value-at-risk (CVaR) approximation for distributionally robust joint chance constraints was studied assuming first and second moment information (Zymler et al., 2013), and the resulting conservative approximation can be cast as a semidefinite program. In addition to moment information, a specific structural property of distributions called unimodality was incorporated into the ambiguity set, and the corresponding ambiguous risk constraints were reformulated as a set of second-order cone constraints (Li et al., 2017). Instead of assuming unimodality of distributions, data-driven robust individual chance constrained programs along with convex approximations were recently developed using a mixture distribution-based ambiguity set with fixed component distributions and uncertain mixture weights (Chen et al., 2018).

In real-world applications, exact moment information can be challenging to obtain, and can only be estimated through confidence intervals from uncertainty realizations (Delage and Ye, 2010). To accommodate this moment uncertainty, attempts were made in the context of distributionally robust chance constraints, including constructing a convex moment ambiguity set (El Ghaoui et al., 2003), employing a Chebyshev ambiguity set with bounds on the second-order moment (Cheng et al., 2014), and characterizing a family of distributions with upper bounds on both mean and covariance (Zhang et al., 2018). Ambiguous joint chance constraints were studied where the ambiguity set was characterized by the mean, convex support, and an upper bound on the dispersion (Hanasusanto et al., 2017), and the resulting constraints were conic representable for right-hand-side uncertainty. In addition to generalized moment bounds (Wiesemann et al., 2014), structural properties of distributions, such as symmetry, unimodality, multimodality and independence, were further integrated into distributionally robust chance constrained programs leveraging a Choquet representation (Hanasusanto et al., 2015). Nonlinear extensions of distributionally robust chance constraints were made under ambiguity sets defined by mean and variance (Yang and Xu, 2016), convex moment constraints (Xie and Ahmed, 2018), mean absolute deviation (Postek et al., 2018), and a mixture of distributions (Lasserre and Weisser, 2018).

Although moment-based ambiguity sets achieve certain success, they do not converge to the true probability distribution as the number of available uncertainty data increases. Consequently, the resulting data-driven chance-constrained programs tend to generate conservative solutions. To this end, data-driven chance-constrained programs with distance-based ambiguity sets were proposed to alleviate the undesirable consequence of moment-based data-driven chance-constrained programs. The ambiguity set defined by the Prohorov metric was introduced into the distributionally robust chance constraints, and the resulting optimization problem was approximated by using a robust sampled problem (Erdogan and Iyengar, 2006). Distributionally robust chance constraints with the ambiguity set containing all distributions close to a reference distribution in terms of Kullback–Leibler divergence were cast as classical chance constraints with an adjusted risk level (Hu and Hong, 2013). Data-driven chance constrained programs with φ-divergence based ambiguity sets were proposed (Jiang and Guan, 2016), and further extensions were made using the kernel smoothing method (Calfa et al., 2015). Recently, data-driven chance constraints over Wasserstein balls were exactly reformulated as mixed-integer conic constraints (Chen et al., 2018; Ji and Lejeune, 2018). Leveraging the strong duality result (Gao and Kleywegt, 2016), distributionally robust chance constrained programs with Wasserstein ambiguity sets were studied for linear constraints with both right- and left-hand uncertainty (Xie, 2018), as well as for general nonlinear constraints (Hota et al., 2018).

Data-driven chance constrained programs have successful applications in a number of areas, such as power systems (Xie and Ahmed, 2018), stochastic control (Van Parys et al., 2016), and the vehicle routing problem (Ghosal and Wiesemann, 2018).

3.3. Data-driven robust optimization

As a paramount ingredient in robust optimization, uncertainty sets endogenously determine robust optimal solutions and therefore should be devised with special care. However, uncertainty sets in the conventional robust optimization methodology are typically set a priori using a fixed shape and/or model without providing sufficient flexibility to capture the structure and complexity of uncertainty data. For example, the geometric shapes of the uncertainty sets in (4) and (5) do not change with the intrinsic structure and complexity of uncertainty data. Furthermore, these uncertainty sets are specified by a finite number of parameters, thereby having limited modeling flexibility. Motivated by this knowledge gap, data-driven robust optimization emerges as a powerful paradigm for addressing uncertainty in decision making.

A data-driven ARO framework that leverages the power of the Dirichlet process mixture model was proposed (Ning and You, 2017a). The data-driven approach for defining the uncertainty set was developed based on Bayesian machine learning. This machine learning model was then integrated with the ARO method through a four-level optimization framework. This developed framework effectively accounted for the correlation, asymmetry and multimode of uncertainty data, so it generated less conservative solutions. Its salient feature is that multiple basic uncertainty sets are used to provide a high-fidelity description of uncertainties. Although the data-driven ARO has a number of attractive features, it does not account for an important evaluation metric, known as regret, in decision-making (Bell, 1982). Motivated by this knowledge gap, a data-driven bi-criterion ARO framework was developed that effectively accounted for the conventional robustness as well as minimax regret (Ning and You, 2018a).

In some applications, uncertainty data in large datasets are usually collected under multiple conditions. A data-driven stochastic robust optimization framework was proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data (Ning and You, 2018b).

Fig. 1. The data-driven uncertainty model based on the Dirichlet process mixture model (Ning and You, 2018b).
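The mean–covariance reformulation surveyed above can be stated compactly: for a single linear constraint, the distributionally robust chance constraint reduces to a deterministic second-order cone condition. The following minimal sketch evaluates that condition; the function name and the numerical data are our own hypothetical choices, and we assume only the standard Cantelli-type bound underlying the result of Calafiore and El Ghaoui (2006).

```python
import numpy as np

def dr_chance_constraint_margin(a, b, mu, Sigma, eps):
    """Margin of the distributionally robust individual chance constraint
    inf_{P with mean mu, covariance Sigma} P(a'xi <= b) >= 1 - eps,
    which is equivalent to the second-order cone condition
    a'mu + sqrt((1 - eps)/eps) * sqrt(a' Sigma a) <= b.
    A nonnegative margin means the constraint holds."""
    a, mu = np.asarray(a, float), np.asarray(mu, float)
    kappa = np.sqrt((1.0 - eps) / eps)
    return b - (a @ mu + kappa * np.sqrt(a @ np.asarray(Sigma, float) @ a))

# Two uncertain coefficients with zero mean and identity covariance, eps = 0.1,
# so the safety term is sqrt(0.9/0.1) * sqrt(2) = 3*sqrt(2) = 4.243.
a, mu, Sigma = [1.0, 1.0], [0.0, 0.0], np.eye(2)
print(dr_chance_constraint_margin(a, 5.0, mu, Sigma, 0.1) >= 0)  # True
print(dr_chance_constraint_margin(a, 4.0, mu, Sigma, 0.1) >= 0)  # False
```

Note how the risk level enters only through the multiplier sqrt((1 − ε)/ε): tightening ε from 0.1 to 0.01 roughly triples the required safety margin, which is one source of the conservatism discussed above for moment-based ambiguity sets.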
Machine learning methods including the Dirichlet process mixture model and maximum likelihood estimation were employed for uncertainty modeling, which is illustrated in Fig. 1. This framework was further developed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem followed the two-stage stochastic programming approach, while ARO was nested as the inner problem for maintaining computational tractability.

To mitigate the computational burden, research effort has been made on convex polyhedral data-driven uncertainty sets based on machine learning techniques, such as principal component analysis and support vector clustering. A data-driven robust optimization framework that leveraged the power of principal component analysis and kernel smoothing for decision-making under uncertainty was studied (Ning and You, 2018c). In this approach, correlations between uncertain parameters were effectively captured, and latent uncertainty sources were identified by principal component analysis. To account for asymmetric distributions, forward and backward deviation vectors were utilized in the uncertainty set, which was further integrated with robust optimization models. A data-driven static robust optimization framework based on support vector clustering, which aims to find the hypersphere with minimal volume to enclose uncertainty data, was proposed (Shang et al., 2017). The adopted piecewise linear kernel incorporates the covariance information, thus effectively capturing the correlation among uncertainties. These two data-driven robust optimization approaches utilized polyhedral uncertainty sets learned from data, and thus enjoy computational efficiency. Various types of data-driven uncertainty sets were developed for static robust optimization based on statistical hypothesis tests (Bertsimas et al., 2018), copulas (Zhang et al., 2018), and probability density contours (Zhang et al., 2018).

To address multistage decision making under uncertainty, a data-driven approach for optimization under uncertainty based on multistage ARO and nonparametric kernel density M-estimation was developed (Ning and You, 2017b). The salient feature of the framework was its incorporation of distributional information to address the issue of over-conservatism. Robust kernel density estimation was employed to extract probability distributions from data. This data-driven multistage ARO framework exploited robust statistics to be immunized against data outliers. An exact robust counterpart was developed for solving the resulting data-driven ARO problem.

In recent years, data-driven robust optimization has been applied to a variety of areas, such as power systems (Ning and You, 2019), energy systems (Zhao et al., 2019), planning and scheduling (Ning and You, 2017b; Zhang et al., 2018), process control (Ning and You, 2018c; Shang and You, 2019), and transportation systems (Miao et al., 2019; Zhao and You, 2019).

3.4. Scenario optimization approach for chance constrained programs

A salient feature of scenario-based optimization is that it does not require explicit knowledge of the probability distribution as in the stochastic programming approach. Additionally, scenario-based optimization uses uncertainty scenarios to seek an optimal solution having a high probabilistic guarantee of constraint satisfaction, instead of utilizing scenarios or samples to approximate the expectation term as in stochastic programming. Although scenario-based optimization can be regarded as a special type of robust optimization that has a discrete uncertainty set consisting of uncertainty data, it can provide a probabilistic guarantee for those unobserved uncertainty data in the testing data set. Note that the scenario-based optimization approach provides a viable and data-driven route to achieving approximate solutions of chance-constrained programs. The scenario-based optimization approach is a general data-driven optimization under uncertainty framework in which uncertainty data or random samples are utilized in a more direct manner compared with other data-driven optimization methods. This data-driven optimization framework was first introduced in (Calafiore and Campi, 2005), and has gained great popularity within the systems and control community (Campi et al., 2009). As in data-driven chance constrained programs, knowledge of the true underlying uncertainty distribution is not required in scenario optimization, but only a finite number of uncertainty realizations. Specifically, the scenario approach enforces the constraint satisfaction with N independent identically distributed uncertainty data u(1), …, u(N). The resulting scenario optimization problem is given by,

min_{x∈X}  c^T x
s.t.  f(x, u(i)) ≤ 0,  i = 1, …, N    (13)

where x is the vector of decision variables, X represents a deterministic convex and closed set unaffected by uncertainty, c is the vector of cost coefficients, and f denotes the constraint function affected by uncertainty u. Note that function f is typically assumed to be convex in x, and can have arbitrarily nonlinear dependence on u, as opposed to the data-driven nonlinear chance constrained program assuming the constraint function must be quasi-convex in u (Yang and Xu, 2016). Additionally, scenario-based optimization can be considered as a special case of data-driven robust optimization when the uncertainty set is constructed as a union of u(1), …, u(N).

In the scenario optimization literature, ω = {u(1), …, u(N)} is referred to as the multi-sample or scenario that is drawn from the product probability space. Due to the random nature of the multi-sample, the optimal solution of the scenario optimization problem (13), denoted as x∗(ω), is also random. One key merit of the scenario approach is that the scenario optimization problem admits the same problem type as its deterministic counterpart, so that it can be solved efficiently by convex optimization algorithms when f(x, u) is convex in x (Boyd and Vandenberghe, 2004). Moreover, the optimal solution x∗(ω) is guaranteed to satisfy the constraints with other unseen uncertainty realizations with a high probability (Campi and Garatti, 2008).

For the sake of clarity, we revisit the following definition and theorem (Campi and Garatti, 2008).

Definition (Violation probability). The violation probability of a given decision x is defined as follows:

V(x) := P{ u ∈ Ω | f(x, u) > 0 }    (14)

where V(x) denotes the probability of violation for a given x, and Ω represents the support of uncertainty u. We say a decision x is ε-feasible if V(x) ≤ ε.

Theorem. Assume that x∗(ω) is the unique optimal solution of the scenario optimization problem. It holds that

P^N{ ω | V(x∗(ω)) ≤ ε } ≥ 1 − Σ_{i=0}^{n−1} C(N, i) ε^i (1 − ε)^{N−i}    (15)

where n is the number of decision variables, N denotes the number of uncertainty data, C(N, i) is the binomial coefficient, and P^N is a product probability governing the sample generation.

The above theorem implies that the optimal solution x∗(ω) satisfies the corresponding chance constraint with a certain confidence level. The proof of this theorem depends on the fundamental fact that the number of support constraints, the removal of which changes the optimal solution, is upper bounded by the number of decision variables (Calafiore and Campi, 2005). Note that (15) holds with equality for the fully-supported convex optimization problem (Campi and Garatti, 2008), meaning that the probability bound is tight. Additionally, the result holds true irrespective of probability distribution information or even its support set.
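The theorem is easy to exercise numerically. The toy instance below (our own construction, not from the cited works) solves the simplest possible scenario program, min x subject to x ≥ u(i), whose solution is the sample maximum, and evaluates the a priori confidence bound (15); with n = 1 the bound collapses to 1 − (1 − ε)^N.

```python
import math
import random

def scenario_bound(N, n, eps):
    """Right-hand side of (15): the confidence that the scenario solution
    is eps-feasible, given N samples and n decision variables."""
    return 1.0 - sum(math.comb(N, i) * eps**i * (1.0 - eps)**(N - i)
                     for i in range(n))

# One-dimensional scenario program: min x s.t. x >= u(i), i = 1..N,
# so x*(omega) = max_i u(i); here n = 1 decision variable.
random.seed(0)
N, eps = 100, 0.05
samples = [random.random() for _ in range(N)]
x_star = max(samples)

# With n = 1 the bound reduces to 1 - (1 - eps)^N.
conf = scenario_bound(N, 1, eps)
print(round(conf, 4), round(1 - (1 - eps)**N, 4))  # identical values
```

For this instance the bound says that, before drawing the 100 samples, the sample maximum is 0.05-feasible with confidence above 0.99; drawing more samples drives the confidence to one, mirroring the discussion of sample-size requirements that follows.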
By exploiting the structured dependence on uncertainty, the sample size required by the scenario optimization problem was reduced through a tighter bound on Helly's dimension (Zhang et al., 2015). Rather than focusing on the constraint violation probability, considerable research efforts have been made on the degree of violation (Kanamori and Takeda, 2012), the expected probability of constraint violation (Calafiore, 2009), and the performance bounds for objective values (Esfahani et al., 2015). To make a trade-off between feasibility and performance, the case was studied where some of the sampled constraints were allowed to be violated for improving the performance of the objective (Calafiore, 2010). Subsequent work along this direction includes a sampling-and-discarding method (Campi and Garatti, 2011). A wait-and-judge scenario optimization framework was proposed in which the level of robustness was assessed a posteriori after the optimal solution was obtained (Campi and Garatti, 2018). Recently, the extension of scenario-based optimization to the multistage decision making setting was made (Kariotoglou et al., 2016; Vayanos et al., 2012).

While scenario optimization problems with continuous decision variables are extensively studied (Campi et al., 2009), mixed-integer scenario optimization is less developed. An attempt to extend the scenario theory to random convex programs with mixed-integer decision variables was made (Calafiore et al., 2012), and the Helly dimension in the mixed-integer scenario program was proved to depend geometrically on the number of integer variables. This result suggests that the required sample size can be prohibitively large for scenario programs with many discrete variables. Along this research direction, two sampling algorithms within the framework of S-optimization were recently developed for solving mixed-integer convex scenario programs (De Loera et al., 2018).

In some real-world applications, the required sample size can be very large, resulting in a great computational burden for scenario optimization problems with a huge number of sampled constraints. One way to circumvent this difficulty is to devise sequential solution algorithms. Along this direction, sequential randomized algorithms were developed for convex scenario optimization problems (Chamanbaz et al., 2016), and fell into the framework of Sequential Probabilistic Validation (SPV) (Alamo et al., 2015). The motivation behind these sequential algorithms is that validating a given solution with a large number of samples is less computationally expensive than solving the corresponding scenario optimization problem. Recently, a repetitive scenario design approach was proposed by iterating between reduced-size scenario optimization problems and the probabilistic feasibility check (Calafiore, 2017). The trade-off between the sample size and the expected number of repetitions was also revealed in the repetitive scenario design (Calafiore, 2017). Note that the classical scenario-based approach is an extreme situation on the trade-off curve, where one seeks to find the solution in one step.

Another effective way to reduce the computation cost of large-scale scenario optimization is to employ distributed algorithms (Margellos et al., 2018; Carlone et al., 2014). Particularly, the sampled constraints were distributed among multiple processors of a network, and the large-scale scenario optimization problems can be efficiently solved via constraint consensus schemes (Carlone et al., 2014). Along this direction, a distributed computing framework was developed for the scenario convex program with multiple processors connected by a graph (You et al., 2018). The major advantage of this approach is that the computational cost for each processor becomes lower and the original scenario optimization problem can be solved collaboratively. Another contribution to reducing the computational cost is based on a non-iterative two-step procedure, i.e., an optimization step and a detuning step (Care et al., 2014). As a consequence, the total sample complexity was greatly decreased.

Traditionally, the field of scenario optimization has focused on convex optimization problems, in which the number of support constraints is upper bounded by the number of decision variables. However, such upper bounds are no longer available in nonconvex scenario optimization problems, giving rise to research challenges of extending the scenario theory to the nonconvex setting. To date, few works have considered nonconvex uncertain programs using the scenario approach. One contribution is that of (Campi et al., 2018), in which assessing the generalization of the optimal solution in a wait-and-judge manner through the concept of support subsample was proposed. The proposed approach can be applied to general nonconvex setups, including mixed-integer scenario optimization problems. Another attempt to address nonconvex scenario optimization made use of statistical learning theory for bounding the violation probability, and devised a randomized solution algorithm (Alamo et al., 2009). The statistical learning theory-based method provided the probabilistic guarantee for all feasible solutions, as opposed to the convex scenario approach where such a guarantee is valid only for the optimal solution. This unique feature regarding probabilistic guarantees for all feasible solutions granted by the statistical learning based method is of practical relevance (Calafiore et al., 2011), since it is computationally challenging to solve nonconvex optimization problems to global optimality. A class of non-convex scenario optimization problems, which have non-convex objective functions and convex constraints, was recently studied (Grammatico et al., 2016). Since the Helly dimension for the optimal solution of such a non-convex scenario program can be unbounded, the direct application of scenario approaches based on Helly's theorem is impossible. To overcome this research challenge, the feasible region was restricted to the convex hull of a few optimizers, thus enabling the application of sample complexity results (Campi and Garatti, 2008).

4. Future research directions and opportunities

Several promising research directions in data-driven optimization under uncertainty are highlighted in this section. We specifically focus on some ideas on closed-loop data-driven optimization, integration of deep learning and scenario-based optimization, and learning-while-optimizing frameworks.

4.1. A "closed-loop" data-driven optimization framework with feedback from mathematical programming to machine learning

The framework of data-driven optimization under uncertainty could be considered as a "hybrid" system that integrates the data-driven system based on machine learning to extract useful and relevant information from data, and the model-based system based on mathematical programming to derive the optimal decisions from the information. Existing data-driven optimization approaches adopt a sequential and open-loop scheme, which could be further improved by introducing feedback steps from the model-based system to the data-driven system. A "closed-loop" data-driven optimization paradigm that explores the information feedback to fully couple upper-stream machine learning and down-stream mathematical programming could be a more effective and rigorous approach.

4.1.1. The issues of conventional "open-loop" data-driven optimization methods

It is widely recognized that data-driven optimization is a promising way of hedging against uncertainty in the era of big data and deep learning. Such promise hinges heavily on the organic integration and effective interaction between machine learning and mathematical programming.
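The iteration between a reduced-size design problem and a probabilistic feasibility check can be sketched in a few lines. The interfaces below (`sample`, `solve`, `violated`) are our own illustrative abstractions of the repetitive scenario design idea, not the algorithm of Calafiore (2017) in its exact form:

```python
import random

def repetitive_scenario_design(sample, solve, violated,
                               n_design, n_valid, eps, seed=0):
    """Sketch of a repetitive scenario design loop: solve a reduced-size
    scenario problem, then accept the candidate only if it passes an
    independent empirical feasibility check on a larger validation sample."""
    rng = random.Random(seed)
    while True:
        candidate = solve([sample(rng) for _ in range(n_design)])
        violations = sum(violated(candidate, sample(rng))
                         for _ in range(n_valid))
        if violations / n_valid <= eps:   # empirical violation small enough
            return candidate

# Toy instance: min x s.t. x >= u for u ~ U[0, 1]; "solving" the scenario
# program is just max(), and a validation sample violates the candidate
# when it exceeds it.
x = repetitive_scenario_design(
    sample=lambda rng: rng.random(),
    solve=max,
    violated=lambda cand, u: u > cand,
    n_design=50, n_valid=2000, eps=0.05)
print(0.0 < x < 1.0)  # True
```

The design choice mirrors the trade-off noted above: validating a fixed candidate on 2000 samples is far cheaper than solving a scenario program with 2000 constraints, at the price of occasionally repeating the small design problem.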
In existing data-driven optimization frameworks, the tasks performed by the data-driven system and the model-based system are treated separately in a sequential fashion. More specifically, data serve as input to a data-driven system. After that, useful, accurate and relevant uncertainty information is extracted through the data-driven system and further passed along to the model-based system based on mathematical programming for rigorous and systematic optimization under uncertainty, using paradigms such as robust optimization and stochastic programming. However, due to the sequential connection between these two systems, the machine learning model is trained without interacting with the "downstream" mathematical programming. Accordingly, from a control theoretical perspective, such "hybrid" systems in the existing data-driven optimization literature are essentially open loop. In contrast to open-loop systems, closed-loop systems using the feedback control strategy deliver amazingly superior system performance (e.g. stability, robustness to disturbances, and safety) in virtually every area of science and engineering, such as biological systems, social networks, and mechanical systems (Åström and Kumar, 2014). Therefore, there should be a "feedback" channel for information flow returning from the model-based system to the data-driven system, in addition to the information flow that is fed into the mathematical programming problem from the machine learning results. The design of such feedback loops from mathematical programming to machine learning deserves further attention in future research. Although there are closed-loop machine learning methods in the case of reinforcement learning (Shin and Lee, 2019), to the best of our knowledge, there are few works on developing a closed-loop strategy for data-driven mathematical programming under uncertainty. Different from mathematical programming, reinforcement learning is a kind of machine learning that aims to find an action policy to increase an agent's performance in terms of reward by interacting with an environment.

4.1.2. A "closed-loop" data-driven optimization framework

Due to their critical role in the training of machine learning models, loss functions could provide a foundation for considering feedback steps in future research efforts. Instead of using a mathematical-programming-agnostic loss function (e.g. logistic or squared-error loss), a loss function that incorporates the objective function of mathematical programming could be used to train the machine learning model. Specifically, a weighted sum of the conventional loss function and the objective function in the mathematical programming problem should be useful in handling issues experienced with current "open-loop" data-driven frameworks. An iterative scheme between machine learning and mathematical programming offers an alternative promising path to close the loop of the data-driven system and the model-based system. Fig. 2 presents the potential schematic of the closed-loop data-driven mathematical programming system. From the figure, we can see that the feedback from the model-based system serves as input to the data-driven system. In this way, the "hybrid" system becomes a closed-loop one in which information can be transmitted in both directions. Such a feedback strategy should be beneficial to the "hybrid" system and could provide an effective way to organically integrate machine learning and mathematical programming.

Fig. 2. The schematic of the "closed-loop" data-driven mathematical programming framework.

Research challenges emerge from the feedback step in the "hybrid" system. In typical PSE applications, the problem size of mathematical programs tends to be large. Such large-scale mathematical programming problems in conjunction with big data could pose a computational challenge for the training of machine learning. Additionally, how to design an effective feedback strategy to "close the loop" poses another key challenge to be addressed.

4.1.3. Incorporating "prior" knowledge in the data-driven optimization framework

In addition to uncertainty data, some available domain-specific knowledge or "prior" knowledge could serve as another informative input to the data-driven system. Relying solely on the data to develop the uncertainty model could unfavorably influence the downstream mathematical programming. The prior knowledge depicts what the decision maker knows about the uncertainty, and it can come in different forms. For example, the prior knowledge could be the structural information of probability distributions, upper and lower bounds of uncertain parameters, or certain correlation relationships among uncertainties. Incorporating such "prior" knowledge in the data-driven optimization framework could be substantially useful and provide more reliable results in the face of messy data.

4.2. Leveraging deep learning techniques for hedging against uncertainty in data-driven optimization

Recently, deep learning has shown great promise due to its amazing power in hierarchical representation of data (Goodfellow et al., 2016). The deep learning techniques are now shaping and revolutionizing many areas of science and engineering (LeCun et al., 2015). In recent years, deep learning has found a wide array of applications in the PSE domain, such as process monitoring (Zhu et al., 2019; Zhang and Zhao, 2017), refinery scheduling (Gao et al., 2014), and soft sensors (Shang et al., 2014). For extensive surveys on deep learning in the PSE area, we refer the reader to the review papers on this subject (Venkatasubramanian, 2019; Lee et al., 2018). In real applications, uncertainty data exhibit very complex and highly nonlinear characteristics. Therefore, it should be promising to explore the potential opportunities of leveraging deep neural networks with various architectures to uncover useful patterns of uncertainty data for mathematical programming.
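One way to realize the weighted-sum idea is to score a candidate model parameter by α times a squared-error loss plus (1 − α) times the downstream decision cost it induces. The newsvendor-style cost, the parameter values and the grid search below are all our own illustrative assumptions rather than a construct from the literature:

```python
import numpy as np

def closed_loop_loss(theta, data, alpha, over=1.0, under=3.0):
    """Weighted sum of a conventional squared-error loss and a downstream
    newsvendor-style cost, as one possible realization of the feedback idea:
    theta is both the demand estimate and the induced ordering decision."""
    mse = np.mean((data - theta) ** 2)
    cost = np.mean(over * np.maximum(theta - data, 0.0)
                   + under * np.maximum(data - theta, 0.0))
    return alpha * mse + (1.0 - alpha) * cost

rng = np.random.default_rng(1)
demand = rng.normal(10.0, 2.0, size=500)   # synthetic uncertainty data
grid = np.linspace(5.0, 15.0, 401)

# alpha = 1 recovers the open-loop estimate (essentially the sample mean);
# alpha < 1 shifts the estimate upward because under-ordering costs more.
open_loop = grid[np.argmin([closed_loop_loss(t, demand, 1.0) for t in grid])]
closed_loop = grid[np.argmin([closed_loop_loss(t, demand, 0.2) for t in grid])]
print(open_loop < closed_loop)  # True
```

With α = 1 the training criterion ignores the downstream problem entirely; decreasing α biases the learned parameter toward decisions that are cheaper for the mathematical program, which is precisely the feedback effect from the model-based system to the data-driven system discussed above.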
In this section, a variety of deep learning techniques are first summarized along with their unique features from a practical point of view, and future research directions on how to leverage the power of deep learning in optimization under uncertainty are suggested. Research opportunities for integrating data-driven scenario-based optimization with deep generative models are then presented.

4.2.1. Various types of deep learning techniques and their potentials

In this subsection, we present three types of deep learning techniques, namely deep belief networks, convolutional neural networks, and recurrent neural networks, and explore their potential applications in data-driven optimization under uncertainty.

• Deep belief networks

Among deep learning techniques, deep belief networks (DBNs) are becoming increasingly popular, primarily because of their unique ability to capture a hierarchy of latent features (Mohamed et al., 2012). DBNs belong to the class of probabilistic graphical models and are structured by stacking a series of restricted Boltzmann machines (RBMs). This network structure is motivated by the fact that a single RBM with only one hidden layer falls short of capturing the intrinsic complexities of high-dimensional data. As the building blocks of DBNs, RBMs consist of two layers of neurons, namely a hidden layer and a visible layer. Note that the hidden layer can be regarded as an abstract representation of the visible layer. There are undirected connections between these two layers, while there are no intra-layer connections. The training process of DBNs typically involves pre-training and fine-tuning procedures in a layer-wise scheme. Armed with multiple layers of hidden variables, DBNs can extract a hierarchy of latent features automatically, which is desirable in many practical applications. As a result, DBNs have been applied in a wide spectrum of areas, including fault diagnosis (Zhang and Zhao, 2017), soft sensor (Shang et al., 2014), and drug discovery (Gawehn et al., 2016). DBNs can also decipher complicated nonlinear correlations among uncertain parameters. Recently, the deep Gaussian process model was proposed as a special type of DBN based on Gaussian process mappings. Owing to its advantages in nonlinear regression, the deep Gaussian process model could be used to characterize the relationships between uncertain parameters, such as product price and demand.

• Convolutional neural networks

Convolutional neural networks (CNNs) are a specialized type of deep neural network (Krizhevsky et al., 2017) that has become increasingly popular in areas such as image classification, speech recognition, and robotics. Inspired by visual neuroscience, CNNs are designed to exploit three main ideas, namely sparse connectivity, weight sharing, and equivariant representations (Goodfellow et al., 2016). This kind of neural network is well suited for processing data in the form of multiple arrays, particularly two-dimensional image data. The architecture of a CNN typically consists of convolution layers, nonlinear layers, and pooling layers. In convolution layers, feature maps are extracted by performing convolutions between local patches of data and filters. The filters share the same weights as they move across the dataset, leading to a reduced number of network parameters. The results are then passed through a nonlinear activation function, such as the rectified linear unit (ReLU). After that, pooling layers, such as max pooling and average pooling, are applied to aggregate semantically similar features. These different types of layers are connected alternately to extract hierarchical features at various levels of abstraction. For classification, a fully connected layer is stacked after the high-level features are extracted. Although CNNs are mainly used for image classification, they have also been used to learn spatial features of traffic flow data at nearby locations, which exhibit strong spatial correlations (Wu et al., 2018). Given their power in spatial data modeling, CNNs hold the potential to model uncertainty data with large spatial correlations, such as demand data in adjacent market locations. In addition, CNNs can be trained on labeled multi-class uncertainty data to perform classification. The output of a CNN could therefore act as the probability weights used in the data-driven stochastic robust optimization framework.

• Recurrent neural networks

Besides the aforementioned models for spatial data, recurrent neural networks (RNNs) are widely recognized as the state-of-the-art deep learning technique for processing time series data, especially those from language and speech (Graves et al., 2013). RNNs can be viewed as feedforward neural networks once they are unfolded in time. The architecture of an RNN possesses a unique structure of directed cycles among hidden units. Moreover, the inputs of a hidden unit come from both the hidden unit at the previous time step and the input unit at the current time step. Accordingly, the hidden units in an RNN constitute state vectors that store the historical information of past input data. With this special architecture, RNNs are well suited for feature learning on sequential data and have demonstrated successful applications in various areas, including natural speech recognition (Graves et al., 2013) and load forecasting (Vermaak and Botha, 1998). However, one drawback of RNNs is their weakness in storing long-term memory due to the vanishing and exploding gradient problems. To address this issue, research efforts have been devoted to variants of RNNs, such as long short-term memory (LSTM) and the gated recurrent unit (GRU) (Hochreiter and Schmidhuber, 1997). By explicitly incorporating input, output, and forget gates, LSTM enhances the capability of memorizing long-term dependencies in sequential data. In sequential mathematical programming under uncertainty, massive time series of uncertain parameters are collected, and uncertainty data realized at different time stages often exhibit temporal dynamics. To this end, deep learning techniques, such as deep RNNs and LSTM, could be leveraged to decipher the temporal dynamics and trajectories of uncertainty over time stages. Exploring the integration between deep learning and multistage optimization under uncertainty is therefore another promising research direction.

4.2.2. Deep generative models for scenario-based optimization

Despite its various successful applications, scenario-based optimization has its own limitations. In general, scenario-based optimization enjoys computational efficiency through constraint sampling and provides a feasibility guarantee regardless of the type of probability distribution. These advantages rely heavily on the key assumption that a sufficient amount of uncertainty data is available. In most practical cases, however, this assumption does not hold; on the contrary, the amount of uncertainty data sampled from the underlying true distribution is quite limited. Moreover, acquiring uncertainty data can be extremely expensive or time consuming in some cases, which greatly hinders the applicability of the scenario-based approach (Gupta and Rusmevichientong, 2017). Existing studies of scenario-based optimization neglect this practical situation (Campi et al., 2009; Zhang et al., 2015; Calafiore, 2017). The practical challenge of handling an insufficient amount of data requires further research attention, and data-driven scenario-based optimization frameworks addressing this issue need to be developed.

This knowledge gap could be potentially filled by leveraging the power of deep generative models for data-driven scenario-based optimization, whose schematic is shown in Fig. 3. Instead of assuming that unlimited uncertainty scenarios can be sampled from the true distribution, this framework relies on deep generative models.

Fig. 3. The schematic of the scenario-based optimization framework based on deep learning.
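For reference, the constraint-sampling idea underlying scenario-based optimization can be sketched in a few lines. The one-dimensional toy program, the uncertainty distribution, and the sample size below are hypothetical choices in the spirit of the scenario approach (Campi and Garatti, 2008), not the method of any specific cited work.

```python
import random

random.seed(1)

# Toy scenario program: minimize x subject to x >= delta, where delta is an
# uncertain parameter with unknown distribution. Enforcing the constraint
# only on N sampled scenarios yields a tractable problem whose solution
# violates the true constraint with small probability.
def draw_delta():
    # Stand-in for the unknown uncertainty distribution (hypothetical).
    return random.gauss(0.0, 1.0)

N = 500
scenarios = [draw_delta() for _ in range(N)]

# The sampled program min{x : x >= delta_i, i = 1..N} solves in closed form.
x_star = max(scenarios)

# Out-of-sample estimate of the violation probability P(delta > x_star).
test_draws = [draw_delta() for _ in range(20000)]
violation = sum(d > x_star for d in test_draws) / len(test_draws)
# With one decision variable and N scenarios, the expected violation
# probability is on the order of 1/N.
```

When the available data are far fewer than the sample sizes such guarantees require, synthetic scenarios drawn from a trained generative model could be used to augment the raw samples, which is the motivation developed in this subsection.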

Deep generative models could be leveraged to learn the intrinsic useful patterns from the available uncertainty data and to generate synthetic data. These synthetic uncertainty data, generated by deep learning techniques, mimic the real uncertainty data and should be useful in the scenario-based optimization model. In other words, deep generative models can be utilized to generate synthetic uncertainty data with the aim of making better decisions despite insufficient uncertainty data. To be more precise, in deep generative models, the true data distribution is learned either explicitly or implicitly, and the learned distribution is then used to generate new data points referred to as synthetic data.

One of the most commonly used deep generative models is the variational autoencoder (VAE) (Goodfellow et al., 2016). VAEs generate new data samples through an architecture combining an encoder network and a decoder network in an unsupervised fashion. The encoder reduces the dimension of the input data and extracts the latent features, while the decoder aims to reconstruct the data given the latent variables. In this way, the VAE model learns a complicated target distribution by maximizing a lower bound of the data log-likelihood. The advantage of this technique is that its quality is easily evaluated via the log-likelihood or importance sampling. However, researchers have found that VAEs tend to generate blurry images, indicating a noticeable difference between the true distribution and the learned one (Goodfellow et al., 2016). Recently, an emerging deep generative model named the generative adversarial network (GAN) was proposed and has become increasingly popular in various areas, such as image processing (Ledig et al., 2017), renewable scenario generation (Chen et al., 2018), and molecular design (Sanchez-Lengeling and Aspuru-Guzik, 2018). Different from VAEs, GANs implicitly learn the data distribution through a zero-sum game between two competing neural networks, namely a generator network and a discriminator network (Goodfellow et al., 2014). Given a noise input, the generator network competes against the discriminator network by generating plausible synthetic data, while the discriminator network attempts to distinguish the real uncertainty data from the synthetic data. As these two networks compete against each other, the distribution produced by the generator network becomes the true distribution once the Nash equilibrium is achieved.

The required sample size for random convex programs scales linearly with the number of decision variables (Alamo et al., 2015), implying that the "small data" regime is frequently encountered in large-scale optimization problems. Consequently, data-driven scenario-based optimization tends to suffer severely from insufficient uncertainty data, and leveraging the power of deep generative models could be a promising way to address this challenge. The required sample size to guarantee constraint satisfaction can become large for optimization problems with a huge number of decision variables and a small risk level (Alamo et al., 2015). Therefore, the available amount of uncertainty data might not be enough for the purpose of a probabilistic guarantee, yet it can still be sufficient for training generative models.

Although deep learning could be a silver bullet in many areas, many research challenges persist in organically integrating state-of-the-art deep learning techniques with optimization under uncertainty. The discussion in Section 4.2 is intended to serve as a starting point to promote the employment of deep learning in the field of data-driven optimization under uncertainty.

4.3. Online learning-based data-driven optimization: a learning-while-optimizing paradigm for addressing uncertainty

In conventional data-driven optimization frameworks, a batch of uncertainty data serves as input to the data-driven system, in which learning typically takes place only once; this is termed batch machine learning. Most, if not all, of the papers on data-driven optimization under uncertainty are restricted to such learning of data (Ning and You, 2017a; Esfahani and Kuhn, 2018; Jiang and Guan, 2016; Shang et al., 2017), so they fail to account for real-time data. For example, in data-driven robust optimization methods, uncertainty sets are learned from a batch of uncertainty data. Once these data-driven uncertainty sets are obtained, they remain fixed for the model-based system built on mathematical programming and are not updated or refined. Additionally, the probability distributions of uncertainties and their support sets could be time variant and evolve gradually, rendering the data-driven system "outdated". Such an obsolete data-driven system inevitably deteriorates the solution quality of the resulting mathematical programming problem. In many practical settings, uncertainty data are collected sequentially in an online fashion (Shalev-Shwartz, 2012). Although previous works have explored the online learning of uncertainty sets (Shang and You, 2019), they typically re-train the data-driven system from scratch using the existing data plus the new additions, making these approaches suitable only for systems with slow dynamics. Few studies to date investigate real-time data analytics for systems with fast dynamics, such as those encountered in chemical processes, establishing research opportunities for the PSE community.

An online-learning-based data-driven optimization paradigm, in which learning takes place iteratively to account for real-time data, could be a promising research direction. More specifically, a learning-while-optimizing scheme could be explored by taking advantage of deep reinforcement learning. On the one hand, the uncertainty model should be time varying to accommodate real-time uncertainty data. On the other hand, decisions are made sequentially under uncertainty.
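A minimal sketch of the learning-while-optimizing idea is a recursive update of a data-driven uncertainty set as observations stream in, instead of re-training from scratch. The interval-shaped set, the forgetting factor `alpha`, and the drifting distribution below are all hypothetical choices for illustration.

```python
import random

random.seed(2)

# Recursive update of an interval uncertainty set [mu - 3*sd, mu + 3*sd]
# from streaming data via exponentially weighted moments, so the set tracks
# a drifting distribution without re-training from scratch.
alpha = 0.05  # forgetting factor (hypothetical tuning choice)
mu, var = 0.0, 1.0

def update(mu, var, obs):
    mu_new = (1.0 - alpha) * mu + alpha * obs
    var_new = (1.0 - alpha) * var + alpha * (obs - mu_new) ** 2
    return mu_new, var_new

# Regime 1: uncertainty centered at 0.
for _ in range(500):
    mu, var = update(mu, var, random.gauss(0.0, 1.0))
lo_before, hi_before = mu - 3.0 * var ** 0.5, mu + 3.0 * var ** 0.5

# Regime 2: the underlying distribution drifts to a mean of 5.
for _ in range(500):
    mu, var = update(mu, var, random.gauss(5.0, 1.0))
lo_after, hi_after = mu - 3.0 * var ** 0.5, mu + 3.0 * var ** 0.5
# The data-driven set shifts to cover the new regime.
```

Each new observation updates the set in constant time, so a downstream robust optimization model could be re-solved with the refreshed set at whatever frequency the application's dynamics demand.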

After decisions are made, uncertainties are realized and then collected in the database. There are research challenges associated with such online-learning-based frameworks. Updating the data-driven system in an online fashion is paramount to implementing the learning-while-optimizing scheme and poses a key research challenge. Additionally, developing efficient algorithms to solve the resulting online-learning-based mathematical programming problems creates a computational challenge. Some theoretical research challenges exist as well. One is to investigate the convergence of solutions when the probability distribution shifts to a new one. Another is to provide theoretical bounds on the computational complexity and required memory of online-learning-based data-driven optimization.

5. Conclusions

Although conventional stochastic programming, robust optimization, and chance constrained optimization are the most recognized modeling paradigms for hedging against uncertainty, it is foreseeable that in the near future data-driven mathematical programming frameworks will experience rapid growth fueled by big data and deep learning. We reviewed recent progress in data-driven mathematical programming under uncertainty in terms of systematic uncertainty modeling, the organic integration of machine learning and mathematical programming, and efficient computational algorithms for solving the resulting mathematical programming problems. The advantages and disadvantages of different data-driven uncertainty models were also analyzed in detail. Future research could be directed toward devising feedback steps to close the loop between the data-driven system and the model-based system, leveraging the power of deep generative models for data-driven scenario-based optimization, and developing data-driven mathematical programming frameworks with online learning for real-time data.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.compchemeng.2019.03.034.

References

Åström, K.J., Kumar, P.R., 2014. Control: a perspective. Automatica 50, 3–43.
Acevedo, J., Pistikopoulos, E.N., 1998. Stochastic optimization based algorithms for process synthesis under uncertainty. Comput. Chem. Eng. 22, 647–671.
Ahmed, S., Tawarmalani, M., Sahinidis, N.V., 2004. A finite branch-and-bound algorithm for two-stage stochastic integer programs. Math. Program. 100, 355–377.
Alamo, T., Tempo, R., Camacho, E.F., 2009. Randomized strategies for probabilistic solutions of uncertain feasibility and optimization problems. IEEE Trans. Autom. Control 54, 2545–2559.
Alamo, T., Tempo, R., Luque, A., Ramirez, D.R., 2015. Randomized methods for design of uncertain systems: sample complexity and sequential algorithms. Automatica 52, 160–172.
Assavapokee, T., Realff, M.J., Ammons, J.C., 2008. Min-max regret robust optimization approach on interval data uncertainty. J. Optim. Theory Appl. 137, 297–316.
Atamtürk, A., Zhang, M., 2007. Two-stage robust network flow and design under demand uncertainty. Oper. Res. 55, 662–673.
Baringo, L., Baringo, A., 2018. A stochastic adaptive robust optimization approach for the generation and transmission expansion planning. IEEE Trans. Power Syst. 33, 792–802.
Bayraksan, G., Love, D.K., 2015. Data-driven stochastic programming using phi-divergences. In: The Operations Research Revolution. INFORMS, pp. 1–19.
Bell, D.E., 1982. Regret in decision making under uncertainty. Oper. Res. 30, 961–981.
Ben-Tal, A., Nemirovski, A., 2000. Robust solutions of linear programming problems contaminated with uncertain data. Math. Program. 88, 411.
Ben-Tal, A., Nemirovski, A., 2002. Robust optimization—methodology and applications. Math. Program. 92, 453–480.
Ben-Tal, A., Goryashko, A., Guslitzer, E., Nemirovski, A., 2004. Adjustable robust solutions of uncertain linear programs. Math. Program. 99, 351–376.
Ben-Tal, A., El Ghaoui, L., Nemirovski, A., 2009. Robust Optimization. Princeton University Press, Princeton.
Bertsimas, D., Sim, M., 2004. The price of robustness. Oper. Res. 52, 35–53.
Bertsimas, D., Thiele, A., 2006. Robust and data-driven optimization: modern decision-making under uncertainty. INFORMS Tutor. Oper. Res. 95–122.
Bertsimas, D., Brown, D.B., Caramanis, C., 2011. Theory and applications of robust optimization. SIAM Rev. 53, 464–501.
Bertsimas, D., Litvinov, E., Sun, X.A., Zhao, J., Zheng, T., 2013. Adaptive robust optimization for the security constrained unit commitment problem. IEEE Trans. Power Syst. 28, 52–63.
Bertsimas, D., Gupta, V., Kallus, N., 2018. Data-driven robust optimization. Math. Program. 167, 235–292.
Bertsimas, D., Sim, M., Zhang, M., 2019. Adaptive distributionally robust optimization. Manag. Sci.
Biegler, L.T., Grossmann, I.E., 2004. Retrospective on optimization. Comput. Chem. Eng. 28, 1169–1192.
Birge, J.R., Louveaux, F., 2011. Introduction to Stochastic Programming. Springer Science & Business Media.
Birge, J.R., 1997. State-of-the-art survey—stochastic programming: computation and applications. INFORMS J. Comput. 9, 111–133.
Bonfill, A., Bagajewicz, M., Espuña, A., Puigjaner, L., 2004. Risk management in the scheduling of batch plants under uncertain market demand. Ind. Eng. Chem. Res. 43, 741–750.
Bonfill, A., Espuña, A., Puigjaner, L., 2005. Addressing robustness in scheduling batch processes with uncertain operation times. Ind. Eng. Chem. Res. 44, 1524–1534.
Boyd, S., Vandenberghe, L., 2004. Convex Optimization. Cambridge University Press.
Calafiore, G., Campi, M.C., 2005. Uncertain convex programs: randomized solutions and confidence levels. Math. Program. 102, 25–46.
Calafiore, G.C., El Ghaoui, L., 2006. On distributionally robust chance-constrained linear programs. J. Optim. Theory Appl. 130, 1–22.
Calafiore, G., Dabbene, F., Tempo, R., 2011. Research on probabilistic methods for control system design. Automatica 47, 1279–1293.
Calafiore, G., Lyons, D., Fagiano, L., 2012. On mixed-integer random convex programs. In: 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), pp. 3508–3513.
Calafiore, G., 2009. On the expected probability of constraint violation in sampled convex programs. J. Optim. Theory Appl. 143, 405–412.
Calafiore, G.C., 2010. Random convex programs. SIAM J. Optim. 20, 3427–3464.
Calafiore, G., 2017. Repetitive scenario design. IEEE Trans. Autom. Control 62, 1125–1137.
Calfa, B.A., Agarwal, A., Bury, S.J., Wassick, J.M., Grossmann, I.E., 2015. Data-driven simulation and optimization approaches to incorporate production variability in sales and operations planning. Ind. Eng. Chem. Res. 54, 7261–7272.
Calfa, B.A., Grossmann, I.E., Agarwal, A., Bury, S.J., Wassick, J.M., 2015. Data-driven individual and joint chance-constrained optimization via kernel smoothing. Comput. Chem. Eng. 78, 51–69.
Campbell, T., How, J.P., 2015. Bayesian nonparametric set construction for robust optimization. In: American Control Conference (ACC), 2015, pp. 4216–4221.
Campi, M.C., Garatti, S., 2008. The exact feasibility of randomized solutions of uncertain convex programs. SIAM J. Optim. 19, 1211–1230.
Campi, M.C., Garatti, S., 2011. A sampling-and-discarding approach to chance-constrained optimization: feasibility and optimality. J. Optim. Theory Appl. 148, 257–280.
Campi, M.C., Garatti, S., 2018. Wait-and-judge scenario optimization. Math. Program. 167, 155–189.
Campi, M.C., Garatti, S., Prandini, M., 2009. The scenario approach for systems and control design. Ann. Rev. Control 33, 149–157.
Campi, M.C., Garatti, S., Ramponi, F.A., 2018. A general scenario theory for nonconvex optimization and decision making. IEEE Trans. Autom. Control 63, 4067–4078.
Cannon, M., Kouvaritakis, B., Wu, X.J., 2009. Probabilistic constrained MPC for multiplicative and additive stochastic uncertainty. IEEE Trans. Autom. Control 54, 1626–1632.
Care, A., Garatti, S., Campi, M.C., 2014. FAST-Fast algorithm for the scenario technique. Oper. Res. 62, 662–671.
Carlone, L., Srivastava, V., Bullo, F., Calafiore, G.C., 2014. Distributed random convex programming via constraints consensus. SIAM J. Control Optim. 52, 629–662.
Caroe, C.C., Schultz, R., 1999. Dual decomposition in stochastic integer programming. Oper. Res. Lett. 24, 37–45.
Chamanbaz, M., Dabbene, F., Tempo, R., Venkataramanan, V., Wang, Q.G., 2016. Sequential randomized algorithms for convex optimization in the presence of uncertainty. IEEE Trans. Autom. Control 61, 2565–2571.
Charnes, A., Cooper, W.W., 1959. Chance-constrained programming. Manag. Sci. 6, 73–79.
Chen, Y.W., Guo, Q.L., Sun, H.B., Li, Z.S., Wu, W.C., Li, Z.H., 2018. A distributionally robust optimization model for unit commitment based on Kullback-Leibler divergence. IEEE Trans. Power Syst. 33, 5147–5160.
Chen, Z., Peng, S., Liu, J., 2018. Data-driven robust chance constrained problems: a mixture model approach. J. Optim. Theory Appl. 179, 1065–1085.
Chen, Y.Z., Wang, Y.S., Kirschen, D., Zhang, B.S., 2018. Model-free renewable scenario generation using generative adversarial networks. IEEE Trans. Power Syst. 33, 3265–3275.
Chen, Z., Kuhn, D., Wiesemann, W., 2018. Data-driven chance constrained programs over Wasserstein balls. arXiv:1809.00210.
Cheng, J., Delage, E., Lisser, A., 2014. Distributionally robust stochastic knapsack problem. SIAM J. Optim. 24, 1485–1506.
Chiu, T.Y., Christofides, P.D., 2000. Robust control of particulate processes using uncertain population balances. AIChE J. 46, 266–280.
Chu, Y., You, F., 2013. Integration of scheduling and dynamic optimization of batch processes under uncertainty: two-stage stochastic programming approach and enhanced generalized Benders decomposition algorithm. Ind. Eng. Chem. Res. 52, 16851–16869.
Chu, Y., You, F., Wassick, J.M., Agarwal, A., 2015. Simulation-based optimization framework for multi-echelon inventory systems under uncertainty. Comput. Chem. Eng. 73, 1–16.
Chu, Y., You, F., Wassick, J.M., Agarwal, A., 2015. Integrated planning and scheduling under production uncertainties: bi-level model formulation and hybrid solution method. Comput. Chem. Eng. 72, 255–272.
De Loera, J.A., La Haye, R.N., Oliveros, D., Roldan-Pensado, E., 2018. Chance-constrained convex mixed-integer optimization and beyond: two sampling algorithms within S-optimization. J. Convex Anal. 25, 201–218.
Delage, E., Iancu, D.A., 2015. Robust multistage decision making. In: Aleman, D.M., Thiele, A.C. (Eds.), INFORMS Tutorials in Operations Research. INFORMS, Catonsville, MD, pp. 20–46.
Delage, E., Ye, Y.Y., 2010. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 58, 595–612.
Duan, C., Jiang, L., Fang, W.L., Liu, J., 2018. Data-driven affinely adjustable distributionally robust unit commitment. IEEE Trans. Power Syst. 33, 1385–1398.
El Ghaoui, L., Oks, M., Oustry, F., 2003. Worst-case value-at-risk and robust portfolio optimization: a conic programming approach. Oper. Res. 51, 543–556.
Erdogan, E., Iyengar, G., 2006. Ambiguous chance constrained problems and robust optimization. Math. Program. 107, 37–61.
Esfahani, P.M., Kuhn, D., 2018. Data-driven distributionally robust optimization using the Wasserstein metric: performance guarantees and tractable reformulations. Math. Program. 171, 115–166.
Esfahani, P.M., Sutter, T., Lygeros, J., 2015. Performance bounds for the scenario approach and an extension to a class of non-convex programs. IEEE Trans. Autom. Control 60, 46–58.
Gao, R., Kleywegt, A.J., 2016. Distributionally robust stochastic optimization with Wasserstein distance. arXiv:1604.02199.
Gao, J., You, F., 2015. Deciphering and handling uncertainty in shale gas supply chain design and optimization: novel modeling framework and computationally efficient solution algorithm. AIChE J. 61, 3739–3755.
Gao, J., You, F., 2017. Modeling framework and computational algorithm for hedging against uncertainty in sustainable supply chain design using functional-unit-based life cycle optimization. Comput. Chem. Eng. 107, 221–236.
Gao, X.Y., Shang, C., Jiang, Y.H., Huang, D.X., Chen, T., 2014. Refinery scheduling with varying crude: a deep belief network classification and multimodel approach. AIChE J. 60, 2525–2532.
Gao, J., Ning, C., You, F., 2019. Data-driven distributionally robust optimization of shale gas supply chains under uncertainty. AIChE J. 65, 947–963.
Gawehn, E., Hiss, J.A., Schneider, G., 2016. Deep learning in drug discovery. Mol. Inf. 35, 3–14.
Gebreslassie, B.H., Yao, Y., You, F., 2012. Design under uncertainty of hydrocarbon biorefinery supply chains: multiobjective stochastic programming models, decomposition algorithm, and a comparison between CVaR and downside risk. AIChE J. 58, 2155–2179.
Ghosal, S., Wiesemann, W., 2018. The distributionally robust chance constrained vehicle routing problem. Available at Optimization Online.
Goel, V., Grossmann, I.E., 2007. A class of stochastic programs with decision dependent uncertainty. Math. Program. 108, 355–394.
Gong, J., You, F., 2017. Optimal processing network design under uncertainty for producing fuels and value-added bioproducts from microalgae: two-stage adaptive robust mixed integer fractional programming model and computationally efficient solution algorithm. AIChE J. 63, 582–600.
Gong, J., You, F., 2018. Resilient design and operations of process systems: nonlinear adaptive robust optimization model and algorithm for resilience analysis and enhancement. Comput. Chem. Eng. 116, 231–252.
Gong, J., Garcia, D.J., You, F., 2016. Unraveling optimal biomass processing routes from bioconversion product and process networks under uncertainty: an adaptive robust optimization approach. ACS Sustain. Chem. Eng. 4, 3160–3173.
Gong, J., Yang, M., You, F., 2017. A systematic simulation-based process intensification method for shale gas processing and NGLs recovery process systems under uncertain feedstock compositions. Comput. Chem. Eng. 105, 259–275.
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al., 2014. Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (Eds.), Advances in Neural Information Processing Systems 27.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press, Cambridge.
Grammatico, S., Zhang, X.J., Margellos, K., Goulart, P., Lygeros, J., 2016. A scenario approach for non-convex control design. IEEE Trans. Autom. Control 61, 334–345.
Graves, A., Mohamed, A.R., Hinton, G., 2013. Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649.
Gregory, C., Darby-Dowman, K., Mitra, G., 2011. Robust optimization and portfolio
Grossmann, I.E., 2012. Advances in mathematical programming models for enterprise-wide optimization. Comput. Chem. Eng. 47, 2–18.
Guo, Y., Baker, K., Dall'Anese, E., Hu, Z., Summers, T., 2018. Stochastic optimal power flow based on data-driven distributionally robust optimization. In: 2018 Annual American Control Conference (ACC), pp. 3840–3846.
Gupta, V., Grossmann, I.E., 2014. A new decomposition algorithm for multistage stochastic programs with endogenous uncertainties. Comput. Chem. Eng. 62, 62–79.
Gupta, A., Maranas, C.D., 2003. Managing demand uncertainty in supply chain planning. Comput. Chem. Eng. 27, 1219–1227.
Gupta, V., Rusmevichientong, P., 2017. Small-data, large-scale linear optimization.
Gupta, A., Maranas, C.D., McDonald, C.M., 2000. Mid-term supply chain planning under demand uncertainty: customer demand satisfaction and inventory management. Comput. Chem. Eng. 24, 2613–2621.
Hanasusanto, G.A., Kuhn, D., 2018. Conic programming reformulations of two-stage distributionally robust linear programs over Wasserstein balls. Oper. Res. 66, 849–869.
Hanasusanto, G.A., Roitch, V., Kuhn, D., Wiesemann, W., 2015. A distributionally robust perspective on uncertainty quantification and chance constrained programming. Math. Program. 151, 35–62.
Hanasusanto, G.A., Roitch, V., Kuhn, D., Wiesemann, W., 2017. Ambiguous joint chance constraints under mean and dispersion information. Oper. Res. 65, 751–767.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9, 1735–1780.
Hong, L.J., Yang, Y., Zhang, L.W., 2011. Sequential convex approximations to joint chance constrained programs: a Monte Carlo approach. Oper. Res. 59, 617–630.
Hota, A.R., Cherukuri, A., Lygeros, J., 2018. Data-driven chance constrained optimization under Wasserstein ambiguity sets. arXiv:1805.06729.
Hu, Z., Hong, L.J., 2013. Kullback-Leibler divergence constrained distributionally robust optimization. Available at Optimization Online.
Ierapetritou, M.G., Pistikopoulos, E.N., 1995. Design of multiproduct batch plants with uncertain demands. Comput. Chem. Eng. 19, S627–S632.
Ji, R., Lejeune, M., 2018. Data-driven distributionally robust chance-constrained programming with Wasserstein metric.
Jiang, R., Guan, Y., 2015. Data-driven chance constrained stochastic program. Math. Program. 158, 291–327.
Jiang, R.W., Guan, Y.P., 2016. Data-driven chance constrained stochastic program. Math. Program. 158, 291–327.
John Walker, S., 2014. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Taylor & Francis.
Jordan, M.I., Mitchell, T.M., 2015. Machine learning: trends, perspectives, and prospects. Science 349, 255–260.
Küçükyavuz, S., Sen, S., 2017. An introduction to two-stage stochastic mixed-integer programming. In: Leading Developments from INFORMS Communities. INFORMS, pp. 1–27.
Kall, P., Wallace, S.W., 1994. Stochastic Programming. John Wiley and Sons Ltd.
Kanamori, T., Takeda, A., 2012. Worst-case violation of sampled convex programs for optimization with uncertainty. J. Optim. Theory Appl. 152, 171–197.
Kariotoglou, N., Margellos, K., Lygeros, J., 2016. On the computational complexity and generalization properties of multi-stage and stage-wise coupled scenario programs. Syst. Control Lett. 94, 63–69.
Keyvanshokooh, E., Ryan, S.M., Kabir, E., 2016. Hybrid robust and stochastic optimization for closed-loop supply chain network design using accelerated Benders decomposition. Eur. J. Oper. Res. 249, 76–92.
Klabjan, D., Simchi-Levi, D., Song, M., 2013. Robust stochastic lot-sizing by means of histograms. Prod. Oper. Manag. 22, 691–710.
Krieger, A., Pistikopoulos, E.N., 2014. Model predictive control of anesthesia under uncertainty. Comput. Chem. Eng. 71, 699–707.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2017. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90.
Laporte, G., Louveaux, F.V., 1993. The integer L-shaped method for stochastic integer programs with complete recourse. Oper. Res. Lett. 13, 133–142.
Lappas, N.H., Gounaris, C.E., 2016. Multi-stage adjustable robust optimization for process scheduling under uncertainty. AIChE J. 62, 1646–1667.
Lasserre, J., Weisser, T., 2018. Distributionally robust polynomial chance-constraints under mixture ambiguity sets.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444.
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., et al., 2017. Photo-realistic single image super-resolution using a generative adversarial network. In: CVPR.
Lee, J.H., Shin, J., Realff, M.J., 2018. Machine learning: overview of the recent progresses and implications for the process systems engineering field. Comput. Chem. Eng. 114, 111–121.
Levi, R., Perakis, G., Uichanco, J., 2015. The data-driven newsvendor problem: new bounds and insights. Oper. Res. 63, 1294–1306.
Li, C., Grossmann, I.E., 2018. An improved L-shaped method for two-stage convex 0–1 mixed integer nonlinear stochastic programs. Comput. Chem. Eng. 112, 165–179.
Li, Z., Ierapetritou, M.G., 2008. Process scheduling under uncertainty: review and challenges. Comput. Chem. Eng. 32, 715–727.
Li, P., Arellano-Garcia, H., Wozny, G., 2008. Chance constrained programming ap-
selection: the cost of robustness. Eur. J. Oper. Res. 212, 417–428. proach to process optimization under uncertainty. Comput. Chem. Eng. 32,
Grossmann, I.E., Biegler, L.T., 2004. Part II. Future perspective on optimization. Com- 25–45.
put. Chem. Eng. 28, 1193–1218. Li, P., Arellano-Garcia, H., Wozny, G., 2008. Chance constrained programming ap-
C. Ning and F. You / Computers and Chemical Engineering 125 (2019) 434–448 447

Li, X., Tomasgard, A., Barton, P.I., 2011. Nonconvex generalized Benders decomposition for stochastic separable mixed-integer nonlinear programs. J. Optim. Theory Appl. 151, 425–454.
Li, B., Jiang, R., Mathieu, J.L., 2017. Ambiguous risk constraints with moment and unimodality information. Math. Program. 173, 151–192.
Liu, M.L., Sahinidis, N.V., 1996. Optimization in process planning under uncertainty. Ind. Eng. Chem. Res. 35, 4154–4165.
Liu, P., Pistikopoulos, E.N., Li, Z., 2010. Decomposition based stochastic programming approach for polygeneration energy systems design under uncertainty. Ind. Eng. Chem. Res. 49, 3295–3305.
Liu, X., Kucukyavuz, S., Luedtke, J., 2016. Decomposition algorithms for two-stage chance-constrained programs. Math. Program. 157, 219–243.
Liu, S.S., Farid, S.S., Papageorgiou, L.G., 2016. Integrated optimization of upstream and downstream processing in biopharmaceutical manufacturing under uncertainty: a chance constrained programming approach. Ind. Eng. Chem. Res. 55, 4599–4612.
Liu, C., Lee, C., Chen, H., Mehrotra, S., 2016. Stochastic robust mathematical programming model for power system optimization. IEEE Trans. Power Syst. 31, 821–822.
Lorca, Á., Sun, X.A., Litvinov, E., Zheng, T., 2016. Multistage adaptive robust optimization for the unit commitment problem. Oper. Res. 61, 32–51.
Luedtke, J., Ahmed, S., 2008. A sample approximation approach for optimization with probabilistic constraints. SIAM J. Optim. 19, 674–699.
Maranas, C.D., 1997. Optimization accounting for property prediction uncertainty in polymer design. Comput. Chem. Eng. 21, S1019–S1024.
Margellos, K., Falsone, A., Garatti, S., Prandini, M., 2018. Distributed constrained optimization and consensus in uncertain networks via proximal minimization. IEEE Trans. Autom. Control 63, 1372–1387.
McLean, K., Li, X., 2013. Robust scenario formulations for strategic supply chain optimization under uncertainty. Ind. Eng. Chem. Res. 52, 5721–5734.
Mesbah, A., 2016. Stochastic model predictive control: an overview and perspectives for future research. IEEE Control Syst. Mag. 36, 30–44.
Miao, F., Han, S., Lin, S., Wang, Q., Stankovic, J.A., Hendawi, A., et al., 2019. Data-driven robust taxi dispatch under demand uncertainties. IEEE Trans. Control Syst. Technol. 27, 175–191.
Miller, B.L., Wagner, H.M., 1965. Chance constrained programming with joint constraints. Oper. Res. 13, 930.
Mitra, K., Gudi, R.D., Patwardhan, S.C., Sardar, G., 2008. Midterm supply chain planning under uncertainty: a multiobjective chance constrained programming framework. Ind. Eng. Chem. Res. 47, 5501–5511.
Mohamed, A.R., Dahl, G.E., Hinton, G., 2012. Acoustic modeling using deep belief networks. IEEE Trans. Audio Speech Lang. Process. 20, 14–22.
Nemirovski, A., Shapiro, A., 2006. Convex approximations of chance constrained programs. SIAM J. Optim. 17, 969–996.
Ning, C., You, F., 2017a. Data-driven adaptive nested robust optimization: general modeling framework and efficient computational algorithm for decision making under uncertainty. AIChE J. 63, 3790–3817.
Ning, C., You, F., 2017b. A data-driven multistage adaptive robust optimization framework for planning and scheduling under uncertainty. AIChE J. 63, 4343–4369.
Ning, C., You, F., 2018a. Adaptive robust optimization with minimax regret criterion: multiobjective optimization framework and computational algorithm for planning and scheduling under uncertainty. Comput. Chem. Eng. 108, 425–447.
Ning, C., You, F., 2018b. Data-driven stochastic robust optimization: general computational framework and algorithm leveraging machine learning for optimization under uncertainty in the big data era. Comput. Chem. Eng. 111, 115–133.
Ning, C., You, F., 2018c. Data-driven decision making under uncertainty integrating robust optimization with principal component analysis and kernel smoothing methods. Comput. Chem. Eng. 112, 190–210.
Ning, C., You, F., 2019. Data-driven adaptive robust unit commitment under wind power uncertainty: a Bayesian nonparametric approach. IEEE Trans. Power Syst. doi:10.1109/TPWRS.2019.2891057.
Oliveira, F., Gupta, V., Hamacher, S., Grossmann, I.E., 2013. A Lagrangean decomposition approach for oil supply chain investment planning under uncertainty with risk considerations. Comput. Chem. Eng. 50, 184–195.
Parpas, P., Rustem, B., Pistikopoulos, E., 2009. Global optimization of robust chance constrained problems. J. Global Optim. 43, 231–247.
Peng, X., Root, T.W., Maravelias, C.T., 2018. Optimization-based process synthesis under seasonal and daily variability: application to concentrating solar power. AIChE J. doi:10.1002/aic.16458.
Pistikopoulos, E.N., 1995. Uncertainty in process design and operations. Comput. Chem. Eng. 19, 553–563.
Postek, K., Ben-Tal, A., den Hertog, D., Melenberg, B., 2018. Robust optimization with ambiguous stochastic constraints under mean and dispersion information. Oper. Res. 66, 814–833.
Prékopa, A., 1995. Stochastic Programming, Volume 324 of Mathematics and Its Applications. Kluwer Academic Publishers Group, Dordrecht.
Qin, S.J., 2014. Process data analytics in the era of big data. AIChE J. 60, 3092–3100.
Quddus, M.A., Chowdhury, S., Marufuzzaman, M., Yu, F., Bian, L.K., 2018. A two-stage chance-constrained stochastic programming model for a bio-fuel supply chain network. Int. J. Prod. Econ. 195, 27–44.
Rooney, W.C., Biegler, L.T., 2003. Optimal process design with model parameter uncertainty and process variability. AIChE J. 49, 438–449.
Sahinidis, N.V., 2004. Optimization under uncertainty: state-of-the-art and opportunities. Comput. Chem. Eng. 28, 971–983.
Sakizlis, V., Perkins, J.D., Pistikopoulos, E.N., 2004. Recent advances in optimization-based simultaneous process and control design. Comput. Chem. Eng. 28, 2069–2086.
Sanchez-Lengeling, B., Aspuru-Guzik, A., 2018. Inverse molecular design using machine learning: generative models for matter engineering. Science 361, 360–365.
Shalev-Shwartz, S., 2012. Online learning and online convex optimization. Found. Trends Mach. Learn. 4, 107–194.
Shang, C., You, F., 2018. Distributionally robust optimization for planning and scheduling under uncertainty. Comput. Chem. Eng. 110, 53–68.
Shang, C., You, F., 2019. A data-driven robust optimization approach to stochastic model predictive control. J. Process Control 75, 24–39.
Shang, C., Yang, F., Huang, D.X., Lyu, W.X., 2014. Data-driven soft sensor development based on deep learning technique. J. Process Control 24, 223–233.
Shang, C., Huang, X., You, F., 2017. Data-driven robust optimization based on kernel learning. Comput. Chem. Eng. 106, 464–479.
Shen, W., Li, Z., Huang, B., Jan, N.M., 2018. Chance-constrained model predictive control for SAGD process using robust optimization approximation. Ind. Eng. Chem. Res. doi:10.1021/acs.iecr.8b03207.
Shi, H., You, F., 2016. A computational framework and solution algorithms for two-stage adaptive robust scheduling of batch manufacturing processes under uncertainty. AIChE J. 62, 687–703.
Shin, J., Lee, J.H., 2019. Multi-timescale, multi-period decision-making model development by combining reinforcement learning and mathematical programming. Comput. Chem. Eng. 121, 556–573.
Smith, J.E., Winkler, R.L., 2006. The optimizer's curse: skepticism and postdecision surprise in decision analysis. Manag. Sci. 52, 311–322.
Soyster, A.L., 1973. Technical note—convex programming with set-inclusive constraints and applications to inexact linear programming. Oper. Res. 21, 1154–1157.
Steimel, J., Engell, S., 2015. Conceptual design and optimization of chemical processes under uncertainty by two-stage programming. Comput. Chem. Eng. 81, 200–217.
Tong, K., Gong, J., Yue, D., You, F., 2014. Stochastic programming approach to optimal design and operations of integrated hydrocarbon biofuel and petroleum supply chains. ACS Sustain. Chem. Eng. 2, 49–61.
Tong, K., You, F., Rong, G., 2014. Robust design and operations of hydrocarbon biofuel supply chain integrating with existing petroleum refineries considering unit cost objective. Comput. Chem. Eng. 68, 128–139.
Uryasev, S., 2000. Conditional value-at-risk: optimization algorithms and applications. In: Proceedings of the IEEE/IAFE/INFORMS 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr) (Cat. No. 00TH8520). IEEE, pp. 49–57.
Van Parys, B.P.G., Kuhn, D., Goulart, P.J., Morari, M., 2016. Distributionally robust control of constrained stochastic systems. IEEE Trans. Autom. Control 61, 430–442.
Vanslyke, R.M., Wets, R., 1969. L-shaped linear programs with applications to optimal control and stochastic programming. SIAM J. Appl. Math. 17, 638.
Vayanos, P., Kuhn, D., Rustem, B., 2012. A constraint sampling approach for multi-stage robust optimization. Automatica 48, 459–471.
Venkatasubramanian, V., 2019. The promise of artificial intelligence in chemical engineering: is it here, finally? AIChE J. 65, 466–478.
Verderame, P.M., Elia, J.A., Li, J., Floudas, C.A., 2010. Planning and scheduling under uncertainty: a review across multiple sectors. Ind. Eng. Chem. Res. 49, 3993–4017.
Vermaak, J., Botha, E.C., 1998. Recurrent neural networks for short-term load forecasting. IEEE Trans. Power Syst. 13, 126–132.
Wang, C., Gao, R., Qiu, F., Wang, J., Xin, L., 2018. Risk-based distributionally robust optimal power flow with dynamic line rating. IEEE Trans. Power Syst. 33, 6074–6086.
Wiesemann, W., Kuhn, D., Sim, M., 2014. Distributionally robust convex optimization. Oper. Res. 62, 1358–1376.
Wu, Y., Tan, H., Qin, L., Ran, B., Jiang, Z., 2018. A hybrid deep learning based traffic flow prediction method and its understanding. Transp. Res. Part C 90, 166–180.
Xie, W.J., Ahmed, S., 2018. On deterministic reformulations of distributionally robust joint chance constrained optimization problems. SIAM J. Optim. 28, 1151–1182.
Xie, W., Ahmed, S., 2018. Distributionally robust chance constrained optimal power flow with renewables: a conic reformulation. IEEE Trans. Power Syst. 33, 1860–1867.
Xie, W., 2018. On Distributionally Robust Chance Constrained Program with Wasserstein Distance. arXiv:1806.07418.
Xiong, P., Jirutitijaroen, P., Singh, C., 2017. A distributionally robust optimization model for unit commitment considering uncertain wind power generation. IEEE Trans. Power Syst. 32, 39–49.
Yang, W.Z., Xu, H., 2016. Distributionally robust chance constraints for non-linear uncertainties. Math. Program. 155, 231–265.
Yang, Y., Vayanos, P., Barton, P.I., 2017. Chance-constrained optimization for refinery blend planning under uncertainty. Ind. Eng. Chem. Res. 56, 12139–12150.
Ye, W., You, F., 2016. A computationally efficient simulation-based optimization method with region-wise surrogate modeling for stochastic inventory management of supply chains with general network structures. Comput. Chem. Eng. 87, 164–179.
Yin, S., Kaynak, O., 2015. Big data for modern industry: challenges and trends. Proc. IEEE 103, 143–146.
Yin, S., Li, X., Gao, H., Kaynak, O., 2015. Data-based techniques focused on modern industry: an overview. IEEE Trans. Ind. Electron. 62, 657–667.
You, F., Grossmann, I.E., 2008. Mixed-integer nonlinear programming models and algorithms for large-scale supply chain design with stochastic inventory management. Ind. Eng. Chem. Res. 47, 7802–7817.
You, F., Grossmann, I.E., 2011. Stochastic inventory management for tactical process planning under uncertainties: MINLP models and algorithms. AIChE J. 57, 1250–1277.
You, F., Grossmann, I.E., 2011. Balancing responsiveness and economics in process supply chain design with multi-echelon stochastic inventory. AIChE J. 57, 178–192.
You, F., Grossmann, I.E., 2013. Multicut Benders decomposition algorithm for process supply chain planning under uncertainty. Ann. Oper. Res. 210, 191–211.
You, F., Wassick, J.M., Grossmann, I.E., 2009. Risk management for a global supply chain planning under uncertainty: models and algorithms. AIChE J. 55, 931–946.
You, F., Pinto, J.M., Grossmann, I.E., Megan, L., 2011. Optimal distribution-inventory planning of industrial gases. II. MINLP models and algorithms for stochastic cases. Ind. Eng. Chem. Res. 50, 2928–2945.
You, K., Tempo, R., Xie, P., 2018. Distributed algorithms for robust convex optimization via the scenario approach. IEEE Trans. Autom. Control 64, 1–1.
Yue, D., You, F., 2013. Planning and scheduling of flexible process networks under uncertainty with stochastic inventory: MINLP models and algorithm. AIChE J. 59, 1511–1532.
Yue, D., You, F., 2016. Optimal supply chain design and operations under multi-scale uncertainties: nested stochastic robust optimization modeling framework and solution algorithm. AIChE J. 62, 3041–3055.
Zeballos, L.J., Méndez, C.A., Barbosa-Povoa, A.P., 2016. Design and planning of closed-loop supply chains: a risk-averse multistage stochastic approach. Ind. Eng. Chem. Res. 55, 6236–6249.
Zhang, Z.P., Zhao, J.S., 2017. A deep belief network based fault diagnosis model for complex chemical processes. Comput. Chem. Eng. 107, 395–407.
Zhang, X.J., Grammatico, S., Schildbach, G., Goulart, P., Lygeros, J., 2015. On the sample size of random convex programs with structured dependence on the uncertainty. Automatica 60, 182–188.
Zhang, Y., Jiang, R., Shen, S., 2018. Ambiguous chance-constrained binary programs under mean-covariance information. SIAM J. Optim. 28, 2922–2944.
Zhang, Y., Jin, X.Z., Feng, Y.P., Rong, G., 2018. Data-driven robust optimization under correlated uncertainty: a case study of production scheduling in ethylene plant (Reprinted from Computers and Chemical Engineering, vol 109, pg 48-67, 2017). Comput. Chem. Eng. 116, 17–36.
Zhang, Y., Feng, Y.P., Rong, G., 2018. Data-driven rolling-horizon robust optimization for petrochemical scheduling using probability density contours. Comput. Chem. Eng. 115, 342–360.
Zhao, C.Y., Guan, Y.P., 2013. Unified stochastic and robust unit commitment. IEEE Trans. Power Syst. 28, 3353–3361.
Zhao, C.Y., Guan, Y.P., 2016. Data-driven stochastic unit commitment for integrating wind generation. IEEE Trans. Power Syst. 31, 2587–2596.
Zhao, S., You, F., 2019. Resilient supply chain design and operations with decision-dependent uncertainty using a data-driven robust optimization approach. AIChE J. 65, 1006–1021.
Zhao, L., Ning, C., You, F., 2019. Operational optimization of industrial steam systems under uncertainty using data-driven adaptive robust optimization. AIChE J. doi:10.1002/aic.16500.
Zhu, W., Ma, Y., Benton, M.G., Romagnoli, J.A., Zhan, Y., 2019. Deep learning for pyrolysis reactor monitoring: from thermal imaging toward smart monitoring system. AIChE J. 65, 582–591.
Zipkin, P.H., 2000. Foundations of Inventory Management. McGraw-Hill.
Zymler, S., Kuhn, D., Rustem, B., 2013. Distributionally robust joint chance constraints with second-order moment information. Math. Program. 137, 167–198.