Theory of Machine Learning Assisted Structural Optimization Algorithm and Its Application
computational efficiency analysis of the MLASO algorithm with the CDLP scheme. The MLASO algorithm is
then embedded within the solid isotropic material with penalization topology optimization method to solve two-
dimensional and three-dimensional problems. Numerical examples and results demonstrate the prediction accuracy
and the computational efficiency of the MLASO algorithm, and that the CDLP scheme can remarkably improve the
computational efficiency of the MLASO algorithm.
ϵ_m = relative difference between the predicted gradient and its exact value
λ = Lagrange multiplier
μ = range of the gradient
σ = activation function
τ_c = convergence criterion
τ_t = training convergence criterion
Φ = neural network input vector
Ψ = neural network output vector
Ψ̂ = neural network prediction output
ω = collective representation of w and b

Subscripts

e = element e
i, j = element index numbers i and j
MLASO = results obtained using machine learning assisted structural optimization
max = maximum value
min = minimum value
pred = values calculated in prediction iterations
real = values calculated using routine optimization methods
ref = results obtained using gradient descent or solid isotropic material with penalization

Superscripts

ep = epoch number in a training loop
k = iteration index
k_p = prediction iteration index
k_r = routine iteration index

I. Introduction

Due to the unique characteristic of TO, the FEA and sensitivity analysis can be computationally expensive, in particular for large-scale and complicated physical problems. To improve the computational efficiency of TO, some available works have managed to skip or accelerate FEA runs and sensitivity analyses in selected iterations, or to skip all iterations entirely, using ML-based methods [4,7–17] or non-ML-based methods [18–21]. For the ML-based methods, an offline-training strategy is typically used to train the ML model with a large set of training samples before solving the target TO problem [7–16]. In the offline-training strategy, the generation of training samples is accomplished by solving many TO problems using conventional TO methods, which can be time consuming. For instance, in Ref. [15], 15,000 training samples (including 12,000 samples for training and 3000 samples for validation) were generated for the three-dimensional (3-D) simply supported stiffened panel with the distributed-load TO problem. The training samples were generated by solving the considered TO problems with 15,000 different loading conditions using a conventional gradient-based TO method. To generate these training samples, eight CPUs were employed to work simultaneously, with a single CPU for each loading case, and the total computational time for training sample generation was 94 h (≈4 days). The trained machine learning model obtained using the offline-training strategy can normally solve design problems within a given scope instantly and accurately, but additional training samples may need to be generated if design problems beyond that scope are encountered. For example, in Ref. [16], 1000 training samples were used to train a generative adversarial network (GAN) for structural and heat conduction TO problems, respectively. However, to solve multiphysics TO problems, the training of the GAN required 2400 training samples in total, including 1000 structural, 1000 heat conduction, and 400 multiphysics training samples. To improve the training efficiency, an online-learning and online-prediction strategy has been adopted recently [4,17]. With the online-learning and online-prediction strategy, the training process is embedded into the iterative process of TO, and training samples are collected from the historical data of the chosen optimization quantities.
prediction accuracy of the MLASO-d algorithm are demonstrated and compared with the top88 algorithm [22] by using the numerical results of the two-dimensional (2-D) TO problems in Ref. [4].

In this work, the computational efficiency of the MLASO-d algorithm is further improved, and the mathematical background of the MLASO-d algorithm is studied. The first contribution of this work is to propose a criterion-driven learning and predicting scheme for the MLASO-d algorithm. With the CDLP scheme, the MLASO-d algorithm can autonomously decide the timing of activating single or multiple prediction iterations. Compared with the PDLP scheme, the implementation of the CDLP scheme significantly improves the computational efficiency of the MLASO-d algorithm when solving TO problems. Furthermore, we also demonstrate that MLASO-d can cooperate with the multigrid preconditioned conjugate-gradient (MGCG) method [18], which is a non-ML-based TO acceleration method, to remarkably reduce the computational time of TO. The second contribution of this work is to establish the mathematical theory of the MLASO-d algorithm from two aspects: the convergence analysis and the computational efficiency analysis. In the convergence analysis, we assume that MLASO-d is embedded into the gradient

size of N_out × 1. In MLASO-d, the scaled gradients of the objective function obtained in the previous and the current iterations are used as the training sample input and output, respectively [4]. Let w be the user-defined weight matrix with the size of N_m × N_in connecting the input and the hidden layer. For any given input Φ ∈ R^{N_in}, the output of the hidden layer is

\[ h(\Phi) = w\Phi \quad (4a) \]

Let w̄ and b̄ be the weight matrix and bias vector with sizes of N_out × N_m and N_out × 1 connecting the hidden layer and the output layer; also, let σ(·) be the rectified linear activation function for the output layer. The output of the NN is then

\[ \hat{\Psi} = \sigma(\bar{w}\,h(\Phi) + \bar{b}) \quad (4b) \]

The error function (denoted as f_e) measures the sum of the mean square error between the target output Ψ and the NN output Ψ̂ for N_s training samples, and it is defined as
problems in MLASO, we scale the exact gradient to the range of [0, 1]. The scaled exact gradient (denoted as g̃_{k−1}) or the predicted gradient in the previous iteration (denoted as s̃_{k−1}) is used as the training sample input (i.e., Φ = δ̃_{k−1}; δ̃ is the collective representation for the rescaled g̃ and predicted s̃), whereas the scaled exact gradient in the current iteration is used as the training sample output for the current training (i.e., Ψ = g̃_k).

In prediction iterations, the input to the NN is the scaled exact gradient g̃_{k−1} or the predicted gradient s̃_{k−1} computed in the previous iteration (i.e., Φ = δ̃_{k−1}), and the prediction output s̃_k is in the range of [0, 1]. Before using the predicted gradient to update the design variable, s̃_k needs to be scaled to an estimated range [μ̂_min, μ̂_max], where

\[ \hat{\mu} = w_s \mu_{k_r} + b_s \quad (10a) \]

and μ_{k_r} is the range of g in the most recent routine iteration; w_s and b_s are the coefficients of a mapping from the range in the second most recent routine iteration (denoted as μ_{k_r−1}) to μ_{k_r} [i.e., f_m(w_s, b_s): μ_{k_r−1} → μ_{k_r}], and

Based on Eq. (13e), we define our activation criterion using Eq. (14a):

\[ \epsilon_a = \|s_k - g_{k-1}\| + \sqrt{\alpha L}\,\|g_{k-1}\| - \big(1 - \sqrt{\alpha L}\big)\|s_k\| \le 0, \quad k > 0 \quad (14a) \]

so that when the activation criterion is satisfied, we have

\[ \Big(\|s_k - g_{k-1}\| + \sqrt{\alpha L}\,\|g_{k-1}\| - \big(1 - \sqrt{\alpha L}\big)\|s_k\|\Big)\Big(\|s_k - g_{k-1}\| + \sqrt{\alpha L}\,\|g_{k-1}\| + \big(1 - \sqrt{\alpha L}\big)\|s_k\|\Big) \le 0 \quad (14b) \]

\[ \Big(\|s_k - g_{k-1}\| + \sqrt{\alpha L}\,\|g_{k-1}\|\Big)^2 \le \big(1 - \sqrt{\alpha L}\big)^2\|s_k\|^2 \quad (14c) \]

Noting Eqs. (13e) and (14c), one can obtain Eq. (14d) for all activated prediction iterations:
19:   else
20:     routine = 1;
21:   end if
22: end if
23: if routine ≥ 1 || k ≤ 1,  ▸ If prediction is not activated, do routine iteration
24:   g_k = ∇f(x_k);  ▸ Calculate the exact gradient
25:   δ_k = g_k;  ▸ Update δ_k
26:   Do training, update w and b;
27:   routine = 0;
28: end if
29: x_{k+1} = x_k − αδ_k;  ▸ Update design variable
30: if k > 0,
31:   αL_k = α‖δ_{k−1} − δ_k‖ / ‖x_{k−1} − x_k‖;
32: end if
33: end for
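Read together, Eqs. (4a), (4b), and (14a) and the listing above describe one complete criterion-driven iteration. The sketch below is a minimal NumPy rendering of that control flow under our own naming (`grad_fn`, `train_fn`, and the two-layer `predict` are placeholders for the FEA/sensitivity call, the training update, and the network of Eqs. (4a) and (4b)); it illustrates the logic of lines 19-33, and is not the published implementation:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def predict(delta_prev, w_in, w_out, b_out):
    # Eqs. (4a)-(4b): hidden layer h = w*Phi, then ReLU output layer
    return relu(w_out @ (w_in @ delta_prev) + b_out)

def criterion_met(s_k, g_prev, alpha_L):
    # Eq. (14a): eps_a = ||s_k - g_{k-1}|| + sqrt(aL)*||g_{k-1}||
    #                    - (1 - sqrt(aL))*||s_k|| <= 0
    r = np.sqrt(alpha_L)
    eps_a = (np.linalg.norm(s_k - g_prev) + r * np.linalg.norm(g_prev)
             - (1.0 - r) * np.linalg.norm(s_k))
    return eps_a <= 0.0

def mlaso_cdlp(x, grad_fn, train_fn, params, alpha, n_iter):
    """Schematic CDLP loop: choose between prediction and routine iterations."""
    routine, alpha_L = 1, None
    x_prev, delta_prev = None, None
    for k in range(n_iter):
        delta = None
        if k > 1:  # attempt a prediction once two routine iterations exist
            s_k = predict(delta_prev, *params)
            if criterion_met(s_k, delta_prev, alpha_L):
                delta, routine = s_k, 0        # prediction iteration: skip FEA
            else:
                routine = 1
        if routine >= 1 or k <= 1:             # routine iteration (lines 23-28)
            g_k = grad_fn(x)                   # exact gradient (FEA + sensitivity)
            delta = g_k
            train_fn(delta_prev, g_k)          # update w and b of the output layer
            routine = 0
        x_new = x - alpha * delta              # design update (line 29)
        if k > 0:                              # line 31: alpha times local Lipschitz estimate
            alpha_L = alpha * np.linalg.norm(delta_prev - delta) / np.linalg.norm(x_prev - x)
        x_prev, delta_prev, x = x, delta, x_new
    return x
```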
\[ \frac{1}{L}\|\nabla f(x) - \nabla f(y)\|^2 \le \langle \nabla f(x) - \nabla f(y),\; x - y\rangle \quad (18) \]

Proof: Let us replace y and x in Eq. (17) with x_{k+1} and x_k; then, we have the following:
\[ f(x_{k+1}) \le f(x_k) + g_k^T(x_k - \alpha g_k - x_k) + \frac{L}{2}\|x_k - \alpha g_k - x_k\|^2 \le f(x_k) - \alpha\Big(1 - \frac{L\alpha}{2}\Big)\|g_k\|^2 \quad (21a) \]

Based on Assumption 2, α ≤ 1/L; so, in routine iterations,

\[ \alpha\Big(1 - \frac{L\alpha}{2}\Big)\|g_k\|^2 \le f(x_k) - f(x_{k+1}) \quad (21b) \]

Based on Eq. (12), substituting s_k = g_k + c_k into Eq. (22a) yields

\[ f(x_{k+1}) \le f(x_k) - \alpha g_k^T(g_k + c_k) + \frac{L\alpha^2}{2}\|g_k + c_k\|^2 \le f(x_k) - \alpha g_k^T g_k - \alpha\langle g_k, c_k\rangle + \frac{L\alpha^2}{2}\|g_k + c_k\|^2 \quad (22b) \]

Letting

\[ R_k(L\alpha) = \frac{\alpha}{2}\Big[-(1 - L\alpha)\|c_k + g_k\|^2 + \|c_k\|^2\Big] \]

\[ f(x_{k+1}) - f(x_k) \le -\frac{\alpha}{2}\|g_k\|^2 + R_k(L\alpha) \le 0 \quad (22c) \]

Equations (21b) and (22c) illustrate that the objective function in both prediction and routine iterations is monotonically nonincreasing; therefore, Eq. (20) is proved.

The next lemma shows that, with the CDLP scheme, the summation of R_k(Lα) over all prediction iterations is a negative finite number.

Lemma 2: If Assumptions 1 and 2 hold, the summation of R_k(Lα) for all prediction iterations is a negative finite number within the range

\[ 0 \ge \sum_{k \in k_i^p} R_k(L\alpha) \ge -f(x_0) + f(x^*) \quad (23) \]

where f(x_0) and f(x*) are the objective function values calculated using the initial value of the minimizer x_0 and using the optimal value x*.

Proof:

\[ \sum_{k \in k_i^p} R_k(L\alpha) \le 0 \quad (24) \]

In routine iterations, due to Eq. (21b),

\[ \sum_{k \in k_i^r} \alpha\Big(1 - \frac{L\alpha}{2}\Big)\|g_k\|^2 \le \sum_{k \in k_i^r} [f(x_k) - f(x_{k+1})] \le \sum_{k \in k_i} [f(x_k) - f(x_{k+1})] \quad (25a) \]

and, in prediction iterations, due to Eq. (22c),

\[ \sum_{k \in k_i^p} \Big[\frac{\alpha}{2}\|g_k\|^2 - R_k(L\alpha)\Big] \le \sum_{k \in k_i^p} [f(x_k) - f(x_{k+1})] \le \sum_{k \in k_i} [f(x_k) - f(x_{k+1})] \quad (25b) \]

Equations (24) and (27b) correspond to the left- and right-hand sides of Eq. (23); with the CDLP scheme, Eq. (14d) must hold in all prediction iterations, so R_k(Lα) ≤ 0 and the summation ∑_{k∈k_i^p} R_k(Lα) must be a negative finite number.

The following theorem proves the convergence of the MLASO-d algorithm for unconstrained optimization with convex and smooth functions.

Theorem 1: If Assumptions 1 and 2 are valid, starting from any arbitrary initial value x_0, the objective function defined in Eq. (2) can always converge to its minimum value [denoted as f(x*)] if the design variable x is updated using Eqs. (9) and (11) with a CDLP scheme. The convergence of the objective function can be represented as

\[ \lim_{K_n \to \infty} [f(x_{K_n}) - f(x^*)] = 0 \quad (28) \]

Proof: In routine iterations,

\[ \|x_{k+1} - x^*\|^2 = \|x_k - x^* - \alpha g_k\|^2 = \|x_k - x^*\|^2 - 2\alpha\langle x_k - x^*, g_k\rangle + \alpha^2\|g_k\|^2 \quad (29) \]

Substituting x and y in Eq. (18) with x_k and x*, and noting g_k = ∇f(x_k) and ∇f(x*) = 0, gives
\[ \|x_{k+1} - x^*\|^2 \le \|x_k - x^*\|^2 - \frac{2\alpha}{L}\|g_k\|^2 + \alpha^2\|g_k\|^2 \le \|x_k - x^*\|^2 - \alpha^2\Big(\frac{2}{L\alpha} - 1\Big)\|g_k\|^2 \quad (31) \]

In prediction iterations,

\[ \|x_{k+1} - x^*\|^2 = \|x_k - x^* - \alpha s_k\|^2 = \|x_k - x^*\|^2 - 2\alpha\langle x_k - x^*, s_k\rangle + \alpha^2\|s_k\|^2 = \|x_k - x^*\|^2 - 2\alpha\langle x_k - x^*, g_k + c_k\rangle + \alpha^2\|s_k\|^2 = \|x_k - x^*\|^2 - 2\alpha\langle x_k - x^*, g_k\rangle - 2\alpha\langle x_k - x^*, c_k\rangle + \alpha^2\|s_k\|^2 \quad (32) \]

We substitute x, y, s_x, g_x, and c_x in Eq. (19) with x_k, x*, s_k, g_k, and c_k; and

\[ 2\langle c_k, x_k - x^*\rangle \ge \langle g_k, x_k - x^*\rangle\Big(\alpha L\frac{\|s_k\|^2}{\|g_k\|^2} - 1\Big) \quad (33a) \]

Because of Eq. (30), we have

\[ 2\langle c_k, x_k - x^*\rangle \ge \frac{1}{L}\|g_k\|^2\Big(\alpha L\frac{\|s_k\|^2}{\|g_k\|^2} - 1\Big) \ge \|g_k\|^2\Big(\alpha\frac{\|s_k\|^2}{\|g_k\|^2} - \frac{1}{L}\Big) \ge \alpha\|s_k\|^2 - \frac{1}{L}\|g_k\|^2 \quad (33b) \]

Substituting Eqs. (30) and (33b) into Eq. (32) yields

\[ \|x_{k+1} - x^*\|^2 \le \|x_k - x^*\|^2 - 2\alpha\langle x_k - x^*, g_k\rangle - \alpha^2\|s_k\|^2 + \frac{\alpha}{L}\|g_k\|^2 + \alpha^2\|s_k\|^2 \le \|x_k - x^*\|^2 - \frac{\alpha}{L}\|g_k\|^2 \quad (34) \]

Because 0 < Lα ≤ 1, we can conclude from Eqs. (31) and (34) that ‖x_k − x*‖² is a nonincreasing sequence for all iterations and

\[ \|x_k - x^*\|^2 \le \|x_0 - x^*\|^2 \quad (35) \]

Due to the convexity of the considered problem and the Cauchy–Schwarz inequality, in both routine and prediction iterations, we have

\[ f(x^*) \ge f(x_k) + g_k^T(x^* - x_k) \quad (36a) \]

\[ f(x_k) - f(x^*) \le g_k^T(x_k - x^*) \le \|g_k\|\|x_k - x^*\| \le \|g_k\|\|x_0 - x^*\| \quad (36b) \]

So,

\[ \frac{f(x_k) - f(x^*)}{\|x_0 - x^*\|} \le \|g_k\| \quad (37) \]

In routine iterations, combining Eqs. (21a) and (37),

\[ f(x_{k+1}) - f(x^*) \le f(x_k) - f(x^*) - \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2}[f(x_k) - f(x^*)]^2 \quad (38a) \]

multiplying Eq. (38a) by

\[ \frac{1}{[f(x_{k+1}) - f(x^*)][f(x_k) - f(x^*)]} \]

we have

\[ \frac{1}{f(x_k) - f(x^*)} \le \frac{1}{f(x_{k+1}) - f(x^*)} - \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2}\,\frac{f(x_k) - f(x^*)}{f(x_{k+1}) - f(x^*)} \quad (38b) \]

Because f(x_{k+1}) − f(x*) ≤ f(x_k) − f(x*),

\[ \frac{1}{f(x_{k+1}) - f(x^*)} - \frac{1}{f(x_k) - f(x^*)} \ge \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2} \quad (38c) \]

In prediction iterations, based on Eq. (22c),

\[ f(x_{k+1}) - f(x^*) \le f(x_k) - f(x^*) - \frac{\alpha}{2}\|g_k\|^2 + R_k(L\alpha) \le f(x_k) - f(x^*) - \frac{\alpha}{2\|x_0 - x^*\|^2}[f(x_k) - f(x^*)]^2 + R_k(L\alpha) \quad (39a) \]

multiplying Eq. (39a) by

\[ \frac{1}{[f(x_{k+1}) - f(x^*)][f(x_k) - f(x^*)]} \]

we have

\[ \frac{1}{f(x_k) - f(x^*)} \le \frac{1}{f(x_{k+1}) - f(x^*)} - \frac{\alpha}{2\|x_0 - x^*\|^2}\,\frac{f(x_k) - f(x^*)}{f(x_{k+1}) - f(x^*)} + \frac{R_k(L\alpha)}{[f(x_{k+1}) - f(x^*)][f(x_k) - f(x^*)]} \]

\[ \frac{1}{f(x_{k+1}) - f(x^*)} - \frac{1}{f(x_k) - f(x^*)} \ge \frac{\alpha}{2\|x_0 - x^*\|^2} - \frac{R_k(L\alpha)}{[f(x_{k+1}) - f(x^*)][f(x_k) - f(x^*)]} \quad (39b) \]

By taking the sum of Eqs. (38c) and (39b) over all iterations (i.e., from k = 0 to k = K_n − 1),

\[ \frac{1}{f(x_{K_n}) - f(x^*)} \ge \frac{1}{f(x_{K_n}) - f(x^*)} - \frac{1}{f(x_0) - f(x^*)} \ge \frac{\alpha}{2\|x_0 - x^*\|^2}K_p + \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2}K_r - \sum_{k \in k_i^p} \frac{R_k(L\alpha)}{[f(x_{k+1}) - f(x^*)][f(x_k) - f(x^*)]} \ge \frac{\alpha}{2\|x_0 - x^*\|^2}K_p + \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2}K_r - \sum_{k \in k_i^p} \frac{R_k(L\alpha)}{[f(x_0) - f(x^*)]^2} \quad (40a) \]
where K_p and K_r are the total numbers of prediction and routine iterations. Because of Lemma 2, let

\[ z_R = \sum_{k \in k_i^p} \frac{R_k(L\alpha)}{[f(x_0) - f(x^*)]^2} \]

and z_R is a negative finite number; then,

\[ \frac{1}{f(x_{K_n}) - f(x^*)} \ge \frac{\alpha}{2\|x_0 - x^*\|^2}K_p + \frac{\alpha\big(1 - \frac{L\alpha}{2}\big)}{\|x_0 - x^*\|^2}K_r \quad (40b) \]

Because Lα ≤ 1, we can derive from Eq. (40b) that

\[ \frac{1}{f(x_{K_n}) - f(x^*)} \ge \frac{\alpha}{2\|x_0 - x^*\|^2}(K_p + K_r) \quad (41a) \]

and thus

\[ f(x_{K_n}) - f(x^*) \le \frac{2\|x_0 - x^*\|^2}{\alpha(K_p + K_r)} \quad (41b) \]

The reduction in routine iterations is accompanied by extra computational costs for training and prediction, and so the secondary condition for saving total computational time is that the time saved by reducing routine iterations must cover the extra time costs of training and prediction. Here, we define the computational time used to do one calculation of gradient information, one design variable update, one training, and one prediction as t_g, t_u, t_train, and t_pred; and the difference between the total computational time when solving the same optimization problem by using a gradient-based optimization method and MLASO-d is defined as t_d. Based on the primary and secondary conditions for time savings, we establish the following theorem:

Theorem 2: t_d ≥ 0 when the following condition is satisfied:

\[ 1 \ge -\epsilon_{K_r} \ge \frac{N_p + 1}{t_g/(t_u + t_{train} + t_{pred})} \quad (46) \]

Proof: Let us define the total computational time when solving the
The convergence of the GD method with the constant move step size α for the same unconstrained optimization problem can be proved using the same method as in Theorem 1 by considering all iterations as routine iterations. Because the GD method has been proved to be monotonically nonincreasing [24], f(x_{k+1}) − f(x*) ≤ f(x_k) − f(x*). Based on Eq. (38c), the sum of

\[ \frac{1}{f(x_{k+1}) - f(x^*)} - \frac{1}{f(x_k) - f(x^*)} \]

over all iterations (i.e., from k = 0 to k = K_n' − 1, where K_n' is the total number of iterations in GD) is

Noting the stopping criterion and Lα ≤ 1, one has

\[ K_n' \le \frac{\|x_0 - x^*\|^2}{\alpha\big(1 - \frac{L\alpha}{2}\big)\tau_c} \le \frac{2\|x_0 - x^*\|^2}{\alpha\tau_c} \quad (44) \]

\[ \cdots - K_n \times t_{pred} \quad (48) \]

In Eq. (48), if K_n' ≥ K_n, noting K_n ≥ K_r, we have

\[ t_d \ge (K_n' - K_r)t_g + (K_n' - K_n)t_u - K_n' t_{train} - K_n' t_{pred} \ge (K_n' - K_r)t_g - K_n'(t_{train} + t_{pred}) \quad (49) \]

which implies t_d ≥ 0 when

\[ \frac{t_g}{t_{train} + t_{pred}} \ge \frac{K_n'}{K_n' - K_r} = \frac{1}{-\epsilon_{K_r}} \quad (50) \]

which implies t_d ≥ 0 when

\[ \frac{t_g}{t_u + t_{train} + t_{pred}} \ge \frac{K_n}{K_n' - K_r} = \frac{K_n}{-\epsilon_{K_r} K_n'} \quad (52) \]
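Theorem 2 doubles as a practical pre-check: prediction pays off only when one gradient evaluation is expensive relative to the combined per-iteration overhead of the ML model. Below is a small helper evaluating the condition in the form reconstructed in Eq. (46); the timing values in the example are hypothetical:

```python
def time_saving_ok(t_g, t_u, t_train, t_pred, eps_kr, n_p=1):
    """Check the Theorem 2 condition t_d >= 0, cf. Eq. (46).

    t_g, t_u, t_train, t_pred : costs of one gradient evaluation, one
        design update, one training, and one prediction
    eps_kr : relative change in routine iterations (negative when MLASO-d
        needs fewer routine iterations than the reference method)
    n_p : number of consecutive prediction iterations
    """
    ratio = t_g / (t_u + t_train + t_pred)
    return 1.0 >= -eps_kr >= (n_p + 1) / ratio

# Assumed timings: 1.6 s FEA gradient vs ~0.024 s of total ML overhead
print(time_saving_ok(t_g=1.6, t_u=0.016, t_train=0.006, t_pred=0.002, eps_kr=-0.4))
```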
predictions can be activated to save more computational time for test functions with a larger N_t; for example, with N_t = 128, more computational time can be reduced if N_p increases from N_p = 1 to N_p = 11. Note that Eq. (58b) offers a guideline for selecting an N_p that can be safely used; but typically, a higher N_p can be used to push the limit of the computational efficiency of the MLASO-d algorithm. However, if N_p is too large, the prediction quality can be affected, and more trainings need to be conducted to find the correct mapping from the predicted gradient in previous iterations to the exact gradient calculated in the current iteration. Intuitively, at N_t = 8, MLASO-d should have more time savings when N_p is larger than nine; but in fact, as N_p increases from N_p = 9 to N_p = 11, the time cost of one training doubles, resulting in less time savings. In general, N_p > 1 seems to be the best option for solving the considered test functions, because MLASO-d (N_p > 1) reduces the total computational time for almost all N_t cases.

Fig. 1 The value of ϵ_c and the actual time savings t_s achieved by solving test function P3 with N_t terms using MLASO-d with the CDLP scheme.

E. Applicability Discussion

The problem in Eq. (1) is generic, and it can be convex or nonconvex. The MLASO-d algorithm determines the gradient of the objective function in the next few iterations from the gradients in the previous few iterations via machine learning. In other words, the MLASO-d algorithm can be viewed as an alternative method of determining the gradient of the objective function at some iterations in a broad feasible gradient-based optimization method. In Sec. II.C, it is mathematically proven that the gradient of the objective function determined by using the MLASO-d algorithm can guarantee convergence and efficiency when the problem in Eq. (1) is convex. Although this mathematical proof cannot be readily extended to the generic nonconvex case, it is believed that the MLASO-d algorithm should be capable of efficiently solving some nonconvex problems within the framework of a feasible gradient-based solution method. Therefore, Sec. III aims to illustrate how the MLASO-d algorithm can be integrated into the SIMP method, as well as to numerically demonstrate the expected convergence and efficiency.

III. MLASO-d Embedded SIMP Method

This section presents a method that embeds the MLASO-d algorithm within the solid isotropic material with penalization method for solving minimum compliance TO problems. First, we present the problem statement for the minimum compliance TO problems, followed by an introduction to the implementation of MLASO-d in SIMP. The computational efficiency and the prediction accuracy of the MLASO-d algorithm are demonstrated by solving 2-D and 3-D numerical examples.

A. Problem Statement

SIMP is a popular gradient-based TO method for solving minimum compliance TO problems with a volume constraint [5,22]. In SIMP, the design domain Ω is discretized with N_e elements. In each iteration, the density of these elements is redistributed to minimize compliance. Let us define the density for the eth element, which is also the design variable, as x_e (e = 1, 2, 3, …, N_e); the material model in SIMP is defined as [5]

\[ E_e(x_e) = E_{min} + x_e^p(E_0 - E_{min}) \]

where E_e, E_0, and E_min are the Young's moduli for the eth element, solid elements, and void elements; a penalty factor of p = 3 is often used [5].

We also define the minimum compliance TO problem using [5]

\[ \min:\; f(x) = C = U^T K U = \sum_{e=1}^{N_e} E_e(x_e)\, u_e^T k_0 u_e \]
\[ \text{subject to}:\; KU = F \]
\[ \frac{\sum_e v_e x_e}{V} = V_f \]
\[ 0 \le x_e \le 1 \quad (63) \]

where the objective function f(x) calculates the structural compliance; u_e is the displacement of the eth element; k_0 is the stiffness matrix for a solid element; K and U are the global stiffness matrix and the global displacement vector; F is the load vector; v_e and V are the volume of element e and the volume of the design domain; and V_f is the volume fraction.

B. Solution Method and Algorithm

The framework of SIMP includes three key steps in each iteration: the FEA, the sensitivity analysis, and the design variable update [5,22]. By implementing MLASO-d into the SIMP method, the framework of SIMP remains in routine iterations; but in prediction iterations, the FEA and the sensitivity analysis are skipped, and the gradient of the objective function is predicted via machine learning. Let us represent the design variable in MLASO-d as a function of gradient information, x_e(δ), where δ is the exact gradient information calculated using routine SIMP TO steps in routine iterations (i.e., δ = g), whereas δ is the scaled predicted gradient calculated using machine learning in prediction iterations (i.e., δ = s). Equation (63) is solved by updating the design variable x_e(δ) iteratively to minimize the compliance. The first two iterations are routine iterations. Starting from the third iteration, the predicted gradient information for each element s̃_e is calculated, and the activation criterion for the prediction iteration needs to be examined to decide the activation of routine and prediction iterations; then, the following steps will be performed accordingly:

In routine iterations, the nodal displacements are calculated by solving the equilibrium equation KU = F, and the objective function is computed based on the nodal displacements [5]; the predicted gradient information is discarded, and the exact gradient of the objective function g_e for element e is calculated via a sensitivity analysis using Eq. (64) [5]. To guarantee that the selection of the training parameters is independent of the TO problems, g_e is scaled to [0, 1] before the training, and the scaled gradient (denoted as g̃_e) is rescaled back to its original range once the training is completed; the training process is conducted to update the ML model by using the scaled g̃_e in the previous and current iterations as the training input and target output:

\[ g_e = \frac{\partial C}{\partial x_e} = -p\,x_e(\delta)^{p-1}(E_0 - E_{min})\,u_e^T k_0 u_e \quad (64) \]

In prediction iterations, the predicted gradient information s̃_e is adopted. The value of the predicted gradient information s̃_e is in the range of [0, 1], and it needs to be scaled to an estimated range
[μ_min, μ_max]; we denote the scaled gradient for the eth element as s_e. The algorithms for the routine and prediction iterations are demonstrated in Appendix B.

Once the value of δ_e is calculated or predicted, the optimization problem in Eq. (63) can be transformed into an unconstrained optimization problem by using the standard OC method [5]. In the OC method, the Lagrange multiplier λ is calculated to satisfy the volume constraint; then, B_e is calculated as [5]

\[ B_e = \frac{-\delta_e}{\lambda \frac{\partial V}{\partial x_e}}, \quad \text{and} \quad \frac{\partial V}{\partial x_e} = 1 \quad (65) \]

Based on B_e, the design variable x_e can be updated using the heuristic updating scheme as [5]

\[ x_e^{new} = x_e B_e^{0.5} \quad \text{if} \quad \max(x_{min},\, x_e - move) \le x_e B_e^{0.5} \le \min(1,\, x_e + move) \quad (66) \]

where x_min is the minimum allowable value for x_e; move is the move limit in SIMP; and, normally, x_min = 0 and move = 0.2.

By using the CDLP scheme, Eq. (66) is transformed into

\[ x_e^{new} = x_e - \alpha\delta_e \quad (67a) \]

where

Then, αL_k can be calculated using Eq. (15a) and used to examine the activation criterion in the next iteration. The same procedure will be repeatedly conducted until compliance is minimized.

C. Numerical Results and Discussion

The top99neo and top3D125 algorithms [5] are the latest versions of the SIMP method for solving 2-D and 3-D TO problems, respectively. Compared to top88 [22], top99neo and top3D125 accelerate the FEA computation and the design variable updates. In this section, we assess the computational efficiency and the prediction accuracy of the MLASO-d algorithm by comparing 2-D and 3-D numerical results obtained by using MLASO-d with those obtained by the top99neo and top3D125 algorithms. In addition, we also compare the prediction accuracy and the computational efficiency of the MLASO-d algorithm with CDLP and PDLP schemes to illustrate the superiority of the CDLP scheme.

The following general settings are defined when using MLASO-d:

1) When using the PDLP scheme, k_s = 5 for all design problems, where k_s is the total number of initial routine iterations before the first ML prediction iteration; and the following parameters of the exponential moving average filter [4] are defined: γ_0 = 0.1, Δγ = 0.1, n = 3, and β_min = 0.6.

2) For both CDLP and PDLP schemes, the upper bound of the initial weights and bias is r_u = 1 × 10^{−9}. The fixed weight between the input layer and the hidden layer is defined as [4]

\[ w_{ij} = \exp\Big(-\frac{R_{ij}^2}{r^2}\Big), \quad i = 1, 2, \ldots, N_e;\; j = 1, 2, \ldots, N_e \]

where R_ij is the centroid distance between elements i and j, and r = 0.1; the maximum radius used to define the local connectivity between the hidden layer and the output layer is R_min [4], and R_min = 1. The learning rate is r_l = 0.2.

3) The convergence criteria for the 2-D and 3-D problems are the same as the ones used in Ref. [5], and they are defined in Eqs. (68a) and (68b). In this study, τ_c = 1 × 10^{−5} for 2-D design problems and τ_c = 1 × 10^{−6} for 3-D design problems:

\[ \frac{\|x_{k-1} - x_k\|}{\sqrt{N_e}} \le \tau_c \quad (68a) \]

\[ \frac{\|x_{k-1} - x_k\|}{N_e} \le \tau_c \quad (68b) \]

Before we present the numerical results, let us denote the final compliance computed by top99neo or top3D125 as f_ref and the one predicted by MLASO-d as f_MLASO, and we use Eq. (60) to calculate the relative difference between f_ref and f_MLASO. Also, Eqs. (45) and (59) are used to calculate the relative differences between the total number of routine iterations and the overall computational time when solving TO problems; K_n' and t_ref are the total number of routine iterations and the total computational time for the top99neo and top3D125 algorithms.

1. 2-D Topology Optimization Examples

Four 2-D design problems are considered in this section: the Messerschmitt–Bölkow–Blohm (MBB) beam problem, the mid cantilever problem, the Michell beam problem, and the distributed-load beam problem. In this work, these four design problems are referred to as problems A, B, C, and D. The geometry, boundary, and loading conditions, as well as the material properties and volume fraction for each design problem, are presented in Fig. 2. For all

Fig. 2 Problem setup for a) MBB beam, b) mid cantilever beam, c) Michell beam, and d) distributed-load beam.
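The elementwise updates of Eqs. (65) and (66) and the CDLP-modified update of Eq. (67a), shown earlier in this section, are compact in vectorized form. A minimal NumPy sketch follows; the bisection for the Lagrange multiplier λ is omitted, and the box-bound clipping in the CDLP branch is our own assumption:

```python
import numpy as np

def oc_update(x, delta, lam, move=0.2, x_min=0.0):
    """Heuristic OC update, Eqs. (65)-(66): x_e <- x_e * B_e^0.5 within move limits."""
    B = -delta / lam                      # Eq. (65), with dV/dx_e = 1
    x_cand = x * np.sqrt(np.maximum(B, 0.0))
    lower = np.maximum(x_min, x - move)   # Eq. (66) move-limit clamp
    upper = np.minimum(1.0, x + move)
    return np.clip(x_cand, lower, upper)

def cdlp_update(x, delta, alpha):
    """Gradient-style update of Eq. (67a); clipping to [0, 1] is our assumption."""
    return np.clip(x - alpha * delta, 0.0, 1.0)
```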
Table 3 Total computation time (and t_s) for solving 2-D problems using top99neo and MLASO-d with PDLP and CDLP schemes

Table 4 Optimized topologies and final compliance calculated by using top99neo and MLASO-d with PDLP and CDLP schemes (for problem D, the final compliance is f = 8.73 for top99neo and for all MLASO-d schemes, with ε_f = 0.00%)
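The ε_f, ε_{K_r}, and t_s columns referenced by Tables 3 and 4 are all relative differences against the reference algorithm; Eqs. (45), (59), and (60) are not reproduced in this extract, so the helper below assumes the common form (value − reference)/reference:

```python
def relative_difference(value, reference):
    """Relative difference of an MLASO-d quantity against its reference,
    e.g. final compliance (eps_f), routine iterations (eps_Kr), or time (t_s).
    Assumed form: (value - reference) / reference."""
    return (value - reference) / reference

# Example: problem D final compliance, 8.73 for both methods -> 0.00%
print(f"{relative_difference(8.73, 8.73):.2%}")
```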
compliances ε_f computed by using the MLASO-d algorithms are within 0.03% for all problems. With N_p = 1, 2, and 4, the relative difference ε_f for the CDLP scheme is lower than or equal to the one for the PDLP scheme for all considered problems. The results demonstrate that the MLASO-d algorithm can calculate the optimized topology and the minimum compliance accurately for 2-D TO problems with both PDLP and CDLP schemes, and CDLP can produce more accurate predictions than PDLP when N_p = 1, 2, and 4.

2. 3-D Topology Optimization Examples

a. 3-D Cantilever Beam. In this example, we assess the computational efficiency and the prediction accuracy of MLASO-d with PDLP and CDLP schemes in the optimization of a 3-D cantilever beam. The geometry and boundary, as well as the loading conditions, are demonstrated in Fig. 4; the load is a sine-shaped load, as defined in Ref. [18]. The design domain is discretized by cubic elements with a unit side length, and three sets of length l_x, width l_z, and height l_y of the design domain are considered, which are 48 × 24 × 24, 80 × 40 × 40, and 112 × 56 × 56. The Young's moduli for the solid and void elements are E_0 = 1 and E_min = 1 × 10^{−9}, and the Poisson's ratio is v = 0.3. The SIMP penalty is three, the volume fraction is 0.12, and the size of the density filter is (l_x/48)√3.

As reported in Ref. [5], the direct solver in top3D125 needs to be replaced with the multigrid preconditioned conjugate-gradient solver [18] when solving 3-D problems with meshes finer than 48 × 24 × 24. The implementation of the MGCG solver also accelerates the computation of gradient information. When solving the 3-D cantilever beam example with a 48 × 24 × 24 mesh using top3D125, the average computational costs of one calculation of gradient information are 8.1 and 1.6 s for the direct and MGCG solvers, and they are 338 and 67 times the sum of t_u, t_train, and t_pred for MLASO-d with CDLP. According to Eq. (58b), the ratio of t_g/(t_pred + t_train + t_u) for the 3-D cantilever beam problem with the 48 × 24 × 24 mesh suggests that N_p can theoretically be as large as 337 and 66 when using the direct solver and the MGCG solver, and the upper bound of N_p can be higher if the mesh is finer.
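The quoted N_p bounds follow directly from the measured cost ratio: if one gradient evaluation costs R times the combined update/training/prediction overhead, Eq. (58b) allows roughly N_p ≤ R − 1 consecutive predictions. Reproducing the arithmetic for the 48 × 24 × 24 case, with the per-call overhead inferred from the reported "338 times" figure:

```python
t_g_direct, t_g_mgcg = 8.1, 1.6          # seconds per gradient evaluation
overhead = t_g_direct / 338.0            # t_u + t_train + t_pred, inferred

for label, t_g in [("direct", t_g_direct), ("MGCG", t_g_mgcg)]:
    ratio = t_g / overhead
    print(f"{label}: ratio ~ {ratio:.0f}, N_p upper bound ~ {round(ratio) - 1}")
# -> direct: ratio ~ 338, bound 337; MGCG: ratio ~ 67, bound 66
```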
Tables 5 and 6 list the computational efficiencies of the top3D125 and MLASO-d algorithms for the 3-D cantilever problem. As listed in Table 6, when the mesh is 48 × 24 × 24, top3D125 with the MGCG solver is around 4.6 times faster than top3D125 with the direct solver. When MLASO-d with PDLP and CDLP (N_p = 1) are implemented, the numbers of routine iterations are reduced by 55.1 and 42.9%, which result in 52.5 and 41.3% total time reductions as compared to top3D125 with the MGCG solver. For the cases of 48 × 24 × 24, 80 × 40 × 40, and 112 × 56 × 56 meshes, the PDLP scheme shows better computational efficiency than CDLP with N_p = 1, but the computational efficiency of CDLP increases as N_p increases from one to two, four, and 10; and the CDLP scheme (N_p = 10) is 2.3, 2.2, and 1.8 times faster than the PDLP scheme. Note that, based on the upper bound of our estimated N_p, we can choose a larger N_p to save more computation time, but the prediction accuracy will deteriorate.
Fig. 4 Problem setup for the 3-D cantilever beam.

Table 5 Number of routine iterations needed to converge (and ε_{K_r}) for solving the 3-D cantilever problem using top3D125 and MLASO-d with PDLP and CDLP schemes

Table 6 Total computation time (and t_s) for solving the 3-D cantilever problem using top3D125 and MLASO-d with PDLP and CDLP schemes

Table 7 Optimized topologies and final compliance for the 3-D cantilever beam problem, calculated using top3D125 and MLASO-d with PDLP and CDLP schemes (48 × 24 × 24, 80 × 40 × 40, and 112 × 56 × 56 meshes)

The optimized topologies and the final compliance computed by using top3D125 with the direct and MGCG solvers and MLASO-d with PDLP and CDLP schemes are demonstrated in Table 7. The optimized topologies predicted by MLASO-d are similar to the ones obtained by using the top3D125 algorithm with the direct and MGCG solvers. The relative differences in the final compliances calculated by CDLP with N_p = 1, 2, and 4 are within 1% for the three considered meshes, and they are lower than the one calculated by the PDLP scheme. However, with N_p = 10, the relative difference computed by the CDLP scheme is higher than the one calculated by the PDLP scheme for all considered meshes, and the relative difference is higher than 1% for the 80 × 40 × 40 and 112 × 56 × 56 meshes. The results of the final compliance suggest that using N_p = 10 can degrade the prediction accuracy for the 3-D cantilever beam problem, although it can also lead to remarkable computational efficiency.

b. 3-D Aircraft Engine Pylon. To demonstrate the computational efficiency and prediction accuracy of MLASO-d in the design of an aircraft component, we present the 3-D aircraft engine pylon design problem. The aircraft engine pylon is the structure that connects the engine to the wing. Each pylon is usually attached to the wing at the fore and aft attachment points, whereas the engine is mounted to the pylon at the fore and aft engine mounts. The conceptual design of the engine pylon can be accomplished using the gradient-based TO method [26]. In this work, we consider the design of the engine pylon with respect to the minimum compliance requirement; the aft and fore attachment points are at x = 0 and x = 0.3l_x, and fixed boundary conditions are used to simulate the attachment to the wing. The aft and fore engine mounts are at x = 0.6l_x and x = l_x, and unit downward loads are used to illustrate the engine weight; see Fig. 5 for the setup of the problem. The design domain is discretized by cubic elements with a unit side length, and the size of the design domain is l_x = 100, with l_y = l_z = 16. The Young's moduli for solid and void elements are E_0 = 1 and E_min = 1 × 10^{−9}, and the Poisson's ratio is v = 0.3. The SIMP penalty is three, the volume fraction is 0.25, and the size of the density filter is √3.

Fig. 5 Problem setup for the aircraft engine pylon design.

When solving the engine pylon problem with CDLP, the average computational cost of one calculation of gradient information is 62 times the sum of t_u, t_train, and t_pred, which implies that the upper bound of N_p can be as large as 61. Considering that a large N_p may deteriorate the prediction accuracy, here we use N_p = 1, 2, 4, and 10. The computational efficiencies of top3D125 and MLASO-d for the design of the aircraft engine pylon are listed in Table 8. The MLASO-d with PDLP schemes can use fewer routine iteration runs, and thus less computational time, to finish the design as compared to top3D125; and the CDLP scheme shows an even better computational efficiency than the PDLP scheme. With N_p = 10, the MLASO-d with CDLP is 2.8 times faster than using the PDLP scheme, and it is 3.9 times faster than top3D125 with the MGCG solver. Table 8 also lists the optimized topologies and the final compliance calculated by top3D125 with the MGCG solver and MLASO-d with PDLP and CDLP schemes. The optimized topologies obtained using the MLASO-d algorithms are similar to the one obtained by top3D125, and the final compliances computed by MLASO-d with PDLP and CDLP schemes (with N_p < 10) are lower than the one calculated by top3D125 with the MGCG solver; the relative error of the final objective function values is within 1% for all MLASO-d results.

The convergence history of the engine pylon design is depicted in Fig. 6, and the indexes of the routine iteration at convergence (denoted as k_rc) are labelled. When using the PDLP and CDLP schemes, the first prediction is activated after k_r = 5 and k_r = 12, and thus the objective function value starts to drop faster than the one calculated by top3D125 at k_r = 6 and k_r = 13. The PDLP scheme activates prediction iterations earlier than the CDLP scheme, resulting in a lower objective function in the range of k_r = 6 to k_r = 12. But, once the prediction is activated in the CDLP scheme, the objective function of CDLP drops sharply, reaching the same level as that calculated by PDLP with only one routine iteration when N_p = 10. The convergence history of the objective function again demonstrates the superior computational efficiency of the MLASO-d with the CDLP scheme.

To measure the prediction accuracy of each leading prediction, let us use ϵ_m to represent the relative error between the predicted gradient and its exact value. Note that

\[ \epsilon_m = \frac{\|s - g\|}{\|g\|} \]
Table 8 Calculation results and computational efficiency for solving the engine pylon problem using top3D125 and MLASO-d with PDLP and CDLP schemes

Also, we use ϵ_f to measure the relative error between the predicted compliance and its exact value:

\[ \epsilon_f = \frac{\|f_{pred} - f_{real}\|}{\|f_{real}\|} \]

increase in the first three and five leading predictions; and this is because the prediction is activated too early. On the contrary, when the CDLP scheme is used (with N_p = 1, 2, 4, and 10), the prediction
IV. Conclusions
In this work, a novel generic CDLP scheme is proposed to control
the activation of predictions in the MLASO algorithm. The CDLP
scheme reduces the prediction error in early predictions and
improves the computational efficiency of MLASO when solving
TO problems. Based on the CDLP scheme, the mathematical theory
to demonstrate the convergence and the computational efficiency of
the MLASO-d algorithm for unconstrained optimization problems
is also established, and it is shown that the mathematical theory is
valid for the TO problems. To support this mathematical theory, the
MLASO-d is embedded into top99neo and top3D125 algorithms to
solve 2-D and 3-D TO problems. The implementation of MLASO-d
reduces the total computational time as compared to using top99neo
and top3D125 when solving the selected 2-D and 3-D TO problems.
Based on the present numerical results, it is concluded that the
MLASO-d with the CDLP scheme has superior computational
efficiency with multiple consecutive predictions as compared to the top99neo, the top3D125, and the MLASO-d with the PDLP scheme.

Fig. 7 The convergence history of relative errors ϵ_m and ϵ_f for the aircraft engine pylon problem calculated using MLASO-d with PDLP and CDLP schemes.
Appendix A: Unconstrained Optimization Test Functions

The following test functions are used in Sec. II.D:

For test function P1, the extended Dennis and Schnabel test problems (version F), denoted as the DENSCHNF function [constrained and unconstrained testing environment (CUTE)], is

\[ f(x) = \sum_{i=1}^{n/2} \Big[\big(2(x_{2i-1} + x_{2i})^2 + (x_{2i-1} - x_{2i})^2 - 8\big)^2 + \big(5x_{2i-1}^2 + (x_{2i} - 3)^2 - 9\big)^2\Big], \quad x_0 = (2, 0, 2, 0, \ldots, 2, 0) \]

For test function P2, the generalized quartic function is

\[ f(x) = \sum_{i=1}^{n-1} \big[x_i^2 + (x_{i+1} + x_i^2)^2\big], \quad x_0 = (1, 1, 1, \ldots, 1) \]

For test function P3, the DIXMAANA-DIXMAANL(A) function is

\[ f(x) = 1 + \sum_{i=1}^{M_b} a_1 x_i^2 \Big(\frac{i}{M_b}\Big)^{B_1} + \sum_{i=1}^{M_b-1} a_2 x_i^2 (x_{i+1} + x_{i+1}^2)^2 \Big(\frac{i}{M_b}\Big)^{B_2} + \sum_{i=1}^{2M_a} a_3 x_i^2 x_{i+M_a}^2 \Big(\frac{i}{M_b}\Big)^{B_3} + \sum_{i=1}^{M_a} a_4 x_i x_{i+2M_a} \Big(\frac{i}{M_b}\Big)^{B_4} \]

with a_1 = 1, a_2 = 0, a_3 = 0.125, a_4 = 0.125, B_1 = B_2 = B_3 = B_4 = 0, M_a = M_b/3, and x_0 = (2, 2, …, 2).

For extended test function P3, the extended DIXMAANA-DIXMAANL(A) function (n_t is the index of each term, and 0 ≤ n_t ≤ N_t) is

\[ f(x) = 1 + \underbrace{\sum_{i=1}^{M_b} a_1 x_i^2 \Big(\frac{i}{M_b}\Big)^{B_1}}_{n_t = 1} + \underbrace{\sum_{i=1}^{M_b-1} a_2 x_i^2 (x_{i+1} + x_{i+1}^2)^2 \Big(\frac{i}{M_b}\Big)^{B_2}}_{n_t = 2} + \underbrace{\sum_{i=1}^{2M_a} a_3 x_i^2 x_{i+M_a}^2 \Big(\frac{i}{M_b}\Big)^{B_3}}_{n_t = 3} + \underbrace{\sum_{i=1}^{M_a} a_4 x_i x_{i+2M_a} \Big(\frac{i}{M_b}\Big)^{B_4}}_{n_t = 4} + \cdots + \underbrace{\sum_{i=1}^{M_b} a_1 x_i^2 \Big(\frac{i}{M_b}\Big)^{B_1}}_{n_t = N_t - 3} + \underbrace{\sum_{i=1}^{M_b-1} a_2 x_i^2 (x_{i+1} + x_{i+1}^2)^2 \Big(\frac{i}{M_b}\Big)^{B_2}}_{n_t = N_t - 2} + \underbrace{\sum_{i=1}^{2M_a} a_3 x_i^2 x_{i+M_a}^2 \Big(\frac{i}{M_b}\Big)^{B_3}}_{n_t = N_t - 1} + \underbrace{\sum_{i=1}^{M_a} a_4 x_i x_{i+2M_a} \Big(\frac{i}{M_b}\Big)^{B_4}}_{n_t = N_t} \]
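For concreteness, the first two test functions translate directly into code. The NumPy sketch below follows the reconstruction above; note that the inner structure of the first DENSCHNF term is garbled in the source, so we use the standard CUTE form, which matches the visible second term and starting point:

```python
import numpy as np

def denschnf(x):
    """Test function P1 (DENSCHNF, standard CUTE form); x0 = (2, 0, 2, 0, ...)."""
    a, b = x[0::2], x[1::2]
    return np.sum((2.0 * (a + b) ** 2 + (a - b) ** 2 - 8.0) ** 2
                  + (5.0 * a ** 2 + (b - 3.0) ** 2 - 9.0) ** 2)

def generalized_quartic(x):
    """Test function P2; x0 = (1, 1, ..., 1)."""
    return np.sum(x[:-1] ** 2 + (x[1:] + x[:-1] ** 2) ** 2)

print(denschnf(np.tile([2.0, 0.0], 4)), generalized_quartic(np.ones(8)))
```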
3: Calculate w_s and b_s by letting ∂J/∂w_s = 0 and ∂J/∂b_s = 0
4: μ̂ = w_s μ_{k_r} + b_s
5: s̃_k = σ(w̄_{k_r−1} h(δ̃_{k−1}) + b̄_{k_r−1});  ▸ Predicting s̃
6: δ̃_k = s̃_k;
7: δ_k = s_k = μ̂_min + (s̃_k − min(s̃_k))/(max(s̃_k) − min(s̃_k)) × (μ̂_max − μ̂_min);  ▸ Rescaling s̃ to the range [μ̂_min, μ̂_max]
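Steps 3-7 of the prediction-iteration listing above amount to fitting the one-dimensional range map, running the network forward pass, and rescaling. A hypothetical NumPy rendering follows; representing each range μ as a [min, max] pair and solving ∂J/∂w_s = ∂J/∂b_s = 0 exactly as a two-point fit are our own modeling choices:

```python
import numpy as np

def fit_range_map(mu_prev, mu_curr):
    # Step 3: w_s, b_s minimizing J = ||w_s * mu_prev + b_s - mu_curr||^2;
    # with two points, the least-squares fit is the exact interpolant.
    w_s = (mu_curr[1] - mu_curr[0]) / (mu_prev[1] - mu_prev[0])
    b_s = mu_curr[0] - w_s * mu_prev[0]
    return w_s, b_s

def prediction_iteration(delta_prev, w_in, w_out, b_out, mu_prev, mu_curr):
    w_s, b_s = fit_range_map(mu_prev, mu_curr)
    mu_hat = w_s * np.asarray(mu_curr) + b_s                      # step 4
    s_tilde = np.maximum(w_out @ (w_in @ delta_prev) + b_out, 0)  # step 5, ReLU
    span = s_tilde.max() - s_tilde.min()                          # step 7: rescale
    return mu_hat[0] + (s_tilde - s_tilde.min()) / span * (mu_hat[1] - mu_hat[0])
```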
Acknowledgment

L. Tong would like to acknowledge the support of the Australian Research Council (grant number DP170104916).

References

[1] Mukherjee, S., Lu, D., Raghavan, B., Breitkopf, P., Dutta, S., Xiao, M., and Zhang, W., "Accelerating Large-Scale Topology Optimization: State-of-the-Art and Challenges," Archives of Computational Methods in Engineering, Vol. 28, No. 7, 2021, pp. 4549–4571. https://doi.org/10.1007/s11831-021-09544-3

[2] Maksum, Y., Amirli, A., Amangeldi, A., Inkarbekov, M., Ding, Y., Romagnoli, A., Rustamov, S., and Akhmetov, B., "Computational Acceleration of Topology Optimization Using Parallel Computing and Machine Learning Methods—Analysis of Research Trends," Journal of Industrial Information Integration, Vol. 28, July 2022, Paper 100352. https://doi.org/10.1016/j.jii.2022.100352

[3] Brunton, S. L., Nathan Kutz, J., Manohar, K., Aravkin, A. Y., Morgansen, K., Klemisch, J., Goebel, N., Buttrick, J., Poskin, J., Blom-Schieber, A. W., Hogan, T., and McDonald, D., "Data-Driven Aerospace Engineering: Reframing the Industry with Machine Learning,"

[16] Kazemi, H., Seepersad, C., and Kim, H. A., "Topology Optimization Integrated Deep Learning for Multiphysics Problems," AIAA Science and Technology Forum and Exposition, AIAA Paper 2022-0802, 2022. https://doi.org/10.2514/6.2022-0802

[17] Chi, H., Zhang, Y., Tang, T. L. E., Mirabella, L., Dalloro, L., Song, L., and Paulino, G. H., "Universal Machine Learning for Topology Optimization," Computer Methods in Applied Mechanics and Engineering, Vol. 375, March 2021, Paper 112739. https://doi.org/10.1016/j.cma.2019.112739

[18] Amir, O., Aage, N., and Lazarov, B. S., "On Multigrid-CG for Efficient Topology Optimization," Structural and Multidisciplinary Optimization, Vol. 49, No. 5, 2014, pp. 815–829. https://doi.org/10.1007/s00158-013-1015-5

[19] Gogu, C., "Improving the Efficiency of Large Scale Topology Optimization Through On-the-Fly Reduced Order Model Construction," International Journal for Numerical Methods in Engineering, Vol. 101, No. 4, 2015, pp. 281–304. https://doi.org/10.1002/nme.4797

[20] Kim, Y. Y., and Yoon, G. H., "Multi-Resolution Multi-Scale Topology Optimization—A New Paradigm," International Journal of Solids and Structures, Vol. 37, No. 39, 2000, pp. 5529–5559. https://doi.org/10.1016/S0020-7683(99)00251-6