An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data
https://doi.org/10.1017/pds.2022.161
Abstract
Engineers widely use topology optimization during the initial phase of product development to obtain a
first possible geometry design. The state-of-the-art approach is an iterative calculation, which requires both
time and computational power. This paper proposes an AI-assisted design method for topology optimization
that does not require any pre-optimized data. The presented AI-assisted design procedure generates
geometries that are similar to those of conventional topology optimizers while requiring only a fraction of the
computational effort.
1. Introduction
This paper deals with the solution of optimization problems by means of artificial
intelligence (AI) techniques. Topology optimization (TO) was chosen as an application example, even
though the described method is applicable to many optimization problems and is thus of general use.
TO is a method of optimizing the geometry of structures. In TO, the material distribution over a given
design domain is the subject of optimization, i.e. minimization of a given objective function while
satisfying given constraints (Sigmund and Maute, 2013). In most cases, suitable search algorithms
solve the optimization problem mathematically.
State-of-the-art approaches that combine AI and TO mostly require optimized geometries
generated by conventional TO, or FEM results, as a basis for training. For this reason, they are subject
to several limitations, such as large computational effort and the need to
prepare representative data.
The approach proposed here aims at removing those drawbacks by generating all the artificial
knowledge required for optimization during the learning phase, with no need to rely on pre-optimized
results.
2. Method
The presented method is based on an ANN architecture called predictor-evaluator-network (PEN), which
was developed by the authors for this purpose. The predictor is the trainable part of the PEN; its task
is to generate optimized geometries from the input data.
As mentioned, unlike the state-of-the-art methods, no conventionally topology-optimized or
computationally prepared data are used in the training. The geometries used for the training are created
by the predictor itself on the basis of randomly generated input data and evaluated by the remaining
components of the PEN, called evaluators (see Figure 1).
Figure 1. Overview of the predictor-evaluator-network (PEN): input data, predictor, evaluator 1, evaluator 2, quality function, minimization, output data
Figure 2. Design space overview with elements, nodes and dimension 𝒅 (square case)
In this work, we examined only square meshes with equal numbers of rows and columns. However,
this method can be used for non-square and three-dimensional geometries.
The total number of elements is as follows:
𝑛 = 𝑑𝑥 𝑑𝑦 (1)
where 𝑑𝑦 is the number of rows and 𝑑𝑥 the number of columns (see Figure 2). In the square case, the
numbers of rows and columns are equal (𝑑𝑥 = 𝑑𝑦 = 𝑑), so that the total number of elements is 𝑑².
The 𝑑² design variables 𝑥𝑖 (𝑖 = 1, … , 𝑑²), termed density values, scale the contributions of the single
elements to the stiffness matrix. The density has a value of one when the stiffness contribution of the
element is fully preserved and zero when it disappears.
The density values are collected in a vector 𝐱. In general, the density values 𝑥𝑖 are defined in the
interval [0, 1]. In order to prevent possible singularities of the stiffness matrix, a lower limit value
𝑥min for the entries of 𝐱 is set as follows (Bendsøe and Sigmund, 2003):
0 < 𝑥min ≤ 𝑥𝑖 < 1, 𝑖 = 1, 2, … , 𝑑². (2)
Although a binary selection of the density is desired (discrete TO, material present/not present), values
between zero and one are permitted for algorithmic reasons (continuous TO). To get closer to the
desired binary selection of densities, the so-called penalization can be used in the calculation of the
compliance. The penalization is realized by an element-wise exponentiation of the densities by the
penalization exponent 𝑝 > 1 (Sigmund, 2001).
The arithmetic mean of all 𝑥𝑖 defines the degree of filling of the geometry as follows:
𝑀is = (1/𝑑²) ∑_{𝑖=1}^{𝑑²} 𝑥𝑖 (3)
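To make these quantities concrete, the following minimal NumPy sketch illustrates the density vector, its lower bound, the penalization, and the degree of filling. It is illustrative only; the variable names and the value of the penalization exponent are assumptions, not the paper's implementation.

```python
import numpy as np

d = 4           # rows = columns in the square case
x_min = 1e-3    # lower bound that keeps the stiffness matrix regular, Eq. (2)
p = 3.0         # penalization exponent p > 1 (value assumed here)

# density vector of length d**2, clipped to the admissible interval
x = np.clip(np.random.rand(d * d), x_min, 1.0 - 1e-9)

x_pen = x ** p      # element-wise penalization used in the compliance calculation
M_is = x.mean()     # degree of filling, Eq. (3): (1/d^2) * sum of all x_i

print(f"degree of filling M_is = {M_is:.3f}")
```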
Figure 3. Nodes and elements at different levels 𝚲 (resolutions). The boundary conditions do
not change
Input data can be only defined at the initial level and do not change when the level is changed. Hence,
new nodes cannot be subject to static or kinematic boundary conditions (see Figure 3). When the level
is changed, only the dimension of the outputs changes; the dimension of the inputs remains constant.
The change in level occurs after a certain condition, which will be described later, is fulfilled.
2.2. Predictor
The predictor is responsible for generating, after training, the optimized result for a given input data
point. Its ANN-architecture consists of multiple hidden layers, convolutional layers and output layers.
All parameters that can be changed during training in order to minimize the target function, such as
the bias and the weights of the hidden layers, are generally referred to as trainable parameters in the
following. The predictor's topology is shown in Figure 4 in a simplified form.
Figure 4. Predictor’s artificial neural network (ANN) topology (simplified)
The input data (top left) are processed by several successive hidden blocks and then passed on to a series of
ResNet-blocks. A hidden block is a combination of a hidden layer and an activation. ResNet-blocks
consist of multiple convolutional and activation layers (He et al., 2016) (see Figure 4). At this stage,
the output is at the highest resolution. The sigmoid function is well suited as an activation function for
the output layer because it provides results in the interval (0, 1). This makes the predictor's output
directly suitable to describe the density values of the geometry. Average pooling is used in order to
reduce the resolution to a lower level Λ.
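A minimal Keras sketch of this structure is given below. It is a simplified stand-in rather than the exact layer stack of Figure 5: the input dimension, layer sizes, number of blocks, and the 20 × 20 element resolution are assumptions made only for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def resnet_block(x, filters=32):
    """Two convolutions with an additive skip connection (He et al., 2016)."""
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.LeakyReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.Add()([x, y])
    return layers.LeakyReLU()(y)

d = 20                          # illustrative highest resolution (d x d elements)
inp = layers.Input(shape=(5,))  # illustrative input data point (e.g. loads, filling degree)

# hidden blocks: dense layer + activation, then reshape to a 2D field
h = layers.Dense(256)(inp)
h = layers.LeakyReLU()(h)
h = layers.Dense(d * d * 32)(h)
h = layers.LeakyReLU()(h)
h = layers.Reshape((d, d, 32))(h)

# ResNet-blocks at the highest resolution
for _ in range(3):
    h = resnet_block(h)

# sigmoid output yields density values in (0, 1) at the highest level
out_high = layers.Conv2D(1, 1, activation="sigmoid", name="level_high")(h)

# average pooling produces the lower-resolution output of a lower level
out_low = layers.AveragePooling2D(pool_size=2, name="level_low")(out_high)

predictor = tf.keras.Model(inp, [out_high, out_low])
predictor.summary()
```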
2.8. Training
Within one batch, the input data points are randomly generated, and the predictor creates the
corresponding geometries 𝐱. Afterwards, the quality function is computed from the evaluators' losses.
The value of the objective function is then calculated for the whole batch. Then, the gradient of the
objective function, with respect to the trainable parameters, is calculated. The trainable parameters of
the predictor for the next batch are then adjusted according to the gradient descent method to decrease
the value of the objective function. In order to apply the gradient descent method, the functions must
be differentiable with respect to the trainable parameters (Kingma and Ba, 2017). For this reason, the
evaluators and the objective function use only differentiable functions.
When the level increases, the predictor outputs a geometry with higher resolution, and the process
starts again.
It is important to stress that, unlike conventional topology optimization, the PEN method does not
optimize the density values of the geometry, but only the weights of the predictor.
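The following sketch outlines one such training step, reusing the predictor model from the previous sketch. It is a strongly simplified assumption of the procedure: the quality function shown here is only a placeholder that penalizes deviation from a target filling degree, whereas the actual evaluators compute the compliance and filling-degree losses with differentiable TensorFlow operations.

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)  # Adam per Kingma and Ba (2017)

def quality_function(x_batch, inputs):
    """Placeholder for the evaluators' losses; must use only differentiable TF ops."""
    target_filling = inputs[:, -1]                       # assumed last input feature
    filling = tf.reduce_mean(x_batch, axis=[1, 2, 3])    # degree of filling per geometry
    return tf.reduce_mean((filling - target_filling) ** 2)

@tf.function
def train_step(batch_size=64):
    # randomly generated input data points; no pre-optimized geometries are used
    inputs = tf.random.uniform((batch_size, 5))
    with tf.GradientTape() as tape:
        x_batch = predictor(inputs, training=True)[0]    # geometries at the current level
        objective = quality_function(x_batch, inputs)    # objective for the whole batch
    # gradients w.r.t. the predictor's trainable parameters only,
    # not w.r.t. the density values of the geometries themselves
    grads = tape.gradient(objective, predictor.trainable_variables)
    optimizer.apply_gradients(zip(grads, predictor.trainable_variables))
    return objective
```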
3. Application
3.1. Implementation
The presented method is implemented in the programming language Python. The
framework Tensorflow with the Keras programming interface is used. Tensorflow is an open-source
platform for the development of machine-learning applications (Abadi et al., 2015). In Tensorflow, the
gradients necessary for the predictor learning are calculated using automatic differentiation, which
requires the use of differentiable functions available in Tensorflow (Baydin et al., 2015).
The predictor's topology, with all layers and all hyperparameters, is shown in Figure 5. The chosen
hyperparameters were determined by a grid search over all parameters, in which the deviations
of the predictions from results obtained by conventional TO were evaluated. The hyperparameters
are displayed by the shape (numerical expression over the arrow pointing outside the block) of the
output matrix of a block or by the comment near the convolutional block. The label of the output
arrow describes the dimensions of the output vector or matrix. The names of the shapes in Figure 5,
e.g. "Conv2D", correspond to the Keras layer names.
Figure 5. Predictor's ANN topology with all layers and hyperparameters; layer names correspond to Keras layer names, shapes are given as height × width × channels
3.2. Results
The training of the predictor lasted 3.25 h (3:15:56), which can be subdivided according to the
individual levels as follows: 16%, 7%, 42%, 35%. The training processed approximately 7.6 million
randomly generated training data points. As expected, the training time increases proportionally with
the size of the geometry. While the first level processed over 3400 data points per second (dps), the
throughput decreased with each subsequent level (928 dps at the second level and markedly fewer at the
two highest levels). This is due to the additional computational effort and to the smaller batch sizes
required by the higher memory demand of the higher levels with constant available memory.
The training history shows the progression of the objective function (see Figure 6). The smaller batch
size at higher levels produces more oscillation of the curve and therefore makes it more difficult to identify
a trend. For this reason, the curves shown in the figures are filtered using an exponential moving
average with a constant smoothing factor of 0.873 (Nicolas, 2017) for all levels. This filtering does not
affect the original objective function and serves visualization purposes only.
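For illustration, the filtering can be sketched as follows; this is a minimal version of an exponential moving average, and the convention that the smoothing factor weights the previous smoothed value (as in common training-curve smoothing) is an assumption.

```python
def smooth(values, factor=0.873):
    """Exponential moving average used only for plotting the training history."""
    smoothed, last = [], values[0]
    for v in values:
        last = factor * last + (1.0 - factor) * v
        smoothed.append(last)
    return smoothed
```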
Figure 7. Comparison of computing time (left), compliance (middle) and degree of filling (right)
The examples in Figure 8 show that the predictor can deliver geometries that are similar to those of the
conventional method, but they also reveal some weaknesses. For instance, some geometries lack details (see
Figure 8, column four or five). This may be improved by an appropriate choice of layers or
hyperparameters of the predictor and by adapting the quality function. For all sample geometries in
Figure 8, the compliance is reported under the geometry diagram.
3.3. Interactivity
Because the predictor delivers the optimized geometry quickly, the ANN-based TO can be executed
interactively in a browser. At https://www.tu-chemnitz.de/mb/mp/forschung/ai-design/TODL/
(accessed on 22 February 2022), it is possible to perform investigations with different degrees
of filling as well as different static boundary conditions.
4. Conclusions
In this work, a method was presented to train an ANN using online deep learning and to use it to solve
optimization problems. In the context of the paper, topology optimization (TO) was chosen as the application example.
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., et al. (2015), “TensorFlow:
Large-Scale Machine Learning on Heterogeneous Distributed Systems”, ArXiv:1603.04467 [Cs], available
at: https://www.tensorflow.org/.
Abueidda, D.W., Koric, S. and Sobh, N.A. (2020), “Topology optimization of 2D structures with nonlinearities
using deep learning”, Computers & Structures, Vol. 237, p. 106283.
Andreassen, E., Clausen, A., Schevenels, M., Lazarov, B.S. and Sigmund, O. (2011), “Efficient topology
optimization in MATLAB using 88 lines of code”, Structural and Multidisciplinary Optimization, Vol. 43
No. 1, pp. 1–16.
Ates, G.C. and Gorguluarslan, R.M. (2021), “Two-stage convolutional encoder-decoder network to improve the
performance and reliability of deep learning models for topology optimization”, Structural and
Multidisciplinary Optimization, available at: https://doi.org/10.1007/s00158-020-02788-w.
Basheer, I.A. and Hajmeer, M. (2000), “Artificial neural networks: fundamentals, computing, design, and
application”, Journal of Microbiological Methods, Vol. 43 No. 1, pp. 3–31.
Baydin, A.G., Pearlmutter, B.A., Radul, A.A. and Siskind, J.M. (2015), “Automatic differentiation in machine
learning: a survey”, ArXiv:1502.05767 [Cs, Stat], available at: http://arxiv.org/abs/1502.05767 (accessed 23
September 2019).
Behzadi, M.M. and Ilies, H.T. (2021), “GANTL: Towards Practical and Real-Time Topology Optimization with
Conditional GANs and Transfer Learning”, Journal of Mechanical Design, pp. 1–32.
Bendsøe, M.P. and Sigmund, O. (2003), Topology Optimization: Theory, Methods, and Applications, Springer,
Berlin; Heidelberg; New York.