Thermal Error Modeling Based on BiLSTM Deep Learning for CNC Machine Tool

Adv. Manuf.
(2021) 9:235–249
https://doi.org/10.1007/s40436-020-00342-x
Pu-Ling Liu1 • Zheng-Chun Du1 • Hui-Min Li1 • Ming Deng1 • Xiao-Bing Feng1 •
Jian-Guo Yang1
Received: 1 September 2020 / Revised: 21 October 2020 / Accepted: 26 December 2020 / Published online: 21 February 2021
© The Author(s) 2021
✉ Zheng-Chun Du, zcdu@sjtu.edu.cn
1 School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People’s Republic of China

Abstract The machining accuracy of computer numerical control machine tools has always been a focus of the manufacturing industry. Among all errors, thermal error affects the machining accuracy considerably. Because of the significant impact of Industry 4.0 on machine tools, existing thermal error modeling methods have encountered unprecedented challenges in terms of model complexity and capability of dealing with a large number of time series data. A thermal error modeling method is proposed based on bidirectional long short-term memory (BiLSTM) deep learning, which has good learning ability and a strong capability to handle a large group of dynamic data. A four-layer model framework that includes BiLSTM, a feedforward neural network, and the max pooling is constructed. An elaborately designed algorithm is proposed for better and faster model training. The window length of the input sequence is selected based on the phase space reconstruction of the time series. The model prediction accuracy and model robustness were verified experimentally by three validation tests in which thermal errors predicted by the proposed model were compensated for real workpiece cutting. The average depth variation of the workpiece was reduced from approximately 50 μm to less than 2 μm after compensation. The reduction in maximum depth variation was more than 85%. The proposed model was proved to be feasible and effective for improving machining accuracy significantly.

Keywords Thermal error · Error modeling · Bidirectional long short-term memory (BiLSTM) · Phase space reconstruction · Computer numerical control (CNC) machine tool

1 Introduction

With the continuous improvement of production automation, manufacturing industries have increasingly higher requirements for the machining accuracy of computer numerical control (CNC) machine tools. An increase in temperature results in thermal deformation, and thermal deformation leads to a relative displacement at the cutting point and, thus, influences the accuracy of the workpiece being produced [1]. Thermally induced errors have become one of the largest machine tool error sources in precision and ultraprecision machining. Up to 75% of the overall geometrical errors of machined workpieces can be induced by the effects of temperature [2]. Thermal error plays a significant role in limiting the accuracy of the produced workpiece.

To eliminate the negative influence of thermal error on the machining accuracy, a variety of thermal error modeling approaches have been proposed by many researchers worldwide. For most modeling methods, the relationships between the temperatures and thermal errors are studied based on either the numerical simulation results or experimental data [3], namely, the principal-based model (PBM) and empirical-based model (EBM). The PBM builds the relationship between thermal errors and heat generated by a system of nonlinear differential equations, which frequently employs the principles of friction-induced heat, heat conduction, heat convection, heat radiation, and thermal expansion in the model construction. Liu et al. [4]
established a thermal resistance network model of the motorized spindle system of a CNC machining tool. The influences of the radial and axial thermal conduction resistance and thermal convection resistance between the cooling system and the components of the spindle system on the temperature rise are considered in the model. Li et al. [5] studied the modal characteristics under both a steady state and a static state of the spindle by using the finite-element method. The thermal characteristics, vibration modes, and natural frequencies were also analyzed. Liu et al. [6] analyzed the thermal behavior of the machine-workpiece system and established a thermal error prediction model based on heat transfer theory. The comprehensive thermal error of the machine-workpiece system was reduced to approximately 20%.

The EBM is a kind of black-box methodology that assumes thermal errors can be considered as a function of some critical thermal discrete temperature points on the machine. Regression analysis and artificial neural networks (ANNs) are the most commonly used EBM methods. Ye et al. [7] proposed a thermal error regression model for determining the thermal deformation coefficient of the moving shaft of a gantry milling machine. Grama et al. [8] developed a thermal error compensation model in a linear regression framework, where the thermal key point was selected using principal component analysis and k-means clustering. Li et al. [9] established a thermal error prediction model based on an improved backpropagation (BP) neural network. The temperature measurement points were clustered by a self-organization mapping neural network. Improved particle swarm optimization is used to optimize the parameters of the BP neural network. The prediction accuracy of the improved model for the spindle thermal error is 93.1%.

The prediction accuracy of the PBM and EBM methods has been demonstrated to be satisfactory in many applications. However, the existing modeling methods have encountered unprecedented challenges because of the significant impact of Industry 4.0 on machine tools. On the one hand, the accuracy of the PBM model mainly depends on the accuracy of the heat source model and heat transfer model. It is difficult to obtain an accurate thermal error model because it depends highly on a number of factors, such as machine working cycles, the use of coolants, and environmental conditions [10]. The PBM models become highly complicated if they consider all the essential influential effects within a machine tool, and they finally reach their limit owing to the increasing complexity and high demand from production [11]. On the other hand, the thermal error data of CNC machine tools are a group of sequential, large, fast, and continuous nonlinear data sequences, which are constantly generated with the operation of the machine tools. The existing EBM models are incapable of processing such a large group of data with high efficiency and good robustness. Therefore, a thermal error model with a relatively simple framework, as well as the ability to handle big data with learning abilities, is desired.

Deep learning has rapidly advanced and has achieved state-of-the-art performance in various fields. Deep learning can learn complex features by combining simple features learned from data. It takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches in various machine learning tasks [12]. The state of the bidirectional long short-term memory (BiLSTM) deep learning network changes with time. It can essentially be regarded as a dynamic system. The thermal error of the CNC machine tool changes with the change in the machine temperature field. The thermal error data have time sequence and continuity characteristics, and the current and historical states are interrelated. Therefore, the thermal error accords with the deep learning network in terms of the dynamic time series. The use of BiLSTM deep learning to extract the temporal and spatial characteristics of dynamic time series data is conducive to the accurate prediction of thermal error.

In this article, a thermal error modeling method based on BiLSTM deep learning is proposed. The model consists of one BiLSTM layer, two feedforward neural network (FFN) layers, and one max-pooling layer. A training algorithm is designed to realize the model training. The window length of the input sequence is determined by the proposed input sequence construction method based on the phase space reconstruction of the time series. In addition, verification tests were conducted theoretically and practically to prove the accuracy and robustness of the proposed BiLSTM-based thermal error model. The basic principle of BiLSTM is introduced in Sect. 2. The process of establishing the thermal error model, which includes the model framework, training algorithm, and input sequence construction method, is given in Sect. 3. In Sect. 4, the verification tests and application results of the proposed BiLSTM-based thermal error model are described. Finally, some conclusions are presented in Sect. 5.

2 Basic principle of BiLSTM

The concept of long short-term memory (LSTM) was proposed by Hochreiter and Schmidhuber [13]. The basic LSTM unit consists of three gates and two conveyor belts that protect the state of each neuron. It controls the transfer path of information through the gating mechanism. Four neural network layers interact with each other in a special way in one LSTM unit, as shown in Fig. 1.

The state calculations for each step are presented as [14]
$$F_t = \sigma\left(W_f \cdot [H_{t-1}, X_t] + b_f\right), \tag{1}$$

$$I_t = \sigma\left(W_i \cdot [H_{t-1}, X_t] + b_i\right), \tag{2}$$

$$O_t = \sigma\left(W_o \cdot [H_{t-1}, X_t] + b_o\right), \tag{3}$$

$$C'_t = \tanh\left(W_c \cdot [H_{t-1}, X_t] + b_c\right), \tag{4}$$

$$C_t = F_t \odot C_{t-1} + I_t \odot C'_t, \tag{5}$$

$$H_t = O_t \odot \tanh(C_t), \tag{6}$$

where $F_t$ represents the forget gate at time t, which decides what information is discarded from the cell state. In addition, $I_t$ represents the input gate, which decides what new information is stored in the cell state, and $O_t$ represents the output gate, which decides what information is output. These three gates use the sigmoid function $\sigma$ to filter information. A value of zero means "let nothing through", while a value of one means "let everything through". Moreover, $C'_t$ is the candidate cell state; $C_t$ is the cell state; $H_t$ is the hidden layer state; $X_t$ is the model input; $W_f$, $W_i$, $W_o$ and $W_c$ are the weight matrices; $b_f$, $b_i$, $b_o$ and $b_c$ are the biases; and "$\cdot$" and "$\odot$" denote the matrix multiplication and element multiplication, respectively.

Fig. 1 Illustration of LSTM unit structure

The above operating mechanism of LSTM is explicitly designed to avoid the long-term dependency problem, which is common for recurrent neural networks (RNNs). LSTM can remove or add information to the cell state, which is carefully regulated by different gates. Therefore, learning from information over a long period of time is the default behavior of LSTM.

The BiLSTM consists of two LSTM layers with opposite directions, as shown in Fig. 2. The hidden layer state $\overrightarrow{H}_t$ encodes the information features in the forward direction, while the hidden layer state $\overleftarrow{H}_t$ encodes the information features in the backward direction. BiLSTM can learn information more precisely by utilizing the forward and backward orders of the information sequence. Thus, it is especially suitable for dealing with time series data.

Fig. 2 Illustration of BiLSTM model structure

3 Thermal error modeling based on BiLSTM

3.1 Model framework

The proposed model framework is illustrated in Fig. 3. The input layer contains the model input sequence at and before time t. The window length n of the input sequence is discussed in Sect. 3.3.2. The output layer gives the thermal error value predicted by the model at time t + 1, which is marked as $Y_{t+1}$, and h is the hidden size of the model. The hidden layer consists of four layers: one BiLSTM layer, two FFN layers, and one max-pooling layer.

(i) The BiLSTM layer is the core part of the model. The main function of the BiLSTM layer is feature extraction from the input sequence. Through a series of operations, training, and learning using Eqs. (1)–(6), the weight parameters $W_f$, $W_i$, $W_o$ and $W_c$ and bias parameters $b_f$, $b_i$, $b_o$ and $b_c$ are obtained to finalize the structure of the model.

(ii) The FFN layers are used to realize dimensionality reduction by linear transformation. The first FFN layer reduces the hidden layer state of the BiLSTM from 2h to h, and the second FFN layer reduces the prediction vector from h to 1, i.e., a predicted single thermal error value.

(iii) The max-pooling layer is used after the first FFN layer to transform an h-dimension hidden state of the BiLSTM into an h-dimension hidden space vector, namely the prediction vector in Fig. 3, by its characteristic of translation invariance.

3.2 Training algorithm

The model training is carried out according to an elaborately designed algorithm based on the above framework, which is shown in Fig. 4. There are four main steps.
Fig. 3 Illustration of model framework
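The framework of Sect. 3.1 can be sketched in plain Python as follows. This is a minimal, untrained illustration and not the authors' implementation: the helper names (`make_model`, `predict`, `lstm_step`), the zero initial states, the random Gaussian weights, and the reading of the max-pooling layer as a maximum over the n time steps are all assumptions made here. The sizes follow Table 3 (9 inputs, window length 5, hidden size 20).

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def rand_matrix(rows, cols, rng):
    # Assumed initialization: zero-mean Gaussian with std sqrt(2 / fan_in).
    std = math.sqrt(2.0 / cols)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

def make_lstm(input_size, h, rng):
    """One LSTM direction: (W, b) pairs for the F, I, O gates and the candidate C'."""
    return {k: (rand_matrix(h, h + input_size, rng), [0.0] * h) for k in "FIOC"}

def lstm_step(p, H_prev, C_prev, X_t):
    """Eqs. (1)-(6) for a single time step."""
    z = H_prev + X_t                                      # concatenation [H_{t-1}, X_t]
    F = [sigmoid(a) for a in affine(*p["F"], z)]          # Eq. (1)
    I = [sigmoid(a) for a in affine(*p["I"], z)]          # Eq. (2)
    O = [sigmoid(a) for a in affine(*p["O"], z)]          # Eq. (3)
    C_cand = [math.tanh(a) for a in affine(*p["C"], z)]   # Eq. (4)
    C = [f * c + i * g for f, c, i, g in zip(F, C_prev, I, C_cand)]  # Eq. (5)
    H = [o * math.tanh(c) for o, c in zip(O, C)]          # Eq. (6)
    return H, C

def run_lstm(p, seq, h):
    """Run one direction over a sequence and collect every hidden state."""
    H, C, out = [0.0] * h, [0.0] * h, []
    for X_t in seq:
        H, C = lstm_step(p, H, C, X_t)
        out.append(H)
    return out

def make_model(input_size, h, seed=0):
    rng = random.Random(seed)
    return {
        "fwd": make_lstm(input_size, h, rng),
        "bwd": make_lstm(input_size, h, rng),
        "ffn1": (rand_matrix(h, 2 * h, rng), [0.0] * h),  # first FFN: 2h -> h
        "ffn2": (rand_matrix(1, h, rng), [0.0]),          # second FFN: h -> 1
    }

def predict(model, window, h):
    """window: n input vectors X_{t-n+1}..X_t; returns a scalar for Y_{t+1}."""
    fwd = run_lstm(model["fwd"], window, h)
    bwd = run_lstm(model["bwd"], window[::-1], h)[::-1]   # re-align to forward time
    states = [f + b for f, b in zip(fwd, bwd)]            # n vectors of size 2h
    reduced = [affine(*model["ffn1"], s) for s in states] # n vectors of size h
    pooled = [max(col) for col in zip(*reduced)]          # max over time, size h
    return affine(*model["ffn2"], pooled)[0]

# Usage with the sizes of Table 3 and synthetic, already normalized inputs.
rng = random.Random(1)
window = [[rng.random() for _ in range(9)] for _ in range(5)]
model = make_model(input_size=9, h=20)
y_next = predict(model, window, h=20)
```

With trained weights in place of the random ones, `y_next` would play the role of the predicted thermal error at time t + 1.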
Fig. 4 Flow chart of the model training algorithm

(i) Hyperparameters must be set before model training. These parameters are used to define high-level concepts of the model, such as complexity or learning ability. Details are discussed in Sect. 3.3.3.

(ii) Data preprocessing helps the model converge more efficiently and accurately. Different variables often have different dimensions and units, which negatively influence the results of data analysis. The method of deviation standardization is used to eliminate the dimensional influence between variables, which makes the data processing more convenient and faster. Deviation standardization is a linear transformation of the original variable $x$, so that the transformed result $x^*$ is mapped to [0,1]. The conversion function is given as

$$x^* = \frac{x - S_{\min}}{S_{\max} - S_{\min}}, \tag{7}$$

where $S_{\min}$ is the minimum value of the sample data, and $S_{\max}$ is the maximum value of the sample data.

(iii) Weight initialization has a significant influence on the final training results of the model. If the initial value of the weight is too large, it causes a gradient explosion and makes the model unable to converge. If the initial value of the weight is too small, the gradient disappears, and the model converges slowly or converges to the local minimum value. The method of Kaiming initialization is adopted because the activation function sigmoid in the BiLSTM model is not symmetric about zero, which improves the convergence of the model. Kaiming initialization draws the weights from a Gaussian distribution with a mean value of zero and a standard deviation of $\sqrt{2/F_{in}}$, shown as

$$w \sim G\left(0, \sqrt{\frac{2}{F_{in}}}\right), \tag{8}$$

where $F_{in}$ is the number of input neurons, and $w$ is the weight.

(iv) Model parameter training determines the weight parameters and bias parameters of the model. The determination of model parameters means the finalization of the model. First, a loss function that evaluates the inconsistency between the predicted value and the real value of the model must be defined. The model established is a thermal error prediction model, which belongs to the category of regression problems (as opposed to
classification problems). Therefore, mean squared error is selected as the loss function, shown as

$$J = \frac{\sum_{i=1}^{k} \left(y_i - y'_i\right)^2}{k}, \tag{9}$$

where $J$ is the loss function, $k$ the sample size, $y_i$ the measured value, and $y'_i$ the predicted value.

Second, the parameter gradient is calculated by the loss function, and the gradient is optimized by appropriate methods to minimize the loss function and make the model converge. In the gradient calculation, the L2 regularization method is used to limit the scale of the parameters to prevent overfitting, that is, to add a regularization term after the loss function, shown as

$$J' = J + \frac{\lambda}{2k} \|w\|_2^2, \tag{10}$$

where $J'$ is the loss function after regularization, and $\lambda$ is the regularization parameter.

Next, a gradient clipping process is added to prevent the gradient explosion and ensure the convergence of the training process. By setting the clipping threshold, i.e., the maximum gradient norm, the gradient exceeding the threshold is regulated, shown as

$$t^* = \begin{cases} \dfrac{\theta}{\|t\|_2}\, t, & \text{if } \|t\|_2 \ge \theta, \\ t, & \text{otherwise}, \end{cases} \tag{11}$$

where $t$ is the tensor, $\theta$ the clipping threshold, and $\|t\|_2$ the L2 norm of $t$.

Thereafter, the Adam optimization algorithm is used to adapt the learning rate of each parameter to achieve a better and faster convergence of the training process. The iteration formula of the Adam algorithm is shown as

$$\hat{v}_t \leftarrow \frac{v_t}{1 - \beta_1^t}, \tag{12}$$

$$\hat{s}_t \leftarrow \frac{s_t}{1 - \beta_2^t}, \tag{13}$$

$$g'_t \leftarrow \frac{\eta\, \hat{v}_t}{\sqrt{\hat{s}_t + \varepsilon}}, \tag{14}$$

where $v_t$ is the momentum variable (first moment), $s_t$ the exponential weighted moving average variable by the square of elements (second moment), $\hat{v}_t$ and $\hat{s}_t$ the revised $v_t$ and $s_t$, respectively, after deviation correction, $\beta_1$ a constant of the first-order decay rate, $\beta_2$ a constant of the second-order decay rate, $\eta$ the learning rate, $\varepsilon$ the stability constant, and $g'_t$ the calculation update after Adam optimization.

Finally, the parameter weight $w$ is iterated according to

$$w_t \leftarrow w_{t-1} - g'_t. \tag{15}$$

One-time training of the model parameters is completed. Then, a validation dataset is put in the model to calculate the loss function. If the loss function of the validation dataset cannot meet the requirements after all iterations are completed, hyperparameters must be optimized until a satisfactory loss function value of the validation dataset is obtained.

3.3 Thermal error model establishment

3.3.1 Data collection

Thermal error exists in all three axes of the machine coordinate system. In this article, only the z-direction is taken as an example to illustrate the modeling process. The studied machine tool is a horizontal machining center. The simplified machine tool structure is shown in Fig. 5. The axial direction of the spindle is in the z-direction. The spindle moves in the x- and y-directions. The workpiece is fixed on the workbench and moves in the z-direction. The modeling object is the comprehensive axial spindle thermal error at the tool center point (TCP), which includes the thermal error of the machine tool and the thermal error of the tool.

A computer-controlled testing setup was built to measure the error data automatically at the TCP and temperature data at various locations in the machine tool. The sampling interval was 2 s. An Omron eddy current sensor and a PT101 thermal sensor were used. Figure 6 shows the placement of the eddy current sensor during the measurement. The mandrel (the tool before its blades are made) is used instead of the real tool because the tip of the three-blade milling cutter is not a flat plane and therefore cannot be measured by an eddy current sensor.

The error measured at the TCP, marked as $E_{TCP}$, includes the thermal error of the machine tool $E_{TM}$, thermal error of the tool $E_{TT}$, static errors of the machine tool $E_{SM}$, and static errors of the tool $E_{ST}$. The comprehensive thermal error $E_{TMT}$ at the TCP can be obtained using

$$E_{TCP} = E_{TM} + E_{SM} + E_{TT} + E_{ST}, \tag{16}$$
Fig. 5 Schematic diagram of machine tool structure
$$E_{TMT} = E_{TM} + E_{TT}, \tag{17}$$

$$E_{TMT} = E_{TCP} - E_{SM} - E_{ST}, \tag{18}$$

where $E_{SM}$ is determined experimentally, and $E_{ST}$ is calculated according to linear-elastic physical principles [15].

Figure 7 shows the measurement result of $E_{TMT}$. The blue line is the rotation speed of the spindle during the measurement. The temperature data of each position are shown in Fig. 8. Table 1 gives the details of the temperature sensor positions, which are shown in Fig. 5. Except for the room temperature sensor, all the temperature sensors were built into the machine tool. The measurement was repeated three more times under different room temperatures to obtain enough datasets for model training and model validation. Two datasets were used for model training; one dataset was used for model validation; and another one was used for model testing.

Fig. 6 Placement of eddy current sensor for thermal error measurement

Fig. 7 Measuring data of thermal error $E_{TMT}$

Fig. 8 Temperature data measuring results

Table 1 Position of temperature sensor

| Parameter | Definition |
|---|---|
| $T_S$ | Motor of spindle |
| $T_x$ | Motor of x-axis |
| $T_y$ | Motor of y-axis |
| $T_z$ | Motor of z-axis |
| $T_{B1}$ | Front bearing of spindle |
| $T_{B2}$ | Front bearing of spindle |
| $T_{B3}$ | Back bearing of spindle |
| $T_{room}$ | Ambient temperature |

3.3.2 Input sequence construction

In the input layer, as shown in Fig. 3, $\{X_{t-n+1}, X_{t-n+2}, \ldots, X_t\}$ is the input sequence of the model, where n is the window length. Moreover, n − 1 is the number of input sequences, and it is also the number of steps to form a deep learning network in the time dimension. Steps that are too short lead to information loss, while steps that are too long lead to information redundancy and even invalid information, which affect the model accuracy.

A window length selection method based on the phase space reconstruction of a time series is proposed [16]. According to the theorem of Takens, there exists a best embedding dimension of the reconstructed phase space that has the same geometric characteristics as the original space. This best embedding dimension m is the optimal number of input sequences, i.e., $m = n - 1$. When two nonadjacent points in a high-dimensional phase space are projected into a low-dimensional phase space, they may become two adjacent points, that is, false adjacent points. With the increase in the dimension of the reconstructed phase space, the false adjacent points are eliminated gradually.

The time series of input parameter $x$ in $X_t$ can be expressed as $\{x_i \mid i = 1, 2, \ldots, N\}$, where N is the length of the time series. Then, the reconstructed phase space of $x$ with embedding dimension m and delay time $\tau$ can be expressed as

$$x' = \{x'_1, x'_2, \ldots, x'_i, \ldots, x'_M\}, \tag{19}$$

$$x'_i = \{x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}\}, \tag{20}$$

where $1 \le i \le M$ and $M = N - (m-1)\tau$.
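The reconstruction of Eqs. (19) and (20) amounts to a delay embedding of the series. A minimal sketch in plain Python is given below; the function name `delay_embed` and the toy series are illustrative assumptions, and the code uses 0-based indexing where the text uses 1-based indices.

```python
def delay_embed(x, m, tau=1):
    """Return the reconstructed points x'_1..x'_M of Eqs. (19)-(20).

    x   : time series as a list of numbers, length N
    m   : embedding dimension
    tau : delay time, in samples
    """
    M = len(x) - (m - 1) * tau          # number of reconstructed points
    if M < 1:
        raise ValueError("series too short for this m and tau")
    # Point i collects x_i, x_{i+tau}, ..., x_{i+(m-1)tau}.
    return [[x[i + k * tau] for k in range(m)] for i in range(M)]

# Toy example: N = 6, m = 3, tau = 2 gives M = 6 - 2 * 2 = 2 points.
series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
points = delay_embed(series, m=3, tau=2)
```

Here `points[0]` holds the first, third and fifth samples of the series, which corresponds to $x'_1 = \{x_1, x_3, x_5\}$ in the notation of Eq. (20).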
For each $x'_i$, there is an adjacent point $x'_j$ based on the Euclidean norm $R_i(m)$, shown as

$$R_i(m) = \left\| x'_i(m) - x'_j(m) \right\|_2, \tag{21}$$

where $1 \le j \le M$. When the dimension changes from m to m + 1, the distance becomes

$$R_i(m+1) = \left\| x'_i(m+1) - x'_j(m+1) \right\|_2. \tag{22}$$

The relation between $R_i(m)$ and $R_i(m+1)$ is shown as

$$\left(R_i(m+1)\right)^2 = \left(R_i(m)\right)^2 + \left\| x'_{i+m\tau} - x'_{j+m\tau} \right\|_2^2. \tag{23}$$

$d_i(m)$ represents the degree of distance change and is described as

$$d_i(m) = \frac{\left\| x'_{i+m\tau} - x'_{j+m\tau} \right\|_2}{R_i(m)}. \tag{24}$$

If $d_i(m)$ exceeds a certain threshold $d_s$, then $x'_j(m)$ is a false adjacent point of $x'_i(m)$. The embedding dimension m at which the false adjacent points are eliminated is taken as the best embedding dimension of the reconstructed phase space. Here $d_s$ is set as 30%, and the calculation of $d_i(m)$ starts from $m = 2$. The calculation results of the best embedding dimension are shown in Table 2. The maximum embedding dimension of all input parameters is 4; therefore, the window length of the input sequence n is set to 5.

Table 2 Embedding dimension for each input parameter

| Input parameter | Definition | Embedding dimension |
|---|---|---|
| $x_1$ | $T_S$ | 3 |
| $x_2$ | $T_x$ | 2 |
| $x_3$ | $T_y$ | 2 |
| $x_4$ | $T_z$ | 2 |
| $x_5$ | $T_{B1}$ | 3 |
| $x_6$ | $T_{B2}$ | 3 |
| $x_7$ | $T_{B3}$ | 4 |
| $x_8$ | $T_{room}$ | 2 |
| $x_9$ | Rotation speed | 4 |

3.3.3 Hyperparameter setting

Setting the appropriate hyperparameters is helpful to improve the performance and effectiveness of model learning. Based on the training algorithm proposed in Sect. 3.2, the following hyperparameters, which are shown in Table 3, must be defined before training. The window length is determined by the proposed method based on phase space reconstruction. Hidden size, batch size, epoch, shuffle, learning rate, regulation norm, and maximum gradient norm are determined by a simple searching method. Other hyperparameters, including those not listed in the table, are set according to experience value or default value. The best group of hyperparameters that has the lowest loss function of the validation dataset is also listed in Table 3.

Table 3 Hyper-parameters setting

| Hyper-parameter | Setting value |
|---|---|
| Hidden size | 20 |
| Window length | 5 |
| Input length | 9 |
| Batch size | 128 |
| Epoch | 1 000 |
| Shuffle | True |
| Learning rate | 0.001 |
| Regulation norm | 0.01 |
| Maximum gradient norm | 10 |
| Adam first order decay rate | 0.9 |
| Adam second order decay rate | 0.999 |
| Adam stability constant | $10^{-8}$ |

Table 4 Optimal loss function of validation dataset and test dataset

| Loss function validation dataset | Loss function test dataset | Loss function requirement |
|---|---|---|
| 0.072 | 0.024 | < 1 |

Fig. 9 Comparison between the model predicted thermal error and the measured thermal error

Table 5 Evaluation result of each model

| Model | RMSE | MAE/μm | MAPE | Max absolute residual error/μm | Prediction accuracy/% |
|---|---|---|---|---|---|
| FFN | 0.605 | 0.508 | 1.29 | 1.74 | 90.8 |
| RNN | 0.423 | 0.331 | 0.91 | 1.56 | 91.3 |
| BiRNN | 0.378 | 0.313 | 0.85 | 1.63 | 91.8 |
| BiLSTM | 0.154 | 0.111 | 0.34 | 0.98 | 95.7 |

Fig. 10 Thermal error prediction results of the other three models

Fig. 11 A zoom-in observation of the comparison between four models

Fig. 12 Thermal error measurement data and model prediction data

Fig. 13 Temperature measurement data of robustness verification test

Fig. 14 Real picture of a machined workpiece

Table 6 Operation procedures of robustness verification test

| Sequence number | Rotation speed of spindle/(r·min⁻¹) | Running time/s |
|---|---|---|
| 1 | 12 000 | 3 600 |
| 2 | 0 | 1 200 |
| 3 | 6 000 | 600 |

Table 7 Dimension of each hole

| Hole number | Diameter/mm | Depth/mm |
|---|---|---|
| 1 | 30 | 8 |
| 2 | 30 | 5 |
| 3 | 30 | 10 |
| 4 | 50 | 8 |
| 5 | 50 | 10 |
| 6 | 50 | 5 |
| 7 | 80 | 5 |
| 8 | 80 | 8 |
| 9 | 80 | 10 |

3.3.4 Model training results

The model parameters are determined according to the optimal loss function value of the validation dataset, and the model was further verified by the test dataset. Table 4 shows the optimal loss function of the validation dataset and test dataset. Based on the model that is established with the above hyperparameters, the loss function of the validation dataset meets the requirement, and the loss function level of the test dataset shows the good prediction accuracy
of the model. A comparison between the model-predicted thermal error and the measured thermal error of the test dataset is shown in Fig. 9. The residual error is less than 1 μm, and the prediction accuracy is more than 95.7%, which is a fairly good performance.

Fig. 15 Picture of test site and CNC machine tool

Fig. 16 Schematic diagram of the thermal error compensation system

Table 8 Descriptions on the cutting process

| Step | Spindle speed/(r·min⁻¹) | Time/min | Feed rate/(mm·min⁻¹) |
|---|---|---|---|
| Facing milling | 2 000 | 4 | 600 |
| Rough milling | 5 000 | 8 | 3 000 |
| Semi-finish milling | 2 000 | 5 | 250 |
| Finish milling | 2 000 | 10 | 50 |

4 Verification of BiLSTM-based thermal error model

4.1 Prediction accuracy comparison with models based on FFN/RNN/BiRNN

The proposed BiLSTM model was compared with the traditional neural networks FFN and RNN, as well as a bidirectional RNN (BiRNN) deep learning network. The comparison test was performed with the same dataset as in Sect. 3.3. The input sequence and hidden size of the network were kept the same. Three evaluation indices that are commonly used for regression problems in machine learning are given as

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left| y_i - y'_i \right|^2}{N}}, \tag{25}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - y'_i \right|, \tag{26}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - y'_i}{y_i} \right| \times 100\%, \tag{27}$$

where $y_i$ is the measured value, $y'_i$ the predicted value, and N the sample size. Equations (25)–(27) were used to evaluate the model performance.

The calculation results of the evaluation indices, as well as the maximum absolute residual error and prediction accuracy, are listed in Table 5. The thermal error prediction results of the other three models are shown in Fig. 10. For a clearer presentation, a zoom-in observation of the comparison of these four models is shown in Fig. 11. The comparison test result shows that the proposed BiLSTM model outperforms the other three models in all three evaluation indices, and it has the best performance in terms of both the residual error and prediction accuracy.
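The three evaluation indices of Eqs. (25)–(27) are simple to compute once measured and predicted values are available. The sketch below uses plain Python; the function names and the four sample values are made up for illustration and are not data from the experiments. MAPE is undefined when a measured value is zero, so the sample avoids that case.

```python
import math

def rmse(y, y_pred):
    """Root mean squared error, Eq. (25)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_pred)) / len(y))

def mae(y, y_pred):
    """Mean absolute error, Eq. (26)."""
    return sum(abs(a - b) for a, b in zip(y, y_pred)) / len(y)

def mape(y, y_pred):
    """Mean absolute percentage error in percent, Eq. (27)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_pred)) / len(y)

# Illustrative values only (e.g. thermal errors in micrometres).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

scores = {
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
    "MAPE": mape(measured, predicted),
}
```

For these sample values the absolute residuals are 2, 2, 3 and 1, so the MAE is 2.0, the RMSE is the square root of 4.5, and the MAPE is 10.625%.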
Fig. 17 Error compensation scheme
4.2 Robustness verification

The training, validation, and test datasets used previously were obtained under the same running program of the machine tool. To further confirm the prediction accuracy and robustness of the proposed BiLSTM-based model and avoid unnecessary cost loss before the workpiece cutting test, the thermal errors under different operation procedures were measured and predicted by the established model. The operation procedures are listed in Table 6. The thermal errors are shown in Fig. 12, and the temperature measurement data are shown in Fig. 13. The maximum absolute residual error was 1.29 μm, and the prediction accuracy was 94.9%. The prediction accuracy of the proposed model was maintained at a relatively high level under different running programs, verifying the good robustness of the established model.

4.3 Thermal error compensation on workpiece cutting process

The effectiveness of the proposed thermal error model based on BiLSTM deep learning has been proved theoretically in previous verifications. In this section, the validation of the model through practical verification by an actual workpiece cutting test is discussed. An artificially designed workpiece made of aluminum alloy was used to check the effectiveness of the proposed model in real applications. Figure 14 shows a real picture of a machined workpiece, which has nine holes with different diameters and depths. The dimensions of each hole are shown in Table 7. Only the accuracy of depth was considered in this study, because the direction of the hole depth was the z-direction.

The workpiece cutting test was performed in a horizontal machining center (HMC-C100P), as shown in Fig. 15. Detailed descriptions of the cutting process are listed in Table 8. Workpieces were machined with and without a thermal error compensation system. The sample size of each group needed to be at least 30 pieces to achieve statistical significance. The changes in the workpiece at the depth of the hole before and after compensation were compared to verify the validity of the proposed BiLSTM-based thermal error model.

4.3.1 Error compensation system

A schematic diagram of the thermal error compensation system is illustrated in Fig. 16. This compensation system was developed based on a fast Ethernet data interaction technique and the external machine original coordinate shift (EMZPS) function of the machine tool [17]. The error prediction value calculated by the error prediction model was stored in the R address of the programmable machine controller (PMC) through data interaction. When the data between the machine tool and PMC were scanned and interacted, the error prediction value in the R address was automatically obtained and combined with the
Fig. 18 Comparison on the measurement results of the depth of hole
Depth = 10 mm
interpolation algorithm in the next stage. The compensa-
tion value in each scanning cycle was iterated to the
interpolation step to realize the modification of the inter-
polation vector and complete the error compensation.
The compensation system consisted of a hardware platform and a software platform. The compensation
scheme is shown in Fig. 17. The hardware platform was implemented on an embedded computer, which
included an error compensation unit and a data acquisition unit. The software platform was developed
with LabVIEW and is associated with an embedded MATLAB procedure, which includes an error modeling
module and an error compensation module.

4.3.2 Thermal error compensation result

The hole depth of all machined workpieces was measured using a coordinate measuring machine
(CONTURA-G2) in a standard environment. A comparison of the measurement results is shown in Fig. 18.
The vertical coordinate Δh is the relative deviation of the measured hole depth from the center
value. The horizontal coordinate is the serial number of each hole in cutting order.

The main differences in workpiece accuracy before and after thermal error compensation are listed in
Table 9. The average depth variation is reduced from approximately 50 µm to less than 2 µm after
compensation. The maximum depth variation range is significantly reduced, to about 15 µm. The
workpiece accuracy in hole depth is improved by more than 85%. These results confirm the excellent
prediction accuracy of the proposed BiLSTM-based thermal error model.

Table 9 Main differences in workpiece accuracy before and after compensation

Hole depth   Variation range/µm           Variation average/µm   Accuracy
             Before        After          Before      After      improvement/%
5 mm         8.0–103.0     -13.6–13.7     45.3        1.8        86.7
8 mm         -1.0–94.0     -9.5–13.5      46.1        1.2        85.6
10 mm        24.0–90.0     -9.5–13.0      54.4        1.9        85.5

5 Conclusions

A thermal error modeling method was proposed based on BiLSTM deep learning for a horizontal turning
center. The model is composed of one BiLSTM layer, two FFN layers, and one max-pooling layer. With a
specially designed gate mechanism and bidirectional structure in the BiLSTM layer, the model has a
superior ability to learn from both short-term and long-term information and to extract the temporal
and spatial characteristics of dynamic time series temperature data and thermal error data.

A training algorithm was elaborately designed for model training to obtain an optimal loss function
of the validation data, which is crucial to thermal error modeling. The training algorithm adopts
the gradient clipping method and the Adam optimization algorithm to prevent gradient explosion and
ensure the convergence of the training process.
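As a minimal sketch of how gradient clipping and an Adam update fit together in one training step, the following dependency-free example clips the gradient by its global L2 norm before each Adam update. It is not the authors' training algorithm: the clipping threshold, learning rate, step count, and the toy quadratic objective are all placeholder assumptions chosen only to make the example runnable.

```python
import math
from typing import Callable, List, Tuple


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale the gradient so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def adam_step(params: List[float], grads: List[float],
              m: List[float], v: List[float], t: int,
              lr: float = 1e-3, beta1: float = 0.9,
              beta2: float = 0.999, eps: float = 1e-8
              ) -> Tuple[List[float], List[float], List[float]]:
    """One Adam update with bias correction; t is the 1-based step count."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g          # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g      # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def train(params: List[float],
          grad_fn: Callable[[List[float]], List[float]],
          steps: int = 500, max_norm: float = 1.0, lr: float = 0.05) -> List[float]:
    """Clip the gradient, then apply Adam, for a fixed number of steps."""
    m = [0.0] * len(params)
    v = [0.0] * len(params)
    for t in range(1, steps + 1):
        grads = clip_by_global_norm(grad_fn(params), max_norm)
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params


def quadratic_grad(p: List[float]) -> List[float]:
    """Gradient of the toy loss (p0 - 3)^2 + (p1 + 1)^2 (illustration only)."""
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)]


weights = train([0.0, 0.0], quadratic_grad)
```

Clipping bounds the size of each raw gradient before Adam sees it, which is the mechanism the text credits with preventing gradient explosion. In a real BiLSTM the same two steps would be applied to the gradients of all layer weights.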
A window length selection method for input sequences was proposed based on the phase space
reconstruction of a time series. The best embedding dimension of the reconstructed phase space is
used to determine the window length of the input sequence, because the reconstructed space has the
same geometric characteristics as the original space, which benefits the efficiency of model
learning.
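To make the link between phase space reconstruction and the input window concrete, the sketch below builds the delay vectors of a scalar series, of which there are M = N - (m - 1)τ for N samples, embedding dimension m, and delay τ. It then cuts input windows of a chosen length. Setting the window length equal to m is an assumption made for illustration, and the function names are hypothetical. The procedure for finding the best m is not shown.

```python
from typing import List, Sequence, Tuple


def delay_embed(series: Sequence[float], m: int, tau: int = 1) -> List[List[float]]:
    """Return the M = N - (m - 1) * tau delay vectors of dimension m."""
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    count = len(series) - (m - 1) * tau
    if count < 1:
        raise ValueError("series is too short for this m and tau")
    return [[series[i + k * tau] for k in range(m)] for i in range(count)]


def make_windows(inputs: Sequence[float], targets: Sequence[float],
                 window: int) -> List[Tuple[List[float], float]]:
    """Pair each length-`window` slice of inputs with the target at its last step."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    return [(list(inputs[i:i + window]), targets[i + window - 1])
            for i in range(len(inputs) - window + 1)]


# Usage with made-up numbers: 10 samples, m = 3, tau = 2 gives 6 delay vectors.
temperature = [float(x) for x in range(10)]
vectors = delay_embed(temperature, m=3, tau=2)

# Assumed rule for this sketch: window length = embedding dimension.
thermal_error = [0.1 * x for x in range(10)]
samples = make_windows(temperature, thermal_error, window=3)
```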
Three validation tests were performed to verify the accuracy and robustness of the proposed
BiLSTM-based thermal error model. (i) Compared with the traditional modeling methods based on
FFN/RNN/BiRNN, the proposed model outperformed the other three models in all three evaluation
indices, as well as in residual error and prediction accuracy. (ii) The prediction accuracy of the
proposed model was verified with a different running program of the machine tool. The test results
show that the proposed model has good robustness under different machining conditions. (iii) The
effectiveness of the proposed thermal error model was validated by a real workpiece cutting test.
The average depth variation of the workpiece was reduced from approximately 50 µm to less than 2 µm
after compensation. The reduction in maximum depth variation was more than 85%, which is a
significant improvement in workpiece accuracy. The proposed thermal error model based on BiLSTM deep
learning has good prediction accuracy and satisfactory robustness, so it can significantly improve
the workpiece accuracy.
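The "more than 85%" figure can be checked against the variation ranges in Table 9. The text does not state the formula, so the sketch below assumes that the improvement is the relative reduction in the largest absolute depth deviation. Under that assumption the computed values agree with the reported percentages to within 0.1 percentage point.

```python
from typing import Dict, Tuple

# Variation ranges (µm) and reported improvement (%) taken from Table 9.
TABLE_9: Dict[str, Dict[str, object]] = {
    "5 mm":  {"before": (8.0, 103.0), "after": (-13.6, 13.7), "reported": 86.7},
    "8 mm":  {"before": (-1.0, 94.0), "after": (-9.5, 13.5),  "reported": 85.6},
    "10 mm": {"before": (24.0, 90.0), "after": (-9.5, 13.0),  "reported": 85.5},
}


def max_abs(value_range: Tuple[float, float]) -> float:
    """Largest absolute deviation within a (low, high) range."""
    return max(abs(value_range[0]), abs(value_range[1]))


def improvement_percent(before: Tuple[float, float],
                        after: Tuple[float, float]) -> float:
    """Relative reduction of the largest absolute deviation, in percent.

    Assumed formula; the source reports only the resulting percentages.
    """
    worst_before = max_abs(before)
    worst_after = max_abs(after)
    return 100.0 * (worst_before - worst_after) / worst_before


computed = {depth: improvement_percent(row["before"], row["after"])
            for depth, row in TABLE_9.items()}
```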
Acknowledgements The research was sponsored by the National Natural Science Foundation of Major
Special Instruments (Grant No. 51527806) and the National Natural Science Foundation Projects of the
People’s Republic of China (Grant No. 51975372).

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as
long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative
Commons licence and your intended use is not permitted by statutory regulation or exceeds the
permitted use, you will need to obtain permission directly from the copyright holder. To view a copy
of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Pu-Ling Liu is a PhD candidate at the School of Mechanical Engineering, Shanghai Jiao Tong
University, China. Her research interests include machine tool error measurement, error modeling,
and error compensation, with a focus on thermal error modeling and compensation.
References
1. Ramesh R, Mannan MA, Poo AN (2000) Error compensation in

machine tools–a review Part II: thermal errors. Int J Mach Tools
Manuf 40(9):1257–1284
2. Mayr J, Jedrzejewski J, Uhlmann E et al (2012) Thermal issues in
machine tools. CIRP Annals-Manuf Technol 61(2):771–791
3. Li Y, Zhao W, Lan S et al (2015) A review on spindle thermal
error compensation in machine tools. Int J Mach Tools Manuf
95:20–38
123
Zheng-Chun Du is an Associate Professor at the School of Mechanical Engineering, Shanghai Jiao Tong
University, China. He received his PhD in Mechanical Engineering from Southeast University. His
research interests include error measurement, error modeling and error compensation of machine
tools, precision measurement, and processing.

Hui-Min Li is a PhD candidate at the School of Mechanical Engineering, Shanghai Jiao Tong
University, China. Her research interests include machine tool error measurement, error modeling,
and error compensation, with a focus on volumetric error modeling and compensation.

Ming Deng is a PhD candidate at the School of Mechanical Engineering, Shanghai Jiao Tong University,
China. His research interests include machine tool error measurement, error modeling, and error
compensation, with a focus on error measurement of five axis machine tools.

Xiao-Bing Feng is an Assistant Professor at the School of Mechanical Engineering, Shanghai Jiao Tong
University, China. He received his PhD in precision manufacturing from the National University of
Singapore. He finished his postdoctoral research in manufacturing metrology at the University of
Nottingham, UK. His research interests include dimensional, surface, and machine tool metrology,
with a focus on in situ metrology.

Jian-Guo Yang is a professor at the School of Mechanical Engineering, Shanghai Jiao Tong University,
China. He received his PhD degree in mechanical manufacturing from Shanghai Jiao Tong University. His
research interests include precision machining and testing. He has received a number of awards,
including the author of National Excellent Doctoral Dissertation of China and the National
Technology Invention Award. He is a vice president of Chinese Colleges and Universities Society of
Machine Tools and has published more than 300 papers.