Neural and Machine Learning To The Surface Defect Investigation in Sheet Metal Forming
4.1. Model development

In a neural network, the calculations performed at each neuron are determined
by an activation function, which may take various mathematical forms such as
logistic, linear threshold, or hard limiting (on/off). The intensity of the
signal passed between any two neurons depends on both the activation function
and the weight of the connection. While connection weights are modified during
training as observation patterns are passed along, activation functions must
be chosen before training. The selection of an activation function for the
hidden layer is most important, since this is the layer that actually performs
the feature extraction from the patterns processed. Accordingly, we
experimented with different forms of activation functions and found that, for
the given problem, the best-suited combination of activation functions is as
follows:
first layer: linear function [-1, 1];
hidden layer: logistic function;
output layer: logistic function.

Selection of the number of hidden neurons is another basic decision to be made
in building a neural network. Heuristics have been suggested, but care must be
taken when using them in practice, as the issue is case dependent and an
inappropriate decision may degrade the accuracy of the networks in
generalization. At the current stage of ANN technology development, such
selection is still more of an art than a science. We therefore tested a wide
range of hidden neuron numbers in order to obtain an optimal model.

With regard to the interconnections between neurons, we considered the
following network architectures:
• A feedforward BP network with standard connections, i.e. neurons in each
layer connected only to neurons in the immediately previous layer; 1-, 2- and
3-hidden-layer structures were tested.
• A feedforward BP network with jump connections, i.e. neurons in each layer
connected to neurons in every previous layer; 1-, 2- and 3-hidden-layer
structures were tested.
• A feedforward BP network consisting of multiple hidden slabs (2 and 3
respectively) with different activation functions.
• Probabilistic neural networks.

4.2. Network training and validation

As mentioned before, a v-fold cross-validation re-sampling method was
implemented for this study. As a result, eight training sets and eight
validation sets were obtained and used for conducting the experiment. Each of
the training sets was further split into a training subset and a test subset.
This allowed us to modify and/or stop the network training based on the model
performance in processing “quasi-new data”, thus ensuring that the resultant
network model would have better ability for generalization.

The network was trained using basic supervised learning by error
backpropagation. The algorithm works by presenting each training vector (an
input-output variable pair) in turn, computing the estimation error, and
thereby determining the interconnection weight changes through the network.
The observations were presented to the network in random order to minimize
bias due to the network memorizing the order of the training data. After every
180 epochs (an epoch being a complete pass of the training data through the
network), the partially trained network was tested using the test subsets
described above, and the average error in estimating the occurrence of
wrinkling was recorded. If there was an improvement in the estimation
accuracy, the network parameters were saved. Thus, the performance of the
network in processing test data, rather than training data, was used to
determine the quality of the network. This approach helps to reduce the risk
of saving a network which has memorized the features of the training data
without the ability to generalize to new data.

After the training session was completed, validation of the model was
performed using the validation data set (observations that were not exposed to
the network during its training). The
outputs of the model in processing this data set, compared with the target
values, represent a kind of objective assessment of the model quality.

5. BRP Approach Modelling

Binary Recursive Partitioning (BRP) is a branch of the Machine Learning
methodology. The BRP technique has the distinct advantage that its binary
tree-structured outcomes offer an explicit explanation of the evaluation
results for the problem under discussion.

In this study, the algorithm used to implement the decision trees is
Classification and Regression Trees (CART) [2]. A common technique among the
first generation of BRP algorithms was to continue splitting (or growing the
tree) until some goodness-of-fit criterion failed to be met, and that often
yielded erroneous results. To avoid the problem, CART proceeds by means of a
three-step algorithm. Firstly, it grows the classification tree until further
growing is impossible. Secondly, it prunes away branches of the maximal tree
to get a set of sub-trees. Thirdly, it tests the error rate or costs for all
the possible sub-trees and the maximal tree to determine the best tree. This
combination of exhaustive searching and computer-intensive testing is intended
to ensure that no valuable information is lost.

Figure 1 illustrates the estimation model constructed for the problem under
consideration.

[Figure 1: estimation model (classification tree)]

6. Results and Discussion

The outputs of the neural network and the decision tree models can be
expressed as the total percentage of the wrinkled/wrinkle-free blocks
correctly predicted and, more crucially, the total percentage of wrinkled
blocks correctly estimated. The usefulness of a prediction or estimation model
should also be assessed from other points of view, such as the error stability
of the model and its ability to generalize. In order to make comparisons and
to evaluate the overall performance of each model as a tool for estimating the
occurrence of wrinkling, a group of six statistics was chosen as standard
assessment criteria. They are:
• % Correctly Estimated, which measures the percentage of the observations
correctly estimated;
• MAE, mean absolute error, which measures the average error rate, with the
best being nearest to 0;
• STDEV, standard deviation, which measures the dispersion of the errors that
an estimation model generates around the mean value of these errors; the best
is that nearest to 0;
• Misclassification Rates, categorized as Type I Error (the frequency of
incorrect estimations of wrinkled blocks as wrinkle-free) and Type II Error
(the frequency of incorrect estimations of wrinkle-free blocks as wrinkled);
• RSQ, the square of the Pearson product-moment correlation coefficient, which
measures the correlation between the estimated and actual distributions,
ranging from 0 (poor fitting to the model) to 1 (good fitting);
• Cross-Validation Prediction Risk, which measures the accuracy of a model
when applied to new data (i.e. the model's generalization ability); the
smaller, the better.
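The first five assessment criteria listed above can be computed directly from the actual and estimated values. The following is a minimal sketch, not the authors' implementation. It assumes that blocks are coded 1 for wrinkled and 0 for wrinkle-free, that a model output of 0.5 or more counts as "wrinkled", that STDEV is the sample standard deviation of the errors, and that the Type I and Type II rates are taken relative to the number of actually wrinkled and actually wrinkle-free blocks respectively. The function name `assessment_statistics` is hypothetical. The cross-validation prediction risk is left out because it needs the held-out folds rather than a single set of predictions.

```python
from statistics import mean, stdev


def assessment_statistics(actual, estimated, threshold=0.5):
    """Return assessment statistics for a wrinkling estimation model.

    actual:    list of 0/1 labels (assumption: 1 = wrinkled, 0 = wrinkle-free)
    estimated: list of model outputs in [0, 1]
    threshold: assumed cut-off above which a block is called wrinkled
    """
    n = len(actual)
    errors = [e - a for a, e in zip(actual, estimated)]
    predicted = [1 if e >= threshold else 0 for e in estimated]

    n_wrinkled = sum(actual)
    n_free = n - n_wrinkled
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    # Type I: wrinkled block estimated as wrinkle-free.
    type1 = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Type II: wrinkle-free block estimated as wrinkled.
    type2 = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)

    # Square of the Pearson product-moment correlation coefficient.
    mean_a, mean_e = mean(actual), mean(estimated)
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_e = sum((e - mean_e) ** 2 for e in estimated)
    rsq = cov ** 2 / (ss_a * ss_e) if ss_a > 0 and ss_e > 0 else float("nan")

    return {
        "pct_correct": 100.0 * correct / n,
        "mae": mean([abs(x) for x in errors]),
        "stdev": stdev(errors),
        "type1_rate": 100.0 * type1 / n_wrinkled if n_wrinkled else float("nan"),
        "type2_rate": 100.0 * type2 / n_free if n_free else float("nan"),
        "rsq": rsq,
    }


# Illustrative data only; these are not results from the study.
example = assessment_statistics(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0.9, 0.8, 0.3, 0.1, 0.2, 0.6, 0.1, 0.0],
)
```

Reporting the Type I rate separately from the overall percentage correct matters here because wrinkled blocks are the minority class, so a model can score well overall while still missing a large share of them.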
[...] model having the lower value of MAE) and a stable error structure (small
STDEV value). Inspecting the model performance on the training and the
validation data, no significant deterioration is observed, which indicates
that these models have a fairly good ability to generalize. In other words,
the models are suitable for future applications. The low value of the
‘Prediction Risk’ observed also supports this conclusion.

It is noted that the Type I misclassification rates produced by the resultant
models were quite large. For the ANN model, the rates were around 36.36% for
training data and 33.33% for validation data. For the BRP model, the rates
were zero for training but 20.00% for validation. This is the unsatisfactory
aspect of the analysis results, to which further study will be directed.

Besides examining the prediction results, it is also valuable to consider the
relative importance of the input variables in producing the estimation
outcomes for each model. For a neural network, a parameter known as the
Contribution Factor (CF) can be observed. These factors are the sum of the
absolute values of the network connection weights leading from each neuron
that represents an input variable. In this regard, the CF value of a variable
gives an approximate measure of the importance of that variable relative to
the other variables in the same network. By examining the CFs, some useful
information about the relations between the input variables and the estimated
target variable may be obtained. Similarly, for each input variable of a BRP
model, the Variable Importance can be used to observe that variable’s
contribution in deriving the final estimation.

Table 2 lists the CFs and the Variable Importance of the input variables
corresponding to the surface geometrical parameters of the estimation models.
For the neural network, the surface geometry parameter Height Gradient is the
variable that contributed most (it has the largest CF value). The Torsion
Gradient and the Boundary Condition of Torsion are the variables that took
second and third place in contributing to the final estimation results. The
Curvature Gradient is the variable of least significance. For the BRP model,
the surface geometry parameter Torsion Gradient is the variable that
contributed most (it has the largest Variable Importance value). The Height
Gradient and the Boundary Condition of Height are the variables that played
the second and third most important roles. The Curvature Gradient is again the
variable of least contribution.

From the above, we can infer that the surface geometry parameters Torsion
Gradient and Height Gradient are the most important factors with regard to the
possible formation of wrinkling at a given area. The boundary conditions of
the torsion gradient, the height gradient and the curvature gradient each play
a role of some importance. The surface geometry parameter Curvature Gradient,
however, seems to have little influence on the process of wrinkling formation.
Although preliminary, these are very useful findings, given that no
established theory is available to describe the mechanism linking the
occurrence of (surface defect) wrinkling and the surface geometry conditions.
In light of these, new experiments will be designed and carried out in our
follow-up research. They will further examine and test those results, aiming
at a clearer understanding of the mechanism of the geometrical influence on
wrinkling formation.

7. Conclusions

This paper used neural and machine learning techniques to estimate the
occurrence of wrinkling in sheet metal forming parts for automobile
manufacturing. A neural network and a binary recursive partitioning
classification tree model were developed, which demonstrated fairly good
performance in processing the training and the validation data. The input
explanatory variables of the models were a set of geometrical parameters of
the forming parts. Thus certain
kind of relationships between the surface geometry features and the wrinkling
occurrence in the corresponding area of the parts have been established. The
estimation models would be useful tools for the product design engineer as
well as the workshop practitioner in tackling the problem of surface defects
in sheet metal forming parts.

Appendix: The parameters for Surface Geometry

1. Curvature is the rate of change of the direction of the tangent vector of
the curve with respect to arc length: $\kappa(s) = |d\mathbf{T}(s)/ds|$, where
$\kappa$ is curvature, $\mathbf{T}$ is the tangent vector, and $s$ is arc
length. If the curve is described by the arc length parameter equation
$\mathbf{r}(s) = [x(s)\ y(s)\ z(s)]$,
then
$\kappa(s) = |\mathbf{r}''(s)| = [x''(s)^2 + y''(s)^2 + z''(s)^2]^{1/2}$.

2. Torsion is the rate of turn of the direction of the binormal vector of the
curve with respect to arc length: $\tau(s) = |d\mathbf{B}(s)/ds|$. If the
curve is described by the parametric equation for arc length, then
$\tau(s) = \mathbf{r}''' \cdot (\mathbf{r}' \times \mathbf{r}'') / |\mathbf{r}''|^2$.

3. Height is the vertical displacement of each point from its initial
position: $h(s) = z(s)$.

A program was written to calculate these three parameters. It was assumed in
the present work that the change of the above parameters at each grid point is
important, rather than the value itself at that point. The actual values used
as the input variables of the neural network were the gradients of the three
parameters at each point.

References

1. Altman, E., Marco, G. and Varetto, F., Corporate distress diagnosis:
comparison using linear discriminant analysis and neural networks (the Italian
experience), Journal of Banking and Finance, vol. 18, pp. 509-525, (1994).
2. Breiman, L., Friedman, J., Olshen, R. and Stone, C., Classification and
Regression Trees, Wadsworth, Pacific Grove, USA, (1984).
3. Ding, Q. and Davis, B.J., Surface Engineering Geometry for Computer-aided
Design and Manufacture, Halsted Press, England, (1987).
4. Duncan, J.L. and Altan, T., New Directions in Sheet Metal Forming Research,
CIRP Annals, vol. 29, no. 1, pp. 153-156, (1980).
5. Fletcher, D. and Goss, E., Forecasting with Neural Networks: an application
using bankruptcy data, Information & Management, vol. 24, p. 159, (1993).
6. Flitman, A., Towards Analysing Student Failures: neural networks compared
with regression analysis and multiple discriminant analysis, Computers Ops.
Res., vol. 24, p. 367, (1997).
7. Steinberg, D. and Colla, P., CART: Tree-Structured Non-Parametric Data
Analysis, Salford Systems, CA, USA, (1995).
8. Stone, M., Cross-Validatory Choice and Assessment of Statistical
Predictions, Roy. Stat. Soc., B36, pp. 111-147, (1974).
9. Utans, J. and Moody, J., Selecting Neural Network Architectures via the
Prediction Risk: Application to Corporate Bond Rating Prediction, Proceedings
of 1st Inter. Conf. on Artificial Intelligence Application on Wall Street, CA,
pp. 35-41, (1991).
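The Appendix definitions of curvature and torsion can be evaluated numerically. The sketch below is an illustration only, not the program the authors wrote. It assumes the curve is available as a callable `r(s)` returning `(x, y, z)` and parameterized by arc length, and it uses the standard arc-length formulas: curvature is |r''(s)| and torsion is r''' · (r' × r'') / |r''|². Derivatives are approximated by central finite differences with an assumed step `h`. The names `curvature_torsion` and `gradients` are hypothetical, and taking the "gradient" as the difference between neighbouring grid points divided by their spacing is also an assumption.

```python
import math


def curvature_torsion(r, s, h=1e-2):
    """Approximate curvature and torsion of an arc-length curve r at s.

    r: callable s -> (x, y, z), assumed to be parameterized by arc length
    h: assumed finite-difference step
    """
    rm2, rm1, r0, rp1, rp2 = (r(s + k * h) for k in (-2, -1, 0, 1, 2))

    # Central-difference estimates of r', r'' and r'''.
    d1 = [(p - m) / (2 * h) for p, m in zip(rp1, rm1)]
    d2 = [(p - 2 * c + m) / h ** 2 for p, c, m in zip(rp1, r0, rm1)]
    d3 = [
        (p2 - 2 * p1 + 2 * m1 - m2) / (2 * h ** 3)
        for p2, p1, m1, m2 in zip(rp2, rp1, rm1, rm2)
    ]

    # Curvature: magnitude of the second derivative.
    kappa = math.sqrt(sum(c * c for c in d2))
    if kappa < 1e-12:
        # Straight segment: torsion is undefined, reported here as 0.
        return 0.0, 0.0

    # Torsion: triple product r''' . (r' x r'') divided by |r''|^2.
    cross = (
        d1[1] * d2[2] - d1[2] * d2[1],
        d1[2] * d2[0] - d1[0] * d2[2],
        d1[0] * d2[1] - d1[1] * d2[0],
    )
    tau = sum(a * b for a, b in zip(d3, cross)) / kappa ** 2
    return kappa, tau


def gradients(values, ds):
    """Change of a parameter between consecutive grid points, per unit spacing."""
    return [(b - a) / ds for a, b in zip(values, values[1:])]


# Illustrative check on a unit-speed helix, not on data from the study.
c = math.sqrt(2.0)
kappa, tau = curvature_torsion(
    lambda s: (math.cos(s / c), math.sin(s / c), s / c), 1.0
)
```

The triple-product formula returns a signed torsion, so its absolute value would be needed to match the magnitude |dB(s)/ds| given in the Appendix definition.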