2012 American Transactions on Engineering & Applied Sciences
                       2011 American Transactions on Engineering & Applied Sciences.



                               American Transactions on
                             Engineering & Applied Sciences

                  http://TuEngr.com/ATEAS,               http://Get.to/Research




                       Relevance Vector Machines for Earthquake Response Spectra

                       Jale Tezcan a,*, Qiang Cheng b

a Department of Civil and Environmental Engineering, Southern Illinois University Carbondale, Carbondale, IL 62901, USA
b Department of Computer Science, Southern Illinois University Carbondale, Carbondale, IL 62901, USA

ARTICLE INFO
Article history:
Received 23 August 2011
Received in revised form 23 September 2011
Accepted 26 September 2011
Available online 26 September 2011

Keywords:
Response spectrum
Ground motion
Supervised learning
Bayesian regression
Relevance Vector Machines

ABSTRACT
This study uses Relevance Vector Machine (RVM) regression to develop a probabilistic model for
the average horizontal component of 5%-damped earthquake response spectra. Unlike conventional
models, the proposed approach does not require a functional form, and constructs the model based
on a set of predictive variables and a set of representative ground motion records. The RVM uses
Bayesian inference to determine the confidence intervals, instead of estimating them from the
mean squared errors on the training set. An example application using three predictive variables
(magnitude, distance and fault mechanism) is presented for sites with shear wave velocities
ranging from 450 m/s to 900 m/s. The predictions from the proposed model are compared to an
existing parametric model. The results demonstrate the validity of the proposed model, and suggest
that it can be used as an alternative to the conventional ground motion models. Future studies will
investigate the effect of additional predictive variables on the predictive performance of the model.

                                   2012 American Transactions on Engineering & Applied Sciences.



*Corresponding author ( J. Tezcan). Tel/Fax: +001-618-4536125. E-mail address:
jale@siu.edu.  2012. American Transactions on Engineering & Applied Sciences.
Volume 1 No.1     ISSN 2229-1652     eISSN 2229-1660.     Online Available at
                                                                                            25
http://TUENGR.COM/ATEAS/V01/25-39.pdf


1. Introduction 
    Reliable prediction of ground motions from future earthquakes is one of the primary
challenges in seismic hazard assessment. Conventional ground motion models are based on
parametric regression, which requires a fixed functional form for the predictive model. Because the
mechanisms governing ground motion processes are not fully understood, identification of the
mathematical form of the underlying function is a challenge. Once a functional form is selected,
the model is fit to the data, and the coefficients that minimize the mean squared error
between the model and the data are determined. When the selected mathematical form does not
accurately represent the actual input-output relationship, this approach is susceptible to
overfitting. Indeed, with a sufficiently complex model, one can achieve a perfect fit to the
training data regardless of the selected mathematical form. However, a perfect fit to the
training data does not indicate the predictive performance of the model for new data.


    Kernel regression offers a convenient way to perform regression without a fixed parametric
form, or any knowledge of the underlying probability distribution. A special form of kernel
regression, called Support Vector Regression (SVR) (Drucker et al., 1997), is characterized by
its compact representation and its high generalization performance. In SVR, the training data are
first transformed into a high dimensional kernel space, and linear regression is performed on the
transformed data. The resulting model is a linear combination of nonlinear kernel functions
evaluated at a subset of the training inputs. Combination weights are determined by minimizing a
penalized residual function. SVR has proved successful in many studies since its introduction
in 1997, and its effectiveness in ground motion modeling has been recently demonstrated
(Tezcan and Cheng, 2011; Tezcan et al., 2010). A well-known weakness of SVR is the lack
of probabilistic outputs. Although confidence intervals can be constructed using the
mean-squared errors, similar to the approach used in conventional ground motion models, the
posterior probabilities, which produce the most reliable estimate of prediction intervals, are not
given. The lack of probabilistic outputs in the SVR formulation has motivated the development of
a new kernel regression model called the Relevance Vector Machine (RVM) (Tipping, 2000), which
operates in a Bayesian framework.


    To overcome the limitations of parametric regression while obtaining probabilistic
predictions, this paper proposes a new ground motion model based on RVM regression.
Unlike standard ground motion models, which make point estimates of the optimal values of the
weights by minimizing the fitting error, the RVM model treats the model coefficients as random
variables with independent variances and seeks the model that maximizes the marginal likelihood
of the observations. This approach offers two main advantages over conventional ground
motion models. First, the prediction uncertainty is explicitly determined using Bayesian
inference, as opposed to being estimated from the mean squared errors. Second, the complexity of
the RVM model is controlled by assigning suitable prior distributions over the model coefficients,
which reduces the susceptibility of the model to overfitting.


    The rest of the paper is organized as follows. In Section 2, the RVM regression algorithm is
described. Section 3 is devoted to the construction of the ground motion model: it describes the
ground motion data and the predictive and target variables, presents the training results,
and describes the prediction procedure for new data. Section 4 presents
computational results and compares the RVM predictions to an existing empirical parametric
model. Section 5 summarizes the main conclusions of this study and
discusses the advantages and limitations of the proposed method.


2. The RVM Regression Algorithm 
   Given a set of input vectors $\{\mathbf{x}_i\}$, $i = 1{:}N$, and corresponding real-valued targets $\{t_i\}$, the
regression task is to estimate the underlying input-output relationship. Using kernel representation
(Smola and Schölkopf, 2004), the regression function can be written as a linear combination of a
set of nonlinear kernel functions:

$$ y(\mathbf{x}) = \sum_{i=1}^{N} w_i K(\mathbf{x}, \mathbf{x}_i) + w_0, \tag{1} $$

where $w_i$, $i = 1 \ldots N$, are the combination weights and $w_0$ is the bias term.



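This study uses the radial basis function (RBF) kernel:

$$ K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2\right), \quad \gamma > 0, \tag{2} $$

where $\gamma$ is the width parameter controlling the trade-off between model accuracy and
complexity. In this study, the width parameter has been determined using cross-validation.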
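
As an illustration, the kernel of Eq. (2) can be evaluated for all pairs of input vectors at once. The following is a minimal Python sketch, not the authors' implementation. It assumes the kernel takes the form exp(-gamma * squared Euclidean distance); the function name `rbf_kernel`, the argument name `gamma`, and the two sample input vectors are illustrative choices.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """Pairwise RBF kernel between the rows of X1 and the rows of X2.

    Assumed form: K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), gamma > 0.
    """
    X1 = np.atleast_2d(np.asarray(X1, dtype=float))
    X2 = np.atleast_2d(np.asarray(X2, dtype=float))
    diff = X1[:, None, :] - X2[None, :, :]      # shape (n1, n2, d)
    sq_dist = np.sum(diff ** 2, axis=-1)        # squared Euclidean distances
    return np.exp(-gamma * sq_dist)

# Two made-up, already scaled input vectors (three predictive variables each).
X = np.array([[0.0, 0.0, -1.0],
              [0.5, -0.5, 1.0]])
K = rbf_kernel(X, X, gamma=0.3)
```

A larger `gamma` makes the kernel decay faster with distance, so each basis function influences a smaller neighborhood of the input space.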


        Assuming independent noise samples from a zero-mean Gaussian distribution,
i.e., $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$, the target values can be written as:

$$ t_i = y(\mathbf{x}_i) + \epsilon_i, \quad i = 1, \ldots, N. \tag{3} $$


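        Recast in matrix form, Equation (3) becomes:

$$ \mathbf{t} = \Phi \mathbf{w} + \boldsymbol{\epsilon}, \tag{4} $$

where $\mathbf{t} = [t_1, \ldots, t_N]^T$, $\boldsymbol{\epsilon} = [\epsilon_1, \ldots, \epsilon_N]^T$, and $\Phi$ is an $N \times (N+1)$ basis matrix with $\Phi_{i1} = 1$
and $\Phi_{i,j+1} = K(\mathbf{x}_i, \mathbf{x}_j)$. The likelihood of the entire set, assuming independent observations, is
given by:

$$ p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\|\mathbf{t} - \Phi\mathbf{w}\|^2}{2\sigma^2}\right), \tag{5} $$

where $\mathbf{w} = [w_0, \ldots, w_N]^T$ is the vector containing the combination weights.
To control the complexity of the model, a zero-mean Gaussian prior is used where each weight is
assigned a different variance (MacKay, 1992):

$$ p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, 1/\alpha_i). \tag{6} $$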



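In Eq. (6), $\boldsymbol{\alpha} = [\alpha_0, \ldots, \alpha_N]^T$, where $1/\alpha_i$ is the variance of $w_i$. The posterior distribution
of the weights is obtained as:

$$ p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = (2\pi)^{-(N+1)/2} |\mathbf{C}|^{-1/2} \exp\left(-\tfrac{1}{2} (\mathbf{w} - \boldsymbol{\mu})^T \mathbf{C}^{-1} (\mathbf{w} - \boldsymbol{\mu})\right), \tag{7} $$

where the mean vector $\boldsymbol{\mu}$ and covariance matrix $\mathbf{C}$ are:

$$ \boldsymbol{\mu} = \sigma^{-2} \mathbf{C} \Phi^T \mathbf{t}, \tag{8} $$

$$ \mathbf{C} = \left(\sigma^{-2} \Phi^T \Phi + \mathbf{A}\right)^{-1}, \tag{9} $$

    with

$$ \mathbf{A} = \begin{bmatrix} \alpha_0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \alpha_N \end{bmatrix}. \tag{10} $$

The marginal likelihood of the dataset can be determined by integrating out the weights (MacKay,
1992) as follows:

$$ p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = (2\pi)^{-N/2} |\boldsymbol{\Omega}|^{-1/2} \exp\left(-\tfrac{1}{2} \mathbf{t}^T \boldsymbol{\Omega}^{-1} \mathbf{t}\right), \tag{11} $$

where $\boldsymbol{\Omega} = \sigma^2 \mathbf{I} + \Phi \mathbf{A}^{-1} \Phi^T$ and $\mathbf{I}$ is the identity matrix of size $N$. Ideal Bayesian inference
requires defining prior distributions over $\boldsymbol{\alpha}$ and $\sigma^2$, followed by marginalization. This process,
however, will not result in a closed form solution. Instead, the $\boldsymbol{\alpha}$ and $\sigma^2$ values maximizing
Eq. (11) can be found iteratively as follows (MacKay, 1992):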





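$$ \alpha_i^{new} = \frac{1 - \alpha_i C_{ii}}{\mu_i^2}, \tag{12} $$

$$ (\sigma^2)^{new} = \frac{\|\mathbf{t} - \Phi \boldsymbol{\mu}\|^2}{N - \sum_{i} (1 - \alpha_i C_{ii})}. \tag{13} $$

       Because the numerator in Eq. (12) is a positive number with a maximum value of 1, an $\alpha_i$
value tending to infinity implies that the posterior distribution of $w_i$ is infinitely peaked at zero,
i.e., $w_i = 0$. As a consequence, the corresponding kernel function can be removed from the
model. The procedure for determining the weights and the noise variance can be summarized as
follows: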

       1) Select a width parameter of the kernel function and form the basis matrix $\Phi$.
       2) Initialize $\boldsymbol{\alpha} = [\alpha_0, \ldots, \alpha_N]^T$ and $\sigma^2$.
       3) Compute matrix $\mathbf{A}$ using Eq. (10).
       4) Compute the covariance matrix $\mathbf{C}$ using Eq. (9).
       5) Compute the mean vector $\boldsymbol{\mu}$ using Eq. (8).
       6) Update $\boldsymbol{\alpha}$ and $\sigma^2$ using Eq. (12) and Eq. (13).
       7) If $\alpha_i \to \infty$, set $w_i = 0$ and remove the corresponding column in $\Phi$.
       8) Go back to step 3 until convergence.
       9) Set the remaining weights equal to $\boldsymbol{\mu}$.
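
The nine steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the RBF kernel is taken as exp(-gamma * squared distance), the initial values of the precisions and noise variance are arbitrary, and a finite threshold `alpha_cap` stands in for "tending to infinity" in step 7. The names `train_rvm`, `alpha_cap`, `tol`, and the synthetic one-dimensional data are all hypothetical.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """exp(-gamma * squared Euclidean distance) for every pair of rows."""
    diff = X1[:, None, :] - X2[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=-1))

def train_rvm(X, t, gamma, max_iter=1000, alpha_cap=1e6, tol=1e-4):
    """Iterative re-estimation of the weight precisions and noise variance."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=float)
    N = len(t)
    # Step 1: basis matrix with a leading column of ones for the bias term.
    Phi = np.hstack([np.ones((N, 1)), rbf_kernel(X, X, gamma)])
    keep = np.arange(N + 1)        # column 0 = bias, column j = training point j-1
    # Step 2: initial values (arbitrary choices for this sketch).
    alpha = np.ones(N + 1)
    sigma2 = 0.1 * np.var(t) + 1e-6
    for _ in range(max_iter):
        A = np.diag(alpha)                                  # Step 3, Eq. (10)
        C = np.linalg.inv(Phi.T @ Phi / sigma2 + A)         # Step 4, Eq. (9)
        mu = C @ Phi.T @ t / sigma2                         # Step 5, Eq. (8)
        g = np.clip(1.0 - alpha * np.diag(C), 1e-12, 1.0)   # numerator of Eq. (12)
        alpha_new = g / np.maximum(mu ** 2, 1e-300)         # Step 6, Eq. (12)
        resid = t - Phi @ mu
        sigma2 = max(resid @ resid / max(N - g.sum(), 1e-6), 1e-12)  # Eq. (13)
        mask = alpha_new < alpha_cap                        # Step 7: prune columns
        if not mask.any():
            break
        change = np.max(np.abs(np.log(alpha_new[mask]) - np.log(alpha[mask])))
        Phi, keep, alpha = Phi[:, mask], keep[mask], alpha_new[mask]
        if change < tol:                                    # Step 8: convergence
            break
    # Step 9: posterior mean and covariance of the surviving weights.
    C = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(alpha))
    mu = C @ Phi.T @ t / sigma2
    return {"keep": keep, "mu": mu, "C": C, "sigma2": sigma2}

# Synthetic example: noisy samples of sin(x)/x.
rng = np.random.default_rng(0)
x = np.linspace(-10.0, 10.0, 60)
t = np.sinc(x / np.pi) + 0.05 * rng.standard_normal(x.size)
model = train_rvm(x[:, None], t, gamma=0.25)
n_relevance = int(np.sum(model["keep"] > 0))   # surviving kernel functions
```

In the returned dictionary, `keep` lists the surviving columns of the basis matrix, so the relevance vectors are the training inputs `x[model["keep"][model["keep"] > 0] - 1]`.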


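       The training input points corresponding to the remaining nonzero weights are called the
“relevance vectors”. After the weights and the noise variance are determined, the predictive mean
for a new input $\mathbf{x}_*$ can be found as follows:

$$ y_* = \boldsymbol{\mu}^T \Phi_*. \tag{14} $$

       In Eq. (14), $\Phi_* = [1, K(\mathbf{x}_*, \mathbf{r}_1), K(\mathbf{x}_*, \mathbf{r}_2), \ldots, K(\mathbf{x}_*, \mathbf{r}_{N_r})]^T$, where $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_{N_r}$ are the
relevance vectors.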



The total predictive variance can be found by adding the noise variance to the uncertainty due
to the variance of the weights, as follows:

$$ \sigma_*^2 = \sigma^2 + \Phi_*^T \mathbf{C} \Phi_*. \tag{15} $$
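
Eqs. (14) and (15) amount to one inner product and one quadratic form. The sketch below shows this in Python, assuming an RBF kernel of the form exp(-gamma * squared distance). The function name `predict_rvm` and every numerical value in the example (relevance vectors, weights, covariance, noise variance) are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def predict_rvm(x_new, relevance_vectors, mu, C, sigma2, gamma):
    """Predictive mean (Eq. 14) and total predictive variance (Eq. 15).

    mu holds the bias first, then one weight per relevance vector;
    C is the posterior covariance of those same weights.
    """
    x_new = np.asarray(x_new, dtype=float)
    R = np.asarray(relevance_vectors, dtype=float)
    mu = np.asarray(mu, dtype=float)
    C = np.asarray(C, dtype=float)
    sq_dist = np.sum((R - x_new) ** 2, axis=1)
    phi = np.concatenate(([1.0], np.exp(-gamma * sq_dist)))   # basis vector
    mean = float(mu @ phi)                                    # Eq. (14)
    variance = float(sigma2 + phi @ C @ phi)                  # Eq. (15)
    return mean, variance

# Hypothetical two-vector model in a two-dimensional input space.
R = [[0.0, 0.0], [1.0, 0.0]]
mu = [0.5, 2.0, -1.0]
C = 0.01 * np.eye(3)
mean, variance = predict_rvm([0.0, 0.0], R, mu, C, sigma2=0.04, gamma=1.0)
```

The second term of the variance grows when the new input lies where the weights are poorly constrained, which is what distinguishes the total uncertainty from the constant noise variance.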



3. Construction of the Ground Motion Model   
    In this section, the RVM regression algorithm is used to construct a ground motion model. In
Section 4, the resulting model is compared to an existing parametric model by Idriss (Idriss,
2008), referred to as the “I08 model” in this paper. To enable a fair comparison, the
dataset and the predictive variables of the I08 model have been adopted in this study. The RVM
algorithm does not depend on the size of the predictive variable set; additional variables can be
introduced, and the set of predictive variables can be customized to specific applications.



3.1 Ground Motion Data       
    The ground motion records used in the training have been obtained from the PEER-NGA
database (PEER, 2007). Consistent with the I08 model, a total of 942 free-field records have been
selected using the following criteria:

   •   Shear wave velocity at the top 30 m ($V_{s30}$) ranging from 450 m/s to 900 m/s,
   •   Magnitude larger than 4.5,
   •   Closest distance between the station and rupture surface (R) less than 200 km.


    Detailed information regarding these records can be found in the paper by Idriss (Idriss, 2008).



3.2 Predictive and Target Variables     
   The predictive variable set includes moment magnitude ($M$), natural logarithm of the closest
distance between the station and the rupture surface in kilometers ($\ln R$), and fault mechanism ($F$).
Idriss finds that with the shear wave velocity ($V_{s30}$) constrained to the 450 m/s to 900 m/s range, it has
negligible effect on spectral values up to 1 second. Therefore, $V_{s30}$ was not used as a predictive
variable. Following the convention used in the I08 model, earthquakes that have been assigned a fault
mechanism type 0 and 1 in the PEER database were merged into a single “strike-slip” group, while
the rest were considered to be representative of “reverse” events. In the RVM model, strike-slip
and reverse earthquakes are assigned $F = -1$ and $F = 1$, respectively. The input vector
representing the ith record has the following form:

$$ \mathbf{x}_i = [M_i, \; \ln R_i, \; F_i]^T. \tag{16} $$


   A set of eight vibration periods ($n_T = 8$) ranging from 0.01 second to 4 seconds was used in
the RVM model. The output for the ith record for the vibration period $T_j$ is defined as:

$$ y_{ij} = \ln S_a(T_j)_i \quad \text{for } j = 1 \text{ to } n_T. \tag{17} $$

    In Equation (17), $y_{ij}$ is the natural logarithm of the average horizontal component of the 5%-
damped pseudo-acceleration response spectrum. The spectral values $S_a(T_j)$ represent the median
value of the geometric mean of the two horizontal components, computed using non-redundant
rotations between 0 and 90 degrees (Boore et al., 2006).



3.3 Training of the RVM Regression Model 

    As a pre-processing step, $M$ and $\ln R$ values were linearly scaled to [-1, 1] to achieve
uniformity between the ranges of the predictive variables. There is no need to scale the fault
mechanism identifier ($F$), as it was already defined to take either -1 or 1. Because kernel functions
use Euclidean distances between pairs of input vectors, such scaling will help prevent numerical
problems due to large variations between the ranges of the values that variables can take. In the
ground motion data used in this study, the ranges of the predictive variables are
$4.53 \le M \le 7.68$ and $0.32 \text{ km} \le R \le 199.27 \text{ km}$. Therefore, input scaling takes the
following form:

$$ \tilde{M} = \frac{2M - 12.21}{3.15}, \quad \widetilde{\ln R} = \frac{2 \ln R - 4.16}{6.44}, \quad \tilde{F} = F. \tag{18} $$
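
The scaling of Eq. (18) is a linear map of each variable's range onto [-1, 1]. A minimal Python sketch follows. The constants 12.21, 3.15, 4.16 and 6.44 are those of Eq. (18); the function name `scale_inputs` and its argument names are illustrative.

```python
import math

def scale_inputs(M, R_km, F):
    """Map (M, R, F) to the scaled input vector of Eq. (18).

    M is the moment magnitude, R_km the closest distance to the rupture
    surface in kilometers, and F the fault mechanism identifier (-1 or 1).
    """
    m_scaled = (2.0 * M - 12.21) / 3.15                   # 4.53 <= M <= 7.68
    ln_r_scaled = (2.0 * math.log(R_km) - 4.16) / 6.44    # 0.32 km <= R <= 199.27 km
    return [m_scaled, ln_r_scaled, float(F)]

lower = scale_inputs(4.53, 0.32, -1)     # lower end of both ranges
upper = scale_inputs(7.68, 199.27, 1)    # upper end of both ranges
```

The constants follow from the data ranges: 12.21 = 4.53 + 7.68 and 3.15 = 7.68 - 4.53 for magnitude, with the analogous sum and difference of ln(0.32) and ln(199.27), in rounded form, for distance.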

    The optimal value of the kernel width parameter $\gamma$ for each vibration period was
determined using 10-fold cross validation (Webb, 2002). In 10-fold cross validation, the training
data is randomly partitioned into 10 subsets of (approximately) equal size; the model is trained using
9 subsets, and the remaining subset is used to compute the validation error. This process is repeated 10 times,
each time with a different validation subset, and the average validation error for a particular $\gamma$ is
computed. By computing the average validation error over a range of possible $\gamma$ values, the
optimal $\gamma$ with the smallest average validation error is determined. The resulting $\gamma$ values for
each period are listed in Table 1, along with the standard deviation of noise ($\sigma$), the mean value of
the constant term ($w_0$) and the number of relevance vectors. The relevance vectors and the
combination weights ($w_i$) are listed in Table 2.
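
The cross-validation procedure described above can be sketched generically in Python. To keep the example short and self-contained, a kernel ridge regressor is used here as a stand-in for the RVM; this substitution, the regularization value `lam`, the candidate widths, and the synthetic data are all assumptions made for illustration only. The kernel is taken as exp(-gamma * squared distance).

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """exp(-gamma * squared Euclidean distance) for every pair of rows."""
    diff = X1[:, None, :] - X2[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=-1))

def make_fit(gamma, lam=1e-2):
    """Stand-in model (kernel ridge regression), NOT the RVM itself."""
    def fit(X, t):
        K = rbf_kernel(X, X, gamma)
        coef = np.linalg.solve(K + lam * np.eye(len(t)), t)
        return X, coef, gamma
    return fit

def predict(model, X_new):
    X_train, coef, gamma = model
    return rbf_kernel(X_new, X_train, gamma) @ coef

def k_fold_error(fit, X, t, k=10, seed=0):
    """Average validation mean squared error over k random folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(t)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], t[train])
        errors.append(np.mean((t[val] - predict(model, X[val])) ** 2))
    return float(np.mean(errors))

def select_width(candidates, X, t, k=10):
    """Return the candidate width with the smallest average validation error."""
    scores = {g: k_fold_error(make_fit(g), X, t, k) for g in candidates}
    return min(scores, key=scores.get), scores

# Synthetic one-dimensional data, for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 2.0 * np.pi, size=(100, 1))
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
best_gamma, scores = select_width([1e-4, 1.0, 1e4], X, t)
```

Because the same random partition (fixed `seed`) is reused for every candidate, differences in the scores reflect the width parameter rather than the fold assignment.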


    After the RVM models, one for each vibration period, were trained, standardized residuals
were computed. Figure 1 shows the distribution of the standardized residuals, corresponding to
T=1 second, with respect to magnitude, distance and fault mechanism. The residual distribution
patterns for other periods were similar, not indicating any systematic bias.




Table 1: Kernel width parameter ($\gamma$), logarithmic standard deviation of noise ($\sigma$), mean value of
          the bias term ($w_0$) and the number of relevance vectors ($N_r$), for each period.
           T (sec)                γ                     σ               w_0              N_r
            0.01                  0.23                  0.633           -3.069            7
            0.05                  0.32                  0.666           -0.664            7
            0.10                  0.13                  0.718            0.002            7
            0.20                  0.15                  0.661          -15.042            6
            0.50                  0.25                  0.695           -8.359            7
            1.00                  0.36                  0.748           -4.670            5
            2.00                  0.28                  0.869          -6.0548            5
            4.00                  0.26                  0.983           -7.794            5




Figure 1: Standardized residuals for T=1 second.



    Table 2: Mean values of the combination weights ($w_i$) and the relevance vectors ($\mathbf{r}_i$)

                T=0.01 s.                                      T=0.05 s.
i        Wi                 ri                i       Wi                   ri
1     13.258    [-0.1937     0.2676   -1]     1     -6.177     [0.7905 -0.4227    1]
2     15.393    [0.5238     -0.2268    1]     2      6.355      [-0.3841 -0.1783 -1]
3     0.4861    [ 0.8921     0.9414   -1]     3     28.555     [0.5238    0.5856  1]
4      -5.073   [0.9619     -1.0000    1]     4     -7.930     [-0.5111   0.7896 -1]
5      -4.275   [0.9619     -0.6751    1]     5     -0.402      [0.7460 -0.4021 -1]
6     -14.173   [-0.2889     0.7862   -1]     6    -12.622     [0.9619    0.9545  1]
7      -8.086   [ 0.0603     0.9789    1]     7    -16.194     [0.0603    0.9789  1]


                 T=0.1 s.                                       T=0.2 s.
i        Wi                 ri                i       Wi                   ri
1     64.423     [0.4159    -0.1499    1]     1     29.569     [-0.8921 -0.0837 -1]
2      -6.991   [ 0.9619     0.9545    1]     2      2.293     [0.7905 -0.4227   1]
3     -36.297    [0.9619    -1.0000    1]     3     35.440     [0.8921   0.6543 -1]
4     15.875     [1.0000     0.4559   -1]     4     5.7412     [0.9619 -1.0000   1]
5      -5.599   [-0.3143     0.0809    1]     5     3.5036     [-0.8222 0.1385   1]
6     -17.361   [ 0.6508     0.9961   -1]     6    -48.496      [0.0603 0.4955 -1]
7     -25.799   [-0.1302     0.9056    1]


Table 2 (continued).

                    T=0.5 s.                                         T=1.0 s.
  i       Wi                    ri                  i      Wi                        ri
 1       6.4551      [0.7905 -0.4227   1]           1     1.9699     [0.7905         -0.4227        1]
 2       12.825      [-0.2317 -0.2931 -1]           2     4.8873     [0.0540          -0.2785       -1]
 3       0.0283     [-0.7714   0.1214   1]          3    -4.1425     [-0.7524         0.7892        1]
 4       -0.806     [ 0.8921 -0.0318 -1]            4    -3.9593      [-0.7651        0.8672       -1]
 5       8.4335      [0.8921   0.9414 -1]           5     3.7352      [-0.1302       -0.0121        1]
 6       -0.089     [ 0.9619   0.9545  1]
 7        -12.9      [ 0.0603 0.5786 -1]

                    T=2.0 s.                                         T=4.0 s.
  i       Wi                    ri                  i      Wi                        ri
 1      7.3574       [-0.2317 -0.2931 -1]           1     0.4747     [0.7460 -0.4021               -1]
 2      4.5548      [-0.0730   0.4691   1]          2     11.936     [0.7460  0.5118               -1]
 3      3.0086      [ 0.9619 -1.0000   1]           3     6.8109     [0.3714 -0.0296                1]
 4      -6.4695     [-1.0000   0.5142 -1]           4    -5.6050     [-0.7524 0.7892                1]
 5      -5.3630     [-0.7524   0.7892   1]          5    -10.180     [0.3778  1.0000               -1]


3.4 Prediction Phase   
      After training, the spectral values for a new input ($M$, $R$, $F$) can be determined
as follows:

      1. Scale the input to the range [-1, 1] using Eq. (18);
      2. Construct the basis vector $\Phi_* = [1, K(\mathbf{x}_*, \mathbf{r}_1), K(\mathbf{x}_*, \mathbf{r}_2), \ldots, K(\mathbf{x}_*, \mathbf{r}_{N_r})]^T$ using the
         relevance vectors from Table 2 and the kernel width parameter from Table 1;
      3. Determine the median spectral value (in natural-log units) using Eq. (14);
      4. Obtain the standard deviation of the noise from Table 1. Total uncertainty, if needed, can
         be determined using Eq. (15).
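
As a worked illustration of these four steps, the sketch below evaluates the T = 1.0 second model for one scenario, using the T = 1.00 row of Table 1 and the T = 1.0 s block of Table 2. It rests on two assumptions that should be checked against the original paper: the kernel is taken as exp(-gamma * squared distance), and strike-slip events are coded as F = -1. The scenario (M = 7, R = 10 km) and all Python names are illustrative.

```python
import math

# Table 1, row T = 1.00 s: kernel width, noise standard deviation, bias term.
GAMMA = 0.36
SIGMA = 0.748
BIAS = -4.670

# Table 2, block T = 1.0 s: combination weights and relevance vectors.
WEIGHTS = [1.9699, 4.8873, -4.1425, -3.9593, 3.7352]
RELEVANCE_VECTORS = [
    [0.7905, -0.4227, 1.0],
    [0.0540, -0.2785, -1.0],
    [-0.7524, 0.7892, 1.0],
    [-0.7651, 0.8672, -1.0],
    [-0.1302, -0.0121, 1.0],
]

def predict_ln_spectral_value(M, R_km, F):
    """Natural log of the median spectral value at T = 1 s (steps 1 to 3)."""
    # Step 1: scale the input with Eq. (18).
    x = [(2.0 * M - 12.21) / 3.15,
         (2.0 * math.log(R_km) - 4.16) / 6.44,
         float(F)]
    # Steps 2 and 3: kernel values at the relevance vectors, then Eq. (14).
    y = BIAS
    for w, r in zip(WEIGHTS, RELEVANCE_VECTORS):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, r))
        y += w * math.exp(-GAMMA * sq_dist)
    return y

# Scenario: M = 7, R = 10 km, strike-slip (assumed coding F = -1).
ln_median = predict_ln_spectral_value(7.0, 10.0, -1)
# Step 4: noise-only bounds in log units (16th and 84th percentiles).
ln_lower, ln_upper = ln_median - SIGMA, ln_median + SIGMA
median = math.exp(ln_median)
```

The bounds use only the noise standard deviation from Table 1. The weight-uncertainty term of Eq. (15) needs the posterior covariance matrix, which the tables do not list.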


4. Computational Results 
      The RVM model was tested using different magnitudes, distances and fault mechanisms, and the
results were compared to the I08 model. Figure 2 shows the median spectral acceleration at T=1
second, along with the 16th and 84th percentile values (median ±σ bounds) for strike-slip faults, for
M=5 (left) and M=7 (right). The circles in the figure show the spectral values from earthquakes
with the same fault mechanism and within ±0.25 magnitude units. Figure 3 shows the same
information for reverse faults. For periods of about 1 second and longer, the
median estimates from the RVM model were generally lower than those from the I08 model. At
very short distances, within ~20 km of the source, RVM estimates were higher for M=7, for both
strike-slip and reverse faulting earthquakes.




     Figure 2: Median ±σ bounds for spectral acceleration at T=1 second, strike-slip faults.




         Figure 3: Median ±σ bounds for spectral acceleration at T=1 second, reverse faults.

    Figure 4 presents the results for vibration period T=0.2 second, for strike-slip earthquakes.
The results for the reverse faulting earthquakes were similar. For shorter vibration periods, and
M=7, RVM estimates were lower than those from the I08 model. For M=5, however, RVM
predictions equaled or exceeded the I08 predictions. Regarding the variation about the median (noise
variance), the predictions from the two models were in general agreement for all vibration periods.




    Figure 4: Median ±σ bounds for spectral acceleration at T=0.2 second, strike-slip faults.


5. Conclusion 
    This paper proposes an RVM-based model for the average horizontal component of
earthquake response spectra. Given a set of predictive variables and a set of ground motion
records, the RVM model predicts the most likely spectral values in addition to their variability. An
example application has been presented where the predictions from the RVM model have been
compared to an existing parametric ground motion model. The results demonstrate the validity of
the proposed model, and suggest that it can be used as an alternative to the conventional ground
motion models.

    The RVM model offers the following advantages over its conventional counterparts: (1) There
is no need to select a fixed functional form. By determining the optimal variances associated with
the weights, the RVM automatically detects the most plausible model; (2) The resulting RVM
model has a simple mathematical structure (weighted sum of exponential basis functions), and
is based on a small number of samples that carry the most relevant information. Samples that are
not well supported by the evidence (as measured by the increase in the marginal likelihood) are
automatically pruned; (3) Because the model complexity is controlled during the training stage, the
RVM has a lower risk of over-fitting.

    One limitation of the proposed approach is that the resulting model may be difficult to
interpret. Because the RVM is not a physical model, it does not accept user-defined physical
constraints, and therefore cannot be extended to scenarios not represented in the training data
set. However, in our opinion, this does not constitute a shortcoming, considering that the reliability
of such practice is questionable in any regression model. Another potential limitation is that the RVM
requires a user-defined kernel width parameter, which does not have a very clear intuitive meaning,
especially when working with high dimensional input vectors. However, the optimal value of the
kernel width parameter can be determined using cross-validation, as has been done in this study.
Future studies will investigate the effect of using additional predictive variables on the
performance of the model.


6. Acknowledgements 
    This material is based in part upon work supported by the National Science Foundation under
Grant Number CMMI-1100735.


7. References 
Boore, D.M., J. Watson-Lamprey, and N.A. Abrahamson. (2006). Orientation-independent
       measures of ground motion. Bulletin of the Seismological Society of America, 96(4A),
       1502-1511.

Bozorgnia, Y. and K. W. Campbell. (2004). The vertical-to-horizontal response spectral ratio and
      tentative procedures for developing simplified V/H and vertical design spectra. Journal of
      Earthquake Engineering, 8(2), 175-207.

Campbell, K. W. and Y. Bozorgnia. (2003). Updated Near-Source Ground-Motion (Attenuation)
     Relations for the Horizontal and Vertical Components of Peak Ground Acceleration and
     Acceleration Response Spectra. Bulletin of the Seismological Society of America, 93(1),
     314-331.

Drucker, H., C. J. C. Burges, L. Kaufman, A. Smola and V. Vapnik. (1997). Support vector
      regression machines, Advances in Neural Information Processing Systems 9, MIT Press.

Idriss, I. M. (2008). An NGA empirical model for estimating the horizontal spectral values
        generated by shallow crustal earthquakes. Earthquake Spectra, 24(1), 217-242.

MacKay, D. J. C. (1992). Bayesian interpolation. Neural Computation, 4(3), 415-447.

MacKay, D. J. C. (1992). The evidence framework applied to classification networks. Neural
     Computation, 4(5), 720-736.

PEER. (2007). PEER-NGA Database. http://peer.berkeley.edu/nga/index.html.

Smola, A. J. and B. Schölkopf. (2004). A tutorial on support vector regression. Statistics and
       Computing, 14(3), 199-222.

Tezcan, J. and Q. Cheng. (2011). A Nonparametric Characterization of Vertical Ground Motion
      Effects. Earthquake Engineering and Structural Dynamics (in print).

Tezcan, J., Q. Cheng and L. Hill. (2010). Response Spectrum Estimation using Support Vector
      Machines, 5th International Conference on Recent Advances in Geotechnical Earthquake
      Engineering and Soil Dynamics, San Diego, CA.

Tipping, M. (2000). The relevance vector machine. Advances in Neural Information Processing
       Systems, MIT Press.

Webb, A. (2002). Statistical pattern recognition, New York, John Wiley and Sons.



            Dr. Jale Tezcan is an Associate Professor in the Department of Civil and Environmental
            Engineering at Southern Illinois University Carbondale. She earned her Ph.D. from Rice University,
            Houston, TX in 2005. Dr. Tezcan’s research interests include earthquake engineering, material
            characterization, and numerical methods.


            Dr. Qiang Cheng is an Assistant Professor in the Department of Computer Science at Southern
            Illinois University Carbondale. He earned his Ph.D. from the University of Illinois at Urbana
            Champaign, IL in 2002. Dr. Cheng’s research interests include pattern recognition, machine
            learning and signal processing.




Peer Review: This article has been internationally peer-reviewed and accepted for publication
              according to the guidelines given at the journal’s website.



*Corresponding author (J. Tezcan). Tel/Fax: +001-618-4536125. E-mail address:
jale@siu.edu. 2012. American Transactions on Engineering & Applied Sciences.
Volume 1 No.1     ISSN 2229-1652     eISSN 2229-1660.     Online Available at
http://TUENGR.COM/ATEAS/V01/25-39.pdf
