
THE STATISTICAL SOFTWARE NEWSLETTER 229

The Controlled Random Search Algorithm in Optimizing Regression Models

Ivan KŘIVÝ and Josef TVRDÍK

University of Ostrava, Dvořákova 7, 701 03 Ostrava, Czech Republic

Summary: This paper deals with the problems of controlled random search algorithms (CRS algorithms) and their use in regression analysis. A modified CRS algorithm of Price is described, which is more effective when compared with the original algorithm in optimizing regression models, primarily non-linear ones. The principal modification consists in randomizing the search for the next trial points. Some results of testing the algorithm, using both real and modelled data, are given to illustrate its possibilities when estimating the parameters of non-linear regression models.
(SSN in CSDA 20, 199-204 (1995))
Keywords: random search algorithm, non-linear regression models
Received: November 1994   Revised: February 1995

I. Introduction
The notion of the CRS algorithm was introduced by Price [9] for his seeking algorithm for the global minimum of a multimodal function, f, of d variables. The algorithm combines the simple random search and the simplex method [8] into a single continuous process. When minimizing f subject to P ∈ Ω, where P is a d-dimensional vector and Ω a bounded set in R^d, Price's algorithm is as follows:

1. Set k = 0, load storage of size N by generating randomly points P1, P2, ..., PN ∈ Ω and store also f(Pi) for i = 1, 2, ..., N.
2. Choose at random d+1 linearly independent points (N >> d) from the current configuration of N points in store, generate the new trial point P by the relation

      P = 2G − P_{d+1},   (1)

   where P_{d+1} is one (randomly taken) pole of the simplex P1 P2 ... P_{d+1} and G the centroid of the remaining d poles of the simplex, and determine f(P).
3. If f(P) < f(M), M being the point with the greatest function value of the N points stored, then replace M with P.
4. Set k = k + 1 and return to Step 2.

The algorithm was originally programmed in Basic and run on a PDP 10/20 minicomputer and on a Cyber 72 computer [9]. A FORTRAN procedure based on Price's algorithm was also presented recently by Conlon [1].

II. Modifications of Price's algorithm
Price's algorithm is very simple and easily programmed on the PC, but its convergence is usually slower compared with optimization methods based on function derivatives. It is possible to speed up its convergence in the following ways:
• by selecting the d+1 simplex vertices not from the complete configuration of N points in store, but from some subset of it containing the points with the lowest function values;
• by selecting as the simplex pole P_{d+1} in Eqn. (1) that point (from the set of d+1) which has the largest function value.

Such modifications improve convergence, but at the same time tend to increase the risk of finding a local minimum instead of a global one.

We have investigated several quite different modifications of the original Price's algorithm. Our modifications [3] are based on generating the next trial point P according to the formula

      P = G − Γ(α)(P_{d+1} − G)   (2)

instead of using Eqn. (1). Regarding the multiplication factor Γ(α), the following assumptions have been tested in detail:
• Γ(α) = α,
• Γ is a random variable with uniform distribution on the interval (0, α),
• Γ is a random variable with normal distribution N(α, 1),
α being a positive constant. Special attention is paid to investigating the effect of the Γ-factor on the running time needed for reaching acceptable optimization results. As shown later in Section VII, the best results were obtained when considering Γ distributed uniformly with α ranging from 4 to 8.

III. Optimization criteria
When estimating the parameters of regression models, the following three functions are usually considered as optimization criteria to be minimized:
• residual sum of squares

      RSS = Σ_{i=1}^{n} (y_i − ŷ_i)²;   (3)

• sum of absolute deviations

      SAD = Σ_{i=1}^{n} |y_i − ŷ_i|;   (4)

• maximum absolute deviation

      MAD = max |y_i − ŷ_i| for i = 1, 2, ..., n,   (5)

where y_i (i = 1, 2, ..., n) denote the observed values of the dependent variable, ŷ_i their estimates calculated from the estimates of regression parameters, and n the total number of observations (sample size). More robust criteria for optimization (for example median of squares [10], trimmed squares and S-estimators [11]) can be used as well.
A vector-type criterion is also applicable in evaluating the quality of regression model variants. This criterion enables finding values of the model parameters that correspond to the optimum with respect to all of its components.
When using the vector-type criterion, it is desirable to define a subsidiary scalar criterion for ordering the model variants according to their quality. Weinberger [12] recommends evaluating the value of the scalar criterion for a given regression model variant (point) in store by using the expression

      Σ_{i=1}^{m} w_i D_i,   (6)

where w_i is a subjective weight of the i-th component of the vector criterion (i = 1, 2, ..., m) and D_i the number of variants that are dominated in this component by the variant considered.

IV. Stop Criterion
The CRS algorithm given in Section I does not include any particular stop criterion. However, it is clear that the criterion has to be evaluated within each iteration of the optimizing algorithm. We propose to stop the optimization process when

      (f(P_N) − f(P_1)) / f_0 < ε_0,   (7)

ε_0 being a positive input value, f(P_N) and f(P_1) the greatest and the least value of the optimization criterion for the current iteration, resp. [3]. f_0 denotes an appropriate constant factor whose value is determined by the variability of the dependent model variable. For example, when using RSS as the optimization criterion, the f_0 factor is equal to the total sum of squares, i.e.

      f_0 = Σ_{i=1}^{n} (y_i − ȳ)².

The stop condition (7) proves to be more useful when compared with that defined by Conlon [1] in terms of f(P_N) − f(P_1).

V. Input to the algorithm
The input parameters for our modification of the CRS algorithm consist of:
• number of points, N, to hold in the store,
• value of α in Eqn. (2),
• value of ε_0 in Eqn. (7).

VI. Implementation of the algorithm
Our CRS algorithm was implemented (as a core of a program named MOR) in Turbo Pascal, version 6.0, using the ESTAT environment [13].
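The building blocks of the modified algorithm — the randomized reflection of Eqn. (2), Weinberger's scalar criterion of Eqn. (6), and the relative stop rule of Eqn. (7) — can be sketched in Python as follows. This is a hedged illustration with function names of our own, not the MOR source; note that Γ = 1 recovers the original reflection of Eqn. (1).

```python
import random

def trial_point(centroid, pole, alpha, rnd, variant="uniform"):
    """New trial point P = G - Gamma(alpha) * (P_{d+1} - G), Eqn. (2).

    variant selects the multiplication factor Gamma tested in the paper:
    "constant" gives Gamma = alpha, "uniform" draws Gamma ~ U(0, alpha),
    "normal" draws Gamma ~ N(alpha, 1); Gamma = 1 recovers Eqn. (1).
    """
    if variant == "constant":
        gamma = alpha
    elif variant == "uniform":
        gamma = rnd.uniform(0.0, alpha)
    else:
        gamma = rnd.gauss(alpha, 1.0)
    return [g - gamma * (p - g) for g, p in zip(centroid, pole)]

def scalar_criterion(variant, variants, weights):
    """Weinberger's subsidiary scalar criterion, Eqn. (6): the weighted sum
    of D_i, the number of stored variants dominated by `variant` in the
    i-th component (all components assumed to be minimized here)."""
    return sum(weights[i] *
               sum(1 for other in variants
                   if other is not variant and variant[i] < other[i])
               for i in range(len(weights)))

def should_stop(values, f0, eps0=1e-16):
    """Stop condition (7): (f(P_N) - f(P_1)) / f0 < eps0, with f(P_N) and
    f(P_1) the greatest and least stored criterion values."""
    return (max(values) - min(values)) / f0 < eps0
```

A variant with a larger weighted dominance count under `scalar_criterion` ranks as better; whether the components are minimized or maximized is our assumption, as the paper does not spell it out.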

The input data for the program include:
• definition of the regression model,
• boundaries for the individual regression parameters,
• regression data in a text file (no more than 10 000 data items),
• choice of the optimization criterion (RSS, SAD, MAD or vector-type criterion with the individual weights of its components),
• input for the modified CRS algorithm (given in Section V).

The regression model is defined by writing the corresponding regression function (as an arithmetic expression in Pascal source form) on a separate input file. Therefore, the program unit containing this file together with the main program has to be compiled once again for each new regression model under consideration.
Regarding the algorithm input, the following choices are usually found to be acceptable: N = 5d, α = 8, and ε_0 = 1E-16 (when minimizing RSS). All input is introduced interactively from the keyboard.
The course of the optimization process (number of iterations, running time, values of the minimized criterion, regression parameters and R-squared) is displayed on the screen every other second.
The optimization run stops as soon as the condition (7) is satisfied. The user is also allowed to stop the calculations interactively, e.g. in case the estimates of the regression parameters remain constant for a sufficiently long time period. The final results are saved on a text file for the sake of a subsequent evaluation, if there is a need.
The MOR source program and user guide are available for the interested reader on request per E-mail to tvrdik@oudec.osu.cz or per regular mail at the address of the authors.

[Figure 1: Empirical dependence of T on α for Γ(α) = α (×), Γ ~ N(α, 1), and Γ uniform on (0, α) (+); data [4], Example 7; N = 15, ε_0 = 1E-16.]

VII. Test Results
Both real and modelled regression data were used to illustrate the possibilities of our algorithm (MOR program) when estimating the parameters of non-linear regression models. All the calculations were performed on a PC 486DX, 33 MHz, under MS DOS, version 6.0. The results for some well-known testing examples are summarized in Figures 1, 2, and 3 as well as in Tables 1 and 2, the residual sum of squares (RSS) being chosen as the optimization criterion.
Fig. 1 shows the running time, T, as a function of the input parameter α for different distributions of Γ (see Eqn. (2)). For Γ(α) = α as well as for Γ distributed normally, the running time increases dramatically with the increasing value of α. On the other hand, when considering Γ distributed uniformly on the interval (0, α), the values of T grow almost linearly with increasing α, the observed line having a relatively small slope. This is true for all the data tested, which indicates a privileged standing of the uniform distribution among the three distributions under consideration. The use of our algorithm with Γ distributed uniformly on (0, α) permits reducing the value of N as compared with the proposal of Price [9] and, therefore, shortening the running time.
Figures 2 and 3 illustrate in more detail the effect of α on the values of T, provided that Γ has uniform distribution on (0, α). It is clear that all the graphs of T vs. α are very close to those of a linear function, except, perhaps, the parts for small α not exceeding approx. 5. Starting from our experience in optimizing non-linear regression models, we can recommend working with α ranging from 4 to 8.
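The kind of measurement behind Figures 1-3 can be reproduced in miniature with the following sketch, which counts CRS iterations to convergence for several values of α with Γ ~ U(0, α). This is our own illustration on a synthetic objective, not the paper's regression data, and all names are ours.

```python
import random

def crs_iterations(f, lower, upper, N, alpha, eps0=1e-8, max_iter=500000, seed=0):
    """Count CRS iterations until the spread of stored values drops below
    eps0; trial points follow Eqn. (2) with Gamma ~ U(0, alpha)."""
    rnd = random.Random(seed)
    d = len(lower)
    store = [[rnd.uniform(lower[j], upper[j]) for j in range(d)] for _ in range(N)]
    values = [f(p) for p in store]
    for k in range(1, max_iter + 1):
        if max(values) - min(values) < eps0:
            return k
        # Random simplex: d+1 stored points, reflect the last through the
        # centroid of the rest with a random multiplication factor.
        idx = rnd.sample(range(N), d + 1)
        pole, rest = store[idx[-1]], idx[:-1]
        G = [sum(store[i][j] for i in rest) / d for j in range(d)]
        gamma = rnd.uniform(0.0, alpha)
        trial = [G[j] - gamma * (pole[j] - G[j]) for j in range(d)]
        if all(lower[j] <= trial[j] <= upper[j] for j in range(d)):
            worst = max(range(N), key=lambda i: values[i])
            ft = f(trial)
            if ft < values[worst]:
                store[worst], values[worst] = trial, ft
    return max_iter

# Sweep alpha as in Figures 1-3, on a 2-d sphere function.
sphere = lambda p: sum(t * t for t in p)
counts = {a: crs_iterations(sphere, [-2.0] * 2, [2.0] * 2, N=15, alpha=a)
          for a in (2, 4, 8)}
```

For a smooth low-dimensional objective the growth of the iteration count with α can then be inspected directly; running time T in the figures plays the same role.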

[Figure 2: Graphs of T vs. α, Γ distributed uniformly on (0, α), for data [2] (×), data [4]-Example 7, and data [6]-Model 5 (+); N = 15, ε_0 = 1E-16.]

[Figure 3: Graphs of T vs. α, Γ distributed uniformly on (0, α), for data [4]-Example 8, data [5]-Model IV (×) and data [5]-Model V (+); N = 15, ε_0 = 1E-16.]

Table 1: The results of testing the MOR program on well-known published data:
N = 5d, α = 8, ε_0 = 1E-16.

Reference        Regression model                  Time    RSS         Parameters
[4] Example 1    β1β3x1/(1 + β1x1 + β2x2)          4.6     4.355E-05   3.1315; 15.159; 0.7801
[4] Example 4    β3(exp(−β1x1) + exp(−β2x2))       13.7    7.471E-05   13.241; 1.5007; 20.100
[4] Example 5    β3(exp(−β1x1) + exp(−β2x2))       9.9     1.252       32.000; 1.5076; 19.920
[4] Example 7    β1 + β2 exp(β3x)                  5.8     5.986E-03   15.673; 0.9994; 0.0222
[4] Example 8    β1 exp(β2/(β3 + x))               62.0    87.95       0.00561; 6181.4; 345.22
[2]              exp(β1x) + exp(β2x)               3.6     124.4       0.2578; 0.2578
[5] Model IV     β1 exp(β3x) + β2 exp(β4x)         73.9    129.0       1655.2; 3.4E07; −0.6740; −1.8160
[5] Model V      β1 x^β3 + β2 x^β4                 283.4   2.981E-05   0.00414; 3.8018; 2.0609; 0.2229

Table 2: The results of testing the MOR program on the unpublished data of Militký [6]:
N = 5d, α = 8, ε_0 = 1E-16.

Reference      Regression model                  Time    RSS         Parameters
[6] Model 2    exp(β1x) + exp(β2x)               5.4     8.896E-03   0.2807; 0.4064
[6] Model 3    β1 + β2 exp((β3 + β4x)^β5)        45.4    0.9675      9.3593; 2.0292; 1.3366; 0.4108; 0.3551
[6] Model 4    β1 exp(β3x) + β2 exp(β4x)         19.3    3.179E-04   47.971; 102.05; −0.2466; −0.4965
[6] Model 5    β1 x^β2 + β3^β2/x                 17.6    4.375E-03   0.05589; 3.5489; 1.4822
[6] Model 6    β1 + β2 x^β3 + β4 x^β5 + β6 x^β7  275.0   1.694E-02   1.9295; 2.5784; 0.8017; −1.2987; 0.8990; 0.011915; 3.0184
[6] Model 7    β1 ln(β2 + β3x)                   44.2    7.147E-05   2.0484; 18.601; 1.8021
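To make the workflow behind these tables concrete, the following self-contained Python sketch estimates the two parameters of the Jennrich-Sampson-type model exp(β1x) + exp(β2x) by minimizing RSS with the modified CRS (Γ ~ U(0, α)). This is our illustration only: the data are synthetic rather than the published test data, and all names are ours.

```python
import math
import random

def rss(beta, xs, ys, model):
    """Residual sum of squares, Eqn. (3)."""
    return sum((y - model(beta, x)) ** 2 for x, y in zip(xs, ys))

def crs_fit(obj, lower, upper, N, alpha=8.0, eps0=1e-12, max_iter=200000, seed=None):
    """Modified CRS: trial points via Eqn. (2) with Gamma ~ U(0, alpha);
    stops when the spread of stored objective values falls below eps0."""
    rnd = random.Random(seed)
    d = len(lower)
    store = [[rnd.uniform(lower[j], upper[j]) for j in range(d)] for _ in range(N)]
    values = [obj(p) for p in store]
    for _ in range(max_iter):
        if max(values) - min(values) < eps0:
            break
        idx = rnd.sample(range(N), d + 1)
        pole = store[idx[-1]]
        G = [sum(store[i][j] for i in idx[:-1]) / d for j in range(d)]
        gamma = rnd.uniform(0.0, alpha)
        trial = [G[j] - gamma * (pole[j] - G[j]) for j in range(d)]
        if any(not (lower[j] <= trial[j] <= upper[j]) for j in range(d)):
            continue  # trial left the feasible box; draw a new simplex
        worst = max(range(N), key=lambda i: values[i])
        ft = obj(trial)
        if ft < values[worst]:
            store[worst], values[worst] = trial, ft
    best = min(range(N), key=lambda i: values[i])
    return store[best], values[best]

# Synthetic data generated from exp(b1*x) + exp(b2*x) with b = (0.3, 0.1).
model = lambda b, x: math.exp(b[0] * x) + math.exp(b[1] * x)
xs = [0.5 * i for i in range(10)]
ys = [model((0.3, 0.1), x) for x in xs]
beta, best_rss = crs_fit(lambda b: rss(b, xs, ys, model),
                         [-1.0, -1.0], [1.0, 1.0], N=10, seed=7)
```

Since the model is symmetric in β1 and β2, the estimates are recovered up to a permutation; the table entry for data [2] shows the analogous two-parameter fit on the published data.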

VIII. Conclusions
Our modification of the original Price's algorithm (see Eqn. (2)), with Γ distributed uniformly on (0, α), makes possible a significant decrease in the value of the tuning parameter N as compared with the value recommended by Price, and therefore reduces the running time.
The experiments showed that our algorithm (MOR program) provides results comparable with the best ones obtained using special techniques based on the calculation of criterial function derivatives. This algorithm proved to be successful even in the cases when, according to the results of Militký [7], most of the commercial statistical packages fail.
Because the MOR program provides no information on the accuracy of the parameter estimates, it is desirable, where possible, to complete the regression results by using commonly used regression software.

References
[1] Conlon, M. (1992): The Controlled Random Search Procedure for Function Optimization. Commun. Statist.-Simula. Comput. 21, 919-923.
[2] Jennrich, R.I. & P.T. Sampson (1968): Application of Stepwise Regression to Non-Linear Estimation. Technometrics 10, 63-72.
[3] Křivý, I. & J. Tvrdík (1994): Multicriterial Optimization of Regression Models, in: R. Dutter and W. Grossmann (Eds.), COMPSTAT 1994. Short Communications and Posters. Vienna: Univ. of Technology, 25-26.
[4] Meyer, R.R. & P.M. Roth (1972): Modified damped least squares: An algorithm for non-linear estimation. J. Inst. Math. Applics. 9, 218-233.
[5] Militký, J. and M. Meloun (1994): Modus operandi of the least squares algorithm MINOPT. Talanta 40, 269-277.
[6] Militký, J., private communication.
[7] Militký, J. (1994): Nonlinear Regression on Personal Computers, in: R. Dutter and W. Grossmann (Eds.), COMPSTAT 1994. Proceedings in Computational Statistics. Heidelberg: Physica-Verlag, 395-400.
[8] Nelder, J.A. & R. Mead (1965): A simplex method for function minimization. Computer J. 7, 308-313.
[9] Price, W.L. (1976): A controlled random search procedure for global optimization. Computer J. 20, 367-370.
[10] Rousseeuw, P. (1984): Least Median of Squares Regression. J. Amer. Statist. Assoc. 79, 871-879.
[11] Rousseeuw, P. and V. Yohai (1984): Robust Regression by Means of S-Estimators, in: Robust and Nonlinear Time Series Analysis. Lecture Notes in Statistics, Vol. 26. New York: Springer-Verlag, 256-272.
[12] Weinberger, J. (1987): Extremization of vector criteria of simulation models by means of quasi-parallel handling. Comp. and Art. Intel. 6, 71-79.
[13] Zváček, L. and Řezanková, H. (1990): ESTAT: Statistical programming environment in Turbo Pascal. In: COMPSTAT 1990. Software Catalogue. Heidelberg: Physica-Verlag, 19-20.
