Analysis of Two Partial-Least-Squares Algorithms for Multivariate Calibration
ROLF MANNE
Abstract
Manne, R., 1987. Analysis of two partial-least-squares algorithms for multivariate calibration. Chemometrics and Intelligent Laboratory Systems, 2: 187-197.
Two algorithms for multivariate calibration are analysed in terms of standard linear
regression theory. The matrix inversion problem of linear regression is shown to be solved
by transformations to a bidiagonal form in PLS1 and to a triangular form in PLS2. PLS1
gives results identical with those of the bidiagonalization algorithm of Golub and Kahan, which is similar to the method of conjugate gradients. The general efficiency of the algorithms is discussed.
1 INTRODUCTION
Partial least squares (PLS) is the name of a set of algorithms developed by Wold for use in
econometrics[1,2]. They have in common that no a priori assumptions are made about the model
structure, a fact which has given rise to the name soft modelling for the PLS approach. Instead,
estimates of reliability may be made using the jack-knife or cross-validation (for a review of these
techniques, see ref.3). Although such reliability estimates seem to be essential in the description
of the PLS approach [4], they are not considered here.
The PLS approach has been used in chemometrics for extracting chemical information from
complex spectra which contain interference effects from other factors (noise) than those of primary
interest [4-10]. This problem can also be solved by using more or less standard least-squares
methods, provided that the collinearity problem is considered [11]. What is required by these
methods is a proper method for calculating generalized inverses of matrices. The PLS approach,
however, is so far only described through its algorithms, and appears to have an intuitive
character. There has been some confusion about what the PLS algorithms do, what their
underlying mathematics are, and how they relate to other formulations of linear regression theory.
The purpose of this paper is to place the two PLS algorithms that have been used in
chemometrics in relation to a more conventional description of linear regression. The two PLS
algorithms are called, in the terminology in chemometrics, PLS1 and PLS2 [6]. PLS2 considers
the case when several chemical variables are to be fitted to the spectrum and has PLS1, which
fits one such variable, as a special case. A third algorithm, which is equivalent to PLS1, has been
suggested by Martens and Næs [7]. It differs from the latter in its orthogonality relations, but
has not been used for actual calculations. It will be referred to here only in passing.
Some properties of the PLS1 algorithm have previously been described by Wold et al. [4], in
particular its equivalence to a conjugate-gradient algorithm. This equivalence, however, has not
been used further in the literature, e.g., in comparisons with other methods. Such comparisons
have been made, particularly with the method of principal components regression (PCR), by Næs
and Martens [12] and Helland [13]. A recent tutorial article by Geladi and Kowalski [14] also
attempts to present the PLS algorithms in relation to PCR but, unfortunately, suffers from a
certain lack of precision.
The outline of this paper is as follows. After the notation has been established, the solution
to the problem of linear regression is developed using the Moore-Penrose generalized inverse. The
bidiagonalization algorithm of Golub and Kahan [15] is sketched and shown to be equivalent to
the PLS1 algorithm. Properties of this solution are discussed.
The Ulvik workshop made it clear that within chemistry and the geosciences there is growing
interest in the methods of multivariate statistics but, at the same time, the theoretical background
of workers in the field is highly variable. With this situation in mind, an attempt has been made
to make the presentation reasonably self-contained.
2 NOTATION
The notation used for the PLS method varies from publication to publication. Further confusion is caused by the use of different conventions for normalization. In the following, we arrange the measurements of the calibration set in a matrix X = {X_{ij}}, where each row contains the measurements for a given sample and each column the measurements for a given variable. The number of samples (or rows) is given as n and the number of variables (or columns) as p. With the experimental situation in mind, we shall call the p measurements for sample i a spectrum, which we denote by x_i. For each sample in the calibration set, there is, in addition, a chemical variable y_i which is represented by a column vector y. In the prediction step the spectrum x of a new sample is used to predict the value ŷ for the sample.

We use bold-face capital letters for matrices and bold-face lower-case letters for vectors. The transpose of vectors and matrices will be denoted by a prime, e.g., y′ and X′. A scalar product of two (column) vectors is thus written (a′b). Whenever possible, scalar quantities obtained from vector or matrix multiplication are enclosed in parentheses. The Euclidean norm of a column vector a is written ‖a‖ = (a′a)^{1/2}. The Kronecker delta, δ_{ij}, is used to describe orthogonality relations. It takes the values 1 for i = j and 0 for i ≠ j.
3 CALIBRATION AND THE LEAST-SQUARES METHOD
The relationship between a set of spectra X and the known values y of the chemical variable is
assumed to be
y_i = b_0 + Σ_{j=1}^{p} X_{ij} b_j + noise   (i = 1, 2, ..., n)   (1)
If all variables are measured relative to their averages, the natural estimate of b_0 is zero, and in this sense b_0 may be eliminated from eqn. 1. We thus assume that the zero points of the variables y, x_1, x_2, ..., x_p are chosen so that
Σ_{i=1}^{n} y_i = 0   and   Σ_{i=1}^{n} X_{ij} = 0   (j = 1, ..., p)   (2)

Leaving out the noise term, eqn. 1 then reads

y_i = Σ_{j=1}^{p} X_{ij} b_j   (3)

or, in matrix form,

y = Xb   (4)
For the estimation of b, eq. 4 in general does not have an exact solution. In standard least squares one instead minimizes the residual error ‖e‖², defined by the relationship

e = y − Xb   (5)

The minimization leads to the normal equations

X′Xb = X′y   (6)
which reduce to eq. 4 for non-singular square matrices X. Provided that the inverse of X′X exists, the solution may be written as

b = (X′X)⁻¹X′y   (7)
If the inverse does not exist, there will be non-zero vectors c_j which fulfil

X′Xc_j = 0   (8)

Then, if b is a solution to eq. 6, so is b + Σ_j λ_j c_j (λ_j arbitrary scalars). This may be expressed by substituting for (X′X)⁻¹ any generalized inverse of X′X. A particular such generalized inverse is chosen as follows. Write X as a product of three matrices:
X = URW′   (9)

where

U′U = W′W = 1   (10)

The generalized inverse of X is then taken as

X⁺ = WR⁻¹U′   (11)
Truncations are commonly introduced by choosing the dimension of R equal to r smaller than the rank a of X. As written (no truncation implied), X⁺ fulfils not only the defining condition for a generalized inverse, i.e.,

XX⁺X = X   (12)

but also

X⁺XX⁺ = X⁺,   (XX⁺)′ = XX⁺,   (X⁺X)′ = X⁺X   (13)

which define the Moore-Penrose generalized inverse (see, e.g., ref. 17). Insertion of eq. 9 into eq. 6 gives

WW′b = WR⁻¹U′y   (14)
It may be shown [17] that the Moore-Penrose generalized inverse gives the minimum-norm solution to the least-squares problem, i.e.,

b = X⁺y = WR⁻¹U′y   (15)

is the solution which minimizes (b′b) in the case that X′X is singular and eq. 6 has multiple solutions.
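As a numerical illustration of the minimum-norm property, the following NumPy sketch (synthetic data, variable names our own) compares b = X⁺y with another solution of the normal equations for a rank-deficient X.

```python
import numpy as np

rng = np.random.default_rng(0)

# A rank-deficient calibration matrix: 8 samples, 5 variables, rank 3.
n, p, rank = 8, 5, 3
X = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, p))
y = rng.normal(size=n)

# Minimum-norm least-squares solution b = X^+ y (Moore-Penrose inverse, eq. 15).
b_min = np.linalg.pinv(X) @ y

# Another solution of the normal equations: add a null-space vector c of X'X (eq. 8).
c = np.linalg.svd(X)[2][-1]          # right singular vector with zero singular value
b_other = b_min + 2.0 * c

# Both solutions give the same residual, but b_min has the smaller norm.
print(np.linalg.norm(y - X @ b_min), np.linalg.norm(y - X @ b_other))
print(np.linalg.norm(b_min) < np.linalg.norm(b_other))          # True
```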
The orthogonal decomposition 9 gives, together with eq. 15, a simple expression for the residual error:

e = y − Xb = (1 − URW′WR⁻¹U′)y = (1 − UU′)y   (16)
There are many degrees of freedom in the orthogonal decomposition 9. From the computational point of view, it is important that the inverse R⁻¹ is simple to calculate. This may be achieved, e.g., with R diagonal or triangular. For a right triangular matrix one has R_{ij} = 0 for i > j. In that case, from the definition of the inverse it follows that

Σ_{k<j} (R⁻¹)_{ik} R_{kj} + (R⁻¹)_{ij} R_{jj} = δ_{ij}   (17)

or

(R⁻¹)_{ij} = (δ_{ij} − Σ_{k<j} (R⁻¹)_{ik} R_{kj}) / R_{jj}   (18)
A bidiagonal matrix is a special case of a triangular matrix with R_{ij} = 0 except for i = j and either i = j − 1 (right bidiagonal) or i = j + 1 (left bidiagonal). From successive application
of eq.18 it follows that the inverse of a right triangular matrix (including the bidiagonal case) is
itself right triangular.
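Eq. 18 translates directly into a short routine; the following Python sketch (function name and test matrix are our own) builds R⁻¹ column by column and only ever divides by the diagonal elements R_{jj}.

```python
import numpy as np

def right_triangular_inverse(R):
    """Invert a right (upper) triangular matrix by the recursion of eq. 18."""
    m = R.shape[0]
    Rinv = np.zeros_like(R, dtype=float)
    for j in range(m):                        # build column j of R^{-1}
        for i in range(j + 1):                # R^{-1} is itself right triangular
            delta = 1.0 if i == j else 0.0
            s = sum(Rinv[i, k] * R[k, j] for k in range(j))
            Rinv[i, j] = (delta - s) / R[j, j]
    return Rinv

R = np.triu(np.arange(1.0, 10.0).reshape(3, 3))   # a 3 x 3 right triangular matrix
print(np.allclose(right_triangular_inverse(R) @ R, np.eye(3)))   # True
```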
With a diagonal matrix R, i.e., with R_{ij} = 0 for i ≠ j, the decomposition 9 is known as the singular-value decomposition. The singular values, which are chosen to be ≥ 0, are the diagonal elements R_{ii}. Another decomposition of interest for matrix inversion is the QR decomposition
with R right triangular and W = 1, the unit matrix.
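Both special cases are available as standard library routines; the small NumPy sketch below (synthetic data, our own variable names) shows the two forms of the decomposition 9 side by side.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))

# Singular-value decomposition: X = U R W' with R diagonal and singular values >= 0.
U, s, Wt = np.linalg.svd(X, full_matrices=False)
print(np.allclose(X, U @ np.diag(s) @ Wt))                   # True

# QR decomposition: X = U R with R right triangular (the case W = 1).
Q, R = np.linalg.qr(X)
print(np.allclose(X, Q @ R), np.allclose(R, np.triu(R)))     # True True
```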
4 MATRIX TRANSFORMATIONS-PLS1

X′u_i = Σ_{j=1}^{i+1} w_j (w_j′X′u_i)   (22)
The decomposition of X according to eq. 9 therefore makes R_{ij} = u_i′Xw_j a right bidiagonal matrix. Eqs. 21 and 22 may therefore be reformulated as

X′u_i = w_{i+1}(w_{i+1}′X′u_i) + w_i(w_i′X′u_i)   (24)
Step 1: X_1 = X; y_1 = y

The iteration in Step 2 may be continued until the rank of X_i equals zero (a = rank of X) or may be interrupted earlier using, e.g., a stopping criterion from cross-validation. The present description differs from that given, e.g., by Martens and Næs [7] only by the introduction of normalized vectors {u_i}. The latter write

Step 2.2a: t_i = X_i w_i

but have equations for the other steps that give results identical with our formulation. The PLS decomposition of the X matrix then takes the form

X = TP   (26)
The equivalence of this algorithm with the ordinary PLS1 algorithm has been pointed out by
Næs and Martens [12] and shown in detail by Helland [13].
In the PLS1 algorithm the orthogonality of {u_i} follows from Steps 2.2 and 2.3 through induction. One may then reformulate Step 2.3 as

Step 2.3b: X_{i+1} = (1 − Σ_{k=1}^{i} u_k u_k′) X
with specifications for the parameter ci , called the inner PLS relation. This updating therefore
has no effect on the result. Later publications which use the inner PLS relationship [6,14] have,
in fact, the same updating expression as used here in Step 2.4.
Comparing PLS1 with Bidiag2 one finds that Steps 2.2 and 2.3b of the former give the second step of the latter, eq. 20, since (u_k′Xw_i) = 0 for k < i − 1. In order to show the equivalence of the first step of Bidiag2, eq. 19, and Step 2.1b, we write the latter as
From Steps 2.1 and 2.3 one finds

w_{i+1}‖X_{i+1}′y‖ = X_{i+1}′y = X_i′(1 − u_i u_i′)y = w_i‖X_i′y‖ − X′u_i(u_i′y)   (32)

which, inserted back in eq. 30, gives the bidiagonalization equation 19 apart from, possibly, a sign factor.
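For reference, the following NumPy sketch implements the PLS1 iteration as reconstructed from the references to Steps 2.1-2.4 above (normalized u_i, keeping the original y throughout, which the argument around eq. 32 shows gives the same vectors); it is our own transcription, not the published code, and it checks that R = U′XW comes out right bidiagonal.

```python
import numpy as np

def pls1(X, y, r):
    """PLS1 iteration (Steps 2.1-2.3 as reconstructed above); returns W, U and R = U'XW."""
    n, p = X.shape
    W, U = np.zeros((p, r)), np.zeros((n, r))
    Xi = X.copy()
    for i in range(r):
        w = Xi.T @ y
        w /= np.linalg.norm(w)             # Step 2.1: w_i = X_i'y / ||X_i'y||
        t = Xi @ w
        u = t / np.linalg.norm(t)          # Step 2.2: u_i = X_i w_i / ||X_i w_i||
        Xi = Xi - np.outer(u, u @ Xi)      # Step 2.3: X_{i+1} = (1 - u_i u_i') X_i
        W[:, i], U[:, i] = w, u
    return W, U, U.T @ X @ W

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 8)); X -= X.mean(axis=0)     # centred calibration data (eq. 2)
y = rng.normal(size=20);      y -= y.mean()

W, U, R = pls1(X, y, r=4)
b = W @ np.linalg.inv(R) @ U.T @ y                    # regression vector, eq. 36

# R has non-zeros only on the diagonal and the first superdiagonal (right bidiagonal).
outside = (R - np.triu(R)) + np.triu(R, 2)
print(np.allclose(outside, 0.0, atol=1e-10))          # True
```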
5 MATRIX TRANSFORMATIONS-PLS2
The PLS2 algorithm was designed for the case when several chemical variable vectors y_k are to be fitted using the same measured spectra X. The chemical vectors are collected as columns in a matrix Y. The algorithm may be described as follows:

Step 1: X_1 = X; Y_1 = Y
As for PLS1, the iteration may be continued until the rank of X_i is zero or may be stopped
earlier (r < a).
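Because the inner Steps 2.2.1-2.2.4 are not reproduced here, the following sketch is our own reconstruction of the PLS2 outer loop: instead of the alternating iteration of the original algorithm, it obtains w_i directly as the dominant eigenvector of X_i′Y_iY_i′X_i (eq. 60) by straightforward power iteration and then deflates both X and Y with (1 − u_iu_i′), as described in the section Understanding PLS2.

```python
import numpy as np

def pls2_factors(X, Y, r, n_power=200):
    """Sketch of the PLS2 outer loop (our reconstruction): w_i is the dominant
    eigenvector of X_i'Y_iY_i'X_i, u_i = X_i w_i/||X_i w_i||, then X and Y are deflated."""
    n, p = X.shape
    W, U = np.zeros((p, r)), np.zeros((n, r))
    Xi, Yi = X.copy(), Y.copy()
    for i in range(r):
        M = Xi.T @ Yi @ Yi.T @ Xi
        w = np.ones(p)
        for _ in range(n_power):            # power iteration for the largest eigenvalue
            w = M @ w
            w /= np.linalg.norm(w)
        t = Xi @ w
        u = t / np.linalg.norm(t)
        Xi = Xi - np.outer(u, u @ Xi)       # deflation of X
        Yi = Yi - np.outer(u, u @ Yi)       # deflation of Y
        W[:, i], U[:, i] = w, u
    return W, U, U.T @ X @ W

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 6)); X -= X.mean(axis=0)
Y = rng.normal(size=(20, 3)); Y -= Y.mean(axis=0)

W, U, R = pls2_factors(X, Y, r=4)
print(np.allclose(np.tril(R, -1), 0.0, atol=1e-10))    # U'XW is right triangular
```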
Several simplifications of the algorithm can be made. What is important in the present con-
text, however, is that the u_i and w_i vectors form two orthonormal sets, and that the transformed matrix U′XW is right triangular. Also, it may be shown that PLS2 reduces to PLS1 when the Y matrix has only one column. In the latter case z_i from Step 2.2.3 has only a single element, equal to 1. Convergence of w_i is then obtained in the first iteration.
The orthogonality u_i′u_j = δ_{ij} follows from Steps 2.2.2 and 2.3 in the same way as in PLS1. This in turn makes it possible to show that eq. 21 is valid also for PLS2, which proves the triangularity of the transformed matrix R = U′XW. Finally, the orthogonality w_i′w_j = δ_{ij} (i > j) may be established from
We write here this expression as for PLS1. The extension to PLS2, however, is trivial. Utilizing
the fact that R⁻¹ is right triangular both in PLS1 and in PLS2, the vector of regression coefficients
can be written as
b = Σ_j Σ_{i≤j} w_i (R⁻¹)_{ij} (u_j′y) = Σ_j d_j (u_j′y)   (36)
where w_i and u_j are columns of the matrices W and U, respectively. The substitution
d_j = Σ_{i≤j} w_i (R⁻¹)_{ij}   (37)

or

Σ_{k≤j} d_k R_{kj} = w_j   (38)

gives

d_j = (w_j − Σ_{k<j} d_k R_{kj}) / R_{jj}   (39)
This equation makes it possible to calculate b with little use of computer memory, especially since also

(u_j′y) = −(u_{j−1}′y) R_{j−1,j} / R_{jj}   (41)
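In code, eqs. 37-39 amount to a forward recursion that never forms R⁻¹ explicitly; the sketch below (our own notation, with an arbitrary right triangular R standing in for the PLS one) checks it against the direct expression b = WR⁻¹(U′y) of eq. 36.

```python
import numpy as np

rng = np.random.default_rng(4)
p, r = 6, 4
W = np.linalg.qr(rng.normal(size=(p, r)))[0]             # orthonormal columns, as in PLS1
R = np.triu(rng.normal(size=(r, r)) + 3.0 * np.eye(r))   # a right triangular R
uy = rng.normal(size=r)                                  # the scalars (u_j'y)

# Eq. 39: build the vectors d_j by forward recursion, no matrix inverse required.
D = np.zeros((p, r))
for j in range(r):
    D[:, j] = (W[:, j] - D[:, :j] @ R[:j, j]) / R[j, j]

b_recursive = D @ uy                       # eq. 36: b = sum_j d_j (u_j'y)
b_direct = W @ np.linalg.inv(R) @ uy       # eq. 36 with explicit R^{-1}
print(np.allclose(b_recursive, b_direct))  # True
```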
In the following, we develop eq. 35 to yield the equations for the prediction used in the PLS
literature. A basic feature of these equations is that the regression vector b is never explicitly
calculated. For this reason, the predicted value is written as
ŷ = (xb) = Σ_{j=1}^{r} (xd_j)(u_j′y) = Σ_{j=1}^{r} h_j (u_j′y)   (42)
These expressions differ from those given by Martens and Næs [7] only in the normalization. The latter write

ŷ = Σ_{j=1}^{r} t_j q_j   (44)

t_j = (x − Σ_{k<j} t_k p_k) w_j   (46)
and
p_k = t_k′X_k / (t_k′t_k) = u_k′X / (u_k′Xw_k)   (47)
i.e.
t_j = (xw_j) − Σ_{k<j} t_k R_{kj} / R_{kk}   (48)
From the identification t_j = h_j R_{jj} in eq. 48, it follows that the prediction equation 44 of Martens and Næs [7] gives the same result as eq. 42.
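The identification is easily checked numerically; the sketch below (hypothetical W, R and spectrum x) computes the scores t_j by the recursion of eq. 48 and the quantities h_j = (xd_j) of eq. 42, and confirms t_j = h_jR_{jj}. For a right bidiagonal R only the k = j − 1 term of the sum actually contributes.

```python
import numpy as np

rng = np.random.default_rng(5)
p, r = 6, 4
W = np.linalg.qr(rng.normal(size=(p, r)))[0]             # hypothetical weight vectors w_j
R = np.triu(rng.normal(size=(r, r)) + 3.0 * np.eye(r))   # hypothetical right triangular R
x = rng.normal(size=p)                                   # spectrum of a prediction sample

# Scores by the recursion of eq. 48.
t = np.zeros(r)
for j in range(r):
    t[j] = x @ W[:, j] - sum(t[k] * R[k, j] / R[k, k] for k in range(j))

# h_j = (x d_j) with the d_j of eq. 39.
D = np.zeros((p, r))
for j in range(r):
    D[:, j] = (W[:, j] - D[:, :j] @ R[:j, j]) / R[j, j]
h = x @ D

print(np.allclose(t, h * np.diag(R)))                    # t_j = h_j R_jj
```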
Wold et al. [6] and Geladi and Kowalski [14] give expressions for prediction containing the
inner PLS relationship. They write for the PLS1 case
ŷ = Σ_{j=1}^{r} c_j t_j q_j   (49)
The need for the inner relationship comes from the normalization of q_j to |q_j| = 1 used by these authors. A detailed calculation shows that c_j q_j in eq. 49 equals q_j as defined by Martens and Næs [7]. Also for PLS2, the inner relationship of eq. 49 gives results identical with those of Martens and Næs [7].
On the other hand, we believe that Sjöström et al. [5] make an erroneous use of the inner PLS
relationship in the prediction step. These authors make still another choice of normalization.
Also in other respects their prediction equations indicate an early stage of development.
Geladi and Kowalski, in their tutorial [14], discuss a procedure for obtaining orthogonal t values which we, so far, have chosen to overlook. This procedure, however, is said to be not absolutely necessary. What is described is a scaling procedure for the vectors p_j, t_j and w_j so that
p_j^{new} = p_j / ‖p_j‖
t_j^{new} = t_j ‖p_j‖
w_j^{new} = w_j ‖p_j‖   (50)
It should be noted that both before and after this scaling the vectors t_j are orthogonal. The replacements in eq. 50 also scale the values of c_j and t_j appearing in eq. 49 but have no effect upon the predicted value ŷ.
As the iteration proceeds, ‖y_s‖² becomes smaller, and the stability of this quantity may be
taken as a stopping criterion. Using Step 2.1 and eq. 21 we write
which may be used to evaluate the residual error (eq. 52). As mentioned above in connection with eq. 41, one may also use eq. 54 in the evaluation of regression coefficients.
Another quantity of interest is the normalization integral ‖X′e_s‖ = ‖X_{s+1}′y‖, which appears in the denominator of Step 2.1 of PLS1. When this quantity approaches zero the iteration scheme becomes unstable. The equation
may be derived from eq. 30. The use of eqs. 52 and 55 and further criteria for stopping was discussed by Paige and Saunders [18]. These criteria are simple to evaluate and relate directly to
the numerical properties of the PLS1 iteration scheme. For this reason, they may be of advantage
as a complement to the cross-validation currently used.
and obtain

X′X = Σ_i d_i² f_i f_i′   (57)
(no contribution from vectors f_i with d_i = 0). Application of the Lanczos equation (25), expanded according to eq. 57, yields

w_2 ∝ X′Xw_1 − w_1(w_1′X′Xw_1) = Σ_i f_i c_i [d_i² − (w_1′X′Xw_1)]   (59)

where c_i are the expansion coefficients of w_1 in the basis {f_i}.
Partial summations over functions f_i with degenerate eigenvalues d_i give the same results for w_2 as for w_1. One may therefore show that the number of linearly independent terms in the sequence {w_i} is no greater than the number of eigenvectors with distinct eigenvalues contributing to the expansion of w_1. Almost degenerate or clustered eigenvalues coupled with finite numerical
accuracy may make the expansion even shorter in practice. Compare this with PCR, where all
eigenvectors with large eigenvalues are used irrespective of their degeneracy. These properties of
PLS1 have been pointed out by Næs and Martens [12] and by Helland [13]. They are also well
established in the literature on the conjugate gradient method (e.g.,ref. 20).
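The effect is easy to demonstrate numerically. In the sketch below (synthetic data, our own construction) X is built with only three distinct singular values; since the PLS1 vectors w_i span the Krylov space generated by X′X from the starting vector X′y (cf. the Lanczos equation 25), the rank of that space, and hence the number of useful PLS factors, cannot exceed three.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 30, 10

# Build X with only three distinct singular values (a heavily degenerate spectrum).
G = np.linalg.qr(rng.normal(size=(n, p)))[0]
F = np.linalg.qr(rng.normal(size=(p, p)))[0]
d = np.array([5.0] * 4 + [2.0] * 4 + [0.5] * 2)      # three distinct values among ten
X = G @ np.diag(d) @ F.T
y = rng.normal(size=n)

# Krylov sequence X'y, (X'X)X'y, (X'X)^2 X'y, ... spanned by the PLS1 vectors {w_i}.
K = np.empty((p, 6))
v = X.T @ y
for i in range(6):
    K[:, i] = v / np.linalg.norm(v)
    v = X.T @ (X @ K[:, i])

# Its rank cannot exceed the number of distinct eigenvalues contributing to w_1.
print(np.linalg.matrix_rank(K, tol=1e-8))            # 3
```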
The exclusion of d_i = 0 not only for all w_j but also for all u_j is the main advantage of Bidiag2 over the Bidiag1 algorithm used by Paige and Saunders in their LSQR algorithm [18]. The latter starts the bidiagonalization with u_1 ∝ y and w_1 ∝ X′y, and obtains a left bidiagonal matrix along lines similar to Bidiag2. The two algorithms generate the same set of vectors {w_i},
but Bidiag1 runs into singularity problems for least-squares problems, which, however, are solved
by the application of the QR algorithm. The resulting right bidiagonal matrix is the same as that
obtained directly from Bidiag2. We have not found any obvious advantage of this algorithm over
the direct use of Bidiag2 as in PLS1.
In PCR the stopping or truncation criterion is usually the magnitude of the eigenvalue d_i. As discussed by Jolliffe [24], this may lead to omission of vectors f_i which are important for reducing the residual error ‖e_s‖², eq. 51.
On the other hand, the Bidiag2/PLS1 algorithms or, equivalently, the Lanczos algorithm do not favour only the principal components with large eigenvalues d_i. Instead, it is our own experience from eigenvalue calculations with the Lanczos algorithm in the initial tridiagonalization step [25] that convergence is first reached for eigenvalues at the ends of the spectrum. That means in the present context that principal components with large and small eigenvalues are favoured over those with eigenvalues in the middle. The open question is the extent to which the principal components with small
eigenvalues represent noise and therefore should be excluded from model building. Without
additional information about the data there is no simple solution to this problem.
9 UNDERSTANDING PLS2
In this section we consider the triangularization algorithm in PLS2. At convergence the iteration
loop contained in Step 2.2 of the algorithm (see Matrix transformations - PLS2) leads to the eigenvalue relationships

X_i′Y_iY_i′X_i w_i = k_i² w_i   (60)
where k_i², which are the numerically largest eigenvalues, may be evaluated from the product of normalization integrals in Steps 2.2.1-2.2.4. We interpret these relationships as principal component relationships of the matrix X_i′Y_i. For each matrix one thus obtains the principal component with largest variance. As mentioned before, the vectors w_i obtained in this way are mutually orthogonal, but the vectors z_i are not.

The matrix X_i′Y_i may be simplified to X′Y_i. Hence for each iteration, those parts of the column vectors of Y which overlap with u_i = X_i w_i/‖X_i w_i‖ are removed. Eventually, a u_i is
obtained that has zero (or a small) overlap with all the columns of Y, and the iteration has to
be stopped.
As for PLS1, it is of interest to relate the vectors w_i to the singular-value decomposition of X, eq. 56. We write eq. 60 as

X′A_i X w_i = k_i² w_i   (62)

We obtain

Σ_j f_j (d_j g_j′ A_i X w_i) = k_i² Σ_j f_j c_{ji}   (64)
10 DISCUSSION
The first result of this study is that both PLS algorithms, as given by Martens and Næs [7], yield the ordinary least-squares solution for invertible matrices X′X. The algorithms correspond to
standard methods for inverting matrices or solving systems of linear equations, and the various
steps of these methods are identified in the PLS algorithms. This result is likely to be known
to those who know the method in detail. However, as parts of the PLS literature are obscure,
and as even recent descriptions of the algorithms in refereed publications contain errors, it is
felt necessary to make this statement. There are, however, no reasons to believe that the errors
mentioned carry over into current computer codes.
The close relationship with conjugate gradient techniques makes it possible to speculate about
the computational utility of PLS methods relative to other methods of linear regression. As
pointed out also by Wold et al. [4], the matrix transformations of Bidiag2 are computationally
simpler than those of the original PLS1 method. Further, in the prediction step some saving
would be possible using the equations given here. For problems of moderate size this saving will
not, however, be large. For small matrices a still faster procedure for bi- or tridiagonalization is Householder's method. Savings by using this method would be important for both matrix inversion and matrix diagonalization (principal components regression). The real saving with methods of the conjugate gradient type discussed here is for large and sparse matrices where the elements can only be accessed in a fixed order.
On the other hand, with present technology neither matrix inversion nor matrix diagonal-
ization is particularly difficult, even on a small computer. The cost of obtaining high-quality
chemical data for the calibration is likely to be much higher than the cost of computing. This
puts a limit on the amount of effort one may want to invest in program refinement.
Compared with principal components regression/singular value decomposition it is clear that PLS1/Bidiag2 manages with fewer latent vectors. Like PCR, the PLS methods avoid exact linear dependences, i.e., the zero eigenvalues of the X′X matrix. On the other hand, there is room for
uncertainty in how PLS treats approximate linear dependences, i.e., small positive eigenvalues of X′X. Is it desirable to include such eigenvalues irrespective of the data considered? Detailed
studies of this problem in a PCR procedure might lead to a cut-off criterion where the smallness
of the eigenvalue is compared with the importance of the eigenvector for reducing the residual
error.
The points where the PLS algorithms depart most from standard regression methods are the
use of latent vectors (PLS factors) instead of regression coefficients in the prediction step, and
that the matrix inversion of standard regression methods is actually performed anew for each
prediction sample. As is clear from the present work and also from that of Helland [13], the
latter procedure is by no means a requirement. Once the latent vectors are obtained they may
be combined into regression coefficients, (eq. 36), i.e., into one vector giving the same predicted
value as obtained with several PLS or PCR vectors. A possible use of the PLS factors would
then be for the detection of outliers among samples supplied for prediction. For this purpose, a
regression vector is insufficient as it spans only one dimension. On the other hand, there seems
to be no guarantee that the space spanned by the PLS vectors is more suitable for this purpose
than that spanned by principal components.
It seems as if the PLS2 method has few numerical or computational advantages both relative to
PLS1/Bidiag2 performed for each dependent variable y and relative to PCR. The power method
of extracting eigenvalues, although simple to program, is inefficient, especially for near-degenerate
eigenvalues. In contrast to principal components analysis, the PLS2 eigenvalue problem changes
from iteration to iteration, which makes the saving small if matrix diagonalization is used instead.
As long as the number of dependent variables is relatively small, the use of PLS1 for each
dependent variable may well be worth the effort.
In conclusion, it can be stated that the PLS1 algorithm provides one solution to the calibration
problem using collinear data. This solution has a number of attractive features, some of which
have not yet been exploited. It is an open question, however, whether this method is the optimal
solution to the problem or not. For an answer one would have to consider the structure of the
input data in greater detail than has been done so far.
ACKNOWLEDGEMENTS
Numerous discussions with Olav M. Kvalheim are gratefully acknowledged. Thanks are also due to John Birks, Inge Helland, Terje V. Karstang, H.J.H. MacFie, Harald Martens and an unnamed referee for valuable comments.
References
[1] H. Wold, Soft modelling. The basic design and some extensions, in K. Jöreskog and H. Wold (Editors), Systems under Indirect Observation, North-Holland, Amsterdam, 1982, Vol. II, pp. 1-54.
[2] H. Wold, Partial least squares, in S. Kotz and N.L. Johnson (Editors), Encyclopedia of Statistical Sciences, Vol. 6, Wiley, New York, 1985, pp. 581-591.
[3] B. Efron and G. Gong, A leisurely look at the bootstrap, jackknife and cross-validation, The
American Statistician, 37(1983)37-48
[4] S. Wold, A. Ruhe, H. Wold and W.J. Dunn III, The collinearity problem in linear regression.
The partial least squares (PLS) approach to generalized inverses, SIAM Journal on Scientific and Statistical Computing, 5(1984)735-743.
[5] M. Sjöström, S. Wold, W. Lindberg, J.-Å. Persson and H. Martens, A multivariate calibration problem in analytical chemistry solved by partial least-squares models in latent variables, Analytica Chimica Acta, 150(1983)61-70.
[6] S. Wold, C. Albano, W.J. Dunn III, K. Esbensen, S. Hellberg, E. Johansson and M. Sjöström, Pattern recognition: finding and using regularities in multivariate data, in H. Martens and H. Russwurm, Jr. (Editors), Food Research and Data Analysis, Applied Science Publishers, London, 1983, pp. 147-188.
[7] H. Martens and T. Næs, Multivariate calibration by data compression, in H.A. Martens, Multivariate Calibration. Quantitative Interpretation of Non-selective Chemical Data, Dr. techn. thesis, Technical University of Norway, Trondheim, 1985, pp. 167-286; K. Norris and P.C. Williams (Editors), Near Infrared Technology in Agricultural and Food Industries, American Cereal Association, St. Paul, MN, in press.
[8] T.V. Karstang and R. Eastgate, Multivariate calibration of an X-ray diffractometer by partial
least squares regression, Chemometrics and Intelligent Laboratory Systems, 2(1987)209-219.
[9] A.A. Christy, R.A. Velapoldi, T.V. Karstang, O.M. Kvalheim, E. Sletten and N. Telnæs, Multivariate calibration of diffuse reflectance infrared spectra of coals as an alternative to rank determination by vitrinite reflectance, Chemometrics and Intelligent Laboratory Systems, 2(1987)221-232.
[10] K.H. Esbensen and H. Martens, Predicting oil-well permeability and porosity from wire-line
geophysical logs-a feasibility study using partial least squares regression, Chemometrics and
Intelligent Laboratory Systems, 2(1987)221-232.
[11] P.J. Brown, Multivariate calibration, Proceedings of the Royal Statistical Society, Series B,
44(1982) 287-321
[13] I.S. Helland, On the structure of partial least squares regression, Reports from the Department of Mathematics and Statistics, Agricultural University of Norway, 21(1986)44.
[14] P. Geladi and B.R. Kowalski, Partial least-squares regression: A tutorial, Analytica Chimica
Acta, 185(1986)1-17
[15] G.H. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix,
SIAM Journal on Numerical Analysis, Series B, 2(1965)205-224.
[16] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear
differential and integral operators, Journal of Research of the National Bureau of Standards,
45(1950)255-282.
[17] C.R. Rao and S.K. Mitra, Generalized Inverse of Matrices and its Applications, Wiley, New
York, 1971
[18] C.C. Paige and M.A. Saunders, A bidiagonalization algorithm for sparse linear equations
and least squares problems, ACM Transactions on Mathematical Software, 8(1982)43-71.
[19] M.R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems,
Journal of Research of the National Bureau of Standards, 49(1952)409-436.
[21] H. Wold, Estimation of principal components and related models by iterative least squares,
in P.R. Krishnaiah (Editor), Multivariate Analysis, Academic Press, New York, 1966, pp.391-
420.
[22] A.S. Householder, The Theory of Matrices in Numerical Analysis, Blaisdell Publ. Corp., New
York, 1964, reprinted by Dover Publications, New York, 1975, p. 198.
[23] C.R. Müntz, Solution directe de l'équation séculaire et des problèmes analogues transcendants, Comptes Rendus de l'Académie des Sciences, Paris, 156(1913)443-46.
[24] I.T. Jolliffe, A note on the use of principal components in regression, Applied Statistics,
31(1982)300-303.