Alexander Graham - Kronecker Products and Matrix Calculus: with Applications
CONTROL AND OPTIMAL CONTROL
D. N. BURGHES, Cranfield Institute of Technology and A. GRAHAM, The Open University, Milton Keynes.
TEXTBOOK OF DYNAMICS
F. CHORLTON, University of Aston, Birmingham.
VECTOR AND TENSOR METHODS
F. CHORLTON, University of Aston, Birmingham.
TECHNIQUES IN OPERATIONAL RESEARCH
VOLUME 1: QUEUEING SYSTEMS
VOLUME 2: MODELS, SEARCH, RANDOMIZATION
B. CONOLLY, Chelsea College, University of London.
MATHEMATICS FOR THE BIOSCIENCES
G. EASON, C. W. COLES, G. GETTINBY, University of Strathclyde.
HANDBOOK OF HYPERGEOMETRIC INTEGRALS: Theory, Applications, Tables, Computer Programs
H. EXTON, The Polytechnic, Preston.
MULTIPLE HYPERGEOMETRIC FUNCTIONS
H. EXTON, The Polytechnic, Preston.
COMPUTATIONAL GEOMETRY FOR DESIGN AND MANUFACTURE
I. D. FAUX and M. J. PRATT, Cranfield Institute of Technology.
APPLIED LINEAR ALGEBRA
R. J. GOULT, Cranfield Institute of Technology.
MATRIX THEORY AND APPLICATIONS FOR ENGINEERS AND MATHEMATICIANS
A. GRAHAM, The Open University, Milton Keynes.
APPLIED FUNCTIONAL ANALYSIS
D. H. GRIFFEL, University of Bristol.
GENERALISED FUNCTIONS: Theory, Applications
R. F. HOSKINS, Cranfield Institute of Technology.
MECHANICS OF CONTINUOUS MEDIA
S. C. HUNTER, University of Sheffield.
GAME THEORY: Mathematical Models of Conflict
A. J. JONES, Royal Holloway College, University of London.
USING COMPUTERS
B. L. MEEK and S. FAIRTHORNE, Queen Elizabeth College, University of London.
SPECTRAL THEORY OF ORDINARY DIFFERENTIAL OPERATORS
E. MÜLLER-PFEIFFER, Technical High School, Erfurt.
SIMULATION CONCEPTS IN MATHEMATICAL MODELLING
F. OLIVEIRA-PINTO, Chelsea College, University of London.
ENVIRONMENTAL AERODYNAMICS
R. S. SCORER, Imperial College of Science and Technology, University of London.
APPLIED STATISTICAL TECHNIQUES
K. D. C. STOODLEY, T. LEWIS and C. L. S. STAINTON, University of Bradford.
LIQUIDS AND THEIR PROPERTIES: A Molecular and Macroscopic Treatise with Applications
H. N. V. TEMPERLEY, University College of Swansea, University of Wales and D. H. TREVENA, University of Wales, Aberystwyth.
GRAPH THEORY AND APPLICATIONS
H. N. V. TEMPERLEY, University College of Swansea.
Kronecker Products and
Matrix Calculus:
with Applications
Chapter 1 - Preliminaries
1.1 Introduction ....................................... 11
1.2 Unit Vectors and Elementary Matrices .................... 11
1.3 Decompositions of a Matrix ............................. 13
1.4 The Trace Function .................................. 16
1.5 The Vec Operator .................................... 18
Problems for Chapter 1 ................................. 20
My purpose in writing this book is to bring to the attention of the reader some recent developments in the field of Matrix Calculus. Although some concepts, such as Kronecker matrix products, the vector derivative etc. are mentioned in a few specialised books, no book, to my knowledge, is totally devoted to this subject. The interested researcher must consult numerous published papers to appreciate the scope of the concepts involved.
Matrix calculus applicable to square matrices was developed by Turnbull [29, 30] as far back as 1927. The theory presented in this book is based on the works of Dwyer and McPhail [15] published in 1948 and others mentioned in the Bibliography. It is more general than Turnbull's development and is applicable to non-square matrices. But even this more general theory has grave limitations; in particular it requires that in general the matrix elements are nonconstant and independent. A symmetric matrix, for example, is treated as a special case. Methods of overcoming some of these limitations have been suggested, but I am not aware of any published theory which is both quite general and simple enough to be useful.
The book is organised in the following way:
Chapter 1 concentrates on the preliminaries of matrix theory and notation which is found useful throughout the book. In particular, the simple and useful elementary matrix is defined. The vec operator is defined and many useful relations are developed. Chapter 2 introduces and establishes various important properties of the matrix Kronecker product.
Several applications of the Kronecker product are considered in Chapter 3. Chapter 4 introduces Matrix Calculus. Various derivatives of vectors are defined and the chain rule for vector differentiation is established. Rules for obtaining the derivative of a matrix with respect to one of its elements and conversely are discussed. Further developments in Matrix Calculus, including derivatives of scalar functions of a matrix with respect to the matrix and matrix differentials, are found in Chapter 5.
Chapter 6 deals with the derivative of a matrix with respect to a matrix.
This includes the derivation of expressions for the derivatives of both the matrix product and the Kronecker product of matrices with respect to a matrix. There is also the derivation of a chain rule of matrix differentiation. Various applications of at least some of the matrix calculus are discussed in Chapter 7.
By making use, whenever possible, of simple notation, including many worked examples to illustrate most of the important results and other examples at the end of each Chapter (except for Chapters 3 and 7) with solutions at the end of the book, I have attempted to bring a topic studied mainly at postgraduate and research level to an undergraduate level.
Symbols and Notation Used
CHAPTER 1
Preliminaries
1.1 INTRODUCTION
In this chapter we introduce some notation and discuss some results which will be found very useful for the development of the theory of both Kronecker products and matrix differentiation. Our aim will be to make the notation as simple as possible although inevitably it will be complicated. Some simplification may be obtained at the expense of generality. For example, we may show that a result holds for a square matrix of order n × n and state that it holds in the more general case when A is of order m × n. We will leave it to the interested reader to verify such generalisations.

1.2 UNIT VECTORS AND ELEMENTARY MATRICES
The unit vectors e_1, e_2, ..., e_n of order n are defined as the columns of the unit matrix of order n, that is

e_1 = [1 0 ... 0]',  e_2 = [0 1 ... 0]',  ...,  e_n = [0 0 ... 1]'.   (1.1)
The elementary matrix E_ij, of order (m × n), has a unit element in the (i, j)th position and zeros elsewhere:

E_ij = [0 ... 0 ... 0
        ...  1  ...
        0 0 0 ... 0]   (1.2)

so that, in terms of the unit vectors,

E_ij = e_i e_j'.   (1.5)
Example 1.1
Write (i) the elementary matrix E_11 of order (3 × 3) in terms of unit vectors, and (ii) the unit matrix I of order (3 × 3) as a sum of elementary matrices.
Solution
(i)
E_11 = e_1 e_1' = [1 0 0
                   0 0 0
                   0 0 0].
(ii)
I = E_11 + E_22 + E_33 = Σ (i = 1 to 3) e_i e_i'.
The Kronecker delta δ_ij is defined as
δ_ij = 1 if i = j, and δ_ij = 0 if i ≠ j;
it can be expressed as
δ_ij = e_i' e_j.   (1.6)
We can now determine some relations between unit vectors and elementary matrices.
E_ij e_r = e_i e_j' e_r   (by 1.5)
         = δ_jr e_i   (1.7)
and
e_r' E_ij = e_r' e_i e_j'
          = δ_ri e_j'.   (1.8)
Also
E_ij E_rs = e_i e_j' e_r e_s' = δ_jr e_i e_s' = δ_jr E_is.   (1.9)
In particular if r = j, we have
E_ij E_js = E_is, whereas E_ij E_rs = 0 if j ≠ r.
1.3 DECOMPOSITIONS OF A MATRIX
We consider a matrix A of order (m × n) having the following form
A = [a_11 a_12 ... a_1n
     a_21 a_22 ... a_2n
     ...
     a_m1 a_m2 ... a_mn].
We denote the n columns of A by A.1, A.2, ..., A.n, so that
A.j = [a_1j
       a_2j
       ...
       a_mj]   (j = 1, 2, ..., n)   (1.12)
Similarly, we denote the m rows of A by A_1.', A_2.', ..., A_m.', where
A_i. = [a_i1
        a_i2
        ...
        a_in]   (i = 1, 2, ..., m)   (1.13)
Both the A.j and the A_i. are column vectors. In this notation we can write A as the (partitioned) matrix
A = [A.1  A.2  ...  A.n]   (1.14)
or as
A = [A_1.'
     A_2.'
     ...
     A_m.']   (1.15)
(where the prime means 'the transpose of').
The elements, the columns and the rows of A can be expressed in terms of the unit vectors as follows:
A.j = A e_j   (1.16)
A_i.' = e_i' A   (1.17)
a_ij = e_i' A e_j.   (1.19)
We can express A as the sum
A = Σ_i Σ_j a_ij E_ij   (1.20)
(where the E_ij are of course of the same order as A) so that
A = Σ_i Σ_j a_ij e_i e_j'.   (1.21)
From (1.16) and (1.21)
A.j = A e_j = (Σ_i Σ_k a_ik e_i e_k') e_j
    = Σ_i Σ_k a_ik e_i (e_k' e_j)
    = Σ_i a_ij e_i.   (1.22)
Similarly
A_i.' = Σ_j a_ij e_j'   (1.23)
so that
A_i. = Σ_j a_ij e_j.   (1.24)
It follows that
A = Σ_j A.j e_j'   (1.25)
and
A = Σ_i e_i A_i.'.   (1.26)
Example 1.2
Write the matrix
A = [a_11 a_12
     a_21 a_22]
as a sum of: (i) column vectors of A; (ii) row vectors of A.
Solutions
(i) Using (1.25)
A = A.1 e_1' + A.2 e_2'
  = [a_11
     a_21] [1 0] + [a_12
                    a_22] [0 1].
(ii) Using (1.26)
A = e_1 A_1.' + e_2 A_2.'
  = [1
     0] [a_11 a_12] + [0
                       1] [a_21 a_22].
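The decompositions (1.25) and (1.26) are easy to check numerically. The following sketch uses NumPy; the matrix entries are illustrative, not taken from the text.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
e = np.eye(2)  # columns of the unit matrix are the unit vectors e_1, e_2

# (1.25): A = sum_j A.j e_j'  (columns times unit row vectors)
col_sum = sum(np.outer(A[:, j], e[:, j]) for j in range(2))

# (1.26): A = sum_i e_i A_i.'  (unit column vectors times rows)
row_sum = sum(np.outer(e[:, i], A[i, :]) for i in range(2))

print(np.allclose(col_sum, A), np.allclose(row_sum, A))
```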
There exist interesting relations involving the elementary matrices operating on the matrix A. For example
E_ij A = e_i e_j' A = e_i A_j.'   (by 1.17)   (1.27)
Similarly
A E_ij = A e_i e_j' = A.i e_j'   (by 1.16)   (1.28)
so that
A E_ij' = A.j e_i'   (1.29)
and
A E_ij B = A e_i e_j' B = A.i B_j.'   (by 1.27 and 1.28)   (1.30)
Also
E_ij A E_rs = e_i e_j' A e_r e_s'   (by 1.5)
            = e_i a_jr e_s'   (by 1.19)
            = a_jr e_i e_s' = a_jr E_is.   (1.31)
In particular
E_jj A E_rr = a_jr E_jr.   (1.32)
Example 1.3
Use elementary matrices and/or unit vectors to find an expression for
(i) the product AB of the matrices A = [a_ij] and B = [b_ij];
(ii) the kth column of the product AB;
(iii) the kth column of the product XYZ of the matrices X = [x_ij], Y = [y_ij] and Z = [z_ij].
Solutions
(i) By (1.25)
A = Σ_j A.j e_j' = Σ_j A E_jj,
hence
AB = Σ_j (A E_jj)B = Σ_j (A e_j)(e_j' B)
   = Σ_j A.j B_j.'   (by (1.16) and (1.17)).
(ii) (a)
(AB).k = (AB)e_k = A(B e_k) = A B.k   by (1.16).
(b) From (i) above we can write
(AB).k = Σ_j (A e_j e_j' B) e_k
       = Σ_j (A e_j)(e_j' B e_k)
       = Σ_j A.j b_jk   by (1.16) and (1.19).
1.4 THE TRACE FUNCTION
The trace of a square matrix A of order (n × n), written tr A, is the sum of its diagonal elements:
tr A = Σ_i a_ii = tr A'   (1.33)
so that, by (1.19),
tr A = Σ_i e_i' A e_i.   (1.34)
From (1.16) and (1.34) we find
tr A = Σ_i e_i' A.i   (1.35)
and from (1.17) and (1.34)
tr A = Σ_i A_i.' e_i.   (1.36)
Also
tr AB = Σ_i e_i' A B e_i   (1.37)
      = Σ_i Σ_j (e_i' A e_j)(e_j' B e_i)
      = Σ_i Σ_j a_ij b_ji.   (1.38)
Similarly
tr BA = Σ_j e_j' B A e_j
      = Σ_j Σ_i (e_j' B e_i)(e_i' A e_j)
      = Σ_j Σ_i b_ji a_ij   (1.39)
so that tr AB = tr BA.
Also
tr AB = Σ_j B_j.' A.j.   (1.42)
Similarly
tr AB' = Σ_j A_j.' B_j.   (1.43)
and since tr AB' = tr A'B, a corresponding expression holds for tr A'B.
Further,
tr (A + B) = tr A + tr B and tr (αA) = α tr A
where α is a scalar. These properties show that trace is a linear function.
For real matrices A and B the various properties of tr (AB') indicated above show that it is an inner product, and it is sometimes written as
tr (AB') = (A, B).
1.5 THE VEC OPERATOR
Let A be a matrix of order (m × n). The vec operator stacks the columns of A into one long column vector:
vec A = [A.1
         A.2
         ...
         A.n].
From the definition it is clear that vec A is a vector of order mn.
Example 1.4
Show that we can write tr AB as (vec A')' vec B.
Solution
By (1.37)
tr AB = Σ_i e_i' A B e_i
      = Σ_i A_i.' B.i   by (1.16) and (1.17)
      = Σ_i ((A').i)' B.i
      = (vec A')' vec B.
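The identity of Example 1.4 can be checked numerically. In the sketch below, vec is implemented by column-stacking (NumPy's Fortran-order reshape); the random matrices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

def vec(M):
    # stack the columns of M into one long vector (the vec operator)
    return M.reshape(-1, order="F")

lhs = np.trace(A @ B)
rhs = vec(A.T) @ vec(B)   # (vec A')' vec B
print(np.isclose(lhs, rhs))
```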
Although at first sight this notation for the transpose is sensible and is used frequently in this book, there are associated snags. The difficulty arises when the suffix notation is not only indicative of the matrix involved but also determines specific elements, as in equations (1.31) and (1.32). On such occasions it will be necessary to use a more accurate notation indicating the matrix order and the element involved. Then instead of E_12 we will write E_12(2 × 3) and instead of E_12' we write E_21(3 × 2).
More generally, if X is a matrix of order (m × n), then the transpose of
E_rs(m × n)
will be written as
E_sr(n × m).
Problems for Chapter 1
(1) The matrix A is of order (4 × n) and the matrix B is of order (n × 3). Write the product AB in terms of the rows of A, that is A_1.', A_2.', ..., and the columns of B, that is B.1, B.2, ....
(2) Describe in words the matrices
(a) A E_kk and (b) E_kk A.
Write these matrices in terms of an appropriate product of a row or a column of A and a unit vector.
(3) Show that
(a) tr ABC = Σ_i A_i.' B C.i.
CHAPTER 2

The Kronecker Product

2.1 INTRODUCTION
The Kronecker product, also known as a direct product or a tensor product, is a concept having its origin in group theory and has important applications in particle physics. But the technique has been successfully applied in various fields of matrix theory, for example in the solution of matrix equations which arise when using Lyapunov's approach to stability theory. The development of the technique in this chapter will be as a topic within the scope of matrix algebra.
Given A = [a_ij] of order (m × n) and B of order (r × s), the Kronecker product of A and B is defined as

A ⊗ B = [a_11 B  a_12 B  ...  a_1n B
         a_21 B  a_22 B  ...  a_2n B
         ...
         a_m1 B  a_m2 B  ...  a_mn B].   (2.1)

A ⊗ B is seen to be a matrix of order (mr × ns). It has mn blocks; the (i, j)th block is the matrix a_ij B of order (r × s).
For example, let
A = [a_11 a_12
     a_21 a_22]  and  B = [b_11 b_12
                           b_21 b_22],
then
A ⊗ B = [a_11 b_11  a_11 b_12  a_12 b_11  a_12 b_12
         a_11 b_21  a_11 b_22  a_12 b_21  a_12 b_22
         a_21 b_11  a_21 b_12  a_22 b_11  a_22 b_12
         a_21 b_21  a_21 b_22  a_22 b_21  a_22 b_22].   (2.3)
Now let x = Az and y = Bw, and write u = x ⊗ y and v = z ⊗ w. To find the transformation between u and v, we determine the relations between the components of the two vectors. For example,
x_1 y_1 = (a_11 z_1 + a_12 z_2)(b_11 w_1 + b_12 w_2)
        = a_11 b_11 z_1 w_1 + a_11 b_12 z_1 w_2 + a_12 b_11 z_2 w_1 + a_12 b_12 z_2 w_2.
Proceeding in the same way for the other components of u, we find
u = [a_11 b_11  a_11 b_12  a_12 b_11  a_12 b_12
     a_11 b_21  a_11 b_22  a_12 b_21  a_12 b_22
     a_21 b_11  a_21 b_12  a_22 b_11  a_22 b_12
     a_21 b_21  a_21 b_22  a_22 b_21  a_22 b_22] v,
that is, u = (A ⊗ B)v.
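The block structure of (2.1) and the transformation u = (A ⊗ B)v can be checked directly with NumPy's `np.kron`; the numbers below are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 5.0],
              [6.0, 7.0]])

K = np.kron(A, B)
# the (i, j)th block of A ⊗ B is a_ij * B; check the (1, 2) block
block_ok = np.allclose(K[:2, 2:], A[0, 1] * B)

# if x = Az and y = Bw, then x ⊗ y = (A ⊗ B)(z ⊗ w)
z = np.array([1.0, -2.0])
w = np.array([3.0, 1.0])
mixed_ok = np.allclose(np.kron(A @ z, B @ w), K @ np.kron(z, w))
print(block_ok, mixed_ok)
```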
Example 2.1
Let E_ij be an elementary matrix of order (2 × 2) as defined in section 1.2 (see 1.4). Find the matrix
U = Σ (i = 1 to 2) Σ (j = 1 to 2) E_ij ⊗ E_ji.
Solution
U = E_11 ⊗ E_11 + E_12 ⊗ E_21 + E_21 ⊗ E_12 + E_22 ⊗ E_22
so that
U = [1 0 0 0
     0 0 1 0
     0 1 0 0
     0 0 0 1].
Note. U is seen to be a square matrix having columns which are unit vectors e_i (i = 1, 2, ...). It can be obtained from a unit matrix by a permutation of rows or columns. It is known as a permutation matrix (see also section 2.5).
= α[(i, j)th block of A ⊗ B].
Since the two blocks are equal for every (i, j), the result follows.
IV There exists
a zero element O_mn = O_m ⊗ O_n   (2.9)
and a unit element I_mn = I_m ⊗ I_n.
The unit matrices are all square; for example, I_m is the unit matrix of order (m × m).
Other important properties of the Kronecker product follow.
V
(A ⊗ B)' = A' ⊗ B'.   (2.10)
Proof
The (i, j)th block of (A ⊗ B)' is a_ji B', which is also the (i, j)th block of A' ⊗ B'. The result follows.
VI (The 'Mixed Product Rule')
(A ⊗ B)(C ⊗ D) = AC ⊗ BD   (2.11)
provided the dimensions of the matrices are such that the various expressions exist.
Proof
The (i, j)th block of the left hand side is obtained by taking the product of the ith row block of (A ⊗ B) and the jth column block of (C ⊗ D); it is of the following form
[a_i1 B  a_i2 B  ...  a_in B] [c_1j D
                               c_2j D
                               ...
                               c_nj D] = (Σ_r a_ir c_rj) BD.
The (i, j)th block of the right hand side is (by definition of the Kronecker product)
g_ij BD
where g_ij is the (i, j)th element of the matrix AC. But by the rule of matrix multiplication
g_ij = Σ_r a_ir c_rj.
The result follows.
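The mixed product rule (2.11) is easy to test on random conformable matrices; the shapes below are chosen only so that every product exists.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3)); C = rng.standard_normal((3, 2))
B = rng.standard_normal((4, 2)); D = rng.standard_normal((2, 3))

# (A ⊗ B)(C ⊗ D) = AC ⊗ BD
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
print(np.allclose(lhs, rhs))
```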
VII Given A(m × m) and B(n × n) and subject to the existence of the various inverses,
(A ⊗ B)^(-1) = A^(-1) ⊗ B^(-1).   (2.12)
Proof
Use (2.11):
(A ⊗ B)(A^(-1) ⊗ B^(-1)) = AA^(-1) ⊗ BB^(-1) = I_m ⊗ I_n = I_mn.
VIII
vec (AYB) = (B' ⊗ A) vec Y.   (2.13)
Proof
We prove (2.13) for A, Y and B each of order n × n. The result is true for A(m × n), Y(n × r), B(r × s). We use the solutions to Example 1.3 (iii). The kth column of AYB is
(AYB).k = [b_1k A  b_2k A  ...  b_nk A] [Y.1
                                         Y.2
                                         ...
                                         Y.n]
        = [B.k' ⊗ A] vec Y
        = [(B')_k.' ⊗ A] vec Y
since the transpose of the kth column of B is the kth row of B'; the result follows.
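Identity (2.13) can be verified numerically: with vec implemented as column-stacking, vec(AYB) and (B' ⊗ A) vec Y agree for any conformable matrices. The shapes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
Y = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

def vec(M):
    # stack the columns of M (Fortran order) into one long vector
    return M.reshape(-1, order="F")

lhs = vec(A @ Y @ B)
rhs = np.kron(B.T, A) @ vec(Y)   # (B' ⊗ A) vec Y
print(np.allclose(lhs, rhs))
```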
Example 2.2
Write the equation AX = C, where all the matrices are of order (n × n), in a matrix-vector form.
Solution
The equation can be written as AXI = C. Use (2.13) to find
vec (AXI) = (I ⊗ A) vec X = vec C.
Example 2.3
A and B are both of order (n × n); show that
(i) vec AB = (I ⊗ A) vec B
(ii) vec AB = (B' ⊗ A) vec I
(iii) vec AB = Σ_k (B').k ⊗ A.k.
Solution
(i) (As in Example 2.2.) In (2.13) let Y = B and B = I.
(ii) In (2.13) let Y = I.
(iii) In vec AB = (B' ⊗ A) vec I write, using (1.25),
B' ⊗ A = [Σ_i (B').i e_i'] ⊗ [Σ_j A.j e_j'] = Σ_i Σ_j [(B').i ⊗ A.j][e_i' ⊗ e_j'].
The product e_i' ⊗ e_j' is a one row matrix having a unit element in the [(i - 1)n + j]th column and zeros elsewhere. Hence the product
[(B').i ⊗ A.j][e_i' ⊗ e_j']
is a matrix having
(B').i ⊗ A.j
as its [(i - 1)n + j]th column and zeros elsewhere. Since vec I is a one column matrix having a unity in the 1st, (n + 2)nd, (2n + 3)rd, ..., n²th positions and zeros elsewhere, the product of
[(B').i ⊗ A.j][e_i' ⊗ e_j'] and vec I
is a one column matrix whose elements are all zero unless i and j satisfy
(i - 1)n + j = 1, or n + 2, or 2n + 3, ..., or n²,
that is
i = j = 1 or i = j = 2 or i = j = 3 or ..., i = j = n,
in which case the one column matrix is
(B').i ⊗ A.i   (i = 1, 2, ..., n).
The result now follows.
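Part (iii) of Example 2.3 can be checked directly: summing (B').k ⊗ A.k over k reproduces vec AB. The matrices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

def vec(M):
    return M.reshape(-1, order="F")

# vec AB = sum_k (B').k ⊗ A.k ; the kth column of B' is the kth row of B
total = sum(np.kron(B[k, :], A[:, k]) for k in range(n))
print(np.allclose(total, vec(A @ B)))
```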
IX If {λ_i} and {x_i} are the eigenvalues and the corresponding eigenvectors for A, and {μ_j} and {y_j} are the eigenvalues and the corresponding eigenvectors for B, then the eigenvalues of
A ⊗ B
are {λ_i μ_j}, with corresponding eigenvectors {x_i ⊗ y_j}.
Proof
By (2.11)
(A ⊗ B)(x_i ⊗ y_j) = (A x_i) ⊗ (B y_j)
                   = (λ_i x_i) ⊗ (μ_j y_j)
                   = λ_i μ_j (x_i ⊗ y_j)   (by 2.5).
The result follows.
X If A is of order (n × n) and B of order (m × m), then
|A ⊗ B| = |A|^m |B|^n.
Proof
Assume that λ_1, λ_2, ..., λ_n and μ_1, μ_2, ..., μ_m are the eigenvalues of A and B respectively. The proof relies on the fact (see [18] p. 145) that the determinant of a matrix is equal to the product of its eigenvalues.
Hence (from Property IX above)
|A ⊗ B| = Π_{i,j} λ_i μ_j = (Π_i λ_i)^m (Π_j μ_j)^n = |A|^m |B|^n.
XI There exist permutation matrices U_1 and U_2 such that
A ⊗ B = U_1 (B ⊗ A) U_2.   (2.14)
Proof
Let AYB' = X; then by (2.13)
(B ⊗ A) vec Y = vec X.   (1)
On taking transposes, we obtain
B Y' A' = X'
so that by (2.13)
(A ⊗ B) vec Y' = vec X'.   (2)
From Example 1.5, we know that there exist permutation matrices U_1 and U_2 such that
vec X' = U_1 vec X and vec Y = U_2 vec Y'.
Substituting for vec Y in (1) and multiplying both sides by U_1, we obtain
U_1 (B ⊗ A) U_2 vec Y' = U_1 vec X.   (3)
Substituting for vec X' in (2), we obtain
(A ⊗ B) vec Y' = U_1 vec X.   (4)
Comparing (3) and (4), the result follows.
XII If f is an analytic function such that f(A) exists, then
f(I_m ⊗ A) = I_m ⊗ f(A)   (2.15)
and
f(A ⊗ I_m) = f(A) ⊗ I_m.   (2.16)
Proof
Since f is an analytic function it can be expressed as a power series, f(A) = Σ (k = 0 to ∞) a_k A^k. Hence
f(I_m ⊗ A) = Σ_k a_k (I_m ⊗ A)^k
           = Σ_k a_k (I_m ⊗ A^k)   by (2.11)
           = I_m ⊗ Σ_k a_k A^k   by (2.7)
           = I_m ⊗ f(A).
This proves (2.15); (2.16) is proved similarly. We can write
f(A ⊗ I_m) = Σ_k a_k (A ⊗ I_m)^k
           = Σ_k a_k (A^k ⊗ I_m)   by (2.11)
           = Σ_k (a_k A^k ⊗ I_m)
           = f(A) ⊗ I_m   by (2.6).
In particular, (2.15) leads to
e^(I_m ⊗ A) = I_m ⊗ e^A   (2.17)
and (2.16) leads to
e^(A ⊗ I_m) = e^A ⊗ I_m.   (2.18)
Example 2.4
Use a direct method to verify (2.17) and (2.18).
Solution
e^(I_m ⊗ A) = (I_m ⊗ I_n) + (I_m ⊗ A) + (1/2!)(I_m ⊗ A)² + ...
            = I_m ⊗ (I_n + A + A²/2! + ...).
The right hand side is a block diagonal matrix; each of the m blocks is the sum
I_n + A + A²/2! + ... = e^A.
The result (2.17) follows. Similarly
e^(A ⊗ I_m) = (I_n ⊗ I_m) + (A ⊗ I_m) + (1/2!)(A ⊗ I_m)² + ...
            = (I_n + A + A²/2! + ...) ⊗ I_m = e^A ⊗ I_m,
which verifies (2.18).
XIV If {λ_i} and {μ_j} are the eigenvalues of A(n × n) and B(m × m) respectively, then {λ_i + μ_j} are the eigenvalues of the Kronecker sum
A ⊕ B = A ⊗ I_m + I_n ⊗ B.
Proof
Let x and y be the eigenvectors corresponding to the eigenvalues λ and μ of A and B respectively; then
(A ⊕ B)(x ⊗ y) = (A ⊗ I_m)(x ⊗ y) + (I_n ⊗ B)(x ⊗ y)
               = (Ax) ⊗ y + x ⊗ (By)
               = (λ + μ)(x ⊗ y).
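Property XIV is simple to verify numerically: the eigenvalues of A ⊗ I + I ⊗ B are, up to ordering, all sums λ_i + μ_j. The matrices below are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))

# Kronecker sum A ⊕ B = A ⊗ I_m + I_n ⊗ B
ksum = np.kron(A, np.eye(m)) + np.kron(np.eye(n), B)

lam = np.linalg.eigvals(A)
mu = np.linalg.eigvals(B)
expected = np.array([l + u for l in lam for u in mu])
got = np.linalg.eigvals(ksum)
print(np.allclose(np.sort_complex(got), np.sort_complex(expected)))
```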
Example 2.5
Verify Property XIV for
A = [1 -1
     0  2]  and  B = [1 0
                      1 2].
Solution
For the matrix A:
λ_1 = 1 and x_1 = [1; 0]
λ_2 = 2 and x_2 = [-1; 1].
For the matrix B:
μ_1 = 1 and y_1 = [1; -1]
μ_2 = 2 and y_2 = [0; 1].
We find
C = A ⊕ B = A ⊗ I + I ⊗ B = [2 0 -1  0
                             1 3  0 -1
                             0 0  3  0
                             0 0  1  4].
The eigenvalues of C are 2, 3, 3 and 4, that is {λ_i + μ_j}, with corresponding eigenvectors x_i ⊗ y_j, as Property XIV predicts.
Example 2.6
Show that
exp (A ⊕ B) = exp A ⊗ exp B
where A is of order (n × n) and B of order (m × m).
Solution
By (2.11)
(A ⊗ I_m)(I_n ⊗ B) = A ⊗ B
and
(I_n ⊗ B)(A ⊗ I_m) = A ⊗ B,
hence (A ⊗ I_m) and (I_n ⊗ B) commute, so that
exp (A ⊕ B) = exp (A ⊗ I_m + I_n ⊗ B)
            = exp (A ⊗ I_m) exp (I_n ⊗ B)
            = (exp A ⊗ I_m)(I_n ⊗ exp B)   (by 2.15 and 2.16)
            = exp A ⊗ exp B   (by 2.11).
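Example 2.6 can be checked numerically. To keep the sketch self-contained, the matrix exponential is computed from its power series (adequate for the small matrices used here); the entries are illustrative.

```python
import numpy as np

def expm_series(M, terms=40):
    # matrix exponential via its power series; fine for small matrices
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

A = np.array([[0.3, -0.1], [0.2, 0.0]])
B = np.array([[0.1, 0.4], [0.0, -0.2]])
n, m = 2, 2

ksum = np.kron(A, np.eye(m)) + np.kron(np.eye(n), B)   # A ⊕ B
lhs = expm_series(ksum)
rhs = np.kron(expm_series(A), expm_series(B))          # exp A ⊗ exp B
print(np.allclose(lhs, rhs))
```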
Example 2.7
Given the permutation matrix
U = [1 0 0 0 0 0
     0 0 0 1 0 0
     0 1 0 0 0 0
     0 0 0 0 1 0
     0 0 1 0 0 0
     0 0 0 0 0 1],
verify that U vec X = vec X' for any matrix X of order (3 × 2).
Solution
Writing X = [x_ij] of order (3 × 2), direct multiplication gives
U vec X = (x_11, x_12, x_21, x_22, x_31, x_32)' = vec X'.
In general, for X of order (m × n),
X' = Σ_{r,s} E_sr(n × m) X E_sr(n × m)
so that, by (2.13),
vec X' = Σ_{r,s} [E_rs(m × n) ⊗ E_sr(n × m)] vec X.
It follows that
U = Σ_{r,s} E_rs(m × n) ⊗ E_sr(n × m)   (2.24)
and
U' = Σ_{r,s} E_sr(n × m) ⊗ E_rs(m × n).   (2.25)
Notice that U is a matrix of order (mn × mn).
At first sight it may appear that the evaluation of the permutation matrices U_1 and U_2 in (2.14) using (2.24) is a major task. In fact this is one of the examples where the practice is much easier than the theory.
We can readily determine the form of a permutation matrix, as in Example 2.7. So the only real problem is to determine the orders of the two matrices. Since the matrices forming the product (2.14) must be conformable, the orders of the matrices U_1 and U_2 are determined respectively by the number of rows and the number of columns of (A ⊗ B).
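Formula (2.24) can be built up directly from elementary matrices and tested against its defining property, U vec X = vec X'. The sketch below constructs U for a (3 × 2) matrix X.

```python
import numpy as np

def E(r, s, shape):
    # elementary matrix E_rs: a single 1 in position (r, s)
    M = np.zeros(shape)
    M[r, s] = 1.0
    return M

def vec(M):
    return M.reshape(-1, order="F")

m, n = 3, 2
# (2.24): U = sum_{r,s} E_rs(m x n) ⊗ E_sr(n x m)
U = sum(np.kron(E(r, s, (m, n)), E(s, r, (n, m)))
        for r in range(m) for s in range(n))

X = np.arange(6.0).reshape(m, n)
print(np.allclose(U @ vec(X), vec(X.T)))
```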
Example 2.8
Let A = [a_ij] be a matrix of order (2 × 3), and B = [b_ij] be a matrix of order (2 × 2). Determine the permutation matrices U_1 and U_2 such that
A ⊗ B = U_1 (B ⊗ A) U_2.
Solution
(A ⊗ B) is of the order (4 × 6). From the above discussion we conclude that U_1 is of order (4 × 4) and U_2 is of order (6 × 6). As in Example 2.7,
U_1 = [1 0 0 0
       0 0 1 0
       0 1 0 0
       0 0 0 1]
and
U_2 = [1 0 0 0 0 0
       0 0 1 0 0 0
       0 0 0 0 1 0
       0 1 0 0 0 0
       0 0 0 1 0 0
       0 0 0 0 0 1].
Problems for Chapter 2
(1) Given
U = Σ_{r,s} E_rs(m × n) ⊗ E_sr(n × m),
show that
U^(-1) = U' = Σ_{r,s} E_sr(n × m) ⊗ E_rs(m × n).
(2) A = [a_ij], B = [b_ij] and Y = [y_ij] are matrices all of order (2 × 2); use a direct method to evaluate
(a) (i) AYB
    (ii) B' ⊗ A.
(b) Verify (2.13), that is
vec AYB = (B' ⊗ A) vec Y.
(3) Given A and B, both of order (2 × 2),
(a) calculate
A ⊗ B and B ⊗ A;
(b) find matrices U_1 and U_2 such that
A ⊗ B = U_1 (B ⊗ A) U_2.
(4) Given A of order (2 × 2), calculate
(a) exp (A)
(b) exp (A ⊗ I).
Verify (2.16), that is
exp (A) ⊗ I = exp (A ⊗ I).
(5) Given
A = [2  1
     -1 -1]
and B of order (2 × 2), calculate
(a) A^(-1) ⊗ B^(-1) and
(b) (A ⊗ B)^(-1).
Hence verify (2.12), that is
(A ⊗ B)^(-1) = A^(-1) ⊗ B^(-1).
(6) Given A and B, find …
CHAPTER 3

Some Applications of the Kronecker Product

3.1 INTRODUCTION
There are numerous applications of the Kronecker product in various fields including statistics, economics, optimisation and control. It is not our intention to discuss applications in all these fields, just a selected number to give an idea of the problems tackled in some of the literature mentioned in the Bibliography. There is no doubt that the interested reader will find there various other applications, hopefully in his own field of interest.
A number of the applications involve the derivative of a matrix; it is a well known concept (for example see [18] p. 229) which we now briefly review.
3.2 THE DERIVATIVE OF A MATRIX
Given the matrix A = [a_ij(t)], whose elements are functions of the scalar t, its derivative and integral are defined element by element:
dA/dt = [da_ij/dt]   (3.1)
and
∫ A dt = [∫ a_ij dt].   (3.2)
For example, if
A = [2t²  sin t
     t²   2],
then
dA/dt = [4t  cos t
         2t  0]
and
∫ A dt = [2t³/3  -cos t
          t³/3   2t] + C
where C is a constant matrix.
One important property follows immediately. Given conformable matrices A(t) and B(t), then
d(AB)/dt = (dA/dt)B + A(dB/dt).   (3.3)
Example 3.1
Given
C = A ⊗ B
(each matrix is assumed to be a function of t), show that
dC/dt = (dA/dt) ⊗ B + A ⊗ (dB/dt).   (3.4)
Solution
On differentiating the (i, j)th block of A ⊗ B, we obtain
d(a_ij B)/dt = (da_ij/dt)B + a_ij (dB/dt),
which is the (i, j)th block of
(dA/dt) ⊗ B + A ⊗ (dB/dt);
the result follows.
3.3 PROBLEM 1
Determine the condition for the equation
AX + XB = C
to have a unique solution.
Solution
We have already considered this equation and wrote it (2.21) as
(B' ⊕ A) vec X = vec C
or
Gx = c.   (3.5)
Equation (3.5) has a unique solution iff G is nonsingular, that is iff the eigenvalues of G are all nonzero. By Property XIV (see section 2.4) the eigenvalues of G are {λ_i + μ_j}, where {λ_i} and {μ_j} are the eigenvalues of A and B respectively (note that the eigenvalues of the matrix B' are the same as the eigenvalues of B). Equation (3.5) therefore has a unique solution iff
λ_i + μ_j ≠ 0 for all i and j.
If G is singular, we consider the augmented matrix
[G : c].
If the rank of [G : c] is equal to the rank of G, then solutions do exist; otherwise the set of equations
AX + XB = C
is not consistent.
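The vec approach of Problem 1 translates directly into a solver: form G = B' ⊗ I + I ⊗ A, solve Gx = vec C, and reshape. The matrices below are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

def vec(M):
    return M.reshape(-1, order="F")

# G = B' ⊕ A = B' ⊗ I + I ⊗ A, so that G vec X = vec C
G = np.kron(B.T, np.eye(n)) + np.kron(np.eye(n), A)
x = np.linalg.solve(G, vec(C))
X = x.reshape(n, n, order="F")

ok = np.allclose(A @ X + X @ B, C)
print(ok)
```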
Example 3.2
Obtain the solution to
AX + XB = C.
Solution
Writing the equation in the form of (3.5), we obtain Gx = c.
(i) If A and (-B) have no eigenvalue in common, G is nonsingular and
x = G^(-1)c
is the unique solution.
(ii) If A and (-B) have one eigenvalue (λ = 1, say) in common, G is singular. Provided that rank [G : c] = rank G, solutions still exist; in this case we obtain particular solutions X_1 and X_2, and any other solution is a linear combination of X_1 and X_2.
3.4 PROBLEM 2
Determine the condition for the equation
AX - XA = μX   (3.6)
to have a nontrivial solution.
Solution
We can write (3.6) as
Hx = μx   (3.7)
where H = I ⊗ A - A' ⊗ I and
x = vec X.
(3.7) has a nontrivial solution for x iff
|μI - H| = 0.
3.5 PROBLEM 3
Use the fact (see [18] p. 230) that the solution to
dx/dt = Ax,  x(0) = c   (3.8)
is
x = exp (At)c   (3.9)
to solve the equation
dX/dt = AX + XB,  X(0) = C.   (3.10)
Solution
Using the vec operator on (3.10), we obtain
dx/dt = (B' ⊕ A)x = Gx,  x(0) = vec C,
where x = vec X. Hence by (3.9)
vec X = exp (Gt) vec C = [exp (B't) ⊗ exp (At)] vec C   (by Example 2.6)
so that, by (2.13),
X = exp (At) C exp (Bt).
Example 3.4
Obtain the solution to (3.10) when
A = [3 0
     0 2],  B = [1  0
                 0 -1]  and  C = [-2 1
                                   1 0].
Solution
(See [18] p. 227.)
X = exp (At) C exp (Bt) = [-2e^(4t)  e^(2t)
                            e^(3t)   0].
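The two forms of the solution of (3.10), vec X = exp(Gt) vec C and X = exp(At) C exp(Bt), can be compared numerically at a fixed t; the series exponential below keeps the sketch self-contained.

```python
import numpy as np

def expm_series(M, terms=40):
    # matrix exponential via its power series; fine for small matrices
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def vec(M):
    return M.reshape(-1, order="F")

A = np.array([[3.0, 0.0], [0.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])
C = np.array([[-2.0, 1.0], [1.0, 0.0]])
t = 0.5

G = np.kron(B.T, np.eye(2)) + np.kron(np.eye(2), A)   # B' ⊕ A
x = expm_series(G * t) @ vec(C)                       # vec X = exp(Gt) vec C
X = expm_series(A * t) @ C @ expm_series(B * t)       # X = e^{At} C e^{Bt}
print(np.allclose(x, vec(X)))
```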
3.6 PROBLEM 4
We consider a problem similar to the previous one but in a different context. An important concept in Control Theory is the transition matrix. Very briefly, associated with the equations
dX/dt = A(t)X or dx/dt = A(t)x
is the transition matrix Φ_1(t, τ) having the following two properties:
dΦ_1(t, τ)/dt = A(t)Φ_1(t, τ)   (3.13)
and
Φ_1(τ, τ) = I.
[For simplicity of notation we shall write Φ for Φ(t, τ).] If A is a constant matrix, it is easily shown that
Φ_1 = exp (At).
Similarly, with the equation
dX/dt = XB, so that dX'/dt = B'X',
we associate the transition matrix Φ_2 such that
dΦ_2/dt = B'Φ_2.   (3.14)
The problem is to find the transition matrix associated with the equation
dX/dt = AX + XB   (3.15)
given the transition matrices Φ_1 and Φ_2 defined above.
Solution
We can write (3.15) as
dx/dt = Gx
where x and G were defined in the previous problem.
We define a matrix ψ as
ψ(t, τ) = Φ_2(t, τ) ⊗ Φ_1(t, τ).   (3.16)
We obtain by (3.4)
dψ/dt = (dΦ_2/dt) ⊗ Φ_1 + Φ_2 ⊗ (dΦ_1/dt)
      = (B'Φ_2) ⊗ Φ_1 + Φ_2 ⊗ (AΦ_1)   by (3.13) and (3.14)
      = [B' ⊗ I + I ⊗ A][Φ_2 ⊗ Φ_1]   by (2.11).
Hence
dψ/dt = Gψ.   (3.17)
Also
ψ(t, t) = Φ_2(t, t) ⊗ Φ_1(t, t)
        = I ⊗ I
        = I.   (3.18)
The two equations (3.17) and (3.18) prove that ψ is the transition matrix for (3.15).
Example 3.5
Find the transition matrix for the equation
dX/dt = [3 0
         0 2] X + X [1  0
                     0 -1].
Solution
In this case both A and B are constant matrices. From Example 3.4,
Φ_1 = exp (At) = [e^(3t)  0
                  0       e^(2t)]
and
Φ_2 = exp (Bt) = [e^t  0
                  0    e^(-t)]
so that
ψ = Φ_2 ⊗ Φ_1 = [e^(4t) 0      0      0
                 0      e^(3t) 0      0
                 0      0      e^(2t) 0
                 0      0      0      e^t].
For this equation
G = B' ⊗ I + I ⊗ A = diag (4, 3, 2, 1)
and it is easily verified that
dψ/dt = Gψ
and
ψ(0) = I.
3.7 PROBLEM 5
Solve the equation
AXB = C   (3.19)
where all matrices are of order n × n.
Solution
Using (2.13) we can write (3.19) in the form
Hx = c   (3.20)
where H = B' ⊗ A, x = vec X and c = vec C.
The criteria for the existence and the uniqueness of a solution to (3.20) are well known (see for example [18]).
The above method of solving the problem is easily generalised to the linear equation of the form
A_1 X B_1 + A_2 X B_2 + ... + A_r X B_r = C.   (3.21)
Equation (3.21) can be written as, for example, (3.20) where this time
H = B_1' ⊗ A_1 + B_2' ⊗ A_2 + ... + B_r' ⊗ A_r.
Example 3.6
Find the matrix X, given
A_1 X B_1 + A_2 X B_2 = C
where A_1, B_1, A_2, B_2 and C are given matrices of order (2 × 2), with c = vec C = [4 0 -6 8]'.
Solution
Forming H = B_1' ⊗ A_1 + B_2' ⊗ A_2, it follows that
x = H^(-1)c,
and X is recovered by arranging the elements of x = vec X into a (2 × 2) matrix.
3.8 PROBLEM 6
This problem is to determine a constant output feedback matrix K so that the closed loop matrix of a system has preassigned eigenvalues.
A multivariable system is defined by the equations
dx/dt = Ax + Bu   (3.22)
y = Cx
where A(n × n), B(n × m) and C(r × n) are constant matrices. u, x and y are column vectors of order m, n and r respectively.
Solution
Various solutions exist to this problem. We are interested in the application of the Kronecker product and will follow a method suggested in [24].
We consider a matrix H(n × n) whose eigenvalues are the desired values λ_1, λ_2, ..., λ_n, that is, we require the feedback matrix K to satisfy
H - A = BKC.   (3.28)
By (2.13) this can be written as
(C' ⊗ B) vec K = vec (H - A)   (3.29)
or more simply as
Pk = q   (3.30)
where P = C' ⊗ B, k = vec K and q = vec Q with Q = H - A.
Premultiplying (3.30) by a suitable nonsingular matrix T, we can write it as TPk = Tq, where
TP = [P_1
      P_2]   (3.31)
and
Tq = [u
      v].   (3.32)
If the rank of P is mr, then P_1 is of order (mr × mr), P_2 is of order ((n² - mr) × mr), and u and v are of order mr and (n² - mr) respectively.
A sufficient condition for the existence of a solution to (3.32), or equivalently to (3.30), is that
v = 0   (3.33)
in (3.32).
If the condition (3.33) holds and rank P_1 = mr, then
k = P_1^(-1) u.   (3.34)
The condition (3.33) depends on an appropriate choice of H. The underlying assumption being made is that a matrix H satisfying this condition does exist. This in turn depends on the system under consideration, for example whether it is controllable.
Some obvious choices for the form of matrix H are: (a) diagonal, (b) upper or lower triangular, (c) companion form or (d) certain combinations of the above forms.
Although forms (a) and (b) are well known, the companion form is less well documented. Very briefly, the matrix
H = [0     1     0     ...  0
     0     0     1     ...  0
     ...
     -a_0  -a_1  -a_2  ...  -a_(n-1)]
is said to be in 'companion' form; it has the associated characteristic equation
λ^n + a_(n-1)λ^(n-1) + ... + a_1 λ + a_0 = 0.   (3.35)
Example 3.7
Determine the feedback matrix K so that the two input, two output system
dx/dt = [0  1  0
         3  0  4
         2 -3 -6] x + [0 0
                       1 0
                       0 1] u
y = [1 0 1
     1 1 0] x
has closed loop eigenvalues (-1, -2, -3).
Solution
We must first decide on the form of the matrix H.
Since (see (3.28))
H - A = BKC
and the first row of B is zero, it follows that the first row of
H - A
must be zero. We must therefore choose H in the companion form.
Since the characteristic equation of H is
(λ + 1)(λ + 2)(λ + 3) = λ³ + 6λ² + 11λ + 6 = 0,
H = [0   1   0
     0   0   1
     -6 -11 -6]   (see (3.35))
and hence (see (3.30))
Q = H - A = [0  0  0
             -3 0 -3
             -8 -8 0]
and
P = C' ⊗ B = [0 0 0 0
              1 0 1 0
              0 1 0 1
              0 0 0 0
              0 0 1 0
              0 0 0 1
              0 0 0 0
              1 0 0 0
              0 1 0 0].
A suitable matrix T is
T = [0  1  0  0  0  0  0  0  0
     0  0  1  0  0  0  0  0  0
     0  0  0  0  1  0  0  0  0
     0  0  0  0  0  1  0  0  0
     1  0  0  0  0  0  0  0  0
     0  0  0  1  0  0  0  0  0
     0  0  0  0  0  0  1  0  0
     0 -1  0  0  1  0  0  1  0
     0  0 -1  0  0  1  0  0  1].
It follows that
TP = [1 0 1 0
      0 1 0 1
      0 0 1 0
      0 0 0 1
      ----------
      0 0 0 0
      0 0 0 0
      0 0 0 0
      0 0 0 0
      0 0 0 0]
and
Tq = [-3
      -8
      0
      -8
      ---
      0
      0
      0
      0
      0]
so that v = 0. Since
P_1 = [1 0 1 0
       0 1 0 1
       0 0 1 0
       0 0 0 1]
is nonsingular,
k = P_1^(-1) u = [-3
                  0
                  0
                  -8].
Hence
K = [-3  0
      0 -8].
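The result of Example 3.7 can be checked numerically: with the system matrices as reconstructed above (which should be verified against an original copy of the text), the closed loop matrix A + BKC has the required eigenvalues.

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [3.0, 0.0, 4.0],
              [2.0, -3.0, -6.0]])
B = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
C = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
K = np.array([[-3.0, 0.0],
              [0.0, -8.0]])

closed = A + B @ K @ C          # the closed loop matrix
eigs = np.sort(np.linalg.eigvals(closed).real)
print(eigs)
```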
CHAPTER 4

Introduction to Matrix Calculus

4.1 INTRODUCTION
It is becoming increasingly clear that there is a real need for matrix calculus in fields such as multivariate analysis. There is a strong analogy here with matrix algebra, which is such a powerful and elegant tool in the study of linear systems and elsewhere.
Expressions in multivariate analysis can be written in terms of scalar calculus, but the compactness of the equivalent relations in terms of matrices not only leads to a better understanding of the problems involved, but also encourages the consideration of problems which may be too complex to tackle by scalar calculus.
We have already defined the derivative of a matrix with respect to a scalar (see (3.1)); we now generalise this concept. The process is frequently referred to as formal or symbolic matrix differentiation. The basic definitions involve the partial differentiation of scalar matrix functions with respect to all the elements of a matrix. These derivatives are the elements of a matrix, of the same order as the original matrix, which is defined as the derived matrix. The words 'formal' and 'symbolic' refer to the fact that the matrix derivatives are defined without the rigorous mathematical justification which we expect for the corresponding scalar derivatives. This is not to say that such justification cannot be made; rather, the fact is that this topic is still in its infancy and the appropriate mathematical basis is being laid as the subject develops. With this in mind we make the following observations about the notation used. In general the elements of the matrices A, B, C, ... will be constant scalars. On the other hand the elements of the matrices X, Y, Z, ... are scalar variables, and we exclude the possibility that any element can be a constant or zero. In general we will also demand that these elements are independent. When this is not the case, for example when the matrix X is symmetric, it is considered as a special case. The reader will appreciate the necessity for these restrictions when he considers the partial derivative of (say) a matrix X with respect to one of its elements x_rs. Obviously the derivative is undefined if x_rs is a constant. The derivative is E_rs if x_rs is independent of all the other elements of X, but is E_rs + E_sr if X is symmetric.
There have been attempts to define the derivative when x_rs is a constant (or zero) but, as far as this author knows, no rigorous mathematical theory for the general case has been proposed and successfully applied.
4.2 THE DERIVATIVES OF VECTORS
Let y = [y_1 y_2 ... y_m]' be a vector whose elements are functions of the elements of x = [x_1 x_2 ... x_n]'. We define the derivative of y with respect to x as the (n × m) matrix
∂y/∂x = [∂y_j/∂x_i].   (4.1)
When y is a scalar,
∂y/∂x = [∂y/∂x_1  ∂y/∂x_2  ...  ∂y/∂x_n]'   (4.2)
and when x is a scalar,
∂y/∂x = [∂y_1/∂x  ∂y_2/∂x  ...  ∂y_m/∂x].   (4.3)
Example 4.1
Given y = [y_1 y_2]' and x = [x_1 x_2 x_3]' where
y_1 = x_1² - x_2
y_2 = x_3² + 3x_2,
obtain ∂y/∂x.
Solution
∂y/∂x = [∂y_1/∂x_1  ∂y_2/∂x_1
         ∂y_1/∂x_2  ∂y_2/∂x_2
         ∂y_1/∂x_3  ∂y_2/∂x_3] = [2x_1  0
                                  -1    3
                                  0     2x_3].
In multivariate analysis, if x and y are of the same order, the absolute value of the determinant of ∂x/∂y, that is of
|∂x/∂y|,
is called the Jacobian of the transformation determined by
y = y(x).
Example 4.2
The transformation from spherical to cartesian co-ordinates is defined by x = r sin θ cos ψ, y = r sin θ sin ψ, and z = r cos θ, where r > 0, 0 < θ < π and 0 ≤ ψ < 2π.
Obtain the Jacobian of the transformation.
Solution
Let
x = [x y z]'
and
r = y_1, θ = y_2, ψ = y_3.
Then
J = |∂x/∂y|
  = |sin y_2 cos y_3        sin y_2 sin y_3        cos y_2
     y_1 cos y_2 cos y_3    y_1 cos y_2 sin y_3    -y_1 sin y_2
     -y_1 sin y_2 sin y_3   y_1 sin y_2 cos y_3    0|
  = y_1² sin y_2 = r² sin θ.
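The Jacobian of Example 4.2 can be checked by forming ∂x/∂y numerically with central differences at an illustrative point and comparing its determinant with r² sin θ.

```python
import numpy as np

def cart(y):
    # spherical (r, theta, psi) to cartesian (x, y, z)
    r, th, ps = y
    return np.array([r * np.sin(th) * np.cos(ps),
                     r * np.sin(th) * np.sin(ps),
                     r * np.cos(th)])

y0 = np.array([2.0, 0.7, 1.1])   # an arbitrary admissible point
h = 1e-6
# rows are derivatives with respect to y_i, as in definition (4.1)
J = np.array([(cart(y0 + h * np.eye(3)[i]) - cart(y0 - h * np.eye(3)[i])) / (2 * h)
              for i in range(3)])

det_J = abs(np.linalg.det(J))
expected = y0[0] ** 2 * np.sin(y0[1])   # r^2 sin(theta)
print(abs(det_J - expected) < 1e-5)
```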
Definitions (4.1), (4.2) and (4.3) can be used to obtain the derivatives of many frequently used expressions, including quadratic and bilinear forms.
For example, consider
y = x'Ax.
Using (4.2) it is not difficult to show that
∂y/∂x = Ax + A'x
      = 2Ax if A is symmetric.
We can of course differentiate the vector 2Ax with respect to x; by definition (4.1),
∂/∂x (∂y/∂x) = ∂/∂x (2Ax)
             = 2A' = 2A (if A is symmetric).
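The gradient formula ∂(x'Ax)/∂x = Ax + A'x can be confirmed against central finite differences; the matrix and point below are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
x = rng.standard_normal(4)

grad = A @ x + A.T @ x   # ∂(x'Ax)/∂x = Ax + A'x

h = 1e-6
num = np.array([((x + h * np.eye(4)[i]) @ A @ (x + h * np.eye(4)[i])
                 - (x - h * np.eye(4)[i]) @ A @ (x - h * np.eye(4)[i])) / (2 * h)
                for i in range(4)])
print(np.allclose(grad, num, atol=1e-4))
```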
4.3 THE CHAIN RULE FOR VECTOR DIFFERENTIATION
Let z = [z_1 z_2 ... z_m]' be a vector function of the vector y, which is in turn a function of the vector x = [x_1 x_2 ... x_n]'. Using the definition (4.1), we can write
∂z/∂x = [∂z_j/∂x_i]   (i = 1, 2, ..., n;  j = 1, 2, ..., m).
By the chain rule for partial differentiation,
∂z_j/∂x_i = Σ_q (∂z_j/∂y_q)(∂y_q/∂x_i),
which is the (i, j)th element of the product (∂y/∂x)(∂z/∂y). Hence
∂z/∂x = (∂y/∂x)(∂z/∂y).   (4.6)
4.4 THE DERIVATIVE OF SCALAR FUNCTIONS OF A MATRIX WITH RESPECT TO THE MATRIX
Let X = [x_ij] be a matrix of order (m × n) and let
y = f(X)
be a scalar function of X.
The derivative of y with respect to X, denoted by
∂y/∂X,
is defined as the following matrix of order (m × n):
∂y/∂X = [∂y/∂x_11  ∂y/∂x_12  ...  ∂y/∂x_1n
         ∂y/∂x_21  ∂y/∂x_22  ...  ∂y/∂x_2n
         ...
         ∂y/∂x_m1  ∂y/∂x_m2  ...  ∂y/∂x_mn]
      = [∂y/∂x_ij] = Σ_{i,j} E_ij (∂y/∂x_ij).   (4.7)
Example 4.3
Find ∂y/∂X when y = tr X, where X is of order (n × n).

Solution
y = tr X = x11 + x22 + ... + xnn = tr X' (see (1.33)), hence by (4.7)
∂y/∂X = I_n.
We will determine
∂|Y|/∂y_ij
where Y_ij is the cofactor of the element y_ij in |Y|. Since the cofactors
Y_i1, Y_i2, ... are independent of the element y_ij, we have
∂|Y|/∂y_ij = Y_ij.   (4.8)
It follows, by the chain rule, that
∂|Y|/∂x_rs = Σ_{i,j} Y_ij (∂y_ij/∂x_rs) = (vec ∂Y/∂x_rs)' vec Z   (4.10)
where Z = [Y_ij] is the matrix of cofactors of Y.

Example 4.4
Find ∂|X|/∂X
(i) when the elements of X are independent, and
(ii) when X is symmetric.

Solution
(i) In the notation of (4.10), with Y = X we have ∂X/∂x_rs = E_rs, so that
∂|X|/∂x_rs = (vec E_rs)' vec Z = X_rs
and hence
∂|X|/∂X = [X_ij] = |X|(X⁻¹)'.
(ii) When X is symmetric, say (in the order (2 × 2) case)
X = [ x11  x12
      x12  x22 ],
we have
∂X/∂x11 = E11, ∂X/∂x12 = E12 + E21, and so on.
It follows that
∂|X|/∂x12 = (vec(E12 + E21))' vec Z = [0 1 1 0] vec Z = X21 + X12 = 2X12.
In general,
∂|X|/∂X = 2[X_ij] − diag{X_ii}.
From this expression we see that y_ij is the (scalar) product of the ith row of
[b1j A : b2j A : ... : bnj A] and vec X, so that
y_ij = Σ_p Σ_l a_il b_pj x_lp.   (4.14)
Hence
∂y_ij/∂x_lp = a_il b_pj   for all i, j, l, p,
and

∂y_ij/∂X = [ ∂y_ij/∂x11  ∂y_ij/∂x12  ...  ∂y_ij/∂x1n
             ...
             ∂y_ij/∂xm1  ∂y_ij/∂xm2  ...  ∂y_ij/∂xmn ]   (4.16)

that is, the matrix whose (l, p)th element is a_il b_pj.   (4.17)
We note that the matrix on the right hand side of (4.17) can be expressed
as (for notation see (1.5), (1.13), (1.16) and (1.17))
∂y_ij/∂X = A_i.(B_.j)' = A'e_i e_j' B'.
∂(AXB)/∂x_rs = [ a_1r b_s1  a_1r b_s2  ...  a_1r b_sq
                 a_2r b_s1  a_2r b_s2  ...  a_2r b_sq
                 ...
                 a_mr b_s1  a_mr b_s2  ...  a_mr b_sq ]

             = [ a_1r
                 a_2r
                 ...
                 a_mr ] [b_s1  b_s2  ...  b_sq]

             = A_.r (B_s.)' = A e_r e_s' B,
that is
∂(AXB)/∂x_rs = A E_rs B   (4.20)
where E_rs is an elementary matrix of order (m × n), the order of the matrix X.
Example 4.5
Find the derivative ∂Y/∂x_rs, given
Y = AX'B
where the order of the matrices A, X and B is such that the product on the right
hand side is defined.

Solution
By the method used above to obtain the derivative ∂(AXB)/∂x_rs, we find
∂(AX'B)/∂x_rs = A E'_rs B.
Before continuing with further examples we need a rule for determining the
derivative of a product of matrices.
Consider
Y = UV   (4.21)
where U = [u_ij] is of order (m × n) and V = [v_ij] is of order (n × l), and both
U and V are functions of a matrix X.
We wish to determine ∂Y/∂x_rs. Since
y_ij = Σ_{p=1}^{n} u_ip v_pj   (4.22)
we have
∂y_ij/∂x_rs = Σ_p (∂u_ip/∂x_rs) v_pj + Σ_p u_ip (∂v_pj/∂x_rs).   (4.23)
For fixed r and s, (4.23) is the (i, j)th element of the matrix ∂Y/∂x_rs of
order (m × l), the same as the order of the matrix Y.
On comparing both the terms on the right hand side of (4.23) with (4.22),
we can write
∂(UV)/∂x_rs = (∂U/∂x_rs)V + U(∂V/∂x_rs)   (4.24)
as one would expect.
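The product rule (4.24) can be checked numerically; in the NumPy sketch below the choice U = AX and V = XB is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A, B, X = (rng.standard_normal((n, n)) for _ in range(3))
r, s = 1, 2
E = np.zeros((n, n)); E[r, s] = 1.0          # elementary matrix E_rs

# finite-difference derivative of Y = (AX)(XB) with respect to x_rs
h = 1e-6
Xp, Xm = X.copy(), X.copy()
Xp[r, s] += h
Xm[r, s] -= h
num = ((A @ Xp) @ (Xp @ B) - (A @ Xm) @ (Xm @ B)) / (2 * h)

# rule (4.24): (dU/dx_rs)V + U(dV/dx_rs), with dU/dx_rs = A E_rs, dV/dx_rs = E_rs B
ana = (A @ E) @ (X @ B) + (A @ X) @ (E @ B)
ok = np.allclose(num, ana, atol=1e-4)
print(ok)
```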
On the other hand, when fixing (i, j), (4.23) is the (r, s)th element of the
matrix ∂y_ij/∂X, which is of the same order as the matrix X, that is

∂y_ij/∂X = Σ_{p=1}^{n} (∂u_ip/∂X) v_pj + Σ_{p=1}^{n} u_ip (∂v_pj/∂X).   (4.25)

We will make use of the result (4.24) in some of the subsequent examples.
Example 4.6
Let X = [x_rs] be a non-singular matrix. Find the derivative ∂Y/∂x_rs, given
(i) Y = AX⁻¹B, and
(ii) Y = X'AX.

Solution
(i) Using (4.24) to differentiate
XX⁻¹ = I,
we obtain
(∂X/∂x_rs)X⁻¹ + X(∂X⁻¹/∂x_rs) = 0,
hence
∂X⁻¹/∂x_rs = −X⁻¹(∂X/∂x_rs)X⁻¹.
But by (4.20) ∂X/∂x_rs = E_rs, so that
∂X⁻¹/∂x_rs = −X⁻¹E_rs X⁻¹
and
∂Y/∂x_rs = −AX⁻¹E_rs X⁻¹B.
(ii) By (4.24),
∂Y/∂x_rs = (∂X'/∂x_rs)AX + X'(∂(AX)/∂x_rs)
         = E'_rs AX + X'A E_rs   (by (4.12) and (4.20)).
Both (4.18) and (4.20) were derived from (4.15), which is valid for all i, j,
r, s defined by the orders of the matrices involved.
Example 4.7
Find the derivative ∂y_ij/∂X, given
(i) Y = AX'B,
(ii) Y = AX⁻¹B, and
(iii) Y = X'AX,
where X = [x_ij] is a non-singular matrix.

Solution
(i) Let W = X'; then
Y = AWB, so that by (4.20) ∂Y/∂w_rs = A E_rs B,
hence
∂y_ij/∂W = A'e_i e_j' B'.
But
∂y_ij/∂X = (∂y_ij/∂W)' = B e_j e_i' A = B E_ji A.
(ii) From Example 4.6(i),
∂Y/∂x_rs = −AX⁻¹E_rs X⁻¹B,
and, applying the first transformation principle, we obtain
∂y_ij/∂X = −(X⁻¹)'A'e_i e_j' B'(X⁻¹)'.
(iii) From Example 4.6(ii),
∂Y/∂x_rs = E'_rs AX + X'A E_rs,
and, applying the first transformation principle, we obtain
∂y_ij/∂X = AXE_ji + A'XE_ij.
It is interesting to compare this last result with the example in section 4.2,
when we considered the scalar y = x'Ax.
In this special case, when the matrix X has only one column, the elementary
matrix, which is of the same order as Y, becomes
E_ij = E_11 = 1.
Hence
∂y_ij/∂X = ∂y/∂x = Ax + A'x,
which is the result obtained in section 4.2 (see (4.4)).
Conversely using the above techniques we can also obtain the derivatives of
the matrix equivalents of the other equations in the table (4.4).
Example 4.8
Find
∂Y/∂x_rs and ∂y_ij/∂X
when
(i) Y = AX, and
(ii) Y = X'X.

Solution
(i) With B = I, apply (4.20):
∂Y/∂x_rs = A E_rs,
and, by the solution to Example 4.7(i) with B = I,
∂y_ij/∂X = A'e_i e_j' = A'E_ij.
(ii) By (4.24),
∂Y/∂x_rs = E'_rs X + X'E_rs,
and, by the solution to Example 4.7(iii) with A = I,
∂y_ij/∂X = XE_ji + XE_ij.
4.6 THE DERIVATIVES OF THE POWERS OF A MATRIX
Our aim in this section is to obtain the rules for determining
∂Y/∂x_rs and ∂y_ij/∂X
when
Y = Xⁿ.
We obtain
∂Y/∂x_rs = Σ_{k=0}^{n−1} X^k E_rs X^{n−k−1}   (4.26)
where by definition X⁰ = I, and, applying the first transformation principle,
∂y_ij/∂X = Σ_{k=0}^{n−1} (X')^k E_ij (X')^{n−k−1}.   (4.27)
Example 4.9
Using the result (4.26), obtain ∂Y/∂x_rs when
Y = X⁻ⁿ.

Solution
Using (4.24) on both sides of
X⁻ⁿXⁿ = I
we find
(∂(X⁻ⁿ)/∂x_rs)Xⁿ + X⁻ⁿ(∂(Xⁿ)/∂x_rs) = 0
so that
∂(X⁻ⁿ)/∂x_rs = −X⁻ⁿ(∂(Xⁿ)/∂x_rs)X⁻ⁿ.
Now making use of (4.26), we conclude that
∂(X⁻ⁿ)/∂x_rs = −X⁻ⁿ [Σ_{k=0}^{n−1} X^k E_rs X^{n−k−1}] X⁻ⁿ.
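Formula (4.26) is easy to test numerically; a NumPy sketch (the order, power, and indices are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 4                      # matrix order and power
X = rng.standard_normal((n, n))
r, s = 0, 2
E = np.zeros((n, n)); E[r, s] = 1.0

mpow = np.linalg.matrix_power
# (4.26): sum_{k=0}^{p-1} X^k E_rs X^{p-k-1}
ana = sum(mpow(X, k) @ E @ mpow(X, p - k - 1) for k in range(p))

h = 1e-6
Xp, Xm = X.copy(), X.copy()
Xp[r, s] += h
Xm[r, s] -= h
num = (mpow(Xp, p) - mpow(Xm, p)) / (2 * h)
ok = np.allclose(num, ana, atol=1e-3)
print(ok)
```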
Problems for Chapter 4

(1) Given a matrix Y whose elements are functions of the scalar x, among them
e^{2x} and x⁻¹, evaluate
∂Y/∂x.

(2) Given
X = [ sin x   x
      cos x   e^x ],
evaluate
∂|X|/∂x
by
(a) a direct method,
(b) use of a derivative formula.

(3) Given
5.1 INTRODUCTION
In Chapter 4 we discussed rules for determining the derivatives of a vector and
then the derivatives of a matrix.
But it will be remembered that when Y is a matrix, then vec Y is a vector.
This fact, together with the closely related Kronecker product techniques
discussed in Chapter 2, will now be exploited to derive some interesting results.
We also explore further the derivatives of some scalar functions with respect
to a matrix, first considered in the previous chapter.

5.2 DERIVATIVES OF MATRICES AND KRONECKER PRODUCTS
Consider
Y = AXB   (5.1)
where Y = [y_ij], A = [a_ij], X = [x_ij] and B = [b_ij].
We now obtain ∂(vec Y)/∂(vec X) for (5.1). We can write (5.1) as
y = Px   (5.2)
where y = vec Y, x = vec X and P = B' ⊗ A.
By (4.1), (4.4) and (2.10),
∂(vec Y)/∂(vec X) = P' = B ⊗ A'.   (5.3)
The determination of ∂(vec Y)/∂(vec X) for
Y = AX'B   (5.4)
is not so simple.
The problem is that when we write (5.4) in the form of (5.2), we have this
time
y = Pz   (5.5)
where z = vec X'.
We can find (see (2.25)) a permutation matrix U such that
vec X' = U vec X   (5.6)
in which case (5.5) becomes
y = PUx
so that
∂y/∂x = (PU)' = U'(B ⊗ A').   (5.7)
It is convenient to write
U'(B ⊗ A') = (B ⊗ A')_(n).   (5.8)
U' is seen to premultiply the matrix (B ⊗ A'). Its effect is therefore to rearrange
the rows of (B ⊗ A').
In fact the first and every subsequent nth row of (B ⊗ A') form the first
consecutive m rows of (B ⊗ A')_(n). The second and every subsequent nth row
form the next m consecutive rows of (B ⊗ A')_(n), and so on.
A special case of this notation is for n = 1; then
∂y/∂x = (B ⊗ A')_(1) = B ⊗ A'.   (5.10)
Example 5.1
Obtain ∂(vec Y)/∂(vec X), given X = [x_ij] of order (m × n), when
(i) Y = AX.

Solution
Let y = vec Y and x = vec X.
(i) Use (5.3) with B = I:
∂y/∂x = I ⊗ A'.
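Result (5.3) can be confirmed numerically. In the NumPy sketch below, vec is the column-stacking operator and the layout with rows indexed by vec X (the book's convention) is built by finite differences; the sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p, q = 2, 3, 3, 2
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, q))

vec = lambda M: M.flatten(order='F')        # column-stacking vec

h = 1e-6
D = np.zeros((n * p, m * q))                # (i, j) entry: d(vec Y)_j / d(vec X)_i
for i in range(n * p):
    d = np.zeros(n * p); d[i] = h
    Xp = X + d.reshape((n, p), order='F')
    Xm = X - d.reshape((n, p), order='F')
    D[i] = (vec(A @ Xp @ B) - vec(A @ Xm @ B)) / (2 * h)

ok = np.allclose(D, np.kron(B, A.T), atol=1e-4)   # (5.3): B ⊗ A'
print(ok)
```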
But by definition (4.19), the first row of the matrix (5.12) is (vec ∂Y/∂x11)',
the second row of the matrix (5.12) is (vec ∂Y/∂x21)', etc.
We can therefore write (5.12) as
∂vec Y/∂vec X = [vec ∂Y/∂x11 : vec ∂Y/∂x21 : ... : vec ∂Y/∂xmn]'.   (5.13)
We now use the solution to Example 4.6, where we had established that
when Y = X'AX, then
∂Y/∂x_rs = E'_rs AX + X'A E_rs.   (5.14)
It follows that
vec ∂Y/∂x_rs = vec E'_rs AX + vec X'A E_rs
             = (X'A' ⊗ I) vec E'_rs + (I ⊗ X'A) vec E_rs   (5.15)
(using (2.13)).
Substituting (5.15) into (5.13), we obtain
∂vec Y/∂vec X = [(X'A' ⊗ I)[vec E'_11 : vec E'_21 : ... : vec E'_mn]]'
              + [(I ⊗ X'A)[vec E_11 : vec E_21 : ... : vec E_mn]]'
              = [vec E'_11 : vec E'_21 : ... : vec E'_mn]'(AX ⊗ I)
              + [vec E_11 : vec E_21 : ... : vec E_mn]'(I ⊗ A'X)   (5.16)
(by (2.10)).
The matrix [vec E_11 : vec E_21 : ... : vec E_mn]' is the unit matrix, and
[vec E'_11 : vec E'_21 : ... : vec E'_mn]' is the permutation matrix U', so that
∂vec Y/∂vec X = U'(AX ⊗ I) + (I ⊗ A'X),
that is
∂vec Y/∂vec X = (AX ⊗ I)_(n) + (I ⊗ A'X).   (5.17)
In the above calculations we have used the derivative ∂Y/∂x_rs to obtain
∂(vec Y)/∂(vec X).
More generally, if
∂Y/∂x_rs = A E_rs B + C E'_rs D,
then
∂vec Y/∂vec X = B ⊗ A' + (D ⊗ C')_(n).   (5.18)
We will refer to the above result as the second transformation principle.
Example 5.2
Find
∂vec Y/∂vec X
when
(i) Y = X'X and (ii) Y = AX⁻¹B.

Solution
Let y = vec Y and x = vec X.
(i) From Example 4.8,
∂Y/∂x_rs = E'_rs X + X'E_rs.
Now use the second transformation principle to obtain
∂y/∂x = I ⊗ X + (X ⊗ I)_(n).
(ii) From Example 4.6,
∂Y/∂x_rs = −AX⁻¹E_rs X⁻¹B,
hence
∂y/∂x = −(X⁻¹B) ⊗ (X⁻¹)'A'.
Hopefully, using the above results for matrices, we should be able to rediscover
results for the derivatives of vectors considered in Chapter 4.
Example 5.3
Evaluate the derivatives
(i) ∂log|X|/∂X and (ii) ∂|X|^r/∂X.

Solution
(i) We have
∂(log|X|)/∂x_rs = (1/|X|)(∂|X|/∂x_rs).
From Example 4.4,
∂|X|/∂x_rs = X_rs,
hence
∂log|X|/∂X = (1/|X|)[X_rs] = (X⁻¹)'.
(ii) Similarly,
∂|X|^r/∂x_rs = r|X|^{r−1}(∂|X|/∂x_rs) = r|X|^{r−1}X_rs,
hence
∂|X|^r/∂X = r|X|^r (X⁻¹)'.
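A numerical sketch of (i) in NumPy (the shift by 5I merely keeps |X| well away from zero):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
X = rng.standard_normal((n, n)) + 5 * np.eye(n)

logdet = lambda M: np.linalg.slogdet(M)[1]   # log |det M|

h = 1e-6
G = np.zeros((n, n))
for r in range(n):
    for s in range(n):
        Xp, Xm = X.copy(), X.copy()
        Xp[r, s] += h
        Xm[r, s] -= h
        G[r, s] = (logdet(Xp) - logdet(Xm)) / (2 * h)

ok = np.allclose(G, np.linalg.inv(X).T, atol=1e-4)   # (X^{-1})'
print(ok)
```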
Traces of matrices form an important class of scalar matrix functions
covering a wide range of applications, particularly in statistics in the formu-
lation of least squares and various optimisation problems.
Having discussed the evaluation of the derivative ∂Y/∂x_rs for various products
of matrices, we can now apply these results to the evaluation of the derivative
∂(tr Y)/∂X.
We first note that
∂(tr Y)/∂X = [∂(tr Y)/∂x_rs]   (5.19)
where the bracket on the right hand side of (5.19) denotes, as usual, a matrix
of the same order as X, defined by its (r, s)th element.
As a consequence of (5.19), or perhaps more clearly seen from the definition
(4.7), we note that on transposing X, we have
∂(tr Y)/∂X' = (∂(tr Y)/∂X)'.   (5.20)
Another, and possibly an obvious, property of a trace is found when considering
the definition of ∂Y/∂x_rs (see (4.19)). Assuming that Y = [y_ij] is of order (n × n),
∂(tr Y)/∂x_rs = tr(∂Y/∂x_rs).   (5.21)
Example 5.4
Evaluate
∂tr(AX)/∂X.

Solution
∂tr(AX)/∂x_rs = tr(∂(AX)/∂x_rs)   by (5.21)
              = tr(A E_rs)        by Example 4.8
              = tr(E'_rs A')      since tr Y = tr Y'
              = (vec E_rs)'(vec A')   by Example 1.4.
Hence
∂tr(AX)/∂X = A'.
As we found in the previous chapter, we can use the derivative of the trace of
one product to obtain the derivative of the trace of a different product.

Example 5.5
Evaluate
∂tr(AX')/∂X.

Solution
From the previous result,
∂tr(BX)/∂X = B'.
Since tr(AX') = tr(XA') = tr(A'X), we can set B = A' to obtain
∂tr(AX')/∂X = A.
Example 5.6
Evaluate
∂(tr Y)/∂X
when
(i) Y = X'AX,
(ii) Y = X'AXB.

Solution
It is obvious that (i) follows from (ii) when B = I.
(ii) By (5.21) and Example 4.6(ii),
∂(tr Y)/∂x_rs = tr(E'_rs AXB + X'A E_rs B)
              = tr(E'_rs AXB) + tr(E'_rs A'XB')
              = (vec E_rs)' vec(AXB) + (vec E_rs)' vec(A'XB').
It follows that
∂(tr Y)/∂X = AXB + A'XB'.
(i) Let B = I in the above equation; we obtain
∂(tr Y)/∂X = AX + A'X = (A + A')X.
For a matrix X = [x_ij] the matrix differential is defined as
dX = [dx_ij].   (5.24)
Since
XY = [Σ_j x_ij y_jk],
it follows that
d(XY) = (dX)Y + X(dY).   (5.27)
Example 5.7
Given X = [x_ij], a non-singular matrix, evaluate
(i) d|X|, (ii) d(X⁻¹).

Solution
(i) By (5.23),
d|X| = Σ_{i,j} (∂|X|/∂x_ij)(dx_ij)
     = Σ_{i,j} X_ij (dx_ij)
since ∂|X|/∂x_ij = X_ij, the cofactor of x_ij in |X|.
By an argument similar to the one used in section 4.4, we can write
d|X| = tr{Z'(dX)}   (compare with (4.10))
where Z = [X_ij].
Since Z' = |X|X⁻¹, we can write
d|X| = |X| tr{X⁻¹(dX)}.
(ii) Since
X⁻¹X = I,
we use (5.27) to write
d(X⁻¹)X + X⁻¹(dX) = 0.
Hence
d(X⁻¹) = −X⁻¹(dX)X⁻¹
(compare with Example 4.6).
Notice that if X is a symmetric matrix, then
X = X'
and
(dX)' = dX.   (5.28)
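The first-order character of d|X| = |X| tr{X⁻¹(dX)} is easy to see numerically (a NumPy sketch; the small perturbation dX is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3
X = rng.standard_normal((n, n)) + 4 * np.eye(n)
dX = 1e-6 * rng.standard_normal((n, n))      # a small perturbation

exact = np.linalg.det(X + dX) - np.linalg.det(X)
first_order = np.linalg.det(X) * np.trace(np.linalg.inv(X) @ dX)
ok = abs(exact - first_order) < 1e-8         # agreement to first order
print(ok)
```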
(1) Given Y = AX' (A and X of order (2 × 2)), obtain
∂vec Y/∂vec X
and verify (5.10).
(2) Obtain
∂vec Y/∂vec X
when
(i) Y = AX'B and (ii) Y = X².
(4) Evaluate
∂(tr Y)/∂X
when
(a) Y = X⁻¹, (b) Y = AX⁻¹B, (c) Y = Xⁿ and (d) Y = e^X.
(5) (a) Use the direct method to obtain expressions for the matrix differential
dY when
6.1 INTRODUCTION
In the previous two chapters we have defined the derivative of a matrix with
respect to a scalar and the derivative of a scalar with respect to a matrix. We will
now generalise the definitions to include the derivative of a matrix with respect
to a matrix. The author has adopted the definition suggested by Vetter [31],
although other definitions also give rise to some useful results.

6.2 THE DEFINITIONS AND SOME RESULTS
Let Y = [y_ij] be a matrix of order (p × q) whose elements are functions of the
elements of X = [x_rs] of order (m × n). The derivative of Y with respect to X
is defined as

∂Y/∂X = [ ∂Y/∂x11  ∂Y/∂x12  ...  ∂Y/∂x1n
          ...
          ∂Y/∂xm1  ∂Y/∂xm2  ...  ∂Y/∂xmn ] = Σ_{r,s} E_rs ⊗ (∂Y/∂x_rs)   (6.1)

the right hand side of (6.1) following from the definitions (1.4) and (2.1), where
E_rs is of order (m × n), the order of the matrix X.
It is seen that ∂Y/∂X is a matrix of order (mp × nq).
Example 6.1
Consider
Y = [ x11 x12 x22      e^{x11 x22}
      sin(x11 + x12)   log(x11 + x21) ]
and
X = [ x11  x12
      x21  x22 ].
Evaluate
∂Y/∂X.

Solution
∂Y/∂X = [ ∂Y/∂x11  ∂Y/∂x12
          ∂Y/∂x21  ∂Y/∂x22 ]

      = [ x12 x22           x22 e^{x11 x22}   x11 x22          0
          cos(x11 + x12)    1/(x11 + x21)     cos(x11 + x12)   0
          0                 0                 x11 x12          x11 e^{x11 x22}
          0                 1/(x11 + x21)     0                0 ].
Example 6.2
Given the matrix X = [x_ij] of order (m × n), evaluate ∂X/∂X when
(i) all elements of X are independent,
(ii) X is a symmetric matrix (of course in this case m = n).

Solution
(i) By (6.1),
∂X/∂X = Σ_{r,s} E_rs ⊗ E_rs = D   (see (2.26)).
(ii) ∂X/∂x_rs = E_rs + E_sr   for r ≠ s,
∂X/∂x_rs = E_rr              for r = s.
We can write the above as
∂X/∂x_rs = E_rs + E_sr − δ_rs E_rr.
Hence
∂X/∂X = Σ_{r,s} E_rs ⊗ E_rs + Σ_{r,s} E_rs ⊗ E_sr − Σ_r E_rr ⊗ E_rr
      = D + U − Σ_r E_rr ⊗ E_rr.
Example 6.3
Evaluate and write out in full ∂X'/∂X, given X = [x_ij] of order (2 × 3).

Solution
By (6.1) we have
∂X'/∂X = Σ_{r,s} E_rs ⊗ E'_rs = U.
Hence

∂X'/∂X = [ 1 0 0 0 0 0
           0 0 1 0 0 0
           0 0 0 0 1 0
           0 1 0 0 0 0
           0 0 0 1 0 0
           0 0 0 0 0 1 ].
Using (2.10) and (4.19), it follows that
(∂Y/∂X)' = ∂Y'/∂X'.   (6.2)
We next determine
∂(XY)/∂Z
where the orders of the matrices are as indicated: X of order (m × n), Y of
order (n × l) and Z of order (p × q). By (6.1) and (4.24),
∂(XY)/∂Z = Σ_{r,s} E_rs ⊗ [(∂X/∂z_rs)Y + X(∂Y/∂z_rs)]
         = Σ_{r,s} (E_rs ⊗ ∂X/∂z_rs)(I_q ⊗ Y) + Σ_{r,s} (I_p ⊗ X)(E_rs ⊗ ∂Y/∂z_rs)
(by (2.11)). Finally, by (6.1),
∂(XY)/∂Z = (∂X/∂Z)(I_q ⊗ Y) + (I_p ⊗ X)(∂Y/∂Z).   (6.3)
Example 6.4
Given a non-singular matrix X, evaluate
∂X⁻¹/∂X.

Solution
Using (6.3) on
XX⁻¹ = I,
we obtain
∂(XX⁻¹)/∂X = (∂X/∂X)(I ⊗ X⁻¹) + (I ⊗ X)(∂X⁻¹/∂X) = 0,
hence
∂X⁻¹/∂X = −(I ⊗ X⁻¹)(∂X/∂X)(I ⊗ X⁻¹)
        = −(I ⊗ X⁻¹)D(I ⊗ X⁻¹).
We next consider
∂(X ⊗ Y)/∂Z.
The order of the matrix Y is not now restricted; we will consider that it is
(u × v). On representing X ⊗ Y by its (i, k)th partition [x_ik Y] (i = 1, 2, ..., m;
k = 1, 2, ..., n), we can write
∂(X ⊗ Y)/∂Z = Σ_{r,s} E_rs ⊗ (∂X/∂z_rs ⊗ Y) + Σ_{r,s} E_rs ⊗ (X ⊗ ∂Y/∂z_rs)
            = (∂X/∂Z) ⊗ Y + Σ_{r,s} E_rs ⊗ (X ⊗ ∂Y/∂z_rs).
The summation on the right hand side is not X ⊗ ∂Y/∂Z, as may appear at first
sight; nevertheless it can be put into a more convenient form, as a product of
matrices. To achieve this aim we make repeated use of (2.8) and (2.11), obtaining
∂(X ⊗ Y)/∂Z = (∂X/∂Z) ⊗ Y + [I ⊗ U1][(∂Y/∂Z) ⊗ X][I ⊗ U2]   (6.4)
where U1 and U2 are the permutation matrices for which
X ⊗ (∂Y/∂z_rs) = U1(∂Y/∂z_rs ⊗ X)U2.
Example 6.5
A = [a_ij] and X = [x_ij] are matrices, each of order (2 × 2). Use
(i) equation (6.4), and
(ii) a direct method
to evaluate
∂(A ⊗ X)/∂X.

Solution
(i) Since A is a constant matrix, ∂A/∂X = 0, and in this example (6.4) becomes
∂(A ⊗ X)/∂X = [I2 ⊗ U1][(∂X/∂X) ⊗ A][I2 ⊗ U2]
where U1 = U2 = Σ_{r,s} E_rs ⊗ E'_rs and

∂X/∂X = D = [ 1 0 0 1
              0 0 0 0
              0 0 0 0
              1 0 0 1 ].

(ii) We evaluate directly

Y = A ⊗ X = [ a11 x11  a11 x12  a12 x11  a12 x12
              a11 x21  a11 x22  a12 x21  a12 x22
              a21 x11  a21 x12  a22 x11  a22 x12
              a21 x21  a21 x22  a22 x21  a22 x22 ]

so that each block ∂Y/∂x_rs can be written down by inspection; assembling the
blocks as in (6.1) gives the same result as in (i).
Let Z = [z_ij] be a matrix whose elements are functions of the elements of
Y = [y_αβ] of order (u × v), which are in turn functions of the elements of
X = [x_rs]. By the scalar chain rule,
∂z_ij/∂x_rs = Σ_{α,β} (∂z_ij/∂y_αβ)(∂y_αβ/∂x_rs),   α = 1, 2, ..., u; β = 1, 2, ..., v.
Hence
∂Z/∂X = Σ_{r,s} E_rs ⊗ [Σ_{i,j} E_ij Σ_{α,β} (∂z_ij/∂y_αβ)(∂y_αβ/∂x_rs)]
      = Σ_{α,β} [Σ_{r,s} E_rs (∂y_αβ/∂x_rs)] ⊗ [Σ_{i,j} E_ij (∂z_ij/∂y_αβ)]   (by (2.5))
      = Σ_{α,β} (∂y_αβ/∂X) ⊗ (∂Z/∂y_αβ)   (by (4.7) and (4.19)).
Writing
Y = [ y11  y12  ...  y1v
      ...
      yu1  yu2  ...  yuv ],
the sum above can be expressed as a product of two partitioned matrices.
The partitioned matrix
[∂y11/∂X ⊗ I_p : ∂y21/∂X ⊗ I_p : ... : ∂y_uv/∂X ⊗ I_p]
can be written as
(∂[y11 y21 ... y_uv]/∂X) ⊗ I_p,
that is, as
(∂(vec Y)'/∂X) ⊗ I_p.
Similarly, we write the partitioned matrix
[ I_n ⊗ ∂Z/∂y11
  I_n ⊗ ∂Z/∂y21
  ... ]
as
I_n ⊗ (∂Z/∂vec Y).
It follows that
∂Z/∂X = [(∂(vec Y)'/∂X) ⊗ I_p][I_n ⊗ ∂Z/∂(vec Y)].   (6.6)
Example 6.6
Given the matrix A = [a_ij] and X = [x_ij], both of order (2 × 2), evaluate
∂Z/∂X
where Z = Y'Y and Y = AX,
(i) using (6.6),
(ii) using a direct method.

Solution
(i) For convenience write (6.6) as
∂Z/∂X = QR
where
Q = [(∂(vec Y)'/∂X) ⊗ I_p] and R = [I_n ⊗ ∂Z/∂(vec Y)].
Since Y = AX, by Example 4.8(i)
∂y_ij/∂X = A'E_ij,
so that
Q = [A'E11 ⊗ I2 : A'E21 ⊗ I2 : A'E12 ⊗ I2 : A'E22 ⊗ I2].
Since Z = Y'Y,
∂Z/∂y_rs = E'_rs Y + Y'E_rs,
for example
∂Z/∂y11 = [ 2y11  y12     ∂Z/∂y12 = [ 0    y11
            y12   0 ],                 y11  2y12 ],
∂Z/∂y21 = [ 2y21  y22     ∂Z/∂y22 = [ 0    y21
            y22   0 ],                 y21  2y22 ],
and we can now evaluate
R = [I2 ⊗ ∂Z/∂(vec Y)].
Multiplying out, QR is a (4 × 4) matrix whose entries are sums of terms of
the form a_ki y_kj; its (1, 1) entry, for example, is 2(a11 y11 + a21 y21).
(ii) By a simple extension of the result of Example 4.6(ii), we find that when
Z = X'A'AX,
∂Z/∂x_rs = E'_rs A'AX + X'A'A E_rs = E'_rs A'Y + Y'A E_rs
where Y = AX.
By (6.1) and (2.11),
∂Z/∂X = Σ_{r,s} (E_rs ⊗ E'_rs)(I ⊗ A'Y) + Σ_{r,s} (I ⊗ Y'A)(E_rs ⊗ E_rs)
      = U(I ⊗ A'Y) + (I ⊗ Y'A)D
where

U = Σ_{r,s} E_rs ⊗ E'_rs = [ 1 0 0 0
                             0 0 1 0
                             0 1 0 0
                             0 0 0 1 ]
and

D = Σ_{r,s} E_rs ⊗ E_rs = [ 1 0 0 1
                            0 0 0 0
                            0 0 0 0
                            1 0 0 1 ].

On substitution and multiplying out in the above expression for ∂Z/∂X, we obtain
the same matrix as in (i).
(2) The elements of the matrix
X = [ x11  x21
      x12  x22
      x13  x23 ]
are all independent. Use a direct method to evaluate ∂X/∂X.
(3) Evaluate
∂X⁻¹/∂X
and verify the solution to Example 6.4.
(4) The matrices A = [a_ij] and X = [x_ij] are both of order (2 × 2); X is non-
singular. Use a direct method to evaluate
CHAPTER 7
7.1 INTRODUCTION
As in Chapter 3, where a number of applications of the Kronecker product were
considered, in this chapter a number of applications of matrix calculus are
discussed. The applications have been selected from a number considered in the
published literature, as indicated in the Bibliography at the end of this book.
These problems were originally intended for the expert, but by expansion
and simplification it is hoped that they will now be appreciated by the general
reader.

S = Σ_{i=1}^{N} e_i² = Σ_{i=1}^{N} (y_i − f(x_i))²   (7.4)

is a minimum.

f* = f + Σ_{i=1}^{m} μ_i g_i   (7.8)

for the m parameters μ1, μ2, ..., μm and the n variables x determining the
extremum.
Example 7.1
Given a matrix A = [a_ij] of order (2 × 2), determine a symmetric matrix
X = [x_ij] which is a best approximation to A by the criterion of least squares.

Solution
Corresponding to (7.3) we have
E = A − X
where E = [e_ij] and e_ij = a_ij − x_ij.
Setting the derivatives of f* to zero,
∂f*/∂x11 = −2(a11 − x11) = 0
∂f*/∂x22 = −2(a22 − x22) = 0
and so on. Hence
X = ½(A + A').
The criterion of the least squares method is to minimise (7.10) with respect to
the parameters involved.
The constrained optimisation problem then takes the form of finding the
matrix X such that the scalar matrix function
s = f(X)
is minimised subject to constraints on X in the form of
G(X) = 0   (7.11)
where G = [g_ij] is a matrix of order (s × t), where s and t are dependent on the
number of constraints g_ij involved.
As for the scalar case, we use Lagrange multipliers to form an augmented
matrix function f*(X).
Each constraint g_ij is associated with a parameter (Lagrange multiplier) μ_ij.
Since
Σ_{i,j} μ_ij g_ij = tr U'G
where
U = [μ_ij],
we can write the augmented scalar matrix function as
f*(X) = tr E'E + tr U'G   (7.12)
which is the equivalent to (7.8). To find the optimal X, we must solve the
system of equations
∂f*/∂X = 0.   (7.13)
Problem 1
Given a non-singular matrix A = [a_ij] of order (n × n), determine a matrix
X = [x_ij] which is a least squares approximation to A
(i) when X is a symmetric matrix,
(ii) when X is an orthogonal matrix.

Solution
(i) The problem was solved in Example 7.1 when A and X are of order (2 × 2).
With the terminology defined above, we write
E = A − X
G(X) = X − X' = 0
so that G, and hence U, are both of order (n × n).
We now make use of the results, in modified form if necessary, of Examples 5.4
and 5.5, to obtain
∂f*/∂X = −2A + 2X + U − U'
       = 0 for X = A + (U − U')/2.
Then
X' = A' + (U' − U)/2
and since X = X', we finally obtain
X = ½(A + A').
If a solution to (7.14) exists, there are various ways of solving this matrix
equation.
For example, with the help of (2.13) and Example 2.7 we can write it as
[(I ⊗ A') − (A' ⊗ I)U]x = 0   (7.15)
where U is a permutation matrix (see (2.24)) and
x = vec X.
We have now reduced the matrix equation to a system of homogeneous
equations which can be solved by a standard method.
If a non-trivial solution to (7.15) does exist, it is not unique. We must scale
it appropriately for X to be orthogonal.
There may, of course, be more than one linearly independent solution to
(7.15). We must choose the solution corresponding to X being an orthogonal
matrix.
Example 7.2
Given
A = [ 1   2
      −1  1 ],
find the orthogonal matrix X which is the least squares best approximation to A.

Solution

[I ⊗ A'] = [ 1 −1  0  0        [A' ⊗ I]U = [ 1 −1  0  0
             2  1  0  0                      0  0  1 −1
             0  0  1 −1                      2  1  0  0
             0  0  2  1 ],                   0  0  2  1 ].

Equation (7.15) can now be written as

[ 0  0  0  0
  2  1 −1  1
 −2 −1  1 −1
  0  0  0  0 ] x = 0.

There are 3 non-trivial (linearly independent) solutions (see [18] p. 131). They
are
x = [1 −2 1 1]', x = [1 1 2 −1]' and x = [2 −3 3 2]'.
Only the last corresponds to an orthogonal matrix; scaling it appropriately,

X = (1/√13) [ 2   3
              −3  2 ].
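A numerical sketch of this example in NumPy, assuming A = [1 2; −1 1] (a reading of the damaged original); the comparison with the orthogonal polar factor from the SVD is an added cross-check, not part of the original text:

```python
import numpy as np

A = np.array([[1.0, 2.0], [-1.0, 1.0]])
X = np.array([[2.0, 3.0], [-3.0, 2.0]]) / np.sqrt(13)

ok_orth = np.allclose(X.T @ X, np.eye(2))      # X is orthogonal
ok_stat = np.allclose(A.T @ X, (A.T @ X).T)    # stationarity: A'X symmetric

# the nearest orthogonal matrix to A is the orthogonal polar factor UV'
U_, S, Vt = np.linalg.svd(A)
ok_polar = np.allclose(U_ @ Vt, X)
print(ok_orth, ok_stat, ok_polar)
```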
With
y = [y1 y2 ... yN]', b = [b1 b2 ...]' and e = [e1 e2 ... eN]',
the least squares solution satisfies
X'Xb = X'y,
which is the matrix form of the normal equations defined in section 7.2.

Example 7.3
Obtain the normal equations for a least squares approximation when each sample
consists of one observation y_i together with observations x_i and z_i on two
independent variables.

Solution
Writing the model as y = b1 x + b2 z + b3, we have
X = [ x1  z1  1
      x2  z2  1
      ...
      xN  zN  1 ],  b = [b1 b2 b3]',
and the condition
X'[y − Xb] = 0
gives the normal equations
Σ x_i y_i = b1 Σ x_i² + b2 Σ x_i z_i + b3 Σ x_i
Σ z_i y_i = b1 Σ x_i z_i + b2 Σ z_i² + b3 Σ z_i
Σ y_i    = b1 Σ x_i + b2 Σ z_i + N b3.
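A NumPy sketch of the normal equations (the data and the model y = b1·x + b2·z + b3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)
N = 50
x = rng.standard_normal(N)
z = rng.standard_normal(N)
y = 2.0 * x - 1.0 * z + 0.5 + 0.01 * rng.standard_normal(N)

X = np.column_stack([x, z, np.ones(N)])   # design matrix with rows [x_i, z_i, 1]
b = np.linalg.solve(X.T @ X, X.T @ y)     # normal equations X'Xb = X'y

ok = np.allclose(b, np.linalg.lstsq(X, y, rcond=None)[0])
print(ok)
```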
f(x1, x2, ..., xn) = (2π)^{−n/2} |V|^{−1/2} exp{−½(x − μ)'V⁻¹(x − μ)}   (7.23)

where
−∞ < x_i < ∞ (i = 1, 2, ..., n),
and the likelihood of a sample x1, x2, ..., xN is

L = (2π)^{−nN/2} |V|^{−N/2} exp{−½ Σ_{t=1}^{N} (x_t − μ)'V⁻¹(x_t − μ)}

so that

log L = C − (N/2) log|V| − ½ Σ_{t=1}^{N} (x_t − μ)'V⁻¹(x_t − μ)   (7.24)

where C is a constant.
∂log L/∂μ = V⁻¹ Σ_{t=1}^{N} (x_t − μ) = 0 when μ = x̄ = (Σ x_t)/N.
Hence the maximum likelihood estimate of μ is μ̂ = x̄, the sample mean.
(2) By Example 5.3, but taking account of the symmetry of V⁻¹ (see Example
4.4),
∂log|V⁻¹|/∂V⁻¹ = 2V − diag{V}.
(3) If X is a symmetric matrix,
∂tr(AX)/∂X = A + A' − diag{A}.
Let A = yy' and X = V⁻¹; then
∂tr(yy'V⁻¹)/∂V⁻¹ = 2yy' − diag{yy'}.
Differentiating log L with respect to V⁻¹, using the estimate μ̂ = x̄ and the
results (2) and (3) above, we obtain
∂log L/∂V⁻¹ = (N/2)[2V − diag{V}] − YY' + ½ diag{YY'}
where YY' = Σ_{t=1}^{N} (x_t − x̄)(x_t − x̄)'. Let Q = NV − YY'; then
∂log L/∂V⁻¹ = Q − ½ diag{Q}
            = 0 when 2Q = diag{Q}.
Since Q is symmetric, the only solution to the above equation is
Q = 0.
It follows that the maximum likelihood estimate of V is
V̂ = (1/N) Σ_{t=1}^{N} (x_t − x̄)(x_t − x̄)'.   (7.26)
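The estimates μ̂ = x̄ and V̂ = (1/N)Σ(x_t − x̄)(x_t − x̄)' can be illustrated numerically (a NumPy sketch; the population parameters are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(9)
N, n = 5000, 3
mu_true = np.array([1.0, -2.0, 0.5])
V_true = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 1.5]])
xs = rng.multivariate_normal(mu_true, V_true, size=N)

mu_hat = xs.mean(axis=0)          # sample mean
Y = xs - mu_hat
V_hat = (Y.T @ Y) / N             # (1/N) sum (x_t - x̄)(x_t - x̄)'

ok_mu = np.allclose(mu_hat, mu_true, atol=0.1)
ok_V = np.allclose(V_hat, V_true, atol=0.2)
print(ok_mu, ok_V)
```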
Solution
We are given
R = {(x1, x2): 0 < x1 < ∞, 0 < x2 < ∞}.
The above transformation (corresponding to (7.26)) results in the following
inverse transformation
x1 = ½(y1 + y2)
x2 = y2   (7.27)
which defines the region of integration in the (y1, y2) co-ordinates, and by (7.29)

|J| = det [ ½  0
            ½  1 ] = ½.

Hence
I = ∫∫ f(½(y1 + y2), y2) · ½ dy1 dy2.
Our main interest in this section is to evaluate Jacobians when the transfor-
mation corresponding to (7.26) is expressed in matrix form, for example as
Y = AXB   (7.30)
where A, X and B are all assumed to be of order (n × n).
As in section 5.2 (see (5.1) and (5.2)) we can write (7.30) as
y = Px   (7.31)
where y = vec Y, x = vec X and P = B' ⊗ A.
In this case
∂y/∂x = B ⊗ A'
and
∂x/∂y = [B ⊗ A']⁻¹ = B⁻¹ ⊗ (A')⁻¹   by (2.12).
It follows that
J = mod|B⁻¹ ⊗ (A')⁻¹| = mod(|B|⁻ⁿ |A|⁻ⁿ).   (7.32)
Consider the transformation Y = AXB, where
A = [ −1  2        B = [ 2  1
      −4  3 ] and        1  1 ].
Find the Jacobian of this transformation
(i) by a direct method,
(ii) using (7.32).

Solution
(i) We have
Similarly, we can use the theory developed in this book to evaluate the
Jacobians of many other transformations.

Example 7.6
Evaluate the Jacobian associated with the following transformations:
(i) Y = X⁻¹
(ii) Y = X².

Solution
(i) From Example 5.2,
∂y/∂x = −X⁻¹ ⊗ (X⁻¹)'
so that
∂x/∂y = −X ⊗ X'.
Hence
J = mod|∂x/∂y| = mod|X ⊗ X'| = mod(|X|ⁿ |X|ⁿ) = mod|X|²ⁿ.
(ii) From section 4.6,
∂y/∂x = X ⊗ I + I ⊗ X'
and
J = mod|X ⊗ I + I ⊗ X'|⁻¹.
Let Q have distinct eigenvalues λ1, λ2, ..., λn with corresponding (right)
eigenvectors x1, x2, ..., xn, and let y1, y2, ..., yn be the eigenvectors of Q'.
These two sets of eigenvectors have the property
x_j'y_i = 0 or (equivalently) y_i'x_j = 0 (i ≠ j)   (7.33)
and can be normalised so that
x_i'y_i = 1 or y_i'x_i = 1 (i = 1, 2, ..., n).   (7.34)
Sets of eigenvectors {x_i} and {y_i} having the properties (7.33) and (7.34) are said
to be properly normalised.
It is well known (see [18] p. 227) that
exp(Qt) = P diag{e^{λ1 t}, e^{λ2 t}, ..., e^{λn t}} P⁻¹, where
P = [x1 : x2 : ... : xn].   (7.35)
It follows from (7.33), (7.34) and (7.36) that

P⁻¹ = [ y1'
        y2'
        ...
        yn' ].   (7.36)

Hence

exp(Qt) = [x1 x2 ... xn] diag{e^{λ1 t}, ..., e^{λn t}} [ y1'
                                                         y2'
                                                         ...
                                                         yn' ],
that is
exp(Qt) = Σ_{i=1}^{n} x_i y_i' e^{λ_i t}.   (7.37)
The right hand side of (7.37) is known as the spectral representation (or spectral
decomposition) of the exponential matrix exp(Qt).
We consider a very simple illustrative example.
Example 7.7
Find the spectral representation of the matrix exp(Qt), where
Q = [ 3   4
      −2  −3 ].

Solution
λ1 = 1, λ2 = −1; x1' = [2 −1], x2' = [1 −1];
y1' = [1 1], y2' = [−1 −2].
By (7.37),

exp(Qt) = [ 2   2          [ −1  −2
            −1  −1 ] e^t +    1   2 ] e^{−t}.
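The spectral representation can be checked against an independently computed exp(Qt) (a NumPy sketch, assuming Q = [3 4; −2 −3], the matrix consistent with the stated eigenvalues and eigenvectors):

```python
import numpy as np

Q = np.array([[3.0, 4.0], [-2.0, -3.0]])               # eigenvalues +1 and -1
lam = [1.0, -1.0]
xs = [np.array([2.0, -1.0]), np.array([1.0, -1.0])]    # right eigenvectors x_i
ys = [np.array([1.0, 1.0]), np.array([-1.0, -2.0])]    # left eigenvectors, y_i'x_i = 1

t = 0.7
spectral = sum(np.exp(l * t) * np.outer(x, y)
               for l, x, y in zip(lam, xs, ys))        # (7.37)

w, P = np.linalg.eig(Q * t)                            # exp(Qt) via eigendecomposition
expQt = (P @ np.diag(np.exp(w)) @ np.linalg.inv(P)).real

ok = np.allclose(spectral, expQt)
print(ok)
```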
dΦ/dt = QΦ,   (7.41)
and
Z = [z_ij] is a matrix of order (r × s).
Differentiating with respect to Z, we obtain
(d/dt)(∂Φ/∂Z) = ∂(QΦ)/∂Z = (∂Q/∂Z)(I ⊗ Φ) + (I ⊗ Q)(∂Φ/∂Z).   (7.42)
We next make use of a generalisation of a well known result (see [19] p. 68):
Given
dX/dt = RX + BU
and
X(0) = 0,
then
X = ∫₀ᵗ exp{R(t − τ)} BU(τ) dτ.   (7.44)
For (7.42) this gives
∂Φ/∂Z = ∫₀ᵗ exp{(I ⊗ Q)(t − τ)} (∂Q/∂Z)[I ⊗ Φ(τ)] dτ
      = ∫₀ᵗ Σ_i (I ⊗ x_i y_i') e^{λ_i (t−τ)} (∂Q/∂Z)[I ⊗ Σ_j x_j y_j' e^{λ_j τ}] dτ
(by (7.37) and (7.38)).
CHAPTER 1
(1)
(2) (a) The kth column of AE_ik is the ith column of A; all other columns are
zero.
(b) The ith row of E_ik A is the kth row of A; all other rows are zero.
= L A /. BC.,
(4) tr(AE_ij) = Σ_k e_k' A E_ij e_k = Σ_{k,r,s} a_rs e_k' E_rs E_ij e_k
= Σ_{k,r,s} a_rs e_k' e_r e_s' e_i e_j' e_k
= Σ_{k,r,s} a_rs δ_kr δ_si δ_jk
= a_ji.
(5) A = Σ_{i,j} a_ij E_ij, hence
Σ_i tr(BE_ii)E_ii = Σ_{i,k} e_k' B E_ii e_k E_ii
= Σ_i e_i' B e_i E_ii = Σ_i b_ii E_ii = diag{B}.
CHAPTER 2
(1) Σ_{r,s} [E_rs(m × n) ⊗ E_sr(n × m)][E_sr(n × m) ⊗ E_rs(m × n)]
= Σ_{r,s} [E_rs(m × n)E_sr(n × m)] ⊗ [E_sr(n × m)E_rs(m × n)]
= Σ_{r,s} [δ_ss E_rr(m × m)] ⊗ [δ_rr E_ss(n × n)]
= [Σ_r E_rr(m × m)] ⊗ [Σ_s E_ss(n × n)]
= I_m ⊗ I_n = I_mn.
(3) (a)
(3) (a)
r ~ r
2 -1
A®B = i 0 2
~
-1
I
2
0
J
o -1 B®A =
2 0
0 2
2 0
(b)
U1 = U2 = [ 1 0 0 0
            0 0 1 0
            0 1 0 0
            0 0 0 1 ].
(4)
See [18] p. 228 for methods of calculating matrix exponentials.
(a)
exp(A) = [ 2e − e⁻¹    2(e − e⁻¹)
           e⁻¹ − e     2e⁻¹ − e ]
and
exp(A) ⊗ I = [ 2e−e⁻¹   0        2(e−e⁻¹)  0
               0        2e−e⁻¹   0         2(e−e⁻¹)
               e⁻¹−e    0        2e⁻¹−e    0
               0        e⁻¹−e    0         2e⁻¹−e ].
Hence exp(A) ⊗ I = exp(A ⊗ I).
(5) (a)
A _, [I IJ _, '[-4 -~J '
=
-1. -2 '
B =-
2 3
so that
['
2 -4
-1
~~ ~
3
A-I ®B- I
-3
-2
-6
3
-J
-1
2
~~ ~-2 I ]
(b) As 4 1
8 3
A®B = , it follows that
-2 -1
-3 -4 -3
['
1
3/2 -1/2 3/2 -1/2
(A ®B)-I =
-1 4 -2
~3/2 1/2 -3 -4
This verifies (2.12)
(6) (a) For A: λ1 = −1, λ2 = 2, x1' = [1 4] and x2' = [1 1].
For B: μ1 = 1, μ2 = 4, y1' = [1 −1] and y2' = [1 2].
(b)
A ⊗ B = [ 6  3  −2  −1
          6  9  −2  −3
          8  4  −4  −2
          8  12 −4  −6 ] = E (say).
114 Solution to Problems
(c) The eigenvectors of A ⊗ B are the products x_i ⊗ y_j; this verifies Property IX.
(7) For some non-singular P and Q,
A = P⁻¹CP and B = Q⁻¹DQ.
Hence
A ⊗ B = P⁻¹CP ⊗ Q⁻¹DQ
= (P⁻¹ ⊗ Q⁻¹)(CP ⊗ DQ)   by (2.11)
= (P ⊗ Q)⁻¹(C ⊗ D)(P ⊗ Q)   by (2.12) and (2.11)
= R⁻¹(C ⊗ D)R
where
R = P ⊗ Q.
CHAPTER 4
~
(1) :; ~ [~: o ay 2e2X]
o -x-2
ax
4x cos x
(2)(a) IXI = x sin x -exp (2,)
~= [ex -cos xl
ax -x sinxJ.
hence
~2~
(a) 2X21 x22
ay
-- =
aX21 "
.
X2J 0
0
oJ.
~~ ~J ~"
o 0 21
XI2
x22 X2)
XI~ + [" X'] ~ ~J
x12 X22
XI3 x23
0
1 0
l" "J
aYI3 0
ax X23 0 X21 =
(4)(a)
(b)
(b)
CHAPTER 5
(1) Since
y11 = a11 x11 + a12 x12,  y21 = a21 x11 + a22 x12,
y12 = a11 x21 + a12 x22,  y22 = a21 x21 + a22 x22,
we obtain

∂vec Y/∂vec X = [ a11  a21  0    0
                  0    0    a11  a21
                  a12  a22  0    0
                  0    0    a12  a22 ] = (I ⊗ A')_(2).
(2) (a)
∂vec Y/∂vec X = (B ⊗ A')_(n)   by (5.18)
(b)
∂vec Y/∂vec X = X ⊗ I + I ⊗ X'.
(3) (a) ∂tr Y/∂x_rs = (vec E_rs)'(vec A'B'),
hence
∂tr Y/∂X = A'B'.
(b) ∂tr Y/∂x_rs = 2 tr(E'_rs X'),
hence
∂tr Y/∂X = 2X'.
(c) ∂tr Y/∂x_rs = 2 tr(E'_rs X),
hence
∂tr Y/∂X = 2X.
(4) (a) ∂tr Y/∂x_rs = −tr(X⁻¹ E_rs X⁻¹) = −tr(E'_rs (X⁻²)'),
hence
∂tr Y/∂X = −(X⁻²)'.
(b) ∂tr Y/∂x_rs = −tr(AX⁻¹ E_rs X⁻¹B),
hence
∂tr Y/∂X = −(X⁻¹BAX⁻¹)'.
(c) ∂tr Y/∂x_rs = n tr(E'_rs (X^{n−1})'),
hence
∂tr Y/∂X = n(X^{n−1})'.
(d)
exp(X) = I + X + X²/2! + X³/3! + ...,
hence by the result (c) above
∂tr Y/∂X = (exp X)'.
(5) (a) For Y = X²,

dY = (dX)X + X(dX)

   = [ 2x11 dx11 + x12 dx21 + x21 dx12          x11 dx12 + x12 dx11 + x12 dx22 + x22 dx12
       x11 dx21 + x21 dx11 + x22 dx21 + x21 dx22    x21 dx12 + x12 dx21 + 2x22 dx22 ].
CHAPTER 6
~-'In(x" + x,,)
(I) 0
ay
-
ax
==
XI2exllx,.
xli
0 Xu eXllx
..
+ X 22)
0]
X22
~12
0 Xli -sin (x 12
0 0 0
(2) ∂X/∂x_rs = E_rs for each r, s, hence by (6.1)

∂X/∂X = Σ_{r,s} E_rs ⊗ E_rs = [ 1 0 0 1
                                0 0 0 0
                                0 0 0 0
                                0 0 0 0
                                1 0 0 1
                                0 0 0 0
                                0 0 0 0
                                0 0 0 0
                                1 0 0 1 ].
(3) Since
X⁻¹ = (1/Δ) [ x22   −x12
              −x21  x11 ]   where Δ = x11 x22 − x12 x21,
we find

∂X⁻¹/∂X = −(1/Δ²) [ x22²      −x12 x22   −x21 x22   x11 x22
                    −x21 x22  x12 x21    x21²       −x11 x21
                    −x12 x22  x12²       x12 x21    −x11 x12
                    x11 x22   −x11 x12   −x11 x21   x11² ].
Tables of Formulae and Derivatives
Table 1
Notation used: A = [a_ij], B = [b_ij].

E_ij = e_i e_j'
δ_ij = e_i' e_j = e_j' e_i
E_ij e_r = δ_jr e_i
E_ij E_rs = δ_jr E_is
E_ij E_js E_sm = E_im
E_ij E_rs = 0 if j ≠ r
A = Σ_i Σ_j a_ij E_ij
A_.j = A e_j
A_i. = A' e_i
E_ij A E_rs = a_jr E_is
tr AB = Σ_i Σ_l a_il b_li
tr AB' = tr A'B
tr AB = (vec A')' vec B
Table 2
A ⊗ B = [a_ij B]
A ⊗ (αB) = α(A ⊗ B)
(A + B) ⊗ C = A ⊗ C + B ⊗ C
A ⊗ (B + C) = A ⊗ B + A ⊗ C
A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C
(A ⊗ B)' = A' ⊗ B'
(A ⊗ B)(C ⊗ D) = AC ⊗ BD
(A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹
vec(AYB) = (B' ⊗ A) vec Y
|A ⊗ B| = |A|^m |B|^n when A and B are of order
(n × n) and (m × m) respectively
A ⊗ B = U1(B ⊗ A)U2, where U1 and U2 are permutation
matrices
tr(A ⊗ B) = tr A tr B
A ⊕ B = A ⊗ I_m + I_n ⊗ B
U = Σ_r Σ_s E_rs ⊗ E'_rs
Table 3

Table 4
∂f(X)/∂X = [∂f/∂x_ij]
∂|X|/∂X = |X|(X⁻¹)'   when the elements of X are independent
∂|X|/∂X = 2[X_ij] − diag{X_ii}   when X is symmetric
∂X/∂x_rs = E_rs
∂X'/∂x_rs = E'_rs
∂(AXB)/∂x_rs = A E_rs B
∂(AX'B)/∂x_rs = A E'_rs B
∂(X'A'AX)/∂x_rs = E'_rs A'AX + X'A'A E_rs
∂(X'AX)/∂x_rs = E'_rs AX + X'A E_rs
∂(Xⁿ)/∂x_rs = Σ_{k=0}^{n−1} X^k E_rs X^{n−k−1}
Table 5
∂vec(AXB)/∂vec X = B ⊗ A'
∂vec(X'AX)/∂vec X = U'(AX ⊗ I) + (I ⊗ A'X)
∂vec(AX⁻¹B)/∂vec X = −(X⁻¹B) ⊗ (X⁻¹)'A'
Table 6
∂log|X|/∂X = (X⁻¹)'
∂|X|^r/∂X = r|X|^r (X⁻¹)'
∂tr(AX)/∂X = A'
∂tr(A'X)/∂X = A
∂tr(X'AXB)/∂X = AXB + A'XB'
∂tr(XX')/∂X = 2X
∂tr(Xⁿ)/∂X = n(X^{n−1})'
∂tr(e^X)/∂X = (e^X)'
∂tr(AX⁻¹B)/∂X = −(X⁻¹BAX⁻¹)'
Table 7
∂X/∂X = D + U − Σ_r E_rr ⊗ E_rr   (X symmetric)
∂X/∂X = D   (elements of X independent)
∂X'/∂X = U
∂(XY)/∂Z = (∂X/∂Z)(I ⊗ Y) + (I ⊗ X)(∂Y/∂Z)
∂X⁻¹/∂X = −(I ⊗ X⁻¹)D(I ⊗ X⁻¹)
∂(X ⊗ Y)/∂Z = (∂X/∂Z) ⊗ Y + [I ⊗ U1][(∂Y/∂Z) ⊗ X][I ⊗ U2]
Bibliography
Index

C
chain rule, matrix, 88
chain rule, vector, 54
characteristic equation, 47
cofactor, 57
column vector, 14
companion form, 47
constrained optimisation, 94, 96

D
decomposition of a matrix, 13
 additive decomposition, 13
derivative of
 Kronecker product, 70
 matrix, 60, 62, 64, 67, 70, 75, 81
 scalar function, 56, 75
 vector, 52
determinant, 27, 56
differentiation, 94
direct product, 21

E
eigenvalues, 27, 30
eigenvectors, 27, 30
elementary matrix, 12, 19
 transpose, 19
exponential matrix, 29, 31, 42, 108

G
gradient matrix, 56

J
Jacobian, 53, 109

K
Kronecker delta, 13
Kronecker product, 21, 23, 33, 70, 85
Kronecker sum, 30

L
Lagrange multipliers, 95
least squares, 94, 96, 100

M
Matrix
 calculus, 51, 94
 companion, 47
 derivative, 37, 60, 62, 67, 70, 75, 81, 84, 88
 differential, 78
 elementary, 12, 19
 exponential, 29, 31, 42, 108
 gradient, 56
 integral, 37
 orthogonal, 97
 permutation, 23, 28, 32
 product rule, 84
 symmetric, 58, 95, 97
 transition, 42
maximum likelihood, 102
mixed product rule, 24
multivariable system, 45
multivariate normal, 102

N
normal equations, 95, 101