
16-485: Visual Navigation for Autonomous Vehicles (VNAV) Fall 2020

Recitation 1: Mathematical Preliminaries


Lecturer: Heng Yang        Scribes: Heng Yang

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
They may be distributed outside this class only with the permission of the Instructor(s).

This recitation serves as a recap of useful mathematical definitions and tools for VNAV 2020, and for research
in robotics and computer vision in general. It covers basic linear algebra (Section 1.1) and matrix calculus
(Section 1.2). These notes have benefited from [2, 1].
Notations. We use lowercase characters (e.g., s ∈ R, C) to denote real and complex scalars, bold lowercase
characters (e.g., v ∈ Rn , Cn ) for real and complex vectors, and bold uppercase characters (e.g., M ∈
Rm×n , Cm×n ) for real and complex matrices. vi denotes the i-th scalar entry of vector v, and Mij denotes
the scalar entry of matrix M in the i-th row and j-th column. For a square matrix M ∈ Rn×n, tr(M) = Σ_{i=1}^n Mii
denotes the trace of M, and det(M) denotes the determinant of M. For any matrix M ∈ Rm×n, denote
vec(M) ∈ Rmn as the column-wise vectorization of M, obtained by vertically stacking its columns. We use S n to
denote the set of real symmetric matrices of size n × n. For any vector v ∈ Rn, diag(v) creates a diagonal
matrix V ∈ S n with diagonal entries Vii = vi, i = 1, . . . , n.
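
These operators are easy to play with numerically. Below is a minimal NumPy sketch (the matrices and variable names are arbitrary illustrations, not part of the notation above):

```python
import numpy as np

M = np.array([[1., 2., 3.],
              [4., 5., 6.]])        # M in R^{2x3}

vec_M = M.flatten(order="F")        # column-wise vectorization: stack the columns
print(vec_M)                        # [1. 4. 2. 5. 3. 6.]
print(np.trace(M @ M.T))            # tr(M M^T) = sum of squared entries = 91

v = np.array([1., 2., 3.])
V = np.diag(v)                      # diagonal matrix with V_ii = v_i
print(np.allclose(V, V.T))          # True: V is symmetric
```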

1.1 Linear Algebra

1.1.1 Norms

Inner Product. The standard inner product on Rn, the set of n-dimensional real vectors, is defined as:

    ⟨x, y⟩ = x^T y = Σ_{i=1}^n x_i y_i.                               (1.1)

The standard inner product on Rm×n, the set of m × n real matrices, is defined as:

    ⟨X, Y⟩ = tr(X^T Y) = Σ_{i=1}^m Σ_{j=1}^n X_ij Y_ij.               (1.2)

Vector Norms. Let us first introduce the definition of a general vector norm in Rn .

Definition 1 (General Norm). A function f : Rn → R is called a norm if f satisfies:

(i) Nonnegativity: f (x) ≥ 0 for all x ∈ Rn ;

(ii) Definiteness: f (x) = 0 if and only if x = 0;

(iii) Nonnegative homogeneity: f (tx) = |t| f (x) for all t ∈ R and x ∈ Rn ;

(iv) Triangle inequality: f (x + y) ≤ f (x) + f (y), for all x, y ∈ Rn .


When f satisfies Definition 1, f is called a norm function and is typically denoted ‖·‖. We use ‖x‖_p to
denote the ℓ_p norm of a vector x ∈ Rn. When p ≥ 1, ‖x‖_p is defined as:

    ‖x‖_p = ( Σ_{i=1}^n |x_i|^p )^{1/p}.                              (1.3)

In particular, we care about the following three norms:

(i) p = 1, the ℓ_1 norm: ‖x‖_1 = Σ_{i=1}^n |x_i|, i.e., the sum of absolute values.

(ii) p = 2, the ℓ_2 norm (Euclidean norm): ‖x‖_2 = √(x^T x), i.e., the length of vector x.

(iii) p = ∞, the ℓ_∞ norm: ‖x‖_∞ = max_i |x_i|, i.e., the maximum of absolute values.

For applications of the ℓ_∞ norm in computer vision, one can refer to [5, 4] and the CVPR 2018 tutorial.1
Exercise: Verify that the three norms (ℓ_1, ℓ_2, ℓ_∞) satisfy the properties in Definition 1.
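A numerical spot check (not a substitute for the proof asked for in the exercise); the vectors and the scalar t below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)
t = -2.7

for p in (1, 2, np.inf):
    norm = lambda v: np.linalg.norm(v, ord=p)
    print(f"p = {p}:",
          norm(x) >= 0,                                  # nonnegativity
          np.isclose(norm(t * x), abs(t) * norm(x)),     # homogeneity
          norm(x + y) <= norm(x) + norm(y))              # triangle inequality
```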
Angle. The angle between two nonzero vectors x, y ∈ Rn is defined as:

    ∠(x, y) = arccos( x^T y / (‖x‖_2 ‖y‖_2) ),                        (1.4)

where we take arccos(·) ∈ [0, π]. We say x and y are orthogonal when x^T y = 0. In machine learning, cosine
similarity, i.e., the cosine of the angle ∠(x, y), is often used to measure the similarity of two vectors x, y.
Frobenius Norm. The most common norm on Rm×n is the Frobenius norm. For X ∈ Rm×n, its Frobenius
norm is defined as:

    ‖X‖_F = √(tr(X^T X)) = ( Σ_{i=1}^m Σ_{j=1}^n X_ij^2 )^{1/2} = ‖vec(X)‖_2.     (1.5)

Operator Norm. Suppose ‖·‖_p (p ≥ 1) is a norm on Rn and Rm, then we can define the operator norm
(induced norm) of X ∈ Rm×n as:

    ‖X‖_p = sup_{v ∈ Rn, ‖v‖_p ≤ 1} ‖Xv‖_p.                           (1.6)

A special case of the operator norm is ‖X‖_2 = σ_max(X), where σ_max(X) denotes the maximum singular
value of X. In general, ‖X‖_p is NP-hard to compute for p ∉ {1, 2, ∞}.
Exercise: Verify the matrix operator norm defined in eq. (1.6) satisfies Definition 1.
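The identities ‖X‖_F = ‖vec(X)‖_2 and ‖X‖_2 = σ_max(X) can also be checked numerically. A minimal NumPy sketch on a random matrix (the size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))

fro = np.linalg.norm(X, ord="fro")
print(np.isclose(fro, np.linalg.norm(X.flatten(order="F"))))   # ‖X‖_F = ‖vec(X)‖_2
print(np.isclose(fro, np.sqrt(np.trace(X.T @ X))))             # ‖X‖_F = sqrt(tr(X^T X))

sigma = np.linalg.svd(X, compute_uv=False)                     # singular values, descending
print(np.isclose(np.linalg.norm(X, ord=2), sigma[0]))          # ‖X‖_2 = σ_max(X)
```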

1.1.2 Trace, Vectorization and Kronecker Product

Cyclic Property. The trace operator is invariant under cyclic permutations:


tr (ABCD) = tr (BCDA) = tr (CDAB) = tr (DABC) . (1.7)
Moreover, if A, B, C ∈ S n, then the trace operator is invariant under all permutations:

    tr(ABC) = tr((ABC)^T) = tr(CBA) = tr(ACB).                        (1.8)
1 https://cs.adelaide.edu.au/~tjchin/tutorials/cvpr18/


Kronecker Product. If A ∈ Rm×n and B ∈ Rp×q , then the Kronecker product A ⊗ B is defined as:

            [ A11 B  · · ·  A1n B ]
    A ⊗ B = [   ⋮      ⋱      ⋮   ]  ∈ Rmp×nq.                        (1.9)
            [ Am1 B  · · ·  Amn B ]

Useful Equalities. The following equalities can be useful when manipulating mathematical equations:

(i) If A, B ∈ Rn×n, then tr(A^T B) = vec(A)^T vec(B).

(ii) Let A ∈ Rk×l, B ∈ Rl×m, C ∈ Rm×n, then:

    vec(ABC) = (C^T ⊗ A) vec(B) = (In ⊗ AB) vec(C) = (C^T B^T ⊗ Ik) vec(A),       (1.10)

    vec(AB) = (Im ⊗ A) vec(B) = (B^T ⊗ Ik) vec(A).                                (1.11)

(iii) Let A, X, B, Y be real matrices with proper dimensions, then:

    tr(A^T X^T B Y) = vec(X)^T (A ⊗ B) vec(Y).                                    (1.12)

(iv) Let X ∈ Rm×n, then its (i, j)-th entry can be written as:

    Xij = ei^T X ej = tr(ei^T X ej) = tr(X ej ei^T),                              (1.13)

where ei ∈ Rm is the i-th standard basis vector (1 at the i-th entry and 0 everywhere else), and ej ∈ Rn
is the j-th standard basis vector (1 at the j-th entry and 0 everywhere else).

The interested reader can refer to the supplementary material of [3] for an application of the equalities above
to solving a problem in computer vision.
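
A quick numerical sanity check of eq. (1.10) and eq. (1.12) on random matrices (a NumPy sketch; the dimensions are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)
k, l, m, n = 2, 3, 4, 5
A = rng.standard_normal((k, l))
B = rng.standard_normal((l, m))
C = rng.standard_normal((m, n))

vec = lambda M: M.flatten(order="F")            # column-wise vectorization

# eq. (1.10): vec(ABC) = (C^T ⊗ A) vec(B)
print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))

# eq. (1.12) with square A2, B2: tr(A2^T X^T B2 Y) = vec(X)^T (A2 ⊗ B2) vec(Y)
A2, B2 = rng.standard_normal((3, 3)), rng.standard_normal((4, 4))
X, Y = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
lhs = np.trace(A2.T @ X.T @ B2 @ Y)
rhs = vec(X) @ np.kron(A2, B2) @ vec(Y)
print(np.allclose(lhs, rhs))
```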

1.1.3 Orthogonal Matrices

An orthogonal matrix is a real square matrix whose rows and columns are orthonormal vectors (orthogonal
and unit norm). Formally, let Q ∈ Rn×n , then Q is an orthogonal matrix if and only if:

QT Q = QQT = In ⇐⇒ Q−1 = QT . (1.14)

We use O(n), the n-dimensional orthogonal group, to denote the set of orthogonal matrices with size n × n.
An orthogonal matrix has determinant equal to either +1 or −1, which can be easily seen from:

    det(Q^T Q) = (det(Q))^2 = det(In) = 1   =⇒   det(Q) = ±1.         (1.15)

An orthogonal matrix preserves the inner product in Euclidean space:

    ⟨Qx, Qy⟩ = (Qx)^T Qy = x^T Q^T Q y = x^T y = ⟨x, y⟩,   ∀x, y ∈ Rn, Q ∈ O(n).   (1.16)

As a result, an orthogonal matrix is also ℓ_2-norm preserving (cf. eq. (1.1)):

    ‖Qx‖_2 = ‖x‖_2,   ∀x ∈ Rn, Q ∈ O(n).                              (1.17)
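
Numerically, a random orthogonal matrix can be obtained as the Q-factor of a QR decomposition; the following NumPy sketch (illustrative only) checks eqs. (1.14), (1.15) and (1.17):

```python
import numpy as np

rng = np.random.default_rng(3)
# A random orthogonal matrix: Q-factor of a random square matrix.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

print(np.allclose(Q.T @ Q, np.eye(4)))            # Q^T Q = I_n
print(np.isclose(abs(np.linalg.det(Q)), 1.0))     # det(Q) = ±1

x = rng.standard_normal(4)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))   # ‖Qx‖_2 = ‖x‖_2
```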


1.1.4 Eigenvalues and Eigenvectors

Given a square matrix A ∈ Cn×n , if there exists a scalar λ ∈ C and a nonzero vector v ∈ Cn such that:
Av = λv, (1.18)
then v is called a right eigenvector of A and λ is the associated eigenvalue. If there exists a scalar κ ∈ C
and a nonzero vector u ∈ Cn such that:
uT A = κuT , (1.19)
then u is called a left eigenvector of A with associated eigenvalue κ. If A has n linearly independent right
eigenvectors v1 , . . . , vn with associated eigenvalues λ1 , . . . , λn , then denoting V = [v1 , . . . , vn ] and Λ = diag ([λ1 , . . . , λn ]), we have:
AV = V Λ =⇒ V −1 AV = Λ, (1.20)
i.e., A is diagonalizable by V .
As an exercise, try to prove the following lemmas.
Lemma 2 (Eigenvalues and Characteristic Polynomial). Any matrix A ∈ Cn×n has equal left and right
eigenvalues, and they are the roots of the characteristic polynomial f (λ) = det (A − λIn ) .
Lemma 3 (Real Symmetric Matrices have Real Eigenvalues). The eigenvalues of any real symmetric matrix
A ∈ S n are all real. Hence, the eigenvalues can be sorted: λ1 ≥ . . . ≥ λn .
Lemma 4 (Real Symmetric Matrices have Orthogonal Eigenvectors). Let A ∈ S n be a real symmetric
matrix and let λi ≠ λj be any two distinct eigenvalues with associated eigenvectors vi and vj , then viT vj = 0.
Moreover, if λi is a repeated eigenvalue with multiplicity m ≥ 2, then there exist m orthonormal eigenvectors
corresponding to λi .
Corollary 5 (Real Symmetric Matrices are Diagonalizable). Any real symmetric matrix A ∈ S n can be
diagonalized as:
A = U ΛU T , (1.21)
where U = [u1 , . . . , un ] ∈ O(n) is an orthogonal matrix whose columns ui are the eigenvectors of A, and
Λ = diag ([λ1 , . . . , λn ]) is a diagonal matrix containing the eigenvalues of A. The factorization in eq. (1.21)
is called the eigendecomposition or spectral decomposition of A, and is unique (up to permutation of ui and
λi ) when all the eigenvalues of A are distinct.

Some useful properties of eigenvalues and eigenvectors:

(i) The trace of a matrix equals the sum of all eigenvalues: tr(A) = Σ_{i=1}^n λi, for all A ∈ Cn×n.

(ii) The determinant of a matrix equals the product of all eigenvalues: det(A) = Π_{i=1}^n λi, for all A ∈ Cn×n.

(iii) The eigenvalues of A^k, i.e., the k-th power of A, are λi^k, i = 1, . . . , n, for any A ∈ Cn×n and k ∈ N.
More generally, let f(·) be any polynomial, then the eigenvalues of f(A) are f(λi), i = 1, . . . , n.

(iv) The eigenvalues of A^{-1} are λi^{-1}, i = 1, . . . , n, for any invertible A ∈ Cn×n.

(v) The eigenvalues of A + αIn are λi + α, i = 1, . . . , n, for any A ∈ Cn×n and α ∈ C.

(vi) For any A ∈ S n, the solutions to the following (nonconvex) problems:

    min_{‖x‖_2 = 1} x^T A x;        max_{‖x‖_2 = 1} x^T A x,          (1.22)

are the minimum and maximum eigenvalue/eigenvector pairs (λmin, vmin) and (λmax, vmax). As a result,
we have λmin ‖x‖_2^2 ≤ x^T A x ≤ λmax ‖x‖_2^2, for any x ∈ Rn.
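
A NumPy sketch (illustrative; the matrix is a random symmetric matrix) checking the spectral decomposition in eq. (1.21) and the Rayleigh-quotient bounds in property (vi):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                              # a real symmetric matrix

lam, U = np.linalg.eigh(A)                     # eigenvalues ascending, U orthogonal
print(np.allclose(A, U @ np.diag(lam) @ U.T))  # spectral decomposition, eq. (1.21)
print(np.allclose(U.T @ U, np.eye(5)))         # eigenvectors are orthonormal

x = rng.standard_normal(5)
rayleigh = x @ A @ x
xx = x @ x                                     # ‖x‖_2^2
print(lam[0] * xx <= rayleigh <= lam[-1] * xx) # λ_min ‖x‖² ≤ x^T A x ≤ λ_max ‖x‖²
```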


1.1.5 Singular Value Decomposition

The singular value decomposition (SVD) of any real matrix2 M ∈ Rm×n is:

M = U SV T , U ∈ O(m), V ∈ O(n), (1.23)

and S ∈ Rm×n is a rectangular diagonal matrix with nonnegative diagonal entries. The diagonal entries of
S are called the singular values of M . The number of nonzero singular values in S is equal to the rank of
M . The SVD in eq. (1.23) is equivalent to:

M V = U S, U ∈ O(m), V ∈ O(n), (1.24)

which implies that M vi = Sii ui , i = 1, . . . , min {m, n}, where ui ∈ Rm , vi ∈ Rn are the i-th column of U
and V , and they are called the left and right singular vectors of M , respectively.
Exercise: Is the SVD of a matrix unique? What is an SVD of an orthogonal matrix and a rotation matrix?
Relationship to Spectral Decomposition. Consider matrices M T M and M M T :

    M^T M = V S^T U^T U S V^T = V (S^T S) V^T,                        (1.25)

    M M^T = U S V^T V S^T U^T = U (S S^T) U^T.                        (1.26)

Therefore, the columns of V are eigenvectors of M T M , while the columns of U are eigenvectors of M M T .
The nonzero singular values of M are the square roots of the nonzero eigenvalues of M T M and M M T .
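
The relationship above can be verified numerically; a minimal NumPy sketch on a random 4 × 3 matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(M)                      # full SVD: U (4x4), s (3,), Vt (3x3)
S = np.zeros((4, 3))
S[:3, :3] = np.diag(s)
print(np.allclose(M, U @ S @ Vt))                # M = U S V^T

# Nonzero singular values are the square roots of the eigenvalues of M^T M.
eig = np.sort(np.linalg.eigvalsh(M.T @ M))[::-1]
print(np.allclose(s, np.sqrt(eig)))
```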

1.1.6 Positive Semidefinite Matrices

The following statements about positive semidefinite (PSD) matrices are equivalent:

(i) The matrix A ∈ S n is positive semidefinite (A ⪰ 0, A ∈ S_+^n).

(ii) For all x ∈ Rn , xT Ax ≥ 0.

(iii) All eigenvalues of A are nonnegative.

(iv) All 2^n − 1 principal minors of A are nonnegative.

(v) There exists a factorization A = B T B.

The following statements about positive definite (PD) matrices are equivalent:

(i) The matrix A ∈ S n is positive definite (A ≻ 0, A ∈ S_{++}^n).

(ii) For all nonzero x ∈ Rn , xT Ax > 0.

(iii) All eigenvalues of A are strictly positive.

(iv) All n leading principal minors of A are positive.

(v) There exists a factorization A = B T B, with B ∈ Rn×n and rank (B) = n.
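
A NumPy sketch (illustrative) checking the eigenvalue and factorization characterizations above on a random A = B^T B:

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((3, 4))
A = B.T @ B                                     # A = B^T B is PSD (4x4, rank ≤ 3)

eigs = np.linalg.eigvalsh(A)
print(np.all(eigs >= -1e-10))                   # all eigenvalues nonnegative (up to round-off)

x = rng.standard_normal(4)
print(x @ A @ x >= -1e-10)                      # x^T A x = ‖Bx‖² ≥ 0

# For a PD matrix, a Cholesky factorization A = L L^T exists (a full-rank B^T B form).
A_pd = A + 1e-3 * np.eye(4)
L = np.linalg.cholesky(A_pd)
print(np.allclose(A_pd, L @ L.T))
```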


2 Complex matrices also admit SVD factorizations.


Matrix Congruence. If P ∈ Rn×n is invertible (nonsingular), then A, B ∈ Rn×n are congruent if:

P T AP = B. (1.27)

If both A and B are symmetric, then A and B have the same numbers of positive, negative, and zero
eigenvalues. Therefore, A ⪰ 0 ⇐⇒ P T AP ⪰ 0.
Matrix Similarity. If P ∈ Rn×n is invertible (nonsingular), then A, B ∈ Rn×n are similar if:

P −1 AP = B. (1.28)

Similar matrices have the same characteristic polynomials, hence, the same eigenvalues. (Exercise: prove
this.) Therefore, A ⪰ 0 ⇐⇒ P −1 AP ⪰ 0. See [6] for an application of this.
Schur Complement. Consider the following block matrix:

        [ A  B ]
    X = [      ]  ∈ R(m+n)×(m+n),   A ∈ Rm×m,  B ∈ Rm×n,  C ∈ Rn×m,  D ∈ Rn×n.     (1.29)
        [ C  D ]

If block A is invertible, then the Schur complement of block A of X is:

    X/A = D − C A^{-1} B.                                             (1.30)

If block D is invertible, then the Schur complement of block D of X is:

    X/D = A − B D^{-1} C.                                             (1.31)

In the case that A or D is singular, replacing A−1 and D −1 with generalized inverses 3 yields the generalized
Schur complement.
The Schur complement is one of the most important tools for analyzing positive semidefiniteness and positive
definiteness of symmetric matrices. Consider the following symmetric matrix:

        [ A    B ]
    X = [        ]  ∈ S (m+n),   A ∈ Rm×m,  B ∈ Rm×n,  D ∈ Rn×n,     (1.32)
        [ B^T  D ]

then we have:

(i) Sufficient and necessary conditions for X ≻ 0:

    X ≻ 0   ⇐⇒   A ≻ 0, X/A ≻ 0   ⇐⇒   D ≻ 0, X/D ≻ 0.                (1.33)

(ii) Sufficient and necessary condition for X ⪰ 0 when A is invertible:

    X ⪰ 0   ⇐⇒   A ≻ 0, X/A ⪰ 0.                                      (1.34)

(iii) Sufficient and necessary condition for X ⪰ 0 when D is invertible:

    X ⪰ 0   ⇐⇒   D ≻ 0, X/D ⪰ 0.                                      (1.35)

(iv) Sufficient and necessary conditions for X ⪰ 0 in the general case can be described using generalized Schur complements [7].
3 https://en.wikipedia.org/wiki/Generalized_inverse
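
A numerical illustration of condition (1.34) (a NumPy sketch; the block sizes and the positive definite X are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
G = rng.standard_normal((5, 5))
X = G @ G.T + 0.1 * np.eye(5)                 # a PD (hence PSD) symmetric matrix
A, B, D = X[:2, :2], X[:2, 2:], X[2:, 2:]     # block partition with A 2x2, D 3x3

schur_A = D - B.T @ np.linalg.inv(A) @ B      # X/A = D - B^T A^{-1} B (here C = B^T)

psd = lambda M: np.all(np.linalg.eigvalsh(M) >= -1e-10)
print(psd(X), psd(A), psd(schur_A))           # all True, consistent with eq. (1.34)
```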


1.2 Matrix Calculus

A good online resource for matrix calculus: http://www.matrixcalculus.org/.


A good book for matrix calculus: The Matrix Cookbook.

1.2.1 Derivative and Gradient

Let f : Rn → Rm be a differentiable function:

    f(x) = [f1(x), f2(x), . . . , fm(x)]^T,                           (1.36)

where each fi , i = 1, . . . , m is differentiable. Then the Jacobian (or derivative) of f w.r.t. x, denoted as
Df (x), is an m × n matrix whose (i, j)-th entry is:

    [Df(x)]ij = ∂fi(x) / ∂xj,   i = 1, . . . , m;  j = 1, . . . , n.  (1.37)

In other words, the i-th row of the Jacobian is the derivative of fi w.r.t. x:

            [ Df1(x) ]
    Df(x) = [   ⋮    ]  ∈ Rm×n,   Dfi(x) ∈ R1×n,  i = 1, . . . , m.   (1.38)
            [ Dfm(x) ]

When f is a real-valued function, i.e., f : Rn → R (e.g., each fi in eq. (1.36)), the gradient
of f w.r.t. x is the transpose of Df(x):

    ∇f(x) = Df(x)^T ∈ Rn,                                             (1.39)

which is a column vector.


We can use the Jacobian to perform first-order approximation of a function f : Rn → Rm at x:

f (z) ≈ f (x) + Df (x) · (z − x) . (1.40)
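
A finite-difference check of the Jacobian and of the first-order approximation in eq. (1.40); the function f below is an arbitrary illustrative example:

```python
import numpy as np

# Example map f : R^2 -> R^2 (chosen for illustration) and its analytic Jacobian.
f = lambda x: np.array([x[0] * x[1], np.sin(x[0]) + x[1] ** 2])
Df = lambda x: np.array([[x[1], x[0]],
                         [np.cos(x[0]), 2 * x[1]]])

x = np.array([0.3, -1.2])
eps = 1e-6
# Finite-difference Jacobian: perturb one coordinate at a time.
J_fd = np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                        for e in np.eye(2)])
print(np.allclose(J_fd, Df(x), atol=1e-6))

# First-order approximation, eq. (1.40): f(z) ≈ f(x) + Df(x)(z - x).
z = x + np.array([1e-3, -2e-3])
print(np.allclose(f(z), f(x) + Df(x) @ (z - x), atol=1e-5))
```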

Chain Rule. Suppose f : Rn → Rm and g : Rm → Rp are both differentiable, then the composition
h : Rn → Rp defined by h (x) = g (f (x)) is differentiable at x, with the derivative computed by the chain
rule as:

Dh (x) = Dg (f (x)) Df (x) ∈ Rp×n , Dg (f (x)) ∈ Rp×m , Df (x) ∈ Rm×n . (1.41)
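
Similarly, the chain rule in eq. (1.41) can be checked against finite differences (the maps f and g below are arbitrary illustrative choices):

```python
import numpy as np

# h(x) = g(f(x)) with f : R^2 -> R^2 and g : R^2 -> R.
f  = lambda x: np.array([x[0] * x[1], x[0] + x[1]])
Df = lambda x: np.array([[x[1], x[0]], [1.0, 1.0]])
g  = lambda y: y[0] ** 2 + np.exp(y[1])
Dg = lambda y: np.array([[2 * y[0], np.exp(y[1])]])   # 1 x 2 Jacobian

x = np.array([0.7, -0.4])
Dh = Dg(f(x)) @ Df(x)                                  # eq. (1.41), a 1 x 2 Jacobian

eps = 1e-6
Dh_fd = np.array([[(g(f(x + eps * e)) - g(f(x - eps * e))) / (2 * eps)
                   for e in np.eye(2)]])
print(np.allclose(Dh, Dh_fd, atol=1e-6))
```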

1.2.2 Second Derivative

Let f : Rn → R be a twice differentiable function, then the second derivative, i.e., the Hessian, of f w.r.t.
x, denoted as ∇^2 f(x), is:

    [∇^2 f(x)]ij = ∂^2 f(x) / (∂xi ∂xj),   i = 1, . . . , n;  j = 1, . . . , n.    (1.42)

By definition, the Hessian ∇2 f (x) ∈ S n is a symmetric matrix. The Hessian can be interpreted as the
derivative of the gradient: ∇2 f (x) = D∇f (x).


Using the gradient and Hessian of f , the second-order approximation of f at x can be written as:

    f(z) ≈ f(x) + ∇f(x)^T (z − x) + (1/2) (z − x)^T ∇^2 f(x) (z − x).              (1.43)

Exercise: Let f : Rn → R, and g : R → R be two twice differentiable functions, what is the Hessian of
h(x) = g (f (x))?
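
A NumPy sketch (with an arbitrary illustrative f) checking the symmetry of the Hessian, its interpretation as D∇f(x), and the second-order approximation in eq. (1.43):

```python
import numpy as np

# Illustrative function f : R^2 -> R with closed-form gradient and Hessian.
f      = lambda x: x[0] ** 2 * x[1] + np.exp(x[1])
grad_f = lambda x: np.array([2 * x[0] * x[1], x[0] ** 2 + np.exp(x[1])])
hess_f = lambda x: np.array([[2 * x[1], 2 * x[0]],
                             [2 * x[0], np.exp(x[1])]])

x = np.array([0.5, -0.8])
H = hess_f(x)
print(np.allclose(H, H.T))                      # Hessian is symmetric

# Hessian as the derivative of the gradient: finite-difference D∇f(x).
eps = 1e-6
H_fd = np.column_stack([(grad_f(x + eps * e) - grad_f(x - eps * e)) / (2 * eps)
                        for e in np.eye(2)])
print(np.allclose(H, H_fd, atol=1e-6))

# Second-order approximation, eq. (1.43).
d = np.array([1e-3, 2e-3])
approx = f(x) + grad_f(x) @ d + 0.5 * d @ H @ d
print(np.isclose(f(x + d), approx, atol=1e-8))
```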

References
[1] Grigoriy Blekherman, Pablo A Parrilo, and Rekha R Thomas. Semidefinite optimization and convex
algebraic geometry. SIAM, 2012.
[2] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, 2004.
[3] Jesus Briales, Laurent Kneip, and Javier Gonzalez-Jimenez. A certifiably globally optimal solution to
the non-minimal relative pose problem. In IEEE Conf. on Computer Vision and Pattern Recognition
(CVPR), 2018.
[4] Olof Enqvist and Fredrik Kahl. Robust optimal pose estimation. In European Conf. on Computer Vision
(ECCV), pages 141–153. Springer, 2008.
[5] F. Kahl and R. Hartley. Multiple-view geometry under the ℓ∞-norm. IEEE Trans. Pattern Anal. Machine
Intell., 30(9):1603–1617, 2008.
[6] H. Yang and L. Carlone. A quaternion-based certifiably optimal solution to the Wahba problem with
outliers. In Intl. Conf. on Computer Vision (ICCV), 2019.
[7] Fuzhen Zhang. The Schur complement and its applications, volume 4. Springer Science & Business
Media, 2006.
