
Linear Algebra - Module 1


Linear Algebra

Dr. Alexander Pelaez


Linear Algebra
Module 1.1: Introduction to Linear Algebra
Introduction
Matrices are an important part of mathematics because they help us analyze
multidimensional data in a linear format.
We use matrices whenever we have data in a row/column format.
A matrix is a rectangular array of elements arranged in rows and columns.

Matrix Notation

Matrices are denoted by capital letters (e.g. A, B, C) to distinguish them from other
variables.
The dimension of a matrix is given as the number of rows by the number of columns (r x c).
For example:

Matrix A is 3 x 3

Matrix B is 2 x 2

Matrix C is 1 x 3
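
For instance, a minimal sketch in R (the entry values here are illustrative, not the slide's):

A <- matrix(1:9, nrow = 3, ncol = 3)   # 3 x 3
B <- matrix(1:4, nrow = 2, ncol = 2)   # 2 x 2
C <- matrix(1:3, nrow = 1, ncol = 3)   # 1 x 3
dim(A)   # 3 3 (rows, columns)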

Terminology

Scalar – a single element or number represented by a variable, e.g. m = 2.1, λ = 5.2

Vector – a single-column or single-row matrix (denoted with a lowercase letter)

Column Vector (n x 1)
Row Vector (1 x n)

Dimension – the number of rows by the number of columns (e.g. m x n)

Norm – sometimes referred to as the length of a vector: the square root of the sum of the
squares of its elements (may be written with two bars, ‖v‖, or one, |v|)

Normalized Vector – a vector scaled to have norm 1 (each element divided by the norm)
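
A quick illustration in R (vector values assumed for the example):

v <- c(3, 4)
norm_v <- sqrt(sum(v^2))   # norm (length): sqrt of the sum of squares = 5
u <- v / norm_v            # normalized vector: same direction, norm 1
sqrt(sum(u^2))             # 1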

Vector Operations

Inner Product – multiplies the corresponding entries of two vectors and sums the results.

Note the relationship between the inner product and the norm of a vector: the norm is the
square root of the vector's inner product with itself.

Projection of a on b: proj_b(a) = (a·b / b·b) b

Orthogonal vectors – two vectors are orthogonal if their inner product is 0.
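
A short R sketch of these operations (example vectors assumed):

a <- c(1, 2, 3)
b <- c(2, 0, 1)
sum(a * b)                       # inner product of a and b
sqrt(sum(a * a))                 # norm of a: sqrt of a's inner product with itself
(sum(a * b) / sum(b * b)) * b    # projection of a on b
sum(c(1, 0) * c(0, 1))           # 0: these two vectors are orthogonal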

Transpose of Matrix
The transpose of a matrix interchanges its rows and columns. The transpose is denoted by
the apostrophe (′) symbol. Therefore, given a matrix A, its transpose is A′; similarly, C and C′.
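
In R (example matrix assumed):

A <- matrix(1:6, nrow = 2)   # 2 x 3
t(A)                         # the transpose A' is 3 x 2
all(t(t(A)) == A)            # TRUE: transposing twice recovers A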

Matrix Addition & Subtraction
Matrix addition and subtraction are straightforward: simply add or subtract the
corresponding elements of the two matrices (which must have the same dimensions) to
obtain the result matrix.
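
For example, in R (values assumed):

A <- matrix(c(1, 2, 3, 4), nrow = 2)
B <- matrix(c(5, 6, 7, 8), nrow = 2)
A + B   # element-wise sum
A - B   # element-wise difference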

Matrix Multiplication (Scalar)

Sometimes we multiply a matrix by a scalar (a single number). Simply multiply each
element of the matrix by the scalar.
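
In R (values assumed):

3 * matrix(c(1, 2, 3, 4), nrow = 2)   # every element is multiplied by 3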

Matrix Multiplication
Multiplying matrices is a bit more complicated: each entry of the product is the sum of the
products of the row entries of one matrix (A) with the corresponding column entries of the
other matrix (B). Let's see an example before we look at the formula.
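
A minimal R sketch (matrices assumed for illustration):

A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(1:6, nrow = 3)   # 3 x 2
A %*% B                      # the 2 x 2 matrix product
sum(A[1, ] * B[, 1])         # entry (1,1): row 1 of A times column 1 of B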

Matrix Multiplication / Division: Note

When multiplying or dividing matrices, it is important to ensure that the “inner dimensions” are
the same. Examples:

If matrix A is (2x3) and matrix B is (3x2), the multiplication AB is possible because the inner
dimensions (3 and 3) are the same; the result is (2x2).

If matrix A is (3x4) and matrix B is (3x4), the multiplication AB is impossible because the inner
dimensions (4 and 3) are different.
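
The same check in R (zero matrices used just to carry the dimensions):

A <- matrix(0, nrow = 2, ncol = 3)
B <- matrix(0, nrow = 3, ncol = 2)
ncol(A) == nrow(B)   # TRUE: inner dimensions match, so A %*% B is 2 x 2
C <- matrix(0, nrow = 3, ncol = 4)
D <- matrix(0, nrow = 3, ncol = 4)
ncol(C) == nrow(D)   # FALSE: C %*% D would fail as non-conformable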

Linear Algebra
Module 1.2: Special Matrices & Linear Independence
Special Matrices

Square Matrix – a matrix whose dimensions are equal, (m x p) where m = p

Zero Matrix – a matrix whose entries are all 0, denoted O

Diagonal Matrix – a square matrix whose off-diagonal entries are all 0: every element
x_ij = 0 where i ≠ j (the diagonal entries x_ii are typically nonzero)

Identity Matrix – a square diagonal matrix whose diagonal entries are all 1: every element
x_ij = 1 where i = j, and x_ij = 0 where i ≠ j. The identity matrix is denoted by a capital I

Symmetric Matrix – a matrix equal to its own transpose, A = A′

Special Matrices

A – square matrix (3x3)

O – zero matrix

D – diagonal matrix

I – identity matrix

M – symmetric matrix

The transpose of a symmetric matrix yields itself: M′ = M
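
These special matrices are easy to build in R (entry values assumed):

A <- matrix(1:9, nrow = 3)            # square matrix (3x3)
O <- matrix(0, nrow = 3, ncol = 3)    # zero matrix
D <- diag(c(2, 5, 7))                 # diagonal matrix
I <- diag(3)                          # 3x3 identity matrix
M <- matrix(c(1, 2, 2, 3), nrow = 2)  # symmetric matrix
all(t(M) == M)                        # TRUE: M' = M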


Matrix Properties – Linear Independence

A set of vectors a₁, a₂, ..., aₙ is said to be linearly dependent if constants c₁, c₂, ..., cₙ (not all
zero) can be found such that

c₁a₁ + c₂a₂ + ··· + cₙaₙ = 0

If no such constants c₁, c₂, ..., cₙ can be found, the set of vectors is said to be linearly
independent.

In the slide's examples, A is linearly dependent because rows 1 and 2 are multiples (by a
factor of 3), while B and C are linearly independent.
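
The slide's matrices are not reproduced here; a sketch in R with an assumed matrix whose second row is 3 times its first:

A <- matrix(c(1, 2, 3,
              3, 6, 9,
              1, 0, 1), nrow = 3, byrow = TRUE)
# c1 = 3, c2 = -1, c3 = 0 (not all zero) yields the zero vector:
3 * A[1, ] - 1 * A[2, ] + 0 * A[3, ]   # 0 0 0, so the rows are linearly dependent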
Matrix Properties – Rank of Matrix

The rank of a matrix is the number of linearly independent rows or columns in the
matrix. Mathematically, it can be shown that the number of linearly independent rows
always equals the number of linearly independent columns (we will not prove that here).
Thus the rank is at most the smaller of the number of rows and columns, and in a square
matrix the two counts are equal.

For any matrix A, rank(A) = number of linearly independent rows in A
= number of linearly independent columns in A

A matrix is said to have full rank if its rank equals the smaller of the number of rows
and columns; e.g. if matrix B is (2x3) and has rank 2, then B has full rank, because the
rank equals the number of rows.
Matrix Properties – Rank of Matrix

A has a rank of 2: rank(A) = 2 (the second row is a multiple of the first).

B has a rank of 2 and is full rank, because B is 2 x 3.

C has a rank of 2.
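
In R, the rank can be checked with qr() (example matrices assumed, mirroring the descriptions above):

A <- matrix(c(1, 2, 3,
              3, 6, 9,
              1, 0, 1), nrow = 3, byrow = TRUE)
qr(A)$rank   # 2: the dependent row keeps the rank below 3
B <- matrix(c(1, 0, 2,
              0, 1, 3), nrow = 2, byrow = TRUE)
qr(B)$rank   # 2 = min(2, 3), so this 2 x 3 matrix has full rank
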
Linear Algebra
Module 1.3: The Determinant, Inverse & Trace
Matrix Properties – Determinant of Matrix

The determinant of an n×n (square) matrix A is used widely throughout mathematics.
It is denoted |A| or det(A).

For a two-by-two square matrix, |A| = a₁₁a₂₂ − a₁₂a₂₁.

Matrix Properties – Determinant of Matrix

For a three-by-three square matrix, it gets more complicated. The positive terms are the
products along the three left-to-right diagonals, and the negative terms are the products
along the three right-to-left diagonals, yielding:

|A| = (a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₃₂a₂₁) − (a₃₁a₂₂a₁₃ + a₃₂a₂₃a₁₁ + a₃₃a₁₂a₂₁)
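
In R, det() computes this directly (matrices assumed):

A <- matrix(c(2, 1, 1, 2), nrow = 2)
det(A)   # 2*2 - 1*1 = 3
B <- matrix(1:9, nrow = 3)
det(B)   # 0: the columns of B are linearly dependent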


Matrix Properties – Orthogonal Matrices

A matrix is said to be orthogonal if the following holds:

AA′ = A′A = I
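
A quick R check with a rotation matrix, a standard example of an orthogonal matrix (the angle is assumed):

theta <- 0.5
Q <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), nrow = 2)
round(Q %*% t(Q), 10)   # the identity matrix
round(t(Q) %*% Q, 10)   # likewise the identity
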
Matrix Properties – Inverse

The inverse of a matrix A, written A⁻¹, is the matrix that yields the identity matrix when
multiplied by A, i.e. A⁻¹A = I.

To verify that a matrix is the inverse, we multiply A by A⁻¹ and check that the result is I.
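
In R, solve() returns the inverse, and the product can be checked against I (matrix assumed):

A <- matrix(c(2, 1, 1, 2), nrow = 2)
A_inv <- solve(A)        # the inverse of A
round(A_inv %*% A, 10)   # the identity matrix, confirming A^-1 A = I
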
Matrix Properties – Singular Matrix

Matrices that have an inverse are known as invertible, and those that do not are known as
non-invertible.

It is important to note that not all matrices have an inverse A⁻¹ satisfying AA⁻¹ = A⁻¹A = I.

A square matrix that has no inverse is called a singular matrix. A key property of a square
singular matrix is that its determinant is 0.

Therefore, for a square singular matrix A, |A| = 0.
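
A quick R illustration with an assumed singular matrix:

S <- matrix(c(1, 2, 2, 4), nrow = 2)   # row 2 is 2 times row 1
det(S)                                 # 0: S is singular
# solve(S) would raise an error, since S has no inverse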


Matrix Division

Previously we discussed matrix multiplication. Matrix division requires us to first obtain
the inverse of the right-hand matrix and then perform a multiplication. Therefore, dividing
A by B amounts to computing AB⁻¹.
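
A sketch in R (matrices assumed; B must be invertible):

A <- matrix(c(4, 2, 2, 4), nrow = 2)
B <- matrix(c(2, 0, 0, 2), nrow = 2)
A %*% solve(B)   # "divide" A by B: multiply A by the inverse of B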

Matrix Properties - Trace

The trace of a matrix is the sum of its diagonal elements, denoted tr(·).
Matrix Functions – R

In R, we have a number of functions to help with these matrix calculations:

Function   Definition                Parameters   Package
det(M)     Determinant of a matrix   M – matrix   base
tr(M)      Trace of a matrix         M – matrix   psych
solve(M)   Inverse of a matrix       M – matrix   base
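
A quick usage sketch of the three functions (example matrix assumed):

library(psych)   # provides tr()
M <- matrix(c(2, 1, 1, 2), nrow = 2)
det(M)     # determinant: 3
tr(M)      # trace: 2 + 2 = 4
solve(M)   # inverse of M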


Linear Algebra
Module 1.4: Random Variables, Covariance and Correlation
Random Matrices and Vectors

Definition: X = [X_ij], n×p, is a random matrix if each column X_j is a random variable.

The mean E(X) of a random matrix/vector X is the matrix/vector of element-wise expected
values; in other words, E(X) = [E(X_j)].

Properties of Expected Value:

Addition: E(X + Y) = E(X) + E(Y)
(generalizes the scalar property E(X + Y) = E(X) + E(Y))

Multiplication by constants: E(AXB) = A E(X) B
(generalizes the scalar property E(cX) = cE(X))
Example:
Below is a table of data. Each column represents collected data and thus can be
considered a random variable; the dataset as a whole can be called X.
Random Matrices and Vectors

The matrix X (n×p) has p columns; each column has a mean, so there are p means. We call
this the vector of means, and it can be represented as E(X).

A variance-covariance matrix (or simply covariance matrix) of a random matrix is the matrix
of pairwise covariances. The covariance of two variables is denoted cov(x, y), and the
covariance matrix is denoted by Σ:

Σ = Cov(X) = [σ_ik] = [Cov(X_i, X_k)] = E[(X − μ)(X − μ)′]

This generalizes the scalar definition var(X) = E([X − E(X)]²).
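
In R, the sample versions of these quantities come from colMeans() and cov() (data values assumed):

X <- cbind(x1 = c(1, 2, 3, 4),
           x2 = c(2, 4, 5, 9))
colMeans(X)   # the vector of means, estimating E(X)
cov(X)        # the sample covariance matrix; variances sit on the diagonal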


Random Matrices and Vectors

Covariance Matrix: the matrix of pairwise covariances.

Pairwise covariance: σ_ik = Cov(X_i, X_k) = E[(X_i − μ_i)(X_k − μ_k)]

Notice that the covariances on the diagonal are actually the variances of the random
variables: σ_ii = var(X_i).
Random Matrices and Vectors (Cont'd)

Correlations: the matrix of pairwise correlations.

Pairwise correlations are the standardized covariances: ρ_ik = σ_ik / √(σ_ii σ_kk)

Notice that the correlations on the diagonal are actually 1.


Random Matrices and Vectors (Cont'd)

The Standard Deviation Matrix is the diagonal matrix of standard deviations, commonly
written V^(1/2), with √σ_ii on the diagonal.

Random Matrices and Vectors (Cont'd)

We know the relationships among the variance, standard deviation, covariance, and
correlation. Thus we can compute any of these matrices given a covariance or correlation
matrix, e.g. Σ = V^(1/2) ρ V^(1/2).

The Generalized Variance of a random matrix is |Σ|, the determinant of its covariance matrix.

We refer to μ, Σ, and ρ as the population mean (vector), population covariance (matrix), and
population correlation (matrix), respectively:

μ – population mean vector
Σ – population covariance matrix
ρ – population correlation matrix
Covariance to Correlation Example
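
The worked example from the slide is not reproduced here; a minimal R sketch with an assumed covariance matrix shows the conversion:

S <- matrix(c(4, 2,
              2, 9), nrow = 2)          # assumed covariance matrix
V_half <- diag(sqrt(diag(S)))           # standard deviation matrix
solve(V_half) %*% S %*% solve(V_half)   # correlation matrix: 1s on the diagonal
cov2cor(S)                              # the same result via base R
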
Linear Algebra
Module 1.5: Eigenvalues, Eigenvectors
Covariance Matrix
Recall that the covariance matrix represents the relationships between variables.

A dataset with 40 variables would yield a 40x40 covariance matrix. Unfortunately, it is very
difficult and impractical to work with 40 dimensions, so data mining and multivariate
analysis techniques rely heavily on reducing the number of dimensions.
Covariance Matrix

If we could reduce, in some manner, the number of dimensions to something more
manageable, it would help in our analysis.

A fundamental concept in linear algebra is the notion of dimension reduction using
eigenvalues and eigenvectors.
Eigenvalues & Eigenvectors

Suppose that we have an m × m matrix A.

λ is an eigenvalue of A if there exists a non-zero vector v such that

Av = λv

The vector v is said to be an eigenvector of A corresponding to the eigenvalue λ.

We can find eigenvalues by solving the equation

det(A − λI) = 0
Eigenvalues & Eigenvectors

If v is an eigenvector of A with eigenvalue λ, then Av − λIv = 0, and thus

(A − λI)v = 0

If the inverse (A − λI)⁻¹ exists, only the trivial solution v = 0 is obtained. For a non-trivial
solution to exist, (A − λI) must have no inverse, and hence

det(A − λI) = 0
Eigenvalues & Eigenvectors

How does this work? Let's take an example.

Suppose we have the 2x2 matrix A with rows (2, 1) and (1, 2), the same matrix used in the
R code below. Since (A − λI) has no inverse at an eigenvalue λ, det(A − λI) = 0, and we can
obtain a non-zero vector v that satisfies Av − λIv = 0.

Notice how Av is a 2x1 matrix … How does this relate to the original matrix A?
Eigenvalues & Eigenvectors

So now we have Av, and according to our formula above we should be able to find a
scalar (λ) that we can multiply by v to make the two sides equal.

Because we have found this solution for λ and v, we can now reduce A to a scalar, known
as an eigenvalue, and a vector v, known as the eigenvector. The eigenvalue and
eigenvector thus carry information about A, reduced into two components of smaller
dimension.
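
Worked through with the matrix A from the example (rows (2, 1) and (1, 2)):
det(A − λI) = (2 − λ)² − 1 = (λ − 3)(λ − 1) = 0, so λ = 3 or λ = 1.
Taking v = (1, 1)′ for λ = 3: Av = (2·1 + 1·1, 1·1 + 2·1)′ = (3, 3)′ = 3v.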
Eigenvalues & Eigenvectors

The eigenvalues of a covariance matrix Σ are non-negative: if λ is an eigenvalue of Σ with
corresponding eigenvector v, then λ = v′Σv / v′v ≥ 0, since v′Σv is a variance and cannot
be negative.

It should also be noted that if Σ is a 2x2 matrix, there will be two solutions to the
eigenvalue-eigenvector problem. In general, given an n×n matrix Σ, there will be n
eigenvalue-eigenvector solutions.
Eigenvalues & Eigenvectors

Extending the example using the covariance matrix Σ above:

The function in R to get the eigenvalues and eigenvectors is eigen.

The code in R:

A <- matrix(c(2, 1, 1, 2), nrow = 2, byrow = TRUE)
A
eigen(A)

The result is two eigenvalues and two eigenvectors. The eigenvalues are returned in
decreasing order (the first is always the largest), and the eigenvectors are the columns
of the $vectors object:

$values
[1] 3 1
$vectors
          [,1]       [,2]
[1,] 0.7071068 -0.7071068
[2,] 0.7071068  0.7071068
Linear Algebra
Module 1.6: Singular Value Decomposition
Singular Value Decomposition (SVD)

SVD is a factorization of a matrix: we can express any (real) matrix A in terms of
eigenvalues and eigenvectors. Let A be an n × p matrix of rank k. Then the singular
value decomposition of A can be expressed as A = UDV′.
Singular Value Decomposition (SVD)

The D matrix is a diagonal matrix whose entries are the singular values of A, the square
roots of the nonzero eigenvalues of A′A.

The U matrix holds the eigenvectors of AA′ as its columns.

The V′ matrix is the transpose of the matrix V whose columns are the eigenvectors of A′A.

A = U D V′
Singular Value Decomposition

We can express any (real) matrix A in terms of the eigenvalues and eigenvectors of A′A
and AA′.
Let A be an n × p matrix of rank k.
Then the singular value decomposition of A can be expressed as
A = UDV′
From the previous slide we can see that, using a bit of linear algebra, the equation above
parallels the eigenvalue-eigenvector formula: A is expressed through an SVD built from
eigenvectors and eigenvalues.

Singular Value Decomposition - Example
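
The slide's worked example is not reproduced here; a minimal R sketch with an assumed matrix shows the decomposition and its link to eigenvalues:

A <- matrix(c(4, 0,
              3, -5), nrow = 2, byrow = TRUE)
s <- svd(A)
s$d                            # singular values (the diagonal of D)
s$u %*% diag(s$d) %*% t(s$v)   # reconstructs A = U D V'
eigen(t(A) %*% A)$values       # eigenvalues of A'A ...
s$d^2                          # ... equal the squared singular values
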
Key Terms
Term                           Definition
Eigenvalue / Eigenvector       Av = λv
Singular Value Decomposition   A way to reduce the dimensions of a matrix: A = UDV′

Visit, Follow, Share

Please subscribe to our YouTube channel.

Visit our blog site for news on analytics and code samples:
http://blogs.5eanalytics.com
