MATLAB 2023b Mathematics
Mathematics
R2023b
How to Contact MathWorks
Phone: 508-647-7000
Linear Algebra
2
Matrices in the MATLAB Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Creating Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Adding and Subtracting Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Vector Products and Transpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Multiplying Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
Identity Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Matrix Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Kronecker Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Vector and Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Using Multithreaded Computation with Linear Algebra Functions . . . . . . . 2-9
Factorizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
Cholesky Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
LU Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-21
QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-22
Using Multithreaded Computation for Factorization . . . . . . . . . . . . . . . . 2-25
Powers and Exponentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-26
Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
Eigenvalue Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
Multiple Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31
Schur Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31
Random Numbers
3
Why Do Random Numbers Repeat After Startup? . . . . . . . . . . . . . . . . . . . 3-2
Creating and Controlling a Random Number Stream . . . . . . . . . . . . . . . 3-21
Substreams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-21
Choosing a Random Number Generator . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
Configuring a Stream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
Restore State of Random Number Generator to Reproduce Output . . . . . 3-27
Sparse Matrices
4
Computational Advantages of Sparse Matrices . . . . . . . . . . . . . . . . . . . . . 4-2
Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Computational Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Iterative Methods for Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-56
Direct vs. Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-56
Generic Iterative Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-57
Summary of Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-57
Choosing an Iterative Solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-59
Preconditioners . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-60
Equilibration and Reordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-62
Using Linear Operators Instead of Matrices . . . . . . . . . . . . . . . . . . . . . . 4-64
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-64
Add Graph Node Names, Edge Weights, and Other Attributes . . . . . . . . 5-13
Polynomial Curve Fitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
Computational Geometry
7
Triangulation Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
2-D and 3-D Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
Triangulation Matrix Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
Querying Triangulations Using the triangulation Class . . . . . . . . . . . . . . . 7-4
Interpolation
8
Gridded and Scattered Sample Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Interpolation versus Curve Fitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Grid Approximation Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Interpolation of Multiple 1-D Value Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 8-13
Optimization
9
Optimizing Nonlinear Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
Minimizing Functions of One Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
Minimizing Functions of Several Variables . . . . . . . . . . . . . . . . . . . . . . . . 9-3
Maximizing Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
fminsearch Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
States of the Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-17
Stop Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-18
Function Handles
10
Parameterizing Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
Parameterizing Using Nested Functions . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
Parameterizing Using Anonymous Functions . . . . . . . . . . . . . . . . . . . . . . 10-3
Solve Differential Algebraic Equations (DAEs) . . . . . . . . . . . . . . . . . . . . 11-30
What is a Differential Algebraic Equation? . . . . . . . . . . . . . . . . . . . . . . 11-30
Consistent Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-31
Differential Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-31
Imposing Nonnegativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-32
Solve Robertson Problem as Semi-Explicit Differential Algebraic Equations
(DAEs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-32
Solve BVP with Multiple Boundary Conditions . . . . . . . . . . . . . . . . . . . . 12-29
Numerical Integration
15
Integration to Find Arc Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2
Fourier Transforms
16
Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Creation Functions for CompositeGate Objects . . . . . . . . . . . . . . . . . . . 17-22
QUBO Problems
18
What Is a QUBO Problem? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2
QUBO and Ising Problem Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2
QUBO Problem Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3
Solution Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3
1
Matrices and Arrays
Creating, Concatenating, and Expanding Matrices
The most basic MATLAB® data structure is the matrix. A matrix is a two-dimensional, rectangular
array of data elements arranged in rows and columns. The elements can be numbers, logical values
(true or false), dates and times, strings, categorical values, or some other MATLAB data type.
Even a single number is stored as a matrix. For example, a variable containing the value 100 is stored
as a 1-by-1 matrix of type double.
A = 100;
whos A
Name Size Bytes Class Attributes
A 1x1 8 double
If you have a specific set of data, you can arrange the elements in a matrix using square brackets. A
single row of data has spaces or commas in between the elements, and a semicolon separates the
rows. For example, create a single row of four numeric elements. The size of the resulting matrix is 1-
by-4 because it has one row and four columns. A matrix of this shape is often referred to as a row
vector.
A = [12 62 93 -8]
A = 1×4
12 62 93 -8
sz = size(A)
sz = 1×2
1 4
Now create a matrix with the same numbers, but arrange them in two rows. This matrix has two rows
and two columns.
A = [12 62; 93 -8]
A = 2×2
12 62
93 -8
sz = size(A)
sz = 1×2
2 2
MATLAB has many functions that help create matrices with certain values or a particular structure.
For example, the zeros and ones functions create matrices of all zeros or all ones. The first and
second arguments of these functions are the number of rows and number of columns of the matrix,
respectively.
A = zeros(3,2)
A = 3×2
0 0
0 0
0 0
B = ones(2,4)
B = 2×4
1 1 1 1
1 1 1 1
The diag function places the input elements on the diagonal of a matrix. For example, create a row
vector A containing four elements. Then, create a 4-by-4 matrix whose diagonal elements are the
elements of A.
A = [12 62 93 -8];
B = diag(A)
B = 4×4
12 0 0 0
0 62 0 0
0 0 93 0
0 0 0 -8
Concatenating Matrices
You can also use square brackets to append existing matrices. This way of creating a matrix is called
concatenation. For example, concatenate two row vectors to make an even longer row vector.
A = ones(1,4);
B = zeros(1,4);
C = [A B]
C = 1×8
1 1 1 1 0 0 0 0
To arrange A and B as two rows of a matrix, use the semicolon.
D = [A; B]
D = 2×4
1 1 1 1
0 0 0 0
To concatenate several matrices, they must have compatible sizes. In other words, when you
concatenate matrices horizontally, they must have the same number of rows. When you concatenate
them vertically, they must have the same number of columns.
For example, create two matrices that both have two rows. Horizontally append the second matrix to
the first by using square brackets.
A = ones(2,3)
A = 2×3
1 1 1
1 1 1
B = zeros(2,2)
B = 2×2
0 0
0 0
C = [A B]
C = 2×5
1 1 1 0 0
1 1 1 0 0
The horzcat function returns the same result as the square brackets.
D = horzcat(A,B)
D = 2×5
1 1 1 0 0
1 1 1 0 0
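The size compatibility rule can be checked programmatically before concatenating. The following is a minimal sketch, assuming hypothetical matrices P, Q, and R that are not part of the examples above. It compares dimensions with size and uses vertcat, the vertical counterpart of horzcat.

```matlab
% Illustrative sketch: P, Q, and R are hypothetical example matrices.
P = ones(2,3);
Q = zeros(2,2);
R = zeros(1,3);

% Horizontal concatenation requires the same number of rows.
if size(P,1) == size(Q,1)
    H = horzcat(P,Q);      % 2-by-5, same result as [P Q]
end

% Vertical concatenation requires the same number of columns.
% P has 3 columns and Q has 2, so [P; Q] would raise an error.
canStackQ = size(P,2) == size(Q,2);   % false

% R has 3 columns, so it can be stacked under P.
V = vertcat(P,R);          % 3-by-3, same result as [P; R]
```

Checking sizes first is optional. It is mainly useful when the inputs come from code whose output dimensions are not known in advance.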
The colon is a handy way to create matrices whose elements are sequential and evenly spaced. For
example, create a row vector whose elements are the integers from 1 to 10.
A = 1:10
A = 1×10
1 2 3 4 5 6 7 8 9 10
You can use the colon operator to create a sequence of numbers within any range, incremented by
one.
A = -2.5:2.5
A = 1×6
-2.5000 -1.5000 -0.5000 0.5000 1.5000 2.5000
To change the value of the sequence increment, specify the increment value in between the starting
and ending range values, separated by colons.
A = 0:2:10
A = 1×6
0 2 4 6 8 10
To count down, specify a negative increment.
A = 6:-1:0
A = 1×7
6 5 4 3 2 1 0
You can also increment by noninteger values. If an increment value does not evenly partition the
specified range, MATLAB automatically ends the sequence at the last value it can reach before
exceeding the range.
A = 1:0.2:2.1
A = 1×6
1.0000 1.2000 1.4000 1.6000 1.8000 2.0000
Expanding a Matrix
You can add one or more elements to a matrix by placing them outside of the existing row and column
index boundaries. MATLAB automatically pads the matrix with zeros to keep it rectangular. For
example, create a 2-by-3 matrix and add an additional row and column to it by inserting an element in
the (3,4) position.
A = [10 20 30; 60 70 80]
A = 2×3
10 20 30
60 70 80
A(3,4) = 1
A = 3×4
10 20 30 0
60 70 80 0
0 0 0 1
You can also expand the size by inserting a new matrix outside of the existing index ranges.
A(4:5,5:6) = [2 3; 4 5]
A = 5×6
10 20 30 0 0 0
60 70 80 0 0 0
0 0 0 1 0 0
0 0 0 0 2 3
0 0 0 0 4 5
To expand the size of a matrix repeatedly, such as within a for loop, it is a best practice to
preallocate space for the largest matrix you anticipate creating. Without preallocation, MATLAB has
to allocate memory every time the size increases, slowing down operations. For example, preallocate
a matrix that holds up to 10,000 rows and 10,000 columns by initializing its elements to zero.
A = zeros(10000,10000);
If you need to preallocate additional elements later, you can expand it by assigning outside of the
matrix index ranges or concatenate another preallocated matrix to A.
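As a minimal sketch of this practice, assume a hypothetical loop that stores the squares of the first n integers (n and squares are illustrative names, not part of the example above). Preallocating once lets each iteration assign into an existing element instead of growing the array.

```matlab
% Illustrative sketch: n and squares are hypothetical names.
n = 1000;

% Preallocate the full vector once, before the loop.
squares = zeros(1,n);

for k = 1:n
    % Assign into an existing element; the array never changes size.
    squares(k) = k^2;
end
```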
Empty Arrays
An empty array in MATLAB is an array with at least one dimension length equal to zero. Empty arrays
are useful for representing the concept of "nothing" programmatically. For example, suppose you
want to find all elements of a vector that are less than 0, but there are none. The find function
returns an empty vector of indices, indicating that it did not find any elements less than 0.
A = [1 2 3 4];
ind = find(A<0)
ind =
1×0 empty double row vector
Many algorithms contain function calls that can return empty arrays. It is often useful to allow empty
arrays to flow through these algorithms as function arguments instead of handling them as a special
case. If you do need to customize empty array handling, you can check for them using the isempty function.
TF = isempty(ind)
TF = logical
1
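The following sketch illustrates how an empty result can pass through later steps without special handling. The variable names data, negVals, total, count, and msg are hypothetical and chosen only for this illustration.

```matlab
% Illustrative sketch: the data vector is a hypothetical example.
data = [1 2 3 4];

ind = find(data < 0);       % 1-by-0 empty result: no negative elements
negVals = data(ind);        % indexing with an empty index returns an empty array
total = sum(negVals);       % the sum of an empty row vector is 0
count = numel(negVals);     % 0 elements found

% Branch on emptiness only where the algorithm needs special handling.
if isempty(negVals)
    msg = "No negative values found.";
else
    msg = "Found " + count + " negative values.";
end
```

Here only the final reporting step needs to know that nothing was found. The indexing, sum, and numel calls all accept the empty array as ordinary input.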
See Also
Related Examples
• “Array Indexing” on page 1-21
• “Reshaping and Rearranging Arrays” on page 1-9
• “Multidimensional Arrays” on page 1-14
• “Create String Arrays”
• “Represent Dates and Times in MATLAB”
Removing Rows or Columns from a Matrix
The easiest way to remove a row or column from a matrix is to set that row or column equal to a pair
of empty square brackets []. For example, create a 4-by-4 matrix and remove the second row.
A = magic(4)
A = 4×4
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
A(2,:) = []
A = 3×4
16 2 3 13
9 7 6 12
4 14 15 1
Now remove the third column.
A(:,3) = []
A = 3×3
16 2 13
9 7 12
4 14 1
You can extend this approach to any array. For example, create a random 3-by-3-by-3 array and
remove all of the elements in the first matrix of the third dimension.
B = rand(3,3,3);
B(:,:,1) = [];
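The same empty-bracket assignment also accepts a logical index, which can remove every row that meets a condition. The sketch below is illustrative: the threshold 6 and the names rowsToDrop and szB are assumptions made for this example.

```matlab
% Illustrative sketch: the condition and variable names are hypothetical.
A = magic(4);
rowsToDrop = A(:,1) < 6;   % logical column vector [0; 1; 0; 1]
A(rowsToDrop,:) = [];      % rows 2 and 4 are removed

% Removing one page from a 3-by-3-by-3 array leaves two pages.
B = rand(3,3,3);
B(:,:,1) = [];
szB = size(B);             % [3 3 2]
```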
See Also
Related Examples
• “Reshaping and Rearranging Arrays” on page 1-9
• “Array Indexing” on page 1-21
Reshaping and Rearranging Arrays
Many functions in MATLAB® can take the elements of an existing array and put them in a different
shape or sequence. This can be helpful for preprocessing your data for subsequent computations or
analyzing the data.
Reshaping
The reshape function changes the size and shape of an array. For example, reshape a 3-by-4 matrix
to a 2-by-6 matrix.
A = [1 4 7 10; 2 5 8 11; 3 6 9 12]
A = 3×4
1 4 7 10
2 5 8 11
3 6 9 12
B = reshape(A,2,6)
B = 2×6
1 3 5 7 9 11
2 4 6 8 10 12
As long as the number of elements stays the same, you can reshape an array into an array with any
number of dimensions. Using the elements from A, create a 2-by-2-by-3 multidimensional array.
C = reshape(A,2,2,3)
C =
C(:,:,1) =
1 3
2 4
C(:,:,2) =
5 7
6 8
C(:,:,3) =
9 11
10 12
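When only one dimension of the target shape is known, reshape can compute the other one if you pass [] in its place. The following sketch reuses the 3-by-4 matrix from above. The names sameCount and sameOrder are illustrative.

```matlab
A = [1 4 7 10; 2 5 8 11; 3 6 9 12];   % same 3-by-4 matrix as above

% Let reshape compute the number of columns from numel(A) = 12.
B = reshape(A,2,[]);                  % 2-by-6

% Reshaping never changes the number of elements or their linear order.
sameCount = numel(B) == numel(A);
sameOrder = isequal(B(:),A(:));
```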
A common task in linear algebra is to work with the transpose of a matrix, which turns the rows into
columns and the columns into rows. To do this, use the transpose function or the .' operator.
A = magic(3)
A = 3×3
8 1 6
3 5 7
4 9 2
B = A.'
B = 3×3
8 3 4
1 5 9
6 7 2
A similar operator ' computes the conjugate transpose for complex matrices. This operation
computes the complex conjugate of each element and transposes it. Create a 2-by-2 complex matrix
and compute its conjugate transpose.
A = [1+i 1-i; -i i]
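A = 2×2 complex
1.0000 + 1.0000i 1.0000 - 1.0000i
0.0000 - 1.0000i 0.0000 + 1.0000i
B = A'
B = 2×2 complex
1.0000 - 1.0000i 0.0000 + 1.0000i
1.0000 + 1.0000i 0.0000 - 1.0000i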
flipud flips the rows of a matrix in an up-to-down direction, and fliplr flips the columns in a left-
to-right direction.
A = [1 2; 3 4]
A = 2×2
1 2
3 4
B = flipud(A)
B = 2×2
3 4
1 2
C = fliplr(A)
C = 2×2
2 1
4 3
You can shift elements of an array by a certain number of positions using the circshift function.
For example, create a 3-by-4 matrix and shift its columns to the right by 2. The second argument [0
2] tells circshift to shift the rows 0 places and shift the columns 2 places to the right.
A = [1 2 3 4; 5 6 7 8; 9 10 11 12]
A = 3×4
1 2 3 4
5 6 7 8
9 10 11 12
B = circshift(A,[0 2])
B = 3×4
3 4 1 2
7 8 5 6
11 12 9 10
To shift the rows of A up by 1 and keep the columns in place, specify the second argument as [-1 0].
C = circshift(A,[-1 0])
C = 3×4
5 6 7 8
9 10 11 12
1 2 3 4
The rot90 function rotates a matrix counterclockwise by 90 degrees.
A = [1 2; 3 4]
A = 2×2
1 2
3 4
B = rot90(A)
B = 2×2
2 4
1 3
If you rotate 3 more times by using the second argument to specify the number of rotations, you end
up with the original matrix A.
C = rot90(B,3)
C = 2×2
1 2
3 4
Sorting
Sorting the data in an array is also a valuable tool, and MATLAB offers a number of approaches. For
example, the sort function sorts the elements of each row or column of a matrix separately in
ascending or descending order. Create a matrix A and sort each column of A in ascending order.
A = magic(4)
A = 4×4
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
B = sort(A)
B = 4×4
4 2 3 1
5 7 6 8
9 11 10 12
16 14 15 13
Sort each row in descending order. The second argument value 2 specifies that you want to sort row-
wise.
C = sort(A,2,'descend')
C = 4×4
16 13 3 2
11 10 8 5
12 9 7 6
15 14 4 1
To sort entire rows or columns relative to each other, use the sortrows function. For example, sort
the rows of A in ascending order according to the elements in the first column. The positions of the
rows change, but the order of the elements in each row is preserved.
D = sortrows(A)
D = 4×4
4 14 15 1
5 11 10 8
9 7 6 12
16 2 3 13
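Both functions accept further options. As a sketch, sortrows can take the column to sort by and a direction, and sort can return the permutation it applied. The names E, idx, and firstCol below are illustrative.

```matlab
A = magic(4);

% Sort the rows by the second column, in descending order.
E = sortrows(A,2,'descend');

% sort can also return the permutation it applied to each column.
[B,idx] = sort(A);

% Rebuild the first sorted column from the original data and idx.
firstCol = A(idx(:,1),1);
```

The index output is useful when a second array has to be reordered in the same way as the sorted one.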
See Also
Related Examples
• “Removing Rows or Columns from a Matrix” on page 1-8
• “Array Indexing” on page 1-21
Multidimensional Arrays
A multidimensional array in MATLAB® is an array with more than two dimensions. In a matrix, the
two dimensions are represented by rows and columns.
Each element is defined by two subscripts, the row index and the column index. Multidimensional
arrays are an extension of 2-D matrices and use additional subscripts for indexing. A 3-D array, for
example, uses three subscripts. The first two are just like a matrix, but the third dimension
represents pages or sheets of elements.
You can create a multidimensional array by creating a 2-D matrix first, and then extending it. For
example, first define a 3-by-3 matrix as the first page in a 3-D array.
A = [1 2 3; 4 5 6; 7 8 9]
A = 3×3
1 2 3
4 5 6
7 8 9
Now add a second page. To do this, assign another 3-by-3 matrix to the index value 2 in the third
dimension. The syntax A(:,:,2) uses a colon in the first and second dimensions to include all rows
and all columns from the right-hand side of the assignment.
A(:,:,2) = [10 11 12; 13 14 15; 16 17 18]
A =
A(:,:,1) =
1 2 3
4 5 6
7 8 9
A(:,:,2) =
10 11 12
13 14 15
16 17 18
The cat function can be a useful tool for building multidimensional arrays. For example, create a new
3-D array B by concatenating A with a third page. The first argument indicates which dimension to
concatenate along.
B = cat(3,A,[3 2 1; 0 9 8; 5 3 7])
B =
B(:,:,1) =
1 2 3
4 5 6
7 8 9
B(:,:,2) =
10 11 12
13 14 15
16 17 18
B(:,:,3) =
3 2 1
0 9 8
5 3 7
Another way to quickly expand a multidimensional array is by assigning a single element to an entire
page. For example, add a fourth page to B that contains all zeros.
B(:,:,4) = 0
B =
B(:,:,1) =
1 2 3
4 5 6
7 8 9
B(:,:,2) =
10 11 12
13 14 15
16 17 18
B(:,:,3) =
3 2 1
0 9 8
5 3 7
B(:,:,4) =
0 0 0
0 0 0
0 0 0
Accessing Elements
To access elements in a multidimensional array, use integer subscripts just as you would for vectors
and matrices. For example, find the 1,2,2 element of A, which is in the first row, second column, and
second page of A.
A
A =
A(:,:,1) =
1 2 3
4 5 6
7 8 9
A(:,:,2) =
10 11 12
13 14 15
16 17 18
elA = A(1,2,2)
elA = 11
Use the index vector [1 3] in the second dimension to access only the first and last columns of each
page of A.
C = A(:,[1 3],:)
C =
C(:,:,1) =
1 3
4 6
7 9
C(:,:,2) =
10 12
13 15
16 18
To find the second and third rows of each page, use the colon operator to create your index vector.
D = A(2:3,:,:)
D =
D(:,:,1) =
4 5 6
7 8 9
D(:,:,2) =
13 14 15
16 17 18
Manipulating Arrays
Elements of multidimensional arrays can be moved around in many ways, similar to vectors and
matrices. reshape, permute, and squeeze are useful functions for rearranging elements. Consider
a 3-D array with two pages.
Reshaping a multidimensional array can be useful for performing certain operations or visualizing the
data. Use the reshape function to rearrange the elements of the 3-D array into a 6-by-5 matrix.
A = [1 2 3 4 5; 9 0 6 3 7; 8 1 5 0 2];
A(:,:,2) = [9 7 8 5 2; 3 5 8 5 1; 6 9 4 3 3];
B = reshape(A,[6 5])
B = 6×5
1 3 5 7 5
9 6 7 5 5
8 5 2 9 3
2 4 9 8 2
0 3 3 8 1
1 0 6 4 3
reshape operates columnwise, creating the new matrix by taking consecutive elements down each
column of A, starting with the first page then moving to the second page.
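The columnwise behavior can be checked directly. In this sketch, which reuses the array from above, sameOrder, firstCol, and matches are illustrative names.

```matlab
A = [1 2 3 4 5; 9 0 6 3 7; 8 1 5 0 2];
A(:,:,2) = [9 7 8 5 2; 3 5 8 5 1; 6 9 4 3 3];
B = reshape(A,[6 5]);

% Both arrays list their elements in the same column-major order.
sameOrder = isequal(A(:),B(:));

% The first column of B is the first column of page 1 followed by
% the second column of page 1.
firstCol = [A(:,1,1); A(:,2,1)];
matches = isequal(B(:,1),firstCol);
```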
Permutations are used to rearrange the order of the dimensions of an array. Consider a 3-D array M.
M(:,:,1) = [1 2 3; 4 5 6; 7 8 9];
M(:,:,2) = [0 5 4; 2 7 6; 9 3 1]
M =
M(:,:,1) =
1 2 3
4 5 6
7 8 9
M(:,:,2) =
0 5 4
2 7 6
9 3 1
Use the permute function to interchange row and column subscripts on each page by specifying the
order of dimensions in the second argument. The original rows of M are now columns, and the
columns are now rows.
P1 = permute(M,[2 1 3])
P1 =
P1(:,:,1) =
1 4 7
2 5 8
3 6 9
P1(:,:,2) =
0 2 9
5 7 3
4 6 1
Similarly, interchange the row and page subscripts of M.
P2 = permute(M,[3 2 1])
P2 =
P2(:,:,1) =
1 2 3
0 5 4
P2(:,:,2) =
4 5 6
2 7 6
P2(:,:,3) =
7 8 9
9 3 1
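One way to read the dimension order argument is as a rule for where each element moves. The sketch below checks that rule for a single element. The subscripts r, c, and p and the names movedP1, movedP2, szP2, and M2 are illustrative. The sketch also uses ipermute, which applies the inverse reordering.

```matlab
M = cat(3,[1 2 3; 4 5 6; 7 8 9],[0 5 4; 2 7 6; 9 3 1]);

P1 = permute(M,[2 1 3]);   % swap rows and columns on each page
P2 = permute(M,[3 2 1]);   % swap rows and pages

% Element (r,c,p) of M moves to (c,r,p) in P1 and to (p,c,r) in P2.
r = 2; c = 3; p = 1;
movedP1 = P1(c,r,p) == M(r,c,p);
movedP2 = P2(p,c,r) == M(r,c,p);

% The size vector is reordered in the same way.
szP2 = size(P2);           % [2 3 3]

% ipermute undoes the reordering and recovers M.
M2 = ipermute(P2,[3 2 1]);
```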
When working with multidimensional arrays, you might encounter one that has an unnecessary
dimension of length 1. The squeeze function performs another type of manipulation that eliminates
dimensions of length 1. For example, use the repmat function to create a 2-by-3-by-1-by-4 array
whose elements are each 5, and whose third dimension has length 1.
A = repmat(5,[2 3 1 4])
A =
A(:,:,1,1) =
5 5 5
5 5 5
A(:,:,1,2) =
5 5 5
5 5 5
A(:,:,1,3) =
5 5 5
5 5 5
A(:,:,1,4) =
5 5 5
5 5 5
szA = size(A)
szA = 1×4
2 3 1 4
numdimsA = ndims(A)
numdimsA = 4
Use the squeeze function to remove the third dimension, resulting in a 3-D array.
B = squeeze(A)
B =
B(:,:,1) =
5 5 5
5 5 5
B(:,:,2) =
5 5 5
5 5 5
B(:,:,3) =
5 5 5
5 5 5
B(:,:,4) =
5 5 5
5 5 5
szB = size(B)
szB = 1×3
2 3 4
numdimsB = ndims(B)
numdimsB = 3
See Also
Related Examples
• “Creating, Concatenating, and Expanding Matrices” on page 1-2
• “Array Indexing” on page 1-21
• “Reshaping and Rearranging Arrays” on page 1-9
Array Indexing
In MATLAB®, there are three primary approaches to accessing array elements based on their
location (index) in the array. These approaches are indexing by position, linear indexing, and logical
indexing.
The most common way is to explicitly specify the indices of the elements. For example, to access a
single element of a matrix, specify the row number followed by the column number of the element.
A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
A = 4×4
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
e = A(3,2)
e = 10
You can also reference multiple elements at a time by specifying their indices in a vector. For
example, access the first and third elements of the second row of A.
r = A(2,[1 3])
r = 1×2
5 7
To access elements in a range of rows or columns, use the colon. For example, access the elements
in the first through third row and the second through fourth column of A.
r = A(1:3,2:4)
r = 3×3
2 3 4
6 7 8
10 11 12
An alternative way to compute r is to use the keyword end to specify the second column through the
last column. This approach lets you specify the last column without knowing exactly how many
columns are in A.
r = A(1:3,2:end)
r = 3×3
2 3 4
6 7 8
10 11 12
If you want to access all of the rows or columns, use the colon operator by itself. For example, return
the entire third column of A.
r = A(:,3)
r = 4×1
3
7
11
15
In general, you can use indexing to access elements of any array in MATLAB regardless of its data
type or dimensions. For example, directly access a column of a datetime array.
t = [datetime(2018,1:5,1); datetime(2019,1:5,1)]
t = 2x5 datetime
01-Jan-2018 01-Feb-2018 01-Mar-2018 01-Apr-2018 01-May-2018
01-Jan-2019 01-Feb-2019 01-Mar-2019 01-Apr-2019 01-May-2019
march1 = t(:,3)
march1 = 2x1 datetime
01-Mar-2018
01-Mar-2019
For higher-dimensional arrays, expand the syntax to match the array dimensions. Consider a random
3-by-3-by-3 numeric array. Access the element in the second row, third column, and first sheet of the
array.
A = rand(3,3,3);
e = A(2,3,1)
e = 0.5469
For more information on working with multidimensional arrays, see “Multidimensional Arrays” on
page 1-14.
Another method for accessing elements of an array is to use only a single index, regardless of the size
or dimensions of the array. This method is known as linear indexing. While MATLAB displays arrays
according to their defined sizes and shapes, they are actually stored in memory as a single column of
elements. A good way to visualize this concept is with a matrix. While the following array is displayed
as a 3-by-3 matrix, MATLAB stores it as a single column made up of the columns of A appended one
after the other. The stored vector contains the sequence of elements 12, 45, 33, 36, 29, 25, 91, 48,
11, and can be displayed using a single colon.
A = [12 36 91; 45 29 48; 33 25 11]
A = 3×3
12 36 91
45 29 48
33 25 11
Alinear = A(:)
Alinear = 9×1
12
45
33
36
29
25
91
48
11
For example, the 3,2 element of A is 25, and you can access it using the syntax A(3,2). You can also
access this element using the syntax A(6), since 25 is the sixth element of the stored vector sequence.
e = A(3,2)
e = 25
elinear = A(6)
elinear = 25
While linear indexing can be less intuitive visually, it can be powerful for performing certain
computations that are not dependent on the size or shape of the array. For example, you can easily
sum all of the elements of A without having to provide a second argument to the sum function.
s = sum(A(:))
s = 330
The sub2ind and ind2sub functions help to convert between original array indices and their linear
version. For example, compute the linear index of the 3,2 element of A.
linearidx = sub2ind(size(A),3,2)
linearidx = 6
Convert from the linear index back to its row and column form.
[row,col] = ind2sub(size(A),6)
row = 3
col = 2
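For a 2-D array, the conversion follows directly from column-major storage: skip the full columns to the left of the element, then count down to its row. The following sketch computes the index by hand and compares it with sub2ind. The names nrows, manualIdx, builtinIdx, and value are illustrative.

```matlab
A = [12 36 91; 45 29 48; 33 25 11];
nrows = size(A,1);
row = 3;
col = 2;

% Column-major rule for a 2-D array: skip (col-1) full columns,
% then count down to the requested row.
manualIdx = (col-1)*nrows + row;          % 6
builtinIdx = sub2ind(size(A),row,col);    % 6
value = A(manualIdx);                     % 25
```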
Using true and false logical indicators is another useful way to index into arrays, particularly when
working with conditional statements. For example, say you want to know if the elements of a matrix A
are less than the corresponding elements of another matrix B. The less-than operator returns a
logical array whose elements are 1 when an element in A is smaller than the corresponding element
in B.
A = [1 2 6; 4 3 6]
A = 2×3
1 2 6
4 3 6
B = [0 3 7; 3 7 5]
B = 2×3
0 3 7
3 7 5
ind = A<B
ind = 2×3 logical array
0 1 1
0 1 0
Now that you know the locations of the elements meeting the condition, you can inspect the
individual values using ind as the index array. MATLAB matches the locations of the value 1 in ind to
the corresponding elements of A and B, and lists their values in a column vector.
Avals = A(ind)
Avals = 3×1
2
3
6
Bvals = B(ind)
Bvals = 3×1
3
7
7
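A logical index can also appear on the left side of an assignment, so that only the selected elements change. The sketch below reuses A and B from above. The names numTrue and C are illustrative.

```matlab
A = [1 2 6; 4 3 6];
B = [0 3 7; 3 7 5];
ind = A < B;

numTrue = nnz(ind);   % number of elements that satisfy the condition

% Replace only the selected elements of a copy of A with the
% matching elements of B.
C = A;
C(ind) = B(ind);      % C now holds the larger value at each position
```

Working on a copy keeps the original A available for comparison.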
MATLAB "is" functions also return logical arrays that indicate which elements of the input meet a
certain condition. For example, check which elements of a string vector are missing using the
ismissing function.
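str = ["A" "B" missing "D" "E" missing];
ind = ismissing(str)
ind = 1×6 logical array
0 0 1 0 0 1
Suppose you want to find the values of the elements that are not missing. Use the ~ operator with the
index vector ind to do this.
strvals = str(~ind)
strvals = 1×4 string
"A" "B" "D" "E"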
For more examples using logical indexing, see “Find Array Elements That Meet a Condition”.
See Also
Related Examples
• “Access Data Using Categorical Arrays”
• “Access Data in Tables”
• “Structure Arrays”
• “Access Data in Cell Array”
External Websites
• Programming: Organizing Data (MathWorks Teaching Resources)
2
Linear Algebra
Matrices in the MATLAB Environment
The MATLAB environment uses the term matrix to indicate a variable containing real or complex
numbers arranged in a two-dimensional grid. An array is, more generally, a vector, matrix, or higher
dimensional grid of numbers. All arrays in MATLAB are rectangular, in the sense that the component
vectors along any dimension are all the same length. The mathematical operations defined on
matrices are the subject of linear algebra.
Creating Matrices
MATLAB has many functions that create different kinds of matrices. For example, you can create a
symmetric matrix with entries based on Pascal's triangle:
A = pascal(3)
A =
1 1 1
1 2 3
1 3 6
Or, you can create an unsymmetric magic square matrix, which has equal row and column sums:
B = magic(3)
B =
8 1 6
3 5 7
4 9 2
Another example is a 3-by-2 rectangular matrix of random integers. In this case the first input to
randi describes the range of possible values for the integers, and the second two inputs describe the
number of rows and columns.
C = randi(10,3,2)
C =
9 10
10 7
2 1
A column vector is an m-by-1 matrix, a row vector is a 1-by-n matrix, and a scalar is a 1-by-1 matrix.
To define a matrix manually, use square brackets [ ] to denote the beginning and end of the array.
Within the brackets, use a semicolon ; to denote the end of a row. In the case of a scalar (1-by-1
matrix), the brackets are not required. For example, these statements produce a column vector, a row
vector, and a scalar:
u = [3; 1; 4]
v = [2 0 -1]
s = 7
u =
3
1
4
v =
2 0 -1
s =
7
For more information about creating and working with matrices, see “Creating, Concatenating, and
Expanding Matrices” on page 1-2.
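Adding and Subtracting Matrices
Addition and subtraction of matrices and arrays is performed element by element. For example, adding
A to B and then subtracting A from the result recovers B:
X = A + B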
X =
9 2 7
4 7 10
5 12 8
Y = X - A
Y =
8 1 6
3 5 7
4 9 2
Addition and subtraction require both matrices to have compatible dimensions. If the dimensions are
incompatible, an error results:
X = A + C
Error using +
Matrix dimensions must agree.
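Vector Products and Transpose
A row vector and a column vector of the same length can be multiplied in either order. The result is
either a scalar, called the inner product, or a matrix, called the outer product:
u = [3; 1; 4];
v = [2 0 -1];
x = v*u
x =
2
X = u*v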
X =
6 0 -3
2 0 -1
8 0 -4
For real matrices, the transpose operation interchanges a_ij and a_ji. For complex matrices, another
consideration is whether to take the complex conjugate of complex entries in the array to form the
complex conjugate transpose. MATLAB uses the apostrophe operator (') to perform a complex
conjugate transpose, and the dot-apostrophe operator (.') to transpose without conjugation. For
matrices containing all real elements, the two operators return the same result.
B = magic(3)
B =
8 1 6
3 5 7
4 9 2
X = B'
X =
8 3 4
1 5 9
6 7 2
For vectors, transposition turns a row vector into a column vector (and vice versa):
x = v'
x =
2
0
-1
If x and y are both real column vectors, then the product x*y is not defined, but the two products
x'*y
and
y'*x
produce the same scalar result. This quantity is used so frequently, it has three different names: inner
product, scalar product, or dot product. There is even a dedicated function for dot products named
dot.
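The equivalence of the three forms can be sketched with two small vectors. The vectors x and y below are hypothetical values chosen for this illustration, as are the names p1, p2, and p3.

```matlab
% Illustrative sketch: x and y are hypothetical real column vectors.
x = [1; 2; 3];
y = [4; -5; 6];

p1 = x'*y;        % 1*4 + 2*(-5) + 3*6 = 12
p2 = y'*x;        % same scalar
p3 = dot(x,y);    % dedicated function, same result
```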
For a complex vector or matrix, z, the quantity z' not only transposes the vector or matrix, but also
converts each complex element to its complex conjugate. That is, the sign of the imaginary part of
each complex element changes. For example, consider the complex matrix
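z = [1+2i 7-3i 3+4i; 6-2i 9i 4+7i]
z =
1.0000 + 2.0000i 7.0000 - 3.0000i 3.0000 + 4.0000i
6.0000 - 2.0000i 0.0000 + 9.0000i 4.0000 + 7.0000i
The complex conjugate transpose of z is:
z'
ans =
1.0000 - 2.0000i 6.0000 + 2.0000i
7.0000 + 3.0000i 0.0000 - 9.0000i
3.0000 - 4.0000i 4.0000 - 7.0000i
The unconjugated complex transpose, where the imaginary part of each element retains its sign, is
denoted by z.':
z.'
ans =
1.0000 + 2.0000i 6.0000 - 2.0000i
7.0000 - 3.0000i 0.0000 + 9.0000i
3.0000 + 4.0000i 4.0000 + 7.0000i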
For complex vectors, the two scalar products x'*y and y'*x are complex conjugates of each other,
and the scalar product x'*x of a complex vector with itself is real.
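These two properties can be verified numerically. In the sketch below, the complex vectors x and y are hypothetical values chosen so that the arithmetic is exact, and a, b, n, areConjugates, and selfIsReal are illustrative names.

```matlab
% Illustrative sketch: x and y are hypothetical complex column vectors.
x = [1+2i; 3-1i];
y = [2-1i; 1i];

a = x'*y;           % the ' operator conjugates x before multiplying
b = y'*x;
n = x'*x;           % sum of squared magnitudes: 5 + 10 = 15

areConjugates = a == conj(b);
selfIsReal = imag(n) == 0;
```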
Multiplying Matrices
Multiplication of matrices is defined in a way that reflects composition of the underlying linear
transformations and allows compact representation of systems of simultaneous linear equations. The
matrix product C = AB is defined when the column dimension of A is equal to the row dimension of B,
or when one of them is a scalar. If A is m-by-p and B is p-by-n, their product C is m-by-n. The product
can actually be defined using MATLAB for loops, colon notation, and vector dot products:
A = pascal(3);
B = magic(3);
m = 3;
n = 3;
C = zeros(m,n);   % preallocate the result
for i = 1:m
    for j = 1:n
        % row i of A times column j of B (a vector dot product)
        C(i,j) = A(i,:)*B(:,j);
    end
end
MATLAB uses an asterisk to denote matrix multiplication, as in C = A*B. Matrix multiplication is not
commutative; that is, A*B is typically not equal to B*A:
X = A*B
X =
15 15 15
26 38 26
41 70 39
Y = B*A
Y =
15 28 47
15 34 60
15 28 43
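As a sketch of the definition, the following code writes the product out as an explicit triple loop over the inner dimension and compares it with the built-in operator. The variable names matchesBuiltin and commutes are introduced here only for illustration.

```matlab
A = pascal(3);
B = magic(3);
[m, p] = size(A);
n = size(B, 2);

% C(i,j) is the sum over k of A(i,k)*B(k,j)
C = zeros(m, n);
for i = 1:m
    for j = 1:n
        for k = 1:p
            C(i,j) = C(i,j) + A(i,k)*B(k,j);
        end
    end
end

matchesBuiltin = isequal(C, A*B);   % integer data, so the match is exact
commutes = isequal(A*B, B*A);       % false for these matrices
```

In practice A*B is preferable to any explicit loop; the loop is shown only to make the definition concrete.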
A matrix can be multiplied on the right by a column vector and on the left by a row vector:
u = [3; 1; 4];
x = A*u
x =
8
17
30
v = [2 0 -1];
y = v*B
y =
12 -7 10
Rectangular matrix multiplications must satisfy the dimension compatibility conditions. Since A is 3-
by-3 and C is 3-by-2, you can multiply them to get a 3-by-2 result (the common inner dimension
cancels):
X = A*C
X =
24 17
47 42
79 77
Y = C*A
Error using *
Incorrect dimensions for matrix multiplication. Check that the number of columns
in the first matrix matches the number of rows in the second matrix. To perform
elementwise multiplication, use '.*'.
Any array can be multiplied by a scalar:
s = 10;
w = s*y
w =
120 -70 100
When you multiply an array by a scalar, the scalar implicitly expands to be the same size as the other
input. This is often referred to as scalar expansion.
Identity Matrix
Generally accepted mathematical notation uses the capital letter I to denote identity matrices,
matrices of various sizes with ones on the main diagonal and zeros elsewhere. These matrices have
the property that AI = A and IA = A whenever the dimensions are compatible.
The original version of MATLAB could not use I for this purpose because it did not distinguish
between uppercase and lowercase letters and i already served as a subscript and as the complex unit.
So an English language pun was introduced. The function
eye(m,n)
returns an m-by-n rectangular identity matrix and eye(n) returns an n-by-n square identity matrix.
Matrix Inverse
If a matrix A is square and nonsingular (nonzero determinant), then the equations AX = I and XA = I
have the same solution X. This solution is called the inverse of A and is denoted $A^{-1}$. The inv function
and the expression A^-1 both compute the matrix inverse.
A = pascal(3)
A =
1 1 1
1 2 3
1 3 6
X = inv(A)
X =
3.0000 -3.0000 1.0000
-3.0000 5.0000 -2.0000
1.0000 -2.0000 1.0000
A*X
ans =
1.0000 0 0
0.0000 1.0000 -0.0000
-0.0000 0.0000 1.0000
The determinant calculated by det is a measure of the scaling factor of the linear transformation
described by the matrix. When the determinant is exactly zero, the matrix is singular and no inverse
exists.
d = det(A)
d =
1
Some matrices are nearly singular, and despite the fact that an inverse matrix exists, the calculation
is susceptible to numerical errors. The cond function computes the condition number for inversion,
which gives an indication of the accuracy of the results from matrix inversion. The condition number
ranges from 1 for a numerically stable matrix to Inf for a singular matrix.
c = cond(A)
c =
61.9839
It is seldom necessary to form the explicit inverse of a matrix. A frequent misuse of inv arises when
solving the system of linear equations Ax = b. The best way to solve this equation, from the
standpoint of both execution time and numerical accuracy, is to use the matrix backslash operator x
= A\b. See mldivide for more information.
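The comparison can be sketched as follows. The test matrix is an assumption made for this sketch: a random matrix shifted to be diagonally dominant so that it is well conditioned. Both approaches then give nearly the same answer, but backslash does so without forming the inverse.

```matlab
rng(0);                      % fixed seed so the run is repeatable
n = 500;
A = rand(n) + n*eye(n);      % diagonally dominant, hence well conditioned
b = rand(n, 1);

x1 = A\b;                    % recommended: solve without forming inv(A)
x2 = inv(A)*b;               % discouraged: explicit inverse, then multiply

r1 = norm(A*x1 - b);         % residual of the backslash solution
r2 = norm(A*x2 - b);         % residual of the inverse-based solution
relDiff = norm(x1 - x2)/norm(x1);
fprintf('backslash residual %.2e, inv residual %.2e\n', r1, r2);
```

For ill-conditioned matrices the gap between the two residuals typically grows, which is one reason the explicit inverse is discouraged.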
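Kronecker Tensor Product
The Kronecker product, kron(X,Y), of two matrices is the larger matrix formed from all possible
products of the elements of X with those of Y. It is often used with matrices of zeros and ones to
build up repeated copies of small matrices. For example, if X is the 2-by-2 matrix
X = [1 2
3 4]
and I = eye(2,2) is the 2-by-2 identity matrix, then
kron(X,I)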
ans =
1 0 2 0
0 1 0 2
3 0 4 0
0 3 0 4
and
kron(I,X)
ans =
1 2 0 0
3 4 0 0
0 0 1 2
0 0 3 4
Aside from kron, some other functions that are useful to replicate arrays are repmat, repelem, and
blkdiag.
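As a brief sketch of how these replication functions relate to kron, the following code builds the same matrices both ways. The variable names are chosen here for illustration.

```matlab
X = [1 2; 3 4];
T = repmat(X, 2, 3);         % tile X in a 2-by-3 arrangement (4-by-6 result)
E = repelem(X, 2, 2);        % repeat each element as a 2-by-2 block (4-by-4 result)
D = blkdiag(X, X);           % place copies of X along the block diagonal (4-by-4 result)

% Each result coincides with a Kronecker product against a simple pattern
tileMatches  = isequal(T, kron(ones(2,3), X));
elemMatches  = isequal(E, kron(X, ones(2)));
blockMatches = isequal(D, kron(eye(2), X));
```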
Vector and Matrix Norms
The p-norm of a vector x,
$\|x\|_p = \left( \sum_i |x_i|^p \right)^{1/p}$,
is computed by norm(x,p). This operation is defined for any value of p ≥ 1, and the most common
values of p are 1, 2, and ∞. The default value is p = 2, which corresponds to Euclidean length or
vector magnitude:
v = [2 0 -1];
[norm(v,1) norm(v) norm(v,inf)]
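ans =
3.0000 2.2361 2.0000
The p-norm of a matrix A,
$\|A\|_p = \max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}$,
can be computed for p = 1, 2, and ∞ by norm(A,p). Again, the default value is p = 2: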
A = pascal(3);
[norm(A,1) norm(A) norm(A,inf)]
ans =
10.0000 7.8730 10.0000
In cases where you want to calculate the norm of each row or column of a matrix, you can use
vecnorm:
vecnorm(A)
ans =
1.7321 3.7417 6.7823
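By default vecnorm works down the columns. The sketch below uses the optional norm-type and dimension arguments to obtain row norms and column 1-norms as well.

```matlab
A = pascal(3);
colNorms  = vecnorm(A);         % 2-norm of each column (default)
rowNorms  = vecnorm(A, 2, 2);   % 2-norm of each row, returned as a column vector
colNorms1 = vecnorm(A, 1);      % 1-norm (sum of absolute values) of each column

% The matrix 1-norm is the largest column 1-norm
maxColSum = max(colNorms1);
```

Because pascal(3) is symmetric, rowNorms here is simply the transpose of colNorms.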
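Using Multithreaded Computation with Linear Algebra Functions
MATLAB supports multithreaded computation for a number of linear algebra and element-wise
numerical functions. These functions automatically execute on multiple threads. For a function or
expression to execute faster on multiple CPUs, a number of conditions must be true:
1 The function performs operations that easily partition into sections that execute concurrently.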
These sections must be able to execute with little communication between processes. They
should require few sequential operations.
2 The data size is large enough so that any advantages of concurrent execution outweigh the time
required to partition the data and manage separate execution threads. For example, most
functions speed up only when the array contains several thousand elements or more.
3 The operation is not memory-bound; processing time is not dominated by memory access time. As
a general rule, complicated functions speed up more than simple functions.
The matrix multiply (X*Y) and matrix power (X^p) operators show significant increase in speed on
large double-precision arrays (on order of 10,000 elements). The matrix analysis functions det,
rcond, hess, and expm also show significant increase in speed on large double-precision arrays.
Systems of Linear Equations
Computational Considerations
One of the most important problems in technical computing is the solution of systems of simultaneous
linear equations.
In matrix notation, the general problem takes the following form: Given two matrices A and b, does
there exist a unique matrix x, so that Ax = b or xA = b?
It is instructive to consider a 1-by-1 example. For example, does the equation
7x = 21
have a unique solution?
The answer, of course, is yes. The equation has the unique solution x = 3. The solution is easily
obtained by division:
x = 21/7 = 3.
The solution is not ordinarily obtained by computing the inverse of 7, that is $7^{-1} = 0.142857\ldots$, and
then multiplying $7^{-1}$ by 21. This would be more work and, if $7^{-1}$ is represented to a finite number of
digits, less accurate. Similar considerations apply to sets of linear equations with more than one
unknown; MATLAB solves such equations without computing the inverse of the matrix.
Although it is not standard mathematical notation, MATLAB uses the division terminology familiar in
the scalar case to describe the solution of a general system of simultaneous equations. The two
division symbols, slash, /, and backslash, \, correspond to the two MATLAB functions mrdivide and
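mldivide. These operators are used for the two situations where the unknown matrix appears on the
left or right of the coefficient matrix:
x = b/A    Denotes the solution to the matrix equation xA = b, obtained using mrdivide.
x = A\b    Denotes the solution to the matrix equation Ax = b, obtained using mldivide.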
The dimension compatibility conditions for x = A\b require the two matrices A and b to have the
same number of rows. The solution x then has the same number of columns as b and its row
dimension is equal to the column dimension of A. For x = b/A, the roles of rows and columns are
interchanged.
In practice, linear equations of the form Ax = b occur more frequently than those of the form xA = b.
Consequently, the backslash is used far more frequently than the slash. The remainder of this section
concentrates on the backslash operator; the corresponding properties of the slash operator can be
inferred from the identity:
(b/A)' = (A'\b').
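The identity can be checked numerically with a small sketch. The row vector b below is an illustrative choice so that x*A = b is well defined.

```matlab
A = pascal(3);
b = [3 1 4];          % row vector, so x*A = b is well defined

x  = b/A;             % solves x*A = b
xt = (A'\b')';        % same solution through backslash and transposes

residual   = norm(x*A - b);
difference = norm(x - xt);
```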
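The coefficient matrix A need not be square. If A has size m-by-n, then there are three cases:
m = n    Square system. Seek an exact solution.
m > n    Overdetermined system, with more equations than unknowns. Find a least-squares solution.
m < n    Underdetermined system, with fewer equations than unknowns. Find a basic solution with at
         most m nonzero components.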
The mldivide operator employs different solvers to handle different kinds of coefficient matrices.
The various cases are diagnosed automatically by examining the coefficient matrix. For more
information, see the “Algorithms” section of the mldivide reference page.
General Solution
The general solution to a system of linear equations Ax = b describes all possible solutions. You can
find the general solution by:
1 Solving the corresponding homogeneous system Ax = 0. Do this using the null command, by
typing null(A). This returns a basis for the solution space to Ax = 0. Any solution is a linear
combination of basis vectors.
2 Finding a particular solution to the nonhomogeneous system Ax = b.
You can then write any solution to Ax = b as the sum of the particular solution to Ax = b, from step 2,
plus a linear combination of the basis vectors from step 1.
The rest of this section describes how to use MATLAB to find a particular solution to Ax = b, as in
step 2.
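The two steps can be sketched for a small singular system. The matrix, the right-hand side, and the coefficient c are illustrative assumptions, and pinv is used here as one convenient way to obtain a particular solution.

```matlab
A = [1 3 7; -1 4 4; 1 10 18];   % singular: row 3 = 2*row 1 + row 2
b = [5; 2; 12];                 % chosen so that A*x = b is consistent

N  = null(A);         % step 1: basis for the solutions of A*x = 0
xp = pinv(A)*b;       % step 2: one particular solution of A*x = b

c = 2.5;              % arbitrary coefficient (assumption for this sketch)
x = xp + N*c;         % another solution of A*x = b

resParticular = norm(A*xp - b);
resGeneral    = norm(A*x - b);
```

Both residuals are at round-off level, which illustrates that adding any null space combination to a particular solution gives another solution.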
Square Systems
The most common situation involves a square coefficient matrix A and a single right-hand side column
vector b.
If the matrix A is nonsingular, then the solution, x = A\b, is the same size as b. For example:
A = pascal(3);
u = [3; 1; 4];
x = A\u
x =
10
-12
5
If A and b are square and the same size, X = A\b is also that size:
b = magic(3);
X = A\b
X =
19 -3 -1
-17 4 13
6 0 -6
Both of these examples have exact, integer solutions. This is because the coefficient matrix was
chosen to be pascal(3), which is nonsingular (full rank) and has determinant 1, so its inverse
contains only integers.
A square matrix A is singular if it does not have linearly independent columns. If A is singular, the
solution to Ax = b either does not exist, or is not unique. The backslash operator, A\b, issues a
warning if A is nearly singular or if it detects exact singularity.
If A is singular and Ax = b has a solution, you can find a particular solution (one of infinitely
many) by typing
P = pinv(A)*b
pinv(A) is the pseudoinverse of A. If Ax = b does not have an exact solution, then pinv(A)*b returns
a least-squares solution.
For example:
A = [ 1 3 7
-1 4 4
1 10 18 ]
rank(A)
ans =
2
Since A is not full rank, it has some singular values equal to zero.
Exact Solutions. For b = [5;2;12], the equation Ax = b has an exact solution, given by
pinv(A)*b
ans =
0.3850
-0.1103
0.7066
Verify that pinv(A)*b is an exact solution by typing
A*pinv(A)*b
ans =
5.0000
2.0000
12.0000
Least-Squares Solutions. However, if b = [3;6;0], Ax = b does not have an exact solution. In this
case, pinv(A)*b returns a least-squares solution. If you type
A*pinv(A)*b
ans =
-1.0000
4.0000
2.0000
you do not get back the original vector b.
You can determine whether Ax = b has an exact solution by finding the row reduced echelon form of
the augmented matrix [A b]. To do so for this example, enter
rref([A b])
ans =
1.0000 0 2.2857 0
0 1.0000 1.5714 0
0 0 0 1.0000
Since the bottom row contains all zeros except for the last entry, the equation does not have a
solution. In this case, pinv(A)*b returns a least-squares solution.
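An equivalent consistency check compares ranks: Ax = b has an exact solution exactly when appending b to A does not increase the rank. The sketch below applies this to the two right-hand sides used above; the variable names are chosen here for illustration.

```matlab
A  = [1 3 7; -1 4 4; 1 10 18];
b1 = [5; 2; 12];
b2 = [3; 6; 0];

% A*x = b is consistent when rank([A b]) equals rank(A)
consistent1 = rank([A b1]) == rank(A);   % true: an exact solution exists
consistent2 = rank([A b2]) == rank(A);   % false: only a least-squares solution
```

Note that rank relies on a numerical tolerance, so for badly scaled data this test is a guide rather than a proof.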
Overdetermined Systems
This example shows how overdetermined systems are often encountered in various kinds of curve
fitting to experimental data.
A quantity y is measured at several different values of time t to produce the following observations.
You can enter the data and view it in a table with the following statements.
t = [0 .3 .8 1.1 1.6 2.3]';
y = [.82 .72 .63 .60 .55 .50]';
B = table(t,y)
B=6×2 table
     t      y
    ___    ____
     0     0.82
    0.3    0.72
    0.8    0.63
    1.1     0.6
    1.6    0.55
    2.3     0.5
Try modeling the data with a decaying exponential function
$y(t) = c_1 + c_2 e^{-t}$.
The preceding equation says that the vector y should be approximated by a linear combination of two
other vectors. One is a constant vector containing all ones and the other is the vector with
components exp(-t). The unknown coefficients, c1 and c2, can be computed by doing a least-squares
fit, which minimizes the sum of the squares of the deviations of the data from the model. There are
six equations in two unknowns, represented by a 6-by-2 matrix.
E = [ones(size(t)) exp(-t)]
E = 6×2
1.0000 1.0000
1.0000 0.7408
1.0000 0.4493
1.0000 0.3329
1.0000 0.2019
1.0000 0.1003
c = E\y
c = 2×1
0.4760
0.3413
The following statements evaluate the model at regularly spaced increments in t, and then plot the
result together with the original data:
T = (0:0.1:2.5)';
Y = [ones(size(T)) exp(-T)]*c;
plot(T,Y,'-',t,y,'o')
E*c is not exactly equal to y, but the difference might well be less than measurement errors in the
original data.
A rectangular matrix A is rank deficient if it does not have linearly independent columns. If A is rank
deficient, then the least-squares solution to AX = B is not unique. A\B issues a warning if A is rank
deficient and produces a least-squares solution. You can use lsqminnorm to find the solution X that
has the minimum norm among all solutions.
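A minimal sketch of the rank-deficient case follows. The 3-by-2 matrix and right-hand side are illustrative assumptions; the second column is twice the first, so the least-squares solution is not unique.

```matlab
A = [1 2; 2 4; 3 6];      % second column is twice the first: rank 1
B = [1; 2; 4];

X  = lsqminnorm(A, B);    % minimum-norm least-squares solution
Xp = pinv(A)*B;           % the pseudoinverse gives the same solution

Z  = null(A);             % basis of the null space of A
X2 = X + Z;               % another least-squares solution with the same residual

sameResidual = abs(norm(A*X - B) - norm(A*X2 - B));
```

X and X2 fit the data equally well, but X has the smaller norm, which is the property that lsqminnorm selects for.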
Underdetermined Systems
This example shows how the solution to underdetermined systems is not unique. Underdetermined
linear systems involve more unknowns than equations. The matrix left division operation in MATLAB
finds a basic least-squares solution, which has at most m nonzero components for an m-by-n coefficient
matrix.
R = [6 8 7 3; 3 5 4 1]
rng(0);
b = randi(8,2,1)
R =
6 8 7 3
3 5 4 1
b =
7
8
The linear system Rp = b involves two equations in four unknowns. Since the coefficient matrix
contains small integers, it is appropriate to use the format command to display the solution in
rational format. The particular solution is obtained with
format rat
p = R\b
p =
0
17/7
0
-29/7
One of the nonzero components is p(2) because R(:,2) is the column of R with largest norm. The
other nonzero component is p(4) because R(:,4) dominates after R(:,2) is eliminated.
The complete general solution to the underdetermined system can be characterized by adding p to an
arbitrary linear combination of the null space vectors, which can be found using the null function
with an option requesting a rational basis.
Z = null(R,'r')
Z =
-1/2 -7/6
-1/2 1/2
1 0
0 1
It can be confirmed that R*Z is zero and that the residual R*x - b is small for any vector x, where
x = p + Z*q
Since the columns of Z are the null space vectors, the product Z*q is a linear combination of those
vectors:
$Zq = \begin{pmatrix} \vec{x}_1 & \vec{x}_2 \end{pmatrix} \begin{pmatrix} u \\ w \end{pmatrix} = u\,\vec{x}_1 + w\,\vec{x}_2$.
q = [-2; 1];
x = p + Z*q;
format short
norm(R*x - b)
ans =
2.6645e-15
When infinitely many solutions are available, the solution with minimum norm is of particular
interest. You can use lsqminnorm to compute the minimum-norm least-squares solution. This
solution has the smallest possible value for norm(p).
p = lsqminnorm(R,b)
p =
-207/137
365/137
79/137
-424/137
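The difference between the basic solution and the minimum-norm solution can be sketched as follows, using the same R and b as above. Both solve the system; they differ only in which of the infinitely many solutions is returned.

```matlab
R = [6 8 7 3; 3 5 4 1];
b = [7; 8];

pBasic = R\b;               % basic solution: at most two nonzero components
pMin   = lsqminnorm(R, b);  % minimum-norm solution

resBasic  = norm(R*pBasic - b);
resMin    = norm(R*pMin - b);
normBasic = norm(pBasic);
normMin   = norm(pMin);     % never larger than normBasic
```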
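Solving for Several Right-Hand Sides
Some problems are concerned with solving linear systems that have the same coefficient matrix A, but
different right-hand sides b. When the different values of b are available at the same time, you can
construct b as a matrix with several columns and solve all of the systems of equations at the same
time using a single backslash command: X = A\[b1 b2 b3].
However, sometimes the different values of b are not all available at the same time, which means you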
need to solve several systems of equations consecutively. When you solve one of these systems of
equations using slash (/) or backslash (\), the operator factorizes the coefficient matrix A and uses this
matrix decomposition to compute the solution. However, each subsequent time you solve a similar
system of equations with a different b, the operator computes the same decomposition of A, which is
a redundant computation.
The solution to this problem is to precompute the decomposition of A, and then reuse the factors to
solve for the different values of b. In practice, however, precomputing the decomposition in this
manner can be difficult since you need to know which decomposition to compute (LU, LDL, Cholesky,
and so on) as well as how to multiply the factors to solve the problem. For example, with LU
decomposition you need to solve two linear systems to solve the original system Ax = b:
[L,U] = lu(A);
x = U \ (L \ b);
Instead, the recommended method for solving linear systems with several consecutive right-hand
sides is to use decomposition objects. These objects enable you to leverage the performance
benefits of precomputing the matrix decomposition, but they do not require knowledge of how to use
the matrix factors. You can replace the previous LU decomposition with:
dA = decomposition(A,'lu');
x = dA\b;
If you are unsure which decomposition to use, decomposition(A) chooses the correct type based
on the properties of A, similar to what backslash does.
Here is a simple test of the possible performance benefits of this approach. The test solves the same
sparse linear system 100 times using both backslash (\) and decomposition.
n = 1e3;
A = sprand(n,n,0.2) + speye(n);
b = ones(n,1);
% Backslash solution
tic
for k = 1:100
x = A\b;
end
toc
% decomposition solution
tic
dA = decomposition(A);
for k = 1:100
x = dA\b;
end
toc
For this problem, the decomposition solution is much faster than using backslash alone, yet the
syntax remains simple.
Iterative Methods
If the coefficient matrix A is large and sparse, factorization methods are generally not efficient.
Iterative methods generate a series of approximate solutions. MATLAB provides several iterative
methods to handle large, sparse input matrices.
Function     Description
pcg          Preconditioned conjugate gradients method. This method is appropriate for a Hermitian
             positive definite coefficient matrix A.
bicg         BiConjugate Gradients Method
bicgstab     BiConjugate Gradients Stabilized Method
bicgstabl    BiCGStab(l) Method
cgs          Conjugate Gradients Squared Method
gmres        Generalized Minimum Residual Method
lsqr         LSQR Method
minres       Minimum Residual Method. This method is appropriate for a Hermitian coefficient
             matrix A.
qmr          Quasi-Minimal Residual Method
symmlq       Symmetric LQ Method
tfqmr        Transpose-Free QMR Method
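As a minimal sketch of calling one of these solvers, the following code applies pcg to a sparse, symmetric positive definite tridiagonal matrix. The test matrix, tolerance, and iteration limit are assumptions chosen for illustration, and no preconditioner is used.

```matlab
n = 100;
e = ones(n, 1);
A = spdiags([-e 2*e -e], -1:1, n, n);   % sparse, symmetric positive definite
xTrue = ones(n, 1);
b = A*xTrue;                            % right-hand side with a known solution

tol = 1e-10;      % requested relative residual
maxit = n;        % upper limit on the number of iterations
[x, flag, relres, iter] = pcg(A, b, tol, maxit);
% flag == 0 means pcg converged to the requested tolerance within maxit iterations

relErr = norm(x - xTrue)/norm(xTrue);
```

Requesting the flag output suppresses the printed convergence message, so always inspect flag before trusting x.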
Multithreaded Computation
MATLAB supports multithreaded computation for a number of linear algebra and element-wise
numerical functions. These functions automatically execute on multiple threads. For a function or
expression to execute faster on multiple CPUs, a number of conditions must be true:
1 The function performs operations that easily partition into sections that execute concurrently.
These sections must be able to execute with little communication between processes. They
should require few sequential operations.
2 The data size is large enough so that any advantages of concurrent execution outweigh the time
required to partition the data and manage separate execution threads. For example, most
functions speed up only when the array contains several thousand elements or more.
3 The operation is not memory-bound; processing time is not dominated by memory access time. As
a general rule, complicated functions speed up more than simple functions.
inv, lscov, linsolve, and mldivide show significant increase in speed on large double-precision
arrays (on order of 10,000 elements or more) when multithreading is enabled.
Factorizations
In this section...
“Introduction”
“Cholesky Factorization”
“LU Factorization”
“QR Factorization”
“Using Multithreaded Computation for Factorization”
Introduction
All three of the matrix factorizations discussed in this section make use of triangular matrices, where
all the elements either above or below the diagonal are zero. Systems of linear equations involving
triangular matrices are easily and quickly solved using either forward or back substitution.
Cholesky Factorization
The Cholesky factorization expresses a symmetric matrix as the product of a triangular matrix and its
transpose
A = R′R,
where R is an upper triangular matrix.
Not all symmetric matrices can be factored in this way; the matrices that have such a factorization
are said to be positive definite. This implies that all the diagonal elements of A are positive and that
the off-diagonal elements are “not too big.” The Pascal matrices provide an interesting example.
Throughout this chapter, the example matrix A has been the 3-by-3 Pascal matrix. Temporarily switch
to the 6-by-6:
A = pascal(6)
A =
1 1 1 1 1 1
1 2 3 4 5 6
1 3 6 10 15 21
1 4 10 20 35 56
1 5 15 35 70 126
1 6 21 56 126 252
The elements of A are binomial coefficients. Each element is the sum of its north and west neighbors.
The Cholesky factorization is
R = chol(A)
R =
1 1 1 1 1 1
0 1 2 3 4 5
0 0 1 3 6 10
0 0 0 1 4 10
0 0 0 0 1 5
0 0 0 0 0 1
The elements are again binomial coefficients. The fact that R'*R is equal to A demonstrates an
identity involving sums of products of binomial coefficients.
Note The Cholesky factorization also applies to complex matrices. Any complex matrix that has a
Cholesky factorization satisfies
A′ = A
and is said to be Hermitian positive definite.
The Cholesky factorization allows the linear system
Ax = b
to be replaced by
R′Rx = b.
Because the backslash operator recognizes triangular systems, this can be solved in the MATLAB
environment quickly with
x = R\(R'\b)
If A is n-by-n, the computational complexity of chol(A) is $O(n^3)$, but the complexity of the subsequent
backslash solutions is only $O(n^2)$.
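The two triangular solves can be sketched as follows. The right-hand side b and the small indefinite matrix at the end are illustrative assumptions; the second output of chol is used to detect a matrix that is not positive definite without raising an error.

```matlab
A = pascal(6);
b = (1:6)';

R = chol(A);          % upper triangular factor with A = R'*R
x = R\(R'\b);         % forward substitution, then back substitution

factorError = norm(R'*R - A);
residual    = norm(A*x - b);

% With two outputs, chol reports failure through a flag instead of an error
[~, notPD] = chol([1 2; 2 1]);   % indefinite matrix, so notPD is nonzero
```

Once R is available, each additional right-hand side costs only the two triangular solves.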
LU Factorization
LU factorization, or Gaussian elimination, expresses any square matrix A as the product of a
permutation of a lower triangular matrix and an upper triangular matrix
A = LU,
where L is a permutation of a lower triangular matrix with ones on its diagonal and U is an upper
triangular matrix.
The permutations are necessary for both theoretical and computational reasons. The matrix
$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
cannot be expressed as the product of triangular matrices without interchanging its two rows.
Although the matrix
$\begin{pmatrix} \varepsilon & 1 \\ 1 & 0 \end{pmatrix}$
can be expressed as the product of triangular matrices, when ε is small, the elements in the factors
are large and magnify errors, so even though the permutations are not strictly necessary, they are
desirable. Partial pivoting ensures that the elements of L are bounded by one in magnitude and that
the elements of U are not much larger than those of A.
For example:
[L,U] = lu(B)
L =
1.0000 0 0
0.3750 0.5441 1.0000
0.5000 1.0000 0
U =
8.0000 1.0000 6.0000
0 8.5000 -1.0000
0 0 5.2941
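The LU factorization of A allows the linear system
A*x = b
to be solved quickly with
x = U\(L\b)
Determinants and inverses are computed from the LU factorization using
det(A) = det(L)*det(U)
and
inv(A) = inv(U)*inv(L)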
You can also compute the determinants using det(A) = prod(diag(U)), though the signs of the
determinants might be reversed.
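The sign ambiguity can be resolved by requesting the permutation explicitly. The following sketch uses the three-output form of lu, in which L is truly lower triangular and P records the row interchanges; the right-hand side b is an illustrative assumption.

```matlab
A = magic(3);
b = [1; 2; 3];

[L, U, P] = lu(A);          % P*A = L*U with L lower triangular, unit diagonal
x = U\(L\(P*b));            % apply the same row permutation to b

d = det(P)*prod(diag(U));   % det(P) is +1 or -1 and fixes the sign
residual = norm(A*x - b);
```

Because det(L) is 1, the determinant of A is the product of the diagonal of U, multiplied by the sign of the permutation.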
QR Factorization
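An orthogonal matrix, or a matrix with orthonormal columns, is a real matrix whose columns all have
unit length and are perpendicular to each other. If Q is orthogonal, then
$Q^T Q = I$,
where I is the identity matrix. The simplest orthogonal matrices are two-dimensional coordinate
rotations:
$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$.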
For complex matrices, the corresponding term is unitary. Orthogonal and unitary matrices are
desirable for numerical computation because they preserve length, preserve angles, and do not
magnify errors.
The orthogonal, or QR, factorization expresses any rectangular matrix as the product of an
orthogonal or unitary matrix and an upper triangular matrix. A column permutation might also be
involved:
A = QR
or
AP = QR,
where Q is orthogonal or unitary, R is upper triangular, and P is a permutation.
There are four variants of the QR factorization: full or economy size, and with or without column
permutation.
Overdetermined linear systems involve a rectangular matrix with more rows than columns, that is m-
by-n with m > n. The full-size QR factorization produces a square, m-by-m orthogonal Q and a
rectangular m-by-n upper triangular R:
Q =
R =
In many cases, the last m – n columns of Q are not needed because they are multiplied by the zeros in
the bottom portion of R. So the economy-size QR factorization produces a rectangular, m-by-n Q with
orthonormal columns and a square n-by-n upper triangular R. For the 5-by-4 example, this is not
much of a saving, but for larger, highly rectangular matrices, the savings in both time and memory
can be quite important:
[Q,R] = qr(C,0)
Q =
R =
In contrast to the LU factorization, the QR factorization does not require any pivoting or
permutations. But an optional column permutation, triggered by the presence of a third output
argument, is useful for detecting singularity or rank deficiency. At each step of the factorization, the
column of the remaining unfactored matrix with largest norm is used as the basis for that step. This
ensures that the diagonal elements of R occur in decreasing order and that any linear dependence
among the columns is almost certainly revealed by examining these elements. For the small
example given here, the second column of C has a larger norm than the first, so the two columns are
exchanged:
[Q,R,P] = qr(C)
Q =
-0.3522 0.8398 -0.4131
-0.7044 -0.5285 -0.4739
-0.6163 0.1241 0.7777
R =
-11.3578 -8.2762
0 7.2460
0 0
P =
0 1
1 0
When the economy-size and column permutations are combined, the third output argument is a
permutation vector, rather than a permutation matrix:
[Q,R,p] = qr(C,0)
Q =
-0.3522 0.8398
-0.7044 -0.5285
-0.6163 0.1241
R =
-11.3578 -8.2762
0 7.2460
p =
2 1
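The QR factorization transforms an overdetermined linear system into an equivalent triangular
system. The expression
norm(A*x - b)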
equals
norm(Q*R*x - b)
Multiplication by orthogonal matrices preserves the Euclidean norm, so this expression is also equal
to
norm(R*x - y)
where y = Q'*b. Since the last m-n rows of R are zero, this expression breaks into two pieces:
norm(R(1:n,1:n)*x - y(1:n))
and
norm(y(n+1:m))
When A has full rank, it is possible to solve for x so that the first of these expressions is zero. Then
the second expression gives the norm of the residual. When A does not have full rank, the triangular
structure of R makes it possible to find a basic solution to the least-squares problem.
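The splitting of the residual can be sketched with a small full-rank example. The data are the curve-fitting observations from the overdetermined-systems example; the names E, z, c, and resNorm are chosen here for illustration (z plays the role of y = Q'*b in the text).

```matlab
t = [0 .3 .8 1.1 1.6 2.3]';
y = [.82 .72 .63 .60 .55 .50]';
E = [ones(size(t)) exp(-t)];      % 6-by-2 overdetermined system E*c = y
[m, n] = size(E);

[Q, R] = qr(E);                   % full size: Q is 6-by-6, R is 6-by-2
z = Q'*y;
c = R(1:n,1:n)\z(1:n);            % makes the first piece of the residual zero
resNorm = norm(z(n+1:m));         % second piece: the norm of the residual

cBackslash = E\y;                 % reference least-squares solution
```

The coefficients agree with those from backslash, and resNorm equals norm(E*c - y) up to round-off.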
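Using Multithreaded Computation for Factorization
MATLAB supports multithreaded computation for a number of linear algebra and element-wise
numerical functions. These functions automatically execute on multiple threads. For a function or
expression to execute faster on multiple CPUs, a number of conditions must be true:
1 The function performs operations that easily partition into sections that execute concurrently.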
These sections must be able to execute with little communication between processes. They
should require few sequential operations.
2 The data size is large enough so that any advantages of concurrent execution outweigh the time
required to partition the data and manage separate execution threads. For example, most
functions speed up only when the array contains several thousand elements or more.
3 The operation is not memory-bound; processing time is not dominated by memory access time. As
a general rule, complicated functions speed up more than simple functions.
lu and qr show significant increase in speed on large double-precision arrays (on order of 10,000
elements).
Powers and Exponentials
This topic shows how to compute matrix powers and exponentials using a variety of methods.
If A is a square matrix and p is a positive integer, then A^p effectively multiplies A by itself p-1 times.
For example:
A = [1 1 1
1 2 3
1 3 6];
A^2
ans = 3×3
3 6 10
6 14 25
10 25 46
If A is square and nonsingular, then A^(-p) effectively multiplies inv(A) by itself p-1 times.
A^(-3)
ans = 3×3
MATLAB® calculates inv(A) and A^(-1) with the same algorithm, so the results are exactly the
same. Both inv(A) and A^(-1) produce warnings if the matrix is close to being singular.
isequal(inv(A),A^(-1))
ans = logical
1
Fractional powers, such as A^(2/3), are also permitted. The results using fractional powers depend
on the distribution of the eigenvalues of the matrix.
A^(2/3)
ans = 3×3
Element-by-Element Powers
The .^ operator calculates element-by-element powers. For example, to square each element in a
matrix you can use A.^2.
A.^2
ans = 3×3
1 1 1
1 4 9
1 9 36
Square Roots
The sqrt function is a convenient way to calculate the square root of each element in a matrix. An
alternate way to do this is A.^(1/2).
sqrt(A)
ans = 3×3
For other roots, you can use nthroot. For example, calculate A.^(1/3).
nthroot(A,3)
ans = 3×3
These element-wise roots differ from the matrix square root, which calculates a second matrix B such
that A = BB. The function sqrtm(A) computes A^(1/2) by a more accurate algorithm. The m in
sqrtm distinguishes this function from sqrt(A), which, like A.^(1/2), does its job element-by-
element.
B = sqrtm(A)
B = 3×3
B^2
ans = 3×3
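The distinction between the two kinds of square root can be sketched numerically. The error measures below are introduced only for this illustration.

```matlab
A = pascal(3);
B = sqrtm(A);        % matrix square root: B*B equals A up to round-off
S = sqrt(A);         % element-wise square root: S.*S equals A

matrixRootError  = norm(B*B - A);
elementRootError = norm(S.*S - A);
productDiffers   = norm(S*S - A);   % clearly nonzero: S is not a matrix root
```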
Scalar Bases
In addition to raising a matrix to a power, you also can raise a scalar to the power of a matrix.
2^A
ans = 3×3
When you raise a scalar to the power of a matrix, MATLAB uses the eigenvalues and eigenvectors of
the matrix to calculate the matrix power. If [V,D] = eig(A), then $2^A = V\,2^D\,V^{-1}$.
[V,D] = eig(A);
V*2^D*V^(-1)
ans = 3×3
Matrix Exponentials
The matrix exponential is a special case of raising a scalar to a matrix power. The base for a matrix
exponential is Euler's number e = exp(1).
e = exp(1);
e^A
ans = 3×3
10^3 ×
The expm function is a more convenient way to calculate matrix exponentials.
expm(A)
ans = 3×3
10^3 ×
The matrix exponential can be calculated in a number of ways. See “Matrix Exponentials” for more
information.
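The eigenvalue formulas for scalar bases and for the matrix exponential can be checked with a short sketch. It assumes a symmetric matrix, for which the eigenvector matrix is well conditioned; for nearly defective matrices the eigenvalue route can be inaccurate.

```matlab
A = pascal(3);
[V, D] = eig(A);

P1 = 2^A;                        % scalar base raised to a matrix power
P2 = V*diag(2.^diag(D))/V;       % same result built from the eigendecomposition

E1 = expm(A);                    % matrix exponential
E2 = V*diag(exp(diag(D)))/V;     % eigendecomposition form of e^A

relErrPower = norm(P1 - P2)/norm(P1);
relErrExp   = norm(E1 - E2)/norm(E1);
```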
The MATLAB functions log1p and expm1 calculate $\log(1+x)$ and $e^x - 1$ accurately for very small
values of x. For example, if you try to add a number smaller than machine precision to 1, then the
result gets rounded to 1.
log(1+eps/2)
ans = 0
log1p(eps/2)
ans = 1.1102e-16
exp(eps/2)-1
ans = 0
expm1(eps/2)
ans = 1.1102e-16
Eigenvalues
In this section...
“Eigenvalue Decomposition”
“Multiple Eigenvalues”
“Schur Decomposition”
Eigenvalue Decomposition
An eigenvalue and eigenvector of a square matrix A are, respectively, a scalar λ and a nonzero vector
υ that satisfy
Aυ = λυ.
With the eigenvalues on the diagonal of a diagonal matrix Λ and the corresponding eigenvectors
forming the columns of a matrix V, you have
AV = VΛ.
If V is nonsingular, this becomes the eigenvalue decomposition
$A = V \Lambda V^{-1}$.
A good example is the coefficient matrix of the differential equation dx/dt = Ax:
A =
0 -6 -1
6 2 -16
-5 20 -10
The solution to this equation is expressed in terms of the matrix exponential $x(t) = e^{tA}x(0)$. The
statement
lambda = eig(A)
produces a column vector containing the eigenvalues of A. For this matrix, the eigenvalues are
complex:
lambda =
-3.0710
-2.4645+17.6008i
-2.4645-17.6008i
The real part of each of the eigenvalues is negative, so $e^{\lambda t}$ approaches zero as t increases. The
nonzero imaginary part of two of the eigenvalues, ±ω, contributes the oscillatory component, sin(ωt),
to the solution of the differential equation.
With two output arguments, eig computes the eigenvectors and stores the eigenvalues in a diagonal
matrix:
[V,D] = eig(A)
V =
D =
-3.0710 0 0
0 -2.4645+17.6008i 0
0 0 -2.4645-17.6008i
The first eigenvector is real and the other two vectors are complex conjugates of each other. All three
vectors are normalized to have Euclidean length, norm(v,2), equal to one.
The matrix V*D*inv(V), which can be written more succinctly as V*D/V, is within round-off error of
A. And, inv(V)*A*V, or V\A*V, is within round-off error of D.
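These properties can be verified with a short sketch that uses the same coefficient matrix. The error measures and the variable stable are names introduced here for illustration.

```matlab
A = [0 -6 -1; 6 2 -16; -5 20 -10];
[V, D] = eig(A);

defErr     = norm(A*V - V*D);        % each column v satisfies A*v = lambda*v
reconErr   = norm(V*D/V - A);        % A = V*D*inv(V) without forming inv(V)
colLengths = vecnorm(V);             % every eigenvector has unit 2-norm
stable     = all(real(diag(D)) < 0); % all eigenvalues lie in the left half-plane
```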
Multiple Eigenvalues
Some matrices do not have an eigenvector decomposition. These matrices are not diagonalizable. For
example:
A = [ 1 -2 1
0 1 4
0 0 3 ]
[V,D] = eig(A)
produces
V =
D =
1 0 0
0 1 0
0 0 3
There is a double eigenvalue at λ = 1. The first and second columns of V are the same. For this
matrix, a full set of linearly independent eigenvectors does not exist.
Schur Decomposition
Many advanced matrix computations do not require eigenvalue decompositions. They are based,
instead, on the Schur decomposition
A = USU′,
where U is an orthogonal matrix and S is a block upper-triangular matrix with 1-by-1 and 2-by-2
blocks on the diagonal. The eigenvalues are revealed by the diagonal elements and blocks of S, while
the columns of U provide an orthogonal basis, which has much better numerical properties than a set
of eigenvectors.
For example, compare the eigenvalue and Schur decompositions of this defective matrix:
A = [ 6 12 19
-9 -20 -33
4 9 15 ];
[V,D] = eig(A)
V =
D =
[U,S] = schur(A)
U =
S =
The matrix A is defective since it does not have a full set of linearly independent eigenvectors (the
second and third columns of V are the same). Since not all columns of V are linearly independent, V
has a large condition number of about 1e8. However, schur is able to calculate three different
basis vectors in U. Since U is orthogonal, cond(U) = 1.
The matrix S has the real eigenvalue as the first entry on the diagonal and the repeated eigenvalue
represented by the lower right 2-by-2 block. The eigenvalues of the 2-by-2 block are also eigenvalues
of A:
eig(S(2:3,2:3))
ans =
1.0000 + 0.0000i
1.0000 - 0.0000i
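The numerical advantages of the Schur form can be checked with a brief sketch that uses the same defective matrix. The error measures are names chosen for this illustration.

```matlab
A = [6 12 19; -9 -20 -33; 4 9 15];
[U, S] = schur(A);

orthErr  = norm(U'*U - eye(3));    % U has orthonormal columns
reconErr = norm(U*S*U' - A);       % A = U*S*U' up to round-off
condU    = cond(U);                % equals 1 up to round-off
```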
Singular Values
A singular value and corresponding singular vectors of a rectangular matrix A are, respectively, a
scalar σ and a pair of vectors u and v that satisfy
$Av = \sigma u$
$A^H u = \sigma v$,
where $A^H$ is the Hermitian transpose of A. The singular vectors u and v are typically scaled to have a
norm of 1. Also, if u and v are singular vectors of A, then -u and -v are singular vectors of A as well.
The singular values σ are always real and nonnegative, even if A is complex. With the singular values
in a diagonal matrix Σ and the corresponding singular vectors forming the columns of two orthogonal
matrices U and V, you obtain the equations
$AV = U\Sigma$
$A^H U = V\Sigma$.
Since U and V are unitary matrices, multiplying the first equation by $V^H$ on the right yields the
singular value decomposition equation
$A = U \Sigma V^H$.
• m-by-m matrix U
• m-by-n matrix Σ
• n-by-n matrix V
In other words, U and V are both square, and Σ is the same size as A. If A has many more rows than
columns (m > n), then the resulting m-by-m matrix U is large. However, most of the columns in U are
multiplied by zeros in Σ. In this situation, the economy-sized decomposition saves both time and
storage by producing an m-by-n U, an n-by-n Σ, and the same V.
The eigenvalue decomposition is the appropriate tool for analyzing a matrix when it represents a
mapping from a vector space into itself, as it does for an ordinary differential equation. However, the
singular value decomposition is the appropriate tool for analyzing a mapping from one vector space
into another vector space, possibly with a different dimension. Most systems of simultaneous linear
equations fall into this second category.
If A is square, symmetric, and positive definite, then its eigenvalue and singular value decompositions
are the same. But, as A departs from symmetry and positive definiteness, the difference between the
two decompositions increases. In particular, the singular value decomposition of a real matrix is
always real, but the eigenvalue decomposition of a real, nonsymmetric matrix might be complex.
A = [9 4
6 8
2 7];
[U,S,V] = svd(A)
U =
S =
14.9359 0
0 5.1883
0 0
V =
-0.6925 0.7214
-0.7214 -0.6925
You can verify that U*S*V' is equal to A to within round-off error. For this small problem, the
economy-size decomposition is only slightly smaller.
[U,S,V] = svd(A,"econ")
U =
-0.6105 0.7174
-0.6646 -0.2336
-0.4308 -0.6563
S =
14.9359 0
0 5.1883
V =
-0.6925 0.7214
-0.7214 -0.6925
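As a quick check of the size and round-off claims above, the following sketch compares the full and economy-size decompositions of the same matrix. The names Ue, Se, Ve, errFull, and errEcon are assumptions chosen for this illustration.

```matlab
A = [9 4
     6 8
     2 7];

[U,S,V]    = svd(A);          % full: U is 3-by-3, S is 3-by-2
[Ue,Se,Ve] = svd(A,"econ");   % economy: Ue is 3-by-2, Se is 2-by-2

sizeU  = size(U)
sizeUe = size(Ue)

% Both factorizations reproduce A to within round-off error
errFull = norm(U*S*V' - A)
errEcon = norm(Ue*Se*Ve' - A)

% The singular values are the square roots of the eigenvalues of A'*A
sFromEig = sqrt(sort(eig(A'*A),"descend"))
```

The economy-size form drops only the columns of U that are multiplied by the zero rows of Σ, so the product is unchanged.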
Function    Usage
pagesvd     Use pagesvd to perform singular value decompositions on the pages of a
            multidimensional array. This is an efficient way to perform SVD on a large
            collection of matrices that all have the same size.
For example, consider a collection of three 2-by-2 matrices. Concatenate the matrices into a 2-by-2-
by-3 array with the cat function.
A = [0 -1; 1 0];
B = [-1 0; 0 -1];
C = [0 1; -1 0];
X = cat(3,A,B,C);
[U,S,V] = pagesvd(X);
For each page of X, there are corresponding pages in the outputs U, S, and V. For example, the matrix
A is on the first page of X, and its decomposition is given by U(:,:,1)*S(:,:,1)*V(:,:,1)'.
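One way to confirm the page-wise correspondence is to rebuild every page from its factors. This is a small sketch under the assumption that a plain loop is acceptable for three pages; maxErr is a name introduced here.

```matlab
A = [0 -1; 1 0];
B = [-1 0; 0 -1];
C = [0 1; -1 0];
X = cat(3,A,B,C);
[U,S,V] = pagesvd(X);

% Rebuild each page from its own factors and track the worst error
maxErr = 0;
for k = 1:size(X,3)
    pageErr = norm(U(:,:,k)*S(:,:,k)*V(:,:,k)' - X(:,:,k));
    maxErr = max(maxErr,pageErr);
end
maxErr
```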
In cases where only a subset of the singular values and singular vectors are required, the svds and
svdsketch functions are preferred over svd.
Function    Usage
svds        Use svds to calculate a rank-k approximation of the SVD. You can specify
            whether the subset of singular values should be the largest, the smallest, or
            the closest to a specific number. svds generally calculates the best possible
            rank-k approximation.
svdsketch   Use svdsketch to calculate a partial SVD of the input matrix satisfying a
            specified tolerance. While svds requires that you specify the rank, svdsketch
            adaptively determines the rank of the matrix sketch based on the specified
            tolerance. The rank-k approximation that svdsketch ultimately uses satisfies
            the tolerance, but unlike svds, it is not guaranteed to be the best one
            possible.
For example, consider a 1000-by-1000 random sparse matrix with a density of about 30%.
n = 1000;
A = sprand(n,n,0.3);
S = svds(A)
S =
130.2184
16.4358
16.4119
16.3688
16.3242
16.2838
S = svds(A,6,"smallest")
S =
0.0740
0.0574
0.0388
0.0282
0.0131
0.0066
For smaller matrices that can fit in memory as a full matrix, full(A), using svd(full(A)) might
still be quicker than svds or svdsketch. However, for truly large and sparse matrices, using svds
or svdsketch becomes necessary.
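For a matrix small enough to convert to full storage, the two approaches can be cross-checked. The sketch below assumes a 200-by-200 test matrix and a fixed seed purely for illustration; sTop, sAll, and maxDiff are hypothetical names.

```matlab
rng(0,"twister");            % fixed seed so the comparison is repeatable
n = 200;
A = sprand(n,n,0.3);

sTop = svds(A,6);            % six largest singular values, iterative method
sAll = svd(full(A));         % all singular values, dense method

% The six largest values from both methods should agree closely
maxDiff = max(abs(sTop - sAll(1:6)))
```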
LAPACK in MATLAB
LAPACK (Linear Algebra Package) is a library of routines that provides fast, robust algorithms for
numerical linear algebra and matrix computations. Linear algebra functions and matrix operations in
MATLAB are built on LAPACK, and they continue to benefit from the performance and accuracy of its
routines.
A Brief History
MATLAB started its life in the late 1970s as an interactive calculator built on top of LINPACK and
EISPACK, which were the state-of-the-art Fortran subroutine libraries for matrix computation of the
time. For many years MATLAB used translations to C of about a dozen Fortran subroutines from
LINPACK and EISPACK.
In the year 2000, MATLAB migrated to using LAPACK, which is the modern replacement for LINPACK
and EISPACK. It is a large, multi-author, Fortran library for numerical linear algebra. LAPACK was
originally intended for use on supercomputers because of its ability to operate on several columns of
a matrix at a time. The speed of LAPACK routines is closely connected to the speed of the Basic
Linear Algebra Subprograms (BLAS). The BLAS version is typically hardware-specific and highly
optimized.
See Also
More About
• “Call LAPACK and BLAS Functions”
External Websites
• MATLAB Incorporates LAPACK
Matrix Exponentials
This example shows three of the 19 ways to compute the exponential of a matrix.
Moler, Cleve, and Charles Van Loan. “Nineteen Dubious Ways to Compute the Exponential of a
Matrix, Twenty-Five Years Later.” SIAM Review 45, no. 1 (January 2003): 3–49. https://doi.org/10.1137/S00361445024180.
A = [0 1 2; 0.5 0 1; 2 1 0]
A = 3×3
0 1.0000 2.0000
0.5000 0 1.0000
2.0000 1.0000 0
Asave = A;
Golub, Gene H. and Charles Van Loan. Matrix Computations, 3rd edition. Baltimore, MD: Johns
Hopkins University Press, 1996.
E1 = E
E1 = 3×3
expmdemo2 uses the classic definition for the matrix exponential given by the power series
$$e^A = \sum_{k=0}^{\infty} \frac{1}{k!} A^k.$$
$A^0$ is the identity matrix with the same dimensions as A. As a practical numerical method, this
approach is slow and inaccurate if norm(A) is too large.
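The power series can be summed directly until adding another term no longer changes the result. The following is a minimal sketch of that idea, written for illustration and not copied from the shipped expmdemo2 source; the names E, F, k, and relErr are assumptions.

```matlab
A = [0 1 2; 0.5 0 1; 2 1 0];

E = zeros(size(A));      % running sum of the series
F = eye(size(A));        % current term A^k/k!, starting with k = 0
k = 1;
while norm(E+F-E,1) > 0  % stop when the term no longer changes the sum
    E = E + F;
    F = A*F/k;           % next term: A^k/k! = A*(A^(k-1)/(k-1)!)/k
    k = k + 1;
end

% Compare against the built-in expm function
relErr = norm(E - expm(A),1)/norm(expm(A),1)
```

For this well-behaved matrix the series converges quickly. When norm(A) is large, the intermediate terms grow before they decay, and cancellation destroys accuracy.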
A = Asave;
E2 = E
E2 = 3×3
expmdemo3 assumes that the matrix has a full set of eigenvectors V such that $A = VDV^{-1}$. The matrix
exponential can be calculated by exponentiating the diagonal matrix of eigenvalues:
$$e^A = V e^D V^{-1}.$$
As a practical numerical method, the accuracy is determined by the condition of the eigenvector
matrix.
A = Asave;
[V,D] = eig(A);
E = V * diag(exp(diag(D))) / V;
E3 = E
E3 = 3×3
Compare Results
For the matrix in this example, all three methods work equally well.
E = expm(Asave);
err1 = E - E1
err1 = 3×3
10^-14 ×
err2 = E - E2
err2 = 3×3
10^-14 ×
0 0 -0.1776
-0.0444 0 -0.0888
0.1776 0 0.0888
err3 = E - E3
err3 = 3×3
10^-13 ×
For some matrices the terms in the Taylor series become very large before they go to zero.
Consequently, expmdemo2 fails.
A = [-147 72; -192 93];
E1 = expmdemo1(A)
E1 = 2×2
-0.0996 0.0747
-0.1991 0.1494
E2 = expmdemo2(A)
E2 = 2×2
10^6 ×
-1.1985 -0.5908
-2.7438 -2.0442
E3 = expmdemo3(A)
E3 = 2×2
-0.0996 0.0747
-0.1991 0.1494
Here is a matrix that does not have a full set of eigenvectors. Consequently, expmdemo3 fails.
A = [-1 1; 0 -1];
E1 = expmdemo1(A)
E1 = 2×2
0.3679 0.3679
0 0.3679
E2 = expmdemo2(A)
E2 = 2×2
0.3679 0.3679
0 0.3679
E3 = expmdemo3(A)
E3 = 2×2
0.3679 0
0 0.3679
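The failure of the eigenvector method for this matrix can be anticipated from the condition number of the eigenvector matrix. The sketch below is an illustrative check, not part of the original example; condV and errE3 are names introduced here.

```matlab
A = [-1 1; 0 -1];            % defective: one eigenvector for the double eigenvalue -1

[V,D] = eig(A);
condV = cond(V)              % very large, so dividing by V is unreliable

% Same formula as the eigenvector method shown earlier
E3 = V*diag(exp(diag(D)))/V;
errE3 = norm(E3 - expm(A))   % error relative to the built-in expm
```

A large cond(V) is a warning sign that results computed through V and its inverse should not be trusted.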
See Also
expm
This example shows an interesting graphical approach for discovering whether $e^\pi$ is greater than $\pi^e$.
The question is: which is greater, $e^\pi$ or $\pi^e$? The easy way to find out is to type it directly at the
MATLAB® command prompt. But another way to analyze the situation is to ask a more general
question: what is the shape of the function $z(x,y) = x^y - y^x$?
Here is a plot of z.
% Define the mesh
x = 0:0.16:5;
y = 0:0.16:5;
[xx,yy] = meshgrid(x,y);
% The plot
zz = xx.^yy-yy.^xx;
h = surf(x,y,zz);
h.EdgeColor = [0.7 0.7 0.7];
view(20,50);
colormap(hsv);
title('$z = x^y-y^x$','Interpreter','latex')
xlabel('x')
ylabel('y')
hold on
Graphical Comparison of Exponential Functions
The solution of the equation $x^y - y^x = 0$ has a very interesting shape, and our original question is not
easily solved by inspection. Here is a plot of the (x, y) values that yield z = 0.
c = contourc(x,y,zz,[0 0]);
list1Len = c(2,1);
xContour = [c(1,2:1+list1Len) NaN c(1,3+list1Len:size(c,2))];
yContour = [c(2,2:1+list1Len) NaN c(2,3+list1Len:size(c,2))];
% Note that the NaN above prevents the end of the first contour line from being
% connected to the beginning of the second line
line(xContour,yContour,'Color','k');
Some combinations of x and y along the black curve are integers. This next plot is of the integer
solutions to the equation $x^y - y^x = 0$. Notice that $2^4 = 4^2$ is the only integer solution where x ≠ y.
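The integer solutions on the plotted range can be found by brute force. This sketch assumes the grid 1 through 5, which matches the plotted axes but is otherwise an arbitrary choice; isSol, offDiag, and pairs are hypothetical names.

```matlab
% Test every integer pair on a small grid; powers this small are exact in double
[X,Y] = meshgrid(1:5);
isSol = (X.^Y == Y.^X);

% Discard the trivial solutions on the line x = y
offDiag = isSol & (X ~= Y);
pairs = [X(offDiag) Y(offDiag)]
```

On this grid, the only off-diagonal pairs are (2,4) and (4,2).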
Finally, plot the points $(\pi, e)$ and $(e, \pi)$ on the surface. The result shows that $e^\pi$ is indeed larger than
$\pi^e$ (though not by much).
e = exp(1);
plot([e pi],[pi e],'r.','MarkerSize',25);
plot([e pi],[pi e],'y.','MarkerSize',10);
text(e,3.3,'(e,pi)','Color','k', ...
'HorizontalAlignment','left','VerticalAlignment','bottom');
text(3.3,e,'(pi,e)','Color','k','HorizontalAlignment','left',...
'VerticalAlignment','bottom');
hold off;
e = exp(1);
e^pi
ans = 23.1407
pi^e
ans = 22.4592
See Also
exp | pi
This example shows basic techniques and functions for working with matrices in the MATLAB®
language.
a = [1 2 3 4 6 4 3 4 5]
a = 1×9
1 2 3 4 6 4 3 4 5
Now let's add 2 to each element of our vector, a, and store the result in a new vector.
b = a + 2
b = 1×9
3 4 5 6 8 6 5 6 7
Creating graphs in MATLAB is as easy as one command. Let's plot the result of our vector addition
with grid lines.
plot(b)
grid on
Basic Matrix Operations
MATLAB can make other graph types as well, with axis labels.
bar(b)
xlabel('Sample #')
ylabel('Pounds')
MATLAB can use symbols in plots as well. Here is an example using stars to mark the points.
MATLAB offers a variety of other symbols and line types.
plot(b,'*')
axis([0 10 0 10])
Creating a matrix is as easy as making a vector, using semicolons (;) to separate the rows of a matrix.
A = [1 2 0; 2 5 -1; 4 10 -1]
A = 3×3
1 2 0
2 5 -1
4 10 -1
B = A'
B = 3×3
1 2 4
2 5 10
0 -1 -1
Note again that MATLAB doesn't require you to deal with matrices as a collection of numbers.
MATLAB knows when you are dealing with matrices and adjusts your calculations accordingly.
C = A * B
C = 3×3
5 12 24
12 30 59
24 59 117
Instead of doing a matrix multiply, we can multiply the corresponding elements of two matrices or
vectors using the .* operator.
C = A .* B
C = 3×3
1 4 0
4 25 -10
0 -10 1
Let's use the matrix A to solve the equation, A*x = b. We do this by using the \ (backslash) operator.
b = [1;3;5]
b = 3×1
1
3
5
x = A\b
x = 3×1
1
0
-1
r = A*x - b
r = 3×1
0
0
0
MATLAB has functions for nearly every type of common matrix calculation.
eig(A)
ans = 3×1
3.7321
0.2679
1.0000
svd(A)
ans = 3×1
12.3171
0.5149
0.1577
The poly function generates a vector containing the coefficients of the characteristic polynomial,
det(λI − A).
p = round(poly(A))
p = 1×4
1 -5 5 -1
We can easily find the roots of a polynomial using the roots function.
roots(p)
ans = 3×1
3.7321
1.0000
0.2679
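Since the roots of the characteristic polynomial are the eigenvalues of the matrix, the two results can be compared directly. This is a small illustrative check; maxDiff is a name introduced here, and sorting is used only because the two functions return the values in different orders.

```matlab
A = [1 2 0; 2 5 -1; 4 10 -1];
p = round(poly(A));          % coefficients of det(lambda*I - A)

% Sort both sets of values before comparing them
maxDiff = max(abs(sort(roots(p)) - sort(eig(A))))
```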
q = conv(p,p)
q = 1×7
r = conv(p,q)
r = 1×10
plot(r);
At any time, we can get a listing of the variables we have stored in memory using the who or whos
command.
whos
  Name      Size      Bytes  Class
  A         3x3          72  double
  B         3x3          72  double
  C         3x3          72  double
  a         1x9          72  double
  ans       3x1          24  double
  b         3x1          24  double
  p         1x4          32  double
  q         1x7          56  double
  r         1x10         80  double
  x         3x1          24  double
You can get the value of a particular variable by typing its name.
A
A = 3×3
1 2 0
2 5 -1
4 10 -1
You can have more than one statement on a single line by separating each statement with commas or
semicolons.
If you don't assign a variable to store the result of an operation, the result is stored in a temporary
variable called ans.
sqrt(-1)
As you can see, MATLAB easily deals with complex numbers in its calculations.
See Also
More About
• “Array vs. Matrix Operations”
This topic explains how to use the chol and eig functions to determine whether a matrix is
symmetric positive definite (a symmetric matrix with all positive eigenvalues).
The most efficient method to check whether a matrix is symmetric positive definite is to attempt to
use chol on the matrix. If the factorization fails, then the matrix is not symmetric positive definite.
Create a square symmetric matrix and use a try/catch block to test whether chol(A) succeeds.
A = [1 -1 0; -1 5 0; 0 0 7]
A = 3×3
1 -1 0
-1 5 0
0 0 7
try
    chol(A)
    disp('Matrix is symmetric positive definite.')
catch ME
    disp('Matrix is not symmetric positive definite')
end
ans = 3×3
1.0000 -1.0000 0
0 2.0000 0
0 0 2.6458
The drawback of this method is that it cannot be extended to also check whether the matrix is
symmetric positive semi-definite (where the eigenvalues can be positive or zero).
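As an alternative to try/catch, chol can return a second output that reports whether the factorization succeeded. The sketch below is one possible way to wrap the test; because chol reads only one triangle of its input, the symmetry check with issymmetric is added here as an assumption about what a complete test should include. The matrix B and the names flag, isSPD, flagB, and isSPDB are hypothetical.

```matlab
A = [1 -1 0; -1 5 0; 0 0 7];
[~,flag] = chol(A);                       % flag is 0 when the factorization succeeds
isSPD = issymmetric(A) && (flag == 0)

B = [1 2; 2 1];                           % symmetric but indefinite (eigenvalues -1 and 3)
[~,flagB] = chol(B);                      % nonzero flag: factorization breaks down
isSPDB = issymmetric(B) && (flagB == 0)
```

This form avoids using an error as control flow and makes the result available as a logical value.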
While it is less efficient to use eig to calculate all of the eigenvalues and check their values, this
method is more flexible since you can also use it to check whether a matrix is symmetric positive
semi-definite. Still, for small matrices, the difference in computation time between the two methods is
negligible when you only need to check whether a matrix is symmetric positive definite.
This method requires that you use issymmetric to check whether the matrix is symmetric before
performing the test (if the matrix is not symmetric, then there is no need to calculate the
eigenvalues).
tf = issymmetric(A)
tf = logical
1
d = eig(A)
Determine Whether Matrix Is Symmetric Positive Definite
d = 3×1
0.7639
5.2361
7.0000
isposdef = all(d > 0)
isposdef = logical
1
You can extend this method to check whether a matrix is symmetric positive semi-definite with the
command all(d >= 0).
Numerical Considerations
The methods outlined here might give different results for the same matrix. Since both calculations
involve round-off errors, each algorithm checks the definiteness of a matrix that is slightly different
from A. In practice, the use of a tolerance is a more robust comparison method, since eigenvalues can
be numerically zero within machine precision and be slightly positive or slightly negative.
For example, if a matrix has an eigenvalue on the order of eps, then using the comparison isposdef
= all(d > 0) returns true, even though the eigenvalue is numerically zero and the matrix is
better classified as symmetric positive semi-definite.
To perform the comparison using a tolerance, you can use the modified commands
tf = issymmetric(A)
d = eig(A)
isposdef = all(d > tol)
issemidef = all(d > -tol)
The tolerance defines a radius around zero, and any eigenvalues within that radius are treated as
zeros. A good choice for the tolerance in most cases is length(d)*eps(max(d)), which takes into
account the magnitude of the largest eigenvalue.
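The tolerance-based test can be sketched as follows, using the suggested tolerance length(d)*eps(max(d)). The second matrix B is a hypothetical example chosen because it has an exact zero eigenvalue; the variable names are likewise assumptions.

```matlab
A = [1 -1 0; -1 5 0; 0 0 7];   % positive definite
B = [1 1; 1 1];                % eigenvalues 0 and 2: semi-definite only

dA = eig(A);
tolA = length(dA)*eps(max(dA));
isposdefA = issymmetric(A) && all(dA > tolA)

dB = eig(B);
tolB = length(dB)*eps(max(dB));
isposdefB  = issymmetric(B) && all(dB > tolB)    % zero eigenvalue fails the strict test
issemidefB = issymmetric(B) && all(dB > -tolB)   % but passes the semi-definite test
```

With the tolerance in place, an eigenvalue that is computed as a tiny positive or negative number is classified the same way as an exact zero.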
See Also
chol | eig
More About
• “Factorizations” on page 2-20
This example shows how to use svdsketch to compress an image. svdsketch uses a low-rank
matrix approximation to preserve important features of the image, while filtering out less important
features. As the tolerance used with svdsketch increases in magnitude, more features are filtered
out, changing the level of detail in the image.
Load Image
Load the image street1.jpg, which is a picture of a city street. The image is stored as a 3-D uint8
array (one page per color channel), so convert it to a 2-D grayscale matrix. View the image with an
annotation of the original matrix rank.
A = imread('street1.jpg');
A = rgb2gray(A);
imshow(A)
title(['Original (',sprintf('Rank %d)',rank(double(A)))])
Image Compression with Low-Rank SVD
Compress Image
Use svdsketch to calculate a low-rank matrix that approximates A within a tolerance of 1e-2. Form
the low-rank matrix by multiplying the SVD factors returned by svdsketch, convert the result to
uint8, and view the resulting image.
[U1,S1,V1] = svdsketch(double(A),1e-2);
Anew1 = uint8(U1*S1*V1');
imshow(Anew1)
title(sprintf('Rank %d approximation',size(S1,1)))
svdsketch produces a rank 288 approximation, which results in some minor graininess in some of
the boundary lines of the image.
Now, compress the image a second time using a tolerance of 1e-1. As the magnitude of the tolerance
increases, the rank of the approximation produced by svdsketch generally decreases.
[U2,S2,V2] = svdsketch(double(A),1e-1);
Anew2 = uint8(U2*S2*V2');
imshow(Anew2)
title(sprintf('Rank %d approximation',size(S2,1)))
This time, svdsketch produces a rank 48 approximation. Most of the major aspects of the image are
still visible, but the additional compression increases the blurriness.
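The relationship between tolerance and rank can also be explored without an image file. The sketch below uses a synthetic, nearly low-rank matrix as a stand-in for the image; the matrix construction, the seed, and the names k and relErr are assumptions made for this illustration.

```matlab
rng(0,"twister");
% Synthetic test matrix: rank 10 plus a small amount of noise
A = randn(200,10)*randn(10,150) + 1e-3*randn(200,150);

tol = 1e-1;
[U,S,V] = svdsketch(A,tol);
k = size(S,1)                                    % rank chosen by svdsketch

% Relative error of the sketch in the Frobenius norm
relErr = norm(U*S*V' - A,"fro")/norm(A,"fro")
```

The relative Frobenius-norm error is the quantity that svdsketch compares against the tolerance, so relErr is expected to be no larger than tol here.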
svdsketch adaptively determines what rank to use for the matrix sketch based on the specified
tolerance. However, you can use the MaxSubspaceDimension name-value pair to specify the
maximum subspace size that should be used to form the matrix sketch. This option can produce
matrices that do not satisfy the tolerance, since the subspace you specify might be too small. In these
cases, svdsketch returns a matrix sketch with the maximum allowed subspace size.
Use svdsketch with a tolerance of 1e-1 and a maximum subspace size of 15. Specify a fourth
output to return the relative approximation error.
[U3,S3,V3,apxErr] = svdsketch(double(A),1e-1,'MaxSubspaceDimension',15);
Compare the relative approximation error of the result with the specified tolerance. apxErr contains
one element since svdsketch only needs one iteration to compute the answer.
apxErr <= 1e-1
ans = logical
0
The result indicates that the matrix sketch does not satisfy the specified tolerance.
Compare Results
imshow(Anew2)
title(sprintf('Rank %d approximation',size(S2,1)))
nexttile
imshow(Anew3)
title(sprintf('Rank %d approximation',size(S3,1)))
See Also
svd | svds | svdsketch
More About
• “Singular Values” on page 2-33
• “Resample Image with Gridded Interpolation” on page 8-53
3
Random Numbers
When you first start a MATLAB session or call rng("default"), MATLAB initializes the random
number generator using the default algorithm and seed. Starting in R2023b, you can set the default
algorithm and seed in MATLAB preferences. If you do not change these preferences, then rng uses
the factory value of "twister" for the Mersenne Twister generator with seed 0, as in previous
releases. For more information, see “Default Settings for Random Number Generator” and
“Reproducibility for Random Number Generator”.
• If you want to avoid repeating the same random number arrays when MATLAB restarts, then use
rng("shuffle") before calling rand, randn, randi, or randperm. This command ensures that
you do not repeat a result from a previous MATLAB session.
• If you want to repeat a result that you got at the start of a MATLAB session without restarting, you
can reset the generator to the startup state using rng("default").
When you execute rng("default"), the ensuing random number commands return results that
match the output of another MATLAB session that uses the same default algorithm and seed for the
random number generator.
rng("default");
A = rand(2,2)
A =
0.8147 0.1270
0.9058 0.9134
The values in A match the output of rand(2,2) whenever you restart MATLAB using the same
preferences for the random number generator.
Alternatively, you can repeat a result by specifying the seed and algorithm used for the random
number generator. For example, rng(1,"twister") sets the seed to 1 and the generator algorithm to Mersenne Twister.
rng(1,"twister");
A = rand(2,2)
A =
0.4170 0.0001
0.7203 0.3023
Next, in a new MATLAB session, repeat the same commands to reproduce the array A.
rng(1,"twister");
A = rand(2,2)
Why Do Random Numbers Repeat After Startup?
A =
0.4170 0.0001
0.7203 0.3023
See Also
rng
The rand, randi, randn, and randperm functions are the primary functions for creating arrays of
random numbers. The rng function allows you to control the seed and algorithm that generates
random numbers.
rng("default")
r1 = rand(1000,1);
All the values in r1 are in the open interval (0,1). A histogram of these values is roughly flat, which
indicates a fairly uniform sampling of numbers.
The randi function returns double integer values drawn from a discrete uniform distribution. For
example, create a 1000-by-1 column vector containing integer values drawn from a discrete uniform
distribution.
r2 = randi(10,1000,1);
All the values in r2 are in the closed interval [1, 10]. A histogram of these values is roughly flat, which
indicates a fairly uniform sampling of integers between 1 and 10.
The randn function returns arrays of real floating-point numbers that are drawn from a standard
normal distribution. For example, create a 1000-by-1 column vector containing numbers drawn from
a standard normal distribution.
r3 = randn(1000,1);
A histogram of r3 looks like a roughly normal distribution whose mean is 0 and standard deviation is
1.
You can use the randperm function to create a double array of random integer values that have no
repeated values. For example, create a 1-by-5 array containing integers randomly selected from the
range [1, 15].
r4 = randperm(15,5);
Create Arrays of Random Numbers
Unlike randi, which can return an array containing repeated values, the array returned by
randperm has no repeated values.
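The difference between the two functions can be checked by counting unique values. This is a brief sketch; the names nUniquePerm and nUniqueRandi are introduced here for illustration.

```matlab
rng("default")

r4 = randperm(15,5);            % 5 distinct integers from 1 to 15
nUniquePerm = numel(unique(r4))

r5 = randi(15,1,5);             % 5 integers from 1 to 15, repeats allowed
nUniqueRandi = numel(unique(r5))
```

The count for randperm always equals the number of requested values, while the count for randi can be smaller.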
Successive calls to any of these functions return different results. This behavior is useful for creating
several different arrays of random values.
Use the rng function to set the seed and generator used by the rand, randi, randn, and randperm
functions.
For example, rng(0,"twister") sets the seed to 0 and the generator algorithm to Mersenne
Twister. To avoid repetition of random number arrays when MATLAB restarts, see “Why Do Random
Numbers Repeat After Startup?” on page 3-2.
For more information about controlling the random number generator's state to repeat calculations
using the same random numbers, or to guarantee that different random numbers are used in
repeated calculations, see “Controlling Random Number Generation” on page 3-38.
Starting in R2023b, you can set the default algorithm and seed in MATLAB preferences. If you do not
change these preferences, then rng uses the factory value of "twister" for the Mersenne Twister
generator with seed 0, as in previous releases. For more information, see “Default Settings for
Random Number Generator” and “Reproducibility for Random Number Generator”.
rng("default")
A = rand(1,5);
class(A)
ans = 'double'
rng("default")
B = rand(1,5,"double");
class(B)
ans = 'double'
isequal(A,B)
ans =
1
rng("default")
A = rand(1,5,"single");
class(A)
ans = 'single'
The values are the same as if you had cast the double precision values from the previous example.
The random stream that the functions draw from advances the same way regardless of what class of
values is returned.
A,B
A =
0.8147 0.9058 0.1270 0.9134 0.6324
B =
0.8147 0.9058 0.1270 0.9134 0.6324
A = randi([1 10],1,5,"double");
class(A)
ans = 'double'
B = randi([1 10],1,5,"uint8");
class(B)
ans = 'uint8'
See Also
rng | rand | randi | randn | randperm
Related Examples
• “Controlling Random Number Generation” on page 3-38
• “Generate Random Numbers That Are Repeatable” on page 3-13
• “Generate Random Numbers That Are Different” on page 3-16
By default, rand returns normalized values (between 0 and 1) that are drawn from a uniform
distribution. To change the range of the distribution to a new range, (a, b), multiply each value by the
width of the new range, (b – a) and then shift every value by a.
First, initialize the random number generator to make the results in this example repeatable.
rng(0,'twister');
Create a vector of 1000 random values. Use the rand function to draw the values from a uniform
distribution in the open interval, (50,100).
a = 50;
b = 100;
r = (b-a).*rand(1000,1) + a;
r_range = [min(r) max(r)]
r_range =
50.0261 99.9746
Note Some combinations of a and b make it theoretically possible for your results to include a or b.
In practice, this is extremely unlikely to happen.
See Also
rng
Related Examples
• “Random Numbers from Normal Distribution with Specific Mean and Variance” on page 3-10
• “Random Numbers Within a Sphere” on page 3-11
• “Create Arrays of Random Numbers” on page 3-4
Random Integers
This example shows how to create an array of random integer values that are drawn from a discrete
uniform distribution on the set of numbers –10, –9,...,9, 10.
The simplest randi syntax returns double-precision integer values between 1 and a specified value,
imax. To specify a different range, use the imin and imax arguments together.
First, initialize the random number generator to make the results in this example repeatable.
rng(0,'twister');
Create a 1-by-1000 array of random integer values drawn from a discrete uniform distribution on the
set of numbers -10, -9,...,9, 10. Use the syntax, randi([imin imax],m,n).
r = randi([-10 10],1,1000);
[rmin,rmax] = bounds(r)
rmin = -10
rmax = 10
See Also
rng | randi
Related Examples
• “Create Arrays of Random Numbers” on page 3-4
The randn function returns a sample of random numbers from a normal distribution with mean 0 and
variance 1. The general theory of random variables states that if x is a random variable whose mean
is $\mu_x$ and variance is $\sigma_x^2$, then the random variable y, defined by $y = ax + b$, where a and b are
constants, has mean $\mu_y = a\mu_x + b$ and variance $\sigma_y^2 = a^2\sigma_x^2$. You can apply this concept to get a sample
of normally distributed random numbers with mean 500 and variance 25.
First, initialize the random number generator to make the results in this example repeatable.
rng(0,'twister');
Create a vector of 1000 random values drawn from a normal distribution with a mean of 500 and a
standard deviation of 5.
a = 5;
b = 500;
y = a.*randn(1000,1) + b;
stats = 1×3
The mean and variance are not 500 and 25 exactly because they are calculated from a sampling of
the distribution.
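The sample statistics can be collected in one vector to compare them against the target values. This is a minimal sketch; the layout of stats as [mean, standard deviation, variance] is an assumption made for this illustration.

```matlab
rng(0,"twister");
a = 5;                           % target standard deviation
b = 500;                         % target mean
y = a.*randn(1000,1) + b;

% Sample mean, standard deviation, and variance
stats = [mean(y) std(y) var(y)]
```

With 1000 samples, the values are expected to land near 500, 5, and 25 without matching them exactly.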
See Also
rng | randn
Related Examples
• “Random Numbers Within a Specific Range” on page 3-8
• “Random Numbers Within a Sphere” on page 3-11
• “Create Arrays of Random Numbers” on page 3-4
Random Numbers Within a Sphere
One way to create points inside a sphere is to specify them in spherical coordinates. Then you can
convert them to Cartesian coordinates to plot them.
First, initialize the random number generator to make the results in this example repeatable.
rng(0,'twister')
Calculate an elevation angle for each point in the sphere. These values are in the open interval,
( − π/2, π/2), but are not uniformly distributed.
rvals = 2*rand(1000,1)-1;
elevation = asin(rvals);
Create an azimuth angle for each point in the sphere. These values are uniformly distributed in the
open interval, (0, 2π).
azimuth = 2*pi*rand(1000,1);
Create a radius value for each point in the sphere. These values are in the open interval, (0, 3), but
are not uniformly distributed.
radii = 3*(rand(1000,1).^(1/3));
[x,y,z] = sph2cart(azimuth,elevation,radii);
figure
plot3(x,y,z,'.')
axis equal
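The cube root in the radius calculation is what makes the points uniform in volume: the volume inside radius r grows like r^3, so the fraction of points with radius below r should be (r/3)^3. The sketch below checks this for the radius that encloses half of the volume; n, maxDist, rHalf, and fracInner are names assumed for the illustration.

```matlab
rng(0,"twister")
n = 1000;
elevation = asin(2*rand(n,1)-1);
azimuth = 2*pi*rand(n,1);
radii = 3*(rand(n,1).^(1/3));
[x,y,z] = sph2cart(azimuth,elevation,radii);

% Every point lies inside the sphere of radius 3
maxDist = max(sqrt(x.^2 + y.^2 + z.^2))

% The radius 3/2^(1/3) encloses half of the volume, so about half of the
% points should fall inside it
rHalf = 3/2^(1/3);
fracInner = mean(radii < rHalf)
```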
If you want to place random numbers on the surface of the sphere, then specify a constant radius
value to be the last input argument to sph2cart. In this case, the value is 3.
[x,y,z] = sph2cart(azimuth,elevation,3);
References
[1] Knuth, D. The Art of Computer Programming. Vol. 2, 3rd ed. Reading, MA: Addison-Wesley
Longman, 1998, pp. 134–136.
See Also
rng | rand | sph2cart
Related Examples
• “Random Numbers Within a Specific Range” on page 3-8
• “Random Numbers from Normal Distribution with Specific Mean and Variance” on page 3-10
• “Create Arrays of Random Numbers” on page 3-4
Generate Random Numbers That Are Repeatable
First, initialize the random number generator to make the results in this example repeatable. For
example, the following code sets the seed to 1 and the generator algorithm to Mersenne Twister.
rng(1,"twister");
A = rand(3,3)
A = rand(3,3)
The first call to rand changed the state of the generator, so the second result is different.
Now, reinitialize the generator using the same seed and algorithm as before. Then reproduce the first
matrix, A.
rng(1,"twister");
A = rand(3,3)
A =
Set the seed and generator type together when you want to:
• Ensure that the behavior of code you write today returns the same results when you run that code
in a future MATLAB release.
• Ensure that the behavior of code you wrote in a previous MATLAB release returns the same
results using the current release.
• Repeat random numbers in your code after running someone else’s random number code.
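The reseeding pattern described above can be verified with isequal. This is a short sketch; A1, A2, and A3 are names introduced to keep the three arrays separate.

```matlab
rng(1,"twister");
A1 = rand(3,3);          % first array after seeding
A2 = rand(3,3);          % generator state has advanced, so this differs

rng(1,"twister");        % same seed and algorithm as before
A3 = rand(3,3);

sameAsFirst  = isequal(A1,A3)
sameAsSecond = isequal(A1,A2)
```

Specifying both the seed and the algorithm makes the result independent of the default generator settings.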
When you first start a MATLAB session or call rng("default"), MATLAB initializes the random
number generator using the default algorithm and seed. Starting in R2023b, you can set the default
algorithm and seed in MATLAB preferences. If you do not change these preferences, then rng uses
the factory value of "twister" for the Mersenne Twister generator with seed 0, as in previous
releases. For more information, see “Default Settings for Random Number Generator” and
“Reproducibility for Random Number Generator”.
First, initialize the random number generator to make the results in this example repeatable.
rng(1,"twister");
A = randi(10,3,3)
A = 3×3
5 4 2
8 2 4
1 1 4
The first call to randi changed the state of the generator. Save the generator settings after the first
call to randi in a structure s.
s = rng;
A = randi(10,3,3)
A = 3×3
6 3 7
5 9 5
7 1 6
Now, return the generator to the previous state stored in s and reproduce the second array A.
rng(s);
A = randi(10,3,3)
A = 3×3
6 3 7
5 9 5
7 1 6
See Also
rng
Related Examples
• “Generate Random Numbers That Are Different” on page 3-16
• “Controlling Random Number Generation” on page 3-38
All the random number functions, rand, randn, randi, and randperm, draw values from a shared
random number generator. Every time you start MATLAB, the generator resets itself to the same
state using the default algorithm and seed. Therefore, a command such as rand(2,2) returns the
same result any time you execute it immediately following startup in different MATLAB sessions that
have the same preferences for the random number generator. Also, any script or function that calls
the random number functions returns the same result whenever you restart.
When you first start a MATLAB session or call rng("default"), MATLAB initializes the random
number generator using the default algorithm and seed. Starting in R2023b, you can set the default
algorithm and seed in MATLAB preferences. If you do not change these preferences, then rng uses
the factory value of "twister" for the Mersenne Twister generator with seed 0, as in previous
releases. For more information, see “Default Settings for Random Number Generator” and
“Reproducibility for Random Number Generator”.
One way to get different random numbers is to initialize the generator using a different seed every
time. Doing so ensures that you don’t repeat results from a previous session.
Execute the rng("shuffle") command once in your MATLAB session before calling any of the
random number functions.
rng("shuffle")
You can execute this command in a MATLAB Command Window, or you can add it to your startup file,
which is a special script that MATLAB executes every time you restart.
Each time you call rng("shuffle"), it reseeds the generator using a different seed based on the
current time.
Note Frequent reseeding of the generator does not improve the statistical properties of the output
and does not make the output more random in any real sense. Reseeding can be useful when you
restart MATLAB or before you run a large calculation involving random numbers. However, reseeding
the generator too frequently within a session is not a good idea because the statistical properties of
your random numbers can be adversely affected.
Alternatively, specify different seeds explicitly in different MATLAB sessions using the default
algorithm. For example, generate random numbers in one MATLAB session.
rng(1);
A = rand(2,2);
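The listing for the second session does not appear here. A minimal sketch of what it could look like, assuming a seed of 2 (any seed other than 1 would serve) and the variable name B used in the text that follows:

```matlab
% Second MATLAB session (sketch). The seed value 2 is an assumption,
% chosen only because it differs from the seed 1 used for A.
rng(2);
B = rand(2,2);
```

Running both listings in a single session gives the same arrays as running them in two sessions, because each call to rng fully reinitializes the generator.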
Arrays A and B are different because the generator is initialized with a different seed before each call
to the rand function.
To generate multiple independent streams that are guaranteed to not overlap, and for which tests
that demonstrate independence of the values between streams have been carried out, you can use
RandStream.create. For more information about generating multiple streams, see “Multiple
Streams” on page 3-30.
See Also
rng
Related Examples
• “Generate Random Numbers That Are Repeatable” on page 3-13
• “Controlling Random Number Generation” on page 3-38
• “Startup Options in MATLAB Startup File”
Managing the Global Stream Using RandStream
The rand, randn, randi, and randperm functions draw random numbers from an underlying
random number stream, called the global stream. The global stream is a RandStream object. A
simple way to control the global stream is to use the rng function. For more comprehensive control,
the RandStream class enables you to create a separate stream from the global stream, get a handle
to the global stream, and control random number generation.
Use rng to initialize the random number generator. Set the generator seed to 0 and the generator
algorithm to Mersenne Twister. Save the generator settings.
rng(0,'twister')
s = rng
Create a 1-by-6 row vector of uniformly distributed random values between 0 and 1.
x = rand(1,6)
x = 1×6
0.8147 0.9058 0.1270 0.9134 0.6324 0.0975
Use RandStream.getGlobalStream to return a handle to the global stream, that is, the current
global stream that rand generates random numbers from. If you use
RandStream.getGlobalStream to get a handle to the global stream, you can see the changes you
made to the global stream using rng.
globalStream = RandStream.getGlobalStream
globalStream =
mt19937ar random stream (current global stream)
Seed: 0
NormalTransform: Ziggurat
Change the generator seed and algorithm, and create a new random row vector. Show the current
global stream that rand generates random numbers from.
rng(1,'philox')
xnew = rand(1,6)
xnew = 1×6
globalStream = RandStream.getGlobalStream
globalStream =
philox4x32_10 random stream (current global stream)
Seed: 1
NormalTransform: Inversion
Next, restore the original generator settings and create a random vector. The result matches the
original row vector x created with the initial generator.
rng(s)
xold = rand(1,6)
xold = 1×6
0.8147 0.9058 0.1270 0.9134 0.6324 0.0975
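The save, switch, and restore steps above can be checked programmatically rather than by eye. The following is a minimal sketch that reuses the variable names from this example; the two logical flags at the end are illustrative names, not part of any API.

```matlab
% Sketch: confirm that restoring saved settings reproduces earlier values.
rng(0,'twister')        % seed 0, Mersenne Twister
s = rng;                % save the generator settings
x = rand(1,6);

rng(1,'philox')         % switch to a different algorithm and seed
xnew = rand(1,6);

rng(s)                  % restore the saved settings
xold = rand(1,6);

sameAsOriginal = isequal(x,xold)        % the text above states that these match
differsAfterSwitch = ~isequal(x,xnew)   % a different generator gives other values
```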
By default, random number generation functions, such as rand, use the global random number
stream. To specify a different stream, create another RandStream object. Pass it as the first input
argument to rand. For example, create a 1-by-6 vector of random numbers using the SIMD-oriented
Fast Mersenne Twister.
myStream = RandStream('dsfmt19937')
myStream =
dsfmt19937 random stream
Seed: 0
NormalTransform: Ziggurat
r = rand(myStream,1,6)
r = 1×6
When you call the rand function with myStream as the first input argument, it draws numbers from
myStream and does not affect the results of the global stream.
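One way to see that drawing from myStream leaves the global stream untouched is to compare global draws made with and without an intervening call on myStream. This is a sketch under the assumption that nothing else in the session calls the random number functions in between; the names a, b, and globalUnaffected are illustrative.

```matlab
rng(0,'twister')
a = rand(1,3);                         % global stream only

rng(0,'twister')                       % same starting point again
myStream = RandStream('dsfmt19937');   % separate stream
r = rand(myStream,1,6);                % draws from myStream
b = rand(1,3);                         % global stream again

globalUnaffected = isequal(a,b)        % true if myStream did not advance the global stream
```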
To make myStream the global stream, use the RandStream.setGlobalStream
static method.
RandStream.setGlobalStream(myStream)
globalStream = RandStream.getGlobalStream
globalStream =
dsfmt19937 random stream (current global stream)
Seed: 0
NormalTransform: Ziggurat
In many cases, the rng function is all you need to control the global stream, but the RandStream
class allows control over some advanced features, such as the choice of algorithm used for normal
random values.
For example, create a RandStream object and specify the transformation algorithm to generate
normally distributed pseudorandom values when using randn. Generate normally distributed
pseudorandom values using the Polar transformation algorithm, instead of the default Ziggurat
transformation algorithm.
myStream = RandStream('mt19937ar','NormalTransform','Polar')
myStream =
mt19937ar random stream
Seed: 0
NormalTransform: Polar
Set myStream as the global stream. Create 6 random numbers with normal distribution from the
global stream.
RandStream.setGlobalStream(myStream)
randn(1,6)
ans = 1×6
See Also
rng | RandStream
Related Examples
• “Multiple Streams” on page 3-30
• “Creating and Controlling a Random Number Stream” on page 3-21
• “Controlling Random Number Generation” on page 3-38
• “Create Arrays of Random Numbers” on page 3-4
Creating and Controlling a Random Number Stream
The RandStream class allows you to create a random number stream. This is useful for several
reasons:
• You can generate random values without affecting the state of the global stream.
• You can separate sources of randomness in a simulation.
• You can use a generator that is configured differently than the one MATLAB software uses at
startup.
With a RandStream object, you can create your own stream, set the writable properties, and use the
stream to generate random numbers. You can control the stream you create the same way you control
the global stream. You can even replace the global stream with the stream you create.
myStream = RandStream('mlfg6331_64');
rand(myStream,1,5)
ans =
0.6986 0.7413 0.4239 0.6914 0.7255
The random stream myStream acts separately from the global stream. If you call the rand, randn,
randi, and randperm functions with myStream as the first argument, they draw from the stream
you created. If you call rand, randn, randi, and randperm without myStream, they draw from the
global stream.
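As an illustration of separating sources of randomness in a simulation, the sketch below gives each source its own stream. The queueing scenario, the variable names, and the choice of the mrg32k3a generator are assumptions made for the example; RandStream.create is used because it returns streams designed to be independent.

```matlab
% Sketch: one stream per source of randomness (hypothetical queueing model).
[arrivalStream,serviceStream] = RandStream.create('mrg32k3a','NumStreams',2);

interarrival = -log(rand(arrivalStream,1,5));   % exponential variates with mean 1
service      = 0.5 + rand(serviceStream,1,5);   % uniform on (0.5, 1.5)

% Changing how many service times are drawn does not change the arrivals:
extraService = rand(serviceStream,1,100);
reset(arrivalStream)
interarrivalAgain = -log(rand(arrivalStream,1,5));
arrivalsReproduced = isequal(interarrival,interarrivalAgain)
```

Keeping the sources apart means that a change to one part of the model, such as drawing more service times, leaves the random inputs of the other parts unchanged, which makes comparisons between model variants easier.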
You can make myStream the global stream using the RandStream.setGlobalStream method.
RandStream.setGlobalStream(myStream)
RandStream.getGlobalStream
ans =
RandStream.getGlobalStream == myStream
ans =
1
Substreams
You can use substreams to get different results that are statistically independent from a stream.
Unlike seeds, where the locations along the sequence of random numbers are not exactly known, the
spacing between substreams is known, so any chance of overlap can be eliminated. In short,
substreams are a more-controlled way to do many of the same things that seeds have traditionally
been used for. Substreams are also a more lightweight solution than parallel streams.
Substreams provide a quick and easy way to ensure that you get different results from the same code
at different times. To use the Substream property of a RandStream object, create a stream using a
generator that supports substreams. For a list of generator algorithms that support substreams and
their properties, see the table in the next section. For example, generate several random numbers in
a loop.
myStream = RandStream('mlfg6331_64');
RandStream.setGlobalStream(myStream)
for i = 1:5
myStream.Substream = i;
z = rand(1,i)
end
z =
0.6986
z =
0.9230 0.2489
z =
0.0261 0.2530 0.0737
z =
0.3220 0.7405 0.1983 0.1052
z =
0.2067 0.2417 0.9777 0.5970 0.4187
In another loop, you can generate random values that are independent from the first set of 5
iterations.
for i = 6:10
myStream.Substream = i;
z = rand(1,11-i)
end
z =
0.2650 0.8229 0.2479 0.0247 0.4581
z =
0.3963 0.7445 0.7734 0.9113
z =
0.2758 0.3662 0.7979
z =
0.6814 0.5150
z =
0.5247
Substreams are useful in serial computation. Substreams can recreate all or part of a simulation by
returning to a particular checkpoint in the stream. For example, you can return to the 6th substream in
the loop. The result contains the same values as the 6th output above.
myStream.Substream = 6;
z = rand(1,5)
z =
0.2650 0.8229 0.2479 0.0247 0.4581
The generators mcg16807, shr3cong, and swb2712 provide for backwards compatibility with earlier
versions of MATLAB. mt19937ar and dsfmt19937 are designed primarily for sequential
applications. The remaining generators provide explicit support for parallel random number
generation.
Depending on the application, some generators might be faster or return values with more precision.
All pseudorandom number generators are based on deterministic algorithms, and every generator will
fail a sufficiently specific statistical test for randomness. One way to check the results of a Monte Carlo
simulation is to rerun the simulation with two or more different generator algorithms, and the choice
of generators in MATLAB provides you with the means to do that. Although it is unlikely that your
results will differ by more than the Monte Carlo sampling error when using different generators,
there are examples in the literature where this kind of validation has turned up flaws in a particular
generator algorithm. (See [13] for an example.)
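The cross-check described above can be sketched as follows. The Monte Carlo problem (estimating pi from points in the unit square), the sample size, and the pair of generators are assumptions chosen for illustration only.

```matlab
% Sketch: rerun one Monte Carlo estimate with two generator algorithms.
N = 1e5;                                  % sample size (assumed)
gens = {'mt19937ar','mrg32k3a'};          % any two supported generators
est = zeros(1,numel(gens));

for k = 1:numel(gens)
    s = RandStream(gens{k},'Seed',0);     % private stream for this run
    p = rand(s,N,2);                      % N points in the unit square
    est(k) = 4*mean(sum(p.^2,2) <= 1);    % fraction inside the quarter circle
end

% Monte Carlo standard error of one estimate, for scale
seMC = 4*sqrt((pi/4)*(1 - pi/4)/N);
disp(est)
disp(abs(diff(est))/seMC)                 % difference in units of the standard error
```

A difference of a few standard errors or less is what sampling error alone would produce; a much larger difference would be a reason to look more closely at the generators or the simulation.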
Generator Algorithms
mt19937ar
The Mersenne Twister, as described in [11], has period $2^{19937} - 1$, and each U(0,1) value is
created using two 32-bit integers. The possible values are multiples of $2^{-53}$ in the interval (0, 1).
This generator does not support multiple streams or substreams. The randn algorithm used by
default for mt19937ar streams is the ziggurat algorithm [7], but with the mt19937ar generator
underneath.
Note This generator is identical to the one used by the rand function beginning in MATLAB
Version 7, activated by rand('twister',s).
dsfmt19937
The double precision SIMD-oriented Fast Mersenne Twister, as described in [12], is a faster
implementation of the Mersenne Twister algorithm. The period is $2^{19937} - 1$ and the possible
values are multiples of $2^{-52}$ in the interval (0, 1). The generator produces double precision values
in [1, 2) natively, which are transformed to create U(0,1) values. This generator does not support
multiple streams or substreams.
mcg16807
A 32-bit multiplicative congruential generator, as described in [14], with multiplier $a = 7^5$ and
modulus $m = 2^{31} - 1$. This generator has a period of $2^{31} - 2$ and does not support multiple streams or
substreams. Each U(0,1) value is created using a single 32-bit integer from the generator; the
possible values are all multiples of $(2^{31} - 1)^{-1}$ strictly within the interval (0, 1). For mcg16807
streams, the default algorithm used by randn is the polar algorithm (described in [1]).
Note This generator is identical to the one used beginning in MATLAB Version 4 by both the
rand and randn functions, activated using rand('seed',s) or randn('seed',s).
mlfg6331_64
A 64-bit multiplicative lagged Fibonacci generator, as described in [10], with lags l = 63, k = 31.
This generator is similar to the MLFG implemented in the SPRNG package. It has a period of
approximately $2^{124}$. It supports up to $2^{61}$ parallel streams, via parameterization, and $2^{51}$
substreams, each of length $2^{72}$. Each U(0,1) value is created using one 64-bit integer from the
generator; the possible values are all multiples of $2^{-64}$ strictly within the interval (0, 1). The
randn algorithm used by default for mlfg6331_64 streams is the ziggurat algorithm [7], but
with the mlfg6331_64 generator underneath.
mrg32k3a
A 32-bit combined multiple recursive generator, as described in [2]. This generator is similar to
the CMRG implemented in the RngStreams package in C. It has a period of $2^{191}$ and supports up
to $2^{63}$ parallel streams through sequence splitting, each of length $2^{127}$. It also supports $2^{51}$
substreams, each of length $2^{76}$. Each U(0,1) value is created using two 32-bit integers from the
generator; the possible values are multiples of $2^{-53}$ strictly within the interval (0, 1). The randn
algorithm used by default for mrg32k3a streams is the ziggurat algorithm [7], but with the
mrg32k3a generator underneath.
philox4x32_10
A 4x32 generator with 10 rounds as described in [15]. This generator uses a Feistel network and
integer multiplication. The generator is specifically designed for high performance in highly
parallel systems such as GPUs. It has a period of $2^{193}$ ($2^{64}$ streams of length $2^{129}$).
threefry4x64_20
A 4x64 generator with 20 rounds as described in [15]. This generator is a non-cryptographic
adaptation of the Threefish block cipher from the Skein Hash Function. It has a period of $2^{514}$
($2^{256}$ streams of length $2^{258}$).
shr3cong
Marsaglia's SHR3 shift-register generator summed with a linear congruential generator with
multiplier a = 69069, addend b = 1234567, and modulus $2^{32}$. SHR3 is a 3-shift-register
generator defined as $u = u(I + L^{13})(I + R^{17})(I + L^{5})$, where I is the identity operator, L is the left
shift operator, and R is the right shift operator. The combined generator (the SHR3 part is
described in [7]) has a period of approximately $2^{64}$. This generator does not support multiple
streams or substreams. Each U(0,1) value is created using one 32-bit integer from the generator;
the possible values are all multiples of $2^{-32}$ strictly within the interval (0, 1). The randn
algorithm used by default for shr3cong streams is the earlier form of the ziggurat algorithm [9],
but with the shr3cong generator underneath. This generator is identical to the one used by the
randn function beginning in MATLAB Version 5, activated using randn('state',s).
Note The SHR3 generator used in [6] (1999) differs from the one used in [7] (2000). MATLAB
uses the more recent version of the generator, presented in [7].
swb2712
A modified Subtract-with-Borrow generator, as described in [8]. This generator is similar to an
additive lagged Fibonacci generator with lags 27 and 12, but it is modified to have a much longer
period of approximately $2^{1492}$. The generator works natively in double precision to create U(0,1)
values, and all values in the open interval (0, 1) are possible. The randn algorithm used by
default for swb2712 streams is the ziggurat algorithm [7], but with the swb2712 generator
underneath.
Note This generator is identical to the one used by the rand function beginning in MATLAB
Version 5, activated using rand('state',s).
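The value resolution quoted for a generator can be checked empirically. The sketch below does this for mt19937ar, whose U(0,1) values are described above as multiples of $2^{-53}$ in the interval (0, 1); the sample size and variable names are assumptions. Multiplying a double by a power of two is exact here, so the scaled values are integers exactly when the originals are multiples of $2^{-53}$.

```matlab
% Sketch: check the stated resolution of mt19937ar uniform values.
s = RandStream('mt19937ar','Seed',0);
u = rand(s,1,1000);

scaled = u * 2^53;                            % exact scaling by a power of two
allMultiples   = all(scaled == floor(scaled)) % multiples of 2^-53
inOpenInterval = all(u > 0 & u < 1)           % strictly inside (0, 1)
```

A finite sample can only fail to contradict the stated resolution; it does not prove it.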
Transformation Algorithms
Inversion
Computes a normal random variate by applying the standard normal inverse cumulative
distribution function to a uniform random variate. Exactly one uniform value is consumed per
normal value.
Polar
The polar rejection algorithm, as described in [1]. Approximately 1.27 uniform values are
consumed per normal value, on average.
Ziggurat
The ziggurat algorithm, as described in [7]. Approximately 2.02 uniform values are consumed per
normal value, on average.
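The three transformation algorithms listed above differ in speed and in how many uniform values they consume, but each should produce standard normal values. A minimal sketch that draws from each one and compares the sample mean and standard deviation; the sample size and variable names are assumptions.

```matlab
% Sketch: draw normal values with each transformation algorithm.
transforms = {'Ziggurat','Polar','Inversion'};
N = 1e5;                                   % sample size (assumed)
sampleMean = zeros(1,numel(transforms));
sampleStd  = zeros(1,numel(transforms));

for k = 1:numel(transforms)
    s = RandStream('mt19937ar','Seed',0,'NormalTransform',transforms{k});
    z = randn(s,N,1);
    sampleMean(k) = mean(z);               % should be close to 0
    sampleStd(k)  = std(z);                % should be close to 1
end

disp([sampleMean; sampleStd])
```

Because the algorithms consume uniform values at different rates, the three samples are not the same numbers even though the underlying generator and seed are identical.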
Configuring a Stream
A random number stream s has properties that control its behavior. To access or change a property,
use the syntax p = s.Property and s.Property = p.
For example, you can configure the transformation algorithm to generate normally distributed
pseudorandom values when using randn. Generate normally distributed pseudorandom values using
the default Ziggurat transformation algorithm.
s1 = RandStream('mt19937ar');
s1.NormalTransform
ans = 'Ziggurat'
r1 = randn(s1,1,10);
Configure the stream to use the Polar transformation algorithm to generate normally distributed
pseudorandom values.
s1.NormalTransform = 'Polar'
s1 =
mt19937ar random stream
Seed: 0
NormalTransform: Polar
r2 = randn(s1,1,10);
When generating random numbers with uniform distribution using rand, you can also configure the
stream to generate antithetic pseudorandom values, that is, the usual values subtracted from 1 for
uniform values.
s2 = RandStream('mt19937ar');
r1 = rand(s2,1,6)
r1 =
0.8147 0.9058 0.1270 0.9134 0.6324 0.0975
Restore the initial state of the stream. Create another 6 random numbers with the Antithetic
property set to true. Check that these 6 random numbers are equal to the previously generated
random numbers subtracted from 1.
reset(s2)
s2.Antithetic = true;
r2 = rand(s2,1,6)
r2 =
0.1853 0.0942 0.8730 0.0866 0.3676 0.9025
isequal(r1,1 - r2)
ans =
1
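Antithetic values are typically used for variance reduction in Monte Carlo integration. The sketch below is one way that could look, assuming a hypothetical integrand exp(t) on (0, 1), whose exact integral is e - 1; the sample size and variable names are also assumptions.

```matlab
% Sketch: antithetic variates for a Monte Carlo integral.
N = 1e4;                                  % sample size (assumed)
s = RandStream('mt19937ar');
u = rand(s,1,N);                          % ordinary uniform values

reset(s)                                  % back to the initial state
s.Antithetic = true;
v = rand(s,1,N);                          % antithetic values, 1 - u as shown above

f = @(t) exp(t);                          % hypothetical integrand
plainEstimate      = mean(f(u));
antitheticEstimate = mean((f(u) + f(v))/2);
exactValue         = exp(1) - 1;

disp([plainEstimate antitheticEstimate exactValue])
```

Pairing each value with its antithetic partner makes overestimates and underestimates tend to cancel for a monotonic integrand, so the paired estimate usually lies closer to the exact value than the plain one.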
Instead of setting the properties of a stream one-by-one, you can save and restore all properties of a
stream s by using A = get(s) and set(s,A), respectively. For example, configure all properties of
the stream s2 to be the same as the stream s1.
A = get(s1)
A =
Type: 'mt19937ar'
NumStreams: 1
StreamIndex: 1
Substream: 1
Seed: 0
State: [625x1 uint32]
NormalTransform: 'Polar'
Antithetic: 0
FullPrecision: 1
set(s2,A)
Type: 'mt19937ar'
NumStreams: 1
StreamIndex: 1
Substream: 1
Seed: 0
State: [625x1 uint32]
NormalTransform: 'Polar'
Antithetic: 0
FullPrecision: 1
The get and set functions enable you to save and restore a stream's entire configuration so that
results are exactly reproducible later on.
Use RandStream.getGlobalStream to return a handle to the global stream, that is, the current
global stream that rand generates random numbers from. Save the state of the global stream.
globalStream = RandStream.getGlobalStream;
myState = globalStream.State;
Using myState, you can restore the state of globalStream and reproduce previous results.
A = rand(1,100);
globalStream.State = myState;
B = rand(1,100);
isequal(A,B)
ans = logical
1
rand, randi, randn, and randperm access the global stream. Since all of these functions access the
same underlying stream, a call to one affects the values produced by the others at subsequent calls.
globalStream.State = myState;
A = rand(1,100);
globalStream.State = myState;
C = randi(100);
B = rand(1,100);
isequal(A,B)
ans = logical
0
You can also reset a stream to its initial settings using the reset function.
reset(globalStream)
A = rand(1,100);
reset(globalStream)
B = rand(1,100);
isequal(A,B)
ans = logical
1
References
[1] Devroye, L. Non-Uniform Random Variate Generation, Springer-Verlag, 1986.
[2] L’Ecuyer, P. “Good Parameter Sets for Combined Multiple Recursive Random Number Generators”,
Operations Research, 47(1): 159–164. 1999.
[3] L'Ecuyer, P. and S. Côté. “Implementing A Random Number Package with Splitting Facilities”,
ACM Transactions on Mathematical Software, 17: 98–111. 1991.
[4] L'Ecuyer, P. and R. Simard. “TestU01: A C Library for Empirical Testing of Random Number
Generators,” ACM Transactions on Mathematical Software, 33(4): Article 22. 2007.
[5] L'Ecuyer, P., R. Simard, E. J. Chen, and W. D. Kelton. “An Object-Oriented Random-Number
Package with Many Long Streams and Substreams.” Operations Research, 50(6):1073–1075.
2002.
[6] Marsaglia, G. “Random numbers for C: The END?” Usenet posting to sci.stat.math. 1999.
Available online at https://groups.google.com/group/sci.crypt/browse_thread/thread/ca8682a4658a124d/.
[7] Marsaglia, G., and W. W. Tsang. “The ziggurat method for generating random variables.” Journal of
Statistical Software, 5:1–7. 2000. Available online at https://www.jstatsoft.org/v05/i08.
[8] Marsaglia, G., and A. Zaman. “A new class of random number generators.” Annals of Applied
Probability 1(3):462–480. 1991.
[9] Marsaglia, G., and W. W. Tsang. “A fast, easily implemented method for sampling from decreasing
or symmetric unimodal density functions.” SIAM J. Sci. Stat. Comput. 5(2):349–359. 1984.
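[10] Mascagni, M., and A. Srinivasan. “Parameterizing Parallel Multiplicative Lagged-Fibonacci
Generators.” Parallel Computing, 30: 899–916. 2004.
[11] Matsumoto, M., and T. Nishimura. “Mersenne Twister: A 623-Dimensionally Equidistributed
Uniform Pseudorandom Number Generator.” ACM Transactions on Modeling and Computer
Simulation, 8(1):3–30. 1998.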
[12] Matsumoto, M., and M. Saito. “A PRNG Specialized in Double Precision Floating Point Numbers
Using an Affine Transition.” Monte Carlo and Quasi-Monte Carlo Methods 2008,
10.1007/978-3-642-04107-5_38. 2009.
[13] Moler, C.B. Numerical Computing with MATLAB. SIAM, 2004. Available online at
https://www.mathworks.com/moler
[14] Park, S.K., and K.W. Miller. “Random Number Generators: Good Ones Are Hard to Find.”
Communications of the ACM, 31(10):1192–1201. 1988.
[15] Salmon, J. K., M. A. Moraes, R. O. Dror, and D. E. Shaw. "Parallel Random Numbers: As Easy As
1, 2, 3." In Proceedings of the International Conference for High Performance Computing,
Networking, Storage and Analysis (SC11). New York, NY: ACM, 2011.
See Also
rng | RandStream
Related Examples
• “Managing the Global Stream Using RandStream” on page 3-18
• “Multiple Streams” on page 3-30
• “Controlling Random Number Generation” on page 3-38
• “Create Arrays of Random Numbers” on page 3-4
Multiple Streams
MATLAB® software includes generator algorithms that enable you to create multiple independent
random number streams. For example, the four generator types that support multiple independent
streams are the Combined Multiple Recursive ('mrg32k3a'), the Multiplicative Lagged Fibonacci
('mlfg6331_64'), the Philox 4x32 ('philox4x32_10'), and the Threefry 4x64 ('threefry4x64_20')
generators. You can create multiple independent streams that are guaranteed to not overlap, and for
which tests that demonstrate (pseudo)independence of the values between streams have been carried
out. For more information about generator algorithms that support multiple streams, see the table of
generator algorithms in “Creating and Controlling a Random Number Stream” on page 3-21.
The RandStream.create function enables you to create streams that have the same generator
algorithm and seed value, but are statistically independent.
[s1,s2,s3] = RandStream.create('mlfg6331_64','NumStreams',3)
s1 =
mlfg6331_64 random stream
StreamIndex: 1
NumStreams: 3
Seed: 0
NormalTransform: Ziggurat
s2 =
mlfg6331_64 random stream
StreamIndex: 2
NumStreams: 3
Seed: 0
NormalTransform: Ziggurat
s3 =
mlfg6331_64 random stream
StreamIndex: 3
NumStreams: 3
Seed: 0
NormalTransform: Ziggurat
As evidence of independence, you can see that these streams are largely uncorrelated.
r1 = rand(s1,100000,1);
r2 = rand(s2,100000,1);
r3 = rand(s3,100000,1);
corrcoef([r1,r2,r3])
ans = 3×3
Depending on the application, creating only some of the streams in a set of independent streams can
be useful if you need to simulate some events. Specify the StreamIndices parameter to create only
some of the streams from a set of multiple streams. The StreamIndex property returns the index of
each stream you create.
streamNum = 256;
streamId = 4;
s4 = RandStream.create('mlfg6331_64','NumStreams',streamNum,'StreamIndices',streamId)
s4 =
mlfg6331_64 random stream
StreamIndex: 4
NumStreams: 256
Seed: 0
NormalTransform: Ziggurat
Multiple streams, since they are statistically independent, can be used to verify the precision of a
simulation. For example, a set of independent streams can be used to repeat a Monte Carlo
simulation several times in different MATLAB sessions or on different processors and determine the
variance in the results. This makes multiple streams useful in large-scale parallel simulations.
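A minimal sketch of that idea, run serially here for simplicity: each independent stream drives one replication of a Monte Carlo estimate, and the spread of the estimates indicates the precision. The problem (estimating pi), the number of streams, the sample size, and the mrg32k3a generator are assumptions for illustration. The 'CellOutput' option of RandStream.create returns the streams in a cell array.

```matlab
% Sketch: repeat a Monte Carlo estimate on independent streams.
K = 8;                                     % number of replications (assumed)
N = 1e4;                                   % samples per replication (assumed)
streams = RandStream.create('mrg32k3a','NumStreams',K,'CellOutput',true);

est = zeros(1,K);
for k = 1:K
    p = rand(streams{k},N,2);              % points in the unit square
    est(k) = 4*mean(sum(p.^2,2) <= 1);     % estimate of pi from stream k
end

overallEstimate = mean(est)
spreadAcrossRuns = std(est)                % empirical run-to-run variability
```

In a parallel setting, each worker or session would instead create only its own stream by specifying StreamIndices, as shown earlier in this section.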
For generator types that do not explicitly support independent streams, different seeds provide a
method to create multiple streams. By using different seeds, you can create streams that return
different values and act separately from one another. However, using a generator specifically
designed for multiple independent streams is a better option, as the statistical properties across
streams have been carefully verified.
Create two streams with different seeds by using the Mersenne Twister generator.
s1 = RandStream('mt19937ar','Seed',1)
s1 =
mt19937ar random stream
Seed: 1
NormalTransform: Ziggurat
s2 = RandStream('mt19937ar','Seed',2)
s2 =
mt19937ar random stream
Seed: 2
NormalTransform: Ziggurat
Use the first stream in one MATLAB session to generate random numbers.
r1 = rand(s1,100000,1);
Use the second stream in another MATLAB session to generate random numbers.
r2 = rand(s2,100000,1);
With different seeds, streams typically return values that are uncorrelated.
corrcoef([r1,r2])
ans = 2×2
1.0000 0.0030
0.0030 1.0000
The two streams with different seeds may appear uncorrelated because the state space of the Mersenne
Twister is so much larger ($2^{19937}$ elements) than the number of possible seeds ($2^{32}$). The chances of
overlap in different simulation runs are remote unless you use a large number of different
seeds. Using widely spaced seeds does not increase the level of randomness. In fact, taking this
strategy to the extreme and reseeding a generator before each call can result in a sequence of
values that are not statistically independent and identically distributed.
Seeding a stream is most useful if you use it as an initialization step, perhaps at MATLAB startup, or
before running a simulation.
Another method to get different results from a stream is to use substreams. Unlike seeds, where the
locations along the sequence of random numbers are not exactly known, the spacing between
substreams is known, so any chance of overlap can be eliminated. As with independent parallel streams,
research has been done to demonstrate statistical independence across substreams. In short,
substreams are a more controlled way to do many of the same things that seeds have traditionally
been used for, and a more lightweight solution than parallel streams.
Substreams provide a quick and easy way to ensure that you get different results from the same code
at different times. For example, generate several random numbers in a loop.
defaultStream = RandStream('mlfg6331_64');
RandStream.setGlobalStream(defaultStream)
for i = 1:5
defaultStream.Substream = i;
z = rand(1,i)
end
z = 0.6986
z = 1×2
0.9230 0.2489
z = 1×3
0.0261 0.2530 0.0737
z = 1×4
0.3220 0.7405 0.1983 0.1052
z = 1×5
0.2067 0.2417 0.9777 0.5970 0.4187
In another loop, you can generate random values that are independent from the first set of 5
iterations.
for i = 6:10
defaultStream.Substream = i;
z = rand(1,11-i)
end
z = 1×5
0.2650 0.8229 0.2479 0.0247 0.4581
z = 1×4
0.3963 0.7445 0.7734 0.9113
z = 1×3
0.2758 0.3662 0.7979
z = 1×2
0.6814 0.5150
z = 0.5247
Each of these substreams can reproduce its loop iteration. For example, you can return to the 6th
substream in the loop.
defaultStream.Substream = 6;
z = rand(1,5)
z = 1×5
0.2650 0.8229 0.2479 0.0247 0.4581
See Also
rng | RandStream
Related Examples
• “Managing the Global Stream Using RandStream” on page 3-18
• “Controlling Random Number Generation” on page 3-38
• “Creating and Controlling a Random Number Stream” on page 3-21
• “Create Arrays of Random Numbers” on page 3-4
Replace Discouraged Syntaxes of rand and randn
rand('seed',sd)
randn('seed',sd)
rand('state',s)
randn('state',s)
rand('twister',5489)
These syntaxes referred to different types of generators, and they are no longer recommended for the
following reasons:
• The terms 'seed' and 'state' are misleading names for the generators.
• All of the generators except 'twister' are flawed.
• They unnecessarily use different generators for rand and randn.
To assess the impact of replacing discouraged syntaxes in your existing code, execute the following
commands at the start of your MATLAB session:
warning('on','MATLAB:RandStream:ActivatingLegacyGenerators')
warning('on','MATLAB:RandStream:ReadingInactiveLegacyGeneratorState')
Generator = 'seed' referred to the MATLAB v4 generator, not to the seed initialization value.
Generator = 'state' referred to the MATLAB v5 generators, not to the internal state of the
generator.
Generator = 'twister' referred to the Mersenne Twister generator, now the MATLAB startup
generator.
The v4 and v5 generators are no longer recommended unless you are trying to exactly reproduce the
random numbers generated in earlier versions of MATLAB. The simplest way to update your code is
to use rng. The rng function replaces the names for the rand and randn generators as follows.
• Reproduce exactly the same random numbers each time (e.g., by using a seed such as 0, 1, or
3141879)
• Try to ensure that MATLAB always gives different random numbers in separate runs (for example,
by using a seed such as sum(100*clock))
The following table shows replacements for syntaxes with an integer seed sd.
• The first column shows the discouraged syntax with rand and randn.
• The second column shows how to exactly reproduce the discouraged behavior with the new rng
function. In most cases, this is done by specifying a legacy generator type such as the v4 or v5
generators, which is no longer recommended.
• The third column shows the recommended alternative, which does not specify the optional
generator type input to rng. Therefore, if you always omit the Generator input, rand, randn,
and randi just use the default generator type and seed value. Before R2023b, the default
generator type used at MATLAB startup is the Mersenne Twister generator with seed value of 0.
Starting in R2023b, you can change the default algorithm and seed for the random number
generator in MATLAB preferences. For more details, see rng.
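The replacement table itself does not appear here. As a hedged illustration of the second and third columns it describes, the sketch below contrasts an exact reproduction using a legacy generator type with the recommended form that omits the generator type. The seed value is hypothetical, and the pairing of rand('state',sd) with the 'v5uniform' generator type is stated from the rng documentation rather than from this section.

```matlab
sd = 1;                      % hypothetical integer seed, for illustration only

% Exact reproduction of the discouraged rand('state',sd) behavior
% (not recommended): select the v5 uniform generator explicitly.
rng(sd,'v5uniform')
legacyValues = rand(1,3);

% Recommended alternative: return to the default generator, then
% seed it without naming a generator type.
rng('default')
rng(sd)
newValues = rand(1,3);
```

The two sets of values differ because different generators produce them; only the first form matches output recorded with the old syntax.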
The rng function changes the pattern of saving and restoring the state of the random number
generator as shown in the next table. The example in the left column assumes that you are using the
v5 uniform generator. The example in the right column uses the new syntax, and works for any
generator you use.
If code that you rely on puts MATLAB into legacy mode, use the following command to escape legacy
mode and get back to the default startup generator:
rng('default')
Alternatively, to guard around code that puts MATLAB into legacy mode, use:
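The guard listing does not appear here. A minimal sketch of one way to write such a guard, assuming the save-and-restore pattern with rng shown elsewhere in this chapter. The call to rng(0,'v5uniform') is only a stand-in for whatever code enters legacy mode, and the names legacyValues, t, and restored are illustrative.

```matlab
s = rng;                     % save the current generator settings

% Stand-in for code that enters legacy mode (for example, code that
% calls rand('state',0)). The v5 uniform generator is selected through
% rng here purely for illustration.
rng(0,'v5uniform')
legacyValues = rand(1,3);

rng(s)                       % restore the saved settings, leaving legacy mode

% Optional check that the generator type and seed are back as they were
t = rng;
restored = strcmp(t.Type,s.Type) && isequal(t.Seed,s.Seed)
```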
See Also
rng | rand | randn
Related Examples
• “Create Arrays of Random Numbers” on page 3-4
Controlling Random Number Generation
This example shows how to use the rng function, which provides control over random number
generation.
(Pseudo)Random numbers in MATLAB® come from the rand, randi, and randn functions. Many
other functions call those three, but those are the fundamental building blocks. All three depend on a
single shared random number generator that you can control using rng.
It's important to realize that "random" numbers in MATLAB are not unpredictable at all, but are
generated by a deterministic algorithm. The algorithm is designed to be sufficiently complicated so
that its output appears to be an independent random sequence to someone who does not know the
algorithm, and can pass various statistical tests of randomness. The function that is introduced here
provides ways to take advantage of the determinism to
• repeat calculations that involve random numbers, and get the same results, or
• guarantee that different random numbers are used in repeated calculations
and to take advantage of the apparent randomness to justify combining results from separate
calculations.
"Starting Over"
If you look at the output from rand, randi, or randn in a new MATLAB session, you'll notice that
they return the same sequences of numbers each time you restart MATLAB. It's often useful to be
able to reset the random number generator to that startup state, without actually restarting MATLAB.
For example, you might want to repeat a calculation that involves random numbers, and get the same
result.
When you first start a MATLAB session or call rng("default"), MATLAB initializes the random
number generator using the default algorithm and seed. Starting in R2023b, you can set the default
algorithm and seed in MATLAB preferences. If you do not change these preferences, then rng uses
the factory value of "twister" for the Mersenne Twister generator with seed 0, as in previous
releases. For more information, see “Default Settings for Random Number Generator” and
“Reproducibility for Random Number Generator”.
rng("default") provides a very simple way to put the random number generator back to its
default settings.
rng("default")
rand % returns the same value as at startup
ans = 0.8147
What are the "default" random number settings that MATLAB starts up with, or that
rng("default") gives you? Before R2023b, if you call rng with no inputs, you can see that it is the
Mersenne Twister generator algorithm, seeded with 0.
rng
You'll see in more detail below how to use the above output, including the State field, to control and
change how MATLAB generates random numbers. For now, it serves as a way to see what generator
rand, randi, and randn are currently using.
Non-Repeatability
Each time you call rand, randi, or randn, they draw a new value from their shared random number
generator, and successive values can be treated as statistically independent. But as mentioned above,
each time you restart MATLAB those functions are reset and return the same sequences of numbers.
Obviously, calculations that use the same "random" numbers cannot be thought of as statistically
independent. So when it's necessary to combine calculations done in two or more MATLAB sessions
as if they were statistically independent, you cannot use the default generator settings.
One simple way to avoid repeating the same random numbers in a new MATLAB session is to choose
a different seed for the random number generator. rng gives you an easy way to do that, by creating
a seed based on the current time.
rng("shuffle")
rand
ans = 0.7466
Each time you use "shuffle", it reseeds the generator with a different seed. You can call rng with
no inputs to see what seed it actually used.
rng
rand
ans = 0.1908
"shuffle" is a very easy way to reseed the random number generator. You might think that it's a
good idea, or even necessary, to use it to get "true" randomness in MATLAB. For most purposes,
though, it is not necessary to use "shuffle" at all. Choosing a seed based on the current time does
not improve the statistical properties of the values you'll get from rand, randi, and randn, and does
not make them "more random" in any real sense. While it is perfectly fine to reseed the generator
each time you start up MATLAB, or before you run some kind of large calculation involving random
numbers, it is actually not a good idea to reseed the generator too frequently within a session,
because this can affect the statistical properties of your random numbers.
What "shuffle" does provide is a way to avoid repeating the same sequences of values. Sometimes
that is critical, sometimes it's just "nice", but often it is not important at all. Bear in mind that if you
use "shuffle", you may want to save the seed that rng created so that you can repeat your
calculations later on. You'll see how to do that below.
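As a minimal sketch of that idea, the settings structure returned by rng can be captured right after shuffling and passed back to rng later. The variable names s, x, and y are illustrative only.

```matlab
rng("shuffle")      % reseed from the current time
s = rng;            % capture the settings, including s.Seed, before drawing values
x = rand(1,3);      % values from the freshly shuffled generator

rng(s)              % later: restore the captured settings
y = rand(1,3);      % repeats the same three values
isequal(x,y)        % displays logical 1
```

Saving s (for example, in a MAT-file alongside the results) is one way to make a shuffled run repeatable.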
So far, you've seen how to reset the random number generator to its default settings, and reseed it
using a seed that is created using the current time. rng also provides a way to reseed it using a
specific seed.
You can use the same seed several times, to repeat the same calculations. For example, if you run this
code twice ...
rng(1) % the seed is any non-negative integer < 2^32
x = randn(1,5)
x = 1×5
rng(1)
x = randn(1,5)
x = 1×5
... you get exactly the same results. You might do this to recreate x after having cleared it, so that you
can repeat what happens in subsequent calculations that depend on x, using those specific values.
On the other hand, you might want to choose different seeds to ensure that you don't repeat the same
calculations. For example, if you run this code in one MATLAB session ...
rng(2)
x2 = sum(randn(50,1000),1); % 1000 trials of a random walk
... and this code in another ...
rng(3)
x3 = sum(randn(50,1000),1);
... you could combine the two results and be confident that they are not simply the same results
repeated twice.
x = [x2 x3];
As with "shuffle" there is a caveat when reseeding MATLAB's random number generator, because
it affects all subsequent output from rand, randi, and randn. Unless you need repeatability or
uniqueness, it is usually advisable to simply generate random values without reseeding the generator.
If you do need to reseed the generator, that is usually best done at the command line, or in a spot in
your code that is not easily overlooked.
Not only can you reseed the random number generator as shown above, you can also choose the type
of random number generator that you want to use. Different generator types produce different
sequences of random numbers, and you might, for example, choose a specific type because of its
statistical properties. Or you might need to recreate results from an older version of MATLAB that
used a different default generator type.
One other common reason for choosing the generator type is that you are writing a validation test
that generates "random" input data, and you need to guarantee that your test can always expect
exactly the same predictable result. If you call rng with a seed before creating the input data, it
reseeds the random number generator. But if the generator type has been changed for some reason,
then the output from rand, randi, and randn will not be what you expect from that seed. Therefore,
to be 100% certain of repeatability, you can also specify a generator type.
For example,
rng(0,"twister")
causes rand, randi, and randn to use the Mersenne Twister generator algorithm, after seeding it
with 0.
Using "combRecursive"
rng(0,"combRecursive")
selects the Combined Multiple Recursive generator algorithm, which supports some parallel features
that the Mersenne Twister does not.
This command
rng(0,"v4")
selects the generator algorithm that was the default in MATLAB 4.0.
And of course, this command returns the random number generator to its default settings.
rng("default")
However, because the default random number generator settings may change between MATLAB
releases, using "default" does not guarantee predictable results over the long-term. "default" is
a convenient way to reset the random number generator, but for even more predictability, specify a
generator type and a seed.
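A short sketch of that advice follows: specifying both the seed and the generator type before drawing test data. The names a, b, and s are illustrative.

```matlab
rng(0,"twister")    % explicit seed and generator type
a = rand(1,3);
rng(0,"twister")    % same seed, same generator
b = rand(1,3);
isequal(a,b)        % logical 1: the two draws match
s = rng;
s.Type              % 'twister'
```

Because the generator type is stated explicitly, the result does not depend on which generator happened to be active beforehand.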
On the other hand, when you are working interactively and need repeatability, it is simpler, and
usually sufficient, to call rng with just a seed.
Calling rng with no inputs returns a scalar structure with fields that contain two pieces of
information described already: the generator type, and the integer with which the generator was last
reseeded.
s = rng
The third field, State, contains a copy of the generator's current state vector. This state vector is the
information that the generator maintains internally in order to generate the next value in its
sequence of random numbers. Each time you call rand, randi, or randn, the generator that they
share updates its internal state. Thus, the state vector in the settings structure returned by rng
contains the information necessary to repeat the sequence, beginning from the point at which the
state was captured.
While just being able to see this output is informative, rng also accepts a settings structure as an
input, so that you can save the settings, including the state vector, and restore them later to repeat
calculations. Because the settings contain the generator type, you'll know exactly what you're getting,
and so "later" might mean anything from moments later in the same MATLAB session, to years (and
multiple MATLAB releases) later. You can repeat results from any point in the random number
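sequence at which you saved the generator settings. For example
x1 = randn(10,10); % move ahead in the random number sequence
s = rng;           % save the settings at this point
x2 = randn(1,5)
x3 = randn(5,5);   % move ahead in the random number sequence
rng(s);            % return the generator to the saved state
x2 = randn(1,5)    % repeat the same numbers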
Notice that while reseeding provides only a coarse reinitialization, saving and restoring the generator
state using the settings structure allows you to repeat any part of the random number sequence.
The most common way to use a settings structure is to restore the generator state. However, because
the structure contains not only the state, but also the generator type and seed, it's also a convenient
way to temporarily switch generator types. For example, if you need to create values using one of the
legacy generators from MATLAB 5.0, you can save the current settings at the same time that you
switch to use the old generator.
previousSettings = rng(0,"v5uniform")
rng(previousSettings)
You should not modify the contents of any of the fields in a settings structure. In particular, you
should not construct your own state vector, or even depend on the format of the generator state.
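The temporary switch described above can be sketched as follows. The variables before, after, and x are illustrative additions used only to check that the original settings come back unchanged.

```matlab
before = rng;                             % settings in effect right now
previousSettings = rng(0,"v5uniform");    % switch generators; rng returns the old settings
x = rand;                                 % value drawn from the legacy generator
rng(previousSettings)                     % put the original generator and state back
after = rng;
isequal(before,after)                     % logical 1
```

The settings structure is passed back to rng untouched, which is consistent with the advice not to modify its fields.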
rng lets you reseed the random number generator, or save and restore its settings,
without having to know what type it is. You can also return the random number generator to its
default settings without having to know what those settings are. While there are situations when you
might want to specify a generator type, rng affords you the simplicity of not having to specify it.
If you are able to avoid specifying a generator type, your code will automatically adapt to cases where
a different generator needs to be used, and will automatically benefit from improved properties in a
new default random number generator type.
rng provides a convenient way to control random number generation in MATLAB for the most
common needs. However, more complicated situations involving multiple random number streams
and parallel random number generation require a more complicated tool. The RandStream class is
that tool, and it provides the most powerful way to control random number generation. The two tools
are complementary, with rng providing a much simpler and more concise syntax that is built on top of the
flexibility of RandStream.
4
Sparse Matrices
Computational Advantages of Sparse Matrices
In this section...
“Memory Management” on page 4-2
“Computational Efficiency” on page 4-2
Memory Management
Using sparse matrices to store data that contains a large number of zero-valued elements can both
save a significant amount of memory and speed up the processing of that data. sparse is an attribute
that you can assign to any two-dimensional MATLAB matrix that is composed of double or logical
elements.
The sparse attribute allows MATLAB to:
• Store only the nonzero elements of the matrix, together with their indices.
• Reduce computation time by eliminating operations on zero elements.
For full matrices, MATLAB stores every matrix element internally. Zero-valued elements require the
same amount of storage space as any other matrix element. For sparse matrices, however, MATLAB
stores only the nonzero elements and their indices. For large matrices with a high percentage of zero-
valued elements, this scheme significantly reduces the amount of memory required for data storage.
The whos command provides high-level information about matrix storage, including size and storage
class. For example, this whos listing shows information about sparse and full versions of the same
matrix.
whos
Notice that the number of bytes used is fewer in the sparse case, because zero-valued elements are
not stored.
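A comparison of this kind can be sketched as follows. The matrix built from magic is an illustrative choice, and the variables infoFull and infoSparse are hypothetical names used to read the byte counts programmatically.

```matlab
M_full = magic(1100);         % illustrative 1100-by-1100 matrix
M_full(M_full > 50) = 0;      % keep only the 50 smallest entries nonzero
M_sparse = sparse(M_full);    % same values, sparse storage
whos M_full M_sparse          % compare the Bytes column

infoFull = whos("M_full");
infoSparse = whos("M_sparse");
infoSparse.bytes < infoFull.bytes   % logical 1
```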
Computational Efficiency
Sparse matrices also have significant advantages in terms of computational efficiency. Unlike
operations with full matrices, operations with sparse matrices do not perform unnecessary low-level
arithmetic, such as zero-adds (x+0 is always x). The resulting efficiencies can lead to dramatic
improvements in execution time for programs working with large amounts of sparse data.
See Also
More About
• “Sparse Matrix Operations” on page 4-14
Constructing Sparse Matrices
In this section...
“Creating Sparse Matrices” on page 4-4
“Importing Sparse Matrices” on page 4-7
The density of a matrix is the number of nonzero elements divided by the total number of matrix
elements. For matrix M, this would be
nnz(M) / prod(size(M));
or
nnz(M) / numel(M);
Matrices with very low density are often good candidates for use of the sparse format.
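As a small worked instance of the density formula, consider a 4-by-4 matrix with five nonzero entries (the matrix M here is illustrative):

```matlab
M = [0 0 0 5
     0 2 0 0
     1 3 0 0
     0 0 4 0];
density = nnz(M) / numel(M)   % 5 nonzeros out of 16 elements gives 0.3125
```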
You can convert a full matrix to sparse storage using the sparse function with a single argument.
For example:
A = [ 0 0 0 5
0 2 0 0
1 3 0 0
0 0 4 0];
S = sparse(A)
S =
(3,1) 1
(2,2) 2
(3,2) 3
(4,3) 4
(1,4) 5
The printed output lists the nonzero elements of S, together with their row and column indices. The
elements are sorted by columns, reflecting the internal data structure.
You can convert a sparse matrix to full storage using the full function, provided the matrix order is
not too large. For example, A = full(S) reverses the example conversion.
Converting a full matrix to sparse storage is not the most frequent way of generating sparse matrices.
If the order of a matrix is small enough that full storage is possible, then conversion to sparse storage
rarely offers significant savings.
You can create a sparse matrix from a list of nonzero elements using the sparse function with five
arguments.
S = sparse(i,j,s,m,n)
i and j are vectors of row and column indices, respectively, for the nonzero elements of the matrix. s
is a vector of nonzero values whose indices are specified by the corresponding (i,j) pairs. m is the
row dimension of the resulting matrix, and n is the column dimension.
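For example, this call builds the same matrix S as the preceding conversion example:
S = sparse([3 2 3 4 1],[1 2 2 3 4],[1 2 3 4 5],4,4)
S =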
(3,1) 1
(2,2) 2
(3,2) 3
(4,3) 4
(1,4) 5
The sparse command has a number of alternate forms. The example above uses a form that sets the
maximum number of nonzero elements in the matrix to length(s). If desired, you can append a
sixth argument that specifies a larger maximum, allowing you to add nonzero elements later without
reallocating the sparse matrix.
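The difference between the five-argument and six-argument forms can be sketched like this. The value 10 for the reserved storage is an arbitrary illustrative choice.

```matlab
i = [3 2 3 4 1];  j = [1 2 2 3 4];  s = [1 2 3 4 5];
S5 = sparse(i,j,s,4,4);       % room for length(s) nonzeros
S6 = sparse(i,j,s,4,4,10);    % same matrix, space reserved for 10 nonzeros
isequal(S5,S6)                % logical 1: the stored values are identical
[nzmax(S5) nzmax(S6)]         % allocated storage for each
```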
The matrix representation of the second difference operator is a good example of a sparse matrix. It
is a tridiagonal matrix with -2s on the diagonal and 1s on the super- and subdiagonal. There are many
ways to generate it—here's one possibility.
n = 5;
D = sparse(1:n,1:n,-2*ones(1,n),n,n);
E = sparse(2:n,1:n-1,ones(1,n-1),n,n);
S = E+D+E'
S =
(1,1) -2
(2,1) 1
(1,2) 1
(2,2) -2
(3,2) 1
(2,3) 1
(3,3) -2
(4,3) 1
(3,4) 1
(4,4) -2
(5,4) 1
(4,5) 1
(5,5) -2
Now F = full(S) displays the corresponding full matrix.
F = full(S)
F =
-2 1 0 0 0
1 -2 1 0 0
0 1 -2 1 0
0 0 1 -2 1
0 0 0 1 -2
Creating sparse matrices based on their diagonal elements is a common operation, so the function
spdiags handles this task. Its syntax is
S = spdiags(B,d,m,n)
• B is a matrix of size min(m,n)-by-p. The columns of B are the values to populate the diagonals of
S.
• d is a vector of length p whose integer elements specify which diagonals of S to populate.
That is, the elements in column j of B fill the diagonal specified by element j of d.
Note If a column of B is longer than the diagonal it's replacing, super-diagonals are taken from the
lower part of the column of B, and sub-diagonals are taken from the upper part of the column of B.
B = [ 41 11 0
52 22 0
63 33 13
74 44 24 ];
d = [-3
0
2];
A = spdiags(B,d,7,4)
A =
(1,1) 11
(4,1) 41
(2,2) 22
(5,2) 52
(1,3) 13
(3,3) 33
(6,3) 63
(2,4) 24
(4,4) 44
(7,4) 74
full(A)
ans =
11 0 13 0
0 22 0 24
0 0 33 0
41 0 0 44
0 52 0 0
0 0 63 0
0 0 0 74
spdiags can also extract diagonal elements from a sparse matrix, or replace matrix diagonal
elements with new values. Type help spdiags for details.
Importing Sparse Matrices
You can import sparse matrices from computations outside the MATLAB environment by using the
spconvert function together with the load command. For example, if T.dat is a three-column text
file that lists row indices, column indices, and nonzero values, these statements load it and
convert it into a sparse matrix S:
load T.dat
S = spconvert(T)
The save and load commands can also process sparse matrices stored as binary data in MAT-files.
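The conversion step can be tried without a file, as in this sketch. The array T is an illustrative stand-in for a three-column array of row indices, column indices, and values such as load might return.

```matlab
T = [3 1 1
     2 2 2
     3 2 3
     4 3 4
     1 4 5];          % columns: row index, column index, value
S = spconvert(T);     % sparse matrix sized by the largest indices
full(S)
```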
See Also
sparse | spconvert
More About
• “Sparse Matrix Operations” on page 4-14
Accessing Sparse Matrices
Nonzero Elements
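There are several commands that provide high-level information about the nonzero elements of a
sparse matrix:
• nnz returns the number of nonzero elements in a sparse matrix.
• nonzeros returns a column vector containing all the nonzero elements of a sparse matrix.
• nzmax returns the amount of storage space allocated for the nonzero entries of a sparse matrix.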
To try some of these, load the supplied sparse matrix west0479, one of the Harwell-Boeing
collection.
load west0479
whos
nnz(west0479)
ans =
1887
format short e
west0479
west0479 =
(25,1) 1.0000e+00
(31,1) -3.7648e-02
(87,1) -3.4424e-01
(26,2) 1.0000e+00
(31,2) -2.4523e-02
(88,2) -3.7371e-01
(27,3) 1.0000e+00
(31,3) -3.6613e-02
(89,3) -8.3694e-01
(28,4) 1.3000e+02
.
.
.
nonzeros(west0479)
ans =
1.0000e+00
-3.7648e-02
-3.4424e-01
1.0000e+00
-2.4523e-02
-3.7371e-01
1.0000e+00
-3.6613e-02
-8.3694e-01
1.3000e+02
.
.
.
Note that initially nnz has the same value as nzmax by default. That is, the number of nonzero
elements is equivalent to the number of storage locations allocated for nonzeros. However, MATLAB
does not dynamically release memory if you zero out additional array elements. Changing the value of
some matrix elements to zero changes the value of nnz, but not that of nzmax.
However, you can add as many nonzero elements to the matrix as desired. You are not constrained by
the original value of nzmax.
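A small sketch of the relationship between nnz and nzmax, using an illustrative 3-by-3 matrix:

```matlab
S = sparse([1 2 3],[1 2 3],[10 20 30],3,3);
[nnz(S) nzmax(S)]     % both counts right after construction
S(2,2) = 0;           % zero out one stored element
[nnz(S) nzmax(S)]     % nnz drops to 2; compare nzmax with its earlier value
```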
find returns the row indices of nonzero values in vector i, the column indices in vector j, and the
nonzero values themselves in the vector s. The example below uses find to locate the indices and
values of the nonzeros in a sparse matrix. The sparse function uses the find output, together with
the size of the matrix, to recreate the matrix.
S1 = west0479;
[i,j,s] = find(S1);
[m,n] = size(S1);
S2 = sparse(i,j,s,m,n);
B = speye(4);
[i,j,s] = find(B);
[i,j,s]
ans =
1 1 1
2 2 1
3 3 1
4 4 1
B(3,1) = 42;
[i,j,s] = find(B);
[i,j,s]
ans =
1 1 1
3 1 42
2 2 1
3 3 1
4 4 1
In order to store the new matrix with 42 at (3,1), MATLAB inserts an additional row into the
nonzero values vector and subscript vectors, then shifts all matrix values after (3,1).
Using linear indexing to access or assign an element in a large sparse matrix will fail if the linear
index exceeds 2^48-1, which is the current upper bound for the number of elements allowed in a
matrix.
S = spalloc(2^30,2^30,2);
S(end) = 1
To access an element whose linear index exceeds this bound, use row and column subscripts instead:
S(2^30,2^30) = 1
S =
(1073741824,1073741824) 1
While the cost of indexing into a sparse matrix to change a single element is negligible, it is
compounded in the context of a loop and can become quite slow for large matrices. For that reason,
in cases where many sparse matrix elements need to be changed, it is best to vectorize the operation
instead of using a loop. For example, consider a sparse identity matrix:
n = 10000;
A = 4*speye(n);
Changing the elements of A within a loop is slower than a similar vectorized operation:
tic
A(1:n-1,n) = -1;
A(n,1:n-1) = -1;
toc
C = 4*speye(n); % give the loop version the same starting matrix
tic
for k = 1:n-1
C(k,n) = -1;
C(n,k) = -1;
end
toc
Since MATLAB stores sparse matrices in compressed sparse column format, it needs to shift multiple
entries in C during each pass through the loop.
Preallocating the memory for a sparse matrix and then filling it in an element-wise manner similarly
causes a significant amount of overhead in indexing into the sparse array:
S1 = spalloc(1000,1000,100000);
tic;
for n = 1:100000
i = ceil(1000*rand(1,1));
j = ceil(1000*rand(1,1));
S1(i,j) = rand(1,1);
end
toc
Constructing the vectors of indices and values eliminates the need to index into the sparse array, and
thus is significantly faster:
i = ceil(1000*rand(100000,1));
j = ceil(1000*rand(100000,1));
v = zeros(size(i));
for n = 1:100000
v(n) = rand(1,1);
end
tic;
S2 = sparse(i,j,v,1000,1000);
toc
For that reason, it’s best to construct sparse matrices all at once using a construction function, like
the sparse or spdiags functions.
For example, suppose you wanted the sparse form of the coordinate matrix C:
\[
C = \begin{pmatrix}
4 & 0 & 0 & 0 & -1 \\
0 & 4 & 0 & 0 & -1 \\
0 & 0 & 4 & 0 & -1 \\
0 & 0 & 0 & 4 & -1 \\
1 & 1 & 1 & 1 & 4
\end{pmatrix}
\]
Construct the five-column matrix directly with the sparse function using the triplets of row
subscripts, column subscripts, and values:
i = [1 5 2 5 3 5 4 5 1 2 3 4 5]';
j = [1 1 2 2 3 3 4 4 5 5 5 5 5]';
s = [4 1 4 1 4 1 4 1 -1 -1 -1 -1 4]';
C = sparse(i,j,s)
C =
(1,1) 4
(5,1) 1
(2,2) 4
(5,2) 1
(3,3) 4
(5,3) 1
(4,4) 4
(5,4) 1
(1,5) -1
(2,5) -1
(3,5) -1
(4,5) -1
(5,5) 4
The ordering of the values in the output reflects the underlying storage by columns. For more
information on how MATLAB stores sparse matrices, see John R. Gilbert, Cleve Moler, and Robert
Schreiber's Sparse Matrices In MATLAB: Design and Implementation, (SIAM Journal on Matrix
Analysis and Applications, 13:1, 333–356 (1992)).
It is often useful to use a graphical format to view the distribution of the nonzero elements within a
sparse matrix. The MATLAB spy function produces a template view of the sparsity structure, where
each point on the graph represents the location of a nonzero array element.
For example:
Load the supplied sparse matrix west0479, one of the Harwell-Boeing collection.
load west0479
spy(west0479)
See Also
sparse
More About
• “Computational Advantages of Sparse Matrices” on page 4-2
• “Constructing Sparse Matrices” on page 4-4
• “Sparse Matrix Operations” on page 4-14
Sparse Matrix Operations
Efficiency of Operations
Computational Complexity
The computational complexity of sparse operations is proportional to nnz, the number of nonzero
elements in the matrix. Computational complexity also depends linearly on the row size m and column
size n of the matrix, but is independent of the product m*n, the total number of zero and nonzero
elements.
The complexity of fairly complicated operations, such as the solution of sparse linear equations,
involves factors like ordering and fill-in, which are discussed in the previous section. In general,
however, the computer time required for a sparse matrix operation is proportional to the number of
arithmetic operations on nonzero quantities.
Algorithmic Details
• Functions that accept a matrix and return a scalar or constant-size vector always produce output
in full storage format. For example, the size function always returns a full vector, whether its
input is full or sparse.
• Functions that accept scalars or vectors and return matrices, such as zeros, ones, rand, and
eye, always return full results. This is necessary to avoid introducing sparsity unexpectedly. The
sparse analog of zeros(m,n) is simply sparse(m,n). The sparse analogs of rand and eye are
sprand and speye, respectively. There is no sparse analog for the function ones.
• Unary functions that accept a matrix and return a matrix or vector preserve the storage class of
the operand. If S is a sparse matrix, then chol(S) is also a sparse matrix, and diag(S) is a
sparse vector. Columnwise functions such as max and sum also return sparse vectors, even though
these vectors can be entirely nonzero. Important exceptions to this rule are the sparse and full
functions.
• Binary operators yield sparse results if both operands are sparse, and full results if both are full.
For mixed operands, the result is full unless the operation preserves sparsity. If S is sparse and F
is full, then S+F, S*F, and F\S are full, while S.*F and S&F are sparse. In some cases, the result
might be sparse even though the matrix has few zero elements.
• Matrix concatenation using either the cat function or square brackets produces sparse results for
mixed operands.
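The rules above can be checked with issparse, as in this sketch. The small matrices S and F are illustrative.

```matlab
S = sparse([4 1 0; 1 4 1; 0 1 4]);   % small sparse, positive definite matrix
F = ones(3);                          % full matrix of the same size

issparse(size(S))     % 0: fixed-size results are full
issparse(chol(S))     % 1: unary functions keep the storage class
issparse(sum(S))      % 1: columnwise reductions stay sparse
issparse(S + F)       % 0: mixed addition gives a full result
issparse(S * F)       % 0: mixed matrix product gives a full result
issparse(S .* F)      % 1: elementwise product preserves sparsity
issparse(S & F)       % 1: logical AND preserves sparsity
issparse([S F])       % 1: concatenation of mixed operands is sparse
```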
For example:
p = [1 3 4 2 5]
I = eye(5,5);
P = I(p,:)
e = ones(4,1);
S = diag(11:11:55) + diag(e,1) + diag(e,-1)
p =
1 3 4 2 5
P =
1 0 0 0 0
0 0 1 0 0
0 0 0 1 0
0 1 0 0 0
0 0 0 0 1
S =
11 1 0 0 0
1 22 1 0 0
0 1 33 1 0
0 0 1 44 1
0 0 0 1 55
You can now try some permutations using the permutation vector p and the permutation matrix P. For
example, the statements S(p,:) and P*S return the same matrix.
S(p,:)
ans =
11 1 0 0 0
0 1 33 1 0
0 0 1 44 1
1 22 1 0 0
0 0 0 1 55
P*S
ans =
11 1 0 0 0
0 1 33 1 0
0 0 1 44 1
1 22 1 0 0
0 0 0 1 55
S*P'
ans =
11 0 0 1 0
1 1 0 22 0
0 33 1 1 0
0 1 44 0 1
0 0 1 0 55
If P is a sparse matrix, then both representations use storage proportional to n and you can apply
either to S in time proportional to nnz(S). The vector representation is slightly more compact and
efficient, so the various sparse matrix permutation routines all return full row vectors with the
exception of the pivoting permutation in LU (triangular) factorization, which returns a matrix
compatible with the full LU factorization.
n = 5;
I = speye(n);
Pr = I(p,:);
Pc = I(:,p);
pc = (1:n)*Pc
pc =
1 3 4 2 5
pr = (Pr*(1:n)')'
pr =
1 3 4 2 5
The inverse of P is simply R = P'. You can compute the inverse of p with r(p) = 1:n.
r(p) = 1:5
r =
1 4 2 3 5
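The equivalences between the vector and matrix forms of a permutation can be verified in a few lines. This is a sketch that reuses the values of p and S from the example above, with S stored as sparse.

```matlab
p = [1 3 4 2 5];
n = numel(p);
I = speye(n);
P = I(p,:);                                   % sparse permutation matrix
S = sparse(diag(11:11:55) + diag(ones(4,1),1) + diag(ones(4,1),-1));

isequal(S(p,:), P*S)      % row permutation: both forms agree
isequal(S(:,p), S*P')     % column permutation: both forms agree
r = zeros(1,n);
r(p) = 1:n;               % inverse permutation vector
isequal(P', I(r,:))       % P' is the matrix form of r
```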
Reordering the columns of a matrix can often make its LU or QR factors sparser. Reordering the rows
and columns can often make its Cholesky factors sparser. The simplest such reordering is to sort the
columns by nonzero count. This is sometimes a good reordering for matrices with very irregular
structures, especially if there is great variation in the nonzero counts of rows or columns.
The colperm function computes a permutation that orders the columns of a matrix by the number of
nonzeros in each column from smallest to largest.
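A brief sketch of the column count ordering follows. The random test matrix from sprand and the fixed seed are assumptions made only so that the example is repeatable.

```matlab
rng(0,"twister")                      % fixed seed so the sketch is repeatable
S = sprand(30,30,0.15);               % illustrative random sparse matrix
j = colperm(S);
counts = full(sum(S(:,j) ~= 0, 1));   % nonzeros per column after reordering
issorted(counts)                      % logical 1: counts are nondecreasing
```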
The reverse Cuthill-McKee ordering is intended to reduce the profile or bandwidth of the matrix. It is
not guaranteed to find the smallest possible bandwidth, but it usually does. The symrcm function
actually operates on the nonzero structure of the symmetric matrix A + A', but the result is also
useful for nonsymmetric matrices. This ordering is useful for matrices that come from one-
dimensional problems or problems that are in some sense long and thin.
The degree of a node in a graph is the number of connections to that node. This is the same as the
number of off-diagonal nonzero elements in the corresponding row of the adjacency matrix. The
approximate minimum degree algorithm generates an ordering based on how these degrees are
altered during Gaussian elimination or Cholesky factorization. It is a complicated and powerful
algorithm that usually leads to sparser factors than most other orderings, including column count and
reverse Cuthill-McKee. Because keeping track of the degree of each node is very time-consuming, the
approximate minimum degree algorithm uses an approximation to the degree, rather than the exact
degree.
See “Reordering and Factorization of Sparse Matrices” on page 4-18 for an example using symamd.
You can change various parameters associated with details of the algorithms using the spparms
function.
For details on the algorithms used by colamd and symamd, see [5]. The approximate degree the
algorithms use is based on [1].
Like the approximate minimum degree ordering, the nested dissection ordering algorithm
implemented by the dissect function reorders the matrix rows and columns by considering the
matrix to be the adjacency matrix of a graph. The algorithm reduces the problem down to a much
smaller scale by collapsing together pairs of vertices in the graph. After reordering the small graph,
the algorithm then applies projection and refinement steps to expand the graph back to the original
size.
The nested dissection algorithm produces high quality reorderings and performs particularly well
with finite element matrices compared to other reordering techniques. For more information about
the nested dissection ordering algorithm, see [7].
If S is a sparse matrix, the following command returns three sparse matrices L, U, and P such that
P*S = L*U.
[L,U,P] = lu(S);
lu obtains the factors by Gaussian elimination with partial pivoting. The permutation matrix P has
only n nonzero elements. As with dense matrices, the statement [L,U] = lu(S) returns a permuted
unit lower triangular matrix and an upper triangular matrix whose product is S. By itself, lu(S)
returns L and U in a single matrix without the pivot information.
The three-output syntax [L,U,P] = lu(S) selects P via numerical partial pivoting, but does not
pivot to improve sparsity in the LU factors. On the other hand, the four-output syntax [L,U,P,Q] =
lu(S) selects P via threshold partial pivoting, and selects P and Q to improve sparsity in the LU
factors.
You can control pivoting in sparse matrices using
lu(S,thresh)
where thresh is a pivot threshold in [0,1]. Pivoting occurs when the diagonal entry in a column has
magnitude less than thresh times the magnitude of any sub-diagonal entry in that column. thresh
= 0 forces diagonal pivoting. thresh = 1 is the default. (The default for thresh is 0.1 for the four-
output syntax).
When you call lu with three or fewer outputs, MATLAB automatically allocates the memory necessary
to hold the sparse L and U factors during the factorization. Except for the four-output syntax,
MATLAB does not use any symbolic LU prefactorization to determine the memory requirements and
set up the data structures in advance.
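The three-output and four-output syntaxes can be compared on a small matrix, as in this sketch. The tridiagonal test matrix from gallery is an illustrative choice.

```matlab
S = gallery('tridiag',20);        % small sparse test matrix

[L,U,P] = lu(S);                  % row pivoting only
norm(P*S - L*U,'fro')             % close to zero

[L2,U2,P2,Q2] = lu(S);            % row and column permutations
norm(P2*S*Q2 - L2*U2,'fro')       % close to zero
```

In the four-output case the factorization satisfies P2*S*Q2 = L2*U2, so the column permutation Q2 must be applied as well when solving systems.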
Reordering and Factorization of Sparse Matrices
This example shows the effects of reordering and factorization on sparse matrices.
If you obtain a good column permutation p that reduces fill-in, perhaps from symrcm or colamd, then
computing lu(S(:,p)) takes less time and storage than computing lu(S).
B = bucky;
r = symrcm(B);
m = symamd(B);
The two permutations are the symmetric reverse Cuthill-McKee ordering and the symmetric
approximate minimum degree ordering.
Create spy plots to show the three adjacency matrices of the Bucky Ball graph with these three
different numberings. The local, pentagon-based structure of the original numbering is not evident in
the others.
figure
subplot(1,3,1)
spy(B)
title('Original')
subplot(1,3,2)
spy(B(r,r))
title('Reverse Cuthill-McKee')
subplot(1,3,3)
spy(B(m,m))
title('Min Degree')
The reverse Cuthill-McKee ordering, r, reduces the bandwidth and concentrates all the nonzero
elements near the diagonal. The approximate minimum degree ordering, m, produces a fractal-like
structure with large blocks of zeros.
To see the fill-in generated in the LU factorization of the Bucky ball, use speye, the sparse identity
matrix, to insert -3s on the diagonal of B.
B = B - 3*speye(size(B));
Since each row sum is now zero, this new B is actually singular, but it is still instructive to compute
its LU factorization. When called with only one output argument, lu returns the two triangular
factors, L and U, in a single sparse matrix. The number of nonzeros in that matrix is a measure of the
time and storage required to solve linear systems involving B.
Here are the nonzero counts for the three permutations being considered.
Even though this is a small example, the results are typical. The original numbering scheme leads to
the most fill-in. The fill-in for the reverse Cuthill-McKee ordering is concentrated within the band, but
it is almost as extensive as for the original numbering. For the approximate minimum degree ordering,
the relatively large blocks of zeros are preserved during the elimination and the amount of fill-in is
significantly less than that generated by the other orderings.
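One way to obtain the nonzero counts for the three orderings is sketched here. It repeats the setup steps so that it is self-contained, and the variable counts is an illustrative name. No specific output values are claimed.

```matlab
B = bucky;
r = symrcm(B);
m = symamd(B);
B = B - 3*speye(size(B));         % same shifted matrix as in the text
counts = [nnz(lu(B)), nnz(lu(B(r,r))), nnz(lu(B(m,m)))]
```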
figure
subplot(1,3,1)
spy(lu(B))
title('Original')
subplot(1,3,2)
spy(lu(B(r,r)))
title('Reverse Cuthill-McKee')
subplot(1,3,3)
spy(lu(B(m,m)))
title('Min Degree')
Cholesky Factorization
If S is a symmetric (or Hermitian), positive definite, sparse matrix, the statement below returns a
sparse, upper triangular matrix R so that R'*R = S.
R = chol(S)
chol does not automatically pivot for sparsity, but you can compute approximate minimum degree
and profile limiting permutations for use with chol(S(p,p)).
Since the Cholesky algorithm does not use pivoting for sparsity and does not require pivoting for
numerical stability, chol does a quick calculation of the amount of memory required and allocates all
the memory at the start of the factorization. You can use symbfact, which uses the same algorithm
as chol, to calculate how much memory is allocated.
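A sketch of sparse Cholesky factorization with and without a fill-reducing permutation follows. The test matrix from delsq and numgrid is an illustrative choice.

```matlab
S = delsq(numgrid('S',20));      % sparse, symmetric positive definite test matrix
R = chol(S);                     % natural ordering
p = symamd(S);                   % approximate minimum degree permutation
Rp = chol(S(p,p));               % factor the reordered matrix

norm(R'*R - S,'fro')             % close to zero
norm(Rp'*Rp - S(p,p),'fro')      % close to zero
[nnz(R) nnz(Rp)]                 % fill with and without reordering
```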
QR Factorization
MATLAB computes the complete QR factorization of a sparse matrix S with
[Q,R] = qr(S)
or
[Q,R,E] = qr(S)
but this is often impractical. The unitary matrix Q often fails to have a high proportion of zero
elements. A more practical alternative, sometimes known as “the Q-less QR factorization,” is
available. With one sparse input argument and one output argument,
R = qr(S)
returns just the upper triangular portion of the QR factorization. The matrix R provides a Cholesky
factorization for the matrix associated with the normal equations:
R'*R = S'*S
However, the loss of numerical information inherent in the computation of S'*S is avoided.
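With two input arguments having the same number of rows, and two output arguments, the
statement
[C,R] = qr(S,B)
applies the orthogonal transformations to B, producing C = Q'*B without computing Q.
The Q-less QR factorization allows the solution of sparse least squares problems
$\min_x \|Ax - b\|_2$
with two steps:
[c,R] = qr(A,b);
x = R\c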
If A is sparse, but not square, MATLAB uses these steps for the linear equation solving backslash
operator:
x = A\b
Or, you can do the factorization yourself and examine R for rank deficiency.
It is also possible to solve a sequence of least squares linear systems with different right-hand sides,
b, that are not necessarily known when R = qr(A) is computed. The approach solves the
“semi-normal equations” R'*R*x = A'*b with
x = R\(R'\(A'*b))
and then employs one step of iterative refinement to reduce round-off error:
r = b - A*x;
e = R\(R'\(A'*r));
x = x + e
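The semi-normal equations approach can be tried end to end, as in this sketch. The small test problem is illustrative, and the economy-size call qr(A,0) is an assumption made here so that R is square.

```matlab
A = [gallery('tridiag',10); speye(10)];   % 20-by-10 sparse matrix with full column rank
b = (1:20)';

R = qr(A,0);                  % Q-less, economy-size triangular factor
x = R\(R'\(A'*b));            % semi-normal equations
r = b - A*x;                  % one step of iterative refinement
e = R\(R'\(A'*r));
x = x + e;

norm(x - A\b)                 % small: agrees with the backslash solution
```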
Incomplete Factorizations
The ilu and ichol functions provide approximate, incomplete factorizations, which are useful as
preconditioners for sparse iterative methods.
The ilu function produces three incomplete lower-upper (ILU) factorizations: the zero-fill ILU
(ILU(0)), a Crout version of ILU (ILUC(tau)), and ILU with threshold dropping and pivoting
(ILUTP(tau)). The ILU(0) never pivots and the resulting factors only have nonzeros in positions
where the input matrix had nonzeros. Both ILUC(tau) and ILUTP(tau), however, do threshold-
based dropping with the user-defined drop tolerance tau.
For example:
A = gallery('neumann',1600) + speye(1600);
nnz(A)
ans =
7840
nnz(lu(A))
ans =
126478
shows that A has 7840 nonzeros, and its complete LU factorization has 126478 nonzeros. On the
other hand, the following code shows the different ILU outputs:
[L,U] = ilu(A);
nnz(L)+nnz(U)-size(A,1)
ans =
7840
norm(A-(L*U).*spones(A),'fro')./norm(A,'fro')
ans =
4.8874e-17
opts.type = 'ilutp';
opts.droptol = 1e-4;
[L,U,P] = ilu(A, opts);
nnz(L)+nnz(U)-size(A,1)
ans =
31147
norm(P*A - L*U,'fro')./norm(A,'fro')
ans =
9.9224e-05
opts.type = 'crout';
[L,U,P] = ilu(A, opts);
nnz(L)+nnz(U)-size(A,1)
ans =
31083
norm(P*A-L*U,'fro')./norm(A,'fro')
ans =
9.7344e-05
These calculations show that the zero-fill factors have 7840 nonzeros, the ILUTP(1e-4) factors have
31147 nonzeros, and the ILUC(1e-4) factors have 31083 nonzeros. Also, the relative error of the
product of the zero-fill factors is essentially zero on the pattern of A. Finally, the relative error in the
factorizations produced with threshold dropping is of the same order as the drop tolerance, although
this is not guaranteed to occur. See the ilu reference page for more options and details.
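To illustrate how such factors are used as a preconditioner, here is a small sketch that passes the zero-fill ILU factors to bicgstab. The right-hand side, tolerance, and iteration limit are assumptions made for the illustration; the matrix is the same one used above.

```matlab
% Sketch: use ILU(0) factors as a preconditioner for an iterative solver.
A = gallery('neumann',1600) + speye(1600);
b = A*ones(1600,1);            % assumed right-hand side with known solution
tol = 1e-8;                    % assumed tolerance
maxit = 200;                   % assumed iteration limit

% Solve without a preconditioner.
[x0,flag0,relres0,iter0] = bicgstab(A,b,tol,maxit);

% Solve again with the zero-fill incomplete LU factors.
[L,U] = ilu(A);
[x1,flag1,relres1,iter1] = bicgstab(A,b,tol,maxit,L,U);

fprintf('no preconditioner: flag %d, %g iterations\n',flag0,iter0);
fprintf('ILU(0):            flag %d, %g iterations\n',flag1,iter1);
```

A flag of 0 indicates that the solver reached the requested tolerance. Comparing the two iteration counts gives a rough measure of how much the preconditioner helps for this matrix.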
The ichol function provides zero-fill incomplete Cholesky factorizations (IC(0)) as well as
threshold-based dropping incomplete Cholesky factorizations (ICT(tau)) of symmetric, positive
definite sparse matrices. These factorizations are the analogs of the incomplete LU factorizations
above and have many of the same characteristics. For example:
A = delsq(numgrid('S',200));
nnz(A)
ans =
195228
nnz(chol(A,'lower'))
ans =
7762589
shows that A has 195228 nonzeros, and its complete Cholesky factorization without reordering has
7762589 nonzeros. By contrast:
L = ichol(A);
nnz(L)
ans =
117216
norm(A-(L*L').*spones(A),'fro')./norm(A,'fro')
ans =
3.5805e-17
opts.type = 'ict';
opts.droptol = 1e-4;
L = ichol(A,opts);
nnz(L)
ans =
1166754
norm(A-L*L','fro')./norm(A,'fro')
ans =
2.3997e-04
IC(0) has nonzeros only in the pattern of the lower triangle of A, and on the pattern of A, the
product of the factors matches. Also, the ICT(1e-4) factors are considerably sparser than the
complete Cholesky factor, and the relative error between A and L*L' is of the same order as the
drop tolerance. It is important to note that unlike the factors provided by chol, the default factors
provided by ichol are lower triangular. See the ichol reference page for more information.
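The properties of IC(0) described above can be checked directly on a smaller grid. This is a sketch under the assumption that a 30-by-30 grid is representative; the grid size is not from the original example.

```matlab
% Sketch: check the zero-fill incomplete Cholesky properties on a small grid.
A = delsq(numgrid('S',30));    % assumed smaller test matrix
L = ichol(A);                  % default is zero-fill, IC(0)

% L is lower triangular and has the same nonzero count as tril(A).
isLower   = istril(L);
sameCount = (nnz(L) == nnz(tril(A)));

% On the pattern of A, the product L*L' reproduces A up to roundoff.
patternErr = norm(A - (L*L').*spones(A),'fro')/norm(A,'fro');

fprintf('lower triangular: %d, same nonzero count: %d, pattern error: %g\n', ...
    isLower, sameCount, patternErr);
```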
Function    Description
eigs        Few eigenvalues
svds        Few singular values
These functions are most frequently used with sparse matrices, but they can be used with full
matrices or even with linear operators defined in MATLAB code.
The statement
[V,lambda] = eigs(A,k,sigma)
finds the k eigenvalues and corresponding eigenvectors of the matrix A that are nearest the “shift”
sigma. If sigma is omitted, the eigenvalues largest in magnitude are found. If sigma is zero, the
eigenvalues smallest in magnitude are found. A second matrix, B, can be included for the generalized
eigenvalue problem: Aυ = λBυ.
The statement
[U,S,V] = svds(A,k)
finds the k largest singular values of A, and
[U,S,V] = svds(A,k,'smallest')
finds the k smallest singular values.
The numerical techniques used in eigs and svds are described in [6].
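As a quick check of these calling forms, the sketch below compares eigs and svds against the dense eig and svd functions. The small test matrix is an assumption chosen so that the dense computations are cheap.

```matlab
% Sketch: compare a few eigenvalues and singular values with dense results.
A = delsq(numgrid('S',20));    % assumed small sparse test matrix (324-by-324)
k = 3;

dSmall = eigs(A,k,'smallestabs');   % k eigenvalues smallest in magnitude
dLarge = eigs(A,k);                 % k eigenvalues largest in magnitude
sLarge = svds(A,k);                 % k largest singular values

% Dense reference values, sorted in ascending and descending order.
dAll = sort(eig(full(A)));
sAll = sort(svd(full(A)),'descend');

errSmall = norm(sort(dSmall) - dAll(1:k));
errLarge = norm(sort(dLarge,'descend') - dAll(end:-1:end-k+1));
errSvd   = norm(sort(sLarge,'descend') - sAll(1:k));
fprintf('errors: %g  %g  %g\n',errSmall,errLarge,errSvd);
```

For large matrices the dense reference is not practical, which is the situation eigs and svds are intended for.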
This example shows how to find the smallest eigenvalue and eigenvector of a sparse matrix.
Set up the five-point Laplacian difference operator on a 65-by-65 grid in an L-shaped, two-
dimensional domain.
L = numgrid('L',65);
A = delsq(L);
size(A)
ans = 1×2
2945 2945
nnz(A)
ans = 14473
[v,d] = eigs(A,1,'smallestabs');
Distribute the components of the eigenvector over the appropriate grid points and produce a contour
plot of the result.
L(L>0) = full(v(L(L>0)));
x = -1:1/32:1;
contour(x,x,L,15)
axis square
References
[1] Amestoy, P. R., T. A. Davis, and I. S. Duff, “An Approximate Minimum Degree Ordering Algorithm,”
SIAM Journal on Matrix Analysis and Applications, Vol. 17, No. 4, Oct. 1996, pp. 886-905.
[2] Barrett, R., M. Berry, T. F. Chan, et al., Templates for the Solution of Linear Systems: Building
Blocks for Iterative Methods, SIAM, Philadelphia, 1994.
[3] Davis, T.A., Gilbert, J. R., Larimore, S.I., Ng, E., Peyton, B., “A Column Approximate Minimum
Degree Ordering Algorithm,” Proc. SIAM Conference on Applied Linear Algebra, Oct. 1997, p.
29.
[4] Gilbert, John R., Cleve Moler, and Robert Schreiber, “Sparse Matrices in MATLAB: Design and
Implementation,” SIAM J. Matrix Anal. Appl., Vol. 13, No. 1, January 1992, pp. 333-356.
[5] Larimore, S. I., An Approximate Minimum Degree Column Ordering Algorithm, MS Thesis, Dept.
of Computer and Information Science and Engineering, University of Florida, Gainesville, FL,
1998.
[6] Saad, Yousef, Iterative Methods for Sparse Linear Systems. PWS Publishing Company, 1996.
[7] Karypis, George and Vipin Kumar. "A Fast and High Quality Multilevel Scheme for Partitioning
Irregular Graphs." SIAM Journal on Scientific Computing. Vol. 20, Number 1, 1999, pp. 359–
392.
See Also
More About
• “Computational Advantages of Sparse Matrices” on page 4-2
• “Constructing Sparse Matrices” on page 4-4
• “Accessing Sparse Matrices” on page 4-8
Finite Difference Laplacian
This example shows how to compute and represent the finite difference Laplacian on an L-shaped
domain.
Domain
The numgrid function numbers points within an L-shaped domain. The spy function is a useful tool
for visualizing the pattern of nonzero elements in a matrix. Use these two functions to generate and
display an L-shaped domain.
n = 32;
R = 'L';
G = numgrid(R,n);
spy(G)
title('A Finite Difference Grid')
Show a smaller version of the grid numbering as a sample.
g = numgrid(R,10)
g = 10×10
     0     0     0     0     0     0     0     0     0     0
     0     1     5     9    13    17    25    33    41     0
     0     2     6    10    14    18    26    34    42     0
     0     3     7    11    15    19    27    35    43     0
     0     4     8    12    16    20    28    36    44     0
     0     0     0     0     0    21    29    37    45     0
     0     0     0     0     0    22    30    38    46     0
     0     0     0     0     0    23    31    39    47     0
     0     0     0     0     0    24    32    40    48     0
     0     0     0     0     0     0     0     0     0     0
Discrete Laplacian
Use delsq to generate the discrete Laplacian. Use the spy function again to get a graphical feel of
the matrix elements.
D = delsq(G);
spy(D)
title('The 5-Point Laplacian')
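The structure of the matrix that delsq returns is easier to see on a very small grid. The following sketch uses a 5-by-5 square grid, which is an assumption for illustration and not part of the L-shaped example.

```matlab
% Sketch: inspect the 5-point Laplacian on a tiny square grid.
Gs = numgrid('S',5);     % 3-by-3 block of interior points, numbered 1 to 9
Ds = delsq(Gs);          % 9-by-9 sparse matrix

% Each interior point contributes 4 on the diagonal and -1 for each neighbor.
disp(full(Ds))
```

Each row has at most five nonzeros, one for the point itself and one for each of its up to four grid neighbors, which is where the name "5-point" comes from.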
Determine the number of interior points.
N = sum(G(:)>0)
N = 675
Solve the Dirichlet boundary value problem for the sparse linear system. The problem setup is:
delsq(u) = 1 in the interior, u = 0 on the boundary.
rhs = ones(N,1);
if (R == 'N') % For nested dissection, turn off minimum degree ordering.
spparms('autommd',0)
u = D\rhs;
spparms('autommd',1)
else
u = D\rhs; % This is used for R=='L' as in this example
end
Map the solution onto the L-shaped grid and plot it as a contour map.
U = G;
U(G>0) = full(u(G(G>0)));
clabel(contour(U));
prism
axis square ij
mesh(U)
axis([0 n 0 n 0 max(max(U))])
axis square ij
See Also
spy
Graphical Representation of Sparse Matrices
This example shows the finite element mesh for a NASA airfoil, including two trailing flaps. More
information about the history of airfoils is available at NACA Airfoils (nasa.gov).
The data is stored in the file airfoil.mat. The data consists of 4253 pairs of (x,y) coordinates of the
mesh points. It also contains an array of 12,289 pairs of indices, (i,j), specifying connections between
the mesh points.
load airfoil
First, scale x and y by $2^{-32}$ to bring them into the range $[0,1]$. Then form a sparse adjacency matrix
from the (i,j) connections and make it positive definite. Finally, plot the adjacency matrix using (x,y)
as the coordinates for the vertices (mesh points).
% Scaling x and y
x = pow2(x,-32);
y = pow2(y,-32);
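The step of forming a positive definite adjacency matrix from (i,j) connections can be sketched as follows. The short edge list here is hypothetical and stands in for the i and j vectors that airfoil.mat provides; the construction is one plausible approach that relies on strict diagonal dominance.

```matlab
% Sketch: build a symmetric positive definite matrix from an edge list.
% The edge list below is hypothetical; airfoil.mat supplies the real i and j.
i = [1 1 2 3]';
j = [2 3 4 4]';

n = max(max(i),max(j));
A = sparse(i,j,-1,n,n);      % -1 for each listed connection
A = A + A';                  % make the matrix symmetric

% Put (number of neighbors + 1) on the diagonal so that the matrix is
% strictly diagonally dominant and therefore positive definite.
d = abs(sum(A)) + 1;
A = A + diag(sparse(d));

disp(full(A))
```

With the real data, gplot(A,[x y]) can then draw the mesh using the scaled coordinates.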
You can use spy to visualize the nonzero elements in a matrix, so it is a particularly useful function to
see the sparsity pattern in sparse matrices. spy(A) plots the sparsity pattern of the matrix A.
spy(A)
title('Airfoil Adjacency Matrix')
symrcm uses the Reverse Cuthill-McKee technique for reordering the adjacency matrix. r =
symrcm(A) returns a permutation vector r such that A(r,r) tends to have its nonzero elements
closer to the diagonal than A. This form is a good preordering for LU or Cholesky factorization of
matrices that come from "long, skinny" problems. It works for both symmetric and nonsymmetric
matrices.
r = symrcm(A);
spy(A(r,r))
title('Reverse Cuthill-McKee')
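The effect of the reverse Cuthill-McKee ordering can also be measured numerically with the bandwidth function. This is a sketch on an assumed test matrix (a scrambled Laplacian), not on the airfoil data.

```matlab
% Sketch: measure how symrcm changes the bandwidth of a matrix.
rng(0);                              % reproducible permutation
B = delsq(numgrid('L',30));          % assumed test matrix
p = randperm(size(B,1));
B = B(p,p);                          % scramble the ordering on purpose

r = symrcm(B);
bwBefore = bandwidth(B,'lower');
bwAfter  = bandwidth(B(r,r),'lower');
fprintf('lower bandwidth before: %d, after: %d\n',bwBefore,bwAfter);
```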
Use j = colperm(A) to return a permutation vector that reorders the columns of the sparse matrix
A in nondecreasing order of nonzero count. This form is sometimes useful as a preordering for LU
factorization, as in lu(A(:,j)).
j = colperm(A);
spy(A(j,j))
title('Column Count Reordering')
symamd gives a symmetric approximate minimum degree permutation. For a symmetric positive
definite matrix S, the command p = symamd(S) returns the permutation vector p such that S(p,p)
tends to have a sparser Cholesky factor than S. Sometimes symamd works well for symmetric
indefinite matrices too.
m = symamd(A);
spy(A(m,m))
title('Approximate Minimum Degree')
See Also
symamd | symrcm | colperm | spy
Graphs and Matrices
This example shows an application of sparse matrices and explains the relationship between graphs
and matrices.
A graph is a set of nodes with specified connections, or edges, between them. Graphs come in many
shapes and sizes. One example is the connectivity graph of the Buckminster Fuller geodesic dome,
which is also in the shape of a soccer ball or a carbon-60 molecule.
In MATLAB®, you can use the bucky function to generate the graph of the geodesic dome.
[B,V] = bucky;
G = graph(B);
p = plot(G);
axis equal
You also can specify coordinates for the nodes to change the display of the graph.
p.XData = V(:,1);
p.YData = V(:,2);
The bucky function can be used to create the graph because it returns an adjacency matrix. An
adjacency matrix is one way to represent the nodes and edges in a graph.
To construct the adjacency matrix of a graph, the nodes are numbered 1 to N. Then each element (i,j)
of the N-by-N matrix is set to 1 if node i is connected to node j, and 0 otherwise. Thus, for undirected
graphs the adjacency matrix is symmetric, but this need not be the case for directed graphs.
For example, here is a simple graph and its associated adjacency matrix.
% Define a matrix A.
A = [0 1 1 0 ; 1 0 0 1 ; 1 0 0 1 ; 0 1 1 0];
Sparse matrices are particularly helpful for representing very large graphs. This is because each
node is usually connected to only a few other nodes. As a result, the density of nonzero entries in the
adjacency matrix is often relatively small for large graphs. The bucky ball adjacency matrix is a good
example, since it is a 60-by-60 symmetric sparse matrix with only 180 nonzero elements. The density
of this matrix is just 5%.
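The density figure quoted above can be reproduced with a few lines. This sketch only uses the bucky function already introduced in this example.

```matlab
% Sketch: compute the density of the bucky ball adjacency matrix.
B = bucky;                            % 60-by-60 sparse adjacency matrix
density = nnz(B)/numel(B);            % fraction of entries that are nonzero
deg = full(sum(B,2));                 % number of neighbors of each node

fprintf('%d nodes, %d nonzeros, density %.1f%%\n', ...
    size(B,1), nnz(B), 100*density);
```

Every node has exactly three neighbors, so the nonzero count grows linearly with the number of nodes while the number of matrix entries grows quadratically.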
Since the adjacency matrix defines the graph, you can plot a portion of the bucky ball by using a
subset of the entries in the adjacency matrix.
Use the adjacency function to create a new adjacency matrix for the graph. Display the nodes in
one hemisphere of the bucky ball by indexing into the adjacency matrix to create a new, smaller
graph.
figure
A = adjacency(G);
H = graph(A(1:30,1:30));
h = plot(H);
To visualize the adjacency matrix of this hemisphere, use the spy function to plot the silhouette of the
nonzero elements in the adjacency matrix.
Note that the matrix is symmetric, since if node i is connected to node j, then node j is connected to
node i.
spy(A(1:30,1:30))
title('Top Left Corner of Bucky Ball Adjacency Matrix')
Finally, here is a spy plot of the entire bucky ball adjacency matrix.
spy(A)
title('Bucky Ball Adjacency Matrix')
See Also
spy | graph
Sparse Matrix Reordering
This example shows how reordering the rows and columns of a sparse matrix can influence the speed
and storage requirements of a matrix operation.
This spy plot shows a sparse symmetric positive definite matrix derived from a portion of the barbell
matrix. This matrix describes connections in a graph that resembles a barbell.
load barbellgraph.mat
S = A + speye(size(A));
pct = 100 / numel(A);
spy(S)
title('A Sparse Symmetric Matrix')
nz = nnz(S);
xlabel(sprintf('Nonzeros = %d (%.3f%%)',nz,nz*pct));
G = graph(S,'omitselfloops');
p = plot(G,'XData',xy(:,1),'YData',xy(:,2),'Marker','.');
axis equal
Compute the Cholesky factor L, where S = L*L'. Notice that L contains many more nonzero
elements than the unfactored S, because the computation of the Cholesky factorization creates fill-in
nonzeros. These fill-in values slow down the algorithm and increase storage cost.
L = chol(S,'lower');
spy(L)
title('Cholesky Decomposition of S')
nc(1) = nnz(L);
xlabel(sprintf('Nonzeros = %d (%.2f%%)',nc(1),nc(1)*pct));
By reordering the rows and columns of a matrix, it is possible to reduce the amount of fill-in that
factorization creates, thereby reducing the time and storage cost of subsequent calculations.
Test the effects of these sparse matrix reorderings on the barbell matrix.
The colperm command uses the column count reordering algorithm to move rows and columns with
higher nonzero count towards the end of the matrix.
q = colperm(S);
spy(S(q,q))
title('S(q,q) After Column Count Ordering')
nz = nnz(S);
xlabel(sprintf('Nonzeros = %d (%.3f%%)',nz,nz*pct));
For this matrix, the column count ordering can barely reduce the time and storage for Cholesky
factorization.
L = chol(S(q,q),'lower');
spy(L)
title('chol(S(q,q)) After Column Count Ordering')
nc(2) = nnz(L);
xlabel(sprintf('Nonzeros = %d (%.2f%%)',nc(2),nc(2)*pct));
The symrcm command uses the reverse Cuthill-McKee reordering algorithm to move all nonzero
elements closer to the diagonal, reducing the bandwidth of the original matrix.
d = symrcm(S);
spy(S(d,d))
title('S(d,d) After Cuthill-McKee Ordering')
nz = nnz(S);
xlabel(sprintf('Nonzeros = %d (%.3f%%)',nz,nz*pct));
The fill-in produced by Cholesky factorization is confined to the band, so factorizing the reordered
matrix takes less time and less storage.
L = chol(S(d,d),'lower');
spy(L)
title('chol(S(d,d)) After Cuthill-McKee Ordering')
nc(3) = nnz(L);
xlabel(sprintf('Nonzeros = %d (%.2f%%)', nc(3),nc(3)*pct));
The amd command uses an approximate minimum degree algorithm (a powerful graph-theoretic
technique) to produce large blocks of zeros in the matrix.
r = amd(S);
spy(S(r,r))
title('S(r,r) After Minimum Degree Ordering')
nz = nnz(S);
xlabel(sprintf('Nonzeros = %d (%.3f%%)',nz,nz*pct));
The Cholesky factorization preserves the blocks of zeros produced by the minimum degree algorithm.
This structure can significantly reduce time and storage costs.
L = chol(S(r,r),'lower');
spy(L)
title('chol(S(r,r)) After Minimum Degree Ordering')
nc(4) = nnz(L);
xlabel(sprintf('Nonzeros = %d (%.2f%%)',nc(4),nc(4)*pct));
The dissect function uses graph-theoretic techniques to produce fill-reducing orderings. The
algorithm treats the matrix as the adjacency matrix of a graph, coarsens the graph by collapsing
vertices and edges, reorders the smaller graph, and then uses refinement steps to uncoarsen the
small graph and produce a reordering of the original graph. The result is a powerful algorithm that
frequently produces the least amount of fill-in compared to the other reordering algorithms.
p = dissect(S);
spy(S(p,p))
title('S(p,p) After Nested Dissection Ordering')
nz = nnz(S);
xlabel(sprintf('Nonzeros = %d (%.3f%%)',nz,nz*pct));
Similar to the minimum degree ordering, the Cholesky factorization of the nested dissection ordering
mostly preserves the nonzero structure of S(p,p) below the main diagonal.
L = chol(S(p,p),'lower');
spy(L)
title('chol(S(p,p)) After Nested Dissection Ordering')
nc(5) = nnz(L);
xlabel(sprintf('Nonzeros = %d (%.2f%%)',nc(5),nc(5)*pct));
Summarizing Results
This bar chart summarizes the effects of reordering the matrix before performing the Cholesky
factorization. While the Cholesky factorization of the original matrix had about 8% of its elements as
nonzeros, using dissect or amd reduces that density to less than 1%.
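The same comparison can be run in a loop on any sparse symmetric positive definite matrix. The sketch below uses a Laplacian on an L-shaped grid as a stand-in, which is an assumption because barbellgraph.mat might not be available; the labels are illustrative.

```matlab
% Sketch: compare Cholesky fill-in for several orderings.
S = delsq(numgrid('L',60));          % assumed stand-in for the barbell matrix
orderings = {1:size(S,1), colperm(S), symrcm(S), amd(S), dissect(S)};
labels = {'original','column count','Cuthill-McKee', ...
          'min. degree','nested dissection'};

fill = zeros(1,numel(orderings));
for k = 1:numel(orderings)
    p = orderings{k};
    fill(k) = nnz(chol(S(p,p),'lower'));   % nonzeros in the Cholesky factor
end

% Report the density of each factor as a percentage of all matrix entries.
for k = 1:numel(orderings)
    fprintf('%-18s %8d nonzeros (%.2f%%)\n', ...
        labels{k}, fill(k), 100*fill(k)/numel(S));
end
```

The relative ranking of the orderings depends on the matrix, so it is worth rerunning a comparison like this on the actual problem before committing to one ordering.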
See Also
spy | chol | nnz | symrcm | colperm | symamd
Iterative Methods for Linear Systems
• Direct methods are variants of Gaussian elimination. These methods use the individual matrix
elements directly, through matrix operations such as LU, QR, or Cholesky factorization. You can
use direct methods to solve linear equations with a high level of precision, but these methods can
be slow when operating on large sparse matrices. The speed of solving a linear system with a
direct method strongly depends on the density and fill pattern of the coefficient matrix.
ans =
1.4211e-14
MATLAB implements direct methods through the matrix division operators / and \, as well as
functions such as decomposition, lsqminnorm, and linsolve.
• Iterative methods produce an approximate solution to the linear system after a finite number of
steps. These methods are useful for large systems of equations where it is reasonable to trade off
precision for a shorter run time. Iterative methods use the coefficient matrix only indirectly,
through a matrix-vector product or an abstract linear operator. Iterative methods can be used with
any matrix, but they are typically applied to large sparse matrices for which direct solves are slow.
The speed of solving a linear system with an iterative method does not depend as strongly on the
fill pattern of the coefficient matrix as a direct method. However, using an iterative method
typically requires tuning parameters for each specific problem.
For example, this code solves a large sparse linear system that has a symmetric positive definite
coefficient matrix.
A = delsq(numgrid('L',400));
b = ones(size(A,1),1);
x = pcg(A,b,[],1000);
norm(b-A*x)
pcg converged at iteration 796 to a solution with relative residual 9.9e-07.
ans =
3.4285e-04
MATLAB implements a variety of iterative methods that have different strengths and weaknesses
depending on the properties of the coefficient matrix A.
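The two classes of methods can be compared side by side on a moderately sized problem. This is a minimal sketch; the grid size, tolerance, and iteration limit are assumptions chosen so that both solves finish quickly.

```matlab
% Sketch: solve the same SPD system with a direct and an iterative method.
A = delsq(numgrid('L',60));          % assumed moderately sized test matrix
b = ones(size(A,1),1);

xd = A\b;                            % direct solve

tol = 1e-8;                          % assumed tolerance
maxit = 500;                         % assumed iteration limit
[xi,flag,relres,iter] = pcg(A,b,tol,maxit);

fprintf('pcg flag %d after %d iterations, relative residual %g\n', ...
    flag, iter, relres);
fprintf('relative difference from direct solve: %g\n', ...
    norm(xd - xi)/norm(xd));
```

The iterative answer matches the direct one only to about the level implied by the tolerance, which is the precision-for-speed trade described above.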
Direct methods are usually faster and more generally applicable than iterative methods if there is
enough storage available to carry them out. Generally, you should attempt to use x = A\b first. If the
direct solve is too slow, then you can try using iterative methods.
Most iterative solvers follow the same basic process:
1 Start with an initial guess for the solution vector x0. (This is usually a vector of zeros unless you
specify a better guess.)
2 Compute the residual norm res = norm(b-A*x0).
3 Compare the residual against the specified tolerance. If res <= tol, end the computation and
return the computed answer for x0.
4 Apply A*x0 and update the magnitude and direction of the vector x0 based on the value of the
residual and other calculated quantities. This is the step where most computation is done.
5 Repeat Steps 2 through 4 until the value of x0 is good enough to satisfy the tolerance.
Iterative methods differ in how they update the magnitude and direction of x0 in Step 4, and some
have slightly different convergence criteria in Steps 2 and 3, but this captures the basic process that
all iterative solvers follow.
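The generic loop in Steps 1 through 5 can be sketched with the method of steepest descent, which is one simple choice of update rule for symmetric positive definite systems. It is used here only for illustration and is not one of the solvers MATLAB provides; the matrix, tolerance, and iteration limit are assumptions.

```matlab
% Sketch: the generic iterative process, using steepest descent as the update.
A = gallery('tridiag',6,-1,2,-1);    % assumed small SPD test matrix
b = ones(6,1);
tol = 1e-8;                          % assumed absolute residual tolerance
maxit = 1000;                        % assumed iteration limit

x0 = zeros(6,1);                     % Step 1: initial guess
for k = 1:maxit
    r = b - A*x0;                    % Step 2: residual and its norm
    res = norm(r);
    if res <= tol                    % Step 3: compare against the tolerance
        break
    end
    alpha = (r'*r)/(r'*(A*r));       % Step 4: step length along the residual
    x0 = x0 + alpha*r;               %         update magnitude and direction
end                                  % Step 5: repeat until converged

fprintf('stopped at iteration %d with residual norm %g\n',k,norm(b - A*x0));
```

Solvers such as pcg follow the same outline but choose better search directions in Step 4, which is why they usually need far fewer iterations.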
Solver (Description), followed by Notes

pcg (preconditioned conjugate gradients)
• Coefficient matrix must be symmetric positive definite.
• Most effective solver for symmetric positive definite systems since storage for only a limited
number of vectors is required.

lsqr (least squares)
• The only solver available for rectangular systems.
• Analytically equivalent to the method of conjugate gradients (PCG) applied to the normal
equations (A'*A)*x = A'*b.

minres (minimum residual)
• Coefficient matrix must be symmetric but does not need to be positive definite.
• Each iteration minimizes the residual error in the 2-norm, so the algorithm is guaranteed to
make progress from step to step.
• Does not suffer from breakdowns (when an algorithm becomes unable to make progress toward a
solution and halts).

symmlq (symmetric LQ)
• Coefficient matrix must be symmetric but does not need to be positive definite.
• Solves a projected system and keeps the residual orthogonal to all previous ones.
• Does not suffer from breakdowns (when an algorithm becomes unable to make progress toward a
solution and halts).

bicg (biconjugate gradient)
• Coefficient matrix must be square.
• bicg is computationally cheap, but convergence is irregular and unreliable.
• bicg is historically important because many other iterative algorithms were developed as
improvements on it.

bicgstab (biconjugate gradient stabilized)
• Coefficient matrix must be square.
• Uses BiCG steps alternating with GMRES(1) steps for additional stability.

bicgstabl (biconjugate gradient stabilized (l))
• Coefficient matrix must be square.
• Uses BiCG steps alternating with GMRES(2) steps for additional stability.

cgs (conjugate gradient squared)
• Coefficient matrix must be square.
• Requires the same number of operations per iteration as bicg, but avoids using the transpose by
working with a squared residual.

gmres (generalized minimum residual)
• Coefficient matrix must be square.
• One of the most dependable algorithms, since the residual norm is minimized in each iteration.
• Work and required storage rise linearly with iteration count.
• Choosing an appropriate restart value is essential to avoid unnecessary work and storage.

qmr (quasi-minimal residual)
• Coefficient matrix must be square.
• Overhead per iteration is slightly more than bicg, but this provides more stability.

tfqmr (transpose-free quasi-minimal residual)
• Coefficient matrix must be square.
• Best solver to try for symmetric indefinite systems when memory is limited.
Preconditioners
The convergence rate of iterative methods is dependent on the spectrum (eigenvalues) of the
coefficient matrix. Therefore, you can improve the convergence and stability of most iterative
methods by transforming the linear system to have a more favorable spectrum (clustered eigenvalues
or a condition number near 1). This transformation is performed by applying a second matrix, called a
preconditioner, to the system. This process transforms the linear system
$Ax = b$
into an equivalent system
$\tilde{A}\tilde{x} = \tilde{b}$.
The ideal preconditioner transforms the coefficient matrix A into an identity matrix, since any
iterative method will converge in one iteration with such a preconditioner. In practice, finding a good
preconditioner requires trade-offs. The transformation is performed in one of three ways: left
preconditioning, right preconditioning, or split preconditioning.
The first case is called left preconditioning since the preconditioner matrix M appears on the left of A:
$(M^{-1}A)x = M^{-1}b$.
These iterative solvers use left preconditioning:
• bicg
• gmres
• qmr
In right preconditioning, M appears on the right of A:
$(AM^{-1})(Mx) = b$.
These iterative solvers use right preconditioning:
• lsqr
• bicgstab
• bicgstabl
• cgs
• tfqmr
Finally, for symmetric coefficient matrices A, split preconditioning ensures that the transformed
system is still symmetric. The preconditioner $M = HH^T$ gets split and the factors appear on different
sides of A:
$(H^{-1}AH^{-T})(H^{T}x) = H^{-1}b$.
The solver algorithm for split preconditioned systems is based on the above equation, but in practice
there is no need to compute H. The solver algorithm multiplies and solves with M directly.
These iterative solvers use split preconditioning:
• pcg
• minres
• symmlq
In all cases, the preconditioner M is chosen to accelerate convergence of the iterative method. When
the residual error of an iterative solution stagnates or makes little progress between iterations, it
often means you need to generate a preconditioner matrix to incorporate into the problem.
The iterative solvers in MATLAB allow you to specify a single preconditioner matrix M, or two
preconditioner matrix factors such that $M = M_1 M_2$. This makes it easy to specify a preconditioner in
its factorized form, such as $M = LU$. Note that in the split preconditioned case, where $M = HH^T$ also
holds, there is not a relation between the M1 and M2 inputs and the H factors.
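The effect of a preconditioner on the spectrum, and the two ways of passing it to a solver, can be sketched on a small problem. The test matrix, tolerance, and iteration limit below are assumptions; the dense condition numbers are computed only because the matrix is small.

```matlab
% Sketch: a preconditioner improves conditioning, and can be passed to the
% solver either as two factors or as a single matrix.
A = delsq(numgrid('S',20));          % assumed small SPD test matrix
b = ones(size(A,1),1);
L = ichol(A);                        % zero-fill incomplete Cholesky factor

% Condition number before and after split preconditioning with H = L.
condA  = cond(full(A));
condPA = cond(full(L\A/L'));
fprintf('cond(A) = %g, preconditioned = %g\n',condA,condPA);

tol = 1e-8;                          % assumed tolerance
maxit = 200;                         % assumed iteration limit
[x1,flag1,~,iter1] = pcg(A,b,tol,maxit,L,L');   % factors M1 = L, M2 = L'
M = L*L';
[x2,flag2,~,iter2] = pcg(A,b,tol,maxit,M);      % single matrix M
fprintf('iterations: %d (factors), %d (single matrix)\n',iter1,iter2);
```

Passing the factors is usually preferable, since the solver can then apply two triangular solves instead of solving with M as a general matrix.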
In some cases, preconditioners occur naturally in the mathematical model of a given problem. In the
absence of natural preconditioners, you can use one of the incomplete factorizations in this table to
generate a preconditioner matrix. Incomplete factorizations are essentially incomplete direct solves
that are quick to calculate.
See “Incomplete Factorizations” on page 4-22 for more information about ilu and ichol.
Preconditioner Example
Consider the five-point finite difference approximation to Laplace's equation on a square, two-
dimensional domain. The following commands use the preconditioned conjugate gradient (PCG)
method with the preconditioner M = L*L', where L is the zero-fill incomplete Cholesky factor of A.
For this system, pcg does not converge to the specified tolerance within the allowed number of
iterations unless you specify a preconditioner matrix.
A = delsq(numgrid('S',250));
b = ones(size(A,1),1);
tol = 1e-3;
maxit = 100;
L = ichol(A);
x = pcg(A,b,tol,maxit,L,L');
pcg requires 92 iterations to achieve the specified tolerance. However, using a different
preconditioner can yield better results. For example, using ichol to construct a modified incomplete
Cholesky allows pcg to meet the specified tolerance after only 39 iterations.
L = ichol(A,struct('type','nofill','michol','on'));
x = pcg(A,b,tol,maxit,L,L');
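If you use both equilibration and reordering to generate a preconditioner, the process is:
1 Use equilibrate on the coefficient matrix.
2 Reorder the equilibrated matrix using a sparse matrix reordering function, such as dissect or
symrcm.
3 Generate the final preconditioner using a matrix factorization function such as ilu or ichol.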
Here is an example that uses equilibration and reordering to generate a preconditioner for a sparse
coefficient matrix.
1 Create the coefficient matrix A and a vector of ones b for the right-hand side of the linear
equation. Calculate an estimation of the condition number for A.
load west0479;
A = west0479;
b = ones(size(A,1),1);
condest(A)
ans =
1.4244e+12
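Use equilibrate to improve the condition number of the coefficient matrix.
[P,R,C] = equilibrate(A);
Anew = R*P*A*C;
bnew = R*P*b;
condest(Anew)
ans =
5.1042e+04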
2 Reorder the equilibrated matrix using dissect.
q = dissect(Anew);
Anew = Anew(q,q);
bnew = bnew(q);
3 Generate a preconditioner using an incomplete LU factorization.
[L,U] = ilu(Anew);
4 Solve the linear system with gmres using the preconditioner matrices, a tolerance of 1e-10, 50
maximum outer iterations, and 30 inner iterations.
tol = 1e-10;
maxit = 50;
restart = 30;
[xnew, flag, relres] = gmres(Anew,bnew,restart,tol,maxit,L,U);
x(q) = xnew;
x = C*x(:);
Now, compare the relres relative residual returned by gmres (which includes the
preconditioners) to the relative residual without the preconditioners resnew and the relative
residual without equilibration res. The results show that even though the linear systems are all
equivalent, the different methods apply different weights to each element, and this can
significantly affect the value of the residual.
relres
resnew = norm(Anew*xnew - bnew) / norm(bnew)
res = norm(A*x - b) / norm(b)
relres =
8.7537e-11
resnew =
3.6805e-08
res =
5.1415e-04
In addition to using a linear operator instead of a coefficient matrix A, you can also use a linear
operator instead of a matrix for the preconditioner M. In that case, the function needs to calculate M\x
or M'\x, as indicated on the reference page for the solver.
Using linear operators enables you to exploit patterns in A or M to calculate the value of the linear
operations more efficiently than if the solver used the matrix explicitly to carry out the full matrix-
vector multiplication. It also means you do not need the memory to store the coefficient or
preconditioner matrices, since the linear operator typically calculates the result of the matrix-vector
multiplication without forming the matrix at all.
For example, consider this tridiagonal coefficient matrix:
A = [2 -1 0 0 0 0;
-1 2 -1 0 0 0;
0 -1 2 -1 0 0;
0 0 -1 2 -1 0;
0 0 0 -1 2 -1;
0 0 0 0 -1 2];
When A multiplies a vector, most of the individual products involve zero elements of A and contribute
nothing to the result. Only the nonzero tridiagonal elements of A matter. So, for a given vector x,
the linear operator function simply needs to add together three vectors to calculate the value of A*x:
function y = linearOperatorA(x)
y = -1*[0; x(1:end-1)] ...
+ 2*x ...
+ -1*[x(2:end); 0];
end
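A function like this can be passed to a solver as a function handle in place of the matrix. The sketch below uses an anonymous function with the same body as linearOperatorA so that it runs as a single script; the right-hand side, tolerance, and iteration limit are assumptions.

```matlab
% Sketch: solve A*x = b with pcg using a linear operator instead of A.
% afun has the same body as linearOperatorA, written as an anonymous function.
afun = @(x) -1*[0; x(1:end-1)] + 2*x + -1*[x(2:end); 0];

b = ones(6,1);                       % assumed right-hand side
tol = 1e-10;                         % assumed tolerance
maxit = 6;                           % at most one iteration per unknown
[x,flag] = pcg(afun,b,tol,maxit);

% Compare against a direct solve with the explicit tridiagonal matrix.
A = gallery('tridiag',6,-1,2,-1);
fprintf('flag %d, difference from direct solve: %g\n',flag,norm(x - A\b));
```

The explicit matrix is built here only to check the answer. In a real application the point of the operator form is that A never has to be stored.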
Most iterative solvers require the linear operator function for A to return the value of A*x. Likewise,
for the preconditioner matrix M, the function generally must calculate M\x. For the solvers lsqr, qmr,
and bicg, the linear operator functions must also return the value for A'*x or M'\x when requested.
See the iterative solver reference pages for examples and descriptions of linear operator functions.
References
[1] Barrett, R., M. Berry, T. F. Chan, et al., Templates for the Solution of Linear Systems: Building
Blocks for Iterative Methods, SIAM, Philadelphia, 1994.
[2] Saad, Yousef, Iterative Methods for Sparse Linear Systems. PWS Publishing Company, 1996.
See Also
More About
• “Systems of Linear Equations” on page 2-10
5 Graph and Network Algorithms
Directed and Undirected Graphs
What Is a Graph?
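A graph is a collection of nodes and edges that represents relationships:
• Nodes are vertices that correspond to objects.
• Edges are the connections between objects.
• The graph edges sometimes have weights, which indicate the strength (or some other attribute) of
each connection between the nodes.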
These definitions are general, as the exact meaning of the nodes and edges in a graph depends on the
specific application. For instance, you can model the friendships in a social network using a graph.
The graph nodes are people, and the edges represent friendships. The natural correspondence of
graphs to physical objects and situations means that you can use graphs to model a wide variety of
systems. For example:
• Web page linking — The graph nodes are web pages, and the edges represent hyperlinks between
pages.
• Airports — The graph nodes are airports, and the edges represent flights between airports.
In MATLAB, the graph and digraph functions construct objects that represent undirected and
directed graphs.
• Undirected graphs have edges that do not have a direction. The edges indicate a two-way
relationship, in that each edge can be traversed in both directions. This figure shows a simple
undirected graph with three nodes and three edges.
• Directed graphs have edges with direction. The edges indicate a one-way relationship, in that
each edge can only be traversed in a single direction. This figure shows a simple directed graph
with three nodes and two edges.
The exact position, length, or orientation of the edges in a graph illustration typically does not have
meaning. In other words, the same graph can be visualized in several different ways by rearranging
the nodes and/or distorting the edges, as long as the underlying structure does not change.
Graphs created using graph and digraph can have one or more self-loops, which are edges
connecting a node to itself. Additionally, graphs can have multiple edges with the same source and
target nodes, and the graph is then known as a multigraph. A multigraph may or may not contain self-
loops.
For the purposes of graph algorithm functions in MATLAB, a graph containing a node with a single
self-loop is not a multigraph. However, if the graph contains a node with multiple self-loops, it is a
multigraph.
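These rules can be checked with the ismultigraph function. The three small graphs below are hypothetical examples constructed only to exercise each case.

```matlab
% Sketch: which graphs count as multigraphs.
G1 = graph([1 2],[1 3]);         % node 1 has a single self-loop
G2 = graph([1 1 2],[1 1 3]);     % node 1 has two self-loops
G3 = graph([1 1 2],[2 2 3]);     % two edges between nodes 1 and 2

tf = [ismultigraph(G1) ismultigraph(G2) ismultigraph(G3)];
disp(tf)
```

Checking ismultigraph before calling an algorithm is useful because some graph functions do not accept multigraphs.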
For example, the following figure shows an undirected multigraph with self-loops. Node A has three
self-loops, while node C has one. The graph contains these three conditions, any one of which makes
it a multigraph.
Creating Graphs
The primary ways to create a graph include using an adjacency matrix or an edge list.
Adjacency Matrix
One way to represent the information in a graph is with a square adjacency matrix. The nonzero
entries in an adjacency matrix indicate an edge between two nodes, and the value of the entry
indicates the weight of the edge. The diagonal elements of an adjacency matrix are typically zero, but
a nonzero diagonal element indicates a self-loop, or a node that is connected to itself by an edge.
• When you use graph to create an undirected graph, the adjacency matrix must be symmetric. In
practice, the matrices are frequently triangular to avoid repetition. To construct an undirected
graph using only the upper or lower triangle of the adjacency matrix, use graph(A,'upper') or
graph(A,'lower').
• When you use digraph to create a directed graph, the adjacency matrix does not need to be
symmetric.
• For large graphs, the adjacency matrix contains many zeros and is typically a sparse matrix.
• You cannot create a multigraph from an adjacency matrix.
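As a minimal sketch of the symmetric and triangular forms described in the list above, the following builds the same undirected graph twice, once from a full symmetric matrix and once from only its upper triangle. The variable names G_full and G_upper are illustrative.

```matlab
A = [0 1 2; 1 0 3; 2 3 0];            % symmetric weighted adjacency matrix
G_full  = graph(A);                   % uses the whole symmetric matrix
G_upper = graph(triu(A),'upper');     % uses only the upper triangle

% Both calls should describe the same edges and weights.
same = isequal(G_full.Edges,G_upper.Edges)
```

Passing a nonsymmetric matrix to graph without the 'upper' or 'lower' option results in an error, whereas digraph accepts it.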
$$\begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 3 \\ 2 & 3 & 0 \end{pmatrix}.$$
A = [0 1 2; 1 0 3; 2 3 0];
node_names = {'A','B','C'};
G = graph(A,node_names)
G =
You can use the graph or digraph functions to create a graph using an adjacency matrix, or you can
use the adjacency function to find the weighted or unweighted sparse adjacency matrix of a
preexisting graph.
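A short sketch of the round trip with the adjacency function, assuming the three-node weighted graph used in this section. The 'weighted' option returns the edge weights instead of ones.

```matlab
A = [0 1 2; 1 0 3; 2 3 0];
G = graph(A);

A_unweighted = adjacency(G);              % sparse, entries are 0 or 1
A_weighted   = adjacency(G,'weighted');   % sparse, entries are edge weights

% The weighted adjacency matrix should reproduce the original matrix.
roundTrip = isequal(full(A_weighted),A)
```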
Edge List
Another way to represent the information in a graph is by listing all of the edges.
Edge      Weight
(A, B)    1
(A, C)    2
(B, C)    3
From the edge list it is easy to conclude that the graph has three unique nodes, A, B, and C, which are
connected by the three listed edges. If the graph had disconnected nodes, they would not be found in
the edge list, and would have to be specified separately.
In MATLAB, the list of edges is separated by column into source nodes and target nodes. For directed
graphs the edge direction (from source to target) is important, but for undirected graphs the source
and target node are interchangeable. One way to construct this graph using the edge list is to use
separate inputs for the source nodes, target nodes, and edge weights:
source_nodes = {'A','A','B'};
target_nodes = {'B','C','C'};
edge_weights = [1 2 3];
G = graph(source_nodes, target_nodes, edge_weights);
Both graph and digraph permit construction of a simple graph or multigraph from an edge list.
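For instance, repeating a source and target pair in the edge list produces a multigraph. This sketch uses small illustrative vectors and checks the result with ismultigraph. The simplify function then merges the repeated edges.

```matlab
s = [1 1 2];
t = [2 2 3];            % the pair (1,2) appears twice
G = graph(s,t);

tf = ismultigraph(G)    % reports whether G has repeated edges

Gs = simplify(G);       % merge repeated edges into a simple graph
nSimple = numedges(Gs)
```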
After constructing a graph, G, you can look at the edges (and their properties) with the command
G.Edges. The order of the edges in G.Edges is sorted by source node (first column) and secondarily
by target node (second column). For undirected graphs, the node with the smaller index is listed as
the source node, and the node with the larger index is listed as the target node.
Since the underlying implementation of graph and digraph depends on sparse matrices, many of
the same indexing costs apply. Using one of the previous methods to construct a graph all at once
from the triplets (source,target,weight) is quicker than creating an empty graph and
iteratively adding more nodes and edges. For best performance, minimize the number of calls to
graph, digraph, addedge, addnode, rmedge, and rmnode.
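The two construction styles can be compared with a small sketch. Both produce the same graph, but the loop calls addedge once per edge, which is the pattern the paragraph above recommends avoiding for large graphs. The four-node cycle here is only an illustration.

```matlab
s = [1 2 3 4];
t = [2 3 4 1];

% Preferred: build the graph in a single call.
G_once = graph(s,t);

% Slower pattern: start empty and add one edge per call.
G_loop = graph;
for k = 1:numel(s)
    G_loop = addedge(G_loop,s(k),t(k));
end

same = isequal(G_once.Edges.EndNodes,G_loop.Edges.EndNodes)
```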
If the graph has node names (that is, G.Nodes contains a variable Name), then you also can refer to
the nodes in a graph using their names. Thus, named nodes in a graph can be referred to by either
their node indices or node names. For example, node 1 can be called 'A'.
The term node ID encompasses both aspects of node identification. The node ID refers to both the
node index and the node name.
For convenience, MATLAB remembers which type of node ID you use when you call most graph
functions. So if you refer to the nodes in a graph by their node indices, most graph functions return a
numeric answer that also refers to the nodes by their indices.
A = [0 1 1 0; 1 0 1 0; 1 1 0 1; 0 0 1 0];
G = graph(A,{'a','b','c','d'});
p = shortestpath(G,1,4)
p =
1 3 4
However, if you refer to the nodes by their names, then most graph functions return an answer that
also refers to the nodes by their names (contained in a cell array of character vectors or string array).
p1 = shortestpath(G,'a','d')
p1 =
Use findnode to find the numeric node ID for a given node name. Conversely, for a given numeric
node ID, index into G.Nodes.Name to determine the corresponding node name.
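A brief sketch of both conversions, assuming the four-node graph with names 'a' through 'd' from the example above:

```matlab
A = [0 1 1 0; 1 0 1 0; 1 1 0 1; 0 0 1 0];
G = graph(A,{'a','b','c','d'});

idx  = findnode(G,'c')            % name -> numeric node index
name = G.Nodes.Name{4}            % numeric node index -> name

% Referring to nodes by name returns the path as node names.
p1 = shortestpath(G,'a','d')
```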
See “Modify Nodes and Edges of Existing Graph” for some common graph modification
examples.
See Also
graph | digraph
More About
• “Modify Nodes and Edges of Existing Graph”
• “Add Graph Node Names, Edge Weights, and Other Attributes”
• “Graph Plotting and Customization”
Modify Nodes and Edges of Existing Graph
This example shows how to access and modify the nodes and/or edges in a graph or digraph object
using the addedge, rmedge, addnode, rmnode, findedge, findnode, and subgraph functions.
Add Nodes
Create a graph with four nodes and four edges. The corresponding elements in s and t specify the
end nodes of each graph edge.
s = [1 1 1 2];
t = [2 3 4 3];
G = graph(s,t)
G =
graph with properties:
ans=4×1 table
EndNodes
________
1 2
1 3
1 4
2 3
Use addnode to add five nodes to the graph. This command adds five disconnected nodes with node
IDs 5, 6, 7, 8, and 9.
G = addnode(G,5)
G =
graph with properties:
Remove Nodes
Use rmnode to remove nodes 3, 5, and 6 from the graph. All edges connected to any of the removed
nodes also are removed. The remaining six nodes in the graph are renumbered to reflect the new
number of nodes.
G = rmnode(G,[3 5 6])
G =
graph with properties:
Add Edges
Use addedge to add two edges to G. The first edge is between node 1 and node 5, and the second
edge is between node 2 and node 5. This command adds two new rows to G.Edges.
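A sketch of the call this step describes. The block rebuilds the graph from the preceding steps so that it runs on its own. The vector form of addedge shown here is one way to add both edges in a single call.

```matlab
% Rebuild the graph from the earlier steps.
G = graph([1 1 1 2],[2 3 4 3]);
G = addnode(G,5);
G = rmnode(G,[3 5 6]);

% Add the edges (1,5) and (2,5) in one call.
G = addedge(G,[1 2],[5 5]);
G.Edges
```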
G =
graph with properties:
Remove Edges
Use rmedge to remove the edge between node 1 and node 3. This command removes a row from
G.Edges.
G = rmedge(G,1,3)
G =
graph with properties:
Determine the edge index for the edge between nodes 1 and 5. The edge index, ei, is a row number
in G.Edges.
ei = findedge(G,1,5)
ei = 2
Add node names to the graph, and then determine the node index for node 'd'. The numeric node
index, ni, is a row number in G.Nodes. You can use both ni and the node name, 'd', to refer to the
node when using other graph functions, like shortestpath.
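A sketch of this step, again rebuilding the graph so that the block is self-contained. The six names 'a' through 'f' are an assumption chosen to be consistent with the node 'd' and the subgraph output shown in this example.

```matlab
% Rebuild the graph from the earlier steps.
G = graph([1 1 1 2],[2 3 4 3]);
G = addnode(G,5);
G = rmnode(G,[3 5 6]);
G = addedge(G,[1 2],[5 5]);
G = rmedge(G,1,3);

% Name the six remaining nodes (names are illustrative).
G.Nodes.Name = {'a' 'b' 'c' 'd' 'e' 'f'}';

ni = findnode(G,'d')      % row number of node 'd' in G.Nodes
ei = findedge(G,1,5)      % row number of edge (1,5) in G.Edges
```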
ni = 4
Extract Subgraph
Use subgraph to extract a piece of the graph containing only two nodes.
H = subgraph(G,[1 2])
H =
graph with properties:
H.Edges
ans=table
EndNodes
______________
{'a'} {'b'}
The node and edge information for a graph object is contained in two properties: Nodes and Edges.
Both of these properties are tables containing variables to describe the attributes of the nodes and
edges in the graph. Since Nodes and Edges are both tables, you can use the Variables editor to
interactively view or edit the tables. You cannot add or remove nodes or edges using the Variables
editor, and you also cannot edit the EndNodes variable of the Edges table. The Variables editor is
useful for managing extra node and edge attributes in the Nodes and Edges tables. For more
information, see “Create and Edit Variables”.
See Also
graph | digraph | addedge | rmedge | addnode | rmnode | findedge | findnode | subgraph
More About
• “Directed and Undirected Graphs”
Add Graph Node Names, Edge Weights, and Other Attributes
This example shows how to add attributes to the nodes and edges in graphs created using graph and
digraph. You can specify node names or edge weights when you originally call graph or digraph to
create a graph. However, this example shows how to add attributes to a graph after it has been
created.
Create Graph
Create a directed graph. The corresponding elements in s and t define the source and target nodes
of each edge in the graph.
s = [1 1 2 2 3];
t = [2 4 3 4 4];
G = digraph(s,t)
G =
digraph with properties:
Add node names to the graph by adding the variable, Name, to the G.Nodes table. The Name variable
must be specified as an N-by-1 cell array of character vectors or string array, where N =
numnodes(G). It is important to use the Name variable when adding node names, as this variable
name is treated specially by some graph functions.
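A sketch of the assignment, using the four names that appear in the table output below. The trailing transpose turns the 1-by-4 cell array into the required N-by-1 column.

```matlab
G = digraph([1 1 2 2 3],[2 4 3 4 4]);

% Name must be N-by-1, where N = numnodes(G).
G.Nodes.Name = {'First' 'Second' 'Third' 'Fourth'}';

firstAndLast = G.Nodes.Name([1 4])
```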
G.Nodes
ans=4×1 table
Name
__________
{'First' }
{'Second'}
{'Third' }
{'Fourth'}
G.Nodes.Name([1 4])
Add edge weights to the graph by adding the variable, Weight, to the G.Edges table. The Weight
variable must be an M-by-1 numeric vector, where M = numedges(G). It is important to use the
Weight variable when adding edge weights, as this variable name is treated specially by some graph
functions.
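A sketch of the weight assignment, using the values that appear in the table output below. Because G.Edges is sorted by source and then target node, the k-th weight applies to the k-th row of G.Edges.

```matlab
G = digraph([1 1 2 2 3],[2 4 3 4 4]);
G.Nodes.Name = {'First' 'Second' 'Third' 'Fourth'}';

% Weight must be M-by-1, where M = numedges(G).
G.Edges.Weight = [10 20 30 40 50]';

G.Edges([1 3],:)      % first and third rows of the edge table
```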
G.Edges
ans=5×2 table
EndNodes Weight
________________________ ______
{'First' } {'Second'} 10
{'First' } {'Fourth'} 20
{'Second'} {'Third' } 30
{'Second'} {'Fourth'} 40
{'Third' } {'Fourth'} 50
Use table indexing to view the first and third rows of G.Edges.
G.Edges([1 3],:)
ans=2×2 table
EndNodes Weight
________________________ ______
{'First' } {'Second'} 10
{'Second'} {'Third' } 30
In principle you can add any variable to G.Nodes and G.Edges that defines an attribute of the graph
nodes or edges. Adding custom attributes can be useful, since functions like subgraph and
reordernodes preserve the graph attributes.
For example, add a variable named Power to G.Edges to indicate whether each edge is 'on' or
'off'.
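A sketch of adding such a custom edge attribute. The particular 'on' and 'off' values below are hypothetical, since the values are not shown in this example.

```matlab
G = digraph([1 1 2 2 3],[2 4 3 4 4]);
G.Edges.Weight = [10 20 30 40 50]';

% Hypothetical on/off state for each of the five edges.
G.Edges.Power = {'on' 'on' 'on' 'off' 'off'}';

G.Edges
```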
ans=5×3 table
EndNodes Weight Power
________________________ ______ _______
Add a variable named Size to G.Nodes to indicate the physical size of each node.
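A sketch of the assignment, using the sizes that appear in the table output below:

```matlab
G = digraph([1 1 2 2 3],[2 4 3 4 4]);
G.Nodes.Name = {'First' 'Second' 'Third' 'Fourth'}';

% One size value per node, as an N-by-1 column.
G.Nodes.Size = [10 20 10 30]';

G.Nodes
```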
ans=4×2 table
Name Size
__________ ____
{'First' } 10
{'Second'} 20
{'Third' } 10
{'Fourth'} 30
Since Nodes and Edges are both tables, you can use the Variables editor to interactively view or edit
the tables. For more information, see “Create and Edit Variables”.
When you plot a graph, you can use the variables in G.Nodes and G.Edges to label the graph nodes
and edges. This practice is convenient, since these variables are already guaranteed to have the
correct number of elements.
Plot the graph and label the edges using the Power variable in G.Edges. Label the nodes using the
Size variable in G.Nodes.
p = plot(G,'EdgeLabel',G.Edges.Power,'NodeLabel',G.Nodes.Size)
p =
GraphPlot with properties:
See Also
graph | digraph
More About
• “Directed and Undirected Graphs”
• “Modify Nodes and Edges of Existing Graph”
Graph Plotting and Customization
This example shows how to plot graphs, and then customize the display to add labels or highlighting
to the graph nodes and edges.
Use the plot function to plot graph and digraph objects. By default, plot examines the size and
type of graph to determine which layout to use. The resulting figure window contains no axes tick
marks. However, if you specify the (x,y) coordinates of the nodes with the XData, YData, or ZData
name-value pairs, then the figure includes axes ticks.
Node labels are included automatically in plots of graphs that have 100 or fewer nodes. The node
labels use the node names if available; otherwise, the labels are numeric node indices.
For example, create a graph using the buckyball adjacency matrix, and then plot the graph using all
of the default options. If you call plot and specify an output argument, then the function returns a
handle to a GraphPlot object. Subsequently, you can use this object to adjust properties of the plot.
For example, you can change the color or style of the edges, the size and color of the nodes, and so
on.
G = graph(bucky);
p = plot(G)
p =
GraphPlot with properties:
After you have a handle to the GraphPlot object, use dot indexing to access or change the property
values. For a complete list of the properties that you can adjust, see GraphPlot Properties.
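A sketch of dot indexing on the GraphPlot handle. The specific properties and values set here (square red markers of size 7) are illustrative choices, not necessarily the ones used to produce the figures in this example.

```matlab
G = graph(bucky);
p = plot(G);

% Change property values through the handle.
p.Marker = 's';
p.NodeColor = 'r';
p.MarkerSize = 7;

% Query a property value through the handle.
lw = p.LineWidth
```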
ans = 0.5000
Create and plot a graph representing an L-shaped membrane constructed from a square grid with a
side of 12 nodes. Specify an output argument with plot to return a handle to the GraphPlot object.
n = 12;
A = delsq(numgrid('L',n));
G = graph(A,'omitselfloops')
G =
graph with properties:
p = plot(G)
p =
GraphPlot with properties:
EdgeLabel: {}
XData: [-2.5225 -2.1251 -1.6498 -1.1759 -0.7827 -2.5017 -2.0929 -1.6027 -1.1131 -0.7069
YData: [-3.5040 -3.5417 -3.5684 -3.5799 -3.5791 -3.0286 -3.0574 -3.0811 -3.0940 -3.0997
ZData: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Use the layout function to change the layout of the graph nodes in the plot. The different layout
options automatically compute node coordinates for the plot. Alternatively, you can specify your own
node coordinates with the XData, YData, and ZData properties of the GraphPlot object.
Instead of using the default 2-D layout method, use layout to specify the 'force3' layout, which is
a 3-D force directed layout.
layout(p,'force3')
view(3)
Color the graph nodes based on their degree. In this graph, all of the interior nodes have the same
maximum degree of 4, nodes along the boundary of the graph have a degree of 3, and the corner
nodes have the smallest degree of 2. Store this node coloring data as the variable NodeColors in
G.Nodes.
G.Nodes.NodeColors = degree(G);
p.NodeCData = G.Nodes.NodeColors;
colorbar
Add some random integer weights to the graph edges, and then plot the edges such that their line
width is proportional to their weight. Since edge line widths greater than about 7 become
cumbersome, scale the line widths such that the edge with the greatest weight has a line
width of 7. Store this edge width data as the variable LWidths in G.Edges.
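A sketch of this step. The weight range passed to randi is an arbitrary assumption. Any positive integer range works, because the widths are rescaled by the largest weight.

```matlab
n = 12;
A = delsq(numgrid('L',n));
G = graph(A,'omitselfloops');
p = plot(G);

% Random integer weights (the range is an assumption).
G.Edges.Weight = randi([10 250],numedges(G),1);

% Scale so that the heaviest edge has a line width of 7.
G.Edges.LWidths = 7*G.Edges.Weight/max(G.Edges.Weight);
p.LineWidth = G.Edges.LWidths;
```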
Extract Subgraph
Extract and plot the top right corner of G as a subgraph, to make it easier to read the details on the
graph. The new graph, H, inherits the NodeColors and LWidths variables from G, so that recreating
the previous plot customizations is straightforward. However, the nodes in H are renumbered to
account for the new number of nodes in the graph.
H = subgraph(G,[1:31 36:41]);
p1 = plot(H,'NodeCData',H.Nodes.NodeColors,'LineWidth',H.Edges.LWidths);
colorbar
Use labeledge to label the edges whose width is larger than 6 with the label, 'Large'. The
labelnode function works in a similar manner for labeling nodes.
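A sketch of both labeling functions on a small stand-in graph, so that the block runs on its own. The four-edge graph, its weights, and the node labels are hypothetical. The same calls apply to the subgraph H and plot handle p1 above.

```matlab
% Small stand-in graph with weights 10, 20, 30, and 70.
H = graph([1 1 2 3],[2 3 3 4],[10 20 30 70]);
H.Edges.LWidths = 7*H.Edges.Weight/max(H.Edges.Weight);
p1 = plot(H,'LineWidth',H.Edges.LWidths);

% Label every edge whose width exceeds 6.
idx = find(H.Edges.LWidths > 6);
labeledge(p1,idx,'Large')

% labelnode takes node IDs and labels in the same way.
labelnode(p1,[1 2],{'first' 'second'})
```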
Find the shortest path between node 11 and node 37 in the subgraph, H. Highlight the edges along
this path in red, and increase the size of the end nodes on the path.
path = shortestpath(H,11,37)
path = 1×10
11 12 17 18 19 24 25 30 36 37
highlight(p1,[11 37])
highlight(p1,path,'EdgeColor','r')
Remove the node labels and colorbar, and make all of the nodes black.
p1.NodeLabel = {};
colorbar off
p1.NodeColor = 'black';
Find a different shortest path that ignores the edge weights. Highlight this path in green.
path2 = shortestpath(H,11,37,'Method','unweighted')
path2 = 1×10
11 12 13 14 15 20 25 30 31 37
highlight(p1,path2,'EdgeColor','g')
It is common to create graphs that have hundreds of thousands, or even millions, of nodes and/or
edges. For this reason, plot treats large graphs slightly differently to maintain readability and
performance. The plot function makes these adjustments when working with graphs that have more
than 100 nodes:
See Also
graph | digraph | plot | GraphPlot
More About
• “Directed and Undirected Graphs”
• GraphPlot
• “Add Node Properties to Graph Plot Data Tips”
Visualize Breadth-First and Depth-First Search
This example shows how to define a function that visualizes the results of bfsearch and dfsearch
by highlighting the nodes and edges of a graph.
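The search in this example needs a graph G. A minimal sketch, assuming G is the directed graph given as the example input in the comments of visualize_search.m. The event count is a quick consistency check against the size of the table T.

```matlab
s = [1 2 3 3 3 3 4 5 6 7 8 9 9 9 10];
t = [7 6 1 5 6 8 2 4 4 3 7 1 6 8 2];
G = digraph(s,t);
plot(G)

% Depth-first search with all events, restarting at undiscovered nodes.
T = dfsearch(G,1,'allevents','Restart',true);
nEvents = height(T)
nStarts = nnz(T.Event == 'startnode')
```

Nodes 9 and 10 have no incoming edges, so the search must restart at each of them after the first pass from node 1.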
Perform a depth-first search on the graph. Specify 'allevents' to return all events in the
algorithm. Also, specify Restart as true to ensure that the search visits every node in the graph.
T = dfsearch(G, 1, 'allevents', 'Restart', true)
T =
38x4 table
The values in the table, T, are useful for visualizing the search. The function visualize_search.m
shows one way to use the results of searches performed with bfsearch and dfsearch to highlight
the nodes and edges in the graph according to the table of events, T. The function pauses before each
step in the algorithm, so you can slowly step through the search by pressing any key.
function visualize_search(G,t)
% G is a graph or digraph object, and t is a table resulting from a call to
% BFSEARCH or DFSEARCH on that graph.
%
% Example inputs: G = digraph([1 2 3 3 3 3 4 5 6 7 8 9 9 9 10], ...
%                             [7 6 1 5 6 8 2 4 4 3 7 1 6 8 2]);
%                 t = dfsearch(G, 1, 'allevents', 'Restart', true);

% The flag tested below must be defined first (reconstructed line).
isundirected = isa(G, 'graph');
if isundirected
    % Replace graph with corresponding digraph, because we need separate
    % edges for both directions
    [src, tgt] = findedge(G);
    G = digraph([src; tgt], [tgt; src], [1:numedges(G), 1:numedges(G)]);
end

% Create the plot handle h used below (reconstructed line; the all-gray
% starting colors are an assumption based on the description of the output).
h = plot(G,'NodeColor',[0.5 0.5 0.5],'EdgeColor',[0.5 0.5 0.5]);

for ii=1:size(t,1)
    switch t.Event(ii)
        case 'startnode'
            highlight(h,t.Node(ii),'MarkerSize',min(h.MarkerSize)*2);
        case 'discovernode'
            highlight(h,t.Node(ii),'NodeColor','r');
        case 'finishnode'
            highlight(h,t.Node(ii),'NodeColor','k');
        otherwise
            % Edge event: find the index of the edge to highlight
            if isundirected
                a = G.Edges.Weight;
                b = t.EdgeIndex(ii);
                edgeind = intersect(find(a == b),...
                    findedge(G,t.Edge(ii,1),t.Edge(ii,2)));
            else
                edgeind = t.EdgeIndex(ii);
            end
            switch t.Event(ii)
                case 'edgetonew'
                    highlight(h,'Edges',edgeind,'EdgeColor','b');
                case 'edgetodiscovered'
                    highlight(h,'Edges',edgeind,'EdgeColor',[0.8 0 0.8]);
                case 'edgetofinished'
                    highlight(h,'Edges',edgeind,'EdgeColor',[0 0.8 0]);
            end
    end

    % Convert node and edge IDs to text for the plot title
    nodeStr = t.Node;
    if isnumeric(nodeStr)
        nodeStr = num2cell(nodeStr);
        nodeStr = cellfun(@num2str, nodeStr, 'UniformOutput', false);
    end
    edgeStr = t.Edge;
    if isnumeric(edgeStr)
        edgeStr = num2cell(edgeStr);
        edgeStr = cellfun(@num2str, edgeStr, 'UniformOutput', false);
    end
    if ~isnan(t.Node(ii))
        title([char(t{ii, 1}) ' on Node ' nodeStr{ii}]);
    else
        title([char(t{ii, 1}) ' on Edge (' edgeStr{ii, 1} ', '...
            edgeStr{ii, 2},') with edge index ' sprintf('%d ', t{ii, 4})]);
    end

    % Wait for a key press before the next step (reconstructed line, based
    % on the description of the function's behavior).
    pause
end
disp('Done.')
close all
visualize_search(G,T)
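The graph begins as all gray, and then a new piece of the search result appears each time you press a
key. Based on the highlight calls in visualize_search.m, the search results are highlighted according to:
• 'startnode' : the marker size of the starting node is doubled.
• 'discovernode' : the node turns red.
• 'finishnode' : the node turns black.
• 'edgetonew' : the edge turns blue.
• 'edgetodiscovered' : the edge turns magenta.
• 'edgetofinished' : the edge turns green.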
This .gif animation shows what you see when you step through the results of
visualize_search.m.
See Also
bfsearch | dfsearch | graph | digraph
More About
• “Directed and Undirected Graphs”
Partition Graph with Laplacian Matrix
This example shows how to use the Laplacian matrix of a graph to compute the Fiedler vector. The
Fiedler vector can be used to partition the graph into two subgraphs.
Load Data
Load the data set barbellgraph.mat, which contains the sparse adjacency matrix and node
coordinates of a barbell graph.
load barbellgraph.mat
Plot Graph
G = graph(A,'omitselfloops');
p = plot(G,'XData',xy(:,1),'YData',xy(:,2),'Marker','.');
axis equal
Calculate the Laplacian matrix of the graph. Then, calculate the two smallest magnitude eigenvalues
and corresponding eigenvectors using eigs.
L = laplacian(G);
[V,D] = eigs(L,2,'smallestabs');
The Fiedler vector is the eigenvector corresponding to the second smallest eigenvalue of the graph Laplacian.
The smallest eigenvalue is zero, indicating that the graph has one connected component. In this case,
the second column in V corresponds to the second smallest eigenvalue D(2,2).
D = 2×2
10^(-3) ×
    0.0000         0
         0    0.2873
w = V(:,2);
Finding the Fiedler vector using eigs is scalable to larger graphs, since only a subset of the
eigenvalues and eigenvectors are computed, but for smaller graphs it is equally feasible to convert
the Laplacian matrix to full storage and use eig(full(L)).
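A sketch of the full-storage approach on a small stand-in graph, two triangles joined by a single edge, since barbellgraph.mat is not needed to show the idea. The eigenvalues are sorted explicitly rather than relying on the order in which eig returns them.

```matlab
% Two triangles (1-2-3 and 4-5-6) joined by the edge (3,4).
s = [1 1 2 3 4 4 5];
t = [2 3 3 4 5 6 6];
G = graph(s,t);

L = laplacian(G);
[V,D] = eig(full(L));
[~,order] = sort(diag(D));      % ascending eigenvalues
w = V(:,order(2));              % Fiedler vector

A_nodes = find(w >= 0)';        % sign cut: one side
B_nodes = find(w < 0)';         % sign cut: other side
```

The sign of an eigenvector is arbitrary, so either triangle can end up as the nonnegative side.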
Partition Graph
Partition the graph into two subgraphs using the Fiedler vector w. A node is assigned to subgraph A if
it has a positive value in w. Otherwise, the node is assigned to subgraph B. This practice is called a
sign cut or zero threshold cut. The sign cut minimizes the weight of the cut, subject to the upper and
lower bounds on the weight of any nontrivial cut of the graph.
Partition the graph using the sign cut. Highlight the subgraph of nodes with w>=0 in red, and the
nodes with w<0 in black.
highlight(p,find(w>=0),'NodeColor','r') % subgraph A
highlight(p,find(w<0),'NodeColor','k') % subgraph B
For the barbell graph, this partition bisects the graph nicely into two equal sets of nodes. However,
the sign cut does not always produce a balanced cut.
It is always possible to bisect a graph by calculating the median of w and using it as a threshold value.
This partition is called the median cut, and it guarantees an equal number of nodes in each subgraph.
You can use the median cut by first shifting the values in w by the median:
w_med = w - median(w);
Then, partition the graph by sign in w_med. For the barbell graph, the median of w is close to zero, so
the two cuts produce similar bisections.
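A sketch of the median cut on a small asymmetric stand-in graph: a triangle with a three-node tail, chosen as an illustration. With an even number of nodes and distinct middle values in w, the cut puts half of the nodes on each side.

```matlab
% Triangle 1-2-3 with the tail 3-4-5-6.
G = graph([1 1 2 3 4 5],[2 3 3 4 5 6]);

[V,D] = eig(full(laplacian(G)));
[~,order] = sort(diag(D));
w = V(:,order(2));              % Fiedler vector

w_med = w - median(w);          % shift by the median
nA = nnz(w_med >= 0)            % nodes in subgraph A
nB = nnz(w_med < 0)             % nodes in subgraph B
```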
See Also
graph | digraph | laplacian | subgraph
More About
• “Directed and Undirected Graphs”
Add Node Properties to Graph Plot Data Tips
This example shows how to customize GraphPlot data tips to display extra node properties of a
graph.
Create a GraphPlot graphics object for a random directed graph. Add an extra node property wifi
to the graph.
rng default
G = digraph(sprandn(20, 20, 0.05));
G.Nodes.wifi = randi([0 1], 20, 1) == 1;
h = plot(G);
Add a data tip to the graph. The data tip enables you to select nodes in the graph plot and view
properties of the nodes.
dt = datatip(h,4,3);
By default, the data tips for an undirected graph display the node number and degree. For directed
graphs, the display includes the node number, in-degree, and out-degree.
You can customize the display of data tips for graphics objects by adding, editing, or removing rows
of data from the appropriate object properties. For this GraphPlot object:
Change the label for the Node row in the data tip so that it displays as "City".
h.DataTipTemplate.DataTipRows(1).Label = "City";
The dataTipTextRow function creates a new row of data as an object that can be inserted into the
DataTipRows property. Use dataTipTextRow to create a new row of data for the data tip labeled
"WiFi" that references the values in the G.Nodes.wifi property of the graph. Add this data tip row
to the DataTipRows property as the last row.
row = dataTipTextRow('WiFi',G.Nodes.wifi);
h.DataTipTemplate.DataTipRows(end+1) = row;
The data tip display now includes a Wi-Fi® value for each node.
To remove rows of data from the data tip, you can index into the DataTipRows property and assign
the rows an empty matrix []. This is the same method you might use to delete rows or columns from
a matrix.
Delete the in-degree and out-degree rows from the data tip. Since these appear as the second and
third rows in the data tip display, they correspond to the second and third rows of the DataTipRows
property.
h.DataTipTemplate.DataTipRows(2:3) = [];
The data tip display now only displays the city number and Wi-Fi status.
See Also
datatip | graph | digraph | DataTipTemplate Properties
More About
• “Create Custom Data Tips”
• “Interactively Explore Plotted Data”
Build Watts-Strogatz Small World Graph Model
This example shows how to construct and analyze a Watts-Strogatz small-world graph. The Watts-
Strogatz model is a random graph that has small-world network properties, such as clustering and
short average path length.
Algorithm Description
1 Create a ring lattice with $N$ nodes of mean degree $2K$. Each node is connected to its $K$ nearest
neighbors on either side.
2 For each edge in the graph, rewire the target node with probability $\beta$. The rewired edge cannot
be a duplicate or self-loop.
After the first step the graph is a perfect ring lattice. So when $\beta = 0$, no edges are rewired and the
model returns a ring lattice. In contrast, when $\beta = 1$, all of the edges are rewired and the ring lattice
is transformed into a random graph.
The file WattsStrogatz.m implements this graph algorithm for undirected graphs. The input
parameters are N, K, and beta according to the algorithm description above.
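function h = WattsStrogatz(N,K,beta)
% H = WattsStrogatz(N,K,beta) returns a Watts-Strogatz model graph with N
% nodes, N*K edges, mean node degree 2*K, and rewiring probability beta.
%
% beta = 0 is a ring lattice, and beta = 1 is a random graph.

% Connect each node to its K next and previous neighbors. This constructs
% indices for a ring lattice.
s = repelem((1:N)',1,K);
t = s + repmat(1:K,N,1);
t = mod(t-1,N)+1;

% Rewire the target node of each edge with probability beta. This loop is a
% sketch of step 2 of the algorithm description (assumed implementation).
for source = 1:N
    switchEdge = rand(K,1) < beta;            % edges of this source to rewire
    newTargets = rand(N,1);                   % random priority per candidate
    newTargets(source) = 0;                   % exclude self-loops
    newTargets(s(t==source)) = 0;             % exclude nodes already linked to source
    newTargets(t(source,~switchEdge)) = 0;    % exclude targets of kept edges
    [~,ind] = sort(newTargets,'descend');
    t(source,switchEdge) = ind(1:nnz(switchEdge));
end

h = graph(s,t);
end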
Ring Lattice
Construct a ring lattice with 500 nodes using the WattsStrogatz function. When beta is 0, the
function returns a ring lattice whose nodes all have degree 2K.
h = WattsStrogatz(500,25,0);
plot(h,'NodeColor','k','Layout','circle');
title('Watts-Strogatz Graph with $N = 500$ nodes, $K = 25$, and $\beta = 0$', ...
'Interpreter','latex')
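As a quick check of the ring lattice claim, this sketch repeats the lattice construction inline for small illustrative values of N and K and confirms the degree and edge counts. The values N = 10 and K = 2 are assumptions chosen to keep the example small.

```matlab
N = 10;
K = 2;

% Same index construction as the ring lattice step of WattsStrogatz.
s = repelem((1:N)',1,K);
t = mod(s + repmat(1:K,N,1) - 1,N) + 1;
h = graph(s(:),t(:));

allDegree2K = all(degree(h) == 2*K)
nEdges = numedges(h)            % N*K edges
```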
Increase the amount of randomness in the graph by raising beta to 0.15 and 0.50.
h2 = WattsStrogatz(500,25,0.15);
plot(h2,'NodeColor','k','EdgeAlpha',0.1);
title('Watts-Strogatz Graph with $N = 500$ nodes, $K = 25$, and $\beta = 0.15$', ...
'Interpreter','latex')
h3 = WattsStrogatz(500,25,0.50);
plot(h3,'NodeColor','k','EdgeAlpha',0.1);
title('Watts-Strogatz Graph with $N = 500$ nodes, $K = 25$, and $\beta = 0.50$', ...
'Interpreter','latex')
Random Graph
Generate a completely random graph by increasing beta to its maximum value of 1.0. This rewires
all of the edges.
h4 = WattsStrogatz(500,25,1);
plot(h4,'NodeColor','k','EdgeAlpha',0.1);
title('Watts-Strogatz Graph with $N = 500$ nodes, $K = 25$, and $\beta = 1$', ...
'Interpreter','latex')
Degree Distribution
The degree distribution of the nodes in the different Watts-Strogatz graphs varies. When beta is 0,
the nodes all have the same degree, 2K, so the degree distribution is just a Dirac-delta function
centered on 2K. However, as beta increases, the degree distribution changes.
This plot shows the degree distributions for the nonzero values of beta.
histogram(degree(h2),'BinMethod','integers','FaceAlpha',0.9);
hold on
histogram(degree(h3),'BinMethod','integers','FaceAlpha',0.9);
histogram(degree(h4),'BinMethod','integers','FaceAlpha',0.8);
hold off
title('Node degree distributions for Watts-Strogatz Model Graphs')
xlabel('Degree of node')
ylabel('Number of nodes')
legend('\beta = 0.15','\beta = 0.50','\beta = 1.0','Location','NorthWest')
Hub Formation
The Watts-Strogatz graph has a high clustering coefficient, so the nodes tend to form cliques, or small
groups of closely interconnected nodes. As beta increases towards its maximum value of 1.0, you
see an increasingly large number of hub nodes, or nodes of high relative degree. The hubs are a
common connection between other nodes and between cliques in the graph. The existence of hubs is
what permits the formation of cliques while preserving a short average path length.
Calculate the average path length and number of hub nodes for each value of beta. For the purposes
of this example, the hub nodes are nodes with degree greater than or equal to 55. These are all of the
nodes whose degree increased 10% or more compared to the original ring lattice.
n = 55;
d = [mean(mean(distances(h))), nnz(degree(h)>=n); ...
mean(mean(distances(h2))), nnz(degree(h2)>=n); ...
mean(mean(distances(h3))), nnz(degree(h3)>=n);
mean(mean(distances(h4))), nnz(degree(h4)>=n)];
T = table([0 0.15 0.50 1]', d(:,1), d(:,2),...
'VariableNames',{'Beta','AvgPathLength','NumberOfHubs'})
T =
  4x3 table
    Beta    AvgPathLength    NumberOfHubs
    ____    _____________    ____________
       0         5.48              0
    0.15       2.0715             20
     0.5       1.9101             85
       1       1.9008             92
As beta increases, the average path length in the graph quickly falls to its limiting value. This is due
to the formation of the highly connected hub nodes, which become more numerous as beta
increases.
Plot the Watts-Strogatz model graph, making the size and color of each node proportional to
its degree. This is an effective way to visualize the formation of hubs.
colormap hsv
deg = degree(h2);
nSizes = 2*sqrt(deg-min(deg)+0.2);
nColors = deg;
plot(h2,'MarkerSize',nSizes,'NodeCData',nColors,'EdgeAlpha',0.1)
title('Watts-Strogatz Graph with $N = 500$ nodes, $K = 25$, and $\beta = 0.15$', ...
'Interpreter','latex')
colorbar
See Also
digraph | graph
Use PageRank Algorithm to Rank Websites
This example shows how to use a PageRank algorithm to rank a collection of websites. Although the
PageRank algorithm was originally designed to rank search engine results, it also can be more
broadly applied to the nodes in many different types of graphs. The PageRank score gives an idea of
the relative importance of each graph node based on how it is connected to the other nodes.
Theoretically, the PageRank score is the limiting probability that someone randomly clicking links on
each website will arrive at any particular page. So pages with a high score are highly connected and
discoverable within the network, and it is more likely a random web surfer will visit that page.
Algorithm Description
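At each step in the PageRank algorithm, the score of each page is updated according to
r = (1-P)/n + P*(A'*(r./d) + s/n);
where r is the vector of PageRank scores, P is the scalar damping factor, A' is the transpose of the
adjacency matrix of the graph, d is the vector of node out-degrees (set to 1 for nodes with no outgoing
edges), n is the number of nodes, and s is the sum of the scores of the nodes with no outgoing edges.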
In other words, the rank of each page is largely based on the ranks of the pages that link to it. The
term A'*(r./d) picks out the scores of the source nodes that link to each node in the graph, and the
scores are normalized by the total number of outbound links of those source nodes. This ensures that
the sum of the PageRank scores is always 1. For example, if node 2 links to nodes 1, 3, and 4, then it
transfers 1/3 of its PageRank score to each of those nodes during each iteration of the algorithm.
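The update can be sketched as a plain fixed-point iteration. This is an illustration under stated assumptions, not the implementation used by centrality. It assumes the common damped form in which the score of nodes without outgoing links is spread evenly over all nodes, and it uses the six-node example graph from this section. The names dangling and sd and the fixed count of 100 iterations are choices made for this sketch.

```matlab
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
G = digraph(s,t);

A = adjacency(G);
n = numnodes(G);
P = 0.85;                          % damping factor (follow probability)

d = full(sum(A,2));                % out-degree of each node
dangling = (d == 0);               % nodes with no outgoing links
d(dangling) = 1;                   % avoid division by zero

r = ones(n,1)/n;                   % start from a uniform score
for k = 1:100
    sd = sum(r(dangling));         % score held by dangling nodes
    r = (1-P)/n + P*(A'*(r./d) + sd/n);
end

total = sum(r)                     % scores sum to 1
pr = centrality(G,'pagerank','FollowProbability',0.85);
maxDiff = max(abs(r - pr))         % compare with the built-in result
```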
Create a graph that illustrates how each node confers its PageRank score to the other nodes in the
graph.
Create and plot a directed graph containing six nodes representing fictitious websites.
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
names = {'http://www.example.com/alpha', 'http://www.example.com/beta', ...
'http://www.example.com/gamma', 'http://www.example.com/delta', ...
'http://www.example.com/epsilon', 'http://www.example.com/zeta'};
G = digraph(s,t,[],names)
G =
digraph with properties:
plot(G,'Layout','layered', ...
'NodeLabel',{'alpha','beta','gamma','delta','epsilon','zeta'})
Calculate the PageRank centrality score for this graph. Use a follow probability (otherwise known as
a damping factor) of 0.85.
pr = centrality(G,'pagerank','FollowProbability',0.85)
pr = 6×1
0.3210
0.1706
0.1066
0.1368
0.2008
0.0643
View the PageRank scores and degree information for each page.
G.Nodes.PageRank = pr;
G.Nodes.InDegree = indegree(G);
G.Nodes.OutDegree = outdegree(G);
G.Nodes
ans=6×4 table
Name PageRank InDegree OutDegree
__________________________________ ________ ________ _________
{'http://www.example.com/alpha' } 0.32098 2 2
{'http://www.example.com/beta' } 0.17057 1 2
{'http://www.example.com/gamma' } 0.10657 1 3
{'http://www.example.com/delta' } 0.13678 2 1
{'http://www.example.com/epsilon'} 0.20078 2 1
{'http://www.example.com/zeta' } 0.06432 1 0
The results show that it is not just the number of page links that determines the score, but also the
quality. The alpha and gamma websites both have a total degree of 4; however, alpha is linked to by
both delta and epsilon, and alpha in turn links to epsilon and beta, which also are highly ranked.
gamma is only linked to by one page, beta, which is in the middle of the list. Thus, alpha is scored
higher than gamma by the algorithm.
Load the data in mathworks100.mat and view the adjacency matrix, A. This data was generated in
2015 using an automatic page crawler. The page crawler began at https://www.mathworks.com
and followed links to subsequent web pages until the adjacency matrix contained information on the
connections of 100 unique web pages.
load mathworks100.mat
spy(A)
Create a directed graph with the sparse adjacency matrix, A, using the URLs contained in U as node
names.
G = digraph(A,U)
G =
digraph with properties:
Compute the PageRank scores for the graph, G, using at most 200 iterations and a damping factor of 0.85.
Add the scores and degree information to the nodes table of the graph.
pr = centrality(G,'pagerank','MaxIterations',200,'FollowProbability',0.85);
G.Nodes.PageRank = pr;
G.Nodes.InDegree = indegree(G);
G.Nodes.OutDegree = outdegree(G);
G.Nodes(1:25,:)
ans=25×4 table
Name PageRank
______________________________________________________________________________ ________
{'https://www.mathworks.com' } 0.044342
{'https://ch.mathworks.com' } 0.043085
{'https://cn.mathworks.com' } 0.043085
{'https://jp.mathworks.com' } 0.043085
{'https://kr.mathworks.com' } 0.043085
{'https://uk.mathworks.com' } 0.043085
{'https://au.mathworks.com' } 0.043085
{'https://de.mathworks.com' } 0.043085
{'https://es.mathworks.com' } 0.043085
{'https://fr.mathworks.com' } 0.043085
{'https://in.mathworks.com' } 0.043085
{'https://it.mathworks.com' } 0.043085
{'https://nl.mathworks.com' } 0.043085
{'https://se.mathworks.com' } 0.043085
{'https://www.mathworks.com/index.html%3Fnocookie%3Dtrue' } 0.0015
{'https://www.mathworks.com/company/aboutus/policies_statements/patents.html'} 0.007714
⋮
Extract and plot a subgraph containing all nodes whose score is greater than 0.005. Color the graph
nodes based on their PageRank score.
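A minimal sketch of this kind of extraction, shown on the small six-node graph from earlier rather than on the mathworks100 data. The threshold of 0.1 and the variable names keep and H are assumptions chosen for the small graph.

```matlab
% Sketch: keep only nodes whose PageRank exceeds a threshold, then plot.
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
G = digraph(s,t);
G.Nodes.PageRank = centrality(G,'pagerank','FollowProbability',0.85);
keep = find(G.Nodes.PageRank > 0.1);    % assumed threshold for this example
H = subgraph(G,keep);                   % node variables carry over to H
plot(H,'NodeCData',H.Nodes.PageRank,'Layout','force')
colorbar
```

With the scores listed earlier, only the lowest-ranked node falls below this threshold.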
The PageRank scores for the top websites are all quite similar, such that a random web surfer has
roughly a 4.3% to 4.4% chance of landing on each page. This small group of highly connected pages forms a clique
in the center of the plot. Connected to this central clique are several smaller cliques, which are highly
connected among themselves.
References
Moler, C. Experiments with MATLAB. Chapter 7: Google PageRank. MathWorks, Inc., 2011.
See Also
digraph | graph | centrality
Label Graph Nodes and Edges
This example shows how to add and customize labels on graph nodes and edges.
Create a graph representing the gridded streets and intersections in a city. Add weights to the edges
so that the main avenues and cross streets appear differently in the plot. Plot the graph with the edge
line widths proportional to the weight of the edge.
s = [1 1 2 2 3 4 4 5 5 6 7 7 8 8 9 10 11];
t = [2 4 3 5 6 5 7 6 8 9 8 10 9 11 12 11 12];
weights = [1 5 1 5 5 1 5 1 5 5 1 5 1 5 5 1 1];
G = graph(s,t,weights);
P = plot(G,'LineWidth',G.Edges.Weight);
For graphs with 100 or fewer nodes, MATLAB® automatically labels the nodes using the numeric
node indices or node names (larger graphs omit these labels by default). However, you can change
the node labels by adjusting the NodeLabel property of the GraphPlot object P or by using the
labelnode function. Therefore, even if the nodes have names, you can use labels that are different
from the names.
Remove the default numeric node labels. Label one of the intersections as Home and another as Work.
labelnode(P,1:12,'')
labelnode(P,5,'Home')
labelnode(P,12,'Work')
The edges in a plotted graph are not labeled automatically. You can add edge labels by changing the
value of the EdgeLabel property of the GraphPlot object P or by using the labeledge function.
Add edge labels for streets in New York City. The order of the edges is defined in the G.Edges table
of the graph, so the order of the labels you specify must respect that order. It is convenient to store
edge labels directly in the G.Edges table, so that the edge name lives right next to the other edge
information.
G.Edges
ans=17×2 table
EndNodes Weight
________ ______
1 2 1
1 4 5
2 3 1
2 5 5
3 6 5
4 5 1
4 7 5
5 6 1
5 8 5
6 9 5
7 8 1
7 10 5
8 9 1
8 11 5
9 12 5
10 11 1
⋮
This example has 17 edges but only 7 unique street names. Therefore, it makes sense to define the
street names in a cell array and then index into the cell array to retrieve the desired street name for
each edge. Add a variable to the G.Edges table containing the street names.
streets = {'8th Ave' '7th Ave' '6th Ave' '5th Ave' ...
'W 20th St' 'W 21st St' 'W 22nd St'}';
inds = [1 5 1 6 7 2 5 2 6 7 3 5 3 6 7 4 4];
G.Edges.StreetName = streets(inds);
G.Edges
ans=17×3 table
EndNodes Weight StreetName
________ ______ _____________
1 2 1 {'8th Ave' }
1 4 5 {'W 20th St'}
2 3 1 {'8th Ave' }
2 5 5 {'W 21st St'}
3 6 5 {'W 22nd St'}
4 5 1 {'7th Ave' }
4 7 5 {'W 20th St'}
5 6 1 {'7th Ave' }
5 8 5 {'W 21st St'}
6 9 5 {'W 22nd St'}
7 8 1 {'6th Ave' }
7 10 5 {'W 20th St'}
8 9 1 {'6th Ave' }
8 11 5 {'W 21st St'}
9 12 5 {'W 22nd St'}
10 11 1 {'5th Ave' }
⋮
P.EdgeLabel = G.Edges.StreetName;
The node and edge labels in a graph plot have their own properties that control the appearance and
style of the labels. Since the properties are decoupled, you can use different styles for the node labels
and the edge labels.
• NodeLabel
• NodeLabelColor
• NodeFontName
• NodeFontSize
• NodeFontWeight
• NodeFontAngle
• EdgeLabel
• EdgeLabelColor
• EdgeFontName
• EdgeFontSize
• EdgeFontWeight
• EdgeFontAngle
Use these properties to adjust the fonts in this example with New York City streets:
• Change NodeFontSize and NodeLabelColor so that the intersection labels are 12 pt. font and
red.
• Change EdgeFontWeight, EdgeFontAngle, and EdgeFontSize to use a larger, bold font for
streets in one direction and a smaller, italic font for streets in the other direction.
• Change EdgeFontName to use Times New Roman for the edge labels.
You can use the highlight function to change the graph properties of a subset of the graph edges.
Create a logical index isAvenue that is true for edge labels containing the word 'Ave'. Using this
logical vector as an input to highlight, label all of the Avenues in one way, and all of the non-
Avenues another way.
P.NodeFontSize = 12;
P.NodeLabelColor = 'r';
isAvenue = contains(P.EdgeLabel, 'Ave');
highlight(P, 'Edges', isAvenue, 'EdgeFontAngle', 'italic', 'EdgeFontSize', 7);
highlight(P, 'Edges', ~isAvenue, 'EdgeFontWeight', 'bold', 'EdgeFontSize', 10);
P.EdgeFontName = 'Times New Roman';
Highlight Edges
Find the shortest path between the Home and Work nodes and examine which streets are on the
path. Highlight the nodes and edges on the path in red and remove the edge labels for all edges that
are not on the path.
[path,d,pathEdges] = shortestpath(G,5,12)
path = 1×4
5 6 9 12
d = 11
pathEdges = 1×3
8 10 15
G.Edges.StreetName(pathEdges,:)
highlight(P,'Edges',pathEdges,'EdgeColor','r')
highlight(P,path,'NodeColor','r')
labeledge(P, setdiff(1:numedges(G), pathEdges), '')
See Also
GraphPlot
More About
• “Graph Plotting and Customization” on page 5-17
• “Add Node Properties to Graph Plot Data Tips” on page 5-36
6
Functions of One Variable
Create and Evaluate Polynomials
This example shows how to represent a polynomial as a vector in MATLAB® and evaluate the
polynomial at points of interest.
Representing Polynomials
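MATLAB represents polynomials as row vectors containing coefficients ordered by descending powers. For example, the three-element vector
p = [p2 p1 p0];
represents the polynomial $p(x) = p_2 x^2 + p_1 x + p_0$. Create a vector to represent the quadratic polynomial $p(x) = x^2 - 4x + 4$.
p = [1 -4 4];
Intermediate terms of the polynomial that have a coefficient of 0 must also be entered into the vector,
since the 0 acts as a placeholder for that particular power of x. For example, create a vector to represent the polynomial $p(x) = 4x^5 - 3x^2 + 2x + 33$.
p = [4 0 0 -3 2 33];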
Evaluating Polynomials
After entering the polynomial into MATLAB® as a vector, use the polyval function to evaluate the
polynomial at a specific value.
polyval(p,2)
ans = 153
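The same value can be reproduced by hand with Horner's scheme, which makes the role of the placeholder zeros visible. This is an illustrative sketch; the loop variable names are arbitrary and nothing here is claimed about how polyval is implemented internally.

```matlab
% Sketch: evaluate 4x^5 - 3x^2 + 2x + 33 at x = 2 with Horner's scheme.
p = [4 0 0 -3 2 33];
x = 2;
y = p(1);
for k = 2:numel(p)
    y = y*x + p(k);     % multiply by x, then add the next lower-order coefficient
end
% y is 153, the same value that polyval(p,2) returns
```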
Alternatively, you can evaluate a polynomial in a matrix sense using polyvalm. The polynomial
expression in one variable, $p(x) = 4x^5 - 3x^2 + 2x + 33$, becomes the matrix expression
$p(X) = 4X^5 - 3X^2 + 2X + 33I$,
where X is a square matrix and I is the identity matrix.
X = [2 4 5; -1 0 3; 7 1 5];
Y = polyvalm(p,X)
Y = 3×3
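As a cross-check, the matrix expression can be written out with matrix powers and compared against polyvalm. This sketch also contrasts it with polyval, which evaluates the polynomial element by element; the names Y1, Y2, Z, and relErr are illustrative.

```matlab
% Sketch: polyvalm versus the explicit matrix expression.
p = [4 0 0 -3 2 33];
X = [2 4 5; -1 0 3; 7 1 5];
Y1 = polyvalm(p,X);                          % matrix sense
Y2 = 4*X^5 - 3*X^2 + 2*X + 33*eye(3);        % same expression written out
relErr = norm(Y1 - Y2,'fro') / norm(Y2,'fro');
Z = polyval(p,X);                            % element-wise sense, a different result
```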
See Also
polyval | polyvalm | poly | roots
More About
• “Roots of Polynomials” on page 6-4
• “Integrate and Differentiate Polynomials” on page 6-9
• “Polynomial Curve Fitting” on page 6-11
Roots of Polynomials
This example shows several different methods to calculate the roots of a polynomial.
In this section...
“Numeric Roots” on page 6-4
“Roots Using Substitution” on page 6-4
“Roots in a Specific Interval” on page 6-5
“Symbolic Roots” on page 6-7
Numeric Roots
The roots function calculates the roots of a single-variable polynomial represented by a vector of
coefficients.
For example, create a vector to represent the polynomial $x^2 - x - 6$, then calculate the roots.
p = [1 -1 -6];
r = roots(p)
r =
3
-2
The poly function converts the roots back to polynomial coefficients. When operating on vectors,
poly and roots are inverse functions, such that poly(roots(p)) returns p (up to roundoff error,
ordering, and scaling).
p2 = poly(r)
p2 =
1 -1 -6
When operating on a matrix, the poly function computes the characteristic polynomial of the matrix.
The roots of the characteristic polynomial are the eigenvalues of the matrix. Therefore,
roots(poly(A)) and eig(A) return the same answer (up to roundoff error, ordering, and scaling).
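A small sketch of this relationship, using an arbitrary 2-by-2 matrix chosen so that the eigenvalues (2 and 5) are easy to check by hand. Both results are sorted before comparison because the two functions need not return values in the same order.

```matlab
% Sketch: roots of the characteristic polynomial versus eig.
A = [4 1; 2 3];
c = poly(A);              % characteristic polynomial, x^2 - 7x + 10
r = sort(roots(c));
e = sort(eig(A));
gap = norm(r - e);        % small, up to roundoff
```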
Roots Using Substitution
You can solve polynomial equations involving trigonometric functions by simplifying the equation
using a substitution. The resulting polynomial of one variable no longer contains any trigonometric
functions.
For example, find the values of θ that solve the equation
$3\cos^2(\theta) - \sin(\theta) + 3 = 0$.
Use the fact that $\cos^2(\theta) = 1 - \sin^2(\theta)$ to express the equation entirely in terms of sine functions:
$-3\sin^2(\theta) - \sin(\theta) + 6 = 0$.
Use the substitution $x = \sin(\theta)$ to express the equation as a simple polynomial equation:
$-3x^2 - x + 6 = 0$.
Create a vector to represent the polynomial, then find its roots.
p = [-3 -1 6];
r = roots(p)
r = 2×1
-1.5907
1.2573
To undo the substitution, use $\theta = \sin^{-1}(x)$. The asin function calculates the inverse sine.
theta = asin(r)
-1.5708 + 1.0395i
1.5708 - 0.7028i
Verify that the elements in theta are the values of θ that solve the original equation (within roundoff
error).
f = @(Z) 3*cos(Z).^2 - sin(Z) + 3;
f(theta)
-0.0888 + 0.0647i
0.2665 + 0.0399i
Roots in a Specific Interval
Use the fzero function to find the roots of a polynomial in a specific interval. Among other uses, this
method is suitable if you plot the polynomial and want to know the value of a particular root.
For example, create a function handle to represent the polynomial $3x^7 + 4x^6 + 2x^5 + 4x^4 + x^3 + 5x^2$.
p = @(x) 3*x.^7 + 4*x.^6 + 2*x.^5 + 4*x.^4 + x.^3 + 5*x.^2;
x = -2:0.1:1;
plot(x,p(x))
ylim([-100 50])
grid on
hold on
From the plot, the polynomial has a trivial root at 0 and another near -1.5. Use fzero to calculate
and plot the root that is near -1.5.
Z = fzero(p, -1.5)
Z = -1.6056
plot(Z,p(Z),'r*')
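If you prefer to restrict the search to an interval rather than start from a single point, you can pass fzero a two-element bracket whose endpoints give function values of opposite sign. The bracket [-2 -1] below is an assumption read off the plot; the sketch checks the sign change before calling fzero.

```matlab
% Sketch: find the same root by bracketing it.
p = @(x) 3*x.^7 + 4*x.^6 + 2*x.^5 + 4*x.^4 + x.^3 + 5*x.^2;
bracket = [-2 -1];                % assumed interval containing the root
signs = sign(p(bracket));         % p(-2) = -116 and p(-1) = 7, so the signs differ
Z = fzero(p,bracket);
```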
Symbolic Roots
If you have Symbolic Math Toolbox™, then there are additional options for evaluating polynomials
symbolically. One way is to use the solve function.
syms x
s = solve(x^2-x-6)
s =
-2
3
Another way is to use the factor function to factor the polynomial terms.
F = factor(x^2-x-6)
F =
[ x + 2, x - 3]
See “Solve Algebraic Equations” (Symbolic Math Toolbox) for more information.
See Also
roots | poly | eig
More About
• “Create and Evaluate Polynomials” on page 6-2
• “Roots of Scalar Functions” on page 6-18
• “Integrate and Differentiate Polynomials” on page 6-9
Integrate and Differentiate Polynomials
This example shows how to use the polyint and polyder functions to analytically integrate or
differentiate any polynomial represented by a vector of coefficients.
Use polyder to obtain the derivative of the polynomial $p(x) = x^3 - 2x - 5$. The resulting polynomial is
$q(x) = \frac{d}{dx}p(x) = 3x^2 - 2$.
p = [1 0 -2 -5];
q = polyder(p)
q = 1×3
3 0 -2
Similarly, use polyint to integrate the polynomial $p(x) = 4x^3 - 3x^2 + 1$. The resulting polynomial is
$q(x) = \int p(x)\,dx = x^4 - x^3 + x$.
p = [4 -3 0 1];
q = polyint(p)
q = 1×5
1 -1 0 1 0
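A quick way to sanity-check an integration result is to differentiate it again. The sketch below does that, and also shows the optional second argument of polyint, which sets the constant of integration (the value 3 is arbitrary).

```matlab
% Sketch: differentiation undoes integration for polynomials.
p = [4 -3 0 1];
q = polyint(p);         % [1 -1 0 1 0], constant of integration 0
pBack = polyder(q);     % recovers p
q3 = polyint(p,3);      % same integral with constant of integration 3
```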
polyder also computes the derivative of the product or quotient of two polynomials. For example,
create two vectors to represent the polynomials $a(x) = x^2 + 3x + 5$ and $b(x) = 2x^2 + 4x + 6$.
a = [1 3 5];
b = [2 4 6];
Calculate the derivative $\frac{d}{dx}\left[a(x)b(x)\right]$ by calling polyder with a single output argument.
c = polyder(a,b)
c = 1×4
8 30 56 38
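The product and quotient forms can be cross-checked with conv, which multiplies polynomials. This sketch assumes nothing beyond the product and quotient rules; qManual carries a leading zero that polyder strips from its own output.

```matlab
% Sketch: verify the product and quotient derivatives with conv.
a = [1 3 5];
b = [2 4 6];
c1 = polyder(a,b);                 % derivative of the product
c2 = polyder(conv(a,b));           % multiply first, then differentiate
[q,d] = polyder(a,b);              % numerator and denominator of the quotient derivative
qManual = conv(polyder(a),b) - conv(a,polyder(b));   % a'b - ab', equals [0 -2 -8 -2]
dManual = conv(b,b);               % b squared
```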
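Calculate the derivative $\frac{d}{dx}\left[\frac{a(x)}{b(x)}\right]$ by calling polyder with two output arguments. The resulting
polynomial is
$\frac{q(x)}{d(x)} = \frac{-2x^2 - 8x - 2}{4x^4 + 16x^3 + 40x^2 + 48x + 36}$.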
[q,d] = polyder(a,b)
q = 1×3
-2 -8 -2
d = 1×5
4 16 40 48 36
See Also
polyder | polyint | conv | deconv
More About
• “Analytic Solution to Integral of Polynomial” on page 15-7
• “Create and Evaluate Polynomials” on page 6-2
Polynomial Curve Fitting
This example shows how to fit a polynomial curve to a set of data points using the polyfit function.
You can use polyfit to find the coefficients of a polynomial that fits a set of data in a least-squares
sense using the syntax
p = polyfit(x,y,n),
where:
• x and y are vectors containing the x and y coordinates of the data points
• n is the degree of the polynomial to fit
x = [1 2 3 4 5];
y = [5.5 43.1 128 290.7 498.4];
Use polyfit to find a third-degree polynomial that approximately fits the data.
p = polyfit(x,y,3)
p = 1×4
After you obtain the polynomial for the fit line using polyfit, you can use polyval to evaluate the
polynomial at other points that might not have been included in the original data.
Compute the values of the polyfit estimate over a finer domain and plot the estimate over the real
data values for comparison. Include an annotation of the equation for the fit line.
x2 = 1:.1:5;
y2 = polyval(p,x2);
plot(x,y,'o',x2,y2)
grid on
s = sprintf('y = (%.1f) x^3 + (%.1f) x^2 + (%.1f) x + (%.1f)',p(1),p(2),p(3),p(4));
text(2,400,s)
See Also
polyfit | polyval
More About
• “Programmatic Fitting”
• “Create and Evaluate Polynomials” on page 6-2
Predicting the US Population
This example shows that extrapolating data using polynomials of even modest degree is risky and
unreliable.
This example is older than MATLAB®. It started as an exercise in Computer Methods for
Mathematical Computations, by Forsythe, Malcolm and Moler, published by Prentice-Hall in 1977.
Now, MATLAB makes it much easier to vary the parameters and see the results, but the underlying
mathematical principles are unchanged.
Create and plot two vectors with US Census data from 1910 to 2000.
% Time interval
t = (1910:10:2000)';
% Population
p = [91.972 105.711 123.203 131.669 150.697...
179.323 203.212 226.505 249.633 281.422]';
% Plot
plot(t,p,'bo');
axis([1910 2020 0 400]);
title('Population of the U.S. 1910-2000');
ylabel('Millions');
p = 10×1
91.9720
105.7110
123.2030
131.6690
150.6970
179.3230
203.2120
226.5050
249.6330
281.4220
Fit the data with a polynomial in t and use it to extrapolate the population at t = 2010. Obtain the
coefficients in the polynomial by solving a linear system of equations involving a 10-by-10
Vandermonde matrix (one row per census value), with elements as powers of scaled time, A(i,j) = s(i)^(n-j).
n = length(t);
s = (t-1950)/50;
A = zeros(n);
A(:,end) = 1;
for j = n-1:-1:1
A(:,j) = s .* A(:,j+1);
end
Obtain the coefficients c for a polynomial of degree d that fits the data p by solving a linear system of
equations involving the last d+1 columns of the Vandermonde matrix:
A(:,n-d:n)*c ~= p
• If d < 9, then more equations than unknowns exist, and a least-squares solution is appropriate.
• If d == 9, then you can solve the equations exactly and the polynomial actually interpolates the
data.
In either case, use the backslash operator to solve the system. The coefficients for the cubic fit are:
c = A(:,n-3:n)\p
c = 4×1
-5.7042
27.9064
103.1528
155.1017
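The cubic coefficients can be cross-checked against polyfit applied to the scaled time variable, since both approaches solve the same least-squares problem. In this sketch the Vandermonde columns are built directly with element-wise powers instead of the loop; the names A3, c1, and c2 are illustrative.

```matlab
% Sketch: backslash on Vandermonde columns versus polyfit.
t = (1910:10:2000)';
p = [91.972 105.711 123.203 131.669 150.697 ...
     179.323 203.212 226.505 249.633 281.422]';
s = (t-1950)/50;             % scaled time, as in the example
A3 = s.^(3:-1:0);            % columns s.^3, s.^2, s, and ones
c1 = A3\p;                   % least-squares solution
c2 = polyfit(s,p,3)';        % same fit through polyfit
relGap = norm(c1 - c2)/norm(c1);
```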
Now evaluate the polynomial at every year from 1910 to 2020 and plot the results.
v = (1910:2020)';
x = (v-1950)/50;
w = (2010-1950)/50;
y = polyval(c,x);
z = polyval(c,w);
hold on
plot(v,y,'k-');
plot(2010,z,'ks');
text(2010,z+15,num2str(z));
hold off
Compare the cubic fit with the quartic. Notice that the extrapolated point is very different.
c = A(:,n-4:n)\p;
y = polyval(c,x);
z = polyval(c,w);
hold on
plot(v,y,'k-');
plot(2010,z,'ks');
text(2010,z-15,num2str(z));
hold off
cla
plot(t,p,'bo')
hold on
axis([1910 2020 0 400])
colors = hsv(8);
labels = {'data'};
for d = 1:8
[Q,R] = qr(A(:,n-d:n));
R = R(1:d+1,:);
Q = Q(:,1:d+1);
c = R\(Q'*p); % Same as c = A(:,n-d:n)\p;
y = polyval(c,x);
z = polyval(c,w); % Extrapolated value at 2010 (scaled time w)
plot(v,y,'color',colors(d,:));
labels{end+1} = ['degree = ' int2str(d)];
end
legend(labels, 'Location', 'NorthWest')
hold off
See Also
polyfit
Roots of Scalar Functions
The fzero function attempts to find a root of one equation with one variable. You can call this
function with either a one-element starting point or a two-element vector that designates a starting
interval. If you give fzero a starting point x0, fzero first searches for an interval around this point
where the function changes sign. If the interval is found, fzero returns a value near where the
function changes sign. If no such interval is found, fzero returns NaN. Alternatively, if you know two
points where the function value differs in sign, you can specify this starting interval using a two-
element vector; fzero is guaranteed to narrow down the interval and return a value near a sign
change.
The following sections contain two examples that illustrate how to find a zero of a function using a
starting interval and a starting point. The examples use the function humps.m, which is provided with
MATLAB®. The following figure shows the graph of humps.
x = -1:.01:2;
y = humps(x);
plot(x,y)
xlabel('x');
ylabel('humps(x)')
grid on
You can control several aspects of the fzero function by setting options. You set options using
optimset. Options include:
• Choosing the amount of display fzero generates — see “Set Optimization Options” on page 9-10,
Using a Starting Interval on page 6-19, and Using a Starting Point on page 6-20.
• Choosing various tolerances that control how fzero determines it is at a root — see “Set
Optimization Options” on page 9-10.
• Choosing a plot function for observing the progress of fzero towards a root — see “Optimization
Solver Plot Functions” on page 9-20.
• Using a custom-programmed output function for observing the progress of fzero towards a root
— see “Optimization Solver Output Functions” on page 9-14.
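As a minimal sketch of setting options, the following call suppresses the display and tightens the termination tolerance on x. The tolerance value 1e-12 is an arbitrary choice for illustration; fval and exitflag are the optional second and third outputs of fzero.

```matlab
% Sketch: pass display and tolerance options to fzero.
options = optimset('Display','off','TolX',1e-12);   % assumed tolerance value
[a,fval,exitflag] = fzero(@humps,[-1 1],options);
% exitflag is 1 when fzero converges to a root
```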
Using a Starting Interval
The graph of humps indicates that the function is negative at x = -1 and positive at x = 1. You can
confirm this by calculating humps at these two points.
humps(1)
ans = 16
humps(-1)
ans = -5.1378
The iterative algorithm for fzero finds smaller and smaller subintervals of [-1 1]. For each
subinterval, the sign of humps differs at the two endpoints. As the endpoints of the subintervals get
closer and closer, they converge to a zero of humps.
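The idea of shrinking a sign-changing interval can be sketched with plain bisection. This is a simplified stand-in, not the algorithm fzero uses (fzero mixes bisection with interpolation steps); the variable names and the interval-width tolerance are assumptions.

```matlab
% Sketch: plain bisection on humps over [-1 1].
lo = -1;
hi = 1;
flo = humps(lo);                 % negative at the left endpoint
while hi - lo > 1e-10            % assumed width tolerance
    mid = (lo + hi)/2;
    fmid = humps(mid);
    if sign(fmid) == sign(flo)
        lo = mid;                % the sign change lies in the right half
        flo = fmid;
    else
        hi = mid;                % the sign change lies in the left half
    end
end
a = (lo + hi)/2;
```

Bisection halves the interval on every pass, so it needs more function evaluations than fzero typically does, but it reaches the same zero.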
To show the progress of fzero at each iteration, set the Display option to iter using the
optimset function.
options = optimset('Display','iter');
a = fzero(@humps,[-1 1],options)
a = -0.1316
Each value x represents the best endpoint so far. The Procedure column tells you whether each step
of the algorithm uses bisection or interpolation.
You can verify that the function value at a is close to zero by entering
humps(a)
ans = 8.8818e-16
Using a Starting Point
Suppose you do not know two points at which the function values of humps differ in sign. In that case,
you can choose a scalar x0 as the starting point for fzero. fzero first searches for an interval
around this point on which the function changes sign. If fzero finds such an interval, it proceeds
with the algorithm described in the previous section. If no such interval is found, fzero returns NaN.
For example, set the starting point to -0.2, the Display option to iter, and call fzero:
options = optimset('Display','iter');
a = fzero(@humps,-0.2,options)
a = -0.1316
The endpoints of the current subinterval at each iteration are listed under the headings a and b,
while the corresponding values of humps at the endpoints are listed under f(a) and f(b),
respectively.
Note: The endpoints a and b are not listed in any specific order: a can be greater than b or less than
b.
For the first nine steps, the sign of humps is negative at both endpoints of the current subinterval,
which is shown in the output. At the tenth step, the sign of humps is positive at a, -0.10949, but
negative at b, -0.264. From this point on, the algorithm continues to narrow down the interval
[-0.10949 -0.264], as described in the previous section, until it reaches the value -0.1316.
See Also
More About
• “Roots of Polynomials” on page 6-4
• “Optimizing Nonlinear Functions” on page 9-2
• “Systems of Nonlinear Equations” (Optimization Toolbox)
7
Computational Geometry
Triangulation Representations
The triangulation decomposes a complex polygon into a collection of simpler triangular polygons. You
can use these polygons for developing geometric-based algorithms or graphics applications.
Similarly, you can represent the boundary of a 3-D geometric domain using a triangulation. The figure
below shows the convex hull of a set of points in 3-D space. Each facet of the hull is a triangle.
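Triangulation Matrix Format
MATLAB uses a matrix format to represent triangulations. This format has two parts:
• The vertices, represented as a matrix in which each row contains the coordinates of a point in the
triangulation.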
• The triangulation connectivity, represented as a matrix in which each row defines a triangle or
tetrahedron.
Vertices
Vertex ID    x-coordinate    y-coordinate
V1           2.5             8.0
V2           6.5             8.0
V3           2.5             5.0
V4           6.5             5.0
V5           1.0             6.5
V6           8.0             6.5
The data in the previous table is stored as a matrix in the MATLAB environment. The vertex IDs are
labels used for identifying specific vertices. They are shown to illustrate the concept of a vertex ID,
but they are not stored explicitly. Instead, the row numbers of the matrix serve as the vertex IDs.
Connectivity
Triangle ID    IDs of Bounding Vertices
T1             5    3    1
T2             3    2    1
T3             3    4    2
T4             4    6    2
The data in this table is stored as a matrix in the MATLAB environment. The triangle IDs are labels
used for identifying specific triangles. They are shown to illustrate the concept of a triangle ID, but
they are not stored explicitly. Instead, the row numbers of the matrix serve as the triangle IDs.
You can see that triangle T1 is defined by three vertices, {V5, V3, V1}. Similarly, T4 is defined by
the vertices {V4, V6, V2}. This format extends naturally to higher dimensions, which require
additional columns of data. For example, a tetrahedron in 3-D space is defined by four vertices, each
of which has three coordinates, (x, y, z).
You can represent and query the following types of triangulations using MATLAB:
For example, you might compute the triangle incenters before plotting the annotated triangulation
shown below. In this case, you use the incenters to display the triangle labels (T1, T2, etc.) within
each triangle. If you want to plot the boundary in red, you need to determine the edges that are
referenced by only one triangle.
You can use triangulation to create an in-memory representation of any 2-D or 3-D triangulation
data that is in matrix format, such as the matrix output from the delaunay function or other
software tools. When your data is represented using triangulation, you can perform topological
and geometric queries, which you can use to develop geometric algorithms. For example, you can find
the triangles or tetrahedra attached to a vertex, those that share an edge, their circumcenters, and
other features.
• Pass existing data that you have in matrix format to triangulation. This data can be the output
from a MATLAB function, such as delaunay or convhull. You also can import triangulation data
that was created by another software application. When you work with imported data, be sure the
connectivity data references the vertex array using 1-based indexing instead of 0-based indexing.
• Pass a set of points to delaunayTriangulation. The resulting Delaunay triangulation is a
special kind of triangulation. This means you can perform any triangulation query on your
data, as well as any Delaunay-specific query. In more formal MATLAB language terms,
delaunayTriangulation is a subclass of triangulation.
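The indexing caveat for imported data can be handled with a one-line shift. In this sketch, T0 is a hypothetical connectivity matrix as it might arrive from a tool that numbers vertices from 0; adding 1 converts it to the 1-based row indices that triangulation expects.

```matlab
% Sketch: convert hypothetical 0-based connectivity to 1-based indexing.
P = [2.5 8.0; 6.5 8.0; 2.5 5.0; 6.5 5.0; 1.0 6.5; 8.0 6.5];
T0 = [4 2 0; 2 1 0; 2 3 1; 3 5 1];   % hypothetical imported data, 0-based
T = T0 + 1;                          % shift to 1-based vertex IDs
TR = triangulation(T,P);
```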
This example shows how to use the triangulation matrix data to create a triangulation, explore what it
is, and explore what it can do.
P = [ 2.5 8.0
6.5 8.0
2.5 5.0
6.5 5.0
1.0 6.5
8.0 6.5];
T = [5 3 1;
3 2 1;
3 4 2;
4 6 2];
TR = triangulation(T,P)
TR =
triangulation with properties:
Access the properties in a triangulation in the same way you access the fields of a struct. For
example, examine the Points property, which contains the coordinates of the vertices.
TR.Points
ans = 6×2
2.5000 8.0000
6.5000 8.0000
2.5000 5.0000
6.5000 5.0000
1.0000 6.5000
8.0000 6.5000
TR.ConnectivityList
ans = 4×3
5 3 1
3 2 1
3 4 2
4 6 2
The Points and ConnectivityList properties define the matrix data for the triangulation.
The triangulation class is a wrapper around the matrix data. The real benefit is the usefulness of
the triangulation class methods. The methods are like functions that accept a triangulation
and other relevant input data.
The triangulation class provides an easy way to index into the ConnectivityList property
matrix. Access the first triangle in the triangulation.
TR.ConnectivityList(1,:)
ans = 1×3
5 3 1
TR(1,1)
ans = 5
TR(1,2)
ans = 3
TR(:,:)
ans = 4×3
5 3 1
3 2 1
3 4 2
4 6 2
Use triplot to plot the triangulation. The triplot function is not a triangulation method,
but it accepts and can plot a triangulation.
figure
triplot(TR)
axis equal
Use the triangulation method, freeBoundary, to query the free boundary and highlight it in a
plot. This method returns the edges of the triangulation that are shared by only one triangle. The
returned edges are expressed in terms of the vertex IDs.
boundaryedges = freeBoundary(TR)';
hold on
plot(P(boundaryedges,1),P(boundaryedges,2),'-r','LineWidth',2)
hold off
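To make the definition concrete, the boundary edges can also be found by hand: list every edge of every triangle, ignore direction, and keep the edges that occur exactly once. This sketch is for illustration only and says nothing about how freeBoundary is implemented; the names E, U, counts, and boundary are arbitrary.

```matlab
% Sketch: edges referenced by only one triangle, computed by counting.
P = [2.5 8.0; 6.5 8.0; 2.5 5.0; 6.5 5.0; 1.0 6.5; 8.0 6.5];
T = [5 3 1; 3 2 1; 3 4 2; 4 6 2];
TR = triangulation(T,P);
E = [T(:,[1 2]); T(:,[2 3]); T(:,[3 1])];   % three edges per triangle
E = sort(E,2);                              % ignore edge direction
[U,~,idx] = unique(E,'rows');               % distinct edges
counts = accumarray(idx,1);                 % how many triangles use each edge
boundary = U(counts == 1,:);                % edges used by exactly one triangle
fb = sortrows(sort(freeBoundary(TR),2));    % same set from the class method
```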
You can use the freeBoundary method to validate a triangulation. For example, if you observed red
edges in the interior of the triangulation, then it would indicate a problem in how the triangles are
connected.
When you create a Delaunay triangulation using the delaunayTriangulation class, you
automatically get access to the triangulation methods because delaunayTriangulation is a
subclass of triangulation.
P = [ 2.5 8.0
6.5 8.0
2.5 5.0
6.5 5.0
1.0 6.5
8.0 6.5];
DT = delaunayTriangulation(P)
DT =
delaunayTriangulation with properties:
You can access the triangulation using direct indexing, just like triangulation. For example,
examine the connectivity of the first triangle.
DT(1,:)
ans = 1×3
5 3 1
DT(:,:)
ans = 4×3
5 3 1
3 4 1
1 4 2
2 4 6
triplot(DT)
axis equal
The parent class, triangulation, provides the incenter method to compute the incenters of each
triangle.
IC = incenter(DT)
IC = 4×2
1.8787 6.5000
3.5000 6.0000
5.5000 7.0000
7.1213 6.5000
The returned value, IC, is an array of coordinates representing the incenters of the triangles.
Now, use the incenters to find the positions for placing triangle labels on the plot.
hold on
numtri = size(DT,1);
trilabels = arrayfun(@(P) {sprintf('T%d', P)}, (1:numtri)');
Htl = text(IC(:,1),IC(:,2),trilabels,'FontWeight','bold', ...
'HorizontalAlignment','center','Color','blue');
hold off
Instead of creating a Delaunay triangulation using delaunayTriangulation, you could use the
delaunay function to create the triangulation connectivity data, and then pass the connectivity data
to triangulation. For example,
P = [ 2.5 8.0
6.5 8.0
2.5 5.0
6.5 5.0
1.0 6.5
8.0 6.5];
T = delaunay(P);
TR = triangulation(T,P);
IC = incenter(TR);
Both approaches are valid in this example, but if you want to create a Delaunay triangulation and
perform queries on it, then you should use delaunayTriangulation for these reasons:
• The delaunayTriangulation class provides additional methods that are useful for working with
triangulations. For example, you can perform nearest-neighbor and point-in-triangle searches.
• It allows you to edit the triangulation to add, move, or remove points.
• It allows you to create constrained Delaunay triangulations. This allows you to create a
triangulation for a 2-D domain.
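A brief sketch of the two searches mentioned in the list, using the same six points. The query locations are arbitrary. Because four of these points lie on a common circle, the triangle ID returned for an interior query can depend on which of the two valid triangulations is produced, so the sketch does not assume a particular ID.

```matlab
% Sketch: nearest-neighbor and point-in-triangle queries.
P = [2.5 8.0; 6.5 8.0; 2.5 5.0; 6.5 5.0; 1.0 6.5; 8.0 6.5];
DT = delaunayTriangulation(P);
vid = nearestNeighbor(DT,[1.2 6.4]);   % ID of the vertex closest to the query point
ti = pointLocation(DT,[3 6]);          % ID of the enclosing triangle, NaN if outside
```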
See Also
delaunay | delaunayTriangulation | freeBoundary | triangulation | triplot
More About
• “Working with Delaunay Triangulations” on page 7-14
• “Spatial Searching” on page 7-53
Working with Delaunay Triangulations
The fundamental property is the Delaunay criterion. In the case of 2-D triangulations, this is often
called the empty circumcircle criterion. For a set of points in 2-D, a Delaunay triangulation of these
points ensures the circumcircle associated with each triangle contains no other point in its interior.
This property is important. In the illustration below, the circumcircle associated with T1 is empty. It
does not contain a point in its interior. The circumcircle associated with T2 is empty. It does not
contain a point in its interior. This triangulation is a Delaunay triangulation.
The triangles below are different. The circumcircle associated with T1 is not empty. It contains V3 in
its interior. The circumcircle associated with T2 is not empty. It contains V1 in its interior. This
triangulation is not a Delaunay triangulation.
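The empty circumcircle criterion can be checked numerically. This sketch uses a hypothetical six-point set (not the points in the illustrations), computes each triangle's circumcircle, and confirms that no point lies strictly inside any of them. The tolerance is an assumption that allows for roundoff and for points lying exactly on a circle.

```matlab
% Sketch: test the empty circumcircle criterion on a small point set.
P = [2.5 8.0; 6.5 8.0; 2.5 5.0; 6.5 5.0; 1.0 6.5; 8.0 6.5];   % hypothetical points
DT = delaunayTriangulation(P);
[C,r] = circumcenter(DT);          % circumcenters (one row per triangle) and radii
% D(i,j) is the distance from circumcenter i to point j
D = sqrt((P(:,1)' - C(:,1)).^2 + (P(:,2)' - C(:,2)).^2);
tol = 1e-10;                       % assumed roundoff allowance
isDelaunay = all(D >= r - tol,'all');
```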
Delaunay triangles are said to be “well shaped” because in fulfilling the empty circumcircle property,
triangles with large internal angles are selected over ones with small internal angles. The triangles in
the non-Delaunay triangulation have sharp angles at vertices V2 and V4. If the edge {V2, V4} were
replaced by an edge joining V1 and V3, the minimum angle would be maximized and the triangulation
would become a Delaunay triangulation. Also, the Delaunay triangulation connects points in a
nearest-neighbor manner. These two characteristics, well-shaped triangles and the nearest-neighbor
relation, have important implications in practice and motivate the use of Delaunay triangulations in
scattered data interpolation.
While the Delaunay property is well defined, the topology of the triangulation is not unique in the
presence of degenerate point sets. In two dimensions, degeneracies arise when four or more unique
points lie on the same circle. The vertices of a square, for example, have a nonunique Delaunay
triangulation.
The properties of Delaunay triangulations extend to higher dimensions. The triangulation of a 3-D set
of points is composed of tetrahedra. The next illustration shows a simple 3-D Delaunay triangulation
made up of two tetrahedra. The circumsphere of one tetrahedron is shown to highlight the empty
circumsphere criterion.
A 3-D Delaunay triangulation produces tetrahedra that satisfy the empty circumsphere criterion.
The delaunay function supports the creation of 2-D and 3-D Delaunay triangulations. The
delaunayn function supports creating Delaunay triangulations in 4-D and higher.
Tip Creating Delaunay triangulations in dimensions higher than 6-D is generally not practical for
moderate to large point sets due to the exponential growth in required memory.
The delaunayTriangulation class supports creating Delaunay triangulations in 2-D and 3-D. It
provides many methods that are useful for developing triangulation-based algorithms. These class
methods are like functions, but they are restricted to work with triangulations created using
delaunayTriangulation. The delaunayTriangulation class also supports the creation of
related constructs such as the convex hull and Voronoi diagram. It also supports the creation of
constrained Delaunay triangulations.
In summary:
• The delaunay function is useful when you only require the basic triangulation data, and that data
is sufficiently complete for your application.
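• The delaunayTriangulation class offers more functionality for developing triangulation-based
applications. It is useful when you require the triangulation and also want to edit the points,
perform nearest-neighbor or point-location searches, impose edge constraints, or compute related
constructs such as the convex hull and Voronoi diagram.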
The delaunay and delaunayn functions take a set of points and produce a triangulation in matrix
format. Refer to “Triangulation Matrix Format” on page 7-3 for more information on this data
structure. In 2-D, the delaunay function is often used to produce a triangulation that can be used to
plot a surface defined in terms of a set of scattered data points. In this application, it’s important to
note that this approach can only be used if the surface is single-valued. For example, it could not be
used to plot a spherical surface because there are two z values corresponding to a single (x, y)
coordinate. A simple example demonstrates how the delaunay function can be used to plot a surface
representing a sampled data set.
This example shows how to use the delaunay function to create a 2-D Delaunay triangulation from
the seamount data set. A seamount is an underwater mountain. The data set consists of a set of
longitude (x) and latitude (y) locations, and corresponding seamount elevations (z) measured at those
coordinates.
Load the seamount data set and view the (x, y) data as a scatter plot.
load seamount
plot(x,y,'.','markersize',12)
xlabel('Longitude'), ylabel('Latitude')
grid on
Construct a Delaunay triangulation from this point set and use triplot to plot the triangulation in
the existing figure.
tri = delaunay(x,y);
hold on, triplot(tri,x,y), hold off
Add the depth data (z) from seamount to lift the vertices and create the surface. Create a new figure
and use trimesh to plot the surface in wireframe mode.
figure
hidden on
trimesh(tri,x,y,z)
xlabel('Longitude'),ylabel('Latitude'),zlabel('Depth in Feet');
If you want to plot the surface in shaded mode, use trisurf instead of trimesh.
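As a minimal sketch of the shaded variant, the same triangulation and data can be passed to trisurf. This assumes the seamount data set that ships with MATLAB is on the path.

```matlab
load seamount
tri = delaunay(x,y);

% Same call pattern as trimesh, but the faces are shaded.
figure
trisurf(tri,x,y,z)
xlabel('Longitude'), ylabel('Latitude'), zlabel('Depth in Feet')
```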
A 3-D Delaunay triangulation also can be created using the delaunay function. This triangulation is
composed of tetrahedra.
This example shows how to create a 3-D Delaunay triangulation of a random data set. The
triangulation is plotted using tetramesh, and the FaceAlpha option adds transparency to the plot.
rng('default')
X = rand([30 3]);
tet = delaunay(X);
faceColor = [0.6875 0.8750 0.8984];
tetramesh(tet,X,'FaceColor', faceColor,'FaceAlpha',0.3);
MATLAB provides the delaunayn function to support the creation of Delaunay triangulations in
dimension 4-D and higher. Two complementary functions tsearchn and dsearchn are also provided
to support spatial searching for N-D triangulations. See “Spatial Searching” on page 7-53 for more
information on triangulation-based search.
“Triangulation Representations” on page 7-2 introduces the triangulation class, which supports
topological and geometric queries for 2-D and 3-D triangulations. A delaunayTriangulation is a
special kind of triangulation. This means you can perform any triangulation query on a
delaunayTriangulation in addition to the Delaunay-specific queries. In more formal MATLAB
language terms, delaunayTriangulation is a subclass of triangulation.
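The subclass relationship can be confirmed directly. The sketch below, which assumes the seamount data set is available, checks the class with isa, calls edges (a query defined by the triangulation class), and verifies that indexing returns the same data as the ConnectivityList property.

```matlab
load seamount
DT = delaunayTriangulation(x,y);

% delaunayTriangulation inherits from triangulation.
isTri = isa(DT,'triangulation');

% A general triangulation query works on the Delaunay object.
E = edges(DT);

% Indexing is shorthand for reading ConnectivityList.
sameData = isequal(DT(:,:),DT.ConnectivityList);
```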
This example shows how to create, query, and edit a Delaunay triangulation from the seamount data
using delaunayTriangulation. The seamount data set contains (x, y) locations and corresponding
elevations (z) that define the surface of the seamount.
load seamount
DT = delaunayTriangulation(x,y)
DT =
delaunayTriangulation with properties:
The Constraints property is empty because there aren't any imposed edge constraints. The
Points property represents the coordinates of the vertices, and the ConnectivityList property
represents the triangles. Together, these two properties define the matrix data for the triangulation.
The delaunayTriangulation class is a wrapper around the matrix data, and it offers a set of
complementary methods. You access the properties in a delaunayTriangulation in the same way
you access the fields of a struct.
DT.Points;
DT.ConnectivityList;
DT.ConnectivityList(1,:)
ans = 1×3
DT(1,:)
ans = 1×3
DT(1,1)
ans = 230
DT(:,:);
Indexing into the delaunayTriangulation output, DT, works like indexing into the triangulation
array output from delaunay. The difference between the two is the set of extra methods that you can
call on DT (for example, nearestNeighbor and pointLocation).
triplot(DT);
axis equal
xlabel('Longitude'), ylabel('Latitude')
grid on
Use the delaunayTriangulation method, convexHull, to compute the convex hull and add it to
the plot. Since you already have a Delaunay triangulation, this method allows you to derive the
convex hull more efficiently than a full computation using convhull.
hold on
k = convexHull(DT);
xHull = DT.Points(k,1);
yHull = DT.Points(k,2);
plot(xHull,yHull,'r','LineWidth',2);
hold off
You can incrementally edit the delaunayTriangulation to add or remove points. If you need to
add points to an existing triangulation, then an incremental addition is faster than a complete
retriangulation of the augmented point set. Incremental removal of points is more efficient when the
number of points to be removed is small relative to the existing number of points.
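The three kinds of incremental edit all go through the Points property. The following is a minimal sketch on a small random point set; the corner coordinates and the moved location are illustrative assumptions, not values from the seamount example.

```matlab
rng('default')
DT = delaunayTriangulation(rand(20,2));
n0 = size(DT.Points,1);

% Add: append four hypothetical corner points around the data.
corners = [-0.5 -0.5; 1.5 -0.5; 1.5 1.5; -0.5 1.5];
DT.Points(end+(1:4),:) = corners;
nAfterAdd = size(DT.Points,1);

% Move: assign a new location to the first added point.
DT.Points(n0+1,:) = [-0.25 -0.25];

% Remove: delete the last added point by assigning [].
DT.Points(n0+4,:) = [];
nAfterRemove = size(DT.Points,1);
```

Each assignment updates the triangulation in place, so ConnectivityList is valid again immediately after the edit.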
Edit the triangulation to remove the points on the convex hull from the previous computation.
figure
plot(xHull,yHull,'r','LineWidth',2);
axis equal
xlabel('Longitude'),ylabel('Latitude')
grid on
% The convex hull topology duplicates the start and end vertex.
% Remove the duplicate entry.
k(end) = [];
% Now remove the points on the convex hull.
DT.Points(k,:) = []
DT =
delaunayTriangulation with properties:
There is one vertex that is just inside the boundary of the convex hull that was not removed. The fact
that it is interior to the hull can be seen using the Zoom-In tool in the figure. You could plot the vertex
labels to determine the index of this vertex and remove it from the triangulation. Alternatively, you
can use the nearestNeighbor method to identify the index more readily.
The point is close to location (211.6, -48.15). Use the nearestNeighbor method to find the nearest
vertex.
vertexId = nearestNeighbor(DT, 211.6, -48.15)
vertexId = 50
Now remove this vertex from the triangulation.
DT.Points(vertexId,:) = []
DT =
delaunayTriangulation with properties:
figure
plot(xHull,yHull,'r','LineWidth',2);
axis equal
xlabel('Longitude'),ylabel('Latitude')
grid on
hold on
triplot(DT);
hold off
Add points to the existing triangulation. Add 4 points to form a rectangle around the triangulation.
DT =
delaunayTriangulation with properties:
close all
figure
plot(xHull,yHull,'r','LineWidth',2);
axis equal
xlabel('Longitude'),ylabel('Latitude')
grid on
hold on
triplot(DT);
hold off
You can edit the points in the triangulation to move them to a new location. Edit the first of the
additional point set (the vertex ID 274).
close all
figure
plot(xHull,yHull,'r','LineWidth',2);
axis equal
xlabel('Longitude'),ylabel('Latitude')
grid on
hold on
triplot(DT);
hold off
Use a method of the triangulation class, vertexAttachments, to find the triangles attached to
vertex 274. Since the number of triangles attached to a vertex is variable, the method returns the
attached triangle IDs in a cell array. You need braces to extract the contents.
attTris = vertexAttachments(DT,274);
hold on
triplot(DT(attTris{:},:),DT.Points(:,1),DT.Points(:,2),'g')
hold off
delaunayTriangulation also can be used to triangulate points in 3-D space. The resulting
triangulation is composed of tetrahedra.
This example shows how to use a delaunayTriangulation to create and plot the triangulation of
3-D points.
rng('default')
P = rand(30,3);
DT = delaunayTriangulation(P)
DT =
delaunayTriangulation with properties:
The tetramesh function plots both the internal and external faces of the triangulation. For large 3-D
triangulations, plotting the internal faces might be an unnecessary use of resources. A plot of the
boundary might be more appropriate. You can use the freeBoundary method to get the boundary
triangulation in matrix format. Then pass the result to trimesh or trisurf.
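A minimal sketch of the boundary-only plot follows. It assumes the same random point set as the 3-D example and uses the two-output form of freeBoundary, which returns the boundary facets together with a compact list of the points they reference.

```matlab
rng('default')
P = rand(30,3);
DT = delaunayTriangulation(P);

% FB indexes into PB, which holds only the boundary vertices.
[FB,PB] = freeBoundary(DT);

figure
trisurf(FB,PB(:,1),PB(:,2),PB(:,3),'FaceColor','cyan','FaceAlpha',0.8)
```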
The delaunayTriangulation class allows you to constrain edges in a 2-D triangulation. This
means you can choose a pair of points in the triangulation and constrain an edge to join those points.
You can picture this as “forcing” an edge between one or more pairs of points. The following example
shows how edge constraints can affect the triangulation.
The triangulation below is a Delaunay triangulation because it respects the empty circumcircle
criterion.
Triangulate a set of points with an edge constraint specified between vertex V1 and V3.
P = [2 4; 6 1; 9 4; 6 7];
C = [1 3];
DT = delaunayTriangulation(P,C);
triplot(DT)
% Use the incenters to find the positions for placing triangle labels on the plot.
hold on
IC = incenter(DT);
numtri = size(DT,1);
trilabels = arrayfun(@(P) {sprintf('T%d', P)}, (1:numtri)');
Htl = text(IC(:,1),IC(:,2),trilabels,'FontWeight','bold', ...
'HorizontalAlignment','center','Color','blue');
hold off
The constraint between vertices (V1, V3) was honored; however, the Delaunay criterion was
invalidated. This also invalidates the nearest-neighbor relation that is inherent in a Delaunay
triangulation. As a result, the nearestNeighbor search method provided by
delaunayTriangulation is not supported when the triangulation has constraints.
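The effect of the constraint on the edge set can be confirmed with the isConnected query of the triangulation class. This sketch reuses the four points from the example and compares the unconstrained and constrained triangulations; the variable names DT0 and DTc are chosen here for illustration.

```matlab
P = [2 4; 6 1; 9 4; 6 7];

DT0 = delaunayTriangulation(P);          % unconstrained
DTc = delaunayTriangulation(P,[1 3]);    % edge V1-V3 forced

% Is there an edge joining vertices 1 and 3?
hasEdge0 = isConnected(DT0,1,3);
hasEdgeC = isConnected(DTc,1,3);
```

For this point set, the unconstrained triangulation joins V2 and V4 instead, so only the constrained version contains the edge (V1, V3).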
In typical applications, the triangulation might be composed of many points, and a relatively small
number of edges in the triangulation might be constrained. Such a triangulation is said to be locally
non-Delaunay, because many triangles in the triangulation might respect the Delaunay criterion, but
locally there might be some triangles that do not. In many applications, local relaxation of the empty
circumcircle property is not a concern.
Constrained triangulations are generally used to triangulate a nonconvex polygon. The constraints
give you a correspondence between the polygon edges and the triangulation edges. This relationship
enables you to extract a triangulation that represents the region. The following example shows how
to use a constrained delaunayTriangulation to triangulate a nonconvex polygon.
figure()
axis([-1 17 -1 6]);
axis equal
P = [0 0; 16 0; 16 2; 2 2; 2 3; 8 3; 8 5; 0 5];
patch(P(:,1),P(:,2),'-r','LineWidth',2,'FaceColor',...
'none','EdgeColor','r');
hold on
numvx = size(P,1);
vxlabels = arrayfun(@(n) {sprintf('P%d', n)}, (1:numvx)');
Hpl = text(P(:,1)+0.2, P(:,2)+0.2, vxlabels, 'FontWeight', ...
'bold', 'HorizontalAlignment','center', 'BackgroundColor', ...
'none');
hold off
Create and plot the triangulation together with the polygon boundary.
figure()
subplot(2,1,1);
axis([-1 17 -1 6]);
axis equal
P = [0 0; 16 0; 16 2; 2 2; 2 3; 8 3; 8 5; 0 5];
DT = delaunayTriangulation(P);
triplot(DT)
hold on;
patch(P(:,1),P(:,2),'-r','LineWidth',2,'FaceColor',...
'none','EdgeColor','r');
hold off
This triangulation cannot be used to represent the domain of the polygon because some triangles cut
across the boundary. You need to impose a constraint on the edges that are cut by triangulation
edges. Since all edges have to be respected, you need to constrain all edges. The steps below show
how to constrain all the edges.
Enter the constrained edge definition. Observe from the annotated figure where you need constraints
(between (V1, V2), (V2, V3), and so on).
C = [1 2; 2 3; 3 4; 4 5; 5 6; 6 7; 7 8; 8 1];
In general, if you have N points in a sequence that define a polygonal boundary, the constraints can be
expressed as C = [(1:(N-1))' (2:N)'; N 1];.
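As a quick check of the general expression, the sketch below builds the constraint list for N = 8 and compares it with the list entered by hand; Cexplicit is a name introduced here for the comparison.

```matlab
N = 8;

% Each row joins vertex i to vertex i+1; the last row closes the loop.
C = [(1:(N-1))' (2:N)'; N 1];

Cexplicit = [1 2; 2 3; 3 4; 4 5; 5 6; 6 7; 7 8; 8 1];
disp(isequal(C,Cexplicit))
```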
DT = delaunayTriangulation(P,C);
Alternatively, you can impose constraints on an existing triangulation by setting the Constraints
property: DT.Constraints = C;.
figure('Color','white')
subplot(2,1,1);
axis([-1 17 -1 6]);
axis equal
triplot(DT)
hold on;
patch(P(:,1),P(:,2),'-r','LineWidth',2, ...
'FaceColor','none','EdgeColor','r');
hold off
The plot shows that the edges of the triangulation respect the boundary of the polygon. However, the
triangulation fills the concavities. What is needed is a triangulation that represents the polygonal
domain. You can extract the triangles within the polygon using the delaunayTriangulation
method, isInterior. This method returns a logical array whose true and false values
indicate whether the triangles are inside a bounded geometric domain. The analysis is based on the
Jordan curve theorem, and the boundaries are defined by the edge constraints. The ith triangle in the
triangulation is considered to be inside the domain if the ith logical flag is true; otherwise, it is
outside.
Now use the isInterior method to compute and plot the set of domain triangles.
% Plot the constrained edges in red.
figure('Color','white')
subplot(2,1,1);
plot(P(C'),P(C'+size(P,1)),'-r','LineWidth', 2);
axis([-1 17 -1 6]);
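The listing above is cut short in this copy. The following is a self-contained sketch of the remaining step, assuming the same polygon P and closed-loop constraints C used earlier in this example: compute the interior flags with isInterior, then plot only the flagged triangles.

```matlab
P = [0 0; 16 0; 16 2; 2 2; 2 3; 8 3; 8 5; 0 5];
N = size(P,1);
C = [(1:(N-1))' (2:N)'; N 1];
DT = delaunayTriangulation(P,C);

% One logical flag per triangle: true means inside the polygon.
inside = isInterior(DT);

figure
triplot(DT(inside,:),DT.Points(:,1),DT.Points(:,2))
hold on
% Overlay the constrained edges in red.
plot(P(C'),P(C'+size(P,1)),'-r','LineWidth',2)
hold off
axis([-1 17 -1 6])
```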
The following example illustrates the importance of referencing the unique data set stored within the
Points property when working with delaunayTriangulation:
rng('default')
P = rand([25 2]);
P(18,:) = P(8,:)
P(16,:) = P(6,:)
P(12,:) = P(2,:)
DT = delaunayTriangulation(P)
When the triangulation is created, MATLAB issues a warning. The Points property shows that the
duplicate points have been removed from the data.
DT =
If, for example, the Delaunay triangulation is used to compute the convex hull, the indices of the
points on the hull are indices with respect to the unique point set, DT.Points. Therefore, use the
following code to compute and plot the convex hull:
K = DT.convexHull();
plot(DT.Points(:,1),DT.Points(:,2),'.');
hold on
plot(DT.Points(K,1),DT.Points(K,2),'-r');
If the original data set containing the duplicates were used in conjunction with the indices provided
by delaunayTriangulation, then the result would be incorrect. The delaunayTriangulation
works with indices that are based on the unique data set DT.Points. For example, the following
would produce an incorrect plot, because K is indexed with respect to DT.Points and not P:
K = DT.convexHull();
plot(P(:,1),P(:,2),'.');
hold on
plot(P(K,1),P(K,2),'-r');
It’s often more convenient to create a unique data set by removing duplicates prior to creating the
delaunayTriangulation. Doing this eliminates the potential for confusion. This can be
accomplished using the unique function as follows:
rng('default')
P = rand([25 2]);
P(18,:) = P(8,:)
P(16,:) = P(6,:)
P(12,:) = P(2,:)
[~, I, ~] = unique(P,'first','rows');
I = sort(I);
P = P(I,:);
DT = delaunayTriangulation(P) % The point set is unique
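An alternative sketch, assuming the 'stable' option of unique is acceptable for your release: it removes duplicate rows while preserving the original order in a single call, which gives the same point set as the index-and-sort approach. The names P1 and P2 are introduced here for the comparison.

```matlab
rng('default')
P = rand([25 2]);
P(18,:) = P(8,:);
P(16,:) = P(6,:);
P(12,:) = P(2,:);

% Approach from the text: first occurrences, restored to input order.
[~,I] = unique(P,'first','rows');
P1 = P(sort(I),:);

% Single call that keeps the first occurrence in input order.
P2 = unique(P,'rows','stable');

disp(isequal(P1,P2))
```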
See Also
More About
• “Spatial Searching” on page 7-53
Creating and Editing Delaunay Triangulations
This example shows how to create, edit, and query Delaunay triangulations using the
delaunayTriangulation class. The Delaunay triangulation is the most widely used triangulation in
scientific computing. The properties associated with the triangulation provide a basis for solving a
variety of geometric problems. Construction of constrained Delaunay triangulations is also shown,
together with applications covering medial axis computation and mesh morphing.
This example shows you how to compute a 2-D Delaunay triangulation and then plot the triangulation
together with the vertex and triangle labels.
rng default
x = rand(10,1);
y = rand(10,1);
dt = delaunayTriangulation(x,y)
dt =
delaunayTriangulation with properties:
triplot(dt)
hold on
vxlabels = arrayfun(@(n) {sprintf('P%d', n)}, (1:10)');
Hpl = text(x,y,vxlabels,'FontWeight','bold','HorizontalAlignment',...
'center','BackgroundColor','none');
ic = incenter(dt);
numtri = size(dt,1);
trilabels = arrayfun(@(x) {sprintf('T%d',x)}, (1:numtri)');
Htl = text(ic(:,1),ic(:,2),trilabels,'FontWeight','bold', ...
'HorizontalAlignment','center','Color','blue');
hold off
This example shows you how to compute and plot a 3-D Delaunay triangulation.
rng default
X = rand(10,3);
dt = delaunayTriangulation(X)
dt =
delaunayTriangulation with properties:
tetramesh(dt)
view([10 20])
To display large tetrahedral meshes, use the convexHull method to compute the boundary
triangulation and plot it using trisurf. For example:
triboundary = convexHull(dt);
trisurf(triboundary, X(:,1), X(:,2), X(:,3),'FaceColor','cyan')
There are two ways to access the triangulation data structure. One way is via the ConnectivityList
property; the other way is using indexing.
dt =
delaunayTriangulation with properties:
One way to access the triangulation data structure is with the ConnectivityList property.
dt.ConnectivityList
ans = 11×3
2 8 5
7 6 1
3 7 8
8 7 5
3 8 2
6 7 3
7 4 5
5 9 2
2 9 10
5 4 9
⋮
Indexing is a shorthand way to query the triangulation. The syntax is dt(i,j), which returns the jth
vertex of the ith triangle. Standard indexing rules apply.
dt(:,:)
ans = 11×3
2 8 5
7 6 1
3 7 8
8 7 5
3 8 2
6 7 3
7 4 5
5 9 2
2 9 10
5 4 9
⋮
dt(2,:)
ans = 1×3
7 6 1
dt(2,3)
ans = 1
dt(1:3,:)
ans = 3×3
2 8 5
7 6 1
3 7 8
This example shows you how to use index-based subscripting to insert or remove points. It is more
efficient to edit a delaunayTriangulation to make minor modifications than to re-create a
new delaunayTriangulation from scratch. This is especially true if the data set is large.
rng default
x = rand(10,1);
y = rand(10,1);
dt = delaunayTriangulation(x,y)
dt =
delaunayTriangulation with properties:
dt.Points(end+(1:5),:) = rand(5,2)
dt =
delaunayTriangulation with properties:
dt.Points(5,:) = [0 0]
dt =
delaunayTriangulation with properties:
dt.Points(4,:) = []
dt =
delaunayTriangulation with properties:
Constraints: []
This example shows you how to create a constrained Delaunay triangulation and illustrates the effect
of the constraints.
X = [0 0; 16 0; 16 2; 2 2; 2 3; 8 3; 8 5; 0 5];
C = [1 2; 2 3; 3 4; 4 5; 5 6; 6 7; 7 8; 8 1];
dt = delaunayTriangulation(X,C);
subplot(2,1,1)
triplot(dt)
axis([-1 17 -1 6])
xlabel('Constrained Delaunay triangulation','FontWeight','b')
hold on
plot(X(C'),X(C'+size(X,1)),'-r','LineWidth',2)
hold off
Now delete the constraints and plot the unconstrained Delaunay triangulation.
dt.Constraints = [];
subplot(2,1,2)
triplot(dt)
axis([-1 17 -1 6])
xlabel('Unconstrained Delaunay triangulation','FontWeight','b')
Load a map of the perimeter of the conterminous United States. Construct a constrained Delaunay
triangulation representing the polygon. This triangulation spans a domain that is bounded by the
convex hull of the set of points. Extract the triangles that are within the domain of the polygon and
plot them. Note: The data set contains duplicate data points; that is, two or more data points have the
same location. The duplicate points are rejected and the delaunayTriangulation reformats the
constraints accordingly.
clf
load usapolygon
Define an edge constraint between two successive points that make up the polygonal boundary and
create the Delaunay triangulation.
nump = numel(uslon);
C = [(1:(nump-1))' (2:nump)'; nump 1];
dt = delaunayTriangulation(uslon,uslat,C);
Warning: Intersecting edge constraints have been split, this may have added new points into the t
io = isInterior(dt);
patch('Faces',dt(io,:),'Vertices',dt.Points,'FaceColor','r')
axis equal
This example highlights the use of a Delaunay triangulation to reconstruct a polygonal boundary from
a cloud of points. The reconstruction is based on the elegant Crust algorithm.
Reference: N. Amenta, M. Bern, and D. Eppstein. The crust and the beta-skeleton: combinatorial
curve reconstruction. Graphical Models and Image Processing, 60:125-135, 1998.
numpts = 192;
t = linspace( -pi, pi, numpts+1 )';
t(end) = [];
r = 0.1 + 5*sqrt( cos( 6*t ).^2 + (0.7).^2 );
x = r.*cos(t);
y = r.*sin(t);
ri = randperm(numpts);
x = x(ri);
y = y(ri);
dt = delaunayTriangulation(x,y);
tri = dt(:,:);
Insert the location of the Voronoi vertices into the existing triangulation.
V = voronoiDiagram(dt);
Remove the infinite vertex and filter out duplicate points using unique.
V(1,:) = [];
% Remove duplicates before counting so the index range matches the data.
V = unique(V,'rows');
numv = size(V,1);
dt.Points(end+(1:numv),:) = V;
The Delaunay edges that connect pairs of sample points represent the boundary.
delEdges = edges(dt);
validx = delEdges(:,1) <= numpts;
validy = delEdges(:,2) <= numpts;
boundaryEdges = delEdges((validx & validy),:)';
xb = x(boundaryEdges);
yb = y(boundaryEdges);
clf
triplot(tri,x,y)
axis equal
hold on
plot(x,y,'*r')
plot(xb,yb,'-r')
xlabel('Curve reconstruction from point cloud','FontWeight','b')
hold off
This example shows how to create an approximate Medial Axis of a polygonal domain using a
constrained Delaunay triangulation. The Medial Axis of a polygon is defined by the locus of the center
of a maximal disk within the polygon interior.
load trimesh2d
dt = delaunayTriangulation(x,y,Constraints);
inside = isInterior(dt);
tr = triangulation(dt(inside,:),dt.Points);
Construct a set of edges that join the circumcenters of neighboring triangles. The additional logic
constructs a unique set of such edges.
numt = size(tr,1);
T = (1:numt)';
neigh = neighbors(tr);
cc = circumcenter(tr);
xcc = cc(:,1);
ycc = cc(:,2);
idx1 = T < neigh(:,1);
idx2 = T < neigh(:,2);
idx3 = T < neigh(:,3);
neigh = [T(idx1) neigh(idx1,1); T(idx2) neigh(idx2,2); T(idx3) neigh(idx3,3)]';
Plot the domain triangles in green, the domain boundary in blue, and the medial axis in red.
clf
triplot(tr,'g')
hold on
plot(xcc(neigh), ycc(neigh), '-r','LineWidth',1.5)
axis([-10 310 -10 310])
axis equal
plot(x(Constraints'),y(Constraints'),'-b','LineWidth',1.5)
xlabel('Medial Axis of Polygonal Domain','FontWeight','b')
hold off
This example shows how to morph a mesh of a 2-D domain to accommodate a modification to the
domain boundary.
Step 1: Load the data. The mesh to be morphed is defined by trife, xfe, and yfe, which is a
triangulation in face-vertex format.
load trimesh2d
clf
triplot(trife,xfe,yfe)
axis equal
axis([-10 310 -10 310])
axis equal
xlabel('Initial Mesh','FontWeight','b')
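Step 2: Construct a background triangulation from the boundary points and their constraints. Then,
for each mesh vertex, compute a descriptor consisting of its enclosing background triangle and its
barycentric coordinates in that triangle.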
dt = delaunayTriangulation(x,y,Constraints);
clf
triplot(dt)
axis equal
axis([-10 310 -10 310])
axis equal
xlabel('Background Triangulation','FontWeight','b')
descriptors.tri = pointLocation(dt,xfe,yfe);
descriptors.baryCoords = cartesianToBarycentric(dt,descriptors.tri,[xfe yfe]);
Step 3: Edit the background triangulation to incorporate the desired modification to the domain
boundary.
Step 4: Convert the descriptors back to Cartesian coordinates using the deformed background
triangulation as a basis for evaluation.
Xnew = barycentricToCartesian(tr,descriptors.tri,descriptors.baryCoords);
tr = triangulation(trife,Xnew);
clf
triplot(tr)
axis([-10 310 -10 310])
axis equal
xlabel('Morphed Mesh','FontWeight','b')
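The code that deforms the background triangulation and defines tr is not shown in this copy. The following self-contained sketch illustrates the same descriptor idea on a toy case. The four-point background P, the carried points Q, and the displaced vertex location are all hypothetical values chosen for illustration.

```matlab
% Hypothetical background triangulation and points to carry along.
P = [0 0; 4 0; 4 3; 0 3.5];
DT = delaunayTriangulation(P);
Q = [1 1; 3 1; 2 2.5];

% Descriptors: enclosing triangle and barycentric coordinates.
ti = pointLocation(DT,Q);
bc = cartesianToBarycentric(DT,ti,Q);

% Deform the background by moving one vertex, keeping the connectivity.
Pnew = P;
Pnew(3,:) = [5 4];
trNew = triangulation(DT.ConnectivityList,Pnew);

% Evaluate the descriptors on the deformed background.
Qnew = barycentricToCartesian(trNew,ti,bc);

% Sanity check: the undeformed background reproduces Q.
Qback = barycentricToCartesian(DT,ti,bc);
```

Because the descriptors are stored relative to triangles rather than as absolute coordinates, moving a background vertex drags along the points in the attached triangles and leaves points in other triangles unchanged.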
Spatial Searching
Introduction
MATLAB® provides the necessary functions for performing a spatial search using either a Delaunay
triangulation or a general triangulation. The search queries that MATLAB supports are:
• Nearest-neighbor search
• Point-location search
Given a set of points X and a query point q in Euclidean space, the nearest-neighbor search locates a
point p in X that is closer to q than any other point in X. Given a triangulation of X, the
point-location search locates the triangle or tetrahedron that contains the query point q. Since these
methods work for both Delaunay and general triangulations, you can use them even if a
modification of the points violates the Delaunay criterion. You also can search a general triangulation
represented in matrix format.
While MATLAB supports these search schemes in N dimensions, exact spatial searches usually
become prohibitive as the number of dimensions extends beyond 3-D. You should consider
approximate alternatives for large problems in up to 10 dimensions.
Nearest-Neighbor Search
There are a few ways to compute nearest-neighbors in MATLAB, depending on the dimensionality of
the problem:
• For 2-D and 3-D searches, use the nearestNeighbor method provided by the triangulation
class and inherited by the delaunayTriangulation class.
• For 4-D and higher, use the delaunayn function to construct the triangulation and the
complementary dsearchn function to perform the search. While these N-D functions support 2-D
and 3-D, they are not as general and efficient as the triangulation search methods.
X = [3.5 8.2; 6.8 8.3; 1.3 6.5; 3.5 6.3; 5.8 6.2; 8.3 6.5;...
1 4; 2.7 4.3; 5 4.5; 7 3.5; 8.7 4.2; 1.5 2.1; 4.1 1.1; ...
7 1.5; 8.5 2.75];
plot(X(:,1),X(:,2),'ob')
hold on
vxlabels = arrayfun(@(n) {sprintf('X%d', n)}, (1:15)');
Hpl = text(X(:,1)+0.2, X(:,2)+0.2, vxlabels, 'FontWeight', ...
'bold', 'HorizontalAlignment','center', 'BackgroundColor', ...
'none');
hold off
dt = delaunayTriangulation(X);
Create some query points and for each query point find the index of its corresponding nearest
neighbor in X using the nearestNeighbor method.
numq = 10;
rng(0,'twister');
q = 2+rand(numq,2)*6;
xi = nearestNeighbor(dt, q);
Add the query points to the plot and add line segments joining the query points to their nearest
neighbors.
xnn = X(xi,:);
hold on
plot(q(:,1),q(:,2),'or');
plot([xnn(:,1) q(:,1)]',[xnn(:,2) q(:,2)]','-r');
hold off
Performing a nearest-neighbor search in 3-D is a direct extension of the 2-D example based on
delaunayTriangulation.
For 4-D and higher, use the delaunayn and dsearchn functions as illustrated in the following
example:
Create a random sample of points in 4-D and triangulate the points using delaunayn:
X = 20*rand(50,4) -10;
tri = delaunayn(X);
Create some query points and for each query point find the index of its corresponding nearest-
neighbor in X using the dsearchn function:
q = rand(5,4);
xi = dsearchn(X,tri, q);
The nearestNeighbor method and the dsearchn function allow the Euclidean distance between
the query point and its nearest-neighbor to be returned as an optional argument. In the 4-D example,
you can compute the distances, dnn, as follows:
[xi,dnn] = dsearchn(X,tri,q);
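The same optional output is available in the 2-D class-based example. The sketch below requests the distances from nearestNeighbor and compares them with a brute-force computation over all point pairs; the brute-force check (D, dBrute) is added here for illustration only.

```matlab
X = [3.5 8.2; 6.8 8.3; 1.3 6.5; 3.5 6.3; 5.8 6.2; 8.3 6.5; ...
     1 4; 2.7 4.3; 5 4.5; 7 3.5; 8.7 4.2; 1.5 2.1; 4.1 1.1; ...
     7 1.5; 8.5 2.75];
dt = delaunayTriangulation(X);

rng(0,'twister');
q = 2 + rand(10,2)*6;

% Second output is the Euclidean distance to the nearest neighbor.
[xi,dnn] = nearestNeighbor(dt,q);

% Brute force: distance from every query point to every point in X.
D = sqrt((q(:,1) - X(:,1)').^2 + (q(:,2) - X(:,2)').^2);
dBrute = min(D,[],2);

disp(max(abs(dnn - dBrute)))
```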
Point-Location Search
A point-location search is a triangulation search algorithm that locates the simplex (triangle,
tetrahedron, and so on) enclosing a query point. As in the case of the nearest-neighbor search, there
are a few approaches to performing a point-location search in MATLAB, depending on the
dimensionality of the problem:
• For 2-D and 3-D, use the class-based approach with the pointLocation method provided by the
triangulation class and inherited by the delaunayTriangulation class.
• For 4-D and higher, use the delaunayn function to construct the triangulation and the
complementary tsearchn function to perform the point-location search. Although supporting 2-D
and 3-D, these N-D functions are not as general and efficient as the triangulation search methods.
This example shows how to use the delaunayTriangulation class to perform a point-location
search in 2-D.
X = [3.5 8.2; 6.8 8.3; 1.3 6.5; 3.5 6.3; 5.8 6.2; ...
8.3 6.5; 1 4; 2.7 4.3; 5 4.5; 7 3.5; 8.7 4.2; ...
1.5 2.1; 4.1 1.1; 7 1.5; 8.5 2.75];
Create the triangulation and plot it showing the triangle ID labels at the incenters of the triangles.
dt = delaunayTriangulation(X);
triplot(dt);
hold on
ic = incenter(dt);
numtri = size(dt,1);
trilabels = arrayfun(@(x) {sprintf('T%d', x)}, (1:numtri)');
Htl = text(ic(:,1), ic(:,2), trilabels, 'FontWeight', ...
'bold', 'HorizontalAlignment', 'center', 'Color', ...
'blue');
hold off
Now create some query points and add them to the plot. Then find the index of the corresponding
enclosing triangles using the pointLocation method.
q = [5.9344 6.2363;
2.2143 2.1910;
7.0948 3.6615;
7.6040 2.2770;
6.0724 2.5828;
6.5464 6.9407;
6.4588 6.1690;
4.3534 3.9026;
5.9329 7.7013;
3.0271 2.2067];
hold on;
plot(q(:,1),q(:,2),'*r');
vxlabels = arrayfun(@(n) {sprintf('q%d', n)}, (1:10)');
Hpl = text(q(:,1)+0.2, q(:,2)+0.2, vxlabels, 'FontWeight', ...
'bold', 'HorizontalAlignment','center', ...
'BackgroundColor', 'none');
hold off
ti = pointLocation(dt,q);
For 4-D and higher, use the delaunayn and tsearchn functions as illustrated in the following
example:
Create a random sample of points in 4-D and triangulate them using delaunayn:
X = 20*rand(50,4) -10;
tri = delaunayn(X);
Create some query points and find the index of the corresponding enclosing simplices using the
tsearchn function:
q = rand(5,4);
ti = tsearchn(X,tri,q);
The pointLocation method and the tsearchn function allow the corresponding barycentric
coordinates to be returned as an optional argument. In the 4-D example, you can compute the
barycentric coordinates as follows:
[ti,bc] = tsearchn(X,tri,q);
The barycentric coordinates are useful for performing linear interpolation. These coordinates provide
you with weights that you can use to scale the values at each vertex of the enclosing simplex. See
“Interpolating Scattered Data” on page 8-17 for further details.
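A minimal sketch of this weighting in 2-D, assuming a hypothetical sample function f that is linear in x and y so that the interpolated values can be checked exactly. The barycentric coordinates returned by pointLocation weight the values at the three vertices of each enclosing triangle.

```matlab
X = [3.5 8.2; 6.8 8.3; 1.3 6.5; 3.5 6.3; 5.8 6.2; ...
     8.3 6.5; 1 4; 2.7 4.3; 5 4.5; 7 3.5; 8.7 4.2; ...
     1.5 2.1; 4.1 1.1; 7 1.5; 8.5 2.75];
dt = delaunayTriangulation(X);

% Hypothetical values sampled at the vertices.
f = @(p) 2*p(:,1) + 3*p(:,2);
v = f(dt.Points);

% Query points taken from the point-location example.
q = [5.9344 6.2363; 2.2143 2.1910; 7.0948 3.6615];
[ti,bc] = pointLocation(dt,q);

% Values at the three vertices of each enclosing triangle (one row per query).
triVals = v(dt(ti,:));

% Weighted sum of vertex values gives the linear interpolant.
vq = dot(bc,triVals,2);
disp(max(abs(vq - f(q))))
```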
See Also
nearestNeighbor | delaunayTriangulation | triangulation | delaunayn | dsearchn |
pointLocation | tsearchn | delaunay
Related Examples
• “Working with Delaunay Triangulations” on page 7-14
• “Triangulation Representations” on page 7-2
• “Interpolating Scattered Data” on page 8-17
Voronoi Diagrams
In this section...
“Plot 2-D Voronoi Diagram and Delaunay Triangulation” on page 7-58
“Computing the Voronoi Diagram” on page 7-61
The Voronoi diagram of a discrete set of points X decomposes the space around each point X(i) into a
region of influence R{i}. This decomposition has the property that an arbitrary point P within the
region R{i} is closer to point i than any other point. The region of influence is called a Voronoi
region and the collection of all the Voronoi regions is the Voronoi diagram.
The Voronoi diagram is an N-D geometric construct, but most practical applications are in 2-D and
3-D space. The properties of the Voronoi diagram are best understood using an example.
This example shows the Voronoi diagram and the Delaunay triangulation on the same 2-D plot.
Use the 2-D voronoi function to plot the Voronoi diagram for a set of points.
figure()
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; ...
0.8 1.2; 3.3 1.5; -4.0 -1.0;-2.3 -0.7; ...
0 -0.5; 2.0 -1.5; 3.7 -0.8; -3.5 -2.9; ...
-0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
voronoi(X(:,1),X(:,2))
Observe that P is closer to X9 than to any other point in X, which is true for any point P within the
region that bounds X9.
The Voronoi diagram of a set of points X is closely related to the Delaunay triangulation of X. To see
this relationship, construct a Delaunay triangulation of the point set X and superimpose the
triangulation plot on the Voronoi diagram.
dt = delaunayTriangulation(X);
hold on
triplot(dt,'-r');
hold off
From the plot you can see that the Voronoi region associated with the point X9 is defined by the
perpendicular bisectors of the Delaunay edges attached to X9. Also, the vertices of the Voronoi edges
are located at the circumcenters of the Delaunay triangles. You can illustrate these associations by
plotting the circumcenter of triangle {X9, X4, X8}.
To find the index of this triangle, query the triangulation. The triangle contains the location (-1, 0).
tidx = pointLocation(dt,-1,0);
cc = circumcenter(dt,tidx);
hold on
plot(cc(1),cc(2),'*g');
hold off
The Delaunay triangulation and Voronoi diagram are geometric duals of each other. You can compute
the Voronoi diagram from the Delaunay triangulation and vice versa.
Observe that the Voronoi regions associated with points on the convex hull are unbounded (for
example, the Voronoi region associated with X13). The edges in this region "end" at infinity. The
Voronoi edges that bisect Delaunay edges (X13, X12) and (X13, X14) extend to infinity. While the
Voronoi diagram provides a nearest-neighbor decomposition of the space around each point in the
set, it does not directly support nearest-neighbor queries. However, the geometric constructions used
to compute the Voronoi diagram are also used to perform nearest-neighbor searches.
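As a hedged illustration of that last point, the sketch below uses the nearestNeighbor method of delaunayTriangulation to find the site whose Voronoi region contains a query point, and checks the answer against a brute-force distance search. The query point P is a hypothetical location chosen for illustration.

```matlab
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
     3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
     3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
dt = delaunayTriangulation(X);

% Hypothetical query point.
P = [-0.5 -0.2];

% Nearest site found through the triangulation.
[idx,dist] = nearestNeighbor(dt,P);

% Brute-force check: Euclidean distance from P to every point in X.
[distBrute,idxBrute] = min(vecnorm(X - P,2,2));
```

By the definition of the Voronoi diagram, P lies in the Voronoi region of X(idx,:).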
Computing the Voronoi Diagram
MATLAB provides functions to plot the Voronoi diagram in 2-D and to compute the topology of the
Voronoi diagram in N-D. Voronoi computation is generally impractical in dimensions beyond 6-D
for moderate to large data sets, due to the exponential growth in required memory.
The voronoi plot function plots the Voronoi diagram for a set of points in 2-D space. In MATLAB
there are two ways to compute the topology of the Voronoi diagram of a point set: the voronoin
function and the voronoiDiagram method of the delaunayTriangulation class.
The voronoin function supports the computation of the Voronoi topology for discrete points in N-D
(N ≥ 2). The voronoiDiagram method supports computation of the Voronoi topology for discrete
points in 2-D or 3-D.
The voronoiDiagram method is recommended for 2-D or 3-D topology computations as it is more
robust and gives better performance for large data sets. This method supports incremental insertion
and removal of points and complementary queries, such as nearest-neighbor point search.
The voronoin function and the voronoiDiagram method represent the topology of the Voronoi
diagram using a matrix format. See “Triangulation Matrix Format” on page 7-3 for further details on
this data structure.
Given a set of points, X, obtain the topology of the Voronoi diagram as follows:
% Using the voronoin function
[V,R] = voronoin(X)
% Using the voronoiDiagram method
dt = delaunayTriangulation(X);
[V,R] = voronoiDiagram(dt)
V is a matrix representing the coordinates of the Voronoi vertices (the vertices are the end points of
the Voronoi edges). By convention, the first vertex in V is the infinite vertex. R is a cell array of
length size(X,1), in which each cell represents the Voronoi region associated with one point. Hence,
the Voronoi region associated with the point X(i) is R{i}.
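The following sketch, which assumes the voronoiDiagram representation described above, shows one way to use the infinite vertex to separate bounded regions from unbounded ones. The variable names isUnbounded, hullIdx, and region9 are illustrative.

```matlab
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
     3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
     3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
dt = delaunayTriangulation(X);
[V,R] = voronoiDiagram(dt);

% A region is unbounded when it references the infinite vertex (row 1 of V).
isUnbounded = cellfun(@(r) any(r == 1), R);

% Indices of the points on the convex hull (unique drops the closing index).
hullIdx = unique(convexHull(dt));

% Coordinates of the Voronoi vertices that bound the region of X(9,:).
region9 = V(R{9},:);
```

For this point set, the regions of the hull points are flagged as unbounded, while the region of the interior point X9 has only finite vertices.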
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
[VX,VY] = voronoi(X(:,1),X(:,2));
h = plot(VX,VY,'-b',X(:,1),X(:,2),'.r');
xlim([-4,4])
ylim([-4,4])
R{9} gives the indices of the Voronoi vertices associated with the point site X9.
R{9}
ans = 1×5
8 12 17 10 14
The indices of the Voronoi vertices are the indices with respect to the V array.
Similarly, R{4} gives the indices of the Voronoi vertices associated with the point site X4.
R{4}
ans = 1×5
4 8 14 9 7
In 3-D, a Voronoi region is a convex polyhedron. The syntax for creating the Voronoi diagram is
similar to the 2-D case; however, the geometry of the Voronoi region is more complex. The following
example illustrates the creation of a 3-D Voronoi diagram and the plotting of a single region.
Create a sample of 25 points in 3-D space and compute the topology of the Voronoi diagram for this
point set.
rng('default')
X = -3 + 6.*rand([25 3]);
dt = delaunayTriangulation(X);
[V,R] = voronoiDiagram(dt);
Find the point closest to the origin and plot the Voronoi region associated with this point.
tid = nearestNeighbor(dt,0,0,0);
XR10 = V(R{tid},:);
K = convhull(XR10);
defaultFaceColor = [0.6875 0.8750 0.8984];
trisurf(K, XR10(:,1) ,XR10(:,2) ,XR10(:,3) , ...
'FaceColor', defaultFaceColor, 'FaceAlpha',0.8)
title('3-D Voronoi Region')
See Also
More About
• “Spatial Searching” on page 7-53
Types of Region Boundaries
The convex hull of a set of points in N-D space is the smallest convex region enclosing all points in the
set. If you think of a 2-D set of points as pegs in a peg board, the convex hull of that set would be
formed by taking an elastic band and using it to enclose all the pegs.
rng('default')
x = rand(20,1);
y = rand(20,1);
plot(x,y,'r.','MarkerSize',10)
hold on
k = convhull(x,y);
plot(x(k),y(k))
title('The Convex Hull of a Set of Points')
hold off
A convex polygon is a polygon that does not have concave vertices, for example:
x = rand(20,1);
y = rand(20,1);
k = convhull(x,y);
plot(x(k),y(k))
title('Convex Polygon')
You can also create a boundary of a point set that is nonconvex. If you shrink and tighten the convex
hull from above, you can enclose all of the points in a nonconvex polygon with concave vertices:
k = boundary(x,y,0.9);
plot(x(k),y(k))
title('Nonconvex Polygon')
The convex hull has numerous applications. You can compute an upper bound on the area bounded
by a discrete point set in the plane from the convex hull of the set. The convex hull also simplifies
the representation of more complex polygons or polyhedra. For instance, to determine whether two
nonconvex bodies intersect, you could apply a series of fast rejection steps to avoid the penalty of a
full intersection analysis. If the convex hulls of the two bodies do not intersect, then the bodies
themselves cannot intersect, and you avoid the expense of a more comprehensive intersection test.
While convex hulls and nonconvex polygons are convenient ways to represent relatively simple
boundaries, they are in fact specific instances of a more general geometric construct called the alpha
shape.
Alpha Shapes
The alpha shape of a set of points is a generalization of the convex hull and a subgraph of the
Delaunay triangulation. That is, the convex hull is just one type of alpha shape, and the full family of
alpha shapes can be derived from the Delaunay triangulation of a given point set.
rng(4)
x = rand(20,1);
y = rand(20,1);
plot(x,y,'r.','MarkerSize',20)
hold on
shp = alphaShape(x,y,100);
plot(shp)
title('Convex Alpha Shape')
hold off
Unlike the convex hull, alpha shapes have a parameter that controls the level of detail, or how tightly
the boundary fits around the point set. The parameter is called alpha or the alpha radius. Varying the
alpha radius from 0 to Inf produces a set of different alpha shapes unique for that point set.
plot(x,y,'r.','MarkerSize',20)
hold on
shp = alphaShape(x,y,.5);
plot(shp)
title('Nonconvex Alpha Shape')
hold off
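To make the effect of the alpha radius concrete, the following sketch compares the area enclosed by a tight alpha shape with the area of the convex hull for the same points. It is an illustration under the assumptions of the example above (the same random point set and the alpha radii 0.5 and Inf); the variable names are hypothetical.

```matlab
rng(4)
x = rand(20,1);
y = rand(20,1);

% Alpha shape with a small alpha radius and with an infinite alpha radius.
shpTight = alphaShape(x,y,0.5);
shpHull = alphaShape(x,y,Inf);

% Area enclosed by each alpha shape.
areaTight = area(shpTight);
areaHull = area(shpHull);

% Area of the convex hull computed directly, for comparison.
[~,areaConvhull] = convhull(x,y);
```

The tight alpha shape cannot enclose more area than the convex hull, and the alpha shape with an infinite radius encloses the same area as the hull returned by convhull.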
Varying the alpha radius can sometimes result in an alpha shape with multiple regions, which might
or might not contain holes. However, the alphaShape function in MATLAB® always returns
regularized alpha shapes, which prevents isolated or dangling points, edges, or faces.
plot(x,y,'r.','MarkerSize',20)
hold on
shp = alphaShape(x,y);
plot(shp)
title('Alpha Shape with Multiple Regions')
hold off
See Also
convhull | alphaShape
More About
• “Using the delaunayTriangulation Class” on page 7-20
• “Triangulation Matrix Format” on page 7-3
Computing the Convex Hull
The convhull function supports the computation of convex hulls in 2-D and 3-D. The convhulln
function supports the computation of convex hulls in N-D (N ≥ 2). The convhull function is
recommended for 2-D or 3-D computations due to better robustness and performance.
The delaunayTriangulation class supports 2-D or 3-D computation of the convex hull from the
Delaunay triangulation. This computation is not as efficient as the dedicated convhull and
convhulln functions. However, if you have a delaunayTriangulation of a point set and require
the convex hull, the convexHull method can compute the convex hull more efficiently from the
existing triangulation.
The alphaShape function also supports the 2-D or 3-D computation of the convex hull by setting the
alpha radius input parameter to Inf. Like delaunayTriangulation, however, computing the
convex hull using alphaShape is less efficient than using convhull or convhulln directly. The
exception is when you are working with a previously created alpha shape object.
The convhull and convhulln functions take a set of points and output the indices of the points that
lie on the boundary of the convex hull. The point index-based representation of the convex hull
supports plotting and convenient data access. The following examples illustrate the computation and
representation of the convex hull.
The first example uses a 2-D point set from the seamount dataset as input to the convhull function.
load seamount
K = convhull(x,y);
K represents the indices of the points arranged in a counter-clockwise cycle around the convex hull.
plot(x,y,'.','markersize',12)
xlabel('Longitude')
ylabel('Latitude')
hold on
plot(x(K),y(K),'r')
Add point labels to the points on the convex hull to observe the structure of K.
[K,A] = convhull(x,y);
convhull can compute the convex hull of both 2-D and 3-D point sets. You can reuse the seamount
dataset to illustrate the computation of the 3-D convex hull.
close(gcf)
K = convhull(x,y,z);
In 3-D the boundary of the convex hull, K, is represented by a triangulation. This is a set of triangular
facets in matrix format that is indexed with respect to the point array. Each row of the matrix K
represents a triangle.
Since the boundary of the convex hull is represented as a triangulation, you can use the triangulation
plotting function trisurf.
trisurf(K,x,y,z,'Facecolor','cyan')
The convhull function can optionally return the volume bounded by the 3-D convex hull. The syntax is
as follows.
[K,V] = convhull(x,y,z);
The convhull function also provides the option of simplifying the representation of the convex hull
by removing vertices that do not contribute to the area or volume. For example, if boundary facets of
the convex hull are collinear or coplanar, you can merge them to give a more concise representation.
The following example illustrates use of this option.
[x,y,z] = meshgrid(-2:1:2,-2:1:2,-2:1:2);
x = x(:);
y = y(:);
z = z(:);
K1 = convhull(x,y,z);
subplot(1,2,1)
defaultFaceColor = [0.6875 0.8750 0.8984];
trisurf(K1,x,y,z,'Facecolor',defaultFaceColor)
axis equal
title(sprintf('Convex hull with simplify\nset to false'))
K2 = convhull(x,y,z,'simplify',true);
subplot(1,2,2)
trisurf(K2,x,y,z,'Facecolor',defaultFaceColor)
axis equal
title(sprintf('Convex hull with simplify\nset to true'))
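The effect of the simplify option can also be checked numerically rather than visually. The following sketch, which assumes the same cubic grid of points as above, counts the facets and the distinct vertices in each representation and compares the enclosed volumes. The variable names nFacets and nVertices are illustrative.

```matlab
[x,y,z] = meshgrid(-2:1:2,-2:1:2,-2:1:2);
x = x(:);
y = y(:);
z = z(:);

% Default representation and simplified representation, each with its volume.
[K1,V1] = convhull(x,y,z);
[K2,V2] = convhull(x,y,z,'Simplify',true);

% Number of triangular facets and of distinct vertices used by each hull.
nFacets = [size(K1,1) size(K2,1)];
nVertices = [numel(unique(K1)) numel(unique(K2))];
```

Simplification changes only the representation of the boundary, so V1 and V2 agree even though the second hull uses fewer facets.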
MATLAB provides the convhulln function to support the computation of convex hulls and
hypervolumes in higher dimensions. Though convhulln supports N-D, problems in more than 10
dimensions present challenges due to the rapidly growing memory requirements.
The convhull function is superior to convhulln in 2-D and 3-D as it is more robust and gives better
performance.
This example shows the relationship between a Delaunay triangulation of a set of points in 2-D and
the convex hull of that set of points.
The delaunayTriangulation class supports computation of Delaunay triangulations in 2-D and 3-D
space. This class also provides a convexHull method to derive the convex hull from the
triangulation.
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
dt = delaunayTriangulation(X);
Plot the triangulation and highlight the edges that are shared only by a single triangle to reveal the
convex hull.
triplot(dt)
fe = freeBoundary(dt)';
hold on
plot(X(fe,1), X(fe,2), '-r', 'LineWidth',2)
hold off
In 3-D, the facets of the triangulation that are shared only by one tetrahedron represent the boundary
of the convex hull.
The dedicated convhull function is generally more efficient than a computation based on the
convexHull method. However, the triangulation-based approach is appropriate if:
• You have a delaunayTriangulation of the point set already and the convex hull is also
required.
• You need to add or remove points from the set incrementally and need to recompute the convex
hull frequently after you have edited the points.
This example shows how to compute the convex hull of a 2-D point set using the alphaShape
function.
alphaShape computes a regularized alpha shape from a set of 2-D or 3-D points. You can specify the
alpha radius, which determines how tightly or loosely the alpha shape envelops the point set. When
the alpha radius is set to Inf, the resulting alpha shape is the convex hull of the point set.
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];
Compute and plot the convex hull of the point set using an alpha shape with alpha radius equal to
Inf.
shp = alphaShape(X,Inf);
plot(shp)
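As a cross-check of the three approaches described in this section, the following sketch computes the convex hull of the same point set with convhull, with the convexHull method, and with an alpha shape of infinite radius, and then compares the hull vertices. It assumes a point set without duplicate points, so that the indices returned by boundaryFacets refer to the rows of X; the variable names are illustrative.

```matlab
X = [-1.5 3.2; 1.8 3.3; -3.7 1.5; -1.5 1.3; 0.8 1.2; ...
     3.3 1.5; -4.0 -1.0; -2.3 -0.7; 0 -0.5; 2.0 -1.5; ...
     3.7 -0.8; -3.5 -2.9; -0.9 -3.9; 2.0 -3.5; 3.5 -2.25];

% 1. Dedicated function.
k1 = convhull(X(:,1),X(:,2));

% 2. Derived from an existing Delaunay triangulation.
dt = delaunayTriangulation(X);
k2 = convexHull(dt);

% 3. Alpha shape with an infinite alpha radius; bf lists the boundary edges.
shp = alphaShape(X,Inf);
bf = boundaryFacets(shp);

% Sorted lists of the hull vertex indices from each approach.
v1 = unique(k1);
v2 = unique(k2);
v3 = unique(bf(:));
```

The three index lists identify the same hull vertices; the approaches differ in cost and in the data structure you already have available.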
See Also
convhull | convhulln | convexHull | delaunayTriangulation | alphaShape
Related Examples
• “Working with Delaunay Triangulations” on page 7-14
8
Interpolation
Gridded sample data makes interpolation more efficient, because the organized structure of the data
makes it easy for MATLAB to find the sample data points closest to the query point. However,
interpolating scattered data requires a “Delaunay Triangulation” of the data points, and this
introduces an extra layer of computation. Therefore, if your data can be approximated as a grid,
gridded interpolation provides substantial savings in computation time and memory usage compared
to scattered interpolation.
• “Interpolating Gridded Data” on page 8-5 covers 1-D interpolation, and the N-D interpolation of
sample data that is in axis-aligned grid format.
• “Interpolating Scattered Data” on page 8-17 covers the N-D interpolation of scattered data.
Gridded and Scattered Sample Data
algorithms do not necessarily pass through the sample data points. For more information about curve
fitting, see Curve Fitting Toolbox.
With a curved grid, you are effectively dealing with a set of scattered data and must use more
computationally expensive scattered interpolation functions to interpolate the values. However,
although the input data cannot be gridded directly, it is sometimes feasible to approximate the curved
grid with straight grid lines at appropriate intervals:
You can create an approximate grid by creating a set of grid vectors with appropriate spacing.
Approximating a curved grid with straight lines allows you to get the performance benefits of grid-
based interpolation, at the cost of slightly distorting the data. For more information about creating
grid vectors, see “Grid Representations” on page 8-5.
See Also
griddedInterpolant | scatteredInterpolant
Related Examples
• “Interpolating Gridded Data” on page 8-5
• “Interpolating Scattered Data” on page 8-17
Interpolating Gridded Data
In all of these applications, grid-based interpolation efficiently extends the usefulness of the data to
points where no measurement was taken. For example, if you have hourly price data for a stock, you
can use interpolation to approximate the price every 15 minutes.
The meshgrid and ndgrid functions create grids of various dimensionality. meshgrid can create
2-D or 3-D grids, while ndgrid can create grids with any number of dimensions. These functions return
grids using different output formats. You can convert between these grid formats using the
pagetranspose (as of R2020b) or permute functions to swap the first two dimensions of the grid.
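As a brief sketch of this conversion, the following code builds the same 3-D grid in both formats and converts the ndgrid output to meshgrid format. The grid vectors 1:3, 1:5, and 1:2 are arbitrary choices for illustration.

```matlab
% The same grid in meshgrid format (5-by-3-by-2) and ndgrid format (3-by-5-by-2).
[Xm,Ym,Zm] = meshgrid(1:3,1:5,1:2);
[Xn,Yn,Zn] = ndgrid(1:3,1:5,1:2);

% Swap the first two dimensions to convert ndgrid format to meshgrid format.
XmFromPermute = permute(Xn,[2 1 3]);
YmFromPermute = permute(Yn,[2 1 3]);
ZmFromPermute = permute(Zn,[2 1 3]);

% pagetranspose (R2020b or later) performs the same swap on each page.
XmFromPage = pagetranspose(Xn);
```

The same permutation converts in the other direction, because swapping the first two dimensions twice restores the original array.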
Interpolation Functions
The interp family of functions includes interp1, interp2, interp3, and interpn. Each function
is designed to interpolate data with a specific number of dimensions. interp2 and interp3 use
grids in meshgrid format, while interpn uses grids in ndgrid format.
Interpolation Objects
There are memory and performance benefits to using griddedInterpolant objects over the
interp functions. griddedInterpolant offers substantial performance improvements for repeated
queries of the interpolant object, whereas the interp functions perform a new calculation each time
they are called. Also, griddedInterpolant stores the sample points in a memory-efficient format
(as a compact grid; see “Compact Grid” on page 8-6) and is multithreaded to take advantage of
multicore computer processors.
Grid Representations
MATLAB allows you to represent a grid in one of three representations: full grid, compact grid, or
default grid. The default grid and compact grid are used primarily for convenience and improved
efficiency, respectively.
Full Grid
A full grid is one in which all points are explicitly defined. The outputs of ndgrid and meshgrid
define a full grid. You can create full grids that are uniform, in which points in each dimension have
equal spacing, or nonuniform, in which the spacing varies in one or more of the dimensions. Uniform
grids can have different spacing in each dimension, as long as the spacing is constant within each
dimension.
X =
1 2 3
1 2 3
1 2 3
1 2 3
Y =
3 3 3
6 6 6
9 9 9
12 12 12
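The command that produced the matrices above is not shown. One call that reproduces them, assuming grid vectors named xgv and ygv (hypothetical names), is sketched here.

```matlab
% Grid vectors: three x-coordinates and four y-coordinates.
xgv = [1 2 3];
ygv = [3 6 9 12];

% meshgrid replicates xgv along the rows of X and ygv along the columns of Y.
[X,Y] = meshgrid(xgv,ygv);
```

This grid is uniform in each dimension, with a spacing of 1 in x and a spacing of 3 in y.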
Compact Grid
Explicitly defining every point in a grid can consume a lot of memory when you are dealing with large
grids. The compact grid representation is a way to dispense with the memory overhead of a full grid.
The compact grid representation stores only grid vectors (one for each dimension) instead of the full
grid. Together, the grid vectors implicitly define the grid. In fact, the inputs for meshgrid and
ndgrid are grid vectors, and these functions replicate the grid vectors to form the full grid. The
compact grid representation enables you to bypass grid creation and supply the grid vectors directly
to the interpolation function.
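As a minimal sketch of the difference between the two representations, the following code creates one griddedInterpolant from a full grid and one from a compact grid, and evaluates both at the same query point. The sample values V and the query location are illustrative assumptions.

```matlab
% Grid vectors, one for each dimension.
x1 = 1:3;
x2 = 1:5;

% Full grid in ndgrid format, and hypothetical sample values on that grid.
[X1,X2] = ndgrid(x1,x2);
V = X1.^2 + X2;

% Full grid: pass the coordinate arrays.
Ffull = griddedInterpolant(X1,X2,V);

% Compact grid: pass the grid vectors in a cell array.
Fcompact = griddedInterpolant({x1,x2},V);

% Both interpolants return the same value at a query point.
vqFull = Ffull(2.5,3.5);
vqCompact = Fcompact(2.5,3.5);
```

The compact form avoids creating X1 and X2 for interpolation purposes; in this sketch they are created only to define the sample values.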
For example, consider two vectors, x1 = 1:3 and x2 = 1:5. You can think of these vectors as a set
of coordinates in the x1 direction and a set of coordinates in the x2 direction, like so:
Each arrow points to a location. You can use these two vectors to define a set of grid points, where
one set of coordinates is given by x1 and the other set of coordinates is given by x2. When the grid
vectors are replicated, they form two coordinate arrays that make up the full grid:
Your input grid vectors might be monotonic or nonmonotonic. Monotonic vectors contain values that
either increase in that dimension or decrease in that dimension. Conversely, nonmonotonic vectors
contain values that fluctuate. If the input grid vector is nonmonotonic, such as [2 4 6 3 1], then
[X1,X2] = ndgrid([2 4 6 3 1]) outputs a nonmonotonic grid. Your grid vectors should be
monotonic if you intend to pass the grid to other MATLAB functions. The sort function is useful to
ensure monotonicity.
Default Grid
In some applications, only the values at the grid points are important and not the distances between
grid points. For example, most MRI scans gather data that is uniformly spaced in all directions. In
cases like this, you can allow the interpolation function to automatically generate a default grid
representation to use with the data. To do this, leave out the grid inputs to the interpolation function.
When you leave out the grid inputs, the function automatically considers the data to be on a unit-
spaced grid. The function creates this unit-spaced grid while it executes, saving you the trouble of
creating a grid yourself.
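The following sketch illustrates the default grid with a small matrix of hypothetical sample values. It compares a call that omits the grid inputs with a call that supplies the equivalent unit-spaced grid vectors explicitly.

```matlab
% Hypothetical sample values on a 4-by-4 grid.
V = magic(4);

% Default grid: omit the grid inputs. The query is at x = 2.5 (columns), y = 1.5 (rows).
vqDefault = interp2(V,2.5,1.5);

% Equivalent call with explicit unit-spaced grid vectors.
vqExplicit = interp2(1:4,1:4,V,2.5,1.5);

% griddedInterpolant also accepts values alone. It uses ndgrid ordering,
% so the first query coordinate runs along the rows of V.
F = griddedInterpolant(V);
vqObject = F(1.5,2.5);
```

All three calls evaluate the same linear interpolant on the unit-spaced grid.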
[X,Y] = meshgrid(0:5:20)
X =
0 5 10 15 20
0 5 10 15 20
0 5 10 15 20
0 5 10 15 20
0 5 10 15 20
Y =
0 0 0 0 0
5 5 5 5 5
10 10 10 10 10
15 15 15 15 15
20 20 20 20 20
The (x,y) coordinates of each grid point are represented as corresponding elements in the X and Y
matrices. The first grid point is given by [X(1) Y(1)], which is [0 0], the next grid point is given
by [X(2) Y(2)], which is [0 5], and so on.
Now, create a matrix to represent temperature measurements on the grid and then plot the data as a
surface.
T = [1 1 10 1 1;
1 10 10 10 10;
100 100 1000 100 100;
10 10 10 10 1;
1 1 10 1 1];
surf(X,Y,T)
view(2)
Although the temperature at the center grid point is large, its location and influence on surrounding
grid points is not apparent from the raw data.
To improve the resolution of the data by a factor of 10, use interp2 to interpolate the temperature
data onto a finer grid that uses 0.5 cm intervals. Use meshgrid again to create a finer grid
represented by the matrices Xq and Yq. Then, use interp2 with the original grid, the temperature
data, and the new grid points, and plot the resulting data. By default, interp2 uses linear
interpolation in each dimension.
[Xq,Yq] = meshgrid(0:0.5:20);
Tq = interp2(X,Y,T,Xq,Yq);
surf(Xq,Yq,Tq)
view(2)
Interpolating the temperature data adds detail to the image and greatly improves the usefulness of
the data within the area of measurements.
Method Description

Method: 'nearest'
Description: The interpolated value at a query point is the value at the nearest sample grid point.
• Discontinuous
• Modest memory requirements
• Fastest computation time
• Requires 2 grid points in each dimension

Method: 'next'
Description: The interpolated value at a query point is the value at the next sample grid point.
• Discontinuous
• Same memory requirements and computation time as nearest neighbor
• Available for 1-D interpolation only
• Requires at least 2 grid points

Method: 'previous'
Description: The interpolated value at a query point is the value at the previous sample grid point.
• Discontinuous
• Same memory requirements and computation time as nearest neighbor
• Available for 1-D interpolation only
• Requires at least 2 grid points

Method: 'linear'
Description: The interpolated value at a query point is based on linear interpolation of the values
at neighboring grid points in each respective dimension.
• C0 continuous
• Requires more memory and computation time than nearest neighbor
• Requires at least 2 grid points in each dimension

Method: 'pchip'
Description: The interpolated value at a query point is based on a shape-preserving piecewise cubic
interpolation of the values at neighboring grid points.
• C1 continuous
• Requires more memory and computation time than linear
• Available for 1-D interpolation only
• Requires at least 4 grid points

Method: 'cubic'
Description: The interpolated value at a query point is based on cubic interpolation of the values
at neighboring grid points in each respective dimension.
• C1 continuous
• Requires more memory and computation time than linear
• Grid must have uniform spacing, though the spacing in each dimension does not have to be the
same
• Requires at least 4 points in each dimension

Method: 'makima'
Description: The interpolated value at a query point is based on a piecewise function of polynomials
with degree at most three evaluated using the values of neighboring grid points in each respective
dimension. The Akima formula is modified to avoid overshoots.
• C1 continuous
• Similar memory requirements as spline
• Requires more computation time than cubic, but typically less than spline
• Requires at least 2 grid points in each dimension

Method: 'spline'
Description: The interpolated value at a query point is based on a cubic interpolation of the values
at neighboring grid points in each respective dimension.
• C2 continuous
• Requires more memory and computation time than cubic
• Requires at least 4 points in each dimension
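To compare the methods on a small case, the following sketch evaluates several interpolation methods with interp1 at a single query point. The sample data (a quadratic sampled at integer locations), the query point xq, and the variable name methodNames are assumptions chosen for illustration.

```matlab
% Hypothetical 1-D samples of a quadratic function.
x = 0:5;
v = x.^2;

% Query point between the samples at x = 2 and x = 3.
xq = 2.4;

% Methods to compare.
methodNames = {'nearest','next','previous','linear','pchip','makima','spline'};

% Evaluate each method at the query point.
vq = zeros(size(methodNames));
for k = 1:numel(methodNames)
    vq(k) = interp1(x,v,xq,methodNames{k});
end
```

The discontinuous methods return one of the neighboring sample values, while the continuous methods return values between the neighboring samples that depend on how many surrounding points each method uses.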
See Also
interp1 | interp2 | interp3 | interpn | griddedInterpolant
Related Examples
• “Resample Image with Gridded Interpolation” on page 8-53
• “Interpolation of Multiple 1-D Value Sets” on page 8-13
• “Interpolation of 2-D Selections in 3-D Grids” on page 8-15
Interpolation of Multiple 1-D Value Sets
This example shows how to interpolate three 1-D data sets in a single pass using
griddedInterpolant. This is a faster alternative to looping over your data sets.
x = (1:5)';
V = [x, 2*x, 3*x]
V = 5×3
1 2 3
2 4 6
3 6 9
4 8 12
5 10 15
Create the interpolant F by passing the sample points and sample values to griddedInterpolant.
With this setup, griddedInterpolant interprets V as containing three different 1-D data sets
defined at the same x-values.
F = griddedInterpolant(x,V);
qx = 1:0.5:5;
Vq = F(qx)
Vq = 9×3
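For comparison, the following sketch evaluates the same three data sets with a loop over interp1 and checks that the result matches the single call to the interpolant. The variable name VqLoop is hypothetical, and the query points are stored as a column vector here.

```matlab
x = (1:5)';
V = [x, 2*x, 3*x];
qx = (1:0.5:5)';

% Single pass: each column of V is treated as a separate 1-D data set.
F = griddedInterpolant(x,V);
Vq = F(qx);

% Equivalent loop over the columns of V.
VqLoop = zeros(numel(qx),size(V,2));
for k = 1:size(V,2)
    VqLoop(:,k) = interp1(x,V(:,k),qx);
end
```

Both approaches use linear interpolation by default, so Vq and VqLoop contain the same 9-by-3 result.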
See Also
griddedInterpolant
Related Examples
• “Interpolating Gridded Data” on page 8-5
• “Resample Image with Gridded Interpolation” on page 8-53
Interpolation of 2-D Selections in 3-D Grids
This example shows how to reduce the dimensionality of the grid plane arrays in 3-D to solve a 2-D
interpolation problem.
In some application areas, it might be necessary to interpolate a lower dimensional plane of a grid;
for example, interpolating a plane of a 3-D grid. When you extract the grid plane from the 3-D grid,
the resulting arrays might be in 3-D format. You can use the squeeze function to reduce the
dimensionality of the grid plane arrays to solve the problem in 2-D.
[X,Y,Z] = ndgrid(1:5);
V = X.^2 + Y.^2 + Z;
Select a 2-D sample from the grid, in this case the third column of samples.
x = X(:,3,:);
z = Z(:,3,:);
v = V(:,3,:);
The 2-D plane occurs at Y=3, so the Y dimension has been fixed. x, z, and v are 5-by-1-by-5 arrays.
You must reduce them to 2-D arrays before evaluating the interpolant.
x = squeeze(x);
z = squeeze(z);
v = squeeze(v);
[Xq,Zq] = ndgrid(1:0.5:5);
Vq = interpn(x,z,v,Xq,Zq);
figure
surf(Xq,Zq,Vq);
xlabel('Xq');
ylabel('Zq');
zlabel('Vq');
See Also
interpn | squeeze
More About
• “Interpolating Gridded Data” on page 8-5
• “Interpolation of Multiple 1-D Value Sets” on page 8-13
Interpolating Scattered Data
Scattered Data
Scattered data consists of a set of points X and corresponding values V, where the points have no
structure or order between their relative locations. There are various approaches to interpolating
scattered data. One widely used approach uses a Delaunay triangulation of the points.
This example shows how to construct an interpolating surface by triangulating the points and lifting
the vertices by a magnitude V into a dimension orthogonal to X.
There are variations on how you can apply this approach. In this example, the interpolation is broken
down into separate steps; typically, the overall interpolation process is accomplished with one
function call.
Create a Delaunay triangulation, lift the vertices, and evaluate the interpolant at the query point Xq.
figure('Color', 'white')
t = delaunay(X(:,1),X(:,2));
hold on
view(322.5, 30);
This step generally involves traversing the triangulation data structure to find the triangle that
encloses the query point. Once you find the enclosing triangle, the subsequent steps to compute the
value depend
on the interpolation method. You could compute the nearest point in the neighborhood and use the
value at that point (the nearest-neighbor interpolation method). You could also compute the weighted
sum of values of the three vertices of the enclosing triangle (the linear interpolation method). These
methods and their variants are covered in texts and references on scattered data interpolation.
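The separate steps described above can be sketched as follows and compared with a single call to scatteredInterpolant. The point set X, the values V, and the query point Xq are hypothetical, and the values are chosen to be linear in the coordinates so that the linear method is exact.

```matlab
% Hypothetical scattered data.
rng('default')
X = rand(30,2);
V = 2*X(:,1) + 3*X(:,2);

% Query at the centroid of the points, which lies inside the convex hull.
Xq = mean(X,1);

dt = delaunayTriangulation(X);

% Nearest-neighbor method: use the value at the closest sample point.
vqNearest = V(nearestNeighbor(dt,Xq));

% Linear method: find the enclosing triangle, then form the weighted sum
% of the values at its three vertices.
[ti,bc] = pointLocation(dt,Xq);
vqLinear = dot(bc, V(dt.ConnectivityList(ti,:)));

% The same results from one function call per method.
Fnearest = scatteredInterpolant(X,V,'nearest');
Flinear = scatteredInterpolant(X,V,'linear');
```

Evaluating Fnearest(Xq) and Flinear(Xq) gives the same values as the step-by-step computation.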
Though the illustration highlights 2-D interpolation, you can apply this technique to higher
dimensions. In more general terms, given a set of points X and corresponding values V, you can
construct an interpolant of the form V = F(X). You can evaluate the interpolant at a query point Xq,
to give Vq = F(Xq). This is a single-valued function; for any query point Xq within the convex hull of
X, it will produce a unique value Vq. The sample data is assumed to respect this property in order to
produce a satisfactory interpolation.
The griddata function supports 2-D scattered data interpolation. The griddatan function supports
scattered data interpolation in N-D; however, it is not practical in dimensions higher than 6-D for
moderate to large point sets, due to the exponential growth in memory required by the underlying
triangulation.
The scatteredInterpolant class supports scattered data interpolation in 2-D and 3-D space. Use
of this class is encouraged as it is more efficient and readily adapts to a wider range of interpolation
problems.
This example shows how the griddata function interpolates scattered data at a set of grid points
and uses this gridded data to create a contour plot.
Plot the seamount data set (a seamount is an underwater mountain). The data set consists of a set of
longitude (x) and latitude (y) locations, and corresponding seamount elevations (z) measured at
those coordinates.
load seamount
plot3(x,y,z,'.','markersize',12)
xlabel('Longitude')
ylabel('Latitude')
zlabel('Depth in Feet')
grid on
Use meshgrid to create a set of 2-D grid points in the longitude-latitude plane and then use
griddata to interpolate the corresponding depth at those points.
figure
[xi,yi] = meshgrid(210.8:0.01:211.8, -48.5:0.01:-47.9);
zi = griddata(x,y,z,xi,yi);
surf(xi,yi,zi);
xlabel('Longitude')
ylabel('Latitude')
zlabel('Depth in Feet')
Now that the data is in a gridded format, compute and plot the contours.
figure
[c,h] = contour(xi,yi,zi);
clabel(c,h);
xlabel('Longitude')
ylabel('Latitude')
You can also use griddata to interpolate at arbitrary locations within the convex hull of the dataset.
For example, the depth at coordinates (211.3, -48.2) is given by:
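The command for this query is not shown here. A call along the following lines evaluates the interpolant at that single location; the output variable name zq is an assumption.

```matlab
load seamount

% Interpolate the depth at longitude 211.3 and latitude -48.2.
zq = griddata(x,y,z,211.3,-48.2);
```

With the default linear method, zq is a weighted average of the depths at the vertices of the enclosing triangle, so it lies between the smallest and largest values in z.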
The underlying triangulation is computed each time the griddata function is called. This can impact
performance if the same data set is interpolated repeatedly with different query points. The
scatteredInterpolant class described in “Interpolating Scattered Data Using the
scatteredInterpolant Class” on page 8-23 is more efficient in this respect.
MATLAB software also provides griddatan to support interpolation in higher dimensions. The
calling syntax is similar to griddata.
scatteredInterpolant Class
The griddata function is useful when you need to interpolate to find the values at a set of
predefined grid-point locations. In practice, interpolation problems are often more general, and the
scatteredInterpolant class provides greater flexibility. The class has the following advantages:
• It produces an interpolating function that can be queried efficiently. That is, the underlying
triangulation is created once and reused for subsequent queries.
• The interpolation method can be changed independently of the triangulation.
• The values at the data points can be changed independently of the triangulation.
• Data points can be incrementally added to the existing interpolant without triggering a complete
recomputation. Data points can also be removed and moved efficiently, provided the number of
points edited is small relative to the total number of sample points.
• It provides extrapolation functionality for approximating values at points that fall outside the
convex hull. See “Extrapolating Scattered Data” on page 8-44 for more information.
The scatteredInterpolant class supports scattered data interpolation in 2-D and 3-D space. You
can create the interpolant by calling scatteredInterpolant and passing the point locations and
corresponding values, and optionally the interpolation and extrapolation methods. See the
scatteredInterpolant reference page for more information about the syntaxes you can use to
create and evaluate a scatteredInterpolant.
This example shows how to use scatteredInterpolant to interpolate a scattered sampling of the
peaks function.
rng default;
X = -3 + 6.*rand([250 2]);
V = peaks(X(:,1),X(:,2));
F = scatteredInterpolant(X,V)
F =
scatteredInterpolant with properties:
The Points property represents the coordinates of the data points, and the Values property
represents the associated values. The Method property represents the interpolation method that
performs the interpolation. The ExtrapolationMethod property represents the extrapolation
method used when query points fall outside the convex hull.
You can access the properties of F in the same way you access the fields of a struct. For example,
use F.Points to examine the coordinates of the data points.
8 Interpolation
Vq = F([1.5 1.25])
Vq = 1.4838
Vq = F(1.5, 1.25)
Vq = 1.4838
Vq = 3×1
0.4057
1.2199
2.1639
You can evaluate F at grid point locations and plot the result.
[Xq,Yq] = meshgrid(-2.5:0.125:2.5);
Vq = F(Xq,Yq);
surf(Xq,Yq,Vq);
xlabel('X','fontweight','b'), ylabel('Y','fontweight','b');
zlabel('Value - V','fontweight','b');
title('Linear Interpolation Method','fontweight','b');
You can change the interpolation method on the fly. Set the method to 'nearest'.
F.Method = 'nearest';
Vq = F(Xq,Yq);
figure
surf(Xq,Yq,Vq);
xlabel('X','fontweight','b'),ylabel('Y','fontweight','b')
zlabel('Value - V','fontweight','b')
title('Nearest neighbor Interpolation Method','fontweight','b');
Change the interpolation method to natural neighbor, reevaluate, and plot the results.
F.Method = 'natural';
Vq = F(Xq,Yq);
figure
surf(Xq,Yq,Vq);
xlabel('X','fontweight','b'),ylabel('Y','fontweight','b')
zlabel('Value - V','fontweight','b')
title('Natural neighbor Interpolation Method','fontweight','b');
You can change the values V at the sample data locations, X, on the fly. This is useful in practice as
some interpolation problems may have multiple sets of values at the same locations. For example,
suppose you want to interpolate a 3-D velocity field that is defined by locations (x, y, z) and
corresponding componentized velocity vectors (Vx, Vy, Vz). You can interpolate each of the velocity
components by assigning them to the values property (V) in turn. This has important performance
benefits, because it allows you to reuse the same interpolant without incurring the overhead of
computing a new one each time.
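The velocity-field scenario can be sketched as follows. The sample locations, the three linear component fields, and the query points are hypothetical; the point of the sketch is only that the interpolant is constructed once and the Values property is replaced for each component.

```matlab
% Sketch: reuse one interpolant for several value sets at the same locations.
rng default
P  = rand(200,3);                 % hypothetical sample locations (x,y,z)
Vx = P(:,1) + 2*P(:,2);           % assumed velocity components
Vy = 3*P(:,3);
Vz = P(:,1) - P(:,3);
Pq = 0.25 + 0.5*rand(10,3);       % query points well inside the sampled region

F = scatteredInterpolant(P,Vx);   % the triangulation is computed here, once
vxq = F(Pq);

F.Values = Vy;                    % replace the values, keep the triangulation
vyq = F(Pq);

F.Values = Vz;
vzq = F(Pq);
```

The component fields are linear here so that the result is easy to check by hand; linear interpolation reproduces a linear field inside the convex hull.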
The following steps show how to change the values in our example. You will compute the values using
the expression $v = x e^{-x^2 - y^2}$.
V = X(:,1).*exp(-X(:,1).^2-X(:,2).^2);
F.Values = V;
Vq = F(Xq,Yq);
figure
surf(Xq,Yq,Vq);
xlabel('X','fontweight','b'), ylabel('Y','fontweight','b')
zlabel('Value - V','fontweight','b')
title('Natural neighbor interpolation of v = x.*exp(-x.^2-y.^2)')
You can incrementally add sample data points to the existing interpolant. This performs an efficient
update as opposed to a complete recomputation using the augmented data set.
When adding sample data, it is important to add both the point locations and the corresponding
values.
X = -1.5 + 3.*rand(100,2);
V = X(:,1).*exp(-X(:,1).^2-X(:,2).^2);
F.Points(end+(1:100),:) = X;
F.Values(end+(1:100)) = V;
Vq = F(Xq,Yq);
figure
surf(Xq,Yq,Vq);
xlabel('X','fontweight','b'), ylabel('Y','fontweight','b');
zlabel('Value - V','fontweight','b');
You can incrementally remove sample data points from the interpolant. This is useful for removing
spurious outliers. When removing sample data, it is important to remove both the point location and
the corresponding value.
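A minimal sketch of outlier removal follows. The planted outlier, the threshold of 5, and the variable name bad are illustrative assumptions; for this sample function the true values never exceed about 0.43 in magnitude, so the threshold isolates only the planted point.

```matlab
% Sketch: remove a spurious sample (location and value) from an interpolant.
rng default
X = -3 + 6*rand(100,2);
V = X(:,1).*exp(-X(:,1).^2 - X(:,2).^2);
V(10) = 50;                              % plant an outlier (illustrative)
F = scatteredInterpolant(X,V);

bad = abs(F.Values) > 5;                 % hypothetical rule for flagging outliers
F.Points(bad,:) = [];                    % remove the point location ...
F.Values(bad)   = [];                    % ... and the corresponding value
Vq = F(0,0);                             % the interpolant is consistent again
```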
xlabel('X','fontweight','b'), ylabel('Y','fontweight','b');
zlabel('Value - V','fontweight','b');
title('Interpolation of v = x.*exp(-x.^2-y.^2) with sample points removed')
This example shows how to interpolate scattered data when the value at each sample location is
complex.
rng('default')
X = -3 + 6*rand([250 2]);
V = complex(X(:,1).*X(:,2), X(:,1).^2 + X(:,2).^2);
F = scatteredInterpolant(X,V);
Create a grid of query points and evaluate the interpolant at the grid points.
[Xq,Yq] = meshgrid(-2.5:0.125:2.5);
Vq = F(Xq,Yq);
VqReal = real(Vq);
figure
surf(Xq,Yq,VqReal);
xlabel('X');
ylabel('Y');
zlabel('Real Value - V');
title('Real Component of Interpolated Value');
VqImag = imag(Vq);
figure
surf(Xq,Yq,VqImag);
xlabel('X');
ylabel('Y');
zlabel('Imaginary Value - V');
title('Imaginary Component of Interpolated Value');
When dealing with real-world interpolation problems, the data may be more challenging. It may come
from measuring equipment that is likely to produce inaccurate readings or outliers. The underlying
data may not vary smoothly; the values may jump abruptly from point to point. This section provides
you with some guidelines to identify and address problems with scattered data interpolation.
You should preprocess sample data that contains NaN values to remove the NaN values as this data
cannot contribute to the interpolation. If a NaN is removed, the corresponding data values/
coordinates should also be removed to ensure consistency. If NaN values are present in the sample
data, the constructor will error when called.
V = x.^2 + y.^2;
F = scatteredInterpolant(x,y,V);
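Instead, find the sample point indices of the NaN values, remove them, and then construct the interpolant:
nan_flags = isnan(x) | isnan(y) | isnan(V);   % true where any coordinate or value is NaN
x(nan_flags) = [];
y(nan_flags) = [];
V(nan_flags) = [];
F = scatteredInterpolant(x,y,V);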
The following example is similar if the point locations are in matrix form. First, create data and
replace some entries with NaN values.
X = rand(25,2)*4-2;
V = X(:,1).^2 + X(:,2).^2;
F = scatteredInterpolant(X,V);
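Find the sample point indices of the NaN values, remove them, and then construct the interpolant:
nan_flags = any(isnan(X),2) | isnan(V);   % true for rows with a NaN coordinate or value
X(nan_flags,:) = [];
V(nan_flags) = [];
F = scatteredInterpolant(X,V);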
griddata and griddatan return NaN values when you query points outside the convex hull using
the 'linear' or 'natural' methods. However, you can expect numeric results if you query the
same points using the 'nearest' method. This is because the nearest neighbor to a query point
exists both inside and outside the convex hull.
If you want to compute approximate values outside the convex hull, you should use
scatteredInterpolant. See “Extrapolating Scattered Data” on page 8-44 for more information.
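The difference in behavior outside the convex hull can be sketched as follows. The sample data and the query point (2,2), which lies outside the unit square that contains all the samples, are illustrative assumptions.

```matlab
% Sketch: querying outside the convex hull with griddata and scatteredInterpolant.
rng default
x = rand(50,1);                               % samples lie in the unit square
y = rand(50,1);
v = x + y;
xq = 2;  yq = 2;                              % query point outside the convex hull

vLinear  = griddata(x,y,v,xq,yq,'linear');    % NaN: no enclosing triangle
vNearest = griddata(x,y,v,xq,yq,'nearest');   % numeric: value at the nearest sample

F = scatteredInterpolant(x,y,v);              % default extrapolation method is 'linear'
vExtrap = F(xq,yq);                           % numeric: extrapolated value
```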
Input data is rarely “perfect” and your application could have to handle duplicate data point
locations. Two or more data points at the same location in your data set can have different
corresponding values. In this scenario, scatteredInterpolant merges the points and computes
the average of the corresponding values. This example shows how scatteredInterpolant
performs an interpolation on a data set with duplicate points.
1 Create some sample data that lies on a planar surface:
x = rand(100,1)*6-3;
y = rand(100,1)*6-3;
V = x + y;
2 Introduce a duplicate point location by assigning the coordinates of point 50 to point 100:
x(50) = x(100);
y(50) = y(100);
3 Create the interpolant. Notice that F contains 99 unique data points:
F = scatteredInterpolant(x,y,V)
4 Check the value associated with the 50th point:
F.Values(50)
This value is the average of the original 50th and 100th value, as these two data points have the same
location:
(V(50)+V(100))/2
In this scenario, the interpolant resolves the ambiguity in a reasonable manner. However, in some
instances, data points can be close rather than coincident, and the values at those locations can be
different.
In some interpolation problems, multiple sets of sample values might correspond to the same
locations. For example, a set of values might be recorded at the same locations at different periods in
time. For efficiency, you can interpolate one set of readings and then replace the values to interpolate
the next set.
Always use consistent data management when replacing values in the presence of duplicate point
locations. Suppose you have two sets of values associated with the 100 data point locations and you
would like to interpolate each set in turn by replacing the values.
V1 = x + 4*y;
V2 = 3*x + 5*y
2 Create the interpolant. scatteredInterpolant merges the duplicate locations and the
interpolant contains 99 unique sample points:
F = scatteredInterpolant(x,y,V1)
Replacing the values directly via F.Values = V2 means assigning 100 values to 99 samples.
The context of the previous merge operation is lost; the number of sample locations will not
match the number of sample values. The interpolant will require the inconsistency to be resolved
to support queries.
In this more complex scenario, it is necessary to remove the duplicates prior to creating and editing
the interpolant. Use the unique function to find the indices of the unique points. unique can also
return output arguments that identify the indices of the duplicate points.
V1 = V1(I);
V2 = V2(I);
F = scatteredInterpolant(x,y,V1)
Now you can use F to interpolate the first data set. Then you can replace the values to interpolate the
second data set.
F.Values = V2;
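The whole workflow of removing duplicates and then replacing values can be sketched end to end as follows. Keeping the first occurrence of each location (the 'stable' option of unique) is an assumption made for this sketch; an application might instead average the values at duplicate locations.

```matlab
% Sketch: remove duplicate locations before creating and editing the interpolant.
rng default
x = rand(100,1)*6-3;
y = rand(100,1)*6-3;
x(50) = x(100);                          % introduce one duplicate location
y(50) = y(100);
V1 = x + 4*y;
V2 = 3*x + 5*y;

[~,I] = unique([x y],'rows','stable');   % indices of the first occurrence of each location
x  = x(I);   y  = y(I);
V1 = V1(I);  V2 = V2(I);

F = scatteredInterpolant(x,y,V1);        % 99 locations, 99 values
vq1 = F(0,0);
F.Values = V2;                           % sizes still match, so this is consistent
vq2 = F(0,0);
```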
scatteredInterpolant allows you to edit the properties representing the sample values
(F.Values) and the interpolation method (F.Method). Since these properties are independent of the
underlying triangulation, the edits can be performed efficiently. However, like working with a large
array, you should take care not to accidentally create unnecessary copies when editing the data.
Copies are made when more than one variable references an array and that array is then edited.
For example, a copy is made in the following scenario because the array is referenced by another
variable. The arrays A1 and A2 can no longer share the same data once the edit is made:
A1 = magic(4)
A2 = A1
A1(4,4) = 32
Similarly, if you pass the array to a function and edit the array within that function, a deep copy may
be made depending on how the data is managed. scatteredInterpolant contains data and it
behaves like an array—in MATLAB language, it is called a value object. The MATLAB language is
designed to give optimum performance when your application is structured into functions that reside
in files. Prototyping at the command line may not yield the same level of performance.
The following example demonstrates this behavior, but it should be noted that performance gains in
this example do not generalize to other functions in MATLAB.
You can place the code in a function file to execute it more efficiently.
When MATLAB executes a program that is composed of functions that reside in files, it has a
complete picture of the execution of the code; this allows MATLAB to optimize for performance. When
you type the code at the command line, MATLAB cannot anticipate what you are going to type next,
so it cannot perform the same level of optimization. Developing applications through the creation of
reusable functions is general and recommended practice, and MATLAB will optimize the performance
in this setting.
The Delaunay triangulation is well suited to scattered data interpolation problems because it has
favorable geometric properties that produce good results. These properties are:
The empty circumcircle property ensures the interpolated values are influenced by sample points in
the neighborhood of the query location. Despite these qualities, in some situations the distribution of
the data points may lead to poor results and this typically happens near the convex hull of the sample
data set. When the interpolation produces unexpected results, a plot of the sample data and
underlying triangulation can often provide insight into the problem.
This example shows an interpolated surface that deteriorates near the boundary.
Create a sample data set that will exhibit problems near the boundary.
t = 0.4*pi:0.02:0.6*pi;
x1 = cos(t)';
y1 = sin(t)'-1.02;
x2 = x1;
y2 = y1*(-1);
x3 = linspace(-0.3,0.3,16)';
y3 = zeros(16,1);
x = [x1;x2;x3];
y = [y1;y2;y3];
Now lift these sample points onto the surface $z = x^2 + y^2$ and interpolate the surface.
z = x.^2 + y.^2;
F = scatteredInterpolant(x,y,z);
[xi,yi] = meshgrid(-0.3:.02:0.3, -0.0688:0.01:0.0688);
zi = F(xi,yi);
mesh(xi,yi,zi)
xlabel('X','fontweight','b'), ylabel('Y','fontweight','b')
zlabel('Value - V','fontweight','b')
title('Interpolated Surface');
zi = xi.^2 + yi.^2;
figure
mesh(xi,yi,zi)
title('Actual Surface')
To understand why the interpolating surface deteriorates near the boundary, it is helpful to look at
the underlying triangulation:
dt = delaunayTriangulation(x,y);
figure
plot(x,y,'*r')
axis equal
hold on
triplot(dt)
plot(x1,y1,'-r')
plot(x2,y2,'-r')
title('Triangulation Used to Create the Interpolant')
hold off
The triangles within the red boundaries are relatively well shaped; they are constructed from points
that are in close proximity and the interpolation works well in this region. Outside the red boundary,
the triangles are sliver-like and connect points that are remote from each other. There is not
sufficient sampling to accurately capture the surface, so it is not surprising that the results in these
regions are poor. In 3-D, visual inspection of the triangulation gets a bit trickier, but looking at the
point distribution can often help illustrate potential problems.
The MATLAB® 4 griddata method, 'v4', is not triangulation-based and is not affected by
deterioration of the interpolation surface near the boundary.
The interpolated surface from griddata using the 'v4' method corresponds to the expected actual
surface.
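A minimal sketch of the 'v4' call follows. It rebuilds the boundary-problem sample set used earlier in this section so that the block is self-contained; the variable maxErr is an illustrative addition for inspecting the deviation from the true surface and is not part of the original example.

```matlab
% Sketch: interpolate the boundary-problem data set with the 'v4' method.
t  = 0.4*pi:0.02:0.6*pi;
x1 = cos(t)';
y1 = sin(t)'-1.02;
x  = [x1; x1; linspace(-0.3,0.3,16)'];
y  = [y1; -y1; zeros(16,1)];
z  = x.^2 + y.^2;

[xi,yi] = meshgrid(-0.3:.02:0.3, -0.0688:0.01:0.0688);
zi = griddata(x,y,z,xi,yi,'v4');            % not triangulation-based

maxErr = max(abs(zi(:) - (xi(:).^2 + yi(:).^2)));   % deviation from the true surface
mesh(xi,yi,zi)
title('griddata with the ''v4'' Method')
```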
Interpolation Using a Specific Delaunay Triangulation
This example shows how to perform nearest-neighbor interpolation on a scattered set of points using
a specific Delaunay triangulation.
rng('default')
P = -2.5 + 5*rand([50 2]);
DT = delaunayTriangulation(P)
DT =
delaunayTriangulation with properties:
V = P(:,1).^2 + P(:,2).^2;
Pq = -2 + 4*rand([10 2]);
vi = nearestNeighbor(DT,Pq);
Vq = V(vi)
Vq = 10×1
2.7208
3.7792
1.8394
3.5086
1.8394
3.5086
1.4258
5.4053
4.0670
0.5586
This example shows how to perform linear interpolation on a scattered set of points with a specific
Delaunay triangulation.
You can use the triangulation method, pointLocation, to compute the enclosing triangle of a
query point and the magnitudes of the vertex weights. The weights are called barycentric
coordinates, and they represent a partition of unity. That is, the sum of the three weights equals 1.
The interpolated value of a function, V, at a query point is the sum of the weighted values of V at the
three vertices. That is, if the function has values, V1, V2, V3 at the three vertices, and the weights are
B1, B2, B3, then the interpolated value is (V1)(B1) + (V2)(B2) + (V3)(B3).
DT =
delaunayTriangulation with properties:
Find the triangle that encloses each query point using the pointLocation method. In the code
below, ti contains the IDs of the enclosing triangles and bc contains the barycentric coordinates
associated with each triangle.
[ti,bc] = pointLocation(DT,Pq);
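Find the values of V(x,y) at the vertices of each enclosing triangle, and then calculate the sum of
the weighted values using the dot product.
triVals = V(DT(ti,:));   % values at the three vertices of each enclosing triangle
Vq = dot(bc',triVals')'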
Vq = 10×1
2.2736
4.2596
2.1284
3.5372
4.6232
2.1797
1.2779
4.7644
3.6311
1.2196
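The barycentric weighting can be checked with a short self-contained sketch. The linear test function V = 2x + 3y + 1 and the narrower query range are assumptions chosen so that the expected answer is known: barycentric interpolation reproduces a linear function exactly. The logical vector inside guards against query points that fall outside the convex hull, for which pointLocation returns NaN.

```matlab
% Sketch: linear interpolation from barycentric coordinates on a given triangulation.
rng default
P  = -2.5 + 5*rand(50,2);
DT = delaunayTriangulation(P);
V  = 2*P(:,1) + 3*P(:,2) + 1;            % assumed linear test function
Pq = -1 + 2*rand(10,2);                  % query points near the middle of the domain

[ti,bc] = pointLocation(DT,Pq);          % enclosing triangle IDs and barycentric weights
inside  = ~isnan(ti);                    % NaN marks points outside the convex hull

% Values at the three vertices of each enclosing triangle (one row per query point).
triVals = reshape(V(DT.ConnectivityList(ti(inside),:)),[],3);

% Weighted sum (V1)(B1) + (V2)(B2) + (V3)(B3) for each query point.
Vq = sum(bc(inside,:).*triVals,2);
```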
See Also
delaunayTriangulation | pointLocation | nearestNeighbor
More About
• “Interpolating Scattered Data” on page 8-17
In addition, the triangulation near the convex hull boundary can have sliver-like triangles. These
triangles can compromise your extrapolation results in the same way that they can compromise
interpolation results. See “Interpolation Results Poor Near the Convex Hull” on page 8-35 for more
information.
You should inspect your extrapolation results visually using your knowledge of the behavior outside
the domain.
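As a small numeric complement to visual inspection, the following sketch compares the extrapolation methods at a single query point outside the convex hull. The sample data and the query point (3,3) are illustrative assumptions.

```matlab
% Sketch: compare extrapolation methods at a point outside the convex hull.
rng default
x = -1 + 2*rand(100,1);
y = -1 + 2*rand(100,1);
v = x.^2 + y.^2;                         % true value at (3,3) would be 18

F = scatteredInterpolant(x,y,v,'linear','linear');
vLin = F(3,3);                           % linear extrapolation

F.ExtrapolationMethod = 'nearest';
vNear = F(3,3);                          % value of the nearest boundary sample

F.ExtrapolationMethod = 'none';
vNone = F(3,3);                          % NaN: extrapolation disabled
```

Comparing such values against what you know about the function outside the domain is one way to decide whether extrapolated results can be trusted.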
This example shows how to interpolate two different samplings of the same parabolic function. It also
shows that a better distribution of sample points produces better extrapolation results.
Create a radial distribution of points spaced 10 degrees apart around 10 concentric circles. Use
bsxfun to compute the coordinates, x = r cosθ and y = r sinθ.
theta = 0:10:350;
c = cosd(theta);
s = sind(theta);
r = 1:10;
x1 = bsxfun(@times,r.',c);
y1 = bsxfun(@times,r.',s);
figure
plot(x1,y1,'*b')
axis equal
Extrapolating Scattered Data
Create a second, more coarsely distributed set of points. Use the rand function to create random
samplings in the range, [-10, 10].
rng default;
x2 = -10 + 20*rand([25 1]);
y2 = -10 + 20*rand([25 1]);
figure
plot(x2,y2,'*')
v1 = x1.^2 + y1.^2;
v2 = x2.^2 + y2.^2;
F1 = scatteredInterpolant(x1(:),y1(:),v1(:));
F2 = scatteredInterpolant(x2(:),y2(:),v2(:));
[xq,yq] = ndgrid(-20:20);
figure
vq1 = F1(xq,yq);
surf(xq,yq,vq1)
figure
vq2 = F2(xq,yq);
surf(xq,yq,vq2)
The quality of the extrapolation is not as good for F2 because of the coarse sampling of points in v2.
This example shows how to extrapolate a well sampled 3-D gridded dataset using
scatteredInterpolant. The query points lie on a planar grid that is completely outside the domain.
Create a 21-by-21-by-21 grid of sample points. The points in each dimension are in the range
[-10, 10].
[x,y,z] = ndgrid(-10:10);
F = scatteredInterpolant(x(:),y(:),z(:),v(:),'linear','linear');
Evaluate the interpolant over an x-y grid spanning the range, [-20,20] at an elevation, z = 15.
[xq,yq,zq] = ndgrid(-20:20,-20:20,15);
vq = F(xq,yq,zq);
figure
surf(xq,yq,vq)
The extrapolation returned good results because the function is well sampled.
This example shows how to use normalization to improve scattered data interpolation results with
griddata. Normalization can improve the interpolation results in some cases, but in others it can
compromise the accuracy of the solution. Whether to use normalization is a judgment made based on
the nature of the data being interpolated.
• Benefits: Normalizing your data can potentially improve the interpolation result when the
independent variables have different units and substantially different scales. In this case, scaling
the inputs to have similar magnitudes might improve the numerical aspects of the interpolation.
An example where normalization would be beneficial is if x represents engine speed in RPMs from
500 to 3500, and y represents engine load from 0 to 1. The scales of x and y differ by a few orders
of magnitude, and they have different units.
• Cautions: Use caution when normalizing your data if the independent variables have the same
units, even if the scales of the variables are different. With data of the same units, normalization
distorts the solution by adding a directional bias, which affects the underlying triangulation and
ultimately compromises the accuracy of the interpolation. An example where normalization is
erroneous is if both x and y represent locations and have units of meters. Scaling x and y
unequally is not recommended because 10 m due East should be spatially the same as 10 m due
North.
Create some sample data where the values in y are a few orders of magnitude larger than those in x.
Assume that x and y have different units.
x = rand(1,500)/100;
y = 2.*(rand(1,500)-0.5).*90;
z = (x.*1e2).^2;
Use the sample data to construct a grid of query points. Interpolate the sample data on the grid and
plot the results.
X = linspace(min(x),max(x),25);
Y = linspace(min(y),max(y),25);
[xq, yq] = meshgrid(X,Y);
zq = griddata(x,y,z,xq,yq);
plot3(x,y,z,'mo')
hold on
mesh(xq,yq,zq)
xlabel('x')
ylabel('y')
hold off
Normalize Data with Differing Magnitudes
The result produced by griddata is not very smooth and seems to be noisy. The differing scales in
the independent variables contribute to this, since a small change in the size of one variable can lead
to a much larger change in the size of the other variable.
Since x and y have different units, normalizing them so that they have similar magnitudes should
help produce better results. Normalize the sample points using z-scores and regenerate the
interpolation using griddata.
% Regenerate Grid
X = linspace(min(x),max(x),25);
Y = linspace(min(y),max(y),25);
[xq, yq] = meshgrid(X,Y);
In this case, normalizing the sample points permits griddata to compute a smoother solution.
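The normalization step can be sketched as follows. This is a minimal version that computes z-scores by hand with mean and std; the variable names xn, yn, and zqn are assumptions. The query grid is built in the normalized coordinates so that sample points and query points share the same scaling.

```matlab
% Sketch: z-score normalization of the sample points before calling griddata.
rng default
x = rand(1,500)/100;
y = 2.*(rand(1,500)-0.5).*90;
z = (x.*1e2).^2;

xn = (x - mean(x))/std(x);               % z-scores: zero mean, unit standard deviation
yn = (y - mean(y))/std(y);

% Regenerate the query grid in normalized coordinates and interpolate.
Xn = linspace(min(xn),max(xn),25);
Yn = linspace(min(yn),max(yn),25);
[xq,yq] = meshgrid(Xn,Yn);
zqn = griddata(xn,yn,z,xq,yq);

plot3(xn,yn,z,'mo')
hold on
mesh(xq,yq,zqn)
hold off
```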
See Also
griddata | griddatan | scatteredInterpolant
Resample Image with Gridded Interpolation
This example shows how to use griddedInterpolant to resample the pixels in an image.
Resampling an image is useful for adjusting the resolution and size, and you also can use it to smooth
out the pixels after zooming.
Load Image
Load and show the image ngc6543a.jpg, which is a Hubble Space Telescope image of the planetary
nebula NGC 6543. This image displays several interesting structures, such as concentric gas shells,
jets of high-speed gas, and unusual knots of gas. The matrix A that represents the image is a
650-by-600-by-3 matrix of uint8 integers.
A = imread('ngc6543a.jpg');
imshow(A)
Create Interpolant
Create a gridded interpolant object for the image. griddedInterpolant only works for double-
precision and single-precision matrices, so convert the uint8 matrix to double. To interpolate each
RGB channel of the image, specify two grid vectors to describe the sample points in the first two
dimensions. The grid vectors are grouped together as column vectors in a cell array
{xg1,xg2,...,xgN}. With this formulation, griddedInterpolant treats the 3-D matrix as
containing multiple 2-D data sets defined on the same grid.
sz = size(A);
xg = 1:sz(1);
yg = 1:sz(2);
F = griddedInterpolant({xg,yg},double(A));
Use the sizes of the first two matrix dimensions to resample the image so that it is 120% of its original
size. That is, for each 5 pixels in the original image, the interpolated image has 6 pixels. Evaluate the
interpolant at the query points with the syntax F({xq,yq}). griddedInterpolant evaluates each
page in the 3-D image at the query points.
xq = (0:5/6:sz(1))';
yq = (0:5/6:sz(2))';
vq = uint8(F({xq,yq}));
imshow(vq)
title('Higher Resolution')
Similarly, reduce the size of the image by querying the interpolant with a coarser spacing of 1.55
pixels, which yields about 35% fewer points in each dimension than the original image. While you
can simply index into the original image matrix to produce lower resolution
images, interpolation enables you to resample the image at noninteger pixel locations.
xq = (0:1.55:sz(1))';
yq = (0:1.55:sz(2))';
vq = uint8(F({xq,yq}));
figure
imshow(vq)
title('Lower Resolution')
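The same resampling pattern can be tried without an image file. The following sketch uses a small synthetic 6-by-5-by-3 array (an assumption made for illustration) whose pages are linear ramps, so the interpolated values at noninteger locations can be checked by hand.

```matlab
% Sketch: resample a small multi-page array at noninteger grid locations.
[i,j] = ndgrid(1:6,1:5);
A = cat(3, i+2*j, 2*(i+2*j), 3*(i+2*j));   % synthetic 6-by-5-by-3 "image"

F = griddedInterpolant({1:6,1:5},A);       % two grid vectors, three 2-D data sets

xq = (1:0.5:6)';                           % half-pixel spacing in each dimension
yq = (1:0.5:5)';
vq = F({xq,yq});                           % every page is evaluated at the query grid
```

Here vq has one page per page of A, and each page has numel(xq) rows and numel(yq) columns.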
As you zoom in on an image, the pixels in the region of interest become larger and detail in the image
is quickly lost. You can use image resampling to smooth out these zooming artifacts.
Zoom in on the bright spot in the center of the original image. (The indexing into A is to center this
bright spot in the image so that subsequent zooming does not push it out of the frame.)
imshow(A(1:570,10:600,:),'InitialMagnification','fit')
zoom(10)
title('Original Image, 10x Zoom')
Query the interpolant F to reproduce this zoomed image (approximately) with 10x higher resolution.
Compare the results from several different interpolation methods.
xq = (1:0.1:sz(1))';
yq = (1:0.1:sz(2))';
F.Method = 'linear';
vq = uint8(F({xq,yq}));
imshow(vq(1:5700,150:5900,:),'InitialMagnification','fit')
zoom(10)
title('Linear method')
F.Method = 'cubic';
vq = uint8(F({xq,yq}));
imshow(vq(1:5700,150:5900,:),'InitialMagnification','fit')
zoom(10)
title('Cubic method')
F.Method = 'spline';
vq = uint8(F({xq,yq}));
imshow(vq(1:5700,150:5900,:),'InitialMagnification','fit')
zoom(10)
title('Spline method')
See Also
griddedInterpolant | imshow
More About
• “Interpolating Gridded Data” on page 8-5
9
Optimization
Given a mathematical function of a single variable, you can use the fminbnd function to find a local
minimizer of the function in a given interval. For example, consider the humps.m function, which is
provided with MATLAB®. The following figure shows the graph of humps.
x = -1:.01:2;
y = humps(x);
plot(x,y)
xlabel('x')
ylabel('humps(x)')
grid on
Optimizing Nonlinear Functions
To find the minimum of the humps function in the range (0.3,1), use
x = fminbnd(@humps,0.3,1)
x = 0.6370
You can see details of the solution process by using optimset to create options with the Display
option set to 'iter'. Pass the resulting options to fminbnd.
options = optimset('Display','iter');
x = fminbnd(@humps,0.3,1,options)
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
x = 0.6370
The iterative display shows the current value of x and the function value f(x) each time a function
evaluation occurs. For fminbnd, one function evaluation corresponds to one iteration of the
algorithm. The last column shows the procedure fminbnd uses at each iteration, either a golden
section search or a parabolic interpolation. For details, see “Optimization Solver Iterative Display”
on page 9-13.
Note: Optimization solvers apply to real-valued functions. Complex values cannot be optimized,
except for a real-valued function of the complex values, such as the norm.
Now find a minimum for this function using x = -0.6, y = -1.2, and z = 0.135 as the starting
values.
v = [-0.6,-1.2,0.135];
a = fminsearch(@three_var,v)
a =
0.0000 -1.5708 0.1803
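The pattern of packing several variables into one vector can be sketched with a simpler objective whose minimizer is known. The function handle quadBowl and its minimizer [1 -2 0.5] are hypothetical and are not the three_var function used above.

```matlab
% Sketch: minimize a function of three variables passed as one vector.
% quadBowl is a hypothetical objective with its minimum at [1 -2 0.5].
quadBowl = @(v) (v(1)-1)^2 + (v(2)+2)^2 + (v(3)-0.5)^2;

v0 = [-0.6,-1.2,0.135];                  % starting values for x, y, and z
[a,fval] = fminsearch(quadBowl,v0);      % a(1), a(2), a(3) hold x, y, z at the minimum
```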
Maximizing Functions
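The fminbnd and fminsearch solvers attempt to minimize an objective function. If you have a
maximization problem, that is, a problem of the form
$$\max_x f(x),$$
then define $g(x) = -f(x)$ and minimize $g$. For example, to find the maximum of tan(cos(x)) on the
interval (3,8), evaluate: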
[x fval] = fminbnd(@(x)-tan(cos(x)),3,8)
x =
6.2832
fval =
-1.5574
The maximum is 1.5574 (the negative of the reported fval), and occurs at x = 6.2832. This answer is
correct since, to five digits, the maximum is tan(1) = 1.5574, which occurs at x = 2π = 6.2832.
fminsearch Algorithm
fminsearch uses the Nelder-Mead simplex algorithm as described in Lagarias et al. [1]. This
algorithm uses a simplex of n + 1 points for n-dimensional vectors x. The algorithm first makes a
simplex around the initial guess x0 by adding 5% of each component x0(i) to x0. The algorithm uses
these n vectors as elements of the simplex in addition to x0. (The algorithm uses 0.00025 as
component i if x0(i) = 0.) Then, the algorithm modifies the simplex repeatedly according to the
following procedure.
Note The keywords for the fminsearch iterative display appear in bold after the description of the
step.
1 Let x(i) denote the list of points in the current simplex, i = 1,...,n + 1.
2 Order the points in the simplex from lowest function value f(x(1)) to highest f(x(n + 1)). At each
step in the iteration, the algorithm discards the current worst point x(n + 1), and accepts another
point into the simplex. [Or, in the case of step 7 below, it changes all n points with values above
f(x(1))].
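3 Generate the reflected point
$$r = 2m - x(n+1), \tag{9-1}$$
where
$$m = \frac{1}{n}\sum_{i=1}^{n} x(i), \tag{9-2}$$
and calculate f(r).
4 If f(x(1)) ≤ f(r) < f(x(n)), accept r and terminate this iteration. Reflect
5 If f(r) < f(x(1)), calculate the expansion point s
$$s = m + 2(m - x(n+1)), \tag{9-3}$$
and calculate f(s).
a If f(s) < f(r), accept s and terminate the iteration. Expand
b Otherwise, accept r and terminate the iteration. Reflect
6 If f(r) ≥ f(x(n)), perform a contraction between m and either x(n + 1) or r, depending on which
has the lower objective function value.
a If f(r) < f(x(n + 1)) (that is, r is better than x(n + 1)), calculate
$$c = m + (r - m)/2 \tag{9-4}$$
and calculate f(c). If f(c) < f(r), accept c and terminate the iteration. Contract outside
Otherwise, continue with Step 7 (Shrink).
b If f(r) ≥ f(x(n + 1)), calculate
$$cc = m + (x(n+1) - m)/2 \tag{9-5}$$
and calculate f(cc). If f(cc) < f(x(n + 1)), accept cc and terminate the iteration. Contract
inside
Otherwise, continue with Step 7 (Shrink).
7 Calculate the n points
$$v(i) = x(1) + (x(i) - x(1))/2 \tag{9-6}$$
and calculate f(v(i)), i = 2,...,n + 1. The simplex at the next iteration is x(1), v(2),...,v(n + 1).
Shrink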
The following figure shows the points that fminsearch can calculate in the procedure, along with
each possible new simplex. The original simplex has a bold outline. The iterations proceed until they
meet a stopping criterion.
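The candidate points of one iteration can be computed by hand for a small case. The following is an illustrative sketch, not the fminsearch source code: it assumes the standard Nelder-Mead coefficients described in Lagarias et al. [1] (reflection 1, expansion 2, contraction 1/2, shrink 1/2), a hypothetical objective f, and a hypothetical starting simplex S for n = 2.

```matlab
% Sketch: candidate points of one Nelder-Mead iteration for n = 2.
f = @(p) sum(p.^2,2);                    % illustrative objective, evaluated per row
S = [0 2; 1 0; 0 0];                     % rows are the n + 1 simplex vertices

[~,order] = sort(f(S));                  % order from lowest to highest function value
S  = S(order,:);                         % S(1,:) is the best point, S(end,:) the worst
n  = size(S,2);

m  = mean(S(1:n,:),1);                   % centroid of the n best points
xw = S(n+1,:);                           % current worst point x(n+1)

r  = 2*m - xw;                           % reflected point
s  = m + 2*(m - xw);                     % expansion point
c  = m + (r - m)/2;                      % outside contraction point
cc = m + (xw - m)/2;                     % inside contraction point
Vs = S(1,:) + (S(2:end,:) - S(1,:))/2;   % shrink: pull all other points toward the best
```

For this simplex the ordered vertices are (0,0), (1,0), (0,2), so m = (0.5,0), r = (1,-2), s = (1.5,-4), c = (0.75,-1), and cc = (0.25,1).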
Reference
[1] Lagarias, J. C., J. A. Reeds, M. H. Wright, and P. E. Wright. “Convergence Properties of the
Nelder-Mead Simplex Method in Low Dimensions.” SIAM Journal on Optimization, Vol. 9, Number 1,
1998, pp. 112–147.
See Also
More About
• “Optimization Troubleshooting and Tips” on page 9-30
• “Nonlinear Optimization” (Optimization Toolbox)
• “Curve Fitting via Optimization” on page 9-7
Curve Fitting via Optimization
This example shows how to fit a nonlinear function to data. For this example, the nonlinear function is
the standard exponential decay curve
$$y(t) = A\exp(-\lambda t),$$
where y(t) is the response at time t, and A and λ are the parameters to fit. Fitting the curve means
finding parameters A and λ that minimize the sum of squared errors
$$\sum_{i=1}^{n} \left(y_i - A\exp(-\lambda t_i)\right)^2,$$
where the times are $t_i$ and the responses are $y_i$, $i = 1, \ldots, n$. The sum of squared errors is the
objective function.
Usually, you have data from measurements. For this example, create artificial data based on a model
with A = 40 and λ = 0.5, with normally distributed pseudorandom errors.
Write a function that accepts parameters A and lambda and data tdata and ydata, and returns the
sum of squared errors for the model y(t). Put all the variables to optimize (A and lambda) in a single
vector variable (x). For more information, see “Minimizing Functions of Several Variables” on page 9-3.
type sseval
Save this objective function as a file named sseval.m on your MATLAB® path.
The fminsearch solver applies to functions of one variable, x. However, the sseval function has
three variables. The extra variables tdata and ydata are not variables to optimize, but are data for
the optimization. Define the objective function for fminsearch as a function of x alone:
fun = @(x)sseval(x,tdata,ydata);
For information about including extra parameters such as tdata and ydata, see “Parameterizing
Functions” on page 10-2.
Start from a random positive set of parameters x0, and have fminsearch find the parameters that
minimize the objective function.
x0 = rand(2,1);
bestx = fminsearch(fun,x0)
bestx = 2×1
40.6877
0.4984
The result bestx is reasonably near the parameters that generated the data, A = 40 and lambda =
0.5.
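The fitting workflow can be sketched end to end in a single script. The data-generation lines (the time vector and the unit-variance noise) are assumptions consistent with the description of the artificial data, and the anonymous function sseval is a stand-in for the sseval.m file, written from the description of what that function returns.

```matlab
% Sketch: fit A and lambda in y(t) = A*exp(-lambda*t) by minimizing the SSE.
rng default
tdata = 0:0.1:10;                                    % assumed sample times
ydata = 40*exp(-0.5*tdata) + randn(size(tdata));     % A = 40, lambda = 0.5, plus noise

% Stand-in for sseval.m: x(1) is A, x(2) is lambda.
sseval = @(x,tdata,ydata) sum((ydata - x(1)*exp(-x(2)*tdata)).^2);

fun = @(x) sseval(x,tdata,ydata);                    % objective as a function of x alone
x0 = rand(2,1);                                      % random positive starting point
bestx = fminsearch(fun,x0);

sseBest = fun(bestx);                                % objective value at the fitted parameters
```

Comparing sseBest with fun([40;0.5]) is a quick check that the solver found parameters at least as good as the ones that generated the data.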
To check the quality of the fit, plot the data and the resulting fitted response curve. Create the
response curve from the returned parameters of your model.
A = bestx(1);
lambda = bestx(2);
yfit = A*exp(-lambda*tdata);
plot(tdata,ydata,'*');
hold on
plot(tdata,yfit,'r');
xlabel('tdata')
ylabel('Response Data and Curve')
title('Data and Best Fitting Exponential Curve')
legend('Data','Fitted Curve')
hold off
See Also
More About
• “Optimizing Nonlinear Functions” on page 9-2
• “Nonlinear Data-Fitting” (Optimization Toolbox)
• “Nonlinear Regression” (Statistics and Machine Learning Toolbox)
In this section...
“How to Set Options” on page 9-10
“Options Table” on page 9-10
“Tolerances and Stopping Criteria” on page 9-11
“Output Structure” on page 9-12
x = fminbnd(fun,x1,x2,options)
x = fminsearch(fun,x0,options)
For example, to display output from the algorithm at each iteration, set the Display option to
'iter':
options = optimset('Display','iter');
Options Table
Option: Display
Solvers: fminbnd, fminsearch, fzero, lsqnonneg
Description: A flag indicating whether intermediate steps appear on the screen.
• 'notify' (default) displays output only if the function does not converge.
• 'iter' displays intermediate steps (not available with lsqnonneg). See “Optimization Solver
Iterative Display” on page 9-13.
• 'off' or 'none' displays no intermediate steps.
• 'final' displays just the final output.

Option: FunValCheck
Solvers: fminbnd, fminsearch, fzero
Description: Check whether objective function values are valid.
• 'on' displays an error when the objective function or constraints return a value that is complex
or NaN.
• 'off' (default) displays no error.

Option: MaxFunEvals
Solvers: fminbnd, fminsearch
Description: The maximum number of function evaluations allowed. The default value is 500 for
fminbnd and 200*length(x0) for fminsearch.
Set Optimization Options
Tip Generally, set the TolFun and TolX tolerances to well above eps, and usually above 1e-14.
Setting small tolerances does not guarantee accurate results. Instead, a solver can fail to recognize
when it has converged, and can continue futile iterations. A tolerance value smaller than eps
effectively disables that stopping condition. This tip does not apply to fzero, which uses a default
value of eps for TolX.
• TolX is a lower bound on the size of a step, meaning the norm of $(x_i - x_{i+1})$. If the solver
attempts to take a step that is smaller than TolX, the iterations end. Solvers generally use TolX as a
relative bound, meaning iterations end when $|x_i - x_{i+1}| < \mathrm{TolX}\,(1 + |x_i|)$, or a similar
relative measure.
• TolFun is a lower bound on the change in the value of the objective function during a step. If
$|f(x_i) - f(x_{i+1})| < \mathrm{TolFun}$, the iterations end. Solvers generally use TolFun as a relative
bound, meaning iterations end when $|f(x_i) - f(x_{i+1})| < \mathrm{TolFun}\,(1 + |f(x_i)|)$, or a similar
relative measure.
• MaxIter is a bound on the number of solver iterations. MaxFunEvals is a bound on the number
of function evaluations.
Note Unlike other solvers, fminsearch stops when it satisfies both TolFun and TolX.
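As a minimal sketch of how these tolerances are set in practice, the following script tightens TolX and TolFun for fminsearch. The objective function, the starting point, and the raised MaxIter and MaxFunEvals limits are illustrative assumptions, not values taken from this topic.

```matlab
% Illustrative objective (an assumption): smooth bowl with minimum at [1 2]
fun = @(x) (x(1) - 1)^2 + (x(2) - 2)^2;

% Tighten both stopping tolerances, keeping them well above eps, and raise
% the iteration limits so the tolerances (not the limits) end the run
options = optimset('TolX',1e-8,'TolFun',1e-8, ...
                   'MaxIter',2000,'MaxFunEvals',2000);

% fminsearch stops only when both TolX and TolFun are satisfied
[x,fval,exitflag] = fminsearch(fun,[0 0],options);
```

An exitflag of 1 indicates that the solver stopped because the tolerances were met rather than because an iteration limit was reached.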
Output Structure
The output structure includes the number of function evaluations, the number of iterations, and the
algorithm. The structure appears when you provide fminbnd, fminsearch, or fzero with a fourth
output argument, as in
[x,fval,exitflag,output] = fminbnd(@humps,0.3,1);
The details of the output structure for each solver are on the function reference pages.
The output structure is not an option that you choose with optimset. It is an optional output for
fminbnd, fminsearch, and fzero.
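The following sketch requests the output structure from fminbnd and reads three of its fields. The objective function and the interval are assumptions chosen for illustration.

```matlab
% Illustrative objective (an assumption): minimum at x = 2 inside [0, 5]
fun = @(x) (x - 2).^2;

% Request the output structure as the fourth output argument
[x,fval,exitflag,output] = fminbnd(fun,0,5);

nIter  = output.iterations;   % number of iterations
nEvals = output.funcCount;    % number of function evaluations
alg    = output.algorithm;    % description of the algorithm used
fprintf('%d iterations, %d function evaluations (%s)\n',nIter,nEvals,alg);
```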
See Also
More About
• “Optimizing Nonlinear Functions” on page 9-2
• “Optimization Solver Output Functions” on page 9-14
• “Optimization Solver Plot Functions” on page 9-20
Optimization Solver Iterative Display
See Also
More About
• “Set Optimization Options” on page 9-10
• “Optimization Solver Output Functions” on page 9-14
• “Optimization Solver Plot Functions” on page 9-20
In this section...
“What Is an Output Function?” on page 9-14
“Creating and Using an Output Function” on page 9-14
“Structure of the Output Function” on page 9-15
“Example of a Nested Output Function” on page 9-16
“Fields in optimValues” on page 9-17
“States of the Algorithm” on page 9-17
“Stop Flag” on page 9-18
You can use the OutputFcn option with the following MATLAB optimization functions:
• fminbnd
• fminsearch
• fzero
You can use this output function to plot the points generated by fminsearch in solving the
optimization problem
$$\min_x f(x) = \min_x e^{x_1}\left(4x_1^2 + 2x_2^2 + x_1 x_2 + 2x_2\right).$$
To do so,
1 Create a file containing the preceding code and save it as outfun.m in a folder on the MATLAB
path.
2 Set the value of the OutputFcn field of the options structure to a function handle to outfun.
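A minimal sketch of what an outfun.m file for these steps might contain is shown here. The body is an assumption based on the description (plot each point generated by the solver); only the name outfun and the three input arguments come from this topic.

```matlab
function stop = outfun(x,optimValues,state)
% Sketch of an output function that plots each iterate of a 2-D problem.
stop = false;                  % never ask the solver to halt
if strcmp(state,'iter')
    hold on
    plot(x(1),x(2),'.');       % mark the current point
    drawnow                    % update the figure during the run
end
end
```

With the file on the path, options = optimset('OutputFcn',@outfun) attaches it to a solver call.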
Optimization Solver Output Functions
options = optimset('OutputFcn',@outfun);   % attach outfun as the output function
hold on
objfun = @(x) exp(x(1))*(4*x(1)^2+2*x(2)^2+x(1)*x(2)+2*x(2));
[x,fval] = fminsearch(objfun,[-1 1],options)
hold off
x =
0.1290 -0.5323
fval =
-0.5689
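The function definition line of the output function has the form
function stop = outfun(x, optimValues, state)
where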
• stop is a flag that is true or false depending on whether the optimization routine halts or
continues. See “Stop Flag” on page 9-18.
• x is the point computed by the algorithm at the current iteration.
• optimValues is a structure containing data from the current iteration. “Fields in optimValues” on
page 9-17 describes the structure in detail.
• state is the current state of the algorithm. “States of the Algorithm” on page 9-17 lists the
possible values.
The optimization function passes the values of the input arguments to outfun at each iteration.
In the following example, the function file also contains the objective function as a local function. You
can instead write the objective function as a separate file or as an anonymous function.
Nested functions have access to variables in the surrounding file. Therefore, this method enables the
output function to preserve variables from one iteration to the next.
The following example uses an output function to record the fminsearch iterates in solving
$$\min_x f(x) = \min_x e^{x_1}\left(4x_1^2 + 2x_2^2 + x_1 x_2 + 2x_2\right).$$
The output function returns the sequence of points as a matrix called history.
function z = objfun(x)
z = exp(x(1))*(4*x(1)^2+2*x(2)^2+x(1)*x(2)+2*x(2));
end
end
3 Save the file as myproblem.m in a folder on the MATLAB path.
4 At the MATLAB prompt, enter
The function fminsearch returns x, the optimal point, and fval, the value of the objective function
at x.
x,fval
x =
0.1290 -0.5323
fval =
-0.5689
In addition, the output function myoutput returns the matrix history, which contains the points
generated by the algorithm at each iteration, to the MATLAB workspace. The first four rows of
history are
history(1:4,:)
ans =
-1.0000 1.0000
-1.0000 1.0000
-1.0750 0.9000
-1.0125 0.8500
The final row of points in history is the same as the optimal point, x.
history(end,:)
ans =
0.1290 -0.5323
objfun(history(end,:))
ans =
-0.5689
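For reference, a sketch of a complete myproblem.m that is consistent with this example follows. The names myproblem, myoutput, objfun, and history come from the example; the exact layout of the file is an assumption.

```matlab
function [x,fval,history] = myproblem(x0)
% Sketch: record the fminsearch iterates with a nested output function.
history = [];                                % shared with myoutput below
options = optimset('OutputFcn',@myoutput);
[x,fval] = fminsearch(@objfun,x0,options);

    function stop = myoutput(x,optimValues,state)
        stop = false;                        % never halt the solver early
        if isequal(state,'iter')
            history = [history; x];          % append the current iterate
        end
    end

    function z = objfun(x)
        z = exp(x(1))*(4*x(1)^2+2*x(2)^2+x(1)*x(2)+2*x(2));
    end
end
```

Calling [x,fval,history] = myproblem([-1 1]) runs the optimization from the starting point shown in the first row of history.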
Fields in optimValues
The following table lists the fields of the optimValues structure that are provided by the
optimization functions fminbnd, fminsearch, and fzero.
The “Command-Line Display Headings” column of the table lists the headings that appear when you
set the Display parameter of options to 'iter'.
State Description
'init' The algorithm is in the initial state before the first iteration.
'interrupt' The algorithm is performing an iteration. In this state, the
output function can halt the current iteration of the
optimization. You might want the output function to halt the
iteration to improve the efficiency of the computations. When
state is set to 'interrupt', the values of x and
optimValues are the same as at the last call to the output
function, in which state is set to 'iter'.
'iter' The algorithm is at the end of an iteration.
'done' The algorithm is in the final state after the last iteration.
The following code illustrates how the output function uses the value of state to decide which tasks
to perform at the current iteration.
switch state
case 'init'
% Setup for plots or dialog boxes
case 'iter'
% Make updates to plots or dialog boxes as needed
case 'interrupt'
% Check conditions to see whether optimization
% should quit
case 'done'
% Cleanup of plots, dialog boxes, or final plot
end
Stop Flag
The output argument stop is a flag that is true or false. The flag tells the optimization function
whether the optimization halts (true) or continues (false). The following examples show typical
ways to use the stop flag.
The output function can stop an optimization at any iteration based on the current data in
optimValues. For example, the following code sets stop to true if the objective function value is
less than 5:
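One way to sketch this behavior is with an output function written as an anonymous function that returns the stop flag directly. The objective function and starting point are assumptions chosen so that the objective value starts above 5 and falls below it during the run.

```matlab
% Output function: stop as soon as the current objective value is below 5
stopBelowFive = @(x,optimValues,state) optimValues.fval < 5;
options = optimset('OutputFcn',stopBelowFive);

% Illustrative objective (an assumption): minimum value is 1 at the origin
objective = @(x) sum(x.^2) + 1;

% The objective is 19 at [3 3], so the solver runs until fval drops below 5
[x,fval,exitflag] = fminsearch(objective,[3 3],options);
```

When an output function halts fminsearch, the solver returns an exitflag of -1 together with the best point found so far.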
If you design a UI to perform optimizations, you can have the output function stop an optimization
with, for example, a Stop button. The following code shows how to do this callback. The code
assumes that the Stop button callback stores the value true in the optimstop field of a handles
structure called hObject stored in appdata.
See Also
More About
• “Optimization Solver Plot Functions” on page 9-20
• “Set Optimization Options” on page 9-10
• “Optimization Solver Iterative Display” on page 9-13
The PlotFcns field of an options structure specifies one or more functions that an optimization
function calls at each iteration to plot various measures of progress. Pass a function handle or cell
array of function handles.
You can use the PlotFcns option with the following MATLAB optimization functions:
• fminbnd
• fminsearch
• fzero
View the progress of a minimization using fminsearch with the plot function @optimplotfval.
The objective function onehump appears at the end of this example on page 9-22.
options = optimset('PlotFcns',@optimplotfval);
x0 = [2 1];
[x fval] = fminsearch(@onehump,x0,options)
Optimization Solver Plot Functions
x = 1×2
-0.6691 0.0000
fval = -0.4052
You can write a custom plot function using the same syntax as an output function. For more
information on this structure, see “Optimization Solver Output Functions” on page 9-14.
Create a 2-D plot function that shows the iterative points labeled with the iteration number. For the
code, see the myplot helper function at the end of this example on page 9-22. Have the plot
function call both the original @optimplotfval plot function as well as myplot.
x = 1×2
-0.6691 0.0000
fval = -0.4052
The custom plot function plots roughly the last half of the iterations over each other as the solver
converges to the final point [-0.6691 0.0000]. This makes the last half of the iterations hard to
read. Nevertheless, the plot gives some indication of how fminsearch iterates toward the
minimizing point.
Helper Functions
function f = onehump(x)
r = x(1)^2 + x(2)^2;
s = exp(-r);
f = x(1)*s+r/20;
end
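A sketch of what the myplot helper might look like follows. It uses the same three-argument syntax as an output function and labels each point with optimValues.iteration. The marker style and label offset are assumptions.

```matlab
function stop = myplot(x,optimValues,state)
% Sketch of a custom plot function for a 2-D problem.
stop = false;
switch state
    case 'init'
        hold on                                    % keep all points on one axes
    case 'iter'
        plot(x(1),x(2),'rx');                      % current iterate
        text(x(1)+0.05,x(2),num2str(optimValues.iteration));
    case 'done'
        hold off                                   % release the axes
end
end
```

To call both plot functions, pass a cell array of handles, for example optimset('PlotFcns',{@optimplotfval,@myplot}).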
See Also
More About
• “Optimization Solver Output Functions” on page 9-14
• “Set Optimization Options” on page 9-10
• “Optimization Solver Iterative Display” on page 9-13
Optimize Live Editor Task
For a video describing a similar optimization problem, see How to Use the Optimize Live Editor Task.
1 On the Home tab, in the File section, click the New Live Script button.
2 Insert an Optimize Live Editor task. Click the Insert tab and then, in the Code section, select
Task > Optimize.
3 For use in entering problem data, click the Section Break button. New sections appear above
and below the task.
4 In the section above the Optimize task, enter the following code.
a = pi;
x0 = [-1 2];
5 To place these variables into the workspace, press Ctrl + Enter.
6 In the Specify problem type section of the task, click the Objective > Nonlinear button and
the Constraints > Unconstrained button. The task shows that the recommended solver is
fminsearch.
Note If you have Optimization Toolbox™, your recommended solver at this point is different.
Choose fminsearch to proceed with the example.
7 In the Select problem data section, select Objective function > Local function and then
click the New button. A function script appears in a new section below the task. Edit the
resulting code to contain the following uncommented lines.
function f = objectiveFcn(optimInput,a)
x = optimInput(1);
y = optimInput(2);
f = 100*(y - x^2)^2 + (a - x)^2;
end
8 In the Select problem data section, select objectiveFcn as the local function.
9 In the Select problem data section, under Function inputs, select Optimization input >
optimInput and Fixed input: a > a.
13 To view the solution point, look at the top of the Optimize task.
The solution and objectiveValue variables are returned to the workspace. To view their
values, insert a section break below the task and enter this code.
disp(solution)
disp(objectiveValue)
14 Run the section by pressing Ctrl+Enter.
disp(solution)
3.1416 9.8696
disp(objectiveValue)
3.9946e-11
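The same problem can be sketched at the command line without the task. This is not the code that the task generates; the anonymous-function form of the objective and the raised iteration limits are assumptions made so that the script is self-contained.

```matlab
a = pi;
x0 = [-1 2];

% Same objective as objectiveFcn, written as an anonymous function
objectiveFcn = @(optimInput,a) ...
    100*(optimInput(2) - optimInput(1)^2)^2 + (a - optimInput(1))^2;

% Assumption: generous limits so that the solver stops on its tolerances
options = optimset('MaxFunEvals',2000,'MaxIter',2000);

% Fix the parameter a and optimize over optimInput only
[solution,objectiveValue] = ...
    fminsearch(@(optimInput) objectiveFcn(optimInput,a),x0,options);
disp(solution)
disp(objectiveValue)
```

The objective is zero at the point [a a^2], which matches the solution displayed by the task.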
1 On the Home tab, in the File section, click the New Live Script button. Enter these lines of
code in the live script.
fun = @(x)cos(x) - x;
x0 = 0;
The first line defines the anonymous function fun, which takes the value 0 at the point x where
cos(x) = x. The second line defines the initial point x0 = 0, where fzero begins its search for a
solution.
2 Put these variables in the MATLAB workspace by pressing Ctrl+Enter.
3 Insert an Optimize Live Editor task. Click the Insert tab and then, in the Code section, select
Task > Optimize.
4 In the Specify problem type section of the task, select Solver > fzero.
5 In the Select problem data section, select Objective function > Function handle and then
select fun. Select Initial point (x0) > x0.
6 In the Display progress section, select Objective value for the plot.
8 To see the solution value, insert a new section below the task by clicking the Section Break
button on the Insert tab. In the new section, enter solution and press Ctrl+Enter.
solution
solution = 0.7391
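For comparison, a command-line sketch of the same root-finding problem calls fzero directly with the function handle and initial point defined in step 1. The extra output names are assumptions.

```matlab
fun = @(x)cos(x) - x;     % zero where cos(x) = x
x0 = 0;                   % initial point for the search

% fval is the residual at the solution; exitflag is 1 on success
[solution,fval,exitflag] = fzero(fun,x0);
disp(solution)
```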
See Also
Optimize | fzero | fminsearch
More About
• “Optimization”
• “Add Interactive Tasks to a Live Script”
• How to Use the Optimize Live Editor Task
Optimization Troubleshooting and Tips

Problem: The solution found by fminbnd or fminsearch is not a global minimum. A global
minimum has the smallest objective function value among all points in the search space.
Recommendation: There is no guarantee that a solution is a global minimum unless your problem is
continuous and has only one minimum. To search for a global minimum, start the optimization from
multiple starting points (or intervals, in the case of fminbnd).

Problem: It is impossible to evaluate the objective function f(x) at some points x. Such points
are called infeasible.
Recommendation: Modify your function to return a large positive value for f(x) at infeasible
points x.

Problem: The minimization routine appears to enter an infinite loop or returns a solution that is
not a minimum (or not a zero, in the case of fzero).
Recommendation: Perhaps your objective function returns NaN or complex values. Solvers expect
only real objective function values. Any other values can cause unexpected results. To determine
whether there are NaN or complex values, set
options = optimset('FunValCheck','on')

Problem: fminsearch fails to reach a solution.
Recommendation: fminsearch can fail to reach a solution for various reasons.
Note Optimization solvers apply to real-valued functions. Complex values cannot be optimized,
except for a real-valued function of the complex values, such as the norm.
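The multiple-starting-point recommendation can be sketched as a short loop that keeps the best result. The objective function and the list of starting points are illustrative assumptions; the objective has one local minimum on each side of a hump.

```matlab
% Illustrative objective (an assumption): two local minima, the lower one
% at a negative x
fun = @(x) x.^4 - 3*x.^2 + x;

starts = [-2 0.5 2];      % assumed starting points
bestX = NaN;
bestF = Inf;
for k = 1:numel(starts)
    [xk,fk] = fminsearch(fun,starts(k));
    if fk < bestF         % keep the lowest objective value found so far
        bestX = xk;
        bestF = fk;
    end
end
fprintf('Best point %.4f with objective value %.4f\n',bestX,bestF);
```

This approach does not guarantee a global minimum either. It only lowers the chance that every run ends in the same local minimum.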
See Also
More About
• “Optimizing Nonlinear Functions” on page 9-2
10
Function Handles
10 Function Handles
Parameterizing Functions
In this section...
“Overview” on page 10-2
“Parameterizing Using Nested Functions” on page 10-2
“Parameterizing Using Anonymous Functions” on page 10-3
Overview
This topic explains how to store or access extra parameters for mathematical functions that you pass
to MATLAB function functions, such as fzero or integral.
MATLAB function functions evaluate mathematical expressions over a range of values. They are
called function functions because they are functions that accept a function handle (a pointer to a
function) as an input. Each of these functions expects that your objective function has a specific
number of input variables. For example, fzero and integral accept handles to functions that have
exactly one input variable.
Suppose you want to find the zero of the cubic polynomial $x^3 + bx + c$ for different values of the
coefficients b and c. Although you could create a function that accepts three input variables (x, b, and
c), you cannot pass a function handle that requires all three of those inputs to fzero. However, you
can take advantage of properties of anonymous or nested functions to define values for additional
inputs.
function y = findzero(b,c,x0)
y = fzero(@poly,x0);
function y = poly(x)
y = x^3 + b*x + c;
end
end
The nested function defines the cubic polynomial with one input variable, x. The parent function
accepts the parameters b and c as input values. The reason to nest poly within findzero is that
nested functions share the workspace of their parent functions. Therefore, the poly function can
access the values of b and c that you pass to findzero.
To find a zero of the polynomial with b = 2 and c = 3.5, using the starting point x0 = 0, you can
call findzero from the command line:
x = findzero(2,3.5,0)
x =
-1.0945
For example, create a handle to an anonymous function that describes the cubic polynomial, and find
the zero:
b = 2;
c = 3.5;
cubicpoly = @(x) x^3 + b*x + c;
x = fzero(cubicpoly,0)
x =
-1.0945
Variable cubicpoly is a function handle for an anonymous function that has one input, x. Inputs for
anonymous functions appear in parentheses immediately following the @ symbol that creates the
function handle. Because b and c are in the workspace when you create cubicpoly, the anonymous
function does not require inputs for those coefficients.
You do not need to create an intermediate variable, cubicpoly, for the anonymous function. Instead,
you can include the entire definition of the function handle within the call to fzero:
b = 2;
c = 3.5;
x = fzero(@(x) x^3 + b*x + c,0)
x =
-1.0945
You also can use anonymous functions to call more complicated objective functions that you define in
a function file. For example, suppose you have a file named cubicpoly.m with this function
definition:
function y = cubicpoly(x,b,c)
y = x^3 + b*x + c;
end
At the command line, define b and c, and then call fzero with an anonymous function that invokes
cubicpoly:
b = 2;
c = 3.5;
x = fzero(@(x) cubicpoly(x,b,c),0)
x =
-1.0945
Note To change the values of the parameters, you must create a new anonymous function. For
example:
b = 10;
c = 25;
x = fzero(@(x) x^3 + b*x + c,0);
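The following sketch illustrates why a new anonymous function is required. An anonymous function stores the values of b and c at the moment it is created, so later changes to the workspace variables do not affect it. The variable names xOld and xNew are assumptions.

```matlab
b = 2;
c = 3.5;
cubicpoly = @(x) x^3 + b*x + c;      % stores b = 2 and c = 3.5

b = 10;                              % changing the workspace variables...
c = 25;
xOld = fzero(cubicpoly,0);           % ...does not change cubicpoly

% A new anonymous function picks up the new parameter values
xNew = fzero(@(x) x^3 + b*x + c,0);
```

Here xOld is still the zero of x^3 + 2x + 3.5, while xNew is the zero of x^3 + 10x + 25.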
See Also
More About
• “Create Function Handle”
• “Nested Functions”
• “Anonymous Functions”
11
y′′ = 9y
In an initial value problem, the ODE is solved by starting from an initial state. Using the initial
condition, $y_0$, as well as a period of time over which the answer is to be obtained, $(t_0, t_f)$, the
solution is obtained iteratively. At each step the solver applies a particular algorithm to the results of
previous steps. At the first such step, the initial condition provides the necessary information that
allows the integration to proceed. The final result is that the ODE solver returns a vector of time steps
$t = [t_0, t_1, t_2, \ldots, t_f]$ as well as the corresponding solution at each step
$y = [y_0, y_1, y_2, \ldots, y_f]$.
Types of ODEs
The ODE solvers in MATLAB solve these types of first-order ODEs:
Systems of ODEs
You can specify any number of coupled ODE equations to solve, and in principle the number of
equations is only limited by available computer memory. If the system of equations has n equations,
then the function that encodes the equations returns a vector with n elements, corresponding to the
values for $y_1', y_2', \ldots, y_n'$. For example, consider the system of two equations
$$y_1' = y_2$$
$$y_2' = y_1 y_2 - 2.$$
function dy = myODE(t,y)
dy = zeros(2,1);       % the ODE solvers require a column vector
dy(1) = y(2);
dy(2) = y(1)*y(2)-2;
end
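As a sketch of how such a system is passed to a solver, the following script writes the same two equations as an anonymous function and integrates them with ode45. The time span and initial conditions are assumptions chosen for illustration.

```matlab
% Same system as an anonymous function that returns a column vector
myODE = @(t,y) [y(2); y(1)*y(2) - 2];

tspan = [0 1];      % assumed time interval
y0 = [0; 1];        % assumed initial values of y1 and y2

% Each row of y holds [y1 y2] at the corresponding time in t
[t,y] = ode45(myODE,tspan,y0);
```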
Higher-Order ODEs
The MATLAB ODE solvers only solve first-order equations. You must rewrite higher-order ODEs as an
equivalent system of first-order equations using the generic substitutions
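$$y_1 = y,\quad y_2 = y',\quad y_3 = y'',\quad \ldots,\quad y_n = y^{(n-1)}.$$
The result of these substitutions is a system of n first-order equations
$$y_1' = y_2,\quad y_2' = y_3,\quad \ldots,\quad y_n' = f(t, y_1, y_2, \ldots, y_n).$$
For example, consider the third-order ODE
$$y''' - y''y + 1 = 0.$$
Using the substitutions
$$y_1 = y,\quad y_2 = y',\quad y_3 = y''$$
results in the equivalent first-order system
$$y_1' = y_2$$
$$y_2' = y_3$$
$$y_3' = y_1 y_3 - 1.$$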
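A minimal sketch of coding this first-order system and solving it with ode45 follows. The time span and the zero initial conditions are assumptions chosen for illustration.

```matlab
% y(1) = y, y(2) = y', y(3) = y''; the last row is y''' = y*y'' - 1
odefun = @(t,y) [y(2); y(3); y(1)*y(3) - 1];

tspan = [0 1];          % assumed time interval
y0 = [0; 0; 0];         % assumed initial values of y, y', and y''

% The first column of ysol is the solution y of the original third-order ODE
[t,ysol] = ode45(odefun,tspan,y0);
```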
Complex ODEs
Consider the complex ODE equation
$$y' = f(t, y),$$
where $y = y_1 + iy_2$. To solve it, separate the real and imaginary parts into different solution
components, then recombine the results at the end. Conceptually, this looks like
$$y_v = [\mathrm{Real}(y) \quad \mathrm{Imag}(y)]$$
$$f_v = [\mathrm{Real}(f(t,y)) \quad \mathrm{Imag}(f(t,y))].$$
For example, if the ODE is $y' = yt + 2i$, then you can represent the equation using the function file:
function f = complexf(t,y)
f = y.*t + 2*i;
end
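The function that separates the real and imaginary parts is then:
function fv = imaginaryODE(t,yv)
% Construct y from the real and imaginary components
y = yv(1) + i*yv(2);
yp = complexf(t,y);           % Evaluate the complex right-hand side
fv = [real(yp); imag(yp)];    % Return real and imaginary parts separately
end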
When you run a solver to obtain the solution, the initial condition y0 is also separated into real and
imaginary parts to provide an initial condition for each solution component.
y0 = 1+i;
yv0 = [real(y0); imag(y0)];
tspan = [0 2];
[t,yv] = ode45(@imaginaryODE, tspan, yv0);
Once you obtain the solution, combine the real and imaginary components together to obtain the final
result.
y = yv(:,1) + i*yv(:,2);
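The whole procedure can be sketched as a single script and checked against the closed-form solution of this linear ODE. The helper name toVec, the anonymous-function form, and the comparison with the exact solution are assumptions added for illustration.

```matlab
complexf = @(t,y) y.*t + 2i;                 % y' = y*t + 2i
toVec = @(z) [real(z); imag(z)];             % split into real and imaginary parts
splitODE = @(t,yv) toVec(complexf(t,yv(1) + 1i*yv(2)));

y0 = 1 + 1i;
[t,yv] = ode45(splitODE,[0 2],toVec(y0));
y = yv(:,1) + 1i*yv(:,2);                    % recombine the components

% Exact solution, obtained with the integrating factor exp(-t^2/2)
yExact = exp(t.^2/2).*(y0 + 2i*sqrt(pi/2)*erf(t/sqrt(2)));
relErr = max(abs(y - yExact)./abs(yExact));
```

With the default solver tolerances, relErr should be small compared with 1.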
Choose an ODE Solver
Some ODE problems exhibit stiffness, or difficulty in evaluation. Stiffness is a term that defies a
precise definition, but in general, stiffness occurs when there is a difference in scaling somewhere in
the problem. For example, if an ODE has two solution components that vary on drastically different
time scales, then the equation might be stiff. You can identify a problem as stiff if nonstiff solvers
(such as ode45) are unable to solve the problem or are extremely slow. If you observe that a nonstiff
solver is very slow, try using a stiff solver such as ode15s instead. When using a stiff solver, you can
improve reliability and efficiency by supplying the Jacobian matrix or its sparsity pattern.
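A small sketch can make the effect of stiffness visible by counting the output points that a nonstiff and a stiff solver return for the same problem. The test equation is an assumption: its solution varies slowly, but the factor of 1000 forces an explicit solver to take very small steps.

```matlab
% Illustrative stiff test equation (an assumption)
odefun = @(t,y) -1000*(y - cos(t));

[t45,y45]   = ode45(odefun,[0 1],0);     % nonstiff solver
[t15s,y15s] = ode15s(odefun,[0 1],0);    % stiff solver

fprintf('ode45 returned %d points, ode15s returned %d points\n', ...
    numel(t45),numel(t15s));
```

Both solvers reach nearly the same value at the end of the interval, but the stiff solver needs far fewer steps to get there.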
You can use ode objects to automate solver selection based on properties of the problem. If you are
not sure which solver to use, then this table provides general guidelines on when to use each solver.
11 Ordinary Differential Equations (ODEs)
If there is a mass
matrix, it must be
constant.
ode23t Low Use ode23t if the
problem is only
moderately stiff and you
need a solution without
numerical damping.
For details and further recommendations about when to use each solver, see [5].
odeexamples
edit exampleFileName.m
exampleFileName
This table contains a list of the available ODE and DAE example files as well as the solvers and
options they use. Links are included for the subset of examples that are also published directly in the
documentation.
• 'OutputSel'
• 'Refine'
• 'InitialStep'
• 'MaxStep'
batonode
Solver: ode45
Options: 'Mass'
Description: ODE with time- and state-dependent mass matrix (motion of a baton)
Documentation: “Solve Equations of Motion for Baton Thrown into Air” on page 11-56

brussode
Solver: ode15s
Options: 'JPattern', 'Vectorized'
Description: Stiff large problem (diffusion in a chemical reaction, the Brusselator)
Documentation: “Solve Stiff ODEs” on page 11-23

burgersode
Solver: ode15s
Options: 'Mass', 'MStateDependence', 'JPattern', 'MvPattern', 'RelTol', 'AbsTol'
Description: ODE with strongly state-dependent mass matrix (Burgers' equation solved using a
moving mesh technique)
Documentation: “Solve ODE with Strongly State-Dependent Mass Matrix” on page 11-62

• 'Jacobian'

fem2ode
Solver: ode23s
Options: 'Mass'
Description: Stiff problem with a constant mass matrix (finite element method)
Documentation: none

hb1ode
Solver: ode15s
Options: none
Description: Stiff ODE problem solved on a very long interval (Robertson chemical reaction)
Documentation: none

hb1dae
Solver: ode15s
Options: 'Mass', 'RelTol', 'AbsTol', 'Vectorized'
Description: Stiff, linearly implicit DAE from a conservation law (Robertson chemical reaction)
Documentation: “Solve Robertson Problem as Semi-Explicit Differential Algebraic Equations (DAEs)”

ihb1dae
Solver: ode15i
Options: 'RelTol', 'AbsTol', 'Jacobian'
Description: Stiff, fully implicit DAE (Robertson chemical reaction)
Documentation: “Solve Robertson Problem as Implicit Differential Algebraic Equations (DAEs)”

iburgersode
Solver: ode15i
Options: 'RelTol', 'AbsTol', 'Jacobian', 'JPattern'
Description: Implicit ODE system (Burgers’ equation)
Documentation: none

kneeode
Solver: ode15s
Options: 'NonNegative'
Description: The “knee problem” with nonnegativity constraints
Documentation: “Nonnegative ODE Solution” on page 11-35

orbitode
Solver: ode45
Options: 'RelTol', 'AbsTol', 'Events', 'OutputFcn'
Description: Advanced event location (restricted three body problem)
Documentation: “ODE Event Location” on page 11-13

rigidode
Solver: ode45
Options: none
Description: Nonstiff problem (Euler equations of a rigid body without external forces)
Documentation: “Solve Nonstiff ODEs” on page 11-19

vdpode
Solver: ode15s
Options: 'Jacobian'
Description: Parameterizable van der Pol equation (stiff for large μ)
Documentation: “Solve Stiff ODEs” on page 11-23
References
[1] Shampine, L. F. and M. K. Gordon, Computer Solution of Ordinary Differential Equations: the
Initial Value Problem, W. H. Freeman, San Francisco, 1975.
[2] Forsythe, G., M. Malcolm, and C. Moler, Computer Methods for Mathematical Computations,
Prentice-Hall, New Jersey, 1977.
[3] Kahaner, D., C. Moler, and S. Nash, Numerical Methods and Software, Prentice-Hall, New Jersey,
1989.
[4] Shampine, L. F., Numerical Solution of Ordinary Differential Equations, Chapman & Hall, New
York, 1994.
[5] Shampine, L. F. and M. W. Reichelt, “The MATLAB ODE Suite,” SIAM Journal on Scientific
Computing, Vol. 18, 1997, pp. 1–22.
[6] Shampine, L. F., Gladwell, I. and S. Thompson, Solving ODEs with MATLAB, Cambridge University
Press, Cambridge UK, 2003.
See Also
ode | odeset | odextend
More About
• “Solve Nonstiff ODEs” on page 11-19
• “Solve Stiff ODEs” on page 11-23
• “Solve Differential Algebraic Equations (DAEs)” on page 11-30
External Websites
• Ordinary Differential Equations
Options Syntax
Use the odeset function to create an options structure that you then pass to the solver as the fourth
input argument. For example, to adjust the relative and absolute error tolerances:
opts = odeset('RelTol',1e-2,'AbsTol',1e-5);
[t,y] = ode45(@odefun,tspan,y0,opts);
If you use the command odeset with no inputs, then MATLAB displays a list of the possible values for
each option, with default values indicated by curly braces {}.
The odeget function queries the value of an option in an existing structure, which you can use to
dynamically change option values based on conditions. For example, this code checks whether Stats
has been set, and sets it to 'on' if it has not:
if isempty(odeget(opts,'Stats'))
    opts = odeset(opts,'Stats','on');
end
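The following sketch shows the query-and-update pattern end to end. It relies on two documented behaviors: odeget returns an empty array for an option that has not been set, and odeset returns the updated structure rather than modifying its input. The variable names and the fallback value for MaxStep are assumptions.

```matlab
opts = odeset('RelTol',1e-2,'AbsTol',1e-5);

statsBefore = odeget(opts,'Stats');          % [] because Stats is not set
if isempty(statsBefore)
    opts = odeset(opts,'Stats','on');        % assign the result back to opts
end
statsAfter = odeget(opts,'Stats');           % now 'on'

% A third input to odeget supplies a fallback for options that are not set
maxStep = odeget(opts,'MaxStep',0.1);
```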
Summary of ODE Options
Solver columns: ode45, ode23, ode78, ode89, ode113, ode15s, ode23s, ode23t, ode23tb, ode15i

Option Group: Error Control
Options: RelTol, AbsTol, NormControl

Option Group: Solver Output
Options: NonNegative, OutputFcn, OutputSel, Refine, Stats

Option Group: Step Size
Options: InitialStep, MaxStep

Option Group: Event Location
Options: Events

Option Group: Jacobian Matrix
Options: Jacobian, JPattern, Vectorized

Option Group: Mass Matrix and DAEs
Options: Mass, MStateDependence, MvPattern, MassSingular, InitialSlope

Option Group: Algorithm Options for ode15s and ode15i
Options: MaxOrder, BDF
* Use the NonNegative parameter with ode15s, ode23t, and ode23tb only for those problems in
which there is no mass matrix.
** The events function for ode15i must accept a third input argument for yp.
Usage Examples
MATLAB includes several example files that show how to use various options. For example, type edit
ballode to see an example that uses 'Events' to specify an events function, or edit batonode to
see an example that uses 'Mass' to specify a mass matrix. For a complete summary of example files
and which options they use, see “Summary of ODE Examples and Files” on page 11-7.
See Also
odeset | odeget
More About
• “Choose an ODE Solver” on page 11-2
• “ODE Event Location” on page 11-13
• “Nonnegative ODE Solution” on page 11-35
ODE Event Location
This topic describes how to detect events while solving an ODE using solver functions (ode45,
ode15s, and so on).
Use event functions to detect when certain events occur during the solution of an ODE. Event
functions take an expression that you specify, and detect an event when that expression is equal to
zero. They can also signal the ODE solver to halt integration when they detect an event.
In the case of ode15i, the event function must also accept a third input argument for yp.
The output arguments value, isterminal, and direction are vectors whose ith element
corresponds to the ith event:
• value(i) is a mathematical expression describing the ith event. An event occurs when
value(i) is equal to zero.
• isterminal(i) = 1 if the integration is to terminate when the ith event occurs. Otherwise, it is
0.
• direction(i) = 0 if all zeros are to be located (the default). A value of +1 locates only zeros
where the event function is increasing, and -1 locates only zeros where the event function is
decreasing. Specify direction = [] to use the default value of 0 for all events.
Again, consider the case of an apple falling from a tree. The ODE that represents the falling body is
$$y'' = -1 + y'^2,$$
with the initial conditions $y(0) = 1$ and $y'(0) = 0$. You can use an event function to determine when
$y(t) = 0$, which is when the apple hits the ground. For this problem, an event function that detects
when the apple hits the ground is
function [position,isterminal,direction] = appleEventsFcn(t,y)
position = y(1); % The value that we want to be zero
isterminal = 1; % Halt integration
direction = 0; % The zero can be approached from either direction
end
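A sketch of using this event logic with ode45 follows. To keep the script self-contained, the event function is written as an anonymous function that uses deal to return the same three outputs as appleEventsFcn. The time span and the check against the closed-form hit time are assumptions added for illustration.

```matlab
% Falling body as a first-order system: y(1) = height, y(2) = velocity
odefun = @(t,y) [y(2); -1 + y(2)^2];

% Event value, terminal flag, and direction, as in appleEventsFcn
appleEvents = @(t,y) deal(y(1),1,0);
opts = odeset('Events',appleEvents);

% te, ye, and ie report the time, state, and index of the detected event
[t,y,te,ye,ie] = ode45(odefun,[0 10],[1; 0],opts);

% For these initial conditions the height is 1 - log(cosh(t)), so the
% apple reaches the ground at t = acosh(exp(1))
tExact = acosh(exp(1));
fprintf('Event at t = %.4f (closed form %.4f)\n',te,tExact);
```

Because the event is terminal, the integration stops at te instead of continuing to the end of the time span.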
Event Information
If you specify an events function, then call the ODE solver with three extra output arguments, as
[t,y,te,ye,ie] = odeXY(odefun,tspan,y0,options)
The three additional outputs returned by the solver correspond to the detected events:
In this case, the event information is stored in the structure as sol.te, sol.ye, and sol.ie.
Limitations
The root-finding mechanism employed by the ODE solver in conjunction with the event function has
these limitations:
• If a terminal event occurs during the first step of the integration, then the solver registers the
event as nonterminal and continues integrating.
• If more than one terminal event occurs during the first step, then only the first event registers and
the solver continues integrating.
• Zeros are determined by sign crossings between steps. Therefore, zeros of functions with an even
number of crossings between steps can be missed.
If the solver steps past events, try reducing RelTol and AbsTol to improve accuracy. Alternatively,
set MaxStep to place an upper bound on the step size. Adjusting tspan does not change the steps
taken by the solver.
This example shows how to write a simple event function for use with an ODE solver. The example file
ballode models the motion of a bouncing ball. The events function halts the integration each time
the ball bounces, and the integration then restarts with new initial conditions. As the ball bounces,
the integration stops and restarts several times.
A ball bounce occurs when the height of the ball is equal to zero after decreasing. An events
function that codes this behavior is
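A minimal sketch of such an events function, together with a single integration up to the first bounce, is shown here. It is not the contents of ballode; the gravity value, initial state, and variable names are assumptions.

```matlab
g = 9.8;                                   % assumed gravitational acceleration
odefun = @(t,y) [y(2); -g];                % y(1) = height, y(2) = velocity

% value = height, isterminal = 1 (stop), direction = -1 (decreasing only)
bounceEvents = @(t,y) deal(y(1),1,-1);
opts = odeset('Events',bounceEvents);

% Start on the ground moving upward at 20; the start itself is not an
% event because the height is increasing there
[t,y,te,ye,ie] = ode45(odefun,[0 10],[0; 20],opts);
```

To simulate repeated bounces, restart the integration from te with the height reset to zero and a reversed, reduced velocity.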
Type ballode to run the example and illustrate the use of an events function to simulate the
bouncing of a ball.
ballode
This example shows how to use the directional components of an event function. The example file
orbitode simulates a restricted three body problem where one body is orbiting two much larger
bodies. The events function determines the points in the orbit where the orbiting body is closest and
farthest away. Since the value of the events function is the same at the closest and farthest points of
the orbit, the direction of zero crossing is what distinguishes them.
where
The first two solution components are coordinates of the body of infinitesimal mass, so plotting one
against the other gives the orbit of the body.
The events function nested in orbitode.m searches for two events. One event locates the point of
maximum distance from the starting point, and the other locates the point where the spaceship
returns to the starting point. The events are located accurately, even though the step sizes used by
the integrator are not determined by the locations of the events. In this example, the ability to specify
the direction of the zero crossing is critical. Both the point of return to the starting point and the
point of maximum distance from the starting point have the same event values, and the direction of
the crossing is used to distinguish them. An events function that codes this behavior is
orbitode
Note that the step sizes used by the integrator are NOT
determined by the location of the events, and the events are
still located accurately.
See Also
odeset | odeget
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Parameterizing Functions” on page 10-2
Solve Nonstiff ODEs
This page contains two examples of solving nonstiff ordinary differential equations using ode45.
MATLAB® has several solvers for nonstiff ODEs.
• ode45
• ode23
• ode78
• ode89
• ode113
For most nonstiff problems, ode45 performs best. However, ode23 is recommended for problems that
permit a slightly cruder error tolerance or in the presence of moderate stiffness. Likewise, ode113
can be more efficient than ode45 for problems with more stringent error tolerances or when the ODE
function is computationally expensive to evaluate. ode78 and ode89 are high-order solvers that excel
with long integrations where accuracy is crucial for stability.
If the nonstiff solvers take a long time to solve the problem or consistently fail the integration, then
the problem might be stiff. See “Solve Stiff ODEs” on page 11-23 for more information.
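The van der Pol equation is a second-order ODE
$$y_1'' - \mu\left(1 - y_1^2\right)y_1' + y_1 = 0,$$
where $\mu > 0$ is a scalar parameter. Rewrite this equation as a system of first-order ODEs by making
the substitution $y_1' = y_2$. The resulting system of first-order ODEs is
$$y_1' = y_2, \qquad y_2' = \mu\left(1 - y_1^2\right)y_2 - y_1.$$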
The system of ODEs must be coded into a function file that the ODE solver can use. The general
functional signature of an ODE function is
dydt = odefun(t,y)
That is, the function must accept both t and y as inputs, even if it does not use t for any
computations.
The function file vdp1.m codes the van der Pol equation using $\mu = 1$. The variables $y_1$ and $y_2$ are
represented by y(1) and y(2), and the two-element column vector dydt contains the expressions
for $y_1'$ and $y_2'$.
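As a minimal sketch of what such an ODE function looks like, the van der Pol system with $\mu = 1$ can be written as an anonymous function. The name vdp1_sketch is hypothetical and stands in for the shipped vdp1.m file, whose exact contents are not reproduced here.

```matlab
% Hypothetical stand-in for vdp1.m: the van der Pol system with mu = 1.
% y(1) holds y_1 and y(2) holds y_2 = y_1'.
vdp1_sketch = @(t,y) [y(2); (1 - y(1)^2)*y(2) - y(1)];

% Evaluate the derivative at the state y = [2; 0]. The input t is accepted
% but not used, as the ODE function signature requires.
dydt0 = vdp1_sketch(0,[2; 0]);
disp(dydt0)
```

The function returns a column vector with one entry per equation, which is the shape the solvers expect.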
Solve the ODE using the ode45 function on the time interval [0 20] with initial values [2 0]. The
output is a column vector of time points t and a solution array y. Each row in y corresponds to a time
returned in the corresponding row of t. The first column of y corresponds to $y_1$, and the second
column to $y_2$.
[t,y] = ode45(@vdp1,[0 20],[2; 0]);
The vdpode function solves the same problem, but it accepts a user-specified value for $\mu$. The van
der Pol equations become stiff as $\mu$ increases. For example, with the value $\mu = 1000$ you need to use
a stiff solver such as ode15s to solve the system.
The Euler equations for a rigid body without external forces are a standard test problem for ODE
solvers intended for nonstiff problems. The equations are
$$y_1' = y_2 y_3, \qquad y_2' = -y_1 y_3, \qquad y_3' = -0.51\, y_1 y_2.$$
The function file rigidode defines and solves this first-order system of equations over the time
interval [0 12], using the vector of initial conditions [0; 1; 1] corresponding to the initial values
of $y_1$, $y_2$, and $y_3$. The local function f(t,y) encodes the system of equations.
rigidode calls ode45 with no output arguments, so the solver uses the default output function
odeplot to automatically plot the solution points after each step.
function rigidode
%RIGIDODE Euler equations of a rigid body without external forces.
% A standard test problem for non-stiff solvers proposed by Krogh. The
% analytical solutions are Jacobian elliptic functions, accessible in
% MATLAB. The interval here is about 1.5 periods; it is that for which
% solutions are plotted on p. 243 of Shampine and Gordon.
%
% L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary
% Differential Equations, W.H. Freeman & Co., 1975.
%
% See also ODE45, ODE23, ODE113, FUNCTION_HANDLE.
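tspan = [0 12];
y0 = [0; 1; 1];
% Solve the problem using ode45. With no output arguments, the solver
% plots the solution with the default output function odeplot.
ode45(@f,tspan,y0);
% --------------------------------------------------------------------------
function dydt = f(t,y)
% Euler equations of a rigid body without external forces.
dydt = [      y(2)*y(3)
             -y(1)*y(3)
        -0.51*y(1)*y(2) ];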
rigidode
title('Solution of Rigid Body w/o External Forces using ODE45')
legend('y_1','y_2','y_3','Location','Best')
See Also
ode45 | ode23 | ode78 | ode89 | ode113
More About
• “Choose an ODE Solver” on page 11-2
• “Parameterizing Functions” on page 10-2
External Websites
• Qualitative Analysis of ODEs (MathWorks Teaching Resources)
Solve Stiff ODEs
This page contains two examples of solving stiff ordinary differential equations using ode15s.
MATLAB® has four solvers designed for stiff ODEs.
• ode15s
• ode23s
• ode23t
• ode23tb
For most stiff problems, ode15s performs best. However, ode23s, ode23t, and ode23tb can be
more efficient if the problem permits a crude error tolerance.
For some ODE problems, the step size taken by the solver is forced down to an unreasonably small
level in comparison to the interval of integration, even in a region where the solution curve is smooth.
These step sizes can be so small that traversing a short time interval might require millions of
evaluations. This can lead to the solver failing the integration, but even if it succeeds it will take a
very long time to do so.
Equations that cause this behavior in ODE solvers are said to be stiff. The problem that stiff ODEs
pose is that explicit solvers (such as ode45) are untenably slow in achieving a solution. This is why
ode45 is classified as a nonstiff solver along with ode23, ode78, ode89, and ode113.
Solvers that are designed for stiff ODEs, known as stiff solvers, typically do more work per step. The
pay-off is that they are able to take much larger steps, and have improved numerical stability
compared to the nonstiff solvers.
Solver Options
For stiff problems, specifying the Jacobian matrix using odeset is particularly important. Stiff solvers
use the Jacobian matrix to estimate the local behavior of the ODE as the integration
proceeds, so supplying the Jacobian matrix (or, for large sparse systems, its sparsity pattern) is
critical for efficiency and reliability. Use the Jacobian, JPattern, or Vectorized options of
odeset to specify information about the Jacobian. If you do not supply the Jacobian then the solver
estimates it numerically using finite differences.
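The following minimal sketch shows one way to supply the Jacobian through odeset. The linear test system and the matrix A are assumptions chosen for illustration. Because the Jacobian of a linear system is constant, it is passed as a matrix. For a nonlinear problem, the Jacobian option would instead be a function handle of the form jac(t,y).

```matlab
% Illustrative stiff linear system y' = A*y with widely separated time scales.
A = [-1000 1; 0 -1];
odefun = @(t,y) A*y;

% The Jacobian df/dy of a linear system is the constant matrix A.
opts = odeset('Jacobian',A);
[t,y] = ode15s(odefun,[0 1],[1; 1],opts);

% The second component has the exact solution exp(-t).
fprintf('y_2(1) = %.4f, exp(-1) = %.4f\n',y(end,2),exp(-1))
```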
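The van der Pol equation is a second-order ODE
$$y_1'' - \mu\left(1 - y_1^2\right)y_1' + y_1 = 0,$$
where $\mu > 0$ is a scalar parameter. When $\mu = 1$, the resulting system of ODEs is nonstiff and easily
solved using ode45. However, if you increase $\mu$ to 1000, then the solution changes dramatically and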
exhibits oscillation on a much longer time scale. Approximating the solution of the initial value
problem becomes more difficult. Because this particular problem is stiff, a solver intended for nonstiff
problems, such as ode45, is too inefficient to be practical. Use a stiff solver such as ode15s for this
problem instead.
Rewrite the van der Pol equation as a system of first-order ODEs by making the substitution $y_1' = y_2$.
The resulting system of first-order ODEs is
$$y_1' = y_2, \qquad y_2' = \mu\left(1 - y_1^2\right)y_2 - y_1.$$
The vdp1000 function evaluates the van der Pol equation using $\mu = 1000$.
Use the ode15s function to solve the problem with an initial conditions vector of [2; 0], over a time
interval of [0 3000]. For scaling reasons, plot only the first component of the solution.
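The solve step described above can be sketched as follows. The anonymous function vdp1000_sketch is a hypothetical stand-in for the shipped vdp1000 file, and the plot labels are illustrative.

```matlab
% Hypothetical stand-in for vdp1000: the van der Pol system with mu = 1000.
mu = 1000;
vdp1000_sketch = @(t,y) [y(2); mu*(1 - y(1)^2)*y(2) - y(1)];

% Solve with the stiff solver ode15s over [0 3000] from [2; 0].
[t,y] = ode15s(vdp1000_sketch,[0 3000],[2; 0]);

% Plot only the first component; the second has a much larger scale.
plot(t,y(:,1),'-o')
title('Solution of van der Pol Equation (\mu = 1000) with ODE15S')
xlabel('Time t')
ylabel('Solution y_1')
```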
The vdpode function also solves the same problem, but it accepts a user-specified value for $\mu$. The
equations become increasingly stiff as $\mu$ increases.
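The classic Brusselator system of equations is potentially large, stiff, and sparse. The Brusselator
system models diffusion in a chemical reaction, and is represented by a system of equations involving
$u$, $v$, $u'$, and $v'$:
$$u_i' = 1 + u_i^2 v_i - 4u_i + \alpha (N+1)^2 \left(u_{i-1} - 2u_i + u_{i+1}\right)$$
$$v_i' = 3u_i - u_i^2 v_i + \alpha (N+1)^2 \left(v_{i-1} - 2v_i + v_{i+1}\right).$$
The function file brussode solves this set of equations on the time interval [0,10] with $\alpha = 1/50$.
The initial conditions are
$$u_i(0) = 1 + \sin(2\pi x_i), \qquad v_i(0) = 3,$$
where $x_i = i/(N+1)$ for $i = 1, \ldots, N$. Therefore, there are $2N$ equations in the system, but the
Jacobian is a banded matrix with a constant width of 5 if the equations are ordered as
$u_1, v_1, u_2, v_2, \ldots$. As $N$ increases, the problem becomes increasingly stiff, and the Jacobian becomes
increasingly sparse.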
The function call brussode(N), for $N \ge 2$, specifies a value for N in the system of equations,
corresponding to the number of grid points. By default, brussode uses $N = 20$.
• The nested function f(t,y) encodes the system of equations for the Brusselator problem,
returning a vector.
• The local function jpattern(N) returns a sparse matrix of 1s and 0s showing the locations of
nonzeros in the Jacobian. This matrix is assigned to the JPattern field of the options structure.
The ODE solver uses this sparsity pattern to generate the Jacobian numerically as a sparse matrix.
Supplying this sparsity pattern in the problem significantly reduces the number of function
evaluations required to generate the 2N-by-2N Jacobian, from 2N evaluations to just 4.
function brussode(N)
%BRUSSODE Stiff problem modelling a chemical reaction (the Brusselator).
% The parameter N >= 2 is used to specify the number of grid points; the
% resulting system consists of 2N equations. By default, N is 20. The
% problem becomes increasingly stiff and increasingly sparse as N is
% increased. The Jacobian for this problem is a sparse constant matrix
% (banded with bandwidth 5).
%
% The property 'JPattern' is used to provide the solver with a sparse
% matrix of 1's and 0's showing the locations of nonzeros in the Jacobian
% df/dy. By default, the stiff solvers of the ODE Suite generate Jacobians
% numerically as full matrices. However, when a sparsity pattern is
% provided, the solver uses it to generate the Jacobian numerically as a
% sparse matrix. Providing a sparsity pattern can significantly reduce the
% number of function evaluations required to generate the Jacobian and can
% accelerate integration. For the BRUSSODE problem, only 4 evaluations of
% the function are needed to compute the 2N x 2N Jacobian matrix.
%
% Setting the 'Vectorized' property indicates the function f is
% vectorized.
%
% E. Hairer and G. Wanner, Solving Ordinary Differential Equations II,
% Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin,
% 1991, pp. 5-8.
%
% See also ODE15S, ODE23S, ODE23T, ODE23TB, ODESET, FUNCTION_HANDLE.
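% Problem parameter, shared with the nested function.
if nargin < 1
   N = 20;
end
tspan = [0; 10];
y0 = [1+sin((2*pi/(N+1))*(1:N)); repmat(3,1,N)];
options = odeset('Vectorized','on','JPattern',jpattern(N));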
[t,y] = ode15s(@f,tspan,y0,options);
u = y(:,1:2:end);
x = (1:N)/(N+1);
figure;
surf(x,t,u);
view(-40,30);
xlabel('space');
ylabel('time');
zlabel('solution u');
title(['The Brusselator for N = ' num2str(N)]);
% -------------------------------------------------------------------------
% Nested function -- N is provided by the outer function.
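%
function dydt = f(t,y)
% Derivative function
c = 0.02 * (N+1)^2;
dydt = zeros(2*N,size(y,2)); % preallocate dy/dt
% Evaluate the 2 components of the function at one edge of the grid
% (with edge conditions).
i = 1;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(1-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(3-2*y(i+1,:)+y(i+3,:));
% Evaluate the 2 components of the function at all interior grid points.
i = 3:2:2*N-3;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + ...
   c*(y(i-2,:)-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + ...
   c*(y(i-1,:)-2*y(i+1,:)+y(i+3,:));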
% Evaluate the 2 components of the function at the other edge of the grid
% (with edge conditions).
i = 2*N-1;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(y(i-2,:)-2*y(i,:)+1);
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(y(i-1,:)-2*y(i+1,:)+3);
end
% -------------------------------------------------------------------------
end % brussode
% ---------------------------------------------------------------------------
% Subfunction -- the sparsity pattern
%
function S = jpattern(N)
% Jacobian sparsity pattern
B = ones(2*N,5);
B(2:2:2*N,2) = zeros(N,1);
B(1:2:2*N-1,4) = zeros(N,1);
S = spdiags(B,-2:2,2*N,2*N);
end
% ---------------------------------------------------------------------------
brussode
brussode(50)
See Also
ode15s | ode23s | ode23t | ode23tb
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Parameterizing Functions” on page 10-2
• The ode15s and ode23t solvers can solve index-1 linearly implicit problems with a singular mass
matrix $M(t,y)\,y' = f(t,y)$, including semi-explicit DAEs of the form
$$y' = f(t,y,z)$$
$$0 = g(t,y,z).$$
In this form, the presence of algebraic variables leads to a singular mass matrix, since there are
one or more zeros on the main diagonal.
$$My' = \begin{pmatrix} y_1' & 0 & \cdots & 0 \\ 0 & y_2' & 0 & \vdots \\ \vdots & 0 & \ddots & 0 \\ 0 & \cdots & 0 & 0 \end{pmatrix}.$$
By default, solvers automatically test the singularity of the mass matrix to detect DAE systems. If
you know about singularity ahead of time then you can set the MassSingular option of odeset
to 'yes'. With DAEs, you can also provide the solver with a guess of the initial conditions for y′0
using the InitialSlope property of odeset. This is in addition to specifying the usual initial
conditions for y0 in the call to the solver.
• The ode15i solver can solve more general DAEs in the fully implicit form
$$f(t,y,y') = 0.$$
In the fully implicit form, the presence of algebraic variables leads to a singular Jacobian matrix.
This is because at least one of the columns in the matrix is guaranteed to contain all zeros, since
the derivative of that variable does not appear in the equations.
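$$J = \partial f / \partial y' = \begin{pmatrix} \dfrac{\partial f_1}{\partial y_1'} & \cdots & \dfrac{\partial f_1}{\partial y_n'} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial y_1'} & \cdots & \dfrac{\partial f_m}{\partial y_n'} \end{pmatrix}$$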
Solve Differential Algebraic Equations (DAEs)
The ode15i solver requires that you specify initial conditions for both y′0 and y0. Also, unlike the
other ODE solvers, ode15i requires the function encoding the equations to accept an extra input:
odefun(t,y,yp).
DAEs arise in a wide variety of systems because physical conservation laws often have forms like
x + y + z = 0. If x, x', y, and y' are defined explicitly in the equations, then this conservation
equation is sufficient to solve for z without having an expression for z'.
• ode15s and ode23t — If you do not specify an initial condition for y′0, then the solver
automatically computes consistent initial conditions based on the initial condition you provide for
y0. If you specify an inconsistent initial condition for y′0, then the solver treats the values as
guesses, attempts to compute consistent values close to the guesses, and continues on to solve the
problem.
• ode15i — The initial conditions you supply to the solver must be consistent, and ode15i does not
check the supplied values for consistency. The helper function decic computes consistent initial
conditions for this purpose.
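As a minimal sketch of the ode15i workflow described in the last item, the following code uses decic to turn a rough guess for the initial slope into a consistent one. The residual function is the Weissinger equation, a fully implicit ODE chosen here only for illustration because it has the known solution $y = \sqrt{t^2 + 1/2}$. The names res, y0c, and yp0c are hypothetical.

```matlab
% Illustrative fully implicit problem in residual form f(t,y,y') = 0.
res = @(t,y,yp) t*y^2*yp^3 - y^3*yp^2 + t*(t^2 + 1)*yp - t^2*y;

t0  = 1;
y0  = sqrt(3/2);   % known value of y at t0, held fixed
yp0 = 0;           % rough guess for y'(t0), free to change

% fixed_y0 = 1 holds y0 fixed; fixed_yp0 = 0 lets decic adjust yp0.
[y0c,yp0c] = decic(res,t0,y0,1,yp0,0);

% Pass the consistent values to ode15i, which takes both y0 and yp0.
[t,y] = ode15i(res,[1 10],y0c,yp0c);
```

The computed slope yp0c can be compared with the analytic value $t_0/y_0$ as a quick consistency check.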
Differential Index
DAEs are characterized by their differential index, which is a measure of their singularity. By
differentiating equations you can eliminate algebraic variables, and if you do this enough times then
the equations take the form of a system of explicit ODEs. The differential index of a system of DAEs is
the number of derivatives you must take to express the system as an equivalent system of explicit
ODEs. Thus, ODEs have a differential index of 0.
An example of an index-1 DAE is
$$y(t) = k(t).$$
For this equation, you can take a single derivative to obtain the explicit ODE form
$$y' = k'(t).$$
An example of an index-2 DAE is
$$y_1' = y_2$$
$$0 = k(t) - y_1.$$
These equations require two derivatives to be rewritten in the explicit ODE form
$$y_1' = k'(t)$$
$$y_2' = k''(t).$$
The ode15s and ode23t solvers only solve DAEs of index 1. If the index of your equations is 2 or
higher, then you need to rewrite the equations as an equivalent system of index-1 DAEs. It is always
possible to take derivatives and rewrite a DAE system as an equivalent system of index-1 DAEs. Be
aware that if you replace algebraic equations with their derivatives, then you might have removed
some constraints. If the equations no longer include the original constraints, then the numerical
solution can drift.
If you have Symbolic Math Toolbox, then see “Solve Differential Algebraic Equations (DAEs)”
(Symbolic Math Toolbox) for more information.
Imposing Nonnegativity
Most of the options in odeset on page 11-10 work as expected with the DAE solvers ode15s,
ode23t, and ode15i. However, one notable exception is with the use of the NonNegative on page
11-35 option. The NonNegative option does not support implicit solvers (ode15s, ode23t,
ode23tb) applied to problems with a mass matrix. Therefore, you cannot use this option to impose
nonnegativity constraints on a DAE problem, which necessarily has a singular mass matrix. For more
details, see [1].
This example reformulates a system of ODEs as a system of differential algebraic equations (DAEs).
The Robertson problem found in hb1ode.m is a classic test problem for programs that solve stiff
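ODEs. The system of equations is
$$y_1' = -0.04\,y_1 + 10^4\,y_2 y_3$$
$$y_2' = 0.04\,y_1 - 10^4\,y_2 y_3 - \left(3 \times 10^7\right) y_2^2$$
$$y_3' = \left(3 \times 10^7\right) y_2^2.$$
hb1ode solves this system of ODEs to steady state with the initial conditions $y_1 = 1$, $y_2 = 0$, and
$y_3 = 0$. But the equations also satisfy a linear conservation law,
$$y_1 + y_2 + y_3 = 1.$$
The system of equations can be rewritten as a system of DAEs by using the conservation law to
determine the state of $y_3$. This reformulates the problem as the DAE system
$$y_1' = -0.04\,y_1 + 10^4\,y_2 y_3$$
$$y_2' = 0.04\,y_1 - 10^4\,y_2 y_3 - \left(3 \times 10^7\right) y_2^2$$
$$0 = y_1 + y_2 + y_3 - 1.$$
The differential index of this system is 1, since only a single derivative of $y_3$ is required to make this a
system of ODEs. Therefore, no further transformations are required before solving the system.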
The function robertsdae encodes this DAE system. Save robertsdae.m in your current folder to
run the example.
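A function of the kind described above might look like the following minimal sketch. The name robertsdae_sketch is hypothetical, and the body is written from the Robertson DAE formulation (two differential equations plus the conservation law) rather than copied from the shipped file.

```matlab
% Hypothetical stand-in for robertsdae.m. The first two rows are the
% differential equations; the last row is the conservation law residual.
robertsdae_sketch = @(t,y) [-0.04*y(1) + 1e4*y(2)*y(3)
                             0.04*y(1) - 1e4*y(2)*y(3) - 3e7*y(2)^2
                             y(1) + y(2) + y(3) - 1];

% At the state [1; 0; 0] the algebraic residual (third row) is zero,
% which is what makes these initial conditions consistent.
out0 = robertsdae_sketch(0,[1; 0; 0]);
disp(out0)
```

A handle like this can be passed to ode15s in place of @robertsdae, together with a singular mass matrix whose last diagonal entry is zero.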
The full example code for this formulation of the Robertson problem is available in hb1dae.m.
Solve the DAE system using ode15s. Consistent initial conditions for y0 are obvious based on the
conservation law. Use odeset to set the options:
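• Use a constant mass matrix to represent the left hand side of the system of equations.
• Set the relative error tolerance to 1e-4.
• Use an absolute tolerance of 1e-10 for the second solution component and 1e-6 for the other
components.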
y0 = [1; 0; 0];
tspan = [0 4*logspace(-6,6)];
M = [1 0 0; 0 1 0; 0 0 0];
options = odeset('Mass',M,'RelTol',1e-4,'AbsTol',[1e-6 1e-10 1e-6]);
[t,y] = ode15s(@robertsdae,tspan,y0,options);
y(:,2) = 1e4*y(:,2);
semilogx(t,y);
ylabel('1e4 * y(:,2)');
title('Robertson DAE problem with a Conservation Law, solved by ODE15S');
References
[1] Shampine, L.F., S. Thompson, J.A. Kierzenka, and G.D. Byrne. “Non-Negative Solutions of ODEs.”
Applied Mathematics and Computation 170, no. 1 (November 2005): 556–569.
https://doi.org/10.1016/j.amc.2004.12.011.
See Also
ode15s | ode23t | ode15i | odeset
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Equation Solving” (Symbolic Math Toolbox)
External Websites
• Solving Index-1 DAEs in MATLAB and Simulink
Nonnegative ODE Solution
This topic shows how to constrain the solution of an ODE to be nonnegative. Imposing nonnegativity
is not always trivial, but sometimes it is necessary due to the physical interpretation of the equations
or due to the nature of the solution. You should only impose this constraint on the solution when
necessary, such as in cases where the integration fails without it, or where the solution would be
inapplicable.
If certain components of the solution must be nonnegative, then use odeset to set the NonNegative
option for the indices of these components. This option is not available for ode23s, ode15i, or for
implicit solvers (ode15s, ode23t, ode23tb) applied to problems with a mass matrix. In particular,
you cannot impose nonnegativity constraints on a DAE problem, which necessarily has a singular
mass matrix.
Consider the initial value problem
$$y' = -|y|,$$
solved on the interval [0, 40] with the initial condition $y(0) = 1$. The solution of this ODE decays to
zero. If the solver produces a negative solution value, then it begins to track the solution of the ODE
through this value, and the computation eventually fails as the calculated solution diverges to − ∞.
Using the NonNegative option prevents this integration failure.
Compare the analytic solution $y(t) = e^{-t}$ to a solution of the ODE using ode45 with no extra
options, and to one with the NonNegative option set.
% Analytic solution
t = linspace(0,40,1000);
y = exp(-t);
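The two numerical solutions used in the comparison can be computed along the following lines. This is a sketch under stated assumptions: the variable names t0, y0, t1, and y1 are taken from the plotting command in this topic, and setting Refine to 1 (so that only the solver steps are returned) is an illustrative choice.

```matlab
ode = @(t,y) -abs(y);

% Standard solution with ode45; Refine = 1 returns only the solver steps.
options1 = odeset('Refine',1);
[t0,y0] = ode45(ode,[0 40],1,options1);

% Solution with the nonnegativity constraint on solution component 1.
options2 = odeset(options1,'NonNegative',1);
[t1,y1] = ode45(ode,[0 40],1,options2);
```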
Plot the three solutions for comparison. Imposing nonnegativity is crucial to keep the solution from
veering off toward − ∞.
plot(t,y,'b-',t0,y0,'ro',t1,y1,'k*');
legend('Exact solution','No constraints','Nonnegativity', ...
'Location','SouthWest')
Another example of a problem that requires a nonnegative solution is the knee problem coded in the
example file kneeode. The equation is
$$\epsilon y' = (1 - x)\,y - y^2,$$
solved on the interval [0, 2] with the initial condition $y(0) = 1$. The parameter $\epsilon$ generally is taken to
satisfy $0 < \epsilon \ll 1$, and this problem uses $\epsilon = 1 \times 10^{-6}$. The solution to this ODE approaches $y = 1 - x$
for $x < 1$ and $y = 0$ for $x > 1$. However, computing the numerical solution with default tolerances
shows that the solution follows the $y = 1 - x$ isocline for the whole interval of integration. Imposing
nonnegativity constraints results in the correct solution.
epsilon = 1e-6;
y0 = 1;
xspan = [0 2];
odefcn = @(x,y,epsilon) ((1-x)*y - y^2)/epsilon;
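The two solves that the comparison relies on can be sketched as follows. The block repeats the definitions above so that it runs on its own. The choice of ode15s is an assumption made here, and the names x1, y1, x2, and y2 match the plotting commands in this topic.

```matlab
epsilon = 1e-6;
y0 = 1;
xspan = [0 2];
odefcn = @(x,y,epsilon) ((1-x)*y - y^2)/epsilon;

% Solve without constraints. The parameter epsilon is bound in the handle.
[x1,y1] = ode15s(@(x,y) odefcn(x,y,epsilon),xspan,y0);

% Solve again with a nonnegativity constraint on solution component 1.
options = odeset('NonNegative',1);
[x2,y2] = ode15s(@(x,y) odefcn(x,y,epsilon),xspan,y0,options);
```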
plot(x1,y1,'ro',x2,y2,'b*')
axis([0,2,-1,1])
title('The "knee problem"')
legend('No constraints','Non-negativity')
xlabel('x')
ylabel('y')
References
[1] Shampine, L.F., S. Thompson, J.A. Kierzenka, and G.D. Byrne, "Non-negative solutions of ODEs,"
Applied Mathematics and Computation Vol. 170, 2005, pp. 556-569.
See Also
odeset
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
Error Tolerances
Question or Problem: How do I choose the error thresholds RelTol and AbsTol?
Answer: RelTol, the relative accuracy tolerance, controls the number of correct digits in the computed
answer. AbsTol, the absolute error tolerance, controls the difference between the computed answer
and the true solution. At each step, the error e in component i of the solution satisfies
|e(i)| ≤ max(RelTol*abs(y(i)),AbsTol(i))
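The inequality above can be evaluated directly to see which tolerance governs each component. The following minimal sketch uses made-up solution values and the default tolerance values. The variable threshold is hypothetical and is not something the solvers return.

```matlab
% Illustrative values only: a two-component solution at one step.
y      = [250; 1e-9];
RelTol = 1e-3;            % default relative tolerance
AbsTol = [1e-6; 1e-6];    % default absolute tolerance, one entry per component

% Per-component error threshold from the inequality above.
threshold = max(RelTol*abs(y),AbsTol);
disp(threshold)

% The same tolerances passed to a solver through odeset.
opts = odeset('RelTol',RelTol,'AbsTol',AbsTol);
```

For the large component the relative tolerance sets the threshold, while for the component near zero the absolute tolerance takes over.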
Troubleshoot Common ODE Problems
Problem Scale
Question or Problem: How large a problem can I solve with the ODE suite?
Answer: The primary constraints are memory and time. At each time step, the solvers for nonstiff
problems allocate vectors of length n, where n is the number of equations in the system. The solvers
for stiff problems allocate vectors of length n but also allocate an n-by-n Jacobian matrix. For these
solvers, it might be advantageous to specify the Jacobian sparsity pattern using the JPattern
option of odeset.
Solution Components
Question or Problem: The solution does not look like what I expected.
Answer: If your expectations are correct, then reduce the error tolerances from their default values. A
smaller relative error tolerance is needed to accurately solve problems integrated over “long”
intervals, as well as problems that are moderately unstable.
Problem Type
Question or Problem: Can the solvers handle partial differential equations (PDEs) that have been
discretized by the method of lines?
Answer: Yes, because the discretization produces a system of ODEs. Depending on the discretization,
you might have a form involving mass matrices, which the ODE solvers provide for. Often the system
is stiff. This is to be expected if the PDE is parabolic, or when there are phenomena that happen on
very different time scales such as a chemical reaction in a fluid flow. In such cases, use one of the
four stiff solvers ode15s, ode23s, ode23t, or ode23tb.
See Also
odeset | odeget | deval | odextend
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “ODE Event Location” on page 11-13
• “Nonnegative ODE Solution” on page 11-35
Differential Equations
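This example shows how to use MATLAB® to formulate and solve several different types of
differential equations. MATLAB offers several numerical algorithms to solve a wide variety of
differential equations:
• Initial value problems
• Boundary value problems
• Delay differential equations
• Partial differential equations
The function vanderpoldemo defines the van der Pol equation
$$\frac{d^2y}{dt^2} - \mu\left(1 - y^2\right)\frac{dy}{dt} + y = 0.$$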
type vanderpoldemo
The equation is written as a system of two first-order ordinary differential equations (ODEs). These
equations are evaluated for different values of the parameter μ. For faster integration, you should
choose an appropriate solver based on the value of μ.
For $\mu = 1$, any of the MATLAB ODE solvers can solve the van der Pol equation efficiently. The ode45
solver is one such example. The equation is solved in the domain $[0, 20]$ with the initial conditions
$y(0) = 2$ and $\left.\frac{dy}{dt}\right|_{t=0} = 0$.
tspan = [0 20];
y0 = [2; 0];
Mu = 1;
ode = @(t,y) vanderpoldemo(t,y,Mu);
[t,y] = ode45(ode, tspan, y0);
% Plot solution
plot(t,y(:,1))
xlabel('t')
ylabel('solution y')
title('van der Pol Equation, \mu = 1')
For larger magnitudes of μ, the problem becomes stiff. This label is for problems that resist attempts
to be evaluated with ordinary techniques. Instead, special numerical methods are needed for fast
integration. The ode15s, ode23s, ode23t, and ode23tb functions can solve stiff problems
efficiently.
This solution to the van der Pol equation for $\mu = 1000$ uses ode15s with the same initial conditions.
You need to stretch out the time span drastically to $[0, 3000]$ to be able to see the periodic movement
of the solution.
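tspan = [0, 3000];
y0 = [2; 0];
Mu = 1000;
ode = @(t,y) vanderpoldemo(t,y,Mu);
[t,y] = ode15s(ode, tspan, y0);
plot(t,y(:,1))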
title('van der Pol Equation, \mu = 1000')
axis([0 3000 -3 3])
xlabel('t')
ylabel('solution y')
bvp4c and bvp5c solve boundary value problems for ordinary differential equations.
The example function twoode has a differential equation written as a system of two first-order ODEs.
The differential equation is
$$\frac{d^2y}{dt^2} + |y| = 0.$$
type twoode
The function twobc has the boundary conditions for the problem: $y(0) = 0$ and $y(4) = -2$.
type twobc
%
% See also TWOODE, TWOBVP.
Prior to calling bvp4c, you have to provide a guess for the solution you want represented at a mesh.
The solver then adapts the mesh as it refines the solution.
The bvpinit function assembles the initial guess in a form you can pass to the solver bvp4c. For a
mesh of [0 1 2 3 4] and constant guesses of $y(x) = 1$ and $y'(x) = 0$, the call to bvpinit is:
solinit = bvpinit([0 1 2 3 4],[1; 0]);
With this initial guess, you can solve the problem with bvp4c. Evaluate the solution returned by
bvp4c at some points using deval, and then plot the resulting values.
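The steps above can be sketched end to end as follows. To keep the block self-contained, the anonymous functions odefun and bcfun are hypothetical stand-ins for twoode and twobc, written from the differential equation and boundary conditions stated in this example.

```matlab
% Stand-ins for twoode and twobc, written as anonymous functions.
odefun = @(x,y) [y(2); -abs(y(1))];     % y'' + |y| = 0 as a first-order system
bcfun  = @(ya,yb) [ya(1); yb(1) + 2];   % residuals of y(0) = 0 and y(4) = -2

% Initial guess: mesh [0 1 2 3 4], constant guesses y = 1 and y' = 0.
solinit = bvpinit([0 1 2 3 4],[1; 0]);
sol = bvp4c(odefun,bcfun,solinit);

% Evaluate the solution on a finer grid and plot the first component.
xint = linspace(0,4,50);
yint = deval(sol,xint);
plot(xint,yint(1,:))
```

The boundary condition function returns residuals, so each entry is zero when the corresponding condition is satisfied.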
This particular boundary value problem has exactly two solutions. You can obtain the other solution
by changing the initial guesses to $y(x) = -1$ and $y'(x) = 0$.
solinit = bvpinit([0 1 2 3 4],[-1; 0]);
sol = bvp4c(@twoode,@twobc,solinit);
xint = linspace(0,4,50);
yint = deval(sol,xint);
hold on
plot(xint,yint(1,:));
legend('Solution 1','Solution 2')
hold off
dde23, ddesd, and ddensd solve delay differential equations with various delays. The examples
ddex1, ddex2, ddex3, ddex4, and ddex5 form a mini tutorial on using these solvers.
The ddex1 example shows how to solve the system of differential equations
$$y_1'(t) = y_1(t - 1)$$
$$y_2'(t) = y_1(t - 1) + y_2(t - 0.2)$$
$$y_3'(t) = y_2(t).$$
The history of the problem (for $t \le 0$) is constant:
$$y_1(t) = 1$$
$$y_2(t) = 1$$
$$y_3(t) = 1.$$
ddex1hist = ones(3,1);
lags = [1 0.2];
Pass the function, delays, solution history, and interval of integration $[0, 5]$ to the solver as inputs. The
solver produces a continuous solution over the whole interval of integration that is suitable for
plotting.
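The solver call described above can be sketched as follows. The anonymous function ddefun is a hypothetical stand-in for the delay equations of ddex1, written from the system stated in this example, and the name hist is likewise an assumption. Column j of Z holds the solution delayed by lags(j).

```matlab
lags = [1 0.2];
hist = ones(3,1);   % constant history y(t) = 1 for t <= 0

% Z(:,1) approximates y(t - 1) and Z(:,2) approximates y(t - 0.2).
ddefun = @(t,y,Z) [Z(1,1); Z(1,1) + Z(2,2); y(2)];

% Solve on [0 5]; sol.x holds the mesh and sol.y the solution values.
sol = dde23(ddefun,lags,hist,[0 5]);
```

Because the history is constant, the first component grows linearly on $[0, 1]$, which gives a simple check of the result using deval.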
plot(sol.x,sol.y);
title({'An example of Wille and Baker', 'DDE with Constant Delays'});
xlabel('time t');
ylabel('solution y');
legend('y_1','y_2','y_3','Location','NorthWest');
pdepe solves partial differential equations in one space variable and time. The examples pdex1,
pdex2, pdex3, pdex4, and pdex5 form a mini tutorial on using pdepe.
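This example problem uses the functions pdex1pde, pdex1ic, and pdex1bc.
pdex1pde defines the differential equation
$$\pi^2 \frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right).$$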
type pdex1pde
function [c,f,s] = pdex1pde(x,t,u,DuDx)
c = pi^2;
f = DuDx;
s = 0;
pdex1ic sets up the initial condition
$$u(x,0) = \sin(\pi x).$$
type pdex1ic
function u0 = pdex1ic(x)
%PDEX1IC Evaluate the initial conditions for the problem coded in PDEX1.
%
% See also PDEPE, PDEX1.
u0 = sin(pi*x);
pdex1bc sets up the boundary conditions
$$u(0,t) = 0,$$
$$\pi e^{-t} + \frac{\partial}{\partial x}u(1,t) = 0.$$
type pdex1bc
function [pl,ql,pr,qr] = pdex1bc(xl,ul,xr,ur,t)
pl = ul;
ql = 0;
pr = pi * exp(-t);
qr = 1;
pdepe requires the spatial discretization x and a vector of times t (at which you want a snapshot of
the solution). Solve the problem using a mesh of 20 nodes and request the solution at five values of t.
Extract and plot the first component of the solution.
x = linspace(0,1,20);
t = [0 0.5 1 1.5 2];
sol = pdepe(0,@pdex1pde,@pdex1ic,@pdex1bc,x,t);
u1 = sol(:,:,1);
surf(x,t,u1);
xlabel('x');
ylabel('t');
zlabel('u');
See Also
ode45 | bvp4c | pdepe
More About
• “Choose an ODE Solver” on page 11-2
• “Solving Boundary Value Problems” on page 12-2
• “Solving Partial Differential Equations” on page 13-2
This example shows how to solve a differential equation representing a predator/prey model using
both ode23 and ode45. These functions are for the numerical solution of ordinary differential
equations using variable step size Runge-Kutta integration methods. ode23 uses a simple 2nd and
3rd order pair of formulas for medium accuracy and ode45 uses a 4th and 5th order pair for higher
accuracy.
Consider the pair of first-order ordinary differential equations known as the Lotka-Volterra
equations, or predator-prey model:
$$\frac{dx}{dt} = x - \alpha xy$$
$$\frac{dy}{dt} = -y + \beta xy.$$
The variables x and y measure the sizes of the prey and predator populations, respectively. The
quadratic cross term accounts for the interactions between the species. The prey population
increases when no predators are present, and the predator population decreases when prey are
scarce.
Code Equations
To simulate the system, create a function that returns a column vector of state derivatives, given
state and time values. The two variables x and y can be represented in MATLAB® as the first two
values in a vector y. Similarly, the derivatives are the first two values in a vector yp. The function
must accept values for t and y and return the values produced by the equations in yp.
yp(1) = (1 - alpha*y(2))*y(1)
yp(2) = (-1 + beta*y(1))*y(2)
In this example, the equations are contained in a file called lotka.m. This file uses parameter values
of $\alpha = 0.01$ and $\beta = 0.02$.
type lotka
function yp = lotka(t,y)
%LOTKA Lotka-Volterra predator-prey model.
yp = diag([1 - .01*y(2), -1 + .02*y(1)])*y;
Simulate System
Use ode23 to solve the differential equation defined in lotka over the interval $0 < t < 15$. Use an
initial condition of $x(0) = y(0) = 20$ so that the populations of predators and prey are equal.
t0 = 0;
tfinal = 15;
y0 = [20; 20];
[t,y] = ode23(@lotka,[t0 tfinal],y0);
Solve Predator-Prey Equations
Plot Results
plot(t,y)
title('Predator/Prey Populations Over Time')
xlabel('t')
ylabel('Population')
legend('Prey','Predators','Location','North')
Now plot the populations against each other. The resulting phase plane plot makes the cyclic
relationship between the populations very clear.
plot(y(:,1),y(:,2))
title('Phase Plane Plot')
xlabel('Prey Population')
ylabel('Predator Population')
Solve the system a second time using ode45, instead of ode23. The ode45 solver takes longer for
each step, but it also takes larger steps. Nevertheless, the output of ode45 is smooth because by
default the solver uses a continuous extension formula to produce output at four equally spaced time
points in the span of each step taken. (You can adjust the number of points with the 'Refine'
option.) Plot both solutions for comparison.
[T,Y] = ode45(@lotka,[t0 tfinal],y0);
plot(y(:,1),y(:,2),'-',Y(:,1),Y(:,2),'-');
title('Phase Plane Plot')
legend('ode23','ode45')
The results show that solving differential equations using different numerical methods can produce
slightly different answers.
See Also
ode45 | ode23
More About
• “Choose an ODE Solver” on page 11-2
• “Solve Nonstiff ODEs” on page 11-19
• “Experiment with Predator-Prey Equations”
This example solves a system of ordinary differential equations that model the dynamics of a baton
thrown into the air [1]. The baton is modeled as two particles with masses m1 and m2 connected by a
rod of length L. The baton is thrown into the air and subsequently moves in the vertical xy-plane
subject to the force due to gravity. The rod forms an angle $\theta$ with the horizontal and the coordinates
of the first mass are $(x, y)$. With this formulation, the coordinates of the second mass are
$(x + L\cos\theta,\; y + L\sin\theta)$.
The equations of motion for the system are obtained by applying Lagrange's equations for each of the
three coordinates, x, y, and θ:
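$$(m_1 + m_2)\,\ddot{x} - m_2 L\,\ddot{\theta}\sin\theta - m_2 L\,\dot{\theta}^2\cos\theta = 0,$$
$$(m_1 + m_2)\,\ddot{y} + m_2 L\,\ddot{\theta}\cos\theta - m_2 L\,\dot{\theta}^2\sin\theta + (m_1 + m_2)\,g = 0,$$
$$L^2\,\ddot{\theta} - L\,\ddot{x}\sin\theta + L\,\ddot{y}\cos\theta + g L\cos\theta = 0.$$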
To solve this system of ODEs in MATLAB®, code the equations into a function before calling the
solver ode45. You can either include the required functions as local functions at the end of a file (as
done here), or save them as separate, named files in a directory on the MATLAB path.
Code Equations
The ode45 solver requires the equations to be written in the form $\dot{q} = f(t,q)$, where $\dot{q}$ is the first
derivative of each coordinate. In this problem, the solution vector has six components: $x$, $y$, the angle
$\theta$, and their first derivatives:
Solve Equations of Motion for Baton Thrown into Air
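$$q = \begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \\ q_5 \\ q_6 \end{pmatrix} = \begin{pmatrix} x \\ \dot{x} \\ y \\ \dot{y} \\ \theta \\ \dot{\theta} \end{pmatrix}.$$
With this notation, you can rewrite the equations of motion entirely in terms of the elements of q:
$$(m_1 + m_2)\,\dot{q}_2 - m_2 L\,\dot{q}_6 \sin q_5 = m_2 L\,q_6^2 \cos q_5,$$
$$(m_1 + m_2)\,\dot{q}_4 + m_2 L\,\dot{q}_6 \cos q_5 = m_2 L\,q_6^2 \sin q_5 - (m_1 + m_2)\,g,$$
$$L^2\,\dot{q}_6 - L\,\dot{q}_2 \sin q_5 + L\,\dot{q}_4 \cos q_5 = -g L \cos q_5.$$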
Unfortunately, the equations of motion do not fit into the form $\dot{q} = f(t,q)$ required by the solver, since
there are several terms on the left with first derivatives. When this occurs, you must use a mass
matrix to represent the left side of the equation.
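With matrix notation, you can rewrite the equations of motion as a system of six equations using a
mass matrix in the form $M(t,q)\,\dot{q} = f(t,q)$. The mass matrix expresses the linear combinations of first
derivatives on the left side of the equation with a matrix-vector product. In this form, the system of
equations becomes:
$$\begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & m_1 + m_2 & 0 & 0 & 0 & -m_2 L \sin q_5 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & m_1 + m_2 & 0 & m_2 L \cos q_5 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & -L \sin q_5 & 0 & L \cos q_5 & 0 & L^2 \end{pmatrix} \begin{pmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \\ \dot{q}_4 \\ \dot{q}_5 \\ \dot{q}_6 \end{pmatrix} = \begin{pmatrix} q_2 \\ m_2 L\,q_6^2 \cos q_5 \\ q_4 \\ m_2 L\,q_6^2 \sin q_5 - (m_1 + m_2)\,g \\ q_6 \\ -g L \cos q_5 \end{pmatrix}$$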
From this expression, you can write a function that calculates the nonzero elements of the mass
matrix. The function takes three inputs: t and the solution vector q are required (you must specify
these inputs even if they are not used in the body of the function), and P is an optional extra input
used to pass in parameter values. To pass the parameters for this problem to the function, create P as
a structure that holds the parameter values and then use the technique described in “Parameterizing
Functions” on page 10-2 to pass the structure to the function as an extra input.
function M = mass(t,q,P)
% Extract parameters
m1 = P.m1;
m2 = P.m2;
L = P.L;
g = P.g;
% Set nonzero elements in mass matrix
M = zeros(6,6);
M(1,1) = 1;
M(2,2) = m1 + m2;
M(2,6) = -m2*L*sin(q(5));
M(3,3) = 1;
M(4,4) = m1 + m2;
M(4,6) = m2*L*cos(q(5));
M(5,5) = 1;
M(6,2) = -L*sin(q(5));
M(6,4) = L*cos(q(5));
M(6,6) = L^2;
end
Next, you can write a function for the right side of each of the equations in the system
$M(t, q)\,\dot{q} = f(t, q)$. Like the mass matrix function, this function takes two required inputs for t and q,
and one optional input for parameter values P.
function dydt = f(t,q,P)
% Extract parameters
m1 = P.m1;
m2 = P.m2;
L = P.L;
g = P.g;
% Equation to solve
dydt = [q(2)
        m2*L*q(6)^2*cos(q(5))
        q(4)
        m2*L*q(6)^2*sin(q(5))-(m1+m2)*g
        q(6)
        -g*L*cos(q(5))];
end
First, create a structure P of parameter values for m1, m2, g, and L by setting appropriately named
fields in a structure. The structure P is passed to the mass matrix and ODE functions as an extra
input.
P.m1 = 0.1;
P.m2 = 0.1;
P.L = 1;
P.g = 9.81;
Create a vector with 25 points between 0 and 4 for the time span of the integration. When you specify
a vector of times, ode45 returns interpolated solutions at the requested times.
tspan = linspace(0,4,25);
Set the initial conditions of the problem. Since the baton is thrown upward at an angle, use nonzero
values for the initial velocities, $\dot{x}_0 = 4$, $\dot{y}_0 = 20$, and $\dot{\theta}_0 = 2$. For the initial positions, the baton begins
in an upright position, so $\theta_0 = -\pi/2$, $x_0 = 0$, and $y_0 = L$.
Use odeset to create an options structure that references the mass matrix function. Since the solver
expects the mass matrix function to have only two inputs, use an anonymous function to pass in the
parameters P from the workspace.
Finally, solve the system of equations using ode45 with these inputs:
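The initial-condition vector, the odeset call, and the ode45 call described above can be sketched as follows. This is a minimal, self-contained sketch rather than the original listing: the names massFcn, rhsFcn, q0, and xcm are assumptions, and the mass matrix and right side are written as anonymous functions (instead of local functions) only so that the whole sketch fits in one script.

```matlab
% Parameter values from the example
P.m1 = 0.1;
P.m2 = 0.1;
P.L = 1;
P.g = 9.81;

% Mass matrix M(t,q), written as an anonymous function (assumed form)
massFcn = @(t,q) [1, 0, 0, 0, 0, 0
    0, P.m1+P.m2, 0, 0, 0, -P.m2*P.L*sin(q(5))
    0, 0, 1, 0, 0, 0
    0, 0, 0, P.m1+P.m2, 0, P.m2*P.L*cos(q(5))
    0, 0, 0, 0, 1, 0
    0, -P.L*sin(q(5)), 0, P.L*cos(q(5)), 0, P.L^2];

% Right side f(t,q) of M(t,q)*q' = f(t,q)
rhsFcn = @(t,q) [q(2)
    P.m2*P.L*q(6)^2*cos(q(5))
    q(4)
    P.m2*P.L*q(6)^2*sin(q(5))-(P.m1+P.m2)*P.g
    q(6)
    -P.g*P.L*cos(q(5))];

% Initial conditions ordered as [x; xdot; y; ydot; theta; thetadot]
q0 = [0; 4; P.L; 20; -pi/2; 2];
tspan = linspace(0,4,25);

% Reference the mass matrix function in the options, then solve
opts = odeset('Mass',massFcn);
[t,q] = ode45(rhsFcn,tspan,q0,opts);

% Horizontal position of the center of mass, used as a sanity check
xcm = q(:,1) + P.m2/(P.m1+P.m2)*P.L*cos(q(:,5));
```

Because gravity acts only vertically, the horizontal velocity of the center of mass should stay close to its initial value of 5 for these initial conditions, so xcm should grow nearly linearly in t.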
Plot Results
The outputs from ode45 contain the solutions of the equations of motion at each requested time step.
To examine the results, plot the motion of the baton over time.
Loop through each row of the solution, and at each time step, plot the position of the baton. Color
each end of the baton differently so that you can see its rotation over time.
figure
title('Motion of a Thrown Baton, Solved by ODE45');
axis([0 22 0 25])
hold on
for j = 1:length(t)
theta = q(j,5);
X = q(j,1);
Y = q(j,3);
xvals = [X X+P.L*cos(theta)];
yvals = [Y Y+P.L*sin(theta)];
plot(xvals,yvals,xvals(1),yvals(1),'ro',xvals(2),yvals(2),'go')
end
hold off
References
[1] Wells, Dare A. Schaum’s Outline of Theory and Problems of Lagrangian Dynamics: With a
Treatment of Euler’s Equations of Motion, Hamilton’s Equations and Hamilton’s Principle. Schaum's
Outline Series. New York: Schaum Pub. Co, 1967.
Local Functions
Listed here are the local helper functions that the ODE solver calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
See Also
ode45
More About
• “Choose an ODE Solver” on page 11-2
• “Solve Nonstiff ODEs” on page 11-19
This example shows how to solve Burgers' equation using a moving mesh technique [1]. The problem
includes a mass matrix, and options are specified to account for the strong state dependence and
sparsity of the mass matrix, making the solution process more efficient.
Problem Setup
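Burgers' equation is a convection-diffusion equation given by the PDE
$$
\frac{\partial u}{\partial t} = \epsilon \frac{\partial^2 u}{\partial x^2} - \frac{\partial}{\partial x}\left(\frac{u^2}{2}\right), \quad 0 < x < 1,\quad t > 0,\quad \epsilon = 1 \times 10^{-4}.
$$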
Applying a coordinate transformation (Eq. 18 in [1]) leads to an extra term on the left-hand side:
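$$
\frac{\partial u}{\partial t} - \frac{\partial u}{\partial x}\frac{\partial x}{\partial t} = \epsilon \frac{\partial^2 u}{\partial x^2} - \frac{\partial}{\partial x}\left(\frac{u^2}{2}\right).
$$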
Converting the PDE into an ODE of one variable is accomplished by using finite differences to
approximate the partial derivatives taken with respect to x. If the finite differences are written as Δ,
then the PDE can be rewritten as an ODE that only contains derivatives taken with respect to t:
$$
\frac{du}{dt} - \Delta u \frac{dx}{dt} = \epsilon \Delta^2 u - \Delta\left(\frac{u^2}{2}\right).
$$
In this form, you can use an ODE solver such as ode15s to solve for u and x over time.
For this example, the problem is formulated on a moving mesh of N points, and the moving mesh
technique described in [1] positions the mesh points at each time step so that they are concentrated
in areas of change. The boundary and initial conditions are
$$
u(0, t) = u(1, t) = 0,
$$
$$
u(x, 0) = \sin(2\pi x) + \frac{1}{2}\sin(\pi x).
$$
For a given initial mesh of N points, there are 2N equations to solve: N equations corresponding to
Burgers' equation, and N equations determining the movement of each mesh point. So, the final
system of equations is:
Solve ODE with Strongly State-Dependent Mass Matrix
The terms for the moving mesh correspond to MMPDE6 in [1]. The parameter τ represents a
timescale for forcing the mesh toward equidistribution. The term $B(x, t)$ is a monitor function given by
Eq. 21 in [1]:
$$
B(x, t) = \sqrt{1 + \left(\frac{\partial u_i}{\partial x_i}\right)^2}.
$$
The approach used in this example to solve Burgers' equation with moving mesh points demonstrates
several techniques:
• The system of equations is expressed using a mass matrix formulation, $M y' = f(t, y)$. The mass
matrix is provided to the ode15s solver as a function.
• The derivative function not only includes the equations for Burgers' equation, but also a set of
equations governing the moving mesh selection.
• The sparsity patterns of the Jacobian dF/dy and the derivative of the mass matrix multiplied with a
vector d(Mv)/dy are supplied to the solver as functions. Supplying these sparsity patterns helps
the solver operate more efficiently.
• Finite differences are used to approximate several partial derivatives.
To solve this equation in MATLAB®, write a derivative function, a mass matrix function, a function for
the sparsity pattern of the Jacobian dF/dy, and a function for the sparsity pattern of d(Mv)/dy. You
can either include the required functions as local functions at the end of a file (as done here), or save
them as separate, named files in a directory on the MATLAB path.
The left side of the system of equations involves linear combinations of first derivatives, so a mass
matrix is required to represent all of the terms. Set the left side of the system of equations equal to
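$M y'$ to extract the form of the mass matrix. The mass matrix is composed of four blocks, each of
which is a square matrix of order N:
$$
\begin{bmatrix}
\dfrac{\partial u_1}{\partial t} - \dfrac{\partial u_1}{\partial x_1}\dfrac{\partial x_1}{\partial t}\\
\vdots\\
\dfrac{\partial u_N}{\partial t} - \dfrac{\partial u_N}{\partial x_N}\dfrac{\partial x_N}{\partial t}\\
\dfrac{\partial^2 \dot{x}_1}{\partial t^2}\\
\vdots\\
\dfrac{\partial^2 \dot{x}_N}{\partial t^2}
\end{bmatrix}
= M y' =
\begin{bmatrix} M_1 & M_2\\ M_3 & M_4 \end{bmatrix}
\begin{bmatrix} \dot{u}_1\\ \vdots\\ \dot{u}_N\\ \dot{x}_1\\ \vdots\\ \dot{x}_N \end{bmatrix}.
$$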
This formulation shows that M1 and M2 form the left side of Burgers' equations (the first N equations
in the system), while M3 and M4 form the left side of the mesh equations (the last N equations in the
system). The block matrices are:
$$
M_1 = I_N, \qquad
M_2 = -\frac{\partial u_i}{\partial x_i} I_N, \qquad
M_3 = 0_N, \qquad
M_4 = \frac{\partial^2}{\partial t^2} I_N.
$$
IN is the N × N identity matrix. The partial derivatives in M2 are estimated using finite differences,
while the partial derivative in M4 uses a Laplacian matrix. Notice that M3 contains only zeros because
none of the equations for the mesh movement depend on u̇.
Now you can write a function that computes the mass matrix. The function must accept two inputs for
time t and the solution vector y. Since the solution vector y contains half u components and half x
components, the function extracts these first. Then, the function forms all of the block matrices
(taking the boundary values of the problem into account) and assembles the mass matrix using the
four blocks.
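function M = mass(t,y)
% Extract the components of y for the solution u and mesh x
N = length(y)/2;
u = y(1:N);
x = y(N+1:end);
% Boundary values of solution u and mesh x
u0 = 0;
x0 = 0;
uNP1 = 0;
xNP1 = 1;
% M1 and M2 are the portions of the mass matrix for Burgers' equation.
% The derivative du/dx is approximated with finite differences, using
% single-sided differences on the edges and centered differences in between.
M1 = speye(N);
M2 = sparse(N,N);
M2(1,1) = - (u(2) - u0)/(x(2) - x0);
for i = 2:N-1
    M2(i,i) = - (u(i+1) - u(i-1))/(x(i+1) - x(i-1));
end
M2(N,N) = - (uNP1 - u(N-1))/(xNP1 - x(N-1));
% M3 is all zeros. M4 approximates the second derivative with a finite-difference Laplacian matrix.
M3 = sparse(N,N);
e = ones(N,1);
M4 = spdiags([e -2*e e],-1:1,N,N);
% Assemble mass matrix
M = [M1 M2
     M3 M4];
end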
Note: All functions are included as local functions at the end of the example.
The derivative function for this problem returns a vector with 2N elements. The first N elements
correspond to Burgers' equations, while the last N elements are for the moving mesh equations. The
function movingMeshODE goes through these steps to evaluate the right-hand sides of all the
equations in the system:
The first N equations in the derivative function encode the right side of Burgers' equations. Burgers'
equations can be considered as a differential operator involving spatial derivatives of the form:
$$
f(u) = \epsilon \frac{\partial^2 u}{\partial x^2} - \frac{\partial}{\partial x}\left(\frac{u^2}{2}\right).
$$
The reference paper [1] describes the process of approximating the differential operator f using
centered finite differences by
$$
f_i = \epsilon \left[ \frac{\dfrac{u_{i+1} - u_i}{x_{i+1} - x_i} - \dfrac{u_i - u_{i-1}}{x_i - x_{i-1}}}{\tfrac{1}{2}\left(x_{i+1} - x_{i-1}\right)} \right] - \frac{1}{2}\,\frac{u_{i+1}^2 - u_{i-1}^2}{x_{i+1} - x_{i-1}}.
$$
On the edges of the mesh (for which i = 1 and i = N), only single-sided differences are used instead.
This example uses $\epsilon = 1 \times 10^{-4}$.
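The finite-difference operator above can be checked numerically on a profile whose derivatives are known. The following is a minimal sketch, not the example's movingMeshODE function: the test profile, the mesh size, and the variable names are assumptions, and the edge points are handled by padding with the fixed boundary values rather than by separate edge formulas.

```matlab
epsilon = 1e-4;              % value used in this example
N = 99;                      % hypothetical mesh size for the check
h = 1/(N+1);
x = h*(1:N)';                % interior mesh points on a uniform grid
u = x.*(1 - x);              % hypothetical test profile with u(0) = u(1) = 0

% Pad with the boundary values so that one formula covers i = 1,...,N
up = [0; u; 0];
xp = [0; x; 1];
i = (2:N+1)';

slopeRight = (up(i+1) - up(i))./(xp(i+1) - xp(i));
slopeLeft  = (up(i) - up(i-1))./(xp(i) - xp(i-1));
span = xp(i+1) - xp(i-1);

% Discrete operator f_i: diffusion term minus convection term
fi = epsilon*(slopeRight - slopeLeft)./(0.5*span) ...
    - 0.5*(up(i+1).^2 - up(i-1).^2)./span;

% Analytic operator epsilon*u'' - u*u' for the test profile
fExact = epsilon*(-2) - x.*(1 - x).*(1 - 2*x);
maxErr = max(abs(fi - fExact));
```

For this quadratic profile the diffusion term is reproduced exactly up to round-off, so maxErr reflects only the second-order error of the centered difference in the convection term.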
The equations governing the mesh (comprising the last N equations in the derivative function) are
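$$
\frac{\partial^2 \dot{x}}{\partial t^2} = \frac{1}{\tau}\frac{\partial}{\partial t}\left(B(x, t)\frac{\partial x}{\partial t}\right).
$$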
Just as with Burgers' equations, you can use finite differences to approximate the monitor function
$B(x, t)$:
$$
B(x, t) = \sqrt{1 + \left(\frac{\partial u_i}{\partial x_i}\right)^2} = \sqrt{1 + \left(\frac{u_{i+1} - u_{i-1}}{x_{i+1} - x_{i-1}}\right)^2}.
$$
Once the monitor function is evaluated, spatial smoothing is applied (Equations 14 and 15 in [1]).
This example uses γ = 2 and p = 2 for the spatial smoothing parameters.
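As a rough illustration of the monitor function evaluation (without the spatial smoothing step), the following sketch computes the centered-difference approximation on a small mesh. It assumes the arc-length form $\sqrt{1 + (\partial u/\partial x)^2}$ of the monitor function; the profile, mesh size, and variable names are hypothetical, and the boundary values are again supplied by padding.

```matlab
N = 9;                       % hypothetical mesh size
h = 1/(N+1);
x = h*(1:N)';                % interior mesh points
u = 2*x;                     % hypothetical profile with constant slope 2

% Pad with the boundary values u(0,t) = u(1,t) = 0 at x = 0 and x = 1
up = [0; u; 0];
xp = [0; x; 1];
i = (2:N+1)';

% Centered-difference estimate of du/dx, then the monitor function
dudx = (up(i+1) - up(i-1))./(xp(i+1) - xp(i-1));
B = sqrt(1 + dudx.^2);
```

Where the slope estimate equals 2, the monitor value is sqrt(5). The last point differs because the padded boundary value at x = 1 is zero.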
xNP1 = 1;
% Form final discrete approximation for Eq. 12 in reference paper, the equation governing
% the mesh points.
tau = 1e-3;
g(1+N:end) = - g(1+N:end)/(2*tau);
end
The Jacobian dF/dy for the derivative function is a 2N × 2N matrix containing all of the partial
derivatives of the derivative function, movingMeshODE. ode15s estimates the Jacobian using finite
differences when the matrix is not supplied in the options structure. You can supply the sparsity
pattern of the Jacobian to help ode15s calculate it more quickly.
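To see what such a pattern looks like, the sketch below builds the same banded blocks that the JPat function in the Local Functions section uses, for a small hypothetical N, and passes the result to odeset. The variable names S1, S2, and J are used for illustration only.

```matlab
N = 5;                                  % small hypothetical size for inspection
S1 = spdiags(ones(N,3),-1:1,N,N);       % tridiagonal block
S2 = spdiags(ones(N,9),-4:4,N,N);       % block with nine diagonals
J = [S1 S1
     S2 S2];                            % 2N-by-2N sparsity pattern
opts = odeset('JPattern',J);
```

Calling spy(J) is one way to view the nonzero structure before handing the pattern to the solver.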
Another way to make the calculation more efficient is to provide the sparsity pattern of d(Mv)/dy. You
can find this sparsity pattern by examining which terms of ui and xi are present in the finite
differences calculated in the mass matrix function.
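function S = MvPat(N) % Sparsity pattern for the derivative of the Mass matrix times a vector
S = sparse(2*N,2*N);
S(1,2) = 1;
S(1,2+N) = 1;
for i = 2:N-1
    S(i,i-1) = 1;
    S(i,i+1) = 1;
    S(i,i-1+N) = 1;
    S(i,i+1+N) = 1;
end
S(N,N-1) = 1;
S(N,N-1+N) = 1;
end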
Solve the system with the value N = 80. For the initial conditions, initialize x with a uniform grid and
evaluate $u(x, 0)$ on the grid.
N = 80;
h = 1/(N+1);
xinit = h*(1:N);
uinit = sin(2*pi*xinit) + 0.5*sin(pi*xinit);
y0 = [uinit xinit];
opts = odeset('Mass',@mass,'MStateDependence','strong','JPattern',JPat(N),...
'MvPattern',MvPat(N),'RelTol',1e-5,'AbsTol',1e-4);
Finally, call ode15s to solve the system on the interval [0, 1] using the movingMeshODE derivative
function, the time span, the initial conditions, and the options structure.
tspan = [0 1];
sol = ode15s(@movingMeshODE,tspan,y0,opts);
Plot Results
The result of the integration is a structure sol that contains the time steps t, the mesh points $x(t)$,
and the solution $u(x, t)$. Extract these values from the structure.
t = sol.x;
x = sol.y(N+1:end,:);
u = sol.y(1:N,:);
Plot the movement of the mesh points over time. The plot shows that the mesh points retain a
reasonably even spacing over time (due to the monitor function), but they are able to cluster near the
discontinuity in the solution as it moves.
plot(x,t)
xlabel('x(t)')
ylabel('t')
title('Burgers'' equation: Trajectories of grid points')
Now, sample $u(x, t)$ at a few values of t and plot the evolution of the solution over time. The mesh
points at the ends of the interval are fixed, so x(0) = 0 and x(N+1) = 1. The boundary values are
u(t,0) = 0 and u(t,1) = 0, which you must add to the known values computed for the figure.
tint = 0:0.2:1;
yint = deval(sol,tint);
figure
labels = {};
for j = 1:length(tint)
solution = [0; yint(1:N,j); 0];
location = [0; yint(N+1:end,j); 1];
labels{j} = ['t = ' num2str(tint(j))];
plot(location,solution,'-o')
hold on
end
xlabel('x')
ylabel('solution u(x,t)')
legend(labels{:},'Location','SouthWest')
title('Burgers equation on moving mesh')
hold off
The plot shows that $u(x, 0)$ is a smooth wave that develops a steep gradient over time as it moves
towards x = 1. The mesh points track the movement of the discontinuity so that extra evaluation
points are in the appropriate position in each time step.
References
[1] Huang, Weizhang, et al. “Moving Mesh Methods Based on Moving Mesh Partial Differential
Equations.” Journal of Computational Physics, vol. 113, no. 2, Aug. 1994, pp. 279–90.
https://doi.org/10.1006/jcph.1994.1135.
Local Functions
Listed here are the local helper functions that the solver ode15s calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function g = movingMeshODE(t,y)
% Extract the components of y for the solution u and mesh x
N = length(y)/2;
u = y(1:N);
x = y(N+1:end);
% Form final discrete approximation for Eq. 12 in reference paper, the equation governing
% the mesh points.
tau = 1e-3;
g(1+N:end) = - g(1+N:end)/(2*tau);
end
% -----------------------------------------------------------------------
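function M = mass(t,y)
% Extract the components of y for the solution u and mesh x
N = length(y)/2;
u = y(1:N);
x = y(N+1:end);
% Boundary values of solution u and mesh x
u0 = 0;
x0 = 0;
uNP1 = 0;
xNP1 = 1;
% M1 and M2 are the portions of the mass matrix for Burgers' equation.
% The derivative du/dx is approximated with finite differences, using
% single-sided differences on the edges and centered differences in between.
M1 = speye(N);
M2 = sparse(N,N);
M2(1,1) = - (u(2) - u0)/(x(2) - x0);
for i = 2:N-1
    M2(i,i) = - (u(i+1) - u(i-1))/(x(i+1) - x(i-1));
end
M2(N,N) = - (uNP1 - u(N-1))/(xNP1 - x(N-1));
% M3 is all zeros. M4 approximates the second derivative with a finite-difference Laplacian matrix.
M3 = sparse(N,N);
e = ones(N,1);
M4 = spdiags([e -2*e e],-1:1,N,N);
% Assemble mass matrix
M = [M1 M2
     M3 M4];
end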
% -------------------------------------------------------------------------
function out = JPat(N) % Jacobian sparsity pattern
S1 = spdiags(ones(N,3),-1:1,N,N);
S2 = spdiags(ones(N,9),-4:4,N,N);
out = [S1 S1
S2 S2];
end
% -------------------------------------------------------------------------
function S = MvPat(N) % Sparsity pattern for the derivative of the Mass matrix times a vector
S = sparse(2*N,2*N);
S(1,2) = 1;
S(1,2+N) = 1;
for i = 2:N-1
S(i,i-1) = 1;
S(i,i+1) = 1;
S(i,i-1+N) = 1;
S(i,i+1+N) = 1;
end
S(N,N-1) = 1;
S(N,N-1+N) = 1;
end
% -------------------------------------------------------------------------
See Also
ode15s | odeset
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Solve Equations of Motion for Baton Thrown into Air” on page 11-56
Solve Stiff Transistor Differential Algebraic Equation
This example shows how to use ode23t to solve a stiff differential algebraic equation (DAE) that
describes an electrical circuit [1]. The one-transistor amplifier problem coded in the example file
amp1dae.m can be rewritten in semi-explicit form, but this example solves it in its original form
Mu′ = ϕ(u). The problem includes a constant, singular mass matrix M.
The transistor amplifier circuit contains six resistors, three capacitors, and a transistor.
To solve this equation in MATLAB®, you need to code the equations, code a mass matrix, and set the
initial conditions and interval of integration before calling the solver ode23t. You can either include
the required functions as local functions at the end of a file (as done here), or save them as separate,
named files in a directory on the MATLAB path.
Using Kirchhoff's law to equalize the current through each node (1 through 5), you can obtain a system
of five equations describing the circuit:
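$$
\begin{aligned}
\text{node 1:}\quad & \frac{U_e(t)}{R_0} - \frac{U_1}{R_0} + C_1\left(U_2' - U_1'\right) = 0,\\
\text{node 2:}\quad & \frac{U_b}{R_2} - U_2\left(\frac{1}{R_1} + \frac{1}{R_2}\right) + C_1\left(U_1' - U_2'\right) - 0.01\, f(U_2 - U_3) = 0,\\
\text{node 3:}\quad & f(U_2 - U_3) - \frac{U_3}{R_3} - C_2 U_3' = 0,\\
\text{node 4:}\quad & \frac{U_b}{R_4} - \frac{U_4}{R_4} + C_3\left(U_5' - U_4'\right) - 0.99\, f(U_2 - U_3) = 0,\\
\text{node 5:}\quad & -\frac{U_5}{R_5} + C_3\left(U_4' - U_5'\right) = 0.
\end{aligned}
$$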
The mass matrix of this system, found by collecting the derivative terms on the left side of the
equations, has the form
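$$
M = \begin{bmatrix}
-c_1 & c_1 & 0 & 0 & 0\\
c_1 & -c_1 & 0 & 0 & 0\\
0 & 0 & -c_2 & 0 & 0\\
0 & 0 & 0 & -c_3 & c_3\\
0 & 0 & 0 & c_3 & -c_3
\end{bmatrix},
$$
where $c_k = k \times 10^{-6}$ for $k = 1, 2, 3$.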
Create a mass matrix with the appropriate constants ck, and then use the odeset function to specify
the mass matrix. Even though it is apparent that the mass matrix is singular, leave the
'MassSingular' option at its default value of 'maybe' to test the automatic detection of a DAE
problem by the solver.
c = 1e-6 * (1:3);
M = zeros(5,5);
M(1,1) = -c(1);
M(1,2) = c(1);
M(2,1) = c(1);
M(2,2) = -c(1);
M(3,3) = -c(2);
M(4,4) = -c(3);
M(4,5) = c(3);
M(5,4) = c(3);
M(5,5) = -c(3);
opts = odeset('Mass',M);
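One way to confirm that this mass matrix is singular is to compare its rank with its size. The sketch below rebuilds the same matrix in block form and is an illustration only; the variable r is an assumed name.

```matlab
c = 1e-6 * (1:3);
M = zeros(5,5);
M(1:2,1:2) = c(1)*[-1 1; 1 -1];   % rows 1 and 2 are negatives of each other
M(3,3) = -c(2);
M(4:5,4:5) = c(3)*[-1 1; 1 -1];   % rows 4 and 5 are negatives of each other
r = rank(M);                      % less than 5 for a singular matrix
```

Each 2-by-2 block contributes only one independent row, so the matrix cannot have full rank, which is what makes the problem a DAE rather than an ODE.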
Code Equations
The function transampdae contains the system of equations for this example. The function defines
values for all of the voltages and constant parameters. The derivatives gathered on the left side of the
equations are coded in the mass matrix, and transampdae codes the right side of the equations.
R15 = 9000;
alpha = 0.99;
beta = 1e-6;
Uf = 0.026;
Note: This function is included as a local function at the end of the example.
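Only part of the transampdae listing is visible here. The sketch below shows one way the right side could be coded so that it is consistent with the five node equations and with the constant mass matrix. It is not the original function: the value R0 = 1000 and the exponential form of f are assumptions (taken from the usual statement of this amplifier problem), the name transamp is hypothetical, and the function is written as an anonymous function so that the sketch runs as a single script.

```matlab
% Voltages and parameters. R15, alpha, beta, Uf, Ub, and Ue appear in this
% example; R0 and the form of f23 below are assumptions.
Ue = @(t) 0.4*sin(200*pi*t);
Ub = 6;
R0 = 1000;
R15 = 9000;
alpha = 0.99;
beta = 1e-6;
Uf = 0.026;

% Assumed transistor characteristic f(U2 - U3)
f23 = @(u) beta*(exp((u(2) - u(3))/Uf) - 1);

% Right side of M*u' = phi(u), one row per node equation
transamp = @(t,u) [ -(Ue(t) - u(1))/R0
    -(Ub/R15 - u(2)*2/R15 - (1-alpha)*f23(u))
    -(f23(u) - u(3)/R15)
    -((Ub - u(4))/R15 - alpha*f23(u))
    (u(5)/R15) ];

% Constant, singular mass matrix from the derivative terms
c = 1e-6 * (1:3);
M = zeros(5,5);
M(1:2,1:2) = c(1)*[-1 1; 1 -1];
M(3,3) = -c(2);
M(4:5,4:5) = c(3)*[-1 1; 1 -1];
opts = odeset('Mass',M);

% Initial conditions used in this example, then solve with ode23t
u0 = [0; Ub/2; Ub/2; Ub; 0];
[t,u] = ode23t(transamp,[0 0.05],u0,opts);
```

The signs follow from moving every non-derivative term of each node equation to the right side, so that the left side matches the rows of M.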
Set the initial conditions. This example uses the consistent initial conditions for the current through
each node computed in [1].
Ub = 6;
u0(1) = 0;
u0(2) = Ub/2;
u0(3) = Ub/2;
u0(4) = Ub;
u0(5) = 0;
Solve the DAE system over the time interval [0 0.05] using ode23t.
tspan = [0 0.05];
[t,u] = ode23t(@transampdae,tspan,u0,opts);
Plot Results
Ue = @(t) 0.4*sin(200*pi*t);
plot(t,Ue(t),'o',t,u(:,5),'.')
axis([0 0.05 -3 2]);
legend('Input Voltage U_e(t)','Output Voltage U_5(t)','Location','NorthWest');
title('One Transistor Amplifier DAE Problem Solved by ODE23T');
xlabel('t');
References
[1] Hairer, E., and Gerhard Wanner. Solving Ordinary Differential Equations II: Stiff and Differential-
Algebraic Problems. Springer Berlin Heidelberg, 1991, p. 377.
Local Functions
Listed here is the local helper function that the ODE solver ode23t calls to calculate the solution.
Alternatively, you can save this function as its own file in a directory on the MATLAB path.
(u(5)/R15) ];
end
See Also
ode23t | ode15s
More About
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Solve Differential Algebraic Equations (DAEs)” on page 11-30
• “Solve ODE with Strongly State-Dependent Mass Matrix” on page 11-62
This example compares two techniques to solve a system of ordinary differential equations with
multiple sets of initial conditions. The techniques are:
• Use a for-loop to perform several simulations, one for each set of initial conditions. This
technique is simple to use but does not offer the best performance for large systems.
• Vectorize the ODE function to solve the system of equations for all sets of initial conditions
simultaneously. This technique is the faster method for large systems but requires rewriting the
ODE function so that it reshapes the inputs properly.
The equations used to demonstrate these techniques are the well-known Lotka-Volterra equations,
which are first-order nonlinear differential equations that describe the populations of predators and
prey.
Problem Description
The Lotka-Volterra equations are a system of two first-order, nonlinear ODEs that describe the
populations of predators and prey in a biological system. Over time, the populations of the predators
and prey change according to the equations
$$
\frac{dx}{dt} = \alpha x - \beta x y,
$$
$$
\frac{dy}{dt} = \delta x y - \gamma y.
$$
For this problem, the initial values for x and y are the initial population sizes. Solving the equations
then provides information about how the populations change over time as the species interact.
To solve the Lotka-Volterra equations in MATLAB®, write a function that encodes the equations,
specify a time interval for the integration, and specify the initial conditions. Then you can use one of
the ODE solvers, such as ode45, to simulate the system over time.
Solve System of ODEs with Multiple Initial Conditions
Since there are two equations in the system, dpdt is a vector with one element for each equation.
Also, the solution vector p has one element for each solution component: p(1) represents x in the
original equations, and p(2) represents y in the original equations.
Next, specify the time interval for integration as [0, 15] and set the initial population sizes for x and y
to 50.
t0 = 0;
tfinal = 15;
p0 = [50; 50];
Solve the system with ode45 by specifying the ODE function, the time span, and the initial
conditions. Plot the resulting populations versus time.
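A minimal sketch of this step is shown below. The coefficient values for α, β, δ, and γ are hypothetical, since the text here does not state them, and the ODE function is written as an anonymous function named lotka (an assumed name standing in for lotkaODE) so that the sketch runs as one script.

```matlab
% Hypothetical coefficient values for the Lotka-Volterra equations
alpha = 1;
beta = 0.01;
delta = 0.02;
gamma = 1;

% p(1) is the prey population x, p(2) is the predator population y
lotka = @(t,p) [alpha*p(1) - beta*p(1)*p(2)
                delta*p(1)*p(2) - gamma*p(2)];

t0 = 0;
tfinal = 15;
p0 = [50; 50];
[t,p] = ode45(lotka,[t0 tfinal],p0);

% Plot both populations against time
plot(t,p)
title('Predator/Prey Populations Over Time')
xlabel('t')
ylabel('Population')
legend('Prey','Predators')
```

With other coefficient values the period and amplitude of the oscillations change, but the calling pattern stays the same.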
Since the solutions exhibit periodicity, plot the solutions against each other in a phase plot.
plot(p(:,1),p(:,2))
title('Phase Plot of Predator/Prey Populations')
xlabel('Prey')
ylabel('Predators')
The resulting plots show the solution for the given initial population sizes. To solve the equations for
different initial population sizes, change the values in p0 and rerun the simulation. However, this
method only solves the equations for one initial condition at a time. The next two sections describe
techniques to solve for many different initial conditions.
The simplest way to solve a system of ODEs for multiple initial conditions is with a for-loop. This
technique uses the same ODE function as the single initial condition technique, but the for-loop
automates the solution process.
For example, you can hold the initial population size for x constant at 50, and use the for-loop to vary
the initial population size for y between 10 and 400. Create a vector of population sizes for y0, and
then loop over the values to solve the equations for each set of initial conditions. Plot a phase plot
with the results from all iterations.
y0 = 10:10:400;
for k = 1:length(y0)
[t,p] = ode45(@lotkaODE,[t0 tfinal],[50 y0(k)]);
plot(p(:,1),p(:,2))
hold on
end
title('Phase Plot of Predator/Prey Populations')
xlabel('Prey')
ylabel('Predators')
hold off
The phase plot shows all of the computed solutions for the different sets of initial conditions.
Another method to solve a system of ODEs for multiple initial conditions is to rewrite the ODE
function so that all of the equations are solved simultaneously. The steps to do this are:
• Provide all of the initial conditions to ode45 as a matrix. The size of the matrix is s-by-n, where s
is the number of solution components and n is the number of initial conditions being solved for.
Each column in the matrix then represents one complete set of initial conditions for the system.
• The ODE function must accept an extra input parameter for n, the number of initial conditions.
• Inside the ODE function, the solver passes the solution components p as a column vector. The ODE
function must reshape the vector into a matrix with size s-by-n. Each row of the matrix then
contains all of the initial conditions for each variable.
• The ODE function must solve the equations in a vectorized format, so that the expression accepts
vectors for the solution components. In other words, f(t,[y1 y2 y3 ...]) must return
[f(t,y1) f(t,y2) f(t,y3) ...].
• Finally, the ODE function must reshape its output back into a vector so that the ODE solver
receives a vector back from each function call.
If you follow these steps, then the ODE solver can solve the system of equations using a vector for the
solution components, while the ODE function reshapes the vector into a matrix and solves each
solution component for all of the initial conditions. The result is that you can solve the system for all
of the initial conditions in one simulation.
To implement this method for the Lotka-Volterra system, start by finding the number of initial
conditions n, and then form a matrix of initial conditions.
n = length(y0);
p0_all = [50*ones(n,1) y0(:)]';
Next, rewrite the ODE function so that it accepts n as an input. Use n to reshape the solution vector
into a matrix, then solve the vectorized system and reshape the output back into a vector. A modified
ODE function that performs these tasks is
% Linearize output.
dpdt = dpdt(:);
end
Solve the system of equations for all of the initial conditions using ode45. Since ode45 requires the
ODE function to accept two inputs, use an anonymous function to pass in the value of n from the
workspace to lotkasystem.
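The reshape, evaluate, and flatten steps, together with the anonymous-function call, can be sketched as follows. The coefficient values are hypothetical, the helper rhs is an assumed name, and lotkasystem is written here as an anonymous function with the same (t,p,n) signature so that the sketch runs as a single script.

```matlab
% Hypothetical coefficient values for the Lotka-Volterra equations
alpha = 1;
beta = 0.01;
delta = 0.02;
gamma = 1;

% Matrix of initial conditions: one column per set of initial conditions
y0 = 10:10:400;
n = length(y0);
p0_all = [50*ones(n,1) y0(:)]';

% Vectorized right side acting on a 2-by-n matrix of states
rhs = @(P) [alpha*P(1,:) - beta*P(1,:).*P(2,:)
            delta*P(1,:).*P(2,:) - gamma*P(2,:)];

% Reshape the incoming vector to 2-by-n, evaluate, and flatten the result
lotkasystem = @(t,p,n) reshape(rhs(reshape(p,[],n)),[],1);

% Pass n from the workspace through an anonymous function
[t,p] = ode45(@(t,p) lotkasystem(t,p,n),[0 15],p0_all);

% Spot check: two stacked states give the two stacked derivatives
check = lotkasystem(0,[50; 50; 50; 10],2);
```

The spot check mirrors the requirement that f(t,[y1 y2]) return [f(t,y1) f(t,y2)]: each pair of entries in check is the derivative for the corresponding pair of inputs.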
Reshape the output vector into a matrix with size (numTimeSteps*s)-by-n. Each column of the
output p(:,k) contains the solutions for one set of initial conditions. Plot a phase plot of the solution
components.
p = reshape(p,[],n);
nt = length(t);
for k = 1:n
plot(p(1:nt,k),p(nt+1:end,k))
hold on
end
title('Phase Plot of Predator/Prey Populations')
xlabel('Prey')
ylabel('Predators')
hold off
The results are comparable to those obtained by the for-loop technique. However, there are some
properties of the vectorized solution technique that you should keep in mind:
• The calculated solutions can be slightly different than those computed from a single initial input.
The difference arises because the ODE solver applies norm checks to the entire system to
calculate the size of the time steps, so the time-stepping behavior of the solution is slightly
different. The change in time steps generally does not affect the accuracy of the solution, but
rather which times the solution is evaluated at.
• For stiff ODE solvers (ode15s, ode23s, ode23t, ode23tb) that automatically evaluate the
numerical Jacobian of the system, specifying the block diagonal sparsity pattern of the Jacobian
using the JPattern option of odeset can improve the efficiency of the calculation. The block
diagonal form of the Jacobian arises from the input reshaping performed in the rewritten ODE
function.
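For the stiff-solver case mentioned in the last bullet, a block diagonal sparsity pattern could be built as in the sketch below. It assumes the solution vector is ordered so that the s components of each set of initial conditions are adjacent, as they are when a matrix of initial conditions is flattened column by column. The variable names are illustrative.

```matlab
s = 2;                          % number of solution components
n = 40;                         % number of sets of initial conditions
S = kron(speye(n),ones(s));     % n diagonal blocks, each s-by-s and full
opts = odeset('JPattern',S);    % pass opts to a stiff solver such as ode15s
```

Each block couples only the components that belong to the same set of initial conditions, which is why the pattern has no entries outside the diagonal blocks.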
Time each of the previous methods using timeit. The timing for solving the equations with one set
of initial conditions is included as a baseline number to see how the methods scale.
% Time one IC
baseline = timeit(@() ode45(@lotkaODE,[t0 tfinal],p0),2);
% Time for-loop
for k = 1:length(y0)
loop_timing(k) = timeit(@() ode45(@lotkaODE,[t0 tfinal],[50 y0(k)]),2);
end
loop_timing = sum(loop_timing);
Create a table with the timing results. Multiply all of the results by 1e3 to express the times in
milliseconds. Include a column with the time per solution, which divides each time by the number of
initial conditions being solved for.
TimingTable=3×2 table
TotalTime (ms) TimePerSolution (ms)
______________ ____________________
The TimePerSolution column shows that the vectorized technique is the fastest of the three
methods.
Local Functions
Listed here are the local functions that ode45 calls to calculate the solutions.
% Linearize output.
dpdt = dpdt(:);
end
See Also
ode45 | odeset
More About
• “Choose an ODE Solver” on page 11-2
• “Solve Predator-Prey Equations” on page 11-52
This example shows how to use ode78 and ode89 to solve a celestial mechanics problem that
requires high accuracy in each step from the ODE solver for a successful integration. Both ode45 and
ode113 are unable to solve the problem using the default error tolerances. Even with more stringent
error thresholds, ode89 performs best on the problem due to the high accuracy of the Runge-Kutta
formulas it uses in each step.
Problem Description
The Pleiades problem is a celestial mechanics problem describing the gravitational interactions of
seven stars [1]. This cluster of stars is also referred to as The Seven Sisters, and it is visible to the
human eye in the night sky due to its proximity to Earth [2]. The system of equations describing the
motion of the stars in the cluster consists of 14 nonstiff second-order differential equations, which
produce a system of 28 equations when rewritten in first-order form.
Newton's second law of motion relates the force applied to a body to its rate of change in momentum
over time,
$$
F_i = \frac{d}{dt}\, p_i.
$$
The momentum ($p_i = m_i v_i$) has separate x- and y-components. At the same time, the universal law of
gravitation describes the force working on body i from body j as
$$
F_{ij} = g\,\frac{m_i m_j}{\lVert p_i - p_j \rVert^2}\, d_{ij}.
$$
The term $d_{ij} = \dfrac{p_j - p_i}{\lVert p_j - p_i \rVert}$ gives the direction of the distance between the bodies, and the masses of the
bodies are $m_i = i$ for $i = 1, 2, \ldots, 7$. For a system with many bodies, the force applied to any
individual body is the sum of its interactions with all others, so
$$
F_i = \sum_{i \neq j} F_{ij}.
$$
Setting the gravitational constant g equal to 1 and solving yields a system of second-order equations
describing the evolution of the x- and y-components over time.
$$
x_i'' = f_i(x, y) = \sum_{j \neq i} \frac{m_j \left(x_j - x_i\right)}{r_{ij}^{3/2}},
$$
$$
y_i'' = h_i(x, y) = \sum_{j \neq i} \frac{m_j \left(y_j - y_i\right)}{r_{ij}^{3/2}},
$$
where $r_{ij} = (x_i - x_j)^2 + (y_i - y_j)^2$. Because these two equations apply for each of the seven stars in
the system, 14 second-order differential equations ($i = 1, 2, \ldots, 7$) describe the entire system.
Solve Celestial Mechanics Problem with High-Order Solvers
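The initial conditions for the system are
$$
\begin{aligned}
x_0 &= (3,\ 3,\ -1,\ -3,\ 2,\ -2,\ 2),\\
y_0 &= (3,\ -3,\ 2,\ 0,\ 0,\ -4,\ 4),\\
x_0' &= (0,\ 0,\ 0,\ 0,\ 0,\ 1.75,\ -1.5),\\
y_0' &= (0,\ 0,\ 0,\ -1.25,\ 1,\ 0,\ 0).
\end{aligned}
$$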
To solve this system of ODEs in MATLAB®, you must code the equations into a function before calling
the solvers ode78 and ode89. You can either include the required functions as local functions at the
end of a file (as done here), or save them as separate, named files in a directory on the MATLAB path.
Code Equations
The ODE solvers in MATLAB require equations to be written in first-order form, $q' = u(t, q)$. For this
problem, the system of equations can be rewritten in first-order form using the substitutions $w = x'$
and $z = y'$. With these substitutions, the system contains 28 first-order equations organized into four
groups of seven equations:
$$
\begin{bmatrix} x_i'\\ y_i'\\ w_i'\\ z_i' \end{bmatrix}
=
\begin{bmatrix} w_i\\ z_i\\ f_i(x, y)\\ h_i(x, y) \end{bmatrix}.
$$
The solution vector produced by solving the system has the form
$$
q = \begin{bmatrix} x_i\\ y_i\\ w_i\\ z_i \end{bmatrix}.
$$
Therefore, writing the original equations in terms of the solution vector q yields
$$
q' = \begin{bmatrix} q_{15}, \ldots, q_{21}\\ q_{22}, \ldots, q_{28}\\ f_i(x, y)\\ h_i(x, y) \end{bmatrix},
$$
where $x = (q_1, q_2, \ldots, q_7)$ and $y = (q_8, q_9, \ldots, q_{14})$. With the equations written in the first-order form
$q' = u(t, q)$, you can now write a function that calculates the components during each time step of the
solution process. A function that codes this system of equations is:
function dqdt = pleiades(t,q)
x = q(1:7);
y = q(8:14);
xDist = (x - x.');
yDist = (y - y.');
r = (xDist.^2+yDist.^2).^(3/2);
m = (1:7)';
dqdt = [q(15:28);
sum(xDist.*m./r,1,'omitnan').';
sum(yDist.*m./r,1,'omitnan').'];
end
In this function, the x and y values are extracted directly from the solution vector q, as are the first
14 components of the output. Then, the function uses the position values to calculate the distances
between all seven stars, and these distances are used in the code for $f_i(x,y)$ and $h_i(x,y)$.
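As a sanity check on the vectorized sums, the following sketch compares the implicit-expansion expression used in pleiades against explicit double loops, evaluated at the initial positions. The names axVec, axLoop, rij, and maxDiff are illustrative and are not part of the original example.

```matlab
% Initial positions and masses (m_i = i), as used in pleiades.
x = [3; 3; -1; -3; 2; -2; 2];
y = [3; -3; 2; 0; 0; -4; 4];
m = (1:7)';

% Vectorized form: xDist(i,j) = x(i) - x(j). The diagonal gives 0/0 = NaN,
% which 'omitnan' drops from each column sum.
xDist = x - x.';
yDist = y - y.';
r = (xDist.^2 + yDist.^2).^(3/2);
axVec = sum(xDist.*m./r, 1, 'omitnan').';

% Reference form: explicit sum over all j different from i.
axLoop = zeros(7,1);
for i = 1:7
    for j = [1:i-1, i+1:7]
        rij = (x(i) - x(j))^2 + (y(i) - y(j))^2;
        axLoop(i) = axLoop(i) + m(j)*(x(j) - x(i))/rij^(3/2);
    end
end
maxDiff = max(abs(axVec - axLoop));
```

If the two forms agree, maxDiff is at the level of floating-point round-off.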
Use the odeset function to set the value of several optional parameters:
• Specify stringent error tolerances of 1e-13 and 1e-15 for the relative and absolute error
tolerances, respectively.
• Turn on the display of solver statistics.
opts = odeset("RelTol",1e-13,"AbsTol",1e-15,"Stats","on");
Create a column vector with the initial conditions and a time vector with regularly spaced points in
the range $[0, 15]$. When you specify a time vector with more than two elements, the solver returns
solutions at the time points you specify.
init = [3 3 -1 -3 2 -2 2 ...
3 -3 2 0 0 -4 4 ...
0 0 0 0 0 1.75 -1.5 ...
0 0 0 -1.25 1 0 0]';
tspan = linspace(0,15,200);
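The effect of a time vector with more than two elements can be seen on a much smaller problem. This sketch uses ode45 and the scalar test equation y' = -y, both chosen only for illustration; they are not part of the Pleiades example.

```matlab
% Request the solution at five specific times on [0, 1].
tspan = linspace(0,1,5);
[t,y] = ode45(@(t,y) -y, tspan, 1);        % y' = -y, y(0) = 1

numReturned = numel(t);                    % one row per requested time
maxTimeDiff = max(abs(t(:) - tspan(:)));   % returned times match tspan
endError = abs(y(end) - exp(-1));          % compare with the exact exp(-t)
```

The solver still chooses its own internal steps; the time vector only controls where the solution is reported.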
Now, solve the equations using both ode78 and ode89 by specifying the equations, time span, initial
values, and options. Use tic and toc to time each solver for comparison (note that timings can differ
depending on your computer).
tic
[t78,q78] = ode78(@pleiades,tspan,init,opts);
toc
tic
[t89,q89] = ode89(@pleiades,tspan,init,opts);
toc
The output indicates that ode89 is best suited for solving the problem, because it is faster and has
fewer failed steps.
The first 14 components of q89 contain the x and y positions for each of the seven stars, as obtained
by ode89. Plot these solution components to see the trajectories of all the stars over time.
plot(q89(:,1),q89(:,8),'--',...
q89(:,2),q89(:,9),'--',...
q89(:,3),q89(:,10),'--',...
q89(:,4),q89(:,11),'--',...
q89(:,5),q89(:,12),'--',...
q89(:,6),q89(:,13),'--',...
q89(:,7),q89(:,14),'--')
title('Position of Pleiades Stars, Solved by ODE89')
xlabel('X Position')
ylabel('Y Position')
Because the trajectories of the stars overlap considerably, a better way to visualize the results is to
create an animation showing the stars move over time. The function AnimateOrbits, included as a
local function at the end of this example, accepts the output from a solver for this problem and
creates an animated GIF file in the current folder that shows the stars move along their trajectories.
For example, you can generate an animation with the output from the ode89 solver using the
command
AnimateOrbits(t89,q89);
References
[1] Hairer, E., et al. Solving Ordinary Differential Equations I: Nonstiff Problems. 2nd rev. ed,
Springer, 2009.
Local Functions
This section includes the local functions called by the ODE solver to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function dqdt = pleiades(t,q)
x = q(1:7);
y = q(8:14);
xDist = (x - x.');
yDist = (y - y.');
r = (xDist.^2+yDist.^2).^(3/2);
m = (1:7)';
dqdt = [q(15:28);
sum(xDist.*m./r,1,'omitnan').';
sum(yDist.*m./r,1,'omitnan').'];
end
%-----------------------------------------------------------------
function AnimateOrbits(t,q)
for k = 1:length(t)
plot(q(:,1),q(:,8),'--',q(:,2),q(:,9),'--',...
q(:,3),q(:,10),'--',q(:,4),q(:,11),'--',...
q(:,5),q(:,12),'--',q(:,6),q(:,13),'--',...
q(:,7),q(:,14),'--')
hold on
xlim([-20 20])
ylim([-10 10])
sz = 15;
plot(q(k,1),q(k,8),'o','MarkerSize',sz,'MarkerFaceColor','r')
plot(q(k,2),q(k,9),'o','MarkerSize',sz,'MarkerFaceColor','k')
plot(q(k,3),q(k,10),'o','MarkerSize',sz,'MarkerFaceColor','b')
plot(q(k,4),q(k,11),'o','MarkerSize',sz,'MarkerFaceColor','m')
plot(q(k,5),q(k,12),'o','MarkerSize',sz,'MarkerFaceColor','c')
plot(q(k,6),q(k,13),'o','MarkerSize',sz,'MarkerFaceColor','y')
plot(q(k,7),q(k,14),'o','MarkerSize',sz,'MarkerFaceColor',[210 120 0]./255)
hold off
drawnow
M(k) = getframe(gca);
im{k} = frame2im(M(k));
end
filename = "orbits.gif";
for idx = 1:length(im)
[A,map] = rgb2ind(im{idx},256);
if idx == 1
imwrite(A,map,filename,'gif','LoopCount',Inf,'DelayTime',0);
else
imwrite(A,map,filename,'gif','WriteMode','append','DelayTime',0);
end
end
close all
end
See Also
ode78 | ode89 | odeset
Related Examples
• “Choose an ODE Solver” on page 11-2
• “Summary of ODE Options” on page 11-10
• “Solve Equations of Motion for Baton Thrown into Air” on page 11-56
12
In a boundary value problem (BVP), the goal is to find a solution to an ordinary differential equation
(ODE) that also satisfies certain specified boundary conditions. The boundary conditions specify a
relationship between the values of the solution at two or more locations in the interval of integration.
In the simplest case, the boundary conditions apply at the beginning and end (or boundaries) of the
interval.
The MATLAB BVP solvers bvp4c and bvp5c are designed to handle systems of ODEs of the form
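$$y' = f(x,y),$$
where $x$ is the independent variable, $y$ is the dependent variable, and $y'$ is the derivative of $y$ with respect to $x$.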
Boundary Conditions
In the simplest case of a two-point BVP, the solution to the ODE is sought on an interval [a, b], and
must satisfy the boundary conditions
$$g(y(a), y(b)) = 0.$$
• Write a function of the form res = bcfun(ya,yb), or use the form res = bcfun(ya,yb,p) if
there are unknown parameters involved. You supply this function to the solver as the second input
argument. The function returns res, which is the residual value of the solution at the boundary
point. For example, if y(a) = 1 and y(b) = 0, then the boundary condition function returns the residual vector [ya(1)-1; yb(1)].
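A minimal sketch of such a boundary condition function, written as an anonymous function and used on a deliberately simple problem, is shown here. The equation y'' = 0 on [0, 1] is an assumption chosen so that the exact solution, y = 1 - x, is known; the variable names are illustrative.

```matlab
% y'' = 0 written as the first-order system y1' = y2, y2' = 0.
odefun = @(x,y) [y(2); 0];

% Residuals for y(a) = 1 and y(b) = 0; both vanish at the solution.
bcfun = @(ya,yb) [ya(1) - 1; yb(1)];

solinit = bvpinit(linspace(0,1,5), [0 0]);
sol = bvp4c(odefun, bcfun, solinit);

yMid = deval(sol, 0.5, 1);   % first component at x = 0.5; exact value is 0.5
```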
The BVP solvers in MATLAB also can accommodate other types of problems that have:
• Unknown parameters p
• Singularities in the solutions
• Multipoint conditions (internal boundaries that separate the integration interval into several
regions)
In the case of multipoint boundary conditions, the boundary conditions apply at more than two points
in the interval of integration. For example, the solution might be required to be zero at the beginning,
middle, and end of the interval. See bvpinit for details on how to specify multiple boundary
conditions.
Solving Boundary Value Problems
Unlike initial value problems, a boundary value problem can have:
• No solution
• A finite number of solutions
• Infinitely many solutions
An important part of the process of solving a BVP is providing a guess for the required solution. The
quality of this guess can be critical for the solver performance and even for a successful computation.
Use the bvpinit function to create a structure for the initial guess of the solution. The solvers
bvp4c and bvp5c accept this structure as the third input argument.
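The following sketch shows the two common ways to call bvpinit and what the resulting structure contains. The mesh and the guesses are arbitrary illustrative choices.

```matlab
xmesh = linspace(0,1,5);

% Constant guess: the same vector is used at every mesh point.
solConst = bvpinit(xmesh, [1 0]);

% Function guess: the handle is evaluated at every mesh point.
solFun = bvpinit(xmesh, @(x) [cos(x); -sin(x)]);

sizeConst = size(solConst.y);   % components by mesh points
firstCol = solFun.y(:,1);       % guess at x = 0
```

In both cases the x field holds the mesh and the y field holds one column of guessed solution values per mesh point.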
Creating a good initial guess for the solution is more an art than a science. However, some general
guidelines include:
• Have the initial guess satisfy the boundary conditions, since the solution is required to satisfy
them as well. If the problem contains unknown parameters, then the initial guess for the
parameters also should satisfy the boundary conditions.
• Try to incorporate as much information about the physical problem or expected solution into the
initial guess as possible. For example, if the solution is supposed to oscillate or have a certain
number of sign changes, then the initial guess should as well.
• Consider the placement of the mesh points (the x-coordinates of the initial guess of the solution).
The BVP solvers adapt these points during the solution process, so you do not need to specify too
many mesh points. Best practice is to specify a few mesh points placed near where the solution
changes rapidly.
• If there is a known, simpler solution on a smaller interval, then use it as an initial guess on a
larger interval. Often you can solve a problem as a series of relatively simpler problems, a practice
called continuation. With continuation, a series of simple problems are connected by using the
solution of one problem as the initial guess to solve the next problem.
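When a BVP involves unknown parameters $p$, the ODE and boundary conditions take the form
$$y' = f(x,y,p)$$
$$g(y(a), y(b), p) = 0$$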
In this case, the boundary conditions must suffice to determine the values of the parameters p.
You must provide the solver with an initial guess for any unknown parameters. When you call
bvpinit to create the structure solinit, specify the initial guess as a vector in the third input
argument parameters.
solinit = bvpinit(x,v,parameters)
Additionally, the functions odefun and bcfun that encode the ODE equations and boundary
conditions must each have a third argument.
dydx = odefun(x,y,parameters)
res = bcfun(ya,yb,parameters)
12 Boundary Value Problems (BVPs)
While solving the differential equations, the solver adjusts the value of the unknown parameters to
satisfy the boundary conditions. The solver returns the final values of these unknown parameters in
sol.parameters.
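As a small illustration of an unknown parameter, the sketch below solves the eigenvalue problem y'' + λy = 0 on [0, π] with y(0) = 0, y(π) = 0, and the normalization y'(0) = 1. This problem is an assumption chosen because the answer is known (λ = 1 with y = sin x); it is not one of the examples in this chapter.

```matlab
% The third boundary condition fixes the scale of y so that lambda is determined.
odefun = @(x,y,lambda) [y(2); -lambda*y(1)];
bcfun  = @(ya,yb,lambda) [ya(1); yb(1); ya(2) - 1];

lambdaGuess = 0.9;   % deliberately offset from the exact eigenvalue 1
solinit = bvpinit(linspace(0,pi,10), @(x) [sin(x); cos(x)], lambdaGuess);
sol = bvp4c(odefun, bcfun, solinit);

lambdaFound = sol.parameters;   % value adjusted by the solver
```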
Singular BVPs
bvp4c and bvp5c can solve a class of singular BVPs of the form
$$y' = \frac{1}{x} S y + f(x,y),$$
$$0 = g(y(0), y(b)).$$
The solvers can also accommodate unknown parameters for problems of the form
$$y' = \frac{1}{x} S y + f(x,y,p),$$
$$0 = g(y(0), y(b), p).$$
Singular problems must be posed on an interval [0,b] with b > 0. Use bvpset to pass the constant
matrix S to the solver as the value of the 'SingularTerm' option. Boundary conditions at x = 0
must be consistent with the necessary condition for a smooth solution, Sy(0) = 0. The initial guess of
the solution also should satisfy this condition.
When you solve a singular BVP, the solver requires that your function odefun(x,y) return only the
value of the f(x, y) term in the equation. The term involving S is handled by the solver separately
using the 'SingularTerm' option.
The bvp5c function is used exactly like bvp4c, with the exception of the meaning of error tolerances
between the two solvers. If S(x) approximates the solution y(x), bvp4c controls the residual |S′(x) –
f(x,S(x))|. This approach indirectly controls the true error |y(x) – S(x)|. Use bvp5c to control the true
error directly.
Solver | Description
bvp4c | bvp4c is a finite difference code that implements the 3-stage Lobatto IIIa formula. This is a collocation formula, and the collocation polynomial provides a C1-continuous solution that is fourth-order accurate uniformly in the interval of integration. Mesh selection and error control are based on the residual of the continuous solution.
Sxint = deval(sol,xint)
The deval function is vectorized. For a vector xint, the ith column of Sxint approximates the
solution y(xint(i)).
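A short sketch of the vectorized behavior of deval follows. The test problem y'' = -y with y(0) = 0 and y(π/2) = 1 is an assumption chosen because its exact solution is sin(x).

```matlab
% Solve y'' = -y with y(0) = 0 and y(pi/2) = 1.
odefun = @(x,y) [y(2); -y(1)];
bcfun  = @(ya,yb) [ya(1); yb(1) - 1];
solinit = bvpinit(linspace(0,pi/2,5), [0 1]);
sol = bvp4c(odefun, bcfun, solinit);

xint = linspace(0,pi/2,7);
Sxint = deval(sol, xint);                    % one column per entry of xint
sizeS = size(Sxint);
maxErr = max(abs(Sxint(1,:) - sin(xint)));   % first row approximates y
```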
odeexamples
edit exampleFileName.m
exampleFileName
This table contains a list of the available BVP example files, as well as the solvers and the options
they use.
References
[1] Ascher, U., R. Mattheij, and R. Russell. “Numerical Solution of Boundary Value Problems for
Ordinary Differential Equations.” Philadelphia, PA: SIAM, 1995, p. 372.
[2] Shampine, L.F., and J. Kierzenka. "A BVP Solver based on residual control and the MATLAB PSE."
ACM Trans. Math. Softw. Vol. 27, Number 3, 2001, pp. 299–316.
[3] Shampine, L.F., M.W. Reichelt, and J. Kierzenka. "Solving Boundary Value Problems for Ordinary
Differential Equations in MATLAB with bvp4c." MATLAB File Exchange, 2004.
[4] Shampine, L.F., and J. Kierzenka. "A BVP Solver that Controls Residual and Error." J. Numer. Anal.
Ind. Appl. Math. Vol. 3(1-2), 2008, pp. 27–41.
See Also
bvp4c | bvp5c | bvpinit | bvpset | pdepe | ode45
This example uses bvp4c with two different initial guesses to find both solutions to a BVP.
$$y'' + e^{y} = 0,$$
$$y(0) = y(1) = 0.$$
To solve this equation in MATLAB®, you need to code the equation and boundary conditions, then
generate a suitable initial guess for the solution before calling the boundary value problem solver
bvp4c. You either can include the required functions as local functions at the end of a file (as done
here), or save them as separate, named files in a directory on the MATLAB path.
Code Equation
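Create a function to code the equation. This function should have the signature dydx = bvpfun(x,y) or
dydx = bvpfun(x,y,parameters), where x is the independent variable, y is the dependent variable, dydx
represents the derivative y′, and parameters (when present) is a vector of unknown parameter values.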
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equations. In this case, you can rewrite the second-order equation as a system of
first-order equations
$$y_1' = y_2,$$
$$y_2' = -e^{y_1}.$$
For two-point boundary value conditions like the ones in this problem, the boundary conditions
function should have the signature res = bcfun(ya,yb) or res = bcfun(ya,yb,parameters),
depending on whether unknown parameters are involved. ya and yb are column vectors that the
solver automatically passes to the function, and bcfun returns the residual in the boundary
conditions.
For the boundary conditions $y(0) = y(1) = 0$, the bcfun function specifies that the residual value is
zero at both boundaries. These residual values are enforced at the first and last points of the mesh
that you specify to bvpinit in your initial guess. The initial mesh in this problem should have x(1)
= 0 and x(end) = 1.
Solve BVP with Two Solutions
Call bvpinit to generate an initial guess of the solution. The mesh for x does not need to have a lot
of points, but the first point must be 0 and the last point must be 1 so that the boundary conditions
are properly specified. Use an initial guess for y where the first component is slightly positive and the
second component is zero.
xmesh = linspace(0,1,5);
solinit = bvpinit(xmesh, [0.1 0]);
Solve Equation
Solve the BVP a second time using a different initial guess for the solution.
Compare Solutions
Plot the solutions that bvp4c calculates for the different initial guesses. Both solutions satisfy the
stated boundary conditions, but have different behaviors in between. Since the solution is not always
unique, the different behaviors show the importance of giving a good initial guess for the solution.
plot(sol1.x,sol1.y(1,:),'-o',sol2.x,sol2.y(1,:),'-o')
title('BVP with Different Solutions That Depend on the Initial Guess')
xlabel('x')
ylabel('y')
legend('Solution 1','Solution 2')
Local Functions
Listed here are the local helper functions that the BVP solver bvp4c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
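A minimal sketch of the helper functions and both solver calls, written with anonymous functions so that it runs as a single script, is given here. The second initial guess, [3 0], is an assumption: this section states only that a different guess is used for the second solve.

```matlab
% First-order system y1' = y2, y2' = -exp(y1), and residuals for y(0) = y(1) = 0.
bvpfun = @(x,y) [y(2); -exp(y(1))];
bcfun  = @(ya,yb) [ya(1); yb(1)];

xmesh = linspace(0,1,5);
sol1 = bvp4c(bvpfun, bcfun, bvpinit(xmesh, [0.1 0]));   % guess from the text
sol2 = bvp4c(bvpfun, bcfun, bvpinit(xmesh, [3 0]));     % assumed second guess

peak1 = max(sol1.y(1,:));   % largest value of the first solution
peak2 = max(sol2.y(1,:));   % largest value of the second solution
```

Comparing peak1 and peak2 is a quick way to confirm that the two guesses led to different solutions before plotting them.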
See Also
bvp4c | bvp5c | bvpinit
More About
• “Solving Boundary Value Problems” on page 12-2
Solve BVP with Unknown Parameter
This example shows how to use bvp4c to solve a boundary value problem with an unknown
parameter.
$$y'' + (\lambda - 2q\cos(2x))\,y = 0.$$
$$y'(0) = 0,$$
$$y'(\pi) = 0.$$
$$y(0) = 1.$$
To solve this system of equations in MATLAB®, you need to code the equations, boundary conditions,
and initial guess before calling the boundary value problem solver bvp4c. You can either include the
required functions as local functions at the end of a file (as done here), or save them as separate,
named files in a directory on the MATLAB path.
Code Equation
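Create a function to code the equations. This function should have the signature dydx =
mat4ode(x,y,lambda), where x is the independent variable, y is the dependent variable, dydx represents
the derivative y′, and lambda is the unknown parameter.
You can write Mathieu's equation as a first-order system using the substitutions $y_1 = y$ and $y_2 = y'$,
$$y_1' = y_2,$$
$$y_2' = -(\lambda - 2q\cos(2x))\,y_1.$$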
Note: All functions are included as local functions at the end of the example.
Now, write a function that returns the residual value of the boundary conditions at the boundary
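points. This function should have the signature res = mat4bc(ya,yb,lambda), where ya and yb are the
values of the solution at the beginning and end of the interval, and lambda is the unknown parameter.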
This problem has three boundary conditions in the interval $[0, \pi]$. To calculate the residual values, you
need to put the boundary conditions into the form $g(x,y) = 0$. In this form the boundary conditions
are
$$y'(0) = 0,$$
$$y'(\pi) = 0,$$
$$y(0) - 1 = 0.$$
Lastly, create an initial guess of the solution. You must provide an initial guess for both solution
components $y_1 = y(x)$ and $y_2 = y'(x)$, as well as the unknown parameter $\lambda$. Only eigenvalues and
eigenfunctions that are close to the initial guesses can be computed. To increase the likelihood that
the computed eigenfunction corresponds to the fourth eigenvalue, you should choose an initial guess
that has the correct qualitative behavior.
For this problem, a cosine function makes for a good initial guess since it satisfies the three boundary
conditions. Code the initial guess for y using a function that returns the guess for y1 and y2.
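A quick numerical check that the cosine guess satisfies the three boundary conditions is sketched here. The anonymous function mirrors the mat4init local function of this example; the names g0, gpi, and bcResidual are illustrative.

```matlab
mat4init = @(x) [cos(4*x); -4*sin(4*x)];   % guess for [y; y']

g0  = mat4init(0);     % [y(0); y'(0)]
gpi = mat4init(pi);    % [y(pi); y'(pi)]

% Residuals of y'(0) = 0, y'(pi) = 0, and y(0) - 1 = 0.
bcResidual = [g0(2); gpi(2); g0(1) - 1];
```

All three residuals are zero up to round-off, so the solver starts from a guess that is already consistent with the boundary conditions.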
Call bvpinit using a mesh of 10 points in the interval $[0, \pi]$, the initial guess function, and a guess of
15 for the value of λ.
lambda = 15;
solinit = bvpinit(linspace(0,pi,10),@mat4init,lambda);
Solve Equation
Call bvp4c with the ODE function, boundary condition function, and initial guess.
sol = bvp4c(@mat4ode, @mat4bc, solinit);
Value of Parameter
Print the value of the unknown parameter λ found by bvp4c. This value is the fourth eigenvalue
(q = 5) of Mathieu's equation.
fprintf('Fourth eigenvalue is approximately %7.3f.\n',...
sol.parameters)
Plot Solution
Use deval to evaluate the solution computed by bvp4c at 100 points in the interval $[0, \pi]$.
xint = linspace(0,pi);
Sxint = deval(sol,xint);
Plot both solution components. The plot shows the eigenfunction (and its derivative) associated with
the fourth eigenvalue $\lambda_4 = 17.097$.
plot(xint,Sxint)
axis([0 pi -4 4])
title('Eigenfunction of Mathieu''s Equation.')
xlabel('x')
ylabel('y')
legend('y','y''')
Local Functions
Listed here are the local helper functions that the BVP solver bvp4c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function dydx = mat4ode(x,y,lambda) % equation being solved
q = 5;
dydx = [y(2)
-(lambda - 2*q*cos(2*x))*y(1)];
end
%-------------------------------------------
function res = mat4bc(ya,yb,lambda) % boundary conditions
res = [ya(2)
yb(2)
ya(1)-1];
end
%-------------------------------------------
function yinit = mat4init(x) % initial guess function
yinit = [cos(4*x)
-4*sin(4*x)];
end
%-------------------------------------------
See Also
bvp4c | bvp5c | bvpinit
More About
• “Solving Boundary Value Problems” on page 12-2
Solve BVP Using Continuation
This example shows how to solve a numerically difficult boundary value problem using continuation,
which effectively breaks the problem up into a sequence of simpler problems.
The differential equation, $e y'' + x y' = -e\pi^2\cos(\pi x) - \pi x\sin(\pi x)$, is posed on the interval
$[-1, 1]$ and is subject to the boundary conditions
$$y(-1) = -2,$$
$$y(1) = 0.$$
When $e = 10^{-4}$, the solution to the equation undergoes a rapid transition near $x = 0$, so it is difficult
to solve numerically. Instead, this example uses continuation to iterate through several values of $e$
until $e = 10^{-4}$. The intermediate solutions are each used as the initial guess for the next problem.
To solve this system of equations in MATLAB®, you need to code the equations, boundary conditions,
and initial guess before calling the boundary value problem solver bvp4c. You either can include the
required functions as local functions at the end of a file (as done here), or you can save them as
separate, named files in a directory on the MATLAB path.
Code Equation
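Using the substitutions $y_1 = y$ and $y_2 = y'$, you can rewrite the equation as the system of first-order
equations
$$y_1' = y_2,$$
$$y_2' = -\frac{x}{e}\,y_2 - \pi^2\cos(\pi x) - \frac{\pi x}{e}\sin(\pi x).$$
Write a function to code the equations with the signature dydx = shockode(x,y), where x is the
independent variable, y is the dependent variable, and dydx represents the derivative y′.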
Note: All functions are included as local functions at the end of the example.
The BVP solver requires the boundary conditions to be in the form $g(y(a), y(b)) = 0$. In this form the
boundary conditions are:
$$y(-1) + 2 = 0,$$
$$y(1) = 0.$$
Write a function to code the boundary conditions with the signature res = shockbc(ya,yb),
where ya and yb are the values of the solution at the beginning and end of the interval.
Code Jacobians
The analytical Jacobians for the ODE function and boundary conditions can be calculated easily in
this problem. Providing the Jacobians makes the solver more efficient, since the solver no longer
needs to approximate them with finite differences.
$$J_{y(a)} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad J_{y(b)} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
Use a constant guess for the solution on a mesh of five points in $[-1, 1]$.
Solve Equation
If you try to solve the equation directly using $e = 10^{-4}$, then the solver is not able to overcome the
poor conditioning of the problem near the $x = 0$ transition point. Instead, to obtain the solution for
$e = 10^{-4}$, this example uses continuation by solving a sequence of problems for $10^{-2}$, $10^{-3}$, and
$10^{-4}$. The output from the solver in each iteration acts as the guess for the solution in the next
iteration (this is why the variable for the initial guess from bvpinit is sol, and the output from the
solver is also named sol).
Since the value of the Jacobian depends on the value of e, set the options in the loop specifying the
shockjac and shockbcjac functions for the Jacobians. Also, turn vectorization on since shockode
is coded to handle vectors of values.
e = 0.1;
for i = 2:4
e = e/10;
options = bvpset('FJacobian',@(x,y) shockjac(x,y,e), ...
'BCJacobian',@shockbcjac,'Vectorized','on');
sol = bvp4c(@(x,y) shockode(x,y,e),@shockbc, sol, options);
end
Plot Solution
Plot the results from bvp4c for the mesh $x$ and solution $y(x)$. With continuation, the solver is able to
handle the discontinuity at x = 0.
plot(sol.x,sol.y(1,:),'-o');
axis([-1 1 -2.2 2.2]);
title(['There Is a Shock at x = 0 When e =' sprintf('%.e',e) '.']);
xlabel('x');
ylabel('solution y');
Local Functions
Listed here are the local functions that the BVP solver bvp4c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
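A minimal sketch of the helper functions and the continuation loop, written with anonymous functions so that it runs as one script, is given here. It follows the equations and Jacobian stated in this example, but it is an assumption in three respects: the mesh and the constant guess [1 0] are illustrative, the boundary condition Jacobian is omitted so the solver approximates it by finite differences, and e is passed to the helpers as an extra argument.

```matlab
% ODE right-hand side, vectorized over the columns of y (x is a row vector).
shockode = @(x,y,e) [y(2,:); ...
    -x./e.*y(2,:) - pi^2*cos(pi*x) - pi*x./e.*sin(pi*x)];
shockbc  = @(ya,yb) [ya(1) + 2; yb(1)];   % y(-1) = -2 and y(1) = 0
shockjac = @(x,y,e) [0 1; 0 -x/e];        % df/dy evaluated at a single point

sol = bvpinit([-1 -0.5 0 0.5 1], [1 0]);  % assumed constant initial guess
e = 0.1;
for i = 2:4
    e = e/10;   % e = 1e-2, 1e-3, 1e-4
    options = bvpset('FJacobian', @(x,y) shockjac(x,y,e), 'Vectorized', 'on');
    sol = bvp4c(@(x,y) shockode(x,y,e), shockbc, sol, options);
end

% Residuals of the boundary conditions at the final solution.
bcResidual = shockbc(sol.y(:,1), sol.y(:,end));
```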
See Also
bvp4c | bvpinit | bvpset
More About
• “Solving Boundary Value Problems” on page 12-2
• “Verify BVP Consistency Using Continuation” on page 12-20
This example shows how to use continuation to gradually extend a BVP solution to larger intervals.
Falkner-Skan boundary value problems [1] arise from similarity solutions of a viscous,
incompressible, laminar flow over a flat plate. An example equation is
$$f''' + f f'' + \beta\left(1 - (f')^2\right) = 0.$$
The problem is posed on the infinite interval $[0, \infty)$ with $\beta = 0.5$, subject to the boundary conditions
$$f(0) = 0,$$
$$f'(0) = 0,$$
$$f'(\infty) = 1.$$
The BVP cannot be solved on the infinite interval, and it is impractical to solve the BVP in a very large
finite interval. Instead, this example solves a sequence of problems posed on the smaller interval
$[0, a]$ to verify that the solution has consistent behavior as $a \to \infty$. This practice of breaking the
problem up into simpler problems, with the solution of each problem feeding back in as the initial
guess for the next problem, is called continuation.
To solve this system of equations in MATLAB®, you need to code the equations, boundary conditions,
and options before calling the boundary value problem solver bvp4c. You either can include the
required functions as local functions at the end of a file (as done here), or save them as separate,
named files in a directory on the MATLAB path.
Code Equation
Create a function to code the equations. This function should have the signature dfdx =
fsode(x,f), where x is the independent variable, f is the dependent variable, and dfdx represents the
derivative f′.
You can rewrite the third-order equation as a system of first-order equations using the substitutions
$f_1 = f$, $f_2 = f'$, and $f_3 = f''$. The equations become
$$f_1' = f_2,$$
$$f_2' = f_3,$$
$$f_3' = -f_1 f_3 - \beta\left(1 - f_2^2\right).$$
Verify BVP Consistency Using Continuation
Note: All functions are included as local functions at the end of the example.
Now, write a function that returns the residual value of the boundary conditions at the boundary
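points. This function should have the signature res = fsbc(f0,finf), where f0 is the value of the
solution at the beginning of the interval and finf is the value of the solution at the end of the interval.
To calculate the residual values, you need to put the boundary conditions in the form $g(x,y) = 0$. In
this form, the boundary conditions are
$$f(0) = 0,$$
$$f'(0) = 0,$$
$$f'(\infty) - 1 = 0.$$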
Lastly, you must provide an initial guess for the solution. A crude mesh of five points and a constant
guess that satisfies the boundary conditions are good enough to get convergence on the interval
$[0, 3]$. The variable infinity denotes the right-hand limit of the interval of integration. As the value
of infinity increases on subsequent iterations from 3 to its maximum value of 6, the solution from
each previous iteration acts as the initial guess for the next iteration.
infinity = 3;
maxinfinity = 6;
solinit = bvpinit(linspace(0,infinity,5),[0 0 1]);
Solve the problem in the initial interval $[0, 3]$. Plot the solution, and compare the value of $f''(0)$ to the
analytic value [1].
sol = bvp4c(@fsode,@fsbc,solinit);
x = sol.x;
f = sol.y;
plot(x,f(2,:),x(end),f(2,end),'o');
axis([0 maxinfinity 0 1.4]);
title('Falkner-Skan Equation, Positive Wall Shear, \beta = 0.5.')
xlabel('x')
ylabel('df/dx')
hold on
Now, solve the problem on progressively larger intervals by increasing the value of infinity for
each iteration. The bvpinit function extrapolates each solution to the new interval to act as the
initial guess for the next value of infinity. Each iteration prints the calculated value of $f''(0)$ and
superimposes the plot of the solution over the previous solutions. When infinity = 6, the
consistent behavior of the solution becomes evident and the value of $f''(0)$ is very close to the
predicted value.
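The continuation loop described above can be sketched as follows, using anonymous functions in place of the fsode and fsbc local functions so that the script is self-contained. The function bodies are written directly from the first-order system and boundary conditions of this example; the variable names b, Bnew, and wallShear and the message format are illustrative assumptions, and the plotting commands are omitted.

```matlab
b = 0.5;   % value of beta in the Falkner-Skan equation
fsode = @(x,f) [f(2); f(3); -f(1)*f(3) - b*(1 - f(2)^2)];
fsbc  = @(f0,finf) [f0(1); f0(2); finf(2) - 1];

infinity = 3;
maxinfinity = 6;
sol = bvp4c(fsode, fsbc, bvpinit(linspace(0,infinity,5), [0 0 1]));

for Bnew = infinity+1:maxinfinity
    solinit = bvpinit(sol, [0 Bnew]);   % extend previous solution to [0, Bnew]
    sol = bvp4c(fsode, fsbc, solinit);
    fprintf('infinity = %g: f''''(0) is approximately %7.5f\n', Bnew, sol.y(3,1));
end
wallShear = sol.y(3,1);   % f''(0) on the largest interval
```

Passing a previous solution structure and a new interval to bvpinit is what carries each result forward as the next initial guess.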
hold off
Local Functions
Listed here are the local helper functions that the BVP solver bvp4c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
References
[1] Cebeci, T. and H. B. Keller. "Shooting and Parallel Shooting Methods for Solving the Falkner-Skan
Boundary-layer Equation." J. Comp. Phys., Vol. 7, 1971, pp. 289-300.
See Also
bvp4c | bvp5c | bvpinit
More About
• “Solving Boundary Value Problems” on page 12-2
• “Solve BVP Using Continuation” on page 12-15
Solve BVP with Singular Term
This example shows how to solve Emden's equation, which is a boundary value problem with a
singular term that arises in modeling a spherical body of gas.
After reducing the PDE of the model using symmetry, the equation becomes a second-order ODE
defined on the interval $[0, 1]$,
$$y'' + \frac{2}{x}\,y' + y^5 = 0.$$
At $x = 0$, the $(2/x)$ term is singular, but symmetry implies the boundary condition $y'(0) = 0$. With this
boundary condition, the term $(2/x)\,y'$ is well defined as $x \to 0$. For the boundary condition $y(1) = \sqrt{3}/2$,
the BVP has an analytical solution that you can compare to a numeric solution calculated in
MATLAB®,
$$y(x) = \left(1 + \frac{x^2}{3}\right)^{-1/2}.$$
The BVP solver bvp4c can solve singular BVPs that have the form
$$y' = S\,\frac{y}{x} + f(x,y).$$
The matrix S must be constant and the boundary conditions at $x = 0$ must be consistent with the
necessary condition $S \cdot y(0) = 0$. Use the 'SingularTerm' option of bvpset to pass the S matrix to
the solver.
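You can rewrite Emden's equation as a system of first-order equations using $y_1 = y$ and $y_2 = y'$ as
$$y_1' = y_2,$$
$$y_2' = -\frac{2}{x}\,y_2 - y_1^5.$$
The boundary conditions become
$$y_2(0) = 0,$$
$$y_1(1) = \sqrt{3}/2.$$
In matrix form, the system is
$$\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \frac{1}{x}\begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} y_2 \\ -y_1^5 \end{pmatrix}.$$
In matrix form it is clear that $S = \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}$ and $f(x,y) = \begin{pmatrix} y_2 \\ -y_1^5 \end{pmatrix}$.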
To solve this system of equations in MATLAB, you need to code the equations, boundary conditions,
and options before calling the boundary value problem solver bvp4c. You either can include the
required functions as local functions at the end of a file (as done here), or save them as separate,
named files in a directory on the MATLAB path.
Code Equation
Create a function to code the equations for $f(x,y)$. This function should have the signature dydx =
emdenode(x,y), where x is the independent variable, y is the dependent variable, and dydx holds the
components of $f(x,y)$.
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equations. In this case, the function returns the vector $f(x,y) = (y_2, -y_1^5)$.
The term that contains S is passed to the solver using options, so that term is not included in the
function.
Now, write a function that returns the residual value of the boundary conditions at the boundary
points. This function should have the signature res = emdenbc(ya,yb), where ya and yb are the values
of the solution at the beginning and end of the interval.
For this problem, one of the boundary conditions is for $y_1$, and the other is for $y_2$. To calculate the
residual values, you need to put the boundary conditions into the form $g(x,y) = 0$.
$$y_2(0) = 0,$$
$$y_1(1) - \sqrt{3}/2 = 0.$$
Lastly, create an initial guess of the solution. For this problem, use a constant initial guess that
satisfies the boundary conditions, and a simple mesh of five points between 0 and 1. Using many
mesh points is unnecessary since the BVP solver adapts these points during the solution process.
$$y_1 = \sqrt{3}/2,$$
$$y_2 = 0.$$
Solve Equation
Create a matrix for S and pass it to bvpset as the value of the 'SingularTerm' option. Finally, call
bvp4c with the ODE function, boundary condition function, initial guess, and option structure.
S = [0 0; 0 -2];
options = bvpset('SingularTerm',S);
sol = bvp4c(@emdenode, @emdenbc, solinit, options);
Plot Solution
Plot the analytical solution and the solution calculated by bvp4c for comparison.
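x = linspace(0,1);                 % evaluation points for the analytical solution
truy = 1./sqrt(1 + (x.^2)/3);      % analytical solution of Emden's equation
plot(x,truy,sol.x,sol.y(1,:),'ro');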
title('Emden Problem -- BVP with Singular Term.')
legend('Analytical','Computed');
xlabel('x');
ylabel('solution y');
Local Functions
Listed here are the local helper functions that the BVP solver bvp4c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
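A self-contained sketch of the whole computation, with anonymous functions standing in for the emdenode and emdenbc local functions, is given here. The function bodies follow the first-order system and boundary conditions of this example; the comparison points xq and the name maxErr are illustrative.

```matlab
% f(x,y) only; the singular term S*y/x is supplied through the options.
emdenode = @(x,y) [y(2); -y(1)^5];
emdenbc  = @(ya,yb) [ya(2); yb(1) - sqrt(3)/2];   % y'(0) = 0, y(1) = sqrt(3)/2

% Constant guess that satisfies the boundary conditions, on five mesh points.
solinit = bvpinit(linspace(0,1,5), [sqrt(3)/2; 0]);

S = [0 0; 0 -2];
options = bvpset('SingularTerm', S);
sol = bvp4c(emdenode, emdenbc, solinit, options);

% Compare with the analytical solution at a few points.
xq = linspace(0,1,11);
yq = deval(sol, xq, 1);
maxErr = max(abs(yq - 1./sqrt(1 + xq.^2/3)));
```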
See Also
bvp4c | bvp5c | bvpinit | bvpset
More About
• “Solving Boundary Value Problems” on page 12-2
Solve BVP with Multiple Boundary Conditions
This example shows how to solve a multipoint boundary value problem, where the solution of interest
satisfies conditions inside the interval of integration.
$$v' = \frac{C - 1}{n},$$
$$C' = \frac{vC - \min(x,1)}{\eta}.$$
The known parameters of the problem are $n$, $\kappa$, $\lambda > 1$, and $\eta = \dfrac{\lambda^2}{n \cdot \kappa^2}$.
The term $\min(x,1)$ in the equation for $C'(x)$ is not smooth at $x = 1$, so the problem cannot be solved
directly. Instead, you can break the problem into two: one set in the interval $[0, 1]$, and the other set
in the interval $[1, \lambda]$. The connection between the two regions is that the solutions must be continuous
at $x = 1$. The solution must also satisfy the boundary conditions
$$v(0) = 0,$$
$$C(\lambda) = 1.$$
Region 1: $0 \le x \le 1$
$$v' = \frac{C - 1}{n},$$
$$C' = \frac{vC - x}{\eta}.$$
Region 2: $1 \le x \le \lambda$
$$v' = \frac{C - 1}{n},$$
$$C' = \frac{vC - 1}{\eta}.$$
The interface point x = 1 is included in both regions. At this point, the solver produces both left and
right solutions, which must be equal to ensure continuity of the solution.
To solve this system of equations in MATLAB®, you need to code the equations, boundary conditions,
and initial guess before calling the boundary value problem solver bvp5c. You either can include the
required functions as local functions at the end of a file (as done here), or save them as separate,
named files in a directory on the MATLAB path.
Code Equations
The equations for $v'(x)$ and $C'(x)$ depend on the region being solved. For multipoint boundary value
problems the derivative function must accept a third input argument region, which is used to
identify the region where the derivative is being evaluated. The solver numbers the regions from left
to right, starting with 1.
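Create a function to code the equations with the signature dydx = f(x,y,region,p), where x is the
independent variable, y is the dependent variable, dydx represents the derivative y′, region is the number
of the region in which the derivative is evaluated, and p is the vector of constant parameter values
[n κ λ η].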
Use a switch statement to return different equations depending on the region being solved. The
function is
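function dydx = f(x,y,region,p) % equations being solved
n = p(1);     % p = [n kappa lambda eta]
eta = p(4);
dydx = zeros(2,1);
dydx(1) = (y(2) - 1)/n;
switch region
    case 1 % x in [0 1]
        dydx(2) = (y(1)*y(2) - x)/eta;
    case 2 % x in [1 lambda]
        dydx(2) = (y(1)*y(2) - 1)/eta;
end
end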
Note: All functions are included as local functions at the end of the example.
Solving two first-order differential equations in two regions requires four boundary conditions. Two of
these conditions come from the original problem:
$$v(0) = 0,$$
$$C(\lambda) - 1 = 0.$$
The other two conditions enforce the continuity of the left and right solutions at the interface point
x = 1:
$$v_L(1) - v_R(1) = 0,$$
$$C_L(1) - C_R(1) = 0.$$
For multipoint BVPs, the arguments of the boundary conditions function YL and YR become matrices.
In particular, the kth column YL(:,k) is the solution at the left boundary of the kth region. Similarly,
YR(:,k) is the solution at the right boundary of the kth region.
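The column convention can be illustrated with a small sketch that evaluates the four residuals on hand-made matrices. The handle bcsketch and the sample values in YL and YR are assumptions chosen so that every condition holds.

```matlab
% Hypothetical residual function for the two-region problem.
bcsketch = @(YL,YR) [YL(1,1);              % v(0) = 0
                     YR(1,1) - YL(1,2);    % continuity of v at x = 1
                     YR(2,1) - YL(2,2);    % continuity of C at x = 1
                     YR(2,end) - 1];       % C(lambda) = 1

% Column k holds [v; C] at the left (YL) or right (YR) end of region k.
YL = [0   0.3;     % v at x = 0 and at x = 1 (right-hand side of interface)
      1.2 0.8];    % C at x = 0 and at x = 1
YR = [0.3 0.9;     % v at x = 1 (left-hand side of interface) and at x = lambda
      0.8 1];      % C at x = 1 and at x = lambda

res = bcsketch(YL,YR);
```

All four entries of res are zero for these sample values, since the left and right interface columns match and the outer boundary values satisfy the original conditions.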
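The function that encodes the residual value of the four boundary conditions is
function res = bc(YL,YR) % boundary conditions
res = [YL(1,1) % v(0) = 0
YR(1,1) - YL(1,2) % Continuity of v(x) at x=1
YR(2,1) - YL(2,2) % Continuity of C(x) at x=1
YR(2,end) - 1]; % C(lambda) = 1
end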
For multipoint BVPs, the boundary conditions are automatically applied at the beginning and end of
the interval of integration. However, you must specify double entries in xmesh for the other interface
points. A simple initial guess is the constant y = [1; 1], which satisfies the continuity conditions and C(λ) = 1 but not v(0) = 0; the solver corrects this during the iteration.
xc = 1;
xmesh = [0 0.25 0.5 0.75 xc xc 1.25 1.5 1.75 2];
yinit = [1; 1];
sol = bvpinit(xmesh,yinit);
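The duplicated interface entry can also be generated rather than typed by hand. The sketch below is one possible construction, assuming five points per region; the variable names nInterfaces and nRegions are illustrative.

```matlab
lambda = 2;      % right end of the interval
xc = 1;          % interface point

% Concatenate one mesh per region; xc appears at the end of the first
% mesh and at the start of the second, giving the required double entry.
xmesh = [linspace(0,xc,5), linspace(xc,lambda,5)];

% Each repeated entry marks an interface between two regions.
nInterfaces = nnz(diff(xmesh) == 0);
nRegions = nInterfaces + 1;
```

For these values the result matches the mesh listed above, with one interface and two regions.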
Solve Equation
Define the values of the constant parameters and put them in the vector p. Provide the function to
bvp5c with the syntax @(x,y,r) f(x,y,r,p) to provide the vector of parameters.
Calculate the solution for several values of κ, using each solution as the initial guess for the next. For
each value of κ, calculate the value of the osmolarity $Os = \dfrac{1}{v(\lambda)}$. For each iteration of the loop,
compare the computed value with the approximate analytical solution.
lambda = 2;
n = 5e-2;
for kappa = 2:5
eta = lambda^2/(n*kappa^2);
p = [n kappa lambda eta];
sol = bvp5c(@(x,y,r) f(x,y,r,p), @bc, sol);
K2 = lambda*sinh(kappa/lambda)/(kappa*cosh(kappa));
approx = 1/(1 - K2);
computed = 1/sol.y(1,end);
fprintf(' %2i %10.3f %10.3f \n',kappa,computed,approx);
end
2 1.462 1.454
3 1.172 1.164
4 1.078 1.071
5 1.039 1.034
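The approximate analytical values in the last column do not depend on the solver, so they can be reproduced on their own. This sketch vectorizes the two expressions from the loop over κ = 2,...,5 with λ = 2.

```matlab
lambda = 2;
kappa = 2:5;

% Same expressions as in the loop, applied elementwise to all kappa values.
K2 = lambda*sinh(kappa/lambda)./(kappa.*cosh(kappa));
approx = 1./(1 - K2);

fprintf(' %2i %10.3f\n',[kappa; approx]);
```

The printed values agree with the third column of the table above to the three decimal places shown.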
Plot Solution
Plot the solution components for v(x) and C(x) and a vertical line at the interface point x = 1. The
displayed solution for κ = 5 results from the final iteration of the loop.
plot(sol.x,sol.y(1,:),'--o',sol.x,sol.y(2,:),'--o')
line([1 1], [0 2], 'Color', 'k')
legend('v(x)','C(x)')
title('A Three-Point BVP Solved with bvp5c')
xlabel({'x', '\lambda = 2, \kappa = 5'})
ylabel('v(x) and C(x)')
Local Functions
Listed here are the local helper functions that the BVP solver bvp5c calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
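function dydx = f(x,y,region,p) % equations being solved
n = p(1);   % p = [n kappa lambda eta]
eta = p(4);
dydx = zeros(2,1);
dydx(1) = (y(2) - 1)/n;
switch region
case 1 % x in [0 1]
dydx(2) = (y(1)*y(2) - x)/eta;
case 2 % x in [1 lambda]
dydx(2) = (y(1)*y(2) - 1)/eta;
end
end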
%-------------------------------------------
function res = bc(YL,YR) % boundary conditions
res = [YL(1,1) % v(0) = 0
YR(1,1) - YL(1,2) % Continuity of v(x) at x=1
YR(2,1) - YL(2,2) % Continuity of C(x) at x=1
YR(2,end) - 1]; % C(lambda) = 1
end
%-------------------------------------------
See Also
bvp4c | bvp5c | bvpinit
More About
• “Solving Boundary Value Problems” on page 12-2
13 Partial Differential Equations (PDEs)
Solving Partial Differential Equations
• Equations with a time derivative are parabolic. An example is the heat equation $\dfrac{\partial u}{\partial t} = \dfrac{\partial^2 u}{\partial x^2}$.
• Equations without a time derivative are elliptic. An example is the Laplace equation $\dfrac{\partial^2 u}{\partial x^2} = 0$.
pdepe requires at least one parabolic equation in the system. In other words, at least one equation in
the system must include a time derivative.
pdepe also solves certain 2-D and 3-D problems that reduce to 1-D problems due to angular
symmetry (see the argument description for the symmetry constant m for more information).
Partial Differential Equation Toolbox extends this functionality to generalized problems in 2-D and 3-D
with Dirichlet and Neumann boundary conditions.
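The PDEs that pdepe solves have the standard form
$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$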
The coupling of the partial derivatives with respect to time is restricted to multiplication by a
diagonal matrix $c\left(x,t,u,\frac{\partial u}{\partial x}\right)$. The diagonal elements of this matrix are either zero or positive. An
element that is zero corresponds to an elliptic equation, and any other element corresponds to a
parabolic equation. There must be at least one parabolic equation. An element of c that corresponds
to a parabolic equation can vanish at isolated values of x if they are mesh points (points where the
solution is evaluated). Discontinuities in c and s due to material interfaces are permitted provided
that a mesh point is placed at each interface.
Solution Process
To solve PDEs with pdepe, you must define the equation coefficients for c, f, and s, the initial
conditions, the behavior of the solution at the boundaries, and a mesh of points to evaluate the
solution on. The function call sol = pdepe(m,pdefun,icfun,bcfun,xmesh,tspan) uses this
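information to calculate a solution on the specified mesh, where:
• m is the symmetry constant.
• pdefun defines the equations being solved.
• icfun defines the initial conditions.
• bcfun defines the boundary conditions.
• xmesh is a vector of spatial values for x.
• tspan is a vector of time values for t.
Together, the xmesh and tspan vectors form a 2-D grid that pdepe evaluates the solution on.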
Equations
You must express the PDEs in the standard form expected by pdepe. Written in this form, you can
read off the values of the coefficients c, f, and s.
In MATLAB you can code the equations with a function of the form
function [c,f,s] = pdefun(x,t,u,dudx)
c = 1;
f = dudx;
s = 0;
end
In this case pdefun defines the equation $\dfrac{\partial u}{\partial t} = \dfrac{\partial^2 u}{\partial x^2}$. If there are multiple equations, then c, f, and s
are vectors with each element corresponding to one equation.
Initial Conditions
At the initial time t = t0, for all x, the solution components satisfy initial conditions of the form
$$u(x,t_0) = u_0(x).$$
In MATLAB you can code the initial conditions with a function of the form
function u0 = icfun(x)
u0 = 1;
end
In this case u0 = 1 defines the initial condition u(x,t0) = 1. If there are multiple equations, then u0
is a vector with each element defining the initial condition of one equation.
Boundary Conditions
At the boundary x = a or x = b, for all t, the solution components satisfy boundary conditions of the
form
$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
q(x,t) is a diagonal matrix with elements that are either zero or never zero. Note that the boundary
conditions are expressed in terms of the flux f, rather than the partial derivative of u with respect to
x. Also, of the two coefficients p(x,t,u) and q(x,t), only p can depend on u.
In MATLAB you can code the boundary conditions with a function of the form
function [pL,qL,pR,qR] = bcfun(xL,uL,xR,uR,t)
pL = uL;
qL = 0;
pR = uR - 1;
qR = 0;
end
pL and qL are the coefficients for the left boundary, while pR and qR are the coefficients for the right
boundary. In this case bcfun defines the boundary conditions
$$u_L(x_L,t) = 0,$$
$$u_R(x_R,t) = 1.$$
If there are multiple equations, then the outputs pL, qL, pR, and qR are vectors with each element
defining the boundary condition of one equation.
Integration Options
The default integration properties in the MATLAB PDE solver are selected to handle common
problems. In some cases, you can improve solver performance by overriding these default values. To
do this, use odeset to create an options structure. Then, pass the structure to pdepe as the last
input argument:
sol = pdepe(m,pdefun,icfun,bcfun,xmesh,tspan,options)
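A minimal sketch of this call pattern follows, assuming the heat equation coefficients and boundary conditions used earlier in this section. For brevity it builds the three functions as anonymous handles with deal, which is an illustrative shortcut rather than the style used in the examples; the tolerance values are arbitrary.

```matlab
% Heat equation u_t = u_xx written as anonymous handles (sketch only).
% deal returns the multiple outputs that pdepe requests from each handle.
pdefun = @(x,t,u,dudx) deal(1, dudx, 0);           % c, f, s
icfun  = @(x) 0.5;                                 % u(x,t0) = 0.5
bcfun  = @(xl,ul,xr,ur,t) deal(ul, 0, ur - 1, 0);  % pL, qL, pR, qR

m = 0;
xmesh = linspace(0,1,20);
tspan = linspace(0,1,10);

% Tighten the error tolerances of the underlying ODE solver.
options = odeset('RelTol',1e-6,'AbsTol',1e-8);
sol = pdepe(m,pdefun,icfun,bcfun,xmesh,tspan,options);
```

Tighter tolerances control only the time integration; the spatial accuracy still depends on xmesh.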
Of the options for the underlying ODE solver ode15s, only the following are available for pdepe:
• Error control: RelTol, AbsTol, NormControl
• Step size: InitialStep, MaxStep
• Event logging: Events
After you solve an equation with pdepe, MATLAB returns the solution as a 3-D array sol, where
sol(i,j,k) contains the kth component of the solution evaluated at t(i) and x(j). In general, you
can extract the kth solution component with the command u = sol(:,:,k).
The time mesh you specify is used purely for output purposes, and does not affect the internal time
steps taken by the solver. However, the spatial mesh you specify can affect the quality and speed of
the solution. After solving an equation, you can use pdeval to evaluate the solution structure
returned by pdepe with a different spatial mesh.
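The following sketch shows one way to call pdeval on a single time slice of the solution. It assumes the heat equation setup from this section, written with anonymous handles and deal for compactness; the query points in xout are arbitrary.

```matlab
% Solve u_t = u_xx with u(0,t) = 0, u(1,t) = 1 (sketch only).
pdefun = @(x,t,u,dudx) deal(1, dudx, 0);
icfun  = @(x) 0.5;
bcfun  = @(xl,ul,xr,ur,t) deal(ul, 0, ur - 1, 0);

m = 0;
x = linspace(0,1,20);
t = linspace(0,5,6);
sol = pdepe(m,pdefun,icfun,bcfun,x,t);

% Interpolate the first component at the final time onto new points.
% pdeval also returns the spatial derivative at those points.
xout = [0.33 0.5 0.66];
[uout,duoutdx] = pdeval(m,x,sol(end,:,1),xout);
```

By the final time the solution is close to the steady state u = x, so uout is close to xout and duoutdx is close to 1.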
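An example of a parabolic PDE is the heat equation in one dimension:
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$$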
This equation describes the dissipation of heat for 0 ≤ x ≤ L and t ≥ 0. The goal is to solve for the
temperature u(x, t). The temperature is initially a nonzero constant, so the initial condition is
$$u(x,0) = T_0.$$
Also, the temperature is zero at the left boundary, and nonzero at the right boundary, so the boundary
conditions are
$$u(0,t) = 0,$$
$$u(L,t) = 1.$$
To solve this equation in MATLAB®, you need to code the equation, initial conditions, and boundary
conditions, then select a suitable solution mesh before calling the solver pdepe. You either can
include the required functions as local functions at the end of a file (as in this example), or save them
as separate, named files in a directory on the MATLAB path.
Code Equation
Before you can code the equation, you need to make sure that it is in the form that the pdepe solver
expects:
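$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$
Written in this form, the heat equation is
$$1 \cdot \frac{\partial u}{\partial t} = x^{0}\frac{\partial}{\partial x}\left(x^{0}\frac{\partial u}{\partial x}\right) + 0.$$
So the values of the coefficients are:
• m = 0
• c = 1
• f = ∂u/∂x
• s = 0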
The value of m is passed as an argument to pdepe, while the other coefficients are encoded in a
function for the equation, which is
function [c,f,s] = heatpde(x,t,u,dudx)
c = 1;
f = dudx;
s = 0;
end
(Note: All functions are included as local functions at the end of the example.)
The initial condition function for the heat equation assigns a constant value for u0. This function must
accept an input for x, even if it is unused.
function u0 = heatic(x)
u0 = 0.5;
end
The standard form for the boundary conditions expected by the pdepe solver is
$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
Written in this form, the boundary conditions for this problem are
$$u(0,t) + 0 \cdot f = 0,$$
$$\left(u(L,t) - 1\right) + 0 \cdot f = 0.$$
So the coefficients are:
• pL = uL, qL = 0.
• pR = uR − 1, qR = 0.
Use a spatial mesh of 20 points and a time mesh of 30 points. Since the solution rapidly reaches a
steady state, the time points near t = 0 are more closely spaced together to capture this behavior in
the output.
L = 1;
x = linspace(0,L,20);
t = [linspace(0,0.05,20), linspace(0.5,5,10)];
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial condition, the
boundary conditions, and the meshes for x and t.
m = 0;
sol = pdepe(m,@heatpde,@heatic,@heatbc,x,t);
Plot Solution
colormap hot
pcolor(x,t,sol)
colorbar
xlabel('Distance x','interpreter','latex')
ylabel('Time t','interpreter','latex')
title('Heat Equation for $0 \le x \le 1$ and $0 \le t \le 5$','interpreter','latex')
Local Functions
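function [c,f,s] = heatpde(x,t,u,dudx) % Equation to solve
c = 1;
f = dudx;
s = 0;
end
function u0 = heatic(x) % Initial condition
u0 = 0.5;
end
function [pl,ql,pr,qr] = heatbc(xl,ul,xr,ur,t) % Boundary conditions
pl = ul;
ql = 0;
pr = ur - 1;
qr = 0;
end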
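To explore and run the example files, type odeexamples. To open an individual example file for editing, type edit exampleFileName.m. To run an example, type exampleFileName.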
References
[1] Skeel, R. D. and M. Berzins, "A Method for the Spatial Discretization of Parabolic Equations in
One Space Variable," SIAM Journal on Scientific and Statistical Computing, Vol. 11, 1990, pp.
1–32.
See Also
bvp4c | ode45 | pdepe | odeset | pdeval
More About
• “Solve Single PDE” on page 13-9
• “Solve System of PDEs” on page 13-32
Solve Single PDE
This example shows how to formulate, compute, and plot the solution to a single PDE.
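Consider the partial differential equation
$$\pi^2 \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$$
The equation is defined on the interval 0 ≤ x ≤ 1 for times t ≥ 0. At t = 0, the solution satisfies the
initial condition
$$u(x,0) = \sin(\pi x).$$
At x = 0 and x = 1, the solution satisfies the boundary conditions
$$u(0,t) = 0,$$
$$\pi e^{-t} + \frac{\partial u}{\partial x}(1,t) = 0.$$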
To solve this equation in MATLAB®, you need to code the equation, the initial conditions, and the
boundary conditions, then select a suitable solution mesh before calling the solver pdepe. You either
can include the required functions as local functions at the end of a file (as done here), or save them
as separate, named files in a directory on the MATLAB path.
Code Equation
Before you can code the equation, you need to rewrite it in a form that the pdepe solver expects. The
standard form that pdepe expects is
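$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$
Written in this form, the equation becomes
$$\pi^2 \frac{\partial u}{\partial t} = x^{0}\frac{\partial}{\partial x}\left(x^{0}\frac{\partial u}{\partial x}\right) + 0.$$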
With the equation in the proper form you can read off the relevant terms:
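$$m = 0$$
$$c\left(x,t,u,\frac{\partial u}{\partial x}\right) = \pi^2$$
$$f\left(x,t,u,\frac{\partial u}{\partial x}\right) = \frac{\partial u}{\partial x}$$
$$s\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0$$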
Now you can create a function to code the equation. The function should have the signature [c,f,s]
= pdex1pde(x,t,u,dudx):
function [c,f,s] = pdex1pde(x,t,u,dudx)
c = pi^2;
f = dudx;
s = 0;
end
(Note: All functions are included as local functions at the end of the example.)
Next, write a function that returns the initial condition. The initial condition is applied at the first
time value tspan(1). The function should have the signature u0 = pdex1ic(x).
function u0 = pdex1ic(x)
u0 = sin(pi*x);
end
Now, write a function that evaluates the boundary conditions. For problems posed on the interval
a ≤ x ≤ b, the boundary conditions apply for all t and either x = a or x = b. The standard form for the
boundary conditions expected by the solver is
$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
Rewrite the boundary conditions in this standard form and read off the coefficient values.
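For x = 0, the equation is
$$u(0,t) = 0 \;\rightarrow\; u + 0 \cdot \frac{\partial u}{\partial x} = 0.$$
The coefficients are:
• p(0, t, u) = u
• q(0, t) = 0
For x = 1, the equation is
$$\pi e^{-t} + \frac{\partial u}{\partial x}(1,t) = 0 \;\rightarrow\; \pi e^{-t} + 1 \cdot \frac{\partial u}{\partial x}(1,t) = 0.$$
The coefficients are:
• p(1, t, u) = π e^{−t}
• q(1, t) = 1
Since the boundary condition function is expressed in terms of $f\left(x,t,u,\frac{\partial u}{\partial x}\right)$, and this term is already
defined in the main PDE function, you do not need to specify this piece of the equation in the
boundary condition function. You need only specify the values of p(x, t, u) and q(x, t) at each boundary.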
Before solving the equation you need to specify the mesh points (t, x) at which you want pdepe to
evaluate the solution. Specify the points as vectors t and x. The vectors t and x play different roles in
the solver. In particular, the cost and accuracy of the solution depend strongly on the length of the
vector x. However, the computation is much less sensitive to the values in the vector t.
For this problem, use a mesh with 20 equally spaced points in the spatial interval [0,1] and five values
of t from the time interval [0,2].
x = linspace(0,1,20);
t = linspace(0,2,5);
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial conditions, the
boundary conditions, and the meshes for x and t.
m = 0;
sol = pdepe(m,@pdex1pde,@pdex1ic,@pdex1bc,x,t);
pdepe returns the solution in a 3-D array sol, where sol(i,j,k) approximates the kth component
of the solution $u_k$ evaluated at t(i) and x(j). The size of sol is length(t)-by-length(x)-by-
length(u0), since u0 specifies an initial condition for each solution component. For this problem, u
has only one component, so sol is a 5-by-20 matrix, but in general you can extract the kth solution
component with the command u = sol(:,:,k).
u = sol(:,:,1);
Plot Solution
surf(x,t,u)
title('Numerical solution computed with 20 mesh points')
xlabel('Distance x')
ylabel('Time t')
The initial condition and boundary conditions of this problem were chosen so that there would be an
analytical solution, given by
$$u(x,t) = e^{-t}\sin(\pi x).$$
surf(x,t,exp(-t)'*sin(pi*x))
title('True solution plotted with 20 mesh points')
xlabel('Distance x')
ylabel('Time t')
Now, compare the numerical and analytical solutions at $t_f$, the final value of t. In this example $t_f = 2$.
plot(x,u(end,:),'o',x,exp(-t(end))*sin(pi*x))
title('Solution at t = 2')
legend('Numerical, 20 mesh points','Analytical','Location','South')
xlabel('Distance x')
ylabel('u(x,2)')
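Beyond the visual comparison, the difference between the two solutions can be measured directly. The sketch below is self-contained and restates the problem with anonymous handles and deal, which is an illustrative shortcut rather than the local-function style of this example; the names uexact and err are assumptions.

```matlab
% pi^2*u_t = u_xx with u(0,t) = 0 and pi*exp(-t) + u_x(1,t) = 0 (sketch).
pdefun = @(x,t,u,dudx) deal(pi^2, dudx, 0);
icfun  = @(x) sin(pi*x);
bcfun  = @(xl,ul,xr,ur,t) deal(ul, 0, pi*exp(-t), 1);

x = linspace(0,1,20);
t = linspace(0,2,5);
u = pdepe(0,pdefun,icfun,bcfun,x,t);

% Analytical solution on the same grid: rows are times, columns are x.
uexact = exp(-t)'*sin(pi*x);

% Largest pointwise difference over the whole grid.
err = max(abs(u(:) - uexact(:)));
```

Repeating the calculation with a finer spatial mesh is one way to see how strongly the error depends on the length of x.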
Local Functions
Listed here are the local helper functions that the PDE solver pdepe calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function [c,f,s] = pdex1pde(x,t,u,dudx) % Equation to solve
c = pi^2;
f = dudx;
s = 0;
end
%----------------------------------------------
function u0 = pdex1ic(x) % Initial conditions
u0 = sin(pi*x);
end
%----------------------------------------------
function [pl,ql,pr,qr] = pdex1bc(xl,ul,xr,ur,t) % Boundary conditions
pl = ul;
ql = 0;
pr = pi*exp(-t);
qr = 1;
end
%----------------------------------------------
See Also
pdepe
More About
• “Solving Partial Differential Equations” on page 13-2
• “Solve PDE with Discontinuity” on page 13-16
Solve PDE with Discontinuity
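This example shows how to solve a PDE that interfaces with a material. The material interface
creates a discontinuity in the problem at x = 0.5, and the initial condition has a discontinuity at the
right boundary x = 1.
Consider the piecewise PDE
$$\frac{\partial u}{\partial t} = x^{-2}\frac{\partial}{\partial x}\left(x^{2}\, 5\frac{\partial u}{\partial x}\right) - 1000e^{u} \quad (0 \le x \le 0.5),$$
$$\frac{\partial u}{\partial t} = x^{-2}\frac{\partial}{\partial x}\left(x^{2}\frac{\partial u}{\partial x}\right) - e^{u} \quad (0.5 \le x \le 1).$$
The initial conditions are
$$u(x,0) = 0 \quad (0 \le x < 1),$$
$$u(1,0) = 1 \quad (x = 1).$$
The boundary conditions are
$$\frac{\partial u}{\partial x} = 0 \quad (x = 0),$$
$$u(1,t) = 1 \quad (x = 1).$$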
To solve this equation in MATLAB®, you need to code the equation, the initial conditions, and the
boundary conditions, then select a suitable solution mesh before calling the solver pdepe. You either
can include the required functions as local functions at the end of a file (as done here), or save them
as separate, named files in a directory on the MATLAB path.
Code Equation
Before you can code the equation, you need to make sure that it is in a form that the pdepe solver
expects. The standard form that pdepe expects is
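$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$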
In this case, the PDE is in the proper form, so you can read off the values of the coefficients.
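$$\frac{\partial u}{\partial t} = x^{-2}\frac{\partial}{\partial x}\left(x^{2}\, 5\frac{\partial u}{\partial x}\right) - 1000e^{u} \quad (0 \le x \le 0.5),$$
$$\frac{\partial u}{\partial t} = x^{-2}\frac{\partial}{\partial x}\left(x^{2}\frac{\partial u}{\partial x}\right) - e^{u} \quad (0.5 \le x \le 1).$$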
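The values for the flux term $f\left(x,t,u,\frac{\partial u}{\partial x}\right)$ and source term $s\left(x,t,u,\frac{\partial u}{\partial x}\right)$ change depending on the value
of x. The coefficients are:
$$m = 2$$
$$c\left(x,t,u,\frac{\partial u}{\partial x}\right) = 1$$
$$f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 5\frac{\partial u}{\partial x} \quad (0 \le x \le 0.5)$$
$$f\left(x,t,u,\frac{\partial u}{\partial x}\right) = \frac{\partial u}{\partial x} \quad (0.5 \le x \le 1)$$
$$s\left(x,t,u,\frac{\partial u}{\partial x}\right) = -1000e^{u} \quad (0 \le x \le 0.5)$$
$$s\left(x,t,u,\frac{\partial u}{\partial x}\right) = -e^{u} \quad (0.5 \le x \le 1)$$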
Now you can create a function to code the equation. The function should have the signature [c,f,s]
= pdex2pde(x,t,u,dudx):
function [c,f,s] = pdex2pde(x,t,u,dudx)
c = 1;
if x <= 0.5
f = 5*dudx;
s = -1000*exp(u);
else
f = dudx;
s = -exp(u);
end
end
(Note: All functions are included as local functions at the end of the example.)
Next, write a function that returns the initial conditions. The initial condition is applied at the first
time value and provides the value of u(x, t0) for any value of x. Use the function signature u0 =
pdex2ic(x) to write the function.
$$u(x,0) = 0 \quad (0 \le x < 1),$$
$$u(1,0) = 1 \quad (x = 1).$$
function u0 = pdex2ic(x)
if x < 1
u0 = 0;
else
u0 = 1;
end
end
Now, write a function that evaluates the boundary conditions. For problems posed on the interval
a ≤ x ≤ b, the boundary conditions apply for all t and either x = a or x = b. The standard form for the
boundary conditions expected by the solver is
$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
Since this example has spherical symmetry (m = 2), the pdepe solver automatically enforces the left
boundary condition to bound the solution at the origin, and ignores any conditions specified for the
left boundary in the boundary function. So for the left boundary condition, you can specify
pL = qL = 0. For the right boundary condition, you can rewrite the boundary condition in the standard
form and read off the coefficient values for pR and qR.
For x = 1, the equation is
$$u(1,t) = 1 \;\rightarrow\; (u - 1) + 0 \cdot \frac{\partial u}{\partial x} = 0.$$
The coefficients are:
• pR(1, t, u) = u − 1
• qR(1, t) = 0
The spatial mesh should include several values near x = 0.5 to account for the discontinuous
interface, as well as points near x = 1, where the initial condition jumps from u = 0 to the
boundary value u(1, t) = 1. The solution changes rapidly for small t, so use a time step
that can resolve this sharp change.
x = [0 0.1 0.2 0.3 0.4 0.45 0.475 0.5 0.525 0.55 0.6 0.7 0.8 0.9 0.95 0.975 0.99 1];
t = [0 0.001 0.005 0.01 0.05 0.1 0.5 1];
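A few basic properties of such a hand-built mesh are worth checking before calling the solver. The following sketch tests that the interface is a mesh point and that the mesh is strictly increasing, then reports the smallest spacing; the variable names are illustrative.

```matlab
x = [0 0.1 0.2 0.3 0.4 0.45 0.475 0.5 0.525 0.55 0.6 0.7 0.8 0.9 0.95 0.975 0.99 1];

% The material interface must coincide with a mesh point.
hasInterface = any(x == 0.5);

% The mesh must be strictly increasing.
isIncreasing = all(diff(x) > 0);

% Smallest spacing and the index of the interval where it occurs.
[hmin,imin] = min(diff(x));
```

For this mesh the finest spacing is in the last interval, next to the discontinuity in the initial condition at x = 1.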
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial conditions, the
boundary conditions, and the meshes for x and t.
m = 2;
sol = pdepe(m,@pdex2pde,@pdex2ic,@pdex2bc,x,t);
pdepe returns the solution in a 3-D array sol, where sol(i,j,k) approximates the kth component
of the solution $u_k$ evaluated at t(i) and x(j). The size of sol is length(t)-by-length(x)-by-
length(u0), since u0 specifies an initial condition for each solution component. For this problem, u
has only one component, so sol is an 8-by-18 matrix, but in general you can extract the kth solution
component with the command u = sol(:,:,k).
u = sol(:,:,1);
Plot Solution
Create a surface plot of the solution u plotted at the selected mesh points for x and t. Since m = 2 the
problem is posed in a spherical geometry with spherical symmetry, so the solution only changes in the
radial x direction.
surf(x,t,u)
title('Numerical solution with nonuniform mesh')
xlabel('Distance x')
ylabel('Time t')
zlabel('Solution u')
Now, plot just x and u to get a side view of the contours in the surface plot. Add a line at x = 0.5 to
highlight the effect of the material interface.
plot(x,u,x,u,'*')
line([0.5 0.5], [-3 1], 'Color', 'k')
xlabel('Distance x')
ylabel('Solution u')
title('Solution profiles at several times')
Local Functions
Listed here are the local helper functions that the PDE solver pdepe calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
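function [c,f,s] = pdex2pde(x,t,u,dudx) % Equation to solve
c = 1;
if x <= 0.5
f = 5*dudx;
s = -1000*exp(u);
else
f = dudx;
s = -exp(u);
end
end
%----------------------------------------------
function u0 = pdex2ic(x) % Initial conditions
if x < 1
u0 = 0;
else
u0 = 1;
end
end
%----------------------------------------------
function [pl,ql,pr,qr] = pdex2bc(xl,ul,xr,ur,t) % Boundary conditions
pl = 0; % ignored by the solver because m = 2
ql = 0;
pr = ur - 1;
qr = 0;
end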
%----------------------------------------------
See Also
pdepe
More About
• “Solving Partial Differential Equations” on page 13-2
• “Solve PDE and Compute Partial Derivatives” on page 13-23
Solve PDE and Compute Partial Derivatives
This example shows how to solve a transistor partial differential equation (PDE) and use the results to
obtain partial derivatives that are part of solving a larger problem.
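Consider the PDE
$$\frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2} - \frac{D\eta}{L}\frac{\partial u}{\partial x}.$$
This equation arises in transistor theory [1], and u(x, t) is a function describing the concentration of
excess charge carriers (or holes) in the base of a PNP transistor. D and η are physical constants. The
equation holds on the interval 0 ≤ x ≤ L for times t ≥ 0.
The initial condition includes a constant K and is given by
$$u(x,0) = \frac{K L}{D}\left(\frac{1 - e^{-\eta(1 - x/L)}}{\eta}\right).$$
The boundary conditions are
$$u(0,t) = u(L,t) = 0.$$
For fixed x, the solution to the equation u(x, t) describes the collapse of excess charge as t → ∞. This
collapse produces a current, called the emitter discharge current, which has another constant $I_p$:
$$I(t) = \left[\frac{I_p D}{K}\frac{\partial}{\partial x}u(x,t)\right]_{x=0}.$$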
The equation is valid for t > 0 due to the inconsistency in the boundary values at x = 0 for t = 0 and
t > 0. Since the PDE has a closed-form series solution for u(x, t), you can calculate the emitter
discharge current analytically as well as numerically, and compare the results.
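The inconsistency at x = 0 can be seen by evaluating the initial condition at the two boundaries. This sketch assumes the constant values that this example assigns to the structure C; the handle icsketch is a hypothetical one-line version of the initial condition.

```matlab
% Constant values as assigned to the structure C in this example.
C.L = 1; C.D = 0.1; C.eta = 10; C.K = 1;

% Initial condition u(x,0) written as an anonymous function (sketch only).
icsketch = @(x,C) (C.K*C.L/C.D)*(1 - exp(-C.eta*(1 - x/C.L)))/C.eta;

u0_left  = icsketch(0,C);     % value at x = 0 for t = 0
u0_right = icsketch(C.L,C);   % value at x = L for t = 0
```

The initial profile is zero at x = L, which agrees with the boundary condition there, but it is close to 1 at x = 0, where the boundary condition requires u = 0 for t > 0.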
To solve this problem in MATLAB®, you need to code the PDE equation, initial conditions, and
boundary conditions, then select a suitable solution mesh before calling the solver pdepe. You either
can include the required functions as local functions at the end of a file (as done here), or save them
as separate, named files in a directory on the MATLAB path.
To keep track of the physical constants, create a structure array with fields for each one. When you
later define functions for the equations, the initial condition, and the boundary conditions, you can
pass in this structure as an extra argument so that the functions have access to the constants.
C.L = 1;
C.D = 0.1;
C.eta = 10;
C.K = 1;
C.Ip = 1;
Code Equation
Before you can code the equation, you need to make sure that it is in the form that the pdepe solver
expects:
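$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$
Written in this form, the equation is
$$\frac{\partial u}{\partial t} = x^{0}\frac{\partial}{\partial x}\left(x^{0} D\frac{\partial u}{\partial x}\right) - \frac{D\eta}{L}\frac{\partial u}{\partial x}.$$
So the values of the coefficients are:
• m = 0
• c = 1
• f = D ∂u/∂x
• s = −(Dη/L) ∂u/∂x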
Now you can create a function to code the equation. The function should have the signature [c,f,s]
= transistorPDE(x,t,u,dudx,C):
function [c,f,s] = transistorPDE(x,t,u,dudx,C)
D = C.D;
eta = C.eta;
L = C.L;
c = 1;
f = D*dudx;
s = -(D*eta/L)*dudx;
end
(Note: All functions are included as local functions at the end of the example.)
Next, write a function that returns the initial condition. The initial condition is applied at the first
time value, and provides the value of u(x, t0) for any value of x. Use the function signature u0 =
transistorIC(x,C) to write the function.
$$u(x,0) = \frac{K L}{D}\left(\frac{1 - e^{-\eta(1 - x/L)}}{\eta}\right).$$
function u0 = transistorIC(x,C)
K = C.K;
L = C.L;
D = C.D;
eta = C.eta;
u0 = (K*L/D)*(1 - exp(-eta*(1 - x/L)))/eta;
end
Now, write a function that evaluates the boundary conditions u(0, t) = u(1, t) = 0. For problems posed
on the interval a ≤ x ≤ b, the boundary conditions apply for all t and either x = a or x = b. The
standard form for the boundary conditions expected by the solver is
$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
Written in this form, the boundary conditions for this problem are
- For x = 0, the equation is $u + 0 \cdot D\frac{\partial u}{\partial x} = 0$. The coefficients are:
• pL(x, t, u) = u,
• qL(x, t) = 0.
- Likewise for x = 1, the equation is $u + 0 \cdot D\frac{\partial u}{\partial x} = 0$. The coefficients are:
• pR(x, t, u) = u,
• qR(x, t) = 0.
The boundary function is
function [pl,ql,pr,qr] = transistorBC(xl,ul,xr,ur,t)
pl = ul;
ql = 0;
pr = ur;
qr = 0;
end
The solution mesh defines the values of x and t where the solution is evaluated by the solver. Since
the solution to this problem changes rapidly, use a relatively fine mesh of 50 spatial points in the
interval 0 ≤ x ≤ L and 50 time points in the interval 0 ≤ t ≤ 1.
x = linspace(0,C.L,50);
t = linspace(0,1,50);
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial condition, the
boundary conditions, and the meshes for x and t. Since pdepe expects the PDE function to use four
inputs and the initial condition function to use one input, create function handles that pass in the
structure of physical constants as an extra input.
m = 0;
eqn = @(x,t,u,dudx) transistorPDE(x,t,u,dudx,C);
ic = @(x) transistorIC(x,C);
sol = pdepe(m,eqn,ic,@transistorBC,x,t);
pdepe returns the solution in a 3-D array sol, where sol(i,j,k) approximates the kth component
of the solution $u_k$ evaluated at t(i) and x(j). For this problem u has only one component, but in
general you can extract the kth solution component with the command u = sol(:,:,k).
u = sol(:,:,1);
Plot Solution
Create a surface plot of the solution u plotted at the selected mesh points for x and t.
surf(x,t,u)
title('Numerical Solution (50 mesh points)')
xlabel('Distance x')
ylabel('Time t')
zlabel('Solution u(x,t)')
Now, plot just x and u to get a side view of the contours in the surface plot.
plot(x,u)
xlabel('Distance x')
ylabel('Solution u(x,t)')
title('Solution profiles at several times')
Using a series solution for u(x, t), the emitter discharge current can be expressed as the infinite series
[1]:
$$I(t) = 2\pi^2 I_p \left(\frac{1 - e^{-\eta}}{\eta}\right) \sum_{n=1}^{\infty} \frac{n^2}{n^2\pi^2 + \eta^2/4}\, e^{-\frac{D}{L^2}\left(n^2\pi^2 + \eta^2/4\right)t}.$$
Write a function to calculate the analytic solution for I(t) using 40 terms in the series. The only
variable is time, but specify a second input to the function for the structure of constants.
function It = serex3(t,C) % Approximate I(t) by series expansion.
Ip = C.Ip;
eta = C.eta;
D = C.D;
L = C.L;
It = 0;
for n = 1:40 % Use 40 terms
m = (n*pi)^2 + 0.25*eta^2;
It = It + ((n*pi)^2 / m)* exp(-(D/L^2)*m*t);
end
It = 2*Ip*((1 - exp(-eta))/eta)*It;
end
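Whether 40 terms are enough depends on t, because the exponential factor damps the higher terms more slowly for small t. The sketch below compares 40 and 400 terms at one sample time, using a hypothetical vectorized handle seriesI with the same constants as this example; at t = 0 the terms do not decay at all, which is one reason to avoid evaluating the series there.

```matlab
% Constant values as assigned to the structure C in this example.
C.Ip = 1; C.eta = 10; C.D = 0.1; C.L = 1;

% Hypothetical vectorized partial sum with N terms, for scalar t.
mfun = @(N,C) ((1:N)*pi).^2 + 0.25*C.eta^2;
seriesI = @(t,C,N) 2*C.Ip*((1 - exp(-C.eta))/C.eta) * ...
    sum((((1:N)*pi).^2 ./ mfun(N,C)) .* exp(-(C.D/C.L^2)*mfun(N,C)*t));

tsample = 0.1;                 % arbitrary sample time
I40  = seriesI(tsample,C,40);
I400 = seriesI(tsample,C,400);
truncationGap = abs(I40 - I400);
```

At this sample time the two partial sums are indistinguishable in double precision, so the truncation to 40 terms is harmless there.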
Using the numeric solution for u(x, t) as computed by pdepe, you can also calculate the numeric
approximation for I(t) at x = 0 with
$$I(t) = \left[\frac{I_p D}{K}\frac{\partial}{\partial x}u(x,t)\right]_{x=0}.$$
Calculate the analytic and numeric solutions for I(t) and plot the results. Use pdeval to compute the
value of ∂u/∂x at x = 0.
nt = length(t);
I = zeros(1,nt);
seriesI = zeros(1,nt);
iok = 2:nt;
for j = iok
% At time t(j), compute du/dx at x = 0.
[~,I(j)] = pdeval(m,x,u(j,:),0);
seriesI(j) = serex3(t(j),C);
end
% Numeric solution has form I(t) = (I_p*D/K)*du(0,t)/dx
I = (C.Ip*C.D/C.K)*I;
plot(t(iok),I(iok),'o',t(iok),seriesI(iok))
legend('From PDEPE + PDEVAL','From series')
title('Emitter discharge current I(t)')
xlabel('Time t')
The results match reasonably well. You can further improve the numeric result from pdepe by using a
finer solution mesh.
Local Functions
Listed here are the local helper functions that the PDE solver pdepe calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
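function [c,f,s] = transistorPDE(x,t,u,dudx,C) % Equation to solve
D = C.D;
eta = C.eta;
L = C.L;
c = 1;
f = D*dudx;
s = -(D*eta/L)*dudx;
end
% ----------------------------------------------------
function u0 = transistorIC(x,C) % Initial condition
K = C.K;
L = C.L;
D = C.D;
eta = C.eta;
u0 = (K*L/D)*(1 - exp(-eta*(1 - x/L)))/eta;
end
% ----------------------------------------------------
function [pl,ql,pr,qr] = transistorBC(xl,ul,xr,ur,t) % Boundary conditions
pl = ul;
ql = 0;
pr = ur;
qr = 0;
end
% ----------------------------------------------------
function It = serex3(t,C) % Approximate I(t) by series expansion.
Ip = C.Ip;
eta = C.eta;
D = C.D;
L = C.L;
It = 0;
for n = 1:40 % Use 40 terms
m = (n*pi)^2 + 0.25*eta^2;
It = It + ((n*pi)^2 / m)* exp(-(D/L^2)*m*t);
end
It = 2*Ip*((1 - exp(-eta))/eta)*It;
end
% ----------------------------------------------------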
% ----------------------------------------------------
References
[1] Zachmanoglou, E.C. and D.L. Thoe. Introduction to Partial Differential Equations with
Applications. Dover, New York, 1986.
See Also
pdepe
More About
• “Solving Partial Differential Equations” on page 13-2
• “Solve System of PDEs” on page 13-32
Solve System of PDEs
This example shows how to formulate, compute, and plot the solution to a system of two partial
differential equations.
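Consider the system of PDEs
$$\frac{\partial u_1}{\partial t} = 0.024\frac{\partial^2 u_1}{\partial x^2} - F(u_1 - u_2),$$
$$\frac{\partial u_2}{\partial t} = 0.170\frac{\partial^2 u_2}{\partial x^2} + F(u_1 - u_2),$$
where $F(y) = e^{5.73y} - e^{-11.47y}$.
The equations hold on the interval 0 ≤ x ≤ 1 for times t ≥ 0. The initial conditions are
$$u_1(x,0) = 1,$$
$$u_2(x,0) = 0.$$
The boundary conditions are
$$\frac{\partial}{\partial x}u_1(0,t) = 0,$$
$$u_2(0,t) = 0,$$
$$\frac{\partial}{\partial x}u_2(1,t) = 0,$$
$$u_1(1,t) = 1.$$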
To solve this equation in MATLAB®, you need to code the equation, the initial conditions, and the
boundary conditions, then select a suitable solution mesh before calling the solver pdepe. You either
can include the required functions as local functions at the end of a file (as done here), or save them
as separate, named files in a directory on the MATLAB path.
Code Equation
Before you can code the equation, you need to make sure that it is in the form that the pdepe solver
expects:
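$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right)\right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$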
In this form, the PDE coefficients are matrix-valued and the equation becomes
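$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \frac{\partial}{\partial t}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \frac{\partial}{\partial x}\begin{bmatrix} 0.024\,\dfrac{\partial u_1}{\partial x} \\ 0.170\,\dfrac{\partial u_2}{\partial x} \end{bmatrix} + \begin{bmatrix} -F(u_1 - u_2) \\ F(u_1 - u_2) \end{bmatrix}.$$
So the values of the coefficients in the equation are
$$m = 0$$
$$c\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \text{ (diagonal values only)}$$
$$f\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} 0.024\,\dfrac{\partial u_1}{\partial x} \\ 0.170\,\dfrac{\partial u_2}{\partial x} \end{bmatrix}$$
$$s\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} -F(u_1 - u_2) \\ F(u_1 - u_2) \end{bmatrix}$$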
Now you can create a function to code the equation. The function should have the signature [c,f,s]
= pdefun(x,t,u,dudx):
function [c,f,s] = pdefun(x,t,u,dudx)
c = [1; 1];
f = [0.024; 0.17] .* dudx;
y = u(1) - u(2);
F = exp(5.73*y)-exp(-11.47*y);
s = [-F; F];
end
(Note: All functions are included as local functions at the end of the example.)
Next, write a function that returns the initial condition. The initial condition is applied at the first
time value and provides the value of u(x, t0) for any value of x. The number of initial conditions must
equal the number of equations, so for this problem there are two initial conditions. Use the function
signature u0 = pdeic(x) to write the function.
$$u_1(x,0) = 1,$$
$$u_2(x,0) = 0.$$
function u0 = pdeic(x)
u0 = [1; 0];
end
For problems posed on the interval a ≤ x ≤ b, the boundary conditions apply for all t and either x = a
or x = b. The standard form for the boundary conditions expected by the solver is
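$$p(x,t,u) + q(x,t)\, f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
Written in this form, the boundary conditions for the partial derivatives of u need to be expressed in
terms of the flux $f\left(x,t,u,\frac{\partial u}{\partial x}\right)$. So the boundary conditions for this problem are
For x = 0:
$$p_L(x,t,u) = \begin{bmatrix} 0 \\ u_2 \end{bmatrix}, \qquad q_L(x,t) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
For x = 1:
$$p_R(x,t,u) = \begin{bmatrix} u_1 - 1 \\ 0 \end{bmatrix}, \qquad q_R(x,t) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$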
The solution to this problem changes rapidly when t is small. Although pdepe selects a time step that
is appropriate to resolve the sharp changes, to see the behavior in the output plots you need to select
appropriate output times. For the spatial mesh, there are boundary layers in the solution at both ends
of 0 ≤ x ≤ 1, so you need to specify mesh points there to resolve the sharp changes.
x = [0 0.005 0.01 0.05 0.1 0.2 0.5 0.7 0.9 0.95 0.99 0.995 1];
t = [0 0.005 0.01 0.05 0.1 0.5 1 1.5 2];
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial conditions, the
boundary conditions, and the meshes for x and t.
m = 0;
sol = pdepe(m,@pdefun,@pdeic,@pdebc,x,t);
pdepe returns the solution in a 3-D array sol, where sol(i,j,k) approximates the kth component
of the solution $u_k$ evaluated at t(i) and x(j). Extract each solution component into a separate
variable.
u1 = sol(:,:,1);
u2 = sol(:,:,2);
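To make the sol(i,j,k) indexing concrete, the following sketch repeats the solve in self-contained form and then extracts one spatial profile and one time history. Writing the three callbacks as anonymous functions with deal is only a convenience so that the snippet runs on its own; it is not how this example defines them. The names Ffun, u1_final, and u2_mid are illustrative.

```matlab
% Meshes used in this example
x = [0 0.005 0.01 0.05 0.1 0.2 0.5 0.7 0.9 0.95 0.99 0.995 1];
t = [0 0.005 0.01 0.05 0.1 0.5 1 1.5 2];

% Same coefficients as pdefun, pdeic, and pdebc, packed with deal
Ffun   = @(y) exp(5.73*y) - exp(-11.47*y);
pdefun = @(x,t,u,dudx) deal([1; 1], [0.024; 0.17].*dudx, ...
                            [-1; 1]*Ffun(u(1) - u(2)));
pdeic  = @(x) [1; 0];
pdebc  = @(xl,ul,xr,ur,t) deal([0; ul(2)], [1; 0], [ur(1)-1; 0], [0; 1]);

sol = pdepe(0,pdefun,pdeic,pdebc,x,t);   % numel(t)-by-numel(x)-by-2 array

u1_final = sol(end,:,1);   % u1 at every mesh point x(j) at the last time t(end)
u2_mid   = sol(:,7,2);     % u2 at x(7) = 0.5 for every output time t(i)
```

The first index always selects a time, the second a mesh point, and the third a solution component, so a row of sol(:,:,k) is a spatial profile and a column is a time history.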
Plot Solution
Create surface plots of the solutions for u1 and u2 plotted at the selected mesh points for x and t.
surf(x,t,u1)
title('u_1(x,t)')
xlabel('Distance x')
ylabel('Time t')
surf(x,t,u2)
title('u_2(x,t)')
xlabel('Distance x')
ylabel('Time t')
Local Functions
Listed here are the local helper functions that the PDE solver pdepe calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function [c,f,s] = pdefun(x,t,u,dudx) % Equation to solve
c = [1; 1];
f = [0.024; 0.17] .* dudx;
y = u(1) - u(2);
F = exp(5.73*y)-exp(-11.47*y);
s = [-F; F];
end
% ---------------------------------------------
function u0 = pdeic(x) % Initial Conditions
u0 = [1; 0];
end
% ---------------------------------------------
function [pl,ql,pr,qr] = pdebc(xl,ul,xr,ur,t) % Boundary Conditions
pl = [0; ul(2)];
ql = [1; 0];
pr = [ur(1)-1; 0];
qr = [0; 1];
end
% ---------------------------------------------
See Also
pdepe
More About
• “Solving Partial Differential Equations” on page 13-2
• “Solve Single PDE” on page 13-9
• “Solve System of PDEs with Initial Condition Step Functions” on page 13-39
Solve System of PDEs with Initial Condition Step Functions
This example shows how to solve a system of partial differential equations that uses step functions in
the initial conditions.
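$$\frac{\partial n}{\partial t} = \frac{\partial}{\partial x}\left[ d\,\frac{\partial n}{\partial x} - a\,n\,\frac{\partial c}{\partial x} \right] + S\,r\,n\,(N - n),$$
$$\frac{\partial c}{\partial t} = \frac{\partial^2 c}{\partial x^2} + S\left( \frac{n}{n+1} - c \right).$$
The equations involve the constant parameters d, a, S, r, and N, and are defined for 0 ≤ x ≤ 1 and
t ≥ 0. These equations arise in a mathematical model of the first steps of tumor-related angiogenesis
[1]. $n(x,t)$ represents the cell density of endothelial cells, and $c(x,t)$ the concentration of a protein
they release in response to the tumor.
This problem has a constant steady state at
$$\begin{bmatrix} n_0 \\ c_0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}.$$
However, a stability analysis predicts evolution of the system to an inhomogeneous solution [1]. So
step functions are used as the initial conditions to perturb the steady state and stimulate evolution of
the system.
The boundary conditions require that both solution components have zero flux at x = 0 and x = 1.
$$\frac{\partial}{\partial x} n(0,t) = \frac{\partial}{\partial x} n(1,t) = 0,$$
$$\frac{\partial}{\partial x} c(0,t) = \frac{\partial}{\partial x} c(1,t) = 0.$$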
To solve this system of equations in MATLAB®, you need to code the equations, initial conditions, and
boundary conditions, then select a suitable solution mesh before calling the solver pdepe. You either
can include the required functions as local functions at the end of a file (as done here), or save them
as separate, named files in a directory on the MATLAB path.
Code Equation
Before you can code the equation, you need to make sure that it is in the form that the pdepe solver
expects:
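$$c\left(x,t,u,\frac{\partial u}{\partial x}\right)\frac{\partial u}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left( x^{m} f\left(x,t,u,\frac{\partial u}{\partial x}\right) \right) + s\left(x,t,u,\frac{\partial u}{\partial x}\right).$$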
Since there are two equations in the system of PDEs, the system of PDEs can be rewritten as
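$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \frac{\partial}{\partial t} \begin{bmatrix} n \\ c \end{bmatrix} = \frac{\partial}{\partial x} \begin{bmatrix} d\,\frac{\partial n}{\partial x} - a\,n\,\frac{\partial c}{\partial x} \\ \frac{\partial c}{\partial x} \end{bmatrix} + \begin{bmatrix} S\,r\,n\,(N - n) \\ S\left( \frac{n}{n+1} - c \right) \end{bmatrix}.$$
So the values of the coefficients in the equation are
$$m = 0,$$
$$c\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{(diagonal values only)},$$
$$f\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} d\,\frac{\partial n}{\partial x} - a\,n\,\frac{\partial c}{\partial x} \\ \frac{\partial c}{\partial x} \end{bmatrix},$$
$$s\left(x,t,u,\frac{\partial u}{\partial x}\right) = \begin{bmatrix} S\,r\,n\,(N - n) \\ S\left( \frac{n}{n+1} - c \right) \end{bmatrix}.$$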
Now you can create a function to code the equation. The function should have the signature [c,f,s]
= angiopde(x,t,u,dudx):
function [c,f,s] = angiopde(x,t,u,dudx)
% Assign the constant parameters d, a, S, r, and N here.
c = [1; 1];
f = [d*dudx(1) - a*u(1)*dudx(2)
dudx(2)];
s = [S*r*u(1)*(N - u(1));
S*(u(1)/(u(1) + 1) - u(2))];
end
(Note: All functions are included as local functions at the end of the example.)
Next, write a function that returns the initial condition. The initial condition is applied at the first
time value and provides the values of $n(x,t_0)$ and $c(x,t_0)$ for any value of x. Use the function signature
u0 = angioic(x) to write the function.
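This problem has a constant steady state at
$$\begin{bmatrix} n_0 \\ c_0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}.$$
However, a stability analysis predicts evolution of the system to an inhomogeneous solution [1]. So,
step functions are used as the initial conditions to perturb the steady state and stimulate evolution of
the system. The initial condition starts from the steady state,
$$u(x,0) = \begin{bmatrix} n_0 \\ c_0 \end{bmatrix},$$
and on the subinterval 0.3 ≤ x ≤ 0.6 the steady-state values $u_1 = n_0$ and $u_2 = c_0$ are scaled:
$$u(x,0) = \begin{bmatrix} 1.05\,u_1 \\ 1.0005\,u_2 \end{bmatrix}, \qquad 0.3 \le x \le 0.6.$$
The boundary conditions require zero flux for both components at x = 0 and x = 1:
$$\frac{\partial}{\partial x} n(0,t) = \frac{\partial}{\partial x} n(1,t) = 0,$$
$$\frac{\partial}{\partial x} c(0,t) = \frac{\partial}{\partial x} c(1,t) = 0.$$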
For problems posed on the interval a ≤ x ≤ b, the boundary conditions apply for all t and either x = a
or x = b. The standard form for the boundary conditions expected by the solver is
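$$p(x,t,u) + q(x,t)\,f\left(x,t,u,\frac{\partial u}{\partial x}\right) = 0.$$
For this problem, the zero-flux conditions at x = 0 take the form
$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} d\,\frac{\partial n}{\partial x} - a\,n\,\frac{\partial c}{\partial x} \\ \frac{\partial c}{\partial x} \end{bmatrix} = 0,$$
so the coefficients are
• $p_L(x,t,u) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$,
• $q_L(x,t) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
For x = 1 the boundary conditions are the same, so $p_R(x,t,u) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and $q_R(x,t) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.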
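These coefficients translate directly into the boundary-condition function that the solver call in this example refers to as angiobc. The following is a minimal sketch, written as an anonymous function with deal so that it runs standalone. A local function with the signature [pl,ql,pr,qr] = angiobc(xl,ul,xr,ur,t) that assigns the same four vectors would be the more conventional form. The boundary values passed in the last line are placeholders used only to inspect the outputs.

```matlab
% Zero-flux boundary conditions in the form p + q.*f = 0 at both ends.
% With p = 0 and q = 1 for each component, the flux f itself must vanish.
angiobc = @(xl,ul,xr,ur,t) deal([0; 0], [1; 1], [0; 0], [1; 1]);

% Evaluate once with placeholder boundary values to inspect the outputs
[pl,ql,pr,qr] = angiobc(0,[1; 0.5],1,[1; 0.5],0);
```

Because neither p nor q depends on the solution here, the inputs ul and ur are unused.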
A long time interval is required to see the limiting behavior of the equations, so use 10 points in the
interval 0 ≤ t ≤ 200. Also, the limit distribution of $c(x,t)$ varies by only about 0.1% over the interval
0 ≤ x ≤ 1, so a relatively fine spatial mesh with 50 points is appropriate.
x = linspace(0,1,50);
t = linspace(0,200,10);
Solve Equation
Finally, solve the equation using the symmetry m, the PDE equation, the initial condition, the
boundary conditions, and the meshes for x and t.
m = 0;
sol = pdepe(m,@angiopde,@angioic,@angiobc,x,t);
pdepe returns the solution in a 3-D array sol, where sol(i,j,k) approximates the kth component
of the solution uk evaluated at t(i) and x(j). Extract the solution components into separate
variables.
n = sol(:,:,1);
c = sol(:,:,2);
Plot Solution
Create a surface plot of the solution components n and c plotted at the selected mesh points for x and
t.
surf(x,t,c)
title('c(x,t): Concentration of Fibronectin')
xlabel('Distance x')
ylabel('Time t')
surf(x,t,n)
title('n(x,t): Density of Endothelial Cells')
xlabel('Distance x')
ylabel('Time t')
Now plot just the final distributions of the solutions at tf = 200. These plots correspond to Figures 3
and 4 in [1].
plot(x,n(end,:))
title('Final distribution of n(x,t_f)')
plot(x,c(end,:))
title('Final distribution of c(x,t_f)')
References
[1] Humphreys, M.E. and M.A.J. Chaplain. "A mathematical model of the first steps of tumour-related
angiogenesis: Capillary sprout formation and secondary branching." IMA Journal of Mathematics
Applied in Medicine & Biology, 13 (1996), pp. 73-98.
Local Functions
Listed here are the local helper functions that the PDE solver pdepe calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
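function [c,f,s] = angiopde(x,t,u,dudx) % Equation to solve
% Assign the constant parameters d, a, S, r, and N here.
c = [1; 1];
f = [d*dudx(1) - a*u(1)*dudx(2)
dudx(2)];
s = [S*r*u(1)*(N - u(1));
S*(u(1)/(u(1) + 1) - u(2))];
end
% ---------------------------------------------
function u0 = angioic(x) % Initial Conditions
u0 = [1; 0.5];
if x >= 0.3 && x <= 0.6 % step functions on 0.3 <= x <= 0.6
u0(1) = 1.05*u0(1);
u0(2) = 1.0005*u0(2);
end
end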
See Also
pdepe
More About
• “Solving Partial Differential Equations” on page 13-2
• “Solve Single PDE” on page 13-9
• “Solve System of PDEs” on page 13-32
14
Delay differential equations (DDEs) are ordinary differential equations that relate the solution at the
current time to the solution at past times. This delay can be constant, time-dependent, state-
dependent, or derivative-dependent. In order for the integration to begin, you generally must provide
a solution history so that the solution is accessible to the solver for times before the initial integration
point.
Constant Delay DDEs
A system of differential equations with constant delays has the form
$$y'(t) = f\left(t, y(t), y(t - \tau_1), \ldots, y(t - \tau_k)\right).$$
Here, t is the independent variable, y is a column vector of dependent variables, and y′ represents
the first derivative of y with respect to t. The delays, τ1,…,τk, are positive constants.
The dde23 function solves DDEs with constant delays with history y(t) = S(t) for t < t0.
The solutions of DDEs are generally continuous, but they have discontinuities in their derivatives. The
dde23 function tracks discontinuities in low-order derivatives. It integrates the differential equations
with the same explicit Runge-Kutta (2,3) pair and interpolant used by ode23. The Runge-Kutta
formulas are implicit for step sizes bigger than the delays. When y(t) is smooth enough to justify steps
this big, the implicit formulas are evaluated by a predictor-corrector iteration.
Time-Dependent and State-Dependent DDEs
Time-dependent and state-dependent DDEs involve delays dy1,..., dyk that can depend on both time t
and state y. The delays dyj(t, y) must satisfy dyj(t, y) ≤ t on the interval [t0, tf] with t0 < tf.
The ddesd function finds the solution, y(t), for time-dependent and state-dependent DDEs with
history y(t) = S(t) for t < t0. The ddesd function integrates with the classic four-stage, fourth-order
explicit Runge-Kutta method, and it controls the size of the residual of a natural interpolant. It uses
iteration to take steps that are longer than the delays.
DDEs of Neutral Type
DDEs of neutral type involve delays in the first derivative y′ as well as in the solution y.
The delays in the solution must satisfy dyi(t,y) ≤ t. The delays in the first derivative must satisfy
dypj(t,y) < t so that y′ does not appear on both sides of the equation.
The ddensd function solves DDEs of neutral type by approximating them with DDEs of the form given
for time-dependent and state-dependent delays.
Solving Delay Differential Equations
Discontinuities in DDEs
If your problem has discontinuities, it is best to communicate them to the solver using an options
structure. To do this, use the ddeset function to create an options structure containing the
discontinuities in your problem.
There are three properties in the options structure that you can use to specify discontinuities;
InitialY, Jumps, and Events. The property you choose depends on the location and nature of the
discontinuities.
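A short sketch of how these properties are set with ddeset follows. The numeric values are placeholders chosen for illustration, and the variable names are hypothetical. The Events property takes a function handle to an event function and is not shown here.

```matlab
% Known discontinuity at t = 600, for example in an equation coefficient
optsJumps = ddeset('Jumps',600);

% Solution starts at a value that differs from the history at t0
optsInit = ddeset('InitialY',[1; 0; -1]);

% Several properties can be combined in one options structure
opts = ddeset('Jumps',[1 3],'InitialY',[1; 0; -1],'RelTol',1e-4);
```

The resulting structure is passed to the solver as its last input argument.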
14 Delay Differential Equations (DDEs)
Propagation of Discontinuities
Generally, the first derivative of the solution has a jump at the initial point. This is because the first
derivative of the history function, S(t), generally does not satisfy the DDE at this point. A
discontinuity in any derivative of y(t) propagates into the future at spacings of τ1,…, τk when the
delays are constant. If the delays are not constant, the propagation of discontinuities is more
complicated. For DDEs of the forms in “Constant Delay DDEs” on page 14-2 and “Time-Dependent
and State-Dependent DDEs” on page 14-2, the discontinuity appears in the next higher
order derivative each time it is propagated. In this sense, the solution gets smoother as the
integration proceeds. Solutions of neutral DDEs of the form given in “DDEs of Neutral Type” on page
14-2 are qualitatively different. The discontinuity in the solution does not propagate to a derivative of
higher order. In particular, the typical jump in y ′(t) at t0 propagates as jumps in y ′(t) throughout [t0,
tf].
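To browse and run the available example files, type
odeexamples
To open an individual example file for editing, type
edit exampleFileName.m
To run an example, type
exampleFileName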
This table contains a list of the available DDE example files, as well as the solvers and the options
they use.
References
[1] Shampine, L.F. “Dissipative Approximations to Neutral DDEs.” Applied Mathematics &
Computation, Vol. 203, 2008, pp. 641–648.
See Also
dde23 | ddesd | ddensd | ddeset
More About
• “DDE with Constant Delays” on page 14-6
• “DDE with State-Dependent Delays” on page 14-10
DDE with Constant Delays
This example shows how to use dde23 to solve a system of DDEs (delay differential equations) with
constant delays.
The system of equations is
$$y_1'(t) = y_1(t - 1),$$
$$y_2'(t) = y_1(t - 1) + y_2(t - 0.2),$$
$$y_3'(t) = y_2(t).$$
The time delays in the equations are only present in y terms, and the delays themselves are
constants, so the equations form a system of constant delay equations.
To solve this system of equations in MATLAB®, you need to code the equations, delays, and history
before calling the delay differential equation solver dde23, which is meant for systems with constant
delays. You either can include the required functions as local functions at the end of a file (as done
here), or save them as separate, named files in a directory on the MATLAB path.
Code Delays
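First, create a vector to define the delays in the system of equations. This system has two different
delays:
• A delay of 1 in the first component, $y_1(t - 1)$.
• A delay of 0.2 in the second component, $y_2(t - 0.2)$.
dde23 accepts a vector argument for the delays, where each element is the constant delay for one
component.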
lags = [1 0.2];
Code Equation
Now, create a function to code the equations. This function should have the signature dydt =
ddefun(t,y,Z), where:
• t is time (independent variable).
• y is the solution (dependent variable).
• Z(:,j) approximates the solution at the delayed time t − lags(j).
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equations. In this case:
• Z(:,1) → $y_1(t - 1)$
• Z(:,2) → $y_2(t - 0.2)$
function dydt = ddefun(t,y,Z)
ylag1 = Z(:,1);
ylag2 = Z(:,2);
dydt = [ylag1(1);
ylag1(1)+ylag2(2);
y(2)];
end
Note: All functions are included as local functions at the end of the example.
Next, create a function to define the solution history. The solution history is the solution for times
t ≤ t0.
function s = history(t)
s = ones(3,1);
end
Solve Equation
Finally, define the interval of integration $[t_0, t_f]$ and solve the DDE using the dde23 solver.
tspan = [0 5];
sol = dde23(@ddefun, lags, @history, tspan);
Plot Solution
The solution structure sol has the fields sol.x and sol.y that contain the internal time steps taken
by the solver and corresponding solutions at those times. (If you need the solution at specific points,
you can use deval to evaluate the solution at the specific points.)
plot(sol.x,sol.y,'-o')
xlabel('Time t');
ylabel('Solution y');
legend('y_1','y_2','y_3','Location','NorthWest');
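As a sketch of the deval route mentioned above, the following snippet solves the same system and evaluates the solution at evenly spaced query times. It restates the equations as an anonymous function so that it runs on its own; the names tq, yq, and y2q and the choice of 11 query points are illustrative.

```matlab
lags    = [1 0.2];
ddefun  = @(t,y,Z) [Z(1,1); Z(1,1) + Z(2,2); y(2)];   % same equations as ddefun
history = @(t) ones(3,1);
sol = dde23(ddefun,lags,history,[0 5]);

tq  = linspace(0,5,11);   % query times chosen for illustration
yq  = deval(sol,tq);      % one row per component, one column per query time
y2q = deval(sol,tq,2);    % only the second component
```

On 0 ≤ t ≤ 1 the first equation reduces to y1′ = 1 because the history is constant, so y1(1) should be close to 2, which gives a quick sanity check on yq.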
Local Functions
Listed here are the local helper functions that the DDE solver dde23 calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function dydt = ddefun(t,y,Z) % equation being solved
ylag1 = Z(:,1);
ylag2 = Z(:,2);
dydt = [ylag1(1);
ylag1(1)+ylag2(2);
y(2)];
end
%-------------------------------------------
function s = history(t) % history function for t <= 0
s = ones(3,1);
end
%-------------------------------------------
See Also
dde23 | ddesd | ddensd | deval
More About
• “Solving Delay Differential Equations” on page 14-2
DDE with State-Dependent Delays
This example shows how to use ddesd to solve a system of DDEs (delay differential equations) with
state-dependent delays. This system of DDEs was used as a test problem by Enright and Hayashi [1].
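The system of equations is
$$y_1'(t) = y_2(t),$$
$$y_2'(t) = -y_2\left(e^{1 - y_2(t)}\right) \cdot y_2(t)^2 \cdot e^{1 - y_2(t)}.$$
The analytical solution, which is also used as the history function, is
$$y_1(t) = \log(t),$$
$$y_2(t) = \frac{1}{t}.$$
The time delays in the equations are only present in y terms. The delays depend only on the state of
the second component $y_2(t)$, so the equations form a system of state-dependent delay equations.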
To solve this system of equations in MATLAB®, you need to code the equations, delays, and history
before calling the delay differential equation solver ddesd, which is meant for systems with state-
dependent delays. You either can include the required functions as local functions at the end of a file
(as done here), or save them as separate, named files in a directory on the MATLAB path.
Code Delays
First, write a function to define the time delays in the system. The only delay present in this system of
equations is in the term $-y_2\left(e^{1 - y_2(t)}\right)$.
function d = dely(t,y)
d = exp(1 - y(2));
end
Note: All functions are included as local functions at the end of the example.
Code Equation
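Now, create a function to code the equations. This function should have the signature dydt =
ddefun(t,y,Z), where:
• t is time (independent variable).
• y is the solution (dependent variable).
• Z(n,j) approximates the delayed solution $y_n(d(j))$, where d(j) is component j of the output of dely(t,y).
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equations. In this case:
• Z(2,1) → $y_2\left(e^{1 - y_2(t)}\right)$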
Next, create a function to define the solution history. The solution history is the solution for times
t ≤ t0.
Solve Equation
Finally, define the interval of integration $[t_0, t_f]$ and solve the DDE using the ddesd solver.
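The listings for these steps do not appear in this section, so the following is a minimal sketch assembled from the equations and the analytical solution stated for this problem. It uses anonymous functions so that it runs standalone, and it assumes the integration interval [0.1 5], which is inferred from the plotting commands rather than stated explicitly.

```matlab
dely    = @(t,y) exp(1 - y(2));                            % delayed time
ddefun  = @(t,y,Z) [y(2); -Z(2,1)*y(2)^2*exp(1 - y(2))];   % right-hand sides
history = @(t) [log(t); 1./t];                             % analytical solution

tspan = [0.1 5];   % assumed interval of integration
sol = ddesd(ddefun,dely,history,tspan);

yend = deval(sol,5);   % compare with [log(5); 1/5]
```

Because the history function is the analytical solution, evaluating it inside the integration interval gives a reference curve for checking the computed solution.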
Plot Solution
The solution structure sol has the fields sol.x and sol.y that contain the internal time steps taken
by the solver and corresponding solutions at those times. (If you need the solution at specific points,
you can use deval to evaluate the solution at the specific points.)
Plot the two solution components against time using the history function to calculate the analytical
solution within the integration interval for comparison.
ta = linspace(0.1,5);
ya = history(ta);
plot(ta,ya,sol.x,sol.y,'o')
legend('y_1 exact','y_2 exact','y_1 ddesd','y_2 ddesd')
xlabel('Time t')
ylabel('Solution y')
title('D1 Problem of Enright and Hayashi')
Local Functions
Listed here are the local helper functions that the DDE solver ddesd calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
References
[1] Enright, W.H. and H. Hayashi. “The Evaluation of Numerical Software for Delay Differential
Equations.” In Proceedings of the IFIP TC2/WG2.5 working conference on Quality of numerical
software: assessment and enhancement. (R.F. Boisvert, ed.). London, UK: Chapman & Hall, Ltd., pp.
179-193.
See Also
ddensd | ddesd | dde23 | deval
More About
• “Solving Delay Differential Equations” on page 14-2
• “Cardiovascular Model DDE with Discontinuities” on page 14-14
Cardiovascular Model DDE with Discontinuities
This example shows how to use dde23 to solve a cardiovascular model that has a discontinuous
derivative. The example was originally presented by Ottesen [1].
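The system of equations is
$$\dot{P}_a(t) = -\frac{1}{c_a R} P_a(t) + \frac{1}{c_a R} P_v(t) + \frac{1}{c_a} V_{str}\left(P_a^{\tau}(t)\right) H(t),$$
$$\dot{P}_v(t) = \frac{1}{c_v R} P_a(t) - \left( \frac{1}{c_v R} + \frac{1}{c_v r} \right) P_v(t),$$
$$\dot{H}(t) = \frac{\alpha_H T_s}{1 + \gamma_H T_p} - \beta_H T_p.$$
The terms for Ts and Tp are variations of the same equation with and without time delay. $P_a^{\tau}$ and $P_a$
represent the mean arterial pressure with and without time delay, respectively.
$$T_s = \frac{1}{1 + \left( \frac{P_a^{\tau}}{\alpha_s} \right)^{\beta_s}},$$
$$T_p = \frac{1}{1 + \left( \frac{P_a}{\alpha_p} \right)^{-\beta_p}}.$$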
• $\beta_0 = \beta_s = \beta_p = 7$
• $\beta_H = 1.17$
• $\gamma_H = 0$
The system is heavily influenced by peripheral pressure, which decreases exponentially from
R = 1.05 to R = 0.84 beginning at t = 600. As a result, the system has a discontinuity in a low-order
derivative at t = 600.
The constant solution history is defined in terms of the physical parameters:
$$P_a = P_0, \qquad P_v(t) = \frac{1}{1 + \frac{R}{r}} P_0, \qquad H(t) = \frac{1}{R V_{str}} \cdot \frac{1}{1 + \frac{r}{R}} P_0.$$
To solve this system of equations in MATLAB®, you need to code the equations, parameters, delays,
and history before calling the delay differential equation solver dde23, which is meant for systems
with constant time delays. You either can include the required functions as local functions at the end
of a file (as done here), or save them as separate, named files in a directory on the MATLAB path.
Code Delay
Next, create a variable tau to represent the constant time delay τ in the equations for the terms
$P_a^{\tau}(t) = P_a(t - \tau)$.
tau = 4;
Code Equations
Now, create a function to code the equations. This function should have the signature dydt =
ddefun(t,y,Z,p), where:
• t is time (independent variable).
• y is the solution (dependent variable).
• Z(n,j) approximates the delayed solution $y_n(t - \tau_j)$, where $\tau_j$ is the jth constant delay.
• p is a structure that holds the values of the parameters.
The first three inputs are automatically passed to the function by the solver, and the variable names
determine how you code the equations. The structure of parameters p is passed to the function when
you call the solver. In this case the delays are represented with:
• Z(:,1) → $P_a(t - \tau)$
Ts = 1 / ( 1 + (Patau / p.alphas)^p.betas );
Tp = 1 / ( 1 + (p.alphap / Paoft)^p.betap );
Note: All functions are included as local functions at the end of the example.
Next, create a vector to define the constant solution history for the three components Pa, Pv, and H.
The solution history is the solution for times t ≤ t0.
P0 = 93;
Paval = P0;
Pvval = (1 / (1 + p.R/p.r)) * P0;
Hval = (1 / (p.R * p.Vstr)) * (1 / (1 + p.r/p.R)) * P0;
history = [Paval; Pvval; Hval];
Solve Equation
Use ddeset to specify the presence of the discontinuity at t = 600. Finally, define the interval of
integration $[t_0, t_f]$ and solve the DDE using the dde23 solver. Specify ddefun using an anonymous
function to pass in the structure of parameters, p.
options = ddeset('Jumps',600);
tspan = [0 1000];
sol = dde23(@(t,y,Z) ddefun(t,y,Z,p), tau, history, tspan, options);
Plot Solution
The solution structure sol has the fields sol.x and sol.y that contain the internal time steps taken
by the solver and corresponding solutions at those times. (If you need the solution at specific points,
you can use deval to evaluate the solution at the specific points.)
plot(sol.x,sol.y(3,:)) % third solution component is H(t)
xlabel('Time t')
ylabel('H(t)')
Local Functions
Listed here are the local helper functions that the DDE solver dde23 calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function dydt = ddefun(t,y,Z,p) % equation being solved
if t <= 600
p.R = 1.05;
else
p.R = 0.21 * exp(600-t) + 0.84;
end
ylag = Z(:,1);
Patau = ylag(1);
Paoft = y(1);
Pvoft = y(2);
Hoft = y(3);
Ts = 1 / ( 1 + (Patau / p.alphas)^p.betas );
Tp = 1 / ( 1 + (p.alphap / Paoft)^p.betap );
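The listing above stops before the derivative vector is assembled. The following standalone sketch shows one way to finish it from the three model equations, treating V_str as the constant p.Vstr in the same way the history computation in this example does. The field names p.ca, p.cv, p.alphaH, p.betaH, and p.gammaH are assumed, and the values given for ca, cv, r, Vstr, alphas, alphap, and alphaH are placeholders for illustration only. The helper Rfun is a hypothetical compact form of the if/else branch on t.

```matlab
% Parameter structure. R, betas, betap, betaH, gammaH, and P0 follow the text;
% every other field name and value is an assumed placeholder.
p.ca = 1.55;    p.cv = 519;    p.r = 0.068;     p.Vstr = 67.9;
p.alphas = 93;  p.alphap = 93; p.alphaH = 0.84;
p.betas = 7;    p.betap = 7;   p.betaH = 1.17;  p.gammaH = 0;
p.R = 1.05;
P0 = 93;

% Peripheral term: 1.05 up to t = 600, then exponential decay toward 0.84
Rfun = @(t) 0.21*exp(-max(t - 600,0)) + 0.84;

% Baroreflex terms with and without delay
Ts = @(Patau) 1/(1 + (Patau/p.alphas)^p.betas);
Tp = @(Pa)    1/(1 + (p.alphap/Pa)^p.betap);

% Right-hand sides in the order [Pa; Pv; H]; Z(1,1) is Pa(t - tau)
ddefun = @(t,y,Z) [ ...
    -y(1)/(p.ca*Rfun(t)) + y(2)/(p.ca*Rfun(t)) + p.Vstr*y(3)/p.ca; ...
     y(1)/(p.cv*Rfun(t)) - (1/(p.cv*Rfun(t)) + 1/(p.cv*p.r))*y(2); ...
     p.alphaH*Ts(Z(1,1))/(1 + p.gammaH*Tp(y(1))) - p.betaH*Tp(y(1))];

% At the constant history the two pressure derivatives vanish
history = [P0; P0/(1 + p.R/p.r); P0/(p.R*p.Vstr)/(1 + p.r/p.R)];
dydt0 = ddefun(0,history,history);
```

Evaluating the right-hand side at the constant history is a cheap consistency check: the first two entries of dydt0 should be zero to rounding error, whatever positive values are used for the placeholder parameters.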
References
[1] Ottesen, J. T. “Modelling of the Baroreflex-Feedback Mechanism with Time-Delay.” J. Math. Biol.
Vol. 36, Number 1, 1997, pp. 41–63.
See Also
ddensd | ddesd | dde23 | deval
More About
• “Solving Delay Differential Equations” on page 14-2
• “DDE of Neutral Type” on page 14-19
DDE of Neutral Type
This example shows how to use ddensd to solve a neutral DDE (delay differential equation), where
delays appear in derivative terms. The problem was originally presented by Paul [1].
The equation is
$$y'(t) = 1 + y(t) - 2\,y\left(\frac{t}{2}\right)^2 - y'(t - \pi).$$
Since the equation has time delays in a y′ term, the equation is called a neutral DDE. If the time
delays were present only in y terms, the equation would be a constant-delay or state-dependent DDE,
depending on the form of the delays.
To solve this equation in MATLAB®, you need to code the equation, delays, and history before calling
the delay differential equation solver ddensd. You either can include these as local functions at the
end of a file (as done here), or save them as separate files in a directory on the MATLAB path.
Code Delays
First, write functions to define the delays in the equation. The first term in the equation with a delay
is $y\left(\frac{t}{2}\right)$.
function dy = dely(t,y)
dy = t/2;
end
The other term in the equation with a delay is $y'(t - \pi)$.
function dyp = delyp(t,y)
dyp = t - pi;
end
In this example, only one delay for y and one delay for y′ are present. If there were more delays, you
could add them in these same functions, so that the functions return vectors instead of scalars.
Note: All functions are included as local functions at the end of the example.
Code Equation
Now, create a function to code the equation. This function should have the signature yp =
ddefun(t,y,ydel,ypdel), where:
• t is time (independent variable).
• y is the solution (dependent variable).
• ydel contains the delays for y.
• ypdel contains the delays for y′.
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equation. In this case:
• ydel → $y\left(\frac{t}{2}\right)$
• ypdel → $y'(t - \pi)$
function yp = ddefun(t,y,ydel,ypdel)
yp = 1 + y - 2*ydel^2 - ypdel;
end
Next, create a function to define the solution history. The solution history is the solution for times
t ≤ t0.
function y = history(t)
y = cos(t);
end
Solve Equation
Finally, define the interval of integration $[t_0, t_f]$ and solve the DDE using the ddensd solver.
tspan = [0 pi];
sol = ddensd(@ddefun, @dely, @delyp, @history, tspan);
Plot Solution
The solution structure sol has the fields sol.x and sol.y that contain the internal time steps taken
by the solver and corresponding solutions at those times. If you need the solution at specific points,
use deval to evaluate it there.
tn = linspace(0,pi,20);
yn = deval(sol,tn);
Plot the calculated solution and history against the analytical solution.
th = linspace(-pi,0);
yh = history(th);
ta = linspace(0,pi);
ya = cos(ta);
plot(th,yh,tn,yn,'o',ta,ya)
legend('History','Numerical','Analytical','Location','NorthWest')
xlabel('Time t')
ylabel('Solution y')
title('Example of Paul with 1 Equation and 2 Delay Functions')
axis([-3.5 3.5 -1.5 1.5])
Local Functions
Listed here are the local helper functions that the DDE solver ddensd calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
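The listing itself is missing from this section. As a stand-in, the following sketch collects the pieces defined in this example as anonymous functions so that it runs on its own. The delay for y′ is taken to be t − π, matching the delayed term y′(t − π) in the equation. The variable yend is illustrative.

```matlab
dely    = @(t,y) t/2;        % delayed time for y
delyp   = @(t,y) t - pi;     % delayed time for y'
ddefun  = @(t,y,ydel,ypdel) 1 + y - 2*ydel^2 - ypdel;
history = @(t) cos(t);       % history for t <= 0

tspan = [0 pi];
sol = ddensd(ddefun,dely,delyp,history,tspan);

yend = deval(sol,pi);        % the analytical solution gives cos(pi) = -1
```

Substituting y = cos(t) into the equation shows that it is the exact solution, so the value at the end of the interval provides a simple accuracy check.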
References
[1] Paul, C.A.H. “A Test Set of Functional Differential Equations.” Numerical Analysis Reports. No.
243. Manchester, UK: Math Department, University of Manchester, 1994.
See Also
ddensd | ddesd | dde23 | deval
More About
• “Solving Delay Differential Equations” on page 14-2
• “Initial Value DDE of Neutral Type” on page 14-23
Initial Value DDE of Neutral Type
This example shows how to use ddensd to solve a system of initial value DDEs (delay differential
equations) with time-dependent delays. The example was originally presented by Jackiewicz [1].
The equation is
$$y'(t) = 2\cos(2t)\,y\left(\frac{t}{2}\right)^{2\cos t} + \log\left(y'\left(\frac{t}{2}\right)\right) - \log(2\cos t) - \sin t.$$
This equation is an initial value DDE because the time delays are zero at t0. Therefore, a solution
history is unnecessary to calculate a solution; only the initial values are needed:
$$y(0) = 1,$$
$$y'(0) = s.$$
Here s is a solution of $s = 2 + \log s - \log 2$, which follows from evaluating the equation at t = 0.
The values of s that satisfy this equation are s1 = 2 and s2 = 0.4063757399599599.
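The two values of s can be checked numerically. Evaluating the DDE at t = 0 with y(0) = 1 gives s = 2 + log(s) − log(2). The sketch below finds both roots of the corresponding residual with fzero. The function name g and the brackets are illustrative choices; each bracket was picked so that the residual changes sign across it.

```matlab
g  = @(s) 2 + log(s) - log(2) - s;   % residual of s = 2 + log(s) - log(2)
s1 = fzero(g,[1.5 3]);               % bracket around the larger root
s2 = fzero(g,[0.1 1]);               % bracket around the smaller root
```

Both roots are valid initial slopes, which is why the example solves the problem twice.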
Since the time delays in the equations are present in a y′ term, this equation is called a neutral DDE.
To solve this equation in MATLAB®, you need to code the equation and delays before calling the
delay differential equation solver ddensd, which is the solver for neutral equations. You either can
include the required functions as local functions at the end of a file (as done here), or save them as
separate files in a directory on the MATLAB path.
Code Delays
First, write an anonymous function to define the delays in the equation. Since both y and y′ have
delays of the form $\frac{t}{2}$, only one function definition is required. This delay function is later passed to the
solver twice, once to indicate the delay for y and once for y′.
delay = @(t,y) t/2;
Code Equation
Now, create a function to code the equation. This function should have the signature yp =
ddefun(t,y,ydel,ypdel), where:
• t is time (independent variable).
• y is the solution (dependent variable).
• ydel contains the delays for y.
• ypdel contains the delays for y′.
These inputs are automatically passed to the function by the solver, but the variable names determine
how you code the equation. In this case:
• ydel → $y\left(\frac{t}{2}\right)$
• ypdel → $y'\left(\frac{t}{2}\right)$
function yp = ddefun(t,y,ydel,ypdel)
yp = 2*cos(2*t)*ydel^(2*cos(t)) + log(ypdel) - log(2*cos(t)) - sin(t);
end
Note: All functions are included as local functions at the end of the example.
Solve Equation
Finally, define the interval of integration $[t_0, t_f]$ and the initial values, and then solve the DDE using
the ddensd solver. Pass the initial values to the solver by specifying them in a cell array in the fourth
input argument.
tspan = [0 0.1];
y0 = 1;
s1 = 2;
sol1 = ddensd(@ddefun, delay, delay, {y0,s1}, tspan);
Solve the equation a second time, this time using the alternate value of s for the initial condition.
s2 = 0.4063757399599599;
sol2 = ddensd(@ddefun, delay, delay, {y0,s2}, tspan);
Plot Solution
The solution structures sol1 and sol2 have the fields x and y that contain the internal time steps
taken by the solver and corresponding solutions at those times. If you need the solution at specific
points, use deval to evaluate it there.
plot(sol1.x,sol1.y,sol2.x,sol2.y);
legend('y''(0) = 2','y''(0) = .40637..','Location','NorthWest');
xlabel('Time t');
ylabel('Solution y');
title('Two Solutions of Jackiewicz''s Initial-Value NDDE');
Local Functions
Listed here are the local helper functions that the DDE solver ddensd calls to calculate the solution.
Alternatively, you can save these functions as their own files in a directory on the MATLAB path.
function yp = ddefun(t,y,ydel,ypdel)
yp = 2*cos(2*t)*ydel^(2*cos(t)) + log(ypdel) - log(2*cos(t)) - sin(t);
end
References
[1] Jackiewicz, Z. “One step Methods of any Order for Neutral Functional Differential Equations.”
SIAM Journal on Numerical Analysis. Vol. 21, Number 3. 1984. pp. 486–511.
See Also
ddensd | ddesd | dde23 | deval
More About
• “Solving Delay Differential Equations” on page 14-2
• “DDE of Neutral Type” on page 14-19
15
Numerical Integration
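Consider the curve parameterized by the equations
$$x(t) = \sin(2t), \qquad y(t) = \cos(t), \qquad z(t) = t,$$
where t ∊ [0,3π].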
t = 0:0.1:3*pi;
plot3(sin(2*t),cos(t),t)
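The arc length formula says the length of the curve is the integral of the norm of the derivatives of
the parameterized equations.
$$\int_0^{3\pi} \sqrt{4\cos^2(2t) + \sin^2(t) + 1}\;dt.$$
Define the integrand as an anonymous function, and then integrate it with integral.
f = @(t) sqrt(4*cos(2*t).^2 + sin(t).^2 + 1);
len = integral(f,0,3*pi)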
len =
17.2220
See Also
integral
More About
• “Create Function Handle”
• “Singularity on Interior of Integration Domain” on page 15-5
• “Integration of Numeric Data” on page 15-8
Complex Line Integrals
This example shows how to calculate complex line integrals using the 'Waypoints' option of the
integral function. In MATLAB®, you use the 'Waypoints' option to define a sequence of straight
line paths from the first limit of integration to the first waypoint, from the first waypoint to the
second, and so forth, and finally from the last waypoint to the second limit of integration.
Integrate
$$\oint_C \frac{e^z}{z}\,dz$$
where C is a closed contour that encloses the simple pole of $e^z/z$ at the origin.
You can evaluate contour integrals of complex-valued functions with a parameterization. In general, a
contour is specified, and then differentiated and used to parameterize the original integrand. In this
case, specify the contour as the unit circle, but in all cases, the result is independent of the contour
chosen.
fun = @(z) exp(z)./z; % integrand e^z/z
g = @(theta) cos(theta) + 1i*sin(theta);
gprime = @(theta) -sin(theta) + 1i*cos(theta);
q1 = integral(@(t) fun(g(t)).*gprime(t),0,2*pi)
q1 = -0.0000 + 6.2832i
This method of parameterizing, although reliable, can be difficult and time consuming since a
derivative must be calculated before the integration is performed. Even for simple functions, you
need to write several lines of code to obtain the correct result. Since the result is the same with any
closed contour that encloses the pole (in this case, the origin), instead you can use the 'Waypoints'
option of integral to construct a square or triangular path that encloses the pole.
If any limit of integration or element of the waypoints vector is complex, then integral performs the
integration over a sequence of straight line paths in the complex plane. The natural direction around
a contour is counterclockwise; specifying a clockwise contour is akin to multiplying by -1. Specify the
contour in such a way that it encloses a single functional singularity. If you specify a contour that
encloses no poles, then Cauchy's integral theorem guarantees that the value of the closed-loop
integral is zero.
To see this, integrate fun around a square contour away from the origin. Use equal limits of
integration to form a closed contour.
C = [2+i 2+2i 1+2i];
q = integral(fun,1+i,1+i,'Waypoints',C)
q = -3.6082e-16 + 6.6613e-16i
Specify a square contour that completely encloses the pole at the origin, and then integrate.
q2 = 0.0000 + 6.2832i
This result agrees with the q1 calculated above, but uses much simpler code.
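The commands that produce q2 do not appear above. The following is a minimal sketch, assuming the square with corners at ±1 ± i traversed counterclockwise; any closed path that encloses the origin once in that direction should give the same value. The variable err is illustrative.

```matlab
fun = @(z) exp(z)./z;

% Counterclockwise square: start at 1-1i, pass through 1+1i, -1+1i, -1-1i,
% and return to 1-1i by using equal limits of integration
C  = [1+1i -1+1i -1-1i];
q2 = integral(fun,1-1i,1-1i,'Waypoints',C);

err = abs(q2 - 2*pi*1i);   % distance from the residue-theorem value
```

The residue of e^z/z at the origin is 1, so the residue theorem gives 2πi for the closed-contour integral, which is the value that q1 and q2 approximate.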
2*pi*i
See Also
integral
More About
• “Create Function Handle”
• “Singularity on Interior of Integration Domain” on page 15-5
• “Integration of Numeric Data” on page 15-8
Singularity on Interior of Integration Domain
This example shows how to split the integration domain to place a singularity on the boundary.
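The integrand of the complex-valued integral
$$\int_{-1}^{1}\int_{-1}^{1} \frac{1}{\sqrt{x+y}}\,dx\,dy$$
has a singularity when x = y = 0 and is, in general, singular on the line y = -x.
Define this integrand with an anonymous function, and then integrate over the square -1 ≤ x ≤ 1, -1 ≤ y ≤ 1.
fun = @(x,y) (x+y).^(-1/2);
format long
q = integral2(fun,-1,1,-1,1)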
q =
NaN + NaNi
If there are singular values in the interior of the integration region, the integration fails to converge
and returns a warning.
You can redefine the integral by splitting the integration domain into complementary pieces and
adding the smaller integrations together. Avoid integration errors and warnings by placing
singularities on the boundary of the domain. In this case, you can split the square integration region
into two triangles along the singular line y = -x and add the results.
q1 = integral2(fun,-1,1,-1,@(x)-x);
q2 = integral2(fun,-1,1,@(x)-x,1);
q = q1 + q2
q =
3.771236166328258 - 3.771236166328255i
The integration succeeds when the singular values are on the boundary.
The exact value of this integral is $\frac{8\sqrt{2}}{3}(1-i)$.
8/3*sqrt(2)*(1-i)
ans =
3.771236166328253 - 3.771236166328253i
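The splitting strategy can be reproduced in a self-contained form. This sketch assumes the integrand is `(x+y).^(-1/2)`, which is consistent with the exact value quoted above; the handle `fun` is otherwise not defined in this section.

```matlab
% Assumed integrand: singular along the line y = -x.
fun = @(x,y) (x + y).^(-1/2);

% Lower-left triangle: y runs from -1 up to the singular line y = -x.
q1 = integral2(fun, -1, 1, -1, @(x) -x);
% Upper-right triangle: y runs from the singular line y = -x up to 1.
q2 = integral2(fun, -1, 1, @(x) -x, 1);

q = q1 + q2;
exact = 8/3*sqrt(2)*(1 - 1i);
absErr = abs(q - exact);
```

The imaginary part arises because the integrand takes the square root of a negative number on the lower-left triangle.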
See Also
integral | integral2 | integral3
More About
• “Create Function Handle”
• “Complex Line Integrals” on page 15-3
• “Integration of Numeric Data” on page 15-8
Analytic Solution to Integral of Polynomial
This example shows how to use the polyint function to integrate polynomial expressions
analytically. Use this function to evaluate indefinite integral expressions of polynomials.
$$\int \left(4x^5 - 2x^3 + x + 4\right)dx = \frac{2}{3}x^6 - \frac{1}{2}x^4 + \frac{1}{2}x^2 + 4x + k$$
where k is the constant of integration. Since the limits of integration are unspecified, the integral
function family is not well-suited to solving this problem.
Create a vector whose elements represent the coefficients for each descending power of x.
p = [4 0 -2 0 1 4];
Integrate the polynomial analytically using the polyint function. Specify the constant of integration
with the second input argument.
k = 2;
I = polyint(p,k)
I = 1×7
The output is a vector of coefficients for descending powers of x. This result matches the analytic
solution above, but has a constant of integration k = 2.
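A definite integral of the same polynomial follows from evaluating the antiderivative at the limits. The sketch below is illustrative only; the limits `a` and `b` are hypothetical, and the cross-check with `integral` is an addition that is not part of the original example.

```matlab
p = [4 0 -2 0 1 4];          % 4x^5 - 2x^3 + x + 4
I = polyint(p);              % antiderivative with constant of integration 0

% Definite integral over [a,b] from the fundamental theorem of calculus.
a = 0;
b = 1;
defInt = polyval(I,b) - polyval(I,a);

% Cross-check against adaptive quadrature of the same polynomial.
numInt = integral(@(x) polyval(p,x), a, b);
```

The constant of integration cancels in the difference, so its value does not matter here.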
See Also
polyint | polyval
More About
• “Singularity on Interior of Integration Domain” on page 15-5
• “Integration of Numeric Data” on page 15-8
Integration of Numeric Data
This example shows how to integrate a set of discrete velocity data numerically to approximate the
distance traveled. The integral family only accepts function handles as inputs, so those functions
cannot be used with discrete data sets. Use trapz or cumtrapz when a functional expression is not
available for integration.
This data represents the velocity of an automobile (in m/s) taken at 1 s intervals over 24 s.
Plot the velocity data points and connect each point with a straight line.
figure
plot(time,vel,'-*')
grid on
title('Automobile Velocity')
xlabel('Time (s)')
ylabel('Velocity (m/s)')
The slope is positive during periods of acceleration, zero during periods of constant velocity, and
negative during periods of deceleration. At time t = 0, the vehicle is at rest with vel(1) = 0 m/s.
The vehicle accelerates until reaching a maximum velocity at t = 8 s of vel(9) = 29.05 m/s and
maintains this velocity for 4 s. It then decelerates to vel(14) = 17.9 m/s for 3 s and eventually
back down to rest. Since this velocity curve has multiple discontinuities, a single continuous function
cannot describe it.
trapz performs discrete integration by using the data points to create trapezoids, so it is well suited
to handling data sets with discontinuities. This method assumes linear behavior between the data
points, and accuracy may be reduced when the behavior between data points is nonlinear. To
illustrate, you can draw trapezoids onto the graph using the data points as vertices.
trapz calculates the area under a set of discrete data by breaking the region into trapezoids. The
function then adds the area of each trapezoid to compute the total area.
Calculate the total distance traveled by the automobile (corresponding to the shaded area) by
integrating the velocity data numerically using trapz. By default, the spacing between points is
assumed to be 1 if you use the syntax trapz(Y). However, you can specify a different uniform or
nonuniform spacing X with the syntax trapz(X,Y). In this case, the spacing between readings in the
time vector is 1, so it is acceptable to use the default spacing.
distance = trapz(vel)
distance = 345.2200
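When the readings are not evenly spaced, the two syntaxes give different answers. The following sketch uses a small hypothetical data set (not the automobile data) to show the difference and to spell out the trapezoid sum that `trapz` evaluates.

```matlab
% Hypothetical readings taken at nonuniform times.
t = [0 1 2 4 7];        % time (s)
v = [0 2 4 4 1];        % velocity (m/s)

dUnit = trapz(v);       % assumes unit spacing between samples
dTrue = trapz(t,v);     % uses the actual sample times

% Same result from the trapezoid formula written out by hand.
dManual = sum(diff(t).*(v(1:end-1) + v(2:end))/2);
```

Here `dUnit` underestimates the distance because it treats the 2 s and 3 s gaps as 1 s gaps.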
The cumtrapz function is closely related to trapz. While trapz returns only the final integration
value, cumtrapz also returns intermediate values in a vector.
cdistance = cumtrapz(vel);
T = table(time',cdistance','VariableNames',{'Time','CumulativeDistance'})
T=25×2 table
Time CumulativeDistance
____ __________________
0 0
1 0.225
2 1.345
3 4.25
4 9.835
5 19
6 32.635
7 51.63
8 77.105
9 106.15
10 135.2
11 164.25
12 193.31
13 219.04
14 239.2
15 257.1
⋮
plot(cdistance)
title('Cumulative Distance Traveled Per Second')
xlabel('Time (s)')
ylabel('Distance (m)')
See Also
trapz | cumtrapz | integral
More About
• “Singularity on Interior of Integration Domain” on page 15-5
• “Analytic Solution to Integral of Polynomial” on page 15-7
Calculate Tangent Plane to Surface
This example shows how to approximate gradients of a function by finite differences. It then shows
how to plot a tangent plane to a point on the surface by using these approximated gradients.
Approximate the partial derivatives of f (x, y) with respect to x and y by using the gradient function.
Choose a finite difference length that is the same as the mesh size.
[xx,yy] = meshgrid(-5:0.25:5);
[fx,fy] = gradient(f(xx,yy),0.25);
The tangent plane to a point on the surface, P = (x0, y0, f(x0, y0)), is given by
$$z = f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,(x-x_0) + \frac{\partial f}{\partial y}(x_0,y_0)\,(y-y_0).$$
The fx and fy matrices are approximations to the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$. The point of interest
in this example, where the tangent plane meets the functional surface, is (x0,y0) = (1,2). The
function value at this point of interest is f(1,2) = 5.
To approximate the tangent plane z you need to find the value of the derivatives at the point of
interest. Obtain the index of that point, and find the approximate derivatives there.
x0 = 1;
y0 = 2;
t = (xx == x0) & (yy == y0);
indt = find(t);
fx0 = fx(indt);
fy0 = fy(indt);
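The steps so far can be collected into one runnable sketch. It assumes the surface is f(x,y) = x^2 + y^2, which is consistent with f(1,2) = 5 but is not stated in this section, and it defines the tangent plane as a function handle named `z` to match the plotting code that follows.

```matlab
f = @(x,y) x.^2 + y.^2;                 % assumed surface; f(1,2) = 5
[xx,yy] = meshgrid(-5:0.25:5);
[fx,fy] = gradient(f(xx,yy),0.25);      % finite-difference partial derivatives

x0 = 1;
y0 = 2;
indt = find((xx == x0) & (yy == y0));   % linear index of the grid point (x0,y0)
fx0 = fx(indt);
fy0 = fy(indt);

% Tangent plane through P = (x0, y0, f(x0,y0)).
z = @(x,y) f(x0,y0) + fx0*(x - x0) + fy0*(y - y0);
```

The exact comparison `xx == x0` works here because the grid spacing 0.25 is exactly representable in binary floating point; for other spacings, a tolerance-based comparison is safer.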
Plot the original function f (x, y), the point P, and a piece of plane z that is tangent to the function at
P.
surf(xx,yy,f(xx,yy),'EdgeAlpha',0.7,'FaceAlpha',0.9)
hold on
surf(xx,yy,z(xx,yy))
plot3(1,2,f(1,2),'r*')
view(-135,9)
See Also
gradient
More About
• “Create Function Handle”
16
Fourier Transforms
Fourier Transforms
The Fourier transform is a mathematical formula that transforms a signal sampled in time or space to
the same signal sampled in temporal or spatial frequency. In signal processing, the Fourier transform
can reveal important characteristics of a signal, namely, its frequency components.
The Fourier transform is defined for a vector x with n uniformly sampled points by
$$y_{k+1} = \sum_{j=0}^{n-1} \omega^{jk}\, x_{j+1}.$$
$\omega = e^{-2\pi i/n}$ is one of the n complex roots of unity, where i is the imaginary unit. For x and y, the
indices j and k range from 0 to n − 1.
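The definition can be checked directly on a small vector. The sketch below builds the matrix of powers of ω and compares the resulting sum with the output of `fft`; the test vector and variable names are illustrative.

```matlab
n = 8;
x = (1:n).';                    % small test vector
j = 0:n-1;                      % input index (row)
k = (0:n-1).';                  % output index (column)
omega = exp(-2i*pi/n);          % root of unity from the definition

W = omega.^(k*j);               % n-by-n matrix with entries omega^(j*k)
yDirect = W*x;                  % direct evaluation of the sum
yFast = fft(x);                 % same transform via the FFT algorithm
maxDiff = max(abs(yDirect - yFast));
```

The direct product costs on the order of n^2 operations, which is why it is only practical for small n.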
The fft function in MATLAB® uses a fast Fourier transform algorithm to compute the Fourier
transform of data. Consider a sinusoidal signal x that is a function of time t with frequency
components of 15 Hz and 20 Hz. Use a time vector sampled in increments of 1/50 seconds over a
period of 10 seconds.
Ts = 1/50;
t = 0:Ts:10-Ts;
x = sin(2*pi*15*t) + sin(2*pi*20*t);
plot(t,x)
xlabel('Time (seconds)')
ylabel('Amplitude')
Compute the Fourier transform of the signal, and create the vector f that corresponds to the signal's
sampling in frequency space.
y = fft(x);
fs = 1/Ts;
f = (0:length(y)-1)*fs/length(y);
When you plot the magnitude of the signal as a function of frequency, the spikes in magnitude
correspond to the signal's frequency components of 15 Hz and 20 Hz.
plot(f,abs(y))
xlabel('Frequency (Hz)')
ylabel('Magnitude')
title('Magnitude')
The transform also produces a mirror copy of the spikes, which correspond to the signal's negative
frequencies. To better visualize this periodicity, you can use the fftshift function, which performs a
zero-centered, circular shift on the transform.
n = length(x);
fshift = (-n/2:n/2-1)*(fs/n);
yshift = fftshift(y);
plot(fshift,abs(yshift))
xlabel('Frequency (Hz)')
ylabel('Magnitude')
Noisy Signals
In scientific applications, signals are often corrupted with random noise, disguising their frequency
components. The Fourier transform can process out random noise and reveal the frequencies. For
example, create a new signal, xnoise, by injecting Gaussian noise into the original signal, x.
rng('default')
xnoise = x + 2.5*randn(size(t));
Signal power as a function of frequency is a common metric used in signal processing. Power is the
squared magnitude of a signal's Fourier transform, normalized by the number of frequency samples.
Compute and plot the power spectrum of the noisy signal centered at the zero frequency. Despite
noise, you can still make out the signal's frequencies due to the spikes in power.
ynoise = fft(xnoise);
ynoiseshift = fftshift(ynoise);
power = abs(ynoiseshift).^2/n;
plot(fshift,power)
title('Power')
xlabel('Frequency (Hz)')
ylabel('Power')
Computational Efficiency
Using the Fourier transform formula directly to compute each of the n elements of y requires on the
order of $n^2$ floating-point operations. The fast Fourier transform algorithm requires only on the order
of n log n operations to compute. This computational efficiency is a big advantage when processing
data that has millions of data points. Many specialized implementations of the fast Fourier transform
algorithm are even more efficient when n has small prime factors, such as when n is a power of 2.
Consider audio data collected from underwater microphones off the coast of California. This data can
be found in a library maintained by the Cornell University Bioacoustics Research Program. Load and
format a subset of the data in bluewhale.au, which contains a Pacific blue whale vocalization.
Because blue whale calls are low-frequency sounds, they are barely audible to humans. The time
scale in the data is compressed by a factor of 10 to raise the pitch and make the call more clearly
audible. You can use the command sound(x,fs) to listen to the entire audio file.
whaleFile = 'bluewhale.au';
[x,fs] = audioread(whaleFile);
whaleMoan = x(2.45e4:3.10e4);
t = 10*(0:1/fs:(length(whaleMoan)-1)/fs);
plot(t,whaleMoan)
xlabel('Time (seconds)')
ylabel('Amplitude')
xlim([0 t(end)])
Specify a new signal length that is the next power of 2 greater than the original length. Then, use fft
to compute the Fourier transform using the new signal length. fft automatically pads the data with
zeros to increase the sample size. This padding can make the transform computation significantly
faster, particularly for sample sizes with large prime factors.
m = length(whaleMoan);
n = pow2(nextpow2(m));
y = fft(whaleMoan,n);
Plot the power spectrum of the signal. The plot indicates that the moan consists of a fundamental
frequency around 17 Hz and a sequence of harmonics, where the second harmonic is emphasized.
Phase of Sinusoids
Using the Fourier transform, you can also extract the phase spectrum of the original signal. For
example, create a signal that consists of two sinusoids of frequencies 15 Hz and 40 Hz. The first
sinusoid is a cosine wave with phase −π/4, and the second is a cosine wave with phase π/2. Sample
the signal at 100 Hz for 1 second.
fs = 100;
t = 0:1/fs:1-1/fs;
x = cos(2*pi*15*t - pi/4) - sin(2*pi*40*t);
Compute the Fourier transform of the signal. Plot the magnitude of the transform as a function of
frequency.
y = fft(x);
z = fftshift(y);
ly = length(y);
f = (-ly/2:ly/2-1)/ly*fs;
stem(f,abs(z))
xlabel("Frequency (Hz)")
ylabel("|y|")
grid
Compute the phase of the transform, removing small-magnitude transform values. Plot the phase as a
function of frequency.
tol = 1e-6;
z(abs(z) < tol) = 0;
theta = angle(z);
stem(f,theta/pi)
xlabel("Frequency (Hz)")
ylabel("Phase / \pi")
grid
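The phases can also be read off numerically rather than from the stem plot. The sketch below repeats the signal definition and inspects the two positive-frequency bins; the index arithmetic assumes the 1 Hz bin spacing that results from sampling at 100 Hz for 1 second.

```matlab
fs = 100;
t = 0:1/fs:1-1/fs;
x = cos(2*pi*15*t - pi/4) - sin(2*pi*40*t);
y = fft(x);

n = length(y);
f = (0:n-1)*(fs/n);              % frequency of each bin (Hz); 1 Hz spacing here

% Bin k+1 holds frequency k*fs/n, so 15 Hz and 40 Hz sit at indices 16 and 41.
phase15 = angle(y(16));          % expected to be close to -pi/4
phase40 = angle(y(41));          % expected to be close to  pi/2
```

The 40 Hz term has phase π/2 because −sin(a) equals cos(a + π/2).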
See Also
fft | fftshift | nextpow2 | ifft | fft2 | fftn | fftw
Related Examples
• “2-D Fourier Transforms” on page 16-19
External Websites
• Fourier Analysis (MathWorks Teaching Resources)
Basic Spectral Analysis
Quantity               Description
x                      Sampled data
n = length(x)          Number of samples
fs                     Sample frequency (samples per unit time or space)
dt = 1/fs              Time or space increment per sample
t = (0:n-1)/fs         Time or space range for data
y = fft(x)             Discrete Fourier transform of data (DFT)
abs(y)                 Amplitude of the DFT
(abs(y).^2)/n          Power of the DFT
fs/n                   Frequency increment
f = (0:n-1)*(fs/n)     Frequency range
fs/2                   Nyquist frequency (midpoint of frequency range)
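The quantities in the table translate line by line into code. The sketch below uses a hypothetical noise-free 15 Hz tone so that the location of the spectral peak is known in advance; the sample rate and length are arbitrary choices.

```matlab
fs = 100;                       % sample frequency (Hz), hypothetical
n = 1000;                       % number of samples
t = (0:n-1)/fs;                 % time range for data
x = sin(2*pi*15*t);             % sampled data: a single 15 Hz tone

y = fft(x);                     % discrete Fourier transform
power = abs(y).^2/n;            % power of the DFT
f = (0:n-1)*(fs/n);             % frequency range
nyquist = fs/2;                 % Nyquist frequency

% Search below the Nyquist frequency; the upper half mirrors the lower half.
half = 1:floor(n/2);
[~,idx] = max(power(half));
fPeak = f(idx);
```

Restricting the search to the first half avoids picking up the mirrored peak at fs − 15 Hz.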
Noisy Signal
The Fourier transform can compute the frequency components of a signal that is corrupted by
random noise.
Create a signal with component frequencies at 15 Hz and 40 Hz, and inject random Gaussian noise.
rng('default')
fs = 100; % sample frequency (Hz)
t = 0:1/fs:10-1/fs; % 10 second span time vector
x = (1.3)*sin(2*pi*15*t) ... % 15 Hz component
+ (1.7)*sin(2*pi*40*(t-2)) ... % 40 Hz component
+ 2.5*randn(size(t)); % Gaussian noise;
The Fourier transform of the signal identifies its frequency components. In MATLAB®, the fft
function computes the Fourier transform using a fast Fourier transform algorithm. Use fft to
compute the discrete Fourier transform of the signal.
y = fft(x);
Plot the power spectrum as a function of frequency. While noise disguises a signal's frequency
components in time-based space, the Fourier transform reveals them as spikes in power.
plot(f,power)
xlabel('Frequency')
ylabel('Power')
In many applications, it is more convenient to view the power spectrum centered at 0 frequency
because it better represents the signal's periodicity. Use the fftshift function to perform a circular
shift on y, and plot the 0-centered power.
plot(f0,power0)
xlabel('Frequency')
ylabel('Power')
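The variables plotted above can be computed from the table of spectral quantities. This is a sketch under the assumption that `f`, `power`, `f0`, and `power0` follow those definitions; the signal is regenerated here so that the block runs on its own.

```matlab
rng('default')
fs = 100;
t = 0:1/fs:10-1/fs;
x = 1.3*sin(2*pi*15*t) + 1.7*sin(2*pi*40*(t-2)) + 2.5*randn(size(t));

y = fft(x);
n = length(x);
f = (0:n-1)*(fs/n);             % frequency range, 0 to just under fs
power = abs(y).^2/n;            % power of the DFT

y0 = fftshift(y);               % move the zero-frequency bin to the center
f0 = (-n/2:n/2-1)*(fs/n);       % zero-centered frequency range (n is even here)
power0 = abs(y0).^2/n;          % zero-centered power

[~,idx] = max(power0);
fTop = abs(f0(idx));            % strongest component, sign dropped
```

The expression for `f0` relies on n being even; for odd n, the range would be `(-(n-1)/2:(n-1)/2)*(fs/n)`.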
Audio Signal
You can use the Fourier transform to analyze the frequency spectrum of audio data.
The file bluewhale.au contains audio data from a Pacific blue whale vocalization recorded by
underwater microphones off the coast of California. The file is from the library of animal vocalizations
maintained by the Cornell University Bioacoustics Research Program.
Because blue whale calls are so low, they are barely audible to humans. The time scale in the data is
compressed by a factor of 10 to raise the pitch and make the call more clearly audible. Read and plot
the audio data. You can use the command sound(x,fs) to listen to the audio.
whaleFile = 'bluewhale.au';
[x,fs] = audioread(whaleFile);
plot(x)
xlabel('Sample Number')
ylabel('Amplitude')
The first sound is a "trill" followed by three "moans". This example analyzes a single moan. Specify
new data that approximately consists of the first moan, and correct the time data to account for the
factor-of-10 speed-up. Plot the truncated signal as a function of time.
moan = x(2.45e4:3.10e4);
t = 10*(0:1/fs:(length(moan)-1)/fs);
plot(t,moan)
xlabel('Time (seconds)')
ylabel('Amplitude')
xlim([0 t(end)])
The Fourier transform of the data identifies frequency components of the audio signal. In some
applications that process large amounts of data with fft, it is common to resize the input so that the
number of samples is a power of 2. This can make the transform computation significantly faster,
particularly for sample sizes with large prime factors. Specify a new signal length n that is a power of
2, and use the fft function to compute the discrete Fourier transform of the signal. fft
automatically pads the original data with zeros to increase the sample size.
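A minimal sketch of the padding step follows. Because the audio file may not be at hand, a random vector of the same length as the truncated recording stands in for `moan`; the variable name `m` is an assumption, while `n` and `y` match the names used in the code below.

```matlab
% Stand-in for the truncated recording: any column vector of the same
% length (6501 samples, from indices 2.45e4 to 3.10e4) works here.
moan = randn(6501,1);

m = length(moan);               % original signal length
n = pow2(nextpow2(m));          % next power of 2 at or above m
y = fft(moan,n);                % fft zero-pads moan to length n
```

Zero-padding changes the frequency increment from fs/m to fs/n but does not add information to the signal.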
Adjust the frequency range due to the speed-up factor, and compute and plot the power spectrum of
the signal. The plot indicates that the moan consists of a fundamental frequency around 17 Hz and a
sequence of harmonics, where the second harmonic is emphasized.
f = (0:n-1)*(fs/n)/10;
power = abs(y).^2/n;
plot(f(1:floor(n/2)),power(1:floor(n/2)))
xlabel('Frequency')
ylabel('Power')
See Also
fft | fftshift | nextpow2 | ifft | fft2 | fftn
Related Examples
• “Fourier Transforms” on page 16-2
• “2-D Fourier Transforms” on page 16-19
Polynomial Interpolation Using FFT
FFT in Mathematics
The FFT algorithm is associated with applications in signal processing, but it can also be used more
generally as a fast computational tool in mathematics. For example, coefficients $c_i$ of an nth degree
polynomial $c_1x^n + c_2x^{n-1} + \dots + c_nx + c_{n+1}$ that interpolates a set of data are commonly computed by
solving a straightforward system of linear equations. While studying asteroid orbits in the early 19th
century, Carl Friedrich Gauss discovered a mathematical shortcut for computing the coefficients of a
polynomial interpolant by splitting the problem up into smaller subproblems and combining the
results. His method was equivalent to estimating the discrete Fourier transform of his data.
In a paper by Gauss, he describes an approach to estimating the orbit of the Pallas asteroid. He starts
with the following twelve 2-D positional data points x and y.
x = 0:30:330;
y = [408 89 -66 10 338 807 1238 1511 1583 1462 1183 804];
plot(x,y,'ro')
xlim([0 360])
Gauss models the asteroid's orbit with a trigonometric polynomial of the following form.
$$
\begin{aligned}
y = a_0 &+ a_1\cos(2\pi(x/360)) + b_1\sin(2\pi(x/360)) \\
&+ a_2\cos(2\pi(2x/360)) + b_2\sin(2\pi(2x/360)) \\
&+ \cdots \\
&+ a_5\cos(2\pi(5x/360)) + b_5\sin(2\pi(5x/360)) \\
&+ a_6\cos(2\pi(6x/360))
\end{aligned}
$$
m = length(y);
n = floor((m+1)/2);
z = fft(y)/m;
a0 = z(1);
an = 2*real(z(2:n));
a6 = z(n+1);
bn = -2*imag(z(2:n));
hold on
px = 0:0.01:360;
k = 1:length(an);
py = a0 + an*cos(2*pi*k'*px/360) ...
+ bn*sin(2*pi*k'*px/360) ...
+ a6*cos(2*pi*6*px/360);
plot(px,py)
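One way to confirm that the coefficients define an interpolant is to evaluate the trigonometric polynomial at the original sample points and compare with the data. The sketch below does this; taking the real part of the last coefficient is a precaution added here, and `yFit` and `maxErr` are illustrative names.

```matlab
x = 0:30:330;
y = [408 89 -66 10 338 807 1238 1511 1583 1462 1183 804];

m = length(y);
n = floor((m+1)/2);
z = fft(y)/m;
a0 = z(1);
an = 2*real(z(2:n));
a6 = real(z(n+1));              % highest-frequency term; real for real data
bn = -2*imag(z(2:n));

% Evaluate the trigonometric polynomial at the original sample points.
k = 1:length(an);
yFit = a0 + an*cos(2*pi*k'*x/360) + bn*sin(2*pi*k'*x/360) + a6*cos(2*pi*6*x/360);
maxErr = max(abs(yFit - y));
```

The products `an*cos(...)` and `bn*sin(...)` are row-times-matrix multiplications that sum over the harmonics k = 1 to 5 in one step.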
References
[1] Briggs, W. and V.E. Henson. The DFT: An Owner's Manual for the Discrete Fourier Transform.
Philadelphia: SIAM, 1995.
[2] Gauss, C. F. “Theoria interpolationis methodo nova tractata.” Carl Friedrich Gauss Werke. Band 3.
Göttingen: Königlichen Gesellschaft der Wissenschaften, 1866.
[3] Heideman M., D. Johnson, and C. Burrus. “Gauss and the History of the Fast Fourier Transform.”
Arch. Hist. Exact Sciences. Vol. 34. 1985, pp. 265–277.
[4] Goldstine, H. H. A History of Numerical Analysis from the 16th through the 19th Century. Berlin:
Springer-Verlag, 1977.
See Also
fft
Related Examples
• “Fourier Transforms” on page 16-2
2-D Fourier Transforms
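The fft2 function computes the discrete Fourier transform Y of an m-by-n matrix X:
$$Y_{p+1,q+1} = \sum_{j=0}^{m-1}\sum_{k=0}^{n-1} \omega_m^{jp}\,\omega_n^{kq}\,X_{j+1,k+1},$$
where $\omega_m = e^{-2\pi i/m}$ and $\omega_n = e^{-2\pi i/n}$ are complex roots of unity.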
i is the imaginary unit, p and j are indices that run from 0 to m–1, and q and k are indices that run
from 0 to n–1. The indices for X and Y are shifted by 1 in this formula to reflect matrix indices in
MATLAB.
Computing the 2-D Fourier transform of X is equivalent to first computing the 1-D transform of each
column of X, and then taking the 1-D transform of each row of the result. In other words, the
command fft2(X) is equivalent to Y = fft(fft(X).').'.
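The equivalence is easy to confirm numerically. The following sketch uses a small arbitrary matrix and also shows the same computation written with the dimension argument of `fft`.

```matlab
X = reshape(1:12,3,4);          % small 3-by-4 test matrix
Y1 = fft2(X);
Y2 = fft(fft(X).').';           % columns first, then rows
Y3 = fft(fft(X,[],1),[],2);     % same idea using the dimension argument

maxDiff12 = max(abs(Y1(:) - Y2(:)));
maxDiff13 = max(abs(Y1(:) - Y3(:)));
```

The transposes must be the non-conjugate `.'` operator; using `'` would conjugate the intermediate result and give a different answer.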
In optics, the Fourier transform can be used to describe the diffraction pattern produced by a plane
wave incident on an optical mask with a small aperture [1]. This example uses the fft2 function on
an optical mask to compute its diffraction pattern.
Create a logical array that defines an optical mask with a small, circular aperture.
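A mask of this kind can be sketched as follows. The grid size `n` and aperture radius `R` are hypothetical values chosen for illustration, and the construction is one possible approach rather than necessarily the one used to produce the figures.

```matlab
n = 2^10;                       % mask size in pixels (hypothetical)
R = 10;                         % aperture radius in pixels (hypothetical)
I = 1:n;
[X,Y] = meshgrid(I - n/2, n/2 - I);   % coordinates centered on the mask
M = (X.^2 + Y.^2 <= R^2);       % logical array: true inside the aperture
```

To view the mask, call `imagesc(M)` followed by `axis image`.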
Use fft2 to compute the 2-D Fourier transform of the mask, and use the fftshift function to
rearrange the output so that the zero-frequency component is at the center. Plot the resulting
diffraction pattern frequencies. Blue indicates small amplitudes and yellow indicates large
amplitudes.
DP = fftshift(fft2(M));
imagesc(abs(DP))
axis image
To enhance the details of regions with small amplitudes, plot the 2-D logarithm of the diffraction
pattern. Very small amplitudes are affected by numerical round-off error, and the rectangular grid
causes radial asymmetry.
imagesc(abs(log2(DP)))
axis image
References
[1] Fowles, G. R. Introduction to Modern Optics. New York: Dover, 1989.
See Also
fft2 | fftshift | fftn | ifft2 | fft
Related Examples
• “Fourier Transforms” on page 16-2
Square Wave from Sine Waves
This example shows how the Fourier series expansion for a square wave is made up of a sum of odd
harmonics.
Start by forming a time vector running from 0 to 10 in steps of 0.1, and take the sine of all the points.
Plot this fundamental frequency.
t = 0:.1:10;
y = sin(t);
plot(t,y);
Next add the third harmonic to the fundamental, and plot it.
y = sin(t) + sin(3*t)/3;
plot(t,y);
Now use the first, third, fifth, seventh, and ninth harmonics.
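Following the pattern of the previous step, each odd harmonic k is added with weight 1/k. A minimal sketch that accumulates the five terms in a loop:

```matlab
t = 0:.1:10;
y = zeros(size(t));
for k = 1:2:9                   % odd harmonics 1, 3, 5, 7, 9
    y = y + sin(k*t)/k;
end
```

Calling `plot(t,y)` afterward displays the partial sum, which is already visibly closer to a square wave than the two-term version.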
For a finale, go from the fundamental all the way to the 19th harmonic, creating vectors of
successively more harmonics, and saving all intermediate steps as the rows of a matrix.
Plot the vectors on the same figure to show the evolution of the square wave. Note that the Gibbs
effect says it will never quite get there.
t = 0:.02:3.14;
y = zeros(10,length(t));
x = zeros(size(t));
for k = 1:2:19
x = x + sin(k*t)/k;
y((k+1)/2,:) = x;
end
plot(y(1:2:9,:)')
title('The building of a square wave: Gibbs'' effect')
Here is a 3-D surface representing the gradual transformation of a sine wave into a square wave.
surf(y);
shading interp
axis off ij
Analyzing Cyclical Data with FFT
You can use the Fourier transform to analyze variations in data, such as an event in nature over a
period of time.
For almost 300 years, astronomers have tabulated the number and size of sunspots using the Zurich
sunspot relative number. Plot the Zurich number over approximately the years 1700 to 2000.
load sunspot.dat
year = sunspot(:,1);
relNums = sunspot(:,2);
plot(year,relNums)
xlabel('Year')
ylabel('Zurich Number')
title('Sunspot Data')
To take a closer look at the cyclical nature of sunspot activity, plot the first 50 years of data.
plot(year(1:50),relNums(1:50),'b.-');
xlabel('Year')
ylabel('Zurich Number')
title('Sunspot Data')
The Fourier transform is a fundamental tool in signal processing that identifies frequency components
in data. Using the fft function, take the Fourier transform of the Zurich data. Remove the first
element of the output, which stores the sum of the data. Plot the remainder of the output, which
contains a mirror image of complex Fourier coefficients about the real axis.
y = fft(relNums);
y(1) = [];
plot(y,'ro')
xlabel('real(y)')
ylabel('imag(y)')
title('Fourier Coefficients')
Fourier coefficients on their own are difficult to interpret. A more meaningful measure of the
coefficients is their magnitude squared, which is a measure of power. Since half of the coefficients are
repeated in magnitude, you only need to compute the power on one half of the coefficients. Plot the
power spectrum as a function of frequency, measured in cycles per year.
n = length(y);
power = abs(y(1:floor(n/2))).^2; % power of first half of transform data
maxfreq = 1/2; % maximum frequency
freq = (1:n/2)/(n/2)*maxfreq; % equally spaced frequency grid
plot(freq,power)
xlabel('Cycles/Year')
ylabel('Power')
Maximum sunspot activity happens less frequently than once per year. For a view of the cyclical
activity that is easier to interpret, plot power as a function of period, measured in years per cycle.
The plot reveals that sunspot activity peaks about once every 11 years.
period = 1./freq;
plot(period,power);
xlim([0 50]); %zoom in on max power
xlabel('Years/Cycle')
ylabel('Power')
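The dominant period can also be located programmatically instead of being read from the plot. The sketch below repeats the steps above without plotting and then finds the period with the largest power; `peakPeriod` is an illustrative name, and the block assumes that the sunspot.dat example file is on the MATLAB path.

```matlab
load sunspot.dat                % example data file used above
relNums = sunspot(:,2);

y = fft(relNums);
y(1) = [];                      % drop the sum of the data
n = length(y);
power = abs(y(1:floor(n/2))).^2;
freq = (1:n/2)/(n/2)*(1/2);     % cycles per year, up to the maximum of 1/2
period = 1./freq;               % years per cycle

[~,idx] = max(power);
peakPeriod = period(idx);       % period with the most power
```

The resolution of `peakPeriod` is limited by the frequency grid, so it identifies the cycle length only to within the spacing between neighboring periods.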
See Also
fft | fft2 | fftw
Related Examples
• “Fourier Transforms” on page 16-2
• “2-D Fourier Transforms” on page 16-19
17
Gate-Based Quantum Algorithms
Introduction to Quantum Computing
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
Quantum computing is an emerging technology that uses phenomena in quantum physics, such as
superposition and entanglement, to perform computations. In contrast to classical computing where information is
stored as binary bits, which can only be in either the 0 or 1 state, quantum computing uses quantum
bits, or qubits, which can be in the 0 and 1 states at the same time. Because of this fundamental
difference, quantum computers have the potential to outperform classical computers when solving
certain types of problems, such as optimization or simulation.
This topic describes three building blocks of quantum computing: qubits, quantum gates, and
quantum circuits. This topic also shows how to perform measurements of a quantum circuit, either by
simulating the circuit locally with random sampling or by running the circuit remotely on a quantum
device.
Qubit
The qubit is the basic building block of quantum computing. Qubits store information and can be
physically realized by two-state quantum devices. A qubit can be in a linear combination of the $|0\rangle$
and $|1\rangle$ states with complex coefficients, referred to as a superposition. If the complex coefficients
are normalized, then they represent the probability amplitudes of measuring the qubit in the $|0\rangle$ or
$|1\rangle$ state. For example, a qubit can be in the state $\frac{i|0\rangle - \sqrt{2}|1\rangle}{\sqrt{3}}$. You can determine the probability of
measuring the qubit in the $|0\rangle$ state by taking the magnitude squared of the probability amplitude of
the $|0\rangle$ state, which is 1/3. Similarly, the probability of measuring the qubit in the $|1\rangle$ state is 2/3.
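The probabilities follow from the amplitudes with ordinary array arithmetic. This sketch uses plain MATLAB vectors rather than the support package, with the amplitudes taken from the example state above.

```matlab
% Amplitudes of the |0> and |1> states for the example state.
amps = [1i; -sqrt(2)]/sqrt(3);

probs = abs(amps).^2;           % magnitude squared of each amplitude
total = sum(probs);             % equals 1 for a normalized state
```

The phase factors (here i and −1) do not affect the probabilities, because only the magnitude of each amplitude enters.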
A practical way to visualize the quantum state of a single qubit is using the Bloch sphere. In general,
the state of a qubit can be written as
$$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + \exp(i\phi)\,\sin\frac{\theta}{2}\,|1\rangle.$$
The Bloch sphere represents the parameters θ and ϕ in spherical coordinates, respectively, as the
colatitude with respect to the z-axis and the longitude with respect to the x-axis of the unit sphere.
The state is plotted as a point u on the unit sphere with coordinates
$$u = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta).$$
By convention, the north pole is the $|0\rangle$ state, the south pole is the $|1\rangle$ state, and points on the equator are
linear combinations of these two states with equal probability of measuring $|0\rangle$ or $|1\rangle$.
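The mapping from amplitudes to a point on the sphere can be sketched in a few lines. This is an illustrative calculation written for this passage, not the helper function provided with the example; it recovers θ from the magnitude of the first amplitude and ϕ from the phase difference between the two amplitudes.

```matlab
amps = [1; 1i]/sqrt(2);         % example state with equal probabilities
amps = amps/norm(amps);         % make sure the state is normalized

theta = 2*acos(min(abs(amps(1)),1));        % colatitude from the |0> amplitude
phi = angle(amps(2)) - angle(amps(1));      % longitude from the relative phase

u = [sin(theta)*cos(phi), sin(theta)*sin(phi), cos(theta)];
```

For this state, `u` lies on the equator along the positive y-axis. The `min(...,1)` guard protects `acos` from rounding slightly above 1.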
To plot any quantum state of a qubit on the Bloch sphere, use the plotBlochSphere function
(provided in the “Helper Functions” on page 17-16 section). This function takes a complex vector
with two elements that represent the probability amplitudes of the $|0\rangle$ and $|1\rangle$ states.
plotBlochSphere([1; 0])
title("|0> state on Bloch sphere")
plotBlochSphere([0; 1])
title("|1> state on Bloch sphere")
Plot the $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ state on the Bloch sphere. This state is represented by a point on the
equator. Its opposite point on the other side of the sphere represents the $|-\rangle$ state.
plotBlochSphere([1; 1]/sqrt(2))
title("|+> state on Bloch sphere")
plotBlochSphere([1; -1]/sqrt(2))
title("|-> state on Bloch sphere")
plotBlochSphere([1; 1i]/sqrt(2))
title("|R> state on Bloch sphere")
The quantum state of a single qubit is expressed as a normalized two-element complex vector, where
the first element gives the amplitude of the $|0\rangle$ state and the second element gives the amplitude of
the $|1\rangle$ state. A measurement of this qubit yields a classical 0 or 1 state, with the probability of
measuring each state given by the magnitude squared of its respective amplitude.
You can use the quantum.gate.QuantumState constructor to create a quantum state, and then use
the querystates function to find the possible states to measure and their probabilities. For
example, create a quantum state that represents the $|1\rangle$ state.
state = quantum.gate.QuantumState([0 1]);
state.Amplitudes
ans =
0
1
The querystates function returns all possible states to measure and the probability of measuring
each one. Here, only the $|1\rangle$ state is possible.
[states,probabilities] = querystates(state)
states =
"1"
probabilities =
1
As another example, create a quantum state where the two elements of the vector have different
amplitudes with the same absolute value.
ans =
0.7071 + 0.0000i
0.0000 + 0.7071i
The querystates function returns the "0" and "1" states as the possible states to measure and a
0.5 probability of measuring each state.
[states,probabilities] = querystates(state)
states =
"0"
"1"
probabilities =
0.5000
0.5000
You can plot a histogram that shows the probabilities of measuring all possible states by using
histogram.
histogram(state)
As another example, create a quantum state where the two elements of the vector have different
amplitudes and different absolute values. If you call quantum.gate.QuantumState with a general
complex vector, then the constructor creates a quantum state that is normalized. For example, create
a quantum state and find its measurement probabilities.
ans =
0.6000 + 0.0000i
0.0000 - 0.8000i
[states,probabilities] = querystates(state)
states =
"0"
"1"
probabilities =
0.3600
0.6400
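The normalization and the resulting probabilities can be reproduced with plain vector arithmetic. In this sketch the unnormalized vector `[3; -4i]` is a hypothetical input chosen because it normalizes to the amplitudes shown above; the code does not use the support package.

```matlab
v = [3; -4i];                   % hypothetical unnormalized amplitudes
amps = v/norm(v);               % normalize so the probabilities sum to 1
probabilities = abs(amps).^2;   % probabilities of measuring "0" and "1"
```

Dividing by `norm(v)` scales both amplitudes by the same real factor, so their relative phase is unchanged.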
Only properties of a quantum state that translate into probabilities are measurable. For example,
multiplying every amplitude of a quantum state by –1 or exp(iθ) has no measurable impact. This type
of transformation is called applying a global phase. However, multiplying one of the amplitudes by –1
or exp(iθ) (applying a relative phase) has a measurable impact. If you measure the state directly after
applying the relative phase, you do not see a difference in the probabilities. But if you apply
additional gate operations on the qubits, you can detect the difference due to the relative phase.
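The difference between a global and a relative phase can be illustrated with plain matrices. The sketch below, which does not use the support package, starts from the |+> state and uses the Hadamard matrix as the additional gate; the phase angle π/3 is an arbitrary choice.

```matlab
H = [1 1; 1 -1]/sqrt(2);        % Hadamard matrix
plus = [1; 1]/sqrt(2);          % |+> state

globalPhase = exp(1i*pi/3)*plus;        % every amplitude times exp(i*theta)
relativePhase = [plus(1); -plus(2)];    % only the second amplitude times -1

% Measured directly, all three states give the same probabilities.
pDirect = abs([plus, globalPhase, relativePhase]).^2;

% After a Hadamard gate, only the relative phase changes the outcome.
pAfterH = abs(H*[plus, globalPhase, relativePhase]).^2;
```

Each column of `pDirect` is [0.5; 0.5]. In `pAfterH`, the first two columns are [1; 0] while the third is [0; 1], so the relative phase has become measurable.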
Quantum Gate
The quantum gate is the next building block of quantum computing. Quantum gates represent
reversible operations that transform the quantum state according to unitary matrices. While all gate
operations are deterministic, measuring in quantum computing is probabilistic, with the probabilities
of various measurements depending on the states of the qubits. For a complete list of quantum gates
available in MATLAB, see “Types of Quantum Gates” on page 17-18.
Pauli X Gate
An example of a quantum gate is the Pauli X gate. This gate multiplies the state vector by the Pauli X
matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, which flips the $|0\rangle$ state of a qubit to the $|1\rangle$ state and flips the $|1\rangle$ state to the $|0\rangle$
state.
inAmps =
0.8137 + 0.0000i
0.0000 + 0.5812i
Create a quantum circuit that applies the Pauli X gate (xGate) to the qubit by using
quantumCircuit. Simulate this circuit to get the final state of the qubit by using simulate.
c = quantumCircuit(xGate(1));
outState = simulate(c,inState);
outAmps = outState.Amplitudes
outAmps =
0.0000 + 0.5812i
0.8137 + 0.0000i
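The same result follows from multiplying the state vector by the Pauli X matrix directly. In this sketch the unnormalized vector `[0.7; 0.5i]` is a hypothetical input that reproduces the displayed amplitudes to four digits; the unitarity check is an addition for illustration.

```matlab
X = [0 1; 1 0];                 % Pauli X matrix
v = [0.7; 0.5i];                % hypothetical amplitudes before normalization
inAmps = v/norm(v);             % approximately [0.8137; 0.5812i]
outAmps = X*inAmps;             % the two amplitudes swap places

% Quantum gates are unitary, which is what makes them reversible.
isUnitary = norm(X'*X - eye(2)) < 1e-12;
```

Applying `X` a second time restores `inAmps`, since the Pauli X matrix is its own inverse.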
To illustrate the operation performed by the Pauli X gate, plot the initial state and the final state of
the circuit on a Bloch sphere. In the Bloch sphere representation, the Pauli X gate rotates the state
around the x-axis (the axis of the $|+\rangle$ and $|-\rangle$ states) by an angle of π.
figure
tiledlayout(1,2);
nexttile
plotBlochSphere(inAmps)
nexttile
plotBlochSphere(outAmps)
hold on
plot3([-1 1],[0 0],[0 0],LineWidth=2,Color=[0 0.8 0])
hold off
For the choice of basis states of a qubit, the $|0\rangle$ and $|1\rangle$ states are the eigenstates of the Pauli Z gate.
Applying the Pauli Z gate (zGate) has no effect on the $|0\rangle$ state and maps the $|1\rangle$ state to $-|1\rangle$. The
Pauli Z gate is also known as a phase-flip gate. Similarly, the $|+\rangle$ and $|-\rangle$ states are the eigenstates
of the Pauli X gate. The eigenstates of the Pauli Y gate (yGate) are the other two orthogonal
directions in the plot, which are the $|R\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$ and $|L\rangle = (|0\rangle - i|1\rangle)/\sqrt{2}$ states.
Although each of these pairs would work as basis states, the $|0\rangle$ and $|1\rangle$ basis states are the
standard basis states in which measurements are made. To rotate around the x-, y-, or z-axis with an
arbitrary angle, you can use the rotation gates rxGate, ryGate, or rzGate.
Hadamard Gate
Another example of a quantum gate is the Hadamard gate, which transforms the Z basis to the X
basis. That is, the gate transforms the $|0\rangle$ and $|1\rangle$ states to the $|+\rangle$ and $|-\rangle$ states, respectively.
For example, consider the same initial quantum state that was previously created.
inAmps = inState.Amplitudes
inAmps =
0.8137 + 0.0000i
0.0000 + 0.5812i
Create a quantum circuit that applies the Hadamard gate (hGate) to the qubit. Simulate this circuit
to get the final state of the qubit.
c = quantumCircuit(hGate(1));
outState = simulate(c,inState);
outAmps = outState.Amplitudes
outAmps =
0.5754 + 0.4110i
0.5754 - 0.4110i
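The simulated amplitudes can be reproduced with the standard Hadamard matrix. The following is a minimal sketch in base MATLAB, assuming the rounded input amplitudes shown above; the variable names are hypothetical.

```matlab
% Hadamard matrix and the (rounded) input amplitudes shown above
H = [1 1; 1 -1]/sqrt(2);
psiIn = [0.8137; 0.5812i];

% The output amplitudes are the normalized sum and difference of the inputs
psiOut = H*psiIn

% H is its own inverse, so applying it twice restores the input state
roundTripError = norm(H*psiOut - psiIn)
```

Because the Hadamard gate is involutory, roundTripError is zero up to round-off.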
The operation by the Hadamard gate is not apparent from the probability amplitudes of the states. To
better understand this operation, transform the complex amplitudes of the states into coordinates on
the Bloch sphere by using the mapToBlochSphere function (provided in the “Helper Functions” on
page 17-16 section). Show the coordinates of the initial state and final state on the Bloch sphere.
inCoords = mapToBlochSphere(inAmps)
inCoords =
outCoords = mapToBlochSphere(outAmps)
outCoords =
In the Bloch sphere representation, the Hadamard gate rotates the state around the [1,0,1] axis by an
angle of π.
figure
tiledlayout(1,2);
nexttile
plotBlochSphere(inAmps)
nexttile
plotBlochSphere(outAmps)
hold on
plot3([-1 1]/sqrt(2),[0 0],[-1 1]/sqrt(2),LineWidth=2,Color=[0,0.8,0])
hold off
Quantum devices typically provide measurements only in the Z basis, which is a choice of basis that is
not intrinsically special given the various transformations available using quantum gates. For
example, to measure a qubit in the X basis instead, you can apply a Hadamard gate to the qubit
before performing the measurement.
Quantum Circuit
The quantum circuit is the next building block of quantum computing. Quantum circuits consist of
quantum gates that act on qubits. A quantum circuit diagram is commonly used as a visual model for
a sequence of quantum gates applied to qubits. This example of a quantum circuit diagram includes a
Hadamard gate and a controlled X gate acting on two qubits.
• In a quantum circuit diagram, each solid horizontal line represents a qubit, or more generally, a
qubit register. In MATLAB, the top line is a qubit with index 1 and the remaining lines from top to
bottom are labeled sequentially. In this example, the circuit consists of two qubits with indices 1
and 2.
• Quantum gates perform operations on the qubits. A gate acting on qubits is denoted by a specific
gate symbol. In this example, the first symbol represents a Hadamard gate acting on qubit 1. The
next symbol represents a controlled X gate with qubit 1 as the control and qubit 2 as the
target.
• In a circuit diagram, time flows from left to right. Quantum gates are ordered in chronological
order with the leftmost gate as the gate first applied to the qubits. The solid horizontal lines hold
the overall quantum state of the circuit that is passed through each of the gates in the diagram
from left to right.
• For a quantum circuit with n qubits, the overall $2^n$ basis states are constructed from the
Kronecker product of the n qubit bases with ordering from left to right for qubits with the lowest
index to the highest index. In other words, if the basis of a single qubit with index k is labeled as
$|q_k\rangle$ (which can be $|0\rangle$ or $|1\rangle$), then the basis states of the circuit with n qubits are represented
by $|q_1\rangle \otimes \dots \otimes |q_k\rangle \otimes \dots \otimes |q_n\rangle$. For example, for a circuit with two qubits, all possible basis states
expressed as a column vector are $|q_1 q_2\rangle = \left[\,|00\rangle, |01\rangle, |10\rangle, |11\rangle\,\right]^T$. This definition follows most
textbooks.
• A gate operation can be represented as a transformation matrix for these basis states. For
example, consider the controlled X gate that operates on a target qubit (with index 2) based on
the state of a control qubit (with index 1). If the control qubit is in the $|0\rangle$ state, this gate does
nothing. If the control qubit is in the $|1\rangle$ state, this gate applies the Pauli X gate to the target
qubit. The matrix representation of the controlled X gate is
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$
• The quantum state of a circuit with n qubits can then be represented by a linear combination of
the $2^n$ basis states, which can be a separable or an entangled state. A separable state is a
quantum state that can be factored into individual states belonging to separate subspaces of the
qubits. An entangled state is a quantum state that is not separable. For example, for a circuit with
two qubits, the quantum state can be a separable state, such as
$$\frac{i}{2}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right) = \frac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes \frac{i\left(|0\rangle + |1\rangle\right)}{\sqrt{2}}.$$
You can build a quantum circuit by using the quantumCircuit function. Create an empty quantum
circuit with two qubits. Simulate the quantum state returned by this circuit. Show the quantum state
of this circuit as a string formula by using formula.
c = quantumCircuit(2);
s = simulate(c);
str = formula(s)
str =
"1 * |00>"
By default, all qubits are set to $|0\rangle$. Next, pass a different initial quantum state,
$\frac{1}{2}\left(|00\rangle - |01\rangle\right) + \frac{i}{2}\left(|10\rangle - |11\rangle\right)$, to the simulate function, where the quantum state is a linear
combination of all basis states of the qubits.
str =
"(0.5+0i) * |00> +
(-0.5+0i) * |01> +
(0+0.5i) * |10> +
(0-0.5i) * |11>"
You can add quantum gates to the circuit. For this example, add a controlled X (or CNOT) gate to the
circuit. Then plot the circuit.
c.Gates = cxGate(1,2);
figure
plot(c)
Find the final state of this circuit after passing the initial state
$\frac{1}{2}\left(|00\rangle - |01\rangle\right) + \frac{i}{2}\left(|10\rangle - |11\rangle\right)$ to
the controlled X gate.
outState = simulate(c,inState);
str = formula(outState,Basis="Z")
str =
"(0.5+0i) * |00> +
(-0.5+0i) * |01> +
(0-0.5i) * |10> +
(0+0.5i) * |11>"
You can also pass the initial state as $|{+}0\rangle$ to this circuit. Find the final state of this circuit for this
initial state.
outState = simulate(c,"+0");
str = formula(outState,Basis="Z")
str =
"0.70711 * |00> +
0.70711 * |11>"
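Both results follow from multiplying the state vector by the controlled X matrix. The sketch below reproduces them in base MATLAB, assuming the basis ordering 00, 01, 10, 11 described earlier; all variable names are hypothetical.

```matlab
% Controlled X matrix in the |00>, |01>, |10>, |11> ordering
CX = [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];

% First initial state: (|00> - |01>)/2 + i(|10> - |11>)/2
psi1 = [1; -1; 1i; -1i]/2;
out1 = CX*psi1          % amplitudes of |10> and |11> are exchanged

% Second initial state |+0>: Kronecker product of |+> (qubit 1) and |0> (qubit 2)
plusState = [1; 1]/sqrt(2);
zeroState = [1; 0];
psi2 = kron(plusState,zeroState);
out2 = CX*psi2          % (|00> + |11>)/sqrt(2), which is an entangled state
```

The Kronecker product places qubit 1 as the leftmost factor, which is why the control qubit selects between the upper and lower halves of the state vector.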
You can run a circuit built in MATLAB on a remote quantum device and retrieve the measurement
result.
Build a quantum circuit that entangles two qubits by using a Hadamard gate and a controlled X gate.
Then plot the circuit.
The default initial state of this circuit is $|00\rangle$. Simulate the circuit to get its final state.
outState = simulate(c);
str = formula(outState)
str =
"0.70711 * |00> +
0.70711 * |11>"
To investigate the behavior of this circuit, you can first perform a local simulated measurement of the
circuit. Use randsample to randomly sample the quantum state of the circuit with 100 shots.
m = randsample(outState,100)
m =
table(m.Counts,m.MeasuredStates,VariableNames=["Counts","States"])
ans =
2×2 table
Counts States
______ ______
54 "00"
46 "11"
In theory, because the quantum state is entangled and it is a superposition only of the $|00\rangle$ and $|11\rangle$
states, the probability of measuring the $|01\rangle$ and $|10\rangle$ states is 0. In practice, however, due to the
noise in physical quantum devices to date, the $|01\rangle$ and $|10\rangle$ states can appear as measurements.
To run the circuit on a remote quantum device that is available through Amazon® Web Services
(AWS®), first connect to a specific quantum device using quantum.backend.QuantumDeviceAWS.
dev = quantum.backend.QuantumDeviceAWS("Aspen-M-3");
Run the circuit on the device with the default 100 shots using the run function.
task = run(c,dev)
task =
Status: "queued"
TaskARN: "arn:aws:braket:us-west-1:123456789012:quantum-task/1234abcd-ef56-7890-abc2-34de56f678ab"
Wait for the task to finish. Retrieve the measurement result of running the circuit on the device by
using fetchOutput.
wait(task)
m = fetchOutput(task)
m =
Show the measurement result. Due to the noise in the quantum device, the $|01\rangle$ and $|10\rangle$ states
appear as measurements.
table(m.Counts,m.MeasuredStates,VariableNames=["Counts","States"])
ans =
4×2 table
Counts States
______ ______
45 "00"
10 "10"
5 "01"
40 "11"
Helper Functions
This section provides the complete code of the plotBlochSphere and mapToBlochSphere
functions.
function plotBlochSphere(u)
% Plot Bloch sphere representation from 2-D complex amplitudes
arguments
u {mustBeNumeric,mustBeVector}
end
% Compute Bloch sphere representation (3-D real) from a 2-D complex vector
P = mapToBlochSphere(u);
% Meridian lines
meridianCoordinates = cat(3, cospi(beta).*cospi(alpha), ...
sinpi(beta).*cospi(alpha), repmat(sinpi(alpha), 1, length(beta)));
% Latitude circles
latitudeCoordinates = cat(3, cospi(gamma).*cospi(alpha), ...
cospi(gamma).*sinpi(alpha), repmat(sinpi(gamma), length(alpha), 1));
hold off
end
function P = mapToBlochSphere(u)
% Compute Bloch sphere representation (3-D real) from a 2-D complex vector
theta = 2*atan2(abs(u(2)),abs(u(1)));
phi = angle(u(2)*conj(u(1)));
P = [sin(theta)*cos(phi) sin(theta)*sin(phi) cos(theta)];
end
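A few sanity checks help confirm where this mapping places familiar states on the sphere. The sketch below repeats the three formulas from mapToBlochSphere as an anonymous function (named blochCoords here purely for illustration) so that it runs on its own in base MATLAB.

```matlab
% Same formulas as mapToBlochSphere, written as an anonymous function
blochCoords = @(u) [ ...
    sin(2*atan2(abs(u(2)),abs(u(1))))*cos(angle(u(2)*conj(u(1)))), ...
    sin(2*atan2(abs(u(2)),abs(u(1))))*sin(angle(u(2)*conj(u(1)))), ...
    cos(2*atan2(abs(u(2)),abs(u(1))))];

pZero  = blochCoords([1; 0])            % |0>: north pole, [0 0 1]
pOne   = blochCoords([0; 1])            % |1>: south pole, [0 0 -1] up to round-off
pPlus  = blochCoords([1; 1]/sqrt(2))    % |+>: positive x-axis
pRight = blochCoords([1; 1i]/sqrt(2))   % (|0> + i|1>)/sqrt(2): positive y-axis
```

These four points agree with the description of the Z, X, and Y eigenstates as orthogonal directions on the Bloch sphere.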
See Also
quantum.gate.SimpleGate | quantum.gate.CompositeGate | quantumCircuit |
quantum.gate.QuantumState | quantum.gate.QuantumMeasurement |
quantum.backend.QuantumDeviceAWS | quantum.backend.QuantumTaskAWS
Related Examples
• “Types of Quantum Gates” on page 17-18
• “Local Quantum State Simulation” on page 17-25
• “Run Quantum Circuit on Hardware Using AWS” on page 17-29
Types of Quantum Gates
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This topic provides a list of functions that you can use to create quantum gates in MATLAB. Quantum
gates are reversible and have unitary matrix representations.
Rotation Gates
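Inverse T gate (tiGate), 1 qubit:
$$\begin{bmatrix} 1 & 0 \\ 0 & \frac{1-i}{\sqrt{2}} \end{bmatrix}$$
Controlled X or CNOT gate (cxGate or cnotGate), 2 qubits, involutory:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$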
cryGate
1
2
1 0 1 0 1 0 1
0 1 0 1 0 1 0
0 1 0 −1 0 1 0
1 0 −1 0 1 0 −1
1 0 1 0 −1 0 −1
0 1 0 1 0 −1 0
0 −1 0 1 0 1 0
−1 0 1 0 1 0 −1
Quantum Fourier transform (QFT) gate (qftGate), number of qubits varies.
Example: Quantum Fourier transform gate on three qubits. The equivalent internal gates are
Hadamard gates, R1 gates, and a swap gate.
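Multi-controlled X gate (mcxGate), number of qubits varies.
Example: Multi-controlled X gate with three control qubits, one target qubit, and no ancilla qubit.
The equivalent internal gates are Hadamard gates, controlled R1 gates, and controlled X gates.
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \ddots & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$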
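The structure of this matrix can be sketched directly in base MATLAB. The example assumes that the three control qubits have indices 1 to 3 and the target qubit has index 4, so that the target is the last (least significant) bit of each basis label; variable names are hypothetical.

```matlab
numControls = 3;
dim = 2^(numControls + 1);      % 16 basis states for 4 qubits

% Start from the identity and exchange the last two basis states,
% |1110> and |1111>, which are the only ones with all controls in state 1
M = eye(dim);
M([dim-1 dim],:) = M([dim dim-1],:);

% A permutation matrix of this form is unitary and is its own inverse
isUnitary    = isequal(M'*M,eye(dim))
isInvolutory = isequal(M*M,eye(dim))
```

Only the lower-right 2-by-2 block differs from the identity, which is the pattern indicated by the diagonal dots in the matrix above.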
See Also
quantum.gate.SimpleGate | quantum.gate.CompositeGate
Related Examples
• “Introduction to Quantum Computing” on page 17-2
Local Quantum State Simulation
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This topic describes how to simulate a quantum circuit locally and analyze the simulation results.
Building quantum circuits requires iteration on circuit design to refine circuit gates and confirm
behavior. Also, because measurements on quantum hardware are probabilistic, it is useful to examine
the probabilities of final states to determine the most likely outcome of a measurement. Use the
simulate method of quantumCircuit to simulate circuits on your local computer. Once you
simulate a circuit, use the methods of quantum.gate.QuantumState to inspect the results.
Create Circuit
Create a quantum circuit with three qubits and three gates. Apply a Hadamard gate to the first qubit,
and apply CNOT gates to the second and third qubits, using the first qubit as the control. Plot the
circuit to view its qubits and gates.
gates = [hGate(1);
cxGate(1,2);
cxGate(1,3)];
C = quantumCircuit(gates);
plot(C)
Each horizontal line in the plot represents one of the qubits, and the gates are arranged from left to
right in the order that they are applied.
Simulate Circuit
Simulate the quantum circuit by using the simulate method. All qubits start in the $|0\rangle$ state by
default, but you can use a second input argument to specify a different starting state for the qubits.
Specify that each qubit has an initial state of $|1\rangle$.
S = simulate(C,"111")
S =
Display the basis states and corresponding amplitudes by inspecting the properties of the resulting
quantum.gate.QuantumState object.
S.BasisStates
ans =
"000"
"001"
"010"
"011"
"100"
"101"
"110"
"111"
S.Amplitudes
ans =
0
0
0
0.7071
-0.7071
0
0
0
f = formula(S)
f =
"0.70711 * |011> +
-0.70711 * |100>"
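The amplitudes above can be reproduced without the support package by building each gate as an 8-by-8 matrix with Kronecker products. This is a sketch under the basis ordering used in this chapter (qubit 1 is the leftmost factor); the projector-based construction of the controlled gates and all variable names are illustrative choices.

```matlab
% Single-qubit building blocks
I2 = eye(2);
H  = [1 1; 1 -1]/sqrt(2);
X  = [0 1; 1 0];
P0 = [1 0; 0 0];     % projector onto |0>
P1 = [0 0; 0 1];     % projector onto |1>

% Gate matrices on three qubits (qubit 1 is the leftmost Kronecker factor)
H1   = kron(H,kron(I2,I2));
CX12 = kron(P0,kron(I2,I2)) + kron(P1,kron(X,I2));
CX13 = kron(P0,kron(I2,I2)) + kron(P1,kron(I2,X));

% Initial state |111> is the last of the eight basis states
psi0 = zeros(8,1);
psi0(8) = 1;

% Gates act in circuit order, so the first gate is the rightmost factor
psi = CX13*CX12*H1*psi0
```

The only nonzero entries of psi are at positions 4 and 5, which correspond to the basis states "011" and "100" in the BasisStates listing.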
Specify the Basis name-value argument to display the formula using the X basis. The formula now
shows linear combinations of $|+\rangle$ and $|-\rangle$ states.
f2 = formula(S,Basis="X")
f2 =
"-0.5 * |++-> +
-0.5 * |+-+> +
0.5 * |-++> +
0.5 * |--->"
histogram(S)
[states,P] = querystates(S)
states =
"011"
"100"
P =
0.5000
0.5000
p = probability(S,2,"1")
p =
0.5000
M = randsample(S,50)
M =
T = table(M.Counts,M.MeasuredStates,VariableNames=["Counts","States"])
T =
2×2 table
Counts States
______ ______
21 "011"
29 "100"
See Also
quantumCircuit | quantum.gate.SimpleGate | quantum.gate.CompositeGate |
quantum.gate.QuantumState
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Types of Quantum Gates” on page 17-18
• “Run Quantum Circuit on Hardware Using AWS” on page 17-29
Run Quantum Circuit on Hardware Using AWS
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This topic describes how to connect directly from MATLAB to Amazon Web Services (AWS) to enable
running gate-based quantum circuits on remote quantum devices and simulators. Quantum hardware
has limited availability as well as associated costs for each task that is run. A best practice is to
finalize your circuit design as much as possible by simulating the circuit locally on your computer
before running the circuit on a quantum device. See “Local Quantum State Simulation” on page 17-25
for instructions on how to inspect simulated circuit results on your local computer.
Requirements
To run quantum circuits on an AWS device from within MATLAB, you need:
Costs for the use of Amazon Braket devices are charged to your AWS account. See Amazon Braket
Pricing for details.
The account credentials (AWS Access Key and AWS Secret Access Key) are only available to
download just after they are created. If your AWS account is managed through your organization,
then you might need to contact your administrator for the recommended way to retrieve
credentials.
First, create a file named awsConfig.env that contains all necessary environment variables. At
a minimum, you must specify values for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
AWS_ACCESS_KEY_ID=YOUR_AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY
Next, use the loadenv function with no output argument. This command loads each line as an
environment variable in MATLAB.
loadenv('awsConfig.env')
Optional: Two optional environment variables that you can also specify in the configuration file
are:
• AWS_SESSION_TOKEN: A token for one AWS session that expires after a given time period.
• AWS_DEFAULT_REGION: Restricts the region in which the server storing results must reside.
When AWS_DEFAULT_REGION is set, you can still use QPUs that are located in any region.
3 Create an Amazon S3 bucket to store results. See Create your first S3 bucket for more
information about creating a bucket.
• The bucket you create must have a unique name. A suggested prefix for the name is amazon-
braket-mathworks-.
• If you specified AWS_DEFAULT_REGION in the previous step, then the bucket must reside in
the same region.
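After loading the configuration file, it can be useful to confirm that the required variables are visible to MATLAB before connecting to a device. The following is a minimal sketch that uses only getenv; it reports whether each variable is set and deliberately never prints the values. The variable names required and isSet are hypothetical.

```matlab
% Names of the two required variables from the configuration file
required = {'AWS_ACCESS_KEY_ID','AWS_SECRET_ACCESS_KEY'};

% getenv returns an empty value for a variable that is not set
isSet = cellfun(@(name) ~isempty(getenv(name)),required);

% Report the status of each variable without revealing its value
for k = 1:numel(required)
    if isSet(k)
        fprintf('%s is set\n',required{k});
    else
        fprintf('%s is NOT set\n',required{k});
    end
end
```

If either variable is reported as not set, repeat the loadenv step before creating a device object.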
Device Availability
To see a list of available quantum devices:
1 Go to https://aws.amazon.com/.
2 Log in to your AWS account.
3 Navigate to Amazon Braket using the Services menu or the search bar.
4 Select Devices from the navigation menu.
The device list contains information on device providers, names, availability, and descriptions. Each
device has different associated costs, and you can click on a device to see more details. If a device is
offline, it does not accept tasks and creating a quantum.backend.QuantumDeviceAWS object for
the device returns an error. However, devices that are not currently available might still accept tasks
into their queue to be run later. Check the device details to determine the current status before
submitting tasks.
The MATLAB Support Package for Quantum Computing works with gate-model QPU devices as well
as simulators:
• QPU Devices: These devices are quantum computers used to perform probabilistic measurements
of circuits. Not all QPU devices are supported. To find out if a device is supported, try connecting
to the device using quantum.backend.QuantumDeviceAWS. Availability and number of qubits
supported can also vary, so check the device details before sending circuits for measurement.
• Simulator Devices: Because memory usage scales exponentially with the number of qubits,
larger circuits with many qubits might stretch the limits of memory available on your local
computer. To address that issue, AWS also provides cloud-scale simulators. The functionality is
similar to using randsample on your local system, but the simulators can support up to 34 qubits
with cloud-scale performance.
device =
Name: "SV1"
DeviceARN: "arn:aws:braket:::device/quantum-simulator/amazon/sv1"
Region: "us-west-1"
S3Path: "s3://amazon-braket-mathworks/doc-example"
Note If you receive an error or warning while connecting to a device in this step, then your
connection to AWS might not be set up properly. See “Set Up Access to AWS” on page 17-29 for more
information.
deviceArn: "arn:aws:braket:::device/quantum-simulator/amazon/sv1"
deviceCapabilities: "{"service": {"braketSchemaHeader": {"name": "braket.device_schema.device_service_properties",..."
deviceName: "SV1"
deviceStatus: "ONLINE"
deviceType: "SIMULATOR"
providerName: "Amazon Braket"
When a circuit is measured on quantum hardware, the superposition of states for the qubits collapses
into a single state, and that result is probabilistic. That is, the result of a measurement can change
between runs, but the probability of each result follows the state probabilities produced by the gates
in the circuit. So, it is common to run several shots, or trials, of a circuit to reveal the underlying
probability distribution of results. By default, the run method uses 100 shots.
task = run(C,device)
task =
Status: "queued"
TaskARN: "arn:aws:braket:us-west-1:123456789012:quantum-task/1234abcd-ef56-7890-abc2-34de56f678ab"
Quantum devices have limited availability, so tasks are often queued for execution. Once a task is
finished, the Status property of the task object has a value of "finished". Check the status of a
task by querying the Status property periodically, or use the wait method to internally check the
status until the task is finished.
wait(task)
You can use cancel to cancel the task when needed. Canceling has no effect unless the status of the
task is "queued".
R = fetchOutput(task)
R =
Examine the measured states and their counts in a table. The counts can change slightly from run to
run, but will tend to be evenly divided between the two possible states.
T = table(R.Counts,R.MeasuredStates,VariableNames=["Counts","States"])
T =
2×2 table
Counts States
______ ______
55 "000"
45 "111"
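To see why the counts vary between runs, the shot-based measurement can be imitated with ordinary random numbers. This sketch assumes the ideal case in which the two measured states each have probability 0.5; it is not how the device or randsample is implemented, and all names are hypothetical.

```matlab
states   = {'000','111'};
probs    = [0.5 0.5];                % assumed ideal probabilities
numShots = 100;

% Each shot draws one uniform random number and picks the state whose
% cumulative-probability interval contains it
cumProbs = cumsum(probs);
counts = zeros(1,numel(probs));
for shot = 1:numShots
    idx = find(rand <= cumProbs,1);
    counts(idx) = counts(idx) + 1;
end

for k = 1:numel(states)
    fprintf('%s: %d\n',states{k},counts(k));
end
```

Running the sketch several times gives counts that fluctuate around 50 for each state, in the same way as the measured counts in the table.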
See Also
quantumCircuit | quantum.backend.QuantumDeviceAWS |
quantum.backend.QuantumTaskAWS | quantum.gate.QuantumMeasurement
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Local Quantum State Simulation” on page 17-25
• “Graph Coloring with Grover's Algorithm” on page 17-34
Graph Coloring with Grover's Algorithm
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This example shows how to use Grover's algorithm to solve graph coloring problems. Grover's
algorithm works generically for any problem where a bit string of a given length is classified as valid
or invalid, and the goal is to retrieve one of the valid bit strings. The algorithm uses a state oracle to
determine whether a bit string is valid. Although this application of Grover's algorithm is not the most
efficient method to solve the graph coloring problem in practice, it illustrates how a quantum
algorithm can be applied to a well-known problem.
Because the graph has only four nodes and four edges, there are a limited number of ways to color
the nodes in the graph. If each node can be one of two different colors, then there are $2^4 = 16$ ways to
color the graph nodes.
A solution of the graph coloring problem is a color assignment such that adjacent nodes are not the
same color. Therefore, the quantum state oracle must check that for each edge in a given graph, the
nodes it connects have different colors. The oracle accomplishes this check by accepting a bit string
of length 4 (represented by 4 qubits), each of which represents the color of one node, and
determining whether the colors produce a valid graph coloring.
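The check that the oracle performs can be previewed classically by testing all 16 bit strings. The sketch below is an illustration in base MATLAB, not part of the quantum algorithm. The edge list is an assumption (a four-node cycle chosen to be consistent with the results reported later in this example), and the variable names are hypothetical.

```matlab
% Assumed edge list for a four-node, four-edge graph (a 4-cycle); the
% edges of the graph used in this example may differ
edges = [1 2; 1 3; 2 4; 3 4];
numNodes = 4;

% Rows of colorings are all 2^4 bit strings; column k is the color of node k
colorings = dec2bin(0:2^numNodes-1) - '0';

% A coloring is valid when every edge joins nodes of different colors
isValid = all(colorings(:,edges(:,1)) ~= colorings(:,edges(:,2)),2);
validColorings = cellstr(char(colorings(isValid,:) + '0'))
```

For this assumed graph, only the bit strings 0110 and 1001 pass the check. Grover's algorithm aims to find such strings without enumerating all candidates.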
Create Oracle
As a first step to creating the oracle, create a helper circuit called xorGate. This circuit reads qubits
1 and 2, and flips qubit 3 from $|0\rangle$ to $|1\rangle$ if qubits 1 and 2 have different values.
xorInnerGates = [cxGate(1,3);
cxGate(2,3)];
xorGate = quantumCircuit(xorInnerGates,Name="XOR");
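On basis states, each controlled X gate adds its control bit to the target bit modulo 2, so the two gates together compute an exclusive OR. The following base MATLAB sketch tabulates that behavior at the bit level, assuming qubit 3 starts at 0; it does not use the xorGate circuit itself, and the variable names are hypothetical.

```matlab
% Bit-level view of the XOR helper circuit on basis states
truthTable = zeros(4,3);      % columns: q1, q2, final value of q3
row = 0;
for q1 = 0:1
    for q2 = 0:1
        q3 = 0;               % helper qubit starts at 0
        q3 = xor(q3,q1);      % effect of cxGate(1,3)
        q3 = xor(q3,q2);      % effect of cxGate(2,3)
        row = row + 1;
        truthTable(row,:) = [q1 q2 q3];
    end
end
truthTable
```

The last column is 1 exactly when the first two columns differ, which is the condition that the oracle needs for each edge.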
Next, loop through all edges in the graph and write the value of xorGate applied to the startNode
and endNode qubits into a new helper qubit representing the edge. This increases the number of
qubits from 4 to 8 because there are four edges. The first four qubits represent the bit string to
evaluate, while the other qubits begin in state $|0\rangle$ before xorGate is applied.
[startNodes,endNodes] = findedge(g);
N = numnodes(g);
E = numedges(g);
gates = [];
for k = 1:E
gates = [gates;
compositeGate(xorGate,[startNodes(k),endNodes(k),N+k])];
end
Next, the oracle must verify that the helper qubits representing the edges are in state $|1\rangle$, since
xorGate returns $|1\rangle$ when the two nodes connected by the edge have different states. This
verification requires the addition of a ninth qubit that has a state of $|1\rangle$ when the coloring is valid,
and $|0\rangle$ otherwise. Use the mcxGate function to determine the value of this final qubit, and create a
circuit for the oracle.
lastQubit = N+E+1;
gates = [gates;
mcxGate(N+(1:E),lastQubit,[])];
oracle = quantumCircuit(gates,lastQubit);
oracle.Name = "Oracle"
oracle =
NumQubits: 9
Gates: [5×1 quantum.gate.CompositeGate]
Name: "Oracle"
Plot the circuit representing the oracle to view all of the qubits and gates. Use the QubitBlocks
option to draw red lines between the different groups of qubits: the first four qubits at the top of the
circuit diagram represent the node colors being checked, the next four qubits are the helper qubits to
check the nodes connected by each edge, and the final qubit at the bottom determines whether the
graph coloring is valid.
plot(oracle,QubitBlocks=[4 4 1])
Construct two input configurations, one valid (1001) and one invalid (1111), and apply the oracle
circuit to each of them to check its behavior. The input configuration is encoded in the left-most four
characters of each bit string, and the value of interest is the last qubit (the right-most character) in
the output bit string.
Start with the invalid coloring 1111, where all nodes are the same color.
invalidColors = "111100000";
inputState1 = quantum.gate.QuantumState(invalidColors);
formula(inputState1)
ans =
"1 * |111100000>"
outputState1 = simulate(oracle,inputState1);
formula(outputState1)
ans =
"1 * |111100000>"
The last qubit of the simulated state is 0, so the oracle correctly determines that not all of the nodes
can be the same color. Next, check the coloring 1001.
validColors = "100100000";
inputState2 = quantum.gate.QuantumState(validColors);
formula(inputState2)
ans =
"1 * |100100000>"
outputState2 = simulate(oracle,inputState2);
formula(outputState2)
ans =
"1 * |100111111>"
In this case, the last qubit is 1, so 1001 is a valid graph coloring, where each edge connects nodes of
different colors. This check confirms that the behavior of the oracle is correct for these two cases.
The oracle currently works correctly, but it alters the input bits during execution. A requirement of
Grover's algorithm is that the oracle must be a state oracle. That is, after the oracle is applied to a
standard state, the bit values must be the same as the input state with only the phase changing. A
valid bit string (indicating a valid coloring of the graph) should have a phase of -1 after the oracle
operates, while an invalid bit string should have an unchanged phase of 1 after the oracle operates.
Achieving this behavior requires additional modifications to the oracle.
Start by adding gates that revert all the helper qubits for the edges to $|0\rangle$ by flipping any values that
were previously flipped.
oracle.Gates = [oracle.Gates;
flip(oracle.Gates(1:end-1))];
Next, note how state $|-\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$ interacts with the X gate. Applying the X gate, which flips
between the states $|0\rangle$ and $|1\rangle$, flips the phase of the $|-\rangle$ state:
$X|-\rangle = X(|0\rangle - |1\rangle)/\sqrt{2} = (|1\rangle - |0\rangle)/\sqrt{2} = -|-\rangle$. This behavior is ideal for the last qubit in the
oracle. So, passing that qubit into the oracle in state $|-\rangle$ ensures that the output bit string is the
same as the input, while indicating valid graph colorings with an altered phase of -1.
plot(oracle)
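The phase flip that the state oracle relies on is an eigenvalue property of the Pauli X matrix, and it can be checked numerically in base MATLAB. In this sketch the variable names are hypothetical.

```matlab
X = [0 1; 1 0];
minusState = [1; -1]/sqrt(2);
plusState  = [1; 1]/sqrt(2);

% |-> is an eigenvector of X with eigenvalue -1, so X only flips its sign
flippedMinus = X*minusState;
phaseOnMinus = minusState'*flippedMinus     % -1 up to round-off

% |+> is an eigenvector with eigenvalue +1 and is left unchanged
phaseOnPlus = plusState'*(X*plusState)      % 1 up to round-off
```

Because the last qubit stays in the same state and only acquires a sign, the sign is carried by the whole bit string, which is the behavior required of a state oracle.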
To check the behavior of the state oracle, use the same two input bit strings as before. Start with the
invalid graph coloring 1111, where all nodes are the same color.
invalidColors = "11110000-";
inputState1 = quantum.gate.QuantumState(invalidColors);
formula(inputState1)
ans =
"1 * |11110000->"
outputState1 = simulate(oracle,inputState1);
formula(outputState1)
ans =
"1 * |11110000->"
The oracle correctly determines that the graph coloring is not valid. The phase of the output bits is
unchanged. Next, check the coloring 1001.
validColors = "10010000-";
inputState2 = quantum.gate.QuantumState(validColors);
formula(inputState2)
ans =
"1 * |10010000->"
outputState2 = simulate(oracle,inputState2);
formula(outputState2)
ans =
"(-1-3.8858e-16i) * |10010000->"
In this case, the state oracle correctly flips the phase to -1 because the graph coloring is valid. The
small nonzero imaginary part is due to floating-point round-off error.
This function creates a state oracle circuit for a specified input graph.
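function oracle = oracleGraphColoring(g)
% Create a state oracle circuit that checks two-color colorings of graph g
% Helper circuit that flips qubit 3 if qubits 1 and 2 have different values
xorInnerGates = [cxGate(1,3);
                 cxGate(2,3)];
xorGate = quantumCircuit(xorInnerGates,Name="XOR");
% Apply xorGate to the two node qubits of each edge and write the result
% into one helper qubit per edge
[startNodes,endNodes] = findedge(g);
N = numnodes(g);
E = numedges(g);
gates = [];
for k = 1:E
    gates = [gates;
             compositeGate(xorGate,[startNodes(k),endNodes(k),N+k])];
end
% Flip the last qubit when all helper qubits are in state 1
lastQubit = N+E+1;
gates = [gates; mcxGate(N+(1:E),lastQubit,[])];
oracle = quantumCircuit(gates,lastQubit);
oracle.Name = "Oracle";
% Revert the helper qubits by repeating the XOR steps in reverse order
oracle.Gates = [oracle.Gates;
                flip(oracle.Gates(1:end-1))];
end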
Diffuser Code
This function creates a diffuser circuit with a specified number of nodes. Save this function in a file on
the MATLAB path.
function cg = diffuser(n)
gates = [hGate(1:n);
xGate(1:n);
hGate(n);
mcxGate(1:n-1, n, []);
hGate(n);
xGate(1:n);
hGate(1:n)];
cg = quantumCircuit(gates,Name="Diffuser");
end
Create a diffuser circuit for the four-node graph coloring problem and plot the circuit.
diffuserG = diffuser(N);
plot(diffuserG)
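Verify that the diffuser applies a reflection of $I - 2|e\rangle\langle e|$, where $I$ is the $2^N$-by-$2^N$ identity matrix
and $|e\rangle$ is the normalized vector of all ones, by comparing the circuit matrix of
diffuserG with a manually constructed reflection matrix.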
e = ones(2^N,1);
e = e/norm(e);
M = eye(2^N) - 2*(e*e');
norm(getMatrix(diffuserG) - M)
ans =
1.2243e-15
The result is zero (up to round-off error), which confirms that the diffuser circuit applies the expected
reflection.
gates = [hGate(1:N);
xGate(oracle.NumQubits);
hGate(oracle.NumQubits)];
c = quantumCircuit(gates);
Check the circuit setup to make sure it produces the expected state.
s = simulate(c);
formula(s)
ans =
"1 * |++++0000->"
Express the formula in the Z basis to confirm that all combinations of states are represented equally.
formula(s, Basis="Z")
ans =
"0.17678 * |000000000> +
-0.17678 * |000000001> +
0.17678 * |000100000> +
-0.17678 * |000100001> +
0.17678 * |001000000> +
-0.17678 * |001000001> +
0.17678 * |001100000> +
-0.17678 * |001100001> +
0.17678 * |010000000> +
-0.17678 * |010000001> +
0.17678 * |010100000> +
-0.17678 * |010100001> +
0.17678 * |011000000> +
-0.17678 * |011000001> +
0.17678 * |011100000> +
-0.17678 * |011100001> +
0.17678 * |100000000> +
-0.17678 * |100000001> +
0.17678 * |100100000> +
-0.17678 * |100100001> +
0.17678 * |101000000> +
-0.17678 * |101000001> +
0.17678 * |101100000> +
-0.17678 * |101100001> +
0.17678 * |110000000> +
-0.17678 * |110000001> +
0.17678 * |110100000> +
-0.17678 * |110100001> +
0.17678 * |111000000> +
-0.17678 * |111000001> +
0.17678 * |111100000> +
-0.17678 * |111100001>"
Next, apply the oracle and diffuser composite gates two times in the circuit and then plot the
resulting circuit. (The number of times that the oracle and diffuser need to be applied depends on the
angle of rotation for a given problem, which can be determined based on what percentage of the
possible inputs are valid. You can try applying the oracle and diffuser different numbers of times to
view the effect on the outputs.)
gates = [gates;
compositeGate(oracle,1:9);
compositeGate(diffuserG,1:4);
compositeGate(oracle,1:9);
compositeGate(diffuserG,1:4)];
c = quantumCircuit(gates);
plot(c)
Simulate the circuit to see which node colorings are most likely to be measured. Plot a histogram of
the results for the first four qubits.
s = simulate(c);
histogram(s,1:4)
[outputBits,probabilities] = querystates(s,1:4);
T = table(outputBits,probabilities);
T = sortrows(T,"probabilities","descend")
T =
16×2 table
outputBits probabilities
__________ _____________
"1001" 0.47266
"0110" 0.47266
"1000" 0.0039063
"0001" 0.0039063
"0111" 0.0039063
"1110" 0.0039063
"1101" 0.0039063
"0100" 0.0039063
"0101" 0.0039063
"1011" 0.0039063
"1100" 0.0039063
"0010" 0.0039063
"0011" 0.0039063
"1010" 0.0039063
"0000" 0.0039062
"1111" 0.0039062
The simulation results show that the two states most likely to be measured are 0110 and 1001, which
are the two possible valid graph colorings.
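The probabilities in the table can be cross-checked with plain linear algebra on the 16-dimensional space of node colorings. The sketch below is a reduced model, not the nine-qubit circuit: it assumes that the state oracle acts on the node qubits as a pure phase of -1 on the two valid colorings, and it uses the same reflection matrix that was compared with the diffuser circuit. All variable names are hypothetical.

```matlab
numNodes = 4;
dim = 2^numNodes;

% Indices (1-based) of the two valid colorings reported above
validIdx = [bin2dec('0110') bin2dec('1001')] + 1;

% Reduced state oracle: phase -1 on valid colorings, +1 elsewhere
phases = ones(dim,1);
phases(validIdx) = -1;
O = diag(phases);

% Reflection I - 2*e*e' about the uniform superposition e
e = ones(dim,1)/sqrt(dim);
D = eye(dim) - 2*(e*e');

% Start in the uniform superposition and apply oracle and diffuser twice
psi = e;
for iter = 1:2
    psi = D*(O*psi);
end
probs = abs(psi).^2;
validProbs = probs(validIdx)'     % about 0.47266 each
otherProb  = probs(1)             % about 0.0039063
```

In this model the amplitudes of the valid colorings grow from 0.25 to 0.625 after one iteration and to 0.6875 after two, which gives the probability 0.6875^2 = 0.47266 shown in the table.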
To run circuits on AWS, you need an AWS account with access to Amazon Braket. See “Run Quantum
Circuit on Hardware Using AWS” on page 17-29 for more information on how to set up access and
see available devices.
device =
Name: "SV1"
DeviceARN: "arn:aws:braket:::device/quantum-simulator/amazon/sv1"
Region: "us-east-1"
S3Path: "s3://amazon-braket-mathworks/doc-examples"
Create a task to run the circuit on the device. By default, run uses 100 shots.
task = run(c,device)
task =
Status: "queued"
TaskARN: "arn:aws:braket:us-east-1:123456789012:quantum-task/1234abcd-ef56-7890-abc2-34de56f678ab"
Quantum devices have limited availability, so tasks are often queued for execution. Once the task is
finished, the Status property of the task object has a value of "finished". Check the status of the
task by querying the Status property periodically, or use the wait method to internally check the
status until the task is finished.
wait(task)
Once the task is finished, retrieve the result of running the circuit by using fetchOutput.
R = fetchOutput(task)
R =
T = table(R.Counts,R.MeasuredStates,VariableNames=["Counts","States"])
T =
8×2 table
Counts States
______ ___________
1 "000000001"
1 "001000001"
1 "010000001"
28 "011000000"
24 "011000001"
22 "100100000"
22 "100100001"
1 "110000001"
Plot a histogram of the results for the first four qubits, which correspond to the node colors.
histogram(R,1:4)
The results agree with the local simulation, showing 0110 and 1001 as valid graph colorings. The
results solve the graph coloring problem by indicating that nodes 2 and 3 in the graph can be the
same color and that nodes 1 and 4 can be the same color.
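The marginal counts behind this conclusion can be reproduced by hand from the table of measured states. The following is a minimal sketch in base MATLAB, assuming that the first four characters of each state string correspond to qubits 1 through 4 (the node colors); the variable names are illustrative and are not part of the original example.

```matlab
% Counts and measured 9-qubit states copied from the table above.
counts = [1 1 1 28 24 22 22 1]';
states = {'000000001'; '001000001'; '010000001'; '011000000'; ...
          '011000001'; '100100000'; '100100001'; '110000001'};

% Keep only the first four characters (assumed to be qubits 1-4).
colorBits = cellfun(@(s) s(1:4), states, 'UniformOutput', false);

% Sum the counts of all rows that share the same 4-bit prefix.
[uniqueBits, ~, groupIdx] = unique(colorBits);
marginalCounts = accumarray(groupIdx(:), counts);

% List the 4-bit colorings from most to least frequent.
[sortedCounts, order] = sort(marginalCounts, 'descend');
for k = 1:numel(order)
    fprintf('%s  %d\n', uniqueBits{order(k)}, sortedCounts(k));
end
```

Summing over the remaining qubits in this way is what a histogram restricted to a subset of qubits reports, up to normalization by the number of shots.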
See Also
quantumCircuit | quantum.gate.QuantumState | quantum.gate.QuantumMeasurement
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Types of Quantum Gates” on page 17-18
Ground-State Protein Folding Using Variational Quantum Eigensolver (VQE)
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This example shows an efficient method for using qubits to encode a protein fold on a 3-D tetrahedral
lattice [1],[2]. The ground state is found through a simulated variational quantum eigensolver (VQE)
routine. The final circuit from the simulation is then run on a real QPU for comparison. Running this
example requires Global Optimization Toolbox as well as MATLAB Support Package for Quantum
Computing.
The protein is a neuropeptide with seven amino acids, APRLRFY, pictured below. The example
assumes a coarse-grained protein model, where "beads" representing amino acids can traverse the
lattice and interact with each other. Each bond between amino acids can be in one of four directions,
corresponding to corners of a tetrahedron. These four turns can be represented by two qubits.
Configuration Qubits
A protein of N beads (here N = 7) can make N-1 turns, so 2*(N-1) = 12 bits are required for the
APRLRFY protein bonds. However, without loss of generality, the first two turns can be fixed to 01
and 00. One of the bits in the third turn is also fixed due to other symmetry considerations. The
turn2qubit mapping lists the 12 bits: the 5 fixed bits appear with their values, and the 7 variable
bits, which are represented by qubits, appear as q. For details, see the dense encoding scheme in [2].
hyperParams.protein = 'APRLRFY';
hyperParams.turn2qubit = '0100q1qqqqqq';
hyperParams.numQubitsConfig = sum(hyperParams.turn2qubit=='q');
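To make the bookkeeping concrete, the mapping string can be split into one row per turn. This is a small illustrative sketch in base MATLAB; the helper variable names (numTurns, turnTable, and so on) are assumptions introduced here and do not appear in the original example.

```matlab
turn2qubit = '0100q1qqqqqq';          % same mapping string as in the example

numTurns        = numel(turn2qubit)/2;    % two bits encode each turn
isVariable      = (turn2qubit == 'q');    % true where a qubit is used
numVariableBits = nnz(isVariable);        % bits represented by qubits
numFixedBits    = nnz(~isVariable);       % bits with a fixed value

% One row per turn: rows 1 and 2 are fully fixed, row 3 is half fixed.
turnTable = reshape(turn2qubit, 2, []).';
disp(turnTable)
fprintf('%d turns, %d fixed bits, %d qubits\n', ...
    numTurns, numFixedBits, numVariableBits);
```

The count of 'q' characters is the same quantity that the example stores in hyperParams.numQubitsConfig.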
Interaction Qubits
The methods of [1] and [2] can consider interactions between an arbitrary number of
nearest neighbors (NN). This example considers only 1-NN interaction terms, which are possible
only between some beads due to the structure of the lattice. Beads separated by fewer than 5 bonds cannot
be 1-NN (Eq. 29 of [2]). There are two interaction pairs in the length-7 protein, between beads 1 and
6, and between beads 2 and 7. Two interaction bits are used to specify whether there is any
interaction between the beads in each of these pairs.
hyperParams.numQubitsInteraction = 2;
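The two interaction pairs can be enumerated directly. The sketch below reuses the loop bounds that appear later in the plotProtein function of this example (a minimum separation of 5 bonds and a step of 2); the variable names pairs and numInteractionQubits are illustrative assumptions.

```matlab
N = 7;                       % number of beads in APRLRFY

pairs = zeros(0,2);          % each row is one candidate 1-NN pair [i j]
for i = 1:(N-5)
    for j = (i+5):2:N        % same bounds and step as in plotProtein
        pairs(end+1,:) = [i j]; %#ok<AGROW>
    end
end

disp(pairs)
numInteractionQubits = size(pairs,1);   % one interaction bit per pair
```

For N = 7 the loop produces exactly the pairs named in the text, which is why two interaction qubits suffice.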
A matrix of contact energy values is used as a look-up table for pairwise interaction between different
amino acids. For this example, the values are chosen at random.
Call the buildMJInteractions function to find the interaction energies for the protein.
hyperParams.interactionEnergy = buildMJInteractions(hyperParams.protein);
% H = Hgc + Hin_1
energies = zeros(size(bitstrings,1),1);
numBeads = length(hyperParams.protein);
% Count number of adjacent turns which are equal and impose a penalty for each
energies(idx) = lambdaBack*sum(turns(1:end-1) == turns(2:end));
currInteractionQubit = currInteractionQubit+1;
if bitstring(currInteractionQubit)=='0'
continue;
end
end
end
end
end
end
allFolds = dec2bin(0:2^hyperParams.numQubitsTotal-1,hyperParams.numQubitsTotal);
allEnergies = exactHamiltonian(allFolds,hyperParams);
hyperParams.GroundState.Energy = min(allEnergies);
hyperParams.GroundState.Index = find(allEnergies == hyperParams.GroundState.Energy);
allFolds(hyperParams.GroundState.Index,:)
ans =
'101000110'
'101001010'
There are two lowest-energy folds because the last amino acid can occupy either position in the
lattice. Find the energy of each of the identified folds.
allEnergies(hyperParams.GroundState.Index)
ans = 2×1
-4.7973
-4.7973
This minimum-energy value matches the interaction energy between beads 1 and 6.
Specifically, after the energy is computed for each observed fold, the associated probabilities are
sorted by energy. The objective function returns an expectation energy computed from the low-energy tail of
the probability distribution, cut off by an alpha parameter. This expectation energy is a conditional
value at risk (CVaR). An alpha value of 0.05 was used experimentally in [1], but for a noise-free
simulation ProteinVQEObjective uses a smaller cutoff value of 0.025.
% There are 10 qubits in the circuit, but only these 9 are used to define a fold
foldQubits = [1:8 10];
% Sample, and query/get the states and probabilities of the fold qubits
qMeasurement = randsample(qState, hyperParams.numShots);
[states, probs] = querystates(qMeasurement, foldQubits);
% Compute CVaR over the low energy tail of the energy distribution,
% delimited by a cutoff parameter alpha.
alpha = .025;
cut_idx = nnz(cumsum(probs) < alpha);
cvar_probs = probs(1:cut_idx);
cvar_probs(end+1) = alpha - sum(cvar_probs);
% Compute expectation energy as the sum of cutoff state energies weighted by their probability
energy = dot(cvar_probs, energies(1:cut_idx+1))/alpha;
end
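The CVaR step can be examined in isolation with a toy distribution. This sketch applies the same four statements as the objective function above to hypothetical energies and probabilities (already sorted by ascending energy) and a deliberately large, hypothetical cutoff so that the arithmetic is easy to follow.

```matlab
% Hypothetical data, sorted so that the lowest energy comes first.
energies = [-3; -1; 0; 2];
probs    = [0.10; 0.20; 0.30; 0.40];
alpha    = 0.25;              % hypothetical cutoff, larger than in the example

% Number of states whose cumulative probability stays below alpha.
cut_idx = nnz(cumsum(probs) < alpha);

% Keep those states in full, then take only as much of the next state
% as is needed to bring the total weight up to alpha.
cvar_probs = probs(1:cut_idx);
cvar_probs(end+1) = alpha - sum(cvar_probs);

% Expectation energy over the low-energy tail, renormalized by alpha.
energy = dot(cvar_probs, energies(1:cut_idx+1))/alpha;
fprintf('CVaR energy = %.4f\n', energy);
```

Here the tail contains all of the first state (weight 0.10) and part of the second (weight 0.15), so the result is (0.10*(-3) + 0.15*(-1))/0.25 = -1.8. A smaller alpha focuses the objective more tightly on the lowest-energy outcomes.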
Define the number of shots to use in ProteinVQEObjective, and create a function handle that
passes in all of the parameter values to ProteinVQEObjective.
hyperParams.numShots = 1024;
objFcn = @(theta) ProteinVQEObjective(theta,hyperParams);
cxGate(5,10)
cxGate(10,9)
cxGate(9,8)
cxGate(8,9)
cxGate(9,8)
cxGate([8 7 6], [7 6 1])
ryGate([1:8 10], parameters(2,:))
];
ansatz = quantumCircuit(gates);
end
Call ProteinConfigAnsatz with random angles to construct the circuit. Plot the circuit to view the
qubits and gates.
ansatz = ProteinConfigAnsatz(rand(2,9));
plot(ansatz)
Define the number of angles and set options for the maximum number of function evaluations,
plotting function, and initial points. Then call surrogateopt with the objective function, upper and
lower bounds for the angles, and options.
numAngles = 2*hyperParams.numQubitsTotal;
rng default
options = optimoptions("surrogateopt",...
"MaxFunctionEvaluations",10, ...
"PlotFcn","optimplotfval",...
"InitialPoints",pi*ones(numAngles,1));
lb = repmat(-pi,numAngles,1);
ub = repmat(pi,numAngles,1);
[angles,minEnergy] = surrogateopt(objFcn,lb,ub,[],[],[],[],[],options);
Call the objective function using the optimized angles to find the state bit string with highest
probability as well as its associated energy.
[groundStateEnergy,groundStateFold] = ProteinVQEObjective(angles,hyperParams)
groundStateEnergy = -4.7973
groundStateFold = '101000110'
This ground-state fold is one of the two folds obtained earlier with the minimum energy calculation.
allFolds(allEnergies==minEnergy,:)
ans =
'101000110'
'101001010'
To visualize the ground-state fold, save the plotProtein function in a separate file on the path. This
function enables you to plot the protein structure from the qubit values.
function plotProtein(bitstring,hyperParams)
% Plot protein structure from the bitstring
% Number of beads
N = length(hyperParams.protein);
% Plot the beads connected by lines, and add a text label to each
figure
plot3(beads(:,1),beads(:,2),beads(:,3),'.-','LineWidth',2,'MarkerSize',80,'SeriesIndex',1)
axis off
for i=1:(N-5)
for j=(i+5):2:N
if bitstring(currInteractionQubit) == '1'
interactions = [interactions;beads(i,:);beads(j,:);nan*ones(1,3)]; %#ok<AGROW>
end
currInteractionQubit = currInteractionQubit+1;
end
end
if ~isempty(interactions)
hold on
plot3(interactions(:,1),interactions(:,2),interactions(:,3),'k--','LineWidth',2)
hold off
legend([hyperParams.protein+" Protein Structure";"Interactions"], "Location","southoutside")
else
legend(hyperParams.protein+" Protein Structure", "Location","southoutside")
end
end
The protein has two possible interactions considered in the 1-NN model. The fold with the lowest
energy only has one of these interactions. Use plotProtein to visualize the lowest energy fold.
plotProtein(groundStateFold,hyperParams)
Next, construct the quantum circuit using the optimized angles and simulate the circuit to see the
expected probability distribution over states. Use a threshold to filter out states with probabilities
less than 2%. The ground-state fold 101000110 appears again as the state with the highest
probability.
optimized_circuit = ProteinConfigAnsatz(angles);
sv = simulate(optimized_circuit);
histogram(sv,[1:8 10],Threshold=0.02)
reg = "us-east-1";
bucketPath = "s3://amazon-braket-mathworks/doc-examples";
device = quantum.backend.QuantumDeviceAWS("Harmony",S3Path=bucketPath,Region=reg)
Create a task to run the circuit ansatz with optimized angles on the QPU. Specify the number of shots
as 1,000.
task = run(optimized_circuit,device,NumShots=1000);
wait(task)
Fetch the results and plot a histogram of the states for all but the ninth qubit. Specify a lower
threshold of 2% to filter out unlikely states.
results = fetchOutput(task);
histogram(results,[1:8 10],Threshold=0.02)
The ground-state fold 101000110 appears again as the state with the highest probability, so the QPU
results agree with the local simulation of the circuit.
References
[1] Robert, Anton, Panagiotis Kl. Barkoutsos, Stefan Woerner, and Ivano Tavernelli. “Resource-
Efficient Quantum Algorithm for Protein Folding.” Npj Quantum Information 7, no. 1
(February 17, 2021): 38. https://doi.org/10.1038/s41534-021-00368-4.
[2] Robert, Anton, Panagiotis Kl. Barkoutsos, Stefan Woerner, and Ivano Tavernelli. “Supplementary
Information for ‘Resource-Efficient Quantum Algorithm for Protein Folding.’” Npj Quantum
Information 7, no. 1 (February 17, 2021): 38. https://doi.org/10.1038/s41534-021-00368-4.
See Also
quantumCircuit
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Run Quantum Circuit on Hardware Using AWS” on page 17-29
• “Graph Coloring with Grover's Algorithm” on page 17-34
Solve XOR Problem Using Quantum Neural Network (QNN)
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This example shows how to solve the XOR problem using a trained quantum neural network (QNN).
You use the network to classify the classical data of 2-D coordinates. A QNN is a machine learning
model that combines quantum computing layers and classical layers. This example shows how to train
such a hybrid network for a classification problem that is nonlinearly separable, such as the
exclusive-OR (XOR) problem. To run this example, you must install MATLAB Support Package for
Quantum Computing and Deep Learning Toolbox™.
In the XOR problem, two-dimensional (2-D) data points are classified based on the region of their x-
and y-coordinates using a mapping function that resembles the XOR function. If the x- and y-
coordinates are both in region 0 or 1, then the data are classified into class "0". Otherwise, the data
are classified into class "1". In this problem, a single linear decision boundary cannot solve the
classification problem. Instead, nonlinear decision boundaries are required to classify the data.
This example is a proof of concept for training a QNN using a local simulation. For
more general frameworks using a hybrid quantum-classical model to classify quantum and classical
data, see references [1] and [2]. The QNN in this example consists of four layers:
Finally, the network computes a loss function based on the categorical cross-entropy between the
predictions and the labels. The network then propagates the gradients of the loss with respect to the
learnable parameters through the layers to train the QNN using stochastic gradient descent with
momentum (SGDM).
numSamples = 200;
[X,Y] = generateData(numSamples);
classNames = ["Blue", "Yellow"];
The input for each data point has the form of 2-D coordinates. Specify the number of classes in the
training data.
inputSize = size(X,2);
numClasses = numel(classNames);
Forward Pass
The quantum circuit consists of two qubits that are initially in the 0 state. Construct the quantum
circuit by applying an RX gate with the rotation angle θ1 to the first qubit and an RX gate with the
rotation angle θ2 to the second qubit, followed by a controlled NOT gate with the first qubit as the
control and the second qubit as the target. These gates prepare the states of the qubits according to the
coordinates of the input data by introducing two learnable parameters A and B in the
rotation angles. These parameters are the scaling factors for the rotation angles of each qubit, which
scale the x- and y-coordinates of the XOR problem to θ1 = Ax and θ2 = By. You then perform a
measurement on the second qubit in the Z basis. The quantity of interest is the magnetization of the
second qubit, which is the difference in counts of this qubit being in the 0 state and the 1 state.
For this quantum circuit, the measured quantity ⟨Z⟩ has a predicted form of cos θ1 cos θ2 based on
the states of the qubits. You then use the condition ⟨Z⟩ = 0 to determine the classification
boundaries of the XOR problem. In this conceptual example, you use local simulation to determine the
probabilities of measuring the qubit in these states instead of real counts on quantum hardware.
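The predicted form cos θ1 cos θ2 can be checked with a small state-vector calculation that uses only base MATLAB matrices instead of the quantumCircuit object. This is a minimal sketch: the gate matrices are the standard RX and CNOT definitions, the basis ordering 00, 01, 10, 11 (first qubit as the leading bit) is an assumption of this sketch, and the angle values are arbitrary.

```matlab
% Standard single-qubit RX rotation and two-qubit CNOT (control = qubit 1).
rx   = @(t) [cos(t/2) -1i*sin(t/2); -1i*sin(t/2) cos(t/2)];
cnot = [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];

theta1 = 0.7;     % arbitrary test angles
theta2 = 1.9;

% Start in state 00, rotate each qubit, then apply the CNOT.
psi0 = [1; 0; 0; 0];
psi  = cnot*kron(rx(theta1), rx(theta2))*psi0;

% Probabilities of the basis states 00, 01, 10, 11.
p = abs(psi).^2;

% Magnetization of the second qubit: P(second = 0) - P(second = 1).
Zexp = (p(1) + p(3)) - (p(2) + p(4));
fprintf('simulated: %.6f, cos*cos: %.6f\n', Zexp, cos(theta1)*cos(theta2));
```

Because the CNOT only permutes computational basis states, the second qubit ends up holding the XOR of two independent bits, which is why the expectation value factorizes into a product of cosines.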
Backpropagation
To train the network, the derivative of the loss function through the quantum computing layer needs
to be backpropagated. Backpropagation requires the computation of the gradients of ⟨Z⟩ with
respect to the learnable parameters. To find these gradients, use the parameter-shift rules that are
valid at the operator level, as described in [3] and [4]. For this quantum circuit, these equations give
the gradients of ⟨Z⟩ with respect to the learnable parameters A and B.
$$\frac{\partial}{\partial A}\langle Z\rangle(A,B) = x\,\frac{\langle Z\rangle(A+s,B) - \langle Z\rangle(A-s,B)}{2\sin(sx)}$$
$$\frac{\partial}{\partial B}\langle Z\rangle(A,B) = y\,\frac{\langle Z\rangle(A,B+s) - \langle Z\rangle(A,B-s)}{2\sin(sy)}$$
As in [3], the gradients of the expectation values are exact for any choice of s as long as s is not an
integer multiple of π. This example chooses s = π/4.
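As a numerical sanity check, the parameter-shift expressions can be compared with the analytic derivatives of the closed form ⟨Z⟩ = cos(Ax) cos(By) stated earlier. The sketch below uses that closed form in place of a circuit simulation; the function handle Zfun and the test values of A, B, x, and y are illustrative assumptions.

```matlab
% Closed-form expectation value stated in the text.
Zfun = @(A,B,x,y) cos(A.*x).*cos(B.*y);

A = 0.8;  B = 1.3;      % arbitrary parameter values
x = 0.6;  y = 1.7;      % arbitrary input coordinates
s = pi/4;               % shift used in this example

% Parameter-shift gradients.
dZdA_shift = x.*(Zfun(A+s,B,x,y) - Zfun(A-s,B,x,y))./(2*sin(s.*x));
dZdB_shift = y.*(Zfun(A,B+s,x,y) - Zfun(A,B-s,x,y))./(2*sin(s.*y));

% Analytic derivatives of cos(A*x)*cos(B*y).
dZdA_exact = -x.*sin(A.*x).*cos(B.*y);
dZdB_exact = -y.*cos(A.*x).*sin(B.*y);

fprintf('dZ/dA: shift %.6f, exact %.6f\n', dZdA_shift, dZdA_exact);
fprintf('dZ/dB: shift %.6f, exact %.6f\n', dZdB_shift, dZdB_exact);
```

The agreement is exact up to rounding rather than approximate, because the difference cos(Ax + sx) - cos(Ax - sx) equals -2 sin(Ax) sin(sx), and the sin(sx) factor cancels against the denominator.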
To create a custom layer for the quantum circuit, create the PQCLayer class with this definition:
properties (Learnable)
% Define layer learnable parameters.
A
B
end
methods
function layer = PQCLayer
% Set layer name.
layer.Name = "PQC";
function Z = predict(layer,X)
% Z = predict(layer,X) forwards the input data X through the
% layer and outputs the result Z at prediction time.
Z = computeZ(X,layer.A,layer.B);
end
s = pi/4;
ZPlus = computeZ(X,layer.A + s,layer.B);
ZMinus = computeZ(X,layer.A - s,layer.B);
dZdA = X(1,:).*((ZPlus - ZMinus)./(2*sin(X(1,:).*s)));
dLdA = sum(dLdZ.*dZdA,"all");
dLdB = sum(dLdZ.*dZdB,"all");
function Z = computeZ(X, A, B)
numSamples = size(X,2);
x1 = X(1,:);
x2 = X(2,:);
Z = zeros(1,numSamples,"like",X);
for i = 1:numSamples
circ = quantumCircuit(2);
circ.Gates = [rxGate(1,x1(i)*A); rxGate(2,x2(i)*B); cxGate(1,2)];
s = simulate(circ);
Z(i) = probability(s,2,"0") - probability(s,2,"1");
end
end
• Create a feature input layer with observations consisting of two features. These features
correspond to the coordinates of the XOR problem.
• Specify a quantum computing layer using the PQCLayer class.
• For classification, specify a fully connected layer with a size equal to the number of classes.
• Map the output to probabilities by including a two-output softmax layer.
• Create an output classification layer that computes the cross-entropy loss between the true labels
and the probabilities output of the softmax layer.
layers = [
featureInputLayer(inputSize,Normalization="none")
PQCLayer
fullyConnectedLayer(numClasses)
softmaxLayer
classificationLayer];
Train Network
Train the QNN. The trained network reaches an accuracy above 90% on the XOR classification problem.
net = trainNetwork(X,Y,layers,options)
net =
SeriesNetwork with properties:
Test Network
Test the classification accuracy of the network by comparing the predictions on the test data with the
true labels.
[XTest,trueLabels] = generateData(numSamples);
predictedLabels = classify(net,XTest);
gscatter(XTest(:,1),XTest(:,2),predictedLabels,"by")
Visualize the accuracy of the predictions in a confusion chart. Large values on the diagonal indicate
accurate predictions for the corresponding class. Large values on the off-diagonal indicate strong
confusion between the corresponding classes. Here, the confusion chart shows very small errors in
classifying the test data.
confusionchart(trueLabels,predictedLabels)
Supporting Function
The generateData function creates a sample of data points for the XOR problem. This function
classifies the data into two groups: "Blue" and "Yellow". If the coordinates of the data points
satisfy x > 1 and y > 0.5, or x < 1 and y < 0.5, then this function classifies the data points as "Blue".
Otherwise, if the coordinates of the data points satisfy x > 1 and y < 0.5, or x < 1 and y > 0.5, then
this function classifies the data points as "Yellow".
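The labeling rule described above can be written as a single logical expression. The following is an illustrative sketch only, not the generateData function itself: the sample points are hypothetical, and the assumption that x ranges over [0,2] and y over [0,1] is made here only to pick points on both sides of each threshold.

```matlab
% Hypothetical sample points, one per quadrant of the decision regions.
pts = [1.5 0.7;    % x > 1, y > 0.5
       0.5 0.2;    % x < 1, y < 0.5
       1.5 0.2;    % x > 1, y < 0.5
       0.5 0.7];   % x < 1, y > 0.5
x = pts(:,1);
y = pts(:,2);

% "Blue" when both coordinates fall on the same side of their thresholds.
isBlue = (x > 1 & y > 0.5) | (x < 1 & y < 0.5);

labels = repmat({'Yellow'}, size(pts,1), 1);
labels(isBlue) = {'Blue'};
disp([num2cell(pts) labels])
```

Points that lie exactly on a threshold are not covered by the strict inequalities in the description; this sketch assigns them to "Yellow" by default, which is a choice made here for simplicity.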
References
[1] Broughton, Michael, Guillaume Verdon, Trevor McCourt, Antonio J. Martinez, Jae Hyeon Yoo,
Sergei V. Isakov, Philip Massey, et al. "TensorFlow Quantum: A Software Framework for
Quantum Machine Learning." Preprint, submitted August 26, 2021.
https://doi.org/10.48550/arXiv.2003.02989.
[2] Farhi, Edward, and Hartmut Neven. "Classification with Quantum Neural Networks on Near Term
Processors." Preprint, submitted August 30, 2018. https://doi.org/10.48550/arXiv.1802.06002.
[3] Mari, Andrea, Thomas R. Bromley, and Nathan Killoran. "Estimating the Gradient and Higher-
Order Derivatives on Quantum Hardware." Physical Review A 103, no. 1 (January 11, 2021):
012405. https://doi.org/10.1103/PhysRevA.103.012405.
[4] Wierichs, David, Josh Izaac, Cody Wang, and Cedric Yen-Yu Lin. "General Parameter-Shift Rules for
Quantum Gradients." Quantum 6 (March 30, 2022): 677.
https://doi.org/10.22331/q-2022-03-30-677.
See Also
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Run Quantum Circuit on Hardware Using AWS” on page 17-29
• “Graph Coloring with Grover's Algorithm” on page 17-34
• “Ground-State Protein Folding Using Variational Quantum Eigensolver (VQE)” on page 17-45
Quantum Monte Carlo (QMC) Simulation
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
This example shows how to use Quantum Monte Carlo (QMC) simulation in MATLAB to compute the
mean of a function of a random variable. Running this example requires Statistics and Machine
Learning Toolbox™ as well as MATLAB Support Package for Quantum Computing.
A broad range of tasks in finance and economics depends on Monte Carlo simulation,
from option pricing to macroeconomic stress testing. While this example does not explore
computational efficiency, research shows that QMC offers a quadratic speed-up compared to classic
Monte Carlo methods.
This example follows [1] in considering an application of QMC to compute the mean of a function of a
random variable. Specifically, the mean of a trigonometric function of an underlying normal
distribution is computed. This calculation generalizes to many practical applications. For example, if
the probability distribution represents the price of an underlying asset, the function could be the
price of an option on that asset.
Problem Formulation
Assume there is a random variable x that is normally distributed, and the function to be evaluated is
$$f(x) = \sin^2(x).$$
The analytic mean of f(x) is
$$\mu = \frac{\sinh(1)}{e}.$$
AnalyticMean =
0.4323
MCMean =
0.4210
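Both numbers can be reproduced with base MATLAB functions. The sketch below assumes a standard normal distribution for x (zero mean, unit variance) and an arbitrary sample size; the names analyticMean, numSamples, and mcMean are illustrative, and because no random seed is set the Monte Carlo value differs slightly from run to run.

```matlab
func = @(x) sin(x).^2;

% Analytic mean. For a standard normal x, E[cos(2x)] = exp(-2), so
% E[sin(x)^2] = (1 - exp(-2))/2, which equals sinh(1)/e.
analyticMean = sinh(1)/exp(1);

% Classic Monte Carlo estimate: average the function over random samples.
numSamples = 1e5;                    % assumed sample size
sampleData = randn(numSamples,1);    % standard normal draws
funcValues = func(sampleData);
mcMean     = mean(funcValues);

fprintf('analytic: %.4f, Monte Carlo: %.4f\n', analyticMean, mcMean);
```

The statistical error of the classic estimate shrinks in proportion to one over the square root of the number of samples, which is the scaling that the quadratic speed-up of QMC refers to.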
tiledlayout(1,2)
nexttile
h1 = histogram(sampleData);
title("Sample Points")
nexttile
h2 = histogram(funcValues);
hold on
xline(MCMean,'--','MCMean');
hold off
title("Classic Monte Carlo","Mean of Function Values")
QMC Methodology
First, define some parameters for the quantum simulation. Define the number of qubits for the
probability distribution register as 5, and the number of qubits for the estimation register as 6. Then,
discretize the probability distribution and function over a grid and compute the discrete mean of the
function.
x_max = pi;
x = linspace(-x_max,x_max,M)';
p = normpdf(x);
p = p./sum(p);
DiscreteMean = sum(func(x).*p)
DiscreteMean =
0.4326
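The discretization step can also be tried without Statistics and Machine Learning Toolbox. In this sketch the anonymous function normalPdf is a stand-in for normpdf with zero mean and unit standard deviation, and M = 2^5 follows from the five probability qubits mentioned above; the output variable name discreteMean is an assumption of the sketch.

```matlab
m = 5;                 % qubits in the probability register
M = 2^m;               % number of grid points

x_max = pi;
x = linspace(-x_max, x_max, M)';

% Standard normal density (stand-in for normpdf), normalized on the grid.
normalPdf = @(t) exp(-t.^2/2)/sqrt(2*pi);
p = normalPdf(x);
p = p./sum(p);

% Discrete mean of the function on the grid.
func = @(t) sin(t).^2;
discreteMean = sum(func(x).*p);
fprintf('discrete mean on %d points: %.4f\n', M, discreteMean);
```

The discrete value differs slightly from the analytic mean because the grid truncates the distribution at ±π and renormalizes the remaining probability mass.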
While there are other approaches for amplitude estimation (such as iterative amplitude estimation),
this example uses quantum phase estimation.
The circuit diagram below summarizes the approach. The F block loads the probability distribution
and encodes the random variable onto a value qubit. The Q blocks and the inverse Quantum Fourier
Transform (QFT) are used for quantum phase estimation.
Load the probability distribution onto m qubits. While [1] uses a quantum variational circuit to
approximate the probability distribution, this example loads the probabilities directly onto the circuit.
This requires the use of some custom gate functions, which you can save as functions in the current
folder:
initStateToProbabilities.m
function cg = initStateToProbabilities(p)
% This function returns a composite gate that initializes the qubit state
% so that the probabilities of measuring each state match vector p. Apply
% the gate to a set of qubits that are all in state |0>.
L = log2(numel(p));
assert(numel(p) == 2^L);
theta = cell(1,L);
for lvl=1:L
pp = reshape(p,2^(L-lvl),[]);
pp = sum(pp,1);
pp = reshape(pp,2,[]);
theta{lvl} = 2*acos(sqrt(pp(1,:) ./ sum(pp,1)));
end
gates = ryGate(1,theta{1});
for lvl=2:log2(length(p))
gates = [gates; ucrGate(1:lvl-1,lvl,theta{lvl}(:),'y')]; %#ok<AGROW>
end
cg = compositeGate(gates,1:L,Name="initP");
end
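To see why these rotation angles reproduce the target distribution, the angle computation can be run on a small vector and then inverted. The sketch below copies the angle loop from initStateToProbabilities, applies it to a hypothetical two-qubit distribution, and rebuilds the probabilities from the angles using cos(θ/2)^2 as the conditional probability of measuring 0; the rebuilding loop and its variable names are illustrative and are not part of the original function.

```matlab
p = [0.1; 0.2; 0.3; 0.4];        % hypothetical distribution over 00,01,10,11
L = log2(numel(p));

% Angle computation, as in initStateToProbabilities.
theta = cell(1,L);
for lvl = 1:L
    pp = reshape(p, 2^(L-lvl), []);
    pp = sum(pp,1);               % marginal over the first lvl qubits
    pp = reshape(pp, 2, []);      % row 1: qubit lvl = 0, row 2: qubit lvl = 1
    theta{lvl} = 2*acos(sqrt(pp(1,:) ./ sum(pp,1)));
end

% Rebuild the distribution one qubit at a time from the angles.
pRebuilt = 1;
for lvl = 1:L
    p0    = cos(theta{lvl}/2).^2;            % P(qubit lvl = 0 | earlier qubits)
    cond  = [p0; 1 - p0];
    joint = cond .* [pRebuilt; pRebuilt];    % multiply by the earlier marginal
    pRebuilt = joint(:).';
end

disp([p pRebuilt(:)])
```

Each level stores one angle per setting of the earlier qubits, which is why the example applies the angles through uniformly controlled rotations rather than plain RY gates. The sketch assumes that every entry of p is positive; a zero marginal would produce a 0/0 division in the angle formula.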
ucrGate.m
function cg = ucrGate(controlQubits,targetQubit,theta,axis)
% Returns a CompositeGate implementing a uniformly-controlled rotation
% about the specified axis on a target qubit.
%
% A rotation is applied for each classical state of controlQubits, so theta
% must be a vector of length 2^length(controlQubits).
%
% Reference:
% [1] Möttönen, M., J.J. Vartiainen, V. Bergholm, & M.M. Salomaa. "Quantum
% Circuits for General Multi-qubit Gates." Physical Review Letters 93,
% 130502 (2004). https://arxiv.org/abs/quant-ph/0404089
arguments
controlQubits
targetQubit
theta double {mustBeVector}
axis {mustBeTextScalar}
end
Nc = length(controlQubits);
% Input checks
assert(log2(length(theta))==Nc)
assert(iscolumn(theta))
assert(ismember(axis,["x","y","z"]))
ctrls(end+1) = 1;
% Construct local circuit with the control qubits at the top, followed by
% the target qubit
trgt = Nc+1;
if axis=="y"
rotGates = ryGate(trgt,permAngles);
entgGates = cxGate(ctrls,trgt);
elseif axis=="z"
rotGates = rzGate(trgt,permAngles);
entgGates = cxGate(ctrls,trgt);
elseif axis=="x"
rotGates = rxGate(trgt,permAngles);
entgGates = czGate(ctrls,trgt);
end
code(:,1) = repelem('01',2^(n-1));
for ii=2:n
block = repelem('0110',2^(n-ii));
code(:,ii) = repmat(block,1,2^(ii-2));
end
end
end
A = initStateToProbabilities(p);
A.Name = 'Ainit';
Simulate the register to verify that the probability distribution loaded correctly.
circ = quantumCircuit(A);
state = simulate(circ);
histogram(state,probQubits)
title("Simulated Probabilities")
anglesR = 2*asin(sqrt(func(x)));
R = inv(ucrGate(probQubits,valueQubit,anglesR,"y"));
R.Name = "R";
Compute the probability of the value qubit being in the state |1⟩. The purpose of the F circuit is to
evaluate the expected value of the random variable function.
ans = 0.4326
Estimate this probability for the value qubit using quantum phase estimation.
The controlled-Q block in the circuit diagram is implemented with several composite gates as
outlined in [1]. Create a composite gate to represent this circuit element and plot the circuit to see
the gates it contains.
xGate([probQubits valueQubit])];
cZ = compositeGate(cZgates,1:controlQubit,Name="cZ");
QPE = [];
for ii=1:n
% Repeat the cQ gate as needed and map to the current control qubit:
cQrep = compositeGate(repmat(cQ,2^(n-ii),1),[probQubits valueQubit estQubits(ii)],Name="cQ"+2
plot(cQ)
circ = quantumCircuit([F;
hGate(estQubits);
QPE;
inv(qftGate(estQubits))]);
plot(circ,"QubitBlocks",[m 1 n])
state = simulate(circ);
Compare Results
In terms of the phase θ, the analytical mean is given by
$$\mu = \frac{1 - \cos(\pi\theta)}{2}.$$
AnalyticPhase =
0.4568
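The relation between the mean and the phase can be inverted in one line. The sketch below recovers the analytic phase from the analytic mean and then rounds it to the grid of phases that a six-qubit estimation register can represent, assuming the usual phase-estimation resolution of 1/2^n; the variable names are illustrative.

```matlab
mu = sinh(1)/exp(1);              % analytic mean of sin(x)^2

% Invert mu = (1 - cos(pi*theta))/2 for the phase.
theta = acos(1 - 2*mu)/pi;

% A register of n estimation qubits resolves phases in steps of 1/N.
n = 6;
N = 2^n;
k = round(theta*N);               % nearest representable grid index
thetaGrid = k/N;

% Mean implied by the nearest grid phase.
muGrid = (1 - cos(pi*thetaGrid))/2;
fprintf('theta = %.4f, nearest grid phase = %d/%d\n', theta, k, N);
fprintf('mean from grid phase = %.4f\n', muGrid);
```

The gap between muGrid and mu illustrates the discretization error of the estimation register: adding estimation qubits refines the phase grid and therefore the estimate of the mean.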
[states,probabilities] = querystates(state,estQubits);
plot(bin2dec(states)/N,probabilities)
xlim([.35 .65])
xline(AnalyticPhase,'--',"AnalyticPhase")
xlabel("\theta")
ylabel("Probability")
title("Phase Estimation with QMC")
The estimated phase is the first maximum in amplitude. Use the estimated phase to compute the
mean value of the function, and compare the quantum value against the analytic value, classic Monte
Carlo value, and discrete value.
ResultTable = table(AnalyticMean,MCMean,DiscreteMean,QMCMean)
ResultTable =
1×4 table
References
[1] Priazhkina, S., V. Skavysh, D. Guala, & T. Bromley. "Quantum Monte Carlo for Economics: Stress
Testing and Macroeconomic Deep Learning." Bank of Canada working paper (2022).
https://www.bankofcanada.ca/wp-content/uploads/2022/06/swp2022-29.pdf
[2] Brassard, G., P. Hoyer, M. Mosca, & A. Tapp. "Quantum Amplitude Amplification and Estimation."
Contemporary Mathematics 305 (2002): 53-74. https://arxiv.org/pdf/quant-ph/0005055.pdf
[3] Rebentrost, P., B. Gupt, & T. R. Bromley. "Quantum Computational Finance: Monte Carlo Pricing of
Financial Derivatives." Physical Review A, 98(2), 022321 (2018).
https://arxiv.org/pdf/1805.00109.pdf
[4] Stamatopoulos, N., D. J. Egger, Y. Sun, C. Zoufal, R. Iten, N. Shen, & S. Woerner. "Option Pricing
Using Quantum Computers." Quantum 4, 291 (2020). https://arxiv.org/pdf/1905.02666.pdf
[5] Woerner, S., & D. J. Egger. "Quantum Risk Analysis." npj Quantum Information 5(1), 15 (2019).
https://arxiv.org/pdf/1806.06893.pdf
See Also
quantumCircuit
Related Examples
• “Introduction to Quantum Computing” on page 17-2
• “Local Quantum State Simulation” on page 17-25
18
QUBO Problems
What Is a QUBO Problem?
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
• Q can be a real symmetric matrix. If Q is not symmetric, the software internally replaces Q with
the equivalent symmetric matrix
$$\widehat{Q} = \frac{Q + Q'}{2}$$
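The symmetrized matrix is equivalent in the sense that it gives the same quadratic form for every x. The following sketch checks this for a small, hypothetical nonsymmetric matrix by evaluating x'Qx over all binary vectors; it uses only base MATLAB and does not call the qubo function.

```matlab
Q = [1 4 0; -2 3 5; 6 1 -1];      % hypothetical nonsymmetric matrix
Qsym = (Q + Q')/2;                % symmetric replacement

% All binary vectors of length 3, one per row.
N = size(Q,1);
X = dec2bin(0:2^N-1, N) - '0';

% Row-wise quadratic forms x'*Q*x and x'*Qsym*x.
valsQ    = sum((X*Q).*X, 2);
valsQsym = sum((X*Qsym).*X, 2);

disp([valsQ valsQsym])
```

The two columns agree because x'Qx is a scalar and therefore equals its own transpose x'Q'x, so averaging Q with its transpose leaves the value unchanged.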
qprob = qubo(Q)
% or
qprob = qubo(Q,c)
% or
qprob = qubo(Q,c,d)
An Ising problem has the same formulation as a QUBO problem, except the Ising variables y(i) are ±1
instead of the QUBO x variables 0 or 1. You can convert between the two formulations using a linear
mapping. For a QUBO problem represented with variable x and an Ising problem with variable y, the
mapping is
$$y = 2x - 1$$
$$x = \frac{y + 1}{2}.$$
The objective function values in the two formulations differ by an easily calculated amount.
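$$\begin{aligned}
x'Qx + c'x + d &= \frac{(y + \mathbf{1})'Q(y + \mathbf{1})}{4} + c'\,\frac{y + \mathbf{1}}{2} + d \\
&= \frac{y'Qy}{4} + \frac{\mathbf{1}'Q + c'}{2}\,y + d + \frac{\mathbf{1}'Q\mathbf{1}}{4} + \frac{c'\mathbf{1}}{2},
\end{aligned}$$
where $\mathbf{1}$ represents the column vector of ones having the same length as y.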
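The equality between the QUBO and Ising forms of the objective can be confirmed numerically. This sketch evaluates both sides for every binary vector of a small problem; the values of Q, c, and d are those of the three-variable example in this section, Q is symmetric as the identity requires, and the remaining variable names are illustrative.

```matlab
Q = [0 -1 2; -1 0 4; 2 4 0];      % symmetric quadratic matrix
c = [-5; 6; -4];                  % linear term as a column vector
d = 12;                           % constant term

N   = size(Q,1);
one = ones(N,1);                  % the vector of ones in the formula
X   = dec2bin(0:2^N-1, N) - '0';  % all binary vectors, one per row

maxErr = 0;
for r = 1:size(X,1)
    x = X(r,:)';
    y = 2*x - 1;                  % QUBO variable mapped to Ising variable
    quboVal  = x'*Q*x + c'*x + d;
    isingVal = y'*Q*y/4 + (one'*Q + c')*y/2 + d + one'*Q*one/4 + c'*one/2;
    maxErr = max(maxErr, abs(quboVal - isingVal));
end
fprintf('largest difference: %g\n', maxErr);
```

The terms d + 1'Q1/4 + c'1/2 do not depend on y, so they form the constant offset between the two formulations.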
Also, many current and proposed quantum computers use QUBO or Ising as the problem type. To try
to find a quantum solution to a combinatorial optimization problem, you formulate a QUBO problem
and then pass the problem to quantum hardware for the solution.
Solution Methods
To solve a QUBO problem, perform these two steps.
For example, create a QUBO problem for the quadratic matrix Q, linear vector c, and constant term d.
Q = [0 -1 2;...
-1 0 4;...
2 4 0];
c = [-5 6 -4];
d = 12;
qprob = qubo(Q,c,d)
qprob =
result = solve(qprob)
result =
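Because this example has only three binary variables, the answer can be cross-checked by enumerating all 2^3 candidate vectors. The sketch below is a brute-force check in base MATLAB and is not how solve works; it is practical only for very small N, and the names vals, bestVal, and bestX are illustrative.

```matlab
Q = [0 -1 2; -1 0 4; 2 4 0];
c = [-5 6 -4];
d = 12;

N = size(Q,1);
X = dec2bin(0:2^N-1, N) - '0';    % every binary vector, one per row

% Objective x'*Q*x + c*x + d for each row of X.
vals = sum((X*Q).*X, 2) + X*c(:) + d;

[bestVal, idx] = min(vals);
bestX = X(idx,:);
fprintf('minimum objective %g at x = [%s]\n', bestVal, num2str(bestX));
```

For this problem the minimum objective value is 7, and it is attained by two vectors, [1 0 0] and [1 0 1]; min reports the first one. Ties like this are one reason a stochastic solver can return different but equally good solutions.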
Alternatively, if you have an Optimization Toolbox license and your problem has up to 100 or 200
variables, convert the QUBO problem to a mixed-integer linear programming (MILP) problem and
solve it using intlinprog, as shown in “Verify Optimality by Solving QUBO as MILP” on page 18-5.
References
[1] Glover, Fred, Gary Kochenberger, and Yu Du. Quantum Bridge Analytics I: A Tutorial on
Formulating and Using QUBO Models. Available at https://arxiv.org/abs/1811.11538.
[3] Kochenberger, G. A., and F. Glover. A Unified Framework for Modeling and Solving Combinatorial
Optimization Problems: A Tutorial. In: Hager, W. W., Huang, S. J., Pardalos, P. M., Prokopyev,
O. A. (eds) Multiscale Optimization Methods and Applications. Nonconvex Optimization and
Its Applications, vol 82. Springer, Boston, MA. https://doi.org/10.1007/0-387-29550-X_4.
Available at
https://www.researchgate.net/publication/226808473_A_Unified_Framework_for_Modeling_and_Solving_Combinatorial_Optimization_Problems_A_Tutorial.
See Also
solve | qubo
Related Examples
• “Workflow for QUBO Problems” on page 18-8
• “Verify Optimality by Solving QUBO as MILP” on page 18-5
Verify Optimality by Solving QUBO as MILP
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
A Quadratic Unconstrained Binary Optimization (QUBO) problem can be difficult to solve exactly. The
solve function might return solutions that are not globally optimal. However, for small enough
QUBO problems, you can convert the QUBO problem to a mixed-integer linear programming (MILP)
problem. To solve an MILP problem, use intlinprog, which requires an Optimization Toolbox
license. If intlinprog returns a solution with gap 0, the solution is guaranteed to be optimal. This
approach is practical for up to 100 or 200 variables. For larger problems, the MILP formulation,
whose size grows as N^2 for a QUBO problem with N variables, slows the solution too much.
This formulation converts the problem from quadratic in x to linear in xij and x, which is a
simplification. However, the number of variables increases from N to N^2 + N, which is a complication.
The following three linear inequalities tie the matrix xij to the vector x and ensure that, at the
solution, xij(i,j) = x(i)*x(j). For all i and j,
$$x_{ij}(i,j) \ge x(i) + x(j) - 1$$
$$x_{ij}(i,j) \le x(i)$$
$$x_{ij}(i,j) \le x(j).$$
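A short enumeration shows why three linear inequalities are enough for binary variables. The sketch below tests every combination of two binary entries of x and a binary product variable against the inequalities xij ≥ x(i) + x(j) - 1, xij ≤ x(i), and xij ≤ x(j), which correspond to the constraint arrays f, g, and h in the MILP code later in this example; the scalar names xi, xj, and v are illustrative.

```matlab
feasible = zeros(0,3);            % rows of [xi xj v] that satisfy all three
for xi = 0:1
    for xj = 0:1
        for v = 0:1               % candidate value of the product variable
            ok = (v >= xi + xj - 1) && (v <= xi) && (v <= xj);
            if ok
                feasible(end+1,:) = [xi xj v]; %#ok<AGROW>
            end
        end
    end
end
disp(feasible)
```

Exactly one value of the product variable survives for each pair (xi, xj), and it equals xi*xj. The first inequality forces the value 1 when both entries are 1, and the other two force the value 0 when either entry is 0.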
Represent these inequality constraints in a structured way. Define xij as an N-by-N matrix of
optimization variables, and define x as a column vector of optimization variables with N entries. Use
the following code to create the constraints for an optimization problem named prob.
Q = makeq(100,1);
Create a QUBO problem from Q and solve the problem using the default tabu search algorithm.
qprob = qubo(Q);
rng default % For reproducibility
result = solve(qprob)
result =
The returned objective function value is –9610. Check whether this answer is reliable by solving
the problem again.
result = solve(qprob)
result =
The returned objective function value does not change. However, the tabu search algorithm does not
necessarily give consistent results.
To obtain a reliable result, formulate the problem as an MILP using optimization variables.
N = size(Q,1);
x = optimvar("x",N,Type="integer",LowerBound=0,UpperBound=1);
% Need N^2 variables for MILP
xij = optimvar("xij",N,N,Type="integer",LowerBound=0,UpperBound=1);
prob = optimproblem;
% The next three constraint arrays are the connections between xij and x
prob.Constraints.f = xij >= repmat(x,1,N) + repmat(x',N,1) - 1;
prob.Constraints.g = xij <= repmat(x,1,N);
prob.Constraints.h = xij <= repmat(x',N,1);
% Formulate the objective in terms of xij
% If you have a linear term c and a constant term d, the objective is
% prob.Objective = sum(Q.*xij,"all") + dot(c,x) + d;
prob.Objective = sum(Q.*xij,"all");
% Solve calls intlinprog
[solxij,fv] = solve(prob);
Intlinprog stopped because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapToleran
intcon variables are integer within tolerance, options.IntegerTolerance = 1e-05.
The returned gap is 0, meaning the solution is guaranteed to be optimal. The returned objective
function value is –9610, which is the same as the tabu search result. You can see this value in the
iterative display column integer fval.
Check that the returned variable solxij.x gives the same objective function value in the quadratic
expression x'*Q*x.
-9610
Helper Function
This code creates the makeq helper function.
function Q = makeq(N,seed)
% N must be a positive integer, seed is a random stream seed
% Q is an N-by-N sparse symmetric integer matrix, values -100 through 100
% Q has about N^2/10 nonzeros
% Q is modeled after Beasley 1998
See Also
intlinprog | solve | qubo
Related Examples
• “Workflow for QUBO Problems” on page 18-8
• “Problem-Based Optimization Workflow” (Optimization Toolbox)
Workflow for QUBO Problems
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
To solve a problem as a QUBO, first convert the problem to QUBO form, and then solve it using
the tabu search algorithm.
To express your problem as a QUBO, create a real N-by-N matrix Q, an optional N-element vector c, and an
optional scalar d. Then create the QUBO problem as follows:
qprob = qubo(Q,c,d)
For more information, including how to convert between QUBO form and Ising form, see “What Is a
QUBO Problem?” on page 18-2.
Some problems have constraints that you express as a penalty term in the QUBO objective function.
See “Constraints in QUBO Problems” on page 18-10.
result = solve(qprob)
Tabu search is a stochastic algorithm, so each time you run the algorithm, you might get a different
result. If you add constraints to a problem by using a penalty term, and the first solution you get is
infeasible, you can try to find a feasible solution by rerunning solve.
You can control some aspects of solve by creating a tabu search algorithm object, which contains
properties for the tabu search. Pass the tabu search object with specified properties to solve. For
example, to have solve use more time and iterations than the defaults, enter
ts = tabuSearch(MaxTime=60,MaxIterations=1e7);
result = solve(qprob,Algorithm=ts)
For details about the properties and their default values, see tabuSearch.
When your problem has constraints expressed as a penalty term, some results might violate the
constraints. See “Examine Solutions for Feasibility” on page 18-11.
See Also
solve | qubo | evaluateObjective | tabuSearch
Related Examples
• “Traveling Salesperson Problem with QUBO” on page 18-17
• “Capacitated Vehicle Routing Problem” on page 18-27
• “Feature Selection QUBO (Quadratic Unconstrained Binary Optimization)” on page 18-36
Constraints in QUBO Problems
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
To ensure that constraints are satisfied at a solution for a QUBO problem, add a positive multiplier M
times a quadratic function that is positive for unsatisfied constraints and zero for satisfied
constraints. For example, suppose you have the constraint that exactly two components of x are equal
to 1. This constraint can be formulated as
$$\left(\sum_i x_i - 2\right)^2 = 0.$$
The corresponding penalty term is
$$M\left(\sum_i x_i - 2\right)^2.$$
For a large enough M, this penalty causes a solution to the QUBO problem
$$x^T Q x + c^T x + M\left(\sum_i x_i - 2\right)^2$$
to have the penalty term equal to zero; otherwise, the objective function value is not minimized. In
other words, the penalty term enforces the constraint.
This constraint term can be represented as a QUBO expression. The penalty term without the
multiplier M is
$$\left(\sum_i x_i\right)^2 - 4\sum_i x_i + 4,$$
which equals zero exactly when the constraint holds.
To represent this quadratic constraint as a QUBO, take A = ones(N), an all-1 matrix of coefficients
for a quadratic term:
$$
A = \begin{pmatrix}
1 & 1 & \cdots & 1 \\
1 & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{pmatrix}.
$$
Then
$$x^T A x = \left(\sum_i x_i\right)^2.$$
Add a penalty term to enforce the constraint by taking a large positive multiplier M and adding these
expressions to your QUBO problem: M*A to the quadratic term, –4*M*ones(N,1) to the linear term,
and 4*M to the constant term.
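The bookkeeping of the three penalty pieces can be checked numerically. The following sketch uses only base MATLAB and an illustrative size N = 4 (the variable names are not part of the documented workflow). It confirms that the quadratic, linear, and constant pieces together equal (sum(x) - 2)^2 for every binary x.

```matlab
N = 4;                                  % illustrative problem size
A = ones(N);
X = dec2bin(0:2^N-1) - '0';             % each row is one binary vector x'
quadPart = sum((X*A).*X,2);             % x'*A*x for every row
penalty  = quadPart - 4*sum(X,2) + 4;   % quadratic + linear + constant pieces
direct   = (sum(X,2) - 2).^2;           % (sum(x) - 2)^2 computed directly
maxDiff  = max(abs(penalty - direct));
disp(maxDiff)
```

Because every quantity is an integer, the two expressions agree exactly rather than only to rounding error.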
Do not take a very large multiplier M, because doing so might cause the problem to lose precision
when it involves a large variation in the function values. Especially when you use quantum annealing
hardware to solve a QUBO problem, the number of digits available for computations is limited.
Therefore, a multiplier M that is too large might make the problem unsuitable for the hardware.
Suppose that your original problem has the following quadratic term with no linear or constant terms.
Q = [0 -5 2 -6
-5 0 -1 3
2 -1 0 -4
-6 3 -4 0];
The constraint is that exactly two elements in the solution are equal to 1. Create and solve the
problem with no constraints.
qprob = qubo(Q);
result = solve(qprob)
result =
Does the unconstrained solution satisfy the constraint that exactly two xi are nonzero?
result.BestX
ans =
1
1
1
1
No, the constraint is not satisfied because the solution has too many ones.
Create and solve a new problem with a linear term, a constant term, and the constraint multiplier M
set to 1.
A = ones(4);
c = -4*ones(4,1);
d = 4;
M = 1;
qprob2 = qubo(Q + M*A, M*c, M*d);
sol2 = solve(qprob2)
sol2 =
ans =
No, the solution does not satisfy the constraints because they do not evaluate to 0.
M = 10;
qprob2 = qubo(Q + M*A, M*c, M*d);
sol3 = solve(qprob2)
sol3 =
ans =
This time, the constraints are satisfied. For this simple problem, you can see that the constraints are
satisfied by looking at BestX.
sol3.BestX
ans =
1
0
0
1
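Because this problem has only four variables, exhaustive enumeration offers an independent check of the tabu search results. The following sketch uses base MATLAB only and does not call qubo or solve; Mvalues, fvals, and bestX are illustrative names. It lists the exact minimizer for each multiplier value.

```matlab
Q = [0 -5 2 -6; -5 0 -1 3; 2 -1 0 -4; -6 3 -4 0];
A = ones(4);  c = -4*ones(4,1);  d = 4;
X = dec2bin(0:15) - '0';                 % all 16 binary vectors, one per row
Mvalues = [0 1 10];
bestX = zeros(numel(Mvalues),4);
for k = 1:numel(Mvalues)
    M = Mvalues(k);
    % Objective x'*(Q + M*A)*x + M*c'*x + M*d evaluated for every row of X
    fvals = sum((X*(Q + M*A)).*X,2) + X*(M*c) + M*d;
    [~,idx] = min(fvals);
    bestX(k,:) = X(idx,:);
end
disp(bestX)   % rows correspond to M = 0, 1, and 10
```

For M = 0 and M = 1 the exact minimizer is the all-ones vector, and for M = 10 it is [1 0 0 1], in agreement with the solutions shown in this example. By the same enumeration, any M greater than 4 makes a feasible point optimal for this particular Q.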
In general, if your solution is infeasible, you can try to find a feasible solution by doing one of the
following:
For Ising formulations of common constraints, see Lucas [1]. Ising formulations are equivalent to
QUBO formulations; see “What Is a QUBO Problem?” on page 18-2.
References
[1] Lucas, Andrew. “Ising Formulations of Many NP Problems.” Available at https://arxiv.org/pdf/1302.5843.pdf.
See Also
solve | qubo | evaluateObjective | tabuSearch
Related Examples
• “Workflow for QUBO Problems” on page 18-8
• “Traveling Salesperson Problem with QUBO” on page 18-17
Tabu Search Algorithm
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
To solve a combinatorial optimization problem formulated as a QUBO problem, call the solve
function on the QUBO problem. The solve function internally uses the tabu search algorithm, as
described in Palubeckis [1]. For background information on QUBO problems, see “What Is a QUBO
Problem?” on page 18-2.
Change of Variables
For a binary vector x, a QUBO objective function has the form
$$f(x) = \sum_{i=1}^{N} \sum_{j=1}^{N} Q_{i,j}\, x_i x_j + \sum_{i=1}^{N} c_i x_i + d.$$
The algorithm begins by choosing a random binary vector x, and then changing the problem to a
binary vector y that has all zero components. The coefficients of the problem must change so that the
objective function for y is equal to the objective function for the corresponding x. The quadratic
coefficients Qi,j for x become Ki,j for y, where
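$$K_{i,j} = Q_{i,j}\left(1 - 2\left(x_i - x_j\right)^2\right).$$
The linear coefficients c_i for x become d_i for y, where
$$d_i = \left(1 - 2x_i\right)\left(c_i + \sum_{j:\, x_j = 1} r(i,j)\right),$$
and
$$r(i,j) = Q_{i,j} + Q_{j,i}.$$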
With these definitions, the value of the constant term of the objective function in the y formulation
becomes f(x). In other words,
$$y^T K y + d^T y + f(x)$$
is the objective function in terms of y. Therefore, when y = 0, the value of the objective function is
f(x).
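The change of variables can be verified numerically on a small random instance. The sketch below uses base MATLAB only and assumes that y(i) = 1 means "component i of x is flipped", so that y corresponds to the point xor(x,y); that interpretation, the seed, and the names dlin and maxErr are assumptions made for illustration. The diagonal of K is kept in the quadratic term here rather than absorbed into the linear term.

```matlab
rng(0)                                   % illustrative seed
N = 6;
Q = randi([-5 5],N);                     % integer coefficients, not necessarily symmetric
c = randi([-5 5],N,1);
d = 3;
f = @(z) z'*Q*z + c'*z + d;              % QUBO objective in the original variables
x = randi([0 1],N,1);                    % current point
K = Q .* (1 - 2*(x - x').^2);            % quadratic coefficients for y
dlin = (1 - 2*x) .* (c + (Q + Q')*x);    % d(i); (Q + Q')*x sums r(i,j) over j with x(j) = 1
maxErr = 0;
for k = 0:2^N-1
    y = (dec2bin(k,N) - '0')';           % every binary vector y
    z = double(xor(x,y));                % point in x-space that y represents
    maxErr = max(maxErr, abs(f(z) - (y'*K*y + dlin'*y + f(x))));
end
disp(maxErr)
```

With integer data the two objective values agree exactly for all 2^N vectors y, and y = 0 reproduces f(x).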
This reformulation makes it easy to determine when a one-variable change in y leads to a lower
objective function value. If any component d(i) is negative, then changing y(i) to 1 results in a lower
value of the objective function, because all quadratic terms are zero. This change assumes, without
loss of generality, that the diagonal entries in K are zero, because nonzero entries can be absorbed
into d.
Palubeckis gives an efficient algorithm for updating the coefficients of the objective function after a
one-variable change in y. This update makes a local search for a minimum an efficient procedure.
Algorithm Steps
The tabu search algorithm has three phases: Initialize, Simple Tabu Search, and Get New Start
Point. The algorithm starts with Initialize, then alternates between Simple Tabu Search and Get New
Start Point until it reaches a stopping condition. The Simple Tabu Search phase has a tabu list, which
is a list of variables that the algorithm cannot change until it is past the tabu tenure value for each
variable. The tabu tenure value is the smaller of 20 and N/4, where N is the length of x.
1 Initialize — Create a random binary vector x0. Map the x0 vector to an all-zero vector using the
procedure explained in “Change of Variables” on page 18-14. Set x*, the best point found, to x0,
and set the associated best function value to f(x*).
2 Simple Tabu Search — Starting from index 1, search for the first index K in the vector so that
setting x(K) = 1 causes the objective function value f(x) < f(x*), ignoring any K selected
within the past tabu tenure searches.
a If the search is successful, change x(K) from 0 to 1, and map x(K) to the zero vector again
using the procedure explained in “Change of Variables” on page 18-14. Each successful step
counts as one iteration in the iterative display and output structure. Then repeatedly perform
a strictly greedy search for a minimum, without referring to tabu, by taking the first index
with a negative coefficient K, changing x(K) from 0 to 1, and then remapping x(K) to 0
using the procedure for changing variables. Each step counts as one iteration. Restart the
search from index K + 1. Repeat until a full loop from x(1) to x(N) of the search is
unsuccessful, meaning the current point is a local minimum of the objective function. The
best point x* and the best function value f(x*) are different from before this phase, and the
current point x = x*.
b If the search is unsuccessful, choose K as the index not in the tabu list (list of variables with
positive tabu values) that has the lowest linear coefficient. Change x(K) from 0 to 1, and
map x(K) to the zero vector again using the procedure for changing variables. This
procedure leads to a new x, but not a new x*. Each step of this type can count for up to N
iterations, because the process of examining each coefficient value counts as an iteration.
Update the tabu list by setting variable K to have a tabu value of 20, and lowering the tabu
value of all other variables with positive tabu values by 1.
Repeat the Simple Tabu Search until reaching a stopping condition for this phase. Generally, the
stopping condition occurs when the iteration count in this phase exceeds 1e4*N.
3 Get New Start Point — Initialize the set I to all N indices. Choose a random integer r uniformly
in the interval [10,0.1*N]. Perform the following steps r times.
a Find the five indices with the minimal linear coefficients among those in I. Randomly choose
one of these indices, J.
b Remove J from I.
c Change x(J) to 1. Update the problem coefficients using the procedure explained in
“Change of Variables” on page 18-14.
Take this new start point, clear the tabu list, and return to phase 2.
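The strictly greedy part of step 2a can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the solver's implementation: it omits the tabu list, the restart-from-K+1 rule, and the efficient coefficient update of Palubeckis, and instead recomputes the one-flip coefficients from scratch at each step. The matrix Q is borrowed from “Constraints in QUBO Problems”, and the starting point is arbitrary.

```matlab
Q = [0 -5 2 -6; -5 0 -1 3; 2 -1 0 -4; -6 3 -4 0];
c = zeros(4,1);
x = [1; 0; 0; 0];                        % arbitrary starting point
f = @(z) z'*Q*z + c'*z;
history = f(x);                          % objective value after each accepted flip
while true
    % One-flip coefficients: d(i) from the change of variables, with the
    % diagonal of K absorbed into the linear term
    g = (1 - 2*x) .* (c + (Q + Q')*x) + diag(Q);
    k = find(g < 0, 1);                  % first index whose flip lowers the objective
    if isempty(k)
        break                            % no improving flip: x is a local minimum
    end
    x(k) = 1 - x(k);
    history(end+1) = f(x);
end
disp(x')
disp(history)
```

Each accepted flip strictly lowers the objective, so the loop terminates. From this starting point the descent ends at the all-ones vector, which is also the unconstrained solution reported for this Q.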
References
[1] Palubeckis, G. “Iterated Tabu Search for the Unconstrained Binary Quadratic Optimization
Problem.” Informatica 17, no. 2 (2006): 279–296. Available at
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=3c323a1d41cd0e2ca1ddb27192e475ea73959e52.
See Also
solve | tabuSearch | quboResult | tabuSearchResult
Related Examples
• “Workflow for QUBO Problems” on page 18-8
Traveling Salesperson Problem with QUBO
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
The classic Traveling Salesperson Problem (TSP) involves a group of cities (locations) that a
salesperson must visit before returning to the start location. The problem is to minimize the total
distance the salesperson travels. This topic shows how to convert a TSP to a Quadratic Unconstrained
Binary Optimization (QUBO) problem, and to solve the QUBO problem using the tabu search
algorithm.
Problem Data
The data for a TSP consists of city locations and the distances between each pair of cities. The
distances can be given as a matrix D, where D(i,j) is the distance from city i to city j. This example
uses the Pythagorean rule to calculate distances, assuming a flat earth. The solution depends only on
the distances, not the city locations. But to plot the solution, you need the locations.
Calculate the distances between the cities using the hypot function.
[X,Y] = meshgrid(1:N);
dist = hypot(stopsLon(X) - stopsLon(Y),stopsLat(X) - stopsLat(Y));
According to Lucas [2], the following equations suffice to specify that the x(i,j) variables represent a
route.
$$\sum_{i=1}^{N} \left(1 - \sum_{j=1}^{N} x(i,j)\right)^2 = 0$$
$$\sum_{j=1}^{N} \left(1 - \sum_{i=1}^{N} x(i,j)\right)^2 = 0.$$
These equations ensure that each city is visited only once, and each step of the route is in one city.
If some cities are not reachable directly from other cities, also include the equation
$$\sum_{\text{no path from } i \text{ to } j} \; \sum_{k=1}^{N} x(i,k)\, x(j,k+1) = 0.$$
This equation ensures that when no direct path exists from i to j, then at any time k when the route
visits i, the route does not visit j at time k + 1.
To convert the TSP to a QUBO problem, use the following objective function, which is scaled by a
positive number M. Each expression in the objective function is a penalty for the solution failing to
satisfy one of the previous equations.
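$$
\begin{aligned}
f(x) = {} & M \sum_{i=1}^{N} \left(1 - \sum_{j=1}^{N} x(i,j)\right)^2 \\
& + M \sum_{j=1}^{N} \left(1 - \sum_{i=1}^{N} x(i,j)\right)^2 \\
& + M \sum_{\text{no connection from } i \text{ to } j} \; \sum_{k=1}^{N} x(i,k)\, x(j,k+1).
\end{aligned}
$$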
As you can see, f(x) = 0 when x represents a valid route. All of the expressions in f(x) are in QUBO
form.
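A small numerical check makes the role of the first two penalty expressions concrete. The sketch below, in base MATLAB, evaluates them without the multiplier M and without the optional no-path term; the matrices validX and invalidX and the function handle penalty are hypothetical names chosen for illustration.

```matlab
N = 4;
% Row-sum and column-sum penalties for an N-by-N binary matrix X
penalty = @(X) sum((1 - sum(X,2)).^2) + sum((1 - sum(X,1)).^2);
I4 = eye(N);
validX = I4(:,[3 1 4 2]);       % permutation matrix: one 1 in each row and column
invalidX = validX;
invalidX(1,:) = 0;              % city 1 is never visited
pValid = penalty(validX);
pInvalid = penalty(invalidX);
fprintf("valid route penalty: %d, invalid route penalty: %d\n",pValid,pInvalid);
```

The valid route has penalty 0. Dropping city 1 leaves one empty row and one empty column, so the penalty is 2 before scaling by M.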
To finish the conversion from TSP to QUBO, include the cost of the route in the objective function. Let
D(i,j) be the distance of a direct path from i to j. The objective function for the TSP is
$$f(x) + \sum_{u,v} D(u,v) \sum_{j=1}^{N} x(u,j)\, x(v,j+1).$$
Here, interpret x(v,N+1) as x(v,1), meaning the route returns to the start location at step N+1.
Suppose that the multiplier M satisfies 0 < max(D) < M. When you specify M this large, the minimum
of the objective function occurs where f(x) = 0. In this case, the constraints are satisfied, so the x
variables represent a valid route.
The final step in converting the quadratic expressions for distance and penalty to a QUBO problem is
to represent them in matrix form. This step is explained in “Convert to QUBO: Code” on page 18-21.
The code that converts the problem to a QUBO problem is given in the tsp2qubo “Helper Function”
on page 18-25 at the end of this example.
Q = tsp2qubo(dist);
result = solve(Q);
binx = result.BestX;
binx = reshape(binx,N,[]);
ordr = zeros(1,N);
for i = 1:N
ordr(i) = find(binx(i,:)); % Find order of cities in route
end
hold on
plot(stopsLon(ordr),stopsLat(ordr),"ko")
plot(stopsLon(ordr),stopsLat(ordr),"b-")
plot(stopsLon(ordr([N 1])),stopsLat(ordr([N,1])),"b-")
for i = 1:length(stopsLon)
text(stopsLon(i)+0.02,stopsLat(i),num2str(i))
end
hold off
disp(ordr)
4 1 3 6 7 8 5 2 9
Find the distance of the route by adding the length of each leg.
myd = 0;
for i = 1:(N-1)
myd = myd + dist(ordr(i),ordr(i+1));
end
myd = myd + dist(ordr(N),ordr(1))
myd =
4.0665
evaluateObjective(Q,result.BestX)
ans =
4.0665
Set the variable x to be the variables x(i,j) for i and j going from 1 through N. In other words, x is a
column vector with N^2 entries, corresponding in order to (1,1), …, (1,N), (2,1), …, (2,N), …, (N,1), …,
(N,N). The Q matrix is of size N^2-by-N^2.
$$\sum_{i=1}^{N} \left(1 - \sum_{j=1}^{N} x(i,j)\right)^2 = N - 2 \sum_{i=1}^{N} \sum_{j=1}^{N} x(i,j) + \sum_{i=1}^{N} \left(\sum_{j=1}^{N} x(i,j)\right)^2.$$
The first double sum is linear in the variables and, therefore, does not enter into the quadratic matrix.
You can represent the quadratic expression using a block diagonal matrix, where each block is an N-
by-N matrix of ones.
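$$
x'Qx = x'
\begin{pmatrix}
\begin{matrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{matrix} & & \\
& \ddots & \\
& & \begin{matrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{matrix}
\end{pmatrix} x
= x'
\begin{pmatrix}
\sum_j x(1,j) \\ \vdots \\ \sum_j x(1,j) \\ \sum_j x(2,j) \\ \vdots \\ \sum_j x(2,j) \\ \vdots \\ \sum_j x(N,j) \\ \vdots \\ \sum_j x(N,j)
\end{pmatrix}
= \sum_{i=1}^{N} \left(\sum_{j=1}^{N} x(i,j)\right)^2.
$$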
Similarly, expand the second double sum, which differs from the first by having the order of the i and j
indices interchanged.
$$\sum_{j=1}^{N} \left(1 - \sum_{i=1}^{N} x(i,j)\right)^2 = N - 2 \sum_{j=1}^{N} \sum_{i=1}^{N} x(i,j) + \sum_{j=1}^{N} \left(\sum_{i=1}^{N} x(i,j)\right)^2.$$
You can represent the quadratic term using a block matrix, where every block is an N-by-N identity
matrix.
Writing I_N for the N-by-N identity matrix,
$$
x'Qx = x'
\begin{pmatrix}
I_N & I_N & \cdots & I_N \\
I_N & I_N & \cdots & I_N \\
\vdots & \vdots & \ddots & \vdots \\
I_N & I_N & \cdots & I_N
\end{pmatrix} x
= x'
\begin{pmatrix}
\sum_i x(i,1) \\ \vdots \\ \sum_i x(i,N) \\ \sum_i x(i,1) \\ \vdots \\ \sum_i x(i,N) \\ \vdots \\ \sum_i x(i,1) \\ \vdots \\ \sum_i x(i,N)
\end{pmatrix}
= \sum_{j=1}^{N} \left(\sum_{i=1}^{N} x(i,j)\right)^2.
$$
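Both quadratic forms can be checked numerically with Kronecker products, which is also how the tsp2qubo helper function builds them. The following sketch uses base MATLAB only; the random binary matrix X and the names rowTerm and colTerm are illustrative. It stacks x(i,j) in the order (1,1), …, (1,N), (2,1), … described above.

```matlab
rng(2)                               % illustrative seed
N = 4;
X = randi([0 1],N);                  % arbitrary binary X(i,j), not necessarily a route
x = reshape(X',[],1);                % stack in the order (1,1),...,(1,N),(2,1),...
Q0 = kron(eye(N),ones(N));           % block diagonal matrix of all-ones blocks
Q1 = kron(ones(N),eye(N));           % block matrix in which every block is an identity
rowTerm = sum(sum(X,2).^2);          % sum over i of (sum over j of x(i,j))^2
colTerm = sum(sum(X,1).^2);          % sum over j of (sum over i of x(i,j))^2
disp([x'*Q0*x - rowTerm, x'*Q1*x - colTerm])
```

Both differences are zero for any binary X, which confirms that the two block matrices encode the row-sum and column-sum constraints.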
Now calculate the distance to travel from the first city to city N, a one-way trip (calculate the cost to
return in another step). The cost to go from city u to city v, including a check that such a step is in
the route, is
$$D(u,v) \sum_{j=1}^{N-1} x(u,j)\, x(v,j+1). \qquad \text{(18-1)}$$
This expression is the sum over all step numbers in which the step occurs, times the distance of the
step. Therefore, the total cost of the one-way trip is
$$\sum_{u,v} D(u,v) \sum_{j=1}^{N-1} x(u,j)\, x(v,j+1). \qquad \text{(18-2)}$$
To represent this cost function as a QUBO problem, create a block matrix of size N^2-by-N^2, where
each block is of size N-by-N, and the (i,j) block is a matrix with D(i,j) on the upper diagonal.
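$$
Q = \begin{pmatrix}
M(1,1) & \cdots & M(1,N) \\
\vdots & \ddots & \vdots \\
M(N,1) & \cdots & M(N,N)
\end{pmatrix},
$$
where
$$
M(u,v) = \begin{pmatrix}
0 & D(u,v) & 0 & 0 \\
0 & 0 & D(u,v) & 0 \\
\vdots & \vdots & \ddots & D(u,v) \\
0 & 0 & \cdots & 0
\end{pmatrix}.
$$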
To see that this matrix represents the cost of a route from the first city to the last before returning to
the first, consider the contribution of x'Qx from the (u,v) block.
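$$
M(u,v) \begin{pmatrix} x(v,1) \\ x(v,2) \\ \vdots \\ x(v,N) \end{pmatrix}
= D(u,v) \begin{pmatrix} x(v,2) \\ x(v,3) \\ \vdots \\ x(v,N) \\ 0 \end{pmatrix}.
$$
Therefore, multiplying on the left by the row vector (x(u,1), …, x(u,N)) gives the contribution
$$D(u,v) \sum_{j=1}^{N-1} x(u,j)\, x(v,j+1),$$
which is expression 18-1.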
The final distance is the distance for returning from the end of the route to the start. To calculate this
distance, recall that x(i,1) = 1 when i is the first step of the route, and x(j,N) = 1 when j is the last
step. So the cost to return from j to i is
$$\sum_{i=1}^{N} \sum_{j=1}^{N} D(j,i)\, x(i,1)\, x(j,N).$$
To represent this cost in a QUBO problem, place the cost D(j,i) in the upper-right corner of each block
submatrix, and place zeros elsewhere. The matrix M(u,v) for this cost becomes
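$$
M(u,v) = \begin{pmatrix}
0 & 0 & \cdots & D(v,u) \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0
\end{pmatrix}.
$$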
You can check that this matrix represents the cost of returning in the expression x'Qx, where Q is a
block matrix with subblocks M(u,v).
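One way to carry out this check is to build the two distance matrices exactly as the tsp2qubo helper function does and compare the quadratic form with a directly computed tour length. The sketch below uses base MATLAB only; the random city coordinates, the seed, and the names order, tourLength, and quboLength are assumptions made for illustration.

```matlab
rng(1)                                         % illustrative seed
N = 5;
pts = rand(N,2);                               % hypothetical city coordinates
px = pts(:,1);  py = pts(:,2);
dist = hypot(px - px', py - py');              % N-by-N Euclidean distance matrix
B = ones(N);
Q2 = kron(B,diag(ones(N-1,1),1)) .* kron(dist,B);   % D(u,v) on the upper diagonal of block (u,v)
E = zeros(N);  E(1,N) = 1;
Q3 = kron(B,E) .* kron(dist',B);               % D(v,u) in the upper-right corner of block (u,v)
order = randperm(N);                           % order(j) is the city visited at step j
X = zeros(N);
X(sub2ind([N N],order,1:N)) = 1;               % X(i,j) = 1 when city i is visited at step j
x = reshape(X',[],1);                          % stack as (1,1),...,(1,N),(2,1),...
tourLength = sum(dist(sub2ind([N N],order,order([2:N 1]))));   % closed tour length
quboLength = x'*(Q2 + Q3)*x;
disp([tourLength quboLength])
```

The two values agree up to rounding error for any permutation, so Q2 accounts for the N-1 legs of the one-way trip and Q3 for the return leg.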
In summary, the QUBO problem representing the cost of a TSP route has four terms:
• A quadratic term of block diagonal 1s, along with a constant term N and a linear term
$$-2 \sum_{i=1}^{N} \sum_{j=1}^{N} x(i,j)$$
• A quadratic term whose matrix consists entirely of N-by-N identity blocks, along with a constant
term N and a linear term
$$-2 \sum_{j=1}^{N} \sum_{i=1}^{N} x(i,j)$$
• A quadratic term $\sum_{u,v} D(u,v) \sum_{j=1}^{N-1} x(u,j)\, x(v,j+1)$, representing the cost of the
sequence of cities from the first city through city N
• A quadratic term $\sum_{i=1}^{N} \sum_{j=1}^{N} D(j,i)\, x(i,1)\, x(j,N)$, representing the cost to return
from city N to the first city
As explained in Feld [1], to incorporate constraints in the QUBO problem, multiply the first two terms
by a large number such as $\max_{i,j} D(i,j)\, N^2$. In this way, the QUBO problem is minimized when the
constraints are satisfied because the cost of not satisfying a constraint is larger than any cost
associated with the route.
The cost of the first two terms is zero when the constraints are satisfied. You can use the information
to check whether a returned solution is feasible. Another way to check is to rewrite the returned x as
an N-by-N matrix and see if its row sums and column sums are all 1.
Helper Function
This code creates the tsp2qubo helper function.
function QP = tsp2qubo(dist)
% QP = TSP2QUBO(DIST) returns a QUBO problem from the traveling salesperson
% problem specified by the distance matrix DIST. DIST is an N-by-N
% nonnegative matrix where DIST(i,j) is the distance between locations
% i and j.
N = size(dist,1);
% Create constraints on routes
A = eye(N);
B = ones(N);
Q0 = kron(A,B);
Q1 = kron(B,A);
% Create upper diagonal matrices of distances
v = ones(N-1,1);
A2 = diag(v,1);
Q2 = kron(B,A2); % Q2 has a diagonal just above the main diagonal in each block
C = kron(dist,B);
Q2 = Q2.*C; % Q2 has an upper diagonal dist(i,j)
% Create dist(j,i) in the upper-right corner of each block
E = zeros(N);
E(1,N) = 1;
Q3 = kron(B,E); % Q3 has a 1 in the upper-right corner of each block
CP = kron(dist',B); % dist' for D(j,i)
Q3 = Q3.*CP; % Q3 has dist(j,i) in the upper-right corner of each block
% Add the multipliers
M = max(max(dist));
QN = sparse(M*(Q0 + Q1)*N^2 + Q2 + Q3);
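% Symmetrize
QN = (QN + QN.')/2;
% Include the linear and constant terms of the two route constraints,
% scaled by the same multiplier M*N^2 as the quadratic constraint terms
c = -4*ones(N^2,1)*M*N^2;
d = 2*N*M*N^2;
QP = qubo(QN,c,d);
end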
References
[1] Feld, Sebastian, Christoph Roch, Thomas Gabor, Christian Seidel, Florian Neukart, Isabella Galter,
Wolfgang Mauerer, and Claudia Linnhoff-Popien. “A Hybrid Solution Method for the
Capacitated Vehicle Routing Problem Using a Quantum Annealer.” Frontiers in ICT 6 (June
25, 2019): 13. Available at https://arxiv.org/abs/1811.07403.
[2] Lucas, Andrew. “Ising Formulations of Many NP Problems.” Frontiers in Physics 2 (2014).
Available at https://www.frontiersin.org/articles/10.3389/fphy.2014.00005/full.
See Also
solve | qubo
Related Examples
• “Workflow for QUBO Problems” on page 18-8
• “Constraints in QUBO Problems” on page 18-10
• “Capacitated Vehicle Routing Problem” on page 18-27
Capacitated Vehicle Routing Problem
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
The following figure shows four routes originating from a single point, the depot. These routes do not
represent a minimal solution, because nodes 2 and 3 (at least) should be visited in the opposite order.
The route containing nodes 2 and 3 has a self-intersection, which does not occur in an optimal tour.
To solve a capacitated vehicle routing problem, follow the steps in Feld and coauthors [1]. While Feld
gives several solution approaches, this example uses just one:
• Create clusters that represent groups of customers visited by a vehicle in a single route. This step
is a knapsack problem.
• Solve the traveling salesperson problem for each cluster.
Solve the cluster creation problem using a classical algorithm. Solve the traveling salesperson
problems by mapping them to QUBO problems. Solve the QUBO problems using the solve function,
which internally uses the tabu search algorithm.
Create the matrix of distances between customers from the coordinate vector. Use the Pythagorean
rule and Euclidean distance.
numLocations = numCustomers + 1;
[X,Y] = meshgrid(1:numLocations);
Set up variables for the remainder of the example by removing the depot, site 1, from the problem
data.
customerCoords = loc(2:end,:);
costMatrix = dist;
vehicleCapacity = capacity;
Create Clusters
Feld [1] gives the following two-step approach for creating clusters:
1 Create initial clusters based on the problem coordinates and capacity limits.
2 Refine the initial clusters to obtain shorter traveling salesperson routes.
For this example, create initial clusters using the following iterative approach:
Create the clusters as structures with the fields Customers, Center, and Demand. Use the
addToCluster helper function to add a customer to the cluster.
clusters = [];
% CLUSTER GENERATION
currentCluster.Customers = idx;
currentCluster.Center = customerCoords(idx,:);
currentCluster.Demand = demand(idx);
unclustered.Coords = customerCoords;
unclustered.Customers = 1:numCustomers;
unclustered.Coords(idx,:) = [];
unclustered.Customers(idx) = [];
end
clusters = [clusters, currentCluster];
After creating the initial clusters, you can improve them. To do so, reassign a customer to a different
cluster if doing so places the customer closer to the mean position of the new cluster compared to the
current cluster, without exceeding the capacity constraint of the new cluster. Continue to perform
cluster improvement steps until no more steps are possible, or until 10 improvement steps are made.
Improve the clusters using the addToCluster and removeFromCluster helper functions.
% CLUSTER IMPROVEMENT
iterations = 0;
while iterations < 10
    % For each cluster
    for i = 1:numel(clusters)
        % And each customer within the cluster
        for customer = clusters(i).Customers
            % Calculate the customer's distance from the center
            d_i = sqrt(sum((clusters(i).Center - customerCoords(customer,:)).^2));
            % For each alternative cluster
            for j = [1:i-1, i+1:numel(clusters)]
                % Calculate the customer's distance to the center of the
                % alternative cluster
                d_j = sqrt(sum((clusters(j).Center - customerCoords(customer,:)).^2));
                % Move the customer to the alternative cluster if it is closer and
                % capacity constraints are met
                if d_j < d_i && clusters(j).Demand + demand(customer) < vehicleCapacity
                    % Remove the customer from the original cluster
                    % and add the customer to the new cluster
                    clusters(i) = removeFromCluster(clusters(i), ...
                        customer, customerCoords, demand);
                    clusters(j) = addToCluster(clusters(j), ...
                        customer, customerCoords, demand);
                    break
                end
            end
        end
    end
    iterations = iterations + 1;
end
Adjust the customer labels so that they correspond to the original problem, with the depot being site
1 and the first customer being site 2.
nRoutes = numel(clusters);
for i = 1:nRoutes
clusters(i).Customers = clusters(i).Customers + 1;
end
For each cluster, create a distance matrix for the customers in that cluster. Collect the various TSPs
and their solutions in structures. Each structure contains the customers associated with that TSP, the
distances between customers, and the solution to the TSP, which is the route with the minimal
distance.
TSPsolutions = cell(nRoutes,1);
Routes = cell(nRoutes,1);
customerCoords = loc; % Return depot to list
For each cluster of customers, compute the pairwise distances and formulate the corresponding
TSPs. Convert each TSP to a QUBO problem using the tsp2qubo helper function. Solve the TSPs
using the solvemyTSP helper function.
for rt = 1:nRoutes
% Compute pairwise distance
cluster = clusters(rt);
coords = [depot; customerCoords(cluster.Customers,:)];
M = height(coords);
[X,Y] = meshgrid(1:M);
dist = hypot(coords(X(:),1)- coords(Y(:),1),coords(X(:),2) - coords(Y(:),2));
d =
The demand in each route is less than the capacity limit of 6000.
Helper Functions
This code creates the addToCluster helper function. Note that this helper function uses the
updateCluster helper function.
function cluster = addToCluster(cluster,next,customerCoordinates,demandVector)
cluster.Customers = [cluster.Customers, next];
cluster = updateCluster(cluster,next,customerCoordinates,demandVector,1);
end
This code creates the removeFromCluster helper function. Note that this helper function uses the
updateCluster helper function.
function cluster = removeFromCluster(cluster,next,customerCoordinates,demandVector)
cluster.Customers(cluster.Customers == next) = [];
cluster = updateCluster(cluster,next,customerCoordinates,demandVector,-1);
end
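This code creates the updateCluster helper function.
function cluster = updateCluster(cluster,next,customerCoordinates,demandVector,s)
% s is 1 when customer next was added to the cluster and -1 when it was removed
currentN = numel(cluster.Customers);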
previousN = currentN - s;
% Update center
newX = (cluster.Center(1)*previousN + s*customerCoordinates(next,1))/(currentN);
newY = (cluster.Center(2)*previousN + s*customerCoordinates(next,2))/(currentN);
cluster.Center = [newX,newY];
% Update demand
cluster.Demand = cluster.Demand + s*demandVector(next);
end
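The center update in this code is an incremental mean: the previous center is weighted by the previous customer count, the added (s = 1) or removed (s = -1) customer is applied, and the result is divided by the new count. The following base MATLAB sketch, with hypothetical coordinates and the illustrative names coords, members, and center, replays two additions and one removal and compares the result with a mean computed from scratch.

```matlab
coords = [0 0; 4 0; 4 3; 10 10];         % hypothetical customer coordinates
members = 1;                             % cluster starts with customer 1
center = coords(1,:);
for next = [2 3]                         % add customers 2 and 3 (s = 1)
    members(end+1) = next;
    currentN = numel(members);
    previousN = currentN - 1;
    center = (center*previousN + coords(next,:))/currentN;
end
members(members == 2) = [];              % remove customer 2 (s = -1)
currentN = numel(members);
previousN = currentN + 1;
center = (center*previousN - coords(2,:))/currentN;
disp(center)
disp(mean(coords(members,:),1))
```

Both displayed vectors are [2 1.5], the mean position of customers 1 and 3. Note that the formula divides by the new customer count, so it is undefined if the last customer is removed from a cluster.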
scatter(customerCoords(1,1), customerCoords(1,2),"filled");
ax = f.CurrentAxes;
text(customerCoords(1,1), customerCoords(1,2),"Depot");
hold on
ax.ColorOrderIndex = colorIdx;
plot(customerCoords([tr(end),tr(1)],1),...
customerCoords([tr(end),tr(1)],2));
end
end
drawnow
hold off
end
This code creates the tsp2qubo helper function.
function QP = tsp2qubo(dist)
N = size(dist,1);
% Create constraints on routes
A = eye(N);
B = ones(N);
Q0 = kron(A,B);
Q1 = kron(B,A);
% Create upper diagonal matrices of distances
v = ones(N-1,1);
A2 = diag(v,1);
Q2 = kron(B,A2); % Q2 has a diagonal just above the main diagonal in each block
C = kron(dist,B);
Q2 = Q2.*C; % Q2 has an upper diagonal dist(i,j)
% Create dist(j,i) in the upper-right corner of each block
E = zeros(N);
E(1,N) = 1;
Q3 = kron(B,E); % Q3 has a 1 in the upper-right corner of each block
CP = kron(dist',B); % dist' for D(j,i)
Q3 = Q3.*CP; % Q3 has dist(j,i) in the upper-right corner of each block
% Add the multipliers
M = max(max(dist));
QN = sparse(M*(Q0 + Q1)*N^2 + Q2 + Q3);
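% Symmetrize
QN = (QN + QN.')/2;
% Include the linear and constant terms of the two route constraints,
% scaled by the same multiplier M*N^2 as the quadratic constraint terms
c = -4*ones(N^2,1)*M*N^2;
d = 2*N*M*N^2;
QP = qubo(QN,c,d);
end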
This code creates the solvemyTSP helper function. Note that this helper function uses the
solveTSPwithTabu and convertSolutionToRoute helper functions.
function TSPsolution = solvemyTSP(tsp)
% solvemyTSP solves the TSP by first converting it to a QUBO problem
% and then using tabu search to find a solution.
% Inputs:
%   tsp: Structure with fields
%     Customers: the customers for the TSP
%     CostMatrix: the cost matrix for the TSP
% Outputs:
%   TSPsolution: Structure with fields
%     Route: The order of the customers in the best route found. For example,
%       if there are 5 customers in this TSP, the Route might be [2 3 1 5 4]
%       for customers [13 6 3 12 7].
%     Customers: The customers in the current TSP, taken directly
%       from the tsp input
%     QuboFval: The fval returned by the QUBO algorithm
n = numel(tsp.Customers);
if n < 3
TSPsolution.Route = 1:n;
TSPsolution.Customers = tsp.Customers;
TSPsolution.QuboFval = tsp.CostMatrix(1,2) + tsp.CostMatrix(2,1);
return
end
This code creates the solveTSPwithTabu helper function. Note that this helper function uses the
tsp2qubo and checkTSPConstraints helper functions.
function solution = solveTSPwithTabu(costMatrix)
% Solves the TSP with tabu search. Convert the TSP to a QUBO
% according to the penalty term and return the best feasible solution.
% Convert the TSP problem to QUBO
Q = tsp2qubo(costMatrix);
x = solve(Q);
validSolutions = checkTSPConstraints(x);
% If no valid solution is found yet, solve again, for at most 10 solves in total
it = 1;
while ~validSolutions && it < 10
    x = solve(Q);
    validSolutions = checkTSPConstraints(x);
    it = it + 1;
end
if validSolutions
    solution = x;
else
    solution = [];
end
end
Route = nodes(node_order);
end
References
[1] Feld, Sebastian, Christoph Roch, Thomas Gabor, Christian Seidel, Florian Neukart, Isabella Galter,
Wolfgang Mauerer, and Claudia Linnhoff-Popien. “A Hybrid Solution Method for the
Capacitated Vehicle Routing Problem Using a Quantum Annealer.” Frontiers in ICT 6 (June
25, 2019): 13. Available at https://arxiv.org/abs/1811.07403.
See Also
solve | qubo | evaluateObjective
Related Examples
• “Workflow for QUBO Problems” on page 18-8
• “Traveling Salesperson Problem with QUBO” on page 18-17
Feature Selection QUBO (Quadratic Unconstrained Binary Optimization)
Note Installation Required: This functionality requires MATLAB Support Package for Quantum
Computing.
• Prevent overfitting — Avoid modeling with an excessive number of features that are more
susceptible to rote-learning-specific training examples.
• Reduce model size — Increase computational performance with high-dimensional data or prepare
a model for embedded deployment when memory might be limited.
• Improve interpretability — Use fewer features, which might help identify those that affect model
behavior.
Based on Mücke, Heese, Müller, Wolter, and Piatkowski [1], you can perform feature selection by
using a QUBO model to select relevant predictors. Specifically, suppose that you have a set of N scalar
observations y(i), each associated with p variables (the predictors) x(i,j), for i = 1 through N
and j = 1 through p. The problem is to find a subset of the p predictors such that the resulting model
gives accurate predictions of the observations.
• Specify the input data as an N-by-p matrix X, where each row is one data point.
• Specify the response data as an N-by-1 vector Y, where Y(i) is the response to the row X(i,:).
• Calculate the matrix R(i,j) for i and j from 1 through p as the mutual information between
columns i and j of X. R(i,j) represents the redundancy between pairs of features. The mutual
information is a nonnegative quantity that can be estimated by binning the data and counting
entries in each bin to estimate the associated probabilities. For more information, see
https://en.wikipedia.org/wiki/Mutual_information.
• Similarly, calculate the vector J(i) for i from 1 through p as the mutual information between
column i of X and the response variable Y. The quantity J(i) is nonnegative and represents the
strength of the coupling between column i and the response Y.
• For each value α from 0 through 1, define the QUBO as
Q_α = (1 - α)R - diag(αJ).
Because each R and J entry is nonnegative, the minimum of the QUBO problem is at the point [1,1,
…,1] for α = 1 and the point [0,0,…,0] for α = 0. In [1], Mücke et al. prove that in nontrivial
problems, for each k from 0 through p, there is a value α such that the number of nonzero entries in
the solution to Q_α is equal to k. In other words, for each number of predictors k, there is a QUBO
problem Q_α whose solution has exactly k nonzero entries. These entries represent the data columns
in X that best match the Y data while not matching the other data columns.
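The behavior at the two endpoints can be checked by brute force on a small problem. The following is a minimal sketch, not part of the original example: the matrix R and vector J are hypothetical toy values (nonnegative, with a zero diagonal in R), and the code enumerates every binary vector rather than calling a QUBO solver.

```matlab
% Hypothetical toy data: nonnegative redundancy matrix R (zero diagonal)
% and nonnegative relevance vector J for p = 4 predictors.
R = [0   0.8 0.1 0.2;
     0.8 0   0.1 0.3;
     0.1 0.1 0   0.1;
     0.2 0.3 0.1 0];
J = [0.9; 0.7; 0.5; 0.1];
p = numel(J);

% Enumerate all 2^p binary vectors, one per row.
allX = dec2bin(0:2^p-1) - '0';

for alpha = [0 0.25 0.5 0.75 1]
    Q = (1 - alpha)*R - diag(alpha*J);
    % Objective value x*Q*x' for every candidate row x of allX.
    vals = sum((allX*Q).*allX,2);
    % min returns the first minimizer when several rows tie.
    [~,idx] = min(vals);
    fprintf("alpha = %.2f selects %d predictors: [%s]\n", ...
        alpha,sum(allX(idx,:)),num2str(allX(idx,:)));
end
```

For this toy data, α = 0 reports the all-zeros vector and α = 1 reports the all-ones vector. Because R has a zero diagonal, the all-zeros vector is one of several minimizers at α = 0, and the sketch reports it only because it is listed first. Enumeration is practical only for very small p.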
1 Create the example data set by using the iSyntheticData1 helper function.
[N,p,X,Y] = iSyntheticData1;
2 View the number of data points and predictors.
N = 10000
p = 30
3 Select five features, the number of useful features in this example. In general, you can try to find
how many useful features exist by cross-validating models with different numbers of features.
K = 5;
4 Create the R matrix and J vector from the data by using 20 bins for the data, and estimating the
mutual information by counting the number of entries in each bin. Use the iBinXY,
iComputeMIXX, and iComputeMIXY helper functions to create the matrix and vector.
nbins = 20;
[binnedXInfo,binnedYInfo] = iBinXY(X,Y,nbins);
binnedX = binnedXInfo.binned;
nbinsX = binnedXInfo.nbins;
binnedY = binnedYInfo.binned;
nbinsY = binnedYInfo.nbins;
R0 = iComputeMIXX(binnedX,nbinsX);
J = iComputeMIXY(binnedX,binnedY,nbinsX,nbinsY);
% Scale R so that alpha = 0.5 corresponds to equal redundancy and
% relevance terms.
R = R0/(K-1);
5 Find the appropriate value of α for the QUBO Q_α = (1 - α)R - diag(αJ). To do so, compute the
number of nonzero elements in the solution by using the howmany helper function, and then use
fzero to find the value of α for which that number equals K.
fun = @(alpha)howmany(alpha,R,J) - K;
alphasol = fzero(fun,[0 1]);
[~,xsol] = howmany(alphasol,R,J);
6 View the selected predictors.
find(xsol.BestX)
ans =
6
8
9
11
13
7 Train a regression tree, first using the full set of 30 predictors and then using the selected set of
five predictors. Specify a common holdout value of 0.2, and compare the cross-validation losses.
First train the regression tree using all the predictors. Training a regression tree requires a
Statistics and Machine Learning Toolbox license.
rng default
mdl = fitrtree(X,Y,CrossVal="on",Holdout=0.2);
kfoldLoss(mdl)
ans = 0.0041
8 Train the regression tree using only the selected predictors.
rng default
X2 = X(:,find(xsol.BestX));
mdl2 = fitrtree(X2,Y,CrossVal="on",Holdout=0.2);
kfoldLoss(mdl2)
ans = 0.0032
Although both regression trees have a low loss value, the regression tree that uses only five of
the 30 predictors has a lower loss value.
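Step 5 relies on the howmany helper function, which is not listed in this section. The following is a minimal sketch of what such a helper could look like, not the original listing. It assumes that qubo accepts the dense matrix directly and that the result returned by solve has a BestX property, as used in step 6.

```matlab
function [n,result] = howmany(alpha,R,J)
% Sketch only (assumed implementation): count the nonzero entries in
% the solution of the QUBO (1 - alpha)*R - diag(alpha*J).
% Assumes R is p-by-p and J is p-by-1, both nonnegative.
Q = qubo((1 - alpha)*R - diag(alpha*J));
result = solve(Q);        % uses the default solver
n = nnz(result.BestX);    % number of selected predictors
end
```

Because the count is integer valued, the function passed to fzero is piecewise constant in alpha. The bracket [0 1] works because the count minus K is negative at alpha = 0 and positive at alpha = 1 when 0 < K < p.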
Helper Functions
This code creates the iSyntheticData1 helper function.
function [N,p,X,y] = iSyntheticData1
rng default
N = 10000;
p = 30;
useful = [6,8,9,11,13];
C = randn(p,p);
R = corrcov(C'*C);
X = mvnrnd(zeros(p,1),R,N);
% Make features 15 to 19 highly correlated with useful features:
% 15 -> 6
% 16 -> 8
% 17 -> 9
% 18 -> 11
% 19 -> 13
corrStd = 0.1;
X(:,15:19) = X(:,useful) + corrStd*randn(N,5);
noiseStd = 0.1;
t = 0.5*cos(X(:,11)) + sin(X(:,9).*X(:,8)) + 0.5*X(:,13).*X(:,6) + noiseStd*randn(N,1);
y = rescale(t,0,1);
X = zscore(X);
end
This code creates the iBinXY helper function. Note that this helper function uses the
iBinPredictors helper function.
function [binnedXInfo,binnedYInfo] = iBinXY(X,Y,nbins)
binnedXInfo = iBinPredictors(X,nbins);
binnedYInfo = iBinPredictors(Y,nbins);
end
This code creates the iComputeMIXX helper function. Note that this helper function uses the
iComputeMIXIXJ helper function.
function Q = iComputeMIXX(binnedX,nbinsX)
p = size(binnedX,2);
Q = zeros(p,p);
for i = 1:p
    for j = i+1:p
        Xi = binnedX(:,i);
        Xj = binnedX(:,j);
        nbinsXi = nbinsX(i);
        nbinsXj = nbinsX(j);
        Q(i,j) = iComputeMIXIXJ(Xi,Xj,nbinsXi,nbinsXj);
    end
end
Q = Q + Q';
end
This code creates the iComputeMIXY helper function. Note that this helper function uses the
iComputeMIXIXJ helper function.
function f = iComputeMIXY(binnedX,binnedY,nbinsX,nbinsY)
p = size(binnedX,2);
f = zeros(p,1);
for i = 1:p
    Xi = binnedX(:,i);
    nbinsXi = nbinsX(i);
    f(i) = iComputeMIXIXJ(Xi,binnedY,nbinsXi,nbinsY);
end
end
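The iComputeMIXIXJ helper function is not listed in this section. As an illustration of the binning-and-counting estimate described earlier, the following minimal sketch computes the mutual information between two vectors of bin indices. The data and variable names are hypothetical, and the original helper might differ in details such as the logarithm base.

```matlab
% Hypothetical binned data: bin indices in 1..nbinsA and 1..nbinsB.
a = [1 1 2 2 3 3 1 2]';
b = [1 1 2 2 1 2 1 2]';
nbinsA = 3;
nbinsB = 2;

% Joint histogram: counts(i,j) = number of rows with a == i and b == j.
counts = accumarray([a b],1,[nbinsA nbinsB]);
pab = counts/sum(counts,"all");   % joint probabilities
pa = sum(pab,2);                  % marginal of a (column vector)
pb = sum(pab,1);                  % marginal of b (row vector)

% Mutual information in nats. Empty joint bins contribute zero, so
% they are masked out before taking the logarithm.
ratio = pab./(pa*pb);
nz = pab > 0;
mi = sum(pab(nz).*log(ratio(nz)));
```

The estimate is nonnegative up to rounding and cannot exceed the logarithm of the smaller bin count.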
This code creates the iBinPredictors helper function. Note that this helper function uses the
iDiscretize helper function.
function binnedInfo = iBinPredictors(X,nbins)
[N,p] = size(X);
binnedX = zeros(N,p);
edgesX = cell(1,p);
iscatX = false(1,p);
nbinsX = zeros(1,p);
istbl = istable(X);
for i = 1:p
    if istbl
        oneX = X{:,i};
    else
        oneX = X(:,i);
    end
    nbinsi = min(nbins, numel(unique(oneX)));
    [binnedX(:,i),edgesX{i},iscatX(i),nbinsX(i)] = iDiscretize(oneX,nbinsi);
end
binnedInfo = struct;
binnedInfo.binned = binnedX;
binnedInfo.edges = edgesX;
binnedInfo.iscat = iscatX;
binnedInfo.nbins = nbinsX;
end
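The iDiscretize helper function is not listed in this section. The following minimal sketch shows one way to bin a numeric column, assuming equal-width bins over the data range. The data is hypothetical, and the original helper might choose edges differently or treat categorical data separately.

```matlab
% Hypothetical numeric predictor column and requested bin count.
x = [0.1; 0.4; 0.35; 0.8; 1.0; 0.05];
nbins = 4;

% Equal-width edges spanning the data range (assumes max(x) > min(x)).
edges = linspace(min(x),max(x),nbins + 1);

% discretize returns the bin index (1..nbins) of each element. The
% last bin includes its right edge, so max(x) falls in bin nbins.
binned = discretize(x,edges);
```

The resulting indices can be passed directly to a histogram-based estimate of mutual information.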
References
[1] Mücke, S., R. Heese, S. Müller, M. Wolter, and N. Piatkowski. Quantum Feature Selection.
arXiv:2203.13261v1, March 2022. Available at https://arxiv.org/pdf/2203.13261.pdf.
See Also
fitrtree | solve | qubo | tabuSearch
Related Examples
• “Workflow for QUBO Problems” on page 18-8
External Websites
• https://en.wikipedia.org/wiki/Mutual_information