Linear Algebra: Assignment I
Assignment I
BLAS & LAPACK
陳文漢 | 110006223
1. Run numpy.show_config() and scipy.show_config()
- First, uncomment this part of the given code.
These are the BLAS and LAPACK libraries that our Python installation uses.
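A minimal way to print this build information from Python (assuming a standard NumPy/SciPy installation):

```python
import numpy as np
import scipy

# Each call prints which BLAS/LAPACK libraries the package was built against
np.show_config()
scipy.show_config()
```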
2. a.) Naming convention: the starting characters S, D, C, Z in
BLAS and LAPACK routine names indicate the data type:
S : single-precision real numbers
D : double-precision real numbers
C : single-precision complex numbers
Z : double-precision complex numbers
❖ SSPR2
➢ SSPR2 performs a symmetric rank-2 update.
➢ It computes A := alpha*x*y**T + alpha*y*x**T + A, where
alpha is a scalar, x and y are n-element vectors, and A is an
n by n symmetric matrix.
➢ The subroutine SSPR2 takes the following parameters:
▪ character UPLO :
• UPLO has a character data type.
• On input, UPLO indicates whether the upper or lower
triangular part of matrix A is supplied in the packed array
AP.
• UPLO = 'U' or 'u': the upper triangular part of A is
supplied in AP.
• UPLO = 'L' or 'l': the lower triangular part of A is
supplied in AP.
▪ integer N:
• N is integer type.
• N indicates the order of matrix A where N must be
greater than or equal to zero.
▪ real ALPHA:
• ALPHA is a real data type.
• ALPHA indicates a scalar alpha.
▪ real, dimension(*) X:
• X is a REAL array of dimension at least
( 1 + ( n - 1 )*abs( INCX ) ).
• Before entry, the incremented array X must contain
the n-element vector x.
▪ integer INCX:
• The increment for the elements of X is specified on
entry by INCX.
• INCX cannot equal zero.
▪ real, dimension(*) Y:
• Y is a REAL array of dimension at least
( 1 + ( n - 1 )*abs( INCY ) ).
• Before entry, the incremented array Y must contain
the n-element vector y.
▪ integer INCY:
• The increment for the elements of Y is specified on
entry by INCY.
• INCY cannot equal zero.
▪ real, dimension(*) AP:
• AP is a REAL input/output array of dimension at least
( ( n*( n + 1 ) )/2 ); on entry it holds the packed
triangle of A, and on exit it is overwritten by the
corresponding triangle of the updated matrix.
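The update SSPR2 performs can be sketched directly in NumPy. This reproduces the math only; the actual BLAS routine operates on the packed array AP rather than a full n by n matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
alpha = np.float32(0.5)
x = rng.standard_normal(n).astype(np.float32)
y = rng.standard_normal(n).astype(np.float32)
M = rng.standard_normal((n, n)).astype(np.float32)
A = (M + M.T) / 2                       # symmetric n-by-n input matrix

# SSPR2's operation: A := alpha*x*y**T + alpha*y*x**T + A
A_new = alpha * np.outer(x, y) + alpha * np.outer(y, x) + A
```

Note that the update preserves symmetry, which is why storing only one packed triangle in AP is sufficient.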
❖ ZGERC:
➢ ZGERC performs the rank-1 operation
➢ A := alpha*x*y**H + A. Here, alpha is a scalar, x
is an m-element vector, y is an n-element vector, and A is an
m x n matrix.
➢ ZGERC takes the following parameters:
▪ integer M:
• M is an integer which indicates the numbers of rows of
the matrix A, and M must be greater than or equal to
zero.
▪ integer N:
• N is an integer which indicates the number of columns
of the matrix A and N must be greater than or equal to
zero.
▪ complex*16 ALPHA:
• ALPHA is a COMPLEX*16 which indicates the scalar
alpha on input.
▪ complex*16, dimension(*) X:
• X is a COMPLEX*16 array with a dimension at least
( 1 + ( m - 1 )*abs( INCX ) ).
• Before entry, the incremented array X must contain the
m-element vector x.
▪ integer INCX:
• INCX is an integer.
• The increment for the elements of X is specified on
entry by INCX where INCX cannot equal zero.
▪ complex*16, dimension(*) Y:
• Y is a COMPLEX*16 array with a dimension at least
( 1 + ( n - 1 )*abs( INCY ) ).
• Before entry, the incremented array Y must contain the
n-element vector y.
▪ integer INCY:
• INCY is an integer.
• The increment for the elements of Y is specified on
entry by INCY; INCY must not be zero.
▪ complex*16, dimension(lda,*) A:
• A is a COMPLEX*16 array of dimension (LDA, N). →
LDA is explained in the next point.
• Before entry, the leading m by n part of the array A
must contain the matrix of coefficients.
• On exit, A is overwritten by the updated matrix.
▪ integer LDA:
• LDA is an integer.
• On entry, LDA specifies the first dimension of A as
declared in the calling (sub)program.
• LDA must be at least max( 1, m ).
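A small check of ZGERC through SciPy's BLAS wrapper (scipy.linalg.blas.zgerc). When no A is supplied, the wrapper starts from A = 0, so the result is just alpha*x*y**H:

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(0)
m, n = 3, 2
alpha = 1.5 + 0.5j
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# ZGERC: A := alpha*x*y**H + A (A defaults to the zero matrix here)
a_new = blas.zgerc(alpha, x, y)

# Check against the definition: y**H is the conjugate transpose of y
expected = alpha * np.outer(x, y.conj())
```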
❖ DGBSVX:
➢ DGBSVX uses the LU factorization to compute the solution to
a real system of linear equations
A * X = B, A**T * X = B, or A**H * X = B,
where A is a band matrix of order N with KL subdiagonals and
KU superdiagonals, and X and B are N-by-NRHS matrices.
➢ It also provides a condition-number estimate and error bounds
on the solution.
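A banded system of the kind DGBSVX handles can be solved from SciPy with scipy.linalg.solve_banded, which also relies on LAPACK's banded LU factorization (a sketch with a tridiagonal example, i.e. KL = KU = 1):

```python
import numpy as np
from scipy.linalg import solve_banded

# Tridiagonal A: order N = 4, KL = 1 subdiagonal, KU = 1 superdiagonal
A = np.array([[4., 1., 0., 0.],
              [1., 4., 1., 0.],
              [0., 1., 4., 1.],
              [0., 0., 1., 4.]])
b = np.array([1., 2., 3., 4.])

# LAPACK-style banded storage: ab[ku + i - j, j] = A[i, j]
ab = np.array([[0., 1., 1., 1.],    # superdiagonal (first entry unused)
               [4., 4., 4., 4.],    # main diagonal
               [1., 1., 1., 0.]])   # subdiagonal (last entry unused)

x = solve_banded((1, 1), ab, b)
```

Storing only the 2*KL + KU + 1 nonzero diagonals is what makes the banded solvers much cheaper than a dense LU for large sparse-banded systems.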
❖ CHEEVR:
➢ CHEEVR computes selected eigenvalues and, optionally,
eigenvectors of a complex Hermitian matrix A.
Eigenvalues and eigenvectors can be selected by specifying
either a range of values or a range of indices for the desired
eigenvalues.
➢ CHEEVR first reduces the matrix A to tridiagonal form T with
a call to CHETRD.
➢ Then, CHEEVR invokes CSTEMR to compute the
eigenspectrum using Relatively Robust Representations.
➢ CSTEMR computes eigenvalues with the dqds algorithm, while
orthogonal eigenvectors are computed from various L D L^T
representations (also known as Relatively Robust
Representations).
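From SciPy this code path can be exercised with scipy.linalg.eigh, whose driver='evr' option dispatches to the *HEEVR routines; the index-range selection below mirrors CHEEVR's "range of indices" mode:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (M + M.conj().T) / 2                # complex Hermitian test matrix

# Compute only the three smallest eigenpairs (indices 0..2)
w, v = eigh(A, driver='evr', subset_by_index=[0, 2])
```

Each returned column v[:, i] satisfies A v_i = w_i v_i, with the eigenvalues w returned in ascending order.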
• First, we write the check_equal function that we will
use later to check the correctness of the results.
▪ Iterate over the matrix sizes n with a for loop and then
compare B (np.random.rand(n)) against
C (lapack.dgesvx(A, B)[7]).
• Then, declare a list to store the matrix sizes to test.
• Here we can use a for loop to iterate over the sizes that
we have already defined.
• Here, C[7] (C at position 7) is the solution that we want from
using the subroutine DGESVX in LAPACK to solve Ax = b.
source:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lapack.dgesvx.html
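The procedure described above might be sketched like this (check_equal is the name used in the assignment; the solution sits at index 7 of dgesvx's return tuple, per the SciPy page cited above):

```python
import numpy as np
from scipy.linalg import lapack

def check_equal(x, x_ref, tol=1e-8):
    """Check a computed vector against a reference within tol."""
    return np.allclose(x, x_ref, atol=tol)

sizes = [10, 50, 100]               # matrix sizes to test
results = []
for n in sizes:
    A = np.random.rand(n, n)
    b = np.random.rand(n, 1)
    C = lapack.dgesvx(A, b)         # expert-driver solve of A x = b
    x = C[7]                        # C[7] holds the solution X
    results.append(check_equal(A @ x, b))
```

Checking the residual A @ x against b (rather than comparing against another solver) verifies the solution directly from the definition of the linear system.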
4. The reason why BLAS/LAPACK can be so effective on
modern CPUs is that they utilize block matrix operations.
(See textbook 1.6.) How can block matrix operations be used
to solve a linear system Ax = b? Derive the formula by
partitioning A into a 2 × 2 block matrix. Explain why a good
matrix-matrix multiplication subroutine can accelerate the
computation of solving Ax = b.
❖ Here, we have the linear system Ax = b.
• Block matrices, also known as partitioned matrices,
represent a large matrix by breaking it up into smaller
submatrices.
• Operations can be performed on the submatrices, and the
results can then be assembled back into general matrix form.
• In order to solve this linear system using block matrix
operations, we first partition the matrix A into a 2 x 2
block matrix.
• A1, A2, A3, A4 are the submatrices formed by partitioning
matrix A, and b1, b2 are the subvectors formed by
partitioning the vector b.
• Ax = b → x = A^(-1) b. In block form the system reads
[ A1 A2 ] [ x1 ]   [ b1 ]
[ A3 A4 ] [ x2 ] = [ b2 ]
• From the first block row, x1 = A1^(-1) ( b1 - A2 x2 ).
Substituting this into the second block row eliminates x1 and
gives the Schur-complement system
( A4 - A3 A1^(-1) A2 ) x2 = b2 - A3 A1^(-1) b1.
• So we can get x: first solve the Schur-complement system for
x2, then back-substitute to obtain x1. Both steps consist
mainly of matrix-matrix products and smaller solves, which is
why a fast matrix-matrix multiplication subroutine speeds up
the whole solve.
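The block formulas above can be checked numerically (a sketch; np.linalg.solve stands in for the smaller subproblem solves):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 3                          # total size and size of the leading block
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
b = rng.standard_normal(n)

# Partition A into a 2 x 2 block matrix and b into two subvectors
A1, A2 = A[:k, :k], A[:k, k:]
A3, A4 = A[k:, :k], A[k:, k:]
b1, b2 = b[:k], b[k:]

# Schur complement system: (A4 - A3 A1^(-1) A2) x2 = b2 - A3 A1^(-1) b1
A1inv_A2 = np.linalg.solve(A1, A2)
A1inv_b1 = np.linalg.solve(A1, b1)
S = A4 - A3 @ A1inv_A2
x2 = np.linalg.solve(S, b2 - A3 @ A1inv_b1)

# Back-substitute: x1 = A1^(-1) (b1 - A2 x2)
x1 = A1inv_b1 - A1inv_A2 @ x2
x = np.concatenate([x1, x2])
```

The dominant costs here are the products A3 @ A1inv_A2 and A3 @ A1inv_b1, so the overall solve inherits the speed of the matrix-matrix multiplication kernel.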
Source: geeksforgeeks
➢ In this simple method, there are 8 block multiplications and 4
block additions in total, so each level of recursion makes 8
recursive calls and the overall time complexity stays at O(n^3).
❖ The Strassen algorithm reduces the number of recursive calls to
7.
❖ The Strassen algorithm follows the same simple divide-and-conquer
scheme as above, but the 4 submatrices of the result are
computed from 7 products using the specific formulas shown in
the figure below.
source:geeksforgeeks
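A sketch of Strassen's recursion for square matrices whose size is a power of two (the cutoff falls back to the ordinary product for small blocks):

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen multiplication for square matrices of power-of-two size."""
    n = A.shape[0]
    if n <= cutoff:                  # base case: ordinary multiplication
        return A @ B
    h = n // 2
    A11, A12 = A[:h, :h], A[:h, h:]
    A21, A22 = A[h:, :h], A[h:, h:]
    B11, B12 = B[:h, :h], B[:h, h:]
    B21, B22 = B[h:, :h], B[h:, h:]
    # The seven Strassen products
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    # Reassemble the four result blocks
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

The cutoff parameter matters in practice: the many extra block additions make pure Strassen slow on small matrices, so real implementations switch to the conventional kernel below some threshold.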
❖ For small matrices this method is actually slower than a
conventional matrix-matrix multiplication, because the extra
block additions and the recursion add overhead. However, as the
input becomes larger and larger, Strassen's O(n^2.81) cost grows
more slowly than the conventional (naive) O(n^3) matrix-matrix
multiplication, so its relative performance improves. We can see
a graph below that shows the performance of each method as the
matrices get larger and larger.
❖ That overhead on practically sized matrices, explained above, is
the reason why people usually do not use Strassen's method in
practice for high-performance computation.