Linear Algebra: Assignment I
Assignment I
BLAS & LAPACK
陳文漢 | 110006223
1. Run numpy.show_config() and scipy.show_config()
- First, uncomment this part of the given code.
These are the BLAS and LAPACK libraries that our Python installation uses.
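A minimal way to print this build information from Python (assuming a standard NumPy/SciPy installation):

```python
import numpy as np
import scipy

# Each call prints which BLAS/LAPACK libraries the package was built against
np.show_config()
scipy.show_config()
```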
2. a.) Naming convention: the starting characters S, D, C, Z in
BLAS and LAPACK routine names indicate the data type:
S : single-precision real numbers
D : double-precision real numbers
C : single-precision complex numbers
Z : double-precision complex numbers
❖ SSPR2
➢ SSPR2 performs a symmetric rank-2 update.
➢ It computes A := alpha*x*y**T + alpha*y*x**T + A, where
alpha is a scalar, x and y are n-element vectors, and A is an
n by n symmetric matrix.
➢ The subroutine SSPR2 takes the following parameters:
▪ character UPLO :
• UPLO has a character data type.
• On input, UPLO indicates whether the upper or lower
triangular part of matrix A is supplied in the packed array
AP.
• UPLO = 'U' or 'u': the upper triangular part of A is
supplied in AP.
• UPLO = 'L' or 'l': the lower triangular part of A is
supplied in AP.
▪ integer N:
• N is integer type.
• N indicates the order of matrix A where N must be
greater than or equal to zero.
▪ real ALPHA:
• ALPHA is a real data type.
• ALPHA indicates a scalar alpha.
▪ real, dimension(*) X:
• X is a REAL array of dimension at least
( 1 + ( n - 1 )*abs( INCX ) ).
• Before entry, the incremented array X must contain
the n-element vector x.
▪ integer INCX:
• The increment for the elements of X is specified on
entry by INCX.
• INCX cannot equal zero.
▪ real, dimension(*) Y:
• Y is a REAL array of dimension at least
( 1 + ( n - 1 )*abs( INCY ) ).
• Before entry, the incremented array Y must contain
the n-element vector y.
▪ integer INCY:
• The increment for the elements of Y is specified on
entry by INCY.
• INCY cannot equal zero.
▪ real, dimension(*) AP:
• AP is a REAL input/output array of dimension at least
( ( n*( n + 1 ) )/2 ); on entry it holds the packed
triangle of A, and on exit it is overwritten by the
corresponding triangle of the updated matrix.
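The update SSPR2 performs can be sketched directly in NumPy. This reproduces the math only; the actual BLAS routine operates on the packed array AP rather than a full n by n matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
alpha = np.float32(0.5)
x = rng.standard_normal(n).astype(np.float32)
y = rng.standard_normal(n).astype(np.float32)
M = rng.standard_normal((n, n)).astype(np.float32)
A = (M + M.T) / 2                       # symmetric n-by-n input matrix

# SSPR2's operation: A := alpha*x*y**T + alpha*y*x**T + A
A_new = alpha * np.outer(x, y) + alpha * np.outer(y, x) + A
```

Note that the update preserves symmetry, which is why storing only one packed triangle in AP is sufficient.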
❖ ZGERC:
➢ ZGERC performs the rank-1 operation
➢ A := alpha*x*y**H + A. Here, alpha is a scalar, x
is an m-element vector, y is an n-element vector, and A is an
m x n matrix.
➢ ZGERC takes the following parameters:
▪ integer M:
• M is an integer which indicates the numbers of rows of
the matrix A, and M must be greater than or equal to
zero.
▪ integer N:
• N is an integer which indicates the number of columns
of the matrix A and N must be greater than or equal to
zero.
▪ complex*16 ALPHA:
• ALPHA is a COMPLEX*16 which indicates the scalar
alpha on input.
▪ complex*16, dimension(*) X:
• X is a COMPLEX*16 array with a dimension at least
( 1 + ( m - 1 )*abs( INCX ) ).
• Before entry, the incremented array X must contain the
m-element vector x.
▪ integer INCX:
• INCX is an integer.
• The increment for the elements of X is specified on
entry by INCX where INCX cannot equal zero.
▪ complex*16, dimension(*) Y:
• Y is a COMPLEX*16 array with a dimension at least
( 1 + ( n - 1 )*abs( INCY ) ).
• Before entry, the incremented array Y must contain the
n-element vector y.
▪ integer INCY:
• INCY is an integer.
• The increment for the elements of Y is specified on
entry by INCY; INCY must not be zero.
▪ complex*16, dimension(lda,*) A:
• A is a COMPLEX*16 array of dimension (LDA, N). →
LDA is explained in the next point.
• Before entry, the leading m by n part of the array A
must contain the matrix of coefficients.
• On exit, A is overwritten by the updated matrix.
▪ integer LDA:
• LDA is an integer.
• On entry, LDA specifies the first dimension of A as
declared in the calling (sub)program.
• LDA must be at least max( 1, m ).
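A small check of ZGERC through SciPy's BLAS wrapper (scipy.linalg.blas.zgerc). When no A is supplied, the wrapper starts from A = 0, so the result is just alpha*x*y**H:

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(0)
m, n = 3, 2
alpha = 1.5 + 0.5j
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# ZGERC: A := alpha*x*y**H + A (A defaults to the zero matrix here)
a_new = blas.zgerc(alpha, x, y)

# Check against the definition: y**H is the conjugate transpose of y
expected = alpha * np.outer(x, y.conj())
```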
❖ DGBSVX:
➢ DGBSVX uses the LU factorization to compute the solution to
a real system of linear equations
A * X = B, A**T * X = B, or A**H * X = B,
where A is a band matrix of order N with KL subdiagonals and
KU superdiagonals, and X and B are N-by-NRHS matrices.
➢ It also provides a condition-number estimate and error bounds
on the solution.
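A banded system of the kind DGBSVX handles can be solved from SciPy with scipy.linalg.solve_banded, which also relies on LAPACK's banded LU factorization (a sketch with a tridiagonal example, i.e. KL = KU = 1):

```python
import numpy as np
from scipy.linalg import solve_banded

# Tridiagonal A: order N = 4, KL = 1 subdiagonal, KU = 1 superdiagonal
A = np.array([[4., 1., 0., 0.],
              [1., 4., 1., 0.],
              [0., 1., 4., 1.],
              [0., 0., 1., 4.]])
b = np.array([1., 2., 3., 4.])

# LAPACK-style banded storage: ab[ku + i - j, j] = A[i, j]
ab = np.array([[0., 1., 1., 1.],    # superdiagonal (first entry unused)
               [4., 4., 4., 4.],    # main diagonal
               [1., 1., 1., 0.]])   # subdiagonal (last entry unused)

x = solve_banded((1, 1), ab, b)
```

Storing only the 2*KL + KU + 1 nonzero diagonals is what makes the banded solvers much cheaper than a dense LU for large sparse-banded systems.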
❖ CHEEVR:
➢ CHEEVR computes selected eigenvalues and, optionally,
eigenvectors of a complex Hermitian matrix A.
Eigenvalues and eigenvectors can be selected by specifying
either a range of values or a range of indices for the desired
eigenvalues.
➢ CHEEVR first reduces the matrix A to tridiagonal form T with
a call to CHETRD.
➢ Then, CHEEVR invokes CSTEMR to compute the
eigenspectrum using Relatively Robust Representations.
➢ CSTEMR computes eigenvalues with the dqds algorithm, while
orthogonal eigenvectors are computed from various L D L^T
representations (also known as Relatively Robust
Representations).
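From SciPy this code path can be exercised with scipy.linalg.eigh, whose driver='evr' option dispatches to the *HEEVR routines; the index-range selection below mirrors CHEEVR's "range of indices" mode:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (M + M.conj().T) / 2                # complex Hermitian test matrix

# Compute only the three smallest eigenpairs (indices 0..2)
w, v = eigh(A, driver='evr', subset_by_index=[0, 2])
```

Each returned column v[:, i] satisfies A v_i = w_i v_i, with the eigenvalues w returned in ascending order.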
• First, we write the check_equal function that we will
use later to check the correctness of the results.
▪ Iterate over the matrix sizes n with a for loop and then
compare B (np.random.rand(n)) against
C (lapack.dgesvx(A, B)[7]).
• Then, declare a list to store the matrix sizes to test.
• Here we can use a for loop to iterate over the sizes that
we have already defined.
• Here, C[7] (C at position 7) is the solution that we want from
using the subroutine DGESVX in LAPACK to solve Ax = b.
source:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lapack.dgesvx.html
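The procedure described above might be sketched like this (check_equal is the name used in the assignment; the solution sits at index 7 of dgesvx's return tuple, per the SciPy page cited above):

```python
import numpy as np
from scipy.linalg import lapack

def check_equal(x, x_ref, tol=1e-8):
    """Check a computed vector against a reference within tol."""
    return np.allclose(x, x_ref, atol=tol)

sizes = [10, 50, 100]               # matrix sizes to test
results = []
for n in sizes:
    A = np.random.rand(n, n)
    b = np.random.rand(n, 1)
    C = lapack.dgesvx(A, b)         # expert-driver solve of A x = b
    x = C[7]                        # C[7] holds the solution X
    results.append(check_equal(A @ x, b))
```

Checking the residual A @ x against b (rather than comparing against another solver) verifies the solution directly from the definition of the linear system.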
4. The reason why BLAS/LAPACK can be so effective on
modern CPUs is that they utilize block matrix operations.
(See textbook 1.6.) How can block matrix operations be used
to solve a linear system Ax = b? Derive the formula by
partitioning A into a 2 × 2 block matrix. Explain why a good
matrix-matrix multiplication subroutine can accelerate the
computation of solving Ax = b.
❖ Here, we have the linear system Ax = b.
• Block matrices, also known as partitioned matrices,
represent a large matrix by breaking it up into smaller
submatrices.
• Operations can be performed on the submatrices, and the
results can then be assembled back into general matrix form.
• In order to solve this linear system using block matrix
operations, we first partition the matrix A into a 2 x 2
block matrix.
• A1, A2, A3, A4 are the submatrices formed by partitioning
matrix A, and b1, b2 are the subvectors formed by
partitioning the vector b.
• Ax = b → x = A^(-1) b. In block form the system reads
[ A1 A2 ] [ x1 ]   [ b1 ]
[ A3 A4 ] [ x2 ] = [ b2 ]
• From the first block row, x1 = A1^(-1) ( b1 - A2 x2 ).
Substituting this into the second block row eliminates x1 and
gives the Schur-complement system
( A4 - A3 A1^(-1) A2 ) x2 = b2 - A3 A1^(-1) b1.
• So we can get x: first solve the Schur-complement system for
x2, then back-substitute to obtain x1. Both steps consist
mainly of matrix-matrix products and smaller solves, which is
why a fast matrix-matrix multiplication subroutine speeds up
the whole solve.
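The block formulas above can be checked numerically (a sketch; np.linalg.solve stands in for the smaller subproblem solves):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 3                          # total size and size of the leading block
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
b = rng.standard_normal(n)

# Partition A into a 2 x 2 block matrix and b into two subvectors
A1, A2 = A[:k, :k], A[:k, k:]
A3, A4 = A[k:, :k], A[k:, k:]
b1, b2 = b[:k], b[k:]

# Schur complement system: (A4 - A3 A1^(-1) A2) x2 = b2 - A3 A1^(-1) b1
A1inv_A2 = np.linalg.solve(A1, A2)
A1inv_b1 = np.linalg.solve(A1, b1)
S = A4 - A3 @ A1inv_A2
x2 = np.linalg.solve(S, b2 - A3 @ A1inv_b1)

# Back-substitute: x1 = A1^(-1) (b1 - A2 x2)
x1 = A1inv_b1 - A1inv_A2 @ x2
x = np.concatenate([x1, x2])
```

The dominant costs here are the products A3 @ A1inv_A2 and A3 @ A1inv_b1, so the overall solve inherits the speed of the matrix-matrix multiplication kernel.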
Source: geeksforgeeks
➢ In this simple method, there are 8 block multiplications and 4
block additions in total, so each level of recursion makes 8
recursive calls and the overall time complexity stays at O(n^3).
❖ The Strassen algorithm reduces the number of recursive calls to
7.
❖ The Strassen algorithm follows the same simple divide-and-conquer
scheme as above, but the 4 submatrices of the result are
computed from 7 products using the specific formulas shown in
the figure below.
source:geeksforgeeks
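A sketch of Strassen's recursion for square matrices whose size is a power of two (the cutoff falls back to the ordinary product for small blocks):

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen multiplication for square matrices of power-of-two size."""
    n = A.shape[0]
    if n <= cutoff:                  # base case: ordinary multiplication
        return A @ B
    h = n // 2
    A11, A12 = A[:h, :h], A[:h, h:]
    A21, A22 = A[h:, :h], A[h:, h:]
    B11, B12 = B[:h, :h], B[:h, h:]
    B21, B22 = B[h:, :h], B[h:, h:]
    # The seven Strassen products
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    # Reassemble the four result blocks
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

The cutoff parameter matters in practice: the many extra block additions make pure Strassen slow on small matrices, so real implementations switch to the conventional kernel below some threshold.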
❖ For small matrices this method is actually slower than a
conventional matrix-matrix multiplication, because the extra
block additions and the recursion add overhead. However, as the
input becomes larger and larger, Strassen's O(n^2.81) cost grows
more slowly than the conventional (naive) O(n^3) matrix-matrix
multiplication, so its relative performance improves. We can see
a graph below that shows the performance of each method as the
matrices get larger and larger.
❖ That overhead on practically sized matrices, explained above, is
the reason why people usually do not use Strassen's method in
practice for high-performance computation.