Implementing A Randomized SVD Algorithm and Its Performance Analysis
Abstract:- Dimension-reducing techniques are becoming more and more dominant in data science and model prediction because it is far more efficient and convenient to work on a small set of data than on a very large one. More often than not, the reduced lower-dimensional representation retains the same properties as the higher-dimensional space. Additionally, big data sets strain the computational environment in terms of both memory and processing power, so the need for dimensionality reduction is key.

This paper discusses a dimension-reducing technique for low-rank matrices called Randomized Singular Value Decomposition (RSVD). It is a probabilistic method combined with the theory of structured matrices to aid matrix factorization of any order. Probabilistic methods were initially considered chaotic in nature, but their numerical results turn out to be quite reliable and stable. We first present the mathematics involved in randomized singular value decomposition using concepts of Linear Algebra, and then formulate an algorithm based on that mathematics. A random (5000, 1000) matrix is chosen to be factorized to a low rank of 10, which is our target rank.

…patterns, it can be really expensive and we might have to wait for a significant while.

In the modern world, ever-larger data sets come with higher and higher dimensional measurements; in other words, the number of measured dimensions is increasing rapidly. But the key concept behind the Singular Value Decomposition is that even in very large data sets there are only a few properties or features that we actually care about, say for our understanding and for building models. So despite the growth in measured dimensions, the data can still have a low intrinsic rank.

Randomized Singular Value Decomposition (RSVD) is an emerging technique in Randomized Linear Algebra that essentially samples the column space of a matrix, say 𝑋, at random. The basic idea is that by randomly sampling the column space of the parent matrix 𝑋, with high probability we find a subspace spanned by the dominant columns of the 𝑈 matrix from the Singular Value Decomposition.

Note: The key motivation, or assumption, when working with Randomized Singular Value Decomposition is that there exists a low rank that we want to uncover. We usually call that rank the target rank.
The deterministic singular value decomposition in Python is a single line of code, and the randomized singular value decomposition is only a few lines. The code above is very robust: it works for matrices of arbitrary order and can even be applied to shapes and pictures with simple matrix-reconstruction code. In fact, randomized algorithms are finding heavy use in graphic design and image compression.
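As a rough illustration of what these routines can look like (a minimal sketch, not the paper's original listing, which is not reproduced here; the function name rsvd, the oversampling parameter p, and the call at the end are our own choices), a NumPy version might read:

import numpy as np

def rsvd(X, r, q=1, p=10):
    # Randomized SVD sketch: target rank r, q power iterations, oversampling p.
    # Step 1: sample the column space of X with a random Gaussian test matrix.
    P = np.random.randn(X.shape[1], r + p)
    Z = X @ P
    for _ in range(q):                       # power iterations (the (AA*)^q factor)
        Z = X @ (X.T @ Z)
    Q, _ = np.linalg.qr(Z, mode='reduced')   # orthonormal basis for the sampled subspace
    # Step 2: project X onto that subspace and take the small deterministic SVD.
    Y = Q.T @ X
    UY, S, VT = np.linalg.svd(Y, full_matrices=False)
    U = Q @ UY                               # lift the left singular vectors back
    return U[:, :r], S[:r], VT[:r, :]

# Deterministic SVD: a single line with NumPy.
X = np.random.randn(5000, 1000)
U, S, VT = np.linalg.svd(X, full_matrices=False)

# Randomized SVD with target rank 10, as in the experiment described above.
rU, rS, rVT = rsvd(X, r=10, q=1)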
The next piece of code illustrates the working and significance of power iterations as 𝑞 is run over the range 1 to 5. The graphs and plots below compare the deterministic singular value decomposition with the randomized ones using power iterations.
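As a sketch of such an experiment (the paper's own plots were produced with ggplot2 in RStudio; here we assume the rsvd helper from the earlier sketch and use matplotlib purely for illustration):

import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(5000, 1000)
U, S, VT = np.linalg.svd(X, full_matrices=False)      # deterministic reference
plt.semilogy(S[:10], 'k-o', label='deterministic SVD')

for q in range(1, 6):                                  # q = 1, 2, 3, 4, 5
    _, rS, _ = rsvd(X, r=10, q=q)                      # rsvd from the earlier sketch
    plt.semilogy(rS, '--', label='randomized, q = %d' % q)

plt.xlabel('singular value index')
plt.ylabel('singular value (log scale)')
plt.legend()
plt.show()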
The results show that as we increase the value of 𝑞 = 1, 2, 3, 4, 5, the singular values decay faster and faster and the slope of the curves drops accordingly. The randomized singular value decomposition relies on the fact that the singular values decay very rapidly, so that we can approximate the dominant ones without having them contaminated by the lower-energy ones.

Often the singular values do not decay that fast on their own, but applying the factor (𝐴𝐴∗)^𝑞 in substep 2 of step 1 of our algorithm makes them decay much faster. This factor 𝑞 is what, in theory, we call a power iteration.
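A short calculation shows why this helps. Assuming the exact decomposition 𝐴 = 𝑈Σ𝑉∗, we have

(𝐴𝐴∗)^𝑞 𝐴 = 𝑈 Σ^(2𝑞+1) 𝑉∗,

so the sampled matrix has singular values σᵢ^(2𝑞+1); every ratio σᵢ/σ₁ below one is pushed towards zero, which is exactly the sharper decay seen in the plots.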
Note: To plot this figure one must install the package ggplot2, which is readily available and easy to use in RStudio.

The experiment was carried out on a machine with an Intel(R) Core(TM) i5-7300HQ CPU @ 2.50 GHz and 8.00 GB of RAM.
Computational Accuracy
In the following we evaluate the performance of our randomized SVD routine against the classical SVD routine in terms of the error percentage. The relative theoretical reconstruction error is computed as

∥𝐴 − 𝐴𝐾∥𝐹 / ∥𝐴∥𝐹,

where 𝐴𝐾 is the rank-𝐾 reconstruction and ∥·∥𝐹 denotes the Frobenius norm.
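A minimal Python sketch of this evaluation (again assuming the rsvd helper defined earlier; the helper name relative_error and the choice 𝑞 = 2 are ours):

import numpy as np

def relative_error(A, U, S, VT):
    # Relative Frobenius-norm reconstruction error: ||A - A_K||_F / ||A||_F.
    A_K = U @ np.diag(S) @ VT
    return np.linalg.norm(A - A_K, 'fro') / np.linalg.norm(A, 'fro')

A = np.random.randn(5000, 1000)

# Classical SVD truncated to the target rank K = 10.
U, S, VT = np.linalg.svd(A, full_matrices=False)
err_classical = relative_error(A, U[:, :10], S[:10], VT[:10, :])

# Randomized SVD at the same target rank.
rU, rS, rVT = rsvd(A, r=10, q=2)
err_randomized = relative_error(A, rU, rS, rVT)

print(err_classical, err_randomized)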
V. CONCLUSIONS
REFERENCES