This document provides an overview of effective numerical computation in NumPy and SciPy. It discusses how Python can be used for numerical computation tasks like differential equations, simulations, and machine learning. While plain Python is much slower than languages like C, libraries like NumPy and SciPy allow Python code to reach sufficient speed through techniques such as broadcasting, indexing, and sparse matrix representations. The document provides examples of how to efficiently perform tasks like applying functions element-wise to sparse matrices and calculating norms. It also presents a case study on efficiently computing a formula that appears in a machine learning paper using different sparse matrix representations in SciPy.
Effective Numerical Computation in NumPy and SciPy
2. About Myself
Kimikazu Kato
Chief Scientist at Silver Egg Technology Co., Ltd.
Ph.D. in Computer Science
Background in Mathematics, Numerical Computation, Algorithms, etc.
<2 years' experience in Python
>10 years' experience in numerical computation
Now designing algorithms for a recommendation system, and doing research
about machine learning and data analysis.
3. This talk...
is about effective usage of NumPy/SciPy
is NOT an exhaustive introduction to their capabilities, but shows some case
studies based on my experience and interests
4. Table of Contents
Introduction
Basics about NumPy
Broadcasting
Indexing
Sparse matrix
Usage of scipy.sparse
Internal structure
Case studies
Conclusion
5. Numerical Computation
Differential equations
Simulations
Signal processing
Machine Learning
etc...
Why Numerical Computation in Python?
Productivity
Easy to write
Easy to debug
Connectivity with visualization tools
Matplotlib
IPython
Connectivity with web systems
Many frameworks (Django, Pyramid, Flask, Bottle, etc.)
6. But Python is Very Slow!
Code in C
#include <stdio.h>
int main() {
    int i; double s=0;
    for (i=1; i<=100000000; i++) s+=i;
    printf("%.0f\n", s);
}
Code in Python
s=0.
for i in xrange(1,100000001):
s+=i
print s
Both programs compute the sum of the integers from 1 to 100,000,000.
Benchmark result in a certain environment:
Above: 0.109 sec (compiled with -O3 option)
Below: 8.657 sec
(80+ times slower!!)
7. Better code
import numpy as np
a=np.arange(1,100000001)
print a.sum()
Now it takes 0.188 sec. (Measured with the "time" command on Linux, loading time
included.)
Still slower than C, but sufficiently fast for a scripting language.
8. Lessons
Python is very slow when written badly
Translating C (or Java, C#, etc.) code directly into Python is often a bad idea.
Python-friendly rewriting sometimes results in a drastic performance
improvement.
9. Basic rules for better performance
Avoid for-statements as far as possible
Utilize libraries' capabilities instead
Forget about the cost of copying memory
A typical C programmer might care about it, but ...
(A short sketch of the first two rules follows below.)
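As a minimal sketch of these rules (my own example, not from the slides; Python 2 syntax, as used elsewhere in this talk): the loop version touches every element at the Python level, while the NumPy version pushes the whole loop into compiled library code.

import numpy as np

def loop_sum(n):
    # plain Python loop: every iteration is interpreted
    s = 0.
    for i in xrange(1, n + 1):
        s += i
    return s

def numpy_sum(n):
    # the same computation as a single vectorized library call
    return np.arange(1, n + 1).sum()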
11. Broadcasting
>>> import numpy as np
>>> a=np.array([0,1,2])
>>> a*3
array([0, 3, 6])
>>> b=np.array([1,4,9])
>>> np.sqrt(b)
array([ 1., 2., 3.])
A function which, when applied to an array, is applied to each element is called
a universal function.
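Broadcasting also works between arrays of different shapes; a small extra example (mine, not from the slide): a size-1 axis is stretched to match the other operand.

>>> a=np.array([0,1,2])          # shape (3,)
>>> b=np.array([[0],[1],[2]])    # shape (3,1)
>>> a+b                          # broadcast to shape (3,3)
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])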
14. References
Gabriele Lanaro, "Python High Performance Programming," Packt
Publishing, 2013.
Stéfan van der Walt, Numpy Medkit
15. Sparse matrix
Defined as a matrix in which most elements are zero
A compressed data structure is used to represent it, so that it is...
Space efficient
Time efficient
(A rough memory comparison follows below.)
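As a rough illustration (my own sketch, not from the slides): the memory needed for a 10000x10000 identity matrix, dense vs. CSR.

from scipy.sparse import identity

n=10000
sp=identity(n, format="csr")   # stores only the n non-zero entries
dense_bytes=n*n*8              # what a dense float64 matrix of the same shape would need (~800 MB)
sparse_bytes=sp.data.nbytes+sp.indices.nbytes+sp.indptr.nbytes
print "dense:", dense_bytes, "bytes / sparse:", sparse_bytes, "bytes"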
16. scipy.sparse
The module scipy.sparse has mainly three types for representing a sparse
matrix. (There are other types, but they are not mentioned here.)
lil_matrix : convenient to set data; setting a[i,j] is fast
csr_matrix : convenient for computation, fast to retrieve a row
csc_matrix : convenient for computation, fast to retrieve a column
Usually, set the data into a lil_matrix, and then convert it to a csc_matrix or
csr_matrix.
For csr_matrix and csc_matrix, calculation between matrices of the same type is fast,
but you should avoid calculation between different types (see the sketch below).
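A minimal sketch of this workflow (my own example): set the entries through lil_matrix, convert once, and keep both operands of a product in the same format.

from scipy.sparse import lil_matrix

a=lil_matrix((4,4))
a[0,1]=1.             # element-wise assignment is cheap for lil_matrix
a[2,3]=2.
b=a.tocsr()           # convert once before doing the arithmetic
c=b.dot(b.T.tocsr())  # b.T is csc; convert it back so both operands are csr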
20. Merit of knowing the internal structure
Setting csr_matrix or csc_matrix with its internal structure is much faster than
setting lil_matrix with indices.
See the benchmark of setting the $n \times n$ bidiagonal matrix
$\begin{pmatrix} 2 & 1 & & \\ & 2 & 1 & \\ & & \ddots & \ddots \\ & & & 2 \end{pmatrix}$
(2 on the diagonal, 1 on the superdiagonal).
21.
from scipy.sparse import lil_matrix, csr_matrix
import numpy as np
from timeit import timeit

def set_lil(n):
    a=lil_matrix((n,n))
    for i in xrange(n):
        a[i,i]=2.
        if i+1<n:
            a[i,i+1]=1.
    return a

def set_csr(n):
    data=np.empty(2*n-1)
    indices=np.empty(2*n-1,dtype=np.int32)
    indptr=np.empty(n+1,dtype=np.int32)
    # to be fair, a for-statement is intentionally used
    # (using the indexing technique is faster)
    for i in xrange(n):
        indices[2*i]=i
        data[2*i]=2.
        if i<n-1:
            indices[2*i+1]=i+1
            data[2*i+1]=1.
        indptr[i]=2*i
    indptr[n]=2*n-1
    a=csr_matrix((data,indices,indptr),shape=(n,n))
    return a

print "lil:",timeit("set_lil(10000)",
    number=10,setup="from __main__ import set_lil")
print "csr:",timeit("set_csr(10000)",
    number=10,setup="from __main__ import set_csr")
22. Result:
lil: 11.6730761528
csr: 0.0562081336975
Remark
When you deal with already sorted data, setting a csr_matrix or csc_matrix
directly with data, indices, and indptr is much faster than setting a lil_matrix.
But the code tends to be more complicated if you use the internal structure
of csr_matrix or csc_matrix. (A loop-free construction of the same matrix is sketched below.)
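The "indexing technique" mentioned in the comment on the benchmark slide would look roughly like this (my own sketch; set_csr_vectorized is a made-up name). It builds the same bidiagonal matrix with no Python-level loop.

import numpy as np
from scipy.sparse import csr_matrix

def set_csr_vectorized(n):
    data=np.empty(2*n-1)
    indices=np.empty(2*n-1,dtype=np.int32)
    data[0::2]=2.                    # diagonal entries
    data[1::2]=1.                    # superdiagonal entries
    indices[0::2]=np.arange(n)       # column indices of the diagonal
    indices[1::2]=np.arange(1,n)     # column indices of the superdiagonal
    indptr=np.empty(n+1,dtype=np.int32)
    indptr[:n]=2*np.arange(n)        # row i starts at position 2*i
    indptr[n]=2*n-1                  # one past the end of the last row
    return csr_matrix((data,indices,indptr),shape=(n,n))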
24. Case 1: Norms
The squared norm $\|v\|_2^2 = \sum_i v_i^2 = v v^T$ can be expressed as a product of matrices.
If $v$ is dense:
norm=np.dot(v,v)
(dot means the matrix product, but you don't have to take the transpose explicitly.)
When $v$ is sparse, suppose that $v$ is expressed as a $1 \times n$ matrix:
norm=v.multiply(v).sum()
(multiply() is the element-wise product)
This is because taking the transpose of a sparse matrix changes the type.
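A minimal check of the two recipes (my own example, not from the slide):

import numpy as np
from scipy.sparse import csr_matrix

v=np.array([3.,0.,4.])
print np.dot(v,v)              # dense: 25.0
vs=csr_matrix(v)               # v as a 1 x 3 sparse matrix
print vs.multiply(vs).sum()    # sparse: 25.0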
26. Case 2: Applying a function to all of the elements of a
sparse matrix
A universal function can be applied to a dense matrix:
>>> import numpy as np
>>> a=np.arange(9).reshape((3,3))
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.tanh(a)
array([[ 0.        ,  0.76159416,  0.96402758],
       [ 0.99505475,  0.9993293 ,  0.9999092 ],
       [ 0.99998771,  0.99999834,  0.99999977]])
This is convenient and fast.
However, we cannot do the same thing for a sparse matrix.
27.
>>> from scipy.sparse import lil_matrix
>>> a=lil_matrix((3,3))
>>> a[0,0]=1.
>>> a[1,0]=2.
>>> b=a.tocsr()
>>> np.tanh(b)
<3x3 sparse matrix of type '<type 'numpy.float64'>'
        with 2 stored elements in Compressed Sparse Row format>
This is because, for an arbitrary function, its application to a sparse matrix is
not necessarily sparse.
However, if a universal function $f$ satisfies $f(0) = 0$, the density is
preserved.
Then, how can we compute it?
28. Use the internal structure!!
The positions of the non-zero elements are not changed after application of
the function.
Keep indices and indptr, and just change data.
Solution:
c = csr_matrix((np.tanh(b.data), b.indices, b.indptr), shape=b.shape)
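The same idea wrapped as a small helper (my own sketch; apply_to_data is a made-up name), usable for any universal function f with f(0) = 0:

import numpy as np
from scipy.sparse import csr_matrix

def apply_to_data(f, m):
    # m: csr_matrix; f: universal function with f(0) == 0
    return csr_matrix((f(m.data), m.indices, m.indptr), shape=m.shape)

c = apply_to_data(np.tanh, b)   # b is the csr_matrix from the previous slide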
29. Case 3: Formula which appears in a paper
In the algorithm for a recommendation system [1], the following formula
appears:
$A^T D A$
where $A$ is an $n \times f$ dense matrix, and $D$ is a diagonal matrix defined from a
given array $d$ as $D = \mathrm{diag}(d_1, \dots, d_n)$, i.e., $D_{ii} = d_i$.
Here, $n$ (which corresponds to the number of users or items) is big and $f$
(which means the number of latent factors) is small.
[1] Hu et al., "Collaborative Filtering for Implicit Feedback Datasets," ICDM, 2008.
30. Solution 1:
There is a special class dia_matrix to deal with a diagonal sparse matrix.
import scipy.sparse as sparse
import numpy as np
def f(a,d):
    """a: 2d array of shape (n,f), d: 1d array of length n"""
    dd=sparse.diags([d],[0])
    return np.dot(a.T,dd.dot(a))
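Usage sketch (my own example; the sizes follow the slide: n is big, f is small):

n, k = 100000, 20            # k plays the role of f (number of latent factors)
a = np.random.rand(n, k)
d = np.random.rand(n)
print f(a, d).shape          # (20, 20)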
32. Solution 3:
$A^T D A = (A^T D) A$, and the $(j,i)$ element of $A^T D$ is $a_{ij} d_i$:
multiplying by $D$ from the right just scales the $i$-th column of $A^T$ by $d_i$.
This is equivalent to broadcasting!
def h(a,d):
return np.dot(a.T*d,a)
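A quick sanity check (my own sketch) that the broadcasting version h agrees with the dia_matrix version f of Solution 1:

import numpy as np

n, k = 1000, 10
a = np.random.rand(n, k)
d = np.random.rand(n)
print np.allclose(f(a, d), h(a, d))   # True: the two formulas are equivalent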
34. Conclusion
Try not to use for-statements; use libraries' capabilities instead.
Knowledge about the internal structure of the sparse matrix formats is useful for
extracting further performance.
Mathematical derivation is important. The key is to find a mathematically
equivalent and Python-friendly formula.
Computational speed is not the only thing that matters. Finding better code in
a short time is valuable; beyond that, you shouldn't pursue speed too far.
35. Acknowledgment
I would like to thank (@shima__shima), who gave me useful advice on Twitter.