Numpy - All - Lectures - Jupyter Notebook
Introduction
The numpy package (module) is used in almost all numerical computation using Python. It is a package that provides high-performance
vector, matrix and higher-dimensional data structures for Python. It is implemented in C and Fortran, so when calculations are vectorized
(formulated with vectors and matrices), performance is very good.
To use numpy you need to import the module, using for example:
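The import cell itself is missing from this copy of the notes. A minimal sketch of two common forms follows. Most cells below call functions such as array and arange without a prefix, which suggests a star import (an assumption, not confirmed by the notes), while later cells use the np alias.

```python
# Import under the conventional alias (used by later cells in these notes).
import numpy as np

# Import selected names directly, so they can be called without a prefix.
# The unprefixed cells below would also work after "from numpy import *",
# which is assumed here rather than confirmed by the notes.
from numpy import array, arange

v = array([1, 2, 3])       # direct name
w = np.array([1, 2, 3])    # same function through the alias
print(array is np.array)   # True: both names refer to one function
```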
In the numpy package the terminology used for vectors, matrices and higher-dimensional data sets is array.
There are a number of ways to initialize new numpy arrays, for example from Python lists, with array-generating functions such as arange, or from data stored in files.
From lists
For example, to create new vector and matrix arrays from Python lists we can use the numpy.array function.
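The cells that define v and M are missing from this copy. A minimal sketch of what such definitions might look like is given here; the concrete values are illustrative assumptions.

```python
import numpy as np

# a vector: the argument to the array function is a Python list
v = np.array([1, 2, 3, 4])

# a matrix: the argument to the array function is a nested Python list
M = np.array([[1, 2], [3, 4]])

print(v.shape)   # (4,)
print(M.shape)   # (2, 2)
```

Note that array([[1,2,3,4]]) with double brackets, as used in one of the cells below, creates a 1 x 4 matrix rather than a vector of length 4.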
localhost:8888/notebooks/Numpy_all_lectures.ipynb# 1/39
2/13/2021 Numpy_all_lectures - Jupyter Notebook
In [ ]: v.shape
In [ ]: v.T
In [ ]: v=array([[1,2,3,4]])
In [ ]: v.shape
In [ ]: vt=v.T
In [ ]: vt.shape
The v and M objects are both of the type ndarray that the numpy module provides.
In [ ]: type(v), type(M)
The difference between the v and M arrays is only their shapes. We can get information about the shape of an array by using the
ndarray.shape property.
In [ ]: v.shape
In [ ]: M.shape
The number of elements in the array is available through the ndarray.size property:
In [ ]: M.size
In [ ]: shape(M)
In [ ]: size(M)
So far the numpy.ndarray looks awfully much like a Python list (or nested list). Why not simply use Python lists for computations
instead of creating a new array type?
Python lists are very general. They can contain any kind of object. They are dynamically typed. They do not support mathematical
functions such as matrix and dot multiplications, etc. Implementing such functions for Python lists would not be very efficient because
of the dynamic typing.
Numpy arrays are statically typed and homogeneous. The type of the elements is determined when the array is created.
Numpy arrays are memory efficient.
Because of the static typing, mathematical functions such as multiplication and addition of numpy arrays can be implemented
efficiently in a compiled language (C and Fortran are used).
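The difference between the two types is easy to see with the arithmetic operators. The short sketch below uses illustrative names (lst, arr) and compares how + and * behave on a list and on an array.

```python
import numpy as np

lst = [1, 2, 3]
arr = np.array([1, 2, 3])

print(lst + lst)   # list concatenation: [1, 2, 3, 1, 2, 3]
print(arr + arr)   # element-wise addition: [2 4 6]
print(lst * 2)     # list repetition: [1, 2, 3, 1, 2, 3]
print(arr * 2)     # element-wise multiplication: [2 4 6]

# a list may mix element types; an array stores one element type
mixed = [1, "two", 3.0]
print(arr.dtype)   # an integer dtype; the exact width depends on the platform
```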
Using the dtype (data type) property of an ndarray, we can see what type the data of an array has:
In [ ]: a=array([1, 2],'i');a
In [ ]: a[1]=3;a
In [ ]: a.dtype
In [ ]: M.dtype
We get an error if we try to assign a value of the wrong type to an element in a numpy array:
In [ ]: M[0,0] = "hello"
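A small sketch of this behaviour, assuming M holds integers as in the cells above: a string is rejected with a ValueError, while a float is silently converted to the array's integer type.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])    # integer dtype inferred from the data

failed = False
try:
    M[0, 0] = "hello"             # the string cannot be converted to an integer
except ValueError as err:
    failed = True
    print("assignment rejected:", err)

M[0, 0] = 7.9                     # a float is accepted, but truncated to 7
print(M[0, 0])
```

The silent truncation is worth keeping in mind: no error is raised, yet information is lost.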
If we want, we can explicitly define the type of the array data when we create it, using the dtype keyword argument:
Common data types that can be used with dtype are: int, float, complex, bool, object, etc.
We can also explicitly define the bit size of the data types, for example: int64, int16, float128 (not available on every platform), complex128.
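The cell that demonstrated the dtype keyword is missing from this copy. A minimal sketch, with illustrative values, might look as follows.

```python
import numpy as np

# request complex elements even though the data are integers
M = np.array([[1, 2], [3, 4]], dtype=complex)
print(M.dtype)                        # complex128

# request a specific bit size
small = np.array([1, 2, 3], dtype=np.int16)
print(small.dtype, small.itemsize)    # int16, 2 bytes per element
```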
For larger arrays it is impractical to initialize the data manually, using explicit Python lists. Instead we can use one of the many functions in
numpy that generate arrays of different forms. Some of the more common are:
arange
In [ ]: # create a range
In [ ]: x = arange(-1, 1, 0.1)
In [ ]: logspace(1, 3, 4,base=2)
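The three range generators differ mainly in how the end point and the spacing are specified. The sketch below puts them side by side; the argument values are illustrative.

```python
import numpy as np

x = np.arange(0, 10, 2)            # start, stop (excluded), step
print(x)                           # [0 2 4 6 8]

y = np.linspace(0, 1, 5)           # start, stop (included), number of points
print(y)                           # 0, 0.25, 0.5, 0.75, 1

z = np.logspace(1, 3, 3, base=2)   # base**1, base**2, base**3
print(z)                           # [2. 4. 8.]
```

With a non-integer step, arange can include or miss the last point because of rounding, so linspace is usually the safer choice when the number of points matters.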
mgrid
In [ ]: x
In [ ]: y
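The cell that creates x and y with mgrid is missing from this copy. A minimal sketch, with an illustrative grid size:

```python
import numpy as np

# similar to meshgrid in MATLAB: x varies along the rows, y along the columns
x, y = np.mgrid[0:3, 0:4]

print(x)
print(y)
print(x.shape, y.shape)   # both (3, 4)
```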
random data
diag
In [ ]: # a diagonal matrix
diag([1,2,3])
In [ ]: diag([1,2,3], k=-1)
In [ ]: zeros((3,3))
In [ ]: ones((3,3))
In [ ]: M = random.rand(3,3)
The native .npy file format is useful when storing and reading back numpy array data. Use the functions numpy.save and numpy.load:
In [ ]: M=array([[1,2,3],[4,5,6], [7,80,91]])
In [ ]: save("random-matrix.npy", M)
In [ ]: load("random-matrix.npy")
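A round trip can be checked as in the sketch below. It writes to a temporary directory so that no file is left behind; the file name is an arbitrary choice.

```python
import os
import tempfile
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6], [7, 80, 91]])

with tempfile.TemporaryDirectory() as tmpdir:   # scratch directory, removed afterwards
    path = os.path.join(tmpdir, "matrix.npy")
    np.save(path, M)                            # write the array in .npy format
    loaded = np.load(path)                      # read it back

print(np.array_equal(M, loaded))   # True: same shape and values
print(loaded.dtype == M.dtype)     # True: the dtype is stored in the file
```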
In [ ]: M.size
In [ ]: M=array([[1,2],[3,4]])
In [ ]: M=array([[[1,2],[3,4]],[[4,5],[7,8]]])
print(M)
M.ndim
Manipulating arrays
Indexing
In [ ]: v=array([[1,2,3,4]]);v
In [ ]:
In [ ]: M=array([[1,2,3],[4,5,6], [7,80,91]])
M
If we omit an index of a multidimensional array it returns the whole row (or, in general, an (N-1)-dimensional array):
In [ ]: M
In [ ]: M[1]
In [ ]: M[1,:] # row 1
In [ ]: M[:,1] # column 1
In [ ]: M[0,0] = 47
In [ ]: M
In [ ]: M
Index slicing
Index slicing is the technical name for the syntax M[lower:upper:step] to extract part of an array:
In [ ]: A = array([1,2,3,4,5])
A
In [ ]: A[1:3]
Array slices are mutable: if they are assigned a new value the original array from which the slice was extracted is modified:
In [ ]: A[1:3] = [-2,-3]
In [ ]: A[::2] # step is 2; lower and upper default to the beginning and end of the array
Negative indices count from the end of the array (positive indices from the beginning):
In [ ]: A = array([1,2,3,4,5])
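The cells that used negative indices are missing from this copy. A few illustrative cases:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

print(A[-1])     # last element: 5
print(A[-3:])    # last three elements: [3 4 5]
print(A[:-1])    # everything except the last element: [1 2 3 4]
print(A[::-1])   # negative step reverses the array: [5 4 3 2 1]
```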
Index slicing works exactly the same way for multidimensional arrays:
In [ ]: A=array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
A
In [ ]: # strides
A[::2, ::2]
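A sketch of block slicing on the same 5 x 5 array, which also shows that a slice is a view on the original data. The name block is illustrative.

```python
import numpy as np

# the same 5 x 5 array as above, built with a list comprehension
A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

block = A[1:4, 1:4]      # rows 1..3 and columns 1..3
print(block)             # [[11 12 13] [21 22 23] [31 32 33]]

print(A[::2, ::2])       # every second row and column

block[0, 0] = -1         # writing to the slice ...
print(A[1, 1])           # ... changes the original array: -1
```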
Fancy indexing
Fancy indexing is the name for when an array or list is used in place of an index:
In [ ]: A
In [ ]: row_indices = [1, 2, 3]
A[row_indices]
In [ ]: #Alternatively,
A[[1,3]]
We can also use index masks: if the index mask is a Numpy array of data type bool, then an element is selected (True) or not (False)
depending on the value of the index mask at the position of each element:
In [ ]: # same thing
row_mask = array([1,0,1,0,0], dtype=bool)
B[row_mask]
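The following sketch collects the indexing variants in one place. B is not defined in the cells that survive here, so a five-element array is assumed for it.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
B = np.arange(5)                      # assumed: B is not defined in the cells above

# a list of row indices selects whole rows
print(A[[1, 3]])                      # rows 1 and 3

# index lists for both axes pick individual elements
picked = A[[1, 2, 3], [1, 2, -1]]
print(picked)                         # [11 22 34]

# a boolean mask keeps the positions where the mask is True
row_mask = np.array([True, False, True, False, False])
print(B[row_mask])                    # [0 2]
```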
This feature is very useful to conditionally select elements from an array, using for example comparison operators:
In [ ]: x>5
mask
In [ ]: x[mask]
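The cells that define x and mask are missing from this copy. A minimal sketch, with an illustrative range and condition:

```python
import numpy as np

x = np.arange(0, 10, 0.5)

# combine conditions with & (element-wise "and"); the parentheses are required
mask = (5 < x) & (x < 7.5)

print(mask)
print(x[mask])   # the values 5.5, 6.0, 6.5 and 7.0
```

The Python keyword and does not work here, because it tries to reduce each array to a single truth value.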
In [ ]: arr=arange(10);arr
In [ ]:
arr[arr%2==1]=-1
arr
where
The index mask can be converted to position indices using the where function:
In [ ]: indices = where(mask)
indices
In [ ]: arr=arange(10)
print(arr)
out=where(arr%2==1,-1,arr)
print(out)
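The where function has two forms, and the sketch below shows both. The mask is an illustrative assumption, since its defining cell is missing from this copy.

```python
import numpy as np

x = np.arange(0, 10, 0.5)
mask = (5 < x) & (x < 7.5)

# one argument: a tuple with one index array per dimension
indices = np.where(mask)
print(indices[0])            # [11 12 13 14]
print(x[indices])            # same result as x[mask]

# three arguments: pick from the second or third argument, element by element
arr = np.arange(10)
out = np.where(arr % 2 == 1, -1, arr)
print(out)                   # odd entries replaced by -1
print(arr)                   # arr itself is unchanged
```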
diag
With the diag function we can also extract the diagonal and subdiagonals of an array:
In [ ]: print(A)
In [ ]: diag(A)
In [ ]: diag(A, -1)
In [ ]: diag([1,2,3])
Linear algebra
Vectorizing code is the key to writing efficient numerical calculations with Python/Numpy. That means that as much of a program as possible
should be formulated in terms of matrix and vector operations, like matrix-matrix multiplication.
Scalar-array operations
We can use the usual arithmetic operators to multiply, add, subtract, and divide arrays with scalar numbers.
In [ ]: v1 = arange(0, 5);v1
In [ ]: v1 * 2
In [ ]: v1 + 2
In [ ]: A
In [ ]: A * 2, A + 2
When we add, subtract, multiply and divide arrays with each other, the default behaviour is element-wise operations:
In [ ]: A=array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
In [ ]: A
In [ ]: A+2
In [ ]: a=array([[1,2],[3,4]]);print(a); b=array([[1,2],[3,1]]);print(b)
In [ ]: c=a@b; print(c)
In [ ]: v1=array([[1,2,3,4,5]])
In [ ]: v1 * v1
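The difference between the element-wise product and the matrix product is easy to overlook. A short sketch with the two small matrices from the cell above:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 2], [3, 1]])

print(a * b)   # element-wise: [[1 4] [9 4]]
print(a @ b)   # matrix product: [[ 7  4] [15 10]]
```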
Broadcasting
In [ ]: a=array([1,2,3,4])
In [ ]: a+2
In [ ]: a2=array([[1,2,3,4], [6,7,8,9]])
In [ ]: a2+5
In [ ]: a2
In [ ]: a4=a2.reshape(4,2); a4
In [ ]: a3=arange(15).reshape(3,5);a3
In [ ]: a=array([1,2,3,4,5])
In [ ]: a3+a
In [ ]: v=array([[0,1,2]])
vt=v.T
In [ ]: v+vt
In [ ]: a=arange(4).reshape(2,2);a
In [ ]: b=arange(15).reshape(3,5);b
In [ ]: a+b
In [ ]: import numpy as np
x = np.array([[20,20,20],[30,30,30],[40,40,40]])
print("Original array:")
print(x)
v = np.array([[20,30,40]])
print("Vector:")
print(v)
print(v.shape)
print(x / v.T)
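The cells above rely on the broadcasting rule without stating it. As described in the NumPy documentation, shapes are compared starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. A sketch with illustrative names (row, col):

```python
import numpy as np

a3 = np.arange(15).reshape(3, 5)       # shape (3, 5)
row = np.array([1, 2, 3, 4, 5])        # shape (5,): reused for each of the 3 rows
col = np.array([[10], [20], [30]])     # shape (3, 1): reused for each of the 5 columns

print((a3 + row).shape)                # (3, 5)
print((a3 + col).shape)                # (3, 5)

v = np.array([[0, 1, 2]])              # shape (1, 3)
print(v + v.T)                         # (1, 3) + (3, 1) gives shape (3, 3)

incompatible = False
try:
    np.arange(4).reshape(2, 2) + a3    # (2, 2) and (3, 5) do not match
except ValueError as err:
    incompatible = True
    print("cannot broadcast:", err)
```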
Matrix algebra
In [ ]: A
In [ ]: dot(A, A)
In [ ]: v1
In [ ]: A.shape, v1.shape
In [ ]: dot(A, v1)
In [ ]: dot(v1.T, v1)
Alternatively, we can cast the array objects to the type matrix. This changes the behavior of the standard arithmetic operators +, -,
* to use matrix algebra.
In [ ]: v1=array([1,2,3,4,5])
In [ ]: A
In [ ]: M = matrix(A)
v = matrix(v1).T # make it a column vector
In [ ]: v
In [ ]: M * M
In [ ]: M * v
In [ ]: # inner product
v.T * v
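The NumPy documentation no longer recommends the matrix class for new code. The same products can be written with plain arrays and the @ operator, as in this sketch:

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.array([1, 2, 3, 4, 5])

print(A @ A)      # matrix-matrix product
print(A @ v1)     # matrix-vector product, result has shape (5,)
print(v1 @ v1)    # inner product: 55
```

With one-dimensional arrays there is no need to transpose v1 into a column vector first.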
If we try to add, subtract or multiply objects with incompatible shapes we get an error:
In [ ]: v = matrix([1,2,3,4,5,6]).T
In [ ]: shape(M), shape(v)
In [ ]: M * v
See also the related functions: inner, outer, cross, kron, tensordot. Try for example help(kron).
Array/Matrix transformations
Above we have used .T to transpose the matrix object v. We could also have used the transpose function to accomplish the
same thing.
In [ ]: conjugate(C)
In [ ]: C.H
We can extract the real and imaginary parts of complex-valued arrays using real and imag:
In [ ]: abs(C)
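C is not defined in the cells that survive here. The sketch below uses a hypothetical complex array as a stand-in and works with plain arrays, where the Hermitian conjugate is written as conj().T instead of the matrix attribute .H.

```python
import numpy as np

C = np.array([[1j, 2j], [3j, 4j]])   # hypothetical stand-in for the missing C

print(np.conjugate(C))    # complex conjugate of every element
print(C.conj().T)         # Hermitian conjugate: conjugate plus transpose
print(np.real(C))         # real parts (all zero here)
print(np.imag(C))         # imaginary parts
print(np.abs(C))          # magnitudes
```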
Matrix computations
Inverse
In [ ]: C.I * C
Determinant
In [ ]: linalg.det(C)
In [ ]: linalg.det(C.I)
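With plain arrays the same computations go through numpy.linalg. A minimal sketch with an illustrative 2 x 2 matrix; it also checks that the determinant of the inverse is the reciprocal of the determinant.

```python
import numpy as np

C = np.array([[1.0, 2.0], [3.0, 4.0]])   # illustrative matrix

C_inv = np.linalg.inv(C)
print(np.allclose(C_inv @ C, np.eye(2)))   # True up to rounding

det_C = np.linalg.det(C)
det_inv = np.linalg.det(C_inv)
print(det_C, det_inv)                      # about -2 and -0.5
```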
Data processing
Often it is useful to store datasets in Numpy arrays. Numpy provides a number of functions to calculate statistics of datasets in arrays.
For example, let's calculate some properties from the Stockholm temperature dataset used above.
In [ ]: data=arange(50).reshape(10,5);data
In [ ]: shape(data)
mean
The daily mean temperature in Stockholm over the last 200 years has been about 6.2 C.
In [ ]: std(data[:,3]), var(data[:,3])
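For the small data array defined above, the basic statistics of one column can be computed as in this sketch. The name col is illustrative.

```python
import numpy as np

data = np.arange(50).reshape(10, 5)
col = data[:, 3]                 # the values 3, 8, 13, ..., 48

print(np.mean(col))              # 25.5
print(np.var(col))               # 206.25
print(np.std(col))               # square root of the variance
print(col.min(), col.max())      # 3 48
```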
In [ ]: d = arange(0, 10)
d
In [ ]: # cumulative sum
cumsum(d)
In [ ]: # cumulative product
cumprod(d+1)
In [ ]: A
When functions such as min, max, etc. are applied to multidimensional arrays, it is sometimes useful to apply the calculation to the
entire array, and sometimes only on a row or column basis. Using the axis argument we can specify how these functions should
behave:
In [ ]: m = random.rand(3,3)
m
In [ ]: # global max
m.max()
Many other functions and methods in the array and matrix classes accept the same (optional) axis keyword argument.
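A sketch of the axis argument on a small fixed matrix, so that the results are easy to check by hand:

```python
import numpy as np

m = np.array([[1, 8, 3],
              [4, 5, 6],
              [7, 2, 9]])

print(m.max())         # global max: 9
print(m.max(axis=0))   # max of each column: [7 8 9]
print(m.max(axis=1))   # max of each row: [8 6 9]
print(m.sum(axis=0))   # column sums: [12 15 18]
```

The axis given is the one that is collapsed: axis=0 runs down the rows and leaves one value per column.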
The shape of a Numpy array can be modified without copying the underlying data, which makes it a fast operation even for large
arrays.
In [ ]: A
In [ ]: n, m = A.shape
In [ ]: B = A.reshape((1,n*m))
B
In [ ]: A # and the original variable is also changed. B is only a different view of the same data
We can also use the function flatten to make a higher-dimensional array into a vector. But this function creates a copy of the data.
In [ ]: B = A.flatten()
In [ ]: B[0:5] = 10
In [ ]: A # now A has not changed, because B's data is a copy of A's, not referring to the same data
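The view-versus-copy distinction can be checked directly with shares_memory. A sketch with a small illustrative array:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

B = A.reshape((1, 6))            # a view on the same data
B[0, 0] = 100
print(A[0, 0])                   # 100: A sees the change
print(np.shares_memory(A, B))    # True

F = A.flatten()                  # an independent copy
F[0] = -1
print(A[0, 0])                   # still 100
print(np.shares_memory(A, F))    # False
```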
With newaxis , we can insert new dimensions in an array, for example converting a vector to a column or row matrix:
In [ ]: v = array([1,2,3])
In [ ]: shape(v)
In [ ]: # column matrix
v[:,newaxis].shape
In [ ]: # row matrix
v[newaxis,:].shape
Using the functions repeat, tile, vstack, hstack, and concatenate, we can create larger vectors and matrices from smaller ones:
In [ ]: tile(a, (4,1))
In [ ]: tile(a, (1,4))
In [ ]: tile(a, (4,4))
concatenate
In [ ]: b = array([[5, 6]]);b
In [ ]: a
In [ ]:
In [ ]: vstack((a,b))
In [ ]: hstack((a,b.T))
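The array a is not defined in the cells that survive here; a 2 x 2 matrix is assumed in the sketch below, which runs through all five functions.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # assumed definition of a
b = np.array([[5, 6]])

print(np.repeat(a, 3))           # each element three times, flattened
print(np.tile(a, 3))             # the whole matrix three times, side by side

rows = np.concatenate((a, b), axis=0)      # append b as a new row
cols = np.concatenate((a, b.T), axis=1)    # append b as a new column
print(rows)
print(cols)

# vstack and hstack are shortcuts for the two concatenations above
print(np.array_equal(np.vstack((a, b)), rows))
print(np.array_equal(np.hstack((a, b.T)), cols))
```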
To achieve high performance, assignments in Python usually do not copy the underlying objects. This is important for example when
objects are passed between functions, to avoid an excessive amount of memory copying when it is not necessary (technical term: pass
by reference).
In [ ]: # changing B affects A
B[0,0] = 10
In [ ]: A
In [ ]: print(c)
A[1]=45
In [ ]: print(c)
If we want to avoid this behavior, so that we get a new, completely independent object B copied from A, then we need to do a
so-called "deep copy" using the function copy:
In [ ]: B = copy(A)
In [ ]: A
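A compact sketch that contrasts plain assignment with copy, using illustrative names:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

B = A                  # no copy: B is another name for the same array
B[0, 0] = 10
print(B is A)          # True
print(A[0, 0])         # 10: changing B changed A

C = np.copy(A)         # independent copy
C[0, 0] = -5
print(A[0, 0])         # still 10: changing C leaves A alone
```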
Generally, we want to avoid iterating over the elements of arrays whenever we can (at all costs). The reason is that in an interpreted
language like Python (or MATLAB), iterations are really slow compared to vectorized operations.
However, sometimes iterations are unavoidable. For such cases, the Python for loop is the most convenient way to iterate over an
array:
In [ ]: v = array([1,2,3,4])
for x in v:
    print(x)
In [ ]: M = array([[1,2], [3,4],[5,6]])
for x in M:
    print(x)
When we need to iterate over each element of an array and modify its elements, it is convenient to use the enumerate function to obtain
both the element and its index in the for loop:
In [ ]: g=arange(10).reshape(5,2)
print(g)
for i in range(len(g)):
    for j in range(len(g[i])):
        print(g[i][j], end=" ")
    print()
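The cell above loops over index ranges. A version that uses enumerate, as the text describes, might look like the sketch below; it squares every element in place.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

for row_idx, row in enumerate(M):
    for col_idx, element in enumerate(row):
        # element is the current value, (row_idx, col_idx) its position
        M[row_idx, col_idx] = element ** 2

print(M)   # each element is now squared: [[ 1  4] [ 9 16]]
```

For this particular task the vectorized expression M ** 2 gives the same result without a loop.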
Vectorizing functions
As mentioned several times by now, to get good performance we should try to avoid looping over elements in our vectors and matrices,
and instead use vectorized algorithms. The first step in converting a scalar algorithm to a vectorized algorithm is to make sure that the
functions we write work with vector inputs.
In [ ]: def Theta(x):
    """
    Scalar implementation of the Heaviside step function.
    """
    if x >= 0:
        return 1
    else:
        return 0
In [ ]: Theta(array([-3,-2,-1,0,1,2,3]))
OK, that didn't work because we didn't write the Theta function so that it can handle a vector input...
To get a vectorized version of Theta we can use the Numpy function vectorize. In many cases it can automatically vectorize a
function:
In [ ]: Theta_vec = vectorize(Theta)
In [ ]: Theta_vec(array([-3,-2,-1,0,1,2,3]))
We can also implement the function to accept a vector input from the beginning (requires more effort but might give better performance):
In [ ]: def Theta(x):
    """
    Vector-aware implementation of the Heaviside step function.
    """
    return 1 * (x >= 0)
In [ ]: Theta(array([-3,-2,-1,0,1,2,3]))
Vectorised operations
In [ ]: arr1=array([10,20,30.5,40])
arr2=array([15,25,35,45])
#a=2; print(a)
In [ ]: arr1-arr2
In [ ]: arr1/arr2
In [ ]: arr1**arr2
In [ ]: sin(arr1)
In [ ]: log10(arr1)
In [ ]: print(arr1)
argmin(arr1)
In [ ]: a=array([1,5,2,2, 9,9,3])
b=unique(a)
In [ ]: b
In [ ]: c=sort(a); c
In [ ]: d=c[::-1];d
In [ ]: a=array([1,2,3,0])
b=array([0,2,3,1])
c=a<b;
print(c)
In [ ]: any(c)
In [ ]: all(c)
In [ ]: print(b)
c=logical_or(b>=0, b<=2);
print(c)
In [ ]: a>=2
In [ ]: print(a)
anew=where(a>=2, 30, a)
#anew=where(a>=2, 30, -1)
print(anew)
In [ ]: print(a[g])
In [ ]: a=eye(10);a
In [ ]: g=arange(10).reshape(5,2);print(g)
When using arrays in conditions, for example if statements and other boolean expressions, one needs to use any or all, which
require that any or all elements in the array evaluate to True:
In [ ]: M
In [ ]: if (M > 5).any():
    print("at least one element in M is larger than 5")
else:
    print("no element in M is larger than 5")
In [ ]: if (M > 5).all():
    print("all elements in M are larger than 5")
else:
    print("not all elements in M are larger than 5")
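The reason any or all is needed becomes clear when a comparison result is used directly in an if statement. A sketch with an illustrative matrix:

```python
import numpy as np

M = np.array([[1, 4], [9, 16]])

ambiguous = False
try:
    if M > 5:                  # an array of booleans has no single truth value
        pass
except ValueError as err:
    ambiguous = True
    print("ambiguous condition:", err)

print((M > 5).any())   # True: 9 and 16 are larger than 5
print((M > 5).all())   # False: 1 and 4 are not
```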
Type casting
Since Numpy arrays are statically typed, the type of an array does not change once created. But we can explicitly cast an array of some
type to another using the astype method (see also the similar asarray function). This always creates a new array of the new type:
In [ ]: M.dtype
In [ ]: M2 = M.astype(float)
M2
In [ ]: M2.dtype
In [ ]: M3 = M.astype(bool)
M3
Splitting of arrays
The opposite of concatenation is splitting, which is implemented by the functions np.split, np.hsplit, and np.vsplit. For each of these, we
can pass a list of indices giving the split points:
In [ ]: import numpy as np
x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)
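The cell above covers np.split on a one-dimensional input. A sketch of the two-dimensional variants, with an illustrative 4 x 4 grid:

```python
import numpy as np

grid = np.arange(16).reshape(4, 4)

upper, lower = np.vsplit(grid, [2])   # split between rows 1 and 2
left, right = np.hsplit(grid, [2])    # split between columns 1 and 2

print(upper)
print(lower)
print(left)
print(right)
```

N split points produce N + 1 subarrays, which is why a single index yields two pieces here.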
In [ ]: import numpy as np
arra=np.ones((8,8))
print("Original array:")
print(arra)
result = np.tril(arra)
print("\nResult:")
print(result)
In [ ]: x=np.diagflat([4,5,6,8]); print(x)
#x=eye(2)
Exercise-1: Create a 10 x 10 array of zeros and then "frame" it with a border of ones.
In [ ]: x=np.zeros((10,10));
nx=np.pad(x,(1,1),"constant", constant_values=1); print(nx)
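The pad call above returns a 12 x 12 array, since one row or column of ones is added on every side. An alternative sketch reaches the same result with slicing; the names framed and alt are illustrative.

```python
import numpy as np

x = np.zeros((10, 10))

# pad one element of value 1 on every side
framed = np.pad(x, pad_width=1, mode="constant", constant_values=1)
print(framed.shape)              # (12, 12)

# alternative: start from ones and zero out the interior
alt = np.ones((12, 12))
alt[1:-1, 1:-1] = 0

print(np.array_equal(framed, alt))   # True
```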
In [ ]: help(np.pad)
In [ ]: x=np.zeros((10,10));print(x)
In [ ]: import numpy as np
x=np.array([[1,0],[0,1]])
y=np.tile(x, (5,5));print(y)
In [ ]: help (np.tile)
In [ ]: import numpy as np
x=np.zeros((10,10));
x[0::2,::2]=1;x[1::2,1::2]=1;print(x)
Matplotlib
Plotting
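In [ ]: import matplotlib.pyplot as plt
x = np.arange(0, 3 * np.pi, 0.1); y = np.sin(x)  # assumed sample data; the original setup lines are missing
plt.plot(x, y)
plt.show()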
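In [ ]: # Two curves on one set of axes, with labels, title and legend
import matplotlib.pyplot as plt
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()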
In [ ]: # Subplots
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
plt.subplot(2, 1, 1)
plt.plot(x, y_sin)
plt.title('Sine')
plt.subplot(2, 1, 2)
plt.plot(x, y_cos)
plt.title('Cosine')
plt.suptitle("My Plots")
plt.show()
In [ ]: a=np.array([1,2,3,4,5])
b=np.array([5,6,7,8,9])
np.setdiff1d(a,b)
In [ ]: a=np.array([1,2,3,7,4,6,8,9])
b=np.array([3,4,5,7,1,1,0])
np.intersect1d(a,b)
In [ ]: a=np.full((3,3), True)
print(a)
print(a.dtype)
In [ ]: a=np.full((2,2), np.nan) # np.NaN was removed in NumPy 2.0; np.nan works in all versions
print(a)
print(a.dtype)
In [ ]: a=np.full((2,2), 5)
print(a)
print(a.dtype)
In [ ]: