Python Numpy
Scientific Python?
What is NumPy?
• Official documentation
– http://docs.scipy.org/doc/
• The NumPy book
– http://web.mit.edu/dvp/Public/numpybook.pdf
• Example list
– https://docs.scipy.org/doc/numpy/reference/routines.html
Arrays – Numerical Python (Numpy)
NumPy is the fundamental library for scientific computing with Python. This
open-source library contains:
• a powerful N-dimensional array object
• advanced array slicing methods (to select array elements)
• convenient array reshaping methods
NumPy can be extended with C code for functions where performance is critical.
In addition, tools are provided for integrating existing Fortran code. NumPy is a
hybrid of the older NumArray and Numeric packages, and is meant to replace them both.
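The three features listed above can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the names grid, middle and wide are chosen here for the example.

```python
import numpy as np

# An N-dimensional array: here 2-D, with 3 rows and 4 columns (values 0..11).
grid = np.arange(12).reshape(3, 4)

# Slicing: select every row, but only columns 1 and 2.
middle = grid[:, 1:3]

# Reshaping: the same 12 elements arranged as 2 rows of 6.
wide = grid.reshape(2, 6)

print(grid.shape, middle.shape, wide.shape)
```

The slice and the reshaped array keep the element order of grid; only the way the elements are indexed changes.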
Numpy – Creating arrays
Array shape
Array item types
Some ndarray methods
• ndarray.tolist()
– Return the contents of the array as a nested list
• ndarray.copy()
– Return a copy of the array
• ndarray.fill(scalar)
– Fill the array with the scalar value
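A short sketch of the three methods listed above, assuming a small integer array; the variable names are illustrative only.

```python
import numpy as np

values = np.array([[1, 2], [3, 4]])

as_list = values.tolist()  # nested Python list with the same contents
backup = values.copy()     # independent copy with its own memory
values.fill(0)             # overwrite every element of values in place

print(as_list)
print(backup)
print(values)
```

Because copy() allocates new memory, backup still holds the original numbers after fill(0) has overwritten values.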
Some NumPy functions
abs()
add()
binomial()
cumprod()
cumsum()
floor()
histogram()
max()
min()
multiply()
polyfit()
randint()
shuffle()
transpose()
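A few of the functions listed above in use, as a rough sketch with made-up input data. Note that binomial(), randint() and shuffle() live in the numpy.random submodule rather than in the top-level numpy namespace; they are left out here because their output is random.

```python
import numpy as np

data = np.array([1, 2, 3, 4])

running_sum = np.cumsum(data)       # cumulative sum along the array
running_product = np.cumprod(data)  # cumulative product along the array

# Fit a straight line (degree-1 polynomial) to points lying on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
slope, intercept = np.polyfit(x, y, 1)

# Count how many values fall into each of two equal-width bins.
counts, bin_edges = np.histogram(data, bins=2)

print(running_sum, running_product)
print(slope, intercept)
print(counts, bin_edges)
```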
Numpy – Creating vectors
• From lists
– numpy.array
# as vectors from lists
>>> import numpy
>>> a = numpy.array([1, 3, 5, 7, 9])
>>> b = numpy.array([3, 5, 6, 7, 9])
>>> c = a + b  # element-wise addition
>>> print(c)
[ 4  8 11 14 18]
>>> type(c)
<class 'numpy.ndarray'>
>>> c.shape
(5,)
Numpy – Creating matrices
>>> l = [[1, 2, 3], [3, 6, 9], [2, 4, 6]] # create a list
>>> a = numpy.array(l) # convert a list to an array
>>> print(a)
[[1 2 3]
 [3 6 9]
 [2 4 6]]
>>> a.shape
(3, 3)
>>> print(a.dtype) # get type of an array
int64
# or directly as matrix
>>> M = numpy.array([[1, 2], [3, 4]])
>>> M.shape
(2, 2)
>>> M.dtype
dtype('int64')
# only one type
>>> M[0, 0] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'hello'
>>> M = numpy.array([[1, 2], [3, 4]], dtype=complex)
>>> M
array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])
Numpy – Matrices use
>>> print(a)
[[1 2 3]
[3 6 9]
[2 4 6]]
>>> print(a[0]) # this is just like a list of lists
[1 2 3]
>>> print(a[1, 2]) # arrays can be given comma separated indices
9
>>> print(a[1, 1:3]) # and slices
[6 9]
>>> print(a[:,1])
[2 6 4]
>>> a[1, 2] = 7
>>> print(a)
[[1 2 3]
[3 6 7]
[2 4 6]]
>>> a[:, 0] = [0, 9, 8]
>>> print(a)
[[0 2 3]
[9 6 7]
[8 4 6]]
Numpy – Creating arrays
• Generation functions
>>> x = numpy.arange(0, 10, 1) # arguments: start, stop, step
>>> x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = numpy.zeros(5)
>>> print(b)
[0. 0. 0. 0. 0.]
>>> b.dtype
dtype('float64')
>>> n = 1000
>>> my_int_array = numpy.zeros(n, dtype=int) # numpy.int was removed in NumPy 1.24
>>> my_int_array.dtype # platform dependent: int64 on most systems, int32 on some
dtype('int64')
>>> c = numpy.ones((3, 3))
>>> c
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Numpy – array creation and use
>>> d = numpy.arange(5) # just like range()
>>> print(d)
[0 1 2 3 4]
• File I/O
>>> import os
>>> os.system('head DeBilt.txt')
"Stn", "Datum", "Tg", "qTg", "Tn", "qTn", "Tx", "qTx"
001, 19010101, -49, 00, -68, 00, -22, 40
001, 19010102, -21, 00, -36, 30, -13, 30
001, 19010103, -28, 00, -79, 30, -5, 20
001, 19010104, -64, 00, -91, 20, -10, 00
001, 19010105, -59, 00, -84, 30, -18, 00
001, 19010106, -99, 00, -115, 30, -78, 30
001, 19010107, -91, 00, -122, 00, -66, 00
001, 19010108, -49, 00, -94, 00, -6, 00
001, 19010109, 11, 00, -27, 40, 42, 00
0
>>> data = numpy.genfromtxt('DeBilt.txt', delimiter=',', skip_header=1)
>>> data.shape
(25568, 8)
>>> numpy.savetxt('datasaved.txt', data)
>>> os.system('head datasaved.txt')
1.000000000000000000e+00 1.901010100000000000e+07 -4.900000000000000000e+01 0.000000000000000000e+00 -6.800000000000000000e+01 0.000000000000000000e+00 -2.200000000000000000e+01 4.000000000000000000e+01
1.000000000000000000e+00 1.901010200000000000e+07 -2.100000000000000000e+01 0.000000000000000000e+00 -3.600000000000000000e+01 3.000000000000000000e+01 -1.300000000000000000e+01 3.000000000000000000e+01
1.000000000000000000e+00 1.901010300000000000e+07 -2.800000000000000000e+01 0.000000000000000000e+00 -7.900000000000000000e+01 3.000000000000000000e+01 -5.000000000000000000e+00 2.000000000000000000e+01
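The text-file round trip shown above can be tried without the DeBilt.txt file. The sketch below is an assumption-laden stand-in: it writes the header and first three data rows quoted above to a temporary file (the name sample.txt is hypothetical), reads them with genfromtxt, and writes and re-reads them with savetxt and loadtxt.

```python
import os
import tempfile

import numpy as np

# Header plus the first three data rows from the DeBilt.txt excerpt.
sample = (
    '"Stn", "Datum", "Tg", "qTg", "Tn", "qTn", "Tx", "qTx"\n'
    "001, 19010101, -49, 00, -68, 00, -22, 40\n"
    "001, 19010102, -21, 00, -36, 30, -13, 30\n"
    "001, 19010103, -28, 00, -79, 30, -5, 20\n"
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sample.txt")      # hypothetical file name
    out_path = os.path.join(tmp, "datasaved.txt")
    with open(csv_path, "w") as handle:
        handle.write(sample)

    # Skip the header row and split on commas; values are parsed as floats.
    data = np.genfromtxt(csv_path, delimiter=",", skip_header=1)

    # savetxt writes whitespace-separated floats; loadtxt reads them back.
    np.savetxt(out_path, data)
    restored = np.loadtxt(out_path)

print(data.shape)
print(np.array_equal(data, restored))
```

Every column is converted to float, which is why the station number 001 appears as 1.000000000000000000e+00 in the saved file.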
Numpy – Creating arrays
>>> M = numpy.random.rand(3,3)
>>> M
array([[ 0.84188778, 0.70928643, 0.87321035],
[ 0.81885553, 0.92208501, 0.873464 ],
[ 0.27111984, 0.82213106, 0.55987325]])
>>>
>>> numpy.save('saved-matrix.npy', M)
>>> numpy.load('saved-matrix.npy')
array([[ 0.84188778, 0.70928643, 0.87321035],
[ 0.81885553, 0.92208501, 0.873464 ],
[ 0.27111984, 0.82213106, 0.55987325]])
>>>
>>> os.system('head saved-matrix.npy')
NUMPYF{'descr': '<f8', 'fortran_order': False, 'shape': (3, 3), }
[binary array data, not human-readable]
Numpy - ndarray
[[ 1.5, 0.2, -3.7],
 [ 0.1, 1.7,  2.9]]
An array of rank 2, i.e. it has 2 axes: the first of length 2, the second of length 3
(a matrix with 2 rows and 3 columns).
Numpy – ndarray attributes
• ndarray.ndim
– the number of axes (dimensions) of the array i.e. the rank.
• ndarray.shape
– the dimensions of the array. This is a tuple of integers indicating the size of the array in each
dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the
shape tuple is therefore the rank, or number of dimensions, ndim.
• ndarray.size
– the total number of elements of the array, equal to the product of the elements of shape.
• ndarray.dtype
– an object describing the type of the elements in the array. One can create or specify dtypes
using standard Python types. NumPy also provides many of its own, for example bool_, str_, int_,
int8, int16, int32, int64, float16, float32, float64, complex64, complex128, object_.
• ndarray.itemsize
– the size in bytes of each element of the array. E.g. for elements of type float64, itemsize is 8
(=64/8), while float32 has itemsize 4 (=32/8) (equivalent to ndarray.dtype.itemsize).
• ndarray.data
– the buffer containing the actual elements of the array. Normally, we won't need to use this
attribute because we will access the elements in an array using indexing facilities.
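The attributes described above can be inspected directly. A minimal sketch, using a small 2 x 3 float array; the variable name m is chosen for this example.

```python
import numpy as np

m = np.array([[1.5, 0.2, -3.7],
              [0.1, 1.7, 2.9]])

print(m.ndim)      # number of axes
print(m.shape)     # length of each axis: (rows, columns)
print(m.size)      # total number of elements
print(m.dtype)     # element type
print(m.itemsize)  # bytes per element
```

For this array the size equals the product of the shape entries (2 * 3 = 6), and itemsize matches m.dtype.itemsize.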
Numpy – array creation and use
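ndarrays are mutable, and two ndarrays may be views of the same memory:
>>> x = numpy.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])
>>> y = numpy.array([1.5, 2.3, 4.7, 6.2, 7.8, 8.5]) # values inferred from the y[s] output below
>>> print(x)
[4.5 2.3 6.7 1.2 1.8 5.5]
>>> s = x.argsort() # indices that would sort x
>>> s
array([3, 4, 1, 0, 5, 2])
>>> x[s]
array([1.2, 1.8, 2.3, 4.5, 5.5, 6.7])
>>> y[s] # reorder y with the same indices
array([6.2, 7.8, 2.3, 1.5, 8.5, 4.7])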
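The statement that two ndarrays may share memory can be checked with a short sketch. It assumes basic slicing (which returns a view) and uses numpy.shares_memory to test for overlap; the variable names are illustrative.

```python
import numpy as np

original = np.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])

view = original[1:4]                # basic slicing returns a view
independent = original[1:4].copy()  # copy() allocates new memory

view[0] = -1.0  # writes through to original[1]

print(original)
print(np.shares_memory(original, view))
print(np.shares_memory(original, independent))
```

Modifying the view changes the original array, while the copy keeps the old value. Indexing with an integer array such as x[s] is different: it returns a new array rather than a view.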
Numpy – array functions
>>> arr = numpy.arange(10) # assumed definition, not shown in the original: 0 + 1 + ... + 9 = 45
>>> arr.sum()
45
>>> numpy.sum(arr)
45
• See http://numpy.scipy.org
Numpy – array operations
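>>> import numpy
>>> from numpy.linalg import inv
>>> a = numpy.array([[1.0, 2.0], [3.0, 4.0]])
>>> print(a)
[[1. 2.]
 [3. 4.]]
>>> a.transpose()
array([[1., 3.],
       [2., 4.]])
>>> inv(a) # matrix inverse
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
>>> u = numpy.eye(2) # 2x2 identity matrix
>>> u
array([[1., 0.],
       [0., 1.]])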
The correlation coefficient for multiple variables observed at multiple instances can be
found for arrays of the form [[x1, x2, …], [y1, y2, …], [z1, z2, …], …] where x, y, z are
different observables and the numbers indicate the observation times:
>>> import numpy as np
>>> a = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)
>>> c = np.corrcoef(a)
>>> c
array([[1.        , 0.72870505],
       [0.72870505, 1.        ]])
Here the returned array element c[i,j] gives the correlation coefficient for the ith and jth
observables. Similarly, the covariance for data can be found:
>>> np.cov(a)
array([[0.91666667, 2.08333333],
       [2.08333333, 8.91666667]])
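The two results above are related: each correlation coefficient is the covariance divided by the product of the two standard deviations, r[i,j] = C[i,j] / sqrt(C[i,i] * C[j,j]). The sketch below checks this for the same data; the names std and corr_from_cov are introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)

cov = np.cov(a)

# The diagonal of the covariance matrix holds the variances.
std = np.sqrt(np.diag(cov))

# Divide each covariance C[i, j] by std[i] * std[j].
corr_from_cov = cov / np.outer(std, std)

print(corr_from_cov)
print(np.allclose(corr_from_cov, np.corrcoef(a)))
```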
Using arrays wisely