Python Numpy
Copyright © LEARNXT
Data Wrangling and Manipulation in Python
NumPy
Scientific Python
Extra features required:
Plotting tools
NumPy provides a fast N-dimensional (N-d) array datatype that can be manipulated in a vectorized form
NumPy Package
NumPy is the fundamental library for scientific computing with Python
NumPy Package
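NumPy also contains three libraries with numerical routines: linear algebra (numpy.linalg), Fourier transforms (numpy.fft), and random number generation (numpy.random)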
Install NumPy
Ensure that the NumPy package is installed on your computer
You can install the package from the Anaconda Prompt or from a Jupyter notebook cell:
conda install numpy (Anaconda Prompt)
%pip install numpy (Jupyter notebook cell)
Import NumPy
Import the NumPy package into the Python session
import numpy as np
NumPy Arrays
NumPy Arrays
Lists are useful for storing small amounts of one-dimensional data
>>> a = [1,3,5,7,9]
>>> print(a[2:4])
[5, 7]
>>> b = [[1, 3, 5, 7, 9], [2, 4, 6, 8, 10]]
>>> print(b[0])
[1, 3, 5, 7, 9]
>>> print(b[1][2:4])
[6, 8]
Adding two lists with + concatenates them:
>>> a = [1,3,5,7,9]
>>> b = [3,5,6,7,9]
>>> c = a + b
>>> print(c)
[1, 3, 5, 7, 9, 3, 5, 6, 7, 9]
NumPy Arrays:
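As a contrast with the list behavior, the short sketch below reuses the same two sequences and shows that + on NumPy arrays adds values position by position instead of joining the sequences. The variable names list_sum and array_sum are chosen here for illustration only.

```python
import numpy as np

a = [1, 3, 5, 7, 9]
b = [3, 5, 6, 7, 9]

# With lists, + joins the two sequences end to end
list_sum = a + b

# With arrays, + adds the values position by position
array_sum = np.array(a) + np.array(b)

print(list_sum)   # [1, 3, 5, 7, 9, 3, 5, 6, 7, 9]
print(array_sum)  # [ 4  8 11 14 18]
```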
Differences Between Lists and Arrays
Arrays are optimized for arithmetic computations, so if you are going to perform such operations, consider using an array instead of a list
Lists can hold elements of differing data types, whereas arrays hold elements of the same data type
NumPy arrays are faster and more compact than Python lists
NumPy uses much less memory to store data and provides a mechanism for specifying the data types, which allows the code to be optimized even further
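To make the point about data types concrete, here is a minimal sketch that stores the same five numbers with two different element types and compares the memory each array uses. The values and variable names are illustrative assumptions, not part of the course material.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Same five numbers stored with two different element types
wide = np.array(values, dtype=np.int64)   # 8 bytes per element
narrow = np.array(values, dtype=np.int8)  # 1 byte per element

print(wide.dtype, wide.itemsize, wide.nbytes)        # int64 8 40
print(narrow.dtype, narrow.itemsize, narrow.nbytes)  # int8 1 5
```

Choosing a smaller dtype only makes sense when every value fits in it; int8, for example, holds integers from -128 to 127.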
Arrays from Data
Demographic data
Country Name            Birth rate   Internet users   Income Group
Aruba                   10.244       78.9             High income
Afghanistan             35.253       5.9              Low income
Angola                  45.985       19.1             Upper middle income
Albania                 12.877       57.2             Upper middle income
United Arab Emirates    11.044       88               High income
Two routes from the data to a NumPy array:
Extract Birth rate as pandas Series -> Extract birth rate as NumPy array
Convert to data frame -> Convert data frame to NumPy array
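One possible way to carry out both conversions is sketched below. It assumes the pandas library is installed and rebuilds the five rows of the demographic table by hand; the data frame name demographics and the other variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame rebuilt from the five rows of the demographic table
demographics = pd.DataFrame({
    "Country Name": ["Aruba", "Afghanistan", "Angola", "Albania",
                     "United Arab Emirates"],
    "Birth rate": [10.244, 35.253, 45.985, 12.877, 11.044],
    "Internet users": [78.9, 5.9, 19.1, 57.2, 88.0],
    "Income Group": ["High income", "Low income", "Upper middle income",
                     "Upper middle income", "High income"],
})

# Route 1: column -> pandas Series -> NumPy array
birth_rate_series = demographics["Birth rate"]
birth_rate_array = birth_rate_series.to_numpy()

# Route 2: numeric columns of the data frame -> 2-D NumPy array
numeric_array = demographics[["Birth rate", "Internet users"]].to_numpy()

print(type(birth_rate_array))  # <class 'numpy.ndarray'>
print(birth_rate_array.shape)  # (5,)
print(numeric_array.shape)     # (5, 2)
```

Only the numeric columns are converted in the second route so that the resulting array has a single numeric dtype.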
NumPy Array
NumPy arrays are one of the most widely used data structures
An array is a grid of values and it contains information about the raw data, how to locate an
element, and how to interpret an element
The elements are all of the same type, referred to as the array dtype
NumPy Array
NumPy arrays are of two types:
NumPy Arrays
NumPy Array - Attributes
An array is usually a fixed-size container of items of the same type and size
The shape of an array is a tuple of non-negative integers that specify the sizes of each
dimension
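The attributes described above can be inspected directly on any array. The following sketch builds a small 3 x 4 array (an arbitrary example) and prints its shape, number of dimensions, element count, and dtype.

```python
import numpy as np

# A small example array with 3 rows and 4 columns
matrix = np.arange(12).reshape(3, 4)

print(matrix.shape)  # (3, 4)
print(matrix.ndim)   # 2
print(matrix.size)   # 12
print(matrix.dtype)  # integer type; the exact width depends on the platform
```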
NumPy – Creating Arrays
There are several ways to initialize new NumPy arrays, for example:
From Python lists or tuples
Using functions that are dedicated to generating NumPy arrays, such as arange(), linspace(), etc.
Creating NumPy Arrays – Examples
simple_list = [101,102,103,104,105,106,107,108,109,110]
simple_list
[101, 102, 103, 104, 105, 106, 107, 108, 109, 110]
array1 = np.array(simple_list)
array1
array([101, 102, 103, 104, 105, 106, 107, 108, 109, 110])
type(array1)
numpy.ndarray
Creating NumPy Arrays – Examples
list_of_lists = [[10,11,12],[20,21,22],[30,31,32]]
list_of_lists
# create an array
array2 = np.array(list_of_lists)
array2
np.arange(0,20)
# Returns values 0 to 19. Start value is 0 (included). Stop value is 20 (not included)
np.arange(0,21,4)
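The slide shows the calls without their results. A minimal sketch of what they produce, reusing the same list of lists and the same arange arguments (the name stepped is introduced here for illustration):

```python
import numpy as np

list_of_lists = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]

# A list of lists becomes a two-dimensional array
array2 = np.array(list_of_lists)
print(array2.shape)  # (3, 3)

# The third argument of arange is the step between consecutive values
stepped = np.arange(0, 21, 4)
print(stepped)       # [ 0  4  8 12 16 20]
```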
Generate Arrays of 0's
# Generate Array of 0's
array3 = np.zeros(50)
array3
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0.,0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Generate Arrays of 1's
# Generate Array of 1's
array4 = np.ones((4,5))
array4
Example use: we initialize an array and progressively fill it with results from a loop
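A minimal sketch of that loop-filling pattern follows. It preallocates with np.zeros (np.ones or np.empty would work the same way) and stores one result per iteration; the squares computed here are only a stand-in for real results.

```python
import numpy as np

n_steps = 5

# Preallocate the result array once, before the loop
squares = np.zeros(n_steps)

# Fill one slot per iteration instead of growing a container
for i in range(n_steps):
    squares[i] = i ** 2

print(squares)  # [ 0.  1.  4.  9. 16.]
```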
Arrays using linspace()
Evenly spaced values over a specified interval - create numeric sequences
array5 = np.linspace(0,20,10)
array5
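To see how linspace() spaces its values, the sketch below repeats the call above and adds a smaller, easier-to-read case. Unlike arange(), the third argument is the number of values, and the stop value is included by default.

```python
import numpy as np

# 10 values from 0 to 20, both ends included
array5 = np.linspace(0, 20, 10)
print(len(array5))           # 10
print(array5[0], array5[-1])  # 0.0 20.0

# A smaller case where the spacing is easy to read
quarters = np.linspace(0, 20, 5)
print(quarters)              # [ 0.  5. 10. 15. 20.]
```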
Arrays using eye()
# Create an Identity Matrix with eye()
array6 = np.eye(5)
array6
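As a quick check of what an identity matrix does, this sketch multiplies an arbitrary 3 x 3 matrix by np.eye(3) and confirms that the matrix comes back unchanged. The names identity, matrix, and product are illustrative.

```python
import numpy as np

identity = np.eye(3)                 # ones on the diagonal, zeros elsewhere
matrix = np.arange(9).reshape(3, 3)  # arbitrary 3 x 3 example

# Matrix multiplication by the identity leaves the values unchanged
product = matrix @ identity

print(identity)
print(np.array_equal(product, matrix))  # True
```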
Random Numbered Arrays
Create random number arrays using rand(), randn(), randint()
Uniform distribution:
# Every run generates a new set of numbers
array7 = np.random.rand(3,2)
array7
array([[0.48341811, 0.94935455],
[0.86604955, 0.29532457],
[0.79461142, 0.28140248]])
Random Numbered Arrays
Normal distribution:
array8 = np.random.randn(3,2)
array8
array([[-0.05195311, 0.14081327],
[ 0.57633652, -0.42966707],
[ 1.03544668, -0.81755038]])
Random Numbered Arrays
Integers:
array9 = np.random.randint(5,20,10)
array9
array([15, 16, 14, 15, 12, 17, 14, 11, 18, 12])
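Because these functions return new numbers on every run, results are hard to reproduce. One common approach, sketched here under the assumption that the legacy np.random interface used in these slides is kept, is to fix the seed first. The seed value 42 is arbitrary.

```python
import numpy as np

# Fixing the seed makes the sequence of random numbers repeatable
np.random.seed(42)
first_draw = np.random.rand(3, 2)

np.random.seed(42)
second_draw = np.random.rand(3, 2)

print(np.array_equal(first_draw, second_draw))  # True

# randint includes the low value (5) and excludes the high value (20)
integers = np.random.randint(5, 20, 10)
print(integers.min() >= 5, integers.max() < 20)  # True True
```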
Functions & Arithmetic Operations on Arrays
Functions on Arrays
Create an array and reshape into a 5 by 6 matrix
sample_array = np.arange(30)
sample_array
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29])
Functions on Arrays
# Reshape the array into a 5 x 6 matrix using reshape()
matrix2 = sample_array.reshape(5,6)
matrix2
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29]])
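reshape() only works when the new shape holds exactly the same number of elements. The sketch below shows two related behaviors: passing -1 lets NumPy infer one dimension, and a mismatched shape raises a ValueError. The flag mismatch_rejected is a hypothetical name used only to record the outcome.

```python
import numpy as np

sample_array = np.arange(30)

# -1 asks NumPy to work out the missing dimension (30 / 5 = 6)
inferred = sample_array.reshape(5, -1)
print(inferred.shape)  # (5, 6)

# 4 x 7 = 28 does not match the 30 elements, so reshape refuses
try:
    sample_array.reshape(4, 7)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(mismatch_rejected)  # True
```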
Functions on Arrays
Get the min and max values in an array
array9 = np.random.randint(5,20,10)
array9.min()
Functions on Arrays
Get the max value and the shape of an array
array9.max()
array9.shape
(10,)
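Since array9 is random, its min and max change on every run. For a repeatable illustration, this sketch uses the ten integers printed on the earlier randint slide as a fixed array and also shows argmin() and argmax(), which return the positions of the extreme values.

```python
import numpy as np

# Fixed copy of the integers printed on the earlier randint slide
fixed_array = np.array([15, 16, 14, 15, 12, 17, 14, 11, 18, 12])

print(fixed_array.min())     # 11
print(fixed_array.max())     # 18
print(fixed_array.argmin())  # 7  (index of the smallest value)
print(fixed_array.argmax())  # 8  (index of the largest value)
print(fixed_array.shape)     # (10,)
```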
Universal Array Functions
# Create an array and find the variance
sample_array = np.arange(30)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29])
# Variance
np.var(sample_array)
74.91666666666667
Universal Array Functions
# Square root
Arr = np.sqrt(sample_array)
Arr
array([0. , 1. , 1.41421356, 1.73205081, 2. ,
2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ,
3.16227766, 3.31662479, 3.46410162, 3.60555128, 3.74165739,
3.87298335, 4. , 4.12310563, 4.24264069, 4.35889894,
4.47213595, 4.58257569, 4.69041576, 4.79583152, 4.89897949,
5. , 5.09901951, 5.19615242, 5.29150262, 5.38516481])
Universal Array Functions
# log
np.log(sample_array)
# Maximum value
np.max(sample_array)
29
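Note that sample_array starts at 0, and the logarithm of 0 is not finite. The sketch below shows what np.log returns in that case; np.errstate is used here only to silence the RuntimeWarning that NumPy would otherwise print.

```python
import numpy as np

sample_array = np.arange(30)

# log(0) evaluates to -inf; errstate suppresses the accompanying warning
with np.errstate(divide="ignore"):
    logs = np.log(sample_array)

print(logs[0])  # -inf
print(logs[1])  # 0.0
```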
Universal Array Functions
Round the array values to 2 decimal places
# Round to the nearest value with 2 decimals
np.round(Arr, decimals = 2)
array([0. , 1. , 1.41, 1.73, 2. , 2.24, 2.45, 2.65, 2.83, 3. , 3.16,
3.32, 3.46, 3.61, 3.74, 3.87, 4. , 4.12, 4.24, 4.36, 4.47, 4.58,
4.69, 4.8 , 4.9 , 5. , 5.1 , 5.2 , 5.29, 5.39])
# Standard deviation
np.std(Arr)
1.3683899139885065
# Mean
np.mean(Arr)
3.553520654688042
Universal Array Functions - Strings
# Create an array of string values
np.unique(sports)
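The definition of the sports array is not shown on the slide. A minimal sketch with hypothetical values (the sport names below are invented for illustration) shows what np.unique() returns, including the optional per-value counts.

```python
import numpy as np

# Hypothetical string values; the original array is not shown
sports = np.array(["cricket", "football", "tennis",
                   "cricket", "tennis", "cricket"])

# unique() returns the distinct values in sorted order
unique_sports = np.unique(sports)
print(unique_sports)  # ['cricket' 'football' 'tennis']

# return_counts=True also reports how often each value occurs
names, counts = np.unique(sports, return_counts=True)
print(counts)         # [3 1 2]
```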
Arithmetic Operations
# View the sample array
sample_array
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29])
# Addition of arrays
sample_array + sample_array
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,34, 36, 38, 40, 42, 44, 46, 48,
50, 52, 54, 56, 58])
Arithmetic Operations
# Division of arrays (0 / 0 gives nan and a RuntimeWarning)
sample_array / sample_array
array([nan, 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1.])
# Add a scalar to every element
sample_array + 1
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30])
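When a division can hit a zero denominator, one way to avoid the nan is np.divide() with its out and where arguments, so that the division is only carried out where the denominator is non-zero. This is a sketch on a shorter array; filling the skipped positions with 0 is an assumption made for illustration, not something the slides prescribe.

```python
import numpy as np

numerator = np.arange(5)
denominator = np.arange(5)

# Divide only where the denominator is non-zero; other slots keep the 0.0 from `out`
safe_ratio = np.divide(
    numerator,
    denominator,
    out=np.zeros(numerator.shape, dtype=float),
    where=denominator != 0,
)

print(safe_ratio)  # [0. 1. 1. 1. 1.]
```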
Saving and Loading Arrays with NumPy
Saving Arrays with NumPy
The save function writes the array to the working directory as a *.npy file
np.save('S2_sample_array', sample_array)
Loading Arrays with NumPy
# Load the saved file S2_sample_array
np.load('S2_sample_array.npy')
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29])
# Load an .npz archive that holds several arrays and access one by its key
archive = np.load('2_arrays.npz')
archive['b']
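The step that creates the .npz archive is not shown. The sketch below is one way the full round trip could look, assuming the archive was written with np.savez() using the keys a and b. The second array, other_array, is hypothetical, and a temporary folder is used so that nothing is left in the working directory.

```python
import os
import tempfile

import numpy as np

sample_array = np.arange(30)
other_array = np.linspace(0, 1, 5)  # hypothetical second array

with tempfile.TemporaryDirectory() as folder:
    # Single array: save() writes a .npy file, load() reads it back
    npy_path = os.path.join(folder, "S2_sample_array.npy")
    np.save(npy_path, sample_array)
    restored = np.load(npy_path)

    # Several arrays: savez() stores each one under its keyword name
    npz_path = os.path.join(folder, "2_arrays.npz")
    np.savez(npz_path, a=sample_array, b=other_array)
    with np.load(npz_path) as archive:
        restored_b = archive["b"]

print(np.array_equal(restored, sample_array))   # True
print(np.array_equal(restored_b, other_array))  # True
```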
Summary
NumPy provides a fast N-d array datatype that can be manipulated in a vectorized form
This open-source library contains a powerful N-dimensional array object, advanced array slicing methods, and convenient array reshaping methods
An array is a grid of values and it contains information about the raw data, how to locate an
element, and how to interpret an element