Week 9 - Introduction To Numpy - Part 2

This document provides an introduction to NumPy programming concepts, including:
1. Creating your own NumPy universal functions (ufuncs) using the frompyfunc() method.
2. Working with Boolean arrays, including counting True entries, checking if any/all values are True, and aggregating counts along axes.
3. Exploring fancy indexing, which allows accessing multiple array elements using arrays of indices. Fancy indexing can also be used to modify values.
4. NumPy functions for sorting arrays, including np.sort(), np.argsort(), and np.partition(). Axis arguments allow sorting along rows or columns.
5. Searching arrays using np.where() and np.searchsorted(), and filtering arrays

Uploaded by

God is Good

Data Science Programming

Introduction to NumPy – Part 2

Week 9
Program Studi Teknik Informatika
Fakultas Teknik – Universitas Surabaya
Create Your Own ufunc
• To create your own ufunc, define a function as you would any normal
Python function, then convert it to a NumPy ufunc with the
frompyfunc() function.
• The frompyfunc() function takes the following arguments:
– function - the Python function to wrap.
– inputs - the number of input arguments (arrays).
– outputs - the number of output arrays.

# Create your own ufunc for addition
import numpy as np

def myadd(x, y):
    return x + y

# Wrap the Python function as a ufunc with 2 inputs and 1 output
myadd = np.frompyfunc(myadd, 2, 1)
print(myadd([1, 2, 3, 4], [5, 6, 7, 8])) # Output: [6 8 10 12]
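One detail worth seeing in action: a ufunc built with frompyfunc() returns an array of Python objects rather than a numeric array. The sketch below is a minimal illustration; the names py_add and ufunc_add are chosen here for clarity and do not come from the slides.

```python
import numpy as np

def py_add(x, y):
    # Plain Python function that works on two scalars
    return x + y

# Wrap it as a ufunc taking 2 inputs and producing 1 output
ufunc_add = np.frompyfunc(py_add, 2, 1)

result = ufunc_add([1, 2, 3, 4], [5, 6, 7, 8])
print(type(ufunc_add))   # <class 'numpy.ufunc'>
print(result.dtype)      # object

# Convert to a numeric dtype when numeric operations are needed afterwards
as_int = result.astype(np.int64)
print(as_int)            # [ 6  8 10 12]

# The built-in np.add ufunc produces the same values
print(np.array_equal(as_int, np.add([1, 2, 3, 4], [5, 6, 7, 8])))  # True
```

For simple arithmetic like this, the built-in ufuncs are usually the better choice; frompyfunc() is mainly convenient when no built-in equivalent exists.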
Working with Boolean Arrays
Counting entries
• Given a Boolean array, there are a host of useful operations
you can do.
• To count the number of True entries in a Boolean array,
np.count_nonzero is useful.

x = np.random.randint(10, size=(3, 4))
print(x) # Output varies from run to run because x is random

# how many values less than 6?
print(np.count_nonzero(x < 6))

• Another way to get at this information is to use np.sum; in this
case, False is interpreted as 0, and True is interpreted as 1.
print(np.sum(x < 6))
Counting entries
• The benefit of sum() is that like with other NumPy aggregation
functions, this summation can be done along rows or columns as well.
# how many values less than 6 in each row?
print(np.sum(x < 6, axis=1)) # Output: [3 2 3]

• If we’re interested in quickly checking whether any or all the values are
true, we can use (you guessed it) np.any() or np.all().
# are there any values greater than 8?
print(np.any(x > 8)) # Output: True

# are all values less than 10?
print(np.all(x < 10)) # Output: True

# are all values in each row less than 8?
print(np.all(x < 8, axis=1)) # Output: [False False True]
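Because the array above is random, its outputs change between runs. The following sketch repeats the same Boolean operations on a fixed array (values chosen here purely for illustration) so that every result can be checked by hand.

```python
import numpy as np

# Fixed 3x4 array so the results are reproducible
x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

total_lt6 = np.count_nonzero(x < 6)    # count over the whole array
per_row = np.sum(x < 6, axis=1)        # count within each row
per_col = np.sum(x < 6, axis=0)        # count within each column
rows_all_lt8 = np.all(x < 8, axis=1)   # is every value in the row < 8?

print(total_lt6)       # 8
print(per_row)         # [4 2 2]
print(per_col)         # [2 2 2 2]
print(np.any(x > 8))   # True
print(rows_all_lt8)    # [ True False  True]
```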
Fancy Indexing
Exploring Fancy Indexing
• Fancy indexing is like the simple indexing we’ve already seen, but we pass
arrays of indices in place of single scalars.
• This allows us to very quickly access and modify complicated subsets of an
array’s values.
• Fancy indexing is conceptually simple: it means passing an array of indices to
access multiple array elements at once.
• For example, consider the following array

x = np.random.randint(100, size=10)
print(x) # Output: [97 2 8 94 77 38 18 49 91 50]
• Suppose we want to access three different elements. We can pass a single list
or array of indices to obtain the result:
ind = [3, 7, 4]
print(x[ind]) # Output: [94 49 77]
Exploring Fancy Indexing
• With fancy indexing, the shape of the result reflects the shape of the
index arrays rather than the shape of the array being indexed.
ind = np.array([[3, 7],[4, 5]])
print(x[ind])
# Output:
# [[94 49]
#  [77 38]]

• Fancy indexing also works in multiple dimensions. Consider the
following array.
X = np.arange(12).reshape((3, 4))
print(X)
# Output:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

• Like with standard indexing, the first index refers to the row, and the
second to the column.
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
print(X[row, col]) # Output: [ 2 5 11]
Exploring Fancy Indexing
• Notice that the first value in the result is X[0, 2], the second is X[1, 1],
and the third is X[2, 3].
• If we combine a column vector and a row vector within the indices, we
get a two-dimensional result.
print(X[row[:, np.newaxis], col])
# Output:
# [[ 2  1  3]
#  [ 6  5  7]
#  [10  9 11]]

• For even more powerful operations, fancy indexing can be combined with the
other indexing schemes.
# We can combine fancy and simple indices
print(X[2, [2, 0, 1]]) # Output: [10 8 9]

# We can also combine fancy indexing with slicing
print(X[1:, [2, 0, 1]])
# Output:
# [[ 6  4  5]
#  [10  8  9]]
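The difference between pairing index arrays and combining a column vector with a row vector comes down to broadcasting of the index shapes. The sketch below prints the resulting shapes for the same X, row, and col used above; the Boolean mask at the end is an extra illustration that is not on the slides.

```python
import numpy as np

X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

# Two 1-D index arrays are paired element by element: (0,2), (1,1), (2,3)
paired = X[row, col]

# A (3, 1) column of row indices broadcasts against a (3,) array of
# column indices, giving a (3, 3) grid of results
grid = X[row[:, np.newaxis], col]

print(paired.shape)   # (3,)
print(grid.shape)     # (3, 3)

# Fancy indexing can also be combined with a Boolean mask over the columns
mask = np.array([True, False, True, False])
masked = X[row[:, np.newaxis], mask]
print(masked)
# [[ 0  2]
#  [ 4  6]
#  [ 8 10]]
```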
Modifying Values with Fancy Indexing
• Just as fancy indexing can be used to access parts of an array, it can
also be used to modify parts of an array.
• For example, imagine we have an array of indices and we’d like to set
the corresponding items in an array to some value.
x = np.arange(10)
i = np.array([2, 1, 8, 4])
x[i] = 99
print(x) # Output: [ 0 99 99 3 99 5 6 7 99 9]

• We can use any assignment-type operator for this. For example:
x[i] -= 10
print(x) # Output: [ 0 89 89 3 89 5 6 7 89 9]
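One caveat when modifying values this way is that repeated indices do not accumulate. The sketch below, which goes slightly beyond the slides, shows the behavior and one possible workaround using np.add.at; the index list is an arbitrary example.

```python
import numpy as np

# Index list with repeats: 3 appears twice, 4 appears three times
i = [2, 3, 3, 4, 4, 4]

# In-place operators with fancy indexing apply once per distinct index
x = np.zeros(10, dtype=int)
x[i] += 1
print(x)   # [0 0 1 1 1 0 0 0 0 0]

# np.add.at applies the operation once per occurrence of each index
y = np.zeros(10, dtype=int)
np.add.at(y, i, 1)
print(y)   # [0 0 1 2 3 0 0 0 0 0]
```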
Sorting, Searching, and Filtering
Sorting
• This section covers algorithms related to sorting values in NumPy
arrays.
• For example, a simple selection sort repeatedly finds the minimum
value from a list, and makes swaps until the list is sorted.
• We can code this in just a few lines of Python.
def selection_sort(x):
    for i in range(len(x)):
        # index of the smallest remaining value
        swap = i + np.argmin(x[i:])
        (x[i], x[swap]) = (x[swap], x[i])
    return x

x = np.array([2, 1, 4, 3, 5])
print(selection_sort(x)) # Output: [1 2 3 4 5]
Sorting
• The selection sort is useful for its simplicity, but is much too slow to be
useful for larger arrays.
• For a list of N values, it requires N loops, each of which does on the
order of N comparisons to find the swap value.
• In terms of the “big-O” notation often used to characterize these
algorithms, selection sort averages O(N²).
• If you double the number of items in the list, the execution time will go
up by about a factor of four.
• Although Python has built-in sort and sorted functions to work with lists, we
won’t discuss them here because NumPy’s np.sort function turns out to be
much more efficient and useful for our purposes.
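The quadratic growth described above can be checked without timing anything, simply by counting comparisons. This is a pure-Python sketch written for illustration; the helper name count_selection_sort_comparisons is hypothetical and not part of NumPy or the slides.

```python
def count_selection_sort_comparisons(values):
    """Selection-sort a copy of values; return (sorted_list, comparisons)."""
    data = list(values)
    comparisons = 0
    for i in range(len(data)):
        smallest = i
        # Scan the unsorted tail for the minimum value
        for j in range(i + 1, len(data)):
            comparisons += 1
            if data[j] < data[smallest]:
                smallest = j
        data[i], data[smallest] = data[smallest], data[i]
    return data, comparisons

sorted_100, c100 = count_selection_sort_comparisons(range(100, 0, -1))
sorted_200, c200 = count_selection_sort_comparisons(range(200, 0, -1))

print(c100)          # 4950  (= 100 * 99 / 2)
print(c200)          # 19900 (= 200 * 199 / 2)
print(c200 / c100)   # roughly 4: doubling N quadruples the work
```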
Sorting
• By default, np.sort uses an O(N log N) quicksort algorithm, though
mergesort and heapsort are also available.
• For most applications, the default quicksort is more than sufficient.
x = np.array([2, 1, 4, 3, 5])
print(np.sort(x)) # Output: [1 2 3 4 5]

• A related function is argsort, which instead returns the indices of the
sorted elements.
x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)
print(i) # Output: [1 0 3 2 4]

• The first element of that result gives the index of the smallest element, the
second value gives the index of the second smallest, and so on.
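The indices returned by argsort can be used with fancy indexing, both to rebuild the sorted array and to reorder a second array in the same way. In the sketch below, the labels array is a made-up example.

```python
import numpy as np

x = np.array([2, 1, 4, 3, 5])
labels = np.array(["b", "a", "d", "c", "e"])  # hypothetical parallel array

order = np.argsort(x)

# Indexing x with the argsort result gives the sorted array
print(x[order])        # [1 2 3 4 5]

# The same indices reorder any array aligned with x
print(labels[order])   # ['a' 'b' 'c' 'd' 'e']
```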
Sorting
• A useful feature of NumPy’s sorting algorithms is the ability to sort
along specific rows or columns of a multidimensional array using the
axis argument.
X = np.random.randint(0, 10, (4, 6))
print(X) # Output varies from run to run because X is random

# sort each column of X
print(np.sort(X, axis=0))

# sort each row of X
print(np.sort(X, axis=1))
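Since the random array above gives different output on every run, here is a small fixed array (values chosen for illustration) that makes the effect of the axis argument easy to verify by hand.

```python
import numpy as np

X = np.array([[3, 1, 2],
              [0, 5, 4]])

# axis=0 sorts the values within each column
by_column = np.sort(X, axis=0)
print(by_column)
# [[0 1 2]
#  [3 5 4]]

# axis=1 sorts the values within each row
by_row = np.sort(X, axis=1)
print(by_row)
# [[1 2 3]
#  [0 4 5]]
```

Each row or column is sorted independently, so any relationship between values that originally shared a row or column is lost.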
Sorting
• Sometimes we’re not interested in sorting the entire array, but simply want to
find the K smallest values in the array.
• NumPy provides this in the np.partition function.
• np.partition takes an array and a number K; the result is a new array with the
smallest K values to the left of the partition, and the remaining values to the
right, in arbitrary order.

x = np.array([7, 2, 3, 1, 6, 5, 4])
print(np.partition(x, 3)) # Output: [2 1 3 4 6 5 7]

• Note that the first three values in the resulting array are the three smallest in
the array, and the remaining array positions contain the remaining values.
• Within the two partitions, the elements have arbitrary order.
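Because the order inside each partition is arbitrary, it is safer to rely on the partition property than on one exact printed order. The sketch below checks that property and also uses np.argpartition, the index-returning counterpart, which the slides do not cover.

```python
import numpy as np

x = np.array([7, 2, 3, 1, 6, 5, 4])
k = 3

part = np.partition(x, k)

# The first k entries are the k smallest values, in no particular order
print(sorted(part[:k].tolist()))   # [1, 2, 3]

# The element at position k is the one that would be there in a full sort
print(part[k])                     # 4

# np.argpartition returns indices instead of values
idx = np.argpartition(x, k)[:k]
print(sorted(x[idx].tolist()))     # [1, 2, 3]
```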
Searching
• We can search an array for a certain value and return the indexes
where it matches. To search an array, use the np.where() function.
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x) # Output: (array([3, 5, 6], dtype=int64),)

• Another example: Find the indexes where the values are even or
odd
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 0)
y = np.where(arr%2 == 1)
print(x) # Output: (array([1, 3, 5, 7], dtype=int64),)
print(y) # Output: (array([0, 2, 4, 6], dtype=int64),)
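As the printed results show, np.where() with a single condition returns a tuple containing one index array per dimension. The sketch below unpacks that tuple for a 1-D array and a 2-D array; the grid array is a made-up example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# For a 1-D array the tuple holds exactly one index array
(idx,) = np.where(arr == 4)
print(idx.tolist())   # [3, 5, 6]
print(arr[idx])       # [4 4 4]

# For a 2-D array there is one index array per axis
grid = np.array([[1, 4],
                 [4, 0]])
rows, cols = np.where(grid == 4)
print(rows.tolist(), cols.tolist())   # [0, 1] [1, 0]
```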
Searching
• The np.searchsorted() function performs a binary search in a sorted
array and returns the index where the specified value would be
inserted to maintain the sort order.
arr = np.array([2, 7, 9, 12, 12])

# The number 8 should be inserted at index 2 to keep the array sorted.
# By default, the leftmost suitable index is returned.
print(np.searchsorted(arr, 8)) # Output: 2

# Find the index where the value 10 should be inserted, using the rightmost suitable index.
print(np.searchsorted(arr, 10, side='right')) # Output: 3

# Find the indexes where the values 2, 4, 6, and 11 should be inserted.
# The return value is an array: [0 1 1 3] containing the four indexes,
# where 2, 4, 6, 11 would be inserted in the original array to maintain the order.
print(np.searchsorted(arr, [2, 4, 6, 11])) # Output: [0 1 1 3]
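The side argument only makes a difference when the searched value is already present in the array. A minimal sketch using the duplicated value 12 from the same array:

```python
import numpy as np

arr = np.array([2, 7, 9, 12, 12])

# side='left' (the default) returns the index before any equal values
left = np.searchsorted(arr, 12)

# side='right' returns the index after all equal values
right = np.searchsorted(arr, 12, side='right')

print(left)           # 3
print(right)          # 5
print(right - left)   # 2, the number of times 12 occurs
```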
Filtering
• Getting some elements out of an existing array and creating a
new array out of them is called filtering.
• In NumPy, you filter an array using a boolean index list.
arr = np.array([41, 42, 43, 44])
x = arr[[True, False, True, False]]
print(x) # Output: [41 43]

• The example above returns [41 43]. Why? Because the new
array contains only the values where the filter array had the
value True, in this case, index 0 and 2.
Filtering
• Another example:
# Create a filter array that will return only even elements from the original array
arr = np.array([1, 2, 3, 4, 5, 6, 7])
filter_arr = arr % 2 == 0
newarr = arr[filter_arr]
print(filter_arr) #Output: [False True False True False True False]
print(newarr) #Output: [2 4 6]

# Create a filter array that will return only values higher than 42
arr = np.array([41, 42, 43, 44])
filter_arr = arr > 42
newarr = arr[filter_arr]
print(filter_arr) #Output: [False False True True]
print(newarr) #Output: [43 44]
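Filter arrays can also be combined before indexing. The sketch below is an illustration that goes slightly beyond the slides: it uses the element-wise operators & (and), | (or), and ~ (not), with each condition wrapped in parentheses because these operators bind more tightly than comparisons.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# Both conditions must hold
even_and_big = arr[(arr > 41) & (arr % 2 == 0)]
print(even_and_big)   # [42 44]

# Either condition may hold
odd_or_max = arr[(arr % 2 == 1) | (arr == 44)]
print(odd_or_max)     # [41 43 44]

# Invert a filter
not_even = arr[~(arr % 2 == 0)]
print(not_even)       # [41 43]
```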
Questions??
Exercise
• Create NRP_Nickname_ExWeek9.ipynb file.

Question 1
Create a 5X2 integer array from the range 100 to 200 so that the
difference between each element is 10. Here is an example of
what it looks like:
Exercise
Question 2
Given the following NumPy array:
np.array([[11, 22, 33], [44, 55, 66], [77, 88, 99]])
Return an array of the items in the second column of all
rows. Here is the expected display:
Exercise
Question 3
Given the following NumPy array:
np.array([[3, 6, 9, 12], [15, 18, 21, 24], [27, 30, 33, 36],
[39, 42, 45, 48], [51, 54, 57, 60]])
Return an array containing the odd rows and even columns of the
given array. Here is the expected display:
Exercise
Question 4
Add the following two NumPy arrays:
arrayOne = np.array([[5, 6, 9], [21, 18, 27]])
arrayTwo = np.array([[15, 33, 24], [4, 7, 1]])
Then modify the resulting array by calculating the square root of each
element. Here is the expected display:
