
Boolean Arrays and Masks.ipynb Colab

This document explains the use of Boolean masks in NumPy for examining and manipulating array values based on specific criteria. It covers comparison operators as universal functions (ufuncs) and demonstrates how to work with Boolean arrays, including counting entries and using Boolean operators for logical operations. Additionally, it clarifies the distinction between Python's logical keywords and bitwise operators when working with Boolean values.

Uploaded by

Ananthi.R
Copyright
© © All Rights Reserved

Comparisons, Masks, and Boolean Logic

This section covers the use of Boolean masks to examine and manipulate values within NumPy arrays. Masking
comes up when you want to extract, modify, count, or otherwise manipulate values in an array based on some
criterion: for example, you might wish to count all values greater than a certain value, or perhaps remove all outliers
that are above some threshold. In NumPy, Boolean masking is often the most efficient way to accomplish these
types of tasks.

Comparison Operators as ufuncs


In Computation on NumPy Arrays: Universal Functions we introduced ufuncs, and focused in particular on arithmetic
operators. We saw that using + , - , * , / , and others on arrays leads to element-wise operations. NumPy also
implements comparison operators such as < (less than) and > (greater than) as element-wise ufuncs. The result of
these comparison operators is always an array with a Boolean data type. All six of the standard comparison
operations are available:

import numpy as np
x = np.array([1, 2, 3, 4, 5])

x < 4 # less than

array([ True, True, True, False, False])

x > 3 # greater than

array([False, False, False, True, True])

x == 3 # equal

array([False, False, True, False, False])

As in the case of arithmetic operators, the comparison operators are implemented as ufuncs in NumPy; for example,
when you write x < 3 , internally NumPy uses np.less(x, 3) . A summary of the comparison operators and their
equivalent ufunc is shown here:

Operator    Equivalent ufunc    Operator    Equivalent ufunc
==          np.equal            !=          np.not_equal
<           np.less             <=          np.less_equal
>           np.greater          >=          np.greater_equal
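As a quick check of the table above, the following sketch compares each operator with its named ufunc on the one-dimensional x used earlier. The dictionary name pairs and the variable all_match are illustrative only and do not come from the notebook.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Pair each operator expression with the call to its equivalent ufunc.
pairs = {
    "==": (x == 3, np.equal(x, 3)),
    "!=": (x != 3, np.not_equal(x, 3)),
    "<": (x < 3, np.less(x, 3)),
    "<=": (x <= 3, np.less_equal(x, 3)),
    ">": (x > 3, np.greater(x, 3)),
    ">=": (x >= 3, np.greater_equal(x, 3)),
}

# True only if every operator result matches its ufunc result exactly.
all_match = all(np.array_equal(op, uf) for op, uf in pairs.values())
print(all_match)
```

Either spelling can be used; the operator form is usually easier to read.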

Just as in the case of arithmetic ufuncs, these will work on arrays of any size and shape. Here is a two-dimensional
example:

import numpy as np
x = np.array([[1, 2], [3, 4]])
x

array([[1, 2],
       [3, 4]])

x > 10

array([[False, False],
       [False, False]])

In each case, the result is a Boolean array, and NumPy provides a number of straightforward patterns for working
with these Boolean results.
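Because no element of x exceeds 10, the example above yields only False. A minimal sketch with a different threshold (2, chosen here purely for illustration) shows a mixed result and confirms that the output keeps the shape of x and has a Boolean data type:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Compare every element with the scalar 2.
mask = x > 2

print(mask)
print(mask.dtype)             # Boolean data type
print(mask.shape == x.shape)  # same shape as the input
```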

Working with Boolean Arrays


Given a Boolean array, there are a host of useful operations you can do. We'll work with x , the two-dimensional array
we created earlier.

print(x)

[[1 2]
 [3 4]]

Counting entries

To count the number of True entries in a Boolean array, np.count_nonzero is useful:

# how many values are less than 5?
np.count_nonzero(x < 5)

4

We see that there are four array entries that are less than 5. Another way to count matching entries is to use
np.sum; in this case, False is interpreted as 0, and True is interpreted as 1:

np.sum(x < 2)

1
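Counting can also be done along one axis of the array. The sketch below uses the threshold 3 (an illustrative choice) and the axis argument of np.sum to count matches per row and per column; the variable names are not from the notebook.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Count over the whole array.
total = np.count_nonzero(x < 3)

# Count within each row (axis=1) and within each column (axis=0).
per_row = np.sum(x < 3, axis=1)
per_col = np.sum(x < 3, axis=0)

print(total, per_row, per_col)
```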

# are there any values greater than 8?
np.any(x > 8)

False

# are there any values less than zero?
np.any(x < 0)

False

# are all values less than 5?
np.all(x < 5)

True

# are all values equal to 6?
np.all(x == 6)

False
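Like np.sum, both np.any and np.all accept an axis argument. A brief sketch, with thresholds picked only for illustration:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Is every value in each row less than 3?
rows_all_lt_3 = np.all(x < 3, axis=1)

# Does each column contain at least one value greater than 3?
cols_any_gt_3 = np.any(x > 3, axis=0)

print(rows_all_lt_3)
print(cols_any_gt_3)
```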

Boolean operators

We've already seen how we might count, say, all values less than 5, or all values less than 2. But what if we want to
know about all values that are greater than 0.5 and less than 3? This is accomplished through Python's bitwise logic
operators, & , | , ^ , and ~ . As with the standard arithmetic operators, NumPy overloads these as ufuncs that work
element-wise on (usually Boolean) arrays.

np.sum((x > 0.5) & (x < 3))

Combining comparison operators and Boolean operators on arrays can lead to a wide range of efficient logical
operations.
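Note that the parentheses around each comparison matter: & binds more tightly than the comparison operators, so an expression written without them is not evaluated as intended and results in an error. The sketch below also writes the same condition in a logically equivalent form using | and ~ ; the variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Values strictly between 0.5 and 3; each comparison is parenthesized.
inside = (x > 0.5) & (x < 3)

# The same condition expressed as NOT (too small OR too large).
also_inside = ~((x <= 0.5) | (x >= 3))

n_inside = np.sum(inside)
print(n_inside, np.array_equal(inside, also_inside))
```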

The following table summarizes the bitwise Boolean operators and their equivalent ufuncs:

Operator    Equivalent ufunc    Operator    Equivalent ufunc
&           np.bitwise_and      |           np.bitwise_or
^           np.bitwise_xor      ~           np.bitwise_not
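To see the four operators side by side, here is a small sketch on two hand-made Boolean arrays (a and b are hypothetical inputs, not part of the notebook) that covers every combination of True and False:

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

and_result = a & b   # True only where both are True
or_result = a | b    # True where at least one is True
xor_result = a ^ b   # True where exactly one is True
not_result = ~a      # flips every entry of a

# Each operator agrees with the ufunc listed in the table.
same = (
    np.array_equal(and_result, np.bitwise_and(a, b))
    and np.array_equal(or_result, np.bitwise_or(a, b))
    and np.array_equal(xor_result, np.bitwise_xor(a, b))
    and np.array_equal(not_result, np.bitwise_not(a))
)
print(same)
```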

Boolean Arrays as Masks


In the preceding section we looked at aggregates computed directly on Boolean arrays. A more powerful pattern is
to use Boolean arrays as masks, to select particular subsets of the data themselves. Returning to our x array from
before, suppose we want an array of all values in the array that are less than, say, 3:

x

array([[1, 2],
       [3, 4]])

We can obtain a Boolean array for this condition easily, as we've already seen:

x < 3

array([[ True,  True],
       [False, False]])

Now to select these values from the array, we can simply index on this Boolean array; this is known as a masking
operation:

x[x < 3]

array([1, 2])

What is returned is a one-dimensional array filled with all the values that meet this condition; in other words, all the
values in positions at which the mask array is True .

We are then free to operate on these values as we wish.

By combining Boolean operations, masking operations, and aggregates, we can very quickly answer questions of this
kind about a dataset.
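As one way to "operate on these values", the sketch below aggregates the selected entries and then uses a mask on the left-hand side of an assignment to cap large values. The threshold 3 and the names small and capped are assumptions made for illustration.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Select the values below 3 and aggregate them.
small = x[x < 3]
mean_small = small.mean()

# Work on a copy so that x itself is left unchanged,
# then overwrite every value above 3 with 3.
capped = x.copy()
capped[capped > 3] = 3

print(mean_small)
print(capped)
```

Masked assignment modifies the array in place, which is why the example copies x first.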

Aside: Using the Keywords and/or Versus the Operators &/|

One common point of confusion is the difference between the keywords and and or on one hand, and the
operators & and | on the other hand. When would you use one versus the other?

The difference is this: and and or gauge the truth or falsehood of an entire object, while & and | refer to bits within
each object.

When you use and or or , it's equivalent to asking Python to treat the object as a single Boolean entity. In Python, all
nonzero integers will evaluate as True. Thus:

bool(42), bool(0)

(True, False)

bool(42 and 0)

False

bool(42 or 0)

True

When you use & and | on integers, the expression operates on the bits of the element, applying the and or the or to
the individual bits making up the number:

bin(42 & 59)

'0b101010'

bin(42 | 59)

'0b111011'
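The same distinction applies to NumPy arrays. A Boolean array can be thought of as a sequence of bits, so | acts element by element, whereas or asks for the truth value of the whole array, which NumPy treats as ambiguous when the array has more than one element. A minimal sketch, with A and B as hypothetical example arrays:

```python
import numpy as np

A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

# Element-wise OR works as expected.
elementwise_or = A | B
print(elementwise_or)

# The keyword form needs a single truth value for each array,
# which NumPy refuses to guess for multi-element arrays.
message = None
try:
    A or B
except ValueError as err:
    message = str(err)
print(message)
```

So when building a Boolean expression on arrays, use | or & rather than or or and .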
