DS Unit 3 Part 1
UNIT-3
Python for Data Handling
Syllabus: UNIT III:part-1
Python for Data Handling: Basics of Numpy arrays – aggregations – computations on arrays
– comparisons, masks, Boolean logic – fancy indexing – structured arrays
Prepared By: MD SHAKEEL AHMED, Associate Professor, Dept. Of IT, VVIT, Guntur Page 1
4-2 B.Tech IT Regulation: R19 Data Science: UNIT-3 Part-1
2. Indexing of arrays
Getting and setting the value of individual array elements
3. Slicing of arrays
Getting and setting smaller subarrays within a larger array
4. Reshaping of arrays
Changing the shape of a given array
5. Joining and splitting of arrays
Combining multiple arrays into one, and splitting one array into many
NumPy Array Attributes
Some useful array attributes are:
ndim: the number of dimensions
shape: the size of each dimension
size: the total number of elements in the array
dtype: the data type of the array elements
itemsize: the size (in bytes) of each array element
nbytes: the total size (in bytes) of the array
In general, nbytes is equal to itemsize times size.
# Python program to demonstrate Attribute of arrays
import numpy as np
# Creating array object
arr = np.array( [[ 1, 2, 3],
[ 4, 5, 6]] )
# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)
# Printing shape of array
print("Shape of array: ", arr.shape)
# Printing size (total number of elements) of array
print("Size of array: ", arr.size)
# Printing type of elements in array
print("Array Elements type: ", arr.dtype)
# Printing size of each element in array
print("Size of array element: ", arr.itemsize, "bytes")
# Printing total size of array
print("Total size of array: ",arr.nbytes ,"bytes")
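The relationship between nbytes, itemsize and size stated above can be checked directly. The following is a minimal sketch; the dtype is fixed to int64 here (an assumption made only so that the byte counts are the same on every platform).

```python
import numpy as np

# Fix the dtype so the byte counts do not depend on the platform default
arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.int64)

print(arr.itemsize)                            # 8 bytes per int64 element
print(arr.size)                                # 6 elements
print(arr.nbytes)                              # 48 bytes in total
print(arr.nbytes == arr.itemsize * arr.size)   # True
```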
print(a[0])
print(a[4])
Output:5
7
To index from the end of the array, we can use negative indices:
print(a[-1])
print(a[-2])
Output:9
7
In a multidimensional array, we access items using a comma-separated
tuple of indices:
Example: import numpy as np
a=np.array([[3, 5, 2, 4],
[7, 6, 8, 8],
[1, 6, 7, 7]])
print(a[0,0])
print(a[2,0])
print(a[2,-1])
Output: 3
1
7
We can also modify values using any of the above index notation:
a[0, 0] = 12
[[12 2]
[ 7 8]
[ 1 7]]
[[ 7 7 6 1]
[ 8 8 6 7]
[ 4 2 5 12]]
Accessing array rows and columns.
One commonly needed routine is accessing single rows or columns of an
array. We can do this by combining indexing and slicing, using an empty
slice marked by a single colon (:):
Example:
print(x2[:, 0]) # first column of x2 [12 7 1]
print(x2[0, :]) # first row of x2 [12 5 2 4]
In the case of row access, the empty slice can be omitted for a more
compact syntax:
print(x2[0]) # equivalent to x2[0, :] [12 5 2 4]
Where possible, the reshape method will use a no-copy view of the initial
array, but with noncontiguous memory buffers this is not always the case.
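One way to see whether reshape returned a view or a copy is np.shares_memory. The sketch below is illustrative only (the variable names are made up here): reshaping a contiguous array gives a view, while reshaping a transposed array, which is not C-contiguous, forces a copy.

```python
import numpy as np

base = np.arange(6)
view = base.reshape((2, 3))            # contiguous buffer: reshape returns a view
print(np.shares_memory(base, view))    # True

view[0, 0] = 100                       # writing through the view changes the original
print(base[0])                         # 100

transposed = view.T                    # same data, but no longer C-contiguous
flat = transposed.reshape(6)           # cannot be expressed as a view, so NumPy copies
print(np.shares_memory(transposed, flat))   # False
```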
Another common reshaping pattern is the conversion of a one-
dimensional array into a two-dimensional row or column matrix. You can
do this with the reshape method, or more easily by making use of the
newaxis keyword within a slice operation
Example:
import numpy as np
x = np.array([1, 2, 3])
# row vector via reshape
x.reshape((1, 3))
# row vector via newaxis
x[np.newaxis, :]
# column vector via reshape
x.reshape((3, 1))
# column vector via newaxis
x[:, np.newaxis]
Output: [[1 2 3]]
[[1 2 3]]
[[1]
[2]
[3]]
[[1]
[2]
[3]]
Array Concatenation and Splitting
It’s also possible to combine multiple arrays into one, and to conversely
split a single array into multiple arrays.
Concatenation of arrays: Concatenation, or joining of two arrays in
NumPy, is primarily accomplished through the routines np.concatenate,
np.vstack, and np.hstack.
np.concatenate takes a tuple or list of arrays as its first argument
Example:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(np.concatenate([x, y]))
Output: [1 2 3 4 5 6]
We can also concatenate more than two arrays at once:
Example:
z = [99, 99, 99]
print(np.concatenate([x, y, z]))
Output: [ 1 2 3 4 5 6 99 99 99]
np.concatenate can also be used for two-dimensional arrays:
Example:
import numpy as np
grid = np.array([[1, 2, 3], [4, 5, 6]])
# concatenate along the first axis
np.concatenate([grid, grid])
Output: [[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]]
# concatenate along the second axis (zero-indexed)
np.concatenate([grid, grid], axis=1)
Output: [[1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6]]
For working with arrays of mixed dimensions, it can be clearer to use the
np.vstack (vertical stack) and np.hstack (horizontal stack) functions:
Example:
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7], [6, 5, 4]])
# vertically stack the arrays
np.vstack([x, grid])
Output: [[1, 2, 3], [9, 8, 7], [6, 5, 4]]
# horizontally stack the arrays
y = np.array([[99], [99]])
np.hstack([grid, y])
Output: [[ 9, 8, 7, 99], [ 6, 5, 4, 99]]
Similarly, np.dstack will stack arrays along the third axis.
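As a small illustration of np.dstack (a sketch using a made-up second array, grid * 10), stacking two 2×3 arrays along the third axis produces an array of shape (2, 3, 2):

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# Stack two 2x3 arrays depth-wise, along a new third axis
stacked = np.dstack([grid, grid * 10])

print(stacked.shape)    # (2, 3, 2)
print(stacked[0, 0])    # [ 1 10]
```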
Splitting of arrays
The opposite of concatenation is splitting, which is implemented by the
functions np.split, np.hsplit, and np.vsplit.
For each of these, we can pass a list of indices giving the split points:
Example
x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)
[1 2 3] [99 99] [3 2 1]
Notice that N split points lead to N + 1 subarrays.
The related functions np.hsplit and np.vsplit are similar:
Example:
grid = np.arange(16).reshape((4, 4))
Output: [[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)
Output: [[0 1 2 3]
[4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]
left, right = np.hsplit(grid, [2])
print(left)
print(right)
Output:
[[ 0 1]
[ 4 5]
[ 8 9]
[12 13]]
[[ 2 3]
[ 6 7]
[10 11]
[14 15]]
Similarly, np.dsplit will split arrays along the third axis.
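A short sketch of np.dsplit on a three-dimensional array (the array name cube is chosen here only for illustration):

```python
import numpy as np

cube = np.arange(16).reshape((2, 2, 4))

# Split along the third axis at index 2
front, back = np.dsplit(cube, [2])

print(front.shape, back.shape)    # (2, 2, 2) (2, 2, 2)
print(front[0, 0], back[0, 0])    # [0 1] [2 3]
```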
Aggregations
Aggregations are used to compute summary statistics for the data in
question.
Perhaps the most common summary statistics are the mean and standard
deviation, which allow you to summarize the “typical” values in a dataset,
but other aggregates are useful as well (the sum, product, median,
minimum and maximum, quantiles, etc.).
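The aggregates listed above are all available as NumPy functions. A minimal sketch on a small array:

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

print(np.mean(arr))             # 5.0
print(np.median(arr))           # 5.0
print(np.std(arr))              # about 2.236 (the square root of 5)
print(np.prod(arr))             # 384
print(np.percentile(arr, 50))   # 5.0 (the 50th percentile equals the median)
```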
Summing the Values in an Array
np.sum() function is used for computing the sum of all values in an array.
Example:
import numpy as np
arr = np.array([2,4,6,8])
b = np.sum(arr)
print(b)
Output: 20
Minimum and Maximum
np.min(), np.max() are used to find minimum and maximum in given array
Example:
import numpy as np
arr = np.array([2,4,6,8])
print(np.min(arr))
print(np.max(arr))
output:
2
8
Multidimensional aggregates
One common type of aggregation operation is an aggregate along a row
or column.
By default, each NumPy aggregation function returns the aggregate
over the entire array.
Aggregation functions take an additional argument specifying the axis
along which the aggregate is computed.
For example, we can find the minimum value within each column by
specifying axis=0. Similarly, we can find the maximum value within each
row by specifying axis=1.
Example:
import numpy as np
a = np.array([[1,3,5,7],
[2,4,6,8]])
print(a.max())
print(np.max(a))
print(a.max(axis=0))
print(a.max(axis=1))
Output:
8
8
[2 4 6 8]
[7 8]
The axis keyword specifies the dimension of the array that will be
collapsed, rather than the dimension that will be returned. So specifying
axis=0 means that the first axis will be collapsed: for two-dimensional
arrays, this means that values within each column will be aggregated.
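The "collapsed axis" idea is easiest to see from the shapes of the results. A minimal sketch using the same 2×4 array:

```python
import numpy as np

a = np.array([[1, 3, 5, 7],
              [2, 4, 6, 8]])

print(a.shape)                # (2, 4)

# axis=0 collapses the rows: one result per column
print(a.min(axis=0))          # [1 3 5 7]
print(a.min(axis=0).shape)    # (4,)

# axis=1 collapses the columns: one result per row
print(a.sum(axis=1))          # [16 20]
print(a.sum(axis=1).shape)    # (2,)
```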
Example:
import numpy as np
x=np.array([1,2,3,4])
y=np.array([4,5,6,7])
z=np.add(x,y)
print(z)
Output: [ 5 7 9 11]
Absolute value
Just as NumPy understands Python’s built-in arithmetic operators, it also
understands Python’s built-in absolute value function:
Example:
import numpy as np
x = np.array([-2, -1, 0, 1, 2])
print(np.abs(x))
Output: [2 1 0 1 2]
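Python's built-in abs() can be applied to an array directly and gives the same result as the ufunc np.absolute (np.abs is an alias for it). The sketch below also shows a complex input, which is an addition made here for illustration: for complex values the ufunc returns the magnitude.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
print(abs(x))            # [2 1 0 1 2]  (built-in function)
print(np.absolute(x))    # [2 1 0 1 2]  (equivalent ufunc)

# For complex numbers the absolute value is the magnitude
z = np.array([3 - 4j, 0 + 1j])
print(np.abs(z))         # [5. 1.]
```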
Trigonometric functions
NumPy provides a large number of useful ufuncs, and some of the most
useful for the data scientist are the trigonometric functions.
Example:
theta = np.linspace(0, np.pi, 3)
print("theta = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))
Output:
theta = [ 0. 1.57079633 3.14159265]
sin(theta) = [ 0.00000000e+00 1.00000000e+00 1.22464680e-16]
cos(theta) = [ 1.00000000e+00 6.12323400e-17 -1.00000000e+00]
tan(theta) = [ 0.00000000e+00 1.63312394e+16 -1.22464680e-16]
Inverse trigonometric functions are also available:
Example:
import numpy as np
x = [-1, 0, 1]
print("x = ", x)
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))
Output:
x = [-1, 0, 1]
arcsin(x) = [-1.57079633 0. 1.57079633]
arccos(x) = [3.14159265 1.57079633 0. ]
arctan(x) = [-0.78539816 0. 0.78539816]
Exponents and logarithms
Another common type of operation available as NumPy ufuncs is the
exponentials:
Example:
import numpy as np
x = [1, 2, 3]
print("x =", x)
print("e^x =", np.exp(x))
print("2^x =", np.exp2(x))
print("3^x =", np.power(3, x))
Output:
x = [1, 2, 3]
e^x = [ 2.71828183 7.3890561 20.08553692]
2^x = [ 2. 4. 8.]
3^x = [ 3 9 27]
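The logarithms, the inverses of the exponentials, are available in the same way. A minimal sketch (the input values are chosen here only for illustration): np.log is the natural logarithm, while np.log2 and np.log10 use base 2 and base 10.

```python
import numpy as np

x = np.array([1, 2, 4, 10])

print("x        =", x)
print("ln(x)    =", np.log(x))      # natural logarithm
print("log2(x)  =", np.log2(x))     # base-2 logarithm
print("log10(x) =", np.log10(x))    # base-10 logarithm

# exp and log undo each other (up to floating-point rounding)
print(np.allclose(np.exp(np.log(x)), x))   # True
```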
Advanced Ufunc Features
Some of the specialized features of ufuncs are:
Specifying output
For large calculations, it is sometimes useful to be able to specify the
array where the result of the calculation will be stored. Rather than
creating a temporary array, we can use this to write computation results
directly to the required memory location where we would like them to be.
x = np.arange(5)
y = np.empty(5)
np.multiply(x, 10, out=y)
print(y)
Output: [ 0. 10. 20. 30. 40.]
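The out argument also accepts array views. As a sketch, the results of a computation can be written into every other element of an existing array:

```python
import numpy as np

x = np.arange(5)
y = np.zeros(10)

# Write 2**x directly into positions 0, 2, 4, 6, 8 of y
np.power(2, x, out=y[::2])

print(y)   # [ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]
```

Writing y[::2] = 2 ** x gives the same values, but it first builds a temporary array for 2 ** x and then copies it into y.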
Aggregates
For binary ufuncs, there are some interesting aggregates that can be
computed directly from the object.
For example, if we’d like to reduce an array with a particular operation,
we can use the reduce method of any ufunc. A reduce repeatedly applies a
given operation to the elements of an array until only a single result
remains.
For example, calling reduce on the add ufunc returns the sum of all
elements in the array:
x = np.arange(1, 6)
np.add.reduce(x)
Output: 15
If we’d like to store all the intermediate results of the computation, we
can instead use accumulate:
np.add.accumulate(x)
Output: [ 1 3 6 10 15]
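The same methods exist on other binary ufuncs. A minimal sketch using multiply, which gives the product and the running product:

```python
import numpy as np

x = np.arange(1, 6)

print(np.multiply.reduce(x))         # 120
print(np.multiply.accumulate(x))     # [  1   2   6  24 120]

# add.reduce gives the same value as np.sum
print(np.add.reduce(x) == np.sum(x))   # True
```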
Outer products
Any ufunc can compute the output of all pairs of two different inputs
using the outer method.
Example:
x = np.arange(1, 6)
np.multiply.outer(x, x)
Output:
[[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]]
Broadcasting
Broadcasting is a means of vectorizing operations.
Broadcasting is a set of rules for applying binary ufuncs
(addition, subtraction, multiplication, etc.) to arrays of different shapes.
It is the mechanism that lets NumPy perform arithmetic on such arrays
without an explicit loop.
In broadcasting, we can think of it as a smaller array being “broadcasted”
into the same shape as the larger array, before doing certain operations. In
general, the smaller array will be copied multiple times, until it reaches the
same shape as the larger array.
Using broadcasting allows for vectorization, a style of programming that
works with entire arrays instead of individual elements
Broadcasting is usually fast, since it vectorizes array operations so that
looping occurs in optimized C code instead of slower Python code. In
addition, NumPy does not actually store the copies of the smaller array;
the stretching is only conceptual, and the original values are reused in memory.
The central idea of broadcasting is that the smaller array behaves as if its
data had been copied to match the shape of the larger array.
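That no real copies are made can be checked with np.broadcast_to, which returns the "stretched" array as a read-only view. In the sketch below the stride along the repeated axis is 0, meaning every row points at the same memory as the original array.

```python
import numpy as np

a = np.array([0, 1, 2])

# View of a "stretched" to shape (3, 3); no data is copied
stretched = np.broadcast_to(a, (3, 3))

print(stretched)
print(stretched.strides[0])             # 0: moving to the next row moves 0 bytes
print(np.shares_memory(a, stretched))   # True
```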
Example 1:For example, we can just as easily add a scalar (think of it as a
zero dimensional array) to an array:
import numpy as np
a = np.array([0, 1, 2])
print(a + 5)
Output: [5 6 7]
We can think of this as an operation that stretches or duplicates the value
5 into the array [5, 5, 5], and adds the results.
Example2:
import numpy as np
a = np.array([0, 1, 2])
M = np.ones((3, 3))
print(M + a)
Output: [[1. 2. 3.]
[1. 2. 3.]
[1. 2. 3.]]
An array with too few dimensions can have its shape prepended
with dimensions of length 1, so that the above stated property is
true.
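The effect of prepending a dimension of length 1 can be sketched as follows (the array names are illustrative). A shape (3,) array is treated as (1, 3) and matches a (2, 3) array, while a shape (2,) array is treated as (1, 2) and does not, unless a trailing axis is added explicitly with np.newaxis.

```python
import numpy as np

M = np.ones((2, 3))

a = np.arange(3)             # shape (3,) is treated as (1, 3)
print((M + a).shape)         # (2, 3)

b = np.arange(2)             # shape (2,) is treated as (1, 2): 2 does not match 3
failed = False
try:
    M + b
except ValueError as err:
    failed = True
    print("incompatible shapes:", err)

# Making b a column of shape (2, 1) makes the shapes compatible
print((M + b[:, np.newaxis]).shape)   # (2, 3)
```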
Uses of Broadcasting
(Broadcasting in Practice)
Centering an array:
One commonly seen example is centering an array of data.
Example: Imagine you have an array of 10 observations, each of which
consists of 3 values. We will store this in a 10×3 array:
X = np.random.random((10, 3))
We can compute the mean of each feature using the mean aggregate
across the first dimension:
Xmean = X.mean(0)
print(Xmean)
And now we can center the X array by subtracting the mean
X_centered = X - Xmean
To double-check that we’ve done this correctly, we can check that the
centered array has near zero mean:
print(X_centered.mean(0))
To within machine precision, the mean is now zero.
The entire program is:
import numpy as np
X = np.random.random((10, 3))
Xmean = X.mean(0)
print(Xmean)
X_centered = X - Xmean
print(X_centered.mean(0))
Output (the exact values differ on each run because X is random):
[0.53514715 0.66567217 0.44385899]
[ 2.22044605e-17 -7.77156117e-17 -1.66533454e-17]
Plotting a two-dimensional function:
One place that broadcasting is very useful is in displaying images based
on two dimensional functions.
If we want to define a function z = f(x, y), broadcasting can be used to
compute the function across the grid:
# x and y have 50 steps from 0 to 5
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 50)[:, np.newaxis]
z = np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
We can use Matplotlib to plot this two-dimensional array
Example 1:
x = np.array([1, 2, 3, 4, 5])
print(x < 3) # less than operator
Output: [ True True False False False]
Example 2:
x = np.array([1, 2, 3, 4, 5])
print(np.less(x,3)) # less than ufunc
Output: [ True True False False False]
Working with Boolean Arrays:
Counting entries:
To count the number of True entries in a Boolean array,
np.count_nonzero is useful:
# how many values less than 6?
rng = np.random.RandomState(0)
x = rng.randint(10, size=(3, 4))
print(x)
print(np.less(x,6))
print(np.count_nonzero(x < 6))
Output:
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
[[ True True True True]
[False False True True]
[ True True False False]]
8
Example:
import numpy as np
a =np.arange(10)
print(a)
# bitwise OR operator
b=((a<=2) | (a>=8))
print(b)
d=np.sum(b)
print(d)
# bitwise OR ufunc
c=np.bitwise_or(a<=2,a>=8)
print(c)
e=np.sum(c)
print(e)
Output:
[0 1 2 3 4 5 6 7 8 9]
[ True True True False False False False False True True]
5
[ True True True False False False False False True True]
5
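The other bitwise operators work the same way on Boolean arrays. The following sketch uses & (AND) and ~ (NOT), together with np.any and np.all; note that the Python keywords and/or cannot be used here, because they try to evaluate the whole array as a single truth value and raise an error.

```python
import numpy as np

a = np.arange(10)

# Elements strictly between 2 and 8 (parentheses are required)
inside = (a > 2) & (a < 8)
print(inside)
print(np.sum(inside))        # 5

# NOT: count the elements that are not greater than 5
print(np.sum(~(a > 5)))      # 6

# Is any element greater than 8? Are all elements non-negative?
print(np.any(a > 8), np.all(a >= 0))   # True True
```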
Boolean Masking (Boolean Indexing)
(Boolean Arrays as Masks)
Boolean masks are used to examine and manipulate values within NumPy
arrays.
Masking comes up when we want to extract, modify, count, or otherwise
manipulate values in an array based on some criterion: for example,
counting all values greater than a certain value, or removing all
outliers that are above some threshold.
In NumPy, Boolean masking is often the most efficient way to
manipulate values in an array based on some criterion.
Boolean masking, also called boolean indexing, is a feature in Python
NumPy that allows for the filtering of values in numpy arrays.
Numpy allows us to use an array of boolean values as an index of another
array.
Each element of the boolean array indicates whether or not to select the
elements from the array.
If the value is True, the element of that index is selected. In case the value
is False, the element of that index is not selected.
Example 1:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([True, True, False])
c = a[b]
print(c)
Output: [1 2]
Example 2:
import numpy as np
a = np.arange(1, 10)
b=a>5
print(b)
c = a[b]
print(c)
Output:
[False False False False False True True True True]
[6 7 8 9]
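A mask can also be used on the left-hand side of an assignment to modify the selected elements. The sketch below uses made-up readings and a made-up threshold of 10 to count the values above the threshold and then cap them:

```python
import numpy as np

# Hypothetical measurements; values above 10 are treated as outliers
readings = np.array([3, 12, 7, 25, 9])

too_high = readings > 10
print(np.count_nonzero(too_high))   # 2

# Assign through the mask: only the selected elements change
readings[too_high] = 10
print(readings)                     # [ 3 10  7 10  9]
```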
Fancy Indexing
Fancy indexing is another style of array indexing, in which we pass arrays of
indices in place of single scalars. This allows us to very quickly access
and modify complicated subsets of an array’s values.
Fancy indexing means passing an array of indices to access multiple
array elements at once.
For example, consider the following array:
import numpy as np
x = np.arange(1,11)
print(x)
Output:[ 1 2 3 4 5 6 7 8 9 10]
Suppose we want to access three different elements. We could do it like this:
print(x[3], x[7], x[2])
Output: 4 8 3
Alternatively, we can pass a single list or array of indices to obtain the same
result:
ind = [3, 7, 2]
print(x[ind])
Output: [4 8 3]
With fancy indexing, the shape of the result reflects the shape of the index
arrays rather than the shape of the array being indexed:
Example: ind = np.array([[3, 7],
[4, 5]])
print(x[ind])
Output: [[4 8]
[5 6]]
Fancy indexing also works in multiple dimensions.
Example:
import numpy as np
X = np.arange(12).reshape((3, 4))
print(X)
Output: [[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Like with standard indexing, the first index refers to the row, and the
second to the column:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
print(X[row, col])
Output: [ 2 5 11]
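In the example above the two index arrays are paired element by element. The pairing follows the broadcasting rules, so a column vector of rows combined with a row vector of columns gives a two-dimensional result. A minimal sketch:

```python
import numpy as np

X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

# row[:, np.newaxis] has shape (3, 1); col has shape (3,)
# Broadcasting pairs every row index with every column index
block = X[row[:, np.newaxis], col]
print(block)
# [[ 2  1  3]
#  [ 6  5  7]
#  [10  9 11]]
```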
Combined Indexing
(Combining fancy indexing with other indexing schemes)
For even more powerful operations, fancy indexing can be combined with
the other indexing schemes
We can combine fancy and simple indices:
import numpy as np
X = np.arange(12).reshape((3, 4))
print(X)
a= X[2, [2, 0, 1]]
print(a)
Output: [[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[10 8 9]
We can also combine fancy indexing with slicing:
b=X[1:, [2, 0, 1]]
print(b)
Output:
[[ 6, 4, 5],
[10, 8, 9]]
We can combine fancy indexing with masking:
mask = np.array([True, False, True, False])
row = np.array([0, 1, 2])
c= X[row[:, np.newaxis], mask]
print(c)
Output: [[ 0, 2],
[ 4, 6],
[ 8, 10]]
All of these indexing options combined lead to a very flexible set of
operations for accessing and modifying array values.
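Fancy indexing can be used to modify values as well as to read them. A short sketch, with an index array chosen only for illustration:

```python
import numpy as np

x = np.arange(10)
i = np.array([2, 1, 8, 4])

# Assign to several positions at once
x[i] = 99
print(x)    # [ 0 99 99  3 99  5  6  7 99  9]

# Any assignment-type operator works the same way
x[i] -= 10
print(x)    # [ 0 89 89  3 89  5  6  7 89  9]
```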
Structured Arrays
Structured arrays are arrays with compound data types.
They provide efficient storage for compound, heterogeneous data.
Example:
import numpy as np
# Use a compound data type for structured arrays
data = np.zeros(4, dtype={'names':('name', 'age', 'weight'),
'formats':('U10', 'i4', 'f8')})
print(data.dtype)
#storing data in three separate arrays
name = np.array( ['Kumar', 'Rao', 'Ali', 'Singh'])
The handy thing with structured arrays is that we can now refer to values
either by index or by name:
Example:
# Get all names
print(data['name'])
# Get first row of data
print(data[0])
# Get the name from the last row
print(data[-1]['name'])
# Get names where age is under 30
print(data[data['age'] < 30]['name'])
Output:
['Kumar' 'Rao' 'Ali' 'Singh']
('Kumar', 25, 55.)
Singh
['Kumar' 'Singh']
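Because each field behaves like an ordinary array, the usual NumPy operations apply to it. The following self-contained sketch rebuilds the same four records, then computes the mean age and sorts the records by a field using the order argument of np.sort:

```python
import numpy as np

# Same records as in the example above
data = np.zeros(4, dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
data['name'] = ['Kumar', 'Rao', 'Ali', 'Singh']
data['age'] = [25, 45, 37, 19]
data['weight'] = [55.0, 85.5, 68.0, 61.5]

# Aggregate over a single field
print(data['age'].mean())            # 31.5

# Sort whole records by the 'age' field
by_age = np.sort(data, order='age')
print(by_age['name'])                # ['Singh' 'Kumar' 'Ali' 'Rao']
```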
Creating Structured Arrays
Structured array data types can be specified in a number of ways.
Method 1: Dictionary method: We can create a structured array using a
compound data type specification:
struct = np.dtype({'names':('name', 'age', 'weight'),
'formats':('U10', 'i4', 'f8')})
Method2: Numerical types can be specified with Python types or NumPy
dtypes instead:
struct2 = np.dtype({'names':('name', 'age', 'weight'),
'formats':((np.str_, 10), int, np.float32)})
Method3: A compound type can also be specified as a list of tuples:
struct3 = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f8')])
Example:
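import numpy as np
# struct, struct2 and struct3 are the dtypes defined above
name = ['Kumar', 'Rao', 'Ali', 'Singh']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]
# one empty structured array per dtype specification
data = np.zeros(4, dtype=struct)
data2 = np.zeros(4, dtype=struct2)
data3 = np.zeros(4, dtype=struct3)
# fill each field from the corresponding list
data['name'] = name
data['age'] = age
data['weight'] = weight
data2['name'] = name
data2['age'] = age
data2['weight'] = weight
data3['name'] = name
data3['age'] = age
data3['weight'] = weight
print(data)
print(data2)
print(data3)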
Output:
[('Kumar', 25, 55. ) ('Rao', 45, 85.5) ('Ali', 37, 68. ) ('Singh', 19, 61.5)]
[('Kumar', 25, 55. ) ('Rao', 45, 85.5) ('Ali', 37, 68. ) ('Singh', 19, 61.5)]
[('Kumar', 25, 55. ) ('Rao', 45, 85.5) ('Ali', 37, 68. ) ('Singh', 19, 61.5)]
Record Arrays: Structured Arrays with a Twist
NumPy also provides the np.recarray class, which is almost identical to the
structured arrays , but with one additional feature: fields can be accessed as
attributes rather than as dictionary keys.
Recall that we can access the ages by writing: data['age']
Output: array([25, 45, 37, 19], dtype=int32)
If we view our data as a record array instead, we can access this with
slightly fewer keystrokes:
data_rec = data.view(np.recarray)
print(data_rec.age)
Output: [25 45 37 19]
The downside is that for record arrays, there is some extra overhead
involved in accessing the fields
Example program:
import numpy as np
name = ['Kumar','Rao','Ali','Singh']
age = [25,45,37,19]
weight = [55.0,85.5,68.0,61.5]
struct = np.dtype({'names':('name','age','weight'),
'formats':('U10','i4','f8')})
data = np.zeros(4,struct)
data['name'] = name
data['age'] = age
data['weight'] = weight
print(data)
# accessing a field as a dictionary key
print(data['age'])
# accessing a field as an attribute
data_rec = data.view(np.recarray)
print(data_rec.age)
Output:
[('Kumar', 25, 55. ) ('Rao', 45, 85.5) ('Ali', 37, 68. ) ('Singh', 19, 61.5)]
[25 45 37 19]
[25 45 37 19]
Tutorial Questions
1. Illustrate different categories of basic array manipulations with examples.
2. What are universal functions in NumPy array? Explain the different advanced
features of universal functions.
3. Discuss and demonstrate some of built-in aggregation functions in NumPy.
4. What is broadcasting in NumPy? Discuss the different rules of broadcasting with
examples
5. What is Boolean masking in NumPy? Explain with an example.
6. What is fancy indexing in NumPy? Discuss and demonstrate fancy indexing in
NumPy.
7. Demonstrate the use of structured arrays and record arrays in NumPy.
8. How can fancy indexing be combined with other indexing schemes?
9. Illustrate different attributes of NumPy arrays with example.
10. Write short note on Computation on NumPy arrays
Assignment Questions:
1. Write a Python program to demonstrate the attributes of arrays in NumPy.
2. Write a Python program to demonstrate the indexing of arrays in NumPy.
3. Write a Python program to demonstrate the slicing of arrays in NumPy.
4. Write a Python program to demonstrate the reshaping of arrays in NumPy.
5. Write a Python program to demonstrate the joining and splitting of arrays in
NumPy.
6. Write a Python program to demonstrate the aggregation universal functions in
NumPy.
7. Write a Python program to demonstrate broadcasting in NumPy.
8. Write a Python program to demonstrate Boolean masking in NumPy.
9. Write a Python program to demonstrate fancy indexing in NumPy.
10. Write a Python program to demonstrate the use of structured arrays and record
arrays in NumPy.