Unit 7 Python Libraries For Data Science

Uploaded by

utsavsutariya066


Unit 7 Python Libraries for Data Science

7.1 Numeric Python – NumPy

7.1.1 Introduction to NumPy
➢ NumPy is a Python library that is used for working with large, multi-
dimensional arrays and matrices.
➢ It provides a high-performance multidimensional array object and
tools for working with these arrays.
➢ The core functionality of NumPy is provided by its `ndarray` (n-
dimensional array) object, which is used to hold and manipulate
arrays of homogeneous data types.
➢ NumPy provides a wide range of mathematical operations that can
be performed on arrays, including basic arithmetic operations,
mathematical functions, statistical functions, linear algebra
operations, and more.
➢ NumPy arrays can be created from Python lists, tuples, or other
sequences, or they can be created using built-in functions such as
`zeros` and `ones`, or the functions in the `random` module.
➢ The shape of a NumPy array is defined by its `shape` attribute,
which gives the dimensions of the array in the form of a tuple.
➢ NumPy arrays can be sliced and indexed just like Python lists or
tuples, using square brackets and integers or slicing notation.
➢ NumPy provides a wide range of functions for manipulating and
transforming arrays, including operations such as reshape,
concatenate, split, and transpose.
➢ NumPy also provides functionality for reading and writing array
data to and from disk in its own binary formats (`.npy`, `.npz`) and
in plain text. Other formats such as HDF5 require a separate library,
for example h5py.
➢ Finally, NumPy is widely used in scientific computing, data
analysis, machine learning, and other areas where high-
performance numerical computing is needed.
Some key features of NumPy:
1. Efficient memory usage
2. Support for multidimensional arrays
3. Mathematical operations
4. Universal functions
5. Easy integration with other libraries
6. Fast I/O operations
7. Support for complex data types
8. Broadcasting for operations on arrays with different shapes and sizes.
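The points above can be sketched in a few lines. This is only an illustrative example: the variable names and values are made up for the demonstration and do not come from the notes.

```python
import numpy as np

# Create arrays from nested Python lists and from built-in constructors
a = np.array([[1, 2, 3], [4, 5, 6]])               # 2 x 3 array from nested lists
z = np.zeros((2, 3))                                # 2 x 3 array filled with 0.0
r = np.random.default_rng(seed=0).random((2, 3))    # uniform samples in [0, 1)

print(a.shape)                 # (2, 3)
print(a[0, 1:])                # slice of the first row: [2 3]
print(a.reshape(3, 2).shape)   # (3, 2)
print(a.T.shape)               # (3, 2)

# Broadcasting: the 1-D array is applied to every row of `a`
row = np.array([10, 20, 30])
b = a + row
print(b)
```

Here `b` holds `[[11 22 33], [14 25 36]]`: the shape `(3,)` array is stretched to match the shape `(2, 3)` array without copying it by hand.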

7.1.2 Array Operations using Numpy

NumPy is a Python package whose name stands for 'Numerical Python'. It is the
core library for scientific computing in Python: it contains a powerful
n-dimensional array object and provides tools for integrating C, C++ and other
compiled code. It is also useful for linear algebra, random number generation
and so on. NumPy arrays can likewise be used as an efficient multi-dimensional
container for generic data.
NumPy Array: A NumPy array is a powerful N-dimensional array object; in the
two-dimensional case it is laid out in rows and columns. We can initialize
NumPy arrays from nested Python lists and access their elements. On a
structural level, a NumPy array is made up of a combination of:
• The data pointer, which indicates the memory address of the first byte in
the array.
• The data type (dtype), which describes the kind of elements that are
contained within the array.
• The shape, which gives the size of the array along each dimension.
• The strides, which are the number of bytes that should be skipped in
memory to go to the next element along each dimension.
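These structural pieces can be inspected directly as attributes of an array. The sketch below is illustrative only; the `int32` dtype is chosen so that the byte counts are easy to follow.

```python
import numpy as np

# 6 four-byte integers arranged as 2 rows x 3 columns
arr = np.arange(6, dtype=np.int32).reshape(2, 3)

print(arr.dtype)      # int32
print(arr.itemsize)   # 4 bytes per element
print(arr.shape)      # (2, 3)
print(arr.strides)    # (12, 4): 12 bytes to the next row, 4 to the next column

# Transposing swaps shape and strides but reuses the same memory
t = arr.T
print(t.shape, t.strides)         # (3, 2) (4, 12)
print(np.shares_memory(arr, t))   # True
```

Because the transpose only changes the shape and strides, no data is copied.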
Operations on Numpy Array
Arithmetic Operations:

# Python code to perform arithmetic

# operations on NumPy array
import numpy as np

# Initializing the array

# np.float64 is used because the np.float_ alias was removed in NumPy 2.0
arr1 = np.arange(4, dtype = np.float64).reshape(2, 2)

print('First array:')
print(arr1)

print('\nSecond array:')
arr2 = np.array([12, 12])
print(arr2)

print('\nAdding the two arrays:')

print(np.add(arr1, arr2))

print('\nSubtracting the two arrays:')

print(np.subtract(arr1, arr2))

print('\nMultiplying the two arrays:')

print(np.multiply(arr1, arr2))

print('\nDividing the two arrays:')

print(np.divide(arr1, arr2))

Output:
First array:
[[ 0. 1.]
[ 2. 3.]]
Second array:
[12 12]
Adding the two arrays:
[[ 12. 13.]
[ 14. 15.]]
Subtracting the two arrays:
[[-12. -11.]
[-10. -9.]]
Multiplying the two arrays:
[[ 0. 12.]
[ 24. 36.]]
Dividing the two arrays:
[[ 0. 0.08333333]
[ 0.16666667 0.25 ]]
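In the example above the second array has shape (2,) while the first has shape (2, 2), so NumPy broadcasts the second array across each row. The following sketch, which reuses the same two arrays, also shows that the ordinary arithmetic operators give the same results as the named functions.

```python
import numpy as np

arr1 = np.arange(4, dtype=np.float64).reshape(2, 2)
arr2 = np.array([12, 12])

# The operators +, -, *, / are element-wise, just like the named functions
same_add = np.array_equal(arr1 + arr2, np.add(arr1, arr2))
same_sub = np.array_equal(arr1 - arr2, np.subtract(arr1, arr2))
same_mul = np.array_equal(arr1 * arr2, np.multiply(arr1, arr2))
same_div = np.array_equal(arr1 / arr2, np.divide(arr1, arr2))
print(same_add, same_sub, same_mul, same_div)   # True True True True

# Shape of the result after broadcasting (2, 2) with (2,)
result_shape = np.broadcast(arr1, arr2).shape
print(result_shape)   # (2, 2)
```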
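numpy.reciprocal() This function returns the reciprocal of its argument,
element-wise. For arrays of integer dtype, elements with absolute values larger
than 1 always give 0 because the result is truncated to an integer, and a
runtime warning is issued for integer 0. Floating-point arrays return the true
reciprocal. Example: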

# Python code to perform reciprocal operation

# on NumPy array
import numpy as np
arr = np.array([25, 1.33, 1, 1, 100])
print('Our array is:')
print(arr)

print('\nAfter applying reciprocal function:')

print(np.reciprocal(arr))

arr2 = np.array([25], dtype = int)

print('\nThe second array is:')
print(arr2)

print('\nAfter applying reciprocal function:')

print(np.reciprocal(arr2))

Output
Our array is:
[ 25. 1.33 1. 1. 100. ]
After applying reciprocal function:
[ 0.04 0.7518797 1. 1. 0.01 ]
The second array is:
[25]
After applying reciprocal function:
[0]
numpy.power() This function treats elements in the first input array as the
base and returns it raised to the power of the corresponding element in the
second input array.

# Python code to perform power operation

# on NumPy array
import numpy as np
arr = np.array([5, 10, 15])

print('First array is:')

print(arr)

print('\nApplying power function:')

print(np.power(arr, 2))

print('\nSecond array is:')

arr1 = np.array([1, 2, 3])
print(arr1)

print('\nApplying power function again:')

print(np.power(arr, arr1))

Output:
First array is:
[ 5 10 15]
Applying power function:
[ 25 100 225]
Second array is:
[1 2 3]
Applying power function again:
[ 5 100 3375]
numpy.mod() This function returns the remainder of division of the
corresponding elements in the input array. The function numpy.remainder()
also produces the same result.

# Python code to perform mod function

# on NumPy array
import numpy as np
arr = np.array([5, 15, 20])
arr1 = np.array([2, 5, 9])
print('First array:')
print(arr)
print('\nSecond array:')
print(arr1)
print('\nApplying mod() function:')
print(np.mod(arr, arr1))
print('\nApplying remainder() function:')
print(np.remainder(arr, arr1))

Output:
First array:
[ 5 15 20]
Second array:
[2 5 9]
Applying mod() function:
[1 0 2]
Applying remainder() function:
[1 0 2]
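With positive inputs the behaviour is straightforward, but the sign of the result matters once negative numbers are involved. The sketch below is an added illustration: `np.fmod` is not covered in these notes and is included only for contrast.

```python
import numpy as np

a = np.array([-5, 5])

# mod() and remainder() follow Python's % operator:
# the result takes the sign of the divisor
m = np.mod(a, 3)
r = np.remainder(a, 3)
print(m, r)   # [1 2] [1 2]

# fmod() follows the C convention: the result takes the sign of the dividend
f = np.fmod(a, 3)
print(f)      # [-2  2]
```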
7.1.3 N-dimensional Array Processing
Numpy is mainly used for working with n-dimensional arrays. Numpy arrays are
homogeneous, meaning all elements must be of the same data type. They can
have any number of dimensions, but most commonly used are 1D, 2D, and 3D
arrays.
1. 1D arrays: These are also known as vectors and are created using the
`np.array()` function.

Example:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)

Output:
[1 2 3 4 5]
2. 2D arrays: These are also known as matrices and are created using the
`np.array()` function with multiple nested lists.

Example:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

Output:
[[1 2 3]
[4 5 6]]
3. 3D arrays: These are created using the `np.array()` function with lists
nested three levels deep.
Example:
import numpy as np
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr)

Output:
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
➢ Numpy also provides a range of functions to create n-dimensional arrays
such as `np.zeros()`, `np.ones()`, `np.eye()`, `np.random.random()`,
`np.empty()` etc.

❖ Example:
import numpy as np
arr1 = np.zeros((2, 3, 4))
arr2 = np.ones((2, 3))
arr3 = np.eye(5)
arr4 = np.random.random((2, 3))

print(arr1)
print(arr2)
print(arr3)
print(arr4)

Output:
[[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]

[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]]
[[1. 1. 1.]
[1. 1. 1.]]
[[1. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1.]]
[[0.43407942 0.37427243 0.46211803]
[0.84423743 0.80177559 0.23460201]]

➢ Operations on n-dimensional arrays follow the same principles as the 1D
arrays, but with more complex indexing and slicing.
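As a brief illustration of indexing and slicing in more than one dimension, the following sketch uses a 3D array whose values are simply the numbers 0 to 23 (chosen only so that positions are easy to check).

```python
import numpy as np

# 2 blocks, each with 3 rows and 4 columns
arr = np.arange(24).reshape(2, 3, 4)

print(arr[1, 2, 3])          # single element: 23
print(arr[0].shape)          # first block: (3, 4)
print(arr[:, 1, :].shape)    # second row of every block: (2, 4)

# The ellipsis stands for "all remaining leading dimensions"
first_cols = arr[..., 0]
print(first_cols)            # [[ 0  4  8] [12 16 20]]

# Reductions take an axis argument; summing over axis 0 adds the two blocks
block_sum = arr.sum(axis=0)
print(block_sum.shape)       # (3, 4)
```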

7.2 Data Analysis – Pandas

7.2.1 Introduction to Pandas
=> Pandas is an open-source library that is made mainly for working with
relational or labeled data both easily and intuitively. It provides various data
structures and operations for manipulating numerical data and time series. This
library is built on top of the NumPy library, which lets it deliver high
performance while keeping users productive.
Advantages
• Fast and efficient for manipulating and analyzing data.
• Data from different file objects can be loaded.
• Easy handling of missing data (represented as NaN) in floating point as
well as non-floating point data
• Size mutability: columns can be inserted and deleted from DataFrame
and higher dimensional objects
• Data set merging and joining.
• Flexible reshaping and pivoting of data sets
• Provides time-series functionality.
• Powerful group by functionality for performing split-apply-combine
operations on data sets.

What Can Pandas Do?

Pandas gives you answers about the data. Like:

• Is there a correlation between two or more columns?

• What is average value?
• Max value?
• Min value?

Pandas is also able to delete rows that are not relevant or that contain wrong
values, such as empty or NULL values. This is called cleaning the data.
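The questions above can be answered with a handful of method calls. This is a minimal sketch on a small made-up table; the column names `hours` and `score` and all values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row has a missing value in 'hours'
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, np.nan],
    "score": [52.0, 61.0, 70.0, 79.0, 88.0],
})

# Cleaning: drop rows that contain missing values
clean = df.dropna()
print(len(clean))                 # 4

# Average, max and min of a column
avg_score = clean["score"].mean()
print(avg_score)                  # 65.5
print(clean["score"].max(), clean["score"].min())   # 79.0 52.0

# Correlation between two columns (close to 1.0 here, as score rises linearly)
corr = clean["hours"].corr(clean["score"])
print(corr)
```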

7.2.2 Pandas Objects - Series and Dataframes

❖ Pandas is a Python library used primarily for data manipulation and
analysis. It provides two main data structures: Series and DataFrame.

Series:
A Series is a one-dimensional labeled array capable of holding data of any type.
It can be created using a list or array, and it contains both the data and index
labels. The index can be customized to make it easier to work with the data.
Example:
import pandas as pd
# Creating a simple Series
s = pd.Series([1, 2, 3, 4, 5])
# Using a custom index
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
print(s)
Output:
a 1
b 2
c 3
d 4
e 5
dtype: int64

DataFrame:
❖ A DataFrame is a two-dimensional table in which the columns can have
different types. It can be thought of as a dictionary of Series objects
where each Series represents a column. It can be created using lists,
dictionaries, or other DataFrame objects. It also contains both the data
and index labels.
Example:
import pandas as pd
# Creating a simple DataFrame using a dictionary
data = {'name': ['John', 'Jane', 'James', 'Emily'],
'age': [30, 25, 35, 28]}
df = pd.DataFrame(data)
print(df)
Output:
name age
0 John 30
1 Jane 25
2 James 35
3 Emily 28

Pandas provides many built-in functions and methods to work with these
data structures, including but not limited to:
- Importing and Exporting: Pandas supports reading data from and writing data
to many different file formats including CSV, Excel, JSON, SQL databases and
more.
- Selection and Indexing: Pandas supports advanced data selection and
indexing functionality, including Boolean indexing, label-based indexing, and
more.
- Data cleaning and transformation: DataFrames can be manipulated using
built-in or custom functions, and missing data can be addressed using
interpolation or deletion.
- Aggregation and Grouping: Pandas supports aggregation and grouping
functionality including groupby, pivot tables, and cross-tabulation.
Pandas is a powerful tool that makes data analysis tasks easier and more
efficient.
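To make the relationship between the two objects and the selection styles concrete, here is a short sketch that reuses the name/age table from the DataFrame example above.

```python
import pandas as pd

df = pd.DataFrame({"name": ["John", "Jane", "James", "Emily"],
                   "age": [30, 25, 35, 28]})

# Selecting one column of a DataFrame returns a Series
ages = df["age"]
print(type(ages).__name__)       # Series

# Boolean indexing: keep only rows where the condition is True
older = df[df["age"] > 28]
print(older["name"].tolist())    # ['John', 'James']

# Label-based access with loc, position-based access with iloc
print(df.loc[1, "name"])         # Jane
print(df.iloc[0, 1])             # 30
```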

7.2.3 Dataframe Operations

In addition to the basic functionality discussed above, pandas provides a wide
range of operations and methods for manipulating and analyzing data in
DataFrames. Some of the most commonly used operations are:
1. Adding and removing columns: new columns can be added to a DataFrame
using assignment or the `insert()` method. Columns can be removed using the
`drop()` method, either by specifying the column name or index.
2. Filtering and selecting data: Boolean indexing can be used to filter rows of
data based on a condition. Data can be selected by specifying the column name,
using Boolean conditions, or using the `loc[]` (label-based) and `iloc[]`
(position-based) indexers.
3. Sorting data: DataFrames can be sorted by one or more columns, either in
ascending or descending order, using the `sort_values()` method.
4. Aggregating data: Pandas provides methods for computing aggregate
statistics on data, including mean, median, standard deviation, and more.
These methods can be applied to individual columns or to the entire
DataFrame.
5. Grouping data: The `groupby()` method can be used to group data based on
one or more columns, and then apply aggregate functions to each group.
6. Handling missing data: Missing data can be handled using the `fillna()`
method to replace missing values, or the `dropna()` method to remove rows or
columns with missing values.
7. Merging and joining data: Multiple DataFrames can be merged or joined
together based on common columns using the `merge()` method.
8. Reshaping data: DataFrames can be reshaped using the `pivot()` and `melt()`
methods, which allow data to be transformed from wide to long or vice versa.
=> Overall, pandas provides a powerful set of tools for data manipulation and
analysis, making it an essential tool for anyone working with data in Python.
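Several of the operations listed above can be chained on one small table. The sketch below uses made-up sales data; every column name and value is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in 'units'
sales = pd.DataFrame({
    "city": ["Pune", "Pune", "Surat", "Surat"],
    "units": [10, np.nan, 7, 5],
    "price": [2.0, 2.0, 3.0, 3.0],
})

# Handling missing data: replace NaN with 0
sales["units"] = sales["units"].fillna(0)

# Adding a column by assignment
sales["revenue"] = sales["units"] * sales["price"]

# Grouping and aggregating: total revenue per city
by_city = sales.groupby("city")["revenue"].sum()
print(by_city)

# Sorting in descending order of revenue
ranked = sales.sort_values("revenue", ascending=False)

# Merging with a second table on the common 'city' column
regions = pd.DataFrame({"city": ["Pune", "Surat"],
                        "state": ["Maharashtra", "Gujarat"]})
merged = sales.merge(regions, on="city")

# Removing a column by name
trimmed = merged.drop(columns=["price"])
print(trimmed)
```

Each step returns a new object (apart from the two column assignments), so the intermediate results can be inspected one at a time.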

7.2.4 Reading and Writing Files

Python provides built-in functions for creating, writing, and reading files. Two
types of files can be handled in Python: normal text files and binary files
(stored as raw bytes).
• Text files: In this type of file, each line of text is terminated with a special
character called EOL (End of Line), which is the newline character ('\n')
in Python by default.
• Binary files: In this type of file, there is no terminator for a line, and the
data is stored as raw bytes rather than as decoded text.
In this section, we will be focusing on opening, closing, reading, and writing data
in a text file.
Writing to a file
There are two ways to write to a file.
1. write() : Writes the string str1 to the text file. No newline is added
automatically.
File_object.write(str1)
2. writelines() : For a list of string elements, each string is written to the
text file in order. Used to write multiple strings at a single time; no
newlines are added between them.
File_object.writelines(L) for L = [str1, str2, str3]
Reading from a file
There are three ways to read data from a text file.
1. read() : Returns the data read as a string. Reads at most n characters; if
no n is specified, reads the entire file.
File_object.read([n])
2. readline() : Reads a line of the file and returns it as a string. For a
specified n, reads at most n characters. However, it does not read more than
one line, even if n exceeds the length of the line.
File_object.readline([n])
3. readlines() : Reads all the lines and returns them as a list in which each
line is a string element.
File_object.readlines()
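The write and read methods above can be tried together on a temporary file. This is a minimal sketch: the file name `demo.txt` and its contents are placeholders, and the `with` statement is used so that each file is closed automatically.

```python
import os
import tempfile

lines = ["first line\n", "second line\n", "third line\n"]
path = os.path.join(tempfile.mkdtemp(), "demo.txt")   # hypothetical file name

# Writing: write() for one string, writelines() for a list of strings
with open(path, "w") as f:
    f.write("header\n")
    f.writelines(lines)      # newlines must already be part of each string

# Reading: the three methods continue from the current file position
with open(path, "r") as f:
    head = f.read(6)               # first 6 characters
    rest_of_line = f.readline()    # remainder of the first line
    remaining = f.readlines()      # all remaining lines as a list

print(repr(head))            # 'header'
print(repr(rest_of_line))    # '\n'
print(remaining)
```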

7.3 Plotting Graphs using Matplotlib

7.3.1 Plot Creation
This section introduces graphing in Python with Matplotlib, which is
arguably the most popular graphing and data visualization library for Python.
Installation
The easiest way to install matplotlib is to use pip. Type following command in
terminal:

pip install matplotlib

Alternatively, you can download it and install it manually.
Getting started ( Plotting a line)

# importing the required module

import matplotlib.pyplot as plt

# x axis values
x = [1,2,3]
# corresponding y axis values
y = [2,4,1]

# plotting the points

plt.plot(x, y)

# naming the x axis

plt.xlabel('x - axis')
# naming the y axis
plt.ylabel('y - axis')

# giving a title to my graph

plt.title('My first graph!')

# function to show the plot

plt.show()

Output: a line plot through the points (1, 2), (2, 4) and (3, 1).
The code follows these steps:
• Define the x-axis and corresponding y-axis values as lists.
• Plot them on canvas using .plot() function.
• Give a name to x-axis and y-axis using .xlabel() and .ylabel() functions.
• Give a title to your plot using .title() function.
• Finally, to view your plot, we use .show() function.

Plotting two or more lines on same plot


import matplotlib.pyplot as plt

# line 1 points
x1 = [1,2,3]
y1 = [2,4,1]
# plotting the line 1 points
plt.plot(x1, y1, label = "line 1")

# line 2 points
x2 = [1,2,3]
y2 = [4,1,3]
# plotting the line 2 points
plt.plot(x2, y2, label = "line 2")

# naming the x axis

plt.xlabel('x - axis')
# naming the y axis
plt.ylabel('y - axis')
# giving a title to my graph
plt.title('Two lines on same graph!')

# show a legend on the plot

plt.legend()

# function to show the plot

plt.show()

Output:
• Here, we plot two lines on the same graph. We differentiate between
them by giving them a name(label) which is passed as an argument of
the .plot() function.
• The small rectangular box giving information about the type of line and
its color is called a legend. We can add a legend to our plot
using .legend() function.

Customization of Plots
Here, we discuss some elementary customizations applicable to almost any
plot.

import matplotlib.pyplot as plt

# x axis values
x = [1,2,3,4,5,6]
# corresponding y axis values
y = [2,4,1,5,2,6]

# plotting the points

plt.plot(x, y, color='green', linestyle='dashed', linewidth = 3,
marker='o', markerfacecolor='blue', markersize=12)

# setting x and y axis range

plt.ylim(1,8)
plt.xlim(1,8)

# naming the x axis

plt.xlabel('x - axis')
# naming the y axis
plt.ylabel('y - axis')

# giving a title to my graph

plt.title('Some cool customizations!')

# function to show the plot

plt.show()

Output:
As you can see, we have done several customizations like
• setting the line-width, line-style, line-color.
• setting the marker, marker’s face color, marker’s size.
• overriding the x and y-axis range. If overriding is not done, pyplot
module uses the auto-scale feature to set the axis range and scale.

Bar Chart

import matplotlib.pyplot as plt

# x-coordinates of the bars (bar centers, because align='center' is the default)

x_pos = [1, 2, 3, 4, 5]

# heights of bars
height = [10, 24, 36, 40, 5]

# labels for bars

tick_label = ['one', 'two', 'three', 'four', 'five']

# plotting a bar chart

plt.bar(x_pos, height, tick_label = tick_label,
        width = 0.8, color = ['red', 'green'])

# naming the x-axis

plt.xlabel('x - axis')
# naming the y-axis
plt.ylabel('y - axis')
# plot title
plt.title('My bar chart!')

# function to show the plot

plt.show()

Output :
• Here, we use plt.bar() function to plot a bar chart.
• x-coordinates of the bars are passed along with the heights of bars. By
default each x-coordinate marks the center of its bar; pass align='edge'
to treat it as the left side instead.
• You can also give names to the x-axis coordinates by passing
the tick_label argument.

Histogram

import matplotlib.pyplot as plt

# ages of the people surveyed (raw observations, not frequencies)
ages = [2,5,70,40,30,45,50,45,43,40,44,
        60,7,13,57,18,90,77,32,21,20,40]
# setting the range and no. of intervals
# (named value_range to avoid shadowing the built-in range)
value_range = (0, 100)
bins = 10

# plotting a histogram
plt.hist(ages, bins = bins, range = value_range, color = 'green',
         histtype = 'bar', rwidth = 0.8)

# x-axis label
plt.xlabel('age')
# frequency label
plt.ylabel('No. of people')
# plot title
plt.title('My histogram')

# function to show the plot

plt.show()

Output:
• Here, we use plt.hist() function to plot a histogram.
• The raw observations are passed as the ages list; plt.hist() counts the
frequencies itself.
• The range is set by defining a tuple containing min and max
values.
• The next step is to "bin" the range of values (that is, divide the entire
range of values into a series of intervals) and then count how many
values fall into each interval. Here we have defined bins = 10. So, there
are a total of 10 intervals, each 100/10 = 10 units wide.
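The binning step can be checked numerically without drawing anything. The sketch below uses NumPy's `np.histogram` on the same ages list; it is offered as a way to inspect the counts, on the assumption that it bins the data the same way the plot does for these settings.

```python
import numpy as np

ages = [2, 5, 70, 40, 30, 45, 50, 45, 43, 40, 44,
        60, 7, 13, 57, 18, 90, 77, 32, 21, 20, 40]

# 10 equal-width bins between 0 and 100
counts, edges = np.histogram(ages, bins=10, range=(0, 100))

print(edges)    # bin boundaries: 0, 10, 20, ..., 100
print(counts)   # [3 2 2 2 7 2 1 2 0 1]
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. The tallest bar is the 40 to 50 interval with 7 people, and the counts add up to the 22 values in the list.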

Scatter plot

import matplotlib.pyplot as plt

# x-axis values
x = [1,2,3,4,5,6,7,8,9,10]
# y-axis values
y = [2,4,5,7,6,8,9,11,12,12]

# plotting points as a scatter plot

plt.scatter(x, y, label= "stars", color= "green",
marker= "*", s=30)

# x-axis label
plt.xlabel('x - axis')
# y-axis label
plt.ylabel('y - axis')
# plot title
plt.title('My scatter plot!')
# showing legend
plt.legend()

# function to show the plot

plt.show()

Output:
• Here, we use plt.scatter() function to plot a scatter plot.
• As with a line plot, we define x and corresponding y-axis values here as well.
• marker argument is used to set the character to use as a marker. Its size
can be defined using the s parameter.

Pie-chart

import matplotlib.pyplot as plt

# defining labels
activities = ['eat', 'sleep', 'work', 'play']

# portion covered by each label

slices = [3, 7, 8, 6]

# color for each label

colors = ['r', 'y', 'g', 'b']
# plotting the pie chart
plt.pie(slices, labels = activities, colors=colors,
startangle=90, shadow = True, explode = (0, 0, 0.1, 0),
radius = 1.2, autopct = '%1.1f%%')

# plotting legend
plt.legend()

# showing the plot

plt.show()

The output of above program looks like this:

• Here, we plot a pie chart by using plt.pie() method.

• First of all, we define the labels using a list called activities.
• Then, a portion of each label can be defined using another list
called slices.
• Color for each label is defined using a list called colors.
• shadow = True will draw a shadow beneath the pie chart.
• startangle rotates the start of the pie chart by given degrees
counterclockwise from the x-axis.
• explode is used to set the fraction of radius with which we offset each
wedge.
• autopct is used to format the value of each label. Here, we have set it to
show the percentage value only up to 1 decimal place.

Plotting curves of given equation


# importing the required modules

import matplotlib.pyplot as plt
import numpy as np

# setting the x - coordinates

x = np.arange(0, 2*(np.pi), 0.1)
# setting the corresponding y - coordinates
y = np.sin(x)

# plotting the points

plt.plot(x, y)

# function to show the plot

plt.show()

The output of the above program looks like this:

Here, we use NumPy, which is a general-purpose array-processing package in
Python.

• To set the x-axis values, we use the np.arange() method in which the first
two arguments are for range and the third one for step-wise increment.
The result is a NumPy array.
• To get corresponding y-axis values, we simply use the
predefined np.sin() method on the NumPy array.
• Finally, we plot the points by passing x and y arrays to
the plt.plot() function.
So, in this part, we discussed various types of plots we can create in Matplotlib.
There are more plots that haven't been covered, but the most significant ones
are discussed here.

7.3.2 Plot Routines

Matplotlib, which is the most widely used plotting library in Python, provides a
variety of plot routines to create different types of plots. Here are some
commonly used plot routines in Matplotlib:
1. plot(): This routine is used to create line plots. It takes the x and y data as
inputs and can also be used to customize the line style, color, and marker
type.
2. scatter(): This routine is used to create scatter plots. It takes the x and y
data as inputs and can also be used to customize the marker size, color,
and shape.
3. bar(): This routine is used to create bar charts. It takes the x and y data
as inputs and can also be used to customize the width and color of the
bars.
4. hist(): This routine is used to create histograms. It takes the data as input
and can also be used to customize the number of bins and the color of
the bars.
5. pie(): This routine is used to create pie charts. It takes the data as input
and can also be used to customize the colors and labels of the wedges.
6. boxplot(): This routine is used to create box plots. It takes the data as
input and can also be used to customize the appearance of the boxes
and whiskers.
7. imshow(): This routine is used to create image plots. It takes a 2D array
as input and can also be used to customize the color map and color
scale.
These are just a few examples of the many plot routines available in Matplotlib.
Each routine has a variety of options and parameters that can be used to
customize the appearance and behavior of the plot.
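The boxplot() and imshow() routines have no example elsewhere in this unit, so here is a brief sketch of both. The data is randomly generated for illustration, the routines are called as Axes methods rather than through plt, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: three random samples and a small 2D grid
rng = np.random.default_rng(seed=0)
samples = [rng.normal(loc=m, scale=1.0, size=100) for m in (0, 1, 2)]
grid = np.arange(12).reshape(3, 4)

fig, (ax_box, ax_img) = plt.subplots(1, 2, figsize=(8, 3))

# Box plot: one box per sample
box = ax_box.boxplot(samples)
ax_box.set_title("boxplot()")

# Image plot: each cell of the 2D array is drawn as a colored square
image = ax_img.imshow(grid, cmap="viridis")
fig.colorbar(image, ax=ax_img)   # color scale next to the image
ax_img.set_title("imshow()")
```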

7.3.3 Saving, Showing and Clearing Graphs

After creating a plot in Matplotlib, you may want to save it to a file, display it on
the screen, or clear it to start over with a new plot. Here's how to do each of
these actions:
1. Saving a plot: To save a plot to a file, you can use the savefig() function.
This function takes a filename as input and saves the current figure to
that file. Here's an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
plt.plot(x, y)
plt.savefig('plot.png')
This saves the current plot to a file called plot.png in the current directory.
2. Showing a plot: To display a plot on the screen, you can use the show()
function. This function opens a window showing the current plot. Here's
an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
plt.plot(x, y)
plt.show()
This displays the current plot in a window.
3. Clearing a plot: To clear the current plot and start over with a new plot,
you can use the clf() function. This function clears the current figure and
axes. Here's an example:
import matplotlib.pyplot as plt
x1 = [1, 2, 3, 4, 5]
y1 = [10, 8, 6, 4, 2]
x2 = [1, 3, 5, 7, 9]
y2 = [2, 4, 6, 8, 10]
plt.plot(x1, y1)
plt.show()
plt.clf()
plt.plot(x2, y2)
plt.show()
This creates two plots: the first one with data (x1, y1), displays it, clears it using
clf(), and then creates a second plot with data (x2, y2) and displays it.
These are some basic actions you can perform with Matplotlib plots in Python.
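One practical point about ordering: with many backends, the figure is released once the window opened by show() is closed, so a savefig() call made afterwards may write an empty image. A cautious pattern is to save first. The sketch below assumes a temporary output path and a headless backend; `dpi` and `bbox_inches` are optional arguments of savefig().

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
out_path = os.path.join(tempfile.mkdtemp(), "plot.png")   # hypothetical location

plt.plot(x, y)

# Save before showing; dpi sets the resolution, bbox_inches trims extra margins
plt.savefig(out_path, dpi=150, bbox_inches="tight")

# Close the figure to free its memory once it is no longer needed
plt.close()

saved = os.path.exists(out_path)
print(saved)   # True
```

While clf() empties the current figure so that it can be reused, close() discards the figure altogether.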

7.3.4 Customize Matplotlib

Matplotlib provides a wide range of customization options that allow you to
create professional-looking plots that meet your specific needs. Here are some
of the most commonly used customization options in Matplotlib:
1. Setting plot title, axis labels, and legends: You can use the title(), xlabel(),
ylabel(), and legend() functions to add a title, axis labels, and a legend to
your plot, respectively. Here's an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
plt.plot(x, y)
plt.title('My plot')
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.legend(['Line 1'])
plt.show()
This sets a title, x-axis and y-axis labels, and a legend for the plot.

2. Changing plot colors, line styles, and marker styles: You can use the color,
linestyle, and marker parameters in the plot() function to change the color, line
style, and marker style of the plot, respectively. Here's an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
plt.plot(x, y, color='red', linestyle='--', marker='o')
plt.show()
This changes the color to red, the line style to dashed, and the marker style to circles.

3. Changing plot size and resolution: You can use the figure() function to change
the size and resolution of the plot. Here's an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
fig = plt.figure(figsize=(8, 6), dpi=100)
plt.plot(x, y)
plt.show()
4. Adding grid lines: You can use the grid() function to add grid lines to the plot.
Here's an example:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
plt.plot(x, y)
plt.grid(True)
plt.show()
This adds grid lines to the plot.
These are just a few examples of the many customization options available in
Matplotlib. By exploring the documentation, you can discover many more ways
to customize your plots.
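The options above are set per call. Matplotlib also keeps default settings in `rcParams`, which can be changed temporarily with `plt.rc_context()`. The following is a sketch of that approach, not something covered in these notes; the chosen values are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]

# Defaults set here apply only inside the with-block
style = {"lines.linewidth": 3, "lines.linestyle": "--", "axes.grid": True}
with plt.rc_context(style):
    fig, ax = plt.subplots(figsize=(6, 4))
    (line,) = ax.plot(x, y, marker="o")   # picks up the width and dash style
    ax.set_title("Styled with rcParams")

width_inside = line.get_linewidth()
print(width_inside)             # 3.0
print(line.get_linestyle())     # --
```

Once the block ends, the previous defaults are restored, so later plots are unaffected.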

No ratings yet
PDF The Squat Bible The Ultimate Guide PDF
7 pages
Bios Beep Codes
100% (1)
Bios Beep Codes
5 pages
Creating A Zone in Solaris 11
No ratings yet
Creating A Zone in Solaris 11
12 pages
Contax N/645 - Sony E Full Auto Adapter Ring Mk3 User's Manual (V. 31)
No ratings yet
Contax N/645 - Sony E Full Auto Adapter Ring Mk3 User's Manual (V. 31)
5 pages
Ansible Questions
No ratings yet
Ansible Questions
9 pages
Designing For Interaction: Dan Saffer
No ratings yet
Designing For Interaction: Dan Saffer
38 pages

Unit 7 Python Libraries For Data Science

Uploaded by

Unit 7 Python Libraries For Data Science

Uploaded by

Unit 7 Python Libraries for Data Science

7.1 Numeric Python – NumPy

7.1.2 Array Operations using Numpy

# Python code to perform arithmetic

# Initializing the array

print('\nAdding the two arrays:')

print('\nSubtracting the two arrays:')

print('\nMultiplying the two arrays:')

print('\nDividing the two arrays:')
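As a minimal sketch of the element-wise arithmetic this listing describes, assuming two small sample arrays (the values below are illustrative and not taken from the original notes):

```python
import numpy as np

# Initializing the arrays (hypothetical sample values)
arr1 = np.array([10.0, 20.0, 30.0, 40.0])
arr2 = np.array([2.0, 4.0, 5.0, 8.0])

# Each function works element by element; the operators +, -, *, /
# on arrays give the same results.
added = np.add(arr1, arr2)
subtracted = np.subtract(arr1, arr2)
multiplied = np.multiply(arr1, arr2)
divided = np.divide(arr1, arr2)

print('\nAdding the two arrays:')
print(added)
print('\nSubtracting the two arrays:')
print(subtracted)
print('\nMultiplying the two arrays:')
print(multiplied)
print('\nDividing the two arrays:')
print(divided)
```

Both arrays must have the same shape, or shapes that NumPy can broadcast together.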

# Python code to perform reciprocal operation

print('\nAfter applying reciprocal function:')

arr2 = np.array([25], dtype=int)

print('\nAfter applying reciprocal function:')
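A short sketch of the reciprocal operation, assuming a float array with illustrative values alongside the integer array `arr2` from the listing. It shows why the dtype matters: with an integer dtype the reciprocal of 25 is computed in integer arithmetic and comes out as 0.

```python
import numpy as np

# Float input (hypothetical sample values): reciprocals are exact here
arr = np.array([0.25, 0.5, 2.0, 4.0])
recip_float = np.reciprocal(arr)
print('\nAfter applying reciprocal function:')
print(recip_float)

# Integer input: 1/25 cannot be represented as an integer, so the result is 0
arr2 = np.array([25], dtype=int)
recip_int = np.reciprocal(arr2)
print('\nAfter applying reciprocal function:')
print(recip_int)
```

Converting the array with `arr2.astype(float)` before calling `np.reciprocal` gives 0.04 instead.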

# Python code to perform power operation

print('First array is:')

print('\nApplying power function:')

print('\nSecond array is:')

print('\nApplying power function again:')
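The power operation can be sketched as follows, assuming small integer arrays chosen for illustration. `np.power` accepts either a single exponent or a second array of exponents applied element by element.

```python
import numpy as np

arr = np.array([2, 3, 4])  # hypothetical sample values
print('First array is:')
print(arr)

# Raise every element to the same power
squared = np.power(arr, 2)
print('\nApplying power function:')
print(squared)

exponents = np.array([1, 2, 3])  # hypothetical sample values
print('\nSecond array is:')
print(exponents)

# Raise each element of arr to the matching element of exponents
elementwise = np.power(arr, exponents)
print('\nApplying power function again:')
print(elementwise)
```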

# Python code to perform mod function
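A minimal sketch of the mod operation, assuming two illustrative integer arrays. `np.mod` and `np.remainder` both return the element-wise remainder of division.

```python
import numpy as np

# Hypothetical sample values
arr1 = np.array([10, 20, 30])
arr2 = np.array([3, 5, 7])

mod_result = np.mod(arr1, arr2)
remainder_result = np.remainder(arr1, arr2)  # same result as np.mod

print('Applying mod function:')
print(mod_result)
print('\nApplying remainder function:')
print(remainder_result)
```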

➢ Operations on n-dimensional arrays follow the same principles as the 1D case.
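To illustrate that point, here is a small sketch with 2D arrays (the values are illustrative assumptions). Arithmetic is still element-wise, and reductions such as `sum` take an `axis` argument to choose the direction.

```python
import numpy as np

# Two 2x3 arrays (hypothetical sample values)
a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[10, 20, 30],
              [40, 50, 60]])

total = a + b             # element-wise, same as in the 1D case
scaled = a * 2            # the scalar is applied to every element
col_sums = a.sum(axis=0)  # sum down each column
row_sums = a.sum(axis=1)  # sum across each row

print(total)
print(scaled)
print(col_sums)
print(row_sums)
```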

7.2 Data Analysis – Pandas

What Can Pandas Do?

Pandas helps you answer questions about the data, such as:

• Is there a correlation between two or more columns?
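As a sketch of how the correlation question might be answered, assuming a tiny made-up table (the column names and values are hypothetical), `DataFrame.corr()` returns the pairwise correlation between numeric columns:

```python
import pandas as pd

# Hypothetical data: score rises with hours, errors fall with hours
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [50, 60, 70, 80, 90],
    "errors": [9, 7, 5, 3, 1],
})

corr = df.corr()  # Pearson correlation by default
print(corr)
```

Values near 1 indicate a strong positive relationship, values near -1 a strong negative one, and values near 0 little linear relationship.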

7.2.2 Pandas Objects - Series and Dataframes
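A minimal sketch of the two objects named in this heading, using made-up values: a Series is a labelled one-dimensional array, and a DataFrame is a table whose columns are Series.

```python
import pandas as pd

# A Series with an explicit index (hypothetical values)
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="values")
print(s)

# A DataFrame built from a dictionary of columns (hypothetical values)
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera"],
    "age": [21, 23, 22],
})
print(df)

# Selecting one column of a DataFrame gives a Series
ages = df["age"]
print(type(ages))
```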

7.2.3 Dataframe Operations
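A few common DataFrame operations can be sketched as follows. The table, its column names, and the particular operations chosen are assumptions for illustration, not a list taken from the original notes.

```python
import pandas as pd

# Hypothetical sales table
df = pd.DataFrame({
    "city": ["Pune", "Surat", "Pune", "Surat"],
    "sales": [100, 150, 200, 50],
})

# Add a column computed from an existing one
df["bonus"] = df["sales"] // 10

# Filter rows with a boolean condition
high = df[df["sales"] > 90]

# Group rows and aggregate
totals = df.groupby("city")["sales"].sum()

# Sort by a column
sorted_df = df.sort_values("sales", ascending=False)

print(high)
print(totals)
print(sorted_df)
```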

7.2.4 Reading and Writing Files
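As a hedged sketch of a write-then-read round trip, assuming CSV as the file format and a temporary directory so that no real file path is needed (the file name and data are hypothetical):

```python
import os
import tempfile

import pandas as pd

# Hypothetical data to save
df = pd.DataFrame({"id": [1, 2, 3], "score": [88.5, 92.0, 79.5]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.csv")

    # index=False keeps the row index out of the file
    df.to_csv(path, index=False)

    # Read the same file back into a new DataFrame
    loaded = pd.read_csv(path)

print(loaded)
```

Pandas offers matching reader and writer pairs for other formats too, such as `read_json` and `to_json`.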

7.3 Plotting Graphs using Matplotlib

pip install matplotlib

# importing the required module

# plotting the points

# naming the x axis

# giving a title to my graph

# function to show the plot
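The steps named in the comments above can be sketched as one short program. The data points and label text are illustrative assumptions. The call to `plt.show()` is left commented out so the sketch also runs where no display is available.

```python
# importing the required module
import matplotlib.pyplot as plt

# Hypothetical data points
x = [1, 2, 3]
y = [2, 4, 1]

plt.figure()  # start a fresh figure

# plotting the points
plt.plot(x, y)

# naming the x axis and the y axis
plt.xlabel('x - axis')
plt.ylabel('y - axis')

# giving a title to the graph
plt.title('My first graph')

# function to show the plot (uncomment in an interactive session)
# plt.show()
```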

Plotting two or more lines on the same plot

import matplotlib.pyplot as plt

# naming the x axis

# show a legend on the plot

# function to show the plot
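A minimal sketch of two lines on one plot with a legend, assuming made-up data. Each `plt.plot` call adds a line to the current axes, and the `label` argument supplies the text that `plt.legend()` displays.

```python
import matplotlib.pyplot as plt

# Hypothetical data for two lines
x1, y1 = [1, 2, 3], [2, 4, 1]
x2, y2 = [1, 2, 3], [4, 1, 3]

plt.figure()  # start a fresh figure

plt.plot(x1, y1, label='line 1')
plt.plot(x2, y2, label='line 2')

# naming the x axis and the y axis
plt.xlabel('x - axis')
plt.ylabel('y - axis')

# show a legend on the plot
plt.legend()

# function to show the plot (uncomment in an interactive session)
# plt.show()
```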

import matplotlib.pyplot as plt

# plotting the points

# setting x and y axis range

# naming the x axis

# giving a title to my graph

# function to show the plot
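The axis-range step can be sketched as follows. The data and the line styling are assumptions for illustration; `plt.xlim` and `plt.ylim` set the visible range of each axis.

```python
import matplotlib.pyplot as plt

# Hypothetical data points
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 1, 5, 2, 6]

plt.figure()  # start a fresh figure

# plotting the points with a custom line style and marker
plt.plot(x, y, color='green', linestyle='dashed', linewidth=3,
         marker='o', markerfacecolor='blue', markersize=12)

# setting x and y axis range
plt.xlim(1, 8)
plt.ylim(1, 8)

# naming the x axis and the y axis
plt.xlabel('x - axis')
plt.ylabel('y - axis')

# giving a title to the graph
plt.title('Customized line plot')

# function to show the plot (uncomment in an interactive session)
# plt.show()
```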

import matplotlib.pyplot as plt

# x-coordinates of left sides of bars

# labels for bars

# plotting a bar chart

# naming the x-axis

# function to show the plot
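A bar chart following the comments above might look like this sketch. The bar positions, heights, and labels are made-up values.

```python
import matplotlib.pyplot as plt

# x-coordinates of the bars (hypothetical)
left = [1, 2, 3, 4, 5]

# heights of the bars (hypothetical)
height = [10, 24, 36, 40, 5]

# labels for bars
tick_label = ['one', 'two', 'three', 'four', 'five']

plt.figure()  # start a fresh figure

# plotting a bar chart
plt.bar(left, height, tick_label=tick_label, width=0.8, color='green')

# naming the x-axis and the y-axis
plt.xlabel('x - axis')
plt.ylabel('y - axis')

# function to show the plot (uncomment in an interactive session)
# plt.show()
```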

import matplotlib.pyplot as plt

# function to show the plot

import matplotlib.pyplot as plt

# plotting points as a scatter plot

# function to show the plot
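A scatter plot can be sketched in the same style, assuming ten made-up points. `plt.scatter` draws one marker per (x, y) pair and does not connect them.

```python
import matplotlib.pyplot as plt

# Hypothetical data points
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 4, 5, 7, 6, 8, 9, 11, 12, 12]

plt.figure()  # start a fresh figure

# plotting points as a scatter plot
plt.scatter(x, y, label='stars', color='green', marker='*', s=30)

plt.xlabel('x - axis')
plt.ylabel('y - axis')
plt.legend()

# function to show the plot (uncomment in an interactive session)
# plt.show()
```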

import matplotlib.pyplot as plt

# portion covered by each label

# color for each label

# showing the plot

The output of the above program looks like this:

• Here, we plot a pie chart using the plt.pie() method.
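A minimal sketch of such a pie chart, assuming four made-up categories. The labels, slice sizes, and colors are hypothetical.

```python
import matplotlib.pyplot as plt

# labels for each slice (hypothetical)
activities = ['eat', 'sleep', 'work', 'play']

# portion covered by each label
slices = [3, 7, 8, 6]

# color for each label
colors = ['red', 'yellow', 'green', 'blue']

plt.figure()  # start a fresh figure

# autopct prints each slice's share as a percentage
plt.pie(slices, labels=activities, colors=colors,
        startangle=90, autopct='%1.1f%%')

# showing the plot (uncomment in an interactive session)
# plt.show()
```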

Plotting curves of a given equation

# importing the required modules

# setting the x - coordinates

# plotting the points

# function to show the plot

Here, we use NumPy, which is a general-purpose array-processing package in Python.
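As a sketch of plotting a curve from an equation, assuming y = sin(x) as the example equation (the original equation is not shown here), NumPy generates the x-coordinates and evaluates the function on all of them at once:

```python
# importing the required modules
import matplotlib.pyplot as plt
import numpy as np

# setting the x - coordinates: 0 up to (but not including) 2*pi, step 0.1
x = np.arange(0, 2 * np.pi, 0.1)

# setting the corresponding y - coordinates
y = np.sin(x)

plt.figure()  # start a fresh figure

# plotting the points
plt.plot(x, y)

# function to show the plot (uncomment in an interactive session)
# plt.show()
```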

7.3.2 Plot Routines

7.3.3 Saving, Showing and Clearing Graphs
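The three actions in this heading can be sketched as below. The file name and data are hypothetical, and a temporary directory is assumed so that nothing is left on disk.

```python
import os
import tempfile

import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot([1, 2, 3], [2, 4, 1])  # hypothetical data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.png")

    # Saving: write the current figure to a file
    plt.savefig(path)
    saved_ok = os.path.exists(path) and os.path.getsize(path) > 0

# Showing: opens a window (uncomment in an interactive session)
# plt.show()

# Clearing: remove everything drawn on the current figure
plt.clf()
axes_after_clear = len(fig.axes)

# Closing releases the figure's memory
plt.close(fig)
```

Calling `plt.savefig` before `plt.show()` is the safer order, since some backends discard the figure once the window is closed.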

7.3.4 Customize Matplotlib
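One way to customize Matplotlib's defaults is through its runtime configuration (rc) settings. The sketch below assumes two settings chosen for illustration and uses `plt.rc_context` so the changes apply only inside the `with` block.

```python
import matplotlib.pyplot as plt

# Hypothetical choice of settings: thicker lines and a background grid
custom = {"lines.linewidth": 3, "axes.grid": True}

with plt.rc_context(custom):
    plt.figure()
    (line,) = plt.plot([1, 2, 3], [2, 4, 1])  # hypothetical data
    # plt.show()  # uncomment in an interactive session

print(line.get_linewidth())
```

To change defaults for a whole session instead, the same keys can be assigned on `plt.rcParams`.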
