python-notes-BCC-302 (Unit - 05)
Features of NumPy
Besides its obvious scientific uses, NumPy can also be used as an efficient
multi-dimensional container of generic data. Arbitrary data types can be
defined in NumPy, which allows it to integrate quickly and seamlessly with a
wide variety of databases.
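As a small illustration of defining an arbitrary data type, the sketch below builds a structured dtype that resembles a database record. The field names and the sample values are hypothetical and are not taken from these notes.

```python
import numpy as np

# Hypothetical record layout: a name (up to 10 characters), an age and a weight
person = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])

# Each tuple becomes one record of the structured array
people = np.array([("Asha", 31, 58.5), ("Ravi", 45, 72.0)], dtype=person)

# A field name selects that field across all records
print(people["name"])

# Fields can be used in conditions to filter records
older = people[people["age"] > 40]
print(older)
```

Each record behaves like a row of a table, which is why such arrays map naturally onto database-style data.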
Arrays in NumPy
NumPy’s array class is called ndarray. It is also known by the alias array.
There are several ways to create a NumPy array in Python. They are as follows:
1. You can create an array from a regular Python list or tuple using the array()
function. The type of the resulting array is deduced from the type of the elements
in the sequences. Let’s see this implementation:
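import numpy as np
# Create an array from a list; the element type is given explicitly as float
a = np.array([[1, 2, 4], [5, 8, 7]], dtype=float)
print(a)
# Create an array from a tuple; the element type is deduced as integer
b = np.array((1, 3, 2))
print(b)
Output:
[[1. 2. 4.]
[5. 8. 7.]]
[1 3 2]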
2. Often, the elements of an array are originally unknown, but its size is known.
Hence, NumPy offers several functions to create arrays with initial placeholder
content. These minimize the need to grow arrays, which is an expensive
operation. For example: np.zeros, np.ones, np.full, np.empty, etc.
c = np.zeros((3, 4))
print("An array initialized with all zeros:\n", c)
# The random values differ on every run
e = np.random.random((2, 2))
print("A random array:\n", e)
Output:
An array initialized with all zeros:
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
A random array:
[[0.15471821 0.47506745]
[0.03637972 0.15772238]]
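The other placeholder functions named above work in the same way. The following is a minimal sketch; the shapes and the fill value 7 are chosen only for illustration.

```python
import numpy as np

# 2x3 array filled with 1.0
ones = np.ones((2, 3))

# 2x2 array filled with a constant value (7 is an arbitrary choice)
sevens = np.full((2, 2), 7)

# 2x2 array whose contents are uninitialised: the values are whatever
# happened to be in memory, so they must be assigned before use
scratch = np.empty((2, 2))

print(ones)
print(sevens)
print(scratch.shape)
```

np.empty is the fastest of these because it skips initialisation, so it is useful only when every element will be overwritten afterwards.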
3. arange: This function returns evenly spaced values within a given
interval. The step size is specified.
f = np.arange(0, 30, 5)
print(f)
Output:
[ 0 5 10 15 20 25]
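4. linspace: This function returns evenly spaced values within a given
interval. The number of values is specified.
g = np.linspace(0, 5, 10)
print(g)
Output:
[0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
3.33333333 3.88888889 4.44444444 5. ]
5. Reshaping an array: The reshape method changes the shape of an array
without changing its data. The new shape must hold the same number of
elements as the original one.
arr = np.array([[1, 2, 3, 4],
[5, 2, 4, 2],
[1, 2, 0, 1]])
newarr = arr.reshape(2, 2, 3)
print("Original array:\n", arr)
print("---------------")
print("Reshaped array:\n", newarr)
Output: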
Original array:
[[1 2 3 4]
[5 2 4 2]
[1 2 0 1]]
---------------
Reshaped array:
[[[1 2 3]
[4 5 2]]
[[4 2 1]
[2 0 1]]]
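The reshape example above works because 3 x 4 and 2 x 2 x 3 both hold 12 elements. The sketch below, using an illustrative array that is not from these notes, shows two related points: passing -1 lets NumPy work out one dimension, and a shape with the wrong number of elements is rejected.

```python
import numpy as np

arr = np.arange(12)          # 12 elements: 0, 1, ..., 11

# -1 asks NumPy to infer the missing dimension (12 / 3 = 4)
auto = arr.reshape(3, -1)
print(auto.shape)

# 5 x 2 needs 10 elements, so this reshape raises ValueError
try:
    arr.reshape(5, 2)
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```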
NumPy Array Indexing
Knowing the basics of NumPy array indexing is important for analyzing and
manipulating the array object. NumPy in Python offers many ways to do array
indexing.
Slicing: Just like lists in Python, NumPy arrays can be sliced. As arrays can
be multidimensional, you need to specify a slice for each dimension of the
array.
Integer array indexing: In this method, a list of indices is passed for
each dimension. One-to-one mapping of corresponding elements is done to
construct a new arbitrary array.
Boolean array indexing: This method is used to pick the elements that
satisfy some condition.
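# indexing in numpy
import numpy as np
# An exemplar array (entries that the output below does not determine are assumed)
arr = np.array([[-1, 2, 0, 4],
[4, -0.5, 6, 0],
[2.6, 0, 7, 8],
[3, -7, 4, 2.0]])
# Slicing array: first 2 rows, alternate columns (0 and 2)
temp = arr[:2, ::2]
print(temp)
# Integer array indexing: elements at (0, 3), (1, 2), (2, 1), (3, 0)
temp = arr[[0, 1, 2, 3], [3, 2, 1, 0]]
print(temp)
# Boolean array indexing: elements greater than 0
cond = arr > 0
temp = arr[cond]
print(temp)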
Output:
[[-1. 0.]
[ 4. 6.]]
[ 4. 6. 0. 3.]
[ 2. 4. 4. 6. 2.6 7. 8. 3. 4. 2. ]
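import numpy as np
a = np.array([1, 2, 5, 3])
a *= 2  # doubles each element in place; a is now [2 4 10 6]
# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])
print("Original array:\n", a)
print("Transpose of array:\n", a.T)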
Output:
Original array:
[[1 2 3]
[3 4 5]
[9 6 0]]
Transpose of array:
[[1 3 9]
[2 4 6]
[3 5 0]]
Arithmetic operations apply to the array elementwise, and a new array is created.
You can use all the basic arithmetic operators, such as +, -, * and /. In the case
of the +=, -= and *= operators, the existing array is modified in place.
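The difference between creating a new array and modifying the existing one can be checked directly. This is a minimal sketch; the name same is introduced here only to hold a second reference to the array.

```python
import numpy as np

a = np.array([1, 2, 5, 3])
same = a          # second name for the same array object, not a copy

b = a + 1         # elementwise addition creates a new array; a is unchanged
a *= 2            # in-place operator modifies the existing array

print(a)          # [ 2  4 10  6]
print(b)          # [2 3 6 4]
print(same)       # also doubled, because same and a are one object
```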
import numpy as np
a = np.array([[1, 2],
[3, 4]])
b = np.array([[4, 3],
[2, 1]])
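# add arrays
print("Array sum:\n", a + b)
# multiply arrays (elementwise multiplication)
print("Array multiplication:\n", a * b)
# matrix multiplication
print("Matrix multiplication:\n", a.dot(b))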
Output:
Array sum:
[[5 5]
[5 5]]
Array multiplication:
[[4 6]
[6 4]]
Matrix multiplication:
[[ 8 5]
[20 13]]
PANDAS
Pandas is a powerful and versatile library that simplifies data manipulation tasks
in Python. It is built on top of the NumPy library and is particularly well
suited to working with tabular data, such as spreadsheets or SQL tables. Its
versatility and ease of use make it an essential tool for data analysts, scientists, and
engineers working with structured data in Python.
Installing Pandas
The first step in working with pandas is to check whether it is installed on the
system. If it is not, install it using the pip command. Type cmd in the search box
to open a command prompt, use the cd command to move to the folder where pip has
been installed, and then type the command:
pip install pandas
Importing Pandas
After pandas has been installed on the system, you need to import the
library. This module is generally imported as follows:
import pandas as pd
Pandas provides two main data structures for manipulating data. They are:
Series
DataFrame
SERIES
A Pandas Series is a one-dimensional array-like object that can store data of any
type, including strings, integers, and floats. It has an associated index, which is an
array of labels used to identify elements within the Series. A Series holds only
a single column of values with index labels, which makes it easy to construct and
manipulate.
If nothing else is specified, the values are labeled with their index number: the
first value has index 0, the second value has index 1, and so on. This label can be
used to access a specified value.
import pandas as pd
a = [1, 2, 3]
ds = pd.Series(a)
print(ds[0])
With the index argument, you can name your own labels.
import pandas as pd
a = [1,2,3]
ds = pd.Series(a, index=['a', 'b', 'c'])
print(ds)
When you have created labels, you can access an item by referring to the label.
print(ds["a"])
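Once labels are assigned, a Series can be read both by label and by position. The sketch below uses the same values and labels as the example above; loc and iloc are the standard pandas accessors for label-based and position-based access.

```python
import pandas as pd

ds = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Access by label
by_label = ds["b"]

# Access by position (0-based), independent of the labels
by_position = ds.iloc[1]

# Several labels at once return a smaller Series
subset = ds.loc[["a", "c"]]

print(by_label, by_position)
print(subset)
print(list(ds.index))
```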
DATAFRAMES
import pandas as pd
data = {"AQI": [420, 380, 390], "Temp": [50, 40, 45]}
df = pd.DataFrame(data)
print(df)
A DataFrame is like a table with rows and columns. Pandas uses the loc attribute
to return one or more specified rows.
print(df.loc[0])
print(df.loc[[0,1,2]])
import pandas as pd
data = {"AQI": [420, 380, 390], "Temp": [50, 40, 45]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)
print(df.to_string())
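With named indexes, loc accepts the row names instead of row numbers. The following sketch reuses the AQI and Temp data from the example above.

```python
import pandas as pd

data = {"AQI": [420, 380, 390], "Temp": [50, 40, 45]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A single label returns that row as a Series
row = df.loc["day2"]

# A list of labels returns a DataFrame with those rows
subset = df.loc[["day1", "day3"]]

# A row label and a column label together return a single value
value = df.loc["day3", "Temp"]

print(row)
print(subset)
print(value)
```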
To load a JSON file into a DataFrame:
import pandas as pd
df = pd.read_json('data.json')
print(df.to_string())
print(df.head(10))
print(df.tail())
df.tail() returns the bottom 5 rows of the DataFrame.
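head() and tail() can be tried without any data file. In this sketch the DataFrame is built in memory, and the column name n is hypothetical.

```python
import pandas as pd

# 20 rows holding the numbers 1 to 20
df = pd.DataFrame({"n": range(1, 21)})

first_five = df.head()      # with no argument, the top 5 rows
first_ten = df.head(10)     # the top 10 rows
last_three = df.tail(3)     # the bottom 3 rows

print(first_five)
print(last_three)
```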
Pandas uses the mean(), median() and mode() methods to calculate the respective
values for a specified column:
import pandas as pd
df = pd.read_csv('data.csv')
x = df["Calories"].mean()
print(x)
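The three methods can be compared on a small column built in memory instead of data.csv. The Calories values below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Calories": [300, 250, 300, 400, 350]})

mean_value = df["Calories"].mean()      # sum 1600 divided by 5 values
median_value = df["Calories"].median()  # middle value of the sorted column
mode_values = df["Calories"].mode()     # most frequent value(s), as a Series

print(mean_value)
print(median_value)
print(mode_values.tolist())
```

mode() returns a Series rather than a single number because several values can be equally frequent.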
Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, which is usually
imported under the plt alias:
import matplotlib.pyplot as plt
plt.show() starts an event loop, looks for all currently active figure objects, and
opens one or more interactive windows that display your figure or figures.
The plt.show() command should be used only once per Python session, and is most
often seen at the very end of the script. Multiple show() commands can lead to
unpredictable backend-dependent behavior, and should mostly be avoided.
Creating title and labels:
import numpy as np
from matplotlib import pyplot as plt
x = np.array([4, 5, 6, 7, 8, 9, 10, 11, 12, 13])
y = np.array([8, 5, 9, 5, 10, 15, 11, 16, 12, 25])
plt.plot(x, y)
plt.title("Warehouse")
plt.xlabel("Date")
plt.ylabel("Quantity")
plt.show()
The title, the x-axis label and the y-axis label are added to the plot using the
title(), xlabel() and ylabel() functions of pyplot.
1. Line graph
from matplotlib import pyplot as plt
x = [16, 8, 10]
y = [8, 16, 6]
x2 = [8, 15, 11]
y2 = [6, 15, 7]
plt.plot(x, y, 'r')
plt.plot(x2, y2, 'm')
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
2. Bar graphs
Bar graphs are one of the most common types of graphs and are used to show data
associated with categorical variables. Matplotlib provides a bar() function to make
bar graphs, which accepts arguments such as the categorical variables, their values
and the color.
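A minimal bar graph sketch follows. The categories and values are made up, and the figure is saved to a file named bar_chart.png (an arbitrary name) so that the code also runs without a display; plt.show() can be used instead in an interactive session.

```python
from matplotlib import pyplot as plt

# Hypothetical categories and their values
fruits = ["Apples", "Bananas", "Cherries"]
quantity = [12, 7, 9]

plt.figure()
# bar() takes the categories, their values and an optional color
bars = plt.bar(fruits, quantity, color="teal")
plt.title("Fruit stock")
plt.xlabel("Fruit")
plt.ylabel("Quantity")
plt.savefig("bar_chart.png")
plt.close()
```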
3. Pie Chart
A pie chart is a circular graph that is divided into segments, or slices of pie. It
is generally used to represent percentage or proportional data, where each slice
of the pie represents a particular category. The pie() function is used to draw pie
charts.
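A minimal pie chart sketch follows. The categories and shares are made up, the autopct argument prints the percentage on each slice, and the figure is saved to pie_chart.png (an arbitrary name) rather than shown, so the code also runs without a display.

```python
from matplotlib import pyplot as plt

# Hypothetical categories and their shares of a budget
labels = ["Rent", "Food", "Travel", "Other"]
shares = [40, 30, 20, 10]

plt.figure()
# pie() draws one slice per value; autopct formats the percentage labels
wedges, label_texts, pct_texts = plt.pie(shares, labels=labels, autopct="%1.0f%%")
plt.title("Monthly budget")
plt.savefig("pie_chart.png")
plt.close()
```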
4. Histogram
First, we need to understand the difference between a bar graph and a histogram.
A histogram is used to show a distribution, whereas a bar chart is used to compare
different entities. A histogram is a type of bar plot that shows how many values
fall into each of a set of value ranges. The hist() function is used to create
histograms. It takes an array of numbers, which is passed to the function as an
argument, and creates the histogram from it.
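A minimal histogram sketch follows. The ages and the bin edges are made up for illustration, and the figure is saved to histogram.png (an arbitrary name) so that the code also runs without a display.

```python
from matplotlib import pyplot as plt

# Hypothetical ages of 12 people
ages = [22, 25, 27, 31, 33, 35, 36, 41, 44, 47, 52, 58]

plt.figure()
# bins gives the edges of the value ranges: 20-30, 30-40, 40-50, 50-60
counts, edges, patches = plt.hist(ages, bins=[20, 30, 40, 50, 60])
plt.title("Age distribution")
plt.xlabel("Age")
plt.ylabel("Number of people")
plt.savefig("histogram.png")
plt.close()

print(counts)   # how many ages fall into each range
```

hist() returns the count for each range, so the same call can be used to inspect the distribution numerically as well as to draw it.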
5. Scatter plot
Scatter plots are mostly used for comparing variables, when we need to see
how much one variable is affected by another variable. The data is displayed as a
collection of points. For each point, the value of one variable defines the
position on the horizontal axis, and the value of the other variable defines the
position on the vertical axis.
from matplotlib import pyplot as plt
x = [5,7,10]
y = [18,10,6]
x2 = [6,9,11]
y2 = [7,14,17]
plt.scatter(x, y)
plt.scatter(x2, y2)
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
TKINTER
Importing tkinter is the same as importing any other module in Python code.
import tkinter
There are two main methods that the user needs to remember while creating a
Python application with a GUI.
Tk(): This method creates the main window of the application.
m = tkinter.Tk()
mainloop(): This method is called when your application is ready to run.
mainloop() is an infinite loop that runs the application, waits for an event to
occur and processes the event as long as the window is not closed.
m.mainloop()
Tkinter also offers access to the geometric configuration of the widgets, which
organizes the widgets in the parent window. There are mainly three geometry
manager classes.
1. pack() method: It organizes the widgets in blocks before placing them in the
parent widget.
2. grid() method: It organizes the widgets in a grid (table-like structure) before
placing them in the parent widget.
3. place() method: It organizes the widgets by placing them at specific
positions directed by the programmer.
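As an illustration of the grid() method, the sketch below lays out a small form in rows and columns. The function name build_login_form and the field names are hypothetical; the function only creates the widgets, so it has to be given a parent window.

```python
import tkinter as tk


def build_login_form(parent):
    """Place two labelled entry fields and a button in a table-like grid."""
    # Row 0: label in column 0, entry field in column 1
    tk.Label(parent, text="Username").grid(row=0, column=0, padx=5, pady=5)
    username = tk.Entry(parent)
    username.grid(row=0, column=1, padx=5, pady=5)

    # Row 1: same layout; show="*" hides the typed characters
    tk.Label(parent, text="Password").grid(row=1, column=0, padx=5, pady=5)
    password = tk.Entry(parent, show="*")
    password.grid(row=1, column=1, padx=5, pady=5)

    # Row 2: one button stretched across both columns
    tk.Button(parent, text="Login").grid(row=2, column=0, columnspan=2, pady=10)

    return username, password
```

To try it, create a window with root = tk.Tk(), call build_login_form(root), and then start the event loop with root.mainloop(). Unlike place(), grid() needs no pixel coordinates, so the form keeps its layout when the text or font size changes.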
o Text: Used to display and edit text in a variety of styles; it offers a
formatted text display.
o Label: Used to display text and images; the user cannot interact with it.
o Button: Used to add buttons; functions and methods can be attached to
it.
o Entry: Used to enter a one-line text string.
o Radiobutton: Used to select one of several choices.
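from tkinter import Tk, Label, Entry, Button
# Create the main window that holds the form widgets
base = Tk()
# Assumed window size, large enough to show the widgets placed below
base.geometry("500x500")
entry_1 = Entry(base)
entry_1.place(x=240,y=130)
labl_2 = Label(base, text="Email",width=20,font=("bold", 10))
labl_2.place(x=68,y=180)
entry_02 = Entry(base)
entry_02.place(x=240,y=180)
# Renamed from entry_02 so that the earlier entry field is not overwritten
entry_03 = Entry(base)
entry_03.place(x=240,y=280)
Button(base, text='Submit',width=20,bg='brown',fg='white').place(x=180,y=380)
# Start the event loop; it runs until the window is closed
base.mainloop()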