Pandas,Numpy,Matplotlib

Pandas is an open-source library in Python for data manipulation and analysis, featuring data structures like Series and DataFrame. It provides functionalities for creating DataFrames, selecting rows and columns, handling missing data, and reading CSV files. Additionally, the document covers basic usage of Numpy for array processing and Matplotlib for data visualization through various plot types.

Uploaded by

sachin sales abraham


Python Pandas

Pandas is an open-source Python library designed primarily for data
manipulation and analysis. It offers data structures such as Series and
DataFrame for working with structured data. Its core components and
functionality are outlined below:
● A Series is a one-dimensional labeled array that can store data of various
types, such as integers, floats, strings, or arbitrary Python objects. Each item
in a Series is associated with an index label, so items can be accessed either
by position or by label.
● A DataFrame is a two-dimensional data structure in which data is arranged in
tabular form, in rows and columns. A Pandas DataFrame has three principal
components: the data, the rows, and the columns.
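The two structures described above can be sketched as follows. This is a minimal illustration, not part of the original notes: the names scores, alice_score, and first_score and the sample values are assumptions chosen for the example.

```python
import pandas as pd

# A Series with explicit string labels as its index (sample values)
scores = pd.Series([85, 92, 88], index=["john", "alice", "bob"])

# Label-based access uses .loc
alice_score = scores.loc["alice"]

# Position-based access uses .iloc (0 is the first item)
first_score = scores.iloc[0]

print(alice_score, first_score)

# A DataFrame combines the three components: data, row labels, column labels
df = pd.DataFrame({"score": scores})
print(df.index.tolist())    # row labels
print(df.columns.tolist())  # column labels
print(df.shape)             # (number of rows, number of columns)
```

Using .loc and .iloc explicitly keeps label-based and position-based access unambiguous.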
Creating a Pandas DataFrame

# import pandas as pd
import pandas as pd
# list of strings
list1 = ['hey', 'rithu', 'how', 'people','happy','for']
# Calling DataFrame constructor on list
df = pd.DataFrame(list1)
print(df)

Dealing with Rows and Columns

⇒ Column Selection: To select a column in a Pandas DataFrame, access it by its
column name. Passing a list of names selects several columns at once.

# Import pandas package

import pandas as pd
# Define a dictionary containing employee data
data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select two columns
print(df[['Name', 'Qualification']])

⇒ Row Selection: Pandas provides the DataFrame.loc[] indexer to retrieve rows
from a DataFrame by label or by a boolean condition. Rows can also be selected
by integer position using the iloc[] indexer.
Example 1:
#row selection using loc function
import pandas as pd
# Define a dictionary containing employee data
data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select row using loc
print(df.loc[df['Name'] == 'Jai'])

Example 2:
#row selection using iloc function
import pandas as pd
# Define a dictionary containing employee data
data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select row using iloc
print(df.iloc[0])
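With the default index, labels and positions coincide, so the difference between loc[] and iloc[] is easy to miss. The sketch below assumes hypothetical row labels (10, 20, 30, 40) that are not in the original example, so that labels and positions differ.

```python
import pandas as pd

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32]}
# Hypothetical row labels, chosen so that labels and positions differ
df = pd.DataFrame(data, index=[10, 20, 30, 40])

by_label = df.loc[20]        # the row whose index label is 20
by_position = df.iloc[1]     # the second row, regardless of its label
label_slice = df.loc[10:30]  # label slices include the end label
pos_slice = df.iloc[0:2]     # positional slices exclude the end position

print(by_label['Name'], by_position['Name'])
print(len(label_slice), len(pos_slice))
```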
⇒ CSV file of Students named “students.csv”

Name,Age,Grade,City,Subject,Score
John,16,10,New York,Math,85
Alice,15,9,Los Angeles,English,92
Bob,17,11,Chicago,Science,88
Charlie,16,10,Houston,History,90
David,15,9,Phoenix,Math,75
Eva,17,11,Philadelphia,English,95
Frank,16,10,San Antonio,Science,80
Grace,15,9,San Diego,History,85
Henry,17,11,Dallas,Math,78
Isabella,16,10,San Jose,English,89

⇒ Indexing and Selecting Data

● To display the dataframe

import pandas as pd
# Load the CSV file into a DataFrame
df = pd.read_csv('students.csv')
# Display the DataFrame
print("DataFrame:")
print(df)

● Selecting a single column

import pandas as pd
df = pd.read_csv('students.csv')
name_column = df['Name']
print("\nName Column:")
print(name_column)
● Selecting a single row

import pandas as pd
df = pd.read_csv('students.csv')
john_row = df.loc[df['Name'] == 'John']
print("\nRow where Name is 'John':")
print(john_row)

Working with Missing Data

Missing data occurs when no information is provided for one or more items or
for a whole unit. It is a common problem in real-world datasets. In pandas,
missing values are also referred to as NA (Not Available) values.

Checking for missing values using isnull() and notnull() :

# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, 45, 56, np.nan],
'Third Score':[np.nan, 40, 80, 98]}
# creating a dataframe from dictionary
df = pd.DataFrame(data)
# using isnull() function; True marks a missing value
print(df.isnull())
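The heading above also mentions notnull(), the inverse of isnull(). A minimal sketch using the same scores data follows; the variable names present, missing_per_column, and complete_rows are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}
df = pd.DataFrame(data)

# notnull() is True wherever a value is present
present = df.notnull()
print(present)

# Summing the boolean mask counts the missing values in each column
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Keep only the rows in which every column has a value
complete_rows = df[df.notnull().all(axis=1)]
print(complete_rows)
```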
Filling missing values using fillna(), replace() :

→ fillna()
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, 45, 56, np.nan],
'Third Score':[np.nan, 40, 80, 98]}
# creating a dataframe from dictionary
df = pd.DataFrame(data)
# filling missing values with 0 using fillna(); returns a new DataFrame
print(df.fillna(0))

→ replace()
# Importing pandas as pd
import pandas as pd
# Importing numpy as np
import numpy as np
# Dictionary of lists
data = {
'First Score': [100, 90, np.nan, 95],
'Second Score': [30, 45, 56, np.nan],
'Third Score': [np.nan, 40, 80, 98]
}
# Creating a DataFrame from the dictionary
df = pd.DataFrame(data)
# Replacing every NaN value with 0 using replace()
print(df.replace(to_replace=np.nan, value=0))
Dropping missing values using dropna() :

# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, np.nan, 45, 56],
'Third Score':[52, 40, 80, 98],
'Fourth Score':[np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(data)
# using dropna() function to drop every row that contains a missing value
print(df.dropna())
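By default dropna() removes every row that contains at least one missing value, which can discard most of a table. The sketch below shows a few of its other parameters (axis, how, subset, thresh) on the same data; the result variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, np.nan, 45, 56],
        'Third Score': [52, 40, 80, 98],
        'Fourth Score': [np.nan, np.nan, np.nan, 65]}
df = pd.DataFrame(data)

rows_without_nan = df.dropna()                      # drop rows with any NaN
cols_without_nan = df.dropna(axis=1)                # drop columns with any NaN
rows_not_all_nan = df.dropna(how='all')             # drop rows only if every value is NaN
first_present = df.dropna(subset=['First Score'])   # check a single column only
at_least_three = df.dropna(thresh=3)                # keep rows with 3 or more values

print(rows_without_nan)
print(cols_without_nan)
```

None of these calls modifies df itself; each returns a new DataFrame.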

Python Numpy

Numpy is a general-purpose array-processing package. It provides a
high-performance multidimensional array object and tools for working with these
arrays. It is the fundamental package for scientific computing with Python.

An array in Numpy is a table of elements (usually numbers), all of the same
type, indexed by a tuple of non-negative integers. The number of dimensions of
the array is called its rank (available as the ndim attribute). A tuple of
integers giving the size of the array along each dimension is known as the shape
of the array. The array class in Numpy is called ndarray.
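The terms above map directly onto attributes of an array object. A minimal sketch follows, assuming a small 2 x 3 sample array; the name element is introduced only for illustration.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(type(array_2d))   # the array class, numpy.ndarray
print(array_2d.ndim)    # number of dimensions
print(array_2d.shape)   # size along each dimension
print(array_2d.dtype)   # the single element type shared by all items

# Indexing with a tuple of integers: row 1, column 2
element = array_2d[1, 2]
print(element)
```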

→ Creating a Numpy Array

import numpy as np

# Creating a 1D array
array_1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", array_1d)
# Creating a 2D array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)

→ Accessing the array Index

import numpy as np
# Create a 1D array
array_1d = np.array([10, 20, 30, 40, 50])
# Accessing elements by index
print("Element at index 0:", array_1d[0])
print("Element at index 2:", array_1d[2])

→ Slicing
Example 1:
import numpy as np
# Create a 1D array
array_1d = np.array([10, 20, 30, 40, 50])
# Slicing a 1D array
slice_1d = array_1d[1:4] # Elements from index 1 to 3
print("Sliced 1D Array:", slice_1d)

Example 2:
import numpy as np
# Creating a 2D array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)
# Slicing a 2D array
slice_2d = array_2d[0:2, 1:3] # Subarray with rows 0 to 1 and columns 1 to 2
print("Sliced 2D Array:\n", slice_2d)
Matplotlib
Matplotlib is a widely used visualization library in Python for 2D plots of
arrays. It can be installed or upgraded with pip:

python -m pip install -U matplotlib

Import matplotlib

Basic plots in Matplotlib

1. Line plot using Matplotlib

# importing matplotlib module

from matplotlib import pyplot as plt
# x-axis values
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot
plt.plot(x,y)
# function to show the plot
plt.show()
2. Bar plot using Matplotlib

# importing matplotlib module

from matplotlib import pyplot as plt
# x-axis values
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot the bar
plt.bar(x,y)
# function to show the plot
plt.show()

3. Histogram using Matplotlib

# importing matplotlib module

from matplotlib import pyplot as plt
# Data values to be grouped into bins
y = [10, 5, 8, 4, 2]
# Function to plot histogram
plt.hist(y)
# Function to show the plot
plt.show()
4. Scatter Plot using Matplotlib

# importing matplotlib module

from matplotlib import pyplot as plt
# x-axis values
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]
# Function to plot scatter
plt.scatter(x, y)
# function to show the plot
plt.show()
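The plots above are drawn without axis labels or a title. The sketch below shows one way to add them and to save the figure instead of displaying it. It assumes the non-interactive "Agg" backend and an in-memory buffer so that it runs without a display; the label and title strings are placeholders.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
from matplotlib import pyplot as plt

# x-axis values
x = [5, 2, 9, 4, 7]
# Y-axis values
y = [10, 5, 8, 4, 2]

# Create a figure and a single set of axes
fig, ax = plt.subplots()
ax.scatter(x, y)

# Placeholder labels and title
ax.set_xlabel("x values")
ax.set_ylabel("y values")
ax.set_title("Scatter plot with labels")

# Write the figure as PNG data to an in-memory buffer
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

To write a file instead, a filename such as "scatter.png" can be passed to savefig() in place of the buffer.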
