05-Unit-V Python Lecture Notes

Sandip university nashik python pdf

Uploaded by

ajwagh358
Copyright
© © All Rights Reserved

Unit 5

NumPy (Numerical Python): Introduction to NumPy, Data types of arrays, Dealing
with ndarrays, copies and views, Arithmetic operations, Indexing, Slicing, splitting
arrays, Shape manipulation, Stacking together different data.

Pandas (Data Analysis): Data Frame and Series, Data Frame operations, Data
Slicing, indexing, Data Frame functions, Reading files (CSV, Excel).

NumPy and Pandas

NumPy: Numerical Python

NumPy (short for Numerical Python) is a powerful library for numerical
computations in Python. It offers support for multi-dimensional arrays, matrices, and
high-level mathematical functions to operate on these arrays efficiently.

1. Introduction to NumPy

• NumPy is used for fast mathematical and logical operations on arrays.
• It provides efficient ways to store large data and manipulate it in the form of
ndarrays (N-dimensional arrays).
• For numerical work on large data, NumPy arrays are typically much faster and
more memory-efficient than Python lists.
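
As a rough illustration of the points above, the sketch below applies the same operation to a Python list and to a NumPy array. The variable names are illustrative only, and the actual speed and memory gains depend on array size and data type.

```python
import numpy as np

values = list(range(5))   # plain Python list: [0, 1, 2, 3, 4]
arr = np.arange(5)        # NumPy array holding the same numbers

# List: an explicit loop (here a comprehension) is needed
doubled_list = [v * 2 for v in values]

# Array: the operation is applied to every element at once
doubled_arr = arr * 2

print(doubled_list)   # [0, 2, 4, 6, 8]
print(doubled_arr)    # [0 2 4 6 8]

# Every element uses the same number of bytes (value depends on the dtype)
print(arr.itemsize, arr.nbytes)
```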

Installation:

pip install numpy

Importing NumPy:

import numpy as np

2. Data Types of Arrays

NumPy supports many data types, such as:

• int: Integer types like int32, int64
• float: Floating-point types like float32, float64
• complex: Complex numbers
• bool: Boolean values
• object: General Python objects
• str: Unicode string

Example: Creating arrays with different data types:


import numpy as np

arr1 = np.array([1, 2, 3], dtype='int32')           # Integer array
arr2 = np.array([1.1, 2.2, 3.3], dtype='float64')   # Float array
arr3 = np.array([True, False, True], dtype='bool')  # Boolean array

print(arr1, arr1.dtype)
print(arr2, arr2.dtype)
print(arr3, arr3.dtype)

Output:

[1 2 3] int32
[1.1 2.2 3.3] float64
[ True False  True] bool
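
The data type of an existing array can also be changed. The sketch below is one way to see this, using astype() and letting NumPy infer a type for mixed input. The variable names are illustrative.

```python
import numpy as np

floats = np.array([1.7, 2.2, 3.9])   # dtype inferred as float64
ints = floats.astype('int32')        # fractional part is dropped, not rounded
mixed = np.array([1, 2.5])           # int and float together are stored as float64

print(floats.dtype)   # float64
print(ints)           # [1 2 3]
print(mixed.dtype)    # float64
```

Note that astype() returns a new array and leaves the original unchanged.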

3. Dealing with ndarrays

An ndarray is the core data structure of NumPy. It is a fast, N-dimensional container
for homogeneous data.

Creating Arrays:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
print(f'Shape: {arr.shape}, Dimensions: {arr.ndim}')

Output:

[[1 2 3]
 [4 5 6]]
Shape: (2, 3), Dimensions: 2
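
Besides np.array(), NumPy has helper functions that build arrays directly. The following is a small sketch of two common ones, together with a few attributes every ndarray carries. The variable names are illustrative.

```python
import numpy as np

zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
seq = np.arange(0, 10, 2)     # start 0, stop before 10, step 2
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(zeros.shape)   # (2, 3)
print(seq)           # [0 2 4 6 8]
print(grid.size)     # 6 (total number of elements)
print(grid.ndim)     # 2 (number of dimensions)
```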

4. Copies and Views

• Copy: A new independent array is created.
• View: A new array refers to the original array’s data.

Example:

arr = np.array([10, 20, 30])
copy_arr = arr.copy()   # Independent copy
view_arr = arr.view()   # Just a view of original data

copy_arr[0] = 99
view_arr[1] = 88

print('Original:', arr)   # [10 88 30]
print('Copy:', copy_arr)  # [99 20 30]
print('View:', view_arr)  # [10 88 30]
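
Views also appear without calling view(): basic slicing returns a view of the original array. The sketch below shows this and uses np.shares_memory() to check whether two arrays share data. The variable names are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
part = arr[1:3]          # basic slicing returns a view
safe = arr[1:3].copy()   # explicit, independent copy

part[0] = 99             # also changes arr[1]

print(arr)                           # [10 99 30 40]
print(safe)                          # [20 30]
print(np.shares_memory(arr, part))   # True
print(np.shares_memory(arr, safe))   # False
```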

5. Arithmetic Operations on Arrays

You can perform element-wise operations with NumPy arrays.

Example:

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print(arr1 + arr2)  # [5 7 9]
print(arr1 * arr2)  # [ 4 10 18]
print(arr1 ** 2)    # [1 4 9]
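
Subtraction and division work the same way, and an array can be combined with another array of a compatible shape (broadcasting). The following sketch uses illustrative values that are not part of the notes above.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([2, 4, 5])

diff = a - b        # element-wise subtraction
quot = a / b        # true division always gives floats
floor = a // b      # floor division keeps integers

print(diff.tolist())    # [8, 16, 25]
print(quot.tolist())    # [5.0, 5.0, 6.0]
print(floor.tolist())   # [5, 5, 6]

# Broadcasting: the 1-D array is added to each row of the 2-D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print((matrix + a).tolist())   # [[11, 22, 33], [14, 25, 36]]
```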

6. Indexing and Slicing

• Indexing: Accessing elements using indices.
• Slicing: Accessing a range of elements.

Example:

arr = np.array([10, 20, 30, 40, 50])

print(arr[1]) # 20 (Indexing)
print(arr[1:4]) # [20 30 40] (Slicing)
print(arr[-1]) # 50 (Negative Indexing)
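
The same ideas extend to steps, conditions, and two-dimensional arrays. The sketch below is a minimal illustration; the 2-D array named grid is an assumed example, not part of the notes above.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr[::2])        # [10 30 50]  every second element
print(arr[arr > 25])   # [30 40 50]  boolean (condition-based) indexing

grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid[1, 2])      # 6      row 1, column 2
print(grid[:, 0])      # [1 4]  first column of every row
```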

7. Splitting Arrays

You can split a larger array into smaller ones.

Example:

arr = np.array([1, 2, 3, 4, 5, 6])
split_arr = np.array_split(arr, 3)

print(split_arr) # [array([1, 2]), array([3, 4]), array([5, 6])]
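
When the array length is not divisible by the number of parts, np.array_split() still works and makes the first parts one element longer, whereas np.split() raises a ValueError. A short sketch with an assumed 7-element array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# array_split tolerates an uneven division
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])   # [[1, 2, 3], [4, 5], [6, 7]]

# split insists on equal-sized parts
try:
    np.split(arr, 3)
    split_failed = False
except ValueError as exc:
    split_failed = True
    print('np.split failed:', exc)
```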

8. Shape Manipulation

You can reshape arrays to change their dimensions.

Example:

arr = np.array([1, 2, 3, 4, 5, 6])
reshaped = arr.reshape(2, 3)

print(reshaped)

Output:

[[1 2 3]
 [4 5 6]]
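
The new shape must hold exactly the same number of elements as the original. One dimension may be given as -1 so that NumPy works it out, and ravel() flattens an array back to one dimension. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.arange(1, 7)            # [1 2 3 4 5 6]

two_cols = arr.reshape(-1, 2)    # -1 lets NumPy infer the number of rows
flat = two_cols.ravel()          # back to one dimension

print(two_cols.shape)   # (3, 2)
print(flat)             # [1 2 3 4 5 6]

# 6 elements cannot fill a 4x2 array, so this raises ValueError
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
```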

9. Stacking Arrays

You can stack arrays vertically or horizontally.

Example:

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

vstack = np.vstack((arr1, arr2))
hstack = np.hstack((arr1, arr2))

print('Vertical Stack:\n', vstack)
print('Horizontal Stack:', hstack)
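
To make the difference between the two stacks concrete, the sketch below checks the resulting shapes and also shows np.concatenate(), a more general function that joins arrays along an existing axis. The variable names are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

v = np.vstack((arr1, arr2))        # each 1-D array becomes a row
h = np.hstack((arr1, arr2))        # arrays are joined end to end
c = np.concatenate((arr1, arr2))   # same result as hstack for 1-D input

print(v.shape)   # (2, 3)
print(h.shape)   # (6,)
print(c)         # [1 2 3 4 5 6]
```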

Pandas: Data Analysis Library

Pandas is a Python library used for data manipulation and analysis. It provides two
main data structures:

• Series: One-dimensional labeled array.
• DataFrame: Two-dimensional labeled data structure.

1. Introduction to Pandas

• It allows importing, cleaning, transforming, and analyzing data.
• Pandas is especially useful for working with CSV or Excel files.

Installation:

pip install pandas

Importing Pandas:

import pandas as pd

2. Series and DataFrame

• Series: One-dimensional array with labels.
• DataFrame: Two-dimensional array (table) with rows and columns.

Example:

# Creating a Series
s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(s)

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]}
df = pd.DataFrame(data)
print(df)

Output:

a    1
b    2
c    3
d    4
dtype: int64

      Name  Age
0    Alice   24
1      Bob   27
2  Charlie   22
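
The labels are what set these structures apart from plain arrays. The sketch below rebuilds the same Series and DataFrame and reads a few basic properties from them.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]})

print(s['b'])                     # 2  (value stored under label 'b')
print(df.shape)                   # (3, 2)  rows, columns
print(list(df.columns))           # ['Name', 'Age']
print(type(df['Age']).__name__)   # Series  (each column is a Series)
```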

3. DataFrame Operations

You can inspect DataFrames using methods like head(), tail(), and describe().

Example:

print(df.head())      # First few rows (5 by default)
print(df.describe())  # Summary statistics for numeric columns
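
Beyond inspection, columns can be used in calculations and new columns can be added. The sketch below assumes the same small DataFrame as before; the column name AgeNextYear is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]})

print(df.tail(2))                    # last two rows

df['AgeNextYear'] = df['Age'] + 1    # hypothetical derived column
print(df['AgeNextYear'].tolist())    # [25, 28, 23]

mean_age = df['Age'].mean()          # average of the Age column, about 24.33
print(round(mean_age, 2))
```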

4. Data Slicing and Indexing

You can select specific rows and columns using labels or positions.

Example:

print(df['Name'])          # Select a column
print(df.iloc[1])          # Select the second row (by position)
print(df[df['Age'] > 23])  # Filter rows
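
The difference between labels and positions shows up most clearly with loc and iloc. In the sketch below (same small DataFrame assumed), loc slices include the end label while iloc slices exclude the end position, and two conditions are combined with &.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]})

print(df.loc[0, 'Name'])     # Alice  (row label 0, column label 'Name')
print(df.iloc[0, 1])         # 24     (first row, second column by position)

print(len(df.loc[0:1]))      # 2  label slice includes the end label
print(len(df.iloc[0:1]))     # 1  position slice excludes the end position

# Each condition needs its own parentheses when combined with &
match = df[(df['Age'] < 25) & (df['Name'].str.startswith('A'))]
print(match['Name'].tolist())   # ['Alice']
```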

5. DataFrame Functions

Some useful functions:

• sort_values(): Sorts data based on a column.
• drop(): Removes a column or row.
• fillna(): Replaces missing values.

Example:

df['Age'] = df['Age'].fillna(0)  # Replace NaN with 0
sorted_df = df.sort_values(by='Age')
print(sorted_df)
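
The earlier DataFrame has no missing values, so fillna() has nothing to replace there. The sketch below uses an assumed DataFrame with one missing age to show all three functions in action.

```python
import numpy as np
import pandas as pd

# Assumed sample data with one missing value
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [24, np.nan, 22]})

df['Age'] = df['Age'].fillna(0)                           # Bob's age becomes 0
oldest_first = df.sort_values(by='Age', ascending=False)  # descending order
names_only = df.drop(columns=['Age'])                     # returns a new DataFrame

print(oldest_first['Name'].tolist())   # ['Alice', 'Charlie', 'Bob']
print(list(names_only.columns))        # ['Name']
```

Both sort_values() and drop() return new objects by default, so df itself still has the Age column.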

6. Reading Files (CSV, Excel)

Pandas can read data from various formats, including CSV and Excel.

Reading a CSV File:

df = pd.read_csv('data.csv')
print(df.head())

Reading an Excel File (an Excel engine such as openpyxl must be installed for .xlsx files):

df = pd.read_excel('data.xlsx')
print(df.head())
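
To try read_csv() without a file on disk, the text can be supplied through io.StringIO. The sketch below does this with made-up CSV content and also shows two optional parameters, usecols and nrows, that limit what is loaded.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a real file
csv_text = "Name,Age\nAlice,24\nBob,27\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)          # (2, 2)
print(df['Age'].sum())   # 51

# Load only the Age column and only the first data row
ages = pd.read_csv(io.StringIO(csv_text), usecols=['Age'], nrows=1)
print(ages['Age'].tolist())   # [24]
```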

Summary

• NumPy: Used for numerical computations with fast operations on arrays.
• Pandas: Used for data manipulation and analysis, especially useful for tabular
data in CSV/Excel formats.
• Key operations include indexing, slicing, reshaping, and reading files.

2 Marks Questions (Simple Conceptual or Definition-based Questions)

1. What is NumPy?

Answer:
NumPy (Numerical Python) is a Python library used for scientific computing. It
provides support for multi-dimensional arrays and mathematical operations on
these arrays, such as linear algebra, statistical operations, and element-wise
operations. Array operations are typically faster than equivalent loops over
Python lists because they run in optimized, C-based code.

2. What is a Pandas DataFrame?

Answer:
A Pandas DataFrame is a 2-dimensional, tabular data structure with labeled axes
(rows and columns). It is similar to a spreadsheet or SQL table and is useful for
working with structured data.

3. Explain 'ndarray' in NumPy.

Answer:
ndarray (N-dimensional array) is the core data structure in NumPy. It can hold
multiple elements of the same data type across various dimensions (1D, 2D, or
more). Operations on these arrays are performed element-wise and efficiently.

4. How do you read a CSV file using Pandas?

Answer:
You can read a CSV file using the read_csv() function from the Pandas library:

import pandas as pd
data = pd.read_csv('filename.csv')

5. What is the difference between a view and a copy in NumPy?

Answer:

• View: A view shares data with the original array; a change made through
either one is visible in the other.
• Copy: A copy creates a new array independent of the original; changes in
one do not affect the other.

5 Marks Questions (Explanation and Short Code Questions)

1. Explain the difference between NumPy arrays and Python lists with an
example.

Answer:

• Python Lists: Can hold elements of different data types, but they are slower
and occupy more memory.
• NumPy Arrays: Store elements of the same data type. Operations are faster
because of better memory management.

Example:

import numpy as np

# NumPy array
arr = np.array([1, 2, 3])

# Python list
lst = [1, 2, 3]

Element-wise operations such as addition need no explicit loop in NumPy and are typically faster on large arrays:

arr + 2 # Output: [3 4 5]
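
The same operators behave differently on the two types, which is a common source of confusion. The sketch below contrasts them using the list and array defined in the answer.

```python
import numpy as np

lst = [1, 2, 3]
arr = np.array([1, 2, 3])

print(lst + [2])               # [1, 2, 3, 2]        + concatenates lists
print(lst * 2)                 # [1, 2, 3, 1, 2, 3]  * repeats the list
print([x + 2 for x in lst])    # [3, 4, 5]           a loop is needed for arithmetic

print(arr + 2)                 # [3 4 5]  + adds to every element
print(arr * 2)                 # [2 4 6]  * multiplies every element
```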

2. What are Pandas Series? How do you create one?

Answer:
A Pandas Series is a one-dimensional labeled array capable of holding any data
type. It can act like a list or dictionary.

Code Example:

import pandas as pd

# Creating a Series from a list
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s)

Output:

a    10
b    20
c    30
dtype: int64

3. Write a code to slice a NumPy array.

Answer:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Slice to get the first two rows and first two columns
sliced_arr = arr[:2, :2]
print(sliced_arr)

Output:

[[1 2]
 [4 5]]

4. How can you change the shape of a NumPy array?

Answer:
You can change the shape using the reshape() method.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)

Output:

[[1 2 3]
 [4 5 6]]

5. Explain the use of DataFrame indexing with an example.

Answer: You can index a DataFrame using row and column labels.

import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [24, 27]}
df = pd.DataFrame(data)

# Indexing a single column
print(df['Name'])

# Indexing a specific row by its label
print(df.loc[1])

Output:

0    Alice
1      Bob
Name: Name, dtype: object

Name    Bob
Age      27
Name: 1, dtype: object
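
Row labels do not have to be numbers. As a possible extension of this answer, the sketch below uses set_index() to make the Name column the row labels, so a single value can be looked up by row and column label together. The name by_name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [24, 27]})

# Use the Name column as row labels (returns a new DataFrame)
by_name = df.set_index('Name')

print(by_name.loc['Bob', 'Age'])   # 27
print(list(by_name.index))         # ['Alice', 'Bob']
```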

10 Marks Questions (Detailed Questions with Code Examples)

1. Explain arithmetic operations on NumPy arrays with examples.

Answer:
NumPy allows element-wise arithmetic operations such as addition, subtraction,
multiplication, and division.

Example:

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Addition
print(arr1 + arr2)

# Multiplication
print(arr1 * arr2)

# Scalar addition
print(arr1 + 10)

Output:

[5 7 9]
[ 4 10 18]
[11 12 13]

2. Write a program to split a NumPy array into sub-arrays.

Answer:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Splitting into 3 equal sub-arrays
sub_arrays = np.split(arr, 3)
print(sub_arrays)

Output:

[array([1, 2]), array([3, 4]), array([5, 6])]

3. How can you read an Excel file using Pandas? Write a code example.

Answer:

import pandas as pd

# Reading an Excel file
df = pd.read_excel('sample_data.xlsx')
print(df.head())

This reads the Excel file and prints the first 5 rows using head().

15 Marks Questions (In-depth Questions Covering Concepts and Code)

1. Explain how you can stack NumPy arrays and manipulate their shapes.
Provide examples.

Answer:
Stacking means combining multiple arrays along a particular axis.

Code Example:

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stack the 1-D arrays as rows (vertically)
stacked_rows = np.vstack((arr1, arr2))
print("Stacked Rows:\n", stacked_rows)

# Stack the 1-D arrays as columns
stacked_columns = np.column_stack((arr1, arr2))
print("Stacked Columns:\n", stacked_columns)

Output:

Stacked Rows:
 [[1 2 3]
 [4 5 6]]

Stacked Columns:
 [[1 4]
 [2 5]
 [3 6]]
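
The question also asks about shape manipulation. One way to cover that part is sketched below: the stacked array is transposed, reshaped, and flattened. Transposing swaps rows and columns, while reshape() keeps the element order and only changes how the elements are grouped.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

rows = np.vstack((arr1, arr2))       # shape (2, 3)

# Transpose: rows become columns, same result as column_stack
print(rows.T.tolist())               # [[1, 4], [2, 5], [3, 6]]
print(np.array_equal(rows.T, np.column_stack((arr1, arr2))))   # True

# Reshape: same (3, 2) shape, but elements stay in their original order
print(rows.reshape(3, 2).tolist())   # [[1, 2], [3, 4], [5, 6]]

# Flatten: back to a 1-D copy
print(rows.flatten())                # [1 2 3 4 5 6]
```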

2. Write a Pandas program to perform the following operations: (1) Create a
DataFrame, (2) Filter rows based on a condition, (3) Perform a group-by
operation.

Answer:

import pandas as pd

# (1) Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [24, 27, 22, 32],
        'City': ['NY', 'LA', 'NY', 'LA']}
df = pd.DataFrame(data)

# (2) Filtering rows where Age > 25
filtered_df = df[df['Age'] > 25]
print("Filtered Data:\n", filtered_df)

# (3) Grouping by 'City' and calculating the mean age
grouped = df.groupby('City')['Age'].mean()
print("Mean Age by City:\n", grouped)

Output:

Filtered Data:
     Name  Age City
1    Bob   27   LA
3  David   32   LA

Mean Age by City:
 City
LA    29.5
NY    23.0
Name: Age, dtype: float64
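
A group-by is not limited to one statistic. As a possible extension of this answer, the sketch below uses agg() to compute several statistics per city at once on the same data. The name summary is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [24, 27, 22, 32],
                   'City': ['NY', 'LA', 'NY', 'LA']})

# One row per city, one column per statistic
summary = df.groupby('City')['Age'].agg(['mean', 'max', 'count'])
print(summary)

print(summary.loc['LA', 'max'])     # 32
print(summary.loc['NY', 'count'])   # 2
```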
