Data Analysis Python Notes

This document provides notes on Data Analysis using Python for 3rd Year B.Sc. students, covering key topics such as data types, Python libraries (NumPy, Pandas, Matplotlib, Seaborn), and essential operations for data handling and visualization. It includes practical examples for array creation, DataFrame manipulation, data cleaning, and basic statistical analysis. The content is structured into six units, each focusing on different aspects of data analysis and programming techniques.

Uploaded by chiragal864

Data Analysis using Python - Notes for 3rd Year B.Sc. Students

Unit 1: Introduction to Data Analysis and Python

- Data Analysis: Process of inspecting, cleaning, transforming, and modeling data to find useful information and support decisions.

- Types of Data:

- Qualitative (categorical, nominal, ordinal)

- Quantitative (discrete, continuous)

- Python Libraries:

- NumPy, Pandas, Matplotlib, Seaborn

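- Jupyter Notebook: Interactive environment that runs code cell by cell and shows tables and plots inline, which suits exploratory data analysis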
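
The data types listed above can be illustrated with a small Pandas table. This is only a sketch: the column names and values are invented for illustration, and the ordered categorical is one possible way to represent ordinal data.

```python
import pandas as pd

# Hypothetical survey data, invented for illustration.
df = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Pune"],      # qualitative, nominal
        "rating": ["low", "high", "medium"],    # qualitative, ordinal
        "children": [0, 2, 1],                  # quantitative, discrete
        "height_cm": [162.5, 175.0, 168.2],     # quantitative, continuous
    }
)

# An ordered categorical records the ranking low < medium < high.
df["rating"] = pd.Categorical(
    df["rating"], categories=["low", "medium", "high"], ordered=True
)

print(df.dtypes)
print(df["rating"].max())  # comparisons respect the declared order
```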

Unit 2: NumPy for Data Analysis

- Array Creation: np.array(), np.zeros(), np.ones(), np.arange(), np.linspace()

- Array Operations: Indexing, slicing, reshaping, broadcasting

- Mathematical Functions: np.mean(), np.std(), np.sum(), np.max()
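
A minimal sketch of the array creation functions listed above. The variable names and argument values are chosen only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])        # built from a Python list
z = np.zeros(3)                # three 0.0 values
o = np.ones((2, 2))            # 2x2 array of 1.0 values
r = np.arange(0, 10, 2)        # start, stop, step; the stop value is excluded
l = np.linspace(0, 1, 5)       # 5 evenly spaced points; the stop value is included

print(a, z, o, r, l, sep="\n")
```

Note the difference between the last two: np.arange() takes a step size, while np.linspace() takes the number of points.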

Example:

import numpy as np

a = np.array([1, 2, 3])

print(np.mean(a)) # Output: 2.0
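
The array operations from this unit (indexing, slicing, reshaping, broadcasting) can be sketched in the same way. The array m and the values added to it are illustrative assumptions.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # reshaping: 0..5 laid out as 2 rows x 3 columns

print(m[1, 2])                   # indexing: row 1, column 2 -> 5
print(m[:, 1])                   # slicing: every row, column 1 -> [1 4]

# Broadcasting: the 1D array is added to each row of m.
shifted = m + np.array([10, 20, 30])
print(shifted)                   # [[10 21 32], [13 24 35]]

print(m.sum(axis=0))             # column sums -> [3 5 7]
print(np.std([1, 2, 3]))         # population standard deviation (ddof=0), about 0.816
```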

Unit 3: Pandas for Data Handling

- Data Structures:

- Series: 1D labeled array

- DataFrame: 2D labeled data structure

- Reading Data: pd.read_csv(), pd.read_excel()


- DataFrame Operations:

- Selecting: .loc[], .iloc[]

- Filtering: df[df['column'] > value]

- Sorting: df.sort_values()

- Grouping: df.groupby()
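
The DataFrame operations above can be tried on a small table built in memory. This is a sketch under assumptions: the names, departments, and marks are invented, and the default integer index (0, 1, 2, 3) is used as the row label.

```python
import pandas as pd

# Hypothetical marks table, invented for illustration.
df = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meena", "John"],
        "dept": ["Maths", "Physics", "Maths", "Physics"],
        "marks": [82, 67, 91, 74],
    }
)

s = df["marks"]                                    # one column is a Series
first_row = df.iloc[0]                             # select by integer position
ravi_marks = df.loc[1, "marks"]                    # select by row label and column name
high = df[df["marks"] > 75]                        # keep rows where the condition is True
ranked = df.sort_values("marks", ascending=False)  # highest marks first
dept_mean = df.groupby("dept")["marks"].mean()     # one mean per department

print(high)
print(ranked)
print(dept_mean)
```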

Example:

import pandas as pd

df = pd.read_csv("data.csv")

print(df.head())
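
If no data.csv file is at hand, the same call can be tried on text held in memory, since pd.read_csv() accepts a file-like object as well as a path. The CSV contents below are an invented stand-in for a real file.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file; the columns are made up.
csv_text = """name,marks
Asha,82
Ravi,67
Meena,91
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.head(2))   # first two rows only
print(df.shape)     # (rows, columns)
print(df.dtypes)    # the marks column is parsed as integers
```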

Unit 4: Data Cleaning and Preprocessing

- Handling Missing Values: df.isnull(), df.dropna(), df.fillna()

- Renaming Columns: df.rename()

- Data Type Conversion: df.astype()

- Dropping Duplicates: df.drop_duplicates()
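
A possible cleaning pipeline that chains the methods above. The raw table is hypothetical, and filling a missing mark with "0" is only for illustration; whether to fill or drop missing values depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one missing value and one duplicated row.
raw = pd.DataFrame(
    {
        "Name": ["Asha", "Ravi", "Ravi", "Meena"],
        "Marks": ["82", "67", "67", np.nan],
    }
)

missing = raw.isnull().sum()      # count of missing values per column
print(missing)

clean = (
    raw.drop_duplicates()         # remove the repeated Ravi row
    .rename(columns={"Name": "name", "Marks": "marks"})
    .fillna({"marks": "0"})       # alternative: .dropna() discards incomplete rows
    .astype({"marks": int})       # convert the text column to integers
)
print(clean)
```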

Unit 5: Data Visualization

- Matplotlib:

- plt.plot(), plt.bar(), plt.hist(), plt.scatter()

- Seaborn:

- sns.histplot(), sns.boxplot(), sns.heatmap()
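
The four Matplotlib plot types can be compared side by side. This sketch uses the Axes methods, which have the same names as plt.plot(), plt.bar(), plt.hist(), and plt.scatter(). The data values, the off-screen "Agg" backend, and the output file name are assumptions made so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: draw off-screen; omit this line in a notebook
import matplotlib.pyplot as plt

# Made-up values for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, y)            # line plot: trend across x
axes[0, 0].set_title("plot")
axes[0, 1].bar(x, y)             # bar chart: one bar per x value
axes[0, 1].set_title("bar")
axes[1, 0].hist(y, bins=3)       # histogram: distribution of y
axes[1, 0].set_title("hist")
axes[1, 1].scatter(x, y)         # scatter plot: one point per (x, y) pair
axes[1, 1].set_title("scatter")

fig.tight_layout()
fig.savefig("unit5_matplotlib.png")  # hypothetical file name
```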

Example:

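import matplotlib.pyplot as plt

import seaborn as sns

# df is a DataFrame such as the one loaded in Unit 3; 'column_name' is a placeholder for one of its numeric columns

sns.histplot(data=df, x='column_name')

plt.show()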

Unit 6: Basic Statistical Analysis

- Descriptive Stats: Mean, Median, Mode, Variance, Std Dev

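- Correlation: df.corr() (pairwise Pearson correlation by default; pass numeric_only=True if the DataFrame has non-numeric columns)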

- Value Counts: df['column'].value_counts()
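
The statistics above map onto Pandas methods roughly as follows. The table is invented for illustration. Note that Pandas computes the sample variance and standard deviation (ddof=1) by default, whereas np.std() uses the population formula (ddof=0).

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame(
    {
        "dept": ["Maths", "Physics", "Maths", "Physics", "Maths"],
        "marks": [80, 70, 90, 70, 90],
        "hours": [8, 6, 10, 7, 9],
    }
)

print(df["marks"].mean())            # 80.0
print(df["marks"].median())          # 80.0
print(df["marks"].mode().tolist())   # [70, 90]: every most-frequent value is returned
print(df["marks"].var())             # sample variance: 100.0
print(df["marks"].std())             # sample standard deviation: 10.0

corr = df.corr(numeric_only=True)    # skips the text column "dept"
print(corr)

counts = df["dept"].value_counts()   # Maths appears 3 times, Physics 2 times
print(counts)
```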
