Data Analytics Using Python

The document outlines a step-by-step guide for performing exploratory data analysis on a car dataset using Python libraries, specifically Pandas and Matplotlib. Key tasks include loading the dataset, displaying its contents, generating summary statistics, and visualizing data through various plots. Additionally, it covers data cleaning techniques such as dropping irrelevant columns, handling duplicates, and removing missing values.

Uploaded by

ykashish456
Copyright
© All Rights Reserved

Data Analytics Using Python Libraries, Pandas and Matplotlib

We’ll use a car.csv dataset and perform exploratory data analysis using Pandas and Matplotlib library functions to manipulate and visualize the data and find insights.

1. Import the libraries

2. Load the dataset using the pandas read_csv() function.
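Steps 1 and 2 might look like the following minimal sketch. The real car.csv is not reproduced here, so a tiny in-memory CSV stands in for it; its column names and values are assumptions made up for illustration.

```python
import io

import matplotlib.pyplot as plt  # imported here for the plotting steps later on
import pandas as pd

# Hypothetical stand-in for car.csv so the sketch runs without the real file.
# The column names and values are assumptions, not the actual dataset.
SAMPLE_CSV = """Make,Model,Year,Engine HP,Vehicle Size,MSRP
BMW,1 Series,2011,230,Compact,34500
Audi,100,1992,172,Midsize,2000
Ford,Focus,2015,160,Compact,19000
"""

# With the real file this would be: df = pd.read_csv("car.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(df.shape)
```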

3. Display the head of the dataset using the head() function.

4. Display the bottom 5 rows from the dataset using the tail() function.
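As a small sketch of steps 3 and 4, head() and tail() each return five rows unless told otherwise. The DataFrame here is hypothetical sample data, not the real car.csv.

```python
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford", "Fiat", "Kia", "Mazda", "Volvo"],
    "Engine HP": [230, 172, 160, 101, 138, 155, 240],
})

top = df.head()     # first 5 rows by default
bottom = df.tail()  # last 5 rows by default
print(top)
print(bottom)
```

Passing a number, for example df.head(10), changes how many rows are returned.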

5. Print summary statistics of the dataset using the describe() function.
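A quick sketch of step 5 on hypothetical sample data (the column names are assumptions). By default describe() summarizes only the numeric columns of a mixed DataFrame.

```python
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford"],
    "Engine HP": [100, 200, 300],
    "MSRP": [10000, 20000, 30000],
})

# Count, mean, std, min, quartiles and max for each numeric column.
summary = df.describe()
print(summary)
```

To include text columns such as the brand as well, describe(include="all") can be used.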
6. Plot a histogram for all the variables.
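One way step 6 could be done is with DataFrame.hist(), which draws one histogram per numeric column. This is a sketch on made-up data; the column names are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this and call plt.show() to view
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford", "Fiat", "Kia"],
    "Year": [2011, 1992, 2015, 2012, 2016],
    "Engine HP": [230, 172, 160, 101, 138],
    "MSRP": [34500, 2000, 19000, 15000, 17000],
})

# One subplot per numeric column; the text column "Make" is skipped.
axes = df.hist(bins=5, figsize=(8, 6))
plt.tight_layout()
```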

7. Draw a box plot to visualize the relationship between vehicle size and engine HP.
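A possible sketch of step 7 uses DataFrame.boxplot() with the by argument to group engine HP by vehicle size. The column names "Vehicle Size" and "Engine HP" and all values are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this and call plt.show() to view
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Vehicle Size": ["Compact", "Compact", "Midsize", "Midsize", "Large", "Large"],
    "Engine HP": [100, 160, 172, 200, 300, 340],
})

fig, ax = plt.subplots()
# One box per vehicle size, showing the spread of engine HP in that group.
df.boxplot(column="Engine HP", by="Vehicle Size", ax=ax)
ax.set_ylabel("Engine HP")

# The same comparison as numbers: median HP per size class.
medians = df.groupby("Vehicle Size")["Engine HP"].median()
print(medians)
```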
8. Build a pair plot using the seaborn library

9. Drop irrelevant columns from the dataset using drop() function.
10. Use rename() function to rename the columns.
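Steps 9 and 10 could be sketched as follows. Which columns count as irrelevant, and the new names chosen, are assumptions made for illustration; the source does not list them.

```python
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Make": ["BMW", "Audi"],
    "Engine HP": [230, 172],
    "Market Category": ["Luxury", "Luxury"],
    "Popularity": [3916, 3105],
})

# Drop columns assumed to be irrelevant for the analysis.
df = df.drop(columns=["Market Category", "Popularity"])

# Rename the remaining columns to shorter names (assumed mapping).
df = df.rename(columns={"Make": "Brand", "Engine HP": "HP"})
print(df.columns.tolist())
```

Both calls return a new DataFrame, so the result is assigned back to df.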

11. Print the total number of duplicate rows.

12. Remove the duplicate rows using the drop_duplicates() function.
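A minimal sketch of steps 11 and 12 on hypothetical data: duplicated() flags every row that repeats an earlier one, so summing the flags gives the duplicate count.

```python
import pandas as pd

# Hypothetical data standing in for car.csv; the last row repeats the first.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford", "BMW"],
    "Engine HP": [230, 172, 160, 230],
})

dup_count = df.duplicated().sum()  # rows identical to an earlier row
print("Duplicate rows:", dup_count)

deduped = df.drop_duplicates()     # keeps the first occurrence of each row
print(deduped.shape)
```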

13. Drop the missing values from the dataset.
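Step 13 might be sketched like this, again on made-up data. Counting the missing values per column first shows how much data dropna() is about to remove.

```python
import pandas as pd

# Hypothetical data standing in for car.csv, with one missing HP value.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford"],
    "Engine HP": [230, None, 160],
})

missing = df.isnull().sum()  # missing values per column
print(missing)

cleaned = df.dropna()        # drops every row that has at least one missing value
print(cleaned.shape)
```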


14. Plot a histogram to find the number of cars per brand.
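Because brand is a category rather than a number, one common way to do step 14 is to count the rows per brand with value_counts() and draw the counts as a bar chart. This sketch assumes the brand column is called "Make"; the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this and call plt.show() to view
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({"Make": ["BMW", "Audi", "BMW", "Ford", "Audi", "BMW"]})

# Number of cars per brand, sorted from most to least frequent.
counts = df["Make"].value_counts()

fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)
ax.set_xlabel("Brand")
ax.set_ylabel("Number of cars")
```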

15. Draw a correlation plot between the variables.
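For step 15, a rough sketch computes the correlation matrix of the numeric columns with corr() and shows it as a colored grid. The source does not say which plotting call it uses, so Matplotlib's imshow() is an assumption here, as are the column names and values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this and call plt.show() to view
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for car.csv.
df = pd.DataFrame({
    "Make": ["BMW", "Audi", "Ford", "Fiat"],
    "Year": [2010, 2012, 2011, 2015],
    "Engine HP": [100, 200, 300, 400],
    "MSRP": [10000, 20000, 30000, 40000],
})

# Pairwise Pearson correlation between the numeric columns only.
corr = df.select_dtypes(include="number").corr()

fig, ax = plt.subplots()
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax)
```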
