Data Analysis With Python - FreeCodeCamp

This document provides an overview of data analysis with Python for beginners. It introduces key concepts such as the data analysis process, popular Python data science libraries such as NumPy and Pandas, and how to use Jupyter Notebooks. The tutorial covers a definition of data analysis, a real-world example of data analysis in Python, introductions to NumPy and Pandas with exercises, data cleaning, reading data from sources such as SQL, CSVs, and APIs, and a quick overview of the Python programming language.

Uploaded by

tekula akhil

Data Analysis with Python
Full tutorial for beginners
Hands-on, online Data Science training.
About this tutorial
1. What is Data Analysis
2. Real example Data Analysis with Python
3. How to use Jupyter Notebooks
4. Intro to NumPy (exercises included)
5. Intro to Pandas (exercises included)
6. Data Cleaning
7. Reading Data: SQL, CSVs, APIs, etc.
8. Python in Under 10 Minutes
What is Data Analysis?
> A process of inspecting, cleansing, transforming and modeling data with
the goal of discovering useful information, informing conclusions and
supporting decision-making.

Definition by Wikipedia.
Data Analysis Tools
Auto-managed closed tools
👎 Closed Source
👎 Expensive 💸
👎 Limited 😩
👍 Easy to learn

Programming Languages
👍 Open Source 🤩
👍 Free (or very cheap) 🤑
👍 Extremely Powerful 💪
👎 Steep learning curve


Why Python for Data Analysis?
Why would we choose Python over R or Julia?

👍 very simple and intuitive to learn

👍 “correct” language

👍 powerful libraries (not just for Data Analysis)

👍 free and open source

👍 amazing community, docs and conferences


When to choose R?
Python, sadly, is not always the answer

● When RStudio is needed

● When dealing with advanced statistical methods

● When extreme performance is needed


The Data Analysis Process
Data Extraction
● SQL
● Scraping
● File Formats
  ○ CSV
  ○ JSON
  ○ XML
● Consulting APIs
● Buying Data
● Distributed Databases

Data Cleaning
● Missing values and empty data
● Data imputation
● Incorrect types
● Incorrect or invalid values
● Outliers and non-relevant data
● Statistical sanitization

Data Wrangling
● Hierarchical Data
● Handling categorical data
● Reshaping and transforming structures
● Indexing data for quick access
● Merging, combining and joining data

Analysis
● Exploration
● Building statistical models
● Visualization and representations
● Correlation vs Causation analysis
● Hypothesis testing
● Statistical analysis
● Reporting

Action
● Building Machine Learning Models
● Feature Engineering
● Moving ML into production
● Building ETL pipelines
● Live dashboard and reporting
● Decision making and real-life tests
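
The five stages above can be sketched end to end in a few lines. This is a minimal illustration that uses only Python's standard library so it runs anywhere. The inline CSV text, the column names (city, sales) and the function names are all hypothetical, and in this tutorial's workflow pandas would do most of this work.

```python
import csv
import io
import statistics

# Hypothetical raw data: an inline CSV with one missing and one invalid value.
RAW_CSV = """city,sales
Austin,120
Boston,
Austin,80
Boston,200
Denver,n/a
"""

def extract(text):
    """Data Extraction: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Data Cleaning: drop rows whose sales value is missing or not a number."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"city": row["city"], "sales": float(row["sales"])})
        except (TypeError, ValueError):
            continue  # missing or invalid value, skip the row
    return cleaned

def wrangle(rows):
    """Data Wrangling: group the sales values by city."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["city"], []).append(row["sales"])
    return grouped

def analyze(grouped):
    """Analysis: compute the mean sales per city."""
    return {city: statistics.mean(values) for city, values in grouped.items()}

def act(summary):
    """Action: pick the city with the highest mean sales."""
    return max(summary, key=summary.get)

rows = extract(RAW_CSV)
summary = analyze(wrangle(clean(rows)))
best_city = act(summary)
print(summary)
print(best_city)
```

Each function maps to one stage of the process, so any stage could be swapped for a pandas-based version without touching the others.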
Data Analysis vs Data Science

The traditional view


Python & PyData Ecosystem
PYTHON ECOSYSTEM:

The libraries we use...


● pandas: The cornerstone of our Data Analysis job with Python

● matplotlib: The foundational library for visualizations. Other libraries we’ll use will be built on top of matplotlib.

● numpy: The numeric library that serves as the foundation for numerical calculations in the Python data ecosystem.

● seaborn: A statistical visualization tool built on top of matplotlib.

● statsmodels: A library with many advanced statistical functions.

● scipy: Advanced scientific computing, including functions for optimization, linear algebra, image processing and much more.

● scikit-learn: The most popular machine learning library for Python (not deep learning)
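
To make the division of labor concrete, here is a small sketch of numpy and pandas working together. The data, column names and variable names are invented for illustration. Only the numpy and pandas calls themselves are real library APIs, and both libraries are assumed to be installed.

```python
import numpy as np
import pandas as pd

# numpy: element-wise math on arrays (hypothetical prices and quantities).
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([3, 1, 2])
revenue = prices * quantities  # vectorized, no explicit loop

# pandas: labeled, tabular data built on top of numpy arrays.
df = pd.DataFrame({
    "product": ["a", "b", "a"],
    "revenue": revenue,
})

# Group by label and aggregate, a typical analysis step.
revenue_by_product = df.groupby("product")["revenue"].sum()
print(revenue_by_product)
```

From here, matplotlib or seaborn could plot the result and statsmodels or scikit-learn could model it; those steps are left out to keep the sketch short.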
How Python Data Analysts Think
EXCEL, TABLEAU, ETC.

They’re all visual tools...


Thinking like a Python Data Analyst
And finally, why Python?
