Introduction To Data Analysis

Lecture slide in Programming for Data Analysis

Uploaded by

Nam Trần
Copyright © All Rights Reserved

INTRODUCTION TO DATA ANALYSIS
Contents

Definition of data analysis
Data analysis vs data analytics
Why do we need data analysis
Types of data analysis
Process flow of data analysis
Types of data
Tools for data analysis
What is data analysis

Data analysis is the process of discovering useful information from raw data to empower data-driven business decisions.

As per Wikipedia, data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Data analysis vs Data analytics

- Data analysis: It is a detailed examination of the elements or structure of something.
- Data analysis is a subpart of data analytics. It is a process of collecting, transforming, and examining data to uncover deep insights.
- We analyze past data to understand the contributing factors or reasons behind activities that have already occurred.

- Data analytics: It is a systematic computational analysis of data or statistics.
- Data analytics uses the outcome of data analysis and goes to the next level of analysis. It uses comprehensive computational models to analyze data at the next level, predicting future possibilities.
Why data analysis

We often do data analysis, knowingly or unknowingly, to make day-to-day decisions.

For example, before purchasing a product from an e-commerce site, we often look at previous customers’ ratings and feedback on that product. Then, after analyzing the product rating, feedback, and other factors, we buy the product.

We review and analyze the available data related to our interest and then decide. This is nothing but simple data analysis.

Data analysis helps organizations make better decisions and make their business more profitable.

So, we can say that data analysis is the backbone of any data-driven business decision.
Types of data analysis

Descriptive data analysis
• Descriptive analysis is one of the most common and primary forms of data analysis. It helps answer “what is happening?” or “what has happened?” in the business. Usually, we use descriptive data analysis to track Key Performance Indicators (KPIs), sales profit/loss, etc.

Diagnostic data analysis
• Diagnostic data analysis helps answer “why did something happen?” Once we get the report of what happened from descriptive analysis, diagnostic data analysis helps us understand which factors caused it. It is more complex than descriptive analysis.

Predictive data analysis
• Predictive data analysis is used to forecast what is likely to happen by analyzing historical data, which helps us answer “what will probably happen in the future?” For example, by analyzing the past years of sales reports, it is possible to predict the coming year’s sales, but this task does not come easily. It needs advanced data analysis and Machine Learning (ML).

Prescriptive data analysis
• Prescriptive data analysis uses the outcomes of all the other types of data analysis. It helps answer “what action should be taken?” to counter a current or predicted problem. So, it prescribes the action(s) as a solution to a specific situation. It needs advanced machine learning and real-time artificial intelligence.
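The first three types of analysis above can be sketched on a toy dataset. This is only an illustration: the sales figures, column names, and the choice of a straight-line fit are assumptions made for the example, and real predictive work usually needs far more data and a properly validated model.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data, invented purely for illustration.
sales = pd.DataFrame({
    "year":    [2021, 2021, 2022, 2022, 2023, 2023],
    "region":  ["North", "South", "North", "South", "North", "South"],
    "revenue": [100.0, 80.0, 110.0, 60.0, 120.0, 40.0],
})

# Descriptive: what happened? Total revenue per year.
yearly = sales.groupby("year")["revenue"].sum()

# Diagnostic: why did it happen? Break the totals down by region
# and look at the year-over-year change in each region.
by_region = sales.pivot_table(
    index="year", columns="region", values="revenue", aggfunc="sum"
)
change = by_region.diff().dropna()

# Predictive: what will probably happen? Fit a straight line to the
# yearly totals (years counted from the first one) and extrapolate.
years_since_start = (yearly.index - yearly.index.min()).to_numpy()
slope, intercept = np.polyfit(years_since_start, yearly.to_numpy(), deg=1)
forecast_2024 = slope * (2024 - yearly.index.min()) + intercept

print(yearly)
print(change)
print(round(forecast_2024, 1))
```

Here the descriptive step shows total revenue falling each year, the diagnostic step shows that the South region drives the decline while North grows, and the predictive step naively extends the trend by one year.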
PROCESS FLOW OF DATA ANALYSIS
Types of data

Structured data
Structured data have a fixed, predefined, and consistent structure. This type of data is most effective for analysis. For example, relational data is organized in rows and columns.

Semi-structured data
Semi-structured data has a partially defined structure. Though it is not fully relational, its structure is manageable to understand and process. For example, CSV data, JSON data, XML data, and so on.

Unstructured data
Unstructured data means there is no predefined structure of the data. This is a bit complex to process and store, and we need some advanced capacity, tools, and methods to analyze and process such data. For example, PDF, image, text log, audio/video data, and so on.

Tools for data analysis in Python
Python has many libraries for data analysis, data visualization, and data modeling, like IPython, Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, NLTK, Keras, TensorFlow, and so on. Not all of them are in scope for this book, but some are common and used by the data science and data analytics community.
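Before turning to the individual libraries, the three categories of data described above can be contrasted in a short sketch that uses only Python's standard library. The table name, the JSON records, and the log line are all hypothetical and chosen only to show how much structure each kind of data gives us for free.

```python
import json
import re
import sqlite3

# Structured: a relational table with a fixed schema (rows and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "pen", 1.5), (2, "book", 12.0)],
)
total_price = conn.execute("SELECT SUM(price) FROM products").fetchone()[0]
conn.close()

# Semi-structured: JSON records share a general shape, but a field
# (here "tags") may be nested or missing, so we handle that explicitly.
json_text = '[{"id": 1, "name": "pen", "tags": ["office"]}, {"id": 2, "name": "book"}]'
records = json.loads(json_text)
tags_per_record = [record.get("tags", []) for record in records]

# Unstructured: a free-text log line has no predefined fields, so we
# have to impose structure ourselves, for example with a pattern match.
log_line = "User alice logged in from 10.0.0.5 after 3 failed attempts"
ip_address = re.search(r"\d+\.\d+\.\d+\.\d+", log_line).group()

print(total_price, tags_per_record, ip_address)
```

The further down the sketch goes, the more of the structure has to be supplied by the analyst rather than by the data itself.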
IPython
• IPython is an interactive shell for Python. Its web-based notebook interface later became the Jupyter Notebook, which supports several programming languages. It is mainly used to write, test, and execute Python programs that analyze and visualize data.
Pandas
• Pandas is a popular data analysis and data exploration library that provides a structured representation of data. It makes data manipulation, cleansing, aggregation, merging, and so on much easier.
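As a minimal sketch of the Pandas operations named above (cleansing, merging, and aggregation), consider two small tables. The table contents and column names are hypothetical, and dropping rows with missing values is only one of several possible cleansing strategies.

```python
import pandas as pd

# Hypothetical orders and customers, invented for illustration.
orders = pd.DataFrame({
    "order_id":    [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount":      [25.0, None, 40.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city":        ["Hanoi", "Da Nang", "Hue"],
})

# Cleansing: drop orders whose amount is missing.
clean = orders.dropna(subset=["amount"])

# Merging: attach each customer's city to their orders.
merged = clean.merge(customers, on="customer_id", how="left")

# Aggregation: total order amount per city.
per_city = merged.groupby("city")["amount"].sum()

print(per_city)
```

Because the only Da Nang order had a missing amount, that city does not appear in the final totals, which shows how a cleansing choice can affect later results.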
NumPy
• NumPy is a fundamental library for array- and vector-based mathematical operations.
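A short sketch of what array-based operations look like in NumPy is given here. The prices and quantities are made-up values; the point is that arithmetic applies to whole arrays at once, without an explicit loop.

```python
import numpy as np

# Hypothetical unit prices and quantities sold for three products.
prices = np.array([1.5, 12.0, 3.0])
quantities = np.array([10, 2, 4])

# Element-wise multiplication: revenue per product.
revenue = prices * quantities

# Reductions over the whole array.
total_revenue = revenue.sum()
mean_revenue = revenue.mean()

# The same total expressed as a vector dot product.
total_via_dot = prices @ quantities

print(revenue, total_revenue, mean_revenue, total_via_dot)
```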
Matplotlib
• Matplotlib is a widely used data visualization library. It helps us represent data in various visual graphs, such as line plots, bar charts, histograms, etc.
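The three chart types mentioned above can be drawn side by side as follows. The monthly figures are invented, and the non-interactive "Agg" backend is selected only so that the sketch runs without a display; in a notebook this line is usually unnecessary.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; an assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical monthly sales figures.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [12, 15, 9, 18]

fig, (ax_line, ax_bar, ax_hist) = plt.subplots(1, 3, figsize=(12, 3))

ax_line.plot(months, sales, marker="o")  # trend over time
ax_line.set_title("Line plot")

ax_bar.bar(months, sales)                # comparison between months
ax_bar.set_title("Bar chart")

ax_hist.hist(sales, bins=3)              # distribution of the values
ax_hist.set_title("Histogram")

fig.tight_layout()

# Render to an in-memory PNG; pass a file name instead to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```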
