Week 5, Class1 - Introduction To Data Analytics


AI/ML

Data Analytics
What is Data Science?

“Data science is an interdisciplinary field that
uses scientific methods, processes, algorithms
and systems to extract knowledge and insights
from noisy, structured and unstructured data
and apply knowledge and actionable insights
from data across a broad range of application
domains….”
- Wikipedia’s Definition of Data Science

Image thanks to Serap Baysal


What do I need to learn?

Image thanks to Benjamin Obi Tayo, Ph.D


Our Focus:

● Coding with Python
● Data Wrangling
● Data Visualization
Data Analytics
• We want to analyze data and draw conclusions from the
information.
• This information is critical in making key business decisions.
• The common saying that "data is the new oil" is not an exaggeration.
• Data has become pervasive and is used in many industries, e.g.
health, finance, education, automotive,…
Data Analytics (Types)
• Descriptive – understand what happened, e.g. the number of
accidents increased year-on-year.
• Diagnostic – why did it happen, e.g. why has the number of
accidents increased?
• Predictive – what will happen, e.g. will the number of accidents
increase?
• Prescriptive – what action should be taken, e.g. what should be done
to reduce the number of accidents?
We shall focus primarily on the descriptive and diagnostic aspects for
this module.
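To make the first two types concrete, here is a minimal sketch using plain Python. The accident counts and cause labels are invented purely for illustration and are not real figures. The first part answers the descriptive question (what happened), the second the diagnostic one (why).

```python
# Hypothetical yearly accident counts, split by recorded cause (invented numbers).
accidents = {
    2021: {"speeding": 40, "weather": 25, "mechanical": 15},
    2022: {"speeding": 62, "weather": 24, "mechanical": 16},
}

# Descriptive: what happened? Compare the yearly totals.
totals = {year: sum(causes.values()) for year, causes in accidents.items()}
change = totals[2022] - totals[2021]
pct_change = 100 * change / totals[2021]
print(f"Total accidents: {totals}")
print(f"Year-on-year change: {change:+d} ({pct_change:.1f}%)")

# Diagnostic: why did it happen? Find the cause with the largest increase.
by_cause = {
    cause: accidents[2022][cause] - accidents[2021][cause]
    for cause in accidents[2021]
}
main_driver = max(by_cause, key=by_cause.get)
print(f"Change by cause: {by_cause}")
print(f"Largest increase: {main_driver}")
```

A breakdown like this only points to where the increase is concentrated. Explaining the cause properly usually needs more data and domain knowledge.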
Data Analytics (Tools)
• There are many tools that can be used depending on the task and the
environment.
• For example, businesses typically use enterprise tools like
Microsoft Excel, Tableau and Power BI.
• The need for automation and fine-tuning also means that programming
tools such as R and Python have become very popular.
• For this module we shall focus on using Python as the main tool.
• NOTE: You can use whatever tools you prefer for the mini-project.
Data Analytics (What we shall do)
• Pandas for handling data
• Data loading
• Data wrangling
• Exploration
• Data engineering (introduction)
• Visualization
• Matplotlib and Seaborn
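As a rough preview of the Pandas steps listed above (loading, wrangling, exploration), the sketch below works on a tiny made-up table. The column names and values are hypothetical, and an in-memory string stands in for a real CSV file. Visualization is left out here.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
raw_csv = io.StringIO(
    "student,age,score\n"
    "Amina,20,71\n"
    "Brian,22,\n"
    "Carol,21,85\n"
    "David,20,64\n"
)

# Data loading: read the CSV into a DataFrame.
df = pd.read_csv(raw_csv)

# Data wrangling: drop rows where the score is missing.
clean = df.dropna(subset=["score"])

# Exploration: summary statistics and the mean score per age.
print(clean.describe())
mean_by_age = clean.groupby("age")["score"].mean()
print(mean_by_age)
```

With a real dataset, `pd.read_csv` would take a file path instead of the in-memory string.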
Types of Data
• Data can mainly be grouped as Categorical or Numerical.
• Categorical data takes on specific values, e.g. rating a safe boda
rider can only be done in predefined star categories.
• Numerical data represents a numeric quantity which can take on
any value (sometimes within a given range), e.g. age of students in a
class.
Categorical Data
• Can take on two forms:
• Nominal - the categories do not have any underlying order, e.g.
assigning male or female sex to a subject or selecting a direction from
the compass.
• Ordinal - the categories have some underlying order, e.g. rating a product
as bad, average or good. There is an inherent ordering which implies that
good > average > bad.
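One way to express the nominal/ordinal distinction in Pandas is with `pd.Categorical`, sketched below. The sample values are made up, and this is only one possible representation.

```python
import pandas as pd

# Nominal: compass directions have no natural order.
directions = pd.Categorical(["north", "east", "north", "west"])

# Ordinal: ratings are ordered bad < average < good.
ratings = pd.Categorical(
    ["good", "bad", "average", "good"],
    categories=["bad", "average", "good"],
    ordered=True,
)

print(directions.ordered)
print(ratings.ordered)

# Ordered categories support min/max and comparisons.
print(ratings.min(), ratings.max())
is_above_bad = ratings > "bad"
print(is_above_bad)
```

Comparisons such as `ratings > "bad"` are only meaningful for the ordered (ordinal) case. For nominal data, only equality checks make sense.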
Numerical Data
• Also takes on two forms based on range:
• Discrete - takes countable values that cannot be subdivided,
e.g. age of a person in completed years.
• Continuous - has an infinite number of possible values and can usually be
measured, e.g. weight of a person, money earned by a business.
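A quick, rough way to tell the two forms apart in code is to check whether every value is a whole number. The helper `looks_discrete` and the sample values below are hypothetical. This is only a heuristic, since continuous quantities are sometimes recorded as rounded whole numbers.

```python
# Hypothetical measurements for three people.
ages_in_years = [19, 21, 20]          # discrete: whole-number counts
weights_in_kg = [61.4, 72.85, 58.0]   # continuous: any value within a range


def looks_discrete(values):
    """Heuristic: return True when every value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(ages_in_years))
print(looks_discrete(weights_in_kg))
```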
Questions on types of data
• What type of data is this?
• Rainfall in millimeters recorded for the past 5 years.
• Force in newtons required to bend different springs.
• Birth month of students in a class