Chapter IV Data Exploration and Visualization

Uploaded by

warlitollapitan06


CHAPTER 4

Learning Module in
IT Inst 3 – DATA SCIENCE ANALYTICS

INTENDED LEARNING OUTCOMES:

At the end of the lesson, the students are expected to:

• Explain the importance of using graphs and charts; and
• Create data visualizations to effectively communicate insights.

LESSON: Data Exploration and Visualization

I. Data Exploration
Exploratory Data Analysis (EDA) is a critical initial step in the data analysis process that
examines and analyzes data to understand its characteristics, patterns, and relationships.
This involves visually exploring the data, summarizing its main features, and identifying
potential trends, outliers, and anomalies.

Univariate Analysis:
Univariate analysis involves the examination, across cases, of one variable at a time. There are
three major characteristics of a single variable that we tend to look at:
• the distribution
• the central tendency
• the dispersion
In most situations, we would describe all three of these characteristics for each of the
variables in our study.

The Distribution: The distribution is a summary of the frequency of individual values
or ranges of values for a variable. The simplest distribution would list every value of a
variable and the number of persons who had each value.

For instance, a typical way to describe the distribution of college students is by
year in college, listing the number or percent of students at each of the four years.
Or, we describe gender by listing the number or percent of males and females. In
these cases, the variable has few enough values that we can list each one and
summarize how many sample cases had the value. But what do we do for a variable
like income or GPA? With these variables, there can be a large number of possible
values, with relatively few people having each one. In this case, we group the raw
scores into categories according to ranges of values. For instance, we might look at
GPA according to the letter grade ranges. Or, we might group income into four or five
ranges of income values.
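
The grouping step described above can be sketched in Python. The GPA values and the range boundaries below are made up for illustration; they are assumptions, not data from this module.

```python
from collections import Counter

# Hypothetical GPA values (illustrative only, not data from this module).
gpas = [1.9, 2.4, 2.7, 3.1, 3.3, 3.5, 3.6, 3.8, 2.9, 3.0]

# Assumed ranges: (label, lower bound inclusive, upper bound exclusive).
ranges = [
    ("under 2.0", 0.0, 2.0),
    ("2.0 to under 3.0", 2.0, 3.0),
    ("3.0 and above", 3.0, float("inf")),
]

def label_for(value):
    """Return the label of the range that contains value."""
    for label, low, high in ranges:
        if low <= value < high:
            return label
    raise ValueError(f"{value} falls outside every range")

# Count how many raw scores land in each range.
frequency = Counter(label_for(g) for g in gpas)

for label, _, _ in ranges:
    print(f"{label:<18} {frequency[label]}")
```

Each raw score is mapped to exactly one range, so the counts always add up to the number of cases.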

Frequency distribution table.

One of the most common ways to describe a single variable is with a frequency
distribution. Depending on the particular variable, all of the data values may be
represented, or you may group the values into categories first. With age, price,
or temperature variables, for example, it would usually not be sensible to determine
the frequencies for each value. Rather, the values are grouped into ranges and the
frequencies are determined for each range. Frequency distributions can be depicted
in two ways, as a table or as a graph. For example, Table 1 shows an age frequency
distribution with three categories of age ranges defined. A frequency distribution
can also be depicted as a graph, as Figure 1 does for civil status.

Sample Percentage Distribution in Table and Graph Format

Profile of the Respondents

Profile of the Respondents    Frequency    Percentage

Age
  25 – 39                     21           52.5
  40 – 50                     14           35.0
  51 – 64                      5           12.5
Civil Status
  Single                       6           15.0
  Married                     34           85.0
Sex
  Male                        14           35.0
  Female                      26           65.0
Status of Appointment
  Permanent                   37           92.5
  Job Order                    3            7.5
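
As a minimal sketch of how the Percentage column follows from the Frequency column, the snippet below recomputes the age percentages from the counts in the table. The function name `to_percentages` is an illustrative choice, not something defined in this module.

```python
# Frequencies copied from the Age rows of the table above.
age_frequency = {"25 - 39": 21, "40 - 50": 14, "51 - 64": 5}

def to_percentages(frequency):
    """Convert a mapping of category -> count into category -> percent of total."""
    total = sum(frequency.values())
    return {category: 100 * count / total for category, count in frequency.items()}

age_percentage = to_percentages(age_frequency)

for category, percent in age_percentage.items():
    print(f"{category}: {percent:.1f}")
```

The same function applies to the other profile variables, since each group of rows sums to the same 40 respondents.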

Figure 1. Profile of the Respondents based on Civil Status

[Pie chart: Civil Status. Single 15%, Married 85%]

Frequency distribution using bar chart.

Distributions may also be displayed using percentages. For example, you could
use percentages to describe the:
• percentage of people in different income levels
• percentage of people in different age ranges
• percentage of people in different ranges of standardized test scores
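
A percentage distribution can be drawn as a bar chart. In practice a plotting library would normally be used; the sketch below stays within the Python standard library and prints a text-only chart. The function name `text_bar_chart` and the 40-character width are assumptions made for illustration.

```python
# Percentages copied from the Status of Appointment rows of the profile table.
percentages = {"Permanent": 92.5, "Job Order": 7.5}

def text_bar_chart(data, width=40):
    """Return one line per category, with a bar scaled so 100% spans `width` characters."""
    lines = []
    for category, percent in data.items():
        bar = "#" * round(percent / 100 * width)
        lines.append(f"{category:<10} {bar} {percent}%")
    return lines

chart = text_bar_chart(percentages)
for line in chart:
    print(line)
```

The bar lengths are proportional to the percentages, which is the property that lets a reader compare categories at a glance.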

Central Tendency: The central tendency of a distribution is an estimate of the
"center" of a distribution of values. There are three major types of estimates of
central tendency:
• Mean
• Median
• Mode

The Mean or average is probably the most commonly used method of describing
central tendency. To compute the mean, add up all the values and divide by the
number of values. For example, the mean quiz score is determined by summing all
the scores and dividing by the number of students taking the quiz. Consider the
test score values:

15, 20, 21, 20, 36, 15, 25, 15

The sum of these 8 values is 167, so the mean is 167/8 = 20.875.
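
The same arithmetic can be checked with a short Python sketch, using the eight test scores listed above and comparing the manual result with `statistics.mean` from the standard library.

```python
from statistics import mean

# The eight test scores from the example above.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

total = sum(scores)            # add up all the values
average = total / len(scores)  # divide by the number of values

print(total, average)

# The library function should agree with the manual calculation.
library_average = mean(scores)
```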

The Median is the score found at the exact middle of the set of values. One way
to compute the median is to list all scores in numerical order, and then locate the
score in the center of the sample. For example, if there are 499 scores in the list,
score #250 would be the median. If we order the 8 scores shown above, we would
get:

15, 15, 15, 20, 20, 21, 25, 36

There are 8 scores and scores #4 and #5 represent the halfway point. Since both
of these scores are 20, the median is 20. If the two middle scores had different
values, you would have to interpolate to determine the median, typically by taking
the average of the two.
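
A minimal Python sketch of this procedure follows. It sorts the eight scores, picks the middle position (or averages the two middle scores when the count is even), and compares the result with `statistics.median`. The variable names are illustrative.

```python
from statistics import median

# The eight test scores from the example above.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

ordered = sorted(scores)
n = len(ordered)
middle = n // 2

if n % 2 == 1:
    # Odd count: a single score sits at the exact middle.
    manual_median = ordered[middle]
else:
    # Even count: average the two scores around the halfway point.
    manual_median = (ordered[middle - 1] + ordered[middle]) / 2

print(ordered, manual_median)

library_median = median(scores)
```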

The Mode is the most frequently occurring value in the set of scores. To
determine the mode, you might again order the scores as shown above, and then
count each one. The most frequently occurring value is the mode. In our example,
the value 15 occurs three times and is the mode. In some distributions, there is
more than one modal value. For instance, in a bimodal distribution, there are two
values that occur most frequently.
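
Counting each value can be sketched with `collections.Counter`, and `statistics.multimode` (Python 3.8 or later) returns every modal value, which covers the bimodal case. The second list below is a made-up example, not data from this module.

```python
from collections import Counter
from statistics import multimode

# The eight test scores from the example above.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Count how often each score occurs and take the most frequent one.
counts = Counter(scores)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)

# multimode lists all values tied for the highest count.
modes = multimode(scores)

# Hypothetical bimodal data: 1 and 2 both occur twice.
bimodal_modes = multimode([1, 1, 2, 2, 3])
print(modes, bimodal_modes)
```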

Notice that for the same set of 8 scores, we got three different values -- 20.875,
20, and 15 -- for the mean, median and mode respectively. If the distribution is truly
normal (i.e., bell-shaped), the mean, median and mode are all equal to each other.
