Chapter 2 Descriptive Statistics

This chapter discusses descriptive statistics and how to organize and summarize data through tables, graphs and numerical measures. It covers frequency distribution tables, bar charts, pie charts, histograms and stem-and-leaf displays to organize qualitative and quantitative data. Measures of central tendency like mean, median and mode are used to describe the center of data. Measures of variation such as range, standard deviation and variance quantify how spread out numbers are. These statistical concepts help analyze and describe key aspects of data.

Uploaded by

musiccharacter07

CHAPTER 2: DESCRIPTIVE STATISTICS

CHAPTER DISCUSSIONS

2.1 Organizing Data

Organizing and Graphing qualitative data
• Frequency distribution table
• Contingency table/Two-way table
• Bar Chart
• Pie Chart
Organizing and Graphing quantitative data
• Histogram
• Stem and Leaf
2.2 Numerical Descriptive Measures (ungrouped data)
Measures of Central Tendency
• Mean
• Median
• Mode
Measures of Variation
• Range
• Standard deviation and Variance
• Coefficient of Variation (CV)
Measures of Skewness
Measures of Position
• Quartiles (Q1, Q2 and Q3)
• Box-and-Whisker Plot

2.1 ORGANIZING DATA

ORGANIZING AND GRAPHING QUALITATIVE DATA

Frequency distribution table

A frequency distribution table describes information concerning a single variable at a time. It is used here for qualitative data.

e.g.

Contingency table/Two-way table

A contingency table describes information on two or more variables at the same time by
cross-classifying the observations. The simplest form is the two-way table, which
cross-classifies two variables. An example of a contingency table is shown below.

e.g.
Distribution of students according to sex and programme at College XYZ in 2004

                         Sex
Programme           Male   Female   Total
Business Studies     270      180     450
Banking Studies      125      125     250
Art and Design        10       90     100
Accountancy           80      120     200
Total                485      515   1,000

Source : College XYZ
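
The row and column totals of the table above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary layout and the variable names are illustrative choices, not part of the original notes.

```python
# Counts from the College XYZ table: programme -> (male, female).
counts = {
    "Business Studies": (270, 180),
    "Banking Studies": (125, 125),
    "Art and Design": (10, 90),
    "Accountancy": (80, 120),
}

# Row totals: number of students in each programme.
row_totals = {prog: male + female for prog, (male, female) in counts.items()}

# Column totals: number of students of each sex, and the grand total.
male_total = sum(male for male, _ in counts.values())
female_total = sum(female for _, female in counts.values())
grand_total = male_total + female_total

for prog, (male, female) in counts.items():
    print(f"{prog:<18}{male:>6}{female:>8}{row_totals[prog]:>8}")
print(f"{'Total':<18}{male_total:>6}{female_total:>8}{grand_total:>8}")
```

The last printed line reproduces the Total row of the table (485, 515 and 1000).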

Bar charts

i) Simple bar chart

e.g. Illustrate the following information in a vertical bar chart.

Year                 1997   1998   1999   2000
Number of students    250    300    450    500
Answer

ii) Multiple bar chart

e.g. Draw a multiple bar chart to illustrate the following information.

Course   2000   2001   2002
DBS        23     28     36
DIB        16     20     36
DPA        48     50     70
Answer

iii) Component bar chart

e.g. Draw a component bar chart to illustrate the following information.

Course        2001   2002
Statistics      35     45
Mathematics     45     60
Accounting      20     45
Answer

Pie charts

e.g. Use the data in example 5 and draw a pie chart for courses attended in 2001.
Answer
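
The arithmetic behind a pie chart is a conversion of each count into a share of the whole and then into an angle out of 360 degrees. The sketch below assumes that "example 5" refers to the 2001 column of the component bar chart data (Statistics 35, Mathematics 45, Accounting 20); the function name `pie_slices` is illustrative.

```python
# 2001 counts, assumed to be the data the pie chart exercise refers to.
courses_2001 = {"Statistics": 35, "Mathematics": 45, "Accounting": 20}

def pie_slices(counts):
    """Return {category: (percentage, angle in degrees)} for a pie chart."""
    total = sum(counts.values())
    return {name: (100 * n / total, 360 * n / total) for name, n in counts.items()}

slices = pie_slices(courses_2001)
for name, (percent, angle) in slices.items():
    print(f"{name:<12}{percent:6.1f}%{angle:8.1f} degrees")
```

Under that assumption the three sectors come to 126, 162 and 72 degrees, which sum to 360.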

ORGANIZING AND GRAPHING QUANTITATIVE DATA

Histogram

A histogram is a bar graph of a frequency distribution. It consists of a set of vertical bars
placed side by side with no gaps between them.

The distribution is said to be positively skewed.

Shape of the Distribution

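Bell-shaped / normally distributed
The distribution has a single peak and is symmetric.
Right-skewed / positively skewed
The peak of the distribution is to the left, and the longer tail extends to the right.
Left-skewed / negatively skewed
The peak of the distribution is to the right, and the longer tail extends to the left.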

Stem-and-leaf

The stem-and-leaf display is a valuable tool for organizing a set of data and understanding
how the values distribute and cluster over the range of observations in the data set. It shows
the values of the original observations while also revealing the shape of the distribution.
The display separates data entries into stems (leading digits on the left) and leaves
(trailing digits on the right).

e.g. The following are the marks obtained by 20 students in a quiz:

12, 15, 16, 20, 25, 25, 26, 27, 30, 31, 33, 38, 42, 42, 43, 45, 46, 49, 50, 50.

Represent the data in a stem-and-leaf display. Comment.

Answer
Unit = 10
1 | 2 represents 12

1 | 2 5 6
2 | 0 5 5 6 7
3 | 0 1 3 8
4 | 2 2 3 5 6 9
5 | 0 0

From the display, we observe that

1) The observations range from 12 to 50;
2) The largest cluster of observations (6 of the 20) falls between 42 and 49;
3) The shape of the distribution is not symmetrical.

e.g. Raw data of a sample of weekly income of 8 secretaries are as follows:

RM490, RM550, RM570, RM590, RM620, RM640, RM710, RM830
Represent the data in a stem-and-leaf display.

Answer
Unit = 100
4 | 9 represents 490

4 | 9
5 | 5 7 9
6 | 2 4
7 | 1
8 | 3
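
Both displays above can be reproduced with a short Python sketch. The function name `stem_and_leaf` and its `leaf_unit` parameter are illustrative assumptions: `leaf_unit` is the place value of a leaf (1 for the quiz marks, 10 for the weekly incomes), and any digits below it are simply dropped.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Group values into {stem: string of leaves}, in ascending order."""
    display = defaultdict(str)
    for value in sorted(values):
        scaled = value // leaf_unit      # drop digits below the leaf unit
        stem, leaf = divmod(scaled, 10)  # last remaining digit is the leaf
        display[stem] += str(leaf)
    return dict(display)

marks = [12, 15, 16, 20, 25, 25, 26, 27, 30, 31,
         33, 38, 42, 42, 43, 45, 46, 49, 50, 50]
incomes = [490, 550, 570, 590, 620, 640, 710, 830]

for stem, leaves in stem_and_leaf(marks).items():
    print(f"{stem} | {leaves}")

for stem, leaves in stem_and_leaf(incomes, leaf_unit=10).items():
    print(f"{stem} | {leaves}")
```

Sorting the values first keeps the leaves in each row in ascending order, which is what makes the display readable as a sideways histogram.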

2.2 NUMERICAL DESCRIPTIVE MEASURES (UNGROUPED DATA)

MEASURES OF CENTRAL TENDENCY

A measure of central tendency is a single value (a number or, for qualitative data, a
category) that is used to represent a data set. It is also known as an average. There are
3 measures: the arithmetic mean, the median, and the mode.

The Arithmetic Mean ($\bar{X}$)

The arithmetic mean, or simply the mean, is the sum of all the observations divided by the
number of observations. It is also known as the simple average. To compute the mean, the
data must be at least at the interval level of measurement.

Ungrouped data

$$\bar{X} = \frac{\sum X}{n}$$

e.g. X : 110, 112, 98, 100, 115, 95, 100

Find the mean.

Answer

$$\bar{X} = \frac{\sum X}{n} = \frac{110 + 112 + 98 + 100 + 115 + 95 + 100}{7} = \frac{730}{7} = 104.3$$
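
The same calculation can be written as a minimal Python sketch; the helper name `mean` is an illustrative choice.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(values) / len(values)

x = [110, 112, 98, 100, 115, 95, 100]
print(sum(x), len(x), round(mean(x), 1))  # 730 7 104.3
```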
Median ($\tilde{X}$)

The median is the value located at the center of the ordered data (the array). As such, 50%
of the observations are below the median and 50% are above it. The data must be at least
at the ordinal level of measurement.

e.g. X : 110, 112, 98, 100, 115, 95, 100

Find the median for the above data.

Answer
Array X : 95, 98, 100, 100, 110, 112, 115

Location of $\tilde{X} = \frac{n+1}{2} = \frac{7+1}{2} = 4$

∴ $\tilde{X} = 100$

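e.g. X : 40, 50, 80, 90, 100, 120, 112, 115

Find the median.

Answer
Array X : 40, 50, 80, 90, 100, 112, 115, 120

Location of $\tilde{X} = \frac{n+1}{2} = \frac{8+1}{2} = 4.5$

The median lies halfway between the 4th and 5th observations of the array.

∴ $\tilde{X} = \frac{90 + 100}{2} = 95$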
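
The location rule used in both examples can be sketched in Python as follows. The function name `median` is illustrative; the data are sorted first, so the unsorted second data set is handled correctly.

```python
def median(values):
    """Median via the (n + 1) / 2 location rule on the sorted array."""
    array = sorted(values)
    n = len(array)
    location = (n + 1) / 2              # 1-based position in the array
    lower = array[int(location) - 1]    # observation at the whole-number part
    if location == int(location):       # odd n: the location is a whole number
        return lower
    upper = array[int(location)]        # even n: average the two middle values
    return (lower + upper) / 2

print(median([110, 112, 98, 100, 115, 95, 100]))     # 100
print(median([40, 50, 80, 90, 100, 120, 112, 115]))  # 95.0
```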

The Mode ($\hat{X}$)

The mode is the value (or character) of the observation that appears most frequently. It is
the most popular value (or character). It can be computed for all levels of measurement of
data: nominal, ordinal, interval, and ratio.

e.g. Find mode for the following data set.

X : 17, 17, 18, 19, 19, 18, 20, 23, 19
Answer
∴ $\hat{X} = 19$

e.g. What is the value of mode for these data?

X : 47, 48, 49, 47, 49, 47, 49, 50
Answer
∴ $\hat{X} = 47$ and $49$ (bimodal)

e.g. Find the modal value for these data

X : 90, 91, 92, 93, 94, 95, 96, 97
Answer
$\hat{X}$ does not exist, because no value appears more often than the others.
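
The three cases above (one mode, two modes, no mode) can be sketched in Python. The function name `modes` is illustrative, and the rule "no mode when every distinct value occurs equally often" is an assumption that generalizes the third example.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes; empty when no value stands out."""
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if several distinct values all occur equally often,
    # no value is "most frequent", so the mode does not exist.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([17, 17, 18, 19, 19, 18, 20, 23, 19]))  # [19]
print(modes([47, 48, 49, 47, 49, 47, 49, 50]))      # [47, 49]
print(modes([90, 91, 92, 93, 94, 95, 96, 97]))      # []
```

Because the function only counts occurrences, it also works for nominal data such as a list of category labels.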

Comparisons between the mean, the median, and the mode

(i) $\bar{X}$ is popular because it is based on all the observations. It can be manipulated
algebraically. However, $\bar{X}$ is sensitive to outliers (or extreme values).
(ii) $\tilde{X}$ is especially useful for ordinal data. It is not sensitive to outliers, so it is a
suitable measure for skewed data sets.
(iii) $\hat{X}$ is useful in describing nominal and ordinal data. It is not affected by extreme values.
There might be more than one mode or none at all.

Relationship between the mean, the median, and the mode

a) If mean = median = mode, the distribution is symmetrical (bell-shaped), with no skewness.
b) If mean > median, the distribution is skewed to the right or positively-skewed.
c) If mean < median, the distribution is skewed to the left or negatively-skewed.

e.g. The distribution of students’ weights has the following summary statistics.

$\bar{X}$ = 60 kg, $\tilde{X}$ = 55 kg, $\hat{X}$ = 52 kg

Determine the shape of the distribution of students’ weights. Interpret your findings.

Answer
Since $\bar{X} > \tilde{X} > \hat{X}$, the distribution is said to be positively skewed. This means
that most students have weights less than 60 kg.

MEASURES OF VARIATION

              Data Set A   Data Set B   Data Set C
                 100          100           80
                 100          105           70
                 100          102          120
                 100          103          140
                 100           90           90

$\bar{X}$:       100          100          100

All three data sets have the same mean, yet the values are spread out very differently in each set. The variations can be measured by:
(i) Range
(ii) Standard deviation and Variance
(iii) Coefficient of Variation

Range (R)

$$R = X_{\text{largest}} - X_{\text{smallest}}$$
e.g.
For the above data sets, compute range for each of them.
Answer
Data set A : R = 100 – 100 = 0
Data set B : R = 105 – 90 = 15
Data set C : R = 140 – 70 = 70
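
A minimal Python sketch of the same comparison is given below; the dictionary name `data_sets` is illustrative. It shows that the three sets share a mean of 100 while their ranges differ.

```python
data_sets = {
    "A": [100, 100, 100, 100, 100],
    "B": [100, 105, 102, 103, 90],
    "C": [80, 70, 120, 140, 90],
}

# Range = largest value minus smallest value.
ranges = {name: max(values) - min(values) for name, values in data_sets.items()}
means = {name: sum(values) / len(values) for name, values in data_sets.items()}

for name in data_sets:
    print(f"Data set {name}: mean = {means[name]:.0f}, R = {ranges[name]}")
```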
Standard deviation ($S$) and Variance ($S^2$)

Standard deviation is the square root of variance; equivalently, variance is the square of
standard deviation, i.e.

Standard deviation = $\sqrt{\text{Variance}}$ ;

Variance = $(\text{Standard deviation})^2$

$$S = \sqrt{\frac{1}{n-1}\left[\sum X^2 - \frac{\left(\sum X\right)^2}{n}\right]}$$

e.g.
X : 17, 17, 18, 19, 19, 18, 20, 23, 19
Calculate the standard deviation.

Answer

$$S = \sqrt{\frac{1}{n-1}\left[\sum X^2 - \frac{\left(\sum X\right)^2}{n}\right]} = \sqrt{\frac{1}{9-1}\left[3238 - \frac{(170)^2}{9}\right]} = 1.83$$
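
The computational formula can be checked with a short Python sketch. The function name `sample_std` is illustrative; the final line compares the result with `statistics.stdev` from the standard library, which also computes the sample (n − 1) standard deviation.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation via the computational (shortcut) formula."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(v * v for v in values)
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return math.sqrt(variance)

x = [17, 17, 18, 19, 19, 18, 20, 23, 19]
print(sum(x), sum(v * v for v in x))                     # 170 3238
print(round(sample_std(x), 2))                           # 1.83
print(math.isclose(sample_std(x), statistics.stdev(x)))  # True
```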

Coefficient of variation (CV)

Whenever two samples have the same units of measure, the variance and standard
deviation for each can be compared directly. For example, suppose an automobile dealer
wanted to compare the standard deviation of miles driven for the cars she received as trade-
ins on new cars. She found that for a specific year, the standard deviation for Buicks was
422 miles and the standard deviation for Cadillacs was 350 miles. She could say that the
variation in mileage was greater in the Buicks. But what if a manager wanted to compare the
standard deviations of two different variables, such as the number of sales per salesperson
over a 3-month period and the commissions made by these salespeople?

A statistic that allows you to compare standard deviations when the units are different, as in
this example, is called the coefficient of variation. Coefficient of variation is used to compare
dispersion between data sets.

$$CV = \frac{S}{\bar{X}} \times 100\%$$

Interpretation of CV:-
(i) Data set with larger CV means the distribution is more dispersed compared to data set
with smaller CV.
(ii) Data set with smaller CV means the distribution is more consistent (or more uniform)
compared to data set with larger CV.

e.g. Compare the consistency of the number of hours spent watching television among the
3 locations below.

Residential        Industrial   Town

35 43 36 39 28     27 15         8 14 12 15
28 29 25 38 27     49 25        30 32 21 20
26 32 29 40 35      4 41        34  7 11 24
41 37 31 45 34     10 30
Answer

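Residential

$\bar{X} = \frac{678}{20} = 33.9$

$S = \sqrt{\frac{1}{20-1}\left[23656 - \frac{(678)^2}{20}\right]} = 5.95$

$CV = \frac{5.95}{33.9} \times 100\% = 17.6\%$

Industrial

$\bar{X} = \frac{201}{8} = 25.13$

$S = \sqrt{\frac{1}{8-1}\left[6677 - \frac{(201)^2}{8}\right]} = 15.25$

$CV = \frac{15.25}{25.13} \times 100\% = 60.7\%$

Town

$\bar{X} = \frac{228}{12} = 19$

$S = \sqrt{\frac{1}{12-1}\left[5296 - \frac{(228)^2}{12}\right]} = 9.36$

$CV = \frac{9.36}{19} \times 100\% = 49.3\%$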

The distribution of time spent watching television in the Residential area is the most
consistent, because it has the smallest CV. The least consistent is the Industrial area.
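
The comparison can be reproduced with the sketch below. The function name `coefficient_of_variation` and the dictionary `hours` are illustrative, and the split of the table into three columns follows the sample sizes (20, 8 and 12) used in the answer.

```python
import statistics

def coefficient_of_variation(values):
    """CV = sample standard deviation / mean x 100%."""
    return statistics.stdev(values) / statistics.mean(values) * 100

hours = {
    "Residential": [35, 43, 36, 39, 28, 28, 29, 25, 38, 27,
                    26, 32, 29, 40, 35, 41, 37, 31, 45, 34],
    "Industrial": [27, 15, 49, 25, 4, 41, 10, 30],
    "Town": [8, 14, 12, 15, 30, 32, 21, 20, 34, 7, 11, 24],
}

cv = {area: coefficient_of_variation(values) for area, values in hours.items()}

# List the areas from most consistent (smallest CV) to least consistent.
for area, value in sorted(cv.items(), key=lambda item: item[1]):
    print(f"{area:<12}{value:6.1f}%")
```

The code carries full precision through the division, so the last digit can differ slightly from a hand calculation that rounds S and the mean first; the ordering of the three areas is unaffected.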

MEASURES OF SKEWNESS

Pearson’s coefficient of skewness is given as

Coefficient of skewness 1 = $\frac{\text{mean} - \text{mode}}{\text{standard deviation}}$

Coefficient of skewness 2 = $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$

Interpretation of the coefficient:-

Coefficient   Type of distribution
> 0           Positively skewed (or skewed to the right)
= 0           Symmetrical (or normal distribution)
< 0           Negatively skewed (or skewed to the left)

e.g.

The following statistics were calculated for the distribution of money received by 50
children.

$\bar{X}$ = 115.60; $\tilde{X}$ = 115.33; $\hat{X}$ = 114.55; $S$ = 14.49

Determine the shape of the distribution by computing the coefficient of skewness.

Answer

Coefficient of skewness 1 = $\frac{\bar{X} - \hat{X}}{S} = \frac{115.60 - 114.55}{14.49} = 0.07 \approx 0$

The distribution is said to be approximately symmetrical.

p.s. The computation can also be done using the second coefficient of skewness.

Coefficient of skewness 2 = $\frac{3(\bar{X} - \tilde{X})}{S} = \frac{3(115.60 - 115.33)}{14.49} = 0.06 \approx 0$
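
Both coefficients are simple enough to check with a small Python sketch; the function names are illustrative.

```python
def pearson_skewness_1(mean, mode, std_dev):
    """First coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / std_dev

def pearson_skewness_2(mean, median, std_dev):
    """Second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev

sk1 = pearson_skewness_1(115.60, 114.55, 14.49)
sk2 = pearson_skewness_2(115.60, 115.33, 14.49)
print(round(sk1, 2), round(sk2, 2))  # 0.07 0.06
```

Both values are small and positive, which is consistent with reading the distribution as approximately symmetrical.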

MEASURES OF POSITION

Quartiles

Quartiles divide an array into 4 equal parts. Thus there are 3 quartiles: the first quartile ($Q_1$),
the second quartile ($Q_2$), and the third quartile ($Q_3$).

|  25%  |  25%  |  25%  |  25%  |
        Q1      Q2      Q3

e.g.
Find the first, second and third quartiles for the following data.
X: 95, 98, 100, 100, 110, 112, 115
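Answer

Location of $Q_1 = \frac{n+1}{4} = \frac{7+1}{4} = 2$, i.e. the 2nd observation
∴ $Q_1 = 98$

Location of $Q_2 = \frac{n+1}{2} = \frac{7+1}{2} = 4$, i.e. the 4th observation
∴ $Q_2 = 100$

Location of $Q_3 = \frac{3(n+1)}{4} = \frac{3(7+1)}{4} = 6$, i.e. the 6th observation
∴ $Q_3 = 112$
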
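
The location rule for quartiles can be sketched in Python as below. The function name `quartile` is illustrative. The example above only has whole-number locations; using linear interpolation between neighbouring observations when the location is fractional is an assumption, chosen to match the way the median is averaged for an even number of observations.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3), located at k(n + 1)/4 in the sorted array."""
    array = sorted(values)
    n = len(array)
    location = k * (n + 1) / 4      # 1-based position, may be fractional
    whole = int(location)
    fraction = location - whole
    # Assumption: clamp positions to 1..n so very small samples do not fail.
    lower = array[min(max(whole, 1), n) - 1]
    upper = array[min(max(whole + 1, 1), n) - 1]
    # Assumption: interpolate linearly when the location is not a whole number.
    return lower + fraction * (upper - lower)

x = [95, 98, 100, 100, 110, 112, 115]
print([quartile(x, k) for k in (1, 2, 3)])  # [98.0, 100.0, 112.0]
```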
Box-and-Whisker Plot

The Box-and-Whisker Plot is used to display some information on measures of position and
to determine the shape of the distribution. These plots involve five specific values:

1. The lowest value of the data set (minimum)

2. Q1
3. The median
4. Q3
5. The highest value of the data set (maximum)

These values are called a five-number summary of the data set.

A boxplot is a graph of a data set obtained by drawing a horizontal line from the minimum
data value to Q1, drawing a horizontal line from Q3 to the maximum data value, and drawing
a box whose vertical sides pass through Q1 and Q3 with a vertical line inside the box
passing through the median or Q2.

Procedure for constructing a boxplot

1. Find the five-number summary for the data values, that is, the maximum and minimum
data values, Q1 and Q3, and the median.
2. Draw a horizontal axis with a scale such that it includes the maximum and minimum data
values.
3. Draw a box whose vertical sides go through Q1 and Q3, and draw a vertical line through
the median.
4. Draw a line from the minimum data value to the left side of the box and a line from the
maximum data value to the right side of the box.

Information obtained from a Box Plot.

1. a. If the median is near the center of the box, the distribution is approximately symmetric.
b. If the median falls to the left of the center of the box, the distribution is positively
skewed.
c. If the median falls to the right of the center, the distribution is negatively skewed.

2. a. If the lines are about the same length, the distribution is approximately symmetric.
 b. If the right line is longer than the left line, the distribution is positively skewed.
 c. If the left line is longer than the right line, the distribution is negatively skewed.
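
The two reading rules above can be sketched in Python. The function names are illustrative. `statistics.quantiles` with its default "exclusive" method places the quartiles at (n + 1)-based positions, matching the location rule used in this chapter. Treating "about the same length" as exact equality is a simplifying assumption; in practice a tolerance would be needed.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

def compare(left, right):
    """Classify the shape from a left-hand and a right-hand length."""
    if left == right:  # simplification of "about the same"
        return "approximately symmetric"
    return "positively skewed" if left < right else "negatively skewed"

def boxplot_shape(values):
    low, q1, med, q3, high = five_number_summary(values)
    return {
        "median within box": compare(med - q1, q3 - med),
        "whisker lengths": compare(q1 - low, high - q3),
    }

x = [95, 98, 100, 100, 110, 112, 115]
print(five_number_summary(x))
print(boxplot_shape(x))
```

For this small data set the median sits to the left of the center of the box while the two whiskers are equally long, so the two rules need not agree; they are guides for reading a plot rather than a formal test.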
