Chapter 4

Introduction
Statistics is very useful nowadays. It enables researchers to find
solutions to problems, whether personal or societal, and to interpret the
implications of these solutions for our everyday lives. Furthermore, through statistics,
many improvements and inventions have been produced that have an impact on
society.
Statistics may be used in education, politics, economics, and the like. It
also gives us information on trends in society and helps us
discover problems that need urgent solutions.
STATISTICS AND ITS IMPORTANCE
Key Concepts
• Statistics is a science that deals with the collection, organization,
analysis, and interpretation of data.
- Collection means gathering relevant information or data from the
population through survey, test, interview, experiment, etc.
- Organization or presentation refers to the systematic arrangement
of data into textual form, table, graph, or chart.
- Analysis is the careful examination of data, which may involve the use
of statistical tools.
- Interpretation of data is making a generalization or conclusion from
the data that have been analyzed.
• Population - the group from which data are to be collected.

• Sample - a subset of a population.
• Variable - a characteristic of any member of a population that differs
in quality or quantity from one member to another.
• Quantitative variable - a variable differing in quantity. For example, the
weight of a person, number of people in a car.
• Qualitative variable - a variable differing in quality or attribute. For
example, color, the degree of damage of a car in an accident.
• Discrete variable - a variable for which no value may be assumed between
two given consecutive values, for example, the number of children in a family. Its
values are whole numbers and are usually counts of objects.
• Continuous variable - a variable for which any value may be assumed
between two given values, for example, the length and width of a
rectangular table measuring 3.5 meters by 1.75 meters.
Two Divisions of Statistics:

1. Descriptive Statistics:
Descriptive statistics deals with the collection of data, its presentation in
various forms, such as tables, graphs, and diagrams, and the finding of averages
and other measures that describe the data.
Example:
Industrial statistics, population statistics, and trade statistics are descriptive.
Businessmen also use descriptive statistics in presenting their annual reports,
final accounts, and bank statements.
2. Inferential Statistics:
Inferential statistics deals with techniques used for analysis of data,
making predictions, comparisons, and drawing conclusions about a
population using information gathered about a representative portion or
sample of that population.
Worktext in Mathematics in the Modern World 82

Example:
Suppose we want to have an idea about the percentage of indigents
in our country. We take a sample from the population and find the
proportion of indigents in the sample. This sample proportion with the help
of probability enables us to make some inferences about the population
proportion.
Importance of Statistics
Statistics plays a vital role in every field of human activity. Statistical
methods are useful tools that aid research and studies in different fields such
as education, economics, social sciences, business, health, and many others. They
help provide more critical analyses of information. Examples:
(1) In Economics: Economics largely depends upon statistics. National income
accounts are multipurpose indicators for economists and administrators, and
statistical methods are used in preparing these accounts.
(2) In Natural and Social Sciences: Statistical methods are commonly used for
analyzing experimental results and testing their significance in Biology, Physics,
Chemistry, Mathematics, Meteorology, research chambers of commerce, Sociology,
Business, Public Administration, Communication and Information Technology, etc.
MEASURES OF CENTRAL TENDENCY

A measure of central tendency or measure of central location is a summary
measure that describes a whole set of data. It is a single number that indicates
the center of a collection of data. The most commonly used measures of central
tendency are the mean, median, and mode.
A. Mean, Median and Mode of Ungrouped Data

MEAN (𝑥̅ )
The mean, also called the “average” or “arithmetic mean/average”, is
the most commonly used measure of central tendency. It is said to be the
most reliable measure of central tendency. To calculate the mean, add all the
numbers in a set and then divide the sum by the total count of numbers.
Properties of Mean
1. A set of data has only one mean.
2. All values in the data set are included in computing the mean.
3. It is very useful in comparing two or more data sets.
4. It is affected by extremely small or large values in a data set.
5. It is appropriate for symmetrical data.
Mean: $\bar{x} = \frac{\text{Sum of all values}}{\text{Number of values}}$

Sample Mean: $\bar{x} = \frac{\sum x}{n}$
Population Mean: $\mu = \frac{\sum x}{N}$
Where:
𝑥̅ -sample mean (read as “x bar”)
𝜇 -population mean (read as “mu”)
𝑥 -the value of any particular observation or measurement
𝛴𝑥 -sum of all values
𝑛 -total number of values in the sample
𝑁 -total number of values in the population
Illustrative Examples:
1. Jean has been working part-time at a fast-food company. The following
numbers represent the number of hours Jean has worked at this fast-food
company for each of the past 8 months: 30, 45, 43, 60, 71, 82, 71, 83. What
is the mean (average) number of hours that Jean worked at this company?
Solution:
Step 1: Add the numbers to determine the total number of hours Jean
worked.
30 + 45 + 43 + 60 + 71 + 82 + 71 + 83 = 485
Step 2: Divide the total by the number of months.
$\frac{485}{8} = 60.625 \approx 60.63$ hours/month
The average number of hours Jean worked at the fast-food company is
60.63 hours/month.
2. Joseph operates Technology Giant, a Website service that employs 8
people. Find the mean age (in years) of his workers if the ages of the
employees are as follows:
26, 23, 30, 25, 29, 33, 38, 35
Solution:
Step 1: Add the numbers to determine the total age of the workers.
26 + 23 + 30 + 25 + 29 + 33 + 38 + 35 = 239
Step 2: Divide the total by the number of workers.
$\frac{239}{8} = 29.875$ or about 30 years
The average age of Joseph’s workers is about 30 years.
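The two examples above can be checked with a short Python sketch. The function name `mean` and the variable names are illustrative choices, not part of the worktext.

```python
def mean(values):
    """Arithmetic mean: the sum of all values divided by the number of values."""
    return sum(values) / len(values)

# Example 1: hours Jean worked in each of the past 8 months.
hours = [30, 45, 43, 60, 71, 82, 71, 83]
# Example 2: ages of the 8 workers.
ages = [26, 23, 30, 25, 29, 33, 38, 35]

print(mean(hours))  # unrounded mean of the hours
print(mean(ages))   # unrounded mean of the ages
```

The sketch prints the unrounded means; the worktext reports them rounded (60.63 hours/month and about 30 years).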
Weighted Mean/Average
The weighted mean/average is particularly useful when various classes or
groups contribute differently to the total.

The weighted mean/average may be calculated by using the following
three-step procedure:
1. multiply each value by its corresponding weight;
2. find the sum of those products; and
3. divide that sum by the sum of the weights.
The following formula expresses the procedure:

$\bar{x} = \frac{\sum wx}{\sum w} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$
where $w$ represents the weight and $x$ represents the data value.
Illustrative Example:
Rena, a fourth-year student majoring in mathematics, took the following
courses with the corresponding units and grades during the first semester of the
school year. What is her average grade?
Course Title                                  Unit    Grade
The Teaching Profession                         3      1.4
Field Study 5                                   1      1.3
Field Study 6                                   1      1.5
Special Topic 3                                 1      1.8
Calculus 1                                      3      1.7
Calculus 11                                     3      1.8
Seminar on Technology in Mathematics            3      1.2
Abstract Algebra                                3      1.8
Mathematical Investigation and Modelling        3      1.7
TOTAL                                          21
Solution:
$\bar{x} = \frac{3(1.4) + 1(1.3) + 1(1.5) + 1(1.8) + 3(1.7) + 3(1.8) + 3(1.2) + 3(1.8) + 3(1.7)}{21}$
$\bar{x} = \frac{33.4}{21}$
$\bar{x} = 1.59$
The weighted average grade of Rena is 1.59.
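As a rough check of the three-step procedure, the following Python sketch recomputes Rena's weighted average. The function name `weighted_mean` and the variable `gwa` are assumptions made for illustration.

```python
def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the sum of the weights."""
    products = [w * x for x, w in zip(values, weights)]  # step 1
    return sum(products) / sum(weights)                   # steps 2 and 3

# Units (weights) and grades (values), in the order listed in the table.
units = [3, 1, 1, 1, 3, 3, 3, 3, 3]
grades = [1.4, 1.3, 1.5, 1.8, 1.7, 1.8, 1.2, 1.8, 1.7]

gwa = weighted_mean(grades, units)
print(round(gwa, 2))
```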
MEDIAN (𝑥̃)
The median is the value that falls in the middle position after the data
have been arranged in an array, in either ascending or descending order.

Properties of Median
1. It is unique; there is only one median for a set of data.
2. It is found by arranging the set of data in ascending or descending
order and getting the value of the middle observation.
3. It is not affected by extreme values.
To determine the value of the median for ungrouped data, consider two rules.
1. If n is odd, the median is the middle-ranked value.
2. If n is even, the median is the average of the two middle-ranked
values.
Position of the median: $\frac{n+1}{2}$
1. Find the median of the following data: 12, 3, 17, 8, 14, 10, 6
Solution:
Step 1: Organize the data in an array.
3, 6, 8, 10, 12, 14, 17
Step 2: Since the number of data values is odd (n = 7), the median is the value
in the middle position, $\frac{n+1}{2} = \frac{7+1}{2} = 4$, that is, the value
found in the fourth position of the array.
3, 6, 8, [10], 12, 14, 17
The median is 10.
2. Find the median of the following data: 7, 9, 3, 4, 15, 2, 8, 6, 2, 4

Solution:
Step 1: Arrange the data in an array.
2, 2, 3, 4, 4, 6, 7, 8, 9, 15
Step 2: Since the number of data values is even, the median is
the mean of the values found before and after the $\frac{n+1}{2}$
position.
$\frac{n+1}{2} = \frac{10+1}{2} = \frac{11}{2} = 5.5$
Step 3: The value before the 5.5 position (the 5th value) is 4 and the value after
it (the 6th value) is 6. Now find the mean of these two values.
2, 2, 3, 4, [4, 6], 7, 8, 9, 15
$\frac{4+6}{2} = 5$
The median is 5.
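The two rules for the median can be sketched in Python as follows. The function name `median` is an illustrative choice. Python lists are indexed from 0, so the index arithmetic differs from the $\frac{n+1}{2}$ position used in the text.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)          # arrange the data in an array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # n is odd: the middle-ranked value
        return ordered[mid]
    # n is even: average of the two middle-ranked values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([12, 3, 17, 8, 14, 10, 6]))          # Example 1
print(median([7, 9, 3, 4, 15, 2, 8, 6, 2, 4]))    # Example 2
```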

MODE (𝑥̂)
The mode (𝑥̂) is the value in a data set that appears most frequently.
1. Find the mode of the following data: 76, 81, 76, 80, 76, 83, 77, 79, 82, 76
Solution:
There is no need to organize the data in an array, unless you think
that it would be easier to locate the mode if the numbers are in an array. In
the above data set, the number 76 appears four times, while all the other numbers
appear only once. Since 76 appears with the greatest frequency, it is the
mode of the data set.
2. The ages of 12 randomly selected customers at a local 7-Eleven are listed
below:
21, 21, 29, 24, 31, 21, 27, 24, 24, 32, 33, 19
What is the mode of the above ages?
Solution:
The above data set has two values that each occur with a frequency of
3. The data set therefore has two modes, 21 and 24, and is called bimodal. All other
values occur less often.
3. The coach of a sports team begins to observe the colors of the t-shirts his athletes
wear. His goal is to find out what color is worn most frequently so that he
can offer a common color for uniform shirts to his athletes.
Monday: Green, Blue, Pink, White, Blue, and Blue
Tuesday: Blue, Red, Black, Pink, Green, and Blue
Wednesday: Orange, White, White, Blue, Blue, and Red
Thursday: Brown, Black, Brown, Blue, White, and Blue
Friday: Black, Blue, Red, Blue, Red, and Pink
What is the mode of the colors above?
Solution:
The color blue was worn 11 times during the week. All other colors
were worn much less frequently than blue, so blue is the mode.
The coach can offer blue shirts to his athletes.
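Counting frequencies by hand becomes tedious for longer lists. The sketch below uses `collections.Counter` from the Python standard library. The helper name `modes` is an assumption for illustration, and it returns every value that ties for the highest frequency, so a bimodal data set yields two values.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

scores = [76, 81, 76, 80, 76, 83, 77, 79, 82, 76]            # Example 1
ages = [21, 21, 29, 24, 31, 21, 27, 24, 24, 32, 33, 19]      # Example 2

print(modes(scores))  # one mode
print(modes(ages))    # two modes (bimodal)
```

The same helper works for qualitative data such as the shirt colors in Example 3, since it only counts how often each label appears.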

Name: Date:
Curriculum and Section: Score:
Try this!
Directions: Answer the following.
A. Solve for the mean, median, and mode of each of the following data sets and
interpret the results.
1. 54, 50, 54, 55, 56, 57, 57, 58, 58, 60, 68

2. 45, 48, 52, 46, 41, 26, 36, 34, 38, 41, 39, 38, 30, 49, 46, 55
3. 154, 133, 232, 267, 289, 274, 321, 348, 188, 439
B. Ben and his friends are comparing the number of times they have been to
the movies in the past year. The table below illustrates how many times
each person went to the movie theatre in each month.
Jan Feb Mar Apr May June July Aug Sept Oct Nov Dec
Ben 3 3 2 5 2 3 2 4 2 3 2 2
John 3 2 1 1 1 3 3 3 2 4 1 2
Matthew 1 3 3 2 1 4 5 3 2 2 2 3
Rose 2 2 2 1 3 2 4 1 3 2 3 3
1. By comparing modes, which person went to the movies the least per
month?
2. By comparing medians, which person went to the movies the most per
month?
3. Rank the friends in order of most movies seen to least movies seen
by comparing their means.
4. Which month, by comparing the means of movies seen in each
month, is the most popular movie-watching month?
5. By comparing medians, which month is the least popular month?
6. What is the mean of the medians for each month (the arithmetic
average of the medians of the number of movies seen in each
month)?

Definition of Terms
Raw Data is the data collected in its original form.
Range is the difference between the highest value and the lowest value in the
distribution.
Frequency Distribution Table is the organization of data in tabular form,
using mutually exclusive classes and showing the frequency or count of
the occurrences of values in the sample.
Class Interval/width/size (i) is the distance between the lower class boundary and
the upper class boundary of a class.
Class Limit is the smallest or largest observation (data, events, etc.) that can belong to a
class. Hence, each class has two limits: a lower and an upper limit.
Class Boundary or True Class Limit is 0.5 more than an upper class limit
or 0.5 less than a lower class limit. Therefore, each class has an upper
and a lower class boundary, or true upper and lower class limit.
Midpoint or Class Mark (X) is found by adding the upper and lower class
limits of any class and dividing the sum by 2.
Frequency (f) is the number of values in a specific class of a frequency
distribution table.
Cumulative Frequency (cf) is the sum of the frequencies accumulated up
to the upper boundary of a class in a frequency distribution table.
Frequency Distribution Table

1. Construct a frequency distribution table for the following total scores in the
1st Quarter Quizzes in a Mathematics class.
118, 123, 128, 129, 130, 130, 133, 124, 125, 127, 136, 138,
141, 141, 149, 154, 150
Solution:
The following steps are involved in the construction of a frequency
distribution.
1. Decide the approximate number of classes into which the data are to be
grouped. There are no hard and fast rules for the number of classes. In
most cases we have 5 to 20 classes. H.A. Sturges provides a formula
for determining the approximate number of classes:
K = 1 + 3.322 log N, where K = number of classes,
N = the total number of observations, and log is the common (base-10) logarithm

K = 1 + 3.322 log 17
K = 5.09
K ≈ 5
2. Find the range of the data. The range is the difference between the
largest and the smallest value.
Range (R) = 154 – 118 = 36
3. Determine the approximate class interval/width/size (i). The class
interval is obtained by dividing the range by the number of classes.
$i = \frac{\text{Range}}{\text{Number of classes}}$
Class size, $i = \frac{36}{5} = 7.2$
In the case of fractional results, the next higher whole number is
taken as the size of the class interval.
Class size (i) = 7.2 becomes 8
4. Decide the starting point. The lower class limit of the first class should cover the
smallest value in the raw data. Use the lowest data value as the first
lower class limit. The lowest value is 118.
5. Determine the remaining class limits. Once the lowest class limit has
been determined, add the class width/size to it to find the next lower
class limit (118 + 8 = 126). The
remaining lower class limits may be determined by adding the class
size repeatedly until the largest value of the data falls within a
class. To compute the upper class limit of the first class, subtract one from
the class width and add the result to the lowest class limit. For example:
118 + (8 – 1) = 125
The classes may be listed in ascending or descending order:
Ascending      Descending
118 – 125      150 – 157
126 – 133      142 – 149
134 – 141      134 – 141
142 – 149      126 – 133
150 – 157      118 – 125
6. Tally the observations or scores in each class, and determine the
frequency. The total of the frequencies must be equal to the number of
observations. The scores are:
118, 123, 128, 129, 130, 130, 133, 124, 125, 127, 136, 138, 141, 141,
149, 154, 150

Frequency Distribution Table
Score        Tally      Frequency (f)
118 – 125    IIII            4
126 – 133    IIIII I         6
134 – 141    IIII            4
142 – 149    I               1
150 – 157    II              2
Total                       17
By using the frequency distribution table above:
a) What are the lower and upper class limits of the first two classes?
For the first class 118 – 125 the lower class limit is 118 and the
upper class limit is 125. For the second class, 126-133, the lower
class limit is 126 and the upper class limit is 133.
b) What are the true class limits/class boundaries of the first two
classes?
For the first class, 118 – 125, the lower class boundary is
118 – 0.5 = 117.5, and for the second class, 126 – 133, it is
126 – 0.5 = 125.5.
For the first class, 118 – 125, the true upper limit or upper class boundary is
125 + 0.5 = 125.5, and for the second class, 126 – 133, it is
133 + 0.5 = 133.5.
c) What is the class interval/width/size?

The class interval/width/size is the distance between the lower and upper
class boundaries of a class. It can be obtained by getting the
difference between the lower limits or the upper limits of two succeeding
classes.
For the two succeeding classes 118 – 125 and 126 – 133,
the class width is 126 – 118 = 8 or 133 – 125 = 8.
d) Find the class midpoint or class mark of the first class.

For the first class, 118 – 125, the class midpoint is
$X = \frac{\text{Lower class limit} + \text{Upper class limit}}{2}$
$X = \frac{118 + 125}{2}$
$X = 121.5$
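The construction steps above can be sketched in Python. This is a minimal sketch under the same choices made in the worked example: Sturges' formula rounded to the nearest whole number, the class size rounded up, and the lowest score used as the first lower class limit. The variable names are illustrative.

```python
import math

scores = [118, 123, 128, 129, 130, 130, 133, 124, 125, 127, 136, 138,
          141, 141, 149, 154, 150]

# Step 1: approximate number of classes (Sturges' formula, base-10 log).
k = round(1 + 3.322 * math.log10(len(scores)))
# Step 2: range of the data.
data_range = max(scores) - min(scores)
# Step 3: class size, taking the next higher whole number.
width = math.ceil(data_range / k)

# Steps 4 to 6: build the classes from the lowest score and count frequencies.
table = []
lower = min(scores)
while lower <= max(scores):
    upper = lower + width - 1
    frequency = sum(lower <= s <= upper for s in scores)
    table.append((lower, upper, frequency))
    lower += width

for lower, upper, frequency in table:
    midpoint = (lower + upper) / 2        # class mark
    print(f"{lower}-{upper}  f={frequency}  X={midpoint}")
```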

B. Mean, Median, Mode of Grouped Data
MEAN (𝑥̅ ) OF GROUPED DATA
Steps in computing the mean of grouped data:
a. Find the midpoint/class mark (𝑋) of each class.
b. Multiply the frequency (f) of each class by its midpoint (𝑋) to get 𝑓𝑋.
c. Find the sum of fX.
d. Find the sum of all the frequencies (n).
e. Divide the sum of fX by the sum of the frequencies.
Formula for the Mean ($\bar{x}$) of Grouped Data

$\bar{x} = \frac{\sum fX}{n}$
Nowadays most people spend their leisure time on Facebook. Fifty-eight
students in a class recorded the time they spent on Facebook during their free
time. The frequency distribution table below shows the number of minutes
they spent on Facebook.
Classes        f       X        fX
5 – 9          1       7         7
10 – 14        2      12        24
15 – 19        2      17        34
20 – 24        6      22       132
25 – 29        7      27       189
30 – 34       10      32       320
35 – 39       30      37     1 110
          Σf = n = 58      ΣfX = 1 816
Solution: Applying the formula,
$\bar{x} = \frac{\sum fX}{n} = \frac{1\,816}{58} = 31.31$
Therefore, the mean time that the 58 students spent on Facebook is
31.31 minutes.
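The grouped mean can be recomputed with a short Python sketch. Each class is written as a hypothetical `(lower limit, upper limit, frequency)` tuple, and the function name `grouped_mean` is an illustrative choice.

```python
# (lower limit, upper limit, frequency) for each class in the table.
facebook_minutes = [(5, 9, 1), (10, 14, 2), (15, 19, 2), (20, 24, 6),
                    (25, 29, 7), (30, 34, 10), (35, 39, 30)]

def grouped_mean(classes):
    """Sum of f*X divided by n, where X is the class midpoint."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return sum_fx / n

print(round(grouped_mean(facebook_minutes), 2))
```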
MEDIAN ($\tilde{x}$) OF GROUPED DATA

Steps in computing the median for grouped data
a. Compute the less than cumulative frequency (<cf) of the data. The
less than cumulative frequency (<cf) is obtained by adding the
frequencies successively starting from the lowest class.
b. Determine the median class by computing the value of $\frac{n}{2}$. The median
class is the first class whose <cf is at least $\frac{n}{2}$.
c. Determine the value of the cumulative frequency before the median
class (<cf).
d. Determine the true lower class limit ($L_{\tilde{x}}$) and the frequency ($f_m$) of the median class.
e. Determine the class width (i).
f. Apply the formula.
Formula of Median of Grouped Data:

$\tilde{x} = L_{\tilde{x}} + \left( \frac{\frac{n}{2} - {<}cf}{f_m} \right) i$
(Using the same data as in the previous lesson.) The data show the time, in minutes,
spent by 43 students studying for an examination in their math course.
Find the median.
Classes       f     <cf
25 – 29       3      3
30 – 34       2      5
35 – 39       5     10
40 – 44       8     18
45 – 49       8     26    (median class)
50 – 54       8     34
55 – 59       9     43
Solution: Follow the steps in determining the median for grouped data.
a) $\frac{n}{2} = \frac{43}{2} = 21.5$. The first class whose <cf is at least 21.5 is
45 – 49, so it is the median class.
b) The cumulative frequency before* the median class (<cf) is 18.
(*If the classes are arranged in ascending order, "before" refers to the
cumulative frequency of the class immediately below the median class.)
c) The frequency of the median class ($f_m$) is 8.
d) The true lower class limit of the median class is $L_{\tilde{x}} = 45 - 0.5 = 44.5$.
e) The class width is $i = 5$.
$\tilde{x} = L_{\tilde{x}} + \left( \frac{\frac{n}{2} - {<}cf}{f_m} \right) i$
$\tilde{x} = 44.5 + \left( \frac{\frac{43}{2} - 18}{8} \right) 5$
$\tilde{x} = 44.5 + \left( \frac{21.5 - 18}{8} \right) 5$
$\tilde{x} = 44.5 + 2.1875$
$\tilde{x} = 46.69$
The median time spent by the students in studying is 46.69 minutes.
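The steps for the grouped median can be sketched in Python as follows. This is a minimal sketch that assumes whole-number class limits (so each boundary is 0.5 below the lower limit) and classes listed in ascending order. The function name `grouped_median` and the tuple layout are illustrative.

```python
# (lower limit, upper limit, frequency), in ascending order.
study_minutes = [(25, 29, 3), (30, 34, 2), (35, 39, 5), (40, 44, 8),
                 (45, 49, 8), (50, 54, 8), (55, 59, 9)]

def grouped_median(classes):
    """L + ((n/2 - <cf) / f_m) * i, using the first class whose <cf reaches n/2."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    cf_before = 0                        # <cf of the classes below the current one
    for lo, hi, f in classes:
        if cf_before + f >= target:      # this is the median class
            boundary = lo - 0.5          # true lower class limit
            width = hi - lo + 1          # class size i
            return boundary + (target - cf_before) / f * width
        cf_before += f

print(round(grouped_median(study_minutes), 2))
```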

MODE ($\hat{x}$) OF GROUPED DATA
Steps in computing the mode for grouped data:
a. Identify the modal class by determining the class with the highest
frequency.
b. Determine the true lower limit or class boundary ($L_{\hat{x}}$) of the modal
class.
c. Calculate $d_1$, the difference between the frequency of the modal class and
the frequency of the class preceding the modal class (one class lower in
value).
d. Calculate $d_2$, the difference between the frequency of the modal class
and the frequency of the class succeeding the modal class (one class higher
in value).
e. Determine the class width/size (i).
f. Substitute the values in the formula.
Formula of Mode of Grouped Data:
$\hat{x} = L_{\hat{x}} + \left( \frac{d_1}{d_1 + d_2} \right) i$
The data show the time, in minutes, spent by 43 students studying for an
examination in their math course. Find the mode and interpret the result.
Classes       f
25 – 29       3
30 – 34       2
35 – 39       5
40 – 44       7
45 – 49       6
50 – 54       8
55 – 59       9    (modal class)
Solution:
a) 55 – 59 is the modal class.
b) The lower boundary of the modal class is $L_{\hat{x}} = 55 - 0.5 = 54.5$.
c) $d_1 = 9 - 8 = 1$
d) $d_2 = 9 - 0 = 9$ (no class follows the modal class, so the succeeding frequency is taken as 0)
e) $i = 5$
f) Substitute the values in the formula.
$\hat{x} = L_{\hat{x}} + \left( \frac{d_1}{d_1 + d_2} \right) i$
$\hat{x} = 54.5 + \left( \frac{1}{1 + 9} \right) 5$
$\hat{x} = 54.5 + 0.5 = 55$
Therefore, the most common time spent studying for the math exam is about
55 minutes.
Note: The above formula for finding the mode of grouped data applies only to a
unimodal distribution.
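The mode formula can be sketched in Python as shown below. The sketch assumes a unimodal distribution with whole-number class limits, and treats a missing neighboring class as having frequency 0, as the worked solution does. The function name `grouped_mode` and the tuple layout are illustrative.

```python
# (lower limit, upper limit, frequency), as printed in the table above.
study_minutes = [(25, 29, 3), (30, 34, 2), (35, 39, 5), (40, 44, 7),
                 (45, 49, 6), (50, 54, 8), (55, 59, 9)]

def grouped_mode(classes):
    """L + (d1 / (d1 + d2)) * i for the class with the highest frequency."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                          # modal class
    lo, hi, f_modal = classes[m]
    f_before = freqs[m - 1] if m > 0 else 0              # preceding class
    f_after = freqs[m + 1] if m < len(freqs) - 1 else 0  # succeeding class
    d1 = f_modal - f_before
    d2 = f_modal - f_after
    return (lo - 0.5) + d1 / (d1 + d2) * (hi - lo + 1)

print(grouped_mode(study_minutes))
```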

OTHER MEASURES OF RELATIVE POSITION
Measures of relative position or location, also called quantiles, are used to
partition or divide an ordered data set (array) into equal parts, as the median does.
The common measures of relative position are the quartiles, deciles, and
percentiles.
The median divides the ordered data into 2 equal parts, while quartiles divide a
data set into four equal parts. There are three quartiles: Quartile 1 (Q1), also called the
lower quartile, is the value below which 25% of the data lie; Quartile 2 (Q2),
which is equivalent to the median, is the value below which 50% of the data lie;
and Quartile 3 (Q3), also called the upper quartile, is the value below which 75% or
three-fourths of the data lie.
Deciles divide the ordered data set into ten equal parts. There are 9
deciles, denoted by D1, D2, …, D9. Decile 1 or D1 is the value below
which 10% of the data lie.
Percentiles divide the ordered data set into one hundred equal parts. There
are 99 percentiles, denoted by P1, P2, …, P99. Percentile 1 or P1 is the value
below which 1% of the data lie.
Interquartile range (IQR) = Upper quartile – Lower quartile

Note: A quantile is a number or cut-off, and not a range of values.
The quantiles in a given distribution are related as follows:
Q1 = P25
Q2 = P50 = D5 = $\tilde{x}$
Q3 = P75
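The formulas are as follows:

Quartile
Ungrouped data (position in the array): $Q_k = \frac{k(N+1)}{4}$
Grouped data: $Q_k = LB_{Q_k} + \left( \frac{\frac{kN}{4} - {<}CF_b}{cf_{Q_k}} \right) i$
or, in the notation of the median formula, $Q_k = L + \left( \frac{\frac{kn}{4} - {<}cf}{f_m} \right) i$
Where:
$Q_k$ - kth quartile
$LB_{Q_k}$ - lower class boundary of the kth quartile class
${<}CF_b$ - less than cumulative frequency below the kth quartile class
$cf_{Q_k}$ - frequency of the kth quartile class

Decile
Ungrouped data (position in the array): $D_k = \frac{k(N+1)}{10}$
Grouped data: $D_k = LB_{D_k} + \left( \frac{\frac{kN}{10} - {<}CF_b}{cf_{D_k}} \right) i$
Where:
$D_k$ - kth decile
$LB_{D_k}$ - lower class boundary of the kth decile class
${<}CF_b$ - less than cumulative frequency below the kth decile class
$cf_{D_k}$ - frequency of the kth decile class

Percentile
Ungrouped data (position in the array): $P_k = \frac{k(N+1)}{100}$
Grouped data: $P_k = LB_{P_k} + \left( \frac{\frac{kN}{100} - {<}CF_b}{cf_{P_k}} \right) i$
Where:
$P_k$ - kth percentile
$LB_{P_k}$ - lower class boundary of the kth percentile class
${<}CF_b$ - less than cumulative frequency below the kth percentile class
$cf_{P_k}$ - frequency of the kth percentile class
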
The monthly salaries in pesos of 16 DepEd Elementary teachers are
as follows:
Teacher    Salary      Teacher    Salary
1          30,531      9          22,216
2          32,469      10         32,072
3          36,942      11         21,038
4          20,754      12         21,038
5          23,222      13         21,038
6          21,327      14         20,754
7          37,400      15         20,754
8          45,269      16         20,754
Find the a) lower quartile (Q1), b) 7th decile, and c) 30th percentile.

Solution:
First, arrange the observations in an array.
Position   Salary
16         45,269
15         37,400
14         36,942
13         32,469
12         32,072
11         30,531    D7
10         23,222
9          22,216
8          21,327
7          21,038
6          21,038
5          21,038    P30
4          20,754    Q1
3          20,754
2          20,754
1          20,754
a) lower quartile (Q1)

Substitute the values in the formula:
$Q_k = \frac{k(N+1)}{4}$
$Q_1 = \frac{1(16+1)}{4}$
$Q_1 = \frac{17}{4} = 4.25$
Taking the whole-number part of the position, Q1 is the 4th observation in the
array, which is Php 20 754. Therefore, 25%
of the 16 DepEd Elementary teachers have salaries that are at or below
Php 20 754.
b) 7th decile
Using the formula:
$D_k = \frac{k(N+1)}{10}$
$D_7 = \frac{7(16+1)}{10} = \frac{7(17)}{10} = \frac{119}{10}$
$D_7 = 11.9$
Taking the whole-number part of the position, D7 is the 11th observation in the
array, which is Php 30 531. This shows that 70%
of the 16 DepEd Elementary teachers have salaries that are at or below
Php 30 531.

c) 30th percentile
Using the formula:
$P_k = \frac{k(N+1)}{100}$
$P_{30} = \frac{30(16+1)}{100}$
$P_{30} = \frac{30(17)}{100} = \frac{510}{100}$
$P_{30} = 5.10$
Taking the whole-number part of the position, P30 is the 5th observation in the
array, which is Php 21 038. This implies that
30% of the 16 DepEd Elementary teachers have salaries that are at or below
Php 21 038.
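The three answers above can be reproduced with the Python sketch below. It follows the convention used in this worked example, which takes the whole-number part of the position $\frac{k(N+1)}{\text{parts}}$ and does not interpolate between observations. Other textbooks and software use different conventions, so results may differ slightly. The function name `quantile_by_position` is an assumption for illustration.

```python
import math

salaries = [30531, 32469, 36942, 20754, 23222, 21327, 37400, 45269,
            22216, 32072, 21038, 21038, 21038, 20754, 20754, 20754]

def quantile_by_position(values, k, parts):
    """Return (position, value) for the k-th quantile out of `parts` equal parts."""
    ordered = sorted(values)                       # array in ascending order
    position = k * (len(ordered) + 1) / parts      # k(N + 1) / parts
    rank = max(1, math.floor(position))            # whole-number part, as in the text
    return position, ordered[rank - 1]

print(quantile_by_position(salaries, 1, 4))     # lower quartile Q1
print(quantile_by_position(salaries, 7, 10))    # 7th decile D7
print(quantile_by_position(salaries, 30, 100))  # 30th percentile P30
```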
QUANTILES FOR GROUPED DATA

The data show the time, in minutes, spent on Facebook by 43 students.
Find: (a) Quartile 1, (b) Decile 2, and (c) Percentile 52.
Class interval     f     <cf
25 – 29            3      3
30 – 34            2      5
35 – 39            5     10    (Decile 2)
40 – 44            8     18    (Quartile 1)
45 – 49            8     26    (Percentile 52)
50 – 54            8     34
55 – 59            9     43
Solution:
a) Solving for the 1st quartile, Q1
One-fourth (or 25%) of the data fall on or below Q1, so
replace $\frac{n}{2}$ by 0.25n or $\frac{1(n)}{4}$ in the formula of the median.
First solve 0.25n or $\frac{1(n)}{4}$ and locate Q1 in the <cf column:
$\frac{1(n)}{4} = \frac{43}{4} = 10.75$
The first class whose <cf is at least 10.75 is 40 – 44, so L = 39.5, the <cf
before this class is 10, $f_m = 8$, and $i = 5$.
$Q_1 = L + \left( \frac{0.25n - {<}cf}{f_m} \right) i$ or $Q_1 = L + \left( \frac{\frac{n}{4} - {<}cf}{f_m} \right) i$
$Q_1 = 39.5 + \left( \frac{0.25(43) - 10}{8} \right) 5$
$Q_1 = 39.5 + \left( \frac{10.75 - 10}{8} \right) 5$
$Q_1 = 39.5 + 0.46875$
$Q_1 = 39.97$
Interpretation: 25% of the students spent 39.97 (about 40) minutes or less
on Facebook.
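b) Solving for Decile 2:
Two-tenths (or 20%) of the data fall on or below D2.
First solve 0.20n or $\frac{2(n)}{10}$ and locate D2 in the <cf column:
$\frac{2(n)}{10} = \frac{2(43)}{10} = 8.6$
The first class whose <cf is at least 8.6 is 35 – 39, so L = 34.5, the <cf
before this class is 5, $f_m = 5$, and $i = 5$.
$D_2 = L + \left( \frac{0.20n - {<}cf}{f_m} \right) i$ or $D_2 = L + \left( \frac{\frac{2n}{10} - {<}cf}{f_m} \right) i$
$D_2 = 34.5 + \left( \frac{0.20(43) - 5}{5} \right) 5$
$D_2 = 34.5 + \left( \frac{8.6 - 5}{5} \right) 5$
$D_2 = 34.5 + 3.6$
$D_2 = 38.1$
Interpretation: 20% of the students spent 38.1 (about 38) minutes or less
on Facebook.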
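c) Solving for Percentile 52 (P52)

For Percentile 52, 52% of the data fall on or below P52.
First solve 0.52n or $\frac{52(n)}{100}$ and locate P52 in the <cf column:
$\frac{52(n)}{100} = \frac{52(43)}{100} = 22.36$
The first class whose <cf is at least 22.36 is 45 – 49, so L = 44.5, the <cf
before this class is 18, $f_m = 8$, and $i = 5$.
$P_{52} = L + \left( \frac{0.52n - {<}cf}{f_m} \right) i$ or $P_{52} = L + \left( \frac{\frac{52n}{100} - {<}cf}{f_m} \right) i$
$P_{52} = 44.5 + \left( \frac{0.52(43) - 18}{8} \right) 5$
$P_{52} = 44.5 + 2.73$
$P_{52} = 47.23$
Interpretation: 52% of the students spent 47.23 (about 47)
minutes or less on Facebook.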
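Since the quartile, decile, and percentile formulas for grouped data differ only in the divisor (4, 10, or 100), one function can cover all three. The Python sketch below assumes whole-number class limits and classes in ascending order. The function name `grouped_quantile` and the tuple layout are illustrative.

```python
# (lower limit, upper limit, frequency), in ascending order.
facebook_minutes = [(25, 29, 3), (30, 34, 2), (35, 39, 5), (40, 44, 8),
                    (45, 49, 8), (50, 54, 8), (55, 59, 9)]

def grouped_quantile(classes, k, parts):
    """L + ((k*n/parts - <cf) / f) * i for the class containing the quantile."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    cf_before = 0                           # <cf below the current class
    for lo, hi, f in classes:
        if cf_before + f >= target:         # quantile class found
            return (lo - 0.5) + (target - cf_before) / f * (hi - lo + 1)
        cf_before += f

q1 = grouped_quantile(facebook_minutes, 1, 4)       # Quartile 1
d2 = grouped_quantile(facebook_minutes, 2, 10)      # Decile 2
p52 = grouped_quantile(facebook_minutes, 52, 100)   # Percentile 52
print(q1, d2, p52)
```

Calling `grouped_quantile(facebook_minutes, 1, 2)` gives the grouped median, which is consistent with Q2 = P50 = D5.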

Name: Date:
Try this!
A. A supermart recorded the time, in minutes, that a sample of 28 customers
stayed in the store.
30 12 24 12 16 32 8 24 23 26 18 24 23 26
28 18 22 42 36 26 12 24 18 30 12 24 18 30
Construct a frequency distribution table with less than cumulative
frequency and then determine the class marks of the classes.
B. Answer the following questions:

1. What decile is equivalent to percentile 70?
2. What quartile is equivalent to percentile 75?
3. What percentile is equivalent to decile 4?
4. What percentile is equivalent to the median?
5. What percentile is equivalent to quartile 1?
C. Find the lower quartile, upper quartile, interquartile range, decile 3, and
percentile 49 of the following scores of students in a Science 50-item
summative test: 15, 42, 38, 12, 6, 22, 31, 7, 36, 14, 41, 15, 50, 27, 65

D. The data below show the age distribution of residents on 7th Street of
South Hill Subdivision. Compute the (1) Mean, (2) Median, (3) Mode,
(4) Percentile 22, (5) Quartile 3, and (6) Decile 8. Interpret the results.
Class interval     f     <cf
25 – 29            3      3
30 – 34            2      5
35 – 39            5     10
40 – 44            8     18
45 – 49            8     26
50 – 54            8     34
55 – 59            9     43

MEASURES OF VARIABILITY
Measures of variation or dispersion describe how clustered or spread out
the values/observations of a distribution are about the mean of the distribution.
When the measure of variability is large, the values are widely scattered;
when it is small, they are tightly clustered.
There are four measures of variability that will be discussed in this lesson:
the range, mean absolute deviation, variance, and standard deviation.
MEASURES OF VARIABILITY OF UNGROUPED DATA

Range (R) is the simplest measure of variability. It is the difference
between the highest and lowest scores in a distribution.
Range (R) = Highest value (H) – Lowest value (L)
Although it is easy to compute, it is not often used as the sole
measure of variability because of its instability: it is based only on the
most extreme scores in the distribution and does not fully reflect the pattern
of variation within the distribution.
The data below are scores in a 20-item quiz in English of Grade 8 boys
and girls. Find the range of the scores of the two groups. Solve also for the
mean.
Girls’ scores    Boys’ scores
8                3
9                7
11               10
12               7
10               13
9                13
10               17
11               10
10               10
Solution:
Mean: Each group has 9 scores with a total of 90, so the mean of each group is
90 / 9 = 10.
Range (R) = Highest value (H) – Lowest value (L)
Girls: R = 12 – 8 = 4
Boys: R = 17 – 3 = 14
Interpretation: The girls’ scores have the smaller range, so they are more
clustered about the mean than the boys’ scores.

The Mean Absolute Deviation (MAD)
Mean Absolute Deviation (MAD) is the arithmetic mean or
average of the absolute deviations from the mean.
Sample: $MAD = \frac{\sum |x - \bar{x}|}{n}$
Population: $MAD = \frac{\sum |x - \mu|}{N}$
where: $x$ = raw score
Steps in Computing the MAD:

1. Solve for the mean of the observations/measurements.
2. Get the absolute value of the difference between each observation and
the mean ($|x - \bar{x}|$).
3. Find the sum of all the absolute values of the differences.
4. To find the MAD, divide the sum obtained in Step 3 by the number of
observations.
Find the mean absolute deviation (MAD) of the girls’ and boys’ scores
in the previous example and interpret.
a. Find the MAD of the scores of girls.
Score (𝑥) Mean (𝑥̅ ) |𝑥 − 𝑥̅ |
8 10 2
9 10 1
11 10 1
12 10 2
10 10 0
9 10 1
10 10 0
11 10 1
10 10 0
∑𝑥 = 90 ∑|𝑥 − 𝑥̅ | = 8
Solution: MAD of the girls’ scores

$MAD = \frac{\sum |x - \bar{x}|}{n}$
$MAD = \frac{8}{9}$
$MAD = 0.89$

b. Find the MAD of the scores of boys.
Score (𝑥) Mean (𝑥̅ ) |𝑥 − 𝑥̅ |
3 10 7
7 10 3
10 10 0
7 10 3
13 10 3
13 10 3
17 10 7
10 10 0
10 10 0
$\sum x = 90$          $\sum |x - \bar{x}| = 26$
Solution: MAD of the boys’ scores

$MAD = \frac{\sum |x - \bar{x}|}{n}$
$MAD = \frac{26}{9}$
$MAD = 2.89$
Interpretation: The MADs of the girls’ and boys’ scores are 0.89 and 2.89,
respectively. This shows the same result as the range: the girls’
scores are more clustered than the boys’ scores, which implies that the girls’
scores are more homogeneous while the boys’ scores are quite
heterogeneous.
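The range and MAD of both groups can be checked with the Python sketch below. The function names are illustrative, and the MAD divides by the number of observations, as in the steps above.

```python
girls = [8, 9, 11, 12, 10, 9, 10, 11, 10]
boys = [3, 7, 10, 7, 13, 13, 17, 10, 10]

def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

for name, scores in (("girls", girls), ("boys", boys)):
    print(name, value_range(scores), round(mean_absolute_deviation(scores), 2))
```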
The Variance and the Standard Deviation of Ungrouped Data

The variance indicates to what degree the individual observations of a data
set are dispersed or 'spread out' around their mean. The square root of
the variance gives the standard deviation; conversely, squaring the standard
deviation gives the variance.
Variance is the arithmetic mean or average of the squared
deviations from the mean.
Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation is the square root of the variance ($\sigma = \sqrt{\sigma^2}$ for a
population, $s = \sqrt{s^2}$ for a sample).

Steps in Computing for the Variance:
1. Find the mean.
2. Get the difference between each observation and the mean, and square
each difference.
3. Find the sum of all the squared deviations.
4. The variance is obtained by dividing the sum of all the squared deviations
by the number of observations (N) for a population, or by n – 1 for a sample.
Find the variance and standard deviation of the scores of the girls and boys.
a. Girls 𝑥̅ = 10
Score (𝑥) 𝑥 − 𝑥̅ (𝑥 − 𝑥̅ )2
8 −2 4
9 −1 1
11 1 1
12 2 4
10 0 0
9 −1 1
10 0 0
11 1 1
10 0 0
∑(𝑥 − 𝑥̅ )2 = 12
Solution:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
$s^2 = \frac{12}{9 - 1}$
$s^2 = \frac{12}{8}$
$s^2_{girls} = 1.5$          variance of the girls’ scores
$s = \sqrt{1.5}$
$s = 1.22$          standard deviation of the girls’ scores

b. Boys 𝑥̅ = 10
Score (𝑥) 𝑥 − 𝑥̅ (𝑥 − 𝑥̅ )2
3 −7 49
7 −3 9
10 0 0
7 −3 9
13 3 9
13 3 9
17 7 49
10 0 0
10 0 0
∑(𝑥 − 𝑥̅ )2 = 134
Solution:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
$s^2 = \frac{134}{9 - 1}$
$s^2 = \frac{134}{8}$
$s^2_{boys} = 16.75$          variance of the boys’ scores
$s = \sqrt{16.75}$
$s = 4.09$          standard deviation of the boys’ scores
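The sample variance and standard deviation of both groups can be verified with the Python sketch below. It uses the n – 1 divisor of the sample formula, and the function name `sample_variance` is an illustrative choice.

```python
import math

girls = [8, 9, 11, 12, 10, 9, 10, 11, 10]
boys = [3, 7, 10, 7, 13, 13, 17, 10, 10]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)

for name, scores in (("girls", girls), ("boys", boys)):
    variance = sample_variance(scores)
    print(name, variance, round(math.sqrt(variance), 2))
```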
Direct-from-the-Scores Method of Solving the Variance and Standard Deviation
Formulas of Variance and Standard Deviation (Direct-from-the-Scores Method)
Population: $\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$
Sample: $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
where,
𝜎 2 (sigma-squared) –variance for population
𝑠 2 –variance for a sample
𝜎 –standard deviation for population
s –standard deviation for sample
𝑥 –value of each observation or measurement
Solve for the variance and standard deviation of the boys’ and girls’ scores
using the same data in the previous example.

1. Girls’ scores
Score (𝑥) 𝑥2
8 64
9 81
11 121
12 144
10 100
9 81
10 100
11 121
10 100
∑ 𝑥 = 90 ∑ 𝑥 2 = 912
Solution:
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{912 - \frac{(90)^2}{9}}{9 - 1}$
$s^2 = \frac{912 - \frac{8100}{9}}{8} = \frac{912 - 900}{8} = \frac{12}{8}$
$s^2 = 1.5$          variance of the girls’ scores
$s = 1.22$          standard deviation of the girls’ scores
2. Boys’ scores
Score (𝑥) 𝑥2
3 9
7 49
10 100
7 49
13 169
13 169
17 289
10 100
10 100
∑ 𝑥 = 90 ∑ 𝑥 2 = 1034
Solution:
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1034 - \frac{(90)^2}{9}}{9 - 1}$
$s^2 = \frac{1034 - \frac{8100}{9}}{8} = \frac{1034 - 900}{8} = \frac{134}{8}$
$s^2 = 16.75$          variance of the boys’ scores
$s = 4.09$          standard deviation of the boys’ scores
Interpretation: Both formulas give the same results. The girls' scores
(s = 1.22) are more clustered around the mean than the boys' scores (s = 4.09).
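The direct-from-the-scores formula for a sample can be checked with a short Python sketch. The function name `sample_variance_direct` is an assumption for illustration; the cross-check uses `statistics.variance` from the Python standard library, which also divides by n − 1.

```python
import math
import statistics


def sample_variance_direct(scores):
    """Sample variance from the raw sums: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(scores)
    sum_x = sum(scores)                     # sum of x
    sum_x2 = sum(x * x for x in scores)     # sum of x^2
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


girls = [8, 9, 11, 12, 10, 9, 10, 11, 10]
boys = [3, 7, 10, 7, 13, 13, 17, 10, 10]

for name, scores in (("girls", girls), ("boys", boys)):
    var = sample_variance_direct(scores)
    # Cross-check against the standard-library sample variance.
    assert math.isclose(var, statistics.variance(scores))
    print(name, round(var, 2), round(math.sqrt(var), 2))
# prints:
# girls 1.5 1.22
# boys 16.75 4.09
```

Only the two running totals are needed, so this form is convenient when the deviations from the mean are awkward to compute by hand.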
MEASURES OF VARIABILITY OF GROUPED DATA
The range (R) of grouped data in a frequency distribution may be
determined by either of two methods:
Method 1. Finding the difference between the true upper limit (UL) of the
highest class and the true lower limit (LL) of the lowest class.
$R = UL - LL$
Method 2. Finding the difference between the midpoints or class
marks (X) of the highest and lowest classes.
$R = X_H - X_L$
The data below show the monthly electrical consumption (in kWh) of 100
households. Find the range.
Monthly electrical consumption (kWh)    $f$
40 – 44                                 4
45 – 49                                 11
50 – 54                                 20
55 – 59                                 31
60 – 64                                 19
65 – 69                                 11
70 – 74                                 4
Total                                   100
Solution:
Method 1:
$R = UL - LL$
$R = 74.5 - 39.5$
$R = 35$
Method 2:
$X_H$ (midpoint of the highest class, 70 – 74) $= \frac{70 + 74}{2} = 72$
$X_L$ (midpoint of the lowest class, 40 – 44) $= \frac{40 + 44}{2} = 42$
$R = X_H - X_L$
$R = 72 - 42$
$R = 30$
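Both range methods can be sketched in Python for the consumption table. This is an illustrative sketch only: the function name `grouped_ranges` and the `unit` parameter are assumptions, and the true limits are taken to lie half a unit beyond the stated class limits, as in the 74.5 and 39.5 used in the solution.

```python
# Class limits of the frequency distribution, lowest class first.
classes = [(40, 44), (45, 49), (50, 54), (55, 59), (60, 64), (65, 69), (70, 74)]


def grouped_ranges(classes, unit=1):
    """Return (range by true limits, range by class marks)."""
    lowest, highest = classes[0], classes[-1]
    # Method 1: true limits sit half a unit beyond the stated limits.
    true_ll = lowest[0] - unit / 2
    true_ul = highest[1] + unit / 2
    # Method 2: class marks (midpoints) of the lowest and highest classes.
    x_low = (lowest[0] + lowest[1]) / 2
    x_high = (highest[0] + highest[1]) / 2
    return true_ul - true_ll, x_high - x_low


print(grouped_ranges(classes))  # prints (35.0, 30.0)
```

The two methods differ by exactly one class width here, because the midpoints sit half a class width inside the true limits at each end.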
The Mean Absolute Deviation of Grouped Data

Formula for sample data: $$MAD = \frac{\sum f|X - \bar{x}|}{n - 1}$$
Formula for population data: $$MAD = \frac{\sum f|X - \bar{x}|}{N}$$
Solve for the MAD of the sample monthly electrical consumption (kWh) of
100 households.

Monthly electrical consumption (kWh)    $f$    Midpoint ($X$)    $fX$     $|X - \bar{x}|$    $f|X - \bar{x}|$
40 – 44                                 4      42                168      14.95              59.8
45 – 49                                 11     47                517      9.95               109.45
50 – 54                                 20     52                1 040    4.95               99
55 – 59                                 31     57                1 767    0.05               1.55
60 – 64                                 19     62                1 178    5.05               95.95
65 – 69                                 11     67                737      10.05              110.55
70 – 74                                 4      72                288      15.05              60.2
Total                                   100                      5 695                       536.5
Solution:
a) First, solve for the mean:
$$\bar{x} = \frac{\sum fX}{n} = \frac{5\,695}{100}$$
$\bar{x} = 56.95$
b) Solve for the MAD:
$$MAD = \frac{\sum f|X - \bar{x}|}{n - 1} = \frac{536.50}{100 - 1} = \frac{536.50}{99}$$
$MAD = 5.42$
The mean absolute deviation of the monthly electrical consumption is
5.42 kWh.
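The grouped-data MAD computation can be reproduced with a short Python sketch. The function name `grouped_mad` and its `sample` flag are assumptions for illustration. Dividing by n − 1 for a sample follows the formula given in this worktext; many other references divide by n for the mean absolute deviation.

```python
midpoints = [42, 47, 52, 57, 62, 67, 72]   # class marks X
freqs = [4, 11, 20, 31, 19, 11, 4]         # frequencies f


def grouped_mad(midpoints, freqs, sample=True):
    """Return (mean, MAD) of grouped data from class marks and frequencies."""
    n = sum(freqs)
    # Mean of grouped data: sum of fX divided by n.
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    # Sum of f|X - mean| over all classes.
    total_abs_dev = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints))
    # Worktext convention: n - 1 for a sample, N for a population.
    divisor = n - 1 if sample else n
    return mean, total_abs_dev / divisor


mean, mad = grouped_mad(midpoints, freqs)
print(round(mean, 2), round(mad, 2))  # prints 56.95 5.42
```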
The Variance and Standard Deviation of Grouped Data

Population Standard Deviation: $$\sigma = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{N}}{N}}$$
Sample Standard Deviation: $$s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}}$$
where,
𝑓 –frequency
𝑋 –class marks or midpoint

Monthly electrical consumption (kWh)    $f$    Midpoint ($X$)    $fX$     $fX^2$
40 – 44                                 4      42                168      7 056
45 – 49                                 11     47                517      24 299
50 – 54                                 20     52                1 040    54 080
55 – 59                                 31     57                1 767    100 719
60 – 64                                 19     62                1 178    73 036
65 – 69                                 11     67                737      49 379
70 – 74                                 4      72                288      20 736
Total                                   100                      5 695    329 305
Solution:
The calculations essential for computing the standard deviation are
shown above. Computing the standard deviation of the sample:
$$s = \sqrt{\frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}}$$
$$s = \sqrt{\frac{329\,305 - \frac{(5\,695)^2}{100}}{100 - 1}} = \sqrt{\frac{329\,305 - 324\,330.25}{99}}$$
$$s = \sqrt{\frac{4\,974.75}{99}}$$
$s = \sqrt{50.25}$
$s = 7.0887$ or $7.09$
Interpretation: The monthly electrical consumption of the 100 households has
a standard deviation of 7.09 kWh, which indicates that consumption varies
from one household to another.
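The sample standard deviation of the grouped consumption data can be verified with a minimal Python sketch. The function name `grouped_sample_sd` is an assumption for illustration; it applies the sample formula using the totals of $fX$ and $fX^2$.

```python
import math

midpoints = [42, 47, 52, 57, 62, 67, 72]   # class marks X
freqs = [4, 11, 20, 31, 19, 11, 4]         # frequencies f


def grouped_sample_sd(midpoints, freqs):
    """Return (variance, standard deviation) of grouped sample data."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # sum of fX
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))   # sum of fX^2
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return variance, math.sqrt(variance)


variance, sd = grouped_sample_sd(midpoints, freqs)
print(round(variance, 2), round(sd, 2))  # prints 50.25 7.09
```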

Name: Date:
Try this!
1. The distribution of the number of children in a sample of 40 families
selected at random is shown below:
Number of child(ren)    $f$
0                       5
1                       11
2                       9
3                       5
4                       5
5                       0
6                       4
7                       1
Find the following descriptive measures:

a. range
b. MAD
c. variance and standard deviation
2. The ages of the customers at the local theater are shown below.
Age        $f$
18 – 22    15
23 – 27    33
28 – 32    45
33 – 37    26
38 – 42    13
43 – 47    8
n = 140
Find the following descriptive measures:

a. range
b. MAD
c. variance and standard deviation

LINEAR CORRELATION
Key Concepts:
Correlation analysis is a measure of the strength of the relationship between
two variables by means of a single number.
Scatter diagram is a graph of points representing two series, with the
known (independent) variable plotted on the x-axis and the variable to be
estimated plotted on the y-axis.
Linear relationship is a type of correlation between two variables that can
be described mathematically by a straight line.
Correlation coefficient (r) is an estimate of the measure of linear
association between two variables x and y.
Scatter Diagram
Pearson Product-Moment Correlation Coefficient (r, or Pearson's r) is a
parametric test of relationship that requires an interval scale of measurement.
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$
where,
𝑟 –coefficient of correlation
𝑥, 𝑦 –random variable on observed data
𝑛 –sample size
A school psychologist is interested in finding the relationship between a
child's Emotional Quotient (EQ) and creativity. To investigate this problem,
the psychologist samples some children and administers a standard Emotional
Quotient test and a test of creativity to each child.

The data are shown below:
Student    Emotional Quotient ($x$)    Creativity ($y$)
1          138                         51
2          88                          30
3          120                         38
4          90                          24
5          86                          20
6          131                         50
7          113                         35
8          120                         27
9          95                          30
10         110                         24
11         100                         52
12         127                         39
13         117                         30
14         81                          18
Solution:
a) Solve for $xy$, $x^2$, and $y^2$ and find their sums.
b) Then solve for r.
Student    EQ ($x$)    Creativity ($y$)    $xy$     $x^2$     $y^2$
1          138         51                  7 038    19 044    2 601
2          88          30                  2 640    7 744     900
3          120         38                  4 560    14 400    1 444
4          90          24                  2 160    8 100     576
5          86          20                  1 720    7 396     400
6          131         50                  6 550    17 161    2 500
7          113         35                  3 955    12 769    1 225
8          120         27                  3 240    14 400    729
9          95          30                  2 850    9 025     900
10         110         24                  2 640    12 100    576
11         100         52                  5 200    10 000    2 704
12         127         39                  4 953    16 129    1 521
13         117         30                  3 510    13 689    900
14         81          18                  1 458    6 561     324
$\sum x = 1\,516$    $\sum y = 468$    $\sum xy = 52\,474$    $\sum x^2 = 168\,518$    $\sum y^2 = 17\,300$

Using the formula:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$
$$r = \frac{14(52\,474) - (1\,516)(468)}{\sqrt{[14(168\,518) - (1\,516)^2][14(17\,300) - (468)^2]}}$$
$$r = \frac{734\,636 - 709\,488}{\sqrt{[2\,359\,252 - 2\,298\,256][242\,200 - 219\,024]}}$$
$$r = \frac{25\,148}{\sqrt{[60\,996][23\,176]}}$$
$$r = \frac{25\,148}{\sqrt{1\,413\,643\,296}}$$
$$r = \frac{25\,148}{37\,598.45}$$
$r = 0.67$
Interpretation: Since r = 0.67, the children's emotional quotient and
creativity are moderately positively correlated.
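The computation of Pearson's r from the raw sums can be sketched in Python using the EQ and creativity scores from the table. The function name `pearson_r` and the variable names are assumptions for illustration; the function applies the same computational formula used in the solution.

```python
import math

eq = [138, 88, 120, 90, 86, 131, 113, 120, 95, 110, 100, 127, 117, 81]
creativity = [51, 30, 38, 24, 20, 50, 35, 27, 30, 24, 52, 39, 30, 18]


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, xy, x^2, and y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


print(round(pearson_r(eq, creativity), 2))  # prints 0.67
```

Because the sums are integers here, the numerator and the product under the square root are exact, so any rounding happens only in the final square root and division.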

