Module 5. Organizing and Summarizing Data
Learning Outcomes
Example 1. The following table shows what family planning methods were used by
married women in Tondo, Manila. The left column shows the categorical
variable (Method) and the right column is the frequency — the number of married women
using that particular method.
Table 1
The Number of Married Women using the Family Planning Method
Example 2. Tally marks are often used to make a frequency distribution table. For
example, let’s say you survey a number of households and find out how many pets they
own. The results are 3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3. Creating a frequency
distribution table will make the data easier to understand.
A frequency distribution table is one way you can organize data so that it makes more
sense.
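The tally for Example 2 can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and `Counter` from the standard library stands in for the hand-drawn tally marks.

```python
from collections import Counter

# Number of pets per household, copied from Example 2.
pets = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]

# Counter does the tallying: one count per distinct value.
frequency = Counter(pets)

print("Pets  Tally   Frequency")
for value in sorted(frequency):
    count = frequency[value]
    # One bar per household, in place of hand-drawn tally marks.
    print(f"{value:<5} {'|' * count:<7} {count}")
print(f"TOTAL         {sum(frequency.values())}")
```

The frequencies add up to 20, the number of households surveyed, which is a quick check that no observation was missed.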
Example 3. Consider a list of IQ scores for a gifted classroom in a particular elementary school.
The IQ scores are: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150,
154. Draw a frequency distribution table, which will give a better picture of the data than a
simple list.
Steps:
1. Determine the number of classes: m = √17 ≈ 4.12, rounded up to 5.
2. Determine the class size: h = (154 – 118)/5 = 7.2, rounded up to 8.
3. Determine the starting point for the first class: start at the lowest score, 118, so the first class is 118 – 125.
4. Frequency Distribution Table
Table 1
The Frequency Distribution of the IQ Score of Students
IQ Score     Tally       Frequency
118 – 125    ||||        4
126 – 133    |||| - |    6
134 – 141    |||         3
142 – 149    ||          2
150 – 157    ||          2
TOTAL                    17
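The four steps of Example 3 can be sketched in Python as follows. The rounding-up rule mirrors the steps above; the variable names are illustrative and not part of the module.

```python
import math

# IQ scores copied from Example 3.
iq_scores = [118, 123, 124, 125, 127, 128, 129, 130, 130, 133,
             136, 138, 141, 142, 149, 150, 154]

# Step 1: number of classes, the square root of n rounded up.
num_classes = math.ceil(math.sqrt(len(iq_scores)))

# Step 2: class size, the range divided by the number of classes, rounded up.
class_size = math.ceil((max(iq_scores) - min(iq_scores)) / num_classes)

# Step 3: start the first class at the lowest score.
start = min(iq_scores)

# Step 4: count the scores that fall inside each class (both limits inclusive).
table = []
for k in range(num_classes):
    lower = start + k * class_size
    upper = lower + class_size - 1
    count = sum(lower <= score <= upper for score in iq_scores)
    table.append((lower, upper, count))

for lower, upper, count in table:
    print(f"{lower} - {upper}: {count}")
```

The printed classes and counts match the frequency column of the table above.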
Problem Set 1. Frequency Distribution
(a) 4, 3, 6, 5, 2, 4, 3, 3, 6, 4, 2, 3, 2, 2, 3, 3, 4, 5, 6, 4, 2, 3, 4
(b) 6, 7, 5, 4, 5, 6, 6, 8, 7, 9, 6, 5, 6, 7, 7, 8, 9, 4, 6, 7, 6, 5
2. The marks obtained out of 25 by 30 students of a class in the examination are given as 20,
6, 23, 19, 9, 14, 15, 3, 1, 12, 10, 20, 13, 3, 17, 10, 11, 6, 21, 9, 6, 10, 9, 4, 5, 1, 5, 11, 7, 24.
Prepare the frequency distribution of the scores.
Illustration:
Solution: Frequency Distribution Table
Table 1.
The Distribution of Scores of 50 students in a Statistics Examination
Classes     Frequency            Class       Class            Relative         Cumulative Frequency
(Scores)    (No. of students)    mark (x)    Boundaries       Frequency (%)    Less than    Greater than
29 - 37     2                    33          28.5 – 37.5      4                2            50
38 - 46     1                    42          37.5 – 46.5      2                3            48
47 - 55     3                    51          46.5 – 55.5      6                6            47
56 - 64     8                    60          55.5 – 64.5      16               14           44
65 - 73     11                   69          64.5 – 73.5      22               25           36
74 - 82     13                   78          73.5 – 82.5      26               38           25
83 - 91     10                   87          82.5 – 91.5      20               48           12
92 - 100    2                    96          91.5 – 100.5     4                50           2
TOTAL       50                                                100
Note: the relative frequency of the first class is (2/50)*100 = 4.
Definition of terms:
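Class mark – the middle value of a class, that is, the average of its lower limit and upper limit:
$\text{class mark} = \dfrac{\text{lower limit} + \text{upper limit}}{2}$
Cumulative Frequency – the running total of the class frequencies. The "less than" column accumulates the frequencies from the first class downward, and the "greater than" column accumulates them from the last class upward.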
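As a rough check on the illustration table, the derived columns can be recomputed from the class limits and frequencies alone. The sketch below assumes class boundaries sit half a unit outside the class limits, as they do in the table. The variable names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies copied from the illustration table.
classes = [(29, 37), (38, 46), (47, 55), (56, 64),
           (65, 73), (74, 82), (83, 91), (92, 100)]
freqs = [2, 1, 3, 8, 11, 13, 10, 2]
n = sum(freqs)  # 50 students

# Class mark: average of the lower and upper limit.
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Class boundaries: half a unit below the lower limit and above the upper limit.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in classes]

# Relative frequency in percent.
relative = [f * 100 / n for f in freqs]

# "Less than": running total from the first class downward.
less_than = list(accumulate(freqs))

# "Greater than": running total from the last class upward.
greater_than = list(accumulate(reversed(freqs)))[::-1]

for row in zip(classes, freqs, class_marks, boundaries, relative, less_than, greater_than):
    print(row)
```

The last "less than" entry and the first "greater than" entry both equal the total number of students, which is a useful sanity check on any cumulative column.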
A frequency distribution can be presented as a histogram, a frequency polygon,
or an ogive (you can watch this video:
https://www.youtube.com/watch?v=SbgCtj2NfXY)
1. Histogram
[Figure: histogram of No. of Students (frequency) against the class boundaries, 28.5 - 37.5 through 91.5 - 100.5]
2. Frequency Polygon
3. Ogive
The measures of central tendency for ungrouped data are the mean, the median, the mode,
and the midrange.
1. The Mean
The mean, also called the arithmetic mean or average, of a set of numbers is the sum of all the
values in the data set divided by the number of observations.
2. The Median
3. The Mode
4. The mid-range
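The four measures can be sketched in Python using the pet counts from Example 2 as ungrouped data. The median, mode, and midrange follow their standard definitions (middle value of the sorted data, most frequent value, and average of the smallest and largest values), which are assumed here because the module text does not spell them out.

```python
import statistics

# Ungrouped data: number of pets per household, from Example 2.
pets = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]

# Mean: sum of all values divided by the number of observations.
mean = sum(pets) / len(pets)

# Median: middle value of the sorted data (average of the two middle
# values when the number of observations is even).
median = statistics.median(pets)

# Mode: the value that occurs most often.
mode = statistics.mode(pets)

# Midrange: average of the smallest and largest values.
midrange = (min(pets) + max(pets)) / 2

print(mean, median, mode, midrange)
```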
B. MEASURES OF CENTRAL TENDENCY OF GROUPED DATA
Recall:
Grouped data are data or scores that are arranged in a frequency distribution.
The mean of grouped data is
$\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$
Where: $\bar{x}$ = mean value
$x_i$ = midpoint of each class or category
$f_i$ = frequency of each class or category
Example 7. Consider the following data. Find the mean of the distribution.
X 25 35 45 55 65 75 85 95
f 1 2 2 3 12 14 12 4
Solution: Using the formula
X       f       fx
25      1       25
35      2       70
45      2       90
55      3       165
65      12      780
75      14      1050
85      12      1020
95      4       380
        ∑f = 50 ∑fx = 3580
$\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i} = \dfrac{3580}{50} = 71.6$
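The same calculation can be sketched in Python. The fx column is built first and then the two totals are divided. The names are illustrative.

```python
# Midpoints and frequencies copied from Example 7.
x = [25, 35, 45, 55, 65, 75, 85, 95]
f = [1, 2, 2, 3, 12, 14, 12, 4]

# The fx column: each midpoint multiplied by its frequency.
fx = [fi * xi for fi, xi in zip(f, x)]

total_f = sum(f)     # sum of the frequencies
total_fx = sum(fx)   # sum of the fx column

# Grouped mean: total of fx divided by total of f.
mean = total_fx / total_f
print(total_f, total_fx, mean)
```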
PROPERTIES OF MEAN
1. It measures stability. The mean is the most stable of the measures of central tendency
because every score contributes to its value.
2. The sum of the deviations of the scores from the mean is zero.
3. It is easily affected by extreme scores.
4. It can be applied to interval-level measurements.
5. It may not be an actual score in the distribution.
6. It is very easy to compute.
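Some of these properties can be checked numerically. The sketch below reuses the pet counts from Example 2. The extra score of 40 is a hypothetical outlier added only to show how an extreme value pulls the mean.

```python
# Scores reused from Example 2 (number of pets per household).
scores = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]
mean = sum(scores) / len(scores)

# Property 2: the signed deviations from the mean cancel out
# (up to floating-point rounding).
deviations = [s - mean for s in scores]
print(sum(deviations))

# Property 5: the mean need not be one of the actual scores.
print(mean in scores)

# Property 3: one extreme score (hypothetical value 40) shifts the mean noticeably.
with_outlier = scores + [40]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(mean, mean_with_outlier)
```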
Example 8. The following table gives the frequency distribution of the number of orders received
each day during the past 65 days at the office of a mail-order company. Calculate the mean.
Number of Orders    f
10 – 12             4
13 – 15             12
16 – 18             20
19 – 21             14
22 – 24             5
25 – 27             10
                    n = 65
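Solution: Take the midpoint of each class as x, multiply it by the class frequency f, and divide the sum of fx by the sum of f.
Number of Orders    x     f     fx
10 – 12             11    4     44
13 – 15             14    12    168
16 – 18             17    20    340
19 – 21             20    14    280
22 – 24             23    5     115
25 – 27             26    10    260
                          ∑f = 65   ∑fx = 1207
Therefore, $\bar{x} = \dfrac{1207}{65} \approx 18.57$ orders per day.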
Analysis:
There are 25 persons who take less than 24 minutes to travel to work and another 25 persons
who take more than 24 minutes to travel to work.
PROPERTIES OF MEDIAN
1. It may not be an actual observation in the set of data.
2. It can be applied to ordinal-level data.
3. It is not affected by extreme values because the median is a positional measure.
4. It is appropriate when the exact midpoint of the score distribution is desired.
5. It is appropriate when there are extreme scores in the distribution.
Quartiles
Quartiles are a type of quantile that divide the data points into four parts, or
quarters, of more-or-less equal size. Using the same method of calculation as for the median, we
can get the equations for Q1 and Q3 as follows:
Interquartile Range
Interquartile range (IQR) is the difference between the third quartile and the first quartile.
IQR = Q3 – Q1
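A minimal sketch of the quartile and IQR calculation for grouped data follows. It assumes the commonly used interpolation formula Q_k = L + ((k·n/4 − F)/f)·h, where L is the lower boundary of the quartile class, F the cumulative frequency before that class, f the class frequency, and h the class width. These symbols and the function name `grouped_quartile` are assumptions introduced here and may differ from the module's own notation. The data are taken from the 50-student illustration table.

```python
from itertools import accumulate

# Lower class boundaries and frequencies from the 50-student illustration table.
lower_boundaries = [28.5, 37.5, 46.5, 55.5, 64.5, 73.5, 82.5, 91.5]
freqs = [2, 1, 3, 8, 11, 13, 10, 2]
width = 9  # each class spans 9 score units, e.g. 37.5 - 28.5


def grouped_quartile(k, lower_boundaries, freqs, width):
    """Return Q_k (k = 1, 2, 3) by linear interpolation inside its class."""
    n = sum(freqs)
    target = k * n / 4                      # position of the k-th quartile
    cumulative = list(accumulate(freqs))    # "less than" cumulative frequencies
    for i, cum in enumerate(cumulative):
        if cum >= target:                   # first class that reaches the target
            before = cumulative[i - 1] if i > 0 else 0
            return lower_boundaries[i] + (target - before) / freqs[i] * width
    raise ValueError("target position exceeds the total frequency")


q1 = grouped_quartile(1, lower_boundaries, freqs, width)
q2 = grouped_quartile(2, lower_boundaries, freqs, width)  # the median
q3 = grouped_quartile(3, lower_boundaries, freqs, width)
iqr = q3 - q1
print(round(q1, 2), round(q2, 2), round(q3, 2), round(iqr, 2))
```

For this table, Q1 falls in the 56 - 64 class and Q3 in the 74 - 82 class, giving an IQR of roughly 19.34 score units under the assumed formula.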
Example 10. Based on the grouped data below, find the Interquartile Range
3. Mode
The mode is the value that has the highest frequency in a data set. For grouped data, the
modal class (or class mode) is the class with the highest frequency.
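Finding the modal class amounts to picking the class with the largest frequency. The sketch below uses the 50-student illustration table and returns a list, so that ties (more than one modal class) would also be reported. The names are illustrative.

```python
# Classes and frequencies from the 50-student illustration table.
classes = ["29 - 37", "38 - 46", "47 - 55", "56 - 64",
           "65 - 73", "74 - 82", "83 - 91", "92 - 100"]
freqs = [2, 1, 3, 8, 11, 13, 10, 2]

# The modal class is every class whose frequency equals the highest frequency.
highest = max(freqs)
modal_classes = [c for c, f in zip(classes, freqs) if f == highest]
print(modal_classes, highest)
```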
1. Find the mean, median and mode for each of the following.
(a) 4, 3, 6, 5, 2, 4, 3, 3, 6, 4, 2, 3, 2, 2, 3, 3, 4, 5, 6, 4, 2, 3, 4
(b) 6, 7, 5, 4, 5, 6, 6, 8, 7, 9, 6, 5, 6, 7, 7, 8, 9, 4, 6, 7, 6, 5
2. The marks obtained out of 25 by 30 students of a class in the examination are given as 20, 6,
23, 19, 9, 14, 15, 3, 1, 12, 10, 20, 13, 3, 17, 10, 11, 6, 21, 9, 6, 10, 9, 4, 5, 1, 5, 11, 7, 24. Compute
the mean, median and mode.
4. The scores of 40 students on a 60-item statistics test are tabulated below.
Compute the mean, median, mode and the interquartile range of the distribution.