Lessons 4.1 and 4.2
MATHEMATICS IN THE
MODERN WORLD
UNIT IV
Data Management
Statistics: Our Life Saver
and Influencer
Duration: 15 Hours
May these serve as an inspiration and motivation for you to learn and
understand the lessons that you have to go through in this unit as follows:
Every day, we come across different kinds of information, data, facts, and figures
from various communication and information media. Some current examples are:
- surveys conducted by SWS on the ratings of public offices and officials, or on
  public opinion about issues, using a sample of 1200 respondents
- the daily data on Philippine COVID-19 cases provided by the Department of
  Health (new cases, full recoveries, and deaths added to the previous total),
  where the active cases are categorized as asymptomatic, mild, severe, or
  critical, each with its corresponding percentage. These data are also the
  basis on which UP experts project the total number of cases up to a certain
  period, and these projections appear to be fairly accurate.
- weather conditions and forecasts
- the clinical trials for COVID-19 vaccines
- reports of the DOTr on the number of commuters relative to the number of
  public transport vehicles allowed to operate during the quarantine periods
- reports on the stock market situation
- the estimated funds of PhilHealth that were lost due to corruption
Lesson 4.1 will give you insight into how data such as those mentioned above are
collected.
The vital events may be live births, fetal deaths, deaths, marriages, divorces,
judicial separations, annulments of marriage, adoptions, recognitions
(acknowledgments of natural children), and legitimations.
4. Texting Method. In this method, the researcher gathers data in the survey
being conducted through text messages.
In conducting a study, the researcher must consider the time and the cost
involved in completing the study. This is why most researchers make use of a sample
(a representative part of the population that possesses the characteristics of the
population) instead of the whole population (the entirety of the objects, individuals,
events, or things under study). Slovin's formula is used to determine an appropriate
sample size from the population.
The margin of error shows how reliable the result of the survey is. A small
margin of error means that it is more likely that the results of the survey are true for
the population.
Solution:
Given: N = 6000 e = 5% = 0.05
$$n = \frac{N}{1 + Ne^2} = \frac{6000}{1 + (6000)(0.05)^2} = \frac{6000}{16} = 375$$
Example 2. Using example 1, what will be the sample size if the margin of
error is 8%?
$$n = \frac{N}{1 + Ne^2} = \frac{6000}{1 + (6000)(0.08)^2} = \frac{6000}{39.4} \approx 152.28$$
Did you notice that the bigger the margin of error, the smaller the
sample size becomes?
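The two examples can be checked with a short Python sketch. It assumes Slovin's formula in the form n = N / (1 + N e^2), which is the form used in the solutions above. The function name `slovin_sample_size` is only illustrative.

```python
def slovin_sample_size(population: int, margin_of_error: float) -> float:
    """Slovin's formula: n = N / (1 + N * e**2)."""
    return population / (1 + population * margin_of_error ** 2)


# Example 1: N = 6000, e = 5%
n_5 = slovin_sample_size(6000, 0.05)
# Example 2: N = 6000, e = 8%
n_8 = slovin_sample_size(6000, 0.08)

print(round(n_5, 2))
print(round(n_8, 2))
```

Running it with other margins of error shows the same pattern: as e grows, the computed sample size shrinks.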
To solve for the margin of error (e) in this example, the formula to be used is
$$e = \sqrt{\frac{N - n}{nN}}$$ (Formula 1a)
In the formula, e is the margin of error, N is the population, and n is the sample
size.
Solution:
Given: N = 6000 n = 100
$$e = \sqrt{\frac{N - n}{nN}} = \sqrt{\frac{6000 - 100}{100(6000)}} \approx 0.09916 \text{ or } 9.92\%$$
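As a quick check of this computation, here is a minimal Python sketch of Formula 1a, e = sqrt((N - n) / (nN)). The function name `margin_of_error` is an illustrative choice, not part of the module.

```python
import math


def margin_of_error(population: int, sample_size: int) -> float:
    """Formula 1a: e = sqrt((N - n) / (n * N))."""
    return math.sqrt((population - sample_size) / (sample_size * population))


e = margin_of_error(6000, 100)
# Show e both as a decimal and as a percentage.
print(f"{e:.5f} or {e:.2%}")
```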
Example:
Given: N= 1400 and n = 141
Step 1. Determine k (sampling interval) by dividing the population by the
sample size.
$$k \text{ (sampling interval)} = \frac{N}{n} = \frac{1400}{141} \approx 9.93 \text{ or } 10$$
(This means that every 10th element in the population list will be included in
the sample until 141 samples are obtained.)
These are the first 20 numbers of the sample: 4, 14, 24, 34, 44, 54, 64, 74, 84,
94, 104, 114, 124, 134, 144, 154, 164, 174, 184, 194
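The selection pattern can be sketched in Python. The sketch assumes that the starting position is 4 (the first number in the list above) and that the sampling interval is rounded to the nearest whole number. Both function names are illustrative.

```python
def sampling_interval(population_size: int, sample_size: int) -> int:
    """k = N / n, rounded to the nearest whole number."""
    return round(population_size / sample_size)


def first_selected_positions(start: int, interval: int, count: int) -> list[int]:
    """Positions in the population list: start, start + k, start + 2k, ..."""
    return [start + i * interval for i in range(count)]


k = sampling_interval(1400, 141)
positions = first_selected_positions(start=4, interval=k, count=20)
print(k)
print(positions)
```

With a start of 4 and k = 10, the selected positions are 4, 14, 24, and so on, each one 10 places after the previous one.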
Example:
Given: N = 4370 patients; n = 151
Male Patients – 2734; Female Patients – 1636
Step 2. Multiply the result obtained in step 1 (p) by the size of each group to
get the number of samples to be taken from that group.
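The allocation for this example can be sketched in Python. The sketch assumes that p in Step 1 is the proportion n / N (sample size divided by population size) and that each group's share is rounded to the nearest whole number. The function name `proportional_allocation` is illustrative.

```python
def proportional_allocation(strata: dict[str, int], sample_size: int) -> dict[str, int]:
    """Allocate the sample to each group in proportion to the group's size."""
    population = sum(strata.values())
    p = sample_size / population  # assumed meaning of p in Step 1
    return {name: round(size * p) for name, size in strata.items()}


allocation = proportional_allocation({"Male": 2734, "Female": 1636}, 151)
print(allocation)
```

Because each share is rounded separately, the rounded counts may not always add up to n exactly. In this example they do (94 + 57 = 151).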
A cluster is a group where the objects or individuals in the group are more
similar to each other as compared to those from other groups.
You have learned the different ways to gather data and the sampling techniques
from which you can choose the one to employ in your research. Now it is time for
you to learn what to do with the data that you have gathered. It is essential to
organize your data so that you can easily interpret them.
Data may be ungrouped or grouped. Ungrouped data are unsorted or raw data.
This means that the data have not been grouped or classified according to any
characteristic. On the other hand, grouped data are data that have been organized
or grouped.
1. By forming an array
An array is an arrangement of numbers in increasing or decreasing
order.
The three ways of organizing a set of ungrouped data are shown using the
example below.
42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44
You can organize the data in the following ways:
Stem-leaf Plot: (Assuming that we did not form an array, let us refer to the original
data.)
42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44
In the first value, 42, the digit 4 is the stem, and the digit 2 is the leaf. In the
second value, 51, the digit 5 is the stem, and the digit 1 is the leaf. Continue plotting
all the ages of the employees. After all the ages have been plotted, make another
table and arrange the leaves in increasing order.
Draft: (leaves listed as the data are given)
Final: (after arranging the leaves from the lowest to the highest)
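The draft and final plots can be generated from the original data with a short Python sketch. It assumes two-digit values, so that the tens digit is the stem and the ones digit is the leaf. The function name `stem_leaf` is illustrative.

```python
from collections import defaultdict

ages = [42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
        28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44]


def stem_leaf(values: list[int], sort_leaves: bool) -> dict[int, list[int]]:
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return {stem: sorted(leaves) if sort_leaves else leaves
            for stem, leaves in sorted(plot.items())}


draft = stem_leaf(ages, sort_leaves=False)  # leaves in the order given
final = stem_leaf(ages, sort_leaves=True)   # leaves in increasing order

for stem, leaves in final.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```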
Table No. 1
Ages of 25 Employees in a Supermarket
Ages     Frequency
22       1
24       2
25       3
28       4
29       1
30       1
32       1
35       2
36       1
37       1
42       1
43       1
44       2
45       1
46       1
50       1
51       1
Total    25
Note: You may first form an array or a stem-leaf plot so that it would be easier
to construct the frequency distribution table.
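The array and the frequency distribution of the ages can also be produced with a few lines of Python. This is only a sketch using the standard library's `Counter`. The variable names are illustrative.

```python
from collections import Counter

ages = [42, 51, 44, 28, 32, 24, 30, 25, 24, 35, 43, 37, 28,
        28, 22, 45, 29, 28, 36, 35, 50, 25, 25, 46, 44]

# Array: the ages arranged in increasing order.
array = sorted(ages)

# Frequency distribution: how many times each age occurs.
frequency = Counter(array)

for age in sorted(frequency):
    print(age, frequency[age])
print("Total", sum(frequency.values()))
```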
Interpretation of the data may be made in this way.
The table shows that of the twenty-five employees of the supermarket, the
youngest is twenty-two years old while the oldest is fifty-one years old. Eleven of
the employees are in their twenties, six each are in their thirties and forties, and
only two are in their fifties.
Table No 2
Scores of a Sample of 40 Students in a Biology Test
Class interval refers to the grouping bounded by the lower limit (LL) and upper limit
(UL).
Class size (c) is the length or width of the class.
Class frequency (f) is the number of observations falling within a class interval.
Class boundaries refer to the true boundaries (true limits) of a class interval
In this example, 17-21 is the first class interval, where 17 is the lower limit and
21 is the upper limit. The lower limit of the first class interval is usually the
lowest value in the data. The upper limit 21 was obtained by counting 5 units (since
c = 5) starting from the lower limit 17 (17, 18, 19, 20, 21). To get the succeeding
lower limits, just add 5, which is the class size. Do the same for the upper limits.

Lower limit     Class Intervals (Scores)     Upper limit
                17-21
17 + 5          22-26                        21 + 5
22 + 5          27-31                        26 + 5
27 + 5          32-36                        31 + 5
32 + 5          37-41                        36 + 5
37 + 5          42-46                        41 + 5
42 + 5          47-51                        46 + 5
c = 5
Step 1: Determine the Range (R) of the distribution. The range is equal to the
highest score minus the lowest score.
R = 90 - 24
R = 66
Step 2: Determine the class size by dividing the range by the desired number of
classes. (The number of classes must be neither too few nor too many. Too
many class intervals may result in classes with zero frequencies.) Let us
have ten classes in this problem. In some cases, the class size is already
given.
$$\text{Class size or class width } (c) = \frac{\text{Range}}{\text{number of classes}}$$ (Formula 3)
(If the class size is not exact, round it off to the nearest whole number.)
$$c = \frac{66}{10} = 6.6 \approx 7$$
Step 3: Unless otherwise specified, always start the lowest class limit with the
lowest value of the given data (raw data). For the second lower limit, just add
the class size, and then continue to add the class size to each lower limit to
get the rest of the lower limits. To get the first upper limit, subtract one (1)
from the second lower limit. For the second upper limit, just add the class
size, and then continue to add the class size to each upper limit to get the
rest of the upper limits.
Note: The last class interval should contain the highest value.
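Constructing the Class Limits and the Resulting Class Limits/Class Intervals

Lower limit (LL)     Upper limit (UL)     Class Interval (LL - UL)
24                   31 - 1 = 30          24 - 30
24 + 7 = 31          30 + 7 = 37          31 - 37
31 + 7 = 38          37 + 7 = 44          38 - 44
38 + 7 = 45          44 + 7 = 51          45 - 51
45 + 7 = 52          51 + 7 = 58          52 - 58
52 + 7 = 59          58 + 7 = 65          59 - 65
59 + 7 = 66          65 + 7 = 72          66 - 72
66 + 7 = 73          72 + 7 = 79          73 - 79
73 + 7 = 80          79 + 7 = 86          80 - 86
80 + 7 = 87          86 + 7 = 93          87 - 93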
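Steps 1 to 3 can be sketched in Python as follows. The sketch assumes whole-number scores, so that each upper limit is one less than the next lower limit. The function name `class_limits` is illustrative.

```python
def class_limits(lowest: int, highest: int, class_size: int) -> list[tuple[int, int]]:
    """Build (lower limit, upper limit) pairs until the highest value is covered."""
    limits = []
    lower = lowest
    while lower <= highest:
        # The upper limit is one less than the next lower limit.
        limits.append((lower, lower + class_size - 1))
        lower += class_size
    return limits


value_range = 90 - 24                     # Step 1: R = highest - lowest
class_size = round(value_range / 10)      # Step 2: c = R / number of classes
limits = class_limits(24, 90, class_size) # Step 3: class limits

for lower, upper in limits:
    print(f"{lower} - {upper}")
```

The loop stops once the highest value, 90, falls inside the last class interval, which agrees with the note in Step 3.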
Step 4: Determine the class boundaries by subtracting 0.5 from each of the lower
class limits and adding 0.5 to each of the upper class limits.
Lower Limit - 0.5    Lower Boundary (LB)    Upper Limit + 0.5    Upper Boundary (UB)    Class Boundaries (LB - UB)
24 - 0.5             23.5                   30 + 0.5             30.5                   23.5 - 30.5
31 - 0.5             30.5                   37 + 0.5             37.5                   30.5 - 37.5
38 - 0.5             37.5                   44 + 0.5             44.5                   37.5 - 44.5
45 - 0.5             44.5                   51 + 0.5             51.5                   44.5 - 51.5
52 - 0.5             51.5                   58 + 0.5             58.5                   51.5 - 58.5
59 - 0.5             58.5                   65 + 0.5             65.5                   58.5 - 65.5
66 - 0.5             65.5                   72 + 0.5             72.5                   65.5 - 72.5
73 - 0.5             72.5                   79 + 0.5             79.5                   72.5 - 79.5
80 - 0.5             79.5                   86 + 0.5             86.5                   79.5 - 86.5
87 - 0.5             86.5                   93 + 0.5             93.5                   86.5 - 93.5
Step 5: Calculate the class marks or class midpoints. It is the numerical location
of the center of the class and is computed as follows:
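A minimal Python sketch of Steps 4 and 5 is given here. It assumes that the class mark is the average of the lower and upper limits of a class, for example (24 + 30) / 2 = 27, which is a common definition. The variable names are illustrative.

```python
limits = [(24, 30), (31, 37), (38, 44), (45, 51), (52, 58),
          (59, 65), (66, 72), (73, 79), (80, 86), (87, 93)]

# Step 4: subtract 0.5 from each lower limit, add 0.5 to each upper limit.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in limits]

# Step 5 (assumed formula): class mark = (lower limit + upper limit) / 2.
class_marks = [(lower + upper) / 2 for lower, upper in limits]

for (lower, upper), (lb, ub), mark in zip(limits, boundaries, class_marks):
    print(f"{lower} - {upper}   {lb} - {ub}   {mark:g}")
```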
Applying the steps, Table 2 shows what the frequency distribution table looks like.
Table No. 2
Frequency Distribution of the 50 Test Scores in Statistics
Class Intervals    Class Boundaries    Class Marks    Frequency
LL - UL            LB - UB             Xi             f
24 - 30            23.5 - 30.5         27             3
31 - 37            30.5 - 37.5         34             3
38 - 44            37.5 - 44.5         41             6
45 - 51            44.5 - 51.5         48             7
52 - 58            51.5 - 58.5         55             8
59 - 65            58.5 - 65.5         62             9
66 - 72            65.5 - 72.5         69             3
73 - 79            72.5 - 79.5         76             6
80 - 86            79.5 - 86.5         83             3
87 - 93            86.5 - 93.5         90             2
c = 7                                                 n = 50
Start making the Worksheet on Data Management. Read the instructions
carefully and do Direction 1.
Illustration 1
(Data from Test Scores of 50 Students in Statistics)

Less Than Cumulative Frequency (<cf): successive addition of frequencies from
top to bottom
f      Computation    <cf
3                     3
3      3 + 3          6
6      6 + 6          12
7      12 + 7         19
8      19 + 8         27
9      27 + 9         36
3      36 + 3         39
6      39 + 6         45
3      45 + 3         48
2      48 + 2         50
n = 50

Greater Than Cumulative Frequency (>cf): successive addition of frequencies from
bottom to top
f      Computation    >cf
3      47 + 3         50
3      44 + 3         47
6      38 + 6         44
7      31 + 7         38
8      23 + 8         31
9      14 + 9         23
3      11 + 3         14
6      5 + 6          11
3      2 + 3          5
2                     2
n = 50

Resulting "Less than" and "Greater than" Cumulative Frequencies
f      <cf    >cf
3      3      50
3      6      47
6      12     44
7      19     38
8      27     31
9      36     23
3      39     14
6      45     11
3      48     5
2      50     2
n = 50
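The successive additions can be reproduced with a short Python sketch based on the standard library's `itertools.accumulate`. The variable names are illustrative.

```python
from itertools import accumulate

frequencies = [3, 3, 6, 7, 8, 9, 3, 6, 3, 2]

# <cf: running total of the frequencies from top to bottom.
less_than_cf = list(accumulate(frequencies))

# >cf: running total from bottom to top, then put back in table order.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

for f, lcf, gcf in zip(frequencies, less_than_cf, greater_than_cf):
    print(f, lcf, gcf)
```

The last <cf and the first >cf are both equal to n = 50, which is a handy check on the additions.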
Then let us proceed to make the table on cumulative percentage frequency.
$$\text{Cumulative Percentage Frequency (cpf)} = \frac{cf}{n} \times 100$$ (Formula 5)
Frequency    Cumulative Frequency    Cumulative Percentage Frequency
f            <cf      >cf            <cpf                   >cpf
3            3        50             (3/50) x 100 = 6       (50/50) x 100 = 100
3            6        47             (6/50) x 100 = 12      (47/50) x 100 = 94
Now our table looks like this with the addition of the column on cumulative
percentage frequency.
Table 3
Cumulative Percentage Distribution of 50 Test Scores in Statistics
Class Intervals    Class Boundaries    Frequency    Cumulative Frequency    Cumulative Percentage Frequency
LL - UL            LB - UB             f            <cf      >cf            <cpf     >cpf
24 - 30            23.5 - 30.5         3            3        50             6        100
31 - 37            30.5 - 37.5         3            6        47             12       94
38 - 44            37.5 - 44.5         6            12       44             24       88
45 - 51            44.5 - 51.5         7            19       38             38       76
52 - 58            51.5 - 58.5         8            27       31             54       62
59 - 65            58.5 - 65.5         9            36       23             72       46
66 - 72            65.5 - 72.5         3            39       14             78       28
73 - 79            72.5 - 79.5         6            45       11             90       22
80 - 86            79.5 - 86.5         3            48       5              96       10
87 - 93            86.5 - 93.5         2            50       2              100      4
c = 7                                  n = 50
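The cumulative percentage frequencies can be recomputed with a minimal Python sketch of Formula 5, cpf = (cf / n) x 100. The function name `cumulative_percentage` is illustrative.

```python
from itertools import accumulate

frequencies = [3, 3, 6, 7, 8, 9, 3, 6, 3, 2]
n = sum(frequencies)

less_than_cf = list(accumulate(frequencies))
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]


def cumulative_percentage(cumulative_frequencies: list[int], total: int) -> list[float]:
    """Formula 5: cpf = (cf / n) x 100, applied to each cumulative frequency."""
    return [cf * 100 / total for cf in cumulative_frequencies]


less_than_cpf = cumulative_percentage(less_than_cf, n)
greater_than_cpf = cumulative_percentage(greater_than_cf, n)

for row in zip(less_than_cf, greater_than_cf, less_than_cpf, greater_than_cpf):
    print(*row)
```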
This is how to interpret the cumulative frequency and the cumulative
percentage frequency.
Remember: Use the upper class boundaries in interpreting the <cf and
the <cpf. (lower than the upper class boundaries)
Use the lower class boundaries in interpreting the >cf and the
>cpf. (higher than the lower class boundaries)
(For less than cumulative frequency and less than cumulative percentage frequency,
use the numbers colored yellow in the table.)
(For greater than cumulative frequency and greater than cumulative percentage
frequency, use the numbers colored green in the table.)
$$\text{Relative Frequency (rf)} = \frac{f}{n} \times 100\%$$ (Formula 6)
Illustration No. 3
Resulting Relative Frequency Distribution of 50 Test Scores in Statistics
Frequency (f)    Computation         Relative Frequency (rf), in %
3                (3/50) x 100%       6
3                (3/50) x 100%       6
6                (6/50) x 100%       12
7                (7/50) x 100%       14
8                (8/50) x 100%       16
9                (9/50) x 100%       18
3                (3/50) x 100%       6
6                (6/50) x 100%       12
3                (3/50) x 100%       6
2                (2/50) x 100%       4
n = 50                               100%
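The relative frequencies can be verified with a minimal Python sketch of Formula 6, rf = (f / n) x 100%. The function name `relative_frequencies` is illustrative.

```python
def relative_frequencies(frequencies: list[int]) -> list[float]:
    """Formula 6: rf = (f / n) x 100%, where n is the total frequency."""
    n = sum(frequencies)
    return [f * 100 / n for f in frequencies]


frequencies = [3, 3, 6, 7, 8, 9, 3, 6, 3, 2]
rf = relative_frequencies(frequencies)

for f, value in zip(frequencies, rf):
    print(f, f"{value:g}%")
print("Total", f"{sum(rf):g}%")
```

The relative frequencies should always add up to 100%, which serves as a check on the computation.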
Table 4
Cumulative Percentage and Relative Frequency Distribution of 50 Test Scores
in Statistics
Class Intervals    Class Boundaries    Frequency    Cumulative Frequency    Cumulative Percentage Frequency    Relative Frequency
LL - UL            LB - UB             f            <cf      >cf            <cpf     >cpf                      (RF)
24 - 30            23.5 - 30.5         3            3        50             6        100                       6
31 - 37            30.5 - 37.5         3            6        47             12       94                        6
38 - 44            37.5 - 44.5         6            12       44             24       88                        12
45 - 51            44.5 - 51.5         7            19       38             38       76                        14
52 - 58            51.5 - 58.5         8            27       31             54       62                        16
59 - 65            58.5 - 65.5         9            36       23             72       46                        18
66 - 72            65.5 - 72.5         3            39       14             78       28                        6
73 - 79            72.5 - 79.5         6            45       11             90       22                        12
80 - 86            79.5 - 86.5         3            48       5              96       10                        6
87 - 93            86.5 - 93.5         2            50       2              100      4                         4
c = 7                                  n = 50                                                                  100%