Module Assessment1 C7.
TOPICS
1. The Measures of Central Tendency
2. Normal and Skewed Distribution
3. Outcome-based Teaching-Learning and Score Distribution
4. Measures of Dispersion or Variability
LEARNING OUTCOMES
At the end of the lesson, you should be able to:
1. Explain the meaning and function of the different measures of central
tendency and variability.
2. Distinguish among the measures of central tendency and measures of
variability.
3. Explain the meaning of normal and skewed score distributions.
There are three measures of central tendency: the mean, the median and the
mode. Perhaps you are most familiar with the mean (often called the average), but
the median and the mode also describe the center of a score distribution.
Is there such a thing as a best measure of central tendency?
TOPIC 1: THE MEASURES OF CENTRAL TENDENCY
The mean, median and mode are all valid measures of central tendency, but under
different conditions one measure becomes more appropriate than the others. For
example, if the data set contains extremely high or extremely low scores, the median
is a better measure of central tendency, since the mean is affected by extreme
scores.
The mean (or average or arithmetic mean) is the most popular and best-known
measure of central tendency. The mean is equal to the sum of all the values in the
data set divided by the number of values in the data set. For example, 10 students in
a Graduate School class got the following scores in a 100-item test: 70, 72, 75, 77, 78,
80, 84, 87, 90, 92. The mean score of the group is the sum of all their scores divided
by 10, that is, 805/10 = 80.5. There are 6 scores below the mean (70, 72, 75, 77, 78,
and 80) and 4 scores above it (84, 87, 90, and 92).
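The computation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names (scores, below, above) are chosen here for illustration and are not part of the module.

```python
# Scores of the 10 Graduate School students from the example above.
scores = [70, 72, 75, 77, 78, 80, 84, 87, 90, 92]

# Mean = sum of all values divided by the number of values.
mean = sum(scores) / len(scores)

# Split the scores into those below and those above the mean.
below = [s for s in scores if s < mean]
above = [s for s in scores if s > mean]

print(mean)        # 80.5
print(len(below))  # 6
print(len(above))  # 4
```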
The mean has one main disadvantage: it is particularly susceptible to the influence of
outliers. These are values that are unusual compared to the rest of the data set by being
especially small or large in numerical value. For example, consider the scores of 10
Grade 12 students in a 100-item Statistics test below:
Student   1    2    3    4    5    6    7    8    9   10
Score     5   38   56   60   67   70   73   78   79   95
The mean score for these ten Grade 12 students is 62.1. However, inspecting the raw
data suggests that the mean may not be the best way to reflect the score of the
typical Grade 12 student, as most students have scores in the 56 to 79 range.
The mean is being skewed by the extremely low and extremely high scores. Therefore,
in this situation, we would like to have a better measure of central tendency. As we will
find out later, the median would be a better measure of central tendency in this
situation.
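To see how strongly a single outlier pulls the mean, the sketch below recomputes the mean of the ten scores with and without the lowest score, and also reports the median using Python's standard statistics module. The variable names are illustrative assumptions, not part of the module.

```python
from statistics import mean, median

# Scores of the ten Grade 12 students from the table above.
grade12_scores = [5, 38, 56, 60, 67, 70, 73, 78, 79, 95]

mean_all = mean(grade12_scores)
median_all = median(grade12_scores)

# Drop the lowest score (5) to see how much it pulls the mean down.
without_lowest = sorted(grade12_scores)[1:]
mean_without_lowest = mean(without_lowest)

print(mean_all)                       # 62.1
print(median_all)                     # 68.5
print(round(mean_without_lowest, 1))  # 68.4
```

Removing one extreme score moves the mean by more than six points, while the median of the full set already sits inside the range where most scores fall.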
Median
The median is the middle score for a set of scores arranged from lowest to highest.
The median is less affected by extremely low and extremely high scores than the mean.
How do we find the median? Suppose we have the following data:
65 55 89 56 35 14 56 55 87 45 92
To determine the median, first we have to rearrange the scores into order of
magnitude (from smallest to largest):
14 35 45 55 55 56 56 65 87 89 92
Our median is the score at the middle of the distribution, in this case 56. There are
5 scores before it and 5 scores after it. This works fine when
you have an odd number of scores, but what happens when you have an even number
of scores? What if you had 10 scores like the scores below?
65 55 89 56 35 14 56 55 87 45
Arrange the data in order of magnitude (smallest to largest): 14, 35, 45, 55, 55, 56,
56, 65, 87, 89. Then take the middle two scores (55 and 56) and compute their average.
The median is 55.5. This gives us a more reliable picture of the tendency of the scores.
There are indeed scores of 55 and 56 in the score distribution.
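The two cases (odd and even number of scores) can be sketched in Python as follows. The function name median and the variable names are illustrative choices made here; Python's standard library also offers statistics.median, which follows the same rule.

```python
def median(scores):
    """Return the middle score, or the average of the two middle scores."""
    ordered = sorted(scores)          # arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: one middle score
        return ordered[middle]
    # even count: average the two middle scores
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

print(median(odd_set))   # 56
print(median(even_set))  # 55.5
```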
Mode
The mode is the most frequent score in our data set. On a histogram or bar chart it
is represented by the highest bar. If the data record the number of times each option
is chosen in a multiple-choice test, you can therefore consider the mode to be the
most popular option. Study the score distribution given below:
14 35 45 55 55 56 56 65 87 89
There are two most frequent scores, 55 and 56, each occurring twice. So we have a
score distribution with two modes, hence a bimodal distribution.
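A quick way to find every mode in a set of scores is sketched below, assuming Python 3.8 or later, where the standard statistics module provides multimode. The variable names are illustrative.

```python
from statistics import multimode

scores = [14, 35, 45, 55, 55, 56, 56, 65, 87, 89]

# multimode returns every value that shares the highest frequency.
modes = multimode(scores)

print(modes)            # [55, 56]
print(len(modes) == 2)  # True, so the distribution is bimodal
```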
TOPIC 2: NORMAL AND SKEWED DISTRIBUTION
A score distribution has a “normal distribution” when most of the values are
aggregated around the mean and the number of values decreases as you move below
or above the mean. The bar graph of frequencies of a “normally distributed” sample
will look like a bell curve.
If the mean is equal to the median and the median is equal to the mode, the score
distribution shows a perfectly normal distribution. This is illustrated by the perfect
bell shape or normal curve shown in Figure 13.
If the mean is less than the median and the mode, the score distribution is a negatively
skewed distribution. In a negatively skewed distribution the scores tend to congregate
at the upper end of the score distribution (see Figure 14).
If the mean is greater than the median and the mode, the score distribution is a positively
skewed distribution. In a positively skewed distribution the scores tend to congregate
at the lower end of the score distribution.
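The rules above can be turned into a rough check. The sketch below is a simplification that compares only the mean and the median; the helper name skew_direction and the three score lists are hypothetical and were made up here purely for illustration.

```python
from statistics import mean, median

def skew_direction(scores):
    """Rough rule of thumb: compare the mean with the median."""
    m, med = mean(scores), median(scores)
    if m < med:
        return "negatively skewed"   # scores congregate at the upper end
    if m > med:
        return "positively skewed"   # scores congregate at the lower end
    return "no skew indicated"       # mean and median coincide

# Hypothetical score sets, invented for this illustration only.
mostly_high = [60, 85, 88, 90, 92, 95, 95]
mostly_low = [5, 5, 8, 10, 12, 15, 40]
balanced = [70, 75, 80, 85, 90]

print(skew_direction(mostly_high))  # negatively skewed
print(skew_direction(mostly_low))   # positively skewed
print(skew_direction(balanced))     # no skew indicated
```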
If scores tend to be high because the teacher taught very well and the students are
highly motivated to learn, the score distribution tends to be negatively skewed. On the
other hand, when the teacher does not teach well and the students are poorly
motivated, the score distribution tends to be positively skewed, which means that
scores tend to be low. So which score distribution should we work for?
TOPIC 3: OUTCOME-BASED TEACHING-LEARNING AND SCORE DISTRIBUTION
On the other hand, if what teachers teach and assess is not aligned with the intended
learning outcomes, the opposite will be true: the score distribution will be positively
skewed, which means that scores tend to congregate at the lower end of the score
distribution.
TOPIC 4: MEASURES OF DISPERSION OR VARIABILITY
If the measures of central tendency indicate where scores congregate, the measures
of variability indicate how spread out a group of scores is or how varied the scores are.
Common measures of dispersion or variability are the range, the variance, and the
standard deviation.
What is Variability?
Variability refers to how “spread out” a group of scores is. The terms variability, spread,
and dispersion are synonyms. Here are two score distributions:
A – 5, 5, 5, 5, 6, 6, 6, 6, 6, 6 – Mean is 5.6
B – 1, 3, 4, 5, 5, 6, 7, 8, 8, 9 – Mean is 5.6
The two score distributions have equal means, yet the scores differ in how varied they
are. The scores in distribution A are less varied than those in distribution B. That is
what we mean by variability or dispersion. Assuming that the highest possible score in
the quiz is 10, we can say that Groups A and B are equal in terms of the mean, but the
scores in Group A are more similar to one another and closer to the mean, while the
scores in Group B are more varied. In fact, the lowest score in Group B (1) is far below
the lowest score in Group A (5), and the highest score in Group B (9) is well above the
highest score in Group A (6).
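The comparison of Groups A and B can be reproduced with a short sketch. It reports only the mean, the lowest score and the highest score of each group; the variable names are illustrative.

```python
from statistics import mean

group_a = [5, 5, 5, 5, 6, 6, 6, 6, 6, 6]
group_b = [1, 3, 4, 5, 5, 6, 7, 8, 8, 9]

# Same mean, very different spread between lowest and highest score.
for name, scores in (("A", group_a), ("B", group_b)):
    print(name, mean(scores), min(scores), max(scores))
# A 5.6 5 6
# B 5.6 1 9
```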
To see more clearly what we mean by spread out, consider the graphs in Figures 16 and
17. These graphs represent the scores on two quizzes. The mean score for each quiz is
7.0. Despite the equality of means, you can see that the distributions are quite different.
Specifically, the scores on Quiz 1 are more densely packed and those on Quiz 2 are
more spread out. The differences among students were much greater on Quiz 2 than
on Quiz 1.
Figure 16. Quiz 1
Figure 17. Quiz 2
Range
The range is the simplest measure of variability. It is the highest score minus the
lowest score. Let’s take a few examples. What is the range of the following group of
scores: 10, 2, 4, 6, 7, 3, 4? The highest number is 10 and the lowest is 2, so
10 – 2 = 8. The range is 8.
Here are other examples:
Here is a set of scores in a test: 99, 45, 23, 67, 45, 91, 82, 78, 62, 51. What is the range?
The highest number is 99 and the lowest number is 23, so 99 – 23 equals 76; the range
is 76. Here is another set of scores: 40, 40, 42, 50, 53, 56, 67, 68, 70, 89. What is the
range? 89 minus 40 equals 49. The range is 49. The set of scores with a range of 76 is
more varied than the set of scores with a range of 49.
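The three range examples above can be verified with a one-line function. The name score_range is an illustrative choice made here.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

print(score_range([10, 2, 4, 6, 7, 3, 4]))                      # 8
print(score_range([99, 45, 23, 67, 45, 91, 82, 78, 62, 51]))    # 76
print(score_range([40, 40, 42, 50, 53, 56, 67, 68, 70, 89]))    # 49
```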
Variance
Variability can also be defined in terms of how close the scores in the distribution are
to the middle of the distribution. Using the mean as the measure of the middle of the
distribution, the variance is defined as the average squared difference of the scores
from the mean. The data from Quiz 1 are shown in Table 1. The mean score is 7.0.
Therefore, the column “Deviation from Mean” contains the score minus 7. The column
“Squared Deviation” is simply the previous column squared.
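One thing that is important to notice is that the mean deviation from the mean is 0.
This will always be the case. The mean of the squared deviations is 1.5. Therefore, the
variance is 1.5. The formula for the variance is:
$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$
where $\sigma^2$ is the variance, $\mu$ is the mean, and $N$ is the number of scores.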
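The definition of the variance as the mean of the squared deviations from the mean can be checked numerically. Since the Quiz 1 scores of Table 1 are not reproduced here, the sketch below uses the Group B scores from the variability example instead; that substitution and the variable names are assumptions made for illustration.

```python
import math
from statistics import pvariance

# Group B scores from the variability example (not the Quiz 1 data).
scores = [1, 3, 4, 5, 5, 6, 7, 8, 8, 9]

n = len(scores)
mu = sum(scores) / n                      # mean of the scores
deviations = [x - mu for x in scores]     # "Deviation from Mean" column
squared = [d ** 2 for d in deviations]    # "Squared Deviation" column
variance = sum(squared) / n               # mean of the squared deviations

# The deviations always sum to zero (up to floating-point rounding).
print(math.isclose(sum(deviations), 0, abs_tol=1e-9))  # True
print(round(variance, 2))                               # 5.64
# Cross-check against the standard library's population variance.
print(math.isclose(variance, pvariance(scores)))        # True
```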
Standard Deviation
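The formula for the population standard deviation is:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$$
Example: Sam has 20 rose bushes. The number of flowers on each bush is 9, 2, 5, 4, 12,
7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4.
Step 1: Work out the mean.
In the formula above, μ (the Greek letter “mu”) is the mean of our values.
The mean is
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4)/20 = 140/20 = 7
So: μ = 7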
Step 2: Then for each number: subtract the Mean and square the result.
$$(x_i - \mu)^2$$
So what is $x_i$? They are the individual x values 9, 2, 5, 4, 12, 7, etc. In other words,
$x_1 = 9$, $x_2 = 2$, $x_3 = 5$, etc.
So it says “for each value, subtract the mean and square the result,” like this:
Example (continued):
$(9 - 7)^2 = (2)^2 = 4$
$(2 - 7)^2 = (-5)^2 = 25$
$(5 - 7)^2 = (-2)^2 = 4$
$(4 - 7)^2 = (-3)^2 = 9$
$(12 - 7)^2 = (5)^2 = 25$
$(7 - 7)^2 = (0)^2 = 0$
$(8 - 7)^2 = (1)^2 = 1$
… etc…
Step 3: Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how many.
But how do we say “add them all up” in mathematics? We use “Sigma”: Σ
We want to add up all the values from 1 to N, where N = 20 in our case because there
are 20 values:
Example (continued):
$$\sum_{i=1}^{N}(x_i - \mu)^2$$
Which means: sum all values from $(x_1 - 7)^2$ to $(x_N - 7)^2$.
We already calculated $(x_1 - 7)^2 = 4$ etc. in the previous step, so just sum them up:
4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn’t the mean yet; we still need to divide by how many values there are,
which is done by dividing by N.
Example (continued):
$$\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N} = \frac{178}{20} = 8.9$$
Step 4: Take the square root of that:
Example (concluded):
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$$
$$\sigma = \sqrt{8.9} = 2.983\ldots$$
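The four steps above can be followed line by line in Python. This is a minimal sketch using the 20 flower counts as they appear in the mean calculation; the variable names are illustrative.

```python
import math

# Flower counts for all 20 rose bushes (the whole population).
flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

n = len(flowers)                                # N = 20
mu = sum(flowers) / n                           # Step 1: the mean
squared = [(x - mu) ** 2 for x in flowers]      # Step 2: squared differences
mean_of_squares = sum(squared) / n              # Step 3: divide the sum by N
sigma = math.sqrt(mean_of_squares)              # Step 4: square root

print(mu)                # 7.0
print(sum(squared))      # 178.0
print(mean_of_squares)   # 8.9
print(round(sigma, 3))   # 2.983
```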
But sometimes our data are only a sample of the whole population.
Example: Sam has 20 rose bushes, but only counted the flowers on 6 of them!
The “population” is all 20 rose bushes, and the “sample” is the 6 bushes that Sam
counted among the 20.
But when we use the sample as an estimate of the whole population, the Standard
Deviation formula changes to this:
$$s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N - 1}}$$
The important change is “N-1” instead of “N” (which is called “Bessel’s correction”).
The symbols also change to reflect that we are working on a sample instead of the
whole population:
The mean is now $\bar{x}$ (for sample mean) instead of $\mu$ (the population mean),
and the result is $s$ (the sample standard deviation) instead of $\sigma$.
But that does not affect the calculations. Only N-1 instead of N changes the
calculations.
Here are the steps in calculating the Sample Standard Deviation:
So $\bar{x} = 6.5$
Step 2: Then for each number: subtract the Mean and square the result
Example 2 (continued):
To work out the mean, add up all the values then divide by how many.
But hang on… we are calculating the sample Standard Deviation, so instead of dividing
by how many (N), we will divide by N-1.
Example 2 (continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
Divide by N – 1: 65.5/5 = 13.1
Example 2 (concluded):
$$s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N - 1}}$$
$$s = \sqrt{13.1} = 3.619\ldots$$
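The sample calculation can be sketched the same way. The six sample values are not listed in the text, so the sketch assumes they are the first six flower counts (9, 2, 5, 4, 12, 7); this assumption is consistent with the sample mean of 6.5 and the squared differences summed above, but it remains an assumption. The variable names are illustrative.

```python
import math
from statistics import stdev

# ASSUMPTION: the 6 sampled bushes are the first six of the 20 counts.
sample = [9, 2, 5, 4, 12, 7]

n = len(sample)                                 # N = 6
x_bar = sum(sample) / n                         # sample mean
squared = [(x - x_bar) ** 2 for x in sample]    # squared differences
total = sum(squared)
s = math.sqrt(total / (n - 1))                  # divide by N - 1, then root

print(x_bar)        # 6.5
print(squared)      # [6.25, 20.25, 2.25, 6.25, 30.25, 0.25]
print(total)        # 65.5
print(round(s, 3))  # 3.619
# The standard library's stdev also divides by N - 1.
print(math.isclose(s, stdev(sample)))  # True
```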
Comparing
When we used the whole population we got: Mean = 7, Standard Deviation = 2.983…
When we used the sample we got: Sample Mean = 6.5, Sample Standard Deviation =
3.619…
Our Sample Mean was wrong by 7% and our Sample Standard Deviation was wrong by
21%.
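The two percentages can be reproduced as follows. The helper name percent_off is a hypothetical name introduced here; the inputs are the population and sample results quoted above.

```python
import math

population_mean = 7.0
population_sd = math.sqrt(8.9)    # 2.983...
sample_mean = 6.5
sample_sd = math.sqrt(13.1)       # 3.619...

def percent_off(estimate, true_value):
    """Size of the error as a percentage of the true value."""
    return abs(estimate - true_value) / true_value * 100

print(round(percent_off(sample_mean, population_mean)))  # 7
print(round(percent_off(sample_sd, population_sd)))      # 21
```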
Imagine you want to know what the whole university thinks… you can’t ask thousands
of people, so instead you ask maybe only 300 people. Samuel Johnson once said “You
don’t have to eat the whole ox to know that the meat is tough.”
The standard deviation is simply the square root of the variance. The standard
deviation is an especially useful measure of variability when the distribution is normal
or approximately normal, because the proportion of the distribution within a given
number of standard deviations from the mean can be calculated. For example,
approximately 68% of the distribution is within one standard deviation of the mean and
approximately 95% of the distribution is within two standard deviations of the mean.
Therefore, if you had a normal distribution with a mean of 50 and a standard deviation
of 10, then about 68% of the distribution would be between 50 – 10 = 40 and
50 + 10 = 60. Similarly, about 95% of the distribution would be between
50 – 2 × 10 = 30 and 50 + 2 × 10 = 70. The symbol for the population standard
deviation is σ.
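The 68% and 95% figures can be checked with the NormalDist class from Python's standard statistics module (available from Python 3.8). This is a sketch for the example with mean 50 and standard deviation 10; the variable names are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=10)

# Share of the distribution within k standard deviations of the mean.
for k in (1, 2):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    share = dist.cdf(high) - dist.cdf(low)
    print(k, low, high, round(share * 100, 1))
# 1 40.0 60.0 68.3
# 2 30.0 70.0 95.4
```

The exact shares are slightly above 68% and 95%, which is why the text describes them as approximate.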
Figure 18 shows two normal distributions. The first distribution (bold line) has a mean
of 40 and a standard deviation of 5; the other distribution has a mean of 60 and a
standard deviation of 10. For the first distribution (bold line), 68% of the distribution
is between 35 and 45; for the other distribution, 68% is between 50 and 70.
Standard deviation is a measure of dispersion: the more dispersed the data, the less
consistent the data are. A lower standard deviation means that the data are more
clustered around the mean and hence the data set is more consistent.
Let us use the standard deviation to compare two data sets and to interpret how
consistent the data are.
Example: Two bowlers, Katie and Mike, have the scores given below:
Both sets of data have a mean ($\bar{x}$) of 201.4. Does this mean they are equivalent
bowlers? No; consider the standard deviations. Katie has a standard deviation (SD) of
37.6470 and Mike has a standard deviation (SD) of 26.1017. Since Mike has a smaller
standard deviation, he is a more consistent bowler than Katie, i.e. Mike’s scores are
more likely to fall close to 201.4.
Let’s presume that Katie’s and Mike’s scores are scores in a long test:
If you compute the mean for both sets of scores, you get 201.4. The SD for Katie’s
scores is 37.6470 while that of Mike’s is 26.1017. Mike’s scores indicate greater
consistency than those of Katie. This means that Mike’s performance is more
predictable than Katie’s, even though their mean scores are the same.
Exercises 7
Read and understand each question/statement carefully and then select the letter of
the best answer from the choices.
2. If scores are plotted in a histogram, what do you call the score with the highest
frequency?
A. Mean C. Mode
B. Median D. Standard Deviation
8. You would like to get a more reliable picture of the scores of your students in your
Math class. Which will you compute?
A. The mean C. The difficulty index
B. The mean and Standard Deviation D. The discrimination index
10. Which score distribution do all teachers, parents, and students wish for?
A. Negatively Skewed C. Positively Skewed
B. Bell curve D. That depends on the Mean
11. If no real teaching and learning take place, which score distribution
is most likely to result?
A. Negatively Skewed C. Positively Skewed
B. Bell curve D. That depends on the Mean
12. Among the measures of central tendency, which is most affected by outliers?
A. Mean C. Median
B. Mode D. Range
13. If a score distribution has no outliers, which is most likely to be TRUE?
A. The scores may not be so varied
B. The scores may be highly varied
C. In this case, the median is the most reliable measure of central tendency.
D. In this case, the mode is the best measure of central tendency.
14. Which is the mean of the squared deviation from the mean?
A. Variance C. Standard Deviation
B. Range D. Mean
15. Which is TRUE of scores that follow the normal distribution curve?
A. The mean, the median and the mode are equal.
B. The mean is higher than the mode.
C. The mean is higher than the median.
D. The mode is higher than the mean and the median.
16. If a score distribution has a Standard Deviation of zero, what does it mean?
A. Most scores are zero.
B. The scores are the same.
C. Most scores are high.
D. Most scores are negative.