Measures of Central Tendency
We will see in future lessons that the mean of a population is denoted by the
symbol μ, whereas the mean of a sample is denoted by the symbol x̅.
We will also learn in future lessons that the formula for the standard deviation
of a population is different from the formula for the standard deviation of a
sample.
Slovin's formula
Example:
Suppose that you have a group of 1,000 city government employees and you
want to survey them to find out which tools are best suited to their jobs. You
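decide that you are happy with a margin of error of 0.05. Using Slovin's
formula, n = N / (1 + Ne^2), where N is the population size and e is the margin
of error, you would be required to survey:
n = 1000 / (1 + 1000 x 0.05^2) = 1000 / 3.5 ≈ 286 people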
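The Slovin calculation for this example can be checked with a short Python sketch. The function name slovin_sample_size is hypothetical, and rounding the result up to a whole person is an assumption, since the text does not state a rounding rule.

```python
import math


def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Sample size from Slovin's formula n = N / (1 + N * e^2).

    Rounding up to a whole person is an assumption of this sketch.
    """
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)


# 1,000 employees, margin of error 0.05
sample_size = slovin_sample_size(1000, 0.05)
print(sample_size)  # 286
```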
There are two types of data: qualitative data and quantitative data. The
difference between the two is that quantitative data describes numerical
information, while qualitative data describes categories or attributes. For
instance, the measurement of temperature is quantitative data.
Example:
A student scored 80%, 72%, 50%, 64% and 74% in five subjects in an
examination. Find the mean percentage of marks obtained by the student.
Solution:
Here, the observations in percentage are:
x1 = 80, x2 = 72, x3 = 50, x4 = 64, x5 = 74
Therefore: mean = ( x1 + x2 + x3 + x4 + x5 ) / 5
mean = 340 / 5
mean = 68
Therefore, the mean percentage of marks obtained by the student was 68%.
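The same arithmetic, sketched in Python with the five marks from the example (the variable names are illustrative only):

```python
# Percentage marks in the five subjects, taken from the example above
marks = [80, 72, 50, 64, 74]

total = sum(marks)          # 340
mean = total / len(marks)   # 340 / 5

print(total, mean)  # 340 68.0
```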
If the values of the variable (observations or variates) are x1, x2, x3, … xn and
their corresponding frequencies are f1, f2, f3, … fn, then:
mean = ( f1x1 + f2x2 + f3x3 + … + fnxn ) / ( f1 + f2 + f3 + … + fn )
Example:
The ages of the students of a class are:
14, 13, 14, 15, 12, 13, 13, 14, 15, 12, 15, 14, 12, 16, 13, 14, 14, 15, 16, 12
Find the mean age of the students of the class.
Solution:
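In the data, only five different values appear: 12, 13, 14, 15 and 16.
So, we write the frequencies of the variates as below.
Age ( x )    Frequency ( f )    fx
12           4                  48
13           4                  52
14           6                  84
15           4                  60
16           2                  32
total        20                 276
mean = 276 / 20 = 13.8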
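The tally of the ages and the resulting frequency-weighted mean can be checked with a minimal Python sketch. It uses the 20 ages listed above; the variable names are illustrative only.

```python
from collections import Counter

# The 20 ages listed in the example above
ages = [14, 13, 14, 15, 12, 13, 13, 14, 15, 12,
        15, 14, 12, 16, 13, 14, 14, 15, 16, 12]

# Frequency of each distinct age (the variates and their frequencies)
frequency = Counter(ages)

# Sum of f * x over the distinct ages, divided by the total frequency
total_fx = sum(age * f for age, f in frequency.items())
n = sum(frequency.values())
mean_age = total_fx / n

for age in sorted(frequency):
    print(age, frequency[age])
print(total_fx, n, mean_age)  # 276 20 13.8
```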
The median is the middle value in the arrayed data. That is, when the data
are arranged in order, the median is the value in the middle.
Step 1: Arrange the observations from lowest to highest.
Step 2: If the number of observations is odd, the number in the middle of the
list is the median. This can be found by taking the value of the (n+1)/2 -th term,
where n is the number of observations.
If the number of observations is even, the median is the simple average of the
middle two numbers, that is, the average of the n/2 -th and the (n/2 + 1) -th
terms.
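The odd and even cases above can be sketched as one small Python function. The function name median is illustrative, and the two sample lists are the marks and the ages from the earlier examples, reused here only to exercise both cases.

```python
def median(values):
    """Median of raw (ungrouped) data, following the odd/even rule above."""
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:
        # odd n: the (n+1)/2 -th term, which is index n // 2 when counting from 0
        return ordered[middle]
    # even n: average of the n/2 -th and (n/2 + 1) -th terms
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_case = median([80, 72, 50, 64, 74])        # 5 values
even_case = median([14, 13, 14, 15, 12, 13, 13, 14, 15, 12,
                    15, 14, 12, 16, 13, 14, 14, 15, 16, 12])  # 20 values
print(odd_case, even_case)  # 72 14.0
```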
Example:
Who’s In the Middle?
To compare its employees’ age profile against that of other companies in the
industry, your company, Middle World Co. has asked you to calculate the
median age of your fellow workers.
Example:
Suppose we have the monthly wages of 10 employees of a company. How
would you find the median wage?
Solution: Notice that we have arranged their wages above from lowest to
highest. This ranking will help us to determine the median. Using the method
introduced earlier, the median is computed by taking the simple average of the
(n/2)-th = (10/2)-th = 5th and (n/2 + 1)-th = (10/2+1)-th = 6th observations.
Mode of Ungrouped / Raw Data
Example:
Range ( R ) – the difference between the highest and the lowest values in the data.
Class size ( C ) – the width of a class interval, which starts at a certain
minimum value ( the lower limit ) and ends at a certain maximum value ( the upper limit ).
55 70 57 73 55 59 64 72
60 48 58 54 69 51 63 78
75 64 65 57 71 77 76 62
49 66 62 76 61 63 63 76
52 76 71 61 53 56 67 71
In this example, the greatest mass is 78 and the smallest mass is 48. The
range of the masses is then 78 – 48 = 30. The scale of the frequency table
must contain the range of masses.
Step 2: Number of classes ( K )
C = R / K    C = 30 / 6    C = 5
Therefore, there will be 6 rows of classes, each with an interval ( class size ) of 5.
Frequency Distribution Table
CLASSES    FREQUENCY ( f )
48 – 52    4
53 – 57    7
58 – 62    7
63 – 67    8
68 – 72    6
73 – 77    8
The values 48, 53, 58, 63, 68, and 73 are called the lower limits, while the
values 52, 57, 62, 67, 72, and 77 are called the upper limits.
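The tallying step can be sketched in Python, assuming the 40 masses listed earlier and the lower limits and class size from the table. One value, 78, lies one unit above the last upper limit of 77. The table's frequency of 8 for the last class matches counting it there, so this sketch reports that value separately instead of silently dropping it. All names are illustrative.

```python
# The 40 masses listed above, row by row
masses = [55, 70, 57, 73, 55, 59, 64, 72,
          60, 48, 58, 54, 69, 51, 63, 78,
          75, 64, 65, 57, 71, 77, 76, 62,
          49, 66, 62, 76, 61, 63, 63, 76,
          52, 76, 71, 61, 53, 56, 67, 71]

lower_limits = [48, 53, 58, 63, 68, 73]   # from the frequency table
class_size = 5

frequencies = [0] * len(lower_limits)
outside = []                               # values beyond the last upper limit
for mass in masses:
    index = (mass - lower_limits[0]) // class_size
    if 0 <= index < len(frequencies):
        frequencies[index] += 1
    else:
        outside.append(mass)

print(frequencies)  # [4, 7, 7, 8, 6, 7]
print(outside)      # [78]
```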
Class boundary – the midpoint between the upper class limit of one class and the
lower class limit of the subsequent class. Each class thus has an upper and a
lower class boundary. Note that the upper class boundary of one class
and the lower class boundary of the subsequent class are the same.
Upper Class Boundary = Upper limit + 0.5
Lower Class Boundary = Lower limit – 0.5
Frequency Distribution Table
The class boundary can be denoted as CB, and it is also known as the true limit.
The class midpoint or class mark ( X ) is a specific point in the center of each
bin (category) in a frequency distribution table; it is also the center of a bar
in a histogram. It is defined as the average of the upper and lower class limits.
CLASSES    (f)   CB             (X)
48 – 52    4     47.5 – 52.5    50
53 – 57    7     52.5 – 57.5    55
58 – 62    7     57.5 – 62.5    60
63 – 67    8     62.5 – 67.5    65
68 – 72    6     67.5 – 72.5    70
73 – 77    8     72.5 – 77.5    75
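The CB and X columns follow directly from the class limits. A minimal Python sketch, with illustrative variable names:

```python
# (lower limit, upper limit) of each class, from the table above
class_limits = [(48, 52), (53, 57), (58, 62), (63, 67), (68, 72), (73, 77)]

# Lower boundary = lower limit - 0.5, upper boundary = upper limit + 0.5
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in class_limits]

# Midpoint (class mark) = average of the lower and upper class limits
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

for limits, cb, x in zip(class_limits, boundaries, midpoints):
    print(limits, cb, x)
```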
Relative Frequency ( %f )
It is the value assigned to each class as the proportion of the total
data set that belongs in the class.
%f = ( frequency of a class / n ) x 100%
where n = total number of data values
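Applied to the class frequencies of the table, the relative frequency can be sketched in Python as follows (variable names are illustrative):

```python
# Class frequencies from the frequency distribution table
frequencies = [4, 7, 7, 8, 6, 8]
n = sum(frequencies)                      # total number of data values: 40

# %f = (frequency of a class / n) x 100
relative = [f * 100 / n for f in frequencies]

print(relative)       # [10.0, 17.5, 17.5, 20.0, 15.0, 20.0]
print(sum(relative))  # 100.0
```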
CLASSES (f) CB (X) %f
The less than cumulative frequency ( <cumf ) is accumulated from the top class to the bottom class.
The greater than cumulative frequency ( >cumf ) is accumulated from the bottom class to the top class.
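Both cumulative columns can be produced from the class frequencies with running sums. A minimal Python sketch, using itertools.accumulate from the standard library and illustrative variable names:

```python
from itertools import accumulate

# Class frequencies from the frequency distribution table, top to bottom
frequencies = [4, 7, 7, 8, 6, 8]

# Less than cumulative frequency: running total from the top class down
less_than = list(accumulate(frequencies))

# Greater than cumulative frequency: running total from the bottom class up
greater_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)     # [4, 11, 18, 26, 32, 40]
print(greater_than)  # [40, 36, 29, 22, 14, 8]
```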
Mean for Grouped Data ( x̅ )
- The mean, or average, is a single value that represents a given set of values.
x̅ = Σfx / n
"the summation of the products of the frequency and the midpoint, all over
the total frequency."
CLASSES (f) CB (X) %f fx
Step 1: Complete the frequency distribution table to obtain the midpoint ( x ) of each class.
Step 2: Multiply the frequency of each class by its midpoint to get ( fx ).
Step 3: Add all the ( fx ) values to get Σfx.
Step 4: Divide the sum Σfx by the total frequency ( n ).
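The four steps can be sketched in Python using the frequencies and midpoints of the table. The resulting value is computed here from those columns and is not quoted from the text; the variable names are illustrative.

```python
# Frequencies (f) and midpoints (x) of the six classes
frequencies = [4, 7, 7, 8, 6, 8]
midpoints = [50, 55, 60, 65, 70, 75]

# Step 2: fx for each class
fx = [f * x for f, x in zip(frequencies, midpoints)]

# Step 3: sum of fx; Step 4: divide by the total frequency n
sum_fx = sum(fx)
n = sum(frequencies)
grouped_mean = sum_fx / n

print(fx)                       # [200, 385, 420, 520, 420, 600]
print(sum_fx, n, grouped_mean)  # 2545 40 63.625
```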
Median for Grouped Data (x̃)
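x̃ = CBlower + ( ( n/2 − <cumf before ) / fclass ) x Class size
where CBlower is the lower class boundary of the median class, <cumf before is
the less than cumulative frequency of the class before it, and fclass is the
frequency of the median class.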
Step 1: Start with n/2 = 40/2 = 20.
Step 2: Locate where 20 falls in the <cumulative frequency column.
In this case 20 is nearest to 18; therefore, as a first trial, "assume" that the
median is located in the 3rd class ( 58 – 62 ).
Step 3: proceed with the formula
CLASSES    (f)   CB             (X)   %f      <cumf   >cumf
48 – 52    4     47.5 – 52.5    50    10%     4       40
53 – 57    7     52.5 – 57.5    55    17.5%   11      36
58 – 62    7     57.5 – 62.5    60    17.5%   18      29
63 – 67    8     62.5 – 67.5    65    20%     26      22
68 – 72    6     67.5 – 72.5    70    15%     32      14
73 – 77    8     72.5 – 77.5    75    20%     40      8
total      40                         100%
x̃ = CBlower + ( ( n/2 − <cumf before ) / fclass ) x Class size
x̃ = 57.5 + ( ( 20 − 11 ) / 7 ) x 5
x̃ = 63.93
Note that 63.93 is not located within the 3rd class, which has limits of
( 58 – 62 ); therefore adjust the solution and take one step higher, to the 4th class.
Note: x̃ is read as x - curl
x̃ = 62.5 + ( ( 20 − 18 ) / 8 ) x 5
x̃ = 63.75
Note: 63.75 is located in the 4th class with limits ( 63 – 67 ) therefore the
median 63.75 is correct.
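The grouped median can be sketched in Python. As an assumption that differs slightly from the trial-and-adjust steps in the text, this sketch picks the median class as the first class whose less than cumulative frequency reaches n/2, which lands directly on the 4th class. The function name grouped_median is hypothetical.

```python
def grouped_median(frequencies, lower_boundaries, class_size):
    """Median for grouped data: CBlower + ((n/2 - cumf before) / fclass) * size.

    Assumption of this sketch: the median class is the first class whose
    less than cumulative frequency is at least n/2.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cum_before = 0
    for f, cb_lower in zip(frequencies, lower_boundaries):
        if cum_before + f >= half:
            return cb_lower + (half - cum_before) / f * class_size
        cum_before += f


frequencies = [4, 7, 7, 8, 6, 8]
lower_boundaries = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5]

median_value = grouped_median(frequencies, lower_boundaries, 5)
print(median_value)  # 63.75

# The first trial in the text (3rd class) for comparison
trial = 57.5 + (20 - 11) / 7 * 5
print(round(trial, 2))  # 63.93
```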
Mode for Grouped data
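X̂ = CB lower limit + ( d1 / ( d1 + d2 ) ) x class size
where d1 is the frequency of the modal class minus the frequency of the class
before it, and d2 is the frequency of the modal class minus the frequency of the
class after it. For the modal class 63 – 67 ( f = 8 ), d1 = 8 − 7 = 1 and d2 = 8 − 6 = 2.
X̂ = 62.5 + ( 1 / ( 1 + 2 ) ) x 5
mode: 64.17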