Presentation of Data
Introduction
One of the most convincing and appealing ways in which statistical
results may be presented is through diagrams and graphs.
Evidence of this can be found in newspapers, magazines,
advertisements, etc. Descriptive statistics enable us to understand
data through summary values and graphical presentations.
Summary values not only include the average, but also the
spread, median, mode, range, and standard deviation. It is
important to look at summary statistics along with the data set
to understand the entire picture, as the same summary statistics
may describe very different data sets. Descriptive statistics can
be illustrated in an understandable fashion by presenting them
graphically using statistical and data presentation tools.
Several types of statistical/data presentation tools exist, including:
(a) charts displaying frequencies (bar, pie, and Pareto charts), (b)
charts displaying trends (run and control charts), (c) charts
displaying distributions (histograms), and (d) charts displaying
associations (scatter diagrams).
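The point that the same summary statistics may describe very different data sets can be checked with a small sketch. The two data sets below are made-up values chosen only for illustration (they do not come from the source), and the helper name `summarize` is likewise an assumption. Only Python's standard `statistics` module is used.

```python
import statistics

# Two hypothetical data sets, invented for illustration only.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

def summarize(values):
    """Return the mean, median, range and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),
    }

summary_tight = summarize(tight)
summary_wide = summarize(wide)
print(summary_tight)
print(summary_wide)
```

Both data sets share the same mean and median, yet their range and standard deviation differ widely, which is why the average alone does not give the entire picture.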
When creating graphic displays, keep in mind
the following questions
Drawing a frequency polygon requires marking
a dot at the mid-point of the top horizontal
line of each bar and then joining these dots
by straight lines. The polygon so obtained is
then closed at each end by drawing straight
lines from the mid-points of the tops of
the first and the last rectangles to the
mid-points, on the horizontal axis, of the next
outlying intervals with zero frequency.
Marks     Number of students
0-10      8
10-20     12
20-30     22
30-40     35
40-50     40
50-60     60
60-70     52
70-80     40
80-90     30
90-100    5
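The construction of the frequency polygon can be sketched in code using the marks table above. This is a minimal sketch that only computes the vertices to be joined, assuming equal class widths; the function name `polygon_points` is an illustrative choice, not something defined in the source.

```python
# Marks table from the text: (lower limit, upper limit, number of students).
classes = [
    (0, 10, 8), (10, 20, 12), (20, 30, 22), (30, 40, 35), (40, 50, 40),
    (50, 60, 60), (60, 70, 52), (70, 80, 40), (80, 90, 30), (90, 100, 5),
]

def polygon_points(classes):
    """Return (mid-point, frequency) vertices, closed on the axis at both ends."""
    width = classes[0][1] - classes[0][0]  # assumes all classes have equal width
    points = [((lo + hi) / 2, f) for lo, hi, f in classes]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    # Close the polygon at the mid-points of the next outlying intervals,
    # which have zero frequency.
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

points = polygon_points(classes)
print(points)
```

Joining consecutive vertices with straight lines gives the polygon. The first and last vertices lie on the horizontal axis, one class width outside the table.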
At times we are interested in knowing "how
many workers of a factory earn less than Rs.
700 per month" or "how many workers earn
more than Rs. 1000 per month", etc. To
answer these questions, it is necessary to
add frequencies. When frequencies are
added, they are called cumulative
frequencies. The curve obtained by plotting
cumulative frequencies is called a cumulative
frequency curve or an Ogive.
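Since no wage data is given here, the sketch below reuses the marks table from the frequency polygon discussion to show how cumulative frequencies are obtained. It assumes the usual convention that a "less than" ogive is plotted against upper class limits and a "more than" ogive against lower class limits. The variable names are illustrative.

```python
from itertools import accumulate

# Marks table from the text: (lower limit, upper limit, number of students).
classes = [
    (0, 10, 8), (10, 20, 12), (20, 30, 22), (30, 40, 35), (40, 50, 40),
    (50, 60, 60), (60, 70, 52), (70, 80, 40), (80, 90, 30), (90, 100, 5),
]

freqs = [f for _, _, f in classes]
total = sum(freqs)

# "Less than" cumulative frequencies: running totals of the frequencies.
less_than = list(accumulate(freqs))
# "More than" cumulative frequencies: what remains at or above each lower limit.
more_than = [total - below for below in [0] + less_than[:-1]]

# Points to plot for each ogive.
less_than_points = [(hi, c) for (_, hi, _), c in zip(classes, less_than)]
more_than_points = [(lo, c) for (lo, _, _), c in zip(classes, more_than)]

print(less_than_points)
print(more_than_points)
```

Reading the "less than" points answers questions such as how many students scored below 40 marks, and the "more than" points answer how many scored 70 or more.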
Find the missing frequencies in the following frequency distribution if it is known that the
mean of the distribution is 1.46.
No. of accidents (x):   0    1    2    3    4    5    Total
Frequency (f):          46   ?    ?    25   10   5    200
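COMPUTATION OF ARITHMETIC MEAN
Let the missing frequencies be f_1 and f_2.
x        f       fx
0        46      0
1        f_1     f_1
2        f_2     2f_2
3        25      75
4        10      40
5        5       25
Total    N = 86 + f_1 + f_2      \sum fx = 140 + f_1 + 2f_2
Since N = 200:
$200 = 86 + f_1 + f_2 \Rightarrow f_1 + f_2 = 114$   ... (1)
Also, Mean = 1.46:
$1.46 = \frac{\sum fx}{N} = \frac{140 + f_1 + 2f_2}{200}$
$\Rightarrow 292 = 140 + f_1 + 2f_2$
$\Rightarrow f_1 + 2f_2 = 152$   ... (2)
Solving (1) and (2): subtracting (1) from (2) gives $f_2 = 38$, and then $f_1 = 114 - 38 = 76$.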
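The two conditions (total frequency 200 and mean 1.46) can be checked numerically. The sketch below is one way to do it, using exact fractions to avoid rounding; the variable names are illustrative and not part of the original working.

```python
from fractions import Fraction

# Known frequencies from the problem, keyed by number of accidents x.
known = {0: 46, 3: 25, 4: 10, 5: 5}
N = 200
mean = Fraction("1.46")

known_f = sum(known.values())                        # 86
known_fx = sum(x * f for x, f in known.items())      # 140

# Unknown f1 belongs to x = 1 and f2 to x = 2, so:
#   f1 +  f2 = N - known_f            (total frequency)
#   f1 + 2f2 = mean * N - known_fx    (sum of f*x)
sum_f = N - known_f
sum_fx = mean * N - known_fx

f2 = sum_fx - sum_f   # subtract the first equation from the second
f1 = sum_f - f2

print(int(f1), int(f2))
```

Substituting the values back gives a total frequency of 200 and a mean of 292/200 = 1.46, as required.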
Median (direct series)
Example 1
The marks of nine students in a geography test that had a maximum possible mark of 50 are given below:
47 35 37 32 38 39 36 34 35
Find the median of this set of data values.
Arrange the data values in order from the lowest value to the highest value:
32 34 35 35 36 37 38 39 47
The fifth data value, 36, is the middle value in this arrangement, so the median is 36.
If the number of values in the data set is even, then the median is the average of
the two middle values.
Example 2
Find the median of the following data set:
12 18 16 21 10 13 17 19
Arrange the data values in order from the lowest value to the highest value:
10 12 13 16 17 18 19 21
The number of values in the data set is 8, which is even. So, the median is the average of the two middle values.
The fourth and fifth scores, 16 and 17, are in the middle. That is, there is no one middle value.
Median = (16 + 17) / 2 = 16.5
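The procedure used in Examples 1 and 2 can be written as a short function. This is a minimal sketch; the function name `median_of` is an illustrative choice, and the standard library's `statistics.median` is used only as a cross-check.

```python
import statistics

def median_of(values):
    """Sort the values, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average of two

example_1 = [47, 35, 37, 32, 38, 39, 36, 34, 35]
example_2 = [12, 18, 16, 21, 10, 13, 17, 19]

print(median_of(example_1), statistics.median(example_1))
print(median_of(example_2), statistics.median(example_2))
```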
Half of the values in the data set lie below the median and half lie above the
median.
The median is the most commonly quoted figure used to measure property
prices. The use of the median avoids the problem of the mean property price
which is affected by a few expensive properties that are not representative of
the general property market.
Median (continuous series)
Calculate the median from the following distribution:
Class:      5-10  10-15  15-20  20-25  25-30  30-35  35-40  40-45
Frequency:  5     6      15     10     5      4      2      2
SOLUTION: First prepare the cumulative frequency table needed to compute the median.
CLASS FREQUENCY CUMULATIVE FREQUENCY
5-10 5 5
10-15 6 11
15-20 15 26
20-25 10 36
25-30 5 41
30-35 4 45
35-40 2 47
40-45 2 49
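So, N = 49 and N/2 = 24.5. The cumulative frequency just greater than N/2 is 26, and the
corresponding class is 15-20 (the median class).
Here L = 15 (lower limit of the median class), f = 15 (its frequency), F = 11 (cumulative
frequency of the preceding class) and h = 5 (class width).
$\therefore \text{Median} = L + \frac{N/2 - F}{f} \times h = 15 + \frac{24.5 - 11}{15} \times 5 = 19.5$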
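The same steps (build cumulative frequencies, locate the median class, interpolate) can be sketched as a function. The name `grouped_median` and the tuple layout are assumptions made for illustration. The interpolation is L + (N/2 - F)/f * h with the values used in the solution above.

```python
def grouped_median(classes):
    """classes: list of (lower limit, upper limit, frequency) in ascending order."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0  # cumulative frequency of the classes before the current one (F)
    for lower, upper, f in classes:
        # The median class is the first whose cumulative frequency exceeds N/2.
        if cumulative + f > half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("the distribution has no observations")

distribution = [
    (5, 10, 5), (10, 15, 6), (15, 20, 15), (20, 25, 10),
    (25, 30, 5), (30, 35, 4), (35, 40, 2), (40, 45, 2),
]

result = grouped_median(distribution)
print(result)
```

For this distribution the loop stops at the class 15-20 with F = 11, which reproduces the hand calculation.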
Mode
The mode or modal value of a distribution is
that value of the variable for which the
frequency is maximum. In order to compute
the mode of a series of individual
observations, we first convert it into a
discrete frequency distribution by
preparing a frequency table. From the
frequency table, we identify the value having
the maximum frequency. The value of the variable
so obtained is the mode or modal value.
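As an illustration of the frequency-table approach, the sketch below reuses the nine geography marks from Example 1 of the median section. It uses `collections.Counter` as the frequency table and returns every value that attains the maximum frequency, since a data set can have more than one mode.

```python
from collections import Counter

# Marks of nine students, taken from Example 1 in the median section.
marks = [47, 35, 37, 32, 38, 39, 36, 34, 35]

# Step 1: prepare the frequency table (value -> number of occurrences).
table = Counter(marks)

# Step 2: identify the value(s) having the maximum frequency.
highest = max(table.values())
modes = sorted(value for value, count in table.items() if count == highest)

print(dict(table))
print(modes)
```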
Mode
The mode has applications in printing. For example, it
is important to print more of the most popular books,
because printing different books in equal numbers
would cause a shortage of some books and an
oversupply of others.
Likewise, the mode has applications in
manufacturing. For example, it is important to
manufacture more of the most popular shoes,
because manufacturing different shoes in equal
numbers would cause a shortage of some shoes and
an oversupply of others.
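For a grouped frequency distribution, obtain the values of the following for the modal
class (the class with the maximum frequency):
l    lower limit of the modal class
h    width of the modal class
f_x  frequency of the modal class
f_1  frequency of the class preceding the modal class
f_2  frequency of the class following the modal class
$\text{Mode} = l + h \times \frac{f_x - f_1}{2f_x - f_1 - f_2}$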
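A sketch of the grouped-data mode follows, assuming the usual textbook interpolation formula Mode = l + h(fx - f1) / (2fx - f1 - f2), where fx is the frequency of the modal class. No worked example is given here, so the distribution from the continuous-series median example is reused for illustration. The function name `grouped_mode` is hypothetical, and the sketch does not handle the case where the denominator is zero.

```python
def grouped_mode(classes):
    """classes: list of (lower limit, upper limit, frequency) in ascending order."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))          # modal class (the first one if tied)
    lower, upper, fx = classes[i]
    f1 = freqs[i - 1] if i > 0 else 0                 # class preceding
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # class following
    h = upper - lower                                 # width of the modal class
    return lower + h * (fx - f1) / (2 * fx - f1 - f2)

# Distribution reused from the continuous-series median example.
distribution = [
    (5, 10, 5), (10, 15, 6), (15, 20, 15), (20, 25, 10),
    (25, 30, 5), (30, 35, 4), (35, 40, 2), (40, 45, 2),
]

mode_value = grouped_mode(distribution)
print(round(mode_value, 2))
```

Here the modal class is 15-20 with fx = 15, f1 = 6 and f2 = 10, so the mode is 15 + 5 × 9/14, a little above 18.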
Relationship among mean, median
and mode
Analysing Data
- The mean, median and mode of a data set are collectively known as
  measures of central tendency, as these three measures focus on
  where the data is centred or clustered. To analyse data using the
  mean, median and mode, we need to use the most appropriate
  measure of central tendency. The following points should be
  remembered:
- The mean is useful for predicting future results when there are no
  extreme values in the data set. However, the impact of extreme
  values on the mean may be important and should be considered, e.g.
  the impact of a stock market crash on average investment returns.
- The median may be more useful than the mean when there are
  extreme values in the data set, as it is not affected by the extreme
  values.
- The mode is useful when the most common item, characteristic or
  value of a data set is required.
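The effect of an extreme value on the mean and the median can be seen in a small sketch. The prices below are hypothetical figures (in thousands) invented for illustration; they are not taken from any real market data.

```python
import statistics

# Hypothetical property prices in thousands, invented for illustration.
prices = [210, 220, 230, 240, 250]
prices_with_outlier = prices + [2000]   # add one very expensive property

mean_before = statistics.mean(prices)
median_before = statistics.median(prices)
mean_after = statistics.mean(prices_with_outlier)
median_after = statistics.median(prices_with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```

One extreme value more than doubles the mean, while the median moves only from 230 to 235. This is why the median is preferred when the data contains extreme values.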