Unit 15
REPRESENTATION OF DATA
Structure
15.1 Introduction
15.2 Objectives
15.3 Nature of Data
15.4 Organization of Data
15.4.1 Frequency Distribution Table and Its Interpretation
15.1 INTRODUCTION
In educational research, statistics is used to analyse information and to draw
conclusions or inferences. This process consists of three steps:
observation/collection of data or information, analysis of the information/data,
and drawing of inferences/conclusions. Through observation, we obtain
information either in the form of ideas or in numerical form. Information in the
form of ideas is qualitative in nature, and information in numerical form is
quantitative in nature. Hence, we are likely to get either qualitative or
quantitative data. Qualitative data may either be analysed using qualitative
methods or be converted into quantitative data by transforming the ideas into
numbers. Information collected in terms of numerical figures is known as data.
In this unit, we shall discuss the nature of data and the methods of its
organization and graphical representation.
15.2 OBJECTIVES
After going through this unit, you will be able to:
• explain the nature of data;
• describe the procedure of classification and tabulation of data;
• present data in frequency tables;
• explain the need for graphical representation of data; and
• explain various methods of graphical representation of data, such as the
histogram, pie chart, frequency polygon and cumulative percentage curve.
Note: A few sections of this unit have been partially taken from Unit-13
‘Tabulation and Graphical Representation of Data’, BES-127 ‘Assessment
for Learning’, B.Ed., IGNOU, 2017.
Apart from this, on the basis of the nature of the variable, quantitative data
can be further classified into two types: Continuous Data and Discrete Data.
• Continuous Data: This type of data is capable of any degree of
subdivision. Continuous data represent measurements, and therefore
their values cannot be counted but can be measured. The scores on an
intelligence quotient (I.Q.) test or an achievement test, and the height or
weight of a person, fall into a continuous series. Suppose you are asked
to measure your height with a metre rule; you might say you are 160 cm
tall. However, if you had a more accurate rule, you might find that you
are actually 160.362 cm tall. The gap from 160 cm to 160.362 cm lies in a
truly continuous series, and greater precision of measurement fills such gaps.
• Discrete Data: Discrete data are counts that involve only integers.
Discrete values cannot be subdivided into parts. A discrete series exhibits
real gaps. For instance, if in a class there are 45 boys and 18 girls, then
you know there is a real gap between the two groups. So, this type of data
cannot be measured but can be counted. The national census is another
example of discrete data.
On the basis of organisation, data may be treated in two ways: ungrouped and
grouped.
Let us understand this through the example discussed in the earlier section
15.3, where a teacher administered an achievement test to 20 students. Following
are the scores:
25 33 36 28 35 35 25 45 23 20
21 36 45 33 35 35 33 33 45 40
Let us make the frequency distribution table for the above data. Table 15.1
shows the frequency distribution table for ungrouped data.
Scores Frequencies
20 1
21 1
23 1
25 2
28 1
33 4
35 4
36 2
40 1
45 3
Table 15.1: Frequency Distribution Table for Ungrouped Data
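The counting behind Table 15.1 can also be checked with a few lines of code. The following is a minimal sketch in Python, assuming the 20 scores listed above; the variable names are illustrative and not part of the unit.

```python
from collections import Counter

# Scores of the 20 students, copied from the example above.
scores = [25, 33, 36, 28, 35, 35, 25, 45, 23, 20,
          21, 36, 45, 33, 35, 35, 33, 33, 45, 40]

# Count how often each distinct score occurs.
frequency = Counter(scores)

# Print the table in ascending order of score, as in Table 15.1.
print("Scores  Frequencies")
for score in sorted(frequency):
    print(f"{score:<7} {frequency[score]}")

# The frequencies must add up to the number of students.
total = sum(frequency.values())
print("Total  ", total)
```

The total of the frequency column should equal the number of students (20), which is a quick check that no score was missed or counted twice.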
2) Grouped Frequency Distribution: A grouped frequency distribution is a
table in which the total range of data is classified into class intervals of a
chosen size. Generally, there are two ways to prepare class intervals:
• Exclusive Class Interval
• Inclusive Class Interval
1. Exclusive Class Interval: In an exclusive class interval, the upper score of
the class interval is not included in that class interval. For example, in
a class interval of 15-20, 20 is the upper score and it is excluded. This upper
score (i.e. 20) is included in the next class interval.
2. Inclusive Class Interval: In an inclusive class interval, both the lower and
upper scores are included in the class interval. This type is considered to be
the most convenient method.
Table 15.2 indicates the two types of class intervals.
Let us now understand this with the help of an example. The scores obtained by
35 students on an intelligence test are shown below:
55 62 73 44 67 78 79 90 56 67 78 108
65 69 79 75 98 58 72 85 86 82 97 74
99 103 106 88 100 92 88 56 49 77 42
In order to make the frequency distribution, the following procedures need to
be carried out:
1. Calculation of Range: Range is the difference between the highest score
and the lowest score in the set of data. In the above case ‘108’ is the highest
score and ‘42’ is the lowest score.
Therefore, Range = Highest Score - Lowest Score
Range = 108 - 42 = 66
2. Calculation of Size of Class Interval: To calculate the size of the class
interval, first decide the number of classes in which you want to represent
your data. The number of classes depends on the size of the data: the larger
the data set, the greater the number of classes, and vice versa. For calculating
the size of a class interval, you may use the following formula:
Size of the Class Interval = Range / Number of classes desired
For the above example (here we decide to take seven classes by observing the
size of the data):
Size of class interval = 66/7 = 9.4 ≈ 10
Commonly used sizes of class intervals are 5 and 10, as they are easier to work
with in calculations.
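The range and class-size arithmetic above can be sketched in Python as follows. The function name class_interval_size and the rule of rounding up to the next multiple of 5 are assumptions made for illustration; the unit itself only says that sizes of 5 and 10 are convenient.

```python
import math

# I.Q. scores of the 35 students, copied from the example above.
iq_scores = [55, 62, 73, 44, 67, 78, 79, 90, 56, 67, 78, 108,
             65, 69, 79, 75, 98, 58, 72, 85, 86, 82, 97, 74,
             99, 103, 106, 88, 100, 92, 88, 56, 49, 77, 42]

def class_interval_size(scores, classes_desired, step=5):
    """Return (range, raw size, size rounded up to a multiple of `step`).

    Rounding up to a multiple of `step` is an illustrative assumption,
    not a rule stated in the unit.
    """
    data_range = max(scores) - min(scores)        # highest - lowest score
    raw_size = data_range / classes_desired       # Range / number of classes
    rounded_size = math.ceil(raw_size / step) * step
    return data_range, raw_size, rounded_size

data_range, raw_size, size = class_interval_size(iq_scores, classes_desired=7)
print(data_range, round(raw_size, 1), size)  # 66 9.4 10
```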
3. Indication of Class Intervals: Once the size of the class interval is decided,
the class intervals covering the lowest to the highest scores are arranged
in either ascending or descending order. To write the class intervals, the
lowest score, the highest score and the size of the class interval are used.
If the lowest score is close to a round figure, i.e. 10, 15,……30, 35, 40, 45….,
the class intervals can be started from that figure rather than from the actual
score. In the above case, the lowest score is ‘42’ (which is near 40) and the
size of the class interval is ‘10’; hence the first class interval would be
40-50. The next class intervals would be 50-60, 60-70 and so on.
4. Putting the Tallies: After the class intervals are arranged, you may start
putting tallies against each class interval. For this, the number of cases
occurring in each class interval is noted and denoted using tallies. Attention
must be paid to the style of putting tallies: the first four cases are marked
with vertical strokes, and the fifth is marked by drawing a diagonal line
across those four, as shown below. In the above example, the number of cases
appearing in each class interval is represented using tallies as shown below:
I II III IIII
5. Obtaining Frequencies: The marked tallies are counted separately for
each class interval. In this example, the total number of tallies against each
class interval is given below in table 15.3.
6. Indication of Frequency Table: The final step is to check the total number
of tallies to get the total number of cases, ‘N’. This is found by adding
all the frequencies. In this case, the total frequency (∑f) is 35, where ‘∑’
stands for summation and ‘f’ for frequency. The final frequency table is shown
below in table 15.3.
N = ∑f = 35
Table 15.3 : Frequency Distribution Table
On the basis of the frequency distribution table, one can say that 19 students
have an average I.Q. level (i.e. 60-70, 70-80, 80-90), whereas only 4 students
have a very high I.Q. (i.e. 100-110).
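Steps 3 to 6 can be reproduced with a short Python sketch, assuming exclusive class intervals of size 10 starting at 40, as decided in the steps above. The function names and the text rendering of tallies (four strokes followed by a slash for each group of five) are illustrative choices, not part of the unit.

```python
# I.Q. scores of the 35 students, copied from the example above.
iq_scores = [55, 62, 73, 44, 67, 78, 79, 90, 56, 67, 78, 108,
             65, 69, 79, 75, 98, 58, 72, 85, 86, 82, 97, 74,
             99, 103, 106, 88, 100, 92, 88, 56, 49, 77, 42]

def grouped_frequencies(scores, start, size, classes):
    """Count scores in exclusive class intervals: lower <= score < upper."""
    table = []
    for k in range(classes):
        lower = start + k * size
        upper = lower + size
        # Exclusive interval: the upper score belongs to the next class.
        count = sum(1 for s in scores if lower <= s < upper)
        table.append((f"{lower}-{upper}", count))
    return table

def tally_marks(count):
    """Write a count as tally groups of five, e.g. 9 -> '||||/ ||||'."""
    groups = ["||||/"] * (count // 5)
    if count % 5:
        groups.append("|" * (count % 5))
    return " ".join(groups)

table = grouped_frequencies(iq_scores, start=40, size=10, classes=7)
for interval, f in table:
    print(f"{interval:<8} {tally_marks(f):<12} {f}")
print("N =", sum(f for _, f in table))
```

The frequencies for 60-70, 70-80 and 80-90 add up to 19, and the 100-110 class contains 4 scores, which agrees with the interpretation given above.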
Activity 1
Analyze the frequency distribution table 15.3. Note down the interpretation
that can be made out of it.
………………………………………………………………………………
……………………………………………………………………………......
……………………………………………………………………………......
When a researcher wishes to convert his/her data into graphical format, he/she
needs to keep in mind the following points:
1. Draw two perpendicular lines. The point where the two lines intersect is
called the ‘origin’ and is represented by ‘0’ (zero).
2. The horizontal line is called the ‘X’ axis. The x-axis is called the abscissa (base).
3. The vertical line is called the ‘Y’ axis. The y-axis is called the ordinate (height).
4. The ordinate/height of the graph must be 75% of the abscissa/base. This is
called the 75% rule. However, there is flexibility to vary it between 60% and 80%.
5. A graph generally has four quadrants, as shown in figure 15.2. However,
educationists/psychologists usually use the (++) quadrant to utilize the
maximum space of the graph paper.
By now, you know that data for a discrete variable are obtained on a nominal
scale or in terms of frequencies rather than scores. The frequencies are
presented as bars (rectangles of equal width) in a bar graph. You usually get
data in the form of frequencies when you collect data using a questionnaire,
interview techniques or a rating scale. The following are the steps used for
constructing a bar graph:
• Select the x-axis and y-axis on the graph paper. Generally, the x-axis is the
horizontal line and the y-axis is the vertical line in the graph.
• The intersection of the x-axis and y-axis is the origin (marked as ‘0’) of the
graph.
• Choose a convenient scale for both axes.
• Mark the corresponding values against each variable on the x-axis and y-axis
and draw them as bars of equal width.
Let us apply these steps to an example and draw a bar graph. Consider a school
having the following number of girls in various sections of a particular grade:
Here, the number of girls studying in various sections of grade eight is given.
In order to draw the bar graph, the number of girl students is taken on the
y-axis and the corresponding sections on the x-axis. The resulting bar graph is
given below in figure 15.3:
The next question is how to interpret the bar graph. In this case, we can say
that class 8C has the greatest number of girls compared to the rest of the
classes and 8E has the least number of girls. What else can we infer? The
difference in the number of girls between classes 8E and 8C is 20. There are
many more inferences that you can draw from the bar graph. Why don’t you try it
as an activity?
Activity 2
Draw inferences from the bar graph other than the ones already drawn above.
.....................................................................................................................
.....................................................................................................................
.....................................................................................................................
To represent the data given in the above example in a bar graph, we select
two axes on the graph paper. ‘Modes of learning’ are marked along the x-axis and
‘number of students (rural and urban areas)’ along the y-axis. The resulting bar
graph is given below:
On the basis of this graph, we can say that the face-to-face (f2f) mode is
preferred more by rural area students than by urban area students, whereas the
online mode is preferred more by urban area students than by rural area
students.
Activity 3
In figure 15.4, the bar graph shows students’ preferences for various
teaching-learning modes. Analyze the graph and draw inferences other than the
ones already discussed.
.....................................................................................................................
.....................................................................................................................
.....................................................................................................................
Another device for presenting discrete data is the pie-diagram. The
pie-diagram/pie graph is also known as a circle graph, as the statistical
data are represented as a circular figure divided according to the proportions
of the data. Each percentage is shown in the circle in terms of an angle. The
angle at the centre of a circle is 360°, which is converted into percentages,
i.e. 36° is equal to 10 percent and 1 percent is equal to 3.6°. The whole circle
represents 100 percent of the frequencies of a set of data. To construct a
pie-diagram, one should have knowledge of angle measurement and percentages.
Let us understand it with the help of an example.
Let us try to put these details into a pie-diagram. To do so, you should know
that the angle of a full circle is 2π (2 pi) radians. 2π is equal to 2 × 180°
= 360°. Thus, the whole circle represents 360°, and we will represent the
total sample, i.e. 180, through a circle of 360°. Let us see how it is done.
In this way, a scale for the pie-diagram is prepared. Now it is very easy to
divide the circle into degrees to show the result in a visual presentation. The
pie-diagram is shown in figure 15.5:
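The conversion from frequencies to angles can be sketched in Python as below. The category names and counts are hypothetical, chosen only so that they add up to the total sample of 180 mentioned above; they are not the data of the unit's example. The function name pie_angles is likewise illustrative.

```python
def pie_angles(frequencies):
    """Convert category frequencies to central angles in degrees."""
    total = sum(frequencies.values())
    # Multiply before dividing so whole-number results stay exact.
    return {label: count * 360 / total for label, count in frequencies.items()}

# HYPOTHETICAL categories and counts (assumption): they only illustrate
# a total sample of 180 and are not taken from the unit.
sample = {"Category A": 90, "Category B": 60, "Category C": 30}

angles = pie_angles(sample)
for label, angle in angles.items():
    percent = angle / 3.6  # 1 percent corresponds to 3.6 degrees
    print(f"{label}: {sample[label]} cases, {percent:.1f}%, {angle:.1f} degrees")
```

With a total of 180 cases spread over 360°, each case corresponds to 2°, so the angles can also be obtained by doubling each frequency.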
To represent the data given in table 15.4 in a line graph, we select two axes
on the graph paper. ‘Day’ is marked along the x-axis and ‘absentees’ along the
y-axis. After that, an appropriate scale is decided. In this case, as the number
of absentees ranges from 1 to 7, we may take one square of the graph as 1 along
the y-axis. Similarly, along the x-axis, each square can be taken as a day.
Thereafter, the number of absentees for each day is marked. The resulting line
graph is given below:
Line graphs are useful in that they show data variables and trends very clearly
and can help to make predictions about the results of data not yet recorded. They
can also be used to display several dependent variables against one independent
variable; in that case, more than one line is drawn. To get a better
understanding, let us take an example.
Example: The discipline-wise results of a high school are as follows
(results are given as pass percentages):
Year Humanities Science Commerce
2017 56 78 70
2018 58 80 90
2019 59 85 72
2020 60 89 98
2021 61 90 76
In this sample, results are recorded year-wise for the different disciplines.
There are three disciplines in the school; therefore, three lines are drawn. The
resulting line graph is given below in figure 15.7:
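The idea of one line per discipline can be sketched in Python by pairing each year with its pass percentage. This is only an illustration of how the points for the three lines are obtained from the table above; the names pass_percentage and line_points are assumptions, and the actual drawing is left to graph paper or any plotting tool.

```python
# Independent variable (x-axis): the years from the table above.
years = [2017, 2018, 2019, 2020, 2021]

# Dependent variables (y-axis): pass percentage per discipline.
pass_percentage = {
    "Humanities": [56, 58, 59, 60, 61],
    "Science": [78, 80, 85, 89, 90],
    "Commerce": [70, 90, 72, 98, 76],
}

def line_points(x_values, y_values):
    """Pair each x value with its y value; joining the points gives one line."""
    return list(zip(x_values, y_values))

# One list of (year, percentage) points for each discipline.
lines = {name: line_points(years, values)
         for name, values in pass_percentage.items()}

for name, points in lines.items():
    print(name, points)
```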
• First, the limits of the class intervals are calculated. To compute the
limits, both the lower limit and the upper limit of each class interval
are found. For example, the lower and upper limits of the class interval
5-9 are 4.5 and 9.5 respectively, and the class interval is written as 4.5-9.5.
• The lower and upper limits are plotted on the x-axis.
• The frequencies are plotted on the y-axis.
• Thereafter, each class interval is depicted using adjacent rectangular
bars of equal width.
• Keep in mind to select appropriate scales for both the x-axis and the y-axis.
• While constructing a histogram, the 75% rule is followed, i.e. the height of
the figure should be approximately 75% of its width.
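The first step above, finding the limits of each class interval, can be sketched in Python as follows. The sketch assumes scores recorded in whole units, so half a unit is subtracted from the lower score and added to the upper score; the function name exact_limits is an assumption, and the intervals other than 5-9 are illustrative intervals of the same width.

```python
def exact_limits(lower_score, upper_score, unit=1):
    """Return the lower and upper limits of an inclusive class interval.

    Assumes scores are recorded to the nearest `unit`, so the limits lie
    half a unit beyond the written scores (e.g. 5-9 becomes 4.5-9.5).
    """
    half = unit / 2
    return lower_score - half, upper_score + half

# 5-9 is the example from the text; the rest are illustrative.
intervals = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]

for lower_score, upper_score in intervals:
    lower_limit, upper_limit = exact_limits(lower_score, upper_score)
    print(f"{lower_score}-{upper_score} -> {lower_limit}-{upper_limit}")
```

Because the upper limit of one interval equals the lower limit of the next, the bars of the histogram touch each other without gaps.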
2. Frequency Polygon
Class Interval   Frequency   Upper Limit   Cumulative Frequency   Cumulative Percentage
25-29            5           29.5          16                     66
20-24            3           24.5          11                     45
15-19            6           19.5          8                      33
10-14            2           14.5          2                      8
5-9 (Extra Class Interval)   0   9.5       0                      0
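The cumulative columns of the rows shown above can be reproduced with a running total, as in the following Python sketch. Only the five visible rows are used. The total number of cases is not stated in this part of the text; the value 24 used below is an assumption inferred from the printed percentages, and the percentages are truncated to whole numbers to match them.

```python
from itertools import accumulate

# Rows visible in the table above, lowest class first: (class interval, f).
rows = [("5-9", 0), ("10-14", 2), ("15-19", 6), ("20-24", 3), ("25-29", 5)]

frequencies = [f for _, f in rows]
cumulative = list(accumulate(frequencies))  # running total of frequencies

# ASSUMPTION: the total N is not given in the text; 24 is inferred because
# it reproduces the printed cumulative percentages.
TOTAL_N = 24

# Cumulative percentage, truncated to a whole number.
percentages = [100 * cf // TOTAL_N for cf in cumulative]

for (interval, f), cf, cp in zip(rows, cumulative, percentages):
    print(f"{interval:<6} f={f}  cf={cf:<3} cumulative %={cp}")
```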
Data can be presented with the help of graphical presentations. A graphical
presentation catches the eye and holds the attention when even the most careful
statistical analysis fails to reveal the nature of a distribution. There are
various types of graphical representation for discrete/ungrouped data and
continuous/grouped data. Discrete data cannot be organized into class
intervals, and data in the form of raw scores are called ungrouped data. Line,
bar and pie graphs are used for presenting discrete data. These graphical
presentations are used for analyzing patterns and trends and for comparison
purposes. The histogram, frequency polygon and ogive are prepared for
continuous/grouped data.
7)
Graph
• Constructed on graph paper.
• Generally used by researchers for research purposes.
• It is drawn on two axes.
• Does not have many dimensions as it is independent of choices.
Diagram
• Constructed on plain/normal paper.
• Generally used for publicity.
• There is no restriction of axis.
• Has many choices and is independent.