BHRM 242 - Collection, Organisation and Presentation of Data
LECTURE ONE
Since statistical methods help in taking decisions, statistics may be rightly regarded as a body of
methods for making wise decisions in the face of uncertainty. A modified form of this definition is given
by Professor Ya-Lun Chou, in whose words statistics is a method of decision making in the face of
uncertainty on the basis of numerical data and calculated risks.
BHRM 242: Quantitative Methods in Management Class Notes
i) Statistical Methods.
In statistical methods, we study all those devices, rules of procedure and general principles which are
applicable to all kinds or groups of data. They are the tools in the hands of a statistical investigator.
They are the devices for achieving the desired ends explained in theory.
1.5.3 Descriptive Statistics
These are methods involving the collection, presentation and characterisation of data in order to
properly describe the various features of that set of data. Areas covered under descriptive statistics
include:
i. Collection of data
ii. Organisation and presentation of data
iii. Measures of central tendency
iv. Measures of dispersion
Government uses statistics as a tool for collecting data on economic aggregates such as
national income, savings, consumption and gross national product.
Government also uses statistics to measure the effects of external factors on its policies and to
assess the trends in the economy so that it can plan future policies.
Government uses statistics during a census. The various forms sent by the government to
individuals and firms on annual income, tax returns, prices, costs, output and wage rates
generate a lot of statistical data for the use of the government.
Business uses statistics to monitor the various changes in the national economy for the various
budget decisions. Business makes use of statistics in production, marketing, administration and
in personnel management.
Statistics is also used extensively to control and analyse stock levels, such as minimum, maximum
and reorder levels. Business uses it in market research to determine the acceptability of a
product and the quantity that will be demanded at various prices by a given population in a
geographical area.
Management also uses statistics to make forecasts about the sales and labour costs of a firm.
• Entity: This may be a person, place, or thing on which we make observations. In studying the
nutritional well-being of pupils in a primary school, the entity is a pupil in the school.
• Variable: This is a characteristic that assumes different values for different entities. The
weights of pupils in the primary school constitute a variable.
• Random Variable: If we can specify, for a given variable, a mathematical expression called a
function, which gives the relative frequency of occurrence of the values that the variable can
assume, the function is called a probability function and the variable a random variable.
• Quantitative Variable: This is a variable whose values are given as numerical quantities.
An example of this is the hourly patronage of a restaurant.
• Qualitative Variable: This is a variable that is not measurable in numerical form or that
cannot be counted. Examples of this are the colours of fruits and the taste of some brands of biscuit.
• Discrete Variable: This is a variable that can only assume whole numbers. Examples of
these are the number of Local Government Council Areas in the States of Nigeria and the number of
female students in the various programmes of the National Open University. A discrete variable
has "interruptions" between the values it can assume. For instance, between 1 and 2 there is
an infinite number of values such as 1.1, 1.11, 1.111, 1.1111 and so on, none of which a discrete
variable can assume. These gaps are called interruptions.
• Continuous Variable: This is a variable that can assume both decimal and non-decimal values.
There is always a continuum of values that the continuous variable can assume. The
interruptions that characterize the discrete variable are absent in the continuous variable.
Weight, for example, can take whole or decimal values such as 20 kilograms and
220.1752 kilograms.
• Population: This is the complete set of entities of interest in a study. In the study of how
workers in Nigeria spend their leisure hours, all the workers in Nigeria constitute the population
of the study. A population (or universe) is the totality of all things under consideration.
• Sample: This is the part of the population that is selected for a study. In studying the income
distribution of students in the National Open University, the incomes of 1000 students selected
for the study, from the population of all the students in the Open University will constitute the
sample of the study.
• Random Sample: This is a sample drawn from a population in such a way that the results of
its analysis may be used to generalize about the population from which it was drawn.
Thus, one major aspect of inferential statistics is the process of using sample statistics to draw
conclusions about the true population parameters. The need for inferential methods derives from the
need for sampling. As the population becomes larger, it is usually too costly, too time consuming, and
too cumbersome to obtain our information from the entire population. So, decisions pertaining to
population characteristics have to be based on the information contained in a sample of that
population.
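The relationship between a population parameter and a sample statistic can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the population of 20,000 student incomes is synthetic (generated at random, not real data), and the sample size of 1,000 simply echoes the example given under "Sample".

```python
import random
import statistics

# Hypothetical population: incomes of 20,000 students (synthetic values, not real data).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.uniform(5_000, 50_000) for _ in range(20_000)]

# Simple random sample of 1,000 students, drawn without replacement.
sample = rng.sample(population, k=1_000)

population_mean = statistics.mean(population)  # parameter (normally unknown)
sample_mean = statistics.mean(sample)          # statistic (computed from the sample)

print(f"population mean: {population_mean:.0f}")
print(f"sample mean:     {sample_mean:.0f}")
```

The sample mean will normally lie close to, but not exactly on, the population mean; inferential statistics is concerned with how far such an estimate can be trusted.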
LECTURE TWO
COLLECTION, ORGANISATION AND PRESENTATION OF DATA
Types of data:
We are going to talk about two types of data:
i. Primary data: raw data collected by the investigator himself and thus original in
character.
ii. Secondary data: data which have been collected by some other persons and which
have passed through the statistical machine at least once.
It may be observed that the distinction between primary and secondary data is a matter
of degree or relativity only.
The same set of data may be primary in the hands of one and secondary in the hands of
another.
In general, the data are primary to the source who collects and processes them for the
first time and are secondary for all other sources who later use such data.
Published sources
These include publications of various organisations, such as:
i. Official publications of central government, e.g. economic surveys
ii. Annual and monthly abstracts of statistics, e.g. statistical abstracts
iii. Publications of research institutions
iv. Publications of commercial and financial institutions
v. Newspapers and periodicals
vi. International publications, e.g. U.N.O., WHO, IMF
Unpublished sources
There are various sources of unpublished statistical material, such as:
Records maintained by private firms or business enterprises who may not want to
release their data to any outside agency.
The various departments and offices of the government.
The research carried out by individual research scholars in universities or
research institutes.
Bar Diagrams
Bar diagrams are among the easiest and most commonly used devices for presenting
most economic and business data.
They consist of a group of equidistant rectangles, one for each group or category of the data,
in which the values or magnitudes are represented by the height or length of the
rectangle, the width being arbitrary and immaterial.
Objectives of Classification
To condense the mass of data in such a manner that similarities and dissimilarities can
be readily apprehended.
To facilitate comparison.
To pinpoint the most significant features of the data at a glance.
To give prominence to the information gathered while dropping out the unnecessary
elements.
To enable a statistical treatment of the material collected.
Types of Classification
Data can be classified broadly on the basis of the following 4 criteria:
1. Geographical i.e. area-wise e.g. cities, districts etc.
2. Chronological i.e. on the basis of time
3. Qualitative i.e. according to some attributes
4. Quantitative i.e. in terms of magnitude
Geographical Classification:
Data are classified on the basis of geographical or locational differences between the various
items like states, cities, regions, zones, areas etc.
Chronological Classification:
When data are observed over a period of time, the type of classification is known as
chronological classification, e.g. we may present the figures of population (or production, sales
etc.) as follows:
Population of Kenya from 1969 to 1988 (hypothetical figures)
Year    Population (millions)
1969    14
1978    21
1988    28
Time series are usually listed in chronological order, normally starting with the earliest period.
Qualitative Classification
Data are classified on the basis of some attribute or quality such as sex, colour of hair, literacy,
religion etc. The attribute under study cannot be measured; one can only find out whether it
is present or absent in the units of the population under study. E.g. if the attribute under study
is population, one can find out how many persons are living in urban areas and how many in
rural areas. Thus, when only one attribute is studied, two classes are formed: one possessing
the attribute and the other not possessing it. This type of classification is called simple
classification.
Quantitative Classification
Quantitative classification refers to classification of data according to some characteristic that
can be measured such as height, weight, income, sales, profits, production etc. For example, the
students in a college may be classified according to weight as follows:
A frequency distribution refers to data classified on the basis of some variable that can be
measured such as prices, wages, age, number of units consumed or produced.
A variable is a characteristic that varies in amount or magnitude in a frequency distribution. It
may be either continuous or discrete.
A discrete variable is one which can vary only by finite jumps and cannot manifest every
conceivable fractional value, e.g. the number of children in a household can be 0, 1, 2, 3, ...
etc. The number of rooms in a house can be 1, 2, 3, ...
Definition:
A frequency distribution or frequency table is simply a table in which the data are grouped into
classes and the number of cases falling in each class is recorded. The numbers in each
class are referred to as ‘frequencies’, hence the term ‘frequency distribution’. When the number
of items in each class is expressed as a proportion of the total, the table is usually referred to
as a ‘relative frequency distribution’, or simply a percentage distribution.
Example:
In a survey of 35 families in a village, the number of children per family was recorded and the
following data obtained:
1 0 2 3 4 5 6
7 2 3 4 0 2 5
8 4 5 12 6 3 2
7 6 5 3 3 7 8
9 7 9 4 5 4 3
Represent the data in the form of a discrete frequency distribution.
Solution:
Frequency distribution of the number of children.
No. of children    Tallies    Frequency
0                  II         2
1                  I          1
2                  IIII       4
3                  IIIII I    6
4                  IIIII      5
5                  IIIII      5
6                  III        3
7                  IIII       4
8                  II         2
9                  II         2
10                 -          0
11                 -          0
12                 I          1
Total                         35
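Hand tallies are easy to get wrong, so it can help to recount the observations mechanically. The following is a minimal sketch that counts the 35 values exactly as listed in the example; the variable names are illustrative only.

```python
from collections import Counter

# Number of children per family, copied row by row from the example.
children = [
    1, 0, 2, 3, 4, 5, 6,
    7, 2, 3, 4, 0, 2, 5,
    8, 4, 5, 12, 6, 3, 2,
    7, 6, 5, 3, 3, 7, 8,
    9, 7, 9, 4, 5, 4, 3,
]

counts = Counter(children)
n = len(children)

print("No. of children  Frequency  Relative frequency")
for value in range(min(children), max(children) + 1):
    f = counts[value]  # a Counter returns 0 for values that never occur
    print(f"{value:>15}  {f:>9}  {f / n:>18.3f}")
```

The frequencies should add up to the number of families surveyed (35), which is a quick check on any tally.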
Formation of a continuous frequency distribution
This classification is popular in practice. The following technical terms are important when a
continuous frequency distribution is formed or data are classified according to class intervals.
Class limits
These are the lowest and highest values that can be included in the class. For example, take the
class 20-40. The lowest value of the class is 20 and the highest is 40. The two boundaries of
the class are known as the lower limit and the upper limit of the class.
Class intervals
The difference between the upper and lower limits of a class is known as the class interval of that
class. For example, in the class 100-200, the class interval is 100 (i.e. 200 - 100). An important
decision while constructing a frequency distribution is about the width of the class interval, i.e.
whether it should be 10, 20, 50, 100, 500, etc.
The decision would depend upon a number of factors such as the range in the data, i.e. the
difference between the smallest and the largest item, the details required and the number of
classes to be formed. A simple formula to obtain an estimate of the appropriate class
interval, i, is:
$$i = \frac{L - S}{K}$$
Where: L – Largest item
S – Smallest item
K – The number of classes
e.g. if the salary of 100 employees in a commercial undertaking varied between £500 and
£5,500, and we want to form 10 classes, then the class interval would be:
$$i = \frac{L - S}{K}$$
L = 5,500
S = 500
K = 10
$$i = \frac{5{,}500 - 500}{10} = 500$$
The starting class would be 500-1000, the next 1000 – 1500 and so on.
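The same arithmetic can be sketched in a few lines of code. The function names `class_width` and `class_limits` are hypothetical helpers introduced for this illustration; the figures are those of the salary example.

```python
def class_width(largest, smallest, k):
    """Estimated class interval: i = (L - S) / K."""
    return (largest - smallest) / k


def class_limits(smallest, width, k):
    """Lower and upper limits of k consecutive classes starting at `smallest`."""
    return [(smallest + j * width, smallest + (j + 1) * width) for j in range(k)]


width = class_width(5500, 500, 10)
classes = class_limits(500, width, 10)

print(width)        # 500.0
print(classes[0])   # (500.0, 1000.0)
print(classes[1])   # (1000.0, 1500.0)
print(classes[-1])  # (5000.0, 5500.0)
```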
The question now is how to fix the number of classes, i.e. K. The number can either be fixed
arbitrarily, keeping in view the nature of the problem under study, or it can be decided with the
help of Sturges' rule. According to Sturges, the number of classes can be determined by the formula:
K = 1 + 3.322 log N
Where: N = total number of observations
log = logarithm (to base 10) of the number
Thus if 10 observations are being studied, the number of classes shall be:
K = 1 + (3.322 x 1)
= 4.322 or 4
And if 100 observations are being studied, the number of classes shall be:
K = 1 + (3.322 x 2)
= 7.644 or 8
The number of classes shall generally be between 4 and 20. In practice it is not taken to be less
than 4 even if N is less than 10, and if N is 1,000,000 (so that log N = 6), K will be
1 + (3.322 x 6) = 20.9 or 21.
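As a quick check on these figures, the rule K = 1 + 3.322 log N can be evaluated directly, taking the logarithm to base 10. This is only a sketch; `sturges_classes` is a hypothetical helper name, and the three values of N are chosen to give log N = 1, 2 and 6.

```python
import math


def sturges_classes(n):
    """Number of classes suggested by K = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)


for n in (10, 100, 1_000_000):
    k = sturges_classes(n)
    print(f"N = {n:>9,}  K = {k:.3f}  rounded: {round(k)}")
```

For these inputs K comes to about 4.322, 7.644 and 20.932, i.e. roughly 4, 8 and 21 classes.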
Sturges suggested the following formula for determining the magnitude of class interval:
$$i = \frac{\text{Range}}{1 + 3.322 \log N}$$
Where range is the difference between the largest and smallest items.
e.g. in the example above; and by applying the formula, the magnitude of class interval shall be:
$$i = \frac{5500 - 500}{1 + 3.322 \log 100} = \frac{5000}{7.644} = 654.1 \text{ or } 650$$
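The calculation for the salary example can be reproduced as follows. This is a sketch only; `sturges_interval` is a hypothetical helper name, and rounding the result to a convenient figure such as 650 remains a matter of judgement.

```python
import math


def sturges_interval(largest, smallest, n):
    """Class interval: i = Range / (1 + 3.322 * log10(N))."""
    return (largest - smallest) / (1 + 3.322 * math.log10(n))


i = sturges_interval(5500, 500, 100)
print(round(i, 1))  # 654.1
```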
Class Frequency
The number of observations corresponding to a particular class is known as the frequency of
that class or the class frequency.
Class Midpoint or class mark
It is the value lying halfway between the lower and upper class limits of a class interval. The
midpoint of a class is ascertained as follows:
$$\text{Midpoint} = \frac{\text{Upper limit of the class} + \text{Lower limit of the class}}{2}$$
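As a brief illustration, the midpoint of the classes 20-40 and 100-200 used earlier in these notes can be computed as below; `class_midpoint` is a hypothetical helper name.

```python
def class_midpoint(lower, upper):
    """Value lying halfway between the lower and upper class limits."""
    return (lower + upper) / 2


print(class_midpoint(20, 40))    # 30.0
print(class_midpoint(100, 200))  # 150.0
```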
There are two methods of classifying the data according to class-intervals:
i) Exclusive method
ii) Inclusive method
Exclusive Method
The class intervals are so fixed that the upper limit of one class is the lower limit of the next
class.
Example
Income (£) No. of Persons
1000-1100 50
1100-1200 100
1200-1300 200
1300-1400 150
1400-1500 40
1500-1600 10
Inclusive Method
The upper limit of one class is included in that class itself.
Example:
Income (£) No. of Persons
1000 – 1099 50
1100 – 1199 100
1200 – 1299 200
1300 – 1399 150
1400 – 1499 40
1500 – 1599 10
Total 550
To decide whether to use the inclusive or exclusive method, it is important to determine
whether the variable under observation is a continuous or discrete one. In case of continuous
variables, the exclusive method must be used. For example, height, being inherently a
continuous variable, should be stated as 60" and under 62", 62" and under 64", and so on.
The inclusive method should, in general, be used in the case of discrete variables.
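The difference between the two methods shows up at the class boundaries. The sketch below assigns a value to a class under either convention; `find_class` is a hypothetical helper, and the class limits are taken from the first three rows of the two income tables above.

```python
def find_class(value, classes, exclusive=True):
    """Return the (lower, upper) class containing `value`, or None if no class does.

    exclusive=True  -> lower <= value <  upper  (upper limit belongs to the next class)
    exclusive=False -> lower <= value <= upper  (upper limit belongs to the class itself)
    """
    for lower, upper in classes:
        if exclusive and lower <= value < upper:
            return (lower, upper)
        if not exclusive and lower <= value <= upper:
            return (lower, upper)
    return None


exclusive_classes = [(1000, 1100), (1100, 1200), (1200, 1300)]
inclusive_classes = [(1000, 1099), (1100, 1199), (1200, 1299)]

print(find_class(1100, exclusive_classes, exclusive=True))     # (1100, 1200)
print(find_class(1099, inclusive_classes, exclusive=False))    # (1000, 1099)
print(find_class(1099.5, inclusive_classes, exclusive=False))  # None
```

The last call shows why the inclusive method suits only discrete variables: a fractional value such as 1099.5 falls in the gap between two inclusive classes.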
Illustration:
Prepare a frequency table for the data below with the width of each class interval as 10. Use
exclusive method of classification.
57 44 80 75 00 18 45 14 04 64
72 51 69 34 22 83 79 20 57 28
96 56 50 47 10 34 61 66 80 46
22 10 84 50 47 73 42 33 48 65
10 34 60 53 75 90 58 46 39 69
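The classification asked for here can also be checked mechanically. The following is a minimal sketch, assuming the 50 marks exactly as printed (with 00 and 04 read as 0 and 4) and classes of width 10 under the exclusive method; the variable names are illustrative only.

```python
# Marks copied row by row from the illustration.
marks = [
    57, 44, 80, 75, 0, 18, 45, 14, 4, 64,
    72, 51, 69, 34, 22, 83, 79, 20, 57, 28,
    96, 56, 50, 47, 10, 34, 61, 66, 80, 46,
    22, 10, 84, 50, 47, 73, 42, 33, 48, 65,
    10, 34, 60, 53, 75, 90, 58, 46, 39, 69,
]
width = 10
n = len(marks)

# Start every class 0-10, 10-20, ..., 90-100 with a frequency of zero.
frequencies = {(lower, lower + width): 0 for lower in range(0, 100, width)}

# Exclusive method: a mark equal to an upper limit goes into the next class,
# so the class of mark m starts at (m // width) * width.
for m in marks:
    lower = (m // width) * width
    frequencies[(lower, lower + width)] += 1

print("Marks    Frequency  Relative frequency")
for (lower, upper), f in frequencies.items():
    print(f"{lower:>2}-{upper:<5} {f:>9}  {f}/{n} = {f / n:.2f}")
```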
Solution:
Preparation of frequency distribution:
Frequency distribution
Marks    Tallies    Frequency    Relative Frequency
0–10     II         2            2/50 = 0.04
10–20    IIIII      5            5/50 = 0.10