Chapter Two
Sources of Data
There are two sources of data: primary and secondary.
Primary data- freshly collected, i.e. collected for the first time. They are original in
character: first-hand information collected, compiled and published for some purpose.
They have not undergone any statistical treatment.
Secondary data- second-hand information, mainly obtained from published sources such as
statistical abstracts, books, encyclopaedias, periodicals and media reports (e.g. census
reports), and from electronic sources such as the internet. They are not original in
character and have undergone some statistical treatment at least once.
a.) Field study- aims at testing hypotheses in natural life situations. It differs from a
field experiment in that the researcher does not control or manipulate the independent
variables, although both are carried out in natural conditions.
b.) Census- A census is a study that obtains data from every member of a population
(the totality of individuals or items sharing certain characteristics). In most studies, a
census is not practical because of the cost and/or time required.
c.) Sample survey- A sample survey is a study that obtains data from a subset of a
population in order to estimate population attributes or characteristics. Surveys of human
populations and institutions are common in government, health, social science and
marketing research.
d.) Case study- It is a method of intensively exploring and analyzing the life of a
single social unit, be it a person, a family, an institution, a cultural group or even an
entire community. In this method no attempt is made to exercise experimental or statistical
control, and phenomena related to the unit are studied in their natural setting. The
researcher has considerable discretion in gathering information from a variety of sources
such as diaries, letters, autobiographies, office records, files or personal interviews.
Sampling Procedure
Sampling involves two tasks:
How to select the elements?
How to estimate the population characteristics from the sampling units?
We employ some randomization process for sample selection so that no unit receives
preferential treatment in selection, which could introduce selection bias.
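The two tasks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 1,000 values, the sample size of 100 and the helper name simple_random_sample are all hypothetical and do not come from the notes. Selection uses the standard library's random.sample, which draws without replacement so that every unit in the frame has the same chance of being chosen.

```python
import random
import statistics

def simple_random_sample(frame, n, seed=None):
    """Task 1: select n distinct units, each with an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical population of 1,000 units with one numeric characteristic.
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(1000)]

sample = simple_random_sample(population, n=100, seed=1)

# Task 2: estimate the population characteristic from the sampled units.
estimate = statistics.mean(sample)
true_mean = statistics.mean(population)  # in practice known only from a census

print(f"sample mean = {estimate:.2f}, population mean = {true_mean:.2f}")
```

Because the selection is left to the random number generator, the researcher has no opportunity to favour particular units.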
Solution
If k = 5 is considered, stop the selection once n = 175 units have been selected.
If k = 6 is considered, treat the sampling frame as a circular list and continue the
selection from the beginning of the list after exhausting it during the first cycle.
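The two cases in this solution can be sketched as follows. The population size is not stated here, so N = 1000 is purely a hypothetical value chosen so that N/n lies between 5 and 6; the function names are likewise illustrative. The circular version assumes a random start anywhere in the frame, which is one common convention and may differ from the one intended in the original problem.

```python
import random

def linear_systematic(frame, k, n, seed=None):
    """Random start among the first k units, then every k-th unit; stop at n units."""
    start = random.Random(seed).randrange(k)
    return frame[start::k][:n]

def circular_systematic(frame, k, n, seed=None):
    """Random start anywhere in the frame, then every k-th unit, wrapping around.

    Note: for some combinations of N, k and n the wrap-around can revisit a unit;
    that does not happen for the values used below.
    """
    size = len(frame)
    start = random.Random(seed).randrange(size)
    return [frame[(start + j * k) % size] for j in range(n)]

N, n = 1000, 175                 # N is a hypothetical population size
frame = list(range(1, N + 1))    # units labelled 1..N

s5 = linear_systematic(frame, k=5, n=n, seed=0)    # list yields 200 units; stop at 175
s6 = circular_systematic(frame, k=6, n=n, seed=0)  # list runs out; wrap to the start

print(len(s5), len(s6))  # 175 175
```

With k = 5 a full pass would give more units than needed, so the selection is cut off at n; with k = 6 a single pass gives too few, so the frame is treated as circular.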
For example, imagine you were interested in understanding more about employee
satisfaction in a single, large organisation in the United States. You intended to collect
your data using a questionnaire. The manager who has kindly given you access to
conduct your research is unable to obtain permission to give you a list of all employees
in the organisation, which you would need in order to use a probability sampling technique
such as simple random sampling or systematic random sampling. However, the manager has
managed to secure permission for you to spend two days in the organisation to collect as
many questionnaire responses as possible. You decide to spend the two days at the
entrance of the organisation where all employees have to pass through to get to their
desks. Whilst a probability sampling technique would have been preferred, the
convenience sample was the only sampling technique that you could use to collect data.
Irrespective of the disadvantages of convenience sampling, discussed below, without the
use of this sampling technique, you may not have been able to get access to any data on
employee satisfaction in the organisation.
The convenience sample often suffers from a number of biases. This can be seen in both
of our examples, whether the 10,000 students we were studying, or the employees at
the large organisation. In both cases, a convenience sample can lead to the under-
representation or over-representation of particular groups within the sample. If we take
the large organisation:
It may be that the organisation has multiple sites, with employee satisfaction varying
considerably between these sites. By conducting the survey at the headquarters of the
organisation, we may have missed the differences in employee satisfaction amongst
those at different sites, including non-office workers. We also do not know why some
employees agreed to take part in the survey, whilst others did not. Was it because some
BSTA 1223: BUSINESS STATISTICS I 10
employees were simply too busy? Did they not trust the intentions of the survey? Did
others take part out of kindness or because they had a particular grievance with the
organisation? These types of biases are quite typical in convenience sampling.
Since the sampling frame is not known, and the sample is not chosen at random, the
inherent bias in convenience sampling means that the sample is unlikely to be
representative of the population being studied. This undermines your ability to make
generalisations from your sample to the population you are studying.
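A small simulation can make the representativeness problem concrete. Everything in it is an assumption made for illustration: the three sites, their sizes and their average satisfaction scores are invented, not data from any real organisation. The sketch compares a convenience sample taken only at the headquarters with a simple random sample drawn from a full employee list.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical organisation: satisfaction scores differ between sites.
sites = {
    "headquarters": [rng.gauss(7.5, 1.0) for _ in range(400)],
    "warehouse": [rng.gauss(5.0, 1.0) for _ in range(400)],
    "branch": [rng.gauss(6.0, 1.0) for _ in range(200)],
}
everyone = [score for scores in sites.values() for score in scores]

# Convenience sample: only employees met at the headquarters entrance.
convenience = rng.sample(sites["headquarters"], 100)

# Probability sample: requires the full list of employees (the sampling frame).
random_sample = rng.sample(everyone, 100)

pop_mean = statistics.mean(everyone)
conv_mean = statistics.mean(convenience)
rand_mean = statistics.mean(random_sample)

print(f"population mean         : {pop_mean:.2f}")
print(f"convenience sample mean : {conv_mean:.2f}")
print(f"random sample mean      : {rand_mean:.2f}")
```

Under these assumptions the convenience sample over-represents headquarters staff and overstates satisfaction, while the random sample stays close to the population mean.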
If you are an undergraduate or master’s level dissertation student considering using
convenience sampling, you may also want to read more about how to put together your
sampling strategy [see the section: Sampling Strategy