Sample Survey Practice
In most cases, direct investigation of every element of the
population (a census) is neither possible nor feasible.
This may be due to resource constraints (funds and time) or to other
reasons, such as the nature of the population, which can make
investigating all population elements impossible.
Why planning?
successful.
resources.
The planning of a sample survey has three major steps:
Sample design
Planning timetable
Organization of fieldwork
Collecting information
c) Survey Analysis
Preparation for processing (data files and data structures, data checking,
2.3 Source of Data
Data collection is the purposive gathering of information
relevant to the subject matter of the study from the units under
investigation.
Primary Data
Questionnaires
Interview method
Observation method
Secondary Data
Mortality reports
Morbidity reports
selection
computable (known).
Lottery method
these are put in a box and mixed, and a sample of the required
size is drawn from the box.
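The lottery method amounts to drawing a simple random sample without replacement. A minimal Python sketch, assuming a hypothetical population of 20 numbered units; the function name and the fixed seed (used only to make the illustration repeatable) are not from the original notes:

```python
import random

def lottery_sample(units, sample_size, seed=None):
    """Draw `sample_size` distinct units, as if drawing slips from a well-mixed box."""
    rng = random.Random(seed)
    return rng.sample(list(units), sample_size)

# Hypothetical population: units numbered 1 to 20.
population = range(1, 21)
sample = lottery_sample(population, sample_size=5, seed=1)
print(sample)
```

Every unit has the same chance of ending up in the sample, and no unit can be drawn twice.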
Table of random numbers
A random number i between 1 and k is taken randomly and
Then the other elements selected are i+k, i+2k, i+3k, …, i+(n-1)k,
where n is the sample size.
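The selection rule above can be sketched in Python. This is an illustration only: it assumes that k is the sampling interval N/n and that the population size N is an exact multiple of the sample size n. The function name is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the 1-based positions i, i+k, i+2k, ... chosen by systematic sampling."""
    if population_size % sample_size != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = population_size // sample_size      # sampling interval (assumed k = N/n)
    rng = random.Random(seed)
    start = rng.randint(1, k)                # random start i, with 1 <= i <= k
    return [start + j * k for j in range(sample_size)]

positions = systematic_sample(population_size=100, sample_size=10, seed=7)
print(positions)
```

Only the starting position is random; every later position is fixed by the interval k.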
the population.
– Proportionate sampling: Size of the sample from each stratum is
proportional to the relative size of the stratum in the total population.
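Proportionate allocation can be written as n_h = n × N_h / N, where N_h is the size of stratum h, N the population size, and n the total sample size. A small Python sketch with hypothetical strata, chosen so that the allocation comes out in whole numbers:

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate a sample to strata in proportion to stratum size (n_h = n * N_h / N)."""
    population = sum(stratum_sizes.values())
    return {name: total_sample * size / population
            for name, size in stratum_sizes.items()}

# Hypothetical strata and sizes (illustration only).
strata = {"urban": 2000, "rural": 6000, "pastoral": 2000}
allocation = proportionate_allocation(strata, total_sample=500)
print(allocation)
```

In practice the allocations are rarely whole numbers, so they have to be rounded while keeping the total equal to n.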
widely scattered
Non-probability sampling: members are selected from the population
in some nonrandom manner.
i) Convenience sampling
non-probability method.
When using this method, the researcher must be confident that the
iii) Quota sampling
CHAPTER 4
Sampling frame is a listing of the units from which the sample selection
households, persons, or
any identifiable items, and are generally known as area frame or list
frame.
The frame also contains auxiliary information (measure of size,
For surveys with multistage sample designs, a frame is needed for each
stage of selection.
For example, for the three stage design the sampling units for
For the final stage, a list of housing units (households) is required only
for the sampled EAs (kebeles).
stage of sampling for which they are used, include the following:
Intended use, frame units, coverage, media, content
Intended uses: Sampling frames are used for sample selection and for
making estimates based on sample data.
Frame units: The frame units are the sampling units included in the frame.
population and
to do so, every one of those units has a known (or knowable)
Media: Sampling frames may be stored either on print or electronic
media.
4.3 Desirable Properties of Frames
Desirable quality related properties are:
The frame consists of well-defined units
meaning that area units have recognized boundaries that are clearly
delineated on various types of maps, and that a precise standard
definition is established for non-area units.
Frame units have adequate identifiers
usually frame units will have both unique numerical identifiers (primary
identifiers) and other identifiers, such as names and addresses
(secondary identifiers).
The frame must be complete
the completeness of a sampling frame deals with the extent to which the
intended coverage is actually achieved and the extent to which desired
information for each frame unit is included in the frame.
If incompleteness and duplication exist in a frame, it can create a
The frame is up to date
for frames that are to be used more than once, procedures must be
developed for periodic updating to ensure that they remain up to date,
since some units are likely to change with time.
Frame must have stable units if there is a choice with respect to the
i.e. those that are least subject to change in number, definition and
size.
b) Efficiency Related Properties
the most efficient survey design is the one that produces the desired
c) Cost Related Properties
the total error of the survey estimates when that particular frame is
used.
CHAPTER 5
Sample Design
5.1. Sampling Methods
population;
with this and some reasonable assumptions we can estimate a sample size
Most sampling methods attempt to select units such that each has a
All methods that adopt this general approach are called probability
sampling methods.
The basis of probability sampling is the selection of sampling units to
make up the sample based on defining the chance that each unit in the
sample frame will be included.
Among few issues on which they should discuss and reach agreement
estimates and any restrictions placed on survey with respect to timeliness
a) Setting objectives and preliminary investigation of the survey:
The survey objective should be clearly specified and precisely stated at
the outset.
Other issues related to the objective and relevant to the survey must be
assessed at the early stage of the design.
b) Sampling plan
There are different ways of designing a sample survey, but the idea of
optimum design started with the sampling features
The selection process deals with the preparation of sampling frames,
sample size determination, choice of design to be used, and sample
selection method.
The estimation procedure involves the process for computing the sample
Choice of design: there are different designs of sample, which are likely
CHAPTER 6
Methods of collecting the Data
The objective of the survey, the nature of the items of information,
operational feasibility, and cost will often determine the method of data
collection. Of the various methods of collecting data, a few are
outlined below:
Extraction of data from records
6.1.1 Extraction of data from records
However, one must first consider carefully its suitability for the purpose,
There are some areas where information from records is the only available
Self-administered questionnaires:
The advantages of this method include:
The disadvantages of this method may include:
questionnaire is completed.
6.2.3 Direct investigation: measurement (observation) and interviewing
(face-to-face, telephone, and focus group discussion [FGD])
The type of question and the nature and status of the topic will determine
Measurements or Observations
Counts of human,
Interviewing (face-to-face, telephone)
and respondent.
It may be information that cannot be directly observed or measured
Face-to-face interviews have the highest response rate and permit the
longest questionnaires.
probes (recording materials).
Suitable for use with illiterates
Interviewers can also observe the surroundings and can use nonverbal
complex questions.
Cost is high- the training, travel, supervision, and personnel costs for
Its major advantages are lower cost and faster completion, with
There may be less interviewer bias and less social desirability bias than
Another disadvantage is that households without telephones and those
with unlisted numbers are automatically excluded from the survey, which
may bias results.
CHAPTER 7
unstructured questions.
A well-designed questionnaire will enable us to ask the respondents the
same questions in the same way and to record and code their answers
uniformly.
It should be designed in such a way that the recorded answers can
i) Open-ended question
The advantages of open-ended questions are:
• Respondent can answer in detail, and can qualify and clarify responses
• They may be used when there are too many response categories to list
on a questionnaire.
• They are useful when the questions are too complex to reduce to a few
standard responses.
Disadvantages of open-ended questions are:
The answers are not standardized and are therefore difficult to compare
They require a higher level of skills on the part of the data collector
ii) Closed-ended question
• Single coded question where the respondent is permitted to check one and
The question's meaning is often made clearer by the response categories,
The answers are relatively complete as long as all relevant categories are
specified
The respondent can guess the answer when they don’t know since they have the
A verbatim listing of every question, with complete wording and instructions on the
progression of the respondent through the form.
A listing of questions in a specific order, but without full or precise wording of the
questions, or instructions for progression through the form.
A tabular row and column format, in which spaces are indicated for response, usually
in coded form, without any specification of questions.
A checklist of topics, indicating key facts to be covered, but with answers recorded
either in an unstructured way in a field notebook, or a simplified row/column table.
7.6 Question phrasing and common problems which arise with question
phrasing
The information required should be well and clearly defined at each stage at which a
question is posed:
A clear meaning
Multiple (double-barreled) questions are questions which combine two or more distinct questions
into one single question. For example: ‘Do you like listening to the radio and watching television?’ ‘Do
you have a tractor or plough?’ ‘Does this company have pension and health insurance benefits?’
Ambiguous question: Ambiguity, confusion, and vagueness must be avoided in a question.
Vague words and phrases like ‘kind of’, ‘fairly’, ‘generally’, ‘often’, ‘regularly’, etc., should be
avoided.
Probing questions: A delicate balance has to be struck between persistence and rudeness. Use
cross-checking questions for sensitive …
Avoid the use of technical terms and jargon. For example: ‘Do you use inorganic fertilizer?’
Sensitive topics: In some cultures people do not like to discuss private matters openly. Sensitive
questions are apt to be irritating, threatening, or embarrassing to the respondent.
Questions on age, physical or mental disability, deaths in households, income, sexual
behavior, and family planning are generally regarded as sensitive issues.
7.7 Choice of the Reference Period
Time reference period is the specified length of time for which the respondent is asked
In general, the more recent and the shorter the reference period, the better the information is
likely to be.
CHAPTER 8
population it is to cover, the way people will react to questions and even the possible
answers they are likely to give.
For large-scale surveys it should be the general rule to conduct pre-tests and a pilot survey
8.1 Pre-tests
The pretest is a preliminary application of the data gathering technique for the
This may take the form of a series of small pre-tests on isolated problems of the
design.
Its objective is to evaluate the general receptivity and feasibility of the questionnaire,
and identify specific problems of communication between the interviewer and the
respondent in terms of specific questions or items of information sought.
8.2 Pilot study
A pilot survey or pilot study is generally a full-scale dress rehearsal/trial of the survey.
The whole of the survey operation in all its aspects must be tested out on small scale in a few
the arrangements for the supply and distribution of all the resources;
Since the purpose of the pilot study is to identify weaknesses and problems with the survey
In other words, the survey forms and procedures must be observed under operational conditions in
the field if problems are to be correctly identified and appropriate solutions found.
It is therefore necessary to allow enough time to analyze the results and observations from it, and
produce revised materials and arrangements in good time for the start of the main survey operations.
8.3 Specific uses of pilot survey
The pilot survey has many benefits in particular if the survey is to be conducted for the first time. In general it
The adequacy of the sampling frame from which it is proposed to select the sample.
The estimates necessary for determining the sample size needed in the actual survey so that the final estimates
The non-response rate to be expected, i.e., the probable numbers of refusals and non-contacts can be roughly
estimated from the pilot survey or pre-tests, and ways of reducing non-response can be sought.
Making a sensible choice from alternative methods of collecting the data (observation, mail questionnaires,
interviewers, etc.).
The adequacy of the questionnaire, which is probably the most valuable function of the pilot survey.
The codes chosen for pre-coded questions, which may help to decide the alternative answers to be allowed for in
the coding.
The probable cost and duration of the main survey and of its various stages.
The deficiency of the organization in the field, in the office and in the communication between the two.
CHAPTER 9
Survey Cost Estimation
9.1 Time Scheduling
Once there is an agreement to proceed with the survey, a planning timetable should be
drawn up in order to facilitate planning and budgeting. Scheduling for field operations
must take into account two key aspects:
• List of survey activities; and
• Approximate time needed to perform each activity
9.2 Preparing Budgets
Budget preparation involves assigning a cost to each survey activity. The main expenditure
items include:
• Office wages and salaries (administration, executive personnel, quality control, data processing);
• Survey material
• Supervisory and interviewing costs (enumerators’, supervisors’ and field officers’ salaries and
allowances);
• Supplies for the reproduction of questionnaires, forms, manuals, and other stationery;
• Transport cost;
• Computer services;
• Sampling design cost
• Other administrative costs (office rentals, overheads recovery); etc.
Preparation of a preliminary budget estimate is a priority activity that should be planned and
executed at an early stage. Generally, the budget will depend on the survey design
Example of Budget Preparation for Survey:
1. Office Experts
1 Survey director for 1 month at Birr 10,000 per month 10,000
1 Field organizer for 1 month at Birr 6000 per month 6,000
1 Survey statistician for 1 month at Birr 6,000 per month 6,000
Sub-total 22,000
2. Field Personnel
a) Salaries
50 enumerators for 2 months at 400 Birr per month 40000
10 Field supervisors for 3 months at 600 Birr per month 18000
10 Drivers for 3 months at 350 Birr per month 10500
Sub-total 68500
b) Allowances
50 Enumerators for 1.5 months at 25 Birr per day 56250
10 Field supervisors for 2 months at 30 Birr per day 18000
10 Drivers for 2 months at 25 Birr per day 15000
50 Guides for 2 months at 10 Birr per day 1000
Sub-total allowances 90250
Total field personnel 158750
3. Equipment and Transport
Office equipment and furniture 80000
Rent of vehicles 150000
Running costs, maintenance, insurance 6000
Enumerators’ equipment 10000
Data processing equipment 4000
Miscellaneous 3000
Total equipment and supplies 253000
4. Stationery
Printing of forms, questionnaires 12000
Pens, pencils, sharpeners, erasers, rulers 500
Report production 1500
Manuals 2500
Total stationery 16500
5. Data processing staff
1 data expert for 2 months at 3000 Birr per month 6000
5 data editors and coders for 20 days at 1500 Birr per month each 5000
2 data entry for 15 days at 2000 Birr per month each 2000
Total data processing staff 13000
Total Budget 463250
6. Contingency
Approximately 10% 46325
Grand total 509575
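The roll-up at the end of the example budget can be checked with a few lines of Python. The sketch below takes the five section totals exactly as stated above and treats the contingency as exactly 10% of the total budget; the variable names are illustrative only.

```python
# Section totals as stated in the example budget (in Birr).
section_totals = {
    "Office experts": 22_000,
    "Field personnel": 158_750,
    "Equipment and transport": 253_000,
    "Stationery": 16_500,
    "Data processing staff": 13_000,
}

CONTINGENCY_PERCENT = 10  # "approximately 10%" in the example

total_budget = sum(section_totals.values())
contingency = total_budget * CONTINGENCY_PERCENT // 100   # integer arithmetic avoids float rounding
grand_total = total_budget + contingency

print(total_budget, contingency, grand_total)
```

The computed figures agree with the totals in the example: 463,250 for the total budget, 46,325 for contingency, and 509,575 for the grand total.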
CHAPTER 10: FIELD WORK
10.1 Organization of fieldwork
Fieldwork involves recruiting and training of field staff, actual data collection,
Field workers are required for the collection of data where personal interview or
The quality of these workers is one of the most crucial factors in determining the
quality of information. Extra care and attention at the stage of recruiting field
workers are therefore essential.
10.2 Recruitment of field workers
There are three possible types of approach to the recruitment of field workers.
Field workers might be recruited for a particular survey for a limited period only; or
survey program; or
Use an existing group of people, either from an established data collection organization
Role of the interviewer:
The primary role of an interviewer is to gather data upon which major decisions are
based
The interviewer must be well informed about the survey and its objectives
The interviewer must establish good relations with the respondent, avoid arousing
unnecessary prejudice, confusion or resentment, and always respect the confidence in which
the respondent has given information
The interviewer must motivate the respondent to supply comprehensive and accurate
answers.
The functions of the field staff mainly fall into three categories
o Data collection;
fitness which include level of education, age limit, health conditions, sex, etc. For
example, regarding age, persons outside 20 to 45 may not be appropriate for field
work
Testing enumerators on the ability to read maps and to make changes and clerical
duties such as handwriting, form filling and ability to follow instruction; and
appearing relaxed, being neutral, absolute honesty and integrity, and work under
difficult conditions
10.3 Training of field workers
A large–scale survey requires hiring several field workers
Good training, adequate pay and good supervision are important for consistent high-quality
performance
An effective training program is essential to complete field operations efficiently and on
schedule
Before going into the actual fieldwork, all the field workers should be trained on the
There are also several specific elements of field worker training. These are
usually set out in a survey manual for regular use and include a
description of the survey's work, methods of data collection, interviewing
techniques, how to check and handle completed questionnaires,
what to do about non-response, the standard definitions used in the questionnaires, and
the content of the questionnaires.
The training could take the form of a formal course that combines
lecturing, demonstrations and discussion in class, mock
interviews in class, trial interviews or practice in the field, discussion of
the results of field practice, and an evaluation before deployment.
Supervisory staff need additional training in supervision. This should include training in
the specific skills of supervising and checking field work, organization and record
keeping, and the training of enumerators.
10.4 Management of Fieldwork
The survey forms and questionnaires, and survey manuals and other documents.
In addition, the question of stationery supplies is easy to forget, but vital to remember. The
supply may include pens, pencils, erasers and pencil sharpeners, clip-boards, bags, etc.
b) Public relations: it is important to publicize the survey by informing and involving responsible
local government and administrative personnel and the traditional community hierarchy in
developing countries. Their cooperation is essential in giving local credibility to the survey
work.
• For publicity, depending on the nature of the survey, other channels such as radio,
television, newspapers and magazines, posters and leaflets can be used in all locally important
languages.
10.5 Supervision and Quality Checks
Supervision and quality checks are part of field management. The field work of supervisors mostly
consists of the following activities.
• Allocating the work to enumerators,
• Monitoring the progress of the field work and taking remedial action, if necessary,
• Coordinating the full range of administrative support services required at field level, and
There are some ways of checking the quality of interviewing but the three major ones are:
The numbers are in a raw form (raw data), on questionnaires, note pads, recording
sheets, or paper. The raw data needs to be converted into a form suitable for analysis
and interpretation.
Data processing is therefore the link between data collection and data analysis. This
can be achieved through a sequence of activities, which include editing, coding, entry
and tabulation.
a) Editing
Editing refers to checking and correcting the data, manually or by computer.
The checking establishes whether the information contained in the questionnaire is
complete, recorded in the prescribed manner, accurate, internally consistent, and from
an eligible respondent.
Field editing is intended to uncover errors in recording responses during the data
collection stage.
The central editing is performed when the completed questionnaires are returned to the
office. The objectives are to correct major errors such as those relating to questionnaire
identification, and to prepare questionnaire for coding and data entry so as to minimize
the possibility of error in these latter operations.
The modes of editing include manual and computer editing.
Manual editing is performed by a group of editors, usually the field supervisors or
trained editors. These editors are given a set of editing instructions specifying in detail the
rules and guidelines to be followed in editing. The problem with manual editing is that it
is a time-consuming and costly exercise.
Computer editing involves the use of computer facilities to detect inconsistencies in the
questionnaires. It allows a large number of editing (cleaning and validation) instructions
to be executed simultaneously, and hence speed and accuracy are achieved.
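As an illustration of computer editing, the Python sketch below applies a completeness check, a range check, and a consistency check to each record. The field names, the age limits, and the rule linking age to marital status are assumptions made for the example, not rules taken from any particular survey.

```python
def edit_record(record):
    """Return a list of edit failures found in one questionnaire record."""
    problems = []
    # Completeness check: every expected field must be present and non-empty.
    for field in ("id", "age", "sex", "marital_status"):
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    age = record.get("age")
    # Range check: age must be a plausible whole number (assumed limits 0 to 120).
    if isinstance(age, int) and not 0 <= age <= 120:
        problems.append("age out of range")
    # Consistency check (assumed rule): a child under 10 should not be recorded as married.
    if isinstance(age, int) and age < 10 and record.get("marital_status") == "married":
        problems.append("age inconsistent with marital status")
    return problems

# Hypothetical records.
records = [
    {"id": 1, "age": 34, "sex": "F", "marital_status": "married"},
    {"id": 2, "age": 7, "sex": "M", "marital_status": "married"},
    {"id": 3, "age": 150, "sex": "", "marital_status": "single"},
]
flagged = {r["id"]: edit_record(r) for r in records}
print(flagged)
```

Record 1 passes all checks, record 2 fails the consistency check, and record 3 fails both the completeness and the range checks.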
b) Coding
Coding refers to the process of identifying and assigning a numerical or character symbol to
questionnaire entries, with the objective of preparing the data in a form suitable for entry
into the computer.
The coding procedure is a set of rules stating that certain numbers are assigned to
variable attributes.
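A code book can be represented as a simple lookup table. The Python sketch below is an illustration only: the variables, categories, numeric codes, and the code 9 for missing answers are all hypothetical.

```python
# Hypothetical code book: each variable maps its response categories to numeric codes.
CODE_BOOK = {
    "sex": {"male": 1, "female": 2},
    "employment": {"employed": 1, "unemployed": 2, "not in labour force": 3},
}
MISSING_CODE = 9  # assumed code for blank or unrecognized answers

def code_response(variable, answer):
    """Translate a recorded answer into its numeric code using the code book."""
    normalized = (answer or "").strip().lower()
    return CODE_BOOK[variable].get(normalized, MISSING_CODE)

coded = [
    code_response("sex", "Female"),
    code_response("employment", " employed "),
    code_response("employment", ""),
]
print(coded)
```

Normalizing case and spacing before the lookup means the same answer always receives the same code, however it was written on the form.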
Researchers begin thinking about a coding procedure and code book before they collect
c) Data entry
The data must be transferred from raw data forms into a format for computers.
The aim is to store the data in a machine-readable format, and then to use it for
A subsequent transfer of parts or all of the data from one sheet or file to another is also
possible.
There are different ways of transferring data of which direct entry and optical scan sheet
Direct data entry is the more common and generally more appropriate approach in
developing countries
Transferring data from one medium to another gives rise to a number of
possible errors: some data may be lost, some data may
be repeated, and the values of some data items may be changed.
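One way to detect these three kinds of transfer error is to compare the records before and after the transfer. The Python sketch below assumes that every record carries a unique questionnaire id; the data and the function name are hypothetical.

```python
from collections import Counter

def compare_transfer(source, entered):
    """Compare records before and after transfer, keyed by questionnaire id."""
    source_by_id = {rec["id"]: rec for rec in source}
    entered_counts = Counter(rec["id"] for rec in entered)
    lost = sorted(i for i in source_by_id if entered_counts[i] == 0)
    repeated = sorted(i for i, count in entered_counts.items() if count > 1)
    changed = sorted({rec["id"] for rec in entered
                      if rec["id"] in source_by_id and rec != source_by_id[rec["id"]]})
    return {"lost": lost, "repeated": repeated, "changed": changed}

# Hypothetical data: record 3 is lost, record 4 is entered twice, record 2 has a changed value.
source = [{"id": 1, "age": 30}, {"id": 2, "age": 41}, {"id": 3, "age": 25}, {"id": 4, "age": 52}]
entered = [{"id": 1, "age": 30}, {"id": 2, "age": 14}, {"id": 4, "age": 52}, {"id": 4, "age": 52}]
report = compare_transfer(source, entered)
print(report)
```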
d) Tabulation
Data tabulation may take the form of simple tabulation or cross-tabulation.
Simple tabulation involves counting a single variable and presents an empirical distribution
of the number of observations that fall into each category of response. For example, data
on gender can be tabulated for male and female.
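Both forms of tabulation can be sketched in Python with collections.Counter. The responses below are hypothetical, and cross-tabulation is taken here to mean counting each combination of two variables.

```python
from collections import Counter

# Hypothetical coded responses: one (gender, employment status) pair per respondent.
responses = [
    ("male", "employed"), ("female", "employed"), ("female", "unemployed"),
    ("male", "employed"), ("female", "employed"), ("male", "unemployed"),
]

# Simple tabulation: count a single variable.
gender_counts = Counter(gender for gender, _ in responses)

# Cross-tabulation: count each combination of two variables.
cross_tab = Counter(responses)

print(dict(gender_counts))
for (gender, status), count in sorted(cross_tab.items()):
    print(gender, status, count)
```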
11.2 Analysis and Interpretation of Data
Data analysis is concerned with categorizing, ordering, and summarizing data
while interpretation is essentially a follow-up which first involves the search for the meaning and
applications of the research results, and ultimately draws conclusions about these relationships.
The main divisions of data analysis and interpretation, together with the respective statistical tools
and techniques adopted, are outlined as follows:
a) Describing data
• Measure of central tendency (arithmetic mean, median, mode)
• Measure of dispersion (range, quartile deviation, mean deviation, standard deviation)
• Statistical estimation (point and interval estimation, assessing differences)
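The descriptive measures listed above can be computed with Python's standard statistics module. The data are hypothetical household sizes. The quartile deviation is taken as half the interquartile range, and the quartiles use the module's default interpolation method, which is one of several conventions in use.

```python
import statistics

# Hypothetical sample of household sizes.
data = [2, 3, 3, 4, 5, 5, 5, 6, 7, 10]

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion.
value_range = max(data) - min(data)
q1, _, q3 = statistics.quantiles(data, n=4)     # first, second and third quartiles
quartile_deviation = (q3 - q1) / 2
mean_deviation = sum(abs(x - mean) for x in data) / len(data)
std_dev = statistics.stdev(data)                 # sample standard deviation

print(mean, median, mode, value_range, mean_deviation)
```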
b) Testing hypotheses
Formulate the null and alternative hypotheses
Specify the level of significance
Select the appropriate test statistic
Compute the value of the test statistic, using sample data
Compare computed value of test statistic and the critical value
Reject H0 if the computed test statistic falls outside the acceptance region, and accept H0 if the computed test statistic falls within it.
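The testing steps listed above can be illustrated with a two-sided one-sample z-test. This Python sketch assumes that the population standard deviation is known and that the sample mean is approximately normal; the figures are hypothetical.

```python
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test of H0: mu = mu0, assuming sigma is known."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)      # computed test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)     # critical value at level alpha
    reject = abs(z) > critical                         # True if z is outside the acceptance region
    return z, critical, reject

# Hypothetical figures: H0 states a mean household size of 5.0; a sample of 100 gives 5.5.
z, critical, reject = z_test_mean(sample_mean=5.5, mu0=5.0, sigma=2.0, n=100)
print(round(z, 2), round(critical, 2), reject)
```

With these figures the test statistic is 2.5, which exceeds the 5% critical value of about 1.96, so the null hypothesis is rejected.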
between variables
The association can be measured by simple and multiple regression and correlation,
The use of the appropriate techniques would depend on the number of variables
CHAPTER 12: Non-Sampling Error
12.1 The Nature of Survey Error
Survey error occurs when there is a discrepancy between the survey estimate and the true value.
Survey errors are generally divided into two major types: sampling error and non-sampling error.
Sampling errors are the random variations in the sample estimates around the true population
parameters.
They occur by chance.
Non-sampling error is a type of systematic error that cannot be avoided or reduced by
increasing the sample size.
Non-sampling error can be reduced, and sometimes eliminated, by careful design of the sampling procedure.
12.2 Methods of classifying non-sampling errors
Several schemes for classifying non-sampling errors are possible
One approach is to classify non-sampling errors by the stage of the survey in which they
occur. The three major stages of survey are survey design and preparation, data
collection, and data processing and analysis. Each of these stages can be subdivided and
they are useful in discussing the control of non-sampling errors.
A second approach classifies non-sampling errors as observational or
non-observational. Observational errors include questionnaire error, data processing error, and
analysis (reporting) error, while non-observational errors consist of interviewer error, respondent
error and coverage error.
The underlying measurement and control of non-sampling errors of these types are as follows.
12.2.1 Non-observational
a) Coverage Errors
Coverage errors include non-coverage (under-coverage) and over-coverage of survey units.
Non-coverage is failure to include some units of observation, either directly or implicitly, in the
operational sampling frame. Non-coverage will lead to error in the sample results if
the missed units differ in characteristics from the units covered.
The sum of the absolute values of non-coverage and over-coverage error gives coverage error.
b) Interviewer error
The interviewer is responsible for collecting data from the respondent in the most accurate and
At the same time, interviewers can be a source of error by failing to put the question clearly, by
c) Respondent error
Respondent error can be broadly classified into two categories: response error and non-response
i) Non-response error
Non-response error arises from failure to include a designated sampling unit. It can arise from
several different sources, depending upon the survey situation.
Respondents are unable or unsuitable for interview because of physical, mental, emotional, or
language problems.
Failure to gain cooperation. This can happen when the respondent is unwilling or unable to
respond.
Response errors occur in the data collection phase of a survey, and are distinguished from errors
which occur in the data processing phase.
Response error refers to information that is obtained from the respondent but is incorrect.
The response error may be unintentional or it may be deliberate on the part of the respondents. A
person may not know his exact age, or he may report his age wrongly even when he knows it.
There are two basic sources of response errors: respondents and interviewers.
For example, the inability of respondents to provide the desired information is a common source
of response errors.
This may arise from lack of knowledge, difficulty recalling facts from the distant past,
misunderstanding of the questions, unwillingness to give the correct answer, etc.
Respondents sometimes purposely report certain information incorrectly to protect their or simply
12.2.2 Observational
a) Questionnaire error: the sources of questionnaire error include poor design, the types of
questions used, an excessively long questionnaire, inadequate interviewer instructions, or the wrong
measurement/attitudinal scale.
b) Data processing error may be caused by error in editing data, in coding, in computer
data entry and in tabulation.
c) Analysis (reporting) error refers to the use of inappropriate statistical methods in the analysis
and interpretation of data.