Survey and Field Research Methods
• Survey research is one of the most basic methods in economic research.
• Survey research permits a rigorous, step-by-step development and testing of complex propositions through survey data.
• The aim of sample surveys is to generalize from the sample to the population.
• The three most common purposes of surveys are:
  – Description
  – Explanation
  – Exploration

Basic Survey Designs
• We can distinguish between two basic types of survey design.
• Cross-sectional surveys
  – Data are collected at one point in time.
  – Less expensive and the most common type.
• Longitudinal surveys
  – Data are collected at different points in time.
  – Useful for capturing changes over time.

Survey Sampling
• Some studies involve only a small number of people, so all of them can be included.
• When the population is large, it is usually not possible to undertake a census of all items in the population.
• Sampling is the process of selecting a number of study units from a defined study population.
  – It aims at obtaining consistent and unbiased estimates of the population parameters.
• There are two principles underlying any sample design:
  – The need to avoid bias in the selection procedure
  – The need to gain maximum precision
• Bias can arise:
  – if the selection of the sample is done by some non-random method, i.e. selection is consciously or unconsciously influenced by human choice;
  – if the sampling frame (i.e. list, index, population record) does not adequately cover the target population;
  – if some sections of the population are impossible to find or refuse to co-operate.

Major Reasons for Sampling
• 1) Resource limitations: a sample study is usually less expensive than a census.
• 2) Superior quality of results: measurement is more accurate, since fewer units can each be measured with more care.
• 3) Infinite population: sampling is the only process possible if the population is infinite.
• 4) Destructive nature of some tests: sampling remains the only choice when a test involves the destruction of the items under study.
  – Example: testing the quality of a commodity (beer, cigarettes, coffee, etc.).

Representativeness
• Representativeness is particularly important if you want to generalize about the population.
• A representative sample has all the important characteristics of the population from which it is drawn.
• For quantitative studies:
  – If researchers want to draw conclusions that are valid for the whole study population, they should draw a sample in such a way that it is representative of that population.
• For qualitative studies:
  – Representativeness of the sample is NOT a primary concern.
  – We select study units that give us the richest possible information: you go for INFORMATION-RICH cases!

Steps in Sampling Design
• The critical steps in sampling are:
• a) Identifying the relevant population: when one wants to undertake a sample survey, the relevant population from which the sample is to be drawn needs to be identified.
  – Example: if the study concerns income, then defining the population elements as individuals or as households can make a difference.
• b) Determining the method of sampling: whether a probability or a non-probability sampling procedure is to be used must be decided.
• c) Securing a sampling frame: a list of elements from which the sample is actually drawn is necessary.
• d) Identifying parameters of interest: what specific population characteristics (variables and attributes) may be of interest.
• e) Determining the sample size: the sample size depends on several factors.
  – i) Degree of homogeneity: the size of the population variance is the single most important parameter. The greater the dispersion in the population, the larger the sample must be to provide a given estimation precision.
  – ii) Degree of confidence required: since a sample can never reflect its population for certain, the researcher must determine how much precision s/he needs.
  – Precision is measured in terms of (i) an interval range in which we would expect to find the parameter estimate, and (ii) the degree of confidence we wish to have in the estimate.
  – iii) Number of subgroups to be studied: when the researcher is interested in making estimates for various subgroups of the population, the sample must be large enough for each of these subgroups to meet the desired quality level.
  – iv) Cost: cost considerations have a major impact on decisions about the size and type of sample. All studies have some budgetary constraint, and hence cost dictates the size of the sample.
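
The first two factors above (dispersion and required confidence) can be illustrated with a small sketch. It assumes the standard textbook rule for estimating a mean from a simple random sample of a large population, n = (z·σ / E)², which is not taken from these slides; the function name and the example values of σ and the margin E are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= margin.

    Assumes a simple random sample from a large population and a
    normal approximation (an assumption of this sketch).
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)


# More dispersion (larger sigma) requires a larger sample.
print(sample_size_for_mean(sigma=10, margin=2))   # 97
print(sample_size_for_mean(sigma=20, margin=2))   # 385
# More confidence also requires a larger sample.
print(sample_size_for_mean(sigma=10, margin=2, confidence=0.99))
```

Doubling the standard deviation roughly quadruples the required sample, which is why the population variance is described as the single most important parameter.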
• To determine the sample size:
• 1. Use prior information: if the process has been studied before, we can use that prior information to determine the sample size.
  – This can be done by using prior mean and variance estimates and by stratifying the population to reduce variation within groups.
• 2. Rules of thumb: these are based on past experience with samples that have met the requirements of the statistical methods.
  – Researchers use them because they rarely have information on the variance or standard errors.
• 3. Practicality: the sample size you select must make sense.
  – We want enough observations to obtain reasonably precise estimates of the parameters of interest, but within a practical resource budget.
  – The sample size is therefore usually a compromise between what is DESIRABLE and what is FEASIBLE.
• In general, the smaller the population, the bigger the sampling ratio has to be for a reasonable sample. Hence:
  – For small populations (under 1,000), a researcher needs a large sampling ratio (about 30%); a sample size of about 300 is required for a high degree of accuracy.
  – For moderately large populations (10,000), a smaller sampling ratio (about 10%) is needed: a sample size of around 1,000.
  – For very large populations (over 10 million), one can achieve accuracy using tiny sampling ratios (0.025%), or samples of about 2,500.
• These are approximate sizes, and practical limitations (e.g. cost) also play a role in a researcher's decision about sample size.
• The size of the sample is determined using the formula below:

  $$n = \frac{Z^2 P q}{e^2}, \qquad P = \frac{\text{total number of households in the selected kebeles}}{\text{total number of households (or population) in the district/subcity}}$$

  where n is the sample size, P is the sample proportion, q = 1 - P, e is the acceptable error, which should not exceed 0.05 (here e = 0.05), Z = 1.96 is the standard normal value at the 95% confidence level, and α = 0.05 is the significance level used for testing.

Sample Size in Qualitative Studies
• There are no fixed rules for sample size in qualitative research.
• The size of the sample depends on WHAT you try to find out, and from which informants or perspectives you try to find it out.
• The sample size is therefore estimated as precisely as possible, but not determined.

Probability and Non-Probability Sampling
• Several sampling methods can be used to draw a sample. They fall into two types:
  – probability samples
  – non-probability samples
• Probability sampling is based on the concept of random selection of survey units.
• It uses a random selection procedure to ensure that each unit of the sample is chosen on the basis of chance.
• A randomization process is used in order to reduce or eliminate sampling bias so that the sample is representative of the population from which it is drawn.
• A sample will be representative of the population from which it is drawn if all members of the population have an equal chance of being included in the sample.
• Probability sampling requires a sampling frame (a listing of all study units).
• Probability samples, although not perfectly representative, are more representative than any other type of sample.
• Probability sampling therefore has considerable advantages over all other forms of sampling:
  – First, sampling errors can be calculated.
  – Second, probability samples rely on a random process, i.e. selection follows no pattern.
  – Finally, since each element has an equal chance of being selected, it is possible to get consistent and unbiased estimates of the population parameters.
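
The last two advantages can be checked with a small simulation. This is only a sketch: the population is synthetic (hypothetical values drawn from a normal distribution), and the sample size and number of repetitions are arbitrary choices made for illustration.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values (e.g. household incomes).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = fmean(population)


def srs_mean(pop, n, rng):
    """Mean of one simple random sample of size n (without replacement)."""
    return fmean(rng.sample(pop, n))


# Repeat the survey many times to see how the estimates behave.
estimates = [srs_mean(population, 100, rng) for _ in range(2_000)]

# Individual estimates vary around the truth (sampling error), but their
# average sits very close to the population mean (no systematic bias).
print("population mean:        ", round(true_mean, 2))
print("average of estimates:   ", round(fmean(estimates), 2))
print("spread of the estimates:", round(pstdev(estimates), 2))
```

The spread of the estimates is the sampling error that a probability design lets the researcher quantify; a non-random selection rule offers no such guarantee.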

Types of Probability Sampling Methods
• We can distinguish between the following types of sampling design:
  – Simple random sampling
  – Systematic sampling
  – Stratified sampling
  – Cluster sampling
  – Hybrid sampling
• 1. Simple Random Sampling (SRS)
  – SRS is the simplest and easiest method of probability sampling.
  – It is the sampling procedure in which each element of the population has an equal chance of being selected into the sample.
  – It assumes that an accurate sampling frame exists.
  – Two methods are usually adopted to pick a sample: the lottery method and a table of random numbers.
  – SRS requires a listing of the entire population of interest. This may not be possible for national surveys.
  – Interviewing a national face-to-face sample based on SRS is too expensive: the cost of interviewing randomly selected individuals drawn from a list of the entire population is extremely high.
  – So SRS can only be applied in situations where the population size is small.
• 2. Systematic Sampling (SS)
  – In SYSTEMATIC SAMPLING, individuals are chosen at regular intervals (for example, every fifth) from the sampling frame.
  – Instead of a list of random numbers, the researcher calculates a sampling interval: the standard distance between the elements selected into the sample.
  – The major advantages of SS are its simplicity and flexibility.
• 3. Stratified Sampling
  – Most populations can be segregated into a number of mutually exclusive subpopulations, or strata.
  – The stratified sampling technique is particularly useful when we have heterogeneous populations.
  – After a population is divided into the appropriate strata, a sample can be taken from each stratum using either the SRS or the SS technique.
  – The reasons for stratifying: there are three major reasons why a researcher chooses stratified random sampling.
• (a) To increase a sample's statistical efficiency.
• (b) To provide adequate data for analyzing the various subpopulations.
• (c) To enable different research methods and procedures to be used in different strata.
• How to stratify
  – Three major decisions must be made in order to stratify the given population into mutually exclusive groups.
  – (1) What stratification base to use: stratification would be based on the principal variable under study, such as income, age, education, sex, location, religion, etc.
  – (2) How many strata to use: there is no precise answer. The more strata, the closer one comes to maximizing inter-strata differences and minimizing intra-strata variance.
  – (3) What stratum sample sizes to draw: different approaches can be used.
  – One could adopt a proportionate sampling procedure: if the number of units selected from each stratum is proportional to the total number of units in that stratum, we have proportionate sampling.
  – Or one could use disproportionate sampling, which allocates elements on some other basis (for example, deliberately over-sampling a small stratum).
• 4. Cluster Sampling
  – The selection of groups of study units (clusters) instead of the selection of study units individually is called CLUSTER SAMPLING.
  – If the total area of interest is large and can be divided into a number of smaller non-overlapping areas (clusters), and some of these clusters are selected randomly, we have cluster sampling.
  – Clusters are often geographic units (e.g. districts, villages) or organizational units (e.g. firms, clinics, training groups).
• Cluster sampling addresses two problems:
  – Researchers lack a good sampling frame for a dispersed population.
  – The cost of reaching a sample element is very high, and cluster sampling reduces cost by concentrating the survey in selected clusters.
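
The four designs described so far can be sketched side by side. This is a minimal illustration, not a survey tool: the sampling frame of 1,000 numbered households, the stratum names and the village clusters are all hypothetical, and the function names are invented for this example.

```python
import random


def simple_random_sample(frame, n, rng):
    """SRS: every element has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then every k-th element of the frame."""
    k = len(frame) // n           # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


def proportionate_stratified_sample(strata, n, rng):
    """Allocate n across strata in proportion to stratum size, then SRS."""
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the shares add up to slightly more or less
        # than n when the proportions are not whole numbers.
        n_h = round(n * len(units) / total)
        sample[name] = rng.sample(units, n_h)
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(0)
frame = list(range(1, 1001))  # hypothetical frame: households 1..1000
strata = {"urban": frame[:600], "rural": frame[600:900],
          "peri_urban": frame[900:]}
clusters = {f"village_{i}": frame[i * 100:(i + 1) * 100] for i in range(10)}

print(len(simple_random_sample(frame, 50, rng)))   # 50
print(systematic_sample(frame, 50, rng)[:3])       # three units, 20 apart
print({k: len(v) for k, v in
       proportionate_stratified_sample(strata, 50, rng).items()})
print(sorted(cluster_sample(clusters, 3, rng)))    # names of 3 villages
```

Note that only the cluster design can be run without a full list of households: it needs a list of villages, plus the households inside the selected villages.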
• Multistage area sampling (MAS) is cluster sampling with several stages:
  – First, take a sample of a set of geographic regions or clusters, i.e. randomly select X clusters.
  – Next, a subset of geographic areas is sampled within each of those regions, and so on.
  – Finally, a sample of elements is drawn from the smaller areas.
• 5. Hybrid Sampling
  – Where there is no single way to sample a particular population, some researchers use a combination of the four methods discussed above.

Non-Probability Sampling
• Non-probability selection is non-random, i.e. each member does not have a known non-zero chance of being included.
• Generally, non-probability sampling is used under three conditions:
  – First, if there is no desire to generalize to a population parameter, there is much less concern about whether the sample fully reflects the population, i.e. precise representation is not necessary.
  – Second, it is used because of cost and time requirements: probability sampling can be prohibitively expensive, since it calls for more planning and repeated callbacks to ensure that each selected sample unit is contacted.
  – Third, although probability sampling may be superior in theory, there are breakdowns in its application: the total population may not be available for study in certain cases.
• Non-probability sampling methods:
• (1) Convenience sampling
  – The method selects anyone who is convenient.
  – It can produce ineffective, highly unrepresentative samples and is not recommended.
  – Such samples are cheap, but they are biased and full of systematic errors.
  – Example: the person-on-the-street interview conducted by television programs is a convenience sample.
• (2) Quota sampling
  – Quotas are assigned to different strata, and interviewers are given quotas to be filled from each stratum.
  – A researcher first identifies categories of people (e.g. male, female), then decides how many to get from each category.
  – The major limitation of this method is the absence of any element of randomization; consequently, the extent of sampling error cannot be estimated.
  – Quota sampling is used by opinion pollsters, in marketing research and in other similar research areas.
• (3) Purposive or judgment sampling
  – Purposive sampling occurs when one draws a non-probability sample based on certain criteria.
  – Focusing on a limited number of informants, whom we select strategically so that their in-depth information will give optimal insight into an issue, is known as purposeful sampling.
  – It uses the judgment of an expert in selecting cases.
  – BUT care should be taken that selection rules are developed for the different categories of informants, to prevent the researcher from sampling according to personal preference.
• (4) Snowball (network) sampling
  – This is a method for identifying and sampling (or selecting) the cases in a network.
  – Snowball sampling is based on an analogy to a snowball, which begins small but becomes larger as it is rolled on wet snow and picks up additional snow.
  – Snowball sampling begins with one or a few people or cases and spreads out on the basis of links to the initial cases.
  – You start with one or two information-rich key informants and ask them if they know persons who know a lot about your topic of interest.

Problems in Sampling
• There are two types of errors: non-sampling errors and sampling errors.
• Non-sampling errors are biases or errors due to fieldwork problems, interviewer-induced bias, clerical problems in managing data, etc.
  – These contribute to error in a survey irrespective of whether a sample is drawn or a census is taken.
• Error that is attributable to sampling, and which is therefore not present in information gathered in a census, is called sampling error.
• a) Non-Sampling Error
• Non-sampling error refers to:
  – Non-coverage error
  – Sampling the wrong population
  – Non-response error
  – Instrument error
  – Interviewer error
• Non-coverage error: this refers to a defect in the sampling frame.
  – Omission of part of the target population (for instance, soldiers, students living on campus, people in hospitals, prisoners, households without a telephone in telephone surveys, etc.).
  – Non-coverage error also occurs when the lists used for sampling are incomplete or outdated.
• The wrong population is sampled
  – Researchers must always be sure that the group being sampled is drawn from the intended population, i.e. the population they want to generalize about.
• Non-response error
  – Some people refuse to be interviewed because they are ill, are too busy, or simply do not trust the interviewer.
  – One should try to reduce the incidence of non-response.
  – Non-response error can occur in any interview situation, but it is mostly encountered in large-scale surveys with self-administered questionnaires.
  – It is important in any study to report the non-response rate and to discuss honestly whether and how non-response might have influenced the results.
• Instrument error
  – The word "instrument" in a sample survey means the device with which we collect data, usually a questionnaire.
  – When a question is badly asked or worded, the resulting error is called instrument error.
  – Example: leading questions or carelessly worded questions may be misinterpreted by some respondents.
• Interviewer error: this occurs when some characteristic of the interviewer, such as age or sex, affects the way in which the respondent answers questions.
  – Example: questions about sexual behavior might be answered differently depending on the gender of the interviewer.
• To sum up, a researcher must ensure that non-sampling error is avoided as far as possible, or is evenly balanced (non-systematic) and thus cancels out in the calculation of the population estimates.
• b) Sampling Errors
  – Sampling errors are random variations in the sample estimates around the true population parameters.
  – Error that is attributable to sampling, and which is therefore not present in information gathered in a census, is called sampling error.
  – Sampling errors can be calculated only for probability samples.
  – Increasing the sample size is one of the major instruments for reducing the extent of the sampling error.
  – Sampling error is related to confidence intervals: a narrower confidence interval means more precise estimates of the population for a given level of confidence.
• The confidence interval for the true population mean is given by:

  $$\bar{x} \pm z \, \frac{\sigma}{\sqrt{n}}$$

  where $\bar{x}$ is the sample mean, z is the value of the standard variate at a given confidence level (to be read from the table giving the area under the normal curve), n is the sample size, and σ is the standard deviation, so that σ/√n is the standard deviation of the sample mean.
• The sampling error is given by:

  $$z \, \frac{\sigma}{\sqrt{n}}$$

Dealing with Missing Data
• There are several reasons why data may be missing.
  – They may be missing because equipment malfunctioned, the weather was terrible, people got sick, or the data were not entered correctly.
• If data are missing at random, by far the most common approach is simply to omit the cases with missing data and to run the analyses on what remains.
• Although deletion often results in a substantial decrease in the sample size available for the analysis, it does have important advantages.
  – Under the assumption that data are "missing at random", it leads to unbiased parameter estimates.
• If, on the other hand, data are not missing at random but are missing as a function of some other variable, a complete treatment of missing data would have to include a model that accounts for the missing data.
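
The deletion approach and the confidence-interval formula can be combined in a short sketch. It assumes that missing answers are coded as None or NaN, that the sample standard deviation stands in for σ, and that the sample is large enough for the normal value z to be appropriate; the function names and the income figures are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def listwise_delete(values):
    """Keep only complete cases: drop entries coded as None or NaN."""
    return [v for v in values
            if v is not None and not (isinstance(v, float) and math.isnan(v))]


def mean_confidence_interval(values, confidence=0.95):
    """Return (lower, upper) = mean -/+ z * s / sqrt(n).

    Uses the sample standard deviation s in place of sigma
    (an assumption of this sketch).
    """
    n = len(values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    sampling_error = z * stdev(values) / math.sqrt(n)
    centre = mean(values)
    return centre - sampling_error, centre + sampling_error


# Hypothetical monthly incomes with two missing responses.
incomes = [1200.0, 950.0, None, 1430.0, float("nan"), 1100.0, 1010.0, 1275.0]
complete = listwise_delete(incomes)
low, high = mean_confidence_interval(complete)
print(len(incomes), "cases,", len(complete), "complete")
print("95% CI for the mean:", round(low, 1), "to", round(high, 1))
```

Because the interval width shrinks with the square root of n, every deleted case widens it slightly; that loss of precision is the price of deletion even when the missingness is random.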