Presented By: Ashwini Pokharkar, Rohit Pandey, Swapnil Muke, Apoorva Dave, Peeyush Khandekar, Shailaja Patil
To understand sampling and its advantages. To know the sampling process. To know the types of sampling, i.e. probability and non-probability sampling. To understand the factors to consider when determining sample size. To know sampling errors.
Sampling is the process of selecting a small number of elements from a larger defined target group of elements such that the information gathered from the small group allows judgments to be made about the larger group.
More accurate measurement: I. Inspection fatigue is reduced. II. Sampling error can be studied and controlled, and probability statements can be made about its magnitude.
Population of interest is entirely dependent on Management Problem, Research Problems, and Research Design.
Some Bases for Defining Population:
Geographic Area, Demographics, Usage/Lifestyle, Awareness
Sampling frame: a list of population elements (people, companies, houses, cities, etc.) from which the units to be sampled can be selected.
It is difficult to get an accurate list. Sample frame error occurs when certain elements of the population are accidentally omitted or otherwise not included on the list.
Probability sampling: simple random sampling, systematic random sampling, stratified random sampling, cluster sampling
Non-probability sampling: convenience sampling, judgment sampling, quota sampling, snowball sampling
PROBABILITY SAMPLES
A probability sample is one in which each element of the population has a known non-zero probability of selection.
Not a probability sample if some elements of the population cannot be selected (have zero probability).
Not a probability sample if probabilities of selection are not known.
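The definition above can be illustrated with the two simplest probability designs. This is a minimal sketch, not a prescribed procedure: the frame of 100 element IDs, the function names, and the seeds are all assumptions made for illustration. In both designs every element of the frame has a known, non-zero selection probability of n/N, provided the frame size is a multiple of the sample size in the systematic case.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; each has probability n/len(frame)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Random start in the first interval, then every k-th element.

    Assumes len(frame) is a multiple of n, so that every element
    has the same selection probability 1/k = n/len(frame).
    """
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random start in [0, k)
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame: 100 element IDs.
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10, seed=1)
sys_sample = systematic_sample(frame, 10, seed=1)
print("simple random:", sorted(srs))
print("systematic:   ", sys_sample)
```

Elements that are missing from `frame` can never appear in either sample, which is the zero-probability case the definition rules out.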
If the sampling frame is a poor fit to the population of interest, random sampling from that frame cannot fix the problem
The sampling frame is non-randomly chosen. Elements not in the sampling frame have zero probability of selection.
Generalizations can be made ONLY to the actual population defined by the sampling frame
Cluster sampling: the population is divided into groups (clusters) and some of the groups are randomly selected. For a given sample size, a cluster sample has more error than a simple random sample. The cost savings of clustering may permit a larger sample.
STRATIFIED CLUSTER SAMPLING: Reduce the error in cluster sampling by creating strata of clusters. Sample one cluster from each stratum. This combines the cost savings of clustering with the error reduction of stratification.
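The difference between plain cluster sampling and stratified cluster sampling can be sketched as follows. The regions, village names, and members are hypothetical data invented for this example, and the function names are illustrative only.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; every member of a chosen cluster is kept."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def stratified_cluster_sample(strata_of_clusters, seed=None):
    """Randomly select exactly one cluster from each stratum."""
    rng = random.Random(seed)
    picked = {}
    for stratum, clusters in strata_of_clusters.items():
        name = rng.choice(sorted(clusters))
        picked[stratum] = (name, list(clusters[name]))
    return picked

# Hypothetical population: villages (clusters) grouped by region (strata).
villages = {
    "north": {"N1": ["a", "b", "c"], "N2": ["d", "e"]},
    "south": {"S1": ["f", "g"], "S2": ["h", "i", "j"]},
}
# Flatten the strata to get the plain list of clusters.
all_clusters = {name: members
                for clusters in villages.values()
                for name, members in clusters.items()}

chosen = cluster_sample(all_clusters, 2, seed=3)
picked = stratified_cluster_sample(villages, seed=3)
print("cluster sample:           ", chosen)
print("stratified cluster sample:", picked)
```

Plain cluster sampling may happen to pick both villages from the same region; the stratified version guarantees one village per region.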
Convenience sampling relies upon convenience and access. Judgment sampling relies upon the belief that participants fit the required characteristics. Quota sampling emphasizes representation of specific characteristics. Snowball sampling relies upon respondents' referrals of others with like characteristics.
Sampling error is any type of bias that is attributable to mistakes in either drawing a sample or determining the sample size. Sampling error is the amount of inaccuracy in estimating a value that arises because only a portion of the population is observed rather than the whole population.
Non-sampling error covers deviations from the true value that are not a function of the sample chosen, including systematic errors. Non-sampling errors are present in sample surveys as well as census surveys. Non-sampling error occurs when: the aim of the survey is not very clear; the questionnaire is imperfect; respondents give incoherent answers; respondents have inadequate knowledge; prestige problems arise.
How many completed questionnaires do we need to have a representative sample? Generally the larger the better, but that takes more time and money. Three criteria usually need to be specified to determine the appropriate sample size:
The level of precision. Confidence level. Degree of variability.
The level of precision, sometimes called sampling error, is the range in which the true value of the population is estimated to be. This range is often expressed in percentage points, (e.g., 5 percent), in the same way that results for political campaign polls are reported by the media. Thus, if a researcher finds that 60% of farmers in the sample have adopted a recommended practice with a precision rate of 5%, then he or she can conclude that between 55% and 65% of farmers in the population have adopted the practice.
The confidence or risk level is based on ideas encompassed under the Central Limit Theorem. The key idea encompassed in the Central Limit Theorem is that when a population is repeatedly sampled, the average value of the attribute obtained by those samples is equal to the true population value. Furthermore, the values obtained by these samples are distributed normally about the true value, with some samples having a higher value and some a lower value than the true population value.
In a normal distribution, approximately 95% of the sample values are within two standard deviations of the true population value (e.g., mean). In other words, this means that, if a 95% confidence level is selected, 95 out of 100 samples will have the true population value within the range of precision specified.
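The "95 out of 100 samples" reading can be checked with a small simulation. This is a sketch under stated assumptions: the true proportion of 0.6 echoes the farmer example, while the sample size of 400, the 2,000 repetitions, and the usual normal-approximation interval for a proportion are choices made here for illustration.

```python
import math
import random

def coverage(true_p=0.6, n=400, reps=2000, z=1.96, seed=0):
    """Fraction of repeated samples whose interval contains the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n from a population with proportion true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        # Normal-approximation half-width (the range of precision).
        half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= true_p <= p_hat + half_width:
            hits += 1
    return hits / reps

rate = coverage()
print(f"share of samples whose interval covers the true value: {rate:.3f}")
```

The printed share should land close to 0.95, with some run-to-run variation if the seed is changed.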
The degree of variability in the attributes being measured refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size. Note that a proportion of 50% indicates a greater level of variability than either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest.
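One way to see why 50% is the most variable case is to compute the variance of a yes/no attribute, p(1 - p), for the three proportions mentioned. The helper name `variability` is an assumption for this sketch.

```python
def variability(p):
    """Variance of a yes/no attribute held by a proportion p of the population."""
    return p * (1 - p)

for p in (0.2, 0.5, 0.8):
    print(f"p = {p:.1f}  ->  p(1 - p) = {variability(p):.2f}")
```

The value peaks at p = 0.5 and is the same for 0.2 and 0.8, so assuming p = 0.5 gives the most conservative (largest) sample size.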
Larger sample sizes generally produce more accurate estimates about populations. However, it may be hard to find a random sample of people.
Although larger sample sizes produce more accurate statistics, the extra cost and effort is not always needed, as smaller sample sizes can also produce significant results.
Variability of the population characteristic under investigation. Level of confidence desired in the estimate.
The confidence interval, expressed here as a margin of error, is the plus-or-minus figure usually reported in newspaper or television opinion poll results.
For example, if you use a confidence interval of 4 and 47% of your sample picks an answer, you can be "sure" that, had you asked the question of the entire relevant population, between 43% (47 - 4) and 51% (47 + 4) would have picked that answer.
The formula for calculating the sample size for a simple random sample without replacement is as follows:
n = (z/m)^2 p(1 - p)
where z is the z value (e.g., 1.645 for 90% confidence level, 1.96 for 95% confidence level, and 2.575 for 99% confidence level); m is the margin of error (e.g., .07 = ±7%, .05 = ±5%, and .03 = ±3%); and p is the estimated proportion having the attribute of interest (.5 in the calculation that follows, so p(1 - p) = .25).
Using our factors for the principal investigator population, PIs1, and solving the sample size equation, we find: n = (1.96/.05)^2 (.25) = (39.2)^2 (.25) = 1536.64 (.25) = 384.16, or approximately 384
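The calculation above can be reproduced in a few lines. This is a minimal sketch; the function name `sample_size` and the default of p = 0.5 (which yields the .25 factor used above) are assumptions for illustration.

```python
def sample_size(z, m, p=0.5):
    """Sample size for a simple random sample: (z/m)^2 * p * (1 - p)."""
    return (z / m) ** 2 * p * (1 - p)

# 95% confidence level (z = 1.96), margin of error of 5 percentage points.
n = sample_size(1.96, 0.05)
print(f"unrounded n = {n:.2f}, reported as {round(n)}")
```

Passing a different z value or margin of error (for example `sample_size(1.645, 0.07)`) shows how quickly the required sample shrinks as the precision requirement is relaxed. Some texts round the result up rather than to the nearest integer, so that the stated precision is never undershot.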
Thus, without using the finite population correction factor (FPC), the sample for PIs1 = 384. The sample size equation solving for n' (the new sample size) when taking the FPC into account is n' = n/(1 + n/N), where n is the sample size based on the calculations above and N is the population size.
Calculating the new sample size for PIs1 using the formula above, we find a new sample size of 375.37.
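The finite population correction above can be sketched as a small function. The population size of 10,000 used here is a hypothetical value chosen only to show the effect; it is not the PIs1 population size, so the result differs from the figure quoted above.

```python
def fpc_adjusted(n, N):
    """Apply the finite population correction: n / (1 + n / N)."""
    return n / (1 + n / N)

n = 384                 # sample size before correction, from the earlier step
N_hypothetical = 10_000  # assumed population size, for illustration only
adjusted = fpc_adjusted(n, N_hypothetical)
print(f"corrected sample size for N = {N_hypothetical}: {adjusted:.2f}")
```

The correction always reduces the sample size, and the reduction fades as the population grows: for very large N the corrected value approaches the uncorrected n.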
WIKIPEDIA