Advertising 3
BBA-041
ADVERTISING FUNDAMENTAL
BLOCK 4: APPLICATION IN RESEARCH METHODOLOGY
In collaboration with
8/19/2012 12:08 AM
SYLLABUS
Unit 2 - Non-Parametric Tests: Objective, Introduction, Non-Parametric Tests, Important Types of
Non-Parametric Tests, Non-Parametric vs. Distribution-Free Tests, Standard Uses of Non-Parametric
Tests, Tests for Ordinal Data, The Null Hypothesis.
Unit 3 - Chi-Square Test: Objective, Introduction, Key Definitions, Properties of the Chi-Square,
Chi-Square Probabilities, Degrees of Freedom Which Aren't in the Table, Uses of the Chi-Square
Test, Applications of the Chi-Square Test.
Unit 4 - Analysis of Variance (ANOVA): Objectives, Introduction, Logic behind ANOVA, Applications of
ANOVA, Dependent and Independent Variables.
Advertising Fundamental
Application in Research Methodology
Unit 1: Testing of Hypothesis - Large and Small Samples
Unit 2: Non-Parametric Tests
Unit 3: Chi-Square Test
Unit 4: Analysis of Variance (ANOVA)
1.0 Objective
After studying this unit you will be able to:
Describe testing of hypothesis
Discuss sample proportions
Explain tests for the differences between two samples.
1.1 Introduction
Let us begin with the principles of hypothesis testing, where we compare a sample mean with a
hypothesized or population mean. We can also apply these principles to testing for differences
between the proportion of occurrences in a sample and a hypothesized level of occurrence.
Theoretically the binomial distribution is the correct distribution to use when dealing with
proportions. As sample size increases, however, the binomial distribution approaches the normal
distribution in its characteristics, so we can use the normal distribution to approximate the
sampling distribution. To do this we need to satisfy the following conditions:
np > 5 and nq > 5
where p is the proportion of successes and q = 1 - p is the proportion of failures. The standard
error of the proportion, σp = √(pq/n), is calculated using the hypothesized population
proportion.
A ketchup manufacturer is in the process of deciding whether to produce a new extra spicy
ketchup. The company's marketing research department ran a national telephone survey of
6,000 households and found that 335 would purchase extra spicy ketchup. A much more
extensive study made two years ago showed that 5% of households would purchase the brand
then. At a two percent level of significance, should the company conclude that there is an
increased interest in the extra spicy flavour?
The critical z for a one-tailed test at this level is 2.05. Since the observed z > z critical, we reject Ho:
current levels of interest are significantly greater than interest two years ago.
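The computation behind this conclusion (the original worked figures were lost in extraction) can be sketched as a one-tailed z test for a proportion:

```python
import math

# One-tailed z test for a proportion, using the ketchup-survey figures
n, successes, p0 = 6000, 335, 0.05
p_bar = successes / n                      # sample proportion, about 0.0558
se = math.sqrt(p0 * (1 - p0) / n)          # standard error under Ho
z = (p_bar - p0) / se
print(round(z, 2))                          # observed z
print(z > 2.05)                             # True: falls in the rejection region
```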
Since we will usually be testing for equality between the two population means, the hypothesized
difference is (μ1 - μ2)H0 = 0, since μ1 = μ2 under Ho.
An example will make the process clearer:
A manpower statistician is asked to determine whether hourly wages of semi-skilled labour are
the same in two cities. The results of the survey are given in the table below. The hypothesis to be
tested, at the .05 level of significance, is that there is no significant difference between the hourly
wage rates across the two cities.
Since the standard deviations of the two populations are not known, we estimate σ1 and σ2 by
using the sample standard deviations s1 and s2.
The estimated standard error of the difference between the two means is then computed.
We can mark the standardized difference on a sketch of the sampling distribution and compare it
with the critical value of z = 1.96, as in Figure 7. As we can see, the calculated z lies outside the
acceptance region. Therefore we reject the null hypothesis.
Example 2
1. Two independent samples of observations were collected. For the first sample of 60 elements,
the mean was 86 and the standard deviation 6. The second sample of 75 elements had a mean
of 82 and a standard deviation of 9.
a. Compute the estimated standard error of the difference between the two means.
b. Using a = 0.01, test whether the two samples can reasonably be considered to have come from
populations with the same mean.
Since 3.09 > 2.58, we reject Ho; it is reasonable to conclude that the two samples come from
populations with different means.
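The arithmetic for Example 2 can be sketched as follows (a two-sample z test, since both samples are large):

```python
import math

# Example 2: two independent large samples
n1, x1, s1 = 60, 86, 6
n2, x2, s2 = 75, 82, 9
se = math.sqrt(s1**2 / n1 + s2**2 / n2)    # estimated SE of the difference
z = (x1 - x2) / se
print(round(se, 3))   # part (a): estimated standard error
print(round(z, 2))    # part (b): observed z, compared with 2.58 at alpha = 0.01
```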
The estimated standard error of the difference between two sample proportions is:
SE = √(p̂q̂(1/n1 + 1/n2)), where p̂ is the pooled proportion of successes and q̂ = 1 - p̂.
The critical z for the .05 level of significance is 1.96. Since the observed z is less than z critical, we
accept Ho. This is shown in Figure 3 below:
Example 3:
For tax purposes a city government uses two methods of listing property. One requires the
property owner to appear in person before a tax lister. The second allows the form to be
mailed. The manager thinks the personal appearance method leads to fewer mistakes. She
authorizes an examination of 50 personal appearances and 75 mailed forms. The results show that
10% of personal appearances had errors whereas 13.3% of mailed forms had errors. The
manager wants to test, at the .15 level of significance, the hypothesis that the personal
appearance method produces fewer errors. This is a one-tailed test, and the procedure is the
same as for carrying out a one-tailed test comparing sample means. The data are as follows:
p1 = .1, q1 = .9, n1 = 50
p2 = .133, q2 = .867, n2 = 75
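A sketch of the two-proportion computation with these figures (the conclusion follows from this arithmetic; the source's own worked solution is on a page lost in extraction):

```python
from math import sqrt
from statistics import NormalDist

# Two-sample proportion test: tax-listing data
p1, n1 = 0.10, 50          # personal appearances
p2, n2 = 0.133, 75         # mailed forms
p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)          # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
z_crit = NormalDist().inv_cdf(0.15)               # one-tailed, alpha = .15
print(round(z, 2), round(z_crit, 2))              # observed vs critical z
print(z > z_crit)      # True: observed z is not in the left-tail rejection region
```

On this arithmetic the observed difference is not significant at the .15 level, so Ho is not rejected.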
Point C on the power curve in Figure 2b corresponds to a population mean dosage of 99.42 cc. Given that
the population mean is 99.42 cc, we must compute the probability that the mean of a random
sample of 50 doses from this population will be less than 99.64 cc (the point below which we
decided to reject the null hypothesis). This is shown in Figure 2c.
We had computed the standard error of the mean to be 0.2829 cc. So 99.64 cc is
(99.64 - 99.42)/0.2829 = 0.78 standard errors above the true population mean when μ = 99.42 cc.
The probability of observing a sample mean less than 99.64 cc, and thus rejecting the null
hypothesis, is 0.7823 when the true population mean is μ = 99.42 cc. This is given by the colored
area in Figure 9c. Thus, the power of the test (1 - β) at μ = 99.42 is 0.7823. This simply means
that if μ = 99.42, the probability that this test will reject the null hypothesis when it is false is
0.7823.
Point D in Figure 9b corresponds to a population mean dosage of 99.61 cc. We then ask: what is
the probability that the mean of a random sample of 50 doses from this population will be less
than 99.64 cc, causing the test to reject the null hypothesis?
This is illustrated in Figure 2d. Here we see that 99.64 is (99.64 - 99.61)/0.2829, or 0.11, standard
errors above 99.61 cc. The probability of observing a sample mean less than 99.64 cc, and thus
rejecting the null hypothesis, is 0.5438, the colored area in Figure 9d. Thus, the power of the test
(1 - β) at μ = 99.61 cc is 0.5438.
Using the same procedure at point E, we find the power of the test at μ = 99.80 cc is 0.2843.
As we can see, the values of (1 - β) continue to decrease to the right of point E. This is because as
the population mean gets closer and closer to 100.00 cc, the power of the test (1 - β) gets closer
and closer to the probability of rejecting the null hypothesis when the population mean is exactly
100.00 cc. This probability is nothing but the significance level of the test, which in this case is
0.10. The curve terminates at point F, which lies at a height of 0.10 directly over the population
mean.
What does the power curve in Figure 2b tell us?
As the shipment becomes less satisfactory (as the doses in the shipment become smaller), our
test is more powerful (it has a greater probability of recognizing that the shipment is
unsatisfactory). It also shows us, however, that because of sampling error, when the dosage is
only slightly less than 100.00 cc, the power of the test to recognize this situation is quite low.
Thus, if having any dosage below 100.00 cc is completely unsatisfactory, the test we have been
discussing would not be appropriate.
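The whole power curve can be traced with the same calculation used at points C, D, and E; a short sketch (rounding z to two decimals, as the text does):

```python
from statistics import NormalDist

# Power of the test: probability the sample mean falls below the 99.64 cc
# cut-off, for various true population means (SE = 0.2829 cc, n = 50)
SE, cutoff = 0.2829, 99.64

def power(mu):
    z = round((cutoff - mu) / SE, 2)       # standardize the cut-off, 2 d.p.
    return round(NormalDist().cdf(z), 4)   # P(sample mean < cut-off)

print(power(99.42))   # point C
print(power(99.61))   # point D
print(power(99.80))   # point E
```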
to include the same area under the curve. Thus interval widths are much wider for a t
distribution.
2. There is a different t distribution for every possible sample size.
3. As sample size increases, the shape of the t distribution loses its flatness and becomes
approximately equal to the normal distribution. In fact, for sample sizes greater than 30, the t
distribution is close enough to the normal distribution that we can use the normal distribution
directly.
The normal tables focus on the chance that the sample statistic lies within a given number of
standard deviations on either side of the population mean. The t distribution tables, on the other
hand, measure the chance that the observed sample statistic will lie outside our confidence
interval, defined by a given number of standard errors on either side of the mean. A t value of
1.771 shows that if we mark off plus and minus 1.771 standard errors on either side of the mean,
then we enclose 90% of the area under the curve. The area outside these limits, i.e., that of
chance error, will be 10%. This is shown in Figure 2 below. Thus if we are making an estimate at
the 90% confidence level we would look in the t tables under the .1 column (1.0 - .9 = .1). This is
α, the probability of error.
The t test is the appropriate test to use when population standard deviation is not known and has
to be estimated by the sample standard deviation.
This represents the basic t test. Variants of this formula are developed to meet the requirements
of different testing situations. We shall look at more common types of problems briefly. As the
theoretical basis of hypothesis is the same as the normal distribution and has been dealt with in
detail in the last chapter, we shall focus on applications of the t test to various situations.
1. Hypothesis testing of means
The t test is used when:
1. the sample size is < 30, or
2. the population standard deviation is not known and has to be estimated by the sample
standard deviation.
3. When a population is finite and the sample accounts for more than 5% of the population, we
use the finite population multiplier and the formula for the standard error is modified
accordingly.
Exercise
Find the one-tail t value for n = 13, α = .05, degrees of freedom = 12.
For a one-tailed test we look up the value under the .10 column: t = 1.782.
Find one-tail t values for the following:
n = 10, α = .01
n = 15, α = .05
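The same table lookups can be reproduced in software; a sketch assuming SciPy is available (`scipy.stats.t.ppf` takes the cumulative probability and the degrees of freedom):

```python
# One-tail t critical values, equivalent to reading the printed t table
from scipy.stats import t

def one_tail_t(alpha, df):
    return t.ppf(1 - alpha, df)

print(round(one_tail_t(0.05, 12), 3))   # 1.782, as read from the .10 column
print(round(one_tail_t(0.01, 9), 3))    # n = 10, so df = 9
print(round(one_tail_t(0.05, 14), 3))   # n = 15, so df = 14
```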
The specification of the null and alternative hypotheses is similar to the normal distribution case:
Ho: μ = μo
Ha: μ ≠ μo
This is tested at a prespecified level of significance. The t statistic is computed and the calculated
t value is compared with the table t value. If t calculated < t critical, we accept the null hypothesis
that there is no significant difference between the sample mean and the hypothesized population
mean. If the calculated t value > t critical, we reject the null hypothesis at the given level of
significance.
An example shall make the process clearer:
A personnel specialist in a corporation is recruiting a large number of employees for an overseas
assignment. She believes the aptitude scores are likely to be 90. A management review finds the
mean score for 20 test results to be 84, with a standard deviation of 11. Management wishes to
test the hypothesis, at the .10 level of significance, that the average aptitude score is 90.
Our data are as follows:
Therefore, since -2.44 < -1.729, we reject the personnel manager's hypothesis that the true mean
score of employees being tested is 90. This is also illustrated diagrammatically in Figure 3.
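The t statistic for this example comes out as follows (a one-sample t test on the aptitude figures):

```python
import math

# One-sample t test for the aptitude-score example
n, x_bar, s, mu0 = 20, 84, 11, 90
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
print(round(t_stat, 2))     # observed t
print(t_stat < -1.729)      # True: in the rejection region (df = 19, .10 level)
```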
Exercises
Given a sample mean of 94.3, a sample standard deviation of 12.5 and a sample size of 22, test
the hypothesis that the value of the population mean is 70 against the alternative hypothesis
that it is more than 100. Use the 0.025 significance level.
If a sample of 25 observations reveals a sample mean of 52 and a sample variance of 4.2, test
the hypothesis that the population mean is 05 against the alternative hypothesis that it is some
other value. Use the .01 level of significance.
Picosoft, Ltd., a supplier of operating system software for personal computers, was planning
the initial public offering of its stock in order to raise sufficient working capital to finance the
development of a new seventh-generation integrated system. With current earnings of $1.61 a
share, Picosoft and its underwriters were contemplating an offering price of $21, or about 13
times earnings. In order to check the appropriateness of this price, they randomly chose seven
publicly traded software firms and found that their average price/earnings ratio was 11.6, and
the sample standard deviation was 1.3. At α = .02, can Picosoft conclude that the stocks of
publicly traded software firms have an average P/E ratio that is significantly different from 13?
The data-processing department at a large life insurance company has installed new color
video display terminals to replace the monochrome units it previously used. The 95 operators
trained to use the new machines averaged 7.2 hours before achieving a satisfactory level of
performance. Their sample variance was 16.2 squared hours. Long experience with operators
on the old monochrome terminals showed that they averaged 8.1 hours on the machines
before their performances were satisfactory. At the 0.01 significance level, should the
supervisor of the department conclude that the new terminals are easier to learn to operate?
are not significantly different from each other is the same as for the large sample case. The
differences are, first, in the calculation of the standard error and, secondly, in the calculation of
the degrees of freedom.
1.15.1 Degrees of Freedom
In the earlier case, where we tested the sample against a hypothesized population value, we used
a t distribution with n - 1 degrees of freedom. In this case we have n1 - 1 degrees of freedom for
sample 1 and n2 - 1 for sample 2. When we combine the samples to estimate the pooled variance
we have n1 + n2 - 2 degrees of freedom. Thus, for example, if n1 = 10 and n2 = 12, the combined
degrees of freedom = 10 + 12 - 2 = 20.
Estimation of the Standard Error of the Difference Between Two Means
In large samples we had assumed the unknown population variances were equal and estimated
them from the sample variances directly. This is not appropriate for small samples. Assuming the
underlying population variances are equal (σ1² = σ2²), we estimate the common population
variance as a weighted average of s1² and s2², where the weights are the numbers of degrees of
freedom in each sample:
sp² = [(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2)
Once we have our estimate of the population variance, we can use it to determine the estimated
standard error of the difference between the two sample means:
SE = sp √(1/n1 + 1/n2)
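The pooled calculation can be sketched as below. The sample sizes n1 = 10 and n2 = 12 reproduce the 20 degrees of freedom from the text; the sample standard deviations are made up for illustration.

```python
import math

def pooled_se(n1, s1, n2, s2):
    """Pooled estimate of the common population variance, then the
    standard error of the difference between two sample means."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)

# Illustrative figures: hypothetical s1 = 4.0, s2 = 5.0
print(round(pooled_se(10, 4.0, 12, 5.0), 3))
```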
Twelve managers were observed for the first method and 15 for the second. The sample data are
as follows:
We then calculate the t statistic for the difference between the two means.
Since it is a one-tailed test at the .05 level of significance, we look in the .1 column against 25
degrees of freedom.
t critical at the .05 level of significance = 1.708
Since calculated t < t critical, we accept the null hypothesis that there is no significant difference
between the two methods; the first method is not significantly superior to the second.
1.16 Summary
In this unit we compared a sample statistic against a hypothesized population value, and then
turned to the case where we wish to compare the parameters of two different populations and
determine whether they differ from each other. In the two-population case we are not really
interested in the actual values of the two parameters but in the relation between them, i.e.
whether there is a significant difference between them.
1.17 Questions
1. What is Degree of Freedom?
1.18 References
Boyd, Westfall, and Stasch, Marketing Research: Text and Cases, All India Traveller Bookseller, New Delhi.
Brown, F.E., Marketing Research: A Structure for Decision Making, Addison-Wesley Publishing Company.
Kothari, C.R., Research Methodology: Methods and Techniques, Wiley Eastern Ltd.
Stockton and Clark, Introduction to Business and Economic Statistics, D.B. Taraporevala Sons and Co. Private Limited, Bombay.
Dunn, Olive Jean, and Virginia A. Clark, Applied Statistics, John Wiley and Sons.
Green, Paul E., and Donald S. Tull, Research for Marketing Decisions, Prentice Hall of India, New Delhi.
2.0 Objective
After studying this unit you will be able to:
Define Non-Parametric Tests
Discuss Important Types of Non-Parametric Tests
Explain Non-Parametric vs. Distribution-Free Tests
2.1 Introduction
So far we have discussed a variety of tests that make inferences about a population parameter,
such as the mean or the population proportion. These are termed
parametric tests and use parametric statistics from samples that come from the population being
tested. To use these tests we make several restrictive assumptions about the populations from
which we drew our samples. For example we assumed that our underlying population is normal.
However underlying populations are not always normal and in these situations we need to use
tests, which are not parametric.
Many such tests have been developed which do not make restrictive assumptions about the
shape of the population distribution. These are known as non-parametric or distribution free tests.
There are many such tests, we shall learn about some of the more popular ones.
We can replace values such as 113.45, 189.42, 76.5, and 101.79 by their ascending or descending
ranks, i.e. by 1, 2, 3, and 4. However, if we represent 189.42 by the rank 4, we lose some of the
information contained in the value 189.42. 189.42 is the largest value and is represented by the
highest rank; but that rank could equally represent 1189.42, as that would also be the largest
value. Therefore the use of ranked data leads to some loss of information.
The second disadvantage is that these tests are not as sharp or efficient as parametric tests. The
estimate of an interval using a non-parametric test may be twice as large as for the parametric
case.
When we use nonparametric tests we trade off sharpness in estimation with the ability to make do
with less information and to calculate faster.
What happens when we use the wrong test in the wrong situation?
Generally, parametric tests are more powerful than non-parametric tests; where a parametric test
is valid, the non-parametric method will have a greater probability of committing a Type II error,
i.e. accepting a false null hypothesis.
We use the Mann-Whitney rank test as a non-parametric alternative to Student's t-test when we
do not have normally distributed data.
Wilcoxon
To be used with two related (i.e., matched or repeated) groups (analogous to the dependent
samples t-test)
Kruskal-Wallis
To be used with two or more independent groups (analogous to the single-factor between-subjects
ANOVA)
Friedman
To be used with two or more related groups (analogous to the single-factor within-subjects
ANOVA)
We now look at a few of the non-parametric tests in more detail, including their applications.
McNemar Test
This test is used for analyzing research designs of the before and after format where the data are
measured nominally. The samples therefore become dependent or related samples. The use of
this test is limited to the case where a 2x2 contingency table is involved. The test is most
popularly used to test response to a pre and post situation of a control group.
We can illustrate its use with an example:
A survey of 260 consumers was taken to test the effectiveness of mailed coupons and their effect
on individuals who changed their purchase rate for the product. The researcher took a random
sample of consumers before the release of the coupons to assess their purchase rate. On the
basis of their responses they were divided into groups according to their purchase rate (low,
high). After the campaign they were again asked to complete the forms and were again classified
by their purchase rate. Table 1 shows the results from our sample.
Cases that showed a change between the before and after measurements in terms of their
purchase response were placed in cells A and D. This was done as follows:
An individual is placed in cell A if he or she changed from a low purchase rate to a high purchase
rate.
Similarly, he or she is placed in cell D on changing from a high to a low rate.
If no change is observed, he or she is placed in cell B or C.
The researcher wishes to determine if the mail coupon campaign was a success. We shall now
briefly outline the various steps involved in applying the McNemar test.
Step 1
We state the null hypothesis. This essentially states that there is no perceptible or significant
change in the purchase behaviour of individuals. For individuals who change their purchase rate,
this means that the probability of changing from high to low equals that of changing from low to
high, each being equal to .5.
Ho: P(A) = P(D)
Ha: P(A) ≠ P(D)
To test the null hypothesis we examine the cases of change, cells A and D.
Step 2
The level of significance is chosen, for example α = .05.
Step 3
We now have to decide on the appropriate test to be used. The McNemar test is appropriate
because the study is a before-and-after study and the data are nominal; the study involves two
related samples.
The McNemar test involves calculating a chi-square value as given by the formula below:
χ² = (A - D)² / (A + D)
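A sketch of the McNemar computation on the standard χ² = (A - D)²/(A + D) form (some texts apply a continuity correction, (|A - D| - 1)²/(A + D); the cell counts below are made up for illustration, since the unit's Table 1 was lost in extraction):

```python
# McNemar chi-square for the two "changed" cells of a 2x2 before/after table
def mcnemar_chi2(A, D):
    return (A - D) ** 2 / (A + D)

# Hypothetical counts: 40 low->high changers, 20 high->low changers
chi2 = mcnemar_chi2(40, 20)
print(round(chi2, 2))        # compare with 3.84 (df = 1, alpha = .05)
print(chi2 > 3.84)           # True: reject Ho of equal change probabilities
```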
The Mann-Whitney test is used because the data are ordinal and converted into ranks, and
because the samples are independent.
The formula for the Mann-Whitney U value is:
U1 = n1n2 + n1(n1 + 1)/2 - R1
where n1 and n2 are the two sample sizes and R1 and R2 are the sums of the ranks for each
group. Letting regular accounts be sample 1 and the commercial accounts be sample 2, we
find that
n1 = 15, n2 = 15, R1 = 198, R2 = 267
The critical value for the statistic U* is found in Appendix I. For α = 0.05, n1 = 15 and n2 = 15, the
critical value for the Mann-Whitney statistic is U* = 64 for a two-tailed test. For this test, the null
hypothesis will be rejected if the computed value, U, is 64 or less. Otherwise, it will not be
rejected. This decision rule is just the opposite of the decision-making procedure we followed for
most of the other tests of significance.
Calculation of the Test Statistic
Therefore,
U1 = (15)(15) + 15(15 + 1)/2 - 198 = 225 + 120 - 198 = 147
Solution
Step 1. The Null Hypothesis
The null hypothesis to be tested is that there is no difference in awareness of services offered
after the ad campaign. The alternative hypothesis is that there was a greater awareness of the
services after the ad campaign.
Step 2. The Level of Significance
It was decided that α = 0.05.
Step 3. The Statistical Test
The Wilcoxon test is appropriate because the study is of related samples in which the data
measured are ordinal and the differences can be ranked in magnitude. The test statistic
calculated is the T value. Since the direction of the difference is predicted, a one-tailed test is
appropriate.
Step 4. The Decision Rule
The critical value of the Wilcoxon T, found in Appendix J for n = 10 at the 0.05 level of significance
for a one-tailed test, is 10. This indicates that a computed T value of less than 10, the critical
value, rejects the null hypothesis. The argument is similar to that of the Mann-Whitney U
statistic.
Step 5. Calculate the Test Statistic
The procedure for the test is very simple. The signed difference between each pair of
observations is found. Then these differences are rank-ordered without regard to their algebraic
sign. Finally, the sign of the difference is attached to the rank for that difference. The test
statistic, T, is the smaller of the two sums of ranks. For our example, T = 6.5, since the smaller
sum is associated with the negative differences.
If the null hypothesis is true, the sums of the positive and negative ranks should be approximately
equal. However, the larger the difference between the underlying populations, the smaller will be
the value of T, since it is defined as the smaller of the two rank sums.
Step 6. We draw a statistical conclusion: since the computed T value of 6.5 is less than the critical
T value of 10, the null hypothesis, which states that there is no difference in the awareness of
bank services, is rejected.
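The Step 5 procedure can be sketched as a small function (the before/after differences below are made up for illustration; the unit's actual data are in its table):

```python
def wilcoxon_T(diffs):
    """Signed-rank T: rank |d| (averaging ranks for ties), then return
    the smaller of the positive-rank and negative-rank sums."""
    d = [x for x in diffs if x != 0]               # zero differences are dropped
    by_abs = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):                              # average the ranks of tied |d|
        j = i
        while j + 1 < len(d) and abs(d[by_abs[j + 1]]) == abs(d[by_abs[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[by_abs[k]] = avg
        i = j + 1
    pos = sum(r for r, x in zip(ranks, d) if x > 0)
    neg = sum(r for r, x in zip(ranks, d) if x < 0)
    return min(pos, neg)

# Hypothetical differences for ten respondents
print(wilcoxon_T([4, -1, 6, 2, -3, 5, 8, 7, 9, 10]))   # 4.0
```

Here the negative differences carry ranks 1 and 3, so T = 4.0, which would fall below the critical value of 10 for n = 10.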
2.8 Summary
The main advantage of non-parametric methods is that they do not require that the underlying
population have a normal or any other particular distribution. Their main disadvantages are some
loss of information through ranking and lower power compared with the corresponding
parametric tests.
2.9 Keywords
McNemar Test
This test is used for analyzing research designs of the before and after format where the data are
measured nominally.
Kolmogorov-Smirnov One-Sample Test
This test is similar to the chi-square test of goodness of fit.
Signed-Rank or Wilcoxon Test
This test is the complement of the Mann-Whitney U test and is used when ordinal data on two
samples are involved and the two samples are related or dependent.
2.10 Questions
1. What are Non Parametric Tests?
2.11 References
Boyd, Westfall, and Stasch, Marketing Research: Text and Cases, All India Traveller Bookseller, New Delhi.
Brown, F.E., Marketing Research: A Structure for Decision Making, Addison-Wesley Publishing Company.
Kothari, C.R., Research Methodology: Methods and Techniques, Wiley Eastern Ltd.
Stockton and Clark, Introduction to Business and Economic Statistics, D.B. Taraporevala Sons and Co. Private Limited, Bombay.
Dunn, Olive Jean, and Virginia A. Clark, Applied Statistics, John Wiley and Sons.
Green, Paul E., and Donald S. Tull, Research for Marketing Decisions, Prentice Hall of India, New Delhi.
3.0 Objective
After studying this unit you will be able to:
Define Properties of the Chi-Square
Discuss Chi-Square Probabilities
Describe Uses of Chi-square Test
3.1 Introduction
Let us understand the chi-square test for goodness of fit and independence of attributes.
The chi-square distribution is a mathematical distribution that is used directly or indirectly in
many tests of significance. The most common use of the chi-square distribution is to test
differences between proportions. Although this test is by no means the only test based on the
chi-square distribution, it has come to be known as the chi-square test. The chi-square
distribution has one parameter, its degrees of freedom (df). It has a positive skew; the skew is
less with more degrees of freedom. The mean of a chi-square distribution is its df, the mode is
df - 2, and the median is approximately df - 0.7.
You can go with the critical value which is less likely to cause you to reject in error (a Type I
error).
For a right-tail test, this is the critical value further to the right (larger).
For a left-tail test, it is the value further to the left (smaller).
For a two-tail test, it is the value further to the left and the value further to the right. Note: it is
not the column with the degrees of freedom further to the right; it is the critical value which is
further to the right.
The test statistic has a chi-square distribution when the following assumptions are met:
The data are obtained from a random sample.
The expected frequency of each category must be at least 5. This goes back to the requirement
that the sampling distribution be approximately normal. You are approximating a multinomial
experiment (a discrete distribution) with the goodness-of-fit test (a continuous distribution), and
if each expected frequency is at least five then you can use the normal distribution to
approximate it (much as with the binomial).
The following are properties of the goodness-of-fit test
The data are the observed frequencies. This means that there is only one data value for each
category.
The degrees of freedom are one less than the number of categories, not one less than the
sample size.
It is always a right-tail test.
It has a chi-square distribution.
The value of the test statistic doesn't change if the order of the categories is switched.
2. Test for Independence
In the test for independence, the claim is that the row and column variables are independent of
each other.
This is the null hypothesis.
The multiplication rule said that if two events were independent, then the probability of both
occurring was the product of the probabilities of each occurring. This is key to working the test for
independence.
If you end up rejecting the null hypothesis, then the assumption must have been wrong and the
row and column variables are dependent. Remember, all hypothesis testing is done under the
assumption the null hypothesis is true.
The test statistic used is the same as the chi-square goodness-of-fit test. The principle behind the
test for independence is the same as the principle behind the goodness-of-fit test.
The Test for Independence is Always a Right Tail Test
In fact, you can think of the test for independence as a goodness-of-fit test where the data is
arranged into table form. This table is called a contingency table.
The test statistic has a chi-square distribution when the following assumptions are met:
The data are obtained from a random sample.
The expected frequency of each category must be at least 5.
The expected value, E(X), of the first age group is obtained from the formula:
Subsequent expected values are computed by applying the expected 50% death rate (d) for each
succeeding year.
State the null hypothesis.
Calculate the chi-square value.
Knowing that the critical value for 5 degrees of freedom (α = 0.05) is 11.0705, what do you
conclude about the fit between the observed and hypothesized death rates? Are there significant
differences?
Model Selection
Many programs develop predictive equations for data sets. A common test for the fit of a model
to the data is the chi-square goodness-of-fit test. For example, program DISTANCE, which
develops curves to estimate probabilities of detection, uses discrete distance categories (e.g.,
0-5 m, 5-10 m, etc.) to see how well the model predicts the number of objects that should be
seen in each distance category (expected) versus what the data actually show (observed).
Contingency Tables and Tests for Independence of Factors
Survival
Is the probability of being male or female independent of being alive or dead? Let's use data on
ruffed grouse. You collect mortality data from 100 birds you radio-collared and test the following
hypothesis:
Ho: The two sets of attributes (death and sex of bird) are unrelated (independent).
Expected values for each cell can be calculated by multiplying the row total by the column
total and dividing by the grand total.
EX: Expected Value = 70 * 67 / 100 = 46.9
Calculate the chi-square value.
Knowing that the critical value for 1 degree of freedom (alpha = 0.05) is anything greater than
3.84146, what can you conclude about the independence of these 2 factors?
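Only the marginal totals (70 dead birds, 67 males, N = 100) survive in the text above, so the cell counts in this sketch are invented to match those margins; the expected value 70 * 67 / 100 = 46.9 falls out of the same formula.

```python
# Grouse example: only the margins (70 dead, 67 males, N = 100) are given
# in the text, so the individual cell counts here are invented.
#                     male  female
observed = [[50, 20],   # dead   (row total 70)
            [17, 13]]   # alive  (row total 30)

row_totals = [sum(r) for r in observed]          # [70, 30]
col_totals = [sum(c) for c in zip(*observed)]    # [67, 33]
n = sum(row_totals)                              # 100

chi2 = 0.0
for i in range(2):
    for j in range(2):
        exp = row_totals[i] * col_totals[j] / n  # e.g. 70 * 67 / 100 = 46.9
        chi2 += (observed[i][j] - exp) ** 2 / exp

print(round(chi2, 2))    # 2.07
# 2.07 < 3.84146 (critical value, df = 1, alpha = 0.05), so with these
# invented counts we would not reject H0: sex and survival are independent.
```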
II Chi-Square Test for Independence (2-Way Chi-Square)
A large-scale national randomized experiment was conducted in the 1980s to see if daily aspirin
consumption (as compared to an identical but inert placebo) would reduce the rate of heart
attacks. This study (the Physicians' Health Study) was described in one of the episodes of the
statistics video series "Against All Odds." Here are the actual results from the study, using
22,071 doctors who were followed for 5 years:
A chi-square analysis indicated that there was a significant relationship between aspirin condition
and incidence of heart attacks, chi-square (1, N=22,071) = 25.01, p < .001. A greater percentage of
heart attacks occurred for participants taking the placebo (M=1.7%) compared to those taking
aspirin (M=0.9%).
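The reported statistic can be reproduced from the study's published cell counts (104 heart attacks among 11,037 aspirin takers, 189 among 11,034 placebo takers); this is a plain-Python sketch of that calculation.

```python
# Physicians' Health Study cell counts as widely published: 104 of 11,037
# aspirin takers and 189 of 11,034 placebo takers had heart attacks.
observed = [[104, 11037 - 104],    # aspirin: MI, no MI
            [189, 11034 - 189]]    # placebo: MI, no MI

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)                # 22071

# Chi-square statistic: sum of (O - E)^2 / E over the four cells
chi2 = sum((observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
           / (row_totals[i] * col_totals[j] / n)
           for i in range(2) for j in range(2))

print(round(chi2, 2))   # 25.01 -- matches the reported chi-square(1, N=22071)
```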
Application of chi-square test
Product and Process Comparisons
(Note: the numbers in parentheses are the expected cell frequencies.)
Column probabilities: Let pA be the probability that a defect will be of type A. Likewise, define pB,
pC, and pD as the probabilities of observing the other three types of defects. These probabilities,
which are called the column probabilities, satisfy the requirement pA + pB + pC + pD = 1.
Row probabilities: By the same token, let pi (i = 1, 2, or 3) be the row probability that a defect will
have occurred during shift i, where p1 + p2 + p3 = 1.
The column probabilities are estimated by dividing the column totals by the total n, and similarly
the row probabilities p1, p2, and p3 are estimated by dividing the row totals r1, r2, and r3 by the
total n, respectively.
Denote the observed frequency of the cell in row i and column j of the contingency table by nij.
Then the estimated cell frequency for that cell is (row total * column total) / n.
The estimated cell frequencies are shown in parentheses in the contingency table above
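The defect table itself is lost in this extract, so the counts below are invented (three shifts by four defect types); the sketch shows how the row and column probabilities and the estimated cell frequencies are computed from the totals.

```python
# The defect table is not reproduced in the text, so these counts are
# invented: rows = shifts 1-3, columns = defect types A-D.
counts = [[15, 21, 45, 13],
          [26, 31, 34,  5],
          [33, 17, 49, 20]]

row_totals = [sum(r) for r in counts]
col_totals = [sum(c) for c in zip(*counts)]
n = sum(row_totals)

# Column probabilities pA..pD and row probabilities p1..p3 are estimated
# by dividing the column / row totals by n; each set sums to 1.
p_col = [c / n for c in col_totals]
p_row = [r / n for r in row_totals]

# Estimated cell frequency for row i, column j: r_i * c_j / n
estimated = [[row_totals[i] * col_totals[j] / n for j in range(4)]
             for i in range(3)]

print(round(sum(p_col), 10), round(sum(p_row), 10))   # 1.0 1.0
```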
1. Chi-Square (χ²) Analysis - Introduction
Consider the following decision situations:
1. Are all package designs equally preferred?
2. Are all brands equally preferred?
3. Is there any association between income level and brand preference?
4. Is there any association between family size and size of washing machine bought?
5. Are the attributes educational background and type of job chosen independent?
The answers to these questions require the help of Chi-Square (χ²) analysis. The first two
questions can be unfolded using the Chi-Square test of goodness of fit for a single variable, while
the solution to questions 3, 4, and 5 needs the Chi-Square test of independence in a contingency
table. Please note that the variables involved in Chi-Square analysis are nominally scaled.
Nominal data are also known by two names: categorical data and attribute data. The symbol χ²
used here denotes the chi-square distribution, whose value depends upon the number of degrees
of freedom (d.f.). As we know, the chi-square distribution is a skewed distribution, particularly with
smaller d.f. As the sample size and therefore the d.f. increase and become large, the χ²
distribution approaches normality.
χ² tests are nonparametric or distribution-free in nature. This means that no assumption needs to
be made about the form of the original population distribution from which the samples are drawn.
Please note that all parametric tests make the assumption that the samples are drawn from a
normal population.
Do the consumer preferences for package colors show any significant difference?
Solution
If you look at the data, you may be tempted to infer that Blue is the most preferred color.
Statistically, you have to find out whether this preference could have arisen due to chance. The
appropriate test statistic is the χ² test of goodness of fit.
Null Hypothesis: All colors are equally preferred.
Alternative Hypothesis: They are not equally preferred.
Please note that under the null hypothesis of equal preference for all colors being true, the
expected frequencies for all the colors will be equal to 80. Now apply the chi-square
goodness-of-fit formula.
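The observed color counts are not reproduced above; the sketch below invents four colors observed 320 times in total, so that each expected frequency is the stated 80. The critical value 7.8147 for df = 3, alpha = 0.05 comes from a standard chi-square table.

```python
# The observed color counts are lost in this extract; these are invented,
# keeping the stated expected frequency of 80 per color (n = 320, 4 colors).
observed = {"Blue": 110, "Red": 70, "Green": 75, "Yellow": 65}
expected = 80

# Goodness-of-fit statistic: sum of (O - E)^2 / E over the colors
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
print(chi2)    # 15.625

# df = 4 - 1 = 3; the critical value at alpha = 0.05 is 7.8147.  Since
# 15.625 > 7.8147, reject H0: the colors are not equally preferred.
```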
3.8 Summary
A number of marketing problems involve decision situations in which it is important for a
marketing manager to know whether the pattern of frequencies that are observed fits well with the
expected one. In consumer marketing, a common problem that any marketing manager faces is
the selection of appropriate colors for package design.
3.9 Keywords
Chi-square Distribution: A distribution obtained by multiplying the ratio of sample variance
to population variance by the degrees of freedom when random samples are selected from a
normally distributed population.
Contingency Table: Data arranged in table form for the chi-square independence test.
Expected Frequency: The frequencies obtained by calculation.
Goodness-of-fit Test: A test to see if a sample comes from a population with the given
distribution.
Independence Test: A test to see if the row and column variables are independent.
Observed Frequency: The frequencies obtained by observation. These are the sample
frequencies.
3.10 Questions
1. What are the properties of the chi-square distribution?
3.11 References
Boyd, Westfall, and Stasch, Marketing Research: Text and Cases, All India Traveller Bookseller, New Delhi.
Brown, F.E., Marketing Research: A Structure for Decision Making, Addison-Wesley Publishing Company.
Kothari, C.R., Research Methodology: Methods and Techniques, Wiley Eastern Ltd.
4.0 Objectives
4.1 Introduction
4.2 Logic behind ANOVA
4.3 Applications of ANOVA
4.4 Dependent and Independent Variables
4.5 Summary
4.6 Keywords
4.7 Questions
4.8 References
4.0 Objectives
After studying this unit you will be able to:
Explain the logic behind ANOVA
Discuss applications of ANOVA
Describe dependent and independent variables.
4.1 Introduction
Let us understand that the tests we have learned up to this point allow us to test hypotheses that
examine the difference between only two means.
Analysis of Variance, or ANOVA, will allow us to test the difference between two or more means.
ANOVA does this by examining the ratio of variability between conditions and variability
within each condition.
A t-test would compare the likelihood of observing the difference in the mean number of words
recalled for each group.
An ANOVA test, on the other hand, would compare the variability that we observe between the
conditions to the variability observed within each condition. Recall that we measure variability
as the sum of squared differences of each score from the mean. When we actually calculate an
ANOVA we will use a short-cut formula.
Thus, when the variability that we predict (between the groups) is much greater than the
variability we don't predict (within each group), we will conclude that our treatments
produce different results.
Xij = the ith observation in the jth group
nj = the number of observations in group j
n = the total number of observations in all groups combined
c = the number of groups or levels
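Using this notation, a one-way ANOVA can be computed directly; the three groups below are invented for illustration.

```python
# Minimal one-way ANOVA sketch using the notation above (Xij, nj, n, c).
# The three sample groups are invented for illustration.

groups = [[6, 8, 4, 5, 3, 4],
          [8, 12, 9, 11, 6, 8],
          [13, 9, 11, 8, 7, 12]]

c = len(groups)                       # number of groups
n = sum(len(g) for g in groups)       # total observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-group sum of squares: nj * (group mean - grand mean)^2
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: (Xij - group mean)^2
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# F = MS-between / MS-within, with df = c-1 between and n-c within
f_stat = (ssb / (c - 1)) / (ssw / (n - c))
print(round(f_stat, 2))               # 9.26
```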
Having computed SSB and SSW, construct the ANOVA table for this numerical example by
plugging the results of your computation into the ANOVA table.
Conclusion: There is not enough evidence to reject the null hypothesis Ho.
Now, getting back to our numerical example, we notice that, given the test conclusion and the
ANOVA test's conditions, we may conclude that these three populations are in fact the same
population.
Therefore, the ANOVA technique could be used as a measuring tool and statistical routine for
quality control, as described below using our numerical example.
Construction of the Control Chart for the Sample Means:
Under the null hypothesis, the ANOVA concludes that μ1 = μ2 = μ3; that is, we have a hypothetical
parent population.
The question is, what is its variance? The estimated variance is 36/14 ≈ 2.57.
Thus, the estimated standard deviation is 1.60, and the estimated standard deviation for the
means is 1.6/√5 ≈ 0.71.
Under the conditions of ANOVA, we can construct a control chart with the warning limits
= 3 ± 2(0.71) and the action limits = 3 ± 3(0.71). The following figure depicts the control chart.
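The arithmetic behind these limits can be checked in a few lines; SST = 36 on 14 d.f. and 5 observations per sample are taken from the text above, and the grand mean is read as 3 from the limit expressions.

```python
# Reconstructing the control-chart arithmetic: SST = 36 on 14 df,
# 5 observations per sample, grand mean taken as 3 (as the text implies).
import math

variance = 36 / 14                    # ~2.57
sd = math.sqrt(variance)              # ~1.60
sd_means = sd / math.sqrt(5)          # ~0.72, quoted as 0.71 in the text

warning = (3 - 2 * sd_means, 3 + 2 * sd_means)   # grand mean +/- 2 sigma
action = (3 - 3 * sd_means, 3 + 3 * sd_means)    # grand mean +/- 3 sigma
print(round(sd, 2), round(sd_means, 2))          # 1.6 0.72
```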
Example of How an ANOVA Should Be Written Up: Check of Assumption of Equal Variances: H0:
The variances are equal. H1: The variances are not equal.
Homogeneity of Variance
Bartlett's Test (normal distribution)
Test Statistic: 0.033
P-Value: 0.984
Since the p-value (0.984) > 0.05, we do not reject H0. Therefore, the equal-variance assumption
is met. Check for Differences Among the Means:
H0: all μj are equal
H1: at least one μj is not equal
ANOVA
α = 0.05
If the p-value < α, reject H0.
If the p-value ≥ α, do not reject H0.
One-way Analysis of Variance
Analysis of Variance for time
The means for the two groups are quite different (2 and 6, respectively).
The sums of squares within each group are equal to 2. Adding them together, we get 4.
If we now repeat these computations, ignoring group membership, that is, if we compute the total
SS based on the overall mean, we get the number 28. In other words, computing the variance
(sums of squares) based on the within-group variability yields a much smaller estimate of
variance than computing it based on the total variability (the overall mean). The reason for this in
the above example is of course that there is a large difference between means, and it is this
difference that accounts for the difference in the SS.
In fact, if we were to perform an ANOVA on the above data, we would get the following result:
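The ANOVA output table is not reproduced in this extract, but one data set consistent with the numbers above (group means 2 and 6, within-group SS of 2 each, total SS of 28) lets us finish the computation.

```python
# One data set consistent with the figures in the text: two groups with
# means 2 and 6, within-group SS of 2 each, and total SS of 28.
g1, g2 = [1, 2, 3], [5, 6, 7]

def ss(values, mean):
    """Sum of squared deviations of values from the given mean."""
    return sum((x - mean) ** 2 for x in values)

ss_within = ss(g1, 2) + ss(g2, 6)            # 2 + 2 = 4
grand_mean = (sum(g1) + sum(g2)) / 6         # overall mean = 4.0
ss_total = ss(g1 + g2, grand_mean)           # 28.0
ss_between = ss_total - ss_within            # 24.0

# F = MS-between / MS-within, with df = 1 between and 4 within
f_stat = (ss_between / 1) / (ss_within / 4)
print(ss_within, ss_total, f_stat)           # 4 28.0 24.0
```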
Multiple Factors
The world is complex and multivariate in nature, and instances when a single variable completely
explains a phenomenon are rare.
For example, when trying to explore how to grow a bigger tomato, we would need to consider
factors that have to do with the plant's genetic makeup, soil conditions, lighting, temperature, etc.
Thus, in a typical experiment, many factors are taken into account. One important reason for
using ANOVA methods rather than multiple two-group studies analyzed via t tests is that the
former method is more efficient, and with fewer observations we can gain more information.
Let us expand on this statement.
Controlling for Factors
Suppose that in the above two-group example we introduce another grouping factor, for example,
Gender. Imagine that in each group we have 3 males and 3 females. We could summarize this
design in a 2 by 2 table:
Before performing any computations, it appears that we can partition the total variance into at
least 3 sources: 1. error (within-group) variability, 2. variability due to experimental group
membership, and 3. variability due to gender.
(Note that there is an additional source, interaction, that we will discuss shortly.)
What would have happened had we not included gender as a factor in the study
but rather computed a simple t test? If you compute the SS ignoring the gender factor (use the
within-group means ignoring or collapsing across gender; the result is SS=10+10=20), you will
see that the resulting within-group SS is larger than it is when we include gender (use the
within-group, within-gender means to compute those SS; they will be equal to 2 in each group,
thus the combined SS-within is equal to 2+2+2+2=8).
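A data set consistent with these figures makes the partition concrete; the six scores per experimental group below are invented to reproduce the stated sums of squares.

```python
# A data set consistent with the text (3 males and 3 females per
# experimental group); the values are invented to match the stated SS.
cells = {("exp1", "m"): [0, 1, 2], ("exp1", "f"): [2, 3, 4],
         ("exp2", "m"): [4, 5, 6], ("exp2", "f"): [6, 7, 8]}

def ss(values):
    """Sum of squared deviations of values from their own mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

# Ignoring gender: pool each experimental group (means 2 and 6).
ss_ignoring = ss(cells[("exp1", "m")] + cells[("exp1", "f")]) \
            + ss(cells[("exp2", "m")] + cells[("exp2", "f")])
# Including gender: sum the SS within each group-by-gender cell.
ss_within_cells = sum(ss(v) for v in cells.values())

print(ss_ignoring, ss_within_cells)   # 20.0 8.0
```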
This difference is due to the fact that the means for males are systematically lower than those for
females, and this difference in means adds variability if we ignore this factor. Controlling for error
variance increases the sensitivity (power) of a test.
This example demonstrates another principle of ANOVA that makes it preferable over simple
two-group t test studies: In ANOVA we can test each factor while controlling for all others; this is
actually the reason why ANOVA is more statistically powerful (i.e., we need fewer observations to
find a significant effect) than the simple t test.
Interaction Effects
There is another advantage of ANOVA over simple t-tests: ANOVA allows us to detect interaction
effects between variables, and, therefore, to test more complex hypotheses about reality. Let us
consider another example to illustrate this point. (The term interaction was first used by Fisher,
1926.)
Main Effects, Two-way Interaction.
Imagine that we have a sample of highly achievement-oriented students and another of
achievement avoiders. We now create two random halves in each sample, and give one half of
each sample a challenging test, the other an easy test.
We measure how hard the students work on the test. The means of this (fictitious) study are as
follows:
How can we summarize these results? Is it appropriate to conclude that (1) challenging tests
make students work harder, (2) achievement-oriented students work harder than
achievement-avoiders? Neither of these statements captures the essence of this clearly
systematic pattern of means.
The appropriate way to summarize the result would be to say that challenging tests make only
achievement-oriented students work harder, while easy tests make only achievement- avoiders
work harder.
In other words, the type of achievement orientation and test difficulty interact in their effect on
effort; specifically, this is an example of a two-way interaction between achievement orientation
and test difficulty.
Note that statements 1 and 2 above describe so-called main effects.
Higher-Order Interactions.
While the previous two-way interaction can be put into words relatively easily, higher order
interactions are increasingly difficult to verbalize. Imagine that we had included factor Gender in
the achievement study above, and we had obtained the following pattern of means:
How could we now summarize the results of our study? Graphs of means for all effects greatly
facilitate the interpretation of complex effects. The pattern shown in the table above (and in the
graph below) represents a three-way interaction between factors.
Thus we may summarize this pattern by saying that for females there is a two-way interaction
between achievement-orientation type and test difficulty: Achievement-oriented females work
harder on challenging tests than on easy tests, achievement-avoiding females work harder on
easy tests than on difficult tests. For males, this interaction is reversed. As you can see, the
description of the interaction has become much more involved.
A General way to Express Interactions.
A general way to express all interactions is to say that an effect is modified (qualified) by another
effect. Let us try this with the two-way interaction above. The main effect for test difficulty is
modified by achievement orientation.
For the three-way interaction in the previous paragraph, we may summarize that the two-way
interaction between test difficulty and achievement orientation is modified (qualified) by gender.
If we have a four-way interaction, we may say that the three-way interaction is modified by the
fourth variable, that is, that there are different types of interactions in the different levels of the
fourth variable.
As it turns out, in many areas of research five-way or higher interactions are not that uncommon.
SPSS Output
Does background music affect thinking, specifically semantic processing? To investigate this, 10
participants solved anagrams while listening to three different types of background music. A
within-subjects design was used, so all participants were tested in every condition. Anagrams
were chosen of equal difficulty, and randomly paired with the different types of music. The order
of conditions was counterbalanced for the participants. To eliminate verbal interference,
instrumental "muzak" medleys of each music type were used, which were all 10 minutes in
length. The music was chosen to represent Classical, Easy Listening, and Country styles. Chosen
for use were Beethoven, the BeeGees, and Garth Brooks. The number of anagrams solved in
each condition was recorded for analysis. Here are the data recorded for the participants:
Use 3 columns to enter the data just as they appear above. Call them bthoven, beegees, and
brooks.
SPSS does not perform post-hoc comparisons for repeated-measures analyses. You will have to
use the formula for the Tukey test and calculate the critical difference by hand for this problem.
Remember, the general formula is:
Tukey
This gives you a critical difference (CD). Any two means which differ by this amount (the CD) or
more are significantly different from each other.
Note
MSwg = MS within groups (from the ANOVA table output)
n = number of scores per condition
q = studentized range statistic (table in back of textbook)
H0: mu 1 = mu 2 = mu 3
H1: 1 or more mu values are unequal
Analyze > General Linear Model > Repeated Measures
Within-Subjects factor name: music; Number of levels: 3; click Add > Define
Click over the variables corresponding to levels 1, 2, 3. Options > Descriptive Statistics
(optional, if you want); Plots > Horizontal axis: music
Tukey
= 2.098 = 2.10; any two groups differing by 2.10 or more are significantly different with the Tukey
test at alpha = .05.
Summary of Tukey Results
Means from the descriptives section of the output:
Group 1: 13.50
Group 2: 11.10
Group 3: 11.90
1 compared to 2: difference of 2.40 (greater than our CD, so this is statistically significant)
1 compared to 3: difference of 1.60 (less than our CD, so this is not statistically significant)
2 compared to 3: difference of .80 (less than our CD, so this is not statistically significant)
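The critical-difference arithmetic can be verified as follows. MSwg is not shown in this extract, so 3.38 is back-calculated here to reproduce CD = 2.098; q = 3.61 is the studentized-range value for 3 groups and 18 error d.f. at alpha = .05.

```python
# Reproducing the Tukey critical-difference arithmetic.  MSwg = 3.38 is
# back-calculated to match CD = 2.098 (it is not given in the text);
# q = 3.61 is the studentized-range value for 3 groups, 18 df, alpha .05.
import math

ms_wg, n, q = 3.38, 10, 3.61
cd = q * math.sqrt(ms_wg / n)     # critical difference
print(round(cd, 2))               # 2.1

means = {"bthoven": 13.50, "beegees": 11.10, "brooks": 11.90}
pairs = [("bthoven", "beegees"), ("bthoven", "brooks"), ("beegees", "brooks")]
for a, b in pairs:
    diff = abs(means[a] - means[b])
    verdict = "significant" if diff >= cd else "not significant"
    print(f"{a} vs {b}: diff {diff:.2f} -> {verdict}")
```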
A one-way repeated-measures ANOVA indicated that there were significant differences in the
number of anagrams solved across the three background instrumental (muzak) conditions,
F(2,18) = 4.42, p < .027. Post-hoc Tukey comparisons indicated that more anagrams were solved
while listening to Beethoven (M=13.5) than while listening to the BeeGees (M=11.1).
To summarize the discussion up to this point, the purpose of analysis of variance is to test
differences in means (for groups or variables) for statistical significance. This is accomplished by
analyzing the variance, that is, by partitioning the total variance into the component that is due to
true random error (i.e., within-group SS) and the components that are due to differences between
means. These latter variance components are then tested for statistical significance, and, if
significant, we reject the null hypothesis of no differences between means, and accept the
alternative hypothesis that the means (in the population) are different from each other.
4.5 Summary
A t-test would compare the likelihood of observing the difference in the mean number of words
recalled for each group. An ANOVA test, on the other hand, would compare the variability that we
observe between the two conditions to the variability observed within each condition. Elementary
Concepts provides a brief introduction to the basics of statistical significance testing.
4.6 Keywords
SS Error and SS Effect: The within-group variability (SS) is usually referred to as error variance.
This term denotes the fact that we cannot readily explain or account for it in the current design.
Dependent and Independent Variables: The variables that are measured (e.g., a test score) are
called dependent variables. The variables that are manipulated or controlled (e.g., a teaching
method or some other criterion used to divide observations into groups that are compared) are
called factors or independent variables.
4.7 Questions
1. Discuss the logic behind ANOVA.
4.8 References
Boyd, Westfall, and Stasch, Marketing Research: Text and Cases, All India Traveller Bookseller, New Delhi.
Brown, F.E., Marketing Research: A Structure for Decision Making, Addison-Wesley Publishing Company.
Kothari, C.R., Research Methodology: Methods and Techniques, Wiley Eastern Ltd.
Stockton and Clark, Introduction to Business and Economic Statistics, D.B. Taraporevala Sons and Co.
Private Limited, Bombay.
Dunn, Olive Jean, and Virginia A. Clark, Applied Statistics, John Wiley and Sons.
Green, Paul E., and Donald S. Tull, Research for Marketing Decisions, Prentice Hall of India, New Delhi.