BALAJI

Workshop on Statistics with the R Package

Module 6
ANALYSIS OF VARIANCE [ANOVA]
Analysis of Variance (ANOVA) is a statistical method used to test
differences between two or more means. It may seem odd that the
technique is called "Analysis of Variance" rather than "Analysis of
Means." As you will see, the name is appropriate because
inferences about means are made by analysing variance.
EXAMPLE: ONE-WAY ANOVA
The times required by three workers to perform an assembly-line task were
recorded on five randomly selected occasions. Here are the times, to the
nearest minute.
Variable X = time, observed for three independent persons.

Hank   Joseph   Susan
  8       8      10
 10       9       9
  9       9      10
 11       8      11
 10      10       9

H0: µH = µJ = µS
H1: Not all are equal (at least one of them differs from the others)
Perform a one-way ANOVA, stating suitable hypotheses.
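[a] Typing the data in directly: ONE-WAY ANOVA
# [1] Worker label for each observation (five per worker)
Workers <- c("HANK", "HANK", "HANK", "HANK", "HANK",
             "JOSEPH", "JOSEPH", "JOSEPH", "JOSEPH", "JOSEPH",
             "SUSAN", "SUSAN", "SUSAN", "SUSAN", "SUSAN")
# [2] Times in minutes, in the same order as Workers
TIME <- c(8, 10, 9, 11, 10, 8, 9, 9, 8, 10, 10, 9, 10, 11, 9)
# [3] Fit the one-way model
model <- aov(TIME ~ Workers)
# [4] Print the ANOVA table
summary(model)
# [5] Conclusion: reject H0 if Pr(>F) is below the chosen level of significance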
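To see why the method is called analysis of variance, the F value printed by summary(model) can be rebuilt by hand from the variation between and within the workers. The following is a minimal sketch on the same data; the names ss_between, ss_within, F_stat and p_value are illustrative and do not appear in the workshop notes.

```r
# Times (minutes) and worker labels, copied from the example above
TIME    <- c(8, 10, 9, 11, 10, 8, 9, 9, 8, 10, 10, 9, 10, 11, 9)
Workers <- factor(rep(c("HANK", "JOSEPH", "SUSAN"), each = 5))

k <- nlevels(Workers)   # number of groups
N <- length(TIME)       # total number of observations

group_means <- tapply(TIME, Workers, mean)
group_sizes <- tapply(TIME, Workers, length)
grand_mean  <- mean(TIME)

# Variation of the group means around the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
# Variation of each observation around its own group mean
ss_within  <- sum((TIME - ave(TIME, Workers))^2)

# Mean squares and their ratio, the F statistic
ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (N - k)
F_stat     <- ms_between / ms_within
p_value    <- pf(F_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

# The same F value taken from the aov() table, for comparison
aov_F <- summary(aov(TIME ~ Workers))[[1]][["F value"]][1]

c(F_by_hand = F_stat, F_from_aov = aov_F, p_value = p_value)
```

A large F means the worker means differ by more than the scatter within each worker would explain, which is the evidence against H0.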
[b] Typing the data in directly: TWO-WAY ANOVA
Suppose the data were collected on five independent shifts S1, S2, S3, S4, S5.
Then one more hypothesis, about the shift means, has to be tested.
      Hank   Joseph   Susan
S1      8       8      10
S2     10       9       9
S3      9       9      10
S4     11       8      11
S5     10      10       9

[1] H0: µH = µJ = µS
    H1: Not all are equal
[2] H0: µS1 = µS2 = µS3 = µS4 = µS5
    H1: Not all are equal
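# [1] Worker label for each observation
Workers <- c("HANK", "HANK", "HANK", "HANK", "HANK",
             "JOSEPH", "JOSEPH", "JOSEPH", "JOSEPH", "JOSEPH",
             "SUSAN", "SUSAN", "SUSAN", "SUSAN", "SUSAN")
# [2] Shift label for each observation
Shifts <- c("S1", "S2", "S3", "S4", "S5",
            "S1", "S2", "S3", "S4", "S5",
            "S1", "S2", "S3", "S4", "S5")
# [3] Times in minutes, in the same order
TIME <- c(8, 10, 9, 11, 10, 8, 9, 9, 8, 10, 10, 9, 10, 11, 9)
# ONE-WAY ANOVA
model_1 <- aov(TIME ~ Workers)
summary(model_1)
# Conclusion: compare Pr(>F) for Workers with the level of significance
# TWO-WAY ANOVA
model_2 <- aov(TIME ~ Workers + Shifts)
summary(model_2)
# Conclusion: compare Pr(>F) for Workers and for Shifts with the level of significance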
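The two conclusions can also be read off in code. Below is a minimal sketch of the same two-way analysis with the data held in a data frame; the names assembly, p_values, alpha and decision are illustrative, and the 5% level of significance is an assumption, not something fixed by the notes.

```r
# Same observations as above, stored in one data frame
assembly <- data.frame(
  TIME    = c(8, 10, 9, 11, 10, 8, 9, 9, 8, 10, 10, 9, 10, 11, 9),
  Workers = factor(rep(c("HANK", "JOSEPH", "SUSAN"), each = 5)),
  Shifts  = factor(rep(c("S1", "S2", "S3", "S4", "S5"), times = 3))
)

model_2 <- aov(TIME ~ Workers + Shifts, data = assembly)
tab <- summary(model_2)[[1]]   # the ANOVA table as a data frame

# Rows 1 and 2 of the table are Workers and Shifts; row 3 is Residuals
p_values <- tab[["Pr(>F)"]][1:2]
names(p_values) <- c("Workers", "Shifts")

alpha <- 0.05   # assumed level of significance
decision <- ifelse(p_values < alpha, "reject H0", "do not reject H0")
print(p_values)
print(decision)
```

Each entry of decision answers one of the two hypotheses: [1] for the workers and [2] for the shifts.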
[c] Reading the data from a dataset: ONE/TWO-WAY ANOVA
From R's built-in data:
[1] chickwts is a dataset that ships with R
data(chickwts)
names(chickwts)
w<-chickwts$weight
f<-chickwts$feed
model_1<-aov(w~f)
summary(model_1)
[2] warpbreaks is another built-in dataset
data()   # lists the datasets available in R
data(warpbreaks)
names(warpbreaks)
brx<-warpbreaks$breaks
wool<-warpbreaks$wool
tens<-warpbreaks$tension
model_1<-aov(brx~wool)
summary(model_1)
model_2<-aov(brx~tens)
summary(model_2)
model_3<-aov(brx~wool+tens)
summary(model_3)
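The columns do not have to be copied into separate vectors first. As a sketch of an alternative, aov() accepts a data argument, so the formula can use the column names of warpbreaks directly; the name model_3_df is illustrative.

```r
data(warpbreaks)
str(warpbreaks)   # breaks is numeric; wool and tension are factors

# Two-way ANOVA written with the column names of the data frame
model_3_df <- aov(breaks ~ wool + tension, data = warpbreaks)
summary(model_3_df)

# The vector-based version from the notes, for comparison
brx  <- warpbreaks$breaks
wool <- warpbreaks$wool
tens <- warpbreaks$tension
model_3 <- aov(brx ~ wool + tens)

# Both fits should give the same F values
same <- all.equal(summary(model_3_df)[[1]][["F value"]],
                  summary(model_3)[[1]][["F value"]])
same
```

Only the row labels of the ANOVA table differ between the two versions (tension instead of tens).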
Module 7
Probability or proportion test, then chi-square test
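Example 1
Test statistically whether a given coin is unbiased or biased,
where H0: p = 0.5, H1: p ≠ 0.5
Step [1] X = 1 if the outcome is a head,
           = 0 otherwise.
Step [2] Generate a sample of, say, 100 trials and count how many X values
are 1 and how many are 0.
Step [3] This amounts to performing a binomial experiment, which can be done in R
(n is the number of tosses):
heads <- rbinom(1, n, prob = 0.5)
prop.test(heads, n, p = 0.5)
Conclusion: reject H0 if the p-value is below the chosen level of significance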
# Proportion test for a binomial proportion p
heads <- rbinom(1, 100, prob = 0.5)   # number of heads in 100 tosses
heads
prop.test(heads, 100, p = 0.5)
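Because rbinom() draws a new random sample on every run, the output changes each time. The sketch below fixes the random seed so that a run can be repeated, and sets the exact binomial test binom.test() beside prop.test(), which uses a normal approximation. The seed value 123 and the names approx_test and exact_test are arbitrary choices for illustration.

```r
set.seed(123)   # arbitrary seed, used only to make the run repeatable
n <- 100        # number of tosses

heads <- rbinom(1, n, prob = 0.5)   # simulated number of heads

approx_test <- prop.test(heads, n, p = 0.5)    # chi-square (normal) approximation
exact_test  <- binom.test(heads, n, p = 0.5)   # exact binomial test

# Sample proportion and the two p-values side by side
heads / n
c(approximate = approx_test$p.value, exact = exact_test$p.value)
```

For a sample of this size the two p-values are usually close; the exact test is the safer choice when n is small.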
# Example 2
# x is the number of customers who prefer a product in a sample of 200
# X = 1 if the customer prefers the product, X = 0 if not
# Claim: p = 0.45, so H0: p = 0.45, H1: p ≠ 0.45
x <- rbinom(1, 200, prob = 0.45)
x
prop.test(x, 200, p = 0.45)
                       Marks
Attendance    Distinction   No distinction
>= 80%            40              20
< 80%             21              19
Association: there is a positive association between attendance
and distinction.
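The direction of the association can be read from the row proportions, and whether it is statistically significant can be checked with the chi-square test introduced next. A minimal sketch on the attendance table follows; the names att and att_test are illustrative.

```r
# Attendance by marks, typed in row by row from the table above
att <- matrix(c(40, 20,
                21, 19),
              nrow = 2, byrow = TRUE,
              dimnames = list(attendance = c(">=80%", "<80%"),
                              marks = c("Distinction", "No distinction")))

# Share of distinctions within each attendance group
prop.table(att, margin = 1)

# Chi-square test of independence
# (for a 2 x 2 table R applies Yates' continuity correction by default)
att_test <- chisq.test(att)
att_test
att_test$expected   # counts expected if attendance and marks were independent
```

A higher share of distinctions in the first row points to a positive association in the sample; the p-value of att_test decides whether it is significant at the chosen level.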
Chi-square test for a contingency table
The following data form a contingency table of gender and
preference for tea, coffee or lemon juice.
GENDER              SOFT DRINK
          Tea   Coffee   Lemon juice   Total
Male      762     327        468        1557
Female    484     239        477        1200
Total    1246     566        945        2757
At a given level of significance (l.o.s.), test whether there is any
association between gender and preference of drink (H0: no association).
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
# Row 1 is Male and row 2 is Female, as in the table above
dimnames(M) <- list(gender = c("M", "F"),
                    drink = c("Tea", "Coffee", "Lemon Juice"))
(Xsq <- chisq.test(M))   # the outer parentheses print the result
Xsq$observed    # observed counts (same as M)
Xsq$expected    # expected counts under the null
Xsq$residuals   # Pearson residuals
Xsq$stdres      # standardized residuals
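The expected counts and the test statistic can also be worked out by hand: each expected count is (row total × column total) / grand total, and the statistic is the sum of (observed − expected)² / expected over all cells. A minimal sketch, with the illustrative names observed, expected_by_hand, chisq_by_hand and p_by_hand:

```r
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("M", "F"),
                    drink = c("Tea", "Coffee", "Lemon Juice"))
observed <- unclass(M)   # plain matrix of counts

# Expected count in each cell = row total * column total / grand total
expected_by_hand <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic and its degrees of freedom
chisq_by_hand <- sum((observed - expected_by_hand)^2 / expected_by_hand)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_by_hand <- pchisq(chisq_by_hand, df = df, lower.tail = FALSE)

# Compare with chisq.test()
Xsq <- chisq.test(M)
all.equal(unname(Xsq$statistic), chisq_by_hand)
c(statistic = chisq_by_hand, df = df, p_value = p_by_hand)
```

No continuity correction is involved here, since R applies it only to 2 x 2 tables.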
EXAMPLE BY ME
x <- c(762, 327, 468)   # Male
y <- c(484, 239, 477)   # Female
m <- as.table(rbind(x, y))
m
dimnames(m) <- list(gender = c("M", "F"), drink = c("T", "C", "L"))
xsq <- chisq.test(m)
xsq
xsq$observed    # observed counts (same as m)
xsq$expected    # expected counts under the null
xsq$residuals   # Pearson residuals
xsq$stdres      # standardized residuals
A NEW EXAMPLE BY ME [BEST]
male <- c(45, 60)
females <- c(55, 53)
m1 <- as.table(rbind(male, females))
m1
# Row and column labels (named so that they do not mask the function c())
genders <- c("Male", "Female")
drinks <- c("Tea", "Coffee")
dimnames(m1) <- list(gender = genders, drink = drinks)
xsq2 <- chisq.test(m1)
xsq2
xsq2$observed    # observed counts (same as m1)
xsq2$expected    # expected counts under the null
xsq2$residuals   # Pearson residuals
xsq2$stdres      # standardized residuals
GENDER       ADV
          YES   NO
MALE       34   43
FEMALE     28   23