
BALAJI - Module - 6 7 AOV, CHI


Balaji

Workshop on Statistics with the R Package


Module 6
ANALYSIS OF VARIANCE [ANOVA]
Analysis of Variance (ANOVA) is a statistical method used to test
differences between two or more means. It may seem odd that the
technique is called "Analysis of Variance" rather than "Analysis of
Means." As you will see, the name is appropriate because
inferences about means are made by analysing variance.
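
To make the idea of "analysing variance" concrete, the F statistic can be assembled by hand from the between-group and within-group variability. The following is a minimal sketch, assuming the three workers' times from the example in this module; the variable names (grp, ss_between, ss_within and so on) are illustrative and not part of the original notes.

```r
# Times (minutes) for the three workers, taken from the example in this module
times <- c(8, 10, 9, 11, 10,   # Hank
           8, 9, 9, 8, 10,     # Joseph
           10, 9, 10, 11, 9)   # Susan
grp <- factor(rep(c("HANK", "JOSEPH", "SUSAN"), each = 5))

k <- nlevels(grp)            # number of groups
n <- length(times)           # total number of observations
grand_mean  <- mean(times)
group_means <- tapply(times, grp, mean)
group_sizes <- tapply(times, grp, length)

# Between-group sum of squares: spread of the group means around the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
# Within-group sum of squares: spread of each observation around its own group mean
ss_within <- sum((times - ave(times, grp))^2)

# F is the ratio of the between-group and within-group mean squares
f_stat  <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_value <- pf(f_stat, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

f_stat
p_value
```

The same F value should appear in the table printed by summary(aov(times ~ grp)), which is how the rest of this module carries out the test.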

EXAMPLE. ONE WAY ANOVA

The times required by three workers to perform an assembly-line task were
recorded on FIVE randomly selected occasions. Here are the times, to the
nearest minute. The variable X is the time taken; the three workers are
independent groups.

Hank   Joseph   Susan
  8      8       10
 10      9        9
  9      9       10
 11      8       11
 10     10        9

H0: µH = µJ = µS
H1: Not all are equal (at least one of them differs from the others)

Perform a one-way ANOVA after stating suitable hypotheses.
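[a] Direct data typing: ONE WAY ANOVA

# Worker label for each observation (five times per worker)
Workers <- c("HANK","HANK","HANK","HANK","HANK",
             "JOSEPH","JOSEPH","JOSEPH","JOSEPH","JOSEPH",
             "SUSAN","SUSAN","SUSAN","SUSAN","SUSAN")
# Times in minutes, in the same order as the labels
TIME <- c(8,10,9,11,10,8,9,9,8,10,10,9,10,11,9)
# Fit the one-way ANOVA model and print the ANOVA table
model <- aov(TIME ~ Workers)
summary(model)
# Conclusion: reject H0 if Pr(>F) is below the chosen level of significance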
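
The conclusion step can also be read off programmatically. This is a minimal sketch, assuming a 5% level of significance (the notes do not fix one) and relying on the fact that summary() of an aov fit returns a list whose first element is the ANOVA table; the names tab, p_value and alpha are illustrative.

```r
Workers <- rep(c("HANK", "JOSEPH", "SUSAN"), each = 5)
TIME <- c(8, 10, 9, 11, 10, 8, 9, 9, 8, 10, 10, 9, 10, 11, 9)
model <- aov(TIME ~ Workers)

# summary() returns a list; its first element is the ANOVA table (a data frame)
tab <- summary(model)[[1]]
p_value <- tab[["Pr(>F)"]][1]   # first row is the Workers term

alpha <- 0.05                   # assumed level of significance
if (p_value < alpha) {
  cat("Reject H0: the mean times are not all equal\n")
} else {
  cat("Do not reject H0: no evidence that the mean times differ\n")
}
```

For these data the F value is about 1.56 on 2 and 12 degrees of freedom, so at the 5% level H0 is not rejected.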
[b] Direct data typing: TWO WAY ANOVA
Suppose the data were collected on 5 independent shifts S1, S2, S3, S4, S5. Then
one more hypothesis, about the shift means, has to be tested.

      Hank   Joseph   Susan
S1      8      8       10
S2     10      9        9
S3      9      9       10
S4     11      8       11
S5     10     10        9

[1] H0: µH = µJ = µS                      H1: Not all are equal
[2] H0: µS1 = µS2 = µS3 = µS4 = µS5       H1: Not all are equal

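# Worker and shift label for each observation, followed by the times
Workers <- c("HANK","HANK","HANK","HANK","HANK",
             "JOSEPH","JOSEPH","JOSEPH","JOSEPH","JOSEPH",
             "SUSAN","SUSAN","SUSAN","SUSAN","SUSAN")
Shifts <- c("S1","S2","S3","S4","S5","S1","S2","S3","S4","S5",
            "S1","S2","S3","S4","S5")
TIME <- c(8,10,9,11,10,8,9,9,8,10,10,9,10,11,9)
# ONE WAY ANOVA
model_1 <- aov(TIME ~ Workers)
summary(model_1)
# Conclusion: compare Pr(>F) for Workers with the chosen level of significance
# TWO WAY ANOVA
model_2 <- aov(TIME ~ Workers + Shifts)
summary(model_2)
# Conclusion: compare Pr(>F) for Workers and for Shifts with the chosen level of significance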
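
The two-way layout above can also be held in a data frame, which keeps the three columns together and makes it easy to check that every worker appears once in every shift. This is a sketch under that assumption; dat, tab2, p_workers and p_shifts are illustrative names, not part of the original notes.

```r
dat <- data.frame(
  TIME    = c(8, 10, 9, 11, 10,  8, 9, 9, 8, 10,  10, 9, 10, 11, 9),
  Workers = factor(rep(c("HANK", "JOSEPH", "SUSAN"), each = 5)),
  Shifts  = factor(rep(c("S1", "S2", "S3", "S4", "S5"), times = 3))
)

# Layout check: each worker should appear exactly once per shift
layout_check <- xtabs(~ Workers + Shifts, data = dat)
layout_check

# Two-way ANOVA without interaction (one observation per cell)
model_2 <- aov(TIME ~ Workers + Shifts, data = dat)
tab2 <- summary(model_2)[[1]]
tab2

# p-values for the two hypotheses: rows are Workers, Shifts, Residuals
p_workers <- tab2[["Pr(>F)"]][1]
p_shifts  <- tab2[["Pr(>F)"]][2]
```

Because there is only one observation per worker and shift, the model contains no interaction term; the residual line in the table plays that role.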
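[c] Direct data reading from file: ONE/TWO WAY ANOVA
FROM R DATA FILES (data sets supplied with R)
[1] chickwts is a data set in R
data(chickwts)
names(chickwts)
w <- chickwts$weight   # response: chick weight
f <- chickwts$feed     # factor: feed type
model_1 <- aov(w ~ f)
summary(model_1)

[2] warpbreaks is another data set in R
data()                 # lists the data sets available in R
data(warpbreaks)
names(warpbreaks)
brx <- warpbreaks$breaks     # response: number of breaks
wool <- warpbreaks$wool      # factor: wool type
tens <- warpbreaks$tension   # factor: tension level
# One-way ANOVA on wool, one-way ANOVA on tension, then two-way ANOVA on both
model_1 <- aov(brx ~ wool)
summary(model_1)
model_2 <- aov(brx ~ tens)
summary(model_2)
model_3 <- aov(brx ~ wool + tens)
summary(model_3)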
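
The examples above use data sets that ship with R. When the data sit in an external file instead, the same analysis might look like the sketch below. It assumes a CSV file with columns named Workers and TIME; to stay self-contained it first writes the worker times to a temporary file, and csv_path, dat and model_file are illustrative names.

```r
# Write the worker times to a temporary CSV file so the example is self-contained;
# in practice the file would already exist on disk.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Workers = rep(c("HANK", "JOSEPH", "SUSAN"), each = 5),
    TIME    = c(8, 10, 9, 11, 10,  8, 9, 9, 8, 10,  10, 9, 10, 11, 9)
  ),
  csv_path, row.names = FALSE
)

# Read the file back, turning the text column into a factor
dat <- read.csv(csv_path, stringsAsFactors = TRUE)
str(dat)

# One-way ANOVA using the columns of the data frame
model_file <- aov(TIME ~ Workers, data = dat)
summary(model_file)
```

Passing data = dat to aov() avoids copying each column into a separate variable first.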
Module 7
Probability or proportion test, then Chi-square test
Example 1
Verify statistically whether a given coin is unbiased or biased,
where H0: p = 0.5, H1: p ≠ 0.5

Step [1] X = 1 if the outcome is a head
           = 0 otherwise.
Step [2] Generate a sample of, say, 100 trials and count the outcomes
with X = 1 and X = 0.
Step [3] This amounts to performing a binomial experiment, which can be done in R
(n is the number of trials):
heads <- rbinom(1, n, prob = .5)
prop.test(heads, n, p = 0.5)
Conclusion
# Proportion test for a binomial proportion
heads <- rbinom(1, 100, prob = .5)   # number of heads in 100 simulated tosses
heads
prop.test(heads, 100, p = .5)
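
Since the tosses are simulated, each run gives a different count of heads. The sketch below fixes a random seed so the run can be repeated, and places the exact binomial test binom.test() beside prop.test() for comparison. The seed value and the names approx_test and exact_test are arbitrary choices made for illustration.

```r
set.seed(1)                          # arbitrary seed so the simulated tosses can be reproduced
n <- 100
heads <- rbinom(1, n, prob = 0.5)    # number of heads in n simulated fair tosses

# Chi-square (normal) approximation with continuity correction
approx_test <- prop.test(heads, n, p = 0.5)
# Exact test based on the binomial distribution
exact_test <- binom.test(heads, n, p = 0.5)

approx_test$p.value
exact_test$p.value
exact_test$conf.int                  # confidence interval for p
```

With 100 trials the two p-values are usually close; the exact test is the safer choice when the number of trials is small.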

# Example 2
# x is the number of customers who prefer a product in a sample of 200
# X = 1 if the customer prefers the product, X = 0 if not
# Claim p = .45: H0: p = 0.45, H1: p ≠ 0.45
x <- rbinom(1, 200, prob = .45)
x
prop.test(x, 200, p = .45)

                        Marks
Attendance      Distinction   No distinction
>= 80%              40             20
<  80%              21             19

Association: there is a positive association between attendance
and distinction.
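
The direction of the association in the attendance table can be checked by comparing the share of distinctions within each attendance group. This is a minimal sketch; the object names tab, row_props and odds_ratio are illustrative.

```r
tab <- matrix(c(40, 20,
                21, 19),
              nrow = 2, byrow = TRUE,
              dimnames = list(Attendance = c(">=80%", "<80%"),
                              Marks = c("Distinction", "No distinction")))

# Row-wise proportions: share of each attendance group with and without distinction
row_props <- prop.table(tab, margin = 1)
row_props

# Sample odds ratio: a value above 1 points to a positive association in the sample
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
odds_ratio
```

A higher share of distinctions in the ">=80%" row than in the "<80%" row describes the sample only; whether the association is statistically significant is a separate question for the chi-square test on contingency tables.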
Chi-square test for contingency table
The following data represent a contingency table of gender and
preference for tea, coffee or lemon juice.

GENDER               SOFT DRINK
            Tea    Coffee   Lemon juice    Total
Male        762     327        468         1557
Female      484     239        477         1200
Total      1246     566        945         2757

At a given level of significance (l.o.s.), test whether there is any association
between gender and preference of drink.
H0: gender and preference of drink are independent (no association).
# Rows follow the table above: first row Male, second row Female
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("M", "F"),
                    drink = c("Tea", "Coffee", "Lemon Juice"))
(Xsq <- chisq.test(M))   # the outer parentheses print the result

Xsq$observed # observed counts (same as M)
Xsq$expected # expected counts under the null
Xsq$residuals # Pearson residuals
Xsq$stdres # standardized residuals
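
The expected counts and the test statistic reported by chisq.test() can be reproduced by hand, which shows what the test actually compares. This is a sketch for the gender by drink table, with rows labelled in the order of the table (male first); expected, chi_sq, df and p_value are illustrative names.

```r
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("M", "F"),
                    drink = c("Tea", "Coffee", "Lemon Juice"))

# Expected count under independence = (row total * column total) / grand total
expected <- outer(rowSums(M), colSums(M)) / sum(M)

# Pearson chi-square statistic and its degrees of freedom
chi_sq <- sum((M - expected)^2 / expected)
df <- (nrow(M) - 1) * (ncol(M) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq
df
p_value
```

For a table larger than 2 x 2, chisq.test() applies no continuity correction, so these values should agree with its output.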
EXAMPLE BY ME

x <- c(762, 327, 468)   # male counts: tea, coffee, lemon juice
y <- c(484, 239, 477)   # female counts: tea, coffee, lemon juice
m <- as.table(rbind(x, y))
m
dimnames(m) <- list(gender = c("M", "F"), drink = c("T", "C", "L"))
xsq <- chisq.test(m)
xsq
xsq$observed # observed counts (same as m)
xsq$expected # expected counts under the null
xsq$residuals # Pearson residuals
xsq$stdres # standardized residuals

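A NEW EXAMPLE BY ME [BEST]

male <- c(45, 60)      # male counts: tea, coffee
females <- c(55, 53)   # female counts: tea, coffee
m1 <- as.table(rbind(male, females))
m1
# Row and column labels (named so that they do not mask the function c())
genders <- c("Male", "Female")
drinks <- c("Tea", "Coffee")
dimnames(m1) <- list(gender = genders, drink = drinks)
xsq2 <- chisq.test(m1)
xsq2
xsq2$observed # observed counts (same as m1)
xsq2$expected # expected counts under the null
xsq2$residuals # Pearson residuals
xsq2$stdres # standardized residuals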

GENDER         ADV
           YES    NO
MALE        34    43
FEMALE      28    23
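
The notes give no code for this last table. A possible analysis, following the same pattern as the earlier examples, is sketched below. It assumes the hypothesis of interest is independence between gender and the ADV response, and it shows the effect of the continuity correction that chisq.test() applies to 2 x 2 tables by default; adv, with_correction and without_correction are illustrative names.

```r
adv <- as.table(rbind(c(34, 43), c(28, 23)))
dimnames(adv) <- list(gender = c("MALE", "FEMALE"),
                      adv = c("YES", "NO"))
adv

# For a 2 x 2 table chisq.test() applies Yates' continuity correction by default
with_correction <- chisq.test(adv)
# The uncorrected Pearson statistic, for comparison
without_correction <- chisq.test(adv, correct = FALSE)

with_correction$expected   # expected counts under independence
with_correction$p.value
without_correction$p.value
```

The corrected statistic is never larger than the uncorrected one, so the corrected test is the more conservative of the two.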
