Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
32 views

Midterm Q2: Olamide Gab-Opadokun 3/6/2020

This document provides the analysis and results of a linear regression model fit to microbiological count data from food samples. Specifically: 1) It is found that X15 contributes 0.3% to the total variability in the model. 2) Proportions of variability contributed by each effect are calculated and plotted in a half normal plot, identifying X23, X12, and X123 as the most influential effects. 3) A bootstrap procedure is used to construct confidence envelopes around the half normal plot.

Uploaded by

Olamide Gab
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
32 views

Midterm Q2: Olamide Gab-Opadokun 3/6/2020

This document provides the analysis and results of a linear regression model fit to microbiological count data from food samples. Specifically: 1) It is found that X15 contributes 0.3% to the total variability in the model. 2) Proportions of variability contributed by each effect are calculated and plotted in a half normal plot, identifying X23, X12, and X123 as the most influential effects. 3) A bootstrap procedure is used to construct confidence envelopes around the half normal plot.

Uploaded by

Olamide Gab
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

MidTerm Q2

Olamide Gab-Opadokun

3/6/2020

a)Compute the proportion of the contribution to the variability associated to X15

library(AlgDesign)
gen.factorial(c(2,2,2,2), center=TRUE)

## X1 X2 X3 X4
## 1 -1 -1 -1 -1
## 2 1 -1 -1 -1
## 3 -1 1 -1 -1
## 4 1 1 -1 -1
## 5 -1 -1 1 -1
## 6 1 -1 1 -1
## 7 -1 1 1 -1
## 8 1 1 1 -1
## 9 -1 -1 -1 1
## 10 1 -1 -1 1
## 11 -1 1 -1 1
## 12 1 1 -1 1
## 13 -1 -1 1 1
## 14 1 -1 1 1
## 15 -1 1 1 1
## 16 1 1 1 1

food<-read.csv('C:/Users/Olamide/Downloads/PecosFood.csv')
head(food)

## Log.Microbial.Count Process.Temperature Preservative Moisture Acidity X12 X13


## 1 5.55 -1 -1 -1 -1 1 1
## 2 4.47 1 -1 -1 -1 -1 -1
## 3 5.19 -1 1 -1 -1 -1 1
## 4 5.32 1 1 -1 -1 1 -1
## 5 10.54 -1 -1 1 -1 1 -1
## 6 11.56 1 -1 1 -1 -1 1
## X14 X23 X24 X34 X123 X124 X134 X234 X1234
## 1 1 1 1 1 -1 -1 -1 -1 1
## 2 -1 1 1 1 1 1 1 -1 -1
## 3 1 -1 -1 1 1 1 -1 1 -1
## 4 -1 -1 -1 1 -1 -1 1 1 1
## 5 1 -1 1 -1 1 -1 1 1 -1
## 6 -1 -1 1 -1 -1 1 -1 1 1

1
fit<-lm(Log.Microbial.Count~.,data=food)
anova(fit)

## Warning in anova.lm(fit): ANOVA F-tests on an essentially perfect fit are


## unreliable

## Analysis of Variance Table


##
## Response: Log.Microbial.Count
## Df Sum Sq Mean Sq F value Pr(>F)
## Process.Temperature 1 0.2475 0.2475
## Preservative 1 25.0250 25.0250
## Moisture 1 31.5002 31.5002
## Acidity 1 0.1463 0.1463
## X12 1 0.2730 0.2730
## X13 1 0.0371 0.0371
## X14 1 0.5148 0.5148
## X23 1 31.1643 31.1643
## X24 1 0.2889 0.2889
## X34 1 0.1914 0.1914
## X123 1 0.1620 0.1620
## X124 1 0.6440 0.6440
## X134 1 0.9555 0.9555
## X234 1 0.2003 0.2003
## X1234 1 0.2783 0.2783
## Residuals 0 0.0000

effect<-fit$coefficients[-1]
qq<-qqnorm(effect)
### use identify(qq) and click on graph to identify outliers
### identify(qq) does not work in R studio
text(qq$x[8], qq$y[8]+0.15, names(effect)[8])
text(qq$x[2], qq$y[2]+0.15, names(effect)[2])
text(qq$x[3], qq$y[3]-0.15, names(effect)[3])

2
Normal Q−Q Plot
1.5

Moisture
1.0
Sample Quantiles

0.5
−0.5 0.0

Preservative
X23
−1.5

−1 0 1

Theoretical Quantiles

#extra sum of squares


as.numeric(anova(fit)[,2])[-16]

## Warning in anova.lm(fit): ANOVA F-tests on an essentially perfect fit are


## unreliable

## [1] 0.24750625 25.02500625 31.50015625 0.14630625 0.27300625 0.03705625


## [7] 0.51480625 31.16430625 0.28890625 0.19140625 0.16200625 0.64400625
## [13] 0.95550625 0.20025625 0.27825625

## again without the warnings


options(warn = -1)
as.numeric(anova(fit)[,2])[-16]

## [1] 0.24750625 25.02500625 31.50015625 0.14630625 0.27300625 0.03705625


## [7] 0.51480625 31.16430625 0.28890625 0.19140625 0.16200625 0.64400625
## [13] 0.95550625 0.20025625 0.27825625

#a)
#############################################
#to obtain proportion of variability associated to X15
SSX15<-anova(fit)[15,2]
SSTO<-sum(anova(fit)[,2])
P15<-SSX15/SSTO
P15

3
## [1] 0.003036787

b)

for (i in 1:15)
SSX<-as.numeric(anova(fit)[,2][-16])
SSTO<-sum(as.numeric(anova(fit)[,2]))
Proportions<-SSX/SSTO
prop<-as.numeric(Proportions)
prop

## [1] 0.0027011930 0.2731138015 0.3437812296 0.0015967331 0.0029794908


## [6] 0.0004044184 0.0056184079 0.3401158851 0.0031530176 0.0020889381
## [11] 0.0017680772 0.0070284496 0.0104280471 0.0021855238 0.0030367873

To plot the half normal plot,

HalfNormalPlot.p<-function(lm.object,B=20,alpha=0.05,...)
{
## temporary lm object
lm.object.temp<-lm.object
## design matrix
X<-model.matrix(fit)
## n and p
n<-nrow(X)
## sum of squares cross products matrix
SSCP<- t(X) %*% X
## invert SSCP
A<-eigen(SSCP)
SSCP.inv<- A$vectors %*% diag(1/A$values) %*% t(A$vectors)
## hat matrix
H<-X %*% SSCP.inv %*% t(X)
## scale matrix for residuals (sigma=1)
A<-eigen(diag(n)-H)
## Because of rounding error some of the eigenvalues might be negative
## but essentially equal to zero. So we compute the absolute value before
## the square root
scale<- A$vectors %*% diag(sqrt(abs(A$values))) %*% t(A$vectors)

t<-prop
## order statistics
order.stats<-t[order(t)]
## theoretical quantiles
z<-qnorm((1:(n-1)+(n-1)-.125)/(2*(n-1)+0.5))

plot(z,order.stats,
xlab="Half Normal Scores",
ylab="Proportions",...)
effect1<-fit$effects[-1]
text(z,order.stats)

# initialize a matrix
MC.Sample<-matrix(NA,ncol=15,nrow=B)

4
for (k in 1:B)
{
food.temp<-read.csv('C:/Users/Olamide/Downloads/PecosFood.csv')
food.temp$Log.Microbial.Count<-rnorm(16)
fit.temp<-lm(Log.Microbial.Count~.,data=food.temp)
anova(fit.temp)
effect<-fit.temp$coefficients[-1]
#extra sum of squares
as.numeric(anova(fit.temp)[,2])[-16]
## again without the warnings
options(warn = -1)
as.numeric(anova(fit.temp)[,2])[-16]

#b)

for (i in 1:15)
SSX<-as.numeric((anova(fit.temp)[,2])[-16])
SSTO<-sum(as.numeric(anova(fit.temp)[,2]))
Proportions<-SSX/SSTO
prop2<-as.numeric(Proportions)
prop2
t<-prop2
## order statistics
MC.Sample[k,]<-t[order(t)]
}

## enveloppe
upper<-function(x) { quantile(x,prob=1-alpha,type=9) }
lower.limits<-apply(MC.Sample,MARGIN=2,min)
median.values<-apply(MC.Sample,MARGIN=2,median)
upper.limits<-apply(MC.Sample,MARGIN=2,upper)

points(z,lower.limits,lty=3,type="l",col="blue",lwd=2)
points(z,median.values,lty=5,type="l",col="blue",lwd=2)
points(z,upper.limits,lty=3,type="l",col="blue",lwd=2)

HalfNormalPlot.p(fit)

5
14 15
0.30

13
Proportions

0.20
0.10
0.00

1 2 3 4 5 6 7 8 9 10 11 12

0.0 0.5 1.0 1.5 2.0

Half Normal Scores

c) According to the plot, the active effect are X12, X23 and X123

You might also like