Summary Statistics

For data $x_1, x_2, \ldots, x_n$:

Sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

Sample variance $s^2 = \frac{1}{n-1}\left( \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \right)$.
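As a quick sanity check of these two definitions, here is a minimal NumPy sketch (the data values are purely illustrative); it confirms that the computational form of $s^2$ above agrees with the definitional form $\frac{1}{n-1}\sum (x_i - \bar{x})^2$ that `np.var(..., ddof=1)` uses:

```python
import numpy as np

# Illustrative sample.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)

xbar = x.sum() / n                                # sample mean
s2 = (np.sum(x**2) - x.sum()**2 / n) / (n - 1)    # sample variance, computational form

# np.var with ddof=1 divides the sum of squared deviations by n-1;
# both forms give the same number.
assert np.isclose(s2, np.var(x, ddof=1))
```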
Poisson distribution $Po(\lambda)$: $P(X = x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ for $0 \le x < \infty$, with mean $\lambda$ and variance $\lambda$.
One Sample Two Sided Confidence Intervals, $(100-\alpha)\%$

Mean (known $\sigma^2$): $x_1, \ldots, x_n$ from $N(\mu, \sigma^2)$: $\bar{x} \pm z \cdot \sigma/\sqrt{n}$, where $P(Z > z) = \alpha/2$ with $Z \sim N(0,1)$.
Mean (unknown $\sigma^2$): $x_1, \ldots, x_n$ from $N(\mu, \sigma^2)$: $\bar{x} \pm t \cdot s/\sqrt{n}$, where $P(X > t) = \alpha/2$ with $X \sim t_{n-1}$.
Mean (large sample): $x_1, \ldots, x_n$ from an unknown distribution: $\bar{x} \pm z \cdot s/\sqrt{n}$, where $P(Z > z) = \alpha/2$ with $Z \sim N(0,1)$.
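The three mean intervals can be sketched with SciPy quantile functions (the function names and data here are mine, not from the sheet):

```python
import numpy as np
from scipy import stats

def mean_ci_known_sigma(x, sigma, alpha=0.05):
    """x̄ ± z·σ/√n, where P(Z > z) = α/2 and Z ~ N(0, 1)."""
    n, xbar = len(x), np.mean(x)
    z = stats.norm.ppf(1 - alpha / 2)          # upper α/2 quantile of N(0, 1)
    half = z * sigma / np.sqrt(n)
    return xbar - half, xbar + half

def mean_ci_unknown_sigma(x, alpha=0.05):
    """x̄ ± t·s/√n, where P(X > t) = α/2 and X ~ t_{n-1}."""
    n = len(x)
    xbar, s = np.mean(x), np.std(x, ddof=1)
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)   # upper α/2 quantile of t_{n-1}
    half = t * s / np.sqrt(n)
    return xbar - half, xbar + half
```

For large $n$, the $t_{n-1}$ quantile approaches the $N(0,1)$ quantile, which is why the large-sample entry simply replaces $t$ with $z$ and $\sigma$ with $s$.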
Variance: $x_1, \ldots, x_n$ from $N(\mu, \sigma^2)$: $\left( \dfrac{s^2 (n-1)}{b}, \; \dfrac{s^2 (n-1)}{a} \right)$, where $P(\chi^2_{n-1} < a) = \alpha/2$ and $P(\chi^2_{n-1} > b) = \alpha/2$.
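A sketch of the $\chi^2$ interval for the variance, again via SciPy quantiles (names and data are illustrative); note the interval is not symmetric about $s^2$:

```python
import numpy as np
from scipy import stats

def variance_ci(x, alpha=0.05):
    """((n-1)s²/b, (n-1)s²/a) with P(χ²_{n-1} < a) = P(χ²_{n-1} > b) = α/2."""
    n = len(x)
    s2 = np.var(x, ddof=1)
    a = stats.chi2.ppf(alpha / 2, df=n - 1)        # lower α/2 quantile of χ²_{n-1}
    b = stats.chi2.ppf(1 - alpha / 2, df=n - 1)    # upper α/2 quantile of χ²_{n-1}
    return (n - 1) * s2 / b, (n - 1) * s2 / a
```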
Proportion: $x$ from $Bin(n, p)$: $\hat{p} \pm z \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$, where $P(Z > z) = \alpha/2$ with $Z \sim N(0,1)$ and $\hat{p} = x/n$.
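A minimal sketch of this normal-approximation (Wald) interval for a proportion; the function name is mine:

```python
import numpy as np
from scipy import stats

def proportion_ci(x, n, alpha=0.05):
    """p̂ ± z·√(p̂(1-p̂)/n) with p̂ = x/n, for x successes out of n trials."""
    phat = x / n
    z = stats.norm.ppf(1 - alpha / 2)              # upper α/2 quantile of N(0, 1)
    half = z * np.sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half
```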
Poisson mean: $x_1, \ldots, x_n$ from $Po(\lambda)$: $\bar{x} \pm z \sqrt{\bar{x}/n}$, where $P(Z > z) = \alpha/2$ with $Z \sim N(0,1)$.
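This interval uses $\bar{x}$ as the estimate of both the mean and the variance, since a Poisson distribution has variance equal to its mean. A brief sketch (illustrative names and data):

```python
import numpy as np
from scipy import stats

def poisson_mean_ci(x, alpha=0.05):
    """x̄ ± z·√(x̄/n); x̄ estimates both the Poisson mean and variance."""
    x = np.asarray(x, dtype=float)
    n, xbar = len(x), x.mean()
    z = stats.norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(xbar / n)
    return xbar - half, xbar + half
```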
Two Sample Two Sided Confidence Intervals, $(100-\alpha)\%$
Two independent samples:
$x_1, x_2, \ldots, x_n$ from $X \sim N(\mu_1, \sigma_1^2)$ with sample mean $\bar{x}$ and sample variance $s_x^2$;
$y_1, y_2, \ldots, y_m$ from $Y \sim N(\mu_2, \sigma_2^2)$ with sample mean $\bar{y}$ and sample variance $s_y^2$.
Difference of sample means, assuming a common variance $\sigma_1^2 = \sigma_2^2$:
\[ (\bar{x} - \bar{y}) \pm t \, s_p \sqrt{\frac{1}{n} + \frac{1}{m}} \]
where $P(t_{n+m-2} > t) = \alpha/2$ and $s_p^2 = \dfrac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2}$.
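A sketch of the pooled two-sample interval (function name and data are mine). For what it's worth, `scipy.stats.ttest_ind` with its default `equal_var=True` uses this same pooled variance estimate for the test statistic:

```python
import numpy as np
from scipy import stats

def pooled_mean_diff_ci(x, y, alpha=0.05):
    """(x̄ - ȳ) ± t·s_p·√(1/n + 1/m), pooling under σ₁² = σ₂²."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Pooled variance: weighted combination of the two sample variances.
    sp2 = ((n - 1) * np.var(x, ddof=1) + (m - 1) * np.var(y, ddof=1)) / (n + m - 2)
    t = stats.t.ppf(1 - alpha / 2, df=n + m - 2)   # upper α/2 quantile of t_{n+m-2}
    half = t * np.sqrt(sp2) * np.sqrt(1 / n + 1 / m)
    d = x.mean() - y.mean()
    return d - half, d + half
```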
Ratio of variances:
\[ \left( \frac{s_x^2}{s_y^2} \cdot \frac{1}{b}, \;\; \frac{s_x^2}{s_y^2} \cdot \frac{1}{a} \right) \]
where $P(F_{n-1,m-1} < a) = P(F_{n-1,m-1} > b) = \alpha/2$.
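The $F$-based interval can be sketched the same way (illustrative names and data):

```python
import numpy as np
from scipy import stats

def variance_ratio_ci(x, y, alpha=0.05):
    """(s_x²/s_y² · 1/b, s_x²/s_y² · 1/a) using quantiles of F_{n-1, m-1}."""
    n, m = len(x), len(y)
    ratio = np.var(x, ddof=1) / np.var(y, ddof=1)
    a = stats.f.ppf(alpha / 2, dfn=n - 1, dfd=m - 1)        # lower α/2 quantile
    b = stats.f.ppf(1 - alpha / 2, dfn=n - 1, dfd=m - 1)    # upper α/2 quantile
    return ratio / b, ratio / a
```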
Linear Regression
Summary statistics are:
\[ S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \qquad S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} \qquad S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n} \]
The least squares estimates of $\alpha$ and $\beta$ in the regression model $y = \alpha + \beta x$ are:
\[ \hat{\beta} = \frac{S_{xy}}{S_{xx}} \qquad \hat{\alpha} = \frac{1}{n}\left( \sum y_i - \hat{\beta} \sum x_i \right) \]
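These estimates can be checked numerically in a few lines (the data values are illustrative); `np.polyfit` with degree 1 performs the same least squares fit, so the two should agree:

```python
import numpy as np

# Illustrative data for y = α + βx plus noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

Sxx = np.sum(x**2) - x.sum()**2 / n
Sxy = np.sum(x * y) - x.sum() * y.sum() / n

beta_hat = Sxy / Sxx
alpha_hat = (y.sum() - beta_hat * x.sum()) / n

# np.polyfit returns coefficients highest degree first: [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
assert np.isclose(beta_hat, slope) and np.isclose(alpha_hat, intercept)
```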
The residual (error) mean square can be calculated from:
\[ s^2 = \frac{1}{n-2}\left( S_{yy} - \frac{S_{xy}^2}{S_{xx}} \right) \]
The standard error of the estimate $\hat{\beta}$ is $\dfrac{s}{\sqrt{S_{xx}}}$, and of the estimate $\hat{\alpha}$ is $s\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}$.
The prediction for a given value of the predictor $x$ is $\hat{y} = \hat{\alpha} + \hat{\beta} x$, with confidence interval:
\[ \hat{y} \pm t \, s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}} \]
where $P(t_{n-2} > t) = \alpha/2$.
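The regression quantities above fit naturally into one helper (a sketch; the function name, return layout, and data are my own choices):

```python
import numpy as np
from scipy import stats

def regression_summary(x, y, x0, alpha=0.05):
    """Least squares fit plus the confidence interval for the fitted value at x0."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    Sxx = np.sum(x**2) - x.sum()**2 / n
    Sxy = np.sum(x * y) - x.sum() * y.sum() / n
    Syy = np.sum(y**2) - y.sum()**2 / n

    beta = Sxy / Sxx
    alpha_hat = (y.sum() - beta * x.sum()) / n
    s2 = (Syy - Sxy**2 / Sxx) / (n - 2)            # residual mean square
    s = np.sqrt(s2)

    se_beta = s / np.sqrt(Sxx)                     # standard error of β̂
    se_alpha = s * np.sqrt(1 / n + x.mean()**2 / Sxx)  # standard error of α̂

    yhat = alpha_hat + beta * x0
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)       # upper α/2 quantile of t_{n-2}
    half = t * s * np.sqrt(1 / n + (x0 - x.mean())**2 / Sxx)
    return yhat, (yhat - half, yhat + half), se_beta, se_alpha
```

The interval is narrowest at $x = \bar{x}$ and widens as $x$ moves away from it, because of the $(x - \bar{x})^2 / S_{xx}$ term.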
Pearson’s correlation coefficient:
\[ r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} \]
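This is the same quantity NumPy's correlation routine computes, which gives a quick cross-check (data values are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

Sxx = np.sum(x**2) - x.sum()**2 / n
Sxy = np.sum(x * y) - x.sum() * y.sum() / n
Syy = np.sum(y**2) - y.sum()**2 / n

r = Sxy / np.sqrt(Sxx * Syy)
# np.corrcoef returns the correlation matrix; its off-diagonal entry is r.
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```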
$\chi^2$ Tests

Suppose that we have $n$ counts $O_i$ for which the expected numbers to occur under some model are $E_i$. A test of the hypothesis is given by the statistic:
\[ \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \]
This statistic has approximately a $\chi^2$ distribution with $\nu = n - k - 1$ degrees of freedom, where $k$ is the number of estimated parameters.
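As a worked example of the goodness-of-fit statistic (counts are made up, for a fair six-sided die model where no parameters are estimated, so $k = 0$ and $\nu = n - 1$):

```python
import numpy as np
from scipy import stats

# Observed counts of each face in 120 rolls; expected under a fair-die model.
observed = np.array([18, 22, 16, 25, 20, 19])
expected = np.full(6, observed.sum() / 6)      # 20 per face under the model

chi2_stat = np.sum((observed - expected)**2 / expected)

# scipy.stats.chisquare computes the same statistic; its ddof argument
# subtracts k extra degrees of freedom when parameters were estimated.
stat, pval = stats.chisquare(observed, expected)
assert np.isclose(chi2_stat, stat)
```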
To test for independence in a two-way contingency table, the degrees of freedom are:
$\nu = (r-1) \times (c-1)$, where $r$ and $c$ are the number of rows and columns of the table.
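For the contingency-table case, `scipy.stats.chi2_contingency` computes the expected counts, the statistic, and these degrees of freedom in one call (the table below is made up):

```python
import numpy as np
from scipy import stats

# Illustrative 2x3 table (rows: two groups, columns: three outcomes).
table = np.array([[20, 30, 50],
                  [30, 20, 50]])

chi2_stat, pval, dof, expected = stats.chi2_contingency(table)

# Degrees of freedom follow ν = (r-1)(c-1).
r, c = table.shape
assert dof == (r - 1) * (c - 1)
```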