Lecture 3 PDF
Evaluation
Lecture 3
Statistical tests
Confidence Intervals
With statistics, we can establish an
interval surrounding an experimentally
determined mean x̄ within which the population
mean μ is expected to lie with a certain degree
of probability.
This interval is known as the confidence interval and
the boundaries are called confidence limits.
Ex: It is 99% probable that the true population mean for a
set of potassium measurements lies in the interval
7.25% ± 0.15% K. Thus, the mean should lie in the
interval from 7.10% to 7.40% K with 99% probability.
Such intervals apply only in the absence of bias and only if we can assume that s is
a good approximation of σ. In that case, CI for μ = x̄ ± zσ/√N.
EXAMPLE 7-1
Determine the 80% and 95% confidence intervals for (a)
the first entry (1108 mg/L glucose) in Example 6-2 (page
124) and (b) the mean value (1100.3 mg/L) for month 1 in
the example. Assume that in each part, s = 19 is a good
estimate of σ.
Ans:
(a) From Table 7-1, z = 1.28 and 1.96 for the 80% and
95% confidence levels:
80% CI = 1108 ± 1.28 × 19 = 1108 ± 24.3 mg/L
95% CI = 1108 ± 1.96 × 19 = 1108 ± 37.2 mg/L
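The arithmetic in part (a) can be checked with a short script. This is a minimal sketch, assuming σ is known and the interval is x̄ ± zσ/√N with N = 1 for a single measurement. The function name `z_half_width` is ours, and z is computed from the standard normal distribution rather than read from Table 7-1.

```python
from math import sqrt
from statistics import NormalDist


def z_half_width(sigma, n=1, level=0.95):
    """Return (half_width, z) for the interval x_bar ± z*sigma/sqrt(n)."""
    # Two-sided z value: leave (1 - level)/2 of the area in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / sqrt(n), z


# Example 7-1(a): a single measurement (n = 1) with sigma = 19 mg/L.
for level in (0.80, 0.95):
    half_width, z = z_half_width(19, n=1, level=level)
    print(f"{level:.0%} CI = 1108 ± {half_width:.1f} mg/L (z = {z:.2f})")
```

Passing a larger `n` gives the narrower interval for a mean of n measurements.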
EXAMPLE 7-2
How many replicate measurements in month 1 in
Example 6-2 are needed to decrease the 95%
confidence interval to 1100.3 ± 10.0 mg/L of glucose?
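One way to approach this example is to set the half-width zσ/√N equal to 10.0 mg/L and solve for N. The sketch below assumes, as in Example 7-1, that s = 19 mg/L is a good estimate of σ and that z = 1.96 at the 95% level. The variable names are ours.

```python
from math import ceil

z = 1.96        # 95% confidence level, as in Example 7-1
sigma = 19.0    # mg/L, assumed to be a good estimate of the population value
target = 10.0   # desired half-width of the interval, mg/L

# z*sigma/sqrt(N) = target  =>  N = (z*sigma/target)**2
n_exact = (z * sigma / target) ** 2
n_needed = ceil(n_exact)  # round up to a whole number of measurements
print(f"N = {n_exact:.1f}, so {n_needed} replicate measurements are needed")
```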
- When σ Is Unknown
Often, limitations in time or in the amount of available
sample prevent us from making enough measurements to assume that s is a good estimate of σ.
In that case, a single set of replicate measurements must provide not
only a mean but also an estimate of precision.
t Statistic
Statistical treatment of small sets of data
Often called Student's t, the t statistic depends on
the desired confidence level, and
the number of degrees of freedom in the calculation
of s.
The confidence interval for the mean of N replicate measurements is x̄ ± ts/√N.
EXAMPLE 7-3
A chemist obtained the following data for the
alcohol content of a sample of blood: % C2H5OH:
0.084, 0.089, and 0.079. Calculate the 95%
confidence interval for the mean assuming (a)
the three results obtained are the only indication
of the precision of the method and (b) from
previous experience on hundreds of samples,
we know that the standard deviation of the
method is s = 0.005% C2H5OH and that s is a good
estimate of σ.
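The two parts of this example differ only in which multiplier is used: t when s comes from the three results alone, z when σ is known from long experience. The sketch below works both cases. The value t = 4.30 (95% level, 2 degrees of freedom) is a standard table value that we assume here; it is not quoted in the text above.

```python
from math import sqrt
from statistics import mean, stdev

data = [0.084, 0.089, 0.079]  # % C2H5OH, from Example 7-3
n = len(data)
x_bar = mean(data)
s = stdev(data)  # sample standard deviation, n - 1 degrees of freedom

# (a) s from the three results only: use t for n - 1 = 2 degrees of freedom.
T_95_DF2 = 4.30  # assumed table value, 95% level, 2 degrees of freedom
half_a = T_95_DF2 * s / sqrt(n)

# (b) sigma known from long experience: use z instead of t.
Z_95 = 1.96
sigma = 0.005
half_b = Z_95 * sigma / sqrt(n)

print(f"mean = {x_bar:.3f}, s = {s:.4f}")
print(f"(a) 95% CI = {x_bar:.3f} ± {half_a:.3f} % C2H5OH")
print(f"(b) 95% CI = {x_bar:.3f} ± {half_b:.3f} % C2H5OH")
```

The interval in (a) is roughly twice as wide as in (b), which is the cost of estimating the precision from only three results.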
Rejection region:
The rejection region consists of all the values of the
test statistic for which H0 will be rejected.
The null hypothesis is rejected if the test statistic lies
within the rejection region.
Test statistics:
Large sample: z statistic test
Small sample: t statistic test
The procedure:
State the null hypothesis: H0: μ = μ0
Form the test statistic: z = (x̄ − μ0)/(σ/√N)
State the alternative hypothesis, Ha, and determine
the rejection region:
For Ha: μ ≠ μ0, reject H0 if z ≥ zcrit or if z ≤ −zcrit
For Ha: μ > μ0, reject H0 if z ≥ zcrit
For Ha: μ < μ0, reject H0 if z ≤ −zcrit
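The three rejection regions can be written as a small decision function. This is a sketch only: the name `rejects_h0` and the labels "two-sided", "greater", and "less" are ours, and zcrit is computed from the standard normal distribution for a significance level alpha (0.05 corresponds to the 95% confidence level).

```python
from statistics import NormalDist


def rejects_h0(z, alpha=0.05, alternative="two-sided"):
    """Apply the rejection regions of a large-sample z test."""
    std_normal = NormalDist()
    if alternative == "two-sided":      # Ha: mu != mu0
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        return z >= z_crit or z <= -z_crit
    if alternative == "greater":        # Ha: mu > mu0
        return z >= std_normal.inv_cdf(1 - alpha)
    if alternative == "less":           # Ha: mu < mu0
        return z <= -std_normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


print(rejects_h0(2.10))                         # two-tailed test
print(rejects_h0(1.80))                         # two-tailed test
print(rejects_h0(1.80, alternative="greater"))  # one-tailed test
```

The same z value of 1.80 is judged differently by the two-tailed and one-tailed tests because the one-tailed critical value is smaller.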
EXAMPLE 7-4
A class of 30 students determined the
activation energy of a chemical reaction to
be 27.7 kcal/mol (mean value) with a
standard deviation of 5.2 kcal/mol. Are the
data in agreement with the literature value
of 30.8 kcal/mol at (1) the 95% confidence
level and (2) the 99% confidence level?
Estimate the probability of obtaining a
mean equal to the literature value.
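A possible working of this example is sketched below. It assumes that 30 results count as a large sample, so that s can stand in for σ in the z statistic, and it reads the requested probability as the two-tailed probability of a deviation at least this large arising from random error alone. Both readings are our assumptions.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0 = 27.7, 30.8   # kcal/mol: class mean and literature value
s, n = 5.2, 30            # n = 30 treated as a large sample, so s stands in for sigma

z = (x_bar - mu0) / (s / sqrt(n))

std_normal = NormalDist()
for level in (0.95, 0.99):
    z_crit = std_normal.inv_cdf(0.5 + level / 2)   # two-tailed critical value
    verdict = "reject H0" if abs(z) >= z_crit else "cannot reject H0"
    print(f"{level:.0%} level: z = {z:.2f}, z_crit = {z_crit:.2f} -> {verdict}")

# Two-tailed probability of a deviation at least this large from random error alone.
p_value = 2 * (1 - std_normal.cdf(abs(z)))
print(f"p = {p_value:.4f}")
```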
Figure 7-3
Illustration of systematic error
in an analytical method. Curve
A is the frequency distribution
for the accepted value by a
method without bias. Curve B
illustrates the frequency
distribution of results by a
method that could have a
significant bias.
EXAMPLE 7-5
A new procedure for the rapid determination of the
percentage of sulfur in kerosenes was tested on a
sample known from its method of preparation to contain
0.123% S (μ0 = 0.123%). The results were % S = 0.112,
0.118, 0.115, and 0.119. Do the data indicate that there
is a bias in the method at the 95% confidence level?
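The test for bias can be sketched as a one-sample t test of H0: μ = μ0. The critical value 3.18 (95% level, 3 degrees of freedom) is the usual table value and is assumed here. The script keeps full precision for s, so its t differs slightly from a hand calculation that rounds s to 0.0032 first.

```python
from math import sqrt
from statistics import mean, stdev

results = [0.112, 0.118, 0.115, 0.119]  # % S, from Example 7-5
mu0 = 0.123                             # known sulfur content, % S

n = len(results)
x_bar = mean(results)
s = stdev(results)
t = (x_bar - mu0) / (s / sqrt(n))

T_CRIT = 3.18  # assumed table value: 95% level, 3 degrees of freedom
print(f"mean = {x_bar:.3f}, s = {s:.4f}, t = {t:.2f}")
print("bias indicated" if abs(t) >= T_CRIT else "no bias demonstrated")
```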
Excel functions
TDIST(x, deg_freedom, tails), where x is the test value of t
TDIST(4.375, 3, 2) = 0.022
It is only 2.2% probable to get a value this large
because of random error.
TINV(probability, deg_freedom)
TINV(0.05, 3) = 3.1824
The critical value of t for the 95%
confidence level and 3 degrees of freedom
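For readers without Excel, the two spreadsheet results can be approximated with the standard library alone. The sketch below is not how Excel computes them. It integrates the Student's t density numerically with Simpson's rule and inverts the result by bisection. The function names are ours.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Probability density of Student's t with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Like TDIST(t, df, 2): area in both tails beyond |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # area between -|t| and +|t|
    return 1 - central_area


def t_critical(probability, df):
    """Like TINV(probability, df): t whose two-tailed area equals probability."""
    low, high = 0.0, 100.0
    for _ in range(60):                # bisection; the tail area falls as t grows
        mid = (low + high) / 2
        if two_tailed_p(mid, df) > probability:
            low = mid
        else:
            high = mid
    return (low + high) / 2


print(f"two-tailed p for t = 4.375, 3 df: {two_tailed_p(4.375, 3):.3f}")
print(f"critical t for p = 0.05, 3 df: {t_critical(0.05, 3):.4f}")
```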
Comparison of two experimental means, x̄1 and x̄2:
t = (x̄1 − x̄2) / (spooled √((N1 + N2)/(N1 N2)))
EXAMPLE 7-6
Two barrels of wine were analyzed for their
alcohol content to determine whether they were
from different sources. On the basis of six
analyses, the average content of the first barrel
was established to be 12.61% ethanol. Four
analyses of the second barrel gave a mean of
12.53% alcohol. The 10 analyses yielded a
pooled standard deviation spooled of 0.070%. Do
the data indicate a difference between the wines?
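The comparison can be sketched with the pooled two-sample t statistic, t = (x̄1 − x̄2)/(spooled √((N1 + N2)/(N1 N2))), with N1 + N2 − 2 degrees of freedom. The critical value 2.31 (95% level, 8 degrees of freedom) is a standard table value assumed here.

```python
from math import sqrt

x1, n1 = 12.61, 6      # first barrel: mean % ethanol, number of analyses
x2, n2 = 12.53, 4      # second barrel
s_pooled = 0.070       # pooled standard deviation, % ethanol

t = (x1 - x2) / (s_pooled * sqrt((n1 + n2) / (n1 * n2)))
df = n1 + n2 - 2

T_CRIT = 2.31  # assumed table value: 95% level, 8 degrees of freedom
print(f"t = {t:.2f} with {df} degrees of freedom")
print("difference indicated" if abs(t) >= T_CRIT else "no difference demonstrated")
```

Failing to reject H0 here does not prove that the wines come from the same source. It only means that these data do not demonstrate a difference.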
Paired Data
Scientists and engineers often make use
of pairs of measurements on the same
sample to minimize sources of variability
that are not of interest.
Ex: using two different methods to evaluate
two different samples introduces
variability from the different samples.
A better way would be to use both methods on the
same samples and to focus on the differences.
The paired t test uses the same type of
procedure as the normal t test except that we
analyze pairs of data.
Our null hypothesis is H0: μd = Δ0, where Δ0 is a
specific value of the difference to be tested, often zero.
The alternative hypothesis could be μd ≠ Δ0, μd < Δ0, or μd > Δ0.
EXAMPLE 7-7
A new automated procedure for determining
glucose in serum (Method A) is to be compared
with the established method (Method B). Both
methods are performed on serum from the same
six patients to eliminate patient-to-patient
variability. Do the following results confirm a
difference in the two methods at the 95%
confidence level?
Hypotheses:
If μd is the true average difference between the methods, the null
hypothesis is H0: μd = 0 and the alternative hypothesis is Ha: μd ≠ 0.
The test statistic is t = d̄/(sd/√N), where
d̄ = Σdi/N = (16 + 9 + 25 + 5 + 22 + 11)/6 = 14.67
and sd = 7.76 is the standard deviation of the differences, so t = 14.67/(7.76/√6) = 4.63.
From Table 7-3, the critical value of t is 2.57 for the 95% confidence
level and 5 degrees of freedom.
Since t > tcrit, we reject the null hypothesis and conclude that the two
methods give different results.
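The paired calculation can be reproduced from the six differences alone. The sketch below assumes each difference is Method A minus Method B in mg/L and tests H0: μd = 0 with t = d̄/(sd/√N). The variable names are ours.

```python
from math import sqrt
from statistics import mean, stdev

# Differences in mg/L for the six patients, taken from the mean-difference calculation.
d = [16, 9, 25, 5, 22, 11]

n = len(d)
d_bar = mean(d)
s_d = stdev(d)                 # standard deviation of the differences
t = d_bar / (s_d / sqrt(n))    # tests H0: mean difference = 0

T_CRIT = 2.57  # 95% level, 5 degrees of freedom (Table 7-3)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.2f}")
print("methods differ" if abs(t) >= T_CRIT else "no difference demonstrated")
```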
NOTE:
If we merely average the results of Method A
(836.0 mg/L) and the results of Method B
(821.3 mg/L), the large patient-to-patient
variation in glucose level gives us large
values for sA (146.5) and sB (142.7).
A comparison of means gives us a t value of
0.176, and we would accept the null
hypothesis!
Hence, the large patient-to-patient variability
masks the method differences that are of
interest. Pairing allows us to focus on the
differences.
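The masking effect can be checked numerically. The sketch below treats the two methods as independent sets of six results each and pools their standard deviations, which is our reading of the "comparison of means" in the note.

```python
from math import sqrt

mean_a, s_a = 836.0, 146.5   # Method A, mg/L
mean_b, s_b = 821.3, 142.7   # Method B, mg/L
n = 6                        # patients per method

# Pooled standard deviation for two sets of equal size.
s_pooled = sqrt(((n - 1) * s_a**2 + (n - 1) * s_b**2) / (2 * n - 2))
t = (mean_a - mean_b) / (s_pooled * sqrt((n + n) / (n * n)))
print(f"s_pooled = {s_pooled:.1f}, t = {t:.3f}")
```

The pooled standard deviation is about ten times the mean difference between the methods, so the unpaired t value is far too small to show any difference.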
Type II error:
We accept H0 when it is false.
The probability of a type II error is given the symbol β.
Comparison of Precision
To compare the variances (or
standard deviations) of two populations:
The F test can be used, for example to check the equal-variance assumption behind the pooled t test,
provided that the populations follow
the normal (Gaussian) distribution.
The F test is also used in comparing more
than two means and in linear regression
analysis.
F test
Defined as the ratio of the two sample
variances, F = s1²/s2².
Calculated and compared with the critical
value of F at the desired significance level.
The null hypothesis is that the two population
variances under consideration are equal,
H0: σ1² = σ2².
The null hypothesis is rejected if the test
statistic differs too much from 1.
Critical values of F at the 0.05 significance
level are shown in Table 7-4.
EXAMPLE 7-8
A standard method for the determination of the
carbon monoxide (CO) level in gaseous
mixtures is known from many hundreds of
measurements to have a standard deviation of
0.21 ppm CO. A modification of the method
yields a value for s of 0.15 ppm CO for a pooled
data set with 12 degrees of freedom. A second
modification, also based on 12 degrees of
freedom, has a standard deviation of 0.12 ppm
CO. Is either modification significantly more
precise than the original?
F1 = (0.21)²/(0.15)² = 1.96 < Fcrit:
We cannot reject the null hypothesis.
There is no demonstrated improvement in precision.
F2 = (0.21)²/(0.12)² = 3.06 > Fcrit:
We reject the null hypothesis.
The second modification does appear to give better precision at
the 95% confidence level.
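The two F ratios can be recomputed as below. Because the question is whether a modification is more precise, this is a one-tailed test with the standard method's variance in the numerator. The critical value 2.30 (0.05 level, infinite numerator and 12 denominator degrees of freedom) is a table value assumed here, not quoted in the text.

```python
s_standard = 0.21   # ppm CO, standard method (many hundreds of measurements)
s_mod1 = 0.15       # ppm CO, first modification, 12 degrees of freedom
s_mod2 = 0.12       # ppm CO, second modification, 12 degrees of freedom

# One-tailed test for improved precision: the standard-method
# variance goes in the numerator.
f1 = s_standard**2 / s_mod1**2
f2 = s_standard**2 / s_mod2**2

F_CRIT = 2.30  # assumed table value: 0.05 level, infinite and 12 degrees of freedom
for name, f in (("modification 1", f1), ("modification 2", f2)):
    verdict = "more precise" if f > F_CRIT else "no improvement demonstrated"
    print(f"{name}: F = {f:.2f} -> {verdict}")
```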
ANOVA Concepts
Detect differences in several population
means by comparing the variances:
For comparing I population means, μ1,
μ2, μ3, …, μI,
the null hypothesis H0 is
H0: μ1 = μ2 = μ3 = … = μI
and the alternative hypothesis Ha is
Ha: at least two of the μi's are different
Typical applications of ANOVA:
Is there a difference in the results of five analysts
determining calcium by a volumetric method?
Will four different solvent compositions have
differing influences on the yield of a chemical
synthesis?
Are the results of manganese determinations by
three different analytical methods different?
Are there any differences in the fluorescence of a
complex ion at six different values of pH?
Single-Factor ANOVA
For I populations
the sample means are: x̄1, x̄2, x̄3, …, x̄I
the sample variances are: s1², s2², s3², …, sI²
The grand average (i.e., the average of all the data):
x̿ = (N1/N)x̄1 + (N2/N)x̄2 + … + (NI/N)x̄I (weighted average)
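To show how these quantities combine, the sketch below computes the grand average as a weighted average of the sample means and then forms the F ratio of the between-group to within-group mean squares. The data are hypothetical, chosen only so the arithmetic is easy to follow. The names `ssf`, `sse`, `msf`, and `mse` (sums of squares and mean squares for the factor and for error) are our labels.

```python
from statistics import mean, variance

# Hypothetical data for illustration only: three populations, three results each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]

n_total = sum(len(g) for g in groups)
group_means = [mean(g) for g in groups]
group_vars = [variance(g) for g in groups]   # sample variances

# Grand average as the weighted average of the sample means.
grand_mean = sum(len(g) / n_total * m for g, m in zip(groups, group_means))

# Sum of squares due to the factor (between groups) and due to error (within groups).
ssf = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((len(g) - 1) * v for g, v in zip(groups, group_vars))

msf = ssf / (len(groups) - 1)        # mean square, I - 1 degrees of freedom
mse = sse / (n_total - len(groups))  # mean square, N - I degrees of freedom
f_value = msf / mse
print(f"grand mean = {grand_mean:.3f}, F = {f_value:.2f}")
```

A large F means the sample means differ by more than the scatter within each set would explain. It would be compared with a critical F for I − 1 and N − I degrees of freedom.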
ANOVA Table
EXAMPLE 7-9
EXAMPLE 7-10
The Q Test
A simple, widely used statistical test for deciding whether
a suspected result should be retained or rejected.
The absolute value of the difference between the
questionable result xq and its nearest neighbor xn is
divided by the spread w of the entire set:
Q = |xq − xn| / w
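A minimal sketch of the Q calculation follows, applied to the CaO percentages of Example 7-11. The function name `q_statistic` is ours. The critical value 0.710 (95% confidence level, five observations) is a commonly tabulated value assumed here, not quoted in the text.

```python
def q_statistic(values, suspect):
    """Return Q = |x_q - x_n| / w for the suspect value x_q."""
    ordered = sorted(values)
    spread = ordered[-1] - ordered[0]                       # w
    others = list(ordered)
    others.remove(suspect)
    nearest = min(others, key=lambda x: abs(x - suspect))   # x_n
    return abs(suspect - nearest) / spread


# CaO percentages from Example 7-11; 56.23 is the questionable result.
cao = [55.95, 56.00, 56.04, 56.08, 56.23]
q = q_statistic(cao, 56.23)

Q_CRIT = 0.710  # assumed table value: 95% level, five observations
print(f"Q = {q:.2f}")
print("reject" if q > Q_CRIT else "retain")
```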
EXAMPLE 7-11
The analysis of a calcite sample yielded CaO
percentages of 55.95, 56.00, 56.04, 56.08, and 56.23.
The last value appears anomalous; should it be retained
or rejected at the 95% confidence level?
The difference between 56.23 and 56.08 is 0.15%. The
spread (56.23 − 55.95) is 0.28%. Thus,