
02 - Statistical Analysis - Chem32 PDF


9/1/2016

Overview
accuracy and precision
errors in chemical analysis
measures of location (measures of central tendency)
measures of dispersion
confidence interval of the mean
outliers
  Q-test, Grubbs test
significance testing
  t-test, F-test

Statistical Treatment
of Data

Chem 32
1st Sem 2016-2017

Accuracy vs. Precision

Accuracy
the closeness of a measurement to its true or accepted value

Precision
the agreement between multiple measurements made in the same way

Precision

Repeatability: replicate measurements done on the same sample using the same conditions (same measurement procedure, same operators, same measuring system, same operating conditions and same location) over a short period of time

Reproducibility: different locations (different laboratories), different operators, different measuring systems, different time

Precision

Intermediate precision: replicate measurements on the same or similar samples employing the same measurement procedure, same location over an extended period of time, but may include other conditions involving changes

Absolute and relative error

absolute error (E) in the measurement of quantity x
$E = x_i - x_t$
$x_t$ = true or accepted value

relative error (Er) (expressed as percent)
$E_r = \frac{x_i - x_t}{x_t} \times 100$

Types of Error

systematic (or determinate) error
- affect the accuracy of results

random (or indeterminate) error
- affect the precision of results

gross error
- lead to outliers

Sample problem

An analyst found 2.62 g calcium in a sample which actually contains 2.52 g calcium. Calculate the (a) absolute error and (b) relative error in percent
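The calcium sample problem can be checked with a short script. This is a minimal sketch, not part of the original slides; the function names are made up for illustration, and the sign convention (measured minus true) follows the definition of absolute error in these notes.

```python
def absolute_error(measured, true_value):
    """Absolute error E = measured value - true (accepted) value."""
    return measured - true_value


def relative_error_percent(measured, true_value):
    """Relative error Er = E / true value, expressed in percent."""
    return absolute_error(measured, true_value) / true_value * 100


# Calcium sample problem: 2.62 g found, 2.52 g actually present
E = absolute_error(2.62, 2.52)
Er = relative_error_percent(2.62, 2.52)
print(f"E = {E:.2f} g, Er = {Er:.1f} %")  # E = 0.10 g, Er = 4.0 %
```

The positive sign indicates that the analyst's result is high.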

Sources of systematic errors

instrumental error
- caused by nonideal instrument behavior, by faulty calibrations, or by use under inappropriate conditions
- usually corrected by calibration

method error
- arises from nonideal chemical or physical behavior of analytical systems
  e.g. interferences
       slowness of reactions
       incomplete reaction
       species instability
       nonspecificity of reagents
       side reaction

Sources of systematic errors

personal error
- results from personal limitations of the analyst
  e.g. insensitivity to color changes
       tendency to estimate scale readings to improve precision

Effect of systematic error

constant error
- the magnitude of error does not depend on the size of the quantity measured

Effect of systematic error

proportional error
- it may increase or decrease in proportion to the size of the sample taken for analysis
  Ex. the concentration of interfering species increases as the concentration of an analyzed substance increases

Detection of systematic errors

blanks
analysis of a standard (reference material)
analysis by a second, independent method

Random errors
- represent random fluctuations in procedures and measuring devices (including the human observer) that are beyond the control of the analyst
    near the performance limit of the instrument
    instrument noise
    drift in electronic circuit
    vibrations
    temperature, etc.

Type of error       Qualitative description   Quantitative measure
systematic error    trueness                  bias
(total) error       accuracy                  uncertainty
random error        precision                 measures of spread (s, RSD) under specified conditions

Illustration of the links between some fundamental concepts used to describe the quality of measurement results

Absolute and relative uncertainty

absolute uncertainty: margin of uncertainty associated with a measurement
Ex. If a buret is calibrated to read within ±0.02 mL, the absolute uncertainty for measuring 12.35 mL is ±0.02 mL

relative uncertainty: compares the size of the absolute uncertainty with the size of its associated measurement
Ex. For a buret reading of 12.35 ± 0.02 mL, the relative uncertainty is:

$\text{Relative uncertainty} = \frac{\text{Absolute uncertainty}}{\text{Measured value}} = \frac{0.02\ \text{mL}}{12.35\ \text{mL}} = 0.002$

Properties of the normal distribution

[Figure: normal distribution curve; 68.3% of the values lie within ±1σ of the mean, 95.4% within ±2σ and 99.7% within ±3σ]

Measures of central tendency

Arithmetic mean
Median
Mode
- value that occurs most frequently

Measures of dispersion

Range, R
Deviation, d
Standard deviation
Relative average deviation, in %
Relative standard deviation
Variance, V

RSD expressed as parts per hundred (pph) or %: coefficient of variation (CV)

Sample problem

An analyst reported the following amount (mg/L) of Cl- in a given sample:
15.67
15.69
16.03
Calculate the mean, range, standard deviation, coefficient of variation.

Sample problem

A student found the following values for the copper content (g/100 g) in an ore sample.
39.17, 39.99, 39.21, 39.54, 39.43 and 37.72
Calculate the mean, average deviation and standard deviation
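The measures of central tendency and dispersion listed above can be computed for the chloride data with Python's standard library. This is a minimal sketch: the `describe` helper is a hypothetical name, the standard deviation is assumed to be the sample form with N - 1 in the denominator, and the average deviation is taken as the mean of the absolute deviations from the mean.

```python
import statistics


def describe(values):
    """Return common measures of location and spread for replicate results."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1)
    average_deviation = sum(abs(x - mean) for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "average_deviation": average_deviation,
        "relative_average_deviation_percent": average_deviation / mean * 100,
        "std_dev": s,
        "variance": s ** 2,
        "cv_percent": s / mean * 100,  # RSD expressed in %
    }


chloride = [15.67, 15.69, 16.03]  # mg/L, from the sample problem
summary = describe(chloride)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The same helper can be applied to the copper data by passing a different list.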

Confidence interval

CI for the mean is the range of values within which the population mean is expected to lie with a certain probability

Confidence level is the probability that the true mean lies within a certain interval and is often expressed as a percentage

If $\sigma$ is unknown and s can be calculated:

$\text{CI for } \mu = \bar{x} \pm \frac{t s}{\sqrt{N}}$

$\bar{x}$ = sample mean
t = Student's t, taken from the table

Sample problem
Analysis of an insecticide gave the following values for % of the chemical lindane:
7.47
6.98
7.27
Calculate the CI for the mean value at the 90% confidence level.

90% CI for $\mu$ = 7.24 ± 0.42

Sample problem
Determination of the cadmium level of a blood sample by ion-selective measurement gave the following results (mq/L).
139.2
139.8
140.1
139.4
Determine the 95% confidence interval of the mean for the cadmium concentration measurements.
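The lindane answer (7.24 ± 0.42) can be reproduced with a short script. This is a sketch under stated assumptions: the interval is taken as mean ± t·s/√N, the small `T_90` and `T_95` dictionaries hold two-tailed Student's t values copied from a standard table (they stand in for the table referred to in the slides), and the function name is hypothetical.

```python
import math
import statistics

# Two-tailed Student's t values keyed by degrees of freedom (N - 1).
# Assumed values from a standard t table; extend as needed.
T_90 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132}
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def t_confidence_interval(values, t_table):
    """Return (mean, half-width) of the confidence interval mean ± t*s/sqrt(N)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    half_width = t_table[n - 1] * s / math.sqrt(n)
    return mean, half_width


lindane = [7.47, 6.98, 7.27]
mean_lindane, hw_lindane = t_confidence_interval(lindane, T_90)
print(f"90% CI: {mean_lindane:.2f} ± {hw_lindane:.2f}")  # 90% CI: 7.24 ± 0.42

cadmium = [139.2, 139.8, 140.1, 139.4]
mean_cd, hw_cd = t_confidence_interval(cadmium, T_95)
print(f"95% CI: {mean_cd:.1f} ± {hw_cd:.1f}")
```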

If $\sigma$ is known:

$\text{CI for } \mu = \bar{x} \pm \frac{z \sigma}{\sqrt{N}}$

Sample Problem

Atomic absorption analysis for copper concentration in aircraft engine oil gave a value of 8.53 μg Cu/mL. Pooled results of many analyses showed s = 0.32 μg Cu/mL. Calculate 90% and 99% confidence limits if the above result were based on (a) 1, (b) 4, (c) 16 measurements.

[Table: 90% and 99% confidence limits for N = 1, 4 and 16]

At N = 16: There is a 90% chance that the true mean lies within the range 8.53 ± 0.13 (8.40 to 8.66)

How many replicate measurements are needed to decrease the 90% confidence interval to 8.53 ± 0.05?
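The confidence limits for the copper problem follow directly from mean ± z·σ/√N. The sketch below assumes the pooled s = 0.32 is a good estimate of σ and uses z = 1.64 (90%) and z = 2.58 (99%); the function name is illustrative only.

```python
import math

Z = {90: 1.64, 99: 2.58}  # z values for the two confidence levels


def z_half_width(sigma, n, z):
    """Half-width of the confidence interval when sigma is known."""
    return z * sigma / math.sqrt(n)


mean_cu = 8.53   # reported copper result
sigma_cu = 0.32  # pooled standard deviation, same units as the mean

limits = {}
for level, z in Z.items():
    for n in (1, 4, 16):
        limits[(level, n)] = z_half_width(sigma_cu, n, z)
        print(f"{level}% CL, N = {n:2d}: {mean_cu} ± {limits[(level, n)]:.2f}")
```

For N = 16 at the 90% level this gives ± 0.13, matching the statement in the notes.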

Seatwork
sample mean = 13.77
N = 30
$\sigma$ = 5.88
Use the 95% confidence level to calculate the confidence limits.
Calculate the number of replicates needed to decrease the confidence interval to 13.77 ± 0.50

Conf. Level, %    z
50                0.67
68                1.00
80                1.28
90                1.64
95                1.96
95.4              2.00
99                2.58
99.7              3.00
99.9              3.29

Sample Problem:

The population standard deviation for the amount of aspirin in a batch of analgesic tablets is known to be 7 mg of aspirin. What is the 95% confidence interval for the analgesic tablets if an analysis of five tablets yields a mean of 245 mg of aspirin?

There is a 95% probability that the population mean is between 239 mg and 251 mg of aspirin

How many replicate measurements are needed to decrease the 95% confidence interval to 245 ± 3?

Ans.: 21

Test for outliers

- to show whether the outlying value/s could reasonably arise from chance variation or are so extreme as to indicate some other causes
- to provide objective criteria for taking investigative or corrective action

Dixon's Q-test - for small data set
Grubbs test - ISO recommended
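The number of replicates needed for a target interval can be obtained by solving z·σ/√N ≤ (target half-width) for N. A minimal sketch, assuming σ is known and rounding up to the next whole measurement; the function name is hypothetical.

```python
import math


def replicates_needed(sigma, z, target_half_width):
    """Smallest N for which z*sigma/sqrt(N) does not exceed the target."""
    return math.ceil((z * sigma / target_half_width) ** 2)


# Aspirin problem: sigma = 7 mg, 95% level (z = 1.96), target 245 ± 3 mg
n_aspirin = replicates_needed(7, 1.96, 3)
print(n_aspirin)  # 21

# Copper problem: sigma = 0.32, 90% level (z = 1.64), target 8.53 ± 0.05
n_copper = replicates_needed(0.32, 1.64, 0.05)
print(n_copper)  # 111
```

The aspirin result agrees with the answer of 21 given in the notes.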

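Q-test

$Q_{calc} = \frac{|\text{suspect value} - \text{nearest value}|}{\text{largest value} - \text{smallest value}}$

If Qcalc > Qcritical : suspect value is rejected

Table 3. Qcritical values for Q-test for outliers

Grubbs test

$G_{calc} = \frac{|\text{suspect value} - \bar{x}|}{s}$

If Gcalc > Gcritical : suspect value is rejected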

Table 4. Gcritical values for Grubbs test for outliers

Sample problem

Apply Q-test and Grubbs test (at 90% confidence level) to the following data and decide whether a value should be rejected
0.403
0.410
0.401
0.380
0.400
0.413
0.408
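Both outlier statistics for this data set can be computed as follows. This is a sketch under assumptions: Q is taken as gap/range and G as |suspect - mean|/s, the usual textbook forms. The Q critical value 0.507 (N = 7, 90%) is copied from a commonly tabulated Q table and should be checked against Table 3. No Grubbs critical value is hard-coded; read it from Table 4.

```python
import statistics


def dixon_q(values):
    """Return (Q_calc, suspect) for whichever end of the data is more isolated."""
    data = sorted(values)
    spread = data[-1] - data[0]
    gap_low = data[1] - data[0]
    gap_high = data[-1] - data[-2]
    if gap_low >= gap_high:
        return gap_low / spread, data[0]
    return gap_high / spread, data[-1]


def grubbs_g(values):
    """Return (G_calc, suspect) for the value farthest from the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    suspect = max(values, key=lambda x: abs(x - mean))
    return abs(suspect - mean) / s, suspect


data = [0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.408]

q_calc, q_suspect = dixon_q(data)
g_calc, g_suspect = grubbs_g(data)

Q_CRIT_90_N7 = 0.507  # assumed table value; verify against Table 3
print(f"Q_calc = {q_calc:.3f} for {q_suspect}; reject: {q_calc > Q_CRIT_90_N7}")
print(f"G_calc = {g_calc:.3f} for {g_suspect}; compare with Gcritical from Table 4")
```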

Sample problem

A student found the following values for the % iron content of an ore sample:
39.17, 39.99, 39.21, 39.54, 39.43, 38.72
Which is the most probable outlier? Can you legitimately discard it?

Sample problem

You titrate 0.1000 N HCl vs. NaOH and find that for 4 trials the N of the NaOH is:
0.0968, 0.0979, 0.1020 and 0.0985
Should you drop the value of 0.1020 as an outlier? [use Grubbs test at 95% CL]

Cautious approach to rejection

1. Reexamine the data to see whether a gross error has been made
   - importance of properly kept lab notebook
2. If possible, study the precision of the procedure
3. Repeat the analysis
4. Apply Q-test
5. If tests indicate retention, also report median and range (where there is no influence from outlying result)

Significance testing
t-test
- testing for significant difference between
  (1) a mean and a reference value
  (2) two data sets (difference of means)
  (3) pairs of measurements
F-test
- testing for significant difference between the spreads of two data sets (difference of s)

Null and alternative hypothesis

Null hypothesis
H0: $\mu_A = \mu_B$   The means are equal

Alternative hypothesis
Ha: $\mu_A \neq \mu_B$   The means are not equal (two-tailed test)
Ha: $\mu_A > \mu_B$   Mean A is greater than mean B (one-tailed test)
Ha: $\mu_A < \mu_B$   Mean A is less than mean B (one-tailed test)

Steps in significance testing

1. State the null hypothesis
2. State the alternative hypothesis
3. Select the appropriate test
4. Choose the level of significance for the test
5. Calculate the test statistic
6. Obtain the critical value for the test
7. Compare the test statistic with the critical value

t-test: comparison of experimental mean with a reference value (one-sample t-test)

To determine whether the difference between the experimental mean and the accepted value is due to random error or to an actual systematic error

Null hypothesis
H0: $\mu = \mu_0$

Test statistic
$t_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

$\bar{x}$ = sample mean
$\mu_0$ = reference value or stated value
s = sample standard deviation
n = sample size

Alternative hypothesis and rejection region
If Ha: $\mu \neq \mu_0$, reject H0 if tcalc ≥ ttab or tcalc ≤ -ttab
If Ha: $\mu > \mu_0$, reject H0 if tcalc ≥ ttab
If Ha: $\mu < \mu_0$, reject H0 if tcalc ≤ -ttab

One-sided/two-sided probabilities
[Figure: t distributions showing the rejection regions for
a) one-tailed test with a significance level of 0.05 and 3 degrees of freedom
b) two-tailed test with a significance level of 0.05 and 3 degrees of freedom]

Sample problem
The following results (%K) were obtained from the AAS analysis of a standard reference material containing 38.90% K:
38.92
37.40
37.11
Is there a significant difference (at 95% CL) between the mean of the results and the certified value?
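The potassium problem can be worked through as a one-sample t-test. This is a minimal sketch: the statistic is assumed to be the signed form (mean - reference)/(s/√n), the critical value 4.303 is the two-tailed Student's t for 2 degrees of freedom at 95% from a standard table, and the function name is hypothetical.

```python
import math
import statistics


def one_sample_t(values, reference):
    """Signed t statistic for comparing a sample mean with a reference value."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return (mean - reference) / (s / math.sqrt(n))


potassium = [38.92, 37.40, 37.11]  # %K found for the reference material
t_calc = one_sample_t(potassium, 38.90)

T_TAB = 4.303  # two-tailed, 95%, 2 degrees of freedom (assumed table value)
significant = abs(t_calc) >= T_TAB
print(f"t_calc = {t_calc:.2f}, significant difference: {significant}")
```

Because |tcalc| is below ttab, the null hypothesis is retained for this data set: the difference from the certified value can be attributed to random error.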

Sample problem
A new procedure for the rapid
determination of sulfur in kerosene was
tested on a sample known to contain
0.123% S. The results were
0.112, 0.118, 0.115 and 0.119 % S
Do the data indicate that there is a bias
in the method?

Sample problem
Analysis of five replicates of a vitamin
preparation known to contain 500.0
mg of vit C gives 502.0, 500.0,
505.0, 501.0 and 504.0 mg. Is the
difference between the experimental
mean and the true value due to
random error or is there a determinate
error in the method?

Sample problem
The manufacturer claims that the mean fat content of his burger is around 20%. Shown below is the result of fat analysis for his sample. Was the manufacturer's claim true at 90% confidence level?

t-test: comparison of two experimental means

Steps:
(1) Formulate a null hypothesis that the two means are identical
    H0: $\mu_a = \mu_b$
(2) Perform t-test
    a. calculate tcalc (note: sa & sb NOT significantly different [F-test first])
    b. look for ttab (deg. of freedom = na + nb - 2)
(3) Compare tcalc and ttab:
    a. if tcalc > ttab: No to H0
    b. if tcalc < ttab: Yes to H0

F-test: comparison of two standard dev.

* establishes if there is a significant difference between standard deviations (or variances)
* answers the question: are the spreads different, i.e. do the two sets of data come from two separate populations?

Two forms:
1. Is the precision of procedure A higher than the precision of procedure B (a one-tailed test)?
2. Is the precision of procedure A significantly different from the precision of procedure B (a two-tailed test)?

(1) Calculate Fcalc
    $F_{calc} = \frac{s_1^2}{s_2^2}$ where s1 > s2
(2) Look for Ftab
(3) Compare Fcalc and Ftab
    a. if Fcalc > Ftab: sa and sb are significantly different
    b. if Fcalc < Ftab: sa and sb are NOT significantly different

[Table: F values at the 95% probability level]
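The F-test steps can be sketched in a few lines, here using the standard deviations from the chemical oxygen demand problem in these notes (3.31 and 1.51 mg/L, 8 determinations each). The critical value 3.79 is assumed from a standard one-tailed F table at the 95% level for 7 and 7 degrees of freedom; check it against the table used in class. The function name is illustrative.

```python
def f_statistic(s_a, s_b):
    """F = larger variance / smaller variance, so that F >= 1."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return larger ** 2 / smaller ** 2


s_standard = 3.31  # mg/L, standard (mercury salt) method
s_proposed = 1.51  # mg/L, proposed method

f_calc = f_statistic(s_standard, s_proposed)

F_TAB = 3.79  # assumed: one-tailed, 95%, 7 and 7 degrees of freedom
print(f"F_calc = {f_calc:.2f}, significantly different: {f_calc > F_TAB}")
```

Since Fcalc exceeds Ftab here, the two standard deviations are judged significantly different at this level.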

Sample problem

A proposed method for the determination of chemical oxygen demand of wastewater was compared with the standard (mercury salt) method. The following results were obtained for a sewage effluent sample:

                   Mean (mg/L)   std dev (mg/L)
Standard method    72            3.31
Proposed method    72            1.51

For each method 8 determinations were made. Is the precision of the proposed method significantly greater than that of the standard method?

Sample problem

The amount of 14CO2 in a plant sample is measured to be:
28, 32, 27, 39 & 40 counts/min (mean = 33.2).
The amount of radioactivity in a blank is found to be:
28, 21, 28, & 20 counts/min (mean = 24.2).
Are the mean values significantly different at a 95% confidence level?

Sample problem

Before determining the amount of Na2CO3 in an unknown sample, a student decided to check her procedure by analyzing a sample known to contain 98.76% w/w Na2CO3. Five replicate determinations of the %w/w Na2CO3 in the standard were made with the following results:
98.71, 98.59, 98.62, 98.44, 98.58
Is the mean for these five trials significantly different from the accepted value at the 95% confidence level ($\alpha$ = 0.05)?

Sample problem

A new colorimetric method for determining the glucose content of blood serum was compared with the standard Folin-Wu method. The results were as follows:
New method (mg/dL glucose): 127, 125, 123, 130, 131, 126, 129
Folin-Wu method (mg/dL glucose): 130, 128, 131, 129, 127, 125
Are the mean values significantly different at the 95% CL?

Sample problem

Two barrels of wine were analyzed for their alcohol content in order to determine whether they were from different sources. On the basis of six analyses, the average content of the first barrel was established to be 12.61% ethanol. Four analyses of the second barrel gave a mean of 12.53% alcohol. The ten analyses yielded a pooled value of s of 0.070%. Do the data indicate a difference between the wines?
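The wine problem is a comparison of two experimental means with a pooled standard deviation. The sketch below assumes the usual pooled form, t = |mean_a - mean_b| / (s_pooled·√(1/na + 1/nb)), which is not spelled out in the extracted slides, and a two-tailed critical value of 2.306 for 8 degrees of freedom at 95% taken from a standard t table. Function names are hypothetical.

```python
import math


def pooled_std(s_a, n_a, s_b, n_b):
    """Pooled standard deviation of two data sets with similar spreads."""
    return math.sqrt(((n_a - 1) * s_a ** 2 + (n_b - 1) * s_b ** 2) / (n_a + n_b - 2))


def two_sample_t(mean_a, n_a, mean_b, n_b, s_pooled):
    """t statistic for the difference between two experimental means."""
    return abs(mean_a - mean_b) / (s_pooled * math.sqrt(1 / n_a + 1 / n_b))


# Wine problem: the pooled s is given directly, so pooled_std is not needed here
t_calc = two_sample_t(12.61, 6, 12.53, 4, 0.070)

T_TAB = 2.306  # assumed: two-tailed, 95%, 6 + 4 - 2 = 8 degrees of freedom
print(f"t_calc = {t_calc:.2f}, means differ: {t_calc > T_TAB}")
```

At this level the data do not indicate a difference between the two barrels. When only the individual standard deviations are known, `pooled_std` supplies the `s_pooled` argument, after an F-test has shown the spreads are not significantly different.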

Sample problem

Method used for routine analysis gives s = 0.06. Modification of the method gives a pooled estimate of s = 0.04 for a statistical sample with 12 degrees of freedom. Has the modification improved the precision of the analysis?

Paired t-test

Uses the same type of procedure as the normal t-test except that we analyze pairs of data.
Procedure:
- Calculate the difference, di, between the paired values for each sample
- Calculate the average difference, $\bar{d}$, and the standard deviation of the differences, sd
H0: $\bar{d} = 0$ (there is no difference between the two samples)
Ha: $\bar{d} \neq 0$
$t_{calc} = \frac{|\bar{d}| \sqrt{n}}{s_d}$
n = no. of paired samples

Sample problem

Marecek et al. developed a new electrochemical method for rapidly determining the concentration of the antibiotic monensin in fermentation vats. The standard method for the analysis, a test for microbiological activity, is both difficult and time consuming. Samples were collected from the fermentation vats at various times during production and analyzed for the concentration of monensin using both methods. The results, in parts per thousand (ppt), are reported in the following table.
Is there a significant difference between the methods at $\alpha$ = 0.05?

Sample   Microbiological   Electrochemical
1        129.5             132.3
2        89.6              91.0
3        76.6              73.6
4        52.2              58.2
5        110.8             104.2
6        50.4              49.9
7        72.4              82.1
8        141.4             154.1
9        75.0              73.4
10       34.1              38.1
11       60.3              60.1
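The monensin comparison can be carried out as a paired t-test. A minimal sketch, assuming the statistic t = |d̄|·√n / sd and a two-tailed critical value of 2.228 for 10 degrees of freedom at α = 0.05 from a standard t table; the function name is illustrative.

```python
import math
import statistics

microbiological = [129.5, 89.6, 76.6, 52.2, 110.8, 50.4, 72.4, 141.4, 75.0, 34.1, 60.3]
electrochemical = [132.3, 91.0, 73.6, 58.2, 104.2, 49.9, 82.1, 154.1, 73.4, 38.1, 60.1]


def paired_t(a, b):
    """Return (t_calc, mean difference, std dev of differences) for paired data."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    return abs(d_bar) * math.sqrt(n) / s_d, d_bar, s_d


t_calc, d_bar, s_d = paired_t(electrochemical, microbiological)

T_TAB = 2.228  # assumed: two-tailed, alpha = 0.05, 11 - 1 = 10 degrees of freedom
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t_calc = {t_calc:.2f}")
print(f"significant difference: {t_calc > T_TAB}")
```

With tcalc below ttab, the null hypothesis of no difference between the two methods is retained.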

z-test

for a large no. of results so that s is a good estimate of $\sigma$

Null hypothesis
H0: $\mu = \mu_0$

Test statistic
$z_{calc} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{N}}$

Alternative hypothesis and rejection region
If Ha: $\mu \neq \mu_0$, reject H0 if zcalc ≥ ztab or zcalc ≤ -ztab
If Ha: $\mu > \mu_0$, reject H0 if zcalc ≥ ztab
If Ha: $\mu < \mu_0$, reject H0 if zcalc ≤ -ztab

Sample problem
A class of 30 students determined the
activation energy of a chemical reaction to
be 27.7 kcal/mol (mean value) with a
standard deviation of 5.2 kcal/mol. Are
the data in agreement with the literature
value of 30.8 kcal/mol at
(1) the 95% confidence level and
(2) the 99% confidence level?
Estimate the probability of obtaining a mean
equal to the literature value.
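
The activation energy problem can be set up as a two-tailed z-test. This sketch assumes that, with 30 results, s = 5.2 kcal/mol is a good estimate of σ, and it reads the final question as asking for the two-sided probability of a deviation at least this large under H0; the function names are hypothetical.

```python
import math


def z_statistic(mean, reference, sigma, n):
    """Signed z statistic for comparing a mean with a reference value."""
    return (mean - reference) / (sigma / math.sqrt(n))


def two_sided_p(z):
    """Probability of a deviation at least as large as |z| in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))


z_calc = z_statistic(27.7, 30.8, 5.2, 30)
p_value = two_sided_p(z_calc)

for level, z_tab in ((95, 1.96), (99, 2.58)):
    print(f"{level}% CL: z_tab = {z_tab}, reject H0: {abs(z_calc) >= z_tab}")
print(f"z_calc = {z_calc:.2f}, two-sided probability = {p_value:.4f}")
```

Under these assumptions |zcalc| exceeds ztab at both confidence levels, so the class mean does not agree with the literature value.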
