Lecture 1 -Data Analysis
Analytical Chemistry
“ANALYTICAL CHEMISTRY IS NOT SPECTROMETERS, POLAROGRAPHS, ELECTRON MICROPROBES, ETC. IT IS EXPERIMENTATION, OBSERVATION, DEVELOPING FACTS, AND DRAWING CONCLUSIONS.”
West PW, in Analytical Chemistry (46) 1974
ANALYTICAL METHODOLOGY
IDENTIFICATION
DETERMINATION or ASSAY
ANALYSIS
QUANTITATION
ANALYTE
VALIDATION
METHODS or PROTOCOLS
TECHNIQUES
SOLVING A PROBLEM
Two Types of Analysis
• Qualitative Analysis answers the question "what is it, in chemical terms?", i.e. it identifies what materials are present in a sample.
Examples include detecting metals in groundwater or fish: what metals are present?
Applications of Analytical Chemistry
The General Analytical Problem
Select sample
Extract analyte(s) from matrix
Separate analytes
Statistical data treatment
Errors & Statistical Data Treatment
Mean
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$
Median
The middle result when data are arranged according to increasing
or decreasing value.
Deviation from the mean: $d_i = |x_i - \bar{x}|$
Accuracy
Relative error: $E_r = \frac{x_i - x_t}{x_t} \times 100\%$
(latter is more useful in practice)
Illustration of “Mean” and “Median”
Results of 6 determinations of the Fe(III) content of a solution, known to
contain 20 ppm:
Illustrating the difference between “accuracy” and “precision”
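Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$
Alternative Expression for s (suitable for calculators)
$$s = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N - 1}}$$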
Note: NEVER round off figures before the end of the calculation
Standard Deviation of a Sample
Reproducibility of a method for determining the % of selenium in foods. 9 measurements were made on a single batch of brown rice.
Sample   Selenium content (µg/g) (xi)   xi²
1        0.07                           0.0049
2        0.07                           0.0049
3        0.08                           0.0064
4        0.07                           0.0049
5        0.07                           0.0049
6        0.08                           0.0064
7        0.08                           0.0064
8        0.09                           0.0081
9        0.08                           0.0064
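As a check on the selenium table above, the short sketch below computes the sample standard deviation both from the definition and from the calculator form. The variable names are illustrative only. No rounding is done until the values are printed, in line with the note on rounding.

```python
import math

# Selenium contents (xi) copied from the nine brown-rice measurements above.
x = [0.07, 0.07, 0.08, 0.07, 0.07, 0.08, 0.08, 0.09, 0.08]
n = len(x)
mean = sum(x) / n

# Definition: s = sqrt( sum((xi - mean)^2) / (N - 1) )
s_definition = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))

# Calculator form: s = sqrt( (sum(xi^2) - (sum(xi))^2 / N) / (N - 1) )
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
s_calculator = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(f"sum(xi) = {sum_x:.2f}, sum(xi^2) = {sum_x2:.4f}")
print(f"mean = {mean:.4f}")
print(f"s (definition)      = {s_definition:.4f}")
print(f"s (calculator form) = {s_calculator:.4f}")
```

Both routes should agree at about 0.007, in the same units as the table. The calculator form subtracts two nearly equal numbers, which is why premature rounding is so damaging there.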
Relative standard deviation (RSD): $\mathrm{RSD} = s / \bar{x}$
Spread or range (w): the difference between the largest value and the smallest one in a set of data.
[Figure: structures of benzyl isothiourea hydrochloride and nicotinic acid]
Analyst 1: precise, accurate
Analyst 2: imprecise, accurate
Analyst 3: precise, inaccurate
Analyst 4: imprecise, inaccurate
Types of Error in Experimental Data
Three types:
2. Method Error
• Nonideal chemical or physical behaviour of analytical systems
3. Personal Error
• Carelessness
• Inattention
• Personal limitation of the experimenter
With a very large number of random uncertainties, the distribution of results is a Gaussian or normal error curve, symmetrical about the mean.
Replicate Data on the Calibration of a 10 ml Pipette
B = Gaussian curve with the same mean value, the same precision (see later) and the same area under the curve as for the histogram.
Statistical Treatment of Random Error
Population vs Sample
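Sample mean ($\bar{x}$):
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$
Population mean ($\mu$): defined as earlier, but with $N \to \infty$. In the absence of systematic error, $\mu$ is the true value (the maximum on the Gaussian curve).
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} \quad (N \to \infty)$$
More often than not, particularly when N is small, $\bar{x}$ differs from $\mu$ because a small sample of data does not exactly represent its population.
Remember, the sample mean ($\bar{x}$) is defined for small values of N.
(The sample mean approaches the population mean when N is about 20 or more.)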
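Population and Sample Standard Deviation ($\sigma$ or s)
The equation for $\sigma$ must be modified for small samples of data, i.e. small N.
For a population of data:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
For sample data:
$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$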
Properties of the Normal Error Curve
The distribution of errors for a particular population of data is given by two population parameters, $\mu$ and $\sigma$.
The population mean $\mu$ expresses the magnitude of the quantity being measured; the standard deviation $\sigma$ expresses the scatter and is therefore an index of precision.
Standard error of the mean:
$$s_m = \frac{s}{\sqrt{N}}$$
As $N \to \infty$, $\bar{x} \to \mu$.
Pooled Data
When several small sets have the same sources of random error (i.e. the same type
of measurements but different samples) the standard deviations of the individual
data sets may be pooled to more accurately determine the standard deviation of the
analysis method.
Suppose that there are t small sets of data, comprising N1, N2, ..., Nt measurements.
The equation for the resultant sample standard deviation is:
$$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1} (x_i - \bar{x}_1)^2 + \sum_{i=1}^{N_2} (x_i - \bar{x}_2)^2 + \sum_{i=1}^{N_3} (x_i - \bar{x}_3)^2 + \cdots}{N_1 + N_2 + N_3 + \cdots - t}}$$
Pooled Standard Deviation
Analysis of 6 bottles of wine for residual sugar.
Bottle  Sugar % (w/v)  No. of obs.  Deviations from mean          Σ(xi − x̄)²  s_n
1       0.94           3            0.05, 0.10, 0.08              0.0189       0.097
2       1.08           4            0.06, 0.05, 0.09, 0.06        0.0178       0.077
3       1.20           5            0.05, 0.12, 0.07, 0.00, 0.08  0.0282       0.084
4       0.67           4            0.05, 0.10, 0.06, 0.09        0.0242       0.090
5       0.83           3            0.07, 0.09, 0.10              0.0230       0.107
6       0.76           4            0.06, 0.12, 0.04, 0.03        0.0205       0.083
Total                                                             0.1326
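The wine data above can be pooled with a few lines of code. This is a minimal sketch that takes the tabulated deviations from each bottle's mean as input; the dictionary layout and variable names are illustrative choices, not part of the lecture.

```python
import math

# Absolute deviations from each bottle's mean, copied from the table above.
deviations = {
    1: [0.05, 0.10, 0.08],
    2: [0.06, 0.05, 0.09, 0.06],
    3: [0.05, 0.12, 0.07, 0.00, 0.08],
    4: [0.05, 0.10, 0.06, 0.09],
    5: [0.07, 0.09, 0.10],
    6: [0.06, 0.12, 0.04, 0.03],
}

sum_sq_total = 0.0  # running total of the sums of squared deviations
n_total = 0         # N1 + N2 + ... + Nt

for bottle, devs in deviations.items():
    sum_sq = sum(d ** 2 for d in devs)
    s_n = math.sqrt(sum_sq / (len(devs) - 1))  # individual standard deviation
    print(f"bottle {bottle}: sum of squares = {sum_sq:.4f}, s_n = {s_n:.3f}")
    sum_sq_total += sum_sq
    n_total += len(devs)

t_sets = len(deviations)  # number of data sets, t
dof = n_total - t_sets    # one degree of freedom is lost per set
s_pooled = math.sqrt(sum_sq_total / dof)

print(f"total sum of squares = {sum_sq_total:.4f}, degrees of freedom = {dof}")
print(f"s_pooled = {s_pooled:.3f} % (w/v)")
```

With 23 observations in 6 sets there are 17 degrees of freedom, and the pooled value comes out near 0.088 % (w/v), which sits within the range of the six individual s_n values.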
Variance:
$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$$
Coefficient of variation:
$$CV = \frac{s}{\bar{x}} \times 100\%$$
How can we relate the observed mean value $\bar{x}$ to the true value $\mu$?
$$\text{CL for } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$
Confidence Limits when $\sigma$ is known
Atomic absorption analysis for copper concentration in aircraft engine oil gave a mean value of 8.53 µg Cu/ml. Pooled results of many analyses showed s → σ = 0.32 µg Cu/ml.
Find the confidence limits at the 90% and 99% confidence levels if the above result were based on (a) 1, (b) 4, (c) 16 measurements.
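The copper example can be worked through numerically as follows. This is a sketch, not the lecture's own solution: it obtains z from Python's standard-library `statistics.NormalDist` rather than from a printed table, so the z values (about 1.645 and 2.576) may differ in the last digit from rounded table entries. The function name `confidence_half_width` is an illustrative choice.

```python
import math
from statistics import NormalDist

mean = 8.53   # mean copper concentration from the example above
sigma = 0.32  # pooled standard deviation, taken as a good estimate of sigma


def confidence_half_width(level, n):
    """Return z * sigma / sqrt(N) for a two-sided confidence level."""
    # Two-sided interval: put (1 - level) / 2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / math.sqrt(n)


for level in (0.90, 0.99):
    for n in (1, 4, 16):
        hw = confidence_half_width(level, n)
        print(f"{level:.0%} CL, N = {n:2d}: {mean} ± {hw:.2f}")
```

Because the half-width scales as 1/√N, going from 1 to 4 measurements halves the interval, and going from 4 to 16 halves it again.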
$$t = \frac{x - \mu}{s}$$
Values of t for various levels of probability
Testing a Hypothesis
If the experimental value is different from the true value, is the difference due to a
systematic error (bias) in the method – or simply due to random error?
NULL HYPOTHESIS
--- two values are the same
ALTERNATIVE HYPOTHESIS
--- two values are different
$$\bar{x} - x_t = \pm \frac{ts}{\sqrt{N}}$$
If $|\bar{x} - x_t|$ exceeds $ts/\sqrt{N}$ at the desired confidence level, the null hypothesis is rejected
-----> the two values are not the same
-----> evidence for systematic error
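A minimal sketch of this test is given below. The replicate values are illustrative and are NOT taken from the lecture; the critical t value (2.57 for 5 degrees of freedom, two-sided, 95% level) is the standard tabulated figure. The variable names are assumptions made for the example.

```python
import math
from statistics import mean, stdev

# ASSUMPTION: illustrative replicate results (ppm), not data from the lecture.
results = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]
true_value = 20.0  # accepted (true) value, x_t

# Two-sided Student's t for N - 1 = 5 degrees of freedom at the 95% level,
# taken from standard tables.
t_crit = 2.57

n = len(results)
x_bar = mean(results)
s = stdev(results)  # sample standard deviation (N - 1 in the denominator)

threshold = t_crit * s / math.sqrt(n)  # ts / sqrt(N)
difference = abs(x_bar - true_value)   # |x_bar - x_t|
reject_null = difference > threshold

print(f"mean = {x_bar:.2f}, s = {s:.3f}")
print(f"|mean - true| = {difference:.3f}, ts/sqrt(N) = {threshold:.3f}")
print("reject null hypothesis" if reject_null else "retain null hypothesis")
```

For these illustrative numbers the difference is smaller than ts/√N, so the null hypothesis is retained: the shortfall from 20 ppm is within what random error alone could produce.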
Are two sets of measurements significantly different?
$$\bar{x}_1 - \bar{x}_2 = \pm t s_{pooled} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$
Only if the difference between the two sample means is greater than the term on the right-hand side can we conclude that there is a systematic error.
$$\pm t s_{pooled} \sqrt{\frac{N_1 + N_2}{N_1 N_2}} = \pm (3.36)(0.267)\sqrt{\frac{10}{25}}$$
i.e. ±0.5674, or ±0.57 µg/g.
But $\bar{x}_1 - \bar{x}_2 = 28.0 - 26.25 = 1.75$ µg/g,
i.e. $|\bar{x}_1 - \bar{x}_2| > t s_{pooled} \sqrt{(N_1 + N_2)/(N_1 N_2)}$
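The arithmetic of this comparison can be checked with the short sketch below. It uses only the numbers quoted in the worked example (t = 3.36, pooled s = 0.267, N1 + N2 = 10, N1 N2 = 25, and the two means); the variable names are illustrative.

```python
import math

# Numbers quoted in the worked example above.
t = 3.36            # tabulated t value used in the example
s_pooled = 0.267    # pooled standard deviation
n1_plus_n2 = 10     # N1 + N2
n1_times_n2 = 25    # N1 * N2
x1_bar, x2_bar = 28.0, 26.25

# Right-hand side: t * s_pooled * sqrt((N1 + N2) / (N1 * N2))
threshold = t * s_pooled * math.sqrt(n1_plus_n2 / n1_times_n2)
difference = abs(x1_bar - x2_bar)
significant = difference > threshold

print(f"threshold  = ±{threshold:.4f}")
print(f"difference = {difference:.2f}")
print("significant difference" if significant else "no significant difference")
```

The difference of 1.75 is roughly three times the threshold of about 0.57, so the two sets of measurements differ significantly at the confidence level to which the t value corresponds.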
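$$Q_{exp} = \frac{|x_q - x_n|}{w}$$
where $x_q$ is the questionable result, $x_n$ is its nearest neighbour, and w is the spread of the whole data set.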
The following values were obtained for the concentration of nitrite ions in a sample of river water: 0.403, 0.410, 0.401, 0.380 mg/L. Should the last reading be rejected at the 95% level?
Suppose 3 further measurements are taken, giving total values of: 0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.411 mg/L. Should 0.380 still be retained at the 95% level?
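Both nitrite data sets above can be run through the Q test with the sketch below. The critical values are an assumption: they are commonly tabulated Q values for 95% confidence (0.829 for N = 4, 0.568 for N = 7), not values taken from the lecture, and other tables differ slightly in the third decimal. The function name `q_exp` is illustrative.

```python
def q_exp(values):
    """Return (suspect value, Q_exp) for the more extreme end of the data."""
    data = sorted(values)
    spread = data[-1] - data[0]     # w: largest minus smallest value
    gap_low = data[1] - data[0]     # gap between lowest value and its neighbour
    gap_high = data[-1] - data[-2]  # gap between highest value and its neighbour
    if gap_low >= gap_high:
        return data[0], gap_low / spread
    return data[-1], gap_high / spread


# ASSUMPTION: commonly tabulated critical Q values at 95% confidence,
# keyed by the number of observations; not taken from the lecture.
Q_CRIT_95 = {4: 0.829, 7: 0.568}

first_set = [0.403, 0.410, 0.401, 0.380]
second_set = [0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.411]

for values in (first_set, second_set):
    suspect, q = q_exp(values)
    q_crit = Q_CRIT_95[len(values)]
    decision = "reject" if q > q_crit else "retain"
    print(f"N = {len(values)}: suspect = {suspect}, "
          f"Q_exp = {q:.3f}, Q_crit = {q_crit} -> {decision}")
```

With four readings Q_exp is 0.70, below the critical value, so 0.380 is retained. With seven readings Q_exp falls to about 0.61 but the critical value falls further, so 0.380 is now rejected.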