
Lecture_PPT_ch04(1)

Chapter 4 of 'Quantitative Chemical Analysis' covers various statistical methods including Gaussian distribution, standard deviation, confidence intervals, and t-tests. It emphasizes the importance of understanding experimental uncertainty and provides examples of calculating mean, standard deviation, and using spreadsheets for statistical analysis. The chapter also includes practical applications of statistics in analyzing blood glucose readings and coin toss experiments.

Uploaded by

thuy36030

Statistics

Chapter 4

Quantitative Chemical Analysis, Daniel C. Harris and Charles A. Lucy, © 2020 W. H. Freeman and Company
Lecture Outline
• Section 4-1 Gaussian Distribution
• Section 4-2 Comparison of Standard Deviations with the F Test
• Section 4-3 Confidence Intervals
• Section 4-4 Comparison of Means with Student's t
• Section 4-5 t Tests with a Spreadsheet
• Section 4-6 Grubbs Test for an Outlier
• Section 4-7 The Method of Least Squares
• Section 4-8 Calibration Curves
• Section 4-9 A Spreadsheet for Least Squares
Is My Blood Glucose Reading Correct?
What do varying readings indicate?
• Serious health issues
• Poor technique
• Malfunctioning meter
Run a control sample to test one’s technique and glucose meter.
Measurements of control sample: 90, 94, 85, 95, and 93 mg/dL (average = 91.4 mg/dL)
Control sample concentration: 99 mg/dL
All Measurements Have Experimental Uncertainty

Statistics gives us tools to:
• Accept conclusions that have a high probability of being correct
• Reject conclusions that have a high probability of being incorrect
Section 4-1
Gaussian Distribution

Gaussian Distribution
If an experiment is repeated a great many times and if the errors are purely random (Figure 4-1):
• The results tend to cluster symmetrically about the average value.
• The more times the experiment is repeated, the more closely the results approach a Gaussian distribution.
Usually we repeat an experiment 3–5 times (not 400 times).
From small data sets we can estimate properties of a hypothetical large set.
Mean Value and Standard Deviation
Figure 4-2
Mean (average): the sum of a set of results divided by the number of values in the set

$\bar{x} = \dfrac{\sum_i x_i}{n}$; as n increases, $\bar{x} \to \mu$

Standard deviation: measures how closely data are clustered about the mean

$s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n - 1}}$; as n increases, $s \to \sigma$
Standard Deviation
The smaller the standard deviation, s, the more closely the data are clustered about the mean.
Precision: reproducibility
Accuracy: nearness to the “truth”
• Experiments with a small standard deviation are more precise than experiments with a large standard deviation.
• Greater precision does not necessarily imply greater accuracy.
• Express the mean and standard deviation in the form $\bar{x} \pm s$ (n = ___ ).
• The average and the standard deviation should both end in the same decimal place.
Other Statistical Parameters

Degrees of freedom:
Degrees of freedom = n − 1
Variance: square of the standard deviation
Variance = $\sigma^2$
Relative standard deviation (coefficient of variation): standard deviation expressed as a percentage of the mean

Relative standard deviation $= 100 \times \dfrac{s}{\bar{x}}$
Example: Mean and Standard Deviation (1 of 4)

Find the average, standard deviation, and relative standard deviation for 821, 783, 834, and 855.
Example: Mean and Standard Deviation (2 of 4)

Solution: The average is

$\bar{x} = \dfrac{821 + 783 + 834 + 855}{4} = 823.2$

To avoid accumulating round-off errors, retain one more digit in the mean than was present in the original data. The standard deviation is

$s = \sqrt{\dfrac{(821 - 823.2)^2 + (783 - 823.2)^2 + (834 - 823.2)^2 + (855 - 823.2)^2}{4 - 1}} = 30.3$
Example: Mean and Standard Deviation (3 of 4)
Solution: The average and the standard deviation should both end at the same decimal place. For $\bar{x} = 823.2$, we will write s = 30.3. The relative standard deviation is the percent relative uncertainty:

Relative standard deviation $= 100 \times \dfrac{s}{\bar{x}} = 100 \times \dfrac{30.3}{823.2} = 3.7\%$
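
The worked example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and are not part of the slides.

```python
import statistics

data = [821, 783, 834, 855]

mean = statistics.mean(data)   # sum of the values divided by n
s = statistics.stdev(data)     # sample standard deviation (n - 1 in the denominator)
rsd = 100 * s / mean           # relative standard deviation, in percent

print(f"mean = {mean:.2f}, s = {s:.1f}, RSD = {rsd:.1f}%")
```

`statistics.stdev` is the sample standard deviation, which corresponds to the spreadsheet function STDEV.S rather than a population standard deviation.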

Example: Mean and Standard Deviation (4 of 4)

Test Yourself: If each of the four numbers 821, 783, 834, and 855 in the
example is divided by 2, how will the mean, standard deviation, and
relative standard deviation be affected?

Spreadsheets for Average and Standard Deviation

Spreadsheets have built-in statistical functions

• Average:
=AVERAGE(B1:B4)

• Standard deviation:
=STDEV.S(B1:B4)

Standard Deviation and Probability (1 of 2)
Figure 4-3
Gaussian curve:

$y = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$

The probability of observing a value within a certain range is proportional to the area of that range.

Express deviations from the mean value in multiples, z, of the standard deviation. We transform x into z:

$z = \dfrac{x - \mu}{\sigma} \approx \dfrac{x - \bar{x}}{s}$
Table 4-1 Ordinate and area for the normal (Gaussian) error curve, $y = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$

|z| (a)  y  Area (b)  |z|  y  Area  |z|  y  Area
0.0 0.398 9 0.000 0 1.4 0.149 7 0.419 2 2.8 0.007 9 0.497 4
0.1 0.397 0 0.039 8 1.5 0.129 5 0.433 2 2.9 0.006 0 0.498 1
0.2 0.391 0 0.079 3 1.6 0.110 9 0.445 2 3.0 0.004 4 0.498 650
0.3 0.381 4 0.117 9 1.7 0.094 1 0.455 4 3.1 0.003 3 0.499 032
0.4 0.368 3 0.155 4 1.8 0.079 0 0.464 1 3.2 0.002 4 0.499 313
0.5 0.352 1 0.191 5 1.9 0.065 6 0.471 3 3.3 0.001 7 0.499 517
0.6 0.333 2 0.225 8 2.0 0.054 0 0.477 3 3.4 0.001 2 0.499 663
0.7 0.312 3 0.258 0 2.1 0.044 0 0.482 1 3.5 0.000 9 0.499 767
0.8 0.289 7 0.288 1 2.2 0.035 5 0.486 1 3.6 0.000 6 0.499 841
0.9 0.266 1 0.315 9 2.3 0.028 3 0.489 3 3.7 0.000 4 0.499 904
1.0 0.242 0 0.341 3 2.4 0.022 4 0.491 8 3.8 0.000 3 0.499 928
1.1 0.217 9 0.364 3 2.5 0.017 5 0.493 8 3.9 0.000 2 0.499 952
1.2 0.194 2 0.384 9 2.6 0.013 6 0.495 3 4.0 0.000 1 0.499 968
1.3 0.171 4 0.403 2 2.7 0.010 4 0.496 5 ∞ 0 0.5
a. z = (x – μ)/σ
b. The area refers to the area between z = 0 and z = the value in the table. Thus the area from z = 0 to z = 1.4 is 0.419 2. The area from z =
−0.7 to z = 0 is the same as from z = 0 to z = 0.7. The area from z = −0.5 to z = +0.3 is (0.191 5 + 0.117 9) = 0.309 4. The total area between z
= −∞ and z = +∞ is unity.

Example: Area Under a Gaussian Curve (1 of 3)
For many tosses of a set of 50 coins in Figure 4-1, probability theory predicts a mean of 25.00 heads and a standard deviation of 3.54. How many tosses are expected to have fewer than 15 heads if the 50 coins were tossed 400 times?
Example: Area Under a Gaussian Curve (2 of 3)
Solution: We express the desired interval in multiples of the standard deviation and then find the area of the interval in Table 4-1.
Because μ = 25.00 and σ = 3.54, z = (15 − 25.00) / 3.54 = −2.82 ≈ −2.8.
From Table 4-1, the area between the mean and z = −2.8 is 0.497 4.
The entire area from −∞ to the mean value is 0.500 0, so the area from −∞ to −2.8 is 0.500 0 − 0.497 4 = 0.002 6.
The area to the left of 15 heads in Figure 4-1b is only 0.26% of the entire area under the curve.
If the class tosses the 50 coins 400 times, they would expect to see 15 or fewer heads only once (0.26% of 400 = 1.04).
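
The table lookup can be mirrored in Python. The sketch below assumes the standard-library `statistics.NormalDist` class as a stand-in for Table 4-1; it first rounds z to one decimal place, as the slide does, and then repeats the calculation without rounding to show how much the rounding matters.

```python
from statistics import NormalDist

mu, sigma, n_tosses = 25.00, 3.54, 400

# Slide's approach: round z to one decimal so it matches a row of Table 4-1.
z = round((15 - mu) / sigma, 1)
area_table = NormalDist().cdf(z)            # area from -infinity to z on the standard curve

# Same question without rounding z, using the Gaussian with the stated mean and sigma.
area_exact = NormalDist(mu, sigma).cdf(15)

expected_table = n_tosses * area_table
expected_exact = n_tosses * area_exact
print(f"z = {z}, expected tosses: {expected_table:.2f} (rounded z), {expected_exact:.2f} (exact z)")
```

Both routes give roughly one toss in 400, so the conclusion of the example does not depend on the rounding of z.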
Example: Area Under a Gaussian Curve (3 of 3)
Test Yourself: If 50 coins are tossed 400 times, how many times would 32 or more heads be expected? How many times are observed in Figure 4-1b?
Example: Using a Spreadsheet to Find Area Under a Gaussian Curve (1 of 5)
For 400 tosses of 50 coins, how many tosses are expected to have between 20 and 27 heads? (Figure 4-4)
Example: Using a Spreadsheet to Find Area
Under a Gaussian Curve (2 of 5)
Solution: We need to find the fraction of the area of the Gaussian curve between
x = 20 and x = 27 heads and then multiply this fraction by 400 tosses. The function
NORM.DIST in Excel gives the area under the curve from −∞ to a chosen value of
x. We will find the area from −∞ to 27 heads, which is all the shaded area to the
left of 27 heads in Figure 4-4. Then we will find the area from −∞ to 20 heads,
which is the shaded area to the left of 20 heads. The difference between the two
is the area from 20 to 27 heads:

Area from 20 to 27 = (area from −∞ to 27) − (area from −∞ to 20)

Example: Using a Spreadsheet to Find Area
Under a Gaussian Curve (3 of 5)
Solution: In a spreadsheet, enter the mean
μ = 25.00 in cell A2 and the standard
deviation σ = 3.54 in cell B2. To find the
area under the Gaussian curve from −∞ to
27, select cell C4, then go to Formulas and
Insert Function. Select Statistical functions
and double click NORM.DIST. A window
appears asking for four values that will be
used by NORM.DIST.

Example: Using a Spreadsheet to Find Area
Under a Gaussian Curve (4 of 5)
Solution: The values provided to NORM.DIST(x,mean,standard_dev,cumulative) are called
arguments of the function.
The first argument is x, which is 27. The second argument is the mean. You can enter 25 for
the mean or enter $A$2, which is the cell containing 25. We use dollar signs in $A$2 so that
we can move the formula to other cells and still refer to cell A2.
The third argument is the standard deviation, for which we enter $B$2. The last argument is
“cumulative.” When cumulative is TRUE, NORM.DIST gives the area under the Gaussian
curve. When cumulative is FALSE, NORM.DIST gives the ordinate (the y-value) of the Gaussian
curve. We want area, so enter TRUE. The formula “= NORM.DIST(27,$A$2,$B$2,TRUE)” in cell
C4 returns 0.714 0 for the area under the curve from −∞ to 27. To get the area from −∞ to
20, enter “= NORM.DIST(20,$A$2,$B$2,TRUE)” in cell C5. The value returned is 0.078 9. Then
subtract the areas (C6 = C4 − C5) to obtain 0.635 0, which is the area from 20 to 27 heads. We
expect 63.50% of 400 tosses (= 254 tosses) to have 20 to 27 heads.
Example: Using a Spreadsheet to Find Area
Under a Gaussian Curve (5 of 5)
Test Yourself: How many tosses out of 400 tosses of 50 coins are
expected to have 23 to 28 heads?

Standard Deviation and Probability (2 of 2)

The sum of the probabilities of all measurements must be unity.
• The area under the whole curve from z = −∞ to +∞ adds up to 1.
The standard deviation measures the width of the Gaussian curve.
• The larger σ, the broader the curve.
For any Gaussian curve:
Range      Percentage of measurements
μ ± 1σ     68.3
μ ± 2σ     95.5
μ ± 3σ     99.7
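
The percentages for μ ± 1σ, μ ± 2σ, and μ ± 3σ follow from the area under the standard Gaussian curve. A minimal sketch, assuming the standard-library `statistics.NormalDist` (not something the slides use):

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1, so x is already in units of z

# Area between -k and +k standard deviations, expressed as a percentage.
percent_within = {
    k: 100 * (std_normal.cdf(k) - std_normal.cdf(-k))
    for k in (1, 2, 3)
}

for k, pct in percent_within.items():
    print(f"within {k} standard deviation(s): {pct:.2f}%")
```

The computed values agree with the tabulated 68.3, 95.5, and 99.7 to within rounding.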
Standard Deviation of the Mean

The more times a quantity is measured, the more confident you can be that the mean is close to the population mean.

• The standard deviation of the measurements, $s_x$, approaches a constant value as n → ∞.
• The standard deviation of the mean, $u_x = s_x/\sqrt{n}$, approaches 0 as n → ∞.
Section 4-2
Comparison of Standard
Deviations with the F Test

Statistical Tests (1 of 2)
Two sets of measurements (on the same quantity) will generally differ in $\bar{x}$ and s. Statistical tools determine the probability of a conclusion.
• Accept conclusions with a high probability of being correct.
• Reject conclusions with a high probability of being incorrect.
Null hypothesis: states that two sets of data are drawn from populations with the same properties
• Observed differences arise only from random variation in measurements.
• Reject the null hypothesis if there is <5% probability of observing the experimental results from two populations with the same value.
Statistical Tests (2 of 2)

The null hypothesis is used with several tests.
“Two sets of data are drawn from populations with the same properties.”
• F-test compares standard deviations.
• t test compares means.
The F-test must be completed before the t test.
Note: Statistical tables (Tables 4-3 and 4-4) include degrees of freedom.
F-Test: Comparison of Standard Deviation
Are the standard deviations of two sets of measurements “statistically different”?

$F_\text{calculated} = \dfrac{s_1^2}{s_2^2}$, where $s_1 \ge s_2$

Put the larger standard deviation in the numerator so that F ≥ 1. Degrees of freedom for n measurements are n − 1.
If $F_\text{calculated} > F_\text{table}$, then reject the null hypothesis.
• There is <5% chance that the two data sets came from populations with the same population standard deviation.
• The difference is considered significant.

Table 4-2 Measurement of HCO3− in horse blood

• Unethical trainers inject NaHCO3 into a horse prior to a race to neutralize lactic acid that accumulates during strenuous activity.
• To enforce the ban on this practice, HCO3− in horse blood is measured after each race.
Authorities need to certify a new instrument:

                              Original instrument   Substitute instrument
Mean (x̄, mM)                  36.14                 36.20
Standard deviation (s, mM)    0.28                  0.47
Number of measurements (n)    10                    4
Table 4-3 Critical values of $F = s_1^2/s_2^2$ at 95% confidence level for two-tailed F test
Columns: degrees of freedom for s1. Rows: degrees of freedom for s2 (first number in each row).
Degrees of freedom 2 3 4 5 6 7 8 9 10 12 15 20 30 ∞
2 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.41 39.43 39.45 39.46 39.50
3 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.34 14.25 14.17 14.08 13.90
4 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.75 8.66 8.56 8.46 8.26
5 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.52 6.43 6.33 6.23 6.02
6 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.37 5.27 5.17 5.07 4.85
7 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76 4.67 4.57 4.47 4.36 4.14
8 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.20 4.10 4.00 3.89 3.67
9 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.87 3.77 3.67 3.56 3.33
10 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.62 3.52 3.42 3.31 3.08
11 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53 3.43 3.33 3.23 3.12 2.88
12 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.28 3.18 3.07 2.96 2.72
13 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25 3.15 3.05 2.95 2.84 2.60
14 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15 3.05 2.95 2.84 2.73 2.49
15 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.96 2.86 2.76 2.64 2.40
16 4.69 4.08 3.73 3.50 3.34 3.22 3.12 3.05 2.99 2.89 2.79 2.68 2.57 2.32
17 4.62 4.01 3.66 3.44 3.28 3.16 3.06 2.98 2.92 2.82 2.72 2.62 2.50 2.25
18 4.56 3.95 3.61 3.38 3.22 3.10 3.01 2.93 2.87 2.77 2.67 2.56 2.44 2.19
19 4.51 3.90 3.56 3.33 3.17 3.05 2.96 2.88 2.82 2.72 2.62 2.51 2.39 2.13
20 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.77 2.68 2.57 2.46 2.35 2.09
30 4.18 3.59 3.25 3.03 2.87 2.75 2.65 2.57 2.51 2.41 2.31 2.20 2.07 1.79
∞ 3.69 3.12 2.79 2.57 2.41 2.29 2.19 2.11 2.05 1.94 1.83 1.71 1.57 1.00
Critical values for a two-tailed test of the null hypothesis that σ1 = σ2. Tails are explained in Figure 4-9. There is a 5% probability of observing F at the tabulated value if the two sets of data come from populations with the same population standard deviation.
You can compute F for a chosen confidence with the Excel function F.INV.RT(probability,degree_freedom1,degree_freedom2). The statement “=F.INV.RT(0.025,7,6)” reproduces the value F = 5.70 in this table. The statement “=F.INV.RT(0.05,7,6)” gives F = 4.21 for 90% confidence for a two-tailed test, which is also the value of F for a one-tailed test at 95% confidence.
Example: Is the Standard Deviation from the Substitute Instrument “Significantly Different” from That of the Original Instrument? (1 of 3)
In Table 4-2, the standard deviation from the new instrument is s1 = 0.47 (n1 = 4 measurements) and the standard deviation from the original instrument is s2 = 0.28 (n2 = 10).

                              Original instrument   Substitute instrument
Mean (x̄, mM)                  36.14                 36.20
Standard deviation (s, mM)    0.28                  0.47
Number of measurements (n)    10                    4
Example: Is the Standard Deviation from the Substitute Instrument “Significantly” Different from That of the Original Instrument? (2 of 3)
Solution: Compute $F_\text{calculated}$ with Equation 4-6:

$F_\text{calculated} = \dfrac{s_1^2}{s_2^2} = \dfrac{(0.47)^2}{(0.28)^2} = 2.82$

In Table 4-3, we find $F_\text{table}$ = 5.08 in the column with 3 degrees of freedom for s1 (degrees of freedom = n − 1) and the row with 9 degrees of freedom for s2. Because $F_\text{calculated}$ = 2.82 < $F_\text{table}$ = 5.08, we accept the null hypothesis and conclude that s1 and s2 are not significantly different.
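
The F test arithmetic is short enough to script. The sketch below is illustrative only: the critical value is copied by hand from Table 4-3 rather than computed, and the variable names are assumptions.

```python
# Standard deviations and numbers of measurements from Table 4-2.
substitute = (0.47, 4)    # (s, n) for the substitute instrument
original = (0.28, 10)     # (s, n) for the original instrument

# Put the larger standard deviation in the numerator so that F >= 1.
(s1, n1), (s2, n2) = sorted([substitute, original], reverse=True)

f_calculated = s1**2 / s2**2
dof = (n1 - 1, n2 - 1)    # degrees of freedom for s1 (column) and s2 (row)

f_table = 5.08            # Table 4-3: column 3, row 9 (entered by hand)
significantly_different = f_calculated > f_table

print(f"F = {f_calculated:.2f}, degrees of freedom = {dof}, significant: {significantly_different}")
```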

Example: Is the Standard Deviation from the Substitute Instrument “Significantly” Different from That of the Original Instrument? (3 of 3)
Test Yourself: If there had been n = 21 replications in both data sets, would the difference in standard deviations be significant?
Box 4-1 Choosing the Null Hypothesis in Epidemiology
For a drug’s approval, the null hypothesis is that the
treatment does not cause cancer.
• Similar to the U.S. legal system, the null hypothesis puts the
burden of proof on the prosecution: “innocent until proven
guilty.”
• The null hypothesis (stated above) is assumed to be true.
Unless strong evidence is found proving otherwise, the FDA
will continue to believe it is true.
Epidemiology Study Conducted at USC
• Studied the relationship between menopausal estrogen-
progestin hormone therapy and breast cancer
• Study concluded there was a 7.6% increase in breast cancer
risk per year of estrogen-progestin hormone therapy

Section 4-3
Confidence Intervals

Calculating Confidence Intervals
Student’s t: used to compare results from different experiments
• Evaluate the probability that an observed experimental result agrees with a “known” value.

Confidence interval $= \bar{x} \pm \dfrac{ts}{\sqrt{n}}$

If we were to repeat n measurements many times and compute $\bar{x}$ and s, the 95% confidence interval would include the true population mean (whose value we do not know) in 95% of the sets of n measurements.
We are 95% confident that the true mean lies within the confidence interval.
https://en.wikipedia.org/wiki/Student%27s_t-distribution

Table 4-4 Values of Student’s t (1 of 2)


Confidence level (%)
Degrees of freedom 50 90 95 98 99 99.5 99.9
1 1.000 6.314 12.706 31.821 63.656 127.321 636.578
2 0.816 2.920 4.303 6.965 9.925 14.089 31.598
3 0.765 2.353 3.182 4.541 5.841 7.453 12.924
4 0.741 2.132 2.776 3.747 4.604 5.598 8.610
5 0.727 2.015 2.571 3.365 4.032 4.773 6.869
6 0.718 1.943 2.447 3.143 3.707 4.317 5.959
7 0.711 1.895 2.365 2.998 3.500 4.029 5.408
8 0.706 1.860 2.306 2.896 3.355 3.832 5.041
9 0.703 1.833 2.262 2.821 3.250 3.690 4.781
10 0.700 1.812 2.228 2.764 3.169 3.581 4.587
15 0.691 1.753 2.131 2.602 2.947 3.252 4.073
20 0.687 1.725 2.086 2.528 2.845 3.153 3.850
25 0.684 1.708 2.060 2.485 2.787 3.078 3.725
30 0.683 1.697 2.042 2.457 2.750 3.030 3.646
Table 4-4 Values of Student’s t (2 of 2)
Note: Values in the table are for two-sided tests.
Confidence level (%)
Degrees of freedom 50 90 95 98 99 99.5 99.9
40 0.681 1.684 2.021 2.423 2.704 2.971 3.551
60 0.679 1.671 2.000 2.390 2.660 2.915 3.460
120 0.677 1.658 1.980 2.358 2.617 2.860 3.373
∞ 0.674 1.645 1.960 2.326 2.576 2.807 3.291
In calculating confidence intervals, σ may be substituted for s in Equation 4-7 if you have a great deal of experience with a particular method
and have therefore determined its “true” population standard deviation. If σ is used instead of s, the value of t to use in Equation 4-7 comes
from the bottom row of this table.
Values of t in this table apply to two-tailed tests illustrated in Figure 4-9a. The 95% confidence level specifies the regions containing 2.5% of the
area in each wing of the curve. For a one-tailed test, we use values of t listed for 90% confidence. Each wing outside of t for 90% confidence
contains 5% of the area of the curve.
Find t with the Excel function T.INV.2T. For 12 degrees of freedom and 95% confidence, the function T.INV.2T (0.05,12) gives t = 2.179. Many
programmable calculators can provide t. Search “critical t [your calculator model]”. Verify that the instructions give t = 2.179 for 12 degrees of
freedom and 95% confidence.

Example: Calculating Confidence Intervals (1 of 4)

The carbohydrate content of a glycoprotein (a protein with sugars attached to it) is found to be 12.6, 11.9, 13.0, 12.7, and 12.5 wt% (g carbohydrate/100 g glycoprotein) in replicate analyses. Find the 50% and 90% confidence intervals for the carbohydrate content.
Example: Calculating Confidence Intervals (2 of 4)

Solution: First calculate $\bar{x}$ (= 12.54) and s (= 0.40) for the five measurements. For the 50% confidence interval, look up t in Table 4-4 under 50 and across from four degrees of freedom (degrees of freedom = n − 1). The value of t is 0.741, so the 50% confidence interval is:

50% confidence interval $= \bar{x} \pm \dfrac{ts}{\sqrt{n}} = 12.54 \pm \dfrac{(0.741)(0.40)}{\sqrt{5}} = 12.54 \pm 0.13$ wt%
Example: Calculating Confidence Intervals (3 of 4)

Solution: The 90% confidence interval is:

90% confidence interval $= \bar{x} \pm \dfrac{ts}{\sqrt{n}} = 12.54 \pm \dfrac{(2.132)(0.40)}{\sqrt{5}} = 12.54 \pm 0.38$ wt%

If we repeat sets of five measurements many times, half of the 50% confidence intervals are expected to include the true mean, μ. Nine-tenths of the 90% confidence intervals are expected to include the true mean, μ.
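
Both confidence intervals can be reproduced with a short script. This is a sketch under stated assumptions: the values of Student's t are typed in from Table 4-4 (four degrees of freedom) rather than computed, and the names are illustrative.

```python
import math
import statistics

carbohydrate = [12.6, 11.9, 13.0, 12.7, 12.5]   # wt%, replicate analyses

n = len(carbohydrate)
mean = statistics.mean(carbohydrate)
s = statistics.stdev(carbohydrate)              # sample standard deviation

# Student's t for n - 1 = 4 degrees of freedom, copied from Table 4-4.
t_table = {50: 0.741, 90: 2.132}

# Half-width of each confidence interval: t * s / sqrt(n).
half_widths = {level: t * s / math.sqrt(n) for level, t in t_table.items()}

for level, hw in half_widths.items():
    print(f"{level}% confidence interval: {mean:.2f} ± {hw:.2f} wt%")
```

The script carries the unrounded standard deviation through the calculation, which gives the same half-widths as the slide after rounding to two decimal places.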

Example: Calculating Confidence Intervals (4 of 4)

Test Yourself: Carbohydrate measured on one more sample was 12.3 wt%.
Using six results, find the 90% confidence interval.
12.6, 11.9, 13.0, 12.7, 12.5 and 12.3 wt% (g carbohydrate/100 g glycoprotein)

Meaning of confidence interval
50% and 90% Confidence Intervals for the Same Set of Random Data (Figure 4-5)
• Each point represents an average $\bar{x}$ (n = 4)
• Error bars represent the calculated confidence interval $\bar{x} \pm \dfrac{ts}{\sqrt{n}}$
• Population mean μ = 10 000
Filled squares: confidence interval does not include the true mean μ = 10 000
Always State What Kind of Uncertainty You Are Reporting

You can report uncertainty as
Standard deviation: $\bar{x} \pm s$
OR
Confidence interval: $\bar{x} \pm \dfrac{ts}{\sqrt{n}}$

Reduce uncertainty by making more measurements.
Spreadsheets for Confidence Intervals
Finding confidence intervals with Excel

• Count the number of data points:


=COUNT(A4:A10)

• Find Student’s t:

Section 4-4
Comparison of Means with
Student's t

t Test: Comparison of Means
Are the means of two sets of measurements “statistically different”?

If you make two sets of measurements of the same quantity, generally $\bar{x}_1 \ne \bar{x}_2$ due to random variations in measurements.
The t test determines if there is a statistical difference between $\bar{x}_1$ and $\bar{x}_2$.
If $t_\text{calculated} > t_\text{table}$, then reject the null hypothesis.
• There is <5% chance that the two data sets came from populations with the same population mean.
• The difference is considered significant.
Comparisons of Means with Student’s t

Case 1: Compare $\bar{x}$ to a known value:
• Measure quantity several times.
• Obtain $\bar{x}$ and s.
Does $\bar{x}$ compare to accepted answer, μ?

Case 2: Compare $\bar{x}_1$ to $\bar{x}_2$ with replicate samples:
• Measure quantity multiple times by two different methods.
• Obtain $\bar{x}_1 \pm s_1$ and $\bar{x}_2 \pm s_2$ (for each method).
Does $\bar{x}_1$ agree with $\bar{x}_2$ within experimental uncertainty?

Case 3 (paired test): Compare two methods where samples are not duplicated:
• Measure sample A once by method 1 and once by method 2.
• Measure sample B once by method 1 and once by method 2.
Do the two methods agree within experimental uncertainty?
Case 1: Comparing Measured Result with “Known” Value

A coal sample is certified to contain 3.19 wt% sulfur. A new analytical method measures values of 3.29, 3.22, 3.30, and 3.23 wt% sulfur, giving a mean of $\bar{x}$ = 3.260 and a standard deviation s = 0.041.
Does the answer using the new method agree with the known answer?

Confidence interval $= \bar{x} \pm \dfrac{ts}{\sqrt{n}} = 3.260 \pm \dfrac{(3.182)(0.041)}{\sqrt{4}} = 3.260 \pm 0.065$

• 95% confidence interval = 3.195 to 3.325 wt%
• The known value, 3.19 wt%, lies outside the 95% confidence interval.
• The method gives a “different” result from the known result.
(The result is so close that a scientist might want to complete a few more trials to confirm.)
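
The Case 1 comparison can be sketched in Python as a check of whether the certified value falls inside the 95% confidence interval. The value of t is copied by hand from Table 4-4 for three degrees of freedom; the variable names are illustrative.

```python
import math
import statistics

sulfur = [3.29, 3.22, 3.30, 3.23]   # wt% sulfur measured by the new method
certified = 3.19                    # certified ("known") value, wt%

mean = statistics.mean(sulfur)
s = statistics.stdev(sulfur)
t_95 = 3.182                        # Table 4-4: 95% confidence, n - 1 = 3 degrees of freedom

half_width = t_95 * s / math.sqrt(len(sulfur))
lower, upper = mean - half_width, mean + half_width

# The method "agrees" with the known value only if that value lies inside the interval.
agrees = lower <= certified <= upper
print(f"95% CI: {lower:.3f} to {upper:.3f} wt%; agrees with certified value: {agrees}")
```

The lower bound lies only about 0.005 wt% above the certified value, which is why the slide suggests a few more trials.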
Case 2a: Comparing Replicate Measurements When Standard Deviations Are Not Significantly Different
Recall: HCO3− in horse blood is measured after each race.

                              Original instrument   Substitute instrument
Mean (x̄, mM)                  36.14                 36.20
Standard deviation (s, mM)    0.28                  0.47
Number of measurements (n)    10                    4

$F_\text{calculated} = \dfrac{s_1^2}{s_2^2} = \dfrac{(0.47)^2}{(0.28)^2} = 2.82$

$F_\text{calculated}$ = 2.82 < $F_\text{table}$ = 5.08, so s1 and s2 are not significantly different.

Do the means of the two methods differ?

$s_\text{pooled} = \sqrt{\dfrac{\sum_{\text{set 1}} (x_i - \bar{x}_1)^2 + \sum_{\text{set 2}} (x_j - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{s_\text{pooled}} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}}$
Case 2a: Comparing Replicate Measurements When Standard Deviations Are Not Significantly Different
Do the means of the two methods differ?

                              Original instrument   Substitute instrument
Mean (x̄, mM)                  36.14                 36.20
Standard deviation (s, mM)    0.28                  0.47
Number of measurements (n)    10                    4

$s_\text{pooled} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} = \sqrt{\dfrac{0.28^2 (10 - 1) + 0.47^2 (4 - 1)}{10 + 4 - 2}} = 0.338$

$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{s_\text{pooled}} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}} = \dfrac{|36.14 - 36.20|}{0.338} \sqrt{\dfrac{10 \times 4}{10 + 4}} = 0.300$

The 95% critical value of t in Table 4-4 for (n1 + n2 − 2) = 12 degrees of freedom lies between 2.228 and 2.131.
If $t_\text{calculated} > t_\text{table}$, then reject the null hypothesis.
But $t_\text{calculated}$ (0.300) < $t_\text{table}$ (2.131).
• There is > 5% chance that the two data sets came from populations with the same population mean.
• The difference in means is not considered significant.
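
A short Python sketch of the pooled calculation, using the summary statistics from Table 4-2. The critical value 2.131 is the lower of the two bracketing values the slide reads from Table 4-4; the variable names are illustrative.

```python
import math

# Summary statistics from Table 4-2: (mean in mM, standard deviation in mM, n)
x1, s1, n1 = 36.14, 0.28, 10   # original instrument
x2, s2, n2 = 36.20, 0.47, 4    # substitute instrument

# Pooled standard deviation, weighting each variance by its degrees of freedom.
s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))

# t for two sets of replicate measurements with similar standard deviations.
t_calculated = abs(x1 - x2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

t_table = 2.131                # lower bracketing value from Table 4-4, as on the slide
means_differ = t_calculated > t_table

print(f"s_pooled = {s_pooled:.3f}, t = {t_calculated:.3f}, means differ: {means_differ}")
```

Because the calculated t is below even the smaller bracketing value, the conclusion would be the same with the exact critical value for 12 degrees of freedom.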
Lord Rayleigh and the Discovery of Argon
Dry air is composed of ~1/5 oxygen and ~4/5 nitrogen.
Measured nitrogen with two experiments (at constant temperature, pressure, and volume):
• Mass of N2 after removing O2 from air
• Mass of N2 generated from chemical decomposition
Do the means of the two methods differ? (Figure 4-7)

Table 4-5
                      From air (g)   From chemical decomposition (g)
                      2.310 17       2.301 43
                      2.309 86       2.298 90
                      2.310 10       2.298 16
                      2.310 01       2.301 82
                      2.310 24       2.298 69
                      2.310 10       2.299 40
                      2.310 28       2.298 49
                      -              2.298 89
Average               2.310 109      2.299 472
Standard deviation    0.000 143      0.001 379
Case 2b: Comparing Replicate Measurements When
Standard Deviations Are Significantly Different
Do the means of the two methods differ?
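If the standard deviations of the two methods differ (F-test), the t test equations become:

$$ t_{\text{calculated}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{u_1^2 + u_2^2}} $$

Standard deviation of the mean: $u_i = s_i / \sqrt{n_i}$

$$ \text{Degrees of freedom} = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(u_1^2 + u_2^2\right)^2}{\dfrac{u_1^4}{n_1 - 1} + \dfrac{u_2^4}{n_2 - 1}} $$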
Round degrees of freedom to the nearest integer.
Compare tcalculated to ttable at 95% confidence using appropriate degrees of freedom.
Example: Is Rayleigh’s N2 from Air Denser
Than N2 from Chemicals? (1 of 4)
The average mass of nitrogen from air is x̄1 = 2.310 109 g, with a standard deviation of s1 = 0.000 143 (for n1 = 7 measurements).
The average mass from chemical decomposition is x̄2 = 2.299 472 g, with a standard deviation of s2 = 0.001 379 (for n2 = 8 measurements).
Are the two masses significantly different?
Data: Table 4-5.
Example: Is Rayleigh’s N2 from Air Denser
Than N2 from Chemicals? (2 of 4)
Solution: The F-test told us that the standard deviations are significantly different, so we use Equations 4-9b and 4-10b:

$$ t_{\text{calculated}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{2.310\,109 - 2.299\,472}{\sqrt{0.000\,143^2/7 + 0.001\,379^2/8}} = 21.7 $$

$$ \text{Degrees of freedom} = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(0.000\,143^2/7 + 0.001\,379^2/8\right)^2}{\dfrac{\left(0.000\,143^2/7\right)^2}{7 - 1} + \dfrac{\left(0.001\,379^2/8\right)^2}{8 - 1}} = 7.17 $$
Example: Is Rayleigh’s N2 from Air Denser
Than N2 from Chemicals? (3 of 4)
Solution: Equation 4-10b gives us 7.17 degrees of freedom, which we round to 7.
For 7 degrees of freedom, the critical value of t in Table 4-4 for 95% confidence is
2.365. The observed value tcalculated = 21.7 far exceeds ttable. The obvious difference
between the two data sets in Figure 4-7 is highly significant.
Figure 4-7
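The unequal-standard-deviation calculation can be reproduced from the raw masses in Table 4-5. This is a minimal Python sketch using only the standard library; the function name `unequal_variance_t` is illustrative, not from the source.

```python
import math
from statistics import mean, stdev

# Masses of N2 in grams, from Table 4-5
air = [2.31017, 2.30986, 2.31010, 2.31001, 2.31024, 2.31010, 2.31028]
chem = [2.30143, 2.29890, 2.29816, 2.30182, 2.29869, 2.29940, 2.29849, 2.29889]

def unequal_variance_t(a, b):
    """Return (t, degrees of freedom) when the standard deviations differ."""
    u1_sq = stdev(a) ** 2 / len(a)   # squared standard deviation of the mean
    u2_sq = stdev(b) ** 2 / len(b)
    t = abs(mean(a) - mean(b)) / math.sqrt(u1_sq + u2_sq)
    dof = (u1_sq + u2_sq) ** 2 / (
        u1_sq ** 2 / (len(a) - 1) + u2_sq ** 2 / (len(b) - 1)
    )
    return t, dof

t_calc, dof = unequal_variance_t(air, chem)
print(f"t = {t_calc:.1f}, degrees of freedom = {dof:.2f}")
```

The result should agree with t ≈ 21.7 and about 7.17 degrees of freedom (rounded to 7), far above the critical value of 2.365 quoted above.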

Example: Is Rayleigh’s N2 from Air Denser
Than N2 from Chemicals? (4 of 4)
Test Yourself: If the difference between the two mean values were half
as great as Rayleigh found, but the standard deviations were
unchanged, would the difference still be significant?

Case 3: Paired t Test for Comparing Individual Differences
Figure 4-8
Do the methods give the same answer?
• Use two methods to make single measurements on several different samples.
• No measurement is duplicated.

$$ s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{(0.01 - \bar{d})^2 + (0.37 - \bar{d})^2 + (-0.14 - \bar{d})^2 + \cdots}{8 - 1}} = 0.401 $$

$$ t_{\text{calculated}} = \frac{|\bar{d}|}{s_d} \sqrt{n} = \frac{0.114}{0.401} \sqrt{8} = 0.803 $$
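A paired t test reduces to a one-sample test on the differences. The sketch below is illustrative only: the slide lists just the first three differences, so the full data set is not reproduced here, and the check uses the summary values from the slide (mean difference 0.114, s_d = 0.401, n = 8). Both function names are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_from_summary(d_mean, s_d, n):
    """t for a paired test, given the mean and standard deviation of the differences."""
    return abs(d_mean) / s_d * math.sqrt(n)

def paired_t(differences):
    """t for a paired test, given the individual differences d_i."""
    return paired_t_from_summary(mean(differences), stdev(differences), len(differences))

# Summary values from the slide
t_calc = paired_t_from_summary(0.114, 0.401, 8)
print(f"t = {t_calc:.2f}")
```

With the rounded summary values this gives t ≈ 0.80, well below the 95% critical value of 2.365 for 7 degrees of freedom, so the two methods do not differ significantly.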
One-Tailed and Two-Tailed Significance Tests
Figure 4-9
Two-tailed tests: t test calculations assume:
• Certified value lies in the outer 5% of the area under the curve
One-tailed tests: compare mean with regulatory limit
• 5% region lies only on one side of the certified mean
Consider drinking water: We are concerned only if the concentration of arsenic (As) in water exceeds the limit.
EPA maximum permissible level = 10 µg As/L
Water samples → 10.06, 10.12, 10.19, and 10.04 µg As/L; x̄ ± s = 10.1025 ± 0.0675 μg/L

$$ t_{\text{calculated}} = \frac{|\bar{x} - \text{regulatory limit}|}{s} \sqrt{n} = \frac{10.1025 - 10}{0.0675} \sqrt{4} = 3.04 $$
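The arsenic example can be verified from the four readings. In this Python sketch, the one-tailed critical value (2.353 for 3 degrees of freedom at 95% confidence) is an assumption taken from a standard t table; it is not quoted on the slide.

```python
import math
from statistics import mean, stdev

samples = [10.06, 10.12, 10.19, 10.04]   # µg As/L, from the slide
limit = 10.0                             # EPA maximum permissible level, µg As/L

x_bar = mean(samples)
s = stdev(samples)
t_calc = (x_bar - limit) / s * math.sqrt(len(samples))

# Assumption: one-tailed 95% critical t for n - 1 = 3 degrees of freedom,
# from a standard t table (not given on the slide).
T_ONE_TAILED_95_DOF3 = 2.353
exceeds_limit = t_calc > T_ONE_TAILED_95_DOF3

print(f"mean = {x_bar:.4f}, s = {s:.4f}, t = {t_calc:.2f}, exceeds = {exceeds_limit}")
```

Under that assumption the mean is significantly above the limit at the 95% confidence level.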
Section 4-5
t Tests with a Spreadsheet

t Tests with a Spreadsheet
Figure 4-10: Spreadsheet for comparing mean values of Rayleigh’s nitrogen measurements

Section 4-6
Grubbs Test for an Outlier

Grubbs Test: Check for Outliers
Should a data point that looks like an anomaly be discarded?
If you make several replicate measurements, results should fall within a
Gaussian distribution about the mean. But when n is small, it can be difficult
to determine if an outlying data point falls within the normal distribution.
The Grubbs test is a statistical test to decide whether to discard a datum that
appears discrepant (an “outlier”).
If Gcalculated > Gtable, then reject the null hypothesis.
• There is <5% chance that the suspicious data point is a member of
the same population as the other measurements.
• The difference is considered significant.
Grubbs Test for an Outlier (1 of 3)
When a single data point lies far from the
other data in a set of measurements:
• First, check your notebook.
• Are there any recorded observations about the
anomalous data point (for example, a note
that solution was lost during transfer)?
• Any data point based on recorded faulty
procedure should be discarded, no matter how
well it fits the rest of the data (“blunder”).

Grubbs Test for an Outlier (2 of 3)
In the absence of a recorded blunder, use the Grubbs test.

$$ G_{\text{calculated}} = \frac{|\text{questionable value} - \bar{x}|}{s} $$

• If Gcalculated is greater than G in Table 4-6, the questionable point should be discarded.
• Only one outlier may be rejected using the Grubbs test.
Grubbs Test for an Outlier (3 of 3)
In the absence of a recorded blunder, use the Grubbs test.

$$ G_{\text{calculated}} = \frac{|\text{questionable value} - \bar{x}|}{s} $$

Volumes for replicate titrations (mL): 28.54, 28.39, 28.47, 27.68
x̄ ± s = 28.27 ± 0.40 mL; RSD = 1.4% (RSD: relative standard deviation), larger than the expected precision for this titration

$$ G_{\text{calculated}} = \frac{|27.68 - 28.27|}{0.40} = 1.482 $$

Gtable = 1.463 for 4 observations
Gcalculated > Gtable, so reject the null hypothesis.
• The questionable data point is an “outlier” and should be discarded.
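The Grubbs calculation for the titration volumes can be sketched in Python as follows. The critical values are copied from Table 4-6; the function name `grubbs` is illustrative. Note that the slide's 1.482 comes from the unrounded standard deviation (about 0.398 mL), which the code also uses.

```python
from statistics import mean, stdev

# Critical values of G at 95% confidence, from Table 4-6
G_TABLE_95 = {3: 1.153, 4: 1.463, 5: 1.672, 6: 1.822, 7: 1.938, 8: 2.032,
              9: 2.110, 10: 2.176, 11: 2.234, 12: 2.285, 15: 2.409,
              20: 2.557, 30: 2.745, 50: 2.956}

def grubbs(values):
    """Return (suspect value, G_calculated) for the point farthest from the mean."""
    x_bar, s = mean(values), stdev(values)
    suspect = max(values, key=lambda v: abs(v - x_bar))
    return suspect, abs(suspect - x_bar) / s

volumes = [28.54, 28.39, 28.47, 27.68]   # mL, from the slide
suspect, g_calc = grubbs(volumes)
is_outlier = g_calc > G_TABLE_95[len(volumes)]

print(f"suspect = {suspect}, G = {g_calc:.3f}, outlier = {is_outlier}")
```

The test is applied once; as noted earlier, only one outlier may be rejected this way.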
Table 4-6 Critical values of G for rejection of outlier

Number of observations    G (95% confidence)
3                         1.153
4                         1.463
5                         1.672
6                         1.822
7                         1.938
8                         2.032
9                         2.110
10                        2.176
11                        2.234
12                        2.285
15                        2.409
20                        2.557
30                        2.745
50                        2.956
Gcalculated = |questionable value − mean|/s. If Gcalculated > Gtable, the value in question can be rejected with 95% confidence. Values in this table are for a one-tailed test, as recommended by ASTM.

Section 4-7
The Method of Least Squares

The Method of Least Squares
For most chemical analyses, the response obtained by the given lab
procedure must be compared to known quantities (called standards).
In this way the response from an unknown quantity can be interpreted.
• Prepare a calibration curve from known standards.
• Work in a region where the calibration curve is linear (usually).

Method of least squares: used to draw the “best” straight line through
experimental data points that contain some scatter
• Some points will lie above and some below the line.
• Equation y = mx + b can be used to quantify the unknown from
its signal.
Finding the Equation of the Line
Figure 4-11
Assume:
• Uncertainty in y values is much greater than uncertainty in x values (sy >> sx).
• Uncertainties of all y values are similar.
Draw a line to minimize vertical deviations between points and line.
• Vertical deviation $d_i = y_i - y = y_i - (m x_i + b)$
Deviations can be positive or negative. To minimize magnitude, irrespective of sign, square the deviations: $d_i^2 = (y_i - m x_i - b)^2$ → method of least squares
Determinants
Mathematically finding values of m and b that
minimize the sum of the squares involves some
calculus.
We express the final solution for m and b as determinants.

$$ \begin{vmatrix} e & f \\ g & h \end{vmatrix} = eh - fg $$

$$ \begin{vmatrix} 6 & 5 \\ 4 & 3 \end{vmatrix} = (6 \times 3) - (5 \times 4) = -2 $$

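Determinants to Solve Method of Least Squares

Least-squares “best” line:

$$ \text{Slope:}\quad m = \begin{vmatrix} \sum (x_i y_i) & \sum x_i \\ \sum y_i & n \end{vmatrix} \div D $$

$$ \text{Intercept:}\quad b = \begin{vmatrix} \sum (x_i^2) & \sum (x_i y_i) \\ \sum x_i & \sum y_i \end{vmatrix} \div D $$

where D is

$$ D = \begin{vmatrix} \sum (x_i^2) & \sum x_i \\ \sum x_i & n \end{vmatrix} $$

and n is the number of points.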

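Table 4-7 Calculations for least-squares analysis
x_i         y_i         x_i y_i         x_i²          d_i (= y_i − m x_i − b)    d_i²
1           2           2               1             0.038 46                   0.001 479 3
3           3           9               9             −0.192 31                  0.036 982
4           4           16              16            0.192 31                   0.036 982
6           5           30              36            −0.038 46                  0.001 479 3
∑x_i = 14   ∑y_i = 14   ∑(x_i y_i) = 57 ∑(x_i²) = 62                             ∑(d_i²) = 0.076 923

$$ m = \begin{vmatrix} 57 & 14 \\ 14 & 4 \end{vmatrix} \div \begin{vmatrix} 62 & 14 \\ 14 & 4 \end{vmatrix} = \frac{(57 \times 4) - (14 \times 14)}{(62 \times 4) - (14 \times 14)} = \frac{32}{52} = 0.615\,38 $$

$$ b = \begin{vmatrix} 62 & 57 \\ 14 & 14 \end{vmatrix} \div \begin{vmatrix} 62 & 14 \\ 14 & 4 \end{vmatrix} = \frac{(62 \times 14) - (57 \times 14)}{(62 \times 4) - (14 \times 14)} = \frac{70}{52} = 1.346\,15 $$

$$ y = 0.615\,38\,x + 1.346\,15 $$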

Example: Finding Slope and Intercept with a
Spreadsheet (1 of 3)
Excel has functions called SLOPE and INTERCEPT, whose use is
illustrated here:

Example: Finding Slope and Intercept with a
Spreadsheet (2 of 3)
The slope in cell D3 is computed with the formula “=SLOPE(B2:B5,A2:A5)”,
where B2:B5 is the range containing the y values and A2:A5 is the range
containing x values.

Example: Finding Slope and Intercept with a
Spreadsheet (3 of 3)
Test Yourself: Change cell A3 from 3 to 3.5 and find the new slope and
intercept.

How Reliable Are Least Squares Parameters?
To estimate uncertainties in m and b, an uncertainty analysis must be
performed.
Estimate σy, the population standard deviation for all y, by calculating sy.

Example: Finding sy, um, and ub with a Spreadsheet
(1 of 3)
Excel function LINEST returns the slope and intercept and their standard uncertainties in a table
(a matrix). As an example, enter x and y values from Table 4-7 in columns A and B. Then highlight
the 3-row × 2-column region E3:F5 with your mouse. This block of cells is selected for the output
of LINEST. On the Formulas ribbon, go to Insert Function. In the window that appears, in “Or
select a category” select Statistical and double click LINEST. The new window asks for four inputs
to the function. For y values, enter B2:B5. Then enter A2:A5 for x values. The next two entries are
both “TRUE”. The first TRUE tells Excel that we want to compute the y-intercept of the line and
not force the intercept to be 0. The second TRUE tells Excel to print out the uncertainties as well
as the slope and intercept. The formula you have just entered is
“=LINEST(B2:B5,A2:A5,TRUE,TRUE)”. Now press CONTROL+SHIFT+ENTER on a PC or
CONTROL+SHIFT+RETURN on a Mac.

Example: Finding sy, um, and ub with a Spreadsheet
(2 of 3)
Excel prints out a matrix in cells E3:F5. Write labels around the block to indicate what is in
each cell. The slope and intercept are on the top line. The second line contains um and ub.
Cell F5 contains sy, and cell E5 contains a quantity called R², which is defined in Equation 5-3
and is a measure of the goodness of fit of the data to the line. The closer R² is to unity, the
better the fit.
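For readers without a spreadsheet, the same six quantities can be computed in Python for the Table 4-7 data. This is a sketch under stated assumptions: the expressions s_y = sqrt(Σd²/(n − 2)), u_m = s_y·sqrt(n/D), and u_b = s_y·sqrt(Σx²/D) are the usual least-squares results and are not written out on these slides, and R² is taken here as 1 − Σd²/Σ(y − ȳ)².

```python
import math

xs, ys = [1, 3, 4, 6], [2, 3, 4, 5]   # data from Table 4-7
n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

d = n * sum_x2 - sum_x ** 2                    # determinant D
m = (n * sum_xy - sum_x * sum_y) / d           # slope
b = (sum_x2 * sum_y - sum_xy * sum_x) / d      # intercept

ss_resid = sum((y - m * x - b) ** 2 for x, y in zip(xs, ys))
s_y = math.sqrt(ss_resid / (n - 2))            # assumed formula for s_y
u_m = s_y * math.sqrt(n / d)                   # assumed formula for u_m
u_b = s_y * math.sqrt(sum_x2 / d)              # assumed formula for u_b

y_bar = sum_y / n
r2 = 1 - ss_resid / sum((y - y_bar) ** 2 for y in ys)

print(f"m = {m:.5f}  b = {b:.5f}")
print(f"u_m = {u_m:.5f}  u_b = {u_b:.5f}")
print(f"R2 = {r2:.5f}  s_y = {s_y:.5f}")
```

The three printed rows follow the same layout as the LINEST output block described above (slope and intercept, then their uncertainties, then R² and sy).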

Example: Finding sy, um, and ub with a Spreadsheet
(3 of 3)
Test Yourself: Change cell A3 from 3 to 3.5 and apply LINEST. What is
the value of sy from LINEST?

Section 4-8
Calibration Curves

Calibration Curves
Figure 4-12
A calibration curve shows the response of an
analytical method to known quantities of analyte.
• Standard solutions contain known
concentrations of analyte.
• Blank solutions contain all reagents and
solvents used in the analysis, but contain no
deliberately added analyte.
A spectrophotometer measures the absorbance of
light (y-axis), which is proportional to the quantity
of protein analyzed (x-axis).
Table 4-8 Spectrophotometer data used to
construct calibration curve
Amount of       Absorbance of
protein (μg)    independent standards      Range    Corrected absorbance
0               0.099   0.099   0.100      0.001    −0.0003   −0.0003   0.0007
5.0             0.185   0.187   0.188      0.003    0.0857    0.0877    0.0887
10.0            0.282   0.272   0.272      0.010    0.1827    0.1727    0.1727
15.0            0.345   0.347   (0.392)    0.047    0.2457    0.2477    —
20.0            0.425   0.425   0.430      0.005    0.3257    0.3257    0.3307
25.0            0.483   0.488   0.496      0.013    0.3837    0.3887    0.3967

Constructing a Calibration Curve (1 of 2)
Figure 4-12
1. Prepare known samples of analyte covering the range (0 to 150%) of concentrations expected for unknowns.
• Tabulate amount of analyte in each standard
and response.
2. Subtract the average absorbance of the blank
solutions from each measured absorbance (corrected
absorbance).
• Blanks measure the response of the
procedure when no analyte is present.
3. Make a graph of corrected absorbance vs. quantity of
analyte.
• Inspect the graph for linearity, outliers, and
consistent y-uncertainty.
Constructing a Calibration Curve (2 of 2)
Figure 4-12
4. Use the least-squares procedure to find the best straight line through the linear portion of the data.

$$ \underbrace{\text{Corrected absorbance}}_{y} = (0.016\,30)\,\underbrace{(\mu\text{g protein})}_{x} + 0.004\,7 $$

5. If you analyze an unknown at a future time, run a blank at that time.
• Subtract the new blank signal from the unknown to correct.
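Steps 2 and 4 can be sketched in Python using the raw absorbances of Table 4-8. Which points count as the "linear portion" is an assumption here: the sketch drops the parenthesized 0.392 reading at 15 μg and the 25 μg standards, leaving 14 points, a choice that reproduces the slope and intercept quoted above.

```python
from statistics import mean

# Raw absorbances from Table 4-8, keyed by µg of protein.
# Assumption: the fit uses the 14 points from 0 to 20 µg, omitting the
# parenthesized 0.392 reading and the 25 µg standards.
raw = {
    0.0: [0.099, 0.099, 0.100],
    5.0: [0.185, 0.187, 0.188],
    10.0: [0.282, 0.272, 0.272],
    15.0: [0.345, 0.347],
    20.0: [0.425, 0.425, 0.430],
}

blank = mean(raw[0.0])                         # average blank absorbance
points = [(x, a - blank) for x, readings in raw.items() for a in readings]

x_bar = mean([x for x, _ in points])
y_bar = mean([y for _, y in points])
sxx = sum((x - x_bar) ** 2 for x, _ in points)
m = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
b = y_bar - m * x_bar

print(f"corrected absorbance = {m:.5f} * (µg protein) + {b:.4f}")
```

The slope-from-deviations form used here is algebraically equivalent to the determinant form shown earlier in the chapter.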

Example: Using a Linear Calibration Curve (1 of 3)

An unknown protein sample gave an absorbance of 0.406, and a blank had an absorbance of 0.104. How many micrograms of protein are in the unknown?

$$ \underbrace{\text{Corrected absorbance}}_{y} = (0.016\,30)\,\underbrace{(\mu\text{g protein})}_{x} + 0.004\,7 $$

Example: Using a Linear Calibration Curve (2 of 3)
Figure 4-13
Solution: The corrected absorbance is
0.406 − 0.104 = 0.302, which lies on the
linear portion of the calibration curve in
Figure 4-13. Rearranging Equation 4-25
gives:
$$ \mu\text{g of protein} = \frac{\text{corrected absorbance} - 0.004\,7}{0.016\,30} = \frac{0.302 - 0.004\,7}{0.016\,30} = 18.24\ \mu\text{g} $$
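The same rearrangement as a small Python helper, shown as a sketch; the function name `protein_ug` and the constant names are illustrative.

```python
SLOPE = 0.01630       # corrected absorbance per µg protein (from the fitted line)
INTERCEPT = 0.0047    # corrected absorbance at 0 µg

def protein_ug(absorbance, blank):
    """Convert a measured absorbance to µg of protein using the linear calibration."""
    corrected = absorbance - blank
    return (corrected - INTERCEPT) / SLOPE

mass = protein_ug(0.406, 0.104)
print(f"{mass:.2f} µg protein")
```

The helper is valid only for corrected absorbances that fall on the linear portion of the calibration curve.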

Example: Using a Linear Calibration Curve (3 of 3)

Test Yourself: What mass of protein gives a corrected absorbance of 0.250?

$$ \underbrace{\text{Corrected absorbance}}_{y} = (0.016\,30)\,\underbrace{(\mu\text{g protein})}_{x} + 0.004\,7 $$

Linear Response
Figure 4-14
The linear range of an analytical method is the analyte concentration range over which response is proportional to concentration.
Dynamic range is the concentration range over which there is a measurable response to analyte, even if the response is not linear.
• Calibration procedures with a linear response are preferred.
• Corrected analytical signal ∝ quantity of analyte.
• It is possible to obtain valid results beyond the linear
region by fitting with a nonlinear equation.
Box 4-2 Using a Nonlinear Calibration Curve
Figure 4-13
• Consider an unknown whose corrected absorbance of 0.375 lies beyond the linear range.
• Fit all the data points with a quadratic equation:

$$ y = -1.17 \times 10^{-4} x^2 + 0.018\,558\,x - 0.000\,7 $$

• Insert y = 0.375 into the equation and rearrange to the form $a x^2 + b x + c = 0$.
• Solve for x:

$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = 135\ \mu\text{g or } 23.8\ \mu\text{g} $$
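The two roots can be checked with a short Python sketch. The coefficients are used as read from the slide, and selecting the root inside the 0 to 25 μg span of the Table 4-8 standards is an assumption about which solution is physically meaningful.

```python
import math

# Quadratic calibration y = A*x**2 + B*x + C, coefficients as read from Box 4-2
A, B, C = -1.17e-4, 0.018558, -0.0007

def solve_for_x(y):
    """Return both roots of A*x**2 + B*x + (C - y) = 0, smallest first."""
    a, b, c = A, B, C - y
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b + root) / (2 * a), (-b - root) / (2 * a)))

roots = solve_for_x(0.375)
# Assumption: only a root within the calibrated range (0 to 25 µg) is meaningful.
in_range = [x for x in roots if 0 <= x <= 25]

print([round(x, 1) for x in roots], "->", round(in_range[0], 1), "µg")
```

Of the two mathematical solutions, only the one near 23.8 μg lies within the range covered by the standards.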
Good Practice (1 of 2)
Always make a graph of your data.
Figure 4-15
• A graph helps you reject bad data, prompts you to repeat a measurement, or shows that a straight line is not appropriate.
• All three data sets were fit to y = 0.5x + 3
Good Practice (2 of 2)

• It is not reliable to extrapolate any calibration curve beyond the measured range of standards.
• At least six calibration concentrations and two replicate
measurements of each unknown are recommended.
• Make each standard solution from a certified material.
• Avoid serial dilution of a single stock solution (serial dilution
propagates systematic error).
• Measure calibration solutions in random order, not in
consecutive order of increasing concentration.
Box 4-3 Importance of Graphs to Visualize Data
A good graph reveals key characteristics of data and guides statistical analysis.

• Heights on the bar graph (a) give the mean values of two data sets.
The error bars correspond to ±standard deviation of the mean.
• Data plots (b–e) show different characteristics of the data that are not
evident in the bar graph.
Propagation of Uncertainty with a Calibration Curve

• An unknown with a corrected absorbance of y = 0.302 had a protein content of x = 18.24 μg. What is the uncertainty in x?
• Standard uncertainty in x (standard deviation of the mean):

$$ u_x = \frac{s_y}{|m|} \sqrt{\frac{1}{k} + \frac{1}{n} + \frac{(y - \bar{y})^2}{m^2 \sum (x_i - \bar{x})^2}} $$

  k = number of replicate measurements
  n = number of data points
  ux = 0.23 μg
• Confidence interval for x is ±tux, where t is Student’s t for n − 2 degrees of freedom
  ±tux = ±(2.179)(0.23) = ±0.50 μg
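The uncertainty formula can be evaluated from the corrected absorbances in Table 4-8. This Python sketch assumes the calibration line was fit to the 14 points from 0 to 20 μg (omitting the parenthesized reading). The slide does not state k; in this sketch k = 4 reproduces the quoted 0.23 μg, while k = 1 gives about 0.39 μg.

```python
import math

# (µg protein, corrected absorbance) from Table 4-8.
# Assumption: 14 calibration points, 0 to 20 µg, outlier reading omitted.
data = [(0, -0.0003), (0, -0.0003), (0, 0.0007),
        (5, 0.0857), (5, 0.0877), (5, 0.0887),
        (10, 0.1827), (10, 0.1727), (10, 0.1727),
        (15, 0.2457), (15, 0.2477),
        (20, 0.3257), (20, 0.3257), (20, 0.3307)]

n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n
sxx = sum((x - x_bar) ** 2 for x, _ in data)
m = sum((x - x_bar) * (y - y_bar) for x, y in data) / sxx
b = y_bar - m * x_bar
s_y = math.sqrt(sum((y - m * x - b) ** 2 for x, y in data) / (n - 2))

def u_x(y_unknown, k):
    """Standard uncertainty in x for the mean of k replicate readings of the unknown."""
    return s_y / abs(m) * math.sqrt(
        1 / k + 1 / n + (y_unknown - y_bar) ** 2 / (m ** 2 * sxx)
    )

ux_k4 = u_x(0.302, k=4)   # k = 4 is an assumption, see the lead-in
ux_k1 = u_x(0.302, k=1)
half_width = 2.179 * ux_k4   # Student's t for n - 2 = 12 degrees of freedom, from the slide

print(f"u_x (k=4) = {ux_k4:.2f} µg, u_x (k=1) = {ux_k1:.2f} µg, ±t·u_x = {half_width:.2f} µg")
```

The half-width printed here uses the unrounded u_x, so it can differ slightly in the last digit from the slide's ±0.50 μg, which multiplies the rounded 0.23 μg.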

Section 4-9
A Spreadsheet for Least Squares

Figure 4-16: Spreadsheet for Linear Least-Squares Analysis

Figure 4-17: Adding Error Bars to a Graph

