Chapter 6: Ordinary Least Squares Estimation Procedure - The Properties
Chapter 6 Outline
• Clint’s Assignment: Assess the Effect of Studying on Quiz Scores
• Review
o Regression Model
o Ordinary Least Squares (OLS) Estimation Procedure
o The Estimates, bConst and bx, Are Random Variables
• Strategy: General Properties and a Specific Application
o Review: Assessing Clint’s Opinion Poll Results
o Preview: Assessing Professor Lord’s Quiz Results
• Standard Ordinary Least Squares (OLS) Premises
• New Equation for the Ordinary Least Squares (OLS) Coefficient
Estimate
• General Properties: Describing the Coefficient Estimate’s Probability
Distribution
o Mean (Center) of the Coefficient Estimate’s Probability
Distribution
o Variance (Spread) of the Coefficient Estimate’s Probability
Distribution
• Estimation Procedures and the Estimate’s Probability Distribution
o Importance of the Mean (Center)
o Importance of the Variance (Spread)
• Reliability of the Coefficient Estimate
o Variance (Spread) of the Error Term’s Probability Distribution
o Sample Size: Number of Observations
o Range of the Explanatory Variable
o Reliability Summary
• Best Linear Unbiased Estimation Procedure (BLUE)
Chapter 6 Prep Questions
1. Run the Distribution of Coefficient Estimates simulation in the Econometrics
Lab by clicking the following link:
Student   Minutes Studied (x)   Quiz Score (y)
1         5                     66
2         15                    87
3         25                    90
Table 6.1: First Quiz Results
Project: Use data from Professor Lord’s first quiz to assess the effect of
studying on quiz scores.
Review
Regression Model
Clint uses the following regression model to complete his assignment:
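\[ y_t = \beta_{Const} + \beta_x x_t + e_t \]
The ordinary least squares (OLS) equations for the coefficient and constant estimates are:
\[ b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \qquad b_{Const} = \bar{y} - b_x \bar{x} \]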
Using the results of the first quiz, Clint estimates the values of the coefficient and
constant:
First Quiz Data
Student   x    y
1         5    66
2         15   87
3         25   90

Ordinary Least Squares (OLS) Estimates: Esty = 63 + 1.2x
bConst = Estimated points for showing up = 63
bx = Estimated points for each minute studied = 1.2
\[ \bar{x} = \frac{x_1 + x_2 + x_3}{3} = \frac{5 + 15 + 25}{3} = \frac{45}{3} = 15 \qquad \bar{y} = \frac{y_1 + y_2 + y_3}{3} = \frac{66 + 87 + 90}{3} = \frac{243}{3} = 81 \]

Student   yt   ȳ    yt − ȳ   xt   x̄    xt − x̄
1         66   81   −15      5    15   −10
2         87   81   6        15   15   0
3         90   81   9        25   15   10

Student   (yt − ȳ)(xt − x̄)     (xt − x̄)²
1         (−15)(−10) = 150     (−10)² = 100
2         (6)(0) = 0           (0)² = 0
3         (9)(10) = 90         (10)² = 100
          Sum = 240            Sum = 200

\[ \sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x}) = 240 \qquad \sum_{t=1}^{T} (x_t - \bar{x})^2 = 200 \]
\[ b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{240}{200} = \frac{6}{5} = 1.2 \qquad b_{Const} = \bar{y} - b_x \bar{x} = 81 - \frac{6}{5} \times 15 = 63 \]
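The arithmetic above can be reproduced with a few lines of code. The following is a minimal sketch in Python; the function name ols_estimates is our own choice for illustration and does not come from the Econometrics Lab.

```python
# First quiz data from Table 6.1.
x = [5, 15, 25]    # minutes studied
y = [66, 87, 90]   # quiz scores


def ols_estimates(x, y):
    """Return (b_const, b_x) using the OLS equations for one explanatory variable."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    # Numerator: sum of cross deviations; denominator: sum of squared x deviations.
    numerator = sum((yt - y_bar) * (xt - x_bar) for xt, yt in zip(x, y))
    denominator = sum((xt - x_bar) ** 2 for xt in x)
    b_x = numerator / denominator
    b_const = y_bar - b_x * x_bar
    return b_const, b_x


b_const, b_x = ols_estimates(x, y)
print(f"bConst = {b_const:.1f}, bx = {b_x:.1f}")
```

Up to floating-point rounding, the printed values should agree with the hand calculation: 63 for the constant and 1.2 for the coefficient.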
the election was actually a tossup. In view of this, we asked how confident Clint
should be in the results of his single poll. To address this issue, we turned to the
general properties of polling procedures to assess the reliability of the estimate
Clint obtained from his single poll:
General Properties versus One Specific Application

Clint’s Estimation Procedure: Calculate the fraction of the 16 randomly selected students supporting Clint:
\[ EstFrac = \frac{v_1 + v_2 + \dots + v_{16}}{16} \]
where vt = 1 if for Clint
         = 0 if not for Clint

General Properties (Before Poll)
Random Variable: Probability Distribution
↓
How reliable is EstFrac?
Mean[EstFrac] = p = ActFrac = Actual fraction of the population supporting Clint
\[ \mathrm{Var}[EstFrac] = \frac{p(1 - p)}{T} = \frac{p(1 - p)}{16} \quad \text{where } T = SampleSize \]
↓
Mean and variance describe the center and spread of the estimate’s probability distribution

One Specific Application (After Poll)
Apply the polling procedure once to Clint’s sample of the 16 randomly selected students:
Estimate: Numerical Value
↓
\[ EstFrac = \frac{12}{16} = \frac{3}{4} = .75 \]
While we could not determine the numerical value of the estimated
fraction, EstFrac, before the poll was conducted, we could describe its probability
distribution. Using algebra, we derived the general equations for the mean and
variance of the estimated fraction’s, EstFrac’s, probability distribution. Then, we
checked our algebra with a simulation by exploiting the relative frequency
interpretation of probability: after many, many repetitions, the distribution of the
numerical values mirrors the probability distribution for one repetition.
What can we deduce before the poll is
conducted?
↓
General properties of the polling
procedure described by EstFrac’s
probability distribution.
↓
\[ b_{Const} = \bar{y} - b_x \bar{x} = 81 - \frac{6}{5} \times 15 = 63 \]
Mean[bx] = ?
Var[bx] = ?
Mean and variance describe the center and spread of the estimate’s probability distribution
While we cannot determine the numerical value of the coefficient estimate
before the quiz is given, we can describe its probability distribution. The
probability distribution tells us how likely it is for the coefficient estimate based
on a single quiz to equal each of the possible values. Using algebra, we shall
derive the general equations for the mean and variance of the coefficient
estimate’s probability distribution. Then, we will check our algebra with a
simulation by exploiting the relative frequency interpretation of probability: after
many, many repetitions, the distribution of the numerical values mirrors the
probability distribution for one repetition.
What can we deduce before the poll is
conducted?
↓
General properties of the OLS
estimation procedure described by the
coefficient estimate’s probability
distribution.
↓
Probability distribution is described by its mean (center) and variance (spread).
↓
Use algebra to derive the equations for the probability distribution’s mean and variance.
→
Check the algebra with a simulation by exploiting the relative frequency interpretation of probability.
The coefficient estimate’s probability distribution will allow us to assess the
reliability of the coefficient estimate calculated from Professor Lord’s quiz.
Standard Ordinary Least Squares (OLS) Regression Premises
To derive the equations for the mean and variance of the coefficient estimate’s
probability distribution, we shall apply the standard ordinary least squares (OLS)
regression premises. As we mentioned in Chapter 5, these premises make the
analysis as straightforward as possible. In later chapters, we will relax these
premises to study more general cases. In other words, we shall start with the most
straightforward case and then move on to more complex ones later.
• Error Term Equal Variance Premise: The variance of the error term’s
probability distribution for each observation is the same; all the variances
equal Var[e]:
Var[e1] = Var[e2] = … = Var[eT] = Var[e]
\[ b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
However, it is advantageous to use a different equation to derive the equations
for the mean and variance of the coefficient estimate’s probability distribution; we
shall use an equivalent equation that expresses the coefficient estimate in terms of
the x’s, e’s, and βx rather than in terms of the x’s and y’s:¹
\[ b_x = \beta_x + \frac{\sum_{t=1}^{T} (x_t - \bar{x}) e_t}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
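One way to convince yourself that the two equations for bx are equivalent is to compute both for the same data. The sketch below is a hedged illustration: the actual values (βConst = 50, βx = 2), the error variance of 500, and the normally distributed error terms are assumptions chosen for the example, not values stated in the text. In practice the e’s are unobservable; the simulation can use them only because it generates them.

```python
import random

random.seed(0)

# Hypothetical actual values, chosen only for illustration.
beta_const, beta_x = 50.0, 2.0
x = [5.0, 15.0, 25.0]
# Assumed: error terms drawn from a normal distribution with variance 500.
e = [random.gauss(0.0, 500.0 ** 0.5) for _ in x]
y = [beta_const + beta_x * xt + et for xt, et in zip(x, e)]

T = len(x)
x_bar = sum(x) / T
y_bar = sum(y) / T
sum_sq_x_dev = sum((xt - x_bar) ** 2 for xt in x)

# First equation: in terms of the x's and y's.
b_x_from_y = sum((yt - y_bar) * (xt - x_bar) for xt, yt in zip(x, y)) / sum_sq_x_dev
# Second equation: in terms of the x's, e's, and beta_x.
b_x_from_e = beta_x + sum((xt - x_bar) * et for xt, et in zip(x, e)) / sum_sq_x_dev

print(b_x_from_y, b_x_from_e)
```

The two printed numbers should agree up to floating-point rounding, whatever seed is used.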
Figure 6.1: Probability Distribution of Coefficient Estimates (horizontal axis: bx, with Mean[bx] = βx marked)
Mean[bx] = βx
If our algebra is correct, the mean (average) of the estimated coefficient values
should equal the actual value of the coefficient, βx, after many, many repetitions.
Figure 6.3: Histogram of Coefficient Value Estimates (horizontal axis: bx, with the value 2 marked)
Question: What does the mean (average) of the coefficient estimates equal?
Answer: It equals about 2.0.
This lends support to the equation for the mean of the coefficient estimate’s
probability distribution that we just derived. Now, change the actual coefficient
value from 2 to 4. Click Start and then after many, many repetitions, click Stop.
What does the mean (average) of the estimates equal? Next, change the actual
coefficient value to 6 and repeat the process.
          Equation:                                Simulation:
Actual    Mean of Coef Estimate    Simulation      Mean (Average) of Estimated Coef
βx        Prob Dist, Mean[bx]      Repetitions     Values, bx, from the Experiments
2         2                        >1,000,000      ≈2.0
4         4                        >1,000,000      ≈4.0
6         6                        >1,000,000      ≈6.0
Table 6.2: Distribution of Coefficient Estimate Simulation Results
Conclusion: In all cases, the mean (average) of the estimates for the
coefficient value equals the actual value of the coefficient after many, many
repetitions. The simulations confirm our algebra. The estimation procedure does
not systematically underestimate or overestimate the actual value of the
coefficient. The ordinary least squares (OLS) estimation procedure for the
coefficient value is unbiased.
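The same experiment can be sketched outside the Econometrics Lab. The code below is a minimal illustration, not the lab’s implementation: the function name simulate_mean_bx, the constant βConst = 50, the normally distributed error terms, and the number of repetitions are all assumptions made for the example.

```python
import random


def simulate_mean_bx(beta_x, beta_const=50.0, var_e=500.0, reps=20000, seed=1):
    """Average the OLS coefficient estimate over many simulated quizzes."""
    rng = random.Random(seed)
    x = [5.0, 15.0, 25.0]
    x_bar = sum(x) / len(x)
    sum_sq_x_dev = sum((xt - x_bar) ** 2 for xt in x)
    sd_e = var_e ** 0.5
    total = 0.0
    for _ in range(reps):
        # One repetition: draw new error terms, then apply the OLS equation.
        y = [beta_const + beta_x * xt + rng.gauss(0.0, sd_e) for xt in x]
        y_bar = sum(y) / len(y)
        numerator = sum((yt - y_bar) * (xt - x_bar) for xt, yt in zip(x, y))
        total += numerator / sum_sq_x_dev
    return total / reps


for actual_beta_x in (2.0, 4.0, 6.0):
    print(actual_beta_x, round(simulate_mean_bx(actual_beta_x), 2))
```

With enough repetitions, each average should lie close to the corresponding actual coefficient value, which is what unbiasedness predicts.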
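\[ = \frac{(x_1 - \bar{x})^2 \mathrm{Var}[e] + (x_2 - \bar{x})^2 \mathrm{Var}[e] + (x_3 - \bar{x})^2 \mathrm{Var}[e]}{[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2]^2} \]
Simplifying
\[ = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2} \]
We can generalize this:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]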
The simulation automatically spreads the x values uniformly between 0 and 30.
We shall continue to consider three observations; accordingly, the x values are 5,
15, and 25. To convince yourself of this, be certain that the Pause checkbox is
checked. Click Start and then Continue a few times to observe that the values of x
are always 5, 15, and 25.
Next, recall the equation we just derived for the variance of the coefficient
estimate’s probability distribution:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2} \]
By default, the variance of the error term probability distribution is 500; therefore,
the numerator equals 500. Let us turn our attention to the denominator, the sum of
squared x deviations. We have just observed that the x values are 5, 15, and 25.
Their mean is 15 and their sum of squared deviations from the mean is 200:
\[ \bar{x} = \frac{x_1 + x_2 + x_3}{3} = \frac{5 + 15 + 25}{3} = \frac{45}{3} = 15 \]

Student   xt   x̄    xt − x̄   (xt − x̄)²
1         5    15   −10      (−10)² = 100
2         15   15   0        (0)² = 0
3         25   15   10       (10)² = 100
                             Sum = 200

\[ \sum_{t=1}^{T} (x_t - \bar{x})^2 = 200 \]
That is,
\[ (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 = 200 \]
When the variance of the error term’s probability distribution equals 500 and the
sum of squared x deviations equals 200, the variance of the coefficient estimate’s
probability distribution equals 2.50:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2} = \frac{500}{200} = 2.50 \]
To show that the simulation confirms this, be certain that the Pause
checkbox is cleared and click Continue. After many, many repetitions click Stop.
Indeed, after many, many repetitions of the experiment the variance of the
numerical values is about 2.50. The simulation confirms the equation we derived
for the variance of the coefficient estimate’s probability distribution.
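A rough stand-in for this lab exercise is sketched below. It is an illustration under stated assumptions rather than the lab’s own code: the function name simulate_bx, the constant βConst = 50, the normally distributed error terms, and the repetition count are hypothetical choices.

```python
import random
import statistics


def simulate_bx(beta_x=2.0, beta_const=50.0, var_e=500.0, reps=50000, seed=2):
    """Return the simulated coefficient estimates and the variance the equation predicts."""
    rng = random.Random(seed)
    x = [5.0, 15.0, 25.0]
    x_bar = sum(x) / len(x)
    sum_sq_x_dev = sum((xt - x_bar) ** 2 for xt in x)
    sd_e = var_e ** 0.5
    estimates = []
    for _ in range(reps):
        y = [beta_const + beta_x * xt + rng.gauss(0.0, sd_e) for xt in x]
        y_bar = sum(y) / len(y)
        numerator = sum((yt - y_bar) * (xt - x_bar) for xt, yt in zip(x, y))
        estimates.append(numerator / sum_sq_x_dev)
    # Var[bx] = Var[e] / (sum of squared x deviations)
    return estimates, var_e / sum_sq_x_dev


estimates, var_from_equation = simulate_bx()
print("Equation:  ", var_from_equation)
print("Simulation:", statistics.pvariance(estimates))
```

The equation gives 500/200 = 2.5; the variance of the simulated estimates should settle near that value as the number of repetitions grows.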
Estimation Procedures and the Estimate’s Probability Distribution:
Importance of the Mean (Center) and Variance (Spread)
Let us review what we learned about estimation procedures when we studied
Clint’s opinion poll in Chapter 3:
Figure 6.5: Probability Distribution of Estimates − Importance of the Mean (horizontal axis: Estimate, with Actual Value marked)
\[ \sum_{t=1}^{T} (x_t - \bar{x})^2 = 200 \]
Be certain that the Pause checkbox is cleared. Click Start and then after many,
many repetitions, click Stop. As Table 6.3 reports, the coefficient estimate lies
within 1.0 of the actual coefficient value in 47.3 percent of the repetitions.
Now, reduce the variance of the error term’s probability distribution from
500 to 50. The variance of the coefficient estimate’s probability distribution now
equals .25:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2} = \frac{50}{200} = \frac{1}{4} = .25 \]
Click Start and then after many, many repetitions, click Stop. The histogram of
the coefficient estimates is now more closely cropped around the actual value,
2.0. The percent of repetitions in which the coefficient estimate lies within 1.0 of
the actual coefficient value rises from 47.3 percent to 95.5 percent.
                                       Equations:               Simulations:
Actual Values                          Probability Distribution Estimated Coefficient Values, bx
               Sample   x     x                                 Mean                    Percent Between
βx    Var[e]   Size     Min   Max      Mean[bx]   Var[bx]       (Average)   Variance    1.0 and 3.0
2     500      3        0     30       2.0        2.50          ≈2.0        ≈2.50       ≈47.3%
2     50       3        0     30       2.0        .25           ≈2.0        ≈.25        ≈95.5%
Table 6.3: Distribution of Coefficient Estimate Simulation Reliability Results
Why is this important? The variance measures the spread of the probability
distribution. This is important when the estimation procedure is unbiased. As the
variance decreases, the probability distribution becomes more closely cropped
around the actual coefficient value and the chances that the coefficient estimate
obtained from one quiz will lie close to the actual value increase. The simulation
confirms this; after many, many repetitions the percent of repetitions in which the
coefficient estimate lies between 1.0 and 3.0 increases from 47.3 percent to 95.5
percent. Consequently, as the error term’s variance decreases, we can expect the
estimate from one quiz to be more reliable. As the variance of the error term’s
probability distribution decreases, the estimate is more likely to be “close to” the
actual value. This is consistent with our intuition, is it not?
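The link between the variance and the chances of landing between 1.0 and 3.0 can also be sketched without a simulation. The code below assumes that the coefficient estimate is normally distributed around the actual value of 2.0; the text does not state this, so treat it as an assumption of the sketch. The function name prob_within is hypothetical.

```python
import math


def prob_within(var_bx, beta_x=2.0, lower=1.0, upper=3.0):
    """Probability that a normal variable with mean beta_x and variance var_bx lies in [lower, upper]."""
    sd = math.sqrt(var_bx)

    def cdf(value):
        # Normal cumulative distribution function written with the error function.
        return 0.5 * (1.0 + math.erf((value - beta_x) / (sd * math.sqrt(2.0))))

    return cdf(upper) - cdf(lower)


sum_sq_x_dev = 200.0  # x values 5, 15, and 25
for var_e in (500.0, 50.0):
    var_bx = var_e / sum_sq_x_dev
    print(var_e, var_bx, round(100 * prob_within(var_bx), 1))
```

Under the normality assumption, the computed percentages come out close to the simulated 47.3 percent and 95.5 percent: as the variance shrinks, more of the distribution falls within 1.0 of the actual value.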
Estimate Reliability and the Sample Size
Next, we shall investigate the effect of the sample size, the number of
observations, used to calculate the estimate. Increase the sample size from 3 to 5.
What does our intuition suggest? As we increase the number of observations, we
will have more information. With more information the estimate should become
more reliable; that is, with more information the variance of the coefficient
estimate’s probability distribution should decrease. Using the equation, let us now
calculate the variance of the coefficient estimate’s probability distribution when
there are 5 observations. With 5 observations the x values are spread uniformly at
3, 9, 15, 21, and 27; the mean (average) of the x’s, x , equals 15 and the sum of
the squared x deviations equals 360:
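\[ \bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{3 + 9 + 15 + 21 + 27}{5} = \frac{75}{5} = 15 \]

Student   xt   x̄    xt − x̄   (xt − x̄)²
1         3    15   −12      (−12)² = 144
2         9    15   −6       (−6)² = 36
3         15   15   0        (0)² = 0
4         21   15   6        (6)² = 36
5         27   15   12       (12)² = 144
                             Sum = 360

\[ \sum_{t=1}^{T} (x_t - \bar{x})^2 = 360 \]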
Applying the equation for the variance of the coefficient estimate’s probability
distribution:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + (x_4 - \bar{x})^2 + (x_5 - \bar{x})^2} \]
\[ = \frac{50}{(3 - 15)^2 + (9 - 15)^2 + (15 - 15)^2 + (21 - 15)^2 + (27 - 15)^2} \]
\[ = \frac{50}{(-12)^2 + (-6)^2 + (0)^2 + (6)^2 + (12)^2} = \frac{50}{144 + 36 + 0 + 36 + 144} = \frac{50}{360} = .1388\ldots \approx .14 \]
The variance of the coefficient estimate’s probability distribution falls from .25 to
.14. The smaller variance suggests that the coefficient estimate will be more
reliable.
Econometrics Lab 6.4: Sample Size
Are our intuition and calculations supported by the simulation? In fact, the answer
is yes.
Note that the sample size has increased from 3 to 5. Click Start and then after
many, many repetitions click Stop:
                                       Equations:               Simulations:
Actual Values                          Probability Distribution Estimated Coefficient Values, bx
               Sample   x     x                                 Mean                    Percent Between
βx    Var[e]   Size     Min   Max      Mean[bx]   Var[bx]       (Average)   Variance    1.0 and 3.0
2     500      3        0     30       2.0        2.50          ≈2.0        ≈2.50       ≈47.3%
2     50       3        0     30       2.0        .25           ≈2.0        ≈.25        ≈95.5%
2     50       5        0     30       2.0        .14           ≈2.0        ≈.14        ≈99.3%
After many, many repetitions the percent of repetitions in which the coefficient
estimate lies between 1.0 and 3.0 increases from 95.5 percent to 99.3 percent. As
the sample size increases, we can expect the estimate from one quiz to be more
reliable. As the sample size increases, the estimate is more likely to be “close to”
the actual value.
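The effect of the sample size on the variance can be sketched directly from the equation. The helper spread_x below is an assumption about how the simulation places the x values: it uses the midpoints of T equal-width subintervals, which reproduces the values quoted in the text (5, 15, 25 for three observations and 3, 9, 15, 21, 27 for five). Both function names are hypothetical.

```python
def spread_x(x_min, x_max, T):
    """Assumed placement rule: midpoints of T equal-width subintervals of [x_min, x_max]."""
    width = (x_max - x_min) / T
    return [x_min + width * (t + 0.5) for t in range(T)]


def var_bx(var_e, x):
    """Var[bx] = Var[e] divided by the sum of squared x deviations."""
    x_bar = sum(x) / len(x)
    return var_e / sum((xt - x_bar) ** 2 for xt in x)


for sample_size in (3, 5):
    x = spread_x(0.0, 30.0, sample_size)
    print(sample_size, x, round(var_bx(50.0, x), 4))
```

Adding observations adds terms to the sum of squared x deviations in the denominator, so the variance of the coefficient estimate falls.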
Estimate Reliability and the Range of x’s
Let us again begin by appealing to our intuition. As the range of x’s becomes
smaller, we are basing our estimates on less variation in the x’s, less diversity;
accordingly, we are basing our estimates on less information. As the range
becomes smaller, the estimate should become less reliable and consequently, the
variance of the coefficient estimate’s probability distribution should increase. To
confirm this, increase the minimum value of x from 0 to 10 and decrease the
maximum value from 30 to 20. The five x values are now spread uniformly
between 10 and 20 at 11, 13, 15, 17, and 19; the mean (average) of the x’s, x ,
equals 15 and the sum of the squared x deviations equals 40:
\[ \bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{11 + 13 + 15 + 17 + 19}{5} = \frac{75}{5} = 15 \]

Student   xt   x̄    xt − x̄   (xt − x̄)²
1         11   15   −4       (−4)² = 16
2         13   15   −2       (−2)² = 4
3         15   15   0        (0)² = 0
4         17   15   2        (2)² = 4
5         19   15   4        (4)² = 16
                             Sum = 40

\[ \sum_{t=1}^{T} (x_t - \bar{x})^2 = 40 \]
Applying the equation for the variance of the coefficient estimate’s probability
distribution:
\[ \mathrm{Var}[b_x] = \frac{\mathrm{Var}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2} = \frac{\mathrm{Var}[e]}{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + (x_4 - \bar{x})^2 + (x_5 - \bar{x})^2} \]
\[ = \frac{50}{(11 - 15)^2 + (13 - 15)^2 + (15 - 15)^2 + (17 - 15)^2 + (19 - 15)^2} \]
\[ = \frac{50}{(-4)^2 + (-2)^2 + (0)^2 + (2)^2 + (4)^2} = \frac{50}{16 + 4 + 0 + 4 + 16} = \frac{50}{40} = \frac{5}{4} = 1.25 \]
After changing the minimum value of x to 10 and the maximum value to 20, click
Start and then, after many, many repetitions, click Stop.
                                       Equations:               Simulations:
Actual Values                          Probability Distribution Estimated Coefficient Values, bx
               Sample   x     x                                 Mean                    Percent Between
βx    Var[e]   Size     Min   Max      Mean[bx]   Var[bx]       (Average)   Variance    1.0 and 3.0
2     500      3        0     30       2.0        2.50          ≈2.0        ≈2.50       ≈47.3%
2     50       3        0     30       2.0        .25           ≈2.0        ≈.25        ≈95.5%
2     50       5        0     30       2.0        .14           ≈2.0        ≈.14        ≈99.3%
2     50       5        10    20       2.0        1.25          ≈2.0        ≈1.25       ≈62.8%
Table 6.5: Distribution of Coefficient Estimate Simulation Reliability Results
After many, many repetitions the percent of repetitions in which the coefficient
estimate lies between 1.0 and 3.0 decreases from 99.3 percent to 62.8 percent. An
estimate from one repetition will be less reliable. As the range of the x’s
decreases, the estimate is less likely to be “close to” the actual value.
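A short calculation makes the size of the range effect concrete. The sketch below again assumes that the x values sit at the midpoints of equal-width subintervals (an assumption that matches the values quoted in the text); the function names are hypothetical.

```python
def spread_x(x_min, x_max, T):
    """Assumed placement rule: midpoints of T equal-width subintervals of [x_min, x_max]."""
    width = (x_max - x_min) / T
    return [x_min + width * (t + 0.5) for t in range(T)]


def sum_sq_dev(x):
    """Sum of squared deviations of the x values from their mean."""
    x_bar = sum(x) / len(x)
    return sum((xt - x_bar) ** 2 for xt in x)


var_e = 50.0
wide = spread_x(0.0, 30.0, 5)     # range of 30
narrow = spread_x(10.0, 20.0, 5)  # range of 10

var_wide = var_e / sum_sq_dev(wide)
var_narrow = var_e / sum_sq_dev(narrow)
print(narrow, var_wide, var_narrow, var_narrow / var_wide)
```

Shrinking the range to one third of its original width divides every x deviation by 3, so the sum of squared deviations falls by a factor of 9 (from 360 to 40) and the variance of the coefficient estimate rises by the same factor.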
Reliability Summary
Our simulation results illustrate relationships between information, the variance of
the coefficient estimate’s probability distribution, and the reliability of an
estimate:
More and/or more reliable Less and/or less reliable
information. information.
↓ ↓
Variance of coefficient Variance of coefficient
estimate’s probability estimate’s probability
distribution smaller. distribution larger.
↓ ↓
Estimate more reliable; more Estimate less reliable; less
likely the estimate is “close to” likely the estimate is “close to”
the actual value. the actual value.
Best Linear Unbiased Estimation Procedure (BLUE)
Figure 6.7: Any Two Estimation Procedure (y plotted against x; x values 3, 9, 15, 21, and 27)
Figure 6.8: Min-Max Estimation Procedure (y plotted against x; x values 3, 9, 15, 21, and 27)
Econometrics Lab 6.6: Comparing the Ordinary Least Squares (OLS), Any Two,
and Min-Max Estimation Procedures
We shall now use the BLUE simulation in our Econometrics Lab to justify our
emphasis on the ordinary least squares (OLS) estimation procedure.
By default, the sample size equals 5 and the variance of the error term’s
probability distribution equals 500. The From-To values are specified as 1.0 and
3.0:
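A comparison of the three procedures can be sketched in code. The definitions used below are assumptions of this sketch, since the surviving text names the procedures only in the figure captions: Any Two is taken to be the slope of the line through two randomly chosen observations, and Min-Max the slope of the line through the observations with the smallest and largest x. The constant βConst = 50, the normally distributed error terms, and the repetition count are likewise hypothetical.

```python
import random
import statistics


def compare_procedures(reps=20000, beta_const=50.0, beta_x=2.0, var_e=500.0, seed=3):
    """Simulate coefficient estimates from OLS and two assumed alternative procedures."""
    rng = random.Random(seed)
    x = [3.0, 9.0, 15.0, 21.0, 27.0]
    T = len(x)
    x_bar = sum(x) / T
    sum_sq_x_dev = sum((xt - x_bar) ** 2 for xt in x)
    sd_e = var_e ** 0.5
    results = {"OLS": [], "Any Two": [], "Min-Max": []}
    for _ in range(reps):
        y = [beta_const + beta_x * xt + rng.gauss(0.0, sd_e) for xt in x]
        y_bar = sum(y) / T
        numerator = sum((yt - y_bar) * (xt - x_bar) for xt, yt in zip(x, y))
        results["OLS"].append(numerator / sum_sq_x_dev)
        # Assumed Any Two rule: slope through two randomly chosen observations.
        i, j = rng.sample(range(T), 2)
        results["Any Two"].append((y[j] - y[i]) / (x[j] - x[i]))
        # Assumed Min-Max rule: slope through the smallest-x and largest-x observations.
        results["Min-Max"].append((y[-1] - y[0]) / (x[-1] - x[0]))
    return results


results = compare_procedures()
for name, estimates in results.items():
    print(name, round(statistics.mean(estimates), 2), round(statistics.pvariance(estimates), 2))
```

Under these assumed definitions all three procedures average out near the actual coefficient value, but the ordinary least squares (OLS) estimates show the smallest variance, which is the pattern the BLUE result leads us to expect.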
\[ b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
bx is expressed in terms of the x’s and y’s. We wish to express bx in terms of the
x’s, e’s, and βx.
Strategy: Focus on the numerator of the expression for bx and substitute for the
y’s to express the numerator in terms of the x’s, e’s, and βx. As we shall shortly
show, once we do this, our goal will be achieved.
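We begin with the numerator, $\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})$, and substitute βConst + βxxt + et
for yt:
\[ \sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x}) = \sum_{t=1}^{T} (\beta_{Const} + \beta_x x_t + e_t - \bar{y})(x_t - \bar{x}) \]
Rearranging terms.
\[ = \sum_{t=1}^{T} (\beta_{Const} - \bar{y} + \beta_x x_t + e_t)(x_t - \bar{x}) \]
Adding and subtracting $\beta_x \bar{x}$.
\[ = \sum_{t=1}^{T} (\beta_{Const} + \beta_x \bar{x} - \bar{y} + \beta_x (x_t - \bar{x}) + e_t)(x_t - \bar{x}) \]
Splitting the sum into three terms.
\[ = (\beta_{Const} + \beta_x \bar{x} - \bar{y}) \sum_{t=1}^{T} (x_t - \bar{x}) + \beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2 + \sum_{t=1}^{T} (x_t - \bar{x}) e_t \]
Now, focus on the first term, $(\beta_{Const} + \beta_x \bar{x} - \bar{y}) \sum_{t=1}^{T} (x_t - \bar{x})$. What does $\sum_{t=1}^{T} (x_t - \bar{x})$
equal?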
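\[ \sum_{t=1}^{T} (x_t - \bar{x}) = \sum_{t=1}^{T} x_t - \sum_{t=1}^{T} \bar{x} \]
Replacing $\sum_{t=1}^{T} \bar{x}$ with $T\bar{x}$.
\[ = \sum_{t=1}^{T} x_t - T\bar{x} \]
Since $\bar{x} = \frac{\sum_{t=1}^{T} x_t}{T}$.
\[ = \sum_{t=1}^{T} x_t - T \, \frac{\sum_{t=1}^{T} x_t}{T} \]
Simplifying.
\[ = \sum_{t=1}^{T} x_t - \sum_{t=1}^{T} x_t \]
\[ = 0 \]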
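Next, return to the expression for the numerator, $\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})$:
\[ \sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x}) = (\beta_{Const} + \beta_x \bar{x} - \bar{y}) \sum_{t=1}^{T} (x_t - \bar{x}) + \beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2 + \sum_{t=1}^{T} (x_t - \bar{x}) e_t \]
Since $\sum_{t=1}^{T} (x_t - \bar{x}) = 0$.
\[ = 0 + \beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2 + \sum_{t=1}^{T} (x_t - \bar{x}) e_t \]
Therefore,
\[ \sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x}) = \beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2 + \sum_{t=1}^{T} (x_t - \bar{x}) e_t \]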
Last, apply this to the equation we derived for bx in Chapter 5:
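\[ b_x = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
\[ = \frac{\beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2 + \sum_{t=1}^{T} (x_t - \bar{x}) e_t}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
\[ = \frac{\beta_x \sum_{t=1}^{T} (x_t - \bar{x})^2}{\sum_{t=1}^{T} (x_t - \bar{x})^2} + \frac{\sum_{t=1}^{T} (x_t - \bar{x}) e_t}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]
\[ = \beta_x + \frac{\sum_{t=1}^{T} (x_t - \bar{x}) e_t}{\sum_{t=1}^{T} (x_t - \bar{x})^2} \]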
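\[ b_x^{OLS} = \frac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
Let $w_t^{OLS}$ equal the ordinary least squares (OLS) “linear weights”; more
specifically,
\[ b_x^{OLS} = \sum_{t=1}^{T} w_t^{OLS} (y_t - \bar{y}) \quad \text{where} \quad w_t^{OLS} = \frac{(x_t - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]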
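Now, let us derive two properties of $w_t^{OLS}$:
• $\sum_{t=1}^{T} w_t^{OLS} = 0$
• $\sum_{t=1}^{T} w_t^{OLS} (x_t - \bar{x}) = 1$
First, $\sum_{t=1}^{T} w_t^{OLS} = 0$:
\[ \sum_{t=1}^{T} w_t^{OLS} = \sum_{t=1}^{T} \frac{(x_t - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} (x_t - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} x_t - \sum_{t=1}^{T} \bar{x}}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} x_t - T\bar{x}}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
Since $\bar{x} = \frac{\sum_{t=1}^{T} x_t}{T}$.
\[ = \frac{\sum_{t=1}^{T} x_t - T \, \frac{\sum_{t=1}^{T} x_t}{T}}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
Simplifying.
\[ = \frac{\sum_{t=1}^{T} x_t - \sum_{t=1}^{T} x_t}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = 0 \]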
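Second, $\sum_{t=1}^{T} w_t^{OLS} (x_t - \bar{x}) = 1$:
\[ \sum_{t=1}^{T} w_t^{OLS} (x_t - \bar{x}) = \sum_{t=1}^{T} \frac{(x_t - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2} (x_t - \bar{x}) \]
Simplifying.
\[ = \sum_{t=1}^{T} \frac{(x_t - \bar{x})^2}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} (x_t - \bar{x})^2}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = 1 \]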
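The two weight properties are easy to check numerically. The sketch below uses the five x values from the sample-size example (3, 9, 15, 21, 27); the variable names are our own.

```python
x = [3.0, 9.0, 15.0, 21.0, 27.0]
x_bar = sum(x) / len(x)
sum_sq_x_dev = sum((xi - x_bar) ** 2 for xi in x)

# OLS "linear weights": each x deviation divided by the sum of squared x deviations.
w_ols = [(xt - x_bar) / sum_sq_x_dev for xt in x]

sum_of_weights = sum(w_ols)
weighted_x_dev = sum(wt * (xt - x_bar) for wt, xt in zip(w_ols, x))
weighted_x = sum(wt * xt for wt, xt in zip(w_ols, x))

print(sum_of_weights, weighted_x_dev, weighted_x)
```

Up to floating-point rounding, the weights sum to 0 and the weighted sum of the x deviations equals 1. Because the weights sum to 0, the weighted sum of the x’s themselves also equals 1.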
Since $\sum_{t=1}^{T} w_t^{OLS} = 0$ and $\sum_{t=1}^{T} w_t^{OLS} x_t = 1$.
\[ = 0 + \beta_{Const} \sum_{t=1}^{T} w_t' + \beta_x + \beta_x \sum_{t=1}^{T} w_t' x_t + \sum_{t=1}^{T} (w_t^{OLS} + w_t') e_t \]
Therefore,
\[ b_x' = \beta_{Const} \sum_{t=1}^{T} w_t' + \beta_x + \beta_x \sum_{t=1}^{T} w_t' x_t + \sum_{t=1}^{T} (w_t^{OLS} + w_t') e_t \]
Now, calculate the mean of the new estimate’s probability distribution, Mean[b′x]:
\[ \mathrm{Mean}[b_x'] = \mathrm{Mean}\Big[ \beta_{Const} \sum_{t=1}^{T} w_t' + \beta_x + \beta_x \sum_{t=1}^{T} w_t' x_t + \sum_{t=1}^{T} (w_t^{OLS} + w_t') e_t \Big] \]
\[ \sum_{t=1}^{T} w_t' = 0 \quad \text{and} \quad \sum_{t=1}^{T} w_t' x_t = 0 \]
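\[ = \frac{\sum_{t=1}^{T} (x_t - \bar{x}) w_t'}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} x_t w_t' - \sum_{t=1}^{T} \bar{x} w_t'}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = \frac{\sum_{t=1}^{T} x_t w_t' - \bar{x} \sum_{t=1}^{T} w_t'}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
Since $\sum_{t=1}^{T} x_t w_t' = 0$ and $\sum_{t=1}^{T} w_t' = 0$.
\[ = \frac{0 - 0}{\sum_{i=1}^{T} (x_i - \bar{x})^2} \]
\[ = 0 \]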
1. Appendix 6.1, appearing at the end of this chapter, shows how we can derive the
second equation for the coefficient estimate, bx, from the first.
2. The proof appears at the end of this chapter in Appendix 6.2.
3. To reduce potential confusion, the summation index in the denominator has been
changed from t to i.