
Topic 2: Simple Linear Regression

Learning Objectives

After completing this topic, students will be able to:

- Determine the significance of the predictor variable in explaining variability in the dependent variable;
- Predict values of the dependent variable for given values of the explanatory variable;
- Use linear regression methods to estimate empirical relationships;
- Evaluate and mitigate the effects of departures from classical statistical assumptions on linear regression estimates; and
- Critically evaluate simple econometric analyses.

Keywords: Simple linear regression model, regression parameters, regression line, residuals,
principle of least squares, least squares estimates, least squares line, fitted values, predicted
values, coefficient of determination, least squares estimators, distributions of least squares
estimators, hypotheses on regression parameters, confidence intervals for regression.

2.1. Introduction

Linear regression is probably the most widely used, and most useful, statistical technique for solving economic problems. Linear regression models are extremely powerful and can empirically untangle very complicated relationships between variables. In general, the technique is useful, among other applications, in helping to explain observations of a dependent variable, usually denoted Y, in terms of observed values of one or more independent variables, usually denoted X1, X2, ... A key feature of all regression models is the inclusion of an error term, which captures the sources of variation not accounted for by the explanatory variables.

This Topic presents simple linear regression models, that is, regression models with just one independent variable, where the relationship between the dependent variable and the independent variable is linear (a straight line). Although these models are of a simple nature, they are important for various reasons. Firstly, they are very common. This is partly due to the fact that non-linear relationships can often be approximated by straight lines over limited ranges. Secondly, in cases where a scatterplot of the data displays a non-linear relationship between the dependent variable and the independent variable, it is sometimes possible to transform the data into a new pair of variables with a linear relationship. That is, we can transform a simple non-linear regression model into a simple linear regression model and analyze the data using linear models. Lastly, the simplicity of these models makes them useful in providing an overview of the general methodology. In Topic 3, we shall extend the results for simple linear regression models to the case of more than one explanatory variable.

A formal definition of the simple linear regression model is given in Section 2.2. In Section 2.3,
we discuss how to fit the model, and how to estimate the variation away from the line. Section
2.4 presents inference on simple linear regression models.

2.2. Simple linear regression models

In most of the examples and exercises in Topic 1, there was only one explanatory variable, and the relationship between this variable and the dependent variable was a straight line with some random fluctuation around the line.

Example 2.1. Farm productivity and fertilizer use

These data were obtained from 15 farmers to analyze the relationship between farm productivity (Y), measured in qt/ha, and fertilizer use (X), measured in kg/ha.

Table 2.1. Farm productivity and fertilizer use

Farm productivity (qt/ha)    Fertilizer use (kg/ha)
28                           10
19                            8
30                           10
50                           15
35                           12
40                           16
22                            9
32                           10
34                           13
44                           16
60                           20
75                           22
45                           14
38                           11
40                           10

A scatterplot of the data is shown in Figure 2.1.

[Scatterplot: farm productivity (qt/ha) on the vertical axis against fertilizer use (kg/ha) on the horizontal axis]

Figure 2.1: Farm productivity against fertilizer use

The relationship between the two variables can be described by a straight line, together with some random factors affecting farm productivity. Thus, we can use the following linear specification as a model for analyzing the data:

Yi = β0 + β1Xi + εi,   i = 1, ..., 15.

This is an example of a simple linear regression model.

In general, suppose we have a dependent variable Y and an independent variable X. Then the simple linear regression model for Y on X is given by:

Yi = β0 + β1Xi + εi,   i = 1, ..., n,        (2.1)

where β0 and β1 are unknown parameters, and the εi are independent random variables with zero mean and constant variance for all i.

The parameters β0 and β1 are called regression parameters (or regression coefficients), and the line h(Xi) = β0 + β1Xi is called the regression line or the linear predictor. Note that a general h(.) is called a regression curve. The regression parameters β0 and β1 are unknown, non-random parameters. They are the intercept and the slope, respectively, of the straight line relating Y to X.

The name simple linear regression model refers to the fact that the mean value of the dependent variable, E(Yi) = β0 + β1Xi, is a linear function of the regression parameters β0 and β1.

The terms εi in (2.1) are called random errors or random terms. The random error εi accounts for the variation of the ith dependent variable Yi away from the linear predictor β0 + β1Xi at the point Xi. That is,

εi = Yi - β0 - β1Xi,   i = 1, ..., n.        (2.2)

The εi are independent random variables with zero mean and the same variance. Hence, the dependent variables Yi are independent with means β0 + β1Xi and constant variance equal to the variance of εi.

For the above example, an interpretation of the regression parameters β0 and β1 is as follows:
β0: the expected farm productivity for a hypothetical farmer who applies no fertilizer.
β1: the expected change in farm productivity when fertilizer application is increased by one kg. Observe that the slope of the line fitted below is positive, implying that farm productivity increases with increasing fertilizer application.
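To make the role of the random error concrete, the following short Python sketch (not part of the original module) simulates data from a model of the form (2.1). The parameter values β0 = 5, β1 = 2 and σ = 3, and the use of NumPy, are illustrative assumptions only.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    beta0, beta1, sigma = 5.0, 2.0, 3.0        # hypothetical parameter values
    x = np.linspace(5, 25, 15)                 # fixed (non-random) explanatory values
    eps = rng.normal(0.0, sigma, size=x.size)  # independent errors: zero mean, constant variance
    y = beta0 + beta1 * x + eps                # responses scattered around the regression line

    print(np.column_stack((x, y)).round(2))

Each simulated observation lies on the line β0 + β1x plus a random vertical displacement εi, which is exactly how the model describes the farm productivity data.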

2.2.1. Assumptions Underlying Simple Linear Regression

Regression, like most statistical techniques, has a set of underlying assumptions that are expected
to be in place if we are to have confidence in estimating a model. Some of the assumptions are
required to make an estimate, even if the only goal is to describe a set of data. Other assumptions
are required if we want to make inference about a population from sample information.

1. The dependent variable Y is measured as a continuous variable, not as a dichotomy or an ordinal measurement.

2. The independent variable can be continuous, dichotomous, or ordinal.

2.2.2. Assumptions about the error term

The error term in the regression model provides an estimate of the standard error of the model and is used in making inferences, including tests of the regression coefficients. To use regression analysis properly, there are a number of conditions on the error term that we must be able to reasonably assume are true. If we cannot believe these assumptions are reasonable for our model, the results may be biased or may no longer have minimum variance.

The following are some of the assumptions about the error term in the regression model.

1. The mean of the probability distribution of the error term is zero (E(εi) = 0)

This holds by construction for the OLS estimator, but it also reflects the notion that we do not expect the error terms to be mostly positive or mostly negative (systematically over- or under-estimating the regression line), but rather to be centered around the regression line.

Assumptions about the error term in regression are very important for statistical inference -
making statements from a sample to a larger population.

2. The error terms have constant variance (Var(εi) = σ²)

This implies that we assume a constant variance for Y across all levels of the independent variable. This property is called homoscedasticity, and it enables us to pool information from all the data to make a single estimate of the variance. Data that do not show constant error variance are said to exhibit heteroscedasticity, which must be corrected by a data transformation or other methods.

3. The error term is normally distributed (εi ~ N(0, σ²))

This assumption follows from statistical theory on the sampling distribution of the regression coefficients and becomes more reasonable as the sample size gets larger. It enables us to make inferences from a sample to a population, much as we did for the mean.

4. The error terms are independent of each other and of the independent variable in the model (Cov(εi, εj) = 0 for i ≠ j, and Cov(Xi, εi) = 0)

This means that the error terms are uncorrelated with each other and with the independent variable in the model. Correlated error terms often occur in time series data, where the problem is known as autocorrelation, while correlation of the error terms with the independent variable is called endogeneity. If the error terms are correlated with each other or with the independent variable, it usually implies that our model is mis-specified. Another way to view the problem is that there is still a pattern left in the data that could be explained, for example by including a lagged variable in a time series model, or a nonlinear term in the case of correlation with an independent variable. Simple informal checks of these error-term assumptions, based on the fitted residuals, are sketched below.
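As a rough illustration, the sketch below (not from the module) computes the fitted residuals for a straight-line fit and applies simple, informal checks of the four error-term assumptions; the function name and the specific checks are the author's illustrative choices, and SciPy is assumed to be available.

    import numpy as np
    from scipy import stats

    def residual_checks(x, y):
        """Informal checks of the error-term assumptions using fitted residuals."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        resid = y - (b0 + b1 * x)                 # fitted residuals

        print("mean of residuals (about 0 by construction):", resid.mean().round(6))
        # crude homoscedasticity check: residual spread for low versus high x
        low, high = resid[x <= np.median(x)], resid[x > np.median(x)]
        print("residual std, low x vs high x:", low.std(ddof=1).round(2), high.std(ddof=1).round(2))
        # normality check (low power in small samples; indicative only)
        w, p = stats.shapiro(resid)
        print("Shapiro-Wilk p-value:", round(p, 3))
        # lag-1 correlation of residuals (relevant when the data are time-ordered)
        print("lag-1 residual correlation:", round(np.corrcoef(resid[:-1], resid[1:])[0, 1], 3))

More formal methods for checking these assumptions are discussed in Topic 4.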

2.3. Fitting the Model

Having decided that a straight line might describe the relationship in the data well, the obvious
question is now: which line fits the data best?

In Figure 2.2 a line is added to the scatterplot of the data on farm productivity and fertilizer use. Many different lines could be drawn through this data set. However, there is only one line that best fits the data: the line with the property that the sum of squared residuals is a minimum.

[Scatterplot of farm productivity against fertilizer use, with the fitted line superimposed]

Figure 2.2. Line of best fit to the data on farm productivity and fertilizer use

The most common criterion for estimating the best fitting line to data is the principle of least
squares. This criterion is described in Subsection 2.3.1. Subsection 2.3.2 concerns a measure of
the strength of the straight-line relationship. When we estimate the regression line, we effectively
estimate the two regression parameters β0 and β1. That leaves one remaining parameter in the
model: the common variance σ2 of the dependent variables. We discuss how to estimate σ2 in
Subsection 2.3.3.

2.3.1. The principle of least squares

The principle of least squares is based on the residuals. For any line, the residuals are the deviations of the dependent variables Yi away from the line. Note that residuals always refer to a given line or curve. The residuals are usually denoted by εi, like the random errors in (2.2). The reason for this notation is that, if the line is the true regression line of the model, then the residuals are exactly the random errors εi in (2.2). For a given line h̃(X) = β̃0 + β̃1X, the observed value of εi is the difference between the ith observation Yi and the linear predictor β̃0 + β̃1Xi at the point Xi. That is,

ε̃i = Yi - β̃0 - β̃1Xi,   i = 1, ..., n.        (2.3)

The observed values of εi are called observed residuals (or just residuals). The residuals are the vertical deviations of the observed values from the line.

Note that the better the line fits the data, the smaller the residuals will be. Thus, we can use the 'sizes' of the residuals as a measure of how well a proposed line fits the data. If we simply used the sum of the residuals, large positive and large negative values would cancel out; this problem can be avoided by using the sum of the squared residuals instead. If this measure, the sum of squared residuals, is small, the line explains the variation in the data well; if it is large, the line explains the variation in the data poorly. The principle of least squares is to estimate the regression line by the line which minimizes the sum of squared residuals. Or, equivalently: estimate the regression parameters β0 and β1 by the values which minimize the sum of squared residuals.

The sum of squared residuals, or, as it is usually called, the residual sum of squares, is denoted by RSS (or RSS(β0, β1), to emphasize that it is a function of β0 and β1), and is given by

RSS = RSS(β0, β1) = Σ (yi - β0 - β1xi)²,        (2.4)

where the sum runs over i = 1, ..., n. In order to minimize RSS with respect to β0 and β1 we differentiate (2.4), and get

∂RSS/∂β0 = -2 Σ (yi - β0 - β1xi)

∂RSS/∂β1 = -2 Σ xi (yi - β0 - β1xi)

Setting the derivatives equal to zero and re-arranging the terms yields the following normal equations:

Σ yi = n β0 + β1 Σ xi

Σ xi yi = β0 Σ xi + β1 Σ xi²
Solving these equations for β0 and β1 provides the least squares estimates β̂0 (read 'beta-nought-hat') and β̂1 ('beta-one-hat') of β0 and β1, respectively. They are given by

β̂0 = ȳ - β̂1 x̄

β̂1 = Σ (xi - x̄)(yi - ȳ) / Σ (xi - x̄)²

where ȳ = Σ yi / n and x̄ = Σ xi / n denote the sample means of the dependent and explanatory variable, respectively.

The estimated regression line is called the least squares line or the fitted regression line, and is given by:

ŷ = β̂0 + β̂1 x        (2.5)

The values ŷi = β̂0 + β̂1 xi are called the fitted values or the predicted values. The fitted value ŷi is an estimate of the expected value of the dependent variable for a given value xi of the explanatory variable. The residuals corresponding to the fitted regression line are called the fitted residuals, or simply the residuals. They are given by

ε̂i = yi - ŷi = yi - β̂0 - β̂1 xi,   i = 1, ..., n.        (2.6)

The fitted residuals can be thought of as observations of the random errors εi in the simple linear regression model (2.1).

It is convenient to use the following shorthand notation for the sums involved in the expressions for the parameter estimates (all summations are for i = 1, ..., n):

sxx = Σ (xi - x̄)² = Σ xi² - (Σ xi)²/n

syy = Σ (yi - ȳ)² = Σ yi² - (Σ yi)²/n

sxy = syx = Σ (xi - x̄)(yi - ȳ) = Σ xi yi - (Σ xi)(Σ yi)/n

The sums sxx and syy are called corrected sums of squares, and the sums sxy and syx are called corrected sums of cross products. The corresponding sums involving the random variables Yi rather than the observations yi are denoted by upper-case letters: Syy, Sxy and Syx. In this notation, the least squares estimates of the slope β1 and the intercept β0 of the regression line are given by:

β̂1 = sxy / sxx        (2.7)

and

β̂0 = ȳ - β̂1 x̄        (2.8)

respectively.

Note that the estimate β̂1 is undefined if sxx = 0 (division by zero). But this is not a problem in practice: if sxx = 0, the explanatory variable takes only one value, and there can be no best-fitting line. Note also that the least squares line passes through the centroid of the data, the point (x̄, ȳ).

For the data on farm productivity and fertilizer, the least squares estimates of the regression parameters are given by

β̂1 = 3.271

β̂0 = -3.278

So the fitted least squares line has the equation

ŷ = -3.278 + 3.271x
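These estimates can be reproduced directly from Table 2.1. The following Python sketch (an illustration, not part of the module) evaluates the corrected sums and the least squares formulas (2.7) and (2.8):

    import numpy as np

    # Data from Table 2.1
    y = np.array([28, 19, 30, 50, 35, 40, 22, 32, 34, 44, 60, 75, 45, 38, 40], float)  # qt/ha
    x = np.array([10,  8, 10, 15, 12, 16,  9, 10, 13, 16, 20, 22, 14, 11, 10], float)  # kg/ha

    sxx = np.sum((x - x.mean()) ** 2)                # corrected sum of squares of x
    sxy = np.sum((x - x.mean()) * (y - y.mean()))    # corrected sum of cross products

    b1_hat = sxy / sxx                               # slope estimate, equation (2.7)
    b0_hat = y.mean() - b1_hat * x.mean()            # intercept estimate, equation (2.8)

    fitted = b0_hat + b1_hat * x                     # fitted values
    resid = y - fitted                               # fitted residuals, equation (2.6)

    print(round(b1_hat, 3), round(b0_hat, 3))        # approximately 3.271 and -3.278

The printed values agree with the slope and intercept quoted above.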

The least squares line is shown in Figure 2.3. The line appears to fit the data reasonably well.

[Scatterplot of farm productivity against fertilizer use, with the least squares line superimposed]

Figure 2.3: Farm productivity and fertilizer data; the least squares line

The least squares principle is the traditional and most common method for estimating the regression parameters. But other estimation criteria exist: for example, estimating the parameters by the values that minimize the sum of absolute values of the residuals, or by the values that minimize the sum of orthogonal distances between the observed values and the fitted line. The principle of least squares has various advantages over the other methods. For example, it can be shown that, if the dependent variables are normally distributed (which is often the case), the least squares estimates of the regression parameters are exactly the maximum likelihood estimates of the parameters.

2.3.2. Coefficient of determination

In the previous subsection we used the principle of least squares to fit the ‘best’ straight line to
data. But how well does the least squares line explain the variation in the data? In this subsection
we describe a measure for roughly assessing how well a fitted line describes the variation in data:
the coefficient of determination.

The coefficient of determination compares the amount of variation in the data away from the fitted line with the total amount of variation in the data. The argument is as follows: if we did not have the linear model, we would have to use the 'naïve' model ŷ = ȳ instead. The variation away from the naïve model is Syy = Σ (yi - ȳ)²: the total amount of variation in the data. However, if we use the least squares line (2.5) as the model, the variation away from the model is only

RSS(β̂0, β̂1) = Σ (yi - β̂0 - β̂1 xi)² = Syy - Sxy²/sxx.

A measure of the strength of the linear relationship between Y and X is the coefficient of determination R²: it is the proportional reduction in variation obtained by using the least squares line instead of the naïve model. That is, the reduction in variation away from the model, Syy - RSS, as a proportion of the total variation Syy:

R² = (Syy - RSS)/Syy = (Syy - (Syy - Sxy²/sxx))/Syy = Sxy²/(sxx Syy)

The larger the value of R², the greater the reduction from Syy to RSS relative to Syy, and the stronger the relationship between Y and x. An estimate of R² is found by substituting Syy and Sxy by the observed sums syy and sxy, that is,

r² = sxy²/(sxx syy)

Note that the square root of r² is exactly the estimate from Module 1 of the Pearson correlation coefficient, ρ, between x and Y when x is regarded as a random variable:

r = [sxy/(n - 1)] / (sx sy)

where sx = √(sxx/(n - 1)) and sy = √(syy/(n - 1)) are the standard deviations of x and y, respectively.
respectively.

The value of R² will always lie between 0 and 1 (or, in percentages, between 0% and 100%). It is equal to 1 if β̂1 ≠ 0 and RSS = 0, that is, if all the data points lie precisely on the fitted straight line (i.e. when there is a 'perfect' relationship between Y and x). If the coefficient of determination is close to 1, it is an indication that the data points lie close to the least squares line. The value of R² is zero if RSS = Syy, that is, if the fitted straight-line model offers no more information about the value of Y than the naïve model does.

It is tempting to use R2 as a measure of whether a model is good or not. This is not appropriate.
Try and think of why for a moment before reading on.

The coefficient of determination is only a measure of how well a straight-line model describes the variation in the data compared to the naïve model, not compared to other models in general. Even though R² may be close to 1 (i.e. a straight line explains a large proportion of the variation), it could easily be that a non-linear model explains the variation in the data much better than the linear one. Methods for assessing the appropriateness of the assumption of a straight-line relationship between Y and x will be discussed in Topic 4.

The relevant summary statistics for the data on farm productivity and fertilizer application are:

sxx = 234.93,   syy = 3256.57,   sxy = 717.30

The coefficient of determination is given by

r² = sxy²/(sxx syy) = 717.30²/(234.93 × 3256.57) = 0.673 = 67.3%

Since the coefficient of determination is fairly high, the model seems to describe the variation in the data reasonably well.
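A two-line check of this calculation, using the quoted corrected sums (a sketch, not part of the module):

    sxx, syy, sxy = 234.93, 3256.57, 717.30   # summary statistics quoted above

    r2 = sxy ** 2 / (sxx * syy)               # coefficient of determination
    print(round(r2, 3))                       # about 0.673, i.e. 67.3%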

2.3.3. Estimating the variance

In Subsection 2.3.1, we found that the principle of least squares can provide estimates of the regression parameters in a simple linear regression model. But in order to fit the model we also need an estimate of the common variance σ². Such an estimate is required for making statistical inferences about the true straight-line relationship between x and Y. Since σ² is the common variance of the errors εi, i = 1, ..., n, it would be natural to estimate it by the sample variance of the fitted residuals (2.6). That is, an estimate would be

RSS/(n - 1) = Σ (yi - β̂0 - β̂1 xi)²/(n - 1)

where RSS = RSS(β̂0, β̂1). However, it can be shown that this is a biased estimate of σ², that is, the corresponding estimator does not have the 'correct' mean value: E[RSS/(n - 1)] ≠ σ². An unbiased estimate of the common variance σ² is given by

s² = RSS(β̂0, β̂1)/(n - 2) = (syy - sxy²/sxx)/(n - 2)        (2.9)
The denominator in (2.9) is the residual degrees of freedom (df), that is
df = number of observations - number of estimated parameters.

In particular, for simple linear regression models, we have n observations and we have estimated
the two regression parameters β0 and β1, so the residual df is n-2.

The relevant summary statistics for the data on farm productivity and fertilizer application are:

n = 15,   sxx = 234.93,   syy = 3256.57,   sxy = 717.30

An unbiased estimate of the common variance σ² is therefore

s² = (syy - sxy²/sxx)/(n - 2) = (3256.57 - 717.30²/234.93)/13 ≈ 82.0
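The same quantity can be checked numerically; the sketch below (not part of the module) evaluates (2.9) from the quoted corrected sums:

    sxx, syy, sxy = 234.93, 3256.57, 717.30   # summary statistics quoted above
    n = 15

    rss = syy - sxy ** 2 / sxx                # residual sum of squares
    s2 = rss / (n - 2)                        # divide by the residual degrees of freedom
    print(round(rss, 1), round(s2, 1))        # roughly 1066.5 and 82.0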

2.4. Inference in simple linear regression

In Section 2.3 we produced an estimate of the straight line that describes the variation in the data best. However, since the estimated line is based on the particular sample of data (xi, yi), i = 1, ..., n, we have observed, we would almost certainly get a different line if we took a new sample of data and estimated the line on the basis of the new sample. For example, if we measured farm productivity and fertilizer application for a sample of farmers in Haramaya district different from the one in Example 2.1, we would invariably get different measurements, and therefore a different least squares line. In other words: the least squares line is an observation of a random line which varies from one experiment to the next. Likewise, the least squares estimates β̂0 and β̂1 of the intercept and slope, respectively, of the least squares line are both observations of random variables. These random variables are called the least squares estimators. An estimate is non-random and is an observation of an estimator, which is a random variable. The least squares estimators are given by:

β̂1 = Sxy/sxx = Σ (xi - x̄)(Yi - Ȳ) / Σ (xi - x̄)²        (2.10)

β̂0 = Ȳ - β̂1 x̄        (2.11)

where Ȳ = Σ Yi / n, and with all summations running from i = 1 to n. By a similar argument we find that an unbiased estimator of the common variance σ² is given by

S² = (Syy - Sxy²/sxx)/(n - 2) = Σ (Yi - Ŷi)²/(n - 2)        (2.12)

where Ŷi = β̂0 + β̂1 xi, with β̂0 and β̂1 being the least squares estimators. Note that the randomness in the estimators is due to the dependent variables only, since the explanatory variables are non-random. In particular, it can be seen from (2.10) and (2.11) that β̂0 and β̂1 are linear combinations of the dependent variables.

It can be shown that the least squares estimators are unbiased, that is, they have the 'correct' mean values:

E(β̂0) = β0   and   E(β̂1) = β1        (2.13)

Also, the estimator S² is an unbiased estimator of the common variance σ², that is,

E(S²) = σ²        (2.14)

The variances of the estimators β̂0 and β̂1 can be found from standard results on variances (we shall not derive them here). The variances are given by

Var(β̂0) = σ²(1/n + x̄²/sxx)        (2.15)

Var(β̂1) = σ²/sxx        (2.16)

Note that both variances decrease when the sample size n increases. The variances also decrease if sxx = Σ (xi - x̄)² is increased, that is, if the x-values are widely dispersed. In some studies, it is possible to design the experiment so that the value of sxx is large, and hence the variances of the estimators are small. It is desirable to have small variances, as this improves the precision of results drawn from the analysis.
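The variance formulas can be illustrated with a small Monte Carlo experiment: hold the x-values fixed, repeatedly redraw the errors, and compare the empirical variance of β̂1 with σ²/sxx. The sketch below is illustrative only; the parameter values are hypothetical and NumPy is assumed.

    import numpy as np

    rng = np.random.default_rng(seed=2)

    beta0, beta1, sigma = -3.3, 3.3, 9.0      # illustrative 'true' values only
    x = np.array([10, 8, 10, 15, 12, 16, 9, 10, 13, 16, 20, 22, 14, 11, 10], float)
    sxx = np.sum((x - x.mean()) ** 2)

    b1_draws = []
    for _ in range(20000):                    # 20,000 simulated samples
        y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
        b1_draws.append(np.sum((x - x.mean()) * (y - y.mean())) / sxx)

    print(round(np.var(b1_draws), 4), round(sigma ** 2 / sxx, 4))  # the two should be close

Widening the spread of the x-values (larger sxx) makes both numbers smaller, in line with (2.16).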

In order to make inferences about the model, such as testing hypotheses and producing confidence intervals for the regression parameters, we need to make an assumption about the distribution of the random variables Yi. The most common assumption, and the one we shall make here, is that the dependent variables Yi are normally distributed.

Topic 4 is concerned with various methods for checking the assumptions of regression models. In this section, we shall simply assume the following about the dependent variables: the Yi are independent, normally distributed random variables with equal variances and mean values depending linearly on xi.

2.4.1. Inference on the regression parameters

To test hypotheses and construct confidence intervals for the regression parameters β0 and β1, we need the distributions of the parameter estimators β̂0 and β̂1. Recall from (2.10) and (2.11) that the least squares estimators β̂0 and β̂1 are linear combinations of the dependent variables Yi. Standard theory on the normal distribution says that a linear combination of independent normal random variables is normally distributed. Thus, since the Yi are independent normal random variables, the estimators β̂0 and β̂1 are both normally distributed. In (2.13)-(2.16), we found the mean values and variances of the estimators. Putting everything together, we get that

β̂0 ~ N(β0, σ²(1/n + x̄²/sxx))

β̂1 ~ N(β1, σ²/sxx)

It can be shown that the distribution of the estimator S² of the common variance σ² is given by

S² ~ σ² χ²(n-2) / (n - 2)

where χ²(n-2) denotes a chi-square distribution with n - 2 degrees of freedom. Moreover, it can be shown that the estimator S² is independent of the estimators β̂0 and β̂1. (But the estimators β̂0 and β̂1 are not mutually independent.)

We can use these distributional results to test hypotheses on the regression parameters. Since both β̂0 and β̂1 have normal distributions with variances depending on the unknown quantity σ², we can apply standard results for normal random variables with unknown variances. Thus, in order to test βi equal to some value βi*, i = 0, 1, that is, to test hypotheses of the form H0: βi = βi*, we can use the t-test statistic

tβ̂i(y) = (β̂i - βi*) / se(β̂i),   i = 0, 1,        (2.17)

where se(β̂i) denotes the estimated standard error of the estimator β̂i, that is, the square root of the estimated variance:

se(β̂0) = √[s²(1/n + x̄²/sxx)]

and

se(β̂1) = √(s²/sxx)

It can be shown that both test statistics tβ̂0(y) and tβ̂1(y) have t-distributions with n - 2 degrees of freedom.

The test statistics in (2.17) can be used for testing the parameter βi (i = 0, 1) equal to any value βi*. However, for the slope parameter β1, one value is particularly important: if the hypothesis that β1 equals zero cannot be rejected, the simple linear regression model simplifies to

Yi = β0 + εi,   i = 1, ..., n.

That is, the value of yi does not depend on the value of xi. In other words: the dependent variable and the independent variable are unrelated!

It is common, for instance in computer output, to present the estimates and standard errors of the least squares estimators in a table like the following.

Parameter   Estimate   Standard error   t-statistic   p-value
β0          β̂0         se(β̂0)
β1          β̂1         se(β̂1)

The column 't-statistic' contains the t-test statistic (2.17) for testing the hypotheses H0: β0 = 0 and H0: β1 = 0, respectively. If you wish to test a parameter equal to a different value, it is easy to produce the appropriate test statistic (2.17) from the table. The column 'p-value' contains the p-value corresponding to the t-test statistic in the same row.

For the data on farm productivity and fertilizer application, the table is given by

Parameter   Estimate   Standard error   t-statistic   p-value
β0          -3.278     4.581            -0.68         0.511
β1           3.271     0.355             9.21         0.000

The slope parameter is highly significant: the hypothesis β1 = 0 is firmly rejected, so fertilizer use does help to explain farm productivity. The intercept, on the other hand, is not significantly different from zero (p = 0.511).

If, for some reason, we wished to test whether the slope parameter was equal to 1.58, say, the test statistic would be

tβ̂1(y) = (β̂1 - 1.58)/se(β̂1) = (3.271 - 1.58)/0.355 = 4.763

Since n = 15 in this example, the test statistic has a t(13)-distribution under the null hypothesis. The critical value for a two-sided test at the 5% significance level is t0.975(13) = 2.16; since 4.763 > 2.16, we reject the hypothesis that the slope parameter is 1.58.
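A short sketch (not part of the module) reproducing this test with SciPy, using the quoted estimate and standard error:

    from scipy import stats

    b1_hat, se_b1, n = 3.271, 0.355, 15       # values from the table above

    t_stat = (b1_hat - 1.58) / se_b1          # test statistic (2.17) for H0: beta1 = 1.58
    t_crit = stats.t.ppf(0.975, df=n - 2)     # 0.975-quantile of the t(13)-distribution
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)   # two-sided p-value

    print(round(t_stat, 3), round(t_crit, 2), round(p_value, 4))
    # roughly 4.763, 2.16 and 0.0004: reject H0 at the 5% level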

A second practical use of the table is to provide confidence intervals for the regression parameters. The 1 - α confidence intervals for β0 and β1 are given by, respectively,

β̂0 ± t1-α/2(n - 2) se(β̂0)

and

β̂1 ± t1-α/2(n - 2) se(β̂1)

In order to construct the confidence intervals, all that is needed is the table and t1-α/2(n - 2): the (1 - α/2)-quantile of a t(n - 2)-distribution.

For the data on farm productivity and fertilizer application, the 95% confidence intervals for the regression parameters can be obtained from the table for these data and the 0.975-quantile of a t(13)-distribution: t0.975(13) = 2.16. The confidence intervals for β0 and β1 are, respectively,

β0: (-13.17, 6.62)

and

β1: (2.50, 4.04)

Learning activity 2.1. Consider data on two continuous variables which you are familiar with
and test whether the explanatory variable significantly affects the dependent variable.

2.5. Summary of Topic 2

In this topic, the simple linear regression model has been discussed. We have described a
method, based on the principle of least squares, for fitting simple linear regression models to
data. The principle of least squares says to estimate the regression line by the line which
minimizes the sum of the squared deviations of the observed data away from the line. The
intercept and slope of the fitted line are estimates of the regression parameters β0 and β1,
respectively. Further, an unbiased estimate of the common variance has been given. Under the
assumption of normality of the dependent variable, we have tested hypotheses and constructed
confidence intervals for the regression parameters.

