Chapter 9: Linear Regression and Correlation

This document provides examples and exercises related to linear regression and correlation analysis. It includes: 1) Examples of calculating a linear regression equation and using it to estimate values. 2) Exercises involving interpreting correlation coefficients, testing if relationships are significant, and using multiple linear regression to develop prediction models. 3) An example of multiple linear regression being used to predict exam grades based on study time, number of books used, IQ, and age. Metrics like coefficients, standard errors, and significance tests are interpreted.

Uploaded by

Wong Veronica
Copyright
© All Rights Reserved

EXERCISE: LINEAR REGRESSION AND CORRELATION

1. The grades of a class of 9 students on a midterm report (x) and on the final
examination (y) are as follows:

x 77 50 71 72 81 94 96 99 67
y 82 66 78 34 47 85 99 99 68

(a) Find the equation of the regression line.

(b) Estimate the final examination grade of a student who received a grade of 85
on the midterm report but was ill at the time of the final examination.
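
One possible way to set up parts (a) and (b), assuming the usual least-squares line $\hat{y} = a + bx$. The symbols $a$, $b$, $\bar{x}$ and $\bar{y}$ are notation introduced here for illustration, not taken from the exercise:

```latex
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
\hat{y}(85) = a + 85\,b .
```

Here $n = 9$, and the estimate in part (b) is the fitted value at $x = 85$.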

2. (a) From the following information draw a scatter diagram and by the method
of least squares draw the regression line of best fit.

Volume of sales (thousand units), x     5   6   7   8   9  10
Total expenses (thousand $), y         74  77  82  86  92  95

(b) What will be the total expenses when the volume of sales is 7,500 units?

(c) If the selling price per unit is $11, at what volume of sales will the total
income from sales equal the total expenses?
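
The three parts can be checked numerically. The following is a minimal sketch in plain Python, assuming a straight-line model y = a + b*x with x in thousand units and y in thousand dollars; the helper name `fit_line` is hypothetical.

```python
# Data from Exercise 2 (x in thousand units, y in thousand $).
sales = [5, 6, 7, 8, 9, 10]
expenses = [74, 77, 82, 86, 92, 95]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


a, b = fit_line(sales, expenses)

# (b) 7,500 units corresponds to x = 7.5 in the units of the table.
expenses_at_7500 = a + b * 7.5

# (c) Income is 11*x (thousand $); it equals a + b*x when x = a / (11 - b).
unit_price = 11
break_even = a / (unit_price - b)

print(f"y = {a:.3f} + {b:.3f} x")
print(f"expenses at 7.5 thousand units: {expenses_at_7500:.2f} thousand $")
print(f"break-even volume: {break_even:.3f} thousand units")
```

The break-even step relies on the fitted line holding at that volume, which is reasonable here because the result lies inside the observed range of x.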

3. The following data show the unit cost of producing certain electronic components
and the number of units produced:

Lot size, x      50    100    250    500    1000
Unit cost, y   $108    $53    $24     $9      $5

It is believed that the regression equation is of the form

$y = a x^{b}$.

By simple linear regression technique or otherwise estimate the unit cost for a lot of
400 components.
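
One way to apply the simple linear regression technique here is to take logarithms, since a power law y = a*x^b becomes the straight line ln y = ln a + b ln x. The sketch below assumes that form and uses natural logarithms; all variable names are illustrative.

```python
import math

# Data from Exercise 3.
lot_size = [50, 100, 250, 500, 1000]
unit_cost = [108, 53, 24, 9, 5]

# Linearise: ln y = ln a + b * ln x.
log_x = [math.log(x) for x in lot_size]
log_y = [math.log(y) for y in unit_cost]

n = len(log_x)
mean_lx = sum(log_x) / n
mean_ly = sum(log_y) / n

# Ordinary least squares on the transformed data.
b = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y)) / sum(
    (u - mean_lx) ** 2 for u in log_x
)
log_a = mean_ly - b * mean_lx
a = math.exp(log_a)

# Back-transform to estimate the unit cost for a lot of 400 components.
estimate_400 = a * 400 ** b

print(f"a = {a:.1f}, b = {b:.3f}")
print(f"estimated unit cost at lot size 400: ${estimate_400:.2f}")
```

Fitting on the log scale minimises squared errors in ln y rather than in y, so the result can differ slightly from a direct nonlinear fit.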

4. Two variables x and y are related by the law:

$y = \alpha x + \beta x^{2}$.

State how $\alpha$ and $\beta$ can be estimated by the simple linear regression technique.
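
A possible starting point, assuming the law has the form $y = \alpha x + \beta x^{2}$ with the two unknown coefficients labelled $\alpha$ and $\beta$ (labels assumed here), is to divide through by $x$ for $x \neq 0$:

```latex
\frac{y}{x} = \alpha + \beta x
\quad\Longrightarrow\quad
Y = \alpha + \beta X,
\qquad Y = \frac{y}{x},\; X = x .
```

Regressing $Y = y/x$ on $X = x$ then gives the intercept as an estimate of $\alpha$ and the slope as an estimate of $\beta$.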

5. Compute and interpret the correlation coefficient for the following grades of 6
students selected at random.

Mathematics grade 70 92 80 74 65 83
English grade 74 84 63 87 78 90
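
The computation can be sketched as follows, assuming the product-moment (Pearson) coefficient is intended; the function name `pearson_r` is hypothetical.

```python
import math

# Data from Exercise 5.
maths = [70, 92, 80, 74, 65, 83]
english = [74, 84, 63, 87, 78, 90]


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(maths, english)
print(f"r = {r:.3f}")
```

A value near 0.24 indicates only a weak positive linear association between the two sets of grades in this sample of six students.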

6. The following table shows a traffic-flow index and the related site costs in respect
of eight service stations of ABC Garages Ltd.

Site No.   Traffic-flow index   Site cost (in 1000)
1          100                  100
2          110                  115
3          119                  120
4          123                  140
5          123                  135
6          127                  175
7          130                  210
8          132                  200

(a) Calculate the coefficient of correlation for these data.

(b) Calculate the coefficient of rank correlation.
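
For part (b), a minimal sketch assuming Spearman's coefficient with tied values given the average of their ranks. The helper `average_ranks` is hypothetical, and the d-squared formula used is only approximate when ties are present (sites 4 and 5 share a traffic-flow index of 123).

```python
# Data from Exercise 6.
traffic = [100, 110, 119, 123, 123, 127, 130, 132]
cost = [100, 115, 120, 140, 135, 175, 210, 200]


def average_ranks(values):
    """Rank values from 1 (smallest); tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


rank_traffic = average_ranks(traffic)
rank_cost = average_ranks(cost)

n = len(traffic)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_traffic, rank_cost))
rank_corr = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of squared rank differences = {sum_d2}")
print(f"rank correlation = {rank_corr:.4f}")
```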

7. As a result of standardized interviews, an assessment was made of the IQ and the
attitude to the employing company of a group of six workers. The IQs were
expressed as whole numbers within the range 50-150 and the attitudes were assigned
to five grades labeled 1, 2, 3, 4 and 5 in order of decreasing approval. The results
obtained are summarized below:

Employee A B C D E F
IQ 127 85 94 138 104 70
Attitude score 2 4 3 1 2 5

Is there evidence of an association between the two attributes?

8. For the following multiple regression equation:

$\hat{y} = 50 - 2x_1 + 7x_2$ with $R^{2} = 0.40$

(a) Interpret the meaning of the slopes.
(b) Interpret the meaning of the Y intercept.
(c) Interpret the meaning of the coefficient of multiple determination $R^{2}$.

9. The following ANOVA summary table was obtained from a multiple regression
model with two independent variables.

Source       Degrees of freedom   Sum of squares   Mean squares   F
Regression   2                    30
Error        10                   120
Total        12                   150

(a) Determine the mean square that is due to regression and the mean square
that is due to error.
(b) Determine the computed F statistic.
(c) Determine whether there is a significant relationship between Y and the
two explanatory variables at the 0.05 level of significance.
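
The arithmetic behind parts (a) to (c) can be sketched as below. The critical value 4.10 is the upper 5% point of the F distribution with (2, 10) degrees of freedom as read from a standard F table; it is an input assumed here, not something computed by the code.

```python
# Sums of squares and degrees of freedom from the ANOVA table in Exercise 9.
ss_regression, df_regression = 30, 2
ss_error, df_error = 120, 10

# (a) Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

# (b) The F statistic is the ratio of the two mean squares.
f_stat = ms_regression / ms_error

# (c) Compare with the tabulated critical value (assumed: F table, 5% level).
F_CRITICAL = 4.10
significant = f_stat > F_CRITICAL

print(f"MSR = {ms_regression}, MSE = {ms_error}, F = {f_stat}")
print("significant at 0.05" if significant else "not significant at 0.05")
```

With these figures F is well below the tabulated value, so the sketch reports no significant relationship at the 0.05 level.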

10. Given the following information from a multiple regression analysis:

$n = 25,\ b_1 = 5,\ b_2 = 10,\ S_{b_1} = 2,\ S_{b_2} = 8$, where $S_{b_i}$ = standard error of $b_i$
(a) Which variable has the largest slope in units of a t statistic?
(b) At the 0.05 level of significance, determine whether each explanatory
variable makes a significant contribution to the regression model. On the
basis of these results, indicate the independent variables that should be
included in this model.
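
A minimal sketch of the t comparison, assuming two explanatory variables (so the error degrees of freedom are n - 2 - 1 = 22) and a two-tailed test. The critical value 2.074 is taken from a standard t table and the labels `x1` and `x2` are illustrative.

```python
n, k = 25, 2  # sample size and number of explanatory variables

# (slope, standard error) for each explanatory variable.
slopes = {"x1": (5, 2), "x2": (10, 8)}

df_error = n - k - 1  # 22 degrees of freedom

# Two-tailed 5% critical value of t with 22 degrees of freedom (from tables).
T_CRITICAL = 2.074

# (a) Each slope expressed in units of its own standard error.
t_stats = {name: b / se for name, (b, se) in slopes.items()}

# (b) Variables whose |t| exceeds the critical value.
keep = [name for name, t in t_stats.items() if abs(t) > T_CRITICAL]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
print("variables to keep:", keep)
```

Although the second slope is numerically larger, its standard error is also much larger, which is why the ranking changes once the slopes are put on the t scale.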

11. Amy, who is trying to purchase a used Toyota car, has researched the prices. She believes
the year of the car and the number of miles the car has been driven both influence
the purchase price. Data are given below for 10 cars with the price (Y) in thousands
of dollars, year (X1), and miles driven (X2) in thousands.

Price (Y)        Year (X1)   Miles (X2)
($ thousands)                (thousands)
2.99             1987        55.6
6.02             1992        18.4
8.87             1993        21.3
3.92             1988        46.9
9.55             1994        11.8
9.05             1991        36.4
9.37             1992        28.2
4.2              1988        44.2
4.8              1989        34.9
5.74             1991        26.4

(a) Using SPSS, fit the least-squares equation that best relates these three
variables.
(b) Amy would like to purchase a 1991 Toyota with about 40,000 miles on it.
How much do you predict she will pay?
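
The exercise asks for SPSS; as a cross-check, the same least-squares fit can be sketched in plain Python by solving the normal equations. This is a swapped-in technique, not the SPSS procedure. The helper `solve` and the constant `BASE_YEAR` (used only to keep the year values small) are hypothetical names introduced here.

```python
# Data from Exercise 11.
price = [2.99, 6.02, 8.87, 3.92, 9.55, 9.05, 9.37, 4.2, 4.8, 5.74]
year = [1987, 1992, 1993, 1988, 1994, 1991, 1992, 1988, 1989, 1991]
miles = [55.6, 18.4, 21.3, 46.9, 11.8, 36.4, 28.2, 44.2, 34.9, 26.4]

BASE_YEAR = 1990  # assumed centring constant; only shifts the intercept


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Design matrix rows: [1, year - BASE_YEAR, miles].
rows = [[1.0, yr - BASE_YEAR, mi] for yr, mi in zip(year, miles)]

# Normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(rows, price)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)


def predict_price(yr, mi):
    """Predicted price in $ thousands for a given year and mileage (thousands)."""
    return b0 + b1 * (yr - BASE_YEAR) + b2 * mi


predicted = predict_price(1991, 40.0)
print(f"price = {b0:.3f} + {b1:.3f}*(year - {BASE_YEAR}) + {b2:.4f}*miles")
print(f"predicted price for a 1991 car with 40,000 miles: {predicted:.2f} thousand $")
```

Because the year is centred, the intercept printed here differs from the one a package would report for the uncentred year, while the two slopes and all predictions are unchanged.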

12. Steven Reich, a statistics professor in a leading business school, has a keen interest
in factors affecting students' performance on exams. The midterm exam for the past
semester had a wide distribution of grades, but Steven feels certain that several
factors explain the distribution: He allowed his students to study from as many
different books as they liked, their IQs vary, they are of different ages, and they study
varying amounts of time for exams. To develop a predicting formula for exam
grades, Steven asked each student to answer, at the end of the exam, questions
regarding study time and number of books used. Steven's teaching records already
contained the IQs and ages for the students, so he compiled the data for the class and
ran a multiple regression with a computer package. The output from Steven's
computer run was as follows:

Predictor   Coef       Stdev     t-ratio   p
Constant    -49.948    41.55     -1.20     0.268
HOURS       1.06931    0.98163   1.09      0.312
IQ          1.36460    0.37627   3.63      0.008
BOOKS       2.03982    1.50799   1.35      0.218
AGE         -1.79890   0.67332   -2.67     0.319

s = 11.657   R-sq = 76.7%

(a) What is the least squares regression equation for these data?
(b) What percentage of the variation in grades is explained by this equation?
(c) What grade would you expect for a 21-year-old student with an IQ of 113,
who studied 5 hours and used three different books?
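
Part (c) amounts to substituting the student's values into the fitted equation. A minimal sketch, using the coefficients from the printed output; the dictionary layout is illustrative.

```python
# Coefficients from the computer output in Exercise 12.
coef = {
    "Constant": -49.948,
    "HOURS": 1.06931,
    "IQ": 1.36460,
    "BOOKS": 2.03982,
    "AGE": -1.79890,
}

# Values for the student described in part (c).
student = {"HOURS": 5, "IQ": 113, "BOOKS": 3, "AGE": 21}

# Predicted grade = constant + sum of coefficient * value.
grade = coef["Constant"] + sum(coef[name] * value for name, value in student.items())

print(f"predicted grade: {grade:.2f}")
```

With these inputs the predicted grade comes out at roughly 77.9.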

13. Refer to Q12. The following additional output was provided by the computer when
Steven ran the multiple regression:

Analysis of Variance

Source       DF   SS        MS       F   p
Regression   4    3134.42   783.60
Error        7    951.25    135.89
Total        11   4085.67

(a) What is the observed value of F?


(b) At a significance level of 0.05, what is the appropriate critical value of F
to use in determining whether the regression as a whole is significant?
(c) Based on your answers to part (a) and part (b), is the regression significant
as a whole?
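
The quantities in this exercise can be checked against the output of Q12. The sketch below assumes the critical value 4.12, the upper 5% point of F with (4, 7) degrees of freedom from a standard table, and also confirms that SS(regression) / SS(total) reproduces the R-sq figure reported earlier.

```python
# Values from the analysis-of-variance output in Exercise 13.
ss_regression, ss_total = 3134.42, 4085.67
ms_regression, ms_error = 783.60, 135.89

# (a) Observed F is the ratio of the mean squares.
f_observed = ms_regression / ms_error

# (b) Tabulated 5% critical value for (4, 7) degrees of freedom (assumed input).
F_CRITICAL = 4.12

# (c) The regression is significant as a whole if F exceeds the critical value.
significant = f_observed > F_CRITICAL

# Consistency check with R-sq = 76.7% from Exercise 12.
r_squared = ss_regression / ss_total

print(f"F = {f_observed:.2f}, critical value = {F_CRITICAL}")
print(f"R-sq = {100 * r_squared:.1f}%")
print("significant as a whole" if significant else "not significant as a whole")
```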

14. A new Canada-based commuter airline has taken a survey of its 15 terminals and
has obtained the following data for the month of February, where

SALES = total revenue based on number of tickets sold (in thousands of dollars)
PROMOT = amount spent on promoting the airline in the area (in thousands of
dollars)
COMP = number of competing airlines at that terminal
FREE = the percentage of passengers who flew free (for various reasons)

Sales($)   Promot($)   Comp   Free
79.3       2.5         10     3
200.1      5.5         8      6
163.2      6.0         12     9
200.1      7.9         7      16
146.0      5.2         8      15
177.7      7.6         12     9
30.9       2.0         12     8
291.9      9.0         5      10
160.0      4.0         8      4
339.4      9.6         5      16
159.6      5.5         11     7
86.3       3.0         12     6
237.5      6.0         6      10
107.2      5.5         10     4
155.0      3.5         10     4

Predictor   Coef      Stdev   t-ratio   p
Constant    172.34    51.38   3.35      0.006
PROMOT      25.950    4.877   5.32      0.000
COMP        -13.238   3.686   -3.59     0.004
FREE        -3.041    2.342   -1.30     0.221

(a) Use the above computer output to determine the least-squares regression
equation for the airline to predict sales.
(b) Does the percentage of passengers who fly free cause sales to decrease
significantly? State and test the appropriate hypotheses. Use $\alpha$ = 0.05.
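
A sketch of both parts, assuming a one-tailed test of H0: the FREE coefficient is zero or positive against H1: it is negative, with 15 - 3 - 1 = 11 error degrees of freedom. The critical value 1.796 is the upper 5% point of t(11) from a standard table, and the function name `predicted_sales` is hypothetical.

```python
# Coefficients from the computer output in Exercise 14.
b_const, b_promot, b_comp, b_free = 172.34, 25.950, -13.238, -3.041
se_free = 2.342


def predicted_sales(promot, comp, free):
    """(a) Fitted sales (thousand $) from the least-squares equation."""
    return b_const + b_promot * promot + b_comp * comp + b_free * free


# (b) One-tailed t test on the FREE coefficient.
n_terminals, n_predictors = 15, 3
df_error = n_terminals - n_predictors - 1

t_free = b_free / se_free

# Upper 5% point of t with 11 degrees of freedom (from tables).
T_CRITICAL_ONE_TAILED = 1.796

# Reject H0 only if t falls below the negative critical value.
reject_h0 = t_free < -T_CRITICAL_ONE_TAILED

print(f"t = {t_free:.2f} with {df_error} degrees of freedom")
print("reject H0" if reject_h0 else "do not reject H0")
```

The printed p value of 0.221 is two-tailed; for the one-tailed alternative it would be halved, which still exceeds 0.05 and agrees with the comparison above.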

15. Alex Yeung, manager of Star Shine's Diamond and Jewellery Store, is interested in
developing a model to estimate consumer demand for his rather expensive
merchandise. Because most customers buy diamonds and jewellery on credit, Alex is
sure that two factors that must influence consumer demand are the current annual
inflation rate and the current lending rate at the leading banks in the UK. Explain some
of the problems that Alex might encounter if he were to set up a regression model
based on his two predictor variables.
