Notes 3 - Linear Regression

This document provides notes on linear regression. It defines key terms like explanatory variable, response variable, and scatterplot. It explains how to calculate the least squares regression line (LSRL) by hand, including finding the slope, y-intercept, and equation. It discusses interpreting the slope and using the line to make predictions. It also covers residuals, the coefficient of determination, and using residual plots to assess the appropriateness of the linear model.

Uploaded by

kjogu giyvg

Unit 2 – Exploring Two Variable Data

Notes 3 – Linear Regression

Recall
• If we think that a variable x may explain or even cause changes in another variable y, we call x an
__________________ and y a ___________________.
• _________________ displays the relationship between two quantitative variables measured on the same
individuals.
• In examining a scatterplot, look for an overall pattern showing the ________, ___________, and
_____________ of the relationship. Then look for ____________ or other deviations from the pattern.

Example

_____ Which of the following statements about correlation r is true?

(A) A correlation of 0.2 means that 20% of the points are highly correlated.
(B) Perfect correlation, that is, when the points lie exactly on a straight line, results in r = 0.
(C) Correlation is not affected by which variable is called x and which is called y.
(D) Correlation is not affected by extreme values.
(E) A correlation of 0.75 indicates a relationship that is 3 times as linear as one for which the correlation is only
0.25

Least Squares Regression Line

Least Squares Regression or linear regression allows you to fit a line to a scatterplot in order to be able to
better interpret the relationship between two variables, as well as make predictions about our response variable.

The fitted line is called the line of best fit, linear regression line, or least squares regression line (LSRL), and
has an equation in a form that should look very familiar:

____ : predicted y-value

____ : explanatory variable

____ : slope

____ : y-intercept

There are other ways the LSRL is written:

_______________________ - A form on the calculator (in addition to the form we see above)

The way the line is fitted to the data is through a process called the method of least squares. The main idea
behind this method is that the sum of the squared vertical distances between the data points and the line is
minimized.
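
To make the least squares idea concrete, here is a minimal Python sketch. The toy data set and the helper name `sum_squared_errors` are made up for illustration and do not come from these notes. The sketch compares the sum of squared vertical distances for the least squares line with the same sum for a few nearby lines.

```python
# Hypothetical toy data (not from the notes), chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sum_squared_errors(a, b):
    """Sum of squared vertical distances from each point to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Least squares slope and intercept from the standard formulas.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

best = sum_squared_errors(a, b)

# Nudge the intercept and slope a little in every direction and recompute.
nearby = [sum_squared_errors(a + da, b + db)
          for da in (-0.5, 0, 0.5)
          for db in (-0.2, 0, 0.2)
          if (da, db) != (0, 0)]

print(f"intercept = {a:.2f}, slope = {b:.2f}, sum of squares = {best:.2f}")
print("every nearby line is worse:", all(other > best for other in nearby))
```

Every nudged line gives a larger total, which is what "least squares" means: no other line has a smaller sum of squared vertical distances.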

Slope: The slope of the regression line gives us the rate of change of y with respect to x. In other words, it
gives us the predicted change in y when x increases by 1.

Intercept: The intercept is statistically meaningful only when x can actually take values close to zero. When it
does make sense to have an x-value of zero, the y-intercept is the predicted y-value when x = 0.

When we have a data set (x,y), we can calculate the LSRL by hand or with technology. In our Vitruvian Man
activity, we were already introduced to both. Let’s go through the process again, but this time, we will use a
different example.

Example
Many schools require teachers to have evaluations done by students. A study investigated the extent to which
student evaluations are related to grades. Teacher evaluations and grades are both given on a scale of 100. The
results for 10 of Mrs. H's students are given below: each student's class average (x) and the evaluation that
student gave Mrs. H (y).

x   40  60  70  73  75  68  65  85  98  90
y   10  50  60  65  75  73  78  80  90  95

(a) Create the LSRL by hand. Show all work.

Step 1: Enter the data into your calculator. x values go into L1 and y values into L2.

Step 2: Find the slope of your LSRL.

r = _______

Slope = $r \cdot \dfrac{S_y}{S_x}$

$S_y$ = _______

$S_x$ = _______

*Finding r: We could go through the process of lists to find r, like we did in the activity, but we can just use
our calculator to get r. You will NOT have to find r by hand on the AP exam. You will have to show
work for calculating the slope by hand.
*Also, when you find r on your calculator, the calculator displays the LSRL as well. Still show your
work on your paper, but feel free to check that your answers match!
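
As a way to check the hand calculation, the slope formula can be sketched in Python. This is only an illustration: the lists `x` and `y` assume the table reads x = 40, 60, 70, 73, 75, 68, 65, 85, 98, 90 and y = 10, 50, 60, 65, 75, 73, 78, 80, 90, 95 (a reading that reproduces the r² value of 0.818 quoted later in these notes), and the variable names are my own.

```python
from statistics import mean, stdev

# Values as read from the data table in the example (an assumption about the layout):
# x = student class average, y = evaluation given to Mrs. H.
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)

# Correlation r: sum of the products of z-scores, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

# Slope of the LSRL from the formula in Step 2.
slope = r * s_y / s_x

print(f"r = {r:.4f}")
print(f"S_x = {s_x:.3f}, S_y = {s_y:.3f}")
print(f"slope = {slope:.4f}")
```

The sample standard deviation (dividing by n - 1) is used here because that is the Sx and Sy a TI calculator reports in 1-Var Stats.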

Step 3: Find the y-intercept of your LSRL.

y-intercept = $\bar{y} - \text{slope} \cdot \bar{x}$

slope = ______
$\bar{x}$ = ______
$\bar{y}$ = ______

*Fun fact about the LSRL: it will ALWAYS pass through the point $(\bar{x}, \bar{y})$. We use this property to find
the y-intercept.
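
The intercept step, and the fact that the line passes through the point of means, can be checked with a short Python sketch. It assumes the same reading of the data table as above (x = class average, y = evaluation); the function name `predict` is hypothetical.

```python
from statistics import mean

# Values as read from the data table in the example (an assumption about the layout).
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

x_bar, y_bar = mean(x), mean(y)

# Least squares slope (algebraically the same value as r * S_y / S_x).
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))

# Step 3: y-intercept = y_bar - slope * x_bar.
intercept = y_bar - slope * x_bar

def predict(x_value):
    """Predicted y for a given x on the least squares line."""
    return intercept + slope * x_value

print(f"x_bar = {x_bar:.1f}, y_bar = {y_bar:.1f}")
print(f"y-hat = {intercept:.3f} + {slope:.4f} x")

# Plugging x_bar into the line returns y_bar (up to floating-point rounding).
print("passes through (x_bar, y_bar):", abs(predict(x_bar) - y_bar) < 1e-9)
```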

Step 4: Create your equation: ______________________________

(b) Use your equation to predict what evaluation Mrs. H will get from a student who scored an 81.

(c) Interpret the slope in the context of the problem.

(d) Do you think student grades and the evaluations students give their teachers are related? Explain.

LSRL on Calculator

At this point, I’m sure you know the trick to finding the LSRL on your calculator. There are two ways to get the
line and it depends on how you like to write the line.

STAT → CALC → 4: LinReg(ax + b) OR STAT → CALC → 8: LinReg(a+bx)
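
For readers without a TI calculator, the same quantities can be computed with a small Python function. This is a sketch, not the calculator's actual algorithm; `lin_reg` is a hypothetical helper, and the data lists assume the reading of the table used in the example. The two print lines show how the same line is labeled under each calculator form: in LinReg(ax + b) the letter a is the slope, while in LinReg(a+bx) the letter a is the intercept.

```python
from math import sqrt

def lin_reg(xs, ys):
    """Return (intercept, slope, r, r_squared) for the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    syy = sum((yi - y_bar) ** 2 for yi in ys)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r, r * r

# Values as read from the data table in the example (an assumption about the layout).
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

intercept, slope, r, r_squared = lin_reg(x, y)

# Same line, two labelings.
print(f"LinReg(ax+b): a = {slope:.4f}, b = {intercept:.3f}")
print(f"LinReg(a+bx): a = {intercept:.3f}, b = {slope:.4f}")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```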

Regression Line Cautions

Be careful of making predictions beyond what the data shows.

__________________ is the use of a regression line for prediction far outside the interval of values of the
explanatory variable x used to obtain the line. Such predictions are often not accurate.
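
The caution above can be sketched in Python using the teacher evaluation data. The function name `predict_with_check` is hypothetical, the data lists assume the reading of the table used in the example, and the check flags any x outside the observed range, which is a stricter rule than "far outside".

```python
# Values as read from the data table in the example (an assumption about the layout).
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

def predict_with_check(x_value):
    """Return (predicted y, True if x_value lies outside the observed x range)."""
    outside = x_value < min(x) or x_value > max(x)
    return intercept + slope * x_value, outside

for x_value in (81, 5):
    y_hat, outside = predict_with_check(x_value)
    note = "outside the data, treat with caution" if outside else "within the data range"
    print(f"x = {x_value}: predicted y = {y_hat:.1f} ({note})")
```

With this reading of the data, a class average of 5 produces a negative predicted evaluation, which cannot happen on a 100-point scale. The line was never fitted to students with averages that low, so it has nothing reliable to say about them.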
CORRELATION DOES NOT IMPLY CAUSATION. For example, one study showed a strong, positive linear
relationship between ice cream sales and homicides in New York City. Does this mean that if we stop selling
ice cream, we will have no more homicides?

Coefficient of Determination
The strength of a prediction which uses the LSRL depends on how close the data points are to the regression
line. The mathematical approach to describing this strength is via the coefficient of determination (________).
The coefficient of determination gives us the proportion of variation in the values of y that is explained by least-
squares regression of y on x. The coefficient of determination turns out to be the correlation coefficient squared.

In our last example, r = _______ Therefore, to find the coefficient of determination:

Whenever you use the regression line for prediction, also include $r^2$ as a measure of how successful the
regression is in explaining the response.

In our example, $r^2 = 0.818$. This means that 81.8% of the variation in teacher evaluations (the dependent
variable) can be explained by the linear relationship it has with the student class average (the independent
variable).
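
The phrase "proportion of variation explained" can be checked numerically. The Python sketch below assumes the reading of the data table used in the example; the names `sst` and `sse` (total and leftover sums of squares) are my own labels, not terms from these notes. It computes the proportion directly and compares it with the square of r.

```python
# Values as read from the data table in the example (an assumption about the layout).
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * xi for xi in x]

# sst: total variation of y around its mean.
sst = sum((yi - y_bar) ** 2 for yi in y)
# sse: variation left over after using the line (sum of squared residuals).
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

# Proportion of the variation in y that the line accounts for.
r_squared = 1 - sse / sst

# Correlation computed separately, to compare r**2 with the proportion above.
r = sxy / (sxx * sst) ** 0.5

print(f"1 - SSE/SST = {r_squared:.3f}")
print(f"r^2         = {r ** 2:.3f}")
```

Both lines print the same value, matching the 0.818 quoted above under this reading of the table.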

Residuals
• In most cases, no line will pass exactly through all the points. This means that even if we use the LSRL
to make predictions about our dependent variable, there will still be some error from the actual y-value.
• Because we use the line to predict __ from __, the prediction errors we make are errors in y, the
_________ direction in the scatterplot.
• A good regression makes the vertical deviations of the points from the line
____________________________________ (remember Least Squares Regression?)
• A residual is the _____________________ between an observed value of the response variable and the
value predicted by the regression line.
• Residual = observed - predicted OR _________________________
• If the residual is positive, the observed point lies ___________ the least squares regression line.
• If the residual is negative, the observed point lies ____________ the least squares regression line.

Fun fact: If you add up all the residuals from the LSRL, you will get 0, because the positive and negative
residuals cancel. That is why the method of least squares squares the residuals before adding them up, and then
minimizes that sum.
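
The zero-sum property can be verified with a short Python sketch on the teacher evaluation data. The lists assume the reading of the data table used in the earlier example, and the variable names are my own.

```python
# Values as read from the data table in the earlier example (an assumption about the layout).
x = [40, 60, 70, 73, 75, 68, 65, 85, 98, 90]
y = [10, 50, 60, 65, 75, 73, 78, 80, 90, 95]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Residual = observed - predicted.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

for xi, yi, e in zip(x, y, residuals):
    side = "above" if e > 0 else "below"
    print(f"x = {xi:3d}  observed = {yi:3d}  residual = {e:7.2f}  (point is {side} the line)")

# The residuals cancel out, so their plain sum says nothing about fit quality.
print("sum of residuals is zero:", abs(sum(residuals)) < 1e-9)
# The sum of squared residuals does not cancel; this is the quantity the LSRL minimizes.
print(f"sum of squared residuals = {sum(e * e for e in residuals):.1f}")
```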
Example: Everyone knows that cars and trucks lose value the more they are driven. Can we predict the price of
a used Ford F-150 SuperCrew 4x4 if we know how many miles it has on the odometer? A random sample of 16
used trucks was selected from autotrader.com. Here is a graph of the data:

Find and interpret the residual for the Ford F-150 that had 70,583 miles driven and a price of $21,994.

Residual Plots
• A residual plot makes it easy to study the residuals by plotting them against the explanatory variable.
• Residual plots help us assess whether a linear model is appropriate.
• In the truck example, if we calculate the residual for each point, we can then plot the miles driven vs. residuals.
• Essentially, a residual plot turns the regression line horizontal.
• Residual plots magnify the deviations of points from a line.
• This makes it easier to see an unusual pattern, so it helps us determine if a linear model is appropriate.

When an obvious curved pattern exists in a residual plot, the model we are using is not appropriate:
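
A curved residual pattern can be reproduced with a small Python sketch. The data below are hypothetical (points on the curve y = x², not from these notes), chosen so that a straight line is clearly the wrong model.

```python
# Hypothetical data lying exactly on the curve y = x**2 (not from the notes).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [xi ** 2 for xi in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
         / sum((xi - x_bar) ** 2 for xi in xs))
intercept = y_bar - slope * x_bar

# Residual = observed - predicted, listed in order of increasing x.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(xs, ys)]

# A crude text version of a residual plot: one row per x-value.
for xi, e in zip(xs, residuals):
    print(f"x = {xi}  residual = {e:5.1f}")
```

The residuals are positive at both ends and negative in the middle. That U-shape is the kind of obvious curved pattern that signals a linear model is not appropriate, even though the residuals still add up to zero.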

Residual Plots on Calculator

The TI-83/84 will generate a complete set of residuals when you perform a LinReg. They are stored in a list
called RESID which can be found in the LIST menu. RESID stores only the current set of residuals. That is, a
new set of residuals is stored in RESID each time you perform a new regression.

In order to draw a residual plot on the TI-83/84, first enter your data and perform a LinReg. Next, create a
STAT PLOT where XList is L1 and YList is RESID (get this by pressing 2nd → STAT → 9:RESID).
