Equation of A Line With Python Programming: Module 2-Part 2 CS 132 - Mathematics For Computer Science
Learning Objectives
1. Derive the equation of a line that fits a given
set of points, based on the concept of regression
(Voluntary Homework)
2. Create a Program that is an aid in the
derivation of the equation of a line using the
Python Programming language
3. Use colab.research.google.com to run a
program in Python
Introduction
• Regression Analysis is an approach for
deriving the equation of a line that may be
used for forecasting.
• Given: a set of points
• Goal: Derive a line that fits the given set of
points.
Fitting a line to a given set of points
(Model for inference based on data collected)
• Curve Fitting
– Deriving a Regression Line
– Deriving Nonlinear Regression Curve (e.g.
Quadratic Curve)
Regression Analysis
• Determine the values of parameters for a
function that cause the function to best fit
a set of data observations
• The regression line has the form y = ax + b.
Hence, determining the values of the
parameters a and b determines the line that
fits a set of points.
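One common way to make "best fit" precise is the least-squares criterion, which is assumed here because the slide does not name a criterion. It chooses a and b so that the sum of squared vertical deviations between the n observed points (x_i, y_i) and the line is as small as possible. The symbols S, n, x_i, and y_i are notation introduced for this sketch.

```latex
S(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2
```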
Linear Regression
y = ax + b
(equivalent forms, with the coefficients renamed: y = a + bx or y = p0 + p1x)
General Idea behind the Regression Line
• Using principles of calculus, one can derive a system of equations whose solution gives the values of
a and b for the regression line y = ax + b; these values can then be computed using the following formulas.
» For the meantime, we forego showing the derivation of the formulas.
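The formulas themselves are not reproduced in this text. As a sketch, assuming the usual least-squares criterion, setting the partial derivatives of the sum of squared errors with respect to a and b to zero gives a system of two equations in a and b (often called the normal equations). The system and its solution are shown below. Here n is the number of points and every sum runs over i = 1, ..., n.

```latex
\begin{aligned}
a \sum x_i^2 + b \sum x_i &= \sum x_i y_i \\
a \sum x_i + b\, n &= \sum y_i \\[4pt]
a &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2} \\
b &= \frac{\sum y_i - a \sum x_i}{n}
\end{aligned}
```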
Required Exercise
Exercise 2B
(See Next Slide)
Linear Regression
(A sample context)
For a particular stretch of a highway, it is believed that there is a
correlation between the vehicle density (number of vehicles per 100 m) on
the highway and the number of accidents that occur. From casual
observation, the number of accidents has been found to increase with an
increase in vehicle density up to a certain point. However, once the
vehicle density exceeds a certain value, the average vehicle speed is
reduced due to congestion, thereby reducing the number of accidents.
To predict accident rates, and as an aid to producing an improved highway
design, we wish to develop equations relating the vehicle density to the
number of accidents from observed data.
Assume:
The number of accidents on a highway
depends on the vehicle density
Data for Regression Problem
Observation Data
You may use spreadsheet software as an aid.
Sample (partial) worksheets have been uploaded for this purpose.
Vehicle density (vehicles per 100 m)    Number of accidents
 1.4                                      3
 2.0                                      6
 2.3                                      4
 4.5                                      7
 6.2                                     10
 6.7                                     15
 7.0                                     11
 8.5                                     18
 9.0                                     13
12.7                                     17
13.1                                     15
17.7                                     16
18.5                                     11
20.3                                      5
Data for Regression Problem
• The first 8 data points may be represented by one
regression line.
• The last 7 data points may be represented by
another regression line.
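A minimal sketch of fitting the two lines, assuming the least-squares criterion and the form y = ax + b. The function name fit_line and the variable names are illustrative and not part of the course material. The data are the 14 (vehicle density, number of accidents) observations from the highway example.

```python
def fit_line(points):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / denom  # slope
    b = (sum_y - a * sum_x) / n               # intercept
    return a, b


# (vehicle density, number of accidents) pairs from the observation data
data = [
    (1.4, 3), (2.0, 6), (2.3, 4), (4.5, 7), (6.2, 10), (6.7, 15), (7.0, 11),
    (8.5, 18), (9.0, 13), (12.7, 17), (13.1, 15), (17.7, 16), (18.5, 11),
    (20.3, 5),
]

rising = data[:8]    # first 8 points
falling = data[-7:]  # last 7 points; this slice shares (8.5, 18) with the first

a1, b1 = fit_line(rising)
a2, b2 = fit_line(falling)
print(f"first 8 points: y = {a1:.3f}x + {b1:.3f}")
print(f"last 7 points:  y = {a2:.3f}x + {b2:.3f}")
```

With 14 observations, taking the first 8 and the last 7 means the two groups overlap at the point with the highest accident count. The first fitted slope comes out positive and the second negative, which matches the rise-then-fall behaviour described in the sample context.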
Linear Regression
(Sample case described in a previous slide)
Partial Worksheet
(Part of the solution of Exercise 2B)
APPLY THE FORMULAS FOR GETTING THE REGRESSION LINE. See Module2B1_present
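The uploaded worksheets are not included in this text, so the layout below is an assumption: one row per observation with columns x, y, xy, and x^2, followed by a row of column totals. The sketch builds that worksheet in Python for the first 8 data points. The totals are the sums that a least-squares formula for the line y = ax + b needs.

```python
# First 8 (vehicle density, number of accidents) observations
points = [(1.4, 3), (2.0, 6), (2.3, 4), (4.5, 7),
          (6.2, 10), (6.7, 15), (7.0, 11), (8.5, 18)]

# One worksheet row per observation: x, y, x*y, x^2
rows = [(x, y, x * y, x * x) for x, y in points]

# Column totals: sum of x, sum of y, sum of xy, sum of x^2
totals = [sum(column) for column in zip(*rows)]

print(f"{'x':>8}{'y':>8}{'xy':>10}{'x^2':>10}")
for x, y, xy, x2 in rows:
    print(f"{x:>8.1f}{y:>8}{xy:>10.2f}{x2:>10.2f}")
print("-" * 36)
print(f"{totals[0]:>8.1f}{totals[1]:>8}{totals[2]:>10.2f}{totals[3]:>10.2f}")
```

The same layout works in a spreadsheet: two computed columns and a SUM row at the bottom.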
Cramer’s Rule for Solving a System of Two Equations in
2 Unknowns
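The rule itself is not reproduced in this text. As a reminder, for a general system in the unknowns x and y, Cramer's Rule can be stated as follows, provided the determinant D is nonzero. The coefficient names a_{11}, a_{12}, a_{21}, a_{22}, c_1, and c_2 are notation chosen for this sketch.

```latex
\begin{aligned}
a_{11} x + a_{12} y &= c_1 \\
a_{21} x + a_{22} y &= c_2 \\[4pt]
D &= a_{11} a_{22} - a_{12} a_{21} \\
x &= \frac{c_1 a_{22} - a_{12} c_2}{D}, \qquad
y = \frac{a_{11} c_2 - c_1 a_{21}}{D}
\end{aligned}
```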
Needed Theoretical Framework
» To be able to solve for 2 unknowns, you must
have a system of 2 equations in 2 unknowns
» Apply Method of Elimination
» By substitution
» By Addition/Subtraction
» Use Cramer’s Rule
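The last option can be sketched in a few lines of Python. The function name solve_2x2 and its parameter names are hypothetical. The second call assumes the least-squares system for y = ax + b, written as a·Σx² + b·Σx = Σxy and a·Σx + b·n = Σy, and uses three made-up points (0, 1), (1, 3), (2, 5) that lie exactly on y = 2x + 1.

```python
def solve_2x2(a11, a12, c1, a21, a22, c2):
    """Solve a11*x + a12*y = c1 and a21*x + a22*y = c2 by Cramer's Rule."""
    d = a11 * a22 - a12 * a21          # determinant of the coefficients
    if d == 0:
        raise ValueError("determinant is zero; no unique solution")
    x = (c1 * a22 - a12 * c2) / d      # constants replace the x column
    y = (a11 * c2 - c1 * a21) / d      # constants replace the y column
    return x, y


# 2x + y = 5 and x - y = 1
print(solve_2x2(2, 1, 5, 1, -1, 1))    # (2.0, 1.0)

# Regression system for the illustrative points (0, 1), (1, 3), (2, 5):
# sum of x^2 = 5, sum of x = 3, sum of xy = 13, n = 3, sum of y = 9
a, b = solve_2x2(5, 3, 13, 3, 3, 9)
print(a, b)                            # 2.0 1.0
```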
Linear Regression Problem
(Another sample Context)
Population Forecasting Model
Year    Population (millions)
1950    20.2
1955    22.9
1960    25.29
1965    30.55
1970    34.7
1975    40.3
1980    47.1
1985    52.3
1990    59.2
1995    68.2
2000    78.78
2005    85.5
2010    91.35
Nonlinear Regression Formulas
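The formulas for this slide are not reproduced in this text. As a hedged sketch, assuming the nonlinear case meant here is the quadratic curve mentioned earlier, y = c_0 + c_1 x + c_2 x^2, the least-squares criterion leads to a system of three equations in the three unknown coefficients. The names c_0, c_1, and c_2 are notation chosen for this sketch, n is the number of points, and every sum runs over i = 1, ..., n.

```latex
\begin{aligned}
c_0\, n + c_1 \sum x_i + c_2 \sum x_i^2 &= \sum y_i \\
c_0 \sum x_i + c_1 \sum x_i^2 + c_2 \sum x_i^3 &= \sum x_i y_i \\
c_0 \sum x_i^2 + c_1 \sum x_i^3 + c_2 \sum x_i^4 &= \sum x_i^2 y_i
\end{aligned}
```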
Course Facilitator
References
Walpole, R. (1997). Introduction to Statistics.
3rd Edition. Prentice Hall International, Inc.,
Singapore.