
Interpolation and approximation

Given a set of points {(xj , yj ), j = 1, · · · , n}, and given a functional


form f(x; β⃗), find the “best” β⃗ so that f(x; β⃗) “models” the data.

Interpolation:     f(xj; β⃗) = yj

Approximation:     f(xj; β⃗) + εj = yj ,   where εj is “noise”, E(εj) = 0

1/19
Least squares approximation

E(εj) = 0
E(εj²) = σj²
σj ≈ const

χ²(β⃗) = ∑_{j=1}^{n} [yj − f(xj; β⃗)]² ⇒ min

2/19
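A minimal numpy sketch of this objective (the straight-line model and the data values below are made up for illustration):

```python
import numpy as np

def chi2(beta, x, y, f):
    """Sum of squared residuals: chi^2(beta) = sum_j (y_j - f(x_j; beta))^2."""
    resid = y - f(x, beta)
    return np.sum(resid**2)

# Hypothetical straight-line model f(x; beta) = beta[0] + beta[1]*x and toy data
f_line = lambda x, beta: beta[0] + beta[1] * x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.1, 2.9])
print(chi2(np.array([0.0, 1.0]), x, y, f_line))
```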
Weighted least squares

E(εj) = 0
E(εj²) = σj²

If the σj are significantly different (heteroscedasticity),

χ²(β⃗) = ∑_{j=1}^{n} [yj − f(xj; β⃗)]² / σj² ⇒ min

3/19
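The weighted objective differs only by the per-point factors 1/σj²; a small sketch (the model, data and σj values are hypothetical):

```python
import numpy as np

def chi2_weighted(beta, x, y, sigma, f):
    """Weighted objective: sum_j ((y_j - f(x_j; beta)) / sigma_j)^2."""
    resid = (y - f(x, beta)) / sigma
    return np.sum(resid**2)

# Hypothetical usage: the noisier third point (larger sigma_j) gets a smaller weight.
f_line = lambda x, beta: beta[0] + beta[1] * x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.1, 2.9])
sigma = np.array([0.1, 0.1, 0.5, 0.1])
print(chi2_weighted(np.array([0.0, 1.0]), x, y, sigma, f_line))
```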
Least absolute deviations

Also known as L1 regression:


S(β⃗) = ∑_{j=1}^{n} |yj − f(xj; β⃗)| ⇒ min

4/19
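Since the L1 objective is not differentiable at zero residuals, a derivative-free minimizer is one simple (if not the most efficient) way to handle it; a sketch using scipy.optimize.minimize with made-up data and a hypothetical straight-line model:

```python
import numpy as np
from scipy.optimize import minimize

def s_abs(beta, x, y, f):
    """L1 objective: S(beta) = sum_j |y_j - f(x_j; beta)|."""
    return np.sum(np.abs(y - f(x, beta)))

# Hypothetical straight-line model; Nelder-Mead copes with the non-smooth objective.
f_line = lambda x, beta: beta[0] + beta[1] * x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.1, 5.0])   # the last point is an outlier
res = minimize(s_abs, x0=np.zeros(2), args=(x, y, f_line), method="Nelder-Mead")
print(res.x)
```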
Total least squares

Also known as orthogonal distance regression: minimize the sum of squares
of orthogonal distances from the observations to the curve.

Can be more appropriate, e.g., if both variables, x and y, have
measurement errors.
5/19
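For the special case of a straight line, orthogonal distance regression reduces to a small SVD problem; a minimal sketch (not part of the slides), assuming numpy arrays x and y:

```python
import numpy as np

def tls_line(x, y):
    """Total least squares (orthogonal distance) fit of a line a*x + b*y = c.

    The unit normal (a, b) of the best-fit line is the right singular vector
    of the centered data matrix with the smallest singular value, which
    minimizes the sum of squared orthogonal distances.
    """
    xm, ym = x.mean(), y.mean()
    M = np.column_stack([x - xm, y - ym])
    _, _, vt = np.linalg.svd(M)
    a, b = vt[-1]
    c = a * xm + b * ym
    return a, b, c
```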
Linear least squares

6/19
Least squares approximation

E(εj) = 0
E(εj²) = σj²
σj ≈ const

χ²(β⃗) = ∑_{j=1}^{n} [yj − f(xj; β⃗)]² ⇒ min

7/19
Linear least squares

Consider an ordinary least squares problem,


ξ(β⃗) = ∑_{j=1}^{n} [yj − f(xj; β⃗)]² ⇒ min

Let the model f(x; β⃗) be a linear function of β⃗, i.e. a linear combination
of m basis functions φk(x):

f(x; β⃗) = ∑_{k=1}^{m} βk φk(x)

Typically, we want m < n.

8/19
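A direct way to evaluate such a model is to keep the basis functions as a list of callables; a sketch with a hypothetical quadratic basis:

```python
import numpy as np

def f_linear(x, beta, basis):
    """Evaluate f(x; beta) = sum_k beta_k * phi_k(x) for a list of basis callables."""
    return sum(b * phi(x) for b, phi in zip(beta, basis))

# Hypothetical quadratic model: phi_1 = 1, phi_2 = x, phi_3 = x^2
basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
x = np.linspace(0.0, 1.0, 5)
print(f_linear(x, [1.0, -2.0, 3.0], basis))
```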
Linear least squares

Let the model f(x; β⃗) be a linear function of β⃗, i.e. a linear combination
of m basis functions φk(x):

f(x; β⃗) = ∑_{k=1}^{m} βk φk(x)

The basis functions need not be linear:


▶ polynomials: φk(x) = x^k
▶ Fourier series: φk(x) = e^{i sk x}
▶ φk(x) = x^k log x
▶ ...

9/19
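These basis families could be written, for instance, as small factories of callables (an illustrative sketch only; the names and the convention that k starts at 0 are assumptions):

```python
import numpy as np

def poly_basis(m):
    """phi_k(x) = x^k for k = 0, ..., m-1."""
    return [lambda x, k=k: x**k for k in range(m)]

def fourier_basis(s):
    """phi_k(x) = exp(i * s_k * x) for a given list of frequencies s_k."""
    return [lambda x, sk=sk: np.exp(1j * sk * x) for sk in s]

def poly_log_basis(m):
    """phi_k(x) = x^k * log(x) for k = 0, ..., m-1 (requires x > 0)."""
    return [lambda x, k=k: x**k * np.log(x) for k in range(m)]
```

The model remains linear in β⃗ in every case; only the dependence on x is nonlinear.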
Linear least squares

We minimize with respect to β⃗


ξ(β⃗) = ∑_{j=1}^{n} |zj|²

where (j = 1, . . . , n)

zj = yj − ( β1 φ1(xj) + β2 φ2(xj) + · · · + βm φm(xj) )

which is equivalent to

ξ(β⃗) = ‖y − Aβ⃗‖₂²

with y = (y1, · · · , yn)ᵀ and Ajk = φk(xj).

10/19
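A quick numerical check (with made-up numbers) that the two forms of ξ(β⃗) agree:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 3
A = rng.normal(size=(n, m))        # stands in for A[j, k] = phi_k(x_j)
y = rng.normal(size=n)
beta = rng.normal(size=m)

xi_sum  = np.sum((y - A @ beta) ** 2)          # sum_j |z_j|^2
xi_norm = np.linalg.norm(y - A @ beta) ** 2    # ||y - A beta||_2^2
print(np.isclose(xi_sum, xi_norm))             # True
```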
Design matrix

The design matrix A is an n × m matrix


 
    [ φ1(x1)  φ2(x1)  · · ·  φm(x1) ]
A = [ φ1(x2)  φ2(x2)  · · ·  φm(x2) ]
    [      · · ·                    ]
    [ φ1(xn)  φ2(xn)  · · ·  φm(xn) ]

The dimensions of the design matrix are (# of observations) × (# of parameters).

11/19
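One way to assemble the design matrix from a list of basis callables (a sketch; the cubic-polynomial basis below is just an example):

```python
import numpy as np

def design_matrix(x, basis):
    """Build the n-by-m design matrix with A[j, k] = phi_k(x_j)."""
    return np.column_stack([phi(x) for phi in basis])

# Hypothetical cubic-polynomial basis: 1, x, x^2, x^3
basis = [lambda x, k=k: x**k for k in range(4)]
x = np.linspace(0.0, 2.0, 6)
A = design_matrix(x, basis)
print(A.shape)   # (6, 4): observations x parameters
```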
Example: straight-line fit

The model is

f(x; β⃗) = β1 + β2 x

m = 2 :   φ1(x) = 1 ,   φ2(x) = x

and the design matrix is

    [ 1  x1 ]
A = [ 1  x2 ]
    [ ·   · ]
    [ ·   · ]
    [ 1  xn ]

12/19
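For this straight-line example, the fit could be done end to end with numpy's reference least-squares solver (the data values are made up):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

A = np.column_stack([np.ones_like(x), x])        # design matrix with rows [1, x_j]
beta, *_ = np.linalg.lstsq(A, y, rcond=None)     # least-squares beta = (beta1, beta2)
print(beta)                                      # roughly [intercept, slope]
```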
Linear least squares

Normal equations

13/19
Linear least squares: normal equations

To minimize the quadratic form



ξ(β⃗) = ‖y − Aβ⃗‖₂²

set the derivatives to zero,


∂ξ(β⃗)/∂βk = 0 ,   k = 1, · · · , m

And obtain the normal equations:


AᵀA β⃗ = Aᵀy

14/19
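Forming and solving the normal equations directly looks like this (a sketch with random stand-in data; see the conditioning caveat on the next slide, which is why QR is preferred in practice):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 4
A = rng.normal(size=(n, m))
y = rng.normal(size=n)

beta = np.linalg.solve(A.T @ A, A.T @ y)                          # normal equations
print(np.allclose(beta, np.linalg.lstsq(A, y, rcond=None)[0]))    # True
```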
Linear least squares: normal equations

Normal equations
AᵀA β⃗ = Aᵀy
give a formal solution of a linear least squares problem.

However,

cond(AᵀA) = [cond A]²

so that typically the system of normal equations is very poorly conditioned.

15/19
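The squaring of the condition number is easy to see numerically; a sketch using an ill-conditioned polynomial design matrix (chosen only for illustration):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 50)
A = np.vander(x, 6)                # Vandermonde design matrix, ill-conditioned

kA  = np.linalg.cond(A)            # 2-norm condition number of A
kAA = np.linalg.cond(A.T @ A)      # condition number of the normal-equations matrix
print(kA, kAA, kAA / kA**2)        # the last ratio is close to 1
```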
Linear least squares

QR factorization of the design matrix

16/19
Linear least squares: QR factorization

Recall that a matrix A can be factorized into

A = QR

where Q is orthogonal (QᵀQ = 1) and R is upper triangular.

Since the design matrix is tall and thin (m < n), the last n − m rows of R
are zero:

A = Q [ R1 ]
      [ 0  ]

where R1 is an m × m upper triangular matrix.

17/19
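In numpy this corresponds to the "complete" versus "reduced" modes of np.linalg.qr; a sketch with a random tall matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 8, 3
A = rng.normal(size=(n, m))

Q, R   = np.linalg.qr(A, mode="complete")   # Q is n x n, R is n x m
Q1, R1 = np.linalg.qr(A, mode="reduced")    # thin QR: Q1 is n x m, R1 is m x m

print(np.allclose(R[:m], R1))    # top block of R is R1
print(np.allclose(R[m:], 0.0))   # last n - m rows of R are zero
```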
Linear least squares: QR factorization

Since the 2-norm of a vector is invariant under a rotation by an orthogonal
matrix Q, we rotate the residual y − Aβ⃗:

ξ(β⃗) = ‖y − Aβ⃗‖² = ‖Qᵀ(y − Aβ⃗)‖²

                  = ‖Qᵀy − [ R1 ] β⃗‖²
                           [ 0  ]

Next, write

Qᵀy = [ f ]
      [ r ]

with dim f = m and dim r = n − m.

18/19
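A sketch of this rotation and split, with random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 8, 3
A = rng.normal(size=(n, m))
y = rng.normal(size=n)

Q, R = np.linalg.qr(A, mode="complete")
z = Q.T @ y
f, r = z[:m], z[m:]          # dim f = m, dim r = n - m
# The minimum of xi(beta) will be ||r||^2 (next slide), attained when R1 beta = f.
print(np.linalg.norm(r)**2)
```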
Linear least squares: QR factorization

This way,

ξ(β⃗) = ‖f − R1 β⃗‖² + ‖r‖²

and the minimum of ξ(β⃗) satisfies

R1 β⃗ = f

Algorithm
▶ Factorize the design matrix A = QR
▶ Rotate y → Qᵀy (only the first m rows are needed ⇒ thin QR)
▶ Solve R1 β⃗ = f by back substitution.

19/19
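A compact sketch of the whole algorithm, checked against numpy's reference solver (scipy's triangular solver stands in for the back substitution; data are made up):

```python
import numpy as np
from scipy.linalg import solve_triangular

def lstsq_qr(A, y):
    """Least squares via thin QR: A = Q1 R1, f = Q1^T y, back-substitute R1 beta = f."""
    Q1, R1 = np.linalg.qr(A, mode="reduced")
    f = Q1.T @ y
    return solve_triangular(R1, f, lower=False)

rng = np.random.default_rng(4)
A = rng.normal(size=(10, 3))
y = rng.normal(size=10)
print(np.allclose(lstsq_qr(A, y), np.linalg.lstsq(A, y, rcond=None)[0]))   # True
```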
