Least-Squares Regression (ChEn 2450): Polynomial Regression & the Normal Equations


Least-Squares Regression

ChEn 2450

Concept: Given N data points (xi, yi), find parameters in the function f(x) that minimize the error between f(xi) and yi.

[Figure: data points (x_i, y_i) and a smooth curve f(x) plotted against x]


Introduction: Regression to a Linear Function

[Figure: three data points (x1, y1), (x2, y2), (x3, y3) with a candidate line f(x)]

What we would like:

y_1 = a_0 + a_1 x_1

y_2 = a_0 + a_1 x_2

y_3 = a_0 + a_1 x_3

Problem: 3 equations (3 data points), but only 2 unknowns (a0, a1).

Idea: minimize the error between f(xi) and yi.

A measure of error (the sum of squared errors):

S = (y_1 - f(x_1))^2 + (y_2 - f(x_2))^2 + (y_3 - f(x_3))^2
  = (y_1 - a_0 - a_1 x_1)^2 + (y_2 - a_0 - a_1 x_2)^2 + (y_3 - a_0 - a_1 x_3)^2

To minimize error, we change a0 and a1 to minimize S. Set the slope to zero!

\frac{\partial S}{\partial a_0} = 2(y_1 - a_0 - a_1 x_1)(-1) + 2(y_2 - a_0 - a_1 x_2)(-1) + 2(y_3 - a_0 - a_1 x_3)(-1)

\frac{\partial S}{\partial a_1} = -2x_1(y_1 - a_0 - a_1 x_1) - 2x_2(y_2 - a_0 - a_1 x_2) - 2x_3(y_3 - a_0 - a_1 x_3)

Setting both derivatives to zero and rearranging:

y_1 + y_2 + y_3 = 3a_0 + a_1(x_1 + x_2 + x_3)

x_1 y_1 + x_2 y_2 + x_3 y_3 = a_0(x_1 + x_2 + x_3) + a_1(x_1^2 + x_2^2 + x_3^2)

To fit a linear polynomial to three data points, solve these two equations for a0 and a1.
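A minimal MATLAB sketch of this three-point fit (the data values here are illustrative, not from the notes):

% three data points (illustrative)
x = [1; 2; 3];
y = [1.1; 1.9; 3.2];

% the two equations above, in matrix form:
% [ 3        sum(x)    ] [a0]   [ sum(y)    ]
% [ sum(x)   sum(x.^2) ] [a1] = [ sum(x.*y) ]
M   = [3, sum(x); sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];
a   = M \ rhs;    % a(1) = a0 (intercept), a(2) = a1 (slope)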


Linear Least-Squares Regression

S - Sum of the squared errors.

To minimize S (error between function & data), we take partial derivatives w.r.t. the function parameters and set to zero.

ASSUME f(x) is an np-th order polynomial:

f(x) = \sum_{k=0}^{n_p} a_k x^k

so that

S = \sum_{i=1}^{N} [y_i - f(x_i)]^2 = \sum_{i=1}^{N} \left[ y_i - \sum_{k=0}^{n_p} a_k x_i^k \right]^2

Setting each partial derivative to zero:

\frac{\partial S}{\partial a_k} = \sum_{i=1}^{N} -2 x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

This gives np+1 equations for the np+1 unknowns ak. The equations are linear w.r.t. ak, hence "Linear Least Squares" regression (Hoffman §4.10.3). Note this formulation is only for polynomials.

Indices: i - data point index; j - dummy (summation) index; k - polynomial coefficient index.
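As a sketch, these np+1 linear equations can be assembled and solved in MATLAB for any polynomial order (data and variable names here are illustrative):

% illustrative data and polynomial order
x  = [0; 1; 2; 3; 4];
y  = [1.0; 2.7; 5.8; 11.1; 17.9];
np = 2;

% From dS/da_k = 0:  sum_j [ sum_i x_i^(k+j) ] a_j = sum_i y_i x_i^k
M = zeros(np+1);
r = zeros(np+1, 1);
for k = 0:np
    for j = 0:np
        M(k+1, j+1) = sum(x.^(k+j));   % coefficient of a_j in equation k
    end
    r(k+1) = sum(y .* x.^k);
end
a = M \ r;    % a(k+1) holds the coefficient a_k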


Example: np=1 (Linear Polynomial)

Linear polynomial, np = 1:

f(x) = a_0 + a_1 x, \qquad S = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i]^2

Write S = \sum_{i=1}^{N} g_i^2 with g_i = y_i - a_0 - a_1 x_i, and apply the chain rule:

\frac{\partial S}{\partial a_0} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial a_0} = \sum_{i=1}^{N} (2g_i)(-1) = \sum_{i=1}^{N} -2\,(y_i - a_0 - a_1 x_i)

\frac{\partial S}{\partial a_1} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial a_1} = \sum_{i=1}^{N} (2g_i)(-x_i) = \sum_{i=1}^{N} -2x_i\,(y_i - a_0 - a_1 x_i)

Setting each derivative to zero (here we divided the entire equation by -2; why?):

0 = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i], \qquad 0 = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i]\, x_i

This matches the general formula:

\frac{\partial S}{\partial a_k} = \sum_{i=1}^{N} x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

Rearranging gives 2 equations in the 2 unknowns (a0, a1):

\sum_{i=1}^{N} y_i = \sum_{i=1}^{N} (a_0 + a_1 x_i), \qquad \sum_{i=1}^{N} y_i x_i = \sum_{i=1}^{N} (a_0 + a_1 x_i)\, x_i


Example (cont'd.)

\sum_{i=1}^{N} y_i = \sum_{i=1}^{N} (a_0 + a_1 x_i), \qquad \sum_{i=1}^{N} y_i x_i = \sum_{i=1}^{N} (a_0 + a_1 x_i)\, x_i

2 equations, 2 unknowns. Let's put these in matrix form...

Step 1: define the solution variable vector, (a_0, a_1)^T.

Step 2: define the matrix and RHS vector. For N = 4 points,

y_1 + y_2 + y_3 + y_4 = (a_0 + a_1 x_1) + (a_0 + a_1 x_2) + (a_0 + a_1 x_3) + (a_0 + a_1 x_4)

y_1 x_1 + y_2 x_2 + y_3 x_3 + y_4 x_4 = (a_0 + a_1 x_1)x_1 + (a_0 + a_1 x_2)x_2 + (a_0 + a_1 x_3)x_3 + (a_0 + a_1 x_4)x_4

In general:

\begin{bmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{bmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

This is linear least-squares regression for a linear polynomial.
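In MATLAB this 2x2 system can be built and solved directly, and cross-checked against polyfit (data values are illustrative):

% illustrative data
x = [0.5; 1.0; 1.5; 2.0];
y = [0.9; 2.1; 2.9; 4.2];
N = numel(x);

M   = [N, sum(x); sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];
a   = M \ rhs;           % a(1) = a0, a(2) = a1

p = polyfit(x, y, 1);    % built-in fit; p = [a1, a0], same answer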


[Figure: ln(k) (1/s) vs. -1/(RT) (mol/J); data points and best-fit line, with ln(A) = 38.9246 and Ea = 121481.5094]

Example - Reaction Rate Constant

T (K)    k (1/s)
313      0.00043
319      0.00103
323      0.0018
328      0.00355
333      0.00717

[Reaction: benzene diazonium chloride -> chlorobenzene + N2, with rate "constant" k]

In the Arrhenius expression below, A is the pre-exponential factor, Ea the activation energy, T the temperature, and R = 8.314 J/(mol K) the gas constant.

k = A \exp\left(\frac{-E_a}{RT}\right) \quad\Rightarrow\quad \ln(k) = \ln(A) - \frac{E_a}{RT} \qquad \text{(recall: ln(ab) = ln(a) + ln(b))}

This has the linear form y = a_0 + a_1 x with

y = \ln(k), \qquad x = \frac{-1}{RT}, \qquad a_0 = \ln(A), \qquad a_1 = E_a

Note: we need to calculate A (pre-exponential factor) from a0.

"N

PNi=1 xiPN

i=1 xiPN

i=1 x2i

#✓a0

a1

◆=

✓b0

b1

Now we are ready to go to the computer to determine a0 and a1.
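A sketch of that computation in MATLAB, using the table above (variable names are mine):

R = 8.314;                         % J/(mol K)
T = [313; 319; 323; 328; 333];     % K
k = [0.00043; 0.00103; 0.0018; 0.00355; 0.00717];   % 1/s

x = -1 ./ (R*T);                   % x = -1/(RT)
y = log(k);                        % y = ln(k); log is the natural log in MATLAB

M = [numel(x), sum(x); sum(x), sum(x.^2)];
b = [sum(y); sum(x.*y)];
a = M \ b;

A  = exp(a(1));    % pre-exponential factor, from a0 = ln(A)
Ea = a(2);         % activation energy (J/mol), a1 = Ea

This should reproduce the fitted values shown in the figure, ln(A) ≈ 38.92 and Ea ≈ 1.21e5 J/mol.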


Polynomial Regression & The Normal Equations (Alternative Formulation for Linear Least Squares)

Given N > np observations (xi, yi) and an np-th order polynomial, find the aj.

NOTE: this is an overdetermined (more equations than unknowns) linear problem for the coefficients aj.

p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n_p} x^{n_p}

One equation for each observation (N equations).

Example - linear polynomial: p(x) = a_0 + a_1 x

Another form of linear least-squares regression.

The N equations, in matrix form A\alpha = b:

A = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n_p} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n_p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^{n_p} \end{bmatrix}, \qquad \alpha = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n_p} \end{pmatrix}, \qquad b = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}

"Normal Equations":

A^T A \alpha = A^T b

For the linear polynomial, A\alpha = b is

\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}

and premultiplying both sides by A^T gives

\begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{bmatrix} \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{bmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
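A minimal MATLAB sketch of the normal-equations approach (illustrative data; np is the polynomial order):

x  = [0; 1; 2; 3; 4];              % illustrative
y  = [1.0; 0.4; 0.3; 1.1; 2.8];
np = 2;

% build A column by column: [1, x, x.^2, ..., x.^np]
A = ones(numel(x), np+1);
for j = 1:np
    A(:, j+1) = x.^j;
end

alpha = (A'*A) \ (A'*y);   % solve the normal equations for a0..anp

% note: in MATLAB, alpha = A\y solves the same least-squares
% problem directly (and more stably, via QR).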


The Two are One...

Consider each of the previous approaches for a first-order polynomial.

Direct Least Squares: from

\frac{\partial S}{\partial a_k} = \sum_{i=1}^{N} x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

the k = 0 and k = 1 equations are

\sum_{i=1}^{N} (y_i - a_0 - a_1 x_i) = 0, \qquad \sum_{i=1}^{N} x_i (y_i - a_0 - a_1 x_i) = 0

or

a_0 \sum_{i=1}^{N} 1 + a_1 \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i, \qquad a_0 \sum_{i=1}^{N} x_i + a_1 \sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} x_i y_i

\begin{bmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{bmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

Matrix Transpose Approach (typically most convenient for linear regression problems): with

A = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix}, \qquad b = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}, \qquad \alpha = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix}

the normal equations A^T A \alpha = A^T b have

A^T A = \begin{bmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{bmatrix}, \qquad A^T b = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

so the two approaches produce exactly the same system.
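This equivalence is easy to verify numerically in MATLAB (illustrative data):

x = [1; 2; 3; 4];
y = [2.0; 2.9; 4.1; 4.9];
N = numel(x);

% direct least-squares sums:
M   = [N, sum(x); sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];

% matrix transpose approach:
A = [ones(N,1), x];
disp(A'*A - M)     % all zeros
disp(A'*y - rhs)   % all zeros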


Example - Reaction Rate Constant (cont'd.)

As before, y = ln(k), x = -1/(RT), a_0 = \ln(A), a_1 = E_a, and we still need to calculate A (the pre-exponential factor) from a0. In matrix form:

\underbrace{\begin{bmatrix} 1 & \frac{-1}{RT_1} \\ 1 & \frac{-1}{RT_2} \\ \vdots & \vdots \\ 1 & \frac{-1}{RT_N} \end{bmatrix}}_{A} \underbrace{\begin{pmatrix} a_0 \\ a_1 \end{pmatrix}}_{\alpha} = \underbrace{\begin{pmatrix} \ln(k_1) \\ \ln(k_2) \\ \vdots \\ \ln(k_N) \end{pmatrix}}_{b}

A^T A \alpha = A^T b


Linear Least Squares Regression in matlab

Do it "manually" - the way that we just showed.
• See the MATLAB code for the previous example posted on the class website.
• This is my favored method, and it provides maximum flexibility!

Polynomial regression: p = polyfit(x,y,n) (example below)
• polyval(p,xi) evaluates the resulting polynomial at xi.
• Gives the "best fit" for a polynomial of order n through the data.
• If n == (length(x)-1) then you get an interpolant.
• If n < (length(x)-1) then you get a least-squares fit.
• You still must get the problem into a polynomial form.

BEFORE you go to the computer, formulate the problem as a polynomial regression problem!
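For example, a quick sketch (illustrative data):

x = [0, 1, 2, 3, 4];
y = [1.1, 2.0, 2.8, 4.2, 4.9];

p  = polyfit(x, y, 1);    % n < length(x)-1: least-squares line; p = [a1, a0]
yi = polyval(p, 1.5);     % evaluate the fit at x = 1.5

p4 = polyfit(x, y, 4);    % n == length(x)-1: interpolant through all 5 points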


The “R2” Value

How well does the regressed line fit the data?

Average value of y_i:

\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i

R^2 = 1 - \frac{\sum_{i=1}^{N} \left(y_i - f(x_i)\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2}

R^2 = 1 \Rightarrow perfect fit.

• (xi,yi) - data points that we are regressing.

• f(x) - function we are regressing to.

• f(xi) - regression function value at xi.
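A sketch of the R2 computation in MATLAB, here for a linear fit (illustrative data):

x = [1; 2; 3; 4; 5];
y = [1.2; 1.9; 3.2; 3.8; 5.1];

p = polyfit(x, y, 1);
f = polyval(p, x);     % f(xi): regression function values at the data points

ybar = mean(y);        % average value of yi
R2 = 1 - sum((y - f).^2) / sum((y - ybar).^2);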


“Nonlinear” Least Squares Regression

S = \sum_{i=1}^{N} [y_i - f(x_i)]^2

Same approach as before, but now the parameters of f(x) may enter nonlinearly!

Assume f(x) has n parameters a_k that we want to determine via regression:

\frac{\partial S}{\partial a_k} = 0, \qquad k = 1 \ldots n

Example: f(x) = a x^b, so that

S = \sum_{i=1}^{N} \left( y_i - a x_i^b \right)^2 = \sum_{i=1}^{N} g_i^2, \qquad g_i = y_i - a x_i^b

Apply the chain rule:

\frac{\partial S}{\partial a} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial a} = \sum_{i=1}^{N} 2g_i\left(-x_i^b\right) = \sum_{i=1}^{N} -2 x_i^b \left( y_i - a x_i^b \right)

\frac{\partial S}{\partial b} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial b} = \sum_{i=1}^{N} 2g_i\left(-a x_i^b \ln(x_i)\right) = \sum_{i=1}^{N} -2 a x_i^b \ln(x_i) \left( y_i - a x_i^b \right)

Setting these to zero:

0 = \sum_{i=1}^{N} x_i^b \left( y_i - a x_i^b \right), \qquad 0 = \sum_{i=1}^{N} x_i^b \ln(x_i) \left( y_i - a x_i^b \right)

Two nonlinear equations to solve for a and b. Can we reformulate this as a linear problem?
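To the question above: yes. For f(x) = a x^b (with x, y > 0), taking logs gives ln(y) = ln(a) + b ln(x), which is linear in the parameters ln(a) and b. A MATLAB sketch (illustrative data; note that fitting in log space minimizes the error in ln(y), not in y, so it is not identical to the nonlinear fit):

x = [1; 2; 4; 8];            % illustrative, x > 0
y = [2.1; 4.3; 7.8; 16.5];   % y > 0

p = polyfit(log(x), log(y), 1);   % ln(y) = b*ln(x) + ln(a)
b = p(1);
a = exp(p(2));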


Kinetics Example Revisited

Again, k = A \exp\left(\frac{-E_a}{RT}\right), where A is the pre-exponential factor, Ea the activation energy, T the temperature, and R = 8.314 J/(mol K) the gas constant; k is the rate "constant."

Sum of squared errors:

S = \sum_{i=1}^{N} \left[ y_i - A \exp\left(-\frac{E_a}{RT_i}\right) \right]^2 = \sum_{i=1}^{N} g_i^2, \qquad g_i = y_i - A \exp\left(-\frac{E_a}{RT_i}\right)

Minimize S w.r.t. A and Ea, using the chain rule:

\frac{\partial S}{\partial A} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial A} = \sum_{i=1}^{N} 2g_i \left[ -\exp\left(-\frac{E_a}{RT_i}\right) \right] = \sum_{i=1}^{N} -2 \exp\left(-\frac{E_a}{RT_i}\right) \left[ y_i - A \exp\left(-\frac{E_a}{RT_i}\right) \right]

\frac{\partial S}{\partial E_a} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i}\frac{\partial g_i}{\partial E_a} = \sum_{i=1}^{N} 2g_i \left[ \frac{A}{RT_i} \exp\left(-\frac{E_a}{RT_i}\right) \right] = \sum_{i=1}^{N} \frac{2A}{RT_i} \exp\left(-\frac{E_a}{RT_i}\right) \left[ y_i - A \exp\left(-\frac{E_a}{RT_i}\right) \right]

Setting these to zero:

0 = \sum_{i=1}^{N} \exp\left(-\frac{E_a}{RT_i}\right) \left[ y_i - A \exp\left(-\frac{E_a}{RT_i}\right) \right]

0 = \sum_{i=1}^{N} \frac{1}{T_i} \exp\left(-\frac{E_a}{RT_i}\right) \left[ y_i - A \exp\left(-\frac{E_a}{RT_i}\right) \right]

2 nonlinear equations with 2 unknowns A, Ea. We will show how to solve this soon!
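Until then, one option (a sketch, not necessarily the method these notes will develop) is to minimize S directly with MATLAB's fminsearch, using the table data from before. Optimizing ln(A) instead of A keeps the two parameters on more comparable scales:

R = 8.314;
T = [313; 319; 323; 328; 333];
k = [0.00043; 0.00103; 0.0018; 0.00355; 0.00717];

% p = [ln(A); Ea]; S(p) is the sum of squared errors in k itself
S = @(p) sum( (k - exp(p(1)) * exp(-p(2) ./ (R*T))).^2 );

p0 = [38.92; 1.2e5];      % starting guess from the earlier linearized (log) fit
p  = fminsearch(S, p0);   % p(1) = ln(A), p(2) = Ea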
