Least-Squares Regression
ChEn 2450
Concept: Given N data points (x_i, y_i), find the parameters in the function f(x) that minimize the error between f(x_i) and y_i.

[Figure: data points scattered about a curve f(x); axes x and f(x).]
1 Regression.key - September 22, 2014
Introduction: Regression to a Linear Function
[Figure: three data points (x_1, y_1), (x_2, y_2), (x_3, y_3) and a candidate line f(x); axes x and f(x).]
What we would like:

y_1 = a_0 + a_1 x_1
y_2 = a_0 + a_1 x_2
y_3 = a_0 + a_1 x_3

Problem: 3 equations (3 data points), but only 2 unknowns (a_0, a_1).
Idea: minimize the error between f(x_i) and y_i.

A measure of error (sum of squared errors):

S = (y_1 - f(x_1))^2 + (y_2 - f(x_2))^2 + (y_3 - f(x_3))^2
  = (y_1 - a_0 - a_1 x_1)^2 + (y_2 - a_0 - a_1 x_2)^2 + (y_3 - a_0 - a_1 x_3)^2

To minimize the error, we adjust a_0 and a_1 to minimize S. Set the slope to zero!
\frac{\partial S}{\partial a_0} = 2(y_1 - a_0 - a_1 x_1)(-1) + 2(y_2 - a_0 - a_1 x_2)(-1) + 2(y_3 - a_0 - a_1 x_3)(-1)

\frac{\partial S}{\partial a_1} = -2 x_1 (y_1 - a_0 - a_1 x_1) - 2 x_2 (y_2 - a_0 - a_1 x_2) - 2 x_3 (y_3 - a_0 - a_1 x_3)

Setting both derivatives to zero gives two linear equations:

y_1 + y_2 + y_3 = 3 a_0 + a_1 (x_1 + x_2 + x_3)
y_1 x_1 + y_2 x_2 + y_3 x_3 = a_0 (x_1 + x_2 + x_3) + a_1 (x_1^2 + x_2^2 + x_3^2)

To fit a linear polynomial to three data points: solve for a_0, a_1.
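The 2x2 system above can be solved directly. A minimal sketch in Python/NumPy (an assumption here; the course materials use MATLAB), with three hypothetical data points:

```python
import numpy as np

# Three hypothetical data points (x_i, y_i)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])

# The two minimization conditions in matrix form:
#   sum(y)   = 3*a0  + a1*sum(x)
#   sum(x*y) = a0*sum(x) + a1*sum(x**2)
M = np.array([[len(x), x.sum()],
              [x.sum(), (x**2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
a0, a1 = np.linalg.solve(M, rhs)
```

At the solution, the residuals satisfy both zero-slope conditions: they sum to zero, and their x-weighted sum is also zero.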
Linear Least-Squares Regression
S - Sum of the squared errors.
To minimize S (error between function & data), we take partial derivatives w.r.t. the function parameters and set to zero.
ASSUME f(x) is an np-order polynomial:

f(x) = \sum_{k=0}^{n_p} a_k x^k

S = \sum_{i=1}^{N} [y_i - f(x_i)]^2 = \sum_{i=1}^{N} \left( y_i - \sum_{k=0}^{n_p} a_k x_i^k \right)^2

np+1 equations for np+1 unknowns (a_k). The equations are linear w.r.t. the a_k.
Hoffman §4.10.3
“Linear Least Squares” regression.
To minimize S, set each partial derivative to zero:

\frac{\partial S}{\partial a_k} = \sum_{i=1}^{N} -2 x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

(i - data point index, j - dummy summation index, k - polynomial coefficient index)

Only for polynomials.

[Figure: data points (x_i, y_i) and the regression curve f(x); axes x and y.]
Example: np=1 (Linear Polynomial)
Linear polynomial, np=1:

f(x) = a_0 + a_1 x

S = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i]^2 = \sum_{i=1}^{N} g_i^2, \qquad g_i = y_i - a_0 - a_1 x_i

\frac{\partial S}{\partial a_0} = \sum_{i=1}^{N} 2(-1)[y_i - a_0 - a_1 x_i]

\frac{\partial S}{\partial a_1} = \sum_{i=1}^{N} 2(-x_i)[y_i - a_0 - a_1 x_i]

Setting these to zero (here we divided the entire equation by -2; why?):

0 = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i], \qquad 0 = \sum_{i=1}^{N} [y_i - a_0 - a_1 x_i] x_i

2 equations, 2 unknowns (a_0, a_1):

\sum_{i=1}^{N} y_i = \sum_{i=1}^{N} (a_0 + a_1 x_i), \qquad \sum_{i=1}^{N} y_i x_i = \sum_{i=1}^{N} (a_0 + a_1 x_i) x_i
After dividing by -2, the general condition is

\frac{\partial S}{\partial a_k}: \quad \sum_{i=1}^{N} x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

Apply the chain rule. With g_i = y_i - a_0 - a_1 x_i:

\frac{\partial S}{\partial a_0} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial a_0} = \sum_{i=1}^{N} (2 g_i)(-1) = \sum_{i=1}^{N} -2 (y_i - a_0 - a_1 x_i)

\frac{\partial S}{\partial a_1} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial a_1} = \sum_{i=1}^{N} (2 g_i)(-x_i) = \sum_{i=1}^{N} -2 x_i (y_i - a_0 - a_1 x_i)
Example (cont'd.)

\sum_{i=1}^{N} y_i = \sum_{i=1}^{N} (a_0 + a_1 x_i), \qquad \sum_{i=1}^{N} y_i x_i = \sum_{i=1}^{N} (a_0 + a_1 x_i) x_i

2 equations, 2 unknowns. Let's put these in matrix form...

Step 1: define the solution variable vector, (a_0, a_1)^T.
Step 2: define the matrix and RHS vector.

For N=4 points,

y_1 + y_2 + y_3 + y_4 = (a_0 + a_1 x_1) + (a_0 + a_1 x_2) + (a_0 + a_1 x_3) + (a_0 + a_1 x_4)
y_1 x_1 + y_2 x_2 + y_3 x_3 + y_4 x_4 = (a_0 + a_1 x_1) x_1 + (a_0 + a_1 x_2) x_2 + (a_0 + a_1 x_3) x_3 + (a_0 + a_1 x_4) x_4

In general:

\begin{pmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

Linear least-squares regression for a linear polynomial.
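The same assembly generalizes to any polynomial order np: entry (k, j) of the matrix is \sum_i x_i^{k+j} and entry k of the RHS is \sum_i x_i^k y_i. A sketch in Python/NumPy (an assumption; the course uses MATLAB), with a hypothetical helper `poly_least_squares` and made-up data:

```python
import numpy as np

def poly_least_squares(x, y, np_order):
    """Assemble and solve the (np+1)x(np+1) linear least-squares system
    for an np_order polynomial fit of the data (x, y)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = np_order + 1
    M = np.empty((n, n))
    rhs = np.empty(n)
    for k in range(n):
        for j in range(n):
            M[k, j] = np.sum(x**(k + j))   # sum_i x_i^(k+j)
        rhs[k] = np.sum(x**k * y)          # sum_i x_i^k * y_i
    return np.linalg.solve(M, rhs)         # coefficients a_0 ... a_np

# Data generated from an exact quadratic should be recovered exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 0.5 * x**2
a = poly_least_squares(x, y, 2)   # approximately [1.0, 2.0, 0.5]
```

Since N > np + 1, this is a least-squares fit; with exact polynomial data the residual is zero and the coefficients come back unchanged.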
[Figure: Arrhenius plot of ln(k) (1/s) vs. -1/RT (mol/J) for the data below, with the best-fit line; the fit gives ln(A) = 38.9246 and Ea = 121481.5094.]
Example - Reaction Rate Constant
T (K)    k (1/s)
313      0.00043
319      0.00103
323      0.0018
328      0.00355
333      0.00717

(A - pre-exponential factor; Ea - activation energy; T - temperature; gas constant R = 8.314 J/(mol·K).)

Reaction: benzenediazonium chloride → chlorobenzene + N2, with rate "constant" k.
The rate expression

k = A \exp\left( -\frac{E_a}{RT} \right)

becomes linear after taking the logarithm (recall: ln(ab) = ln(a) + ln(b)):

\ln(k) = \ln(A) - \frac{E_a}{RT}

which matches y = a_0 + a_1 x with y = \ln(k), x = -\frac{1}{RT}, a_0 = \ln(A), a_1 = E_a.
Note: we need to calculate A (pre-exponential factor) from a0.
\begin{pmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}, \qquad b_0 = \sum_{i=1}^{N} y_i, \quad b_1 = \sum_{i=1}^{N} x_i y_i
Now we are ready to go to the computer to determine a0 and a1.
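A sketch of that computer step in Python/NumPy (an assumption; the course materials use MATLAB), using the table above. The result should match the fitted values shown in the figure, ln(A) ≈ 38.92 and Ea ≈ 1.21e5 J/mol:

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)
T = np.array([313.0, 319.0, 323.0, 328.0, 333.0])
k = np.array([0.00043, 0.00103, 0.0018, 0.00355, 0.00717])

# Transform to linear form: ln(k) = ln(A) - Ea/(R*T)
x = -1.0 / (R * T)   # abscissa, -1/RT
y = np.log(k)        # ordinate, ln(k)

# Assemble and solve the 2x2 linear least-squares system
N = len(x)
M = np.array([[N, x.sum()], [x.sum(), (x**2).sum()]])
b = np.array([y.sum(), (x * y).sum()])
a0, a1 = np.linalg.solve(M, b)

A = np.exp(a0)   # pre-exponential factor, 1/s (recovered from a0)
Ea = a1          # activation energy, J/mol
```

Note the exponentiation at the end: the regression gives a0 = ln(A), not A itself, exactly as the slide warns.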
Polynomial Regression & The Normal Equations (Alternative Formulation for Linear Least Squares)
Given N > np observations (x_i, y_i) and an np-order polynomial, find the coefficients a_j.
NOTE: this is an overdetermined (more equations than unknowns) linear problem for the coefficients a_j.
p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n_p} x^{n_p}

One equation for each observation (N equations).
Example - linear polynomial: p(x) = a0 + a1x
Another form of linear least-squares regression.
\underbrace{\begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n_p} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n_p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^{n_p} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n_p} \end{pmatrix}}_{\alpha} = \underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}}_{b}

"Normal Equations": A^T A \alpha = A^T b
For the linear polynomial, the overdetermined system is

\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}

and premultiplying both sides by A^T gives

\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{pmatrix} \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
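The normal-equations route maps directly onto a few lines of linear algebra. A sketch in Python/NumPy (an assumption; the course uses MATLAB) with made-up data, checking that solving A^T A α = A^T b agrees with the library's least-squares solver:

```python
import numpy as np

# Hypothetical data for a linear fit p(x) = a0 + a1*x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Design matrix A: one row [1, x_i] per observation (N equations, 2 unknowns)
A = np.column_stack([np.ones_like(x), x])

# Normal equations: (A^T A) alpha = A^T b
alpha = np.linalg.solve(A.T @ A, A.T @ y)

# The library least-squares solver gives the same coefficients
alpha_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
```

(`lstsq` uses an orthogonal factorization internally, which is better conditioned than forming A^T A explicitly, but for well-scaled problems the two answers coincide.)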
The Two are One...

Consider each of the previous approaches for a first-order polynomial.

Direct least squares: the conditions

\frac{\partial S}{\partial a_k}: \quad \sum_{i=1}^{N} x_i^k \left[ y_i - \sum_{j=0}^{n_p} a_j x_i^j \right] = 0, \qquad k = 0 \ldots n_p

give, for k=0 and k=1,

\sum_{i=1}^{N} (y_i - a_0 - a_1 x_i) = 0, \qquad \sum_{i=1}^{N} x_i (y_i - a_0 - a_1 x_i) = 0

or, rearranged,

a_0 \sum_{i=1}^{N} 1 + a_1 \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i, \qquad a_0 \sum_{i=1}^{N} x_i + a_1 \sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} x_i y_i

i.e.

\begin{pmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

Matrix transpose approach: with

A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix}, \qquad \alpha = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix}, \qquad b = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix},

the normal equations A^T A \alpha = A^T b have

A^T A = \begin{pmatrix} N & \sum_{i=1}^{N} x_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 \end{pmatrix}, \qquad A^T b = \begin{pmatrix} \sum_{i=1}^{N} y_i \\ \sum_{i=1}^{N} x_i y_i \end{pmatrix}

the same system as direct least squares. The matrix transpose approach is typically most convenient for linear regression problems.
Example - Reaction Rate Constant (cont'd.)

Using the same data and linearization as before (y = \ln(k), x = -\frac{1}{RT}, a_0 = \ln(A), a_1 = E_a; as before, A must be recovered from a_0), the overdetermined system is

\underbrace{\begin{pmatrix} 1 & -\frac{1}{RT_1} \\ 1 & -\frac{1}{RT_2} \\ \vdots & \vdots \\ 1 & -\frac{1}{RT_N} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} a_0 \\ a_1 \end{pmatrix}}_{\alpha} = \underbrace{\begin{pmatrix} \ln(k_1) \\ \ln(k_2) \\ \vdots \\ \ln(k_N) \end{pmatrix}}_{b}

and we solve the normal equations A^T A \alpha = A^T b.
Linear Least Squares Regression in MATLAB

Do it "manually" - the way that we just showed.
• See the MATLAB code for the previous example posted on the class website.
• This is my favored method, and it provides maximum flexibility!

Polynomial regression: p = polyfit(x,y,n)
• gives the "best fit" for a polynomial of order n through the data.
• polyval(p,xi) evaluates the resulting polynomial at xi.
• if n == (length(x)-1) then you get an interpolant.
• if n < (length(x)-1) then you get a least-squares fit.
• You still must get the problem into polynomial form.

BEFORE you go to the computer, formulate the problem as a polynomial regression problem!
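NumPy offers analogous functions (np.polyfit / np.polyval); a sketch of the same workflow with made-up data (NumPy as a stand-in for MATLAB, an assumption):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 1.9, 4.1, 8.8, 17.2])

# n = 2 < len(x)-1, so this is a least-squares fit, not an interpolant
p = np.polyfit(x, y, 2)       # coefficients, highest order first
y_fit = np.polyval(p, x)      # evaluate the fitted polynomial at x

# n == len(x)-1 gives an interpolant that passes through every point
p_interp = np.polyfit(x, y, len(x) - 1)
```

Note one convention difference worth remembering: both MATLAB's polyfit and NumPy's return the highest-order coefficient first.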
The “R2” Value
How well does the regressed line fit the data?
\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i \quad \text{(average value of } y_i \text{)}

R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - f(x_i))^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}

R^2 = 1 ⇒ perfect fit.

• (x_i, y_i) - data points that we are regressing.
• f(x) - function we are regressing to.
• f(x_i) - regression function value at x_i.
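The R² formula translates directly into code. A sketch in Python/NumPy (an assumption; the course uses MATLAB), with a hypothetical helper `r_squared`:

```python
import numpy as np

def r_squared(y, y_fit):
    """R^2 = 1 - SS_res/SS_tot for data y and model predictions y_fit."""
    y, y_fit = np.asarray(y, float), np.asarray(y_fit, float)
    ss_res = np.sum((y - y_fit)**2)        # sum of squared residuals
    ss_tot = np.sum((y - y.mean())**2)     # total variation about the mean
    return 1.0 - ss_res / ss_tot

# A perfect fit has zero residual, so R^2 = 1
y = np.array([1.0, 2.0, 3.0])
perfect = r_squared(y, y)   # → 1.0
```
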
“Nonlinear” Least Squares Regression
S = \sum_{i=1}^{N} [y_i - f(x_i)]^2

Same approach as before, but now the parameters of f(x) may enter nonlinearly!

Assume f(x) has n parameters a_k that we want to determine via regression:

\frac{\partial S}{\partial a_k} = 0, \qquad k = 1 \ldots n

Example: f(x) = a x^b

S = \sum_{i=1}^{N} \left( y_i - a x_i^b \right)^2 = \sum_{i=1}^{N} g_i^2, \qquad g_i = y_i - a x_i^b

Apply the chain rule:

\frac{\partial S}{\partial a} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial a} = \sum_{i=1}^{N} 2 g_i \left( -x_i^b \right) = \sum_{i=1}^{N} -2 x_i^b \left( y_i - a x_i^b \right)

\frac{\partial S}{\partial b} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial b} = \sum_{i=1}^{N} 2 g_i \left( -a x_i^b \ln(x_i) \right) = \sum_{i=1}^{N} -2 a x_i^b \ln(x_i) \left( y_i - a x_i^b \right)

Setting these to zero:

0 = \sum_{i=1}^{N} x_i^b \left( y_i - a x_i^b \right), \qquad 0 = \sum_{i=1}^{N} x_i^b \ln(x_i) \left( y_i - a x_i^b \right)

Two nonlinear equations to solve for a and b.

Can we reformulate this as a linear problem?
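One answer to the reformulation question follows the same log transform used for the Arrhenius example: taking the logarithm of f(x) = a x^b gives ln(y) = ln(a) + b ln(x), which is linear in ln(a) and b. A sketch in Python/NumPy (an assumption; the course uses MATLAB), with data generated from a known a and b:

```python
import numpy as np

# Hypothetical data generated from y = 2 * x^1.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x**1.5

# Linearize: ln(y) = ln(a) + b*ln(x), i.e. Y = c0 + c1*X
X, Y = np.log(x), np.log(y)
A = np.column_stack([np.ones_like(X), X])
c0, c1 = np.linalg.solve(A.T @ A, A.T @ Y)

a = np.exp(c0)   # approximately 2.0
b = c1           # approximately 1.5
```

One caveat: the linearized fit minimizes the error in ln(y), not the original S, so for noisy data its answer generally differs from the true nonlinear least-squares fit.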
Kinetics Example Revisited

k = A \exp\left( -\frac{E_a}{RT} \right)

(A - pre-exponential factor; E_a - activation energy; T - temperature; gas constant R = 8.314 J/(mol·K); k is the rate "constant".)

Sum of squared errors:

S = \sum_{i=1}^{N} \left[ y_i - A \exp\left( -\frac{E_a}{RT_i} \right) \right]^2 = \sum_{i=1}^{N} g_i^2, \qquad g_i = y_i - A \exp\left( -\frac{E_a}{RT_i} \right)

Minimize S w.r.t. A and E_a. Apply the chain rule:

\frac{\partial S}{\partial A} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial A} = \sum_{i=1}^{N} 2 g_i \left[ -\exp\left( -\frac{E_a}{RT_i} \right) \right] = \sum_{i=1}^{N} -2 \exp\left( -\frac{E_a}{RT_i} \right) \left[ y_i - A \exp\left( -\frac{E_a}{RT_i} \right) \right]

\frac{\partial S}{\partial E_a} = \sum_{i=1}^{N} \frac{\partial S}{\partial g_i} \frac{\partial g_i}{\partial E_a} = \sum_{i=1}^{N} 2 g_i \left[ \frac{A}{RT_i} \exp\left( -\frac{E_a}{RT_i} \right) \right]

Setting the derivatives to zero:

0 = \sum_{i=1}^{N} \exp\left( -\frac{E_a}{RT_i} \right) \left[ y_i - A \exp\left( -\frac{E_a}{RT_i} \right) \right]

0 = \sum_{i=1}^{N} \frac{1}{T_i} \exp\left( -\frac{E_a}{RT_i} \right) \left[ y_i - A \exp\left( -\frac{E_a}{RT_i} \right) \right]

2 nonlinear equations with 2 unknowns, A and E_a. We will show how to solve this soon!
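The slides defer the solution method. As a preview, these stationarity conditions are equivalent to minimizing S directly, which a nonlinear solver can do. A sketch using SciPy's least_squares (an assumption, not necessarily the method the course will present), seeded with the linearized Arrhenius-plot fit and with E_a scaled so both unknowns are comparable in magnitude:

```python
import numpy as np
from scipy.optimize import least_squares

R = 8.314  # J/(mol K)
T = np.array([313.0, 319.0, 323.0, 328.0, 333.0])
k = np.array([0.00043, 0.00103, 0.0018, 0.00355, 0.00717])

# Residuals g_i = k_i - A*exp(-Ea/(R*T_i)), parameterized as
# theta = (ln(A), Ea/1e5) to keep both unknowns well scaled
def residuals(theta):
    lnA, ea = theta
    return k - np.exp(lnA - ea * 1.0e5 / (R * T))

# Start from the linearized fit: ln(A) ≈ 38.9, Ea ≈ 1.21e5 J/mol
sol = least_squares(residuals, x0=[38.9, 1.21])
lnA, Ea = sol.x[0], sol.x[1] * 1.0e5
```

Because the raw residuals (not their logs) are minimized here, the answer generally differs slightly from the linearized Arrhenius-plot fit; with only five points the two should remain close.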