
CHAPTER V

Interpolation and Regression

Numerical Methods for Eng [ENGR 391] [Lyes KADEM 2007]

Topics
Interpolation: Direct Method; Newton’s Divided Difference; Lagrangian Interpolation; Spline Interpolation.
Regression: Linear and non-linear.

1. What is interpolation?

A function $y = f(x)$ is often given only at discrete points, such as $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), (x_n, y_n)$. How does one find the value of y at any other value of x? Well, a continuous function $f(x)$ may be used to represent the n+1 data values, with $f(x)$ passing through the n+1 points. Then we can find the value of y at any other value of x. This is called interpolation. Of course, if x falls outside the range of x for which the data are given, it is no longer interpolation but is instead called extrapolation.

So what kind of function $f(x)$ should we choose? A polynomial is a common choice for an interpolating function because polynomials are easy to

- Evaluate,
- Differentiate, and
- Integrate,

as opposed to other choices such as a sine or exponential series. Polynomial interpolation involves finding a polynomial of order n that passes through the n+1 points. One of the methods is called the direct method of interpolation. Other methods include Newton’s divided difference polynomial method and the Lagrangian interpolation method.

[Figure 5.1. A function f(x) passing through the data points (x0, y0), (x1, y1), (x2, y2), (x3, y3).]


1.2. Direct Method

The direct method of interpolation is based on the following principle. Given n+1 data points, fit a polynomial of order n,

$$y = a_0 + a_1 x + \dots + a_n x^n \qquad (1)$$

through the data, where $a_0, a_1, \ldots, a_n$ are n+1 real constants. Since n+1 values of y are given at n+1 values of x, one can write n+1 equations; the n+1 constants $a_0, a_1, \ldots, a_n$ can then be found by solving the n+1 simultaneous linear equations (Ahaaa!!! Do you remember the previous course!!!). To find the value of y at a given value of x, simply substitute that value of x into the polynomial.

It is not necessary to use all the data points, however. How does one then choose the order of the polynomial and which data points to use? This concept and the direct method of interpolation are best illustrated using an example.

1.2.1. Example

The upward velocity of a rocket is given as a function of time in Table 1.

Table 1. Velocity as a function of time

t [s]    v(t) [m/s]
0        0
10       227.04
15       362.78
20       517.35
22.5     602.97
30       901.67

1. Determine the value of the velocity at t = 16 s using the direct method and a first-order polynomial.
2. Determine the value of the velocity at t = 16 s using the direct method and a third-order polynomial interpolation (a sketch in code follows below).
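As an illustration, here is a minimal sketch of the direct method, assuming NumPy is available: build the Vandermonde system for the chosen points, solve for the coefficients, and evaluate. Picking the points closest to (and bracketing) t = 16 s is the usual convention; the printed values are approximate.

```python
import numpy as np

# Data from Table 1 (time [s], velocity [m/s])
t = np.array([0.0, 10.0, 15.0, 20.0, 22.5, 30.0])
v = np.array([0.0, 227.04, 362.78, 517.35, 602.97, 901.67])

def direct_interpolation(x_pts, y_pts, x):
    """Fit y = a0 + a1*x + ... + an*x^n through the given points by
    solving the (n+1)x(n+1) Vandermonde system, then evaluate at x."""
    A = np.vander(x_pts, increasing=True)   # rows are [1, x_i, x_i^2, ...]
    a = np.linalg.solve(A, y_pts)           # coefficients a0..an
    return sum(ak * x**k for k, ak in enumerate(a))

# First order: the two points bracketing t = 16 s (t = 15 and t = 20)
print(direct_interpolation(t[2:4], v[2:4], 16.0))   # ~393.7 m/s

# Third order: the four points closest to t = 16 s (t = 10, 15, 20, 22.5)
print(direct_interpolation(t[1:5], v[1:5], 16.0))   # ~392.06 m/s
```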


[Figure 5.2. Velocity vs. time data for the rocket example: v(t) [m/s] vs. t [s].]

1.3. Newton’s divided difference interpolation

To illustrate this method, we will start with linear and quadratic interpolation; then the general form of the Newton’s divided difference polynomial method will be presented.

1.3.1. Linear interpolation

Given $(x_0, y_0)$ and $(x_1, y_1)$, fit a linear interpolant through the data. Note that $y_0 = f(x_0)$ and $y_1 = f(x_1)$; assuming a linear interpolant means

$$f_1(x) = b_0 + b_1 (x - x_0)$$

Since at $x = x_0$:

$$f_1(x_0) = f(x_0) = b_0 + b_1 (x_0 - x_0) = b_0$$

and at $x = x_1$:

$$f_1(x_1) = f(x_1) = b_0 + b_1 (x_1 - x_0) = f(x_0) + b_1 (x_1 - x_0)$$

Then

$$b_0 = f(x_0), \qquad b_1 = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$

And the linear interpolant,

$$f_1(x) = b_0 + b_1 (x - x_0)$$


becomes

$$f_1(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0} (x - x_0)$$

1.3.2. Quadratic interpolation

Given $(x_0, y_0)$, $(x_1, y_1)$, and $(x_2, y_2)$, fit a quadratic interpolant through the data. Note that $y = f(x)$, $y_0 = f(x_0)$, $y_1 = f(x_1)$, and $y_2 = f(x_2)$. Assume the quadratic interpolant $f_2(x)$ is given by

$$f_2(x) = b_0 + b_1 (x - x_0) + b_2 (x - x_0)(x - x_1)$$

At $x = x_0$:

$$f_2(x_0) = f(x_0) = b_0 + b_1 (x_0 - x_0) + b_2 (x_0 - x_0)(x_0 - x_1) = b_0$$

so

$$b_0 = f(x_0)$$

At $x = x_1$:

$$f_2(x_1) = f(x_1) = b_0 + b_1 (x_1 - x_0) + b_2 (x_1 - x_0)(x_1 - x_1) = f(x_0) + b_1 (x_1 - x_0)$$

then

$$b_1 = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$

At $x = x_2$:

$$f_2(x_2) = f(x_2) = b_0 + b_1 (x_2 - x_0) + b_2 (x_2 - x_0)(x_2 - x_1)$$

$$f(x_2) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0} (x_2 - x_0) + b_2 (x_2 - x_0)(x_2 - x_1)$$

then

$$b_2 = \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0}$$

Hence the quadratic interpolant is given by

$$f_2(x) = b_0 + b_1 (x - x_0) + b_2 (x - x_0)(x - x_1)$$

$$f_2(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0} (x - x_0) + \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0} (x - x_0)(x - x_1)$$


[Figure 5.4. Quadratic interpolation.]

1.3.3. General Form of Newton’s Divided Difference Polynomial

In the two previous cases, we saw how linear and quadratic interpolants are derived by Newton’s divided difference polynomial method. Let us analyze the quadratic interpolant formula

$$f_2(x) = b_0 + b_1 (x - x_0) + b_2 (x - x_0)(x - x_1)$$

where

$$b_0 = f(x_0)$$

$$b_1 = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$

$$b_2 = \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0}$$

Note that $b_0$, $b_1$, and $b_2$ are finite divided differences: $b_0$, $b_1$, and $b_2$ are the first, second, and third finite divided differences, respectively. Denoting the first divided difference by

$$f[x_0] = f(x_0)$$

the second divided difference by

$$f[x_1, x_0] = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$

and the third divided difference by

$$f[x_2, x_1, x_0] = \frac{f[x_2, x_1] - f[x_1, x_0]}{x_2 - x_0} = \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0}$$


where $f[x_0]$, $f[x_1, x_0]$, and $f[x_2, x_1, x_0]$ are called bracketed functions of their variables enclosed in square brackets. We can write

$$f_2(x) = f[x_0] + f[x_1, x_0](x - x_0) + f[x_2, x_1, x_0](x - x_0)(x - x_1)$$

This leads to the general form of the Newton’s divided difference polynomial for n+1 data points $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), (x_n, y_n)$ as

$$f_n(x) = b_0 + b_1 (x - x_0) + \dots + b_n (x - x_0)(x - x_1) \cdots (x - x_{n-1})$$

where

$$b_0 = f[x_0]$$
$$b_1 = f[x_1, x_0]$$
$$b_2 = f[x_2, x_1, x_0]$$
$$\vdots$$
$$b_{n-1} = f[x_{n-1}, x_{n-2}, \ldots, x_0]$$
$$b_n = f[x_n, x_{n-1}, \ldots, x_0]$$

where the definition of the m-th divided difference is

$$b_m = f[x_m, \ldots, x_0] = \frac{f[x_m, \ldots, x_1] - f[x_{m-1}, \ldots, x_0]}{x_m - x_0}$$

From the above definition, it can be seen that the divided differences are calculated recursively. For example, for a third-order polynomial, given $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$,

$$f_3(x) = f[x_0] + f[x_1, x_0](x - x_0) + f[x_2, x_1, x_0](x - x_0)(x - x_1) + f[x_3, x_2, x_1, x_0](x - x_0)(x - x_1)(x - x_2)$$

The divided differences can be arranged in a table, whose top diagonal holds the coefficients:

                      b0             b1                 b2                      b3
x_0   f(x_0)
                  f[x_1, x_0]
x_1   f(x_1)                   f[x_2, x_1, x_0]
                  f[x_2, x_1]                     f[x_3, x_2, x_1, x_0]
x_2   f(x_2)                   f[x_3, x_2, x_1]
                  f[x_3, x_2]
x_3   f(x_3)
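A minimal sketch of the recursive divided difference table and the nested evaluation of the resulting polynomial, assuming NumPy is available (the function name is illustrative; the printed value is approximate):

```python
import numpy as np

def newton_divided_difference(x_pts, y_pts, x):
    """Evaluate the Newton divided difference polynomial through
    (x_pts[i], y_pts[i]) at x. The coefficients b_0..b_n sit on the
    top row of the recursively built divided difference table."""
    n = len(x_pts)
    table = np.zeros((n, n))
    table[:, 0] = y_pts
    for j in range(1, n):
        for i in range(n - j):      # f[x_i..x_{i+j}] from two shorter differences
            table[i, j] = (table[i + 1, j - 1] - table[i, j - 1]) / \
                          (x_pts[i + j] - x_pts[i])
    b = table[0, :]                 # b_m = f[x_m, ..., x_0]
    result = b[n - 1]
    for m in range(n - 2, -1, -1):  # nested (Horner-like) evaluation
        result = result * (x - x_pts[m]) + b[m]
    return result

# Rocket example: the four points closest to t = 16 s
t = np.array([10.0, 15.0, 20.0, 22.5])
v = np.array([227.04, 362.78, 517.35, 602.97])
print(newton_divided_difference(t, v, 16.0))   # ~392.06 m/s
```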


1.4. Lagrangian Interpolation

Polynomial interpolation involves finding a polynomial of order n that passes through the n+1 points. One of the methods to find this polynomial is called Lagrangian interpolation. The Lagrangian interpolating polynomial is given by

$$f_n(x) = \sum_{i=0}^{n} L_i(x)\, f(x_i)$$

where n in $f_n(x)$ stands for the n-th order polynomial that approximates the function $y = f(x)$ given at n+1 data points $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), (x_n, y_n)$, and

$$L_i(x) = \prod_{\substack{j = 0 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j}$$

$L_i(x)$ is a weighting function that includes a product of n terms, with the term $j = i$ omitted.
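A minimal sketch of this formula in plain Python (the function name is illustrative; the printed value is approximate):

```python
def lagrange_interpolation(x_pts, y_pts, x):
    """Evaluate the Lagrangian interpolating polynomial
    f_n(x) = sum_i L_i(x) * y_i at the point x."""
    total = 0.0
    n = len(x_pts)
    for i in range(n):
        L_i = 1.0
        for j in range(n):
            if j != i:              # the j == i term is omitted
                L_i *= (x - x_pts[j]) / (x_pts[i] - x_pts[j])
        total += L_i * y_pts[i]
    return total

# Rocket example again: third-order Lagrangian interpolation at t = 16 s
print(lagrange_interpolation([10.0, 15.0, 20.0, 22.5],
                             [227.04, 362.78, 517.35, 602.97],
                             16.0))   # ~392.06 m/s
```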

Example: Use the same previous data for the upward velocity of a rocket to determine the value of the velocity at t = 16 s, using third-order polynomial interpolation with Newton’s divided difference polynomial.

Example: Use the same previous data for the upward velocity of a rocket to determine the value of the velocity at t = 16 s, using third-order Lagrangian polynomial interpolation.

1.5. Spline Method of Interpolation

The spline method was introduced to overcome one of the drawbacks of polynomial interpolation: when the order n becomes large, in many cases oscillations appear in the resulting polynomial. This was shown by Runge when he interpolated data based on the simple function

$$y = \frac{1}{1 + 25 x^2}$$

on the interval [-1, 1]. For example, take six equidistantly spaced points in [-1, 1] and find y at these points, as given in Table 1.


Table 1: Six equidistantly spaced points in [-1, 1]

x      y = 1/(1 + 25x^2)
-1.0   0.038461
-0.6   0.1
-0.2   0.5
 0.2   0.5
 0.6   0.1
 1.0   0.038461

Now through these six points we can pass a fifth-order polynomial,

$$f_5(x) = 0.56731 - 1.7308\, x^2 + 1.2019\, x^4, \qquad -1 \le x \le 1$$

[Figure 5.5. 5th-order polynomial vs. exact function.]

When plotting the fifth-order polynomial and the original function, you can notice that the two do not match well. So maybe you will consider choosing more points in the interval [-1, 1] to get a better match, but it diverges even more (see the figure below). In fact, Runge found that as the order of the polynomial becomes infinite, the polynomial diverges in the intervals -1 ≤ x < -0.726 and 0.726 < x ≤ 1.

[Figure 5.6. Higher-order polynomial interpolation is a bad idea.]
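A short sketch of this divergence, assuming NumPy is available: as the number of equidistant points grows, the maximum interpolation error on [-1, 1] gets worse near the endpoints, not better.

```python
import numpy as np

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)

# Interpolate with more and more equidistant points and watch the
# maximum error on [-1, 1] grow near the endpoints (Runge phenomenon).
for n_pts in (6, 11, 16):
    x_pts = np.linspace(-1.0, 1.0, n_pts)
    coeffs = np.polyfit(x_pts, runge(x_pts), n_pts - 1)  # exact-degree fit
    x_fine = np.linspace(-1.0, 1.0, 1001)
    err = np.max(np.abs(np.polyval(coeffs, x_fine) - runge(x_fine)))
    print(f"{n_pts} points -> max error {err:.3f}")
```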


1.5.1. Linear spline interpolation

Given $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), (x_n, y_n)$, fit linear splines to the data. This simply involves connecting consecutive data points by straight lines. So if the above data are given in ascending order, the linear splines are given by ($y_i = f(x_i)$):

[Figure 5.7. Linear splines.]

$$f(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0}(x - x_0), \qquad x_0 \le x \le x_1$$

$$f(x) = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x - x_1), \qquad x_1 \le x \le x_2$$

$$\vdots$$

$$f(x) = f(x_{n-1}) + \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}(x - x_{n-1}), \qquad x_{n-1} \le x \le x_n$$

Note that the terms

$$\frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}}$$

in the above expressions are simply the slopes between $x_{i-1}$ and $x_i$.

1.5.2. Quadratic Splines

In these splines, a quadratic polynomial approximates the data between two consecutive data points. The splines are given by


$$f(x) = a_1 x^2 + b_1 x + c_1, \qquad x_0 \le x \le x_1$$
$$f(x) = a_2 x^2 + b_2 x + c_2, \qquad x_1 \le x \le x_2$$
$$\vdots$$
$$f(x) = a_n x^2 + b_n x + c_n, \qquad x_{n-1} \le x \le x_n$$

Now, how do we find the coefficients of these quadratic splines? There are 3n such coefficients:

$a_i$, i = 1, 2, ..., n
$b_i$, i = 1, 2, ..., n
$c_i$, i = 1, 2, ..., n

To find the 3n unknowns, we need 3n equations, which are then solved simultaneously. These 3n equations are found as follows.

1) Each quadratic spline goes through two consecutive data points:

$$a_1 x_0^2 + b_1 x_0 + c_1 = f(x_0)$$
$$a_1 x_1^2 + b_1 x_1 + c_1 = f(x_1)$$
$$\vdots$$
$$a_i x_{i-1}^2 + b_i x_{i-1} + c_i = f(x_{i-1})$$
$$a_i x_i^2 + b_i x_i + c_i = f(x_i)$$
$$\vdots$$
$$a_n x_{n-1}^2 + b_n x_{n-1} + c_n = f(x_{n-1})$$
$$a_n x_n^2 + b_n x_n + c_n = f(x_n)$$

This condition gives 2n equations, as there are n quadratic splines, each going through two consecutive data points.

2) The first derivatives of two quadratic splines are continuous at the interior points. For example, the derivative of the first spline, $a_1 x^2 + b_1 x + c_1$, is $2 a_1 x + b_1$; the derivative of the second spline, $a_2 x^2 + b_2 x + c_2$, is $2 a_2 x + b_2$; and the two are equal at $x = x_1$, giving

$$2 a_1 x_1 + b_1 = 2 a_2 x_1 + b_2$$
$$2 a_1 x_1 + b_1 - 2 a_2 x_1 - b_2 = 0$$

Similarly, at the other interior points,

$$2 a_2 x_2 + b_2 - 2 a_3 x_2 - b_3 = 0$$
$$\vdots$$
$$2 a_i x_i + b_i - 2 a_{i+1} x_i - b_{i+1} = 0$$
$$\vdots$$
$$2 a_{n-1} x_{n-1} + b_{n-1} - 2 a_n x_{n-1} - b_n = 0$$

Since there are n-1 interior points, we have n-1 such equations. Now the total number of equations is (2n) + (n-1) = 3n-1, so we still need one more equation. We can assume that the first spline is linear, that is,

$$a_1 = 0$$

This gives us 3n equations and 3n unknowns. These can be solved by a number of techniques used to solve simultaneous linear equations.
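A minimal sketch, assuming NumPy is available, that assembles and solves this 3n x 3n system exactly as described above (the function name is illustrative; the printed value is approximate):

```python
import numpy as np

def quadratic_spline_coeffs(x, y):
    """Solve the 3n x 3n linear system for the quadratic spline
    coefficients a_i, b_i, c_i (i = 1..n): 2n interpolation conditions,
    n-1 slope-continuity conditions, plus a_1 = 0."""
    n = len(x) - 1                    # number of spline segments
    A = np.zeros((3 * n, 3 * n))
    r = np.zeros(3 * n)
    row = 0
    for i in range(n):                # each spline passes through its endpoints
        for xv, yv in ((x[i], y[i]), (x[i + 1], y[i + 1])):
            A[row, 3 * i:3 * i + 3] = [xv**2, xv, 1.0]
            r[row] = yv
            row += 1
    for i in range(n - 1):            # derivative continuity at interior points
        A[row, 3 * i:3 * i + 3] = [2 * x[i + 1], 1.0, 0.0]
        A[row, 3 * (i + 1):3 * (i + 1) + 3] = [-2 * x[i + 1], -1.0, 0.0]
        row += 1
    A[row, 0] = 1.0                   # extra condition: a_1 = 0
    return np.linalg.solve(A, r).reshape(n, 3)   # row i = (a, b, c) of segment i+1

# Rocket data: the segment covering [15, 20] contains t = 16 s
t = [0.0, 10.0, 15.0, 20.0, 22.5, 30.0]
v = [0.0, 227.04, 362.78, 517.35, 602.97, 901.67]
a, b, c = quadratic_spline_coeffs(t, v)[2]       # third segment
print(a * 16.0**2 + b * 16.0 + c)                # ~394.24 m/s
```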


2. Regression

2.2. What is regression?

Regression analysis gives information on the relationship between a response variable and one or more independent variables, to the extent that this information is contained in the data. The goal of regression analysis is to express the response variable as a function of the predictor variables. Quality of fit and accuracy of the conclusions depend on the data used; hence, non-representative or improperly compiled data result in poor fits and conclusions. Thus, for effective use of regression analysis one must:

- investigate the data collection process,
- discover any limitations in the data collected,
- restrict conclusions accordingly.

Once a regression relationship is obtained, it can be used to predict values of the response variable, identify variables that most affect the response, or verify hypothesized causal models of the response. The value of each predictor variable can be assessed through statistical tests on the estimated coefficients (multipliers) of the predictor variables.

2.3. Linear regression

Linear regression is the most popular regression model. In this model, we wish to predict the response to n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ by a regression model given by

$$y = a_0 + a_1 x$$

where $a_0$ and $a_1$ are the constants of the regression model. A measure of goodness of fit, that is, how well $a_0 + a_1 x$ predicts the response variable y, is the magnitude of the residual $\varepsilon_i$ at each of the n data points:

$$\varepsilon_i = y_i - (a_0 + a_1 x_i)$$

Ideally, if all the residuals $\varepsilon_i$ are zero, one may have found an equation in which all the points lie on the model. Thus, minimization of the residuals is an objective of obtaining the regression coefficients. The most popular method to minimize the residuals is the least squares method, where the estimates of the constants of the model are chosen such that the sum of the squared residuals is minimized, that is, minimize

$$\sum_{i=1}^{n} \varepsilon_i^2$$

Why minimize the sum of the squares of the residuals? Why not, for instance, minimize the sum of the residuals or the sum of the absolute values of the residuals? Alternatively, the constants of the model could be chosen such that the average residual is zero without making the individual residuals small. For example, let us analyze the data in the following table.

x     y
2.0   4.0
3.0   6.0
2.0   6.0
3.0   8.0

To explain these data with a straight-line regression model,

$$y = a_0 + a_1 x$$

and using minimization of $\sum_{i=1}^{n} \varepsilon_i$ as the criterion to find $a_0$ and $a_1$, we find that the line (Figure 5.8)

$$y = 4x - 4$$

gives a sum of residuals $\sum_{i=1}^{4} \varepsilon_i = 0$, as shown in the table below.

[Figure 5.8. Regression curve y = 4x - 4 for y vs. x data.]

x     y     y_predicted   ε = y - y_predicted
2.0   4.0   4.0            0.0
3.0   6.0   8.0           -2.0
2.0   6.0   4.0            2.0
3.0   8.0   8.0            0.0
                    Σ εi = 0

So does this give the smallest error? It does, since $\sum_{i=1}^{4} \varepsilon_i = 0$. But it does not give unique values for the parameters of the model. The straight-line model

$$y = 6$$


[Figure 5.9. Regression curve y = 6 for y vs. x data.]

also makes $\sum_{i=1}^{4} \varepsilon_i = 0$, as shown in the table below.

x     y     y_predicted   ε = y - y_predicted
2.0   4.0   6.0           -2.0
3.0   6.0   6.0            0.0
2.0   6.0   6.0            0.0
3.0   8.0   6.0            2.0
                    Σ εi = 0

Since this criterion does not give a unique regression model, it cannot be used for finding the regression coefficients. Why? Because we want to minimize

$$\sum_{i=1}^{n} \varepsilon_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$$

Differentiating this equation with respect to $a_0$ and $a_1$, we get

$$\frac{\partial}{\partial a_0} \sum_{i=1}^{n} \varepsilon_i = \sum_{i=1}^{n} (-1) = -n$$

$$\frac{\partial}{\partial a_1} \sum_{i=1}^{n} \varepsilon_i = -\sum_{i=1}^{n} x_i = -n\bar{x}$$



Setting these derivatives to zero would require n = 0, which is impossible. Therefore, unique values of $a_0$ and $a_1$ do not exist with this criterion.

You may think that the reason the minimization criterion $\sum_{i=1}^{n} \varepsilon_i$ does not work is that negative residuals cancel positive residuals. So would minimizing $\sum_{i=1}^{n} |\varepsilon_i|$ be a better criterion? Let us look at the data given below for the equation $y = 4x - 4$. It makes $\sum_{i=1}^{4} |\varepsilon_i| = 4$, as shown in the following table.

x     y     y_predicted   |ε| = |y - y_predicted|
2.0   4.0   4.0            0.0
3.0   6.0   8.0            2.0
2.0   6.0   4.0            2.0
3.0   8.0   8.0            0.0
                    Σ |εi| = 4

The value $\sum_{i=1}^{4} |\varepsilon_i| = 4$ also holds for the straight-line model y = 6, and no other straight line for these data has $\sum_{i=1}^{4} |\varepsilon_i| < 4$. Again, the regression coefficients are not unique, and hence this criterion also cannot be used for finding the regression model. Let us instead use the least squares criterion, where we minimize

$$S_r = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

$S_r$ is called the sum of the squares of the residuals.


[Figure 5.10. Linear regression of y vs. x data showing the residual at a typical point, x_i.]

To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:

$$\frac{\partial S_r}{\partial a_0} = 2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-1) = 0$$

$$\frac{\partial S_r}{\partial a_1} = 2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-x_i) = 0$$

giving

$$-\sum_{i=1}^{n} y_i + \sum_{i=1}^{n} a_0 + \sum_{i=1}^{n} a_1 x_i = 0$$

$$-\sum_{i=1}^{n} y_i x_i + \sum_{i=1}^{n} a_0 x_i + \sum_{i=1}^{n} a_1 x_i^2 = 0$$

Noting that $\sum_{i=1}^{n} a_0 = a_0 + a_0 + \dots + a_0 = n a_0$,

$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$

Solving the above equations gives



$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

$$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

Redefining

$$S_{xy} = \sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}$$

$$S_{xx} = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2$$

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$

we can rewrite

$$a_1 = \frac{S_{xy}}{S_{xx}}, \qquad a_0 = \bar{y} - a_1 \bar{x}$$
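A minimal sketch of these closed-form expressions, assuming NumPy is available; applied to the four-point example above, it returns the unique least squares line y = 1 + 2x:

```python
import numpy as np

def linear_regression(x, y):
    """Least squares fit of y = a0 + a1*x using the closed-form
    expressions a1 = S_xy / S_xx and a0 = ybar - a1*xbar derived above."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    S_xy = np.sum(x * y) - n * xbar * ybar
    S_xx = np.sum(x * x) - n * xbar**2
    a1 = S_xy / S_xx
    a0 = ybar - a1 * xbar
    return a0, a1

# The four-point example: unlike the earlier criteria, least squares
# yields a unique line.
print(linear_regression([2.0, 3.0, 2.0, 3.0], [4.0, 6.0, 6.0, 8.0]))  # (1.0, 2.0)
```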

2.4. Nonlinear models using least squares

2.4.1. Exponential model

Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, we can fit $y = a e^{bx}$ to the data, where a and b are the constants of the exponential model. The residual at each data point $x_i$ is

$$E_i = y_i - a e^{b x_i}$$

The sum of the squares of the residuals is

$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right)^2$$

To find the constants a and b of the exponential model, we minimize $S_r$ by differentiating with respect to a and b and equating the resulting equations to zero:


$$\frac{\partial S_r}{\partial a} = -2 \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right) e^{b x_i} = 0$$

$$\frac{\partial S_r}{\partial b} = -2 \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right) a x_i e^{b x_i} = 0$$

or

$$-\sum_{i=1}^{n} y_i e^{b x_i} + a \sum_{i=1}^{n} e^{2 b x_i} = 0$$

$$\sum_{i=1}^{n} y_i x_i e^{b x_i} - a \sum_{i=1}^{n} x_i e^{2 b x_i} = 0$$

These equations are nonlinear in a and b and thus not in a closed form that can be solved, as was the case for linear regression. In general, iterative methods must be used to find the values of a and b. However, in this case, a can be written explicitly in terms of b as

$$a = \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}}$$

Substituting gives

$$\sum_{i=1}^{n} y_i x_i e^{b x_i} - \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}} \sum_{i=1}^{n} x_i e^{2 b x_i} = 0$$

This equation is still nonlinear in b and can be solved by numerical methods such as the bisection method or the secant method.
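As a sketch, assuming SciPy is available, one can hand this equation to a bracketing root finder (here brentq, a Brent-method relative of bisection) and then recover a explicitly; the bracket and test data are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import brentq

def fit_exponential(x, y, b_lo=-1.0, b_hi=1.0):
    """Fit y = a*exp(b*x) by least squares: solve the scalar nonlinear
    equation for b derived above with a bracketing root finder, then
    recover a explicitly. The bracket [b_lo, b_hi] must straddle the root."""
    x, y = np.asarray(x, float), np.asarray(y, float)

    def g(b):
        e = np.exp(b * x)
        a = np.sum(y * e) / np.sum(e**2)           # a written in terms of b
        return np.sum(y * x * e) - a * np.sum(x * e**2)

    b = brentq(g, b_lo, b_hi)
    a = np.sum(y * np.exp(b * x)) / np.sum(np.exp(2 * b * x))
    return a, b

# Synthetic check: data generated from y = 2*exp(0.5*x) should be recovered.
x = np.array([0.0, 1.0, 2.0, 3.0])
print(fit_exponential(x, 2.0 * np.exp(0.5 * x)))   # ~(2.0, 0.5)
```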

2.4.2. Growth model

Growth models common in scientific fields have been developed and used successfully for specific situations. Growth models are used to describe how something grows with changes in a regressor variable (often time); examples in this category include the growth of a population with time. Growth models include

$$y = \frac{a}{1 + b e^{-cx}}$$

where a, b, and c are the constants of the model. At x = 0, $y = \dfrac{a}{1 + b}$, and as $x \to \infty$, $y \to a$.

The residual at each data point $x_i$ is

$$E_i = y_i - \frac{a}{1 + b e^{-c x_i}}$$

The sum of the squares of the residuals is


$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left( y_i - \frac{a}{1 + b e^{-c x_i}} \right)^2$$

To find the constants a, b, and c, we minimize $S_r$ by differentiating with respect to a, b, and c and equating the resulting equations to zero:

$$\frac{\partial S_r}{\partial a} = -2 \sum_{i=1}^{n} \left( y_i - \frac{a}{1 + b e^{-c x_i}} \right) \frac{1}{1 + b e^{-c x_i}} = 0$$

$$\frac{\partial S_r}{\partial b} = 2 \sum_{i=1}^{n} \left( y_i - \frac{a}{1 + b e^{-c x_i}} \right) \frac{a\, e^{-c x_i}}{\left( 1 + b e^{-c x_i} \right)^2} = 0$$

$$\frac{\partial S_r}{\partial c} = -2 \sum_{i=1}^{n} \left( y_i - \frac{a}{1 + b e^{-c x_i}} \right) \frac{a b x_i e^{-c x_i}}{\left( 1 + b e^{-c x_i} \right)^2} = 0$$

It is then possible to use the Newton-Raphson method to solve this set of simultaneous nonlinear equations for a, b, and c.
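A sketch, assuming SciPy is available, that solves these three gradient equations with scipy.optimize.fsolve as a stand-in for a hand-written Newton-Raphson iteration; the data and initial guess are illustrative assumptions, and the guess matters for convergence:

```python
import numpy as np
from scipy.optimize import fsolve

def fit_growth(x, y, guess=(1.0, 1.0, 1.0)):
    """Fit y = a / (1 + b*exp(-c*x)) by driving the three gradient
    equations dSr/da = dSr/db = dSr/dc = 0 to zero (constant factors
    such as -2 are dropped, which does not move the roots)."""
    x, y = np.asarray(x, float), np.asarray(y, float)

    def gradient(p):
        a, b, c = p
        d = 1.0 + b * np.exp(-c * x)       # denominator 1 + b e^{-cx}
        r = y - a / d                      # residuals
        return [np.sum(r / d),                                  # dSr/da
                np.sum(r * a * np.exp(-c * x) / d**2),          # dSr/db
                np.sum(r * a * b * x * np.exp(-c * x) / d**2)]  # dSr/dc

    return fsolve(gradient, guess)

# Synthetic check: data from a = 10, b = 4, c = 0.8 should be recovered.
x = np.linspace(0.0, 10.0, 20)
y = 10.0 / (1.0 + 4.0 * np.exp(-0.8 * x))
print(fit_growth(x, y, guess=(8.0, 3.0, 1.0)))   # ~[10.0, 4.0, 0.8]
```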

2.4.3. Polynomial Models

Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, use the least squares method to regress the data to an m-th order polynomial,

$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m, \qquad m < n$$

The residual at each data point is given by

$$E_i = y_i - a_0 - a_1 x_i - \dots - a_m x_i^m$$

The sum of the squares of the residuals is given by

$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)^2$$

To find the constants of the polynomial regression model, we set the derivatives with respect to each $a_i$ equal to zero, that is,


$$\frac{\partial S_r}{\partial a_0} = 2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-1) = 0$$

$$\frac{\partial S_r}{\partial a_1} = 2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-x_i) = 0$$

$$\vdots$$

$$\frac{\partial S_r}{\partial a_m} = 2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-x_i^m) = 0$$

Writing these equations in matrix form gives

$$
\begin{bmatrix}
n & \sum_{i=1}^{n} x_i & \dots & \sum_{i=1}^{n} x_i^m \\
\sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 & \dots & \sum_{i=1}^{n} x_i^{m+1} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} x_i^m & \sum_{i=1}^{n} x_i^{m+1} & \dots & \sum_{i=1}^{n} x_i^{2m}
\end{bmatrix}
\begin{bmatrix}
a_0 \\ a_1 \\ \vdots \\ a_m
\end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \\ \vdots \\ \sum_{i=1}^{n} x_i^m y_i
\end{bmatrix}
$$

The above system is solved for $a_0, a_1, \ldots, a_m$.
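A minimal sketch that builds and solves these normal equations directly, assuming NumPy is available; for a first-order fit it reproduces the linear regression result obtained earlier:

```python
import numpy as np

def polynomial_regression(x, y, m):
    """Least squares fit of an m-th order polynomial by assembling
    and solving the (m+1)x(m+1) normal equations derived above."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    A = np.array([[np.sum(x**(j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])           # A[j][k] = sum of x^(j+k)
    r = np.array([np.sum(x**j * y) for j in range(m + 1)])
    return np.linalg.solve(A, r)                    # [a0, a1, ..., am]

# First-order fit to the four-point example data used earlier
print(polynomial_regression([2.0, 3.0, 2.0, 3.0],
                            [4.0, 6.0, 6.0, 8.0], 1))   # [1.0, 2.0]
```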

2.4.4. Logarithmic Functions

The form of the log regression model is

$$y = \beta_0 + \beta_1 \ln(x)$$

This is a linear function between y and $\ln(x)$, and the usual least squares method applies, in which y is the response variable and $\ln(x)$ is the regressor.

2.4.5. Power Functions

The power function equation describes many scientific and engineering phenomena:

$$y = a x^b$$

The method of least squares is applied to the power function by first linearizing the data (the assumption is that b is not known; if the only unknown were a, a linear relation would already exist between $x^b$ and y). The linearization of the data is as follows:

$$\ln(y) = \ln(a) + b \ln(x)$$

The resulting equation shows a linear relation between $\ln(y)$ and $\ln(x)$.


We can put

$$z = \ln(y), \qquad w = \ln(x)$$

$$a_0 = \ln(a), \quad \text{so that} \quad a = e^{a_0}$$

$$a_1 = b$$

and we get

$$z = a_0 + a_1 w$$

$$a_1 = \frac{n \sum_{i=1}^{n} w_i z_i - \sum_{i=1}^{n} w_i \sum_{i=1}^{n} z_i}{n \sum_{i=1}^{n} w_i^2 - \left( \sum_{i=1}^{n} w_i \right)^2}$$

$$a_0 = \frac{\sum_{i=1}^{n} z_i}{n} - a_1 \frac{\sum_{i=1}^{n} w_i}{n}$$

Since $a_0$ and $a_1$ can be found, the original constants of the model are

$$b = a_1, \qquad a = e^{a_0}$$
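A minimal sketch of this linearization, assuming NumPy is available; the function name and test data are illustrative:

```python
import numpy as np

def fit_power(x, y):
    """Fit y = a*x^b by linearizing: z = ln(y), w = ln(x), then
    least squares on z = a0 + a1*w, with a = exp(a0) and b = a1."""
    w, z = np.log(x), np.log(y)
    n = len(w)
    a1 = (n * np.sum(w * z) - np.sum(w) * np.sum(z)) / \
         (n * np.sum(w**2) - np.sum(w)**2)
    a0 = np.mean(z) - a1 * np.mean(w)
    return np.exp(a0), a1            # (a, b)

# Synthetic check: data generated from y = 3*x^2 should be recovered.
x = np.array([1.0, 2.0, 3.0, 4.0])
print(fit_power(x, 3.0 * x**2))      # ~(3.0, 2.0)
```

Note that least squares applied to the linearized data minimizes the error in ln(y), not in y itself, so the result can differ slightly from a direct nonlinear fit of y = a*x^b.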