

Fixed Point Iteration

When attempting to solve the equation f(x) = 0, it would be wonderful if we could rewrite the equation in a form which gives explicitly the solution, in a manner similar to the familiar solution method for a quadratic equation. While this does not occur for the vast majority of equations we must solve, we can always find a way to re-arrange the equation f(x) = 0 in the form:

x = g(x)    (6.3)

Finding a value of x for which x = g(x) is thus equivalent to finding a solution of the equation f(x) = 0. The function g(x) can be said to define a map on the real line over which x varies, such that for

each value of x, the function g(x) maps that point to a new point on the real line. Usually this map results in the points x and g(x) being some distance apart. If there is no motion under the map for some x = xp, we call xp a fixed point of the function g(x). Thus we have xp = g(xp), and it becomes clear that the fixed point of g(x) is also a zero of the corresponding equation f(x) = 0.

Suppose we are able to choose a point x0 which lies near a fixed point, xp, of g(x), where of course, we do not know the value of xp (after all, that is our quest here). We might speculate that under appropriate circumstances, we could use the iterative scheme:

xn+1 = g(xn) (6.4)

where n = 0, 1, 2, ..., and we continue the iteration until the difference between successive values of xn is as small as we require for the precision desired. To that level of precision, the final value of xn approximates a fixed point of g(x), and hence approximates a zero of f(x).

Figure 6.2: Fixed point iteration for the very simple case where g(x) is a linear function of x. In this figure the line y = g(x) has been chosen to have a positive slope less than one, and its iteration is started from the value x0. A second line has been chosen to have a positive slope greater than one, and its iteration is started from a second value of x0. The convergent behavior of the fixed point iteration for g(x) is quite different from that for the second line, which diverges away from the fixed point at xp. The divergence in the second case is caused solely by the fact that the slope of that line is greater than the slope of the line y = x.
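To make the scheme of equation (6.4) concrete, here is a minimal Python sketch (added here, not part of the original notes); the tolerance and iteration cap are arbitrary choices, and the mapping function g is passed in as a callable:

def fixed_point_iterate(g, x0, tol=1.0e-10, max_iter=100):
    # Repeatedly apply x -> g(x) until successive iterates agree to within
    # tol, or give up after max_iter steps (for example, if the map diverges).
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new        # x_new approximates a fixed point of g
        x = x_new
    return x                    # no convergence to within tol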


The conditions for which the fixed point iteration scheme is convergent can be understood by inspection of figure 6.2. In this case we consider the determination of the zero of the simple function:

f(x) = x - m(x - a)    (6.5)

A straightforward choice for the function g(x) with which to do fixed point iteration is:

g(x) = m(x - a)    (6.6)

This example is particularly simple since we can solve f(x) = 0 analytically and find the fixed point of g(x), xp = ma/(m-1). It is easy to verify that g(xp) = xp, confirming that xp is indeed a fixed point. The fixed point iteration sequence is shown for two choices of the slope, m, both positive.
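Running the fixed_point_iterate sketch above on this linear example shows the two behaviours directly (the particular values m = 0.5, m = 2 and a = -1 are choices made here for illustration, not taken from the source):

g_conv = lambda x: 0.5*(x + 1.0)    # m = 0.5, a = -1: fixed point xp = 1
g_div  = lambda x: 2.0*(x + 1.0)    # m = 2,   a = -1: fixed point xp = -2
print(fixed_point_iterate(g_conv, 0.0))                # converges to 1.0
print(fixed_point_iterate(g_div, -1.9, max_iter=20))   # wanders far from -2

The first call converges because |m| < 1; the second illustrates the repellent behaviour described next.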

The curve y = g(x) has m < 1 and the second curve has m > 1. It is clear that the m < 1 case results in monotonic convergence to the fixed point xp, so that the fixed point is strongly attractive in this case. The m > 1 case illustrates monotonic divergence away from the fixed point xp, so that the fixed point is strongly repellent in this case. While this simple linear case may seem special, it displays the behaviour which applies in general to a continuous mapping function, g(x). In order to understand the reasons for the difference in behaviour for the two cases m < 1 and m > 1, we need to follow the iteration sequence in some detail. Once given the starting value x0, we compute g(x0), the corresponding point on the y = g(x) curve. We then move horizontally from that point to intersect the y = x


line, and there read the value of x, and use this as the next iteration value for x. Examination of the m < 1 iteration sequence in figure 6.2 shows that each motion along the arrows of the iteration sequence leads towards the intersection point of y = x and y = g(x), thus assuring convergence. A similar examination of the m > 1 case shows that each motion along the arrows of the iteration sequence leads away from the intersection point at xp, thus assuring divergence.

While the point xp remains a fixed point of the m > 1 map, it is an unstable fixed point in the sense that starting arbitrarily close to the fixed point still results in an iterative path that leads away from the fixed point. The terms attractor and repeller then naturally describe the fixed point xp for the maps associated with m < 1 and m > 1 respectively.

Figure 6.3: Fixed point iteration for a general function g(x) for the four cases of interest. Generalizations of the two cases of positive slope shown in figure 6.2 are shown on the left, and illustrate monotonic convergence and divergence. The cases where g(x) has negative slope are

shown on the right, and illustrate oscillating convergence and divergence. The top pair of panels illustrate strong and weak attractors, while the bottom pair of panels illustrate strong and weak

repellers.

We have considered iteration functions, like g(x), which have positive slopes in the neighborhood of the fixed point, and shown that these lead to either monotonic convergence or monotonic divergence. When g(x) has negative slope in the neighborhood of the fixed point, the


result is oscillating convergence or divergence, with convergence requiring |m| < 1. The iteration sequences for all four cases are shown in figure 6.3 for a more general g(x). The conditions leading to convergence are unchanged from those derived for the linear case, as long as the neighborhood of the fixed point considered is small enough.
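The negative-slope cases can be seen with the same fixed_point_iterate sketch; the particular maps below are choices made here for illustration. Both have the fixed point x = 2, but only the first, with |slope| < 1, converges, and it does so from alternating sides:

g_osc     = lambda x: -0.6*x + 3.2   # slope -0.6: oscillating convergence to 2
g_osc_div = lambda x: -1.5*x + 5.0   # slope -1.5: oscillating divergence from 2
print(fixed_point_iterate(g_osc, 0.0))                    # approaches 2.0
print(fixed_point_iterate(g_osc_div, 1.9, max_iter=20))   # oscillates away from 2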

http://pathfinder.scar.utoronto.ca/~dyer/csca57/book_P/node34.html

The Newton-Raphson Method

Bracketing methods are useful when we have a well established interval in which a sign change takes place for a continuous function. If the function is monotonic on that interval, we know that the zero is unique and methods such as bisection and regula falsi are sure to find it. If the function is not monotonic, it is difficult to ensure that all the zeroes on the interval are found. In addition, these bracketing methods are somewhat slow in their convergence to the zero, since they do not make much use of local knowledge of the function f(x).

We first examine a method, known as the Newton-Raphson method, that makes explicit use of the derivative of the function of which we wish to find the zero. If we suppose we know more local information about f(x), such as that used in developing a Taylor expansion of the function about the point x0, we can often find the zeroes more quickly. Suppose we have reason to believe that there is a zero of f(x) near the point x0. The Taylor expansion for f(x) about x0 can be written as:

f(x) = f(x0) + f'(x0)(x - x0) + (1/2) f''(x0)(x - x0)^2 + ...    (6.7)

If we drop the terms of this expansion beyond the first order term, and write:

f(x) ≈ f(x0) + f'(x0)(x - x0)    (6.8)

and set f(x) = 0 to find the next approximation, x1, to the zero of f(x), we find:

x1 = x0 - f(x0)/f'(x0)    (6.9)

This provides us with an iteration scheme which may well converge on the zero of f(x), under appropriate conditions.
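A minimal Python sketch of this iteration (added here, not taken from the notes); the function and its derivative are passed in as callables, and the tolerance and iteration cap are arbitrary choices:

def newton_raphson(f, fprime, x0, tol=1.0e-10, max_iter=50):
    # Repeatedly apply x -> x - f(x)/f'(x) until successive iterates
    # agree to within tol, or max_iter steps have been taken.
    x = x0
    for _ in range(max_iter):
        fp = fprime(x)
        if fp == 0:                  # flat tangent: the step is undefined
            raise ZeroDivisionError("derivative vanished at x = %g" % x)
        x_new = x - f(x)/fp
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x                         # best estimate after max_iter steps

# Example: the positive zero of f(x) = x^2 - 2 is sqrt(2), about 1.4142135624
print(newton_raphson(lambda x: x*x - 2.0, lambda x: 2.0*x, 1.0))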

To examine the conditions under which this iteration converges, we can analyse the scheme with the approach used previously for fixed point iteration, since equation (6.9) actually sets up a fixed point iteration based on the first order Taylor approximation to f(x). The iteration function is:


g(x) = x - f(x)/f'(x)    (6.10)

and the derivative is:

g'(x) = f(x) f''(x) / [f'(x)]^2    (6.11)

At the actual zero, f(x) = 0, so as long as f'(x) does not vanish there, we have g'(x) = 0 at the zero of f(x). Thus continuity implies that there must be a neighborhood around the zero where |g'(x)| < 1, and we can conclude that the Newton-Raphson method converges in the interval where |g'(x)| < 1.

Figure 6.4: Schematic representation of the Newton-Raphson method showing the iteration from the initial guess x0 to the next approximation x1 to the zero of the function f(x).
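As a concrete check of this criterion (a worked example added here, not carried out in the source), take f(x) = x^2 - A with A > 0. Then f'(x) = 2x and f''(x) = 2, so equation (6.11) gives g'(x) = (x^2 - A)/(2x^2) = (1 - A/x^2)/2. This is exactly zero at the zero x = sqrt(A), and |g'(x)| < 1 holds for every x > sqrt(A/3), so the Newton iteration for square roots (equation (6.13) below) has a comfortably wide interval of convergence around its fixed point.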

The Newton-Raphson method is illustrated graphically in figure 6.4 for a simple monotonic function f(x). The iteration from x0, an initial guess for the zero of f(x), involves drawing the tangent to f(x) from the point (x0,f(x0)) until it intersects the y = 0 axis. The point of intersection is


the next approximate value, x1, of the zero of f(x). The process is repeated until the change in the intersection point is smaller than the requested tolerance or precision, i.e. until |xn+1 - xn| < epsilon for some specified tolerance epsilon.

Figure 6.5: Newton-Raphson iteration for a function f(x) which has a minimum near the initial guess at x0. Since f'(x) = 0 at the minimum, the convergence of the method could be threatened if an initial guess for x0 were too close to the minimum of f(x). The initial step from x0 to x1 is quite large, and it is clear that for a slightly smaller x0, that step could become very large indeed.

Figure 6.5 shows Newton-Raphson iteration for a function f(x) with a minimum near the initial guess at x0. This results in a very large step from x0 to x1, after which convergence is quite fast

since the active region for x is then quite far from points where f'(x) is small. If the choice of the initial guess is even closer to the minimum of f(x), the initial step in the iteration can become very large, with the result that the Newton-Raphson iteration could move to a completely different range of x, possibly far away from the zero of interest in the application. If f(x) is monotonic except at the minimum indicated, the iteration would converge back on the zero desired, provided numerical accuracy could be maintained.
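A small numeric illustration of this hazard (the function and starting point are hypothetical choices made here): f(x) = x^2 - 6x + 8 has zeros at x = 2 and x = 4 and a minimum at x = 3, and starting just past the minimum produces an enormous first step:

def f(x):  return x*x - 6.0*x + 8.0    # zeros at 2 and 4, minimum at x = 3
def fp(x): return 2.0*x - 6.0
x0 = 3.001                             # just to the right of the minimum
x1 = x0 - f(x0)/fp(x0)
print(x1)                              # about 503: the first step overshoots wildly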

A very simple application of the Newton-Raphson method arises in the problem of computing fractional powers of numbers. The most familiar problem is the computation of the square root of a number A which requires that we find the zeroes of the simple function:


f(x) = x^2 - A    (6.12)

Straightforward application of the method results in the iteration function:

xn+1 = (xn + A/xn)/2    (6.13)

yielding a simple scheme to evaluate square roots. This algorithm is implemented in the function my_sqrt(a) to compute the square root of a to a precision specified by the constant tol.

''' The function my_sqrt(a) computes the square root of its argument a
to the precision specified in the constant tol. If a is negative, the
function returns -1, and otherwise the return value is non-negative.
It is a straightforward implementation of the Newton-Raphson method.'''

tol = 1.0e-8

def my_sqrt(a):
    if a < 0 :
        return -1
    x = a                    # initialize x and xprev to different values
    xprev = 0
    while xprev - x > tol or x - xprev > tol :   # are we done yet?
        xprev = x                        # keep previous value
        x = 0.5*(xprev + a/xprev)        # Newton-Raphson iterate
    return x
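As a quick check of the listing (the test value is an arbitrary choice made here):

print(my_sqrt(2.0))    # approximately 1.4142135624, matching sqrt(2)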

The same approach can be used to evaluate the nth root of any positive number A by using the Newton-Raphson scheme on the equation x^n = A, with the resulting iterating equation:

xn+1 = xn (n - 1 + A/xn^n) / n    (6.14)

This algorithm is implemented in the function any_root(a,n), with a support function ipower(x,i) to compute the value of a floating point value raised to any positive integer power. A simple test that calls the any_root function is included in the main program. The ipower function is not a very efficient way to raise a number to an integer power, since it uses far more multiplications than necessary. A simple decomposition of the power can lead to many fewer multiplications being needed, as one can see from the simple example of x^11, which would require 10 multiplications in the naive approach, but only 5 if one first evaluates x2 = x*x, then x5 = x2*x2*x, and finally x11 = x5*x5*x. Implementing such an approach would be worthwhile if such evaluations were required frequently; a sketch of one such decomposition is given after the listing below.

#!/usr/bin/env python

''' The function any_root(a,n) computes the n-th root of the argument a
to the precision tol. If a is negative or n is not positive, -1 is
returned, and non-negative values are returned otherwise. This uses
the Newton-Raphson method.'''

def any_root(a,n):          # compute the n-th root of a, where n is positive
    x = a
    xprev = 0               # initialize to different values
    if a < 0 or n <= 0 :
        return -1
    while xprev - x > tol or x - xprev > tol :
        xprev = x
        x = xprev*(n - 1 + a/xprev**n)/n
    return x

import sys

tol = 1.0e-10
if len(sys.argv) < 3 :
    sys.exit(1)
a = float(sys.argv[1])
n = float(sys.argv[2])
print "%.12g to the 1/%g power = %.12g" % (a,n,any_root(a,n))
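The decomposition idea mentioned above can be written as a general routine using repeated squaring; the sketch below (added here, and not the ipower of the original notes, although it plays the same role) needs only about log2(i) squarings plus a few extra multiplications:

def ipower(x, i):
    # x raised to the non-negative integer power i, by repeated squaring.
    result = 1.0
    while i > 0:
        if i % 2 == 1:      # this bit of the exponent is set
            result *= x
        x *= x              # square the running base
        i //= 2             # move on to the next bit of the exponent
    return result

For i = 11 this uses a handful of multiplications instead of the ten needed by the naive loop.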

The restriction that f'(x) must not vanish at or near the zero of f(x) causes difficulties with the Newton-Raphson method, as we discovered earlier. Let us suppose that f'(a) = 0 at a zero x = a of f(x), but f''(a) ≠ 0, so that the Taylor expansion of f(x) about the point x = a becomes:

f(x) = (1/2) f''(a)(x - a)^2 + (1/6) f'''(a)(x - a)^3 + ...    (6.15)

so that we can rewrite f(x) in the form:

f(x) = (x - a) h(x)    (6.16)

where h(x) is a function of the form:

h(x) = (1/2) f''(a)(x - a) + (1/6) f'''(a)(x - a)^2 + ...    (6.17)

and we know that the coefficient of x - a does not vanish at x = a since f''(a) ≠ 0. Then h(x) is a function with a zero at x = a, but for which h'(a) ≠ 0, so that the Newton-Raphson method can be applied to find the zero of h(x) at x = a. Once that zero of h(x) is found, it is clear that the problem of f'(a) = 0 arises because the zero x = a is a double root of f(x). The same


approach applies to even higher order zeroes of f(x). If we find that all the derivatives of f(x) up to the k-th derivative, f^(k)(x), vanish at a point x = a, then we should suspect that a zero of multiplicity k+1 is a possibility at x = a.

The discussion of multiple zeroes of a function f(x) also leads to a technique, called deflation, to handle the difficulties which arise when there are two zeroes which lie so close to each other that, once the first is found, it is difficult to prevent the Newton-Raphson iteration from finding it repeatedly, while its nearby companion cannot be found. If the zero at x = a is a simple zero (i.e. not a multiple zero), then switching consideration from f(x) to h(x) = f(x)/(x-a) results in a zero finding problem for h(x), but where we know that h(x) does not have a zero at x = a, based on the previous discussion of multiple zeroes. If f(x) really does have a second zero at x = b near to x = a, we can be sure that h(x) does not have the zero x = a but does have the zero x = b. The Newton-Raphson method can now be applied to finding the zero of h(x) at x = b without any difficulty from the zero of f(x) at x = a. Once the zero of h(x) has been located at x = b, we know by the construction h(x) = f(x)/(x-a) that f(x) has the same zero at x = b.
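A minimal sketch of deflation in code (added here, not from the notes); the test function and its two nearby zeros are hypothetical choices for the example:

def deflate(f, a):
    # Return h(x) = f(x)/(x - a): f with its known simple zero at x = a
    # divided out.  Evaluating h exactly at x = a would divide by zero,
    # so any subsequent root search should be started away from a.
    def h(x):
        return f(x)/(x - a)
    return h

f = lambda x: (x - 1.0)*(x - 1.001)   # two close simple zeros, at 1.0 and 1.001
h = deflate(f, 1.0)                   # divide out the zero at x = 1.0
print(h(1.0005))                      # about -0.0005: h still changes sign near 1.001
print(h(1.001))                       # 0.0: the remaining zero is untouched

Either the Newton-Raphson sketch given earlier or the secant routine below can then be pointed at h to locate the zero at x = b without interference from the zero at x = a.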

http://pathfinder.scar.utoronto.ca/~dyer/csca57/book_P/node35.html

Secant Method

While the Newton-Raphson method has many positive features, it does require the evaluation of two different functions on each iteration, f(x) and f'(x). When f(x) is reasonably simple, it is easy to compute f'(x), but when f(x) is a complicated function, the computation of the derivative can be tedious at best. Thus it is useful to have another method which does not require evaluation of f'(x). Such a method is the secant method, which closely resembles the regula falsi method in using a linear approximation to f(x) at each iteration, but with the bracketing aspect dropped.

Figure 6.6: The secant method for a simple function. We observe that, unlike the regula falsi method, the initial points do not enclose a point where we necessarily know a zero of f(x) exists.


In figure 6.6 the secant method is illustrated and we can see that the function f(x) is being approximated by a straight line which is an extrapolation based on the two points x0 and x1. The line passing through the points (x0,f(x0)) and (x1,f(x1)) can be seen to be given by:

y = f(x1) + [(f(x1) - f(x0)) / (x1 - x0)] (x - x1)    (6.18)

so that solving for the value of x for which y = 0 for this line, we have the two-point iteration formula:

xk+1 = xk - f(xk) (xk - xk-1) / (f(xk) - f(xk-1))    (6.19)

The secant method is closely related to the Newton-Raphson method, and this relationship is clearly related to the fact that the quantity

[f(xk) - f(xk-1)] / (xk - xk-1)    (6.20)

becomes f'(xk) in the limit that xk-1 approaches xk. In fact, from the Mean Value Theorem of calculus, we know that as long as f(x) is continuous on the interval [xk-1, xk] and differentiable inside it, there is a point x = c in that interval for which

f'(c) = [f(xk) - f(xk-1)] / (xk - xk-1)    (6.21)

#!/usr/bin/env python
''' This uses the secant method to find the zero of a function starting
from two values of x, which do not necessarily enclose the zero.'''

import sys

def secant_solve(f,x1,x2,ftol,xtol):
    f1 = f(x1)
    if abs(f1) <= ftol :
        return x1                   # already effectively zero
    f2 = f(x2)
    if abs(f2) <= ftol :
        return x2                   # already effectively zero
    x3 = x2                         # in case the loop below is never entered
    while abs(x2 - x1) > xtol :
        slope = (f2 - f1)/(x2 - x1)
        if slope == 0 :
            sys.stderr.write("Division by 0 due to vanishing slope - exit!\n")
            sys.exit(1)
        x3 = x2 - f2/slope          # the new approximate zero
        f3 = f(x3)                  # and its function value
        if abs(f3) <= ftol :
            break
        x1,f1 = x2,f2               # copy x2,f2 to x1,f1
        x2,f2 = x3,f3               # copy x3,f3 to x2,f2
    return x3

def quad(x):                        # a simple test function with known zeroes
    return (x-5)*(x-2)

root = secant_solve(quad,1.0,3.0,0.000001,0.000001)
print "ROOT = %g" % root
root = secant_solve(quad,3.0,10.0,0.000001,0.000001)
print "ROOT = %g" % root
root = secant_solve(quad,9.99,10.0,0.000001,0.000001)
print "ROOT = %g" % root

The secant method does not require the evaluation of any formal derivative as the Newton-Raphson does, and requires only one evaluation of f(x) per iteration. In many cases the absence of any requirement to compute a derivative is a significant advantage, not only because there is no need to perform the formal differentiation, but because frequently the derivative of a function is a significantly more complicated function than the original function. The rate of convergence for the Newton-Raphson for a simple zero x = a is said to be of order 2, meaning that the error |a - xk+1| at iteration k+1 is related to the error |a - xk| at iteration k by the relation:

|a - xk+1| = A |a - xk|^R    (6.22)

where A is some constant factor, and R = 2. Thus if the error |a - xk| is, say, of order 10^-4, then the error in the next iteration will be of order 10^-8. The constant A is normally near 1, but its value is not particularly important here, for the exponent R has the dominant effect. While R = 2 for the Newton-Raphson method, the value for the secant method is about 1.618 (actually it can be shown to be the golden ratio, (1 + sqrt(5))/2) for simple zeroes. For many practical purposes, the difference between convergence of order 2 and order 1.618 is of less impact than the difficulties already mentioned for the Newton-Raphson method.
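The two rates can be seen with a short numerical experiment (added here, not from the notes); on f(x) = x^2 - 2 the Newton-Raphson errors roughly square at each step, while the secant errors shrink more slowly, with an exponent of about 1.6:

import math

def f(x): return x*x - 2.0
root = math.sqrt(2.0)

x = 2.0                                  # Newton-Raphson, using f'(x) = 2x
for k in range(5):
    print("Newton error: %.3e" % abs(x - root))
    x = x - f(x)/(2.0*x)

x0, x1 = 2.0, 1.5                        # secant, no derivative needed
for k in range(6):
    print("Secant error: %.3e" % abs(x1 - root))
    x0, x1 = x1, x1 - f(x1)*(x1 - x0)/(f(x1) - f(x0))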

While both the regula falsi and secant methods use the idea of a linear approximation to the function based on its values at two points, the regula falsi method depends on the fact that those two points enclose a zero, with the consequent sign change for f(x), while the secant method simply extrapolates using these two points to find the next approximation to the zero. The regula falsi method is sure to find the zero, since it keeps it bracketed, while the secant method can sometimes fail to find a zero that does exist. The secant method has the advantage that we do not need to have prior knowledge of the interval in x in which each zero of f(x) lies.


We have tried to follow the dominant trend in the naming of these methods, but there does exist some variance in many books in this regard. Most variance arises with the distinction between the regula falsi method and the secant method. In fact both these methods use the straight line joining two points on a curve, which is called a secant or a chord. The distinction is in how the two methods use that secant to find the next approximation to the zero. It is probably best to think of the regula falsi method as an interpolator method for a zero in a known interval, and the secant method as an extrapolator method since it uses two known values to determine the next approximation for the zero, but without the restriction that the next approximation must lie in any particular interval. Of course, the secant method will function as an interpolator when the next approximation happens to lie between the two initial values of x.

http://pathfinder.scar.utoronto.ca/~dyer/csca57/book_P/node36.html

Finding Roots by "Open" Methods

The differences between "open" and "closed" methods
Newton/Raphson: use first derivative to pick next point
Example of Newton's method
Secant: estimate first derivative to pick next point

See Chapter 6 of your textbook for more information on these methods.

The differences between "open" and "closed" methods

The differences between "open" and "closed" methods are

closed                            open
-----------------                 ---------------------
uses a bounded interval           not restricted to interval
(usually) converges slowly        (usually) converges quickly
always finds a root               may not find a root
  (if it exists)                    (if it exists)

Newton/Raphson method

This method uses not only values of a function f(x), but also values of its derivative f'(x). If you don't know the derivative, you can't use it.

The graphical approach to the method may be described as "follow the slope down to zero"; see your textbook for an illustration.

One can also use the Taylor series to derive Newton's method. The problem is: given

a starting point x1


a function f(x)
the function's derivative f'(x)

how can we find a root of the function -- that is, a place xr where f(xr) = 0?

Suppose we expand the function around the starting point, using the Taylor series:

f(x) = f(x1) + f'(x1) * (x - x1)

Now, we want to find the spot where f(x) = 0, so let's plug that into the equation and solve for x.

0 = f(x1) + f'(x1) * (x - x1)

-f(x1) = f'(x1) * (x - x1)

-f(x1) / f'(x1) = x - x1

x = x1 - f(x1) / f'(x1)

So, given a first point x1, one can calculate a new guess x2 which -- we hope -- is closer to the root. Iterating a number of times might move us very close to the root.

But there is no guarantee that this method will find the root. The method often does, but it can fail, or take a very large number of iterations, if the function in question has a slope which is zero, or close to zero, near the location of the root. It can also fail if the second derivative of the function is zero near the root.

Example of Newton's method

Let's look at a specific example of Newton's method:

find a root of the equation y = x^2 - 4 on the interval [0, 5]
stop when the relative fractional change is 1e-5

The exact root is 2, of course. How quickly can we find it, to within the given termination criterion?

The first step is to pick a starting point -- why not try halfway between the endpoints, at x1 = 2.5? The function has a value of f(x1) = 2.25 there ...


Now, we calculate the derivative of the function at x1 = 2.5, which is f'(x1) = 5.0. So, we draw a line with slope 5 downwards from our current point, and follow it to the x-axis.


Where does this line intercept the x-axis? At the point given by

x2 = x1 - f(x1) / f'(x1)

which turns out to be our next guess at the root: x2 = 2.05

We are now much closer to the root than we were at the start -- hooray! Newton's method sure is fast, when it works. Let's zoom in and repeat the process. We evaluate our function at the new position, finding f(x2) = 0.2025.


We also evaluate the derivative of the function at this point, which yields a new slope: f'(x2) = 4.1. Following this new slope down to the x-axis, we see that it intersects at

x3 = x2 - f(x2) / f'(x2)

Wow. This third estimate of the root, x3 = 2.00061, is really, really close to the actual root (which is 2, of course). In just two steps, Newton's method has done very well.



Here's a table showing the method in action:

         current         value at         value of deriv      fractional
iter     guess           current guess    at current guess    change
-------------------------------------------------------------------
  0      2.500000e+00    2.2500e+00       5.0000e+00          2.1951e-01
  1      2.050000e+00    2.0250e-01       4.1000e+00          2.4688e-02
  2      2.000610e+00    2.4394e-03       4.0012e+00          3.0483e-04
  3      2.000000e+00    3.7169e-07       4.0000e+00          4.6461e-08

The result is 2.00000000000000, which is indistinguishable from the true root of 2.
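A few lines of Python (an illustrative sketch added here, not the code behind the original page) reproduce the table above:

def f(x):  return x*x - 4.0
def fp(x): return 2.0*x

x = 2.5                                  # starting guess, midway through [0, 5]
for it in range(10):
    x_new = x - f(x)/fp(x)
    frac = abs((x_new - x)/x_new)        # relative fractional change
    print("%d  %.6e  %.4e  %.4e  %.4e" % (it, x, f(x), fp(x), frac))
    x = x_new
    if frac < 1.0e-5:                    # the termination criterion stated above
        break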

Secant method

In order to use Newton's method, we need to be able to calculate the derivative of a function at some point: f'(x1). Sometimes you don't know the derivative; what can you do then?

What you can do is estimate the derivative by looking at the change in the function near x1: pick some other point x2 close to x1, and estimate the derivative as

derivative at x1 is approx D = (f(x1) - f(x2)) / (x1 - x2)

With that approximation in hand, you can then apply the same method to guess the point at which the function equals zero.

x_new = x1 - f(x1) / D

After finding x_new, one can replace

x2 = x1
x1 = x_new

and make another iteration, and so forth.

The secant method therefore avoids the need for the first derivative, but it does require the user to pick a "nearby" point in order to estimate the slope numerically. Picking a "nearby" point which is too far, or too near, the first one, can lead to trouble. The secant method, just like Newton's method, is vulnerable to slopes which are very close to zero: they can cause the program to extrapolate far from the true root.
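Putting the pieces together, a minimal sketch of this estimated-derivative iteration might look as follows (the function name, tolerances and test values are choices made here); note that, as written, the old x1 is copied into x2 before x1 is overwritten with the new estimate:

def secant_root(f, x1, x2, ftol=1.0e-10, max_iter=50):
    for _ in range(max_iter):
        if abs(f(x1)) <= ftol:
            break                        # current point is already a good root
        D = (f(x1) - f(x2)) / (x1 - x2)  # estimated derivative near x1
        x_new = x1 - f(x1)/D
        x2 = x1                          # old current point becomes the nearby point
        x1 = x_new                       # new estimate becomes the current point
    return x1

print(secant_root(lambda x: x*x - 4.0, 2.5, 2.4))   # approximately 2.0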

http://spiff.rit.edu/classes/phys317/lectures/open_root/open_root.html

