
    THE KALMAN FILTER

    RAUL ROJAS

Abstract. This paper provides a gentle introduction to the Kalman filter, a numerical method that can be used for sensor fusion or for the calculation of trajectories. First, we consider the Kalman filter for a one-dimensional system. The main idea is that the Kalman filter is simply a linear weighted average of two sensor values. Then we show that the general case has a similar structure and that its mathematical formulation is essentially the same.

    1. An example of data filtering

The Kalman filter is widely used in aeronautics and engineering for two main purposes: for combining measurements of the same variables but from different sensors, and for combining an inexact forecast of a system's state with an inexact measurement of the state. The Kalman filter also has applications in statistics and function approximation.

When dealing with a time series of data points $x_1, x_2, \ldots, x_n$, a forecaster computes the best guess for the point $x_{n+1}$. A smoother looks back at the data and computes the best possible $x_i$, taking into account the points before and after $x_i$. A filter provides a correction for $x_{n+1}$, taking into account all the points $x_1, x_2, \ldots, x_n$ and an inexact measurement of $x_{n+1}$.

An example of a filter is the following: assume that we have a system whose one-dimensional state we can measure at successive steps. The readings of the measurements are $x_1, x_2, \ldots, x_n$. Our task is to compute the average $\bar{x}_n$ of the time series given $n$ points. The solution is

$$\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

If a new point $x_{n+1}$ is measured, we can recompute $\bar{x}_{n+1}$, but it is more efficient to use the old value $\bar{x}_n$ and make a small correction using $x_{n+1}$. The correction is easy to derive, since

$$\bar{x}_{n+1} = \frac{1}{n+1}\sum_{i=1}^{n+1} x_i = \frac{n}{n+1}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) + \frac{1}{n+1}\,x_{n+1}$$

and so $\bar{x}_{n+1}$ can be written as

$$\bar{x}_{n+1} = \frac{n}{n+1}\,\bar{x}_n + \frac{1}{n+1}\,x_{n+1} = \bar{x}_n + K(x_{n+1} - \bar{x}_n) \tag{1.1}$$

where $K = 1/(n+1)$. $K$ is called the gain factor. The new average $\bar{x}_{n+1}$ is a weighted average of the old estimate $\bar{x}_n$ and the new value $x_{n+1}$. We trust $\bar{x}_n$ more than the single value $x_{n+1}$; therefore, the weight of $\bar{x}_n$ is larger than the weight of $x_{n+1}$. Equation 1.1 can also be read as stating that the average is corrected using the difference between $x_{n+1}$ and the old value $\bar{x}_n$. The gain $K$ adjusts how big the correction will be.
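As a quick numerical illustration (this sketch and its sample data are ours, not part of the original paper), the recursive update of Equation 1.1 reproduces the batch average:

    # Recursive average, following Equation 1.1 (illustrative sketch).
    xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

    mean = xs[0]              # average of the first point
    for n, x_new in enumerate(xs[1:], start=1):
        K = 1.0 / (n + 1)     # gain factor
        mean += K * (x_new - mean)

    print(mean)               # ~5.0, the same as sum(xs) / len(xs)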

We can also recursively recalculate the quadratic standard deviation of the time series (the variance). Given $n$ points, the quadratic standard deviation $\sigma_n$ is given by:

$$\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}_n)^2.$$

If a new point $x_{n+1}$ is measured, the new variance is

$$\sigma_{n+1}^2 = \frac{1}{n+1}\sum_{i=1}^{n+1}\left(x_i - \bar{x}_{n+1}\right)^2 = \frac{1}{n+1}\sum_{i=1}^{n+1}\left(x_i - \bar{x}_n - K(x_{n+1} - \bar{x}_n)\right)^2$$

where $K$ is the gain factor defined above and we used Equation 1.1. The expression above can be expanded as follows:

$$\sigma_{n+1}^2 = \frac{1}{n+1}\left(\sum_{i=1}^{n}(x_i - \bar{x}_n)^2 - 2K\sum_{i=1}^{n}(x_i - \bar{x}_n)(x_{n+1} - \bar{x}_n) + nK^2(x_{n+1} - \bar{x}_n)^2 + (1-K)^2(x_{n+1} - \bar{x}_n)^2\right)$$

The second term inside the brackets is zero, because $\sum_{i=1}^{n}(x_i - \bar{x}_n) = 0$. Therefore, the whole expression reduces to

$$\sigma_{n+1}^2 = \frac{1}{n+1}\left(n\sigma_n^2 + (x_{n+1} - \bar{x}_n)^2\left(nK^2 + (1-K)^2\right)\right).$$

Since $nK^2 + (1-K)^2 = nK$, the last expression reduces to

$$\sigma_{n+1}^2 = \frac{n}{n+1}\left(\sigma_n^2 + K(x_{n+1} - \bar{x}_n)^2\right) = (1-K)\left(\sigma_n^2 + K(x_{n+1} - \bar{x}_n)^2\right).$$

The whole process can now be cast as a series of steps to be followed iteratively. Given the first $n$ points, and our calculation of $\bar{x}_n$ and $\sigma_n$:

• When a new point $x_{n+1}$ is measured, we compute the gain factor $K = 1/(n+1)$.

• We compute the new estimate of the average:

$$\bar{x}_{n+1} = \bar{x}_n + K(x_{n+1} - \bar{x}_n)$$

• We also compute a provisional estimate of the new standard deviation:

$$\sigma'^2_n = \sigma_n^2 + K(x_{n+1} - \bar{x}_n)^2$$

• Finally, we find the correct $\sigma_{n+1}$ using the correction:

$$\sigma_{n+1}^2 = (1-K)\,\sigma'^2_n$$

This kind of iterative computation is used in calculators for updating the average and standard deviation of numbers entered sequentially, without having to store all the numbers. This kind of iterative procedure shows the general flavor of the Kalman filter, which is a kind of recursive least-squares estimator for data points.
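The four steps above translate directly into code. The following Python sketch (our own illustration, with invented data) updates the running mean and variance one point at a time and reproduces the batch statistics:

    # Running mean and (population) variance, updated one point at a time.
    def update(mean, var, n, x_new):
        """Combine the statistics of the first n points with a new point."""
        K = 1.0 / (n + 1)                       # gain factor
        new_mean = mean + K * (x_new - mean)    # corrected average
        provisional = var + K * (x_new - mean) ** 2
        return new_mean, (1.0 - K) * provisional

    mean, var = 2.0, 0.0  # statistics of the single point [2.0]
    for n, x in enumerate([4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], start=1):
        mean, var = update(mean, var, n, x)

    print(mean, var)      # ~5.0 and ~4.0, matching the batch values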


[Figure 1. Gaussians]

    2. The one-dimensional Kalman Filter

The example above showed how to update a statistical quantity once more information becomes available. Assume now that we are dealing with two different instruments that provide a reading for some quantity of interest $x$. We call $x_1$ the reading from the first instrument and $x_2$ the reading from the second instrument. We know that the first instrument has an error modelled by a Gaussian with standard deviation $\sigma_1$. The error of the second instrument is also normally distributed around zero, with standard deviation $\sigma_2$. We would like to combine both readings into a single estimate.

If both instruments are equally good ($\sigma_1 = \sigma_2$), we just take the average of both numbers. If the first instrument is absolutely superior ($\sigma_1 \ll \sigma_2$), we would keep $x_1$ as the estimate, and vice versa. In general, we weight each reading according to the quality of its instrument, modelling the reading of the first instrument as a Gaussian $p_1(x)$ centered at $x_1$ with standard deviation $\sigma_1$, and the reading of the second as a Gaussian $p_2(x)$ centered at $x_2$ with standard deviation $\sigma_2$.


Since the two measurements are independent, we can best combine them by multiplying their probability distributions and normalizing. Multiplying, we obtain:

$$p(x) = p_1(x)\,p_2(x) = C\,e^{-\frac{1}{2}(x-x_1)^2/\sigma_1^2 \,-\, \frac{1}{2}(x-x_2)^2/\sigma_2^2}$$

where $C$ is a constant obtained after the multiplication (including the normalization factor needed for the new Gaussian).

The expression for $p(x)$ can be expanded into

$$p(x) = C\,e^{-\frac{1}{2}\left(\left(\frac{x^2}{\sigma_1^2} - \frac{2xx_1}{\sigma_1^2} + \frac{x_1^2}{\sigma_1^2}\right) + \left(\frac{x^2}{\sigma_2^2} - \frac{2xx_2}{\sigma_2^2} + \frac{x_2^2}{\sigma_2^2}\right)\right)}$$

which, grouping some terms, reduces to

$$p(x) = C\,e^{-\frac{1}{2}\left(x^2\left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right) - 2x\left(\frac{x_1}{\sigma_1^2} + \frac{x_2}{\sigma_2^2}\right)\right) + D}$$

where $D$ is a constant. The expression can be rewritten as

$$p(x) = C\,e^{-\frac{1}{2}\,\frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2\sigma_2^2}\left(x^2 - 2x\,\frac{x_1\sigma_2^2 + x_2\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\right) + D}.$$

Completing the square in the exponent, we obtain:

$$p(x) = F\,e^{-\frac{1}{2}\,\frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2\sigma_2^2}\left(x - \frac{x_1\sigma_2^2 + x_2\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\right)^2}$$

where all constants in the exponent (also those arising from completing the square) and in front of the exponential function have been absorbed into the constant $F$.

From this result, we see that the most probable result $\hat{x}$ obtained from combining the two measurements (that is, the center of the distribution) is

$$\hat{x} = \frac{x_1\sigma_2^2 + x_2\sigma_1^2}{\sigma_1^2 + \sigma_2^2}$$

and the variance of the combined result is

$$\sigma^2 = \frac{\sigma_1^2\,\sigma_2^2}{\sigma_1^2 + \sigma_2^2}.$$

If we introduce a gain factor $K = \sigma_1^2/(\sigma_1^2 + \sigma_2^2)$, we can rewrite the best estimate $\hat{x}$ of the state as

$$\hat{x} = x_1 + K(x_2 - x_1)$$

and the change to the variance $\sigma_1^2$ as

$$\sigma^2 = (1-K)\,\sigma_1^2.$$

This is the general form of the classical Kalman filter. Note that $x_1$ does not need to be a measurement. It can be a forecast of the system state, with a variance $\sigma_1^2$, and $x_2$ can be a measurement with the error variance $\sigma_2^2$. The Kalman filter would in that case combine the forecast with the measurement in order to provide the best possible linear combination of both as the final estimate.
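As a concrete sketch (ours, with invented numbers), the one-dimensional fusion rule fits in a three-line function; here it combines a forecast with a measurement, as just described:

    def fuse(x1, var1, x2, var2):
        """Combine two independent Gaussian estimates of the same scalar."""
        K = var1 / (var1 + var2)                    # gain factor
        return x1 + K * (x2 - x1), (1.0 - K) * var1

    # Forecast 10.0 (variance 4.0) fused with measurement 12.0 (variance 1.0):
    print(fuse(10.0, 4.0, 12.0, 1.0))   # (11.6, 0.8): the result lies
                                        # closer to the more reliable value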


    3. Multidimensional Kalman Filter - uncorrelated dimensions

We can generalize the result obtained above to the case of multiple dimensions in which every dimension is independent of the others, that is, when there is no correlation between the measurements obtained for each dimension. In that case the measurements $x_1$ and $x_2$ are vectors, and we can combine their coordinates independently.

In the case of uncorrelated dimensions, given measurements $x_1$ and $x_2$ (vectors of dimension $n$), the variances for each dimension can be arranged in the diagonals of two matrices $\Sigma_1$ and $\Sigma_2$, as follows:

$$\Sigma_1 = \begin{pmatrix} \sigma_{11}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{12}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{1n}^2 \end{pmatrix} \quad\text{and}\quad \Sigma_2 = \begin{pmatrix} \sigma_{21}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{22}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{2n}^2 \end{pmatrix}$$

where $\sigma_{1i}^2$ represents the variance of the $i$-th coordinate for the first instrument, and $\sigma_{2i}^2$ the variance of the $i$-th coordinate for the second.

The best estimate for the system state is given by

$$\hat{x} = (\Sigma_2 x_1 + \Sigma_1 x_2)(\Sigma_1 + \Sigma_2)^{-1}.$$

Introducing the gain $K = \Sigma_1(\Sigma_1 + \Sigma_2)^{-1}$, we can rewrite the expression above as

$$\hat{x} = x_1 + K(x_2 - x_1)$$

and the new estimate of all the variances can be put in the diagonal of a matrix $\Sigma$ computed as

$$\Sigma = (I - K)\,\Sigma_1$$

where $I$ is the $n \times n$ identity matrix. The reader can verify all these expressions.
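For the uncorrelated case, a short NumPy sketch (our illustration; the diagonal covariances are invented) shows that the matrix form is just the scalar rule applied coordinate by coordinate:

    import numpy as np

    x1 = np.array([1.0, 2.0])
    x2 = np.array([2.0, 4.0])
    S1 = np.diag([0.5, 1.0])          # Sigma_1: variances of instrument 1
    S2 = np.diag([0.5, 3.0])          # Sigma_2: variances of instrument 2

    K = S1 @ np.linalg.inv(S1 + S2)   # gain matrix
    x = x1 + K @ (x2 - x1)            # combined estimate
    S = (np.eye(2) - K) @ S1          # combined (diagonal) covariance

    print(x)           # [1.5, 2.5]: plain average in the first coordinate,
                       # weighted toward the better instrument in the second
    print(np.diag(S))  # [0.25, 0.75]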

The matrices $\Sigma_1$ and $\Sigma_2$ are the covariance matrices for the two instruments. In the special case of uncorrelated dimensions, the covariance matrix is diagonal. Remember that the covariance matrix for $m$ $n$-dimensional centered data points $x_1, x_2, \ldots, x_m$ (points from which the mean has already been subtracted) is defined as

$$\Sigma = \frac{1}{m}\left(x_1 x_1^T + x_2 x_2^T + \cdots + x_m x_m^T\right).$$

    4. The general case

The general case is very similar to the one-dimensional case. The mathematics is almost identical, but now we have to operate with vectors and matrices.

Given two $n$-dimensional measurements $x_1$ and $x_2$, we assume that the two instruments that produced them have a normal error distribution. The first measurement is centered at $x_1$ with covariance matrix $\Sigma_1$. The second is centered at $x_2$ with covariance matrix $\Sigma_2$. The first distribution is denoted by $N(x_1, \Sigma_1)$, the second by $N(x_2, \Sigma_2)$.


If the two instruments are independent, the combined distribution is the product of the two distributions. The product of two normalized Gaussians, after normalization, is another normalized Gaussian. We only have to add the exponents of the two Gaussians in order to find the exponent of the new Gaussian. Any constants in the exponent can be moved out of the exponent: they can be converted to constants in front of the exponential function, which will later be normalized. The new distribution is $N(\hat{x}, \Sigma)$.

Remember that in a Gaussian with center $\mu$ and covariance matrix $\Sigma$, the exponent of the Gaussian is $-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)$.

The sum of the exponents of the two Gaussians is:

$$-\frac{1}{2}\left((x - x_1)^T\Sigma_1^{-1}(x - x_1) + (x - x_2)^T\Sigma_2^{-1}(x - x_2)\right)$$

We will drop the factor $-1/2$ in the intermediate steps and will recover it at the end. The expression above (without considering the factor $-1/2$) can be expanded as:

$$x^T(\Sigma_1^{-1} + \Sigma_2^{-1})x - 2(x_1^T\Sigma_1^{-1} + x_2^T\Sigma_2^{-1})x + C$$

where $C$ is a constant. Assuming that the two covariance matrices $\Sigma_1$ and $\Sigma_2$ commute (the product of two symmetric matrices is itself symmetric exactly when they commute), we can transform the expression above into:

$$x^T(\Sigma_1 + \Sigma_2)(\Sigma_1\Sigma_2)^{-1}x - 2(x_1^T\Sigma_2 + x_2^T\Sigma_1)(\Sigma_1\Sigma_2)^{-1}(\Sigma_1 + \Sigma_2)^{-1}(\Sigma_1 + \Sigma_2)x + C$$

Note that we used the fact that $(\Sigma_1^{-1} + \Sigma_2^{-1})\Sigma_1\Sigma_2 = \Sigma_1 + \Sigma_2$. Now we can complete the square in the exponent and rewrite this expression as:

$$\left(x - (x_1\Sigma_2 + x_2\Sigma_1)(\Sigma_1 + \Sigma_2)^{-1}\right)^T(\Sigma_1 + \Sigma_2)(\Sigma_1\Sigma_2)^{-1}\left(x - (x_1\Sigma_2 + x_2\Sigma_1)(\Sigma_1 + \Sigma_2)^{-1}\right) + C'$$

where $C'$ is a constant absorbing $C$ and any other constants needed for the square completion. Multiplying the above expression by $-1/2$, the exponent has the required form for a Gaussian, and by direct inspection we see that the expected average of the combination of measurements is

$$\hat{x} = (x_1\Sigma_2 + x_2\Sigma_1)(\Sigma_1 + \Sigma_2)^{-1}$$

and the new covariance matrix is

$$\Sigma = (\Sigma_1 + \Sigma_2)^{-1}(\Sigma_1\Sigma_2).$$

These expressions correspond to the Kalman filter for the combination of two measurements. To see this more clearly, we can define the Kalman gain $K$ as

$$K = \Sigma_1(\Sigma_1 + \Sigma_2)^{-1}.$$

Then the expressions above can be rewritten as

$$\hat{x} = x_1 + K(x_2 - x_1)$$

and

$$\Sigma = (I - K)\,\Sigma_1$$

where $I$ is the identity matrix. Notice that in all the algebraic manipulations above we used the facts that the sum of symmetric matrices is symmetric, that the inverse of a symmetric matrix is symmetric, and that the product of two symmetric matrices is symmetric precisely when the two matrices commute, as we assumed for $\Sigma_1$ and $\Sigma_2$.
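The same three lines carry over verbatim to full covariance matrices. A NumPy sketch (our illustration; the matrices are invented) shows the fusion together with a simple sanity check, namely that combining two estimates reduces the overall uncertainty:

    import numpy as np

    x1 = np.array([0.0, 0.0])
    x2 = np.array([1.0, 1.0])
    S1 = np.array([[2.0, 0.5],
                   [0.5, 1.0]])       # Sigma_1
    S2 = np.array([[1.0, 0.0],
                   [0.0, 2.0]])       # Sigma_2

    K = S1 @ np.linalg.inv(S1 + S2)   # Kalman gain
    x = x1 + K @ (x2 - x1)            # combined estimate
    S = (np.eye(2) - K) @ S1          # combined covariance

    # Fusing information can only shrink the uncertainty.
    assert np.trace(S) < min(np.trace(S1), np.trace(S2))
    print(x, np.trace(S))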


    5. Forecast and measurement of different dimension

The equations printed in books for the Kalman filter are more general than the expressions we have derived, because they handle the more general case in which the measurement can have a different dimension than the system's state. But it is easy to derive the more general form: all that is needed is to move all calculations to measurement space.

In the general case we have a process going through the state transitions $x_1, x_2, \ldots$. The propagator matrix $A$ allows us to forecast the state at time $k+1$, given our best past estimate $\hat{x}_k$ of the state at time $k$:

$$x^-_{k+1} = A\hat{x}_k + \epsilon$$

The forecast is error prone. The error is given by $\epsilon$, a variable with a Gaussian distribution centered at 0. If the covariance matrix of the system state estimate at time $k$ is called $P_k$, the covariance matrix of the forecast $x^-_{k+1}$ is given by

$$P^-_{k+1} = AP_kA^T + Q$$

where $Q$ is the covariance matrix of the noise $\epsilon$.

The measurement $z_{k+1}$ is also error prone. The measurement is linearly related to the state, namely as follows:

$$z_k = Hx_k + \delta$$

where $H$ is a matrix and $\delta$ is the measurement noise, whose covariance matrix is $R$.

Kalman gain:

$$K_{k+1} = P^-_{k+1}H^T\left[HP^-_{k+1}H^T + R\right]^{-1}$$

State update:

$$\hat{x}_{k+1} = x^-_{k+1} + K_{k+1}\left[z_{k+1} - Hx^-_{k+1}\right]$$

Covariance update:

$$P_{k+1} = (I - K_{k+1}H)\,P^-_{k+1}$$
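Putting the three equations together, one predict/update cycle looks as follows in NumPy. This is our illustrative sketch: the matrices $A$, $H$, $Q$, $R$ and the numbers model a hypothetical constant-velocity state [position, velocity] observed through its position only:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])    # propagator: position += velocity
    H = np.array([[1.0, 0.0]])    # we measure the position only
    Q = 0.01 * np.eye(2)          # covariance of the process noise
    R = np.array([[0.25]])        # covariance of the measurement noise

    x = np.array([0.0, 1.0])      # current state estimate x_k
    P = np.eye(2)                 # its covariance P_k

    # Forecast step.
    x_pred = A @ x                # x-_{k+1}
    P_pred = A @ P @ A.T + Q      # P-_{k+1}

    # Update step with a new measurement z_{k+1}.
    z = np.array([1.2])
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x = x_pred + K @ (z - H @ x_pred)     # state update
    P = (np.eye(2) - K @ H) @ P_pred      # covariance update

    print(x)   # the forecast is pulled toward the measured position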


Freie Universität Berlin, Institut für Informatik