
Event based sampling in non-linear filtering

Mauricio G. Cea, Graham C. Goodwin

    School of Electrical Engineering and Computer Science, University of Newcastle, Australia

Article info

    Article history:

    Received 16 August 2011

    Received in revised form

    17 November 2011

Accepted 21 November 2011

Available online 28 February 2012

    Keywords:

    Event-based sampling

    Sampling systems

    Vector quantization

    Non-linear filters

    Non-linear systems

Abstract

    Most of the existing approaches to estimation and control are based on the premise that regular

    sampling is used. However, in some applications, there exists strong motivation to use event rather

    than time based sampling. For example, in sensor networks, it is often desirable to send data only

    when something interesting happens. This paper explores some of the issues involved in event based

    sampling in the context of non-linear filtering. Several examples are presented to illustrate the ideas.

© 2012 Elsevier Ltd. All rights reserved.

    1. Introduction

Most current implementations of digital control and estimation use regular sampling with fixed period T; see e.g. Middleton and Goodwin (1990), Feuer and Goodwin (1996), Astrom and Wittenmark (1990) and Hristu-Varsakelis and Levine (2005). However, there is often strong practical motivation to change this paradigm to one in which one only takes samples when something interesting happens. This changes the focus to, so-called, event based sampling. In this paper, we consider that a measurement is sent only when the measured variable crosses a given threshold. Thus the sampling is not regular. The latter strategy has many advantages, including conserving valuable communication resources in the context of networked control or sensor networks.

There is a growing literature on event based sampling. An early seminal paper was that of Astrom and Bernhardsson (2002). Other related publications include Arzen (1999), Anta and Tabuada (2009, 2008), Byrnes and Isidori (1989), Otanez, Moyne, and Tilbury (2002), Tabuada (2007), Le and McCann (2007), McCann and Le (2008), Pawlowski et al. (2009), and Xu and Cao (2011). As pointed out in Anta and Tabuada (2010), event based sampling and control are particularly attractive for non-linear systems, since the nature of the system response can be operating point dependent, and this may mean that different sampling strategies are desirable at different operating points.

The current paper examines some of the issues related to event based sampling for non-linear filtering. An event based non-linear filter is developed. It is also shown how such a filter can be implemented using approximate non-linear filtering algorithms, including particle filtering (Chen, 2003; Handschin & Mayne, 1969; Schon, 2006) and minimum distortion filters (Cea, Goodwin, & Feuer, 2010; Goodwin, Feuer, & Muller, 2010).

    One issue that needs careful consideration in the context of

    event based filtering is that of the anti-aliasing filter. It is argued

    here that an alternative viewpoint needs to be adopted for the

    design of this filter.

    The layout of the remainder of this paper is as follows: Section 2

    reviews continuous time stochastic models. Section 3 describes

    basic sampling strategies. Section 4 describes the core ideas behind

    regular and event based sampling. Section 5 describes sampled data

    models. Section 6 reviews the traditional discrete non-linear filter.

Section 7 details modifications that are required in the discrete non-linear filter to incorporate event based sampling. Section 8 briefly describes approximate discrete non-linear filters. Section 9 presents

    a realistic example. Section 10 draws conclusions.

    2. A continuous time non-linear model

    Most physical systems evolve in continuous time and are

    hence described by ordinary differential equations. A stochastic

    version of such equations takes the following conceptual form:

$$\frac{dx}{dt} = f_c(x) + g_c(x)\,\frac{d\omega}{dt} \qquad (1)$$

Control Engineering Practice 20 (2012) 963–971. doi:10.1016/j.conengprac.2011.11.008

This paper is built upon the plenary presentation: Graham C. Goodwin, Temporal and Spatial Quantization in Nonlinear Filtering, AdConIP, Hangzhou, China, 2011.

Corresponding author. E-mail addresses: [email protected], [email protected] (M.G. Cea), [email protected] (G.C. Goodwin).

$$\frac{dz}{dt} = h_c(x) + \frac{d\nu}{dt} \qquad (2)$$

where $x \in \mathbb{R}^n$ is the state vector and $dz/dt \in \mathbb{R}^m$ is the measured output vector. In (1) and (2), $d\omega/dt$ and $d\nu/dt$ represent independent continuous time white noise processes of intensity $Q_c$ and $R_c$ respectively. An important observation is that continuous time

white noise does not exist in any meaningful sense. (For example, if one calculates the auto-covariance of such a process, then it takes the form $Q_c\,\delta(\tau)$, where $\delta$ is the Dirac delta function.) To overcome this difficulty, it is often more insightful to use a spectral density description for the noise. The spectral density is the Fourier transform of the autocorrelation, i.e.

$$\text{Spectral density of } \frac{d\omega}{dt} = \int_{-\infty}^{\infty} Q_c\,\delta(\tau)\,e^{-j\omega\tau}\,d\tau = Q_c \qquad (3)$$

Thus $Q_c$ is the spectral density of the process $\{d\omega/dt\}$. White noise has constant spectral density over an infinite bandwidth. This

observation allows one to supplement the notion of white noise by the notion of broad band noise, which has constant spectrum over a wide (but not infinite) bandwidth. Indeed, it turns out that whiteness of the process and measurement noise is largely irrelevant to the operation of an optimal filter. What is actually needed is that the spectrum be substantially constant in key regions. This issue is discussed in detail in Goodwin, Aguero, Salgado, and Yuz (2009). These ideas expose a difficulty with the common practice of using variances to describe noise in the discrete time case. For example, say that the noise is broadband (but non-white) having spectral density $Q$ covering a bandwidth of $W$; then the associated variance $V$ is equal to the area under the spectrum, i.e. $V = WQ$. If one uses spectral density to describe the noise, then no difficulties will be encountered, since the noise intensity has been correctly captured. However, say that the Nyquist frequency, $1/2\Delta$, is greater than the noise bandwidth. Then, if one uses variance to describe the associated filter, the variance must be artificially scaled to $V' = V/(W\Delta)$ to match the spectral densities. If this is not done then the associated filter will perform badly due to underestimation of the noise intensity.

A related problem is that variance does not indicate the difficulty of an estimation problem. For example, consider the case of very fast sampling. Then $1/\Delta$ will be large. In this case, a small noise intensity, i.e. a small spectral density, could be associated with a large noise variance. Yet most of this noise power will lie at frequencies above the bandwidth of the system. Intuitively, this part of the noise will not affect the filter performance. Again, it is only the spectral density in relevant parts of the spectrum that affects filter performance.

The above difficulties are overcome if one works with spectral density rather than variance. Moreover, this aligns the continuous and discrete cases, since spectral density (or equivalently incremental variance) is exclusively used in the continuous case.

    In view of the above discussion, Eqs. (1) and (2) are more

    appropriately expressed in incremental form as:

$$dx = f_c(x)\,dt + g_c(x)\,d\omega \qquad (4)$$

$$dz = h_c(x)\,dt + d\nu \qquad (5)$$

where the processes $\omega$ and $\nu$ correspond to Brownian motion processes having incremental covariance $Q_c\,dt$ and $R_c\,dt$ respectively. Also, as discussed above, $Q_c$ and $R_c$ can equivalently be thought of as spectral densities for $d\omega/dt$ and $d\nu/dt$ respectively.

The linear equivalents of Eqs. (4) and (5) are

$$dx = A_c x\,dt + d\omega \qquad (6)$$

$$dz = C_c x\,dt + d\nu \qquad (7)$$

where $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A_c \in \mathbb{R}^{n\times n}$, $C_c \in \mathbb{R}^{m\times n}$, $d\omega \in \mathbb{R}^n$ and $d\nu \in \mathbb{R}^m$ are the state, measured output, system matrices, process noise and measurement noise respectively. The initial state satisfies $E\{x_0\} = \hat{x}_0$ and $E\{(x_0-\hat{x}_0)(x_0-\hat{x}_0)^T\} = P_0$. In the linear case, $\omega$ and $\nu$ are assumed to be stationary vector Wiener processes with incremental covariance $Q_c\,dt$ and $R_c\,dt$ respectively. The matrices $Q_c$ and $P_0$ are assumed to be symmetric and positive semidefinite, and $R_c$ is assumed to be symmetric and positive definite.

    3. Choice of sampling strategy

Consider first the case of regular sampling with fixed period $\Delta$. (This is sometimes called Riemann sampling (Astrom & Bernhardsson, 2002). Here the focus is on the independent time variable.)

In Section 2, $dz/dt$ was defined as the continuous time output (see Eqs. (2), (5) and (7)). The next step is to develop the form of the model when samples are taken. However, this begs the question: samples of what? Two possible options are explored below for the sampled output.

3.1. Direct sampling of dz/dt

At first glance, it seems plausible that one could directly sample the continuous process $dz/dt$. However, this choice is actually infeasible, since samples of the associated noise, $d\nu/dt$, would have infinite variance!

    3.2. Sampling after passing through an anti-aliasing filter

    An appropriate remedy to the difficulty described in Section 3.1

is to pass $dz/dt$ through an anti-aliasing filter prior to sampling. A common choice for such a filter is to simply average $dz/dt$ over the sample period. Actually, some form of averaging is inherent in all low pass filters that are typically used as anti-aliasing filters. In the case of averaging, the sampled output satisfies:

$$y_k = \frac{1}{\Delta}\int_{k\Delta}^{(k+1)\Delta} \frac{dz}{dt}\,dt \qquad (8)$$

$$y_k = \frac{1}{\Delta}\left\{z_{(k+1)\Delta} - z_{k\Delta}\right\} \qquad (9)$$

To obtain a notation for the sampled data case which resembles the continuous case, the (discrete) increment in $z$ is defined via

$$dz^+ = z_{(k+1)\Delta} - z_{k\Delta} \qquad (10)$$

where the superscript $+$ denotes the next sampled value. In this case, Eq. (9) can be rewritten as

$$y_k = \frac{1}{\Delta}\,dz^+ \qquad (11)$$

    4. Event based sampling

Next consider the case of event based sampling. (This is sometimes called Lebesgue sampling (Astrom & Bernhardsson, 2002). Here the focus is on the dependent variable.)

Let $\{q_{i,j}\}$ be a set of quantization levels for the jth output. These quantization levels could, for example, be evenly spaced, so that

$$q_{i+1,j} - q_{i,j} = L_j \in \mathbb{R} \quad \text{for } j = 1,\ldots,n \qquad (12)$$

In event based sampling, the measured output is transmitted only when a quantization level has been crossed. Moreover, provided no bits are lost and provided a starting signal level is known, then only 1 bit/sample needs to be sent to indicate that the signal has moved to the next interval above ($+1$) or the next interval below ($-1$). The difference between Riemann and Lebesgue sampling is illustrated in Fig. 1.


Next consider the design of the anti-aliasing filter. Here, a little more care is needed than in the case of Riemann sampling. Specifically, it is required that interesting events should trigger sampling. This raises the need to trade off noise immunity against sensitivity to change. To illustrate, say that one uses the averaging filter given in (8) and (9). Then a sudden change in output may be masked by the effect of averaging an (almost) constant signal over a long period of time. Hence it is desirable to place a lower limit on the bandwidth of the anti-aliasing filter so as to achieve a compromise between sensitivity and noise averaging. This trade-off does not arise in Riemann sampling, since there is no need to detect changes. In the case of the averaging filter, the trade-off can be achieved by simply resetting the averager when the sample period goes beyond some pre-determined upper limit, say $\Delta_{\max}$.

There also exists a close connection between the choice of the anti-aliasing filter bandwidth and the quantization thresholds used in the event based sampler. The reason is that one needs to ensure that measurement noise does not cause frequent triggering of the event based sampling even if the signal component is substantially constant.

    Simple design guidelines can be developed as follows:

Say that the measurement noise is broadband with spectral density $R$ and that an anti-aliasing filter with reset period $\Delta_{\max}$ is used. Then the corresponding discrete measurement noise will have variance of approximately $R/\Delta_{\max}$. Assume that the quantization level spacing is $L$ and say that spurious triggering of the event based sampler should be avoided with high probability. This can be achieved by requiring that there is only a small probability that the discrete measurement noise has magnitude greater than $L/2$. To achieve this one might require

$$2\sigma \le L/2 \qquad (13)$$

where $\sigma$ is the discrete noise standard deviation, i.e. $\sqrt{R/\Delta_{\max}}$. Eq. (13) is equivalent to

$$L \ge 4\sqrt{R/\Delta_{\max}} \qquad (14)$$

This equation links the anti-aliasing filter bandwidth, $1/\Delta_{\max}$, the noise spectral density $R$ and the quantization level spacing $L$ so as to achieve a low probability that the noise will be greater than $L/2$. In practice, it is desirable to choose $\Delta_{\max}$ as small as possible subject to satisfying (14), since large values of $\Delta_{\max}$ compromise one's ability to detect changes in the signal component.
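As a worked instance of guideline (14) (the numbers below are hypothetical, chosen only for illustration):

```python
import math

R = 0.6       # assumed broadband measurement noise spectral density
D_max = 1.0   # assumed anti-aliasing filter reset period (seconds)

sigma = math.sqrt(R / D_max)   # approximate discrete noise standard deviation
L_min = 4.0 * sigma            # smallest level spacing satisfying (14)
print(f"sigma = {sigma:.3f}; choose level spacing L >= {L_min:.3f}")
```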

    5. Discrete time models

Here the model update period, denoted $\Delta$, is not necessarily the same as the sampling period, denoted $T$. Note that typically $T > \Delta$, especially when event based sampling is utilized. For simplicity, the anti-aliasing filter is fixed as an averaging filter having period $\Delta$. However, the extension to other anti-aliasing filters is straightforward.

    5.1. The conventional discrete model for linear systems

    First consider the linear case of (6) and (7). This case will

    reveal several modelling issues which apply, mutatis mutandis, to

    the non-linear case.

Consider the linear continuous model (6) and (7). An exact discrete time model describing the samples can be readily shown to be

$$x^+ = A_d x + \omega \qquad (15)$$

$$y = C_d x + \nu \qquad (16)$$

where the system matrices take the following specific values:

$$A_d = e^{A_c\Delta} = I + A_c\Delta + \frac{(A_c\Delta)^2}{2} + \cdots \qquad (17)$$

$$C_d = \frac{1}{\Delta}\,C_c A_c^{-1}\left(e^{A_c\Delta} - I\right) = C_c\left(I + \frac{1}{2!}A_c\Delta + \frac{1}{3!}A_c^2\Delta^2 + \cdots\right) \qquad (18)$$

The corresponding process and output noise processes have zero mean and covariance:

$$\Sigma_d = E\left\{\begin{bmatrix}\omega_k\\ \nu_k\end{bmatrix}\begin{bmatrix}\omega_k\\ \nu_k\end{bmatrix}^T\right\} = \begin{bmatrix}Q_d & S_d\\ S_d^T & R_d\end{bmatrix} \qquad (19)$$

where the covariance matrix is given by

$$\Sigma_d = D\left(\int_0^{\Delta} e^{\bar{A}t}\begin{bmatrix}Q_c & 0\\ 0 & R_c\end{bmatrix} e^{\bar{A}^T t}\,dt\right)D \qquad (20)$$

and where

$$\bar{A} = \begin{bmatrix}A_c & 0\\ C_c & 0\end{bmatrix} \;\Rightarrow\; e^{\bar{A}t} = \begin{bmatrix}e^{A_c t} & 0\\ C_c\int_0^t e^{A_c s}\,ds & I\end{bmatrix} \qquad (21)$$

$$D = \begin{bmatrix}I & 0\\ 0 & \frac{1}{\Delta}I\end{bmatrix} \qquad (22)$$

Even though the above sampled system is an exact description for every finite $\Delta$, the model is a source of conceptual and numerical problems when the sampling period decreases to zero. For example, it is readily seen that, as $\Delta \to 0$:

$$A_d \to I \qquad (23)$$

$$\Sigma_d \to \begin{bmatrix}0 & 0\\ 0 & \infty\end{bmatrix} \qquad (24)$$

These results show that the discrete-time model (15) and (16) will be the source of difficulties as the sampling interval becomes small: the $A_d$ matrix becomes the identity matrix, and the noise covariance matrix $\Sigma_d$ tends to the uninformative values given in (24). These difficulties can be readily resolved by appropriate scaling of the model equations. This is shown in the next subsection.
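For numerical work, the exact quantities (17), (18) and (20)-(22) can all be obtained from augmented matrix exponentials (Van Loan's technique). The sketch below is our own illustration, assuming NumPy/SciPy are available; it is not code from the paper.

```python
import numpy as np
from scipy.linalg import expm

def exact_discretization(Ac, Cc, Qc, Rc, D):
    """Exact sampled-data model for the averaged output, eqs. (17)-(22),
    computed via Van Loan style augmented matrix exponentials."""
    n, m = Ac.shape[0], Cc.shape[0]

    # A_d = e^{Ac*D}; the top-right block of this exponential is
    # int_0^D e^{Ac s} ds, which avoids inverting a possibly singular Ac.
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n], M[:n, n:] = Ac, np.eye(n)
    E = expm(M * D)
    Ad = E[:n, :n]                              # eq. (17)
    Cd = Cc @ E[:n, n:] / D                     # eq. (18)

    # Joint noise covariance, eqs. (20)-(22), via Van Loan's integral:
    # expm([[-Abar, Qbar], [0, Abar^T]] D) yields the integral as F2^T F12.
    Abar = np.block([[Ac, np.zeros((n, m))], [Cc, np.zeros((m, m))]])
    Qbar = np.block([[Qc, np.zeros((n, m))], [np.zeros((m, n)), Rc]])
    N = n + m
    V = np.zeros((2 * N, 2 * N))
    V[:N, :N], V[:N, N:], V[N:, N:] = -Abar, Qbar, Abar.T
    F = expm(V * D)
    integral = F[N:, N:].T @ F[:N, N:]          # int_0^D e^{Abar t} Qbar e^{Abar^T t} dt
    Dmat = np.block([[np.eye(n), np.zeros((n, m))],
                     [np.zeros((m, n)), np.eye(m) / D]])
    Sigma_d = Dmat @ integral @ Dmat            # eq. (20)
    return Ad, Cd, Sigma_d
```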

[Fig. 1. Riemann vs Lebesgue sampling: regular time sampling vs regular spatial sampling.]


    5.2. Incremental form of the sampled data linear model

    Here, an alternative formulation of the discrete-time model,

    which has the same structure as the continuous-time model, is

    presented. The key tool used is to introduce appropriate scaling so

    that the limit D-0 is meaningful. The alternative model provides

    conceptual advantages and superior numerical behavior at fast

    sampling rates, see Goodwin, Middleton, and Poor (1992), Feuer

    and Goodwin (1996), and Middleton and Goodwin (1990).The problems illustrated in (24) and (23) suggest that the

    traditional approach to describing discrete-time models is not

    appropriate when fast sampling rates are employed. The remedy

    is to scale the equations to produce an equivalent incremental

    model1 expressed as follows:

    dx xk 1xk AixkDxk 25

    dz zk 1zk Dyk CixkDmk yk 26where it is readily seen using (17) and (18) that

    Ai AdID

    Ac 12

    A2cD 27

    Ci

    Cd

    Cc

    28

The initial state satisfies $E\{x_0\} = \hat{x}_0$ and $E\{(x_0-\hat{x}_0)(x_0-\hat{x}_0)^T\} = P_0$. The new process noise sequence is $\bar{\omega}_k = \omega_k$, having covariance $E\{\bar{\omega}_k\bar{\omega}_k^T\} = Q_d$. For consistency with the continuous case, the noise covariance is expressed in incremental form (or equivalently using spectral density) by scaling by the sample period. Thus let

$$Q_d = Q_i\Delta = \left[Q_c + \frac{\Delta}{2}\left(A_c Q_c + Q_c A_c^T\right) + \cdots\right]\Delta \qquad (29)$$

where $Q_i$ can be interpreted as either incremental covariance or discrete noise spectral density.

For the system output equation, it is clear that, when an integrating anti-aliasing filter is used, the expression obtained for the output corresponds to increments of the variable $z$, i.e.

$$\bar{y}_k = \Delta y_k = \int_{k\Delta}^{(k+1)\Delta} dz = z_{k+1} - z_k \qquad (30)$$

The measurement noise sequence is now $\bar{\nu}_k = \Delta\nu_k$, having incremental covariance expressed as $E\{\bar{\nu}_k\bar{\nu}_k^T\} = R_i\Delta$, where

$$R_i\Delta = \Delta^2 R_d = \left[R_c + \frac{\Delta^2}{3}C_c Q_c C_c^T + \cdots\right]\Delta \qquad (31)$$

The cross-covariance is $E\{\bar{\omega}_k\bar{\nu}_k^T\} = S_i\Delta = S_d\Delta$.

Finally, if $\Delta$ is small, then the incremental model matrices can be approximated by retaining the first term in the expansions (27), (28), (29) and (31) respectively, i.e.

$$A_i \approx A_c,\quad C_i \approx C_c,\quad Q_i \approx Q_c,\quad R_i \approx R_c,\quad S_i \approx 0 \qquad (32)$$

Thus at fast sampling rates the incremental matrices are approximately the same as the underlying continuous time matrices. Note that the approximations given in (32) are equivalent to using Euler integration to obtain the incremental model. Also note that the use of Euler integration gives an approximation, whereas use of incremental models can give an exact description.
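The scaling relationships can be collected into a small helper. The following sketch (our illustration, assuming NumPy) converts the conventional discrete matrices into the incremental quantities of (25)-(31).

```python
import numpy as np

def incremental_model(Ad, Cd, Qd, Rd, Sd, D):
    """Rescale the conventional discrete model (15)-(19) into the
    incremental (delta operator) form (25)-(31). As D -> 0 these
    quantities tend to the continuous time matrices, cf. (32)."""
    Ai = (Ad - np.eye(Ad.shape[0])) / D   # eq. (27)
    Ci = Cd                               # eq. (28)
    Qi = Qd / D                           # eq. (29): Qd = Qi * D
    Ri = D * Rd                           # eq. (31): Ri * D = D^2 * Rd
    Si = Sd                               # cross term: Si * D = Sd * D
    return Ai, Ci, Qi, Ri, Si
```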

    5.3. Incremental form of the sampled data non-linear model

Next, consider the non-linear case under the assumption that $\Delta$, the model update period, is sufficiently small so that Euler integration gives a discrete model of sufficient accuracy (in practice this may require some experimentation to find a suitable value for $\Delta$). Also note that $\Delta$ is the model update period, which is not necessarily equal to the sampling period $T$. The discrete model in incremental form can then be written as

$$\delta x \triangleq x_{k+1} - x_k = f_i(x_k)\Delta + \bar{\omega}_k \qquad (33)$$

$$\delta z \triangleq z_{k+1} - z_k = \bar{y}_k = h_i(x_k)\Delta + \bar{\nu}_k \qquad (34)$$

where

$$E\{\bar{\omega}_k\bar{\omega}_k^T\} = Q_i(x_k)\Delta \qquad (35)$$

$$E\{\bar{\nu}_k\bar{\nu}_k^T\} = R_i\Delta \qquad (36)$$

Also, if one uses Euler integration, the functions $f_i$, $h_i$, $Q_i$, $R_i$ can be directly linked to the corresponding continuous functions as follows:

$$f_i(x) \to f_c(x) \qquad (37)$$

$$h_i(x) \to h_c(x) \qquad (38)$$

$$Q_i \to Q_c \qquad (39)$$

$$R_i \to R_c \qquad (40)$$

    6. Review of the traditional discrete non-linear filter

The traditional discrete non-linear filter can now be directly formulated. The changes necessary to deal with event based sampling are dealt with later. Thus, consider a discrete time stochastic non-linear model of the form (33) and (34).

The problem of interest is to compute $p(x_k \mid Y_k)$, the conditional distribution of the state at time $k$ given observations of $y$ up to and including time $k$, i.e. $Y_k = \{y_0,\ldots,y_k\}$.

A recursive set of equations is presented below that yields the solution to the above problem (see also Jazwinski, 1970).

One proceeds sequentially by first assuming that $p(x_0 \mid Y_{-1})$ is known. For example, this distribution might be Gaussian with mean $\hat{x}_0$ and covariance $P_0$.

Next assume that $p(x_k \mid Y_k)$ is known. Then the following state update law holds:

$$p(x_{k+1} \mid Y_k) = \int p(x_k \mid Y_k)\,p(x_{k+1} \mid x_k)\,dx_k \qquad (41)$$

The impact of adding an observation, i.e. $y_{k+1}$, is described by

$$p(x_{k+1} \mid Y_{k+1}) = p(x_{k+1} \mid Y_k, y_{k+1}) = \frac{p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})}{\int p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dx_{k+1}} \qquad (42)$$

Eqs. (41) and (42) are often referred to as the Chapman–Kolmogorov equation and Bayes' rule respectively.
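For a scalar state, (41) and (42) can be approximated on a fixed grid. The following sketch is our illustration (not the authors' code); it assumes the Gaussian noise model (33)-(34) with Euler drift, and implements one prediction step and one measurement-update step.

```python
import numpy as np

def ck_predict(p, x_grid, f, Qi, D):
    """Chapman-Kolmogorov step (41) on a fixed grid: propagate the
    posterior p through the Gaussian transition density of model (33)."""
    var = Qi * D                                   # incremental covariance
    mean = x_grid + f(x_grid) * D                  # Euler drift per grid point
    trans = np.exp(-0.5 * (x_grid[:, None] - mean[None, :])**2 / var)
    trans /= trans.sum(axis=0, keepdims=True)      # column-normalize transitions
    return trans @ p

def bayes_update(p, x_grid, y, h, Ri, D):
    """Bayes rule (42): reweight the prediction by the likelihood of the
    increment observation y = h(x)*D + noise, then renormalize."""
    lik = np.exp(-0.5 * (y - h(x_grid) * D)**2 / (Ri * D))
    p = p * lik
    return p / p.sum()
```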

    7. Modifications to deal with down-sampling

As argued in the Introduction, it may be highly inefficient to sample very quickly. Thus some form of down-sampling, or event based sampling, may be beneficial. Say that one begins with a sample period $\Delta$. Then one can down-sample in several ways. Two alternatives are discussed below.

    7.1. Regular down-sampling

Say that it is desired to change the sample period by a fixed factor, e.g. from $\Delta$ to $m\Delta$, where $\Delta$ is assumed very small relative to the natural dynamics of the system. There are some subtle issues that need to be considered.

¹ Sometimes called a delta operator model in the literature (Middleton & Goodwin, 1990).

The non-linear filter is now updated only at period $m\Delta$. Assume that the original anti-aliasing filter is reset every $\Delta$ seconds, not every $m\Delta$ seconds. The correct strategy is now definitely not to simply take every mth sample and throw the rest away! Clearly this would lead to a highly suboptimal filter, since most of the data would have been discarded. On the contrary, if one decides to increase the sampling period from $\Delta$ to $m\Delta$, then a new anti-aliasing filter relevant to the new sample period $m\Delta$ is desirable. For example, say that one uses the usual averaging filter; then a new observation sequence²

$$\delta\bar{z}_l = \sum_{k=1}^{m} \delta z_{m(l-1)+k} \qquad (43)$$

can be digitally constructed before using the discrete non-linear filter.
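A direct way to realize (43) digitally is to sum each block of m fine-grained increments; the sketch below is our illustration of that step, assuming NumPy.

```python
import numpy as np

def downsample_increments(dz, m):
    """Digital anti-aliasing for down-sampling, eq. (43): sum each block
    of m fine increments dz into one coarse increment, rather than
    discarding m-1 out of every m samples."""
    n = (len(dz) // m) * m                  # drop any incomplete tail block
    return np.asarray(dz[:n]).reshape(-1, m).sum(axis=1)
```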

    If implemented properly, the step of down-sampling can lead

    to major computational improvements without significant degra-

    dation in performance. Indeed, the example below shows that, in

    this illustrative case, one can down-sample by several orders

    of magnitude with a corresponding reduction of several orders of

magnitude in the computational effort without significantly changing the computed conditional probability.

    7.1.1. Example

Consider the following simple discrete non-linear system:

$$x_{t+1} = a x_t + \omega'_t \qquad (44)$$

$$y_t = x_t^2 + \nu'_t \qquad (45)$$

where $a = 0.999$, $E\{(\omega'_t)^2\} = 10^{-2}$, $E\{(\nu'_t)^2\} = 10^{4}$ and $\Delta = 10^{-3}$. The magnitudes of $E\{(\omega'_t)^2\}$ and $E\{(\nu'_t)^2\}$ may seem counterintuitive, but these scalings are a consequence of the ideas described earlier in Section 5.1. The system (44) and (45) is actually more intuitive when expressed in the equivalent incremental form:

$$\delta x = f_i(x_k)\Delta + \bar{\omega}_k \qquad (46)$$

$$\delta z = h_i(x_k)\Delta + \bar{\nu}_k \qquad (47)$$

where $f_i(x_k) = -x_k$, $h_i(x_k) = x_k^2$, and $\bar{\omega}_k$, $\bar{\nu}_k$ both have incremental covariance $10\Delta$.

It seems heuristically clear that the sample period of $10^{-3}$ may lead to wasted computational effort. Thus down-sampling is introduced using the strategy explained in (43). Figs. 2–4 show the evolution of $p(x_k \mid Z_k)$ for $\Delta = 10^{-3}$ and the down-sampled versions $m\Delta = 10^{-2}$ and $m\Delta = 10^{-1}$ respectively. Inspection of the plots indicates that there is no noticeable deterioration in the computed posterior probability. However, at $\Delta = 10^{-1}$, the total computational load has been reduced by two orders of magnitude relative to the use of $\Delta = 10^{-3}$! Note that the introduction of the new anti-aliasing filter in (43) is crucial in achieving these results.

    7.2. Event based sampling

At first glance it may seem that the extension to event based sampling is immediate, i.e. all one needs to do is run the state update (41) at period $\Delta$ (chosen sufficiently small so that Euler integration gives an adequate approximation) and then use the observation update (42) when one decides that a sufficiently interesting change in the output has occurred. Certainly the observations are only needed when a threshold has been crossed. However, it is not true that there is zero relevant information between threshold crossings. On the contrary, there is a valuable piece of information, namely that the output has not crossed a threshold. Hence, estimates can continue to be updated between

[Fig. 2. Time evolution of the probability density function at fast sampling, Δ = 0.001.]

[Fig. 3. Time evolution of the probability density function at fast sampling, Δ = 0.01.]

[Fig. 4. Time evolution of the probability density function at fast sampling, Δ = 0.1.]

    2 Note that this step of using a new digital anti-aliasing filter is very helpful

    and does not appear to be widely appreciated.


    threshold crossings provided an appropriate change is made to the

    observation update formula. Specifically, consider the situation

illustrated in Fig. 5 where, at the kth time instant, it is known that

$$y_k \in [Q_a, Q_b] \triangleq \mathcal{Q}_k \qquad (48)$$

The observation update (42) in the non-linear filter can now be modified to the following, which explicitly utilizes (48):

$$p(x_{k+1} \mid Y_k,\, y_{k+1} \in \mathcal{Q}_k) = \frac{\int_{\mathcal{Q}_k} p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dy_{k+1}}{\int\!\int_{\mathcal{Q}_k} p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dy_{k+1}\,dx_{k+1}} \qquad (49)$$

Note that if one simply chooses not to update the states, then

    49Note that if one simply chooses not to update the states, then

    the state estimation uncertainty will grow due to the drift term

    inherent in the state update (33). Use of (49) avoids this problem.

    Actually, this is different from the common strategy used in much

    of the existing event based sampling literature where updates are

    usually restricted to cases when a threshold is crossed (Anta &

    Tabuada, 2009, 2008; Arzen, 1999; Byrnes & Isidori, 1989; Le &

    McCann, 2007; McCann & Le, 2008; Otanez et al., 2002; Pawlowski

    et al., 2009; Tabuada, 2007; Xu & Cao, 2011). Some authors

    e.g. Sijs and Lazar (2009) and Marck and Sijs (2010) have noted

    that, for the case of linear filtering, it is desirable to continue

    to update based on the known fact that the output lies withinthe quantization threshold. This is the idea captured in (49) for

    the case of non-linear filtering. Of course, in practice, the integrals

    in (49) will need to be approximated. The approximation issue

    is discussed below via particle filtering and vector quantiza-

    tion methods.
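For Gaussian measurement noise the inner integral in (49) has a closed form via the normal CDF, so the modification amounts to replacing the point likelihood in a grid filter with an interval probability. The sketch below is our illustration under that Gaussian assumption, with the thresholds applied to the increment observation of model (34).

```python
import numpy as np
from scipy.stats import norm

def interval_bayes_update(p, x_grid, Qa, Qb, h, Ri, D):
    """Event-based observation update (49): between threshold crossings
    the only information is y in [Qa, Qb], so the point likelihood in
    (42) is replaced by the probability mass that the Gaussian
    measurement noise assigns to that interval."""
    sd = np.sqrt(Ri * D)
    mean = h(x_grid) * D
    lik = norm.cdf(Qb, mean, sd) - norm.cdf(Qa, mean, sd)
    p = p * lik
    return p / p.sum()
```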

    8. Spatial quantization

Next consider the issue of spatial quantization. As is clear from (41) and (42), the conditional probability for the states is a function in a high dimensional space. Also, evolution of this function requires the evaluation of high dimensional integrals, as is evident from the right hand sides of (41) and (42). Such integrals cannot be computed in practice without some form of discretization of the spatial coordinates. Two strategies are described below to achieve spatial quantization, namely particle filtering and minimum distortion filtering. The former strategy has its strength in that the number of particles is independent of dimension, but it requires a large number of points to accurately describe the problem. The latter strategy uses a small number of points for low dimensional problems, but its computational cost increases in higher dimensions due to the need for extra grid points.

    8.1. Particle filtering

This technique achieves spatial quantization by drawing a set of random samples from the disturbance distribution. Thus, a discrete approximation to the posterior distribution is generated which is based on a set of randomly chosen points. The approximation converges, in probability, with order $1/\sqrt{N}$, where $N$ is the number of chosen samples (Crisan & Doucet, 2002). The main disadvantage of this strategy is that a large number of points may be needed. Also, these points need, in principle, to be related to the distribution of interest, and the method suffers from degeneracy of the particles. Moreover, the number of points will grow exponentially with time unless some form of reduction is used. Thus, many fixes are needed to get this type of algorithm to work in practice. Such fixes include the use of proposal distributions, resampling methods, etc. For details the reader is referred to Chen (2003).
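For reference, one step of the basic bootstrap particle filter for model (33)-(34) can be sketched as follows (a minimal illustration of ours; practical implementations add the proposal and resampling refinements mentioned above).

```python
import numpy as np

def bootstrap_pf_step(particles, y, f, h, Qi, Ri, D, rng):
    """One bootstrap particle filter step for model (33)-(34):
    propagate, weight by the observation likelihood, resample."""
    N = particles.size
    particles = particles + f(particles) * D \
        + rng.normal(scale=np.sqrt(Qi * D), size=N)          # predict
    w = np.exp(-0.5 * (y - h(particles) * D)**2 / (Ri * D))  # weight
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                         # resample
    return particles[idx]
```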

    8.2. Minimum distortion filtering (MDF)

This is a new class of algorithm. It was first described in Goodwin et al. (2010). The MDF algorithm belongs to the class of deterministic gridding methods. There exist many algorithms within this framework. Some of them use fixed grid methods, where the choice of the grid is based on some pre-known information regarding the problem which is never updated. Another method is presented in Bucy and Senne (1971), where a gridding method based on the mean plus an ellipsoid determined by the covariance of the probability density function is used. Other approaches include adaptive uniform grid methods, e.g. Bergman (1998), where a uniform resolution grid with adaptive resolution is used and the grid is relocated depending on the likelihood of the current grid points. By contrast, the MDF algorithm is a method where the grid is non-uniform and is adapted at each sampling instant. The adaptation step depends on vector quantization of the current estimate of the probability density function. This technique provides the algorithm with the capacity to relocate grid points where they are most needed. The non-uniform characteristic allows for a tailored location of the grid, without wasting points in unimportant regions, e.g. between the modes of a multimodal distribution. A summary of the algorithm is presented below.

The key idea underlying this class of algorithm is to utilize vector quantization to generate, on-line, a finite approximation to the a-posteriori distribution of the states.

Say that one begins with a discrete approximation to the distribution of $x_0$ on $N_x$ grid points. Also assume that one has a finite approximation to the distribution of the process noise on $N_w$ grid points. These approximations can be generated off-line. Then, utilizing the discretized version of Eq. (41), one obtains a finite approximation to $p(x_1)$ on $N_x N_w$ grid points. Then, one uses the discrete equivalent of (42) to obtain a finite approximation to $p(x_1 \mid y_1)$ on $N_x N_w$ points. Finally, one uses vector quantization ideas to re-approximate $p(x_1 \mid y_1)$ back to $N_x$ points. (How this crucial last step is performed will be described in detail below.) Then, one returns to the beginning to obtain a discrete approximation to $p(x_2 \mid y_1)$ on $N_x N_w$ points, and so on. The algorithm is summarized in Table 1.

    The key step in the MDF algorithm is the vector quantization

    step (step 5 in Table 1). Details of this step are given below.

[Fig. 5. Inter-sample illustration.]

Table 1. MDF algorithm.

Step | Description
1 | Initialization: quantize $p(x_0)$ to $N_x$ points $\{x_i, p_i\}$, $i = 1,\ldots,N_x$. Quantize $p(\omega)$ to $N_w$ points $\{w_j, q_j\}$, $j = 1,\ldots,N_w$
2 | Begin with $p(x_k \mid Y_k)$ represented by $\{x_i, p_i\}$, $i = 1,\ldots,N_x$
3 | Approximate $p(x_{k+1} \mid Y_k)$ via (41) on $N_x N_w$ points
4 | Evaluate $p(x_{k+1} \mid Y_{k+1})$ on $N_x N_w$ points via (42)
5 | Quantize back to $N_x$ points
6 | Go to 2


Assume one has a discrete representation of some distribution $p(x)$, where $x \in \mathbb{R}^n$, quantized to a very large (but finite) set of points. The goal is to quantize $p(x)$ to a smaller finite set of points $\{x_i, p_i\}$, $i = 1,\ldots,N$. The first step in vector quantization is to define a measure to quantify the distortion of a given discrete representation. This measure is then optimized to find the optimal representation which minimizes the cost. In summary, one seeks a finite set $W_x = \{x_1,\ldots,x_N\}$ and an associated collection of sets $S = \{S_1,\ldots,S_N\}$ such that $\bigcup_{i=1}^N S_i = \mathbb{R}^n$ and $S_i \cap S_j = \emptyset$, $i \ne j$. The quantities $W_x$, $S_x$ are chosen by minimizing a cost function of the form:

$$J(W_x, S_x) = \sum_{i=1}^{N} E\left\{(x - x_i)^T W (x - x_i) \mid x \in S_i\right\} \qquad (50)$$

where $W = \mathrm{diag}(W_1,\ldots,W_N)$. Other choices of the distance measure can also be used, e.g. Manhattan, $L_1$, Jaccard, etc.; see Tan, Steinbach, and Kumar (2005).

If $x_1,\ldots,x_N$ (the set of grid points) are given, then the optimal choice of the sets $S_i$ is the, so-called, Voronoi cells (Gersho & Gray, 1992; Graf & Luschgy, 2000):

$$S_i = \left\{x \mid (x - x_i)^T W (x - x_i) \le (x - x_j)^T W (x - x_j);\; \forall j \ne i\right\} \qquad (51)$$

Similarly, if the sets $S_1,\ldots,S_N$ are given, then the optimal choice for $x_i$ is the centroid of the set $S_i$, i.e.

$$x_i = E\{x \mid x \in S_i\} \qquad (52)$$

Many algorithms exist for minimizing functions of the form (50) to produce a discrete approximation. One class of algorithm (known as the k-means algorithm or Lloyd's algorithm; Gersho & Gray, 1992; Graf & Luschgy, 2000; Lloyd, 1982) iterates between the two conditions (51) and (52).

Thus Lloyd's algorithm begins with an initial set of grid points $W_x = \{x_i;\; i = 1,\ldots,N_x\}$. Then one calculates the Voronoi cells $S_x$ of $W_x$ using (51). Next, one computes the centroids of the Voronoi cells $S_x$ via (52). One then returns to the calculation of the associated Voronoi cells, and so on. Lloyd's algorithm iterates these steps until the distortion measure (50) reaches a local minimum, or until the change in the distortion measure falls below a given threshold, i.e.

$$\frac{\left|J(W_x^{k+1}, S_x^{k+1}) - J(W_x^k, S_x^k)\right|}{J(W_x^k, S_x^k)} \le \epsilon \qquad (53)$$

where $W_x^k$ and $S_x^k$ are the codebook and Voronoi cells at iteration $k$ respectively.

In order to obtain satisfactory results with the MDF algorithm, various practical steps are necessary. These include the use of fast sampling, scaling, and clustering; see Goodwin and Cea (2011).
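The quantization step (step 5 of Table 1) can be sketched as follows, using Lloyd iteration with $W = I$ (our simplification of (50); the paper allows a weighted distortion, and the function and its parameters are illustrative).

```python
import numpy as np

def lloyd_quantize(x, p, N, iters=50, tol=1e-3, rng=None):
    """Quantize a discrete distribution {x, p} (points x in R^n with
    probabilities p) down to N points by Lloyd iteration: alternate
    Voronoi assignment (51) and probability-weighted centroids (52)
    until the relative change in distortion (50) satisfies (53)."""
    rng = rng or np.random.default_rng()
    centers = x[rng.choice(len(x), size=N, replace=False)].astype(float)
    J_prev = np.inf
    for _ in range(iters):
        d2 = ((x[:, None, :] - centers[None, :, :])**2).sum(-1)
        cell = d2.argmin(axis=1)                    # Voronoi cells, eq. (51)
        J = (p * d2[np.arange(len(x)), cell]).sum() # distortion, eq. (50)
        if J_prev < np.inf and (J_prev - J) / J_prev <= tol:
            break                                   # stopping rule, eq. (53)
        J_prev = J
        for i in range(N):                          # centroids, eq. (52)
            mask = cell == i
            if p[mask].sum() > 0:
                centers[i] = (p[mask, None] * x[mask]).sum(0) / p[mask].sum()
    q = np.array([p[cell == i].sum() for i in range(N)])  # lumped probabilities
    return centers, q
```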

    9. Example

    Consider the practical problem of radar tracking using range and

    bearing measurements. Both particle filtering and MDF methods

    are used below for the spatial quantization step. Also event based

    sampling is used and compared with regular sampling.

[Fig. 6. First moment estimation of x1 and x2 using the MDF, particle and true filters.]

[Fig. 7. Second central moment estimation of x1 and x2 using the MDF, particle and true filters.]

Table 2. Root mean square error.

Algorithm | Mean x1 | Mean x2 | Variance x1 | Variance x2
MDF | 4.043 | 4.5689 | 29.109 | 27.585
PF | 6.644 | 6.870 | 38.023 | 46.6718

[Fig. 8. Range and bearing trajectory: true, Lebesgue and Riemann traces.]


Consider the following two state model:

$$x_{1,k+1} = x_{1,k} + \Delta v_{1,k} + \omega_{1,k} \qquad (54)$$

$$x_{2,k+1} = x_{2,k} + \Delta v_{2,k} + \omega_{2,k} \qquad (55)$$

where $\Delta = 0.1$ is the sampling period and $x = (x_1, x_2) \in \mathbb{R}^2$ is the state vector. The input $v = (v_1, v_2) \in \mathbb{R}^2$ corresponds to the speed of the object in cartesian coordinates, and $\omega = (\omega_1, \omega_2) \in \mathbb{R}^2$ is process noise (say wind gusts or unmeasured speed variations) with covariance:

$$Q_d = \begin{bmatrix}100 & 0\\ 0 & 100\end{bmatrix}\Delta \qquad (56)$$

The range and bearing measurements are given by the following equations (Floudas, Polychronopoulos, & Amditis, 2005):

$$y_{1,k} = \sqrt{x_{1,k}^2 + x_{2,k}^2} + \nu_{1,k} \qquad (57)$$

$$y_{2,k} = \arctan\left(\frac{x_{1,k}}{x_{2,k}}\right) + \nu_{2,k} \qquad (58)$$

The measurement vector is thus $y = (y_1, y_2) \in \mathbb{R}^2$; the measurement noise $\nu = (\nu_1, \nu_2) \in \mathbb{R}^2$ is taken to have variance:

$$R_d = \begin{bmatrix}0.6 & 0\\ 0 & 0.06\end{bmatrix}\frac{1}{\Delta} \qquad (59)$$
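A simulation of (54)-(59) can be sketched as follows (our code, not the authors'; the initial state and random seed are illustrative assumptions rather than values reported for the paper's experiment).

```python
import numpy as np

def simulate_target(v, D=0.1, x0=(35.0, 23.0), seed=0):
    """Simulate the constant-velocity target (54)-(55) with speed inputs v
    and generate range/bearing measurements (57)-(58)."""
    rng = np.random.default_rng(seed)
    Qd = 100.0 * D * np.eye(2)                 # process noise covariance, eq. (56)
    Rd = np.diag([0.6, 0.06]) / D              # measurement noise covariance, eq. (59)
    x = np.array(x0, dtype=float)
    xs, ys = [], []
    for vk in v:
        x = x + D * np.asarray(vk) + rng.multivariate_normal([0.0, 0.0], Qd)
        r = np.hypot(x[0], x[1]) + rng.normal(0.0, np.sqrt(Rd[0, 0]))   # range
        b = np.arctan2(x[0], x[1]) + rng.normal(0.0, np.sqrt(Rd[1, 1])) # bearing
        xs.append(x.copy())
        ys.append((r, b))
    return np.array(xs), np.array(ys)

# Example: 200 steps at constant speed (1, 0.5)
states, meas = simulate_target([(1.0, 0.5)] * 200)
```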

The MDF tuning parameters are taken to be $N_x = 49$, $N_w = 9$, $\zeta = 10^{-20}$ and $\epsilon = 10\%$. For the particle filter, 1000 particles were used. This yields approximately equal computational load per sample for the MDF and particle methods. Both filters used the same initial condition for the state, i.e. a Gaussian distribution with $\hat{x}_0 = [35\;\; 23]^T$ and covariance $P_0 = \begin{bmatrix}100 & 0\\ 0 & 100\end{bmatrix}$.

Figs. 6 and 7 show the mean and variance of the state estimate. As can be seen, the MDF and particle filter give similar results. Moreover, these results are almost identical to the "true" estimates. The latter were computed (for comparison purposes only) using a very fine gridding of the state space.

    Table 2 shows the root mean square error for the mean and

    variance estimates using MDF and PF algorithms. These results

    show that the performance of the MDF algorithm is better than

    that obtained by PF methods.

    Next, regular (Riemann) sampling and event based (Lebesgue)

    sampling are compared. For the former, 8 bits were utilized to

    represent each sample and one sample was taken per second. For

    the second case, the quantization thresholds were set at 50

    and 0.9 respectively for range and bearing. Fig. 8 compares the

    reconstructed range and bearing for the two filters. Fig. 9 shows

    the sampling times for range (upper two traces) and bearing

    (lower two traces).

It can be seen from Fig. 8 that the estimates produced by Lebesgue sampling are very close to those produced by Riemann sampling. This occurs despite the obvious difference in sampling rates shown in Fig. 9. Indeed, the Riemann sampling strategy uses 8 bits/sample and 1 sample/s, i.e. a data rate of 8 bits/s. On the other hand, the Lebesgue sampling strategy uses only 1 bit/sample (up or down) at an average of 0.2 samples/s. The latter corresponds to an average data rate of 0.2 bits/s, which is 40 times less than that used in the case of regular sampling.

    10. Conclusions

    This paper has described the use of event based sampling in

    the context of non-linear filtering. Special issues regarding the

    choice of anti-aliasing filter have been addressed. Also, a realistic

    example has been presented showing that the required data rate

    can be reduced by more than an order of magnitude (40:1 for the

    given example) whilst retaining essentially the same estimation

    accuracy.

    References

Anta, A., & Tabuada, P. (2008). Self-triggered stabilization of homogeneous control systems. In: American control conference (pp. 4129–4134). IEEE.

Anta, A., & Tabuada, P. (2009). On the benefits of relaxing the periodicity assumption for networked control systems over CAN. In: 30th IEEE real-time systems symposium (pp. 3–12). IEEE.

Anta, A., & Tabuada, P. (2010). To sample or not to sample: Self-triggered control for nonlinear systems. IEEE Transactions on Automatic Control, 55(9), 2030–2042.

Arzen, K. (1999). A simple event-based PID controller. In: Proceedings of the 14th IFAC world congress, Vol. 18.

Astrom, K., & Bernhardsson, B. (2002). Comparison of Riemann and Lebesgue sampling for first order stochastic systems. In: Proceedings of the 41st IEEE conference on decision and control, Vol. 2.

Astrom, K. J., & Wittenmark, B. (1990). Computer controlled systems: Theory and design (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.

Bergman, N. (1998). An interpolating wavelet filter for terrain navigation. In: Proceedings of the conference on multisource-multisensor information fusion (pp. 251–258).

Bucy, R., & Senne, K. (1971). Digital synthesis of non-linear filters. Automatica, 7(3), 287–298.

Byrnes, C., & Isidori, A. (1989). New results and examples in nonlinear feedback stabilization. Systems & Control Letters, 12(5), 437–442.

Cea, M., Goodwin, G., & Feuer, A. (2010). A discrete nonlinear filter for fast sampled problems based on vector quantization. In: American control conference (ACC), July (pp. 1399–1403).

[Fig. 9. Sampling instants for range and bearing: Lebesgue vs Riemann.]


Chen, Z. (2003). Bayesian filtering: From Kalman filters to particle filters, and beyond. Available at: http://users.isr.ist.utl.pt/~jpg/tfc0607/chen_bayesian.pdf.

Crisan, D., & Doucet, A. (2002). A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3), 736–746.

Feuer, A., & Goodwin, G. (1996). Sampling in digital signal processing and control. Boston, Cambridge, MA: Birkhäuser.

Floudas, N., Polychronopoulos, A., & Amditis, A. (2005). A survey of filtering techniques for vehicle tracking by radar equipped automotive platforms. In: 8th international conference on information fusion, July 2005 (Vol. 2, p. 8).

Gersho, A., & Gray, R. M. (1992). Vector quantization and signal compression. Springer International Series in Engineering and Computer Science.

Goodwin, G., Aguero, J., Salgado, M., & Yuz, J. I. (2009). Variance or spectral density in sampled data filtering? In: 4th international conference on optimization and control with applications (OCA2009), 6-11 June, Harbin, China.

Goodwin, G., & Cea, M. G. (2011). Temporal and spatial quantization in nonlinear filtering. In: 4th international symposium on advanced control of industrial processes, 23-27 May.

Goodwin, G. C., Feuer, A., & Muller, C. (2010). Sequential Bayesian filtering via minimum distortion filtering. In: Three decades of progress in control sciences (1st ed.). Springer.

Goodwin, G. C., Middleton, R. H., & Poor, H. V. (1992). High-speed digital signal processing and control. Proceedings of the IEEE, 80(2), 240–259.

Graf, S., & Luschgy, H. (2000). Foundations of quantization for probability distributions. Lecture notes in mathematics, Vol. 1730. Springer.

Handschin, J., & Mayne, D. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. International Journal of Control, 9(5), 547–559.

Hristu-Varsakelis, D., & Levine, W. (2005). Handbook of networked and embedded control systems. Birkhäuser.

Jazwinski, A. (1970). Stochastic processes and filtering theory. San Diego, CA: Academic Press.

Le, A., & McCann, R. (2007). Event-based measurement updating Kalman filter in network control systems. In: 2007 IEEE region 5 technical conference (pp. 138–141).

Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28, 127–135.

Marck, J. W., & Sijs, J. (2010). Relevant sampling applied to event-based state estimation. In: Proceedings of the 4th international conference on sensor technologies and applications (SENSORCOMM) (pp. 618–624).

McCann, R., & Le, A. T. (2008). Lebesgue sampling with a Kalman filter in wireless sensors for smart appliance networks. In: Conference record, IAS annual meeting. IEEE Industry Applications Society.

Middleton, R., & Goodwin, G. C. (1990). Digital control and estimation: A unified approach. Englewood Cliffs, NJ: Prentice Hall.

Otanez, P., Moyne, J., & Tilbury, D. (2002). Using deadbands to reduce communication in networked control systems. In: American control conference, Vol. 4.

Pawlowski, A., Guzman, J. L., Rodriguez, F., Berenguel, M., Sanchez, J., & Dormido, S. (2009). The influence of event-based sampling techniques on data transmission and control performance. In: ETFA IEEE conference on emerging technologies and factory automation.

Schon, T. B. (2006). Estimation of nonlinear dynamic systems: Theory and applications. Ph.D. Thesis, Linkoping Studies in Science and Technology. http://www.control.isy.liu.se/research/reports/Ph.D.Thesis/PhD998.pdf.

Sijs, J., & Lazar, M. (2009). On event based state estimation. In: Lecture notes in computer science, Vol. 5469.

Tabuada, P. (2007). Event-triggered real-time scheduling of stabilizing control tasks. IEEE Transactions on Automatic Control, 52(9), 1680–1685.

Tan, P.-N., Steinbach, M., & Kumar, V. (2005). Introduction to data mining. Addison Wesley.

Xu, Y., & Cao, X. (2011). Lebesgue-sampling-based optimal control problems with time aggregation. IEEE Transactions on Automatic Control, 56(5), 1097–1109.
