
Event based sampling in non-linear filtering

Mauricio G. Cea, Graham C. Goodwin

    School of Electrical Engineering and Computer Science, University of Newcastle, Australia

Article info

    Article history:

    Received 16 August 2011

    Received in revised form

    17 November 2011

Accepted 21 November 2011

Available online 28 February 2012

    Keywords:

    Event-based sampling

    Sampling systems

    Vector quantization

    Non-linear filters

    Non-linear systems

Abstract

    Most of the existing approaches to estimation and control are based on the premise that regular

    sampling is used. However, in some applications, there exists strong motivation to use event rather

    than time based sampling. For example, in sensor networks, it is often desirable to send data only

    when something interesting happens. This paper explores some of the issues involved in event based

    sampling in the context of non-linear filtering. Several examples are presented to illustrate the ideas.

© 2012 Elsevier Ltd. All rights reserved.

    1. Introduction

Most current implementations of digital control and estimation use regular sampling with fixed period T; see e.g. Middleton and Goodwin (1990), Feuer and Goodwin (1996), Astrom and Wittenmark (1990) and Hristu-Varsakelis and Levine (2005). However, there is often strong practical motivation to change this paradigm to one in which one only takes samples when something interesting happens. This changes the focus to, so-called, event based sampling. In this paper, we consider that a measurement is sent only when the measured variable crosses a given threshold. Thus the sampling is not regular. The latter strategy has many advantages, including conserving valuable communication resources in the context of networked control or sensor networks.

There is a growing literature on event based sampling. An early seminal paper was that of Astrom and Bernhardsson (2002). Other related publications include Arzen (1999), Anta and Tabuada (2009, 2008), Byrnes and Isidori (1989), Otanez, Moyne, and Tilbury (2002), Tabuada (2007), Le and McCann (2007), McCann and Le (2008), Pawlowski et al. (2009), and Xu and Cao (2011). As pointed out in Anta and Tabuada (2010), event based sampling and control are particularly attractive for non-linear systems, since the nature of the system response can be operating point dependent, and this may mean that different sampling strategies are desirable at different operating points.

The current paper examines some of the issues related to event based sampling for non-linear filtering. An event based non-linear filter is developed. It is also shown how such a filter can be implemented using approximate non-linear filtering algorithms, including particle filtering (Chen, 2003; Handschin & Mayne, 1969; Schon, 2006) and minimum distortion filters (Cea, Goodwin, & Feuer, 2010; Goodwin, Feuer, & Muller, 2010).

    One issue that needs careful consideration in the context of

    event based filtering is that of the anti-aliasing filter. It is argued

    here that an alternative viewpoint needs to be adopted for the

    design of this filter.

    The layout of the remainder of this paper is as follows: Section 2

    reviews continuous time stochastic models. Section 3 describes

    basic sampling strategies. Section 4 describes the core ideas behind

    regular and event based sampling. Section 5 describes sampled data

    models. Section 6 reviews the traditional discrete non-linear filter.

Section 7 details modifications that are required in the discrete non-linear filter to incorporate event based sampling. Section 8 briefly describes approximate discrete non-linear filters. Section 9 presents

    a realistic example. Section 10 draws conclusions.

    2. A continuous time non-linear model

    Most physical systems evolve in continuous time and are

    hence described by ordinary differential equations. A stochastic

    version of such equations takes the following conceptual form:

$$\frac{dx}{dt} = f_c(x) + g_c(x)\,\frac{d\omega}{dt} \qquad (1)$$

Control Engineering Practice 20 (2012) 963–971. doi:10.1016/j.conengprac.2011.11.008

This paper is built upon the plenary presentation: Graham C. Goodwin, Temporal and Spatial Quantization in Nonlinear Filtering, AdConIP, Hangzhou, China, 2011.

Corresponding author. E-mail addresses: [email protected], [email protected] (M.G. Cea), [email protected] (G.C. Goodwin).

$$\frac{dz}{dt} = h_c(x) + \frac{d\nu}{dt} \qquad (2)$$

where $x \in \mathbb{R}^n$ is the state vector and $dz/dt \in \mathbb{R}^m$ is the measured output vector. In (1) and (2), $d\omega/dt$ and $d\nu/dt$ represent independent continuous time white noise processes of intensity $Q_c$ and $R_c$ respectively. An important observation is that continuous time

white noise does not exist in any meaningful sense. (For example, if one calculates the auto-covariance of such a process, then it takes the form $Q_c\,\delta(\tau)$, where $\delta$ is the Dirac delta function.) To overcome this difficulty, it is often more insightful to use a spectral density description for the noise. The spectral density is the Fourier transform of the autocorrelation, i.e.

$$\text{Spectral density of } \frac{d\omega}{dt} = \int_{-\infty}^{\infty} Q_c\,\delta(\tau)\,e^{-j\omega\tau}\,d\tau = Q_c \qquad (3)$$

Thus $Q_c$ is the spectral density of the process $\{d\omega/dt\}$. White noise has constant spectral density over an infinite bandwidth. This

observation allows one to supplement the notion of white noise by the notion of broad band noise, which has constant spectrum over a wide (but not infinite) bandwidth. Indeed, it turns out that whiteness of the process and measurement noise is largely irrelevant to the operation of an optimal filter. What is actually needed is that the spectrum be substantially constant in key regions. This issue is discussed in detail in Goodwin, Aguero, Salgado, and Yuz (2009). These ideas expose a difficulty with the common practice of using variances to describe noise in the discrete time case. For example, say that the noise is broadband (but non-white) having spectral density $Q$ covering a bandwidth of $W$; then the associated variance $V$ is equal to the area under the spectrum, i.e. $V = WQ$. If one uses spectral density to describe the noise, then no difficulties will be encountered, since the noise intensity has been correctly captured. However, say that the Nyquist frequency, $1/2\Delta$, is greater than the noise bandwidth. Then, if one uses variance to describe the associated filter, the variance must be artificially scaled to $V' = V/(W\Delta)$ to match the spectral densities. If this is not done then the associated filter will perform badly due to underestimation of the noise intensity.

A related problem is that variance does not indicate the difficulty of an estimation problem. For example, consider the case of very fast sampling. Then $1/\Delta$ will be large. In this case, a small noise intensity, i.e. a small spectral density, could be associated with a large noise variance. Yet most of this noise power will lie at frequencies above the bandwidth of the system. Intuitively, this part of the noise will not affect the filter performance. Again, it is only the spectral density in relevant parts of the spectrum that affects filter performance.

The above difficulties are overcome if one works with spectral density rather than variance. Moreover, this aligns the continuous and discrete cases, since spectral density (or equivalently incremental variance) is exclusively used in the continuous case.

    In view of the above discussion, Eqs. (1) and (2) are more

    appropriately expressed in incremental form as:

$$dx = f_c(x)\,dt + g_c(x)\,d\omega \qquad (4)$$

$$dz = h_c(x)\,dt + d\nu \qquad (5)$$

where the processes $\omega$ and $\nu$ correspond to Brownian motion processes having incremental covariance $Q_c\,dt$ and $R_c\,dt$ respectively. Also, as discussed above, $Q_c$ and $R_c$ can equivalently be thought of as spectral densities for $d\omega/dt$ and $d\nu/dt$ respectively.

The linear equivalents of Eqs. (4) and (5) are

$$dx = A_c x\,dt + d\omega \qquad (6)$$

$$dz = C_c x\,dt + d\nu \qquad (7)$$

where $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A_c \in \mathbb{R}^{n\times n}$, $C_c \in \mathbb{R}^{m\times n}$, $d\omega \in \mathbb{R}^n$ and $d\nu \in \mathbb{R}^m$ are the state, measured output, system matrices, process noise and measurement noise respectively. The initial state satisfies $E\{x_0\} = \hat{x}_0$ and $E\{(x_0-\hat{x}_0)(x_0-\hat{x}_0)^T\} = P_0$. In the linear case, $\omega$ and $\nu$ are assumed to be stationary vector Wiener processes with incremental covariance $Q_c\,dt$ and $R_c\,dt$ respectively. The matrices $Q_c$ and $P_0$ are assumed to be symmetric and positive semidefinite, and $R_c$ is assumed to be symmetric and positive definite.

    3. Choice of sampling strategy

Consider first the case of regular sampling with fixed period $\Delta$. (This is sometimes called Riemann sampling (Astrom & Bernhardsson, 2002). Here the focus is on the independent time variable.)

In Section 2, $dz/dt$ was defined as the continuous time output (see Eqs. (2), (5) and (7)). The next step is to develop the form of the model when samples are taken. However, this begs the question: samples of what? Two possible options are explored below for the sampled output.

3.1. Direct sampling of dz/dt

At first glance, it seems plausible that one could directly sample the continuous process $dz/dt$. However, this choice is actually infeasible, since samples of the associated noise, $d\nu/dt$, would have infinite variance!

    3.2. Sampling after passing through an anti-aliasing filter

    An appropriate remedy to the difficulty described in Section 3.1

is to pass $dz/dt$ through an anti-aliasing filter prior to sampling. A common choice for such a filter is to simply average $dz/dt$ over the sample period. Actually, some form of averaging is inherent in all low pass filters that are typically used as anti-aliasing filters. In the case of averaging, the sampled output satisfies:

$$y_k = \frac{1}{\Delta}\int_{k\Delta}^{(k+1)\Delta} \frac{dz}{dt}\,dt \qquad (8)$$

$$y_k = \frac{1}{\Delta}\left\{z_{(k+1)\Delta} - z_{k\Delta}\right\} \qquad (9)$$

To obtain a notation for the sampled data case which resembles the continuous case, the (discrete) increment in $z$ is defined via

$$dz^+ = z_{(k+1)\Delta} - z_{k\Delta} \qquad (10)$$

where the superscript $+$ denotes the next sampled value. In this case, Eq. (9) can be rewritten as

$$y_k = \frac{1}{\Delta}\,dz^+ \qquad (11)$$

    4. Event based sampling

Next consider the case of event based sampling. (This is sometimes called Lebesgue sampling (Astrom & Bernhardsson, 2002). Here the focus is on the dependent variable.)

Let $\{q_{i,j}\}$ be a set of quantization levels for the jth output. These quantization levels could, for example, be evenly spaced, so that

$$q_{i+1,j} - q_{i,j} = L_j \in \mathbb{R} \quad \text{for } j = 1,\ldots,n \qquad (12)$$

In event based sampling, the measured output is transmitted only when a quantization level has been crossed. Moreover, provided no bits are lost and provided a starting signal level is known, then only 1 bit/sample needs to be sent to indicate that the signal has moved to the next interval above ($+1$) or the next interval below ($-1$). The difference between Riemann and Lebesgue sampling is illustrated in Fig. 1.


Next consider the design of the anti-aliasing filter. Here, a little more care is needed than in the case of Riemann sampling. Specifically, it is required that interesting events should trigger sampling. This raises the need to trade off noise immunity against sensitivity to change. To illustrate, say that one uses the averaging filter given in (8) and (9). Then a sudden change in output may be masked by the effect of averaging an (almost) constant signal over a long period of time. Hence it is desirable to place a lower limit on the bandwidth of the anti-aliasing filter so as to achieve a compromise between sensitivity and noise averaging. This trade-off does not arise in Riemann sampling, since there is no need to detect changes. In the case of the averaging filter, the trade-off can be achieved by simply resetting the averager when the sample period goes beyond some pre-determined upper limit, say $\Delta_{\max}$.

There also exists a close connection between the choice of the anti-aliasing filter bandwidth and the quantization thresholds used in the event based sampler. The reason is that one needs to ensure that measurement noise does not cause frequent triggering of the event based sampling even if the signal component is substantially constant.

    Simple design guidelines can be developed as follows:

Say that the measurement noise is broadband with spectral density $R$ and that an anti-aliasing filter with reset period $\Delta_{\max}$ is used. Then the corresponding discrete measurement noise will have variance of approximately $R/\Delta_{\max}$. Assume that the quantization level spacing is $L$ and say that spurious triggering of the event based sampler should be avoided with high probability. This can be achieved by requiring that there is only a small probability that the discrete measurement noise has magnitude greater than $L/2$. To achieve this one might require

$$2\sigma \le L/2 \qquad (13)$$

where $\sigma$ is the discrete noise standard deviation, i.e. $\sqrt{R/\Delta_{\max}}$. Eq. (13) is equivalent to

$$L \ge 4\sqrt{R/\Delta_{\max}} \qquad (14)$$

This equation links the anti-aliasing filter bandwidth, $1/\Delta_{\max}$, the noise spectral density $R$ and the quantization level spacing $L$ so as to achieve a low probability that the noise will be greater than $L/2$. In practice, it is desirable to choose $\Delta_{\max}$ as small as possible subject to satisfying (14), since large values of $\Delta_{\max}$ compromise one's ability to detect changes in the signal component.
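As a worked instance of guideline (14) (the numbers below are hypothetical, chosen only for illustration):

```python
import math

R = 0.6       # assumed broadband measurement noise spectral density
D_max = 1.0   # assumed anti-aliasing filter reset period (seconds)

sigma = math.sqrt(R / D_max)   # approximate discrete noise standard deviation
L_min = 4.0 * sigma            # smallest level spacing satisfying (14)
print(f"sigma = {sigma:.3f}; choose level spacing L >= {L_min:.3f}")
```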

    5. Discrete time models

Here the model update period, denoted $\Delta$, is not necessarily the same as the sampling period, denoted $T$. Note that typically $T > \Delta$, especially when event based sampling is utilized. For simplicity, the anti-aliasing filter is fixed as an averaging filter having period $\Delta$. However, the extension to other anti-aliasing filters is straightforward.

    5.1. The conventional discrete model for linear systems

    First consider the linear case of (6) and (7). This case will

    reveal several modelling issues which apply, mutatis mutandis, to

    the non-linear case.

Consider the linear continuous model (6) and (7). An exact discrete time model describing the samples can be readily shown to be

$$x^+ = A_d x + \omega \qquad (15)$$

$$y = C_d x + \nu \qquad (16)$$

where the system matrices take the following specific values:

$$A_d = e^{A_c\Delta} = I + A_c\Delta + \frac{(A_c\Delta)^2}{2} + \cdots \qquad (17)$$

$$C_d = \frac{1}{\Delta}\,C_c A_c^{-1}\left(e^{A_c\Delta} - I\right) = C_c\left(I + \frac{1}{2!}A_c\Delta + \frac{1}{3!}A_c^2\Delta^2 + \cdots\right) \qquad (18)$$

The corresponding process and output noise processes have zero mean and covariance:

$$\Sigma_d = E\left\{\begin{bmatrix}\omega_k\\ \nu_k\end{bmatrix}\begin{bmatrix}\omega_k\\ \nu_k\end{bmatrix}^T\right\} = \begin{bmatrix}Q_d & S_d\\ S_d^T & R_d\end{bmatrix} \qquad (19)$$

where the covariance matrix is given by

$$\Sigma_d = D\left(\int_0^{\Delta} e^{\bar{A}t}\begin{bmatrix}Q_c & 0\\ 0 & R_c\end{bmatrix} e^{\bar{A}^T t}\,dt\right)D \qquad (20)$$

and where

$$\bar{A} = \begin{bmatrix}A_c & 0\\ C_c & 0\end{bmatrix} \;\Rightarrow\; e^{\bar{A}t} = \begin{bmatrix}e^{A_c t} & 0\\ C_c\int_0^t e^{A_c s}\,ds & I\end{bmatrix} \qquad (21)$$

$$D = \begin{bmatrix}I & 0\\ 0 & \frac{1}{\Delta}I\end{bmatrix} \qquad (22)$$

Even though the above sampled system is an exact description for every finite $\Delta$, the model is a source of conceptual and numerical problems when the sampling period decreases to zero. For example, it is readily seen that, as $\Delta \to 0$:

$$A_d \to I \qquad (23)$$

$$\Sigma_d \to \begin{bmatrix}0 & 0\\ 0 & \infty\end{bmatrix} \qquad (24)$$

These results show that the discrete-time model (15) and (16) will be the source of difficulties as the sampling interval becomes small: the $A_d$ matrix becomes the identity matrix, and the noise covariance matrix $\Sigma_d$ tends to the uninformative values given in (24). These difficulties can be readily resolved by appropriate scaling of the model equations. This is shown in the next subsection.
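For numerical work, the exact quantities (17), (18) and (20)-(22) can all be obtained from augmented matrix exponentials (Van Loan's technique). The sketch below is our own illustration, assuming NumPy/SciPy are available; it is not code from the paper.

```python
import numpy as np
from scipy.linalg import expm

def exact_discretization(Ac, Cc, Qc, Rc, D):
    """Exact sampled-data model for the averaged output, eqs. (17)-(22),
    computed via Van Loan style augmented matrix exponentials."""
    n, m = Ac.shape[0], Cc.shape[0]

    # A_d = e^{Ac*D}; the top-right block of this exponential is
    # int_0^D e^{Ac s} ds, which avoids inverting a possibly singular Ac.
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n], M[:n, n:] = Ac, np.eye(n)
    E = expm(M * D)
    Ad = E[:n, :n]                              # eq. (17)
    Cd = Cc @ E[:n, n:] / D                     # eq. (18)

    # Joint noise covariance, eqs. (20)-(22), via Van Loan's integral:
    # expm([[-Abar, Qbar], [0, Abar^T]] D) yields the integral as F2^T F12.
    Abar = np.block([[Ac, np.zeros((n, m))], [Cc, np.zeros((m, m))]])
    Qbar = np.block([[Qc, np.zeros((n, m))], [np.zeros((m, n)), Rc]])
    N = n + m
    V = np.zeros((2 * N, 2 * N))
    V[:N, :N], V[:N, N:], V[N:, N:] = -Abar, Qbar, Abar.T
    F = expm(V * D)
    integral = F[N:, N:].T @ F[:N, N:]          # int_0^D e^{Abar t} Qbar e^{Abar^T t} dt
    Dmat = np.block([[np.eye(n), np.zeros((n, m))],
                     [np.zeros((m, n)), np.eye(m) / D]])
    Sigma_d = Dmat @ integral @ Dmat            # eq. (20)
    return Ad, Cd, Sigma_d
```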

[Fig. 1. Riemann vs Lebesgue sampling: regular time sampling vs regular spatial sampling.]


    5.2. Incremental form of the sampled data linear model

    Here, an alternative formulation of the discrete-time model,

    which has the same structure as the continuous-time model, is

    presented. The key tool used is to introduce appropriate scaling so

    that the limit D-0 is meaningful. The alternative model provides

    conceptual advantages and superior numerical behavior at fast

    sampling rates, see Goodwin, Middleton, and Poor (1992), Feuer

    and Goodwin (1996), and Middleton and Goodwin (1990).The problems illustrated in (24) and (23) suggest that the

    traditional approach to describing discrete-time models is not

    appropriate when fast sampling rates are employed. The remedy

    is to scale the equations to produce an equivalent incremental

    model1 expressed as follows:

    dx xk 1xk AixkDxk 25

    dz zk 1zk Dyk CixkDmk yk 26where it is readily seen using (17) and (18) that

    Ai AdID

    Ac 12

    A2cD 27

    Ci

    Cd

    Cc

    28

The initial state satisfies $E\{x_0\} = \hat{x}_0$ and $E\{(x_0-\hat{x}_0)(x_0-\hat{x}_0)^T\} = P_0$. The new process noise sequence is $\bar{\omega}_k = \omega_k$, having covariance $E\{\bar{\omega}_k\bar{\omega}_k^T\} = Q_d$. For consistency with the continuous case, the noise covariance is expressed in incremental form (or equivalently using spectral density) by scaling by the sample period. Thus let

$$Q_d = Q_i\Delta = \left[Q_c + \frac{\Delta}{2}\left(A_c Q_c + Q_c A_c^T\right) + \cdots\right]\Delta \qquad (29)$$

where $Q_i$ can be interpreted as either incremental covariance or discrete noise spectral density.

For the system output equation, it is clear that, when an integrating anti-aliasing filter is used, the expression obtained for the output corresponds to increments of the variable $z$, i.e.

$$\bar{y}_k = \Delta y_k = \int_{k\Delta}^{(k+1)\Delta} dz = z_{k+1} - z_k \qquad (30)$$

The measurement noise sequence is now $\bar{\nu}_k = \Delta\nu_k$, having incremental covariance expressed as $E\{\bar{\nu}_k\bar{\nu}_k^T\} = R_i\Delta$, where

$$R_i\Delta = \Delta^2 R_d = \left[R_c + \frac{\Delta^2}{3}C_c Q_c C_c^T + \cdots\right]\Delta \qquad (31)$$

The cross-covariance is $E\{\bar{\omega}_k\bar{\nu}_k^T\} = S_i\Delta = S_d\Delta$.

Finally, if $\Delta$ is small, then the incremental model matrices can be approximated by retaining the first term in the expansions (27), (28), (29) and (31) respectively, i.e.

$$A_i \approx A_c,\quad C_i \approx C_c,\quad Q_i \approx Q_c,\quad R_i \approx R_c,\quad S_i \approx 0 \qquad (32)$$

Thus at fast sampling rates the incremental matrices are approximately the same as the underlying continuous time matrices. Note that the approximations given in (32) are equivalent to using Euler integration to obtain the incremental model. Also note that the use of Euler integration gives an approximation, whereas use of incremental models can give an exact description.
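The scaling relationships can be collected into a small helper. The following sketch (our illustration, assuming NumPy) converts the conventional discrete matrices into the incremental quantities of (25)-(31).

```python
import numpy as np

def incremental_model(Ad, Cd, Qd, Rd, Sd, D):
    """Rescale the conventional discrete model (15)-(19) into the
    incremental (delta operator) form (25)-(31). As D -> 0 these
    quantities tend to the continuous time matrices, cf. (32)."""
    Ai = (Ad - np.eye(Ad.shape[0])) / D   # eq. (27)
    Ci = Cd                               # eq. (28)
    Qi = Qd / D                           # eq. (29): Qd = Qi * D
    Ri = D * Rd                           # eq. (31): Ri * D = D^2 * Rd
    Si = Sd                               # cross term: Si * D = Sd * D
    return Ai, Ci, Qi, Ri, Si
```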

    5.3. Incremental form of the sampled data non-linear model

Next, consider the non-linear case under the assumption that $\Delta$, the model update period, is sufficiently small so that Euler integration gives a discrete model of sufficient accuracy (in practice this may require some experimentation to find a suitable value for $\Delta$). Also note that $\Delta$ is the model update period, which is not necessarily equal to the sampling period $T$. The discrete model in incremental form can then be written as

$$\delta x \triangleq x_{k+1} - x_k = f_i(x_k)\Delta + \bar{\omega}_k \qquad (33)$$

$$\delta z \triangleq z_{k+1} - z_k = \bar{y}_k = h_i(x_k)\Delta + \bar{\nu}_k \qquad (34)$$

where

$$E\{\bar{\omega}_k\bar{\omega}_k^T\} = Q_i(x_k)\Delta \qquad (35)$$

$$E\{\bar{\nu}_k\bar{\nu}_k^T\} = R_i\Delta \qquad (36)$$

Also, if one uses Euler integration, the functions $f_i$, $h_i$, $Q_i$, $R_i$ can be directly linked to the corresponding continuous functions as follows:

$$f_i(x) \to f_c(x) \qquad (37)$$

$$h_i(x) \to h_c(x) \qquad (38)$$

$$Q_i \to Q_c \qquad (39)$$

$$R_i \to R_c \qquad (40)$$

    6. Review of the traditional discrete non-linear filter

The traditional discrete non-linear filter can now be directly formulated. The changes necessary to deal with event based sampling are dealt with later. Thus, consider a discrete time stochastic non-linear model of the form (33) and (34).

The problem of interest is to compute $p(x_k \mid Y_k)$, the conditional distribution of the state at time $k$ given observations of $y$ up to and including time $k$, i.e. $Y_k = \{y_0,\ldots,y_k\}$.

A recursive set of equations is presented below that yields the solution to the above problem (see also Jazwinski, 1970).

One proceeds sequentially by first assuming that $p(x_0 \mid Y_{-1})$ is known. For example, this distribution might be Gaussian with mean $\hat{x}_0$ and covariance $P_0$.

Next assume that $p(x_k \mid Y_k)$ is known. Then the following state update law holds:

$$p(x_{k+1} \mid Y_k) = \int p(x_k \mid Y_k)\,p(x_{k+1} \mid x_k)\,dx_k \qquad (41)$$

The impact of adding an observation, i.e. $y_{k+1}$, is described by

$$p(x_{k+1} \mid Y_{k+1}) = p(x_{k+1} \mid Y_k, y_{k+1}) = \frac{p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})}{\int p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dx_{k+1}} \qquad (42)$$

Eqs. (41) and (42) are often referred to as the Chapman–Kolmogorov equation and Bayes' rule respectively.
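For a scalar state, (41) and (42) can be approximated on a fixed grid. The following sketch is our illustration (not the authors' code); it assumes the Gaussian noise model (33)-(34) with Euler drift, and implements one prediction step and one measurement-update step.

```python
import numpy as np

def ck_predict(p, x_grid, f, Qi, D):
    """Chapman-Kolmogorov step (41) on a fixed grid: propagate the
    posterior p through the Gaussian transition density of model (33)."""
    var = Qi * D                                   # incremental covariance
    mean = x_grid + f(x_grid) * D                  # Euler drift per grid point
    trans = np.exp(-0.5 * (x_grid[:, None] - mean[None, :])**2 / var)
    trans /= trans.sum(axis=0, keepdims=True)      # column-normalize transitions
    return trans @ p

def bayes_update(p, x_grid, y, h, Ri, D):
    """Bayes rule (42): reweight the prediction by the likelihood of the
    increment observation y = h(x)*D + noise, then renormalize."""
    lik = np.exp(-0.5 * (y - h(x_grid) * D)**2 / (Ri * D))
    p = p * lik
    return p / p.sum()
```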

    7. Modifications to deal with down-sampling

As argued in the Introduction, it may be highly inefficient to sample very quickly. Thus some form of down-sampling, or event based sampling, may be beneficial. Say that one begins with a sample period $\Delta$. Then one can down-sample in several ways. Two alternatives are discussed below.

    7.1. Regular down-sampling

Say that it is desired to change the sample period by a fixed factor, e.g. from $\Delta$ to $m\Delta$, where $\Delta$ is assumed very small relative to the natural dynamics of the system. There are some subtle issues that need to be considered.

¹ Sometimes called a delta operator model in the literature (Middleton & Goodwin, 1990).

The non-linear filter is now updated only at period $m\Delta$. Assume that the original anti-aliasing filter is reset every $\Delta$ seconds, not every $m\Delta$ seconds. The correct strategy is now definitely not to simply take every mth sample and throw the rest away! Clearly this would lead to a highly suboptimal filter, since most of the data would have been discarded. On the contrary, if one decides to increase the sampling period from $\Delta$ to $m\Delta$, then a new anti-aliasing filter relevant to the new sample period $m\Delta$ is desirable. For example, say that one uses the usual averaging filter; then a new observation sequence²

$$\delta\bar{z}_l = \sum_{k=1}^{m} \delta z_{m(l-1)+k} \qquad (43)$$

can be digitally constructed before using the discrete non-linear filter.
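A direct way to realize (43) digitally is to sum each block of m fine-grained increments; the sketch below is our illustration of that step, assuming NumPy.

```python
import numpy as np

def downsample_increments(dz, m):
    """Digital anti-aliasing for down-sampling, eq. (43): sum each block
    of m fine increments dz into one coarse increment, rather than
    discarding m-1 out of every m samples."""
    n = (len(dz) // m) * m                  # drop any incomplete tail block
    return np.asarray(dz[:n]).reshape(-1, m).sum(axis=1)
```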

    If implemented properly, the step of down-sampling can lead

    to major computational improvements without significant degra-

    dation in performance. Indeed, the example below shows that, in

    this illustrative case, one can down-sample by several orders

    of magnitude with a corresponding reduction of several orders of

magnitude in the computational effort without significantly changing the computed conditional probability.

    7.1.1. Example

Consider the following simple discrete non-linear system:

$$x_{t+1} = a x_t + \omega'_t \qquad (44)$$

$$y_t = x_t^2 + \nu'_t \qquad (45)$$

where $a = 0.999$, $E\{(\omega'_t)^2\} = 10^{-2}$, $E\{(\nu'_t)^2\} = 10^{4}$ and $\Delta = 10^{-3}$. The magnitudes of $E\{(\omega'_t)^2\}$ and $E\{(\nu'_t)^2\}$ may seem counterintuitive, but these scalings are a consequence of the ideas described earlier in Section 5.1. The system (44) and (45) is actually more intuitive when expressed in the equivalent incremental form:

$$\delta x = f_i(x_k)\Delta + \bar{\omega}_k \qquad (46)$$

$$\delta z = h_i(x_k)\Delta + \bar{\nu}_k \qquad (47)$$

where $f_i(x_k) = -x_k$, $h_i(x_k) = x_k^2$, and $\bar{\omega}_k$, $\bar{\nu}_k$ both have incremental covariance $10\Delta$.

It seems heuristically clear that the sample period of $10^{-3}$ may lead to wasted computational effort. Thus down-sampling is introduced using the strategy explained in (43). Figs. 2–4 show the evolution of $p(x_k \mid Z_k)$ for $\Delta = 10^{-3}$ and the down-sampled versions $m\Delta = 10^{-2}$ and $m\Delta = 10^{-1}$ respectively. Inspection of the plots indicates that there is no noticeable deterioration in the computed posterior probability. However, at $\Delta = 10^{-1}$, the total computational load has been reduced by two orders of magnitude relative to the use of $\Delta = 10^{-3}$! Note that the introduction of the new anti-aliasing filter in (43) is crucial in achieving these results.

    7.2. Event based sampling

At first glance it may seem that the extension to event based sampling is immediate, i.e. all one needs to do is run the state update (41) at period $\Delta$ (chosen sufficiently small so that Euler integration gives an adequate approximation) and then use the observation update (42) when one decides that a sufficiently interesting change in the output has occurred. Certainly the observations are only needed when a threshold has been crossed. However, it is not true that there is zero relevant information between threshold crossings. On the contrary, there is a valuable piece of information, namely that the output has not crossed a threshold. Hence, estimates can continue to be updated between

[Fig. 2. Time evolution of the probability density function at fast sampling, Δ = 0.001.]

[Fig. 3. Time evolution of the probability density function at fast sampling, Δ = 0.01.]

[Fig. 4. Time evolution of the probability density function at fast sampling, Δ = 0.1.]

    2 Note that this step of using a new digital anti-aliasing filter is very helpful

    and does not appear to be widely appreciated.


    threshold crossings provided an appropriate change is made to the

    observation update formula. Specifically, consider the situation

illustrated in Fig. 5 where, at the kth time instant, it is known that

$$y_k \in [Q_a, Q_b] \triangleq \mathcal{Q}_k \qquad (48)$$

The observation update (42) in the non-linear filter can now be modified to the following, which explicitly utilizes (48):

$$p(x_{k+1} \mid Y_k,\, y_{k+1} \in \mathcal{Q}_k) = \frac{\int_{\mathcal{Q}_k} p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dy_{k+1}}{\int\!\int_{\mathcal{Q}_k} p(x_{k+1} \mid Y_k)\,p(y_{k+1} \mid x_{k+1})\,dy_{k+1}\,dx_{k+1}} \qquad (49)$$

Note that if one simply chooses not to update the states, then

    49Note that if one simply chooses not to update the states, then

    the state estimation uncertainty will grow due to the drift term

    inherent in the state update (33). Use of (49) avoids this problem.

    Actually, this is different from the common strategy used in much

    of the existing event based sampling literature where updates are

    usually restricted to cases when a threshold is crossed (Anta &

    Tabuada, 2009, 2008; Arzen, 1999; Byrnes & Isidori, 1989; Le &

    McCann, 2007; McCann & Le, 2008; Otanez et al., 2002; Pawlowski

    et al., 2009; Tabuada, 2007; Xu & Cao, 2011). Some authors

    e.g. Sijs and Lazar (2009) and Marck and Sijs (2010) have noted

    that, for the case of linear filtering, it is desirable to continue

    to update based on the known fact that the output lies withinthe quantization threshold. This is the idea captured in (49) for

    the case of non-linear filtering. Of course, in practice, the integrals

    in (49) will need to be approximated. The approximation issue

    is discussed below via particle filtering and vector quantiza-

    tion methods.
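For Gaussian measurement noise the inner integral in (49) has a closed form via the normal CDF, so the modification amounts to replacing the point likelihood in a grid filter with an interval probability. The sketch below is our illustration under that Gaussian assumption, with the thresholds applied to the increment observation of model (34).

```python
import numpy as np
from scipy.stats import norm

def interval_bayes_update(p, x_grid, Qa, Qb, h, Ri, D):
    """Event-based observation update (49): between threshold crossings
    the only information is y in [Qa, Qb], so the point likelihood in
    (42) is replaced by the probability mass that the Gaussian
    measurement noise assigns to that interval."""
    sd = np.sqrt(Ri * D)
    mean = h(x_grid) * D
    lik = norm.cdf(Qb, mean, sd) - norm.cdf(Qa, mean, sd)
    p = p * lik
    return p / p.sum()
```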

    8. Spatial quantization

Next consider the issue of spatial quantization. As is clear from (41) and (42), the conditional probability for the states is a function in a high dimensional space. Also, evolution of this function requires the evaluation of high dimensional integrals, as is evident from the right hand sides of (41) and (42). Such integrals cannot be computed in practice without some form of discretization of the spatial coordinates. Two strategies are described below to achieve spatial quantization, namely particle filtering and minimum distortion filtering. The former strategy has its strength in that the number of particles is independent of dimension, but it requires a large number of points to accurately describe the problem. The latter strategy uses a small number of points for low dimensional problems, but its computational cost increases in higher dimensions due to the need for extra grid points.

    8.1. Particle filtering

This technique achieves spatial quantization by drawing a set of random samples from the disturbance distribution. Thus, a discrete approximation to the posterior distribution is generated which is based on a set of randomly chosen points. The approximation converges, in probability, with order $1/\sqrt{N}$, where $N$ is the number of chosen samples (Crisan & Doucet, 2002). The main disadvantage of this strategy is that a large number of points may be needed. Also, these points need, in principle, to be related to the distribution of interest, and the method suffers from degeneracy of the particles. Moreover, the number of points will grow exponentially with time unless some form of reduction is used. Thus, many fixes are needed to get this type of algorithm to work in practice. Such fixes include the use of proposal distributions, resampling methods, etc. For details the reader is referred to Chen (2003).
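For reference, one step of the basic bootstrap particle filter for model (33)-(34) can be sketched as follows (a minimal illustration of ours; practical implementations add the proposal and resampling refinements mentioned above).

```python
import numpy as np

def bootstrap_pf_step(particles, y, f, h, Qi, Ri, D, rng):
    """One bootstrap particle filter step for model (33)-(34):
    propagate, weight by the observation likelihood, resample."""
    N = particles.size
    particles = particles + f(particles) * D \
        + rng.normal(scale=np.sqrt(Qi * D), size=N)          # predict
    w = np.exp(-0.5 * (y - h(particles) * D)**2 / (Ri * D))  # weight
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                         # resample
    return particles[idx]
```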

    8.2. Minimum distortion filtering (MDF)

This is a new class of algorithm. It was first described in Goodwin et al. (2010). The MDF algorithm belongs to the class of deterministic gridding methods. There exist many algorithms within this framework. Some of them use fixed grid methods, where the choice of the grid is based on some pre-known information regarding the problem which is never updated. Another method is presented in Bucy and Senne (1971), where a gridding method based on the mean plus an ellipsoid determined by the covariance of the probability density function is used. Other approaches include adaptive uniform grid methods, e.g. Bergman (1998), where a uniform resolution grid with adaptive resolution is used and the grid is relocated depending on the likelihood of the current grid points. By contrast, the MDF algorithm is a method where the grid is non-uniform and is adapted at each sampling instant. The adaptation step depends on vector quantization of the current estimate of the probability density function. This technique provides the algorithm with the capacity to relocate grid points where they are most needed. The non-uniform characteristic allows for a tailored location of the grid, without wasting points in unimportant regions, e.g. between the modes of a multimodal distribution. A summary of the algorithm is presented below.

The key idea underlying this class of algorithm is to utilize vector quantization to generate, on-line, a finite approximation to the a-posteriori distribution of the states.

Say that one begins with a discrete approximation to the distribution of $x_0$ on $N_x$ grid points. Also assume that one has a finite approximation to the distribution of the process noise on $N_w$ grid points. These approximations can be generated off-line. Then, utilizing the discretized version of Eq. (41), one obtains a finite approximation to $p(x_1)$ on $N_x N_w$ grid points. Then, one uses the discrete equivalent of (42) to obtain a finite approximation to $p(x_1 \mid y_1)$ on $N_x N_w$ points. Finally, one uses vector quantization ideas to re-approximate $p(x_1 \mid y_1)$ back to $N_x$ points. (How this crucial last step is performed will be described in detail below.) Then, one returns to the beginning to obtain a discrete approximation to $p(x_2 \mid y_1)$ on $N_x N_w$ points, and so on. The algorithm is summarized in Table 1.

    The key step in the MDF algorithm is the vector quantization

    step (step 5 in Table 1). Details of this step are given below.

[Fig. 5. Inter-sample illustration.]

Table 1. MDF algorithm.

Step | Description
1 | Initialization: quantize $p(x_0)$ to $N_x$ points $\{x_i, p_i\}$, $i = 1,\ldots,N_x$. Quantize $p(\omega)$ to $N_w$ points $\{w_j, q_j\}$, $j = 1,\ldots,N_w$
2 | Begin with $p(x_k \mid Y_k)$ represented by $\{x_i, p_i\}$, $i = 1,\ldots,N_x$
3 | Approximate $p(x_{k+1} \mid Y_k)$ via (41) on $N_x N_w$ points
4 | Evaluate $p(x_{k+1} \mid Y_{k+1})$ on $N_x N_w$ points via (42)
5 | Quantize back to $N_x$ points
6 | Go to 2


Assume one has a discrete representation of some distribution $p(x)$, where $x \in \mathbb{R}^n$, quantized to a very large (but finite) set of points. The goal is to quantize $p(x)$ to a smaller finite set of points $\{x_i, p_i\}$, $i = 1,\ldots,N$. The first step in vector quantization is to define a measure to quantify the distortion of a given discrete representation. This measure is then optimized to find the optimal representation which minimizes the cost. In summary, one seeks a finite set $W_x = \{x_1,\ldots,x_N\}$ and an associated collection of sets $S = \{S_1,\ldots,S_N\}$ such that $\bigcup_{i=1}^N S_i = \mathbb{R}^n$ and $S_i \cap S_j = \emptyset$, $i \ne j$. The quantities $W_x$, $S_x$ are chosen by minimizing a cost function of the form:

$$J(W_x, S_x) = \sum_{i=1}^{N} E\left\{(x - x_i)^T W (x - x_i) \mid x \in S_i\right\} \qquad (50)$$

where $W = \mathrm{diag}(W_1,\ldots,W_N)$. Other choices of the distance measure can also be used, e.g. Manhattan, $L_1$, Jaccard, etc.; see Tan, Steinbach, and Kumar (2005).

If $x_1,\ldots,x_N$ (the set of grid points) are given, then the optimal choice of the sets $S_i$ is the, so-called, Voronoi cells (Gersho & Gray, 1992; Graf & Luschgy, 2000):

$$S_i = \left\{x \mid (x - x_i)^T W (x - x_i) \le (x - x_j)^T W (x - x_j);\; \forall j \ne i\right\} \qquad (51)$$

Similarly, if the sets $S_1,\ldots,S_N$ are given, then the optimal choice for $x_i$ is the centroid of the set $S_i$, i.e.

$$x_i = E\{x \mid x \in S_i\} \qquad (52)$$

Many algorithms exist for minimizing functions of the form (50) to produce a discrete approximation. One class of algorithm (known as the k-means algorithm or Lloyd's algorithm; Gersho & Gray, 1992; Graf & Luschgy, 2000; Lloyd, 1982) iterates between the two conditions (51) and (52).

Thus Lloyd's algorithm begins with an initial set of grid points $W_x = \{x_i;\; i = 1,\ldots,N_x\}$. Then one calculates the Voronoi cells $S_x$ of $W_x$ using (51). Next, one computes the centroids of the Voronoi cells $S_x$ via (52). One then returns to the calculation of the associated Voronoi cells, and so on. Lloyd's algorithm iterates these steps until the distortion measure (50) reaches a local minimum, or until the change in the distortion measure falls below a given threshold, i.e.

$$\frac{\left|J(W_x^{k+1}, S_x^{k+1}) - J(W_x^k, S_x^k)\right|}{J(W_x^k, S_x^k)} \le \epsilon \qquad (53)$$

where $W_x^k$ and $S_x^k$ are the codebook and Voronoi cells at iteration $k$ respectively.

In order to obtain satisfactory results with the MDF algorithm, various practical steps are necessary. These include the use of fast sampling, scaling, and clustering; see Goodwin and Cea (2011).
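The quantization step (step 5 of Table 1) can be sketched as follows, using Lloyd iteration with $W = I$ (our simplification of (50); the paper allows a weighted distortion, and the function and its parameters are illustrative).

```python
import numpy as np

def lloyd_quantize(x, p, N, iters=50, tol=1e-3, rng=None):
    """Quantize a discrete distribution {x, p} (points x in R^n with
    probabilities p) down to N points by Lloyd iteration: alternate
    Voronoi assignment (51) and probability-weighted centroids (52)
    until the relative change in distortion (50) satisfies (53)."""
    rng = rng or np.random.default_rng()
    centers = x[rng.choice(len(x), size=N, replace=False)].astype(float)
    J_prev = np.inf
    for _ in range(iters):
        d2 = ((x[:, None, :] - centers[None, :, :])**2).sum(-1)
        cell = d2.argmin(axis=1)                    # Voronoi cells, eq. (51)
        J = (p * d2[np.arange(len(x)), cell]).sum() # distortion, eq. (50)
        if J_prev < np.inf and (J_prev - J) / J_prev <= tol:
            break                                   # stopping rule, eq. (53)
        J_prev = J
        for i in range(N):                          # centroids, eq. (52)
            mask = cell == i
            if p[mask].sum() > 0:
                centers[i] = (p[mask, None] * x[mask]).sum(0) / p[mask].sum()
    q = np.array([p[cell == i].sum() for i in range(N)])  # lumped probabilities
    return centers, q
```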

    9. Example

    Consider the practical problem of radar tracking using range and

    bearing measurements. Both particle filtering and MDF methods

    are used below for the spatial quantization step. Also event based

    sampling is used and compared with regular sampling.

[Fig. 6. First moment estimation of x1 and x2 using the MDF, particle and true filters.]

[Fig. 7. Second central moment estimation of x1 and x2 using the MDF, particle and true filters.]

Table 2. Root mean square error.

Algorithm | Mean x1 | Mean x2 | Variance x1 | Variance x2
MDF | 4.043 | 4.5689 | 29.109 | 27.585
PF | 6.644 | 6.870 | 38.023 | 46.6718

[Fig. 8. Range and bearing trajectory: true, Lebesgue and Riemann traces.]


Consider the following two state model:

$$x_{1,k+1} = x_{1,k} + \Delta v_{1,k} + \omega_{1,k} \qquad (54)$$

$$x_{2,k+1} = x_{2,k} + \Delta v_{2,k} + \omega_{2,k} \qquad (55)$$

where $\Delta = 0.1$ is the sampling period and $x = (x_1, x_2) \in \mathbb{R}^2$ is the state vector. The input $v = (v_1, v_2) \in \mathbb{R}^2$ corresponds to the speed of the object in cartesian coordinates, and $\omega = (\omega_1, \omega_2) \in \mathbb{R}^2$ is process noise (say wind gusts or unmeasured speed variations) with covariance:

$$Q_d = \begin{bmatrix}100 & 0\\ 0 & 100\end{bmatrix}\Delta \qquad (56)$$

The range and bearing measurements are given by the following equations (Floudas, Polychronopoulos, & Amditis, 2005):

$$y_{1,k} = \sqrt{x_{1,k}^2 + x_{2,k}^2} + \nu_{1,k} \qquad (57)$$

$$y_{2,k} = \arctan\left(\frac{x_{1,k}}{x_{2,k}}\right) + \nu_{2,k} \qquad (58)$$

The measurement vector is thus $y = (y_1, y_2) \in \mathbb{R}^2$; the measurement noise $\nu = (\nu_1, \nu_2) \in \mathbb{R}^2$ is taken to have variance:

$$R_d = \begin{bmatrix}0.6 & 0\\ 0 & 0.06\end{bmatrix}\frac{1}{\Delta} \qquad (59)$$
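A simulation of (54)-(59) can be sketched as follows (our code, not the authors'; the initial state and random seed are illustrative assumptions rather than values reported for the paper's experiment).

```python
import numpy as np

def simulate_target(v, D=0.1, x0=(35.0, 23.0), seed=0):
    """Simulate the constant-velocity target (54)-(55) with speed inputs v
    and generate range/bearing measurements (57)-(58)."""
    rng = np.random.default_rng(seed)
    Qd = 100.0 * D * np.eye(2)                 # process noise covariance, eq. (56)
    Rd = np.diag([0.6, 0.06]) / D              # measurement noise covariance, eq. (59)
    x = np.array(x0, dtype=float)
    xs, ys = [], []
    for vk in v:
        x = x + D * np.asarray(vk) + rng.multivariate_normal([0.0, 0.0], Qd)
        r = np.hypot(x[0], x[1]) + rng.normal(0.0, np.sqrt(Rd[0, 0]))   # range
        b = np.arctan2(x[0], x[1]) + rng.normal(0.0, np.sqrt(Rd[1, 1])) # bearing
        xs.append(x.copy())
        ys.append((r, b))
    return np.array(xs), np.array(ys)

# Example: 200 steps at constant speed (1, 0.5)
states, meas = simulate_target([(1.0, 0.5)] * 200)
```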

The MDF tuning parameters are taken to be $N_x = 49$, $N_w = 9$, $\zeta = 10^{-20}$ and $\epsilon = 10\%$. For the particle filter, 1000 particles were used. This yields approximately equal computational load per sample for the MDF and particle methods. Both filters used the same initial condition for the state, i.e. a Gaussian distribution with $\hat{x}_0 = [35\;\; 23]^T$ and covariance $P_0 = \begin{bmatrix}100 & 0\\ 0 & 100\end{bmatrix}$.

Figs. 6 and 7 show the mean and variance of the state estimate. As can be seen, the MDF and particle filter give similar results. Moreover, these results are almost identical to the "true" estimates. The latter were computed (for comparison purposes only) using a very fine gridding of the state space.

    Table 2 shows the root mean square error for the mean and

    variance estimates using MDF and PF algorithms. These results

    show that the performance of the MDF algorithm is better than

    that obtained by PF methods.

    Next, regular (Riemann) sampling and event based (Lebesgue)

    sampling are compared. For the former, 8 bits were utilized to

    represent each sample and one sample was taken per second. For

    the second case, the quantization thresholds were set at 50

    and 0.9 respectively for range and bearing. Fig. 8 compares the

    reconstructed range and bearing for the two filters. Fig. 9 shows

    the sampling times for range (upper two traces) and bearing

    (lower two traces).

It can be seen from Fig. 8 that the estimates produced by Lebesgue sampling are very close to those produced by Riemann sampling. This occurs despite the obvious difference in sampling rates shown in Fig. 9. Indeed, the Riemann sampling strategy uses 8 bits/sample and 1 sample/s, i.e. a data rate of 8 bits/s. On the other hand, the Lebesgue sampling strategy uses only 1 bit/sample (up or down) at an average of 0.2 samples/s. The latter corresponds to an average data rate of 0.2 bits/s, which is 40 times less than that used in the case of regular sampling.

    10. Conclusions

    This paper has described the use of event based sampling in

    the context of non-linear filtering. Special issues regarding the

    choice of anti-aliasing filter have been addressed. Also, a realistic

    example has been presented showing that the required data rate

    can be reduced by more than an order of magnitude (40:1 for the

    given example) whilst retaining essentially the same estimation

    accuracy.

    References

Anta, A., & Tabuada, P. (2008). Self-triggered stabilization of homogeneous control systems. In: American control conference (pp. 4129–4134). IEEE.

Anta, A., & Tabuada, P. (2009). On the benefits of relaxing the periodicity assumption for networked control systems over CAN. In: 30th IEEE real-time systems symposium (pp. 3–12). IEEE.

Anta, A., & Tabuada, P. (2010). To sample or not to sample: Self-triggered control for nonlinear systems. IEEE Transactions on Automatic Control, 55(9), 2030–2042.

Arzen, K. (1999). A simple event-based PID controller. In: Proceedings of the 14th IFAC world congress, Vol. 18.

Astrom, K., & Bernhardsson, B. (2002). Comparison of Riemann and Lebesgue sampling for first order stochastic systems. In: Proceedings of the 41st IEEE conference on decision and control, Vol. 2.

Astrom, K. J., & Wittenmark, B. (1990). Computer controlled systems: Theory and design (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.

Bergman, N. (1998). An interpolating wavelet filter for terrain navigation. In: Proceedings of the conference on multisource-multisensor information fusion (pp. 251–258).

Bucy, R., & Senne, K. (1971). Digital synthesis of non-linear filters. Automatica, 7(3), 287–298.

Byrnes, C., & Isidori, A. (1989). New results and examples in nonlinear feedback stabilization. Systems & Control Letters, 12(5), 437–442.

Cea, M., Goodwin, G., & Feuer, A. (2010). A discrete nonlinear filter for fast sampled problems based on vector quantization. In: American control conference (ACC), July (pp. 1399–1403).

[Fig. 9. Sampling instants for range and bearing: Lebesgue vs Riemann.]


Chen, Z. (2003). Bayesian filtering: From Kalman filters to particle filters, and beyond. Available at: http://users.isr.ist.utl.pt/~jpg/tfc0607/chen_bayesian.pdf.

Crisan, D., & Doucet, A. (2002). A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3), 736–746.

Feuer, A., & Goodwin, G. (1996). Sampling in digital signal processing and control. Boston, Cambridge, MA: Birkhäuser.

Floudas, N., Polychronopoulos, A., & Amditis, A. (2005). A survey of filtering techniques for vehicle tracking by radar equipped automotive platforms. In: 8th international conference on information fusion, July 2005 (Vol. 2, p. 8).

Gersho, A., & Gray, R. M. (1992). Vector quantization and signal compression. Springer International Series in Engineering and Computer Science.

Goodwin, G., Aguero, J., Salgado, M., & Yuz, J. I. (2009). Variance or spectral density in sampled data filtering? In: 4th international conference on optimization and control with applications (OCA2009), 6-11 June, Harbin, China.

Goodwin, G., & Cea, M. G. (2011). Temporal and spatial quantization in nonlinear filtering. In: 4th international symposium on advanced control of industrial processes, 23-27 May.

Goodwin, G. C., Feuer, A., & Muller, C. (2010). Sequential Bayesian filtering via minimum distortion filtering. In: Three decades of progress in control sciences (1st ed.). Springer.

Goodwin, G. C., Middleton, R. H., & Poor, H. V. (1992). High-speed digital signal processing and control. Proceedings of the IEEE, 80(2), 240–259.

Graf, S., & Luschgy, H. (2000). Foundations of quantization for probability distributions. Lecture notes in mathematics, Vol. 1730. Springer.

Handschin, J., & Mayne, D. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. International Journal of Control, 9(5), 547–559.

Hristu-Varsakelis, D., & Levine, W. (2005). Handbook of networked and embedded control systems. Birkhäuser.

Jazwinski, A. (1970). Stochastic processes and filtering theory. San Diego, CA: Academic Press.

Le, A., & McCann, R. (2007). Event-based measurement updating Kalman filter in network control systems. In: 2007 IEEE region 5 technical conference (pp. 138–141).

Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28, 127–135.

Marck, J. W., & Sijs, J. (2010). Relevant sampling applied to event-based state estimation. In: Proceedings of the 4th international conference on sensor technologies and applications (SENSORCOMM) (pp. 618–624).

McCann, R., & Le, A. T. (2008). Lebesgue sampling with a Kalman filter in wireless sensors for smart appliance networks. In: Conference record, IAS annual meeting. IEEE Industry Applications Society.

Middleton, R., & Goodwin, G. C. (1990). Digital control and estimation: A unified approach. Englewood Cliffs, NJ: Prentice Hall.

Otanez, P., Moyne, J., & Tilbury, D. (2002). Using deadbands to reduce communication in networked control systems. In: American control conference, Vol. 4.

Pawlowski, A., Guzman, J. L., Rodriguez, F., Berenguel, M., Sanchez, J., & Dormido, S. (2009). The influence of event-based sampling techniques on data transmission and control performance. In: ETFA IEEE conference on emerging technologies and factory automation.

Schon, T. B. (2006). Estimation of nonlinear dynamic systems: Theory and applications. Ph.D. Thesis, Linkoping Studies in Science and Technology. http://www.control.isy.liu.se/research/reports/Ph.D.Thesis/PhD998.pdf.

Sijs, J., & Lazar, M. (2009). On event based state estimation. In: Lecture notes in computer science, Vol. 5469.

Tabuada, P. (2007). Event-triggered real-time scheduling of stabilizing control tasks. IEEE Transactions on Automatic Control, 52(9), 1680–1685.

Tan, P.-N., Steinbach, M., & Kumar, V. (2005). Introduction to data mining. Addison Wesley.

Xu, Y., & Cao, X. (2011). Lebesgue-sampling-based optimal control problems with time aggregation. IEEE Transactions on Automatic Control, 56(5), 1097–1109.
