Event based sampling in non-linear filtering
Mauricio G. Cea, Graham C. Goodwin
School of Electrical Engineering and Computer Science, University of Newcastle, Australia

Article info
Article history:
Received 16 August 2011
Received in revised form
17 November 2011
Accepted 21 November 2011
Available online 28 February 2012
Keywords:
Event-based sampling
Sampling systems
Vector quantization
Non-linear filters
Non-linear systems
Abstract
Most of the existing approaches to estimation and control are based on the premise that regular
sampling is used. However, in some applications, there exists strong motivation to use event rather
than time based sampling. For example, in sensor networks, it is often desirable to send data only
when something interesting happens. This paper explores some of the issues involved in event based
sampling in the context of non-linear filtering. Several examples are presented to illustrate the ideas.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
Most current implementations of digital control and estimation use regular sampling with fixed period T; see e.g. Middleton and Goodwin (1990), Feuer and Goodwin (1996), Astrom and Wittenmark (1990) and Hristu-Varsakelis and Levine (2005). However, there is often strong practical motivation to change this paradigm to one in which samples are taken only when something interesting happens. This shifts the focus to so-called event based sampling. In this paper, we consider that a
measurement is sent only when the measured variable crosses a
given threshold. Thus the sampling is not regular. The latter
strategy has many advantages including conserving valuable
communication resources in the context of networked control
or sensor networks.
There is a growing literature on event based sampling. An early seminal paper was that of Astrom and Bernhardsson (2002). Other related publications include Arzen (1999), Anta and Tabuada (2008, 2009), Byrnes and Isidori (1989), Otanez, Moyne, and Tilbury (2002), Tabuada (2007), Le and McCann (2007), McCann and Le (2008), Pawlowski et al. (2009), and Xu and Cao (2011). As pointed out in
Anta and Tabuada (2010), event based sampling and control are
particularly attractive for non-linear systems since the nature of the
system response can be operating point dependent and this may
mean that different sampling strategies are desirable at different
operating points.
The current paper examines some of the issues related to event based sampling for non-linear filtering. An event based non-linear filter is developed. It is also shown how such a filter can be implemented using approximate non-linear filtering algorithms including particle filtering (Chen, 2003; Handschin & Mayne, 1969; Schon, 2006) and minimum distortion filters (Cea, Goodwin, & Feuer, 2010; Goodwin, Feuer, & Muller, 2010).
One issue that needs careful consideration in the context of
event based filtering is that of the anti-aliasing filter. It is argued
here that an alternative viewpoint needs to be adopted for the
design of this filter.
The layout of the remainder of this paper is as follows: Section 2
reviews continuous time stochastic models. Section 3 describes
basic sampling strategies. Section 4 describes the core ideas behind
regular and event based sampling. Section 5 describes sampled data
models. Section 6 reviews the traditional discrete non-linear filter.
Section 7 details modifications that are required in the discrete non-linear filter to incorporate event based sampling. Section 8 briefly describes approximate discrete non-linear filters. Section 9 presents a realistic example. Section 10 draws conclusions.
2. A continuous time non-linear model
Most physical systems evolve in continuous time and are
hence described by ordinary differential equations. A stochastic
version of such equations takes the following conceptual form:

dx/dt = f_c(x) + g_c(x) dω/dt    (1)
doi:10.1016/j.conengprac.2011.11.008
This paper is built upon the plenary presentation: Graham C. Goodwin, Temporal and Spatial Quantization in Nonlinear Filtering, AdConIP, Hangzhou, China, 2011.
Corresponding author: M.G. Cea. E-mail addresses: [email protected], [email protected] (M.G. Cea), [email protected] (G.C. Goodwin).
Control Engineering Practice 20 (2012) 963–971
dz/dt = h_c(x) + dν/dt    (2)
where x ∈ R^n is the state vector and dz/dt ∈ R^m is the measured output vector. In (1) and (2), dω/dt and dν/dt represent independent continuous time white noise processes of intensity Q_c and R_c respectively. An important observation is that continuous time white noise does not exist in any meaningful sense. (For example, if one calculates the auto-covariance of such a process, then it takes the form Q_c δ(t), where δ is the Dirac delta function.) To overcome this difficulty, it is often more insightful to use a spectral density description for the noise. The spectral density is the Fourier transform of the autocorrelation, i.e.

Spectral density of dω/dt = ∫_{−∞}^{∞} Q_c δ(t) e^{−jωt} dt = Q_c    (3)
Thus Q_c is the spectral density of the process {dω/dt}. White noise has constant spectral density over an infinite bandwidth. This observation allows one to supplement the notion of white noise by the notion of broad band noise, which has constant spectrum over a wide (but not infinite) bandwidth. Indeed, it turns out that whiteness of the process and measurement noise is largely irrelevant to the operation of an optimal filter. What is actually needed is that the spectrum be substantially constant in key regions. This issue is discussed in detail in Goodwin, Aguero, Salgado, and Yuz (2009). These ideas expose a difficulty with the common practice of using variances to describe noise in the discrete time case. For example, say that the noise is broadband (but non-white) having spectral density Q covering a bandwidth of W; then the associated variance V is equal to the area under the spectrum, i.e. V = WQ. If one uses spectral density to describe the noise, then no difficulties will be encountered since the noise intensity has been correctly captured. However, say that the Nyquist frequency, 1/(2Δ), is greater than the noise bandwidth. Then, if one uses variance to describe the associated filter, the variance must be artificially scaled to V′ = V/(WΔ) to match the spectral densities. If this is not done then the associated filter will perform badly due to underestimation of the noise intensity.
A related problem is that variance does not indicate the difficulty of an estimation problem. For example, consider the case of very fast sampling. Then 1/Δ will be large. In this case, a small noise intensity (i.e. small spectral density) could be associated with a large noise variance. Yet, most of this noise power will lie at frequencies above the bandwidth of the system. Intuitively this part of the noise will not affect the filter performance. Again, it is only the spectral density in relevant parts of the spectrum that affects filter performance.
The above difficulties are overcome if one works with spectral density rather than variance. Moreover, this aligns the continuous and discrete cases, since spectral density (or equivalently incremental variance) is exclusively used in the continuous case.
In view of the above discussion, Eqs. (1) and (2) are more appropriately expressed in incremental form as:

dx = f_c(x) dt + g_c(x) dω    (4)
dz = h_c(x) dt + dν    (5)

where the processes ω and ν correspond to Brownian motion processes having incremental covariance Q_c dt and R_c dt respectively. Also, as discussed above, Q_c and R_c can equivalently be thought of as spectral densities for dω/dt and dν/dt respectively.
The linear equivalents of Eqs. (4) and (5) are

dx = A_c x dt + dω    (6)
dz = C_c x dt + dν    (7)

where x ∈ R^n, z ∈ R^m, A_c ∈ R^{n×n}, C_c ∈ R^{m×n}, dω ∈ R^n and dν ∈ R^m are the state, measured output, system matrices, process noise and measurement noise respectively. The initial state satisfies E{x_0} = x̂_0 and E{(x_0 − x̂_0)(x_0 − x̂_0)^T} = P_0. In the linear case, ω and ν are assumed to be stationary vector Wiener processes with incremental covariance Q_c dt and R_c dt respectively. The matrices Q_c and P_0 are assumed to be symmetric and positive semidefinite, and R_c is assumed to be symmetric and positive definite.
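Models of the form (4) and (5) can be simulated numerically. The following is a minimal scalar sketch (not from the paper) using Euler–Maruyama integration; the function name `simulate_sde` and its interface are illustrative assumptions.

```python
import math
import random

def simulate_sde(f_c, g_c, h_c, x0, Qc, Rc, dt, n_steps, seed=0):
    """Euler-Maruyama simulation of dx = f_c(x) dt + g_c(x) dw and
    dz = h_c(x) dt + dv, where the Brownian increments dw, dv have
    variance Qc*dt and Rc*dt respectively (scalar sketch)."""
    rng = random.Random(seed)
    x, z = x0, 0.0
    xs, zs = [x], [z]
    sq = math.sqrt(dt)
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(Qc) * sq)   # increment of omega
        dv = rng.gauss(0.0, math.sqrt(Rc) * sq)   # increment of nu
        x = x + f_c(x) * dt + g_c(x) * dw          # state equation (4)
        z = z + h_c(x) * dt + dv                   # integrated output (5)
        xs.append(x)
        zs.append(z)
    return xs, zs
```

Note that the simulation generates the integrated output z rather than dz/dt directly, consistent with the incremental formulation above.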
3. Choice of sampling strategy
Consider first the case of regular sampling with fixed period Δ. (This is sometimes called Riemann sampling (Astrom & Bernhardsson, 2002); here the focus is on the independent, time, variable.)
In Section 2, dz/dt was defined as the continuous time output (see Eqs. (2), (5) and (7)). The next step is to develop the form of the model when samples are taken. However, this begs the question: samples of what? Two possible options are explored below for the sampled output.
3.1. Direct sampling of dz/dt
At first glance, it seems plausible that one could directly sample the continuous process dz/dt. However, this choice is actually infeasible since the samples of the associated noise, dν/dt, would have infinite variance!
3.2. Sampling after passing through an anti-aliasing filter
An appropriate remedy to the difficulty described in Section 3.1 is to pass dz/dt through an anti-aliasing filter prior to sampling. A common choice for such a filter is to simply average dz/dt over the sample period. Actually, some form of averaging is inherent in all low pass filters that are typically used as anti-aliasing filters. In the case of averaging, the sampled output satisfies:

y_k = (1/Δ) ∫_{kΔ}^{(k+1)Δ} dz    (8)

y_k = (1/Δ) {z((k+1)Δ) − z(kΔ)}    (9)

To obtain a notation for the sampled data case which resembles the continuous case, the (discrete) increment in z is defined via

dz⁺ = z((k+1)Δ) − z(kΔ)    (10)

where the superscript ⁺ denotes next sampled value. In this case, Eq. (9) can be rewritten as

y_k = (1/Δ) dz⁺    (11)
4. Event based sampling

Next consider the case of event based sampling. (This is sometimes called Lebesgue sampling (Astrom & Bernhardsson, 2002); here the focus is on the dependent variable.)
Let {q_{i,j}} be a set of quantization levels for the jth output. These quantization levels could, for example, be evenly spaced so that

q_{i+1,j} − q_{i,j} = L_j ∈ R for j = 1, …, m    (12)

In event based sampling, the measured output is transmitted only when a quantization level has been crossed. Moreover, provided no bits are lost and provided a starting signal level is known, then only 1 bit/sample needs to be sent, to indicate that the signal has moved to the next interval above (+1) or the next interval below (−1). The difference between Riemann and Lebesgue sampling is illustrated in Fig. 1.
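The 1 bit/sample scheme described above can be sketched in code. This is an illustrative implementation, not code from the paper; the function name `lebesgue_sample` and its interface are assumptions.

```python
import math

def lebesgue_sample(y, L):
    """1-bit event-based (Lebesgue) sampler: given a sequence of
    filtered measurements y and evenly spaced quantization levels of
    width L, emit (k, +1) each time the signal rises into the next
    interval and (k, -1) each time it falls into the one below.
    Nothing is transmitted while the signal stays in its interval."""
    events = []
    cell = math.floor(y[0] / L)        # index of the starting interval
    for k, yk in enumerate(y[1:], start=1):
        new_cell = math.floor(yk / L)
        while new_cell > cell:         # one bit per level crossed upwards
            cell += 1
            events.append((k, +1))
        while new_cell < cell:         # one bit per level crossed downwards
            cell -= 1
            events.append((k, -1))
    return events
```

A receiver that knows the starting interval can reconstruct the quantized signal exactly by accumulating the ±1 bits, which is why only 1 bit/sample is needed.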
Next consider the design of the anti-aliasing filter. Here, a little more care is needed than in the case of Riemann sampling. Specifically, it is required that interesting events should trigger sampling. This raises the need to trade off noise immunity versus sensitivity to change. To illustrate, say that one uses the averaging filter given in (8) and (9). Then a sudden change in output may be masked by the effect of averaging an (almost) constant signal over a long period of time. Hence it is desirable to place a lower limit on the bandwidth of the anti-aliasing filter so as to achieve a compromise between sensitivity and noise averaging. This trade-off does not arise in Riemann sampling since there is no need to detect changes. In the case of the averaging filter, the trade-off can be achieved by simply resetting the averager when the sample period goes beyond some pre-determined upper limit, say, Δ_max.
There also exists a close connection between the choice of the anti-aliasing filter bandwidth and the quantization thresholds used in the event based sampler. The reason is that one needs to ensure that measurement noise does not cause frequent triggering of the event based sampling even if the signal component is substantially constant.
Simple design guidelines can be developed as follows. Say that the measurement noise is broadband with spectral density R and that an anti-aliasing filter with reset period Δ_max is used. Then the corresponding discrete measurement noise will have variance of approximately R/Δ_max. Assume that the quantization level spacing is L, and say that spurious triggering of the event based sampler should be avoided with high probability. This can be achieved by requiring that there is only a small probability that the discrete measurement noise has magnitude greater than L/2. To achieve this one might require

2σ ≤ L/2    (13)

where σ is the discrete noise standard deviation, i.e. σ = √(R/Δ_max). Eq. (13) is equivalent to

L ≥ 4√(R/Δ_max)    (14)

This equation links the anti-aliasing filter bandwidth, 1/Δ_max, the noise spectral density R and the quantization level spacing L so as to achieve a low probability that the noise will be greater than L/2. In practice, it is desirable to choose Δ_max as small as possible subject to satisfying (14), since large values of Δ_max compromise one's ability to detect changes in the signal component.
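Guideline (14) is a direct computation. A one-line helper (illustrative, not from the paper) makes the dependence on R and Δ_max explicit:

```python
import math

def min_level_spacing(R, delta_max):
    """Smallest quantization spacing L satisfying (14),
    L >= 4*sqrt(R/Delta_max), so that the filtered measurement noise
    (standard deviation sigma = sqrt(R/Delta_max)) exceeds L/2 only
    with small probability (a roughly 2-sigma design rule)."""
    return 4.0 * math.sqrt(R / delta_max)
```

For example, broadband noise with spectral density R = 1e-4 and a reset period Δ_max = 0.1 calls for L of at least about 0.13.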
5. Discrete time models

Here the model update period, which is denoted by Δ, is not necessarily the same as the sampling period, denoted T. Note that typically T > Δ, especially when event based sampling is utilized. For simplicity, the anti-aliasing filter is fixed as an averaging filter having period Δ. However, the extension to other anti-aliasing filters is straightforward.
5.1. The conventional discrete model for linear systems
First consider the linear case of (6) and (7). This case will
reveal several modelling issues which apply, mutatis mutandis, to
the non-linear case.
Consider the linear continuous model (6) and (7). An exact discrete time model describing the samples can be readily shown to be

x⁺ = A_d x + ω    (15)
y = C_d x + ν    (16)

where the system matrices take the following specific values:

A_d = e^{A_c Δ} = I + A_c Δ + (A_c Δ)²/2 + ⋯    (17)

C_d = (1/Δ) C_c A_c^{−1} (e^{A_c Δ} − I) = C_c [ I + (1/2!) A_c Δ + (1/3!) A_c² Δ² + ⋯ ]    (18)

The corresponding process and output noise processes have zero mean and covariance:

S̄_d = E{ [ω_k; ν_k] [ω_k; ν_k]^T } = [ Q_d  S_d ; S_d^T  R_d ]    (19)

where the covariance matrix is given by

S̄_d = D̄ ( ∫_0^Δ e^{Āt} [ Q_c  0 ; 0  R_c ] e^{Ā^T t} dt ) D̄    (20)

and where

Ā = [ A_c  0 ; C_c  0 ]  ⇒  e^{Āt} = [ e^{A_c t}  0 ; C_c ∫_0^t e^{A_c s} ds  I ]    (21)

D̄ = [ I  0 ; 0  (1/Δ) I ]    (22)
Even though the above sampled system is an exact description for every finite Δ, the model is a source of conceptual and numerical problems when the sampling period decreases to zero. For example, it is readily seen that, as Δ → 0:

A_d → I    (23)

S̄_d → [ 0  0 ; 0  ∞ ]    (24)

These results show that the discrete-time model (15) and (16) will be the source of difficulties as the sampling interval becomes small: the A_d matrix becomes the identity matrix, and the noise covariance matrix S̄_d tends to the uninformative values given in (24). These difficulties can be readily resolved by appropriate scaling of the model equations. This is shown in the next subsection.
Fig. 1. Riemann vs Lebesgue sampling (regular time sampling vs regular spatial sampling).
5.2. Incremental form of the sampled data linear model

Here, an alternative formulation of the discrete-time model, which has the same structure as the continuous-time model, is presented. The key tool used is to introduce appropriate scaling so that the limit Δ → 0 is meaningful. The alternative model provides conceptual advantages and superior numerical behavior at fast sampling rates; see Goodwin, Middleton, and Poor (1992), Feuer and Goodwin (1996), and Middleton and Goodwin (1990).
The problems illustrated in (23) and (24) suggest that the traditional approach to describing discrete-time models is not appropriate when fast sampling rates are employed. The remedy is to scale the equations to produce an equivalent incremental model1 expressed as follows:
dx⁺ = x_{k+1} − x_k = A_i x_k Δ + ω̄_k    (25)
dz⁺ = z_{k+1} − z_k = Δ y_k = C_i x_k Δ + ν̄_k    (26)

where it is readily seen using (17) and (18) that

A_i = (A_d − I)/Δ = A_c + (1/2) A_c² Δ + ⋯    (27)

C_i = C_d = C_c [ I + (1/2!) A_c Δ + ⋯ ]    (28)

The initial state satisfies E{x_0} = x̂_0 and E{(x_0 − x̂_0)(x_0 − x̂_0)^T} = P_0. The new process noise sequence is ω̄_k = ω_k, having covariance E{ω̄_k ω̄_k^T} = Q_d. For consistency with the continuous case, the noise covariance is expressed in incremental form (or equivalently using spectral density) by scaling by the sample period. Thus let

Q_d = Q_i Δ = [ Q_c + (Δ/2)(A_c Q_c + Q_c A_c^T) + ⋯ ] Δ    (29)

where Q_i can be interpreted as either incremental covariance or discrete noise spectral density.
For the system output equation, it is clear that, when an integrating anti-aliasing filter is used, the expression obtained for the output corresponds to increments of the variable z, i.e.

ν̄-scaled output: Δ y_k = ∫_{kΔ}^{(k+1)Δ} dz = z_{k+1} − z_k    (30)

The measurement noise sequence is now ν̄_k = Δ ν_k, having incremental covariance expressed as E{ν̄_k ν̄_k^T} = R_i Δ, where

R_i Δ = Δ² R_d = [ R_c + (Δ²/3) C_c Q_c C_c^T + ⋯ ] Δ    (31)

The cross-covariance is E{ω̄_k ν̄_k^T} = S_i Δ = S_d Δ.
Finally, if Δ is small, then the incremental model matrices can be approximated by retaining the first term in the expansions (27), (28) and (31) respectively, i.e.

A_i ≈ A_c,  C_i ≈ C_c,  Q_i ≈ Q_c,  R_i ≈ R_c,  S_i ≈ 0    (32)

Thus at fast sampling the incremental matrices are approximately the same as the underlying continuous time matrices. Note that the approximations given in (32) are equivalent to using Euler integration to obtain the incremental model. Also note that the use of Euler integration gives an approximation, whereas use of incremental models can give an exact description.
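The numerical contrast between the shift form (17) and the incremental form (27) is easy to see in a scalar sketch (illustrative; the value a_c = −1 is an arbitrary choice, not from the paper):

```python
import math

a_c = -1.0                         # continuous-time "matrix" (scalar case)
for delta in (1e-1, 1e-3, 1e-6):
    A_d = math.exp(a_c * delta)    # shift-form matrix, Eq. (17)
    A_i = (A_d - 1.0) / delta      # incremental-form matrix, Eq. (27)
    print(f"delta={delta:g}  A_d={A_d:.7f}  A_i={A_i:.7f}")
```

As Δ → 0, A_d approaches the uninformative value 1 (cf. Eq. (23)), whereas A_i stays close to the underlying continuous parameter a_c, which is the point of the incremental (delta operator) formulation.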
5.3. Incremental form of the sampled data non-linear model

Next, consider the non-linear case under the assumption that Δ, the model update period, is sufficiently small so that Euler integration gives a discrete model of sufficient accuracy (in practice this may require some experimentation to find a suitable value for Δ). Also note that Δ is the model update period, which is not necessarily equal to the sampling period T. The discrete model in incremental form can then be written as

dx⁺ ≜ x_{k+1} − x_k = f_i(x_k) Δ + ω_k    (33)
dz⁺ ≜ z_{k+1} − z_k = Δ y_k = h_i(x_k) Δ + ν_k    (34)

where

E{ω_k ω_k^T} = Q_i(x_k) Δ    (35)
E{ν_k ν_k^T} = R_i Δ    (36)

Also, if one uses Euler integration, the functions f_i, h_i, Q_i, R_i can be directly linked to the corresponding continuous functions as follows:

f_i(x) → f_c(x)    (37)
h_i(x) → h_c(x)    (38)
Q_i → Q_c    (39)
R_i → R_c    (40)
6. Review of the traditional discrete non-linear filter
The traditional discrete non-linear filter can now be directly formulated. The changes necessary to deal with event based sampling are dealt with later. Thus, consider a discrete time stochastic non-linear model of the form (33) and (34).
The problem of interest is to compute p(x_k | Y_k) (the conditional distribution of the state at time k given observations of y up to and including time k, i.e. Y_k = {y_0, …, y_k}).
A recursive set of equations is presented below that yields the solution to the above problem (see also Jazwinski, 1970).
One proceeds sequentially by first assuming that p(x_0 | Y_{-1}) is known. For example, this distribution might be Gaussian with mean x̂_0 and covariance P_0.
Next assume that p(x_k | Y_k) is known. Then the following state update law holds:

p(x_{k+1} | Y_k) = ∫ p(x_k | Y_k) p(x_{k+1} | x_k) dx_k    (41)

The impact of adding an observation, i.e. y_{k+1}, is described by

p(x_{k+1} | Y_{k+1}) = p(x_{k+1} | Y_k, y_{k+1}) = p(x_{k+1} | Y_k) p(y_{k+1} | x_{k+1}) / ∫ p(x_{k+1} | Y_k) p(y_{k+1} | x_{k+1}) dx_{k+1}    (42)

Eqs. (41) and (42) are often referred to as the Chapman–Kolmogorov equation and Bayes' rule respectively.
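For a scalar state, (41) and (42) can be realized directly on a fixed grid. The sketch below assumes Gaussian process and measurement noise; all names are illustrative, and this simple point-mass scheme is not the paper's MDF algorithm (which uses an adaptive grid, Section 8).

```python
import math

def gauss(r, var):
    """Gaussian density evaluated at residual r with variance var."""
    return math.exp(-0.5 * r * r / var) / math.sqrt(2 * math.pi * var)

def time_update(grid, p, f, Q):
    """Chapman-Kolmogorov equation (41) on a fixed grid:
    p(x_{k+1} | Y_k) from p(x_k | Y_k), Gaussian process noise."""
    new_p = []
    for xj in grid:
        s = sum(pi * gauss(xj - f(xi), Q) for xi, pi in zip(grid, p))
        new_p.append(s)
    z = sum(new_p)
    return [v / z for v in new_p]

def measurement_update(grid, p, y, h, R):
    """Bayes' rule (42): weight by the likelihood p(y | x), renormalize."""
    new_p = [pi * gauss(y - h(xi), R) for xi, pi in zip(grid, p)]
    z = sum(new_p)
    return [v / z for v in new_p]
```

Iterating these two functions propagates a discrete approximation to the conditional distribution p(x_k | Y_k).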
7. Modifications to deal with down-sampling
As argued in the Introduction, it may be highly inefficient to sample very quickly. Thus some form of down-sampling, or event based sampling, may be beneficial. Say that one begins with a sample period Δ. Then one can down-sample in several ways.
Two alternatives are discussed below.
7.1. Regular down-sampling
Say that it is desired to change the sample period by a fixed
factor, e.g. from D to mD, where D is assumed very small relatively
1 Sometimes called a delta operator model in the literature (Middleton &
Goodwin, 1990).
to the natural dynamics of the system. There are some subtle
issues that need to be considered.
The non-linear filter is now updated only at period mΔ. Assume that the original anti-aliasing filter is reset every Δ seconds, not every mΔ seconds. The correct strategy is now definitely not to simply take every mth sample and throw the rest away! Clearly this would lead to a highly suboptimal filter since most of the data will have been discarded. On the contrary, if one decides to increase the sampling period from Δ to mΔ, then a new anti-aliasing filter relevant to the new sample period mΔ is desirable. For example, say that one uses the usual averaging filter; then a new observation sequence2

dz̄_l = Σ_{k=1}^{m} dz_{m(l−1)+k}    (43)

can be digitally constructed before using the discrete non-linear filter.
If implemented properly, the step of down-sampling can lead to major computational improvements without significant degradation in performance. Indeed, the example below shows that, in this illustrative case, one can down-sample by several orders of magnitude, with a corresponding reduction of several orders of magnitude in the computational effort, without significantly changing the computed conditional probability.
7.1.1. Example
Consider the following simple discrete non-linear system:

x_{t+1} = a x_t + ω′_t    (44)
y_t = x_t² + ν′_t    (45)

where a = 0.999, E{(ω′_t)²} = 10^{−2}, E{(ν′_t)²} = 10^4, and Δ = 10^{−3}. The magnitudes of E{(ω′_t)²} and E{(ν′_t)²} may seem counterintuitive, but these scalings are a consequence of the ideas described earlier in Section 5.1. The system (44) and (45) is actually more intuitive when expressed in the equivalent incremental form:

dx⁺ = f_i(x_k) Δ + ω_k    (46)
dz⁺ = h_i(x_k) Δ + ν_k    (47)

where f_i(x_k) = −x_k, h_i(x_k) = x_k², and ω_k, ν_k both have incremental covariance of 10Δ.
It seems heuristically clear that the sample period of 10^{−3} may lead to wasted computational effort. Thus down-sampling is introduced using the strategy explained in (43). Figs. 2, 3 and 4 show the evolution of p(x_k | Z_k) for Δ = 10^{−3} and the down-sampled versions with mΔ = 10^{−2} and mΔ = 10^{−1} respectively. Inspection of the plots indicates that there is no noticeable deterioration in the computed posterior probability. However, at mΔ = 10^{−1}, the total computational load has been reduced by two orders of magnitude relative to the use of Δ = 10^{−3}! Note that the introduction of the new anti-aliasing filter in (43) is crucial in achieving these results.
7.2. Event based sampling

At first glance it may seem that the extension to event based sampling is immediate, i.e. all one needs to do is run the state update (41) at period Δ (chosen sufficiently small so that Euler integration gives an adequate approximation) and then use the observation update (42) when one decides that a sufficiently interesting change in the output has occurred. Certainly the observations are only needed when a threshold has been crossed. However, it is not true that there is zero relevant information between threshold crossings. On the contrary, there is a valuable piece of information, namely that the output has not crossed a threshold. Hence, estimates can continue to be updated between
Fig. 2. Time evolution of the probability density function at fast sampling, Δ = 0.001.
Fig. 3. Time evolution of the probability density function at fast sampling, Δ = 0.01.
Fig. 4. Time evolution of the probability density function at fast sampling, Δ = 0.1.
2 Note that this step of using a new digital anti-aliasing filter is very helpful and does not appear to be widely appreciated.
threshold crossings, provided an appropriate change is made to the observation update formula. Specifically, consider the situation illustrated in Fig. 5 where, at the kth time instant, it is known that

y_k ∈ (Q_a, Q_b) ≜ Q_k    (48)

The observation update (42) in the non-linear filter can now be modified to the following, which explicitly utilizes (48):

p(x_{k+1} | Y_k, y_{k+1} ∈ Q_k) = p(x_{k+1} | Y_k) ∫_{Q_k} p(y_{k+1} | x_{k+1}) dy_{k+1} / ∫ [ ∫_{Q_k} p(x_{k+1} | Y_k) p(y_{k+1} | x_{k+1}) dy_{k+1} ] dx_{k+1}    (49)

Note that if one simply chooses not to update the states, then the state estimation uncertainty will grow due to the drift term inherent in the state update (33). Use of (49) avoids this problem. Actually, this is different from the common strategy used in much of the existing event based sampling literature, where updates are usually restricted to cases when a threshold is crossed (Anta & Tabuada, 2008, 2009; Arzen, 1999; Byrnes & Isidori, 1989; Le & McCann, 2007; McCann & Le, 2008; Otanez et al., 2002; Pawlowski et al., 2009; Tabuada, 2007; Xu & Cao, 2011). Some authors, e.g. Sijs and Lazar (2009) and Marck and Sijs (2010), have noted that, for the case of linear filtering, it is desirable to continue to update based on the known fact that the output lies within the quantization thresholds. This is the idea captured in (49) for the case of non-linear filtering. Of course, in practice, the integrals in (49) will need to be approximated. The approximation issue is discussed below via particle filtering and vector quantization methods.
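One way to realize (49) on a grid is to note that, for Gaussian measurement noise, the inner integral over Q_k becomes a difference of Gaussian CDFs. The sketch below is illustrative, not the paper's implementation; the names and the scalar setting are assumptions.

```python
import math

def norm_cdf(x, mu, var):
    """Gaussian cumulative distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))

def interval_update(grid, p, qa, qb, h, R):
    """Event-based observation update, Eq. (49): instead of a point
    likelihood p(y | x), weight each grid point by the probability
    Pr(y in (qa, qb) | x) -- a CDF difference for Gaussian noise --
    then renormalize the grid distribution."""
    new_p = [pi * (norm_cdf(qb, h(xi), R) - norm_cdf(qa, h(xi), R))
             for xi, pi in zip(grid, p)]
    z = sum(new_p)
    return [v / z for v in new_p]
```

Between threshold crossings this update is applied with the current interval (Q_a, Q_b), so the filter keeps extracting information from the fact that no event has occurred.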
8. Spatial quantization

Next consider the issue of spatial quantization. As is clear from (41) and (42), the conditional probability for the states is a function in a high dimensional space. Also, evolution of this function requires the evaluation of high dimensional integrals, as is evident from the right hand sides of (41) and (42). Such integrals cannot be computed in practice without some form of discretization of the spatial coordinates. Two strategies are described below to achieve spatial quantization, namely particle filtering and minimum distortion filtering. The former strategy has its strengths in that the number of particles is independent of dimension, but it requires a large number of points to accurately describe the problem. The latter strategy uses a small number of points for low dimensional problems, but its computational cost increases in higher dimensions due to the need for extra grid points.
8.1. Particle filtering

This technique achieves spatial quantization by drawing a set of random samples from the disturbance distribution. Thus, a discrete approximation to the posterior distribution is generated which is based on a set of randomly chosen points. The approximation converges, in probability, with order 1/√N, where N is the number of chosen samples (Crisan & Doucet, 2002). The main disadvantage of this strategy is that a large number of points may be needed. Also, these points need, in principle, to be related to the distribution of interest, and the method suffers from degeneracy of the particles. Moreover, the number of points will grow exponentially with time unless some form of reduction is used. Thus, there are many fixes needed to get this type of algorithm to work in practice. Such fixes include the use of proposal distributions, resampling methods, etc. For details the reader is referred to Chen (2003).
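A minimal bootstrap particle filter step for a scalar model of the form (33) and (34), assuming Gaussian noise, illustrates the propagate/weight/resample cycle. The names and interface are assumptions, and none of the practical fixes mentioned above (proposal distributions, adaptive resampling, etc.) are included.

```python
import math
import random

def bootstrap_pf_step(particles, y, f, h, Q, R, rng):
    """One cycle of a bootstrap particle filter (scalar sketch):
    propagate each particle through the state equation (time update),
    weight by the Gaussian measurement likelihood, then perform
    multinomial resampling to combat weight degeneracy."""
    # time update: sample from p(x_{k+1} | x_k)
    pred = [f(x) + rng.gauss(0.0, math.sqrt(Q)) for x in particles]
    # measurement update: unnormalized likelihood weights p(y | x)
    w = [math.exp(-0.5 * (y - h(x)) ** 2 / R) for x in pred]
    tot = sum(w)
    w = [v / tot for v in w]
    # resampling step: draw a new equally weighted population
    return rng.choices(pred, weights=w, k=len(pred))
```

The resulting particle set is an equally weighted sample approximation to p(x_{k+1} | Y_{k+1}); its sample mean and variance approximate the posterior moments.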
8.2. Minimum distortion filtering (MDF)

This is a new class of algorithm. It was first described in Goodwin et al. (2010). The MDF algorithm belongs to the class of deterministic gridding methods. There exist many algorithms within this framework. Some of them use fixed grid methods, where the choice of the grid is based on pre-known information regarding the problem and is never updated. Another method is presented in Bucy and Senne (1971), where a gridding method based on the mean plus an ellipsoid determined by the covariance of the probability density function is used. Other approaches include adaptive uniform grid methods, e.g. Bergman (1998), where a uniform resolution grid with adaptive resolution is used and the grid is relocated depending on the likelihood of the current grid points. By contrast, the MDF algorithm is a method where the grid is non-uniform and is adapted at each sampling instant. The adaptation step depends on vector quantization of the current estimate of the probability density function. This technique provides the algorithm with the capacity to relocate grid points where they are most needed. The non-uniform characteristic allows for a tailored location of the grid, without wasting points in unimportant regions, e.g. between the modes of multimodal distributions. A summary of the algorithm is presented below.
The key idea underlying this class of algorithm is to utilize vector quantization to generate, on-line, a finite approximation to the a-posteriori distribution of the states.
Say that one begins with a discrete approximation to the distribution of x_0 on N_x grid points. Also assume that one has a finite approximation to the distribution of the process noise on N_w grid points. These approximations can be generated off-line. Then, utilizing the discretized version of Eq. (41), one obtains a finite approximation to p(x_1) on N_x·N_w grid points. Then, one uses the discrete equivalent of (42) to obtain a finite approximation to p(x_1 | y_1) on N_x·N_w points. Finally, one uses vector quantization ideas to re-approximate p(x_1 | y_1) back to N_x points. (How this crucial last step is performed is described in detail below.) Then, one returns to the beginning to obtain a discrete approximation to p(x_2 | y_1) on N_x·N_w points, and so on. The algorithm is summarized in Table 1.
The key step in the MDF algorithm is the vector quantization
step (step 5 in Table 1). Details of this step are given below.
Fig. 5. Inter-sample illustration.

Table 1
MDF algorithm.

Step  Description
1     Initialization: quantize p(x_0) to N_x points (x̄_i, p_i), i = 1, …, N_x. Quantize p(ω) to N_w points (w̄_j, q_j), j = 1, …, N_w
2     Begin with p(x_k | Y_k) represented by (x̄_i, p_i), i = 1, …, N_x
3     Approximate p(x_{k+1} | Y_k) via (41) on N_x·N_w points
4     Evaluate p(x_{k+1} | Y_{k+1}) on N_x·N_w points via (42)
5     Quantize back to N_x points
6     Go to 2
Assume one has a discrete representation of some distribution p(x), where x ∈ R^n, quantized to a very large (but finite) set of points. The goal is to quantize p(x) to a smaller finite set of points (x̄_i, p_i), i = 1, …, N. The first step in vector quantization is to define a measure to quantify the distortion of a given discrete representation. This measure is then optimized to find the optimal representation which minimizes the cost. In summary, one seeks a finite set W_x = {x̄_1, …, x̄_N} and an associated collection of sets S = {S_1, …, S_N} such that ∪_{i=1}^N S_i = R^n and S_i ∩ S_j = ∅ for i ≠ j. The quantities W_x, S_x are chosen by minimizing a cost function of the form:

J(W_x, S_x) = Σ_{i=1}^N E{(x − x̄_i)^T W (x − x̄_i) | x ∈ S_i}    (50)

where W = diag(W_1, …, W_N). Other choices of the distance measure can also be used, e.g. Manhattan, L1, Jaccard, etc.; see Tan, Steinbach, and Kumar (2005).
If x̄_1, …, x̄_N (the set of grid points) are given, then the optimal choice of the sets S_i is the, so-called, Voronoi cells (Gersho & Gray, 1992; Graf & Lushgy, 2000):

S_i = { x | (x − x̄_i)^T W (x − x̄_i) ≤ (x − x̄_j)^T W (x − x̄_j), ∀ j ≠ i }    (51)
Similarly, if the sets S1,
. . .,
SN are given, then the optimalchoice for xi is the centroid of the sets Si, i.e.
xi Ex9xASi 52
Many algorithms exist for minimizing functions of the form (50) to produce a discrete approximation. One class of algorithm (known as the k-means algorithm or Lloyd's algorithm, Gersho & Gray, 1992; Graf & Luschgy, 2000; Lloyd, 1982) iterates between the two conditions (51) and (52).

Thus Lloyd's algorithm begins with an initial set of grid points W_x = {x_i; i = 1, ..., N_x}. Then one calculates the Voronoi cells S_x of W_x using (51). Next, one computes the centroids of the Voronoi cells S_x via (52). One then returns to the calculation of the associated Voronoi cells, and so on. Lloyd's algorithm iterates these steps until the distortion measure (50) reaches a local minimum, or until the change in the distortion measure falls below a given threshold, i.e.

|J(W_x^{k+1}, S_x^{k+1}) - J(W_x^k, S_x^k)| / J(W_x^k, S_x^k) ≤ ε    (53)

where W_x^k and S_x^k are the codebook and Voronoi cells at iteration k, respectively.
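The iteration between (51) and (52) is straightforward to sketch in code. The example below is a minimal unweighted 2-D instance with W = I, where sample-based centroids stand in for the conditional expectation in (52); the two-cluster test data are illustrative only.

```python
import numpy as np

def lloyd(samples, N, eps=1e-4, max_iter=100):
    """Iterate between (51) and (52), stopping via the relative test (53)."""
    # Deterministic initial codebook: N samples spread along the first coordinate
    order = np.argsort(samples[:, 0])
    centers = samples[order[np.linspace(0, len(samples) - 1, N).astype(int)]].copy()
    J_prev = np.inf
    for _ in range(max_iter):
        # (51): Voronoi assignment with W = I (nearest center in Euclidean distance)
        d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        cell = d2.argmin(axis=1)
        J = d2[np.arange(len(samples)), cell].mean()   # distortion (50)
        if np.isfinite(J_prev) and (J_prev - J) / J_prev <= eps:
            break                                      # stopping rule (53)
        J_prev = J
        # (52): move each center to the centroid of its Voronoi cell
        for i in range(N):
            if np.any(cell == i):
                centers[i] = samples[cell == i].mean(axis=0)
    return centers, J

rng = np.random.default_rng(42)
samples = np.vstack([rng.normal(-2.0, 0.3, size=(200, 2)),
                     rng.normal(+2.0, 0.3, size=(200, 2))])
centers, J = lloyd(samples, N=2)
```

Each assignment step and each centroid step can only decrease the distortion, so the sequence J^k is non-increasing and the relative test (53) eventually fires at a local minimum.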
In order to obtain satisfactory results with the MDF algorithm, various practical steps are necessary. These include the use of fast sampling, scaling, and clustering; see Goodwin and Cea (2011).
9. Example
Consider the practical problem of radar tracking using range and bearing measurements. Both particle filtering and MDF methods are used below for the spatial quantization step. In addition, event based sampling is used and compared with regular sampling.
[Figure: first-moment estimates of x_1 (upper panel) and x_2 (lower panel) versus sample number; traces: True, MDF, PF.]
Fig. 6. First moment estimation using MDF, particle and true filters.
[Figure: second central-moment estimates of x_1 (upper panel) and x_2 (lower panel) versus sample number; traces: True, MDF, PF.]
Fig. 7. Second central moment estimation using MDF, particle and true filters.
Table 2
Root mean square error.

Algorithm   Mean x_1   Mean x_2   Variance x_1   Variance x_2
MDF         4.043      4.5689     29.109         27.585
PF          6.644      6.870      38.023         46.6718
[Figure: range (upper panel) and bearing in radians (lower panel) versus sample number; traces: True, Lebesgue, Riemann.]
Fig. 8. Range and bearing trajectory.
Consider the following two-state model:

x_1(k+1) = x_1(k) + Δ v_1(k) + ω_1(k)    (54)
x_2(k+1) = x_2(k) + Δ v_2(k) + ω_2(k)    (55)

where Δ = 0.1 is the sampling period and x = [x_1 x_2]^T ∈ R^2 is the state vector. The input v = [v_1 v_2]^T ∈ R^2 corresponds to the speed of the object in Cartesian coordinates, and ω = [ω_1 ω_2]^T ∈ R^2 is process noise (say wind gusts or unmeasured speed variations) with covariance:

Q_d = diag(100, 100) Δ    (56)
The range and bearing measurements are given by the following equations (Floudas, Polychronopoulos, & Amditis, 2005):

y_1(k) = \sqrt{x_1(k)^2 + x_2(k)^2} + n_1(k)    (57)
y_2(k) = \arctan(x_1(k) / x_2(k)) + n_2(k)    (58)

The measurement vector is thus y = [y_1 y_2]^T ∈ R^2, and the measurement noise n = [n_1 n_2]^T ∈ R^2 is taken to have covariance:

R_d = diag(0.6, 0.06) Δ^{-1}    (59)
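The model (54)-(58) is easy to simulate. A minimal sketch follows; the constant speed input, the noise draws and the initial state used here are illustrative assumptions, not the trajectory used in the paper's figures.

```python
import numpy as np

rng = np.random.default_rng(0)
Delta = 0.1                            # sampling period from (54)-(55)
steps = 200
v = np.array([1.0, 0.5])               # assumed constant speed input (illustrative)
Qd = np.diag([100.0, 100.0]) * Delta   # process noise covariance (56)
Rd = np.diag([0.6, 0.06]) / Delta      # measurement noise covariance (59)

x = np.array([35.0, 23.0])             # illustrative initial state
X, Y = [], []
for k in range(steps):
    # state propagation (54)-(55)
    x = x + Delta * v + rng.multivariate_normal(np.zeros(2), Qd)
    # range (57) and bearing (58) measurements
    y = np.array([np.hypot(x[0], x[1]),
                  np.arctan2(x[0], x[1])]) + rng.multivariate_normal(np.zeros(2), Rd)
    X.append(x.copy()); Y.append(y)
X, Y = np.asarray(X), np.asarray(Y)
```

Here arctan2(x_1, x_2) recovers the same angle as arctan(x_1/x_2) in (58) while remaining well defined when x_2 = 0.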
The MDF tuning parameters are taken to be N_x = 49, N_w = 9, ζ = 10^{-20} and ε = 10%. For the particle filter, 1000 particles were used. This yields approximately equal computational load per sample for the MDF and particle methods. Both filters used the same initial condition for the state, i.e. a Gaussian distribution with x̂_0 = [35 23]^T and covariance P_0 = diag(100, 100).

Figs. 6 and 7 show the mean and variance of the state estimate.
As can be seen, the MDF and particle filter give similar results.
Moreover, these results are almost identical to the true estimates. The latter were computed (for comparison purposes only) using a very fine gridding of the state space.
Table 2 shows the root mean square error for the mean and
variance estimates using MDF and PF algorithms. These results
show that the performance of the MDF algorithm is better than
that obtained by PF methods.
Next, regular (Riemann) sampling and event based (Lebesgue) sampling are compared. For the former, 8 bits were utilized to represent each sample and one sample was taken per second. For the latter, the quantization thresholds were set at 50 and 0.9 for range and bearing respectively. Fig. 8 compares the reconstructed range and bearing for the two filters. Fig. 9 shows the sampling times for range (upper two traces) and bearing (lower two traces).
It can be seen from Fig. 8 that the estimates produced by Lebesgue sampling are very close to those produced by Riemann sampling. This occurs despite the obvious difference in sampling rates shown in Fig. 9. Indeed, the Riemann sampling strategy uses 8 bits/sample at 1 sample/s, i.e. a data rate of 8 bits/s. On the other hand, the Lebesgue sampling strategy uses only 1 bit/sample (up or down) at an average of 0.2 samples/s. The latter corresponds to an average data rate of 0.2 bits/s, which is 40 times less than that used in the case of regular sampling.
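The data-rate comparison can be reproduced on a synthetic signal. The sketch below implements a send-on-delta (Lebesgue) event generator emitting 1-bit up/down events and compares its average rate against 8-bit regular samples; the sinusoidal trajectory and the threshold of 50 are illustrative stand-ins, not the radar data behind the paper's 40:1 figure.

```python
import numpy as np

def lebesgue_events(signal, threshold):
    """Send-on-delta event generator: emit a 1-bit up/down event each time the
    signal moves a full threshold away from the level at the last event."""
    events, level = [], signal[0]
    for k, s in enumerate(signal):
        while s - level >= threshold:
            level += threshold
            events.append((k, +1))     # 'up' event
        while level - s >= threshold:
            level -= threshold
            events.append((k, -1))     # 'down' event
    return events

t = np.arange(0.0, 200.0, 1.0)                  # 1 sample/s over 200 s
signal = 100.0 * np.sin(2 * np.pi * t / 200.0)  # illustrative slow trajectory
events = lebesgue_events(signal, threshold=50.0)

riemann_rate = 8 * 1.0               # 8 bits/sample at 1 sample/s = 8 bits/s
lebesgue_rate = len(events) / t[-1]  # 1 bit/event, averaged over the run
```

Because each event carries only its direction, the receiver reconstructs the signal to within one threshold by accumulating the up/down bits, which is why slowly varying signals yield a data rate far below that of regular sampling.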
10. Conclusions
This paper has described the use of event based sampling in
the context of non-linear filtering. Special issues regarding the
choice of anti-aliasing filter have been addressed. Also, a realistic
example has been presented showing that the required data rate
can be reduced by more than an order of magnitude (40:1 for the
given example) whilst retaining essentially the same estimation
accuracy.
References
Anta, A., & Tabuada, P. (2008). Self-triggered stabilization of homogeneous control systems. In: American control conference (pp. 4129-4134). IEEE.
Anta, A., & Tabuada, P. (2009). On the benefits of relaxing the periodicity assumption for networked control systems over CAN. In: 30th IEEE real-time systems symposium (pp. 3-12). IEEE.
Anta, A., & Tabuada, P. (2010). To sample or not to sample: Self-triggered control for nonlinear systems. IEEE Transactions on Automatic Control, 55(9), 2030-2042.
Arzen, K. (1999). A simple event-based PID controller. In: Proceedings of the 14th IFAC world congress, Vol. 18.
Astrom, K., & Bernhardsson, B. (2002). Comparison of Riemann and Lebesgue sampling for first order stochastic systems. In: Proceedings of the 41st IEEE conference on decision and control, Vol. 2.
Astrom, K. J., & Wittenmark, B. (1990). Computer controlled systems: Theory and design (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.
Bergman, N. (1998). An interpolating wavelet filter for terrain navigation. In: Proceedings of the conference on multisource-multisensor information fusion (pp. 251-258).
Bucy, R., & Senne, K. (1971). Digital synthesis of non-linear filters. Automatica, 7(3), 287-298.
Byrnes, C., & Isidori, A. (1989). New results and examples in nonlinear feedback stabilization. Systems & Control Letters, 12(5), 437-442.
Cea, M., Goodwin, G., & Feuer, A. (2010). A discrete nonlinear filter for fast sampled problems based on vector quantization. In: American control conference (ACC), July (pp. 1399-1403).
[Figure: sampling instants versus discrete time index k, with Lebesgue and Riemann traces for range (upper) and bearing (lower).]
Fig. 9. Sampling instants.
Chen, Z. (2003). Bayesian filtering: From Kalman filters to particle filters, and beyond. Available at: http://users.isr.ist.utl.pt/~jpg/tfc0607/chen_bayesian.pdf.
Crisan, D., & Doucet, A. (2002). A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3), 736-746.
Feuer, A., & Goodwin, G. (1996). Sampling in digital signal processing and control. Boston, MA: Birkhauser.
Floudas, N., Polychronopoulos, A., & Amditis, A. (2005). A survey of filtering techniques for vehicle tracking by radar equipped automotive platforms. In: 8th international conference on information fusion, July 2005 (Vol. 2, p. 8).
Gersho, A., & Gray, R. M. (1992). Vector quantization and signal compression. Springer International Series in Engineering and Computer Science.
Goodwin, G., Aguero, J., Salgado, M., & Yuz, J. I. (2009). Variance or spectral density in sampled data filtering? In: 4th international conference on optimization and control with applications (OCA2009), 6-11 June, Harbin, China.
Goodwin, G., & Cea, M. G. (2011). Temporal and spatial quantization in nonlinear filtering. In: 4th international symposium on advanced control of industrial processes, 23-27 May.
Goodwin, G. C., Feuer, A., & Muller, C. (2010). Sequential Bayesian filtering via minimum distortion filtering. In: Three decades of progress in control sciences (1st ed.). Springer.
Goodwin, G. C., Middleton, R. H., & Poor, H. V. (1992). High-speed digital signal processing and control. Proceedings of the IEEE, 80(2), 240-259.
Graf, S., & Luschgy, H. (2000). Foundations of quantization for probability distributions. In: Lecture notes in mathematics (Vol. 1730). Springer.
Handschin, J., & Mayne, D. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. International Journal of Control, 9(5), 547-559.
Hristu-Varsakelis, D., & Levine, W. (2005). Handbook of networked and embedded control systems. Birkhauser.
Jazwinski, A. (1970). Stochastic processes and filtering theory. San Diego, CA: Academic Press.
Le, A., & McCann, R. (2007). Event-based measurement updating Kalman filter in network control systems. In: 2007 IEEE region 5 technical conference (pp. 138-141).
Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28, 127-135.
Marck, J. W., & Sijs, J. (2010). Relevant sampling applied to event-based state-estimation. In: Proceedings of the 4th international conference on sensor technologies and applications (SENSORCOMM) (pp. 618-624).
McCann, R., & Le, A. T. (2008). Lebesgue sampling with a Kalman filter in wireless sensors for smart appliance networks. In: Conference record, IAS annual meeting. IEEE Industry Applications Society.
Middleton, R., & Goodwin, G. C. (1990). Digital control and estimation: A unified approach. Englewood Cliffs, NJ: Prentice Hall.
Otanez, P., Moyne, J., & Tilbury, D. (2002). Using deadbands to reduce communication in networked control systems. In: American control conference, Vol. 4.
Pawlowski, A., Guzman, J. L., Rodriguez, F., Berenguel, M., Sanchez, J., & Dormido, S. (2009). The influence of event-based sampling techniques on data transmission and control performance. In: ETFA IEEE conference on emerging technologies and factory automation.
Schon, T. B. (2006). Estimation of nonlinear dynamic systems: Theory and applications. Ph.D. Thesis, Linkoping Studies in Science and Technology. http://www.control.isy.liu.se/research/reports/Ph.D.Thesis/PhD998.pdf.
Sijs, J., & Lazar, M. (2009). On event based state estimation. In: Lecture notes in computer science, Vol. 5469.
Tabuada, P. (2007). Event-triggered real-time scheduling of stabilizing control tasks. IEEE Transactions on Automatic Control, 52(9), 1680-1685.
Tan, P.-N., Steinbach, M., & Kumar, V. (2005). Introduction to data mining. Addison-Wesley.
Xu, Y., & Cao, X. (2011). Lebesgue-sampling-based optimal control problems with time aggregation. IEEE Transactions on Automatic Control, 56(5), 1097-1109.