
Active Information Acquisition under Arbitrary Unknown Disturbances

Jennifer Wakulicz, He Kong, and Salah Sukkarieh

Abstract— Trajectory optimization of sensing robots to actively gather information of targets has received much attention in the past. It is well-known that under the assumption of linear Gaussian target dynamics and sensor models the stochastic Active Information Acquisition problem is equivalent to a deterministic optimal control problem. However, the above-mentioned assumptions regarding the target dynamic model are limiting. In real-world scenarios, the target may be subject to disturbances whose models or statistical properties are hard or impossible to obtain. Typical scenarios include abrupt maneuvers, jumping disturbances due to interactions with the environment, anomalous misbehaviors due to system faults/attacks, etc. Motivated by the above considerations, in this paper we consider targets whose dynamic models are subject to arbitrary unknown inputs whose models or statistical properties are not assumed to be available. In particular, with the aid of an unknown input decoupled filter, we formulate the sensor trajectory planning problem to track evolution of the target state and analyse the resulting performance for both the state and unknown input evolution tracking. Inspired by concepts of Reduced Value Iteration, a suboptimal solution that expands a search tree via Forward Value Iteration with informativeness-based pruning is proposed. Concrete suboptimality performance guarantees for tracking both the state and the unknown input are established. Numerical simulations of a target tracking example are presented to compare the proposed solution with a greedy policy.

I. INTRODUCTION

Due to its vast applications, such as environmental monitoring [1], target/source motion tracking/localization [2]-[3], and agriculture [4]-[5], sensor management has been studied extensively in the robotics and automation literature, in terms of communication management [6]-[7] and sensor trajectory planning [8]-[14], etc. Closely related problems of sensor scheduling and sensor placement have also received much attention in the control community [15]-[19].

The problem of managing one or more sensor-equipped mobile robots’ trajectories to maximize the information gathered regarding a target system/process is known as Active Information Acquisition (AIA). The AIA problem is commonly formulated as a stochastic control problem where the mutual information between sensor measurements and the target state is optimized [9]. When the target’s motion dynamics is linear and driven only by Gaussian noise, and the sensor’s observation model is linear in the target state, it is well-known that the stochastic AIA problem reduces to a deterministic optimal control problem for which open-loop solutions are optimal (see [8] and the references therein). Recent works have also established tree-search based methods and algorithms to efficiently approximate the optimal policy while maintaining suboptimality guarantees [9], [15].

The authors are with the Australian Centre for Field Robotics, The University of Sydney, NSW, 2006, Australia. Emails: [email protected], [email protected], [email protected]. Corresponding author: He Kong.

However, the assumptions made regarding the target motion model in the aforementioned works are often limiting. For example, the target may be subject to arbitrary unknown disturbances which are difficult, if not impossible, to statistically interpret or model. Typical examples in applications include systems under fault/attacks [20]-[22], tracking/localisation of targets subject to abrupt maneuvers [23]-[27], advanced vehicle applications under complex tire-ground interactions [28]-[31], estimation of unmeasured forces in grasping/manipulation [32]-[33], etc. In fact, filtering and estimation under arbitrary unknown inputs have received much attention in the control literature [34]-[37], and have found numerous applications, including robustness/security analysis and synthesis of resilient autonomous robots and connected vehicle systems [38]-[41].

Motivated by the above considerations, here we consider targets whose dynamics are subject to arbitrary unknown disturbances. In this case, it is of interest to track the evolution of both the target and the unknown disturbances. To circumvent complexities of such an approach, we formulate and solve the AIA problem for tracking the target state and analyse the resulting performance of both target state and unknown disturbance tracking. Firstly, we show that both the state and input error covariance update maps given in existing unknown input filtering works [34]-[37] are concave and monotone. To the best of our knowledge, these properties have not previously been explored. Secondly, inspired by the concepts of Reduced Value Iteration (RVI) [9], [15], we propose a suboptimal solution to the AIA problem using Forward Value Iteration (FVI) with pruning according to an information dominance metric. Concrete suboptimality performance guarantees for tracking both the target state and the unknown disturbance are established. Finally, we use a target tracking example to show the merits of the proposed solution in comparison to a greedy policy.

II. PRELIMINARIES AND PROBLEM FORMULATION

A. Preliminaries of filtering under unknown inputs

Consider a mobile sensor with discrete dynamics model:

$$x_{k+1} = f(x_k, u_k), \tag{1}$$

where $x_k \in \mathcal{X} \cong \mathbb{R}^{n_x}$ is the sensor state with $\mathcal{X}$ being an $n_x$-dimensional state space with metric $d_{\mathcal{X}}$, and $u_k \in \mathcal{U}$ is the control input with $\mathcal{U}$ as a finite space of admissible controls. Suppose there exists a target with linear time-varying motion model:

$$y_{k+1} = A_k y_k + G_k d_k + w_k, \tag{2}$$

arXiv:2109.09079v1 [cs.RO] 19 Sep 2021

where $y_k \in \mathbb{R}^{n_y}$ is the target state vector, the target process noise $w_k \sim \mathcal{N}(0, Q_k)$, i.e. $w_k$ is normally distributed with zero mean and covariance $Q_k \in \mathbb{R}^{n_y \times n_y}$, $Q_k \succeq 0$, $d_k \in \mathbb{R}^{n_d}$ represents arbitrary unknown inputs whose models or statistical properties are not assumed to be known, and $A_k$, $G_k$ are known matrices of compatible dimensions. Without loss of generality, we assume that $\mathrm{rank}(G_k) = n_d$. While in operation, the sensor has observation model:

$$z_k = C_k(x_k) y_k + v_k(x_k), \tag{3}$$

where $z_k \in \mathbb{R}^{n_z}$ is the measurement, $C_k(x_k) \in \mathbb{R}^{n_z \times n_y}$ is a known measurement matrix, and the measurement noise $v_k(x_k) \sim \mathcal{N}(0, R_k(x_k))$ with $R_k(x_k) \succ 0$. For brevity we drop the dependence of $C_k$, $R_k$, $v_k$ on the sensor state $x_k$ in the remainder of the paper.

For filtering purposes, we adopt the framework of [34] (other methods in [20], [37] can be similarly considered) and implement the following steps recursively after initialization:

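1. Time update:

$$\hat{y}_{k|k-1} = A_{k-1}\hat{y}_{k-1|k-1}, \tag{4}$$

2. Unknown input estimation:

$$\hat{d}_{k-1} = M_k(z_k - C_k\hat{y}_{k|k-1}), \quad M_k \in \mathbb{R}^{n_d \times n_z}, \tag{5}$$

3. Measurement update:

$$\hat{y}^{\star}_{k|k} = \hat{y}_{k|k-1} + G_{k-1}\hat{d}_{k-1}, \qquad \hat{y}_{k|k} = \hat{y}^{\star}_{k|k} + K_k(z_k - C_k\hat{y}^{\star}_{k|k}), \quad K_k \in \mathbb{R}^{n_y \times n_z}. \tag{6}$$

Define

$$\tilde{d}_{k-1} = d_{k-1} - \hat{d}_{k-1}, \quad \Sigma^d_{k-1} = E[\tilde{d}_{k-1}\tilde{d}^T_{k-1}], \qquad \tilde{y}_{k|k} = y_k - \hat{y}_{k|k}, \quad \Sigma_k = E[\tilde{y}_{k|k}\tilde{y}^T_{k|k}], \tag{7}$$

as the unknown input estimation error, the filtered state error, and their respective covariances.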
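The recursion in steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration that assumes the gains `M` and `K` are already given; the function name `uif_step`, the toy matrices and the noise-free measurement are hypothetical choices made only for this example and do not come from the paper.

```python
import numpy as np

def uif_step(y_prev, z, A_prev, G_prev, C, M, K):
    """One pass through steps (4)-(6) for given gains M and K."""
    y_pred = A_prev @ y_prev                  # (4) time update
    d_hat = M @ (z - C @ y_pred)              # (5) unknown input estimate
    y_star = y_pred + G_prev @ d_hat          # (6) compensate the prediction
    y_filt = y_star + K @ (z - C @ y_star)    # (6) measurement update
    return y_filt, d_hat

# Hypothetical noise-free example: position-velocity target, scalar unknown input.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
C = np.eye(2)
M = np.linalg.pinv(C @ G)     # one (not necessarily optimal) gain with M C G = I
K = np.zeros((2, 2))          # any K works here because the innovation vanishes

y_true = np.array([0.0, 1.0])
d_true = np.array([3.0])
y_next = A @ y_true + G @ d_true
z = C @ y_next                # measurement without noise

y_filt, d_hat = uif_step(y_true, z, A, G, C, M, K)
```

With an exact previous estimate and no noise, the sketch recovers both the unknown input and the new state, which is the decoupling property the filter relies on.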

As shown in [34], $\hat{d}_{k-1}$ and $\hat{y}_{k|k}$ in (5)-(6) are unbiased estimates if and only if the initial state guess $\hat{y}_{0|0}$ is unbiased and the unknown input filter gain $M_k$ satisfies

$$M_k C_k G_{k-1} = I_{n_d}. \tag{8}$$

The optimal unknown input filter gain in the minimum variance sense is given by

$$M^*_k(\Sigma_{k-1}) = (F^T_k \tilde{R}^{-1}_k(\Sigma_{k-1}) F_k)^{-1} F^T_k \tilde{R}^{-1}_k(\Sigma_{k-1}), \tag{9}$$

where $F_k = C_k G_{k-1}$, $\tilde{R}_k(\Sigma_{k-1}) = C_k(A_{k-1}\Sigma_{k-1}A^T_{k-1} + Q_{k-1})C^T_k + R_k \succ 0$ and $\Sigma_{k-1}$ is the filtered state error covariance at time step $k-1$. Given $M^*_k$, one may transform the state estimation problem into a standard Kalman filtering problem and find a resulting optimal gain matrix $K^*_k$ [34]. The resulting optimal gain $K^*_k$ is in general non-unique [34]. For simplicity, in this paper we take the choice

$$K^*_k(\Sigma_{k-1}) = (A_{k-1}\Sigma_{k-1}A^T_{k-1} + Q_{k-1}) C^T_k \tilde{R}^{-1}_k. \tag{10}$$

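The optimal filter gains in (9)-(10) give the state and unknown input error covariance update maps respectively:

$$\begin{aligned}
\Sigma_k &= \rho(\Sigma_{k-1}, M^*_k, K^*_k) = \bar{A}_k\Sigma_{k-1}\bar{A}^T_k + \bar{F}_k Q_{k-1}\bar{F}^T_k + \bar{W}_k R_k \bar{W}^T_k, \\
\Sigma^d_{k-1} &= \rho^d(\Sigma_{k-1}) = (F^T_k \tilde{R}^{-1}_k(\Sigma_{k-1}) F_k)^{-1},
\end{aligned} \tag{11}$$

where

$$\begin{aligned}
\bar{A}_k &= (I - K^*_k C_k)(I - G_{k-1}M^*_k C_k)A_{k-1}, \\
\bar{F}_k &= -(I - K^*_k C_k)(I - G_{k-1}M^*_k C_k), \\
\bar{W}_k &= G_{k-1}M^*_k - K^*_k C_k G_{k-1}M^*_k + K^*_k.
\end{aligned}$$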

Note that $\rho$, $\rho^d$ are indeed functions of $x_k$ and thus of the control input $u_{k-1}$. We therefore denote $\rho_{u_k}(\Sigma_k)$, $\rho^d_{u_k}(\Sigma_k)$ to refer to the update maps applied under the control $u_k \in \mathcal{U}$.
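To make the covariance maps concrete, the following Python sketch evaluates (9)-(11) for one step. It is an illustration under stated assumptions: the function name `covariance_update` is hypothetical, the matrices are a toy constant-velocity model, and the measurement noise `R` is fixed rather than a function of the sensor state.

```python
import numpy as np

def covariance_update(Sigma_prev, A_prev, G_prev, C, Q_prev, R):
    """Evaluate the gains (9)-(10) and the covariance maps (11) once."""
    P = A_prev @ Sigma_prev @ A_prev.T + Q_prev      # predicted state covariance
    R_inv = np.linalg.inv(C @ P @ C.T + R)           # inverse of R-tilde
    F = C @ G_prev
    Sigma_d = np.linalg.inv(F.T @ R_inv @ F)         # rho^d in (11)
    M = Sigma_d @ F.T @ R_inv                        # gain (9)
    K = P @ C.T @ R_inv                              # gain (10)
    I = np.eye(A_prev.shape[0])
    T = (I - K @ C) @ (I - G_prev @ M @ C)
    A_bar, F_bar = T @ A_prev, -T
    W_bar = G_prev @ M - K @ C @ G_prev @ M + K
    Sigma = (A_bar @ Sigma_prev @ A_bar.T
             + F_bar @ Q_prev @ F_bar.T
             + W_bar @ R @ W_bar.T)                  # rho in (11)
    return Sigma, Sigma_d, M, K

# Hypothetical toy data (values chosen only for illustration).
tau, q = 1.0, 0.1
I2, Z2 = np.eye(2), np.zeros((2, 2))
A = np.block([[I2, tau * I2], [Z2, I2]])
G = np.vstack([tau**2 / 2 * I2, tau * I2])
Q = q * np.block([[tau**3 / 3 * I2, tau**2 / 2 * I2],
                  [tau**2 / 2 * I2, tau * I2]])
C, R = np.eye(4), 0.5 * np.eye(4)

Sigma, Sigma_d, M, K = covariance_update(np.eye(4), A, G, C, Q, R)
```

In a planner, `C` and `R` would be recomputed from the candidate sensor state before each call, which is how the dependence on the control enters.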

B. Problem Formulation

Given an initial sensor state $x_0 \in \mathcal{X}$ and a prior distribution of the target state $y_0$, the problem of interest is to optimize the trajectory of the sensor over a planning horizon of length $N$ to best track the evolution of the target dynamics and the unknown input. Expanding upon the problem formulation in [9], [15], we consider the following optimal control problem

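$$\begin{aligned}
\min_{\sigma \in \mathcal{U}^N} \quad & \log\det(\Sigma_N) + \log\det(\Sigma^d_{N-1}) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \quad k = 0, \ldots, N-1, \\
& \Sigma_{k+1} = \rho_{u_k}(\Sigma_k), \quad k = 0, \ldots, N-1, \\
& \Sigma^d_k = \rho^d_{u_k}(\Sigma_k), \quad k = 1, \ldots, N-1,
\end{aligned} \tag{12}$$

where $\sigma = \{u_0, \cdots, u_{N-1}\} \in \mathcal{U}^N$ is a sequence of admissible controls, and $\rho_{u_k}(\Sigma_k)$, $\rho^d_{u_k}(\Sigma_k)$ are the state and unknown input error covariance update maps defined in (11), with the first measurement taken at sampling instant $k = 1$.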

Although the unknown inputs are not assumed to follow any specific probability distribution, one could give some statistical interpretation of the optimization problem (12) similar to the existing works for the case without unknown inputs [8], [9]. This can be done by following the concepts in [42] to firstly pose the unknown input as a Gaussian noise process with variance $D$ and derive the statistical interpretation of problem (12). Then, the lack of prior information regarding the unknown input can be expressed by taking $D$ to infinity. Due to limited space, we will not pursue this point further.

Finding the optimal solution to problem (12) amounts to exploring the large space of sensor states and error covariances allowed by $\mathcal{U}$ over a planning horizon $N$ and finding the optimal path via tree search. To obtain a compromise between complexity and optimality of search tree construction, we adopt the concepts of the RVI algorithm proposed in [9], [15]. Conceptually, if a set of nodes are sufficiently close in sensor configuration space (i.e. they $\delta$-cross) and one node’s covariance is not as informative as nearby nodes’ (i.e. is $\varepsilon$-algebraically redundant), it is discarded from the tree. This method reduces computational complexity and gives suboptimality bounds for the resulting solution. $\delta$-crossing and $\varepsilon$-algebraic redundancy are formalized below.

Definition 1: [9] Two sensor trajectories $\delta$-cross at time $k \in [1, N]$ if $d_{\mathcal{X}}(x^1_k, x^2_k) \le \delta$, for $\delta \ge 0$.

Definition 2: [15] Let $\varepsilon \ge 0$ and $\{\Sigma_i\}_{i=1}^K$ be a finite set with $\Sigma_i \succeq 0$ $\forall i$. Then a matrix $\Sigma \succeq 0$ is $\varepsilon$-algebraically redundant with respect to $\{\Sigma_i\}_{i=1}^K$ if there exists a set of nonnegative constants $\{\alpha_i\}_{i=1}^K$ such that

$$\sum_{i=1}^K \alpha_i = 1, \qquad \Sigma + \varepsilon I \succeq \sum_{i=1}^K \alpha_i \Sigma_i.$$

To prune nodes $(x_k, \Sigma_k, \Sigma^d_{k-1})$ according to their algebraic redundancy and approximately solve (12) one must consider how informative $\delta$-crossing nodes are for state evolution and unknown input evolution tracking separately. That is, Definition 2 must be checked for both $\Sigma_k$ and $\Sigma^d_{k-1}$. This may result in highly informative nodes for state evolution tracking being pruned due to mediocre contribution to unknown input tracking, making suboptimality bounds for the resulting solution to (12) difficult to analyse.

Algorithm 1: RVI With Unknown Inputs

 1  Initialise $S_k = \emptyset$ for $k \in [1, N]$, $S_0 = (x_0, \Sigma_0)$;
 2  forall $k \in [1, N]$ do
 3      forall $(x, \Sigma, \Sigma^d) \in S_{k-1}$ do
 4          forall $u \in \mathcal{U}$ do
 5              $S_k \leftarrow S_k \cup \{(f(x, u), \rho_u(\Sigma), \rho^d_u(\Sigma))\}$;
 6      $S_{\min} \leftarrow \{(x, \Sigma, \Sigma^d) \in S_k \mid \Sigma = \arg\min(\log\det(\Sigma_k))\}$;
 7      $S'_k \leftarrow S_{\min}$;
 8      forall $(x, \Sigma, \Sigma^d) \in S_k \setminus S_{\min}$ do
 9          $Q \leftarrow \{\Sigma' \mid (x', \Sigma', \Sigma^{d\prime}) \in S'_k,\ d_{\mathcal{X}}(x', x) \le \delta\}$;
10          if $Q = \emptyset$ or checkRedundant($\Sigma$, $Q$) is False then
11              $S'_k \leftarrow S'_k \cup (x, \Sigma, \Sigma^d)$;
12  return $\Sigma^{\varepsilon,\delta}_N = \arg\min(\log\det(\Sigma_N))$;

However, the close relationship between state and unknown input error covariance update maps seen in (11) allows one to prune according to state tracking performance only while still having concrete performance guarantees for unknown input tracking. The following sections of the paper therefore address the reduced problem

$$\begin{aligned}
\min_{\sigma \in \mathcal{U}^N} \quad & \log\det(\Sigma_N) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \quad k = 0, \ldots, N-1, \\
& \Sigma_{k+1} = \rho_{u_k}(\Sigma_k), \quad k = 0, \ldots, N-1,
\end{aligned} \tag{13}$$

and derive suboptimality bounds for both state and unknown input tracking that result from this simplified approach.

The RVI algorithm for tracking targets with unknown dynamics is detailed in Algorithm 1. Note that in Algorithm 1, we use the reduced cost function in (13). The expansion of nodes in Line 5 is performed using the filter introduced in (4)-(11) to accommodate any non-zero $d_k$.
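A compact Python sketch of the search in Algorithm 1 is given below. It is an illustration under several assumptions that are not taken from the paper: the sensor lives on a line, the noise model in `update` is a toy, nodes are expanded from the pruned set of the previous step and visited in order of increasing cost, and `dominated` is a conservative single-matrix stand-in for checkRedundant.

```python
import itertools
import numpy as np

def logdet(S):
    return np.linalg.slogdet(S)[1]

def dominated(Sigma, kept, eps):
    """Stand-in for checkRedundant: Sigma + eps*I dominates one kept covariance."""
    lhs = Sigma + eps * np.eye(Sigma.shape[0])
    return any(np.linalg.eigvalsh(lhs - S).min() >= -1e-9 for S in kept)

def rvi(x0, Sigma0, controls, f, update, N, eps, delta):
    """Forward search with pruning; a node is (x, Sigma, Sigma_d, controls so far)."""
    nodes = [(x0, Sigma0, None, [])]
    for _step in range(N):
        expanded = []
        for x, Sigma, _Sd, path in nodes:
            for u in controls:
                x_new = f(x, u)
                Sigma_new, Sd_new = update(x_new, Sigma)
                expanded.append((x_new, Sigma_new, Sd_new, path + [u]))
        expanded.sort(key=lambda n: logdet(n[1]))   # most informative first
        kept = [expanded[0]]                        # the minimiser is always kept
        for node in expanded[1:]:
            near = [k[1] for k in kept if abs(k[0] - node[0]) <= delta]
            if not near or not dominated(node[1], near, eps):
                kept.append(node)
        nodes = kept
    return min(nodes, key=lambda n: logdet(n[1]))

# Toy target and sensor (hypothetical values).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
C = np.eye(2)
Q = 0.1 * np.array([[1 / 3, 1 / 2], [1 / 2, 1.0]])

def update(x, Sigma):
    R = (1.0 + 0.5 * x**2) * np.eye(2)      # toy noise: grows away from x = 0
    P = A @ Sigma @ A.T + Q
    Rt_inv = np.linalg.inv(C @ P @ C.T + R)
    F = C @ G
    Sigma_d = np.linalg.inv(F.T @ Rt_inv @ F)
    M = Sigma_d @ F.T @ Rt_inv
    K = P @ C.T @ Rt_inv
    T = (np.eye(2) - K @ C) @ (np.eye(2) - G @ M @ C)
    W = G @ M - K @ C @ G @ M + K
    return T @ P @ T.T + W @ R @ W.T, Sigma_d

f = lambda x, u: x + u
controls, N = (-1.0, 0.0, 1.0), 4
x_best, Sigma_best, Sd_best, path = rvi(
    3.0, np.eye(2), controls, f, update, N, eps=0.1, delta=1.0)

def cost_of(seq):                            # exhaustive baseline for comparison
    x, Sigma = 3.0, np.eye(2)
    for u in seq:
        x = f(x, u)
        Sigma, _ = update(x, Sigma)
    return logdet(Sigma)

exhaustive = min(cost_of(seq) for seq in itertools.product(controls, repeat=N))
```

Comparing `logdet(Sigma_best)` with `exhaustive` gives a quick empirical view of the price paid for pruning on a small instance.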

III. SUBOPTIMALITY BOUNDS FOR STATE AND UNKNOWN DISTURBANCE EVOLUTION TRACKING

The expansion of RVI for tracking targets with unknown dynamics and the derivation of suboptimality bounds is our main focus. In our approach we solve (13) approximately with Algorithm 1 to find a control sequence $\sigma^{\varepsilon,\delta} \in \mathcal{U}^N$ without optimization for unknown input evolution tracking. The suboptimality of state evolution tracking incurred by pruning nodes can then be upper bounded via a worst case analysis as in [9]. We then leverage the relationship between state and unknown input estimation error covariance maps to derive corresponding bounds for unknown input evolution tracking. To begin, we require the following assumptions.

Assumption 1: [9] The sensor motion model is Lipschitz continuous in $x$ with Lipschitz constant $L_f \ge 0$ for every fixed $u \in \mathcal{U}$, i.e. $d_{\mathcal{X}}(f(x_1, u), f(x_2, u)) \le L_f d_{\mathcal{X}}(x_1, x_2)$.

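Assumption 2: [9] For any two nodes $(x^1_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$, $(x^2_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$, let $\Sigma^1_k$, $\Sigma^2_k$ be the updated state estimation error covariances after applying control $u \in \mathcal{U}$ to each node. Then

$$\Sigma^1_k \succeq \gamma\Sigma^2_k + (1 - \gamma)Q_{k-1}, \qquad \Sigma^2_k \succeq \gamma\Sigma^1_k + (1 - \gamma)Q_{k-1},$$

$\forall k \in [1, N]$, where $\gamma = (1 + L_m d_{\mathcal{X}}(x^1_k, x^2_k))^{-1} < 1$ for some $L_m > 0$. Note for some $\delta > 0$, if $d_{\mathcal{X}}(x^1_{k-1}, x^2_{k-1}) < \delta$ then $\gamma = (1 + L_m L_f \delta)^{-1} < 1$.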

A. Suboptimality bounds for target state evolution tracking

The following properties of the state estimation covariance update map in (11) are key for performance analysis.

Lemma 1: The state estimation covariance update map is:

1) Monotone: if $\Sigma_1 \preceq \Sigma_2$ then $\rho(\Sigma_1) \preceq \rho(\Sigma_2)$
2) Concave: $\forall \alpha \in [0, 1]$, $\rho(\alpha\Sigma_1 + (1 - \alpha)\Sigma_2) \succeq \alpha\rho(\Sigma_1) + (1 - \alpha)\rho(\Sigma_2)$

It is important in our worst case analysis to consider recursive update of the error covariance over a long horizon. We therefore introduce the $k$-horizon mapping [15] $\phi_k : \Sigma_0 \mapsto \Sigma_k$, which maps the state error covariance matrix at time 0 to time $k$ according to the first $k$ elements $u_0, \ldots, u_{k-1}$ of the control sequence $\sigma \in \mathcal{U}^N$:

$$\phi^\sigma_k(\Sigma_0) = \rho_{u_{k-1}}(\ldots\rho_{u_1}(\rho_{u_0}(\Sigma_0))) = \Sigma_k. \tag{14}$$
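The composition in (14) is a left fold of the one-step map over the control sequence. A minimal Python sketch follows; the helper name `k_horizon` and the scalar Riccati-like map `rho` are hypothetical and serve only to show the order of application.

```python
from functools import reduce

def k_horizon(rho, controls, Sigma0, k):
    """Apply rho_{u_0}, then rho_{u_1}, ..., then rho_{u_{k-1}} to Sigma0."""
    return reduce(lambda Sigma, u: rho(u, Sigma), controls[:k], Sigma0)

def rho(u, s, q=0.1):
    """Toy scalar covariance map; the noise level depends on the control."""
    r = 1.0 + u**2
    p = s + q
    return p * r / (p + r)

sigma = [0, 1, 2]
phi_0 = k_horizon(rho, sigma, 1.0, 0)   # empty composition returns Sigma0
phi_2 = k_horizon(rho, sigma, 1.0, 2)   # rho_{u_1}(rho_{u_0}(Sigma0))
```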

Monotonicity and concavity of the $k$-horizon mapping naturally follow from Lemma 1 and the definition in (14). As a direct result of concavity, the $k$-horizon mapping $\phi_k$ is bounded by its first order Taylor approximation, i.e.

$$\phi^\sigma_k(\Sigma + \varepsilon X) \preceq \phi^\sigma_k(\Sigma) + \varepsilon g^\sigma_k(\Sigma, X), \tag{15}$$

where

$$g^\sigma_k(\Sigma, X) = \left.\frac{d\phi^\sigma_k(\Sigma + \varepsilon X)}{d\varepsilon}\right|_{\varepsilon=0}$$

is the directional derivative of the $k$-horizon mapping $\phi_k$ at $\Sigma \succeq 0$ along an arbitrary direction $X \succeq 0$. The directional derivative $g^\sigma_k(\Sigma, X)$ can therefore be interpreted as the impact an early perturbative error will have on the error covariance at a later time $k$ provided no further perturbations occur. This interpretation becomes pertinent for studying the consequences of pruning nodes if one frames the $\varepsilon I$ term in Definition 2 as a perturbative error. This motivates the study of the directional derivative.

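Lemma 2: The directional derivative of the state estimation covariance update map at $\Sigma \succeq 0$ along the arbitrary direction $X \succeq 0$ is given by

$$\left.\frac{d\rho_u(\Sigma + \varepsilon X)}{d\varepsilon}\right|_{\varepsilon=0} = \bar{A}(\Sigma) X \bar{A}(\Sigma)^T,$$

where $\bar{A}(\Sigma)$ is defined as in (11). The directional derivative of the $k$-horizon mapping $\phi_k$ at $\Sigma \in \mathcal{A}$ along an arbitrary direction $X \in \mathcal{A}$ is given by

$$g^\sigma_k(\Sigma, X) = \prod_{t=0}^{k-1}(\bar{A}_{k-t})\, X \prod_{t=0}^{k-1}(\bar{A}_t)^T,$$

$\forall k = 1, \ldots, N$, with $g^\sigma_0(\Sigma, X) = X$.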

Lemma 3: Suppose $\exists \beta < \infty$ such that $\Sigma_k \preceq \beta I$ $\forall k \in [0, N]$, then we have

$$\mathrm{Tr}\{g^\sigma_k(\Sigma, X)\} \le \beta\eta^k\,\mathrm{Tr}\{\Sigma^{-1}X\},$$

where $\eta = \frac{\beta}{\beta + \lambda_Q} < 1$ and $\lambda_Q$ is the minimum eigenvalue of $\bar{F}_k Q_{k-1} \bar{F}^T_k$ $\forall k \in [0, N]$.

As in [9], [15], the above bound implies that provided the state error covariance is bounded for all time, the effect of a perturbation at an early time step decays exponentially as time evolves. The culmination of utilising the above results in a worst case performance analysis is an upper bound on the suboptimality of the state error covariance $\Sigma^{\varepsilon,\delta}_N$ found by Algorithm 1. Denoting $J(\cdot) := \log\det(\cdot)$,

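Theorem 1: Let $\beta_* < \infty$ be the peak state estimation error of the optimal trajectory, i.e. $\Sigma^*_k \preceq \beta_* I$ $\forall k \in [1, N]$. Then we have

$$0 \le J(\Sigma^{\varepsilon,\delta}_N) - J(\Sigma^*_N) \le (\zeta_N - 1)\big(J(\Sigma^*_N) - J(\lambda_Q I)\big) + \varepsilon\Big(\frac{n_y}{\lambda_Q} + \Delta_N\Big),$$

where $\zeta_k := \prod_{\tau=1}^{k-1}\big(1 + \sum_{s=1}^{\tau} L^s_f L_m \delta\big) \ge 1$, $\Delta_N := \frac{n_y}{\lambda^2_Q}\beta_* \sum_{\tau=1}^{N-1}\frac{\zeta_N}{\zeta_\tau}\eta_*^{N-\tau}$, $\eta_* = \frac{\beta_*}{\beta_* + \lambda_Q} < 1$.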

This bound is similar to the state estimation bound in [9], but derived for time varying $Q_k$ using the map in (11). As in [9], [15], the performance bound in Theorem 1 grows with $\delta$ and $\varepsilon$, the tunable parameters that dictate pruning. For $\varepsilon, \delta = 0$ we recover the optimal solution.
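The dependence of the bound on $\varepsilon$ and $\delta$ can be explored numerically. The sketch below evaluates the right-hand side of Theorem 1 as read from the statement above; the function names and all parameter values are hypothetical, and the constants $L_f$, $L_m$, $\lambda_Q$ and $\beta_*$ would have to be estimated for a real problem.

```python
import math

def zeta(k, L_f, L_m, delta):
    """zeta_k = prod_{tau=1}^{k-1} (1 + sum_{s=1}^{tau} L_f^s * L_m * delta)."""
    prod = 1.0
    for tau in range(1, k):
        prod *= 1.0 + sum(L_f**s * L_m * delta for s in range(1, tau + 1))
    return prod

def theorem1_bound(J_opt, n_y, lam_Q, beta_star, N, L_f, L_m, eps, delta):
    """Upper bound on J(Sigma_N^{eps,delta}) - J(Sigma_N^*) from Theorem 1."""
    eta = beta_star / (beta_star + lam_Q)
    zeta_N = zeta(N, L_f, L_m, delta)
    Delta_N = (n_y / lam_Q**2) * beta_star * sum(
        zeta_N / zeta(tau, L_f, L_m, delta) * eta ** (N - tau)
        for tau in range(1, N))
    J_floor = n_y * math.log(lam_Q)          # J(lam_Q * I) = log det(lam_Q * I)
    return (zeta_N - 1.0) * (J_opt - J_floor) + eps * (n_y / lam_Q + Delta_N)

# Hypothetical constants, chosen only to show the trend.
common = dict(J_opt=0.0, n_y=4, lam_Q=0.05, beta_star=2.0, N=5, L_f=1.0, L_m=0.1)
b_exact = theorem1_bound(eps=0.0, delta=0.0, **common)
b_small = theorem1_bound(eps=0.1, delta=0.1, **common)
b_large = theorem1_bound(eps=0.2, delta=0.5, **common)
```

With both tuning parameters at zero the bound collapses to zero, in line with the remark that the optimal solution is recovered in that case.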

B. Suboptimality bounds for unknown input tracking

In this section, we show that despite considering only minimisation of the cost function for state estimation as written in (13), one can still derive concrete suboptimality bounds for the resulting unknown input evolution tracking.

Once again we introduce a “$k$-horizon” update map for unknown input estimation error, $\phi^d_k : \Sigma_0 \mapsto \Sigma^d_{k-1}$:

$$\phi^{d\sigma}_k(\Sigma_0) = \rho^d(\phi^\sigma_{k-1}(\Sigma_0)) = (F^T_k \tilde{R}^{-1}_k(\Sigma_{k-1}) F_k)^{-1}, \tag{16}$$

where $\phi^\sigma_{k-1}$ is as in (14). From this definition, we see that the control sequence $\sigma^* \in \mathcal{U}^N$ that solves the reduced problem (13) which considers state error covariance only should give $\Sigma^*_{N-1}$ that minimises (16). The performance of unknown input tracking should therefore be closely linked to that of the state evolution tracking. However, as $F_N(x_N) = C_N(x_N)G_{N-1}$ the sensor state $x^*_N$ found by solving (13) may not coincide with the sensor state $x^{d*}_N$ required to minimise (16) over both arguments $\Sigma_{N-1}$, $x_N$. This is an important observation that has direct impact on the performance of unknown input tracking under a control sequence tailored for state evolution tracking optimization. This impact will become apparent in Theorem 2.

Monotonicity and concavity of the unknown input error covariance update map are again crucial properties for describing the evolution of nodes.

Lemma 4: The unknown input error covariance update map $\rho^d(\cdot)$ is monotone and concave.
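Lemma 4 can be spot-checked numerically. The Python sketch below draws random positive semidefinite matrices and verifies both properties for the map $\rho^d$ in (11) on a toy constant-velocity model. The helper names, the random seed and the model values are hypothetical, and a passing check on one instance is only a sanity test, not a proof.

```python
import numpy as np

def rho_d(Sigma, A, G, C, Q, R):
    """Unknown input error covariance map from (11)."""
    R_tilde = C @ (A @ Sigma @ A.T + Q) @ C.T + R
    F = C @ G
    return np.linalg.inv(F.T @ np.linalg.inv(R_tilde) @ F)

def min_eig(M):
    return np.linalg.eigvalsh((M + M.T) / 2).min()

def random_psd(rng, n):
    B = rng.standard_normal((n, n))
    return B @ B.T

# Toy constant-velocity model with full-state measurements (illustrative values).
I2, Z2 = np.eye(2), np.zeros((2, 2))
A = np.block([[I2, I2], [Z2, I2]])
G = np.vstack([0.5 * I2, I2])
Q = 0.1 * np.block([[I2 / 3, I2 / 2], [I2 / 2, I2]])
C, R = np.eye(4), np.eye(4)
rd = lambda S: rho_d(S, A, G, C, Q, R)

rng = np.random.default_rng(0)
S1, S2 = random_psd(rng, 4), random_psd(rng, 4)
S_big = S1 + random_psd(rng, 4)          # S_big dominates S1
alpha = 0.3

# Monotone: S1 <= S_big implies rho_d(S1) <= rho_d(S_big).
mono_gap = min_eig(rd(S_big) - rd(S1))
# Concave: rho_d of a mixture dominates the mixture of rho_d values.
conc_gap = min_eig(rd(alpha * S1 + (1 - alpha) * S2)
                   - alpha * rd(S1) - (1 - alpha) * rd(S2))
```

Both gaps should be nonnegative up to rounding error.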

Lemma 4 extends to the $k$-horizon input estimation error update map in (16). Thus, $\phi^{d\sigma}_k$ is bounded from above by its first order Taylor approximation. We can again characterize the directional derivative $g^{d\sigma}_{k-1}(\Sigma, X)$.

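Lemma 5: The directional derivative of $\phi^{d\sigma}_k$ at $\Sigma \succeq 0$ in the direction $X \succeq 0$ is given by

$$g^{d\sigma}_{k-1}(\Sigma, X) = \left.\frac{d}{d\varepsilon}\phi^d_{k-1}(\Sigma + \varepsilon X)\right|_{\varepsilon=0} = M^*_k C_k A_{k-1}\, g^\sigma_{k-1}(\Sigma, X)\, A^T_{k-1} C^T_k M^{*T}_k,$$

where $g^\sigma_{k-1}(\Sigma, X)$ is the directional derivative of the state $k$-horizon update map.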
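For a single step, where $g^\sigma_0(\Sigma, X) = X$, the formula of Lemma 5 reduces to $M^*_k C_k A_{k-1} X A^T_{k-1} C^T_k M^{*T}_k$ and can be compared with a central finite difference of $\rho^d$. The sketch below does this on a toy model; the names, the step size `h` and the matrices are hypothetical choices made for illustration.

```python
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
A = np.block([[I2, I2], [Z2, I2]])
G = np.vstack([0.5 * I2, I2])
Q = 0.1 * np.block([[I2 / 3, I2 / 2], [I2 / 2, I2]])
C, R = np.eye(4), np.eye(4)
F = C @ G

def rho_d_and_gain(Sigma):
    """Return rho^d(Sigma) from (11) and the optimal gain M* from (9)."""
    R_inv = np.linalg.inv(C @ (A @ Sigma @ A.T + Q) @ C.T + R)
    Sigma_d = np.linalg.inv(F.T @ R_inv @ F)
    return Sigma_d, Sigma_d @ F.T @ R_inv

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
X = B @ B.T                       # perturbation direction, X >= 0
Sigma = 2.0 * np.eye(4)

# Closed form from Lemma 5 with g_0(Sigma, X) = X.
_, M = rho_d_and_gain(Sigma)
analytic = M @ C @ A @ X @ A.T @ C.T @ M.T

# Central finite difference of rho^d along X.
h = 1e-6
numeric = (rho_d_and_gain(Sigma + h * X)[0]
           - rho_d_and_gain(Sigma - h * X)[0]) / (2 * h)
```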

As in Lemma 3, we find that the effect of a perturbation in the state error covariance on the unknown input error covariance dampens with time provided $\Sigma^d_k$ and $\Sigma_k$ are bounded for all $k$.

Lemma 6: Suppose $\exists \beta^d < \infty$ such that $\Sigma^d_k \preceq \beta^d I$ $\forall k \in [1, N]$. Then

$$\mathrm{Tr}\{g^{d\sigma}_{k-1}(\Sigma, I)\} \le (n_d)^2(\beta^d)^2\lambda_G\,\mathrm{Tr}\{g^\sigma_{k-1}(\Sigma, I)\},$$

where $\lambda_G$ is the maximum eigenvalue of $G^T_{k-1}H_k A_{k-1}A^T_{k-1}H_k G_{k-1} \in \mathbb{R}^{n_d \times n_d}$.

The propagated error incurred on the unknown input error covariance by a perturbation in the state covariance is therefore a multiple of that found for the state. Hence, given the state result in Lemma 3, the unknown input analogue can also be found. We now provide an upper bound on the final unknown input error covariance found by Algorithm 1.

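Theorem 2: Let $\beta_* < \infty$, $\beta^d_* < \infty$ be the peak state and input estimation errors of the optimal trajectory respectively. That is, $\Sigma^*_k \preceq \beta_* I$ and $\Sigma^{d*}_{k-1} \preceq \beta^d_* I$ $\forall k \in [1, N]$. Then

$$0 \le J(\Sigma^{d,(\varepsilon,\delta)}_{N-1}) - J(\Sigma^{d*}_{N-1}) \le (\zeta_N - 1)\big(J(\Sigma^{d*}_{N-1}) + J(\gamma^d_* I) - J(\lambda^{-1}_H I)\big) + \varepsilon\,\Delta^d_N,$$

where $\Delta^d_N := (\gamma^d_*)^{-1}(n_d)^2\lambda_H\lambda_G(\beta^d_*)^2\lambda_Q\Delta_N$, $\lambda_H$ is the maximum eigenvalue of $G^T_{N-1}H_N G_{N-1}$, and $\gamma^d_* = (1 + L_m d_{\mathcal{X}}(x^*_N, x^{d*}_N))^{-1}$.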

We observe the same behaviours of the unknown input bounds with respect to $\delta$, $\varepsilon$, $N$ as the state bounds found in Theorem 1. Here $\Delta^d_N$ is a multiple of $\Delta_N$ from the state bounds, again highlighting the close relationship between the two bounds. Most notably, we see the previously mentioned impact of unknown input estimation under a control sequence optimized for state estimation; the bound grows with $(\gamma^d_*)^{-1}$. This result is expected: if the distance $d_{\mathcal{X}}(x^*_N, x^{d*}_N)$ between optimal sensor positions for state estimation and unknown input estimation is large, the performance of unknown input estimation resulting from optimizing only state estimation via Algorithm 1 worsens.

IV. ILLUSTRATIVE SIMULATIONS

In this section, we illustrate the theoretical results with a two-dimensional target tracking problem in which the target dynamics is subject to an unknown input signal. Suppose a sensor with state $x_k$ defined by its position-velocity vector is mounted on a robot with the dynamic model:

$$x_{k+1} = f(x_k, u_k) := \begin{bmatrix} x^1_k \\ x^2_k \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} u^1_k\cos(u^2_k)\tau \\ u^1_k\sin(u^2_k)\tau \\ u^1_k\cos(u^2_k) \\ u^1_k\sin(u^2_k) \end{bmatrix} \tag{17}$$


Fig. 1: Simulation results of the target tracking problem averaged over 150 MC simulations. The left panel shows an example trajectory (initial robot positions marked by triangles) and the cost of each policy’s calculated trajectory. The right panel shows the average RMSE of each policy’s target position and velocity estimates.

with control input $u_k \in \mathcal{U}$, where $\mathcal{U} = \{(u^1_k, u^2_k) \mid u^1_k \in \{0, 1, 2\},\ u^2_k \in \{0, \pm\pi/2, \pi\}\}$ and $\tau$ is a small time translation. The goal of the robot is to track and estimate the position and velocity of a constant-velocity vehicle driven by Gaussian noise and an unknown input $d_k$ in the form of abrupt accelerations:
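The sensor model (17) and the finite control set can be written down directly. In the Python sketch below, the names `CONTROLS` and `f` and the tuple representation of the state are illustrative choices; the speed and heading values and $\tau = 1$ are taken from the text.

```python
import itertools
import math

TAU = 1.0                                          # time step used in Section IV
SPEEDS = (0, 1, 2)                                 # u^1: speed
HEADINGS = (0.0, math.pi / 2, -math.pi / 2, math.pi)   # u^2: heading angle
CONTROLS = list(itertools.product(SPEEDS, HEADINGS))

def f(x, u, tau=TAU):
    """Sensor motion model (17): x = (pos_x, pos_y, vel_x, vel_y)."""
    speed, heading = u
    vx = speed * math.cos(heading)
    vy = speed * math.sin(heading)
    # Position integrates the commanded velocity; velocity is overwritten.
    return (x[0] + vx * tau, x[1] + vy * tau, vx, vy)

x1 = f((0.0, 0.0, 0.0, 0.0), (2, 0.0))             # move 2 units along +x
x2 = f((1.0, 1.0, 5.0, 5.0), (1, math.pi / 2))     # move 1 unit along +y
```

Note that the four zero-speed controls all leave the robot in place, so a planner may treat them as a single action.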

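$$y_{k+1} = \begin{bmatrix} I_2 & \tau I_2 \\ 0 & I_2 \end{bmatrix} y_k + \begin{bmatrix} \tau^2/2\, I_2 \\ \tau I_2 \end{bmatrix} d_k + w_k, \qquad w_k \sim \mathcal{N}\left(0,\ q\begin{bmatrix} \tau^3/3\, I_2 & \tau^2/2\, I_2 \\ \tau^2/2\, I_2 & \tau I_2 \end{bmatrix}\right) \tag{18}$$

where $y_k = [y^1_k, y^2_k, \dot{y}^1_k, \dot{y}^2_k]^T$ is the position-velocity vector of the target state at time $k$ and $q$ is a diffusion strength scalar. The tracking takes place over 51 time steps. At $k \in \{4, 9, 19, 24, 34, 39\}$, $d_k$ is a maneuver that takes the form of a sharp acceleration in some direction.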
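A short Python sketch of the target model (18) is given below. The maneuver times follow the text, while the maneuver magnitudes and directions, the initial state, the random seed and the helper names are hypothetical, since the paper does not list them.

```python
import numpy as np

def target_matrices(tau, q):
    """Return A, G, Q of the constant-velocity model (18)."""
    I2, Z2 = np.eye(2), np.zeros((2, 2))
    A = np.block([[I2, tau * I2], [Z2, I2]])
    G = np.vstack([tau**2 / 2 * I2, tau * I2])
    Q = q * np.block([[tau**3 / 3 * I2, tau**2 / 2 * I2],
                      [tau**2 / 2 * I2, tau * I2]])
    return A, G, Q

def simulate_target(y0, steps, tau, q, maneuvers, rng):
    """Propagate (18); `maneuvers` maps a time index k to the input d_k."""
    A, G, Q = target_matrices(tau, q)
    ys = [np.asarray(y0, dtype=float)]
    for k in range(steps):
        d = maneuvers.get(k, np.zeros(2))            # d_k = 0 between maneuvers
        w = rng.multivariate_normal(np.zeros(4), Q)  # process noise w_k
        ys.append(A @ ys[-1] + G @ d + w)
    return np.array(ys)

A, G, Q = target_matrices(tau=1.0, q=0.1)
# Assumed maneuver: acceleration of magnitude 2 along +x at each listed time.
maneuvers = {k: np.array([2.0, 0.0]) for k in (4, 9, 19, 24, 34, 39)}
traj = simulate_target([0.0, 0.0, 1.0, 0.0], steps=50, tau=1.0, q=0.1,
                       maneuvers=maneuvers, rng=np.random.default_rng(0))
```

Fifty transitions give the 51 time steps mentioned in the text, counting the initial state.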

The sensor takes noisy position measurements of the target and uses them to obtain the target’s velocity by differentiation. For simplicity, the sensor observation model in (3) is given by $C_k = I_4$ with the measurement noise increasing linearly with the distance between robot and target. Certain areas of the environment are “cloudy”, depicted as grey areas in Figure 1, and increase the robot’s measurement noise. Upon entering a cloud, the robot should slow down for safety under poor visibility. Beyond a maximum range of 20 metres the measurement noise is effectively infinite.
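One way to encode such a state-dependent noise model is sketched below in Python. Only the 20 metre range limit and $C_k = I_4$ come from the text. The base level, slope, cloud multiplier, rectangular cloud shape, the large constant standing in for "effectively infinite", and the choice to scale the variance (rather than the standard deviation) with distance are all assumptions made for illustration.

```python
import math
import numpy as np

MAX_RANGE = 20.0     # metres, from the text
LARGE = 1e6          # stand-in for "effectively infinite" noise (assumption)

def measurement_covariance(robot_xy, target_xy, clouds=(),
                           base=0.1, slope=0.05, cloud_factor=5.0):
    """Return R_k(x_k) for C_k = I_4; clouds are (xmin, ymin, xmax, ymax) boxes."""
    dist = math.dist(robot_xy, target_xy)
    if dist > MAX_RANGE:
        return LARGE * np.eye(4)
    var = base + slope * dist                      # grows linearly with distance
    in_cloud = any(xmin <= robot_xy[0] <= xmax and ymin <= robot_xy[1] <= ymax
                   for xmin, ymin, xmax, ymax in clouds)
    if in_cloud:
        var *= cloud_factor                        # degraded visibility
    return var * np.eye(4)

cloud = [(4.0, -1.0, 6.0, 1.0)]
R_near = measurement_covariance((0.0, 0.0), (0.0, 0.0))
R_far = measurement_covariance((0.0, 0.0), (10.0, 0.0))
R_cloud = measurement_covariance((5.0, 0.0), (5.0, 0.0), clouds=cloud)
R_out = measurement_covariance((0.0, 0.0), (25.0, 0.0))
```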

For 150 Monte-Carlo simulations, Algorithm 1 is used to track the target with $\tau = 1$, $N = 5$, $q = 0.1$, $\varepsilon = 0.1$, $\delta = 1$. The performance of our proposed algorithm is compared to a greedy approach in Figure 1. The RVI algorithm’s long planning horizon predicts the target will enter and remain in an area of high measurement noise in future time steps, and thus prioritises avoiding entering this area over remaining close to the target. On the contrary, the greedy algorithm prioritises minimising the cost function at each time step and therefore lacks the foresight to avoid these areas.

The trajectory costs of the two policies in Figure 1 elucidate the impact that RVI’s non-myopic planning has on the performance of the found solution. We see that RVI incurs much less cost than the greedy policy. Further, comparison of the average root mean square error (RMSE) of the policies’ state estimates shows that for all time steps our algorithm more successfully tracks target evolution in the presence of unknown inputs. These results are promising confirmation of our theoretical expansion of RVI to tracking targets subject to arbitrary, unknown disturbances.

V. CONCLUSION AND DISCUSSIONS

In this work, we studied the AIA problem for targets subject to arbitrary unknown disturbances. We have shown both the state and input error covariance update maps given in existing unknown input filtering works are concave and monotone. These properties were used to derive suboptimality guarantees for both state and unknown disturbance tracking by the proposed method. Notably, we have shown that one may consider tracking only the target state without loss of performance guarantees for unknown disturbance tracking due to the close relationship between unknown disturbance and target state estimation. The suboptimality bounds presented were notably linear in the tuning parameters which dictate strictness of node pruning, and thus the optimal solution is recovered when the tuning parameters are set to zero. Simulations demonstrated that the proposed algorithm performs well in tracking a target undertaking unknown maneuvers. Future work will focus on the more general case with unknown disturbances affecting both target motion and sensor observation models. A decentralized extension of the AIA considered here will also be pursued.

VI. APPENDIX A: PROOFS FOR MAIN RESULTS

A. Proofs for results in Section III-A

Proof of Lemmas 1, 2, 3: The proofs are algebraically involved and therefore skipped due to limited space.

B. Proofs for results in Section III-B

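To prove Lemma 4, we require some preparatory results:

Lemma 7: Let $\alpha \in [0, 1]$ be a constant. Then $\forall \Sigma \succeq 0$, we have $\alpha\rho^d(\Sigma) \preceq \rho^d(\alpha\Sigma)$.

Proof: For $\Sigma \succeq 0$, denote $\bar{R}_k(\Sigma) = C_k A_{k-1}\Sigma A^T_{k-1}C^T_k + \alpha^{-1}C_k Q_{k-1}C^T_k + \alpha^{-1}R_k$. We have

$$\bar{R}_k(\Sigma) - \tilde{R}_k(\Sigma) = (\alpha^{-1} - 1)(C_k Q_{k-1}C^T_k + R_k) \succeq 0,$$

when $\alpha \in (0, 1]$ and $R_k \succ 0$. Thus, $\bar{R}_k(\Sigma) \succeq \tilde{R}_k(\Sigma)$. Since $X \mapsto X^{-1}$ is order reversing on positive definite matrices, we have $\bar{R}^{-1}_k(\Sigma) \preceq \tilde{R}^{-1}_k(\Sigma)$. Additionally, note that $\tilde{R}_k(\alpha\Sigma) = \alpha\bar{R}_k(\Sigma)$, so that

$$\rho^d(\alpha\Sigma) = (F^T_k \tilde{R}^{-1}_k(\alpha\Sigma)F_k)^{-1} = \alpha(F^T_k \bar{R}^{-1}_k(\Sigma)F_k)^{-1} \succeq \alpha(F^T_k \tilde{R}^{-1}_k(\Sigma)F_k)^{-1} = \alpha\rho^d(\Sigma).$$

For $\alpha = 0$, we have $\alpha\rho^d(\Sigma) = 0 \preceq \rho^d(\alpha\Sigma)$.

Lemma 8: $f(\Sigma) = F^T_k \tilde{R}^{-1}_k(\Sigma)F_k$ is a convex function of $\Sigma \in \mathcal{A}$.

Proof: We note that $\tilde{R}_k(\cdot)$ is monotone. Then, by Corollary V.2.6 in [43], $\tilde{R}^{-1}_k(\cdot)$ is operator convex. For $\Sigma_1, \Sigma_2 \succeq 0$ and $\alpha \in [0, 1]$, let $\chi = \alpha\Sigma_1 + (1 - \alpha)\Sigma_2$. We can prove that

$$F^T_k\big(\alpha\tilde{R}^{-1}_k(\Sigma_1) + (1 - \alpha)\tilde{R}^{-1}_k(\Sigma_2) - \tilde{R}^{-1}_k(\chi)\big)F_k \succeq 0.$$

So $f(\Sigma)$ is also operator convex.

Proof of Lemma 4: We note that $\tilde{R}_k(\cdot)$ is monotone. Then for any $\Sigma_1, \Sigma_2 \succeq 0$ with $\Sigma_1 \preceq \Sigma_2$, we have $\tilde{R}_k(\Sigma_1) \preceq \tilde{R}_k(\Sigma_2)$. Then, as congruence by a fixed matrix is order preserving, and matrix inversion is order reversing, it immediately follows that $\rho^d(\Sigma_1) \preceq \rho^d(\Sigma_2)$. Hence, monotonicity is proved. We next prove concavity. For $\forall \alpha \in [0, 1]$ and $\forall \Sigma_1, \Sigma_2 \succeq 0$, let $\chi = \alpha\Sigma_1 + (1 - \alpha)\Sigma_2$. Then, from Lemma 8, and since $\alpha\tilde{R}^{-1}_k(\Sigma) \preceq \tilde{R}^{-1}_k(\alpha\Sigma)$ we have $F^T_k \tilde{R}^{-1}_k(\chi)F_k \preceq F^T_k \tilde{R}^{-1}_k(\alpha\Sigma_1)F_k + F^T_k \tilde{R}^{-1}_k((1 - \alpha)\Sigma_2)F_k$. Inverting this expression, utilising Lemma 7, and remembering that $X \mapsto X^{-1}$ is a convex operation [43] gives

$$\rho^d(\chi) - \alpha\rho^d(\Sigma_1) - (1 - \alpha)\rho^d(\Sigma_2) \succeq [F^T_k \tilde{R}^{-1}_k(\chi)F_k]^{-1} - [F^T_k \tilde{R}^{-1}_k(\alpha\Sigma_1)F_k + F^T_k \tilde{R}^{-1}_k((1 - \alpha)\Sigma_2)F_k]^{-1} \succeq 0,$$

thus proving concavity.

Proof of Lemma 5: Denoting $V = \left.\frac{d}{d\varepsilon}\tilde{R}^{-1}_k(\phi^\sigma_{k-1}(\Sigma + \varepsilon X))\right|_{\varepsilon=0}$ and $\tilde{R}^{-1}_k(\phi^\sigma_{k-1}(\Sigma)) = \tilde{R}^{-1}_k$, it is simple to show that

$$\left.\frac{d}{d\varepsilon}\phi^{d\sigma}_{k-1}(\Sigma + \varepsilon X)\right|_{\varepsilon=0} = -\phi^{d\sigma}_{k-1}(\Sigma)F^T_k V F_k\,\phi^{d\sigma}_{k-1}(\Sigma), \qquad V = -\tilde{R}^{-1}_k \left.\frac{d}{d\varepsilon}\tilde{R}_k(\phi^\sigma_{k-1}(\Sigma + \varepsilon X))\right|_{\varepsilon=0} \tilde{R}^{-1}_k.$$

Putting these together and simplifying gives the result.

Proof of Lemma 6: The proof follows straightforwardly by considering the cyclic property of the trace operator and submultiplicativity of the Frobenius norm $\|\cdot\|_F = \sqrt{\mathrm{Tr}\{(\cdot)^T(\cdot)\}}$.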

C. Proof of Theorem 1

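Proof of Theorem 1: The proof follows from applying $J(\cdot)$ to Appendix C Lemma 7 of [9] to calculate the bounds, noting that $\phi^{\sigma^*_\tau}_{k-1-\tau}(Q_\tau) \succeq \bar{F}_{\tau+k-1}Q_{\tau+k-2}\bar{F}^T_{\tau+k-1}$ $\forall k, \tau$ and $\bar{F}_k Q_{k-1}\bar{F}^T_k \succeq \lambda_Q I$ $\forall k$.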

D. Proof of Theorem 2

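Lemma 9: There exists a real constant $L_m \ge 0$ such that $\forall x_1, x_2 \in \mathcal{X}$:

$$H_k(x_1) \preceq (1 + L_m d_{\mathcal{X}}(x_1, x_2))H_k(x_2), \qquad H_k(x_2) \preceq (1 + L_m d_{\mathcal{X}}(x_1, x_2))H_k(x_1),$$

where $H_k(x) = C^T_k(x)R^{-1}_k(x)C_k(x)$.

Proof: Consider any two nodes $(x^1_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$, $(x^2_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$. Then, applying control $u \in \mathcal{U}$ to each node we have $\rho_{x^1_k}(\Sigma_{k-1}) \succeq \gamma\rho_{x^2_k}(\Sigma_{k-1})$ from Assumption 2. Hence,

$$\gamma^{-1}\rho^{-1}_{x^2_k}(\Sigma_{k-1}) \succeq \rho^{-1}_{x^1_k}(\Sigma_{k-1}) \succeq \rho^{-1}_{x^1_k}(\gamma^{-1}\Sigma_{k-1}), \tag{19}$$

where the last inequality follows from monotonicity of $\rho$ and $\gamma^{-1} > 1$. Denote

$$\bar{\Sigma}^{-1}_{k+1}(\Sigma_k) = A^{-T}_k\Sigma^{-1}_k A^{-1}_k - A^{-T}_k\Sigma^{-1}_k A^{-1}_k\big(A^{-T}_k\Sigma^{-1}_k A^{-1}_k + Q^{-1}_k\big)^{-1}A^{-T}_k\Sigma^{-1}_k A^{-1}_k,$$

then it is simple to show

$$\gamma\bar{\Sigma}^{-1}_{k+1} \preceq (\gamma^{-1}\bar{\Sigma}_{k+1})^{-1}. \tag{20}$$

Now, the information form covariance update map for the filter in (4)-(7) is $\rho^{-1}(\cdot)$ [35]:

$$\Sigma^{-1}_k = \bar{\Sigma}^{-1}_k + H_k - \bar{\Sigma}^{-1}_k G_{k-1}\big(G^T_{k-1}\bar{\Sigma}^{-1}_k G_{k-1}\big)^{-1}G^T_{k-1}\bar{\Sigma}^{-1}_k.$$

Using (19) and (20) and denoting $H_k(x^n_k) = H^n_k$ for $n = 1, 2$, it follows that

$$\gamma^{-1}\Big(\bar{\Sigma}^{-1}_k + H^2_k - \bar{\Sigma}^{-1}_k G_{k-1}\big(G^T_{k-1}\bar{\Sigma}^{-1}_k G_{k-1}\big)^{-1}G^T_{k-1}\bar{\Sigma}^{-1}_k\Big) \succeq \gamma^{-1}\bar{\Sigma}^{-1}_k + H^1_k - \gamma^{-1}\bar{\Sigma}^{-1}_k G_{k-1}\big(G^T_{k-1}\bar{\Sigma}^{-1}_k G_{k-1}\big)^{-1}G^T_{k-1}\bar{\Sigma}^{-1}_k.$$

Reducing gives the desired result $H^1_k \preceq (1 + L_m d_{\mathcal{X}}(x^1_k, x^2_k))H^2_k$. Following identical working for $H_k(x^2_k)$ completes the proof.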

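Lemma 10: Suppose $(x^1_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$, $(x^2_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$ are two nodes with $d(x_1, x_2) \le \delta$. Let $\Sigma^{d,1}_{k-1}$, $\Sigma^{d,2}_{k-1}$ be the input estimation error covariances after updating both nodes under the control $u \in \mathcal{U}$. Then

$$\Sigma^{d,1}_{k-1} \succeq \gamma\Sigma^{d,2}_{k-1}, \qquad \Sigma^{d,2}_{k-1} \succeq \gamma\Sigma^{d,1}_{k-1} \qquad \forall k \in [1, N].$$

Proof: Denote $R_k(x^i) = R^i_k$ for $i = 1, 2$, and consider the inverse of the update map applied to node $(x^1_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$:

$$(\rho^d_{x_1}(\Sigma_{k-1}))^{-1} = (\Sigma^{d,1}_{k-1})^{-1} = F^T_k \tilde{R}^{-1}_k(\Sigma_{k-1})F_k = F^T_k\big(C_k A_{k-1}\Sigma_{k-1}A^T_{k-1}C^T_k + C_k Q_{k-1}C^T_k + R^1_k\big)^{-1}F_k.$$

By applying the matrix inversion lemma and Lemma 9, we get $(\Sigma^{d,1}_{k-1})^{-1} \preceq \gamma^{-1}(\rho^d_{x_2}(\gamma^{-1}\Sigma_{k-1}))^{-1}$. Then, taking the inverse and applying monotonicity of the update map gives the result. Following the same reasoning for updating $(x^2_{k-1}, \Sigma_{k-1}, \Sigma^d_{k-2})$ completes the proof.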

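Proof of Theorem 2: We apply $\rho^d_{u^*_{N-1}}(\cdot)$ to Appendix C Lemma 7 of [9], where $u^*_{N-1}$ is the state-optimized control found by the algorithm, rather than an input-optimized control $u^{d*}_{N-1}$. Noting again that $\phi^{\sigma^*_\tau}_{k-1-\tau}(Q_\tau) \succeq \bar{F}_{\tau+k-1}Q_{\tau+k-2}\bar{F}^T_{\tau+k-1}$ $\forall k$, and that $\sum_{\tau=1}^{N-1}\Gamma_\tau(1 - \gamma_\tau) = 1 - \Gamma_{N-1}$, by concavity of $\rho^d$,

$$\rho^d_{u^*_{N-1}}(\Sigma^*_{N-1}) + \varepsilon\, g^{d\sigma^*_{N-1}}_1\Big(\Sigma^*_{N-1},\ \sum_{\tau=1}^{N-2}\Gamma_\tau g^{\sigma^*_\tau}_{N-1-\tau}(\Sigma^*_\tau, I) + \Gamma_{N-1}I\Big) \succeq \Gamma_{N-1}\sum_{i=1}^K \alpha_i\rho^d_{u^*_{N-1}}(\Sigma^i_{N-1}) + (1 - \Gamma_{N-1})\rho^d_{u^*_{N-1}}(\lambda_Q I).$$

Using Lemma 10 with $\gamma^d_* = (1 + L_m d_{\mathcal{X}}(x^*_N, x^{d*}_N))^{-1}$, and $\rho^d_{u^*_{N-1}}(\lambda_Q I) \succeq (G^T_{N-1}H_N G_{N-1})^{-1} \succeq \lambda^{-1}_H I$, we have

$$\gamma^d_*\Sigma^{d*}_{N-1} + M_N C_N A_{N-1}\sum_{\tau=1}^{N-1}\Gamma_\tau g^{\sigma^*_\tau}_{N-\tau}(\Sigma^*_\tau, I)A^T_{N-1}C^T_N M^T_N \succeq \Gamma_{N-1}\sum_{i=1}^K \alpha_i\gamma^d_*\Sigma^{d,i}_{N-1} + (1 - \Gamma_{N-1})\lambda^{-1}_H I.$$

The proof then follows using monotonicity and concavity of $J(\cdot)$, and Lemma 6 after applying $J(\cdot)$ to the above result.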

REFERENCES

[1] A. Singh, A. Krause, C. Guestrin, and W. J. Kaiser, Efficient informative sensing using multiple robots, Journal of Artificial Intelligence Research, Vol. 34, pp. 707–755, 2009.

[2] S. Eiffert, H. Kong, N. Pirmarzdashti, and S. Sukkarieh, Path planning in dynamic environments using Generative RNNs and Monte Carlo tree search, Proc. of IEEE International Conference on Robotics and Automation, pp. 10263–10269, Paris, France, 2020.

[3] D. Su, H. Kong, S. Sukkarieh, and S. Huang, Necessary and sufficient conditions for observability of SLAM-Based TDOA sensor array calibration and source localization, IEEE Transactions on Robotics, Accepted and to appear, 03/2021.

[4] S. Eiffert, N. Wallace, H. Kong, N. Pirmarzdashti, and S. Sukkarieh, A hierarchical framework for long-term and robust deployment of field ground robots in large-scale farming, Proc. of IEEE International Conference on Automation Science and Engineering, pp. 948–954, 2020.

[5] S. Eiffert, N. Wallace, H. Kong, N. Pirmarzdashti, and S. Sukkarieh, Experimental evaluation of a hierarchical operating framework for ground robots in agriculture, Springer Proceedings in Advanced Robotics Book Series, Results of 17th International Symposium on Experimental Robotics, Accepted and to appear, 2021.

[6] S. K. Gan, R. Fitch, and S. Sukkarieh, Online decentralized information gathering with spatial–temporal constraints, Autonomous Robots, Vol. 37, No. 1, pp. 1–25, 2014.

[7] A. Kassir, R. Fitch, and S. Sukkarieh, Communication-aware information gathering with dynamic information flow, The International Journal of Robotics Research, Vol. 34, No. 2, pp. 173–200, 2015.

[8] J. Le Ny and G. Pappas, On trajectory optimization for active sensing in Gaussian process models, Proc. of IEEE Conference on Decision and Control, pp. 6286–6292, 2009.

[9] N. Atanasov, J. Le Ny, K. Daniilidis, and G. J. Pappas, Information acquisition with sensing robots: Algorithms and error bounds, Proc. of IEEE International Conference on Robotics and Automation, pp. 6447–6454, 2014.

[10] G. Best, O. M. Cliff, T. Patten, R. R. Mettu, and R. Fitch, Dec-MCTS: Decentralized planning for multi-robot active perception, The International Journal of Robotics Research, Vol. 38, No. 2-3, pp. 316–337, 2019.

[11] G. A. Hollinger and G. S. Sukhatme, Sampling-based robotic information gathering algorithms, The International Journal of Robotics Research, Vol. 33, No. 9, pp. 1271–1287, 2014.

[12] B. Schlotfeldt, D. Thakur, N. Atanasov, V. Kumar, and G. Pappas, Anytime planning for decentralized multirobot active information gathering, IEEE Robotics and Automation Letters, Vol. 3, No. 2, pp. 1025–1032, 2018.

[13] Z. Zhang and P. Tokekar, Non-myopic target tracking strategies for non-linear systems, Proc. of IEEE Conference on Decision and Control, pp. 5591–5596, 2016.

[14] Y. Kantaros, B. Schlotfeldt, N. Atanasov, and G. Pappas, Asymptotically optimal planning for non-myopic multi-robot information gathering, Proc. of Robotics: Science and Systems, pp. 22–26, 2019.

[15] M. Vitus, W. Zhang, A. Abate, J. Hu, and C. Tomlin, On efficient sensor scheduling for linear dynamical systems, Automatica, Vol. 48, No. 10, pp. 2482–2493, 2012.

[16] D. Han, J. Wu, H. Zhang, and L. Shi, Optimal sensor scheduling for multiple linear dynamical systems, Automatica, Vol. 75, pp. 260–270, 2017.

[17] A. S. Leong, A. Ramaswamy, D. E. Quevedo, H. Karl, and L. Shi, Deep reinforcement learning for wireless sensor scheduling in cyber–physical systems, Automatica, Vol. 113, Article 108759, 2020.

[18] T. Iwaki, J. Wu, Y. Wu, H. Sandberg, and K. H. Johansson, Multi-hop sensor network scheduling for optimal remote estimation, Automatica, Vol. 127, Article 109498, 2021.

[19] L. Huang, J. Wu, Y. Mo, and L. Shi, Joint sensor and actuator placement for infinite-horizon LQG control, IEEE Transactions on Automatic Control, Early access, 2021.

[20] S. Z. Yong, M. Zhu, and E. Frazzoli, A unified filter for simultaneous input and state estimation of linear discrete-time stochastic systems, Automatica, Vol. 63, pp. 321–329, 2016.

[21] Y. Li, D. Shi, and T. Chen, Secure analysis of dynamic networks under pinning attacks against synchronization, Automatica, Vol. 111, Article 108576, 2020.

[22] M. Showkatbakhsh, Y. Shoukry, S. N. Diggavi, and P. Tabuada, Securing state reconstruction under sensor and actuator attacks: Theory and design, Automatica, Vol. 116, Article 108920, 2020.

[23] A. P. Dani, Z. Kan, N. R. Fischer, and W. E. Dixon, Structure estimation of a moving object using a moving camera: An unknown input observer approach, Proc. of IEEE Conference on Decision and Control and European Control Conference, pp. 5005–5011, 2011.

[24] H. Jeong, H. Hassani, M. Morari, D. D. Lee, and G. Pappas, Learning to track dynamic targets in partially known environments, arXiv preprint, arXiv:2006.10190, 2020.

[25] S. Engin and V. Isler, Active localization of multiple targets from noisy relative measurements, Algorithmic Foundations of Robotics XIV, 2020.

[26] H. Hur and H. S. Ahn, Unknown input H∞ observer-based localization of a mobile robot with sensor failure, IEEE/ASME Transactions on Mechatronics, Vol. 19, No. 6, pp. 1830–1838, 2014.

[27] A. Ansari and D. S. Bernstein, Input estimation for nonminimum-phase systems with application to acceleration estimation for a maneuvering vehicle, IEEE Transactions on Control Systems Technology, Vol. 27, No. 4, pp. 1596–1607, 2019.

[28] E. Hashemi, R. Zarringhalam, A. Khajepour, W. Melek, A. Kasaiezadeh, and S. K. Chen, Real-time estimation of the road bank and grade angles with unknown input observers, Vehicle System Dynamics, Vol. 55, No. 5, pp. 648–667, 2017.

[29] N. Wallace, H. Kong, A. Hill, and S. Sukkarieh, Receding horizon estimation and control with structured noise blocking for mobile robot slip compensation, Proc. of IEEE International Conference on Robotics and Automation, pp. 1169–1175, Montreal, Canada, 2019.

[30] N. Wallace, H. Kong, A. Hill, and S. Sukkarieh, Experimental validation of structured receding horizon estimation and control for mobile ground robot slip compensation, Springer Proceedings in Advanced Robotics Book Series, Vol. 16, Results of the 12th Conference on Field and Service Robotics (FSR), pp. 411–426, 2021.

[31] H. Guo, Z. Yin, D. Cao, H. Chen, and C. Lv, A review of estimation for vehicle tire-road interactions toward automated driving, IEEE Transactions on Systems, Man, and Cybernetics: Systems, Vol. 49, No. 1, pp. 14–30, 2019.

[32] L. D. Phong, J. Choi, and S. Kang, External force estimation using joint torque sensors for a robot manipulator, Proc. of IEEE International Conference on Robotics and Automation, pp. 4507–4512, 2012.

[33] Q. Li, O. Kroemer, Z. Su, F. F. Veiga, M. Kaboli, and H. J. Ritter, A review of tactile information: perception and action through touch, IEEE Transactions on Robotics, pp. 1–16, 2020.

[34] S. Gillijns and B. De Moor, Unbiased minimum-variance input and state estimation for linear discrete-time systems, Automatica, Vol. 43, No. 1, pp. 111–116, 2007.

[35] S. Gillijns, N. Haverbeke, and B. De Moor, Information, covariance and square-root filtering in the presence of unknown inputs, European Control Conference, pp. 2213–2217, 2007.

[36] H. Kong and S. Sukkarieh, An internal model approach to estimation of systems with arbitrary unknown inputs, Automatica, Vol. 108, 2019.

[37] H. Kong, M. Shan, D. Su, Y. Qiao, A. Al-Azzawi, and S. Sukkarieh, Filtering for systems subject to unknown inputs without a priori initial information, Automatica, Vol. 120, pp. 1–12, 2020.

[38] P. Guo, H. Kim, N. Virani, J. Xu, M. Zhu, and P. Liu, RoboADS: Anomaly detection against sensor and actuator misbehaviors in mobile robots, Proc. of Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp. 574–585, 2018.

[39] A. Mitra, J. A. Richards, S. Bagchi, and S. Sundaram, Resilient distributed state estimation with mobile agents: overcoming Byzantine adversaries, communication losses, and intermittent measurements, Autonomous Robots, Vol. 43, No. 3, pp. 743–768, 2019.

[40] L. Zhou, V. Tzoumas, G. J. Pappas, and P. Tokekar, Resilient active target tracking with multiple robots, IEEE Robotics and Automation Letters, Vol. 4, No. 1, pp. 129–136, 2018.

[41] M. Pirani, E. Hashemi, A. Khajepour, B. Fidan, B. Litkouhi, S. K. Chen, and S. Sundaram, Cooperative vehicle speed fault diagnosis and correction, IEEE Transactions on Intelligent Transportation Systems, Vol. 20, No. 2, pp. 783–789, 2019.

[42] R. R. Bitmead, M. Hovd, and M. A. Abooshahab, A Kalman-filtering derivation of simultaneous input and state estimation, Automatica, Vol. 108, Article 108478, 2019.

[43] R. Bhatia, Matrix analysis, Vol. 169, Springer Science & Business Media, 2013.
