
European Journal of Operational Research 73 (1994) 226-236, North-Holland

Optimum design of measurement channels and control policies for linear-quadratic stochastic systems *

Tamer Başar, Coordinated Science Laboratory, University of Illinois, 1308 Main Street, Urbana, IL 61801, USA

Rajesh Bansal, AT&T Bell Laboratories, 6200 E. Broad Street, Columbus, OH 43213, USA

Abstract: In the design of optimal controllers for linear-quadratic stochastic systems, a standard assumption is that the measurement channels are fixed and linear, and the measurement noise is Gaussian. In this paper we relax the first part of this restriction and raise the issue of the derivation of optimum measurement structures as a part of the overall design. Toward this end, we take the measurement process as one given by a Wiener integral, and modify the cost function so that it now places some soft constraints on the measurement strategy. Using some results from information theory, we show that the scalar version (for both finite and infinite horizons) of this joint design problem admits an optimum, dictating linear designs for both the controller and the measurement strategy. For the vector version, however, it is possible for a nonlinear design to improve over the best linear one. In both cases, best linear designs involve the solutions of nonlinear (deterministic) optimal control problems.

Keywords: Stochastic control; LQG design; Optimum signal design; Dynamic optimization; Decentralized systems

1. Introduction

In the linear-quadratic-Gaussian (LQG) controller design problem, which is a widely used model in engineering, economics and operations research, the objective is to choose an optimal controller u_t for a stochastic system described by

dx_t = A x_t dt + B u_t dt + F dv_t,   t ≥ t_0,   (1)

by minimizing the quadratic performance index

Ĵ = E{ ∫_{t_0}^{t_f} e^{-βt} [x_t^T Q x_t + u_t^T R u_t] dt + x_{t_f}^T Q_f x_{t_f} },   (2)

* This work was supported in part by the US Department of Energy under Grant DE-FG-02-88-ER-13939, and in part by the Joint Services Electronics Program under Grant N00014-90-J-1270.

Correspondence to: T. Başar, Coordinated Science Laboratory, University of Illinois, 1308 Main Street, Urbana, IL 61801, USA.



where Q ≥ 0, R > 0, Q_f ≥ 0 are known matrices, with the first two possibly depending on the time variable t ≥ t_0, and β ≥ 0 is a discount factor.

In (1), x_t is the state vector of dimension n, u_t is the control vector of dimension r, v_t is the additive stochastic disturbance term, which is taken as an n-dimensional standard Wiener process, and A, B, F are matrices of appropriate dimensions, which are allowed to depend on the time variable t, t_0 ≤ t ≤ t_f. Perhaps a more familiar form of (1) for the readership of this journal is the white-noise model:

ẋ_t = A x_t + B u_t + F ζ_t,   t ≥ t_0,   (3)

where ζ_t is a standard Gaussian vector white noise process with zero mean. Both (1) and (3) are driven by the initial state x_{t_0} = x_0, which is a Gaussian random vector with mean zero and covariance Σ_0, i.e., x_0 ~ N(0, Σ_0).

The control u_t does not have direct access to the state, but to a noisy version of it, y_t, t ≥ t_0, which is generated by

dy_t = H x_t dt + G dw_t,   y_{t_0} = 0,   (4)

where w_t is another standard Wiener process, independent of {v_t, t_0 ≤ t ≤ t_f} and of x_0, and of the same dimension as the measurement process y_t, t ≥ t_0, say m. In (4), H is an m × n matrix, possibly depending on t, and G is a nonsingular matrix, that is, G G^T > 0. Let us denote the causal dependence of u on y by

u_t = γ_t(y_{t_0}^t),   y_{t_0}^t := {y_τ, t_0 ≤ τ ≤ t},   (5)

where γ_t, t ≥ t_0, is a control policy.

The well-known LQG theory [1] says that there is a unique controller of the type (5) that minimizes (2) subject to (1), and this controller is linear and exhibits certainty equivalence. It is given by

u_t = γ_t*(y_{t_0}^t) = γ̂_t(x̂_t) = -R^{-1} B^T P(t) x̂_t,   t ≥ t_0,   (6)

where x̂_t is generated by the Kalman filter:

dx̂_t = A x̂_t dt + B u_t dt + K(t)[dy_t - H x̂_t dt],   x̂_{t_0} = 0,   (7)

K(t) := Σ(t) H^T(t) [G G^T]^{-1},   (8)

Σ̇ = A Σ + Σ A^T + F F^T - Σ H^T [G G^T]^{-1} H Σ,   Σ(t_0) = Σ_0,   (9)

and P is given as the unique solution of the dual (backward propagating) Riccati differential equation

Ṗ + A_β^T P + P A_β - P B R^{-1} B^T P + Q = 0,   P(t_f) = Q_f,   (10)

where

A_β := A - (1/2) β I.   (11)

For the infinite-horizon version (i.e., t_f = ∞) the two Riccati equations are replaced by their algebraic counterparts:

A Σ + Σ A^T + F F^T - Σ H^T [G G^T]^{-1} H Σ = 0,   (12)

A_β^T P + P A_β + Q - P B R^{-1} B^T P = 0.   (13)

These equations are assured of unique nonnegative definite solutions, under which the controller (6) leads to a stable feedback system and a stable filter, if

(A, B) and (A, F) are controllable;   (A, Q) and (A, H) are observable.   (14)
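As a side note for readers who wish to experiment with this baseline design, the algebraic Riccati equations (12)-(13) can be solved with standard numerical tools. The following Python sketch is not part of the original development; the system matrices are illustrative assumptions, and scipy's general-purpose solver is used for the control equation (13) and, as its dual, for the filter equation (12).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data (assumed for demonstration only)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
F = np.eye(2)
H = np.array([[1.0, 0.0]])
G = np.array([[0.5]])
Q, R = np.eye(2), np.array([[1.0]])
beta = 0.1
A_beta = A - 0.5 * beta * np.eye(2)                 # (11)

# Control ARE (13): A_beta^T P + P A_beta + Q - P B R^{-1} B^T P = 0
P = solve_continuous_are(A_beta, B, Q, R)

# Filter ARE (12), solved as the dual control ARE:
# A S + S A^T + F F^T - S H^T (G G^T)^{-1} H S = 0
Sigma = solve_continuous_are(A.T, H.T, F @ F.T, G @ G.T)

K = Sigma @ H.T @ np.linalg.inv(G @ G.T)            # Kalman gain, cf. (8)
L_gain = np.linalg.inv(R) @ B.T @ P                 # feedback gain in (6): u = -L_gain x_hat
print(P, Sigma, K, L_gain, sep="\n")
```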


Note that one of the essential modeling assumptions of the LQG theory is that the measurement scheme (4) is fixed as given. This may not always be a reasonable assumption, however, especially in large scale decentralized systems with more than one decision maker, where the control decisions are often taken on the basis of information generated by other members of the same system and garbled by noisy communication channels. As also indicated in [2], we may identify agents in large decentralized systems with one of two kinds of roles: (i) agents who perform the communication tasks of generating information bearing signals, and (ii) agents who perform the control functions of forming estimates, minimizing errors and reducing costs. This flexibility opens the possibility (and necessity) of simultaneously designing measurement and control strategies, and implementing them in a decentralized fashion.

This paper addresses such a stochastic decision problem, which can be viewed as an extension of the LQG model briefly described above, where a new design element is included in the measurement equation (4). Specifically, we replace (4) by

dy_t = h_t(x_t, y_{t_0}^t) dt + G dw_t,   y_{t_0} = 0,   (15)

where h_t, t ≥ t_0, is a general (possibly nonlinear) function, which allows the measurement at time t to depend on the current value of the state as well as the past values of the measurement. The driving measurement noise w_t, t ≥ t_0, is again as defined earlier, following (4).

Our interest lies in the derivation of an optimal measurement strategy (within the class described above by (15)) along with the corresponding optimal control, both chosen under the new performance index

J = Ĵ + E{ ∫_{t_0}^{t_f} e^{-βt} h_t^T(x_t, y_{t_0}^t) N h_t(x_t, y_{t_0}^t) dt },   N > 0,   (16)

which places some soft constraints on the measurement strategy. Hence, we seek a pair (γ_t*, h_t*), t ≥ t_0, from the classes of functions identified above, such that, for all permissible γ_t, h_t, t ≥ t_0,

J(γ_t*, h_t*; t ≥ t_0) ≤ J(γ_t, h_t; t ≥ t_0).   (17)

The discrete-time version of this problem was earlier discussed in [2], where it was shown that for the scalar model the best measurement strategy is to amplify the innovation at each stage to a certain power threshold level, with this threshold obtained from the solution of a discrete-time nonlinear optimal control problem using dynamic programming. For the infinite-horizon version, these threshold levels converge to a fixed constant, leading to the existence of optimal linear stationary policies. For higher order problems, however, [2] has established the possibility of nonlinear optimum designs, and also obtained the best linear designs, through the solutions of nonlinear optimal control problems.

The continuous-time problem studied in this paper requires somewhat different mathematical tools than those used in [2], but it will turn out that the results to be obtained are qualitatively similar to those of [2]. We again first study the scalar problem (in the next section), and obtain the optimum joint design, which has a linear structure. We then present, in Section 3, several numerical examples to illustrate the results of Section 2. Following this, in Section 4 we solve the problem with higher-order dynamics, when the function h t in (15) is restricted to be affine. Section 5 concludes the paper.

2. Solution to the scalar problem

We consider here the one-dimensional version of the general problem, rewritten as

dx_t = a x_t dt + b u_t dt + f dv_t,   x_{t_0} = x_0 ~ N(0, σ_0),   (18a)

dy_t = h_t(x_t, y_{t_0}^t) dt + g dw_t,   y_{t_0} = 0,   t ≥ t_0,   (18b)

u_t = γ_t(y_{t_0}^t),   (19)


where {v_t}, {w_t} are standard independent Wiener processes, which are also independent of x_0. The control and measurement functions γ_t and h_t are taken to be Borel measurable, and the cost function is

J = E{ ∫_{t_0}^{t_f} e^{-βt} [q x_t² + r u_t² + n h_t²(x_t, y_{t_0}^t)] dt + q_f x_{t_f}² },   (20)

where q ≥ 0, q_f ≥ 0, with at least one of them positive, r > 0, n > 0.

2.1. Derivation of the optimum solution

We seek a pair (h*, γ*) such that the cost J(γ, h) is minimized. Toward this end, we first invoke, for the measurement process, the linear structure

h_t(x_t, y_{t_0}^t) = H_t · (x_t - x̂_{t|t}),   x̂_{t|t} := E[x_t | y_{t_0}^t],   (21)

where H_t is a function of t, yet to be determined. For each H_t, x̂_{t|t} is given by the Kalman filter:

dx̂_{t|t} = (a x̂_{t|t} + b u_t) dt + K_t dy_t,   x̂_{t_0|t_0} = 0,   (22)

K_t = H_t σ_t / g²,   (23)

σ̇_t = 2a σ_t + f² - H_t² σ_t² / g²,   σ_{t_0} = σ_0.   (24)

Here, the separation principle applies, and the unique optimal control law is given by

u_t = γ_t(y_{t_0}^t) = -(1/r) b p_t x̂_{t|t},   (25)

ṗ_t = -(2a - β) p_t + (b²/r) p_t² - q,   p_{t_f} = q_f,   (26)

and the corresponding value of the cost is (as a function of H_t)

J = ∫_{t_0}^{t_f} e^{-βt} [σ_t (b²/r) p_t² + p_t f²] dt + p_{t_0} σ_0 + ∫_{t_0}^{t_f} e^{-βt} H_t² σ_t n dt

  = n g² ∫_{t_0}^{t_f} e^{-βt} [m_t σ_t + c_t' σ_t] dt + ∫_{t_0}^{t_f} e^{-βt} p_t f² dt + p_{t_0} σ_0,   (27)

with

c_t' := H_t² / g²   and   m_t := b² p_t² / (r n g²).
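As an aside, (26) is a scalar Riccati equation that can be integrated backward numerically when no closed form is available. The sketch below is ours (not from the paper); for the particular values a = β = q_f = 0, b = r = q = 1, which coincide with Example 1 of Section 3, the closed-form solution p(t) = tanh(t_f - t) serves as a check.

```python
import numpy as np

def riccati_p(a, b, r, q, qf, beta, t0, tf, steps=4000):
    """Integrate (26) backward from p(tf) = qf by explicit Euler."""
    t = np.linspace(t0, tf, steps)
    dt = t[1] - t[0]
    p = np.empty_like(t)
    p[-1] = qf
    for k in range(steps - 1, 0, -1):
        pdot = -(2 * a - beta) * p[k] + (b ** 2 / r) * p[k] ** 2 - q
        p[k - 1] = p[k] - pdot * dt          # one Euler step backward in time
    return t, p

t, p = riccati_p(a=0.0, b=1.0, r=1.0, q=1.0, qf=0.0, beta=0.0, t0=0.0, tf=10.0)
print(np.max(np.abs(p - np.tanh(10.0 - t))))  # small discretization error vs. tanh(tf - t)
```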

Now let c_t := c_t' σ_t. Then, to obtain the best measurement strategy in the class specified, we have to solve the nonlinear dynamic optimization problem

min_{c_t ≥ 0} L(c) = ∫_{t_0}^{t_f} e^{-βt} [m_t σ_t + c_t] dt   (28a)

s.t. σ̇_t = 2a σ_t + f² - c_t σ_t,   σ_{t_0} = σ_0.   (28b)

Note that the function L(c) is bounded from below (by zero), and as c → ∞, L(c) → ∞, implying that there exists a constant K > 0 (possibly large) such that min_{c_t ≥ 0} L(c) = min_{0 ≤ c_t ≤ K} L(c).
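To make the structure of (28) concrete, the sketch below (ours, with placeholder parameter values) evaluates L(c) for a given nonnegative schedule {c_t} by forward Euler integration of (28b); any standard optimizer could then be wrapped around this evaluation.

```python
import numpy as np

def cost_L(c, m, t0=0.0, tf=10.0, a=0.0, f=1.0, beta=0.0, sigma0=0.0, steps=2000):
    """Evaluate L(c) of (28a) subject to the variance dynamics (28b).
    c and m are callables returning c_t >= 0 and m_t."""
    t, dt = np.linspace(t0, tf, steps, retstep=True)
    sigma, L = sigma0, 0.0
    for tk in t[:-1]:
        ck = max(c(tk), 0.0)
        L += np.exp(-beta * tk) * (m(tk) * sigma + ck) * dt   # integrand of (28a)
        sigma += (2 * a * sigma + f ** 2 - ck * sigma) * dt   # dynamics (28b)
    return L

# Illustrative use, with m_t = tanh^2(10 - t) (the m_t of Example 1 below):
m = lambda t: np.tanh(10.0 - t) ** 2
print(cost_L(lambda t: 0.0, m), cost_L(lambda t: 2.0, m))
```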

This dynamic optimization problem can be viewed as a nonlinear optimal control problem, with c being the control variable (constrained to be nonnegative), and σ the state. As such, it can be solved using either dynamic programming or the minimum principle, with the former being applicable if there exists a continuously differentiable value function. We now first explore this possibility, and write down the associated Hamilton-Jacobi-Bellman (HJB) equation:

-∂V/∂t = min_{c_t ≥ 0} { c_t e^{-βt} + m_t σ_t e^{-βt} + (∂V/∂σ)(2a σ_t + f² - c_t σ_t) },   V(t_f, σ) ≡ 0.   (29)

Clearly a solution to the pointwise minimization exists if, and only if, e^{-βt} - (∂V/∂σ) σ_t ≥ 0, under which

-∂V/∂t = m_t σ_t e^{-βt} + (∂V/∂σ)(2a σ_t + f²),   V(t_f, σ) ≡ 0.   (30)

A candidate solution to this PDE is

V(t, σ) = ξ(t) σ + η(t),   (31a)

-ξ̇(t) = 2a ξ(t) + m_t e^{-βt},   ξ(t_f) = 0,   (31b)

-η̇(t) = f² ξ(t),   η(t_f) = 0,   (31c)

under which the earlier condition becomes:

ξ(t) σ(t) ≤ e^{-βt}.   (32)

Hence, under the structural assumption (31), a necessary and sufficient condition for the existence of a solution to the optimal control problem (28) is the existence of a {c_t ≥ 0}, t_0 ≤ t ≤ t_f, such that (32)-(35) are satisfied:

c_t (e^{-βt} - ξ(t) σ_t) = 0,   (33)

σ̇_t = 2a σ_t + f² - c_t σ_t,   σ_{t_0} = σ_0,   (34)

-ξ̇(t) = 2a ξ(t) + m_t e^{-βt},   ξ(t_f) = 0,   (35)

where

m_t := (b²/(r n g²)) p_t²,   ṗ_t = -(2a - β) p_t + (b²/r) p_t² - q,   p_{t_f} = q_f.   (36)

Hence, min_{c_t ≥ 0} L(c) = V(t_0, σ_0) = ξ(t_0) σ_0 + η(t_0), where

ξ(t_0) = e^{-2a t_0} ∫_{t_0}^{t_f} e^{(2a-β)τ} m_τ dτ,   (37)

η(t_0) = f² ∫_{t_0}^{t_f} ξ(τ) dτ = f² ∫_{t_0}^{t_f} e^{-2aτ} ∫_τ^{t_f} e^{(2a-β)s} m_s ds dτ

       = (f²/2a) [ e^{-2a t_0} ∫_{t_0}^{t_f} e^{(2a-β)τ} m_τ dτ - ∫_{t_0}^{t_f} e^{-βτ} m_τ dτ ].   (38)

This completes the discussion of the dynamic programming approach to the optimization problem (28); we now apply the minimum principle to the same problem. We first define the Hamiltonian associated with this problem as

H = e^{-βt}(m_t σ_t + c_t) + λ_t (2a σ_t + f² - c_t σ_t),   (39)


where λ_t, t ≥ t_0, is the co-state variable. Any optimal solution c_t, t ≥ t_0, should minimize H (pointwise), and lead to satisfaction of the two-point boundary value problem

-λ̇_t = m_t e^{-βt} + 2a λ_t - c_t λ_t,   λ_{t_f} = 0,   (40)

σ̇_t = 2a σ_t + f² - c_t σ_t,   σ_{t_0} = σ_0.   (41)

Note that since the Hamiltonian is linear in c_t, the minimizing solution in (39) is obtained from

c_t [e^{-βt} - σ_t λ_t] = 0,   e^{-βt} - σ_t λ_t ≥ 0.   (42)

Note that the solutions to (42), for any fixed t, are either

c_t = 0,   e^{-βt} - σ_t λ_t ≥ 0,   (43)

or a c_t that satisfies

e^{-βt} = σ_t λ_t.   (44)

In the former case, σ_t and λ_t can easily be solved from (41) and (40) (with c_t = 0), and the condition in (43) can be checked. Because of the boundary condition on λ, this condition will always be satisfied in a neighborhood of the terminal time, implying that there exists an interval (t_2, t_f] on which c_t = 0, which says that beyond a time point t_2 no measurement should be taken. If σ_0 = 0, then the same scenario repeats in a neighborhood of t = t_0, which implies the existence of an interval [t_0, t_1) on which again no measurement should be taken.

Now, if t_1 < t_2, then on the interval [t_1, t_2], or on a subinterval of it, there will be a singular control, obtained by differentiating (44). Carrying out the manipulations, we arrive at the expression

c_t = [β² + 2 f² m_t + σ_t m_t (4a - β) + σ_t ṁ_t - 2aβ] / (2 σ_t m_t - β),   (45)

with the corresponding values of σ and λ being

σ_t = [β + (β² + 4 f² m_t)^{1/2}] / (2 m_t),   λ_t = e^{-βt} / σ_t.   (46)

The condition for (45)-(46) to constitute a solution is that c_t be nonnegative, which determines the length of the interval over which it is valid. This can be checked numerically, as will be discussed in the next section in the context of some examples. But before doing this, we first verify optimality of the solution (25), along with (21), provided that (42) or (32)-(35) admit solutions.

2.2. Overall optimality

We now prove the overall optimality of the pair given by (21) and the solution of the optimal control problem (28). Toward this end, we first rewrite J given by (20) as

J = E{ ∫_{t_0}^{t_f} e^{-βt} r [u_t + (b/r) p_t x_t]² dt } + p_{t_0} σ_0 + ∫_{t_0}^{t_f} e^{-βt} p_t f² dt + E{ ∫_{t_0}^{t_f} e^{-βt} n h_t² dt },   (47)

where p_t ≥ 0, t ≥ t_0, is defined by (26). Fix E[h_t²] ≤ ℓ_t, t_0 ≤ t ≤ t_f, where ℓ_t is known. Then,

min_{γ,h} J = min_{{ℓ_t}} [ min_{γ,h: E[h_t²] ≤ ℓ_t} J ],


where for the inner minimization problem the equivalent cost function is

L = E{ ∫_{t_0}^{t_f} e^{-βt} (b²/r) p_t² [ (r/(b p_t)) u_t - x_t ]² dt },   (48)

which is to be minimized with respect to γ, h, subject to E[h_t²] ≤ ℓ_t and

dx_t = (a x_t + b u_t) dt + f dv_t,   dy_t = h_t(x_t, y_{t_0}^t) dt + g dw_t.   (49)

Let u_t' := (r/(b p_t)) u_t. Then we have

L = E{ ∫_{t_0}^{t_f} e^{-βt} (b²/r) p_t² [u_t' - x_t]² dt },   (50)

dx_t = (a x_t + (b²/r) p_t u_t') dt + f dv_t,   x_{t_0} = x_0,   (51)

dy_t = h_t(x_t, y_{t_0}^t) dt + g dw_t.   (52)

Now decompose x_t into two components:

x_t = x_t^1 + x_t^2,   (53)

dx_t^1 = a x_t^1 dt + f dv_t,   x_{t_0}^1 = x_0,   (54)

ẋ_t^2 = a x_t^2 + (b²/r) p_t u_t',   x_{t_0}^2 = 0.   (55)

so that

L = E{ ∫_{t_0}^{t_f} e^{-βt} (b²/r) p_t² [u_t'' - x_t^1]² dt },   (56)

where

u_t'' := u_t' - x_t^2,   (57)

and the minimization over u_t' is equivalent to minimization over u_t'', since x_t^2 depends only on the past values of u_τ', τ < t.

Hence the optimization problem is

min L(γ'', h)   over   u_t'' = γ_t''(y_{t_0}^t),   h_t,   E[h_t²] ≤ ℓ_t,

such that   dx_t^1 = a x_t^1 dt + f dv_t,   x_{t_0}^1 = x_0,

dy_t = h_t(x_t^1, x_t^2, y_{t_0}^t) dt + g dw_t,

where the differential equation for y_t can equivalently be written as

dy_t = h_t''(x_t^1, y_{t_0}^t) dt + g dw_t,

this being true because x_t^2 is σ(y_{t_0}^t)-measurable. This is a problem of the type that arises in the transmission of information over Gaussian channels:

The Gaussian message process x_t^1, t ≥ t_0, is to be transmitted over a continuous-time channel corrupted by additive noise, modeled by a Wiener process (w_t, t ≥ t_0), and it is desired to design an encoder (in our case h'') under a given power constraint (ℓ_t) such that the quadratic error at the receiver (in our case L(γ'', h'')) is minimized after optimum decoding (which, in our case, is γ''). This information transmission problem has been studied before in ([3], pp. 177-195), where it has been shown that the optimal solution for h_t'' (encoder) is linear in the innovations, in which case the optimum choice for the decoder γ_t'' is the Kalman filter. Now, given that h'' is in the form

h_t''(x_t^1, y_{t_0}^t) = (x_t^1 - E[x_t^1 | y_{t_0}^t]) H_t,

we can invert the transformation used in this subsection to arrive at the structural form (21) as the optimum one for the original problem.

We are now in a position to summarize the main result of this section in the following theorem:

Theorem 1. Let there exist a solution to the nonlinear optimal control problem (28), to be denoted by H_t*, t ≥ t_0, after the transformation introduced by (27). Then, the scalar joint control/measurement design problem admits an optimal solution, given by

h_t*(x_t, y_{t_0}^t) = (x_t - x̂*_{t|t}) H_t*,   γ_t*(y_{t_0}^t) = -(1/r) b p_t x̂*_{t|t},

where x̂*_{t|t} is generated by (22), with H_t = H_t*, and p_t is obtained from (26).

3. Numerical examples

We present in this section two numerical examples to illustrate the results presented in the previous section, and especially the switching nature of the optimal measurement scheme.

Example 1. Parametric values: t_f = 10; t_0 = 0; a = β = q_f = σ_0 = 0; b = f = g = q = r = n = 1. Working with (32)-(35), we have

ξ(t) = 11 - t - 2/(1 + e^{2(t-10)}),

H_t* = (ξ²(t) - tanh²(10 - t))^{1/2},   0.1125 < t < 8.085,
H_t* = 0,   else,

and the corresponding filter error variance is

σ(t) = t,   0 ≤ t ≤ 0.1125,
σ(t) = 1/ξ(t),   0.1125 ≤ t ≤ 8.085,

with σ(t) growing linearly (at unit rate, no measurements being taken) on the remaining interval 8.085 ≤ t ≤ 10.

Hence, the optimum policy is to make measurements only in the interval (0.1125, 8.085) (see Figure 1). The corresponding optimal controller, from (25), is

u_t = γ_t*(y_{t_0}^t) = -tanh(10 - t) x̂_{t|t},

dx̂_{t|t} = -tanh(10 - t) x̂_{t|t} dt + K_t dy_t,   x̂_{0|0} = 0,   K_t = H_t* σ_t.


Figure 1. A plot of the optimum measurement gain H_t* = (ξ²(t) - tanh²(10 - t))^{1/2} for t ∈ (0.1125, 8.085)
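The switching instants quoted in Example 1 can be checked numerically. The sketch below is ours (not from the paper): it locates t_1 as the time at which the constraint (32) first becomes active along σ(t) = t, and t_2 as the time at which the gain expression above vanishes, i.e., ξ²(t) = tanh²(10 - t).

```python
import numpy as np
from scipy.optimize import brentq

# Example 1 data: tf = 10, a = beta = 0, b = f = g = q = r = n = 1
xi = lambda t: 11.0 - t - 2.0 / (1.0 + np.exp(2.0 * (t - 10.0)))
m = lambda t: np.tanh(10.0 - t) ** 2                 # m_t = p_t^2

# sigma(t) = t before any measurement; entry time t1 solves xi(t) * t = 1
t1 = brentq(lambda t: xi(t) * t - 1.0, 1e-6, 1.0)
# exit time t2 solves xi(t)^2 = tanh^2(10 - t), i.e. H_t* = 0
t2 = brentq(lambda t: xi(t) ** 2 - m(t), 5.0, 9.9)
print(t1, t2)                                        # approximately 0.1125 and 8.085
```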



Figure 2. A plot of the optimum measurement gain H_t* for t ∈ (0.26158, 0.6667)

Example 2. Parameter values: a = β = q = t_0 = σ_0 = 0; b = r = n = g = q_f = t_f = 1.

The solutions to (26) and (31) are, respectively,

p(t) = 1/(2 - t)   and   ξ(t) = (1 - t)/(2 - t),   0 ≤ t ≤ 1.

Let us consider two different values for f:

(i) f = 1. Then, with H_t = 0, σ_t obtained from (34) is σ_t = t, 0 ≤ t ≤ 1, which leads to satisfaction of (32) as a strict inequality. Hence, the optimum policy here is not to use any measurement throughout the interval. (Here the measurement is too costly!)

(ii) f = 3, which corresponds to a more noisy system. Then we have two switches in the measurement policy, at time instants

t_1 = (10 - 2√7)/18 ≈ 0.26158   and   t_2 = 2/3 ≈ 0.6667.

Outside the interval (t_1, t_2) the optimum value for H_t is zero, and inside the interval it is (see Figure 2)

H_t* = (9(1 - t)² - 1)^{1/2} / (2 - t),   t_1 < t < t_2.

The corresponding filter error variance σ_t is

σ_t = 9t,   0 ≤ t ≤ t_1,
σ_t = (2 - t)/(1 - t),   t_1 ≤ t ≤ t_2,
σ_t = 9t - 2,   t_2 ≤ t ≤ 1.
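A similar numerical check of the switching instants of Example 2 (again our own sketch, re-solving the same two conditions, now with f = 3):

```python
import numpy as np
from scipy.optimize import brentq

# Example 2, case (ii): a = beta = q = 0, b = r = n = g = qf = tf = 1, f = 3
xi = lambda t: (1.0 - t) / (2.0 - t)
m = lambda t: 1.0 / (2.0 - t) ** 2                   # m_t = p_t^2, p(t) = 1/(2 - t)
f = 3.0

# sigma(t) = 9 t before any measurement; entry time: xi(t) * 9 t = 1
t1 = brentq(lambda t: xi(t) * f ** 2 * t - 1.0, 1e-6, 0.5)
# exit time: H_t* = 0, i.e. 9 (1 - t)^2 = 1, equivalently f^2 xi(t)^2 = m(t)
t2 = brentq(lambda t: f ** 2 * xi(t) ** 2 - m(t), 0.5, 0.99)

print(t1, (10 - 2 * np.sqrt(7)) / 18)                # both approximately 0.2616
print(t2)                                            # approximately 0.6667 = 2/3
```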

Our numerical experimentation with other examples has shown that the number of switches is not necessarily bounded above by two. One can in fact have an arbitrary number of switches in the measurement strategy, depending on the parameters of the problem at hand. Also, depending on the values of the system parameters, it is possible for an optimal solution to (28) not to exist within the class of piecewise continuous controls; in this case one has to look for a solution in an extended class that includes impulsive controls.

4. Solution to the vector version

We now return to the original problem formulated in Section 1, and require h in (15) to be a linear function of its arguments. First, using an argument similar to that of Theorem 6 of [2], it can be shown that within the general linear class there is no loss of generality in restricting the measurement strategies to be in a form that is the vector version of (21):

h_t(x_t, y_{t_0}^t) = H_t (x_t - x̂_{t|t}),   x̂_{t|t} = E[x_t | y_{t_0}^t],   (58)


where H_t, t ≥ t_0, is a matrix-valued function of dimensions m × n. For each such H_t, the conditional mean x̂_{t|t} is again given by the Kalman filter:

dx̂_{t|t} = (A x̂_{t|t} + B u_t) dt + K_t dy_t,   x̂_{t_0|t_0} = 0,   (59)

K_t = Σ_t H_t^T (G G^T)^{-1},   (60)

Σ̇_t = A Σ_t + Σ_t A^T + F F^T - Σ_t H_t^T (G G^T)^{-1} H_t Σ_t,   Σ_{t_0} = Σ_0.   (61)

For each {Ht}, the unique optimal control law is given by

u_t = γ_t*(y_{t_0}^t) = -R^{-1} B^T P(t) x̂_{t|t},   t ≥ t_0,   (62)

where P(t), t ≥ t_0, is the unique nonnegative definite solution of (10).

Now, to obtain the optimum measurement gain matrix, we have to substitute (62), along with (59), into the performance index (16), to arrive at a new cost function which will have to be optimized with respect to {H_t}:

min_{H_t} L(H):   L(H) = ∫_{t_0}^{t_f} e^{-βt} Tr[Σ_t (M_t + H_t^T N H_t)] dt + k,   (63)

where

M_t := P(t) B R^{-1} B^T P(t),   (64)

k := Tr[P(t_0) Σ_0] + ∫_{t_0}^{t_f} e^{-βt} Tr[P(t) F F^T] dt.   (65)
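As in the scalar case, (63) is a deterministic (here matrix-valued) optimal control problem. The sketch below is ours and only illustrates how L(H) can be evaluated for a candidate gain schedule by forward Euler integration of (61); the system matrices are assumptions, and for simplicity P(t) is frozen at its algebraic-Riccati value rather than integrated backward from (10).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def cost_LH(Hfun, P_of_t, A, B, F, G, R, N, Sigma0, t0, tf, beta=0.0, steps=4000):
    """Evaluate L(H) of (63), omitting the constant k of (65)."""
    t, dt = np.linspace(t0, tf, steps, retstep=True)
    GGinv, Rinv = np.linalg.inv(G @ G.T), np.linalg.inv(R)
    Sigma, L = Sigma0.copy(), 0.0
    for tk in t[:-1]:
        H, P = Hfun(tk), P_of_t(tk)
        M = P @ B @ Rinv @ B.T @ P                                # (64)
        L += np.exp(-beta * tk) * np.trace(Sigma @ (M + H.T @ N @ H)) * dt
        Sigma = Sigma + (A @ Sigma + Sigma @ A.T + F @ F.T
                         - Sigma @ H.T @ GGinv @ H @ Sigma) * dt  # (61)
    return L

# Illustrative data (assumed)
A = np.array([[0.0, 1.0], [-1.0, -0.5]]); B = np.array([[0.0], [1.0]])
F, G = np.eye(2), np.array([[0.5]])
Q, R, N = np.eye(2), np.array([[1.0]]), np.array([[0.2]])
P = solve_continuous_are(A, B, Q, R)
H0 = np.array([[1.0, 0.0]])
print(cost_LH(lambda t: 0.0 * H0, lambda t: P, A, B, F, G, R, N, np.eye(2), 0.0, 5.0),
      cost_LH(lambda t: H0,       lambda t: P, A, B, F, G, R, N, np.eye(2), 0.0, 5.0))
```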

Note that here the 'control' variable H_t is matrix valued, and the dynamic constraint is the matrix-valued state equation (61). Assuming the existence of a continuously differentiable value function for this problem, the associated HJB equation can be written as (ignoring the constant bias term k in (63))

-∂V/∂t = min_{H_t} { e^{-βt} Tr[Σ_t M_t + Σ_t H_t^T N H_t] + Tr[ (∂V/∂Σ)(A Σ_t + Σ_t A^T + F F^T - Σ_t H_t^T (G G^T)^{-1} H_t Σ_t) ] },   V(t_f, Σ) ≡ 0.   (66)

Invoking an affine structure for V:

V(t, Σ) = Tr[Ξ(t) Σ] + η(t),   (67)

and using this in (66), we arrive at

-Tr[Ξ̇(t) Σ_t] - η̇(t) = min_{H_t} { e^{-βt} Tr[Σ_t M_t + Σ_t H_t^T N H_t] + Tr[Ξ(t)(A Σ_t + Σ_t A^T + F F^T - Σ_t H_t^T (G G^T)^{-1} H_t Σ_t)] }

                       = e^{-βt} Tr[Σ_t M_t] + Tr[Ξ(t)(A Σ_t + Σ_t A^T + F F^T)],

which is satisfied if Ξ and η are chosen according to

-Ξ̇(t) = Ξ(t) A + A^T Ξ(t) + e^{-βt} M_t,   Ξ(t_f) = 0,   (68)

-η̇(t) = Tr[Ξ(t) F F^T],   η(t_f) = 0,   (69)


with the optimal H_t satisfying the equation

φ(H_t) := e^{-βt} Tr[Σ_t H_t^T N H_t] - Tr[Ξ(t) Σ_t H_t^T (G G^T)^{-1} H_t Σ_t] = 0.   (70)

The condition of existence, that replaces (32) in this case, is

min_{H_t} φ(H_t) = 0.   (71)

We can now summarize the main result of this section in the following theorem, which is the counterpart (but a weaker version) of Theorem 1 in the vector case:

Theorem 2. Let there exist a solution to the nonlinear optimal control problem (63), to be denoted by H_t*, t ≥ t_0. Then, with the measurement strategies restricted to the linear class, the general joint control/measurement design problem admits an optimal solution, given by

h_t*(x_t, y_{t_0}^t) = H_t*(x_t - x̂*_{t|t}),   γ_t*(y_{t_0}^t) = -R^{-1} B^T P(t) x̂*_{t|t},

where x̂*_{t|t} is generated by (59), with H_t = H_t*, and P is obtained from (10).

5. Concluding remarks

We have formulated in this paper a class of joint control/measurement design problems, which extends the framework of the standard LQG models. For the scalar version the globally optimal measurement strategy turns out to be a linear one, with the structure of a time-varying gain multiplying the innovations in the state measurements. This gain is determined from the solution of a nonlinear optimal control problem. For the vector version, the best linear measurement has the same structure, but the possibility of improving upon this design by a nonlinear one is not ruled out in this case. This qualitative difference between the scalar and vector versions is due to the fact that the result from information theory that was used in the proof of Theorem 1 does not extend to the vector case. In both cases, with linear measurements, the optimal controller is linear in the best (minimum-mean-square) estimate of the state.

Some challenges for the future might be a thorough investigation of the numerical aspects of the solutions to the optimal control problems (28) and (63), and the derivation of implementable nonlinear measurement schemes, as well as the corresponding control policies, for the problem of Section 4.

References

[1] Fleming, W.H., and Rishel, R.W., Deterministic and Stochastic Optimal Control, Springer-Verlag, Berlin, 1975.

[2] Bansal, R., and Başar, T., "Simultaneous design of measurement and control strategies for stochastic systems with feedback", Automatica 25/5 (1989) 679-694.

[3] Liptser, R.S., and Shiryayev, A.N., Statistics of Random Processes, II: Applications, Springer-Verlag, New York, 1978.