
Automatica, Vol. 17, No. 3, pp. 523-533, 1981. 0005-1098/81/030523-11 $02.00/0. Printed in Great Britain. Pergamon Press Ltd.

© 1981 International Federation of Automatic Control

A Study of MBH-type Realization Algorithms*

JAN STAAR†

A new view of realization algorithms of the Markov Block Hankel (MBH) type permits unification of known results, and leads to some attractive new MBH realization algorithms.

Key Words---Minimal realization; multivariable systems; computational methods; least squares approximation; system order reduction; modelling; linear systems.

Abstract--Realization algorithms, based on the reduction of a Markov-Block-Hankel (MBH) matrix, associated with the input/output description of a linear system, are considered as reduction algorithms on trivial realizations in block companion form. This approach allows for a very simple derivation of the basic solution, and for a geometrically transparent further development into various well-known algorithms (Ho and Kalman, 1965; Silverman, 1971; Rissanen, 1971), or algorithms introduced here. The relative quality of the (stable) orthonormal and (fast) pick-out procedures is compared, and within this class a new fast algorithm is described.

1. INTRODUCTION

THE STUDY of the interrelations between Input/Output Descriptions (IOD) and State Space Representations (SSR) of linear systems is one of the basic subjects of System Theory.

The Multivariable Realization Problem was introduced by the work of Gilbert (1963) and Kalman (1963a, 1963b). This subject then was further developed by many authors in a variety of situations. Among those the following three classes are important for the broader context of this paper:

(1) Realization procedures based on the decomposition of a so-called Markov Block Hankel (MBH) matrix. Algorithms of this type were introduced by Ho and Kalman (1965) and further investigated by many others. (See Silverman (1971) for an extensive overview.) They allow for entirely real-valued computation on directly measurable data. (For example: in discrete time systems, the Markov parameter sequence is nothing but the unit impulse response.) The main drawback is that the realization is performed through one, global, unpartitioned transformation of the given data. This requires the handling of significantly larger matrices compared to Partial Fraction Expansion procedures.

*Received April 2 1979; revised May 19 1980 and October 28 1980. The original version of this paper was not presented at any IFAC Meeting. This paper was recommended for publication in revised form by Associate Editor J. Ackermann.

†Aspirant N.F.W.O., ESAT, Dep. Elektrotechniek, K.U. Leuven, Kardinaal Mercierlaan 94, B-3030 Heverlee, Belgium.

(2) Procedures based on a Partial Fraction Expansion of the given transfer matrix.

Algorithms of this type (Panda and Chen (1969), Kuo (1970), Van Dooren and Dewilde (1971)) elegantly allow for a fast parallel solution of smaller subrealization problems, each associated with a distinct pole of the transfer matrix. The main drawbacks here are that Partial Fraction Expansions are unreliable in the case of multiple poles, and that complex-valued computations (yielding complex-valued solutions) are necessary in the case of complex poles.

(3) Procedures based on Data Set Representations of the given transfer matrix. This approach was introduced by Audley (1977) and is in fact a generalization of the Markov Sequence IOD.

These procedures are all based on expansions of the transfer matrix: indeed, the Markov Sequence is the Taylor expansion at infinity; a Partial Fraction Expansion is in fact a set of partial Laurent expansions at the poles; and Data Set Representations are partial Taylor expansions around nonsingular points of the complex plane.

Other possible descriptions that can be considered for realization are polynomial input-output models of the type P(z)y(z) = Q(z)u(z); input-partial state-output models of the type R(z)w(z) = u(z), y(z) = S(z)w(z); or even quite general input-output sequences of a given system. (See e.g. Ackermann, 1971.)

The main goal of this paper is to consider the procedures based on the decomposition of the MBH matrix. Using the properties of SSRs in Block Companion Form, the general set of minimal solutions of the Multivariable Realization Problem will be derived in a very short and simple way. Some famous results (Ho and Kalman's, Silverman's and Rissanen's algorithms) will elegantly follow as particular cases of this general solution, and some other attractive features will be examined. The new approach allows for nice geometric insight, and is helpful in numerical considerations. As an example, a reliable orthogonal algorithm based on SVD is compared with a fast Gauss elimination based pick-out algorithm.

In the author's view, the new approach is interesting mainly because it derives and interrelates many well-known realization algorithms in a short and very simple way, and hence opens an avenue to even better new solutions, as will be shown by some examples.

2. STATEMENT OF THE PROBLEM

Throughout this paper, we use the following definitions and notations for a State Space Representation (SSR) and its related input-output transfer description: a linear, time-invariant, finite-dimensional, strictly proper system can be represented by its system matrix R

$$R \triangleq \left[\begin{array}{c|c} A & B \\ \hline C & 0 \end{array}\right] \qquad (1)$$

characterizing the state space description

$$(\cdot)x = Ax + Bu \qquad (2a)$$

$$y = Cx \qquad (2b)$$

where* u, x, y are the input, state and output vectors, of dimension m, N and ℓ respectively. Accordingly, the dimensions of the constant matrices A, B, C are N×N, N×m and ℓ×N. Moreover, (·) is the differential or delay operator, depending on whether we consider the continuous time or the discrete time description.

The input-output behaviour of (2) is characterised by a strictly proper rational transfer matrix G(·):

$$G(\cdot) \triangleq C[(\cdot)I - A]^{-1}B \qquad (3a)$$

$$= \left(\sum_{i=1}^{r} (\cdot)^{r-i} R_i\right) \Big/ \left(\sum_{i=0}^{r-1} (\cdot)^i a_i + (\cdot)^r\right) \qquad \text{ARMA-type} \qquad (3b)$$

$$= \sum_{i=1}^{\infty} (\cdot)^{-i} H_i \qquad \text{MARKOV-type} \qquad (3c)$$

where r is called the order of the transfer matrix; the constant ℓ×m matrices R_i are the moving average (M.A.) coefficients, and the constant scalars a_i are the autoregressive (A.R.) coefficients. The sequence of constant ℓ×m matrices H_i is called the Markov sequence of the system, i.e.

$$H_i = CA^{i-1}B \qquad (4)$$

*For the sake of (notational) simplicity, x is used for x(t), X(s), x_k, X*(z); and accordingly (·)x for dx(t)/dt, sX(s), x_{k+1}, zX*(z); depending on the model used (continuous time or discrete time description) and on the considered domain (time domain, s or z domain). The formal treatment in the paper does not depend on this choice.

Multivariable realization theory is concerned with finding a State Space Representation for a system of which another equivalent description is known. In this paper we concentrate on the realization of Markov-type input-output descriptions. The problem then is stated as follows: find an algorithm that extracts a finite dimensional system matrix (1) out of a given Markov Sequence (3c), according to the relation (4), whenever this is possible.

The question of existence is easily clarified through the identification of equations (3b) and (3c). It is clear that the finite order r of the transfer matrix guarantees a constant and finite scalar Auto Regressive (A.R.) property of index r for the Markov Sequence (see e.g. Chen, 1970, p. 236):

$$H_{k+r} = -\sum_{i=0}^{r-1} a_i H_{k+i} \qquad \text{for } k = 1, 2, \ldots \qquad (5)$$

Conversely, as was pointed out by Ho (1965), this A.R. property on the Markov parameters is a sufficient condition for the Representation to be finite dimensional, as will also follow from the basic solution in the next section.
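To make (4) and (5) concrete, the following sketch (our own illustration, not from the paper; all names are ours) generates a Markov sequence from a random discrete time SSR and checks the A.R. property, with the a_i taken from the characteristic polynomial of A, as guaranteed by Cayley-Hamilton:

```python
# Illustrative check of (4) and (5); assumes numpy only.
import numpy as np

rng = np.random.default_rng(0)
r, m, l = 3, 2, 2                                   # state, input, output dimensions
A = rng.standard_normal((r, r))
B = rng.standard_normal((r, m))
C = rng.standard_normal((l, r))

# Markov sequence H_i = C A^(i-1) B of (4); H[i] holds H_{i+1}
H = [C @ np.linalg.matrix_power(A, i) @ B for i in range(2 * r)]

# characteristic polynomial of A: z^r + a_{r-1} z^{r-1} + ... + a_0
a = np.poly(A)[1:][::-1]                            # a[i] = a_i

# A.R. property (5): H_{k+r} = -sum_i a_i H_{k+i}
for k in range(r):
    assert np.allclose(H[k + r], -sum(a[i] * H[k + i] for i in range(r)))
```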

The problem of finding 'good' realizations is somewhat more delicate. It is well known that, if only input-output characteristics are given,† 'good' realizations must necessarily be minimal (see e.g. Ho, 1965). Minimality guarantees that the dynamic behaviour of the realized SSR accounts strictly and only for the dynamic behaviour contained in the given transfer characteristics. A minimal system is completely controllable and completely observable, and in this case the SSR is uniquely determined up to a similarity transformation (see e.g. Desoer, 1970). This remaining degree of freedom can be used to obtain realizations in less redundant, prestructured forms, but also to improve the numerical quality of the realization algorithm (fast algorithms, stable algorithms, ...). In the sequel we present a unified approach which allows one to generate many different realizations and to study their numerical properties.

3. BASIC SOLUTION

In this section we derive the basic MBH-type solution R_0 for the realization of a Markov sequence. It is shown that all minimal realizations are obtainable from the basic MBH procedure. In section 4 many well known algorithms will indeed be derived from this general solution.

†Of course, the condition of minimality may result in a loss of information if, beyond the IO description, certain aspects of the internal behavior are known.

An orthonormal variant will provide for geometrical insight, which will prove to be useful for numerical considerations in section 5. A direct and a pick-out variant of the Gauss elimination based solution will illustrate how much work can be saved by clever choices of the row reducing matrix. Moreover, the pick-out variant will show how the explicit need for the A.R. coefficients in 𝒜 may disappear in a surprisingly simple way, thus anticipating one of the methods described in section 4.

This section is mainly tutorial. Its practical importance is limited to problems where the A.R. coefficients are known together with the Markov Sequence. In this case it is shown that the solution can be found in a numerically stable way. A direct application is found in the realization of an ARMA-type transfer matrix through its Markov expansion. In the case where the A.R. coefficients are not given, many known and new realization algorithms will follow elegantly from this basic solution in section 4.

Theorem 1. Given an infinite Markov Parameter Sequence (3c), satisfying a given A.R. property (5) of order r, then any minimal SSR (1, 2) realizing this Markov Sequence can be obtained as R_0 with:

$$A_0 = (I_{nN}K)\,\mathcal{A}\,(K^{-1}I_{Nn}) \qquad (6a)$$

$$B_0 = (I_{nN}K)\,\mathcal{H}_1 \qquad (6b)$$

$$C_0 = I_{\ell N}\,(K^{-1}I_{Nn}) \qquad (6c)$$

where I_{μν} is a rectangular pseudo-unity matrix with μ rows and ν columns; 𝒜 and ℋ_k are defined with (3c) and (5) as:

$$\mathcal{A} = \begin{bmatrix}
0_{\ell\ell} & I_{\ell\ell} & 0_{\ell\ell} & \cdots & 0_{\ell\ell} \\
0_{\ell\ell} & 0_{\ell\ell} & I_{\ell\ell} & \cdots & 0_{\ell\ell} \\
\vdots & & & \ddots & \vdots \\
0_{\ell\ell} & 0_{\ell\ell} & 0_{\ell\ell} & \cdots & I_{\ell\ell} \\
-a_0 I_{\ell\ell} & -a_1 I_{\ell\ell} & -a_2 I_{\ell\ell} & \cdots & -a_{r-1} I_{\ell\ell}
\end{bmatrix}_{NN},
\qquad
\mathcal{H}_k = \begin{bmatrix} H_k \\ H_{k+1} \\ \vdots \\ H_{k+r-1} \end{bmatrix}_{Nm} \qquad (6d)$$

and K is an appropriate nonsingular N×N matrix such that:

$$K\,\mathcal{C}_R = \begin{bmatrix} \mathcal{C}_1 \\ \mathcal{C}_2 \end{bmatrix} = \begin{bmatrix} \mathcal{C}_1 \\ 0 \end{bmatrix} \quad \begin{matrix} n \text{ rows} \\ N-n \text{ rows} \end{matrix} \qquad (7)$$

with N = rℓ, M = rm, 𝒞_R ≜ [ℋ_1 | ℋ_2 | ⋯ | ℋ_r] the N×M Markov Block Hankel matrix, and n the rank of 𝒞_R. Moreover, for any K satisfying condition (7), solution (6) is a minimal realization of the given Markov sequence.

Remark. It is important to realize that the assumed A.R. property (5) is that mentioned in (3b), and thus is guaranteed for any system that can be described by an SSR of finite dimension, or equivalently, by a rational transfer matrix of finite order.
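As a concrete illustration of the ingredients of Theorem 1, here is a minimal sketch (our own notation and naming, not the paper's) that assembles 𝒜 and ℋ_1 of (6d); the commented check at the end mirrors the first step of the proof below:

```python
# Minimal sketch of the trivial realization (8); numpy only, names are ours.
import numpy as np

def trivial_realization(H, a, l):
    """H = [H_1,...,H_r] (each l x m), a = [a_0,...,a_{r-1}] as in (5)."""
    r = len(a)
    N = r * l
    A_s = np.zeros((N, N))
    A_s[:N - l, l:] = np.eye(N - l)            # upper shift blocks of (6d)
    for i in range(r):                          # last block row: -a_i * I_ll
        A_s[N - l:, i * l:(i + 1) * l] = -a[i] * np.eye(l)
    H_s = np.vstack(H)                          # script-H_1 of (6d)
    C_s = np.eye(l, N)                          # pseudo-unity I_lN
    return A_s, H_s, C_s

# check I_lN A_s^(i-1) H_s = H_i for a sequence satisfying (5), e.g. generated
# as in the previous sketch:
# A_s, H_s, C_s = trivial_realization(H[:r], a, l)
# for i in range(len(H)):
#     assert np.allclose(C_s @ np.linalg.matrix_power(A_s, i) @ H_s, H[i])
```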

Proof. First we prove that solution (6) indeed generates the given Markov sequence. For this, consider the trivial realization R in block companion form:

$$R = \left[\begin{array}{c|c} \mathcal{A} & \mathcal{H}_1 \\ \hline I_{\ell N} & 0 \end{array}\right] \qquad (8)$$

This realization generates the given Markov sequence; indeed, using (6d) and property (5):

$$I_{\ell N}\,\mathcal{A}^{i-1}\mathcal{H}_1 = I_{\ell N}\,\mathcal{H}_i = H_i$$

In order to extract a minimal SSR out of (8), it is sufficient (Wonham, 1974) to factorize the given state space with respect to its unobservable subspace, and then restrict the so-obtained SSR to its controllable subspace. It can readily be verified that (8) is completely observable [(8) is in observable block companion form, and the first r block rows of its observability matrix form an N×N unity matrix]. For the restriction of (8) to its controllable subspace a classical procedure [e.g. Desoer, 1970, p. 171] is used. Consider a coordinate transformation with the nonsingular N×N matrix K:

$$x' = Kx \qquad (9a)$$

$$R' = \left[\begin{array}{c|c} K\mathcal{A}K^{-1} & K\mathcal{H}_1 \\ \hline I_{\ell N}K^{-1} & 0 \end{array}\right] \qquad (9b)$$

Then the controllable subspace S'_c of the new representation is spanned by the columns of the new controllability matrix:

$$S'_c = \mathrm{Span}_{\mathrm{col}}\{K[\mathcal{H}_1 \mid \mathcal{A}\mathcal{H}_1 \mid \mathcal{A}^2\mathcal{H}_1 \mid \cdots \mid \mathcal{A}^{N-1}\mathcal{H}_1]\} = \mathrm{Span}_{\mathrm{col}}\{K[\mathcal{H}_1 \mid \mathcal{H}_2 \mid \cdots \mid \mathcal{H}_N]\} \qquad (10a)$$

$$= \mathrm{Span}_{\mathrm{col}}(K\,\mathcal{C}_R) \qquad (10b)$$


where (10b) is obtained from (10a) by dropping the block columns ℋ_{r+1}, …, ℋ_N, since they are linearly dependent on ℋ_1, …, ℋ_r by the A.R. property (5) (note that 𝒜^{i-1}ℋ_1 = ℋ_i). 𝒞_R is the Markov Block Hankel (MBH) matrix, encountered in many realization algorithms. Now if n is the rank of 𝒞_R and K satisfies condition (7), then from (10b) it is clear that the controllable subspace of the new representation is generated by the first n components of the new state vector (i.e. by the first n new base vectors), and it suffices to restrict this new representation to its first n vector components to obtain a minimal realization, which is nothing but equations (6a-6c). This proves that R_0 is a minimal realization of the given Markov sequence.

In order to prove that any minimal solution R'_0 can be set in the form (6), we prove that we can derive a K' for R'_0 from an SSR R_0 which is obtained with K. Indeed, all minimal realizations R'_0 are similar to R_0, hence there exists a nonsingular K̃ such that:

$$R'_0 = \left[\begin{array}{c|c} \tilde{K}A_0\tilde{K}^{-1} & \tilde{K}B_0 \\ \hline C_0\tilde{K}^{-1} & 0 \end{array}\right] \qquad (11a)$$

Choose then:

$$K' = \begin{bmatrix} \tilde{K} & 0 \\ 0 & I_{N-n} \end{bmatrix} K \qquad (11b)$$

Orthonormal matrix K and geometric interpretation

If K is an orthonormal matrix U', satisfying (7), then its first n rows form an orthonormal basis for the controllable subspace S_c = Span_col(𝒞_R) of the original state space S, associated with the trivial realization (8) in block (observable) companion form. Hence in the solution*

$$A_{01} = U'_{nN}\,\mathcal{A}\,U_{Nn} \qquad (12a)$$

$$B_{01} = U'_{nN}\,\mathcal{H}_1 \qquad (12b)$$

$$C_{01} = I_{\ell N}\,U_{Nn} \qquad (12c)$$

U'_{nN} orthonormally projects S onto S_c, and (12) is the accordingly reduced representation. Now the Markov Sequence (4) and its realizations (8), (6) and (12) can always be considered as the impulse response and the realization of a discrete time system (whether this be really so or not). Then the (block) columns of 𝒞_R are the first r samples of the state impulse response of the trivial SSR (8) (Chen, 1970). In this light, reconsider (7):

$$U'_{NN}\,\mathcal{C}_R = \begin{bmatrix} \mathcal{C}_1 \\ \mathcal{C}_2 \end{bmatrix} = \begin{bmatrix} \mathcal{C}_1 \\ 0 \end{bmatrix} \qquad (13)$$

Now row i in 𝒞_1 is the projection of the first r samples of the trivial state impulse response on basevector i of the (transformed) minimal subspace. The energy contained in this projected signal is equal to the squared Euclidean norm σ_i² of row i of 𝒞_1, and the entire energy of 𝒞_R (the sum of the squares of its entries) is equal to the sum of the energies contained in the n rows of 𝒞_1. The zero rows of 𝒞_2 are the orthonormal projections of the trivial state impulse response onto the N−n remaining (uncontrollable) basevectors of the trivial state space and contain no energy at all. More generally, the matrix I_{nN}K of Theorem 1 performs a mapping of the N-dimensional space S onto an n-dimensional space S_c such that both contain the entire dynamic action of the system. If K is orthonormal, then the dynamic action in S is orthogonally projected onto n orthonormal base vectors in S_c and an energy conservation law is valid. This geometrical interpretation is quite general, and useful for conceptual as well as numerical considerations.

*Now and in the sequel the subscripts on a matrix indicate the dimensions of the resulting matrix. So U'_{nN} = (U')_{nN} and U_{nN} are both n×N matrices.

Remark 1. U'_{nN} may be obtained in different ways. It may be the result of successive Givens rotations or Householder transformations, or it may be the transposed left factor of the Singular Value Decomposition of 𝒞_R (Eckart and Young (1936), Zeiger and McEwen (1974)). In this case, the singular values appear to be the above mentioned norms σ_i of the rows in K𝒞_R:

$$\mathcal{C}_R = U_{NN}\Sigma_{NM}V'_{MM} \;\rightarrow\; U'_{NN}\,\mathcal{C}_R = \Sigma_{NM}V'_{MM} \qquad (14)$$

and this interpretation of the singular spectrum will be used in section 5 to find a motivated criterion for the determination of the rank n, used in (12) and (13), for a noisy 𝒞_R.

Remark 2. It is important to see that realization (12) is numerically stable. Many authors, when realizing an ARMA type transfer matrix through its Markov expansion, drop the 'redundant' information contained in the A.R. coefficients, and perform the realization with the Markov parameters only. It will be shown in the next section that this is possible, but far less reliable. The conclusion is that whenever A.R. information is available together with MARKOV information (e.g. in realizations of ARMA type transfer matrices), this A.R. information should be used in the computations in a way comparable to solution (12).
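The following sketch illustrates the stable orthonormal solution (12) under the assumption of Remark 2, i.e. that both the A.R. coefficients and the Markov parameters are given; it reuses `trivial_realization` from the sketch above, the SVD supplies U as in Remark 1, and the function name and rank tolerance are our own choices:

```python
# Sketch of the stable orthonormal solution (12); numpy only.
import numpy as np

def realize_orthonormal(H, a, l, m):
    """H = [H_1,...,H_r] (each l x m), a = [a_0,...,a_{r-1}]."""
    r = len(a)
    H = list(H)
    for k in range(r - 1):                      # extend to H_1..H_{2r-1} via (5)
        H.append(-sum(a[i] * H[k + i] for i in range(r)))
    A_s, H_s, _ = trivial_realization(H[:r], a, l)
    C_R = np.hstack([np.vstack(H[k:k + r]) for k in range(r)])   # MBH matrix
    U, s, _ = np.linalg.svd(C_R)
    n = int(np.sum(s > 1e-10 * s[0]))           # numerical rank of C_R
    Un = U[:, :n]                               # K = U' satisfies (7)
    return Un.T @ A_s @ Un, Un.T @ H_s, Un[:l]  # (12a), (12b), (12c)
```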


Remark 3. Nothing has been said yet about the choice of the block dimension r of 𝒞_R, except that r should at least be the (suspected) order of the system. However, as will be shown in section 4, a much larger block dimension of 𝒞_R may be motivated by numerical considerations.

Construction of K via Gauss elimination

Though it is well known (Forsythe and Moler, 1967) that Gauss elimination is numerically not as reliable as an orthogonal transformation, it remains a very popular procedure in practical applications due to its algorithmic simplicity and faster execution. With some additional care taken (see Appendix I), one can obtain quite elegant solutions with a reliability that is sufficient for most practical applications.

Using the notation of Appendix I, the basic solution has as its regular G.E. variant:

$$A_{02} = [Y_F \mid 0]_{nN}\, P_R\, \mathcal{A}\, P_R^{-1} \begin{bmatrix} Y_F^{-1} \\ -X_F Y_F^{-1} \end{bmatrix}_{Nn} \qquad (15a)$$

$$B_{02} = [Y_F \mid 0]_{nN}\, P_R\, \mathcal{H}_1 \qquad (15b)$$

$$C_{02} = I_{\ell N}\, P_R^{-1} \begin{bmatrix} Y_F^{-1} \\ -X_F Y_F^{-1} \end{bmatrix}_{Nn} \qquad (15c)$$

In Appendix I it is also described how, from the regular G.E. matrix F·P_R, it is possible to derive an equally valid reduction matrix F'·P_R by replacing Y_F by I_{nn}. This yields a much simpler solution (pick-out variant):

$$A_{03} = I_{nN}\, P_R\, \mathcal{A}\, P_R^{-1} \begin{bmatrix} I_{nn} \\ -X_F \end{bmatrix}_{Nn} \qquad (16a)$$

$$B_{03} = I_{nN}\, P_R\, \mathcal{H}_1 \qquad (16b)$$

$$C_{03} = I_{\ell N}\, P_R^{-1} \begin{bmatrix} I_{nn} \\ -X_F \end{bmatrix}_{Nn} \qquad (16c)$$

Remark 4. By comparing (15) and (16), it appears that an appropriate choice of the reduction matrix can avoid an important (and critical!) amount of computation. In (16) the inversion of YF is avoided, and many matrix products degenerate into trivial reorderings of rows and columns, easily accounted for by an appropriate pointer management.

Remark 5. Whenever both A.R. and MARKOV parameters are available the numerical aspects are better for those realization algorithms which effectively use the A.R. coefficients (see Remark 2) and thus we recommend in that case the use of solution (16) among the G.E. procedures.

Remark 6. From (16a), it is seen that if P_R contains no nonzero elements in the last ℓ columns of its first n rows, then the actual values of the A.R. coefficients are not used in the evaluation of (16). This condition is satisfied whenever the set of n independent rows in 𝒞_R can be chosen among its first ℓ(r−1) rows. This is often directly possible; in the other case, by the A.R. property (5) it is sufficient to choose the block dimension r of 𝒞_R one unit larger than strictly necessary, to be sure that the last ℓ rows of 𝒞_R are linearly dependent on the previous ones. This is the basic idea of the second method to avoid the A.R. coefficients, described in section 4.

4. ADAPTATION OF THE BASIC SOLUTION

This section adapts the basic solution (6) to the case where the A.R. coefficients are not given, and clarifies the relationships between many known and some new MBH-type realization algorithms. A first technique leads to a slightly generalized version of the Ho & Kalman Algorithm (1965), and the G.E. based pickout version of this solution yields Silverman's algorithm (1971). A second technique yields a new class of algorithms, among which the G.E. based pickout version is believed to be faster than any nonrecursive algorithm described before. The recursive variant in this class leads to Rissanen's algorithm (1971). An important observation here is that the price for bypassing the A.R. coefficients is numerical instability. The ill part of the solution can, however, be isolated and evaluated for each problem at hand.

First adaptation technique

Theorem 2. Given a Markov parameter sequence (3c) which satisfies an unknown A.R. property (5) of maximal order r, then for any minimal SSR realizing this Markov Sequence there is a pair of nonsingular matrices K_NN and L_MM such that this minimal realization can be obtained as:

$$A_1 = (I_{nN}K)(\mathcal{A}\,\mathcal{C}_R)(L\,I_{Mn})\,P^{-1} \qquad (17a)$$

$$B_1 = (I_{nN}K)\,\mathcal{H}_1 \qquad (17b)$$

$$C_1 = (I_{\ell N}\,\mathcal{C}_R)(L\,I_{Mn})\,P^{-1} \qquad (17c)$$

where 𝒜, ℋ_k, I_{μν}, 𝒞_R are defined as in Theorem 1, and where K and L are such that:

$$K\,\mathcal{C}_R\,L = \begin{bmatrix} P_{nn} & 0_{n,M-n} \\ 0_{N-n,n} & 0_{N-n,M-n} \end{bmatrix} \equiv I_{Nn}\,P_{nn}\,I_{nM} \qquad (18)$$

with N = ℓr, M = mr, and n the rank of 𝒞_R.


Proof. Since K and L are nonsingular, 𝒞_R and K·𝒞_R·L are of the same rank n, and hence P will be nonsingular. Then from (18) it follows that:

$$I_{Nn} = K\,\mathcal{C}_R\,L\,I_{Mn}\,P^{-1} \qquad (19)$$

Substituting (19) in the basic solution (6) yields (17).

Remark 7. The need for the A.R. coefficients in 𝒜 has indeed disappeared, because in (17a) matrix 𝒜 operates directly on 𝒞_R, and by the A.R. property (5):

$$\mathcal{A}\,\mathcal{C}_R = [\mathcal{H}_2 \mid \mathcal{H}_3 \mid \cdots \mid \mathcal{H}_{r+1}] \qquad (20)$$

in which only the Markov parameters need to be known.

Remark 8. In solution (17), no inverse of matrix K is needed. Instead, the column-reducing matrix L must be known. In fact this solution is nothing but a slightly generalized version of the well known Ho and Kalman solution (1965). Ho and Kalman require 𝒞_R to be reduced in such a way that P = I_{nn}. In this case, the inversion of P is performed implicitly and its effect is distributed over K and L. However, if one does not prerequire this condition, it is possible to isolate the ill part of the solution, as will be seen in what follows.

The row and column reducing matrices K and L in (18) can be obtained by successive orthogonal transformations such as Householder transformations on the columns (resp. rows) of 𝒞_R, or by successive Givens rotations. However, it appears from the literature (Eckart and Young (1936), Forsythe and Moler (1967), Staar and Wemans (1980b)) that Singular Value Decomposition algorithms provide the most reliable rank information. In this case equation (18) becomes:

$$\mathcal{C}_R = U_{NN}\Sigma_{NM}V'_{MM} \;\rightarrow\; U'_{NN}\,\mathcal{C}_R\,V_{MM} = I_{Nn}\Sigma_{nn}I_{nM} \qquad (21)$$

where Σ_nn is a diagonal matrix with the n nonzero singular values of 𝒞_R. Hence solution (17) reduces to:

$$A_{11} = U'_{nN}(\mathcal{A}\,\mathcal{C}_R)V_{Mn}\,\Sigma_{nn}^{-1} \qquad (22a)$$

$$B_{11} = U'_{nN}\,\mathcal{H}_1 = \Sigma_{nn}V'_{nm} \qquad (22b)$$

$$C_{11} = (I_{\ell N}\,\mathcal{C}_R)V_{Mn}\,\Sigma_{nn}^{-1} = I_{\ell N}U_{Nn} \qquad (22c)$$

From the basic solution R_0, we know that (22) essentially performs an orthogonal projection of the trivial representation (8) onto its controllable subspace. The tricky bypass (19), used to avoid the A.R. coefficients, splits up the orthogonal factor K^{-1} of the basic solution into a product 𝒞_R V_Mn Σ_nn^{-1}. The numerical troubleshooter here is Σ_nn^{-1}. Indeed, apart from this factor, the procedure is entirely orthonormal; but Σ_nn^{-1} contains the possibly bad condition of 𝒞_R:

$$\Sigma_{nn}^{-1} = \mathrm{Diag}\,[\sigma_1^{-1} \cdots \sigma_n^{-1}] \qquad (23)$$

and postmultiplication by Σ_nn^{-1} in (22) performs a weighted upscaling of the less reliable columns in V_Mn, inversely proportional to the worst conditioned (smallest) singular values. Note that in (22) the numerical instability is well localized in Σ_nn^{-1}, which is a nice conceptual advantage over the original form of Ho and Kalman's solution.

Applying this technique to the Gauss elimination procedure (15), and using the notations of Appendix I, we obtain:

$$A_{12} = [Y_F \mid 0]_{nN}\, P_R(\mathcal{A}\,\mathcal{C}_R)P_C\,I_{Mn}\,P_{nn}^{-1} \qquad (24a)$$

$$B_{12} = [Y_F \mid 0]_{nN}\, P_R\,\mathcal{H}_1 \qquad (24b)$$

$$C_{12} = (I_{\ell N}\,\mathcal{C}_R)P_C\,I_{Mn}\,P_{nn}^{-1} \qquad (24c)$$

where P_nn is the diagonal matrix of the n nonzero successive pivots occurring in the elimination procedure. Analogously operating on (16), we obtain the pick-out procedure:

$$A_{13} = (I_{nN}\,P_R\,(\mathcal{A}\,\mathcal{C}_R)\,P_C\,I_{Mn})\,T^{-1} \triangleq T_{\mathcal{A}}\,T^{-1} \qquad (25a)$$

$$B_{13} = I_{nN}\,P_R\,\mathcal{H}_1 \qquad (25b)$$

$$C_{13} = (I_{\ell N}\,\mathcal{C}_R)\,P_C\,I_{Mn}\,T^{-1} \qquad (25c)$$

where T is a nonsingular n×n submatrix of 𝒞_R, pivoted to the first n rows and columns, and picked out of 𝒞_R by (I_{nN}P_R) and (P_C I_{Mn}) respectively. T_𝒜 is picked out of (𝒜𝒞_R) in exactly the same way.

Remark 9. Algorithm (25) is the one described by Silverman (1971). It is particularly elegant because the evaluation of (25) is performed directly on the given data, but it makes no full use of the information contained in 𝒞_R, because it uses only the data of two picked-out (n×n) submatrices T and T_𝒜 in 𝒞_R. This is a typical disadvantage of G.E. based algorithms compared to orthonormal ones.


Remark 10. Notice for later comparison that both solutions (24) and (25) require a row and a column reduction of 𝒞_R. The computations of (24) and (25) both require an explicit or implicit inversion of an (n×n) submatrix. Of course, for this inversion, the reduction procedure itself can partially be used, as described by Silverman (1971).
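A sketch of the pick-out solution (25) follows. Here the pivoted submatrix selection is done with column-pivoted QR instead of complete-pivoting G.E. (a substitution of ours that yields the same kind of nonsingular T in a numerically convenient way); scipy is assumed, and all names are ours:

```python
# Sketch of Silverman's pick-out solution (25); H = [H_1,...,H_2r].
import numpy as np
from scipy.linalg import qr

def silverman_realize(H, r, l, m):
    C_R = np.vstack([np.hstack([H[i + j] for j in range(r)]) for i in range(r)])
    A_C = np.vstack([np.hstack([H[i + j + 1] for j in range(r)]) for i in range(r)])  # (20)
    _, R1, p1 = qr(C_R.T, pivoting=True)             # select n independent rows of C_R
    n = int(np.sum(np.abs(np.diag(R1)) > 1e-10 * np.abs(R1[0, 0])))
    rows = np.sort(p1[:n])
    _, _, p2 = qr(C_R[rows], pivoting=True)          # select n independent columns
    cols = np.sort(p2[:n])
    T  = C_R[np.ix_(rows, cols)]                     # nonsingular n x n submatrix of C_R
    TA = A_C[np.ix_(rows, cols)]                     # same pick-out of A-script C_R
    Ti = np.linalg.inv(T)
    return TA @ Ti, C_R[rows][:, :m], C_R[:l][:, cols] @ Ti   # (25a), (25b), (25c)
```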

Second adaptation technique

Theorem 3. Given a Markov parameter sequence (3c) which satisfies an unknown A.R. property (5) of maximal order r, then any minimal SSR realizing this Markov sequence can be obtained as:

$$A_2 = I_{nN}\,K\,[0_{N\ell} \mid I_{NN}]\begin{bmatrix} K^{-1} \\ -Y^{-1}XK^{-1} \end{bmatrix} I_{Nn} \qquad (26a)$$

$$B_2 = I_{nN}\,K\,\mathcal{H}_1 \qquad (26b)$$

$$C_2 = I_{\ell N}\,K^{-1}\,I_{Nn} \qquad (26c)$$

where K satisfies (7), and where X_{ℓN} and the nonsingular Y_{ℓℓ} satisfy:

$$X_{\ell N}\,\mathcal{C}_R + Y_{\ell\ell}\,[H_{r+1} \mid H_{r+2} \mid \cdots \mid H_{2r}]_{\ell M} = 0_{\ell M} \qquad (27)$$

Proof. By analogy with Theorem 1, consider a trivial realization R*, defined in the same way as (8) and (6d), but of block order r* = r+1. Its A.R. coefficients a*_i are yet unknown, except that a*_0 = 0, because the maximal order of the representation is r* − 1. Following the proof of Theorem 1, R* is completely observable, and the restriction of the representation to its controllable subspace requires a nonsingular matrix K* such that:

$$K^*\,\mathcal{C}_R^* = K^* \begin{bmatrix} \mathcal{C}_R \\ H_{r+1} \mid \cdots \mid H_{2r} \end{bmatrix} = \begin{bmatrix} \mathcal{C}_1 \\ 0 \end{bmatrix} \qquad (28)$$

Notice that the maximal order r of R* allows the use of a reduced controllability matrix 𝒞*_R with only r of the (r+1) block columns of the full controllability matrix. Now we try to obtain a row reducing matrix K* of the form:

$$K^* = \begin{bmatrix} K_{NN} & 0_{N\ell} \\ X_{\ell N} & Y_{\ell\ell} \end{bmatrix}, \qquad K^{*-1} = \begin{bmatrix} K_{NN}^{-1} & 0_{N\ell} \\ -Y_{\ell\ell}^{-1}X_{\ell N}K_{NN}^{-1} & Y_{\ell\ell}^{-1} \end{bmatrix} \qquad (29)$$

where from (28) it follows that K_NN is a row reducing matrix for 𝒞_R, and X and Y have to satisfy (27). Now again from the A.R. property it follows that the last block row of 𝒞*_R is linearly dependent on the block rows of 𝒞_R; hence (27) has a solution even if Y is chosen as the unity matrix I_{ℓℓ}. This proves at the same time the existence and the nonuniqueness of K*. The same arguments as used in Theorem 1 prove that (26) covers all minimal solutions. The special structure of K* used in the basic solution (6) leads to solution (26), in which the A.R. coefficients do not occur any more. In fact a generalized A.R. property is implicitly accounted for when solving equation (27), for which the original A.R. block relation Y_{ℓℓ} = I and X_{ℓN} = [a_0 I | a_1 I | ⋯ | a_{r-1} I] is one possible solution. In the special case of a monovariable realization problem with a minimal number 2r_0 of Markov parameters available, this is the only solution. In the general case, many other solutions of the underdetermined set of equations (27) exist and may yield more reliable solutions in (26). Though this aspect has not been rigorously examined yet, it is clear that completely orthonormal matrices K* cannot be found. Indeed, if K* were orthonormal, then by (29) X_{ℓN} would be a zero matrix, and hence (27) cannot yield a nonsingular matrix Y_{ℓℓ}. This is a parallel conclusion to that for the first method, saying that the price for avoiding the A.R. coefficients is numerical instability. In (26) and (27), however, the ill posed part is not so easily isolated. Notice that in comparison with the first adaptation method, in (26) only row reducing matrices K, X, Y are needed. Moreover, if K and Y are each orthogonal, then no further computations are necessary for their inverses. [One could e.g. choose Y = I.]

In the case of Gauss elimination this second technique generates an ultra fast algorithm in its pick-out version. With the notation of the Appendix, K* can be chosen in the following form:

$$K^* = \begin{bmatrix} I_{nn} & 0_{n,N-n} & 0_{n\ell} \\ X_F & I_{N-n,N-n} & 0_{N-n,\ell} \\ X_F^* & 0_{\ell,N-n} & I_{\ell\ell} \end{bmatrix} \begin{bmatrix} P_R & 0_{N\ell} \\ 0_{\ell N} & I_{\ell\ell} \end{bmatrix} \qquad (30a)$$

and

$$K^{*-1} = \begin{bmatrix} P'_R & 0_{N\ell} \\ 0_{\ell N} & I_{\ell\ell} \end{bmatrix} \begin{bmatrix} I_{nn} & 0_{n,N-n} & 0_{n\ell} \\ -X_F & I_{N-n,N-n} & 0_{N-n,\ell} \\ -X_F^* & 0_{\ell,N-n} & I_{\ell\ell} \end{bmatrix} \qquad (30b)$$


Indeed, with K* of this form, n among the N rows of 𝒞_R are picked out and brought into the first n positions of 𝒞*_R by P_R. X_F and X*_F then annihilate the remaining rows in 𝒞*_R. The following realization R_21 results:

$$A_{21} = I_{nN}P_R\,[0_{N\ell} \mid I_{NN}]\begin{bmatrix} P'_R & 0_{N\ell} \\ 0_{\ell N} & I_{\ell\ell} \end{bmatrix}\begin{bmatrix} I_{nn} \\ -X_F \\ -X_F^* \end{bmatrix} \qquad (31a)$$

$$B_{21} = I_{nN}P_R\,\mathcal{H}_1 \qquad (31b)$$

$$C_{21} = I_{\ell N}P'_R\begin{bmatrix} I_{nn} \\ -X_F \end{bmatrix} \qquad (31c)$$

This solution leads to an algorithm that is believed to be faster than any other before in the class of nonrecursive algorithms, and goes as follows.

Step 1. Perform a row reduction on 𝒞_R by G.E. Notice that no column reduction or diagonalization is necessary (compare this to Silverman's algorithm). This step yields X_F and the row indices of the chosen set of n independent rows in 𝒞_R.

Step 2. If among the n independent rows none belongs to the last ℓ rows of 𝒞_R, then the basic solution (16) can be used without further calculation. For each row j chosen among the ℓ last rows of 𝒞_R, the G.E. procedure is continued in order to obtain the linear dependence of row j+ℓ on the n independent rows. This yields the necessary rows of X*_F. Now solution (31) is used without further calculation. Notice that the algorithm presented here requires at most a row elimination performed on the rows of 𝒞*_R (compare this to Silverman's algorithm). The evaluation of (16) and (31) is then merely a reordering of the information contained in X_F and X*_F. A functional sketch of this procedure is given below.
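The sketch below mirrors Steps 1-2 functionally (our names throughout). The row selection is again done with pivoted QR, and the elimination bookkeeping (X_F, X*_F) is obtained by solving small linear systems; in exact arithmetic this yields realization (31), though the speed advantage of the paper's single G.E. pass with pointer management is not reproduced here:

```python
# Functional sketch of the fast pick-out realization (31); H = [H_1,...,H_2r].
import numpy as np
from scipy.linalg import qr

def fast_pickout_realize(H, r, l, m):
    N = r * l
    # extended MBH matrix: r+1 block rows, r block columns (block order r* = r+1)
    C_star = np.vstack([np.hstack([H[i + j] for j in range(r)]) for i in range(r + 1)])
    # Step 1: pick n independent rows among the first N rows
    _, R1, piv = qr(C_star[:N].T, pivoting=True)
    n = int(np.sum(np.abs(np.diag(R1)) > 1e-10 * np.abs(R1[0, 0])))
    rows = np.sort(piv[:n])
    C_bar = C_star[rows]                              # the picked-out independent rows
    # Step 2: dependence of the l-shifted picked rows and of the first l rows on C_bar
    A21 = np.linalg.lstsq(C_bar.T, C_star[rows + l].T, rcond=None)[0].T
    C21 = np.linalg.lstsq(C_bar.T, C_star[:l].T, rcond=None)[0].T
    B21 = C_bar[:, :m]                                # picked rows of script-H_1
    return A21, B21, C21
```

The key observation coded here is that, in the trivial block companion coordinates, the action of 𝒜* is a shift of every row index by ℓ, so A_21 collects exactly the dependence coefficients of the shifted picked rows on the picked rows, as in (31a).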

Remark 11. If the G.E. is performed recursively, exploiting the Hankel structure of 𝒞_R, then there exist solutions requiring O(n²) operations. In this case the above algorithm leads to algorithms as described by Massey-Berlekamp (1969) and Rissanen (1971). However, these algorithms are numerically unstable. De Jong (1978) has described numerically stable recursive algorithms using orthogonal Givens transformations.

If the G.E. is not performed recursively, then the successive pivots may be chosen anywhere in 𝒞_R, and based on numerical considerations only. The resulting algorithms are far more reliable, especially in the case of multiple poles and noisy data (Staar, Wemans and Vandewalle, 1980a). Among the nonrecursive algorithms, the one described above is believed to be the fastest, as it requires only n row-elimination steps and some reordering of the resulting matrix. A rigorous study of the numerical stability of this type of algorithm is a subject still under investigation.

5. SOME NUMERICAL CONSIDERATIONS

For an extensive treatment of the numerical aspects of recursive algorithms in the monovariable case, we refer to De Jong (1975). A generalization of this technique to the general multivariable case is nontrivial and beyond the scope of this paper. Here we are concerned with the determination of the rank n of 𝒞_R in the case of Markov Sequences of limited accuracy. We present a method for the case where there is no correlation between the original Markov Sequences and the disturbing inaccuracies. (Nonsystematic measurement noise often satisfies this condition.) The method uses the geometric interpretation of section 3, and optimizes the expected matching of the realized and the original state impulse response sequences, associated with the Markov Sequence through equation (6d).

For this, reconsider the S.V.D. of the MBH matrix for the (unknown) exact 𝒞_R = UΣV' and its (given) noisy* approximation 𝒞_Rv = 𝒞_R + Δ𝒞_R = U_v Σ_v V'_v. Then with 𝒞_v1 and 𝒞_v2 as noisy counterparts of 𝒞_1 and 𝒞_2 in equations (13)-(14), and Φ ≜ U'_v U (the orthogonal transformation relating the exact and the noisy reduction bases), one obtains:

$$\begin{bmatrix} \mathcal{C}_{v1} \\ \mathcal{C}_{v2} \end{bmatrix} = (U'_v U)(U'\mathcal{C}_R) + U'_v\,\Delta\mathcal{C}_R \qquad (32a)$$

$$= \begin{bmatrix} \Phi_{11}\mathcal{C}_1 \\ \Phi_{21}\mathcal{C}_1 \end{bmatrix} + \begin{bmatrix} U'_{v1}\,\Delta\mathcal{C}_R \\ U'_{v2}\,\Delta\mathcal{C}_R \end{bmatrix} \qquad (32b)$$

where the subscripts 1 and 2 denote the first n (resp. last N−n) rows of a matrix. Some further calculation leads to the useful expressions:

$$E(n) \triangleq \left\| \begin{bmatrix} \mathcal{C}_{v1} \\ 0 \end{bmatrix} - \Phi \begin{bmatrix} \mathcal{C}_1 \\ 0 \end{bmatrix} \right\|_E^2 \qquad (33a)$$

$$= \left\| \begin{bmatrix} U'_{v1}\,\Delta\mathcal{C}_R \\ -U'_{v2}\,\mathcal{C}_R \end{bmatrix} \right\|_E^2 \qquad (33b)$$

$$= \|U'_{v1}\,\Delta\mathcal{C}_R\|_E^2 + \|U'_{v2}\,\mathcal{C}_R\|_E^2 \qquad (33c)$$

$$= \|\mathcal{C}_R\|_E^2 + \|U'_{v1}\,\Delta\mathcal{C}_R\|_E^2 - \|U'_{v1}\,\mathcal{C}_R\|_E^2 \qquad (33d)$$

$$= \|\mathcal{C}_R\|_E^2 + \sum_{i=1}^{n}\left( \|u'_{vi}\,\Delta\mathcal{C}_R\|_E^2 - \|u'_{vi}\,\mathcal{C}_R\|_E^2 \right) \qquad (33e)$$

*The subscript v will denote results obtained from the noisy Markov Sequence.


where ‖·‖_E denotes the Frobenius norm, and u'_vi is the ith (transposed) column of U_v. We now first interpret these results, then give a matching theorem, and finally state a rank criterion.

Equation (32b) shows clearly that the distinction between 𝒞_v1 and 𝒞_v2 (and hence the determination of the rank n) has become vague. This is due to the additive noise U'_v2 Δ𝒞_R, but also to a repartition of the original state impulse response over the spaces S_c = Span_col(U_v1) and S̄_c = Span_col(U_v2). An important remark is that even for a choice of n that corresponds to the original representation, the realized state impulse response may be severely affected in its small energy directions. This observation motivates that a decent realization should include a possible order reduction if the noise level is significant.

E(n) in equations (33) is interpreted as the energy of the difference between the exact state impulse response and the noisy state impulse response when projected on its main n-dimensional subspace Span_col(U_v1). Notice the similarity transformation Φ, necessary to compare both responses in a same basis. (33b) clearly shows the part of the error, U'_v1 Δ𝒞_R, due to additive noise in S_c, and the part, −U'_v2 𝒞_R, due to projection of 𝒞_R in S̄_c. Finally, (33e) shows that for each additional dimension in the realization, the error will increase due to the additional additive noise ‖u'_vi Δ𝒞_R‖² and decrease due to the additional signal ‖u'_vi 𝒞_R‖² taken into account. Based on this interpretation, we state:

Theorem 4. The energy of the matching difference E(n) between the exact state impulse response [𝒞'_1 | 0]' and its reduced noisy approximation [𝒞'_v1 | 0]' is minimal when the rank n of 𝒞_Rv is chosen so that:

$$\|u'_{vi}\,\Delta\mathcal{C}_R\|_E^2 \le \|u'_{vi}\,\mathcal{C}_R\|_E^2, \qquad i = 1 \ldots n \qquad (34a)$$

$$\|u'_{vi}\,\Delta\mathcal{C}_R\|_E^2 > \|u'_{vi}\,\mathcal{C}_R\|_E^2, \qquad i = n+1 \ldots N \qquad (34b)$$

Rank criterion

A practical estimation of criterion (34) is given by:

$$\sigma_{vi}^2 > 2mr\sigma_\nu^2 \qquad \text{for } i = 1 \ldots n \qquad (35a)$$

$$\sigma_{vi}^2 \le 2mr\sigma_\nu^2 \qquad \text{for } i = n+1 \ldots N \qquad (35b)$$

where σ_vi is the ith noisy singular value, and σ_ν is the standard deviation of the inaccuracies on each entry of a Markov parameter. The proof of the theorem follows directly from (33e). The motivation of the rank criterion is based on the critical level σ²_{v,crit} at which noise and signal would equally contribute to σ²_vi:

$$\sigma_{v,\mathrm{crit}}^2 = \|u'_{vi}\,\mathcal{C}_{Rv}\|_E^2 \approx 2\,\|u'_{vi}\,\Delta\mathcal{C}_R\|_E^2 \approx 2mr\sigma_\nu^2 \qquad (36)$$

The first approximation assumes u'_vi 𝒞_R and u'_vi Δ𝒞_R uncorrelated. The second approximation uses the average ‖u'_vi Δ𝒞_R‖²_E ≈ ‖Δ𝒞_R‖²_E / N = mrσ²_ν.
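Criterion (35) amounts to a one-line test on the noisy singular values, as in this sketch (function name is ours): keep exactly those singular values of the noisy MBH matrix whose squares exceed the critical level (36).

```python
# Sketch of the rank criterion (35); sigma_nu = noise std per Markov entry.
import numpy as np

def mbh_rank(C_Rv, m, r, sigma_nu):
    s = np.linalg.svd(C_Rv, compute_uv=False)
    return int(np.sum(s**2 > 2.0 * m * r * sigma_nu**2))
```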

Remark 12. It is interesting to notice that in the discrete time case the above criterion provides optimal matching between the exact and realized trivial state impulse response, and hence the Markov Sequence itself is matched with the ponderation sequence:

$$1, 2, \ldots, r, r-1, \ldots, 1$$

R e m a r k 13. As (36) is based on a statistically expected error, it may be justified to make the criterion more severe by a safety coefficient, which can be chosen sharper for larger dimensions N of the MBH matrix. However, in this case the optimal matching property is lost.

No general rules can be derived to determine a numerically good block order r. r should be at least equal to the theoretical order r_0 (or a safe estimate of it), and is limited to ½k_max, where k_max is the number of available Markov parameters. From (36) it is clear that quickly vanishing modes are recovered only if r is taken small enough, whereas diverging sequences, or high order stable modes of the form k^n α^k, may require large values of r for a decent realization. If both types of responses are part of the impulse response, then the demands may become incompatible, and the choice of the 'good' r will depend on external criteria and require additional effort from the user. A very involved but often clarifying tool then is a plot of the singular spectrum σ_vi versus the block order r, showing where the modes break through the noise level for some r, and where they get drowned out again for some larger r. However, it is our experience that in many cases these problems are not prohibitive, and a good reduction is found when r is chosen in the interval [r_0, ½k_max] and somewhat beyond the maximum of:

$$\|\mathcal{C}_{Rv}(r)\|_E^2 - r^2 m\ell\,\sigma_\nu^2 \qquad (37)$$

(37) is an empirical result, and roughly evaluates the expected signal to noise ratio in 𝒞_Rv(r). It can be calculated from the entries of the Markov Sequence and the noise level, before the MBH decomposition is performed.
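The empirical rule (37) can indeed be scanned over candidate block orders before any decomposition, as in this sketch (names are ours); it only needs the Markov entries and the noise level:

```python
# Sketch: evaluate (37) for each candidate blockorder r in [r_min, r_max].
import numpy as np

def blockorder_scores(H, l, m, sigma_nu, r_min, r_max):
    scores = {}
    for r in range(r_min, r_max + 1):                 # needs H_1..H_{2r-1}
        C = np.vstack([np.hstack([H[i + j] for j in range(r)]) for i in range(r)])
        scores[r] = np.sum(C**2) - r**2 * m * l * sigma_nu**2   # (37)
    return scores
```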

6. CONCLUSION

In this paper, many new and old realization algorithms for linear time-invariant systems, using the Markov Block Hankel (MBH) approach, are presented in a unifying framework and their computational aspects are compared.

First, a very simple and geometrically meaningful derivation for the basic procedure is presented. It is shown that MBH procedures can be seen as reduction procedures on a trivial realization in block companion form.

Then, various algorithms are deduced from this basic solution, depending on the available data and on the reduction procedure used. The relative qualities of SVD-type algorithms (numerical stability) and pick-out G.E.-type algorithms (speed) are carefully weighed for the different options. Some well known algorithms are clearly situated within this framework, some extensions are examined, and an ultrafast solution requiring only one row reduction and some additional reordering is presented.

It is the author's belief that this unifying framework allows a good comparison of the quality of many algorithms, and forms a solid basis for future searches for still better algorithms.

Acknowledgements--The author is indebted to Ir. E. Van Damme, who introduced him to system theoretic thinking, and to Dr. Ir. J. Vandewalle and Prof. A. Haegemans for valuable discussions.

This work is supported by N.F.W.O. (Nationaal Fonds voor Wetenschappelijk Onderzoek, Belgium).

REFERENCES

Ackermann, J. E. (1971). Die minimale Ein-Ausgangs-Beschreibung von Mehrgrössensystemen und ihre Bestimmung aus Ein-Ausgangs-Messungen. Regelungstechnik 19, 203-206.

Audley, D. R. (1977). A method of constructing minimal approximate realizations of linear input/output behaviour. Automatica 13, 409-415.

Chen, C. T. (1970). Introduction to Linear System Theory. HRW Series in Electrical Engineering.

De Jong, L. S. (1975). Numerical aspects of realization algorithms in linear systems theory. Doct. Thesis, T.H. Eindhoven.

Desoer, C. A. (1970). Notes for a Second Course on Linear Systems. Van Nostrand Reinhold, New York.

Eckart, C. and Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika 1, 211-218.

Forsythe, G. and Moler, C. B. (1967). Computer Solution of Linear Algebraic Systems. Prentice-Hall.

Gilbert, E. G. (1963). Controllability and observability in multivariable control systems. SIAM J. Control 1, 128-151.

Guegen, C. J. and Toumire, E. (1970). Comments on irreducible Jordan form realization of a rational matrix. IEEE Trans. Aut. Control AC-15.

Guidorzi, R. (1975). Canonical structures in the identification of multivariable systems. Automatica 11, 361-374.

Ho, B. L. and Kalman, R. E. (1965). Effective construction of linear state-variable models from input/output functions. Proc. 3rd Allerton Conf. Circuit and Systems Theory, 449-459.

Kalman, R. E. (1963a). Mathematical description of linear dynamical systems. SIAM J. Control 1, 152-192.

Kalman, R. E. (1963b). Irreducible realizations and the degree of a rational matrix. SIAM J. Control 13, 520-544.

Kuo, Y. L. (1970). On the irreducible Jordan form realization and the degree of a rational matrix. IEEE Trans. Circuit Theory CT-17, 322-332.

Massey, J. L. (1969). Shift-register synthesis and BCH decoding. IEEE Trans. Information Theory 15, 122-127.

Panda, S. P. and Chen, C. T. (1969). Irreducible Jordan form realization of a rational matrix. IEEE Trans. Aut. Control AC-14, 66-69.

Rissanen, J. (1971). Recursive identification of linear systems. SIAM J. Control 9, 420--430.

Rosenbrock, H. H. (1970). State Space and Multivariable Theory. Nelson, London.

Silverman, L. M. and Meadows, H. E. (1966). Equivalence and synthesis of time variable linear systems. Proc. 4th Allerton Conf. Circuits and Systems Theory, 776-784.

Silverman, L. M. (1971). Realization of linear dynamical systems. IEEE Trans. Aut. Control AC-16.

Staar, J., Wemans, M. and Vandewalle, J. (1980a). Comparative results of multivariable realization algorithms of the MBH type in the presence of multiple poles, and noise disturbing the Markov sequence. 4th Int. Conference on Analysis and Optimization of Systems, INRIA, Versailles (F).

Staar, J. and Wemans, M. (1980b). Row and column reduction procedures. Report ESAT Laboratory (in preparation).

Van Dooren, P. and Dewilde, P. (1971). State space realization of a general rational matrix: a numerically stable algorithm. Midwest Symp. Circ. and Syst., Aug.

Wolovich, W. A. (1974). Linear Multivariable Systems. Appl. Math. Sc. 11, Springer, Berlin.

Wonham, W. M. (1974). Linear Multivariable Control: a Geometric Approach. Lecture Notes in Econ. and Math. Syst., Springer, Berlin.

Youla, D. C. and Tissi, P. (1966). N-port synthesis via reactance extraction--Part I. IEEE Int. Conv. Rec. 14, (7), 183-205.

Zadeh, L. A. and Desoer, C. A. (1963). Linear System Theory. McGraw-Hill, New York.

Zeiger, H. P. and McEwen, A. J. (1974). Approximate linear realizations of given dimensions via Ho's algorithm. IEEE Trans. Aut. Control AC-19, 153.

APPENDIX: ROW AND COLUMN REDUCTION OF A MATRIX AND THE EFFECT OF NOISE ON ITS ENTRIES

1. Orthogonal reduction based on singular value decomposition

It is well known (Forsythe and Moler, 1967) that Singular Value Decomposition (SVD) is numerically the most reliable way to cope with row and column reduction, and with the associated rank determination problem. Therefore let the S.V.D. of a matrix T_NM be given by (assume N ≤ M):

$$T_{NM} = U_{NN}\Sigma_{NM}V'_{MM} \qquad (A1)$$

with U_NN and V_MM orthonormal and Σ_NM = diag(σ_1, σ_2, …, σ_{n+Δn}, 0, …). If the rank of T_NM is n+Δn, then the closest approximation T̂_NM of rank n to T_NM is given by:

$$\hat{T}_{NM} = U_{Nn}\Sigma_{nn}V'_{nM} \qquad (A2)$$

where U_Nn, Σ_nn, V'_nM are the submatrices of U_NN, Σ_NM, V'_MM restricted to the columns and/or rows associated with the n largest singular values σ_1 … σ_n in Σ_NM. This statement is valid in both the 2-norm and the E-norm (e.g. see De Jong, 1975, p. 59):


Euclidean norm:

$$\|T - \hat{T}\|_2 = \sigma_{n+1} \qquad (A3)$$

Energy norm (Frobenius norm):

$$\|T - \hat{T}\|_E^2 = \sum_{i,j}(t_{ij} - \hat{t}_{ij})^2 = \sum_{k=n+1}^{n+\Delta n} \sigma_k^2 \qquad (A4)$$

Hence if a noise level σ²_ν on the elements of a matrix T is known, then (A3) and (A4) provide a reliable tool to find the matrix T̂ of smallest rank within the noise neighbourhood √(MN)·σ_ν.

Then, for matrix T, the closest approximation to an n-row or n-column reduced form is given by:

$$U'_{NN}T_{NM} \approx \begin{bmatrix} U'_{nN}T_{NM} \\ 0_{(N-n)M} \end{bmatrix} = \begin{bmatrix} \bar{T}_{nM} \\ 0 \end{bmatrix} \qquad (A5)$$

$$T_{NM}V_{MM} \approx [T_{NM}V_{Mn} \mid 0] = [\tilde{T}_{Nn} \mid 0] \qquad (A6)$$

It can further be shown (Wonham, 1974) that the n column vectors of U_Nn form an orthonormal base for the image space of T̂, and the n row vectors of V'_nM form an orthonormal base for the orthogonal complement of the kernel space of T̂.
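A minimal sketch of (A2), with the error identities (A3) and (A4) verified numerically (function name is ours):

```python
# Best rank-n approximation (A2), with error norms (A3) and (A4).
import numpy as np

def best_rank_n(T, n):
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    T_hat = U[:, :n] @ np.diag(s[:n]) @ Vt[:n]        # (A2)
    assert np.isclose(np.linalg.norm(T - T_hat, 2), s[n])                     # (A3)
    assert np.isclose(np.linalg.norm(T - T_hat, 'fro')**2, np.sum(s[n:]**2))  # (A4)
    return T_hat
```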

2. Gauss elimination based reductions

Speed requirements may be an imperative reason to use G.E. instead of orthogonal procedures. Though never as accurate as S.V.D., G.E. leads to fairly reliable reduction procedures provided complete pivoting is used, and provided no row or column scaling is applied (Staar and Wemans, 1980b). Moreover, a pick-out variant may lead to ultra fast procedures, as shown in section 4.

(a) Classical procedure. The row reduction of a matrix T may be represented as follows (Forsythe and Moler, 1967):

$$F\,P_R\,T_{NM}\,P_C = \begin{bmatrix} D_{nn} & \tilde{T}_{n,M-n} \\ 0_{N-n,n} & 0_{N-n,M-n} \end{bmatrix} \qquad (A7)$$

where D_nn carries the n nonzero pivots on its diagonal,

$$F = \begin{bmatrix} Y_F & 0_{n,N-n} \\ X_F & I_{N-n,N-n} \end{bmatrix} \qquad (A8)$$

and P_R and P_C perform row and column interchanges.

(b) Pick-out procedure. If in (A7)-(A8) Y_F is replaced by the identity matrix I_nn, then another row reduction matrix F'P_R is obtained:

$$F'\,P_R\,T_{NM} = \begin{bmatrix} \bar{T}_{nM} \\ 0_{N-n,M} \end{bmatrix} \qquad (A9)$$

where now:

$$F' = \begin{bmatrix} I_{nn} & 0_{n,N-n} \\ X_F & I_{N-n,N-n} \end{bmatrix}, \qquad F'^{-1} = \begin{bmatrix} I_{nn} & 0_{n,N-n} \\ -X_F & I_{N-n,N-n} \end{bmatrix} \qquad (A10)$$

Notice that the first n rows of F'P_R merely pick out n independent rows of T. X_F then contains the linear dependence of the other rows on the n independent ones.

Note the advantages of the presented solution:

1. The nonzero entries of the reduced matrix are directly given data, which is attractive with respect to error propagation.

2. The reducing matrix F' contains fewer calculated entries than F, and its (trivial) inversion requires no additional calculations.

3. Products with F' as a factor will require less computation and will introduce less error propagation, due to its partly trivial structure.

This pick-out procedure is the starting point for a fast realization algorithm, where no additional computation is required once the elimination step is completed.

Strictly speaking, in both procedures (a) and (b) the rank of T is n iff only n nonzero pivots can be found. If the entries of T are disturbed by noise, then the pivots n+1, n+2, … will be small, but different from zero. Though not as rigorous as in the S.V.D. case, still good results can be obtained by using the pivots in a similar way as the singular values in (A3)-(A4), provided the noise level is not too high, and complete pivoting and no row or column scaling is used (Staar and Wemans, 1980b).

Column reduction is performed in a similar way, with matrices G, Y_G and X_G, which have a structure that is the symmetric complement of F, Y_F and X_F.