[ieee 2009 ieee/asme international conference on advanced intelligent mechatronics (aim) - singapore...
New Form of Block Matrix Inversion
Youngjin Choi
Abstract— Block matrix inversion is a useful tool in control, estimation theory, and signal processing. However, the conventional block matrix inversion requires at least one of the two diagonal blocks to be invertible. This paper shows that this assumption can be partially relaxed by offering a new form of symbolic block matrix inversion. Although the suggested block matrix inversion can be applied to a variety of problems, three application examples are presented in this paper to show its effectiveness: the first simply finds an inverse matrix, the second obtains an inverse plant model of a MIMO (multi-input, multi-output) plant for canceling plant noise and disturbance, and the third solves an optimization problem subject to an equality constraint.
I. INTRODUCTION
Block matrix inversion is often encountered in optimal filtering, state estimation, optimization, inverse model-based control, and disturbance canceling methods [1], [2]. Although the conventional 2 × 2 block matrix inversion is well established [1], [3], it still requires the assumption that at least one of the diagonal blocks is invertible. If this assumption is not satisfied even though the entire matrix has full rank, we must either rearrange the matrix through column/row exchanges so that the assumption holds, or abandon the symbolic block matrix inversion and resort to a numerical algorithm such as LU decomposition-based inversion. Recently, a pivot-free block matrix inversion was suggested in [4] using the concept of the Moore-Penrose generalized inverse. In addition, Moore-Penrose generalized inverses of large block matrices were suggested in [5], [6], [7]. However, these methods all rely on the Moore-Penrose generalized inverse, even when an exact inverse with a symbolic expression exists.
In this paper, a new form of 2 × 2 block matrix inversion is suggested by partitioning the same matrix differently. It can be applied when both off-diagonal blocks have full rank. To verify the effectiveness and applicability of the suggested block matrix inversion, we show three application examples that can be neither obtained nor solved with the conventional block matrix inversion: the first finds an inverse matrix, the second obtains an inverse plant model of a given MIMO plant, and the third solves an optimization problem with an equality constraint. The inverse plant model has been used as part of a signal processing method for canceling plant noise and disturbance in [8], or as a feedforward controller to improve tracking control performance in [9]. However, it is not easy to obtain the inverse plant model of a MIMO plant. Also, the optimization problem for a positive semi-definite performance index subject to an equality constraint in [10] cannot be solved directly using the conventional block matrix inversion. The suggested new form of block matrix inversion can be an alternative in these cases.

This work was supported in part by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (R01-2008-000-20631) and in part by the Gyeonggi Regional Research Center Program under grant (2007-000-0407-0003), Republic of Korea.
Y. Choi is with the School of Electrical Engineering and Computer Science, Hanyang University, Ansan, 426-791, Rep. of Korea, Tel: +82-31-400-5232, Fax: +82-31-436-8156, [email protected]
This paper is organized as follows: Section II states the problem with the conventional block matrix inversion method; Section III suggests a new form of block matrix inversion as an alternative to the conventional ones; Section IV presents three application examples to show the effectiveness of the suggested block matrix inversion; and Section V draws the conclusion.
II. PROBLEM STATEMENT
Now, let us consider the conventional block matrix inversion. Suppose we have the partitioned matrix

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix},
$$

where $A$ and $D$ are square matrices, and $B$ and $C$ may or may not be square. If $A$, $B$, $C$, and $D$ are, respectively, $n \times n$, $n \times m$, $m \times n$, and $m \times m$ matrices, then the conventional block matrix inverse has the form of either

$$
\begin{bmatrix}
A^{-1} + A^{-1} B S_A^{-1} C A^{-1} & -A^{-1} B S_A^{-1} \\
-S_A^{-1} C A^{-1} & S_A^{-1}
\end{bmatrix} \tag{1}
$$

provided $|A| \neq 0$ and $|S_A| \neq 0$, with $S_A \triangleq D - CA^{-1}B$ the Schur complement of $A$, or

$$
\begin{bmatrix}
S_D^{-1} & -S_D^{-1} B D^{-1} \\
-D^{-1} C S_D^{-1} & D^{-1} C S_D^{-1} B D^{-1} + D^{-1}
\end{bmatrix} \tag{2}
$$

provided $|D| \neq 0$ and $|S_D| \neq 0$, with $S_D \triangleq A - BD^{-1}C$ the Schur complement of $D$.
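The conventional formula of Eq. (1) can be sketched in a few lines of code. The following is only an illustrative sketch, not the author's implementation: the helper names (`mat`, `eye`, `mul`, `add`, `neg`, `inv`, `block`, `conventional_inverse`) and the test matrices are assumptions introduced here, and exact rational arithmetic from the Python standard library is used so that results can be compared exactly.

```python
from fractions import Fraction

def mat(rows):
    """Build an exact rational matrix from nested lists."""
    return [[Fraction(v) for v in row] for row in rows]

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def add(P, Q, sign=1):
    """Return P + sign * Q elementwise."""
    return [[p + sign * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def neg(P):
    return [[-p for p in row] for row in P]

def inv(M):
    """Gauss-Jordan inverse; raises ZeroDivisionError for a singular matrix."""
    n = len(M)
    aug = [list(row) + unit for row, unit in zip(M, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def block(A, B, C, D):
    """Assemble [[A, B], [C, D]] into one matrix."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

def conventional_inverse(A, B, C, D):
    """Eq. (1): requires |A| != 0 and |S_A| != 0."""
    Ai = inv(A)
    Si = inv(add(D, mul(mul(C, Ai), B), sign=-1))  # inverse of S_A = D - C A^{-1} B
    AiB, CAi = mul(Ai, B), mul(C, Ai)
    top_left = add(Ai, mul(mul(AiB, Si), CAi))
    return block(top_left, neg(mul(AiB, Si)), neg(mul(Si, CAi)), Si)

# Hypothetical test data: A is invertible, so Eq. (1) applies.
A, B = mat([[2, 1], [1, 1]]), mat([[1], [0]])
C, D = mat([[0, 1]]), mat([[3]])
product = mul(block(A, B, C, D), conventional_inverse(A, B, C, D))

# With a singular A (and D = 0) the formula breaks down.
try:
    conventional_inverse(mat([[0, 0], [0, 1]]), mat([[1], [1]]),
                         mat([[1, 0]]), mat([[0]]))
    singular_case_ok = True
except ZeroDivisionError:
    singular_case_ok = False
```

In this sketch `product` is the identity for the invertible-`A` data, while the second call fails because `A` itself cannot be inverted, which is the situation the rest of the paper addresses.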
2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Suntec Convention and Exhibition Center, Singapore, July 14-17, 2009. 978-1-4244-2853-3/09/$25.00 ©2009 IEEE

Here, Eq. (1) and (2) are two expressions for the inverse of the same matrix, so they must be equal. Equating the corresponding blocks gives the well-known matrix inversion lemma of [11], [12] in the following forms:

$$
(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}, \tag{3}
$$

$$
(D - CA^{-1}B)^{-1} = D^{-1}C(A - BD^{-1}C)^{-1}BD^{-1} + D^{-1}. \tag{4}
$$
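As a quick sanity check of Eq. (3), the scalar case can be worked out by hand. The following is a sketch under the assumption $n = m = 1$, with lowercase scalars $a$, $b$, $c$, $d$ (names introduced here for illustration only) satisfying $a \neq 0$, $d \neq 0$ and $ad - bc \neq 0$:

```latex
\begin{aligned}
(a - b d^{-1} c)^{-1} &= \frac{d}{ad - bc}, \\
a^{-1} + a^{-1} b\,(d - c a^{-1} b)^{-1} c\, a^{-1}
  &= \frac{1}{a} + \frac{bc}{a(ad - bc)}
   = \frac{(ad - bc) + bc}{a(ad - bc)}
   = \frac{d}{ad - bc}.
\end{aligned}
```

Both sides agree, as Eq. (3) requires. Eq. (4) follows in the same way with the roles of $a$ and $d$ exchanged.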
As mentioned above, at least one of $A$ and $D$ must be invertible to apply the conventional symbolic block matrix inversion. In other words, if both $|A| = 0$ and $|D| = 0$, then the conventional block matrix inversion cannot be directly applied even though the entire matrix has full rank. This is the problem to be solved in this paper. Of course, the matrix may be rearranged through column/row exchanges so that either $|A| \neq 0$ or $|D| \neq 0$, after which the conventional formulas may be applied; however, these cases are outside the scope of this paper. In the following section, we suggest a new form of symbolic block matrix inversion that can be found even when both $|A| = 0$ and $|D| = 0$, under some assumptions to be stated later. These partially relax the invertibility assumption, which is one of the weak points of the conventional block matrix inversion formulas.
III. NEW FORM OF BLOCK MATRIX INVERSION
A new form of block matrix inversion is suggested for the case in which both $|A| = 0$ and $|D| = 0$. It is derived by partitioning the matrix differently, and it can be applied when both off-diagonal blocks $B$ and $C$ have full (column or row) rank.
Even though both diagonal blocks $A$ and $D$ are singular, if both off-diagonal blocks $B$ and $C$ have full rank, then the block matrix inverse can be found under some assumptions. The required assumptions and the block matrix inverse are derived in this section. The basic idea for removing the assumption that at least one of $A$ and $D$ must be invertible is to factor the matrix as follows:

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ DY^{-1}CW^{-1} & X \end{bmatrix}
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}, \tag{5}
$$

where $I$ and $0$ are identity and zero matrices of suitable dimensions, and to newly define the following matrices:

$$
W \triangleq A + BC \tag{6}
$$

$$
Y \triangleq CW^{-1}B \tag{7}
$$

$$
X \triangleq I + D(I - Y^{-1}), \tag{8}
$$

in which $W$, $Y$ and $X$ are, respectively, $n \times n$, $m \times m$, and $m \times m$ matrices that are assumed to be invertible. In other words, we assume that $|W| \neq 0$, $|Y| \neq 0$ and $|X| \neq 0$. These assumptions must be satisfied for the block matrix inverse to exist. Also, for later use, we introduce the following two definitions:

$$
B_W^* \triangleq Y^{-1}CW^{-1} = (CW^{-1}B)^{-1}CW^{-1}, \quad \text{for } B_W^* B = I, \tag{9}
$$

$$
C_W^* \triangleq W^{-1}BY^{-1} = W^{-1}B(CW^{-1}B)^{-1}, \quad \text{for } CC_W^* = I. \tag{10}
$$

Here, we should note that $B_W^*$ and $C_W^*$ are not Moore-Penrose pseudoinverses, though they play the roles of a left weighted pseudoinverse of $B$ and a right weighted pseudoinverse of $C$, respectively. They can always be found under the given assumptions. To complete the block matrix inversion, we need the proof of Eq. (5) and the inverse of each factor.
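The two side conditions attached to Eq. (9) and (10) follow in one line each from the definition of $Y$ in Eq. (7). As a short worked check (the subscript $m$ on the identity is added here only to make the dimension explicit):

```latex
\begin{aligned}
B_W^* B &= Y^{-1}\,(C W^{-1} B) = Y^{-1} Y = I_m, \\
C C_W^* &= (C W^{-1} B)\,Y^{-1} = Y Y^{-1} = I_m.
\end{aligned}
```

The products in the opposite order, $B B_W^*$ and $C_W^* C$, are $n \times n$ matrices and are in general not identities; they are idempotent, since for example $(B B_W^*)^2 = B (B_W^* B) B_W^* = B B_W^*$.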
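First, let us prove Eq. (5) as follows:

$$
\begin{aligned}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
&= \begin{bmatrix} I & 0 \\ DB_W^* & X \end{bmatrix}
   \begin{bmatrix} W - BC & B \\ C & 0 \end{bmatrix}
   && \text{by definitions of Eq. (6) and (9)} \\
&= \begin{bmatrix} A & B \\ DB_W^* W - DC + XC & D \end{bmatrix}
   && \text{because } B_W^* B = I \\
&= \begin{bmatrix} A & B \\ DY^{-1}C - DC + [D(I - Y^{-1}) + I]C & D \end{bmatrix}
   && \text{by definitions of Eq. (8) and (9)} \\
&= \begin{bmatrix} A & B \\ C & D \end{bmatrix}.
\end{aligned}
$$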
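Second, we can easily find the inverse of the first factor of Eq. (5) in the following form:

$$
\begin{bmatrix} I & 0 \\ DB_W^* & X \end{bmatrix}^{-1}
=
\begin{bmatrix} I & 0 \\ -X^{-1}DB_W^* & X^{-1} \end{bmatrix}. \tag{11}
$$

Also, the inverse of the second factor of Eq. (5) has the form of either

$$
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix} W^{-1}[I - BB_W^*] & C_W^* \\ B_W^* & I - Y^{-1} \end{bmatrix} \tag{12}
$$

or

$$
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix} [I - C_W^* C]W^{-1} & C_W^* \\ B_W^* & I - Y^{-1} \end{bmatrix}, \tag{13}
$$

where the above two expressions are equal. These inverses can always be found under the given assumptions. Now, we prove Eq. (12) as follows:

$$
\begin{aligned}
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1}
&= \begin{bmatrix} W - BC & B \\ C & 0 \end{bmatrix}
   \begin{bmatrix} W^{-1}[I - BB_W^*] & C_W^* \\ B_W^* & I - Y^{-1} \end{bmatrix} \\
&= \begin{bmatrix} I - BCW^{-1} + BYB_W^* & WC_W^* - BY^{-1} \\ CW^{-1} - YB_W^* & I \end{bmatrix} \\
&= \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}
   \quad \text{by definitions of Eq. (9) and (10).}
\end{aligned}
$$

The proof of Eq. (13) follows the same procedure.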
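The equality of Eq. (12) and (13) can also be seen directly: only their upper-left blocks differ, and these coincide by the definitions of Eq. (9) and (10). A short worked step, using nothing beyond those definitions:

```latex
\begin{aligned}
W^{-1} B B_W^* &= W^{-1} B\,Y^{-1} C W^{-1} = C_W^*\, C W^{-1}, \\
\Rightarrow\quad W^{-1}[I - B B_W^*] &= W^{-1} - C_W^* C W^{-1} = [I - C_W^* C]\,W^{-1}.
\end{aligned}
```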
Finally, we can get the inverse of Eq. (5) in the form of either

$$
\therefore
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}
=
\begin{bmatrix}
W^{-1} - [W^{-1}B + C_W^* X^{-1}D]B_W^* & C_W^* X^{-1} \\
[I - (I - Y^{-1})X^{-1}D]B_W^* & [I - Y^{-1}]X^{-1}
\end{bmatrix} \tag{14}
$$

or

$$
\therefore
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}
=
\begin{bmatrix}
W^{-1} - C_W^*[CW^{-1} + X^{-1}DB_W^*] & C_W^* X^{-1} \\
[I - (I - Y^{-1})X^{-1}D]B_W^* & [I - Y^{-1}]X^{-1}
\end{bmatrix}, \tag{15}
$$

where the above two expressions are obtained by using Eq. (12) and (13), respectively, and they must be equal. Specifically, Eq. (14) is obtained by multiplying Eq. (12) by Eq. (11), and Eq. (15) by multiplying Eq. (13) by Eq. (11).
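Eq. (14) translates almost line by line into code. The following is a minimal sketch under stated assumptions rather than a reference implementation: the function name `new_block_inverse`, the small matrix helpers, and the test blocks (with singular `A` and `D = 0`) are all introduced here for illustration, and exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction

def mat(rows):
    """Build an exact rational matrix from nested lists."""
    return [[Fraction(v) for v in row] for row in rows]

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def add(P, Q, sign=1):
    """Return P + sign * Q elementwise."""
    return [[p + sign * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def inv(M):
    """Gauss-Jordan inverse; raises ZeroDivisionError for a singular matrix."""
    n = len(M)
    aug = [list(row) + unit for row, unit in zip(M, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def block(A, B, C, D):
    """Assemble [[A, B], [C, D]] into one matrix."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

def new_block_inverse(A, B, C, D):
    """Eq. (14); inv() raises if W, Y or X is singular."""
    m = len(D)
    Wi = inv(add(A, mul(B, C)))                  # W = A + B C             (Eq. 6)
    Yi = inv(mul(mul(C, Wi), B))                 # Y = C W^{-1} B          (Eq. 7)
    I_minus_Yi = add(eye(m), Yi, sign=-1)
    Xi = inv(add(eye(m), mul(D, I_minus_Yi)))    # X = I + D (I - Y^{-1})  (Eq. 8)
    Bs = mul(mul(Yi, C), Wi)                     # B*_W                    (Eq. 9)
    Cs = mul(mul(Wi, B), Yi)                     # C*_W                    (Eq. 10)
    CsXi = mul(Cs, Xi)
    top_left = add(Wi, mul(add(mul(Wi, B), mul(CsXi, D)), Bs), sign=-1)
    bottom_left = mul(add(eye(m), mul(mul(I_minus_Yi, Xi), D), sign=-1), Bs)
    return block(top_left, CsXi, bottom_left, mul(I_minus_Yi, Xi))

# Hypothetical test data: |A| = 0 and D = 0, so Eq. (1) and (2) do not apply.
A, B = mat([[1, 1], [1, 1]]), mat([[1], [0]])
C, D = mat([[0, 1]]), mat([[0]])
M = block(A, B, C, D)
M_inv = new_block_inverse(A, B, C, D)
product = mul(M, M_inv)
```

For these blocks, `W = [[1, 2], [1, 1]]`, `Y = 1` and `X = 1` are all invertible, and `product` comes out as the 3 × 3 identity.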
As mentioned above, the conventional expressions for the symbolic inverse of a 2 × 2 block matrix exist under the conditions of either $|A| \neq 0$ and $|D - CA^{-1}B| \neq 0$, or $|D| \neq 0$ and $|A - BD^{-1}C| \neq 0$, as shown in Eq. (1) and (2). The symbolic inverse expressions suggested in Eq. (14) and (15), however, do not require these assumptions. Instead, they require the new assumptions $|W| = |A + BC| \neq 0$, $|Y| = |CW^{-1}B| \neq 0$ and $|X| = |I + D(I - Y^{-1})| \neq 0$. Therefore, Eq. (14) and (15) provide another expression of the block matrix inverse.
After confirming that both $B$ and $C$ have full (column or row) rank, the three suggested assumptions should be checked for the existence of the block matrix inverse. If either of the off-diagonal blocks $B$ and $C$ loses rank, then $Y$ becomes singular, namely $|Y| = 0$, and the suggested new form cannot be used as a block matrix inversion formula. Next, we summarize the symbolic expressions for block matrix inversion found so far and show three practical application examples: an inverse matrix, an inverse plant model, and an optimization problem.
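The claim that a rank loss in $B$ or $C$ forces $|Y| = 0$ follows from the standard rank inequality for matrix products. A brief worked argument, using only the dimensions stated in Section II:

```latex
\operatorname{rank}(Y) = \operatorname{rank}(C W^{-1} B)
  \le \min\{\operatorname{rank}(C),\ \operatorname{rank}(B)\}
  \le \min\{m,\ n\}.
```

Since $Y$ is $m \times m$, $|Y| \neq 0$ requires $\operatorname{rank}(B) = \operatorname{rank}(C) = m$, that is, $B$ has full column rank and $C$ has full row rank, which in turn needs $m \le n$. Full rank is therefore necessary but not sufficient, which is why $|W|$, $|Y|$ and $|X|$ still have to be checked.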
IV. APPLICATION EXAMPLES
First, Table 1 summarizes the 2 × 2 block matrix inversions according to their existence conditions, expressed as prerequisite and requisite conditions. If the prerequisite condition is satisfied, then the requisite conditions should be checked for the existence of the symbolic expression of the block matrix inverse. If the requisite conditions are also satisfied, then the corresponding inverse matrix expression exists as listed in Table 1. In this section, we show three examples to verify the usefulness of the suggested block matrix inversion: the first finds an inverse matrix, the second obtains the inverse plant model of a given MIMO plant, and the third solves an optimization problem for a positive semi-definite performance index subject to an equality constraint.
A. Numerical Example
Let us find the inverse of the following simple matrix:

$$
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \tag{16}
$$

which has full rank. In order to apply the block matrix inversion formula, let us partition Eq. (16) as follows:

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
=
\left[\begin{array}{cc|c} 0 & 0 & 1 \\ 0 & 1 & 1 \\ \hline 1 & 0 & 0 \end{array}\right]. \tag{17}
$$

Here, since $|A| = 0$ and $|D| = 0$, the conventional block matrix inversion formulas of Eq. (1) and (2) cannot be applied. However, the off-diagonal blocks $B$ and $C$ have full (column and row, respectively) rank 1, and since the following three matrices of Table 1 are invertible:

$$
W = A + BC = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad
Y = CW^{-1}B = 1, \qquad
X = I + D(I - Y^{-1}) = 1,
$$

we can apply the new expression of either Eq. (14) or (15) to Eq. (17). For ease of notation, the following intermediate matrices are calculated in advance:

$$
B_W^* = Y^{-1}CW^{-1} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad
C_W^* = W^{-1}BY^{-1} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
$$

Finally, the inverse matrix is found using Eq. (14) or (15) as follows:

$$
\therefore
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix} 0 & 0 & 1 \\ -1 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}.
$$
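However, if the matrix is partitioned differently from Eq. (17) as follows:

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
=
\left[\begin{array}{c|cc} 0 & 0 & 1 \\ \hline 0 & 1 & 1 \\ 1 & 0 & 0 \end{array}\right],
$$

then

$$
W = A + BC = 1, \qquad
Y = CW^{-1}B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
X = I + D(I - Y^{-1}) = \text{not defined},
$$

because $Y$ is not invertible. In this case, we cannot apply the suggested block matrix inversion formulas.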
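The two partitions of this example can be reproduced with a few lines of exact arithmetic. The sketch below is illustrative only: the helper names and the function `requisite_matrices` are assumptions introduced here, and the block entries are read off the two partitions of matrix (16) described in the text.

```python
from fractions import Fraction

def mat(rows):
    """Build an exact rational matrix from nested lists."""
    return [[Fraction(v) for v in row] for row in rows]

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def add(P, Q, sign=1):
    """Return P + sign * Q elementwise."""
    return [[p + sign * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def inv(M):
    """Gauss-Jordan inverse; raises ZeroDivisionError for a singular matrix."""
    n = len(M)
    aug = [list(row) + unit for row, unit in zip(M, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def requisite_matrices(A, B, C, D):
    """Return W, Y, X of Eq. (6)-(8); inv() raises if W or Y is singular."""
    m = len(D)
    W = add(A, mul(B, C))
    Y = mul(mul(C, inv(W)), B)
    X = add(eye(m), mul(D, add(eye(m), inv(Y), sign=-1)))
    return W, Y, X

# Partition of Eq. (17): A is 2x2 and D is 1x1.
W, Y, X = requisite_matrices(mat([[0, 0], [0, 1]]), mat([[1], [1]]),
                             mat([[1, 0]]), mat([[0]]))

# Alternative partition: A is 1x1 and D is 2x2, so Y is 2x2 and singular.
try:
    requisite_matrices(mat([[0]]), mat([[0, 1]]), mat([[0], [1]]),
                       mat([[1, 1], [0, 0]]))
    second_partition_ok = True
except ZeroDivisionError:
    second_partition_ok = False

# Check the inverse stated in the text by direct multiplication.
M = mat([[0, 0, 1], [0, 1, 1], [1, 0, 0]])
M_inv = mat([[0, 0, 1], [-1, 1, 0], [1, 0, 0]])
is_inverse = mul(M, M_inv) == eye(3)
```

The first call returns the `W`, `Y` and `X` given in the text, while the second fails when `Y` is inverted, matching the "not defined" entry for `X`.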
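Table 1: Symbolic block matrix inversion expressions of $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ according to the prerequisite and requisite conditions, where $A$ and $D$ are square matrices and $B$ and $C$ may or may not be square; in other words, $A$, $B$, $C$, and $D$ are, respectively, $n \times n$, $n \times m$, $m \times n$, and $m \times m$ matrices.

| 1. Prerequisite | 2. Requisite | Inverse Matrix |
|---|---|---|
| $A$ has full rank | $\lvert S_A \rvert \triangleq \lvert D - CA^{-1}B \rvert \neq 0$ | Eq. (1) |
| $D$ has full rank | $\lvert S_D \rvert \triangleq \lvert A - BD^{-1}C \rvert \neq 0$ | Eq. (2) |
| $B$ and $C$ have full ranks | $\lvert W \rvert \triangleq \lvert A + BC \rvert \neq 0$, $\lvert Y \rvert \triangleq \lvert CW^{-1}B \rvert \neq 0$, $\lvert X \rvert \triangleq \lvert I + D(I - Y^{-1}) \rvert \neq 0$ | Eq. (14) or (15) |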
B. Inverse Plant Model
The inverse plant model has often been used as a feedforward controller to improve overall control performance. It is also required for canceling plant noise and disturbance, as addressed in [8]. Here, let us consider the method of canceling plant noise and disturbance shown in Fig. 1.
[Figure omitted: block diagram with signals labeled Command Input, Plant Input, Plant Disturbance, Sensor Noise, Plant Output, Noise and disturbance at plant output, and Noise and disturbance filtered for canceling.]

Fig. 1. Canceling Scheme of Plant Noise and Disturbance
In the canceling scheme of Fig. 1, the plant disturbance is often represented as an additive noise at the plant input. Sensing the plant output is done with a sensor that may be noisy. Sensor noise is often represented as an additive noise at the plant output. In the system of Fig. 1, the plant noise and disturbance are separated from the plant's dynamic output response. The plant input drives both the plant and its model (which is free of noise and disturbance). The difference between the plant output and the plant model output is the plant noise and disturbance as they appear at the plant output. This difference drives the inverse plant model, and the result is fed back to the input side in order to subtract the plant noise and disturbance from the actual plant input. The ultimate effect is to cancel noise and disturbance at the plant output [8].
When the canceling scheme of Fig. 1 is realized, however, two kinds of problems are often met: one concerns causality and the other concerns the symbolic inversion method. Since causality is outside the scope of this paper, we neglect it and focus on how to obtain an analytical form of the inverse plant model when the conventional matrix inversion fails to provide one. The suggested expression of block matrix inversion can be an alternative in this case.
[Figure omitted: plant block with inputs u1, u2, u3, u4 and outputs y1, y2, y3, y4.]

Fig. 2. Plant Model: P(s)
Let us consider the MIMO plant with four inputs and four outputs shown in Fig. 2. If we partition the input vector and the output vector into halves, then the MIMO plant of Fig. 2 is described by the following transfer matrix:

$$
P(s) = \begin{bmatrix} A(s) & B(s) \\ C(s) & D(s) \end{bmatrix}
=
\left[\begin{array}{cc|cc}
\frac{1}{s+1} & 0 & \frac{s}{s+1} & 0 \\
0 & 0 & 0 & \frac{s}{s+1} \\ \hline
\frac{s}{s+1} & 0 & 0 & 0 \\
0 & \frac{s}{s+1} & 0 & \frac{1}{s+1}
\end{array}\right], \tag{18}
$$

where $s$ is the Laplace variable. In this case, since $|A| = 0$ and $|D| = 0$, the conventional block matrix inversion formulas of Eq. (1) and (2) cannot be applied. However, both off-diagonal blocks $B$ and $C$ have full rank, and since the following three matrices of Table 1 are invertible:

$$
W(s) = A + BC = \begin{bmatrix} \frac{s^2+s+1}{(s+1)^2} & 0 \\ 0 & \frac{s^2}{(s+1)^2} \end{bmatrix},
$$

$$
Y(s) = CW^{-1}B = \begin{bmatrix} \frac{s^2}{s^2+s+1} & 0 \\ 0 & 1 \end{bmatrix},
$$

$$
X(s) = I + D(I - Y^{-1}) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
$$

we can apply the new expression of either Eq. (14) or (15) to the given MIMO plant. For ease of notation, the following intermediate matrices are calculated in advance:

$$
B_W^*(s) = Y^{-1}CW^{-1} = \begin{bmatrix} \frac{s+1}{s} & 0 \\ 0 & \frac{s+1}{s} \end{bmatrix},
\qquad
C_W^*(s) = W^{-1}BY^{-1} = \begin{bmatrix} \frac{s+1}{s} & 0 \\ 0 & \frac{s+1}{s} \end{bmatrix}.
$$

Finally, the inverse plant model is found analytically using Eq. (14) or Eq. (15) as follows:

$$
\therefore P^{-1}(s) =
\begin{bmatrix}
0 & 0 & \frac{s+1}{s} & 0 \\
0 & -\frac{s+1}{s^2} & 0 & \frac{s+1}{s} \\
\frac{s+1}{s} & 0 & -\frac{s+1}{s^2} & 0 \\
0 & \frac{s+1}{s} & 0 & 0
\end{bmatrix}. \tag{19}
$$
The above inverse plant model can be depicted as shown in Fig. 3. The plant model of Eq. (18) and its inverse plant model of Eq. (19) can be used to realize the canceling scheme of plant noise and disturbance shown in Fig. 1. Thus, we have shown the usefulness of the suggested expression of block matrix inversion through an application example for which the inverse plant model cannot be found using the conventional block matrix inversion method.
[Figure omitted: inverse plant block with inputs y1, y2, y3, y4 and outputs u1, u2, u3, u4.]

Fig. 3. Inverse Plant Model: P−1(s)
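Eq. (19) can be spot-checked by evaluating $P(s)$ and $P^{-1}(s)$ at a few values of $s$ and multiplying. The sketch below is only a numerical sanity check, not a symbolic proof: the function names `P`, `P_inv`, `mul` and the sample points are assumptions introduced here, and the samples avoid $s = 0$ and $s = -1$, where entries of Eq. (18) or (19) are undefined.

```python
from fractions import Fraction

def P(s):
    """Plant transfer matrix of Eq. (18), evaluated at a rational s."""
    a, b = 1 / (s + 1), s / (s + 1)
    return [[a, 0, b, 0],
            [0, 0, 0, b],
            [b, 0, 0, 0],
            [0, b, 0, a]]

def P_inv(s):
    """Inverse plant model of Eq. (19), evaluated at a rational s."""
    g, h = (s + 1) / s, -(s + 1) / s**2
    return [[0, 0, g, 0],
            [0, h, 0, g],
            [g, 0, h, 0],
            [0, g, 0, 0]]

def mul(M, N):
    return [[sum(m * n for m, n in zip(row, col)) for col in zip(*N)] for row in M]

identity = [[int(i == j) for j in range(4)] for i in range(4)]

# Hypothetical sample points, chosen away from s = 0 and s = -1.
samples = [Fraction(1), Fraction(2), Fraction(-3, 7)]
right_ok = all(mul(P(s), P_inv(s)) == identity for s in samples)
left_ok = all(mul(P_inv(s), P(s)) == identity for s in samples)
```

Because the arithmetic is exact, both products equal the identity at every sample point without any tolerance.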
C. Optimization with an Equality Constraint
As another example, let us consider the optimization problem for a positive semi-definite performance index (or objective function) subject to an equality constraint in the following form:

$$
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2}x^T A x - b^T x + d \\
\text{subject to} \quad & y = Cx,
\end{aligned} \tag{20}
$$

where $A$ is a symmetric and positive semi-definite matrix, namely $A^T = A \geq 0$, $C$ is assumed to have full row rank with suitable dimensions, and $b$, $d$, $x$ and $y$ are vectors of suitable dimensions.

To solve the optimization problem of Eq. (20), the Lagrangian function is first defined in the following form:

$$
L(x, y, \lambda) = \tfrac{1}{2}x^T A x - b^T x + d + \lambda^T (y - Cx), \tag{21}
$$

where $\lambda$ is a Lagrange multiplier vector. Then, the first-order derivatives used to find the (local) minimum are obtained as follows:

$$
\frac{\partial L}{\partial x} = Ax - b - C^T \lambda = 0,
\qquad
\frac{\partial L}{\partial \lambda} = y - Cx = 0.
$$

Rearranging the above equations gives the following compact equation:

$$
\begin{bmatrix} A & C^T \\ C & 0 \end{bmatrix}
\begin{bmatrix} x \\ -\lambda \end{bmatrix}
=
\begin{bmatrix} b \\ y \end{bmatrix}, \tag{22}
$$
where $0$ is a zero matrix of suitable dimension. Since $A$ is only positive semi-definite and may therefore be singular, the conventional block matrix inversion of Eq. (1) cannot be applied to Eq. (22). As an alternative, let us apply the suggested expression, since $C$ was assumed to have full row rank. First, the requisite conditions of the third type in Table 1 should be checked:

$$
W = A + C^T C, \qquad
Y = CW^{-1}C^T, \qquad
X = I,
$$

where $W$ should be invertible for the suggested expression to exist; $Y$ is then invertible from the full row rank condition. For ease of notation, the following intermediate matrices are obtained:

$$
C_W^* = W^{-1}C^T Y^{-1}, \qquad
B_W^* = Y^{-1}CW^{-1},
$$

where we should note that $B_W^* = C_W^{*T}$. Finally, we can get the inverse matrix using Eq. (15) as follows:

$$
\therefore
\begin{bmatrix} A & C^T \\ C & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix}
W^{-1} - C_W^* CW^{-1} & C_W^* \\
C_W^{*T} & I - Y^{-1}
\end{bmatrix}. \tag{23}
$$

As a result, we get the following optimal solution and Lagrange multiplier by using the inverse matrix of Eq. (23):

$$
\therefore x_o = C_W^* y + (I - C_W^* C)W^{-1}b, \tag{24}
$$

$$
\lambda = (Y^{-1} - I)y - C_W^{*T} b, \tag{25}
$$

where $x_o$ is the optimal solution of the optimization problem of Eq. (20) and $C_W^* = W^{-1}C^T(CW^{-1}C^T)^{-1}$ is the right weighted pseudo-inverse of $C$.
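Eq. (24) and (25) can be tried on a small problem. The following is a sketch under stated assumptions: the data `A`, `b`, `C`, `y` are hypothetical (chosen so that `A` is singular but `W` is invertible), and the helper names are introduced here for illustration. The residuals at the end check the two first-order conditions derived from Eq. (21).

```python
from fractions import Fraction

def mat(rows):
    """Build an exact rational matrix from nested lists."""
    return [[Fraction(v) for v in row] for row in rows]

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def add(P, Q, sign=1):
    """Return P + sign * Q elementwise."""
    return [[p + sign * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inv(M):
    """Gauss-Jordan inverse; raises ZeroDivisionError for a singular matrix."""
    n = len(M)
    aug = [list(row) + unit for row, unit in zip(M, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Hypothetical data: minimize 0.5*x1^2 - x1 - x2 subject to x1 + x2 = 2.
A = mat([[1, 0], [0, 0]])      # symmetric, positive semi-definite, singular
b = mat([[1], [1]])
C = mat([[1, 1]])              # full row rank
y = mat([[2]])
n, m = len(A), len(C)

Ct = transpose(C)
Wi = inv(add(A, mul(Ct, C)))                 # W = A + C^T C
Yi = inv(mul(mul(C, Wi), Ct))                # Y = C W^{-1} C^T
Cs = mul(mul(Wi, Ct), Yi)                    # C*_W = W^{-1} C^T Y^{-1}

# Eq. (24): x_o = C*_W y + (I - C*_W C) W^{-1} b
x_opt = add(mul(Cs, y), mul(mul(add(eye(n), mul(Cs, C), sign=-1), Wi), b))
# Eq. (25): lambda = (Y^{-1} - I) y - C*_W^T b
lam = add(mul(add(Yi, eye(m), sign=-1), y), mul(transpose(Cs), b), sign=-1)

# First-order conditions: y - C x = 0 and A x - b - C^T lambda = 0.
constraint_residual = add(y, mul(C, x_opt), sign=-1)
stationarity_residual = add(add(mul(A, x_opt), b, sign=-1), mul(Ct, lam), sign=-1)
```

For this data the sketch gives `x_opt = [0, 2]` and `lam = -1`, which agrees with eliminating `x2 = 2 - x1` by hand and minimizing the remaining function of `x1`.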
V. CONCLUDING REMARKS
In this paper, we have suggested a new expression of block matrix inversion, a tool used in control, estimation theory and signal processing. The suggested expressions make up for a weak point of the conventional block matrix inversion formulas. Through three application examples, we have shown that the suggested block matrix inversion method can be applied to finding inverse matrices, obtaining inverse plant models, and solving optimization problems subject to equality constraints.
REFERENCES
[1] D. Simon, "Optimal State Estimation," Wiley, 2006.
[2] S. L. Fagin, "Measurement Matrix Partitioning Theorem," IEEE Trans. on Automatic Control, 14, (6), pp. 773-774, 1969.
[3] K. Ogata, "Modern Control Engineering," Prentice-Hall, 1990.
[4] S. M. Watt, "Pivot-Free Block Matrix Inversion," Proc. of the 8th Int. Symp. on Symbolic and Numeric Algorithms for Scientific Computing, pp. 151-155, 2006.
[5] Y. Tian, "The Moore-Penrose Inverses of m × n Block Matrices and Their Applications," Linear Algebra and Its Applications, 283, pp. 35-60, 1998.
[6] M. Fiedler, "Inversion of e-simple Block Matrices," Linear Algebra and Its Applications, 400, pp. 231-241, 2005.
[7] A. Asif and J. M. F. Moura, "Inversion of Block Matrices with Block Banded Inverses: Application to Kalman-Bucy Filtering," Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 608-611, 2000.
[8] B. Widrow and E. Walach, "Adaptive Inverse Control," Prentice-Hall, 1996.
[9] S. Devasia, "Should Model-Based Inverse Inputs Be Used as Feedforward Under Plant Uncertainty?" IEEE Trans. on Automatic Control, 47, (11), pp. 1865-1871, 2002.
[10] C. L. Lawson and R. J. Hanson, "Solving Least Squares Problems," Prentice-Hall, 1974.
[11] H. Hemami, "Derivation of Matrix Identity," IEEE Trans. on Automatic Control, 14, (3), pp. 303-304, 1969.
[12] T. E. Fortmann, "A Matrix Inversion Identity," IEEE Trans. on Automatic Control, 15, (5), p. 599, 1970.