
ZAMM · Z. Angew. Math. Mech. 78 (1998) 3, 167–172

Lakić, S.

On the Computation of the Matrix k-th Root

In this paper we derive a family of iterative methods for computing the k-th root and the inverse k-th root of a given matrix. We show that the methods are locally convergent. The methods are analyzed and their numerical stability is investigated.

MSC (1991): 65F30, 65F10

1. Introduction

Computational methods for the k-th root of some matrices have been proposed in [1], [2], [5], etc. In Section 2, a family of iterative methods with high order of convergence is developed. In Section 3 we show that these methods are locally stable. In Section 4 we illustrate the performance of the methods by numerical examples.

Let $A \in \mathbb{C}^{n,n}$ be an $n \times n$ matrix with spectrum $\sigma(A) = \{\lambda_i;\ i = 1, \dots, n\}$, where $\lambda_i$ are the eigenvalues of $A$. First we need the following definitions.

Definition 1.1: Let $A \in \mathbb{C}^{n,n}$ be a nonsingular matrix. The k-th root $X = A^{1/k} \in \mathbb{C}^{n,n}$ of $A$ is defined by $X^k = A$.

Definition 1.2: Let $A \in \mathbb{C}^{n,n}$ be a nonsingular matrix. The inverse k-th root $X = A^{-1/k} \in \mathbb{C}^{n,n}$ of $A$ is defined by $A X^k = I$.

For a nonsingular matrix, a k-th root and an inverse k-th root always exist, cf. [4].

2. Computation of $A^{1/k}$ and $A^{-1/k}$

The convergence analysis of the methods proposed in this paper is based on the following theorem.

Theorem 2.1: Let

$$ f_k(z) = \frac{1}{\sqrt[k]{1 - z}}, $$

where $k \in \mathbb{N}$, $z \in \mathbb{C}$. For $j \in \mathbb{N}$ define

$$ R_{j-1}(z) = \sum_{i=0}^{j-1} b_i z^i, \qquad b_0 = f_k(0) = 1, \qquad b_i = \frac{f_k^{(i)}(0)}{i!} = \frac{\prod_{m=0}^{i-1} (1 + mk)}{k^i\, i!}, \quad i = 1, 2, \dots, j-1. $$

Then there holds

$$ 1 - (1 - z) R_{j-1}^k(z) = z^j \sum_{i=0}^{(k-1)(j-1)} c_{i,k} z^i \tag{1} $$

with some nonnegative constants $c_{i,k} = c_{i,k}(k, j)$, $i = 0, \dots, (k-1)(j-1)$, where

$$ \sum_{i=0}^{(k-1)(j-1)} c_{i,k} = 1. \tag{2} $$
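The identity (1) and the normalization (2) can be checked in exact rational arithmetic for small $k$ and $j$. The following is a minimal sketch, not part of the original paper; the function names `taylor_coeffs`, `poly_mul` and `residual_coeffs` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import factorial


def taylor_coeffs(k, j):
    """b_0, ..., b_{j-1}: Taylor coefficients of (1 - z)**(-1/k) at z = 0."""
    coeffs = []
    for i in range(j):
        num = 1
        for m in range(i):
            num *= 1 + m * k
        coeffs.append(Fraction(num, k**i * factorial(i)))
    return coeffs


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] += pa * qb
    return out


def residual_coeffs(k, j):
    """Coefficients c_{0,k}, ..., c_{(k-1)(j-1),k} appearing in identity (1)."""
    r = taylor_coeffs(k, j)
    power = [Fraction(1)]
    for _ in range(k):
        power = poly_mul(power, r)                 # R_{j-1}(z)**k
    prod = poly_mul([Fraction(1), Fraction(-1)], power)  # (1 - z) R**k
    resid = [-c for c in prod]
    resid[0] += 1                                  # 1 - (1 - z) R**k
    # Identity (1) says the residual is divisible by z**j.
    assert all(c == 0 for c in resid[:j])
    return resid[j:]


if __name__ == "__main__":
    for k, j in [(2, 2), (2, 3), (3, 2)]:
        c = residual_coeffs(k, j)
        print(k, j, c, sum(c))
```

For $k = 2$, $j = 2$ the sketch reproduces the hand computation $1 - (1 - z)(1 + z/2)^2 = z^2 (3/4 + z/4)$.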

Proof: The proof is by induction. For $j = 1$ we have

$$ 1 - (1 - z) R_{j-1}^k(z) = 1 - (1 - z) = z = z c_0, $$

where $c_0 = 1$. We assume that (1) holds for $k \in \mathbb{N}$. Then

$$
\begin{aligned}
1 - (1 - z) R_j^k(z) &= 1 - (1 - z) \bigl( R_{j-1}(z) + b_j z^j \bigr)^k = 1 - (1 - z) \sum_{m=0}^{k} \binom{k}{m} R_{j-1}^m(z)\, b_j^{k-m} z^{(k-m)j} \\
&= 1 - (1 - z) R_{j-1}^k(z) - (1 - z)\, b_j^k z^{kj} - k (1 - z)\, b_j^{k-1} z^{(k-1)j} R_{j-1}(z) \\
&\quad + \sum_{m=2}^{k-1} \binom{k}{m} b_j^{k-m} z^{(k-m)j} \Bigl( -1 + z^j \sum_{i=0}^{(m-1)(j-1)} c_{i,m} z^i \Bigr) \\
&= z^j \Bigl[ -\sum_{m=0}^{k-1} \binom{k}{m} b_j^{k-m} z^{(k-m-1)j} - k b_j^{k-1} \sum_{m=1}^{j-1} b_m z^{m + j(k-2)} \\
&\quad + b_j^k z^{1 + j(k-1)} + k b_j^{k-1} z^{1 + j(k-2)} R_{j-1}(z) + \sum_{m=2}^{k} \binom{k}{m} b_j^{k-m} z^{(k-m)j} \sum_{i=0}^{(m-1)(j-1)} c_{i,m} z^i \Bigr] \\
&= z^j \Bigl[ b_j^k z^{1 + j(k-1)} + b_j^{k-1} z^{(k-1)j} (k b_{j-1} - b_j) + k b_j^{k-1} \sum_{i=1}^{j-1} (b_{i-1} - b_i) z^{i + j(k-2)} \\
&\quad + \sum_{m=2}^{k-1} b_j^{k-m} z^{j(k-m)} \Bigl( \binom{k}{m} c_{0,m} - \binom{k}{m-1} b_j \Bigr) + (c_{0,k} - k b_j) \\
&\quad + \sum_{m=2}^{k} \binom{k}{m} b_j^{k-m} z^{j(k-m)} \sum_{i=1}^{(m-1)(j-1)} c_{i,m} z^i \Bigr].
\end{aligned}
$$

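Now we prove that

$$ c_{0,m} = \frac{m f_m^{(j)}(0)}{j!}. \tag{3} $$

From (1) follows

$$ R_{j-1}^{(j)}(z) = \bigl( h_m(z) f_m(z) \bigr)^{(j)}, \tag{4} $$

where $h_m(z) = g_m(T(z))$, $g_m(T) = T^{1/m}$, and

$$ T(z) = 1 - z^j \sum_{i=0}^{(m-1)(j-1)} c_{i,m} z^i. $$

Since $R_{j-1}$ is a polynomial of degree $j - 1$, the left-hand side of (4) vanishes, and from (4) follows

$$ 0 = h_m(z) f_m^{(j)}(z) + \sum_{i=1}^{j} \binom{j}{i} h_m^{(i)}(z) f_m^{(j-i)}(z). $$

By the formula for the derivatives of a composite function,

$$ h_m^{(i)}(z) = \sum_{n_1, \dots, n_i} \frac{i!}{n_1!\, n_2! \cdots n_i!}\, g_m^{(s)}(T) \prod_{k=1}^{i} \left( \frac{T^{(k)}(z)}{k!} \right)^{n_k}, $$

$s = n_1 + n_2 + \dots + n_i$, where $n_1, \dots, n_i$ are the nonnegative integer solutions of the equation

$$ n_1 + 2 n_2 + \dots + i\, n_i = i. $$

Since $T^{(i)}(0) = 0$ for $1 \le i \le j - 1$, we have $h_m^{(i)}(0) = 0$ for $1 \le i \le j - 1$, and finally

$$ h_m^{(j)}(0) = g_m'(1)\, T^{(j)}(0). $$

Now, using $h_m(0) = 1$, $g_m'(1) = 1/m$ and $T^{(j)}(0) = -j!\, c_{0,m}$,

$$ 0 = f_m^{(j)}(0) - \frac{j!\, c_{0,m}}{m}, $$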

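i.e. (3) is true. Since

$$ k b_{j-1} - b_j = \frac{(k - 1)(kj + 1) \prod_{i=1}^{j-1} \bigl( 1 + k(i - 1) \bigr)}{j!\, k^j} \ge 0 \quad \text{for } k \ge 1, $$

$$ b_{i-1} - b_i \ge \frac{k - 1}{i!\, k^i} \ge 0 \quad \text{for } 1 \le i \le j - 1 \text{ and } k \ge 1, $$

$$ \binom{k}{m} c_{0,m} - \binom{k}{m-1} b_j = \frac{k!}{j!\, (m-1)!\, (k-m)!} \left( \frac{\prod_{i=1}^{j-1} \bigl( i + \frac{1}{m} \bigr)}{m} - \frac{\prod_{i=1}^{j-1} \bigl( i + \frac{1}{k} \bigr)}{k (k - m + 1)} \right) > 0 $$

for $k > m$, and $c_{0,k} - k b_j = 0$, we have

$$ 1 - (1 - z) R_j^k(z) = z^{j+1} \sum_{i=0}^{j(k-1)} \tilde{c}_i z^i, $$

where $\tilde{c}_0, \dots, \tilde{c}_{j(k-1)}$ are nonnegative constants. Setting $z = 1$ gives (2). □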


Theorem 2.2: Let $w$ be a complex number such that $w \ne 0$. We define the sequence $\{z_n\}$ by

$$ z_{n+1} = z_n \sum_{i=0}^{j-1} b_i (1 - w z_n^k)^i, \tag{5} $$

$j \in \mathbb{N}$, $j \ge 2$, where $k$, $b_i$ are as in Theorem 2.1, and $|1 - w z_0^k| < 1$. Then

$$ |1 - w z_n^k| \le |1 - w z_0^k|^{j^n}, \tag{6} $$

$$ \lim_{n \to \infty} z_n = \frac{1}{\sqrt[k]{w}}, \tag{7} $$

where $\sqrt[k]{w}$ is a k-th root of $w$.

Proof: Using Theorem 2.1 we have

$$ 1 - w z_1^k = (1 - w z_0^k)^j \sum_{i=0}^{(k-1)(j-1)} c_{i,k} (1 - w z_0^k)^i, $$

and

$$ |1 - w z_1^k| \le |1 - w z_0^k|^j. $$

Repeating this argument we have (6). From (6) it follows

$$ \lim_{n \to \infty} |1 - w z_n^k| = 0. \tag{8} $$

Now we prove that the sequence (5) is a Cauchy sequence and, hence, convergent. Since $\kappa := |1 - w z_0^k|^j < 1$, from the definition of $\{z_n\}$ and (6) we obtain

$$ |z_{n+1} - z_n| \le |z_n| \sum_{i=1}^{j-1} b_i |1 - w z_n^k|^i \le |z_n| \sum_{i=1}^{j-1} b_i \left( |1 - w z_0^k|^{j^n} \right)^i \le M \kappa^n $$

with a certain constant $M > 0$. Hence, for each $s \in \mathbb{N}$ we have

$$ |z_{n+s} - z_n| \le \sum_{l=0}^{s-1} |z_{n+s-l} - z_{n+s-l-1}| \le M \sum_{l=0}^{s-1} \kappa^{n+s-l-1} \le \frac{M}{1 - \kappa}\, \kappa^n \to 0 $$

for $n \to \infty$. Since the sequence (5) is convergent, (7) follows from (8). □

In the following we say that $P(A)$ is a function of $A \in \mathbb{C}^{n,n}$ if it is a matrix function in the sense of matrix theory, i.e., a polynomial in $A$, see, e.g., [4].
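The scalar iteration (5) of Theorem 2.2 can be tried out directly. The sketch below is an illustration only, not the author's code: the names `b_coeffs` and `inverse_kth_root`, and the sample values $w = 2$, $z_0 = 0.7$, are assumptions made for this example. It records the residuals $|1 - w z_n^k|$ so that the bound (6) can be inspected.

```python
from math import factorial, prod


def b_coeffs(k, j):
    """b_0, ..., b_{j-1} as defined in Theorem 2.1."""
    return [prod(1 + m * k for m in range(i)) / (k**i * factorial(i))
            for i in range(j)]


def inverse_kth_root(w, k, j, z0, steps):
    """Run iteration (5); return the last iterate and the residuals |1 - w z_n^k|."""
    b = b_coeffs(k, j)
    z = complex(z0)
    residuals = [abs(1 - w * z**k)]
    for _ in range(steps):
        u = 1 - w * z**k
        z = z * sum(bi * u**i for i, bi in enumerate(b))
        residuals.append(abs(1 - w * z**k))
    return z, residuals


if __name__ == "__main__":
    # Hypothetical test values: w = 2, k = 3, j = 3, z_0 = 0.7 (so |1 - w z_0^3| < 1).
    z, residuals = inverse_kth_root(2.0, k=3, j=3, z0=0.7, steps=5)
    print(z, residuals)
```

With a real positive $w$ and a real positive starting value the iterates stay real, so the limit is the real root $w^{-1/k}$; for complex data the limit is whichever k-th root the starting value selects.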

Algorithm (I): Let $X_0 \in \mathbb{C}^{n,n}$ be a function of $A$, and let $j$, $k$, $b_i$ be as in Theorem 2.2. With $S_0 = A X_0^k$, for $n = 0, 1, 2, \dots$ we define the following matrix sequences:

$$ X_{n+1} = X_n \sum_{i=0}^{j-1} b_i (I - S_n)^i, \qquad S_{n+1} = S_n \left( \sum_{i=0}^{j-1} b_i (I - S_n)^i \right)^k. $$

It is easily seen by induction that $S_n = A X_n^k$ for all $n$.
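Algorithm (I) translates almost line by line into NumPy. The following is a minimal sketch under stated assumptions, not the author's implementation: the function name `algorithm_I`, the default starting matrix $X_0 = I$, the Frobenius-norm stopping test on $I - S_n$, and the parameters `tol` and `max_iter` are all choices made for this illustration.

```python
import numpy as np
from math import factorial, prod


def b_coeffs(k, j):
    """b_0, ..., b_{j-1} as defined in Theorem 2.1."""
    return [prod(1 + m * k for m in range(i)) / (k**i * factorial(i))
            for i in range(j)]


def algorithm_I(A, k, j, X0=None, tol=1e-12, max_iter=50):
    """Approximate A^(-1/k) by X_{n+1} = X_n P_n, S_{n+1} = S_n P_n^k,
    where P_n = sum_i b_i (I - S_n)^i. Returns (X, iterations used)."""
    A = np.asarray(A)
    A = A.astype(np.result_type(A.dtype, np.float64))
    I = np.eye(A.shape[0], dtype=A.dtype)
    X = I.copy() if X0 is None else np.asarray(X0, dtype=A.dtype)
    S = A @ np.linalg.matrix_power(X, k)          # S_0 = A X_0^k
    b = b_coeffs(k, j)
    for n in range(max_iter):
        E = I - S
        if np.linalg.norm(E, "fro") <= tol:       # assumed stopping criterion
            return X, n
        # Horner evaluation of P = b_0 I + b_1 E + ... + b_{j-1} E^{j-1}
        P = b[-1] * I
        for bi in reversed(b[:-1]):
            P = P @ E + bi * I
        X = X @ P
        S = S @ np.linalg.matrix_power(P, k)
    return X, max_iter


if __name__ == "__main__":
    A = np.array([[1.2, 0.1], [0.0, 0.9]])        # hypothetical test matrix
    X, n_iter = algorithm_I(A, k=3, j=3)
    print(n_iter, np.linalg.norm(np.eye(2) - A @ X @ X @ X, "fro"))
```

Passing the inverse of a matrix instead of the matrix itself yields an approximation of the k-th root rather than the inverse k-th root.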

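Special cases of Algorithm (I) for square and cubic roots are:

$k = 2$, $j = 2$:

$$ X_{n+1} = \tfrac{1}{2} X_n (3I - S_n), \qquad S_{n+1} = \tfrac{1}{4} S_n (3I - S_n)^2; $$

$k = 2$, $j = 3$:

$$ X_{n+1} = \tfrac{1}{8} X_n (3S_n^2 - 10 S_n + 15 I), \qquad S_{n+1} = \tfrac{1}{64} S_n (3S_n^2 - 10 S_n + 15 I)^2; $$

$k = 3$, $j = 2$:

$$ X_{n+1} = \tfrac{1}{3} X_n (4I - S_n), \qquad S_{n+1} = \tfrac{1}{27} S_n (4I - S_n)^3; $$

$k = 3$, $j = 3$:

$$ X_{n+1} = \tfrac{1}{9} X_n (2S_n^2 - 7 S_n + 14 I), \qquad S_{n+1} = \tfrac{1}{729} S_n (2S_n^2 - 7 S_n + 14 I)^3. $$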
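The polynomials in these special cases follow from expanding $\sum_i b_i (1 - s)^i$ in powers of $s$. As a small check, assuming nothing beyond the definition of $b_i$ in Theorem 2.1, the sketch below performs that expansion in exact arithmetic; the function name `step_polynomial` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, prod


def step_polynomial(k, j):
    """Coefficients (constant term first) of sum_i b_i (1 - s)^i as a polynomial in s."""
    b = [Fraction(prod(1 + m * k for m in range(i)), k**i * factorial(i))
         for i in range(j)]
    # (1 - s)^i contributes comb(i, p) * (-1)^p to the coefficient of s^p.
    return [sum(b[i] * comb(i, p) * (-1) ** p for i in range(p, j))
            for p in range(j)]


if __name__ == "__main__":
    for k, j in [(2, 2), (2, 3), (3, 2), (3, 3)]:
        print(k, j, step_polynomial(k, j))
```

For example, $k = 2$, $j = 3$ gives the coefficients $15/8$, $-10/8$, $3/8$, that is, $\tfrac{1}{8}(3s^2 - 10s + 15)$.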

Theorem 2.3: Let $A \in \mathbb{C}^{n,n}$ be a nonsingular and diagonalizable matrix. Let $\{X_n\}$, $\{S_n\}$ be the sequences defined by Algorithm (I), and suppose

$$ \|I - S_0\| < 1, \tag{9} $$

where $\|\cdot\|$ is a multiplicative matrix norm. Then

$$ \lim_{n \to \infty} X_n = A^{-1/k}, \qquad \lim_{n \to \infty} S_n = I, $$

where $A^{-1/k}$ is an inverse k-th root of $A$, and the convergence is of Q-order $j$ in the sense of

$$ \|I - A X_n^k\| = \mathcal{O}\bigl( \|I - A X_{n-1}^k\|^j \bigr). \tag{10} $$

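Proof: From (I) follows

$$ X_{n+1} = X_n \sum_{i=0}^{j-1} b_i (I - A X_n^k)^i. $$

Let

$$ V^{-1} A V = \Lambda = \operatorname{diag}\{\lambda_1, \dots, \lambda_n\}, $$

and define $K_n = V^{-1} X_n V$, $K_0 = V^{-1} X_0 V$. Then

$$ K_{n+1} = K_n \sum_{i=0}^{j-1} b_i (I - \Lambda K_n^k)^i. \tag{11} $$

From the above equation follows that $\{K_n\}$ is a sequence of diagonal matrices, i.e.,

$$ K_n = \operatorname{diag}\{\kappa_1^{(n)}, \dots, \kappa_n^{(n)}\}, \quad n = 0, 1, \dots, $$

and $K_0$ is a function of $\Lambda = \operatorname{diag}\{\lambda_1, \dots, \lambda_n\}$. The matrix sequence (11) is equivalent to the $n$ scalar sequences

$$ \kappa_i^{(n+1)} = \kappa_i^{(n)} \sum_{m=0}^{j-1} b_m \bigl( 1 - \lambda_i (\kappa_i^{(n)})^k \bigr)^m, \quad i = 1, \dots, n, \tag{12} $$

and $\kappa_i^{(0)}$ is a function of $\lambda_i$. Since $\|I - A X_0^k\| < 1$, there holds

$$ \rho(I - \Lambda K_0^k) = \rho(I - A X_0^k) < 1, $$

where $\rho$ denotes the spectral radius, so we have $|1 - \lambda_i (\kappa_i^{(0)})^k| < 1$. By Theorem 2.2 for the sequence (12) there holds

$$ \lim_{n \to \infty} \kappa_i^{(n)} = \lambda_i^{-1/k}, $$

and

$$ 1 - \lambda_i (\kappa_i^{(n)})^k = \bigl( 1 - \lambda_i (\kappa_i^{(n-1)})^k \bigr)^j \sum_{m=0}^{(k-1)(j-1)} c_{m,k} \bigl( 1 - \lambda_i (\kappa_i^{(n-1)})^k \bigr)^m. $$

This implies

$$ \lim_{n \to \infty} K_n = \Lambda^{-1/k} = \operatorname{diag}\bigl\{ \lambda_1^{-1/k}, \dots, \lambda_n^{-1/k} \bigr\}, \qquad \lim_{n \to \infty} X_n = A^{-1/k}, $$

and

$$ I - \Lambda K_n^k = (I - \Lambda K_{n-1}^k)^j \sum_{m=0}^{(k-1)(j-1)} c_{m,k} (I - \Lambda K_{n-1}^k)^m, $$

hence

$$ I - A X_n^k = (I - A X_{n-1}^k)^j \sum_{m=0}^{(k-1)(j-1)} c_{m,k} (I - A X_{n-1}^k)^m. $$

Taking the norm of the above equation leads to (10). Finally, from (I) we have $\lim_{n \to \infty} S_n = I$. □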

Remark: Obviously, if $A$ is replaced by $A^{-1}$ in Algorithm (I), then

$$ \lim_{n \to \infty} X_n = A^{1/k}. $$

If the matrix $A \in \mathbb{C}^{n,n}$ is general, the costs of Algorithm (I) are approximately $(j + k - 1) n^3$ flops per iteration, and $(j + k - 1) n^3 / 2$ flops per iteration for Hermitian $A$, if matrix multiplications are realized in the standard way.

The condition (9) is not necessary for the convergence of the method (I). This is illustrated in Example 2 of Section 4.

In [1] an iterative method was proposed to find $A^{-1/k}$ for the case when $A$ is positive definite and diagonalizable. In [2] an iterative method was proposed for the case when $A$ is real and diagonalizable. The method (I) has advantages over the methods in [1], [2] when $A$ is a complex matrix which may not be positive definite.

For the quadratically convergent method in [5] the costs are approximately $(k^2 + \tfrac{3}{2} k) n^3$ flops per iteration. Comparing the method (I) with the method in [5], we see that the method in [5] is computationally more expensive than the method (I).
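The two cost estimates are easy to tabulate. The snippet below is a small sketch that simply evaluates the formulas quoted above; the function names `cost_method_I` and `cost_method_ref5` are hypothetical, and the flop counts are the approximate per-iteration figures from the text, not measured values.

```python
def cost_method_I(n, k, j, hermitian=False):
    """Approximate flops per iteration of Algorithm (I): (j + k - 1) n^3,
    halved for Hermitian A."""
    flops = (j + k - 1) * n**3
    return flops / 2 if hermitian else flops


def cost_method_ref5(n, k):
    """Approximate flops per iteration of the method in [5]: (k^2 + 3k/2) n^3."""
    return (k**2 + 1.5 * k) * n**3


if __name__ == "__main__":
    n, k = 10, 3
    for j in (2, 3, 5):
        print(j, cost_method_I(n, k, j), cost_method_ref5(n, k))
```

For $n = 10$, $k = 3$ these formulas give 7000 flops per iteration for method (I) with $j = 5$ and 13500 for the method in [5], the figures used in Example 1 of Section 4.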


3. Stability analysis

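Assume that at the n-th step errors $P_n$ and $Q_n$ are introduced in $X_n$ and $S_n$, respectively, where $P_n = \mathcal{O}(\varepsilon)$ and $Q_n = \mathcal{O}(\varepsilon)$. Let $\tilde{X}_n$ and $\tilde{S}_n$ be the computed matrices of this step, i.e.,

$$ \tilde{X}_n = X_n + P_n, \qquad \tilde{S}_n = S_n + Q_n. $$

We define

$$ \tilde{P}_n = V^{-1} P_n V, \qquad \tilde{Q}_n = V^{-1} Q_n V, \qquad H_n = V^{-1} S_n V. $$

From

$$ \tilde{X}_{n+1} = \tilde{X}_n \sum_{i=0}^{j-1} b_i (I - \tilde{S}_n)^i, \qquad \tilde{S}_{n+1} = \tilde{S}_n \left( \sum_{i=0}^{j-1} b_i (I - \tilde{S}_n)^i \right)^k $$

direct calculations give

$$ \tilde{P}_{n+1} = -K_n \sum_{i=1}^{j-1} b_i \sum_{m=0}^{i-1} (I - H_n)^m\, \tilde{Q}_n\, (I - H_n)^{i-m-1} + \tilde{P}_n \sum_{i=0}^{j-1} b_i (I - H_n)^i + \mathcal{O}(\varepsilon^2), $$

$$
\begin{aligned}
\tilde{Q}_{n+1} = {} & -H_n \sum_{m=0}^{k-1} \left( \sum_{i=0}^{j-1} b_i (I - H_n)^i \right)^{m} \left( \sum_{i=1}^{j-1} b_i \sum_{\ell=0}^{i-1} (I - H_n)^{\ell}\, \tilde{Q}_n\, (I - H_n)^{i-\ell-1} \right) \left( \sum_{i=0}^{j-1} b_i (I - H_n)^i \right)^{k-m-1} \\
& + \tilde{Q}_n \left( \sum_{i=0}^{j-1} b_i (I - H_n)^i \right)^{k} + \mathcal{O}(\varepsilon^2).
\end{aligned}
$$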

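Writing the above equations element-wise, with $h_r^{(n)}$ and $\kappa_r^{(n)}$ denoting the r-th diagonal entries of $H_n$ and $K_n$, we obtain

$$ \tilde{q}_{rs}^{(n+1)} = d_{rs}^{(n)} \tilde{q}_{rs}^{(n)} + \mathcal{O}(\varepsilon^2), \qquad \tilde{p}_{rs}^{(n+1)} = v_{rs}^{(n)} \tilde{q}_{rs}^{(n)} + g_{rs}^{(n)} \tilde{p}_{rs}^{(n)} + \mathcal{O}(\varepsilon^2), $$

where

$$ v_{rs}^{(n)} = -\kappa_r^{(n)} \sum_{i=1}^{j-1} b_i \sum_{m=0}^{i-1} (1 - h_r^{(n)})^m (1 - h_s^{(n)})^{i-m-1}, $$

$$ g_{rs}^{(n)} = \sum_{i=0}^{j-1} b_i (1 - h_s^{(n)})^i, $$

$$
\begin{aligned}
d_{rs}^{(n)} = {} & -h_r^{(n)} \sum_{m=0}^{k-1} \left( \sum_{i=0}^{j-1} b_i (1 - h_r^{(n)})^i \right)^{m} \left( \sum_{i=1}^{j-1} b_i \sum_{\ell=0}^{i-1} (1 - h_r^{(n)})^{\ell} (1 - h_s^{(n)})^{i-\ell-1} \right) \left( \sum_{i=0}^{j-1} b_i (1 - h_s^{(n)})^i \right)^{k-m-1} \\
& + \left( \sum_{i=0}^{j-1} b_i (1 - h_s^{(n)})^i \right)^{k}
\end{aligned}
$$

for $r, s = 1, 2, \dots, n$. Let

$$ e_{rs}^{(n)} = \begin{bmatrix} \tilde{q}_{rs}^{(n)} \\ \tilde{p}_{rs}^{(n)} \end{bmatrix}. $$

Then we have

$$ e_{rs}^{(n+1)} = W_{rs}^{(n)} e_{rs}^{(n)} + \mathcal{O}(\varepsilon^2) $$

with

$$ W_{rs}^{(n)} = \begin{bmatrix} d_{rs}^{(n)} & 0 \\ v_{rs}^{(n)} & g_{rs}^{(n)} \end{bmatrix}. $$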

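Since

$$ \lim_{n \to \infty} d_{rs}^{(n)} = 1 - k b_1 = 0, \qquad \lim_{n \to \infty} g_{rs}^{(n)} = 1, \qquad \lim_{n \to \infty} v_{rs}^{(n)} = -\frac{1}{k} \lambda_r^{-1/k}, $$

we have

$$ W_{rs}^{(n)} = W_{rs} + \mathcal{O}(\varepsilon^{(n)}), \qquad W_{rs} = \begin{bmatrix} 0 & 0 \\ -\frac{1}{k} \lambda_r^{-1/k} & 1 \end{bmatrix}, $$

where $\varepsilon^{(n)}$ is sufficiently small for large $n$. The limit matrix $W_{rs}$ has the eigenvalues 0 and 1. Let $z_0$ and $z_1$ be the corresponding eigenvectors, so

$$ e_{rs}^{(n)} = u_0^{(n)} z_0 + u_1^{(n)} z_1. $$

For sufficiently small $\varepsilon$ and large $n$ we have

$$ e_{rs}^{(n+m)} = W_{rs}^m e_{rs}^{(n)} = u_1^{(n)} z_1, \quad m = 1, 2, \dots. $$

Consequently $\|e_{rs}^{(n+m)}\| = \|e_{rs}^{(n+1)}\|$, and Algorithm (I) is locally stable.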
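The key property behind this argument is that the limit matrix $W_{rs}$ is idempotent: one application removes the component along $z_0$, and further applications leave the error unchanged. The sketch below illustrates this numerically; the values $k = 3$, $\lambda_r = 2$ and the sample error vector are hypothetical and chosen only for the demonstration.

```python
import numpy as np

# Hypothetical sample data: root index k and one eigenvalue lambda_r of A.
k, lam = 3, 2.0
v = -(1.0 / k) * lam ** (-1.0 / k)

# Limit error-propagation matrix W_rs from the stability analysis.
W = np.array([[0.0, 0.0],
              [v, 1.0]])

eigenvalues = np.sort(np.linalg.eigvals(W).real)

# A sample first-order error e = (q, p)^T and its images under W and W^5.
e = np.array([1e-8, -3e-8])
e_after_1 = W @ e
e_after_5 = np.linalg.matrix_power(W, 5) @ e

if __name__ == "__main__":
    print("eigenvalues:", eigenvalues)
    print("W @ W == W:", np.allclose(W @ W, W))
    print("error after 1 step: ", e_after_1)
    print("error after 5 steps:", e_after_5)
```

Because $W_{rs}^2 = W_{rs}$, the printed error vectors after one and after five steps coincide, which is the first-order statement that errors are not amplified.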

4. Numerical examples

In this section we use the Frobenius matrix norm

$$ \|A\| = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2}, $$

and the errors $e_n = \|X_n^k - A\|$, $f_n = \|I - A X_n^k\|$. The inequalities $e_n \le t$, $f_n \le t$, where $t$ is a given error tolerance, are used as stopping criteria.

Example 1: In this example we compare the method (I) with the quadratically convergent method in [5, p. 218] and the cubically convergent alternative stable method in [3, p. 869].

Let $A \in \mathbb{R}^{10,10}$ be the matrix defined by

$$ a_{ij} = \begin{cases} -0.1 & \text{if } i < j, \\ 1 & \text{if } i = j, \\ 0.1 & \text{if } i > j. \end{cases} $$

Let $t = 10^{-9}$.

It is desired to find $A^{1/3}$. The method in [5] converges within 4 iterations. The costs (for 4 iterations) are approximately 4 × 13500 = 54000 flops in total. We use the method (I) with 5th order of convergence ($j = 5$) with $X_0 = I$, hence $S_0 = A^{-1}$, $\|I - S_0\| = 0.8183$. The method (I) converges within 2 iterations. The costs (for 2 iterations) are approximately 2 × 7000 = 14000 flops in total.

It is desired to find $A^{-1/2}$. The cubically convergent method in [3, p. 869] converges within 3 iterations, $f_3 < 10^{-9}$. The costs are approximately 3 × 4000 = 12000 flops in total. The method (I) with 3rd order of convergence ($j = 3$) and $X_0 = I$, $S_0 = A$, $\|I - S_0\| = 0.9487$ converges within 3 iterations, $f_3 < 10^{-9}$. The costs are approximately 3 × 4000 = 12000 flops in total.
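The inverse square root part of this example can be set up in a few lines. The sketch below is an illustration under assumptions, not a reproduction of the author's experiment: it builds the matrix from the definition above, uses the $k = 2$, $j = 3$ special case of Algorithm (I) with $X_0 = I$, and records $f_n$ in a list named `history` (a hypothetical name). Iteration counts may depend on the floating-point environment, so none is asserted here.

```python
import numpy as np

n = 10
ones = np.ones((n, n))
# a_ij = -0.1 above the diagonal, 1 on it, 0.1 below it.
A = np.eye(n) + 0.1 * (np.tril(ones, -1) - np.triu(ones, 1))

I = np.eye(n)
X, S = I.copy(), A.copy()                 # X_0 = I, S_0 = A X_0^2 = A
history = [np.linalg.norm(I - S, "fro")]  # f_0 = ||I - S_0||

tol = 1e-9
while history[-1] > tol and len(history) <= 20:
    # k = 2, j = 3 special case of Algorithm (I)
    P = (3 * S @ S - 10 * S + 15 * I) / 8
    X = X @ P
    S = S @ P @ P
    history.append(np.linalg.norm(I - S, "fro"))

if __name__ == "__main__":
    print("f_n:", history)
    print("check ||I - A X^2||:", np.linalg.norm(I - A @ X @ X, "fro"))
```

The first recorded value is $\|I - A\| = \sqrt{90 \cdot 0.01} \approx 0.9487$, matching the value of $\|I - S_0\|$ quoted in the example.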

In general, it is difficult to say which algorithm is better for the computation of matrix roots, since their behavior depends on the structure of the matrix under consideration.

Example 2: Let $A$ be the following matrix,

$$ A = \begin{bmatrix} 40 & 0.1 & 39 & 0 \\ 0.1 & 40 & 0 & 39 \\ 39 & 0 & 40 & 0 \\ 0 & 39 & 0 & 40 \end{bmatrix}. $$

It is desired to find $A^{1/3}$. If $X_0 = I$ then $\|I - S_0\| = 1.39$, and after 5 iterations of the method (I) with $j = 5$ we have $e_5 = 0.25 \times 10^{-12}$. This example illustrates that the condition (9) is not a necessary condition for the convergence of the method (I). Actually, method (I) converges because $\rho(I - S_0) = 0.9873 < 1$.
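The two quantities that matter in this example, the Frobenius norm and the spectral radius of $I - S_0$, can be computed directly. The sketch below is an illustration under assumptions: it takes $S_0 = A^{-1}$ (Algorithm (I) applied to $A^{-1}$ with $X_0 = I$, following the Remark in Section 2), runs five steps with $k = 3$, $j = 5$, and uses hypothetical variable names `fro0`, `rho0` and `e5`. The exact value of $e_5$ will depend on the floating-point environment.

```python
import numpy as np
from math import factorial, prod

A = np.array([[40.0, 0.1, 39.0, 0.0],
              [0.1, 40.0, 0.0, 39.0],
              [39.0, 0.0, 40.0, 0.0],
              [0.0, 39.0, 0.0, 40.0]])
I = np.eye(4)

# To obtain A^(1/3), Algorithm (I) is applied to A^(-1); with X_0 = I, S_0 = A^(-1).
S = np.linalg.inv(A)
X = I.copy()

fro0 = np.linalg.norm(I - S, "fro")            # exceeds 1: condition (9) fails
rho0 = max(abs(np.linalg.eigvals(I - S)))      # below 1: the iteration still converges

k, j = 3, 5
b = [prod(1 + m * k for m in range(i)) / (k**i * factorial(i)) for i in range(j)]

for _ in range(5):
    E = I - S
    P = sum(bi * np.linalg.matrix_power(E, i) for i, bi in enumerate(b))
    X = X @ P
    S = S @ np.linalg.matrix_power(P, k)

e5 = np.linalg.norm(np.linalg.matrix_power(X, 3) - A, "fro")

if __name__ == "__main__":
    print("||I - S_0||_F =", fro0)
    print("rho(I - S_0)  =", rho0)
    print("e_5           =", e5)
```

The norm exceeds 1 while the spectral radius stays below 1, which is the situation the example is meant to illustrate.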

Double precision arithmetic was used for the two examples.

Acknowledgment: The author would like to thank the referees for their helpful suggestions and discussions.

References

1 Denman, E. D.: Roots of real matrices. Linear Algebra Appl. 36 (1981), 133–139.

2 Hoskins, W. D.; Walton, D. J.: A faster more stable method for computing n-th roots of positive definite matrices. Linear Algebra Appl. 26 (1979), 139–164.

3 Lakić, S.: An iterative method for the computation of a matrix inverse square root. Z. Angew. Math. Mech. 75 (1995), 867–873.

4 Lancaster, P.: Theory of matrices. Academic Press, New York 1969.

5 Tsay, Y. T.; Tsai, J. S. H.; Shieh, L. S.: A fast method for computing the principal n-th roots of complex matrices. Linear Algebra Appl. 76 (1986), 205–221.

Received May 16, 1995, final revision May 6, 1997, accepted May 14, 1997

Address: Dr. Slobodan Lakić, Technical Faculty 'Mihajlo Pupin', University of Novi Sad, YU-23000 Zrenjanin, Yugoslavia

