biharmonic in rectangle (2008)

8/6/2019 Biharmonic in Rectangle (2008)

1/37

A fast direct solver for the biharmonic problem in

a rectangular grid

Matania Ben-Artzi

Institute of Mathematics

The Hebrew University, Jerusalem 91904, Israel

email: [email protected]

Jean-Pierre Croisille

Department of Mathematics, LMAM, UMR 7122

University Paul Verlaine-Metz

57045 Metz, France


Dalia Fishelov

Afeka - Tel-Aviv Academic College of Engineering

218 Bnei-Efraim St., Tel-Aviv 69107, Israel


April 11, 2008

1


2/37

Abstract.We present a fast direct solver methodology for the Dirichlet biharmonic prob-lem in a rectangle. The solver is applicable in the case of the second orderStephenson scheme [34] as well as in the case of a new fourth order scheme, whichis discussed in this paper. It is based on the capacitance matrix method ([10],[8]). The discrete biharmonic operator is decomposed into two components.

The first is a diagonal operator in the eigenfunction basis of the Laplacian, towhich the FFT algorithm is applied. The second is a low rank perturbationoperator (given by the capacitance matrix), which is due to the deviation of thediscrete operators from diagonal form. The Sherman-Morrison formula [18] isapplied to obtain a fast solution of the resulting linear system of equations.

Keywords: Fast solver, FFT, biharmonic problem, capacitance matrix method,Sherman-Morrison formula, Navier-Stokes equations, streamfunction formula-tion, vorticity, compact schemes, driven cavity, Stephenson scheme.

1 Introduction

The accurate resolution of linear fourth order elliptic problems, such as the bi-harmonic equation, is of great importance in many fields of applied mathematics.Specific examples are the computation of the streamfunction in incompressiblefluid dynamics or in problems in elasticity theory. From the numerical point ofview, two essential issues are the accuracy of the scheme and the availability ofa fast solver.

Finite-difference schemes for the biharmonic problem on rectangles fall broadlyinto two categories: Coupled and non-coupled schemes. In coupled schemes, theproblem is decomposed into two Poisson problems, combined with boundaryconditions. Classical works include [32], [33], [14], [28]. When one applies sucha scheme to the Dirichlet biharmonic problem, one encounters difficulties in thedesign of artificial boundary conditions for the first Poisson problem, whereasthe second one is overdetermined. In the fluid dynamics context, this question

is related to the design of an artificial boundary condition for the vorticity ofthe flow.Here we focus on a non-coupled scheme, namely the nine point (second

order) Stephenson scheme introduced in [34]. This scheme has been successfullyapplied to the numerical simulation of the time-dependent Navier-Stokes system(see [17], [4], [3]). Its main advantage, in contrast with the usual thirteen pointscheme (see e.g. [24, 12, 10, 8]), is that it allows the construction of a time-dependent solver, which is closely related to the partial differential problem. Acrucial feature of the scheme is the absence of artificial boundary conditions onthe vorticity. This permits the construction of a compact scheme, which handlesnear boundary points in the same manner as internal points (see also [20], [19]).The solution of the linear system of discretized equations requires a fast solver.

The main goal of this paper is to present a fast solver for compact dis-

cretizations of the biharmonic problem. This is done by the capacitance matrix

2


3/37

formulation. In the first part of the paper we construct the capacitance matrixformulation for the Stephenson scheme on a rectangular grid. We show how itis efficiently applied to the solution of the discrete biharmonic problem.

In the second part of the paper, a fourth order accurate extension of theStephenson scheme is introduced. The new fourth order discrete biharmonicequation is solved using a modified version of the second order algorithm. The

key feature of the scheme consists of simple representation of the capacitancematrix in the eigenfunction basis of the Laplacian. It enables us to efficientlycompute the capacitance matrix and to solve the resulting low-rank linear sys-tem using a diagonal preconditioned conjugate-gradient method. In addition,the gradient and the Laplacian of the solution of the biharmonic problem areobtained as byproducts of the solver.

To put this paper in the context of existing literature, we make the followingremarks.

A fourth order version of the Stephenson scheme was introduced in [34].A multigrid solver for this scheme is devised in [1]. In the context of theequation 2 = f, it uses the values of f at five points (in the nine pointstencil). In contrast, our scheme uses only the value of f at the center

point. When f itself is a function of (e.g., ), this leads to a significantsimplification of the algorithm.

A scheme based on orthogonal spline collocation has been investigated in[35], [26], [5]. This scheme is fourth order accurate and the linear systemis solvable at a cost of O(N2 Log(N)) operations, where N is the numberof grid points in one direction. An almost block diagonal solver, [16], [2],is used as a basis of the fast solver in [5].

The algorithm presented here runs parallel the lines of the capacitance ma-trix principle of Golub (appendix of [15]), Buzzbee and Dorr [10]( thirteenpoints scheme, O(N3) solver), Bjrstad [8](thirteen points scheme, O(N2)solver), Bjrstad and Tjstheim [9], and Legendre or Hermite Galerkinscheme of Shen [30], [31], O(N3) solver. It consists basically of decompo-sition of the matrix, which represents the scheme, into a sum of a diagonaloperator in the eigenfunction basis of the (five point) Laplacian, and ofa low-rank perturbation, called the capacitance matrix ([11]). Our algo-rithm is based on two ingredients: (i) application of the Sherman-Morrisonformula with a low-rank capacitance matrix, (ii) resolution of the diago-nal component in the inversion formula by the FFT method (see [7] for asimilar application). We emphasize the fact that our algorithm is a direct(whereas in [21], [22] it is an iterative) solver.

The outline of the paper is as follows. In Section 2 we recall the second orderStephenson scheme, using the notation introduced in [4], [3]. In Section 3, wedevelop our fast solver for the biharmonic problem, including detailed algebraicderivation for the nine point second order Stephenson operator. Section 4 isdevoted to the new fourth order scheme, for which a similar fast algorithm

3


4/37

is developed. Finally, in Section 5, we represent numerical results for boththe second order and the fourth order schemes. The problems dealt with aremixed biharmonic-Laplacian problems, subject to non-homogeneous Dirichletboundary conditions. Computing efficiency are reported for several fourth orderproblems. The computational cost appears to be O(N2 Log(N)), where N isthe number of grid points in each direction. We note that a typical computing

time of the solver for a 1024 1024 grid is ten seconds on a PC.Finally, let us mention the work [29] for a comprehensive history of the

biharmonic problem in two dimensions.

2 Notation

2.1 Finite difference operators

We consider a square = (0, L)2 with a uniform grid xi = ih, yj = jh, i, j =0,...N, h = 1/N. The internal points are xi = ih, yj = jh, for 1 i, j N 1.Denote by l2h,0 the space of one-dimensional discrete functions (ui) defined on

xi = ih, 1 i N1. In two dimensions, L2h,0 is the space of discrete functions(i,j) defined on xi = ih, yj = jh, for 1 i, j N 1. The spaces l

2

h,0, L2

h,0are respectively equipped with the scalar products

(1) (u, v)h = hN1i=1

uivi, u, v l2h,0 ; (u, v)h = h2N1i,j=1

ui,jvi,j , u, v L2h,0.

For discrete function in L2h,0 we define the following centered operators

(2)

xi,j =

i+1,j i1,j2h

, 2xi,j =i+1,j + i1,j 2i,j

h2

yi,j =i,j+1 i,j1

2h, 2yi,j =

i,j+1 + i,j1 2i,jh2

.

and the mixed discrete derivative operator xy

(3)xyi,j = xyi,j =

i+1,j+1 i1,j+1 i+1,j1 + i1,j14h2

, 1 i, j N1.

Consider the following fourth order partial differential problem in the square = (0, L)2.

(4)

a + b2(x, y) = f , (x, y) (x, y) = 0 ,

n(x, y) = 0 , (x, y) ,

where a 0, b > 0 are two real constants. Problem (4) is approximated by thescheme

(5) ah + b2h

i,j = fi,j, at interior points (i, j)

i,j = 0 , ni,j = 0, at boundary points (i, j),4


5/37

where

hi,j is the five points discrete Laplacian(6) hi,j =

2xi,j +

2yi,j , 1 i, j N 1.

and 2

h is the nine points discrete biharmonic (Stephenson approxima-tion), defined by

(7) 2hi,j = 4xi,j +

4yi,j + 2

2x

2yi,j.

2x2y is the nine points operator defined by

2x

2y =

2x 2y. The finite difference

operators 4x, 4y (introduced in [4], [3]), approximating

4

x4 and4

y4 are theone-dimensional Stephenson operators

(8)

4x =12

h2

xx 2x

4y =12

h2

yy 2y

.

For any L2h,0, the Hermitian gradient (x, y) (L2h,0)2 in (8) is defined bythe following relations between and x and between and y.

(9)

(I+

h2

62x)x,i,j = xi,j , 1 i, j N 1

(I+h2

62y)y,i,j = yi,j , 1 i, j N 1.

The latter form fourth order approximation for x for given values of andfourth order approximations for y for given values of . The two operators

4x, 4y are fourth order approximations for

4

x4 and

4

y4 in the Fourier

sense (see [3]). Then, the numerical scheme (5) may be written in the followingway. Find i,j L2h,0 such that

(10) ah + b2hi,j = fi,j.The operator ah + b2h is second order accurate because of the second orderaccuracy of the mixed term 2x

2y in (7) and of the five points Laplacian (6).

However, we will see in Section 4 that the order of accuracy can be easilyincreased to fourth order by a slight modification of 2h and h, keeping apointwise source term fi,j.

2.2 Matrix operators

We relate the bidimensional finite difference operators acting in L2h,0 with matrix

operators of size (N 1) (N 1) (N 2), acting on a vector ui,j L2h,0.Most of those operators are obtained as Kronecker products of ( N1)(N1)matrices.

5


6/37

We use the indexing of ui,j L2h,0 as the column vector

(11) U =

u1,1,...u1,N1; u2,1, ...u2,N1; ...; uN1,1, ...uN1,N1

T R(N1)2

The bottom ordering of vector U R(N1)2

is obtained by letting the indexj vary first. The notation vect stands indifferently for the operator whichtransforms any one-dimensional discrete function ui l2h,0 to the vector U =[u1, . . ,uN1]

T RN1 or any two-dimensional discrete function ui,j L2h,0 tothe vector U defined by (11) (see [25]). 1 Recall that the Kronecker product ofthe matrices A Mm,n and B Mp,q is the matrix A B Mmp,nq defined by

(12) A B =

a1,1B a1,2B ... a1,nB

...

...am,1B a(N1),2B ... am,nB

.Let us define several (N 1) (N 1) matrices that will be useful for therepresentation of our fast algorithm.

Centered one-dimensional Laplacian matrixThe symmetric positive definite matrix T is the standard tridiagonal ma-trix related to the 3 points Laplacian

(13) Ti,m =

2, m = i1, |m i| = 10, |m i| 2

T =

2 1 0 . . . 0

1 2 1 . . . 0...

...... . . .

...0 . . . 1 2 10 . . . 0 1 2

.

The symmetric positive definite matrix P is deduced from T by

(14) P = 6I T,

or equivalently by

(15) Pi,m =

4, m = i

1, |m i| = 10, |m i| 2

, P =

4 1 0 . . . 01 4 1 . . . 0...

...... . . .

...0 . . . 1 4 10 . . . 0 1 4

.

We also define P = I 16T.1Actually , in [25], the ordering of U is by lines.

6


7/37

The antisymmetric matrix K = (Ki,m)1i,mN1 is given by(16)

Ki,m =

sgn(m i), |m i| = 10, |m i| = 1. , K =

0 1 0 . . . 01 0 1 . . . 0

......

... . . ....

0 . . . 1 0 10 . . . 0 1 0

.

For all 1 i N 1, ei RN1 is the column vector of the canonicalbasis ofRN1. The (N 1) (N 1) matrix Ei is defined by

(17) Ei = eieTi , 1 i N 1

For all u l2h,0 and U = vect(u) RN1,

(18) (EiU)j = i,jUj, 1 i, j N 1

with i,j the Kronecker symbol. If U = vect(u), Ux = vect(ux) RN1 thenthe following identities hold.

One-dimensional second order derivative 2x

(19) vect(2xu) =1

h2T U.

One-dimensional centered gradient x

(20) vect(xu) =1

2hKU.

One-dimensional hermitian gradient ux

(21) vect(ux) =3

h P1KU.

One-dimensional Stephenson operator

(22) vect(4xu) =12

h2

3

2h2KP1K+

1

h2T

U =

6

h4

3KP1K+ 2T

U.

Remark: The isomorphism operator vect which maps the discrete functionu L2h,0 to the vector U = vect(u) R(N1)

2

, induces a natural isomorphismof space operators. Let us denote by < L, u > the resulting discrete functionobtained by the operation of a x linear operator L on a discrete function u, Weassociate to L an operator L, which acts on a vector U, such that

(23) < L, u >=< L, U > .

7


8/37

To simplify notation, from now on we identify the operators L and L in caseswhere this notation does not lead to any confusion. For example (19) will bewritten

(24) 2x =1

h2T.

Now we apply the same identification between operators in the bidimensionalframework. We identify the discrete functions u L2h,0 and the vector U R(N1)2 . The following bidimensional operators are expressed as Kronecker

product operators:

Bidimensional second order derivation operators 2x, 2y

(25) 2x =1

h2T I , 2y =

1

h2I T.

Bidimensional first order derivation operators x, y

(26) x =1

2hK

I , y =

1

2hI

K.

Bidimensional Hermitian gradient

(27) Ux =3

h

P1K I

U . Uy =

3

h

I P1K

U .

Mixed derivative 2x2y

(28) 2x2y =

1

h4T T .

Fourth order derivation operators in two dimensions

(29) 4

x =

12

h2 32h2KP1K+ 1h2 T I4y =

12

h2I 3

2h2KP1K+

1

h2T

.

The following summarizes several preliminary results about matrices K,P,T

Lemma 2.1 (i) The commutator of P and K is

(30) [P, K] = P K KP = 2 (EN1 E1) .(ii) The commutator of P1 and K is

(31) [P1, K] = P1K KP1 = 2P1EN1 E1P1.(iii) The symmetric matrix K2 is related to T by

(32) K2 = T2 4T + 2(E1 + EN1).

8


9/37

(i) is easily verified and (ii) is deduced from (i) by conjugasion by P1.(iii): The following identities are simply verified.

(33)

(K2U)i = Ui+2 + Ui2 2Ui , 2 i N 2(K2U)1 = U3 U1(K2U)N1 = UN1 + UN3

(34)

(T2U)i = Ui+2 4Ui+1 + 6Ui 4Ui1 + Ui2 , 2 i N 2

(T2U)1 = (T U)2 + 2(T U)1 = U3 4U2 + 5U1(T2U)N1 = UN3 4UN2 + 5UN1.

Therefore,(35)

(K2 T2)U

i= 4Ui+1 8Ui + 4Ui1 = 4(T U)i , 2 i N 2

(K2 T2)U1

= 4(T U)1 + 2U1(K2 T2)U

N1= 4(T U)N1 + 2UN1,

which gives (32).

3 A fast FFT solver for the Stephenson bihar-

monic

3.1 The FFT Poisson solver

We recall here briefly the standard FFT algorithm for the discrete Laplacianaccording to [6], [27]. Consider the Poisson problem in the square = (0, L)2

(36)

u(x, y) = f, (x, y) = (0, L)2u(x, y) = 0, (x, y) ,

Its discrete form is, (see (6)),

(37)

hui,j = fi,j , 1 i, j N 1ui,j = 0 , i {0, N} or j {0, N}.

The eigenvectors of 2x, which form a basis of l2h,0, are zk l2h,0 defined by, (Lis the length of the interval).

(38) zkj =

2

L

1/2sin

kjh

L, 1 k, j N 1,

They form an orthonormal basis for the one-dimensional scalar product (., .)h

(39) (zk, zl)h = k,l, 1 k, l N 1.

9


10/37

Cast in vector form, we introduce the column vector Zk RN1 and row vectorZj RN1 defined by(40) Zk = h1/2zk , Zj = h

1/2zj , 1 k, j N 1

(41) Zkj

= 2N1/2

sinkj

N, 1

k, j

N

1,

The matrix Z MN1(R) whose kth column is Zk and jth row is Zj is asymmetric positive definite unitary matrix, thus

(42) Z2 = ZZT = IN1.

The eigenvalues of the matrix T are given by, (see (19),

(43) k = 4 sin2

k

2N

.

In matrix form, the scheme (37) reduces to the linear system with right-hand

side F = h2 vect(f) R(N1)2 and unknown U = vect(u) R(N1)2

(44) (T I+ I T) U = F.The orthonormal basis (in R(N1)

2

ofT I+ I T is (Zk Zl)1k,lN1, witheigenvalues (k + l)1k,lN1.

The algorithm of the fast Poisson solver is in 3 steps (see [27, 6] for moredetails).

Algorithm 1 (Fast FFT Poisson solver) Step 1:Decompose the source termF = h2 vect(f) on the orthonormal basis ZkZl. This step consists of computing the coefficients FZk,l = (F, Z

k Zl)and is performed by FFT (actually the fast sine transform).

Step 2 :Solve system (44) in the Fourier space by

(45) uZk,l =FZk,l

k + l

Step 3:Assemble componentwise the solution using the decomposition of the grid

function U R(N1)2 in Zk Zl

(46) Ui,j =N1k,l=1

UZk,lZki Z

lj.

The grid function u L2h,0 is such that U = vect(u), therefore(47) ui,j = Ui,j, 1 i, j N 1

10


11/37

Steps 1 and 3 are O(N2 Log(N)), and Step 2 is O(N2), which gives a O(N2 Log(N))algorithm. For the complexity analysis of the FFT, we refer to [25] or [23].

3.2 The Stephenson operator

Let us begin by representing the finite difference operator 4x (8) in matrix form.

Lemma 3.1 The operator P 4xu has the following matrix form

(48) P 4x =6

h4T2 +

36

h4

e1(e1 + KP1e1)

T + eN1(eN1 KP1eN1)T

.

Observe that P 4xu is not symmetric.Proof: We use systematically that for all column vectors u, v RN1, thenuT, vT RN1 are row vectors. For A a (N1) (N1) matrix, the followingrelation holds

(49) u(vTA) = (uvT)A = u(ATv)T

The finite difference operator 4x reads in matrix form (see (22))

(50) 4x =6

h4

3KP1K+ 2T.

Multiplying on the left by P gives, using (30),

P 4x =6

h4

3P KP1K+ 2P T

=

6

h4

3[P, K]P1K+ 3KP P1K+ 2P T

=

6

h4

6(EN1 E1)P1K+ 3K2 + 2(6I T)T

.

Using the expression of K2

as a function of T, (see (32)) and (P1

)T

= P1

,KT = K,

P 4x =6

h4

6eN1e

TN1P

1K 6e1eT1 P1K

+ 3(T2 4T + 2e1eT1 + 2eN1eTN1) + 12T 2T2

=6

h4

6eN1eTN1(P1)TKT + 6e1eT1 (P1)TKT + T2 + 6e1eT1 + 6eN1eTN1)

=

6

h4T2 +

36

h4

e1(e1 + KP

1e1)T + eN1(eN1 KP1eN1)T

,

which is the result.

11


12/37

Lemma 3.2 The symmetric positive definite operator 4x (see (22) has the al-ternative matrix form

(51) 4x =6

h4P1T2 +

36

h4

v1vT1 + v2v

T2

,

where the vectors v1, v2 are

(52)

v1 = ( )1/2P1

22

e1

2

2eN1

v2 = ( + )

1/2P12

2e1 +

2

2eN1

.

The matrix P is given in (14), and the constants , are

(53)

= 2(2 eT1 P1e1) = 2eTN1P

1e1.

Proof: Applying P1 to both sides of (48), we obtain

4x = P1(P 4x)

= 6h4

P1T2 + 36h4P1e1(e1 + KP1e1)T + P1eN1(eN1 KP1eN1)T.

Therefore, the term in braces, referred as (I),

(54) (I) = P1e1(e1 + KP1e1)

T + P1eN1(eN1 KP1eN1)T

is expanded as

(I) = P1e1eT1 P1e1eT1 P1K+ P1eN1eTN1 + P1eN1eTN1P1K

= P1e1eT1 P1e1eT1 KP1 + P1e1eT1 [K, P1] + P1eN1eTN1

+ P1eN1eTN1KP

1 + P1eN1eTN1[P

1, K].

Using the value of the commutator (31)

[K, P1] = 2P1(E1EN1)P1 = 2P1

e1eT1 eN1eTN1

P1 = [P1, K],

we obtain that (I) is the conjugate of (II) by P1, i.e.,

(55) (I) = P1(II)P1,

with (II) defined as

(II) = e1eT1 P e1eT1 K 2e1eT1 P1e1eT1 + 2e1eT1 P1eN1eTN1

+ eN1eTN1P + eN1e

TN1K+ 2eN1e

TN1P

1e1eT1 2eN1eTN1P1eN1eTN1.

Therefore, (I) rewrites

(56) (I) = P1(S) + (S)P1,12


13/37

where (S) and (S) are the matrices defined by

(S) = 2e1eT1 P1e1eT1 2eN1eTN1P1eN1eTN1+ 2

e1e

T1 P

1eN1eTN1 + eN1e

TN1P

1e1eT1

(S) = e1[(P + K)e1]

T + eN1[(P K)eN1]T.

The matrix (S) is clearly symmetric. In addition, we verify easily that

(57)

(P + K)e1 = 4e1(P K)eN1 = 4eN1.

Therefore, the matrix (S) reduces to

(58) (S) = 4e1eT1 + 4eN1e

TN1

and is as well symmetric. We deduce from (56) that (I) can be written as

(I) = P1 2e1eT1 P1e1eT1 2eN1eTN1P1eN1eTN1

+ 2e1eT1 P

1eN1eTN1 + 2eN1e

TN1P

1e1eT1 + 4e1e

T1 + 4eN1e

TN1P

1,

or

(59) (I) = P1

e1, eN1

eT1eTN1

P1,

with

(60)

= 4 2eT1 P1e1 = 4 2eTN1P1eN1 = 2eT1 P

1eN1 = 2eTN1P

1e1.

Using, in (59), that

(61) =

2

2

2

222

22

00 +

2

2

2

222

22

,we obtain that (I) is the symmetric matrix

(62) (I) =

v1, v2

vT1vT2

,

where the vectors v1, v2 are

(63)

v1 = ( )1/2P1

2

2e1

2

2eN1

v2 = ( + )1/2P122 e1 +

2

2 eN1.

13


14/37

which gives (51).

Cast in the matrix framework, formula (51) is a decomposition of the one-dimensional Stephenson biharmonic operator 4x in two parts,

(64) A = h44x = 6P1T2 B +36[v1, v2]

vT1

v

T

2 C

.

We note that B Span(T), since P = 6I T and that C is a rank 2 matrix.We can therefore use the Sherman-Morrison formula.

Formula 3.1 (Sherman-Morrison, [18], Chap.2, p. 50) Suppose thatA, B MN(R) are two invertible matrices such that

(65) A = B + RST,

withR, S MN,n(R),n N, then the inverse of the matrixA can be written as(66) A1 = B1 B1R(I+ STB1R)1STB1.provided that the matrix I+ STB1R

Mn(R) be invertible.

When n


15/37

A is the matrix corresponding to h42h in (67). B is the (N 1)2 (N 1)2 matrix

(69) B = 6P1T2 IN1 + 6IN1 P1T2 + 2T T.

Bis diagonal in the basis Zk

Zl and will be referred for convenience as

the diagonal part of matrix A. C is the (N 1)2 (N 1)2 matrix, (see (63) for the definition of v1, v2),

(70) C = 36

[v1, v2]

vT1vT2

IN1 + IN1

v1, v2

vT1vT2

.

Denoting by Zk, Zl respectively the column and line vectors of the unitarymatrix (40), we replace in (70) the identity matrix IN1 by

(71) IN1 = ZZT =

Z1, Z2, . . ,ZN1

Z1..

ZN1

.

The interest of the decomposition of the identity operator (71), instead of thetrivial one, will appear in the resolution of the capacitance system, see AppendixA.

The matrix in braces in (70) is therefore(72)

v1, v2

vT1vT2

Z1, Z2, . . ,ZN1

Z1..ZN1

(a)

+

Z1, Z2, . . ,ZN1

Z1..ZN1

v1, v2 vT1vT2

(b)

.

At this point we use several rules of the Kronecker product alegebra, (see [25]),namely(i) For all matrices A,B,C,D,

(73) (A B)(C D) = (AC) (BD)

assuming that the ordinary products AC and BD are defined.(ii) For all matrices A, B,

(74) (A B)T = AT BT

15


16/37

Applying (73), (74), and the definition of the Kronecker product (12), the term(a) in (72) can be rewritten as

(a) =

v1, v2

Z1, Z2, . . ,ZN1

vT1vT2

Z1..

ZN1

=

v1 Z1, . . ,v1 ZN1

vT1 Z1..vT1 ZN1

+ v2 Z1, . . ,v2 ZN1

vT2 Z1..v2 ZN1

Combining this with similar calculations applied to term (b) in (72), it turnsout that the matrix C in (70) can be expressed as a low-rank (N1)2(N1)2matrix in the form

(75) C = 36RRT,where R is a matrix (N 1)2 4(N 1), written in the form(76) R = [R1, R2, R3, R4] .In the sequel, we refer to C as the perturbation (or capacitance) part ofA.

The four (N 1)2 (N 1) matrices Rk are

(77)

R1 = [v1 Z1, v1 Z2,...,v1 ZN1]R2 = [v2 Z1, v2 Z2,...,v2 ZN1]R3 = [Z1 v1, Z2 v1,...,ZN1 v1]R4 = [Z1 v2, Z2 v2,...,ZN1 v2].

The interest of this decomposition (instead of the one using the canonical factor-ization of the identity operator), will appear clearly in Subsection A. Applyingthe Sherman-Morrison formula to the matrix

(78) A = B + 36RRT,allows to express A1 as

(79) A1 = B1 36B1R

I4(N1) + 36RTB1R1

RTB1.

We use now (79) to solve the system

(80) AU = Fwith A = h42h, F = h4 vect(f).

The following algorithm summarizes the solution procedure according to

formula (66). Indications of the computing complexity are given at each step ofthe algorithm.

16


17/37

Algorithm 2 (Fast FFT algorithm for the biharmonic problem)

Step1: Solve Bg = F. Let F R(N1)2 be the source-term vectorF = h4 vect(f). The linear system for g R(N1)2

(81) Bg = F,is solved using the FFT transform as follows. FirstF is decomposed on thebasis Zk Zl where the vector Zk, k = 1, . . ,N 1 are the eigenfunctions(40) of matrix T, as

(82) F =

N1k,l=1

FZk,lZk Zl,

where the coefficients FZk,l are

(83) FZk,l = (F, Zk Zl) , 1 k, l N 1.

(83) is computed by FFT. The eigenvalues of B are

(84) k,l =2

k(1 k/6) +2

l(1 l/6) + 2kl,and g = B1F is given by

(85) g =N1k,l=1

FZk,lk,l

Zk Zl.

The vector g R(N1)2 is stored for Step 7. The FFT computation (83)is O(N2 Log(N)) and (85) is O(N2).

Step2:Compute the vector RTg R4(N1),

(86) RTg = RT1

gRT2 gRT3 gRT4 g

.We have for example

(87) RT1 g =

(v1 Z1)Tg..(v1 ZN1)Tg

.The l component of vector RT1 g in (87) is

(88) (v1 Zl)Tg =N1i=1

(v1)i

N1j=1

gi,jZjl .

17


18/37

Each term

(89)N1j=1

gi,jZjl , 1 i N 1

is computed by FFT (actually the fast sine transform) via

(90)N1j=1

gi,jZjl =

2

N

1/2 N1j=1

gi,j sinlj

N.

Similarly, the k-th component in RT3 g is

(91) (RT3 g)k = (Zk v1)Tg =

N1j=1

(v1)j

N1i=1

gi,jZik.

The FFT is used to compute

(92)

N1

i=1 gi,jZik = 2

N1/2 N1

i=1 gi,j sinik

N.

The expressions of RT2 g and RT4 g are similar, replacing v1 by v2. As for

the counting complexity, of Step 2, (89) is O(NLog(N)) for each value ofi, which gives O(N2 Log(N)) in all. Then (87) is O(N2) using (88). Thesame is true for each of the four components of RTg in (86).

Step 3:Assemble the 4(N 1) 4(N 1) capacitance matrix matrix in bracketsin formula (79)

(93) I4(N1) + 36RTB1R.We refer to the Appendix A for the detailed structure of the symmetric ma-trix (93), as well as for the O(N2) computing complexity of its assembling.Note that matrix (93) is computed once for all.

Step 4:Solve the 4(N 1) 4(N 1) linear system(94)

I4(N1) + 36RTB1R

s = RTg.

The computing complexity of the whole algorithm relies on the efficiencyof this solving. It is performed by the preconditionned conjugate gradientmethod. Numerical evidence displays a O(N2 Log(N)) computing cost.We refer to Appendix A for more details. The solution s R4(N1) isdecomposed in

(95) s = [s1, s2, s3, s4]T, s1, s2, s3, s4 RN1

18


19/37

Step 5:Perform the product of t = Rs, s R4(N1), t R(N1)2 .(96) t = t1 + t2 + t3 + t4,

with

(97)

t1 = v1 [Z1,...,ZN1] s1t2 = v2 [Z1,...,ZN1] s2t3 =

[Z1,...,ZN1] v1

s3

t4 =

[Z1,...,ZN1] v2

s4.

For 1 i, j N 1,

(98)

(t1)i,j = (v1)i

N1l=1

(s1)lZlj

(t2)i,j = (v2)i

N1l=1

(s2)lZlj

(t3)i,j = (v1)j

N1

k=1(s3)kZ

ki

(t4)i,j = (v2)j

N1k=1

(s4)kZki .

Each sum in each right-hand-side in (98) is computed by FFT, which gives

a cost ofO(NLog(N)). The computation of the vectort R(N1)2 in (96)is therefore O(N2 Log(N)).

Step 6:Resolution of the linear system inR(N1)

2

(99) v = B1tvia the fast FFT solver as in Step 1. The cost is O(N2 Log(N)).

Step 7:Assemble the solution R(N1)2 vect L2h,0 of the biharmonic problem(5) (with (a, b) = (0, 1)) by

(100) = g 36vwhere g, v R(N1)2 are given in (85,99). The cost is O(N2).

Step 8:Compute the hermitian gradient x, y R(N1)2 as a post-processing ofthe grid values of by

(101)

x =

3

hP1K I

y = 3h

I P1K19


20/37

Computation (101) is performed using the one-dimensional FFT for aglobal cost O(N2 Log(N)).

The overall computing cost of Algorithm 2 is therefore O(N2)Log(N) underthe assumption that Step 4 is at most O(N2 Log(N)). See Appendix A. Letus conclude this Section by considering now the case of problem (4) discretized

by (5), with a 0, b > 0. The matrix form of the finite difference operatorah + b2h is

(102) ah + b2h =a

h2[T I+ I T] + b

h4(B + C) .

Defining

(103) Ba,b = ah2 [T I+ I T] + bB,we have

(104) h4(ah + b2h) = Ba,b + bC.It turns out that the algorithm that solves problem (5) is now exactly the same

as Algorithm 2, replacing the eigenvalues k,l of B by(105) a,bk,l = ah

2(k + l) + b

2k

(1 k/6) +2l

(1 l/6) + 2kl

.

In addition C is replaced by bC in (75, 79).

3.4 Treatment of non-homogeneous boundary conditions

Let us first consider the modifications of 4xu at near boundary points. We write

(106) 4xu =12

h2(xux 2xu).

Let U be a one dimensional vector of length N

1, associated with the values of

the solution u for a fixed y. The vector Ux is associated with the approximatedderivative of u. At near boundary point x = x1 we have

(107) (xux)x=x1 =1

2h((Ux)2 (Ux)0).

Similarly for x = xN1

(108) (xux)x=xN1 =1

2h((Ux)N (Ux)N2).

Therefore, we can write

(109) xux =1

2hKUx +

1

2h

(Ux)0.

.(Ux)N

.

20


21/37

We replace KUx by KP1P Ux and get

(110) xux =1

2hKP1P Ux +

1

2h

(Ux)0..(Ux)N

.

Now, for i = 1, , , N 1 we have (Ux)i+1+4(Ux)i+(Ux)i1 = 62h(Ui+1Ui1).For i = 1 we have

(111) (Ux)2 + 4(Ux)1 =6

2hU2 6

2hU0 (Ux)0.

Similarly, for i = N 1

(112) 4(Ux)N1 + (Ux)N2 = 62h

UN2 +6

2hUN (Ux)N.

Thus, it follows that

(113) P Ux =6

2h

KU.

For i = 1 we have

(114) (Ux)2 + 4(Ux)1 =6

2hU2 6

2hU0 (Ux)0.

Similarly, for i = N 1

(115) 4(Ux)N1 + (Ux)N2 = 62h

UN2 +6

2hUN (Ux)N.

Thus, it follows that

(116) P Ux =6

2hKU +

62hU0 (Ux)0.

.62hUN (Ux)N

.

Therefore,(117)

xux =3

2h2KP1KU +

1

2hKP1

62hU0 (Ux)00..062hUN (Ux)N

+1

2h

(Ux)00..0(Ux)N

.

We also have that

(118) (2xu)x=x1 = 1h2(U2 2U1 + U0).

21


22/37

Similarly for x = xN1

(119) (2xu)x=x1 =1

h2(UN 2UN1 + UN2).

Thus,

(120) 2xu = T

h2U +

1

h2

U0

.

.UN

.Therefore,

(121)

4xu =12

h2

3

2h2KP1K+

1

h2T

U +

1

2hKP1

62hU0 (Ux)00..062hUN

(Ux)N

+1

2h

(Ux)00..0(Ux)N

1h2

U00..0UN

Note that the previous expression is a perturbation of the operator 4x that wereceived in equation (22), where the additional terms come from the boundary.Since the value of the solution, along with its first order derivatives are known onthe boundary, they may be transformed to the right hand side of the equation.

4yu may be expressed in a similar way. The mixed derivative 2x

2yu yields

modifications involving the values of U only at the boundary.

4 A fourth order compact scheme for biharmonic

problems

4.1 Fourth order accurate nine points compact schemes

In this subsection, we describe how to modify the 9 points biharmonic operator(7) and the 5 points Laplacian (6) in order to obtain fourth-order accuracy.

The 1-D operators 4xi,j, 4yi,j in (122) are given as functions of , x, y

by

(122) 4xi,j =12

h2

(xx)i,j (2x)i,j

; 4yi,j =

12

h2

(yy)i,j (2y)i,j

.

These two operators are fourth order accurate (see [3]). The second order ac-curacy of the operator 2h is due only to the term

2x

2yi,j. Thus, the local

truncation error is 2x2yi,j 2x2yi,j = O(h2). The later may be checked us-

ing Taylor expansion of around (xi, yj), or by checking the truncation error

in 2

x2

yi,j by applying it to a Fourier mode = ei(kx+ly)

. We refer to [3] forthe study of the accuracy at near boundary points.

22


23/37

In order to derive a fourth order approximation to 2x we proceed as follows.The Taylor expansion of 2x, xx are

(123) 2xi = 2xi +

h2

124xi + O(h

4),

(124) xx,i = 2xi +

h2

64xi + O(h

4).

A linear combination of these two operators allows us to derive the fourth orderaccurate approximation 2x to

2x, eliminating the h

2 term in the following way

(125) 2xi = 22xi xx,i = 2xi + O(h4).

In a similar way we derive a fourth-order accurate scheme for 2y, i.e.,

(126) 2yi = 22yi yy,i = 2yi + O(h4).

Therefore, a fourth order approximation for the Laplacian is

(127) hi,j = 22xi,j xx,i,j + 22yi,j yy,i,j.

Invoking

(128) xx 2x =h2

124x, yy 2y =

h2

124y,

and applying it to (127), we find that the operator h may be rewritten as aperturbation of h in the following way

(129) h = 2x

h2

124x +

2y

h2

124y = h

h2

12(4x +

4y).

In a similar manner we construct a fourth-order accurate approximation to2x2yi,j. First we expand

2yx in powers of h

(130)

2yxx,i,j = 2y

2xi,j +

h2

64xi,j + O(h

4)

+

h2

124y

2xi,j +

h2

64xi,j + O(h

4).

Thus,

(131) 2yxx,i,j = 2x

2yi,j +

h2

62y

4xi,j +

h2

124y

2xi,j + O(h

4).

Symmetrically,

(132) 2xyy,i,j = 2x

2yi,j +

h2

62x

4yi,j +

h2

124x

2yi,j + O(h

4).

23


24/37

The Taylor expansion of the mixed operator 2x2y is therefore

(133) 2x2yi,j =

2x

2yi,j +

h2

122x

4yi,j +

h2

122y

4xi,j + O(h

4).

Combining (133), (132), (130), we define the new mixed finite-difference opera-

tor 2x2yi,j as(134) 2x2yi,j = 32x2yi,j 2xyy,i,j 2yxx,i,j = 2x2yi,j + O(h4).Keeping 4x and

4y as before, we define the 4th order biharmonic operator

2h

as

(135) 2hi,j = 4xi,j +

4yi,j + 2

2x2yi,j.Invoking (128) again allows us to rewrite the 4th order operator 2h as a per-turbation of 2h in the following way.

(136) 2h = 4xI

h2

62y+ 4y I

h2

62x+ 22x2y.

Finally, the new fourth order scheme which approximates (4) is

(137)

ah + b2hi,j = fi,j , 1 i, j N 1i,j = 0 , x,i,j = y,i,j = 0, , i {0, N}, j {0, N}.

As a consequence of (136) a fourth-order approximation to the biharmonic equa-tion 2 = f is

(138) 2h = 4x

I h

2

62y

+ 4y

I h

2

62x

+ 22x

2y = f.

Note that in the right-hand-side of (138) only the value off(i, j) is involvedin the discretization of the differential equation at (xi, yj). This feature of thescheme is important in cases where the biharmonic problem has to be solvedwhen the function f in unknown on the boundary, but is known only at interiorpoints (see [3] ). A different fourth-order accurate scheme for the biharmonicequation 2 = f was presented in ([34], Sec.3.2) and a multigrid solver wasdesigned for this scheme in [1]. The scheme in ([34], Sec.3.2) involves the fivevalues of f, fi,j, fi+1,j , fi1,j , fi,j+1 and fi,j1, in order to construct a fourth-order approximation to the biharmonic operator.

Note finally that in (138) the gradient (x, y) is used at all the nine pointsof the stencil of the scheme. In a forthcoming paper, an analogous scheme forirregular domains will be derived, using directional derivatives at corner points.

24


25/37

4.2 Fast solution procedure

The fast solution procedure for solving (136) follows exactly the same lines asin Section 3. The matrix forms of the positive operators 2h and (h) are

2h =1h4

6P1T2

IN1 +T6

+ 6

IN1 +

T6

P1T2 + 2T T+ 36h4 v1, v2 vT1vT2 IN1 + T6 + 36h4 IN1 + T6 v1, v2 vT1vT2

(h) = 1h2

T I+ I T + 12 [P1T2 IN1 + IN1 P1T2]

+ 3h2 [v1, v2]

vT1vT2

IN1 + 3h2 IN1 [v1, v2]

vT1vT2

.

The matrix form A of the fourth order operator h4ah + b2h

as a diag-

onal part B and a capacitance part C, (see Section 3.3) is(139) A = B + C

with

B = ah2T I+ I T + 12

[P1T2 IN1 + IN1 P1T2]+ b

6P1T2

IN1 +

T

6

+ 6

IN1 +

T

6

P1T2 + 2T T

.

The eigenvectors of B are Zk Zl and the eigenvalues are

k,l = ah2

(k + l) +

1

12

2k(1 k/6) +

1

12

2l(1 l/6)

+ b

2k

1 + l/6

1 k/6 + 2l

1 + k/6

1 l/6 + 2kl

.

The capacitance part ofA is(140)

C = 36 [v1, v2] vT1vT2 ( ah2

12+ b)IN1 + b

T

6

+

(ah2

12+ b)IN1 + b

T

6

v1, v2 vT1vT2 .The structure of C is the same as precedingly,(141) C = 36RRT

where R is the (N 1)2 4(N 1) matrix(142) R = [R1, R2, R3, R4] ,and

(143)

R1 = [v1 Z,1, v1 Z,2,...,v1 Z,N1]R2 = [v2 Z,1, v2 Z,2,...,v2 Z,N1]R

3 = [Z,1

v1, Z,2

v1,...,Z,N1

v1]R4 = [Z

,1 v2, Z,2 v2,...,Z,N1 v2]

25


26/37

with(144)

Z =

Z,1, Z,2, . . ,Z,N1

, Z,j =

ah2

12+ b(1 +

j6

)

1/2Zj, 1 j N1

The numerical algorithm for (137) is the same as Algorithm 2, replacing

B,

C,

R, Z,

by B, C, R, Z. Non homogeneous boundary conditions are treated as in Sub-section 3.4.

5 Numerical results

The numerical results presented in the sequel have been obtained with a codewritten in FORTRAN90. The package fftpack of Swarztrauber [36] has beenused for computing the FFT. In addition we have used the g95 compiler [37]without optimization. The computations have been ran in double precision ona laptop with a processor Intel Pentium M, 2.13 GHZ, with 1GB memory.

5.1 Accuracy

We report numerical results obtained so far with the two versions of the scheme,the second order and the fourth order versions. The discrete L2, and L errorsare defined by

(145)

hh =

h2i,j

((xi, yj) i,j)21/2

(a),

h,h = supi,j=1,..N1

|(xi, yj) i,j| (c).

Taking into account non homogeneous boundary conditions gives an additionalcontribution to the right-hand side for near boundary points, whose value issimply deduced from the boundary values of, x, y on the four edges, which

appear in the expression of the discrete Laplacian and Biharmonic operators. Example 1The problem solved is (5) with exact solution (x, y) = sin2(x)sin2(y) on thesquare = [0, ]2. That test is the same as Example 2 in [1] and Example1 in [5]. Table 1 reports the numerical results with the second order accuratescheme (5) (with (a, b) = (0, 1). The source term is 2(x, y) and the boundaryconditions are zeros on the four sides of the square. We observe the second orderaccuracy of the scheme for , the gradient (x, y) as well as for the Laplacian h.

Table 2 reports the numerical results with the fourth order scheme (137).The scheme exhibits a fourth order accuracy, for , (x, y) as well as for

h, up to a 512512 grid, where the numerical accuracy of the computeris reached, (double precision).

26


27/37


28/37

We consider Problem (5) with zero source term and boundary conditions

(146)

2(x, y) = 0, (x, y) (x, y) = 0

x(0, y) =

x(1, y) =

y(x, 0) = 0

y (x, 1) = 1

This corresponds to the Stokes problem in pure streamfunction form for a drivencavity setting. See Example 6 in [5], Problem 3 in [1]. The fourth order schemehas been used.

N max|| location (x, y)64 0.1000803 (0.5, 0.765625)

128 0.1000767 (0.5, 0.765625)256 0.1000759 (0.5, 0.765625)

Bialecki(N=128)[5] 0.100076276 (0.5, 0.765)Altas et al.(N=64)[1] 0.10008 (0.5, 0.766)

Table 3: Maximum value and location of max || in Stokes Problem (146) with thefourth order scheme (137).

Example 3In order to demonstrate the efficiency of the fourth order scheme (137), wepresent several examples of problems (4) with coefficients (a, b) = ( 1, 2) forwhich we applied our fourth order scheme (137). For the first test problem

(147) (x, y) = (1 + x2)(1 + y2)

Here we received zero error up to the machine accuracy, according to the fact

that the scheme is exact for fourth order polynomials. Example 4

Consider the case where the exact solution is

(148) (x, y) = (1 x2)2(1 y2)2, 1 x, y 1.

This function solves the equation

+ 22 = f(x, y),

where f(x, y) = + 22 is the forcing term. In this case we have homo-geneous boundary conditions.

Table 4 summarizes the errors, e = hh, and the error in the x andy-derivatives ex = x x,hh, ey = y y,hh.

28


29/37


30/37


31/37

Appendix

A Resolution of the capacitance linear system

In this subsection we focus on the resolution of the capacitance system (94).

Consider first the 4(N 1) 4(N 1) matrix RB1

RT

which has a 4 4 blockstructure

(150) RTB1R =

RT1 B1R1 RT1 B1R2 RT1 B1R3 RT1 B1R4RT2 B1R1 RT2 B1R2 RT2 B1R3 RT2 B1R4RT3 B1R1 RT3 B1R2 RT3 B1R3 RT3 B1R4RT4 B1R1 RT4 B1R2 RT4 B1R3 RT4 B1R4

,where the four (N 1)2 (N 1) matrices Rk are given in (77). By symmetry,only the upper diagonal part of (150) has to be computed. The vector Zj RN1 is given in (41) and the vectors v1, v2 RN1 in (63) are decomposed as

a linear combination of Zk by

(151)

v1 = ( )1/2

2

2

N1k=1

Zk1

ZkN16 k Zk

v2 = ( + )1/2

2

2

N1k=1

Zk1 + ZkN1

6 k Zk.

Therefore for i = 1, . . ,N 1,

(152)

v1 Zi = ( )1/2

2

2

N1k=1

Zk1 ZkN16 k Z

k Zi, (i)

v2 Zi = ( + )1/2

2

2

N1k=1

Zk1 + ZkN1

6 k Zk Zi, (ii)

Zi v1 = ( )1/222

N1k=1

Z

k

1 Zk

N16 k Z

i Zk, (iii)

Zi v2 = ( + )1/2

2

2

N1k=1

Zk1 + ZkN1

6 k Zi Zk. (iv)

31


32/37

Operating with B1 on the left in (152)i,ii,iii,iv gives for j = 1, . . ,N 1

(153)

B1(v1 Zj) = ( )1/2

2

2

N1k=1

Zk1 ZkN1(6 k)k,j Z

k Zj, (i)

B1(v2

Zj) = ( + )1/2

2

2

N1

k=1Zk1 + Z

kN1

(6 k)k,jZk

Zj, (ii)

B1(Zj v1) = ( )1/2

2

2

N1k=1

Zk1 ZkN1(6 k)k,j Z

k Zj, (iii)

B1(Zj v2) = ( + )1/2

2

2

N1k=1

Zk1 + ZkN1

(6 k)k,j Zk Zj. (iv)

Taking the R(N1)2

scalar product of (152) and (153), and using that

(154)

Zk Zl, Zk Zl

= k,kl,l , 1 k, k, l , l N 1,

yields the term (i, j) of the matrix RT1 B1R1 is

(155) (v1 Zi)TB1(v1 Zj) = 2

i,jN1k=1

Zk1 ZkN16 k

2 1k,j

.

This proves that the (N 1) (N 1) matrix D1 = RT1 B1R1 is actuallydiagonal. Using(156)

ZkN1 =

2

N

1/2sin

k(N 1)N

= (1)k+1

2

N

1/2sin

k

N= (1)k+1Zk1

in (155) yields that the diagonal coefficient (D1)j is

(157) RT1 B1R1j,j =4( )

N

N1

k=1,k even sin2k

N1

(6 k)2k,j.

Similarly, we find that the matrix D2 = RT2 B1R2 is diagonal as well, with j-thcoefficient

(v2 Zj)TB1(v2 Zj) = 12

( + )

N1k=1

Zk1 + Z

kN1

6 k

21

k,j

=4( + )

N

N1k=1,k odd

sin2

k

N

1

(6 k)2k,j .

In addition, using that k,l = l,k it is easy to verify that

(158) RT3 B1R3 = RT1 B1R1RT4 B1R4 = RT2 B1R2

32


33/37

and that

(159)

RT1 B1R2 = 0RT3 B1R4 = 0.

Finally, we obtain that the matrices M1,3 = RT1 B1R3, M1,4 = RT1 B1R4,M2,4 = R

T2

B1R4 are given by

(160) (RT1 B1R3)i,j = 4( )N

sin iN sinjN

(6 i)(6 j)i,j if i even , j even0 if i odd or j odd

(161)

(RT1 B1R4)i,j = 4( )

1/2( + )1/2

N

sin iN sinjN

(6 i)(6 j)i,j if i odd , j even0 if i even or j odd

(162)

(RT2

B1R3)i,j =

4( )1/2( + )1/2N

sin iN sinjN

(6

i)(6

j)i,j

if i even , j odd

0 if i odd or j even

(163) (RT2 B1R4)i,j = 4( + )N

sin iN sinjN

(6 i)(6 j)i,j if i odd , j odd .0 if i even or j even

Using that

(164) (RT1 B1R4)T = (RT2 B1R3),it results that the 4(N 1) 4(N 1) matrix RTB1R has the 4 4 blockstructure

(165) RTB1R = D1 0 M1,3 M1,40 D2 MT1,4 M2,4

M1,3 M1,4 D1 0MT1,4 M2,4 0 D2

,Formulas (157, 160, 161, 162, 163) show that the computing cost for the assem-bling is O(N2). The capacitance matrix I+ 36RTB1R in (94) is therefore

(166) I+ 36RTB1R =

I+ 36D1 0 36M1,3 36M1,4

0 I+ 36D2 36MT1,4 36M2,4

36M1,3 36M1,4 I+ 36D1 036MT1,4 36M2,4 0 I+ 36D2

.This symmetric positive matrix has the form I4(N1) + Z

TZ I, where

(167) Z = 6(B1)1/2R.

33


34/37

The conjugate gradient method can be applied to (94). The simple diagonalpreconditioning yields a linear system Mg = g with matrix of the form

(168) M =

I2(N1) MMT I2(N1)

,

where

M M2(N1)(R) is defined by(169) M =

( 136I+ D1)

1M13 (136

I+ D1)1M14

( 136I+ D2)1MT14 (

136

I+ D2)1M24

.

Solving (94) with the CG method for the matrix in (168) has been proved to bevery efficient. Equivalently, one can split the linear system into two independentlinear systems of size 2(N1) and matrices I2(N1)MTM and I2(N1)MMT,[10], each of them being solved by the CG algorithm.

The convergence analysis of the CG algorithm (see for example [13], chap.8,pp. 249 sqq.) yields that the norm of the error is reduced by a factor after kiterations. Here k is selected such that

(170) k 1

22(I M MT)ln(2/),where 2(I MMT) is the condition number of I MMT.

The eigenvalues of I MMT are

(171) 0 < 1 2 ... 2(N1) 1,

where k = 1 2k and k is the kth singular value of M. Therefore,

(172) 2(I MMT) 11 2max(M)

.

A full analytic estimate of max(M) seems to be difficult to derive. Thus, welimit ourselves to the numerical study of the relation between Log

2(N) and

max(M). In Fig.1 we display the graph of Log2(N) max(M). Here N is thesize of the problem and it ranges 2 to 2600. This value encompasses the numberof grid points, which is usually picked for two-dimensional problems. As canbe observed on Fig.1, one can infer a monotonic increase in the behaviour ofmax(M) as a function ofN, with a very slow growth for large N. The existenceof a bound uniform in N , though very plausible, is not completely apparent.Anyway, we observe numerically that max(N) 0.7 for N 2600. For a giventolerance error , the lower bound for the number of iterations k is independentin practice of the grid size, namely, k 0.7 ln(2/), at least for N 2600, (see(170)). Since each iteration of the CG algorithm is O(N2), we have that Step 4in Algorithm 2 is in practice O(N2). In Tables 6, 7 we report on the number ofiterations of the CG algorithm for the system (94) for the two specific examples.

It indicates a very slow increasing behaviour of the number of iterations of theCG algorithm, corroborating Fig. 1.

34


35/37

2 3 4 5 6 7 8 9 10 11 12

0.56

0.58

0.6

0.62

0.64

0.66

0.68

0.7

Log(N)/Log(2)

Figure 1: Curve Log2(N) max(M).

References[1] I. Altas, J. Dym, M. M. Gupta, and R. P Manohar. Mutigrid solution

of automatically generated high-order discretizations for the biharmonicequation. SIAM J. Sci. Comput., 19:15751585, 1998.

[2] P. Amodio, J.R. Cash, G. Roussos, R.W. Wright, G. Fairweather, I. Glad-well, G.L. Kraut, and M. Paprzycki. Almost block diagonal linear systems:sequential and parallel solution techniques, and applications. Numer. Lin-ear Algebra with Applications, 7:275317, 2000.

[3] M. Ben-Artzi, J-P. Croisille, and D. Fishelov. Convergence of a compactscheme for the pure streamfunction formulation of the unsteady Navier-Stokes system. SIAM J. Numer. Anal., 44,5:19972024, 2006.

[4] M. Ben-Artzi, J-P. Croisille, D. Fishelov, and S. Trachtenberg. A Pure-Compact Scheme for the Streamfunction Formulation of Navier-Stokesequations. J. Comput. Phys., 205(2):640664, 2005.

[5] B. Bialecki. A fast solver for the orthogonal spline collocation solution of thebiharmonic Dirichlet problem on rectangles. J. Comput. Phys., 191:601621, 2003.

[6] B. Bialecki, G. Fairweather, and K.R. Bennett. Fast direct solvers forpiecewise Hermite bicubic orthogonal spline collocation equations. SIAMJ. Numer. Anal., 29:156173, 1992.

[7] B. Bialecki, G. Fairweather, and K.A. Remington. Fourier methods for

piecewise Hermite bicubic orthogonal spline collocation. East-West J. Nu-mer. Math, 2:120, 1994.

35


36/37


37/37

[23] R. Kress. Numerical Analysis. Graduate Texts in Mathematics 181.Springer, 1998.

[24] K. Kunz. Numerical Analysis. McGrawHill, 1958.

[25] C. Van Loan. Computational Frameworks for the Fast Fourier Transform.SIAM, 1992.

[26] Z.M. Lou, B. Bialecki, and G. Fairweather. Orthogonal spline collocationmethods for biharmonic pronblems. Numer. Math., 80:267303, 1998.

[27] R.E. Lynch, J.R. Rice, and D.H. Thomas. Direct solution of partial dif-ference equations by tensor product methods. Numer. Math., 6:185199,1964.

[28] J.W. McLaurin. A general coupled equation approach for solving the bi-harmonic boundary value problem. SIAM J. Numer. Anal., 11(1):1433,1974.

[29] V.V. Meleshko. Selected topics in the history of the two-dimensional bi-harmonic problem. Appl. Mech. Review, 56(1):3385, 2003.

[30] J. Shen. Efficient Spectral-Galerkin method I. Direct solvers of second andfourth order equations using Legendre polynomials. SIAM J. Sci. Comput.,15:14401451, 1994.

[31] J. Shen. Efficient Spectral-Galerkin method II. Direct solvers of secondand fourth order equations using Chebyshev polynomials. SIAM J. Sci.Comput., 16:7487, 1995.

[32] J. Smith. The coupled equation approach to the numerical solution ofthe biharmonic equation by finite differences. I. SIAM J. Numer. Anal.,5(2):323339, 1968.

[33] J. Smith. The coupled equation approach to the numerical solution of

the biharmonic equation by finite differences. II. SIAM J. Numer. Anal.,7(1):104111, 1970.

[34] J. W. Stephenson. Single cell discretizations of order two and four forbiharmonic problems. J. Comput. Phys., 55:6580, 1984.

[35] W. Sun. Orthogonal collocation solution of biharmonic equations. Int.Jour. Comput. Math, 49:2221232, 1993.

[36] P. Swarztrauber. Fast Fourier Transform Algorithms for Vector Computers.Parallel Computing, pages 4563, 1984.

[37] A. Vaught. g95 Manual. Technical report, http://www.g95.org.

37

biharmonic in rectangle (2008)

Documents