
Page 1

An Introduction to

Domain Decomposition Algorithms

Olof Widlund
Courant Institute of Mathematical Sciences
New York University
http://www.cs.nyu.edu/cs/faculty/widlund/

Page 2

Scope of This Tutorial

I will exclusively look at domain decomposition algorithms for positive definite, symmetric problems arising from a low order finite element approximation of elliptic problems. All subproblems will be solved exactly by Cholesky's algorithm. I will also give only a few references to the literature.

I will adopt the view that a domain decomposition algorithm provides preconditioners (approximate inverses) M of the large and often very ill-conditioned stiffness matrices A that arise in finite element practice.

They are designed with parallel computing systems in mind, and the best of them have proven to scale very well on systems with very many processors.

Page 3

All this work aims at designing preconditioners M such that κ(M^{-1}A), the condition number of the preconditioned operator, is small, while keeping the costs of applying M^{-1} acceptable. A preconditioned Krylov space method is almost always used to accelerate the convergence of the iteration.

The development of theory has greatly assisted in the development of improved algorithms; it can be viewed as a subfield of finite element theory. In particular, some of the good choices of primal constraints and scalings for some of these methods are unlikely to have been found without theoretical work.

Page 4

Poisson’s Equation and a Simple Finite Element Model

By using Green's formula, we can write Poisson's equation as a variational problem: Find u ∈ V such that ∀v ∈ V,

a(u, v) := \int_\Omega \nabla u \cdot \nabla v \, dx = F(v) := \int_\Omega f v \, dx + \int_{\partial\Omega_N} g_N v \, ds.

Here f is the load, i.e., the right hand side, and g_N the Neumann data given on ∂Ω_N ⊂ ∂Ω. All elements of V ⊂ H^1(Ω) vanish on the set ∂Ω_D := ∂Ω \ ∂Ω_N, first assumed to be non-empty. This problem is then uniquely solvable.

Page 5

Introduce a triangulation T_h of Ω and V^h ⊂ V, the standard piecewise linear finite elements on the triangulation. A linear system Au = F results, where u is the vector of nodal values at all interior nodes and those on ∂Ω_N. The stiffness matrix A is sparse, symmetric, and positive definite and can be very large. The resulting finite element solution u_h(x) is well defined and converges to the solution of the differential equation when h → 0.
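
As a concrete illustration (a minimal sketch, not part of the original slides), the stiffness matrix and load vector for the 1D model problem -u'' = f on (0, 1), with u(0) = 0 and a Neumann condition u'(1) = g_N, can be assembled as follows; the function and variable names are hypothetical.

```python
import numpy as np

def assemble_poisson_1d(n_elements, f=lambda x: 1.0, g_N=0.0):
    """Assemble A and F for -u'' = f on (0,1), u(0)=0, u'(1)=g_N,
    with piecewise linear elements on a uniform mesh (hypothetical helper)."""
    h = 1.0 / n_elements
    n = n_elements                      # unknowns: nodes 1..n (node 0 is Dirichlet)
    A = np.zeros((n, n))
    F = np.zeros(n)
    for e in range(n_elements):         # element (x_e, x_{e+1})
        k = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h   # local stiffness
        x_mid = (e + 0.5) * h
        fe = f(x_mid) * h / 2.0 * np.ones(2)           # midpoint-rule element load
        for a, i in enumerate((e - 1, e)):             # global dofs (node 0 removed)
            if i < 0:
                continue
            F[i] += fe[a]
            for b, j in enumerate((e - 1, e)):
                if j >= 0:
                    A[i, j] += k[a, b]
    F[-1] += g_N                                        # Neumann contribution at x = 1
    return A, F

A, F = assemble_poisson_1d(8)
u = np.linalg.solve(A, F)
print(u)
```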

The smallest eigenvalue λ_1(Ω) of the differential operator, and indirectly that of the stiffness matrix, can be estimated by using Friedrichs' inequality

\|u\|^2_{L^2(\Omega)} \le C_1 a(u, u) + C_2 \Big( \int_{\partial\Omega_D} u \, ds \Big)^2.

For u ∈ V the second integral vanishes and we get a positive lower bound of the Rayleigh quotient a(u, u)/\|u\|^2_{L^2(\Omega)} and of λ_1.

Page 6

In a pure Neumann problem, ∂Ω_N = ∂Ω, the Laplace operator and the stiffness matrix have a common null space of constants and the problem is solvable, modulo a constant, iff F(1) = 0. The second eigenvalue λ_2(Ω) of the operator is directly related to Poincaré's inequality:

\|u\|^2_{L^2(\Omega)} \le C_1 a(u, u) + C_2 \Big( \int_\Omega u \, dx \Big)^2.

Note that the second term on the right vanishes if u is orthogonal to the null space.

The largest eigenvalue of the stiffness matrices can be estimated by using Gershgorin's theorem.
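
For example (a small numerical check, not from the slides), Gershgorin's theorem bounds every eigenvalue of a symmetric matrix by the largest row sum of absolute values; for the 1D stiffness matrix this gives λ_max ≤ 4/h. The setup below is hypothetical.

```python
import numpy as np

def gershgorin_upper_bound(A):
    """Largest Gershgorin disc endpoint: max_i (a_ii + sum_{j != i} |a_ij|)."""
    A = np.asarray(A)
    off_diag_radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return np.max(np.diag(A) + off_diag_radii)

# 1D piecewise-linear stiffness matrix on a uniform mesh, Dirichlet at both ends.
n, h = 31, 1.0 / 32
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / h

print("Gershgorin bound:", gershgorin_upper_bound(A))    # equals 4/h
print("true lambda_max:", np.linalg.eigvalsh(A).max())   # slightly below 4/h
```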

Page 7

It is important to understand what happens to these two inequalities when the diameter of the domain changes under a dilation; a simple change of variables gives the answer. Certain powers of the dilation factor will appear with the constants. Similarly, the full H^1(Ω)-norm should be defined by

\|u\|^2_{H^1(\Omega)} := |u|^2_{H^1(\Omega)} + \frac{1}{\mathrm{diam}(\Omega)^2} \|u\|^2_{L^2(\Omega)} = a(u, u) + \frac{1}{\mathrm{diam}(\Omega)^2} \|u\|^2_{L^2(\Omega)}.

This formula is obtained by using the standard norm for a domain with diameter 1 and a dilation.

Page 8

Using these inequalities and a few additional, elementary arguments, we can show that the condition numbers of the stiffness matrices grow as Ch^{-2} in the case of quasi-uniform meshes. This accounts for the relatively slow convergence of the conjugate gradient method without preconditioning.
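
This growth is easy to observe numerically; a minimal sketch (not from the slides, using the hypothetical 1D stiffness matrix defined below) prints condition numbers that grow roughly by a factor of four each time h is halved.

```python
import numpy as np

def stiffness_1d(n):
    """1D piecewise-linear stiffness matrix with homogeneous Dirichlet data, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / h

for n in (15, 31, 63, 127):
    A = stiffness_1d(n)
    print(f"h = 1/{n + 1:4d}   cond(A) = {np.linalg.cond(A):12.1f}")
```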

We will consider the same type of stiffness matrices for subdomains Ω_i, obtained by integrating over the subset Ω_i ⊂ Ω. These matrices will be important building blocks for our finite element models and domain decomposition algorithms.

Page 9

Two Subdomains

Figure 1: Partition into two non-overlapping subdomains.

Page 10

Consider a domain Ω subdivided into two non-overlapping subdomains Ω_1 and Ω_2, with the interface Γ in between.

Consider a finite element approximation of a Poisson problem on Ω (or a scalar elliptic, linear elasticity, or even an incompressible Stokes problem).

Set up a load vector and a stiffness matrix for each subdomain:

f^{(i)} = \begin{pmatrix} f_I^{(i)} \\ f_\Gamma^{(i)} \end{pmatrix}, \qquad A^{(i)} = \begin{pmatrix} A_{II}^{(i)} & A_{I\Gamma}^{(i)} \\ A_{\Gamma I}^{(i)} & A_{\Gamma\Gamma}^{(i)} \end{pmatrix}, \qquad i = 1, 2.

Homogeneous Dirichlet condition on ∂Ω_i \ Γ, Neumann on Γ.

Page 11

Subassemble:

A = \begin{pmatrix} A_{II}^{(1)} & 0 & A_{I\Gamma}^{(1)} \\ 0 & A_{II}^{(2)} & A_{I\Gamma}^{(2)} \\ A_{\Gamma I}^{(1)} & A_{\Gamma I}^{(2)} & A_{\Gamma\Gamma} \end{pmatrix}, \qquad u = \begin{pmatrix} u_I^{(1)} \\ u_I^{(2)} \\ u_\Gamma \end{pmatrix}, \qquad f = \begin{pmatrix} f_I^{(1)} \\ f_I^{(2)} \\ f_\Gamma \end{pmatrix}.

A_{\Gamma\Gamma} = A_{\Gamma\Gamma}^{(1)} + A_{\Gamma\Gamma}^{(2)}. The degrees of freedom are partitioned into those internal to Ω_1, those internal to Ω_2, and those on Γ.

This is a simple example of how stiffness matrices are assembled from those of the subdomains; we add quadratic forms representing the energy contributed by the subdomains.

Eliminate the interior unknowns. This gives two Schur complements:

S^{(i)} := A_{\Gamma\Gamma}^{(i)} - A_{\Gamma I}^{(i)} \big(A_{II}^{(i)}\big)^{-1} A_{I\Gamma}^{(i)}, \qquad i = 1, 2.

Page 12

The given system can then be reduced to

S u_\Gamma = (S^{(1)} + S^{(2)}) u_\Gamma = g_\Gamma.

If we use exact solvers for the subdomain problems, we can often reduce our discussion to one about Schur complement problems. We can also take advantage of the reduction in dimension of the Krylov space vectors. Once the interface values are approximated well enough, we can find the values in the interiors by solving a Dirichlet problem for each subdomain. The condition number of a Schur complement of a positive definite symmetric matrix A is always smaller than that of A. In our particular context, the Schur complements will have a condition number on the order of Ch^{-1}. This bound is sharp.
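
To make the two-subdomain reduction concrete, here is a minimal numerical sketch (not part of the slides) for the 1D model problem -u'' = 1 on (0, 1) with homogeneous Dirichlet data, split at the single interface node x = 1/2, so both Schur complements are 1×1; all names are hypothetical.

```python
import numpy as np

def stiffness_1d(n_interior, h):
    """Piecewise-linear stiffness matrix for n_interior nodes, mesh size h."""
    n = n_interior
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / h

m = 16                      # elements per subdomain; interface node at x = 1/2
h = 1.0 / (2 * m)
n_I = m - 1                 # interior nodes per subdomain
A_II = stiffness_1d(n_I, h)                          # same matrix for both subdomains
A_IG = np.zeros((n_I, 1)); A_IG[-1, 0] = -1.0 / h    # coupling of last interior node to Γ
A_GG_i = np.array([[1.0 / h]])                        # each subdomain contributes 1/h to A_ΓΓ
f_I = h * np.ones((n_I, 1))                           # load f = 1
f_G = h * np.ones((1, 1))                             # total interface load (h/2 per side)

# Subdomain Schur complements and the reduced right-hand side g_Γ.
S = [A_GG_i - A_IG.T @ np.linalg.solve(A_II, A_IG) for _ in range(2)]
g = sum(f_G / 2 - A_IG.T @ np.linalg.solve(A_II, f_I) for _ in range(2))
u_G = np.linalg.solve(S[0] + S[1], g)                 # interface value

# Recover the interior values by one Dirichlet solve per subdomain.
u_I = np.linalg.solve(A_II, f_I - A_IG @ u_G)
print("u at the interface x = 0.5:", u_G[0, 0], " (exact: 0.125)")
```

Because the interiors of the two subdomains are decoupled, the two solves with A_II can be carried out independently, which is the parallelism discussed below.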

Page 13

It is easy to see that the product of S^{(i)} times a vector can be obtained at essentially the cost of solving a Dirichlet problem; the elements of the Schur complements need not be computed. This is in contrast to the use of Cholesky's method for the entire problem. It is known that for any symmetric permutation P, factoring P^T A P will require at least quadratic work for a three-dimensional finite element matrix A.

The product of S with a vector, as needed when computing a residual, can then be assembled from matrix-vector products with the two subdomain Schur complements.

An important family of domain decomposition methods is the iterative substructuring methods; the vocabulary is borrowed from structural engineering.

Page 14

By solving a problem with the matrix A^{(i)} and a right hand side of the form (0, f_\Gamma)^T, we obtain a solution with the second component equal to S^{(i)-1} f_\Gamma; this is an easy exercise on block Gaussian elimination.

A solution of a system with such a right hand side is discrete harmonic. It is A^{(i)}-orthogonal to any v which vanishes on Γ; it therefore provides the minimal energy extension for given values on Γ.
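
The following small sketch (an illustration under stated assumptions, not from the slides) checks both statements on a random symmetric positive definite matrix written in the block form used above; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical SPD subdomain matrix in block form [[A_II, A_IG], [A_GI, A_GG]].
n_I, n_G = 8, 3
M = rng.standard_normal((n_I + n_G, n_I + n_G))
A = M @ M.T + (n_I + n_G) * np.eye(n_I + n_G)            # SPD by construction
A_II, A_IG = A[:n_I, :n_I], A[:n_I, n_I:]
A_GI, A_GG = A[n_I:, :n_I], A[n_I:, n_I:]

S = A_GG - A_GI @ np.linalg.solve(A_II, A_IG)            # Schur complement

# Solve with right-hand side (0, f_G)^T: the Γ component equals S^{-1} f_G.
f_G = rng.standard_normal(n_G)
rhs = np.concatenate([np.zeros(n_I), f_G])
u = np.linalg.solve(A, rhs)
print(np.allclose(u[n_I:], np.linalg.solve(S, f_G)))      # True

# Discrete harmonic extension of given interface values u_G:
# u_I = -A_II^{-1} A_IG u_G minimizes the energy for fixed u_G.
u_G = rng.standard_normal(n_G)
u_harm = np.concatenate([-np.linalg.solve(A_II, A_IG @ u_G), u_G])
v = np.concatenate([rng.standard_normal(n_I), np.zeros(n_G)])   # any v vanishing on Γ
print(np.isclose(v @ (A @ u_harm), 0.0))                  # A-orthogonality, True
```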

Matrix-vector multiplications with S^{(i)} and S^{(i)-1} are completely local operations, and it does not matter if we have two or many more subdomains; we can use one processor for each subdomain problem and work in parallel.

Page 15

Coupled system of PDE

Consider the Poisson equation on Ω, 2D or 3D, with zero Dirichlet data on ∂Ω, the boundary of Ω. Ω is partitioned into two non-overlapping subdomains Ω_i:

\Omega = \Omega_1 \cup \Omega_2, \quad \Omega_1 \cap \Omega_2 = \emptyset, \quad \Gamma = \partial\Omega_1 \cap \partial\Omega_2;

\mathrm{measure}(\partial\Omega_1 \cap \partial\Omega) > 0, \quad \mathrm{measure}(\partial\Omega_2 \cap \partial\Omega) > 0.

Assume that the boundaries of the subdomains are Lipschitz.

-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega.

Under suitable assumptions on f (square integrable) and on the boundaries of the subdomains (Lipschitz), the Poisson problem is equivalent to a coupled problem:

Page 16

-\Delta u_1 = f \ \text{in } \Omega_1, \qquad u_1 = 0 \ \text{on } \partial\Omega_1 \setminus \Gamma,

u_1 = u_2 \ \text{on } \Gamma, \qquad \frac{\partial u_1}{\partial n_1} = -\frac{\partial u_2}{\partial n_2} \ \text{on } \Gamma,

-\Delta u_2 = f \ \text{in } \Omega_2, \qquad u_2 = 0 \ \text{on } \partial\Omega_2 \setminus \Gamma.

Here u_i is the restriction of u to Ω_i and n_i the outward normal to Ω_i. The conditions on the interface Γ are transmission conditions. They are equivalent to the equality of any two independent linear combinations of the traces of the functions and their normal derivatives. By eliminating the interior variables, the transmission conditions give Poincaré-Steklov operators similar to Schur complements.

Page 17

Refer to the normal derivative as the flux.

Approximation of the flux: φ_j is a nodal basis function for a node on Γ:

\int_\Gamma \frac{\partial u_i}{\partial n_i} \varphi_j \, ds = \int_{\Omega_i} \big( \Delta u_i \, \varphi_j + \nabla u_i \cdot \nabla \varphi_j \big) \, dx = \int_{\Omega_i} \big( -f \varphi_j + \nabla u_i \cdot \nabla \varphi_j \big) \, dx.

In finite element language:

\lambda^{(i)} = A_{\Gamma I}^{(i)} u_I^{(i)} + A_{\Gamma\Gamma}^{(i)} u_\Gamma^{(i)} - f_\Gamma^{(i)}.

This coincides with the residual, at the nodes on Γ, of a Poisson problem with a Neumann condition on Γ.

Setting fluxes equal gives the third equation in the previous block system.

Page 18

A Dirichlet-Neumann Method

In terms of differential operators, for n ≥ 0:

(D)  -\Delta u_1^{n+1/2} = f \ \text{in } \Omega_1, \qquad u_1^{n+1/2} = 0 \ \text{on } \partial\Omega_1 \setminus \Gamma, \qquad u_1^{n+1/2} = u_\Gamma^n \ \text{on } \Gamma,

(N)  -\Delta u_2^{n+1} = f \ \text{in } \Omega_2, \qquad u_2^{n+1} = 0 \ \text{on } \partial\Omega_2 \setminus \Gamma, \qquad \frac{\partial u_2^{n+1}}{\partial n_2} = -\frac{\partial u_1^{n+1/2}}{\partial n_1} \ \text{on } \Gamma,

u_\Gamma^{n+1} = \theta u_2^{n+1} + (1 - \theta) u_\Gamma^n \ \text{on } \Gamma.

We can also use conjugate gradients.

Page 19

After working with the matrices, we find that the finite element version gives:

S^{(2)} (u_\Gamma^{n+1} - u_\Gamma^n) = \theta (g_\Gamma - S u_\Gamma^n).

Preconditioned operator: S^{(2)-1} S = I + S^{(2)-1} S^{(1)}.

Theory: Prove that S^{(1)} is spectrally equivalent to S^{(2)}, i.e.,

c \, u_\Gamma^T S^{(2)} u_\Gamma \le u_\Gamma^T S^{(1)} u_\Gamma \le C \, u_\Gamma^T S^{(2)} u_\Gamma.

Then use the right inequality.
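
As a small illustration (hypothetical 1D setup, not from the slides), the matrix form above can be run as a preconditioned Richardson iteration on the two-subdomain example where S^{(1)} = S^{(2)}; the error then contracts by the factor |1 - θ(1 + S^{(2)-1}S^{(1)})| per step.

```python
import numpy as np

# Hypothetical setup: -u'' = 1 on (0,1), u(0)=u(1)=0, interface at x = 1/2,
# so the subdomain Schur complements S1, S2 are scalars and S1 = S2 by symmetry.
m = 16
h = 1.0 / (2 * m)
T = (np.diag(2.0 * np.ones(m - 1)) - np.diag(np.ones(m - 2), 1) - np.diag(np.ones(m - 2), -1)) / h
a_ig = np.zeros((m - 1, 1)); a_ig[-1, 0] = -1.0 / h
S1 = S2 = 1.0 / h - (a_ig.T @ np.linalg.solve(T, a_ig))[0, 0]
S = S1 + S2
g = h - 2.0 * (a_ig.T @ np.linalg.solve(T, h * np.ones((m - 1, 1))))[0, 0]

# Dirichlet-Neumann step in matrix form: S2 (u^{n+1} - u^n) = theta (g - S u^n).
theta, u, u_exact = 0.6, 0.0, g / S
for n in range(6):
    u = u + theta * (g - S * u) / S2
    print(n, abs(u - u_exact))    # error is multiplied by |1 - theta*(1 + S1/S2)| = 0.2
```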

Page 20

Neumann-Neumann and FETI algorithms can be described using the same framework. The preconditioner for N-N: S^{(1)-1} + S^{(2)-1}. The preconditioned FETI operator is

(S^{(1)} + S^{(2)}) (S^{(1)-1} + S^{(2)-1}).

The proofs of the optimality of these methods reduce to an extension theorem for finite elements: given any value on Γ, estimate the energy contributed by one subdomain in terms of that of the other.

Page 21

Many subdomains

Now Ω is partitioned into a family of non-overlapping subdomains Ω_i, 1 ≤ i ≤ N, with

\Omega = \bigcup_i \Omega_i; \qquad \Omega_i \cap \Omega_j = \emptyset, \quad i \neq j.

If Γ_i = ∂Ω_i \ ∂Ω, the interface Γ is defined as

\Gamma = \bigcup_i \Gamma_i.

The linear system is written as

\begin{pmatrix} A_{II} & A_{I\Gamma} \\ A_{\Gamma I} & A_{\Gamma\Gamma} \end{pmatrix} \begin{pmatrix} u_I \\ u_\Gamma \end{pmatrix} = \begin{pmatrix} f_I \\ f_\Gamma \end{pmatrix}.

Page 22

The interior degrees of freedom are collected in u_I and those on Γ in u_Γ. Block Gauss elimination, in parallel across the subdomains, gives

\begin{pmatrix} A_{II} & A_{I\Gamma} \\ 0 & S \end{pmatrix} \begin{pmatrix} u_I \\ u_\Gamma \end{pmatrix} = \begin{pmatrix} f_I \\ g_\Gamma \end{pmatrix}.

The Schur complement S and the vector g_Γ are subassembled from subdomain quantities. The restriction operators R_i, of zeros and ones, map values on Γ onto those on Γ_i := ∂Ω_i ∩ Γ. Then,

S = \sum_{i=1}^N R_i^T S^{(i)} R_i, \qquad g_\Gamma = \sum_{i=1}^N R_i^T \Big( f_\Gamma^{(i)} - A_{\Gamma I}^{(i)} \big(A_{II}^{(i)}\big)^{-1} f_I^{(i)} \Big).

Page 23

Neumann-Neumann and Dirichlet-Neumann

How to precondition S? Try, for N-N,

S_{NN}^{-1} S = \sum_{i=1}^N R_i^T S^{(i)-1} R_i \, S.

Not scalable; there is no mechanism for global transportation of information across the domain in each iteration step. The number of steps required for good progress with conjugate gradients is at least on the order of 1/H. Also, some S^{(i)} are singular.

Color the subdomains red and black. Use Dirichlet conditions on the black subdomains and Neumann on the red, and glue the red subdomains together at the cross points. This gives a scalable algorithm in 2D. Condition number bound: C(1 + log(H/h))².

Page 24

Figure 2: Red-black coloring of the subdomains.

Page 25

The Schwarz Alternating Method

[Figure: two overlapping subdomains Ω′_1 and Ω′_2, with boundary pieces Γ_1 and Γ_2 and regions Ω_1, Ω_3 (the overlap), and Ω_2.]

Dates back to 1870 and H.A. Schwarz.

Page 26

Given an initial guess u^0, which vanishes on ∂Ω, the iterate u^{n+1} is determined from the previous iterate u^n in two sequential steps:

-\Delta u^{n+1/2} = f \ \text{in } \Omega'_1, \qquad u^{n+1/2} = u^n \ \text{on } \partial\Omega'_1, \qquad u^{n+1/2} = u^n \ \text{in } \Omega_2 = \Omega'_2 \setminus \Omega'_1,

-\Delta u^{n+1} = f \ \text{in } \Omega'_2, \qquad u^{n+1} = u^{n+1/2} \ \text{on } \partial\Omega'_2, \qquad u^{n+1} = u^{n+1/2} \ \text{in } \Omega_1 = \Omega'_1 \setminus \Omega'_2.

We can also write this algorithm in terms of projections onto subspaces:

u^{n+1} - u = (I - P_2)(u^{n+1/2} - u) = (I - P_2)(I - P_1)(u^n - u),

where P_i := R_i^T A_i^{-1} R_i A. This is the basic multiplicative Schwarz method.

Page 27

Extends immediately to many subdomains by recursion; Schwarz did. We are solving

P_{mu} u := (P_1 + P_2 - P_2 P_1) u = g.

We could simplify and just use the two linear terms. We get the basic additive (parallel) Schwarz method:

P_{ad} u := (P_1 + P_2) u = g_{ad}.

This is a symmetric operator even for more than two subdomains.

There are other symmetric Schwarz methods such as, for three subdomains,

(I - P_1)(I - P_2)(I - P_3)(I - P_2)(I - P_1).

Page 28

There are at least three ways of analyzing the Schwarz methods.

Schwarz used a maximum principle in 1870.

We can use an abstract Schwarz theory to be briefly discussed.

For two subdomains, one can also argue in terms of Schur complements and show

e_{\Gamma_1}^{n+1} = \Big( I - \big(S_{\Gamma_1}^{(2)} + S_{\Gamma_1}^{(3)}\big)^{-1} \big(S_{\Gamma_1}^{(1)} + S_{\Gamma_1}^{(2)}\big) \Big) e_{\Gamma_1}^{n}.

We can view the iteration in terms of an update of the values on Γ_1. The Schur complements correspond to Ω'_1, Ω_2, and Ω_3.

Page 29

Block Jacobi Preconditioners

Precondition A by

A_J^{-1} = \begin{pmatrix} A_1^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & A_2^{-1} \end{pmatrix} = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}^{-1}.

Here A_i = R_i A R_i^T, i = 1, 2, and the space is split into two subspaces: V = R_1^T V_1 ⊕ R_2^T V_2. We can write the preconditioned operator as P_{ad} = A_J^{-1} A, i.e., as an additive Schwarz operator. We can also introduce overlap to enhance the convergence. Note that A_J is obtained by a classical splitting, by removing the off-diagonal blocks. With overlap, the formulas need to be modified a little. We can of course also use three or more subregions.

Page 30

The convergence rate of the block Jacobi method can be estimated by a generalized Rayleigh quotient. We find that

u^T A P_{ad}^{-1} u = u^T \big( R_1^T A_1^{-1} R_1 + R_2^T A_2^{-1} R_2 \big)^{-1} u = u^T \big( R_1^T A_1 R_1 + R_2^T A_2 R_2 \big) u = u_1^T A_1 u_1 + u_2^T A_2 u_2.

An estimate

u_1^T A_1 u_1 + u_2^T A_2 u_2 \le C_0^2 \, u^T A u,

provides the bound

\sup_{u \in V} \frac{u^T A P_{ad}^{-1} u}{u^T A u} \le C_0^2,

and thus a lower bound for the smallest eigenvalue of P_{ad}. There is an upper bound of 2; P_{ad} is a sum of two projections.

Page 31

The same type of bounds are equally relevant for overlapping domains. There is an abstract, relatively elementary theory: estimate C_0^2 such that

\sum_{i=0}^{N} a(R_i^T u_i, R_i^T u_i) \le C_0^2 \, a(u, u), \qquad \forall u \ \text{with} \ u = \sum_{i=0}^{N} R_i^T u_i, \quad u_i \in V_i.

The smallest such C_0^2 equals 1/\lambda_{\min}(P_{ad}):

a(u, u) = \sum_i a(u, R_i^T u_i) = \sum_i a(P_i u, R_i^T u_i) \le \Big( \sum_i a(P_i u, P_i u) \Big)^{1/2} \Big( \sum_i a(R_i^T u_i, R_i^T u_i) \Big)^{1/2}.

Since the P_i are a-orthogonal projections, \sum_i a(P_i u, P_i u) = a(P_{ad} u, u); combining this with the estimate above gives a(u, u) \le C_0^2 \, a(P_{ad} u, u).

Page 32

An upper bound for P_{ad} can be obtained by coloring the subspaces V_i.

Overlapping Schwarz methods, for many subdomains Ω'_i, should be improved by introducing a coarse component of the preconditioner, defined on a coarse triangulation T_H and with a coarse space V_0 = V^H. This can be done even without T_h being a refinement of T_H, at the cost of more complicated programming. The coarse mesh size should be comparable to the diameters of the subdomains. The basic, sharp result for second order elliptic problems is

\kappa(P_{ad}) \le C \Big( 1 + \frac{H}{\delta} \Big),

where δ measures the overlap between neighboring subdomains. The proof does not work if the material properties change a lot across the interface Γ.

Page 33

In the decomposition, the local components can be defined by

u_i = R_i \big( I^h(\theta_i w) \big) \in V_i, \qquad 1 \le i \le N,

where w = u - u_0. Here u_0 ∈ V_0 is a coarse interpolant or quasi-interpolant of u; this operator should reproduce constants. The θ_i form a piecewise linear partition of unity associated with the overlapping partition. We have \sum_{i=1}^{N} \theta_i(x) = 1 and |\nabla \theta_i| \le C/\delta. In the proof, we use Poincaré's and Friedrichs' inequalities. Without a coarse component, we will have large L²-terms and poor convergence rates: consider ∇(θ_i u); we obtain a large coefficient in front of one term.

Page 34

Linear Elasticity

Find the displacement u ∈ V of the domain Ω, fixed along ∂Ω_D, with a surface force of density g along ∂Ω_N = ∂Ω \ ∂Ω_D, and a body force f:

2 \int_\Omega \mu \, \epsilon(u) : \epsilon(v) \, dx + \int_\Omega \lambda \, \mathrm{div}\, u \, \mathrm{div}\, v \, dx = \langle F, v \rangle \qquad \forall v \in V.

Here λ(x) and µ(x) are the Lamé parameters and

\epsilon_{ij}(u) = \frac{1}{2} \Big( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \Big)

is the linearized strain tensor.

Page 35

Two inner products are defined by

\epsilon(u) : \epsilon(v) = \sum_{i=1}^{3} \sum_{j=1}^{3} \epsilon_{ij}(u) \, \epsilon_{ij}(v),

\langle F, v \rangle = \int_\Omega \sum_{i=1}^{3} f_i v_i \, dx + \int_{\partial\Omega_N} \sum_{i=1}^{3} g_i v_i \, dA.

The Lamé parameters expressed in terms of the Poisson ratio ν and Young's modulus E:

\lambda = \frac{E\nu}{(1 + \nu)(1 - 2\nu)}, \qquad \mu = \frac{E}{2(1 + \nu)}.

When ν → 1/2, we go to the incompressible limit.
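
A quick numeric illustration of this limit (not from the slides; values are hypothetical):

```python
import numpy as np

def lame_parameters(E, nu):
    """Lamé parameters (lambda, mu) from Young's modulus E and Poisson ratio nu."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

E = 1.0
for nu in (0.3, 0.45, 0.49, 0.499):
    lam, mu = lame_parameters(E, nu)
    print(f"nu = {nu:6.3f}   lambda = {lam:10.2f}   mu = {mu:6.3f}")
# lambda blows up as nu -> 1/2 while mu stays bounded: the incompressible limit.
```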

Page 36

Rigid Body Modes and Korn’s Inequality

For n = 3, there are six rigid body modes with zero energy: three translations

r_1 := \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad r_2 := \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad r_3 := \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},

and three rotations

r_4 := \frac{1}{H_i} \begin{pmatrix} 0 \\ -x_3 + \hat{x}_3 \\ x_2 - \hat{x}_2 \end{pmatrix}, \quad r_5 := \frac{1}{H_i} \begin{pmatrix} x_3 - \hat{x}_3 \\ 0 \\ -x_1 + \hat{x}_1 \end{pmatrix}, \quad r_6 := \frac{1}{H_i} \begin{pmatrix} -x_2 + \hat{x}_2 \\ x_1 - \hat{x}_1 \\ 0 \end{pmatrix},

where \hat{x} is a shift.

Page 37

Poincaré's inequality is replaced by Korn's second inequality

\|v\|^2_{H^1(\Omega_i)} \le C \Big( a_i(v, v) + \frac{1}{H_i^2} \|v\|^2_{L^2(\Omega_i)} \Big).

We also have, more importantly, with RB the space of rigid body modes,

\inf_{r \in RB} \|v - r\|^2_{H^1(\Omega_i)} \le C \, a_i(v, v).

Page 38

Figure 3: Finite element meshing of a mechanical object.

Page 39

Figure 4: Partition into thirty subdomains. Courtesy Charbel Farhat.

Page 40

These subdomains effectively provide our coarse mesh.

Faces, edges, and vertices of quite general subdomains can be defined in terms of certain equivalence classes of finite element nodes. These geometric objects are central in an alternative construction of coarse problems and in the theory. They are also highly relevant for parallel computing.

We will use face, edge, and vertex functions, providing a partition of unity on the interface. A face function θ_{F^{ij}} equals 1 at all nodes of a face common to two subdomains Ω_i and Ω_j and vanishes at all other interface nodes. These functions are extended as discrete harmonic functions, i.e., with minimal elastic energy; this determines the values at the interior nodes. Similarly, we have edge functions and vertex functions. The restrictions of the rigid body modes, all linear functions, to faces and edges are used for problems of elasticity. The coarse space needs to accommodate all rigid body modes.

Page 41

Alternative Overlapping Schwarz Methods

Consider a scalar elliptic problem, in three dimensions, and a coarse space which is the range of the interpolation operator

I_B^h u(x) = \sum_{V^k \in \Gamma} u(V^k) \, \theta_{V^k}(x) + \sum_{E^\ell \subset \Gamma} u_{E^\ell} \, \theta_{E^\ell}(x) + \sum_{F^{ij} \subset \Gamma} u_{F^{ij}} \, \theta_{F^{ij}}(x).

Here u_{E^\ell} and u_{F^{ij}} are averages over the edges and faces of the subdomains. θ_{V^k}(x) is essentially the standard nodal basis function of a vertex of the subdomains, θ_{E^\ell}(x) = 1 at the nodes of the edge E^\ell and vanishes at all other interface nodes, and θ_{F^{ij}}(x) is the similar function already defined for the face F^{ij}. These functions are extended as discrete harmonic functions into the interior of the subdomains. This interpolation operator I_B^h reproduces constants.

Page 42

For "nice enough" subregions, we have

|u - I_B^h u|^2_{H^1(\Omega_i)} \le C (1 + \log(H_i/h_i)) \, |u|^2_{H^1(\Omega_i)}.

We use the Cauchy-Schwarz inequality, a trace theorem, bounds on face and edge functions, and finite element Sobolev inequalities. We can now also handle coefficient jumps across the interface.

As noted, good spaces for elasticity are obtained by multiplying the rigid body modes by the face and edge functions. The interpolation operator can then preserve all rigid body modes. The coefficients are built from averages and first order moments. This results in a large coarse space.

Page 43

We can shrink the coarse space by replacing vertex, edge, and face contributions by fewer terms. The new coarse basis functions are defined as linear combinations of those of the larger space and in terms of simple least squares problems. The dimension of this coarse space can be made about half of that of the older one.

Submatrices of the assembled stiffness matrices can be used to compute the interior values of the basis elements of the coarse space.

Page 44

The region is also covered by overlapping subregions Ω'_i. δ_i/H_i measures the relative overlap between adjacent subregions, each of which is a union of elements. The local spaces chosen for the Schwarz methods are

V_i = V^h \cap H_0^1(\Omega'_i), \qquad i > 0.

The standard overlapping subdomains Ω'_i are obtained by repeatedly adding layers of elements, starting with Ω_i.

Another interesting choice is to work with the Ω_i and Ω_{iδ}, the latter obtained by adding layers of elements on both sides of Γ_i := ∂Ω_i ∩ Γ. By using a hybrid Schwarz method, we can make all residuals interior to the Ω_i vanish in each step.

Page 45

Schwarz Methods

Schwarz 1870, Pierre-Louis Lions 1987: (I - P_2)(I - P_1).

Standard two-level additive, 1988: P_0 + \sum_{i \ge 1} P_i.

Standard hybrid, e.g., as in balancing N-N: P_0 + (I - P_0) \sum_{i \ge 1} P_i (I - P_0).

New hybrid: (I - \sum_{i \ge 1} P_i)(P_0 + \sum_{i \ge 1} P_{i\delta})(I - \sum_{i \ge 1} P_i).

Note that (I - \sum_{i \ge 1} P_i) is a projection since the subdomains Ω_i do not intersect. Therefore, after the first iteration, we need only apply this operator once per step.

Also note that the residuals vanish in the interior of the subdomains, which allows us to save storage.

Page 46

Result for Overlapping Schwarz Method

Theorem. The condition number of the preconditioned operator P_{ad} satisfies

\kappa(P_{ad}) \le C (1 + H/\delta)(1 + \log(H/h)).

Here C is independent of the mesh size, the number of subdomains, and the Lamé parameters, as long as the material is compressible. H/δ measures the relative overlap between neighboring overlapping subregions, and H/h the maximum number of elements across any subregion.

What needs to be done in the almost incompressible case? That is the story for another day.

Page 47

FETI and FETI–DP

Introduce Lagrange multipliers λ ∈ U := range(B_Γ) and consider the problem:

Find (u, λ) ∈ W × U such that

A u + B_\Gamma^T \lambda = f,
B_\Gamma u = 0.

Eliminate the displacement u by block Gauss elimination; solve the resulting Schur system by PCG. The block diagonal matrix A is, in general, only positive semidefinite. Enforce continuity constraints on the primal displacement variables u_Π throughout the iterations (as in primal methods); the other constraints, on u_∆, are enforced by the Lagrange multipliers λ. The local problems are invertible; the primal variables provide a coarse problem.
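
The elimination step can be written out explicitly in a toy setting. The sketch below (a hypothetical 1D example, not from the slides, and one-level FETI rather than FETI-DP, so that A is invertible because every subdomain touches the Dirichlet boundary) duplicates the interface unknown, enforces equality with a jump operator B, and solves the resulting Schur system for λ directly.

```python
import numpy as np

# -u'' = 1 on (0,1), u(0) = u(1) = 0, split at x = 1/2; each subdomain keeps its
# own copy of the interface node. All names are hypothetical.
m = 8                      # dofs per subdomain (interior nodes plus one interface copy)
h = 1.0 / (2 * m)

def subdomain_matrix(size, h, interface_side):
    """Subdomain stiffness: Dirichlet at the outer end, free (Neumann) at the interface."""
    K = (np.diag(2.0 * np.ones(size)) - np.diag(np.ones(size - 1), 1)
         - np.diag(np.ones(size - 1), -1)) / h
    if interface_side == "right":
        K[-1, -1] = 1.0 / h
    else:
        K[0, 0] = 1.0 / h
    return K

A = np.zeros((2 * m, 2 * m))
A[:m, :m] = subdomain_matrix(m, h, "right")      # Ω_1: interface copy is its last dof
A[m:, m:] = subdomain_matrix(m, h, "left")       # Ω_2: interface copy is its first dof

f = h * np.ones(2 * m)
f[m - 1] = f[m] = h / 2.0                        # each copy gets half the interface load

B = np.zeros((1, 2 * m))                         # jump operator: u_1 - u_2 at the interface
B[0, m - 1], B[0, m] = 1.0, -1.0

# Eliminate u: (B A^{-1} B^T) lambda = B A^{-1} f, then u = A^{-1}(f - B^T lambda).
Ainv_f = np.linalg.solve(A, f)
Ainv_Bt = np.linalg.solve(A, B.T)
lam = np.linalg.solve(B @ Ainv_Bt, B @ Ainv_f)
u = np.linalg.solve(A, f - B.T @ lam)

print("interface copies match:", np.isclose(u[m - 1], u[m]))
print("u at x = 1/2:", u[m - 1], " (exact value 1/8 for -u'' = 1)")
```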

Page 48

Two dimensions. Maintain continuity of the primal variables at the vertices (subassembly); enforce continuity constraints elsewhere by Lagrange multipliers.

[Figure: four subdomains labeled i, j, k, and l meeting at a common vertex.]

\begin{pmatrix}
A_{II}^{(1)} & A_{I\Delta}^{(1)} & & & A_{\Pi I}^{(1)T} \\
A_{\Delta I}^{(1)} & A_{\Delta\Delta}^{(1)} & & & A_{\Pi\Delta}^{(1)T} \\
& & \ddots & & \vdots \\
& & & A_{II}^{(N)} \quad A_{I\Delta}^{(N)} & A_{\Pi I}^{(N)T} \\
& & & A_{\Delta I}^{(N)} \quad A_{\Delta\Delta}^{(N)} & A_{\Pi\Delta}^{(N)T} \\
A_{\Pi I}^{(1)} & A_{\Pi\Delta}^{(1)} & \cdots & A_{\Pi I}^{(N)} \quad A_{\Pi\Delta}^{(N)} & A_{\Pi\Pi}
\end{pmatrix}

With the interior and dual variables collected in u_B, the constrained system can be written as

\begin{pmatrix} A_{BB} & A_{\Pi B}^T & B^T \\ A_{\Pi B} & A_{\Pi\Pi} & O \\ B & O & O \end{pmatrix}
\begin{pmatrix} u_B \\ u_\Pi \\ \lambda \end{pmatrix}
=
\begin{pmatrix} f_B \\ f_\Pi \\ 0 \end{pmatrix}

Page 49

FETI–DP in 3D

Good numerical results in 2D; not always very good in 3D. Idea: in addition to (or instead of) continuity of the primal variables at vertices, constrain certain average values (and moments) of the primal variables over individual edges and faces to take common values across the interface.

For scalar second order elliptic equations, this approach yields a condition number estimate C(1 + log(H/h))² for certain families of algorithms. The result is independent of jumps in the coefficients, if the scaling is chosen carefully. There are algorithms with quite small coarse problems, i.e., relatively few primal constraints.

Reliable recipes exist to select sets of primal constraints for elasticity in 3D, primarily using edge averages and first order moments as primal constraints. High quality PETSc-based codes have been developed and successfully tested on very large parallel computing systems.

Page 50

N-N Methods of Same Flavor: BDDC

Introduce a coarse basis function for each primal constraint; set one primal variable = 1 and all others = 0, one at a time. Extend with minimum energy, one subdomain at a time. This results in basis functions that are discontinuous across the interface Γ.

One local subspace for each subdomain. Set all relevant primal degrees of freedom = 0. The local problems are invertible.

The partially subassembled Schur complement of the system is block diagonal. Apply E_D^T to the residual. Solve the linear systems corresponding to these blocks exactly, and compute a weighted average, with the operator E_D, of the results across the interface. Only one block is assembled. Compute the residual, remove the interior residuals, and repeat the coarse and local solves. Accelerate with the preconditioned conjugate gradient method. The theory can be focused on an estimate of the norm of E_D.

Page 51

Matrix Analysis of FETI–DP and BDDC

Consider three product spaces of finite element functions/vectors of nodal values:

\widehat{W}_\Gamma \subset \widetilde{W}_\Gamma \subset W.

W: no constraints; \widehat{W}_\Gamma: continuity at every point on Γ; \widetilde{W}_\Gamma: common values of the primal variables.

Change variables, explicitly introducing the primal variables and complementary sets of dual displacement variables. Write the Schur complements as

S^{(i)} = \begin{pmatrix} S_{\Delta\Delta}^{(i)} & S_{\Delta\Pi}^{(i)} \\ S_{\Pi\Delta}^{(i)} & S_{\Pi\Pi}^{(i)} \end{pmatrix}.

Let \widetilde{S}_\Gamma denote the partially assembled Schur complement. (In practice, we work with the interior variables as well when solving linear systems.)

Page 52

BDDC matrices

For the BDDC method, we use the fully assembled Schur complement R_\Gamma^T \widetilde{S}_\Gamma R_\Gamma; it is used to compute the residual. Applying the preconditioner involves solving a system with the matrix \widetilde{S}_\Gamma:

M_{BDDC}^{-1} = R_{D\Gamma}^T \widetilde{S}_\Gamma^{-1} R_{D\Gamma},

where R_{D\Gamma} is a scaled variant of R_\Gamma with scale factors computed from the PDE coefficients.

The scaling is chosen so that E_D := R_\Gamma R_{D\Gamma}^T is a projection.

Page 53

FETI–DP Matrices

The basic operator is now B_\Delta S^{-1} B_\Delta^T. Here S is a Schur complement of \widetilde{S}_\Gamma obtained after eliminating all the primal variables. It is elementary to show that S^{-1} = R_{\Gamma\Delta} \widetilde{S}_\Gamma^{-1} R_{\Gamma\Delta}^T, where R_{\Gamma\Delta} removes the primal part of a vector defined on Γ.

The preconditioner is now

M_{FETI}^{-1} = B_{D\Delta} S_{\Delta\Delta} B_{D\Delta}^T,

where S_{\Delta\Delta} = R_{\Gamma\Delta} \widetilde{S}_\Gamma R_{\Gamma\Delta}^T is the ∆ block of \widetilde{S}_\Gamma and B_{D\Delta} is a scaled jump operator. The scale factors depend on the PDE coefficients.

The scaling is chosen so that P_D := R_{\Gamma\Delta}^T B_{D\Delta}^T B_\Delta R_{\Gamma\Delta} is a projection.

Also, E_D + P_D = I and E_D P_D = P_D E_D = 0.

Page 54

Same Eigenvalues

FETI–DP preconditioned operator:

B_{D\Delta} S_{\Delta\Delta} B_{D\Delta}^T \cdot B_\Delta S^{-1} B_\Delta^T = B_{D\Delta} R_{\Gamma\Delta} \widetilde{S}_\Gamma R_{\Gamma\Delta}^T B_{D\Delta}^T \cdot B_\Delta R_{\Gamma\Delta} \widetilde{S}_\Gamma^{-1} R_{\Gamma\Delta}^T B_\Delta^T.

Multiply by R_{\Gamma\Delta}^T B_\Delta^T on the left and remove the same factor on the right:

P_D^T \widetilde{S}_\Gamma P_D \widetilde{S}_\Gamma^{-1}.

BDDC preconditioned operator:

R_{D\Gamma}^T \widetilde{S}_\Gamma^{-1} R_{D\Gamma} \cdot R_\Gamma^T \widetilde{S}_\Gamma R_\Gamma.

Multiply by R_\Gamma on the left and remove the same factor on the right:

E_D \widetilde{S}_\Gamma^{-1} E_D^T \widetilde{S}_\Gamma.

Page 55

Let ϕ be an eigenvector of P_D^T \widetilde{S}_\Gamma P_D \widetilde{S}_\Gamma^{-1} with eigenvalue λ.

Let ψ = E_D \widetilde{S}_\Gamma^{-1} ϕ. Using E_D + P_D = I,

E_D \widetilde{S}_\Gamma^{-1} E_D^T \widetilde{S}_\Gamma \cdot E_D \widetilde{S}_\Gamma^{-1} ϕ = E_D \widetilde{S}_\Gamma^{-1} (I - P_D^T) \widetilde{S}_\Gamma (I - P_D) \widetilde{S}_\Gamma^{-1} ϕ.

Expanding gives three terms:

E_D \widetilde{S}_\Gamma^{-1} P_D^T \widetilde{S}_\Gamma P_D \widetilde{S}_\Gamma^{-1} ϕ = λ \, E_D \widetilde{S}_\Gamma^{-1} ϕ

and

-E_D P_D \widetilde{S}_\Gamma^{-1} ϕ + E_D \widetilde{S}_\Gamma^{-1} (I - P_D^T) ϕ.

E_D P_D = 0, and (I - P_D^T)ϕ = 0 since ϕ ∈ range(P_D^T). Hence the BDDC operator maps ψ to λψ, so λ is also an eigenvalue of the BDDC operator. Similarly, any eigenvalue of the BDDC operator is an eigenvalue of the FETI–DP operator.

Page 56

What we just did is not quite correct. BDDC always has an eigenvalue equal to 1; FETI–DP does not always.

The analysis of BDDC requires a bound of the S-norm of the average operator E_D. Interestingly enough, a main role, in 2D, is played by the special edge functions θ_E and by the finite element extension theorem. Both were encountered previously. In 3D, the face functions θ_F also come into play. A sharp C(1 + log(H/h))² condition number estimate results if the primal constraints and certain scale factors are chosen carefully.
