Introduction to Numerical Analysis for Partial Differential Equations

Norikazu SAITO (齊藤宣一)
The University of Tokyo
http://www.infsup.jp/saito/
2018 Summer Semester (April 2018)
Contents

Introduction. Modeling and analysis

Chapter I. Finite difference method for the heat equation
  1 Heat equation
    1.1 Initial-boundary value problems
    1.2 Uniqueness and maximum principle
    1.3 Construction of a solution and Fourier's method
    1.4 Duhamel's principle
  2 Explicit finite difference scheme
    2.1 Finite difference quotients
    2.2 Explicit scheme
    2.3 Numerical experiments by Scilab
  3 Implicit finite difference schemes
    3.1 Simple implicit scheme
    3.2 The implicit θ scheme
    3.3 Inhomogeneous problems
    3.4 Numerical experiments by Scilab
  4 Convergence and error estimates
    4.1 ℓ∞ analysis
    4.2 ℓ2 analysis
    4.3 Numerical examples
  5 Nonlinear problems
    5.1 Semilinear diffusion equation
    5.2 Explicit scheme
    5.3 Implicit schemes
    5.4 An example: Gray-Scott model
  6 Complement for FDM
    6.1 Non-homogeneous Dirichlet boundary condition
    6.2 Neumann boundary condition
    6.3 ℓ∞ analysis revisited
  Problems and further readings

Chapter II. Finite element method for the Poisson equation
  7 Variational approach for the Poisson equation
    7.1 Dirichlet's principle
    7.2 Galerkin's approximation
  8 Finite element method (FEM)
  9 Tools from Functional Analysis
    9.1 Sobolev spaces
    9.2 Lipschitz domain
    9.3 Lemmas
  10 Weak solution and regularity
    10.1 Weak formulation
    10.2 Regularity of solutions
    10.3 Galerkin's approximation and Cea's lemma
  11 Shape-regularity of triangulations
    11.1 Interpolation error estimates
    11.2 Proof of Lemma 11.1
  12 Error analysis of FEM
  13 Numerical experiments using FreeFem++
    13.1 Examples
    13.2 Convergence rates: regular solutions
    13.3 Convergence rates: singular solutions
  Problems and further readings

Chapter III. Abstract elliptic PDE and Galerkin method
  14 Theory of Lax and Milgram
  15 Galerkin approximation
  16 Applications
    16.1 Convection-diffusion equation
    16.2 Elliptic PDE of the second order
  Problems and further remark

References
Introduction. Modeling and analysis
The aim of this lecture is to give an introduction to the mathematical theory of numerical methods for solving partial differential equations. To accomplish this purpose, I shall concentrate my attention on the following two typical topics:
1. the finite difference method for the one-dimensional heat equation; and

2. the finite element method for the two-dimensional Poisson equation.
I believe that they contain the core idea of numerical methods for PDEs.
Anyway, why does one study Numerical Analysis?
[Diagram: the simulation pipeline. The real world (phenomena: flow around buildings, numerical weather forecasting, derivatives in finance, etc.) is described by a mathematical model (differential equations in continuous variables); approximation (discretization) turns this into a computational model executed on a computer; data visualization feeds the results back to the real world.]
A partial answer (of mine) is as follows.
• The application of computer simulation is expanding to a wide range of phenomena in life sciences, clinical medicine, economics, and other areas beyond the traditional fields of science and technology.

• That expansion is bringing about wide and useful information for use in our life.

• The greater the degree to which computer simulations are used to address complicated and large-scale problems, the greater becomes the demand to find solutions to their related mathematical problems.

• Actually, the process of simulation is not completed inside a computer. It encompasses processes of all kinds, such as modeling of the targeted phenomenon, mathematical analysis of the model, approximation and discretization of differential equations, implementation of algorithms, program writing, visualization of computer output, validation of the simulation by comparison with actual phenomena, and final evaluation of the simulation's reliability.
• All of these branches of processes are connected to the strong trunk of mathematics.

• Summing up, numerical analysis demands the pursuit of mathematical truth and contributes to society through mathematics simultaneously: a quite rewarding activity.
Norikazu Saito
June 7, 2019
I. Finite difference method for the heat equation
1 Heat equation
1.1 Initial-boundary value problems
The heat equation (heat conduction equation) is a simple mathematical model of the heat conduction phenomenon in a thin wire. The wire is assumed to be of unit length. We suppose that a function u = u(x, t) denotes the temperature of the wire at a position x ∈ [0, 1] and a time t ≥ 0.
The heat equation is expressed as
ut = kuxx + f(x, t),
where k is a positive constant, called the heat conduction coefficient, and f(x, t) denotes the supply/absorption of heat. Moreover, we write

ut = ∂u/∂t,  uxx = ∂²u/∂x².
This equation should be considered under one of the following boundaryconditions:
• Dirichlet boundary condition:
u(0, t) = b0(t), u(1, t) = b1(t),
where b0(t) and b1(t) are given functions;
• Neumann (flux) boundary condition:
−kux(0, t) = b0(t), kux(1, t) = b1(t),
where b0(t) and b1(t) are given functions;
• Robin boundary condition:
−kux(0, t) = α(u(0, t)− β), kux(1, t) = α(u(1, t)− β),
where α and β are given positive constants.
Moreover, we solve the equation together with the initial condition
u(x, 0) = a(x) (0 ≤ x ≤ 1),
where a(x) denotes a given continuous function in 0 ≤ x ≤ 1.
Our first target problems of this chapter are the initial-boundary value problems
ut = kuxx (0 < x < 1, t > 0)
u(0, t) = 0, u(1, t) = 0 (t > 0)
u(x, 0) = a(x) (0 ≤ x ≤ 1),
(1.1)
and

ut = kuxx + f(x, t) (0 < x < 1, t > 0)
u(0, t) = 0, u(1, t) = 0 (t > 0)
u(x, 0) = a(x) (0 ≤ x ≤ 1).
(1.2)
Hereinafter, we make the following assumptions:
• k > 0 is a constant;
• a ∈ C[0, 1] with the compatibility condition a(0) = a(1) = 0;
• f ∈ C((0, 1)× (0,∞)).
Notation. For any subset Ω of Rⁿ, we write

C(Ω) = "the set of all continuous (real-valued) functions defined in Ω".

Moreover, we write C[a, b] = C([a, b]) for abbreviation.
Remark. Problem (1.1) is a particular case of Problem (1.2). We, however, study these problems separately.
Remark. We will address only the homogeneous boundary condition
u(0, t) = 0, u(1, t) = 0.
If posing the inhomogeneous boundary condition

u(0, t) = b0(t), u(1, t) = b1(t)

instead of the homogeneous condition in (1.1) or (1.2), we can reduce the problems to those of the form (1.2). In fact, set w(x, t) = b0(t)(1 − x) + b1(t)x and consider a new unknown function v(x, t) = u(x, t) − w(x, t). Then, if u(x, t) solves the problem with the inhomogeneous boundary condition, the function v(x, t) satisfies vt − kvxx = f − (wt − kwxx) = f − wt (note that wxx = 0, since w is affine in x) and v(0, t) = v(1, t) = 0. Letting f̃(x, t) = f(x, t) − wt(x, t) and ã(x) = a(x) − w(x, 0), we obtain a problem of the form (1.2) with f̃(x, t) and ã(x). Thus, it suffices to consider only the homogeneous boundary condition.
Definition. A function u = u(x, t) is a (classical) solution of (1.1) or (1.2) if and only if

1. u ∈ C([0, 1] × [0,∞));

2. ut, ux, uxx ∈ C((0, 1) × (0,∞));

3. the function u satisfies the heat equation ut = kuxx or ut = kuxx + f(x, t) for all 0 < x < 1 and t > 0, together with the boundary condition u(0, t) = u(1, t) = 0 for all t > 0;

4. lim_{t→+0} ∥u(·, t) − a∥∞ = 0.
Definition (L∞ norm). For v ∈ C[0, 1], we let

∥v∥∞ = max_{x∈[0,1]} |v(x)|,

which we call the L∞ norm or maximum norm of v.
Remark. ∥ · ∥ = ∥ · ∥∞ is a norm of X = C[0, 1]. That is,
(N1) ∥v∥ ≥ 0 (v ∈ X). Moreover, ∥v∥ = 0 implies v ≡ 0;
(N2) ∥αv∥ = |α| · ∥v∥ (α ∈ R, v ∈ X);
(N3) ∥v + w∥ ≤ ∥v∥+ ∥w∥ (v, w ∈ X).
1.2 Uniqueness and maximum principle
For T > 0, we set

QT = (0, 1) × (0, T],
Q̄T = [0, 1] × [0, T],
ΓT = {(x, t) | 0 ≤ x ≤ 1, t = 0} ∪ {(x, t) | x = 0 or x = 1, 0 ≤ t ≤ T} = Q̄T \ QT (parabolic boundary).
[Figure 1.1: QT and ΓT in the (x, t)-plane, with 0 ≤ x ≤ 1 and 0 ≤ t ≤ T.]
Theorem 1.1. Let T > 0 be fixed. Suppose that a function u = u(x, t) is continuous in Q̄T and smooth in QT. Then, we have the following.

(i) ut − kuxx ≤ 0 in QT implies max_{(x,t)∈Q̄T} u(x, t) = max_{(x,t)∈ΓT} u(x, t).

(ii) ut − kuxx ≥ 0 in QT implies min_{(x,t)∈Q̄T} u(x, t) = min_{(x,t)∈ΓT} u(x, t).
Proof. (i) Assume that ut − kuxx ≤ 0 in QT. Let α = max_{(x,t)∈ΓT} u(x, t) and set w = e^{−t}(u − α). Clearly, we have w(x, t) ≤ 0 for (x, t) ∈ ΓT. Hence, it suffices to prove

w(x, t) ≤ 0 for (x, t) ∈ Q̄T, (∗)

since this implies that u(x, t) ≤ α for (x, t) ∈ Q̄T, which is the desired inequality. Inequality (∗) is proved in the following way. We suppose, to the contrary, that the function w achieves a positive maximum µ at (x0, t0) ∈ QT. Thus, we assume that

µ = w(x0, t0) = max_{(x,t)∈Q̄T} w(x, t) > 0.

Then, we have wt(x0, t0) ≥ 0 and wxx(x0, t0) ≤ 0. Since w satisfies wt + w − kwxx = e^{−t}(ut − kuxx) ≤ 0 in QT, considering this inequality at x = x0 and t = t0, we obtain

0 < wt(x0, t0) + µ − kwxx(x0, t0) ≤ 0.

This is a contradiction; we have proved (∗).
(ii) It is a consequence of (i), by considering −u instead of u itself. □
Theorem 1.2. Problem (1.1) (or (1.2)) has at most one solution.

Theorem 1.3. The solution u of (1.1) satisfies the following:

(L∞ stability) ∥u(·, t)∥∞ ≤ ∥a∥∞;

(non-negativity) a(x) ≥ 0 ⇒ u(x, t) ≥ 0 for (x, t) ∈ [0, 1] × [0,∞).

Proof of Theorems 1.2 and 1.3. They are readily obtainable consequences of Theorem 1.1. □

Theorem 1.4 (positivity). a(x) ≥ 0, a ≢ 0 ⇒ u(x, t) > 0 for (x, t) ∈ (0, 1) × (0,∞).
Proof. See, for example, the following textbook:

[9] L. C. Evans: Partial Differential Equations, 2nd edition, American Mathematical Society, 2010. □
1.3 Construction of a solution and Fourier’s method
Our next task is to construct explicitly a solution of (1.1). To this end, weapply Fourier’s method.
First, we assume that the solution u(x, t) of (1.1) is expressed as u(x, t) = φ(x)η(t), where φ(x) and η(t) are functions of x only and of t only, respectively. Then, substituting this into the heat equation and the boundary condition, we deduce
(i) −φ′′(x) = λφ(x), φ(0) = φ(1) = 0;

(ii) −η′(t) = kλη(t).

The system (i) is nothing but an eigenvalue problem, and its solutions are given as

λn = (nπ)², φn(x) = √2 sin(√λn x) = √2 sin(nπx) (n = 1, 2, . . .).

Herein, the coefficients of the sine functions have been chosen so that the eigenfunctions are orthonormal:

(φn, φm) = ∫₀¹ φn(x)φm(x) dx = 1 (n = m), 0 (n ≠ m).
On the other hand, a solution of (ii) is given as η(t) = e^{−kλn t}. Consequently, the functions

un(x, t) = e^{−kλn t} φn(x) (n = 1, 2, . . .)

solve the heat equation with the boundary condition. Then, we consider

u(x, t) = Σ_{n=1}^∞ αn un(x, t) = Σ_{n=1}^∞ αn e^{−kλn t} φn(x)

and find coefficients {αn}_{n=1}^∞ to satisfy the initial condition. Setting t = 0 in the expression above, we have

a(x) = Σ_{n=1}^∞ αn φn(x).
Using the orthonormality of {φn(x)}, we obtain

αm = (a, φm) = √2 ∫₀¹ a(x) sin(mπx) dx.
Summing up, we get a solution of (1.1) as follows:

u(x, t) = Σ_{n=1}^∞ αn e^{−kλn t} φn(x), αn = (a, φn). (1.3)
However, the derivation above is somewhat formal and should be justified. Indeed, we can prove the following theorem.

Theorem 1.5. The function defined by the right-hand side of (1.3) is continuous in [0, 1] × [0,∞) and is of class C∞ in [0, 1] × (0,∞). Moreover, it is a solution of (1.1).
Proof. See Chapter 1 of
[12] H. Fujita, T. Ikebe, T. Inui, H. Takami: Partial Differential Equations Appearing in Mathematical Physics I [Iwanami Lectures on Fundamental Mathematics, Analysis II-iv], Iwanami Shoten, 1977 (in Japanese).
Remark. The function defined by the right-hand side of (1.3) is itself well-defined for a ∈ L²(0, 1). Then, it is of class C∞ in [0, 1] × (0,∞), and it satisfies the heat equation with the boundary condition. Moreover, the initial condition is satisfied in the following sense:

lim_{t→+0} ∫₀¹ |u(x, t) − a(x)|² dx = 0.
1.4 Duhamel’s principle
The solution of (1.1) could be expressed (at least formally) as

u(x, t) = Σ_{n=1}^∞ (a, φn) e^{−kλn t} φn(x)
        = Σ_{n=1}^∞ ( ∫₀¹ a(y) φn(y) dy ) e^{−kλn t} φn(x)
        = ∫₀¹ a(y) ( Σ_{n=1}^∞ e^{−kλn t} φn(y) φn(x) ) dy
        = ∫₀¹ G(x, y, t) a(y) dy,

where

G(x, y, t) = Σ_{n=1}^∞ e^{−kλn t} φn(y) φn(x)
is the Green function associated with (1.1). We recall that the Green function G(x, y, t) satisfies the following:

1. G(x, y, t) is of class C∞ in [0, 1] × (0,∞);

2. G(x, y, t) satisfies the heat equation ut = kuxx with the boundary condition u(0, t) = u(1, t) = 0, as a function of both (x, t) and (y, t);

3. G(x, y, t) = G(y, x, t), G(x, y, t) ≥ 0;

4. ∫₀¹ G(x, y, t) dy ≤ 1.
For the proof of those facts, see Chapter 1 of [12] for example.
We now come to consider the heat equation with a source term:

ut = kuxx + f(x, t) (0 < x < 1, t > 0)
u(0, t) = 0, u(1, t) = 0 (t > 0)
u(x, 0) = a(x) (0 ≤ x ≤ 1). (1.2)
We are unable to apply Fourier's method directly and need another device. We first decompose u into u(x, t) = v(x, t) + w(x, t), where v(x, t) and w(x, t) are, respectively, solutions of

vt = kvxx (0 < x < 1, t > 0)
v(0, t) = 0, v(1, t) = 0 (t > 0)
v(x, 0) = a(x) (0 ≤ x ≤ 1),

and

wt = kwxx + f(x, t) (0 < x < 1, t > 0)
w(0, t) = 0, w(1, t) = 0 (t > 0)
w(x, 0) = 0 (0 ≤ x ≤ 1).
We know that v(x, t) is expressed as

v(x, t) = ∫₀¹ G(x, y, t) a(y) dy.
To derive the expression of w(x, t), we introduce a parameter s > 0 and suppose that W(x, t; s) solves

Wt = kWxx (0 < x < 1, t > s)
W(0, t) = 0, W(1, t) = 0 (t > s)
W(x, s; s) = f(x, s) (0 ≤ x ≤ 1).

Actually, W(x, t; s) is expressed as

W(x, t; s) = ∫₀¹ G(x, y, t − s) f(y, s) dy (0 ≤ x ≤ 1, t ≥ s).
At this stage, we define

w(x, t) = ∫₀ᵗ W(x, t; s) ds,

and this is actually the desired function. In fact, we have (formally)

∂w/∂t (x, t) = W(x, t; t) + ∫₀ᵗ ∂W/∂t (x, t; s) ds
             = f(x, t) + ∫₀ᵗ k ∂²W/∂x² (x, t; s) ds = f(x, t) + k ∂²w/∂x² (x, t),
and w(x, t) satisfies the boundary and initial conditions. Summing up, the solution of (1.2) is given as

u(x, t) = ∫₀¹ G(x, y, t) a(y) dy + ∫₀ᵗ W(x, t; s) ds
        = ∫₀¹ G(x, y, t) a(y) dy + ∫₀ᵗ ∫₀¹ G(x, y, t − s) f(y, s) dy ds.

This expression is called Duhamel's formula.
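Duhamel's formula can also be checked numerically. The sketch below (Python, names ours) evaluates W(x, t; s) through the eigenfunction expansion of G and integrates over s by the midpoint rule; for a(x) = 0, f(x, t) = sin(πx), and k = 1, the exact solution is u(x, t) = (1 − e^{−π²t}) sin(πx)/π².

```python
import math

def phi(n, x):
    # orthonormal eigenfunctions phi_n(x) = sqrt(2) sin(n pi x)
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def W(x, t, s, f, k=1.0, nmax=10, m=100):
    # W(x,t;s) = int_0^1 G(x,y,t-s) f(y,s) dy via the series for G,
    # with the y-integral done by the composite midpoint rule
    h = 1.0 / m
    total = 0.0
    for n in range(1, nmax + 1):
        fn = h * sum(f((j + 0.5) * h, s) * phi(n, (j + 0.5) * h) for j in range(m))
        total += fn * math.exp(-k * (n * math.pi) ** 2 * (t - s)) * phi(n, x)
    return total

def duhamel_w(x, t, f, ms=100):
    # w(x,t) = int_0^t W(x,t;s) ds, composite midpoint rule in s
    ds = t / ms
    return ds * sum(W(x, t, (j + 0.5) * ds, f) for j in range(ms))

f = lambda y, s: math.sin(math.pi * y)
x0, t0 = 0.3, 0.2
exact = (1 - math.exp(-math.pi ** 2 * t0)) / math.pi ** 2 * math.sin(math.pi * x0)
print(abs(duhamel_w(x0, t0, f) - exact))  # small
```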
2 Explicit finite difference scheme
2.1 Finite difference quotients
Recall that the derivative of a function u(x) at x = a is defined by

u′(a) = du/dx (a) = lim_{h→0} (u(a + h) − u(a))/h.
So, it is natural to expect that

u′(a) ≈ (u(a + h) − u(a))/h (|h| ≪ 1).
Actually, letting I be a bounded closed interval containing a and a ± h for h > 0, we have

u(a + h) = u(a) + u′(a)h + (1/2) u′′(a + θh) h²

with some θ ∈ (0, 1) and, hence,

|u′(a) − (u(a + h) − u(a))/h| = |(1/2) u′′(a + θh) h| ≤ (1/2) h ∥u′′∥_{L∞(I)}. (2.1)
In a similar manner, we have

u(a − h) = u(a) − u′(a)h + (1/2) u′′(a − θh) h²

and

|u′(a) − (u(a) − u(a − h))/h| = |(1/2) u′′(a − θh) h| ≤ (1/2) h ∥u′′∥_{L∞(I)}. (2.2)
On the other hand, using

u(a + h) = u(a) + u′(a)h + (1/2) u′′(a) h² + (1/3!) u⁽³⁾(a + θ1 h) h³,
u(a − h) = u(a) − u′(a)h + (1/2) u′′(a) h² − (1/3!) u⁽³⁾(a − θ2 h) h³

with some θ1, θ2 ∈ (0, 1), we can estimate as

|u′(a) − (u(a + h) − u(a − h))/(2h)|
  ≤ |(1/(2·3!)) u⁽³⁾(a + θ1 h) h²| + |(1/(2·3!)) u⁽³⁾(a − θ2 h) h²|
  ≤ (1/6) h² ∥u⁽³⁾∥_{L∞(I)}.

Now, replacing h by h/2, we obtain

|u′(a) − (u(a + h/2) − u(a − h/2))/h| ≤ (1/24) h² ∥u⁽³⁾∥_{L∞(I)}.
Name                            Definition                          Target    Accuracy   Assumption
forward Euler                   (u(a+h) − u(a))/h                   u′(a)     O(h)       u ∈ C²
backward Euler                  (u(a) − u(a−h))/h                   u′(a)     O(h)       u ∈ C²
first-order central difference  (u(a+h/2) − u(a−h/2))/h             u′(a)     O(h²)      u ∈ C³
second-order central difference (u(a+h) − 2u(a) + u(a−h))/h²        u′′(a)    O(h²)      u ∈ C⁴
                                                                              O(h)       u ∈ C³

Table 2.1: finite difference quotients
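The accuracy column of Table 2.1 can be observed numerically: halving the error ratio between two step sizes reveals the convergence order. A minimal Python sketch (function names ours), using u = sin so the exact derivatives are known:

```python
import math

# the difference quotients from Table 2.1
def forward(u, a, h):  return (u(a + h) - u(a)) / h
def central(u, a, h):  return (u(a + h/2) - u(a - h/2)) / h
def central2(u, a, h): return (u(a + h) - 2*u(a) + u(a - h)) / h**2

u, a = math.sin, 0.7           # u'(a) = cos(a), u''(a) = -sin(a)
errs = {}
for h in (1e-1, 1e-2):
    errs[h] = (abs(forward(u, a, h) - math.cos(a)),    # O(h)
               abs(central(u, a, h) - math.cos(a)),    # O(h^2)
               abs(central2(u, a, h) + math.sin(a)))   # O(h^2)
for h, (ef, ec, e2) in errs.items():
    print(f"h={h:g}  forward={ef:.2e}  central={ec:.2e}  second={e2:.2e}")
```

Dividing h by 10 should divide the forward-Euler error by about 10 and both central-difference errors by about 100, in line with the table.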
Next, we consider the second derivative. Thus, using

u(a + h) = u(a) + u′(a)h + (1/2) u′′(a) h² + (1/3!) u⁽³⁾(a) h³ + (1/4!) u⁽⁴⁾(a + θ1 h) h⁴,
u(a − h) = u(a) − u′(a)h + (1/2) u′′(a) h² − (1/3!) u⁽³⁾(a) h³ + (1/4!) u⁽⁴⁾(a − θ2 h) h⁴

with some θ1, θ2 ∈ (0, 1), we can perform the estimation

|u′′(a) − (u(a − h) − 2u(a) + u(a + h))/h²|
  ≤ |(1/4!) u⁽⁴⁾(a + θ1 h) h²| + |(1/4!) u⁽⁴⁾(a − θ2 h) h²|
  ≤ (1/12) h² ∥u⁽⁴⁾∥_{L∞(I)}. (2.3)
We summarize those finite difference quotients in Table 2.1.
2.2 Explicit scheme
We return to the initial-boundary value problem for the heat equation

ut = kuxx (0 < x < 1, t > 0)
u(0, t) = u(1, t) = 0 (t > 0)
u(x, 0) = a(x) (0 ≤ x ≤ 1), (1.1)

where k > 0 is a constant and a(x) is a continuous function satisfying a(0) = a(1) = 0. We introduce a set of grid points

Qτh = {(xi, tn) | xi = ih, tn = nτ (0 ≤ i ≤ N + 1, n ≥ 0)},

where h = 1/(N + 1) with a positive integer N and τ > 0. We remark that x0 = 0 and xN+1 = 1. Moreover, let us denote by u_i^n an approximate value of u(xi, tn) to be solved.
[Figure 2.1: the grid Qτh: grid points x1, x2, . . . , xN, xN+1 with spacing h on the x-axis, and time levels t1, t2, t3, . . . with spacing τ on the t-axis.]
As examined in the previous paragraph, ut and uxx are approximated by

ut(xi, tn) ≈ (u_i^{n+1} − u_i^n)/τ (forward Euler) and
uxx(xi, tn) ≈ (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² (second-order central difference).

The boundary and initial conditions are imposed by

u_0^n = u_{N+1}^n = 0 (n ≥ 1) and u_i^0 = a(xi) (0 ≤ i ≤ N + 1).

Thus, we arrive at a finite difference scheme for (1.1) as follows:

(u_i^{n+1} − u_i^n)/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² (1 ≤ i ≤ N, n ≥ 0)
u_0^n = u_{N+1}^n = 0 (n ≥ 1)
u_i^0 = a(xi) (0 ≤ i ≤ N + 1). (2.4)
We call this the explicit finite difference scheme (simply, the explicit scheme) for (1.1). Set, throughout this chapter,

λ = kτ/h².

It is useful to rewrite (2.4) as

u^(n) = Kλ u^(n−1) (n ≥ 1), u^(0) = a, (2.4)
where

u^(n) = (u_1^n, . . . , u_N^n)ᵀ ∈ R^N,  a = (a(x1), . . . , a(xN))ᵀ ∈ R^N,

and Kλ ∈ R^{N×N} is the tridiagonal matrix

Kλ = [ 1−2λ   λ
        λ   1−2λ   λ
              ⋱     ⋱     ⋱
                    λ   1−2λ ]. (2.5)
Definition.
(i) The vector ∞-norm is defined by

∥u∥∞ (= ∥u∥_{ℓ∞}) = max_{1≤i≤N} |u_i| (u = (u_i) ∈ R^N).

(ii) The matrix ∞-norm is defined by

∥G∥∞ = max_{u∈R^N, u≠0} ∥Gu∥∞ / ∥u∥∞ (G ∈ R^{N×N}).
Lemma 2.1. For G = (g_ij) ∈ R^{N×N}, we have

(i) ∥Gu∥∞ ≤ ∥G∥∞ ∥u∥∞ (u ∈ R^N);

(ii) ∥G∥∞ = max_{1≤i≤N} Σ_{j=1}^N |g_ij|.
Proof. (i) It is obvious from the definition.
(ii) Set M = max_{1≤i≤N} Σ_{j=1}^N |g_ij|. Let u ∈ R^N be arbitrary. Then,

∥Gu∥∞ = max_{1≤i≤N} |Σ_{j=1}^N g_ij u_j| ≤ max_{1≤i≤N} Σ_{j=1}^N |g_ij| · |u_j| ≤ ∥u∥∞ max_{1≤i≤N} Σ_{j=1}^N |g_ij| = M ∥u∥∞.

Hence, we have ∥G∥∞ ≤ M.
Next, we suppose that the maximum is attained in the kth row, so that M = Σ_{j=1}^N |g_kj|. We define v = (v_j) by setting

v_j = 1 (g_kj ≥ 0), −1 (g_kj < 0).
Then, we have ∥v∥∞ = 1 and g_kj v_j = |g_kj| ≥ 0 for any j. Therefore, we can estimate as

∥G∥∞ = max_{u≠0} ∥Gu∥∞/∥u∥∞ ≥ ∥Gv∥∞/∥v∥∞ = ∥Gv∥∞ = max_{1≤i≤N} |Σ_{j=1}^N g_ij v_j| ≥ |Σ_{j=1}^N g_kj v_j| = Σ_{j=1}^N |g_kj| = M.

Combining those inequalities, we obtain ∥G∥∞ = M. □
Definition.
(i) For u = (u_i) ∈ R^N, u ≥ 0 ⇔ (def.) u_i ≥ 0 (∀i).
(ii) For G = (g_ij) ∈ R^{N×N}, G ≥ O ⇔ (def.) g_ij ≥ 0 (∀i, j).
Theorem 2.2. Assume that

0 < λ ≤ 1/2.

Then, the solution u^(n) = (u_i^n) ∈ R^N of (2.4) satisfies, for n ≥ 1,

(ℓ∞ stability) ∥u^(n)∥∞ ≤ ∥a∥∞;

(non-negativity) a ≥ 0 ⇒ u^(n) ≥ 0.
Proof. We have Kλ ≥ O by 1 − 2λ ≥ 0. Hence, a ≥ 0 implies u^(n) = Kλ^n a ≥ 0. On the other hand, by Lemma 2.1,

∥Kλ∥∞ = |λ| + |1 − 2λ| + |λ| = λ + (1 − 2λ) + λ = 1.

Therefore, ∥u^(n)∥∞ ≤ ∥Kλ∥∞^n ∥a∥∞ ≤ ∥a∥∞. □
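To see Theorem 2.2 in action before turning to the Scilab experiments below, here is a minimal pure-Python sketch of the explicit scheme (helper names ours): one step applies the three-point recursion u_i^{n+1} = λ u_{i−1}^n + (1 − 2λ) u_i^n + λ u_{i+1}^n with zero boundary values, and with λ = 1/2 every iterate stays within ∥a∥∞ and remains non-negative.

```python
def step(u, lam):
    # one explicit step u^(n) = K_lambda u^(n-1); u_0 = u_{N+1} = 0 implicit
    N = len(u)
    return [lam * (u[i-1] if i > 0 else 0.0)
            + (1 - 2*lam) * u[i]
            + lam * (u[i+1] if i < N - 1 else 0.0)
            for i in range(N)]

N, lam = 23, 0.5                  # lambda = k*tau/h^2 at the stability limit
h = 1.0 / (N + 1)
u = [min((i + 1) * h, 1 - (i + 1) * h) for i in range(N)]  # a(x) = min(x, 1-x)
a_norm = max(abs(v) for v in u)
norms = []
for n in range(200):
    u = step(u, lam)
    norms.append(max(abs(v) for v in u))
print(max(norms), a_norm, min(u))
```

Repeating the run with lam = 0.51 produces the growing sign-alternating oscillations shown in Figures 2.8 and 2.9.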
2.3 Numerical experiments by Scilab
It is not difficult to implement the explicit scheme (2.4). Here, we present several sample codes in Scilab.¹
A simple code for calculating the explicit scheme (2.4) is given below (Listing 1, heat11.sci):
Listing 1: heat11.sci
// *** explicit scheme for heat equation without source ***
// INPUTS:
// N: division of space, lambda: parameter (<= 0.5),
// Tmax: length of time interval
// coef: heat conduction coefficient
function heat11(N, lambda, Tmax, coef)
  // [a, b]: space interval, u(a, t) = ua, u(b, t) = ub: boundary values
  a = 0.0; b = 1.0; ua = 0.0; ub = 0.0;
  // h: space mesh, x[ ]: vector for computation, xx[ ]: vector for drawing
  h = (b - a)/(N + 1); x = [a + h: h: b - h]'; xx = [a; x; b];
  // tau: time increment, nmax: nmax*tau < Tmax <= (nmax+1)*tau
  tau = lambda*h*h/coef; nmax = int(Tmax/tau) + 1;
  // step: parameter for drawing
  step_num = 40; step = max(int(nmax/step_num), 1);
  // u: approximation of u(x, t), uu: vector for drawing
  // set initial value
  u = func_a(x); uu = [ua; u; ub];
  scf(1); plot2d(xx, uu, style = 5);
  // def of matrices A and K
  A = 2*eye(N, N) - diag(ones(N-1, 1), -1) - diag(ones(N-1, 1), 1);
  K = eye(N, N) - lambda*A;
  // iteration
  for n = 1:nmax
    u = K*u;
    if modulo(n, step) == 0
      uu = [ua; u; ub]; plot2d(xx, uu, style = 2);
    end
  end
  // label for scf(1)
  xlabel('x'); ylabel('u');
  // pdf file
  xs2pdf(1, 'heat11.pdf');
endfunction

// *** local function ***
// initial values
function [y] = func_a(x)
  y = min(x, 1.0 - x);
  //y = x.*sin(3*%pi*x).*sin(3*%pi*x);
  //y = sin(%pi*x);
endfunction

¹ It is explained in Wikipedia as follows: "Scilab is an open source, cross-platform numerical computational package and a high-level, numerically oriented programming language. It can be used for signal processing, statistical analysis, image enhancement, fluid dynamics simulations, numerical optimization, and modeling, simulation of explicit and implicit dynamical systems and (if the corresponding toolbox is installed) symbolic manipulations. MATLAB code, which is similar in syntax, can be converted to Scilab. Scilab is one of several open source alternatives to MATLAB." (http://en.wikipedia.org/wiki/Scilab)
The definitions of the variables used above are given below:

variable   definition                           variable   definition
N          N                                    h          h = 1/(N + 1)
Tmax       T                                    coef       k
a          left end of the x-interval (= 0)     b          right end of the x-interval (= 1)
ua         boundary value at x = a              ub         boundary value at x = b
tau        τ                                    nmax       [T/τ] + 1
lambda     λ = kτ/h²
x          (x1, . . . , xN)                     xx         (x0, x1, . . . , xN, xN+1)
u          u^(n) = (u_1^n, . . . , u_N^n)       uu         (u_0^n, u_1^n, . . . , u_N^n, u_{N+1}^n)
After launching Scilab, we use heat11.sci in the following manner (see Fig. 2.2):

In the Scilab window:

--> exec('heat11.sci');
--> heat11(63, 0.5, 0.2, 1)
Figure 2.2: Results of computation. (left) a(x) = x sin²(3πx); (right) a(x) = sin(πx).
More sophisticated codes (Listing 2, heat12.sci, and Listing 3, heat13.sci) are given below. We can obtain Figs. 2.3 and 2.4 using heat12.sci and heat13.sci, respectively; however, we skip the explanation.
Listing 2: heat12.sci
// *** explicit scheme for heat equation without source ***
// INPUTS:
// N: division of space, lambda: parameter (<= 0.5),
// Tmax: length of time interval
// coef: heat conduction coefficient
function heat12(N, lambda, Tmax, coef)
  // [a, b]: space interval, u(a, t) = ua, u(b, t) = ub: boundary values
  a = 0.0; b = 1.0; ua = 0.0; ub = 0.0;
  // h: space mesh, x[ ]: vector for computation, xx[ ]: vector for drawing
  h = (b - a)/(N + 1); x = [a + h: h: b - h]'; xx = [a; x; b];
  // tau: time increment, nmax: nmax*tau < Tmax <= (nmax+1)*tau
  tau = lambda*h*h/coef; nmax = int(Tmax/tau) + 1;
  // step: parameter for drawing
  step_num = 40; step = max(int(nmax/step_num), 1);
  // u: approximation of u(x, t), uu: vector for drawing
  // set initial value
  u = func_a(x); uu = [ua; u; ub];
  // draw initial value
  scf(1); plot2d(xx, uu, style = 5);
  // def of matrices A and K
  A = 2*eye(N, N) - diag(ones(N-1, 1), -1) - diag(ones(N-1, 1), 1);
  K = eye(N, N) - lambda*A;
  // set current time
  tnow = 0.0;
  // for 3D plot
  tsp = tnow*ones(1, N); xsp = x.'; Z = u';
  // iteration
  for n = 1:nmax
    u = K*u + tau*func_f(x, tnow);
    tnow = n*tau;
    if modulo(n, step) == 0
      uu = [ua; u; ub]; plot2d(xx, uu, style = 2);
      tsp = [tsp; tnow*ones(1, N)]; xsp = [xsp; x']; Z = [Z; u'];
    end
  end
  // label for scf(1)
  xlabel('x'); ylabel('u');
  // 3D view
  scf(2); mesh(xsp, tsp, Z);
  xset('colormap', coolcolormap(32));
  xlabel('x'); ylabel('time');
  // pdf file
  xs2pdf(1, 'heat12a.pdf'); xs2pdf(2, 'heat12b.pdf');
endfunction

// *** local functions ***
// initial values
function [y] = func_a(x)
  y = min(x, 1.0 - x);
  //y = x.*sin(3*%pi*x).*sin(3*%pi*x);
  //y = sin(%pi*x);
  //I = find(x<=0.3); J = find(x>0.3 & x<=0.6); K = find(x>0.6);
  //y = zeros(size(x,1)); y(I) = 0.3; y(J) = -2*(x(J) - 0.6); y(K) = 0.6;
endfunction

// source terms
function [y] = func_f(x, t)
  y = 0.0;
  //y = exp(t+3)*(x.^2).*(1-x);
endfunction
Figure 2.3: A result of computation by heat12.sci
Listing 3: heat13.sci
// *** explicit scheme for heat equation with source ***
function heat13(N, lambda, Tmax, coef)
  a = 0.0; b = 1.0; ua = 0.0; ub = 0.0;
  h = (b - a)/(N + 1); x = [a + h: h: b - h]'; xx = [a; x; b];
  tau = lambda*h*h/coef; nmax = int(Tmax/tau) + 1;
  step_num = 30; step = max(int(nmax/step_num), 1); rate = 0.05;
  u = func_a(x); uu = [ua; u; ub]; tt = 0.0*ones(N+2, 1);
  // scf() opens a new window for drawing
  scf(10);
  set('current_figure', 10); HF = get('current_figure');
  set(HF, 'figure_size', [800, 400]);
  utp = max(uu) + rate*(max(uu) - min(uu));
  ubt = min(uu) - rate*(max(uu) - min(uu));
  subplot(1,2,1); plot2d(xx, uu, style = 5);
  subplot(1,2,2); param3d(xx, tt, uu, flag=[1,4], ebox=[min(xx), max(xx), -0.001, Tmax, ubt, utp]);
  // def of matrices A and K
  A = 2*eye(N, N) - diag(ones(N-1, 1), -1) - diag(ones(N-1, 1), 1);
  K = eye(N, N) - lambda*A;
  // iteration
  tpast = 0.0;
  for n = 1:nmax
    u = K*u + tau*func_f(x, tpast); // tpast = t_{n-1}
    tnow = n*tau;                   // tnow = t_n
    if modulo(n, step) == 0
      uu = [ua; u; ub];
      tt = tnow*ones(N+2, 1);
      utp = max(uu) + rate*(max(uu) - min(uu));
      ubt = min(uu) - rate*(max(uu) - min(uu));
      subplot(1,2,1); plot2d(xx, uu, style = 2);
      subplot(1,2,2);
      param3d(xx, tt, uu, -45, 65, flag=[1,4], ebox=[min(xx), max(xx), min(tt), max(tt), ubt, utp]);
    end
    tpast = tnow;
  end
  // labels
  subplot(1,2,1); xlabel('x'); ylabel('u');
  subplot(1,2,2); xlabel('x'); ylabel('t'); zlabel('u');
  // pdf file
  xs2pdf(10, 'heat13.pdf');
endfunction

// *** local functions ***
// initial values
function [y] = func_a(x)
  //y = min(x, 1.0 - x);
  y = x.*sin(3*%pi*x).*sin(3*%pi*x);
  //y = sin(%pi*x);
  //I = find(x<=0.3); J = find(x>0.3 & x<=0.6); K = find(x>0.6);
  //y = zeros(size(x,1)); y(I) = 0.3; y(J) = -2*(x(J) - 0.6); y(K) = 0.6;
endfunction

// source terms
function [y] = func_f(x, t)
  //y = 0.0;
  y = exp(t+3)*(x.^2).*(1-x);
endfunction
⋆ Example 2.3. We are going to examine the explicit scheme with the aid of Scilab. Assuming k = 1, we consider the initial-boundary value problem (1.1) and its explicit scheme (2.4) with the following four initial functions:

a1(x) = x sin²(3πx),
a2(x) = 0.3 (0 ≤ x ≤ 0.3); −2(x − 0.6) (0.3 < x ≤ 0.6); 0.6 (0.6 < x ≤ 1),
a3(x) = x (0 ≤ x ≤ 1/2); 1 − x (1/2 < x ≤ 1),
a4(x) = sin(πx).
• In Figures 2.5, 2.6, and 2.7, numerical solutions u^(n) for a1(x), a2(x), and a3(x), respectively, are displayed.
Figure 2.4: A result of computation by heat13.sci
• It should be noticed that a2(x) is discontinuous and a3(x) is not differentiable at x = 1/2. Nevertheless, we observe that for tn > 0 both numerical solutions actually approximate smooth functions.

• Figures 2.8 and 2.9 show numerical solutions for a3(x) with λ = 0.51 > 1/2. In this case, since we cannot apply Theorem 2.2, the ℓ∞ stability and the non-negativity are not guaranteed. In fact, a non-physical oscillation appears (Fig. 2.8), and then a negative part of the solution appears (Fig. 2.9). Moreover, we observe the following: a positive part of the solution at some tn > 0 becomes negative at the next time step tn+1, and then positive again at tn+2. This change of sign occurs successively and, consequently, the numerical solution is destroyed.

• On the other hand, we take λ = 0.51 again and compute the explicit scheme for a4(x) (Fig. 2.10). Then, we do not observe the oscillation. However, if we take a larger λ, the oscillation actually appears again (Fig. 2.11).
Figure 2.5: Initial value a1(x); λ = 0.49; N = 127; 0 ≤ t ≤ 0.1.
Figure 2.6: Initial value a2(x); λ = 0.49; N = 127; 0 ≤ t ≤ 0.1.
Figure 2.7: Initial value a3(x); λ = 0.49; N = 23; 0 ≤ t ≤ 0.3.
Figure 2.8: Initial value a3(x); λ = 0.51 ; N = 23; 0 ≤ t ≤ 0.1.
Figure 2.9: Initial value a3(x); λ = 0.51 ; N = 23; 0 ≤ t ≤ 0.15.
Figure 2.10: Initial value a4(x); λ = 0.51 ; N = 23; 0 ≤ t ≤ 0.3.
Figure 2.11: Initial value a4(x); λ = 0.55 ; N = 23; 0 ≤ t ≤ 0.212.
3 Implicit finite difference schemes
3.1 Simple implicit scheme
We continue to study finite difference schemes to approximate the initial-boundary value problem for the heat equation with no source term:

ut = kuxx (0 < x < 1, t > 0)
u(0, t) = u(1, t) = 0 (t > 0)
u(x, 0) = a(x) (0 ≤ x ≤ 1). (1.1)
We recall that k > 0 is a constant and that a(x) is a continuous function satisfying a(0) = a(1) = 0.
We now take

ut(xi, tn) ≈ (u_i^n − u_i^{n−1})/τ (backward Euler)

as an approximation of ut(xi, tn). The resulting scheme reads as

(u_i^n − u_i^{n−1})/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² (1 ≤ i ≤ N, n ≥ 1)
u_0^n = u_{N+1}^n = 0 (n ≥ 1)
u_i^0 = a(xi) (0 ≤ i ≤ N + 1). (3.1)
It is equivalently written as

(1/τ)(u^(n) − u^(n−1)) = −(k/h²) A u^(n) (n ≥ 1), u^(0) = a,

where

A = [  2  −1
      −1   2  −1
            ⋱    ⋱    ⋱
                −1    2 ] ∈ R^{N×N},  u^(n) = (u_1^n, u_2^n, . . . , u_N^n)ᵀ. (3.2)
Moreover, we set

Hλ = I + λA = [ 1+2λ  −λ
                 −λ  1+2λ  −λ
                       ⋱     ⋱     ⋱
                            −λ  1+2λ ] (3.3)

with λ = kτ/h². Then, (3.1) can be rewritten as

Hλ u^(n) = u^(n−1) (n ≥ 1), u^(0) = a. (3.1)
According to this expression, we successively obtain u^(1), u^(2), . . . by solving the linear system (3.1) with the initial vector u^(0). We call (3.1) the implicit finite difference scheme (or, simply, the implicit scheme), because the solution is determined by solving a system of equations. (We note that the recursive formula (2.4) is called the explicit scheme.) As described below, there are many implicit schemes, so (3.1) is sometimes called the simple implicit scheme to distinguish it from other implicit schemes. The simple implicit scheme has fine mathematical properties. Thus, we have the following result.
Theorem 3.1. For any λ > 0 and n ≥ 1, there exists a solution u^(n) = (u_i^n) ∈ R^N of the simple implicit scheme (3.1) with the properties:

(ℓ∞ stability) ∥u^(n)∥∞ ≤ ∥a∥∞ for n ≥ 1; and

(positivity) a ≥ 0, a ≠ 0 ⇒ u^(n) > 0 for n ≥ 1.
The key point is to examine the matrix Hλ very carefully.
Lemma 3.2. The matrix Hλ defined by (3.3) is non-singular. Moreover, Hλ⁻¹ > O and ∥Hλ⁻¹∥∞ ≤ 1.
The proof of this lemma depends on the following lemma.
Lemma 3.3. If G ∈ R^{N×N} satisfies ∥G∥ < 1 for a norm ∥·∥ of R^N and the induced matrix norm, then we have the following:

(i) the matrix I − G is non-singular;

(ii) (I − G)⁻¹ = Σ_{l=0}^∞ G^l;

(iii) ∥(I − G)⁻¹∥ ≤ 1/(1 − ∥G∥).
Proof. (i) We argue by contradiction. Thus, we assume that I − G is singular, so that there exists u ≠ 0 satisfying (I − G)u = 0 and, hence, Gu = u. This implies ∥u∥ ≤ ∥G∥ · ∥u∥ and ∥G∥ ≥ 1, which is impossible. Hence, we have verified that I − G is non-singular.
(ii) We have

(I − G)(I + G + · · · + G^m) = I − G^{m+1}.

Thus,

Σ_{l=0}^m G^l = (I − G)⁻¹(I − G^{m+1}).

Therefore,

∥(I − G)⁻¹ − Σ_{l=0}^m G^l∥ = ∥(I − G)⁻¹ [I − (I − G^{m+1})]∥ ≤ ∥(I − G)⁻¹∥ · ∥G∥^{m+1} → 0 (m → ∞).

This gives (I − G)⁻¹ = Σ_{l=0}^∞ G^l.
(iii) Noting

(I − G)(I − G)⁻¹ = I ⇔ (I − G)⁻¹ = I + G(I − G)⁻¹,

we obtain

∥(I − G)⁻¹∥ ≤ ∥I∥ + ∥G∥ · ∥(I − G)⁻¹∥ ⇒ ∥(I − G)⁻¹∥ ≤ 1/(1 − ∥G∥). □
Proof of Lemma 3.2. We first observe

Hλ = (1 + 2λ)(I − G) with G = (λ/(1 + 2λ)) [ 0  1
                                             1  0  1
                                                ⋱  ⋱  ⋱
                                                   1  0 ].

Since ∥G∥∞ = (λ/(1 + 2λ)) · 2 < 1, we can apply Lemma 3.3 to obtain

Hλ⁻¹ = (1/(1 + 2λ)) (I − G)⁻¹ = (1/(1 + 2λ)) Σ_{l=0}^∞ G^l,

∥Hλ⁻¹∥∞ = (1/(1 + 2λ)) ∥(I − G)⁻¹∥∞ ≤ (1/(1 + 2λ)) · 1/(1 − 2λ/(1 + 2λ)) = 1.
It remains to show H_λ^{-1} > O. To do so, it suffices to verify (I − G)^{-1} > O. Because G ≥ O, we have (I − G)^{-1} = Σ_l G^l ≥ O. We argue by contradiction to show (I − G)^{-1} > O. That is, we assume that the jth column v = (v_i) of (I − G)^{-1} has a zero component. We note that we cannot have v = 0, since I − G is non-singular; thus v ≥ 0, v ≠ 0. Without loss of generality, we suppose that v_k = 0 and v_{k+1} > 0. Then, noting that (I − G)v = e_j = (δ_{ij}) ≥ 0, the kth component of (I − G)v satisfies

−(λ/(1 + 2λ)) v_{k−1} + v_k − (λ/(1 + 2λ)) v_{k+1} ≥ 0 ⇔ −v_{k−1} ≥ v_{k+1}.

Hence, we have v_{k−1} < 0, which is a contradiction. Therefore, we have verified (I − G)^{-1} > O.
Now, we can give the following proof.

Proof of Theorem 3.1. It is a readily obtainable consequence of Lemma 3.2. Indeed,

∥u^(n+1)∥_∞ = ∥H_λ^{-1} u^(n)∥_∞ ≤ ∥H_λ^{-1}∥_∞ ∥u^(n)∥_∞ ≤ ∥u^(n)∥_∞,

and

u^(n) ≥ 0, u^(n) ≠ 0 ⇒ u^(n+1) = H_λ^{-1} u^(n) > 0.
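Lemma 3.2 is easy to test numerically. The following pure-Python sketch (our own illustration, not part of the notes) builds H_λ for sample values of N and λ, inverts it by Gauss-Jordan elimination, and checks H_λ^{-1} > O and ∥H_λ^{-1}∥_∞ ≤ 1.

```python
def build_H(N, lam):
    # H_lambda = I + lambda*A, A the tridiagonal (2, -1) matrix
    H = [[0.0] * N for _ in range(N)]
    for i in range(N):
        H[i][i] = 1.0 + 2.0 * lam
        if i > 0:
            H[i][i - 1] = -lam
        if i < N - 1:
            H[i][i + 1] = -lam
    return H

def inverse(M):
    # Gauss-Jordan elimination with partial pivoting on [M | I]
    n = len(M)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

N, lam = 8, 5.0                     # lambda may be arbitrarily large
Hinv = inverse(build_H(N, lam))
all_positive = all(x > 0.0 for row in Hinv for x in row)   # H^{-1} > O
norm_inf = max(sum(abs(x) for x in row) for row in Hinv)   # ||H^{-1}||_inf
print(all_positive, norm_inf)
```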
3.2 The implicit θ scheme
We recall that the explicit and simple implicit schemes for (1.1) are given as

(u_i^n − u_i^{n−1})/τ = k (u_{i−1}^{n−1} − 2u_i^{n−1} + u_{i+1}^{n−1})/h²,
(u_i^n − u_i^{n−1})/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h².

At this stage, as an approximation of (1.1), we consider their average with weight θ ∈ [0, 1]:

(u_i^n − u_i^{n−1})/τ = (1 − θ) k (u_{i−1}^{n−1} − 2u_i^{n−1} + u_{i+1}^{n−1})/h² + θ k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h²  (1 ≤ i ≤ N, n ≥ 1),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1).    (3.4)
This is called the implicit θ finite difference scheme or implicit θ scheme. Setting λ = kτ/h² and θ′ = 1 − θ, we can rewrite the first and second equalities as

(1/τ)(u^(n) − u^(n−1)) = −(θ′k/h²) A u^(n−1) − (θk/h²) A u^(n).

Thus, (3.4) is equivalently expressed as

H_{θλ} u^(n) = K_{θ′λ} u^(n−1)  (n ≥ 1),  u^(0) = a.    (3.4)
Here,

H_{θλ} = I + θλA =
[ 1+2θλ  −θλ              ]
[ −θλ  1+2θλ  −θλ         ]
[        ⋱   ⋱   ⋱       ]
[           −θλ  1+2θλ    ] ,    (3.5)

K_{θ′λ} = I − θ′λA =
[ 1−2θ′λ  θ′λ              ]
[ θ′λ  1−2θ′λ  θ′λ         ]
[        ⋱   ⋱   ⋱        ]
[           θ′λ  1−2θ′λ    ] .    (3.6)
We remark that (3.4) coincides with the explicit and simple implicit schemes when θ = 0 and θ = 1, respectively. The cases θ = 0 and θ = 1 have been described in Theorems 2.2 and 3.1, so here we deal only with the case 0 < θ < 1.
Theorem 3.4. Assume that

0 < θ < 1,  1 − 2(1 − θ)λ ≥ 0.    (3.7)

Then, for any n ≥ 1, there exists a unique solution u^(n) = (u_i^n) ∈ R^N of (3.4) with the properties

(ℓ∞ stability) ∥u^(n)∥_∞ ≤ ∥a∥_∞;

(non-negativity) a ≥ 0 ⇒ u^(n) ≥ 0;

(positivity) a ≥ 0, a ≠ 0 ⇒ u^(n) > 0.
Proof. Since 0 < θ and λ > 0, H_{θλ} is non-singular, and H_{θλ}^{-1} > O and ∥H_{θλ}^{-1}∥_∞ ≤ 1, as is verified in the proof of Lemma 3.2. Under the assumption (3.7), we also have K_{θ′λ} ≥ O and ∥K_{θ′λ}∥_∞ = 1. Hence,

∥u^(n+1)∥_∞ = ∥H_{θλ}^{-1} K_{θ′λ} u^(n)∥_∞ ≤ ∥u^(n)∥_∞.

Next, assuming that a ≥ 0, a ≠ 0, we have K_{θ′λ} a ≥ 0, K_{θ′λ} a ≠ 0. Therefore, u^(1) = H_{θλ}^{-1} K_{θ′λ} a > 0 and, consequently, u^(n) > 0 for n ≥ 1.
Remark. The case θ = 1/2 is called the Crank-Nicolson scheme.
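As an illustration of Theorem 3.4, the following pure-Python sketch (our own; the notes themselves use Scilab and LU factorization in §3.4) advances the implicit θ scheme (3.4), solving the tridiagonal system H_{θλ}u^(n) = K_{θ′λ}u^(n−1) at each step with the Thomas algorithm, and checks the ℓ∞ stability ∥u^(n)∥_∞ ≤ ∥a∥_∞.

```python
import math

def theta_step(u, lam, theta):
    # one step of the implicit theta scheme with homogeneous Dirichlet ends
    N = len(u)
    # right-hand side K u = (I - (1-theta)*lam*A) u
    rhs = []
    for i in range(N):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < N - 1 else 0.0
        rhs.append(u[i] + (1.0 - theta) * lam * (left - 2.0 * u[i] + right))
    # Thomas algorithm for H = I + theta*lam*A (diagonal b, off-diagonals a)
    a, b = -theta * lam, 1.0 + 2.0 * theta * lam
    cp, dp = [0.0] * N, [0.0] * N
    cp[0], dp[0] = a / b, rhs[0] / b
    for i in range(1, N):
        m = b - a * cp[i - 1]
        cp[i] = a / m
        dp[i] = (rhs[i] - a * dp[i - 1]) / m
    x = [0.0] * N
    x[-1] = dp[-1]
    for i in range(N - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# theta = 1/2, lam = 1 satisfies (3.7): 1 - 2*(1-theta)*lam = 0 >= 0
N, lam, theta = 31, 1.0, 0.5
u = [math.sin(math.pi * (i + 1) / (N + 1)) for i in range(N)]
a_norm = max(abs(w) for w in u)
for _ in range(100):
    u = theta_step(u, lam, theta)
print(max(abs(w) for w in u) <= a_norm)  # l-infinity stability holds
```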
3.3 Inhomogeneous problems
We now consider the initial-boundary value problem with a source term:

u_t = k u_xx + f(x, t)  (0 < x < 1, t > 0),
u(0, t) = u(1, t) = 0  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1),    (1.2)
where f(x, t) is a given continuous function that represents a supply or absorption of heat.
The explicit scheme reads as

(u_i^{n+1} − u_i^n)/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² + f(x_i, t_n)  (1 ≤ i ≤ N, n ≥ 0),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1).

We set λ = kτ/h² and

f^(n) = (f(x_1, t_n), f(x_2, t_n), …, f(x_N, t_n))^T.

Since the first and second equalities can be expressed as

(1/τ)(u^(n+1) − u^(n)) = −(k/h²) A u^(n) + f^(n),

the explicit scheme is equivalently written as

u^(n) = K_λ u^(n−1) + τ f^(n−1)  (n ≥ 1),  u^(0) = a,
where K_λ is defined as in (2.5). On the other hand, the simple implicit scheme reads as

(u_i^n − u_i^{n−1})/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² + f(x_i, t_n)  (1 ≤ i ≤ N, n ≥ 1),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1),

or, equivalently,

H_λ u^(n) = u^(n−1) + τ f^(n)  (n ≥ 1),  u^(0) = a,

where H_λ is defined as in (3.3). Finally, the implicit θ scheme reads as

H_{θλ} u^(n) = K_{θ′λ} u^(n−1) + τ[(1 − θ) f^(n−1) + θ f^(n)]  (n ≥ 1),  u^(0) = a,    (3.8)

where we write f^(n−1+θ) = (1 − θ) f^(n−1) + θ f^(n), and H_{θλ} and K_{θ′λ} are those defined by (3.5) and (3.6), respectively.
Theorem 3.5. Assume that

0 ≤ θ ≤ 1,  2(1 − θ)λ ≤ 1.

Then, for any n ≥ 1, there exists a unique solution u^(n) = (u_i^n) ∈ R^N of (3.8) that satisfies the following:

(ℓ∞ stability) ∥u^(n)∥_∞ ≤ ∥a∥_∞ + τ Σ_{l=1}^{n} ∥f^(l−1+θ)∥_∞;

(positivity) θ ≠ 0, a ≥ 0, a ≠ 0, f^(l) ≥ 0 (0 ≤ l ≤ n) ⇒ u^(n) > 0;

(non-negativity) θ = 0, a ≥ 0, f^(l) ≥ 0 (0 ≤ l ≤ n − 1) ⇒ u^(n) ≥ 0.
Proof. It is done in a similar way to that of Theorem 3.4.
3.4 Numerical experiments by Scilab
As seen in the previous paragraphs, we often meet a system of linear equations of the form

H u^(n) = g^(n),  H = H_{θλ},  g^(n) = K_{θ′λ} u^(n−1) + τ f^(n−1+θ),

where the tri-diagonal matrix H ∈ R^{N×N} is given by

H =
[ 1+2μ  −μ            ]
[ −μ  1+2μ  −μ        ]
[      ⋱   ⋱   ⋱     ]
[         −μ  1+2μ    ] ,   μ = θλ.
Lemma 3.6. If G = (g_{ij}) ∈ R^{N×N} is a strictly diagonally dominant matrix, i.e.,

|g_{ii}| > Σ_{j=1, j≠i}^{N} |g_{ij}|  (1 ≤ i ≤ N),

then G is non-singular and admits an LU factorization (without pivoting).

Proof. See, for example, Chapter 2 of [35].
Since H above is a strictly diagonally dominant matrix, we can apply Lemma 3.6 to obtain a unique LU factorization

H = LU,  L: the unit lower triangular matrix,  U: the upper triangular matrix.

From this, we have

H u^(n) = g^(n)  ⇔  Lc^(n) = g^(n) and U u^(n) = c^(n).
Thus, the implicit θ scheme is computed by the following steps:

• LU factorization: H is decomposed as H = LU;
• For n = 1, 2, …:
  – Forward elimination: find c^(n) by solving Lc^(n) = g^(n);
  – Backward substitution: find u^(n) by solving U u^(n) = c^(n).
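The steps above can also be sketched in pure Python (an illustration of the procedure, reusing the same 3×3 example as the Scilab session that follows): Doolittle LU factorization without pivoting, then forward and backward substitution.

```python
def lu_nopivot(A):
    # Doolittle LU factorization without pivoting (justified by Lemma 3.6
    # when A is strictly diagonally dominant): A = L U,
    # L unit lower triangular, U upper triangular
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def forward_sub(L, b):
    # solve L y = b (forward elimination)
    y = []
    for i in range(len(b)):
        y.append(b[i] - sum(L[i][j] * y[j] for j in range(i)))
    return y

def back_sub(U, y):
    # solve U x = y (backward substitution)
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# the same example as in the Scilab session below
A = [[3.0, -1.0, 0.0], [1.0, 4.0, 2.0], [1.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0]
L, U = lu_nopivot(A)
x = back_sub(U, forward_sub(L, b))
print(x)  # approximately [0.3225806, -0.0322581, 0.9032258]
```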
This procedure is done in Scilab as follows:
In Scilab Window:
// def of example
--> A = [3, -1, 0; 1, 4, 2; 1, 1, 3];
--> b = [1,2,3]’;
// LU factorization
--> [L, U] = lu(A)
U =
3. - 1. 0.
0. 4.3333333 2.
0. 0. 2.3846154
L =
1. 0. 0.
0.3333333 1. 0.
0.3333333 0.3076923 1.
// Solve Ly = b
--> y = L\b
y =
1.
1.6666667
2.1538462
// Solve Ux = y
--> x = U\y
x =
0.3225806
- 0.0322581
0.9032258
// Check the residual
--> norm(b-A*x)
ans =
0.
Moreover, an LU factorization for sparse matrices is available (the matrix H above is a sparse matrix!):

In Scilab Window:
// def of example
--> m = 10;
--> A = 2*eye(m,m)-diag(ones(m-1,1),-1)-diag(ones(m-1,1),1);
--> b = [1:1:10]’;
// redefinition of A as a sparse matrix
--> As = sparse(A);
// LU factorization for a sparse matrix
--> [Lh, rk] = lufact(As);
--> x=lusolve(Lh, b)
x =
20.
39.
56.
70.
80.
85.
84.
76.
60.
35.
--> norm(b-A*x)
ans =
3.178D-14
The following Listing 4 (heat23.sci) is a Scilab code for computing the implicit θ scheme (3.8). An example of computations is given in Fig. 3.1.
Listing 4: heat23.sci
// *** implicit theta scheme for heat equation with source ***
function heat23(N, lambda, theta, Tmax, coef)
  a = 0.0; b = 1.0; ua = 0.0; ub = 0.0;
  h = (b - a)/(N + 1); x = [a + h: h: b - h]'; xx = [a; x; b];
  tau = lambda*h*h/coef; nmax = int(Tmax/tau) + 1;
  step_num = 30; step = max(int(nmax/step_num), 1); rate = 0.05;
  u = func_a(x); uu = [ua; u; ub]; tt = 0.0*ones(N+2,1);
  // scf() opens a new window for drawing
  scf(10); set('current_figure',10); HF = get('current_figure');
  set(HF, 'figure_size', [800, 400]);
  utp = max(uu) + rate*(max(uu)-min(uu)); ubt = min(uu) - rate*(max(uu)-min(uu));
  subplot(1,2,1); plot2d(xx, uu, style = 5);
  subplot(1,2,2); param3d(xx, tt, uu, flag=[1,4], ebox=[min(xx),max(xx),-0.001,Tmax,ubt,utp]);
  // def of matrices A and K
  A = 2*eye(N, N) - diag(ones(N-1, 1), -1) - diag(ones(N-1, 1), 1);
  K = eye(N, N) - (1-theta)*lambda*A; H = eye(N, N) + theta*lambda*A;
  // LU factorization of H
  Hs = sparse(H); Ls = lufact(Hs);
  // iteration
  tpast = 0.0;
  for n = 1:nmax
    tnow = n*tau;
    u = K*u + tau*((1.0-theta)*func_f(x,tpast) + theta*func_f(x,tnow));
    u = lusolve(Ls, u);
    if modulo(n, step)==0
      uu = [ua; u; ub]; tt = tnow*ones(N+2,1);
      utp = max(uu) + rate*(max(uu)-min(uu)); ubt = min(uu) - rate*(max(uu)-min(uu));
      subplot(1,2,1); plot2d(xx, uu, style = 2);
      subplot(1,2,2); param3d(xx, tt, uu, -45, 65, flag=[1,4], ebox=[min(xx),max(xx),min(tt),max(tt),ubt,utp]);
    end
    tpast = tnow;
  end
  // labels
  subplot(1,2,1); xlabel('x'); ylabel('u');
  subplot(1,2,2); xlabel('x'); ylabel('t'); zlabel('u');
  // pdf file
  xs2pdf(10,'heat23.pdf');
endfunction
// *** local functions ***
// Initial values
function [y] = func_a(x)
  //y = min(x, 1.0 - x);
  y = x.*sin(3*%pi*x).*sin(3*%pi*x);
  //y = sin(%pi*x);
  //I = find(x<=0.3); J = find(x>0.3 & x<=0.6); K = find(x>0.6);
  //y = zeros(size(x,1)); y(I) = 0.3; y(J) = -2*(x(J) - 0.6); y(K) = 0.6;
endfunction
// source term
function [y] = func_f(x, t)
  //y = 0.0;
  y = exp(t+3)*(x.^2).*(1-x);
endfunction
Figure 3.1: An example of computation of the implicit θ scheme (3.8) by heat23.sci; k = 1, a(x) = x sin²(3πx), f(x, t) = e^{t+3} x²(1 − x), N = 63, λ = 1.0, θ = 0.5.
4 Convergence and error estimates
4.1 ℓ∞ analysis
We consider the initial-boundary value problem

u_t = k u_xx + f(x, t)  (0 < x < 1, t > 0),
u(0, t) = u(1, t) = 0  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1)    (1.2)
and the implicit θ scheme

H_{θλ} u^(n) = K_{θ′λ} u^(n−1) + τ f^(n−1+θ)  (n ≥ 1),  u^(0) = a,    (3.8)

where

θ′ = 1 − θ,  f^(n−1+θ) = (1 − θ) f^(n−1) + θ f^(n) = (f_i^{n−1+θ}).

In this section, we study the behavior of the error

e^(n) = (e_i^n) ∈ R^N,  e_i^n = u(x_i, t_n) − u_i^n,

where u(x, t) and u^(n) = (u_i^n) are the solutions of (1.2) and (3.8), respectively. We need some more notation.
Notation.

D_τ v_i^n = (v_i^n − v_i^{n−1})/τ,  Δ_h v_i^n = (v_{i−1}^n − 2v_i^n + v_{i+1}^n)/h².

Then, the problem (3.8) is rewritten as

D_τ u_i^n = (1 − θ) k Δ_h u_i^{n−1} + θ k Δ_h u_i^n + f_i^{n−1+θ}  (1 ≤ i ≤ N, n ≥ 1),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1).    (3.8′)
Moreover, setting U_i^n = u(x_i, t_n), we have

D_τ e_i^n − (1 − θ) k Δ_h e_i^{n−1} − θ k Δ_h e_i^n
  = D_τ U_i^n − (1 − θ) k Δ_h U_i^{n−1} − θ k Δ_h U_i^n − [D_τ u_i^n − (1 − θ) k Δ_h u_i^{n−1} − θ k Δ_h u_i^n]
  = D_τ U_i^n − (1 − θ) k Δ_h U_i^{n−1} − θ k Δ_h U_i^n − f_i^{n−1+θ}
  ≡ r_i^n.

Thus, e^(n) = (e_i^n), which is called the error vector, is a solution of

D_τ e_i^n = (1 − θ) k Δ_h e_i^{n−1} + θ k Δ_h e_i^n + r_i^n  (1 ≤ i ≤ N, n ≥ 1),
e_0^n = e_{N+1}^n = 0  (n ≥ 1),
e_i^0 = 0  (0 ≤ i ≤ N + 1).

At this stage, introducing r^(n) = (r_i^n), which is called the residual vector, we obtain

H_{θλ} e^(n) = K_{θ′λ} e^(n−1) + τ r^(n)  (n ≥ 1),  e^(0) = 0.    (4.1)
Therefore, in view of Theorem 3.5, we have the following lemma.

Lemma 4.1. Assume that

0 ≤ θ ≤ 1,  2(1 − θ)λ ≤ 1.    (4.2)

Then, the error vector e^(n) = (e_i^n) satisfies

∥e^(n)∥_∞ ≤ τ Σ_{l=1}^{n} ∥r^(l)∥_∞.
Let Q = [0, 1] × [0, T] with a positive constant T.

Notation. For a continuous function v defined in Q, we write

∥v(x, ·)∥_{L∞(0,T)} = max_{0≤t≤T} |v(x, t)|  for 0 ≤ x ≤ 1,
∥v(·, t)∥_{L∞(0,1)} = max_{0≤x≤1} |v(x, t)|  for 0 ≤ t ≤ T,
∥v∥_{L∞(Q)} = max_{(x,t)∈Q} |v(x, t)|.

Then, ∥v∥_{L∞(Q)} becomes a norm of C(Q).

Notation. For a sufficiently smooth function v = v(x, t) and a positive integer m, we write

∂_t^m v = ∂^m v/∂t^m,  ∂_x^m v = ∂^m v/∂x^m.
The following lemma plays a crucial role.

Lemma 4.2. For arbitrary T > 0, set Q = [0, 1] × [0, T]. Assume that

∂_x^m u ∈ C(Q) (0 ≤ m ≤ 4),  ∂_t^l u ∈ C(Q) (0 ≤ l ≤ 2 if θ ≠ 1/2; 0 ≤ l ≤ 3 if θ = 1/2).    (4.3)

Then, the residual vector r^(n) = (r_i^n) admits the estimate

∥r^(n)∥_∞ ≤ α_θ(T)  for any n satisfying 0 ≤ t_n ≤ T,

where

α_θ(T) = (k/12) h² ∥∂_x^4 u∥_{L∞(Q)} + (τ/2) ∥∂_t^2 u∥_{L∞(Q)}  (θ ≠ 1/2),
α_θ(T) = (k/12) h² ∥∂_x^4 u∥_{L∞(Q)} + (5/12) τ² ∥∂_t^3 u∥_{L∞(Q)}  (θ = 1/2).    (4.4)
Theorem 4.3. Let T > 0 be fixed. Assume that (4.2) and (4.3) are satisfied. Then, we have the error estimate

max_{0≤t_n≤T} ∥e^(n)∥_∞ ≤ C_{T,θ}(τ + h²)  (θ ≠ 1/2),
max_{0≤t_n≤T} ∥e^(n)∥_∞ ≤ C_{T,1/2}(τ² + h²)  (θ = 1/2),    (4.5)

where

C_{T,θ} = T · max{ (k/12)∥∂_x^4 u∥_{L∞(Q)}, (1/2)∥∂_t^2 u∥_{L∞(Q)} }  (θ ≠ 1/2),
C_{T,1/2} = T · max{ (k/12)∥∂_x^4 u∥_{L∞(Q)}, (5/12)∥∂_t^3 u∥_{L∞(Q)} }.    (4.6)
Proof of Theorem 4.3. It is a direct consequence of Lemmas 4.1 and 4.2.
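The rate in (4.5) can be observed numerically. The following pure-Python experiment (our own illustration) runs the explicit scheme (θ = 0) for u_t = u_xx with exact solution e^{−π²t} sin(πx), keeping λ = 1/2 fixed so that τ ∝ h² and the total error behaves like O(τ + h²) = O(h²); halving h should divide the error by about 4.

```python
import math

def explicit_heat(N, lam, T):
    # explicit scheme (theta = 0) for u_t = u_xx, u(0,t) = u(1,t) = 0,
    # a(x) = sin(pi x); exact solution exp(-pi^2 t) sin(pi x)
    h = 1.0 / (N + 1)
    tau = lam * h * h                      # k = 1
    u = [math.sin(math.pi * (i + 1) * h) for i in range(N)]
    t = 0.0
    while t + tau <= T + 1e-12:
        new = []
        for i in range(N):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < N - 1 else 0.0
            new.append(u[i] + lam * (left - 2.0 * u[i] + right))
        u = new
        t += tau
    exact = [math.exp(-math.pi ** 2 * t) * math.sin(math.pi * (i + 1) * h)
             for i in range(N)]
    return max(abs(p - q) for p, q in zip(u, exact))   # l-infinity error at t = T

e1 = explicit_heat(19, 0.5, 0.1)   # h = 1/20
e2 = explicit_heat(39, 0.5, 0.1)   # h = 1/40
print(e1 / e2)  # roughly 4, confirming the O(h^2) behavior for fixed lambda
```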
Proof of Lemma 4.2. By considering the heat equation at (x_i, t_n) and (x_i, t_{n−1}), we have

f(x_i, t_n) = u_t(x_i, t_n) − k u_xx(x_i, t_n),
f(x_i, t_{n−1}) = u_t(x_i, t_{n−1}) − k u_xx(x_i, t_{n−1}).

Hence, r_i^n = R_1 − R_2 − R_3, where

R_1 = D_τ u(x_i, t_n) − (1 − θ) u_t(x_i, t_{n−1}) − θ u_t(x_i, t_n),
R_2 = (1 − θ) k [Δ_h u(x_i, t_{n−1}) − u_xx(x_i, t_{n−1})],
R_3 = θ k [Δ_h u(x_i, t_n) − u_xx(x_i, t_n)].

First, we derive the estimates for the space discretization. The error estimate (2.3) gives

|Δ_h u(x_i, t_{n−1}) − u_xx(x_i, t_{n−1})| ≤ (1/12) h² ∥∂_x^4 u(·, t_{n−1})∥_{L∞(0,1)},
|Δ_h u(x_i, t_n) − u_xx(x_i, t_n)| ≤ (1/12) h² ∥∂_x^4 u(·, t_n)∥_{L∞(0,1)}.

Therefore,

|R_2| + |R_3| ≤ (k/12) h² ∥∂_x^4 u∥_{L∞(Q)}.

Next, we examine the time discretization. Suppose θ ≠ 1/2. For the sake of simplicity, setting v(t) = u(x_i, t), we have

R_1 = D_τ v(t_n) − (1 − θ) v′(t_{n−1}) − θ v′(t_n)
  = (1 − θ)[D_τ v(t_n) − v′(t_{n−1})] + θ[D_τ v(t_n) − v′(t_n)].
Hence, by using (2.1) and (2.2), we obtain

|R_1| ≤ (1 − θ) · (τ/2) ∥∂_t^2 u(x_i, ·)∥_{L∞(0,T)} + θ · (τ/2) ∥∂_t^2 u(x_i, ·)∥_{L∞(0,T)}
  ≤ (1 − θ) · (τ/2) ∥∂_t^2 u∥_{L∞(Q)} + θ · (τ/2) ∥∂_t^2 u∥_{L∞(Q)}
  = (τ/2) ∥∂_t^2 u∥_{L∞(Q)}.
Combining these inequalities, we have |r_i^n| ≤ |R_1| + |R_2| + |R_3| ≤ α_θ(T).

Now, we suppose θ = 1/2. By Taylor's theorem,

v(t_n) = v(t_{n−1}) + v′(t_{n−1})τ + (1/2)v″(t_{n−1})τ² + (1/3!)v^(3)(ξ)τ³,
v(t_{n−1}) = v(t_n) − v′(t_n)τ + (1/2)v″(t_n)τ² − (1/3!)v^(3)(η)τ³,

with intermediate points ξ, η ∈ (t_{n−1}, t_n). These imply

(v(t_n) − v(t_{n−1}))/τ − v′(t_{n−1}) = (1/2)v″(t_{n−1})τ + (1/3!)v^(3)(ξ)τ²,
(v(t_n) − v(t_{n−1}))/τ − v′(t_n) = −(1/2)v″(t_n)τ + (1/3!)v^(3)(η)τ²,

and, moreover,

R_1 = D_τ v(t_n) − (1/2)v′(t_{n−1}) − (1/2)v′(t_n)
  = (1/2)[D_τ v(t_n) − v′(t_{n−1})] + (1/2)[D_τ v(t_n) − v′(t_n)]
  = (τ/4)[v″(t_{n−1}) − v″(t_n)] + (τ²/(2·3!))[v^(3)(ξ) + v^(3)(η)]
  = −(τ/4) ∫_{t_{n−1}}^{t_n} v^(3)(s) ds + (τ²/(2·3!))[v^(3)(ξ) + v^(3)(η)].

Therefore, we have

|R_1| ≤ (τ/4) ∫_{t_{n−1}}^{t_n} ∥∂_t^3 u∥_{L∞(Q)} ds + (τ²/(2·3!)) · 2∥∂_t^3 u∥_{L∞(Q)}
  = (τ²/4)∥∂_t^3 u∥_{L∞(Q)} + (τ²/6)∥∂_t^3 u∥_{L∞(Q)}
  = (5τ²/12)∥∂_t^3 u∥_{L∞(Q)}.

Thus, we deduce |r_i^n| ≤ |R_1| + |R_2| + |R_3| ≤ α_{1/2}(T).
Remark. We extend u^(n) = (u_i^n) to a function u_{h,τ}(x, t) by piecewise-constant interpolation; that is, we set

u_{h,τ}(x, t) = u_i^n  (x_{i−1} ≤ x < x_i, t_n ≤ t < t_{n+1}).

Let T > 0 be fixed. Assume that (4.2) and (4.3) are satisfied. Then, we have the error estimate

∥u − u_{h,τ}∥_{L∞(Q)} ≤ C_{T,θ}(τ + h²)  (θ ≠ 1/2),
∥u − u_{h,τ}∥_{L∞(Q)} ≤ C_{T,1/2}(τ² + h²)  (θ = 1/2).
4.2 ℓ2 analysis
We continue to study the error e^(n) of the implicit θ scheme for the heat equation.

Definition.
(i) The vector 2-norm is defined as

∥v∥_2 (= ∥v∥_{ℓ²}) = ( Σ_{i=1}^{N} |v_i|² )^{1/2}  (v = (v_i) ∈ R^N).

(ii) The matrix 2-norm is defined as

∥G∥_2 = max_{0 ≠ v ∈ R^N} ∥Gv∥_2 / ∥v∥_2  (G ∈ R^{N×N}).
Lemma 4.4. Let λ_1, …, λ_N be the eigenvalues of a symmetric matrix G ∈ R^{N×N}. Then, we have ∥G∥_2 = max_{1≤i≤N} |λ_i|.

Proof. Set ρ(G) = max_{1≤i≤N} |λ_i|, which is called the spectral radius. Since G is a real symmetric matrix, it admits a diagonalization G = UΛU^T, where

Λ = diag(λ_1, …, λ_N),  U^T U = UU^T = I.

We use the scalar product in R^N:

⟨x, y⟩ = Σ_{i=1}^{N} x_i y_i  (x = (x_i), y = (y_i) ∈ R^N).

Recall that ∥x∥_2² = ⟨x, x⟩ and ⟨Ax, x⟩ = ⟨x, A^T x⟩ for any x ∈ R^N and A ∈ R^{N×N}. Now let 0 ≠ v ∈ R^N and set w = U^T v. Then, we can calculate

∥w∥_2² = ⟨U^T v, U^T v⟩ = ⟨UU^T v, v⟩ = ∥v∥_2²,
∥Gv∥_2² = ⟨Gv, Gv⟩ = ⟨UΛw, UΛw⟩ = ⟨U^T UΛw, Λw⟩ = ⟨Λw, Λw⟩ = Σ_{i=1}^{N} |λ_i|²|w_i|² ≤ ρ(G)² Σ_{i=1}^{N} |w_i|² = ρ(G)²∥w∥_2² = ρ(G)²∥v∥_2².

Therefore,

∥G∥_2 = max_{0 ≠ v ∈ R^N} ∥Gv∥_2/∥v∥_2 ≤ ρ(G).

On the other hand, suppose that |λ_k| = ρ(G) and Gu = λ_k u with u ≠ 0. Then,

∥G∥_2 ≥ ∥Gu∥_2/∥u∥_2 = |λ_k|∥u∥_2/∥u∥_2 = ρ(G).

Combining these inequalities, we obtain ∥G∥_2 = ρ(G).
We again consider the initial-boundary value problem

u_t = k u_xx  (0 < x < 1, t > 0),
u(0, t) = u(1, t) = 0  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1)    (1.1)

and its implicit θ scheme

H_{θλ} u^(n) = K_{θ′λ} u^(n−1)  (n ≥ 1),  u^(0) = a.    (3.4)

The crucial point of the ℓ² analysis is to rewrite the finite difference scheme in terms of A as follows:

u^(n+1) = F_{θ,λ}(A) u^(n),  u^(0) = a,    (3.4)

where A is the tridiagonal matrix with diagonal entries 2 and off-diagonal entries −1, and

F_{θ,λ}(A) = (I + θλA)^{-1}(I − θ′λA),  θ′ = 1 − θ.

For example, we know:

• the explicit scheme: u^(n+1) = (I − λA) u^(n);
• the simple implicit scheme: u^(n+1) = (I + λA)^{-1} u^(n);
• the Crank-Nicolson scheme: u^(n+1) = (I + (λ/2)A)^{-1}(I − (λ/2)A) u^(n).
Lemma 4.5. The eigenpairs of the eigenvalue problem

Aφ = μφ,  φ = (φ_i) ≠ 0

are given by

μ^⟨m⟩ = 4 sin²( mπ/(2(N + 1)) ),  φ^⟨m⟩ = (φ_i^⟨m⟩) = ( √2 sin(mπx_i) )  (1 ≤ m ≤ N).    (4.7)
Proof. See Problem 4.
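Lemma 4.5 is also straightforward to verify numerically. The following pure-Python check (an illustration, not part of the notes) evaluates the residual Aφ^⟨m⟩ − μ^⟨m⟩φ^⟨m⟩ componentwise for every m.

```python
import math

# check A phi = mu phi for mu_m = 4 sin^2(m pi / (2(N+1))) and
# phi_m = (sqrt(2) sin(m pi x_i)) with x_i = i h, h = 1/(N+1)
N = 6
h = 1.0 / (N + 1)
max_res = 0.0
for m in range(1, N + 1):
    mu = 4.0 * math.sin(m * math.pi / (2.0 * (N + 1))) ** 2
    phi = [math.sqrt(2.0) * math.sin(m * math.pi * (i + 1) * h) for i in range(N)]
    for i in range(N):
        left = phi[i - 1] if i > 0 else 0.0     # phi_0 = sin(0) = 0
        right = phi[i + 1] if i < N - 1 else 0.0  # phi_{N+1} = sin(m pi) = 0
        res = (-left + 2.0 * phi[i] - right) - mu * phi[i]
        max_res = max(max_res, abs(res))
print(max_res)  # essentially zero
```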
Lemma 4.6.
(i) For 1/2 ≤ θ ≤ 1, we have ∥F_{θ,λ}(A)∥_2 ≤ 1.
(ii) For 0 ≤ θ < 1/2, we have ∥F_{θ,λ}(A)∥_2 ≤ 1 if 2λ(1 − 2θ) ≤ 1.

Proof. We note the following:

1. F_{θ,λ}(A) is a symmetric matrix;
2. {F_{θ,λ}(μ^⟨m⟩)}_{m=1}^{N} are all the eigenvalues of F_{θ,λ}(A).

(See Problems 5 and 6.) Hence, Lemma 4.4 gives

∥F_{θ,λ}(A)∥_2 = max_{1≤m≤N} |F_{θ,λ}(μ^⟨m⟩)|,

where we have introduced the real-valued function F_{θ,λ}(s) = (1 + θλs)^{-1}(1 − θ′λs) for s ≥ 0 associated with F_{θ,λ}(A). The function F_{θ,λ}(s) satisfies

F_{θ,λ}(0) = 1;  (d/ds)F_{θ,λ}(s) = −λ/(1 + θλs)² < 0;  F_{θ,λ}(s) = −1 ⇔ sλ(1 − 2θ) = 2.

Therefore,

0 ≤ θ < 1/2 ⇒ |F_{θ,λ}(s)| ≤ 1 (∀ s > 0 s.t. sλ(1 − 2θ) ≤ 2),
1/2 ≤ θ ≤ 1 ⇒ |F_{θ,λ}(s)| ≤ 1 (∀ s > 0).

Thus, if 1/2 ≤ θ ≤ 1, we always have ∥F_{θ,λ}(A)∥_2 ≤ 1. On the other hand, if 0 ≤ θ < 1/2, we have

0 < μ^⟨m⟩ = 4 sin²( mπ/(2(N + 1)) ) ≤ 4 ≤ 2/(λ(1 − 2θ))  (1 ≤ m ≤ N)

under the condition 2λ(1 − 2θ) ≤ 1. This implies ∥F_{θ,λ}(A)∥_2 ≤ 1.
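The proof can be illustrated numerically: by Lemma 4.4, ∥F_{θ,λ}(A)∥_2 = max_m |F_{θ,λ}(μ^⟨m⟩)|, which the following pure-Python sketch (our own) evaluates for stable and unstable parameter choices.

```python
import math

def F(theta, lam, s):
    # amplification function F_{theta,lambda}(s) = (1 - (1-theta)*lam*s)/(1 + theta*lam*s)
    return (1.0 - (1.0 - theta) * lam * s) / (1.0 + theta * lam * s)

def norm2(theta, lam, N):
    # ||F_{theta,lambda}(A)||_2 = max_m |F(mu_m)| with mu_m from (4.7)
    return max(abs(F(theta, lam, 4.0 * math.sin(m * math.pi / (2 * (N + 1))) ** 2))
               for m in range(1, N + 1))

N = 50
print(norm2(0.5, 10.0, N))  # <= 1: theta >= 1/2, no condition on lambda
print(norm2(0.0, 0.5, N))   # <= 1: 2*lam*(1 - 2*theta) = 1 is satisfied
print(norm2(0.0, 0.6, N))   # > 1: condition (4.8) violated
```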
We then state stability and convergence results.

Notation. We introduce

∥v∥_h = √h ∥v∥_2 = ( h Σ_{i=1}^{N} v_i² )^{1/2}.

Obviously, ∥v∥_h ≤ ∥v∥_∞.
Theorem 4.7. Assume that

2λ(1 − 2θ) ≤ 1  (0 ≤ θ < 1/2);  no condition  (1/2 ≤ θ ≤ 1).    (4.8)

Then, the solution u^(n) of (3.4) satisfies ∥u^(n)∥_h ≤ ∥a∥_h for n ≥ 1.

Proof. It is a direct consequence of Lemma 4.6 and

∥u^(n+1)∥_h = ∥F_{θ,λ}(A) u^(n)∥_h ≤ ∥F_{θ,λ}(A)∥_2 ∥u^(n)∥_h ≤ ∥u^(n)∥_h.
Now, we consider the initial-boundary value problem with an inhomogeneous source term:

u_t = k u_xx + f(x, t)  (0 < x < 1, t > 0),
u(0, t) = u(1, t) = 0  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1)    (1.2)

and its implicit θ scheme

H_{θλ} u^(n) = K_{θ′λ} u^(n−1) + τ f^(n−1+θ),    (3.8)

or, equivalently,

u^(n) = F_{θ,λ}(A) u^(n−1) + τ(I + θλA)^{-1} f^(n−1+θ),    (3.8)

where f^(n−1+θ) = (1 − θ) f^(n−1) + θ f^(n).
Theorem 4.8. Under the assumption (4.8), the solution u^(n) of (3.8) satisfies

∥u^(n)∥_h ≤ ∥a∥_h + τ Σ_{l=1}^{n} ∥f^(l−1+θ)∥_h  for n ≥ 1.

Proof. We can prove ∥(I + θλA)^{-1}∥_2 ≤ 1 (θ, λ > 0) in exactly the same way as in the proof of Lemma 4.6. Hence, the result follows from Lemma 4.6.
Theorem 4.9. For arbitrary T > 0, set Q = [0, 1] × [0, T]. Assume that (4.8) and (4.3) are satisfied. Then, we have the error estimate

max_{0≤t_n≤T} ∥e^(n)∥_h ≤ C_{T,θ}(τ + h²)  (θ ≠ 1/2),
max_{0≤t_n≤T} ∥e^(n)∥_h ≤ C_{T,1/2}(τ² + h²)  (θ = 1/2),    (4.9)

where C_{T,θ} is the positive constant defined by (4.6).
Proof. By virtue of Theorem 4.8,

∥e^(n)∥_h ≤ τ Σ_{l=1}^{n} ∥r^(l)∥_h ≤ τ Σ_{l=1}^{n} ∥r^(l)∥_∞.

This, together with Lemma 4.2, implies (4.9).
4.3 Numerical examples
In this section, we offer some numerical examples in order to confirm the validity of Theorems 4.3 and 4.9. To this end, we use two Scilab functions (see Listings 5 and 6):

heat_error1.sci, error_plot1.sci.
Listing 5: heat_error1.sci
// *** error observation 1: implicit theta scheme with source ***
// Output: h: mesh size, err0: error in L^\infty, err2: error in L^2
function [err0, err2, h] = heat_error1(N, lambda, theta, Tmax, coef)
  a = 0.0; b = 1.0; ua = 0.0; ub = 0.0;
  h = (b - a)/(N + 1); x = [a + h: h: b - h]'; xx = [a; x; b];
  tau = lambda*h*h/coef; nmax = int(Tmax/tau) + 1;
  u = func_a(x); uu = [ua; u; ub];
  A = 2*eye(N, N) - diag(ones(N-1, 1), -1) - diag(ones(N-1, 1), 1);
  K = eye(N, N) - (1-theta)*lambda*A; H = eye(N, N) + theta*lambda*A;
  // LU factorization of H
  Hs = sparse(H); Ls = lufact(Hs);
  // iteration
  tpast = 0.0; tnow = 0.0; err0 = -1.0; err2 = -1.0;
  for n = 1:nmax
    tnow = n*tau;
    u = K*u + tau*((1.0-theta)*func_f(x,tpast) + theta*func_f(x,tnow));
    u = lusolve(Ls, u);
    errvect = u - func_sol(x, tnow);
    err0 = max(norm(errvect, %inf), err0);
    err2 = max(norm(errvect, 2)*sqrt(h), err2);
    tpast = tnow;
  end
endfunction
// ********* local functions *********
// Initial value
function [y] = func_a(x)
  //y=sin(%pi*x); // case 1
  y=(x.^3).*(1-x); // case 2
endfunction
// Heat source term
function [y] = func_f(x, t)
  //y=0.0; // case 1
  y=exp(t)*(-x.^4+x.^3+12*x.^2-6*x); // case 2
endfunction
// exact solution
function [y] = func_sol(x, t)
  //y=exp(-%pi^2*t)*sin(%pi*x); // case 1
  y=exp(t)*(x.^3).*(1-x); // case 2
endfunction
Listing 6: error_plot1.sci
// *** error observation: dependence on h (implicit theta scheme with source) ***
function [errvect0, errvect2, hvect] = error_plot1(lambda, theta, Tmax, coef)
  errvect0 = []; errvect2 = []; hvect = []; N0 = 10; jmax = 5;
  for j = 1:jmax
    N = N0*j;
    [err0, err2, h] = heat_error1(N, lambda, theta, Tmax, coef)
    errvect0 = [errvect0; err0]; errvect2 = [errvect2; err2]; hvect = [hvect; h];
  end
  // plot errors
  scf(20); set('current_figure',20); HF = get('current_figure');
  set(HF, 'figure_size', [400, 800]);
  xset('thickness',2)
  plot2d(hvect, errvect0, style = 2, logflag="ll");
  plot2d(hvect, errvect2, style = 5, logflag="ll");
  xset('thickness',1)
  xgrid(); xtitle('Mesh size h vs. E_infty and E_2 Errors','log (h)','log (error)')
  legend('E_infty','E_2',4);
  xs2pdf(20,'error1.pdf')
endfunction
Letting T = 1, we define

E_∞ = max_{0≤t_n≤1} ∥e^(n)∥_∞,  E_2 = max_{0≤t_n≤1} ∥e^(n)∥_h = max_{0≤t_n≤1} ∥e^(n)∥_2 √h.

⋆ Example 4.10. Let f(x, t) = 0 and a(x) = sin(πx). Then, the solution of (1.2) is

u(x, t) = e^{−π²t} sin(πx).

In Fig. 4.1, we observe that E_∞ ≈ Ch² and E_2 ≈ Ch².
⋆ Example 4.11. Let f(x, t) = e^t(−x⁴ + x³ + 12x² − 6x) and a(x) = x³(1 − x). Then, the solution of (1.2) is

u(x, t) = e^t x³(1 − x).

In Fig. 4.2, we observe that E_∞ ≈ Ch² and E_2 ≈ Ch².
(a) θ = 0  (b) θ = 1/2  (c) θ = 1

Figure 4.1: log h vs. log E_∞ and log E_2 for u(x, t) = e^{−π²t} sin(πx) (Example 4.10) and λ = 1/2.

(a) θ = 0  (b) θ = 1/2  (c) θ = 1

Figure 4.2: log h vs. log E_∞ and log E_2 for u(x, t) = e^t x³(1 − x) (Example 4.11) and λ = 1/2.
5 Nonlinear problems
5.1 Semilinear diffusion equation
Our next target is the initial-boundary value problem for a semilinear diffusion equation:

u_t = k u_xx + ε(1 − u)u  (0 < x < 1, t > 0),
u(0, t) = u(1, t) = 0  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1),    (5.1)

where

• k, ε are positive constants;
• a ∈ C[0, 1], a ≢ 0, 0 ≤ a ≤ 1, a(0) = a(1) = 0.
Theorem 5.1. The problem (5.1) admits a unique (classical) solution u = u(x, t) in [0, 1] × [0, ∞), and it satisfies 0 ≤ u(x, t) ≤ 1 for 0 ≤ x ≤ 1 and t ≥ 0.

We introduce the steady-state problem associated with (5.1): find a function w = w(x) (0 ≤ x ≤ 1), depending only on x, such that

0 = k w″ + ε(1 − w)w,  w > 0  (0 < x < 1),
w(0) = w(1) = 0.    (5.2)

The function w ≡ 0, which is called the trivial solution, clearly solves the equation in (5.2).
Theorem 5.2. We have the following:

(i) if ε > kπ², then (5.2) admits a unique non-trivial solution w(x);

(ii) if ε ≤ kπ², then (5.2) admits no non-trivial solution.

The asymptotic behavior of the solution u(x, t) of (5.1) is summarized as follows.

Theorem 5.3. Let u(x, t) be the solution of (5.1).

(i) If ε > kπ², we have lim_{t→∞} ∥u(·, t) − w∥_∞ = 0, where w(x) denotes the non-trivial solution of (5.2).

(ii) If ε ≤ kπ², we have lim_{t→∞} ∥u(·, t)∥_∞ = 0.
The proofs of Theorems 5.1–5.3 are found, for example, in
[15] Y. Kametaka: On the nonlinear diffusion equation of Kolmogorov–Petrovskii–Piskunov type, Osaka J. Math. 13 (1976) 11–66;
[16] Y. Kametaka: Nonlinear Partial Differential Equations, Sangyo Tosho, 1977 (in Japanese).
We would like to verify Theorem 5.3 with the aid of numerical examples. Todo this, we introduce finite difference approximations in the following sections.
           k     ε    λ    r        ρ
Fig. 5.1   10    10   0.4  0.0101   0.8002
Fig. 5.2   1     10   0.4  1.0132   0.8015
Fig. 5.3   0.1   10   0.4  101.32   0.8153
Fig. 5.4   10    10   0.5  0.0101   1.0002
Fig. 5.5   1     10   0.5  1.0132   1.0019
Fig. 5.6   0.1   10   0.5  101.32   1.0192

Table 5.1: Parameters in Figs. 5.1–5.6. N is fixed as 50.
5.2 Explicit scheme
We introduce:

• h = 1/(N + 1) with 0 < N ∈ Z;
• τ > 0;
• Q_h^τ = {(x_i, t_n) | x_i = ih, t_n = nτ (0 ≤ i ≤ N + 1, n ≥ 0)};
• u_i^n ≈ u(x_i, t_n).

The explicit scheme for (5.1) now reads as

(u_i^{n+1} − u_i^n)/τ = k (u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² + ε(1 − u_i^n)u_i^n  (1 ≤ i ≤ N, n ≥ 0),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1).    (5.3)
Notation. We let

λ = kτ/h²,  r = ε/(kπ²),  ρ = τ(ε + 2k/h²),  q = (1, …, 1)^T ∈ R^N.
Below we offer some numerical results in Figs. 5.1–5.6; the parameters are summarized in Table 5.1.

The cases r > 1 and r ≤ 1, respectively, correspond to (i) and (ii) in Theorem 5.3. We observe from Figs. 5.1–5.3 that the solution decays if r ≤ 1 and that the solution converges to a non-trivial steady-state solution if r > 1. On the other hand, Figs. 5.4–5.6 are results with the same parameters as Figs. 5.1–5.3 except for λ (and thus ρ). We see that there are no differences between Figs. 5.1, 5.2 and 5.4, 5.5, respectively. However, we observe an oscillation of the numerical solution in Fig. 5.6. As a matter of fact, the value of ρ plays an important role in obtaining a stable numerical solution. We examine this issue next.
Figure 5.1: k = 10, ε = 10, N = 50, λ = 0.4, 0 ≤ t ≤ 0.2; r = 0.0101, ρ = 0.8002

Figure 5.2: k = 1, ε = 10, N = 50, λ = 0.4, 0 ≤ t ≤ 0.5; r = 1.0132, ρ = 0.8015

Figure 5.3: k = 0.1, ε = 10, N = 50, λ = 0.4, 0 ≤ t ≤ 1.6; r = 101.32, ρ = 0.8153

Figure 5.4: k = 10, ε = 10, N = 50, λ = 0.5, 0 ≤ t ≤ 0.2; r = 0.0101, ρ = 1.0002

Figure 5.5: k = 1, ε = 10, N = 50, λ = 0.5, 0 ≤ t ≤ 0.5; r = 1.0132, ρ = 1.0019

Figure 5.6: k = 0.1, ε = 10, N = 50, λ = 0.5, 0 ≤ t ≤ 1.6; r = 101.32, ρ = 1.0192
Theorem 5.4. Assume that

1 − ετ − 2λ ≥ 0  ⇔  τ ≤ (ε + 2k/h²)^{-1}  ⇔  ρ ≤ 1.    (5.4)
Then, we have

0 ≤ u^(n) ≤ q  (n ≥ 0),

where u^(n) = (u_i^n) denotes the solution of (5.3).

Proof. We argue by induction. First, by virtue of 0 ≤ a(x) ≤ 1, we have 0 ≤ u_i^0 ≤ 1 (1 ≤ i ≤ N). Suppose that 0 ≤ u_i^n ≤ 1 for some n ≥ 0. Since (5.3) gives

u_i^{n+1} = λ(u_{i−1}^n + u_{i+1}^n) + (1 − 2λ)u_i^n + ετ u_i^n(1 − u_i^n),

we deduce u_i^{n+1} ≥ 0. On the other hand, setting v_i^n = 1 − u_i^n, we have 0 ≤ v_i^n ≤ 1 and

v_i^{n+1} = λ(v_{i−1}^n + v_{i+1}^n) + (1 − 2λ − ετ u_i^n)v_i^n ≥ 0,

since 1 − 2λ − ετ u_i^n ≥ 1 − 2λ − ετ ≥ 0. Therefore, we obtain u_i^{n+1} ≤ 1.
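Theorem 5.4 can be observed in a short pure-Python experiment (our own illustration; the initial value a(x) = min(x, 1 − x) is one of the commented-out choices in Listing 4). With the parameters of Fig. 5.3 we have ρ ≈ 0.8153 ≤ 1, r ≫ 1, and the computed solution stays in [0, 1].

```python
def semilinear_explicit(N, k, eps, lam, nsteps):
    # explicit scheme (5.3) for u_t = k u_xx + eps (1 - u) u
    h = 1.0 / (N + 1)
    tau = lam * h * h / k
    u = [min((i + 1) * h, 1.0 - (i + 1) * h) for i in range(N)]  # a(x) = min(x, 1-x)
    for _ in range(nsteps):
        new = []
        for i in range(N):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < N - 1 else 0.0
            new.append(u[i] + lam * (left - 2.0 * u[i] + right)
                       + tau * eps * (1.0 - u[i]) * u[i])
        u = new
    return u

N, k, eps, lam = 50, 0.1, 10.0, 0.4      # parameters of Fig. 5.3
h = 1.0 / (N + 1)
tau = lam * h * h / k
rho = tau * (eps + 2.0 * k / (h * h))    # condition (5.4) is rho <= 1
u = semilinear_explicit(N, k, eps, lam, 2000)
print(rho, min(u), max(u))  # rho about 0.8153; solution stays within [0, 1]
```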
The convergence is also guaranteed under the condition (5.4).
Theorem 5.5. Let T > 0 be fixed and assume that (5.4) holds true. Suppose that the solution u(x, t) of (5.1) satisfies

u, u_t, u_tt, u_x, u_xx, u_xxx, u_xxxx ∈ C(Q),

where Q = [0, 1] × [0, T], and define

Z_T = (1/2)∥u_tt∥_{L∞(Q)} + (k/12)∥u_xxxx∥_{L∞(Q)}.

Let u^(n) = (u_i^n) ∈ R^N be the solution of (5.3), and define e^(n) = (e_i^n) ∈ R^N by setting e_i^n = u(x_i, t_n) − u_i^n. Then, we have

max_{0≤t_n≤T} ∥e^(n)∥_∞ ≤ ((e^{εT} − 1)/ε)(τ + h²)Z_T.
Proof. Setting U_i^n = u(x_i, t_n) and introducing the difference quotients

D_τ e_i^{n+1} = (e_i^{n+1} − e_i^n)/τ,  Δ_h e_i^n = (e_{i−1}^n − 2e_i^n + e_{i+1}^n)/h²,

we have

D_τ e_i^{n+1} − k Δ_h e_i^n
  = D_τ U_i^{n+1} − k Δ_h U_i^n − [D_τ u_i^{n+1} − k Δ_h u_i^n]
  = D_τ U_i^{n+1} − k Δ_h U_i^n − ε(1 − u_i^n)u_i^n
  = r_i^n + g_i^n,

where

r_i^n = D_τ U_i^{n+1} − k Δ_h U_i^n − [u_t(x_i, t_n) − k u_xx(x_i, t_n)],
g_i^n = ε(1 − U_i^n)U_i^n − ε(1 − u_i^n)u_i^n,

and we used u_t(x_i, t_n) − k u_xx(x_i, t_n) = ε(1 − U_i^n)U_i^n. That is,

D_τ e_i^{n+1} = k Δ_h e_i^n + r_i^n + g_i^n  (1 ≤ i ≤ N, n ≥ 0),
e_0^n = e_{N+1}^n = 0  (n > 0),
e_i^0 = 0  (0 ≤ i ≤ N + 1).

It is equivalently written as

e^(n+1) = K_λ e^(n) + τr^(n) + τg^(n)  (n ≥ 0),  e^(0) = 0,

where r^(n) = (r_i^n), g^(n) = (g_i^n), and K_λ is defined as in (2.5). Since, by (5.4), 1 − 2λ ≥ ετ > 0, we have

∥K_λ∥_∞ ≤ 1,  K_λ ≥ O.

In view of §2.1, we can estimate

|r_i^n| ≤ |D_τ U_i^{n+1} − u_t(x_i, t_n)| + k|Δ_h U_i^n − u_xx(x_i, t_n)|
  ≤ (τ/2) max_{t∈[0,T]} |u_tt(x_i, t)| + (kh²/12) max_{x∈[0,1]} |u_xxxx(x, t_n)|

and, hence, ∥r^(n)∥_∞ ≤ Z_T(τ + h²). Moreover, we can calculate

g_i^n = ε(1 − U_i^n)U_i^n − ε(1 − u_i^n)u_i^n
  = ε(1 − U_i^n)U_i^n − ε(1 − U_i^n)u_i^n + ε(1 − U_i^n)u_i^n − ε(1 − u_i^n)u_i^n
  = ε(1 − U_i^n)(U_i^n − u_i^n) − ε(U_i^n − u_i^n)u_i^n
  = ε(1 − U_i^n)e_i^n − ε e_i^n u_i^n
  = ε(1 − U_i^n − u_i^n)e_i^n.

This, together with −1 ≤ 1 − U_i^n − u_i^n ≤ 1, implies

∥g^(n)∥_∞ ≤ ε∥e^(n)∥_∞.
In conclusion, we have

∥e^(n)∥_∞ ≤ (1 + τε)∥e^(n−1)∥_∞ + τ(τ + h²)Z_T
  ≤ (1 + τε)²∥e^(n−2)∥_∞ + [(1 + τε) + 1]τ(τ + h²)Z_T
  ≤ ···
  ≤ (1 + τε)^n∥e^(0)∥_∞ + [(1 + τε)^{n−1} + ··· + (1 + τε) + 1]τ(τ + h²)Z_T
  = ((1 + τε)^n − 1)/(τε) · τ(τ + h²)Z_T
  ≤ ((e^{nτε} − 1)/ε)(τ + h²)Z_T ≤ ((e^{εT} − 1)/ε)(τ + h²)Z_T.
5.3 Implicit schemes
The condition (5.4) may be too strict in practical computations. We can avoid it by adopting semi-implicit schemes. We still take the same grid points Q_h^τ introduced in the previous subsection. We consider the semi-implicit schemes for (5.1):

(u_i^n − u_i^{n−1})/τ = k(u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² + ε(1 − u_i^{n−1})u_i^{n−1}  (1 ≤ i ≤ N, n ≥ 1),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1)    (5.5)

and

(u_i^n − u_i^{n−1})/τ = k(u_{i−1}^n − 2u_i^n + u_{i+1}^n)/h² + ε(1 − u_i^{n−1})u_i^n  (1 ≤ i ≤ N, n ≥ 1),
u_0^n = u_{N+1}^n = 0  (n ≥ 1),
u_i^0 = a(x_i)  (0 ≤ i ≤ N + 1).    (5.6)

They are, respectively, expressed as

(I + λA)u^(n) = u^(n−1) + ετD(u^(n−1))u^(n−1)  (n ≥ 1),  u^(0) = a    (5.5)

and

(I + λA)u^(n) = u^(n−1) + ετD(u^(n−1))u^(n)  (n ≥ 1),  u^(0) = a,    (5.6)

where D(v) = diag(1 − v_1, 1 − v_2, …, 1 − v_N) for v = (v_1, …, v_N)^T ∈ R^N.
Theorem 5.6. Assume that

τ < 1/ε.    (5.7)

Then, the solutions u^(n) = (u_i^n) of both (5.5) and (5.6) satisfy

0 ≤ u^(n) ≤ q  (n ≥ 0).

Proof. As before, we set H_λ = I + λA and recall that ∥H_λ^{-1}∥_∞ ≤ 1 and H_λ^{-1} > O for any λ > 0.

First, let u^(n) = (u_i^n) be the solution of (5.5) and assume that 0 ≤ u^(n−1) ≤ q. Then, u^(n) ≥ 0 is obvious. Setting v^(n) = q − u^(n), we have

D(u^(n−1))u^(n−1) = D(v^(n−1))v^(n−1)

and

H_λ q − q = (λ, 0, …, 0, λ)^T ≥ 0.

Therefore,

H_λ v^(n) = H_λ q − H_λ u^(n)
  = H_λ q − u^(n−1) − ετD(u^(n−1))u^(n−1)
  = H_λ q − q + v^(n−1) − ετD(v^(n−1))v^(n−1)
  ≥ [I − τεD(v^(n−1))]v^(n−1) =: W v^(n−1).

Since, in view of (5.7), the ith diagonal entry of W is estimated as

1 − τε(1 − v_i^{n−1}) = 1 − τε + τε v_i^{n−1} ≥ 1 − τε > 0,

we have W ≥ O. Thus, H_λ v^(n) ≥ 0, and applying H_λ^{-1} ≥ O we obtain v^(n) ≥ 0 and, hence, u^(n) ≤ q.

Next, let u^(n) = (u_i^n) be the solution of (5.6) and assume that 0 ≤ u^(n−1) ≤ q. The scheme (5.6) is rewritten as

V u^(n) = u^(n−1)  (n ≥ 1),  u^(0) = a,  where V := I + λA − τεD(u^(n−1)).

We observe the following:

• The ith diagonal entry of V is estimated as 1 + 2λ − τε(1 − u_i^{n−1}) ≥ 1 + 2λ − τε > 1 − τε > 0, while all off-diagonal entries are non-positive.

• Writing V = (v_{ij}), we have, for 2 ≤ i ≤ N − 1, Σ_{j=1}^{N} v_{ij} = 1 + 2λ − τε(1 − u_i^{n−1}) − 2λ = 1 − τε(1 − u_i^{n−1}) ≥ 1 − τε > 0 and, for i = 1, N, Σ_{j=1}^{N} v_{ij} = 1 + 2λ − τε(1 − u_i^{n−1}) − λ > 0.

This indicates that V is strictly diagonally dominant with positive diagonal and non-positive off-diagonal entries. From these observations, we can get V^{-1} > O in the same manner as the proof of Lemma 3.2. Hence, we immediately deduce u^(n) = V^{-1}u^(n−1) ≥ 0.

Setting v^(n) = q − u^(n), we note that, entrywise, (D(u^(n−1))u^(n))_i = (1 − u_i^{n−1})u_i^n = v_i^{n−1}(1 − v_i^n), so that

H_λ v^(n) = H_λ q − u^(n−1) − ετD(u^(n−1))u^(n)
  = (H_λ q − q) + (1 − ετ)v^(n−1) + ετ diag(v^(n−1))v^(n).

Thus, v^(n) satisfies

[H_λ − ετ diag(v^(n−1))]v^(n) ≥ (1 − ετ)v^(n−1) ≥ 0.

The matrix V′ := H_λ − ετ diag(v^(n−1)) has positive diagonal entries (≥ 1 + 2λ − ετ), non-positive off-diagonal entries, and positive row sums (≥ 1 − ετ > 0 by (5.7)); hence V′^{-1} > O as verified just above, and we conclude v^(n) ≥ 0 and, hence, u^(n) ≤ q. This completes the proof.
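Theorem 5.6 can be illustrated with a pure-Python sketch of the semi-implicit scheme (5.5) (our own, with a(x) = sin(πx) as a sample initial value): with λ well above the explicit stability limit but τ < 1/ε, the solution remains in [0, 1].

```python
import math

def thomas(a, b, rhs):
    # solve the tridiagonal system with constant diagonal b and off-diagonals a
    N = len(rhs)
    cp, dp = [0.0] * N, [0.0] * N
    cp[0], dp[0] = a / b, rhs[0] / b
    for i in range(1, N):
        m = b - a * cp[i - 1]
        cp[i] = a / m
        dp[i] = (rhs[i] - a * dp[i - 1]) / m
    x = [0.0] * N
    x[-1] = dp[-1]
    for i in range(N - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# semi-implicit scheme (5.5): diffusion implicit, reaction explicit;
# only tau < 1/eps is needed, lambda may be large
N, k, eps = 50, 0.1, 10.0
h = 1.0 / (N + 1)
lam = 5.0                        # far beyond the explicit limit rho <= 1
tau = lam * h * h / k
assert tau < 1.0 / eps           # condition (5.7)
u = [math.sin(math.pi * (i + 1) * h) for i in range(N)]
for _ in range(500):
    rhs = [u[i] + tau * eps * (1.0 - u[i]) * u[i] for i in range(N)]
    u = thomas(-lam, 1.0 + 2.0 * lam, rhs)   # solve (I + lam*A) u_new = rhs
print(min(u), max(u))  # stays within [0, 1], as Theorem 5.6 asserts
```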
5.4 An example: Gray-Scott model
As an example of nonlinear diffusion equations, we offer the Gray-Scott model:

u_t = k_u u_xx + u²v − (β + γ)u  (0 < x < L, t > 0),
v_t = k_v v_xx − u²v + β(1 − v)  (0 < x < L, t > 0),
u_x(0, t) = u_x(L, t) = v_x(0, t) = v_x(L, t) = 0  (t > 0),
u(x, 0) = u_0(x),  v(x, 0) = v_0(x)  (0 ≤ x ≤ L),    (5.8)

where L, k_u, k_v, β, γ are positive constants. This is a simple mathematical model that describes a certain auto-catalytic reaction phenomenon. Here, the functions u = u(x, t) and v = v(x, t) represent the concentrations of two chemical substances. The diffusion coefficients are denoted by the positive constants k_u and k_v. The rate of supply of a chemical substance and the removal of an intermediate product in the reaction are expressed by β and γ. For a more concrete explanation, see

[26] M. Mimura (ed.): Pattern Formation and Dynamics (Mathematics of Nonlinear and Non-equilibrium Phenomena 4), University of Tokyo Press, 2006 (in Japanese).
The explicit scheme for (5.8) is

(u_i^{n+1} − u_i^n)/τ = k_u (u_{i+1}^n − 2u_i^n + u_{i−1}^n)/h² + (u_i^n)² v_i^n − (β + γ)u_i^n,
(v_i^{n+1} − v_i^n)/τ = k_v (v_{i+1}^n − 2v_i^n + v_{i−1}^n)/h² − (u_i^n)² v_i^n + β(1 − v_i^n)
  (1 ≤ i ≤ N, n ≥ 0),
u_0^{n+1} = u_1^{n+1},  u_{N+1}^{n+1} = u_N^{n+1},  v_0^{n+1} = v_1^{n+1},  v_{N+1}^{n+1} = v_N^{n+1}  (n ≥ 0),
u_i^0 = u_0(x_i),  v_i^0 = v_0(x_i)  (1 ≤ i ≤ N),    (5.9)

where

• h = L/N (N ∈ N), τ > 0;
• Q_h^τ = {(x_i, t_n) | x_i = (i − 1/2)h, t_n = nτ (0 ≤ i ≤ N + 1, n ≥ 0)};
• u_i^n ≈ u(x_i, t_n), v_i^n ≈ v(x_i, t_n).
Some patterns created by this system are displayed in Fig. 5.7, where we suppose 0 ≤ t ≤ 1000 and 0 ≤ x ≤ L = 0.5.
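A minimal pure-Python sketch of one explicit step of (5.9) follows. The initial data below (v ≡ 1 with a square bump of u in the middle) is our own choice for illustration, since the notes do not specify u_0, v_0; the parameters follow panel (e) of Fig. 5.7.

```python
def gray_scott_step(u, v, ku, kv, beta, gamma, h, tau):
    # one explicit step of (5.9); the reflecting boundary u_0 = u_1,
    # u_{N+1} = u_N is realized by clamping the neighbour index
    N = len(u)
    lu, lv = ku * tau / (h * h), kv * tau / (h * h)
    un, vn = [0.0] * N, [0.0] * N
    for i in range(N):
        im, ip = max(i - 1, 0), min(i + 1, N - 1)
        r = u[i] * u[i] * v[i]                       # autocatalytic term u^2 v
        un[i] = u[i] + lu * (u[im] - 2.0 * u[i] + u[ip]) + tau * (r - (beta + gamma) * u[i])
        vn[i] = v[i] + lv * (v[im] - 2.0 * v[i] + v[ip]) + tau * (-r + beta * (1.0 - v[i]))
    return un, vn

N, L = 128, 0.5
ku, kv = 1e-5, 2e-5
beta, gamma = 0.0192, 0.0448          # parameters of panel (e) in Fig. 5.7
h = L / N
tau = (1.0 / 6.0) * h * h / ku        # lambda = ku*tau/h^2 = 1/6, as in Fig. 5.7
u = [0.5 if 54 <= i < 74 else 0.0 for i in range(N)]  # hypothetical bump
v = [1.0] * N
for _ in range(400):
    u, v = gray_scott_step(u, v, ku, kv, beta, gamma, h, tau)
print(max(u), max(v))
```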
(a) (β, γ) = (0.1504, 0.1400)  (b) (β, γ) = (0.1504, 0.0392)  (c) (β, γ) = (0.1504, 0.0308)
(d) (β, γ) = (0.1504, 0.0056)  (e) (β, γ) = (0.0192, 0.0448)  (f) (β, γ) = (0.0096, 0.0308)

Figure 5.7: Solutions u_i^n of (5.9); k_u = 10^{-5}, k_v = 2k_u, N = 128, λ = 1/6, L = 0.5, T = 500.
6 Complement for FDM
6.1 Non-homogeneous Dirichlet boundary condition
So far, we have studied only the homogeneous Dirichlet boundary condition

u(0, t) = 0,  u(1, t) = 0.

Non-homogeneous cases are treated similarly. As an example, we consider

u_t = k u_xx + f(x, t)  (0 < x < 1, t > 0),
u(0, t) = b_0(t),  u(1, t) = b_1(t)  (t > 0),
u(x, 0) = a(x)  (0 ≤ x ≤ 1),    (6.1)
where b0(t) and b1(t) are given continuous function of t ≥ 0. We use the samenotation of Sections 2 and 3.First, we consider the explicit scheme. That is,
\[
\frac{u_i^{n+1} - u_i^n}{\tau} = k\,\frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{h^2} + f(x_i, t_n) \quad (1 \le i \le N,\ n \ge 0).
\]
For $i = 1$, we have
\[
\frac{u_1^{n+1} - u_1^n}{\tau} = k\,\frac{-2u_1^n + u_2^n}{h^2} + f(x_1, t_n) + \frac{k\,b_0(t_n)}{h^2} \quad (n \ge 0).
\]
Consequently, we derive
\[
u^{(n+1)} = K_\lambda u^{(n)} + \tau f^{(n)} + \lambda b^{(n)} \quad (n \ge 0),
\]
where
\[
b^{(n)} = \bigl( b_0(t_n),\ 0,\ \ldots,\ 0,\ b_1(t_n) \bigr)^T \in \mathbb{R}^N.
\]
In a similar manner, we derive the implicit θ scheme as
\[
H_{\theta\lambda} u^{(n)} = K_{(1-\theta)\lambda} u^{(n-1)} + \tau f^{(n-1+\theta)} + \lambda b^{(n-1+\theta)} \quad (n \ge 1),
\]
where
\[
b^{(n-1+\theta)} = (1-\theta) b^{(n-1)} + \theta b^{(n)}.
\]
Since the error vector $e^{(n)}$ satisfies (4.2), we obtain exactly the same error estimates as in the previous section for non-homogeneous problems.
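A minimal sketch of the explicit update $u^{(n+1)} = K_\lambda u^{(n)} + \tau f^{(n)} + \lambda b^{(n)}$ (the notes use Scilab; this Python translation and its names are mine). The boundary values $b_0, b_1$ simply play the role of the values at $x_0 = 0$ and $x_{N+1} = 1$ in the difference stencil.

```python
def explicit_step_dirichlet(u, f, b0, b1, k, h, tau):
    """One explicit step for (6.1): u holds u_1..u_N at time t_n, f holds f(x_i, t_n);
    b0, b1 are the boundary values b_0(t_n), b_1(t_n), entering through lam*b^{(n)}."""
    N = len(u)
    lam = k * tau / h**2
    ue = [b0] + list(u) + [b1]   # boundary values act as the end points of the stencil
    return [ue[i] + lam*(ue[i-1] - 2*ue[i] + ue[i+1]) + tau*f[i-1]
            for i in range(1, N + 1)]
```

As a sanity check, the steady state $u(x) = x$ with $b_0 = 0$, $b_1 = 1$ and $f \equiv 0$ is reproduced exactly up to roundoff.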
6.2 Neumann boundary condition
We move on to the initial-boundary value problem with the Neumann boundary condition:
\[
u_t = ku_{xx} + f(x,t) \quad (0 < x < 1,\ t > 0),
\]
\[
u_x(0,t) = u_x(1,t) = 0 \quad (t > 0), \tag{6.2}
\]
\[
u(x,0) = a(x) \quad (0 \le x \le 1).
\]
We introduce:
• $h = 1/N$ with $0 < N \in \mathbb{Z}$;
• $\tau > 0$;
• $\overline{Q}_{\tau h} = \{(x_i, t_n) \mid x_i = (i - \tfrac{1}{2})h,\ t_n = n\tau\ (0 \le i \le N+1,\ n \ge 0)\}$;
• $u_i^n \approx u(x_i, t_n)$.
The explicit scheme reads as
\[
\frac{u_i^{n+1} - u_i^n}{\tau} = k\,\frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{h^2} + f(x_i, t_n) \quad (1 \le i \le N,\ n \ge 0).
\]
However, $u_0^n$ and $u_{N+1}^n$ are not defined at this stage. We employ the Neumann boundary condition to treat those values. Thus,
\[
u_x(0,t) = 0 \ \Rightarrow\ \frac{u(x_1,t) - u(x_0,t)}{h} \approx 0, \qquad
u_x(1,t) = 0 \ \Rightarrow\ \frac{u(x_{N+1},t) - u(x_N,t)}{h} \approx 0.
\]
From these, we have
\[
\frac{u_1^n - u_0^n}{h} = 0 \ \Leftrightarrow\ u_0^n = u_1^n, \qquad
\frac{u_{N+1}^n - u_N^n}{h} = 0 \ \Leftrightarrow\ u_{N+1}^n = u_N^n.
\]
The resulting scheme now reads as
\[
\frac{u_i^{n+1} - u_i^n}{\tau} = k\,\frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{h^2} + f(x_i, t_n) \quad (1 \le i \le N,\ n \ge 0),
\]
\[
u_0^n = u_1^n, \quad u_{N+1}^n = u_N^n \quad (n \ge 1),
\]
\[
u_i^0 = a(x_i) \quad (0 \le i \le N+1).
\]
As before we set $\lambda = k\tau/h^2$. The first and second equalities are written as
\[
\frac{1}{\tau}\left( u^{(n+1)} - u^{(n)} \right) = -\frac{k}{h^2} B u^{(n)} + f^{(n)},
\]
where
\[
B = \begin{pmatrix} 1 & -1 & & & 0\\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1\\ 0 & & & -1 & 1 \end{pmatrix}, \qquad
u^{(n)} = \begin{pmatrix} u_1^n\\ \vdots\\ u_N^n \end{pmatrix}, \qquad
f^{(n)} = \begin{pmatrix} f(x_1,t_n)\\ \vdots\\ f(x_N,t_n) \end{pmatrix}.
\]
Hence, the explicit scheme is expressed as
\[
u^{(n)} = \underbrace{(I - \lambda B)}_{=L_\lambda} u^{(n-1)} + \tau f^{(n-1)} \quad (n \ge 1), \qquad u^{(0)} = a.
\]
On the other hand, the simple implicit scheme is
\[
\frac{u_i^n - u_i^{n-1}}{\tau} = k\,\frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{h^2} + f(x_i, t_n) \quad (1 \le i \le N,\ n \ge 1),
\]
\[
u_0^n = u_1^n, \quad u_{N+1}^n = u_N^n \quad (n \ge 1),
\]
\[
u_i^0 = a(x_i) \quad (0 \le i \le N+1),
\]
and its matrix representation is
\[
\underbrace{(I + \lambda B)}_{=M_\lambda} u^{(n)} = u^{(n-1)} + \tau f^{(n)} \quad (n \ge 1), \qquad u^{(0)} = a.
\]
Moreover, the implicit θ scheme reads as
\[
M_{\theta\lambda} u^{(n)} = L_{\theta'\lambda} u^{(n-1)} + \tau f^{(n-1+\theta)} \quad (n \ge 1), \qquad u^{(0)} = a, \tag{6.3}
\]
where $0 \le \theta \le 1$, $\theta' = 1 - \theta$ and
\[
f^{(n-1+\theta)} = (1-\theta) f^{(n-1)} + \theta f^{(n)} = (f_i^{n-1+\theta}).
\]
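One step of the θ scheme (6.3) can be sketched as follows; the tridiagonal system with $M_{\theta\lambda} = I + \theta\lambda B$ is solved here by the Thomas algorithm, a standard choice (the notes do not prescribe a solver, and the function name is mine). $N \ge 2$ is assumed.

```python
def theta_step_neumann(u, f, lam, theta, tau):
    """One step of (6.3): (I + theta*lam*B) u^n = (I - (1-theta)*lam*B) u^{n-1} + tau f^n,
    where B is the Neumann matrix with rows (1,-1), (-1,2,-1), ..., (-1,1)."""
    N = len(u)
    diagB = [1.0] + [2.0]*(N-2) + [1.0]
    # right-hand side: (I - (1-theta)*lam*B) u^{n-1} + tau f^n
    rhs = []
    for i in range(N):
        Bu = diagB[i]*u[i]
        if i > 0:
            Bu -= u[i-1]
        if i < N-1:
            Bu -= u[i+1]
        rhs.append(u[i] - (1-theta)*lam*Bu + tau*f[i])
    # Thomas algorithm for M = I + theta*lam*B (sub/super-diagonal entries are -s)
    s = theta*lam
    d = [1 + s*diagB[i] for i in range(N)]
    for i in range(1, N):
        w = -s/d[i-1]
        d[i] -= w*(-s)
        rhs[i] -= w*rhs[i-1]
    x = [0.0]*N
    x[-1] = rhs[-1]/d[-1]
    for i in range(N-2, -1, -1):
        x[i] = (rhs[i] + s*x[i+1])/d[i]
    return x
```

Since $Bq = 0$ for the constant vector $q = (1,\ldots,1)^T$, a constant state with $f \equiv 0$ must be reproduced exactly, which is a convenient check.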
Theorem 6.1.
Assume that
\[
0 \le \theta \le 1, \qquad 1 - 2(1-\theta)\lambda \ge 0. \tag{6.4}
\]
Then, for any $n \ge 1$, there exists a unique solution $u^{(n)}$ of (6.3) and it satisfies
\[
(\ell^\infty \text{ stability}) \qquad \|u^{(n)}\|_\infty \le \|a\|_\infty + \tau\sum_{l=1}^n \|f^{(l-1+\theta)}\|_\infty.
\]
Moreover, if $f(x,t) \ge 0$ for $x \in (0,1)$ and $t \ge 0$, then $u^{(n)}$ satisfies the following properties:
\[
(\text{positivity}) \qquad \theta \ne 0,\ \ 1 - 2(1-\theta)\lambda > 0,\ \ a \ge 0,\ a \not\equiv 0 \ \Rightarrow\ u^{(n)} > 0 \ \text{ for } n \ge 1;
\]
\[
(\text{nonnegativity}) \qquad \theta = 0,\ \ a \ge 0 \ \Rightarrow\ u^{(n)} \ge 0 \ \text{ for } n \ge 1.
\]
Proof. We first note that $L_{\theta'\lambda} \ge O$ under the assumption (6.4).

Nonnegativity. When $\theta = 0$, we have $M_{\theta\lambda} = I$ and, hence, $u^{(1)} = L_{\theta'\lambda} a \ge 0$ for $a \ge 0$.

Positivity. Assume that $\theta \ne 0$. If $1 - 2(1-\theta)\lambda > 0$, then the diagonal entries of $L_{\theta'\lambda}$ are positive. Hence, $L_{\theta'\lambda} a \ge 0,\ \not\equiv 0$ for $a \ge 0,\ \not\equiv 0$. To show $M_{\theta\lambda}^{-1} > O$, we follow the method of the proof of Lemma 3.2. The matrix $M_{\theta\lambda}$ is represented as $M_s = I + sB = D(I - G)$, where $s = \theta\lambda$ and
\[
D = \operatorname{diag}(1+s,\ 1+2s,\ \ldots,\ 1+2s,\ 1+s), \qquad
G = \begin{pmatrix} 0 & \mu_1 & & & 0\\ \mu_2 & 0 & \mu_2 & & \\ & \ddots & \ddots & \ddots & \\ & & \mu_2 & 0 & \mu_2\\ 0 & & & \mu_1 & 0 \end{pmatrix}, \qquad
\mu_m = \frac{s}{1+ms}.
\]
Since $\|G\|_\infty = \max\left\{ \dfrac{s}{1+s},\ \dfrac{2s}{1+2s} \right\} = \dfrac{2s}{1+2s} < 1$, we can apply Lemma 3.3 to obtain that $I - G$ is non-singular. Hence, $M_s = D(I-G)$ is also non-singular and $M_s^{-1} = (I-G)^{-1}D^{-1} = \sum_{l=0}^{\infty} G^l D^{-1} \ge O$. On the other hand, we can verify that $M_s^{-1} > O$ in exactly the same manner as in the proof of Lemma 3.2.
$\ell^\infty$ stability. Let $n \ge 1$ be fixed. We set $q = (1,\ldots,1)^T \in \mathbb{R}^N$,
\[
\alpha = \max\Bigl\{ 0,\ \max_{1\le i\le N} u_i^{n-1} \Bigr\}, \qquad
\beta = \max\Bigl\{ 0,\ \max_{1\le i\le N} f_i^{n-1+\theta} \Bigr\}.
\]
We will use the following facts:
• $u^{(n-1)} \le \alpha q$ and $f^{(n-1+\theta)} \le \beta q$;
• $Bq = 0$;
• $L_{s'} v \ge L_{s'} v'$ for $v \ge v'$, because $L_{s'} \ge O$.
We can calculate as
\[
L_{s'} u^{(n-1)} \le L_{s'}(\alpha q) = \alpha[I - (1-\theta)\lambda B]q = \alpha(I + \theta\lambda B)q - \alpha\lambda Bq = M_s(\alpha q),
\]
and
\[
f^{(n-1+\theta)} \le \beta q = \beta q - M_s(\beta q) + M_s(\beta q) = \beta[I - (I + \lambda\theta B)]q + M_s(\beta q) = -\beta\lambda\theta Bq + M_s(\beta q) = M_s(\beta q).
\]
These inequalities, together with the equation (6.3), give
\[
M_s(\alpha q + \tau\beta q - u^{(n)}) \ge L_{s'} u^{(n-1)} + \tau f^{(n-1+\theta)} - M_s u^{(n)} = 0.
\]
Therefore, by virtue of $M_s^{-1} > O$, we obtain
\[
\alpha q + \tau\beta q - u^{(n)} \ge 0.
\]
In a similar manner, we deduce
\[
M_s(u^{(n)} - \alpha' q - \tau\beta' q) \ge 0
\]
and, hence, $u^{(n)} - \alpha' q - \tau\beta' q \ge 0$, where
\[
\alpha' = \min\Bigl\{ 0,\ \min_{1\le i\le N} u_i^{n-1} \Bigr\}, \qquad
\beta' = \min\Bigl\{ 0,\ \min_{1\le i\le N} f_i^{n-1+\theta} \Bigr\}.
\]
Combining these results, we obtain
\[
\alpha' q + \tau\beta' q \le u^{(n)} \le \alpha q + \tau\beta q.
\]
Thus, we get $\|u^{(n)}\|_\infty \le \|u^{(n-1)}\|_\infty + \tau\|f^{(n-1+\theta)}\|_\infty$, which implies the desired inequality.
Remark. In the proof of Theorem 6.1, since
\[
\|(I-G)^{-1}\|_\infty \le \frac{1}{1 - \|G\|_\infty} = \frac{1}{1 - \frac{2s}{1+2s}} = 1 + 2s,
\]
we have
\[
\|M_s^{-1}\|_\infty \le \|(I-G)^{-1}\|_\infty \|D^{-1}\|_\infty \le (1+2s)\cdot\frac{1}{1+s} > 1.
\]
Thus, we cannot directly obtain the $\ell^\infty$ stability, although $\|L_{\theta'\lambda}\|_\infty = 1$.
Next, we proceed to a convergence analysis. We extend the solution $u(x,t)$ of (6.2) to a function $\overline{u}(x,t)$ defined in $[-1,2]\times[0,\infty)$ by reflection. Thus, we introduce
\[
\overline{u}(x,t) = \begin{cases} u(-x,t) & (-1 \le x \le 0,\ t \ge 0),\\ u(x,t) & (0 \le x \le 1,\ t \ge 0),\\ u(2-x,t) & (1 \le x \le 2,\ t \ge 0). \end{cases}
\]
Moreover, the extension $\overline{f}(x,t)$ of $f(x,t)$ is introduced similarly. Obviously, $\overline{u}(x,t)$ and $\overline{f}(x,t)$ are continuous. Furthermore, $\overline{u}(x,t)$ is a $C^1$ function of $x$ because of the Neumann boundary condition $u_x(0,t) = u_x(1,t) = 0$. Combining these facts and using the heat equation $u_t = ku_{xx} + f(x,t)$, we can deduce
that $\overline{u}(x,t)$ is a $C^2$ function of $x$. This indicates that $\overline{u}(x,t)$ is a solution of the heat equation $\overline{u}_t = k\overline{u}_{xx} + \overline{f}(x,t)$ in $(-1,2)\times(0,\infty)$. In a similar manner, if we assume that
\[
\frac{\partial^m u}{\partial x^m} \in C([0,1]\times[0,T]) \quad (0 \le m \le 4), \qquad
\frac{\partial^l u}{\partial t^l} \in C([0,1]\times[0,T]) \quad
\begin{cases} (0 \le l \le 2) & \text{if } \theta \ne 1/2,\\ (0 \le l \le 3) & \text{if } \theta = 1/2, \end{cases} \tag{4.3}
\]
we obtain
\[
\frac{\partial^m \overline{u}}{\partial x^m} \in C([-1,2]\times[0,T]) \quad (0 \le m \le 4), \qquad
\frac{\partial^l \overline{u}}{\partial t^l} \in C([-1,2]\times[0,T]) \quad
\begin{cases} (0 \le l \le 2) & \text{if } \theta \ne 1/2,\\ (0 \le l \le 3) & \text{if } \theta = 1/2, \end{cases} \tag{6.5}
\]
where $T > 0$ denotes a positive constant and $l, m$ nonnegative integers. We set
\[
M^l_m(T) = \max_{t\in[0,T]}\ \max_{x\in[-1,2]} \left| \frac{\partial^m}{\partial x^m}\frac{\partial^l}{\partial t^l}\,\overline{u}(x,t) \right|.
\]
We consider the error
\[
e_i^n = \overline{u}(x_i, t_n) - u_i^n \quad (0 \le i \le N+1,\ n \ge 1)
\]
and the error vector
\[
e^{(n)} = (e_i^n) \in \mathbb{R}^N, \tag{6.6}
\]
where $u^{(n)} = (u_i^n) \in \mathbb{R}^N$ denotes the solution of (6.3) with $u_0^n = u_1^n$ and $u_{N+1}^n = u_N^n$. Note that $e_i^n = u(x_i, t_n) - u_i^n$ for $1 \le i \le N$, so that $e^{(n)}$ actually represents the error of the finite difference scheme. Then, since
\[
\overline{u}(x_0, t_n) = \overline{u}(x_1, t_n), \qquad \overline{u}(x_{N+1}, t_n) = \overline{u}(x_N, t_n),
\]
the error vector satisfies
\[
M_{\theta\lambda} e^{(n)} = L_{\theta'\lambda} e^{(n-1)} + \tau r^{(n)} \quad (n \ge 1), \qquad e^{(0)} = 0,
\]
where $r^{(n)} = (r_i^n)$ is defined as
\[
r_i^n = D_\tau \overline{u}(x_i, t_n) - (1-\theta)k\Delta_h \overline{u}(x_i, t_{n-1}) - \theta k\Delta_h \overline{u}(x_i, t_n)
\]
with the notation of §4. Hence, in exactly the same manner as in the proof of Theorem 4.3, we can prove the following result.
Theorem 6.2.
Let $T > 0$ be fixed. Let $u(x,t)$ and $u^{(n)}$ be solutions of (6.2) and (6.3), respectively. Assume that (6.4) and (4.3) are satisfied. Then, the error vector (6.6) admits the error estimate
\[
\max_{0\le t_n\le T} \|e^{(n)}\|_\infty \le
\begin{cases} C_{T,\theta}(\tau + h^2) & (\theta \ne 1/2),\\ C_{T,1/2}(\tau^2 + h^2) & (\theta = 1/2), \end{cases}
\]
where
\[
C_{T,\theta} =
\begin{cases}
T\cdot\max\left\{ \dfrac{k}{12} M^0_4(T),\ \dfrac{1}{2} M^2_0(T) \right\} & (\theta \ne 1/2),\\[2mm]
T\cdot\max\left\{ \dfrac{k}{12} M^0_4(T),\ \dfrac{5}{12} M^3_0(T) \right\} & (\theta = 1/2).
\end{cases}
\]
Remark (conservation of the total heat). Suppose $f \equiv 0$. The solution $u(x,t)$ of (6.2) then satisfies the conservation law
\[
J(t) = J(0) = \int_0^1 c\,a(x)\,dx \quad (t \ge 0),
\]
where
\[
J(t) = \int_0^1 c\,u(x,t)\,dx \qquad (c:\ \text{the heat capacity})
\]
denotes the total heat. Note that we are assuming
\[
u \in C([0,1]\times[0,\infty)), \qquad u_x \in C([0,1]\times(0,\infty)), \qquad u_t, u_{xx} \in C((0,1)\times(0,\infty)).
\]
Then, we can calculate as
\[
\frac{d}{dt} J(t) = \int_0^1 c\,u_t(x,t)\,dx = \int_0^1 ck\,u_{xx}(x,t)\,dx = \bigl[ ck\,u_x(x,t) \bigr]_{x=0}^{x=1} = 0.
\]
Now, we introduce a discrete total heat for the solution $u^{(n)} = (u_i^n)$ of (6.3):
\[
J_n = \sum_{i=1}^N c\,u_i^n\,h.
\]
Then, we have the conservation of the discrete total heat
\[
J_n = \sum_{i=1}^N c\,a(x_i)\,h.
\]
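The discrete conservation is easy to check numerically; the sketch below (explicit case $\theta = 0$, $f \equiv 0$; the names are mine) advances the scheme and evaluates $J_n$. Conservation follows from the fact that the columns of $B$ sum to zero, so each step changes the sum of the entries only by roundoff.

```python
def discrete_heat(u, h, c=1.0):
    """J_n = sum_i c u_i h, the discrete total heat."""
    return c * h * sum(u)

def explicit_neumann_step(u, lam):
    """Explicit step u^{n+1} = (I - lam*B) u^n, ghost values u_0 = u_1, u_{N+1} = u_N."""
    N = len(u)
    ue = [u[0]] + list(u) + [u[-1]]
    return [ue[i] + lam*(ue[i-1] - 2*ue[i] + ue[i+1]) for i in range(1, N + 1)]
```

Running a few steps from an arbitrary initial vector should leave $J_n$ constant up to roundoff (for $\lambda \le 1/2$ the iteration is also stable).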
6.3 ℓ∞ analysis revisited
Remark. After having finished writing this paragraph, I noticed that the main result, Theorem 6.3 below, is exactly the same as Theorem 10.2 of Thomée [39, page 118]. Moreover, the proof is almost the same as Thomée's. However, my stability results, Theorem 6.5 and Lemma 6.6, are not described in [39]; those results may be outside the scope of his interest. I shall pose an open question about the stability.
In this (rather long) paragraph, we revisit the stability and convergence analysis of the FDM for the one-dimensional heat equation
\[
u_t = \kappa u_{xx} + f(x,t) \quad (0 < x < d,\ t > 0),
\]
\[
u(0,t) = 0, \quad u(d,t) = 0 \quad (t > 0), \tag{6.7}
\]
\[
u(x,0) = a(x) \quad (0 \le x \le d),
\]
where $\kappa, d$ are positive constants and $f(x,t), a(x)$ are prescribed continuous functions with $a(0) = a(d) = 0$.

In order to state the finite difference scheme, we introduce $0 < N \in \mathbb{Z}$ and $\tau > 0$, and set $h = d(1+N)^{-1}$, $x_i = ih$ $(0 \le i \le N+1)$, and $t_n = n\tau$ $(n \ge 0)$. We denote by $u_i^n$ the finite-difference approximation of $u(x_i, t_n)$ to be computed.

Let $0 \le \theta \le 1$. Then, we consider the standard implicit θ scheme
\[
D_\tau u_i^n = \kappa\,\Delta_h u_i^{n-1+\theta} + f_i^{n-1+\theta} \quad (1 \le i \le N,\ n \ge 1),
\]
\[
u_0^n = u_{N+1}^n = 0 \quad (n \ge 1), \tag{6.8}
\]
\[
u_i^0 = a(x_i) \quad (0 \le i \le N+1),
\]
where
\[
u_i^{n-1+\theta} = (1-\theta)u_i^{n-1} + \theta u_i^n, \qquad
f_i^{n-1+\theta} = (1-\theta)f(x_i, t_{n-1}) + \theta f(x_i, t_n),
\]
\[
D_\tau u_i^n = \frac{u_i^n - u_i^{n-1}}{\tau}, \qquad
\Delta_h u_i^n = \frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{h^2}.
\]
We introduce $A, H, K \in \mathbb{R}^{N\times N}$ as
\[
A = \begin{pmatrix} 2 & -1 & & 0\\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots\\ 0 & & -1 & 2 \end{pmatrix}, \qquad
H = I + \theta\lambda A, \qquad K = I - (1-\theta)\lambda A,
\]
where $I \in \mathbb{R}^{N\times N}$ denotes the identity matrix and $\lambda = \kappa\tau h^{-2}$. As is well known, $A$ and $H$ are positive-definite real symmetric matrices, so that they are non-singular. Hence, we can set
\[
G = H^{-1}K.
\]
Introducing
\[
u^n = \begin{pmatrix} u_1^n\\ \vdots\\ u_N^n \end{pmatrix}, \qquad
a = \begin{pmatrix} a(x_1)\\ \vdots\\ a(x_N) \end{pmatrix}, \qquad
f^{n-1+\theta} = \begin{pmatrix} f_1^{n-1+\theta}\\ \vdots\\ f_N^{n-1+\theta} \end{pmatrix},
\]
we can rewrite (6.8) equivalently as
\[
u^n = Gu^{n-1} + \tau H^{-1} f^{n-1+\theta} \quad (n \ge 1), \qquad u^0 = a. \tag{6.9}
\]
As usual, we write
\[
\|v\|_\infty = \max_{1\le i\le N} |v_i|, \qquad
\|v\|_2 = \Bigl( \sum_{j=1}^N |v_j|^2 h \Bigr)^{1/2} \qquad (v = (v_i) \in \mathbb{R}^N).
\]
We use
\[
(v, w)_2 = \sum_{i=1}^N v_i w_i h \qquad (v = (v_i),\ w = (w_i) \in \mathbb{R}^N).
\]
Obviously, $\|v\|_2^2 = (v, v)_2$. For any norm $\|\cdot\|$ in $\mathbb{R}^N$, we use the same symbol to express the corresponding matrix norm:
\[
\|B\| = \max_{v \in \mathbb{R}^N,\ v \ne 0} \frac{\|Bv\|}{\|v\|} \qquad (B \in \mathbb{R}^{N\times N}).
\]
The error vector $e^n$ is defined as
\[
e^n = \begin{pmatrix} e_1^n\\ \vdots\\ e_N^n \end{pmatrix} = \begin{pmatrix} u(x_1,t_n) - u_1^n\\ \vdots\\ u(x_N,t_n) - u_N^n \end{pmatrix}.
\]
The main purpose of this paragraph is to prove the following result.
Theorem 6.3.
Let $T > 0$ and set $Q = [0,d]\times[0,T]$.
(i) Let $1/2 \le \theta \le 1$. Suppose that the solution $u$ of (6.7) satisfies
\[
u,\ \frac{\partial^l}{\partial t^l}u,\ \frac{\partial^m}{\partial x^m}u \in C(Q) \quad (1 \le l \le 2,\ 1 \le m \le 4). \tag{6.10}
\]
Then, there exists a positive constant $C_1$ which depends only on $\kappa$ and $d$ such that
\[
\max_{0\le t_n\le T} \|e^n\|_\infty \le C_1(\tau + h^2)\sqrt{T}\bigl( \|u_{tt}\|_{L^\infty(Q)} + \|u_{xxxx}\|_{L^\infty(Q)} \bigr). \tag{6.11}
\]
(ii) Let $\theta = 1/2$. Suppose that $u$ satisfies
\[
u,\ \frac{\partial^l}{\partial t^l}u,\ \frac{\partial^m}{\partial x^m}u \in C(Q) \quad (1 \le l \le 3,\ 1 \le m \le 4). \tag{6.12}
\]
Then, there exists a positive constant $C_2$ which depends only on $\kappa$ and $d$ such that
\[
\max_{0\le t_n\le T} \|e^n\|_\infty \le C_2(\tau^2 + h^2)\sqrt{T}\bigl( \|u_{ttt}\|_{L^\infty(Q)} + \|u_{xxxx}\|_{L^\infty(Q)} \bigr). \tag{6.13}
\]
(iii) Let $0 \le \theta < 1/2$. Take $0 < \delta < 1$ and assume
\[
2\lambda(1 - 2\theta) \le 1 - \delta. \tag{6.14}
\]
Suppose that (6.10) is satisfied. Then, there exists a positive constant $C_3$ which depends only on $\kappa$, $d$ and $\delta$ such that
\[
\max_{0\le t_n\le T} \|e^n\|_\infty \le C_3(\tau + h^2)\sqrt{T}\bigl( \|u_{tt}\|_{L^\infty(Q)} + \|u_{xxxx}\|_{L^\infty(Q)} \bigr). \tag{6.15}
\]
Remark. It is well known (cf. Theorem 4.3) that, if $\theta = 1/2$, we have
\[
\max_{0\le t_n\le T} \|e^n\|_\infty = O(\tau^2 + h^2) \quad (h, \tau \to 0)
\]
provided that $\lambda \le 1$. Theorem 6.3 claims that the condition $\lambda \le 1$ is not necessary to prove the convergence (with optimal order) of the finite difference scheme.
The matrix
\[
A_h = \frac{1}{h^2} A \in \mathbb{R}^{N\times N}
\]
is a positive-definite symmetric matrix, so that its square root $A_h^{1/2}$ is defined in a natural way. We introduce
\[
|||v||| = \|A_h^{1/2} v\|_2 \quad (v \in \mathbb{R}^N). \tag{6.16}
\]
Lemma 6.4.
(i) $\|v\|_\infty \le \sqrt{d}\,|||v|||$ for $v \in \mathbb{R}^N$.
(ii) $|||v||| \le 2h^{-1}\|v\|_2$ for $v \in \mathbb{R}^N$.

Proof. (i) Let $v = (v_1, \ldots, v_N)^T \in \mathbb{R}^N$ and set $v_0 = v_{N+1} = 0$. Then,
\[
\|A_h^{1/2} v\|_2^2 = (A_h v, v)_2 = \sum_{i=1}^N \frac{-v_{i-1} + 2v_i - v_{i+1}}{h^2}\,v_i h
= \sum_{i=1}^{N+1} \Bigl( \frac{v_i - v_{i-1}}{h} \Bigr)^2 h. \tag{6.17}
\]
Now let $1 \le i \le N$. We can write $v_i = \sum_{j=1}^{i} (v_j - v_{j-1})$. Hence,
\[
|v_i| \le \sum_{j=1}^{N+1} \Bigl| \frac{v_j - v_{j-1}}{h} \Bigr| h^{1/2} h^{1/2}
\le \Bigl( \sum_{j=1}^{N+1} \Bigl| \frac{v_j - v_{j-1}}{h} \Bigr|^2 h \Bigr)^{1/2} \Bigl( \sum_{j=1}^{N+1} h \Bigr)^{1/2}
= \sqrt{d}\,\|A_h^{1/2} v\|_2.
\]
(ii) Again by using (6.17),
\[
|||v|||^2 = \|A_h^{1/2} v\|_2^2 = \sum_{i=1}^{N+1} \Bigl( \frac{v_i - v_{i-1}}{h} \Bigr)^2 h
\le 2\sum_{i=1}^{N+1} \frac{v_i^2 + v_{i-1}^2}{h^2}\,h \le \frac{4}{h^2}\|v\|_2^2.
\]
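Both inequalities of Lemma 6.4 are easy to test numerically via the identity (6.17), which gives $|||v|||$ without forming $A_h^{1/2}$ explicitly (an illustrative sketch; the helper name is mine).

```python
def norms(v, d):
    """Return (||v||_inf, |||v|||, ||v||_2) for v in R^N with h = d/(N+1).

    |||v|||^2 is evaluated by identity (6.17):
    sum over i of ((v_i - v_{i-1})/h)^2 * h, with v_0 = v_{N+1} = 0.
    """
    N = len(v)
    h = d/(N + 1)
    ve = [0.0] + list(v) + [0.0]
    tri = (sum(((ve[i] - ve[i-1])/h)**2 * h for i in range(1, N + 2)))**0.5
    l2 = (sum(x*x*h for x in v))**0.5
    return max(abs(x) for x in v), tri, l2
```

For any test vector, $\|v\|_\infty \le \sqrt{d}\,|||v|||$ and $|||v||| \le 2h^{-1}\|v\|_2$ should hold up to roundoff.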
We introduce the truncation-error vectors $r^n = (r_i^n), R^n = (R_i^n) \in \mathbb{R}^N$ as
\[
r_i^n = (1-\theta)[D_\tau u(x_i, t_n) - u_t(x_i, t_{n-1})] + \theta[D_\tau u(x_i, t_n) - u_t(x_i, t_n)],
\]
\[
R_i^n = (1-\theta)\kappa[u_{xx}(x_i, t_{n-1}) - \Delta_h u(x_i, t_{n-1})] + \theta\kappa[u_{xx}(x_i, t_n) - \Delta_h u(x_i, t_n)].
\]
We know
\[
\|r^n\|_2 \le \sqrt{d}\,\|r^n\|_\infty \le
\begin{cases} C_4\tau\|u_{tt}\|_{L^\infty(Q)},\\ C_5\tau^2\|u_{ttt}\|_{L^\infty(Q)} & \text{if } \theta = 1/2, \end{cases} \tag{6.18}
\]
\[
\|R^n\|_2 \le \sqrt{d}\,\|R^n\|_\infty \le C_6\kappa h^2\|u_{xxxx}\|_{L^\infty(Q)}, \tag{6.19}
\]
where $C_4, C_5, C_6$ are positive constants depending only on $d$. The error $e^n$ solves
\[
\frac{e^n - e^{n-1}}{\tau} + \kappa A_h e^{n-1+\theta} = r^n + R^n \quad (n \ge 1), \qquad e^0 = 0,
\]
where $e^{n-1+\theta} = (1-\theta)e^{n-1} + \theta e^n$. From this, we have the identity
\[
\Bigl( \frac{e^n - e^{n-1}}{\tau}, v \Bigr)_2 + \kappa(A_h e^{n-1+\theta}, v)_2 = (r^n + R^n, v)_2 \tag{6.20}
\]
for any $v \in \mathbb{R}^N$. Moreover, the identity
\[
e^{n-1+\theta} = \tau\Bigl( \theta - \frac12 \Bigr)\frac{e^n - e^{n-1}}{\tau} + \frac{e^n + e^{n-1}}{2} \tag{6.21}
\]
is of use.
We can now give the proof.

Proof of Theorem 6.3. (i) and (ii). Let $1/2 \le \theta \le 1$. In view of (6.21), we can calculate as
\[
\Bigl( \frac{e^n - e^{n-1}}{\tau}, A_h e^{n-1+\theta} \Bigr)_2
= \tau\Bigl( \theta - \frac12 \Bigr)\Bigl( \frac{e^n - e^{n-1}}{\tau}, A_h \frac{e^n - e^{n-1}}{\tau} \Bigr)_2
+ \Bigl( \frac{e^n - e^{n-1}}{\tau}, A_h \frac{e^n + e^{n-1}}{2} \Bigr)_2
\]
\[
= \tau\Bigl( \theta - \frac12 \Bigr)\left\| \frac{A_h^{1/2}(e^n - e^{n-1})}{\tau} \right\|_2^2
+ \frac{1}{2\tau}\bigl( \|A_h^{1/2} e^n\|_2^2 - \|A_h^{1/2} e^{n-1}\|_2^2 \bigr)
\ge \frac{1}{2\tau}\bigl( |||e^n|||^2 - |||e^{n-1}|||^2 \bigr). \tag{6.22}
\]
Substituting $v = A_h e^{n-1+\theta}$ into (6.20), using (6.22) and Young's inequality, we have
\[
\frac{1}{2\tau}\bigl( |||e^n|||^2 - |||e^{n-1}|||^2 \bigr) + \kappa\|A_h e^{n-1+\theta}\|_2^2
\le (r^n + R^n, A_h e^{n-1+\theta})_2
\le \frac{1}{2\kappa}\|r^n + R^n\|_2^2 + \frac{\kappa}{2}\|A_h e^{n-1+\theta}\|_2^2.
\]
Hence,
\[
|||e^n|||^2 \le |||e^{n-1}|||^2 + \frac{\tau}{\kappa}\|r^n + R^n\|_2^2
\le \frac{\tau}{\kappa}\sum_{k=1}^n \|r^k + R^k\|_2^2
\le \frac{2\tau}{\kappa}\sum_{k=1}^n \bigl( \|r^k\|_2^2 + \|R^k\|_2^2 \bigr).
\]
This, together with Lemma 6.4 (i), (6.18) and (6.19), gives the error estimates (6.11) and (6.13).
(iii) Let $0 \le \theta < 1/2$ and assume (6.14) with some $0 < \delta < 1$. We have
\[
\Bigl( \frac{e^n - e^{n-1}}{\tau}, A_h e^{n-\frac12} \Bigr)_2
= \frac{1}{2\tau}\bigl\| A_h^{1/2} e^n \bigr\|_2^2 - \frac{1}{2\tau}\bigl\| A_h^{1/2} e^{n-1} \bigr\|_2^2.
\]
Again, using (6.21),
\[
(A_h e^{n-1+\theta}, A_h e^{n-\frac12})_2
= \|A_h e^{n-\frac12}\|_2^2 + \tau\Bigl( \theta - \frac12 \Bigr)\Bigl( A_h \frac{e^n - e^{n-1}}{\tau}, A_h e^{n-\frac12} \Bigr)_2
= \|A_h e^{n-\frac12}\|_2^2 + \frac12\Bigl( \theta - \frac12 \Bigr)\bigl( \|A_h e^n\|_2^2 - \|A_h e^{n-1}\|_2^2 \bigr).
\]
Substituting $v = A_h e^{n-\frac12}$ into (6.20) and using those identities, we deduce
\[
\frac{1}{2\tau}\bigl( |||e^n|||^2 - |||e^{n-1}|||^2 \bigr) + \kappa\|A_h e^{n-\frac12}\|_2^2
+ \frac{\kappa}{2}\Bigl( \theta - \frac12 \Bigr)\bigl( \|A_h e^n\|_2^2 - \|A_h e^{n-1}\|_2^2 \bigr)
\le \frac{1}{2\kappa}\|r^n + R^n\|_2^2 + \frac{\kappa}{2}\|A_h e^{n-\frac12}\|_2^2.
\]
Hence, setting
\[
\varepsilon_n = |||e^n|||^2 - \kappa\tau\Bigl( \frac12 - \theta \Bigr)\|A_h e^n\|_2^2,
\]
we obtain
\[
\varepsilon_n \le \varepsilon_{n-1} + \frac{\tau}{\kappa}\|r^n + R^n\|_2^2
\le \frac{2\tau}{\kappa}\sum_{k=1}^n \bigl( \|r^k\|_2^2 + \|R^k\|_2^2 \bigr).
\]
In view of Lemma 6.4 (ii) and the condition (6.14), we estimate as
\[
|||e^n|||^2 - \kappa\tau\Bigl( \frac12 - \theta \Bigr)\|A_h e^n\|_2^2
\ge |||e^n|||^2 - \kappa\tau\,\frac{1-2\theta}{2}\cdot\frac{4}{h^2}\|A_h^{1/2} e^n\|_2^2
\ge [1 - 2\lambda(1 - 2\theta)]\,|||e^n|||^2 \ge \delta|||e^n|||^2.
\]
Therefore,
\[
|||e^n|||^2 \le \frac{2\tau}{\delta\kappa}\sum_{k=1}^n \bigl( \|r^k\|_2^2 + \|R^k\|_2^2 \bigr).
\]
Summing up this, Lemma 6.4 (i), (6.18) and (6.19), we complete the proof of Theorem 6.3 (iii).
Instead of stability in the standard norms $\|\cdot\|_\infty$ and $\|\cdot\|_2$, we have the following result.

Theorem 6.5.
For the solution $u^n$ of (6.8), we have the stability inequality
\[
|||u^n||| \le |||a||| + \tau\sum_{k=1}^n |||f^{k-1+\theta}|||,
\]
provided that
\[
2\lambda(1 - 2\theta) \le 1 \tag{6.23}
\]
holds if $0 \le \theta < 1/2$.
This is a direct consequence of the following lemma.

Lemma 6.6.
(i) $|||G||| \le 1$, provided that (6.23) holds if $0 \le \theta < 1/2$.
(ii) $|||H^{-1}||| \le 1$.

Proof. (i) It is well known (or it is a readily obtainable consequence of the Spectral Mapping Theorem) that $\|G\|_2 \le 1$. Hence, noting $A_h^{1/2}G = GA_h^{1/2}$, we obtain $|||G||| \le 1$.
(ii) This follows from $\|H^{-1}\|_2 \le 1$ and $A_h^{1/2}H^{-1} = H^{-1}A_h^{1/2}$.
Remark. According to the expression (6.9) and Lemma 6.6, we obtain
\[
|||e^n||| \le \tau\sum_{k=1}^n \bigl( |||r^k||| + |||R^k||| \bigr) \quad (n \ge 1). \tag{6.24}
\]
Therefore, we can deduce an error estimate if $|||r^n|||$ and $|||R^n|||$ are estimated in terms of $h$ and $\tau$. Consequently, we would succeed in avoiding the constant $\delta$ in the condition (6.14). As a matter of fact, the estimation of $|||r^n|||$ is not difficult. In view of (6.17), we can derive
\[
|||r^n||| \le
\begin{cases} C_7\tau\|u_{ttx}\|_{L^\infty(Q)},\\ C_8\tau^2\|u_{tttx}\|_{L^\infty(Q)} & (\text{if } \theta = 1/2) \end{cases}
\]
with positive constants $C_7$ and $C_8$ depending only on $d$. This is because $r^n$ naturally satisfies the "boundary condition" $r_0^n = r_{N+1}^n = 0$. (We mean that, if we extend the original definition of $r_i^n$ for $1 \le i \le N$ to $i = 0, N+1$, we have $r_0^n = r_{N+1}^n = 0$.) However, we are able to derive only
\[
|||R^n||| \le C_9 h^{3/2}\|u_{xxxxx}\|_{L^\infty(Q)}
\]
because of the lack of the "boundary condition" $R_0^n = R_{N+1}^n = 0$. Furthermore, we have observed by numerical experiments that there are examples for which $|||R^n|||$ is exactly of order $h^{3/2}$ as $h \to 0$. Therefore, it is impossible to deduce optimal-order error estimates by means of the stability result (6.24). Instead, if we consider the periodic boundary condition, we obtain the optimal-order error estimates by (6.24).
Remark. For a $C^1$ function $w$ defined in $[0,d]$ with $w(0) = w(d) = 0$, we have
\[
|||w||| \le \|w_x\|_{L^2(0,d)},
\]
where $w = (w(x_j)) \in \mathbb{R}^N$. Therefore, under appropriate assumptions on $a$ and $f$, we obtain, as a corollary of Theorem 6.5 and Lemma 6.4 (i),
\[
\|u^n\|_\infty \le \sqrt{d}\,\|a_x\|_{L^2(0,d)} + \sqrt{d}\,\tau\sum_{k=1}^n \|f_x(\cdot, t_k)\|_{L^2(0,d)},
\]
provided that (6.23) holds if $0 \le \theta < 1/2$.
In order to avoid unessential difficulties, we consider only the case $\theta = 1/2$. Suppose now $f \equiv 0$. Then, (6.9) implies
\[
u^n = G^n a \quad (n \ge 1).
\]
In view of Theorem 6.3, there exist $\tau_0, h_0 > 0$ such that
\[
\|G^n a\|_\infty \le 1 + \|u\|_{L^\infty(Q)} \quad (0 \le t_n \le T)
\]
for $\tau \le \tau_0$ and $h \le h_0$. Therefore, by virtue of the uniform boundedness principle, there exists a positive constant $M_T$ depending on $T$ such that
\[
\|G^n\|_\infty \le M_T \quad (0 \le t_n \le T). \tag{6.25}
\]
(See also [32, §3.5].) In [21, §7], it is proved that
\[
\|G\|_\infty \le 1 \ \Leftrightarrow\ \lambda \le \frac32. \tag{6.26}
\]
Thus,
\[
\lambda \le \frac32 \ \Rightarrow\ \|G^n\|_\infty \le 1 \quad (\forall n \ge 1). \tag{6.27}
\]
On the other hand, as stated in Lemma 6.6, we always have
\[
|||G||| \le 1 \quad \text{and} \quad |||G^n||| \le 1 \quad (\forall n). \tag{6.28}
\]
Therefore, it would be interesting to investigate
\[
\|G^n\|_\infty \quad \text{or} \quad \|G\|_\infty \quad \text{for } \lambda > \frac32.
\]
Remark. For general θ, we have (cf. [21, §7])
\[
\|G\|_\infty \le 1 \ \Leftrightarrow\ \lambda \le \frac{2-\theta}{4(1-\theta)}. \tag{6.29}
\]
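The quantity $\|G\|_\infty$ can be computed column by column, since the $j$-th column of $G$ is $H^{-1}(Ke_j)$; the following sketch (not from the notes; names are mine) is one way to explore the thresholds (6.26) and (6.29) numerically.

```python
def G_inf_norm(N, lam, theta):
    """||G||_inf for G = (I + theta*lam*A)^{-1} (I - (1-theta)*lam*A),
    A = tridiag(-1, 2, -1) in R^{N x N}, computed column by column."""
    def K_col(j):  # j-th column of K = I - (1-theta)*lam*A
        col = [0.0]*N
        col[j] = 1 - 2*(1-theta)*lam
        if j > 0:
            col[j-1] = (1-theta)*lam
        if j < N-1:
            col[j+1] = (1-theta)*lam
        return col
    def solve_H(rhs):  # Thomas algorithm for H = I + theta*lam*A
        s = theta*lam
        d = [1 + 2*s]*N
        r = list(rhs)
        for i in range(1, N):
            w = -s/d[i-1]
            d[i] -= w*(-s)
            r[i] -= w*r[i-1]
        x = [0.0]*N
        x[-1] = r[-1]/d[-1]
        for i in range(N-2, -1, -1):
            x[i] = (r[i] + s*x[i+1])/d[i]
        return x
    cols = [solve_H(K_col(j)) for j in range(N)]
    return max(sum(abs(cols[j][i]) for j in range(N)) for i in range(N))
```

For $\theta = 0$ one has $G = I - \lambda A$, whose row sums are easy to check by hand; by (6.27), for $\theta = 1/2$ the norm stays $\le 1$ whenever $\lambda \le 3/2$.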
Problems and further readings for Chapter I
Problems
Problem 1. Prove that the eigenvalues $\lambda_j$ and eigenvectors $u_j$ of the tri-diagonal matrix
\[
\begin{pmatrix} b & a & & 0\\ a & b & a & \\ & \ddots & \ddots & \ddots\\ 0 & & a & b \end{pmatrix} \in \mathbb{R}^{N\times N} \tag{6.30}
\]
are given as
\[
\lambda_j = b + 2a\cos\frac{j\pi}{N+1} \quad \text{and} \quad
u_j = \Bigl( \sin\frac{j\pi}{N+1},\ \cdots,\ \sin\frac{Nj\pi}{N+1} \Bigr) \quad (1 \le j \le N).
\]
Problem 2. Consider the initial-boundary value problem for the heat equation
\[
u_t = u_{xx} \quad (0 < x < 1,\ t > 0),
\]
\[
u(0,t) = u(1,t) = 0 \quad (t > 0), \tag{6.31}
\]
\[
u(x,0) = a(x) \quad (0 \le x \le 1),
\]
where $a(x)$ is a sufficiently smooth function satisfying $a(0) = a(1) = 0$. (Consequently, the solution $u(x,t)$ becomes a sufficiently smooth function of $x$ and $t$.) Define the set of grid points $\{(x_i, t_n) = (ih, n\tau) \mid 0 \le i \le N+1,\ n \ge 0\}$ with $h = 1/(N+1)$ and $\tau > 0$. Put $U_i^n = u(x_i, t_n)$. Then, determine the value of $\lambda = \tau/h^2$ which gives an estimate of the form
\[
\left| \frac{U_i^{n+1} - U_i^n}{\tau} - \frac{U_{i-1}^n - 2U_i^n + U_{i+1}^n}{h^2} \right| = O(\tau^2 + h^4)
\]
as $h, \tau \to 0$.
Problem 3. Consider the explicit scheme for the initial-boundary value problem (6.31). Suppose the initial values are given as $u_i^0 = (-1)^i\sin(i\pi h)$. Prove that the solution of the finite difference scheme is expressed as
\[
u_i^n = (1 - 2\lambda - 2\lambda\cos(\pi h))^n\,u_i^0.
\]
Moreover, prove that, if $t_n = 1$ and $\lambda > 1/2$, then $\|u^{(n)}\|_\infty \to \infty$ $(h \to 0)$.
Problem 4. Prove (4.7).
Problem 5. In the proof of Lemma 4.6, prove that $F_{\theta,\lambda}(A)$ is a symmetric matrix.
Problem 6. In the proof of Lemma 4.6, prove that $\{F_{\theta,\lambda}(\mu^{\langle m\rangle})\}_{m=1}^N$ are all the eigenvalues of $F_{\theta,\lambda}(A)$.
Problem 7. Recall that
\[
\mu_m = (m\pi)^2, \qquad \varphi_m(x) = \sqrt{2}\sin(m\pi x) \quad (m = 1, 2, \ldots)
\]
are the eigenpairs of the eigenvalue problem
\[
-\varphi''(x) = \mu\varphi(x) \quad (0 < x < 1), \qquad \varphi(0) = \varphi(1) = 0, \qquad \varphi \not\equiv 0.
\]
We introduce a finite difference approximation:
\[
-\frac{\varphi_{i-1} - 2\varphi_i + \varphi_{i+1}}{h^2} = \mu\varphi_i \quad (1 \le i \le N), \qquad
\varphi_0 = \varphi_{N+1} = 0, \quad \varphi = (\varphi_i) \ne 0. \tag{6.32}
\]
Prove that
\[
\mu^{\langle m\rangle} = \frac{4}{h^2}\sin^2\Bigl( \frac{m\pi}{2(N+1)} \Bigr), \qquad
\varphi^{\langle m\rangle} = (\varphi_i^{\langle m\rangle}) = \bigl( \sqrt{2}\sin(m\pi x_i) \bigr) \quad (1 \le m \le N)
\]
are the eigenpairs of (6.32). Moreover, prove that
\[
\mu^{\langle m\rangle} \to \mu_m \quad (h \to 0), \qquad \varphi_m(x_i) = \varphi_i^{\langle m\rangle} \quad (0 \le i \le N+1).
\]
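Again, the claimed formulas are easy to check numerically before proving them (a sketch; names are mine).

```python
import math

def discrete_eigenpair(m, N):
    """Eigenpair (mu^<m>, phi^<m>) of problem (6.32), with h = 1/(N+1), x_i = i h."""
    h = 1.0/(N + 1)
    mu = 4.0/h**2 * math.sin(m*math.pi/(2*(N + 1)))**2
    phi = [math.sqrt(2)*math.sin(m*math.pi*i*h) for i in range(1, N + 1)]
    return mu, phi

def residual(m, N):
    """max_i | -(phi_{i-1} - 2 phi_i + phi_{i+1})/h^2 - mu phi_i |, phi_0 = phi_{N+1} = 0."""
    h = 1.0/(N + 1)
    mu, phi = discrete_eigenpair(m, N)
    p = [0.0] + phi + [0.0]
    return max(abs(-(p[i-1] - 2*p[i] + p[i+1])/h**2 - mu*p[i]) for i in range(1, N + 1))
```

One can also observe $\mu^{\langle m\rangle} \to (m\pi)^2$ numerically by increasing $N$.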
Problem 8 (machine assignment). Compute numerically (by the finite difference method) the Gray-Scott model
\[
u_t = k_u u_{xx} + u^2 v - (\beta+\gamma)u \quad (0 < x < 1,\ 0 < t < T),
\]
\[
v_t = k_v v_{xx} - u^2 v + \beta(1 - v) \quad (0 < x < 1,\ 0 < t < T),
\]
\[
u_x(0,t) = u_x(1,t) = 0 \quad (0 < t < T),
\]
\[
v_x(0,t) = v_x(1,t) = 0 \quad (0 < t < T),
\]
\[
u(x,0) = u_0(x), \quad v(x,0) = v_0(x) \quad (0 \le x \le 1)
\]
for several $(\beta, \gamma)$'s. Report non-trivial shapes of solutions. (Recommendation: $k_u = 10^{-5}$, $k_v = 2k_u$, $T = 500$, $0.009 \le \beta \le 0.151$, and $0.03 \le \gamma \le 0.141$.)
Further readings
In this chapter, I explained only the FDM for the one space-dimensional heat equation. However, the FDM can be applied to higher space-dimensional partial differential equations of parabolic, elliptic and hyperbolic types. For those topics, the following standard textbooks are useful:
[25] K. W. Morton and D. F. Mayers: Numerical Solution of Par-tial Differential Equations (2nd ed.), Cambridge University Press,2005.
[37] G. D. Smith: Numerical Solution of Partial Differential Equa-tions, Oxford University Press, 1965. (藤川洋一郎訳:コンピュータによる偏微分方程式の解法,新訂版,サイエンス社,1996年)
[24] S. Larsson and V. Thomee: Partial Differential Equations withNumerical Methods, Springer, 2009.
[22] 菊地文雄,齊藤宣一:数値解析の原理–現象の解明をめざして(岩波数学叢書),岩波書店,2016.
[38] 田端正久:偏微分方程式の数値解析,岩波書店, 2010.
Therein, important topics including the von Neumann condition and Fourier analysis are also described. Moreover, the implementation of the FDM is described in
[36] 齊藤宣一:線形・非線形拡散方程式の差分解法と解の可視化,講義ノート (http://www.infsup.jp/saito/ns/notes.html),2011年.
An interesting example of nonlinear problems and its analysis is given in
[27]三村昌泰:微分方程式と差分方程式—数値解は信用できるか?—,「数値解析と非線形現象 (山口昌哉編)」の第 3章,日本評論社,1996年 (オリジナルは 1981年).
The following old articles are also worth reading:
[32] R. D. Richtmyer and K. W. Morton: Difference Methods forInitial-Value Problems, Interscience Publishers, 1967.
[10] 藤田宏:初期値問題の差分法による近似解法 (B: 微分方程式の近似解法),自然科学者のための数学概論 [応用編],寺沢寛一 (編),岩波書店,1960年.
[11] 藤田宏:境界値問題の差分法による近似解法 (B: 微分方程式の近似解法),自然科学者のための数学概論 [応用編],寺沢寛一 (編),岩波書店,1960年.
II. Finite element method for the Poisson equation
7 Variational approach for the Poisson equation
7.1 Dirichlet’s principle
As a typical model of elliptic partial differential equations, we consider the boundary value problem for the Poisson equation in a two-dimensional bounded domain $\Omega \subset \mathbb{R}^2$ with boundary $\partial\Omega$:
\[
-\Delta u = -\Bigl( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} \Bigr)u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega, \tag{7.1}
\]
where $u = u(x) = u(x_1, x_2)$ denotes the unknown function to be found and $f = f(x)$ is a given continuous function in $\Omega$.

As will be stated in Theorem 7.2, Problem (7.1) and the following minimization problem (variational problem) are closely related:
\[
\text{Find } u \in U \text{ s.t. } J(u) = \min_{v\in U} J(v), \tag{7.2}
\]
where
\[
J(v) = \frac12\int_\Omega |\nabla v|^2\,dx - \int_\Omega fv\,dx
= \frac12\int_\Omega \left[ \Bigl( \frac{\partial v}{\partial x_1} \Bigr)^2 + \Bigl( \frac{\partial v}{\partial x_2} \Bigr)^2 \right] dx - \int_\Omega fv\,dx,
\]
\[
U = \{ v \in C^1(\overline{\Omega}) \mid v|_{\partial\Omega} = \text{"the boundary value of } v \text{ on } \Gamma\text{"} = 0 \}.
\]
Moreover, the following problem is called the Euler-Lagrange equation for (7.2):
\[
\text{Find } u \in U \text{ s.t. } \int_\Omega \nabla u\cdot\nabla\phi\,dx = \int_\Omega f\phi\,dx \quad (\forall\phi \in U). \tag{7.3}
\]
Theorem 7.1.
A function $u$ is a solution of (7.2) if and only if it is a solution of (7.3).

Proof. Let $u$ be a solution of (7.2), and let $\phi \in U$ be arbitrary. Consider the real-valued function $j(t) = J(u + t\phi)$ for $t \in \mathbb{R}$. Then,
\[
j(t) = J(u+t\phi) = \frac12\int_\Omega \nabla(u+t\phi)\cdot\nabla(u+t\phi)\,dx - \int_\Omega f(u+t\phi)\,dx
\]
\[
= \frac12\int_\Omega |\nabla u|^2\,dx - \int_\Omega fu\,dx
+ \frac{t^2}{2}\int_\Omega |\nabla\phi|^2\,dx
+ \frac{t}{2}\int_\Omega (\nabla u\cdot\nabla\phi + \nabla\phi\cdot\nabla u)\,dx
- t\int_\Omega f\phi\,dx.
\]
The function $j(t)$ achieves its minimum at $t = 0$. Hence, $j'(0) = 0$. This implies (7.3).
Conversely, let $u$ be a solution of (7.3). Let $v \in U$ be arbitrary, and set $\phi = v - u \in U$. Then,
\[
J(v) - J(u) = \frac12\int_\Omega \nabla(u+\phi)\cdot\nabla(u+\phi)\,dx - \int_\Omega f(u+\phi)\,dx - J(u)
\]
\[
= \frac12\int_\Omega |\nabla\phi|^2\,dx + \frac12\int_\Omega (\nabla u\cdot\nabla\phi + \nabla\phi\cdot\nabla u)\,dx - \int_\Omega f\phi\,dx
\]
\[
= \frac12\int_\Omega |\nabla\phi|^2\,dx + \left[ \int_\Omega \nabla u\cdot\nabla\phi\,dx - \int_\Omega f\phi\,dx \right]
\ge \frac12\int_\Omega |\nabla\phi|^2\,dx \ge 0.
\]
Theorem 7.2 (Dirichlet's principle).
Suppose that $\Omega$ is a bounded Lipschitz domain (see §9 for the definition).
(i) If $u \in C^2(\overline{\Omega})$ is a solution of (7.1), then it solves (7.2).
(ii) If $u \in C^2(\overline{\Omega})$ is a solution of (7.2), then it solves (7.1).
Proof. (i) Let $u \in C^2(\overline{\Omega})$ be a solution of (7.1). Multiplying both sides of $-\Delta u = f$ by $\phi \in U$ and integrating over $\Omega$, we have, by integration by parts,
\[
\int_\Omega f\phi\,dx = \int_\Omega (-\Delta u)\phi\,dx
= \int_\Omega \nabla u\cdot\nabla\phi\,dx - \int_\Gamma [(\nabla u)\cdot n]\phi\,dS
= \int_\Omega \nabla u\cdot\nabla\phi\,dx,
\]
where $n = n(s)$ $(s \in \Gamma)$ denotes the unit outward normal vector to $\Gamma$ and $dS$ the line element of $\Gamma$. Since $\phi$ is arbitrary, $u$ solves (7.3). Hence, in view of Theorem 7.1, it also solves (7.2).

(ii) We deduce
\[
\int_\Omega (f + \Delta u)\phi\,dx = 0 \quad (\forall\phi \in U)
\]
in a similar manner as in (i). We argue by contradiction to show that $w = f + \Delta u \equiv 0$. Assume that there exists $z \in \Omega$ such that $w(z) > 0$. Then, by the continuity of $w$, there exists $\delta$ satisfying
\[
w(x) > 0 \quad (x \in B(z;\delta) = \{x \in \mathbb{R}^2 \mid |z - x| < \delta\} \subset \Omega).
\]
At this stage, we choose $\phi \in U$ such that
\[
\phi \ge 0 \ (x \in \Omega), \qquad \phi(x) > 0 \ (x \in B(z;\delta/2)), \qquad \operatorname{supp}\phi \subset B(z;\delta).
\]
Then, we have
\[
\int_\Omega w\phi\,dx = \int_{B(z;\delta)} w\phi\,dx \ge \int_{B(z;\delta/2)} w\phi\,dx > 0,
\]
which is a contradiction. Hence, we have $w \le 0$. By considering $-w$, we deduce $w \ge 0$. Therefore, $w \equiv 0$, which completes the proof.
Remark. At this stage, we do not know whether a solution exists or not.
Notation. We say that $v \in C(\overline{\Omega})$ is a piecewise $C^1$ function if and only if there exists a decomposition $\overline{\Omega} = \overline{\Omega}_1 \cup \cdots \cup \overline{\Omega}_N$ with $\Omega_i \cap \Omega_j = \emptyset$ $(i \ne j)$ such that $v$ is of class $C^1$ in each $\overline{\Omega}_i$ $(i = 1, \ldots, N)$. Then, we can take
\[
V = \{ v \in C(\overline{\Omega}) \mid v \text{ is a piecewise } C^1 \text{ function},\ v|_\Gamma = 0 \}
\]
instead of $U$ in Theorems 7.1 and 7.2. Note that $U \subset V$.
7.2 Galerkin's approximation

We move to finite-dimensional approximations of (7.2) and (7.3). To this purpose, we introduce
\[
V_N = \Bigl\{ v_N \in V \ \Bigm|\ v_N(x) = \sum_{i=1}^N c_i\phi_i(x),\ \{c_i\}_{i=1}^N \subset \mathbb{R} \Bigr\} \subset V,
\]
where $\phi_1, \phi_2, \ldots, \phi_N \in V$ with $0 < N \in \mathbb{Z}$.

A finite-dimensional approximation of (7.2) reads as
\[
\text{Find } u_N \in V_N \text{ s.t. } J(u_N) = \min_{v_N\in V_N} J(v_N), \tag{7.4}
\]
which is called the Ritz approximation. On the other hand, a finite-dimensional approximation of (7.3) reads as
\[
\text{Find } u_N \in V_N \text{ s.t. } \int_\Omega \nabla u_N\cdot\nabla v_N\,dx = \int_\Omega fv_N\,dx \quad (\forall v_N \in V_N), \tag{7.5}
\]
which is called the Galerkin approximation.

Both (7.4) and (7.5) can be represented in vector-matrix form as follows:
\[
Au = f, \tag{7.6}
\]
where
\[
A = (a_{ij}) \in \mathbb{R}^{N\times N}, \quad a_{ij} = \int_\Omega \nabla\phi_j\cdot\nabla\phi_i\,dx;
\qquad
f = (f_i) \in \mathbb{R}^N, \quad f_i = \int_\Omega f\phi_i\,dx;
\]
\[
u = (u_i) \in \mathbb{R}^N \quad \text{with} \quad u_N = \sum_{i=1}^N u_i\phi_i.
\]
Remark. (7.4) ⇔ (7.5).
8 Finite element method (FEM)
We give a concrete example of $V_N$ by the finite element method (FEM). We assume that $\Omega$ is a polygonal domain for the sake of simplicity. Below we collect some notions of FEM. In this section, we write $(x,y) = (x_1, x_2)$ to express the generic point in $\mathbb{R}^2$, and $dx$ means $dx = dx_1\,dx_2$.

• $\mathcal{T}$ (= triangulation of $\Omega$) is introduced as follows.

1. $\mathcal{T}$ is a set of closed triangles $T$, and $\overline{\Omega} = \bigcup_{T\in\mathcal{T}} T$.
2. Any two triangles of $\mathcal{T}$ meet only in entire common sides or in vertices.

Pairs of triangles violating condition 2 (for instance, a vertex of one triangle lying in the interior of a side of another) are prohibited.
• The size (mesh, granularity) parameter of $\mathcal{T}$ is defined as
\[
h = \max_{T\in\mathcal{T}} h_T,
\]
where $h_T$ denotes the diameter of the circumscribed circle of $T$. Below, we write $\mathcal{T} = \mathcal{T}_h$.

• $T \in \mathcal{T}_h$ is called an element (要素). A vertex of $T$ is called a node (節点).
• Let $T \in \mathcal{T}_h$ be arbitrary. Suppose that $P_i = (x_i, y_i)$ $(i = 1,2,3)$ are the vertices of $T$ and $|T|$ the area of $T$. We set
\[
\lambda_i(x,y) = \frac{|\triangle PP_jP_k|}{|T|}
= \frac{1}{2|T|}\begin{vmatrix} 1 & 1 & 1\\ x & x_j & x_k\\ y & y_j & y_k \end{vmatrix}
= \frac{1}{2|T|}\bigl[ (x_jy_k - x_ky_j) + (y_j - y_k)x - (x_j - x_k)y \bigr]
\]
for $P = (x,y) \in T$. Hereinafter, we write $(i,j,k) = (1,2,3), (2,3,1)$, and $(3,1,2)$. Obviously, $\lambda_1, \lambda_2, \lambda_3$ are affine functions defined in $T$ and we have
\[
\lambda_i(P_j) = \begin{cases} 1 & (i = j),\\ 0 & (i \ne j). \end{cases} \tag{8.1}
\]
See Fig. 8.5. We call $\{\lambda_i\}_{i=1}^3$ the barycentric coordinates of $T$ and $\lambda_i$ the barycentric coordinate of $T$ associated with $P_i$. We have
\[
\lambda_1(x,y) + \lambda_2(x,y) + \lambda_3(x,y) = 1, \quad (x,y) \in T. \tag{8.2}
\]
Figure 8.1: Triangulation of Ω = (0, 1)× (0, 1) by a uniform division. Numberof nodes: 81 (left), 289 (center) and 1089 (right). Number ofelements: 128 (left), 512 (center), 2048 (right).
Figure 8.2: Triangulations of Ω = (0, 1) × (0, 1) by a non-uniform division(freefem++ http://www.freefem.org). Number of nodes: 95(left), 333 (center) and 1267 (right). Number of elements: 156(left), 600 (center), 2404 (right).
Figure 8.3: Triangulations of a polygonal domain.
Figure 8.4: Triangulations of a domain with the piecewise smooth boundary.
Figure 8.5: The barycentric coordinate $\lambda_i(x,y)$ of $T$ associated with $P_i$.
• $\overline{N} = N + N_B$ is the total number of nodes, where $N$ is the number of nodes located in the interior of $\Omega$ and $N_B$ is the number of nodes located on the boundary $\Gamma$.

• $\{P_i\}_{i=1}^{\overline{N}}$ is the set of all nodes, where $\{P_i\}_{i=1}^{N}$ are the nodes located in $\Omega$ and $\{P_{N+i}\}_{i=1}^{N_B}$ are the nodes located on $\Gamma$.

• Let $\Lambda_i = \{T \in \mathcal{T}_h \mid P_i \in T\}$.
• For $1 \le i \le \overline{N}$, we define $\phi_i = \phi_{h,i}(x,y) \in C^0(\overline{\Omega})$ by setting
\[
\phi_i(x,y) = \begin{cases} \lambda_{T,i}(x,y) & (x,y) \in T,\ T \in \Lambda_i,\\ 0 & \text{otherwise}, \end{cases}
\]
where $\lambda_{T,i}$ denotes the barycentric coordinate of $T$ associated with $P_i$. See Fig. 8.6. From the definition,
\[
\operatorname{supp}\phi_i = \bigcup_{T\in\Lambda_i} T. \tag{8.3}
\]
In view of (8.1) and (8.2), we have
\[
\phi_i(P_j) = \begin{cases} 1 & (i = j),\\ 0 & (i \ne j), \end{cases} \qquad \sum_{i=1}^{\overline{N}} \phi_i = 1. \tag{8.4}
\]
• The space
\[
X_h = \operatorname{span}\{\phi_i\}_{i=1}^{\overline{N}} \tag{8.5}
\]
is called the P1 element on $\mathcal{T}_h$. Each $v_h \in X_h$ is characterized by the following two conditions:
– $v_h$ is a continuous function in $\overline{\Omega}$;
Figure 8.6: (left) Λi. (right) ϕi.
– for each $T \in \mathcal{T}_h$, there exist $\alpha, \beta, \gamma \in \mathbb{R}$ such that $v_h|_T = \alpha + \beta x + \gamma y$.
Below we also use
\[
V_h = \operatorname{span}\{\phi_i\}_{i=1}^{N}, \tag{8.6}
\]
which is also called the P1 element (with the zero Dirichlet boundary condition) on $\mathcal{T}_h$. Each $v_h \in V_h$ is characterized by the following three conditions:
– $v_h$ is a continuous function in $\overline{\Omega}$;
– for each $T \in \mathcal{T}_h$, there exist $\alpha, \beta, \gamma \in \mathbb{R}$ such that $v_h|_T = \alpha + \beta x + \gamma y$;
– $v_h|_{\partial\Omega} = 0$.
Then, $\{\phi_i\}_{i=1}^{\overline{N}}$ is called the standard basis of $X_h$ and $\{\phi_i\}_{i=1}^{N}$ that of $V_h$.
Figure 8.7: (left) An example of vh ∈ Xh. (right) An example of vh ∈ Vh.
At this stage, we can state the finite element approximation of (7.3), which reads as
\[
\text{Find } u_h \in V_h \text{ s.t. } \int_\Omega \nabla u_h\cdot\nabla v_h\,dx = \int_\Omega fv_h\,dx \quad (\forall v_h \in V_h). \tag{8.7}
\]
It is equivalently written as $Au = f$, where
\[
A = (a_{ij}) \in \mathbb{R}^{N\times N}, \quad a_{ij} = \int_\Omega \nabla\phi_j\cdot\nabla\phi_i\,dx;
\qquad
f = (f_i) \in \mathbb{R}^N, \quad f_i = \int_\Omega f\phi_i\,dx;
\]
\[
u = (u_i) \in \mathbb{R}^N, \quad u_i = u_h(P_i).
\]
Remark. Let $T \in \mathcal{T}_h$ have vertices $P_i = (x_i, y_i)$ $(i = 1,2,3)$, and denote the barycentric coordinates of $T$ by $\{\lambda_i\}_{i=1}^3$. We have
\[
\nabla\lambda_i = \frac{1}{2|T|}\begin{pmatrix} y_j - y_k\\ -(x_j - x_k) \end{pmatrix} \quad (\text{a constant vector}).
\]
Consequently,
\[
\nabla\lambda_i\cdot\overrightarrow{P_jP_k} = 0, \qquad
\nabla\lambda_i\cdot\overrightarrow{P_jP_i} = \nabla\lambda_i\cdot\overrightarrow{P_kP_i} = 1.
\]
Moreover, we deduce
\[
|\nabla\lambda_i| = \frac{1}{2|T|}\,\overline{P_jP_k} = \frac{\overline{P_jP_k}}{\kappa_i\cdot\overline{P_jP_k}} = \frac{1}{\kappa_i},
\]
where $\kappa_i$ denotes the perpendicular length from the vertex $P_i$ to the segment $P_jP_k$; see Fig. 8.8.
Figure 8.8: The perpendicular length $\kappa_i$ from the vertex $P_i$ to the segment $P_jP_k$.
We have
\[
\int_T \lambda_i\lambda_j\,dx = \begin{cases} |T|/6 & (i = j),\\ |T|/12 & (i \ne j) \end{cases} \tag{8.8}
\]
and
\[
\int_T \nabla\lambda_i\cdot\nabla\lambda_j\,dx =
\begin{cases}
\dfrac{1}{4|T|}\bigl[ (x_j - x_k)^2 + (y_j - y_k)^2 \bigr] & (i = j),\\[2mm]
\dfrac{1}{4|T|}\bigl[ (x_j - x_k)(x_k - x_i) + (y_j - y_k)(y_k - y_i) \bigr] & (i \ne j).
\end{cases} \tag{8.9}
\]
Remark. It is a hard task to construct a triangulation of a given polygonal domain. Some useful software packages are available, e.g. freefem++ [14].
9 Tools from Functional Analysis
In order to establish a mathematical justification of the FEM, we introduce some concepts from Functional Analysis. In particular, the Sobolev spaces $H^1(\Omega)$ and $H^1_0(\Omega)$ play important roles. We collect some notions below.
9.1 Sobolev spaces
Hereinafter, $\Omega$ is assumed to be a domain (open and connected subset) in $\mathbb{R}^2$. A generic point is denoted by $x = (x_1, x_2)$.
• We shall treat only real-valued functions. The sets of continuous functions $C(\mathbb{R}^2) = C^0(\mathbb{R}^2)$ and $C(\Omega) = C^0(\Omega)$ are well known. For a nonnegative integer $k$, the sets of $C^k$ functions defined in $\mathbb{R}^2$ and $\Omega$ are denoted, respectively, by $C^k(\mathbb{R}^2)$ and $C^k(\Omega)$. The spaces
\[
C_0^k(\mathbb{R}^2) = \{ v \in C^k(\mathbb{R}^2) \mid v \text{ has a compact support} \},
\]
\[
C_0^k(\Omega) = \{ v|_\Omega \mid v \in C_0^k(\mathbb{R}^2),\ \text{the support of } v \text{ is contained in } \Omega \}
\]
play important roles. Therein, the support of $v$ is defined as
\[
\operatorname{supp} v = \overline{\{ x \mid v(x) \ne 0 \}}.
\]
We introduce, for $k \ge 0$,
\[
C^k(\overline{\Omega}) = \{ v|_{\overline{\Omega}} \mid v \in C_0^k(\mathbb{R}^2) \}.
\]
Moreover, as usual, set
\[
C_0^\infty(\mathbb{R}^2) = \bigcap_{k\ge 0} C_0^k(\mathbb{R}^2), \qquad
C^\infty(\overline{\Omega}) = \bigcap_{k\ge 0} C^k(\overline{\Omega}).
\]
• For a continuous function $v$ in $\overline{\Omega}$, we write
\[
\|v\|_\infty = \|v\|_{L^\infty(\Omega)} = \max_{x\in\overline{\Omega}} |v(x)|,
\]
which is called the $L^\infty$ norm or maximum norm of $v$.

• The $L^2$ space over $\Omega$, denoted by $L^2(\Omega)$, is a Hilbert space equipped with the scalar product
\[
(u,v) = (u,v)_\Omega = (u,v)_{L^2(\Omega)} = \int_\Omega u(x)v(x)\,dx.
\]
The induced norm is
\[
\|u\| = \|u\|_\Omega = \|u\|_{L^2(\Omega)} = (u,u)^{1/2} = \sqrt{\int_\Omega |u(x)|^2\,dx}.
\]
• We recall partial derivatives in the $L^2$ sense. Let $v \in L^2(\Omega)$. If there is a function $g \in L^2(\Omega)$ such that
\[
\int_\Omega v\,\frac{\partial\varphi}{\partial x_1}\,dx = -\int_\Omega g\varphi\,dx \quad (\forall\varphi \in C_0^\infty(\Omega)),
\]
then $g$ is called the generalized partial derivative in $L^2(\Omega)$ of $v$ with respect to $x_1$ and is denoted by
\[
g = \frac{\partial v}{\partial x_1},\ \partial_1 v,\ \cdots.
\]
We define successively
\[
\frac{\partial v}{\partial x_2} = \partial_2 v, \quad
\frac{\partial^2 v}{\partial x_1^2} = \partial_1^2 v, \quad
\frac{\partial^2 v}{\partial x_1\partial x_2} = \partial_1\partial_2 v, \ \cdots.
\]
These generalized derivatives are unique. If $v \in L^2(\Omega) \cap C^1(\Omega)$, then the generalized derivative coincides with the usual one. In what follows, we consider only generalized derivatives.
• Now we can introduce
\[
H^1(\Omega) = \{ u \in L^2(\Omega) \mid \exists\,\partial_1 u, \partial_2 u \in L^2(\Omega) \}
\]
together with
\[
(u,v)_{H^1(\Omega)} = (u,v) + \underbrace{(\partial_1 u, \partial_1 v) + (\partial_2 u, \partial_2 v)}_{=(\nabla u,\,\nabla v)}, \qquad
\|u\|_{H^1(\Omega)} = \sqrt{(u,u)_{H^1(\Omega)}}.
\]

• We also introduce
\[
H^2(\Omega) = \{ u \in L^2(\Omega) \mid \partial_1 u, \partial_2 u, \partial_1^2 u, \partial_1\partial_2 u, \partial_2^2 u \in L^2(\Omega) \}
\]
together with
\[
(u,v)_{H^2(\Omega)} = (u,v) + \sum_{i=1,2} (\partial_i u, \partial_i v) + \sum_{i,j=1,2} (\partial_i\partial_j u, \partial_i\partial_j v), \qquad
\|u\|_{H^2(\Omega)} = \sqrt{(u,u)_{H^2(\Omega)}}.
\]
• The space $H^m(\Omega)$, $m = 1, 2$, is a Hilbert space equipped with the scalar product $(\cdot,\cdot)_{H^m(\Omega)}$ and the norm $\|\cdot\|_{H^m(\Omega)}$. The space $H^m(\Omega)$ is called a Sobolev space.

• On the space $H^1(\Omega)$,
\[
\|\nabla u\| = \sqrt{(\nabla u, \nabla u)} = \left[ \int_\Omega \bigl( |\partial_1 u|^2 + |\partial_2 u|^2 \bigr)\,dx \right]^{1/2}
\]
defines a semi-norm, where
\[
(\nabla u, \nabla v) = (\partial_1 u, \partial_1 v) + (\partial_2 u, \partial_2 v).
\]
• We use Schwarz's inequalities:
\[
|(u,v)| \le \|u\|\cdot\|v\| \quad (u,v \in L^2(\Omega)),
\]
\[
|(\nabla u, \nabla v)| \le \|\nabla u\|\cdot\|\nabla v\| \quad (u,v \in H^1(\Omega)),
\]
\[
|(u,v)_{H^1(\Omega)}| \le \|u\|_{H^1(\Omega)}\|v\|_{H^1(\Omega)} \quad (u,v \in H^1(\Omega)).
\]

• Let $H_0^m(\Omega)$ be the closure of $C_0^\infty(\Omega)$ in the norm $\|\cdot\|_{H^m(\Omega)}$. That is,
\[
H_0^m(\Omega) = \{ v \in H^m(\Omega) \mid \exists\{\varphi_n\} \subset C_0^\infty(\Omega) \text{ s.t. } \|v - \varphi_n\|_{H^m(\Omega)} \to 0 \}.
\]
9.2 Lipschitz domain
At this stage, we state the definition of a Lipschitz domain Ω.First, we say that the boundary Γ = ∂Ω is a Lipschitz continuous if and
only if, for every x = (x1, x2) ∈ Γ, there exists a neighborhood U of x in R2
and new orthogonal coordinates y = (y1, y2) such that
(i) U = y ∈ R2y | −ai < yi < ai (i = 1, 2) with some a1, a2 > 0.
(ii) There exists a Lipschitz continuous function ϕ(y1) defined in U1 =−a1 < y1 < a1 that satisfies
|ϕ(y1)| ≤ a2/2 (y1 ∈ U1),
Ω ∩ U = (y1, y2) | y2 < ϕ(y1), y1 ∈ U1,Γ ∩ U = (y1, y2) | y2 = ϕ(y1), y1 ∈ U1.
Then, we say that Ω is a Lipschitz domain, if and only if its boundary Γ = ∂Ωis Lipschitz continuous.Similarly, we define a Ck domain for k ≥ 0. A C0 domain is often called a
domain with the continuous boundary.
Figure 9.1: Lipschitz domain
9.3 Lemmas
Lemmas 9.1 and 9.2 below are valid for a bounded C0 domain Ω.
Lemma 9.1 (Density). $C^\infty(\overline\Omega)$ is dense in $H^m(\Omega)$. That is, for any $u\in H^m(\Omega)$, there exists $\{\varphi_n\}\subset C^\infty(\overline\Omega)$ such that $\|u-\varphi_n\|_{H^m(\Omega)}\to 0$ $(n\to\infty)$.
Lemma 9.2 (Poincaré's inequality). There exists a domain constant $C_P$ such that $\|v\| \le C_P\|\nabla v\|$ for any $v\in H_0^1(\Omega)$.
Here and hereafter, by a domain constant we mean a positive constant depending only on Ω.
Lemmas 9.3–9.5 below hold true for a bounded Lipschitz domain Ω.
Lemma 9.3 (Sobolev's inequality). For every $u\in H^2(\Omega)$, there exists a continuous function $\tilde u$ such that $u = \tilde u$ a.e. in Ω and $\|\tilde u\|_{L^\infty(\Omega)} \le C\|u\|_{H^2(\Omega)}$ with a domain constant $C$. Below, we will identify $u$ with $\tilde u$.
Lemma 9.4 (Trace theorem). The mapping $v\mapsto v|_\Gamma$ of $C^\infty(\overline\Omega)\to C^\infty(\Gamma)$ is extended by continuity to a continuous linear mapping $\gamma : H^1(\Omega)\to L^2(\Gamma)$, which is called the trace operator. That is, we have
$\gamma v = v|_\Gamma \quad (v\in C(\overline\Omega)),$
$\|\gamma v\|_{L^2(\Gamma)} \le C_T\|v\|_{H^1(\Omega)} \quad (v\in H^1(\Omega))$
with a domain constant $C_T$.
Lemma 9.5. The space $H_0^1(\Omega)$ is characterized by
$H_0^1(\Omega) = \{u\in H^1(\Omega) \mid \gamma u = 0\},$
where $\gamma : H^1(\Omega)\to L^2(\Gamma)$ denotes the trace operator described in Lemma 9.4.
We skip the proofs of these lemmas; they can be found in
[29] J. Necas: Direct Methods in the Theory of Elliptic Equations,Springer, 2011.
For the reader’s convenience, the correspondence is given in Tab. 9.1.
Lemma 9.6. Suppose that a bounded Lipschitz domain Ω is decomposed into two disjoint Lipschitz subdomains $\Omega_1$ and $\Omega_2$ by a simple smooth curve $S$; that is, $\overline\Omega = \overline{\Omega_1}\cup S\cup\overline{\Omega_2}$ and $\Omega_1\cap\Omega_2 = \emptyset$. Then $v\in C(\overline\Omega)$, $v_1 = v|_{\Omega_1}\in C^1(\overline{\Omega_1})$ and $v_2 = v|_{\Omega_2}\in C^1(\overline{\Omega_2})$ imply $v\in H^1(\Omega)$.
this note     Necas' book                assumption on Ω
Lemma 9.1     Theorem 3.1 (Chapter 2)    bounded C0
Lemma 9.2     Theorem 1.1 (Chapter 1)    bounded C0
Lemma 9.3     Theorem 3.8 (Chapter 2)    bounded Lipschitz
Lemma 9.4     Theorem 1.2 (Chapter 1)    bounded Lipschitz
Lemma 9.5     Theorem 4.10 (Chapter 2)   bounded Lipschitz

Table 9.1: The correspondence between the lemmas in this note and Necas' book.
Proof. We define $g_j\in L^2(\Omega)$ ($j = 1, 2$) by setting
$g_j = \begin{cases} \partial v_1/\partial x_j \in C(\overline{\Omega_1}) & \text{in } \Omega_1,\\ \partial v_2/\partial x_j \in C(\overline{\Omega_2}) & \text{in } \Omega_2.\end{cases}$
Let $n_k = (n_{1,k}, n_{2,k})$ be the unit normal vector to $S$ outgoing from $\Omega_k$. Obviously, we have $n_1 = -n_2$. For an arbitrary $\varphi\in C_0^\infty(\Omega)$, we have by integration by parts
$\int_\Omega g_j\varphi\,dx = \int_{\Omega_1}\frac{\partial v_1}{\partial x_j}\varphi\,dx + \int_{\Omega_2}\frac{\partial v_2}{\partial x_j}\varphi\,dx = -\int_{\Omega_1} v_1\frac{\partial\varphi}{\partial x_j}\,dx - \int_{\Omega_2} v_2\frac{\partial\varphi}{\partial x_j}\,dx + \int_S v\,(n_{j,1} + n_{j,2})\varphi\,dS = -\int_\Omega v\frac{\partial\varphi}{\partial x_j}\,dx.$
Hence, we obtain $v\in H^1(\Omega)$ and $\nabla v = (g_1, g_2)$.
Lemma 9.7 (Poincaré–Wirtinger inequality). Let $a > 0$.
(i) Let $D$ be the square region $D = (0,a)\times(0,a)$. Then, for any $v\in H^1(D)$,
$\|v\|_{L^2(D)}^2 \le a^2\|\nabla v\|_{L^2(D)}^2 + \frac{1}{a^2}\left(\int_D v(x)\,dx\right)^2. \qquad (9.1)$
(ii) Let $T$ be the right-triangular region $T = \{x_1, x_2 > 0,\ x_2 < a - x_1\}$. Then, for any $v\in H^1(T)$,
$\|v\|_{L^2(T)}^2 \le a^2\|\nabla v\|_{L^2(T)}^2 + \frac{2}{a^2}\left(\int_T v(x)\,dx\right)^2. \qquad (9.2)$
Proof. (i) We follow the proof of Theorem 1-1.3 of Necas [29]. Let $v\in C^1(\overline D)$ and express it as
$v(x) - v(y) = v(x_1,x_2) - v(y_1,x_2) + v(y_1,x_2) - v(y_1,y_2) = \int_{y_1}^{x_1}\frac{\partial v}{\partial\xi_1}(\xi_1,x_2)\,d\xi_1 + \int_{y_2}^{x_2}\frac{\partial v}{\partial\xi_2}(y_1,\xi_2)\,d\xi_2$
for $x, y\in D$. We have
$|v(x) - v(y)| \le \int_0^a\left|\frac{\partial v}{\partial\xi_1}(\xi_1,x_2)\right|d\xi_1 + \int_0^a\left|\frac{\partial v}{\partial\xi_2}(y_1,\xi_2)\right|d\xi_2 \le \sqrt a\left[\left(\int_0^a\left|\frac{\partial v}{\partial\xi_1}(\xi_1,x_2)\right|^2 d\xi_1\right)^{1/2} + \left(\int_0^a\left|\frac{\partial v}{\partial\xi_2}(y_1,\xi_2)\right|^2 d\xi_2\right)^{1/2}\right]$
and
$|v(x) - v(y)|^2 \le 2a\left(\int_0^a\left|\frac{\partial v}{\partial\xi_1}(\xi_1,x_2)\right|^2 d\xi_1 + \int_0^a\left|\frac{\partial v}{\partial\xi_2}(y_1,\xi_2)\right|^2 d\xi_2\right)$
for $x, y\in D$. Hence,
$\iint_{D\times D}|v(x) - v(y)|^2\,dxdy \le 2a\left(a^3\int_D\left|\frac{\partial v}{\partial\xi_1}(\xi_1,x_2)\right|^2 d\xi_1 dx_2 + a^3\int_D\left|\frac{\partial v}{\partial\xi_2}(y_1,\xi_2)\right|^2 dy_1 d\xi_2\right) = 2a^4\|\nabla v\|_{L^2(D)}^2.$
On the other hand,
$\iint_{D\times D}|v(x) - v(y)|^2\,dxdy = \iint_{D\times D}\left[v(x)^2 + v(y)^2 - 2v(x)v(y)\right]dxdy = 2\iint_{D\times D}v(x)^2\,dxdy - 2\left(\int_D v(x)\,dx\right)^2 = 2a^2\|v\|_{L^2(D)}^2 - 2\left(\int_D v(x)\,dx\right)^2.$
Combining these results, we obtain (9.1) for $v\in C^1(\overline D)$. By density, we get (9.1) for $v\in H^1(D)$.
(ii) It suffices to prove (9.2) for $v\in C^1(\overline T)$. Setting $T' = \{x_1, x_2 > 0,\ x_2 > a - x_1\}$, we have $\overline D = \overline T\cup\overline{T'}$. For $x\in T'$, let $x^*$ be the reflection of $x$ with respect to the straight line $x_2 = a - x_1$:
$x^* = \begin{pmatrix} a - x_2\\ a - x_1\end{pmatrix}\quad\text{for}\quad x = \begin{pmatrix} x_1\\ x_2\end{pmatrix}\in T'.$
Note that $x^*\in\overline T$ for $x\in\overline{T'}$. Then, introduce $\tilde v\in C(\overline D)$ by
$\tilde v(x) = \begin{cases} v(x) & (x\in\overline T),\\ v(x^*) & (x\in\overline{T'}).\end{cases}$
Since $\tilde v\in H^1(D)$ in view of Lemma 9.6, we can apply (9.1) to $\tilde v$. On the other hand,
$\|\tilde v\|_{L^2(D)}^2 = 2\|v\|_{L^2(T)}^2,\qquad \|\nabla\tilde v\|_{L^2(D)}^2 = 2\|\nabla v\|_{L^2(T)}^2,\qquad \int_D\tilde v(x)\,dx = 2\int_T v(x)\,dx.$
Summing up, we deduce (9.2) for $v\in C^1(\overline T)$.
At this stage, we recall an important application of the projection theorem; for the proof, see any textbook on functional analysis.
Lemma 9.8 (Riesz's representation theorem). Let $X$ be a Hilbert space equipped with the scalar product $(\cdot,\cdot)_X$ and the norm $\|\cdot\|_X$. Further, let $F$ be a bounded linear functional on $X$; that is, $F : X\to\mathbb{R}$ is assumed to be a linear mapping satisfying
$\|F\|_{X'} = \sup_{v\in X,\,v\neq 0}\frac{|F(v)|}{\|v\|_X} < \infty.$
Then, there exists a unique $a\in X$ such that $F(v) = (a,v)_X$ for all $v\in X$.
10 Weak solution and regularity
10.1 Weak formulation
We return to consider the Dirichlet BVP for the Poisson equation:
−∆u = f in Ω, u = 0 on ∂Ω, (10.1)
where $\Omega\subset\mathbb{R}^2$ is a bounded Lipschitz domain with the boundary Γ = ∂Ω, and $f\in L^2(\Omega)$ is a given function.
The basic function space of our consideration is
$V = H_0^1(\Omega),$
which is a Hilbert space equipped with the standard scalar product and norm of $H^1(\Omega)$. As the scalar product and norm in $V$, however, we take
$(u,v)_V = (\nabla u,\nabla v) = \int_\Omega\nabla u\cdot\nabla v\,dx,\qquad \|u\|_V = \sqrt{(u,u)_V}.$
In view of the Poincaré inequality (Lemma 9.2), $\|\cdot\|_V$ is a norm equivalent to $\|\cdot\|_{H^1(\Omega)}$ on $V$. That is, we have
$\|v\|_V \le \|v\|_{H^1(\Omega)} \le (C_P^2 + 1)^{1/2}\|v\|_V \quad (v\in V)$
with the Poincaré constant $C_P$ appearing in Lemma 9.2. Consequently, the space $V$ forms a Hilbert space equipped with $(\cdot,\cdot)_V$ and $\|\cdot\|_V$.
We derive a reformulation of (10.1) in the function space $V$. To this end, supposing that $u$ is smooth, multiplying both sides of (10.1) by $\varphi\in C_0^\infty(\Omega)$, and integrating over Ω, we have by integration by parts
$\int_\Omega\nabla u\cdot\nabla\varphi\,dx = \int_\Omega f\varphi\,dx.$
By density, this implies
$\underbrace{\int_\Omega\nabla u\cdot\nabla v\,dx}_{=(u,v)_V} = \underbrace{\int_\Omega fv\,dx}_{=(f,v)} \quad (\forall v\in V).$
Notice that the left-hand side is meaningful for $u\in H^1(\Omega)$.
Now we can state the reformulation of (10.1) as follows:
Find $u\in V$ s.t. $(u,v)_V = (f,v)$ $(\forall v\in V)$.  (10.2)
We call (10.2) a weak form of (10.1). The solution $u$ of (10.2) is called a weak solution (generalized solution) of (10.1). A solution $u\in C^2(\Omega)\cap C(\overline\Omega)$ of (10.1) is called a classical solution of (10.1).
Theorem 10.1. Suppose that Ω is a bounded Lipschitz domain and $f\in L^2(\Omega)$. Then, we have the following.
(i) There exists a unique $u$ satisfying (10.2).
(ii) $\|u\|_V \le C\|f\|$ with a domain constant $C$.
(iii) The function $u\in V$ is characterized by
$J(u) = \min_{v\in V}J(v),\qquad J(v) = \frac12\|v\|_V^2 - (f,v).$
Proof. (i) We apply Riesz's representation theorem (Lemma 9.8). In doing so, $F\in V'$ is defined by setting $F(v) = (f,v)$. We have $F\in V'$, since $|F(v)| \le \|f\|\,\|v\| \le C_P\|f\|\,\|v\|_V$ in view of the Schwarz and Poincaré inequalities. Hence, there exists a unique $u\in V$ satisfying $(u,v)_V = F(v) = (f,v)$ for all $v\in V$.
(ii) Choosing $v = u$, we have by the Schwarz and Poincaré inequalities
$\|u\|_V^2 = (f,u) \le \|f\|\,\|u\| \le \|f\|\cdot C_P\|\nabla u\| = C_P\|f\|\,\|u\|_V.$
(iii) See the proof of Theorem 7.1.
10.2 Regularity of solutions

Theorem 10.2 (Elliptic regularity). (i) Let $k\ge 0$ be an integer. Assume that Γ = ∂Ω is a boundary of class $C^{k+2}$ and $f\in H^k(\Omega)$. Then, the solution $u\in V$ of (10.2) satisfies
$u\in H^{k+2}(\Omega),\qquad \|u\|_{H^{k+2}(\Omega)} \le C_k\|f\|_{H^k(\Omega)}$
with a domain constant $C_k$. In particular, if Ω and $f$ are smooth enough, then $u\in C^2(\Omega)\cap C(\overline\Omega)$.
(ii) Assume that Ω is a convex polygon and $f\in L^2(\Omega)$. Then, the solution $u\in V$ of (10.2) satisfies
$u\in H^2(\Omega),\qquad \|u\|_{H^2(\Omega)} \le C\|f\|$
with a domain constant $C$. Moreover, if the polygon Ω has a non-convex corner, there exist many $f\in L^2(\Omega)$ for which the solution $u$ of (10.2) does not belong to $H^2(\Omega)$.
Proof. The proof of (i) is described in standard monographs on PDEs. The proof of (ii) is given, for example, in the following references:
• M. Dauge: Elliptic Boundary Value Problems on Corner Domains. Smoothness and Asymptotics of Solutions, Lecture Notes in Mathematics 1341, Springer, 1988.
• P. Grisvard: Elliptic Problems in Nonsmooth Domains, Pitman, 1985.
91
• P. Grisvard: Behavior of the solutions of an elliptic boundary value prob-lem in a polygonal or polyhedral domain, Numerical solution of partialdifferential equations, III (Proc. Third Sympos. (SYNSPADE), Univ.Maryland, College Park, Md., 1975), pp. 207–274, Academic Press, NewYork, 1976.
• A. Kufner and A. M. Sandig: Some applications of weighted Sobolevspaces, Teubner Texts in Mathematics 100, Teubner Verlagsgesellschaft,Leipzig, 1987.
10.3 Galerkin’s approximation and Cea’s lemma
Let $V_h$, with $h > 0$ a parameter, be a finite-dimensional subspace of $V$ with $\dim V_h = N$. The Galerkin approximation of (10.2) reads as
Find $u_h\in V_h$ s.t. $(u_h,v_h)_V = (f,v_h)$ $(\forall v_h\in V_h)$.  (10.3)
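In coordinates, if $\{\psi_j\}_{j=1}^N$ is a basis of $V_h$ and $u_h = \sum_j c_j\psi_j$, then (10.3) is the linear system $Ac = b$ with $A_{ij} = (\psi_j,\psi_i)_V$ and $b_i = (f,\psi_i)$. A minimal sketch in the one-dimensional analogue ($-u'' = f$ on $(0,1)$, $u(0)=u(1)=0$, hat-function basis on a uniform mesh, with a lumped load vector; the concrete $f$ and mesh size are arbitrary choices, not taken from the notes):

```python
import numpy as np

# Galerkin system for -u'' = f on (0,1), u(0) = u(1) = 0,
# with P1 hat functions on a uniform mesh of n subintervals.
n = 100
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)                 # mesh nodes
f = lambda t: np.pi**2 * np.sin(np.pi * t)       # exact solution u = sin(pi x)

# stiffness matrix A_ij = integral of phi_i' phi_j' = tridiag(-1, 2, -1)/h
A = (np.diag(2.0 * np.ones(n - 1))
     - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h
b = h * f(x[1:-1])                               # lumped load b_i ~ (f, phi_i)

c = np.linalg.solve(A, b)                        # interior nodal values of u_h
err = np.max(np.abs(c - np.sin(np.pi * x[1:-1])))
print(err)                                       # small (O(h^2) nodal error)
```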
Theorem 10.3. Suppose that Ω is a bounded Lipschitz domain and $f\in L^2(\Omega)$. Then, we have the following.
(i) There exists a unique $u_h$ satisfying (10.3).
(ii) $\|u_h\|_V \le C\|f\|$ with the same domain constant $C$ as in Theorem 10.1.
(iii) The function $u_h\in V_h$ is characterized by
$J(u_h) = \min_{v_h\in V_h}J(v_h),\qquad J(v_h) = \frac12\|v_h\|_V^2 - (f,v_h).$
Proof. It is exactly the same as that of Theorem 10.1.
Theorem 10.4 (Galerkin’s orthogonality).Let u ∈ V and uh ∈ Vh be solutions of (10.2) and (10.3), respectively. Then,we have
(u− uh, vh)V = 0 (∀vh ∈ Vh).
Proof. Let vh ∈ Vh be arbitrary. Subtracting (10.3) from (10.2) with v = vh,we obtain the desired equality.
Theorem 10.5 (Cea's lemma). Let $u\in V$ and $u_h\in V_h$ be the solutions of (10.2) and (10.3), respectively. Then, we have
$\|u - u_h\|_V = \min_{v_h\in V_h}\|u - v_h\|_V.$
92
Proof. Let $v_h\in V_h$ be arbitrary. By Galerkin's orthogonality, we have
$\|u-u_h\|_V^2 = (u-u_h, u-u_h)_V = (u-u_h, u-v_h)_V + (u-u_h, v_h-u_h)_V = (u-u_h, u-v_h)_V \le \|u-u_h\|_V\|u-v_h\|_V,$
so $\|u-u_h\|_V \le \|u-v_h\|_V$. Hence,
$\|u-u_h\|_V \le \inf_{v_h\in V_h}\|u-v_h\|_V.$
Thus, we obtain the desired relation.
93
11 Shape-regularity of triangulations
11.1 Interpolation error estimates
Throughout this section, we assume that Ω is a polygonal domain with theboundary Γ = ∂Ω. We recall the following.
• $\{\mathcal T_h\}_{h\downarrow 0}$ is a family of triangulations of Ω, and $\{P_i\}_{i=1}^N$ is the set of all vertices of $\mathcal T_h$;
• $X_h$ and $V_h$ are the sets of continuous piecewise affine functions defined as
$X_h = \{v_h\in C(\overline\Omega) \mid v_h \text{ is an affine function on every } T\in\mathcal T_h\},\qquad V_h = \{v_h\in X_h \mid v_h|_\Gamma = 0\},$
and $\{\phi_i\}_{i=1}^N$ is the standard basis of $X_h$;
• According to Lemma 9.6, $X_h$ and $V_h$ are subspaces of $H^1(\Omega)$ and $H_0^1(\Omega)$, respectively.
The aim of this section is to show that every function v ∈ H1(Ω) can beapproximated by using functions of Xh.
We use the $H^2$ semi-norm defined as
$|u|_{2,\omega} = |u|_{H^2(\omega)} = \left(\int_\omega\left[|\partial_1^2 u|^2 + 2|\partial_1\partial_2 u|^2 + |\partial_2^2 u|^2\right]dx\right)^{1/2},$
where $\omega\subset\mathbb{R}^2$. Then, the $H^2$ norm is given as
$\|u\|_{2,\omega} = \|u\|_{H^2(\omega)} = \left(\|u\|_\omega^2 + \|\nabla u\|_\omega^2 + |u|_{2,\omega}^2\right)^{1/2},$
where $\|u\|_\omega = \|u\|_{L^2(\omega)}$.
Lemma 11.1 (local interpolation error). Let $T$ be a closed triangle. Then, for every $u\in H^2(T)$, we have
$\|\Pi u - u\|_T \le C_1 h_T^2|u|_{2,T}, \qquad (11.1)$
$\|\nabla(\Pi u - u)\|_T \le C_2\frac{h_T^2}{\rho_T}|u|_{2,T}. \qquad (11.2)$
Therein,
• $\Pi u$ is the affine function defined on $T$ such that its values at the vertices of $T$ coincide with those of $u$; thus, $(\Pi u)(P_i) = u(P_i)$ for $i = 1, 2, 3$, where the $P_i$'s are the vertices of $T$. (Recall that $u$ is a continuous function in view of Lemma 9.3.)
• $C_1$ and $C_2$ are absolute positive constants, independent of $T$ and $u$.
• $h_T$ is the diameter of the circumscribed circle of $T$, and $\rho_T$ is the diameter of the inscribed circle of $T$.
⋆Example 11.2. We consider the triangle $T$ whose vertices are $(0,0)$, $(L,0)$ and $(L/2, L^\alpha)$ with $L > 0$ and $\alpha > 0$. Obviously, $u(x_1,x_2) = u(x,y) = x^2$ is in $H^2(T)$ and $(\Pi u)(x,y) = Lx - \frac14 L^{2-\alpha}y$. Hence, we can calculate
$(u - \Pi u)_x = 2x - L,\qquad (u - \Pi u)_y = \frac14 L^{2-\alpha},\qquad u_{xx} = 2,\qquad u_{xy} = u_{yy} = 0.$
Therefore, we have
$\frac{\|\nabla(\Pi u - u)\|_T^2}{|u|_{2,T}^2} \ge \frac{1}{32}\cdot\frac{1}{L^{2\alpha - 4}}.$
Hence, if $\alpha > 2$, inequality (11.2) becomes meaningless as $L\to 0$.
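The computation in Example 11.2 is easy to verify mechanically; the following checks that the stated $\Pi u$ really interpolates $u = x^2$ at the three vertices of $T$, for one arbitrary sample choice of $L$ and $\alpha$:

```python
# Verify the interpolant of Example 11.2: u(x, y) = x^2 on the triangle
# with vertices (0,0), (L,0), (L/2, L^alpha).
L, alpha = 0.5, 3.0                   # arbitrary sample values (alpha > 2)
verts = [(0.0, 0.0), (L, 0.0), (L / 2.0, L**alpha)]

u = lambda x, y: x**2
Pu = lambda x, y: L * x - 0.25 * L**(2.0 - alpha) * y   # claimed P1 interpolant

# Pi u agrees with u at the vertices, so it is the P1 interpolant on T
for (x, y) in verts:
    assert abs(Pu(x, y) - u(x, y)) < 1e-12

# the y-derivative of the error is the constant L^(2-alpha)/4,
# which blows up as L -> 0 once alpha > 2
print(0.25 * L**(2.0 - alpha))
```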
Definition (shape regularity of triangulations). A family of triangulations $\{\mathcal T_h\}_{h>0}$ is shape-regular
$\overset{\text{def.}}{\Longleftrightarrow}\ \exists\nu_1 > 0\ \text{s.t.}\ \frac{h_T}{\rho_T} \le \nu_1 \quad (\forall T\in\mathcal T_h,\ \forall\mathcal T_h). \qquad (11.3)$

Lemma 11.3 (Zlamal's minimum angle condition). Condition (11.3) is equivalent to
$\exists\theta_1 > 0\ \text{s.t.}\ \theta_T \ge \theta_1 \quad (\forall T\in\mathcal T_h,\ \forall\mathcal T_h), \qquad (11.4)$
where $\theta_T$ is the minimum angle of $T$.
Proof. EXERCISE (→ Problem 9).
We introduce the Lagrange interpolation operator $\Pi_h : C(\overline\Omega)\to X_h$ defined as
$(\Pi_h u)(x) = \sum_{i=1}^N u(P_i)\phi_i(x) \quad (u\in C(\overline\Omega)). \qquad (11.5)$
By Lemma 9.3, $\Pi_h u\in X_h$ can be defined for every $u\in H^2(\Omega)$.
Theorem 11.4 (global interpolation error). If $\{\mathcal T_h\}$ is shape-regular, then we have
$\|\Pi_h u - u\| \le C_1 h^2|u|_{2,\Omega} \quad (u\in H^2(\Omega)), \qquad (11.6)$
$\|\nabla(\Pi_h u - u)\| \le C_2\nu_1 h|u|_{2,\Omega} \quad (u\in H^2(\Omega)), \qquad (11.7)$
where $C_1$ and $C_2$ are the constants appearing in Lemma 11.1.
Proof. It is a direct consequence of the shape-regularity of $\{\mathcal T_h\}$ and Lemma 11.1. In fact, for $u\in H^2(\Omega)$,
$\|\nabla(\Pi_h u - u)\|^2 = \sum_{T\in\mathcal T_h}\|\nabla(\Pi_h u - u)\|_T^2 \le \sum_{T\in\mathcal T_h}C_2^2\left(\frac{h_T^2}{\rho_T}\right)^2|u|_{H^2(T)}^2 \le \sum_{T\in\mathcal T_h}C_2^2(h_T\nu_1)^2|u|_{H^2(T)}^2 \le (C_2 h\nu_1)^2\sum_{T\in\mathcal T_h}|u|_{H^2(T)}^2 = (C_2 h\nu_1)^2|u|_{2,\Omega}^2.$
Thus, we obtain (11.7); (11.6) follows in the same way from (11.1).
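A one-dimensional analogue of (11.6) is easy to check numerically: `np.interp` performs piecewise linear (P1 Lagrange) interpolation, so halving $h$ should roughly quarter the $L^2$ error (the test function and mesh sizes below are arbitrary choices):

```python
import numpy as np

u = lambda x: np.sin(np.pi * x)          # smooth test function on (0,1)
xf = np.linspace(0.0, 1.0, 20001)        # fine grid for the L2 norm
dx = xf[1] - xf[0]

def l2_interp_error(n):
    """L2 error of piecewise linear interpolation of u on n subintervals."""
    xn = np.linspace(0.0, 1.0, n + 1)
    err = u(xf) - np.interp(xf, xn, u(xn))
    return np.sqrt(np.sum(err**2) * dx)

e8, e16 = l2_interp_error(8), l2_interp_error(16)
rate = np.log2(e8 / e16)
print(rate)                              # close to 2, matching O(h^2)
```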
Theorem 11.5 (approximation property of $X_h$ and $V_h$). If $\{\mathcal T_h\}$ is shape-regular, then
$\lim_{h\downarrow 0}\inf_{v_h\in X_h}\|u - v_h\| = 0 \quad (u\in L^2(\Omega)), \qquad (11.8)$
$\lim_{h\downarrow 0}\inf_{v_h\in X_h}\|\nabla(u - v_h)\| = 0 \quad (u\in H^1(\Omega)), \qquad (11.9)$
$\lim_{h\downarrow 0}\inf_{v_h\in V_h}\|\nabla(u - v_h)\| = 0 \quad (u\in H_0^1(\Omega)). \qquad (11.10)$
Proof. Let $u\in H^1(\Omega)$. Then, by virtue of (11.7), for any $v\in C^\infty(\overline\Omega)$,
$\inf_{v_h\in X_h}\|\nabla(u - v_h)\| \le \|\nabla(u - v)\| + \|\nabla(v - \Pi_h v)\| \le \|\nabla(u - v)\| + C_3 h|v|_{2,\Omega} \quad (C_3 = C_2\nu_1).$
At this stage, let $\varepsilon > 0$ be arbitrary. By density, there is $v\in C^\infty(\overline\Omega)$ satisfying $\|\nabla(u - v)\| \le \varepsilon/2$. Hence, putting $\delta = \varepsilon/(2C_3|v|_{2,\Omega})$, we have
$0 < h \le \delta \ \Rightarrow\ \inf_{v_h\in X_h}\|\nabla(u - v_h)\| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$
This implies (11.9). The proofs of (11.8) and (11.10) are similar.
11.2 Proof of Lemma 11.1
We proceed to the proof of Lemma 11.1. The following notation is employed.
• Let $T$ be a closed triangle:
  – $|T|$ is the area of $T$;
  – $P_i$ ($i = 1, 2, 3$) are the vertices of $T$;
  – $p_i = \overrightarrow{OP_i}$ ($i = 1, 2, 3$).
• $\hat T$ is the reference element
$\overset{\text{def.}}{\Longleftrightarrow}$ $\hat T$ is the triangle with vertices $\hat P_1 = (0,0)$, $\hat P_2 = (1,0)$, and $\hat P_3 = (0,1)$.
• Set $B = [b_1, b_2]$, $b_1 = p_2 - p_1$, $b_2 = p_3 - p_1$, and consider
$x = \Phi(\xi) \equiv B\xi + p_1 \quad (\xi\in\hat T),$
which is a mapping from $\hat T$ onto $T$; $T = \Phi(\hat T)$.
Figure 11.1: The reference element $\hat T$ and $\Phi(\xi) = B\xi + p_1$.
Remark. We will make use of the Poincaré–Wirtinger inequality (Lemma 9.7):
$\int_{\hat T}\hat v^2\,d\xi \le \int_{\hat T}|\nabla_\xi\hat v|^2\,d\xi = \int_{\hat T}(\partial_{\xi_1}\hat v)^2\,d\xi + \int_{\hat T}(\partial_{\xi_2}\hat v)^2\,d\xi \quad \left(\hat v\in H^1(\hat T),\ \int_{\hat T}\hat v\,d\xi = 0\right)$
and Sobolev's inequality (Lemma 9.3):
$\|\hat v\|_{L^\infty(\hat T)} \le \hat C_1\|\hat v\|_{2,\hat T} \quad (\hat v\in H^2(\hat T)).$
We note that $\hat C_1$ is actually an absolute positive constant.
Lemma 11.6. There exists an absolute positive constant $\hat C_2$ such that
$\inf_{q\in P_1}\|\hat v + q\|_{2,\hat T} \le \hat C_2|\hat v|_{2,\hat T}$
for any $\hat v\in H^2(\hat T)$.
Proof. We write, for example, $L^2 = L^2(\hat T)$, $\|\cdot\| = \|\cdot\|_{\hat T}$ and $\partial_i = \partial_{\xi_i}$ for short. Let $\hat v\in H^2$. There is a $p\in P_1$ such that
$\int_{\hat T}(\hat v + p)\,d\xi = \int_{\hat T}\partial_1(\hat v + p)\,d\xi = \int_{\hat T}\partial_2(\hat v + p)\,d\xi = 0.$
We apply Poincaré–Wirtinger to $\partial_i(\hat v + p)$, $i = 1, 2$, and deduce
$\int_{\hat T}\left(\partial_i(\hat v + p)\right)^2 d\xi \le \int_{\hat T}|\nabla_\xi\partial_i(\hat v + p)|^2\,d\xi \le |\hat v + p|_{2,\hat T}^2 = |\hat v|_{2,\hat T}^2.$
Hence,
$\|\nabla_\xi(\hat v + p)\|^2 \le 2|\hat v|_{2,\hat T}^2.$
Again applying Poincaré–Wirtinger, we obtain
$\|\hat v + p\|^2 \le \|\nabla_\xi(\hat v + p)\|^2 \le 2|\hat v|_{2,\hat T}^2.$
Summing up,
$\|\hat v + p\|_{2,\hat T}^2 = \|\hat v + p\|^2 + \|\nabla_\xi(\hat v + p)\|^2 + |\hat v|_{2,\hat T}^2 \le 5|\hat v|_{2,\hat T}^2.$
This completes the proof.
Lemma 11.7. Let $S$ and $T$ be closed triangles, and let
$\Phi : S\ni\xi\mapsto x = B\xi + b\in T,$
where $B\in\mathbb{R}^{2\times 2}$ is a non-singular matrix and $b\in\mathbb{R}^2$ (thus, Φ is an affine mapping from $S$ onto $T$). Then, we have the following.
(i) $\|B\| \equiv \sup_{|\xi|=1}|B\xi| \le \dfrac{h_T}{\rho_S}$, $\quad\|B^{-1}\| \le \dfrac{h_S}{\rho_T}$.
(ii) For $v\in H^1(T)$, we have $w \equiv v\circ\Phi\in H^1(S)$ and
$\|\nabla w\|_S \le \|B\|\,|\det B|^{-1/2}\,\|\nabla v\|_T.$
(iii) For $v\in H^2(T)$, we have $w \equiv v\circ\Phi\in H^2(S)$ and
$|w|_{2,S} \le C_0\|B\|^2\,|\det B|^{-1/2}\,|v|_{2,T}$
with an absolute positive constant $C_0$.
Proof. (i) $|\xi| = \rho_S$ implies $|B\xi| \le h_T$. Hence,
$\|B\| = \frac{1}{\rho_S}\sup_{|\xi|=\rho_S}|B\xi| \le \frac{h_T}{\rho_S}.$
The bound for $\|B^{-1}\|$ follows in the same way, applied to $\Phi^{-1}$.
(ii) Let $J_\Phi$ and $J_{\Phi^{-1}}$ be the determinants of the Jacobian matrices of Φ and $\Phi^{-1}$. Then, $|J_\Phi| = |\det B|\ (= |T|/|S| \neq 0)$ and $|J_{\Phi^{-1}}| = |\det B^{-1}| = |\det B|^{-1}\ (= |S|/|T|)$. By density, it suffices to consider the case $v\in C^1(T)$. Then, $w\in C^1(S)$. Since Φ is affine,
$\nabla_\xi w = B^{\mathsf T}\nabla_x v.$
Hence, by $\|B\| = \|B^{\mathsf T}\|$,
$\|\nabla w\|_S^2 = \int_S|\nabla_\xi w|^2\,d\xi = \int_T|B^{\mathsf T}\nabla_x v|^2\,|J_{\Phi^{-1}}|\,dx \le \int_T\|B^{\mathsf T}\|^2|\nabla_x v|^2\,|\det B^{-1}|\,dx = \|B\|^2|\det B|^{-1}\|\nabla v\|_T^2.$
(iii) By density, it suffices to consider the case $v\in C^2(T)$. Then, $w\in C^2(S)$ and
$\left|\frac{\partial^2 w}{\partial\xi_1^2}\right| = \left|\sum_{i,j=1}^2 B_{i1}B_{j1}\frac{\partial^2 v}{\partial x_i\partial x_j}\right| \le \|B\|^2\sum_{i,j=1}^2\left|\frac{\partial^2 v}{\partial x_i\partial x_j}\right|.$ ²
Since we obtain similar estimates for the other second derivatives, we have
$\sum_{|\alpha|=2}\left|D_\xi^\alpha w\right|^2 \le 16\|B\|^4\sum_{|\alpha|=2}|D_x^\alpha v|^2.$
Hence,
$|w|_{H^2(S)}^2 \le 16\|B\|^4\int_T\sum_{|\alpha|=2}|D_x^\alpha v|^2\,|J_{\Phi^{-1}}|\,dx = 16\|B\|^4|\det B|^{-1}|v|_{H^2(T)}^2.$
Thus, we have shown the desired inequality with $C_0 = 4$. ³
Lemma 11.8. There exists an absolute positive constant $\hat C$ such that
$\|\hat v - \hat\Pi\hat v\|_{1,\hat T} \le \hat C|\hat v|_{2,\hat T}$
for all $\hat v\in H^2(\hat T)$.
Proof. We write, for example, $H^1 = H^1(\hat T)$ and $(\cdot,\cdot) = (\cdot,\cdot)_{\hat T}$ for short.
• Let $w\in H^1$ be arbitrary. Consider the functional $F : H^2\to\mathbb{R}$ defined by
$F(\hat v) = (\hat v - \hat\Pi\hat v,\ w) + (\nabla(\hat v - \hat\Pi\hat v),\ \nabla w) \quad (\hat v\in H^2).$
(Recall that $\hat\Pi\hat v$ is well-defined; see Lemma 9.3.)
• By Sobolev's inequality, we have for $\hat v\in H^2$
$\|\hat\Pi\hat v\|_{L^\infty} \le \|\hat v\|_{L^\infty} \le \hat C_1\|\hat v\|_{2,\hat T}$
and
$\|\nabla(\hat\Pi\hat v)\|_{L^\infty} = \max\left\{|\hat v(\hat P_1) - \hat v(\hat P_2)|,\ \frac{|\hat v(\hat P_2) - \hat v(\hat P_3)|}{\sqrt 2},\ |\hat v(\hat P_3) - \hat v(\hat P_1)|\right\} \le 2\|\hat v\|_{L^\infty} \le 2\hat C_1\|\hat v\|_{2,\hat T}.$

² We use a rough estimate: $\max_{i,j}|B_{ij}| \le \|B\|$, where $B = (B_{ij})$. In fact, setting $\eta = (1,0)$, we have $|B_{11}| \le \sqrt{|B_{11}|^2 + |B_{21}|^2} = |B\eta|/|\eta| \le \|B\|$.
³ However, we can take $C_0 = 1$ by another method of analysis.
• Therefore,
$\|\hat v - \hat\Pi\hat v\|_{1,\hat T}^2 \le \|\hat v - \hat\Pi\hat v\|^2 + \|\nabla(\hat v - \hat\Pi\hat v)\|^2 \le 2\left(\|\hat v\|^2 + |\hat T|^2\|\hat\Pi\hat v\|_{L^\infty}^2\right) + 2\left(\|\nabla\hat v\|^2 + |\hat T|^2\|\nabla\hat\Pi\hat v\|_{L^\infty}^2\right) \le 2\|\hat v\|_{1,\hat T}^2 + \frac12\left(\|\hat\Pi\hat v\|_{L^\infty}^2 + \|\nabla\hat\Pi\hat v\|_{L^\infty}^2\right) \le 2\|\hat v\|_{1,\hat T}^2 + \frac52\hat C_1^2\|\hat v\|_{2,\hat T}^2 \le \left(2 + \frac52\hat C_1^2\right)\|\hat v\|_{2,\hat T}^2 \equiv \hat C_3^2\|\hat v\|_{2,\hat T}^2.$
Thus,
$|F(\hat v)| \le \|\hat v - \hat\Pi\hat v\|_{1,\hat T}\|w\|_{1,\hat T} \le \hat C_3\|\hat v\|_{2,\hat T}\|w\|_{1,\hat T}.$
• At this stage, let $q\in P_1$ be arbitrary. Then, since $\hat\Pi(\hat v + q) = \hat\Pi\hat v + q$ and hence $F(\hat v + q) = F(\hat v)$, we have
$|F(\hat v)| = |F(\hat v + q)| \le \hat C_3\|\hat v + q\|_{2,\hat T}\|w\|_{1,\hat T}.$
This gives
$|F(\hat v)| \le \hat C_3\inf_{q\in P_1}\|\hat v + q\|_{2,\hat T}\,\|w\|_{1,\hat T}.$
We apply Lemma 11.6 to obtain
$|F(\hat v)| \le \hat C_2\hat C_3|\hat v|_{2,\hat T}\|w\|_{1,\hat T}.$
• Finally, choosing $w = \hat v - \hat\Pi\hat v$, we arrive at
$\|\hat v - \hat\Pi\hat v\|_{1,\hat T}^2 \le \hat C_2\hat C_3|\hat v|_{2,\hat T}\|\hat v - \hat\Pi\hat v\|_{1,\hat T},$
which implies the desired inequality.
We can now prove Lemma 11.1.
Proof of Lemma 11.1. Let $T$ be any triangle, and let $\Phi(\xi) = B\xi + b$ be the affine mapping which maps the reference triangle $\hat T$ onto $T$. Set $\hat h = h_{\hat T}$ and $\hat\rho = \rho_{\hat T}$ (absolute constants). Fix $v\in H^2(T)$, and define $\hat v = v\circ\Phi$. Suppose that $\hat\Pi\hat v$ is the affine function defined on $\hat T$ whose values at the vertices coincide with those of $\hat v$. (The meaning of $\hat\Pi$ is the same as described in Lemma 11.1.) First, by Lemma 11.7 (i) and (iii),
$|\hat v|_{2,\hat T}^2 \le C_0^2\|B\|^4|\det B|^{-1}|v|_{2,T}^2 \le \frac{C_0^2}{\hat\rho^4}\,h_T^4\,|\det B|^{-1}|v|_{2,T}^2.$
This, together with Lemma 11.8, gives
$\|v - \Pi v\|_T^2 \le |\det B|\cdot\|\hat v - \hat\Pi\hat v\|_{\hat T}^2 \le |\det B|\cdot\hat C^2|\hat v|_{2,\hat T}^2 \le |\det B|\cdot\left(C_0^2\hat C^2/\hat\rho^4\right)\cdot h_T^4\,|\det B|^{-1}|v|_{2,T}^2.$
Thus, we deduce (11.1). Similarly,
$\|\nabla_x(v - \Pi v)\|_T^2 \le \|B^{-1}\|^2|\det B|\cdot\|\nabla_\xi(\hat v - \hat\Pi\hat v)\|_{L^2(\hat T)}^2 \le \frac{\hat h^2}{\rho_T^2}|\det B|\cdot\hat C^2|\hat v|_{2,\hat T}^2 \le \frac{\hat h^2}{\rho_T^2}|\det B|\cdot\left(C_0^2\hat C^2/\hat\rho^4\right)h_T^4\,|\det B|^{-1}|v|_{2,T}^2 = \left(\frac{C_0^2\hat h^2\hat C^2}{\hat\rho^4}\right)\frac{h_T^4}{\rho_T^2}|v|_{2,T}^2.$
This implies (11.2); this completes the proof.
12 Error analysis of FEM

We are now ready to study the convergence of FEM. We recall that the weak form of the Poisson equation is described as
Find $u\in V$ s.t. $\underbrace{\int_\Omega\nabla u\cdot\nabla v\,dx}_{=(u,v)_V} = \underbrace{\int_\Omega fv\,dx}_{=(f,v)} \quad (\forall v\in V) \qquad (12.1)$
and the finite element approximation is given as
Find $u_h\in V_h$ s.t. $(u_h,v_h)_V = (f,v_h)$ $(\forall v_h\in V_h)$.  (12.2)
We recall the following:
• Ω ⊂ R2 is a polygonal domain;
• f ∈ L2(Ω) is a given function;
• $V = H_0^1(\Omega)$ is a Hilbert space equipped with
$\|v\|_V = \|\nabla v\|,\qquad (u,v)_V = \int_\Omega\nabla u\cdot\nabla v\,dx;$
• $\|u\| = \|u\|_{L^2(\Omega)}$, $\quad(u,v) = \int_\Omega uv\,dx$;
• $|u|_{2,\Omega}^2 = \|\partial_1^2 u\|^2 + 2\|\partial_1\partial_2 u\|^2 + \|\partial_2^2 u\|^2$;
• $\{\mathcal T_h\}_{h>0}$ is a family of triangulations of Ω;
• $V_h\subset V$ is the set of continuous piecewise linear functions defined on $\mathcal T_h$;
• The Lagrange interpolation $\Pi_h : C(\overline\Omega)\to X_h$ is defined as
$(\Pi_h v)(x) = \sum_{i=1}^N v(P_i)\phi_i(x) \quad (v\in C(\overline\Omega)).$
(Recall that $\Pi_h v$ is well-defined for any $v\in H^2(\Omega)$; see Lemma 9.3.)
Remark. Problem (12.1) is the weak form of the Dirichlet BVP for the Poisson equation
$-\Delta u = f$ in Ω, $\quad u = 0$ on ∂Ω.
There exists a unique $u$ satisfying (12.1), and we have $\|u\|_V = \|\nabla u\| \le C\|f\|$ (cf. Theorem 10.1). If Ω is a convex polygon, we further obtain $u\in H^2(\Omega)$ with $\|u\|_{H^2(\Omega)} \le C\|f\|$ (cf. Theorem 10.2).
Theorem 12.1 (Convergence). Suppose that $\{\mathcal T_h\}_{h>0}$ is shape-regular. Let $u\in V$ and $u_h\in V_h$ be the solutions of (12.1) and (12.2), respectively. Then, we have
$\lim_{h\downarrow 0}\|\nabla u - \nabla u_h\| = 0.$
Thus, $u_h$ converges to $u$ in $H^1(\Omega)$ as $h\downarrow 0$.
Proof. Recall Theorem 10.5 (Cea's lemma):
$\|u - u_h\|_V = \min_{v_h\in V_h}\|u - v_h\|_V. \qquad (12.3)$
This, together with Theorem 11.5, implies the conclusion.
Theorem 12.2 ($H^1$ error estimate). Suppose that $\{\mathcal T_h\}_{h>0}$ is shape-regular. Let $u\in V$ and $u_h\in V_h$ be the solutions of (12.1) and (12.2), respectively. Further, assume $u\in H^2(\Omega)$. Then, we have
$\|\nabla u - \nabla u_h\| \le Ch|u|_{2,\Omega}$
with $C = C_2\nu_1$ ($C_2 > 0$ is the constant appearing in Lemma 11.1).
Proof. Equality (12.3) implies
$\|u - u_h\|_V \le \|u - v_h\|_V \quad (\forall v_h\in V_h).$
We choose $v_h = \Pi_h u\in V_h$ (→ Problem 11) and apply Theorem 11.4 (global interpolation error) to obtain
$\|u - u_h\|_V \le \|u - \Pi_h u\|_V \le C_2\nu_1 h|u|_{2,\Omega}.$
Theorem 12.3 ($L^2$ error estimate). Assume that $\{\mathcal T_h\}_{h>0}$ is shape-regular. Let $u\in V$ and $u_h\in V_h$ be the solutions of (12.1) and (12.2), respectively. Further, assume that Ω is a convex polygon. Then, we have
$\|u - u_h\| \le C'h^2|u|_{2,\Omega}$
with $C' = \nu_1^2 C_2^2 C_R > 0$, where $C_R$ denotes the domain constant appearing in Theorem 10.2 (ii).
Remark. There is a big difference between Theorems 12.2 and 12.3. InTheorem 12.2, we assume that the weak solution u of the Poisson equation−∆u = f with u|∂Ω = 0 is in H2(Ω) for f ∈ L2(Ω) under consideration. Onthe other hand, if Ω is a convex polygon, the weak solution w of the Poissonequation −∆w = g with w|∂Ω = 0 belongs to H2(Ω) for all g ∈ L2(Ω).
Proof of Theorem 12.3. Define $e_h = u - u_h$ and consider the variational problem
Find $w\in V$ s.t. $(v,w)_V = (e_h,v)$ $(\forall v\in V)$,
which we call the adjoint problem of (12.1). In view of Theorems 10.1 and 10.2, there exists a unique solution $w\in V$ satisfying $\|w\|_{H^2(\Omega)} \le C_R\|e_h\|$ with a domain constant $C_R$. Moreover, we know $\|e_h\|_V \le \nu_1 C_2 h|u|_{2,\Omega}$ by Theorem 12.2. Choosing $v = e_h$, we deduce
$\|e_h\|^2 = (e_h,w)_V = (e_h, w - \Pi_h w)_V \quad (\text{by Galerkin orthogonality, } (e_h,\Pi_h w)_V = 0)$
$\le \|e_h\|_V\|w - \Pi_h w\|_V \le \|e_h\|_V\cdot C_2\nu_1 h|w|_{2,\Omega} \le C_2\nu_1 h|u|_{2,\Omega}\cdot C_2\nu_1 h\cdot C_R\|e_h\|.$
Hence, we have $\|u - u_h\| \le C_2^2 C_R\nu_1^2 h^2|u|_{2,\Omega}$.
Remark. The method of analysis in the proof of Theorem 12.3 is called the duality argument, or the Aubin–Nitsche trick.
13 Numerical experiments using FreeFem++
So far we have studied a mathematical theory of FEM. Unfortunately, theimplementation of FEM is not an easy task. Actually, Professor O. Pironneauwrote in his famous book “Finite Element Methods for Fluids (Wiley, 1989)”that
Numerical analysis is somewhat dry if it is taught without test-ing the methods. Unfortunately experience shows that a simplefinite element solution of a Laplace equation with the P 1 conform-ing element requires at least 20 hours of programming time; so itis difficult to reach the more interesting applications discussed inthis book in the time allotted to a Master course. [page 197]
In order to avoid these difficulties, we can utilize the free software FreeFem++
http://www.freefem.org/ff++/index.htm
or Freefem++-cs
http://www.ann.jussieu.fr/~lehyaric/ffcs/index.htm
13.1 Examples
Figure 13.1: Freefem++-cs
⋆Example 13.1. In the unit square $\Omega = (0,1)\times(0,1)$, we consider the Poisson equation
$-\Delta u = x^2 + 2y$ in Ω, $\quad u = 0$ on ∂Ω.
Figure 13.2: Useful buttons
Listing 7 is a FreeFem++ code to solve this problem, and Fig. 13.3 shows the output.
Listing 7: Example 13.1
// parameters and functions
func g = 0;            // boundary condition
func f = x*x + 2.0*y;  // right-hand side function
int n = 20;            // division number
// domain
mesh Th = square(n, n);
// finite element space
fespace Vh(Th, P1);
Vh u, v;
// Poisson equation
solve poisson(u, v) =
    int2d(Th)(dx(u)*dx(v) + dy(u)*dy(v))
    - int2d(Th)(f*v)
    + on(1, 2, 3, 4, u = g);
// plot data
plot(u, wait = true, ps = "prog1.eps");
⋆Example 13.2. Consider the same problem as Example 13.1 and apply gnuplot to display the shape of the solution. Listing 8 is a FreeFem++ code. After running it, we get prog2.data. Then, at the gnuplot terminal, type as follows:
gnuplot> set pm3d
gnuplot> set palette rgbformulae 33,13,10
gnuplot> set ticslevel 0
gnuplot> splot "prog2.data" with lines pal
The results are shown in Figs. 13.4 and 13.5. To save them as a PDF file, type as follows:
Figure 13.3: Example 13.1
gnuplot> set term pdf
gnuplot> set output "prog2.pdf"
gnuplot> replot
Listing 8: Example 13.2
// parameters and functions
func g = 0;            // boundary condition
func f = x*x + 2.0*y;  // right-hand side function
int n = 10;            // division number
// domain
mesh Th = square(n, n);
// finite element space
fespace Vh(Th, P1);
Vh u, v;
// Poisson equation
solve poisson(u, v) =
    int2d(Th)(dx(u)*dx(v) + dy(u)*dy(v))
    - int2d(Th)(f*v)
    + on(1, 2, 3, 4, u = g);
// plot Th
plot(Th, ps = "prog2.eps");
// gnuplot data file
ofstream ff("prog2.data");
for (int i = 0; i < Th.nt; i++) {
    for (int j = 0; j < 3; j++)
        ff << Th[i][j].x << " " << Th[i][j].y << " " << u[][Vh(i,j)] << endl;
    ff << Th[i][0].x << " " << Th[i][0].y << " " << u[][Vh(i,0)] << endl << endl << endl;
}
Figure 13.4: Example 13.2 (n = 10)
Figure 13.5: Example 13.2 (n = 30)
⋆Example 13.3. In a "complex-shaped" domain as in Figs. 13.6 and 13.7, we solve the Poisson equation
$-\Delta u = x^2 y$ in Ω, $\quad u = 0$ on ∂Ω.
Listing 9 is a FreeFem++ code to solve this problem, and Figs. 13.6 and 13.7 show the outputs.
Listing 9: Example 13.3
// parameters
func g = 0;      // boundary value
func f = x*x*y;  // right-hand side function
int n = 30;      // division number
// domain
Figure 13.6: Example 13.3 (n = 10)
border G1(t = 0, 3) { x = t; y = 0; }
border G2(t = 0, pi/2) { x = 3*cos(t); y = 3*sin(t); }
border G3(t = 0, 3) { x = 0; y = 3 - t; }
border G4(t = 0, 2*pi) { x = 1.9 - 0.8*cos(t); y = 0.9 + 0.8*sin(t); }
border G5(t = 0, 2*pi) { x = 0.7 - 0.5*cos(t); y = 2.3 + 0.5*sin(t); }
// triangulation
mesh Th = buildmesh(G1(n) + G2(2*n) + G3(n) + G4(2*n) + G5(n));
// finite element space
fespace Vh(Th, P1);
Vh u, v;
// Poisson equation
solve poisson(u, v)
    = int2d(Th)(dx(u)*dx(v) + dy(u)*dy(v))
    - int2d(Th)(f*v)
    + on(G1, G2, G3, G4, G5, u = g);
// plot Th
plot(Th, ps = "prog3m.eps");
// gnuplot data file
ofstream ff("prog3.data");
for (int i = 0; i < Th.nt; i++) {
    for (int j = 0; j < 3; j++)
        ff << Th[i][j].x << " " << Th[i][j].y << " " << u[][Vh(i,j)] << endl;
    ff << Th[i][0].x << " " << Th[i][0].y << " " << u[][Vh(i,0)] << endl << endl << endl;
}
⋆Example 13.4. In the oval-shaped domain $\Omega = \{x^2/(1.5)^2 + y^2 < 1\}$, we solve the Poisson equation
$-\Delta u = 1$ in Ω, $\quad u = x^2 + y^2$ on ∂Ω.
Listing 10 is a FreeFem++ code to solve this problem, and Figs. 13.8 and 13.9 show the outputs.
Figure 13.7: Example 13.3 (n = 30)
Listing 10: Example 13.4
// parameters
func g = x*x + y*y;  // boundary value
func f = 1.0;        // right-hand side function
int n = 60;          // division number
// domain
border G1(t = 0, 2*pi) { x = 1.5*cos(t); y = sin(t); }
// triangulation
mesh Th = buildmesh(G1(n));
// finite element space
fespace Vh(Th, P1);
Vh u, v;
// Poisson equation
solve poisson(u, v)
    = int2d(Th)(dx(u)*dx(v) + dy(u)*dy(v))
    - int2d(Th)(f*v)
    + on(G1, u = g);
// plot Th
plot(Th, ps = "prog4m.eps");
// gnuplot data file
ofstream ff("prog4.data");
for (int i = 0; i < Th.nt; i++) {
    for (int j = 0; j < 3; j++)
        ff << Th[i][j].x << " " << Th[i][j].y << " " << u[][Vh(i,j)] << endl;
    ff << Th[i][0].x << " " << Th[i][0].y << " " << u[][Vh(i,0)] << endl << endl << endl;
}
13.2 Convergence rates: regular solutions
⋆Example 13.5. We examine the rate of convergence of FEM. To this end, we take
$-\Delta u = 2\pi^2\sin(\pi x)\sin(\pi y)$ in Ω, $\quad u = 0$ on Γ
Figure 13.8: Example 13.4 (n = 30)
Figure 13.9: Example 13.4 (n = 60)
in the unit square $\Omega = (0,1)\times(0,1)$. The exact solution is given as
$u(x,y) = \sin(\pi x)\sin(\pi y).$
We take the uniform meshes $\mathcal T_h$ illustrated in Fig. 13.10. Moreover, we introduce an extra-fine mesh $\mathcal T_{h'}$ compared with $\mathcal T_h$ and set $\bar u = \Pi_{h'}u$, where $\Pi_{h'}$ denotes the Lagrange interpolation operator associated with $\mathcal T_{h'}$. Notice that
$\|\nabla u - \nabla u_h\|_{L^2(\Omega)} \le \|\nabla u - \nabla\bar u\|_{L^2(\Omega)} + \|\nabla\bar u - \nabla u_h\|_{L^2(\Omega)}$
and that $\|\nabla u - \nabla\bar u\|_{L^2(\Omega)}$ is expected to be much smaller than $\|\nabla\bar u - \nabla u_h\|_{L^2(\Omega)}$. Hence, we observe
$e_h = \|\nabla\bar u - \nabla u_h\|_{L^2(\Omega)}\quad\text{and}\quad E_h = \|\bar u - u_h\|_{L^2(\Omega)}$
instead of $\|\nabla u - \nabla u_h\|_{L^2(\Omega)}$ and $\|u - u_h\|_{L^2(\Omega)}$. Moreover, we observe
$\rho_h = \frac{\log e_{2h} - \log e_h}{\log(2h) - \log h}\quad\text{and}\quad R_h = \frac{\log E_{2h} - \log E_h}{\log(2h) - \log h}.$
Figure 13.10: Example 13.5
Results are reported in Fig. 13.11 and Tab. 13.1. We observe from these results that the errors behave as
$e_h \approx C_1 h,\qquad E_h \approx C_2 h^2 \quad (C_1, C_2\ \text{some constants}),$
consistent with the theoretical results (Theorems 12.2 and 12.3). Listing 11 is the FreeFem++ code used for this calculation.
Listing 11: Example 13.5
// error1.edp
func exact = sin(pi*x)*sin(pi*y);  // exact solution
func f = 2.0*pi*pi*exact;          // right-hand side function
func g = exact;                    // Dirichlet boundary condition
real hsize, hold;                  // mesh size
real errh1, errh1old, errl2, errl2old, rateh1, ratel2;
int n, nn;
// fine triangulation
nn = 256;
mesh Th2 = square(nn, nn);
fespace Vh2(Th2, P1);
Vh2 uproj, w, uex;
uex = exact;
// output file
ofstream f1("error1.dat");
// n = 4, 8, 16, 32, 64
n = 2;
errh1old = errl2old = 1.0;
hold = 1.0;
for (int i = 1; i < 6; i++) {
    n = 2*n;
    mesh Th = square(n, n);
    plot(Th, ps = "error1.eps");
    fespace Vh(Th, P1);
    Vh u, v, hh = hTriangle;
    hsize = hh[].max;
    solve Poisson(u, v) =
        int2d(Th)(dx(u)*dx(v) + dy(u)*dy(v)) - int2d(Th)(f*v)
        + on(1, 2, 3, 4, u = g);
    // computation of errors using the fine triangulation
    uproj = u;        // projection of u onto the fine triangulation
    w = uproj - uex;  // error function
    errh1 = sqrt(int2d(Th2)(dx(w)*dx(w) + dy(w)*dy(w)));  // H1 error
    errl2 = sqrt(int2d(Th2)(w^2));                        // L2 error
    // computation of rates
    rateh1 = (log(errh1) - log(errh1old))/(log(hsize) - log(hold));
Figure 13.11: Behavior of errors (Example 13.5)
h          e_h        ρ_h     E_h        R_h
0.353553   0.838452   —       0.079064   —
0.176777   0.431591   0.96    0.021119   1.90
0.088388   0.217113   0.99    0.005364   1.98
0.044194   0.108122   1.01    0.001337   2.00
0.022097   0.052783   1.03    0.000325   2.04

Table 13.1: $e_h$, $\rho_h$, $E_h$, $R_h$ for Example 13.5.
    ratel2 = (log(errl2) - log(errl2old))/(log(hsize) - log(hold));
    errh1old = errh1;
    errl2old = errl2;
    hold = hsize;
    // output results
    f1 << hsize << " " << errh1 << " " << rateh1 << " "
       << errl2 << " " << ratel2 << endl;
}
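The rate columns of Tab. 13.1 can be reproduced directly from the tabulated errors using the formulas for $\rho_h$ and $R_h$ above:

```python
import math

# (h, e_h, E_h) rows copied from Tab. 13.1
rows = [(0.353553, 0.838452, 0.079064),
        (0.176777, 0.431591, 0.021119),
        (0.088388, 0.217113, 0.005364),
        (0.044194, 0.108122, 0.001337),
        (0.022097, 0.052783, 0.000325)]

for (h2, e2, E2), (h1, e1, E1) in zip(rows, rows[1:]):
    rho = (math.log(e2) - math.log(e1)) / (math.log(h2) - math.log(h1))
    R = (math.log(E2) - math.log(E1)) / (math.log(h2) - math.log(h1))
    print(f"h = {h1:.6f}: rho_h = {rho:.2f}, R_h = {R:.2f}")
# rho_h ~ 1 (H1 error ~ h) and R_h ~ 2 (L2 error ~ h^2)
```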
13.3 Convergence rates: singular solutions

⋆Example 13.6. Employing the polar coordinates $(r,\theta)$, set
$\Omega = \{(r,\theta) \mid 0 < r < 1,\ 0 < \theta < \omega\},$
where $\omega = k\pi$, $0 < k < 2$, $k\neq 1$. In Ω, we consider
$-\Delta u = f \overset{\text{def.}}{=} \frac{4}{R^2}\left(1 + \frac{\pi}{\omega}\right)w(r,\theta)\ \text{in } \Omega,\qquad u = 0\ \text{on } \Gamma,$
where $w(r,\theta) = r^{\pi/\omega}\sin\left(\frac{\pi}{\omega}\theta\right)$. The exact solution is given as
$u(x,y) = \phi(r)\,w(r,\theta),\qquad \phi(r) = 1 - \left(\frac{r}{R}\right)^2.$
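That this $u$ indeed satisfies $-\Delta u = f$ can be checked directly: $w$ is harmonic, $\phi$ is radial with $\phi'(r) = -2r/R^2$ and $\phi''(r) = -2/R^2$, and $\partial_r w = (\pi/\omega)\,w/r$, so

```latex
-\Delta u
 = -\Bigl(\underbrace{\phi\,\Delta w}_{=0}
        + 2\phi'(r)\,\partial_r w
        + w\bigl(\phi'' + \tfrac{\phi'}{r}\bigr)\Bigr)
 = \frac{4}{R^2}\,\frac{\pi}{\omega}\,w + \frac{4}{R^2}\,w
 = \frac{4}{R^2}\Bigl(1 + \frac{\pi}{\omega}\Bigr)w
 = f.
```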
Numerical solutions $u_h$ for $\omega = 0.7\pi$, $1.5\pi$ and $1.8\pi$ are displayed in Figs. 13.12–13.14. We actually have $u\in H_0^1(\Omega)$. The function $\phi(r)$ is sufficiently regular, and the function $w$ is "singular" in the sense that
$\|w\|_{L^2(\Omega)}^2 < \infty,\qquad \|\nabla w\|_{L^2(\Omega)}^2 < \infty,\qquad |w|_{H^2(\Omega)}^2 \begin{cases} < \infty & (0 < \omega < \pi),\\ = \infty & (\pi < \omega < 2\pi).\end{cases}$
This implies that
$0 < \omega < \pi \ \Rightarrow\ u\in H^2(\Omega),\qquad \pi < \omega < 2\pi \ \Rightarrow\ u\notin H^2(\Omega).$
Therefore, the theoretical results (Theorems 12.2 and 12.3) cannot be applied to the cases $\omega = 1.5\pi$ and $\omega = 1.8\pi$. Convergence rates are reported in Fig. 13.15. From Fig. 13.15, we conjecture that, when $\pi < \omega < 2\pi$, there exist $s = s(\omega),\ s' = s'(\omega)\in(0,1)$ such that
$\|\nabla u - \nabla u_h\|_{L^2(\Omega)} \le C_s(u)h^s,\qquad \|u - u_h\|_{L^2(\Omega)} \le C_{s'}(u)h^{1+s'},$
where $C_s(u)$ and $C_{s'}(u)$ are positive constants depending on $u$.
Figure 13.12: Example 13.6 for ω = 0.7π

Figure 13.13: Example 13.6 for ω = 1.5π
Figure 13.14: Example 13.6 for ω = 1.8π
Figure 13.15: Behavior of errors (Example 13.6). [Log–log plot of the H1 and L2 errors against h for ω = 0.7π, 1.5π, 1.8π.]
Problems and further readings for Chapter II
Problems
Problem 9. Prove Lemma 11.3.
Problem 10. Under the same notation as in §8, prove the following equalities:

    ∑_{i=1}^{3} (x − ai)λi = 0,       ∑_{i=1}^{3} (y − bi)λi = 0,

    ∑_{i=1}^{3} (x − ai) ∂λi/∂x = −1,  ∑_{i=1}^{3} (y − bi) ∂λi/∂y = −1,

    ∑_{i=1}^{3} (x − ai) ∂λi/∂y = 0,   ∑_{i=1}^{3} (y − bi) ∂λi/∂x = 0.
Problem 11. Prove that Πhu ∈ Vh for u ∈ H²(Ω) ∩ H¹₀(Ω), where Πh : C(Ω) → Xh denotes the Lagrange interpolation operator defined in (11.5) and Vh is defined in (8.6).
Problem 12. Prove (8.8) and (8.9).
Problem 13 (machine assignment). Make FEM meshes for the domains illustrated in Fig. 13.16. Moreover, in those Ω, solve numerically the following PDEs and plot the shapes of the solutions with gnuplot.
    −∆u = 1 in Ω, u = 0 on Γ,

    −∂₁²u − 10⁻² ∂₂²u = 1 in Ω, u = 0 on Γ.
Figure 13.16: Domains for Problem 13
Further readings
There are many excellent textbooks devoted to the mathematical theory of the FEM. For example, I recommend the following to students:
[19] F. Kikuchi: Mathematics of the Finite Element Method (Mathematical Foundations and Error Analysis) (in Japanese), Baifukan, 1994.
[24] S. Larsson and V. Thomee: Partial Differential Equations with Numerical Methods, Springer, 2009.
[22] F. Kikuchi and N. Saito: Principles of Numerical Analysis: Toward Understanding Phenomena (Iwanami Series in Mathematics) (in Japanese), Iwanami Shoten, 2016.
[38] M. Tabata: Numerical Analysis of Partial Differential Equations (in Japanese), Iwanami Shoten, 2010.
For researchers, the following books might be valuable:
[6] S. C. Brenner and L. R. Scott: The Mathematical Theory of Finite Element Methods (3rd ed.), Springer, 2007.
[20] P. Knabner and L. Angermann: Numerical Methods for Elliptic and Parabolic Partial Differential Equations, Springer, 2003.
[34] P. A. Raviart and J. M. Thomas: Introduction a l'Analyse Numerique des Equations aux Derivees Partielles, Masson, Paris, 1983.
My favorite one is [34]; but, unfortunately, it is written in French.
Another important topic in the FEM is the discrete maximum principle. For example, we refer to [20].
The FEM for parabolic PDEs is explained, for example, in [20], [24], [34], [38] and
[13] H. Fujita, N. Saito and T. Suzuki: Operator Theory and Numerical Methods, Elsevier, 2001.
[40] V. Thomee: Galerkin Finite Element Methods for Parabolic Problems (2nd ed.), Springer, 2006.
In particular, [13] is based on the theory of analytic semigroups.
The implementation of the FEM is not an easy task, but I recommend that readers try it by following
[18] F. Kikuchi: An Outline of the Finite Element Method (Foundations and Applications in Science and Engineering) (in Japanese), Saiensu-sha, 1980.
The following article on variational methods by Professor Kato is worth reading:
[17] T. Kato: Calculus of Variations, in: K. Terazawa (ed.), Introduction to Mathematics for Natural Scientists [Applied Part] (in Japanese), Iwanami Shoten, 1960.
III. Abstract elliptic PDE and Galerkin method
14 Theory of Lax and Milgram
In Chapter II, we studied only the Poisson equation. However, the theory can be applied to more general PDEs of elliptic type. We introduce the following notions.
• Let V be a (real) Hilbert space with the scalar product (·, ·)V and the norm ∥·∥V.
• The space V′ denotes the dual space of V (= the set of all bounded linear functionals on V). Thus,

    F ∈ V′ ⟺ F : V → R satisfies
        F(v + w) = F(v) + F(w) (v, w ∈ V),
        F(αv) = αF(v) (v ∈ V, α ∈ R),
        ∥F∥V′ ≡ sup_{v∈V, v≠0} F(v)/∥v∥V = sup_{v∈V, v≠0} |F(v)|/∥v∥V < ∞.

Hereinafter, we write ⟨F, v⟩ = F(v) (v ∈ V); ⟨·, ·⟩ = ⟨·, ·⟩V′,V denotes the duality pairing between V′ and V.
• a : V × V → R is a bilinear form on V × V ⟺

    a(αu + βv, w) = α a(u, w) + β a(v, w),
    a(u, αv + βw) = α a(u, v) + β a(u, w)   (u, v, w ∈ V, α, β ∈ R).
• A bilinear form a : V × V → R is bounded (or continuous) ⟺

    ∥a∥ ≡ sup_{u,v∈V, u,v≠0} a(u, v)/(∥u∥V ∥v∥V) < ∞.
Theorem 14.1 (Lax–Milgram).
Suppose that a bounded bilinear form a : V × V → R satisfies the following condition:

    [Coercivity] ∃α > 0 s.t. a(v, v) ≥ α∥v∥²V (v ∈ V). (14.1)

Then, for any F ∈ V′, there exists a unique u ∈ V satisfying

    a(u, v) = ⟨F, v⟩ (∀v ∈ V). (14.2)
Remark. The function u appearing in Theorem 14.1 satisfies the a priori estimate

    ∥u∥V ≤ (1/α)∥F∥V′.

In fact, choosing v = u, we have α∥u∥²V ≤ a(u, u) = ⟨F, u⟩ ≤ ∥F∥V′∥u∥V. Thus, ∥u∥V ≤ (1/α)∥F∥V′.
We use the Riesz mapping σ = σV from V′ to V, that is, the bijective operator from V′ to V defined as⁴

    (σF, v)V = ⟨F, v⟩ (∀v ∈ V)

for F ∈ V′. It satisfies

    ∥σ∥V′,V = sup_{F∈V′} ∥σF∥V/∥F∥V′ = 1,   ∥σ⁻¹∥V,V′ = sup_{v∈V} ∥σ⁻¹v∥V′/∥v∥V = 1.

Note that V′ forms a Hilbert space equipped with the norm ∥·∥V′. Its scalar product is defined by

    (F, G)V′ = (σF, σG)V (F, G ∈ V′).
We shall present two different proofs.
Proof of Theorem 14.1, I. For short, set ∥·∥ = ∥·∥V, (·, ·) = (·, ·)V, and σ = σV. In view of Riesz's representation theorem (Lemma 9.8), there exists a linear operator A : V → V such that⁵

    (Au, v) = a(u, v) (u, v ∈ V).
In particular,
    (Au, u) ≥ α∥u∥², ∥Au∥ ≤ ∥a∥·∥u∥ (u ∈ V).

Since F is expressed as ⟨F, v⟩ = (σF, v)V (∀v ∈ V), the equation (14.2) is equivalently written as

    (Au, v) = (σF, v) (∀v ∈ V) ⇔ Au − σF = 0 in V. (14.3)
We will show that this equation admits a unique solution by the contraction mapping principle⁶. To this end, we introduce E : V → V and B : V → V by

    Eu = u − ρ(Au − σF) = Bu + ρσF   (Bu = u − ρAu)
⁴For an arbitrary F ∈ V′, there exists a unique w ∈ V satisfying ⟨F, v⟩ = (w, v)V (v ∈ V). This correspondence is denoted by σ : F ↦ w. Obviously, σ is a linear operator from V′ to V. Combining ∥F∥V′ = sup_{v∈V} |⟨F, v⟩|/∥v∥V ≤ ∥w∥V = ∥σF∥V and ∥F∥V′ ≥ |⟨F, w⟩|/∥w∥V = ∥w∥V = ∥σF∥V, we have ∥F∥V′ = ∥σF∥V. Hence, we obtain ∥σ∥V′,V = 1 (isometry). On the other hand, since, for any w ∈ V, we have (w, ·)V ∈ V′, σ is surjective. Moreover, for F ∈ V′ with σF = 0, we have F = 0 by the isometry. Thus, the operator σ is bijective.
⁵Fix u ∈ V and consider φu(v) = a(u, v) (v ∈ V). φu is a linear functional on V. Since |φu(v)| ≤ (∥a∥·∥u∥)·∥v∥, φu is bounded. By virtue of Riesz (Lemma 9.8), there exists a unique w ∈ V satisfying (w, v)V = φu(v) = a(u, v) (∀v ∈ V). The correspondence u ↦ w is denoted by w = Au. Then, the operator A is linear on V. Choosing v = w = Au, we have ∥Au∥² = a(u, Au) ≤ ∥a∥·∥Au∥·∥u∥. Hence, ∥A∥ = sup_{u∈V} ∥Au∥/∥u∥ ≤ ∥a∥. This implies that A : V → V is a bounded linear operator.
⁶Suppose that T is a contraction operator on a Hilbert space H; that is, T satisfies ∥Tu − Tv∥H ≤ λ∥u − v∥H (u, v ∈ H) with some 0 < λ < 1. Then, there exists a unique u ∈ H such that u = Tu. Such a u ∈ H is called a fixed point of T.
with a constant ρ. Then,

    ∥Bu∥² = (Bu, Bu) = ∥u∥² − 2ρ(Au, u) + ρ²∥Au∥²
          ≤ ∥u∥² − 2ρα∥u∥² + ρ²∥a∥²∥u∥²
          = (1 − 2ρα + ρ²∥a∥²)∥u∥² =: k∥u∥² (u ∈ V).

Now we take ρ such that 0 < ρ < 2α/∥a∥²; then, 0 < k < 1 and ∥Bu∥ ≤ √k ∥u∥ (u ∈ V). Therefore,
∥Eu− Ev∥ = ∥B(u− v)∥ ≤√k ∥u− v∥ (u, v ∈ V ).
This implies that E is a contraction on V. So, we can apply the contraction mapping principle to obtain a unique fixed point u ∈ V satisfying
u = Eu ⇔ u = u− ρ(Au− σF ).
The function u is a unique solution of the operator equation Au−σF = 0.
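In finite dimensions this proof can be watched in action: for a symmetric positive definite matrix A, the form a(u, v) = (Au, v) is bounded and coercive, and Eu = u − ρ(Au − f) is a contraction whenever 0 < ρ < 2α/∥a∥². Iterating E then converges to the solution of Au = f. A minimal sketch in Python; the matrix, right-hand side, and ρ are arbitrary illustrations chosen to satisfy the constraint on ρ:

```python
# Fixed-point iteration u <- u - rho*(A u - f), as in the first proof
# of the Lax-Milgram theorem, for a 2x2 symmetric positive definite A.
A = [[2.0, 1.0],
     [1.0, 3.0]]
f = [1.0, 2.0]

def matvec(A, u):
    return [sum(A[i][j] * u[j] for j in range(2)) for i in range(2)]

# Eigenvalues of A are (5 -+ sqrt(5))/2, so alpha ~ 1.38, ||A|| ~ 3.62,
# and any 0 < rho < 2*alpha/||A||^2 ~ 0.21 makes E a contraction.
rho = 0.2
u = [0.0, 0.0]
for _ in range(500):
    Au = matvec(A, u)
    u = [u[i] - rho * (Au[i] - f[i]) for i in range(2)]

print(u)  # converges to the solution (0.2, 0.6) of A u = f
```

With this ρ the contraction factor √k is about 0.72, so 500 iterations reduce the initial error to machine precision.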
Proof of Theorem 14.1, II. We prove that the linear operator A defined above is bijective on V. First, since

    (Av, v) ≥ α∥v∥²V (v ∈ V),

we have α∥v∥ ≤ ∥Av∥, so the operator A is injective. Next, R(A), the range of A, is a closed set in V. Indeed, let {wn} ⊂ R(A) and wn → w in V with some w ∈ V. We can take {vn} ⊂ V such that Avn = wn. Then, since

    α∥vn − vm∥ ≤ ∥Avn − Avm∥ = ∥wn − wm∥,

we see that {vn} is a Cauchy sequence in V. Hence, there is v ∈ V such that vn → v in V, and we have Av = w by the continuity of A. This gives that w ∈ R(A) and, consequently, that R(A) is closed. Finally, R(A) is dense in V. To verify this, suppose that v ∈ V satisfies (Au, v) = 0 for all u ∈ V. Taking u = v, we obtain 0 = (Av, v) ≥ α∥v∥²; hence v = 0. This implies that R(A) is dense. As a consequence, we have verified that A is bijective on V. Therefore, for σF ∈ V, there exists a unique u ∈ V such that Au = σF.
Theorem 14.2 (variational principle).
In addition to the assumptions of Theorem 14.1, we assume that a is symmetric, that is,

    a(u, v) = a(v, u) (u, v ∈ V). (14.4)

Then, u ∈ V is a solution of (14.2) if and only if u is a solution of

    J(u) = min_{v∈V} J(v),   J(v) = (1/2) a(v, v) − ⟨F, v⟩. (14.5)
Proof. It is exactly the same as that of Theorem 7.1.
15 Galerkin approximation
We follow the notation of the previous section. Further, we introduce a finite-dimensional subspace Vh of V, h being the discretization parameter such that h ↓ 0. For a given F ∈ V′, we consider the abstract elliptic problem:
Find u ∈ V s.t. a(u, v) = ⟨F, v⟩ (∀v ∈ V ) (15.1)
and its Galerkin approximation:
Find uh ∈ Vh s.t. a(uh, vh) = ⟨F, vh⟩ (∀vh ∈ Vh). (15.2)
Let {ϕi}_{i=1}^{N} be a basis of Vh, N being the dimension of Vh. We write uh as

    uh = ∑_{i=1}^{N} Ui ϕi.

Then, we have the matrix representation of (15.2):

    Au = f,

where we have set

    A = (a(ϕj, ϕi)) ∈ R^{N×N},  u = (Ui) ∈ R^N,  f = (⟨F, ϕi⟩) ∈ R^N.
As a consequence of the Lax–Milgram theorem, problems (15.1) and (15.2) admit unique solutions. Hence, the matrix A is non-singular.
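As a concrete (one-dimensional) illustration of this matrix representation, take V = H¹₀(0, 1), a(u, v) = ∫₀¹ u′v′ dx, ⟨F, v⟩ = ∫₀¹ fv dx with f ≡ 1, and let Vh be the P1 finite element space on a uniform mesh with the hat-function basis {ϕi}. Then A is the familiar tridiagonal stiffness matrix, and for this particular problem the Galerkin solution happens to be nodally exact. A self-contained sketch in Python; the mesh size and the tridiagonal solver are illustrative choices, not part of the abstract theory above:

```python
# Galerkin system A U = f for -u'' = 1 on (0,1), u(0) = u(1) = 0,
# with P1 hat functions on a uniform mesh: A_ij = a(phi_j, phi_i).
N = 9                  # number of interior nodes
h = 1.0 / (N + 1)

# Tridiagonal stiffness matrix: 2/h on the diagonal, -1/h off-diagonal
A = [[0.0] * N for _ in range(N)]
for i in range(N):
    A[i][i] = 2.0 / h
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0 / h
rhs = [h] * N          # <F, phi_i> = integral of 1 * phi_i = h

# Thomas algorithm (Gaussian elimination for tridiagonal systems)
c = [0.0] * N
d = [0.0] * N
c[0] = A[0][1] / A[0][0] if N > 1 else 0.0
d[0] = rhs[0] / A[0][0]
for i in range(1, N):
    m = A[i][i] - A[i][i - 1] * c[i - 1]
    if i < N - 1:
        c[i] = A[i][i + 1] / m
    d[i] = (rhs[i] - A[i][i - 1] * d[i - 1]) / m
U = [0.0] * N
U[-1] = d[-1]
for i in range(N - 2, -1, -1):
    U[i] = d[i] - c[i] * U[i + 1]

# Compare with the exact solution u(x) = x(1-x)/2 at the nodes
x = [(i + 1) * h for i in range(N)]
print([abs(U[i] - x[i] * (1 - x[i]) / 2) for i in range(N)])  # rounding level
```

The final line checks the nodal values against the exact solution of −u″ = 1, u(0) = u(1) = 0.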
Theorem 15.1 (Galerkin orthogonality).
Let u ∈ V and uh ∈ Vh be solutions of (15.1) and (15.2), respectively. Then, we have

    a(u − uh, vh) = 0 (∀vh ∈ Vh).
Theorem 15.2 (Cea's lemma).
Let u ∈ V and uh ∈ Vh be solutions of (15.1) and (15.2), respectively. Then, we have

    ∥u − uh∥V ≤ (∥a∥/α) inf_{vh∈Vh} ∥u − vh∥V. (15.3)
Proof. Let vh ∈ Vh. Then a(u−uh, vh) = 0 by Galerkin orthogonality. Hence,
    α∥u − uh∥²V ≤ a(u − uh, u − uh) = a(u − uh, u − vh) ≤ ∥a∥·∥u − uh∥V ∥u − vh∥V.
Therefore, ∥u − uh∥V ≤ (∥a∥/α)∥u − vh∥V; taking the infimum over vh ∈ Vh yields (15.3).
Remark. Let Ph be the orthogonal projection operator from V onto Vh. Then, for u ∈ V, we have

    ∥u − Phu∥V = min_{vh∈Vh} ∥u − vh∥V.

(This is nothing but the projection theorem.) Hence, we can replace inf by min in Theorem 15.2.
Remark. If we assume, in addition to the assumptions of Theorem 15.2, that a is symmetric, then we have

    ∥u − uh∥V ≤ √(∥a∥/α) min_{vh∈Vh} ∥u − vh∥V.
In fact, we can now define a scalar product of V by
((u, v)) = a(u, v) (u, v ∈ V ). (15.4)
Then, u ∈ V satisfies
((u− uh, wh)) = 0 (∀wh ∈ Vh).
This means that the operator defined as u ↦ uh is the projection operator from V onto Vh with respect to the scalar product (15.4). Hence, by the projection theorem, we obtain

    |||u − uh||| = min_{vh∈Vh} |||u − vh|||

with |||u||| = √((u, u)). Finally, noting α∥u∥²V ≤ |||u|||² ≤ ∥a∥·∥u∥²V, we deduce the desired inequality.
Remark. Let S be a closed subspace of V. If, for u ∈ V, there exists uS ∈ S satisfying ∥u − uS∥V = min_{v∈S} ∥u − v∥V, then uS is called the best approximation of u in S. If the bilinear form a is symmetric, the solution uh of (15.2) is actually the best approximation of the solution u of (15.1) in Vh with respect to ((·, ·)).
16 Applications
16.1 Convection-diffusion equation
Assume that Ω ⊂ R2 is a bounded Lipschitz domain and that its boundaryΓ = ∂Ω consists of two parts Γ1,Γ2 ⊂ Γ such that Γ = Γ1 ∪ Γ2. Since Ω isa Lipschitz domain, the unit outer normal vector to Γ, which is denoted byn = n(s) = (n1(s), n2(s)), is well-defined for almost every s ∈ Γ.
Differential problem. Suppose that we are given
ν > 0, f, c : Ω → R, b : Ω → R2, g1 : Γ1 → R, g2 : Γ2 → R.
The function u = u(x) represents the concentration (or density) of a certain substance. We consider the equations:

    j = −ν∇u + bu in Ω (flux of u),
    ∇·j + cu = f in Ω (conservation law),
    u = g1 on Γ1 (Dirichlet B.C.),
    −j·n = g2 on Γ2 (flux B.C.).
Thus, we consider

    −ν∆u + ∇·(bu) + cu = f in Ω, (16.1a)
    u = g1 on Γ1, (16.1b)
    ν ∂u/∂n − (b·n)u = g2 on Γ2, (16.1c)

where ∂u/∂n = ∇u·n. Multiplying both sides of (16.1a), in the form ∇·j + cu = f, by v ∈ C∞(Ω) with v|Γ1 = 0 and integrating over Ω, we have, by integration by parts,

    ∫Ω (∇·j)v dx + ∫Ω cuv dx = ∫Ω fv dx
    ⇔ ∫Γ (j·n)v dS − ∫Ω j·∇v dx + ∫Ω cuv dx = ∫Ω fv dx
    ⇔ ν∫Ω ∇u·∇v dx − ∫Ω (bu)·∇v dx + ∫Ω cuv dx = ∫Ω fv dx + ∫Γ2 g2 v dS,

where the left-hand side of the last line is a(u, v). Hence, the solution u of (16.1) must satisfy

    a(u, v) = ∫Ω fv dx + ∫Γ2 g2 v dS (∀v ∈ C∞(Ω), v|Γ1 = 0).
Next, we take ḡ1 : Ω → R satisfying ḡ1|Γ1 = g1 and put ū = u − ḡ1. Then, ū|Γ1 = 0 and

    a(ū, v) = ∫Ω fv dx + ∫Γ2 g2 v dS − a(ḡ1, v) =: F(v) = ⟨F, v⟩.
Function spaces and forms.

    V = {v ∈ H1(Ω) | v|Γ1 = 0},  ∥·∥V = ∥·∥1,2 = ∥·∥H1(Ω),
    a(u, v) = ν∫Ω ∇u·∇v dx − ∫Ω (bu)·∇v dx + ∫Ω cuv dx,
    ⟨F, v⟩ = ⟨F, v⟩V′,V = ∫Ω fv dx + ∫Γ2 g2 v dS − a(ḡ1, v),
    ∥v∥ = ∥v∥L2(Ω),  ∥v∥Γk = ∥v∥L2(Γk) (k = 1, 2),
    ∥b∥∞ = sup_{x∈Ω} √(b1(x)² + b2(x)²),  ∥v∥∞ = sup_{x∈Ω} |v(x)|.
Weak formulation.
Find u = ū + ḡ1 ∈ H1(Ω) s.t. ū ∈ V and a(ū, v) = ⟨F, v⟩ (∀v ∈ V). (16.2)
Assumptions.
(A1) b ∈ C1(Ω)², c ∈ C(Ω), f ∈ L2(Ω);

(A2) (1/2)∇·b + c ≥ 0 (x ∈ Ω), b·n ≤ 0 (x ∈ Γ2);

(A3) ḡ1 ∈ H1(Ω), g1 = ḡ1|Γ1 ∈ L2(Γ1), g2 ∈ L2(Γ2).
Continuity of a. For u, v ∈ V,

    |a(u, v)| ≤ ν∫Ω |∇u|·|∇v| dx + ∥b∥∞∫Ω |u|·|∇v| dx + ∥c∥∞∫Ω |u|·|v| dx
             ≤ ν∥∇u∥·∥∇v∥ + ∥b∥∞∥u∥·∥∇v∥ + ∥c∥∞∥u∥·∥v∥
             ≤ max{ν, ∥b∥∞, ∥c∥∞}∥u∥V∥v∥V =: C1∥u∥V∥v∥V.
Continuity of F. According to Lemma 9.4 (trace theorem), for v ∈ H1(Ω), we have η = v|Γ2 ∈ L2(Γ2) and ∥η∥L2(Γ2) ≤ C2∥v∥1,2 with a domain constant C2. Hence, we have

    |⟨F, v⟩| ≤ ∫Ω |fv| dx + ∫Γ2 |g2v| dS + |a(ḡ1, v)|
            ≤ ∥f∥∥v∥ + ∥g2∥Γ2∥v∥Γ2 + C1∥ḡ1∥V∥v∥V
            ≤ (∥f∥ + C2∥g2∥Γ2 + C1∥ḡ1∥V)∥v∥V =: C3∥v∥V.
Coercivity of a. We first note that, for u ∈ V,

    −∫Ω (b·∇u)u dx = −∫Ω ∑_{i=1}^{2} bi (∂u/∂xi) u dx = −∫Ω ∑_{i=1}^{2} bi (1/2) ∂(u²)/∂xi dx
                   = −(1/2)∫∂Ω ∑_{i=1}^{2} bi u² ni dS + (1/2)∫Ω ∑_{i=1}^{2} (∂bi/∂xi) u² dx
                   = −(1/2)∫Γ2 (b·n)u² dS + (1/2)∫Ω (∇·b)u² dx
                   ≥ (1/2)∫Ω (∇·b)u² dx,

where we used u|Γ1 = 0 and (A2). Therefore, for u ∈ V,

    a(u, u) = ν∥∇u∥² − ∫Ω (b·∇u)u dx + ∫Ω cu² dx
            ≥ ν∥∇u∥² + (1/2)∫Ω (∇·b)u² dx + ∫Ω cu² dx
            = ν∥∇u∥² + ∫Ω (c + (1/2)∇·b) u² dx
            ≥ ν∥∇u∥²
            = (ν/2)∥∇u∥² + (ν/2)∥∇u∥²
            ≥ (ν/2)∥∇u∥² + (ν/2)(1/C_P²)∥u∥²
            ≥ C4∥u∥²V,   C4 = (ν/2) min{1, 1/C_P²}.
Therein, we have used the following lemma.
Lemma 16.1 (Poincare's inequality, II).
If the Lebesgue measure of Γ1 is positive, then there exists a domain constant CP such that

    ∥v∥ ≤ CP∥∇v∥ (v ∈ H1(Ω), v|Γ1 = 0).
Proof. It is exactly the same as that of Lemma 9.7.
Well-posedness.
Theorem 16.2.
Under assumptions (A1), (A2) and (A3), there exists a unique solution u of (16.2). Moreover, it satisfies

    ∥u∥1,2 ≤ C5(∥f∥ + ∥ḡ1∥1,2 + ∥g2∥Γ2).
Proof. We can apply the Lax–Milgram theorem to obtain the unique existence of a solution. Substituting v = ū into (16.2), we have

    C4∥ū∥²V ≤ a(ū, ū) = ⟨F, ū⟩ ≤ (∥f∥ + C2∥g2∥Γ2 + C1∥ḡ1∥V)∥ū∥V.

Hence,

    ∥ū∥V ≤ (1/C4)(∥f∥ + C2∥g2∥Γ2 + C1∥ḡ1∥V).

This, together with ∥ū∥V ≥ ∥u∥V − ∥ḡ1∥V, implies the desired inequality.
16.2 Elliptic PDE of the second order
Assume that Ω ⊂ R^N is a bounded smooth domain and that its boundary Γ = ∂Ω consists of two parts Γ1, Γ2 ⊂ Γ such that Γ = Γ1 ∪ Γ2. Since Ω is a smooth domain, the unit outer normal vector to Γ, denoted by n = n(s) = (n1(s), ..., nN(s)), is well-defined for almost every s ∈ Γ.
Elliptic PDE of the second order
    Lu = f in Ω,   u = g1 on Γ1,   ∂u/∂nL = g2 on Γ2, (16.3)

where

    Lv = −∑_{i,j=1}^{N} ∂/∂xj( aij(x) ∂v/∂xi ) + ∑_{i=1}^{N} bi(x) ∂v/∂xi + c(x)v,

    ∂v/∂nL = ∑_{i,j=1}^{N} aij(x) (∂v/∂xi) nj.
Setting A = (aij(x)), b = (bi(x)), the equation Lu = f is expressed as

    −∇·(A∇u) + b·∇u + cu = f.
For v ∈ C∞(Ω) with v|Γ1 = 0,

    ∫Ω (Lu)v dx = −∑_{i,j=1}^{N} ∫Ω ∂/∂xj( aij ∂u/∂xi ) v dx + ∑_{i=1}^{N} ∫Ω bi (∂u/∂xi) v dx + ∫Ω cuv dx
                = ∑_{i,j=1}^{N} ( −∫Γ aij (∂u/∂xi) nj v dS + ∫Ω aij (∂u/∂xi)(∂v/∂xj) dx )
                  + ∑_{i=1}^{N} ∫Ω bi (∂u/∂xi) v dx + ∫Ω cuv dx
                = ∫Ω ( ∑_{i,j=1}^{N} aij (∂u/∂xi)(∂v/∂xj) + ∑_{i=1}^{N} bi (∂u/∂xi) v + cuv ) dx − ∫Γ2 (∂u/∂nL) v dS,

where the volume integral in the last line is a(u, v).
Consequently, the solution u of (16.3) satisfies

    a(u, v) = ∫Ω fv dx + ∫Γ2 g2 v dS

for any v ∈ C∞(Ω) with v|Γ1 = 0. Suppose ḡ1 : Ω → R is such that ḡ1|Γ1 = g1. Put ū = u − ḡ1. Then, we have ū|Γ1 = 0 and

    a(ū, v) = ∫Ω fv dx + ∫Γ2 g2 v dS − a(ḡ1, v) =: F(v) = ⟨F, v⟩.
Function spaces and forms.

    V = {v ∈ H1(Ω) | v|Γ1 = 0},  ∥·∥V = ∥·∥1,2 = ∥·∥H1(Ω),
    a(u, v) = ∫Ω ( ∑_{i,j=1}^{N} aij (∂u/∂xi)(∂v/∂xj) + ∑_{i=1}^{N} bi (∂u/∂xi) v + cuv ) dx,
    ⟨F, v⟩ = ⟨F, v⟩V′,V = ∫Ω fv dx + ∫Γ2 g2 v dS − a(ḡ1, v),
    ∥v∥ = ∥v∥L2(Ω),  ∥v∥Γk = ∥v∥L2(Γk) (k = 1, 2).
Weak formulation.
Find u = ū + ḡ1 ∈ H1(Ω) s.t. ū ∈ V, a(ū, v) = ⟨F, v⟩ (∀v ∈ V). (16.4)
Assumptions.
(B1) aij = aji, bi, c ∈ C(Ω), f ∈ L2(Ω); set

    α = max_{1≤i,j≤N} sup_{x∈Ω} |aij(x)|,  β = max_{1≤i≤N} sup_{x∈Ω} |bi(x)|,
    γ = sup_{x∈Ω} |c(x)|,  γ′ = inf_{x∈Ω} c(x);

(B2) ∃λ0 > 0 s.t. ∑_{1≤i,j≤N} aij(x) ξi ξj ≥ λ0|ξ|² (∀x ∈ Ω, ∀ξ ∈ R^N);

(B3) γ′ ≥ β²/(2λ0);

(B4) ḡ1 ∈ H1(Ω), g1 = ḡ1|Γ1 ∈ L2(Γ1), g2 ∈ L2(Γ2).
Continuity of a. For u, v ∈ V,

    |a(u, v)| ≤ α∥∇u∥·∥∇v∥ + β∥∇u∥·∥v∥ + γ∥u∥·∥v∥ ≤ max{α, β, γ}∥u∥V∥v∥V =: C1∥u∥V∥v∥V.
Continuity of F. For v ∈ V,

    |⟨F, v⟩| ≤ ∫Ω |fv| dx + ∫Γ2 |g2v| dS + |a(ḡ1, v)|
            ≤ ∥f∥∥v∥ + ∥g2∥Γ2∥v∥Γ2 + C1∥ḡ1∥V∥v∥V
            ≤ (∥f∥ + C2∥g2∥Γ2 + C1∥ḡ1∥V)∥v∥V =: C3∥v∥V.
Coercivity of a. For u ∈ V,

    a(u, u) ≥ λ0∥∇u∥² − β∥∇u∥·∥u∥ + γ′∥u∥²
            = (λ0/2)∥∇u∥² + ( √(λ0/2) ∥∇u∥ − (β/√(2λ0)) ∥u∥ )² + ( γ′ − β²/(2λ0) )∥u∥²
            ≥ (λ0/2)∥∇u∥²
            ≥ C4∥u∥²V,

the last step following from Lemma 16.1 as in §16.1, with C4 = (λ0/4) min{1, 1/C_P²}.
Well-posedness
Theorem 16.3.
Under the assumptions (B1)–(B4), there exists a unique solution u of (16.4). Moreover, it satisfies

    ∥u∥1,2 ≤ C5(∥f∥ + ∥ḡ1∥1,2 + ∥g2∥Γ2).
Proof. It is exactly the same as that of the previous theorem.
Problems and further remark for Chapter III
Problems
Problem 14. Let Ω be a bounded domain in R² with smooth boundary Γ. Let f ∈ L2(Ω) and g ∈ L2(Γ). Give a variational formulation of the problem

    −∆u = f in Ω, with ∂u/∂n + u = g on Γ,
where ∆ denotes the Laplacian and ∂/∂n differentiation along the outer unit normal vector n to Γ. Prove existence and uniqueness of a weak solution. Give an interpretation of the boundary condition in connection with some problem in mechanics or physics.
Problem 15. Let Ω be a bounded domain in R² with smooth boundary Γ. Suppose that a bounded smooth domain Ω1 is strictly contained in Ω; that is, we assume that the closure of Ω1 is contained in Ω. Define Ω2 = Ω \ Ω1 and S = ∂Ω1 ∩ ∂Ω2. Moreover, n denotes the unit normal vector to S outgoing from Ω1. Then, we seek the functions
u1 : Ω1 → R, u2 : Ω2 → R
such that
−∆u1 = f in Ω1, −ε∆u2 = f in Ω2, u2 = 0 on Γ
together with the continuity conditions:

    u1 = u2,   ∂u1/∂n = ε ∂u2/∂n on S,
where f ∈ L2(Ω) and 0 < ε < 1 is a constant. This problem is called the interface problem. Introducing the function

    u = { u1 in Ω1,  u2 in Ω2 },

give a variational formulation in H¹₀(Ω) of this interface problem and discuss the unique existence of a weak solution.
Further remark
The Lax–Milgram theorem (Theorem 14.1) was presented in [23, Theorem 2.1]. It is interesting that the main aim of [23] is to solve higher-order parabolic equations by the Hille–Yosida semigroup theory. It is described in [23] that
The following theorem is a mild generalization of the Frechet–RieszTheorem on the representation of bounded linear functionals inHilbert space. [page 168]
Even without the coercivity condition (14.1), we can prove the unique solvability of the variational problem (14.2) under weaker assumptions. Actually, Theorem 14.1 is generalized as follows.
Theorem 16.4.
Let V be a Banach space and let W be a reflexive Banach space. Then, for any continuous bilinear form a : V × W → R, the following (i)–(iii) are equivalent.
(i) For any L ∈ W ′, there exists a unique u ∈ V such that
a(u,w) = ⟨L,w⟩W ′,W (∀w ∈ W ). (16.5)
(ii)

    ∃β > 0, inf_{v∈V} sup_{w∈W} a(v, w)/(∥v∥V ∥w∥W) = β; (16.6a)

    w ∈ W, (∀v ∈ V, a(v, w) = 0) ⟹ (w = 0). (16.6b)
(iii)

    ∃β > 0, inf_{v∈V} sup_{w∈W} a(v, w)/(∥v∥V ∥w∥W) = inf_{w∈W} sup_{v∈V} a(v, w)/(∥v∥V ∥w∥W) = β. (16.7)
Remark. Several remarks must be mentioned here.
1. If W = V, then (14.1) implies (16.7); Theorem 14.1 is a corollary of Theorem 16.4.
2. The value of β in (16.6a) agrees with that of (16.7).
3. Condition (16.6a) is expressed equivalently as

    ∃β > 0, sup_{w∈W} a(v, w)/∥w∥W ≥ β∥v∥V (∀v ∈ V).

Usually, (16.6a) is called the Babuska–Brezzi condition or the inf–sup condition.
4. Condition (16.6b) is expressed equivalently as

    sup_{v∈V} |a(v, w)| > 0 (∀w ∈ W, w ≠ 0).
5. The solution u ∈ V of (16.5) satisfies

    ∥u∥V ≤ (1/β)∥L∥W′

in view of (16.6a).
6. If V and W are finite-dimensional and dimV = dimW , then (16.6a)implies (16.6b). See [8, Proposition 2.21].
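In the finite-dimensional case of remark 6, the inf–sup constant has a concrete linear-algebra meaning: for V = W = R^n with Euclidean norms and a(v, w) = v · (Aw), one has sup_w a(v, w)/∥w∥ = ∥Aᵀv∥, so β is the smallest singular value of A. A minimal sketch in Python; the matrix and the brute-force minimization over unit vectors are illustrations:

```python
# Inf-sup constant of a(v, w) = v . (A w) on R^2 with Euclidean norms.
# sup_{w != 0} (v . A w)/||w|| = ||A^T v||, so
# beta = min over unit v of ||A^T v|| = smallest singular value of A.
import math

A = [[2.0, 0.0],
     [0.0, 0.5]]     # diagonal, so the singular values are 2 and 0.5

def sup_over_w(v):
    ATv = [A[0][0] * v[0] + A[1][0] * v[1],
           A[0][1] * v[0] + A[1][1] * v[1]]
    return math.hypot(*ATv)

beta = min(sup_over_w((math.cos(t), math.sin(t)))
           for t in [k * math.pi / 1800 for k in range(3600)])
print(beta)  # ~ 0.5, the smallest singular value of A
```

In particular, (16.6a) holds with β > 0 exactly when A is invertible, matching remark 6.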
Theorem 16.4 has a long history.
• In 1962, Necas [28, Theoreme 3.1] proved the part "(iii) ⇒ (i)" for the Hilbert case (i.e., the case where both V and W are Hilbert spaces) as a simple generalization of the Lax–Milgram theorem. Necas described that⁷

    Considering complex spaces and elliptic differential operators, the theorem of P. D. Lax and A. Milgram (cf., e.g., L. Nirenberg [31]) appears to be very useful for the variational method; we first point out an easy generalization of it. [page 318]
He also described (see [28, Theoreme 3.2]) that (16.6a) and

    R(A) is dense in W′

imply (i) for the Hilbert case, where A denotes the operator associated with a(·, ·). Later, in 1967, Necas [29, Theoreme 6-3.1] proved that (16.6a) and

    ∃c > 0, sup_{v∈V} a(v, w)/∥v∥V ≥ c∥w∥Z (w ∈ W)

imply (i) for the Hilbert case, where Z denotes a Banach space such that W ⊂ Z (algebraically and topologically). See also [30]. I infer that Necas had noticed the part "(ii) ⇒ (i)".
• In 1971, Babuska [1, Theorem 2.1] stated the part "(iii) ⇒ (i)" for the Hilbert case. Babuska described that⁸
    The proof is adapted from Necas [28] and Nirenberg [31]. We present this proof because we shall use a portion of it for proof of the next theorem. [page 323]
Later, Babuska–Aziz [2, Theorem 5.2.1] stated in 1972 the part “(ii) ⇒(i)” for the Hilbert case. It is described that
    This theorem is a generalization of the well known Lax–Milgram theorem. The theorem might be generalized easily to the case where H1 and H2 are reflexive Banach spaces. The method of proof is an adaptation from [28] and [31] (see also Necas [29], p. 294). [page 116]
• In 1974, Brezzi [4, Corollary 0.1] proved the part "(i) ⇔ (iii)" for the Hilbert case. It is described that
    the results contained in theorem 0.1 and in corollary 0.1 are of classical type and that they might not be new. For instance part I)⇒III) of corollary 0.1 was used by Babuska [1]. [page 132]
⁷In quotations below, we have adapted reference numbers to the list of references of this note.
⁸However, I was unable to find where the proof of the theorem was given in [31].
• In 1989, Rosca [33, Theorem 3] proved the part "(i) ⇔ (ii)" for the Banach case and called it the Babuska–Lax–Milgram theorem⁹.
• In 2002, Ern and Guermond presented the part "(i) ⇔ (ii)" as a theorem of Necas in their monograph [7, §3.2]. Later, they named the part "(i) ⇔ (ii)" the Banach–Necas–Babuska Theorem in an expanded version of [7]; see [8, §2.1]. It is described in [8] that

    The BNB Theorem plays a fundamental role in this book. Although it is by no means standard, we have adopted the terminology "BNB Theorem" because the result as presented in the form below was first stated by Necas in 1962 [28] and popularized by Babuska in 1972 in the context of finite element methods [2, p. 112]. From a functional analysis perspective, this theorem is a rephrasing of two fundamental results by Banach: the Closed Range Theorem and the Open Mapping Theorem. [page 84]
• I could find no explicit reference to the part "(ii) ⇔ (iii)". However, it is known among specialists.
Theorem 16.4 has many important applications.
• Necas originally established the part "(iii) ⇒ (i)" of Theorem 16.4 to deduce the well-posedness (the unique existence of a solution with an a priori estimate) of higher-order elliptic equations in weighted Sobolev spaces.
• Theorem 16.4 plays a crucial role in the theory of the finite element method. Pioneering work was done for error analysis of elliptic problems (see [1], [2]). Moreover, active applications to the mixed finite element method are well known: see [5], [3] and [8] for systematic study.
• Another important application is the well-posedness of parabolic equations (see [8, §6] for example). Although this latter application is apparently less familiar, it is actually useful for studying the discontinuous Galerkin time-stepping method.
⁹In the article "Babuska–Lax–Milgram theorem" in Encyclopedia of Mathematics (http://www.encyclopediaofmath.org/), the part "(i) ⇔ (ii)" of Theorem 16.4 is called the Babuska–Lax–Milgram Theorem. (This article was written by I. Rosca.)
References
[1] I. Babuska: Error-bounds for finite element method, Numer. Math., 16 (1970/1971) 322–333.

[2] I. Babuska and A. K. Aziz: Survey lectures on the mathematical foundations of the finite element method, in: The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations (Proc. Sympos., Univ. Maryland, Baltimore, Md., 1972), pages 1–359, Academic Press, New York, 1972. With the collaboration of G. Fix and R. B. Kellogg.

[3] D. Boffi, F. Brezzi and M. Fortin: Mixed Finite Element Methods and Applications, volume 44 of Springer Series in Computational Mathematics, Springer, Heidelberg, 2013.

[4] F. Brezzi: On the existence, uniqueness and approximation of saddle-point problems arising from Lagrangian multipliers, Rev. Francaise Automat. Informat. Recherche Operationnelle Ser. Rouge, 8(R-2) (1974) 129–151.

[5] F. Brezzi and M. Fortin: Mixed and Hybrid Finite Element Methods, volume 15 of Springer Series in Computational Mathematics, Springer-Verlag, New York, 1991.

[6] S. C. Brenner and L. R. Scott: The Mathematical Theory of Finite Element Methods (3rd ed.), Springer, 2007.

[7] A. Ern and J. L. Guermond: Elements finis: theorie, applications, mise en œuvre, volume 36 of Mathematiques & Applications, Springer-Verlag, Berlin, 2002.

[8] A. Ern and J. L. Guermond: Theory and Practice of Finite Elements, volume 159 of Applied Mathematical Sciences, Springer-Verlag, New York, 2004.

[9] L. C. Evans: Partial Differential Equations (2nd ed.), American Mathematical Society, 2010.

[10] H. Fujita: Approximate solution of initial value problems by the finite difference method (B: approximate methods for differential equations), in: K. Terazawa (ed.), Introduction to Mathematics for Natural Scientists [Applied Part] (in Japanese), Iwanami Shoten, 1960.

[11] H. Fujita: Approximate solution of boundary value problems by the finite difference method (B: approximate methods for differential equations), in: K. Terazawa (ed.), Introduction to Mathematics for Natural Scientists [Applied Part] (in Japanese), Iwanami Shoten, 1960.

[12] H. Fujita, T. Ikebe, T. Inui and H. Takami: Partial Differential Equations Appearing in Mathematical Physics I (in Japanese) [Iwanami Lectures on Fundamental Mathematics, Analysis II-iv], Iwanami Shoten, 1977.

[13] H. Fujita, N. Saito and T. Suzuki: Operator Theory and Numerical Methods, Elsevier, 2001.

[14] F. Hecht, O. Pironneau, A. Le Hyaric and K. Ohtsuka: freefem++, http://www.freefem.org/ff++/index.htm, Laboratoire Jacques-Louis Lions (LJLL), University of Paris VI. (See also freefem++-cs, http://www.ann.jussieu.fr/~lehyaric/ffcs/index.htm)

[15] Y. Kametaka: On the nonlinear diffusion equation of Kolmogorov–Petrovskii–Piskunov type, Osaka J. Math., 13 (1976) 11–66.

[16] Y. Kametaka: Nonlinear Partial Differential Equations (in Japanese), Sangyo Tosho, 1977.

[17] T. Kato: Calculus of Variations, in: K. Terazawa (ed.), Introduction to Mathematics for Natural Scientists [Applied Part] (in Japanese), Iwanami Shoten, 1960.

[18] F. Kikuchi: An Outline of the Finite Element Method (Foundations and Applications in Science and Engineering) (in Japanese), Saiensu-sha, 1980.

[19] F. Kikuchi: Mathematics of the Finite Element Method (Mathematical Foundations and Error Analysis) (in Japanese), Baifukan, 1994.

[20] P. Knabner and L. Angermann: Numerical Methods for Elliptic and Parabolic Partial Differential Equations, Springer, 2003.

[21] J. F. B. M. Kraaijevanger: Maximum norm contractivity of discretization schemes for the heat equation, Appl. Numer. Math., 9 (1992) 475–492.

[22] F. Kikuchi and N. Saito: Principles of Numerical Analysis: Toward Understanding Phenomena (Iwanami Series in Mathematics) (in Japanese), Iwanami Shoten, 2016.

[23] P. D. Lax and A. N. Milgram: Parabolic equations, in: Contributions to the Theory of Partial Differential Equations, Annals of Mathematics Studies, no. 33, pages 167–190, Princeton University Press, Princeton, N. J., 1954.

[24] S. Larsson and V. Thomee: Partial Differential Equations with Numerical Methods, Springer, 2009.

[25] K. W. Morton and D. F. Mayers: Numerical Solution of Partial Differential Equations (2nd ed.), Cambridge University Press, 2005.

[26] M. Mimura (ed.): Pattern Formation and Dynamics (Mathematics of Nonlinear and Nonequilibrium Phenomena 4) (in Japanese), University of Tokyo Press, 2006.

[27] M. Mimura: Differential equations and difference equations: can numerical solutions be trusted?, Chapter 3 of: M. Yamaguchi (ed.), Numerical Analysis and Nonlinear Phenomena (in Japanese), Nippon Hyoron Sha, 1996 (original 1981).

[28] J. Necas: Sur une methode pour resoudre les equations aux derivees partielles du type elliptique, voisine de la variationnelle, Ann. Scuola Norm. Sup. Pisa (3), 16 (1962) 305–326.

[29] J. Necas: Les methodes directes en theorie des equations elliptiques, Masson et Cie, Editeurs, Paris; Academia, Editeurs, Prague, 1967.

[30] J. Necas: Direct Methods in the Theory of Elliptic Equations, Springer Monographs in Mathematics, Springer, Heidelberg, 2012. Translated from the 1967 French original by Gerard Tronel and Alois Kufner; editorial coordination and preface by Sarka Necasova and a contribution by Christian G. Simader.

[31] L. Nirenberg: Remarks on strongly elliptic partial differential equations, Comm. Pure Appl. Math., 8 (1955) 649–675.

[32] R. D. Richtmyer and K. W. Morton: Difference Methods for Initial-Value Problems, Interscience Publishers, 1967.

[33] I. Rosca: On the Babuska–Lax–Milgram theorem, An. Univ. Bucuresti Mat., 38(3) (1989) 61–65.

[34] P. A. Raviart and J. M. Thomas: Introduction a l'Analyse Numerique des Equations aux Derivees Partielles, Masson, Paris, 1983.

[35] N. Saito: Introduction to Numerical Analysis (in Japanese), University of Tokyo Press, 2012.

[36] N. Saito: Finite Difference Methods for Linear and Nonlinear Diffusion Equations and Visualization of Solutions (in Japanese), lecture notes (http://www.infsup.jp/saito/materials/fdm_heat11b.pdf/), 2011.

[37] G. D. Smith: Numerical Solution of Partial Differential Equations, Oxford University Press, 1965. (Japanese translation by Y. Fujikawa, revised edition, Saiensu-sha, 1996.)

[38] M. Tabata: Numerical Analysis of Partial Differential Equations (in Japanese), Iwanami Shoten, 2010.

[39] V. Thomee: Finite difference methods for linear parabolic equations, in: Handbook of Numerical Analysis, Vol. I, 5–196, North-Holland, 1990.

[40] V. Thomee: Galerkin Finite Element Methods for Parabolic Problems (2nd ed.), Springer, 2006.