

Quasi-optimality of adaptive finite element methods for controlling local energy errors

Alan Demlow


Abstract A rich theory demonstrating convergence and optimality of adaptive finite element methods (AFEM) has been developed in recent years. In this work we prove optimality of AFEM which are designed to control local energy errors in elliptic partial differential equations. Because errors propagate globally in FEM, controlling local errors requires controlling both local energy solution properties and global error contributions (pollution errors) which may be measured in a weaker norm such as the $L_2$ norm. We define adaptive methods which control both of these error components and prove that they converge with the best possible rate over all possible refinements of the initial mesh. These results are valid for Poisson's problem on convex polyhedral domains in arbitrary space dimension. Our theory establishes AFEM optimality for several adaptive marking strategies which rigorously control pollution effects. We also present numerical examples that illustrate our theory and confirm that local energy AFEM without pollution control can fail to yield optimal meshes.

Keywords Adaptive finite element methods; convergence of adaptive finite element methods; local error estimates

1 Introduction and Results

The standard adaptive finite element method (AFEM) is an iterative feedback procedure of the form

\[
\text{solve} \rightarrow \text{estimate} \rightarrow \text{mark} \rightarrow \text{refine}. \tag{1.1}
\]

Over the past decade a robust convergence theory for AFEM for controlling global energy errors in linear elliptic problems has been developed; cf. [1], [2], [3], [4], [5], [6], [7], [8] among many others. Such error analysis generally proceeds in two steps.

This work was partially supported by the National Science Foundation under grant DMS-1016094.

Alan Demlow
Department of Mathematics, Texas A&M University, Mail Stop 3368, College Station, TX 77843–3368
E-mail: [email protected]


First, global energy AFEM yield a contraction, that is, a suitably defined error notion decreases by a fixed fraction at each step of the algorithm. Second, AFEM is optimal in that it yields the best possible convergence rate with respect to the number of mesh elements over all systematic refinements of the initial mesh.
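The practical meaning of the contraction property can be illustrated with a short sketch. It assumes a hypothetical contraction factor `alpha` in (0, 1) and a hypothetical initial error bound `e0`, neither of which is specified in the text, and counts how many AFEM iterations a bound of the form error_{k+1} <= alpha * error_k needs in order to guarantee a given tolerance.

```python
def iterations_to_tolerance(e0, alpha, tol):
    """Count steps k until the contraction bound e0 * alpha**k is <= tol.

    e0, alpha and tol are illustrative inputs (assumptions), not values
    taken from the paper.
    """
    if not (0.0 < alpha < 1.0):
        raise ValueError("a contraction factor must lie strictly between 0 and 1")
    if e0 < 0.0 or tol <= 0.0:
        raise ValueError("e0 must be nonnegative and tol must be positive")
    bound, steps = e0, 0
    while bound > tol:
        bound *= alpha  # one AFEM step shrinks the bound by the fixed fraction alpha
        steps += 1
    return steps
```

With `alpha = 0.5`, reducing a bound of 1 to below 10^-3 takes 10 steps. Contraction alone says nothing about how many elements each step adds, which is why the optimality statement is a separate result.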

In many applications it is desirable to control a norm or functional (quantity of interest) of the solution other than the global energy norm. Refinement that is optimal for controlling one measure of the error is generally not optimal for other quantities of interest, so it is standard practice to tailor AFEM to control the error in the desired output. We focus on error control in norms; cf. [9] for optimality results for AFEM for computing linear functionals of the solution. The standard convergence framework for global energy norms relies heavily on simple consequences of the algebraic structure of FEM. These include Galerkin orthogonality properties in the energy inner product and best-approximation results arising from the fact that the FEM is an energy projection onto the discrete solution space. Analogues of these basic tools are challenging to establish in the context of nonstandard (non-global energy) norms. Thus even though AFEM for controlling nonstandard norms are not hard to define and can be observed to perform well in practice, proving that they (optimally) converge is a nontrivial extension of the global energy theory. Convergence of AFEM for controlling “weak” norms dominated by the global energy norm was proved in [10], but no convergence rates were provided. In [11] we proved that an AFEM for controlling a suitably defined local energy error notion yields a contraction, while in [12] the author and R. Stevenson proved optimality of a slightly nonstandard FEM for controlling $L_2$ norms of the error. These are to our knowledge the only works establishing convergence and optimality of AFEM for controlling nonstandard norms. In this work we establish optimality of AFEM for controlling local energy norms of the finite element error. Our analysis employs the technical framework developed in the closely related papers [11], [12] along with some nontrivial extensions of those works.

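In local error analysis the goal is to control the finite element error on some subdomain $D \subset \Omega$. Consider the elliptic model problem: Find $u$ such that

\[
\begin{aligned}
-\Delta u &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \partial\Omega,
\end{aligned}
\]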

where $\Omega \subset \mathbb{R}^n$, $n \ge 2$, is a convex polyhedral domain. Let $u_h \in S_h$ be a finite element approximation to $u$, where $S_h$ is a space of Lagrange finite element functions. Information propagates globally in elliptic PDE and thus in finite element methods, so the quality of the finite element approximation on $D$ is not solely determined by the ability of the mesh to resolve $u$ on $D$. Local error behavior has long been a topic of study in analysis of finite element methods. In [13] Nitsche and Schatz proved the first local error estimates quantifying effects of global solution quality on local solution properties. Given $d > 0$, let $D_d = \{x \in \Omega : \operatorname{dist}(x, D) < d\}$. Then

\[
\|\nabla(u - u_h)\|_D \le C\Big( \inf_{\chi \in S_h} \|\nabla(u - \chi)\|_{D_d} + \frac{1}{d}\|u - u_h\|_\Omega \Big). \tag{1.2}
\]

Here $\|v\|_\omega$ is the $L_2$ norm over $\omega$. The term $\|u - u_h\|_\Omega$ in (1.2) quantifies “pollution” effects of global solution properties on local solution quality. These pollution effects are measured in a weaker norm than is the error on the left-hand side of (1.2).


Xu and Zhou in [14] proved an a posteriori analogue to (1.2), in particular

\[
\|\nabla(u - u_h)\|_D \le C\Big( \Big(\sum_{T \cap D_d \neq \emptyset} \eta(T)^2\Big)^{1/2} + \frac{1}{d}\|u - u_h\|_\Omega \Big). \tag{1.3}
\]

Here $\eta(T)$ is the standard energy residual-type indicator. It was noted in [14] that $\|u - u_h\|_\Omega$ can be conveniently bounded a posteriori and controlled adaptively when $\Omega$ is convex. The works [15], [16] contain similar a posteriori estimates, with particular attention paid to adaptive control of the pollution term in (1.2) on both convex and nonconvex polygonal domains in $\mathbb{R}^2$. A posteriori control of $L_2$ errors is technically more difficult on nonconvex domains because insufficient elliptic regularity necessitates explicit inclusion of edge and vertex singularities in the estimates.

The main motivation for the estimate (1.3) in [14] was the development of “two-grid” algorithms. There the local energy error is reduced by employing a global coarse grid which resolves the weaker pollution error along with a grid refined sufficiently on $D_d$ to ensure that the stronger first term in (1.3) or (1.2) is small. Similar ideas have been developed in other contexts including Stokes and eigenvalue problems [17], [18], [19]. Related adaptive parallel algorithms were proposed in [20], [21], [14]. There $\Omega$ is partitioned into subdomains $\Omega_i$, each of which is assigned to a processor. The local energy estimator in (1.3) is adaptively reduced independently on each $\Omega_i$, yielding a grid $\mathcal{T}_i$ and global finite element solution $u_i$ which are designed to resolve $u$ in the energy norm only on $\Omega_i$. Various methods have been proposed for controlling the pollution term in (1.3), ranging from completely rigorous control of the type we propose below [14] to various ad-hoc [20], [21] and duality-based strategies [22], [23] to none at all [24]. If pollution is controlled well, a globally accurate solution can be found by simply “sewing together” the $u_i$'s appropriately. Some variants of the Bank-Holst algorithm [20], [21], [24] involve overlaying the locally produced grids $\mathcal{T}_i$ to form a single fine global mesh $\mathcal{T}$, and a single global solve is carried out on $\mathcal{T}$. In this version completely rigorous pollution control is less important, and the important question is whether the employed local energy AFEM produces grids $\mathcal{T}_i$ that efficiently resolve the global solution $u$. We mainly consider whether local energy AFEM produce good solutions, though we briefly comment on the weaker goal of optimal grids in our numerical examples.

The chief contribution of this paper is to establish that an AFEM (defined below) for controlling a local energy error notion is optimal in the sense that it yields the best possible convergence rate achievable over regular refinements of an initial triangulation. The target quantity controlled by our AFEM is $\|\phi\nabla(u - u_h)\|_{D_d} + \frac{\kappa}{d}\|u - u_h\|_\Omega$. Here $\kappa > 0$ is a weighting parameter, and $\phi$ is a cutoff function which is 1 on $D$ and 0 outside of $D_d$. We include $\|u - u_h\|_\Omega$ in our target quantity because pollution must be controlled in order to reduce the local energy error. In [11] we established the counterpart

\[
\|\phi\nabla(u - u_h)\|_{D_d} \le C\Big[ \Big(\sum_{T \cap D_d \neq \emptyset} \eta_{1,\mathcal{T},\phi}(T)^2\Big)^{1/2} + \frac{1}{d}\|u - u_h\|_\Omega \Big] \tag{1.4}
\]

to (1.3). Here $\eta_{1,\mathcal{T},\phi}$ is a weighted elementwise indicator (defined below) satisfying $\eta_{1,\mathcal{T},\phi}(T) \le \eta(T)$ with equality holding on $D$. In [11] we proved that an AFEM for controlling $\|\nabla[\phi(u - u_h)]\|_{D_d} + \frac{\kappa}{d}\|u - u_h\|_\Omega$ is contractive. As is discussed in that work, considering a weighted error notion instead of the energy error over simple subdomains is a natural modification of the traditional local estimates (1.2) and (1.3) because the analytical techniques used to prove the latter estimates in essence involve bounding $\|\phi\nabla(u - u_h)\|_{D_d}$. However, estimates involving the weighted norm are balanced in the sense that the same $\phi$-weighted energy notion appears on both sides of the inequality in (1.4), whereas traditional local estimates include a stronger energy notion (over an expanded subdomain) on the right-hand side than on the left-hand side. This balance is essential to obtaining adaptive convergence and optimality results. On a practical level, note also that because $\|\nabla(u - u_h)\|_D \le \|\phi\nabla(u - u_h)\|_{D_d}$, (1.4) and associated AFEM provide sharper control of $\|\nabla(u - u_h)\|_D$ than do (1.3) and associated AFEM even if one desires to control only $\|\nabla(u - u_h)\|_D$.
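The comparison between the unweighted error on $D$ and the weighted error on $D_d$ can be checked in one line. The display below is a short verification using only the stated properties of the cutoff function ($\phi \equiv 1$ on $D$) and the inclusion $D \subset D_d$; it is not quoted from [11].

```latex
\[
\|\nabla(u - u_h)\|_D^2
  = \int_D \phi^2 |\nabla(u - u_h)|^2 \, dx
  \le \int_{D_d} \phi^2 |\nabla(u - u_h)|^2 \, dx
  = \|\phi\nabla(u - u_h)\|_{D_d}^2 .
\]
```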

In order for an AFEM for a given error notion to be optimal, it is important that the error notion admit an analogue of Céa's Lemma. Thus a key to extending the contraction result in [11] to obtain AFEM optimality is a new almost-best-approximation result for our local error notion:

\[
\|\phi\nabla(u - u_h)\|_{D_d} + \frac{\kappa}{d}\|u - u_h\|_\Omega
\le C\Big[ \inf_{\chi \in S_h} \|\phi\nabla(u - \chi)\|_{D_d} + \frac{\kappa}{d}\inf_{\psi \in S_h} \|u - \psi\|_\Omega + \operatorname{osc} \Big], \quad \kappa \ge 1. \tag{1.5}
\]

Here osc is a data oscillation term measuring the deviation of $f$ from a piecewise polynomial. There are two essential properties of (1.5). First, the error notions on both sides of (1.5) are the same up to data oscillation, that is, the right-hand side of (1.5) is in turn bounded by the left-hand side up to data oscillation. This property does not hold for the energy-scaled terms in (1.2). Similarly, standard efficiency analysis gives that the energy estimator $\big(\sum_{T \cap D_d \neq \emptyset} \eta(T)^2\big)^{1/2}$ in (1.3) is bounded up to data oscillation by $\|\nabla(u - u_h)\|_{D_d'}$ (where $D_d'$ consists of all elements touching $D_d$) but not by the error quantity $\|\nabla(u - u_h)\|_D$ on the left-hand side of (1.3). AFEM based on (1.3) thus cannot optimally control $\|\nabla(u - u_h)\|_D$ or associated error notions that include pollution terms measured in weaker norms. A second key property of (1.5) is that the best approximations to $u$ in the two norms under consideration may be chosen independently on the right-hand side. This property is essential in establishing optimality of some of the marking strategies considered below and is a novel feature of our estimate.

The AFEM “mark” step in (1.1) chooses a subset of the elements in the grid which in principle lead to optimal error reduction when refined. Our local energy AFEM is driven by two residual-type error estimators, a $\phi$-weighted local energy estimator $\eta_{1,\phi}$ and a global $L_2$ estimator $\eta_0$. We consider three main choices of marking strategy for balancing local energy and pollution contributions to the overall error:

1. An alternating marking strategy, in which marking is based on the local energy estimator $\eta_{1,\phi}$ when it dominates the pollution estimator $\frac{\kappa}{d}\eta_0$, and on $\eta_0$ otherwise.

2. An integrated strategy based on the single estimator $\eta_{1,\phi} + \frac{\kappa}{d}\eta_0$.

3. A pre-controlled pollution strategy in which the pollution estimator is first reduced to a given tolerance, and the energy estimator is then reduced to the same tolerance.
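The first strategy in the list can be sketched in a few lines. The sketch below is illustrative only: it assumes a Dörfler-type bulk criterion with a hypothetical parameter `theta`, which is a common choice in the AFEM literature but is not stated in this section, and the function names `dorfler_mark` and `alternating_mark` are invented for this example. The precise marking rules used in the paper are defined later and may differ.

```python
import numpy as np

def dorfler_mark(indicators, theta):
    """Assumed bulk criterion: return a smallest index set M with
    sum_{T in M} eta(T)^2 >= theta * sum_T eta(T)^2, for 0 < theta <= 1."""
    sq = np.asarray(indicators, dtype=float) ** 2
    order = np.argsort(-sq, kind="stable")       # largest indicators first
    cumulative = np.cumsum(sq[order])
    target = theta * cumulative[-1]
    count = int(np.searchsorted(cumulative, target, side="left")) + 1
    count = min(count, len(sq))                  # guard against rounding
    return set(order[:count].tolist())

def alternating_mark(eta1, eta0, kappa, d, theta):
    """Mark with the energy indicators eta1 when the energy estimator
    dominates (kappa / d) times the L2 estimator, and with eta0 otherwise."""
    energy = np.sqrt(np.sum(np.square(eta1)))
    pollution = (kappa / d) * np.sqrt(np.sum(np.square(eta0)))
    if energy >= pollution:
        return "energy", dorfler_mark(eta1, theta)
    return "pollution", dorfler_mark(eta0, theta)
```

For example, with elementwise values `eta1 = [3, 1, 1]`, `eta0 = [0.1, 0.1, 2.0]`, `kappa = 1`, `d = 0.5` and `theta = 0.5`, the weighted pollution estimator dominates and only the third element is marked.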


We prove that AFEM employing any of these three marking strategies converge with the best possible rate under appropriate assumptions. In our numerical tests we additionally consider local energy AFEM without pollution control, i.e., we mark using $\eta_{1,\phi}$ only. Such AFEM have been used in parallel adaptive algorithms [24]. Our tests and related discussion make clear that pollution must be controlled either before or at the same time as the local energy contribution in order to achieve optimality, and that algorithms which fail to do so can produce both suboptimal grids and suboptimal control of local errors.

We also relate the estimate (1.5) to the literature on a priori error estimates for nonstandard norms. Most such estimates in the literature assume quasi-uniform grids, while standard AFEM produce merely shape-regular grids. Establishment of a priori estimates in nonstandard norms on shape-regular grids is a longstanding problem which for the most part remains open. (1.2) was proved only recently under the assumption of shape regularity [25], and a sharper $\phi$-weighted analogue of (1.2) may also be established under this condition. The estimate (1.5) however also requires an optimal bound $\|u - u_h\|_\Omega \lesssim \inf_{\psi \in S_h} \|u - \psi\|_\Omega + \operatorname{osc}$. The first proof of this result on any class of meshes is contained in [12], which establishes the necessary $L_2$ bound on a class of “mildly graded” meshes defined below. Such an estimate is not available in the literature assuming merely shape-regular meshes. Following [12], we analyze AFEM which produce sequences of mildly graded meshes.

We finally comment on the practicality of the AFEM for which we prove optimality. Standard AFEM optimality results for global energy norms require that an essential user-supplied parameter in the “mark” step be sufficiently small [2]. Optimality of our AFEM requires three user-supplied parameters to satisfy similar threshold conditions: a marking parameter (precisely as in AFEM for global energy errors), a parameter in the “mark” step ensuring that the mesh remains sufficiently mildly graded, and the pollution weighting parameter $\kappa$. The first two of these must be sufficiently small and the third sufficiently large. Whereas in the energy case an upper bound for the marking parameter in terms of interpolation (Poincaré) constants can in principle be derived theoretically, this will be harder in our situation as the corresponding parameters additionally depend on $H^2$ regularity constants. As in the energy case, however, the threshold values do not depend on essential quantities such as the solution $u$ and the current refinement level. Further comments concerning the history and practical impact of the “mild grading” parameter are also given below.

The outline of the paper is as follows. In Section 2 we give a number of definitions and preliminaries. In Section 3 we recall and as necessary prove a number of a priori, a posteriori, and AFEM convergence results. Section 4 contains statements, proofs, and discussion of our main results. Finally, in Section 5 we present several numerical examples illustrating our theoretical development.

2 Preliminaries and assumptions

In this section we collect a number of preliminaries and assumptions.


2.1 Constants

We employ generic constants $C$ and $C_i$, $i = 0, 1, 2, \ldots$, which depend on non-essential quantities including the shape-regularity of the family of meshes employed by our AFEM, the domain $\Omega$ and space dimension $n$, the polynomial degree of the finite element spaces employed, and an $H^2$ regularity constant. $C$ may differ from instance to instance, while $C_i$ denotes a constant that is fixed throughout the paper. In addition, we use the abbreviated notation $a \lesssim b$ to mean that $a \le Cb$; '$\gtrsim$' and '$\simeq$' have an obvious similar meaning.

2.2 Meshes and mesh refinement

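AFEM begins with an initial simplicial decomposition $\mathcal{T}_0$ of $\Omega$ such that all elements of $\mathcal{T}_0$ are shape-regular and conforming. Let $\mathbb{T}$ consist of all conforming meshes that may be generated by newest-node bisection or its generalization to $n \ge 3$ from $\mathcal{T}_0$ [26]. Uniform shape-regularity of $\mathcal{T}_0$ implies the same of $\mathbb{T}$, generally with a constant different from that of $\mathcal{T}_0$ but uniform over the entire class $\mathbb{T}$. We say that $\mathcal{T} \subset \tilde{\mathcal{T}}$ whenever $\mathcal{T}, \tilde{\mathcal{T}} \in \mathbb{T}$ and $\tilde{\mathcal{T}}$ is a refinement of $\mathcal{T}$, and denote by $R_{\mathcal{T} \to \tilde{\mathcal{T}}}$ the set of elements in $\mathcal{T}$ that are bisected in passing from $\mathcal{T}$ to $\tilde{\mathcal{T}}$. If $\omega \subset \Omega$, we define the local mesh with respect to $\omega$ by $\mathcal{T}_\omega = \{T \in \mathcal{T} : T \cap \omega \neq \emptyset\}$.

The standard AFEM generates a nested sequence of meshes $\{\mathcal{T}_i\}_{i \ge 0} \subset \mathbb{T}$. Also, $R_{\mathcal{T}_i \to \mathcal{T}_{i+1}}$ is composed of a marked set $\mathcal{M}_i \subset \mathcal{T}_i$ specified in the AFEM algorithm, plus additional elements refined in order to maintain conformity. An important property of newest-node bisection is that with a suitable numbering of the initial mesh, $R_{\mathcal{T}_i \to \mathcal{T}_{i+1}}$ may be chosen so that [26]

\[
\# R_{\mathcal{T}_i \to \mathcal{T}_{i+1}} \lesssim \# \mathcal{M}_i. \tag{2.1}
\]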

Our main results require as in [12] that the meshes $\mathcal{T}_i$ generated by AFEM be “sufficiently mildly graded”. We first give a brief overview of the definition and technical properties of such meshes here and refer to [12] for more details. Let $h_T = |T|^{1/n}$ for $T \in \mathcal{T}_i$. Given any shape-regular (not necessarily mildly graded) mesh $\mathcal{T}$, there exists a smoothed mesh size function $\tilde{h}_{\mathcal{T}} \in W^1_\infty(\Omega)$ which is uniformly equivalent to the piecewise constant function $h_{\mathcal{T}}$ and which satisfies $\|\nabla \tilde{h}_{\mathcal{T}}\|_{L_\infty(\Omega)} \lesssim 1$. The constant hidden in “$\lesssim$” depends on the space dimension and shape regularity properties of $\mathcal{T}$. If $h_T \simeq h$ for all $T \in \mathcal{T}$, we may choose $\tilde{h}_{\mathcal{T}} \equiv h$ and so $\|\nabla \tilde{h}_{\mathcal{T}}\|_{L_\infty(\Omega)} = 0$. Thus $\|\nabla \tilde{h}_{\mathcal{T}}\|_{L_\infty(\Omega)}$ measures how strongly graded the mesh is. We say that the set $\{\mathcal{T}_i\}$ has grading parameter $\mu$ if for each $i \ge 0$, $\tilde{h}_i$ may be chosen so that $\tilde{h}_i \simeq h_{\mathcal{T}_i}$ with constants independent of $\mu$, and $\|\nabla \tilde{h}_i\|_{L_\infty(\Omega)} \le \mu$. It was proved in [12] that given a mesh $\mathcal{T}_i$ with grading parameter $\mu$ and marked set $\mathcal{M}_i$, it is possible to construct $\mathcal{T}_{i+1}$ also having grading parameter $\mu$ in such a way that (2.1) still holds with the constant hidden in $\lesssim$ now depending on $\mu$. This result was essential in establishing optimality of AFEM for $L_2$ norms in [12] and is essential here also. We denote by $\mathbb{T}_\mu$ the subset of $\mathbb{T}$ consisting of meshes with grading parameter no larger than $\mu$.
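A one-dimensional toy computation may help to make the grading parameter concrete. The sketch below is an assumption-laden illustration, not the construction of [12]: it takes the smoothed mesh size function to be the piecewise linear function whose nodal values are averages of the neighbouring element lengths, and reports the largest absolute slope of that function. The name `grading_parameter_1d` is hypothetical.

```python
import numpy as np

def grading_parameter_1d(nodes):
    """Largest slope of one possible smoothed mesh size function on a 1D mesh.

    Assumption: the smoothed function is piecewise linear, with nodal values
    equal to the average of the adjacent element lengths (one-sided at the
    two end points). This is one simple choice made for illustration.
    """
    nodes = np.asarray(nodes, dtype=float)
    h = np.diff(nodes)                       # element lengths h_T
    h_nodal = np.empty(len(nodes))
    h_nodal[0], h_nodal[-1] = h[0], h[-1]
    h_nodal[1:-1] = 0.5 * (h[:-1] + h[1:])   # average of the two neighbours
    slopes = np.abs(np.diff(h_nodal)) / h    # gradient on each element
    return float(slopes.max())
```

A uniform mesh gives the value 0, while the geometrically graded mesh with nodes 0, 1, 3, 7, 15 (element lengths doubling) gives 0.75, so under this choice it would not count as mildly graded for small $\mu$.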

Our results below require that $\{\mathcal{T}_i\}$ be sufficiently mildly graded in the sense that $\{\mathcal{T}_i\} \subset \mathbb{T}_\mu$ for $\mu$ sufficiently small, depending on the generic constant $C$ in §2.1. The above discussion indicates that mild grading represents a theoretically natural tightening of the shape regularity condition. In addition, because of (2.1) enforcement of mild grading does not compromise rate optimality of the resulting AFEM over $\mathbb{T}$. Finally, it is possible that the meshes generated by AFEM are sufficiently mildly graded for our purposes without any additional enforcement, since all shape-regular meshes satisfy $\|\nabla \tilde{h}_{\mathcal{T}}\|_{L_\infty(\Omega)} \lesssim 1$.

We now discuss the context surrounding “mildly graded” meshes; cf. [27], [12]. Available proofs of optimal adaptive convergence require sharp a priori error estimates such as (1.5) on the class of (shape-regular) meshes generated by AFEM. This is natural since optimality of AFEM over $\mathbb{T}$ is a much stronger result than optimality of the FEM over a single mesh in $\mathbb{T}$. Such a priori estimates are nearly trivial to obtain for global energy norms on any grid, but most a priori error estimates in nonstandard norms in the literature assume quasi-uniform meshes. The original local energy estimates in [13] assume quasi-uniform grids. [28], [14] contain such estimates on classes of grids whose largest and smallest elements satisfy a certain relationship. The resulting class of grids allows for significant mesh grading but is more restricted than shape regularity and does not maintain optimality of AFEM if enforced. [25] contains local energy estimates of the form (1.2) assuming only shape regularity. This is to our knowledge the only work establishing optimal a priori error estimates for nonstandard norms on merely shape-regular grids except in one space dimension [29]. Note however that (1.2) and the corresponding estimates in [25] do not optimally bound the pollution term $\|u - u_h\|_\Omega$, which must also be accomplished in order to obtain the sharper estimate (1.5) needed to prove optimality of our AFEM.

A mild grading condition was to our knowledge first proposed in [30], which contains maximum-norm estimates optimally reflecting mesh grading. Optimal $L_2$ estimates in one space dimension are proved under a mild grading condition in the introductory Chapter 0 of the textbook of Brenner and Scott [31], and an $L_2$ almost-best-approximation result was proved under this condition for arbitrary space dimension in [12]. [27] proves a $W^1_\infty$ stability result assuming a mild grading condition and also contains an extended discussion of related mesh conditions that have been used in FEM error analysis in nonstandard norms. Establishment of optimal a priori estimates in $L_p$ and $W^1_p$ norms on the whole class of shape-regular grids remains an open question.

As observed in [12], on a practical level enforcement of the mild grading condition could greatly inflate the number of refined elements and exacerbate the asymptotic nature of the observed convergence rates. However, we have not explicitly enforced mild grading in our numerical experiments below or elsewhere [12] and have not observed any degradation in convergence rates as a result. We note two possible explanations for this. The first is that the condition may be an artifact of proof, though as noted above a number of unsuccessful attempts have been made to avoid such conditions. The second is that as also noted above, the shape-regular bisection meshes generated by AFEM may already be sufficiently mildly graded.

2.3 Finite element space, interpolants, and superapproximation

Given $\mathcal{T} \in \mathbb{T}$, let $S_{\mathcal{T}} \subset H^1_0(\Omega)$ be a space of continuous Lagrange finite element functions which are piecewise polynomials of some fixed degree $k$ on $\mathcal{T}$. We set $S_i = S_{\mathcal{T}_i}$. Note that $S_i \subset S_{i+1}$, since $\mathcal{T}_{i+1}$ is a refinement of $\mathcal{T}_i$. Let $u_{\mathcal{T}} \in S_{\mathcal{T}}$ satisfy

\[
\int_\Omega \nabla u_{\mathcal{T}} \cdot \nabla v \, dx = \int_\Omega f v \, dx, \quad v \in S_{\mathcal{T}}. \tag{2.2}
\]

As above we employ the abbreviation $u_i = u_{\mathcal{T}_i}$.

We use two standard finite element (quasi-)interpolants. We denote the Lagrange interpolant by $I_L$. The following two properties of $I_L$ are used in our proofs below.

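Lemma 1 Let $T \in \mathcal{T}_i$ for some $i \ge 0$, and let $\phi \in P_j(T)$ for some fixed $j \ge 0$, where $P_j$ denotes the polynomials of degree $j$. Assume also that

\[
|\phi|_{W^j_\infty(T)} \lesssim d^{-j}, \quad 0 \le j \le k + 1, \tag{2.3}
\]

for some $d \ge h_T$. Then for all $\chi \in S_i$, there hold the superapproximation estimates

\[
\begin{aligned}
&\|\phi^2\chi - I_L(\phi^2\chi)\|_{L_2(T)} + h_T \|\phi^2\chi - I_L(\phi^2\chi)\|_{H^1(T)} \\
&\qquad \lesssim h_T^2 \Big( \frac{1}{d^2}\|\chi\|_{L_2(T)} + \frac{1}{d}\|\phi\nabla\chi\|_{L_2(T)} \Big) \lesssim \frac{h_T}{d}\|\chi\|_{L_2(T)} \lesssim \|\chi\|_{L_2(T)},
\end{aligned} \tag{2.4}
\]

\[
\begin{aligned}
&\|\phi\chi - I_L(\phi\chi)\|_{L_2(T)} + h_T \|\phi\chi - I_L(\phi\chi)\|_{H^1(T)} \\
&\qquad \lesssim h_T^2 \Big( \frac{1}{d^2}\|\chi\|_{L_2(T)} + \frac{1}{d}\|\nabla\chi\|_{L_2(T)} \Big) \lesssim \frac{h_T}{d}\|\chi\|_{L_2(T)} \lesssim \|\chi\|_{L_2(T)}.
\end{aligned} \tag{2.5}
\]

In addition, for any $T \in \mathcal{T}_i$,

\[
\|\nabla I_L(\phi\chi)\|_T \lesssim \|\nabla(\phi\chi)\|_T, \qquad \|I_L(\phi\chi)\|_T \lesssim \|\phi\chi\|_T. \tag{2.6}
\]

The constants $C$ above depend on the constant $C_\phi$ in (2.3), the degrees $k$ of the finite element space and $j$ of $\phi$, and the shape regularity of $\mathcal{T}_i$.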

Proof The first inequalities in (2.4) and (2.5) are a minor modification of Theorem 2.1 of [25] and a standard superapproximation result, respectively. The remaining inequalities may be obtained by standard inverse estimates.

We shall also employ the Scott-Zhang interpolant $I_{SZ}$, which is stable in $H^1_0$ (cf. [32]). We do not list further properties of $I_{SZ}$ here, as its application in the establishment of residual-type a posteriori upper bounds is standard.

2.4 Subdomain and cutoff function

Next we describe our target subdomain and cutoff function. Our requirements are identical to those in [11], and as there some loosening of these restrictions is possible but technically more involved. We require that $D$ be the intersection of a box (rectangle if $n = 2$) with $\Omega$. Let $D_d$ be the intersection with $\Omega$ of the box whose sides are parallel to and lie a distance $d$ outside of those of $D$. We assume for simplicity that $d \le 1$. Thus $D \subset D_d$ and $\operatorname{dist}(D, \partial D_d \setminus \partial\Omega) \simeq d$. Let $D' = D_d \setminus D$ denote the “ring” around $D$. We require that $D = \cup_{T \in \mathcal{T}_{0,D}} T$ for some subset $\mathcal{T}_{0,D}$ of $\mathcal{T}_0$, and likewise for $D_d$ and $D'$.

Next we define a cutoff function $\phi$. We require that $\phi \in W^2_\infty(D_d) \cap W^1_\infty(\Omega)$; that $0 < \phi \le 1$ on (the interior of) $D_d$, $\phi = 0$ on $\Omega \setminus D_d$, and $\phi \equiv 1$ on $D$; that $\phi$ is a piecewise polynomial of some fixed degree $j$ with respect to $\mathcal{T}_0$ and thus with respect to all $\mathcal{T} \in \mathbb{T}$; and that (2.3) holds. Note that (2.4) and (2.5) hold for $\phi$ satisfying the above since for $T \subset D'$, $h_T / d \lesssim 1$.


2.5 Error estimators and indicators

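Following [11], we first define a weighted $H^1$ residual indicator

\[
\eta_{1,\mathcal{T},\phi}(T)^2 = h_T^2 \|\phi(f + \Delta u_{\mathcal{T}})\|_{L_2(T)}^2 + h_T \|\phi [\![\nabla u_{\mathcal{T}}]\!]\|_{L_2(\partial T)}^2, \quad T \in \mathcal{T} \in \mathbb{T}.
\]

Here $[\![\cdot]\!]$ is the jump across the element interface. We also use the standard $L_2$ indicator

\[
\eta_{0,\mathcal{T}}(T)^2 = h_T^4 \|f + \Delta u_{\mathcal{T}}\|_T^2 + h_T^3 \|[\![\nabla u_{\mathcal{T}}]\!]\|_{L_2(\partial T)}^2, \quad T \in \mathcal{T} \in \mathbb{T}.
\]

Next we define error estimators. If $\mathcal{S} \subset \mathcal{T}$, let

\[
\eta_{1,\mathcal{T},\phi}(\mathcal{S})^2 = \sum_{T \in \mathcal{S}} \eta_{1,\mathcal{T},\phi}(T)^2, \qquad \eta_{0,\mathcal{T}}(\mathcal{S})^2 = \sum_{T \in \mathcal{S}} \eta_{0,\mathcal{T}}(T)^2.
\]

We also write $\eta_{1,\mathcal{T},\phi} = \eta_{1,\mathcal{T},\phi}(\mathcal{T})$, $\eta_{1,\mathcal{T}_i,\phi} = \eta_{1,i,\phi}$, $\eta_{0,\mathcal{T}} = \eta_{0,\mathcal{T}}(\mathcal{T})$, and $\eta_{0,i} = \eta_{0,\mathcal{T}_i}$. Finally, summing $\eta_{1,\mathcal{T},\phi}$ over $\mathcal{T}_{D_d}$ is equivalent to summing over $\mathcal{T}$, so we write $\eta_{1,\mathcal{T},\phi}$ instead of $\eta_{1,\mathcal{T},\phi}(\mathcal{T}_{D_d})$. We also suppress the domain dependence of the $L_2$ norm when it is taken over all of $\Omega$, i.e., $\|\cdot\|_\Omega = \|\cdot\|$, and $\|\phi\,\cdot\|_{D_d} = \|\phi\,\cdot\|_\Omega = \|\phi\,\cdot\|$.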
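The scalings in the two indicators can be made concrete with a small sketch. It assumes that the elementwise residual norms have already been computed by some finite element code and are passed in as plain arrays; the function names and argument names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def residual_indicators(h, res, jump, res_phi, jump_phi):
    """Elementwise indicators from precomputed residual norms (assumed inputs).

    h        : element sizes h_T
    res      : ||f + Laplace(u_T)|| over each element
    jump     : ||[[grad u_T]]|| over each element boundary
    res_phi, jump_phi : the same two norms weighted by the cutoff phi
    """
    h = np.asarray(h, dtype=float)
    eta1_sq = h**2 * np.asarray(res_phi, dtype=float)**2 \
        + h * np.asarray(jump_phi, dtype=float)**2
    eta0_sq = h**4 * np.asarray(res, dtype=float)**2 \
        + h**3 * np.asarray(jump, dtype=float)**2
    return np.sqrt(eta1_sq), np.sqrt(eta0_sq)

def estimator(indicator_values, subset=None):
    """Square root of the sum of squared indicators over a subset (default: all)."""
    vals = np.asarray(indicator_values, dtype=float)
    if subset is not None:
        vals = vals[list(subset)]
    return float(np.sqrt(np.sum(vals**2)))
```

When the weighted and unweighted residual norms coincide on an element (as happens where $\phi \equiv 1$), the definitions give $\eta_{0,\mathcal{T}}(T) = h_T\,\eta_{1,\mathcal{T},\phi}(T)$, which reflects the extra power of $h_T$ in the $L_2$ scaling.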

2.6 Data oscillation

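Given $T \in \mathcal{T}$, let $f_T$ be the $L_2$ projection of $f|_T$ onto $P_{k-1}(T)$. We define the $\phi$-weighted energy-scaled data oscillation

\[
\operatorname{osc}_\phi(T) = h_T \|\phi(f - f_T)\|_{L_2(T)}, \qquad \operatorname{osc}_\phi(\mathcal{T}) = \Big( \sum_{T \in \mathcal{T}} \operatorname{osc}_\phi(T)^2 \Big)^{1/2}
\]

and the $L_2$-scaled oscillation

\[
\operatorname{osc}_{L_2}(T) = h_T^2 \|f - f_T\|_{L_2(T)}, \qquad \operatorname{osc}_{L_2}(\mathcal{T}) = \Big( \sum_{T \in \mathcal{T}} \operatorname{osc}_{L_2}(T)^2 \Big)^{1/2}.
\]

Finally, observe that data oscillation is monotone under mesh refinement:

\[
\mathcal{T} \subset \tilde{\mathcal{T}} \implies \operatorname{osc}(\tilde{\mathcal{T}}) \le \operatorname{osc}(\mathcal{T}) \tag{2.7}
\]

and that data oscillation is always bounded by the corresponding error estimator:

\[
\operatorname{osc}_{L_2}(T) \le \eta_{0,\mathcal{T}}(T) \quad \text{and} \quad \operatorname{osc}_\phi(T) \le \eta_{1,\mathcal{T},\phi}(T), \quad T \in \mathcal{T} \in \mathbb{T}. \tag{2.8}
\]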

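A fundamental observation about AFEM based on residual error indicators is that they control the “total error” consisting of the error plus data oscillation [2]. We thus define total errors $E_{\mathcal{T},\phi} = \|\phi\nabla(u - u_{\mathcal{T}})\| + \operatorname{osc}_\phi(\mathcal{T})$ and $E_{0,\mathcal{T}} = \|u - u_{\mathcal{T}}\| + \operatorname{osc}_{L_2}(\mathcal{T})$, writing $E_{i,\phi}$ and $E_{0,i}$ as above when $\mathcal{T} = \mathcal{T}_i$.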

3 Preliminary estimates and AFEM

In this section we collect a number of technical estimates that will be used in the next section to prove optimality results.


3.1 A posteriori error estimates

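We collect and as necessary prove several a posteriori upper and lower bounds. The first results are found in [11], [12].

Lemma 2 Assume that $\Omega$ is convex and polyhedral. Then for any $\mathcal{T} \in \mathbb{T}$,

\[
E_{0,\mathcal{T}} \lesssim \eta_{0,\mathcal{T}}, \tag{3.1}
\]

\[
E_{\mathcal{T},\phi} \lesssim \eta_{1,\mathcal{T},\phi} + \frac{1}{d}\eta_{0,\mathcal{T}}. \tag{3.2}
\]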

We next provide a posteriori lower bounds.

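Lemma 3 Assume that $\Omega$ is a polyhedral domain and that $\mathcal{T} \in \mathbb{T}$. Then

\[
\eta_{0,\mathcal{T}}(\mathcal{T}) \lesssim E_{0,\mathcal{T}}, \tag{3.3}
\]

\[
\eta_{1,\mathcal{T},\phi} \lesssim E_{\mathcal{T},\phi} + \frac{1}{d}E_{0,\mathcal{T}}. \tag{3.4}
\]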

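Proof (3.3) is a standard result. To prove (3.4) we use a standard “bubble function” argument with some essential modifications. Given $T \in \mathcal{T}$, let $\psi$ be the product of the barycentric coordinates on $T$. The equivalence of norms on finite-dimensional spaces, $\|\psi\|_{L_\infty(T)} \lesssim 1$, and Hölder's inequality yield for $\varepsilon > 0$

\[
\begin{aligned}
\|\phi(f_T + \Delta u_{\mathcal{T}})\|_T^2 &\simeq \int_T \psi\phi^2 (f_T + \Delta u_{\mathcal{T}})^2 \, dx \\
&= -\int_T \psi\phi^2 (f_T + \Delta u_{\mathcal{T}}) \Delta(u - u_{\mathcal{T}}) \, dx - \int_T \psi\phi^2 (f_T + \Delta u_{\mathcal{T}})(f - f_T) \, dx \\
&\lesssim -\int_T \psi\phi^2 (f_T + \Delta u_{\mathcal{T}}) \Delta(u - u_{\mathcal{T}}) \, dx \\
&\qquad + \frac{1}{\varepsilon}\|\phi(f - f_T)\|_T^2 + \varepsilon\|\phi(f_T + \Delta u_{\mathcal{T}})\|_T^2.
\end{aligned} \tag{3.5}
\]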

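Integrating by parts twice and employing an inverse inequality, we have for $\varepsilon > 0$

\[
\begin{aligned}
-\int_T \psi\phi^2 (f_T + \Delta u_{\mathcal{T}}) \Delta(u - u_{\mathcal{T}}) \, dx
&= \int_T \nabla[\psi\phi^2 (f_T + \Delta u_{\mathcal{T}})] \cdot \nabla(u - u_{\mathcal{T}}) \, dx \\
&= \int_T \nabla[\psi\phi (f_T + \Delta u_{\mathcal{T}})] \cdot \phi\nabla(u - u_{\mathcal{T}}) \, dx \\
&\qquad + \int_T \psi\phi (f_T + \Delta u_{\mathcal{T}}) \nabla\phi \cdot \nabla(u - u_{\mathcal{T}}) \, dx \\
&= \int_T \nabla[\psi\phi (f_T + \Delta u_{\mathcal{T}})] \cdot \phi\nabla(u - u_{\mathcal{T}}) \, dx \\
&\qquad - \int_T (u - u_{\mathcal{T}}) \big[ \nabla(\psi\phi (f_T + \Delta u_{\mathcal{T}})) \cdot \nabla\phi + \psi\phi (f_T + \Delta u_{\mathcal{T}}) \Delta\phi \big] \, dx \\
&\lesssim \varepsilon^{-1} h_T^{-2} \|\phi\nabla(u - u_{\mathcal{T}})\|_T^2 \\
&\qquad + \varepsilon^{-1} (h_T^{-2} + d^{-2}) d^{-2} \|u - u_{\mathcal{T}}\|_T^2 + \varepsilon\|\phi(f_T + \Delta u_{\mathcal{T}})\|_T^2.
\end{aligned} \tag{3.6}
\]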


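Collecting (3.6) into (3.5), taking $\varepsilon$ small enough to kick back the last term in both expressions, multiplying through by $h_T^2$ while recalling that $h_T / d \lesssim 1$, and applying the triangle inequality yields

\[
h_T^2 \|\phi(f + \Delta u_{\mathcal{T}})\|_T^2 \lesssim \|\phi\nabla(u - u_{\mathcal{T}})\|_T^2 + \operatorname{osc}_\phi(T)^2 + \frac{1}{d^2}\|u - u_{\mathcal{T}}\|_T^2. \tag{3.7}
\]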

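We next bound the edge residual. Let $e = T_1 \cap T_2$ be an element face, and let $\mu$ be the extension of $\phi[\![\nabla u_{\mathcal{T}}]\!]$ from $e$ to $\mathbb{R}^n$ which is constant in the direction normal to $e$. Writing $h_T = h_{T_1} \simeq h_{T_2}$, we use the fact that $\mu$ is a polynomial to obtain $\|\mu\|_{T_i} \lesssim h_T^{1/2} \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e$, $i = 1, 2$. To see this, let $B$ be a ball in the hyperplane containing $e$ which is centered at the barycenter of $e$ and has diameter $c_1 h_T$, and let $\tilde{B}$ be a cylinder obtained by translating $B$ by $c_2 h_T$ in a direction normal to $e$. By shape regularity, we may choose $c_1$, $c_2$ uniformly equivalent to 1 so that $T_i \subset \tilde{B}$, and in addition there is a ball $b \subset e \subset B$ having radius also uniformly equivalent to $h_T$. Then

\[
\int_{T_i} \mu^2 \, dx \le \int_{\tilde{B}} \mu^2 \, dx = \int_0^{c_2 h_T} \int_B \mu^2 \, ds \, dt,
\]

where the argument of $\mu$ is taken to be $(s, t)$ for $s \in B$, $t \in [0, c_2 h_T]$. Thus

\[
\|\mu\|_{T_i} \lesssim h_T^{1/2} \|\mu\|_B \lesssim h_T^{1/2} \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_b \le h_T^{1/2} \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e,
\]

where the next-to-last inequality follows by the equivalence of norms over polynomial spaces on $B$.

Let also $\psi$ be the standard piecewise polynomial edge bubble function having support on $T_1 \cup T_2$ and satisfying $\|\psi\|_{L_\infty(T_1 \cup T_2)} \simeq 1$. Integrating by parts, we compute

\[
\begin{aligned}
\int_e \phi^2 [\![\nabla u_{\mathcal{T}}]\!]^2 \, ds &\simeq -\int_e \psi\phi [\![\nabla(u - u_{\mathcal{T}})]\!] \mu \, ds \\
&= -\int_{T_1 \cup T_2} \Delta_h(u - u_{\mathcal{T}}) \phi\psi\mu \, dx - \int_{T_1 \cup T_2} \nabla(u - u_{\mathcal{T}}) \cdot \nabla(\psi\mu\phi) \, dx \\
&:= -I - II.
\end{aligned} \tag{3.8}
\]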

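Here $\Delta_h$ is the Laplacian computed elementwise. Then

\[
\begin{aligned}
I &\lesssim \|\phi(f + \Delta_h u_{\mathcal{T}})\|_{T_1 \cup T_2} \|\mu\|_{T_1 \cup T_2} \\
&\lesssim \|\phi(f + \Delta_h u_{\mathcal{T}})\|_{T_1 \cup T_2} \, h_T^{1/2} \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e.
\end{aligned} \tag{3.9}
\]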

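Employing an inverse inequality yields

\[
\begin{aligned}
II &= \int_{T_1 \cup T_2} \phi\nabla(u - u_{\mathcal{T}}) \cdot \nabla(\psi\mu) \, dx + \int_{T_1 \cup T_2} \psi\mu \nabla(u - u_{\mathcal{T}}) \cdot \nabla\phi \, dx \\
&\lesssim \|\phi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} \, h_T^{-1} \|\psi\mu\|_{T_1 \cup T_2} + \frac{1}{d}\|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} \|\mu\|_{T_1 \cup T_2} \\
&\lesssim \Big( h_T^{-1} \|\phi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} + \frac{1}{d}\|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} \Big) h_T^{1/2} \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e.
\end{aligned} \tag{3.10}
\]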

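Adding (3.9) and (3.10) and inserting the result into (3.8), dividing through by $\|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e$, squaring the result, and multiplying through by $h_T$ yields

\[
h_T \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e^2 \lesssim h_T^2 \|\phi(f + \Delta_h u_{\mathcal{T}})\|_{T_1 \cup T_2}^2 + \|\phi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2}^2 + \frac{h_T^2}{d^2}\|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2}^2. \tag{3.11}
\]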


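Integrating by parts and employing a standard scaled trace inequality along with inverse inequalities, we next compute that for $\varepsilon > 0$

\[
\begin{aligned}
\|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2}^2 &= -\int_e \psi^2 (u - u_{\mathcal{T}}) [\![\nabla u_{\mathcal{T}}]\!] \, ds \\
&\qquad - \int_{T_1 \cup T_2} (u - u_{\mathcal{T}}) \big[ 2\psi\nabla(u - u_{\mathcal{T}}) \cdot \nabla\psi + \psi^2 \Delta_h(u - u_{\mathcal{T}}) \big] \, dx \\
&\lesssim \|u - u_{\mathcal{T}}\|_{T_1 \cup T_2} \big( h_T^{-1} \|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} + \|f + \Delta_h u_{\mathcal{T}}\|_{T_1 \cup T_2} \big) \\
&\qquad + \|[\![\nabla u_{\mathcal{T}}]\!]\|_e \big( h_T^{-1/2} \|\psi(u - u_{\mathcal{T}})\|_{T_1 \cup T_2} + h_T^{1/2} \|\nabla(\psi(u - u_{\mathcal{T}}))\|_{T_1 \cup T_2} \big) \\
&\lesssim \frac{1}{\varepsilon h_T^2}\|u - u_{\mathcal{T}}\|_{T_1 \cup T_2}^2 + \frac{h_T^2}{\varepsilon}\|f + \Delta_h u_{\mathcal{T}}\|_{T_1 \cup T_2}^2 \\
&\qquad + h_T \|[\![\nabla u_{\mathcal{T}}]\!]\|_e^2 + \varepsilon\|\psi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2}^2.
\end{aligned}
\]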

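Taking $\varepsilon$ small enough to reabsorb the last term above and inserting the result into (3.11) yields

\[
\begin{aligned}
h_T \|\phi[\![\nabla u_{\mathcal{T}}]\!]\|_e^2 &\lesssim h_T^2 \|\phi(f + \Delta_h u_{\mathcal{T}})\|_{T_1 \cup T_2}^2 + \|\phi\nabla(u - u_{\mathcal{T}})\|_{T_1 \cup T_2}^2 \\
&\qquad + \frac{1}{d^2}\big( \|u - u_{\mathcal{T}}\|_{T_1 \cup T_2}^2 + h_T^4 \|f + \Delta_h u_{\mathcal{T}}\|_{T_1 \cup T_2}^2 + h_T^3 \|[\![\nabla u_{\mathcal{T}}]\!]\|_e^2 \big).
\end{aligned} \tag{3.12}
\]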

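Letting $\omega_T$ be the patch of elements touching $T$, we insert (3.7) into (3.12), sum the result over all edges $e$ in $T$, combine the result again with (3.7), and then apply standard local efficiency results for $L_2$-type error indicators in order to obtain

\[
\begin{aligned}
\eta_{1,\mathcal{T},\phi}(T)^2 &\lesssim \|\phi\nabla(u - u_{\mathcal{T}})\|_{\omega_T}^2 + \operatorname{osc}_\phi(\omega_T)^2 \\
&\qquad + \frac{1}{d^2}\big( \|u - u_{\mathcal{T}}\|_{\omega_T}^2 + \eta_{0,\mathcal{T}}(\omega_T)^2 + h_T^4 \|f + \Delta_h u_{\mathcal{T}}\|_{\omega_T}^2 \big) \\
&\lesssim \|\phi\nabla(u - u_{\mathcal{T}})\|_{\omega_T}^2 + \operatorname{osc}_\phi(\omega_T)^2 + \frac{1}{d^2}\big( \|u - u_{\mathcal{T}}\|_{\omega_T}^2 + \operatorname{osc}_{L_2}(\omega_T)^2 \big).
\end{aligned}
\]

Summing over $T \in \mathcal{T}$ yields (3.4).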

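Finally, our optimality proofs require local a posteriori bounds for the difference between discrete solutions on nested meshes.

Lemma 4 Assume that $\Omega$ is a convex polyhedral domain with $\phi$, $D$, $D_d$, and $D'$ as above, $\mathcal{T} \subset \tilde{\mathcal{T}} \in \mathbb{T}$, and $\mathcal{T} \in \mathbb{T}_\mu$ with $\mu$ sufficiently small depending on the generic constant $C$ in §2.1. Then

\[
\|u_{\mathcal{T}} - u_{\tilde{\mathcal{T}}}\|_\Omega \lesssim \eta_{0,\mathcal{T}}(R_{\mathcal{T} \to \tilde{\mathcal{T}}}), \tag{3.13}
\]

\[
\|\phi\nabla(u_{\mathcal{T}} - u_{\tilde{\mathcal{T}}})\|_{D_d} \lesssim \eta_{1,\mathcal{T},\phi}(R_{\mathcal{T} \to \tilde{\mathcal{T}}}) + \frac{1}{d}\eta_{0,\mathcal{T}}. \tag{3.14}
\]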

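Proof (3.13) is Lemma 2 of [12]. In our proof of (3.14) we employ the Lagrange interpolant $I_{L,\tilde{\mathcal{T}}}$ on the finer mesh $\tilde{\mathcal{T}}$ and both the Lagrange and Scott-Zhang interpolants $I_{L,\mathcal{T}}$ and $I_{SZ,\mathcal{T}}$ on the coarser mesh $\mathcal{T}$. Let $e = u_{\tilde{\mathcal{T}}} - u_{\mathcal{T}}$, $e_{\mathcal{T}} = u - u_{\mathcal{T}}$, and $e_{\tilde{\mathcal{T}}} = u - u_{\tilde{\mathcal{T}}}$. We compute using the second inequality in (2.5) that

\[
\begin{aligned}
\|\phi\nabla e\|^2 &= \int_\Omega \phi\nabla e \cdot \nabla(\phi e) \, dx - \int_\Omega \phi e \nabla e \cdot \nabla\phi \, dx \\
&= \int_\Omega \phi\nabla e \cdot \nabla(\phi e - I_{L,\tilde{\mathcal{T}}}(\phi e)) \, dx + \int_\Omega \phi\nabla e \cdot \nabla I_{L,\tilde{\mathcal{T}}}(\phi e) \, dx - \int_\Omega \phi e \nabla e \cdot \nabla\phi \, dx \\
&\lesssim \int_\Omega \phi\nabla e \cdot \nabla I_{L,\tilde{\mathcal{T}}}(\phi e) \, dx + \frac{1}{d}\|\phi\nabla e\| \|e\|.
\end{aligned} \tag{3.15}
\]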

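We write $\psi = I_{L,\tilde{\mathcal{T}}}(\phi e) \in S_{\tilde{\mathcal{T}}}$ and split

\[
\int_\Omega \phi\nabla e \cdot \nabla I_{L,\tilde{\mathcal{T}}}(\phi e) \, dx = \int_\Omega \phi\nabla e_{\mathcal{T}} \cdot \nabla\psi \, dx - \int_\Omega \phi\nabla e_{\tilde{\mathcal{T}}} \cdot \nabla\psi \, dx := I - II. \tag{3.16}
\]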

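Employing Galerkin orthogonality, standard techniques for residual error estimation, the first inequality in (2.5), and integration by parts, we compute

\[
\begin{aligned}
II &= \int_\Omega \nabla e_{\tilde{\mathcal{T}}} \cdot \nabla(\phi\psi - I_{L,\tilde{\mathcal{T}}}(\phi\psi)) \, dx - \int_\Omega \psi \nabla e_{\tilde{\mathcal{T}}} \cdot \nabla\phi \, dx \\
&\lesssim \sum_{T \in \tilde{\mathcal{T}}} \eta_{0,\tilde{\mathcal{T}}}(T) \Big( \frac{1}{d^2}\|\psi\|_T + \frac{1}{d}\|\nabla\psi\|_T \Big) + \int_\Omega e_{\tilde{\mathcal{T}}} (\nabla\psi \cdot \nabla\phi + \psi\Delta\phi) \, dx.
\end{aligned}
\]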

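(2.6), (3.3), (2.7), and the triangle inequality then yield for any ε > 0

\[
\begin{aligned}
II &\lesssim \frac{1}{\varepsilon d^2}\eta_{0,\tilde{\mathcal T}}(\tilde{\mathcal T})^2 + \varepsilon\|\phi\nabla e\|^2 + \frac{1}{d^2}\|e\|^2 + \frac{1}{d^2}\|e_{\tilde{\mathcal T}}\|^2\\
&\lesssim \frac{1}{\varepsilon d^2}\Big(\|e_{\tilde{\mathcal T}}\|^2 + \mathrm{osc}_{L_2}(\tilde{\mathcal T})^2 + \|e\|^2\Big) + \varepsilon\|\phi\nabla e\|^2\\
&\lesssim \frac{1}{\varepsilon d^2}\Big(\|e_{\mathcal T}\|^2 + \|e\|^2 + \mathrm{osc}_{L_2}(\mathcal T)^2\Big) + \varepsilon\|\phi\nabla e\|^2.
\end{aligned}
\tag{3.17}
\]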

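Next we compute that

\[
\begin{aligned}
I &= \int_\Omega \nabla e_{\mathcal T}\cdot\nabla(\phi\psi)\,dx - \int_\Omega \psi\,\nabla e_{\mathcal T}\cdot\nabla\phi\,dx\\
&= \int_\Omega \nabla e_{\mathcal T}\cdot\nabla\big[\phi(\psi - I_{SZ,\mathcal T}\psi)\big]\,dx - \int_\Omega \psi\,\nabla e_{\mathcal T}\cdot\nabla\phi\,dx + \int_\Omega \nabla e_{\mathcal T}\cdot\nabla(\phi I_{SZ,\mathcal T}\psi)\,dx\\
&:= I_a + I_b + I_c.
\end{aligned}
\]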

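As is standard in such proofs [2], we define $I_{SZ,\mathcal T}$ so that $I_{SZ,\mathcal T}\chi = \chi$ on any $T \in \mathcal T\setminus\mathcal R_{\mathcal T\to\tilde{\mathcal T}}$ and $\chi \in S_{\tilde{\mathcal T}}$. Standard residual error estimation techniques then yield for any ε > 0

\[
\begin{aligned}
I_a &= \sum_{T\in\mathcal T}\Big[\int_T \phi(f+\Delta u_{\mathcal T})(\psi - I_{SZ,\mathcal T}\psi)\,dx + \frac{1}{2}\int_{\partial T}\phi[\![\nabla u_{\mathcal T}]\!](\psi - I_{SZ,\mathcal T}\psi)\,ds\Big]\\
&\lesssim \eta_{1,\mathcal T,\phi}(\mathcal R_{\mathcal T\to\tilde{\mathcal T}})\|\nabla\psi\| \lesssim \eta_{1,\mathcal T,\phi}(\mathcal R_{\mathcal T\to\tilde{\mathcal T}})\Big(\|\phi\nabla e\| + \frac{1}{d}\|e\|\Big)\\
&\lesssim \varepsilon^{-1}\Big(\eta_{1,\mathcal T,\phi}(\mathcal R_{\mathcal T\to\tilde{\mathcal T}})^2 + \frac{1}{d^2}\|e\|^2\Big) + \varepsilon\|\phi\nabla e\|^2.
\end{aligned}
\tag{3.18}
\]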


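Integration by parts and elementary manipulations yield for any ε > 0

\[
I_b = \int_\Omega e_{\mathcal T}\,(\nabla\psi\cdot\nabla\phi + \psi\Delta\phi)\,dx \lesssim \frac{1}{\varepsilon d^2}\big(\|e_{\mathcal T}\|^2 + \|e\|^2\big) + \varepsilon\|\phi\nabla e\|^2. \tag{3.19}
\]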

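Finally, Galerkin orthogonality, residual techniques, and the first inequality in (2.5) yield for ε > 0

\[
\begin{aligned}
I_c &= \sum_{T\in\mathcal T}\Big[\int_T (f+\Delta u_{\mathcal T})\big(\phi I_{SZ,\mathcal T}\psi - I_{L,\mathcal T}(\phi I_{SZ,\mathcal T}\psi)\big)\,dx\\
&\qquad\quad + \frac{1}{2}\int_{\partial T}[\![\nabla u_{\mathcal T}]\!]\big(\phi I_{SZ,\mathcal T}\psi - I_{L,\mathcal T}(\phi I_{SZ,\mathcal T}\psi)\big)\,ds\Big]\\
&\lesssim \frac{1}{\varepsilon d^2}\big(\eta_{0,\mathcal T}(\mathcal T)^2 + \|e\|^2\big) + \varepsilon\|\phi\nabla e\|^2.
\end{aligned}
\tag{3.20}
\]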

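Collecting (3.17), (3.18), (3.19), and (3.20) into (3.16) and then into (3.15) yields for any ε > 0

\[
\|\phi\nabla e\|^2 \lesssim \frac{1}{\varepsilon d^2}\Big(\eta_{0,\mathcal T}(\mathcal T)^2 + \mathrm{osc}_{L_2}(\mathcal T)^2 + \|e\|^2 + \|e_{\mathcal T}\|^2\Big) + \varepsilon^{-1}\eta_{1,\mathcal T,\phi}(\mathcal R_{\mathcal T\to\tilde{\mathcal T}})^2 + \varepsilon\|\phi\nabla e\|^2. \tag{3.21}
\]

Taking ε sufficiently small to kick back the final term, bounding $\|e\|$ and $\|e_{\mathcal T}\|$ in (3.21) using (3.13) and (3.1), and recalling (2.8) completes the proof of (3.14).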

3.2 A priori error estimates

In this section we recall an a priori bound in the L2 norm from [12] and then prove a new Céa's Lemma-type bound for the error notion ‖φ∇(u − ui)‖.

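Lemma 5 Assume that Ω is convex and that $\mathcal T \in \mathbb T_\mu$ with µ sufficiently small. Then

\[
\|u-u_{\mathcal T}\| \lesssim \inf_{\chi\in S_{\mathcal T}}\|u-\chi\| + \mathrm{osc}_{L_2}(\mathcal T), \tag{3.22}
\]

and

\[
\|\phi\nabla(u-u_{\mathcal T})\|_{D_d} \le C_3\Big(\inf_{\chi\in S_{\mathcal T}}\|\phi\nabla(u-\chi)\|_\Omega + \frac{1}{d}\Big[\inf_{\psi\in S_{\mathcal T}}\|u-\psi\|_\Omega + \mathrm{osc}_{L_2}(\mathcal T)\Big]\Big). \tag{3.23}
\]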

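Proof (3.22) is proved in [12], so we are left to prove (3.23). Given $\chi \in S_{\mathcal T}$, let $e = u - u_{\mathcal T}$ and $\tilde e = \chi - u_{\mathcal T}$. Elementary manipulations yield for any ε > 0

\[
\begin{aligned}
\|\phi\nabla\tilde e\|^2 &= \int_\Omega \nabla\tilde e\cdot\nabla(\phi^2\tilde e)\,dx - 2\int_\Omega \phi\tilde e\,\nabla\tilde e\cdot\nabla\phi\,dx\\
&\lesssim \int_\Omega \nabla\tilde e\cdot\nabla(\phi^2\tilde e)\,dx + \varepsilon\|\phi\nabla\tilde e\|^2 + \frac{1}{\varepsilon d^2}\|\tilde e\|^2.
\end{aligned}
\]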


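There also holds for any ε > 0

\[
\begin{aligned}
\int_\Omega \nabla\tilde e\cdot\nabla(\phi^2\tilde e)\,dx &= \int_\Omega \nabla e\cdot\nabla(\phi^2\tilde e)\,dx + \int_\Omega \nabla(\chi-u)\cdot\nabla(\phi^2\tilde e)\,dx\\
&= \int_\Omega \nabla e\cdot\nabla(\phi^2\tilde e)\,dx + \int_\Omega \nabla(\chi-u)\cdot(\phi^2\nabla\tilde e + 2\phi\tilde e\,\nabla\phi)\,dx\\
&\lesssim \int_\Omega \nabla e\cdot\nabla(\phi^2\tilde e)\,dx + \frac{1}{\varepsilon}\|\phi\nabla(u-\chi)\|^2 + \frac{1}{d^2}\|\tilde e\|^2 + \varepsilon\|\phi\nabla\tilde e\|^2.
\end{aligned}
\]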

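Standard residual error estimation techniques and the first inequality in (2.4) yield for any ε > 0

\[
\begin{aligned}
\int_\Omega \nabla e\cdot\nabla(\phi^2\tilde e)\,dx &= \sum_{T\in\mathcal T}\Big[\int_T (f+\Delta u_{\mathcal T})\big(\phi^2\tilde e - I_L(\phi^2\tilde e)\big)\,dx + \frac{1}{2}\int_{\partial T}[\![\nabla u_{\mathcal T}]\!]\big(\phi^2\tilde e - I_L(\phi^2\tilde e)\big)\,ds\Big]\\
&\le \frac{1}{\varepsilon d^2}\eta_{0,\mathcal T}^2 + \frac{1}{d^2}\|\tilde e\|^2 + \varepsilon\|\phi\nabla\tilde e\|^2.
\end{aligned}
\]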

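Combining the last three inequalities, employing the triangle inequality, and using (3.3) along with (3.22) yields for any ε > 0 that

\[
\|\phi\nabla e\| \lesssim \frac{1}{\varepsilon}\Big[\|\phi\nabla(u-\chi)\| + \frac{1}{d}\big(\|u-\chi\| + \mathrm{osc}_{L_2}(\mathcal T)\big)\Big] + \varepsilon\|\phi\nabla e\|.
\]

Taking ε sufficiently small to reabsorb the last term yields the intermediate result

\[
\|\phi\nabla(u-u_{\mathcal T})\| \lesssim \inf_{\chi\in S_{\mathcal T}}\Big[\|\phi\nabla(u-\chi)\| + \frac{1}{d}\big(\|u-\chi\| + \mathrm{osc}_{L_2}(\mathcal T)\big)\Big]. \tag{3.24}
\]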

We now complete the proof of (3.23). Since we are only preserving homogeneous Dirichlet boundary conditions, it is possible to define the Scott-Zhang operator $I_{SZ}$ to be stable in both the L2 norm and the H1 seminorm. That is, $\|I_{SZ}u\|_T \lesssim \|u\|_{\omega_T}$ and $\|\nabla I_{SZ}u\|_T \lesssim \|\nabla u\|_{\omega_T}$. Note also that $I_{SZ}$ is a projection onto $S_{\mathcal T}$.

Assume first that $T \subset D_d$ satisfies $\|\phi\|_{L_\infty(T)} \le C_\phi\frac{h_T}{d}$, where $C_\phi$ is a sufficiently large constant to be specified later. Let $P_{r,0}$ be the polynomials of degree r that are 0 on ∂T. Taking r to be sufficiently large (r = j + k + n will suffice), we have

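\[
\|\phi\nabla I_{SZ}u\|_T = \int_T \phi\nabla I_{SZ}u\cdot\psi\,dx \tag{3.25}
\]

for some $\psi \in [P_{r,0}]^n$ with $\|\psi\|_T \simeq 1$. Employing integration by parts, an inverse inequality, and $\|\phi\|_{L_\infty(T)} \lesssim \frac{h_T}{d}$, we find that

\[
\begin{aligned}
\int_T \phi\nabla I_{SZ}u\cdot\psi\,dx &= \int_T \phi\nabla(I_{SZ}u-u)\cdot\psi\,dx + \int_T \phi\nabla u\cdot\psi\,dx\\
&= \int_T \big[(u-I_{SZ}u)\,\nabla\cdot(\phi\psi) + \phi\nabla u\cdot\psi\big]\,dx\\
&\lesssim \Big[h_T^{-1}\|u-I_{SZ}u\|_T\|\phi\|_{L_\infty(T)} + \|\phi\nabla u\|_T\Big]\|\psi\|_T\\
&\lesssim \frac{1}{d}\|u-I_{SZ}u\|_T + \|\phi\nabla u\|_T.
\end{aligned}
\tag{3.26}
\]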


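Now let $T \subset D_d$ satisfy $\|\phi\|_{L_\infty(T)} \ge C_\phi\frac{h_T}{d}$, with $C_\phi$ as above. By (2.3) and shape regularity, there is a constant $c_{\mathcal T}$ (depending only on the shape regularity of $\mathcal T$ and the constant in (2.3)) such that

\[
\inf_{x\in\omega_T}\phi(x) \ge \|\phi\|_{L_\infty(\omega_T)} - \mathrm{diam}(\omega_T)\|\nabla\phi\|_{L_\infty(\omega_T)} \ge \|\phi\|_{L_\infty(T)} - c_{\mathcal T}\frac{h_T}{d}. \tag{3.27}
\]

We let $C_\phi = 2c_{\mathcal T}$, which because of (3.27) ensures that $\|\phi\|_{L_\infty(T)}\|\phi^{-1}\|_{L_\infty(\omega_T)} \lesssim 1$ for $T \in \mathcal T$ satisfying $\|\phi\|_{L_\infty(T)} \ge C_\phi\frac{h_T}{d}$. This inequality and the stability of $I_{SZ}$ in the H1 seminorm yield

\[
\|\phi\nabla I_{SZ}u\|_T \lesssim \|\phi\|_{L_\infty(T)}\|\nabla u\|_{\omega_T} \lesssim \|\phi\|_{L_\infty(T)}\|\phi^{-1}\|_{L_\infty(\omega_T)}\|\phi\nabla u\|_{\omega_T} \lesssim \|\phi\nabla u\|_{\omega_T}. \tag{3.28}
\]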

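Summing (3.25), (3.26), and (3.28) over $T \in \mathcal T_{D_d}$ and collecting the results into (3.24) while taking $\chi = I_{SZ}u$ yields

\[
\|\phi\nabla e\| \lesssim \|\phi\nabla u\| + \frac{1}{d}\big(\|u-I_{SZ}u\| + \mathrm{osc}_{L_2}(\mathcal T)\big).
\]

Writing u = u − χ, $u - I_{SZ}u = (u-\psi) - I_{SZ}(u-\psi)$ (which holds for any $\psi \in S_{\mathcal T}$ because $I_{SZ}$ is a projection onto $S_{\mathcal T}$), and finally employing the L2 stability of $I_{SZ}$ completes the proof of (3.23).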

3.3 Adaptive FEM

In this subsection we define each module of the standard AFEM iteration (1.1) in order to construct AFEM for reducing the weighted energy error $\|\phi\nabla(u-u_i)\|_{D_d}$. Although our initial goal is to control the local energy error, any AFEM for this quantity based on the a posteriori estimate (3.2) will also control the pollution measure $\|u-u_i\|_\Omega$. We thus design our AFEM to control $\|\phi\nabla(u-u_i)\|_{D_d} + \frac{\kappa}{d}\|u-u_i\|_\Omega$, where κ > 0 is a scaling parameter that as in [11] must be chosen large enough in order to obtain AFEM convergence results.

Our algorithm assumes that an initial shape-regular simplicial mesh $\mathcal T_0$ is given. The modules in (1.1) are defined as follows:

1. Module solve. Solve (2.2) exactly for $u_i$.

2. Module estimate. (3.1) and (3.2) yield for any κ ≳ 1

\[
\|\phi\nabla(u-u_i)\| + \frac{\kappa}{d}\|u-u_i\| \lesssim \eta_{1,i,\phi} + \frac{\kappa}{d}\eta_{0,i}. \tag{3.29}
\]

3. Module mark. Here we describe our main marking strategy, which we call strictly alternating marking. We also consider other strategies below. Let 0 < θ ≤ 1 be a marking parameter. Here we view the local error $\|\phi\nabla(u-u_i)\|_{D_d}$ and pollution term $\frac{\kappa}{d}\|u-u_i\|_\Omega$ as separate terms, and marking is based on the dominant one of these quantities as measured by the corresponding error estimators. In particular, we determine a minimal set $\mathcal M_i \subset \mathcal T_i$ so that:

\[
\begin{aligned}
\eta_{1,i,\phi}(\mathcal T_{i,D_d}) \ge \frac{\kappa}{d}\eta_{0,i}(\mathcal T_i) &\implies \eta_{1,i,\phi}(\mathcal M_i) \ge \theta\,\eta_{1,i,\phi}(\mathcal T_{i,D_d}),\\
\frac{\kappa}{d}\eta_{0,i}(\mathcal T_i) > \eta_{1,i,\phi}(\mathcal T_{i,D_d}) &\implies \eta_{0,i}(\mathcal M_i) \ge \theta\,\eta_{0,i}(\mathcal T_i).
\end{aligned}
\tag{3.30}
\]


4. Module refine. Our results below assume that each marked element $T \in \mathcal M_i$ is bisected b ≥ 1 times in passing from $\mathcal T_i$ to $\mathcal T_{i+1}$ and that additional elements are refined in the process in order to ensure that $\mathcal T_{i+1}$ is conforming and lies in $\mathbb T_\mu$ with µ sufficiently small, as described in §2.2. We additionally assume that (2.1) is satisfied. The modified newest node bisection algorithm described in [12] fulfills these conditions.
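The marking rule (3.30) can be illustrated with a short sketch. It assumes that the elementwise indicators are already available as dictionaries mapping an element id to its indicator value (`eta1` restricted to elements in Dd, `eta0` over the whole mesh) and that estimators accumulate as l2-sums over elements. The function names and data layout are hypothetical and are not taken from the paper or from ALBERTA.

```python
import math


def estimator_total(indicators):
    """l2-sum of elementwise indicators: eta(M) = sqrt(sum_T eta(T)^2)."""
    return math.sqrt(sum(v * v for v in indicators.values()))


def dorfler_mark(indicators, theta):
    """Greedy Doerfler marking: a smallest set M with eta(M) >= theta * eta."""
    target = (theta * estimator_total(indicators)) ** 2
    marked, acc = [], 0.0
    # Taking the largest indicators first yields a set of minimal cardinality.
    for elem, val in sorted(indicators.items(), key=lambda kv: -kv[1]):
        if acc >= target:
            break
        marked.append(elem)
        acc += val * val
    return marked


def alternating_mark(eta1, eta0, theta, kappa, d):
    """One marking step in the spirit of (3.30).

    Marks for the local energy estimator when it dominates the weighted
    pollution estimator (kappa / d) * eta0, and for eta0 otherwise.
    """
    if estimator_total(eta1) >= kappa / d * estimator_total(eta0):
        return "energy", dorfler_mark(eta1, theta)
    return "pollution", dorfler_mark(eta0, theta)
```

With `eta1 = {0: 3.0, 1: 4.0}` and `eta0 = {0: 1.0, 1: 0.0, 2: 0.0}`, the call `alternating_mark(eta1, eta0, 0.5, 1.0, 1.0)` marks for the energy estimator, while raising `kappa` to 10 switches the step to pollution marking.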

3.4 Error contraction

Lemma 6 Given κ > 0, let $E_i^2 = \|\phi\nabla(u-u_i)\|_{D_d}^2 + \frac{\kappa^2}{d^2}\|u-u_i\|_\Omega^2$ and also $\eta_i^2 = \eta_{1,i,\phi}(\mathcal T_{i,D_d})^2 + \frac{\kappa^2}{d^2}\eta_{0,i}(\mathcal T_i)^2$. Assume that the alternating marking strategy (3.30) is used. There exist constants γ > 0 and 0 < α < 1 depending only on the parameter θ in (3.30) and other nonessential quantities such that if $\mathcal T_i, \mathcal T_{i+1} \in \mathbb T_\mu$ with µ sufficiently small and κ is sufficiently large (both depending only on the generic constant C in §2.1), then

\[
E_{i+1}^2 + \gamma\eta_{i+1}^2 \le \alpha^2\big(E_i^2 + \gamma\eta_i^2\big). \tag{3.31}
\]

Proof The desired result may be found in [11] with the local energy term $\|\phi\nabla(u-u_i)\|_{D_d}$ in $E_i$ replaced by the closely related quantity $\|\nabla[\phi(u-u_i)]\|$. These two weighted error notions are equivalent up to a term $\frac{1}{d}\|u-u_i\|_{D'}$. In the contraction estimate (3.31) such a term may be reabsorbed into $E_i$ with slight adjustment to α and γ when κ is sufficiently large.

4 Optimality results

In this section we state and prove optimality results for the adaptive algorithm defined in §3.3. In addition, we discuss two marking strategies other than the strictly alternating strategy (3.30) which yield optimal AFEM.

4.1 Approximation classes

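Given a convergence rate s > 0, let

\[
\mathcal A^s_{L_2} = \Big\{ v \in H^1_0(\Omega) : -\Delta v \in L_2(\Omega),\ |v|_{\mathcal A^s_{L_2}} := \sup_{N\in\mathbb N} N^s \inf_{\mathcal T\in\mathbb T:\,\#\mathcal T-\#\mathcal T_0\le N}\Big[\inf_{v_{\mathcal T}\in S_{\mathcal T}}\|v-v_{\mathcal T}\|_\Omega + \mathrm{osc}_{L_2}(\mathcal T)\Big] < \infty \Big\}. \tag{4.1}
\]

Next we define the φ-weighted energy class

\[
\mathcal A^s_\phi = \Big\{ v \in H^1_0(\Omega) : |v|_{\mathcal A^s_\phi} := \sup_{N\in\mathbb N} N^s \inf_{\mathcal T\in\mathbb T:\,\#\mathcal T-\#\mathcal T_0\le N}\Big[\inf_{v_{\mathcal T}\in S_{\mathcal T}}\|\phi\nabla(v-v_{\mathcal T})\|_\Omega + \mathrm{osc}_\phi(\mathcal T)\Big] < \infty \Big\}. \tag{4.2}
\]

In the simplest contexts, approximation classes may be characterized in terms of Besov regularity of u [1], [33]. We do not attempt such a characterization here.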


4.2 Optimality of AFEM with strictly alternating marking strategy

Theorem 1 Assume that the strictly alternating marking strategy (3.30) is used with κ sufficiently large and θ sufficiently small. In addition assume that AFEM produces a sequence $\{\mathcal T_i\}_{i\ge 0} \subset \mathbb T_\mu$ of meshes with µ sufficiently small. The required sizes of κ, θ, and µ depend only on the generic constant C in §2.1. Then

\[
E_{i,\phi} + \frac{\kappa}{d}E_{0,i} \lesssim (\#\mathcal T_i - \#\mathcal T_0)^{-s}\Big(|u|_{\mathcal A^s_\phi} + \frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big). \tag{4.3}
\]

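Proof We follow the broad outline of previous AFEM optimality proofs [2], [12], [8] with some essential modifications. We first seek to bound the number $\#\mathcal M_i$ of marked elements at each step of the AFEM algorithm. We immediately have from Corollary 3 of [12] that

\[
\frac{\kappa}{d}\eta_{0,i} > \eta_{1,i,\phi} \implies \#\mathcal M_i \lesssim \Big(\frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\Big(\frac{\kappa}{d}E_{0,i}\Big)^{-1/s}. \tag{4.4}
\]

Assume now that $\eta_{1,i,\phi} \ge \frac{\kappa}{d}\eta_{0,i}$. For κ sufficiently large, (3.4) and (3.1) yield

\[
\eta_{1,i,\phi} \le C_1 E_{i,\phi}. \tag{4.5}
\]

Also, if $\mathcal T_i \subset \mathcal T$, then (3.14), (4.5), and $\eta_{1,i,\phi} \ge \frac{\kappa}{d}\eta_{0,i}$ imply

\[
\|\phi\nabla(u_{\mathcal T} - u_i)\|_{D_d} \le C_2\Big[\eta_{1,i,\phi}(\mathcal R_{\mathcal T_i\to\mathcal T}) + \frac{1}{\kappa}E_{i,\phi}\Big]. \tag{4.6}
\]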

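Assume now that θ is small enough so that $1 - 2\theta C_1(C_2+1) > 0$ and that $\mathcal T$ is a refinement of $\mathcal T_i$ satisfying

\[
E_{\mathcal T,\phi} \le [1 - 2\theta C_1(C_2+1)]E_{i,\phi}. \tag{4.7}
\]

The triangle inequality and the inequality $\mathrm{osc}_\phi(\mathcal T_i) \le \mathrm{osc}_\phi(\mathcal R_{\mathcal T_i\to\mathcal T}) + \mathrm{osc}_\phi(\mathcal T)$ yield

\[
E_{i,\phi} - E_{\mathcal T,\phi} \le \|\phi\nabla(u_{\mathcal T} - u_i)\|_{D_d} + \mathrm{osc}_\phi(\mathcal R_{\mathcal T_i\to\mathcal T}). \tag{4.8}
\]

We now employ (4.5), rearrange terms, use in order (4.7), (4.8), and (4.6), and then finally take κ sufficiently large so that $\frac{C_2}{\kappa} \le \theta C_1(C_2+1)$ to obtain

\[
\begin{aligned}
\theta(C_2+1)\eta_{1,i,\phi} &\le \theta C_1(C_2+1)E_{i,\phi}\\
&= E_{i,\phi} - \big(1 - 2\theta C_1(C_2+1)\big)E_{i,\phi} - \theta C_1(C_2+1)E_{i,\phi}\\
&\le E_{i,\phi} - E_{\mathcal T,\phi} - \theta C_1(C_2+1)E_{i,\phi}\\
&\le \|\phi\nabla(u_{\mathcal T} - u_i)\|_{D_d} + \mathrm{osc}_\phi(\mathcal R_{\mathcal T_i\to\mathcal T}) - \theta C_1(C_2+1)E_{i,\phi}\\
&\le (C_2+1)\eta_{1,i,\phi}(\mathcal R_{\mathcal T_i\to\mathcal T}) + \frac{C_2}{\kappa}E_{i,\phi} - \theta C_1(C_2+1)E_{i,\phi}\\
&\le (C_2+1)\eta_{1,i,\phi}(\mathcal R_{\mathcal T_i\to\mathcal T}).
\end{aligned}
\]

Thus for $\mathcal T_i \subset \mathcal T$,

\[
E_{\mathcal T,\phi} \le [1 - 2\theta C_1(C_2+1)]E_{i,\phi} \implies \theta\,\eta_{1,i,\phi} \le \eta_{1,i,\phi}(\mathcal R_{\mathcal T_i\to\mathcal T}). \tag{4.9}
\]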


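Next let $\theta < \frac{1}{3C_1(C_2+1)}$. Recalling that $C_3$ is the constant from (3.23), the definition of $\mathcal A^s_\phi$ guarantees that there is a $\mathcal T' \in \mathbb T$ such that

\[
\#\mathcal T' - \#\mathcal T_0 \lesssim |u|^{1/s}_{\mathcal A^s_\phi}\Big(\frac{1 - 3\theta C_1(C_2+1)}{1+C_3}E_{i,\phi}\Big)^{-1/s} \tag{4.10}
\]

and $v_{\mathcal T'} \in S_{\mathcal T'}$ such that

\[
\|\phi\nabla(u - v_{\mathcal T'})\|_{D_d} \le \frac{1 - 3\theta C_1(C_2+1)}{1+C_3}E_{i,\phi}.
\]

It is shown in [12] (cf. [8]) that the smallest common refinement $\mathcal T$ of $\mathcal T_i$ and $\mathcal T'$ which lies in $\mathbb T_\mu$ additionally satisfies

\[
\#\mathcal T - \#\mathcal T_i \lesssim \#\mathcal T' - \#\mathcal T_0, \tag{4.11}
\]

with the constant depending on µ. Using (3.23) with $\chi = v_{\mathcal T'}$ and $\psi = u_{\mathcal T}$, (3.1), and noting that $\frac{\kappa}{d}\eta_{0,i} \le \eta_{1,i,\phi} \le CE_{i,\phi}$ by (4.5), we calculate

\[
\begin{aligned}
E_{\mathcal T,\phi} &\le (C_3+1)\Big(\|\phi\nabla(u - v_{\mathcal T'})\|_{D_d} + \mathrm{osc}_\phi(\mathcal T') + \frac{1}{d}E_{0,i}\Big)\\
&\le [1 - 3\theta C_1(C_2+1)]E_{i,\phi} + \frac{C}{d}\eta_{0,i}\\
&\le [1 - 3\theta C_1(C_2+1)]E_{i,\phi} + \frac{C}{\kappa}E_{i,\phi}.
\end{aligned}
\]

Taking κ sufficiently large so that $\frac{C}{\kappa} \le \theta C_1(C_2+1)$ then yields

\[
E_{\mathcal T,\phi} \le [1 - 2\theta C_1(C_2+1)]E_{i,\phi}. \tag{4.12}
\]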

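Because $\mathcal M_i$ is the smallest subset of $\mathcal T_i$ satisfying Dörfler marking for $\eta_{1,i,\phi}$ with parameter θ, (4.9) and (4.12) together yield $\#\mathcal M_i \le \#\mathcal R_{\mathcal T_i\to\mathcal T}$. Combining this observation with (4.11) and the trivial observation $\#\mathcal R_{\mathcal T_i\to\mathcal T} \le \#\mathcal T - \#\mathcal T_i$ yields

\[
\#\mathcal M_i \lesssim \#\mathcal T' - \#\mathcal T_0,
\]

which together with (4.10) gives

\[
\frac{\kappa}{d}\eta_{0,i}(\mathcal T_i) \le \eta_{1,i,\phi}(\mathcal T_i) \implies \#\mathcal M_i \lesssim |u|^{1/s}_{\mathcal A^s_\phi}E_{i,\phi}^{-1/s}. \tag{4.13}
\]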

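Note that $E_{0,i} \simeq \eta_{0,i}(\mathcal T_i)$, and that $E_{i,\phi} \simeq \eta_{1,i,\phi}(\mathcal T_i)$ whenever $\eta_{1,i,\phi}(\mathcal T_i) \ge \frac{\kappa}{d}\eta_{0,i}(\mathcal T_i)$. Thus $\frac{\kappa}{d}E_{0,i} \simeq E_{i,\phi} + \frac{\kappa}{d}E_{0,i}$ when $\frac{\kappa}{d}\eta_{0,i}(\mathcal T_i) > \eta_{1,i,\phi}(\mathcal T_i)$, and $E_{i,\phi} \simeq E_{i,\phi} + \frac{\kappa}{d}E_{0,i}$ when $\frac{\kappa}{d}\eta_{0,i}(\mathcal T_i) \le \eta_{1,i,\phi}(\mathcal T_i)$. Combining this observation with (4.4), (4.13), and (3.31) yields

\[
\begin{aligned}
\#\mathcal T_i - \#\mathcal T_0 &\lesssim \sum_{j=0}^{i-1}\#\mathcal M_j \lesssim \sum_{j=0}^{i-1}\Big(|u|^{1/s}_{\mathcal A^s_\phi} + \Big(\frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\Big)\Big(E_{j,\phi} + \frac{\kappa}{d}E_{0,j}\Big)^{-1/s}\\
&\lesssim \Big(|u|_{\mathcal A^s_\phi} + \frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\sum_{j=0}^{i-1}\alpha^{\frac{i-j}{s}}\Big(E_{i,\phi} + \frac{\kappa}{d}E_{0,i}\Big)^{-1/s},
\end{aligned}
\tag{4.14}
\]

which after summing geometric terms and rearranging yields (4.3), as desired.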
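For the reader's convenience, the final summation step can be spelled out as follows. This is a sketch under the assumption (implicit in the argument above) that the contraction (3.31), combined with the equivalence of errors and estimators, gives $E_{i,\phi} + \frac{\kappa}{d}E_{0,i} \lesssim \alpha^{i-j}\big(E_{j,\phi} + \frac{\kappa}{d}E_{0,j}\big)$ for $0 \le j \le i$. Then

\[
\Big(E_{j,\phi} + \tfrac{\kappa}{d}E_{0,j}\Big)^{-1/s} \lesssim \alpha^{\frac{i-j}{s}}\Big(E_{i,\phi} + \tfrac{\kappa}{d}E_{0,i}\Big)^{-1/s},
\qquad
\sum_{j=0}^{i-1}\alpha^{\frac{i-j}{s}} = \sum_{m=1}^{i}\big(\alpha^{1/s}\big)^m \le \frac{\alpha^{1/s}}{1-\alpha^{1/s}},
\]

so that

\[
\#\mathcal T_i - \#\mathcal T_0 \lesssim \Big(|u|_{\mathcal A^s_\phi} + \tfrac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\Big(E_{i,\phi} + \tfrac{\kappa}{d}E_{0,i}\Big)^{-1/s},
\]

and raising both sides to the power −s gives the bound of Theorem 1.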


4.3 Optimality of AFEM with pre-controlled pollution

We next discuss optimality of an AFEM in which the pollution term is first adaptively reduced to within a specified tolerance and the local energy residual is subsequently reduced to the same tolerance. This is essentially one of the two-grid algorithms proposed in [14]. We employ the following pre-controlled pollution marking strategy: Given tolerance ε > 0, Dörfler marking fraction 0 < θ ≤ 1, and pollution parameter κ > 0,

1. Until the first iteration in which $\frac{\kappa}{d}\eta_{0,i} \le \varepsilon$: In the "mark" step, choose a minimal set $\mathcal M_i \subset \mathcal T_i$ such that $\eta_{0,i}(\mathcal M_i) \ge \theta\,\eta_{0,i}$.

2. For the first iteration in which $\frac{\kappa}{d}\eta_{0,i} \le \varepsilon$ and in each iteration thereafter until $\eta_{1,i,\phi} \le \varepsilon$: In the "mark" step, choose a minimal set $\mathcal M_i \subset \mathcal T_i$ such that $\eta_{1,i,\phi}(\mathcal M_i) \ge \theta\,\eta_{1,i,\phi}$.
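The two stages above can be sketched as a small state machine. The sketch assumes elementwise indicators stored in dictionaries (element id to value) that accumulate as l2-sums; the function names, the `stage` flag, and the convention that an empty marked set signals termination are hypothetical choices made for illustration only.

```python
import math


def estimator_total(indicators):
    """l2-sum of elementwise indicators."""
    return math.sqrt(sum(v * v for v in indicators.values()))


def dorfler_mark(indicators, theta):
    """Greedy Doerfler marking: a smallest set M with eta(M) >= theta * eta."""
    target = (theta * estimator_total(indicators)) ** 2
    marked, acc = [], 0.0
    for elem, val in sorted(indicators.items(), key=lambda kv: -kv[1]):
        if acc >= target:
            break
        marked.append(elem)
        acc += val * val
    return marked


def precontrolled_mark(eta1, eta0, theta, kappa, d, eps, stage):
    """Return (stage, marked); an empty list in stage 2 signals termination.

    Stage 1 marks for the L2 (pollution) estimator eta0 until
    (kappa / d) * eta0 <= eps holds for the first time; stage 2 then marks
    for the local energy estimator eta1 until eta1 <= eps.
    """
    if stage == 1 and kappa / d * estimator_total(eta0) <= eps:
        stage = 2  # tolerance met once; stage 1 is never re-entered
    if stage == 1:
        return 1, dorfler_mark(eta0, theta)
    if estimator_total(eta1) <= eps:
        return 2, []
    return 2, dorfler_mark(eta1, theta)
```

A driver loop would pass the returned stage back into the next call after each solve and estimate step, so that the strategy never switches back to pollution marking once Part 2 has begun.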

Corollary 1 Assume that the above pre-controlled pollution marking strategy is used, that θ is sufficiently small, that κ is sufficiently large, and that $\{\mathcal T_i\} \subset \mathbb T_\mu$ for µ sufficiently small. The required sizes of κ, θ, and µ depend only on the generic constant C in §2.1. Assume also that $u \in \mathcal A^s_\phi$ and $u \in \mathcal A^s_{L_2}$ for some s > 0. Then the algorithm terminates in a finite number of steps with output mesh $\mathcal T_M$, and

\[
\|\phi\nabla(u-u_M)\|_{D_d} + \frac{\kappa}{d}\|u-u_M\| \lesssim \varepsilon \lesssim (\#\mathcal T_M - \#\mathcal T_0)^{-s}\Big(|u|_{\mathcal A^s_\phi} + \frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big). \tag{4.15}
\]

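Proof By [12], the L2 marking strategy in Part 1 of the pre-controlled pollution marking strategy will yield a mesh $\mathcal T_k$ for which $\frac{\kappa}{d}\eta_{0,k} \le \varepsilon$, and for some 0 < α < 1,

\[
\begin{aligned}
\#\mathcal T_k - \#\mathcal T_0 &\lesssim \sum_{i=0}^{k-1}\#\mathcal M_i \lesssim \Big(\frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\sum_{i=0}^{k-1}\Big(\frac{\kappa}{d}E_{0,i}\Big)^{-1/s}\\
&\lesssim \Big(\frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\sum_{i=0}^{k-1}\alpha^{\frac{k-1-i}{s}}\varepsilon^{-1/s} \lesssim \Big(\frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\varepsilon^{-1/s}.
\end{aligned}
\tag{4.16}
\]

Here we have used the contractive nature of the L2 AFEM and the fact that $\frac{\kappa}{d}E_{0,k-1} \simeq \frac{\kappa}{d}\eta_{0,k-1} \ge \varepsilon$.

If $\eta_{1,k,\phi} \le \varepsilon$, the algorithm terminates and no further refinements are made.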

In this case the a posteriori bounds (3.1) and (3.2) along with (4.16) yield (4.15). Assume now that $\eta_{1,k,\phi} > \varepsilon$. (2.7), (3.1), (3.3), and (3.22) guarantee that for any mesh $\mathcal T_j$, j ≥ k, produced by Part 2 of the pre-controlled pollution strategy,

\[
\frac{\kappa}{d}\eta_{0,j}(\mathcal T_j) \lesssim \varepsilon. \tag{4.17}
\]

Thus for some $\tilde\kappa \simeq \kappa$ (with constant depending on that hidden in "≲" in (4.17)),

\[
\frac{\tilde\kappa}{d}\eta_{0,j}(\mathcal T_j) \le \varepsilon < \eta_{1,j,\phi}(\mathcal T_j), \quad j \ge k \text{ with } \eta_{1,j,\phi}(\mathcal T_j) > \varepsilon. \tag{4.18}
\]

Thus each step of Part 2 of the pre-controlled pollution marking strategy is an instance of the strictly alternating strategy (3.30) with $\tilde\kappa$ replacing κ and the first line of the strategy chosen. The contraction result (3.31) implies that if $\tilde\kappa$ is sufficiently large, then for any i with $\eta_{1,j,\phi} > \varepsilon$ for all k ≤ j ≤ i, $\eta_{1,i+1,\phi}^2 \le \alpha^{2(i+1-k)}(\gamma^{-1}E_k^2 + \eta_k^2)$. In particular, there is M ≥ k + 1 with $\eta_{1,M-1,\phi} > \varepsilon$ but $\eta_{1,M,\phi} \le \varepsilon$, and the algorithm terminates in a finite number of steps as asserted. (3.1), (3.2), (3.4), and (4.18) imply for k ≤ j < M − 1 and $\tilde\kappa$ sufficiently large

\[
\eta_{1,j,\phi}(\mathcal T_j) \simeq E_{j,\phi} \simeq E_{j,\phi} + \frac{\tilde\kappa}{d}E_{0,j}.
\]

Since $E_{M-1,\phi} \simeq \eta_{1,M-1,\phi} > \varepsilon$, the contraction result (3.31) then yields for $\tilde\kappa$ (and thus κ) sufficiently large

\[
E_{i,\phi} \gtrsim \alpha^{-M+1+i}E_{M-1,\phi} \gtrsim \alpha^{-M+1+i}\varepsilon, \quad k \le i \le M-1. \tag{4.19}
\]

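We next modify the proof of Theorem 1 to take into account that (4.13) always occurs for k ≤ i ≤ M − 1, and never (4.4). Thus employing (4.13) and inserting (4.19), we compute using (4.14) that

\[
\begin{aligned}
\#\mathcal T_M - \#\mathcal T_k &\lesssim \sum_{i=k}^{M-1}\#\mathcal M_i \lesssim |u|^{1/s}_{\mathcal A^s_{\phi,k}}\sum_{i=k}^{M-1}E_{i,\phi}^{-1/s}\\
&\lesssim |u|^{1/s}_{\mathcal A^s_{\phi,k}}\sum_{i=k}^{M-1}\alpha^{\frac{M-1-i}{s}}\varepsilon^{-1/s} \lesssim |u|^{1/s}_{\mathcal A^s_{\phi,k}}\varepsilon^{-1/s}.
\end{aligned}
\tag{4.20}
\]

Here $\mathcal A^s_{\phi,k}$ is the φ-weighted local energy approximation class with rate s beginning at the mesh $\mathcal T_k$ instead of at the mesh $\mathcal T_0$, i.e., $u \in \mathcal A^s_{\phi,k}$ satisfies

\[
|u|_{\mathcal A^s_{\phi,k}} := \sup_{N\in\mathbb N} N^s \inf_{\mathcal T\in\mathbb T:\,\#\mathcal T-\#\mathcal T_k\le N}\ \inf_{v_{\mathcal T}\in S_{\mathcal T}}\|\phi\nabla(u-v_{\mathcal T})\| < \infty. \tag{4.21}
\]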

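Next let $\{\mathcal T_{i,s}\}_{i\ge 0}$ be a sequence of meshes with $\#\mathcal T_{i,s} - \#\mathcal T_0 \le i$ realizing the infima in the definition (4.2) of $\mathcal A^s_\phi$. Let also $\mathcal T_L$ be the overlay of $\mathcal T_k$ and $\mathcal T_{L,s}$, L ≥ 0. The proof of Lemma 5.2 of [8] yields $\#\mathcal T_L - \#\mathcal T_k \le \#\mathcal T_{L,s} - \#\mathcal T_0 \le L$. Since $\mathcal T_L$ is a refinement of $\mathcal T_{L,s}$, we have $\inf_{v\in S_{\mathcal T_L}}\|\phi\nabla(u-v)\| \le \inf_{v\in S_{\mathcal T_{L,s}}}\|\phi\nabla(u-v)\|$. These observations yield after comparing (4.2) and (4.21) that $|u|_{\mathcal A^s_{\phi,k}} \le |u|_{\mathcal A^s_\phi}$.

Combining this inequality with (4.16), (4.20), and $\tilde\kappa \simeq \kappa$ yields

\[
\#\mathcal T_M - \#\mathcal T_0 \le (\#\mathcal T_M - \#\mathcal T_k) + (\#\mathcal T_k - \#\mathcal T_0) \lesssim \Big(|u|_{\mathcal A^s_\phi} + \frac{\kappa}{d}|u|_{\mathcal A^s_{L_2}}\Big)^{1/s}\varepsilon^{-1/s},
\]

which gives the second inequality in (4.15). The first inequality in (4.15) follows from (4.17), $\eta_{1,M,\phi} + \frac{\kappa}{d}\eta_{0,M} \lesssim \varepsilon$, and the a posteriori bounds (3.1) and (3.2).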

4.4 Optimality of AFEM with integrated marking strategy

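We briefly consider a third marking strategy, which we call integrated Dörfler marking. Given θ > 0, we seek a minimal subset $\mathcal M_i \subset \mathcal T_i$ so that

\[
\sum_{T\in\mathcal M_i}\Big[\eta_{1,i,\phi}(T)^2 + \frac{\kappa^2}{d^2}\eta_{0,i}(T)^2\Big] \ge \theta^2\Big(\eta_{1,i,\phi}^2 + \frac{\kappa^2}{d^2}\eta_{0,i}^2\Big).
\]

Philosophically this strategy views the error $\|\phi\nabla(u-u_i)\|_{D_d} + \frac{\kappa}{d}\|u-u_i\|_\Omega$ as a single integrated quantity.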
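A minimal sketch of integrated Dörfler marking follows. It assumes elementwise indicators stored in dictionaries keyed by comparable element ids, with elements outside Dd simply absent from `eta1` (contributing zero local energy indicator); the function name and data layout are hypothetical.

```python
def integrated_mark(eta1, eta0, theta, kappa, d):
    """Doerfler marking on the combined squared indicator
    eta1(T)^2 + (kappa / d)^2 * eta0(T)^2."""
    weight = (kappa / d) ** 2
    combined_sq = {
        elem: eta1.get(elem, 0.0) ** 2 + weight * eta0.get(elem, 0.0) ** 2
        for elem in set(eta1) | set(eta0)
    }
    target = theta ** 2 * sum(combined_sq.values())
    marked, acc = [], 0.0
    # Largest combined indicators first; ties are broken by element id.
    for elem, val_sq in sorted(combined_sq.items(), key=lambda kv: (-kv[1], kv[0])):
        if acc >= target:
            break
        marked.append(elem)
        acc += val_sq
    return marked
```

In contrast to the strictly alternating strategy, a single marked set serves both estimators here, so no case distinction on the dominant estimator is needed.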


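The contraction property (3.31) and an optimality result similar to (4.3) are modestly easier to obtain for integrated Dörfler marking than the corresponding results proved above for the strictly alternating marking strategy. For example, such results require only the Céa's Lemma (3.24) and not the sharper result (3.23). Also, here the natural approximation class is

\[
\begin{aligned}
\mathcal A^s_{\phi,L_2} = \Big\{ v \in H^1_0(\Omega) : -\Delta v \in L_2(\Omega),\ |v|_{\mathcal A^s_{\phi,L_2}} := \sup_{N\in\mathbb N} N^s \inf_{\mathcal T\in\mathbb T:\,\#\mathcal T-\#\mathcal T_0\le N}\ \inf_{v_{\mathcal T}\in S_{\mathcal T}}\Big[&\|\phi\nabla(v-v_{\mathcal T})\|_\Omega + \mathrm{osc}_\phi(\mathcal T)\\
&+ \frac{\kappa}{d}\big(\|v-v_{\mathcal T}\|_\Omega + \mathrm{osc}_{L_2}(\mathcal T)\big)\Big] < \infty \Big\}.
\end{aligned}
\]

It can be shown that given s > 0, $u \in \mathcal A^s_{\phi,L_2} \Leftrightarrow u \in \mathcal A^s_\phi$ and $u \in \mathcal A^s_{L_2}$. Thus AFEM with integrated, strictly alternating, and pre-controlled pollution marking strategies all yield the same optimal convergence rates.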

4.5 Discussion of marking strategies and algorithm parameters

The above marking strategies differ both in the complexity of their analysis and their potential applicability. The integrated Dörfler marking strategy is the simplest of the above three both in its form and its analysis. We are not aware of our scheme being used directly in practice, but the "$10^{-6}$" version of the Bank-Holst parallel adaptive strategy [20], [21] is an integrated marking strategy which controls pollution using a global energy estimator multiplied by a small constant instead of an L2 estimator.

The structure of the strictly alternating strategy bears some resemblance to Stevenson's algorithm in [8] for controlling global energy errors, in which the estimator and data oscillation were reduced separately. While we control two estimators here instead of estimator and oscillation, the basic structure of the algorithm is similar, and there are modest parallels between the analyses as well. Alternating marking applies more readily to the case where Ω is a nonconvex polyhedron. There $\eta_0$ does not reliably control the pollution term $\|u-u_h\|$, and definition of efficient and reliable estimators for the L2 error requires explicit consideration of corner singularities [15], [16]. This is already cumbersome in two space dimensions and quite difficult in $\mathbb R^3$. In [34], [11] we advocated controlling the pollution error in $L_p$ for some p sufficiently large to ensure that standard $L_p$ error estimators are reliable and efficient on any polyhedral domain; p > 4 suffices. An integrated marking strategy is not easy to implement in this case because local energy and global $L_p$ indicators accumulate differently over the mesh. A number of technical issues prevent immediate extension of our analysis here to $L_p$ pollution control with p ≠ 2, but our results still give some theoretical foundation to such algorithms.

The pre-controlled pollution strategy was essentially proposed by Xu and Zhou in the context of the two-grid parallel adaptive algorithms discussed in the introduction; cf. Sections 4 and 5 of [14] for more details.

Theorem 1 and Corollary 1 both require that the marking parameter θ and grading parameter µ are sufficiently small and that the pollution weighting factor κ is sufficiently large, with the threshold values depending on the initial mesh and domain but not on essential quantities such as u or the current mesh refinement level. The corresponding global energy optimality result Theorem 5.11 of [2] also requires that the marking parameter θ be sufficiently small (cf. Assumption 5.8 of that work), with the threshold value similarly depending on shape regularity properties of the initial mesh but not on other essential quantities. The theoretical dependencies of our AFEM input parameters are thus quite similar to the global energy case, with the addition that ours depend on the domain Ω via an $H^2$ regularity estimate because we require a duality argument to control the pollution error. We discuss our practical parameter choices in the following section.

5 Examples and Computational Experiments

In this section we present a number of examples which illustrate our theory.

5.1 Problem parameters: Algorithmic inputs and test problem

We first discuss our choices of κ, µ, and θ. In our computations we do not enforce the mild mesh grading condition explicitly (beyond enforcing shape regularity, which as explained in §2.2 already restricts the mesh grading) and see no degradation in convergence as a result. As in standard global energy AFEM, θ is chosen to be moderately small (e.g., 0.25) with no ill effects observed on the convergence rate. Finally, our experiments indicate that taking κ to be roughly in the range from 0.1 to 1 yielded the most robust algorithmic performance, indicating that the most natural choice of this parameter (roughly unit sized) is reasonable for practical purposes. Observed convergence rates may degrade if κ ≪ 1. Numerical experiments of course do not allow us to conclude that κ must in fact be sufficiently large in order to guarantee optimal convergence rates, but they at the least indicate that constants in our estimates degrade as κ → 0, as should be expected. Optimal convergence bounds with fixed constant are maintained if κ ≫ 1, but the error notion for which optimality is achieved includes $\frac{\kappa}{d}\|u-u_h\|_\Omega$ and de-emphasizes the local energy portion of the error as κ is increased. Thus taking κ too large may degrade the effective performance of the algorithm in controlling the local energy error since doing so leads to unnecessary refinement in $\Omega\setminus D_d$. A different but parallel negative effect occurs if θ is taken to be too small in either our local or standard global energy AFEM, as then optimal convergence rates are guaranteed but algorithmic efficiency is degraded because unnecessarily many adaptive steps are used to achieve them. Finally, as we explain below, refinement based only on local energy estimators (κ = 0) leads to convergence of the algorithm to the incorrect solution.

In all of our examples we solve Poisson's problem on the convex polyhedron Ω pictured in Figure 1. The largest edge opening angle in Ω, which occurs at the edge e forming the right tip of the "wedge", is $\frac{7\pi}{8}$. We solve −∆u = f in Ω, u = 0 on ∂Ω for various choices of f. By standard theory of elliptic boundary value problems on polyhedral domains [35], [36], u has a singularity of the form $r^{8/7}$ at e except for special choices of f. Here r is the distance to the edge e. This is the strongest edge or vertex singularity induced in solutions to Poisson's problem on Ω and thus will generally limit convergence rates.

Fig. 1 Left: Computational domain Ω with initial mesh. Right: Profile view of Ω with commonly used versions of D (dark shading) and Dd (all shaded area) indicated.

Below we differentiate between the optimal convergence rate of the AFEM and the generally best possible rate for a given polynomial degree k. By generally best possible rate we mean k/n (here, k/3) when measuring the error in energy norms and (k + 1)/n when measuring the error in the L2 norm. Even optimal AFEM do not always achieve these rates because the class of meshes is not sufficiently rich to do so, and these rates are exceeded for example when u lies in the finite element space. Following [12], except for special choices of f we have $u \in \mathcal A^s_{L_2}$ with $s = \min\{\frac{k+1}{3}, \frac{15}{7} - \varepsilon\}$ for any ε > 0, and for the choice of Dd pictured in Figure 1, $u \in \mathcal A^s_\phi$ with $s = \min\{\frac{k}{3}, \frac{8}{7} - \varepsilon\}$ for any ε > 0. Thus for large enough k the optimal rate is not equal to the generally best possible rate. Anisotropic meshes are necessary to recover higher convergence rates.

All computations below were carried out using the toolbox ALBERTA [37]. We did not enforce the mild grading condition explicitly and observed no degradation in convergence rates as a result.

5.2 Example 1: Dominant pollution error

A fundamental heuristic of local error analysis is that because pollution errors are measured in a weaker norm, they are "of higher order" or "negligible" (cf. Remark 3.10 of [14]). However, pollution errors may in fact limit adaptive convergence rates as compared with local energy errors. If we let f = 1 and $D_d \subset\subset \Omega$, then $u|_{D_d}$ is smooth and $u \in \mathcal A^{k/3}_\phi$ for any k ≥ 1. On the other hand, we at best have $u \in \mathcal A^{15/7-\varepsilon}_{L_2}$, so for k ≥ 7 the pollution error limits the convergence rate. (We did not program this example because ALBERTA only contains polynomials of degree k ≤ 4.) This example may appear somewhat artificial because it involves an unusually high polynomial degree, but similar effects can be observed when measuring pollution errors in $L_p$ norms on nonconvex polyhedral domains for finite element spaces of low polynomial degree.
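The threshold k ≥ 7 can be checked with exact rational arithmetic. The sketch below simply compares the local energy rate k/3 with the L2 rate min{(k + 1)/3, 15/7} quoted in §5.1, ignoring the arbitrarily small loss ε; the function and constant names are hypothetical.

```python
from fractions import Fraction

# Cap on the L2 approximation rate caused by the edge singularity (from §5.1).
L2_SINGULAR_CAP = Fraction(15, 7)


def example1_rates(k, n=3):
    """Return (local energy rate, L2 pollution rate), ignoring the -eps loss."""
    energy = Fraction(k, n)  # u is smooth on D_d in Example 1
    pollution = min(Fraction(k + 1, n), L2_SINGULAR_CAP)
    return energy, pollution


def first_pollution_limited_degree(max_k=50):
    """Smallest polynomial degree at which the pollution rate is the smaller one."""
    for k in range(1, max_k + 1):
        energy, pollution = example1_rates(k)
        if pollution < energy:
            return k
    return None
```

For k = 6 the rates are 2 and 15/7, so the local energy term still governs; for k = 7 they are 7/3 and 15/7, so the pollution term becomes the bottleneck.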


Fig. 2 Reduction of the φ-weighted error for a domain including the wedge singularity (cutoff φ) and not including the wedge singularity (cutoff 1 − φ). [Plot: Log(η1,i,φ + (κ/d) η0,i) versus Log(DOF); curves labeled "Cutoff: φ" and "Cutoff: 1 − φ", with reference slopes −8/7 and −4/3.]

5.3 Example 2: Effects of singularity on local energy convergence rates

First let D and Dd be as depicted in Figure 1. Then $d = \frac{1}{3}\cot(\frac{7\pi}{8})$. φ is a cubic spline in the x-variable which is 1 on D, positive on Dd, and 0 outside of Dd (the white area). The initial mesh in Figure 1 is compatible with φ as required in §2.4. Let f(x) = 1, k = 4, and κ = 1. Because Dd abuts e we predict as in §5.1 that $u \in \mathcal A^s_\phi$ only for $s < \frac{8}{7}$, while $u \in \mathcal A^s_{L_2}$ for all $s \le \frac{5}{3}$. In our experiments we employed the strictly alternating marking strategy with κ = 1. From (4.3) we predict for κ sufficiently large that our AFEM yields $E_{i,\phi} + \frac{\kappa}{d}E_{0,i} \lesssim \#DOF^{-8/7+\varepsilon}$ for any ε > 0, where DOF is the number of unknowns in the FEM system. We do not know the exact solution u, but because there is no data oscillation $\eta_{1,i,\phi} + \frac{\kappa}{d}\eta_{0,i} \simeq E_{i,\phi} + \frac{\kappa}{d}E_{0,i}$. Thus we measure asymptotic decrease of the former quantity.

We also carried out the same experiment with φ replaced by 1 − φ so that D consisted of the white area in Figure 1 and Dd the union of the white and lightly shaded areas. On this version of Dd, u is sufficiently regular to obtain $u \in \mathcal A^{4/3}_\phi$. Asymptotic error decrease is pictured in Figure 2; least-squares fit of error data for iterations above $10^6$ system unknowns shows convergence rates of about 1.19 when the cutoff is taken to be φ (slightly above the predicted rate of 8/7 − ε) and of about 1.30 when the cutoff is 1 − φ (slightly below the predicted rate of 4/3).
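A least-squares rate fit of the kind just described can be sketched as follows. The sketch fits a line to the points (log DOF, log error) above a cutoff and returns the negated slope; the function name is hypothetical and the data shown are synthetic, generated from an exact power law, not the measurements reported in the paper.

```python
import math


def fitted_rate(dofs, errors, min_dof=0.0):
    """Negated least-squares slope of log(error) against log(DOF),
    using only data points with DOF >= min_dof."""
    pts = [(math.log(n), math.log(err))
           for n, err in zip(dofs, errors) if n >= min_dof]
    if len(pts) < 2:
        raise ValueError("need at least two data points")
    x_mean = sum(x for x, _ in pts) / len(pts)
    y_mean = sum(y for _, y in pts) / len(pts)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in pts)
    sxx = sum((x - x_mean) ** 2 for x, _ in pts)
    return -sxy / sxx


# Synthetic data (not the paper's measurements): error = 3 * DOF^(-8/7).
dofs = [10 ** p for p in range(3, 8)]
errors = [3.0 * n ** (-8 / 7) for n in dofs]
rate = fitted_rate(dofs, errors, min_dof=1e6)
```

Restricting the fit to the largest meshes, as in the text, reduces the influence of pre-asymptotic behavior on the estimated rate.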

With slight modification, the latter example also may be used to demonstrate the advantages of local energy refinement instead of global energy refinement if information about u is only needed on a subdomain. In particular, the rate of convergence for the standard global energy AFEM for this example will be 8/7 − ε for any ε > 0, as for local energy AFEM employing φ. Thus controlling only the (1 − φ)-weighted energy error yields an improved rate of convergence.


5.4 Example 3: Effects of κ on convergence rates

Theorem 1 and Corollary 1 require that κ exceed some threshold value not depending on essential quantities in order to obtain optimal convergence rates. Our goal in this example is to provide experimental guidance concerning the threshold value for κ. The theoretical and practical placement of κ in our theory is very similar to that of the Dörfler marking parameter θ both in our AFEM and in the corresponding global energy theory of [2]. In particular, optimal convergence is guaranteed independent of u and other essential quantities if each of these parameters satisfies a threshold condition, but the threshold value which guarantees optimal convergence would be difficult to calculate theoretically. A reasonable range of parameter values is determined by carrying out numerical experiments for various values of these parameters and observing convergence rates.

We employ two different test solutions in order to test the effects of κ. First we take u to be a known piecewise polynomial test solution. Dd is as in Figure 1, and on elements intersecting Dd, u is a degree-4 polynomial and thus lies in the discrete space when k = 4. Refining the mesh on Dd thus yields no increase in accuracy, and an efficient AFEM for controlling ‖φ∇(u − ui)‖ will direct all refinement to $\Omega\setminus D_d$. Figure 3 contains meshes for $\kappa = 1, 10^{-5}, 10^{-7}, 0$. For κ = 1 no elements are refined in D, and in Dd only enough are refined to maintain mesh conformity. As κ is decreased we see more refinement inside of Dd and thus decreased computational efficiency.

Figure 4 displays decrease of ‖φ∇(u − ui)‖ for each choice of κ. We only show one error line for $\kappa = 10, 1, 10^{-1}, 10^{-2}$ because all were the same. We also only display ‖φ∇(u − ui)‖ and not the corresponding L2 component because the L2 component of the error is weighted differently for each choice of κ. Because $u \in \mathcal A^s_\phi$ for any s, and $u \in \mathcal A^{5/3}_{L_2}$, we expect ‖φ∇(u − ui)‖ to decrease with rate at least 5/3 (possibly greater, since we do not measure the L2 error). This is in fact observed with 0.01 ≤ κ ≤ 10 and $\kappa = 10^{-5}$. It is possible that such a rate would be observed asymptotically with $\kappa = 10^{-7}$ but is not seen in our computations, and practically speaking pollution is not controlled sufficiently to yield an efficient computation with this choice of κ. Thus for this experiment κ is clearly "sufficiently large" as in Theorem 1 when $\kappa \ge 10^{-2}$, and $\kappa = 10^{-5}$ suffices but shows degraded algorithmic performance. Finally, when κ = 0 pollution is not controlled at all and so we cannot expect that ui → u even on Dd. This is reflected in the lack of error reduction observed in Figure 4. The significance of this case is discussed more in Example 4 below.

Our second test solution u is a known function having the expected singularity at the wedge tip e in Figure 1. The range of y-values for the domain Ω in Figure 1 is 0 < y < 1. We first let φ be a cubic spline in y that is 1 for y ≥ 0.75 and 0 for y ≤ 0.5 so that D consists of the rear quarter of Ω (and incorporates a quarter of the wedge tip e) and Dd consists of the rear half of Ω. While pollution must be controlled in this case, most refinement will be directed to the singularity at the wedge tip e contained in D. Reduction in $\|\phi\nabla(u-u_h)\|_{D_d}$ is depicted in Figure 5 for several values of κ. For $\kappa = 1, 10^{-1}, 10^{-2}$ convergence is clearly optimal (here $O(DOF^{-8/7+\varepsilon})$), while optimal convergence with degraded constant is observed for κ = 10. Thus while optimal convergence is still obtained, increasing κ too much leads to less efficient approximation of u in Dd.


Fig. 3 Left to right: Computational meshes for κ = 1 (49917 tetrahedra), $\kappa = 10^{-5}$ (43469 tetrahedra), $\kappa = 10^{-7}$ (49558 tetrahedra), and κ = 0 (51014 tetrahedra).

Fig. 4 Reduction of the φ-weighted local energy error for various values of κ: Known polynomial solution, D includes wedge tip. [Plot: Log(‖φ∇(u − ui)‖) versus Log(DOF); curves for κ = 0, $\kappa = 10^{-7}$, $\kappa = 10^{-5}$, and $10^{-2} \le \kappa \le 10$, with reference slope −5/3.]

Fig. 5 Reduction of the φ-weighted local energy error for various values of κ: Known singular solution, D includes portion of wedge tip. [Plot: Log(‖φ∇(u − ui)‖) versus Log(DOF); curves for $\kappa = 10, 1, 10^{-1}, 10^{-2}$, with reference slope −8/7.]

Using the same known singular test solution, we next employed a cutoff function φ which is 1 in the white and lightly shaded portion of the right illustration in Figure 1 and decreases from 1 to 0 as x moves from left to right across the darkly shaded portion. Thus D is the white and lightly shaded portion of Ω, and Dd = Ω, and φ > 0 on Ω but is 0 at the singularity at e. Even though the singularity at e is contained in Dd, an optimal convergence rate of $O(DOF^{-4/3})$ is still possible for our AFEM because φ(x) → 0 as x → e, thus counteracting the effect of the singularity. This is in fact observed in Figure 6 for sufficiently large κ. Optimal convergence is observed for $\kappa = 10, 1, 10^{-1}$. Performance degenerates slightly as κ is lowered to $10^{-2}$ and significantly as it is decreased to $10^{-3}$.

Combining data from Figures 4, 5, and 6, we find that optimal convergence is consistently obtained for $\kappa \ge 0.1$, with the best results in all three examples being obtained for $\kappa = 0.1$. Thus while some examples maintain good performance when $\kappa$ is taken to be smaller and others when $\kappa$ is taken to be larger, consistently good performance for $\kappa \approx 0.1$ is seen across examples having a different singularity structure (no singularity, having a singularity in $D$, and having a singularity lying at an edge where $\phi = 0$).
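Observed convergence rates such as those read off the log-log plots can be estimated by a least-squares fit of $\log(\text{error})$ against $\log(\mathrm{DOF})$. The following is a minimal sketch; the function name `loglog_slope` is hypothetical, and the data in the usage example are synthetic (generated to decay exactly like $\mathrm{DOF}^{-4/3}$), not values from the experiments reported here.

```python
import math


def loglog_slope(dofs, errors):
    """Least-squares slope of log10(error) versus log10(DOF)."""
    xs = [math.log10(d) for d in dofs]
    ys = [math.log10(e) for e in errors]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic data only: error = 3 * DOF^(-4/3).
    dofs = [1e2, 1e3, 1e4, 1e5, 1e6]
    errors = [3.0 * d ** (-4.0 / 3.0) for d in dofs]
    print(loglog_slope(dofs, errors))
```

In practice one would typically fit only the last several adaptive iterations, since the preasymptotic range can distort the slope.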

The last example (Figure 6) also highlights an advantage of employing a weighted energy AFEM instead of the predecessor Xu-Zhou algorithm even if error control over simple subdomains is desired. Because $D_d = \Omega$ in this example, employing the Xu-Zhou estimate (1.3) in an AFEM would be equivalent to employing a global energy AFEM. Numerical experiments which we do not display here for the sake of space show that $\|\nabla(u - u_h)\|_D = O(\mathrm{DOF}^{-8/7+\epsilon})$ when employing the Xu-Zhou AFEM in this way, while our weighted energy estimator yields $\|\nabla(u - u_h)\|_D = O(\mathrm{DOF}^{-4/3})$. Thus the sharper cutoff provided by $\phi$ in our a posteriori error indicators may have practical advantages even when error control over simple subdomains is desired.
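To give a rough sense of what the difference between these two rates means in practice, the following back-of-the-envelope calculation may help. It assumes the error behaves exactly like $C\,\mathrm{DOF}^{-s}$ with the same constant $C$ before and after refinement and ignores the $\epsilon$ in the exponent; both are idealizations.

```latex
% If error = C * DOF^{-s}, reducing the error by a factor of 10
% requires multiplying DOF by 10^{1/s}.
\[
  s = \tfrac{8}{7}: \quad 10^{7/8} \approx 7.5,
  \qquad
  s = \tfrac{4}{3}: \quad 10^{3/4} \approx 5.6 .
\]
```

Under these assumptions, each tenfold error reduction costs roughly 7.5 times as many degrees of freedom at rate $8/7$, against roughly 5.6 times as many at rate $4/3$.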


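[Figure 6: plot of $\mathrm{Log}(\|\phi\nabla(u - u_i)\|)$ against $\mathrm{Log}(\mathrm{DOF})$, with curves for $\kappa = 10$, $\kappa = 1$, $\kappa = 10^{-1}$, $\kappa = 10^{-2}$, and $\kappa = 10^{-3}$, and a reference line of slope $-4/3$.]

Fig. 6 Reduction of the $\phi$-weighted local energy error for various values of $\kappa$: Known singular solution, $D_d$ abuts wedge tip but $\phi = 0$ on $e$.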

5.5 Example 4: Refinement without pollution control

It was proved in [11] that when refinement is based only on local error estimators supported on $D_d$, the adaptive iterates satisfy $u_i \to u_\infty$ in $H_0^1(\Omega)$, and $u_\infty$ satisfies $-\Delta u_\infty = -\Delta u = f$ in $D_d$. Thus refining based only on $\eta_{1,i,\phi}$ yields a solution that differs from $u$ in $D_d$ by a harmonic function. The natural singular function generated at $e$ is harmonic in a neighborhood of $e$ including $D_d$, so $u_\infty$ may be singular at $e$ even when $u$ is not. Recalling our above example involving a known piecewise polynomial test solution (Figures 3 and 4), our numerical experiments indicate that this is so. First, the refinement pattern in the rightmost mesh displayed in Figure 3, corresponding to $\kappa = 0$, displays heavy refinement at $e$ as is typical for singular solutions. Figure 7 also shows clear $O(\mathrm{DOF}^{-8/7})$ decrease of $\eta_{1,i,\phi}$ as the mesh is refined. Standard residual efficiency arguments yield $\eta_{1,i,\phi} \le \|\nabla(u_\infty - u_i)\|_{L_2(D_d)} + \mathrm{osc}$, where osc is standard (higher-order) energy data oscillation. We can conclude that $\|\nabla(u_\infty - u_i)\|_{L_2(D_d)} \gtrsim \mathrm{DOF}^{-8/7}$ also. This does not provide conclusive evidence that $u_\infty$ has the asserted edge singularity at $e$ because we have not established that $u_i \to u_\infty$ optimally, but these data provide strong evidence that $u_\infty$ has a singularity not present in the actual solution.
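The step from the efficiency bound to the lower bound on the error can be spelled out as follows. This is a sketch under two labeled assumptions: that the observed decay of the indicator holds as a two-sided bound, $\eta_{1,i,\phi} \ge c\,\mathrm{DOF}^{-8/7}$, and that the oscillation is of strictly higher order, $\mathrm{osc} \le C\,\mathrm{DOF}^{-8/7-\delta}$ for some $\delta > 0$. The constants $c$, $C$, and $\delta$ are introduced here for illustration only.

```latex
\begin{align*}
  \|\nabla(u_\infty - u_i)\|_{L_2(D_d)}
    &\ge \eta_{1,i,\phi} - \mathrm{osc} \\
    &\ge c\,\mathrm{DOF}^{-8/7} - C\,\mathrm{DOF}^{-8/7-\delta} \\
    &= \mathrm{DOF}^{-8/7}\bigl(c - C\,\mathrm{DOF}^{-\delta}\bigr)
     \;\ge\; \tfrac{c}{2}\,\mathrm{DOF}^{-8/7}
     \quad \text{once } \mathrm{DOF}^{\delta} \ge 2C/c .
\end{align*}
```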

Recall that in some Bank-Holst adaptive parallel algorithms [24], only optimality of the generated meshes is desired and not necessarily optimal reduction of local errors. Our above example demonstrates that some form of pollution control is necessary in order to guarantee optimality of the generated meshes even for purely elliptic problems. Also, here our AFEM has detected a singularity that is not present in the actual solution. Such an AFEM might similarly fail to detect a singularity that is present in the actual solution and thus underrefine instead of overrefining. More generally, $u_\infty$ can have different singular function coefficients (stress intensity factors) than $u$ and thus can generate incorrect refinement patterns. Finally, this example demonstrates that reversing the pre-controlled pollution strategy (that is, first reducing the local energy error and then the pollution error) cannot be optimal because it involves first generating a mesh designed to resolve the wrong solution.

[Figure 7: plot of $\mathrm{Log}(\eta_{1,i,\phi})$ against $\mathrm{Log}(\mathrm{DOF})$ for $\kappa = 0$, with a reference line of slope $-8/7$.]

Fig. 7 Reduction of $\eta_{1,i,\phi}$ when $\kappa = 0$.

5.6 Discussion of other forms of pollution control

We have discussed two extreme cases of adaptive pollution control: no pollution control, and completely rigorous control via $L_p$ norms. Many other methods for controlling pollution have been suggested and are generally effective in practice; cf. [20], [21], [22], [23], [24]. Some, such as the "$10^{-6}$" method of Bank and Holst, employ an ad-hoc weighting of error indicators from outside of $D_d$. Others fall into the general class of duality-based adaptivity. Generally speaking such methods have theoretical backing but are not completely rigorous, so analyzing them is difficult and they are unlikely to be optimal in the sense discussed here. However, our results may provide guidance for such algorithms concerning choices of marking strategies and parameters which are likeliest to lead to optimal results.
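The idea of an ad-hoc weighting of indicators from outside $D_d$ can be illustrated with a small sketch. It assumes that indicators on elements outside $D_d$ are simply multiplied by a fixed factor ($10^{-6}$ here) before marking, and it pairs this with a bulk (Dörfler-type) marking step. The pairing, the function names `downweight_outside` and `dorfler_mark`, and the parameter `theta` are illustrative assumptions, not a description of the Bank-Holst implementation or of the marking strategies analyzed in this paper.

```python
def downweight_outside(indicators, inside, weight=1e-6):
    """Scale indicators on elements outside the target region by `weight`.

    `inside[k]` is True when element k lies in the region of interest.
    """
    return [eta if keep else weight * eta
            for eta, keep in zip(indicators, inside)]


def dorfler_mark(indicators, theta=0.5):
    """Greedy bulk marking: pick the largest indicators until their squared
    sum reaches theta times the total squared sum."""
    order = sorted(range(len(indicators)),
                   key=lambda k: indicators[k], reverse=True)
    total = sum(eta * eta for eta in indicators)
    marked, acc = [], 0.0
    for k in order:
        if acc >= theta * total:
            break
        marked.append(k)
        acc += indicators[k] ** 2
    return marked


if __name__ == "__main__":
    # Hypothetical indicators: elements 0 and 1 lie inside the region,
    # elements 2 and 3 lie outside and carry much larger raw indicators.
    eta = [0.1, 0.2, 5.0, 4.0]
    inside = [True, True, False, False]
    print(dorfler_mark(eta))                              # marks an outside element
    print(dorfler_mark(downweight_outside(eta, inside)))  # marks an inside element
```

With the weighting applied, elements outside the region are marked only once the inside indicators have become comparably small, which is how such a scheme retains some (non-rigorous) control of pollution.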

References

1. P. Binev, W. Dahmen, R. DeVore, Numer. Math. 97(2), 219 (2004)
2. J. Cascon, C. Kreuzer, R.H. Nochetto, K.G. Siebert, SIAM J. Numer. Anal. 46(5), 2524 (2008)
3. L. Diening, C. Kreuzer, R. Stevenson, ArXiv e-prints (2013)
4. W. Dorfler, SIAM J. Numer. Anal. 33(3), 1106 (1996)
5. K. Mekchay, R.H. Nochetto, SIAM J. Numer. Anal. 43(5), 1803 (2005)
6. P. Morin, R.H. Nochetto, K.G. Siebert, SIAM Rev. 44(4), 631 (2002). Revised reprint of "Data oscillation and convergence of adaptive FEM" [SIAM J. Numer. Anal. 38 (2000), no. 2, 466–488 (electronic); MR1770058 (2001g:65157)]
7. R. Stevenson, SIAM J. Numer. Anal. 42(5), 2188 (2005)
8. R. Stevenson, Found. Comput. Math. 7(2), 245 (2007)
9. M.S. Mommer, R. Stevenson, SIAM J. Numer. Anal. 47(2), 861 (2009)
10. P. Morin, K.G. Siebert, A. Veeser, Math. Models Methods Appl. Sci. 18(5), 707 (2008)
11. A. Demlow, SIAM J. Numer. Anal. 48(2), 470 (2010). DOI 10.1137/080741458. URL http://dx.doi.org/10.1137/080741458
12. A. Demlow, R. Stevenson, Numer. Math. 117(2), 185 (2011). DOI 10.1007/s00211-010-0349-9. URL http://dx.doi.org/10.1007/s00211-010-0349-9
13. J.A. Nitsche, A.H. Schatz, Math. Comp. 28, 937 (1974)
14. J. Xu, A. Zhou, Math. Comp. 69(231), 881 (2000)
15. X. Liao, R.H. Nochetto, Numer. Methods Partial Differential Equations 19(4), 421 (2003)
16. T.P. Wihler, Int. J. Numer. Anal. Model. 4(1), 100 (2007)
17. Y. He, J. Xu, A. Zhou, J. Li, Numer. Math. 109(3), 415 (2008). DOI 10.1007/s00211-008-0141-2. URL http://dx.doi.org/10.1007/s00211-008-0141-2
18. M. Mu, J. Xu, SIAM J. Numer. Anal. 45(5), 1801 (2007). DOI 10.1137/050637820. URL http://dx.doi.org/10.1137/050637820
19. J. Xu, A. Zhou, Math. Comp. 70(233), 17 (2001). DOI 10.1090/S0025-5718-99-01180-1. URL http://dx.doi.org/10.1090/S0025-5718-99-01180-1
20. R.E. Bank, M. Holst, SIAM J. Sci. Comput. 22(4), 1411 (2000)
21. R.E. Bank, M. Holst, SIAM Rev. 45(2), 291 (2003). Reprinted from SIAM J. Sci. Comput. 22 (2000), no. 4, 1411–1443 [MR1797889]
22. R.E. Bank, J.S. Ovall, SIAM J. Sci. Comput. 29(4), 1511 (2007)
23. D. Estep, M. Holst, M. Larson, SIAM J. Sci. Comput. 26(4), 1314 (2005)
24. M. Holst, in Domain decomposition methods in science and engineering (Natl. Auton. Univ. Mex., Mexico, 2003), pp. 63–78 (electronic)
25. A. Demlow, J. Guzman, A.H. Schatz, Math. Comp. 80(273), 1 (2011). DOI 10.1090/S0025-5718-2010-02353-1. URL http://dx.doi.org/10.1090/S0025-5718-2010-02353-1
26. R. Stevenson, Math. Comp. 77(261), 227 (2008). DOI 10.1090/S0025-5718-07-01959-X. URL http://dx.doi.org/10.1090/S0025-5718-07-01959-X
27. A. Demlow, D. Leykekhman, A.H. Schatz, L.B. Wahlbin, Math. Comp. (2011)
28. R.H. Nochetto, M. Paolini, C. Verdi, Math. Comp. 57(195), 73 (1991)
29. I. Babuska, J. Osborn, Numer. Math. 34(1), 41 (1980)
30. K. Eriksson, Math. Models Methods Appl. Sci. 4(3), 313 (1994)
31. S.C. Brenner, L.R. Scott, The mathematical theory of finite element methods, Texts in Applied Mathematics, vol. 15, 3rd edn. (Springer, New York, 2008)
32. L.R. Scott, S. Zhang, Math. Comp. 54(190), 483 (1990)
33. P. Binev, W. Dahmen, R. DeVore, P. Petrushev, Serdica Math. J. 28(4), 391 (2002). Dedicated to the memory of Vassil Popov on the occasion of his 60th birthday
34. A. Demlow, Math. Comp. 76(257), 19 (2007)
35. M. Dauge, Elliptic boundary value problems on corner domains, Lecture Notes in Mathematics, vol. 1341 (Springer-Verlag, Berlin, 1988)
36. V.G. Maz'ya, J. Rossmann, Elliptic Equations in Polyhedral Domains, Mathematical Surveys and Monographs, vol. 162 (American Mathematical Society, Providence, RI, 2010)
37. A. Schmidt, K.G. Siebert, Design of adaptive finite element software, Lecture Notes in Computational Science and Engineering, vol. 42 (Springer-Verlag, Berlin, 2005). The finite element toolbox ALBERTA, With 1 CD-ROM (Unix/Linux)
