Numerical AnalysisAn Advanced Course
Gheorghe Coman, Ioana Chiorean, Teodora Catinas
Contents

1 Preliminaries 9
  1.1 Linear spaces 9
  1.2 Examples of function spaces 20
  1.3 The Peano Kernel Theorems 24
  1.4 Best approximation in Hilbert spaces 27
  1.5 Orthogonal polynomials 31

2 Approximation of functions 41
  2.1 Preliminaries 41
  2.2 Univariate interpolation operators 47
    2.2.1 Polynomial interpolation operators 48
    2.2.2 Polynomial spline interpolation operators 55
    2.2.3 Spline operators of Lagrange type 69
    2.2.4 Spline operators of Hermite type 73
    2.2.5 Spline operators of Birkhoff type 76
    2.2.6 Rational interpolation operators 77
      2.2.6.1 An iterative method to construct a rational interpolation operator 77
      2.2.6.2 Univariate Shepard interpolation 83
        2.2.6.2.1 The univariate Shepard-Lagrange operator 89
        2.2.6.2.2 The univariate Shepard-Hermite operator 94
        2.2.6.2.3 The univariate Shepard-Taylor operator 95
        2.2.6.2.4 The univariate Shepard-Birkhoff operator 96
    2.2.7 Least squares approximation 97
  2.3 Some elements of multivariate interpolation operators 103
    2.3.1 Interpolation on a rectangular domain 103
    2.3.2 Interpolation on a simplex domain 110
    2.3.3 Bivariate Shepard interpolation 112
      2.3.3.1 The bivariate Shepard-Lagrange operator 115
      2.3.3.2 The bivariate Shepard-Hermite operator 120
      2.3.3.3 The bivariate Shepard-Birkhoff operator 126
  2.4 Uniform approximation 128
    2.4.1 The Weierstrass theorems 129
    2.4.2 The Stone-Weierstrass theorem 135
    2.4.3 Positive linear operators 137
      2.4.3.1 Popoviciu-Bohman-Korovkin Theorem 138
      2.4.3.2 Modulus of continuity 139
      2.4.3.3 Popoviciu-Bohman-Korovkin Theorem in a quantitative form 139

3 Numerical integration 145
  3.1 Optimal quadrature formulas 146
    3.1.1 Optimality in the sense of Sard 147
    3.1.2 Optimality in the sense of Nikolski 161
  3.2 Some optimality aspects in numerical integration of bivariate functions 167

4 Numerical linear systems 175
  4.1 Triangular systems of equations 175
    4.1.1 The Blocks version 176
  4.2 Direct methods for solving linear systems of equations 177
    4.2.1 Factorization techniques 177
      4.2.1.1 The Gauss method 178
      4.2.1.2 The Gauss-Jordan method 180
      4.2.1.3 The LU methods 180
      4.2.1.4 The Doolittle algorithm 181
      4.2.1.5 Block LU decomposition 183
      4.2.1.6 The Cholesky decomposition 184
      4.2.1.7 The QR decomposition 186
      4.2.1.8 The WZ decomposition 191
  4.3 Iterative methods 194
    4.3.1 The Jacobi method 196
    4.3.2 The Gauss-Seidel method 196
    4.3.3 The method of relaxation 199
    4.3.4 The Chebyshev method 199
    4.3.5 The method of Steepest Descent 202
    4.3.6 The multigrid method 204

5 Non-linear equations on R 209
  5.1 One-step methods 210
    5.1.1 Successive approximations method 213
    5.1.2 Inverse interpolation method 215
  5.2 Multistep methods 221
  5.3 Combined methods 227

6 Non-linear equations on Rn 231
  6.1 Successive approximation method 231
  6.2 Newton's method 234

7 Parallel calculus 239
  7.1 Introduction 239
  7.2 Parallel methods to solve triangular systems of linear equations 240
    7.2.1 The column-sweep algorithm 240
    7.2.2 The recurrent-product algorithm 242
  7.3 Parallel approaches of direct methods for solving systems of linear equations 243
    7.3.1 The parallel Gauss method 243
    7.3.2 The parallel Gauss-Jordan method 245
    7.3.3 The parallel LU method 246
    7.3.4 The parallel QR method 247
    7.3.5 The parallel WZ method 247
  7.4 Parallel iterative methods 249
    7.4.1 Parallel Jacobi method 249
    7.4.2 Parallel Gauss-Seidel algorithm 251
    7.4.3 The parallel SOR method 253
  7.5 The parallelization of the multigrid method 254
    7.5.1 Telescoping parallelizations 254
    7.5.2 Nontelescoping parallelizations 256
Preface
Numerical Analysis is a branch of Mathematics which deals with the constructive
solution of problems formulated and studied in other fields of Mathematics.
Illustrative examples include the solution of operatorial equations, the
polynomial approximation of real-valued functions and the numerical
integration of functions.
At the same time, Numerical Analysis has a theoretical aspect of its own. When
analyzing a numerical method, one studies the error of the numerical solution
and obtains results concerning convergence and error bounds.
This book presents both theoretical and numerical aspects of basic problems
of Numerical Analysis, with a special emphasis on optimal approximation.
It is addressed, first of all, to undergraduate, master and Ph.D. students
in Mathematics, but also to students of Computer Science and Engineering,
and to all readers interested in Numerical Analysis and Approximation Theory.
The book is structured in 7 chapters.
Chapter 1 is dedicated to some preliminary notions: linear spaces, the Peano
kernel theorems, best approximation and orthogonal polynomials.
Chapter 2 treats the approximation of functions by univariate interpolation
operators, respectively by linear and positive operators. Some elements of
multivariate approximation are also presented.
Chapter 3 provides some applications to numerical integration of functions.
Chapters 4-6 study several methods for solving operatorial equations (linear
systems, nonlinear equations on R, nonlinear equations on Rn). Chapter 7 is a
brief introduction to parallel calculus.
The content of the book is essentially based on the authors' own work and research.
The authors
Chapter 1
Preliminaries
1.1 Linear spaces
Definition 1.1.1. Let K be a field and V a given set. We say that V is a K-linear space (a linear space over K) if there exist an internal operation

"+" : V × V → V, (v1, v2) ↦ v1 + v2,

and an external operation

"·" : K × V → V, (α, v) ↦ αv,

that satisfy the following conditions:

1) (V, +) is a commutative group;
2) a) (α + β)v = αv + βv, ∀α, β ∈ K, ∀v ∈ V,
   b) α(v1 + v2) = αv1 + αv2, ∀α ∈ K, ∀v1, v2 ∈ V,
   c) (αβ)v = α(βv), ∀α, β ∈ K, ∀v ∈ V,
   d) 1 · v = v, ∀v ∈ V.
The elements of V are called vectors and those of K are called scalars.
Definition 1.1.2. Let V be a K-linear space. A subset V1 ⊂ V is called a subspace of V if V1 is stable with respect to the internal and the external operation, i.e.,

∀v1, v2 ∈ V1 =⇒ v1 + v2 ∈ V1,
∀α ∈ K, ∀v ∈ V1 =⇒ αv ∈ V1,
10 Chapter 1. Preliminaries
and V1 is a K-linear space with respect to the induced operations.
Remark 1.1.3. 1) If V1 is a subspace of V then

0 ∈ V1

and

∀v ∈ V1 =⇒ −v ∈ V1;

2) For every linear space V, the subsets {0} and V are subspaces;
3) If V1 and V2 are subspaces of V then

V1 + V2 = {v1 + v2 | v1 ∈ V1, v2 ∈ V2}

is a subspace.
Definition 1.1.4. Let V and V′ be two K-linear spaces. A function f : V → V′ is called a linear transformation or a linear operator if:

1) f(v1 + v2) = f(v1) + f(v2), ∀v1, v2 ∈ V,
2) f(αv) = αf(v), ∀α ∈ K, ∀v ∈ V.
Remark 1.1.5. 1) f : V → V′ is a linear operator if and only if

f(αv1 + βv2) = αf(v1) + βf(v2), ∀α, β ∈ K, ∀v1, v2 ∈ V.

2) If f is a linear operator then

f(0) = 0 and f(−v) = −f(v), ∀v ∈ V.

3) If f : V → V′ is a linear operator then

Ker f = {v ∈ V | f(v) = 0}

is a subspace of V, called the kernel or the null space of f, and

Im f := f(V) = {f(v) | v ∈ V}

is a subspace of V′, called the image of f.
Remark 1.1.6. An R-linear space (K = R) is called a real linear space and a C-linear space (K = C) is called a complex linear space.
Definition 1.1.7. If V is a K-linear space then an application f : V → K is called a functional (a real functional if K = R, respectively a complex functional if K = C).
Definition 1.1.8. A nonnegative real functional p defined on a linear space V, i.e.,

p : V → [0, ∞),

with the properties

1) p(v1 + v2) ≤ p(v1) + p(v2), ∀v1, v2 ∈ V,
2) p(αv) = |α| p(v), ∀α ∈ R, ∀v ∈ V,

is called a seminorm on V. If, in addition,

3) p(v) = 0 =⇒ v = 0,

then p is called a norm and V is called a normed linear space.
A norm is denoted by ‖·‖.

Remark 1.1.9. If V is a normed linear space then we have:

a) The properties:

1) and 3) =⇒ (‖v‖ = 0 ⇔ v = 0),
2) =⇒ |‖v1‖ − ‖v2‖| ≤ ‖v1 − v2‖;

b) d(v1, v2) = ‖v1 − v2‖, for v1, v2 ∈ V, is a distance on V.

c) A sequence (vn)_{n∈N} of elements of V is said to be convergent in the norm of V to an element v ∈ V if

lim_{n→∞} ‖vn − v‖ = 0.

d) A sequence (vn)_{n∈N} is called a Cauchy sequence if for any ε > 0 there exists a natural number N = N_ε such that

‖vm − vn‖ < ε, for m, n > N_ε.
Remark 1.1.10. A normed linear space is a metric space.
Remark 1.1.11. A convergent sequence is a Cauchy sequence.
Definition 1.1.12. In a normed linear space V, if any Cauchy sequence isconvergent, then V is called a complete space.
Definition 1.1.13. A complete normed linear space is called a Banach space.
Definition 1.1.14. Let V be a K-linear space. An application

〈·, ·〉 : V × V → K

with the properties:

1) 〈v1, v2〉 = \overline{〈v2, v1〉} (the bar denoting complex conjugation),
2) 〈v1 + v2, v3〉 = 〈v1, v3〉 + 〈v2, v3〉,
3) 〈αv1, v2〉 = α 〈v1, v2〉, ∀α ∈ K,
4) v ≠ 0 =⇒ 〈v, v〉 > 0,

is called the inner product of v1 and v2.

Remark 1.1.15. The following statements hold:

1) and 4) =⇒ 〈v, v〉 is real and strictly positive for v ≠ 0;

1) and 3) =⇒ 〈0, v〉 = 〈v, 0〉 = 0; in particular, 〈0, 0〉 = 0;

1) and 2) =⇒ 〈v1, v2 + v3〉 = \overline{〈v2 + v3, v1〉} = \overline{〈v2, v1〉} + \overline{〈v3, v1〉} = 〈v1, v2〉 + 〈v1, v3〉;

1) and 3) =⇒ 〈v1, αv2〉 = \overline{α} 〈v1, v2〉.

If K = R then

〈αv1, v2〉 = 〈v1, αv2〉 = α 〈v1, v2〉.

Definition 1.1.16. A linear space with an inner product is called an inner-product space.
Definition 1.1.17. In an inner-product space, the norm defined by

‖v‖ = √〈v, v〉

is called the norm induced by the inner product.

Definition 1.1.18. A normed linear space with the norm induced by an inner product is called a pre-Hilbert space.

In any pre-Hilbert space the following inequality, called the Schwarz inequality, holds with respect to the inner-product norm:

|〈v1, v2〉| ≤ ‖v1‖ · ‖v2‖.

The inner-product norm also satisfies the triangle inequality:

‖v1 + v2‖ ≤ ‖v1‖ + ‖v2‖.
Definition 1.1.19. A complete pre-Hilbert space is called a Hilbert space.
A Banach space with the norm induced by an inner-product is a Hilbertspace.
In any Hilbert space V the following equality, called the parallelogram identity (or parallelogram law), holds:

‖v1 + v2‖² + ‖v1 − v2‖² = 2(‖v1‖² + ‖v2‖²), ∀v1, v2 ∈ V. (1.1.1)

Moreover, a Banach space whose norm satisfies the parallelogram identity is a Hilbert space.
Definition 1.1.20. Two elements v1, v2 in an inner-product space V are orthogonal if

〈v1, v2〉 = 0.

A set V′ ⊂ V is called an orthogonal set if

〈v1, v2〉 = 0, ∀v1, v2 ∈ V′, v1 ≠ v2.

If, in addition,

〈v, v〉 = 1, ∀v ∈ V′,

then V′ is called an orthonormal set.
Definition 1.1.21. Let V be a linear space over R or C. A linear operator P : V → V is called a projector if

P ∘ P = P (or, shortly, P² = P).

For example, the identity operator

I : V → V, I(v) = v,

and the null operator

0 : V → V, 0(v) = 0,

are projectors.
Definition 1.1.22. Let P1, P2 : V → V be projectors. If

P1 ∘ P2 = P2 ∘ P1

then P1 and P2 are called commuting projectors.
Remark 1.1.23. 1) "∘" is also denoted by "·" or is simply omitted, i.e., P1 ∘ P2 is denoted by P1 · P2 or P1P2.

2) If P is a projector then P^C := I − P is a projector. Indeed,

(P^C)² = (I − P)(I − P) = I² − P − P + P² = I − P = P^C.

3) The projectors P and P^C are commuting projectors. We have

PP^C = P(I − P) = P − P² = P − P = 0,
P^C P = (I − P)P = P − P² = P − P = 0,

hence

PP^C = 0 = P^C P.

4) (P^C)^C = P.

5) If P is a projector then Im P^C = Ker P.

6) If P and Q are commuting projectors then PQ is a projector. Indeed,

(PQ)² = PQPQ = PPQQ = P²Q² = PQ.

7) If P, Q are commuting projectors then P^C, Q^C are commuting projectors. We have

P^C Q^C = (I − P)(I − Q) = I² − IQ − PI + PQ = I − Q − P + PQ,
Q^C P^C = (I − Q)(I − P) = I² − IP − QI + PQ = I − P − Q + PQ.

It follows that P^C Q^C = Q^C P^C.

8) If P, Q are commuting projectors then P^C Q^C and (P^C Q^C)^C are projectors, with

(P^C Q^C)^C = P + Q − PQ := P ⊕ Q,

which is the Boolean sum of P and Q. Indeed, we have already seen that

P^C Q^C = I − P − Q + PQ,

so

(P^C Q^C)^C = I − (I − P − Q + PQ) = P + Q − PQ.
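A concrete check of items 6)-8) (the 3×3 matrices and helper routines below are illustrative choices, not from the text): take for P and Q the commuting coordinate projections of R³ onto the x-y and the y-z plane, and verify that (P^C Q^C)^C equals the Boolean sum P + Q − PQ and is itself a projector.

```python
# Commuting projectors of R^3 represented as 3x3 matrices (illustrative example):
# P projects onto the x-y plane, Q onto the y-z plane, and PQ = QP.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(3)] for i in range(3)]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
Q = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]

Pc, Qc = sub(I, P), sub(I, Q)              # complementary projectors P^C, Q^C
bool_sum = sub(add(P, Q), matmul(P, Q))    # P (+) Q = P + Q - PQ
lhs = sub(I, matmul(Pc, Qc))               # (P^C Q^C)^C = I - P^C Q^C

print(lhs == bool_sum)                         # True
print(matmul(bool_sum, bool_sum) == bool_sum)  # True: P (+) Q is a projector
```

Here P ⊕ Q turns out to be the identity, since the two coordinate planes together span R³.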
Let V be a linear space over R or C and let

L = {P_j | j ∈ J}

be a set of commuting projectors of V, i.e., the P_j : V → V are linear operators with

P_j² = P_j,
P_j P_k = P_k P_j, for j, k ∈ J.
Proposition 1.1.24. The set

L′ = {P_j P_k | j, k ∈ J}

includes L, i.e.,

L ⊆ L′, (1.1.2)

and it consists of commuting projectors.

Proof. For all j ∈ J we have P_j² = P_j, i.e.,

P_j P_j = P_j,

which implies P_j ∈ L′, so (1.1.2) is proved. For any j, k ∈ J denote

P_{jk} := P_j P_k,

so P_{jk} is a projector, being the composition of two commuting projectors. For every j, k, i, m ∈ J we have

P_{jk} P_{im} = P_j P_k P_i P_m = P_j P_i P_k P_m = P_i P_j P_k P_m = P_i P_j P_m P_k = P_i P_m P_j P_k = P_{im} P_{jk},

i.e., the projectors of L′ commute.
Proposition 1.1.25. The set

L″ = {P_{jk} ⊕ P_{im} | j, k, i, m ∈ J}

includes L′, i.e.,

L′ ⊆ L″, (1.1.3)

and it consists of commuting projectors.
Proof. For all j, k ∈ J we have

P_{jk} ⊕ P_{jk} = P_{jk} + P_{jk} − P_{jk} P_{jk} = P_{jk} + P_{jk} − P_{jk}² = P_{jk},

which implies P_{jk} ∈ L″, so (1.1.3) is proved. From the commutativity of the projectors of L′ it follows that the elements of L″ are projectors. For the commutativity of the projectors of L″ we have to prove that if Pi, i = 1, ..., 4, are commuting projectors then

(P1 ⊕ P2)(P3 ⊕ P4) = (P3 ⊕ P4)(P1 ⊕ P2). (1.1.4)

Indeed, we get

(P1 ⊕ P2)(P3 ⊕ P4) = (P1 + P2 − P1P2)(P3 + P4 − P3P4)
= P1P3 + P1P4 − P1P3P4 + P2P3 + P2P4 − P2P3P4 − P1P2P3 − P1P2P4 + P1P2P3P4.

On the other hand,

(P3 ⊕ P4)(P1 ⊕ P2) = (P3 + P4 − P3P4)(P1 + P2 − P1P2)
= P3P1 + P3P2 − P3P1P2 + P4P1 + P4P2 − P4P1P2 − P3P4P1 − P3P4P2 + P3P4P1P2.

Since Pi, i = 1, ..., 4, are commuting projectors, we have

(P3 ⊕ P4)(P1 ⊕ P2) = P1P3 + P1P4 − P1P3P4 + P2P3 + P2P4 − P2P3P4 − P1P2P3 − P1P2P4 + P1P2P3P4,

and (1.1.4) is proved.
Remark 1.1.26. From the definitions of L′ and L′′, it follows that if L isfinite then L′ and L′′ are also finite.
Starting from a set L of commuting projectors, we have built L′ and L″, which are sets of commuting projectors with

L ⊆ L′ ⊆ L″.

Denoting

L_1 = L″

and inductively defining

L_{n+1} = L″_n, (n ≥ 1),

one obtains an increasing sequence of sets of commuting projectors:

L_1 ⊆ L_2 ⊆ ... ⊆ L_n ⊆ L_{n+1} ⊆ .... (1.1.5)

Let us denote

L̄ = ⋃_{n=1}^{∞} L_n. (1.1.6)
Proposition 1.1.27. The set of commuting projectors L̄ is stable with respect to composition and Boolean sum, i.e.,

∀P, Q ∈ L̄ =⇒ PQ ∈ L̄,
∀P, Q ∈ L̄ =⇒ P ⊕ Q ∈ L̄.

Proof. If P, Q ∈ L̄ then, from (1.1.6), it follows that there exist n1, n2 ∈ N∗ such that P ∈ L_{n1}, Q ∈ L_{n2}. Assuming n1 ≤ n2 (the other case is symmetric), from (1.1.5) one obtains

L_{n1} ⊆ L_{n2} =⇒ P, Q ∈ L_{n2} =⇒ PQ ∈ L′_{n2} ⊆ L″_{n2} = L_{n2+1}

and

P ⊕ Q ∈ L″_{n2} = L_{n2+1}.

So PQ = QP and PQ, P ⊕ Q ∈ L̄.

Remark 1.1.28. The set L̄ is the smallest set of commuting projectors that includes L and is stable with respect to the operations of composition and Boolean sum.
Remark 1.1.29. For P1, P2, P3 ∈ L̄, we have

P1P2 = P2P1,
P1 ⊕ P2 = P2 ⊕ P1,
(P1P2)P3 = P1(P2P3),
(P1 ⊕ P2) ⊕ P3 = P1 ⊕ (P2 ⊕ P3),

which are easy to verify.
Proposition 1.1.30. The relation "≤", defined on L̄ by

P1 ≤ P2 ⇔ P1P2 = P1,

is an order relation (reflexive, transitive and antisymmetric).

Proof. Reflexivity: for all P ∈ L̄, PP = P (P is a projector), so P ≤ P.

Transitivity: for all P1, P2, P3 ∈ L̄ we have

P1 ≤ P2, P2 ≤ P3 =⇒ P1P2 = P1, P2P3 = P2 =⇒ P1P2P2P3 = P1P2
=⇒ P1P2P3 = P1P2 =⇒ P1P3 = P1 =⇒ P1 ≤ P3.

Antisymmetry: for all P1, P2 we have

P1 ≤ P2, P2 ≤ P1 =⇒ P1P2 = P1, P2P1 = P2 =⇒ P1 = P2 (since P1P2 = P2P1).
Proposition 1.1.31. (L̄, ≤) is a lattice, and

inf(P1, P2) = P1P2, (1.1.7)
sup(P1, P2) = P1 ⊕ P2, ∀P1, P2 ∈ L̄. (1.1.8)

Proof. We have

P1(P1P2) = P1²P2 = P1P2 =⇒ P1P2 ≤ P1,
P2(P1P2) = P1P2² = P1P2 =⇒ P1P2 ≤ P2,

so P1P2 is a lower bound for P1 and P2. Let P3 ∈ L̄ be another lower bound of P1 and P2. It follows that

P3 ≤ P1, P3 ≤ P2 =⇒ P3P1 = P3, P3P2 = P3 =⇒ (P3P1)(P3P2) = P3²
=⇒ P3²(P1P2) = P3² =⇒ P3(P1P2) = P3 =⇒ P3 ≤ P1P2.

Hence, P1P2 is the greatest lower bound of P1 and P2, which implies that

inf(P1, P2) = P1P2,

and (1.1.7) is proved.
To prove (1.1.8), we consider

P1(P1 ⊕ P2) = P1(P1 + P2 − P1P2) = P1² + P1P2 − P1²P2 = P1 + P1P2 − P1P2 = P1 =⇒ P1 ≤ P1 ⊕ P2.

In the same way it can be proved that

P2 ≤ P1 ⊕ P2.

So P1 ⊕ P2 is an upper bound for P1 and P2. Now, let P3 ∈ L̄ be another upper bound of P1 and P2. We have

P1 ≤ P3 =⇒ P1P3 = P1,
P2 ≤ P3 =⇒ P2P3 = P2,

and, respectively,

(P1 ⊕ P2)P3 = (P1 + P2 − P1P2)P3 = P1P3 + P2P3 − P1P2P3 = P1 + P2 − P1P2 = P1 ⊕ P2.

Therefore,

(P1 ⊕ P2)P3 = P1 ⊕ P2 =⇒ P1 ⊕ P2 ≤ P3,

i.e., P1 ⊕ P2 is the least upper bound of P1 and P2, and (1.1.8) is also proved. In conclusion, (L̄, ≤) is a lattice.
Proposition 1.1.32. The lattice (L̄, ≤) is distributive, i.e.,

P1(P2 ⊕ P3) = (P1P2) ⊕ (P1P3), ∀P1, P2, P3 ∈ L̄. (1.1.9)

Proof. We have

P1(P2 ⊕ P3) = P1(P2 + P3 − P2P3) = P1P2 + P1P3 − P1P2P3,
(P1P2) ⊕ (P1P3) = P1P2 + P1P3 − P1P2P1P3 = P1P2 + P1P3 − P1P2P3,

and the distributivity is proved.
Remark 1.1.33. The relation (1.1.9) represents the distributivity of inf with respect to sup. In any lattice this distributivity is equivalent to the distributivity of sup with respect to inf, so (1.1.9) is equivalent to

P1 ⊕ (P2P3) = (P1 ⊕ P2)(P1 ⊕ P3), ∀P1, P2, P3 ∈ L̄.

Remark 1.1.34. The lattice (L̄, ≤) is generated by L.

Definition 1.1.35. The elements of the set L are called the generators of the lattice (L̄, ≤).
1.2 Examples of function spaces

1. The space C^p[a, b], p ∈ N, of functions f : [a, b] → R continuously differentiable up to order p, inclusively. For p = 0 one gets the space C[a, b] of continuous functions on [a, b].

C^p[a, b] is a linear space over R with respect to the usual addition of functions and multiplication of functions by real numbers. It is also normed, with the norm defined by

‖f‖ = Σ_{k=0}^{p} max_{a≤x≤b} |f^{(k)}(x)|.

Moreover, C^p[a, b] is a complete space, thus it is a Banach space.

2. The space P_m of polynomials of degree at most m. One has P_m ⊂ C^∞[a, b]. We denote by P^n_m the set of polynomials in n variables of global degree at most m, and by P^n_{m_1,...,m_n} the set of polynomials in the n variables x_1, ..., x_n of degree m_k with respect to the variable x_k, for k = 1, ..., n.

3. The space L^p[a, b], p ∈ R, p ≥ 1, of p-Lebesgue integrable functions on [a, b].

Recall that a function f : [a, b] → R is said to be p-Lebesgue integrable on [a, b] if |f|^p is Lebesgue integrable on [a, b]. The class f̃ of a p-Lebesgue integrable function f on [a, b] is the set of all functions g which are p-Lebesgue integrable on [a, b] and equivalent to f (denoted g ∼ f), i.e., g = f almost everywhere on [a, b].

The set of equivalence classes defined above is a linear space with respect to the addition

f̃_1 + f̃_2 = g̃ iff f_1 + f_2 ∼ g,

and the multiplication

λf̃ = h̃ iff λf ∼ h.

Identifying the class f̃ with its representative f ∈ f̃, we define the norm

‖f‖ = ( ∫_a^b |f(x)|^p dx )^{1/p}. (1.2.1)

The space L^p[a, b] is the linear space of the equivalence classes introduced above, and it is normed with the norm given by (1.2.1).
Moreover, L^p[a, b] is a complete space, thus it is a Banach space.

For p = 2, i.e., on L²[a, b], one can define the inner product

〈f, g〉_2 = ∫_a^b f(x) g(x) dx.

It follows that L²[a, b] is a Hilbert space with respect to the norm induced by this inner product, namely

‖f‖_2 = ( ∫_a^b f²(x) dx )^{1/2}.
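A quick numerical illustration of the Schwarz inequality in L²[0, 1] (the pair f(x) = x, g(x) = eˣ and the midpoint-rule integrator are illustrative choices, not from the text):

```python
# Check |<f, g>_2| <= ||f||_2 ||g||_2 in L^2[0, 1] for f(x) = x, g(x) = exp(x).
import math

def midpoint_integral(h, a=0.0, b=1.0, n=100_000):
    # composite midpoint rule, accurate enough for this smooth integrand
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

f = lambda x: x
g = lambda x: math.exp(x)

inner_fg = midpoint_integral(lambda x: f(x) * g(x))   # exact value is 1 here
norm_f = math.sqrt(midpoint_integral(lambda x: f(x) ** 2))
norm_g = math.sqrt(midpoint_integral(lambda x: g(x) ** 2))

print(abs(inner_fg) <= norm_f * norm_g)   # True
```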
4. Let w be a positive integrable function on [a, b], with

∫_a^b w(x) dx < ∞.

Let L²_w[a, b] be the set of functions f such that

∫_a^b w(x) |f(x)|² dx < ∞.

On L²_w[a, b] one defines the inner product

〈f, g〉_{w,2} = ∫_a^b w(x) f(x) g(x) dx, f, g ∈ L²_w[a, b],

and the induced norm

‖f‖_{w,2} = ( ∫_a^b w(x) |f(x)|² dx )^{1/2}.

(L²_w[a, b], ‖·‖_{w,2}) is a Hilbert space.
5. The space H^m[a, b], m ∈ N∗, of functions f ∈ C^{m−1}[a, b] with f^{(m−1)} absolutely continuous on [a, b].

Notice that every function f ∈ H^m[a, b] can be represented using Taylor's formula with the remainder in integral form:

f(x) = Σ_{k=0}^{m−1} (x−a)^k / k! · f^{(k)}(a) + ∫_a^x (x−t)^{m−1} / (m−1)! · f^{(m)}(t) dt. (1.2.2)

So it immediately follows that for f, g ∈ H^m[a, b] and λ ∈ R one has f + g, λf ∈ H^m[a, b].
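Formula (1.2.2) can be checked numerically; in the sketch below the choices f(x) = eˣ, m = 3, [a, b] = [0, 1] and the midpoint-rule integrator are illustrative assumptions, not from the text.

```python
# Taylor's formula with integral remainder for f = exp on [0, 1], a = 0, m = 3:
#   f(x) = sum_{k=0}^{2} (x-a)^k/k! f^{(k)}(a) + \int_a^x (x-t)^2/2! f'''(t) dt.
import math

def midpoint_integral(g, a, b, n=100_000):
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, x, m = 0.0, 1.0, 3
# every derivative of exp at a = 0 equals 1
poly_part = sum((x - a) ** k / math.factorial(k) for k in range(m))
remainder = midpoint_integral(
    lambda t: (x - t) ** (m - 1) / math.factorial(m - 1) * math.exp(t), a, x)

print(poly_part + remainder, math.exp(x))   # both ~ 2.718281828
```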
H^m[a, b] is a linear space.

6. The space H^{m,2}[a, b], m ∈ N∗, of functions f ∈ H^m[a, b] with f^{(m)} ∈ L²[a, b].

From (1.2.2) it follows that a function f ∈ H^{m,2}[a, b] is uniquely determined by the values f^{(k)}(a), k = 0, ..., m−1, and by f^{(m)}. Consequently, one can consider the expression
〈f, g〉_{m,2} = ∫_a^b f^{(m)}(x) g^{(m)}(x) dx + Σ_{k=0}^{m−1} f^{(k)}(a) g^{(k)}(a)

as an inner product, since the four properties of the inner product are verified in this case.

Let ‖·‖_{m,2} be the norm induced by this inner product, namely

‖f‖²_{m,2} = ‖f^{(m)}‖²_2 + Σ_{k=0}^{m−1} (f^{(k)}(a))².

In H^{m,2}[a, b] consider a Cauchy sequence (f_n)_{n∈N}. Since (f_n^{(m)})_{n∈N} is a Cauchy sequence in L²[a, b] and each (f_n^{(k)}(a))_{n∈N}, k = 0, ..., m−1, is a Cauchy sequence in R, it follows that (f_n^{(m)})_{n∈N} converges to a function g ∈ L²[a, b] and every sequence (f_n^{(k)}(a))_{n∈N} converges to a number c_k ∈ R, k = 0, ..., m−1 (L²[a, b] and R being complete spaces). Let

f(x) = Σ_{k=0}^{m−1} c_k (x−a)^k / k! + ∫_a^x (x−t)^{m−1} / (m−1)! · g(t) dt.
Since f ∈ H^m[a, b] and f^{(m)} = g ∈ L²[a, b], it follows that f ∈ H^{m,2}[a, b]. But

lim_{n→∞} ‖f_n − f‖²_{m,2} = lim_{n→∞} ( ‖f_n^{(m)} − f^{(m)}‖²_2 + Σ_{k=0}^{m−1} [f_n^{(k)}(a) − f^{(k)}(a)]² ) = 0,

so

lim_{n→∞} f_n = f

in the ‖·‖_{m,2} norm. It follows that H^{m,2}[a, b] is complete, thus it is a Hilbert space.

7. The Sard-type space B_{pq}(a, c) (p, q ∈ N, p + q = m) of the functions f : D → R, D = [a, b] × [c, d], satisfying:
1. f^{(p,q)} ∈ C(D);
2. f^{(m−j,j)}(·, c) ∈ C[a, b], j < q;
3. f^{(i,m−i)}(a, ·) ∈ C[c, d], i < p.

8. The Sard-type space B^r_{pq}(a, c) (p, q ∈ N, p + q = m, r ∈ R, r ≥ 1) of the functions f : D → R, D = [a, b] × [c, d], satisfying:
1. f^{(i,j)} ∈ C(D), i < p, j < q;
2. f^{(m−j−1,j)}(·, c) is absolutely continuous on [a, b] and f^{(m−j,j)}(·, c) ∈ L^r[a, b], j < q;
3. f^{(i,m−i−1)}(a, ·) is absolutely continuous on [c, d] and f^{(i,m−i)}(a, ·) ∈ L^r[c, d], i < p;
4. f^{(p,q)} ∈ L^r(D).

As in the univariate case, if f ∈ B_{pq}(a, c) then this function admits the Taylor representation:
f(x, y) = Σ_{i+j<m} (x−a)^i / i! · (y−c)^j / j! · f^{(i,j)}(a, c) + (R_m f)(x, y), (1.2.3)

with

(R_m f)(x, y) = Σ_{j<q} (y−c)^j / j! ∫_a^b (x−s)_+^{m−j−1} / (m−j−1)! · f^{(m−j,j)}(s, c) ds
+ Σ_{i<p} (x−a)^i / i! ∫_c^d (y−t)_+^{m−i−1} / (m−i−1)! · f^{(i,m−i)}(a, t) dt
+ ∫∫_D (x−s)_+^{p−1} / (p−1)! · (y−t)_+^{q−1} / (q−1)! · f^{(p,q)}(s, t) ds dt,

where

z_+ = z, when z ≥ 0, and z_+ = 0, when z < 0.
Remark 1.2.1. The Taylor-type formula (1.2.3) holds for any domain Ω having the property that there exists a point (a, c) ∈ Ω such that the rectangle [a, x] × [c, y] ⊆ Ω for all (x, y) ∈ Ω. In this case formula (1.2.3) is written as

f(x, y) = Σ_{i+j<m} (x−a)^i / i! · (y−c)^j / j! · f^{(i,j)}(a, c) + (R_m f)(x, y),

with

(R_m f)(x, y) = Σ_{j<q} (y−c)^j / j! ∫_{I_1} (x−s)_+^{m−j−1} / (m−j−1)! · f^{(m−j,j)}(s, c) ds
+ Σ_{i<p} (x−a)^i / i! ∫_{I_2} (y−t)_+^{m−i−1} / (m−i−1)! · f^{(i,m−i)}(a, t) dt
+ ∫∫_Ω (x−s)_+^{p−1} / (p−1)! · (y−t)_+^{q−1} / (q−1)! · f^{(p,q)}(s, t) ds dt,

where

I_1 = {(x, y) ∈ R² | y = c} ∩ Ω,
I_2 = {(x, y) ∈ R² | x = a} ∩ Ω.

Examples of such domains are the triangle

T_h = {(x, y) ∈ R² | x ≥ 0, y ≥ 0, x + y ≤ h},

with (a, c) = (0, 0), and any disc centered at (a, c).
1.3 The Peano Kernel Theorems
Theorem 1.3.1. Let L : H^m[a, b] → R be a linear functional which commutes with the definite integral operator, for example of the form

L(f) = Σ_{i=0}^{m−1} ∫_a^b f^{(i)}(x) dμ_i(x),

where the μ_i are functions of bounded variation on [a, b].
If Ker L = P_{m−1} then

L(f) = ∫_a^b K_m(t) f^{(m)}(t) dt, (1.3.1)

where

K_m(t) = L_x( (x−t)_+^{m−1} / (m−1)! ).

The notation L_x(f) means that L is applied to f with respect to the variable x.
Proof. Since f ∈ H^m[a, b], one can apply Taylor's formula and get

f = P_{m−1} + R_{m−1},

where P_{m−1} ∈ P_{m−1} and

R_{m−1}(x) = ∫_a^b (x−t)_+^{m−1} / (m−1)! · f^{(m)}(t) dt.

Thus,

L(f) = L(P_{m−1}) + L(R_{m−1}).

Since L(P_{m−1}) = 0 (Ker L = P_{m−1}), we obtain

L(f) = L_x( ∫_a^b (x−t)_+^{m−1} / (m−1)! · f^{(m)}(t) dt ),

and, as L commutes with the definite integral operator, we have

L(f) = ∫_a^b L_x( (x−t)_+^{m−1} / (m−1)! ) f^{(m)}(t) dt.

The function K_m is called the Peano kernel.
Corollary 1.3.2. If the kernel K_m does not change sign on the interval [a, b] and f^{(m)} is continuous on [a, b], then

L(f) = (1/m!) L(e_m) f^{(m)}(ξ), a ≤ ξ ≤ b, (1.3.2)

where e_k(x) = x^k.
Proof. Applying the mean value theorem to (1.3.1), one obtains the representation

L(f) = f^{(m)}(ξ) ∫_a^b K_m(t) dt, (1.3.3)

in which the integral does not depend on f. Taking f = e_m and using e_m^{(m)} = m!, (1.3.3) yields

∫_a^b K_m(t) dt = (1/m!) L(e_m).

Substituting the integral in (1.3.3) by the above expression, one gets (1.3.2).
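As a worked illustration of Theorem 1.3.1 and Corollary 1.3.2 (the functional below, the trapezoidal-rule error on [0, 1], is an illustrative choice, not from the text), take L(f) = ∫_0^1 f(x) dx − (f(0) + f(1))/2. Then Ker L = P_1, so m = 2, and K_2(t) = L_x((x − t)_+) = (1 − t)²/2 − (1 − t)/2 = −t(1 − t)/2 ≤ 0, so the kernel does not change sign.

```python
# Peano kernel check for the trapezoidal-rule error on [0, 1] (m = 2):
#   L(f) = \int_0^1 f - (f(0) + f(1))/2 = \int_0^1 K_2(t) f''(t) dt,
#   K_2(t) = -t(1 - t)/2.

def midpoint_integral(g, a, b, n=100_000):
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def L(f):
    return midpoint_integral(f, 0.0, 1.0) - (f(0.0) + f(1.0)) / 2

K2 = lambda t: -t * (1 - t) / 2

f = lambda x: x ** 3          # test function, with f''(x) = 6x
d2f = lambda x: 6 * x

lhs = L(f)                                                    # 1/4 - 1/2 = -1/4
rhs = midpoint_integral(lambda t: K2(t) * d2f(t), 0.0, 1.0)   # also -1/4

print(lhs, rhs)
```

Since K_2 ≤ 0, Corollary 1.3.2 also applies here: L(f) = (1/2!) L(e_2) f″(ξ) = −f″(ξ)/12 for some ξ ∈ [0, 1], the classical trapezoidal-rule error term.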
Theorem 1.3.3. Let f ∈ B_{pq}(a, c) and let L be the linear functional given by

L(f) = Σ_{i+j<m} ∫∫_D f^{(i,j)}(x, y) dμ_{i,j}(x, y) (1.3.4)
+ Σ_{j<q} ∫_a^b f^{(m−j,j)}(x, c) dμ_{m−j,j}(x)
+ Σ_{i<p} ∫_c^d f^{(i,m−i)}(a, y) dμ_{i,m−i}(y),

where the μ_{i,j} are functions of bounded variation on D, [a, b] and, respectively, [c, d]. If Ker L = P²_{m−1} then

L(f) = Σ_{j<q} ∫_a^b K_{m−j,j}(s) f^{(m−j,j)}(s, c) ds (1.3.5)
+ Σ_{i<p} ∫_c^d K_{i,m−i}(t) f^{(i,m−i)}(a, t) dt
+ ∫∫_D K_{p,q}(s, t) f^{(p,q)}(s, t) ds dt,

where

K_{m−j,j}(s) = L_{x,y}( (x−s)_+^{m−j−1} / (m−j−1)! · (y−c)^j / j! ),
K_{i,m−i}(t) = L_{x,y}( (x−a)^i / i! · (y−t)_+^{m−i−1} / (m−i−1)! ),
K_{p,q}(s, t) = L_{x,y}( (x−s)_+^{p−1} / (p−1)! · (y−t)_+^{q−1} / (q−1)! ).
Proof. Since f ∈ B_{pq}(a, c), we can consider formula (1.2.3). Applying the functional L to each member of this formula and taking into account the hypothesis

L(p) = 0, ∀p ∈ P²_{m−1},

we are led to

L(f) = L(R_m f).

Then, from (1.3.4), we get (1.3.5).
1.4 Best approximation in Hilbert spaces
Definition 1.4.1. Let B be a normed linear space, A ⊂ B and f ∈ B. An element g∗ ∈ A with the property that

‖f − g∗‖ ≤ ‖f − g‖, ∀g ∈ A,

is called a best approximation of f from A.

Next, we are interested in conditions for the existence and uniqueness, and in a characterization, of the best approximation in the case when B is a Hilbert space.

Theorem 1.4.2. Let B be a Hilbert space. If A ⊂ B is a nonempty, convex and closed set then for every f ∈ B there exists a unique best approximation g∗ ∈ A.
Proof. If f ∈ A then g∗ = f and the proof is over. Suppose that f ∉ A. Let

d∗ = d(f, A) = inf_{g∈A} ‖f − g‖,

and let (g_n)_{n∈N} ⊂ A be such that

lim_{n→∞} ‖f − g_n‖ = d∗.

Using the parallelogram identity (1.1.1) with v1 = f − g_n and v2 = f − g_m, one obtains

‖2f − (g_m + g_n)‖² + ‖g_m − g_n‖² = 2(‖f − g_m‖² + ‖f − g_n‖²),

or

‖g_m − g_n‖² = 2(‖f − g_m‖² + ‖f − g_n‖²) − 4 ‖f − (g_m + g_n)/2‖².

Since A is a convex set and g_m, g_n ∈ A, it follows that

(g_m + g_n)/2 ∈ A,

which implies

‖f − (g_m + g_n)/2‖² ≥ (d∗)².

So,

‖g_m − g_n‖² ≤ 2(‖f − g_m‖² + ‖f − g_n‖²) − 4(d∗)².

As

lim_{m→∞} ‖f − g_m‖ = d∗ and lim_{n→∞} ‖f − g_n‖ = d∗,

it follows that (g_n)_{n∈N} is a Cauchy sequence, hence a convergent one in B. Let g∗ = lim_{n→∞} g_n. Since A is closed, g∗ ∈ A, and d∗ = ‖f − g∗‖. Hence g∗ is a best approximation of f and the existence is proved.
Suppose now that there exist two elements g∗, h∗ ∈ A such that

‖f − g∗‖ = d∗ and ‖f − h∗‖ = d∗.

Since A is convex and g∗, h∗ ∈ A, it follows that

(g∗ + h∗)/2 ∈ A,

hence

‖2f − (g∗ + h∗)‖ = 2 ‖f − (g∗ + h∗)/2‖ ≥ 2d∗.

Using again the parallelogram identity, with v1 = f − g∗, v2 = f − h∗, one obtains

‖g∗ − h∗‖² = 2(‖f − g∗‖² + ‖f − h∗‖²) − ‖2f − (g∗ + h∗)‖² ≤ 4(d∗)² − 4(d∗)² = 0.

Therefore ‖g∗ − h∗‖ = 0, i.e., g∗ = h∗.
Theorem 1.4.3. Under the conditions of Theorem 1.4.2, g∗ ∈ A is the best approximation of f ∈ B if and only if

〈g∗ − f, g − g∗〉 ≥ 0, ∀g ∈ A. (1.4.1)

Proof. Suppose that g∗ is the best approximation of f, i.e.,

‖g∗ − f‖ ≤ ‖g − f‖, ∀g ∈ A.

The set A being convex, it follows that for all ε ∈ (0, 1], g∗ + ε(g − g∗) ∈ A, hence

‖g∗ − f‖ ≤ ‖g∗ + ε(g − g∗) − f‖.

It yields

‖g∗ − f‖² ≤ ‖g∗ − f + ε(g − g∗)‖²
= 〈g∗ − f + ε(g − g∗), g∗ − f + ε(g − g∗)〉
= ‖g∗ − f‖² + 2ε 〈g∗ − f, g − g∗〉 + ε² ‖g − g∗‖²,

which implies

2 〈g∗ − f, g − g∗〉 + ε ‖g − g∗‖² ≥ 0.

As this inequality holds for all ε ∈ (0, 1], letting ε → 0 we must have

〈g∗ − f, g − g∗〉 ≥ 0.

Now suppose that (1.4.1) holds. For g ∈ A we have

‖g − f‖² = ‖g∗ − f + (g − g∗)‖² = ‖g∗ − f‖² + ‖g − g∗‖² + 2 〈g∗ − f, g − g∗〉,

or

‖g − f‖² − ‖g∗ − f‖² = ‖g − g∗‖² + 2 〈g∗ − f, g − g∗〉.

As (1.4.1) holds, it follows that

‖g − f‖² − ‖g∗ − f‖² ≥ 0,

i.e.,

‖g − f‖ ≥ ‖g∗ − f‖, ∀g ∈ A,

and the theorem is completely proved.
Remark 1.4.4. Property (1.4.1) is called the property of pseudo-orthogonality.

Corollary 1.4.5. If A is a linear subspace of B and A_t is a translation of A, then g∗ ∈ A_t is the best approximation of f ∈ B if and only if

〈g∗ − f, g〉 = 0, ∀g ∈ A. (1.4.2)

Proof. According to Theorem 1.4.3, g∗ ∈ A_t is the best approximation of f ∈ B if and only if

〈g∗ − f, g − g∗〉 ≥ 0, ∀g ∈ A_t.

As g ranges over A_t, the element h := g − g∗ ranges over the subspace A, so the condition becomes

〈g∗ − f, h〉 ≥ 0, ∀h ∈ A.

Since A is a linear space, h ∈ A implies −h ∈ A, hence also

〈g∗ − f, −h〉 ≥ 0, ∀h ∈ A.

If 〈g∗ − f, h〉 > 0 for some h, then 〈g∗ − f, −h〉 < 0, a contradiction. Therefore 〈g∗ − f, h〉 = 0 for all h ∈ A, which is (1.4.2).
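A minimal sketch of Corollary 1.4.5 in the Hilbert space R³ with the Euclidean inner product (the subspace and the vector f are illustrative choices, not from the text): for a subspace A = span{u1, u2} with u1, u2 orthonormal, the best approximation of f from A is g∗ = 〈f, u1〉u1 + 〈f, u2〉u2, and the residual g∗ − f satisfies the pseudo-orthogonality condition (1.4.2).

```python
# Best approximation from a subspace of R^3 and the condition <g* - f, g> = 0.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u1, u2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)   # orthonormal basis of A
f = (3.0, -2.0, 5.0)

# g* = <f, u1> u1 + <f, u2> u2 (valid because u1, u2 are orthonormal)
g_star = tuple(dot(f, u1) * a + dot(f, u2) * b for a, b in zip(u1, u2))
residual = tuple(gs - fi for gs, fi in zip(g_star, f))

print(g_star)                                 # (3.0, -2.0, 0.0)
print(dot(residual, u1), dot(residual, u2))   # 0.0 0.0: condition (1.4.2)
```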
1.5 Orthogonal polynomials
Let P ⊂ L²_w[a, b] be a set of orthogonal polynomials on [a, b] with respect to the weight function w, and let P_n ⊂ P be the set of orthogonal polynomials of degree n with the coefficient of x^n equal to 1 (such a polynomial is denoted by p_n).
Theorem 1.5.1. If p_n ⊥ P_{n−1} then p_n has exactly n real and distinct roots in the open interval (a, b).

Proof. The condition p_n ⊥ P_{n−1} is equivalent to

∫_a^b w(x) p_n(x) x^i dx = 0, i = 0, 1, ..., n−1,

hence

∫_a^b w(x) p_n(x) dx = 0.

It follows that p_n has m ≥ 1 roots in (a, b) at which it changes sign. Let x_1, ..., x_m ∈ (a, b) be these roots and

r_m(x) = (x − x_1) ... (x − x_m).

Then p_n r_m does not change sign on [a, b]; say p_n r_m ≥ 0. As p_n r_m is not identically zero, we have

〈p_n, r_m〉_{w,2} = ∫_a^b w(x) p_n(x) r_m(x) dx > 0.

If m < n then r_m ∈ P_{n−1}, and the condition p_n ⊥ P_{n−1} would give 〈p_n, r_m〉_{w,2} = 0, a contradiction; hence m ≥ n. As a polynomial of degree n cannot have more than n roots, it follows that m = n.
Theorem 1.5.2. Let (p_n)_{n∈N} be a sequence of orthogonal polynomials on [a, b], with respect to the weight function w, and
$$p_n(x) = a_n x^n + b_n x^{n-1} + \cdots.$$
Then we have the recurrence relation
$$\frac{a_n}{a_{n+1}}\,p_{n+1}(x) = \left(\frac{b_{n+1}}{a_{n+1}} - \frac{b_n}{a_n} + x\right)p_n(x) - \frac{a_{n-1}}{a_n}\cdot\frac{\|p_n\|_{w,2}^2}{\|p_{n-1}\|_{w,2}^2}\,p_{n-1}(x), \qquad (1.5.1)$$
for n = 1, 2, ...
Proof. We have
$$x\,p_n(x) = \sum_{k=0}^{n+1} c_{n,k}\,p_k(x), \qquad (1.5.2)$$
where
$$c_{n,k} = \int_a^b w(x)\,x\,p_n(x)p_k(x)\,dx \Big/ \int_a^b w(x)p_k^2(x)\,dx,$$
or
$$c_{n,k} = \langle e_1 p_n, p_k\rangle_{w,2} \big/ \|p_k\|_{w,2}^2, \qquad (e_1(x) = x).$$
We also have
$$c_{n,k} = \langle p_n, e_1 p_k\rangle_{w,2} \big/ \|p_k\|_{w,2}^2.$$
As ⟨p_n, q⟩_{w,2} = 0 for every polynomial q of degree less than n, and e_1 p_k has degree k + 1, it follows that c_{n,k} = 0 for k < n − 1 and (1.5.2) becomes
$$x\,p_n(x) = c_{n,n+1}\,p_{n+1}(x) + c_{n,n}\,p_n(x) + c_{n,n-1}\,p_{n-1}(x). \qquad (1.5.3)$$
In (1.5.3), identifying the coefficients of degree n + 1, respectively n, one obtains
$$c_{n,n+1} = \frac{a_n}{a_{n+1}}, \qquad c_{n,n} = \frac{b_n}{a_n} - \frac{b_{n+1}}{a_{n+1}}.$$
In order to get c_{n,n−1}, one notices that
$$c_{n,k} = \langle e_1 p_n, p_k\rangle_{w,2}\big/\|p_k\|_{w,2}^2,$$
respectively,
$$c_{k,n} = \langle e_1 p_k, p_n\rangle_{w,2}\big/\|p_n\|_{w,2}^2.$$
So,
$$c_{n,k}\,\|p_k\|_{w,2}^2 = c_{k,n}\,\|p_n\|_{w,2}^2.$$
It follows that
$$c_{n,n-1} = c_{n-1,n}\,\frac{\|p_n\|_{w,2}^2}{\|p_{n-1}\|_{w,2}^2},$$
respectively,
$$c_{n,n-1} = \frac{a_{n-1}}{a_n}\cdot\frac{\|p_n\|_{w,2}^2}{\|p_{n-1}\|_{w,2}^2},$$
and from (1.5.3) the proof follows.
Remark 1.5.3. For the (monic) polynomials of P, we have
$$p_{n+1}(x) = (x + b_{n+1} - b_n)\,p_n(x) - \frac{\|p_n\|_{w,2}^2}{\|p_{n-1}\|_{w,2}^2}\,p_{n-1}(x). \qquad (1.5.4)$$
Remark 1.5.4. If (p_n)_{n∈N} is a sequence of orthonormal polynomials then (1.5.1) becomes
$$\frac{a_n}{a_{n+1}}\,p_{n+1}(x) = \left(\frac{b_{n+1}}{a_{n+1}} - \frac{b_n}{a_n} + x\right)p_n(x) - \frac{a_{n-1}}{a_n}\,p_{n-1}(x).$$
Theorem 1.5.5. If p_n ⊥ P_{n−1} on [a, b], with respect to the weight function w, then
$$\|p_n\|_{w,2} = \min_{p\in\mathcal{P}_n}\|p\|_{w,2}.$$
Proof. We have P_n = e_n + P_{n−1}, i.e., P_n is a translation of the linear subspace P_{n−1} of the space L²_w[a, b]. In accordance with Corollary 1.4.5, p* ∈ P_n is the best approximation to 0 ∈ L²_w[a, b] if and only if p* ⊥ P_{n−1}. So, p* = p_n.
Next, we give a characterization of the orthogonal polynomials.
The recurrence relation of Theorem 1.5.2 is not always the most conve-nient method for computing the orthogonal polynomials. Some other usefultechniques come from the following characterization theorem.
Theorem 1.5.6. Let w : [a, b] → R be any continuous function. The function φ_{k+1} ∈ C[a, b] satisfies the orthogonality conditions
$$\int_a^b w(x)\varphi_{k+1}(x)p(x)\,dx = 0, \quad p \in \mathcal{P}_k, \qquad (1.5.5)$$
if and only if there exists a (k+1)-times differentiable function u(x), x ∈ [a, b], that satisfies the relations
$$w(x)\varphi_{k+1}(x) = u^{(k+1)}(x), \quad x \in [a, b], \qquad (1.5.6)$$
and
$$u^{(i)}(a) = u^{(i)}(b) = 0, \quad i = 0, 1, \ldots, k. \qquad (1.5.7)$$
Proof. If equations (1.5.6) and (1.5.7) hold, then integration by parts gives the identity
$$\int_a^b w(x)\varphi_{k+1}(x)p(x)\,dx = (-1)^{k+1}\int_a^b u(x)p^{(k+1)}(x)\,dx.$$
Therefore, as p ∈ P_k, the orthogonality condition (1.5.5) holds.

Conversely, when equation (1.5.5) is satisfied, we let u be defined by expression (1.5.6), where the constants of integration are chosen to give the values
$$u^{(i)}(a) = 0, \quad i = 0, 1, \ldots, k.$$
Expression (1.5.6) is substituted into the integral (1.5.5). For each integer j = 0, 1, ..., k, let p = p_j be the polynomial
$$p_j(x) = (b-x)^j, \quad x \in [a, b],$$
and apply integration by parts j + 1 times to the left-hand side of expression (1.5.5). Thus we obtain
$$\left[(-1)^j u^{(k-j)}(x)\,p_j^{(j)}(x)\right]_a^b + (-1)^{j+1}\int_a^b u^{(k-j)}(x)\,p_j^{(j+1)}(x)\,dx = 0.$$
Because p_j^{(j+1)}(x) = 0, it follows that
$$u^{(k-j)}(b) = 0, \quad j = 0, 1, \ldots, k,$$
which completes the proof.

In order to apply this theorem to generate orthogonal polynomials, it is necessary to identify a function u, satisfying the conditions (1.5.7), such that the function φ_{k+1}, defined by (1.5.6), is a polynomial of degree k + 1. There is no automatic method of identification, but in many important cases the required function u is easy to recognize.
Example 1.5.7. The relations (1.5.7) are satisfied by letting u be the function
$$u(x) = (x-a)^{k+1}(x-b)^{k+1}, \quad x \in [a, b],$$
and then it follows that φ_{k+1} ∈ P_{k+1} when the weight function w is a constant.
In other words, the polynomials
$$\varphi_j(x) = \frac{d^j}{dx^j}\left[(x-a)^j(x-b)^j\right], \quad j = 0, 1, 2, \ldots \qquad (1.5.8)$$
satisfy the orthogonality conditions
$$\int_a^b \varphi_i(x)\varphi_j(x)\,dx = 0, \quad i \ne j,\ i, j = 0, 1, 2, \ldots.$$
Definition 1.5.8. The polynomial l_n given by (1.5.8), i.e.,
$$l_n(x) = C\,\frac{d^n}{dx^n}\left[(x^2-1)^n\right],$$
is called the Legendre polynomial of degree n, relative to the interval [−1, 1], and
$$l_n(x) = \frac{n!}{(2n)!}\,\frac{d^n}{dx^n}\left[(x^2-1)^n\right]$$
is the Legendre polynomial with the coefficient of x^n equal to 1.
From (1.5.4) it follows the recurrence formula
$$l_{n+1}(x) = x\,l_n(x) - \frac{n^2}{(2n-1)(2n+1)}\,l_{n-1}(x), \quad n = 1, 2, \ldots,$$
with l_0(x) = 1 and l_1(x) = x.
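The monic Legendre recurrence above is easy to exercise in exact arithmetic. The following sketch (function names are mine; polynomials are coefficient lists [c₀, c₁, ...] over `Fraction`) builds l_n from the recurrence and checks the orthogonality ∫₋₁¹ l_m l_n dx = 0 exactly for one pair:

```python
from fractions import Fraction as F

def mul_x(p):                       # p(x) -> x * p(x)
    return [F(0)] + p

def axpy(a, p, q):                  # a*p + q, padded to equal length
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p)); q = q + [F(0)] * (n - len(q))
    return [a * pi + qi for pi, qi in zip(p, q)]

def legendre(n):                    # monic Legendre l_n via the recurrence
    l0, l1 = [F(1)], [F(0), F(1)]
    if n == 0:
        return l0
    for k in range(1, n):
        c = F(-k * k, (2 * k - 1) * (2 * k + 1))
        l0, l1 = l1, axpy(c, l0, mul_x(l1))
    return l1

def polymul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def integrate(p):                   # exact integral of p over [-1, 1]
    return sum(c * (1 - (-1) ** (i + 1)) * F(1, i + 1) for i, c in enumerate(p))
```

For instance, `legendre(2)` returns x² − 1/3 and `integrate(polymul(legendre(2), legendre(4)))` is exactly 0, as the orthogonality predicts.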
Example 1.5.9. Let α and β be real constants, both greater than −1. The polynomials (φ_i)_{i∈N} that satisfy the orthogonality conditions
$$\int_{-1}^1 (1-x)^\alpha(1+x)^\beta\,\varphi_i(x)\varphi_j(x)\,dx = 0, \quad i \ne j,\ i, j = 0, 1, 2, \ldots,$$
are called the Jacobi polynomials.

In this case we take
$$u(x) = (1-x)^{\alpha+k+1}(1+x)^{\beta+k+1}, \quad x \in [-1, 1].$$
The conditions (1.5.7) are satisfied, and u^{(k+1)} is the weight function w(x) = (1−x)^α(1+x)^β multiplied by a polynomial of degree k + 1, as (1.5.6) requires. It follows that the Jacobi polynomials are defined by the equation
$$\varphi_i(x) = (1-x)^{-\alpha}(1+x)^{-\beta}\,\frac{d^i}{dx^i}\left[(1-x)^{\alpha+i}(1+x)^{\beta+i}\right], \quad i = 0, 1, 2, \ldots,$$
which is called Rodrigues' formula.
Remark 1.5.10. For α = β = 0 the Jacobi polynomials become the Legendrepolynomials.
Remark 1.5.11. Each family of orthogonal polynomials is characterized bythe weight function w and the interval [a, b].
Example 1.5.12. For the weight function w(x) = e^{−x} and the interval [0, +∞), one obtains the Laguerre orthogonal polynomials g_n, n = 0, 1, ..., i.e.,
$$\int_0^\infty e^{-x}g_i(x)g_j(x)\,dx = \begin{cases}0, & i \ne j,\\ (i!)^2, & j = i.\end{cases}$$
If u is the function
$$u(x) = e^{-x}x^n, \quad 0 \le x < \infty,$$
the conditions (1.5.7) are verified and the function φ_{k+1}, defined by equation (1.5.6), belongs to P_{k+1}. Hence, the Laguerre polynomials are
$$g_n(x) = e^x\,\frac{d^n}{dx^n}\left(e^{-x}x^n\right), \quad n = 0, 1, 2, \ldots,$$
with x ∈ [0, ∞). The recurrence relation verified by these (unnormalized) Laguerre polynomials is
$$g_{n+1}(x) = (2n+1-x)\,g_n(x) - n^2\,g_{n-1}(x), \quad n = 1, 2, \ldots,$$
where g_0(x) = 1 and g_1(x) = 1 − x.
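For the unnormalized g_n of the Rodrigues formula above, the three-term recurrence takes the form g_{n+1} = (2n+1−x)g_n − n²g_{n−1}. A sketch verifying this exactly (names are mine): since d/dx(e^{−x}p) = e^{−x}(p′ − p), differentiating e^{−x}xⁿ n times amounts to iterating p ← p′ − p on coefficient lists.

```python
from fractions import Fraction as F

def deriv(p):
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def sub(p, q):                      # p - q, padded to equal length
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p)); q = q + [F(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def laguerre(n):
    # g_n = e^x d^n/dx^n (e^{-x} x^n): iterate q <- q' - q starting from x^n
    q = [F(0)] * n + [F(1)]
    for _ in range(n):
        q = sub(deriv(q), q)
    return q
```

For every tested n, the polynomial `laguerre(n + 1)` coincides coefficient by coefficient with (2n+1−x)·`laguerre(n)` − n²·`laguerre(n − 1)`.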
Example 1.5.13. The Hermite polynomials given by
$$h_n(x) = (-1)^n e^{x^2}\,\frac{d^n}{dx^n}\left(e^{-x^2}\right), \quad n = 0, 1, 2, \ldots$$
fulfill the orthogonality conditions
$$\int_{-\infty}^\infty e^{-x^2}h_m(x)h_n(x)\,dx = \begin{cases}0, & m \ne n,\\ 2^n n!\sqrt{\pi}, & m = n.\end{cases}$$
So h_n, n = 0, 1, ..., are orthogonal polynomials on R, with respect to the weight function w(x) = e^{−x²}. They verify the recurrence relation
$$h_{n+1}(x) = 2x\,h_n(x) - 2n\,h_{n-1}(x),$$
with h_0(x) = 1 and h_1(x) = 2x.
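The same exact-arithmetic device works for the Hermite polynomials (an illustrative sketch; names are mine): d/dx(e^{−x²}q) = e^{−x²}(q′ − 2xq), so the Rodrigues formula reduces to iterating q ← q′ − 2xq, and the recurrence h_{n+1} = 2x h_n − 2n h_{n−1} can be checked coefficient by coefficient.

```python
from fractions import Fraction as F

def deriv(p):
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def sub(p, q):                      # p - q, padded to equal length
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p)); q = q + [F(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def hermite(n):
    # h_n = (-1)^n e^{x^2} d^n/dx^n e^{-x^2}: iterate q <- q' - 2x q
    q = [F(1)]
    for _ in range(n):
        q = sub(deriv(q), [F(0)] + [2 * c for c in q])
    return [((-1) ** n) * c for c in q]
```

For instance, `hermite(2)` returns the coefficients of 4x² − 2, and the recurrence holds exactly for every tested n.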
Example 1.5.14. The polynomial Tn ∈ Pn defined by
Tn(x) = cos(n arccos x), x ∈ [−1, 1],
is called the Chebyshev polynomial of the first kind.
Property 1.5.15. 1) Algebraic form. For x = cos θ we have
$$T_n(\cos\theta) = \cos n\theta = \tfrac{1}{2}\left(e^{in\theta} + e^{-in\theta}\right) = \tfrac{1}{2}\left[(\cos\theta + i\sin\theta)^n + (\cos\theta - i\sin\theta)^n\right].$$
Hence,
$$T_n(x) = \tfrac{1}{2}\left[\left(x + \sqrt{x^2-1}\right)^n + \left(x - \sqrt{x^2-1}\right)^n\right]. \qquad (1.5.9)$$
2) The monic Chebyshev polynomial is $\tilde{T}_n = \frac{1}{2^{n-1}}T_n$.

From the algebraic form (1.5.9), one obtains
$$\lim_{x\to\infty}\frac{T_n(x)}{x^n} = \frac{1}{2}\lim_{x\to\infty}\left[\left(1 + \sqrt{1-\tfrac{1}{x^2}}\right)^n + \left(1 - \sqrt{1-\tfrac{1}{x^2}}\right)^n\right] = 2^{n-1}.$$
So the coefficient of x^n in T_n is 2^{n−1}. It follows that
$$\tilde{T}_n = \frac{1}{2^{n-1}}\,T_n.$$
3) The orthogonality property. Starting with
$$\int_0^\pi \cos m\theta\,\cos n\theta\,d\theta = \begin{cases}0, & m \ne n,\\ \frac{\pi}{2}, & m = n \ne 0,\\ \pi, & m = n = 0,\end{cases}$$
and taking cos θ = x, it follows that
$$\int_{-1}^1 \frac{1}{\sqrt{1-x^2}}\,T_m(x)T_n(x)\,dx = \begin{cases}0, & m \ne n,\\ c \ne 0, & m = n.\end{cases}$$
So (T_n)_{n∈N} are orthogonal polynomials on [−1, 1], with regard to the weight function w(x) = 1/√(1−x²).
4) Theorem 1.5.5 implies that
$$\|\tilde{T}_n\|_{w,2} = \min_{P\in\mathcal{P}_n}\|P\|_{w,2}.$$
5) The roots of the polynomial T_n are
$$x_k = \cos\frac{2k-1}{2n}\pi, \quad k = 1, \ldots, n,$$
which are real, distinct and belong to (−1, 1).

6) The recurrence relation:
$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x), \quad x \in [-1, 1],$$
with T_0(x) = 1 and T_1(x) = x. This is provided by the identity
$$\cos[(n+1)\theta] + \cos[(n-1)\theta] = 2\cos\theta\,\cos n\theta.$$
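The recurrence in 6) gives an efficient way to evaluate T_n without any trigonometric function. A minimal sketch (the function name is mine) that cross-checks the recurrence against the defining formula T_n(x) = cos(n arccos x), and against the roots from 5):

```python
import math

def cheb_T(n, x):
    # T_n by the three-term recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)
    t0, t1 = 1.0, x
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, 2.0 * x * t1 - t0
    return t1
```

On any sample of points in [−1, 1] the two definitions agree to machine precision, and T_5 vanishes at x_k = cos((2k−1)π/10), k = 1, ..., 5.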
7) On the interval [−1, 1] we have
$$\|\tilde{T}_n\|_\infty = \min_{P\in\mathcal{P}_n}\|P\|_\infty.$$
Proof. We have
$$\|\tilde{T}_n\|_\infty = \frac{1}{2^{n-1}}$$
and
$$\tilde{T}_n(x'_k) = \frac{(-1)^k}{2^{n-1}},$$
where
$$x'_k = \cos\frac{k\pi}{n}, \quad k = 0, 1, \ldots, n,$$
are the extremal points of T_n on [−1, 1] (for k = 1, ..., n − 1 they are the roots of the derivative
$$T'_n(x) = \frac{n\sin(n\arccos x)}{\sqrt{1-x^2}}\,).$$
Now, suppose that there exists a polynomial P_n ∈ P_n with ‖P_n‖_∞ < 1/2^{n−1}. It follows that
$$P_n = \tilde{T}_n + P_{n-1}, \quad \text{with } P_{n-1} \in \mathcal{P}_{n-1}.$$
Hence,
$$P_{n-1}(x'_k) = P_n(x'_k) - \frac{(-1)^k}{2^{n-1}}.$$
As ‖P_n‖_∞ < 1/2^{n−1}, for k = 0, 1, ..., n we have
$$P_{n-1}(x'_k) < 0 \text{ for } k \text{ even}, \qquad P_{n-1}(x'_k) > 0 \text{ for } k \text{ odd}.$$
It follows that P_{n−1} changes sign n times in [−1, 1], i.e., it has at least n roots. As the degree of P_{n−1} is at most n − 1, we obtain P_{n−1} = 0, hence P_n = T̃_n, which contradicts ‖P_n‖_∞ < 1/2^{n−1}.
Example 1.5.16. The polynomial Q_n ∈ P_n defined by
$$Q_n(x) = \frac{\sin((n+1)\arccos x)}{\sqrt{1-x^2}}, \quad x \in [-1, 1],$$
is called the Chebyshev polynomial of the second kind.
Remark 1.5.17. The following relation is fulfilled:
$$Q_n(x) = \frac{1}{n+1}\,T'_{n+1}(x), \quad x \in [-1, 1].$$
As the coefficient of x^{n+1} in T_{n+1} is 2^n, the coefficient of x^n in Q_n is 2^n, and it follows that
$$\tilde{Q}_n = \frac{1}{2^n}\,Q_n.$$
We remark also that
$$\int_{-1}^1 \sqrt{1-x^2}\,Q_m(x)Q_n(x)\,dx = \begin{cases}0, & m \ne n,\\ \frac{\pi}{2}, & m = n,\end{cases}$$
i.e., (Q_n)_{n∈N} is an orthogonal sequence on [−1, 1], with respect to the weight function w(x) = √(1−x²). From the identity
$$\sin(n+2)\theta + \sin n\theta = 2\cos\theta\,\sin(n+1)\theta,$$
one obtains the recurrence formula
$$Q_{n+1}(x) = 2x\,Q_n(x) - Q_{n-1}(x), \quad n = 1, 2, \ldots,$$
with Q_0(x) = 1 and Q_1(x) = 2x.
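As with the first kind, the recurrence evaluates Q_n cheaply, and the trigonometric definition Q_n(cos θ) = sin((n+1)θ)/sin θ serves as an independent check. A minimal sketch (function name is mine):

```python
import math

def cheb_U(n, x):
    # Q_n by the recurrence Q_{k+1}(x) = 2x Q_k(x) - Q_{k-1}(x),
    # with Q_0 = 1 and Q_1 = 2x
    q0, q1 = 1.0, 2.0 * x
    if n == 0:
        return q0
    for _ in range(n - 1):
        q0, q1 = q1, 2.0 * x * q1 - q0
    return q1
```

For θ away from multiples of π, the recurrence values and sin((n+1)θ)/sin θ agree to rounding error.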
Also,
$$\|\tilde{Q}_n\|_{w,2} = \min_{P\in\mathcal{P}_n}\|P\|_{w,2}.$$
A very important property of the polynomials Q̃_n is that
$$\|\tilde{Q}_n\|_{w,1} = \min_{P\in\mathcal{P}_n}\|P\|_{w,1}, \quad \text{on } [-1, 1],$$
i.e., the minimum norm property holds also in the norm of L¹_w[−1, 1].
Chapter 2
Approximation of functions
2.1 Preliminaries
Functions are the basic mathematical tools for describing (modeling) many physical processes. Very frequently these functions are not known explicitly and it is necessary to construct approximations of them, based on limited information about the underlying process. As a physical process depends on one or more parameters, the problem which appears is to approximate a function of one or several variables based on some limited information about it.
The most common approach to finding approximations to unknown functions is:

1) choose a reasonable class of functions A in which to look for the approximations;

2) select an appropriate approximation process for assigning a specific function to a specific problem.
The success of this approach depends heavily on the existence of a convenient set A of approximating functions. Such a set A should possess at least the following properties:

1) the functions in A should be relatively smooth;

2) the set A should be large enough so that arbitrary smooth functions can be well approximated by elements of A;

3) the functions in A should be easy to store, manipulate and evaluate on a digital computer.
The study of various classes of approximating functions is the content of approximation theory. The computational aspects of such approximations are a major part of numerical analysis.

The most important classes of approximating functions used in this book are: polynomials, polynomial splines and rational functions.
1) Polynomials (A ⊂ P).

Polynomials have always played a central role in approximation theory and numerical analysis. We note that the space P_m, of polynomials of degree at most m, has the following properties:

1. It is a finite dimensional linear space with a convenient basis (e_i, i = 0, 1, ..., m, with e_i(x) = x^i).
2. Polynomials are smooth functions.
3. The derivatives and primitives of polynomials are also polynomials, whose coefficients can be found algebraically.
4. The number of zeros of a polynomial of degree m cannot exceed m.
5. Various matrices arising in interpolation and approximation by polynomials are usually nonsingular and have strong sign-regularity properties.
6. The sign structure and shape of a polynomial are strongly related to the sign structure of its set of coefficients.
7. Given any continuous function on an interval [a, b], there exists a polynomial which is uniformly close to it.
8. Precise rates of convergence can be given for approximation of smooth functions by polynomials.
9. Polynomials are easy to store, manipulate and evaluate on a computer.
2) Polynomial splines (A ⊂ S_m).

The main drawback of the space P_m for the approximation of functions is that the set P_m is relatively inflexible. Polynomials seem to be all right on sufficiently small intervals, but on larger intervals severe oscillations often appear, particularly for large m. This observation suggests that, in order to achieve a class of approximating functions with greater flexibility, we should work with polynomials of relatively low degree and divide the interval into smaller pieces. This is the class of polynomial splines.

The space S_m of polynomial splines of order m has the following properties:

1. Polynomial spline spaces are finite dimensional linear spaces with very convenient bases.
2. Polynomial splines are relatively smooth functions.
3. The derivatives and primitives of polynomial splines are also polynomial splines.
4. Polynomial splines possess nice zero properties, analogous to those of polynomials.
5. Various matrices arising naturally in the use of splines in approximation theory and numerical analysis have convenient sign properties.
6. The sign structure and shape of a polynomial spline can be related to the sign structure of its coefficients.
7. Every continuous function on an interval [a, b] can be well approximated by polynomial splines of fixed order m, provided that a sufficient number of knots are allowed.
8. Precise rates of convergence can be given for approximation of smooth functions by splines.
9. Low-order splines are very flexible and do not exhibit the oscillations usually associated with polynomials.
10. Polynomial splines are easy to store, manipulate and evaluate on a computer.
3) Rational functions (A = R_{p,q}).

Polynomials are not suitable to approximate, for example, a function f : [a, b] → R for which a line of equation x = a − ε or x = a + ε is an asymptote, or the case when f is an exponential function, and so on. In such cases a rational function can give a better approximation than a polynomial.

In a computational sense, a rational function R is one that can be evaluated as the quotient of two polynomials, i.e.,
$$R(x) = \frac{P(x)}{Q(x)} = \frac{a_0 + a_1x + \cdots + a_mx^m}{b_0 + b_1x + \cdots + b_nx^n},$$
where a_m ≠ 0, b_n ≠ 0. When n = 0 and b_0 ≠ 0, R reduces to a polynomial.

A rational function R is finitely representable by the coefficients (a_0, ..., a_m) and (b_0, ..., b_n), so it is computable by a finite number of arithmetic operations, as in the polynomial or polynomial spline case.
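Concretely, evaluating R(x) costs m + n multiplications plus one division when each of P and Q is evaluated by Horner's scheme. A minimal sketch (names are mine, not from the book):

```python
def horner(c, x):
    # evaluate c[0] + c[1]*x + ... + c[k]*x^k with k multiplications
    r = 0.0
    for a in reversed(c):
        r = r * x + a
    return r

def rational(a, b, x):
    # R(x) = P(x)/Q(x) as a quotient of two Horner evaluations;
    # a, b are the coefficient lists (a_0, ..., a_m) and (b_0, ..., b_n)
    return horner(a, x) / horner(b, x)
```

For example, `rational([1.0, 1.0], [1.0, -1.0, 1.0], 2.0)` evaluates (1 + x)/(1 − x + x²) at x = 2, giving 3/3 = 1.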
Definition 2.1.1. Let B be a normed linear space of real-valued functions defined on Ω ⊂ R^n, A ⊂ B and f ∈ B.

• P : B → A is an approximation operator;
• Pf is the approximation of f by the operator P;
• f = Pf + Rf is the approximation formula generated by P, where R is the corresponding remainder operator;
• if P(Pf) = Pf then the operator P is idempotent;
• if P is linear and idempotent then P is a projector;
• the smallest real number, denoted by ‖P‖, with the property that
$$\|Pf\| \le \|P\|\,\|f\|, \quad \text{for all } f \in B,$$
is the norm of the operator P.
Remark 2.1.2. The norm of an approximation operator is also called the Lebesgue constant of the operator.
Definition 2.1.3. The number r ∈ N for which
$$Pf = f, \quad f \in \mathcal{P}^n_r,$$
and there exists g ∈ P^n_{r+1} such that Pg ≠ g, is called the degree of exactness of the operator P (denoted by dex(P)).

Remark 2.1.4. If P is linear then (Pf = f, f ∈ P^n_r, and there exists g ∈ P^n_{r+1} such that Pg ≠ g) is equivalent to (Pe_{i_1,...,i_n} = e_{i_1,...,i_n} for all i_1 + ... + i_n ≤ r and there exists (μ_1, ..., μ_n) ∈ N^n with μ_1 + ... + μ_n = r + 1 such that Pe_{μ_1,...,μ_n} ≠ e_{μ_1,...,μ_n}), where e_{ν_1,...,ν_n}(x_1, ..., x_n) = x_1^{ν_1}···x_n^{ν_n}.
Suppose that B is a set of real-valued functions defined on Ω ⊂ R^n, A ⊂ B, and let
$$\Lambda := \{\lambda_i \mid \lambda_i : B \to \mathbb{R},\ i = 1, \ldots, N\}$$
be a set of linear functionals. The information on f ∈ B corresponding to the set Λ is denoted by
$$\Lambda(f) := \{\lambda_i(f) \mid \lambda_i \in \Lambda,\ i = 1, \ldots, N\}.$$
Definition 2.1.5. The operator P : B → A for which
λi(Pf) = λi(f), i = 1, ..., N, for f ∈ B,
is called an interpolation operator that interpolates the set Λ. The formula
f = Pf +Rf
is the interpolation formula generated by P, and R is the remainder operator.
Definition 2.1.6. If, for a given Λ, the operator P exists, then Λ is called an admissible set of functionals, and Λ(f) an admissible set of information.
Definition 2.1.7. If p_i, i = 1, ..., N, are the fundamental (cardinal) interpolation elements associated to the interpolation operator P, i.e.,
$$\lambda_k(p_i) = \delta_{ki}, \quad i, k = 1, \ldots, N,$$
then
$$Pf = \sum_{i=1}^N p_i\,\lambda_i(f).$$
Now, let Λ^{ν_i}_i ⊂ Λ be a subset of ν_i (ν_i < N) functionals of Λ, associated to λ_i, i = 1, ..., N, respectively. We have
$$\bigcup_{i=1}^N \Lambda_i^{\nu_i} = \Lambda.$$
Let Q^{ν_i}_i be the operator that interpolates the set Λ^{ν_i}_i.
Definition 2.1.8. The operator PQ defined by
$$PQ = \sum_{i=1}^N p_i\,Q_i^{\nu_i}$$
is called the combined operator of P and Q^{ν_i}_i, i = 1, ..., N.

Remark 2.1.9. If P and Q^{ν_i}_i, i = 1, ..., N, are linear operators then the combined operator PQ is also linear.
Definition 2.1.10. A sequence of subsets Λ^{ν_i}_i ⊂ Λ, i = 1, ..., N, for which the corresponding interpolation operators Q^{ν_i}_i exist for all i = 1, ..., N, is called an admissible sequence.

Theorem 2.1.11. Let P and Q^{ν_i}_i, i = 1, ..., N, be the operators that interpolate the sets Λ and Λ^{ν_i}_i, i = 1, ..., N, respectively. If Λ is an admissible set and Λ^{ν_1}_1, ..., Λ^{ν_N}_N is an admissible sequence then the combined operator PQ exists.

Proof. As Λ is an admissible set and Λ^{ν_1}_1, ..., Λ^{ν_N}_N is an admissible sequence, all the operators P and Q^{ν_i}_i, i = 1, ..., N, exist, so the proof follows.
Theorem 2.1.12. Let P and Q^{ν_i}_i, i = 1, ..., N, be some linear operators. If
$$\operatorname{dex}(P) \ge 0 \quad \left(\textstyle\sum_{i=1}^N p_i = 1\right)$$
and dex(Q^{ν_i}_i) = r_i, i = 1, ..., N, then
$$\operatorname{dex}(PQ) = r_m := \min\{r_1, \ldots, r_N\}.$$
Proof. As dex(Q^{ν_i}_i) = r_i and Q^{ν_i}_i is linear, we have
$$Q_i^{\nu_i} e_{i_1,\ldots,i_n} = e_{i_1,\ldots,i_n}$$
for all (i_1, ..., i_n) ∈ N^n with i_1 + ... + i_n ≤ r_i, and there exists (μ_1, ..., μ_n) ∈ N^n with μ_1 + ... + μ_n = r_i + 1 such that
$$Q_i^{\nu_i} e_{\mu_1,\ldots,\mu_n} \ne e_{\mu_1,\ldots,\mu_n},$$
for i = 1, ..., N, respectively. It follows that
$$Q_i^{\nu_i} e_{i_1,\ldots,i_n} = e_{i_1,\ldots,i_n},$$
with i_1 + ... + i_n ≤ r_m (r_m = min{r_1, ..., r_N}), for all i = 1, ..., N, and there exists an index j, 1 ≤ j ≤ N, for which
$$Q_j^{\nu_j} e_{\mu_1,\ldots,\mu_n} \ne e_{\mu_1,\ldots,\mu_n},$$
with μ_1 + ... + μ_n = r_m + 1. We have
$$PQ\,e_{i_1,\ldots,i_n} = \sum_{k=1}^N p_k\,Q_k^{\nu_k} e_{i_1,\ldots,i_n} = e_{i_1,\ldots,i_n}\sum_{k=1}^N p_k = e_{i_1,\ldots,i_n},$$
for all i_1 + ... + i_n ≤ r_m. From
$$Q_j^{\nu_j} e_{\mu_1,\ldots,\mu_n} \ne e_{\mu_1,\ldots,\mu_n}, \quad \text{when } \mu_1 + \ldots + \mu_n = r_m + 1,$$
it follows that
$$PQ\,e_{\mu_1,\ldots,\mu_n} \ne e_{\mu_1,\ldots,\mu_n}.$$
Finally, the linearity of PQ implies that dex(PQ) = r_m.
2.2 Univariate interpolation operators
Suppose that B is a linear space of univariate real-valued functions defined on the interval [a, b] ⊂ R, A ⊂ B,
$$\Lambda = \{\lambda_i \mid \lambda_i : B \to \mathbb{R},\ i = 1, \ldots, N\}$$
is a set of linear functionals, and P : B → A is the interpolation operator corresponding to Λ, i.e.,
$$\lambda_i(Pf) = \lambda_i(f), \quad i = 1, \ldots, N, \text{ for } f \in B.$$
The basic elements in the definition of an interpolation operator are the sets A and Λ. Each given pair (A, Λ) yields a corresponding interpolation operator.

For A we will consider the sets P, S and R of polynomial functions, natural spline functions and rational functions, respectively.

Supposing that x_i ∈ [a, b], i = 0, ..., m, are the interpolation nodes, for Λ we will consider functionals of the following types:

• Lagrange type (Λ_L):
$$\Lambda_L = \{\lambda_i \mid \lambda_i(f) = f(x_i),\ i = 0, \ldots, m\};$$
• Hermite type (Λ_H):
$$\Lambda_H = \{\lambda_{ij} \mid \lambda_{ij}(f) = f^{(j)}(x_i),\ j = 0, \ldots, r_i;\ i = 0, \ldots, m\},$$
with r_i ∈ N, i = 0, ..., m;

• Birkhoff type (Λ_B):
$$\Lambda_B = \{\lambda_{ij} \mid \lambda_{ij}(f) = f^{(j)}(x_i),\ j \in I_i,\ i = 0, \ldots, m\},$$
with I_i ⊂ {0, 1, ..., r_i}, for r_i ∈ N, i = 0, ..., m.
Hence, for example, the pair (P,ΛL) gives rise to the polynomial interpo-lation operators of Lagrange type, the pair (S,ΛB) generates spline interpo-lation operators of Birkhoff type and so on.
2.2.1 Polynomial interpolation operators
Consider the polynomial interpolation operators of Lagrange (P, Λ_L), Hermite (P, Λ_H) and Birkhoff (P, Λ_B) type, briefly called Lagrange (L_m), Hermite (H_n), respectively Birkhoff (B_k) interpolation operators, where m = |Λ_L| − 1, n = |Λ_H| − 1 and k = |Λ_B| − 1, i.e., the degrees of the corresponding polynomials.
Next we give the interpolation formulas generated by these operators.

⋆ Lagrange interpolation formula:
$$f = L_mf + R_mf, \qquad (2.2.1)$$
with the Lagrange interpolation polynomial given by
$$(L_mf)(x) = \sum_{i=0}^m \frac{u_m(x)}{(x-x_i)\,u'_m(x_i)}\,f(x_i), \qquad (2.2.2)$$
where
$$u_m(x) = \prod_{i=0}^m (x - x_i).$$
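Formula (2.2.2) can be evaluated directly; the factor u_m(x)/[(x − x_i)u′_m(x_i)] is the cardinal polynomial ∏_{j≠i}(x − x_j)/(x_i − x_j), which is the form used in this sketch (the function name is mine):

```python
def lagrange(nodes, values, x):
    # (L_m f)(x) = sum_i l_i(x) f(x_i), with the cardinal polynomials
    # l_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j)
    total = 0.0
    for i, (xi, fi) in enumerate(zip(nodes, values)):
        li = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += li * fi
    return total
```

Since dex(L_m) = m, the operator reproduces any polynomial of degree at most m exactly: with nodes 0, 1, 2 and values of f(x) = x², the evaluation at x = 1.5 returns 2.25.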
For f ∈ C^m[a, b] such that f^{(m+1)} exists on (a, b), we have
$$(R_mf)(x) = \frac{u_m(x)}{(m+1)!}\,f^{(m+1)}(\xi), \quad a < \xi < b, \qquad (2.2.3)$$
while if f ∈ C^{m+1}[a, b] we have
$$(R_mf)(x) = \int_a^b \varphi_m(x, t)\,f^{(m+1)}(t)\,dt, \qquad (2.2.4)$$
where
$$\varphi_m(x, t) = R_m\!\left[\frac{(\cdot - t)_+^m}{m!}\right]$$
is the Peano kernel.
Remark 2.2.1. If f^{(m+1)} is also continuous on [a, b], we have
$$\|R_mf\|_\infty \le \frac{\|u_m\|_\infty}{(m+1)!}\,\|f^{(m+1)}\|_\infty.$$
Minimal norm property: the bound on ‖R_mf‖_∞ takes its minimum value when ‖u_m‖_∞ takes its minimum value, i.e., when u_m ≡ T̃_{m+1}(·; a, b), where T̃_{m+1} is the monic Chebyshev polynomial of the first kind, adapted to [a, b]. This means that the interpolation nodes are the zeros of T̃_{m+1} and
$$\|R_mf\|_\infty \le \frac{(b-a)^{m+1}}{(m+1)!\,2^{2m+1}}\,\|f^{(m+1)}\|_\infty.$$
⋆ Hermite interpolation formula:
$$f = H_nf + R_nf,$$
where
$$(H_nf)(x) = \sum_{i=0}^m\sum_{j=0}^{r_i} h_{ij}(x)\,f^{(j)}(x_i),$$
with
$$h_{ij}(x) = u_i(x)\,\frac{(x-x_i)^j}{j!}\sum_{\nu=0}^{r_i-j}\frac{(x-x_i)^\nu}{\nu!}\left[\frac{1}{u_i(x)}\right]^{(\nu)}_{x=x_i}$$
and
$$u_n(x) = \prod_{i=0}^m (x-x_i)^{r_i+1}, \qquad u_i(x) = \frac{u_n(x)}{(x-x_i)^{r_i+1}}.$$
We present two representations of the remainder:

- for f ∈ C^n[a, b] such that f^{(n+1)} exists on (a, b), we have
$$(R_nf)(x) = \frac{u_n(x)}{(n+1)!}\,f^{(n+1)}(\xi), \quad a < \xi < b;$$
- for f ∈ C^{n+1}[a, b] we have
$$(R_nf)(x) = \int_a^b \varphi_n(x, t)\,f^{(n+1)}(t)\,dt,$$
where φ_n is the corresponding Peano kernel.

⋆ Birkhoff interpolation formula:
$$f = B_kf + R_kf,$$
with
$$(B_kf)(x) = \sum_{i=0}^m\sum_{j\in I_i} b_{ij}(x)\,f^{(j)}(x_i), \qquad (2.2.5)$$
where the b_{ij} are obtained from the cardinality conditions
$$b_{ij}^{(p)}(x_\nu) = 0, \quad \nu \ne i,\ p \in I_\nu, \qquad b_{ij}^{(p)}(x_i) = \delta_{jp}, \quad p \in I_i, \qquad (2.2.6)$$
for all j ∈ I_i and i = 0, ..., m. For f ∈ C^{k+1}[a, b] we have
$$(R_kf)(x) = \int_a^b \varphi_k(x, t)\,f^{(k+1)}(t)\,dt,$$
with
$$\varphi_k(x, t) = R_k\!\left[\frac{(\cdot - t)_+^k}{k!}\right].$$
Particular cases of Birkhoff interpolation.
• Abel-Goncharov interpolation formula:
$$f = P_nf + R_nf,$$
with
$$(P_nf)(x) = \sum_{k=0}^n g_k(x)\,f^{(k)}(x_k),$$
where g_k, k = 0, ..., n, are the Goncharov polynomials of degree k, having the expressions
$$g_0(x) = 1, \qquad g_1(x) = x - x_0, \qquad (2.2.7)$$
$$g_k(x) = \int_{x_0}^x dt_1\int_{x_1}^{t_1} dt_2\cdots\int_{x_{k-1}}^{t_{k-1}} dt_k = \frac{1}{k!}\left[x^k - \sum_{j=0}^{k-1} g_j(x)\binom{k}{j}x_j^{k-j}\right], \quad k = 2, \ldots, n.$$
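The closed-form recurrence in (2.2.7) is straightforward to implement in exact arithmetic, and the cardinality relations g_k^{(i)}(x_i) = δ_{ik}, which underlie the Abel-Goncharov interpolation conditions, can then be verified directly. A sketch (names are mine; polynomials are coefficient lists over `Fraction`):

```python
from fractions import Fraction as F
from math import comb, factorial

def goncharov(nodes):
    # g_k(x) = (1/k!)[x^k - sum_{j<k} g_j(x) C(k,j) x_j^{k-j}]
    # for integer or rational nodes x_0, ..., x_n
    gs = [[F(1)]]
    for k in range(1, len(nodes)):
        p = [F(0)] * k + [F(1)]                     # x^k
        for j in range(k):
            c = F(comb(k, j)) * F(nodes[j]) ** (k - j)
            for i, gc in enumerate(gs[j]):
                p[i] -= c * gc
        gs.append([coef / factorial(k) for coef in p])
    return gs

def deriv(p):
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def evalp(p, x):
    return sum(c * F(x) ** i for i, c in enumerate(p))
```

With nodes 0, 1, 2 one gets g_1 = x and g_2 = x²/2 − x, and the i-th derivative of g_k at x_i equals 1 exactly when i = k.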
For n ∈ N, a, b ∈ R, a < b, and a function f : [a, b] → R having the derivatives f^{(i)}, i = 1, 2, ..., n, the Abel-Goncharov interpolation problem consists in finding a polynomial P_nf of degree n such that
$$(P_nf)^{(i)}(x_i) = f^{(i)}(x_i), \quad 0 \le i \le n. \qquad (2.2.8)$$
If f ∈ H^{n+1}[a, b] then
$$(R_nf)(x) = \int_a^b \varphi_n(x, s)\,f^{(n+1)}(s)\,ds,$$
with
$$\varphi_n(x, s) = \frac{(x-s)_+^n}{n!} - \sum_{k=0}^n g_k(x)\,\frac{(x_k-s)_+^{n-k}}{(n-k)!}. \qquad (2.2.9)$$
The determinant of the corresponding interpolation system is always nonzero, so the problem (2.2.8) always has a unique solution.

• Lidstone interpolation formula:
$$f = L_m^\Delta f + R_m^\Delta f, \qquad (2.2.10)$$
with
$$(L_m^\Delta f)(x) = \sum_{i=0}^{N+1}\sum_{j=0}^{m-1} r_{m,i,j}(x)\,f^{(2j)}(x_i),$$
where r_{m,i,j}, 0 ≤ i ≤ N + 1, 0 ≤ j ≤ m − 1, are the basis elements of the set L_m(∆) := {h ∈ C[a, b] : h is a polynomial of degree at most 2m − 1 in each subinterval [x_i, x_{i+1}], 0 ≤ i ≤ N}, satisfying
$$D^{2\nu}r_{m,i,j}(x_\mu) = \delta_{i\mu}\delta_{\nu j}, \quad 0 \le \mu \le N+1,\ 0 \le \nu \le m-1, \qquad (2.2.11)$$
and
$$r_{m,i,j}(x) = \begin{cases}\Lambda_j\!\left(\frac{x_{i+1}-x}{h}\right)h^{2j}, & x_i \le x \le x_{i+1},\ 0 \le i \le N,\\[4pt] \Lambda_j\!\left(\frac{x-x_{i-1}}{h}\right)h^{2j}, & x_{i-1} \le x \le x_i,\ 1 \le i \le N+1,\\[4pt] 0, & \text{otherwise},\end{cases}$$
with ∆ denoting a fixed partition of [a, b] with step h.

The Lidstone polynomial Λ_n, of degree 2n + 1, n ∈ N, is defined on the interval [0, 1] by:
$$\Lambda_0(x) = x, \qquad (2.2.12)$$
$$\Lambda_n''(x) = \Lambda_{n-1}(x),$$
$$\Lambda_n(0) = \Lambda_n(1) = 0, \quad n \ge 1.$$
For f ∈ C^{2m−2}[a, b], the Lidstone interpolation polynomial uniquely exists and satisfies the interpolation conditions
$$(L_m^\Delta f)^{(2k)}(x_i) = f^{(2k)}(x_i), \quad 0 \le k \le m-1,\ 0 \le i \le N+1. \qquad (2.2.13)$$
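The defining system (2.2.12) determines each Λ_n by two antiderivatives plus a linear correction enforcing Λ_n(0) = Λ_n(1) = 0. A sketch in exact arithmetic (names are mine):

```python
from fractions import Fraction as F

def deriv(p):
    return [F(i) * c for i, c in enumerate(p)][1:] or [F(0)]

def lidstone(n):
    # Lambda_0(x) = x; Lambda_n'' = Lambda_{n-1}, Lambda_n(0) = Lambda_n(1) = 0
    lam = [F(0), F(1)]                              # Lambda_0
    for _ in range(n):
        once  = [F(0)] + [c / (i + 1) for i, c in enumerate(lam)]
        twice = [F(0)] + [c / (i + 1) for i, c in enumerate(once)]
        # the constant term is already 0; subtract Lambda_n(1)*x so the
        # value at x = 1 (the sum of the coefficients) also vanishes
        twice[1] -= sum(twice)
        lam = twice
    return lam
```

For example, `lidstone(1)` returns the coefficients of (x³ − x)/6, whose second derivative is Λ_0(x) = x, and each Λ_n has degree 2n + 1.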
The remainder is given by
$$(R_m^\Delta f)(x) = \int_a^b g_m(x, s)\,f^{(2m)}(s)\,ds, \qquad (2.2.14)$$
where g_m(x, s) is the corresponding Peano kernel.

Common properties:

1. Evidently, all the operators L_m, H_n, B_k, P_n and L_m^∆ are interpolation operators.

2. Each operator L_m, H_n, B_k, P_n and L_m^∆ has degree of exactness equal to the degree of the corresponding interpolation polynomial, i.e.,
$$\operatorname{dex}(L_m) = m,\ \operatorname{dex}(H_n) = n,\ \operatorname{dex}(B_k) = k,\ \operatorname{dex}(P_n) = n \text{ and } \operatorname{dex}(L_m^\Delta) = 2m-1.$$
3. All the operators L_m, H_n, B_k, P_n and L_m^∆, and their corresponding remainder operators, are projectors.

Existence and uniqueness conditions:

1. If x_i ≠ x_j for i ≠ j, i, j = 0, ..., m, the operators L_m and H_n exist and are unique.
2. A necessary condition for the existence of the Birkhoff operator B_k is given by the Polya condition. Let
$$E = (e_{ij})_{i=0,\ldots,m;\ j\in I_i}$$
be the (m + 1) × (k + 1) matrix associated to Λ_B, in which
$$e_{ij} = \begin{cases}1, & \text{if } j \in I_i,\\ 0, & \text{if } j \notin I_i,\end{cases}$$
for all i = 0, ..., m. One notices that the number of 1's in E is exactly k + 1. The matrix E is called the incidence matrix corresponding to the set Λ_B.

In particular, for Λ_L the (m + 1) × (m + 1) incidence matrix is
$$E_L = \begin{pmatrix}1 & 0 & \cdots & 0\\ 1 & 0 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 1 & 0 & \cdots & 0\end{pmatrix}.$$
Considering Hermite interpolation, the i-th row of the (m + 1) × (n + 1) incidence matrix E_H is
$$(\underbrace{1, \ldots, 1,}_{r_i+1}\ \underbrace{0, \ldots, 0}_{n-r_i}), \quad \text{for } i = 0, 1, \ldots, m.$$
An incidence matrix E satisfies the Polya condition if
$$\sum_{j=0}^r\sum_{i=0}^m e_{ij} \ge r + 1, \quad \text{for all } r = 0, \ldots, k.$$
It can be seen that the incidence matrices E_L and E_H satisfy the Polya condition.
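The Polya condition is a simple column-count check: for each r, the first r + 1 columns of E must together contain at least r + 1 ones. A minimal sketch (the function name is mine):

```python
def polya(E):
    # E: incidence matrix as a list of 0/1 rows; the total number of ones is k+1
    k = sum(map(sum, E)) - 1
    for r in range(k + 1):
        ones = sum(row[j] for row in E for j in range(min(r + 1, len(row))))
        if ones < r + 1:
            return False
    return True
```

The Lagrange matrix E_L (a single column of ones) passes, as does the matrix E_B of Example 2.2.2 below; a matrix concentrating its ones in late columns fails.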
Next consider an example of Birkhoff interpolation.
Example 2.2.2. For f ∈ C³[a, b] one considers the set of information
$$\Lambda_B(f) = \{f'(0),\ f(\tfrac{h}{2}),\ f'(h)\}.$$
The incidence matrix is
$$E_B = \begin{pmatrix}0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1\end{pmatrix},$$
which verifies the Polya condition. Therefore, we look for the interpolation polynomial as a polynomial of degree 2 (k = 2):
$$(B_2f)(x) = Ax^2 + Bx + C, \qquad (2.2.15)$$
that interpolates the information Λ_B(f). One obtains
$$(B_2f)'(0) := B = f'(0),$$
$$(B_2f)(\tfrac{h}{2}) := \tfrac{h^2}{4}A + \tfrac{h}{2}B + C = f(\tfrac{h}{2}),$$
$$(B_2f)'(h) := 2hA + B = f'(h). \qquad (2.2.16)$$
The determinant of this system is D = 2h ≠ 0, hence B_2f exists and is unique. There are two ways to find it.

1) Solving the system (2.2.16) directly. One obtains
$$A = \frac{f'(h)-f'(0)}{2h}, \qquad B = f'(0), \qquad C = f\!\left(\tfrac{h}{2}\right) - \tfrac{h}{8}f'(h) - \tfrac{3h}{8}f'(0),$$
so
$$(B_2f)(x) = \frac{f'(h)-f'(0)}{2h}\,x^2 + f'(0)\,x + f\!\left(\tfrac{h}{2}\right) - \tfrac{h}{8}f'(h) - \tfrac{3h}{8}f'(0),$$
or
$$(B_2f)(x) = \frac{(2x-h)(3h-2x)}{8h}\,f'(0) + f\!\left(\tfrac{h}{2}\right) + \frac{4x^2-h^2}{8h}\,f'(h).$$
2) Using the general form (2.2.5):
$$(B_2f)(x) = b_{01}(x)f'(0) + b_{10}(x)f\!\left(\tfrac{h}{2}\right) + b_{21}(x)f'(h)$$
and finding the cardinal polynomials b_{01}, b_{10} and b_{21}. Each of these polynomials is of degree 2 (as in (2.2.15)) and they must satisfy the cardinality conditions (2.2.6), i.e.:
$$b'_{01}(0) = 1, \quad b_{01}(\tfrac{h}{2}) = 0, \quad b'_{01}(h) = 0;$$
$$b'_{10}(0) = 0, \quad b_{10}(\tfrac{h}{2}) = 1, \quad b'_{10}(h) = 0;$$
$$b'_{21}(0) = 0, \quad b_{21}(\tfrac{h}{2}) = 0, \quad b'_{21}(h) = 1.$$
Solving these systems, we get
$$b_{01}(x) = \frac{(2x-h)(3h-2x)}{8h}, \qquad b_{10}(x) = 1, \qquad b_{21}(x) = \frac{4x^2-h^2}{8h}.$$
Now, regarding the remainder term of the interpolation formula
$$f = B_2f + R_2f,$$
we have
$$(R_2f)(x) = \int_0^h \varphi_2(x, t)\,f^{(3)}(t)\,dt,$$
where
$$\varphi_2(x, t) = \frac{1}{2}\left[(x-t)_+^2 - \left(\tfrac{h}{2}-t\right)_+^2 - \frac{4x^2-h^2}{4h}(h-t)\right].$$
By a straightforward computation one obtains that
$$\varphi_2(x, t) \ge 0, \quad x \in [0, \tfrac{h}{2}],$$
and
$$\varphi_2(x, t) \le 0, \quad x \in [\tfrac{h}{2}, h],$$
for t ∈ [0, h]. As, for a given x ∈ [0, h], φ_2(x, t) does not change sign for t ∈ [0, h], and f^{(3)} is continuous on [0, h], using the mean value theorem one obtains
$$(R_2f)(x) = f^{(3)}(\xi)\int_0^h \varphi_2(x, t)\,dt, \quad 0 \le \xi \le h.$$
Finally, we have
$$(R_2f)(x) = \frac{(2x-h)(2x^2-2hx-h^2)}{24}\,f^{(3)}(\xi), \quad 0 \le \xi \le h,$$
respectively,
$$\|R_2f\|_\infty \le \frac{h^3}{24}\,\|f^{(3)}\|_\infty.$$
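The quadratic B₂f of Example 2.2.2 is easy to check numerically (a sketch; choosing f = sin and h = 1 is an arbitrary illustration, and the helper names are mine): it reproduces the three data f′(0), f(h/2), f′(h), and its error on [0, h] stays within the bound h³/24 · ‖f‴‖∞ ≤ h³/24 for f = sin.

```python
import math

def birkhoff_b2(fp0, fmid, fph, h):
    # coefficients A, B, C of (B_2 f)(x) = A x^2 + B x + C,
    # solved from system (2.2.16) as in the example
    A = (fph - fp0) / (2.0 * h)
    B = fp0
    C = fmid - h / 8.0 * fph - 3.0 * h / 8.0 * fp0
    return A, B, C

h = 1.0
A, B, C = birkhoff_b2(math.cos(0.0), math.sin(h / 2), math.cos(h), h)
p  = lambda x: A * x * x + B * x + C        # B_2 f
dp = lambda x: 2.0 * A * x + B              # (B_2 f)'
err = max(abs(math.sin(x) - p(x)) for x in [i * h / 100 for i in range(101)])
```

The maximum sampled error `err` is below h³/24 ≈ 0.0417, in agreement with the remainder estimate.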
2.2.2 Polynomial spline interpolation operators
Let Λ = {λ_i | λ_i : H^{m,2}[a, b] → R, i = 1, ..., n} be a set of linear functionals, y ∈ R^n and
$$U(y) = \left\{f \in H^{m,2}[a, b] \mid \lambda_i(f) = y_i,\ i = 1, \ldots, n\right\}. \qquad (2.2.17)$$
Definition 2.2.3. The problem of finding the elements s ∈ U(y) with the property that
$$\|s^{(m)}\|_2 = \inf_{u\in U(y)}\|u^{(m)}\|_2$$
is called a Polynomial Spline Interpolation Problem (denoted PSIP).
Remark 2.2.4. A solution s of a (PSIP) is the smoothest function (‖s^{(m)}‖₂ → min) or, in other words, the "closest" function to a polynomial P ∈ P_{m−1} (for which ‖P^{(m)}‖₂ = 0) that also satisfies the conditions λ_i(s) = y_i, i = 1, ..., n.

A solution of a (PSIP) is called a polynomial spline function that interpolates y with respect to Λ, or a natural spline that interpolates y.
Next, we look for existence and uniqueness conditions, basic properties and a structural characterization of the solutions of a (PSIP). For these we need the results given in the following lemmas.
Lemma 2.2.5. Let e_i, i = 0, ..., m − 1, be a basis of P_{m−1} and let k ∈ N, 0 ≤ k ≤ m − 1. There exists a linear functional λ_k, defined on H^{m,2}[a, b], such that
$$\lambda_k(e_i) = \delta_{ki}, \quad i = 0, \ldots, m-1.$$
Proof. Considering f ∈ H^{m,2}[a, b], it is known that it can be represented uniquely in the form
$$f(x) = p_f(x) + \int_a^x \frac{(x-t)^{m-1}}{(m-1)!}\,f^{(m)}(t)\,dt,$$
with p_f ∈ P_{m−1}. Denote by
$$p_f = \sum_{i=0}^{m-1}\alpha_i(f)\,e_i$$
the representation of p_f in terms of e_i, i = 0, ..., m − 1. Then the functional λ_k defined by λ_k(f) = α_k(f) has the desired property.
Lemma 2.2.6. Let f ∈ L^p[c, d], 1 < p < ∞. If
$$\int_c^d f(x)\varphi(x)\,dx = 0, \quad \varphi \in C^\infty(c, d),$$
then f vanishes almost everywhere on (c, d), which is denoted by f ≡ 0 a.e. on (c, d).
Proof. Assume f > 0 on a set I ⊂ (c, d) of positive measure. Denoting by C_I the characteristic function of the set I, we have f^{p−1}C_I ∈ L^{p′}(c, d), where 1/p + 1/p′ = 1. Indeed,
$$\int_c^d\left[(f(x))^{p-1}C_I(x)\right]^{p'}dx = \int_I\left[(f(x))^{p-1}\right]^{p'}dx = \int_I (f(x))^p\,dx.$$
Note that the latter integral is well defined, since f ∈ L^p[c, d]. Taking into account that C^∞[c, d] is dense in L^{p′}[c, d], i.e., for every g ∈ L^{p′}[c, d] there exists a sequence (φ_k)_{k∈N} ⊂ C^∞[c, d] such that g = lim_{k→∞} φ_k, one obtains
$$\int_I (f(x))^p\,dx = \int_c^d f(x)(f(x))^{p-1}C_I(x)\,dx = \lim_{k\to\infty}\int_c^d f(x)\varphi_k(x)\,dx = 0,$$
which contradicts the assumption f(x) > 0, x ∈ I.
Lemma 2.2.7. Let f ∈ L^p[c, d], 1 < p < ∞. If
$$\int_c^d f(x)\varphi^{(m)}(x)\,dx = 0, \quad \varphi \in C^\infty_{m-1}[c, d], \qquad (2.2.18)$$
then there exists a polynomial p_f ∈ P_{m−1} such that f = p_f a.e. on (c, d), where
$$C^\infty_{m-1}[c, d] = \left\{f \in C^\infty[c, d] \mid f^{(j)}(c) = f^{(j)}(d) = 0,\ j = 0, \ldots, m-1\right\}.$$
Proof. We construct p_f ∈ P_{m−1} such that
$$\langle f - p_f, e_i\rangle_2 = 0, \quad i = 0, \ldots, m-1.$$
Taking into account Lemma 2.2.6, it is sufficient to prove that g = f − p_f satisfies the condition
$$\langle g, \psi\rangle_2 = 0, \quad \psi \in C^\infty[c, d].$$
Let ψ ∈ C^∞[c, d] and construct the polynomial p_ψ ∈ P_{m−1} such that
$$\langle \psi - p_\psi, e_i\rangle_2 = 0, \quad i = 0, \ldots, m-1.$$
Let
$$\varphi(x) = \int_c^x \frac{(x-t)^{m-1}}{(m-1)!}\,(\psi - p_\psi)(t)\,dt.$$
We have
$$\varphi^{(j)}(c) = 0, \quad j = 0, \ldots, m-1,$$
and from the orthogonality condition ψ − p_ψ ⊥ P_{m−1} we also get
$$\varphi^{(j)}(d) = 0, \quad j = 0, \ldots, m-1.$$
Thus it follows that φ ∈ C^∞_{m−1}[c, d]. Using f − p_f ⊥ P_{m−1} and ψ − p_ψ ⊥ P_{m−1}, one obtains
$$\langle f - p_f, \psi\rangle_2 = \langle f - p_f, \psi - p_\psi\rangle_2 = \langle f, \psi - p_\psi\rangle_2 = \left\langle f, \varphi^{(m)}\right\rangle_2 = 0,$$
according to (2.2.18). Hence f − p_f = 0 a.e. on (c, d), which implies f = p_f a.e. on (c, d).
Remark 2.2.8. The problem
$$(PSIP)\quad \text{find } s \in U(y) \text{ such that } \|s^{(m)}\|_2 = \inf_{u\in U(y)}\|u^{(m)}\|_2$$
is equivalent to the best approximation problem in L²[a, b], i.e.,
$$(BAP)\quad \text{find } \sigma \in U^{(m)}(y) \text{ such that } \|\sigma\|_2 = \inf_{v\in U^{(m)}(y)}\|v\|_2,$$
where
$$U^{(m)}(y) = \left\{v \mid v = u^{(m)},\ u \in U(y)\right\}.$$
Theorem 2.2.9. (Existence theorem). If the functionals λ_i ∈ Λ, i = 1, ..., n, are bounded and U(y) given in (2.2.17) is a non-empty set, then the (PSIP) has a solution.

Proof. Using the equivalence from Remark 2.2.8 and taking into account that the (BAP) has a unique solution if U^{(m)}(y) ⊂ L²[a, b] is: 1) non-empty, 2) convex and 3) closed, we have to check that U^{(m)}(y) verifies these three conditions.

1) Since U(y) was assumed to be non-empty, U^{(m)}(y) is also non-empty.

2) The functionals λ_i, i = 1, ..., n, are linear. Thus, for u_1, u_2 ∈ U(y) and α ∈ [0, 1], we have
$$\lambda_i(\alpha u_1 + (1-\alpha)u_2) = \alpha\lambda_i(u_1) + (1-\alpha)\lambda_i(u_2) = \alpha y_i + (1-\alpha)y_i = y_i.$$
So αu_1 + (1 − α)u_2 ∈ U(y), which implies that U(y) is convex. Since U(y) is convex and D^m is linear, it follows that U^{(m)}(y) is also a convex set.
3) Let (gk)k∈N be a sequence in U (m)(y) that converges to a functiong ∈ L2[a, b]. We have to prove that g ∈ U (m)(y). We show that there exists apolynomial p ∈ Pm−1 such that the function
f = p + h,
with
h(x) =
∫ x
a
(x−t)m−1
(m−1)!g(t)dt,
belongs to U(y). In this case
f (m) = g ∈ U (m)(y).
Since λ_i, i = 1, ..., n, are bounded, U(y) is a closed set. Thus, it is sufficient to show that f is the limit of a sequence of functions from U(y), in the norm ‖·‖_{m,2} of H^{m,2}[a, b]. Assume that f_k ∈ U(y) are such that

g_k = f_k^{(m)},  k ∈ N.

It follows that

f_k = p_k + h_k,
2.2. Univariate interpolation operators 59
with p_k ∈ P_{m−1} and

h_k(x) = ∫_a^x (x−t)^{m−1}/(m−1)! · g_k(t) dt.
The sequence (h_k)_{k∈N} converges to h. In order to find the limit of the sequence (p_k)_{k∈N}, we first assume that there exist m linearly independent functionals λ_i ∈ Λ, i = 1, ..., m, on P_{m−1}. In particular, we assume that λ_i are such that the matrix

A = (λ_i(v_j))_{i,j=1,...,m},  with v_j(x) = (x − a)^{j−1},

is non-singular. Then,

(p_k(a), p_k′(a), ..., p_k^{(m−1)}(a))^T = A^{−1}(λ_1(p_k), ..., λ_m(p_k))^T.
Since λ_i(f_k) = y_i are bounded and (λ_i(h_k))_{k∈N} converges to λ_i(h), i = 1, ..., n, it follows that λ_i(p_k), i = 1, ..., m, are uniformly bounded. Hence, each of the sequences (p_k^{(j)}(a))_{k∈N}, j = 0, ..., m−1, is uniformly bounded, which implies that each of the m sequences contains a convergent subsequence (p_{k_ν}^{(j)}(a))_{k_ν∈N}. Let

p^{(j)}(a) = lim_{k_ν→∞} p_{k_ν}^{(j)}(a),  j = 0, ..., m−1.

These limits define a polynomial p ∈ P_{m−1}. The sequence (f_k)_{k∈N}, with f_k = p_k + h_k, converges to f = p + h, which completes the proof in this case.
Assume now that there exist only d < m linearly independent functionals λ_i ∈ Λ, i = 1, ..., d, on P_{m−1}, i.e., the rank of the matrix

(λ_i(v_j))_{i=1,...,n; j=1,...,m}

is equal to d. Let λ_i, i = 1, ..., d, and v_j, j = 1, ..., m, be renumberings of the functionals and of the polynomials such that the rank of the matrix (λ_i(v_j))_{i,j=1,...,d} is d. According to Lemma 2.2.5, one can construct functionals λ_i, i = d+1, ..., m, on H^{m,2}[a, b], such that

λ_i(v_j) = δ_{ij},  i = d+1, ..., m;  j = 1, ..., m.
It follows that the functionals λ_i, i = 1, ..., m, are linearly independent on P_{m−1} and the proof is reduced to the previous case. Note that in the representation of f_k, the polynomials p_k have to be chosen such that

λ_i(p_k) = 0,  i = d+1, ..., m.

Thus, the (BAP) has a unique solution. This implies that the equivalent (PSIP) has a solution.
Theorem 2.2.10. If s_1, s_2 are two solutions of (PSIP) then

s_1 − s_2 ∈ P_{m−1}.

Proof. As the equivalent (BAP) has a unique solution σ, it follows that

s_1^{(m)} = s_2^{(m)} = σ,

or (s_1 − s_2)^{(m)} = 0, which is equivalent to s_1 − s_2 ∈ P_{m−1}.
Theorem 2.2.11. The (PSIP) has a unique solution if and only if

P_{m−1} ∩ U(0) = {0}.  (2.2.19)
Proof. Let s_1 be a solution of (PSIP). Suppose that there exists

v ∈ P_{m−1} ∩ U(0),  v ≠ 0.

Then s_2 = s_1 + v is also a solution of (PSIP), since

λ_i(s_2) = λ_i(s_1) + λ_i(v) = λ_i(s_1) = y_i

and ‖s_2^{(m)}‖_2 = ‖s_1^{(m)}‖_2.
As s_2 ≠ s_1, it means that (PSIP) has at least two solutions and the necessity is proved. For the sufficiency, we assume that s_1 and s_2 are solutions of (PSIP). Then, by Theorem 2.2.10, it follows that s_1 − s_2 ∈ P_{m−1} and by

λ_i(s_1 − s_2) = λ_i(s_1) − λ_i(s_2) = y_i − y_i = 0

we get that s_1 − s_2 ∈ U(0). Thus,

s_1 − s_2 ∈ P_{m−1} ∩ U(0)

and from (2.2.19) it follows that s_1 = s_2.
By Theorem 2.2.9, the existence of a solution of (PSIP) is characterized by the set of functionals Λ. The question is whether the set Λ can also characterize the uniqueness of the solution. The answer is given by the following result.
Theorem 2.2.12. If Λ contains at least m functionals of Hermite type, then

P_{m−1} ∩ U(0) = {0}.
Proof. Consider p ∈ P_{m−1} ∩ U(0) and let M = {λ_1, ..., λ_r} ⊂ Λ, with r ≥ m, be a set of Hermite functionals. Since p ∈ U(0), we have

λ_i(p) = 0,  i = 1, ..., r.

Thus, p has at least r zeros, counting multiplicities (r ≥ m). Taking into account that p ∈ P_{m−1}, it follows that p = 0.
The next theorem states one of the most important properties of the spline interpolant, namely the orthogonality property.
Theorem 2.2.13. The function s ∈ U(y) is a solution of (PSIP) if and only if

⟨s^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0).  (2.2.20)
Proof. We use again the equivalence between (PSIP) and (BAP) (Remark 2.2.8). Denote σ := s^{(m)}. Since U^{(m)}(0) is a linear subspace of L^2[a, b] and U^{(m)}(y) is a translation of U^{(m)}(0), it follows that σ ∈ U^{(m)}(y) is a solution of the (BAP) if and only if σ ⊥ U^{(m)}(0), i.e.,

⟨σ, v⟩_2 = 0,  v ∈ U^{(m)}(0),

that is,

⟨s^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0).
According to Theorem 2.2.13, the orthogonality property (2.2.20) completely characterizes the solution of (PSIP), so it is natural to investigate the set

S(Λ) = {f ∈ H^{m,2}[a, b] | ⟨f^{(m)}, g^{(m)}⟩_2 = 0, g ∈ U(0)}.
Theorem 2.2.14. The set S(Λ) is a closed linear subspace of H^{m,2}[a, b].
Proof. Let f_1, f_2 ∈ S(Λ), i.e.,

⟨f_i^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0), for i = 1, 2.

Taking f = α_1 f_1 + α_2 f_2, we have

⟨f^{(m)}, g^{(m)}⟩_2 = α_1 ⟨f_1^{(m)}, g^{(m)}⟩_2 + α_2 ⟨f_2^{(m)}, g^{(m)}⟩_2 = 0, for g ∈ U(0),
hence f ∈ S(Λ). It follows that S(Λ) is a linear subspace of H^{m,2}[a, b].
In order to show that S(Λ) is closed, we consider a sequence (f_k)_{k∈N} ⊂ S(Λ) that converges to f ∈ H^{m,2}[a, b]. We have to prove that f ∈ S(Λ). Since f_k ∈ S(Λ), we have

⟨f_k^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0),
and from

lim_{k→∞} f_k = f, in H^{m,2}[a, b],

one obtains

lim_{k→∞} ⟨f_k − f, g⟩_{m,2} = 0,

which implies

lim_{k→∞} ⟨(f_k − f)^{(m)}, g^{(m)}⟩_2 = 0.

Consequently,

⟨f^{(m)}, g^{(m)}⟩_2 = lim_{k→∞} [⟨f^{(m)} − f_k^{(m)}, g^{(m)}⟩_2 + ⟨f_k^{(m)}, g^{(m)}⟩_2] = 0, for g ∈ U(0),

hence f ∈ S(Λ).
Remark 2.2.15. The orthogonality property (2.2.20) implies that

⟨p^{(m)}, g^{(m)}⟩_2 = 0, for p ∈ P_{m−1},

i.e., P_{m−1} ⊂ S(Λ).
Definition 2.2.16. The space S(Λ) is called the space of spline functions interpolating U(y).
Theorem 2.2.17. Let (v_i)_{i=1,...,d} be a basis of the space P_{m−1} ∩ U(0), and s_i a solution of (PSIP) that interpolates the set

U_i = {f ∈ H^{m,2}[a, b] | λ_j(f) = δ_{ij}, j = 1, ..., n},

for i = 1, ..., n. Then the set (s_i)_{i=1,...,n} ∪ (v_j)_{j=1,...,d} is a basis for S(Λ).
Proof. Let s ∈ S(Λ) and

h = s − ∑_{i=1}^n s_i λ_i(s).  (2.2.21)

Then h ∈ S(Λ) and h ∈ U(0). From (2.2.20) it follows that

⟨h^{(m)}, h^{(m)}⟩_2 = 0.
Thus h^{(m)} = 0, implying h ∈ P_{m−1}. Hence, we have h ∈ P_{m−1} ∩ U(0). Since (v_j)_{j=1,...,d} is a basis of P_{m−1} ∩ U(0), we can write

h = ∑_{j=1}^d c_j v_j

and using (2.2.21), we obtain

s = ∑_{i=1}^n s_i λ_i(s) + ∑_{j=1}^d c_j v_j.
In order to show that the elements of the set (s_i)_{i=1,...,n} ∪ (v_j)_{j=1,...,d} are linearly independent, we start with

∑_{i=1}^n a_i s_i + ∑_{j=1}^d b_j v_j = 0.  (2.2.22)
We have

λ_k(∑_{i=1}^n a_i s_i + ∑_{j=1}^d b_j v_j) = a_k = 0,  k = 1, ..., n,

since λ_k(s_i) = δ_{ki} and λ_k(v_j) = 0, j = 1, ..., d (v_j ∈ U(0)). Substituting a_k = 0 in (2.2.22), it yields

∑_{j=1}^d b_j v_j = 0.

Because v_j, j = 1, ..., d, are linearly independent, we obtain b_j = 0, j = 1, ..., d. Hence, (2.2.22) holds if and only if a_i = 0, i = 1, ..., n, and b_j = 0, j = 1, ..., d. This implies that s_1, ..., s_n, v_1, ..., v_d are linearly independent.
Remark 2.2.18. If

P_{m−1} ∩ U(0) = {0}

(i.e., (PSIP) has a unique solution), then

dim S(Λ) = n

and {s_1, ..., s_n} is a basis of S(Λ).
The functions s_i, i = 1, ..., n, are called fundamental interpolation splines or cardinal splines, due to the property that

λ_k(s_i) = δ_{ki},  k, i = 1, ..., n.
Another very important property of the solutions of a (PSIP) is their structural characterization. Obviously, the structure of a solution of (PSIP) essentially depends on the set Λ that defines the interpolatory set U(y).
In the sequel we suppose that Λ is a set of Birkhoff type functionals,

Λ = {λ_ij | λ_ij f = f^{(j)}(x_i), i = 1, ..., k, j ∈ I_i},  (2.2.23)

where a ≤ x_1 < ... < x_k ≤ b is a partition of the interval [a, b], r_1, ..., r_k ∈ N, with r_i ≤ m−1, and I_i ⊆ {0, 1, ..., r_i}, i = 1, ..., k.
Theorem 2.2.19. (Structural characterization). Let Λ be a set of Birkhoff type functionals given by (2.2.23), y ∈ R^n, and let U(y) be the corresponding interpolatory set. The function s ∈ U(y) is a solution of (PSIP) if and only if:
1) s^{(2m)}(x) = 0, x ∈ [x_1, x_k] \ {x_1, ..., x_k};
2) s^{(m)}(x) = 0, x ∈ (a, x_1) ∪ (x_k, b);
3) s^{(2m−1−µ)}(x_i − 0) = s^{(2m−1−µ)}(x_i + 0), µ ∈ {0, 1, ..., m−1} \ I_i, for i = 1, ..., k.
Proof. Let s ∈ U(y) be a solution of (PSIP).
1) Let I be one of the intervals (x_i, x_{i+1}), i = 0, 1, ..., k (x_0 = a and x_{k+1} = b). Considering ϕ ∈ C^∞_{m−1}(I), we define the function

g(x) = ϕ(x) for x ∈ I,  g(x) = 0 for x ∈ [a, b] \ I.

Since g ∈ U(0) and s is a solution of (PSIP), the orthogonality property gives

∫_a^b s^{(m)}(x) g^{(m)}(x) dx = 0.

As g^{(m)} = ϕ^{(m)} on I, we obtain

∫_a^b s^{(m)}(x) g^{(m)}(x) dx = ∫_I s^{(m)}(x) ϕ^{(m)}(x) dx = 0.
Using Lemma 2.2.7, it follows that

s^{(m)} = p a.e. on I, with p ∈ P_{m−1}.

But s ∈ H^{m,2}[a, b], hence

s(x) = p_s(x) + ∫_a^x (x−t)^{m−1}/(m−1)! · s^{(m)}(t) dt,  p_s ∈ P_{m−1}.

Consequently, s ∈ P_{2m−1} on the interval I. Thus s^{(2m)}(x) = 0, x ∈ I, which implies 1).
2) Now, we denote I := [a, x_1) ≠ ∅. Since s ∈ P_{2m−1} on I, we have s^{(m)} ∈ C(I). Consider the function

s̄(x) = p(x) for x ∈ [a, x_1],  s̄(x) = s(x) for x ∈ (x_1, b],

where p ∈ P_{m−1} is such that

p^{(j)}(x_1) = s^{(j)}(x_1),  j = 0, 1, ..., m−1.
Notice first that s̄ ∈ U(y). Indeed,

λ_i(s̄) = λ_i(s),  i = 1, ..., n.

Assume that there exists ξ ∈ I such that s^{(m)}(ξ) ≠ 0. Since s^{(m)} is continuous on I, there exists a neighborhood V(ξ) of ξ such that

s^{(m)}(x) ≠ 0,  x ∈ V(ξ).

This leads to the inequality

‖s̄^{(m)}‖_2 < ‖s^{(m)}‖_2,

which contradicts the hypothesis that s is a solution of (PSIP). Hence,

s^{(m)}(x) = 0,  x ∈ I.
The case I = (x_k, b] can be treated similarly.
3) Let g ∈ H^{m,2}[a, b]. One has

⟨s^{(m)}, g^{(m)}⟩_2 = ∫_a^b s^{(m)}(x) g^{(m)}(x) dx = ∑_{i=0}^k ∫_{x_i}^{x_{i+1}} s^{(m)}(x) g^{(m)}(x) dx.
Integrating by parts in the above relation, it yields

⟨s^{(m)}, g^{(m)}⟩_2 = ∫_a^{x_1} s^{(m)}(x) g^{(m)}(x) dx + ∫_{x_k}^b s^{(m)}(x) g^{(m)}(x) dx
+ ∑_{i=1}^{k−1} (−1)^m ∫_{x_i}^{x_{i+1}} s^{(2m)}(x) g(x) dx
+ ∑_{i=1}^k ∑_{j=0}^{m−1} (−1)^{m−j} g^{(j)}(x_i) [s^{(2m−1−j)}(x_i + 0) − s^{(2m−1−j)}(x_i − 0)].  (2.2.24)
If x_1 = a, the first integral is missing, and at x_1 the jump is taken to be the one-sided limit from the right. The case x_k = b is treated analogously.
Consider now an ε > 0 such that (x_i − ε, x_i + ε) ∩ {x_1, ..., x_k} = {x_i}, and µ ∈ {0, 1, ..., m−1} \ I_i. We construct h ∈ P_{3m−1} that satisfies the conditions

h^{(j)}(x_i − ε) = h^{(j)}(x_i + ε) = 0,
h^{(j)}(x_i) = δ_{jµ},

for j = 0, 1, ..., m−1. We define the function g by

g(x) = h(x) for x ∈ J = [x_i − ε, x_i + ε] ∩ [a, b],  g(x) = 0 otherwise.
We have g ∈ U(0), hence

∫_a^b s^{(m)}(x) g^{(m)}(x) dx = 0.

Taking into account the latter equality, 1), 2) and (2.2.24), we obtain

∑_{j=0}^{m−1} (−1)^{m−j} g^{(j)}(x_i) [s^{(2m−1−j)}(x_i + 0) − s^{(2m−1−j)}(x_i − 0)] = 0,  i = 1, ..., k.

Since

g^{(j)}(x_i) = δ_{jµ},

we get

s^{(2m−1−µ)}(x_i + 0) − s^{(2m−1−µ)}(x_i − 0) = 0,  i = 1, ..., k.
Conversely, if statements 1)–3) are satisfied, then from (2.2.24) it follows that

⟨s^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0),

thus s is a solution of (PSIP).
Remark 2.2.20. The Structural Characterization Theorem states that the solution s of (PSIP) is a polynomial of degree 2m−1 on each interior interval (x_i, x_{i+1}) and a polynomial of degree m−1 on the intervals [a, x_1) and (x_k, b]. Furthermore, the derivative of order 2m−1−µ is continuous at x_i if the functional f ↦ f^{(µ)}(x_i) does not belong to Λ.
A solution of a (PSIP) is also called a (natural) spline of order 2m−1.
In conclusion, according to Theorems 2.2.13 and 2.2.19, we can state the following result.
Theorem 2.2.21. If Λ is a set of Birkhoff type linear functionals, U(y), y ∈ R^n, is the corresponding interpolation set and S(Λ) is the set of splines that interpolate U(y), then the following three statements are equivalent:
(A) s ∈ S(Λ) ⟺ ‖s^{(m)}‖_2 = inf_{u∈U(y)} ‖u^{(m)}‖_2;

(B) s ∈ S(Λ) ⟺ ⟨s^{(m)}, g^{(m)}⟩_2 = 0, g ∈ U(0);

(C) s ∈ S(Λ) ⟺
1) s^{(2m)}(x) = 0, x ∈ [x_1, x_k] \ {x_i}_{i=1,...,k},
2) s^{(m)}(x) = 0, x ∈ [a, x_1) ∪ (x_k, b],
3) s^{(2m−1−µ)}(x_i + 0) − s^{(2m−1−µ)}(x_i − 0) = 0, with µ ∈ {0, 1, ..., m−1} \ I_i, i = 1, ..., k.

Consequently, each of the three statements above can be used as the definition of the solution s of (PSIP), and the other two can then be proved as theorems. In the present case, (A) was taken as the definition (Definition 2.2.3), while (B) was proved as the Orthogonality Theorem (Theorem 2.2.13) and (C) as the Structural Characterization Theorem (Theorem 2.2.19).
Suppose now that (PSIP) has a unique solution.
Theorem 2.2.22. Consider y = (y_1, ..., y_n) ∈ R^n and let s_i, i = 1, ..., n, be the fundamental interpolation splines. Then the function

s_y = ∑_{i=1}^n y_i s_i

is the solution of (PSIP) corresponding to the set U(y).
Proof. One has

λ_k(s_y) = y_k,  k = 1, ..., n,

thus s_y ∈ U(y). Since s_i are the solutions of (PSIP) that interpolate the sets U_i, i = 1, ..., n, respectively, by Theorem 2.2.13 it follows that

⟨s_i^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0),

for all i = 1, ..., n. Consequently,

⟨s_y^{(m)}, g^{(m)}⟩_2 = ∑_{i=1}^n y_i ⟨s_i^{(m)}, g^{(m)}⟩_2 = 0,  g ∈ U(0),

hence s_y ∈ S(Λ).
Remark 2.2.23. Theorem 2.2.22 can be interpreted as follows: for a given f ∈ H^{m,2}[a, b] and a given set of functionals Λ, the function

Sf = ∑_{i=1}^n s_i λ_i(f)  (2.2.25)

is a spline that interpolates f with respect to Λ, i.e.,

λ_i(Sf) = λ_i(f),  i = 1, ..., n.

Furthermore, Sf is the smoothest function in H^{m,2}[a, b] that interpolates f, in the sense that ‖(Sf)^{(m)}‖_2 → min.
Remark 2.2.24. From (2.2.25) one can notice that S : H^{m,2}[a, b] → S(Λ) is a linear and idempotent operator, hence it is a projector.

Definition 2.2.25. The operator S : H^{m,2}[a, b] → S(Λ) is called the polynomial spline interpolation operator corresponding to Λ.
Next, we consider the spline operator S for particular sets of functionals Λ.
2.2.3 Spline operators of Lagrange type
For a given function f : [a, b] → R and x_i ∈ [a, b], i = 1, ..., n, we consider

Λ := Λ_L = {λ_i | λ_i(f) = f(x_i), i = 1, ..., n}.
By Theorem 2.2.12 it follows that if n ≥ m then, for every f ∈ H^{m,2}[a, b], the interpolation spline function S_L f exists and is unique.
Definition 2.2.26. For Λ := Λ_L, the corresponding spline operator S_L is called the spline operator of Lagrange type.
For determining S_L one can use Theorem 2.2.19, with the remark that the third condition of this theorem becomes 3′) S_L f ∈ C^{2m−2}[a, b]. Writing the function S_L f as

(S_L f)(x) = ∑_{i=0}^{m−1} a_i x^i + ∑_{j=1}^n b_j (x − x_j)_+^{2m−1},  (2.2.26)
it follows that the conditions 1), 2) for x ∈ [a, x_1) and 3′) are verified. Indeed,

(S_L f)(x) = ∑_{i=0}^{m−1} a_i x^i + ∑_{j=1}^k b_j (x − x_j)^{2m−1},

for x ∈ (x_k, x_{k+1}), k = 1, ..., n−1,

(S_L f)(x) = ∑_{i=0}^{m−1} a_i x^i,  x ∈ [a, x_1),

and

(S_L f)^{(ν)}(x_j − 0) = (S_L f)^{(ν)}(x_j + 0),  ν = 0, ..., 2m−2,  j = 1, ..., n.
So, S_L f ∈ P_{2m−1} on each interval (x_j, x_{j+1}), j = 1, ..., n−1, S_L f ∈ P_{m−1} on [a, x_1) and S_L f ∈ C^{2m−2}[a, b]. On the interval (x_n, b] we have

(S_L f)(x) = ∑_{i=0}^{m−1} a_i x^i + ∑_{j=1}^n b_j (x − x_j)^{2m−1},

so S_L f ∈ P_{2m−1}. Hence, condition 2) is not yet verified on this interval.
Writing the polynomial S_L f in Taylor form, i.e.,

(S_L f)(x) = ∑_{k=0}^{2m−1} (x−α)^k/k! · (S_L f)^{(k)}(α),  α > x_n,

it follows that it reduces to a polynomial of degree m−1 on (x_n, b] if we suppose that

(S_L f)^{(p)}(α) = 0,  p = m, ..., 2m−1.  (2.2.27)
Consequently, if the conditions (2.2.27) are verified, then all three properties of a spline function are fulfilled.
Regarding the expression from (2.2.26), we notice that the function S_L f depends on m + n parameters, a_i, b_j ∈ R, i = 0, ..., m−1; j = 1, ..., n. We also notice that (2.2.27) and the interpolation conditions

(S_L f)(x_i) = f(x_i),  i = 1, ..., n,

provide an (m+n) × (m+n) linear algebraic system:

(S_L f)^{(p)}(α) = 0,  p = m, ..., 2m−1,
(S_L f)(x_i) = f(x_i),  i = 1, ..., n.  (2.2.28)
For n ≥ m, the function S_L f may also be written in the form

S_L f = ∑_{k=1}^n s_k f(x_k),
where s_k, k = 1, ..., n, are the fundamental interpolation spline functions. So, the determination of S_L f reduces to the problem of finding the functions s_k, k = 1, ..., n. To do this we write these functions in forms corresponding to (2.2.26), i.e.,

s_k(x) = ∑_{i=0}^{m−1} a^k_i x^i + ∑_{j=1}^n b^k_j (x − x_j)_+^{2m−1},  k = 1, ..., n,
with a^k_i, i = 0, ..., m−1, and b^k_j, j = 1, ..., n, obtained as the solutions of the linear algebraic system

s_k^{(p)}(α) = 0,  p = m, ..., 2m−1, with α > x_n,
s_k(x_ν) = δ_{kν},  ν = 1, ..., n,  (2.2.29)

for k = 1, ..., n. We remark that the matrices of all these systems coincide and only the free terms differ.
Definition 2.2.27. The formula

f = S_L f + R_L f

is called the spline interpolation formula of Lagrange type, where R_L is the remainder operator.
We have dex(S_L) = m−1 and, applying Peano's theorem, we obtain

(R_L f)(x) = ∫_a^b ϕ_L(x, t) f^{(m)}(t) dt,
where

ϕ_L(x, t) = (x − t)_+^{m−1}/(m−1)! − ∑_{i=1}^n s_i(x) (x_i − t)_+^{m−1}/(m−1)!.
If f^{(m)} is continuous on [a, b], then

|(R_L f)(x)| ≤ ‖f^{(m)}‖_∞ ∫_a^b |ϕ_L(x, t)| dt.
Example 2.2.28. Let f ∈ C[0, 1] and

Λ_L(f) = {f(i/n) | i = 0, ..., n}.
We are looking for a linear spline interpolation formula. We have

f = S_1 f + R_1 f,

where

(S_1 f)(x) = ∑_{i=0}^n s_i(x) f(i/n)
and

(R_1 f)(x) = ∫_0^1 ϕ_1(x, t) f′(t) dt,

with

ϕ_1(x, t) = (x − t)_+^0 − ∑_{i=0}^n s_i(x)(i/n − t)_+^0.
We have to find the functions s_i, i = 0, ..., n. We notice that

s_i(x) = a^i_0 + ∑_{j=0}^n b^i_j (x − j/n)_+,  i = 0, ..., n,

and the coefficients a^i_0, b^i_0, ..., b^i_n, i = 0, ..., n, are provided by the conditions:
s_i(k/n) = δ_{ik},  k = 0, ..., n,
s′_i(α) = 0,  α > 1 (we take α = 2).
We obtain

s_0(x) = 1 − nx + n(x − 1/n)_+,
s_1(x) = nx − 2n(x − 1/n)_+ + n(x − 2/n)_+,
s_2(x) = n(x − 1/n)_+ − 2n(x − 2/n)_+ + n(x − 3/n)_+,
...
s_{n−1}(x) = n(x − (n−2)/n)_+ − 2n(x − (n−1)/n)_+ + n(x − 1)_+,
s_n(x) = n(x − (n−1)/n)_+ − n(x − 1)_+.
2.2.4 Spline operators of Hermite type
For a function f ∈ H^{m,2}[a, b] we have the set of Hermite type functionals

Λ_H = {λ_kj | λ_kj f = f^{(j)}(x_k), k = 1, ..., n, j = 0, ..., r_k},

where r_k ∈ N, r_k < m.
Definition 2.2.29. The spline operator S corresponding to the set of Hermite type functionals, Λ_H, is called the spline operator of Hermite type.
The spline function of Hermite type can be written as

(S_H f)(x) = ∑_{i=0}^{m−1} a_i x^i + ∑_{k=1}^n ∑_{j=0}^{r_k} b_kj (x − x_k)_+^{2m−j−1},  (2.2.30)
with the parameters a_i, b_kj given by conditions 1)–3) from Theorem 2.2.19 and by the interpolation conditions:

(S_H f)^{(j)}(x_k) = f^{(j)}(x_k),  k = 1, ..., n;  j = 0, ..., r_k.
It is easy to check that S_H f, written as in (2.2.30), verifies conditions 1), 2) on [a, x_1) and 3) from Theorem 2.2.19. For condition 2) to hold also on (x_n, b], it is sufficient to require

(S_H f)^{(p)}(α) = 0,  p = m, ..., 2m−1,

analogous to (2.2.27). Therefore, the parameters a_i, i = 0, ..., m−1, and b_kj, k = 1, ..., n, j = 0, ..., r_k (their number is N := n + m + ∑_{k=1}^n r_k), are determined as the solution of the N × N linear algebraic system:

(S_H f)^{(j)}(x_k) = f^{(j)}(x_k),  k = 1, ..., n;  j = 0, ..., r_k,
(S_H f)^{(p)}(α) = 0,  p = m, ..., 2m−1.
If |Λ_H| ≥ m, by Theorem 2.2.12 it follows that the spline function S_H f exists and is unique, so the system is compatible.
Also, the function S_H f can be represented using the fundamental spline functions, i.e.,

(S_H f)(x) = ∑_{k=1}^n ∑_{j=0}^{r_k} s_kj(x) f^{(j)}(x_k),  (2.2.31)
where

s_kj(x) = ∑_{µ=0}^{m−1} a^{kj}_µ x^µ + ∑_{µ=1}^{n} ∑_{ν=0}^{r_µ} b^{kj}_{µν} (x − x_µ)_+^{2m−ν−1},

for k = 1, ..., n and j = 0, ..., r_k. Each fundamental spline function s_kj is obtained from the corresponding system of the form

s_kj^{(q)}(x_ν) = 0,  ν = 1, ..., n, ν ≠ k,  q = 0, ..., r_ν,
s_kj^{(q)}(x_k) = δ_{jq},  q = 0, ..., r_k,
s_kj^{(p)}(α) = 0,  p = m, ..., 2m−1, with α > x_n,  (2.2.32)

for k = 1, ..., n and j = 0, ..., r_k. As in the case of Lagrange interpolation, all these systems have the same matrix.
Definition 2.2.30. Let Λ := Λ_H and S_H be the corresponding spline operator. The formula

f = S_H f + R_H f

is called the spline interpolation formula of Hermite type, where R_H is the remainder operator.
Applying Peano's theorem, we obtain

(R_H f)(x) = ∫_a^b ϕ_H(x, t) f^{(m)}(t) dt,
where

ϕ_H(x, t) = (x − t)_+^{m−1}/(m−1)! − ∑_{k=1}^n ∑_{j=0}^{r_k} s_kj(x) (x_k − t)_+^{m−j−1}/(m−j−1)!.
If f^{(m)} is continuous on [a, b], then

|(R_H f)(x)| ≤ ‖f^{(m)}‖_∞ ∫_a^b |ϕ_H(x, t)| dt.
Example 2.2.31. Let f ∈ C^2[0, 1] and Λ_H(f) = {f(0), f′(0), f(1/2), f(1), f′(1)} be given. We have to find the corresponding spline interpolation formula.
We consider the cubic spline function (m = 2) with its interpolation formula:

f = S_3 f + R_3 f.
From (2.2.31) we have

(S_3 f)(x) = s_00(x)f(0) + s_01(x)f′(0) + s_10(x)f(1/2) + s_20(x)f(1) + s_21(x)f′(1),
with the fundamental interpolation spline functions having the expressions

s_kj(x) = a^{kj}_0 + a^{kj}_1 x + b^{kj}_{00} x^3 + b^{kj}_{01} x^2 + b^{kj}_{10} (x − 1/2)_+^3 + b^{kj}_{20} (x − 1)_+^3 + b^{kj}_{21} (x − 1)_+^2,  (2.2.33)

for (k, j) ∈ {(0, 0), (0, 1), (1, 0), (2, 0), (2, 1)}.
In this case the system (2.2.32), for α = 2, becomes:

s_00(0) := a^{00}_0 = 1  (2.2.34)
s′_00(0) := a^{00}_1 = 0
s_00(1/2) := a^{00}_0 + (1/2)a^{00}_1 + (1/8)b^{00}_{00} + (1/4)b^{00}_{01} = 0
s_00(1) := a^{00}_0 + a^{00}_1 + b^{00}_{00} + b^{00}_{01} + (1/8)b^{00}_{10} = 0
s′_00(1) := a^{00}_1 + 3b^{00}_{00} + 2b^{00}_{01} + (3/4)b^{00}_{10} = 0
s″_00(2) := 12b^{00}_{00} + 2b^{00}_{01} + 9b^{00}_{10} + 6b^{00}_{20} + 2b^{00}_{21} = 0
s‴_00(2) := b^{00}_{00} + b^{00}_{10} + b^{00}_{20} = 0.
By solving this system we get

s_00(x) = 1 + 10x^3 − 9x^2 − 16(x − 1/2)_+^3.
The systems that provide the coefficients of the other fundamental spline functions have the same matrix, only the free terms changing successively to (0, 1, 0, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0, 0), (0, 0, 0, 1, 0, 0, 0), (0, 0, 0, 0, 1, 0, 0). This way, we obtain:
s_01(x) = x + 3x^3 − (7/2)x^2 − 4(x − 1/2)_+^3,
s_10(x) = −16x^3 + 12x^2 + 32(x − 1/2)_+^3,
s_20(x) = 6x^3 − 3x^2 − 16(x − 1/2)_+^3,
s_21(x) = −x^3 + (1/2)x^2 + 4(x − 1/2)_+^3.
For the remainder we have:

(R_3 f)(x) = ∫_0^1 ϕ_2(x, t) f″(t) dt,

where

ϕ_2(x, t) = (x − t)_+ − s_10(x)(1/2 − t)_+ − s_20(x)(1 − t) − s_21(x).
2.2.5 Spline operators of Birkhoff type
For a function f ∈ H^{m,2}[a, b] the set of Birkhoff type functionals is given by

Λ_B = {λ_kj | λ_kj f = f^{(j)}(x_k), k = 1, ..., n, j ∈ I_k},

for I_k ⊆ {0, ..., r_k}, r_k ∈ N, r_k < m.
Definition 2.2.32. The spline operator S_B corresponding to the set of Birkhoff type functionals, Λ_B, is called the spline operator of Birkhoff type.
If the spline function of Birkhoff type exists, it has the form

(S_B f)(x) = ∑_{i=0}^{m−1} a_i x^i + ∑_{k=1}^n ∑_{j∈I_k} b_kj (x − x_k)_+^{2m−j−1}

or

(S_B f)(x) = ∑_{k=1}^n ∑_{j∈I_k} s_kj(x) f^{(j)}(x_k),

where s_kj are the fundamental interpolation spline functions. They can be obtained from conditions of the form (2.2.32).
We illustrate this procedure in the following example.
Example 2.2.33. Let f ∈ C^2[0, 1] and Λ_B(f) = {f′(0), f(1/2), f′(1)} be given. We are looking for the corresponding interpolation formula.
We consider the cubic spline function:

(S_3 f)(x) = s_01(x)f′(0) + s_10(x)f(1/2) + s_21(x)f′(1).
We determine the fundamental interpolation spline functions s_01, s_10 and s_21. They have expressions analogous to (2.2.33), their coefficients being the solutions of systems of the form (2.2.34). So, for s_01 we have:

s′_01(0) := a^{01}_1 = 1
s_01(1/2) := a^{01}_0 + (1/2)a^{01}_1 + (1/4)b^{01}_{01} = 0
s′_01(1) := a^{01}_1 + 2b^{01}_{01} + (3/4)b^{01}_{10} = 0
s″_01(2) := 2b^{01}_{01} + 9b^{01}_{10} + 2b^{01}_{21} = 0
s‴_01(2) := 6b^{01}_{10} = 0.
It follows that

s_01(x) = −3/8 + x − (1/2)x^2 + (1/2)(x − 1)_+^2.
In the same way we obtain

s_10(x) = 1,
s_21(x) = −1/8 + (1/2)x^2 − (1/2)(x − 1)_+^2.
For the remainder term we have:

(R_3 f)(x) = ∫_0^1 ϕ_2(x, t) f″(t) dt,

with

ϕ_2(x, t) = (x − t)_+ − (1/2 − t)_+ − s_21(x).
2.2.6 Rational interpolation operators
2.2.6.1 An iterative method to construct a rational interpolation operator
Let F = P_r/P_s, where P_k ∈ P_k. Because F does not change if we replace P_r and P_s by CP_r and CP_s, respectively, where C is a nonzero constant, it follows that F depends on r + s + 1 parameters, so the same number of conditions is necessary for their determination. In this section we use interpolation conditions to determine the function F.
Setting m = r + s, we consider x_i ∈ [a, b], i = 0, 1, ..., m, with x_i ≠ x_j if i ≠ j.
Definition 2.2.34. The problem of determining the function F under the conditions

F(x_i) = f(x_i),  i = 0, 1, ..., m,  (2.2.35)

is called the rational interpolation problem.
We present an iterative method for constructing the function F, based on its representation as a finite continued fraction.
First, we consider the sequence v_k, k = 0, 1, ..., defined by the recurrence relation

v_k(x) = v_k(x_k) + (x − x_k)/v_{k+1}(x).  (2.2.36)
Setting v_0 = f and applying relation (2.2.36) successively, we obtain:

f(x) = f(x_0) + (x − x_0)/(v_1(x_1) + (x − x_1)/(v_2(x_2) + ... + (x − x_{k−1})/(v_k(x_k) + (x − x_k)/v_{k+1}(x)))).  (2.2.37)

The problem is to define the numbers v_i(x_i) such that the function F_m given by

F_m(x) = f(x_0) + (x − x_0)/(v_1(x_1) + (x − x_1)/(v_2(x_2) + ... + (x − x_{m−1})/v_m(x_m)))

satisfies the interpolation conditions

F_m(x_k) = f(x_k),  k = 0, 1, ..., m.
Definition 2.2.35. Let M = {x_i | x_i ∈ R, i = 0, ..., m} and f : M → R. The value

{x_0, x_1, ..., x_{k−1}, x_k; f} = (x_k − x_{k−1}) / ({x_0, ..., x_{k−2}, x_k; f} − {x_0, ..., x_{k−1}; f}),

with {x_0, x_1; f} = [x_0, x_1; f]^{−1}, is called the inverse divided difference of order k.
Remark 2.2.36. The inverse divided difference of order k > 1 differs from the reciprocal of the divided difference. Also, unlike the divided difference, the inverse divided difference depends on the order of the nodes.
Theorem 2.2.37. Let f : M → R. If v_k(x_k) = {x_0, x_1, ..., x_k; f}, k = 1, ..., m, then

F_m(x_k) = f(x_k),  k = 0, 1, ..., m.
Proof. We reduce it to a direct verification. Indeed, for each k = 0, 1, ..., m, we have

F_m(x_k) = f(x_0) + (x_k − x_0)/({x_0, x_1; f} + ... + (x_k − x_{k−2})/({x_0, ..., x_{k−1}; f} + (x_k − x_{k−1})/{x_0, ..., x_k; f})).
Replacing the inverse divided difference of maximal order, {x_0, ..., x_k; f}, by the expression from its definition,

{x_0, x_1, ..., x_k; f} = (x_k − x_{k−1}) / ({x_0, ..., x_{k−2}, x_k; f} − {x_0, ..., x_{k−1}; f}),

we obtain

F_m(x_k) = f(x_0) + (x_k − x_0)/({x_0, x_1; f} + ... + (x_k − x_{k−2})/{x_0, ..., x_{k−2}, x_k; f}).
Using this procedure, after k − 1 substitutions it follows that

F_m(x_k) = f(x_0) + (x_k − x_0)/{x_0, x_k; f}.

Since

{x_0, x_k; f} = (x_k − x_0)/(f(x_k) − f(x_0)),

it follows that F_m(x_k) = f(x_k).
Remark 2.2.38. The function F_m is a finite continued fraction.
We now introduce an iterative procedure which puts the function F_m into rational function form. For this, we write the relation

f(x) = f(x_0) + (x − x_0)/(v_1(x_1) + ... + (x − x_{k−1})/v_k(x))

in the equivalent form

f(x) = (v_k(x)G_k(x) + (x − x_{k−1})P_k(x)) / (v_k(x)H_k(x) + (x − x_{k−1})Q_k(x)).  (2.2.38)
In order to determine the functions G_k, H_k, P_k and Q_k, we first observe that from relation (2.2.37) we also have

f(x) = (v_{k+1}(x)G_{k+1}(x) + (x − x_k)P_{k+1}(x)) / (v_{k+1}(x)H_{k+1}(x) + (x − x_k)Q_{k+1}(x)).  (2.2.39)
On the other hand, by replacing in (2.2.38) the expression (2.2.36) of v_k(x), we obtain:

f(x) = (v_{k+1}(x)[v_k(x_k)G_k(x) + (x − x_{k−1})P_k(x)] + (x − x_k)G_k(x)) / (v_{k+1}(x)[v_k(x_k)H_k(x) + (x − x_{k−1})Q_k(x)] + (x − x_k)H_k(x)),  (2.2.40)

which is identical with (2.2.39). From (2.2.39) and (2.2.40), it follows that
P_{k+1}(x) = G_k(x),  Q_{k+1}(x) = H_k(x),

and

G_{k+1}(x) = v_k(x_k)G_k(x) + (x − x_{k−1})P_k(x),
H_{k+1}(x) = v_k(x_k)H_k(x) + (x − x_{k−1})Q_k(x).
So, we have the recurrence relations

G_{k+1}(x) = v_k(x_k)G_k(x) + (x − x_{k−1})G_{k−1}(x),  (2.2.41)
H_{k+1}(x) = v_k(x_k)H_k(x) + (x − x_{k−1})H_{k−1}(x).

Consequently,

f(x) = (v_k(x)G_k(x) + (x − x_{k−1})G_{k−1}(x)) / (v_k(x)H_k(x) + (x − x_{k−1})H_{k−1}(x)).  (2.2.42)
The functions G_k, H_k, k > 1, are determined from (2.2.41), for given G_0, G_1 and H_0, H_1. In order to obtain these starting functions, we consider the expression

f(x) = (v_1(x)G_1(x) + (x − x_0)G_0(x)) / (v_1(x)H_1(x) + (x − x_0)H_0(x)),

obtained from (2.2.42) for k = 1. But we also have the equivalent representation:

f(x) = f(x_0) + (x − x_0)/v_1(x) = (v_1(x)f(x_0) + (x − x_0))/v_1(x).
Identifying these two last expressions of f, one obtains

G_0(x) = 1;  G_1(x) = f(x_0);  (2.2.43)
H_0(x) = 0;  H_1(x) = 1.

From (2.2.41) and (2.2.43) it follows that G_k and H_k are polynomials, with

G_k ∈ P_{[k/2]},  H_k ∈ P_{[(k−1)/2]},  k = 1, 2, ....
We have proved the following result.
Theorem 2.2.39. If f : M → R, v_k(x_k) = {x_0, ..., x_k; f} for k = 1, ..., m, and

F_m = G_{m+1}/H_{m+1},

with G_{m+1} and H_{m+1} given by (2.2.41), then

F_m(x_i) = f(x_i),  i = 0, 1, ..., m.
Hence, F_m is a rational function, F_m = P_r/Q_s, where r = (m+1)/2 and s = r − 1 if m is odd, and r = s = m/2 if m is even.
We can consider the following rational interpolation formula

f = ρ_m f + r_m f,

where ρ_m, defined by ρ_m f = F_m, is the rational interpolation operator and r_m f is the remainder term.
Theorem 2.2.40. The remainder term has the expression

(r_m f)(x) = (−1)^m u(x) / (H_{m+1}(x)[v_{m+1}(x)H_{m+1}(x) + (x − x_m)H_m(x)]),  (2.2.44)

where u(x) = (x − x_0)...(x − x_m).
Proof. We have

(r_m f)(x) = (v_m(x)G_m(x) + (x − x_{m−1})G_{m−1}(x)) / (v_m(x)H_m(x) + (x − x_{m−1})H_{m−1}(x)) − G_{m+1}(x)/H_{m+1}(x).

Bringing to the same denominator and cancelling like terms, we obtain

(r_m f)(x) = (x − x_{m−1})[v_m(x) − v_m(x_m)][G_m(x)H_{m−1}(x) − H_m(x)G_{m−1}(x)] / ([v_m(x)H_m(x) + (x − x_{m−1})H_{m−1}(x)]H_{m+1}(x)).
Using (2.2.36), it follows that

v_m(x) − v_m(x_m) = (x − x_m)/v_{m+1}(x)

and

v_m(x)H_m(x) + (x − x_{m−1})H_{m−1}(x) = H_{m+1}(x) + (x − x_m)/v_{m+1}(x) · H_m(x).

Then,

(r_m f)(x) = (x − x_{m−1})(x − x_m)[G_m(x)H_{m−1}(x) − H_m(x)G_{m−1}(x)] / ([v_{m+1}(x)H_{m+1}(x) + (x − x_m)H_m(x)]H_{m+1}(x)).
Taking into account that

G_2(x)H_1(x) − H_2(x)G_1(x) = x − x_0

and

G_m(x)H_{m−1}(x) − H_m(x)G_{m−1}(x) = −(x − x_{m−2})[G_{m−1}(x)H_{m−2}(x) − H_{m−1}(x)G_{m−2}(x)],

it follows that

G_m(x)H_{m−1}(x) − H_m(x)G_{m−1}(x) = (−1)^{m−2}(x − x_0)...(x − x_{m−2}),

so relation (2.2.44) is proved.
Remark 2.2.41. For the absolute approximation error we have

|(r_m f)(x)| = |u(x)| / (|H_{m+1}(x)| · |v_{m+1}(x)H_{m+1}(x) + (x − x_m)H_m(x)|),

for given x ∈ [a, b].
To evaluate the approximation ρ_m f at a point α ∈ R, taking into account that

(ρ_m f)(α) = G_{m+1}(α)/H_{m+1}(α),
we can use the recurrence relations (2.2.41), with the starting values (2.2.43). The inverse divided differences that enter the expressions of the polynomials G_k and H_k can be evaluated by the following triangular scheme:

x_0  v_00
x_1  v_10  v_11
x_2  v_20  v_21  v_22
...
x_i  v_i0  v_i1  v_i2  ...  v_ii
...
x_m  v_m0  v_m1  v_m2  ...  v_mi  ...  v_mm,
where

v_i0 = f(x_i),  i = 0, ..., m,  (2.2.45)
v_ik = (x_i − x_{k−1}) / (v_{i,k−1} − v_{k−1,k−1}),  k = 1, ..., i;  i = 1, ..., m.
We observe that the inverse divided differences v_ii appearing in the expressions of the polynomials G_k and H_k are situated on the diagonal (hypotenuse) of the triangular scheme; according to (2.2.45), they can be computed row by row, each row yielding the inverse divided difference of the next higher order.
Thus, using the numbers v_ii, i = 0, 1, ..., one can generate successively the approximations

(ρ_1 f)(α), (ρ_2 f)(α), ..., (ρ_i f)(α), ...  (2.2.46)

until the distance between two consecutive elements of the sequence (2.2.46) is less than or equal to a given ε > 0, i.e.,

|(ρ_i f)(α) − (ρ_{i−1} f)(α)| ≤ ε.

Therefore, we can say that (ρ_i f)(α) approximates f(α) with the given precision ε.
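These recurrences translate directly into code. The sketch below (pure Python, an illustration rather than the book's pseudocode) builds the triangular scheme (2.2.45) and evaluates F_m = G_{m+1}/H_{m+1} through (2.2.41) with the starting values (2.2.43):

```python
def inverse_divided_differences(xs, fs):
    """Triangular scheme (2.2.45): returns the diagonal entries v_00, ..., v_mm."""
    rows = [[f] for f in fs]
    for i in range(1, len(xs)):
        for k in range(1, i + 1):
            # v_ik = (x_i - x_{k-1}) / (v_{i,k-1} - v_{k-1,k-1})
            rows[i].append((xs[i] - xs[k - 1])
                           / (rows[i][k - 1] - rows[k - 1][k - 1]))
    return [rows[i][i] for i in range(len(xs))]

def rational_interpolant(xs, fs):
    """F_m evaluated via the recurrences (2.2.41), starting values (2.2.43)."""
    v = inverse_divided_differences(xs, fs)
    def F(x):
        G0, G1 = 1.0, v[0]        # G_0 = 1, G_1 = f(x_0)
        H0, H1 = 0.0, 1.0         # H_0 = 0, H_1 = 1
        for k in range(1, len(xs)):
            # G_{k+1} = v_k(x_k) G_k + (x - x_{k-1}) G_{k-1}, likewise for H
            G0, G1 = G1, v[k] * G1 + (x - xs[k - 1]) * G0
            H0, H1 = H1, v[k] * H1 + (x - xs[k - 1]) * H0
        return G1 / H1
    return F
```

Note that the scheme can break down (a zero denominator in (2.2.45)) for some data; consistent with Remark 2.2.36, reordering the nodes may then help.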
2.2.6.2 Univariate Shepard interpolation
D. Shepard introduced in 1968 an interpolation procedure which is very efficient for the interpolation of large scattered data sets.
Let f be a real-valued function defined on I = [a, b] ⊂ R, let x_i ∈ I, i = 1, ..., N, be the nodes forming the set X ⊂ I, and consider

Λ := Λ_L = {λ_i | λ_i(f) = f(x_i), i = 1, ..., N},  (2.2.47)

a Lagrange type set of functionals. The univariate Shepard operator S_0 is defined by:
(S_0 f)(x) = ∑_{i=1}^N A_i(x) f(x_i),  (2.2.48)
where

A_i(x) = ∏_{j=1, j≠i}^N |x − x_j|^µ / ∑_{k=1}^N ∏_{j=1, j≠k}^N |x − x_j|^µ,  (2.2.49)
and µ ∈ R_+. The basis functions A_i may be written in barycentric form,

A_i(x) = |x − x_i|^{−µ} / ∑_{k=1}^N |x − x_k|^{−µ},  i = 1, ..., N,  (2.2.50)

and satisfy

∑_{i=1}^N A_i(x) = 1.  (2.2.51)
They have the cardinality property

A_i(x_ν) = δ_{iν},  i, ν = 1, ..., N.  (2.2.52)
The main properties of the operator S_0 are:
• the interpolation property with respect to the set of functionals (2.2.47), i.e.,

(S_0 f)(x_i) = f(x_i),  i = 1, ..., N;

• the degree of exactness

dex(S_0) = 0.
Shepard's method is a particular example of a convex combination, i.e., the functions A_i, i = 1, ..., N, are non-negative and their sum is 1. From this it follows that the scheme has only constant precision.
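In code, the barycentric form (2.2.50) gives a direct implementation. The sketch below (pure Python, illustrative) guards the singular weight at the nodes explicitly:

```python
def shepard(xs, fs, mu=2.0):
    """Univariate Shepard operator S_0 in the barycentric form (2.2.50)."""
    def S(x):
        for xi, fi in zip(xs, fs):
            if x == xi:          # at a node the weight is singular and A_i = 1
                return fi
        w = [abs(x - xi) ** (-mu) for xi in xs]
        return sum(wi * fi for wi, fi in zip(w, fs)) / sum(w)
    return S
```

Because the weights are non-negative and sum to 1, every value (S_0 f)(x) stays between the minimal and maximal data values.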
The Shepard interpolation has two major drawbacks:
• high computational cost;
• low degree of exactness.
These drawbacks can be overcome in two ways.
⋆ Modifying the basis functions. The Shepard method has flat spots at the nodes, its accuracy tends to decrease in areas where the interpolation nodes are sparse, and the evaluation of (S_0 f)(x) requires a considerable amount of work. These disadvantages are avoided by using a local version of the Shepard formula, given by R. Franke and G. Nielson, that consists in replacing the basis functions A_i from (2.2.50) by

w_i(x) = (1 − |x − x_i|/R)_+^ν,  (2.2.53)

where R is a radius of influence about the node x_i, which may vary with i.
⋆ Increasing the degree of exactness by combining the Shepard operator
with another interpolation operator. This is one of the most efficient ways ofgeneralizing the Shepard interpolation. In this way its degree of exactness is
increased and different sets of functionals are used. We notice that in the definition of the Shepard operator S_0 only Lagrange type functionals are used.
Let
\[ \Lambda := \{\lambda_i : i = 1, \ldots, N\} \]
be a set of functionals and P the corresponding interpolation operator. We consider the subsets Λ_i ⊂ Λ associated to the functionals λ_i, i = 1, ..., N. We have
\[ \bigcup_{i=1}^{N} \Lambda_i = \Lambda \quad \text{and} \quad \Lambda_i \cap \Lambda_j \neq \emptyset, \]
excepting the case Λ_i = {λ_i}, i = 1, ..., N, when Λ_i ∩ Λ_j = ∅ for i ≠ j. We associate the interpolation operator P_i to each subset Λ_i, i = 1, ..., N.
The operator SP defined by
\[ (SPf)(x) = \sum_{i=1}^{N} A_i(x)(P_i f)(x) \tag{2.2.54} \]
is the combined operator of S_0 and P.
Remark 2.2.42. If P_i, i = 1, ..., N, are linear operators then SP is a linear operator.
Remark 2.2.43. Let P_i, i = 1, ..., N, be some arbitrary linear operators. If
\[ \operatorname{dex}(P_i) = r_i, \quad i = 1, \ldots, N, \]
then
\[ \operatorname{dex}(SP) = \min\{r_1, \ldots, r_N\}. \]
The function S_0 f and its behavior in the neighborhood of the nodes depend on the size of μ. If 0 < μ ≤ 1 then the function S_0 f has peaks at the nodes. For μ > 1 the first derivatives vanish at the nodes, i.e., S_0 f has flat spots, and if μ is large enough S_0 f becomes a step function. This phenomenon is shown in Figure 2.1.
Example 2.2.44. Let f : [−2, 2] → ℝ,
\[ f(x) = \frac{1}{1 + x^2}, \]
and consider the nodes x_i = −2 + 0.5i, i = 0, ..., 8. We plot S_0 f for μ = 1, 2, 20.
[Figure panels: (a) Function f; (b) Interpolant S_0f, μ = 1; (c) Interpolant S_0f, μ = 2; (d) Interpolant S_0f, μ = 20.]
Figure 2.1: Univariate Shepard interpolants.
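As an illustration (our own sketch, not from the text; the function and variable names are ours), the operator S_0 of (2.2.48) can be written in a few lines of Python using the barycentric form (2.2.50):

```python
# Minimal sketch of the univariate Shepard operator S0, barycentric form (2.2.50).
def shepard(nodes, values, mu):
    def s0(x):
        # at a node, return the nodal value (cardinality property (2.2.52))
        for xi, fi in zip(nodes, values):
            if x == xi:
                return fi
        w = [abs(x - xi) ** (-mu) for xi in nodes]
        return sum(wi * fi for wi, fi in zip(w, values)) / sum(w)
    return s0

# Example 2.2.44: f(x) = 1/(1 + x^2) on [-2, 2], nodes x_i = -2 + 0.5 i
f = lambda x: 1.0 / (1.0 + x * x)
nodes = [-2.0 + 0.5 * i for i in range(9)]
s2 = shepard(nodes, [f(x) for x in nodes], mu=2)    # smooth flat spots
s20 = shepard(nodes, [f(x) for x in nodes], mu=20)  # near step function
```

For large μ the nearest node dominates the weights, which reproduces the step-function behavior seen in panel (d); since dex(S_0) = 0, constants are the only polynomials reproduced exactly.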
The most frequent choice is μ = 2, in which case the basis functions A_i are rational and infinitely differentiable.
In the sequel we study the remainder of the interpolation formula generated by the Shepard operator. The interpolation function S_0 f can be viewed as a projection of f onto the finite-dimensional linear space spanned by the functions A_i, i = 1, ..., N. The Shepard interpolation formula is:
\[ f = S_0 f + R_0 f, \]
where S_0 f was defined in (2.2.48) and R_0 f is the remainder term.
Theorem 2.2.45. If f is absolutely continuous on [a, b], then
\[ (R_0 f)(x) = \int_a^b \varphi(x, s) f'(s)\, ds, \tag{2.2.55} \]
where
\[ \varphi(x, s) = (x - s)^0_+ - \sum_{i=1}^{N} A_i(x)(x_i - s)^0_+. \]
Also,
\[ |(R_0 f)(x)| \le K(x)\, \|f'\|_\infty, \]
where
\[ K(x) = x - \sum_{i=1}^{N} x_i A_i(x) + 2 \sum_{i=1}^{N} A_i(x)(x_i - x)_+. \]
Proof. Using Peano's theorem and taking into account that S_0 e_0 = e_0, where e_k(x) = x^k, formula (2.2.55) follows.
Next, let us suppose that x ∈ (x_{k−1}, x_k) and let
\[ \varphi_j(x; \cdot) = \varphi(x; \cdot)\big|_{[x_{j-1}, x_j]}, \quad j = 1, \ldots, N. \]
We have
\[ \varphi_j(x; s) = (x - s)^0_+ - \sum_{i=j}^{N} A_i(x). \]
Using \(\sum_{i=1}^{N} A_i(x) = 1\) and A_i(x) ≥ 0, i = 1, ..., N, one obtains:
\[ \varphi_j(x; s) = \begin{cases} \displaystyle\sum_{i=1}^{j-1} A_i(x), & s \le x, \\[4pt] -\displaystyle\sum_{i=j}^{N} A_i(x), & s > x. \end{cases} \]
Hence,
\[ \varphi_j(x; s) \ge 0 \text{ for } s \le x, \qquad \varphi_j(x; s) \le 0 \text{ for } s > x. \]
Since x ∈ (x_{k−1}, x_k), it follows that
\[ \varphi_j(x; s) \ge 0, \; j = 1, \ldots, k-1, \qquad \varphi_j(x; s) \le 0, \; j = k+1, \ldots, N, \]
\[ \varphi_k(x; s) \ge 0 \text{ for } s \le x, \qquad \varphi_k(x; s) \le 0 \text{ for } s > x. \]
Finally, one obtains that
\[ \varphi(x; s) \ge 0 \text{ for } s \le x \quad \text{and} \quad \varphi(x; s) \le 0 \text{ for } s > x. \]
From (2.2.55) we have:
\[ |(R_0 f)(x)| \le \|f'\|_\infty \int_a^b |\varphi(x; s)|\, ds, \]
with
\[ \int_a^b |\varphi(x; s)|\, ds = \int_a^x \varphi(x; s)\, ds - \int_x^b \varphi(x; s)\, ds = K(x), \]
and the theorem is proved.
Next, we assume that at each point x_i, i = 1, ..., N, there also exists the derivative f'(x_i), and we consider the operator
\[ (S_1 f)(x) = \sum_{i=1}^{N} A_i(x)\bigl[f(x_i) + (x - x_i) f'(x_i)\bigr]. \]
It is easy to prove that if μ > 1, the operator S_1 interpolates both the function f and its first derivative at the points x_i, i = 1, ..., N.
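The operator S_1 above can be sketched as follows (our own illustration, with hypothetical names); by Lemma 2.2.46 it reproduces e_0 and e_1 exactly:

```python
# Sketch of S1 f = sum_i A_i(x) [f(x_i) + (x - x_i) f'(x_i)].
def shepard_s1(nodes, fvals, dvals, mu=2):
    def s1(x):
        for xi, fi in zip(nodes, fvals):
            if x == xi:
                return fi
        w = [abs(x - xi) ** (-mu) for xi in nodes]
        return sum(wi * (fi + (x - xi) * di)
                   for wi, xi, fi, di in zip(w, nodes, fvals, dvals)) / sum(w)
    return s1

# check S1 e1 = e1: take f(x) = x, so f(x_i) = x_i and f'(x_i) = 1
nodes = [0.0, 0.5, 1.0, 1.5]
s1 = shepard_s1(nodes, fvals=list(nodes), dvals=[1.0] * 4)
```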
Lemma 2.2.46. The operator S_1 is linear and
\[ S_1 e_k = e_k, \quad k = 0, 1. \]
Proof. The proof follows after a straightforward computation.
This operator generates the following interpolation formula:
\[ f = S_1 f + R_1 f. \]
Regarding its remainder term we have the following result.
Theorem 2.2.47. If f ∈ H²[a, b] then
\[ (R_1 f)(x) = \int_a^b \varphi_1(x, s) f''(s)\, ds, \tag{2.2.56} \]
where
\[ \varphi_1(x, s) = (x - s)_+ - \sum_{i=1}^{N} A_i(x)\bigl[(x_i - s)_+ + (x - x_i)(x_i - s)^0_+\bigr]. \]
Moreover, if f'' is continuous on the interval [a, b], then
\[ (R_1 f)(x) = \Bigl[-\frac{x^2}{2} + \frac{1}{2}\sum_{i=1}^{N} A_i(x)\, x_i^2\Bigr] f''(\xi), \quad \text{with } \xi \in (a, b). \tag{2.2.57} \]
Proof. Lemma 2.2.46 and Peano's theorem imply (2.2.56). The function \(\varphi_1^k(x; \cdot) = \varphi_1(x; \cdot)\big|_{[x_{k-1}, x_k]}\) has the form
\[ \varphi_1^k(x, s) = (x - s)_+ - (x - s) \sum_{i=k}^{N} A_i(x), \]
so
\[ \varphi_1^k(x; s) = \begin{cases} (x - s) \displaystyle\sum_{i=1}^{k-1} A_i(x), & s \le x, \\[4pt] -(x - s) \displaystyle\sum_{i=k}^{N} A_i(x), & s > x, \end{cases} \]
i.e.,
\[ \varphi_1^k(x; s) \ge 0, \quad \text{for all } k. \]
It follows that φ₁(x; s) ≥ 0 for x, s ∈ [a, b], and by the mean value theorem one obtains the representation (2.2.57).
2.2.6.2.1 The univariate Shepard-Lagrange operator Let f be a real-valued function defined on X ⊂ ℝ and x_i ∈ X, i = 1, ..., N, some distinct nodes. Consider the set of Lagrange functionals
\[ \Lambda := \Lambda_L(f) = \{\lambda_i \mid \lambda_i(f) = f(x_i), \; i = 1, \ldots, N\} \]
and
\[ \Lambda_i(f) = \{f(x_{i+\nu}) \mid \nu = 0, 1, \ldots, m\}, \quad i = 1, \ldots, N, \]
with x_{N+ν} = x_{N−ν}, ν = 1, ..., m. The operator
\[ (L_m^i f)(x) = \sum_{\nu=0}^{m} \frac{u(x)}{(x - x_{i+\nu})\, u'(x_{i+\nu})}\, f(x_{i+\nu}), \quad u(x) = (x - x_i) \cdots (x - x_{i+m}), \]
is the Lagrange operator corresponding to Λ_i(f).
Definition 2.2.48. The operator S_{L_m} given by
\[ (S_{L_m} f)(x) = \sum_{i=1}^{N} A_i(x) (L_m^i f)(x) \]
is called the univariate Shepard-Lagrange operator.
Theorem 2.2.49. For μ > m, we have the following interpolation properties:
\[ (S_{L_m} f)(x_i) = f(x_i), \quad i = 1, \ldots, N, \tag{2.2.58} \]
and the degree of exactness is:
\[ \operatorname{dex}(S_{L_m}) = m. \tag{2.2.59} \]
Proof. The interpolation properties (2.2.58) follow taking into account relations (2.2.52) and the interpolatory properties of L_m^i f, i = 1, ..., N. Relation (2.2.59) follows directly by Remark 2.2.43.
The Shepard-Lagrange univariate interpolation formula is
\[ f = S_{L_m} f + R_{L_m} f, \]
where R_{L_m} f denotes the remainder term. We have the following error estimation.
Theorem 2.2.50. If f ∈ C^{m+1}(I), then
\[ \|R_{L_m} f\|_\infty \le C M \|f^{(m+1)}\|_\infty\, \varepsilon_\mu^m(r), \]
with
\[ \varepsilon_\mu^m(r) = \begin{cases} |\log r|^{-1}, & \mu = 1, \\ r^{\mu-1}, & 1 < \mu < m+2, \\ r^{\mu-1} |\log r|, & \mu = m+2, \\ r^{m+1}, & \mu > m+2, \end{cases} \]
where C is a positive constant independent of x and X.
Proof. If f ∈ C^{m+1}(I), then the following bound for the interpolation error is known:
\[ |(R_m f)(x; x_i)| = |(L_m f)(x; x_i) - f(x)| \le \frac{|x - x_i| \cdots |x - x_{i+m}|}{(m+1)!}\, \|f^{(m+1)}\|_\infty. \]
We have
\[ |(S_{L_m} f)(x) - f(x)| \le \sum_{i=1}^{N} |(R_m f)(x; x_i)|\, A_i(x) \le C_m \|f^{(m+1)}\|_\infty\, s_\mu^m(x), \]
where
\[ s_\mu^m(x) = \frac{\sum_{i=1}^{N} |x - x_i| \cdots |x - x_{i+m}|\, |x - x_i|^{-\mu}}{\sum_{k=1}^{N} |x - x_k|^{-\mu}}. \]
We show that
\[ s_\mu^m(x) \le C M \varepsilon_\mu^m(r). \]
We introduce the following notations:
\[ B_\rho(x) = [x - \rho, x + \rho], \qquad r = \inf\{\rho > 0 : \forall x \in I \; \exists u \in X, \; u \in B_\rho(x)\} \]
and
\[ M = \sup_y \operatorname{card}(B_r(y) \cap X). \]
Let
\[ n = \Bigl[\frac{b - a}{2r}\Bigr] + 1, \]
let Q_r(u) be the interval (u − r, u + r] and T_j = Q_r(x + 2rj) ∪ Q_r(x − 2rj). The set
\[ \bigcup_{j=-n}^{n} Q_r(x + 2rj) \]
is a covering of I with half open intervals. If x_i ∈ I ∩ T_j we have
\[ (2j - 1) r \le |x - x_i| \le (2j + 1) r, \quad j = 1, \ldots, n, \tag{2.2.60} \]
and
\[ 1 \le \operatorname{card}(X \cap T_j) \le M. \]
Also, N = card(X) = O(r^{-1}) and [2(j − l) − 1] r ≤ |x − x_{i+l}| ≤ [2(j + l) + 1] r. If x_i ∈ T_j we have
\[ |x - x_i| \cdots |x - x_{i+m}| \le r^{m+1} \prod_{l=0}^{m} [2(j + l) + 1]. \tag{2.2.61} \]
Let x_d be the closest point to x. Since
\[ \sum_{i=1}^{N} |x - x_i|^{-\mu} \ge |x - x_d|^{-\mu}, \]
applying (2.2.60) and (2.2.61), we get:
\[ \begin{aligned} s_\mu^m(x) &\le |x - x_d|^{\mu} \Bigl( \sum_{x_i \in T_0} |x - x_i|^{-\mu}\, |x - x_i| \cdots |x - x_{i+m}| + \sum_{j=1}^{n} \sum_{x_i \in T_j} |x - x_i|^{-\mu}\, |x - x_i| \cdots |x - x_{i+m}| \Bigr) \\ &\le r^{\mu} M r^{m+1-\mu} \Bigl( \prod_{l=0}^{m} (2l + 1) + \sum_{j=1}^{n} \prod_{l=0}^{m} [2(j + l) + 1]\, (2j + 1)^{-\mu} \Bigr) \\ &\le m M r^{m+1} \prod_{l=0}^{m} (2l + 1) \Bigl( 1 + C \sum_{j=1}^{n} j^{m+1-\mu} \Bigr). \end{aligned} \]
We have two cases.
Case μ > 1:
• if 1 < μ < m + 2 then
\[ r^{m+1} \Bigl( 1 + C \sum_{j=1}^{n} j^{m+1-\mu} \Bigr) = O(r^{\mu-1}); \]
• if μ = m + 2 then
\[ \sum_{j=1}^{n} j^{m+1-\mu} = \sum_{j=1}^{n} j^{-1} = O(\log n) = O(|\log r|); \]
• if μ > m + 2 then \(\sum_{j=1}^{n} j^{m+1-\mu}\) is bounded.
Case μ = 1: We have
\[ s_\mu^m(x) = s_1^m(x) = \frac{\sum_{i=1}^{N} |x - x_i|^{-1}\, |x - x_i| \cdots |x - x_{i+m}|}{\sum_{i=1}^{N} |x - x_i|^{-1}}. \]
Applying the inequality
\[ \frac{\sum a_i}{\sum b_i} \le \sum \frac{a_i}{b_i}, \]
we get
\[ \begin{aligned} s_1^m(x) &\le \sum_{x_i \in T_0} |x - x_i| \cdots |x - x_{i+m}| + \frac{C_1 r}{|\log r|} \sum_{j=1}^{n} \sum_{x_i \in T_j} |x - x_i|^{-1}\, |x - x_i| \cdots |x - x_{i+m}| \\ &\le m M r^{m+1} \prod_{l=0}^{m} (2l + 1) \Bigl( 1 + \frac{C_2}{|\log r|} \sum_{j=1}^{n} j^{m} \Bigr) \\ &\le C_1 M r^{m+1} \Bigl( 1 + \frac{C_2}{|\log r|}\, O(r^{-m-1}) \Bigr) = O(|\log r|^{-1}). \end{aligned} \]
Particular cases. 1) Consider m = 1. The combined operator S_{L_1} is given by:
\[ (S_{L_1} f)(x) = \sum_{i=1}^{N} A_i(x) (L_1^i f)(x), \]
where
\[ (L_1^i f)(x) = \frac{x - x_{i+1}}{x_i - x_{i+1}} f(x_i) + \frac{x - x_i}{x_{i+1} - x_i} f(x_{i+1}), \quad i = 1, \ldots, N, \]
with x_{N+1} = x_1.
2) Consider m = 2. The combined operator S_{L_2} is given by:
\[ (S_{L_2} f)(x) = \sum_{i=1}^{N} A_i(x) (L_2^i f)(x), \]
where
\[ (L_2^i f)(x) = l_i(x) f(x_i) + l_{i+1}(x) f(x_{i+1}) + l_{i+2}(x) f(x_{i+2}), \]
with
\[ l_i(x) = \frac{(x - x_{i+1})(x - x_{i+2})}{(x_i - x_{i+1})(x_i - x_{i+2})}, \quad l_{i+1}(x) = \frac{(x - x_i)(x - x_{i+2})}{(x_{i+1} - x_i)(x_{i+1} - x_{i+2})}, \quad l_{i+2}(x) = \frac{(x - x_i)(x - x_{i+1})}{(x_{i+2} - x_i)(x_{i+2} - x_{i+1})}, \]
for all i = 1, ..., N, considering x_{N+1} = x_1 and x_{N+2} = x_2.
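The particular case m = 1 can be sketched as follows (our own illustration, with hypothetical names); since dex(S_{L_1}) = 1, the operator reproduces linear functions exactly:

```python
# Sketch of the combined operator S_{L_1}: Shepard weights applied to the
# linear Lagrange polynomials L_1^i on (x_i, x_{i+1}), with x_{N+1} = x_1.
def shepard_lagrange_1(nodes, fvals, mu=2):
    n = len(nodes)
    def interp(x):
        for xi, fi in zip(nodes, fvals):
            if x == xi:
                return fi
        w = [abs(x - xi) ** (-mu) for xi in nodes]
        out = 0.0
        for i in range(n):
            j = (i + 1) % n  # wrap-around convention x_{N+1} = x_1
            xi, xj = nodes[i], nodes[j]
            li = (x - xj) / (xi - xj) * fvals[i] + (x - xi) / (xj - xi) * fvals[j]
            out += w[i] * li
        return out / sum(w)
    return interp

# dex(S_{L_1}) = 1: data from a linear function is reproduced exactly
nodes = [0.0, 1.0, 2.0, 3.0]
g = lambda x: 2.0 * x + 1.0
s = shepard_lagrange_1(nodes, [g(x) for x in nodes])
```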
2.2.6.2.2 The univariate Shepard-Hermite operator Let f be a real-valued function defined on X ⊂ ℝ and x_i ∈ X, i = 1, ..., N, some distinct nodes. Consider the set of Hermite functionals
\[ \Lambda := \Lambda_H(f) = \{\lambda_{ij} \mid \lambda_{ij}(f) = f^{(j)}(x_i), \; i = 1, \ldots, N, \; j = 0, 1, \ldots, r_i\}, \quad r_i \in \mathbb{N}^*, \]
and the subsets
\[ \Lambda_{i,j}(f) = \{f^{(j)}(x_{i+\nu}) \mid \nu = 0, 1, \ldots, m, \; j = 0, 1, \ldots, r_\nu\}, \]
for i = 1, ..., N, with x_{N+ν} = x_{N−ν}, ν = 1, ..., m, m ∈ ℕ*.
Definition 2.2.51. The operator S_{H_q} given by
\[ (S_{H_q} f)(x) = \sum_{i=1}^{N} A_i(x) (H_q^i f)(x) \]
is called the univariate Shepard-Hermite operator, where H_q^i f is the Hermite interpolation polynomial of degree q = m + r_0 + ... + r_m.
Theorem 2.2.52. For μ > q, the following interpolation conditions are fulfilled:
\[ (S_{H_q} f)^{(j)}(x_i) = f^{(j)}(x_i), \quad i = 1, \ldots, N; \; j = 0, \ldots, r_i, \tag{2.2.62} \]
and the degree of exactness is:
\[ \operatorname{dex}(S_{H_q}) = q. \tag{2.2.63} \]
Proof. The interpolation properties (2.2.62) follow taking into account relations (2.2.52) and the interpolatory properties of H_q^i f, i = 1, ..., N. Relation (2.2.63) follows directly by Remark 2.2.43.
The Shepard-Hermite interpolation formula is
\[ f = S_{H_q} f + R_{H_q} f, \]
where R_{H_q} f denotes the remainder term.
Theorem 2.2.53. If f ∈ C^{q+1}(I), then
\[ \|R_{H_q} f\|_I \le C M \|f^{(q+1)}\|_I\, \varepsilon_\mu^q(r), \]
where
\[ \varepsilon_\mu^q(r) = \begin{cases} |\log r|^{-1}, & \mu = 1, \\ r^{\mu-1}, & 1 < \mu < q+2, \\ r^{\mu-1} |\log r|, & \mu = q+2, \\ r^{q+1}, & \mu > q+2, \end{cases} \]
and C is a positive constant independent of x and X.
Proof. The proof is analogous to that of Theorem 2.2.50, with the remark that H_q^i f has the nodes x_{i+ν} of multiplicities r_ν, respectively.
2.2.6.2.3 The univariate Shepard-Taylor operator Let f be a real-valued function defined on X ⊂ ℝ and x_i ∈ X, i = 1, ..., N, some distinct nodes. Consider the set of Taylor functionals
\[ \Lambda := \Lambda_T(f) = \{\lambda_{kj} \mid \lambda_{kj}(f) = f^{(j)}(x_k), \; j = 0, 1, \ldots, m; \; k = 1, \ldots, N\}, \quad m \in \mathbb{N}^*, \]
and Λ_i(f) = {λ_{ij}(f) | j = 0, 1, ..., m}, where Λ_i ⊂ Λ_T is the subset of Λ_T associated to the functional λ_i, such that λ_i ∈ Λ_i for all i = 1, ..., N. We denote by
\[ (T_m^i f)(x) = \sum_{j=0}^{m} \frac{(x - x_i)^j}{j!}\, f^{(j)}(x_i) \]
the Taylor interpolation operator corresponding to the subset of functionals Λ_i.
Definition 2.2.54. The operator S_{T_m} given by
\[ (S_{T_m} f)(x) = \sum_{i=1}^{N} A_i(x) (T_m^i f)(x) \]
is called the univariate Shepard-Taylor operator.
Theorem 2.2.55. For μ > m,
\[ (S_{T_m} f)^{(j)}(x_i) = f^{(j)}(x_i), \quad j = 0, 1, \ldots, m, \; i = 1, \ldots, N, \]
and
\[ \operatorname{dex}(S_{T_m}) = m. \]
Proof. The conclusion follows taking into account relations (2.2.52), the interpolatory properties of T_m^i f, i = 1, ..., N, and Remark 2.2.43.
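A sketch of S_{T_m} (our own illustration, hypothetical names; `derivs[i][j]` holds f^{(j)}(x_i)); by Theorem 2.2.55 it has degree of exactness m, so with m = 2 it reproduces quadratics:

```python
import math

# Sketch of the Shepard-Taylor operator: Shepard weights applied to the
# degree-m Taylor polynomials T_m^i at the nodes.
def shepard_taylor(nodes, derivs, m, mu):
    def st(x):
        for xi, d in zip(nodes, derivs):
            if x == xi:
                return d[0]
        w = [abs(x - xi) ** (-mu) for xi in nodes]
        out = 0.0
        for wi, xi, d in zip(w, nodes, derivs):
            taylor = sum(d[j] * (x - xi) ** j / math.factorial(j)
                         for j in range(m + 1))
            out += wi * taylor
        return out / sum(w)
    return st

# dex(S_{T_2}) = 2: a quadratic is reproduced exactly
f = lambda x: x * x - 3.0 * x + 2.0
nodes = [0.0, 1.0, 2.0]
derivs = [[f(x), 2.0 * x - 3.0, 2.0] for x in nodes]  # f, f', f'' at each node
st = shepard_taylor(nodes, derivs, m=2, mu=3)
```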
A generalization of the operator S_{T_m} is obtained when
\[ \Lambda_T = \{\lambda_{kj} \mid \lambda_{kj}(f) = f^{(j)}(x_k), \; j = 0, 1, \ldots, m_k; \; k = 1, \ldots, N\} \]
and
\[ \Lambda_i = \{\lambda_{ij} \mid j = 0, 1, \ldots, m_i\}. \]
We have
\[ (S_{T_{m_1, \ldots, m_N}} f)(x) = \sum_{i=1}^{N} A_i(x) (T_{m_i}^i f)(x), \]
where the Taylor operator T_{m_i}^i has degree m_i, i = 1, ..., N.
Remark 2.2.56. For μ > M := max{m_1, ..., m_N},
\[ (S_{T_{m_1, \ldots, m_N}} f)^{(j)}(x_i) = f^{(j)}(x_i), \quad j = 0, 1, \ldots, m_i, \; i = 1, \ldots, N, \]
and
\[ \operatorname{dex}(S_{T_{m_1, \ldots, m_N}}) = \min\{m_1, \ldots, m_N\}. \]
Remark 2.2.57. Regarding the approximation error, we note that the Shepard-Taylor, the Shepard-Lagrange and the Shepard-Hermite operators have the same rate of convergence.
2.2.6.2.4 The univariate Shepard-Birkhoff operator Let f be a real-valued function defined on X ⊂ ℝ and x_i ∈ X, i = 1, ..., N, some distinct nodes. Consider the set of Birkhoff functionals
\[ \Lambda := \Lambda_B(f) = \{\lambda_{kj} \mid \lambda_{kj}(f) = f^{(j)}(x_k), \; j \in I_k; \; k = 1, \ldots, N\} \]
and
\[ \Lambda_{i,j}(f) = \{f^{(j)}(x_{i+\nu}) \mid \nu = 0, 1, \ldots, m, \; j \in I_{i+\nu}\}, \]
with x_{N+ν} = x_{N−ν}, ν = 1, ..., m, m ∈ ℕ*.
Definition 2.2.58. The operator S_{B_m} given by
\[ (S_{B_m} f)(x) = \sum_{i=1}^{N} A_i(x) (B_m^i f)(x) \]
is called the univariate Shepard-Birkhoff operator.
Theorem 2.2.59. For μ > m, the following interpolation conditions are fulfilled:
\[ (S_{B_m} f)^{(j)}(x_k) = f^{(j)}(x_k), \quad j \in I_k; \; k = 1, \ldots, N, \tag{2.2.64} \]
and
\[ \operatorname{dex}(S_{B_m}) = m. \tag{2.2.65} \]
Proof. The interpolation properties (2.2.64) follow taking into account relations (2.2.52) and the interpolatory properties of B_m^i f, i = 1, ..., N. Relation (2.2.65) follows directly by Remark 2.2.43.
Remark 2.2.60. In the same way one defines the combined Shepard-Abel-Goncharov and Shepard-Lidstone operators, which are particular cases of the combined Shepard-Birkhoff operator (see, e.g., [22] and [23]).
2.2.7 Least squares approximation
In this section, as an extension of the interpolation problem, we give a short review of least squares approximation.
Recall that when the values of a function f are known for a given set of points x_i, i = 0, ..., m, an interpolation method can be used to determine an approximation F of the function f such that
\[ F(x_i) = f(x_i), \quad i = 0, \ldots, m. \]
If only approximations of the values f(x_i) are available, or if the number of interpolation conditions is too large, then instead of requiring that the approximating function reproduce the given function values exactly, we ask only that it fit the data "as closely as possible". The most popular application involves the least squares principle, where the approximation F is determined such that
\[ \Bigl( \sum_{i=0}^{m} w(x_i)\, [f(x_i) - F(x_i)]^2 \Bigr)^{1/2} \to \min, \]
in the discrete case, and
\[ \Bigl( \int_a^b w(x)\, [f(x) - F(x)]^2\, dx \Bigr)^{1/2} \to \min, \]
in the continuous case, where w ≥ 0 on [a, b] is a weight function.
Remark 2.2.61. Notice that interpolation is a particular case of least squares approximation, namely for
\[ f(x_i) - F(x_i) = 0, \quad i = 0, \ldots, m. \]
In the sequel we briefly describe each of the two cases, starting with thecontinuous one.
Consider the space L²_w[a, b], the corresponding inner product,
\[ \langle f, g \rangle_{w,2} = \int_a^b w(x) f(x) g(x)\, dx, \quad f, g \in L^2_w[a, b], \]
the norm induced by it,
\[ \|f\|_{w,2} = \Bigl( \int_a^b w(x) f^2(x)\, dx \Bigr)^{1/2}, \]
and the distance between two functions f and g, given by
\[ d_w(f, g) = \|f - g\|_{w,2}. \]
For A ⊂ L²_w[a, b] and f ∈ L²_w[a, b], the question is whether there exists an element g* ∈ A such that
\[ d_w(f, g^*) = \min_{g \in A} d_w(f, g). \]
It will be shown that such an element exists and is unique. If A is a finite dimensional linear space and g* exists, then
\[ \langle f - g^*, g \rangle_{w,2} = 0, \quad g \in A. \tag{2.2.66} \]
Let n be the dimension of A and g_1, ..., g_n ∈ A a basis of A. The orthogonality condition (2.2.66), with
\[ g = \sum_{i=1}^{n} a_i g_i \quad \text{and} \quad g^* = \sum_{i=1}^{n} a_i^* g_i, \]
is equivalent to the system
\[ \langle f - g^*, g_k \rangle_{w,2} = 0, \quad k = 1, \ldots, n, \]
with n equations and n unknowns a_i, i = 1, ..., n. The latter system can be explicitly written as
\[ \sum_{i=1}^{n} a_i \langle g_i, g_k \rangle_{w,2} = \langle f, g_k \rangle_{w,2}, \quad k = 1, \ldots, n, \tag{2.2.67} \]
and its determinant is the Gram determinant G(g_1, ..., g_n). Since the functions g_1, ..., g_n are linearly independent, we have
\[ G(g_1, \ldots, g_n) \neq 0, \]
hence
\[ g^* = \sum_{i=1}^{n} a_i^* g_i \]
is uniquely determined.
Next we discuss three particular cases.
Case A. We consider A = P_n and g_k = e_k, k = 0, ..., n. In this case, g* is the nth degree polynomial whose coefficients satisfy the condition
\[ \int_a^b w(x) \Bigl[ f(x) - \sum_{k=0}^{n} a_k^* x^k \Bigr]^2 dx = \min_{a_0, \ldots, a_n} \int_a^b w(x) \Bigl[ f(x) - \sum_{k=0}^{n} a_k x^k \Bigr]^2 dx. \]
According to (2.2.67), the solution of the approximation problem can be obtained by solving the following system of equations
\[ \sum_{k=0}^{n} a_k \langle x^k, x^j \rangle_{w,2} = \langle f(x), x^j \rangle_{w,2}, \quad j = 0, \ldots, n, \]
where
\[ \langle x^k, x^j \rangle_{w,2} = \int_a^b w(x)\, x^{k+j}\, dx, \qquad \langle f(x), x^j \rangle_{w,2} = \int_a^b w(x)\, x^j f(x)\, dx. \]
Case B. We consider A = P_n and g_k = p_k, k = 0, ..., n, where p_0, ..., p_n is a set of orthogonal polynomials on the interval [a, b], relative to the weight function w.
Since ⟨p_i, p_j⟩_{w,2} = 0 for i ≠ j, and
\[ \langle p_k, p_k \rangle_{w,2} = \|p_k\|_{w,2}^2, \]
from (2.2.67) it follows that
\[ a_k^* = \frac{1}{\|p_k\|_{w,2}^2} \int_a^b w(x) f(x) p_k(x)\, dx, \quad k = 0, \ldots, n, \tag{2.2.68} \]
and
\[ g^*(x) = \sum_{k=0}^{n} a_k^* p_k(x). \]
Remark 2.2.62. The numbers a_k^* in (2.2.68) are the coefficients of the expansion of f in terms of the orthogonal polynomials p_k, k = 0, 1, .... If these polynomials are orthonormal (i.e., ‖p_k‖_{w,2} = 1), we have
\[ a_k^* = \int_a^b w(x) f(x) p_k(x)\, dx, \quad k = 0, 1, \ldots. \]
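A numerical sketch of Case B (our own illustration, hypothetical names): on [−1, 1] with w(x) = 1 the Legendre polynomials are orthogonal with ‖p_k‖² = 2/(2k + 1), and the coefficients a_k* of (2.2.68) can be computed by quadrature:

```python
def simpson(g, a, b, n=2000):
    # composite Simpson rule (n even) for the inner-product integrals
    h = (b - a) / n
    s = g(a) + g(b) + sum((4 if k % 2 else 2) * g(a + k * h) for k in range(1, n))
    return s * h / 3

# Legendre polynomials p0, p1, p2: orthogonal on [-1, 1] for w(x) = 1,
# with ||p_k||^2 = 2 / (2k + 1)
ps = [lambda x: 1.0, lambda x: x, lambda x: 1.5 * x * x - 0.5]
norms = [2.0 / (2 * k + 1) for k in range(3)]

def best_l2(f):
    """Coefficients a_k* = <f, p_k> / ||p_k||^2 of the best L2 approximation."""
    return [simpson(lambda x, p=p: f(x) * p(x), -1.0, 1.0) / nk
            for p, nk in zip(ps, norms)]

# x^3 = (3/5) p1 + (2/5) p3, so truncating at degree 2 keeps only a1* = 3/5
coeffs = best_l2(lambda x: x ** 3)
```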
Case C. We consider that f : ℝ → ℝ is a periodic function with period 2π, and A = T_n (i.e., the set of trigonometric polynomials of degree n, with basis consisting of the functions 1, cos x, sin x, ..., cos nx, sin nx).
Since this set of functions is orthogonal on [0, 2π] with respect to the weight function w(x) = 1, we have
\[ g^*(x) = a_0^* + \sum_{k=1}^{n} (a_k^* \cos kx + b_k^* \sin kx), \]
where
\[ a_0^* = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, dx, \qquad a_k^* = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos kx\, dx, \qquad b_k^* = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin kx\, dx, \]
for k = 1, ..., n. So a_k^* and b_k^* are the Fourier coefficients of f.
The discrete case can be treated similarly. Let f be a given function with available data f(x_i), i = 0, ..., m, and A a set of approximations of f. According to the least squares principle, we have to determine the function g* ∈ A that satisfies
\[ \sum_{i=0}^{m} w(x_i)\, [f(x_i) - g^*(x_i)]^2 = \min_{g \in A} \sum_{i=0}^{m} w(x_i)\, [f(x_i) - g(x_i)]^2. \]
If A is an n-dimensional linear space and g_1, ..., g_n is a basis of A, then such an element g* exists and is given by
\[ g^* = \sum_{k=1}^{n} a_k^* g_k, \]
where (a_1^*, ..., a_n^*) is the solution of the following system of equations:
\[ \sum_{k=1}^{n} a_k \sum_{i=0}^{m} w(x_i)\, g_k(x_i) g_j(x_i) = \sum_{i=0}^{m} w(x_i)\, g_j(x_i) f(x_i), \quad j = 1, \ldots, n. \tag{2.2.69} \]
Since a_k, k = 1, ..., n, are the coordinates of the minimizing point of the function
\[ G(a_1, \ldots, a_n) = \sum_{i=0}^{m} w(x_i) \Bigl[ f(x_i) - \sum_{k=1}^{n} a_k g_k(x_i) \Bigr]^2, \]
the system (2.2.69) is equivalent to
\[ \frac{\partial G}{\partial a_j}(a_1, \ldots, a_n) = 0, \quad j = 1, \ldots, n. \]
Remark 2.2.63. Often n is taken much smaller than m. Notice that for n = m + 1 one has
\[ \sum_{i=0}^{m} w(x_i)\, [f(x_i) - g^*(x_i)]^2 = 0, \]
implying
\[ g^*(x_i) = f(x_i), \quad i = 0, \ldots, m, \]
so g* is a function that interpolates f at the points x_i.
We also consider here two particular cases.
Case C1. A = P_n and g_k = e_k, k = 0, ..., n. We have
\[ g^*(x) = \sum_{k=0}^{n} a_k^* x^k, \]
where (a_0^*, ..., a_n^*) is the solution of the system of equations:
\[ \sum_{k=0}^{n} a_k \sum_{i=0}^{m} w(x_i)\, x_i^{k+j} = \sum_{i=0}^{m} w(x_i)\, x_i^j f(x_i), \quad j = 0, \ldots, n. \tag{2.2.70} \]
Denoting
\[ s_k = \sum_{i=0}^{m} w(x_i)\, x_i^k, \quad k = 0, \ldots, 2n, \qquad t_p = \sum_{i=0}^{m} w(x_i)\, x_i^p f(x_i), \quad p = 0, \ldots, n, \]
system (2.2.70) becomes
\[ \sum_{k=0}^{n} s_{k+j}\, a_k = t_j, \quad j = 0, \ldots, n. \]
Since its matrix is symmetric, special methods can be used to solve the latter system of equations.
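Case C1 can be sketched as follows (our own illustration, hypothetical names), assembling s_k and t_p and solving the normal equations (2.2.70) with a small elimination routine:

```python
def poly_lsq(xs, ys, ws, n):
    """Solve the normal equations sum_k s_{k+j} a_k = t_j, j = 0..n."""
    m = len(xs)
    s = [sum(ws[i] * xs[i] ** k for i in range(m)) for k in range(2 * n + 1)]
    t = [sum(ws[i] * xs[i] ** p * ys[i] for i in range(m)) for p in range(n + 1)]
    # Gauss-Jordan elimination with partial pivoting on the (n+1)x(n+1) system
    A = [[s[k + j] for k in range(n + 1)] + [t[j]] for j in range(n + 1)]
    for col in range(n + 1):
        piv = max(range(col, n + 1), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n + 1):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][n + 1] / A[j][j] for j in range(n + 1)]

# data lying exactly on y = 1 + 2x is recovered by the degree-1 fit
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
a = poly_lsq(xs, ys, ws=[1.0] * 5, n=1)
```

For larger n this matrix (a Hankel matrix) becomes ill-conditioned, which is one motivation for the orthogonal-polynomial approach of Case C2.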
Remark 2.2.64. One of the major applications of least squares approximation is the so-called data smoothing. Whenever it is necessary to approximate a function f on a discrete data set, and no other requirements are given, we can achieve this with a polynomial p ∈ C^∞(ℝ), i.e., a very smooth function.
Case C2. g_0, ..., g_n are orthogonal polynomials on the point set x_i, i = 0, ..., m, with respect to the weights w(x_i), i = 0, ..., m, namely
\[ \sum_{i=0}^{m} w(x_i)\, g_k(x_i) g_j(x_i) = \begin{cases} 0, & k \neq j, \\ c_j > 0, & k = j. \end{cases} \]
In this case, the system of equations (2.2.69) becomes
\[ a_j c_j = \sum_{i=0}^{m} w(x_i)\, g_j(x_i) f(x_i), \quad j = 0, \ldots, n, \]
and its solution is given by
\[ a_k^* = \frac{1}{c_k} \sum_{i=0}^{m} w(x_i)\, g_k(x_i) f(x_i), \quad k = 0, \ldots, n. \]
2.3 Some elements of multivariate interpolation operators
The basic goal of this section is to construct multivariate interpolation operators based on univariate ones.
As we have already seen, most of the univariate interpolation operators (polynomial, spline, Shepard, etc.) are commuting projectors. This property is essential in the construction of multivariate operators.
2.3.1 Interpolation on a rectangular domain
Let Ω_n ⊂ ℝⁿ be a rectangular domain,
\[ \Omega_n = \prod_{i=1}^{n} [a_i, b_i], \quad a_i, b_i \in \mathbb{R}, \; a_i < b_i, \; i = 1, \ldots, n. \]
Let B be a set of real-valued functions defined on Ω_n, and f ∈ B. One considers the univariate projectors P_i : B → G_i, i = 1, ..., n, that interpolate the function f with respect to the variables x_i, i = 1, ..., n, respectively. Of course, each set G_i, i = 1, ..., n, is a set of functions of (n − 1) free variables.
Let P_n be the set of all commuting projectors generated by the projectors P_1, ..., P_n (i.e., L of (1.1.6) generated by L = {P_1, ..., P_n}).
For example,
\[ P_2 = \{P_1, P_2, P_1 P_2, P_1 \oplus P_2\} \]
and
\[ \begin{aligned} P_3 = \{ & P_1, P_2, P_3, P_1 P_2, P_1 P_3, P_2 P_3, P_1 \oplus P_2, P_1 \oplus P_3, P_2 \oplus P_3, \\ & P_1(P_2 \oplus P_3), P_2(P_1 \oplus P_3), P_3(P_1 \oplus P_2), P_1 \oplus P_2 P_3, P_2 \oplus P_1 P_3, \\ & P_3 \oplus P_1 P_2, P_1 P_2 P_3, P_1 \oplus P_2 \oplus P_3 \}. \end{aligned} \]
Let (P_n, ≤) be the lattice generated by P_1, ..., P_n, where "≤" is the order relation defined in Proposition 1.1.30. In this lattice P = P_1 ··· P_n is the minimal element and S = P_1 ⊕ ··· ⊕ P_n is the maximal element.
Let R_i = I − P_i, i = 1, ..., n, be the corresponding remainder operators. The operators P_i being projectors, from Remark 1.1.23 it follows that R_i is also a projector and that P_i and R_i commute, i.e., P_iR_i = R_iP_i. Using the identities
\[ I = P + R^P, \quad \text{with} \quad P = P_1 \cdots P_n, \; R^P = R_1 \oplus \cdots \oplus R_n, \]
respectively,
\[ I = S + R^S, \quad \text{with} \quad S = P_1 \oplus \cdots \oplus P_n, \; R^S = R_1 \cdots R_n, \]
one obtains the algebraic minimal interpolation formula:
\[ f = Pf + R^P f, \]
respectively, the algebraic maximal interpolation formula:
\[ f = Sf + R^S f, \]
the names being given by the extremal properties of P and S in the lattice (P_n, ≤).
The extremal projectors P and S of (P_n, ≤) can also be characterized by their approximation order. From the expressions of the corresponding remainder operators,
\[ R^P = R_1 \oplus \cdots \oplus R_n = R_1 + \cdots + R_n - R_1 R_2 - \cdots - R_{n-1} R_n + \cdots + (-1)^{n-1} R_1 \cdots R_n \]
and
\[ R^S = R_1 \cdots R_n, \]
it follows that
\[ \operatorname{ord}(P) = \min\{\operatorname{ord}(P_1), \ldots, \operatorname{ord}(P_n)\}, \tag{2.3.1} \]
respectively,
\[ \operatorname{ord}(S) = \sum_{i=1}^{n} \operatorname{ord}(P_i) \tag{2.3.2} \]
and
\[ \operatorname{ord}(P) \le \operatorname{ord}(Q) \le \operatorname{ord}(S), \quad \forall Q \in P_n. \]
So, the remarkable property of the boolean sum operator S in P_n is that it has the highest approximation order. On the other hand, taking into account that
\[ S = P_1 + \cdots + P_n - P_1 P_2 - \cdots - P_{n-1} P_n + \cdots + (-1)^{n-1} P_1 \cdots P_n, \]
it follows that the interpolation function Sf is a sum of functions of (n − 1) free variables (P_i f, i = 1, ..., n), of (n − 2) free variables (P_iP_j f, i, j = 1, ..., n; i ≠ j), and so on; only the last one, P_1 ··· P_n f, is a scalar approximation (it contains no free variables), while the operator P, with the lowest approximation order, gives a scalar approximation Pf.
Remark 2.3.1. From the non-scalar approximation Sf, a scalar approximation can be obtained using several levels of approximation.
We illustrate this for the two-dimensional case. Let P_1^1 and P_2^1 be the interpolation operators used in the first approximation level and R_1^1, respectively R_2^1, the corresponding remainder operators (R_1^1 = I − P_1^1; R_2^1 = I − P_2^1). We have
\[ f = (P_1^1 \oplus P_2^1) f + R_1^1 R_2^1 f, \tag{2.3.3} \]
or
\[ f = (P_1^1 + P_2^1 - P_1^1 P_2^1) f + R_1^1 R_2^1 f. \]
As P_1^1 f and P_2^1 f contain a free variable, using in a second level of approximation the operators P_1^2, P_2^2, with R_1^2, R_2^2 the corresponding remainders, one obtains:
\[ f = [P_1^1 (P_2^2 + R_2^2) + P_2^1 (P_1^2 + R_1^2) - P_1^1 P_2^1] f + R_1^1 R_2^1 f, \]
respectively,
\[ f = (P_1^1 P_2^2 + P_1^2 P_2^1 - P_1^1 P_2^1) f + (P_1^1 R_2^2 + P_2^1 R_1^2 + R_1^1 R_2^1) f, \tag{2.3.4} \]
where
\[ Q = P_1^1 P_2^2 + P_1^2 P_2^1 - P_1^1 P_2^1 \tag{2.3.5} \]
is the interpolation operator, while
\[ R_Q = P_1^1 R_2^2 + P_2^1 R_1^2 + R_1^1 R_2^1 \tag{2.3.6} \]
is the remainder operator.
This way, one obtains the formula (2.3.4), which is a scalar interpolation formula, and Q is a scalar interpolation operator.
As can be seen, the remainder operator R_Q contains three terms. It follows that
\[ \operatorname{ord}(Q) = \min\{\operatorname{ord}(P_1^2), \operatorname{ord}(P_2^2), \operatorname{ord}(P_1^1) + \operatorname{ord}(P_2^1)\}. \]
Remark 2.3.2. 1) If
\[ \operatorname{ord}(P_1^2) \ge \operatorname{ord}(P_1^1) + \operatorname{ord}(P_2^1) = \operatorname{ord}(P_1^1 \oplus P_2^1) \]
and
\[ \operatorname{ord}(P_2^2) \ge \operatorname{ord}(P_1^1) + \operatorname{ord}(P_2^1) = \operatorname{ord}(P_1^1 \oplus P_2^1), \]
then (2.3.4) is called a consistent approximation formula.
2) If
\[ \operatorname{ord}(P_1^2) = \operatorname{ord}(P_2^2) = \operatorname{ord}(P_1^1 \oplus P_2^1), \]
then (2.3.4) is called a homogeneous approximation formula.
Next, we give some examples of product and boolean sum formulas. Let us denote by D_h = [0, h] × [0, h], h > 0, the so-called standard rectangle, and let us consider f : D_h → ℝ.
Example 2.3.3. Let L_1^x and L_1^y be the univariate Lagrange-type interpolation operators defined by
\[ (L_1^x f)(x, y) = \frac{h - x}{h} f(0, y) + \frac{x}{h} f(h, y), \tag{2.3.7} \]
respectively,
\[ (L_1^y f)(x, y) = \frac{h - y}{h} f(x, 0) + \frac{y}{h} f(x, h). \tag{2.3.8} \]
It is easy to check that
\[ (L_1^x f)(0, y) = f(0, y), \quad (L_1^x f)(h, y) = f(h, y), \quad y \in [0, h], \tag{2.3.9} \]
\[ (L_1^y f)(x, 0) = f(x, 0), \quad (L_1^y f)(x, h) = f(x, h), \quad x \in [0, h]. \]
Let R_1^x = I − L_1^x and R_1^y = I − L_1^y be the corresponding remainder operators, i.e.,
\[ (R_1^x f)(x, y) = \frac{x(x - h)}{2} f^{(2,0)}(\xi, y), \quad \text{for } f(\cdot, y) \in C^2[0, h], \; y \in [0, h], \tag{2.3.10} \]
\[ (R_1^y f)(x, y) = \frac{y(y - h)}{2} f^{(0,2)}(x, \eta), \quad \text{for } f(x, \cdot) \in C^2[0, h], \; x \in [0, h], \]
with ξ, η ∈ [0, h], and
\[ |(R_1^x f)(x, y)| \le \frac{h^2}{8} \|f^{(2,0)}(\cdot, y)\|_\infty, \qquad |(R_1^y f)(x, y)| \le \frac{h^2}{8} \|f^{(0,2)}(x, \cdot)\|_\infty. \tag{2.3.11} \]
Hence,
\[ \operatorname{ord}(L_1^x) = \operatorname{ord}(L_1^y) = 2. \]
Because in this case P_2 is
\[ P_2 = \{L_1^x, L_1^y, L_1^x L_1^y, L_1^x \oplus L_1^y\}, \]
the bivariate operators are only L_1^xL_1^y and L_1^x ⊕ L_1^y, i.e., the extremal operators of the lattice (P_2, ≤). So, we have the algebraic minimal approximation formula:
\[ f = L_1^x L_1^y f + (R_1^x \oplus R_1^y) f, \tag{2.3.12} \]
respectively, the algebraic maximal approximation formula:
\[ f = (L_1^x \oplus L_1^y) f + R_1^x R_1^y f. \tag{2.3.13} \]
From (2.3.7) and (2.3.8) it follows that
\[ (L_1^x L_1^y f)(x, y) = \frac{(h - x)(h - y)}{h^2} f(0, 0) + \frac{x(h - y)}{h^2} f(h, 0) + \frac{y(h - x)}{h^2} f(0, h) + \frac{xy}{h^2} f(h, h) \tag{2.3.14} \]
and
\[ \begin{aligned} ((L_1^x \oplus L_1^y) f)(x, y) = {} & \frac{h - x}{h} f(0, y) + \frac{x}{h} f(h, y) + \frac{h - y}{h} f(x, 0) + \frac{y}{h} f(x, h) \\ & - \frac{(h - x)(h - y)}{h^2} f(0, 0) - \frac{x(h - y)}{h^2} f(h, 0) - \frac{y(h - x)}{h^2} f(0, h) - \frac{xy}{h^2} f(h, h). \end{aligned} \tag{2.3.15} \]
Also, from (2.3.10), one obtains
\[ ((R_1^x \oplus R_1^y) f)(x, y) = \frac{x(x - h)}{2} f^{(2,0)}(\xi, y) + \frac{y(y - h)}{2} f^{(0,2)}(x, \eta) - \frac{xy(x - h)(y - h)}{4} f^{(2,2)}(\xi_1, \eta_1) \tag{2.3.16} \]
and
\[ (R_1^x R_1^y f)(x, y) = \frac{xy(x - h)(y - h)}{4} f^{(2,2)}(\xi, \eta). \tag{2.3.17} \]
Remark 2.3.4. 1) The approximation L_1^xL_1^y f of f uses the information Λ(f) = {f(0, 0), f(h, 0), f(0, h), f(h, h)}, which is scalar information. So, (2.3.12) is a scalar approximation formula.
The information on f used by the approximation (L_1^x ⊕ L_1^y)f is Λ(f) = {f(0, 0), f(h, 0), f(0, h), f(h, h), f(0, y), f(h, y), f(x, 0), f(x, h)}, for x, y ∈ [0, h], which is not scalar information, i.e., (2.3.13) is a non-scalar approximation formula.
2) L_1^xL_1^y f interpolates the function f at the vertices V_i, i = 1, ..., 4, of D_h:
\[ (L_1^x L_1^y f)(V_i) = f(V_i), \quad i = 1, \ldots, 4, \]
i.e., the product L_1^xL_1^y is a punctual interpolation operator.
(L_1^x ⊕ L_1^y)f interpolates f on the border ∂D_h of D_h, i.e.:
\[ (L_1^x \oplus L_1^y) f \big|_{\partial D_h} = f \big|_{\partial D_h}. \]
So, L_1^x ⊕ L_1^y is a transfinite interpolation operator, also called a blending interpolation operator.
3) Regarding the approximation order, from (2.3.16), (2.3.17) and (2.3.11) it follows that
\[ \|(R_1^x \oplus R_1^y) f\|_\infty \le \frac{h^2}{8} \Bigl( \|f^{(2,0)}\|_\infty + \|f^{(0,2)}\|_\infty + \frac{h^2}{8} \|f^{(2,2)}\|_\infty \Bigr), \]
respectively,
\[ \|R_1^x R_1^y f\|_\infty \le \frac{h^4}{64} \|f^{(2,2)}\|_\infty, \]
i.e.,
\[ \operatorname{ord}(L_1^x L_1^y) = 2, \qquad \operatorname{ord}(L_1^x \oplus L_1^y) = 4, \]
which is in concordance with (2.3.1) and (2.3.2).
Now, using a second approximation level, from the boolean sum interpolation formula (2.3.13) we obtain a homogeneous formula. Taking into account that ord(L_1^x ⊕ L_1^y) = 4, in the second level we must use interpolation operators of order 4. Such operators are, for example, the Hermite operators H_3^x and H_3^y that interpolate the function f at the double nodes 0 and h, i.e.,
\[ (H_3^x f)(x, y) = h_{00}(x) f(0, y) + h_{01}(x) f^{(1,0)}(0, y) + h_{10}(x) f(h, y) + h_{11}(x) f^{(1,0)}(h, y), \]
respectively,
\[ (H_3^y f)(x, y) = h_{00}(y) f(x, 0) + h_{01}(y) f^{(0,1)}(x, 0) + h_{10}(y) f(x, h) + h_{11}(y) f^{(0,1)}(x, h), \]
where
\[ h_{00}(x) = \frac{(x - h)^2 (2x + h)}{h^3}, \quad h_{01}(x) = \frac{x(x - h)^2}{h^2}, \quad h_{10}(x) = \frac{x^2 (3h - 2x)}{h^3}, \quad h_{11}(x) = \frac{x^2 (x - h)}{h^2}. \]
For the remainder terms, we have
\[ (R_3^x f)(x, y) = \frac{x^2 (x - h)^2}{24} f^{(4,0)}(\xi, y), \qquad (R_3^y f)(x, y) = \frac{y^2 (y - h)^2}{24} f^{(0,4)}(x, \eta), \]
with
\[ \operatorname{ord}(H_3^x) = \operatorname{ord}(H_3^y) = 4. \]
So,
\[ \operatorname{ord}(H_3^x) = \operatorname{ord}(H_3^y) = \operatorname{ord}(L_1^x \oplus L_1^y) = 4, \]
hence
\[ f = (L_1^x H_3^y + H_3^x L_1^y - L_1^x L_1^y) f + (L_1^x R_3^y + L_1^y R_3^x + R_1^x R_1^y) f \]
is a homogeneous interpolation formula of order 4.
Remark 2.3.5. From a boolean sum interpolation formula a homogeneous formula can always be obtained. The condition is to choose, in the second level of approximation, interpolation operators of a suitable degree. This is always possible.
2.3.2 Interpolation on a simplex domain
We consider only the bivariate case, with the standard triangle
\[ T_h = \{(x, y) \in \mathbb{R}^2 \mid x \ge 0, \; y \ge 0, \; x + y \le h\}, \quad h > 0. \]
Let us consider the univariate operators P_1, P_2 and P_3 which interpolate a given function f : T_h → ℝ on the sets {(0, y), (h − y, y)}, {(x, 0), (x, h − x)} and {(x + y, 0), (0, x + y)}, respectively, i.e.,
\[ \begin{aligned} (P_1 f)(x, y) &= \frac{h - x - y}{h - y} f(0, y) + \frac{x}{h - y} f(h - y, y), \\ (P_2 f)(x, y) &= \frac{h - x - y}{h - x} f(x, 0) + \frac{y}{h - x} f(x, h - x), \\ (P_3 f)(x, y) &= \frac{x}{x + y} f(x + y, 0) + \frac{y}{x + y} f(0, x + y). \end{aligned} \tag{2.3.18} \]
Using the product and the boolean sum of these operators, we constructbivariate operators with punctual or transfinite interpolation properties.
Example 2.3.6. Let P = P_1P_2P_3. We have
\[ (Pf)(x, y) = \frac{h - x - y}{h} f(0, 0) + \frac{x}{h} f(h, 0) + \frac{y}{h} f(0, h). \]
So, P is a punctual interpolation operator that interpolates f at the vertices of T_h. Also,
\[ \operatorname{dex}(P) = 1, \]
i.e., Pf = f for f ∈ ℙ²₁.
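The operator P above is just linear interpolation at the three vertices of T_h; a short check (our own sketch, hypothetical names) that it reproduces first-degree bivariate polynomials:

```python
# Vertex interpolation operator P = P1 P2 P3 on the standard triangle T_h
def triangle_p(f, h):
    def pf(x, y):
        return ((h - x - y) * f(0.0, 0.0) + x * f(h, 0.0) + y * f(0.0, h)) / h
    return pf

# dex(P) = 1: any linear function g(x, y) = a + bx + cy is reproduced exactly
g = lambda x, y: 1.0 + 2.0 * x - 3.0 * y
pf = triangle_p(g, 2.0)
```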
Let
\[ f = Pf + Rf \tag{2.3.19} \]
be the corresponding interpolation formula, where R denotes the remainder operator.
If f ∈ B_{1,1}[0, 0], then by Peano's theorem one obtains
\[ (Rf)(x, y) = \int_0^h \varphi_{20}(x, y; s) f^{(2,0)}(s, 0)\, ds + \int_0^h \varphi_{02}(x, y; t) f^{(0,2)}(0, t)\, dt + \iint_{T_h} \varphi_{11}(x, y; s, t) f^{(1,1)}(s, t)\, ds\, dt, \]
where
\[ \begin{aligned} \varphi_{20}(x, y; s) &= (x - s)_+ - \tfrac{x}{h}(h - s) \le 0, \quad s \in [0, h], \\ \varphi_{02}(x, y; t) &= (y - t)_+ - \tfrac{y}{h}(h - t) \le 0, \quad t \in [0, h], \\ \varphi_{11}(x, y; s, t) &= (x - s)^0_+ (y - t)^0_+ \ge 0, \quad (s, t) \in T_h. \end{aligned} \]
As the Peano kernels φ₂₀, φ₀₂ and φ₁₁ do not change sign on [0, h], respectively on T_h, by the mean value theorem we obtain:
\[ (Rf)(x, y) = \frac{x(x - h)}{2} f^{(2,0)}(\xi, 0) + \frac{y(y - h)}{2} f^{(0,2)}(0, \eta) + xy\, f^{(1,1)}(\xi_1, \eta_1), \]
where ξ, η ∈ [0, h] and (ξ₁, η₁) ∈ T_h. It follows that:
\[ |(Rf)(x, y)| \le \frac{h^2}{4} \Bigl[ \frac{1}{2} \|f^{(2,0)}(\cdot, 0)\|_{L_\infty[0,h]} + \frac{1}{2} \|f^{(0,2)}(0, \cdot)\|_{L_\infty[0,h]} + \|f^{(1,1)}\|_{L_\infty(T_h)} \Bigr], \]
which implies that (2.3.19) is a homogeneous interpolation formula of order 2 (ord(P) = 2).
Remark 2.3.7. We have
1) \((P_i \oplus P_j) f \big|_{\partial T_h} = f \big|_{\partial T_h}\), for all i, j = 1, 2, 3, i ≠ j;
2) dex(P_i ⊕ P_j) = 2.
These properties are easy to verify.
Example 2.3.8. If f^{(1,0)}(0, y), y ∈ [0, h], exists, one considers the Birkhoff type operator B_1, defined by:
\[ (B_1 f)(x, y) = f(h - y, y) + (x + y - h) f^{(1,0)}(0, y). \]
For y ∈ [0, h], it is easy to check that
\[ (B_1 f)(h - y, y) = f(h - y, y), \qquad (B_1 f)^{(1,0)}(0, y) = f^{(1,0)}(0, y). \]
Let Q := B_1 ⊕ P_2. We have
1)
\[ (Qf)(x, 0) = f(x, 0), \qquad (Qf)(h - y, y) = f(h - y, y), \qquad (Qf)^{(1,0)}(0, y) = f^{(1,0)}(0, y), \quad x, y \in [0, h]; \]
2)
\[ \operatorname{dex}(Q) = 2. \]
Indeed,
\[ \begin{aligned} (Qf)(x, y) = {} & \frac{h - x - y}{h - x} f(x, 0) + \frac{y}{h - x} f(x, h - x) + (x + y - h) f^{(1,0)}(0, y) \\ & + (h - x - y) \Bigl\{ \frac{y}{h^2} [f(0, h) - f(0, 0)] + \frac{h - y}{h} f^{(1,0)}(0, 0) + \frac{y}{h} [f^{(1,0)}(0, h) - f^{(0,1)}(0, h)] \Bigr\}, \end{aligned} \]
and the properties 1) and 2) are verified by a straightforward computation.
One considers the interpolation formula
\[ f = Qf + Rf, \tag{2.3.20} \]
with Rf the remainder term. If f ∈ B_{12}[0, 0], then by Peano's theorem one obtains
\[ |(Rf)(x, y)| \le \frac{h^3}{27} \Bigl[ \frac{2}{3} \|f^{(0,3)}(0, \cdot)\|_{L_\infty[0,h]} + \frac{1}{2} \|f^{(1,2)}\|_{L_\infty(T_h)} \Bigr]. \]
So, (2.3.20) is a homogeneous blending interpolation formula of Birkhoff type and
\[ \operatorname{ord}(Q) = 3. \]
2.3.3 Bivariate Shepard interpolation
Let X ⊂ ℝ² be an arbitrary domain, f : X → ℝ and Z = {z_i | z_i = (x_i, y_i) ∈ X, i = 1, ..., N} the set of interpolation nodes. In 1968, Donald Shepard introduced a new kind of interpolation procedure, where the fundamental interpolation functions at a point z := (x, y) are defined using the distances from the point (x, y) to the interpolation nodes (x_i, y_i), i = 1, ..., N.
The bivariate Shepard operator, denoted by S0, is defined by:
(S_0 f)(x, y) = Σ_{i=1}^{N} A_i(x, y) f(x_i, y_i),   (2.3.21)

with

A_i(x, y) = ( Π_{j=1, j≠i}^{N} ρ_j^µ(x, y) ) / ( Σ_{k=1}^{N} Π_{j=1, j≠k}^{N} ρ_j^µ(x, y) ),   (2.3.22)
2.3. Some elements of multivariate interpolation operators 113
where µ ∈ R_+ and ρ_j(x, y) is the distance, in some metric ρ on R², from (x, y) to the node z_j. Usually,

ρ_j(x, y) = ((x − x_j)² + (y − y_j)²)^{1/2}.
The functions A_i, i = 1, ..., N, can also be written as

A_i(x, y) = (1/ρ_i^µ(x, y)) / ( Σ_{j=1}^{N} 1/ρ_j^µ(x, y) )

and

(S_0 f)(x, y) = ( Σ_{i=1}^{N} f(x_i, y_i)/ρ_i^µ(x, y) ) / ( Σ_{j=1}^{N} 1/ρ_j^µ(x, y) ).
It is easy to verify that

A_i(x_j, y_j) = δ_ij   (2.3.23)

and

Σ_{i=1}^{N} A_i(x, y) = 1.   (2.3.24)
The main properties of the Shepard operator S_0 are:

• Interpolation: (S_0 f)(x_i, y_i) = f(x_i, y_i), i = 1, ..., N.

• Degree of exactness: dex(S_0) = 0.

• The values stay between the extreme data values:

min_{i=1,...,N} f(x_i, y_i) ≤ (S_0 f)(x, y) ≤ max_{i=1,...,N} f(x_i, y_i).
These properties are implied by the relations (2.3.23) and (2.3.24).

We notice that the drawbacks mentioned for the univariate Shepard method are maintained in the bivariate case as well, namely:

• high computational cost;

• low degree of exactness.

Two ways to overcome these drawbacks are:
⋆ Modifying the basis functions, for example by using the local version of the Shepard formula introduced by Franke and Nielson. In this case the Shepard operator is of the form:

(S_w f)(x, y) = ( Σ_{i=1}^{N} W_i(x, y) f(x_i, y_i) ) / ( Σ_{i=1}^{N} W_i(x, y) ),   (2.3.25)

with

W_i(x, y) = [ (R_w − r_i)_+ / (R_w r_i) ]²,   (2.3.26)

where r_i = r_i(x, y) is the distance from (x, y) to the node (x_i, y_i) and R_w is a radius of influence about the node (x_i, y_i), varying with i.
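A sketch of (2.3.25)-(2.3.26), with hypothetical names; the node-dependent radius of influence is passed in as an array:

```python
import numpy as np

def franke_nielson(x, y, nodes, values, radii):
    """Local Shepard interpolant S_w with the Franke-Nielson weights (2.3.26).

    radii[i] is the radius of influence R_w of node i; nodes farther than
    radii[i] from (x, y) receive weight zero, which makes the sum local.
    """
    r = np.hypot(x - nodes[:, 0], y - nodes[:, 1])          # r_i(x, y)
    if np.any(r == 0):
        return values[np.argmin(r)]
    w = (np.maximum(radii - r, 0.0) / (radii * r)) ** 2     # [(R_w - r_i)_+ / (R_w r_i)]^2
    total = w.sum()
    if total == 0.0:
        raise ValueError("no node influences (x, y); enlarge the radii")
    return np.dot(w, values) / total

nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([1.0, 2.0, 3.0, 4.0])
radii = np.full(4, 2.0)
assert abs(franke_nielson(0.0, 0.0, nodes, vals, radii) - 1.0) < 1e-12
```

The truncation (R_w − r_i)_+ is what reduces the computational cost: only the nodes inside the disc of radius radii[i] contribute at a given point.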
⋆ Increasing the degree of exactness by combining the Shepard operator with another interpolation operator. In this way its degree of exactness is increased and other sets of functionals are used. The general procedure is presented in Section 2.2.6.2.

Some results similar to those given in the univariate case follow.
Remark 2.3.9. If P_i, i = 0, ..., N, are linear operators then S_P is also a linear operator.

Remark 2.3.10. Let P_i, i = 0, ..., N, be linear operators. If

dex(P_i) = ρ_i, i = 0, ..., N,

then

dex(S_P) = min{ρ_0, ..., ρ_N}.
Remark 2.3.11. As A_i(x, y) ≥ 0, for (x, y) ∈ X, it follows that S_0 is a positive operator.

Remark 2.3.12. The exponent µ ∈ R_+ can be chosen arbitrarily. If 0 ≤ µ ≤ 1 the function S_0 f has peaks at the nodes, for µ > 1 it is level at the nodes, and for µ large enough S_0 f becomes a step function.
We illustrate this phenomenon by some examples.
Example 2.3.13. Let f : [−1, 1] × [−1, 1] → R, f(x, y) = −(x² + y²), and consider the nodes z_1 = (−1, −1), z_2 = (−1, 1), z_3 = (1, −1), z_4 = (1, 1), z_5 = (−0.5, −0.5), z_6 = (−0.5, 0.5), z_7 = (0.5, −0.5), z_8 = (0.5, 0.5), z_9 = (0, 0). In Figure 2.2 we plot S_0 f for µ = 1, 2, 20.
[Figure 2.2 shows, over [−1, 1]²: (a) the function f; (b) the interpolant S_0 f for µ = 1; (c) the interpolant S_0 f for µ = 2; (d) the interpolant S_0 f for µ = 20.]
Figure 2.2: Bivariate Shepard interpolants.
2.3.3.1 The bivariate Shepard-Lagrange operator
Let f : X → R, X ⊂ R², and Z = {z_i | z_i = (x_i, y_i) ∈ X, i = 1, ..., N} be the set of interpolation nodes. Consider the set of Lagrange-type functionals

Λ_L(f) = {λ_i | λ_i(f) = f(x_i, y_i), i = 1, ..., N}.
Taking into account that dex(S_0) = 0, our goal is to construct Shepard-type operators of higher degree of exactness. To this end, we use a combination of the Shepard operator S_0 with some Lagrange operator for bivariate functions.
Let L_i f be the bivariate Lagrange polynomial of degree n that interpolates the function f at the set of points

Z_{m,i} := {z_i, z_{i+1}, ..., z_{i+m−1}}, i = 1, ..., N, m < N,   (2.3.27)

with z_{N+i} = z_i, i = 1, ..., m − 1, and m := (n + 1)(n + 2)/2 being the number of coefficients of a bivariate polynomial of degree n,

P_n(x, y) = Σ_{i+j≤n} a_ij x^i y^j.
Remark 2.3.14. For given N, one can consider only operators L_i^n with n such that (n + 1)(n + 2)/2 < N, i.e., n ∈ {1, ..., ν}, where ν = ⌊(√(8N + 1) − 3)/2⌋. The existence and uniqueness of the operators L_i^n are assured by the following theorem.
Theorem 2.3.15. Let z_i := (x_i, y_i), i = 1, ..., (n + 1)(n + 2)/2, be distinct points in the plane that do not lie on the same algebraic curve of n-th degree (Σ_{i+j≤n} a_ij x^i y^j = 0). Then, for every function f defined at the points z_i, i = 1, ..., (n + 1)(n + 2)/2, there exists a unique polynomial Q_n of n-th degree that interpolates f at z_i, i.e.,

Q_n(x_i, y_i) = f(x_i, y_i), i = 1, ..., (n + 1)(n + 2)/2.

Hence, if the points z_k, k = i, ..., i + m − 1, i = 1, ..., N, of the set (2.3.27) do not lie on an algebraic curve of n-th degree, then L_i^n exists and is unique for all i = 1, ..., N.
Suppose that the existence and uniqueness conditions of the operators L_i^n, i = 1, ..., N, are satisfied. We have

(L_i^n f)(x, y) = Σ_{k=i}^{i+m−1} l_k(x, y) f(x_k, y_k), i = 1, ..., N,

where l_k are the corresponding cardinal polynomials,

l_k(x_j, y_j) = δ_kj, k, j = i, ..., i + m − 1.
The main properties of the Lagrange interpolation operators L_i^n are:

(L_i^n f)(x_k, y_k) = f(x_k, y_k), k = i, ..., i + m − 1,   (2.3.28)

and

dex(L_i^n) = n, i = 1, ..., N.   (2.3.29)
Definition 2.3.16. The operator S_n^L given by

(S_n^L f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (L_i^n f)(x, y)   (2.3.30)

is called the bivariate Shepard-Lagrange operator.
Remark 2.3.17. As the Lagrange operators L_i^n, i = 1, ..., N, and the Shepard operator S_0 are linear, the combined operator S_n^L is also linear.
Theorem 2.3.18. Let f : D → R be a given function, z_i := (x_i, y_i) ∈ D, i = 1, ..., N, and m := (n + 1)(n + 2)/2. If the points z_i, ..., z_{i+m−1} do not lie on an algebraic curve of n-th degree, for all i = 1, ..., N (z_{N+k} := z_k, k = 1, ..., m − 1), then the combined operator S_n^L exists and has the following properties:

(S_n^L f)(x_j, y_j) = f(x_j, y_j), j = 1, ..., N,   (2.3.31)

and

dex(S_n^L) = n.   (2.3.32)
Proof. From (2.3.28) it follows that

(S_n^L f)(x_j, y_j) = Σ_{i=1}^{N} A_i(x_j, y_j) (L_i^n f)(x_j, y_j) = Σ_{i=1}^{N} A_i(x_j, y_j) f(x_j, y_j),

and (2.3.23) implies the interpolation property (2.3.31). Relation (2.3.32) follows by Remark 2.3.10, taking into account that dex(L_i^n) = n.

Next, one considers two particular cases: n = 1 and n = 2.

1) Case n = 1. We have
(S_1^L f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (L_i^1 f)(x, y),   (2.3.33)

where

(L_i^1 f)(x, y) = l_i(x, y) f(x_i, y_i) + l_{i+1}(x, y) f(x_{i+1}, y_{i+1}) + l_{i+2}(x, y) f(x_{i+2}, y_{i+2}),
for i = 1, ..., N (z_{N+1} := z_1, z_{N+2} := z_2). One obtains

l_i(x, y) = [(y_{i+1} − y_{i+2})x + (x_{i+2} − x_{i+1})y + x_{i+1}y_{i+2} − x_{i+2}y_{i+1}] / [(x_i − x_{i+1})(y_{i+1} − y_{i+2}) − (x_{i+1} − x_{i+2})(y_i − y_{i+1})],

l_{i+1}(x, y) = [(y_{i+2} − y_i)x + (x_i − x_{i+2})y + x_{i+2}y_i − x_i y_{i+2}] / [(x_{i+1} − x_{i+2})(y_{i+2} − y_i) − (x_{i+2} − x_i)(y_{i+1} − y_{i+2})],

l_{i+2}(x, y) = [(y_i − y_{i+1})x + (x_{i+1} − x_i)y + x_i y_{i+1} − x_{i+1}y_i] / [(x_{i+2} − x_i)(y_i − y_{i+1}) − (x_i − x_{i+1})(y_{i+2} − y_i)].
Remark 2.3.19. The existence and uniqueness condition for L_i^1 is that the points z_i, z_{i+1}, z_{i+2} do not lie on a line Ax + By + C = 0; equivalently, they must be the vertices of a non-degenerate triangle ∆_i, for all i = 1, ..., N.

Remark 2.3.20. The functions S_1^L f and S_0 f use the same information about f (the values f(x_i, y_i), i = 1, ..., N), but dex(S_1^L) = 1, while dex(S_0) = 0.
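The case n = 1 is easy to implement: each L_i^1 f is a linear polynomial determined by three interpolation conditions. A sketch with our own names, where the linear coefficients are found by solving a 3 × 3 system instead of using the closed formulas above:

```python
import numpy as np

def shepard_lagrange1(x, y, nodes, values, mu=2.0):
    """Shepard-Lagrange operator S^L_1 of (2.3.33): node i carries the linear
    interpolant L^1_i f on the triangle (z_i, z_{i+1}, z_{i+2}), indices cyclic."""
    N = len(nodes)
    d = np.hypot(x - nodes[:, 0], y - nodes[:, 1])
    if np.any(d == 0):
        return values[np.argmin(d)]
    w = d ** (-mu)
    w /= w.sum()                                    # the weights A_i(x, y)
    s = 0.0
    for i in range(N):
        idx = [i, (i + 1) % N, (i + 2) % N]         # z_{N+1} := z_1, z_{N+2} := z_2
        # coefficients (a, b, c) of L^1_i f = a x + b y + c
        M = np.column_stack([nodes[idx, 0], nodes[idx, 1], np.ones(3)])
        a, b, c = np.linalg.solve(M, values[idx])   # triangle must be non-degenerate
        s += w[i] * (a * x + b * y + c)
    return s

# dex(S^L_1) = 1: a polynomial of first degree is reproduced exactly
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f = lambda p: 2 * p[0] - 3 * p[1] + 1
vals = np.array([f(p) for p in nodes])
assert abs(shepard_lagrange1(0.3, 0.6, nodes, vals) - f((0.3, 0.6))) < 1e-10
```

The exactness check works because each local linear interpolant reproduces the linear f, and the weights A_i sum to one.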
2) Case n = 2. From (2.3.30), we obtain

(S_2^L f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (L_i^2 f)(x, y),   (2.3.34)

where L_i^2 f are second-degree polynomials that interpolate f at z_i, ..., z_{i+5}, respectively.

In accordance with Theorem 2.3.18, the interpolation nodes z_i, ..., z_{i+5} must not lie, respectively, on an algebraic curve of second degree g_i, i = 1, ..., N. If, for some j, 1 ≤ j ≤ N, this condition is not satisfied, the index order of z_i, i = 1, ..., N, can be changed.
Remark 2.3.21. A Shepard-Lagrange operator with second degree of exactness can also be obtained using more information about f. Roughly speaking, the only requirement is the existence of some second-degree Lagrange-type polynomials which interpolate f at z_k, k = 1, ..., N.

A sufficient condition for the existence and uniqueness of such polynomials, say L_i^2 f, is that the six interpolation nodes be the vertices z_i, z_{i+1}, z_{i+2} of the triangle ∆_i and the midpoints, say ξ_i, ξ_{i+1}, ξ_{i+2}, of the sides of ∆_i, for i = 1, ..., N.
Using the polynomials L_i^2 f, i = 1, ..., N, one can define the Shepard operator S_2^L:

(S_2^L f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (L_i^2 f)(x, y),

which interpolates f at z_i, i = 1, ..., N, and dex(S_2^L) = 2.
Of course, S_2^L f also uses the values of the function f at the midpoints ξ_i, i = 1, ..., N + 2. But the advantage of S_2^L f is that the existence and uniqueness conditions of the polynomials L_i^2 f are simpler to verify.
An extension of the Shepard-Lagrange operator S_n^L is obtained if the Lagrange polynomials L_i f, used in combination with the Shepard operator S_0, are of different degrees, say n_i, i = 1, ..., N, such that (n_i + 1)(n_i + 2)/2 < N.

So, let

Z_{m_i,i} := {z_i, z_{i+1}, ..., z_{i+m_i−1}}, i = 1, ..., N,

be sets of m_i interpolation nodes, respectively, with m_i = (n_i + 1)(n_i + 2)/2. Using the given information on f at the nodes z_i ∈ Z_{m_i,i}, one constructs the corresponding Lagrange interpolation operators L_i^{n_i}, i = 1, ..., N.
Definition 2.3.22. The operator S^L_{n_1,...,n_N} defined by

(S^L_{n_1,...,n_N} f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (L_i^{n_i} f)(x, y)   (2.3.35)

is called a general Shepard-Lagrange operator.
Remark 2.3.23. For n_1 = · · · = n_N = n the operator S^L_{n_1,...,n_N} becomes S_n^L.
Theorem 2.3.24. Suppose that S^L_{n_1,...,n_N} exists. Then

(S^L_{n_1,...,n_N} f)(x_i, y_i) = f(x_i, y_i), i = 1, ..., N,

and

dex(S^L_{n_1,...,n_N}) = min{n_1, ..., n_N}.

Proof. The proof follows by (2.3.23) and Remark 2.2.43.
Example 2.3.25. For f the same as in Example 2.3.13 and the same nodes, in Figure 2.3 we plot S_1^L f for µ = 1, 2.
[Figure 2.3 shows: (a) the interpolant S_1^L f for µ = 1; (b) the interpolant S_1^L f for µ = 2.]
Figure 2.3: Bivariate Shepard-Lagrange interpolants.
2.3.3.2 The bivariate Shepard-Hermite operator
A new extension of the Shepard operator S_0 can be obtained using not only Lagrange-type information on f, but also some of its partial derivatives.

Let X ⊂ R² be an arbitrary domain, f : X → R and Z = {z_i | z_i = (x_i, y_i) ∈ X, i = 1, ..., N} be the set of interpolation nodes. Consider the set of Hermite-type functionals

Λ_H(f) = {λ_i^{p,q} | λ_i^{p,q} f = f^(p,q)(z_i), (p, q) ∈ N², p + q ≤ m_i, i = 1, ..., N},

with m_i ∈ N*, i = 1, ..., N.

First, denote by Λ_{H,k} ⊂ Λ_H the functionals corresponding to the point z_k, i.e.,

Λ_{H,k}(f) := {λ_k^{p,q} | λ_k^{p,q} f = f^(p,q)(z_k), (p, q) ∈ N², p + q ≤ m_k}.
Let T_k^{m_k} be the bivariate Taylor interpolation operator corresponding to the set Λ_{H,k}:

(T_k^{m_k} f)(x, y) = Σ_{i+j≤m_k} [(x − x_k)^i (y − y_k)^j / (i! j!)] f^(i,j)(x_k, y_k).

As Λ_{H,k}(f), k = 1, ..., N, is an admissible sequence of functionals, the operator T_k^{m_k} exists, for all k = 1, ..., N.
Definition 2.3.26. The operator S^T_{m_1,...,m_N} given by

(S^T_{m_1,...,m_N} f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (T_i^{m_i} f)(x, y)   (2.3.36)

is called the bivariate Shepard-Taylor operator.
Remark 2.3.27. The operator S_1^T, given by

(S_1^T f)(x, y) = Σ_{i=1}^{N} A_i(x, y) (T_i^1 f)(x, y),

or

(S_1^T f)(x, y) = Σ_{i=1}^{N} A_i(x, y) [f(x_i, y_i) + (x − x_i) f^(1,0)(x_i, y_i) + (y − y_i) f^(0,1)(x_i, y_i)],

was defined and studied by Shepard himself.

For µ > 1, it is easy to verify that

(S_1^T f)^(p,q)(x_i, y_i) = f^(p,q)(x_i, y_i), p + q ≤ 1, i = 1, ..., N,

and

dex(S_1^T) = 1.
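Since each T_i^1 f uses only the value and the first derivatives at z_i, the operator S_1^T is straightforward to sketch (names are ours; f0, fx, fy are the arrays of f, f^(1,0), f^(0,1) at the nodes):

```python
import numpy as np

def shepard_taylor1(x, y, nodes, f0, fx, fy, mu=2.0):
    """Shepard-Taylor operator S^T_1: Shepard weights A_i applied to the
    first-order Taylor polynomials T^1_i f about each node."""
    d = np.hypot(x - nodes[:, 0], y - nodes[:, 1])
    if np.any(d == 0):
        return f0[np.argmin(d)]
    w = d ** (-mu)
    w /= w.sum()
    # T^1_i f evaluated at (x, y), for every node i at once
    taylor = f0 + (x - nodes[:, 0]) * fx + (y - nodes[:, 1]) * fy
    return np.dot(w, taylor)

# dex(S^T_1) = 1: exact for f(x, y) = x + 2y
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f0 = nodes[:, 0] + 2 * nodes[:, 1]
fx = np.ones(4)
fy = np.full(4, 2.0)
assert abs(shepard_taylor1(0.3, 0.4, nodes, f0, fx, fy) - 1.1) < 1e-12
```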
Theorem 2.3.28. For µ > max{m_1, ..., m_N} we have

(S^T_{m_1,...,m_N} f)^(p,q)(x_i, y_i) = f^(p,q)(x_i, y_i), p + q ≤ m_i, i = 1, ..., N,   (2.3.37)

respectively,

dex(S^T_{m_1,...,m_N}) = min{m_1, ..., m_N}.   (2.3.38)
Proof. First, we see that

A_i^(p,q)(x_k, y_k) = 0, k = 1, ..., N, 0 ≤ p + q ≤ m_k, i ≠ k,
A_i^(p,q)(x_i, y_i) = 0, p + q ≥ 1,   (2.3.39)

for all i = 1, ..., N. Indeed, let us consider

A_k = g_k / h_k,

with

g_k(x, y) = Π_{j=1, j≠k}^{N} ρ_j^µ(x, y),

h_k(x, y) = Σ_{k=1}^{N} Π_{j=1, j≠k}^{N} ρ_j^µ(x, y).
After a straightforward computation, one obtains

g_k^(p,q)(x_i, y_i) = 0, 1 ≤ p + q ≤ m_i,

for µ > max{m_1, ..., m_N}, i = 1, ..., N, i ≠ k, and

g_k^(p,q)(x_k, y_k) = h_k^(p,q)(x_k, y_k), 1 ≤ p + q ≤ m_k.

So, (2.3.39) follows. Now, relations (2.3.39) imply the interpolation property (2.3.37). The degree of exactness property (2.3.38) is a consequence of Remark 2.3.10, namely of the fact that dex(T_k^{m_k}) = m_k, k = 1, ..., N.

Next, for ν_k ∈ N*, ν_k < N, let us consider the set

Λ_{H,ν_k}(f) := {λ_{k+j}^{p,q} | λ_{k+j}^{p,q} f = f^(p,q)(z_{k+j}), (p, q) ∈ N², p + q ≤ m_k, j = 0, 1, ..., ν_k − 1},

with z_{N+i} = z_i, i ∈ N*, i.e., let Λ_{H,ν_k}(f) ⊂ Λ_H(f) be the set of functionals corresponding to the nodes z_k, z_{k+1}, ..., z_{k+ν_k−1}, regarding the function f. If |Λ_{H,ν_k}(f)| = m_k, let H_k^{n_k} be the Hermite interpolation operator corresponding to the set Λ_{H,ν_k}(f), k = 1, ..., N, with n_k such that m_k = (n_k + 1)(n_k + 2)/2.
Definition 2.3.29. Suppose that H_k^{n_k} exists for all k = 1, ..., N. The operator S^H_{n_1,...,n_N} defined by

(S^H_{n_1,...,n_N} f)(x, y) = Σ_{k=1}^{N} A_k(x, y) (H_k^{n_k} f)(x, y)

is called the bivariate Shepard-Hermite operator.
Theorem 2.3.30. If Λ_{H,ν_k}(f) ⊂ Λ_H(f), k = 1, ..., N, is an admissible sequence, then S^H_{n_1,...,n_N} exists and it has the following interpolation properties:

(S^H_{n_1,...,n_N} f)^(p,q)(x_i, y_i) = f^(p,q)(x_i, y_i), p + q ≤ m_i, i = 1, ..., N,   (2.3.40)

for µ > max{m_1, ..., m_N}, and

dex(S^H_{n_1,...,n_N}) = min{n_1, ..., n_N}.   (2.3.41)
Proof. Relations (2.3.40) follow by (2.3.39), taking into account the interpolation properties of the polynomials H_k^{n_k} f, k = 1, ..., N. Relation (2.3.41) follows by Remark 2.2.43.

Particular cases

• If ν_k = 1, for all k = 1, ..., N, then S^H_{n_1,...,n_N} becomes the Shepard-Taylor operator (2.3.36).

• For n_1 = · · · = n_N = n, all polynomials H_k^n, k = 1, ..., N, are of degree n and one obtains the Shepard-Hermite operator S_n^H:

(S_n^H f)(x, y) = Σ_{k=1}^{N} A_k(x, y) (H_k^n f)(x, y).   (2.3.42)
Example 2.3.31. Let

Λ_H(f) = {λ_i^{p,q} | λ_i^{p,q} f = f^(p,q)(x_i, y_i), (p, q) ∈ N², p + q ≤ 1, i = 1, ..., N}   (2.3.43)

and

Λ_{H,ν_k}(f) = {λ_k^{00}, λ_k^{10}, λ_k^{01}, λ_{k+1}^{00}, λ_{k+2}^{00}, λ_{k+3}^{00}},   (2.3.44)

with ν_k = 4 and λ_{N+1} = λ_1, λ_{N+2} = λ_2, λ_{N+3} = λ_3. The existence and uniqueness of the Hermite operator H_k^2 are based on the following auxiliary result.
Lemma 2.3.32. Let z_{k+i} = (x_{k+i}, y_{k+i}), i = 0, 1, 2, 3, be given points in the plane such that:

• x_{k+i} ≠ x_k, for i = 1, 2, 3;

• if l_{k+i} is the line determined by the points z_k and z_{k+i}, i = 1, 2, 3, then l_{k+i} ≠ l_{k+j}, for i, j = 1, 2, 3, i ≠ j.

Then for every function f with the given information f(z_k), f^(1,0)(z_k), f^(0,1)(z_k), f(z_{k+1}), f(z_{k+2}), f(z_{k+3}) there exists a unique polynomial P_2 of second degree such that

P_2^(i,j)(z_k) = f^(i,j)(z_k), (i, j) ∈ {(0, 0), (1, 0), (0, 1)},
P_2(z_{k+i}) = f(z_{k+i}), i = 1, 2, 3.   (2.3.45)
Proof. Let P2 be an arbitrary polynomial of second degree:
P_2(x, y) = Ax² + Bxy + Cy² + Dx + Ey + F.   (2.3.46)
From (2.3.45) one obtains a 6 × 6 linear algebraic system. Let M be its matrix. Then, after some permissible transformations, one obtains

det M = − Π_{i=1}^{3} (x_{k+i} − x_k)² · det [ 1  α_{k+1}  α²_{k+1} ; 1  α_{k+2}  α²_{k+2} ; 1  α_{k+3}  α²_{k+3} ],

where

α_{k+i} = (y_{k+i} − y_k) / (x_{k+i} − x_k), i = 1, 2, 3.   (2.3.47)

The hypothesis of Lemma 2.3.32 implies that det M ≠ 0, i.e., the system has a unique solution.
Theorem 2.3.33. If Λ_{H,ν_k}, k = 1, ..., N, given by (2.3.44), is an admissible sequence, then there exists the Shepard-Hermite operator S_2^H, defined by

(S_2^H f)(x, y) = Σ_{k=1}^{N} A_k(x, y) (H_k^2 f)(x, y),

where H_k^2 is the interpolation operator corresponding to the set Λ_{H,ν_k}.
Proof. The proof follows from Lemma 2.3.32. In order to give the function S_2^H f effectively, we have to determine the Hermite polynomials H_k^2 f, k = 1, ..., N. To this end, we consider H_k^2 f in the form

H_k^2 f = h^{00}_{k,k} f(x_k, y_k) + h^{10}_{k,k} f^(1,0)(x_k, y_k) + h^{01}_{k,k} f^(0,1)(x_k, y_k)
  + h^{00}_{k,k+1} f(x_{k+1}, y_{k+1}) + h^{00}_{k,k+2} f(x_{k+2}, y_{k+2}) + h^{00}_{k,k+3} f(x_{k+3}, y_{k+3}),

where h^{p,q}_{i,j} are the corresponding fundamental interpolation polynomials. Hence, each polynomial h^{p,q}_{i,j} is a bivariate polynomial of second degree, as in (2.3.46), that satisfies the cardinal interpolation conditions. For example,

h^{00}_{k,k}(x_k, y_k) = 1,
(h^{00}_{k,k})^(1,0)(x_k, y_k) = 0,
(h^{00}_{k,k})^(0,1)(x_k, y_k) = 0,
h^{00}_{k,k}(x_{k+1}, y_{k+1}) = 0,
h^{00}_{k,k}(x_{k+2}, y_{k+2}) = 0,
h^{00}_{k,k}(x_{k+3}, y_{k+3}) = 0,

which is a 6 × 6 linear algebraic system.
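The 6 × 6 system can be set up and solved numerically; the following sketch (with our own names) computes the cardinal polynomial h^{00}_{k,k} for nodes satisfying the hypotheses of Lemma 2.3.32:

```python
import numpy as np

def p2_row_value(x, y):
    # coefficients of (A, B, C, D, E, F) in the condition P2(x, y) = rhs
    return [x * x, x * y, y * y, x, y, 1.0]

def p2_row_dx(x, y):
    # coefficients in the condition P2^(1,0)(x, y) = rhs
    return [2 * x, y, 0.0, 1.0, 0.0, 0.0]

def p2_row_dy(x, y):
    # coefficients in the condition P2^(0,1)(x, y) = rhs
    return [0.0, x, 2 * y, 0.0, 1.0, 0.0]

def hermite_p2(zk, z1, z2, z3, rhs):
    """Solve (2.3.45): value and first derivatives at zk, values at z1, z2, z3.
    rhs lists the six right-hand sides; returns (A, B, C, D, E, F) of (2.3.46)."""
    M = np.array([p2_row_value(*zk), p2_row_dx(*zk), p2_row_dy(*zk),
                  p2_row_value(*z1), p2_row_value(*z2), p2_row_value(*z3)])
    return np.linalg.solve(M, np.asarray(rhs, float))

# cardinal polynomial h^{00}_{k,k}: right-hand side (1, 0, 0, 0, 0, 0);
# the nodes below satisfy Lemma 2.3.32 (x_{k+i} != x_k, three distinct lines)
A, B, C, D, E, F = hermite_p2((0, 0), (1, 0), (1, 1), (1, -1), [1, 0, 0, 0, 0, 0])
# for these nodes the solution is h^{00}_{k,k}(x, y) = 1 - x^2
assert abs(A + 1) < 1e-12 and abs(F - 1) < 1e-12
```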
Example 2.3.34. Let

Λ_H(f) = {λ_k^{0,0}, λ_k^{1,0}, λ_k^{0,1}, λ_k^{1,1} | k = 1, ..., N}.   (2.3.48)

One considers the sequence of subsets of Λ_H:

Λ_{H,ν_k}(f) = {λ_k^{0,0}, λ_k^{1,0}, λ_k^{0,1}, λ_k^{1,1}, λ_{k+1}^{0,0}, λ_{k+2}^{0,0}}, ν_k = 3, k = 1, ..., N,   (2.3.49)

with

λ_{N+1} = λ_1, λ_{N+2} = λ_2.
Lemma 2.3.35. If

• x_{k+1} ≠ x_k and x_{k+2} ≠ x_k,

• α_{k+1} ≠ α_{k+2} and α_{k+1} ≠ −α_{k+2}, for α_{k+1} and α_{k+2} given in (2.3.47),

then (2.3.49) is an admissible sequence.
Proof. For P_2 as in (2.3.46) we have to study the linear system:

P_2^(i,j)(x_k, y_k) = f^(i,j)(x_k, y_k),
P_2(x_{k+1}, y_{k+1}) = f(x_{k+1}, y_{k+1}),
P_2(x_{k+2}, y_{k+2}) = f(x_{k+2}, y_{k+2}),

with (i, j) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)}. The determinant of this system is

D = (x_{k+1} − x_k)² (x_{k+2} − x_k)² (α²_{k+2} − α²_{k+1}),

and the proof follows.
Remark 2.3.36. From Lemma 2.3.35 it follows that the Shepard-Hermite operator corresponding to the set of functionals (2.3.48) is given by

S_2^H f = Σ_{k=1}^{N} A_k H_k^2 f,

where H_k^2 is the Hermite interpolation operator regarding Λ_{H,ν_k}. This operator exists and is unique.
Example 2.3.37. For f the same as in Example 2.3.13 and the same nodes, in Figure 2.4 we plot S_1^T f for µ = 1, 2.
[Figure 2.4 shows: (a) the interpolant S_1^T f for µ = 1; (b) the interpolant S_1^T f for µ = 2.]
Figure 2.4: Bivariate Shepard-Taylor interpolants.
Example 2.3.38. Let g : [−2, 2] × [−2, 2] → R,

g(x, y) = x e^{−(x² + y²)},

and consider 10 random nodes, uniformly distributed on the domain of g. In Figure 2.5 we plot S_2^H g for µ = 1, 2.
[Figure 2.5 shows: (a) the function g; (b) S_2^H g for µ = 1; (c) S_2^H g for µ = 2.]
Figure 2.5: Bivariate Shepard-Hermite interpolants.
2.3.3.3 The bivariate Shepard-Birkhoff operator
Let X ⊂ R² be an arbitrary domain, f : X → R and Z = {z_i | z_i = (x_i, y_i) ∈ X, i = 1, ..., N} be the set of interpolation nodes. Consider the set of Birkhoff-type functionals

Λ_B(f) = {λ_k^{p,q} | λ_k^{p,q} f = f^(p,q)(z_k), (p, q) ∈ I_k ⊂ N², k = 1, ..., N}

and the sequence Λ_{B,ν_k}, k = 1, ..., N, of subsets of Λ_B:

Λ_{B,ν_k}(f) = {λ_{k+j}^{p,q} ∈ Λ_B(f) | (p, q) ∈ I_{k+j}, j = 0, 1, ..., ν_k − 1},   (2.3.50)
with ν_k < N, k = 1, ..., N, and z_{N+i} = z_i, i ≥ 1.

Denote by B_k^{n_k} the interpolation operator of total degree n_k corresponding to the subset Λ_{B,ν_k}, i.e.,

λ_{k+j}^{p,q}(B_k^{n_k} f) = λ_{k+j}^{p,q}(f), (p, q) ∈ I_{k+j}, j = 0, 1, ..., ν_k − 1.   (2.3.51)

Taking into account that

(B_k^{n_k} f)(x, y) = Σ_{i+j≤n_k} a_ij x^i y^j,

the interpolation conditions (2.3.51) give rise to a linear algebraic system in the unknowns a_ij, i, j ∈ N, i + j ≤ n_k. Let M_k := M(Λ_{B,ν_k}) be the matrix of this system. If |Λ_{B,ν_k}| = m_k := (n_k + 1)(n_k + 2)/2 then M_k is a square matrix. Hence, for det M_k ≠ 0 the system has a unique solution. It follows that, if det M_k ≠ 0 for all k = 1, ..., N, then (2.3.50) is an admissible sequence.
Definition 2.3.39. Suppose that det M_k ≠ 0, k = 1, ..., N. The operator S^B_{n_1,...,n_N} defined by

(S^B_{n_1,...,n_N} f)(x, y) = Σ_{k=1}^{N} A_k(x, y) (B_k^{n_k} f)(x, y)   (2.3.52)

is called the bivariate Shepard-Birkhoff operator.
Theorem 2.3.40. If Λ_{B,ν_k}, k = 1, ..., N, is an admissible sequence then, for µ > M := max_{1≤k≤N} {µ_k | µ_k = max{p + q | (p, q) ∈ I_k}}, we have

(S^B_{n_1,...,n_N} f)^(p,q)(x_k, y_k) = f^(p,q)(x_k, y_k), (p, q) ∈ I_k, k = 1, ..., N,   (2.3.53)

and

dex(S^B_{n_1,...,n_N}) = m := min{n_k | k = 1, ..., N}.   (2.3.54)
Proof. For µ > M, relation (2.3.39) implies

(S^B_{n_1,...,n_N} f)^(p,q)(x_i, y_i) = Σ_{k=1}^{N} (A_k B_k^{n_k} f)^(p,q)(x_i, y_i) = Σ_{k=1}^{N} A_k(x_i, y_i) (B_k^{n_k} f)^(p,q)(x_i, y_i).

Now, from (2.3.23) and (2.3.51) the interpolation property (2.3.53) follows. Relation (2.3.54) follows by Remark 2.2.43.
Remark 2.3.41. A particular case of the operator S^B_{n_1,...,n_N} is obtained for n_k = n, k = 1, ..., N, i.e., when all the polynomials B_k^{n_k} f have the same degree n. One obtains

(S_n^B f)(x, y) = Σ_{k=1}^{N} A_k(x, y) (B_k^n f)(x, y).
Remark 2.3.42. The Shepard operator of Birkhoff type S_n^B is an extension of all the previous Shepard operators of Lagrange, Taylor and Hermite type.

The difficult problem which appears in the Hermite and Birkhoff cases is to prove the admissibility of the considered sequence of subsets of Λ_B. Theorem 2.3.15 gives a sufficient condition for the admissibility of such sequences for the Shepard operators of Lagrange type. Also, the problem of sequence admissibility is completely solved for the Shepard operators of Taylor type.
Example 2.3.43. For g the same as in Example 2.3.38 and 10 random nodes, uniformly distributed on the domain of g, in Figure 2.6 we plot S_2^B g for µ = 2, 20.
[Figure 2.6 shows: (a) the interpolant S_2^B g for µ = 2; (b) the interpolant S_2^B g for µ = 20.]
Figure 2.6: Bivariate Shepard-Birkhoff interpolants.
2.4 Uniform approximation
Let f : [a, b] → R be an integrable function, λ_k(f), k = 0, 1, ..., m, some information about f, w : [a, b] → R_+ a weight function, also integrable on [a, b], and

I(f) = ∫_a^b w(x) f(x) dx.
2.4.1 The Weierstrass theorems
Let B be a given class of real-valued functions defined on an interval [a, b] ⊂ R and A ⊂ B. One considers the following problem: for given f ∈ B and ε ∈ R_+, find a function g ∈ A such that

|f(x) − g(x)| < ε, ∀x ∈ [a, b].

The solution of this problem is based on the Weierstrass theorems, usually called the "First Weierstrass Theorem" and the "Second Weierstrass Theorem", respectively.
Theorem 2.4.1. (First Weierstrass Theorem) For any f ∈ C[a, b] and ε > 0there exists an algebraic polynomial P (P ∈ P) such that
|f(x)− P (x)| < ε, ∀x ∈ [a, b].
In other words, the set P (P ⊂ C[a, b]) is dense in the Banach space C[a, b].
Next we give some other equivalent formulations of this theorem.
Theorem 2.4.2. Any function f ∈ C[a, b] is the limit of a sequence of algebraic polynomials, uniformly convergent to f on [a, b].

Theorem 2.4.3. For any function f ∈ C[a, b], a series of algebraic polynomials, absolutely and uniformly convergent to f on [a, b], can be found.
Theorem 2.4.4. (Second Weierstrass Theorem) The set T of all trigonomet-ric polynomials is dense in C(T ), where C(T ) denotes the class of continuousfunctions of period 2π.
Remark 2.4.5. As Theorem 2.4.4 can be obtained as a consequence of Theorem 2.4.1, we shall focus on the First Weierstrass Theorem and present some of the many existing proofs.
A proof of the Weierstrass theorem based on probability theory. In what follows, we present the well-known proof given by S. Bernstein in 1912, essentially in the way it was given. Because he used probabilistic tools, the interval I = [0, 1] is considered. Under these circumstances, we give another statement of the Weierstrass theorem:
Theorem 2.4.6. If F(x) is a continuous function everywhere in the interval [0, 1], it is always possible to determine a polynomial of degree n,

E_n(x) = a_0 x^n + a_1 x^{n−1} + · · · + a_n,

with n large enough, such that

|F(x) − E_n(x)| < ε

at every point of [0, 1], for any ε, no matter how small it is.
Proof. We consider an event A of probability x. Suppose that we make n experiments and we pay to a gambler the amount of money F(m/n) if the event A appears m times. Under these circumstances, the mean value E_n of the gambler's gain will have the following expression:

E_n = Σ_{m=0}^{n} F(m/n) C_n^m x^m (1 − x)^{n−m}.   (2.4.1)
From the continuity of the function F(x), it is possible to fix a number δ such that the inequality

|x − x_0| ≤ δ

implies

|F(x) − F(x_0)| < ε/2.

Denoting by F̄(x) the maximum and by F̱(x) the minimum of F(x) in (x − δ, x + δ), we get

F̄(x) − F(x) < ε/2, F(x) − F̱(x) < ε/2.   (2.4.2)
Let η be the probability of the inequality |x − m/n| > δ and L the maximum of |F(x)| in [0, 1]. Then we have:

F̱(x)(1 − η) − Lη < E_n < F̄(x)(1 − η) + Lη.   (2.4.3)

But, according to Bernoulli's theorem, we may take n so big that

η < ε/(4L).   (2.4.4)
Then, the inequality (2.4.3) can be written as follows:

F(x) + (F̱(x) − F(x)) − η(L + F̱(x)) < E_n < F(x) + (F̄(x) − F(x)) + η(L − F̄(x)),

and then

F(x) − ε/2 − (2L/(4L))ε < E_n < F(x) + ε/2 + (2L/(4L))ε,

and so

|F(x) − E_n| < ε.   (2.4.5)

Now, E_n is a polynomial of degree n. So, the theorem is proved.

Remark on the Bernstein polynomial. The Bernstein polynomials
can be written in the more general form

b(x) = Σ_{k=0}^{n} β_k C_n^k x^k (1 − x)^{n−k},   (2.4.6)

where β_0, ..., β_n are given coefficients. The case β_k = 1, k = 0, ..., n, generates the relation

Σ_{k=0}^{n} C_n^k x^k (1 − x)^{n−k} = (x + (1 − x))^n = 1,

a fact which is already well known, being the sum of the probabilities in the distribution of the random variable introduced by Bernstein (even if not very clearly emphasized!). What is more interesting is the fact that, if b = 0 identically, then all the coefficients β_k are zero.
By induction: the base case n = 0 is obvious. When n > 0, we see that the derivative of a Bernstein polynomial can be written as another Bernstein polynomial:

b′(x) = n Σ_{k=0}^{n−1} (β_{k+1} − β_k) C_{n−1}^k x^k (1 − x)^{n−1−k}.   (2.4.7)

In particular, if b = 0, it follows by the induction hypothesis that all β_k are equal, and then they are all zero, by the identity above.

In other words, the polynomials x^k(1 − x)^{n−k}, k = 0, ..., n, are linearly independent, and hence they span the (n + 1)-dimensional space of polynomials of degree ≤ n. Thus, every polynomial can be written as a Bernstein polynomial.
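Formula (2.4.1) can be evaluated directly, and the uniform convergence can be observed numerically (a minimal sketch; the names are ours):

```python
from math import comb

def bernstein(F, n, x):
    """Bernstein polynomial E_n of (2.4.1) for F on [0, 1]."""
    return sum(F(m / n) * comb(n, m) * x ** m * (1 - x) ** (n - m)
               for m in range(n + 1))

# partition of unity: the binomial probabilities sum to 1
assert abs(bernstein(lambda t: 1.0, 10, 0.3) - 1.0) < 1e-12

# the uniform error decreases as n grows, even for a non-smooth F
F = lambda x: abs(x - 0.5)
err = lambda n: max(abs(F(x) - bernstein(F, n, x)) for x in
                    [k / 200 for k in range(201)])
assert err(256) < err(16)
```

For this continuous but non-smooth F the error decays slowly (on the order of 1/√n near the kink), which illustrates why the Bernstein construction is valued for its simplicity rather than its speed of convergence.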
A proof of the Weierstrass theorem using beta functions. Let us define, for 0 < α < 1,

U_α(x) = x^α (1 − x)^{1−α}.
These functions are related to the well-known beta functions. They are increasing on the interval [0, α] and decreasing on the interval [α, 1].
Lemma 2.4.7. Let 0 ≤ a < b ≤ α or α ≤ b < a ≤ 1. Then

U_α(a)/U_α(b) ≤ e^{−2(a−b)²}.
Proof. Let us assume that 0 ≤ a < b ≤ α. It suffices to prove that the function

H(x) = ln(U_α(x)) − ln(U_α(b)) + 2(x − b)²

is negative for 0 ≤ x < b. A simple computation of the derivative of H gives us

H′(x) = (α − x)/(x(1 − x)) + 4(x − b) ≥ 4(α − x) + 4(x − b) ≥ 0,

because x(1 − x) ≤ 1/4. Hence H is increasing on [0, b), and this ends the proof, since H(b) = 0.

The case α ≤ b < a ≤ 1 can be proved in a similar way.
Lemma 2.4.8. Let ε > 0, µ > 0 and 0 ≤ p ≤ n. There exists N such that for n > N,

Σ_{|k−p|>2µn} C_n^k x^k (1 − x)^{n−k} < ε,

for

x ∈ (p/n − µ, p/n + µ).
Proof. Let A be the set of all 0 ≤ k ≤ n such that |k − p| ≥ 2µn. Let B be the part of A of elements less than p, and let C be the part of A of elements greater than p. Define

s = (p − µn)/n, r = (p − 2µn)/n.

Notice that

x^k (1 − x)^{n−k} = (U_α(x))^n, where α = k/n.

For k ∈ B and x ∈ (p/n − µ, p/n + µ) we have α ≤ r ≤ r + µ = s < x, so that Lemma 2.4.7 can be applied:

x^k (1 − x)^{n−k} = (U_α(x))^n ≤ (U_α(s))^n ≤ (e^{−2µ²} U_α(r))^n = e^{−2µ²n} r^k (1 − r)^{n−k} ≤ (ε/2) r^k (1 − r)^{n−k}
if

n > log(2/ε)/(2µ²).

Finally,

Σ_{k∈B} C_n^k x^k (1 − x)^{n−k} ≤ Σ_{k∈B} C_n^k (ε/2) r^k (1 − r)^{n−k} ≤ (ε/2) Σ_{k=0}^{n} C_n^k r^k (1 − r)^{n−k} = ε/2.

A similar inequality can be derived with the set C instead of B. Adding both inequalities, we obtain the assertion of Lemma 2.4.8.
Remark 2.4.9. If the summation in the statement of Lemma 2.4.8 extends over all 0 ≤ k ≤ n, then the result of the summation is 1.
A proof of the Weierstrass theorem using the concept of convolution.

Definition 2.4.10. If f and g are suitable functions on R, then the convolution f ∗ g is the function defined by

(f ∗ g)(x) = ∫_{−∞}^{∞} f(x − t) g(t) dt,

where "suitable" means that the integral exists.
Remark 2.4.11. In what follows, we need the Dirac delta function δ, which has the properties:

δ(x) = 0 if x ≠ 0 and ∫_{−∞}^{∞} δ(t) dt = 1.

We should think of it as "the density function of a unit mass at the origin". For example,

(f ∗ δ)(x) = ∫_{−∞}^{∞} f(x − t) δ(t) dt = f(x),

since δ(t) = 0 except at t = 0.
Remark 2.4.12. Having in mind the previous remark, what we have to do now is to find a sequence of functions (K_n) which approximates the δ-function. The sequence (K_n ∗ f) will then approximate δ ∗ f = f.

Definition 2.4.13. The n-th Landau kernel function is

K_n(x) = c_n (1 − x²)^n for x ∈ [−1, 1], and 0 otherwise,

where c_n is chosen so that

∫_{−∞}^{∞} K_n = 1.
Lemma 2.4.14. If f is a continuous function on the interval [−1, 1], then K_n ∗ f is a polynomial.

Proof. We have

(K_n ∗ f)(x) = ∫_{−1}^{1} K_n(x − t) f(t) dt,

and K_n is a polynomial, so K_n(x − t) can be expanded as

g_0(t) + g_1(t)x + · · · + g_{2n}(t)x^{2n},

and hence the integral is a polynomial in x.
Lemma 2.4.15. The sequence

(K_n ∗ f) → f for n → ∞.

Proof. We need to show that K_n has "most of its area" concentrated near x = 0. First we estimate how big c_n is:

∫_{−1}^{1} (1 − t²)^n dt = 2 ∫_{0}^{1} (1 − t)^n (1 + t)^n dt ≥ 2 ∫_{0}^{1} (1 − t)^n dt = 2/(n + 1).

Since ∫_{−∞}^{∞} K_n = 1, we must have c_n ≤ (n + 1)/2. (In fact, c_n grows like a multiple of √n; for large n, c_n is approximately 0.565√n.)
Concerning the area under K_n which is not near 0, we have

∫_{δ}^{1} K_n(t) dt = ∫_{δ}^{1} c_n (1 − t²)^n dt ≤ [(n + 1)/2] ∫_{δ}^{1} (1 − δ²)^n dt = [(n + 1)/2] (1 − δ²)^n (1 − δ),

since K_n is decreasing on [δ, 1].
If r = 1 − δ², then (n + 1) r^n → 0 as n → ∞.

The function f is continuous and bounded, say by M. If x ∈ [0, 1], then given ε > 0 we may find δ > 0 such that if |t| < δ then |f(x − t) − f(x)| < ε. So now look at the convolution K_n ∗ f:

|f(x) − (K_n ∗ f)(x)| ≤ ∫_{−1}^{1} |f(x) − f(x − t)| K_n(t) dt = ∫_{−1}^{−δ} A + ∫_{−δ}^{δ} A + ∫_{δ}^{1} A,

where A denotes the integrand |f(x) − f(x − t)| K_n(t).

On [−1, −δ] and on [δ, 1], K_n(t) is small when n is large. In fact, we can choose n so that K_n(t) < ε/(2M) there, and then the first and the third integrals are each less than ε.

For the middle integral,

∫_{−δ}^{δ} |f(x) − f(x − t)| K_n(t) dt ≤ ∫_{−δ}^{δ} ε K_n(t) dt < ε,

since

∫_{−1}^{1} K_n(t) dt = 1.

Thus, |f(x) − (K_n ∗ f)(x)| is small when n is large, and we have our convergence. This completes the proof of the Weierstrass approximation theorem.
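The convergence can be observed numerically by discretizing the convolution (a rough sketch with hypothetical names; the integrals are approximated by Riemann sums):

```python
import numpy as np

def landau_convolution(f, n, x, grid=2001):
    """Numerical approximation of (K_n * f)(x), with the Landau kernel
    K_n(t) = c_n (1 - t^2)^n on [-1, 1] normalized to integrate to 1."""
    t = np.linspace(-1.0, 1.0, grid)
    dt = t[1] - t[0]
    kn = (1.0 - t ** 2) ** n
    kn /= kn.sum() * dt                  # fixes c_n so that K_n integrates to 1
    return (kn * f(x - t)).sum() * dt    # discretized (K_n * f)(x)

f = lambda s: np.abs(s)                  # continuous, not smooth at 0
# the error at an interior point decreases as n grows
e = lambda n: abs(landau_convolution(f, n, 0.25) - f(0.25))
assert e(200) < e(10)
```

As n grows the kernel concentrates near t = 0 (its effective width shrinks roughly like 1/√n), which is exactly the "most of its area near 0" property used in the proof.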
2.4.2 The Stone-Weierstrass theorem
The generalization of the Weierstrass approximation theorem is known as the Stone-Weierstrass theorem.
Theorem 2.4.16. Let X be a compact metric space and let C0(X, R) be the algebra of continuous real functions defined on X. Let A be a subalgebra of C0(X, R) for which the following conditions hold:

(1) ∀x, y ∈ X, x ≠ y, ∃f ∈ A : f(x) ≠ f(y);

(2) 1 ∈ A.

Then A is dense in C0(X, R).
Proof. Let Ā denote the closure of A in C0(X, R), with respect to the uniform convergence topology. We want to show that, if conditions (1) and (2) are satisfied, then Ā = C0(X, R).

1. Firstly, we show that if f ∈ Ā, then |f| ∈ Ā. Since f is a continuous function on a compact space, f must be bounded: there exist constants a and b such that a ≤ f ≤ b. By the Weierstrass approximation theorem, for every ε > 0 there exists a polynomial P such that

|P(x) − |x|| < ε for x ∈ [a, b].

Define g : X → R by g(x) = P(f(x)). Since Ā is an algebra, g ∈ Ā. For all x ∈ X, |g(x) − |f(x)|| < ε. Since Ā is closed under the uniform convergence topology, this implies that |f| ∈ Ā.

A corollary of the fact just proven is that if f, g ∈ Ā, then max(f, g) ∈ Ā and min(f, g) ∈ Ā, because one can write

max(a, b) = (1/2)((a + b) + |a − b|),

min(a, b) = (1/2)((a + b) − |a − b|).
2. Secondly, we shall show that, for every f ∈ C0(X, R), x ∈ X and ε > 0, there exists g_x ∈ Ā such that g_x < f + ε and g_x(x) > f(x). By condition (1), if y ≠ x, there exists a function h_{xy} ∈ A such that h_{xy}(x) ≠ h_{xy}(y). Define h̄_{xy} by

h̄_{xy}(z) = p h_{xy}(z) + q,

where the constants p and q are chosen so that

h̄_{xy}(x) = f(x) + ε/2,
h̄_{xy}(y) = f(y) − ε/2.

By condition (2), h̄_{xy} ∈ A. For every y ≠ x, define the set

U_{xy} = {z ∈ X | h̄_{xy}(z) < f(z) + ε}.
Since f and h̄_{xy} are continuous, U_{xy} is an open set. Because x ∈ U_{xy} and y ∈ U_{xy}, the family {U_{xy} | y ∈ X \ {x}} is an open cover of X. By the definition of a compact space, there must exist a finite subcover. In other words, there exists a finite subset {y_1, y_2, ..., y_n} ⊂ X such that

X = ∪_{m=1}^{n} U_{x y_m}.
Define g_x = min{h̄_{x y_1}, ..., h̄_{x y_n}}. By the corollary of the first part of the proof, g_x ∈ Ā. By construction,

g_x(x) = f(x) + ε/2  and  g_x < f + ε.
3. Third, we shall show that, for every f ∈ C⁰(X, R) and every ε > 0, there exists a function g ∈ Ā such that f ≤ g < f + ε. This will complete the proof, because it implies that Ā = C⁰(X, R). For every x ∈ X, define the set V_x as

V_x = {z ∈ X | g_x(z) > f(z)},

where g_x is defined as before. Since f and g_x are continuous, V_x is an open set. Because g_x(x) = f(x) + ε/2 > f(x), we have x ∈ V_x. Hence {V_x | x ∈ X} is an open cover of X. By the definition of a compact space, there must exist a finite subcover. In other words, there exists a finite subset {x_1, ..., x_n} ⊂ X such that

X = ∪_{m=1}^{n} V_{x_m}.

Define g as

g(z) = max{g_{x_1}(z), ..., g_{x_n}(z)}.

By the corollary of the first part of the proof, g ∈ Ā. By construction, g > f. Since g_x < f + ε for every x ∈ X, we have g < f + ε.
So, the Stone-Weierstrass theorem is proved.
2.4.3 Positive linear operators
Another method that can be used to prove density is based on what is called the Popoviciu-Bohman-Korovkin Theorem. A primitive form of this theorem was proved by Bohman in 1952. His proof, and the main idea of his approach, was a generalization of Bernstein's proof of the Weierstrass theorem.
One year later, Korovkin proved the same theorem for integral-type operators. Korovkin's original proof is in fact based on positive singular integrals, and there are very obvious links to Lebesgue's work on singular operators which, in turn, was motivated by various proofs of the Weierstrass theorem. In 1950, Popoviciu also gave a proof of Weierstrass' theorem using interpolation polynomials.
2.4.3.1 Popoviciu-Bohman-Korovkin Theorem
Let (L_n) be a sequence of positive linear operators mapping C[a, b] into itself. Assume that

lim_{n→∞} L_n(x^i) = x^i,  i = 0, 1, 2,

and the convergence is uniform on [a, b]. Then

lim_{n→∞} (L_n f)(x) = f(x),

uniformly on [a, b], for every f ∈ C[a, b].
Remark 2.4.17. A similar result holds in the periodic case C[0, 2π], where the "test functions" are 1, sin x and cos x.
How can the Popoviciu-Bohman-Korovkin Theorem be applied to obtain density results? In theory it can be easily applied. If U_n = span{u_1, ..., u_n}, n = 1, 2, ..., is a nested sequence of finite-dimensional subspaces of C[a, b], and L_n is a positive linear operator mapping C[a, b] into U_n that satisfies the condition of the above theorem, then the (u_k)_{k=1}^{∞} span a dense subset of C[a, b]. In practice it is all too rarely applied in this manner.
Remark 2.4.18. The importance of the Popoviciu-Bohman-Korovkin Theorem lies primarily in that it presents conditions implying convergence, and also in that it provides calculable error bounds on the rate of approximation.
Remark 2.4.19. One immediate application of the theorem is a proof of the convergence of the Bernstein polynomials B_n(f) to f for each f ∈ C[0, 1]. We may consider (B_n) as a sequence of positive linear operators mapping C[0, 1] into P_n, the space of algebraic polynomials of degree at most n. It is verified that

B_n(1; x) = 1,
B_n(x; x) = x,
B_n(x²; x) = x² + x(1 − x)/n,

for all n ≥ 1. Thus, by the Popoviciu-Bohman-Korovkin Theorem, it follows that B_n f converges uniformly to f on [0, 1].
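These identities, and the resulting convergence, are easy to check numerically. The sketch below (the helper name `bernstein` is our own, not from the text) evaluates B_n(f; x) directly from the definition and verifies the three test functions:

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial B_n(f; x) on [0, 1]."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

# The three Korovkin test functions at a sample point.
n, x = 10, 0.3
assert abs(bernstein(lambda t: 1.0, n, x) - 1.0) < 1e-12
assert abs(bernstein(lambda t: t, n, x) - x) < 1e-12
assert abs(bernstein(lambda t: t * t, n, x) - (x * x + x * (1 - x) / n)) < 1e-12
```

The same three checks at any other point x ∈ [0, 1] succeed, which is exactly the hypothesis of the theorem for the Bernstein sequence.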
2.4.3.2 Modulus of continuity
The modulus of continuity is one of the basic characteristics of continuous functions. For a continuous function f on a closed interval, it is defined as

ω(f, δ) = max_{|h|≤δ} max_x |f(x + h) − f(x)|,  for δ ≥ 0.
This definition was introduced by H. Lebesgue in 1910, although in essence the concept was known earlier. If the modulus of continuity of a function f satisfies the condition

ω(f, δ) ≤ M δ^α,

where 0 < α < 1, then f is said to satisfy a Lipschitz condition of order α.
Remark 2.4.20. The modulus of continuity can also be defined in the following manner:

ω(f, δ) = sup{|f(u) − f(v)| : u, v ∈ I, |u − v| ≤ δ},

for f ∈ C(I), where I is an interval.
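As an illustration only (the grid-based helper below is our own construction, and it only approximates the supremum), ω(f, δ) can be estimated by sampling:

```python
def modulus_of_continuity(f, a, b, delta, n=2000):
    """Grid approximation of ω(f, δ) = sup{|f(u) − f(v)| : u, v in [a, b], |u − v| ≤ δ}."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    ys = [f(x) for x in xs]
    steps = int(delta / h)          # number of grid steps that fit inside δ
    return max(abs(ys[i + s] - ys[i])
               for i in range(n + 1)
               for s in range(1, steps + 1)
               if i + s <= n)

# For f(x) = x² on [0, 1], ω(f, δ) = 2δ − δ² (attained near x = 1).
w = modulus_of_continuity(lambda x: x * x, 0.0, 1.0, 0.1)
assert abs(w - (2 * 0.1 - 0.1**2)) < 1e-2
```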
Remark 2.4.21. In what follows, we show some applications of the modulus of continuity to the study of the rapidity of convergence of a sequence of linear operators L_n to a function f, in connection with the Popoviciu-Bohman-Korovkin Theorem.
2.4.3.3 Popoviciu-Bohman-Korovkin Theorem in a quantitative form
We shall estimate the rapidity of convergence of L_n(f) to f in terms of the rapidity of convergence of L_n(1) to 1, L_n(x) to x and L_n(x²) to x².
Theorem 2.4.22. Let −∞ < a < b < ∞, and let L_1, L_2, ... be positive linear operators, all having the same domain D, which contains the restrictions of 1, t, t² to [a, b]. For n = 1, 2, ... suppose L_n(1) is bounded. Let f ∈ D be continuous in [a, b], with modulus of continuity ω. Then, for n = 1, 2, ...,

‖f − L_n(f)‖ ≤ ‖f‖ · ‖L_n(1) − 1‖ + ‖L_n(1) + 1‖ ω(µ_n),   (2.4.8)

where

µ_n = ‖(L_n([t − x]²))(x)‖^{1/2},   (2.4.9)

and ‖·‖ stands for the sup norm over [a, b]. In particular, if L_n(1) = 1, as is often the case, (2.4.8) reduces to

‖f − L_n(f)‖ ≤ 2ω(µ_n).   (2.4.10)
Remark 2.4.23. In forming L_n([t − x]²), x is held fixed, while t forms the functions t, t² on which L_n operates.
Remark 2.4.24. Observe that (2.4.9) implies, for n = 1, 2, ...,

µ_n² ≤ ‖L_n(t²)(x) − x²‖ + 2c ‖L_n(t)(x) − x‖ + c² ‖L_n(1) − 1‖,

where c = max(|a|, |b|). Hence, if L_n(t^k)(x) converges uniformly to x^k in [a, b], for k = 0, 1, 2, then µ_n → 0, and we have a simple estimate of µ_n in terms of ‖L_n(t^k)(x) − x^k‖, k = 0, 1, 2.
Proof. Let x ∈ [a, b] and let δ be a positive number. Let t ∈ [a, b]. If |t − x| > δ, then

|f(t) − f(x)| ≤ ω(|t − x|) = ω(|t − x| δ⁻¹ δ) ≤ (1 + |t − x| δ⁻¹) ω(δ) ≤ [1 + (t − x)² δ⁻²] ω(δ).

The inequality

|f(t) − f(x)| ≤ [1 + (t − x)² δ⁻²] ω(δ)

also holds if |t − x| ≤ δ. Let n be a positive integer. Then

|(L_n f − f(x) L_n(1))(x)| ≤ ω(δ) [(L_n(1) + δ⁻² L_n([t − x]²))(x)] ≤ ω(δ) [L_n(1)(x) + (µ_n/δ)²].
If µ_n > 0, take δ = µ_n. Then

|[L_n(f) − f(x) L_n(1)](x)| ≤ ω(µ_n) ‖L_n(1) + 1‖,   (2.4.11)
|−f(x) + f(x) L_n(1)(x)| ≤ ‖f‖ · ‖L_n(1) − 1‖.

Adding, we obtain (2.4.8). If µ_n = 0, we have, for every positive δ,

|(L_n(f) − f(x) L_n(1))(x)| ≤ ω(δ) L_n(1)(x).

Letting δ → 0⁺, we obtain (L_n f)(x) = f(x) L_n(1)(x). Then

|[f − L_n(f)](x)| ≤ ‖f‖ · ‖L_n(1) − 1‖,

which implies (2.4.8).
Example 2.4.25. Let D be the set of all real functions with domain [0, 1]. For n = 1, 2, ... let L_n be the positive linear operator defined by

(L_n φ)(x) = Σ_{k=0}^{n} C(n, k) x^k (1 − x)^{n−k} φ(k/n),

with C(n, k) denoting the binomial coefficient. Let f be a real function with domain [0, 1], continuous there, with modulus of continuity ω. Let n be a positive integer. Then

L_n(1) = 1,  [L_n(t)](x) = x,  [L_n(t²)](x) = (n − 1)x²/n + x/n,

(L_n([t − x]²))(x) = (x − x²)/n,

and by Theorem 2.4.22,

max_{0≤x≤1} |f(x) − (L_n f)(x)| ≤ 2ω(1/(2√n)) ≤ 2ω(1/√n).   (2.4.12)
We have thus obtained the known result for Bernstein polynomials. For some universal constant C and for n = 1, 2, ..., the left-hand side of (2.4.12) is bounded above by C ω(1/√n).

Remark 2.4.26. For more details about this constant C, see the bibliography.
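As a numerical illustration of (2.4.12), one can take the 1-Lipschitz function f(x) = |x − 1/2| (our choice), for which ω(f, δ) = δ, and check that the actual uniform error of the Bernstein polynomials stays below 2ω(1/(2√n)) = 1/√n:

```python
from math import comb, sqrt

def bernstein(f, n, x):
    """Bernstein polynomial B_n(f; x) on [0, 1]."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)          # 1-Lipschitz, so ω(f, δ) = δ
for n in (4, 16, 64):
    xs = [i / 200 for i in range(201)]
    err = max(abs(f(x) - bernstein(f, n, x)) for x in xs)
    bound = 2 * (1 / (2 * sqrt(n)))  # 2 ω(1/(2√n)) from (2.4.12)
    assert err <= bound
```

In practice the bound is generous: the observed error decays like 1/√n, with a noticeably smaller constant.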
Theorem 2.4.27. Let −∞ < a < b < ∞ and let L_1, L_2, ... be positive linear operators, all having the same domain D, which contains the bounded functions f_0, f_1, f_2 with domain [a, b]. We assume that the restriction of 1 to [a, b] belongs to D, and that, for n = 1, 2, ..., L_n(1) is bounded. Let a_0(x), a_1(x), a_2(x) be real functions, defined and bounded in [a, b], and assume that for every t, x ∈ [a, b],

F(t, x) = Σ_{k=0}^{2} a_k(x) f_k(t) ≥ K(t − x)²,  F(x, x) = 0,   (2.4.13)

where K is a positive constant depending on neither x nor t. Let f ∈ D be continuous in [a, b], with modulus of continuity ω. Then, for n = 1, 2, ... we have again

‖f − L_n f‖ ≤ ‖f‖ · ‖L_n(1) − 1‖ + ‖L_n(1) + 1‖ ω(µ_n),   (2.4.14)

where

µ_n = (‖L_n F(t, x)(x)‖ / K)^{1/2},

and ‖·‖ stands for the sup norm over [a, b]. In particular, if L_n(1) = 1, as is often the case, (2.4.14) reduces to

‖f − L_n f‖ ≤ 2ω(µ_n).
Remark 2.4.28. Since

‖(L_n F(t, x))(x)‖ ≤ Σ_{k=0}^{2} ‖a_k(x)‖ · ‖f_k − L_n f_k‖  (n = 1, 2, ...),

if L_n f_k converges uniformly to f_k in [a, b], for k = 0, 1, 2, then µ_n → 0, and one can estimate the rapidity of convergence of µ_n in terms of those of L_n f_0, L_n f_1 and L_n f_2.
Proof. Let x ∈ [a, b] and let δ > 0. Let t ∈ [a, b]. If |t − x| > δ, then

|f(t) − f(x)| ≤ ω(|t − x|) = ω(|t − x| δ⁻¹ δ) ≤ (1 + |t − x| δ⁻¹) ω(δ)
             ≤ [1 + (t − x)² δ⁻²] ω(δ) ≤ [1 + K⁻¹ F(t, x) δ⁻²] ω(δ).

The inequality

|f(t) − f(x)| ≤ [1 + K⁻¹ F(t, x) δ⁻²] ω(δ)

obviously also holds if |t − x| ≤ δ. Let n be a positive integer. Then

|(L_n f − f(x) L_n(1))(x)| ≤ ω(δ) [(L_n(1) + δ⁻² K⁻¹ L_n F(t, x))(x)] ≤ ω(δ) [(L_n(1))(x) + (µ_n/δ)²].

We then proceed as in the proof of Theorem 2.4.22.
Remark 2.4.29. For f_0(t) = 1, f_1(t) = t and f_2(t) = t², we observe that Theorem 2.4.22 is a special case of Theorem 2.4.27.
Chapter 3
Numerical integration
Definition 3.0.30. The formula

I(f) = Q(f) + R(f),   (3.0.1)

where

Q(f) = Σ_{k=0}^{m} A_k λ_k(f),

is called a numerical integration formula (for the function f), or a quadrature formula; A_k, k = 0, 1, ..., m, are called the coefficients of the quadrature formula and R(f) is the remainder term.
Definition 3.0.31. The natural number r such that R(f) = 0 for all f ∈ P_r, and for which there exists g ∈ P_{r+1} such that R(g) ≠ 0, is called the degree of exactness of the quadrature Q, i.e.,

dex(Q) = r.

Remark 3.0.32. The linearity of the remainder operator R implies that dex(Q) = r if and only if

R(e_i) = 0,  i = 0, 1, ..., r,

and

R(e_{r+1}) ≠ 0,

with e_i(x) = x^i.
Remark 3.0.33. Usually, the information λ_k(f), k = 0, 1, ..., m, are the values of f or of certain of its derivatives at some points x_i ∈ [a, b], i = 0, 1, ..., n, called the quadrature nodes.

So, let f ∈ H^{r,2}[a, b], and let

Λ_B = {λ_{kj} : H^{r,2}[a, b] → R | λ_{kj}(f) = f^{(j)}(x_k), k = 0, 1, ..., m; j ∈ I_k},

with I_k ⊆ {0, 1, ..., r_k}, r_k ∈ N, r_k < r, k = 0, 1, ..., m, be the set of Birkhoff-type functionals (the most general pointwise evaluations of f and some of its derivatives), i.e.,

Λ_B(f) = {λ_{kj}(f) | k = 0, 1, ..., m; j ∈ I_k}.

In this case the quadrature formula is:

I(f) = Q_n(f) + R_n(f),   (3.0.2)

where

Q_n(f) = Σ_{k=0}^{m} Σ_{j∈I_k} A_{kj} f^{(j)}(x_k),

with n = |I_0| + ... + |I_m| − 1. The problem that arises is to find the coefficients A_{kj} and the nodes x_k, k = 0, 1, ..., m; j ∈ I_k, and to study the remainder term (the approximation error) for the obtained values of the coefficients and nodes. With respect to the conditions used to determine the quadrature parameters A_{kj} and x_k, k = 0, 1, ..., m; j ∈ I_k, quadrature formulas can be classified as follows:
- quadrature formulas of interpolatory type, obtained by integrating an interpolation formula;
- quadrature formulas of Gauss type, i.e., with the maximum degree of exactness;
- quadrature formulas of Chebyshev type, i.e., with equal coefficients and the maximum degree of exactness;
- optimal quadrature formulas.

Next, we deal with the last class.
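As a sketch of Definition 3.0.31, the degree of exactness of a Lagrange-type quadrature Q(f) = Σ A_k f(x_k) can be estimated by testing the monomials e_i; the helper below and the Simpson-rule test case are our own illustrative choices, not taken from the text:

```python
def dex(weights, nodes, a, b, max_degree=10, tol=1e-10):
    """Degree of exactness: largest r with R(e_i) = 0 for i = 0..r, e_i(x) = x^i."""
    r = -1
    for i in range(max_degree + 1):
        exact = (b**(i + 1) - a**(i + 1)) / (i + 1)      # ∫_a^b x^i dx
        quad = sum(A * x**i for A, x in zip(weights, nodes))
        if abs(quad - exact) > tol:
            break
        r = i
    return r

# Simpson's rule on [0, 1]: coefficients (1/6, 4/6, 1/6), nodes (0, 1/2, 1).
assert dex([1/6, 4/6, 1/6], [0.0, 0.5, 1.0], 0.0, 1.0) == 3
```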
3.1 Optimal quadrature formulas
Definition 3.1.1. For a given f ∈ H^{r,2}[a, b], the quadrature formula (3.0.2) for which the error |R_n(f)| takes its minimum value with respect to the parameters A_{kj} and x_k, k = 0, 1, ..., m; j ∈ I_k, is called the locally optimal quadrature formula with respect to the error.
Definition 3.1.2. A quadrature formula for which

sup_{f∈H^{r,2}[a,b]} |R_n(f)|

takes its minimum value with respect to its parameters A_{kj}, x_k, k = 0, 1, ..., m; j ∈ I_k, is called a globally optimal formula with respect to the error.

Remark 3.1.3. The parameters A*_{kj} and x*_k, k = 0, 1, ..., m, of an optimal quadrature formula are called the (locally or globally) optimal coefficients, respectively the optimal nodes.
Remark 3.1.4. Some of the parameters of a quadrature formula can be fixed from the beginning. Such formulas are the quadrature formulas with equal coefficients or with fixed (usually equally spaced) nodes.

Very often, quadrature formulas with a given degree of exactness, say r, are studied. In such a case the parameters of the quadrature formula must satisfy the following relations:
Σ_{k=0}^{m} Σ_{j∈I_k} A_{kj} x_k^i = ∫_a^b w(x) x^i dx,  i = 0, 1, ..., r.   (3.1.1)

If r = n + 1 then (3.1.1) becomes an algebraic (n+1) × (n+1) system. If this system has a solution then all the parameters are determined (no free parameters remain), and the optimality problem is superfluous. Such examples are the quadrature formulas of Gauss type or of Chebyshev type.
3.1.1 Optimality in the sense of Sard
This problem will be treated first for some classes of linear functionals, including numerical integration of functions.

Let λ : H^{m,2}[a, b] → R be a linear functional that commutes with the definite integral, i.e.,

λ ∫_a^b = ∫_a^b λ,

and let

Λ = {λ_i | λ_i : H^{m,2}[a, b] → R, i = 1, ..., n}

be a set of given linear functionals such that λ, λ_1, ..., λ_n are linearly independent. For f ∈ H^{m,2}[a, b], one considers the approximation formula:
λ(f) = Σ_{i=1}^{n} A_i λ_i(f) + R(f).   (3.1.2)

Definition 3.1.5. Formula (3.1.2) is called optimal in the sense of Sard if:

1. R(e_j) = 0,  j = 0, 1, ..., m−1,
2. ∫_a^b K²(t) dt is minimal,
(3.1.3)

where e_j(x) = x^j and K is the Peano kernel of the remainder functional R, i.e.,

K(t) = R[(· − t)_+^{m−1} / (m−1)!].

The coefficients of the optimal formula, denoted A*_i, i = 1, ..., n, are called optimal coefficients.
Remark 3.1.6. From condition 1. of (3.1.3) it follows that formula (3.1.2) has degree of exactness at least m − 1.

Remark 3.1.7. If n = m, the parameters A_i, i = 1, ..., n, can be determined from conditions 1. of (3.1.3) alone, which become an m × m linear algebraic system; condition 2. is then superfluous.
First, one supposes that Λ is a set of Lagrange-type functionals:

Λ := Λ_L = {λ_i | λ_i(f) = f(x_i), i = 1, ..., n},

with x_i ∈ [a, b], x_i ≠ x_j for i ≠ j, i, j = 1, ..., n. Formula (3.1.2) becomes

λ(f) = Σ_{i=1}^{n} A_i f(x_i) + R(f),   (3.1.4)

and we have

K(t) = λ_x[(x − t)_+^{m−1}/(m−1)!] − Σ_{i=1}^{n} A_i (x_i − t)_+^{m−1}/(m−1)!.   (3.1.5)
Theorem 3.1.8. If n > m, then the formula of the form (3.1.4) that is optimal in the sense of Sard exists and is unique.
Proof. Using the Lagrange multiplier method, we must find the minimum of the functional

F(A, γ) = (1/2) ∫_a^b K²(t) dt + Σ_{j=0}^{m−1} γ_j R(e_j),

where A = (A_1, ..., A_n) ∈ R^n and γ = (γ_0, ..., γ_{m−1}) ∈ R^m. So, we have to study the linear algebraic system:

∂F(A, γ)/∂A_p = 0,  p = 1, ..., n,
∂F(A, γ)/∂γ_q = 0,  q = 0, 1, ..., m−1,

or

∫_a^b { −λ_x[(x − t)_+^{m−1}/(m−1)!] + Σ_{i=1}^{n} A_i (x_i − t)_+^{m−1}/(m−1)! } (x_p − t)_+^{m−1}/(m−1)! dt + Σ_{j=0}^{m−1} γ_j x_p^j = 0,  p = 1, ..., n,
R(e_q) = 0,  q = 0, 1, ..., m−1.
Using the identities

(x − t)_+^{m−1}/(m−1)! = (x − t)^{m−1}/(m−1)! + (−1)^m (t − x)_+^{m−1}/(m−1)!,
(x_i − t)_+^{m−1}/(m−1)! = (x_i − t)^{m−1}/(m−1)! + (−1)^m (t − x_i)_+^{m−1}/(m−1)!,

and the relation

−λ_x[(x − t)^{m−1}/(m−1)!] + Σ_{i=1}^{n} A_i (x_i − t)^{m−1}/(m−1)! = 0,
based on Remark 3.1.6, one obtains

(−1)^m ∫_a^b { −λ_x[(t − x)_+^{m−1}/(m−1)!] + Σ_{i=1}^{n} A_i (t − x_i)_+^{m−1}/(m−1)! } (x_p − t)_+^{m−1}/(m−1)! dt + Σ_{j=0}^{m−1} γ_j x_p^j = 0,  p = 1, ..., n,
Σ_{i=1}^{n} A_i x_i^q = λ(e_q),  q = 0, 1, ..., m−1.
Taking into account the commuting property of λ and ∫_a^b, the above system can be written in the form:

Σ_{j=0}^{m−1} B_j x_p^j + Σ_{i=1}^{n} A_i (x_p − x_i)_+^{2m−1}/(2m−1)! = λ_x[(x_p − x)_+^{2m−1}/(2m−1)!],  p = 1, ..., n,
Σ_{i=1}^{n} A_i x_i^q = λ(e_q),  q = 0, 1, ..., m−1,
(3.1.6)
where

B_j = (−1)^{m+1} γ_j,  j = 0, 1, ..., m−1,

and

(x_p − x_i)_+^{2m−1}/(2m−1)! = ∫_a^b (t − x_i)_+^{m−1}/(m−1)! · (x_p − t)_+^{m−1}/(m−1)! dt,
(x_p − x)_+^{2m−1}/(2m−1)! = ∫_a^b (t − x)_+^{m−1}/(m−1)! · (x_p − t)_+^{m−1}/(m−1)! dt.
On the other hand, for f ∈ H^{m,2}[a, b] the coefficients of the Lagrange-type interpolation spline function,

(Sf)(x) = Σ_{j=0}^{m−1} a_j x^j + Σ_{i=1}^{n} b_i (x − x_i)_+^{2m−1}/(2m−1)!,

are obtained as a solution of the system

(Sf)(x_p) := Σ_{j=0}^{m−1} a_j x_p^j + Σ_{i=1}^{n} b_i (x_p − x_i)_+^{2m−1}/(2m−1)! = f(x_p),  p = 1, ..., n,
(Sf)^{(q)}(α) := Σ_{i=1}^{n} b_i (α − x_i)_+^{2m−q−1}/(2m−q−1)! = 0,  q = m, ..., 2m−1;  α > x_n.
(3.1.7)

As the last m equations of (3.1.7) can be written in the equivalent form

Σ_{i=1}^{n} b_i x_i^q = 0,  q = 0, 1, ..., m−1,

the two systems, (3.1.6) and (3.1.7), are equivalent. Hence the uniqueness of the spline function Sf implies that the system (3.1.6) has a unique solution, say (A*, γ*), i.e., A* = (A*_1, ..., A*_n) is the unique minimum point of the integral in (3.1.3).
Theorem 3.1.9. For f ∈ H^{m,2}[a, b], if

f = Sf + Rf

is the spline interpolation formula corresponding to Λ := Λ_L, with n > m, then the approximation formula of the form (3.1.4) that is optimal in the sense of Sard is

λ(f) = λ(Sf) + λ(Rf).   (3.1.8)

Proof. As n := |Λ_L| > m, formula (3.1.8) exists and is unique, and we have

f(x) = Σ_{i=1}^{n} s_i(x) f(x_i) + ∫_a^b φ(x, t) f^{(m)}(t) dt,

where

φ(x, t) = (x − t)_+^{m−1}/(m−1)! − Σ_{i=1}^{n} s_i(x) (x_i − t)_+^{m−1}/(m−1)!.
It follows that

λ(f) = Σ_{i=1}^{n} A*_i f(x_i) + R*(f),   (3.1.9)

with

A*_i = λ(s_i),  i = 1, ..., n,

and

R*(f) = ∫_a^b K*(t) f^{(m)}(t) dt,   (3.1.10)

where

K*(t) = λ_x[(x − t)_+^{m−1}/(m−1)!] − Σ_{i=1}^{n} A*_i (x_i − t)_+^{m−1}/(m−1)!.

Formula (3.1.9) is optimal in the sense of Sard if the conditions (3.1.3) are satisfied. The first condition follows directly from (3.1.10), i.e.,

R*(e_j) = 0,  j = 0, 1, ..., m−1.
To prove the second condition, let us suppose that formula (3.1.9) is not optimal. Let

λ(f) = Σ_{i=1}^{n} Ā_i f(x_i) + R̄(f)   (3.1.11)

be the optimal formula. For the remainder term we have (condition 1. from (3.1.3)):

R̄(f) = ∫_a^b K̄(t) f^{(m)}(t) dt,

with

K̄(t) = λ_x[(x − t)_+^{m−1}/(m−1)!] − Σ_{i=1}^{n} Ā_i (x_i − t)_+^{m−1}/(m−1)!.

One considers

g = K̄ − K*,

i.e.,

g(t) = Σ_{i=1}^{n} [Ā_i − A*_i] (x_i − t)_+^{m−1}/(m−1)!.   (3.1.12)
We have

g(t) = 0,  for t ∈ [a, x_1) ∪ (x_n, b].   (3.1.13)

Indeed, for t < x_1 we have

g(t) = Σ_{i=1}^{n} (Ā_i − A*_i) (x_i − t)^{m−1}/(m−1)!.

As the degree of exactness of both formulas (3.1.9) and (3.1.11) is m − 1, it follows that g(t) = 0 for t < x_1. For t > x_n we have (x_i − t)_+ = 0, i = 1, ..., n, and we also get g(t) = 0.

Now, let us define a function s such that s^{(m)} = g. From (3.1.12) and (3.1.13) it follows that

s|_{(x_i, x_{i+1})} ∈ P_{2m−1},  i = 1, ..., n−1,
s|_{[a, x_1) ∪ (x_n, b]} ∈ P_{m−1},

and

s ∈ C^{2m−2}[a, b].

Whence, we have s ∈ S(Λ). Since

Sf = f,  f ∈ S(Λ),

we have

R*(s) := ∫_a^b K*(t) s^{(m)}(t) dt = 0,
i.e.,

∫_a^b K*(t) g(t) dt = 0,

or

∫_a^b K*(t) [K̄(t) − K*(t)] dt = 0.

It follows that

∫_a^b (K̄(t))² dt = ∫_a^b [K̄(t) − K*(t) + K*(t)]² dt
                = ∫_a^b (K*(t))² dt + ∫_a^b [K̄(t) − K*(t)]² dt,

whence

∫_a^b (K̄(t))² dt ≥ ∫_a^b (K*(t))² dt.

As K̄ is the kernel of the optimal formula, it must be that K̄ = K*, i.e., g = 0, or Ā_i = A*_i, i = 1, ..., n.
Remark 3.1.10. Analogous results can be proved when Λ is a set of Hermite- or Birkhoff-type functionals for which the corresponding spline interpolation formula exists.

Suppose now that Λ is a Hermite-type set of functionals on H^{m,2}[a, b],

Λ := Λ_H = {λ_{ij} | λ_{ij}(f) = f^{(j)}(x_i), i = 1, ..., r; j = 0, ..., ν_i},

with 0 ≤ ν_i < m and n = r + ν_1 + ... + ν_r.
Remark 3.1.11. For f ∈ H^{m,2}[a, b], formula (3.1.2) becomes

λ(f) = Σ_{i=1}^{r} Σ_{j=0}^{ν_i} A_{ij} f^{(j)}(x_i) + R(f),   (3.1.14)

where

R(f) = ∫_a^b K(t) f^{(m)}(t) dt,

with

K(t) = λ_x[(x − t)_+^{m−1}/(m−1)!] − Σ_{i=1}^{r} Σ_{j=0}^{ν_i} A_{ij} (x_i − t)_+^{m−j−1}/(m−j−1)!.
Theorem 3.1.12. If n > m, formula (3.1.14), optimal in the sense of Sard, exists and is unique.

Proof. As in the previous case, one considers the functional F_H(A, γ) given by

F_H(A, γ) = (1/2) ∫_a^b K²(t) dt + Σ_{k=0}^{m−1} γ_k R(e_k/k!),

where A = (A_{10}, ..., A_{1ν_1}, ..., A_{r0}, ..., A_{rν_r}) and γ = (γ_0, ..., γ_{m−1}). The parameters A_{ij}, i = 1, ..., r; j = 0, 1, ..., ν_i, and γ_k, k = 0, 1, ..., m−1, are determined as a solution of the linear algebraic system:

∂F_H(A, γ)/∂A_{pq} = 0,  p = 1, ..., r; q = 0, 1, ..., ν_p,
∂F_H(A, γ)/∂γ_µ = 0,  µ = 0, 1, ..., m−1.
(3.1.15)

Let S_H be the spline interpolation operator corresponding to the set Λ_H, i.e.,

(S_H f)(x) = Σ_{k=0}^{m−1} a_k x^k + Σ_{i=1}^{r} Σ_{j=0}^{ν_i} b_{ij} (x − x_i)_+^{2m−j−1}/(2m−j−1)!,

where the coefficients a_k, k = 0, 1, ..., m−1, and b_{ij}, i = 1, ..., r; j = 0, 1, ..., ν_i, are obtained as a solution of the following linear algebraic system:

(S_H f)^{(j)}(x_k) = f^{(j)}(x_k),  k = 1, ..., r; j = 0, 1, ..., ν_k,
(S_H f)^{(p)}(α) = 0,  p = m, ..., 2m−1,  α > x_r.
(3.1.16)

After some transformations (as in the previous case) we can see that the systems (3.1.15) and (3.1.16) are equivalent. But for n > m the spline function S_H f exists and is unique, hence the system (3.1.16) has a unique solution. It follows that the system (3.1.15) also has a unique solution, and the proof is complete.
Remark 3.1.13. A theorem analogous to Theorem 3.1.12 can also be proved when Λ is a set of Birkhoff-type functionals, under the additional condition that Λ contains a subset of at least m Hermite-type functionals (the condition under which the corresponding spline interpolation function exists and is unique).

In particular, let λ be the definite integral functional:

λ(f) := I(f) = ∫_a^b f(x) dx,  for f ∈ H^{m,2}[a, b].
For

Λ = {λ_i | λ_i : H^{m,2}[a, b] → R, i = 1, ..., n},

one considers the quadrature formula

I(f) = Q_n(f) + R_n(f),   (3.1.17)

where

Q_n(f) = Σ_{i=1}^{n} A_i λ_i(f)

and R_n(f) is the remainder term. The quadrature formula (3.1.17) is optimal in the sense of Sard if

R_n(e_j) = 0,  j = 0, 1, ..., m−1,

and

∫_a^b K²(t) dt is minimal,

with

K(t) := R_n[(· − t)_+^{m−1}/(m−1)!].
Now, if Λ is a Birkhoff-type set of functionals that contains a subset of at least m functionals of Hermite type, then the corresponding spline interpolation formula,

f = Sf + Rf,

exists and is unique. By Theorem 3.1.9, it follows that the formula

∫_a^b f(x) dx = ∫_a^b (Sf)(x) dx + ∫_a^b (Rf)(x) dx

is the optimal quadrature formula in the sense of Sard corresponding to the set Λ.

Next, we give some examples of such optimal quadrature formulas for Lagrange-type, Hermite-type and Birkhoff-type sets of functionals.
Example 3.1.14. Let f ∈ C[0, 1] and

Λ_L(f) = {f(i/n) | i = 0, ..., n}.

We construct the optimal quadrature formula in the sense of Sard, using linear splines (m = 1). We have

f = S_1 f + R_1 f,

where

(S_1 f)(x) = Σ_{i=0}^{n} s_i(x) f(i/n)

and

(R_1 f)(x) = ∫_0^1 φ_1(x, t) f′(t) dt,

with

φ_1(x, t) = (x − t)_+^0 − Σ_{i=0}^{n} s_i(x) (i/n − t)_+^0.
Next, we determine the functions s_i, i = 0, 1, ..., n. We have

s_i(x) = a_0^i + Σ_{j=0}^{n} b_j^i (x − j/n)_+,  i = 0, 1, ..., n,

with the coefficients a_0^i, b_0^i, ..., b_n^i determined, for each i = 0, 1, ..., n, from the conditions:

s_i(k/n) = δ_{ik},  k = 0, 1, ..., n,
s_i′(α) = 0,  α > 1 (we take α = 2).
One obtains

s_0(x) = 1 − nx + n(x − 1/n)_+,
s_1(x) = nx − 2n(x − 1/n)_+ + n(x − 2/n)_+,
s_2(x) = n(x − 1/n)_+ − 2n(x − 2/n)_+ + n(x − 3/n)_+,
. . .
s_{n−1}(x) = n(x − (n−2)/n)_+ − 2n(x − (n−1)/n)_+ + n(x − 1)_+,
s_n(x) = n(x − (n−1)/n)_+ − n(x − 1)_+.
It follows that

∫_0^1 f(x) dx = Σ_{i=0}^{n} A*_i f(i/n) + R*(f),

where

A*_i = ∫_0^1 s_i(x) dx,  i = 0, ..., n,

i.e.,

A*_0 = 1/(2n),
A*_1 = ... = A*_{n−1} = 1/n,
A*_n = 1/(2n),

and

R*(f) = ∫_0^1 K_1(t) f′(t) dt,

with

K_1(t) = 1 − t − Σ_{i=0}^{n} A*_i (i/n − t)_+^0.
Hence, the optimal quadrature formula is

∫_0^1 f(x) dx = (1/n) [ (1/2) f(0) + f(1/n) + ... + f((n−1)/n) + (1/2) f(1) ] + R*(f).

For the remainder term we have

|R*(f)| ≤ ‖f′‖_2 ( ∫_0^1 K_1²(t) dt )^{1/2} = ‖f′‖_2 ( Σ_{i=1}^{n} ∫_{(i−1)/n}^{i/n} K_{1,i}²(t) dt )^{1/2} ≤ 1/(2n√3) ‖f′‖_2,

where K_{1,i} = K_1|_{[(i−1)/n, i/n]}.
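The optimal formula obtained here is the composite trapezoid rule, and both the rule and the bound 1/(2n√3) · ‖f′‖₂ can be checked on a concrete function; the test function sin(πx) is our own choice (with ∫₀¹ f = 2/π and ‖f′‖₂ = π/√2):

```python
from math import sqrt, sin, pi

def trapezoid(f, n):
    """Sard-optimal quadrature of Example 3.1.14: composite trapezoid on [0, 1]."""
    return (0.5 * f(0) + sum(f(i / n) for i in range(1, n)) + 0.5 * f(1)) / n

f = lambda x: sin(pi * x)
n = 50
err = abs(trapezoid(f, n) - 2 / pi)
bound = (1 / (2 * n * sqrt(3))) * (pi / sqrt(2))   # 1/(2n√3) · ‖f′‖₂
assert err <= bound
```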
Example 3.1.15. Let f ∈ C²[0, 1] and let

Λ_H(f) = {f(0), f′(0), f(1/2), f(1), f′(1)}

be given. We construct the optimal quadrature formula (in the sense of Sard). To this end, one considers the cubic spline interpolation formula:

f = S_3 f + R_3 f,

where

(S_3 f)(x) = s_{00}(x) f(0) + s_{01}(x) f′(0) + s_{10}(x) f(1/2) + s_{20}(x) f(1) + s_{21}(x) f′(1)

and the fundamental interpolation splines have the expressions

s_{kj}(x) = a_0^{kj} + a_1^{kj} x + b_{00}^{kj} x³ + b_{01}^{kj} x² + b_{10}^{kj} (x − 1/2)³_+ + b_{20}^{kj} (x − 1)³_+ + b_{21}^{kj} (x − 1)²_+,
for (k, j) ∈ {(0,0), (0,1), (1,0), (2,0), (2,1)}. For (k, j) = (0, 0) we have the system (dropping the superscript 00 on the coefficients):

s_{00}(0) := a_0 = 1,
s_{00}′(0) := a_1 = 0,
s_{00}(1/2) := a_0 + (1/2)a_1 + (1/8)b_{00} + (1/4)b_{01} = 0,
s_{00}(1) := a_0 + a_1 + b_{00} + b_{01} + (1/8)b_{10} = 0,
s_{00}′(1) := a_1 + 3b_{00} + 2b_{01} + (3/4)b_{10} = 0,
s_{00}″(2) := 12b_{00} + 2b_{01} + 9b_{10} + 6b_{20} + 2b_{21} = 0,
s_{00}‴(2) := b_{00} + b_{10} + b_{20} = 0.

Solving this system, one obtains

s_{00}(x) = 1 + 10x³ − 9x² − 16(x − 1/2)³_+.
The systems that generate the coefficients of the other fundamental interpolation spline functions have the same matrix, the free terms being changed successively to (0,1,0,0,0,0,0), (0,0,1,0,0,0,0), (0,0,0,1,0,0,0), (0,0,0,0,1,0,0).
So, we have

s_{01}(x) = x + 3x³ − (7/2)x² − 4(x − 1/2)³_+,
s_{10}(x) = −16x³ + 12x² + 32(x − 1/2)³_+,
s_{20}(x) = 6x³ − 3x² − 16(x − 1/2)³_+,
s_{21}(x) = −x³ + (1/2)x² + 4(x − 1/2)³_+.
For the remainder term, we have:

(R_3 f)(x) = ∫_0^1 K_2(x, t) f″(t) dt,

where

K_2(x, t) = (x − t)_+ − s_{10}(x)(1/2 − t)_+ − s_{20}(x)(1 − t) − s_{21}(x).

It follows that

∫_0^1 f(x) dx = A*_{00} f(0) + A*_{01} f′(0) + A*_{10} f(1/2) + A*_{20} f(1) + A*_{21} f′(1) + R*(f),

with

A*_{00} = ∫_0^1 s_{00}(x) dx = 1/4,
A*_{01} = ∫_0^1 s_{01}(x) dx = 1/48,
A*_{10} = ∫_0^1 s_{10}(x) dx = 1/2,
A*_{20} = ∫_0^1 s_{20}(x) dx = 1/4,
A*_{21} = ∫_0^1 s_{21}(x) dx = −1/48,

and

R*(f) = ∫_0^1 K_2(t) f″(t) dt,

where

K_2(t) = (1 − t)²/2 − A*_{10}(1/2 − t)_+ − A*_{20}(1 − t) − A*_{21}.
For the remainder we have

|R*(f)| ≤ ( ∫_0^1 K_2²(t) dt )^{1/2} ‖f″‖_2,

where

∫_0^1 K_2²(t) dt = ∫_0^{1/2} K_2²(t) dt + ∫_{1/2}^1 K_2²(t) dt.
We obtain

∫_0^1 f(x) dx = (1/4) f(0) + (1/48) f′(0) + (1/2) f(1/2) + (1/4) f(1) − (1/48) f′(1) + R*(f),

with

|R*(f)| ≤ 1/(48√5) ‖f″‖_2.
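A quick numerical check of this formula (a sketch; the test function f(x) = eˣ is our own choice, with ‖f″‖₂ = √((e² − 1)/2)) confirms that it integrates all cubics exactly and that the error bound 1/(48√5) · ‖f″‖₂ holds:

```python
from math import sqrt, exp

def q_hermite(f, df):
    """Sard-optimal quadrature of Example 3.1.15 on [0, 1],
    using f(0), f'(0), f(1/2), f(1), f'(1)."""
    return f(0) / 4 + df(0) / 48 + f(0.5) / 2 + f(1) / 4 - df(1) / 48

# Exact on all cubics (check the monomials x^i, i = 0..3).
for i in range(4):
    q = q_hermite(lambda x: x**i, lambda x: i * x**(i - 1) if i else 0.0)
    assert abs(q - 1 / (i + 1)) < 1e-12

# Error bound |R*(f)| ≤ ‖f''‖₂ / (48√5) for f(x) = e^x.
err = abs(q_hermite(exp, exp) - (exp(1) - 1))
assert err <= sqrt((exp(1)**2 - 1) / 2) / (48 * sqrt(5))
```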
Example 3.1.16. Let f ∈ C²[0, 1] and let

Λ_B(f) = {f′(0), f(1/2), f′(1)}

be given. We need to construct the optimal quadrature formula (in the sense of Sard). Similarly to the Hermite case, we consider the cubic spline formula

(S_3 f)(x) = s_{01}(x) f′(0) + s_{10}(x) f(1/2) + s_{21}(x) f′(1),

and we determine the functions s_{01}, s_{10} and s_{21}. The coefficients of these functions are solutions of a 5 × 5 system. For s_{01} we have

s_{01}′(0) := a_1 = 1,
s_{01}(1/2) := a_0 + (1/2)a_1 + (1/4)b_{01} = 0,
s_{01}′(1) := a_1 + 2b_{01} + (3/4)b_{10} = 0,
s_{01}″(2) := 2b_{01} + 9b_{10} + 2b_{21} = 0,
s_{01}‴(2) := 6b_{10} = 0.

It follows that

s_{01}(x) = −3/8 + x − (1/2)x² + (1/2)(x − 1)²_+.
In the same way we obtain the other functions:

s_{10}(x) = 1,
s_{21}(x) = −1/8 + (1/2)x² − (1/2)(x − 1)²_+.

For the remainder term, we have:

(R_3 f)(x) = ∫_0^1 K_2(x, t) f″(t) dt,

where

K_2(x, t) = (x − t)_+ − (1/2 − t)_+ − s_{21}(x).
One obtains the optimal formula:

∫_0^1 f(x) dx = A*_{01} f′(0) + A*_{10} f(1/2) + A*_{21} f′(1) + R*(f),

where

A*_{01} = ∫_0^1 s_{01}(x) dx = −1/24,
A*_{10} = ∫_0^1 s_{10}(x) dx = 1,
A*_{21} = ∫_0^1 s_{21}(x) dx = 1/24,

and

R*(f) = ∫_0^1 K*_2(t) f″(t) dt,

with

K*_2(t) = ∫_0^1 K_2(x, t) dx = (1 − t)²/2 − (1/2 − t)_+ − 1/24.

Finally, we get

∫_0^1 f(x) dx = −(1/24) f′(0) + f(1/2) + (1/24) f′(1) + R*(f),

with

|R*(f)| ≤ 1/(12√5) ‖f″‖_2.
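The same kind of check works for this Birkhoff formula (again with the illustrative choice f(x) = eˣ, ours rather than the text's): it is exact on cubics and obeys the bound 1/(12√5) · ‖f″‖₂:

```python
from math import sqrt, exp

def q_birkhoff(f, df):
    """Sard-optimal quadrature of Example 3.1.16 on [0, 1],
    using only f'(0), f(1/2), f'(1)."""
    return -df(0) / 24 + f(0.5) + df(1) / 24

# Exact on all cubics.
for i in range(4):
    q = q_birkhoff(lambda x: x**i, lambda x: i * x**(i - 1) if i else 0.0)
    assert abs(q - 1 / (i + 1)) < 1e-12

# Error bound |R*(f)| ≤ ‖f''‖₂ / (12√5) for f(x) = e^x.
err = abs(q_birkhoff(exp, exp) - (exp(1) - 1))
assert err <= sqrt((exp(1)**2 - 1) / 2) / (12 * sqrt(5))
```

It is notable that using only three pieces of information, one of them a midpoint value and two of them derivatives, still yields exactness on cubics.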
3.1.2 Optimality in the sense of Nikolski
Consider a quadrature formula of the form given in (3.0.2). One supposes that all the parameters of the quadrature formula are free. So, we have to find the coefficients and the nodes of the quadrature formula such that |R_n(f)| takes its minimum value. Sometimes it can be assumed that the quadrature formula has a given degree of exactness, i.e., some relations as in (3.1.1) are satisfied. Two ways of solving such an optimality problem are used here.

A first way is based on the relationship between optimality in the sense of Sard and spline interpolation. To begin with, suppose that the quadrature nodes are fixed, and find the coefficients such that |R_n(f)| is minimum. But this is a problem of optimality in the sense of Sard, and it can be solved using the suitable spline interpolation formula. So, if
f = S_r f + R_r f   (3.1.18)

is the spline interpolation formula, with

(S_r f)(x) = Σ_{k=0}^{m} Σ_{j∈I_k} s_{kj}(x) f^{(j)}(x_k)

and

(R_r f)(x) = ∫_a^b φ_r(x, t) f^{(r)}(t) dt,

then it follows that

∫_a^b f(x) dx = Σ_{k=0}^{m} Σ_{j∈I_k} A_{kj} f^{(j)}(x_k) + R_n(f),   (3.1.19)

with

A_{kj} = ∫_a^b s_{kj}(x) dx

and

R_n(f) = ∫_a^b K_r(t) f^{(r)}(t) dt,   (3.1.20)

where

K_r(t) = ∫_a^b φ_r(x, t) dx = (b − t)^r/r! − Σ_{k=0}^{m} Σ_{j∈I_k} A_{kj} (x_k − t)_+^{r−j−1}/(r−j−1)!.

Formula (3.1.19) is the quadrature formula optimal in the sense of Sard. As the A_{kj} are functions of the variables x_k, k = 0, 1, ..., m, i.e.,

A_{kj} = F_{kj}(x_0, ..., x_m),   (3.1.21)

it follows that R_n(f) is a function of the nodes,

R_n(f) = F(x_0, ..., x_m).   (3.1.22)
Now we have to minimize the function F with respect to the variables x_0, ..., x_m. Let (x*_0, ..., x*_m) be the minimum point of F. It follows that x*_0, ..., x*_m are the optimal nodes in the sense of Nikolski.
Then, from (3.1.21) and (3.1.22), one obtains the optimal coefficients:

A*_{kj} = F_{kj}(x*_0, ..., x*_m),  k = 0, 1, ..., m; j ∈ I_k,   (3.1.23)

respectively the optimal remainder term:

R*_n(f) = F(x*_0, ..., x*_m).   (3.1.24)

Thus, the quadrature formula

∫_a^b f(x) dx = Σ_{k=0}^{m} Σ_{j∈I_k} A*_{kj} f^{(j)}(x*_k) + R*_n(f)

is optimal in the sense of Nikolski.
Conclusion 3.1.17. The optimal quadrature in the sense of Nikolski can be obtained in two steps:

- construct the optimal quadrature in the sense of Sard, (3.1.19), for fixed nodes, using for example the relationship with spline interpolation;
- minimize the error functional (3.1.20), which depends only on the nodes, with respect to these parameters. This gives the optimal nodes x*_k, k = 0, 1, ..., m. Then (3.1.23) and (3.1.24) give the optimal coefficients A*_{kj}, k = 0, 1, ..., m; j ∈ I_k, respectively the optimal remainder term R*_n(f).
A second way is given by the φ-function method, which is based on the minimal-norm properties of orthogonal polynomials.

Let f ∈ C^r[a, b] and a = x_0 < x_1 < ... < x_m = b. The φ-function method consists in associating to each interval [x_{k−1}, x_k] a function φ_k with

φ_k^{(r)} = 1,  k = 1, ..., m.
Then,

∫_a^b f(x) dx = Σ_{k=1}^{m} ∫_{x_{k−1}}^{x_k} f(x) dx = Σ_{k=1}^{m} ∫_{x_{k−1}}^{x_k} φ_k^{(r)}(x) f(x) dx,

and applying the integration by parts formula r times to the last integrals, one obtains:

∫_a^b f(x) dx = Σ_{k=1}^{m} [ φ_k^{(r−1)}(x) f(x) − φ_k^{(r−2)}(x) f′(x) + ... + (−1)^{r−1} φ_k(x) f^{(r−1)}(x) ]|_{x_{k−1}}^{x_k} + (−1)^r Σ_{k=1}^{m} ∫_{x_{k−1}}^{x_k} φ_k(x) f^{(r)}(x) dx,
and further,

∫_a^b f(x) dx = Σ_{j=0}^{r−1} (−1)^{j+1} φ_1^{(r−j−1)}(x_0) f^{(j)}(x_0)
             + Σ_{k=1}^{m−1} Σ_{j=0}^{r−1} (−1)^j (φ_k − φ_{k+1})^{(r−j−1)}(x_k) f^{(j)}(x_k)
             + Σ_{j=0}^{r−1} (−1)^j φ_m^{(r−j−1)}(x_m) f^{(j)}(x_m)
             + (−1)^r Σ_{k=1}^{m} ∫_{x_{k−1}}^{x_k} φ_k(x) f^{(r)}(x) dx.
Therefore, considering φ|_{[x_{k−1}, x_k]} = φ_k, k = 1, ..., m, we have

∫_a^b f(x) dx = Σ_{k=0}^{m} Σ_{j=0}^{r−1} A_{kj} f^{(j)}(x_k) + R_n(f),   (3.1.25)

where

A_{0j} = (−1)^{j+1} φ_1^{(r−j−1)}(x_0),  j = 0, 1, ..., r−1,
A_{kj} = (−1)^j (φ_k − φ_{k+1})^{(r−j−1)}(x_k),  k = 1, ..., m−1; j = 0, 1, ..., r−1,
A_{mj} = (−1)^j φ_m^{(r−j−1)}(x_m),  j = 0, 1, ..., r−1,
(3.1.26)

and

R_n(f) = (−1)^r ∫_a^b φ(x) f^{(r)}(x) dx,   (3.1.27)

with

φ(x) = (x_m − x)^r/r! + (−1)^{r+1} Σ_{k=0}^{m} Σ_{j=0}^{r−1} A_{kj} (x_k − x)_+^{r−j−1}/(r−j−1)!.
If f ∈ H^{r,2}[a, b], then

|R_n(f)| ≤ ‖f^{(r)}‖_2 ( ∫_a^b φ²(x) dx )^{1/2}.
In this way, the optimal quadrature formula in the sense of Nikolski is determined by the coefficients A := (A_{kj}), k = 0, ..., m; j = 0, ..., r−1, and the nodes X := (x_k), k = 0, ..., m, for which the functional

F(A, X) := ∫_a^b φ²(x) dx

takes its minimum value.
Remark 3.1.18. From (3.1.27) it follows that the quadrature formula (3.1.25) has degree of exactness at least r − 1.
Next, we must find the coefficients A and the nodes X that minimize the functional F(A, X). We have

F(A, X) = Σ_{k=1}^{m} ∫_{x_{k−1}}^{x_k} φ_k²(x) dx,

where φ_k = φ|_{[x_{k−1}, x_k]} is a polynomial of degree r (with leading coefficient 1/r!, since φ_k^{(r)} = 1). So we have to find the polynomial φ_k with minimum L²_w[x_{k−1}, x_k] norm (w = 1). This is the Legendre polynomial

l_{r,k}(x) = (r!/(2r)!) d^r/dx^r [(x − x_{k−1})^r (x − x_k)^r],

i.e., the coefficients A_{kj} of φ_k, say Ā_{kj}, are obtained from the identity

φ_k = l_{r,k},  for all k = 1, ..., m.
Evidently, $A_{kj} = A_{kj}(x_1,\dots,x_{m-1})$. One obtains
$$F(A,X) = \sum_{k=1}^{m}\int_{x_{k-1}}^{x_k} l_{r,k}^2(x)\,dx,$$
which must be minimized with respect to the nodes $x_k$, $k = 1,\dots,m-1$ ($x_0 = a$, $x_m = b$). Let $X^* = (a, x_1^*,\dots,x_{m-1}^*, b)$ be the minimum point. It follows that $x_1^*,\dots,x_{m-1}^*$ are the optimal nodes and, respectively,
$$A_{kj}^* = A_{kj}(a, x_1^*,\dots,x_{m-1}^*, b), \quad k = 0,1,\dots,m;\ j = 0,1,\dots,r-1,$$
are the optimal coefficients. For the remainder term, one obtains
$$|R_m^*(f)| \le [F(A^*,X^*)]^{1/2}\,\big\|f^{(r)}\big\|_2.$$
Example 3.1.19. Let $f \in H^{1,2}[0,1]$, let $0 = x_0 < x_1 < \dots < x_m = 1$ be a partition of the interval $[0,1]$, and
$$I(f) := \int_0^1 f(x)\,dx = \sum_{k=0}^{m} A_k f(x_k) + R_m(f).$$
Applying the $\varphi$-function method ($\varphi_k' = 1$), one obtains
$$\int_0^1 f(x)\,dx = -\varphi_1(0)f(0) + \sum_{k=1}^{m-1}(\varphi_k-\varphi_{k+1})(x_k)f(x_k) + \varphi_m(1)f(1) - \int_0^1 \varphi(x)f'(x)\,dx,$$
with
$$\varphi\big|_{[x_{k-1},x_k]} = \varphi_k(x) = x - \sum_{i=0}^{k-1} A_i.$$
We have
$$A_0 = -\varphi_1(0),$$
$$A_k = (\varphi_k-\varphi_{k+1})(x_k), \quad k = 1,\dots,m-1,$$
$$A_m = \varphi_m(1),$$
and
$$R_m(f) = -\int_0^1 \varphi(x)f'(x)\,dx.$$
Here
$$F(A,X) := \int_0^1 \varphi^2(x)\,dx = \sum_{k=1}^{m}\int_{x_{k-1}}^{x_k} \varphi_k^2(x)\,dx,$$
with $\varphi_k$ a polynomial of the first degree for all $k = 1,\dots,m$. But it is known that $\|\varphi_k\|_2$ takes its minimum value when $\varphi_k \equiv l_{1,k}$, the Legendre polynomial of first degree. It follows that
$$\sum_{i=0}^{k-1} A_i = \frac{x_{k-1}+x_k}{2} \qquad (3.1.28)$$
and
$$F(A,X) = \frac{1}{12}\sum_{k=1}^{m}(x_k - x_{k-1})^3.$$
From the system of equations
$$\frac{\partial F(A,X)}{\partial x_k} = \frac{1}{4}\big[(x_k - x_{k-1})^2 - (x_{k+1} - x_k)^2\big] = 0, \quad k = 1,\dots,m-1,$$
it follows that
$$x_k^* - x_{k-1}^* = x_{k+1}^* - x_k^*, \quad k = 1,\dots,m-1,$$
respectively,
$$x_k^* = \frac{k}{m}, \quad k = 0,1,\dots,m.$$
The relations (3.1.28) become
$$\sum_{i=0}^{k-1} A_i^* = \frac{x_{k-1}^* + x_k^*}{2} = \frac{2k-1}{2m}, \quad k = 1,\dots,m,$$
whence
$$A_0^* = \frac{1}{2m}, \qquad A_1^* = \dots = A_{m-1}^* = \frac{1}{m}, \qquad A_m^* = \frac{1}{2m}.$$
We also have
$$F(A^*,X^*) = \frac{1}{12m^2},$$
respectively,
$$|R_m^*(f)| \le \frac{1}{2m\sqrt{3}}\,\|f'\|_2.$$
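The optimal rule of this example is exactly the composite trapezoidal rule on equidistant nodes. The following sketch (our own illustration; the function name and the test integrand $f(x)=e^x$ are assumptions, not from the book) checks the error bound $|R_m^*(f)| \le \frac{1}{2m\sqrt{3}}\|f'\|_2$ numerically:

```python
import math

def optimal_quadrature(f, m):
    # Optimal weights of Example 3.1.19: A*_0 = A*_m = 1/(2m), A*_k = 1/m,
    # at the equidistant nodes x*_k = k/m (the composite trapezoidal rule).
    weights = [1 / (2 * m)] + [1 / m] * (m - 1) + [1 / (2 * m)]
    return sum(w * f(k / m) for k, w in enumerate(weights))

m = 10
approx = optimal_quadrature(math.exp, m)
err = abs(approx - (math.e - 1.0))               # exact integral is e - 1
norm_f_prime = math.sqrt((math.e ** 2 - 1) / 2)  # ||f'||_2 on [0, 1]
bound = norm_f_prime / (2 * m * math.sqrt(3))
```

For $m = 10$ the actual error sits well below the worst-case bound, as expected: the bound is sharp only over the whole class $H^{1,2}[0,1]$, not for one smooth integrand.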
3.2 Some optimality aspects in numerical integration of bivariate functions
Let $D$ be a given domain in $\mathbb{R}^2$, $f : D \to \mathbb{R}$ an integrable function on $D$ and $\Lambda(f) = \{\lambda_1(f),\dots,\lambda_N(f)\}$ a set of information on $f$. Next, one supposes that $\lambda_i(f)$, $i = 1,\dots,N$, are the values of $f$, or of certain of its derivatives, at some points from $D$, called the cubature nodes.
One considers the cubature formula
$$I_w^{xy}f := \iint_D w(x,y)f(x,y)\,dx\,dy = \sum_{i=1}^{N} C_i\lambda_i(f) + R_N(f),$$
where $w$ is a given weight function (often $w(x,y) = 1$ on $D$), $C_i$, $i = 1,\dots,N$, are its coefficients and $R_N(f)$ is the remainder term.
The problem of constructing such a cubature formula consists in determining the coefficients $C_i$, $i = 1,\dots,N$, and its nodes, under some given conditions, and in evaluating the corresponding remainder term.
For some particular cases, a cubature formula can be constructed using the product or the boolean sum of two quadrature rules.
Most results have been obtained when $D$ is a regular domain in $\mathbb{R}^2$ (rectangle, triangle, etc.) and the cubature nodes are regularly spaced. First, one supposes that $D$ is a rectangle, $D = [a,b]\times[c,d]$, and $w(x,y) = 1$, $(x,y) \in D$. If
$$\Lambda^x(f) = \{\lambda_i^x f \mid i = 0,1,\dots,m\}, \qquad \Lambda^y(f) = \{\lambda_j^y f \mid j = 0,1,\dots,n\}$$
are sets of partial information on $f$, with respect to $x$, respectively to $y$, one considers the quadrature formulas:
$$I^x f := \int_a^b f(x,y)\,dx = (Q_1^x f)(\cdot,y) + (R_1^x f)(\cdot,y)$$
and
$$I^y f := \int_c^d f(x,y)\,dy = (Q_1^y f)(x,\cdot) + (R_1^y f)(x,\cdot),$$
where the quadrature rules $Q_1^x$ and $Q_1^y$ are given by
$$(Q_1^x f)(\cdot,y) = \sum_{i=0}^{m} A_i(\lambda_i^x f)(\cdot,y),$$
respectively,
$$(Q_1^y f)(x,\cdot) = \sum_{j=0}^{n} B_j(\lambda_j^y f)(x,\cdot),$$
with $R_1^x$ and $R_1^y$ the corresponding remainder operators,
$$R_1^x = I^x - Q_1^x, \qquad R_1^y = I^y - Q_1^y.$$
It is easy to check the following decompositions of the double integral operator $I^{xy}$:
$$I^{xy} = Q_1^x Q_1^y + (I^y R_1^x + I^x R_1^y - R_1^x R_1^y), \qquad (3.2.1)$$
and
$$I^{xy} = (Q_1^x I^y + Q_1^y I^x - Q_1^x Q_1^y) + R_1^x R_1^y. \qquad (3.2.2)$$
The identities (3.2.1) and (3.2.2) generate the product cubature formula
$$I^{xy}f := Q^P f + R^P f = Q_1^x Q_1^y f + (R_1^x I^y + I^x R_1^y - R_1^x R_1^y)f, \qquad (3.2.3)$$
respectively, the boolean sum cubature formula
$$I^{xy}f := Q^S f + R^S f = (Q_1^x I^y + Q_1^y I^x - Q_1^x Q_1^y)f + R_1^x R_1^y f. \qquad (3.2.4)$$
For example, if $\lambda_i^x f = f(x_i,y)$, $\lambda_j^y f = f(x,y_j)$, $i = 0,1,\dots,m$; $j = 0,1,\dots,n$, and $Q_1^x$, respectively $Q_1^y$, are the trapezoidal rules, i.e.,
$$(Q_1^x f)(\cdot,y) = \frac{b-a}{2}\big[f(a,y) + f(b,y)\big],$$
$$(Q_1^y f)(x,\cdot) = \frac{d-c}{2}\big[f(x,c) + f(x,d)\big],$$
then the formulas (3.2.3) and (3.2.4) become
$$\int_a^b\!\!\int_c^d f(x,y)\,dx\,dy = \frac{(b-a)(d-c)}{4}\big[f(a,c) + f(a,d) + f(b,c) + f(b,d)\big] + R^P(f), \qquad (3.2.5)$$
with
$$R^P(f) = -\frac{(b-a)^3}{12}\int_c^d f^{(2,0)}(\xi_1,y)\,dy - \frac{(d-c)^3}{12}\int_a^b f^{(0,2)}(x,\eta_1)\,dx - \frac{(b-a)^3}{12}\cdot\frac{(d-c)^3}{12}\,f^{(2,2)}(\xi_2,\eta_2),$$
respectively,
$$\int_a^b\!\!\int_c^d f(x,y)\,dx\,dy = \frac{b-a}{2}\int_c^d \big[f(a,y)+f(b,y)\big]\,dy + \frac{d-c}{2}\int_a^b \big[f(x,c)+f(x,d)\big]\,dx \qquad (3.2.6)$$
$$- \frac{(b-a)(d-c)}{4}\big[f(a,c)+f(a,d)+f(b,c)+f(b,d)\big] + R^S(f),$$
with
$$R^S(f) = -\frac{(b-a)^3}{12}\cdot\frac{(d-c)^3}{12}\,f^{(2,2)}(\xi,\eta),$$
where $\xi,\xi_1,\xi_2 \in [a,b]$ and $\eta,\eta_1,\eta_2 \in [c,d]$.
This way, there are obtained the trapezoidal product cubature formula (3.2.5) and the trapezoidal boolean-sum cubature formula (3.2.6).
From (3.2.3) and (3.2.4), it follows that
$$\mathrm{ord}(Q^P) = \min\{\mathrm{ord}(Q_1^x), \mathrm{ord}(Q_1^y)\}$$
and
$$\mathrm{ord}(Q^S) = \mathrm{ord}(Q_1^x) + \mathrm{ord}(Q_1^y).$$
Remark 3.2.1. The boolean-sum cubature operator has the remarkable property of high approximation order, which can be seen as an optimality property. However, the boolean-sum formula contains the simple integrals $I^x f$ and $I^y f$. These simple integrals can be approximated, in a second level of approximation, using new quadrature operators, say $Q_2^x$ and $Q_2^y$.
From (3.2.4) one obtains
$$I^{xy}f = Qf + R^Q f, \qquad (3.2.7)$$
with
$$Q = Q_1^x Q_2^y + Q_2^x Q_1^y - Q_1^x Q_1^y \qquad (3.2.8)$$
and
$$R^Q = Q_1^x R_2^y + Q_1^y R_2^x + R_1^x R_1^y, \qquad (3.2.9)$$
where
$$R_2^x = I^x - Q_2^x, \qquad R_2^y = I^y - Q_2^y.$$
From (3.2.9) it follows that
$$\mathrm{ord}(Q) = \min\{\mathrm{ord}(Q_2^y) + 1,\ \mathrm{ord}(Q_2^x) + 1,\ \mathrm{ord}(Q^S)\}.$$
The quadrature operators $Q_2^x$ and $Q_2^y$ can be chosen in many ways. First of all, the choice depends on the given information on the function $f$.
Remark 3.2.2. A natural way to select them is such that the approximation order of the initial boolean-sum operator is preserved. It is obvious that its approximation order cannot be increased.
Definition 3.2.3. A cubature formula of the form (3.2.7), derived from the boolean-sum formula (3.2.4), which preserves the approximation order of $Q^S$ ($\mathrm{ord}(Q_2^x) + 1 \ge \mathrm{ord}(Q^S)$ and $\mathrm{ord}(Q_2^y) + 1 \ge \mathrm{ord}(Q^S)$), is called a consistent cubature formula.
As the approximation order of the boolean-sum cubature operator $Q^S$ cannot be increased, it is preferable to choose the quadrature operators $Q_2^x$ and $Q_2^y$ such that each term of the remainder of (3.2.7) has the same approximation order.
Definition 3.2.4. A cubature formula of the form (3.2.7), for which
$$\mathrm{ord}(Q_2^x) = \mathrm{ord}(Q_2^y) = \mathrm{ord}(Q^S) - 1,$$
is called a homogeneous cubature formula.
Example 3.2.5. Let $D_h = [0,h]\times[0,h]$ and $f \in B_{2,2}(0,0)$. From the trapezoidal boolean-sum cubature formula (3.2.6) a homogeneous cubature formula can be obtained if, in the second level of approximation, the Simpson quadrature formula is used:
$$\int_0^h g(t)\,dt = \frac{h}{6}\Big[g(0) + 4g\Big(\frac{h}{2}\Big) + g(h)\Big] - \frac{h^5}{2880}\,g^{(4)}(\xi), \quad \xi \in [0,h],$$
where $g \in C^4[0,h]$. One obtains the following trapezoidal-Simpson homogeneous formula:
$$\iint_{D_h} f(x,y)\,dx\,dy = \frac{h^2}{4}\Big\{-\frac{1}{3}\big[f(0,0) + f(0,h) + f(h,0) + f(h,h)\big]$$
$$+ \frac{4}{3}\Big[f\Big(0,\frac{h}{2}\Big) + f\Big(h,\frac{h}{2}\Big) + f\Big(\frac{h}{2},0\Big) + f\Big(\frac{h}{2},h\Big)\Big]\Big\} + R(f),$$
where
$$R(f) = -\frac{h^6}{144}\Big[\frac{1}{20}f^{(4,0)}(\xi_1,\eta_1) + \frac{1}{20}f^{(0,4)}(\xi_2,\eta_2) + f^{(2,2)}(\xi_3,\eta_3)\Big],$$
with $(\xi_i,\eta_i) \in D_h$, $i = 1,2,3$.
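A short numerical check of the trapezoidal-Simpson homogeneous formula (our own sketch, not from the book): the rule is exact for $f(x,y)=x^3$, and its single-cell error decays like $h^6$, which we verify on $f(x,y)=x^2y^2$:

```python
def trap_simpson(f, h):
    # Trapezoidal-Simpson homogeneous cubature on [0,h] x [0,h] (Example 3.2.5).
    corners = f(0, 0) + f(0, h) + f(h, 0) + f(h, h)
    mids = f(0, h / 2) + f(h, h / 2) + f(h / 2, 0) + f(h / 2, h)
    return h * h / 4 * (-corners / 3 + 4 * mids / 3)

# Exactness on x^3 over [0,1]^2 (the exact integral is 1/4):
q_cubic = trap_simpson(lambda x, y: x ** 3, 1.0)

# Error decay on f = x^2 y^2: the exact integral over [0,h]^2 is h^6 / 9.
err = lambda h: abs(trap_simpson(lambda x, y: (x * y) ** 2, h) - h ** 6 / 9)
ratio = err(1.0) / err(0.5)          # halving h divides the error by 2^6 = 64
```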
Example 3.2.6. Also from (3.2.6), using in the second level of approximation the following quadrature formula:
$$\int_0^h g(t)\,dt = \frac{h}{2}\big[g(0) + g(h)\big] + \frac{h^2}{12}\big[g'(0) - g'(h)\big] + R(g),$$
with
$$R(g) = \frac{h^5}{720}\,g^{(4)}(\xi), \quad \xi \in [0,h],\ g \in C^4[0,h],$$
one obtains:
$$\iint_{D_h} f(x,y)\,dx\,dy = \frac{h^2}{4}\big[f(0,0) + f(0,h) + f(h,0) + f(h,h)\big] \qquad (3.2.10)$$
$$+ \frac{h^3}{24}\big[(f^{(1,0)} + f^{(0,1)})(0,0) + (f^{(1,0)} - f^{(0,1)})(0,h) + (f^{(0,1)} - f^{(1,0)})(h,0) - (f^{(1,0)} + f^{(0,1)})(h,h)\big] + R(f),$$
with
$$R(f) = \frac{h^6}{144}\Big[\frac{1}{5}f^{(4,0)}(\xi_1,\eta_1) + \frac{1}{5}f^{(0,4)}(\xi_2,\eta_2) - f^{(2,2)}(\xi_3,\eta_3)\Big],$$
for $(\xi_i,\eta_i) \in D_h$, $i = 1,2,3$. This is a homogeneous cubature formula.
Remark 3.2.7. Formula (3.2.10) contains the same nodes as the initial formula (3.2.6), while in the trapezoidal-Simpson formula of Example 3.2.5 some new nodes appear.
Example 3.2.8. Let
$$I^{xy}f = \int_0^h\!\!\int_0^h f(x,y)\,dx\,dy,$$
and let
$$(Q_1^x f)(\cdot,y) = hf\Big(\frac{h}{2},y\Big), \qquad (Q_1^y f)(x,\cdot) = hf\Big(x,\frac{h}{2}\Big)$$
be the Gaussian quadrature rules. The corresponding boolean-sum cubature formula is
$$I^{xy}f = h\int_0^h f\Big(\frac{h}{2},y\Big)dy + h\int_0^h f\Big(x,\frac{h}{2}\Big)dx - h^2 f\Big(\frac{h}{2},\frac{h}{2}\Big) + R^S(f), \qquad (3.2.11)$$
where
$$R^S(f) = \frac{h^6}{576}\,f^{(2,2)}(\xi,\eta).$$
In order to get a homogeneous cubature formula, in the second level of approximation we must use quadrature rules $Q_2^x$, $Q_2^y$ with $\mathrm{ord}(Q_2^x) = \mathrm{ord}(Q_2^y) = 5$. Such quadrature rules can be
$$(Q_2^x f)(\cdot,y) = \frac{h}{2}\big[f(0,y) + f(h,y)\big] + \frac{h^2}{12}\big[f^{(1,0)}(0,y) - f^{(1,0)}(h,y)\big],$$
with
$$(R_2^x f)(\cdot,y) = \frac{h^5}{720}\,f^{(4,0)}(\xi,y),$$
respectively,
$$(Q_2^y f)(x,\cdot) = \frac{h}{2}\big[f(x,0) + f(x,h)\big] + \frac{h^2}{12}\big[f^{(0,1)}(x,0) - f^{(0,1)}(x,h)\big],$$
with
$$(R_2^y f)(x,\cdot) = \frac{h^5}{720}\,f^{(0,4)}(x,\eta).$$
Theorem 3.2.9. If $f \in B_{2,2}(0,0)$ then we have the following homogeneous cubature formula, derived from (3.2.11):
$$\iint_{D_h} f(x,y)\,dx\,dy = \frac{h^2}{2}\Big[f\Big(\frac{h}{2},0\Big) + f\Big(\frac{h}{2},h\Big) + f\Big(0,\frac{h}{2}\Big) + f\Big(h,\frac{h}{2}\Big) - 2f\Big(\frac{h}{2},\frac{h}{2}\Big)\Big]$$
$$+ \frac{h^3}{12}\Big[f^{(1,0)}\Big(0,\frac{h}{2}\Big) - f^{(1,0)}\Big(h,\frac{h}{2}\Big) + f^{(0,1)}\Big(\frac{h}{2},0\Big) - f^{(0,1)}\Big(\frac{h}{2},h\Big)\Big] + R(f),$$
where
$$R(f) = \frac{h^6}{144}\Big[\frac{1}{5}f^{(4,0)}\Big(\xi,\frac{h}{2}\Big) + \frac{1}{5}f^{(0,4)}\Big(\frac{h}{2},\eta\Big) + \frac{1}{4}f^{(2,2)}(\xi_1,\eta_1)\Big].$$
Another optimality criterion for cubature formulas concerns the number of nodes. The optimality problem is to find the cubature formula with the minimum number of nodes in a class of cubature formulas with the same degree of exactness.
The product and boolean-sum cubature formulas can be constructed by applying the integral operator $I^{xy}$ to both sides of the product, respectively boolean-sum, interpolation formulas corresponding to the function $f$ and the data $\lambda_i^x f$, $i = 0,1,\dots,m$, and $\lambda_j^y f$, $j = 0,1,\dots,n$. Moreover, if the operator $I^{xy}$ is applied to a homogeneous interpolation formula, a homogeneous cubature formula is obtained. Next, we give some examples for a triangular domain. Let us consider the standard triangle
$$T_h = \{(x,y) \in \mathbb{R}^2 \mid x \ge 0,\ y \ge 0,\ x + y \le h\}, \quad \text{for } h > 0.$$
Example 3.2.10. The operator
$$P = P_1 P_2 P_3,$$
with
$$(P_1 f)(x,y) = \frac{h-x-y}{h-y}\,f(0,y) + \frac{x}{h-y}\,f(h-y,y),$$
$$(P_2 f)(x,y) = \frac{h-x-y}{h-x}\,f(x,0) + \frac{y}{h-x}\,f(x,h-x),$$
$$(P_3 f)(x,y) = \frac{x}{x+y}\,f(x+y,0) + \frac{y}{x+y}\,f(0,x+y),$$
generates the homogeneous interpolation formula
$$f = Pf + Rf,$$
with
$$(Pf)(x,y) = \frac{h-x-y}{h}\,f(0,0) + \frac{x}{h}\,f(h,0) + \frac{y}{h}\,f(0,h)$$
and
$$(Rf)(x,y) = \int_0^h \varphi_{20}(x,y;s)f^{(2,0)}(s,0)\,ds + \int_0^h \varphi_{02}(x,y;t)f^{(0,2)}(0,t)\,dt + \iint_{T_h} \varphi_{11}(x,y;s,t)f^{(1,1)}(s,t)\,ds\,dt,$$
for $f \in B_{1,1}(0,0)$, where $\varphi_{ij}$ are the Peano kernels. Then
$$\iint_{T_h} f(x,y)\,dx\,dy = \iint_{T_h} (Pf)(x,y)\,dx\,dy + \iint_{T_h} (Rf)(x,y)\,dx\,dy$$
is a homogeneous cubature formula. Indeed, we have
$$\iint_{T_h} f(x,y)\,dx\,dy = \frac{h^2}{6}\big[f(0,0) + f(h,0) + f(0,h)\big] + R(f),$$
where
$$R(f) = \frac{h^4}{24}\big[f^{(2,0)}(\xi,0) + f^{(0,2)}(0,\eta) - f^{(1,1)}(\xi_1,\eta_1)\big],$$
for $\xi,\eta \in [0,h]$ and $(\xi_1,\eta_1) \in T_h$.
Chapter 4
Numerical linear systems
Many problems of applied mathematics reduce to a set of linear equations, or a linear system
A · x = b,
with the matrix $A$ and the vector $b$ given and the vector $x$ to be determined; thus, solving a linear system is one of the principal problems of numerical analysis. An extensive set of algorithms has been developed for this purpose, several of which are presented in the sequel.
4.1 Triangular systems of equations
Let us consider the system of equations
A · x = b, (4.1.1)
where $A \in M_{n\times n}(\mathbb{R})$ is a lower triangular matrix of the form
$$A = \begin{pmatrix} a_{11} & 0 & \dots & 0\\ a_{21} & a_{22} & \dots & 0\\ \vdots & & \ddots & \vdots\\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix},$$
and $x, b \in M_{n\times 1}(\mathbb{R})$, with $\prod_{i=1}^n a_{ii} \neq 0$.
Then the solution of (4.1.1) is obtained by forward substitution, according to the relations
$$x_i = \Big(b_i - \sum_{k=1}^{i-1} a_{ik}x_k\Big)\Big/a_{ii}, \quad i = 1,\dots,n. \qquad (4.1.2)$$
Let us consider the system
$$A\cdot x = b, \qquad (4.1.3)$$
with $x, b \in M_{n\times 1}(\mathbb{R})$, and $A \in M_{n\times n}(\mathbb{R})$ an upper triangular matrix of the form
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n}\\ 0 & a_{22} & \dots & a_{2n}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & a_{nn} \end{pmatrix},$$
with $\prod_{i=1}^n a_{ii} \neq 0$. Then the solution of (4.1.3) is obtained by backward substitution, according to the relations
$$x_i = \Big(b_i - \sum_{k=i+1}^{n} a_{ik}x_k\Big)\Big/a_{ii}, \quad i = n, n-1, \dots, 1.$$
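The two substitution procedures can be sketched as follows (our own Python illustration of relations (4.1.2) and their backward analogue; matrices are represented as lists of rows):

```python
def forward_substitution(A, b):
    # Lower triangular system: x_i = (b_i - sum_{k<i} a_ik x_k) / a_ii.
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        x[i] = (b[i] - sum(A[i][k] * x[k] for k in range(i))) / A[i][i]
    return x

def backward_substitution(A, b):
    # Upper triangular system: the last unknown is found first.
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][k] * x[k] for k in range(i + 1, n))) / A[i][i]
    return x

x_low = forward_substitution([[2.0, 0.0], [1.0, 3.0]], [4.0, 11.0])   # [2, 3]
x_up = backward_substitution([[2.0, 1.0], [0.0, 3.0]], [7.0, 9.0])    # [2, 3]
```

Both procedures cost $O(n^2)$ operations, which is why factorization methods reduce a full system to two triangular ones.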
4.1.1 The block version
If we consider the system (4.1.1) and $x_1$ has been found, then, after substituting $x_1$ into the second through the $n$-th equations, we obtain a new $(n-1)\times(n-1)$ lower triangular system
$$\begin{pmatrix} a_{22} & 0 & \dots & 0\\ a_{32} & a_{33} & \dots & 0\\ \vdots & & \ddots & \vdots\\ a_{n2} & a_{n3} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} x_2\\ x_3\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_2 - a_{21}x_1\\ b_3 - a_{31}x_1\\ \vdots\\ b_n - a_{n1}x_1 \end{pmatrix}.$$
Continuing in this way, we obtain the solution of system (4.1.1).
Note 1. In the same way, we may consider the column version. In that situation, matrix $A$ is considered upper triangular, and by substituting
$x_n$ into the first through the $(n-1)$-th equations, we obtain a new $(n-1)\times(n-1)$ upper triangular system.
Note 2. The block version of the triangular systems is better suited to parallel computation.
4.2 Direct methods for solving linear systems of equations
Let us consider a system of linear equations
$$A\cdot x = b, \qquad (4.2.1)$$
where $A \in M_{n\times n}(\mathbb{R})$, $x, b \in M_{n\times 1}(\mathbb{R})$. Here $A$ is any matrix of the form
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n}\\ a_{21} & a_{22} & \dots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}.$$
The direct methods for solving such systems give the exact solution, denoted $x^* \in M_{n\times 1}(\mathbb{R})$.
In the literature, many direct methods for solving systems of type (4.2.1) are known. The Cramer method is one of them; from a computational point of view, it is practical only for systems of small dimension. For larger systems there are other, more appropriate methods, which generate efficient algorithms both for serial and for parallel computers. Generally, all these methods are based on some factorization techniques, as can be seen in what follows.
4.2.1 Factorization techniques
These methods are based on the idea of converting $A$ into products of the form $LU$ or $LDU$, where $L$ is zero above the main diagonal, $U$ is zero below it, and $D$ has nonzero elements only on the diagonal. If $L$ or $U$ has all diagonal elements equal to 1, it is called unit triangular. Several methods, such as Gauss, Gauss-Jordan, Doolittle, Crout, Cholesky, etc., produce such a factorization.
4.2.1.1 The Gauss method
The Gauss method, known also as the method of partial elimination, is one of the oldest algorithms and still one of the most popular. Many authors present it. We recall here the main characteristics of this method, principally due to the fact that it generates a factorization of the matrix $A$.
Let us write the system (4.2.1) in the detailed form:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1\\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2\\ &\ \vdots\\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= b_n. \end{aligned} \qquad (4.2.2)$$
By the Gauss method, the system (4.2.2) can be solved in two stages.
The first stage. Replacing the equations by combinations of equations (by using elementary algebraic transformations on the extended matrix), the following triangular system is obtained:
$$\begin{aligned} a_{11}^{(1)}x_1 + a_{12}^{(1)}x_2 + \dots + a_{1n}^{(1)}x_n &= b_1^{(1)}\\ a_{22}^{(2)}x_2 + \dots + a_{2n}^{(2)}x_n &= b_2^{(2)}\\ &\ \vdots\\ a_{nn}^{(n)}x_n &= b_n^{(n)} \end{aligned} \qquad (4.2.3)$$
The elements of the new matrix are given by the relations
$$a_{ij}^{(1)} = a_{ij}, \quad b_i^{(1)} = b_i, \quad i,j = 1,\dots,n,$$
and
$$a_{ij}^{(p+1)} = a_{ij}^{(p)} - a_{pj}^{(p)}\,\frac{a_{ip}^{(p)}}{a_{pp}^{(p)}}, \qquad b_i^{(p+1)} = b_i^{(p)} - b_p^{(p)}\,\frac{a_{ip}^{(p)}}{a_{pp}^{(p)}},$$
$$p = 1,\dots,n-1, \quad i,j = p+1,\dots,n,$$
for $a_{pp}^{(p)} \neq 0$, $p = 1,\dots,n$.
The second stage. The components of the vector $x$ are easily found, one after another, by a process called back-substitution. So, the last equation determines $x_n$, which is then substituted into the next-to-last equation to get $x_{n-1}$, and so on.
The relations are:
$$x_n = \frac{b_n^{(n)}}{a_{nn}^{(n)}}, \qquad x_p = \frac{1}{a_{pp}^{(p)}}\Big(b_p^{(p)} - \sum_{j=p+1}^{n} a_{pj}^{(p)}x_j\Big). \qquad (4.2.4)$$
The process of elimination can be described more accurately by means of a sequence of matrices. If we denote $b_p^{(p)}$ by $a_{p,n+1}^{(p)}$, for $p = 1,\dots,n$, and use the extended matrix of the system (4.2.3),
$$\bar{A} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} & a_{1,n+1}\\ 0 & a_{22} & \dots & a_{2n} & a_{2,n+1}\\ \vdots & & \ddots & \vdots & \vdots\\ 0 & 0 & \dots & a_{nn} & a_{n,n+1} \end{pmatrix}, \qquad (4.2.5)$$
we get the following sequence of matrices: $\bar{A}^{(1)}, \bar{A}^{(2)}, \dots, \bar{A}^{(n)}$, where $\bar{A}^{(1)}$ is the extended matrix of the original system, $\bar{A}^{(n)}$ is the triangular matrix (4.2.5), and $\bar{A}^{(p)}$, for every $p = 2,\dots,n$, contains the elements
$$a_{ij}^{(p)} = \begin{cases} a_{ij}^{(p-1)}, & i = 1,\dots,p-1,\ j = 1,\dots,n+1,\\[2pt] 0, & i = p,\dots,n,\ j = 1,\dots,p-1,\\[2pt] a_{ij}^{(p-1)} - \dfrac{a_{i,p-1}^{(p-1)}}{a_{p-1,p-1}^{(p-1)}}\,a_{p-1,j}^{(p-1)}, & i = p,\dots,n,\ j = p,\dots,n+1. \end{cases}$$
Remark 4.2.1. The Gauss method yields a factorization of matrix $A$ in the form
$$A = L\cdot U,$$
where $U$ is the upper triangular matrix shown in (4.2.3) and $L$ is a lower triangular one with 1 on the diagonal.
Remark 4.2.2. If $A$ is nonsingular, then the condition $a_{pp}^{(p)} \neq 0$ can always be fulfilled. So, by using the technique of pivoting, even if some $a_{pp}^{(p)} = 0$, we look for
$$a_{i_p j_p}^{(p)} = \max\{|a_{ij}^{(p)}| \mid i,j = p,\dots,n\}.$$
Then, interchanging the rows $p$ and $i_p$, respectively the columns $p$ and $j_p$, the element $a_{i_p j_p}^{(p)}$ will take the place of $a_{pp}^{(p)}$. Obviously, $a_{i_p j_p}^{(p)} \neq 0$.
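The two stages, together with pivoting, can be sketched as follows (our own illustration; for simplicity it interchanges only rows, i.e. partial pivoting, rather than the full pivoting of Remark 4.2.2):

```python
def gauss_solve(A, b):
    # Gaussian elimination with partial (row) pivoting, then back-substitution.
    n = len(b)
    A = [row[:] for row in A]      # work on copies
    b = b[:]
    for p in range(n - 1):
        # pivoting: bring the largest |a_ip| (i >= p) into position (p, p)
        ip = max(range(p, n), key=lambda i: abs(A[i][p]))
        A[p], A[ip] = A[ip], A[p]
        b[p], b[ip] = b[ip], b[p]
        for i in range(p + 1, n):
            m = A[i][p] / A[p][p]
            for j in range(p, n):
                A[i][j] -= m * A[p][j]
            b[i] -= m * b[p]
    x = [0.0] * n
    for p in range(n - 1, -1, -1):
        s = sum(A[p][j] * x[j] for j in range(p + 1, n))
        x[p] = (b[p] - s) / A[p][p]
    return x

x = gauss_solve([[2.0, 1.0], [8.0, 7.0]], [3.0, 15.0])   # x = [1.0, 1.0]
```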
Remark 4.2.3. Conversely, if $A$ is singular, then $Ax = b$ has either no solution at all or infinitely many solutions. In fact, the Gauss method can be used to decide whether or not a solution exists.
4.2.1.2 The Gauss-Jordan method
The Gauss-Jordan method, known also as "the method of complete elimination", is based on the same idea of computation as the Gauss method, with the only difference that, in the process of elimination, the system matrix becomes a diagonal matrix, so the system (4.2.1) will finally be of the form:
$$\begin{aligned} a_{11}^{(1)}x_1 &= b_1^{(1)}\\ a_{22}^{(2)}x_2 &= b_2^{(2)}\\ &\ \vdots\\ a_{nn}^{(n)}x_n &= b_n^{(n)}. \end{aligned}$$
Then the components of the solution $x^*$ will be
$$x_i = \frac{b_i^{(i)}}{a_{ii}^{(i)}}, \quad i = 1,\dots,n.$$
Obviously, all $a_{ii}^{(i)} \neq 0$.
4.2.1.3 The LU methods
Under certain conditions, the system matrix $A$ of (4.2.1) can be expressed as a product of a lower triangular matrix $L$ and an upper triangular matrix $U$, and, as a result, we have to solve two systems with triangular matrices.
So, if
$$A = LU,$$
then the system (4.2.1) becomes
$$LUx = b, \qquad (4.2.6)$$
which can be solved in two steps:
Step 1. Solve $L\cdot y = b$.
Step 2. Solve $U\cdot x = y$.
The matrix $L$ will have the form
$$L = \begin{pmatrix} l_{11} & 0 & \dots & 0\\ l_{21} & l_{22} & \dots & 0\\ \vdots & & \ddots & \vdots\\ l_{n1} & l_{n2} & \dots & l_{nn} \end{pmatrix}$$
and the matrix $U$ will have the form
$$U = \begin{pmatrix} u_{11} & u_{12} & \dots & u_{1n}\\ 0 & u_{22} & \dots & u_{2n}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & u_{nn} \end{pmatrix}.$$
The LU decompositions are basically modified forms of Gaussian elimination. The way in which the entries are computed is given in what follows, known as the Doolittle algorithm.
4.2.1.4 The Doolittle algorithm
This algorithm does the elimination column by column, starting from the left, by multiplying $A$ on the left by lower triangular matrices. It results in a unit lower triangular matrix and an upper triangular matrix. More precisely, for system (4.2.1), with $a_{kk} \neq 0$, $k = 1,\dots,n$, we denote
$$l_{i,k} := \frac{a_{i,k}^{(k-1)}}{a_{k,k}^{(k-1)}}, \quad i = k+1,\dots,n,$$
$$t^{(k)} = [\underbrace{0 \dots 0}_{k \text{ zeros}}\ l_{k+1,k} \dots l_{n,k}]^T$$
and
$$M_k = I - t^{(k)}e_k^T, \qquad (4.2.7)$$
where $e_k = (0,\dots,0,1,0,\dots,0)^T$ is the $k$-th unit vector. Applied to a vector $x$ (with the multipliers formed as $l_{i,k} = x_i/x_k$), the transformation $M_k$ annihilates the components below the $k$-th:
$$M_k x = \begin{pmatrix} 1 & \dots & 0 & 0 & \dots & 0\\ \vdots & \ddots & \vdots & \vdots & & \vdots\\ 0 & \dots & 1 & 0 & \dots & 0\\ 0 & \dots & -l_{k+1,k} & 1 & \dots & 0\\ \vdots & & \vdots & \vdots & \ddots & \vdots\\ 0 & \dots & -l_{n,k} & 0 & \dots & 1 \end{pmatrix} \begin{pmatrix} x_1\\ \vdots\\ x_k\\ x_{k+1}\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} x_1\\ \vdots\\ x_k\\ 0\\ \vdots\\ 0 \end{pmatrix}.$$
Definition 4.2.4. A matrix $M_k$ of the form (4.2.7) is called a Gauss matrix, the components $l_{i,k}$, $i = k+1,\dots,n$, are called Gauss multipliers and the vector $t^{(k)}$ is called the Gauss vector. The transformation defined by the Gauss matrix $M_k$ is called the Gauss transformation.
Note 1. If $A \in M_{n\times n}(\mathbb{R})$, then Gauss matrices $M_1,\dots,M_{n-1}$ can be found such that
$$M_{n-1}\cdot M_{n-2}\cdots M_2\cdot M_1\cdot A = U$$
is upper triangular. Moreover, if we choose
$$L = M_1^{-1}\cdot M_2^{-1}\cdots M_{n-1}^{-1},$$
then
$$A = L\cdot U,$$
and so we obtain our factorization.
Example 4.2.5. Find the LU factorization of the matrix
$$A = \begin{pmatrix} 2 & 1\\ 8 & 7 \end{pmatrix}$$
using the Doolittle algorithm.
Solution. Find the Gauss matrix $M_1$ for $A$:
$$M_1 = I - t^{(1)}e_1^T = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 0\\ \frac{8}{2} \end{pmatrix}\cdot\begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 0 & 0\\ 4 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0\\ -4 & 1 \end{pmatrix}.$$
Thus
$$M_1\cdot A = \begin{pmatrix} 1 & 0\\ -4 & 1 \end{pmatrix}\cdot\begin{pmatrix} 2 & 1\\ 8 & 7 \end{pmatrix} = \begin{pmatrix} 2 & 1\\ 0 & 3 \end{pmatrix} = U, \qquad M_1^{-1} = \begin{pmatrix} 1 & 0\\ 4 & 1 \end{pmatrix} = L.$$
So
$$A = \begin{pmatrix} 2 & 1\\ 8 & 7 \end{pmatrix} = L\cdot U = \begin{pmatrix} 1 & 0\\ 4 & 1 \end{pmatrix}\cdot\begin{pmatrix} 2 & 1\\ 0 & 3 \end{pmatrix}.$$
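A compact Doolittle sketch (our own Python illustration; it computes the rows of $U$ and the columns of the unit lower triangular $L$ directly, which is equivalent to accumulating the Gauss matrices above):

```python
def doolittle_lu(A):
    # Doolittle factorization: unit lower triangular L, upper triangular U.
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for j in range(k, n):                     # k-th row of U
            U[k][j] = A[k][j] - sum(L[k][s] * U[s][j] for s in range(k))
        for i in range(k + 1, n):                 # k-th column of L
            L[i][k] = (A[i][k] - sum(L[i][s] * U[s][k] for s in range(k))) / U[k][k]
    return L, U

L, U = doolittle_lu([[2.0, 1.0], [8.0, 7.0]])
# Reproduces Example 4.2.5: L = [[1, 0], [4, 1]], U = [[2, 1], [0, 3]].
```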
Note 2. There is another algorithm, called the Crout algorithm, which is slightly different. It constructs a lower triangular matrix and a unit upper triangular matrix.
4.2.1.5 Block LU decomposition
The block LU decomposition generates a lower block triangular matrix $L$ and an upper block triangular matrix $U$. This decomposition is used in order to reduce the computational complexity and is appropriate for parallel computation.
Consider a block matrix:
$$\begin{pmatrix} A & B\\ C & D \end{pmatrix} = \begin{pmatrix} I\\ CA^{-1} \end{pmatrix}\cdot A\cdot\begin{pmatrix} I & A^{-1}B \end{pmatrix} + \begin{pmatrix} 0 & 0\\ 0 & D - CA^{-1}B \end{pmatrix},$$
where the matrix $A$ is assumed to be nonsingular, $I$ is an identity matrix of proper dimension, and $0$ is a null matrix.
We can also rewrite the above equation using the half matrices:
$$\begin{pmatrix} A & B\\ C & D \end{pmatrix} = \begin{pmatrix} A^{\frac12}\\ CA^{-\frac12} \end{pmatrix}\begin{pmatrix} A^{\frac{*}{2}} & A^{-\frac12}B \end{pmatrix} + \begin{pmatrix} 0 & 0\\ 0 & Q^{\frac12} \end{pmatrix}\begin{pmatrix} 0 & 0\\ 0 & Q^{\frac{*}{2}} \end{pmatrix},$$
where $Q = D - CA^{-1}B$ is called the Schur complement of $A$, and the half matrices can be calculated by means of the Cholesky decomposition (see Section 4.2.1.6).
The half matrices satisfy
$$A^{\frac12}\cdot A^{\frac{*}{2}} = A, \qquad A^{\frac12}\cdot A^{-\frac12} = I, \qquad A^{-\frac{*}{2}}\cdot A^{\frac{*}{2}} = I, \qquad Q^{\frac12}Q^{\frac{*}{2}} = Q.$$
Thus, we have
$$\begin{pmatrix} A & B\\ C & D \end{pmatrix} = LU,$$
where
$$LU = \begin{pmatrix} A^{\frac12} & 0\\ CA^{-\frac{*}{2}} & 0 \end{pmatrix}\begin{pmatrix} A^{\frac{*}{2}} & A^{-\frac12}B\\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0\\ 0 & Q^{\frac12} \end{pmatrix}\begin{pmatrix} 0 & 0\\ 0 & Q^{\frac{*}{2}} \end{pmatrix}.$$
The product $LU$ can be decomposed in an algebraic manner into
$$L = \begin{pmatrix} A^{\frac12} & 0\\ CA^{-\frac{*}{2}} & Q^{\frac12} \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} A^{\frac{*}{2}} & A^{-\frac12}B\\ 0 & Q^{\frac{*}{2}} \end{pmatrix}.$$
4.2.1.6 The Cholesky decomposition
The Cholesky decomposition is a matrix decomposition of a symmetric positive definite matrix into a lower triangular matrix and the transpose of that lower triangular matrix. The lower triangular matrix is the Cholesky triangle of the original, positive definite, matrix.
As we saw in the previous section, a square matrix $A$ can (under suitable conditions) be written as the product of a lower triangular matrix $L$ and an upper triangular matrix $U$. However, if $A$ is symmetric and positive definite, we can choose the factors such that $U$ is the transpose of $L$.
Remark 4.2.6. When it is applicable, the Cholesky decomposition is twice as efficient as the LU decomposition.
Let us consider the system
$$A\cdot x = b, \qquad (4.2.8)$$
where $A \in M_{n\times n}(\mathbb{R})$ is symmetric ($A = A^T$) and positive definite ($x^T A x > 0$ for all nonzero $x \in M_{n\times 1}(\mathbb{R})$), and $b \in M_{n\times 1}(\mathbb{R})$.
In order to solve (4.2.8), first we compute the Cholesky decomposition
$$A = L\cdot L^T,$$
then we solve
$$L\cdot y = b$$
to get $y \in M_{n\times 1}(\mathbb{R})$, and
$$L^T\cdot x = y$$
to get $x$, the solution of the initial system.
The Cholesky algorithm. It is a modified version of the recursive Gauss algorithm.
Step 1. Set $A^{(1)} := A$. For $i := 1$ to $n$ do Step $i$:
$$A^{(i)} = \begin{pmatrix} I_{i-1} & 0 & 0\\ 0 & a_{ii} & b_i^T\\ 0 & b_i & B^{(i)} \end{pmatrix},$$
where $I_{i-1}$ denotes the identity matrix of dimension $i-1$, and
$$L_i := \begin{pmatrix} I_{i-1} & 0 & 0\\ 0 & \sqrt{a_{ii}} & 0\\ 0 & \frac{1}{\sqrt{a_{ii}}}\,b_i & I_{n-i} \end{pmatrix}.$$
So,
$$A^{(i)} = L_i\cdot A^{(i+1)}\cdot L_i^T,$$
where
$$A^{(i+1)} = \begin{pmatrix} I_{i-1} & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & B^{(i)} - \frac{1}{a_{ii}}\,b_i b_i^T \end{pmatrix}.$$
After $n$ steps, we get $A^{(n+1)} = I$. Hence, the lower triangular matrix $L$ we are looking for is calculated as
$$L := L_1 L_2 \dots L_n.$$
Remark 4.2.7. The following formulas for the entries of $L$ hold:
$$l_{11} = \sqrt{a_{11}},$$
$$l_{i1} = a_{i1}/l_{11}, \quad i = 2,\dots,n,$$
$$l_{ii} = \Big[a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2\Big]^{1/2}, \quad i = 2,\dots,n,$$
$$l_{ij} = \frac{1}{l_{jj}}\Big[a_{ij} - \sum_{k=1}^{j-1} l_{ik}l_{jk}\Big], \quad i = 3,\dots,n,\ j = 2,\dots,i-1.$$
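The entry formulas of Remark 4.2.7 translate directly into code (our own sketch; the general below-diagonal formula also covers the listed special cases $l_{11}$, $l_{i1}$, $l_{ii}$):

```python
import math

def cholesky(A):
    # Builds L with A = L * L^T using the entry formulas of Remark 4.2.7.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(A[i][i] - s)     # diagonal entries
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]    # below-diagonal entries
    return L

L = cholesky([[4.0, 2.0], [2.0, 3.0]])   # L = [[2, 0], [1, sqrt(2)]]
```

By Remark 4.2.8 the argument of the square root stays positive for a real positive definite matrix, so the recursion never breaks down.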
Remark 4.2.8. The expression under the square root is always positive asA is real and positive definite.
4.2.1.7 The QR decomposition
The QR decomposition of a matrix is a method which expresses a matrix $A$ as the product of an orthogonal matrix and a triangular one.
So, let us consider the system of linear equations
$$A\cdot x = b, \qquad (4.2.9)$$
where $A \in M_{n\times n}(\mathbb{R})$, $x, b \in M_{n\times 1}(\mathbb{R})$.
According to the QR decomposition, matrix $A$ can be written
$$A = Q\cdot R,$$
where $Q$ is an orthogonal matrix (meaning that $Q^T\cdot Q = I$) and $R$ is an upper triangular matrix.
Remark 4.2.9. This factorization is unique if we require the diagonal elements of $R$ to be positive.
There are several methods for computing the QR decomposition, such as: by means of Givens rotations, Householder transformations, or the Gram-Schmidt process.
Computing QR by means of the Gram-Schmidt process. According to the Gram-Schmidt method, the matrix $A$ is considered by columns:
$$A = \begin{pmatrix} a_1 & \dots & a_n \end{pmatrix}.$$
Then, denoting by $\langle\cdot,\cdot\rangle$ the inner product of vectors, we have
$$u_1 = a_1, \qquad e_1 = \frac{u_1}{\|u_1\|},$$
$$u_2 = a_2 - \langle e_1,a_2\rangle e_1, \qquad e_2 = \frac{u_2}{\|u_2\|},$$
$$\vdots$$
$$u_k = a_k - \sum_{j=1}^{k-1}\langle e_j,a_k\rangle e_j, \qquad e_k = \frac{u_k}{\|u_k\|},$$
or, rearranging the terms,
$$a_1 = e_1\|u_1\|,$$
$$a_2 = \langle e_1,a_2\rangle e_1 + e_2\|u_2\|,$$
$$a_3 = \langle e_1,a_3\rangle e_1 + \langle e_2,a_3\rangle e_2 + e_3\|u_3\|,$$
$$\vdots$$
$$a_k = \sum_{j=1}^{k-1}\langle e_j,a_k\rangle e_j + e_k\|u_k\|.$$
In matrix form, these equations can be written:
$$A = \begin{pmatrix} e_1 & \dots & e_n \end{pmatrix} \begin{pmatrix} \|u_1\| & \langle e_1,a_2\rangle & \langle e_1,a_3\rangle & \dots\\ 0 & \|u_2\| & \langle e_2,a_3\rangle & \dots\\ 0 & 0 & \|u_3\| & \dots\\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
The product of each row and column of the matrices above gives us the respective column of $A$ that we started with. So, the matrix $Q$ is the matrix of the $e_k$'s:
$$Q = \begin{pmatrix} e_1 & \dots & e_n \end{pmatrix}.$$
Then,
$$R = Q^T\cdot A = \begin{pmatrix} \langle e_1,a_1\rangle & \langle e_1,a_2\rangle & \langle e_1,a_3\rangle & \dots\\ 0 & \langle e_2,a_2\rangle & \langle e_2,a_3\rangle & \dots\\ 0 & 0 & \langle e_3,a_3\rangle & \dots\\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
Note that $\langle e_j,a_j\rangle = \|u_j\|$, $\langle e_j,a_k\rangle = 0$ for $j > k$, and $QQ^T = I$, so $Q^T = Q^{-1}$.
Example 4.2.10. Let us consider the decomposition of
$$A = \begin{pmatrix} 12 & -51 & 4\\ 6 & 167 & -68\\ -4 & 24 & -41 \end{pmatrix}.$$
We compute the matrix $Q$ by means of the Gram-Schmidt method, as follows:
$$U = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} = \begin{pmatrix} 12 & -69 & -58\\ 6 & 158 & 6\\ -4 & 30 & -165 \end{pmatrix},$$
$$Q = \begin{pmatrix} \frac{u_1}{\|u_1\|} & \frac{u_2}{\|u_2\|} & \frac{u_3}{\|u_3\|} \end{pmatrix} = \begin{pmatrix} 6/7 & -69/175 & -58/175\\ 3/7 & 158/175 & 6/175\\ -2/7 & 6/35 & -33/35 \end{pmatrix}.$$
Thus, we have
$$A = \underbrace{Q\cdot Q^T}_{I}\cdot A = Q\cdot R,$$
so
$$R = Q^T\cdot A = \begin{pmatrix} 14 & 21 & -14\\ 0 & 175 & -70\\ 0 & 0 & 35 \end{pmatrix}.$$
Computing QR by means of Householder reflections. A Householder reflection (or Householder transformation) is a transformation that takes a vector and reflects it about some plane. We can use this property to calculate the QR factorization of a matrix.
$Q$ can be used to reflect a vector in such a way that all coordinates but one disappear.
Let $x$ be an arbitrary $m$-dimensional column vector such that $\|x\| = |\alpha|$ for a scalar $\alpha$. Then, for $e_1 = (1,0,\dots,0)^T$ and $\|\cdot\|$ the Euclidean norm, set
$$u = x - \alpha e_1, \qquad v = \frac{u}{\|u\|}, \qquad Q = I - 2v\cdot v^T.$$
$Q$ is a Householder matrix and
$$Qx = (\alpha,0,\dots,0)^T.$$
This can be used to gradually transform an $m$-by-$n$ matrix $A$ to upper triangular form. First, we multiply $A$ by the Householder matrix $Q_1$ that we obtain when we choose the first matrix column for $x$. This results in a matrix $Q_1A$ with zeros in the left column (except for the first row):
$$Q_1A = \begin{pmatrix} \alpha_1 & * & \dots & *\\ 0 & & &\\ \vdots & & A' &\\ 0 & & & \end{pmatrix}.$$
This can be repeated for $A'$ (obtained from $Q_1A$ by deleting the first row and first column), resulting in a Householder matrix $Q_2'$. Note that $Q_2'$ is smaller than $Q_1$. Since we want it to operate on $Q_1A$ instead of $A'$, we need to expand it to the upper left, filling in a 1, or, in general,
$$Q_k = \begin{pmatrix} I_{k-1} & 0\\ 0 & Q_k' \end{pmatrix}.$$
After $t$ iterations of the process, $t = \min(m-1,n)$,
$$R = Q_t \dots Q_2 Q_1 A$$
is an upper triangular matrix. So, with
$$Q = Q_1 Q_2 \dots Q_t,$$
we have $A = QR$, the needed QR decomposition of $A$.
Example 4.2.11. Let us consider the matrix $A$ as in Example 4.2.10:
$$A = \begin{pmatrix} 12 & -51 & 4\\ 6 & 167 & -68\\ -4 & 24 & -41 \end{pmatrix}.$$
We decompose $A = Q\cdot R$ by means of the Householder transformation.
First, we need to find a reflection that transforms the first column of matrix $A$, the vector $a_1 = (12\ \ 6\ -4)^T$, into $\|a_1\|e_1 = (14\ \ 0\ \ 0)^T$.
Now, for $\alpha = 14$ and $x = a_1 = (12\ \ 6\ -4)^T$, we compute
$$u = x - \alpha e_1 = (-2\ \ 6\ -4)^T$$
and
$$v = \frac{u}{\|u\|} = \frac{1}{\sqrt{14}}(-1\ \ 3\ -2)^T.$$
Then,
$$Q_1 = I - \frac{2}{\sqrt{14}\cdot\sqrt{14}}\begin{pmatrix} -1\\ 3\\ -2 \end{pmatrix}\begin{pmatrix} -1 & 3 & -2 \end{pmatrix} = I - \frac{1}{7}\begin{pmatrix} 1 & -3 & 2\\ -3 & 9 & -6\\ 2 & -6 & 4 \end{pmatrix} = \begin{pmatrix} 6/7 & 3/7 & -2/7\\ 3/7 & -2/7 & 6/7\\ -2/7 & 6/7 & 3/7 \end{pmatrix}.$$
Now observe:
$$Q_1\cdot A = \begin{pmatrix} 14 & 21 & -14\\ 0 & -49 & -14\\ 0 & 168 & -77 \end{pmatrix}.$$
Further, we need to zero the $(3,2)$ entry. We apply the same process to
$$A' = \begin{pmatrix} -49 & -14\\ 168 & -77 \end{pmatrix}.$$
Finally, we get the matrix of the Householder transformation:
$$Q_2 = \begin{pmatrix} 1 & 0 & 0\\ 0 & -7/25 & 24/25\\ 0 & 24/25 & 7/25 \end{pmatrix}.$$
We find
$$Q = Q_1 Q_2 = \begin{pmatrix} 6/7 & -69/175 & 58/175\\ 3/7 & 158/175 & -6/175\\ -2/7 & 6/35 & 33/35 \end{pmatrix}, \qquad R = Q^T\cdot A = \begin{pmatrix} 14 & 21 & -14\\ 0 & 175 & -70\\ 0 & 0 & -35 \end{pmatrix}.$$
The matrix $Q$ is orthogonal and $R$ is upper triangular, so $A = QR$ is the required QR decomposition.
Computing QR by means of Givens rotations. The QR decomposition can also be computed with a series of Givens rotations, where a Givens rotation is defined by
$$R(\theta) := \begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}$$
and rotates the vector $x \in \mathbb{R}^2$ by an angle $\theta$. Applying such a rotation to the matrix $A$ of the system zeroes one element in the subdiagonal of the matrix. The concatenation of all the Givens rotations forms the orthogonal matrix $Q$.
Example 4.2.12. Let us consider the same matrix as in the previous examples:
$$A = \begin{pmatrix} 12 & -51 & 4\\ 6 & 167 & -68\\ -4 & 24 & -41 \end{pmatrix}.$$
First, we need to form a rotation matrix that will zero the element $a_{31} = -4$. We get this matrix using the Givens rotation method, and call the matrix $G_1$. We rotate the vector $(6,-4)$, using the angle $\theta = \arctan\big(-\frac{4}{6}\big)$. Then,
$$G_1 = \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\theta & \sin\theta\\ 0 & -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0.83 & -0.55\\ 0 & 0.55 & 0.83 \end{pmatrix}$$
and
$$G_1A = \begin{pmatrix} 12 & -51 & 4\\ 7.21 & 125.6 & -33.83\\ 0 & 112.6 & -71.83 \end{pmatrix}.$$
The matrices $G_2$ and $G_3$ will zero the subdiagonal elements $a_{21}$ and $a_{32}$, forming the triangular matrix $R$. The orthogonal matrix $Q^T$ is formed from the concatenation of all the Givens matrices:
$$Q^T = G_3G_2G_1.$$
Thus, we have $G_3G_2G_1A = Q^TA = R$, and the QR decomposition is
$$A = Q\cdot R.$$
Remark 4.2.13. The Givens method for the QR decomposition is more amenable to parallelization than the previous methods.
4.2.1.8 The WZ decomposition
Let us consider the system of linear equations
A · x = b
where A ∈Mn×n(R), x, b ∈Mn×1(R).
The WZ decomposition factors the matrix $A$, dense and nonsingular, into a product of matrices $W$ and $Z$ whose columns $w_i$ and rows $z_i$ have the form:
$$w_i = (0 \dots 0\ \underset{i}{1}\ w_{i+1,i} \dots w_{n-i,i}\ 0 \dots 0)^T, \quad i = 1,\dots,m,$$
$$w_i = (0 \dots 0\ \underset{i}{1}\ 0 \dots 0)^T, \quad i = p,\dots,q,$$
$$w_i = (\underbrace{0 \dots 0}_{n-i+1}\ w_{n-i+2,i} \dots w_{i-1,i}\ 1\ 0 \dots 0)^T, \quad i = q+1,\dots,n,$$
$$z_i = (\underbrace{0 \dots 0}_{i-1}\ z_{ii} \dots z_{i,n-i+1}\ 0 \dots 0), \quad i = 1,\dots,p,$$
$$z_i = (0 \dots 0\ z_{i,n-i+1} \dots z_{ii}\ 0 \dots 0), \quad i = p+1,\dots,n,$$
where
$$m = [(n-1)/2], \qquad p = [(n+1)/2], \qquad q = [(n+1)/2].$$
For example, for $n = 5$, we have:
$$W = \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ w_{21} & 1 & 0 & 0 & w_{25}\\ w_{31} & w_{32} & 1 & w_{34} & w_{35}\\ w_{41} & 0 & 0 & 1 & w_{45}\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad Z = \begin{pmatrix} z_{11} & z_{12} & z_{13} & z_{14} & z_{15}\\ 0 & z_{22} & z_{23} & z_{24} & 0\\ 0 & 0 & z_{33} & 0 & 0\\ 0 & z_{42} & z_{43} & z_{44} & 0\\ z_{51} & z_{52} & z_{53} & z_{54} & z_{55} \end{pmatrix}.$$
After the factorization, we can solve the two linear systems
$$Wc = b$$
and
$$Zx = c$$
instead of one.
Computing the WZ decomposition. The WZ algorithm consists of two parts: reduction of the matrix $A$ (and the vector $b$) to the matrix $Z$ (and the vector $c$), and next, solving the equation $Z\cdot x = c$.
Reduction of matrix $A$.
Step 1. We zero the elements from the 2nd to the $(n-1)$-st rows in the 1st and $n$-th columns, as follows.
Step 1.1. We compute $w_{i1}$ and $w_{in}$ from the linear system
$$a_{11}w_{i1} + a_{n1}w_{in} = -a_{i1},$$
$$a_{1n}w_{i1} + a_{nn}w_{in} = -a_{in}, \quad \text{for } i = 2,\dots,n-1,$$
and we get the matrix
$$W^{(1)} = \begin{pmatrix} 1 & & & & 0\\ w_{21} & 1 & & & w_{2n}\\ \vdots & & \ddots & & \vdots\\ w_{n-1,1} & & & 1 & w_{n-1,n}\\ 0 & & & & 1 \end{pmatrix}.$$
Step 1.2. Compute
$$A^{(1)} = W^{(1)}A, \qquad b^{(1)} = W^{(1)}b.$$
So, we get the linear system $A^{(1)}x = b^{(1)}$, where
$$A^{(1)} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1,n-1} & a_{1n}\\ 0 & a_{22}^{(1)} & \dots & a_{2,n-1}^{(1)} & 0\\ \vdots & \vdots & & \vdots & \vdots\\ 0 & a_{n-1,2}^{(1)} & \dots & a_{n-1,n-1}^{(1)} & 0\\ a_{n1} & a_{n2} & \dots & a_{n,n-1} & a_{nn} \end{pmatrix}, \qquad b^{(1)} = \begin{pmatrix} b_1\\ b_2^{(1)}\\ \vdots\\ b_{n-1}^{(1)}\\ b_n \end{pmatrix},$$
and
$$a_{ij}^{(1)} = a_{ij} + w_{i1}a_{1j} + w_{in}a_{nj}, \quad i = 2,\dots,n-1,\ j = 2,\dots,n-1,$$
$$b_i^{(1)} = b_i + w_{i1}b_1 + w_{in}b_n, \quad i = 2,\dots,n-1.$$
Similarly, we carry out the second step (and the next steps), which consist in the same operations as before, but only on the submatrices of $A^{(1)}$ obtained by deleting the first and the last rows and columns of the matrix $A^{(1)}$.
After $m$ such steps we get the matrix $Z = A^{(m)}$ (of the form defined above) and the vector $c = b^{(m)}$. We have
$$W^{(m)}\dots W^{(1)}A = Z,$$
so
$$A = (W^{(1)})^{-1}\dots(W^{(m)})^{-1}Z = WZ.$$
Step 2. The second part of the method is to solve the linear system $Zx = c$, which consists in solving a linear system with the two unknowns $x_p$ and $x_q$ and next updating the vector $c$. Formally, we have:
Step 2.1. We find $x_p$ and $x_q$ from the system
$$z_{pp}x_p + z_{pq}x_q = c_p,$$
$$z_{qp}x_p + z_{qq}x_q = c_q. \qquad (4.2.10)$$
Step 2.2. Compute
$$c_i^{(1)} = c_i - z_{ip}x_p - z_{iq}x_q, \quad i = 1,\dots,p-1,\ q+1,\dots,n.$$
For odd $n$, system (4.2.10) consists (in the last step) of one equation. Similarly, we carry out the next steps for the inner two-equation systems. There are $m+1$ such steps.
Remark 4.2.14. The WZ-decomposition can be very well parallelized, aswe shall see in Chapter 7.
4.3 Iterative methods
The iterative methods for solving a linear system of equations of the form
$$A\cdot x = b, \qquad (4.3.1)$$
with $A \in M_{n\times n}(\mathbb{R})$, $x, b \in M_{n\times 1}(\mathbb{R})$, generate an approximation of the solution vector $x$ by means of a sequence of successive approximations, provided the method converges. So, in order to solve a system of equations iteratively, the following steps have to be performed:
Step 1. Determine the convergence of the iterative method.
In order to do this, we decompose the matrix A of system (4.3.1) into a sum of three matrices:
A = L+D + U, (4.3.2)
where
L = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ a_{21} & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & 0 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}
and
D = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}.
So, all iterative methods are of the type
Mx(k+1) = Nx(k) + b, (4.3.3)
where A = M - N and x^{(k)}, x^{(k+1)} denote the approximations of the solution x at steps k and k+1, respectively, based on an initial approximation x^{(0)}.
Recalling the notion of the spectral radius of a matrix G, defined as the quantity
ρ(G) = \max\{ |λ| : λ \ \text{eigenvalue of} \ G \},
the following result holds.
Theorem 4.3.1. Let A = M - N be a decomposition of the regular matrix A ∈ M_{n×n}(R) and b ∈ M_{n×1}(R). If the matrix M is regular and ρ(M^{-1}N) < 1, then the iterative method is convergent.
Remark 4.3.2. If the iterative method is convergent, then it makes sense tocontinue with Step 2, in order to get the approximate solution. Otherwise,the method cannot be applied.
Step 2. Starting with an initial value x(0), the sequence of successiveapproximations
x(1), x(2), . . . , x(k), . . . , x(n), . . . (4.3.4)
is generated, according to the iterative method used. Because the method is convergent, the sequence (4.3.4) converges to the exact solution x of system (4.3.1).
Step 3. The stopping condition is ‖x^{(n)} - x^{(n-1)}‖ ≤ ε, for a given ε, where ‖·‖ is the Euclidean norm. Then x is approximated by x^{(n)} or x^{(n-1)}.
Remark 4.3.3. The initial approximation x^{(0)} does not influence the convergence of the method, only its speed.
In what follows, we present different ways of generating the sequence (4.3.4), according to the different iterative methods.
4.3.1 The Jacobi method
Let us write the system (4.3.1) in the following form, assuming a_{ii} ≠ 0, i = 1, ..., n:
x_i = \Big( b_i - \sum_{j=1, j≠i}^{n} a_{ij} x_j \Big) / a_{ii}, \quad i = 1, ..., n.   (4.3.5)
Starting with an initial approximation x_i^{(0)}, i = 1, ..., n, Jacobi's iterative process is defined by the algorithm
x_i^{(k+1)} = \Big( b_i - \sum_{j=1, j≠i}^{n} a_{ij} x_j^{(k)} \Big) / a_{ii}, \quad i = 1, ..., n,   (4.3.6)
or, in matrix form, based on the decomposition (4.3.2) of A,
Dx(k+1) = −(L+ U)x(k) + b. (4.3.7)
So, M_J = D and N_J = -(L + U), and to establish convergence we have to verify the property
ρ(M_J^{-1} N_J) < 1.
4.3.2 The Gauss-Seidel method
Considering again the system (4.3.1) in the form (4.3.5), and starting with the initial approximation x_i^{(0)}, i = 1, ..., n, the Gauss-Seidel iterative process is defined by the algorithm
x_i^{(k+1)} = \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} \Big) / a_{ii}, \quad i = 1, ..., n,   (4.3.8)
or, in matrix form, based on (4.3.2),
(D + L)x(k+1) = −Ux(k) + b,
so M_{GS} = D + L and N_{GS} = -U. To see if the method is convergent, we have to verify
ρ(M_{GS}^{-1} N_{GS}) < 1.
Remark 4.3.4. If the matrix A ∈ M_{n×n}(R) has a strictly dominant diagonal, that is,
|a_{ii}| > \sum_{j=1, j≠i}^{n} |a_{ij}|, \quad i = 1, ..., n,
then the spectral radius ρ(M_J^{-1} N_J) < 1.
Also, if A ∈ Mn×n(R) is a symmetric positive definite matrix, then theGauss-Seidel process is convergent.
Example 4.3.5. Solve, with an error of 10−1, the following system:
A · x = b,
where
A = \begin{pmatrix} 1 & 1 & -1 \\ -1 & 3 & 0 \\ 1 & 0 & -2 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 0 \\ 2 \\ -3 \end{pmatrix},
using the Jacobi method.
Let us represent the matrix A in the form
A = L + D + U = \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -2 \end{pmatrix} + \begin{pmatrix} 0 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
Then, we form the matrices M_J and N_J:
M_J = D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -2 \end{pmatrix},
N_J = -(L + U) = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}.
In order to determine the convergence of the method, we have to compute ρ(M_J^{-1} N_J), where M_J^{-1} N_J is:
M_J^{-1} N_J = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & -1/2 \end{pmatrix} \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 & 1 \\ 1/3 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix}.
The eigenvalues of M_J^{-1} N_J are
λ(M_J^{-1} N_J) = \{ 0, -1/\sqrt{6}, 1/\sqrt{6} \}
and
ρ(M_J^{-1} N_J) = 1/\sqrt{6} < 1,
so the Jacobi method converges. It makes sense to continue by generating the successive approximation sequence, with
M_J^{-1} b = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & -1/2 \end{pmatrix} \begin{pmatrix} 0 \\ 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 0 \\ 2/3 \\ 3/2 \end{pmatrix}
and x^{(0)} = (0 \ 0 \ 0)^T, according to the relation
x^{(k+1)} = M_J^{-1} N_J x^{(k)} + M_J^{-1} b.
We get
x^{(1)} = \begin{pmatrix} 0 & -1 & 1 \\ 1/3 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 2/3 \\ 3/2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0.66667 \\ 1.5 \end{pmatrix},
x^{(2)} = \begin{pmatrix} 0 & -1 & 1 \\ 1/3 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 0.66667 \\ 1.5 \end{pmatrix} + \begin{pmatrix} 0 \\ 2/3 \\ 3/2 \end{pmatrix} = \begin{pmatrix} 0.83333 \\ 0.66667 \\ 1.5 \end{pmatrix},
and so on. After 5 steps, we obtain
x^{(5)} = (0.97222 \ \ 0.99074 \ \ 1.98611)^T,
with the required error, 10^{-1}, taking into account that the exact solution of the system is x = (1 \ 1 \ 2)^T.
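The computation above can be replayed numerically. The following sketch (illustrative, not from the text) iterates x^{(k+1)} = M_J^{-1}N_J x^{(k)} + M_J^{-1}b five times:

```python
# iteration matrix M_J^{-1} N_J and vector M_J^{-1} b of Example 4.3.5
g = [[0.0, -1.0, 1.0],
     [1.0 / 3.0, 0.0, 0.0],
     [0.5, 0.0, 0.0]]
c = [0.0, 2.0 / 3.0, 1.5]

x = [0.0, 0.0, 0.0]          # x^(0)
for _ in range(5):
    x = [sum(g[i][j] * x[j] for j in range(3)) + c[i] for i in range(3)]
# x is now approximately (0.97222, 0.99074, 1.98611), close to (1, 1, 2)
```

The exact values after five steps are the fractions 35/36, 107/108 and 143/72.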
Remark 4.3.6. Following the same scheme, we can solve the system withthe Gauss-Seidel method.
4.3.3 The method of relaxation
Normally, the Gauss-Seidel method is twice as fast as the Jacobi method but, if the quantity ρ(M_{GS}^{-1} N_{GS}) is smaller than one but close to one, then even the Gauss-Seidel speed of convergence decreases. A problem arises: how to accelerate the convergence of the sequence of approximations? The method which solves this problem is known as the method of relaxation (or, sometimes, "successive overrelaxation", SOR).
It is based on the algorithm:
x_i^{(k+1)} = ω \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} \Big) / a_{ii} + (1 - ω) x_i^{(k)}, \quad i = 1, ..., n,
which can be written in matrix form:
(D + ωL) x^{(k+1)} = ((1 - ω)D - ωU) x^{(k)} + ωb,
where M_{SOR} = D + ωL and N_{SOR} = (1 - ω)D - ωU. The problem lies in finding the parameter ω for which ρ(M_{SOR}^{-1} N_{SOR}) is smallest. According to Kahan's theorem, the method can converge only for 0 < ω < 2.
Remark 4.3.7. For ω = 1, the SOR method is the Gauss-Seidel method.
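The relaxation step can be sketched as follows (illustrative, not from the text; the stopping test and the sample system are assumptions):

```python
def sor(a, b, x0, omega, tol=1e-10, max_iter=500):
    """Illustrative sketch of relaxation (SOR): a Gauss-Seidel update
    blended with the previous iterate through the parameter omega."""
    n = len(b)
    x = list(x0)
    for _ in range(max_iter):
        diff = 0.0
        for i in range(n):
            s1 = sum(a[i][j] * x[j] for j in range(i))
            s2 = sum(a[i][j] * x[j] for j in range(i + 1, n))
            xi = omega * (b[i] - s1 - s2) / a[i][i] + (1.0 - omega) * x[i]
            diff = max(diff, abs(xi - x[i]))
            x[i] = xi
        if diff <= tol:
            break
    return x

# symmetric positive-definite system with solution (1, 1); omega in (0, 2)
a = [[4.0, 1.0], [1.0, 3.0]]
b = [5.0, 4.0]
x = sor(a, b, [0.0, 0.0], omega=1.25)
```

Setting omega=1.0 reproduces the Gauss-Seidel iteration, in line with Remark 4.3.7.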
4.3.4 The Chebyshev method
In order to accelerate the convergence of an iterative process, the following Chebyshev method can be used.
Let us suppose that, using an iterative method, we have found the approximations x^{(1)}, ..., x^{(k)} of the solution x of system (4.3.1).
Let
y^{(k)} = \sum_{j=0}^{k} ν_j^{(k)} x^{(j)}.   (4.3.9)
The problem lies in finding the coefficients ν_j^{(k)} in formula (4.3.9) so that the error vector y^{(k)} - x of the approximation y^{(k)} is smaller than the error vector x^{(k)} - x. In the case
x^{(j)} = x, \quad j = 1, ..., k,
it is natural to demand y^{(k)} = x. This is exactly the case when
\sum_{j=0}^{k} ν_j^{(k)} = 1.   (4.3.10)
In order to determine the coefficients needed, we can write
x^{(k)} - x = (M^{-1}N)^k e^{(0)},
then
y^{(k)} - x = \sum_{j=0}^{k} ν_j^{(k)} (x^{(j)} - x) = \sum_{j=0}^{k} ν_j^{(k)} (M^{-1}N)^j e^{(0)} = \sum_{j=0}^{k} ν_j^{(k)} G^j e^{(0)} = p_k(G) e^{(0)},
where G = M^{-1}N and
p_k(z) = \sum_{j=0}^{k} ν_j^{(k)} z^j.   (4.3.11)
From (4.3.10) it follows that
p_k(1) = 1.   (4.3.12)
In addition,
‖y^{(k)} - x‖_2 ≤ ‖p_k(G)‖_2 ‖e^{(0)}‖_2.   (4.3.13)
In the case of a symmetric matrix G, its eigenvalues satisfy the inequalities
-1 < α ≤ λ_n ≤ \cdots ≤ λ_1 ≤ β < 1.
If λ is the eigenvalue corresponding to the eigenvector x, then
p_k(G) x = \sum_{j=0}^{k} ν_j^{(k)} G^j x = \sum_{j=0}^{k} ν_j^{(k)} λ^j x,
i.e., the vector x is also an eigenvector of the matrix p_k(G), corresponding to the eigenvalue
\sum_{j=0}^{k} ν_j^{(k)} λ^j = p_k(λ).
In the case of G symmetric, the matrix p_k(G) is also symmetric. Hence,
‖p_k(G)‖_2 = \max_{λ_i ∈ λ(G)} \Big| \sum_{j=0}^{k} ν_j^{(k)} λ_i^j \Big| ≤ \max_{λ ∈ [α, β]} |p_k(λ)|.   (4.3.14)
To decrease the quantity ‖p_k(G)‖_2 one must find a polynomial p_k(z) that has small values on the segment [α, β] and satisfies condition (4.3.12). The polynomials with these properties are the Chebyshev polynomials. They are defined on the interval [-1, 1] by the recurrence relation
c_j(z) = 2z c_{j-1}(z) - c_{j-2}(z), \quad j = 2, 3, ...,
with c_0(z) = 1 and c_1(z) = z.
These polynomials satisfy the inequality
|c_j(z)| ≤ 1, \quad z ∈ [-1, 1],
and c_j(1) = 1, and the values |c_j(z)| grow quickly outside the segment [-1, 1].
Further, the polynomial
p_k(z) = \frac{c_k(-1 + 2(z - α)/(β - α))}{c_k(µ)},
where
µ = -1 + 2 \frac{1 - α}{β - α} = 1 + 2 \frac{1 - β}{β - α} > 1,
satisfies the condition (4.3.12) and
|p_k(z)| ≤ \frac{1}{|c_k(µ)|} \quad \text{for} \ z ∈ [α, β].
Taking into account relations (4.3.13) and (4.3.14), we find
‖y^{(k)} - x‖_2 ≤ \frac{‖e^{(0)}‖_2}{|c_k(µ)|}.
Therefore, the greater µ is, the greater |c_k(µ)| is and, consequently, the faster the convergence of the method.
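The defining recurrence, the bound |c_j(z)| ≤ 1 on [-1, 1], and the rapid growth of |c_j| outside that segment can all be checked directly. A small illustrative sketch (not from the text):

```python
def cheb(j, z):
    """Evaluate the Chebyshev polynomial c_j(z) via the recurrence
    c_j = 2 z c_{j-1} - c_{j-2}, with c_0(z) = 1 and c_1(z) = z."""
    c_prev, c = 1.0, z
    if j == 0:
        return c_prev
    for _ in range(j - 1):
        c_prev, c = c, 2.0 * z * c - c_prev
    return c

# |c_5| stays bounded by 1 on a sample of [-1, 1] ...
inside = max(abs(cheb(5, -1.0 + 0.1 * k)) for k in range(21))
# ... but grows quickly outside the segment
outside = cheb(5, 1.5)
```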
4.3.5 The method of Steepest Descent
This method is important in its own right, and it also contains the main ideas of another well-known iterative method, the Conjugate Gradient (CG) method.
For didactic reasons, we choose to present in this section only the Steepest Descent method, because it is easy to understand and perhaps not as well known as CG, although it has comparable performance.
Let us consider the linear system of equations
A · x = b, (4.3.15)
where A ∈ M_{n×n}(R) is a symmetric, positive-definite matrix and x, b ∈ M_{n×1}(R).
With the information given by (4.3.15) we generate the following quadratic form:
f(x) = \frac{1}{2} x^T A x - b^T x + c,   (4.3.16)
where c is a scalar constant, which will have the gradient
f'(x) = \frac{1}{2} A^T x + \frac{1}{2} A x - b.   (4.3.17)
If A is symmetric, (4.3.17) reduces to
f ′(x) = Ax− b. (4.3.18)
Remark 4.3.8. We recall that the gradient of a quadratic form is definedto be
f'(x) = \begin{pmatrix} \frac{\partial}{\partial x_1} f(x) \\ \frac{\partial}{\partial x_2} f(x) \\ \vdots \\ \frac{\partial}{\partial x_n} f(x) \end{pmatrix}.
Setting the gradient f'(x) to zero in (4.3.18), we get exactly the system (4.3.15) we have to solve. Therefore, the solution of Ax = b is a critical point of f(x). Under the assumptions made on A, this solution is a minimum of f(x), so Ax = b can be solved by finding an x that minimizes f(x).
Computing the Steepest Descent method. We start with an initial approximation x^{(0)} and generate the sequence of successive approximations x^{(1)}, x^{(2)}, ..., choosing at each step the direction in which f decreases most quickly, which is the direction opposite to f'(x^{(i)}). According to equation (4.3.18), this direction is
-f'(x^{(i)}) = b - A x^{(i)}.   (4.3.19)
Recalling the notion of the residual, denoted by
ri = b− Ax(i), (4.3.20)
which measures how far we are from the correct value of b, and recalling the notion of the error,
ei = x(i) − x, (4.3.21)
it is easy to see that
ri = −Aei (4.3.22)
or, more important,
ri = −f ′(x(i)). (4.3.23)
So, the residual can be considered the direction of the steepest descendent.
Under these considerations, with the initial approximation x^{(0)} given, we shall determine
x^{(1)} = x^{(0)} + α r_0,   (4.3.24)
with α a real value chosen to minimize f along this line.
From basic calculus, α minimizes f when the directional derivative \frac{d}{dα} f(x^{(1)}) is equal to zero:
\frac{d}{dα} f(x^{(1)}) = f'(x^{(1)})^T \frac{d}{dα} x^{(1)} = f'(x^{(1)})^T r_0 = 0.   (4.3.25)
Relation (4.3.25) indicates that α minimizes f if f'(x^{(1)}) and r_0 are orthogonal.
To determine α explicitly, note that f'(x^{(1)}) = -r_1, and then we have:
r_1^T r_0 = 0
(b - A x^{(1)})^T r_0 = 0
(b - A(x^{(0)} + α r_0))^T r_0 = 0
(b - A x^{(0)})^T r_0 - α (A r_0)^T r_0 = 0
(b - A x^{(0)})^T r_0 = α (A r_0)^T r_0
r_0^T r_0 = α r_0^T (A r_0)
α = \frac{r_0^T r_0}{r_0^T A r_0}.
Putting it all together, the method of Steepest Descent is:
r_i = b - A x^{(i)},
α_i = \frac{r_i^T r_i}{r_i^T A r_i},
x^{(i+1)} = x^{(i)} + α_i r_i.
Remark 4.3.9. It can be proved that the method of Steepest Descent isconvergent.
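The three formulas above translate almost line by line into code. A minimal sketch, assuming a symmetric positive-definite matrix; the residual-based stopping test is an assumption, not from the text:

```python
def steepest_descent(a, b, x0, tol=1e-10, max_iter=1000):
    """Illustrative sketch: r_i = b - A x_i, alpha_i = r'r / r'Ar,
    x_{i+1} = x_i + alpha_i r_i, for symmetric positive-definite a."""
    n = len(b)
    x = list(x0)
    for _ in range(max_iter):
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        rr = sum(ri * ri for ri in r)
        if rr ** 0.5 <= tol:       # residual small enough: stop
            break
        ar = [sum(a[i][j] * r[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(r[i] * ar[i] for i in range(n))
        x = [x[i] + alpha * r[i] for i in range(n)]
    return x

# symmetric positive-definite system with solution (1, 1)
a = [[4.0, 1.0], [1.0, 3.0]]
b = [5.0, 4.0]
x = steepest_descent(a, b, [0.0, 0.0])
```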
4.3.6 The multigrid method
Originally introduced as a way to numerically solve elliptic boundary-value problems, multigrid methods and their various multiscale descendants have been developed and applied to many problems, in many disciplines. They are among the fastest solution techniques known today.
Let us consider the linear system of equations
A · x = b, (4.3.26)
with A ∈ M_{n×n}(R), x, b ∈ M_{n×1}(R), which models our problem. So x will be the exact solution. If it is difficult to determine analytically, we try to find an approximation x_h ∈ M_{n×1}(R) which approximates x with the smallest possible error. In order to do this, let us consider the discretized problem attached to (4.3.26), denoted
A · xh = b, (4.3.27)
obtained by means of any discretization method (finite difference, finite ele-ments, etc.).
The continuous definition domain will be covered with a mesh of N points (for a 1D problem), N^2 points (for a 2D problem), etc., with the mesh size
h = \frac{1}{N}, \quad \text{or} \quad h = \frac{1}{N^2}, \ \text{etc.}
Remark 4.3.10. As in the literature, we use the name ”grid” for this mesh.
Computing the Multigrid method. In this case, the algebraic erroris
e = x− xh (4.3.28)
and the residual is
r = b−A · xh. (4.3.29)
Also, the residual equation, which may be deduced from (4.3.28) and (4.3.29), is
A · e = r,   (4.3.30)
which means that the algebraic error satisfies the system (4.3.26) with the residual in the right-hand side.
Having all this in mind, we may form a first, very intuitive, image of the multigrid method:
1. Determine an approximation x_h for x using a classical iteration (Jacobi, Gauss-Seidel, SOR, etc.).
2. Compute the residual, according to (4.3.29).
3. Solve the residual equation (4.3.30).
4. Improve the approximation x_h:
x = x_h + e.
Having this scheme in mind, we need to understand why the method is called multigrid. The answer is found in the way in which the previous steps affect the approximation error.
Using Fourier analysis, one notices that the error vector is composed of two types of waves: more oscillatory ones and smoother ones. In order to reduce the error, we want to smooth out as many components of the error as possible, which means, in fact, to reduce it.
So, the multigrid method consists of two parts: applying the smoothing property and the coarse-grid correction.
The smoothing property. It has been proved that the classical iterations, such as Jacobi, Gauss-Seidel, etc., have this property: they remove the oscillatory components of the error, but leave the smoother components unchanged. That is why we start solving the system (4.3.26) with a number of steps of some classical iterative method, in order to obtain a first approximation x_h of the solution.
The coarse-grid correction. In order to remove the smoother components too, the information is transferred from the fine grid to a coarser one, with half the number of points, because there the components of the error look less smooth.
Remark 4.3.11. The transfer of information is made by means of two operators: one for prolongation (I_{2h}^h) and one for restriction (I_h^{2h}).
The prolongation operator makes the transfer from the coarse grid to the fine grid, according to the formulas:
I_{2h}^h x^{2h} = x^h,
where
x^h_{2j} = x^{2h}_j,
x^h_{2j+1} = \frac{1}{2}(x^{2h}_j + x^{2h}_{j+1}), \quad 0 ≤ j ≤ \frac{N}{2} - 1.   (4.3.31)
The restriction operator makes the transfer from the fine grid to the coarse grid, according to the formula:
I_h^{2h} · x^h = x^{2h},
where
x^{2h}_j = \frac{1}{4}(x^h_{2j-1} + 2 x^h_{2j} + x^h_{2j+1}), \quad 1 ≤ j ≤ \frac{N}{2} - 1.   (4.3.32)
Remark 4.3.12. Formulas (4.3.31) and (4.3.32) show that
I_h^{2h} = c (I_{2h}^h)^T, \quad \text{with} \ c ∈ R.
This fact is very important from a computational point of view.
We may write, in short, as follows.
The Two-Grid Algorithm:
Relax A^h · x = b^h on the fine grid Ω^h and get an initial approximation x^h.
Compute r^{2h} = I_h^{2h}(b^h - A^h x^h).
Solve A^{2h} · e^{2h} = r^{2h} on the coarse grid Ω^{2h}.
Correct x^h ← x^h + I_{2h}^h · e^{2h}.
The Multigrid Algorithm is a recursive one, and implies this exchange of information among several grids, the coarsest one consisting of only one point, where the solution of the system can be determined exactly.
So, the Multigrid Algorithm (MV) is the following:
1. Relax A^h x = b^h and get x^h.
2. If Ω^h is the coarsest grid, then go to 3.
Else
b^{2h} ← I_h^{2h}(b^h - A^h x^h)
x^{2h} ← 0
x^{2h} ← MV(x^{2h}, b^{2h})
3. Correct x^h ← x^h + I_{2h}^h x^{2h}.
Remark 4.3.13. The relaxation procedure may also appear at the end of step 3. The previous Multigrid Algorithm is called the V-cycle algorithm. There exist also W-cycle and combined multigrid algorithms.
Chapter 5
Non-linear equations on R
Let Ω ⊂ R and f : Ω→ R. Consider the equation
f(x) = 0, x ∈ Ω, (5.0.1)
and attach a mapping
F : D → D, D ⊂ Ωn.
Let (x_0, ..., x_{n-1}) ∈ D. Using the mapping F and the numbers x_0, ..., x_{n-1} we construct iteratively the sequence
x_0, x_1, ..., x_{n-1}, x_n, ...,   (5.0.2)
where
x_i = F(x_{i-n}, ..., x_{i-1}), \quad i = n, n+1, ....   (5.0.3)
The problem is to choose F and the numbers x0, ..., xn−1 ∈ D such that thesequence (5.0.2) converges to a solution of equation (5.0.1).
Definition 5.0.14. The method of approximating a solution of equation (5.0.1) by the elements of the sequence (5.0.2), computed as in (5.0.3), is called the F-method attached to the equation (5.0.1) and to the values x_0, ..., x_{n-1}. The numbers x_0, ..., x_{n-1} are called starting values, and the pth element of the sequence (5.0.2) is called the pth-order approximation of the solution.
If the set of the starting values consists of a single element, then the corresponding F-method is called a one-step method; otherwise it is called a multistep method.
209
Definition 5.0.15. If the sequence (5.0.2) converges to a solution of theequation (5.0.1), then the F -method is said to be convergent, otherwise it isdivergent.
Definition 5.0.16. Let x* ∈ Ω be a solution of the equation (5.0.1) and let x_0, ..., x_n, ... be a sequence generated by a given F-method. The number p having the property that
\lim_{x_i → x*} \frac{x* - F(x_{i-n+1}, ..., x_i)}{(x* - x_i)^p} = C ≠ 0
is called the order of the F-method, i.e., ord(F) = p, and the constant C is the asymptotic error.
In the sequel we describe some general procedures of constructing classesof one-step and multistep F -methods, and we discuss also some simple par-ticular cases.
5.1 One-step methods
Let F be a one-step method, i.e., for a given x_i we have
xi+1 = F (xi). (5.1.1)
Theorem 5.1.1. If x* is a solution of the equation (5.0.1), V(x*) is a neighborhood of x* and F ∈ C^p[V(x*)], p ∈ N*, then
ord(F) = p
if and only if
F(x*) = x*, \quad F'(x*) = \cdots = F^{(p-1)}(x*) = 0, \quad F^{(p)}(x*) ≠ 0.   (5.1.2)
Proof. Since F ∈ Cp[V (x∗)], Taylor’s formula gives rise to
F(x_i) = F(x*) + \sum_{k=1}^{p-1} \frac{1}{k!} (x_i - x*)^k F^{(k)}(x*) + \frac{1}{p!} (x_i - x*)^p F^{(p)}(ξ_i),   (5.1.3)
where xi ∈ V (x∗) and ξi belongs to the interval determined by xi and x∗.
Assume that conditions (5.1.2) are verified. Then (5.1.3) implies
F(x_i) = x* + \frac{1}{p!} (x_i - x*)^p F^{(p)}(ξ_i),   (5.1.4)
and it follows that
\lim_{x_i → x*} \frac{F(x_i) - x*}{(x_i - x*)^p} = \frac{1}{p!} F^{(p)}(x*) ≠ 0.
Thus,
ord(F) = p.
Let us now consider p = ord(F). We have to prove that conditions (5.1.2) are satisfied. Assume either that there exists a number r < p - 1 such that F^{(r)}(x*) ≠ 0, or that F^{(p)}(x*) = 0. In both cases, from (5.1.3) and (5.1.4), respectively, it follows that
ord(F) ≠ p.
Theorem 5.1.2. Let x* be a solution of the equation (5.0.1), x_0 ∈ Ω,
I = \{ x ∈ Ω : |x - x*| ≤ |x_0 - x*| \},
F ∈ C^p(I), p the order of F, and
\frac{1}{p!} |F^{(p)}(x)| ≤ M, \quad x ∈ I.
If
M |x0 − x∗|p−1 < 1 (5.1.5)
then
1) x_i ∈ I, i = 0, 1, ...,
2) \lim_{i → ∞} x_i = x*,
where
x_{i+1} = F(x_i), \quad i = 0, 1, ....
Proof. 1) Obviously, x0 ∈ I. Assume that xi ∈ I. Since,
ord(F ) = p
it follows that (5.1.4) holds, and we can write it as
x_{i+1} - x* = M_i (x_i - x*)^p, \quad M_i = \frac{1}{p!} F^{(p)}(ξ_i).   (5.1.6)
Using (5.1.6), the definition of I and the inequality M_i ≤ M on the interval I, it yields
|x_{i+1} - x*| ≤ M |x_i - x*|^p ≤ M |x_0 - x*|^p = M |x_0 - x*|^{p-1} |x_0 - x*| < |x_0 - x*|,
which implies x_{i+1} ∈ I and, by complete induction, it follows that
x_i ∈ I, \quad i = 0, 1, ....
2) One considers again the inequality
|x_{i+1} - x*| ≤ M |x_i - x*|^p, \quad i = 0, 1, ...,
which implies successively
|x_i - x*| ≤ M |x_{i-1} - x*|^p ≤ M M^p |x_{i-2} - x*|^{p^2} ≤ \cdots ≤ M^{1+p+\cdots+p^{i-1}} |x_0 - x*|^{p^i}.
Hence,
|x_i - x*| ≤ M^{-1/(p-1)} (M |x_0 - x*|^{p-1})^{p^i/(p-1)}, \quad \text{for} \ p > 1,
|x_i - x*| ≤ M^i |x_0 - x*|, \quad \text{for} \ p = 1,   (5.1.7)
and, taking into account (5.1.5), it follows that the sequence (x_i)_{i∈N} converges to x*.
Remark 5.1.3. Inequalities (5.1.7) also provide upper bounds for the absolute error in approximating x* by x_i.
Remark 5.1.4. If p = 1, the convergence condition is M < 1, i.e., |F'(x)| < 1, x ∈ I.
Remark 5.1.5. If p > 1 there always exists a neighborhood of x* where the F-method converges.
Next we present some construction procedures for F -methods.
5.1.1 Successive approximations method
One of the easiest ways to construct an F-method consists in writing the equation (5.0.1) in the equivalent form
x = F(x), \quad \text{with} \ F(x) = x + f(x),
and applying the successive approximations method for approximating the fixed points of F, namely,
x_{n+1} = F(x_n), \quad n = 0, 1, ...,
where x_0 is given. From (5.0.1) it follows that f(x*) = 0 if and only if F(x*) = x*, whence any root of f is a fixed point of F, and vice versa. This method is based upon the following result.
Theorem 5.1.6. (Picard-Banach). Let (X, d) be a complete metric spaceand let F : X → X be a contraction (i.e., it satisfies Lipschitz condition witha constant α < 1). Then, the following statements hold:
i) The mapping F has a unique fixed point x∗.
ii) If x0 ∈ X, then the sequence xn+1 = F (xn), n = 0, 1, ... converges tox∗.
iii) One has
d(x_n, x*) ≤ \frac{α^n}{1-α} d(x_0, x_1).
Proof. From
d(x_n, x_{n+1}) = d(F(x_{n-1}), F(x_n)) ≤ α d(x_{n-1}, x_n) ≤ \cdots ≤ α^n d(x_0, x_1)
it follows that
d(x_n, x_{n+p}) ≤ d(x_n, x_{n+1}) + \cdots + d(x_{n+p-1}, x_{n+p})   (5.1.8)
≤ (α^n + \cdots + α^{n+p-1}) d(x_0, x_1)
≤ (α^n + \cdots + α^{n+p-1} + \cdots) d(x_0, x_1)
= \frac{α^n}{1-α} d(x_0, x_1).
Hence, the successive approximations sequence is a Cauchy sequence. Since X is a complete metric space, it follows that this sequence is convergent. Denote its limit by x*. The mapping F satisfies a Lipschitz condition, so it is continuous. Passing to the limit in the equality
x_{n+1} = F(x_n),
it yields
x* = F(x*),
thus x* is a fixed point of F.
We prove now the uniqueness of the fixed point. Assume the contrary, namely that x*_1 and x*_2 are two distinct fixed points of F. One has
0 < d(x*_1, x*_2) = d(F(x*_1), F(x*_2)) ≤ α d(x*_1, x*_2),
which yields the contradiction α ≥ 1.
Finally, passing to the limit in (5.1.8) with respect to p, we obtain
d(x_n, x*) ≤ \frac{α^n}{1-α} d(x_0, x_1).
Remark 5.1.7. The Picard-Banach theorem provides, besides the existence and uniqueness conditions, an effective method for constructing the successive approximations sequence, as well as an upper bound for the error in approximating the fixed point by the approximation of order n. Since the condition
M := \sup_{x ∈ X} |F'(x)| < 1
implies α < 1, one can use in practical applications the inequality M < 1 as a convergence test.
A more practical upper bound for the approximation error is given in thenext result.
Lemma 5.1.8. Let x∗ be a fixed point of F and let (xn)n∈N be the successiveapproximations sequence generated by F, for a given x0, which converges tox∗. If
|F'(x)| < α, \quad x ∈ V(x*),
then the following inequality holds:
|x_n - x*| ≤ \frac{α}{1-α} |x_n - x_{n-1}|.   (5.1.9)
Proof. Denoting f(x) = x− F (x), one has
f ′(x) = 1− F ′(x).
Conditions in the hypothesis imply
|f ′(x)| = |1− F ′(x)| ≥ 1− α, x ∈ V (x∗).
Taking into account that f(x*) = 0, we obtain
|x_n - F(x_n)| = |f(x_n) - f(x*)| = |f'(ξ)| |x_n - x*| ≥ (1 - α) |x_n - x*|,
where ξ belongs to the interval determined by x_n and x*. We have
|x_n - x*| ≤ \frac{1}{1-α} |x_n - F(x_n)| = \frac{1}{1-α} |x_{n+1} - x_n|.
Using again the Lagrange formula, this yields
|x_{n+1} - x_n| = |F(x_n) - F(x_{n-1})| ≤ α |x_n - x_{n-1}|,
whence
|x_n - x*| ≤ \frac{α}{1-α} |x_n - x_{n-1}|.
Remark 5.1.9. Denoting by ε the maximal admissible absolute error, i.e., |x - x*| ≤ ε, from (5.1.9) we obtain
|x_n - x_{n-1}| ≤ \frac{1-α}{α} ε,
which can be used as a stopping criterion for the computation.
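A minimal sketch of the successive approximations method with the stopping test of Remark 5.1.9; the names and the sample mapping F(x) = cos x are illustrative assumptions, not from the text:

```python
import math

def fixed_point(F, x0, alpha, eps=1e-10, max_iter=1000):
    """Iterate x_{n+1} = F(x_n); stop when |x_n - x_{n-1}| <= (1-alpha)/alpha * eps,
    which by (5.1.9) guarantees |x_n - x*| <= eps."""
    x = x0
    for _ in range(max_iter):
        x_new = F(x)
        if abs(x_new - x) <= (1.0 - alpha) / alpha * eps:
            return x_new
        x = x_new
    return x

# F(x) = cos x is a contraction near its fixed point: |F'(x)| = |sin x| < 0.85 there
root = fixed_point(math.cos, 1.0, alpha=0.85)
```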
5.1.2 Inverse interpolation method
Let x* ∈ Ω be a solution of equation (5.0.1) and V(x*) a neighborhood of x*. Assume that f has an inverse on V(x*) and denote g = f^{-1}. Since f(x*) = 0, it follows that x* = g(0).
The inverse interpolation method for approximating a solution x∗ of equa-tion (5.0.1) consists in approximating the inverse g by means of a certain in-terpolation method, and x∗ by the value of the interpolating element at point
zero. This approach generates a large number of approximation methods forthe solution of an equation (thus, for the zeros of a function), according tothe employed interpolation method.
In order to obtain a one-step F-method, all information on f must be given at a single point, namely the starting value. Hence, we are led to Taylor interpolation.
Theorem 5.1.10. Let x* be a solution of equation (5.0.1), V(x*) a neighborhood of x*, f ∈ C^m[V(x*)], such that
f'(x) ≠ 0, \quad \text{for} \ x ∈ V(x*),
and consider x_i ∈ V(x*). Then we have the following method for approximating x*, denoted by F^T_m:
F^T_m(x_i) = x_i + \sum_{k=1}^{m-1} \frac{(-1)^k}{k!} f_i^k g^{(k)}(f_i),   (5.1.10)
where g is the inverse of f and f_i = f(x_i).
Proof. From the hypothesis of this theorem it follows that there exists g = f^{-1} and that g ∈ C^m[V(0)], where V(0) = f(V(x*)).
Let y_i = f(x_i) and consider the Taylor interpolation formula
g(y) = (T_{m-1}g)(y) + (R_{m-1}g)(y),   (5.1.11)
relative to the function g and the node y_i, where T_{m-1}g is the (m-1)th degree Taylor polynomial, namely,
(T_{m-1}g)(y) = \sum_{k=0}^{m-1} \frac{1}{k!} (y - y_i)^k g^{(k)}(y_i),
and R_{m-1}g is the corresponding remainder. Since x* = g(0) and g ≈ T_{m-1}g, it follows that
x* ≈ (T_{m-1}g)(0) = x_i + \sum_{k=1}^{m-1} \frac{(-1)^k}{k!} y_i^k g^{(k)}(y_i),
whence
x_{i+1} := F^T_m(x_i) = x_i + \sum_{k=1}^{m-1} \frac{(-1)^k}{k!} f_i^k g^{(k)}(f_i)
is an approximation of x*, and F^T_m is an approximation method for the solution x*.
Concerning the order of the method F^T_m we state the following result.
Theorem 5.1.11. Let x* be a solution of equation (5.0.1), f ∈ C^m[V(x*)] and f'(x*) ≠ 0. If g = f^{-1} satisfies the condition
g^{(m)}(0) ≠ 0,
then the method F^T_m is of order m.
Proof. Consider the remainder of the Taylor interpolation formula written in Lagrange form, namely,
(R_{m-1}g)(y) = \frac{1}{m!} (y - y_i)^m g^{(m)}(y_i + θ(y - y_i)), \quad 0 < θ < 1.
Since (R_{m-1}g)(0) is the error of approximating x* by (T_{m-1}g)(0), one has
x* - F^T_m(x_i) = \frac{(-1)^m}{m!} f_i^m g^{(m)}(y_i(1 - θ)).   (5.1.12)
Taking into account that f(x*) = 0, it follows that
f(x_i) = f(x_i) - f(x*) = (x_i - x*) f'(ξ_i),   (5.1.13)
where ξ_i belongs to the interval determined by x_i and x*. Then (5.1.13) implies
x* - F^T_m(x_i) = \frac{(-1)^m}{m!} (x_i - x*)^m [f'(ξ_i)]^m g^{(m)}(y_i(1 - θ)),
and thus,
\frac{x* - F^T_m(x_i)}{(x* - x_i)^m} = \frac{1}{m!} [f'(ξ_i)]^m g^{(m)}(y_i(1 - θ)).
Since g^{(m)}(0) ≠ 0 and f'(x*) ≠ 0, we obtain
\lim_{x_i → x*} \frac{x* - F^T_m(x_i)}{(x* - x_i)^m} = \frac{1}{m!} [f'(x*)]^m g^{(m)}(0) ≠ 0,
whence
ord(F^T_m) = m.
Remark 5.1.12. From (5.1.12) we get
|x* - F^T_m(x_i)| ≤ \frac{1}{m!} |f_i|^m M_m g, \quad \text{with} \ M_m = \sup_{y ∈ V(0)} |g^{(m)}(y)|,
which provides an upper bound for the absolute error in approximating x* by x_{i+1}.
Next we present two particular cases.
1) The Newton-Raphson method. This method is obtained for m = 2, namely
F^T_2(x_i) = x_i - \frac{f(x_i)}{f'(x_i)}.
Thus, the Newton-Raphson method (also called Newton's method or the tangent method) is of order p = 2. According to Remark 5.1.5, there always exists a neighborhood of x* where F^T_2 is convergent. Choosing the starting value x_0 in such a neighborhood allows approximating the solution x* by terms of the sequence generated by F^T_2, namely
x_{i+1} = F^T_2(x_i), \quad i = 0, 1, ...,   (5.1.14)
with a prescribed error. As far as the approximation error is concerned, Remark 5.1.12 gives
|x* - F^T_2(x_n)| ≤ \frac{1}{2} [f(x_n)]^2 M_2 g.
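The iteration (5.1.14) can be sketched as follows; the function names, tolerance, and the sample equation x^2 - 2 = 0 are illustrative assumptions, not from the text:

```python
def newton(f, fprime, x0, eps=1e-12, max_iter=100):
    """Sketch of the Newton-Raphson iteration x_{i+1} = x_i - f(x_i)/f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) <= eps:
            return x_new
        x = x_new
    return x

# approximate sqrt(2) as the positive root of f(x) = x^2 - 2
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=2.0)
```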
Lemma 5.1.13. Let x* ∈ (a, b) be a solution of equation (5.0.1) and let x_n = F^T_2(x_{n-1}). Then
|x* - x_n| ≤ \frac{1}{m_1} |f(x_n)|, \quad \text{with} \ 0 < m_1 ≤ \inf_{a ≤ x ≤ b} |f'(x)|.
Proof. We use the mean value formula
f(x*) - f(x_n) = f'(ξ)(x* - x_n),
with ξ belonging to the interval determined by x* and x_n. From
f(x*) = 0
and |f'(x)| ≥ m_1, for x ∈ (a, b), it follows that
|f(x_n)| ≥ m_1 |x* - x_n|,
i.e.,
|x* - x_n| ≤ \frac{1}{m_1} |f(x_n)|.
In practical applications the next estimate is more useful.
Lemma 5.1.14. Let x* ∈ (a, b) be a solution of equation (5.0.1) and let x_n = F^T_2(x_{n-1}) be the nth order approximation generated by F^T_2. If f ∈ C^2[a, b] and F^T_2 is convergent, then there exists n_0 ∈ N such that
|x_n - x*| ≤ |x_n - x_{n-1}|, \quad n > n_0.
Proof. We start with Taylor's formula:
f(x_n) = f(x_{n-1}) + (x_n - x_{n-1}) f'(x_{n-1}) + \frac{1}{2} (x_n - x_{n-1})^2 f''(ξ),
where ξ belongs to the interval determined by x_{n-1} and x_n. Since
x_n = F^T_2(x_{n-1}),
it follows that
f(x_{n-1}) + (x_n - x_{n-1}) f'(x_{n-1}) = 0.
Thus, we obtain
f(x_n) = \frac{1}{2} (x_n - x_{n-1})^2 f''(ξ).
Consequently,
|f(x_n)| ≤ \frac{1}{2} M_{2f} (x_n - x_{n-1})^2,
and using Lemma 5.1.13 yields
|x* - x_n| ≤ \frac{1}{2 m_1} (x_n - x_{n-1})^2 M_{2f}.
Since F^T_2 is convergent, there exists n_0 ∈ N such that
\frac{1}{2 m_1} |x_n - x_{n-1}| M_{2f} < 1, \quad n > n_0,
whence
|x* - x_n| ≤ |x_n - x_{n-1}|, \quad n > n_0.
In general, choosing the starting value is a non-trivial problem. Often this value is chosen randomly, though taking into account the available information about the function f. If after a fixed number of iterations the required precision has not been achieved, i.e., the condition |x_n - x_{n-1}| ≤ ε does not hold for a prescribed positive ε, the computation has to be started over with a new starting value.
For certain classes of functions the problem of choosing the starting value can be significantly simplified. For example, if the solution x* is isolated in the interval (a, b) and f''(x) ≠ 0, x ∈ (a, b), then one can choose as starting value in the F^T_2-method one of the endpoints a or b. More precisely,
x_0 = a, \quad \text{if} \ f(a) f''(c) > 0 \ \text{or} \ f(b) f''(c) < 0,
x_0 = b, \quad \text{if} \ f(a) f''(c) < 0 \ \text{or} \ f(b) f''(c) > 0,
where c ∈ (a, b) (for example, c = (a + b)/2).
This statement can be intuitively justified using the geometric interpretation of the method. Consider, for example, the case f(a) < 0, f(b) > 0 and f''(x) < 0, for x ∈ (a, b). It follows immediately that x_0 = a. Indeed,
f(a) f''\big(\frac{a+b}{2}\big) > 0.
Next we consider the second particular case.
2) Case m = 3. Using (5.1.10), we obtain
F^T_3(x_i) = x_i - \frac{f(x_i)}{f'(x_i)} - \frac{1}{2} \Big[ \frac{f(x_i)}{f'(x_i)} \Big]^2 \frac{f''(x_i)}{f'(x_i)},   (5.1.15)
which is a method of order p = 3 and thus converges faster than F^T_2.
Remark 5.1.15. The higher the order of a method, the faster it converges. Still, this does not mean that a higher order approximation method is more efficient (taking into account the computational requirements) than a lower order one. On the contrary, the results in the literature show that methods of relatively low order are the most efficient ones, due to their simplicity. Within the class of one-step methods, the most efficient methods (in the above sense) are considered to be F^T_2 and F^T_3.
5.2 Multistep methods
Recall that if the set of the starting values consists of more than one element,the corresponding F -method is called multistep method.
In the sequel we will also use the inverse interpolation method to generate multistep methods. Obviously, in this case an interpolation method with more than one interpolation node has to be used.
Let x∗ ∈ Ω be a solution of equation (5.0.1), let (a, b) ⊂ Ω be a neighbor-hood of x∗ that isolates this solution and x0, ..., xn ∈ (a, b) some given values.Denote by g the inverse function of f, supposing it exists. Because x∗ = g(0),the problem reduces to approximating the inverse function g by means of aninterpolation method with n > 1 nodes, for example Lagrange, Hermite,Birkhoff, spline, etc. We discuss first the use of Lagrange interpolation. Let
yk = f(xk), k = 0, ..., n,
whence x_k = g(y_k).
To the data yk and g(yk), k = 0, ..., n, we attach the Lagrange interpolationformula:
g = Lng +Rng, (5.2.1)
where
(L_n g)(y) = \sum_{k=0}^{n} \frac{(y - y_0) \cdots (y - y_{k-1})(y - y_{k+1}) \cdots (y - y_n)}{(y_k - y_0) \cdots (y_k - y_{k-1})(y_k - y_{k+1}) \cdots (y_k - y_n)} g(y_k).
If g ∈ C^{n+1}[c, d], where [c, d] is the image of the interval [a, b] through the function f, then
(R_n g)(y) = \frac{u(y)}{(n+1)!} g^{(n+1)}(η), \quad c < η < d,   (5.2.2)
where u(y) = (y - y_0) \cdots (y - y_n).
Taking
F^L_n(x_0, ..., x_n) = (L_n g)(0),
it follows that F^L_n is an (n+1)-step method defined by
F^L_n(x_0, ..., x_n) = \sum_{k=0}^{n} \frac{f_0 \cdots f_{k-1} f_{k+1} \cdots f_n}{(f_0 - f_k) \cdots (f_{k-1} - f_k)(f_{k+1} - f_k) \cdots (f_n - f_k)} x_k,   (5.2.3)
with f_k = f(x_k).
Concerning the convergence of this method we state the following result.
Theorem 5.2.1. If x* ∈ (a, b) is a solution of the equation (5.0.1), f' is bounded on (a, b), and the starting values satisfy
|x* - x_k| < \frac{1}{c}, \quad k = 0, ..., n,
with the constant c given in the proof, then the sequence
x_{i+1} = F^L_n(x_{i-n}, ..., x_i), \quad i = n, n+1, ...
converges to x*.
Proof. From formulas (5.2.1)-(5.2.2) we get
x* - x_{n+1} = (R_n g)(0)
and
x* - x_{n+1} = (-1)^{n+1} c_g \prod_{k=0}^{n} f_k,
where
c_g = \frac{1}{(n+1)!} g^{(n+1)}(η).
Using the mean value formula, we have
f(x_k) = f(x_k) - f(x*) = (x_k - x*) f'(ξ_k),
whence
x* - x_{n+1} = c_g \prod_{k=0}^{n} (x* - x_k) f'(ξ_k).
Since f' is bounded, it follows that there exists a constant c such that
c |x* - x_{n+1}| ≤ \prod_{k=0}^{n} c |x* - x_k|.   (5.2.4)
Taking into account that
|x* - x_k| < \frac{1}{c},
we obtain
c |x* - x_{n+1}| < 1.
Denoting M = max_{0≤k≤n} (c |x∗ − x_k|), it follows that there exist numbers a_k > 0 such that

c |x∗ − x_k| = M^{a_k},

for k = 0, ..., n. Then, from (5.2.4) we get

M^{a_{n+1}} ≤ M^{a_0 + ... + a_n},

thus, since 0 < M < 1,

a_{n+1} ≥ a_0 + ... + a_n.
Considering as starting values x_{i−n}, ..., x_i and applying the above arguments for i = n + 1, n + 2, ..., we obtain

a_{n+i} ≥ a_{i−1} + ... + a_{n+i−1},  i = 1, 2, . . . . (5.2.5)

Since a_i > 0, it follows that the sequence a_0, ..., a_n, ... is monotonically increasing and

lim_{i→∞} a_i = ∞.

Indeed, if

lim_{i→∞} a_i = α < ∞,

inequalities (5.2.5) lead to the contradiction α ≥ (n + 1)α, for n ≥ 1. Consequently, condition 0 < M < 1 implies

lim_{i→∞} c |x∗ − x_i| = lim_{i→∞} M^{a_i} = 0,

thus,

lim_{i→∞} x_i = x∗, (c ≠ 0).
Remark 5.2.2. For n = 1, we get

F_1^L(x_0, x_1) = x_1 − (x_1 − x_0) f(x_1) / [f(x_1) − f(x_0)],

which is called the secant method. Thus,

x_{k+1} := F_1^L(x_{k−1}, x_k) = x_k − (x_k − x_{k−1}) f(x_k) / [f(x_k) − f(x_{k−1})],  k = 1, 2, ...

is the new approximation obtained using the previous approximations x_{k−1}, x_k.
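As a quick illustration, the secant iteration of Remark 5.2.2 can be sketched as follows (the test function and starting values are illustrative choices, not taken from the text):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Iterate x_{k+1} = x_k - (x_k - x_{k-1}) f(x_k) / (f(x_k) - f(x_{k-1}))."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:            # the division in F_1^L would fail; stop here
            break
        x0, x1 = x1, x1 - (x1 - x0) * f1 / (f1 - f0)
        if abs(x1 - x0) < tol:
            break
    return x1

# Example: the positive root of x^2 - 2, i.e. sqrt(2)
root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
```

Each step uses only the two most recent approximations, in line with the two-step character of F_1^L.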
Practical F-methods for approximating the zeros of a function can be obtained using inverse Hermite or Birkhoff interpolation. Assume x∗ is a solution of equation (5.0.1), V(x∗) is a neighborhood of x∗ such that g = f^{−1} exists on V(x∗), and x_0, x_1 ∈ V(x∗).

Consider the Birkhoff-type interpolation formula

g(y) = (B_1 g)(y) + (R_1 g)(y),

where

(B_1 g)(y) = (y − y_1) g′(y_0) + g(y_1),

with y_0 = f(x_0), y_1 = f(x_1), and for y_0 < y_1

(R_1 g)(y) = (1/2)(y − y_1)(y + y_1 − 2y_0) g″(η), (5.2.6)

with η ∈ [y_0, y_1], for every y ∈ [y_0, y_1]. Taking F_1^B(x_0, x_1) = (B_1 g)(0), we obtain the F_1^B-method, defined by

F_1^B(x_0, x_1) = x_1 − f(x_1)/f′(x_0).
Therefore we can state the following result.
Theorem 5.2.3. If the following conditions hold:

i) f(x_0) < 0 < f(x_1);

ii) f′(x) exists, is finite, and f′(x) > 0, for x ∈ [x_0, x_1];

iii) f″(x) exists, is finite, and f″(x) ≤ 0, for x ∈ (x_0, x_1);

then

1) equation (5.0.1) has a unique solution x∗ and x_0 < x∗ < x_1;

2) the sequence (x_n)_{n∈N}, defined by the F_1^B method, namely,

x_{n+1} = F_1^B(x_0, x_n),  n = 1, 2, ... (5.2.7)

converges to x∗. Moreover, we have

x∗ ≤ x_{n+1} ≤ x_n,  n = 1, 2, ... (5.2.8)
Proof. Existence and uniqueness follow immediately from conditions i) and ii). Using complete induction we first prove the inequalities (5.2.8). Indeed, we have x∗ < x_1. Assume x∗ < x_n. Using the remainder expression (5.2.6), we obtain

x∗ − x_{n+1} = (R_1 g)(0) = (1/2) f(x_n)[f(x_n) − 2f(x_0)] · f″(ξ_n) / [f′(ξ_n)]^3.
From the inequalities
f(xn) ≥ 0, (xn ≥ x∗), f(x0) < 0, f ′(x) > 0, f ′′(x) ≤ 0,
for x ∈ (x0, x1), it follows
x∗ − xn+1 ≤ 0,
so
x∗ ≤ xn+1,
and thus the first inequality in (5.2.8) is proved. From (5.2.7), we get
x_{n+1} − x_n = −f(x_n)/f′(x_0) ≤ 0,
whence x_{n+1} ≤ x_n, and (5.2.8) is completely verified. Consequently, (x_n)_{n∈N} is a decreasing and bounded sequence, thus it is convergent. We denote by x̄ its limit. Then, (5.2.7) implies

lim_{n→∞} f(x_n)/f′(x_0) = lim_{n→∞} (x_{n+1} − x_n) = 0,

so

lim_{n→∞} f(x_n) = f(x̄) = 0,

and

x_0 < x̄ < x_1.

Since x∗ is unique, it follows x̄ = x∗, thus, the theorem is completely proved.
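A minimal sketch of the F_1^B iteration (5.2.7), with the derivative frozen at x_0. The test function f(x) = x − e^{−x} is an illustrative choice (not from the text) that satisfies conditions i)–iii) on [0, 1]: f(0) < 0 < f(1), f′ > 0, f″ ≤ 0.

```python
import math

def fb1(f, dfx0, x1, tol=1e-12, max_iter=200):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_0); dfx0 = f'(x_0) is fixed."""
    xn = x1
    for _ in range(max_iter):
        step = f(xn) / dfx0
        xn -= step
        if abs(step) < tol:
            break
    return xn

f = lambda x: x - math.exp(-x)     # root near 0.567143
dfx0 = 1 + math.exp(-0.0)          # f'(x0) with x0 = 0
root = fb1(f, dfx0, 1.0)
```

Consistent with (5.2.8), the iterates decrease monotonically toward the root from x_1.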
In the sequel we present the F_2^H-method obtained by using inverse Hermite interpolation relative to the function g = f^{−1}, the simple node y_0 = f(x_0) and the double node y_1 = f(x_1), namely,

g(y) = (H_2 g)(y) + (R_2 g)(y),
where

(H_2 g)(y) = [(y − y_1)^2 / (y_0 − y_1)^2] g(y_0) + [(y − y_0)(2y_1 − y_0 − y) / (y_0 − y_1)^2] g(y_1) − [(y − y_0)(y − y_1) / (y_0 − y_1)] g′(y_1)

and

(R_2 g)(y) = [(y − y_0)(y − y_1)^2 / 6] g′′′(η),  y_0 ≤ η ≤ y_1.

We obtain

F_2^H(x_0, x_1) = x_1 − [f(x_1) / (f(x_0) − f(x_1))]^2 (x_1 − x_0) − [f(x_1) / (f(x_0) − f(x_1))] · f(x_0)/f′(x_1) (5.2.9)

and

x∗ − F_2^H(x_0, x_1) = −[f(x_0) f^2(x_1) / 6] g′′′(η_0). (5.2.10)
Theorem 5.2.4. If f satisfies the requirements of Theorem 5.2.3, g′′′(y) ≥ 0, for y ∈ [y_0, y_1], and (x_n)_{n∈N} is the sequence generated by F_2^H, i.e.,

x_{n+1} = F_2^H(x_{n−1}, x_n),  n = 1, 2, ...

then

(x∗ − x_n)(x∗ − x_{n+1}) ≤ 0,  n = 1, 2, ...

and

lim_{n→∞} x_n = x∗.
Proof. From equalities (5.2.9), (5.2.10) and the hypothesis of the theorem it follows x_2 < x_1 and x_2 ≤ x∗. Since f(x_2) < 0 (f(x_2) = 0 implies x_2 = x∗) and f(x_1) > 0, we also get x_3 > x_2 and x∗ ≤ x_3. Assume now that x_{n−1} ≤ x∗ < x_n. Then f(x_{n−1}) ≤ 0 and f(x_n) ≥ 0, thus x_{n+1} ≤ x_n and x_{n+1} ≤ x∗, so the first claim is fulfilled.
In order to prove that lim_{n→∞} x_n = x∗, notice first from the construction of F_2^H that x_n ∈ (x_0, x_1), n = 2, ..., i.e., (x_n)_{n∈N} is a bounded sequence, hence it contains a convergent subsequence (x_{n_k})_{k∈N}. Let x̄ = lim_{k→∞} x_{n_k}. Then

lim_{k→∞} (x_{n_k+1} − x_{n_k}) = 0,

and (5.2.9) implies

lim_{k→∞} f(x_{n_k}) = 0 = f(x̄),

where x̄ ∈ (x_0, x_1). Finally, from the uniqueness of the solution x∗ we get x̄ = x∗.
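The F_2^H step (5.2.9) can be sketched as below; the test function is the same illustrative f(x) = x − e^{−x} used before (an assumption, not an example from the text), with f′ supplied analytically:

```python
import math

def fh2_step(f, df, x0, x1):
    """One step of formula (5.2.9)."""
    f0, f1 = f(x0), f(x1)
    if f0 == f1:                     # guard: the quotient is undefined
        return x1
    a = f1 / (f0 - f1)
    return x1 - a * a * (x1 - x0) - a * f0 / df(x1)

def fh2(f, df, x0, x1, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = F_2^H(x_{n-1}, x_n)."""
    for _ in range(max_iter):
        x0, x1 = x1, fh2_step(f, df, x0, x1)
        if abs(x1 - x0) < tol:
            break
    return x1

f = lambda x: x - math.exp(-x)
df = lambda x: 1 + math.exp(-x)
root = fh2(f, df, 0.0, 1.0)
```

As Theorem 5.2.4 suggests, consecutive iterates straddle the root, closing in from both sides.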
5.3 Combined methods
In applications one needs to approximate the solution of an equation with a prescribed error. Usually, this error is evaluated by means of the distance between two consecutive approximations, say |x_n − x_{n−1}|. Since the inequality |x_n − x_{n−1}| ≤ ε does not always imply |x_n − x∗| ≤ ε, where ε ∈ R_+ is given, the so-called combined methods turn out to be very useful: the solution x∗ is approximated both from below and from above, so it always lies between two consecutive approximations, namely

x_n^s ≤ x∗ ≤ x_n^d,  n = 1, 2, ...

In such cases, the error in approximating x∗ by (x_n^s + x_n^d)/2 is not bigger than (x_n^d − x_n^s)/2. In other words, the approximation error can be strictly controlled. In the sequel we describe some simple combined methods.
1) The tangent-secant method

If the function f is convex or concave on an interval (a, b) that isolates the solution x∗ of equation (5.0.1), i.e., f″(x) ≠ 0, x ∈ (a, b), then both successive approximation sequences generated by the tangent method and the secant method are monotone. Namely, if the first sequence is increasing, the second one is decreasing, and vice versa. The endpoints a and b of the interval can be respectively chosen as starting values. Thus, in order to approximate x∗ ∈ (a, b) we can apply the tangent and the secant methods simultaneously. Denoting by x_k^t and x_k^s, respectively, the corresponding approximations of order k, we obtain:
(x_0^t, x_0^s) ⊃ (x_1^t, x_1^s) ⊃ ... ⊃ (x_n^t, x_n^s) ⊃ ..., (5.3.1)

or

(x_0^s, x_0^t) ⊃ (x_1^s, x_1^t) ⊃ ... ⊃ (x_n^s, x_n^t) ⊃ ..., (5.3.2)

where

x_{k+1}^t = x_k^t − f(x_k^t)/f′(x_k^t),  k = 0, 1, ...,

x_{k+1}^s = x_k^s − (x_k^s − x_{k+1}^t) f(x_k^s) / [f(x_k^s) − f(x_{k+1}^t)],  k = 0, 1, ...,

depending on whether f(a) f″((a+b)/2) > 0 or f(a) f″((a+b)/2) < 0. In the first case we have x_0^t = a, x_0^s = b, and x∗ ∈ (x_n^t, x_n^s), while in the second one x_0^s = a, x_0^t = b and x∗ ∈ (x_n^s, x_n^t). Successive approximations will be computed until

|x_n^t − x_n^s| ≤ 2ε,

where ε is a prescribed positive number. When the above inequality is satisfied, we take

x∗ ≈ x̄ = (x_n^t + x_n^s)/2,

and so,

|x∗ − x̄| ≤ ε.
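A sketch of the combined sweep: one tangent (Newton) step and one secant step per pass, with the convention (an assumption matching the cases above) that xt starts at the endpoint where f · f″ > 0. The test problem x² − 2 = 0 on (1, 2) is illustrative; there that endpoint is b = 2.

```python
def tangent_secant(f, df, xt, xs, eps=1e-10, max_iter=100):
    """One tangent step and one secant step per sweep; the root stays
    bracketed between xt and xs, so (xt + xs)/2 has error at most eps."""
    for _ in range(max_iter):
        xt = xt - f(xt) / df(xt)                         # tangent iterate
        xs = xs - (xs - xt) * f(xs) / (f(xs) - f(xt))    # secant toward new xt
        if abs(xt - xs) <= 2 * eps:
            break
    return (xt + xs) / 2

approx = tangent_secant(lambda x: x * x - 2, lambda x: 2 * x, 2.0, 1.0)
```

Note that the secant step uses the freshly computed x_{k+1}^t, exactly as in the displayed formula.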
2) The tangent−F_1^B method
This method is based on the following result.
Lemma 5.3.1. If the requirements of Theorem 5.2.3 are fulfilled, namely, if:

i) f(x_0) < 0 < f(x_1);

ii) f′(x) exists, is finite, and f′(x) > 0, for x ∈ [x_0, x_1];

iii) f″(x) exists, is finite, and f″(x) ≤ 0, for x ∈ (x_0, x_1);

then the sequence

x_0, x_1, ..., x_n, ...

generated by the iterations

x_{2n} = x_{2n−2} − f(x_{2n−2})/f′(x_{2n−2}) (5.3.3)

and

x_{2n+1} = x_{2n−1} − f(x_{2n−1})/f′(x_{2n−2}), (5.3.4)

n = 1, 2, ..., converges to x∗. Moreover, x∗ ∈ [x_{2n}, x_{2n+1}], for all n ∈ N.
Proof. Notice that the iteration in (5.3.3) is the one of Newton's method (i.e., the tangent method), and the inequalities

x_{2n} ≤ x_{2n+2} ≤ x∗,  n = 0, 1, ...,

are satisfied under the given hypotheses.
Furthermore, from Theorem 5.2.3 we get

x∗ ≤ x_{2n+1} ≤ x_{2n−1},  n = 1, 2, . . . .

This method has the advantage that both components use the value of the derivative f′ at the same point.
Remark 5.3.2. A much simpler combined method can be obtained using (5.3.3) and (5.3.4), for a fixed x_0, namely

x_{2n} = x_{2n−2} − f(x_{2n−2})/f′(x_0),

x_{2n+1} = x_{2n−1} − f(x_{2n−1})/f′(x_0),  n = 1, 2, ...,

where only the value of f′ at x_0 is needed. This method is recommended whenever computing a value of the derivative involves a lot of work. Other combined methods can be obtained using F_1^B, F_2^H, F_3^T, the tangent and the secant methods.
Chapter 6
Non-linear equations on Rn
Let D ⊆ R^n, f_i : D → R, i = 1, ..., n, and consider the system:

f_i(x_1, ..., x_n) = 0,  i = 1, ..., n;  (x_1, ..., x_n) ∈ D. (6.0.1)

Taking the map f : D → R^n such that

(x_1, ..., x_n) ↦ (f_1(x_1, ..., x_n), ..., f_n(x_1, ..., x_n)),

system (6.0.1) can also be written as the vector equation

f(x) = 0,  x ∈ D, (6.0.2)

where 0 is the null element of the space R^n. In the sequel we describe two methods for approximating the solutions of a non-linear system on R^n.
6.1 Successive approximation method
Consider first the system (6.0.1) written in the form
xi = ϕi(x1, ..., xn), i = 1, ..., n; (x1, ..., xn) ∈ D, (6.1.1)
where the functions ϕi : D → R are continuous on D and such that for all(x1, ..., xn) ∈ D one has
(ϕ1(x1, ..., xn), ..., ϕn(x1, ..., xn)) ∈ D.
Let x = (x_1, ..., x_n) and ϕ = (ϕ_1, ..., ϕ_n); then the system can be written as

x = ϕ(x),  x ∈ D. (6.1.2)

As in the real case, for a given starting value x^(0), one generates the sequence

x^(0), x^(1), ..., x^(m), ..., (6.1.3)

defined by

x^(m+1) = ϕ(x^(m)),  m = 0, 1, . . . . (6.1.4)

If the sequence (6.1.3) is convergent (the ϕ method is convergent), and x∗ ∈ D denotes its limit, then x∗ is a solution of equation (6.1.2). Indeed, passing to the limit in (6.1.4) and taking into account that ϕ is a continuous function, we obtain

lim_{m→∞} x^(m+1) = ϕ(lim_{m→∞} x^(m)),

that is

x∗ = ϕ(x∗).
The convergence of the ϕ method remains to be studied. To that end we can use the Picard-Banach theorem: if ϕ : R^n → R^n satisfies the contraction condition

‖ϕ(x) − ϕ(y)‖ ≤ α ‖x − y‖,  x, y ∈ R^n,

with 0 < α < 1, then there exists a unique element x∗ ∈ R^n which verifies equation (6.1.2) and it is the limit of the sequence (6.1.3). The error in approximating x∗ by x^(n) is evaluated by

‖x∗ − x^(n)‖ ≤ [α^n / (1 − α)] ‖x^(1) − x^(0)‖.
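A sketch of the successive approximation method on R^2. The contraction ϕ below is an illustrative choice, not an example from the text; its Jacobian norm is at most 1/2, so α ≤ 1/2.

```python
import numpy as np

def successive_approx(phi, x0, tol=1e-12, max_iter=1000):
    """Generate x^(m+1) = phi(x^(m)) until consecutive iterates agree."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = phi(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Illustrative contraction on R^2 with alpha <= 1/2
phi = lambda v: np.array([0.5 * np.cos(v[1]), 0.5 * np.sin(v[0])])
xstar = successive_approx(phi, [0.0, 0.0])
```

With α = 1/2, the a-priori bound above guarantees the error shrinks at least like (1/2)^n.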
Remark 6.1.1. As in the case of equations on R, a sufficient condition for a continuous function ϕ, with first order partial derivatives continuous on a domain D ⊂ R^n, to be a contraction is that

‖ϕ′‖ ≤ q < 1, on D,

where the norm of ϕ′ is the norm of the Jacobian matrix

ϕ′ = (∂ϕ_i/∂x_j),  i, j = 1, ..., n.

For example,

‖ϕ′‖ = max_{1≤i≤n} Σ_{j=1}^{n} max_{x∈D} |∂ϕ_i(x)/∂x_j|.
Remark 6.1.2. If ϕ is defined on a subdomain of R^n, for example, on the sphere

S := S(x^(0), r) = {x : ‖x − x^(0)‖ < r},  r ∈ R, x^(0) ∈ R^n,

then the supplementary condition

‖x^(0) − ϕ(x^(0))‖ ≤ (1 − α) r

has to be fulfilled in order to guarantee the convergence of the corresponding iterative process. This condition makes sure that ϕ(x) ∈ S, provided x ∈ S.
The successive approximation method can also be applied to equations of the form (6.0.2), where f is a function defined and continuous in a neighborhood of the solution vector x∗. We first write equation (6.0.2) in the form
x = x+Mf(x),
where M is a non-singular matrix. Denoting
x+Mf(x) = ϕ(x), (6.1.5)
equation (6.0.2) becomes x = ϕ(x), so the successive approximations methodcan now be applied. Next, we determine the matrix M such that the methodϕ = I +Mf is convergent. For this, we use condition
‖ϕ′‖ ≤ q < 1
and the obvious observation is that the smaller ‖ϕ′‖ is, the faster the iterativeprocess converges. From (6.1.5) we get
ϕ′(x) = I +Mf ′(x).
We determine the matrix M such that for the starting value x^(0) one has

ϕ′(x^(0)) = 0.

It follows

I + M f′(x^(0)) = 0,

thus, if f′(x^(0)) is non-singular we obtain

M = −(f′(x^(0)))^{−1}.

If f′(x^(0)) is singular, another starting value x^(0) has to be chosen.
6.2 Newton’s method
Consider again equation (6.0.2). Let x∗ ∈ D be a solution of this equation and let x^(p) be an approximation of x∗. Denoting by ε^(p) the error produced in approximating x∗ by x^(p), we have

x∗ = x^(p) + ε^(p).

Hence,

f(x^(p) + ε^(p)) = 0. (6.2.1)
Assuming that f is differentiable in a convex domain that contains x∗ and x^(p), the left-hand side of (6.2.1) can be approximated by the first two terms of the Taylor expansion of f in the neighborhood of x^(p), namely,
f(x^(p) + ε^(p)) ≈ f(x^(p)) + f′(x^(p)) ε^(p).

Thus, (6.2.1) can be approximated by

f(x^(p)) + f′(x^(p)) ε^(p) = 0, (6.2.2)

which leads to the following linear algebraic system of equations with the unknowns ε_i^(p), i = 1, ..., n:

(∂f_i/∂x_1)(x_1^(p), ..., x_n^(p)) ε_1^(p) + ... + (∂f_i/∂x_n)(x_1^(p), ..., x_n^(p)) ε_n^(p) = −f_i(x_1^(p), ..., x_n^(p)),  i = 1, ..., n. (6.2.3)
Denoting

w(x^(p)) = f′(x^(p)) = (∂f_i/∂x_j (x^(p)))_{i,j=1,...,n}

and in the hypothesis that w(x^(p)) is a non-singular matrix, we obtain

ε^(p) = −w^{−1}(x^(p)) f(x^(p)),

where w^{−1}(x^(p)) is the inverse of the matrix w(x^(p)). Thus, we get a value for the correction vector ε^(p).
If we denote by x(p+1) the approximation obtained by applying to x(p) thecorrection ε(p), we reach the iterative method:
x(p+1) = x(p) − w−1(x(p))f(x(p)), p = 0, 1, ..., (6.2.4)
which is called Newton's method. Passing to the limit in (6.2.4), it immediately follows that if (x^(p))_{p∈N} is a convergent sequence with limit denoted by x∗, then x∗ is the solution of the discussed equation.
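A minimal sketch of iteration (6.2.4). Instead of forming w^{−1}(x^(p)) explicitly, each step solves the linear system (6.2.3) for the correction ε^(p). The 2×2 test system is an illustrative choice, not from the text.

```python
import numpy as np

def newton_system(f, jac, x0, tol=1e-12, max_iter=50):
    """x^(p+1) = x^(p) + eps^(p), where w(x^(p)) eps^(p) = -f(x^(p))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        eps = np.linalg.solve(jac(x), -f(x))   # correction vector
        x = x + eps
        if np.linalg.norm(eps) < tol:
            break
    return x

# Illustrative system: x^2 + y^2 = 1, x - y = 0  =>  x = y = 1/sqrt(2)
f = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
jac = lambda v: np.array([[2 * v[0], 2 * v[1]], [1.0, -1.0]])
sol = newton_system(f, jac, [1.0, 0.5])
```

Solving w ε = −f is cheaper and numerically safer than inverting w at every step.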
Concerning the convergence of the sequence we can state the followingresult.
Theorem 6.2.1. Let f ∈ C^2(D), x^(0) ∈ D and r ∈ R such that

S_0(x^(0), r) = {x : ‖x − x^(0)‖ ≤ r} ⊂ D.
If the following conditions are verified:
i) there exists the matrix M0 = w−1(x(0)) and ‖M0‖ ≤ A0,
ii) ‖M_0 f(x^(0))‖ ≤ B_0 ≤ r/2,
iii) ‖f ′′(x)‖ ≤ C, x ∈ S0,
iv) η_0 := 2nA_0B_0C ≤ 1,
then, for the starting value x^(0), the iterative process (6.2.4) is convergent, the vector

x∗ = lim_{p→∞} x^(p)

is the solution of the given equation and one has
‖x∗ − x^(k)‖ ≤ (1/2^{k−1}) η_0^{2^k − 1} B_0.
Proof. We start by showing that all the conditions i)−iv), valid for x^(0), also hold for the approximation x^(1), given by (6.2.4). Indeed, one has

‖x^(1) − x^(0)‖ = ‖w^{−1}(x^(0)) f(x^(0))‖ = ‖M_0 f(x^(0))‖ ≤ B_0 ≤ r/2,

whence x^(1) ∈ S_1(x^(0), r/2) ⊂ S_0.

In order to prove the existence of M_1 = w^{−1}(x^(1)), we use the property (A · B)^{−1} = B^{−1} · A^{−1}, and, thus, we obtain

M_1 = w^{−1}(x^(1)) = (w(x^(0)) M_0 w(x^(1)))^{−1} = (M_0 w(x^(1)))^{−1} M_0.
Using condition i), we can write

‖I − M_0 w(x^(1))‖ = ‖M_0 (w(x^(0)) − w(x^(1)))‖ (6.2.5)
≤ ‖M_0‖ ‖w(x^(0)) − w(x^(1))‖
≤ A_0 ‖w(x^(1)) − w(x^(0))‖.

Using the inequality

‖f′(x + Δx) − f′(x)‖ ≤ n ‖Δx‖ ‖f″(ξ)‖,  ξ = x + αΔx, 0 < α < 1,

and conditions iii), iv), relation (6.2.5) becomes

‖I − M_0 w(x^(1))‖ ≤ nA_0B_0C = η_0/2 ≤ 1/2. (6.2.6)

The latter inequality assures the existence of the inverse matrix

[M_0 w(x^(1))]^{−1} = [I − (I − M_0 w(x^(1)))]^{−1}

and one has

‖[M_0 w(x^(1))]^{−1}‖ ≤ 1/(1 − η_0/2) ≤ 2.

Therefore, there exists the matrix

M_1 = [M_0 w(x^(1))]^{−1} M_0

and for its norm we get

‖M_1‖ = ‖[M_0 w(x^(1))]^{−1} M_0‖ ≤ ‖[M_0 w(x^(1))]^{−1}‖ ‖M_0‖ ≤ 2A_0 = A_1. (6.2.7)
Thus condition i) is also valid for the approximation x^(1). Using formula (6.2.4) yields

‖f(x^(1))‖ = ‖f(x^(1)) − f(x^(0)) − f′(x^(0))(x^(1) − x^(0))‖ (6.2.8)
≤ (1/2) n ‖x^(1) − x^(0)‖^2 ‖f″(ξ)‖ ≤ (1/2) n B_0^2 C,

where ξ = x^(0) + θ(x^(1) − x^(0)), 0 < θ < 1. We mention that in order to obtain the first inequality above we used the relation

‖f(x + Δx) − f(x) − f′(x)Δx‖ ≤ (1/2) n ‖Δx‖^2 ‖f″(ξ)‖,
with Δx = x^(1) − x^(0). From (6.2.7) and (6.2.8) it follows

‖M_1 f(x^(1))‖ ≤ ‖M_1‖ ‖f(x^(1))‖ ≤ 2A_0 · (1/2) n B_0^2 C = nA_0B_0C · B_0 = (1/2) η_0 B_0 = B_1,

so condition ii) also holds for x^(1).

Since condition iv) is verified for x^(0), it remains to show η_1 = 2nA_1B_1C ≤ 1. We have

η_1 = 2n · 2A_0 · (1/2) η_0 B_0 · C = η_0 · 2nA_0B_0C = η_0^2 ≤ 1.
Using the same argument, we deduce that the successive approximations x^(p), p = 1, 2, ..., generated by (6.2.4) can be defined, and one has x^(p) ∈ S_p(x^(p−1), r/2^{p−1}), with

S_0(x^(0), r) ⊃ S_1(x^(1), r/2) ⊃ ... ⊃ S_p(x^(p), r/2^p) ⊃ ...

Furthermore, we have

‖M_p‖ = ‖w^{−1}(x^(p))‖ ≤ A_p, (6.2.9)
‖M_p f(x^(p))‖ = ‖x^(p+1) − x^(p)‖ ≤ B_p,  p = 1, 2, ...,

where the constants A_p, B_p verify the relations

A_p = 2A_{p−1}, (6.2.10)
B_p = (1/2) η_{p−1} B_{p−1},

with η_p = 2nA_pB_pC, p = 1, 2, ...

Now, we show that the sequence generated by (6.2.4) is fundamental (Cauchy). Indeed, one has

x^(p+q) ∈ S_p(x^(p), r/2^p),  q ∈ N∗,

that is

‖x^(p+q) − x^(p)‖ ≤ r/2^p.

In other words, for every ε > 0, there exists n_0 ∈ N such that

‖x^(p+q) − x^(p)‖ < ε,  p > n_0 and q ∈ N∗,

so (x^(p))_{p∈N} is a fundamental sequence. Since x^(p) ∈ S_0(x^(0), r), p = 0, 1, ..., and S_0(x^(0), r) is a closed set, it follows that the above sequence is convergent. Let

lim_{p→∞} x^(p) = x∗ ∈ S_0(x^(0), r).
In order to prove that x∗ is the solution of equation (6.0.2), we write equality (6.2.4) in the form

w(x^(p))(x^(p+1) − x^(p)) = −f(x^(p)).

Passing to the limit and taking into account the continuity of f and f′, it follows f(x∗) = 0.

Using (6.2.9) and (6.2.10), we obtain

‖x^(p+1) − x^(p)‖ ≤ B_p = (1/2^p) η_0^{2^p − 1} B_0,
which implies

‖x^(k+p) − x^(k)‖ ≤ ‖x^(k+p) − x^(k+p−1)‖ + ... + ‖x^(k+1) − x^(k)‖
≤ B_0 [ (1/2^{k+p−1}) η_0^{2^{k+p−1} − 1} + ... + (1/2^k) η_0^{2^k − 1} ]
= (B_0/2^k) η_0^{2^k − 1} [ 1 + (1/2) η_0^{2^k} + ... + (1/2^{p−1}) η_0^{2^k(2^{p−1} − 1)} ]
≤ (B_0/2^k) η_0^{2^k − 1} [ 1 + 1/2 + ... + 1/2^{p−1} ].
Passing to the limit with respect to p yields

‖x∗ − x^(k)‖ ≤ (B_0/2^{k−1}) η_0^{2^k − 1},

so the theorem is completely proved.
Chapter 7
Parallel calculus
7.1 Introduction
Parallel calculus, or parallel computation, is the process of solving problems on parallel computers. The 1980s will go down in history as the decade in which parallel computing first had a significant impact on the scientific and commercial worlds. The problems have arisen from real-world demands for computers that are many orders of magnitude faster than the fastest conventional computers. Parallelism represents the most feasible avenue to achieve this kind of performance. Modern improvements in hardware have brought tremendous advances, e.g., by means of very large-scale integrated circuits (VLSI). Thus the multiprocessor was born, and today there exist many supercomputers which work with thousands and hundreds of thousands of processors, so parallel algorithms have strong technical support.
From the software point of view, many programs that run well on sequential (serial) computers cannot easily be transformed into programs that can be performed efficiently on parallel computers. Conversely, algorithms that are less efficient in a sequential context often reveal an inherent parallelism that makes them attractive for parallel programs.
Anyway, parallel computation does not replace all serial programs (of course, the best and most efficient of them remain), but it is a very attractive alternative in order to improve the speed-up of execution. In parallel calculus, the problem has to be rethought in order to be performed with more than one processor. This is the fascinating part of parallel calculus.
In the following section we try to indicate some parallel ways of solving
several of the numerical methods presented in previous chapters.
7.2 Parallel methods to solve triangular systems of linear equations
7.2.1 The column-sweep algorithm
Let us consider the system

A · x = b,

with A ∈ M_{n×n}(R) a triangular matrix and x, b ∈ M_{n×1}(R).

We begin by defining a linear recurrence system R<n, m> of order m for n equations by

R<n, m>:  x_k = 0, if k ≤ 0;  x_k = b′_k + Σ_{j=k−m}^{k−1} a′_{kj} x_j, if 1 ≤ k ≤ n, (7.2.1)

where m ≤ n − 1. If we write A = [a_{ik}], with a_{ik} = 0 for i ≤ k or i − k > m, x = [x_1, ..., x_n]^T and b = [b_1, ..., b_n]^T, then (7.2.1) can be written as:

x = A′ · x + b′. (7.2.2)

Note: A′ and b′ can easily be obtained from A and b by division by the nonzero diagonal elements.
Relation (7.2.2) is, e.g., R<4, 3>:

x_1 = b′_1 (7.2.3)
x_2 = b′_2 + a′_{21} x_1
x_3 = b′_3 + a′_{31} x_1 + a′_{32} x_2
x_4 = b′_4 + a′_{41} x_1 + a′_{42} x_2 + a′_{43} x_3.
In order to illustrate the column-sweep algorithm, we take a 4 × 4 system,

A · x = b,

with

A =
[   1                      ]
[ −a_{21}    1              ]
[ −a_{31}  −a_{32}    1      ]
[ −a_{41}  −a_{42}  −a_{43}  1 ],  (7.2.4)

x = [x_1, x_2, x_3, x_4]^T,  b = [b_1, b_2, b_3, b_4]^T.
Using (7.2.2), we get

x_1 = b_1 (7.2.5)
x_2 = b_2 + a_{21} x_1
x_3 = b_3 + a_{31} x_1 + a_{32} x_2
x_4 = b_4 + a_{41} x_1 + a_{42} x_2 + a_{43} x_3,

from which we can evaluate x_1, x_2, x_3 and x_4 in parallel as follows:

Step 1. Evaluate, in parallel, expressions of the form

b_i^(1) = b_i + a_{i1} x_1,  for i = 2, ..., n,

where x_1 = b_1 is known. Now there remain n − 2 equations to solve.

Step 2. Evaluate, in parallel, expressions of the form

b_i^(2) = b_i^(1) + a_{i2} x_2,  i = 3, ..., n,

where x_1 and x_2 are known. Now there remain n − 3 equations to be solved.

· · ·

Step k. Evaluate, in parallel, expressions of the form

b_i^(k) = b_i^(k−1) + a_{ik} x_k,  i = k + 1, ..., n,

where x_1, ..., x_k are known.
Remark 7.2.1. Using a network with n processors, the execution time for the column-sweep algorithm is reduced by a factor of about n compared with the execution time involved in solving a triangular system on a serial computer.
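The column-sweep idea can be sketched in a few lines; the numpy vector update below stands in for the n processors performing Step k simultaneously (here the division by the pivot is done at each step rather than in advance, a harmless variation of the scheme above):

```python
import numpy as np

def column_sweep(A, b):
    """Solve the lower triangular system A x = b by column sweeps."""
    n = len(b)
    b = np.array(b, dtype=float)   # work on a copy
    x = np.zeros(n)
    for k in range(n):
        x[k] = b[k] / A[k, k]
        b[k+1:] -= A[k+1:, k] * x[k]   # Step k: one simultaneous column update
    return x

A = np.array([[2.0, 0, 0], [1, 3, 0], [4, 5, 6]])
b = np.array([2.0, 7, 32])
x = column_sweep(A, b)
```

On a serial machine the column update costs n − k operations per step; with n processors it is a single parallel step, which is where the factor-n speed-up of Remark 7.2.1 comes from.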
7.2.2 The recurrent-product algorithm
Recalling the method presented in the previous section, we can express the solution x via a product of elementary matrices that can easily be computed using the recursive-doubling technique. Thinking in this way, Sameh and Brent gave the following method for solving a triangular system of equations, known also as the recurrent-product algorithm.

According to it, relation (7.2.2) can be written
x = (I − A′)^{−1} · b′, (7.2.6)

where the matrix A′ is strictly lower triangular and (I − A′)^{−1} can be expressed in the product form:

(I − A′)^{−1} = ∏_{i=1}^{n−1} M_{n−i}, (7.2.7)

where M_i is the identity matrix with column i altered: below the unit diagonal entry it contains the elements a′_{i+1,i}, ..., a′_{n,i}.

The product ∏_{i=1}^{n−1} M_{n−i} can be carried out using the recursive-doubling technique, on a binary tree connectivity among the processors.
Example 7.2.2. Suppose we take a banded lower triangular system R<5, 2>. Then, relation (7.2.6) is:

x_1 = b′_1 (7.2.8)
x_2 = b′_2 + a′_{21} x_1
x_3 = b′_3 + a′_{31} x_1 + a′_{32} x_2
x_4 = b′_4 + a′_{42} x_2 + a′_{43} x_3
x_5 = b′_5 + a′_{53} x_3 + a′_{54} x_4.

Hence,

x = M_4 · M_3 · M_2 · M_1 · b′, (7.2.9)
where each M_i is the 5 × 5 identity matrix with the band entries of column i inserted below the diagonal:

M_4: entry (5, 4) = a′_{54};
M_3: entries (4, 3) = a′_{43}, (5, 3) = a′_{53};
M_2: entries (3, 2) = a′_{32}, (4, 2) = a′_{42};
M_1: entries (2, 1) = a′_{21}, (3, 1) = a′_{31}.

The matrix product in (7.2.9) can be evaluated using recursive doubling, as follows: ((M_4 M_3)(M_2 M_1)) · b′, which means that we compute all the inner brackets at step 1, then all the "next" brackets at step 2, and so on. The computations of the inner brackets, etc., are made simultaneously, by two processors. The whole computation needs three parallel steps.
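The recursive-doubling pattern used above can be sketched generically: neighbouring factors are paired, each pass halves the number of factors, and all products within one pass are independent, hence parallelizable. The helper below is an illustrative sketch, not code from the text.

```python
import numpy as np

def recursive_doubling_product(mats):
    """Compute mats[0] @ mats[1] @ ... @ mats[-1] in about log2(len) passes."""
    mats = list(mats)
    while len(mats) > 1:
        # each pair product below is independent of the others
        paired = [mats[i] @ mats[i + 1] for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:            # odd count: last factor carries over
            paired.append(mats[-1])
        mats = paired
    return mats[0]
```

For (7.2.9) one would call it with [M_4, M_3, M_2, M_1]; since matrix multiplication is associative, the pairing does not change the result, only the order of evaluation.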
Remark 7.2.3. The problem of inverting a lower triangular matrix can be solved, with parallel calculus, in several other ways. For example, Borodin and Munro proposed a method that uses data partitioning.
7.3 Parallel approaches of direct methods for solving systems of linear equations
In this section we discuss several techniques for solving linear algebraic systems which are particularly suited to parallel computation. It should be noted that the choice of method depends on the type of parallel environment used.
7.3.1 The parallel Gauss method
Let us consider the system
A · x = b, (7.3.1)
with A ∈ M_{n×n}(R), x, b ∈ M_{n×1}(R), or, in the explicit form:

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = b_2
...
a_{n1} x_1 + a_{n2} x_2 + ... + a_{nn} x_n = b_n. (7.3.2)
A possible parallel execution with n processors is the following:

Stage 1: all n processors work together

Step 1: Considering a_{11} ≠ 0,
for j := 2, ..., n in parallel do
a_{1j}^(1) := a_{1j}/a_{11};  b_1^(1) := b_1/a_{11}.

Step 2: for i, j := 2, ..., n in parallel do
a_{ij}^(1) := a_{ij} − a_{1j}^(1) · a_{i1};  b_i^(1) := b_i − b_1^(1) · a_{i1}.

After Step 1 and Step 2, the following system is obtained:

x_1 + a_{12}^(1) x_2 + ... + a_{1n}^(1) x_n = b_1^(1)
a_{22}^(1) x_2 + ... + a_{2n}^(1) x_n = b_2^(1)
...
a_{n2}^(1) x_2 + ... + a_{nn}^(1) x_n = b_n^(1). (7.3.3)
Stage 2: n − 1 processors work together

Repeat computations of the type from Step 1 and Step 2, with a_{22}^(1) ≠ 0.

...

Stage n − 1: 2 processors work together

Repeat computations of the type from Step 1 and Step 2, with a_{n−1,n−1}^(n−2) ≠ 0.
Finally, we get the system:

x_1 + a_{12}^(1) x_2 + ... + a_{1,n−1}^(1) x_{n−1} + a_{1n}^(1) x_n = b_1^(1)
x_2 + ... + a_{2,n−1}^(2) x_{n−1} + a_{2n}^(2) x_n = b_2^(2)
. . .
x_{n−1} + a_{n−1,n}^(n−1) x_n = b_{n−1}^(n−1)
a_{nn}^(n−1) x_n = b_n^(n−1). (7.3.4)
So, the problem has been reduced to solving the triangular system (7.3.4). One way to do it in parallel is by means of the techniques presented in section 7.2. Another possibility is to reactivate the processors, one by one, and to get the solution starting with the last equation.
Remark 7.3.1. The Gauss method can be parallelized, as we saw, but the final algorithm is not very efficient, because not all n processors are used at every step. From this point of view, the Gauss-Jordan method is more appropriate for parallel calculus.
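The two steps of each stage can be sketched with numpy, where array broadcasting plays the role of the processors working simultaneously on all rows; back-substitution on system (7.3.4) then recovers x. This is a sketch of the scheme above, not the book's code.

```python
import numpy as np

def gauss_parallel_style(A, b):
    """Forward elimination stage by stage, then back-substitution."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    for k in range(n - 1):
        piv = A[k, k]                          # assumed nonzero
        A[k, k:] /= piv                        # Step 1: normalize pivot row
        b[k] /= piv
        factors = A[k+1:, k].copy()            # Step 2: all rows at once
        A[k+1:, k:] -= np.outer(factors, A[k, k:])
        b[k+1:] -= factors * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):             # solve triangular system (7.3.4)
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[2.0, 1, 1], [4, 3, 3], [8, 7, 9]])
b = A @ np.array([1.0, 2, 3])
x = gauss_parallel_style(A, b)
```

Note the shrinking update block A[k+1:, k:] at stage k, which mirrors the shrinking number of busy processors criticized in Remark 7.3.1.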
7.3.2 The parallel Gauss-Jordan method
There are several possibilities to perform the Gauss-Jordan method using more than one processor. Considering the system

A · x = b, (7.3.5)

with A ∈ M_{n×n}(R), x, b ∈ M_{n×1}(R), a_{ii} ≠ 0, i = 1, ..., n, and n processors on a SIMD-type machine, the parallel method is the following:
Step 1. The system (7.3.5) is transformed into:

x_1 + a_{12}^(1) x_2 + ... + a_{1n}^(1) x_n = b_1^(1)
a_{22}^(1) x_2 + ... + a_{2n}^(1) x_n = b_2^(1)
...
a_{n2}^(1) x_2 + ... + a_{nn}^(1) x_n = b_n^(1), (7.3.6)

where

a_{1j}^(1) = a_{1j}/a_{11}, j = 2, ..., n,  b_1^(1) = b_1/a_{11},
a_{ij}^(1) = a_{ij} − a_{1j}^(1) · a_{i1},  b_i^(1) = b_i − b_1^(1) · a_{i1},  i, j = 2, ..., n.
The n processors work as follows: one processor computes a_{12}^(1), a_{13}^(1), ..., a_{1n}^(1), b_1^(1), one after another. The moment a_{12}^(1) is computed, the other n − 1 processors compute the elements of the second column, a_{i2}^(1), i = 2, ..., n, then the elements of the third column, a_{i3}^(1), etc., until they also compute b_i^(1), i = 2, ..., n.

Step 2. Meanwhile, the first processor, which has already finished the computation of b_1^(1), continues with a_{23}^(2) = a_{23}^(1)/a_{22}^(1). The rest of the n − 1 processors will start the computation which eliminates x_2, that is, the values

a_{13}^(2) = a_{13}^(1) − a_{12}^(1) · a_{23}^(2),  b_1^(2) = b_1^(1) − a_{12}^(1) · b_2^(2),
a_{i3}^(2) = a_{i3}^(1) − a_{i2}^(1) · a_{23}^(2),  b_i^(2) = b_i^(1) − a_{i2}^(1) · b_2^(2),  i = 3, ..., n.

And so on. After n steps, we get the solution:

x_i = b_i^(n),  i = 1, ..., n. (7.3.7)
Remark 7.3.2. Another parallel approach can be taken. Instead of dividing the amount of work between one processor (which performs only division operations all the time) and the other n − 1 processors (which perform a multiplication and an addition all the time), Modi suggests a parallel execution in which all n processors work together, on the same column, in order to eliminate the elements above and below the element a_{ii}, i = 1, ..., n; so, after n steps, we get the final form (7.3.7) of the system.
7.3.3 The parallel LU method
To get a parallel LU method, the idea of the parallel Gauss method (see section 7.3.1) can be used to obtain the matrices L and U of the decomposition

A = L · U.

Another possibility is to use the recursive-doubling technique to get the matrix products which form the matrix U, and then the matrix L, as presented in section 4.2.1.4 in the case of the Doolittle algorithm. Also, as we already stated, the block LU decomposition, presented in section 4.2.1.5, is more appropriate for parallel calculus, due to the fact that the operations on the block matrices can be performed simultaneously.
7.3.4 The parallel QR method
Recalling the QR method, presented in section 4.2.1.7, a parallel approach can be obtained by using Givens rotations, because several rotations can be performed simultaneously. If A ∈ M_{m×n}(R), denote by (p_k, q_k), k = 1, ..., mn − n(n+1)/2, the fact that a Givens rotation G(p_k, q_k) is performed between the rows p_k and p_k − 1, in order to annihilate the (p_k, q_k) element of

G(p_{k−1}, q_{k−1}) ... G(p_1, q_1) A.

According to Gentleman, the set S of all these rotations is partitioned into subsets S_r, r = 1, ..., m + n − 2, so that

S_r = {(p, q) | 1 ≤ q < p ≤ m, q ≤ n, m + 2q = p + r + 1}.

The rotations in S_r are disjoint, and therefore they can be performed simultaneously. With the corresponding product matrices

Q^(r) = ∏ {G(p, q) | (p, q) ∈ S_r},  Q^T = Q^(m+n−2) ... Q^(1),  Q^T · A = R,

hence we get

A = Q · R.
7.3.5 The parallel WZ method
Recalling the WZ method (see section 4.2.1.8), we notice that the matrix A can be factorized as

A = W Z,

where W has unit diagonal and, in row i, possible nonzero entries w_{ik} only in the columns k < min(i, n − i + 1) and k > max(i, n − i + 1) (a "butterfly" pattern, e.g. w_{21} and w_{2n} in row 2, w_{n−1,1} and w_{n−1,n} in row n − 1), while Z has nonzero entries z_{ij} in row i only for the columns between min(i, n − i + 1) and max(i, n − i + 1) (a "Z" pattern: full first and last rows, narrowing towards the middle rows),
with elements w_{ij} and z_{ij}, i, j = 1, ..., n, computed according to the formulas presented in section 4.2.1.8. The system A · x = b, that is W Z · x = b, can be solved in two steps:

W · p = b, (7.3.8)

and we get the n-element vector p, then

Z · x = p, (7.3.9)

which generates the solution x. From (7.3.8), we have
W · [p_1, p_2, ..., p_{n−1}, p_n]^T = [b_1, b_2, ..., b_{n−1}, b_n]^T, (7.3.10)
hence, first we compute p_1 and p_n, then p_2 and p_{n−1}, and so on (each time working from the top and the bottom). At the i-th step we have

p_i = b_i^(i),  p_{n−i+1} = b_{n−i+1}^(i),

where

b_j^(1) = b_j

and

b_j^(i+1) = b_j^(i) − w_{j,i} · p_i − w_{j,n−i+1} · p_{n−i+1}  (j = i + 1, ..., n − i). (7.3.11)
Relations (7.3.11) can be evaluated in parallel, using the recursive-doubling technique. Then, in order to solve (7.3.9), the computation of x is similar, so it can also be made in parallel.
7.4 Parallel iterative methods
As we saw in section 4.3, the iterative methods for solving a linear system of equations are useful when the dimension of the problem to be solved increases, so that the direct methods become almost impractical. From a parallel execution point of view, the iterative methods are more attractive, in spite of the fact that they do not guarantee a solution for every system of equations, due to the convergence requirements of the method. In what follows, we give parallel approaches for some iterative methods.
7.4.1 Parallel Jacobi method
Recalling the expressions for computing the components of the vector x at step k, starting with an initial value x^(0) (see section 4.3.1),

x_i^(k) = (b_i − Σ_{j=1, j≠i}^{n} a_{ij} · x_j^(k−1)) / a_{ii},  i = 1, ..., n, with a_{ii} ≠ 0,
there exist several ways to perform them using more than one processor. If we work with n processors, every processor i can compute a value x_i at a given step k. The only problem here is that the processors have to communicate among themselves, because, at every step, every processor needs to know all the other values computed by all the other processors.
Another parallel approach for the Jacobi method was given by Wong. In order to present it, we rewrite the method in matrix form (see also formula (4.3.7)):

A · x = b
r = b − A · x
A = L + D + U
(L + D + U) x = b
D x^(k+1) = −(L + U) x^(k) + b
x^(k+1) = −D^{−1}(L + U) x^(k) + D^{−1} b
x^(k+1) = −D^{−1}(L + D + U) x^(k) + D^{−1} b + x^(k)
x^(k+1) = x^(k) + D^{−1} r^(k).
Remark 7.4.1. The method is presented by means of the residual error.
Algorithmically, the Jacobi method of this form is the following:

Remark 7.4.2.

Step 1. k = 0
Step 2. Initialize x^(0).
Step 3. r^(0) = b − A · x^(0)
Step 4. While ‖r^(k)‖ > ε do
Step 5.   k = k + 1
Step 6.   for i = 1 to n do
Step 7.     x_i^(k) = r_i^(k−1)/a_{ii} + x_i^(k−1)
Step 8.   end for
Step 9.   r^(k) = b − A · x^(k)
Step 10.  ‖r^(k)‖ = (Σ_{i=1}^{n} (r_i^(k))^2)^{1/2}
Step 11. end while
Step 12. x = x^(k).
Remark 7.4.3. We have effective computation in Steps 7, 9 and 10.
The parallel implementation of this algorithm, using n processors, is the following:

Step 0: Initialize k and x^(0).
Steps 1 and 2: Assign the value x^(0) to the n processors.
Step 3: r = b − A · x, a matrix-vector multiplication; the computation is made by all processors, without any communication among them.
Step 3+: Update the vector r. Communication is required.
Steps 4 and 10: Calculate the residue ‖r‖. Communication is required.
Step 7: Update the value of the vector x. Communication is required.
Step 9: Calculate r = b − A · x, a matrix-vector multiplication.
Step 9+: Update the vector r; communication is required.
Remark 7.4.4. For parallel matrix multiplication, see the bibliography.
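Wong's residual form of the Jacobi iteration is easy to sketch: the x-update (Step 7) is componentwise independent, which is exactly what makes it parallel (here one numpy vector operation stands in for the n processors). The diagonally dominant test system is an illustrative choice.

```python
import numpy as np

def jacobi_residual(A, b, eps=1e-10, max_iter=500):
    """x^(k+1) = x^(k) + r^(k)/D, with r^(k) = b - A x^(k)."""
    x = np.zeros(len(b))
    d = np.diag(A)                 # D as a vector of diagonal entries
    r = b - A @ x                  # Step 3
    for _ in range(max_iter):
        if np.linalg.norm(r) <= eps:   # Steps 4 and 10
            break
        x = x + r / d              # Step 7: all components simultaneously
        r = b - A @ x              # Step 9
    return x

A = np.array([[4.0, 1, 0], [1, 5, 2], [0, 2, 6]])
b = A @ np.array([1.0, 1, 1])      # exact solution (1, 1, 1)
x = jacobi_residual(A, b)
```

Strict diagonal dominance of A guarantees convergence of this iteration.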
7.4.2 Parallel Gauss-Seidel algorithm
As for the Jacobi method, there are several parallel approaches for the Gauss-Seidel method.
For instance, the following idea is used: having n processors, and due to the fact that the computation of a component x_i^(k), i = 1, …, n, involves some values already computed at the same iteration k, according to the formula

x_i^(k) = (1/a_ii) ( b_i − Σ_{j=1}^{i−1} a_ij x_j^(k) − Σ_{j=i+1}^{n} a_ij x_j^(k−1) ),  i = 1, …, n,

all the processors have to work together in order to determine the value of one x_i^(k).
P. Wong presented another possibility, by means of the residual value.The Gauss-Seidel iteration is given as follows:
A x = b (7.4.1)

b − A x = r,  A = L + D + U

A x = (L + D + U) x = b

(D + L) x = −U x + b

(D + L) x^(k+1) = −U x^(k) + b

x^(k+1) = −(D + L)^{−1} U x^(k) + (D + L)^{−1} b

x^(k+1) = −(D + L)^{−1} (U + D + L) x^(k) + x^(k) + (D + L)^{−1} b

x^(k+1) = x^(k) + (D + L)^{−1} r^(k).
In algorithmic form, the relations (7.4.1) read as follows:
Step 1: k = 0.
Step 2: Initialize x^(0).
Step 3: r^(0) = b − A x^(0).
Step 4: While ‖r^(k)‖ > ε do
Step 5: k = k + 1.
Step 6: For i = 1 to n do
Step 7: x_i^(k) = x_i^(k−1) + [(L + D)^{−1} r^(k−1)]_i.
Step 8: End for.
Step 9: r^(k) = b − A x^(k).
Step 10: ‖r^(k)‖ = ( Σ_{i=1}^{n} (r_i^(k))² )^{1/2}.
Step 11: End while.
Step 12: x = x^(k).
Remark 7.4.5. The Gauss-Seidel algorithm appears to be identical with Jacobi's; however, here Step 7 is completely sequential, because x_i cannot be computed until x_{i−1} has been computed. In fact, Step 7 can be rewritten as:
x_i^(k) = ( b_i − Σ_{j=1}^{i−1} a_ij x_j^(k) − Σ_{j=i+1}^{n} a_ij x_j^(k−1) ) / a_ii
        = ( b − U x^(k−1) − L x^(k) )_i / a_ii.
Hence, only a portion of the residual r calculation can be reused. These calculations can be done in parallel.
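A minimal serial sketch of this residual form, x^(k+1) = x^(k) + (D + L)^{−1} r^(k): the residual components can be formed independently (and hence in parallel), while the forward substitution with D + L is the inherently sequential part noted in Remark 7.4.5. The test system is an illustrative assumption.

```python
# Sketch of Gauss-Seidel in residual form: x <- x + (D+L)^{-1} r.
# The 3x3 test system is an illustrative assumption, not from the text.

def gauss_seidel_residual(A, b, eps=1e-10, max_iter=500):
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        # residual: each component is independent, hence parallelizable
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if sum(ri * ri for ri in r) ** 0.5 <= eps:
            break
        # forward substitution (D + L) z = r: sequential, z_i needs z_0..z_{i-1}
        z = [0.0] * n
        for i in range(n):
            z[i] = (r[i] - sum(A[i][j] * z[j] for j in range(i))) / A[i][i]
        x = [x[i] + z[i] for i in range(n)]
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]                 # exact solution (1, 1, 1)
x = gauss_seidel_residual(A, b)
```

Multiplying the update by D + L shows it is the classical Gauss-Seidel step: (D + L) x^(k+1) = b − U x^(k).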
Remark 7.4.6. The Gauss-Seidel method used to solve discretized problems attached to PDEs generates different types of parallel execution, e.g. the checker-board (or red-black) execution. The type used takes into account the numbering of the nodes in the discretization network and the order in which they are updated. These ideas can also be found in the next section.
7.4.3 The parallel SOR method
As we stated in the previous section, the different types of parallel execution can also be illustrated by means of the modified Gauss-Seidel method, which is the SOR method.
As we have seen in section 4.3.3, using a parameter ω (0 < ω < 2) toimprove the convergence speed, the values of x are updated as follows:
x_i^(k) = (1 − ω) x_i^(k−1) + ω ( b_i − Σ_{j=1}^{i−1} a_ij x_j^(k) − Σ_{j=i+1}^{n} a_ij x_j^(k−1) ) / a_ii.    (7.4.2)
In order to obtain a good parallelization of the computations (7.4.2), we ”color” the nodes corresponding to the matrix A in such a way that no two neighboring nodes share a color; the updates within one color are then independent, and they can be performed simultaneously.
Example 7.4.7. Consider a 1D discretized problem, with information in thepoints:
Figure 7.1: The set of points.
Then, the computation of the values given by (7.4.2) can be carried out in two stages, according to the red-black type of execution:

Example 7.4.8. Stage 1: Work on the even-numbered nodes (so only nodes 2, 4, 6, 8 are updated):
for i = 1 to n in parallel do
x_i^(k) = ( b_i − Σ_{j=1}^{i−1} a_ij x_j^(k) − Σ_{j=i+1}^{n} a_ij x_j^(k−1) ) / a_ii
end for.
Figure 7.2: The set of even points.
Stage 2: Work on the odd-numbered nodes (so only nodes 1, 3, 5, 7, 9 are updated):

Figure 7.3: The set of odd points.

for i = 1 to n in parallel do
x_i^(k) = ( b_i − Σ_{j=1}^{i−1} a_ij x_j^(k) − Σ_{j=i+1}^{n} a_ij x_j^(k−1) ) / a_ii
end for.
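The two stages of Example 7.4.8 can be sketched for a 1D tridiagonal system (so each node couples only with its neighbors, as in Figure 7.1); with the text's 1-based numbering, the first stage updates the even nodes and the second the odd ones, and each inner loop is parallelizable. The 5-point test system and ω = 1.2 are illustrative assumptions.

```python
# Red-black SOR sweep (7.4.2) for a tridiagonal system: within one
# colour the updates are independent, so each stage could run in
# parallel.  Test system and omega are illustrative assumptions.

def sor_red_black(A, b, omega=1.2, sweeps=200):
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        # 0-based indices 1, 3, ... are the even nodes 2, 4, ... of the text
        for start in (1, 0):                 # stage 1: even nodes; stage 2: odd nodes
            for i in range(start, n, 2):     # independent updates within a colour
                s = sum(A[i][j] * x[j] for j in range(n) if j != i)
                x[i] = (1.0 - omega) * x[i] + omega * (b[i] - s) / A[i][i]
    return x

# 1D Poisson-type matrix tridiag(-1, 2, -1), n = 5
n = 5
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0
b = [0.0, 0.0, 2.0, 0.0, 0.0]                # exact solution (1, 2, 3, 2, 1)
x = sor_red_black(A, b)
```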
7.5 The parallelization of the multigrid method
C.C. Douglas considered that two major classes of methods are used for the parallelization of the multigrid method: telescoping and nontelescoping.
7.5.1 Telescoping parallelizations
As we have seen in section 4.3.6, the multigrid method implies the passingof information among several grids. At every step, some values in the gridnodes have to be computed.
One way of doing it in parallel is by means of domain decomposition techniques. Then, each processor computes on the block of data per grid that is assigned to it. Of course, with or without overlaps, the processors have to communicate at the block borders, especially if the smoothing procedure is of Gauss-Seidel type.
Another approach to telescoping multigrid involves massively parallel computers (that is, at least as many processors as there are unknowns on all of the grids).
Then the following method, known as ”the concurrent multigrid” method, is proposed: the concept is that all operations should be performed simultaneously, on all unknowns, on all grids. An initial approximation of zero is assumed for the solution. Two sets of vectors, q_j and d_j, are used to hold information about the right-hand sides and the data on the spectrum of levels, in the sense that

b_k ≈ Σ_{j=1}^{k} (q_j + d_j).
A third set of vectors, x_j, contains the approximations of the solutions of each problem on each level. This information percolates to the finest grid, k, to finally provide the approximate solution of the real problem. Recalling the serial multigrid algorithm (MV) given in section 4.3.6, in what follows we denote the restriction operator I_h^{2h} by R and the prolongation operator I_{2h}^{h} by P. Also, the levels h and 2h will be denoted by j and j − 1, respectively.
So, the parallel algorithm is the following:
Step 1: Initialize in parallel
x_j = q_j = 0, 1 ≤ j < k;  q_k = b_k, x_k = 0.
Step 2: Repeat (i = 1, ..., µ).

Step 2a: Smoothing in parallel:

x_j ← PreSolve(x_j, q_j), 1 ≤ j ≤ k.

Step 2b: Compute data corresponding to x_j in parallel:

d_j ← A_j x_j, 1 ≤ j ≤ k.

Step 2c: Compute residuals in parallel:

q_j ← q_j − d_j, 1 ≤ j ≤ k.
Step 2d: Project q onto coarser grids in parallel:

q_1 ← q_1 + R_2 q_2;

q_k ← (I − P_{k−1} R_k) q_k;

q_j ← (I − P_{j−1} R_j) q_j + R_j q_{j+1}, 1 < j < k.
Step 2e: Inject x into finer grids in parallel:
x_1 ← 0;  x_k ← x_k + P_{k−1} x_{k−1};

x_j ← P_{j−1} x_{j−1}, 1 < j < k.
Step 2f: Inject d into finer grids in parallel:
d_1 ← 0;  d_k ← d_k + P_{k−1} d_{k−1};

d_j ← P_{j−1} d_{j−1}, 1 < j < k.
Step 2g: Put all the data back into q in parallel:

q_j ← q_j + d_j, 1 ≤ j ≤ k.
Step 3: Return xk.
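The grid-transfer operators R and P appearing above can be illustrated on a small serial two-grid correction cycle (not the concurrent algorithm itself) for the 1D Poisson matrix; full weighting, linear interpolation, and damped Jacobi as the PreSolve smoother are common textbook choices assumed here, not prescribed by the text.

```python
# Serial two-grid correction cycle for A = (1/h^2) tridiag(-1, 2, -1),
# illustrating the restriction R and prolongation P used above.
# Operator choices are common defaults assumed for the sketch.

def tridiag(n):
    # 1D Poisson matrix on n interior points of [0, 1], h = 1/(n+1)
    s = float((n + 1) ** 2)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0 * s
        if i:
            A[i][i - 1] = A[i - 1][i] = -1.0 * s
    return A

def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def residual(A, x, b):
    return [bi - yi for bi, yi in zip(b, matvec(A, x))]

def smooth(A, x, b, sweeps=3, omega=2.0 / 3.0):
    for _ in range(sweeps):                   # damped Jacobi: the PreSolve step
        r = residual(A, x, b)
        x = [x[i] + omega * r[i] / A[i][i] for i in range(len(x))]
    return x

def R(r):                                     # restriction: full weighting
    return [(r[2 * i] + 2.0 * r[2 * i + 1] + r[2 * i + 2]) / 4.0
            for i in range((len(r) - 1) // 2)]

def P(e, n):                                  # prolongation: linear interpolation
    x = [0.0] * n
    for i, ei in enumerate(e):
        x[2 * i + 1] = ei
    for i in range(0, n, 2):
        x[i] = 0.5 * ((x[i - 1] if i else 0.0) + (x[i + 1] if i + 1 < n else 0.0))
    return x

def solve(A, b):                              # direct coarse solve (Gaussian elimination)
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            M[i] = [mij - f * mkj for mij, mkj in zip(M[i], M[k])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

n = 7                                         # fine level j; coarse level j-1 has 3 points
A, Ac = tridiag(n), tridiag(3)
b = [1.0] * n
x = smooth(A, [0.0] * n, b)                   # pre-smoothing on the fine grid
ec = solve(Ac, R(residual(A, x, b)))          # restricted residual -> coarse correction
x = smooth(A, [xi + pi for xi, pi in zip(x, P(ec, n))], b)  # prolong, correct, post-smooth
```

One such cycle already reduces the residual substantially: the smoother damps the oscillatory error components, and the coarse-grid correction removes the smooth ones.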
Finally, there is another approach to telescoping parallelization: apply a domain decomposition method to the finest grid and then use a serial multigrid method on each of the subdomains.
7.5.2 Nontelescoping parallelizations
Nontelescoping methods use multiple coarse subspaces in which the sum ofthe unknowns on any level equals the sum on any other level.
For a given problem k, it either has a set C_k of coarse space correction problems or it has none at all (i.e., C_k = ∅). When C_k ≠ ∅, there are restriction and prolongation operators for each coarse space problem ℓ ∈ C_k such that
Rℓ : Mk →Mℓ and Pℓ : Mℓ → Mk,
where by M_ℓ and M_k we denote the solution spaces.
A multiple coarse space correction multigrid scheme is defined by the following algorithm, named MCM:

Algorithm MCM(j, µ_j, C_j, x_j, b_j)
Step 1: If C_j = ∅, then solve A_j x_j = b_j.

Step 2: If C_j ≠ ∅, then repeat (i = 1, ..., µ_j).
Step 2a: Smoothing: xj ← PreSolve(xj, bj).
Step 2b: Residual correction:

x_j ← x_j + Σ_{ℓ∈C_j} P_ℓ · MCM(ℓ, µ_ℓ, C_ℓ, 0, R_ℓ(b_j − A_j x_j)).
Step 2c: Smoothing: xj ← PreSolve(xj, bj).
Step 3: Return xj .
Remark 7.5.1. In the parallel implementation of the MCM algorithm, each processor executes the computation on one of the coarse spaces in C_j.
This algorithm was introduced by Ta'asan, when standard multigrid failed to converge for a class of problems with highly oscillatory solutions. He used standard interpolation and projection methods for the operators (as presented in section 4.3.6) and a relaxation method for the smoothing procedure.
Using a different type of interpolation and projection methods, Hackbuschdeveloped another method, known as robust multigrid.
Frederickson and McBryan elaborated the parallel superconvergent multigrid method, using a SIMD machine, standard interpolation and projection operators, but an elaborate smoother on every level.
Remark 7.5.2. All these methods reduce the execution time, but they are space-wasteful.

In order to correct this inconvenience, Douglas proposed the domain reduction method, in which the approximate solve step is eliminated. A side benefit of this theory is that the fine grid problem and most of the coarse grid matrices do not need to be generated, so the method can use substantially less memory than a standard iterative algorithm.
Bibliography
[1] O. Agratini, I. Chiorean, Gh. Coman, R.T. Trımbitas, Analiza Numerica si
Teoria Aproximarii, vol. III, Presa Universitara Clujeana, 2002.
[2] O. Agratini, P. Blaga, Gh. Coman, Lectures on Wavelets, Numerical Meth-
ods and Statistics, Casa Cartii de Stiinta, Cluj-Napoca, 2005.
[3] R.E. Barnhill, Representation and Approximation of Surfaces, Mathematical
Software III, (Ed. J.R. Rice), Academic Press, 1977, pp. 69–120.
[4] R.E. Barnhill, A survey of the representation and design of surfaces, IEEE
Computer Graphics and Applications, 3 (1983), no. 7, pp. 9–16.
[5] R.E. Barnhill, Surfaces in Computer Aided Geometric Design: A survey
with new results, Computer Aided Geometric Design, 2 (1985), pp. 1–17.
[6] R.E. Barnhill, G. Birkhoff, W.J. Gordon, Smooth interpolation in triangle,
J. Approx. Theory 8 (1973), pp. 114–128.
[7] R.E. Barnhill, R.P. Dube, F.F Little, Properties of Shepard’s surfaces, Rocky
Mountains J. Math., 13 (1983), no. 2, pp. 365–382.
[8] S. Bernstein, Demonstration du theoreme de Weierstrass fondee sur le calcul
des probabilites, Comm. Soc. Math., Kharkov, 13 (1912), pp. 1-2.
[9] J. Bernoulli, Ars conjectandi, Werke, 3, Birkhaser, 1975, pp. 107-286.
[10] P. Blaga, Gh. Coman, On some bivariate spline operators, Rev. Anal.
Numer. Theor. Approx., 7 (1979), pp. 143–153.
[11] P. Blaga, Gh. Coman, Multivariate interpolation formulae of Birkhoff type,
Studia Univ. Babes-Bolyai, Mathematica, 26 (1981), no. 2, pp. 14–22.
[12] P. Blaga, Gh. Coman, S. Pop, R. Trımbitas, D. Vasaru, Analiza numerica-
Lucrari de laborator, Univ. Babes-Bolyai, Cluj, 1994.
[13] H. Bohman, On approximation of continuous and analytic functions, Ark.
Mat. 2, 1952, pp. 43-56.
[14] K. Bohmer, Spline functionen eine einfuhrung in theorie und anwendungen,
Teubner Valag, Stuttgart, 1974.
[15] K. Bohmer, Gh. Coman, Smooth interpolation schemes in triagles with error
bounds, Mathematica (Cluj), 18 (41) (1976), no. 1, pp. 15–27.
[16] K. Bohmer, Gh. Coman, On some approximation schemes on triangles,
Mathematica (Cluj), 24 (45) (1980), pp. 231–235.
[17] A. Borodin, I. Munro, The computational complexity of algebraic and nu-
merical problems, American Elsevier, 1975.
[18] B. Bylina, Solving linear systems with vectorized WZ factorization, Annales
UMCS Informatica AI, 1 (2003), pp. 5-13.
[19] T. Catinas (Gulea), On trivariate approximation, Proceedings of the In-
ternational Symposium on Numerical Analysis and Approximation Theory,
Cluj-Napoca, May 9–11, 2002, pp. 207–230.
[20] T. Catinas, Trivariate approximation operators on cube by parametric exten-
sions, Acta Universitatis Apulensis, Mathematics-Informatics, no. 4 (2002),
pp. 29–36.
[21] T. Catinas, Interpolating on some nodes of a given triangle, Studia Ma-
thematica ”Babes-Bolyai”, 48 (2003), no. 4, pp. 3–8.
[22] T. Catinas, The combined Shepard-Abel-Goncharov univariate operator,
Rev. Anal. Numer. Theor. Approx., 32 (2003), no. 1, pp. 11–20.
[23] T. Catinas, The combined Shepard-Lidstone univariate operator, ”Tiberiu
Popoviciu” Itinerant Seminar of Functional Equations, Approximation and
Convexity, Cluj-Napoca, May 21–25, 2003, pp. 3–15.
[24] T. Catinas, On the generalized Newton algorithm for some nodes given on
tetrahedron, Seminar on Numerical and Statistical Calculus, ”Babes-Bolyai”
University, Cluj-Napoca, 2004, pp. 65–76.
[25] T. Catinas, The combined Shepard-Lidstone bivariate operator, Trends and
Applications in Constructive Approximation, (Eds. M.G. de Bruin, D.H.
Mache, J. Szabados), International Series of Numerical Mathematics, Vol.
151, 2005, Springer Group-Birkhauser Verlag, pp. 77-89.
[26] T. Catinas, Bounds for the remainder in the bivariate Shepard interpolation
of Lidstone type, Rev. Anal. Numer. Theor. Approx., 34 (2005), no. 1, pp.
47-53.
[27] T. Catinas, Bivariate interpolation by combined Shepard operators, Proceed-
ing of 17thIMACS World Congress, Scientific Computation, Applied Math-
ematics and Simulation (Eds. P. Borne, M. Benrejeb, N. Dangoumau, L.
Lorimier), Paris, July 11-15, 2005, ISBN 2-915913-02-1, pp.1-7.
[28] T. Catinas, Three ways of defining the bivariate Shepard operator of Lidstone
type, Studia Univ. ”Babes-Bolyai”, Mathematica, 50 (2005), no. 3, pp. 57–
63.
[29] T. Catinas, The Lidstone interpolation on tetrahedron, J. Appl. Funct. Anal.,
2006, no. 4, pp. 425-439.
[30] T. Catinas, A combined method for interpolation of scattered data based on
triangulation and Lagrange interpolation, Studia Univ. Babes-Bolyai, Math-
ematica, 51 (2006), no. 4, pp. 55-63.
[31] T. Catinas, Interpolation of scattered data, ”Casa Cartii de Stiinta”, 2007.
[32] T. Catinas, The bivariate Shepard operator of Bernoulli type, Calcolo, 2007,
to appear.
[33] T. Catinas, A cubature formula based on Bernoulli interpolation for the
rectangle, to appear.
[34] T. Catinas, Gh. Coman, Optimal quadrature formulas based on ϕ-function
method , Studia Univ. Babes-Bolyai, Mathematica, 51 (2006), no. 1, pp.
49-64.
[35] E.W. Cheney, Multivariate Approximation Theory, Selected Topics,
CBMS51, SIAM, Philadelphia, Pennsylvania, 1986.
[36] W. Cheney, W. Light, A Course in Approximation Theory, Brooks/Cole
Publishing Company, Pacific Grove, 2000.
[37] I. Chiorean, Parallel Numerical Methods for Solving Partial Differential
Equations, Matarom Rev., Nr. 3, Paris, 1993, pp. 19-38.
[38] I. Chiorean, On the complexity of multigrid method in solving a system of
PDE, Research seminars, Seminar on Numerical and Statistical Calculus,
Preprint nr. 1, 1994, pp. 27-32.
[39] I. Chiorean, Calcul Paralel, Editura Microinformatica, 1995.
[40] I. Chiorean, On the Convergence of the Numerical Solution for a Problem of
Convection in Porous Medium, Research seminars, Seminar on Numerical
and Statistical Calculus, Preprint nr. 1, 1996, pp. 23-27.
[41] I. Chiorean, Parallel algorithm for solving a problem of convection in porous
medium, Advances in Engineering Software, Elsevier Ltd., 28 (1997), no. 8,
Great Britain, pp. 463-467.
[42] I. Chiorean, Free convection in an inclined square enclosure filled with a
heat-generating porous-medium, Studia Universitatis ”Babes-Bolyai” Math-
ematica, Cluj-Napoca, no. 4, 1997, pp. 35-43.
[43] I. Chiorean, Convection between the Number of Multigrid Iterations and the
Discretized Error in Solving a Problem of Convection in Porous Medium,
Studia Universitatis ”Babes-Bolyai” Mathematica, Cluj-Napoca, no. 1,
1997, pp. 59-65.
[44] I. Chiorean, Parallel Numerical methods for solving nonlinear equations,
Studia Universitatis ”Babes-Bolyai”, Cluj-Napoca, Mathematica, 46 (2001),
no. 4, pp. 53-61.
[45] I. Chiorean, On the complexity of some parallel relaxation schemes, Bul.
Stiintific Univ. Baia Mare, Ser. B, Matematica-Informatica, 43 (2002), no.
2, pp. 171-176.
[46] I. Chiorean, On some 2D Parallel Relaxation Schemes, Proceedings of
ROGER, Sibiu, 12-14 iunie 2002, Mathematical Analysis and Approxima-
tion Theory, Burg Verlag, 2002, pp. 77-85.
[47] I. Chiorean, Contributii la studiul unei probleme de convectie in mediu poros,
Presa Universitara Clujeana, 2002.
[48] I. Chiorean, Parallel numerical methods for PDE, Kragujevac J. Math. 25
(2003), pp. 5-18.
[49] I. Chiorean, On some Numerical Methods for Solving the Black-Scholes For-
mula, Creative Mathematics Journal, 13 (2004), Pub. by Dep. of Math. and
Comp. Science, North Univ. Baia-Mare, pp. 31-36 (conf. ICAM4, Suior,
Baia-Mare).
[50] I. Chiorean, Remarks on the restriction and prolongation operators from
the multigrid method, Mathematical Analysis and Approximation Theory,
Mediamira Science Publisher, 2005, pp. 43-49.
[51] I. Chiorean, Parallel Algorithm for Solving the Black-Scholes Equation,
Kragujevac J. Math, 27 (2005), pp. 39-48.
[52] I. Chiorean, Parallel LU Method for solving the Black-Scholes Equation,
Proceedings of NAAT, Cluj, 2006, pp. 155-161.
[53] P.L. Clark, The Stone-Weierstrass Approximation Theory,
www.math.uga.edu.
[54] Gh. Coman, Two-dimensional monosplines and optimal cubature formulae,
Studia Univ. ”Babes-Bolyai”, Ser. Math.-Mech. 1973, no. 1, pp. 41–53.
[55] Gh. Coman, Multivariate approximation schemes and the approximation of
linear functionals, Mathematica, 16 (39) (1974), pp. 229–249.
[56] Gh. Coman, The approximation of multivariate functions, Technical Sum-
mary Report 1254, Mathematics Research Center, University of Wisconsin,
Madison, Wisconsin, 1974.
[57] Gh. Coman,The complexity of the quadrature formulas, Mathematica (Cluj),
23 (46) (1981), pp. 183–192.
[58] Gh. Coman, Homogeneous multivariate approximation formulas with appli-
cations in numerical integration, Research Seminaries, ”Babes-Bolyai” Uni-
versity, Faculty of Mathematics, Preprint No. 4, 1985, pp. 46–63.
[59] Gh. Coman, The remainder of certain Shepard type interpolation formulas,
Studia Univ. ”Babes-Bolyai”, Mathematica, 32 (1987), no. 4, pp. 24–32.
[60] Gh. Coman, Shepard-Taylor interpolation, Itinerant Seminar on Functional
Equations, Approximation and Convexity, Cluj-Napoca, 1988, pp. 5–14.
[61] Gh. Coman, Homogeneous cubature formulas, Studia Univ. ”Babes-Bolyai”,
Mathematica, 38 (1993), no. 2, pp. 91–101.
[62] Gh. Coman, Analiza Numerica, Ed. Libris, Cluj-Napoca, 1995.
[63] Gh. Coman, Shepard operators of Hermite-type, Rev. Anal. Numer. Theor.
Approx., 26 (1997), no. 1–2, pp. 33–38.
[64] Gh. Coman, Shepard operators of Birkhoff type, Calcolo, 35 (1998), pp.
197–203.
[65] Gh. Coman, Numerical multivariate approximation operator, ”Tiberiu
Popoviciu” Itinerant Seminar of Functional Equations, Approximation and
Convexity, Cluj-Napoca, 2001, May 22–26.
[66] Gh. Coman, M. Birou, Bivariate spline-polynomial interpolation, Studia
Mathematica ”Babes-Bolyai”, 48 (2003), no. 4, pp. 17–25.
[67] Gh. Coman, K. Bohmer, Blending interpolation schemes on triangle with
error bounds, Lecture Notes in Mathematics, Springer-Verlag, Berlin, 571
(1977), pp. 14–37.
[68] Gh. Coman, T. Catinas, M. Birou, A. Oprisan, C. Osan, I. Pop, I. Somogyi,
I. Todea, Interpolation operators, Ed. ”Casa Cartii de Stiinta”, Cluj-Napoca,
2004.
[69] Gh. Coman, I. Gansca, Blending approximation with applications in con-
structions, Buletinul Stiintific al Institutului Politehnic Cluj-Napoca, 24
(1981), pp. 35–40.
[70] Gh. Coman, I. Gansca, Some practical application of blending approximation
II, Itinerant Seminar on Functional Equations, Approximation and Convexi-
ty, Cluj-Napoca, 1986.
[71] Gh. Coman, I. Gansca, L. Tambulea, Some practical application of blending
interpolation, Proceedings of the Colloquium on Approximation and Opti-
mization, Cluj-Napoca, October 25–27, 1984.
[72] Gh. Coman, I. Gansca, L. Tambulea, Some new roof-surfaces generated by
blending interpolation technique, Studia Univ. ”Babes-Bolyai”, Mathema-
tica, 36 (1991), no. 1, pp. 119–130.
[73] Gh. Coman, I. Gansca, L. Tambulea, Multivariate approximation, Seminar
on Numerical and Statistical Calculus, Preprint no. 1, 1996, pp. 29–60.
[74] Gh. Coman, D. Johnson, Complexitatea algoritmilor, Lito. Univ. ”Babes-
Bolyai”, Cluj-Napoca, 1987.
[75] Gh. Coman, I. Pop, Some interpolation schemes on triangle, Studia Univ.
”Babes-Bolyai”, Mathematica, 48 (2003), 3, pp. 57–62.
[76] Gh. Coman, I. Pop, R. Trımbitas, An adaptive cubature on triangle, Studia
Univ. ”Babes-Bolyai”, Mathematica, 47 (2002), 4, pp. 27–35.
[77] Gh. Coman, I. Purdea Pop, On the remainder term in multivariate approx-
imation, Studia Univ. ”Babes-Bolyai”, Mathematica, 43 (1998), no. 1, pp.
7–14.
[78] Gh. Coman, I. Somogyi, Homogeneous formulas for approximation of mul-
tiple integrales, Studia Univ. ”Babes-Bolyai”, Mathematica, 45 (2000), pp.
11–18.
[79] Gh. Coman, R. Trımbitas, Lagrange-type Shepard operators, Studia Univ.
”Babes-Bolyai”, Mathematica, 42 (1997), pp. 75–83.
[80] Gh. Coman, R. Trımbitas, Combined Shepard univariate operators, East
Jurnal on Approximations, 7 (2001), 4, pp. 471–483.
[81] Gh. Coman, R. Trımbitas, Univariate Shepard-Birkhoff interpolation, Rev.
Anal. Numer. Theor. Approx., 30 (2001), no. 1, pp. 15–24.
[82] Gh. Coman, L. Tambulea, A Shepard-Taylor approximation formula, Studia
Univ. ”Babes-Bolyai”, Mathematica, 33 (1988), no. 3, pp. 65–73.
[83] Gh. Coman, L. Tambulea, On some interpolation of scattered data, Studia
Univ. ”Babes-Bolyai”, 35 (1990), no. 2, pp. 90–98.
[84] Gh. Coman, L. Tambulea, Bivariate Birkhoff interpolation of scattered data,
Studia Univ. ”Babes-Bolyai”, 36 (1991), no. 2, pp. 77–86.
[85] Gh. Coman, L. Tambulea, I. Gansca, Multivariate approximation, Seminar
on Numerical and Statistical Calculus, Preprint no. 1, 1996, pp. 29–60.
[86] Ph. J. Davis, Interpolation and Approximation, Blaisdell Publishing Com-
pany, New York, 1963.
[87] B. Della Vecchia, G. Mastroianni, P. Vertesi,Direct and converse theorems
for Shepard rational approximation, Numer. Funct. Anal. Optim., 17 (1996),
nos. 5–6, pp. 537–561.
[88] F.J. Delvos, D-variate boolean interpolation, J. Approx. Theory, 34 (1982),
pp. 99–44.
[89] F.J. Delvos, Boolean methods for double integration, Math. Comp., 55
(1990), pp. 683–692.
[90] F.J. Delvos, W. Schempp, Boolean methods in interpolation and approxi-
mation, Longman Scientific & Technical, England, copublished with John
Wiley & Sons, New York, 1989.
[91] J.W. Demmel, Solving the Discrete Poisson Equation using Multigrid,
CS267, Notes for Lectures, www.cs.berkeley.edu.
[92] C.C. Douglas, A Review of Numerous Parallel Multigrid Methods, SIAM
News, V. 25, No. 3, 2000.
[93] A.V. Efimov, Modulus of continuity, Enciclopedia of Mathematics, 2001.
[94] R. Farwig, Rate of convergence of Shepard’s global interpolation formula,
Math. Comp., 46 (1986), pp. 577–590.
[95] R. Franke, Scattered data interpolation: tests of some methods, Math.
Comp., 38 (1982), pp. 181–200.
[96] R. Franke, G.M. Nielson, Smooth interpolation of large sets of scattered data,
International Journal for Numerical Methods in Engineering, 15 (1980), pp.
1691–1704.
[97] P.O. Frederickson, O.A. McBryan, Parallel superconvergent multigrid, Lec-
ture Notes in Pure and App. Math., 1988.
[98] M. Gasca, T. Sauer, Polynomial interpolation in several variables. Multi-
variate polynomial interpolation, Adv. Comput. Math., 12 (2000), no. 4,
pp. 377–410.
[99] M. Gasca, T. Sauer, On the history of multivariate polynomial interpolation,
J. Comput. Appl. Math., 122 (2000), pp. 23–35.
[100] W.B. Gearhart, M. Qian, Peano kernels and the Euler-MacLaurin formula,
Far East J. Math. Sci., 3 (2001), no. 2, pp. 247–271.
[101] W.M. Gentelman, Least squares computations by Givens transformations
without square roots, Journal of the IMA, 12, 1973.
[102] A.C. Gilbert, The SOR Method, www.maths.lancs.ac.uk.
[103] W.J. Gordon, Distributive lattices and approximation of multivariate func-
tions, Proc. Symp. Approximation with Special Emphasis on Spline Func-
tions (Madison, Wisc.), (Ed. I.J. Schoenberg), 1969, pp. 223–277.
[104] W.J. Gordon, Blending-function methods of bivariate and multivariate inter-
polation and approximation, SIAM J. Numer. Anal., 8 (1971), pp. 158–177.
[105] W. Hackbusch, A new approach to robust multigrid solvers, Proceeding of
ICIAM’87.
[106] P.P. Korovkin, On convergence of linear positive operators in the space of
continuous functions, Dokl. Akad. Nauk SSSR 90, 1960, pp. 961-964.
[107] D.V. Ionescu, Numerical Quadratures, Bucharest, 1957.
[108] D.V. Ionescu, Sur une classe de formules de cubature, C.R. Acad. Sc. Paris,
266 (1968), pp. 1155–1158.
[109] D.V. Ionescu, Extension de la formule de quadrature de Gauss a une classe
de formules de cubature, C.R. Acad. Sc. Paris, 269 (1969), pp. 655–657.
[110] D.V. Ionescu, L’extension d’une formule de cubature, Acad. Royale de Bel-
gique, Bull. de la Classe des Science, 56 (1970), pp. 661–690.
[111] W. Light, W. Cheney, Approximation Theory in Tensor Product Spaces, Lec-
ture Notes in Mathematics 1169, (Ed. A. Dold and B. Eckmann), Springer-
Verlag, Berlin, 1985.
[112] L.E. Mansfield, On the optimal approximation of linear functionals in spaces
of bivariate functions, SIAM J. Numer. Anal., 8 (1971), pp. 115–126.
[113] C.A. Micchelli, Interpolation of scattered data: distance matrices and con-
ditionally positive definite functions, Constr. Approx., 2 (1986), no. 1, pp.
11–22.
[114] J.J. Modi, Parallel Algorithm and Matrix Computation, Clarendon Press,
Oxford, 1988.
[115] G.M. Nielson, R. Franke, A method for construction of surfaces under ten-
sion, Rocky Mtn. J. Math., 14 (1984), no. 1, pp. 203–221.
[116] H.H. Olsen, The Weierstrass density theorem, TMA4230, Functional Anal-
ysis, 2005.
[117] A. Oprisan, About convergence order of the iterative methods generated by
inverse interpolation, Seminar on Numerical and Statistical Calculus, 2004,
pp. 97–109.
[118] A. Pinkus, Density Methods and Results in Approximation Theory, Orliez
Centenary volume, Banck Center Publication, vol.64, Polish Acad. of Sci-
ence, Warszawa, 2004.
[119] T. Popoviciu, On the proof of Weierstrass theorem using interpolation poly-
nomials, Lucr. Ses. Gen. Stiintific, 2:12 (1950), pp. 1664-1667.
[120] H.J. Quinn, Designing Efficient Algorithms for Parallel Computers, McGraw-
Hill Int., 1988.
[121] A.H. Sameh, R.P. Brent, Solving triangular systems on a parallel computer,
SIAM Journal on Numerical Analysis, 14 (6), 1977.
[122] A. Sard, Linear Approximation, AMS, Providence, RI, 1963.
[123] I.J. Schoenberg, On monosplines of least deviation and best quadrature for-
mulae, J. SIAM Numer. Anal., Ser. B, 2 (1965) no. 1, pp. 145–170.
[124] I.J. Schoenberg, On monosplines of least deviation and best quadrature for-
mulae II, J. SIAM Numer. Anal., Ser. B, 3 (1966), no. 2, pp. 321–328.
[125] L.L. Schumaker, Fitting surfaces to scattered data, Approximation Theory
II, (Eds. G. G. Lorentz, C. K. Chui, L. L. Schumaker), Academic Press,
1976, pp. 203–268.
[126] L.L. Schumaker, Spline functions: basic theory, Pure and Applied Mathe-
matics. John Wiley & Sons, Inc., New York, 1981.
[127] B. Sendov, A. Andreev, Approximation and Interpolation Theory, Hand-
book of Numerical Analysis, vol. III, ed. P.G. Ciarlet and J.L. Lions, North
Holland, Amsterdam, 1994.
[128] W. Shayne, Refinements of the Peano kernel theorem, Numer. Funct. Anal.
Optimiz., 20 (1999) nos. 1&2, pp. 147–161.
[129] D. Shepard, A two-dimensional interpolation function for irregularly spaced
points, Proc. 23rd Nat. Conf. ACM (1968), pp. 517–523.
[130] S.A. Smolyak, Quadrature and interpolation formulas for tensor products of
certain classes of functions, Soviet Math. Dokl., 4, 1963, pp. 240–243.
[131] I. Somogyi, About some cubature formulas, Proceedings of the Interna-
tional Symposium on Numerical Analysis and Approximation Theory, Cluj-
Napoca, May 9–11, 2002, pp. 430–436.
[132] I. Somogyi, Almost optimal numerical methods, Studia Univ. ”Babes-
Bolyai”, Mathematica, 1999, no. 1, pp. 85–93.
[133] D.D. Stancu, A study of the polynomial interpolation of functions of several
variables, with applications to the numerical differentiation and integration;
methods for evaluating the remainders, Doctoral Dissertation, 1956, Univer-
sity of Cluj.
[134] D.D. Stancu, The generalization of certain interpolation formulae for the
functions of many variables, Buletinul Institutului Politehnic din Iasi, nos.
1–2, 1957, pp. 31–38.
[135] D.D. Stancu, On the Hermite interpolation formula and on some of its ap-
plications, Acad. R. P. Romıne. Fil. Cluj. Stud. Cerc. Mat., 8 (1957), pp.
339–355.
[136] D.D. Stancu, Generalization of some interpolation formulas for multiva-
riate functions and some contributions on the Gauss formula for numerical
integration, Buletin St. Acad. Romane, 9, 1957, pp. 287–313.
[137] D.D. Stancu, On some Taylor expansions for functions of several variables,
Rev. Roumaine Math. Pures Appl., 4 (1959), pp. 249–265 (in Russian).
[138] D.D. Stancu, The expression of the remainder in some numerical partial
differentiation formulas, Acad. R. P. Romıne Fil. Cluj Stud. Cerc. Mat., 11
(1960), pp. 371–380.
[139] D.D. Stancu, On the integral representation of the remainder in Taylor’s
formula in two variables, Stud. Cerc. Mat. (Cluj), 13 (1962), no. 1, pp.
175–182.
[140] D.D. Stancu, The remainder of certain linear approximation formulas in
two variables, SIAM J. Numer. Anal. Ser. B, 1 (1964), pp. 137–163.
[141] D.D. Stancu, On Hermite’s osculatory interpolation formula and on some
generalizations of it, Mathematica (Cluj), 8 (31) (1966), pp. 373–391.
[142] D.D. Stancu, A generalization of the Schoenberg approximating spline ope-
rator, Studia Univ. ”Babes-Bolyai”, Math., 26 (1981), no. 2, pp. 37–42.
[143] D.D. Stancu, Gh. Coman, O. Agratini, R. Trımbitas, Analiza Numerica si
Teoria Aproximarii, vol. I, Presa Universitara Clujeana, 2001.
[144] D.D. Stancu, Gh. Coman, P. Blaga, Analiza Numerica si Teoria
Aproximarii, vol. II, Presa Universitara Clujeana, 2002.
[145] A.H. Stroud, Optimal quadrature formulas, Approximation in Theory and
Praxis, Bibliographices Institut, Mannheim, 1979.
[146] A.H. Stroud, Approximate calculation of multiple integrals, Prentice-Hall,
Inc., Englewood Cliffs, New Jersey, 1971.
[147] J.R. Shewchuk, An Introduction to the Conjugate Gradient Method with-
out Agonizing Pain, School of Computer Science, Carnegie Mellon Univ.,
Pittsburg, 1994.
[148] O. Shisha, B. Mond, The degree of convergence of sequences of linear positive
operators, Aerospace Research Lab., Ohio, 1968, pp.11-96.
[149] J. Szabados, Direct and converse approximation theorems for the Shepard
operator, Approx. Theory Appl., 7 (1991), no. 3, pp. 63–76.
[150] S. Ta’asan, Multigrid Methods for Highly Oscillatory Problems, PhD thesis,
Weizmann Inst. of Science, Rehovot, Israel, 1984.
[151] R. Toomas, Transformation and LU-Factorization, www.cs.ut.ee.
[152] J.F. Traub, Iterative Methods for the Solution of Equations, Prentice-Hall,
Inc, Englewood Cliffs, 1964.
[153] R.T. Trımbitas, Univariate Shepard-Lagrange interpolation, Kragujevac J.
Math., 24 (2002), pp. 85–94.
[154] R. Trımbitas, Numerical Analysis, Presa Universitara Clujeana, 2006.
[155] P. Wong, Iterative solvers for System of Linear Equations, www.jics.utk.edu,
2002.
[156] ***www.Wikipedia.org, Weierstrass Approximation Theorem,
www.literka.addr.com.