smoothing noisy data for irregular regions using penalized bivariate splines on triangulations

Comput Stat (2014) 29:263–281
DOI 10.1007/s00180-013-0448-z

ORIGINAL PAPER

Smoothing noisy data for irregular regions using penalized bivariate splines on triangulations

Lan Zhou · Huijun Pan

Received: 7 April 2012 / Accepted: 4 August 2013 / Published online: 27 August 2013
© Springer-Verlag Berlin Heidelberg 2013

Abstract The penalized spline method has been widely used for estimating univariate smooth functions based on noisy data. This paper studies its extension to the two-dimensional case. To accommodate the need of handling data distributed on irregular regions, we consider bivariate splines defined on triangulations. Penalty functions based on the second-order derivatives are employed to regularize the spline fit and generalized cross-validation is used to select the penalty parameters. A simulation study shows that the penalized bivariate spline method is competitive with some well-established two-dimensional smoothers. The method is also illustrated using a real dataset on Texas temperature.

Keywords Bivariate smoothing · Generalized cross-validation · Nonparametric function estimation · Roughness penalty · P-splines

1 Introduction

In spatial data analysis a common problem is to find the value of a target variable, for example, temperature or ozone concentration, over a two-dimensional domain. Usually this domain has an irregular shape and the variable is observed at discrete locations/times. There are mainly two approaches to solving this problem. One approach treats the value of the target variable at each location as a random variable and uses the covariance function between these random variables, or a variogram, to represent the correlation; the other approach uses a deterministic smooth surface function to describe the variations and connections among values at different locations. This work takes the latter approach.

L. Zhou (B)
Department of Statistics, 3143 TAMU, Texas A&M University, College Station, TX 77843, USA
e-mail: [email protected]

H. Pan
Travelers, Hartford, CT 06183, USA
e-mail: [email protected]

To be more specific, we are interested in estimating a smooth function $f(x, y)$ over some bounded domain $\Omega \subseteq \mathbb{R}^2$ given observations $\{z_i\}_{i=1}^n$ at a collection of discrete points $\{v_i = (x_i, y_i)\}_{i=1}^n$ in the domain. We assume that the observations $z_i$ at locations $(x_i, y_i) \in \mathbb{R}^2$ satisfy

$$z_i = f(x_i, y_i) + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$

where the $\varepsilon_i$'s are zero-mean random variables. While studying this two-dimensional smoothing problem, we pay special attention to the following two practical issues. 1. The data need not be evenly distributed; observations can be dense at some locations while sparse at others. 2. The domain for the data can take non-rectangular shapes; it may have holes inside but it should be nicely connected.

In this work we introduce bivariate splines on triangulations to handle irregular domains and propose to extend the idea of univariate penalized splines (Eilers and Marx 1996) to the two-dimensional case. The bivariate splines we consider have several advantages. First, they are rich enough to provide good approximations of smooth functions over complicated domains. Second, they can be easily incorporated into many widely used functional data models tailored for different data structures. Third, the computational cost of spline evaluation and parameter estimation is manageable. Bivariate splines are well developed in applied mathematics (Lai and Schumaker 2007) but are not well known in the statistics community.

Although not as extensively studied as univariate smoothing, two-dimensional smoothing over irregular domains has been the subject of research for many authors. Using roughness penalties (Green and Silverman 1994) has been the dominant approach. To deal with irregular domains, Wang and Ranalli (2007) applied low-rank thin-plate splines defined as functions of the geodesic distance instead of the Euclidean distance, while Eilers (2006) employed the Schwarz–Christoffel transform to convert the complex domains to regular domains. Other authors dealt with the irregular domains directly. Ramsay (2002) formulated a penalized least squares problem with a Laplacian penalty and transformed the problem to that of solving a system of partial differential equations (PDE), which in turn can be solved using the finite element method. In order to make the method work, however, Ramsay (2002) made the rather strong assumption that the gradient of the estimated function is zero along normal directions to the boundary at all boundary points. To remove this undesirable assumption, Wood et al. (2008) developed the soap film smoother and showed empirically its superior performance over Ramsay's method. The penalized bivariate spline method we consider in this paper is related to the triogram model of Hansen et al. (1998), where linear splines on triangulations are used to model bivariate functions and triangulations are built in a data-adaptive fashion. Different from their work, however, we follow the idea of penalized splines: we use a pre-specified triangulation and employ roughness penalties to regularize the spline fit. In order to define roughness penalties, we use bivariate splines that allow definitions of derivatives. Our penalized bivariate spline method is computationally simple, but does not have the spatial adaptivity of the triogram model of Hansen et al. (1998). We will show, using a simulation study, that our method is competitive with existing methods such as the soap film smoother and the thin-plate regression spline. Another related work is Koenker and Mizera (2004), where the total variation penalty was used and turned out to be convenient for quantile regression.

The rest of the paper is organized as follows. Section 2 reviews the background about bivariate splines. Section 3 presents the details of the penalized bivariate spline method. Section 4 discusses some implementation details. Section 5 contains numerical results. Section 5.1 presents simulation results comparing our method with the soap film smoother of Wood et al. (2008) and the thin-plate regression spline of Wood (2003). Section 5.2 illustrates our method using a real dataset.

2 Bivariate splines on triangulations

This section provides a review of the background of bivariate splines on triangulations, following the excellent book of Lai and Schumaker (2007). Section 2.1 introduces barycentric coordinates and Bernstein basis polynomials with respect to a triangle. Section 2.2 gives formulas for computing the directional derivatives and introducing smoothness across edges of two adjacent triangles. Section 2.3 discusses construction of bivariate splines on a triangulation.

2.1 Barycentric coordinates and Bernstein basis polynomials

Given a non-degenerate triangle $A$ with vertices numbered counter-clockwise as $v_1, v_2, v_3$, any point $v \in \mathbb{R}^2$ can be written as

$$v = b_1 v_1 + b_2 v_2 + b_3 v_3, \quad \text{with } b_1 + b_2 + b_3 = 1. \tag{1}$$

The coefficients $(b_1, b_2, b_3)$ are called the barycentric coordinates of the point $v$ with respect to the triangle $A$, denoted as $b_v = (b_1, b_2, b_3)$. Here the constraint $b_1 + b_2 + b_3 = 1$ guarantees the unique representation of the point $v$. Although all points in $\mathbb{R}^2$ can be represented as in (1), where the $b_i$'s may take negative values, we only consider the points inside or on the edges of the triangle $A$, where the $b_i$'s are all nonnegative.

The constraint $b_1 + b_2 + b_3 = 1$ on barycentric coordinates makes the coordinates uniquely defined. Other constraints may be used for identifiability, but we adopt this one because it gives barycentric coordinates an interesting geometric interpretation. When the point $v$ is located inside or on the edges of the triangle $A$, we connect the point $v$ with $v_1$, $v_2$ and $v_3$ to generate three triangles $\{A_1, A_2, A_3\}$ as shown in Fig. 1; the barycentric coordinates are then

$$b_i = \frac{\text{Area of } A_i}{\text{Area of } A}, \quad i = 1, 2, 3. \tag{2}$$
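The area-ratio formula (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `barycentric` is hypothetical, $A_i$ is taken to be the sub-triangle obtained by replacing vertex $v_i$ with $v$, and signed areas are used so that the same formula also returns the (possibly negative) coordinates in (1) for points outside the triangle.

```python
import numpy as np

def barycentric(v, v1, v2, v3):
    """Barycentric coordinates of v with respect to the triangle <v1, v2, v3>."""
    v, v1, v2, v3 = (np.asarray(p, dtype=float) for p in (v, v1, v2, v3))

    def signed_area(p, q, r):
        # Half the z-component of (q - p) x (r - p); positive when
        # p, q, r are listed in counter-clockwise order.
        return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))

    area = signed_area(v1, v2, v3)
    if np.isclose(area, 0.0):
        raise ValueError("degenerate triangle")
    # Assumption: A_i is the sub-triangle with vertex v_i replaced by v.
    b1 = signed_area(v, v2, v3) / area
    b2 = signed_area(v1, v, v3) / area
    b3 = signed_area(v1, v2, v) / area
    return np.array([b1, b2, b3])
```

For a point inside the triangle all three values are nonnegative and sum to one; for a point outside, at least one value is negative while the sum is still one.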


Fig. 1 Illustration of barycentric coordinates on a triangle

One important property of the barycentric coordinates is that they are invariant to linear transformations of the Cartesian coordinates. Since bivariate splines and all model equations are subsequently built on barycentric coordinates, they are also invariant to linear transformations.

Given a triangle $A$ and a point $v \in A$ with barycentric coordinates $b_v = (b_1, b_2, b_3)$, for a positive integer $d$ and nonnegative integers $i, j, k$ summing to $d$, the Bernstein basis polynomials of degree $d$ with respect to the triangle $A$ are defined as

$$B_{d;ijk}(v) := \frac{d!}{i!\,j!\,k!}\, b_1^i b_2^j b_3^k, \quad b_v = (b_1, b_2, b_3),\ b_1 + b_2 + b_3 = 1,\ v \in A. \tag{3}$$

Note that the barycentric coordinates $b_1, b_2, b_3$ are all linear functions of the Cartesian coordinates. It follows that each $B_{d;ijk}(v)$ is a polynomial of degree $d$ in the Cartesian coordinates. Let $P_d(A)$ be the space of polynomials of degree $d$ defined on the triangle $A$; then $B_{d;ijk} \in P_d(A)$. The set of Bernstein basis polynomials

$$\mathcal{B}_{d,A} := \{B_{d;ijk} : i, j, k \ge 0,\ i + j + k = d\} \tag{4}$$

forms a basis of the space $P_d(A)$ and satisfies

1) $\sum_{i+j+k=d} B_{d;ijk}(v) = 1$, for all $v \in A$,
2) $0 \le B_{d;ijk}(v) \le 1$, for all $v \in A$,
3) $B_{d;ijk}$ has a unique maximum at the point $\xi_{ijk} = (i v_1 + j v_2 + k v_3)/d \in A$.

(See Section 2.3 of Lai and Schumaker 2007.) Figure 2 gives an example of Bernstein basis polynomials for $d = 2$.
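Definition (3) and properties 1) and 2) are easy to check numerically. The following minimal sketch evaluates all degree-$d$ Bernstein basis polynomials at a point given by its barycentric coordinates; the function names `bernstein_indices` and `bernstein_basis` are hypothetical, and the order in which the index triples are listed is just one convenient choice.

```python
from math import factorial
import numpy as np

def bernstein_indices(d):
    """All index triples (i, j, k) of nonnegative integers with i + j + k = d."""
    return [(i, j, d - i - j) for i in range(d, -1, -1) for j in range(d - i, -1, -1)]

def bernstein_basis(d, b):
    """Values of the degree-d Bernstein basis polynomials at barycentric coordinates b."""
    b1, b2, b3 = b
    return np.array([
        factorial(d) / (factorial(i) * factorial(j) * factorial(k)) * b1**i * b2**j * b3**k
        for i, j, k in bernstein_indices(d)
    ])
```

For $d = 2$ the function returns six values, one for each panel of Fig. 2; at any point of the triangle they lie in $[0, 1]$ and sum to one.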

Since $\mathcal{B}_{d,A}$ is a basis for $P_d(A)$, for any function $s \in P_d(A)$ there exist coefficients $\{c_{ijk}\}$ such that

$$s(v) = \sum_{i+j+k=d} c_{ijk} B_{d;ijk}(v). \tag{5}$$



Fig. 2 Contour plots of the complete collection of six Bernstein polynomials when the domain is a triangle and d = 2. The function values vary between 0 and 1

This representation is usually called the B-form of $s$ relative to the triangle $A$. For computer implementation, we need an ordering of the coefficients in (5). In this paper we use the lexicographical order, that is, $c_{\nu\mu\kappa}$ comes before $c_{ijk}$ provided $\nu > i$, or if $\nu = i$, then $\mu > j$, or if $\nu = i$ and $\mu = j$, then $\kappa > k$. Using this ordering, (5) can be written in vector form

$$s(v) = \mathbf{B}_{d,A}^T(v)\, \mathbf{c}, \tag{6}$$

where

$$\mathbf{B}_{d,A}(v) = (B_{d;d,0,0}(v), B_{d;d-1,1,0}(v), B_{d;d-1,0,1}(v), B_{d;d-2,2,0}(v), \ldots, B_{d;0,0,d}(v))^T \tag{7}$$

and

$$\mathbf{c} = (c_{d,0,0}, c_{d-1,1,0}, c_{d-1,0,1}, c_{d-2,2,0}, \ldots, c_{0,0,d})^T. \tag{8}$$

In (8), $c_{ijk}$ corresponds to the $l$th element of the vector $\mathbf{c}$, where

$$l = \sum_{m=0}^{d-i} (m + 1) - j = \frac{(d-i+1)(d-i+2)}{2} - j. \tag{9}$$

Note that choosing a different ordering of the elements of $\mathbf{c}$ does not affect the evaluation of the polynomial.
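The lexicographical ordering and the position of a coefficient in it can be sketched as follows. The helper names `lex_triples` and `lex_position` are hypothetical; the position is computed as $(d-i+1)(d-i+2)/2 - j$, which is what the ordering $(d,0,0), (d-1,1,0), (d-1,0,1), (d-2,2,0), \ldots, (0,0,d)$ implies when positions are counted from one.

```python
def lex_triples(d):
    """Index triples (i, j, k), i + j + k = d, in the lexicographical order of (8)."""
    return [(i, j, d - i - j) for i in range(d, -1, -1) for j in range(d - i, -1, -1)]

def lex_position(d, i, j):
    """1-based position of c_{ijk} (with k = d - i - j) in the ordering of (8)."""
    m = d - i
    # All triples whose first index is >= i occupy the first (m+1)(m+2)/2 slots;
    # within the group with first index i, larger j comes first.
    return (m + 1) * (m + 2) // 2 - j
```

For example, with $d = 2$ the coefficient $c_{0,1,1}$ sits at position 5 of the six-element vector.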


Fig. 3 First example of two triangles sharing an edge

Barycentric coordinates and Bernstein basis polynomials are convenient for deriving conditions for continuously connecting polynomials defined on adjacent triangles. Assume that we have two triangles $A_1 := \langle v_1, v_2, v_3 \rangle$ and $A_2 := \langle v_4, v_3, v_2 \rangle$ sharing a common edge $e = \langle v_2, v_3 \rangle$, with the Bernstein basis polynomials $\{B^{(1)}_{d;ijk}\}_{i+j+k=d}$ defined on $A_1$ and $\{B^{(2)}_{d;ijk}\}_{i+j+k=d}$ on $A_2$ using the corresponding barycentric coordinates. See Fig. 3 for an illustration. Consider degree-$d$ polynomials $p_1(v)$ defined on $A_1$ and $p_2(v)$ on $A_2$, written in the B-form

$$p_1(v) = \sum_{i+j+k=d} c^{(1)}_{ijk} B^{(1)}_{d;ijk}(v) \quad \text{and} \quad p_2(v) = \sum_{i+j+k=d} c^{(2)}_{ijk} B^{(2)}_{d;ijk}(v). \tag{10}$$

Assume the point $v$ is on the common edge $e$. The barycentric coordinates $(b_1, b_2, b_3)$ of $v \in e$ with respect to $A_1$ and $A_2$ are, respectively, $(0, b_2, 1 - b_2)$ and $(0, 1 - b_2, b_2)$. Therefore the polynomials $p_1(v)$ and $p_2(v)$ reduce to the univariate functions

$$p_1(v) = \sum_{j+k=d} c^{(1)}_{0jk} \frac{d!}{j!\,k!}\, b_2^{\,j} (1 - b_2)^k$$

and

$$p_2(v) = \sum_{j+k=d} c^{(2)}_{0jk} \frac{d!}{j!\,k!}\, (1 - b_2)^j\, b_2^{\,k}.$$

In this case, $p_1$ and $p_2$ join continuously on the edge $e$ if and only if $c^{(1)}_{0jk} = c^{(2)}_{0kj}$ for $j, k \ge 0$ and $j + k = d$.



2.2 Directional derivatives and smoothness

To facilitate the introduction of smoothness restrictions across edges of two triangles when we define bivariate splines, we give the expressions of the directional derivatives of Bernstein basis polynomials. The directional derivative of a multivariate smooth function $f$ at the point $v$ with respect to the direction $w$ is defined as

$$D_w f(v) := \frac{\partial}{\partial t} f(v + t w)\Big|_{t=0} = \lim_{t \to 0} \frac{f(v + t w) - f(v)}{t}. \tag{11}$$

Higher order derivatives can be defined recursively.

Assume that the direction $w = u_1 - u_2$, where $u_1$ and $u_2$ are two points with barycentric coordinates $(\alpha_1, \alpha_2, \alpha_3)$ and $(\beta_1, \beta_2, \beta_3)$ with respect to the triangle $A$. Then $w$ is uniquely described by the triple $(a_1, a_2, a_3)$ with $a_i = \alpha_i - \beta_i$ and $a_1 + a_2 + a_3 = 0$. From the definition and direct calculation, the directional derivative of the Bernstein basis polynomial $B_{d;ijk}$ has the expression

$$D_w B_{d;ijk}(v) = d\,[a_1 B_{d-1;(i-1)jk}(v) + a_2 B_{d-1;i(j-1)k}(v) + a_3 B_{d-1;ij(k-1)}(v)]. \tag{12}$$

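Thus, for $s \in P_d(A)$ expressed in the B-form (5), the directional derivative has the expression

$$\begin{aligned}
D_w s(v) &= \sum_{i+j+k=d} c_{ijk}\, D_w B_{d;ijk}(v) \\
&= \sum_{i+j+k=d} c_{ijk}\, d\,[a_1 B_{d-1;(i-1)jk}(v) + a_2 B_{d-1;i(j-1)k}(v) + a_3 B_{d-1;ij(k-1)}(v)] \\
&= d \sum_{i+j+k=d-1} c^{(1)}_{ijk}(w)\, B_{d-1;ijk}(v),
\end{aligned}$$

where $c^{(1)}_{ijk}(w) = a_1 c_{i+1,j,k} + a_2 c_{i,j+1,k} + a_3 c_{i,j,k+1}$ and Bernstein polynomials with a negative index are taken to be zero. In general, define $c^{(0)}_{ijk}(w) = c_{ijk}$ for $w = (a_1, a_2, a_3)$, and recursively

$$c^{(m)}_{ijk}(w) := a_1 c^{(m-1)}_{i+1,j,k}(w) + a_2 c^{(m-1)}_{i,j+1,k}(w) + a_3 c^{(m-1)}_{i,j,k+1}(w), \quad \text{for } m = 1, \ldots, d. \tag{13}$$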
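The first-order case of this recursion can be checked against a finite difference. The sketch below is illustrative only: all function names are hypothetical, coefficients are stored in a dictionary keyed by $(i, j, k)$, and the direction triple $(a_1, a_2, a_3)$ is obtained as the difference of the barycentric coordinates of $v + w$ and $v$.

```python
from math import factorial
import numpy as np

def triples(d):
    return [(i, j, d - i - j) for i in range(d, -1, -1) for j in range(d - i, -1, -1)]

def bary(v, tri):
    """Barycentric coordinates of v for a triangle given as three (x, y) vertices."""
    A = np.vstack([np.asarray(tri, dtype=float).T, np.ones(3)])
    return np.linalg.solve(A, np.append(np.asarray(v, dtype=float), 1.0))

def bform_eval(d, coef, b):
    """Evaluate sum_{i+j+k=d} c_ijk B_{d;ijk} at barycentric coordinates b."""
    return sum(
        coef[(i, j, k)] * factorial(d) / (factorial(i) * factorial(j) * factorial(k))
        * b[0]**i * b[1]**j * b[2]**k
        for i, j, k in triples(d)
    )

def derivative_coefs(d, coef, a):
    """c^(1)_ijk = a1 c_{i+1,j,k} + a2 c_{i,j+1,k} + a3 c_{i,j,k+1}, i + j + k = d - 1."""
    return {
        (i, j, k): a[0] * coef[(i + 1, j, k)] + a[1] * coef[(i, j + 1, k)]
        + a[2] * coef[(i, j, k + 1)]
        for i, j, k in triples(d - 1)
    }

def directional_derivative(d, coef, tri, v, w):
    """First directional derivative of the B-form polynomial at v in direction w."""
    a = bary(np.asarray(v) + np.asarray(w), tri) - bary(v, tri)  # entries sum to zero
    return d * bform_eval(d - 1, derivative_coefs(d, coef, a), bary(v, tri))
```

Comparing `directional_derivative` with a central difference of `bform_eval` along $w$ gives agreement up to discretization error, which is a quick sanity check before implementing higher-order derivatives.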

Then, the $m$th order directional derivative has the expression

$$D_w^m s(v) = \frac{d!}{(d-m)!} \sum_{i+j+k=d-m} c^{(m)}_{ijk}(w)\, B_{d-m;ijk}(v). \tag{14}$$

(See Theorem 2.15 of Lai and Schumaker 2007.)

Consider two triangles $A_1 := \langle v_1, v_2, v_3 \rangle$ and $A_2 := \langle v_4, v_3, v_2 \rangle$ sharing a common edge $e = \langle v_2, v_3 \rangle$ as depicted in Fig. 3, and two degree-$d$ polynomials $p_1$ and $p_2$ defined separately on $A_1$ and $A_2$. For any direction $w$, let $D_w^l p(v)$ denote the $l$th order derivative in the direction $w$ at the point $v$. We say that $p_1$ and $p_2$ connect smoothly on the common edge $e$ with order $r$ if their derivatives up to order $r$ in all directions agree at every point on $e$, that is,

$$D_w^l p_1(v) = D_w^l p_2(v), \quad \text{all } v \in e \text{ and } l = 0, \ldots, r, \tag{15}$$

and for all directions $w$. The conditions for the smooth connection of two polynomials can be expressed in terms of the coefficients of their B-form representations. Assume that $p_1$ and $p_2$ have the B-form representations given in (10). Then, $p_1$ and $p_2$ are smoothly connected on the edge $e$ with order $r$ if and only if

$$c^{(2)}_{ljk} = \sum_{\nu+\mu+\kappa=l} c^{(1)}_{\nu,k+\mu,j+\kappa}\, B^{(1)}_{l;\nu\mu\kappa}(v_4), \quad j + k = d - l,\ l = 0, \ldots, r, \tag{16}$$

where $v_4$ is the vertex of $A_2$ that is not on $e$, and $c^{(1)}_{ljk}$ and $c^{(2)}_{ljk}$ are coefficients in the B-form expressions (10). (See Theorem 2.28 of Lai and Schumaker 2007.) The condition for continuously connecting two polynomials discussed at the end of the previous subsection is the special case of (16) with $r = 0$.
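Condition (16) can be explored numerically. The sketch below (hypothetical function names, coefficients stored in dictionaries keyed by index triples) computes the coefficients on $A_2 = \langle v_4, v_3, v_2 \rangle$ that (16) determines from those on $A_1 = \langle v_1, v_2, v_3 \rangle$. Taking $r = d$ fixes every coefficient of $p_2$, in which case $p_2$ should coincide with $p_1$ as a polynomial; this gives a simple check of the index bookkeeping.

```python
from math import factorial
import numpy as np

def triples(d):
    return [(i, j, d - i - j) for i in range(d, -1, -1) for j in range(d - i, -1, -1)]

def bary(v, tri):
    """Barycentric coordinates of v for a triangle given as three (x, y) vertices."""
    A = np.vstack([np.asarray(tri, dtype=float).T, np.ones(3)])
    return np.linalg.solve(A, np.append(np.asarray(v, dtype=float), 1.0))

def bernstein(d, ijk, b):
    i, j, k = ijk
    return (factorial(d) / (factorial(i) * factorial(j) * factorial(k))
            * b[0]**i * b[1]**j * b[2]**k)

def bform_eval(d, coef, b):
    return sum(coef[t] * bernstein(d, t, b) for t in triples(d))

def smooth_coefs(d, r, c1, tri1, v4):
    """Coefficients c2[(l, j, k)], l = 0..r, given by (16) from c1 on tri1 = <v1, v2, v3>."""
    b4 = bary(v4, tri1)  # barycentric coordinates of v4 with respect to A1
    c2 = {}
    for l in range(r + 1):
        for j in range(d - l, -1, -1):
            k = d - l - j
            c2[(l, j, k)] = sum(
                c1[(nu, k + mu, j + ka)] * bernstein(l, (nu, mu, ka), b4)
                for nu, mu, ka in triples(l)
            )
    return c2
```

With $r = 0$ the function returns only the coefficients attached to the shared edge, and these reduce to $c^{(2)}_{0jk} = c^{(1)}_{0kj}$.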

2.3 Bivariate splines on a triangulation

Consider a collection of $K$ triangles $\triangle := \{A_1, \ldots, A_K\}$ that covers the domain $\Omega$. This collection is called a triangulation of the domain, provided that if a pair of triangles in $\triangle$ intersect, then their intersection is either a common vertex or a common edge. Let $d$ be a positive integer. Given a triangulation $\triangle$, define the space of degree-$d$ piecewise polynomials on $\triangle$ to be the set of functions that are degree-$d$ polynomials when restricted to each triangle in $\triangle$, that is,

$$S_d(\triangle) = \{s : s|_{A_k} \in P_d(A_k),\ k = 1, \ldots, K\}, \tag{17}$$

where $s|_{A_k}$ denotes the restriction of $s$ to the triangle $A_k$. Since piecewise polynomials can be discontinuous across shared edges of adjacent triangles, they are in general not ideal objects to use for smoothing noisy data. We thus consider a subset of piecewise polynomials that satisfy global smoothness restrictions. For a nonnegative integer $r$, let $C^r(\triangle)$ denote the collection of $r$ times continuously differentiable functions on $\triangle$. Define the $r$ times differentiable spline space $S_d^r(\triangle)$ as

$$S_d^r(\triangle) = S_d(\triangle) \cap C^r(\triangle) = \{s \in C^r(\triangle) : s|_{A_k} \in P_d(A_k),\ k = 1, \ldots, K\}. \tag{18}$$

Now we discuss how to represent bivariate splines using a basis expansion. On each triangle $A_k \in \triangle$, let $\mathcal{B}_{d,k}$ be the set of degree-$d$ bivariate Bernstein basis polynomials. Note the slight abuse of notation here. The subscript $k$ in $\mathcal{B}_{d,k}$ indicates that this collection of Bernstein basis polynomials corresponds to the triangle $A_k$, and therefore it is different from the meaning of $k$ in (4). Since the set of polynomials $\mathcal{B}_{d,k}$ is a basis of $P_d(A_k)$, $k = 1, \ldots, K$, the union of these sets, $\{\mathcal{B}_{d,k}\}_{k=1}^K$, forms a basis of $S_d(\triangle)$. Thus for any $s \in S_d(\triangle)$, there exist vectors $\mathbf{c}_k$ as in (8) such that

$$s|_{A_k} = \mathbf{B}_{d,k}^T \mathbf{c}_k, \quad k = 1, \ldots, K,$$

and

$$s = \mathbf{B}_d^T \mathbf{c},$$

where $\mathbf{B}_d = (\mathbf{B}_{d,1}^T, \mathbf{B}_{d,2}^T, \ldots, \mathbf{B}_{d,K}^T)^T$ and $\mathbf{c} = (\mathbf{c}_1^T, \mathbf{c}_2^T, \ldots, \mathbf{c}_K^T)^T$.

It is well known that B-splines provide easy-to-calculate, locally supported basis functions for representing univariate splines. However, construction of locally supported basis functions for bivariate splines is much more complicated and can only be done on a case-by-case basis. Thus we adopt a different strategy in this paper. We use the Bernstein basis polynomials to represent bivariate splines and impose constraints on the coefficients of the basis expansion to meet the smoothness requirement. Specifically, we repeatedly apply (16) for all shared edges in the triangulation to obtain a matrix $\mathbf{H}$ such that $\mathbf{H}\mathbf{c} = \mathbf{0}$. This constraint matrix $\mathbf{H}$ depends on $d$, $r$ and the structure of the triangulation.

3 Penalized bivariate splines on triangulations

Section 3.1 presents the penalized least squares problem for smoothing noisy data and its solution strategy. Section 3.2 discusses the selection of penalty parameters.

3.1 Penalized least squares

The roughness penalty approach to smoothing noisy data (Green and Silverman 1994) minimizes the following objective function

$$\sum_{i=1}^n \{z_i - f(x_i, y_i)\}^2 + \lambda\, \mathrm{Pen}(f), \tag{19}$$

where $\mathrm{Pen}(f)$ is a roughness penalty which is small when the function is smooth and $\lambda > 0$ is a penalty parameter. A commonly used penalty for surface smoothing is the thin-plate spline penalty (Duchon 1977; Green and Silverman 1994)

$$\mathrm{Pen}_1(f) = \int_\Omega (f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2)\, dx\, dy,$$

where $f_{xx}, f_{xy}, f_{yy}$ are second-order partial derivatives. Another penalty used in the literature is the squared Laplacian penalty (Ramsay 2002)

$$\mathrm{Pen}_2(f) = \int_\Omega (f_{xx} + f_{yy})^2\, dx\, dy.$$

Both of these penalties are invariant to rotation and translation of the variables. This is desirable since the coordinate system used to describe the bivariate smoothing is usually arbitrary and we want the minimizer of (19) not to depend on the choice of the coordinate system. Note that these penalties require second-order partial derivatives and thus the degree of bivariate splines should be at least two. Following the common practice of univariate smoothing, we use cubic splines in this paper.

The two roughness penalties differ in how a candidate function is penalized: there is no mixed second-order derivative term in $\mathrm{Pen}_2$, and the sum of the second-order derivative terms is squared in $\mathrm{Pen}_2$ rather than the terms being separately squared as in $\mathrm{Pen}_1$. While only linear functions are not penalized by $\mathrm{Pen}_1$, the space of functions not penalized by $\mathrm{Pen}_2$ is infinite dimensional, since the second-order derivatives with respect to $x$ and $y$ are allowed to be traded off against each other. Note that the form of $\mathrm{Pen}_2$ is crucial in Ramsay's reformulation of the data smoothing problem as solving a system of partial differential equations and then using the finite element method. Both penalties can be used in our penalized spline method, but a systematic comparison of the two penalties is beyond the scope of the paper.
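The trade-off can be seen on a simple worked example (an illustrative choice of function, not one used elsewhere in this paper). Writing $|\Omega|$ for the area of the domain, the quadratic $f(x, y) = x^2 - y^2$ is penalized by $\mathrm{Pen}_1$ but not by $\mathrm{Pen}_2$:

```latex
\begin{aligned}
f(x, y) &= x^2 - y^2, \qquad f_{xx} = 2, \quad f_{xy} = 0, \quad f_{yy} = -2, \\
\mathrm{Pen}_1(f) &= \int_\Omega \left( 2^2 + 2 \cdot 0^2 + (-2)^2 \right) dx\, dy = 8\, |\Omega|, \\
\mathrm{Pen}_2(f) &= \int_\Omega \left( 2 + (-2) \right)^2 dx\, dy = 0 .
\end{aligned}
```

The same holds for any function whose Laplacian $f_{xx} + f_{yy}$ vanishes on $\Omega$, which is why the null space of $\mathrm{Pen}_2$ is not finite dimensional.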

For the univariate smoothing problem, the penalized spline method (Eilers and Marx 1996) solves the penalized least squares problem in a finite-dimensional space of spline functions. Extending this approach to the bivariate case, we solve the minimization problem (19) but restrict the solution to a finite-dimensional space of bivariate splines. In particular, we minimize the objective function in $S_d^r(\triangle)$ for a triangulation $\triangle$ of the domain $\Omega$. As mentioned in Sect. 2.3, we use Bernstein basis polynomials to represent the bivariate splines and introduce the constraint matrix $\mathbf{H}$ on the coefficients to enforce smoothness across shared edges of triangles. Write the target bivariate function as $f(x, y) = \mathbf{B}_d^T(x, y)\, \mathbf{c}$, where the vector of coefficients satisfies the constraint $\mathbf{H}\mathbf{c} = \mathbf{0}$. Let $\mathbf{z} = (z_1, \ldots, z_n)^T$ be the vector of $n$ observations of the response variable. Denote $\mathbf{B} = (\mathbf{B}_d(x_1, y_1), \ldots, \mathbf{B}_d(x_n, y_n))^T$. Then the optimization problem (19) reduces to

$$\min_{\mathbf{c}} \{\|\mathbf{z} - \mathbf{B}\mathbf{c}\|^2 + \lambda\, \mathbf{c}^T \mathbf{P} \mathbf{c}\} \quad \text{subject to } \mathbf{H}\mathbf{c} = \mathbf{0}, \tag{20}$$

where $\mathbf{P}$ is the penalty matrix such that $\mathrm{Pen}(f) = \mathrm{Pen}(\mathbf{B}_d^T \mathbf{c}) = \mathbf{c}^T \mathbf{P} \mathbf{c}$.

To solve the constrained minimization problem (20), we first remove the constraint via a QR decomposition of the transpose of the matrix $\mathbf{H}$ and convert the problem to a conventional penalized regression problem without any restriction. More specifically, we assume

$$\mathbf{H}^T = \mathbf{Q}\mathbf{R} = [\mathbf{Q}_1\ \mathbf{Q}_2] \begin{bmatrix} \mathbf{R}_1 \\ \mathbf{R}_2 \end{bmatrix}, \tag{21}$$

where $\mathbf{Q}$ is an orthogonal matrix and $\mathbf{R}$ is an upper triangular matrix, the submatrix $\mathbf{Q}_1$ consists of the first $r$ columns of $\mathbf{Q}$, where $r$ is the rank of the matrix $\mathbf{H}$, and $\mathbf{R}_2$ is a matrix of zeros.



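We reparametrize using $\mathbf{c} = \mathbf{Q}_2 \tilde{\mathbf{c}}$ for some $\tilde{\mathbf{c}}$, which guarantees that $\mathbf{H}\mathbf{c} = \mathbf{0}$. The problem (20) then becomes

$$\min_{\tilde{\mathbf{c}}} \{\|\mathbf{z} - \mathbf{B}\mathbf{Q}_2 \tilde{\mathbf{c}}\|^2 + \lambda\, (\mathbf{Q}_2 \tilde{\mathbf{c}})^T \mathbf{P} (\mathbf{Q}_2 \tilde{\mathbf{c}})\}. \tag{22}$$

For a fixed penalty parameter $\lambda$, if $\hat{\tilde{\mathbf{c}}}$ solves (22), the solution of the original problem (20) is

$$\hat{\mathbf{c}} = \mathbf{Q}_2 \hat{\tilde{\mathbf{c}}} = \mathbf{Q}_2 \mathbf{Q}_2^T (\mathbf{B}^T \mathbf{B} + \lambda \mathbf{P})^{-1} \mathbf{Q}_2 \mathbf{Q}_2^T \mathbf{B}^T \mathbf{z}. \tag{23}$$

The bivariate smooth function is estimated as $\hat{f}(x, y) = \mathbf{B}_d^T(x, y)\, \hat{\mathbf{c}}$.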
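The reparametrization strategy can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used for the results in this paper: the function names are hypothetical, the constraint matrix is assumed to have full row rank (so that the trailing columns of the full QR factor span its null space), and the reduced problem (22) is solved directly through its normal equations.

```python
import numpy as np

def null_space_basis(H):
    """Orthonormal basis Q2 of {c : H c = 0} from a full QR decomposition of H^T."""
    p = H.shape[0]
    # Assumption of this sketch: H has full row rank.
    if np.linalg.matrix_rank(H) != p:
        raise ValueError("this sketch assumes H has full row rank")
    Q, _ = np.linalg.qr(H.T, mode="complete")
    return Q[:, p:]

def fit_constrained(B, z, P, H, lam):
    """Minimize ||z - B c||^2 + lam * c' P c subject to H c = 0."""
    Q2 = null_space_basis(H)
    BQ = B @ Q2
    lhs = BQ.T @ BQ + lam * (Q2.T @ P @ Q2)   # normal equations of the reduced problem
    c_tilde = np.linalg.solve(lhs, BQ.T @ z)
    return Q2 @ c_tilde                        # map back to the original coefficients
```

The returned vector satisfies the constraint up to rounding error, and it can be compared with the solution of the corresponding Lagrangian (KKT) linear system as a check.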

3.2 Selecting the penalty parameter

The penalty parameter $\lambda$ controls the trade-off between model fitting and model parsimony. A large value of $\lambda$ enforces a smoother fitted function with potentially larger fitting errors, while a small value yields a rougher fitted function and potentially smaller fitting errors. Since the in-sample fitting errors cannot gauge the prediction property of the fitted function, one should target a criterion function that mimics the out-of-sample performance of the fitted model. Generalized cross-validation (GCV) (Craven and Wahba 1979; Wahba 1990) is such a criterion and is widely used for choosing the penalty parameter. The closed-form expression of the GCV criterion is

$$\mathrm{GCV}(\lambda) = \frac{n\, \|\mathbf{z} - \mathbf{A}(\lambda)\mathbf{z}\|^2}{\{\mathrm{tr}(\mathbf{I} - \mathbf{A}(\lambda))\}^2}, \tag{24}$$

where $\mathbf{A}(\lambda)$ is the hat matrix depending on $\lambda$ with the form

$$\mathbf{A}(\lambda) = \mathbf{B}\mathbf{Q}_2 \mathbf{Q}_2^T (\mathbf{B}^T \mathbf{B} + \lambda \mathbf{P})^{-1} \mathbf{Q}_2 \mathbf{Q}_2^T \mathbf{B}^T. \tag{25}$$
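The GCV criterion itself is straightforward to compute once a hat matrix is available. The sketch below uses a generic penalized regression with design matrix `X` and penalty matrix `P`, for which the hat matrix is $X(X^TX + \lambda P)^{-1}X^T$; the function name `gcv` and this generic setting are assumptions made for illustration. For the constrained spline fit one could pass the reduced design $\mathbf{B}\mathbf{Q}_2$ and the reduced penalty $\mathbf{Q}_2^T\mathbf{P}\mathbf{Q}_2$, which is again an assumption of the sketch.

```python
import numpy as np

def gcv(lam, X, z, P):
    """GCV score n * ||z - A z||^2 / tr(I - A)^2 with A = X (X'X + lam P)^{-1} X'."""
    n = len(z)
    A = X @ np.linalg.solve(X.T @ X + lam * P, X.T)   # hat matrix
    resid = z - A @ z
    return n * (resid @ resid) / (n - np.trace(A)) ** 2
```

A simple sanity check is the case where `X` and `P` are both identity matrices: the hat matrix is then $\mathbf{I}/(1+\lambda)$ and the score equals $\|\mathbf{z}\|^2/n$ for every $\lambda > 0$.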

We need to employ a numerical optimization algorithm to find the $\lambda$ value that minimizes the GCV criterion. Following Wood (2004), we search for the optimal $\lambda$ using Newton's method on $\eta = \log \lambda$. In detail, we replace $\lambda$ with $\exp \eta$ in (24). Then, according to Newton's method, the $k$th update of $\eta$ is

$$\eta^{(k+1)} = \eta^{(k)} - M_k^{-1} m_k, \tag{26}$$

where $m_k$ and $M_k$ are the values of the first and second derivatives of the GCV criterion with respect to $\eta$ evaluated at $\eta^{(k)}$. Once the algorithm has converged, we obtain the optimal $\eta_{\mathrm{opt}}$ and therefore $\lambda_{\mathrm{opt}} = \exp \eta_{\mathrm{opt}}$. The transformation from $\lambda$ to $\eta$ guarantees that the chosen $\lambda$ is positive.
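The Newton iteration on $\eta = \log\lambda$ can be sketched as below. The function name and its arguments are hypothetical, and the first and second derivatives are approximated by central finite differences purely for illustration; an implementation may well use analytic derivatives instead. `criterion` is any function of $\lambda$, for example a GCV score.

```python
import numpy as np

def newton_log_lambda(criterion, eta0=0.0, h=1e-4, tol=1e-8, max_iter=50):
    """Minimize criterion(lambda) over lambda > 0 by Newton's method on eta = log(lambda)."""
    eta = eta0
    for _ in range(max_iter):
        f0 = criterion(np.exp(eta))
        fp = criterion(np.exp(eta + h))
        fm = criterion(np.exp(eta - h))
        m = (fp - fm) / (2 * h)             # finite-difference first derivative in eta
        M = (fp - 2 * f0 + fm) / h ** 2     # finite-difference second derivative in eta
        if M <= 0:
            # Not locally convex: a plain Newton step is unreliable, so stop here.
            break
        step = m / M
        eta -= step
        if abs(step) < tol:
            break
    return np.exp(eta)
```

Because the iteration runs on $\eta$, the returned value `np.exp(eta)` is positive by construction. In practice a safeguard such as a line search or a coarse grid search for the starting value may be needed, since the criterion need not be convex in $\eta$.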


4 Implementation details

This section discusses some implementation details such as construction and representation of the triangulation, construction of the constraint matrix, and some computational issues.

4.1 Construction of triangulation

Our method requires the construction of a triangulation. We refer to Chapter 4 of Lai and Schumaker (2007) for a detailed discussion of triangulations. Clearly no single algorithm exists that is appropriate for all purposes. We manually generated triangulations for the numerical examples in this paper. The general guideline we followed is to avoid triangles with a very small interior angle and to make sure that every triangle contains at least one data point. When applying the penalized spline method to univariate smoothing, the number of knots is not crucial in many applications as long as it is moderately large. Similarly, for the bivariate cases we consider here, we expect the results not to be very sensitive to the triangulation when sufficiently many triangles are used to capture features of interest, since the roughness penalty helps regularize the estimation. A triangulation with 20–40 triangles worked well in our simulation study.

4.2 Computer representation of triangulation

For computer implementation, we need to represent the structure of a triangulation in a concise way. Although there are other ways of doing it, we find the following approach convenient. We first label all vertices in the triangulation in any order, as long as there are no duplicate labels, and then construct two lists. One list is the vertex list, with the $i$th row storing the Cartesian coordinates $(x_i, y_i)$ of the $i$th vertex. The second list is the triangle list, with the $i$th row, a triple $(l_{i1}, l_{i2}, l_{i3})$, recording that the $i$th triangle is comprised of the $l_{i1}$th, $l_{i2}$th and $l_{i3}$th points in the vertex list.

As an example, consider the triangulation of the region consisting of two adjacent triangles shown in Fig. 4. Tables 1 and 2 give the two-list representation of the triangulation. According to the triangle list shown in Table 2, the first triangle $A_1$ consists of vertices $v_1, v_2, v_3$ and the second triangle $A_2$ consists of $v_1, v_3, v_4$. The locations of vertices $v_1, v_2, v_3$, and $v_4$ are recorded in the vertex list shown in Table 1. This representation is not unique; for example, the first row of the triangle list can be any permutation of 1, 2, 3. However, this non-uniqueness does not create any problem since using different representations does not change the structure of the triangulation.

Fig. 4 Second example of two triangles sharing an edge

Table 1 Vertex list for the triangulation in Fig. 4

| $v_i$ | $x_i$ | $y_i$ |
|---|---|---|
| 1 | 0 | 0 |
| 2 | 0.5 | 0 |
| 3 | 0.5 | 0.5 |
| 4 | 0 | 0.5 |

Table 2 Triangle list for the triangulation in Fig. 4

| Triangle index | Vertex 1 | Vertex 2 | Vertex 3 |
|---|---|---|---|
| 1 | 1 | 2 | 3 |
| 2 | 1 | 3 | 4 |
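The two-list representation of Tables 1 and 2 maps directly onto two arrays. The sketch below stores them with NumPy and, as an example of what the representation makes easy, finds the edges shared by two triangles, which is where smoothness constraints are needed. The function name `shared_edges` is hypothetical, and vertex and triangle labels are kept 1-based to match the tables.

```python
from collections import defaultdict
import numpy as np

# Vertex list (Table 1): row i holds the Cartesian coordinates of vertex i + 1.
vertices = np.array([[0.0, 0.0], [0.5, 0.0], [0.5, 0.5], [0.0, 0.5]])
# Triangle list (Table 2): each row holds the 1-based labels of a triangle's vertices.
triangles = np.array([[1, 2, 3], [1, 3, 4]])

def shared_edges(triangles):
    """Map each edge shared by two triangles to the 1-based indices of those triangles."""
    owners = defaultdict(list)
    for t, tri in enumerate(triangles, start=1):
        for a, b in ((0, 1), (1, 2), (2, 0)):
            edge = tuple(sorted((int(tri[a]), int(tri[b]))))
            owners[edge].append(t)
    return {edge: ts for edge, ts in owners.items() if len(ts) == 2}

# Coordinates of each triangle's vertices, shape (number of triangles, 3, 2).
triangle_coords = vertices[triangles - 1]
```

For the triangulation in Fig. 4 the only shared edge is the one joining vertices 1 and 3.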

4.3 Construction of the constraint matrix

Equation (16) provides the basis for constructing the constraint matrix $\mathbf{H}$. We explain the detailed steps of the construction here through an example. Consider the triangulation in Fig. 4 and the construction of bivariate splines on this triangulation using degree-2 ($d = 2$) piecewise polynomials. There are $(d + 1)(d + 2)/2 = 6$ Bernstein basis polynomials defined on each triangle, with corresponding coefficients denoted as $\{c^{(1)}_{ijk}\}$ and $\{c^{(2)}_{ijk}\}$. Denote the vector $\mathbf{c} = (c^{(1)}_{2,0,0}, c^{(1)}_{1,1,0}, c^{(1)}_{1,0,1}, c^{(1)}_{0,2,0}, c^{(1)}_{0,1,1}, c^{(1)}_{0,0,2}, c^{(2)}_{2,0,0}, \ldots, c^{(2)}_{0,0,2})^T$ as the concatenation of $\{c^{(1)}_{ijk}\}$ and $\{c^{(2)}_{ijk}\}$ using the ordering in (8).

Suppose we want to introduce a smoothness constraint such that the bivariate spline is continuous over the whole region. Applying (16) with $l = r = 0$, we obtain

$$c^{(1)}_{0jk} = c^{(2)}_{0kj}\, B^{(2)}_{0;000}(v_4) = c^{(2)}_{0kj}, \tag{27}$$

for any non-negative integers $k, j$ such that $k + j = 2$. The constraint matrix is

$$\mathbf{H} = \left[\begin{array}{rrrrrrrrrrrr}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0
\end{array}\right].$$


Suppose we want to introduce a smoothness constraint such that the bivariate spline has continuous first derivatives over the whole region. In addition to the equations in (27), we also need to apply (16) with $l = 1$. The resulting equations are

$$\begin{aligned}
c^{(1)}_{1jk} &= c^{(2)}_{1kj}\, B^{(2)}_{1;100}(v_4) + c^{(2)}_{0,k+1,j}\, B^{(2)}_{1;010}(v_4) + c^{(2)}_{0,k,j+1}\, B^{(2)}_{1;001}(v_4) \\
&= c^{(2)}_{1kj} - c^{(2)}_{0,k+1,j} + c^{(2)}_{0,k,j+1},
\end{aligned} \tag{28}$$

for any non-negative integers $k, j$ such that $k + j = 1$. Here we used the fact that the barycentric coordinates of $v_4$ with respect to the triangle $A_1$ are $(1, -1, 1)$. The equations in (27) and (28) together yield the following constraint matrix

$$\mathbf{H} = \left[\begin{array}{rrrrrrrrrrrr}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & -1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & -1 & 0
\end{array}\right]. \tag{29}$$

Here, the first three rows of $\mathbf{H}$ correspond to (27) and the last two rows of $\mathbf{H}$ correspond to (28).

4.4 Some computation issues

Both the calculation of the coefficient estimates using (23) and the evaluation of the GCV criterion (24) involve the inversion of the matrix $(\mathbf{B}^T \mathbf{B} + \lambda \mathbf{P})$, where $\mathbf{B}$ is the evaluation matrix of Bernstein basis polynomials and $\mathbf{P}$ is the penalty matrix. When we have $K$ triangles in the triangulation, $(\mathbf{B}^T \mathbf{B} + \lambda \mathbf{P})$ is a $\{K(d + 1)(d + 2)/2\} \times \{K(d + 1)(d + 2)/2\}$ matrix. This matrix can be very large and the computation of its inverse is expensive, especially when the domain is irregular with a curvy boundary and a large number of small triangles is needed for a good coverage of the domain. However, thanks to the locality of Bernstein basis polynomials, both matrices $\mathbf{B}^T \mathbf{B}$ and $\mathbf{P}$ are block diagonal, with each block of dimension $\{(d + 1)(d + 2)/2\} \times \{(d + 1)(d + 2)/2\}$. Thus, we can invert each small block to obtain the inverse of the whole matrix. Since $\mathbf{B}$, $\mathbf{P}$ and $\mathbf{H}$ are sparse matrices, we can also save memory by storing only the non-zero elements.
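The block-wise strategy can be sketched as follows, assuming the system matrix is block diagonal as described and is supplied as a list of its diagonal blocks (one block per triangle). The function name `solve_block_diagonal` is hypothetical; solving with each block is used instead of forming explicit inverses, which is a common numerical choice rather than something prescribed here.

```python
import numpy as np

def solve_block_diagonal(blocks, rhs):
    """Solve M x = rhs, where M is block diagonal with the given square blocks."""
    out = np.empty_like(rhs, dtype=float)
    start = 0
    for blk in blocks:
        m = blk.shape[0]
        # Each block only touches its own slice of the right-hand side.
        out[start:start + m] = np.linalg.solve(blk, rhs[start:start + m])
        start += m
    return out
```

With $K$ blocks of size $m = (d+1)(d+2)/2$, the cost grows linearly in $K$ (roughly $K m^3$ operations) instead of cubically in $Km$ as for a dense solve, and only the $K$ small blocks need to be stored.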

5 Numerical results

This section illustrates the proposed penalized bivariate spline method using a simulation study and a real dataset. We also report some results from the simulation study comparing the proposed method with several existing bivariate smoothers.



Fig. 5 Comparison of four surface smoothers for estimating a function over a horseshoe-shaped domain for a simulated dataset with noise level σ = 1. Top left panel shows an image plot with contour lines of the true surface. The middle and bottom panels show image plots of the estimation errors of using the penalized bivariate splines (PBS), soap film smoother (SOAP), thin-plate regression splines (TPRS), and tensor-product splines (TPS). Top right panel shows the triangulation used by the PBS

5.1 Simulation of model fitting on a complicated domain

Our simulation study used the horseshoe-shaped domain (referred to as the target domain hereafter) shown in Fig. 5. Such a domain was used previously in the literature to illustrate smoothing over difficult domains (Ramsay 2002; Wood et al. 2008).

The surface function to be recovered from noisy data is

$$f(x, y) = 8 \sin(xy) + 0.5\, x^2\, I(x > 0, y > 0) - 0.5\, x^2\, I(x > 0, y < 0), \tag{30}$$

whose contour plot is shown in the top left panel of Fig. 5. This function is smooth over the manifold of the horseshoe-shaped domain but is not smooth over $\mathbb{R}^2$. We uniformly sampled $n = 500$ locations in the rectangle $[-1, 4] \times [-1, 1]$ and evaluated the function at only those locations falling in the target domain. We then added i.i.d. white noise from the $N(0, \sigma^2)$ distribution to the function values to obtain the simulated data. Three noise levels, $\sigma = 0.1$, 1, and 2, were considered. The goal is to estimate the surface function (30) based on the noisy data.

Fig. 6 Comparison of four surface smoothers for estimating a function over a horseshoe-shaped domain at three noise levels, σ = 0.1, 1, and 2. The four smoothers are the penalized bivariate splines (PBS), soap film smoother (SOAP), thin-plate regression splines (TPRS), and tensor-product splines (TPS). Boxplots of IMSEs are shown based on 400 simulation runs at each noise level
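The data-generating step described in Sect. 5.1 can be sketched as follows. The test function is (30); the names `f_true` and `simulate` are hypothetical, and because the geometry of the horseshoe-shaped domain is not spelled out in the text, membership in the target domain is delegated to a user-supplied predicate `in_domain`, which is an assumption of this sketch.

```python
import numpy as np

def f_true(x, y):
    """Test surface (30)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return (8 * np.sin(x * y)
            + 0.5 * x**2 * ((x > 0) & (y > 0))
            - 0.5 * x**2 * ((x > 0) & (y < 0)))

def simulate(n, sigma, in_domain, rng):
    """Sample n locations uniformly in [-1, 4] x [-1, 1], keep those in the domain, add noise."""
    x = rng.uniform(-1, 4, n)
    y = rng.uniform(-1, 1, n)
    keep = in_domain(x, y)          # hypothetical predicate returning a boolean array
    x, y = x[keep], y[keep]
    z = f_true(x, y) + rng.normal(0.0, sigma, x.size)
    return x, y, z
```

Note that the number of retained observations is random and at most $n$, since only the sampled locations that fall in the target domain are kept.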

We applied the proposed penalized bivariate spline method to the data using the triangulation depicted in the top right panel of Fig. 5. We used $d = 3$ and $r = 1$ when generating bivariate splines and used the thin-plate spline penalty to regularize the function estimation. We also applied three other surface smoothers from the literature: the soap film smoother (Wood et al. 2008), thin-plate regression splines (Wood 2003), and tensor product splines. We employed the implementation of these three smoothers in the mgcv R package (Wood 2006). We used 60 basis functions for the soap film smoother and the thin-plate regression splines, and 64 basis functions (8 marginal basis functions for each variable) for the tensor product splines. All four methods use roughness penalties to regularize the fit; the soap film smoother uses the squared Laplacian penalty while the other three smoothers use the thin-plate spline penalty. The GCV criterion is used by all four methods for selecting the penalty parameters. Figure 5 shows the estimation errors of the four smoothing methods for a simulated dataset when the noise level is $\sigma = 1$. We can see that the penalized bivariate spline method gives overall smaller estimation errors than the other three methods. Similar behaviors of the smoothers were observed for other datasets and for other noise levels.

Fig. 7 Texas monthly temperature data. Top left panel: locations of all weather stations on the Texas map are marked as triangles; the triangulation used for the penalized bivariate spline method is shown as grey lines. Top right panel: time series plot of available data for two stations; the discontinuous parts indicate missing data at those times. Lower panels: the stations having observations in August and December 1987 are shown as triangles, the size of the triangle indicates the magnitude of temperature; stations without observations are marked with the plus sign

To get a systematic assessment of the performance of the bivariate smoothers, we ran the simulation 400 times for each noise level. For each simulation run, we calculated the integrated squared error (ISE) of the function estimate over the target domain for each method. Boxplots summarizing the results are given in Fig. 6. It is clear that the proposed penalized bivariate spline method outperforms the other three smoothers. The thin-plate splines and the tensor-product splines gave higher ISEs because neither method takes into account the manifold structure of the domain. The soap film smoother is designed to deal with difficult regions. Its inferior performance in this example may be attributed to the fact that it needs extra effort to estimate the boundary functions.
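The ISE of an estimate is the integral of the squared difference between the estimate and the true function over the target domain. The text does not say how the integral was evaluated numerically, so the following is only a minimal sketch, assuming a midpoint rule on a regular grid restricted to the domain. The names `integrated_squared_error`, `f_true`, `f_hat`, and `in_domain` are hypothetical.

```python
import numpy as np

def integrated_squared_error(f_true, f_hat, in_domain,
                             xlim=(-1.0, 4.0), ylim=(-1.0, 1.0), m=200):
    """Approximate the integral of (f_hat - f_true)^2 over the domain by a
    midpoint rule on an m x m grid covering the bounding rectangle."""
    hx = (xlim[1] - xlim[0]) / m            # cell width
    hy = (ylim[1] - ylim[0]) / m            # cell height
    gx = xlim[0] + hx * (np.arange(m) + 0.5)  # cell-centre x coordinates
    gy = ylim[0] + hy * (np.arange(m) + 0.5)  # cell-centre y coordinates
    X, Y = np.meshgrid(gx, gy)
    inside = in_domain(X, Y)                # only cells inside the domain count
    err2 = (f_hat(X[inside], Y[inside]) - f_true(X[inside], Y[inside])) ** 2
    return float(err2.sum() * hx * hy)
```

As a sanity check, an estimate that is off by a constant 0.5 everywhere on a region of area 2 should give an ISE close to 0.25 × 2 = 0.5. Repeating the calculation over the 400 simulated datasets at each noise level would give the values summarized by the boxplots.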


Fig. 8 Image plots with contour lines of the estimated temperature surfaces for four selected months, using penalized bivariate splines

5.2 Texas temperature data

In this section, we apply the penalized bivariate spline method to a real data example: constructing the temperature surface over the state of Texas based on data collected by the International Research Institute for Climate and Society. The dataset available to us consists of monthly average temperatures (unit: °C) at 52 weather stations in the state of Texas from Jan. 1867 to Dec. 1995. Figure 7 shows the locations of the stations and the temperature records from two selected stations and two selected months. In some time periods, most stations have no record. For example, only 15 stations have temperature records in 1989 and 11 stations in 1990. But almost all stations have records between 1930 and 1987.

We applied the penalized bivariate spline method to data from four selected months, August of 1986 and 1987, and December of 1986 and 1987, to reconstruct the temperature surface for each month using data from all stations in that month. We used cubic bivariate splines (d = 3) on the triangulation shown on the top left panel of Fig. 7 and imposed continuous first directional derivatives across shared edges (r = 1). The thin-plate penalty was used to regularize the estimation and GCV was used to select the penalty parameters. The resulting estimated temperature surfaces depicted in Fig. 8 show the following temperature patterns: the spatial temperature variation is smaller during August than during December, and there is little variation in the eastern half of Texas during August. Moreover, December temperature shows a clear latitude effect: temperature gets higher as the latitude gets lower.
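Both the simulation study and the Texas analysis select the penalty parameters by GCV. As a rough illustration of that selection step, the sketch below evaluates the standard Craven and Wahba form of the criterion, GCV(λ) = n · RSS(λ) / (n − tr H(λ))², for a generic penalized least-squares fit with a single penalty parameter. Everything here is an assumption made for illustration: the names `gcv_score` and `select_lambda`, the basis matrix `B`, and the penalty matrix `P` are hypothetical. The toy one-dimensional hat-function basis with a second-difference penalty stands in for the bivariate spline basis, the smoothness constraints, and the thin-plate penalty of the actual method.

```python
import numpy as np

def gcv_score(B, P, y, lam):
    """GCV(lam) = n * RSS / (n - tr(H))^2, with H = B (B'B + lam P)^{-1} B'."""
    n = y.size
    A = B.T @ B + lam * P
    H = B @ np.linalg.solve(A, B.T)   # hat matrix of the penalized fit
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2

def select_lambda(B, P, y, grid):
    """Return the grid value of lam with the smallest GCV score."""
    scores = [gcv_score(B, P, y, lam) for lam in grid]
    return grid[int(np.argmin(scores))]

# Toy setup (an assumption, not the paper's basis): hat functions on [0, 1].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
knots = np.linspace(0.0, 1.0, 10)
h = knots[1] - knots[0]
B = np.maximum(0.0, 1.0 - np.abs(x[:, None] - knots[None, :]) / h)
D = np.diff(np.eye(knots.size), n=2, axis=0)   # second-difference operator
P = D.T @ D                                     # roughness penalty matrix
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

grid = 10.0 ** np.arange(-4, 3)
best_lam = select_lambda(B, P, y, grid)
```

A grid search is used only for simplicity; with more than one penalty parameter the same score would be minimized over a multi-dimensional grid or with a numerical optimizer.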

Acknowledgments This research is partially supported by NSF Grant DMS-0907170. The authors thank the Co-editor, the AE, and two reviewers for helpful comments. The authors also thank Jianhua Huang for encouragement and many helpful discussions.

References

Craven P, Wahba G (1979) Smoothing noisy data with spline functions. Numerische Mathematik 31:377–403

Duchon J (1977) Splines minimizing rotation invariant semi-norms in Sobolev spaces. Lect Notes Math 571:85–100

Eilers P (2006) P-spline smoothing on difficult domains. Seminar at Ludwig-Maximilians University Munich. http://www.statistik.lmu.de/sfb386/workshop/smcs2006/slides/eilers.pdf

Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–121

Green P, Silverman B (1994) Nonparametric regression and generalized linear models: a roughness penalty approach. Chapman and Hall, London

Hansen M, Kooperberg C, Sardy S (1998) Triogram models. J Am Stat Assoc 93:101–119

Koenker R, Mizera I (2004) Penalized triograms: total variation regularization for bivariate smoothing. J R Stat Soc B 66:145–163

Lai MJ, Schumaker LL (2007) Spline functions on triangulations. Cambridge University Press, Oxford

Ramsay T (2002) Spline smoothing over difficult regions. J R Stat Soc B 64:307–319

Wahba G (1990) Spline models for observational data. SIAM, Philadelphia

Wang H, Ranalli M (2007) Low-rank smoothing splines on complicated domains. Biometrics 63:209–217

Wood SN (2003) Thin plate regression splines. J R Stat Soc 65:95–114

Wood SN (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. J Am Stat Assoc 99:673–686

Wood SN (2006) Generalized additive models: an introduction with R. Chapman and Hall/CRC Press, London

Wood SN, Bravington MV, Hedley SL (2008) Soap film smoothing. J R Stat Soc B 70:931–955
