10.1.1.194.9300

38
Estimation of Hedonic Price Functions via Additive Nonparametric Regression by Carlos Martins-Filho Okmyung Bin Department of Economics Department of Economics Oregon State University and East Carolina Universit y Ballard Hall 303 A-426 Brewster Corvallis, OR 97331-3612 USA Greenville, NC 27858-4353 USA email: carlos.martins@orst.edu email: bino@mail.ecu.edu V oice: + 1 541 737 1476 V oice: + 1 252 328 2872 F ax: + 1 541 737 5917 F ax: + 1 252 328 6743 November, 2001 abstract . We model a hedonic price function for housi ng as an additive nonparametric regression. Esti - mation is done via a back¯tti ng procedure in combination with a local polynomial estimator. It avoid s the pitfalls of an unrestricted nonparametric estimator, such as slow convergence rates and the curse of dimen- sionality . Bandwidths are chosen via a nov el plug in method that minimizes the asymptotic mean average square d error (AMA SE) of the regressi on. We compare our resul ts to an alternative parametric model and ¯nd evidence of the superio rity of our nonparametric model. F rom an empi rical perspective our study is interesting in that the e®ects on housing prices of a series of environmental characteristics are modeled in the regression. We ¯nd these characteristics to be important in the determination of housing prices. Keywords and Phrases: additive nonparametric regression; local polynomial estimation; hedonic price mod- els; housing markets. JEL clas si¯c ations. C14, R21. 1

Upload: somnath-mukherjee

Post on 05-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 1/38

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 2/38

1 Introduction

Hedonic price models have been used extensively in applied economics since the seminal work by Rosen(1974).1

A frequent concern in this literature is the adequacy of commonly assumed parametric speci¯cations as hedo-

nic price functions. This speci¯cation problem arises quite naturally from the inability of economic theory to

provide guidance on how characteristics of similar products relate functionally to their market prices. Recog-

nizing the potentially serious consequences of functional misspeci¯cation, most researchers have attempted

to estimate hedonic price models by specifying more °exible regression models. Most of these attempts have

concentrated on parametric speci¯cations ranging from simple data transformations, including the model

introduced by Box and Cox(1964) and its variants, approximations based on second order Taylor-expansions

to °exible non-linear models such as those introduced by Wooldridge(1992). A considerably smaller set of 

authors have proposed semi and fully (unrestricted) nonparametric speci¯cations for hedonic price functions.

Nonparametric regression models are very °exible in that regressions are allowed to belong to a vastly

broader class of functions than that in parametric models. However, their use in applied economics has not

been as prevalent as one would expect, or comparable to their use in other disciplines, such as Biostatistics.2

In fact, nonparametric estimation of hedonic price functions has appeared in just a few articles and has

concentrated almost exclusively in housing markets. Among these papers are the important contributions

of Hartog and Bierens(1989), Stock(1991), Pace(1993,1995), Anglin and Gencay (1996), Gencay and Yang

(1996) and Iwata, Murao and Wang(2000). Closer inspection of this literature, however, reveals the possibility

of a number of methodological improvements that can add to both the ease of obtaining, and interpreting

hedonic price nonparametric functional estimates. These improvements fall into three broad categories: (a)

speci¯cation of the regression class; (b) choice of the smoother underlying the estimation; and (c) choice of 

the bandwidths. In this paper we address each of these categories.

With the exception of Iwata, Murao and Wang(2000), all the articles listed above specify a regression class

1Recent examples include among many Dreyfus and Viscusi(1995), Walls (1996) and Hill, Knight and Sirmans (1997).2Yatchew(1998) provides a list of potential reasons for this relative scarcity of applied nonparametric economic modeling.

2

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 3/38

that requires the estimation of multivariate smoothers. There are a number of practical, as well as theoretical

problems that emerge when estimating multivariate smoothers. First, there is the curse of dimensionality 

identi¯ed by Friedman and Stuetzel(1981). The problem is specially acute in data sets used by economists

in hedonic price function estimation. These data sets are not in general very large but have at times a vast

list of product attributes or characteristics which contribute to slow convergence rates of the estimators and

diminished con¯dence on inference. Second, when de¯ning neighborhoods in two or more dimensions for

local averaging, an universal characteristic of multivariate nonparametric estimation, there is the need to

assume some type of metric that is hard to justify when the variables are measured in di®erent units or

are highly correlated. Third, from a practical perspective, multivariate smoothers are extremely expensive

to compute and even with the use of sophisticated graphical analysis four or higher dimensional smooths

are virtually impossible to represent or interpret. Even in more parsimonious speci¯cations, such as that in

Robinson(1988) and Speckman(1988), applied research inevitably encounters the uncertain prospect of ¯tting

multivariate smoothers. Since one of the objectives of hedonic price modeling is to easily interpret and isolate

the contributions of a given attribute to market price variability, holding all other product characteristics

¯xed, we ¯nd the use of a fully unrestricted nonparametric regression undesirable. We believe much better

results in hedonic price modeling can be obtained by estimating an additive nonparametric regression model

(ANRM), as in Hastie and Tibshirani(1986). Estimation of these models involve only univariate smoothing

but allow for multiple regressors, and due to their additive nature lend themselves to easy interpretation and

analysis.

Regarding the choice of estimator, we depart from the standard literature in applied economics by using

local polynomial estimators rather than the popular Nadaraya-Watson (NW) estimator. Recent work by

Fan(1992), Fan, Gasser, Gijbels, Brockmann and Engel(1993) and Ruppert and Wand(1994) has shown that

local polynomial estimators possesses a number of desirable theoretical and practical properties relative to

other smoothing methods, including the NW estimator. One of the most serious pitfalls of the NW estimator

3

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 4/38

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 5/38

E (Y jX 1 = x1; ¢ ¢ ¢ ; X D = xD) = ® +DXd=1

md(xd): (1)

We assume that n independent observations f(yt; xt1;:::;xtD)0gnt=1 are taken on the random vector (Y; X 1; ¢ ¢ ¢

; X D) and that V (Y jX 1 = x1; ¢ ¢ ¢ ; X D = xD) = ¾2, an unknown parameter. The md(¢) are real valued

measurable functions with E (md(¢)) = 0, E (m2d(¢)) < 1 belonging to H d a Hilbert space with inner product

given by E (md(X d)m0d(X d)) for each d.5 Under these assumptions E (Y ) = ® and the optimal predictor for

Y  given X 1; ¢ ¢ ¢ ; X D can be characterized by

md(xd) = E 

0@Y  ¡ ® ¡

DX±=1;±6=d

m±(¢)jX d = xd

1A (2)

for d = 1;:::;D.6 The system of equations in (2) provides the motivation for the back tting estimator 

(B-estimator) for md(¢) proposed by Friedman and Stuetzle(1981). The idea is to simultaneously solve

(2), estimating the conditional expectations via  a suitably chosen linear smoother with regressand given by

Y  ¡ ® ¡ PD±=1;±6=d m±(¢) and regressor given by X d. A precise de¯nition of the B-estimator requires some

additional notation. Let an ´ (a; ¢ ¢ ¢ ; a)0 be the n £ 1 vector with the constant a as its components, I n

be the identity matrix of size n, Dn ´¡

I n ¡ n¡11n10n¢

, ~y ´ (y1;:::;yn)0, ~ xd ´ (x1d;:::;xnd)0, ~ md(~xd) ´

(md(x1d); :::md(xnd))0, S d be the n £ n smoother matrix associated with regressor d and S ¤d ´ DnS d be a

centered  smoother for d = 1; ¢ ¢ ¢ ; D.7 The B-estimator for ~ md(~xd) is the solution for the following system of 

normal equations,

0BBB@I n S ¤1 ¢ ¢ ¢ S ¤1

¤

2 I n ¢ ¢ ¢ S 

¤

2......

. . ....

S ¤D S ¤D ¢ ¢ ¢ I n

1CCCA0BBB@~ m1(~ x1)

~ m2(~ x2)...~ mD(~ xD)

1CCCA =0BBB@

S ¤1

¤

2...S ¤D

1CCCA ~y; (3)

which we denote by¡

~ mb1(~x1); ¢ ¢ ¢ ; ~ mb

D(~xD)¢0

.

5m0

d(¢) denotes an element of the H d which is di®erent from md(¢).

6See Buja, Hastie and Tibshirani(1989).7The speci¯c form for S d depends on the univariate nonparametric estimator used in (2).

5

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 6/38

A convenient procedure to obtain a solution for (3) is to use the Gauss-Seidel algorithm. In practice, its

implementation involves setting initial values ®0 ´ n¡110n~y and ~ mbd(~ xd)0 ´ 0n for all d. We then de¯ne the vth

iteration (v = 1; 2; ¢ ¢ ¢) estimator ~ mbd(~xd)v as the smooth that results from a suitably chosen nonparametric

univariate regression estimator, where the observed  regressands are given by

~ y ¡ 1n®0 ¡

d¡1X±=1

~ mb±(~x±)v ¡

DX±=d+1

~ mb±(~x±)v¡1 and the regressors are ~xd, for d = 1; ¢ ¢ ¢ ; D.

Iterations continue until jj~ y ¡ 1n®0 ¡PD±=1 ~ mb

±(~x±)vjj22 ¡ jj~y ¡ 1n®0 ¡PD±=1 ~ mb

±(~x±)v+1jj22 = 0 or is smaller

than a pre speci¯ed level of tolerance, where if  µ 2 <n, jjµjj2 =¡Pn

i=1 µ2i¢1=2

.

We construct S ¤d using a local polynomial estimator of order p = 1 or 3 as needed in our estimation

algorithm. For this purpose let ekt ´ (0; :::; 1; :::; 0)0 be a vector of length k, where the number one appears

on the tth position of the vector, K  : < ! < be an univariate symmetric kernel function and hdn be the

bandwidth sequence associated with the estimation of  ~ md. We de¯ne R(K ) ´R 

K (Ã)2dÃ, ¹ j ´R 

àjK (Ã)dÃ

for j = 0; 1; 2;:::, and for ~x a vector, (~x) p is the vector obtained by raising each element of  ~x to the pth power.

We de¯ne the following estimating weight functions for d = 1;:::;D,

sd(x) : < ! <n : sd(x) = e p+10

1 (Rd(x)0W d(x)Rd(x))¡1Rd(x)0W d(x);

with W d(x) = diagn

1hdn

K ³xtd¡xhdn

´ont=1

, Rd(x) = (1n; (~xd ¡ 1nx); (~xd ¡ 1nx)2; ¢ ¢ ¢ ; (~xd ¡ 1nx) p). We then

de¯ne S d ´

0B@

sd(x1d)...

sd(xnd)

1CA0

.

Conditions under which (3) has an unique solution, i.e.,0BBB@

I n S ¤1 ¢ ¢ ¢ S ¤1S ¤2 I n ¢ ¢ ¢ S ¤2

......

. . ....

S ¤D S ¤D ¢ ¢ ¢ I n

1CCCA (4)

is invertible, have been investigated by Opsomer(1999) for linear smoothers in general and by Opsomer

and Ruppert(1997) for local polynomial smoothers of bivariate ANRM. For D > 2 there exists no explicit

6

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 7/38

set of conditions guaranteeing the uniqueness of the B-estimator. However, simulations reveal that unless

the regressors are highly correlated, the Gauss-Seidel method produces a suitable estimator when using

local polynomial smoothers.8 We have encountered no di±culties with the data set used in this paper. To

suitably choose hdn based on the observed data, we assume that: A1. The kernel K  is bounded, continuous

with compact support, and its ¯rst derivative has a ¯nite number of sign changes over its support. Also,

¹ j(K ) ´R 

u jK (u)du = 0 for all j odd and ¹2(K ) 6= 0; A2. The marginal densities of the regressors exist as

f Xd, d = 1;:::;D are continuous, have compact support [ad; bd] and f Xd

(x) > 0 for all x 2 (ad; bd); A3. As

n ! 1, hdn ! 0 and nhdn ! 1 for all d; A4. The ( p + 1)th derivatives of  md(¢), d = 1;:::;D exist and are

continuous.

One of the practical conveniences of the local polynomial estimators is that provided that  p is large enough

and that md(¢) is su±ciently smooth, the qth derivative of  md(¢), denoted by m(q)d (x) can be estimated by,

mb(q)d (x) = q!e

 p+10

q+1 (Rd(x)0W d(x)Rd(x))¡1Rd(x)0W d(x)

0

@y ¡ 1n®0 ¡

DX±=1;±6=d

~ mb(~x±)

1

A(5)

and therefore we can de¯ne, ~ mb(q)d (~xd) = (mb(q)

d (x1d); ¢ ¢ ¢ ; mb(q)d (xnd))0.

2.2 Data Driven Selection of hdn

One of the most important steps in estimating any nonparametric regression model is the choice of  hdn.

In essence, after a nonparametric estimation procedure is chosen the selection of the bandwidths is tanta-

mount to the selection of an estimated regression. It is therefore desirable to use a data driven method for

their selection. The most commonly used procedure, especially in applied econometrics, involves choosing

h1n; h2n; ¢ ¢ ¢ ; hDn that minimize a jackknifed sum of squares or cross validation function. Unfortunately,

besides being extremely computationally intense, cross validation has a tendency to undersmooth producing

estimated regressions that have rather high variance. Furthermore, the di®erence between the cross vali-

dation estimator of  hdn for a local polynomial estimator and the hdn that minimizes the estimator's mean

8See Opsomer(1995) and Martins-Filho(2001).

7

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 8/38

average squared error (MASE) converges to zero at a very slow rate. These undesirable characteristics, have

prompted the use of  plug in  methods for bandwidth selection.

The basic principle behind plug in  methods is the direct estimation of functionals that appear on the

expressions describing the bandwidths that minimizes the function used to guide the choice of the estimated

regression. Speci cally, we choose ~ hn ´ (h1n; ¢ ¢ ¢ ; hDn)0 2 <D such that the conditional mean average

squared error (MASE)9,

MASE (~ hn) =1

n

n

Xt=1

E 0@Ã(®0 ¡ ®) +

D

Xd=1

(mb

d

(xtd) ¡ md(xtd))!2

j~x1; ¢ ¢ ¢ ; ~xD1Ais minimized. Given a local polynomial estimator, and under the assumption that the regressors (X 1; ¢ ¢ ¢ ; X D)

are independent, it can be shown that for p = 1, we have that,

MASE (~ hn) =¹2(K )2

2

DXd=1

h4dnµdd(2; 2) + ¾2

DXd=1

R(K )

nhdnn¡1

nXt=1

1

f Xd(xtd)

+ O p

ÃDXd=1

1

nhdn

!+

o pà DXd=1

µ 1

nhdn + h4dn¶

;!

(6)

where µdd(2; 2) = 1n jj ~ m

(2)d (~xd) ¡ E ( ~ m

(2)d (~xd))jj22. Ignoring the terms O p and o p in (6) we have that the vector

~ hn that minimizes the conditional MASE has dth component given by,

hdn =

þ2

R(K )n¡1Pn

t=11

f Xd(xtd)

n¹2(K )2µdd(2; 2)

! 1

5

: (7)

Our plug in  strategy is to obtain hdn by directly estimating ¾2, µdd(2; 2) and f Xd. All other components of 

(7) are known, given the kernel function choice.

We follow the statistical literature and assume for simplicity that f Xdis a uniform density, and estimate

n¡1Pn

t=11

f Xd (xtd)by max(~xd) ¡ min(~xd). The estimation of  µdd(2; 2) requires the estimation of second

derivatives of  md which in turn requires the selection of an auxiliary bandwith vector ~gn = (g1n; ¢ ¢ ¢ ; gDn)0

9Conditioning here is on the entire sample.

8

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 9/38

that minimizes a suitably chosen measure of discrepancy between µdd(2; 2) and the estimator µdd(2; 2) =

1n jjDn~ m

b(2)d (~xd)jj22, which depends on ~gn through ~ m

b(2)d (~xd). It follows as an extension of Theorem 1 in

Opsomer and Ruppert(1998), that when the regressors are independent, the vector ~g that minimizes the

conditional (asymptotic) mean squared error of the estimator µdd(2; 2) has as its dth component,

gdn =

µC a

4!R(K 2;3)¾2(bd ¡ ad)

2njµdd(2; 4)j¹4(K 2;3)

¶1=7

for d = 1; 2;:::;D, where C a =

½1 if  µdd(2; 4) < 0

2:5 if  µdd(2; 4) > 0; (8)

where K 2;3 is obtained from K r;p(u) =³r!det(M r;p)det(N p)

´K (u) with N  p a ( p + 1) £ ( p + 1) matrix having (i; j)

entry given byR 

ui+ j¡2K (u)du, M r;p is the same as N  p except that the r + 1 column is substituted by

(1; u ; u2;:::;u p)0, and det(A) is the determinant of the squared matrix A.

The other component of (7) that needs to be estimated is ¾2. We de¯ne, ¾2(~·n) = n¡1jj~ y ¡ 1n®0 ¡

PDd=1 ~ mb

d(~xd)jj22 as a generic estimator for ¾2 based on ~·n 2 <D, the bandwith used to obtain ~ mbd(~xd) for

d = 1; ¢ ¢ ¢ ; D. As in the case of µd(2; 2), we choose ~·n that minimizes the conditional (asymptotic) mean

squared error of ¾2. The optimal vector ~·n has dth component given by,

·dn =

µC b

4jR(K 0;1) ¡ 2K 0;1(0)j¾2(bd ¡ ad)

nµdd(2; 2)¹2(K 0;1)2

¶1=5

where C b =

½1 if  R(K 0;1) ¡ 2K 0;1(0) < 0

0:25 if  R(K 0;1) ¡ 2K 0;1(0) > 0(9)

The expressions for the optimal bandwith vectors in (7), (8) and (9) depend on unknown functionals,

µdd(2; 4), µdd(2; 2) and ¾2. Our estimation strategy follows Opsomer and Ruppert(1998) and is summarized

as follows. We obtain pilot estimates for ¾2 and µdd(2; 4) for d = 1;:::;D based on a suitably de¯ned

parametric model. These initial estimates are used to compute gdn according to equation (8). We then use

gdn as bandwiths to ¯t an additive model using a polynomial smoother of degreee p = 3. We then use the

¯tted additive model to estimate µdd(2; 2). This estimate together with the initial estimate of  ¾2 are then

used to obtain ·dn according to (9). ·dn are then used to ¯t an additive model using a local polynomial of 

order p = 1 and to compute an updated estimate of  ¾2. Finally, the updated estimate of  ¾2 together with

9

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 10/38

the estimate of  µdd(2; 2) are used to obtain estimates hdn according to (7). The hdn are then used to obtain

a ¯nal ¯t for the ANRM and ¾2.

Before we proceed to describe and analyze the data, a few comments are in order regarding our chosen

estimation procedure. First, Linton and HÄardle(1996) have proposed an alternative estimator for the ANRM

called the marginal integration (MI) estimator. We chose not to use the MI estimator for two reasons: a) the

algorithm required to produce the estimator calls for the estimation of a multivariate fully nonparametric

regression model of order D. With the relatively large number of regressors in hedonic price models, the

curse of dimensionality becomes a major concern; b) Martins-Filho(2001) provides experimental evidence

that the ¯nite sample performance of the back¯tting estimator is superior to that of the MI estimator.

Second, the asymptotic and ¯nite sample distibutions of the B and MI estimators are currently unknown

when ~ hn is chosen by any data driven method.10 However, following Hastie and Tibshirani(1990) we are

able to construct approximate con¯dence intervals for the B-estimators.

3 The Data

The housing market data that we use comes from a portion of Multnomah County, Oregon-USA that lies

within Portland's urban growth boundary. The regressand in our empirical model is the actual recorded

sales price of a dwelling. We use 1000 recorded sales that occured between June of 1992 and May of 1994

in the study area. All sales prices were adjusted to May 1994 levels using a Multnomah County residential

housing market price index. Figure 1 provides a histogram of housing prices with a bandwith chosen by

the method proposed by Sheather and Jones(1991). Direct inspection of the histogram reveals that the vast

majority of the observations falls within the (50000,250000) interval. There is however a great dispersion of 

sales prices and some observations are quite distanced within this interval. Another clear conclusion from

the histogram is the leptokurdic nature of the distribution. Table 1 provides a list of all variables used in

our study as well as some sample statistics. The regressors can be grouped into two main categories. The

10This is also true for the MI estimator.

10

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 11/38

¯rst contains a series of dwelling attributes commonly used in hedonic price studies, such as dwelling area,

land area, the total number of bedrooms and bathrooms, as well as the dwelling age in 1994. The second

contains a series of locational characteristics of the dwelling. They include distances to the central business

district, the nearest lake, improved park, industrial sector, commercial district and wetland. Also included

is a measure dwelling elevation. All linear measurements are in kilometers, and all area measurements are

in squared meters.11

The data come from two major sources; MetroScan, which compiles real estate data from assessor's

records for numerous U.S. cities provided most of the house's structural attributes data; Metro Regional

Services, a regional government agency provided the locational characteristics. All distance calculations

were made using a raster system where all data are arranged in grid cells. Each cell is a 15.6-meter square

with distances measured using the Euclidean norm from the center of the dwelling land parcel to the nearest

edge of the feature.

Wetland locations are based on the U.S. Fish and Wildlife Service's National Wetlands Inventory in

Oregon. Wetlands vary from primarily open water to forest and grassland that is wet only part of the

year. Although the area of urban wetlands has been declining in the U.S. the section of Multnomah county

included in our study has more than 4500 wetlands and deep water habitats, varying in size from 1 to 358

acres.12.

We have some a priori  beliefs regarding the e®ects of these various characteristics on housing prices.

Speci¯cally, we expect a positive association between sales prices and the dwelling's area, number of bath-

rooms, land area, the dwelling's elevation and the distance to the nearest industrial zone. Conversely, we

expect a negative association between sales prices and distances to lakes, parks and the central business dis-

trict as well as the dwellings age. We have no a priori  expectations regarding the e®ects of wetland distance

on sales prices. Although proximity to wetlands may be perceived as a desirable location characteristic due

11Squared meters can be converted to squared feet dividing by 0.093 and kilometers can be converted to miles dividing by1.6.12See Mahan, Polasky and Adams(1998).

11

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 12/38

to enhanced view quality or increased pollution protection, it can also be undesirable due to increased risk of 

°ooding or wildlife annoyances. We are also unsure about the e®ects of proximity to commercial districts on

house prices. Although close proximity may be undesirable due to increased tra±c and noise, large distances

maybe undesirable due to increased transportation costs.

It is important to point out that our a priori  expectations are from a descriptive point of view rather

crude. One of the advantages of departing from a linear parametric model involves the possibility of unveiling

much richer patterns of association between regressand and regressors. Put di®erently, the standard practice

in applied economic work of revealing expected parameter signs is in itself evidence of how restricted linear

regression parametric modeling can be.

4 Empirical Modeling and Computations

Since two of the variables used as regressors are discrete (number of bathrooms and bedrooms), we estimate

the following semiparametric version of (1),

E (Y jX 1 = xt1; ¢ ¢ ¢ ; X 12 = xtD) = ® +2X

d=1

¯ dxtd +12Xd=3

md(xtd) (10)

for t = 1;:::;n and Y  and X 1; ¢ ¢ ¢ ; X 12 as indicated in Table 1.13

The estimation procedure requires initial estimates for ¾2 and µdd(2; 4). The latter requires estimates

for m(4)d (¢) and m

(2)d (¢). To obtain these initial values we ¯rst ¯t a linear parametric regression model that

results from a Taylor's expansion of order 5 that explores the additivity of the conditional expectation of 

Y  including all regressors that appear in (10). This estimated regression provides initial estimates for the

second and fourth derivatives of  md, which are then used to construct the ¯rst estimates of  µdd(2; 4). These,

together with an estimate for the variance ¾2 are used in equation (8) to obtain ~g, which is given in Table 2.

~g is then used to ¯t (10) using the B-estimator and a local polynomial estimator of order p = 3. From this

13The data, as well as the computer code for the implementation of the estimation procedure, which was written in theGAUSS v.3.2.14 (1996) programming environment, are available from the ¯rst author.

12

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 13/38

¯t of the additive model we obtain a new estimate for the second derivatives of  md using (5). These new

estimates of the second derivatives are used to obtain an updated estimate for µdd(2; 2) which together with

the variance estimate are used to obtain ·dn according to (9). The ~·n reported in Table 2 is then used to

¯t (10) once again, so that a new estimate for ¾2 is obtained. This new variance estimate is used with the

updated estimate of  µdd(2; 2) to obtain ~ hn according to (7) and reported in Table 2. Finally, ~ hn is used to

obtain a ¯nal ¯t of (10), which is then used to obtain a ¯nal estimate for ¾2. We denote these ¯nal estimates

by ®b, ¯ b1, ¯ b2, mbd(~xd) for d = 3; ¢ ¢ ¢ ; 12, and ¾2.

We have a series of observations to make regarding the estimation procedure. First, the fact that the esti-

mated bandwiths depends on ~y produces estimators that are not projectors and are nonlinear. However, as de-

sired, the cycles of the back¯tting algorithm produced a decreasing sequence of jj~y ¡1n®0 ¡PD

±=1 ~ mb±(~x±)vjj22.

Convergence was obtained for the ¯rst ¯tting of the ANRM after 13 cycles, for the second ¯tting after 11

cycles, and for the ¯nal ¯tting convergence was attained after 9 cycles, all with a level of tolerance of 0.001.

With n = 1000 the computation time for the estimates was approximately 4 hours and 33 minutes on a 866

Mhz Pentium III PC. The estimated regressions are the solid lines that appear on Figure 2. In the next

section we provide a detailed discussion of the results.

Second, one major di±culty in ¯tting an ANRM via  back¯tting (or marginal integration) with data

driven estimated bandwiths, is that we are unable to construct asymptotically valid con¯dence intervals

for the estimated bandwiths. It is possible however to obtain an estimated covariance matrix for each

mbd(~ xd) for d = 1; ¢ ¢ ¢ ; 12. This results from the fact that at convergence, mb

d(~xd) can be written as Rd~y for

some n £ n matrix Rd. Hence, V (mbd(~xd)) can be estimated by ¾2RdR0

d. We obtain Rd by noting that at

convergence it can be calculated by the following numerical procedure. We de¯ne the identity matrix of size

n = 1000 to be I n and denote its ith column by I :i. Using ~ hn we ¯t the ANRM in (10) with regressand given

by I :i. This produces n-dimensional vectors ~mbd(~xd) that correspond to the ith column of Rd, for d = 1; ¢ ¢ ¢ ; n.

Hence, we run 1000 new ANRM to obtain Rd. The dashed lines that appear in Figure 2 are lower and upper

13

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 14/38

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 15/38

E (Y jX 1 = xt1; ¢ ¢ ¢ ; X D = xtD) = ® +DXd=1

µdxtd +1

2

DXd=1

DX±=1

µd±xtdxt± where µd± = µ±d: (11)

Similar parametric second order approximations have been extensively used in applied econometrics.

Table 3 provides ordinary least squares estimates for the parameters in (11), as well as some other commonly

reported regression statistics. The parametric model provides a very reasonable ¯t for the data. We observe

that model (11) does not have the additivity property of the ANRM. If one is interested on the relationship

between Y  and X d, ceteris paribus, it is necessary to arbitrarily choose values for all X ± where ±  6= d to

characterize such relationship. Choosing average sample values for other regressors is common practice in

the empirical literature. We observe, however, that this procedure generates simply one of (uncountably)

many potential estimated regressions. The same can be said about ¯rst derivatives and elasticities that

derive from these models. We view this as a major inconvenience of this type of parametric approximation.

For comparison with the ANRM we have estimated each regression direction based on the parametric model

with all other regressors evaluated at their averages. Their estimated price e®ects are the dash-dotted lines

that appear on Figure 2.

Additivity is an important assumption of the ANRM, hence we performed a test for additivity proposed

by Hastie and Tibshirani(1990). The test involves estimating the following arti¯cial regression,

rt =DXd=1

X j>d

Ãdjmbd(xtd)mb

 j(xtj); (12)

where rt = yt ¡ ®b ¡P2

d=1 xtd¯ bd ¡P12

d=3 mbd(xtd) are the residuals from the estimated additive model. A

conventional Student's-t statistic for H 0 : Ãdj = 0 against H A : Ãdj 6= 0 is calculated. We interpret rejection

of  H 0 as evidence that the interaction of regressors is strong enough to reject additivity. Conversely, failing

to reject H 0 lends support to the additivity assumption. Almost all t-values lead to non-rejection of  H 0 at

1 percent, therefore supporting the additivity assumption. 15

15Results are available upon request. Interaction terms can easily be handled in an additive non parametric framework

15

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 16/38

5 Estimation Results and Analysis

5.1 Parametric Estimates

Our parametric model is simple and for the casual observer may seem rather inappropriate, especially given

the availability of much richer parametric speci¯cations.16 However, after informal experimentation with

several alternative parametric speci¯cations that do not expand the set of regressors, we were surprised to

learn that the best overall ¯t resulted from our model. Our informal search is by no means evidence that

one cannot ¯t a better parametric model, it is simply indicative that (11) does reasonably well against some

obvious parametric alternatives.

The results in Table 3 suggest that the most important attributes in determining sales prices are the

dwelling area, the lot square footage and dwelling age. Among the locational attributes the most signi¯cant

determinants of sales prices are the dwelling's elevation, its distance to the central business district, nearest

wetland and nearest commercial district. The results are reasonable and generally in line with comparable

previous studies, e.g., Iwata et al.(2000). We will make more speci¯c comments regarding the estimated

parametric model as we contrast its results with that of the ANRM. All of the variables used in the estimation

of (11) are in deviation form. Hence, the parametric model can be interpreted as a second order Taylor's

approximation around the sample average of the regressors.

5.2 Nonparametric Estimates

We start our analysis by observing that the estimated regressions have very resonable shapes. No clear

instance of over or under smoothing is apparent and the fact that a common bandwith was used in each

regression direction did not create problems for most of the regressors' data range. There seems to be some

similarities between the regression estimates produced by the parametric model and the ANRM for some

by specifying, for example, E (Y  jX 1 = xt1; ¢ ¢ ¢ ;X D = xtD) = ® +P

D

d=1md(xtd) +

PK

k=1mk(ztk), where ztk ´ xtdxt± for

d 6= ±; and d; ± = 1; ¢ ¢ ¢ ;D.16One could try to improve the parametric ¯t by expanding the number of regressors. An almost surely successful strategy

would be to add sets of dummy variables for the regressors involving distances.

16

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 17/38

regressors, e.g., dwelling area and land area, however for most regressors the two models reveal signi¯cantly

di®erent impact on sales price. These are more pronounced for the impact of distance to the nearest lake,

industrial zone, central business district and dwelling elevation. We now turn to a more speci¯c analysis of 

each estimated regression. In various occasions we will refer to data ranges where most of the observations

fall. These can be promptly seen in Figure 3, where the estimated nonparmetric regressions are graphed

together with the n = 1000 additive partial residuals.

DDDDwwwweeeelllllllliiiinnnngggg aaaannnndddd llllaaaannnndddd aaaarrrreeeeaaaa: The estimated regressions for the in°uence of a dwelling's area and land area on

price have very reasonable shape, with the exception of the more variable behavior of  mb3 for X 3 > 234.17

We attribute this variability to undersmoothing in this data range and place little credence on the sharp

increase on the estimated regression in that data range. In the range (60; 234), where most of the data

is concentrated, mb3 is increasing and nearly linear, with a ¯rst derivative that oscillates between 300 and

1,000 dollars. The estimated regression is slightly convex for dwellings with area larger than 180, indicating

a larger impact of size on prices for larger dwellings. For a dwelling that is +3:5 standard deviations from

average the impact of additional area is about 3; 000 dollars or more. It is not surprising, given the near

linearity of  mb3, that the parametric model produces results that are very similar to those suggested by mb

3.

As we expected mb4 suggests, in general, a positive association between sales price and land area. We

place little weight on the declining portion of the estimated regression for lots whose area is below 227. It

corresponds to a data range where there are only 9 observations. The impact on sales prices of land area is

much less pronounced than that of dwelling. Within §1 standard deviation (134; 1; 320) from the average

land size (727) sales prices vary by less than 60 dollars for a marginal increase in land area. The results from

the parametric model are similar in that the estimated regression has positive slope, but the impact of land

area on sales prices is even smaller than that suggested by the nonparametric model. We believe that in

this case, even though m4 is nearly linear, the parametric model underestimates the impact of land area on

17Since all regressors are centered, this corresponds to the centered X 3 being greater than 100. All of our analysis of theestimated results will be made using noncentered values for the regressors.

17

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 18/38

prices. This results from the fact that the OLS estimator of the parametric model is highly in°uenced by the

observations in the (827; 1; 000) range, which have a much smaller impact on the nonparametric estimator

of  m4 where most of the data lies.

DDDDwwwweeeelllllllliiiinnnniiiigggg aaaaggggeeee: The impact of dwelling age on sales prices is di±cult to intepret. The data seems to

indicate that housing developments has occured in fairly well de¯ned phases. We were able to identify four

data clusters corresponding to the periods 1920 ¡ 1930, 1945 ¡ 1956, 1970 ¡ 1981, and 1991 ¡ 1994. The

fact that the data is clustered has produced a bandwith that seems to undersmooth mb5, hence the wiggly

appearence of the estimated regression. We are able, however to discern some patterns. Dwelling age impacts

prices negatively in the (1975; 1990), i.e., for house with age between 4 and 18 years. Ceteris paribus a house

built in 1990 will cost about 20; 000 dollars more than one built in 1975. However, for dwellings that are

between 20 and 80 years old the impact of age on sales prices seems to be much less pronounced. In fact,

not counting the 1935-1945 period, for dwellings between 20 and 80 years old, the impact of age on sales

prices is smaller than §10; 000 dollars. The sharp increase in mb5 from 1935 ¡ 1945 seems to be caused by a

combination of sample variability and undersmoothing. The results that derive from the parametric model

are quite di®erent. Age according to (11) has little impact on sales prices for dwellings up to approximately

35 years old. After that the negative impact of age intensi es continuously. An 80 year old dwelling is

expected, ceteris paribus to sell for about 25; 000 dollars less than a 50 year old.

DDDDiiiissssttttaaaannnncccceeee ttttoooo tttthhhheeee nnnneeeeaaaarrrreeeesssstttt llllaaaakkkkeeee: Close proximity to a lake has a substantial positive impact on sales prices.

However, this impact is restricted to dwellings that are less than 2 kilometers away from a lake. Here, the

impact on prices from moving away from a lake is substantial. Ceteris paribus, moving a house from 2 to 4

kilometers away from a lake produces a drop in sales price of approximately 30 ; 000 dollars. The in°uence

of proximity to a lake on sales price falls dramatically after a dwelling is more than 3 kilometers away.

In fact, m5(¢) ´ 0 falls within the con¯dence band around mb5 almost everywhere. These results are very

intutitive. After a certain distance use of the lake for recreational purposes is limited and even prohibited by

18

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 19/38

some neighborhood associations (in the case of private lakes) and transportation costs start to outweight the

bene¯ts of lake use. The parametric model produces an altogether di®erent and unappealing interpretation

of the data. First, the drop in sales prices as one moves away from a lake is very gradual and decreasing.

The sharp drop in prices captured by mb5 at short distances from a lake is not revealed by the parametric

model. Second, and more disturbing, the parametric model predicts an increase in sales prices for dwellings

that are more than 6 kilometers away from a lake.

DDDDiiiissssttttaaaannnncccceeee ttttoooo tttthhhheeee nnnneeeeaaaarrrreeeesssstttt wwwweeeettttllllaaaannnndddd: There is a negative relationship between sales prices and distance

to the nearest wetland for most of the data range. This negative impact is more pronounced within §1

standard deviation of the average distance (1.09 kilometers). Moving a dwelling adjacent to a wetland 2

kilometers away produces a decrease in price of about 20; 000 dollars. Althought this e®ect is smaller than

the lake e®ect, it is signi¯cant and suggests a market that incorporates quite well the value of environmental

amenities. As in the case with distance to a lake, dwellings that are farther away from a wetland (in this case

more than 2 kilometers) have prices that are impacted little by wetland distance. Note that the impact of 

wetland distance on sales prices dies out more rapidly than the impact o f distance to a lake. The parametric

model was able to predict well the impact of wetland distance in sales prices. The estimated parametric

relationship between these two variables falls within the con¯dence bands we have constructed for mb7 for

almost all the data range.

DDDDiiiissssttttaaaannnncccceeee ttttoooo tttthhhheeee nnnneeeeaaaarrrreeeesssstttt iiiimmmmpppprrrroooovvvveeeedddd ppppaaaarrrrkkkk: mb8 reveals that in general proximity to an improved park has

little impact on sales prices. In fact, m8(¢) ´ 0 falls within the con¯dence band we constructed for virtually

all data range. Price variations where most of the data lies (:1; :6) kilometers is all within §3000 dollars.

The parametric model produces similar results.

DDDDwwwweeeelllllllliiiinnnngggg eeeelllleeeevvvvaaaattttiiiioooonnnn: mb9 predicts a positive impact of dwelling elevation on sales prices for the interval

(20; 100) meters. The gain in price for this data range is 40; 000 dollars. After 100 meters mb9 levels out

with a derivative that is very close to zero. For elevations above 200 meters the impact on sales prices

19

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 20/38

increases sharply, however as in previous cases there are just a few observations on this data range. It seems

reasonable to expect a positive association between elevation and prices, due to the possibility of views and

quieter streets. However, higher elevations may also be associated with higher transportation costs. Hence,

the fact that mb9 levels o® is expected and intuitively appealing. The parametric model produces quite

di®erent predictions. Dwelling prices ¯rst drop (until about 60 meters) with elevation then rise continuously

at an increasing rate. Clearly, there is no cost associated with elevation gain. According to the parametric

model, a house at average elevation (81 meters) has a price increase of almost 80; 000 dollars when it gains

100 meters in elevation.

DDDDiiiissssttttaaaannnncccceeee ttttoooo tttthhhheeee nnnneeeeaaaarrrreeeesssstttt iiiinnnndddduuuussssttttrrrriiiiaaaallll,,,, ccccoooommmmmmmmeeeerrrrcccciiiiaaaallll zzzzoooonnnneeeessss aaaannnndddd tttthhhheeee cccceeeennnnttttrrrraaaallll bbbbuuuussssiiiinnnneeeessssssss ddddiiiissssttttrrrriiiicccctttt: mb10

clearly captures the negative impact on sales prices of dwellings that are close to industrial zones. Further-

more, it indicates that this negative e®ect is specially intense very close to the industrial zone. For example,

moving a dwelling that is adjacent to an industrial zone 400 meters away, increases its value ceteris paribus

by approximately 20; 000 dollars. The bene¯ts from moving away from industrial zones diminish at an in-

creasing rate. Interestingly, there seems to be a negative impact on sales prices after about 2 kilometers. It

is likely that this results from increased transportation costs to the work place. Once again the predictions of 

the parametric model are quite di®erent. Prices are predicted to fall as we move away from industrial zones

(until about .7 kilometers). For distances above .7 kilometers, prices increase continuously at an increasing

rate. The fall in prices due to increased transportation costs captured by the nonparametric model is not

revealed by the parametric model.

mb11 suggets a positive relationship between sales prices and distances to the nearest commercial zone.

The impact however is much smaller than that of  X 10. That is, the problems of congestion, pollution, etc.

associated with proximity to a commercial zone do not seem as intense as those related to proximity to

industrial zones. As in the case of mb10 we observe that after a certain distance, there is a reversal of the ¯rst

derivative sign, and distance begins to have a negative impact on sales prices. Once again, we atribute this

20

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 21/38

e®ect to a predominance of transportation costs over congestion costs. Here, the parametric model predicts

better than in the case of  X 10, however the negative impact of distance in prices at the upper range of the

data is not captured.

Proximity to the central business district (CBD) has a very strong positive impact on sales prices for the

¯rst 3 kilometers. Ceteris paribus moving a dwelling that is adjacent to the CBD to a location 3 kilometers

away may reduce its price by as much as 170; 000 dollars. However, while still negative this e®ect is reduced

dramatically after 3 kilometers. For example, moving from 3 kilometers to 9 kilometers away from the CBD

impacts price by only 20; 000. This is in line with our general expectation. The e®ect of distance to the CBD

for the parametric and nonparametric models are quite di®erent. Once again, the very sharp initial decline

on prices as a typical dwelling is moved  away from the CBD is largely unaccounted by the parametric model.

Overall, we believe that the nonparametric results are more appealing than those suggested by the

parametric model. It comes as no surprise that whenever the estimated regressions are close to linear the

parametric model provides very adequate responses. However, even in this case estimated slopes can be over

or under estimated depending on the pattern of dispersion of the data, as in the case of land area.

5.3 Out-of-Sample Forecast Exercise

One of the characteristics of hedonic price models is that they can be used to forecast prices, given a set

of product characteristics. We were able to perform a simple out-of-sample forecast evaluation of both the

paprametric and nonparametric models based on a new sample of 1000 observations, which we denote by

f(yN t ; xN t1; ¢ ¢ ¢ ; xN t12)gnt=1.18 The new sample comes from the same housing market and corresponds to the

same regressors and regressand. Using the new data, we obtained forecasted sales prices for the parametric

model,

18We have re-estimated the ANRM based on n = 2000 and the results are virtually identical to those reported here forn = 1000. Results are available from the ¯rst author upon request.

21

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 22/38

y pt = ® +

DXd=1

µdxN td +1

2

DXd=1

DX±=1

µd±xN tdxN t±; where µd± and µd are the least squares estimators.

and for the ANRM,

ybt = ®b +

2Xd=1

¯ bdxN td +

12Xd=3

mbd(xN td):

We used ~ hn from Table 2 as the bandwith used in the estimation of the ANRM and we did not update ®b.

Using fyN t gnt=1 we calculate the square root of the average squared forecast error for the parametric ( F E  p)

and ANRM (F E b) estimators to be F E  p = 31251:19 and F E b = 27897:84. Hence, the forecast error of 

the ANRM is about .89 of that corresponding to the parametric model. This di®erence can translate into

substantial di®erences. For example, assuming a property tax rate of 0:014 of dwelling market value, this

represents an average 47 dollars tax adjustment per dwelling. Assuming 200; 000 tax units, this represents

about 9:4 million dollars in property tax adjustments.

6 Conclusions

In this paper we have argued that the functional form speci¯cation problem common in hedonic price models

can be conveniently addressed by modeling the conditional mean of prices as an additive nonparametric

regression model. The approach is in our view vastly superior to a fully nonparametric design for several

reasons. First, from a practical perspective, the smooths are very easy to interpret and visualize permitting

therefore an analysis of the contribution of characteristics and attributes to prices in much the same way

that is obtained in classical separable parametric models. Second, from a statistical perspective, a series of 

well known di±culties of an unrestricted fully nonparametric design are avoided.

Our bandwith selection method is novel, entirely data driven, and has performed well in pactice vis a vis

the popular cross validation selection method. To facilitate the implementation of the method among applied

economists we have written a GAUSS program that permits relatively fast implementation of the procedure.

22

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 23/38

Most importantly, after implementing the procedure using data for Multnomah County, Oregon, we verify

that the nonparametric additive model is able to identify associations and patterns of dependencies among

the data that a reasonably adequate parametric model fails to unveil. Besides, an out-of-sample evaluation

of the forecast errors of both the parametric and nonparametric models reveals a superior performance of the

ANPRM. We are optimistic that the econometric modeling strategy used in this study can be successfully

used in a variety of settings.

Despite our optimism, we ¯nd that the estimation procedure used in this paper needs to be understood

better. We are speci¯cally concerned with our current inability to construct more precise con¯dence intervals

and perform some rather standard tests of hypotheses.

7 References

1. Anglin, P. and R. Gencay (1996) Semiparametric Estimation of Hedonic Price Functions, Journal of 

Applied Econometrics, 11111111,,,, 633-648.

2. Box, G.E.P and D.R. Cox (1964) An Analysis of Transformations, Journal of the Royal Statistical 

Society B,22226666,,,, 211-252.

3. Buja, A., Hastie, T.J. and R.J. Tibshirani (1989) Linear Smoothers and Additive Models, Annals of 

Statistics, 11117777,,,, 453-555.

4. Dreyfus, M.K. and W. Kip Viscusi (1995) Rates of Time Preference and Consumer Valuations of Automobile Safety and Fuel E±ciencyJournal of Law and Economics, 33338888,,,, 79-105.

5. Fan, J. (1992) Design Adaptive Nonparametric Regression, Journal of the American Statistical Asso-

ciation,88887777,,,, 998-1004.

6. Fan,J., T. Gasser, I. Gijbels, M. Brockmann, J. Engel(1993) Local Polynomial Fitting: a Standard forNonparametric Regression. Department of Statistics, UNC.

7. Friedman, J.H. and W. Stuetzel (1981) Projection Pursuit Regression, Journal of the American Sta-

tistical Association,77776666,,,, 817-823.

8. Gencay, R. and X. Yang (1996) A Forecast Comparison of Residential Housing Prices by Parametricand Semiparametric Conditional Mean Estimators, Economic Letters, 55552222,,,, 129-135.

9. Hartog, J. and H. Bierens(1991) Estimating a Hedonic Earnings Function with a NonparametricMethod, in A. Ullah ed.Semiparametric and Nonparametric Econometrics: Studies in Empirical Eco-

nomics,Springer Verlag-Heidelberg.

10. Hastie, T.J. and R.J. Tibshirani (1986) Generalized Additive Models, Statistical Science, 1111,,,, 297-318.

11. Hastie, T.J. and R.J. Tibshirani (1990) Generalized Additive Models, Chapman and Hall-New York.

23

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 24/38

12. Hill, R.C., J.R. Knight and C.F. Sirmans (1997) Estimating Capital Asset Pricing Indexes Review of 

Economics and Statistics, 77779999,,,, 226-233.

13. Iwata, S., Murao, H. and Q. Wang (2000) Nonparametric Assessment of the E®ects of NeighborhoodLand Uses on the Residential House Values, in T. Fomby and R. Carter Hill Eds. Advances in Econo-

metrics: Applying Kernel and Nonparametric Estimation to Economic Topics, vvvv....11114444, JAI Press.

14. Linton, O. and W. HÄardle (1996) Estimation of Additive Regression Models with Known Links,Biometrika, 88882222, 93-100.

15. Martins-Filho, C. (2001) Additive Nonparametric Regression Estimation via  Back¯tting and MarginalIntegration: Experimental Finite Sample Performance. Working Paper  Department of Economics,Oregon State University.

16. Mahan, B., S. Polasky and R. Adams (2001) Valuing Urban Wetlands: A Property Price Approach

Land Economics,77776666 , 100-113.

17. Opsomer, J. (1995) Optimal Bandwith Selection for Fitting an Additive Model by Local PolynomialRegression. Ph.D. Thesis Cornell University.

18. Opsomer, J. (1999) Asymptotic Properties of Back¯tting Estimators. Working Paper  Department of Statistics, Iowa State University.

19. Opsomer, J. and D. Ruppert(1997) Fitting a Bivariate Additive Model by Local Polynomial Regression,Annals of Statistics,22225555,,,, 186-211.

20. Opsomer, J. and D. Ruppert(1998) A Fully Automated Bandwidth Selection Method for Fitting Ad-ditive Models, Journal of the American Statistical Association,99993333,,,, 605-619.

21. Pace, R. Kelley (1993) Nonparametric Methods with Applications to Hedonic Models, Journal of Real Estate Finance and Economics,7777,,,, 185-204.

22. Pace, R. Kelley (1995) Parametric, Semiparametric, and Nonparametric Estimation of Characteris-tics Values within Mass Assessment and Hedonic Pricing Models,Journal of Real Estate Finance and 

Economics,11111111,,,, 195-217.

23. Park, B. and J.S. Marron (1990) Comparison of Data-Driven Bandwidth Selectors, Journal of the

American Statistical Association,88885555,,,, 66-72.

24. Robinson, P. (1988) Root-N Consistent Semiparametric Regression, Econometrica, 55556666,,,, 931-954.

25. Rosen, S.(1974) Hedonic Prices and Implicit Markets: Product Di®erentiation in Pure Competition,Journal of Political Economy, 88882222,,,, 34-55.

26. Ruppert, D., and M.P. Wand (1994) Multivariate Locally Weighted Least Squares Regression, TheAnnals of Statistics,22222222,,,, 1346-1370.

27. Ruppert, D., S.J. Sheather, and M.P. Wand (1995) An E®ective Bandwidth Selector for Least SquaresRegression, Journal of the American Statistical Association,99990000,,,, 1257-1270.

28. Sheather, S.J. and M.C. Jones (1991) A Reliable Data-based Bandwidth Selection Method for KernelDensity Estimation, Journal of the Royal Statistical Society B,55553333,,,, 683-690.

29. Simono®, J.S. (1996) Smoothing Methods in Statistics, Springer-New York.

24

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 25/38

30. Speckman, P.E. (1988) Regression Analysis for Partially Linear Models, Journal of the Royal Statistical 

Society, BBBB 55550000,,,, 413-436.

31. Stock, J.(1991) Nonparametric Policy Analysis: An Application to Estimating Hazardous WasteCleanup Bene¯ts, in W. Barnett, J. Powell and G. Tauchen Eds. Nonparametric and Semiparametric

Methods in Econometrics and Statistics: Proceedings of the 5th International Symposium in Economic

Theory and Econometrics,Cambridge University Press-New York.

32. Walls, M.(1996) Valuing the Characteristics of Natural Gas Vehicles: An Implicit Markets ApproachReview 

of Economics and Statistics, 77778888,,,, 266-276.

33. Wooldridge, J.(1992) Some Alternatives to the Box-Cox Regression Model, International Economic

Review,33333333,,,, 935-955.

34. Yatchew, Adonis (1998) Nonparametric Regression Techniques in Economics, Journal of Economic

Literature,33336666,,,, 669-721.

25

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 26/38

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 27/38

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 28/38

Figure 1

28

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 29/38

Figure 2

29

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 30/38

Figure 2

30

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 31/38

Figure 2

31

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 32/38

Figure 2

32

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 33/38

Figure 2

33

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 34/38

Figure 3

34

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 35/38

Figure 3

35

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 36/38

Figure 3

36

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 37/38

Figure 3

37

8/2/2019 10.1.1.194.9300

http://slidepdf.com/reader/full/10111949300 38/38

Figure 3