estimation of median incomes of small areas: a bayesian ... · malay ghosh estimation of median...

33
Estimation of Median Incomes of Small Areas: A Bayesian Semiparametric Approach Malay Ghosh University of Florida Joint work with D. Bhadra and D. Kim August 13, 2011 Malay Ghosh Estimation of Median Income

Upload: others

Post on 03-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Estimation of Median Incomes of Small Areas: ABayesian Semiparametric Approach

Malay GhoshUniversity of Florida

Joint work with D. Bhadra and D. Kim

August 13, 2011

Malay Ghosh Estimation of Median Income

Page 2: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Outline

• Introduction

• Semiparametric Modeling

• Hierarchical Bayesian Model

• Data Analysis

• Goodness of Fit Test

• Adaptive Knot Selection

• Summary and Conclusion

Malay Ghosh Estimation of Median Income

Page 3: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Introduction

• Often observations on various characteristics of small areasare collected over time, and thus, may possess an underlyingtime-varying pattern.

• It is likely that models which exploit this pattern may performbetter than those which do not utilize this feature.

• In this study, we present a semiparametric Bayesian frameworkfor the analysis of small area data, while explicitlyaccommodating for the longitudinal pattern in the responseand the covariates.

Malay Ghosh Estimation of Median Income

Page 4: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Estimation of median household income of small areas is oneof the principal targets of inference of the U.S Bureau ofCensus under its Small Area Income and Poverty Estimation(SAIPE) program.

• The above estimates play an important role in theadministration of federal programs and allocation of federalfunds to local jurisdictions.

• Since these estimates are collected over time, they oftenpossess an underlying longitudinal pattern.

• In this talk, I will use the household income data for all theU.S states for the period 1995 through 1999 to estimate thetrue state specific median household income for 1999.

• Fig I. plots the CPS (Current Population Survey) medianincomes against the IRS mean income for all the statesspanning 1995-1999.

Malay Ghosh Estimation of Median Income

Page 5: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

● ●

●●

●●

● ●

●●

● ●

●●

●●

●●

●● ●

●●

●●

●●

● ●●

●●

● ●

●●

●●

● ●

●●

●●

●●

●●

●●

● ●●

● ●

●●

● ●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

●●

●●

30000 40000 50000 60000 70000

2500

030

000

3500

040

000

4500

050

000

IRS Mean Income

CP

S M

edia

n In

com

e

Malay Ghosh Estimation of Median Income

Page 6: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• The Small Area Income and Poverty Estimates (SAIPE)program of the U.S Census Bureau provides annual estimatesof income and poverty statistics for all states, counties andschool districts across the United States.

• They use the Fay-Herriot class of models (Fay and Herriot,1979) in combining state and county estimates of poverty andincome obtained from different sources.

• Bayesian techniques are used to weigh the contributions of theCPS median income estimates and the regression predictionsof the median income based on their relative precision.

Malay Ghosh Estimation of Median Income

Page 7: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Data: IRS median income and CPS state median householdincome estimates for 1995-1999. In addition, we have the1999 state median household income estimates from 2000census data.

• We have used data from CPS for the period 1995-1999 inorder to estimate the state wide median household income for1999.

• This is because, the most recent census estimates correspondto the year 1999 and these census values can be used forcomparison purposes.

Malay Ghosh Estimation of Median Income

Page 8: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Ghosh, Nangia and Kim (1996) proposed a Bayesian timeseries modeling framework to estimate the statewide medianincome of four-person families for 1989.

• Opsomer et.al (2008) pioneered the use of nonparametricregression methodology in small area estimation context.

• They combined small area random effects with a smooth,non-parametrically specified trend using penalized splines.

• They applied their model to analyze a non-longitudinal,spatial dataset concerning the estimation of mean acidneutralizing capacity (ANC) of lakes.

Malay Ghosh Estimation of Median Income

Page 9: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Semiparametric Modeling

• The annual state specific median income estimates can belooked upon as a longitudinal profile or “income trajectory”.

• Moreover, the median income estimates may possess anunderlying non-linear pattern with respect to the covariates.

• These characteristics motivated us to use a semi-parametricmodeling approach for our problem.

• Our main objective is to estimate the 1999 state medianhousehold income using a semi-parametric approach and tocompare these estimates with the CPS as well as the SAIPEmodel based estimates.

Malay Ghosh Estimation of Median Income

Page 10: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Sometimes the relationship between two variables is toocomplicated to be expressed using a known functional form.

• Non-parametric statistical methods uses the data, but not anyprespecified function to determine the true underlyingfunctional relationship between the variables.

• For example, suppose Y and X are related asyi = f (xi ) + ei , i = 1, 2, ...,m. where ei ∼ N(0, σ2e ) and f (x)is unspecified.

• In a non-paramteric setting, f (x) is often estimated usingPenalized splines (P-splines).

Malay Ghosh Estimation of Median Income

Page 11: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• In the P-spline framework, f (x) is represented asf (x ;β) = β0 + β1x + ...+ βpx

p +∑K

k=1 βp+k(x − τk)p+.

• Here, p is the degree of the spline, (x)p+ = xpI (x > 0) and(τ1 < τ2 < ... < τK ) is a fixed set of knots.

• The spline coefficients (βp+1, ..., βp+K ) measure the jumps ofthe spline at the knots (τ1, ..., τK ).

• Smoothness of the resulting fit is achieved by “penalizing” orrestricting these jumps.

• Provided the knots are evenly spread out over the range of x ,the functions f (x ;β) can accurately estimate a very largeclass of smooth functions f (·).

Malay Ghosh Estimation of Median Income

Page 12: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Let Yij be the sample survey estimators of somecharacteristics θij for the i th small area at the j th time(i = 1, 2, ...,m; j = 1, 2, ..., t).

• The inferential target is usually θij or some function of it.

• In our context, θij denotes the true median household incomeof the i th state at the j th year.

• We denote by Xij , the covariate corresponding to the i th stateand j th year.

• In our problem, Xij is the IRS mean income recorded for thei th state and j th year.

Malay Ghosh Estimation of Median Income

Page 13: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Our basic semiparametric model (SPM) is

Yij = f (xij) + bi + uij + eij .

• Here f (x) is an unspecified function of x reflecting theunknown response-covariate relationship.

• We approximate f (xij) using a first degree P-spline andrewrite (1) as

Yij = β0 + β1xij +K∑

k=1

γk(xij − τk)+ + bi + uij + eij

= θij + eij , i = 1, ...,m; j = 1, ..., t.

Malay Ghosh Estimation of Median Income

Page 14: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Here, bi is a state-specific random effect while uij representsan interaction effect between the i th state and the j th year.

• uij and eij are assumed to be mutually independent withuij ∼ N(0, ψ2

j ) and eij ∼ N(0, σ2ij). σ2ij ’s are assumed to beknown.

• We assume bi ∼i .i .d N(0, σ2b) and γ ∼ N(0, σ2γ IK ) where σ2γcontrols the amount of smoothing of the underlying incometrajectory.

• Generally, the knots (τ1, ..., τK ) are placed on a grid of equallyspaced sample quantiles of Xij ’s.

Malay Ghosh Estimation of Median Income

Page 15: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• A second model, a semiparametric random walk model(SPRWM) introduces in addition a trend component (overtime) in the model.

Yij = X′ijβ + Z′ijγ + bi + vj + uij + eij

= θij + eij .

• Here, vj denotes the time specific random component.

• We assume vj |vj−1 ∼ N(vj−1, σ2v ).

• An alternate representation is vj = vj−1 + wj , where

wjiid∼ N(0, σ2v ).

Malay Ghosh Estimation of Median Income

Page 16: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Hierarchical Bayesian Model

• Our Hierarchical Bayesian model is

1. (Yij |θij)ind∼ N(θij , σ

2ij), i = 1, ...,m; j = 1, ...t

2. (θij |β,γ, bi , ψ2j )

ind∼ N(x′ijβ + Z′ijγ + bi , ψ2j )

3. biiid∼ N(0, σ2), i = 1, ...,m

4. γ ∼ N(0, σ2γ IK )

• We use a noninformative uniform improper prior for β whileproper but diffuse inverse gamma priors for the varianceparameters.

• We use Gibbs sampler in an MCMC framework to samplefrom the full conditionals of θij , our target of inference.

Malay Ghosh Estimation of Median Income

Page 17: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Since we have used improper prior for β, posterior proprietywas proved before doing any computation.

• We use Gibbs sampler in MCMC framework to sample fromthe full conditionals of θij , our target of inference.

• We follow the recommendation of Gelman and Rubin (1992)and run n(≥ 2) parallel chains. For each chain, we run 2diterations with starting points drawn from an overdisperseddistribution.

• The first d iterations of each chain are discarded and posteriorsummaries are calculated based on the rest of the d iterates.

Malay Ghosh Estimation of Median Income

Page 18: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Data Analysis

• We fitted the semi-parametric small area model (SPM) withIRS mean as predictor and varying number of knots.

• Decennial census values are used as the “gold standard”.

• Comparison Measures:

• Average Relative Bias (ARB) = 151

∑51i=1|ci − ei |

ci ;

• Average Squared Relative Bias (ASRB) = 151

∑51i=1|ci − ei |2

c2i;

• Average Absolute Bias (AAB) = 151

∑51i=1 |ci − ei |;

• Average Squared Deviation (ASD) = 151

∑51i=1(ci − ei )

2.

Malay Ghosh Estimation of Median Income

Page 19: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

The model with 5 knots in the income trajectory produced the bestestimates.

Estimate ARB ASRB AAB ASD

CPS 0.0415 0.0027 1,753.33 5,300,023

SAIPE 0.0326 0.0015 1,423.75 3,134,906

SPM(5) 0.0326 0.0016 1,398.46 3,287,368

The 95% CI for γ1, γ4 and γ5 does not contain 0 indicating thesignificance of the first, fourth and fifth knots.

Malay Ghosh Estimation of Median Income

Page 20: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• It is clear that the SPM estimates are superior to the CPSmedian income estimate but they are equivalent to the SAIPEmodel estimates.

• In a semiparametric regression framework, proper positioningof knots plays a pivotal role in capturing the true underlyingpattern in a set of observations.

• Poorly placed knots does little in this regard and can even leadto an erroneous or biased estimate of the underlying trajectory.

• The exact positions of the 5 knots in our setup are shown inthe following figure.

Malay Ghosh Estimation of Median Income

Page 21: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

● ●

●●

●●

● ●

●●

● ●

●●

●●

●●

●● ●

●●

●●

●●

● ●●

●●

● ●

●●

●●

● ●

●●

●●

●●

●●

●●

● ●●

● ●

●●

● ●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

●●

●●

30000 40000 50000 60000 70000

2500

030

000

3500

040

000

4500

050

000

IRS Mean Income

CP

S M

edia

n In

com

e

Malay Ghosh Estimation of Median Income

Page 22: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• It is clear that the knots mostly lie in the high density regionof the graph while the non-linearity is mainly visible in the lowdensity region.

• Thus, we decided to place half of the knots in the low densityregion of the graph while the other half in the high densityregion. The following figure shows the new pattern.

● ●

●●

●●

● ●

●●

● ●

●●

●●

●●

●● ●

●●

●●

●●

● ●●

●●

● ●

●●

●●

● ●

●●

●●

●●

●●

●●

● ●●

● ●

●●

● ●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

●●

●●

●●

30000 40000 50000 60000 70000

2500

030

000

3500

040

000

4500

050

000

IRS Mean Income

CP

S M

edia

n In

com

e

Malay Ghosh Estimation of Median Income

Page 23: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• It is clear that a much larger proportion of observations hasbeen captured with the knot realignment.

• The region between the bold and dashed vertical lines denotesthe additional coverage that has been achieved with the knotrearrangement.

• Since the new coverage area overlaps the region ofnon-linearity, it seems that the new knots are able to capturethe underlying non-linear pattern in the dataset which the oldknots failed to achieve.

• On fitting the semiparametric model with the new knotalignment, we did achieve some improvement in the results asshown below.

Malay Ghosh Estimation of Median Income

Page 24: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Estimate ARB ASRB AAB ASD

CPS 0.0415 0.0027 1,753.33 5,300,023

SAIPE 0.0326 0.0015 1,423.75 3,134,906

GNK 0.0400 0.0025 1709.58 5,229,869

SPM(5) 0.0326 0.0016 1,398.46 3,287,368

SPM(5)∗ 0.0280 0.0012 1173.71 2,334,379

SPRWM(5)∗ 0.0300 0.0013 1256.08 2, 747,010

• The new comparison measures for SPM are quite lower thanthose of the SAIPE model.

• The percentage improvements of the SPM estimates over theSAIPE estimates are respectively 14.11%, 20%, 17.56% and25.54%.

• This improvement is apparently due to the additional coverageof the observational pattern that is being achieved with therelocation of the knots.

Malay Ghosh Estimation of Median Income

Page 25: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Goodness of Fit Test

• To examine the goodness-of-fit of the semiparametric models,we used a Bayesian Chi-square goodness-of-fit statistic.

• This is an extension of the classical Chi-square goodness-of-fittest where the statistic is calculated at every iteration of theGibbs sampler as a function of the parameter values drawnfrom the posterior distribution.

• We form 10 equally spaced bins ((k − 1)/10, k/10),k = 1, ..., 10, with fixed bin probabilities, pk = 1/10.

• At each iteration of the Gibbs sampler, bin allocation is madebased on the conditional distribution of each observationgiven the generated parameter values, i.e. F (yij |θij).

Malay Ghosh Estimation of Median Income

Page 26: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• The Bayesian chi-square statistic is then calculated as

RB(Θ̃) =10∑k=1

[mk(Θ̃)− npk√

npk

]2

• Here mk(Θ̃) is the random bin count given the posteriorsample Θ̃.

• For model assessment, we use two summary measures. Firstone is the proportion of times the generated values of RB

exceeds the 0.95 quantile of a χ29 distribution. Values quite

close to 0.05 would suggest a good fit.

• The second diagnostic is the probability that RB(Θ̃) exceeds aχ29 deviate. Values close to 0.5 would suggest a good fit.

Malay Ghosh Estimation of Median Income

Page 27: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• For the SPM, the above summary measures were respectively0.049 and 0.5 indicating a good fit.

• The QQ plots of RB values shown below also demonstrategood agreement between the distribution of RB and that of aχ2(9) random variable.

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●

●●●●●●●●

●●

●●

0 10 20 30 40

010

2030

40

Theoretical Quantiles of Chi−Square (9)

Qua

ntile

s of

R^{

B}

Malay Ghosh Estimation of Median Income

Page 28: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Adaptive Knot Selection

• Recall the modelYij = f (xij) + bi + uij + eij , where

f (xij) = β0 + β1xij +∑K

k=1(xij − τk)+.• However, now we do not require either a fixed number or fixed

locations of the knots.• So now the posterior will involve two additional sets of

parameters- (i) the number of knots and (ii) the locations ofthe knots.

• We consider the prior under which K , the number of knots,follows a Poisson distribution with some mean, say, µ.

• Conditional on K = k, we consider the locationsτ1 < τ2 < . . . < τk as order statistics from a uniform (a, b)distribution.

• In addition to sampling the earlier parameters, we now need tosample the the knot number and the knot locations (k , τ) ateach iteration of the Gibbs sampler.

Malay Ghosh Estimation of Median Income

Page 29: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• As a result, the dimension of the parameter space changes atevery iteration.

• Thus we need a Reversible Jump Markov chain Monte Carlo(RJMCMC) which accounts for varying dimension of theparameter space.

• The RJMCMC algorithm consists of three types of transition.

• Knot selection (birth step), Knot deletion (death step) andKnot relocation (relocation step).

• The probabilities for these moves are denoted by bk , dk andζk respectively.

• bk = c min{1, πk+1

πk}, dk = c min{1, πk−1

πk}, ζk = 1− bk − dk .

• πk = exp(−µ)µk/k!.

• c is a preassigned constant.

Malay Ghosh Estimation of Median Income

Page 30: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Mτ (k) = {k , τ1, . . . , τk}: current model as identified by k andτ .

• Birth step: Mτ (k)→ Mτ (k + 1) with prob. bk .

• Death step: Mτ (k)→ Mτ (k − 1) with prob. dk .

• Relocation Mτ (k)→ Mτ (k∗) with prob. ζk .

Malay Ghosh Estimation of Median Income

Page 31: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

Summary and Conclusion

• Information on past median income levels of different statesdo provide strength towards the estimation of state specificmedian incomes for the current period.

• In fact, if there is an underlying non-linear pattern in themedian income levels, it may be worthwhile to capture thatpattern as accurately as possible.

• The contribution of the knots towards deciphering theunderlying observational pattern improved substantially whenthey were placed with an optimal coverage area.

• Our final estimates proved to be superior, to both the CPSestimates, and also to the current U.S Census Bureau(SAIPE) estimates.

Malay Ghosh Estimation of Median Income

Page 32: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• Some possible extensions are as follows :

• The state specific deviations can be modeled as unspecifiednonparametric functions instead of just a random intercept.

• If the median income pattern have varying degree ofsmoothness, spatially adaptive smoothing procedures can beused.

• Other kinds of basis functions like B-splines or radial bases etccan also be used to model the income trajectory.

• Instead of a parametric normal distributional assumption forthe random effects, a broader class of distributions like theDirichlet process or Polya trees may be tested.

Malay Ghosh Estimation of Median Income

Page 33: Estimation of Median Incomes of Small Areas: A Bayesian ... · Malay Ghosh Estimation of Median Income It is clear that the SPM estimates are superior to the CPS median income estimate

• The theorem for posterior propriety is as follows :

• Theorem. Let ψ2max = max(ψ2

1, ..., ψ2t ) = ψ2

k , say, for somek ∈ [1, ..., t]. Then, posterior propriety holds if (i)(m− p − 5)/2 + ck > 0 and dk > 0 and (ii) m/2 + cj − 2 > 0and dj > 0, j = 1, ..., t; j 6= k .

Malay Ghosh Estimation of Median Income