
Research Article
The Optimal Selection for Restricted Linear Models with Average Estimator

Qichang Xie^1 and Meng Du^2

^1 School of Economics, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China
^2 School of Finance, Dongbei University of Finance and Economics, Dalian, Liaoning 116025, China

Correspondence should be addressed to Qichang Xie; qichangx@163.com

Received 5 March 2014; Revised 15 April 2014; Accepted 29 April 2014; Published 21 May 2014

Academic Editor: Qian Guo

Copyright © 2014 Q. Xie and M. Du. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The essential task of risk investment is to select an optimal tracking portfolio among various portfolios. Statistically, this process can be achieved by choosing an optimal restricted linear model. This paper develops a statistical procedure to do this, based on selecting appropriate weights for averaging approximately restricted models. The method of weighted average least squares is adopted to estimate the approximately restricted models under a dependent error setting. The optimal weights are selected by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model average fit. This model selection procedure is shown to be asymptotically optimal in the sense of obtaining the lowest possible average squared error. Monte Carlo simulations illustrate that the suggested method has comparable efficiency to some alternative model selection techniques.

1. Introduction

The essential task of risk investment is to select an optimal tracking portfolio among numerous portfolios of stocks. Given a desired target and a series of stocks, a tracking portfolio can be formed from any nonempty subset of the given group of stocks so as to track the target to a certain degree. Because of the number of nonempty subsets of stocks, there exists a mass of possible tracking portfolios. Among all possible portfolios, we should find an optimal tracking portfolio whose return is closest to the target's. Statistically, a tracking portfolio is built from a group of stocks, which is equivalent to fitting a restricted linear model with the target's return as the dependent variable and the returns on stocks in the group as the regressors. Since the coefficient of a regressor indicates the proportion of the investment in the corresponding stock within the total investment in the portfolio, the linear model is restricted such that all coefficients in the model sum to one. Thus, the task of choosing an optimal tracking portfolio can be accomplished by selecting an optimal restricted linear model.

In this paper, a model average technique is developed for examining the selection problem of restricted linear models. Model selection has played an important role in econometrics and statistics over the past decades. The goal of model selection is to choose a model that gives a well-posed fit for the observational data, so the investigation of model selection is an indispensable part of empirical analysis. This work proposes a procedure of minimizing a $k$-class generalized information criterion to select the optimal weights for constrained linear models. Under some conditions, we examine the asymptotic behavior of the selection program.

Various methods have been suggested to study the problems of model selection. Knight and Fu [1] discussed lasso-type estimators with least squares methods. To simultaneously estimate parameters and select important variables, Fan and Peng [2] proposed a method of nonconcave penalized likelihood and demonstrated that this technique has an oracle property when the number of parameters is infinite. Zou and Yuan [3] investigated the oracle theory of model selection based on composite quantile regression. Caner [4] considered model selection by the generalized method of moments estimator. In the empirical likelihood framework, Tang and Leng [5] studied parametric estimation and variable selection for diverging numbers of parameters. Jennifer et al. [6] explored the ability of automatic selection algorithms to handle the selection problems of both variables and principal components.

Hindawi Publishing Corporation, Abstract and Applied Analysis, Volume 2014, Article ID 692472, 13 pages. http://dx.doi.org/10.1155/2014/692472

Model averaging is another popular and widely used technique for model selection. The method is to average the estimators corresponding to different candidate models. Bayesian and frequentist are the two main schools of thought in model averaging. Although their spirit and objectives are similar, the two techniques differ in inference and in the selection of models. For Bayesian model averaging, the basic paradigm was introduced by Leamer [7]. Owing to the difficulty of implementation, the approach was largely ignored until the 2000s; for recent developments of this method, the reader can refer to Brown et al. [8] and Rodney and Herman [9]. Compared with Bayesian model averaging, the method of frequentist model averaging has focused on model selection rather than model averaging; it has been considered by many authors, for instance, Hjort and Claeskens [10], Hansen [11], Liang et al. [12], Zhang and Liang [13], and Hansen and Racine [14].

Generally speaking, different methods of model selection need to construct distinct model selection criteria, including AIC [15], Mallows' $C_L$ [16], CV [17], BIC [18], GCV [19], the GMM J-statistic [20], and FPE$_\alpha$ [21]. Zhang et al. [22] employed the generalized information criterion for selecting the regularization parameters. To choose basis functions of splines, Xu and Huang [23] showed the optimality of the LsoCV criterion and designed an efficient Newton-type algorithm for it. Focusing on the Kullback-Leibler divergence measure, So and Ando [24] defined a generalized predictive information criterion using the bias correction of an expected weighted log-likelihood estimator. Groen and Kapetanios [25] examined the AIC and BIC criteria to discuss consistent estimation of a factor-augmented regression.

The literature mentioned above pays more attention to unconstrained models with independently and identically distributed random errors. Recently, Lai and Xie [26] discussed model selection for constrained models, which were limited to the homoscedastic case. Instead of using unrestricted models or homoscedastic models, we develop a $k$-class generalized information criterion ($k$-GIC) to discuss the selection problems of approximately constrained linear models with dependent errors. The $k$-GIC is an extension of the GIC$_{\lambda_n}$ proposed by Shao [27] and includes some conventional model selection criteria such as BIC and GIC. We employ the technique of weighted average least squares to estimate the approximately constrained models and choose the weights by minimizing the $k$-class generalized information criterion. Our main result demonstrates that the $k$-class generalized information criterion is asymptotically equivalent to the average squared error; in other words, the weights selected by the $k$-GIC are asymptotically optimal. Moreover, we highlight two new results which enrich the work of Lai and Xie [26]. One is that an estimate of the variance is given and proved to be consistent. The other is that the weights selected by the $k$-GIC are shown to remain asymptotically optimal when the true variance is replaced by the suggested estimate. The finite sample properties of the model selection are examined by Monte Carlo simulation. The simulation results reveal that the proposed method of model selection dominates some alternative approaches.

The remainder of this paper begins with an illustration of the model set-up and the average estimator in Section 2. Section 3 calculates the average squared error of the model average estimator. The $k$-GIC criterion is introduced and its asymptotic optimality is derived in Section 4. Section 5 states some results from simulation evidence, and Section 6 concludes.

2. Model Set-Up and Average Estimator

The core in risk investment is to build a tracking portfolio of stocks whose return mimics that of a chosen investment target. Let $y_i$ be the return from investing in a selected target and $x_{ij}$ be the historical return of the $j$th stock at time $i$. Assume that $p$ stocks are available for building a tracking portfolio of the target. Then a tracking portfolio consisting of all $p$ stocks can be represented by
$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + e_i, \quad (1)$$
where $\theta_j$ is an unknown parameter and $e_i$ is a random error. The left-hand side of (1) is the return of investing one dollar in the target. The right-hand side is the return of investing one dollar in the portfolio consisting of all $p$ stocks, plus random noise.

Because each parameter $\theta_j$ stands for the proportion of the investment in the corresponding stock to the total investment in the tracking portfolio, the sum of all parameters is one, namely,
$$\theta_1 + \cdots + \theta_p = 1, \quad (2)$$
which represents 100 percent of the whole investment.

In practice, there exist a large number of stocks. These stocks compose various portfolios that may track the target to some degree. Among all possible tracking portfolios, an ideal tracking portfolio should be the one whose return is closest to the target's return. Therefore, we need to find such an optimal tracking portfolio. Because of the correspondence between a tracking portfolio and a restricted linear model, the aim of finding the optimal tracking portfolio can be accomplished by choosing an optimal restricted linear model.
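As a concrete illustration (not taken from the paper), the sum-to-one restriction (2) can be imposed on ordinary least squares by substituting out one coefficient. The following Python sketch, with simulated returns and hypothetical variable names, fits a tracking portfolio under this constraint:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4                                   # n periods, p stocks (illustrative sizes)
X = rng.normal(0.0, 0.02, (n, p))               # hypothetical stock returns
theta_true = np.array([0.4, 0.3, 0.2, 0.1])     # portfolio weights, summing to one
y = X @ theta_true + rng.normal(0.0, 0.005, n)  # target return plus noise, as in (1)

# Impose (2) by substituting theta_p = 1 - (theta_1 + ... + theta_{p-1}):
# y - x_p = sum_{j<p} (x_j - x_p) theta_j + e
y_u = y - X[:, -1]
X_u = X[:, :-1] - X[:, [-1]]
theta_head = np.linalg.lstsq(X_u, y_u, rcond=None)[0]
theta_hat = np.append(theta_head, 1.0 - theta_head.sum())
```

By construction `theta_hat` sums to one regardless of the data, which is the restricted-model counterpart of an unconstrained least squares fit.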

In the following, we extend models (1) and (2) to a generalized constrained linear model for the problem of building an optimal tracking portfolio. Suppose that $y_i$, $i = 1, \ldots, n$, is a random observation at the fixed value $(x_i, z_i)$, where $x_i = (x_{i1}, \ldots, x_{ip})$ is a fixed-dimensional explanatory variable and $z_i = (z_{i1}, z_{i2}, \ldots)$ is a countably infinite-dimensional explanatory variable. Consider the constrained linear model
$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + \sum_{l=1}^{\infty} z_{il}\psi_l + e_i, \quad (3)$$
subject to
$$r_s\theta_s = \sum_{k=1}^{\infty} a_{sk}\psi_k + b_s \quad (s = 1, \ldots, p), \quad (4)$$

where $p$ is a positive integer, $\theta_j$ and $\psi_l$ are parameters, $e_i$ is a random error, $r_s$ is a constant restricting the $s$th parameter $\theta_s$, and $a_{sk}$ and $b_s$ are constants. Assume that $\phi(z_{il}, \psi_l) = \sum_{l=1}^{\infty} z_{il}\psi_l$ and $\phi(a_{sk}, \psi_k) = \sum_{k=1}^{\infty} a_{sk}\psi_k$ converge in mean square.

In (3), the explanatory variable $x_{ij}$ is included in the model on theoretical grounds or for other reasons, while $z_{il}$ is an additional explanatory variable whose inclusion in the model has to be decided. In the context of building a tracking portfolio, the fixed explanatory variable $x_{ij}$ stands for the historical return of the $j$th stock at time $i$, which must be selected by investors because of personal preference or the stable earnings of this stock. In a tracking portfolio, investors need to select some alternative stocks from numerous stocks to realize their expected return. So the additional explanatory variable $z_{il}$ indicates the historical return of the $l$th alternative stock at time $i$. Since $z_{il}$ can be viewed as a series expansion, the identity (3) includes semiparametric models as a special form. In fact, model (3) generalizes the models considered by Lai and Xie [26] and Liang et al. [12]. In addition, the parameters $\theta_j$ and $\psi_l$ denote, respectively, the proportions of the $j$th required stock and the $l$th alternative stock in a tracking portfolio. In (4), the parameter $\theta_s$ is adjusted by a linear combination of $\psi_k$ and $b_s$. The economic significance of (4) is that the proportion of each fixed investment varies with the proportions of all alternative investments. When investors change their preferences or acquire new information on alternative stocks, they can adjust the proportions between required stocks and alternative stocks according to (4). This implies that an increase or decrease of alternative stocks can affect the proportion of each required stock in a portfolio. Besides, if we assume that $\psi_k = 0$, $k = 1, \ldots, \infty$, model (4) becomes $r_s\theta_s = b_s$, which has been discussed by Lai and Xie [26]. In particular, if we set $r_1 = \cdots = r_p = 1$, $a_{11} + \cdots + a_{p1} = \cdots = a_{1\infty} + \cdots + a_{p\infty} = -1$, and $b_1 + \cdots + b_p = 1$, the restricted equation (4) turns into (2).

Denote an index set $\mathcal{U} = \{\mathcal{K}_1, \ldots, \mathcal{K}_M\}$, where $M$ is a positive integer. Let $\Theta = (\theta_1, \ldots, \theta_p)^T$ and $\Psi = (\psi_1, \ldots, \psi_\infty)^T$, where $T$ stands for the transpose operation. Owing to the uncertain number of $z_{il}$ in formula (3), we consider a sequence of approximately restricted models (3) and (4) with $\mathcal{K}_m \in \mathcal{U}$, where the $m$th model includes the first $\mathcal{K}_m$ elements of $z_i$, that is, $z_{i1}, \ldots, z_{i\mathcal{K}_m}$, and the parameter $\theta_s$ is restricted by a linear combination of the first $\mathcal{K}_m$ elements of $\Psi$ and a constant $b_s$. Hence, the $m$th approximately restricted models (3) and (4) are

$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + \sum_{l=1}^{\mathcal{K}_m} z_{il}\psi_l + d_i + e_i, \quad (5)$$
subject to
$$r_s\theta_s = \sum_{k=1}^{\mathcal{K}_m} a_{sk}\psi_k + d'_s + b_s \quad (s = 1, \ldots, p), \quad (6)$$
where $d_i = \sum_{l=\mathcal{K}_m+1}^{\infty} z_{il}\psi_l$ and $d'_s = \sum_{k=\mathcal{K}_m+1}^{\infty} a_{sk}\psi_k$ are the approximation errors.

Set $Y = (y_1, \ldots, y_n)^T$, $a^m_s = (a_{s1}, \ldots, a_{s\mathcal{K}_m})^T$, $A_m = (a^{mT}_1, \ldots, a^{mT}_p)^T$, $D = (d_1, \ldots, d_n)^T$, $D' = (d'_1, \ldots, d'_p)^T$, $\Psi_m = (\psi_1, \ldots, \psi_{\mathcal{K}_m})^T$, $B = (b_1, \ldots, b_p)^T$, and $e = (e_1, \ldots, e_n)^T$. In matrix notation, the $m$th approximately restricted models (5) and (6) can be rewritten as
$$Y = X\Theta + Z_m\Psi_m + D + e, \quad (7)$$
subject to
$$R\Theta = A_m\Psi_m + D' + B, \quad (8)$$
where $X$ is an $n \times p$ matrix whose $ij$th element is $x_{ij}$, $Z_m$ is an $n \times \mathcal{K}_m$ matrix whose $ij$th element is $z_{ij}$, and $R$ is a $p \times p$ diagonal matrix whose $i$th diagonal element is $r_i$.

Hypothesize that $\{e_i\}_{i=1}^{n}$ satisfies $E(e_i \mid x_i, z_i) = 0$ and that its conditional covariance matrix $\Omega_n = E(ee^T \mid X, Z_m)$ is
$$\Omega_n = \begin{pmatrix} \sigma^2_1 & \rho_1 & \cdots & \rho_{n-1} \\ \rho_1 & \sigma^2_2 & \cdots & \rho_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n-1} & \rho_{n-2} & \cdots & \sigma^2_n \end{pmatrix}, \quad (9)$$
in which $E(e_ie_i \mid x_i, z_i) = \sigma^2_i$ and $E(e_ie_{i+j} \mid x_i, z_i) = \rho_j$, with $i = 1, \ldots, n$ and $j = 1, \ldots, n-1$. Clearly, the random errors follow a heteroscedastic stationary Gaussian process.
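The banded, heteroscedastic structure of (9) is easy to materialize numerically. A minimal Python sketch (the helper name `make_omega` and the numeric values are our own, purely illustrative) builds $\Omega_n$ from the variances $\sigma^2_i$ and lag covariances $\rho_j$:

```python
import numpy as np

def make_omega(sigma2, rho):
    """Build the covariance matrix (9): heteroscedastic diagonal sigma2
    (length n) and stationary lag-j covariances rho (length n-1)."""
    n = len(sigma2)
    omega = np.diag(np.asarray(sigma2, dtype=float))
    for j, r in enumerate(rho, start=1):
        # place rho_j on the j-th super- and sub-diagonals
        omega += r * (np.eye(n, k=j) + np.eye(n, k=-j))
    return omega

omega = make_omega([1.0, 1.5, 2.0, 2.5], [0.3, 0.1, 0.05])
```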

Substituting (8) into (7) yields
$$Y_u = Z_{mu}\Psi_m + U, \quad (10)$$
where $Y_u = Y - XR^{-1}B$, $U = D + XR^{-1}D' + e$, and $Z_{mu} = Z_m + XR^{-1}A_m$. By the method of least squares, the estimator of $\Psi_m$ is
$$\hat{\Psi}_m = (Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}Y_u, \quad (11)$$
where $(Z^T_{mu}Z_{mu})^{-1}$ denotes the inverse of $Z^T_{mu}Z_{mu}$.

In the $m$th approximating model (7), we set $\mu_m = X\Theta + Z_m\Psi_m$, so that $\mu = E(Y \mid X, Z_m) = \mu_m + D$. Thus, the estimator of $\mu_m$ is
$$\hat{\mu}_m = X\hat{\Theta} + Z_m\hat{\Psi}_m = \eta + P_mY, \quad (12)$$
where $\eta = (I - P_m)XR^{-1}B$ and $P_m = Z_{mu}(Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}$ is the "hat" matrix.

Let $w = (w_1, \ldots, w_M)^T$ be a weight vector, where $M < n$. Define the weight set $\mathcal{W}$ as
$$\mathcal{W} = \Big\{w \,\Big|\, w_m \in [0, 1],\ m = 1, \ldots, M,\ \sum_{m=1}^{M} w_m = 1\Big\}. \quad (13)$$

For all $m \le M$, the weighted average estimator of $\Psi_m$ is
$$\hat{\Psi}_w = \sum_{m=1}^{M} w_m\hat{\Psi}_m = \sum_{m=1}^{M} w_m(Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}Y_u. \quad (14)$$
Naturally, the weighted average estimator of $\Theta$ is
$$\hat{\Theta}_w = R^{-1}B + R^{-1}A_m\hat{\Psi}_w. \quad (15)$$
Furthermore, the weighted average estimator of $\mu$ is
$$\hat{\mu}(w) = \sum_{m=1}^{M} w_m\hat{\mu}_m = \eta(w) + P(w)Y, \quad (16)$$

where $P(w) = \sum_{m=1}^{M} w_mP_m$ and $\eta(w) = (I - P(w))XR^{-1}B$. It can be seen that the weighted estimator $\hat{\mu}(w)$ is an average of the estimators $\hat{\mu}_m$, $m = 1, \ldots, M$. The weighted "hat" matrix $P(w)$ depends on the nonrandom regressors $Z_{mu}$ and the weight vector $w$. In general, the matrix $P(w)$ is symmetric but not idempotent.

For a positive integer $G$, let $\xi$ be the maximal value of $\sigma^2_\iota$, $\iota = 1, \ldots, n$, and let $\lambda_j$, $j = 1, \ldots, G$, be the nonzero eigenvalues of $\Omega_n$. Assume that both the $\lambda_j$ and the $\rho_\iota$ are summable, namely,
$$\Gamma = \sum_{j=1}^{G} \lambda_j < \infty, \qquad W = \xi + 2\sum_{\iota=1}^{n-1} |\rho_\iota| < \infty. \quad (17)$$

Since the covariance matrix $\Omega_n$ determines the algebraic structure of the model average estimator, we discuss its properties in the following.

Lemma 1. For any $n$-dimensional vectors $a = (a_1, \ldots, a_n)^T$ and $b = (b_1, \ldots, b_n)^T$, one has $|a^T\Omega_nb| \le \|a\|\|b\|W$, where $\|\cdot\|$ is the Euclidean norm.

Proof. From the definition of $\Omega_n$, one has
$$|a^T\Omega_nb| = \Big|\sum_{l=1}^{n} a_lb_l\sigma^2_l + \sum_{i=1}^{n-1}\rho_i\sum_{l=1}^{n-i}(a_{l+i}b_l + a_lb_{l+i})\Big| \le \xi\Big|\sum_{l=1}^{n} a_lb_l\Big| + \sum_{i=1}^{n-1}|\rho_i|\Big|\sum_{l=1}^{n-i}(a_{l+i}b_l + a_lb_{l+i})\Big|. \quad (18)$$
Applying the Cauchy-Schwarz inequality, for $i \in \{1, \ldots, n-1\}$ one gets
$$\sum_{l=1}^{n-i} a_{l+i}b_l \le \sqrt{\sum_{l=1}^{n-i} a^2_{l+i}\sum_{l=1}^{n-i} b^2_l} \le \sqrt{\sum_{l=1}^{n} a^2_l\sum_{l=1}^{n} b^2_l} = \|a\|\|b\|. \quad (19)$$
Similarly, $\sum_{l=1}^{n-i} a_lb_{l+i} \le \|a\|\|b\|$ holds. Therefore,
$$|a^T\Omega_nb| \le \|a\|\|b\|\Big(\xi + 2\sum_{i=1}^{n-1}|\rho_i|\Big) = \|a\|\|b\|W. \quad (20)$$
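Lemma 1 can also be checked numerically. The sketch below (illustrative values of our own choosing, not a proof) builds a banded matrix of the form (9) and confirms the bound $|a^T\Omega_nb| \le \|a\|\|b\|W$ on random vectors:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
sigma2 = rng.uniform(0.5, 2.0, n)          # heteroscedastic variances sigma^2_i
rho = 0.5 ** np.arange(1, n)               # summable lag covariances rho_j
omega = np.diag(sigma2)
for j in range(1, n):
    omega += rho[j - 1] * (np.eye(n, k=j) + np.eye(n, k=-j))

xi = sigma2.max()                          # the constant xi
W_const = xi + 2 * np.abs(rho).sum()       # the constant W from (17)

ok = True
for _ in range(200):
    a, b = rng.normal(size=n), rng.normal(size=n)
    ok &= abs(a @ omega @ b) <= np.linalg.norm(a) * np.linalg.norm(b) * W_const + 1e-9
```

The bound only uses the banded structure of (9), so it holds for every draw of `a` and `b`.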

Because the matrix $P(w)$ plays an important role in analyzing the problems of model selection, we state some of its properties. We set $\tau = \max\{\mathcal{K}_1, \ldots, \mathcal{K}_M\}$. Let $\lambda(P(w))$ and $\lambda_{\max}(P(w))$ denote an eigenvalue and the largest eigenvalue of $P(w)$, respectively.

Lemma 2. One has (i) $\mathrm{tr}(P(w)) \le \tau$ and (ii) $0 \le \lambda(P(w)) \le 1$, where $\mathrm{tr}(\cdot)$ denotes the trace operation.

Proof. It follows from $\mathrm{tr}(P_m) = \mathcal{K}_m$ and $\mathrm{tr}(P(w)) = \sum_{m=1}^{M} w_m\mathcal{K}_m \le \tau$ that (i) is established. Next, we consider (ii). Without loss of generality, let $\lambda_{1m} \ge \cdots \ge \lambda_{nm}$ and $\varphi_{1m}, \ldots, \varphi_{nm}$ be the eigenvalues and orthonormal eigenvectors of $P_m$, respectively. Observe that $P_m$ is idempotent, which implies that $\lambda_{im} \in \{0, 1\}$, $i = 1, \ldots, n$. For an eigenvector $\varpi \in \mathbb{R}^n$ associated with $\lambda(P(w))$, it can be seen that
$$\lambda(P(w)) = \frac{\varpi^TP(w)\varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m\frac{\varpi^TP_m\varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m\frac{\chi^T\Xi_m\chi}{\chi^T\chi} = \sum_{m=1}^{M} w_m\sum_{i=1}^{n}\frac{\lambda_{im}|\chi_i|^2}{\chi^T\chi}, \quad (21)$$
where $\Xi_m = \mathrm{diag}(\lambda_{1m}, \ldots, \lambda_{nm})$, $\varpi = \Phi_m\chi$, and $\Phi_m = (\varphi_{1m}, \ldots, \varphi_{nm})$. Since $\lambda_{im} \in \{0, 1\}$ for $i = 1, \ldots, n$, it follows that
$$\lambda(P(w)) \le \sum_{m=1}^{M} w_m\sum_{i=1}^{n}\frac{1 \cdot |\chi_i|^2}{\chi^T\chi} = \sum_{m=1}^{M} w_m = 1, \qquad \lambda(P(w)) \ge \sum_{m=1}^{M} w_m\sum_{i=1}^{n}\frac{0 \cdot |\chi_i|^2}{\chi^T\chi} = 0. \quad (22)$$

From Lemma 2(ii), we know that $P(w)$ is nonnegative definite.

Lemma 3. Let $\alpha$ denote the number of eigenvalues of $P(w)$. Then $(\alpha^{-1}\mathrm{tr}(P(w)))^2 \le \alpha^{-1}\mathrm{tr}(P^2(w))$.

Proof. Let $\lambda_1, \ldots, \lambda_\alpha$ be the eigenvalues of $P(w)$. (i) If $\lambda_1 = \cdots = \lambda_\alpha = 0$, Lemma 3 holds trivially. (ii) Otherwise, let $\lambda_1 \neq 0, \ldots, \lambda_{\alpha_0} \neq 0$ and $\lambda_{\alpha_0+1} = \cdots = \lambda_\alpha = 0$. It is easy to see that $\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 \ge 0$, where $\bar{\lambda} = \alpha^{-1}\mathrm{tr}(P(w))$. Thus, it can be shown that
$$\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 = \sum_{i=1}^{\alpha}\lambda^2_i - 2\bar{\lambda}\sum_{i=1}^{\alpha}\lambda_i + \alpha\bar{\lambda}^2 = \mathrm{tr}(P^2(w)) - \alpha(\alpha^{-1}\mathrm{tr}(P(w)))^2 \ge 0, \quad (23)$$
which implies that $(\alpha^{-1}\mathrm{tr}(P(w)))^2 \le \alpha^{-1}\mathrm{tr}(P^2(w))$.

Lemma 4. For any $w \in \mathcal{W}$, there exists $C_1 = C'^2$ such that $\mathrm{tr}(P(w)\Omega_n) \le C'\Gamma$, $\mathrm{tr}[(\Omega_nP(w))^2] \le C_1\Gamma^2$, and $\mathrm{tr}[(\Omega_nP^T(w)P(w))^2] \le C_1\Gamma\,\mathrm{tr}(P(w)\Omega_nP^T(w))$, in which $C' = \sup_{w\in\mathcal{W}}\lambda_{\max}(P(w))$.

Proof. Notice that $\lambda_{\min}(\mathbf{A})\lambda_{\max}(\mathbf{B}) \le \lambda_{\max}(\mathbf{AB}) \le \lambda_{\max}(\mathbf{A})\lambda_{\max}(\mathbf{B})$ and $\mathrm{tr}(\mathbf{AB}) \le \lambda_{\max}(\mathbf{A})\mathrm{tr}(\mathbf{B}) \le \mathrm{tr}(\mathbf{A})\mathrm{tr}(\mathbf{B})$, where $\mathbf{A}$ and $\mathbf{B}$ are any positive semidefinite matrices.

Since $P(w)$ and $\Omega_n$ are symmetric and nonnegative definite, the first inequality is established by $\mathrm{tr}(P(w)\Omega_n) \le \lambda_{\max}(P(w))\mathrm{tr}(\Omega_n)$. Let $\lambda_i$ be an eigenvalue of $P(w)$. By the symmetry of $P(w)$, there exists an $n \times n$ orthogonal matrix $\Psi$ such that $P(w) = \Psi\Upsilon\Psi^T$, where $\Upsilon = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Then it yields $\mathrm{tr}(\Omega_nP(w)) = \mathrm{tr}(\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2})$, where $\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2}$ is symmetric and nonnegative definite. The second inequality holds because
$$\mathrm{tr}[(\Omega_nP(w))^2] \le [\mathrm{tr}(\Omega_nP(w))]^2 \le [\lambda_{\max}(P(w))\mathrm{tr}(\Omega_n)]^2 \le C_1\Gamma^2. \quad (24)$$
For the last formula, we note that
$$\begin{aligned} \mathrm{tr}[(\Omega_nP^T(w)P(w))^2] &= \mathrm{tr}[(P(w)\Omega_nP^T(w))^2] \\ &\le \lambda_{\max}(P(w)\Omega_nP^T(w))\,\mathrm{tr}(P(w)\Omega_nP^T(w)) \\ &\le \lambda_{\max}(P^2(w))\lambda_{\max}(\Omega_n)\,\mathrm{tr}(P(w)\Omega_nP^T(w)) \\ &\le \lambda^2_{\max}(P(w))\lambda_{\max}(\Omega_n)\,\mathrm{tr}(P(w)\Omega_nP^T(w)) \\ &\le C_1\Gamma\,\mathrm{tr}(P(w)\Omega_nP^T(w)). \end{aligned} \quad (25)$$
This completes the proof of Lemma 4.

Lemma 5. Let $A$ be a symmetric matrix and $X$ a random vector with zero expectation. Then one has $\mathrm{Var}(X^TAX) = 2\,\mathrm{tr}(\mathrm{Var}(X)A\,\mathrm{Var}(X)A)$, where $\mathrm{Var}(\cdot)$ is the variance operator.

Proof. The proof of this lemma is provided in Lai and Xie [26].

3. Average Squared Error

Denote the average squared error by
$$L_n(w) = \frac{1}{n}\|\mu - \hat{\mu}(w)\|^2_k = \frac{(\mu - \hat{\mu}(w))^T k(\mu - \hat{\mu}(w))}{n}, \quad (26)$$
where $k$ is a fixed positive integer, often used to eliminate the boundary effect. Andrews [28] suggested that $k$ can take the value $\sigma^{-2}$ when the errors obey an independent and identical distribution with variance $\sigma^2$. The most common situation is $k = 1$. The average squared error $L_n(w)$ can be viewed as a measure of accuracy between $\hat{\mu}(w)$ and $\mu$. Obviously, an optimal estimator attains the minimum value of $L_n(w)$. In other words, we should select a weight vector $w$ from $\mathcal{W}$ that makes the average squared error $L_n(w)$ as small as possible.
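For intuition, here is a small sketch of evaluating $L_n(w)$ over a weight grid with $M = 2$ candidate fits. All data are simulated and all names are our own; in practice $\mu$ is unknown, which is exactly why a data-driven criterion is needed:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
mu = rng.normal(size=n)                    # "true" mean, known only in simulation
mu_hat_1 = mu + rng.normal(0.0, 0.5, n)    # a rough candidate fit
mu_hat_2 = mu + rng.normal(0.0, 0.1, n)    # a better candidate fit
k = 1                                      # the common choice k = 1

def L_n(w1):
    """Average squared error (26) for the weight vector (w1, 1 - w1)."""
    d = mu - (w1 * mu_hat_1 + (1.0 - w1) * mu_hat_2)
    return k * (d @ d) / n

grid = np.linspace(0.0, 1.0, 101)
best = grid[np.argmin([L_n(w1) for w1 in grid])]   # grid minimizer of L_n
```

Because the second fit is far more accurate here, the minimizing weight leans heavily toward it, yet the optimum need not sit exactly at a single model.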

In order to investigate the problem of weight selection, we give the expression of the conditional expected average squared error as follows:
\[
R_n(w) = E\bigl(L_n(w)\mid X, Z_m\bigr). \tag{27}
\]
From the definition of $R_n(w)$, the following lemma can be obtained.

Lemma 6. The conditional expected average squared error can be rewritten as
\[
R_n(w) = \frac{\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2}{n} + \frac{k\operatorname{tr}\bigl(P(w)\Omega_n P^{T}(w)\bigr)}{n}, \tag{28}
\]
where $M(w) = I - P(w)$.

Proof. A straightforward calculation of $L_n(w)$ leads to
\[
\begin{aligned}
nL_n(w) &= (\mu-\hat\mu(w))^{T}\,k\,(\mu-\hat\mu(w))\\
&= \bigl(\mu-\eta(w)-P(w)(\mu+e)\bigr)^{T}\,k\,\bigl(\mu-\eta(w)-P(w)(\mu+e)\bigr)\\
&= \bigl[\mu^{T}M^{T}(w)-\eta^{T}(w)-e^{T}P^{T}(w)\bigr]\,k\,\bigl[M(w)\mu-\eta(w)-P(w)e\bigr]\\
&= \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2 - k\bigl(\mu-XR^{-1}B\bigr)^{T}M^{T}(w)P(w)e\\
&\quad + ke^{T}P^{T}(w)P(w)e - ke^{T}P^{T}(w)M(w)\bigl(\mu-XR^{-1}B\bigr).
\end{aligned} \tag{29}
\]
Using $E\bigl(e^{T}P^{T}(w)P(w)e\mid X,Z_m\bigr)=\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)$ and taking conditional expectations on both sides of the above equality give rise to Lemma 6.
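As a numerical illustration of Lemma 6, the following sketch builds arbitrary (hypothetical) stand-ins for $P(w)$, $\eta(w)$, $\mu$, and $\Omega_n$, and checks that the Monte Carlo mean of $L_n(w)$ matches the closed form (28); it takes $\|a\|_k^2 = k\,a^{T}a$ with $k=1$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 1.0

# Hypothetical ingredients standing in for the paper's P(w), eta(w), mu, Omega_n
P = rng.normal(size=(n, n))
P = 0.3 * P / np.linalg.norm(P, 2)        # a contraction playing the role of P(w)
eta = rng.normal(size=n)                  # offset eta(w)
mu = rng.normal(size=n)                   # true mean vector
G = rng.normal(size=(n, n))
Omega = G @ G.T / n + np.eye(n)           # error covariance Omega_n
M = np.eye(n) - P                         # M(w) = I - P(w)

# Closed form (28): R_n(w) = ||M mu - eta||_k^2 / n + k tr(P Omega P^T) / n
Rn = (k * (M @ mu - eta) @ (M @ mu - eta) + k * np.trace(P @ Omega @ P.T)) / n

# Monte Carlo mean of L_n(w), with mu_hat(w) = P(w) Y + eta(w) and Y = mu + e
reps = 100_000
e = rng.multivariate_normal(np.zeros(n), Omega, size=reps)
mu_hat = (mu + e) @ P.T + eta
Ln = k * ((mu - mu_hat) ** 2).sum(axis=1) / n
print(Rn, Ln.mean())  # close agreement up to simulation error
```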

4. The k-Class Generalized Information Criterion and Asymptotic Optimality

As the value of $\mu$ is unknown, the average squared error $L_n(w)$ cannot be used directly to select the weight vector $w$. Thus, we suggest a $k$-class generalized information criterion ($k$-GIC) to choose the weight vector. Further, we will prove that the weight vector selected by the $k$-GIC asymptotically minimizes the average squared error $L_n(w)$.

The $k$-GIC for the restricted model average estimator is
\[
C_n(w) = \frac{\bigl\|Y-\hat\mu(w)\bigr\|_k^2}{n} + \frac{2\hbar}{n}\operatorname{tr}\bigl(P(w)\Omega_n\bigr), \tag{30}
\]

where $\hbar$ is larger than one and satisfies assumption (35) mentioned below. The $k$-class generalized information criterion extends the generalized information criterion ($\mathrm{GIC}_{\lambda_n}$) proposed by Shao [27]. Because $\hbar$ can take different values, the $k$-GIC includes some common information criteria for model selection, such as the Mallows $C_L$ criterion ($k=\hbar=1$), the GIC criterion ($k=1$, $\hbar\to\infty$), the $\mathrm{FPE}_{\alpha}$ criterion ($k=1$, $\hbar>1$), and the BIC criterion ($k=1$, $\hbar=\log(n)$).
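To make the weight-selection step concrete, the sketch below evaluates a criterion of the form (30) over a grid of scalar averaging weights. The construction $P(w) = wH_1 + (1-w)H_2$ from two hat matrices, the zero offset $\eta(w)\equiv 0$, and all design choices are illustrative assumptions, not the paper's restricted-model construction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, hbar = 50, 1.0, 2.0

# Two nested candidate designs (hypothetical); averaging weight w in [0, 1]
X1 = rng.normal(size=(n, 2))
X2 = np.hstack([X1, rng.normal(size=(n, 4))])
H1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)   # hat matrix of the small model
H2 = X2 @ np.linalg.solve(X2.T @ X2, X2.T)   # hat matrix of the large model

mu = X2 @ np.array([1.0, -1.0, 0.3, 0.2, 0.1, 0.05])
Omega = np.eye(n)                            # homoscedastic errors for simplicity
Y = mu + rng.multivariate_normal(np.zeros(n), Omega)

def P_of_w(w):
    # P(w) = w*H1 + (1-w)*H2: a simple averaged projection
    return w * H1 + (1 - w) * H2

def Cn(w):
    # Criterion of form (30): ||Y - mu_hat(w)||_k^2/n + (2*hbar/n) tr(P(w) Omega_n)
    P = P_of_w(w)
    resid = Y - P @ Y
    return (k * resid @ resid) / n + 2 * hbar * np.trace(P @ Omega) / n

grid = np.linspace(0, 1, 101)
w_hat = grid[np.argmin([Cn(w) for w in grid])]
print(w_hat)
```

The same grid search extends to vector weights by enumerating the (finite) set $\mathcal{W}$.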

6 Abstract and Applied Analysis

Lemma 7. If $k=\hbar$, we have $E\bigl(C_n(w)\mid X,Z_m\bigr) = R_n(w) + k\Gamma/n$.

Proof. Recalling the definition of $C_n(w)$, one gets
\[
\begin{aligned}
nC_n(w) &= (Y-\hat\mu(w))^{T}\,k\,(Y-\hat\mu(w)) + 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr)\\
&= \bigl[M(w)Y-\eta(w)\bigr]^{T}\,k\,\bigl[M(w)Y-\eta(w)\bigr] + 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr)\\
&= 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr) + ke^{T}M^{T}(w)M(w)e\\
&\quad + k\bigl(M(w)\mu-\eta(w)\bigr)^{T}M(w)e + ke^{T}M^{T}(w)\bigl(M(w)\mu-\eta(w)\bigr)\\
&\quad + \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2.
\end{aligned} \tag{31}
\]
Observe that
\[
\begin{aligned}
ke^{T}M^{T}(w)M(w)e &= e^{T}(I-P(w))^{T}\,k\,(I-P(w))e\\
&= ke^{T}e - ke^{T}P^{T}(w)e - ke^{T}P(w)e + ke^{T}P^{T}(w)P(w)e.
\end{aligned} \tag{32}
\]
Notice that $E(ke^{T}e\mid X,Z_m) = k\operatorname{tr}(\Omega_n)$ and $E(ke^{T}P^{T}(w)e\mid X,Z_m) = k\operatorname{tr}(P^{T}(w)\Omega_n)$. Moreover, we have $E(ke^{T}P(w)e\mid X,Z_m) = k\operatorname{tr}(P(w)\Omega_n)$ and $E(ke^{T}P^{T}(w)P(w)e\mid X,Z_m) = k\operatorname{tr}(P(w)\Omega_nP^{T}(w))$.

Thus, taking conditional expectations on both sides of (31), we obtain
\[
nE\bigl(C_n(w)\mid X,Z_m\bigr) = k\operatorname{tr}(\Omega_n) + \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2 + k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr). \tag{33}
\]
It follows from (33) and Lemma 6 that Lemma 7 is established, as desired.

Lemma 7 shows that $C_n(w)$ equals the conditional expected average squared error plus an error bias. In particular, as $n$ approaches infinity, $C_n(w)$ becomes an unbiased estimator of $R_n(w)$.

The $k$-GIC criterion is designed to select the optimal weight vector $\hat w$; the optimal weight vector $\hat w$ is chosen by minimizing $C_n(w)$.

Obviously, the well-posed estimators of the parameters are $\hat\Theta_{\hat w}$ and $\hat\Psi_{\hat w}$ with the weight vector $\hat w$. Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:
\[
\frac{L_n(\hat w)}{\inf_{w\in\mathcal{W}}L_n(w)} \xrightarrow{\,p\,} 1. \tag{34}
\]
If formula (34) holds, the selected weight vector $\hat w$ from $C_n(w)$ asymptotically attains the minimum value of $L_n(w)$. In other words, $\hat w$ is equivalent to the weight vector selected by minimizing $L_n(w)$ and is the optimal weight vector for $\hat\mu(w)$. The asymptotic optimality of $C_n(w)$ can be established under the following assumptions.

Assumptions 1. Write $\alpha_n = \inf_{w\in\mathcal{W}} nR_n(w)$. As $n\to\infty$, we assume that
\[
\frac{\hbar^{2}}{\alpha_n} \longrightarrow 0, \tag{35}
\]
\[
\sum_{w\in\mathcal{W}} \frac{1}{\bigl(nR_n(w)\bigr)^{N'}} \longrightarrow 0, \tag{36}
\]
for a large $\hbar$ and any positive integer $1\le N' < \infty$.

The above assumptions have been employed in much of the model selection literature. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of $C_n(w)$.

Lemma 8. We have
\[
\bigl\|M(w)Y-\eta(w)\bigr\|_k^2 = nL_n(w) - 2\bigl\langle e, P(w)e\bigr\rangle_k + 2\bigl\langle e, M(w)\mu-\eta(w)\bigr\rangle_k + \|e\|_k^2. \tag{37}
\]

Proof. Using $\hat\mu(w) = P(w)Y + \eta(w)$, $Y = \mu + e$, and the definition of $L_n(w)$, one obtains
\[
\begin{aligned}
\bigl\|M(w)Y-\eta(w)\bigr\|_k^2 &= \bigl\|\mu+e-P(w)Y-\eta(w)\bigr\|_k^2\\
&= \|e\|_k^2 + \bigl\|\mu-P(w)Y-\eta(w)\bigr\|_k^2 + 2\bigl\langle e,\ \mu-\eta(w)-P(w)Y\bigr\rangle_k\\
&= \|e\|_k^2 + nL_n(w) + 2\bigl\langle e,\ \mu-\eta(w)-P(w)(\mu+e)\bigr\rangle_k\\
&= \|e\|_k^2 + nL_n(w) + 2\bigl\langle e,\ (I-P(w))\mu-\eta(w)-P(w)e\bigr\rangle_k\\
&= \|e\|_k^2 + nL_n(w) + 2\bigl\langle e,\ M(w)\mu-\eta(w)\bigr\rangle_k - 2\bigl\langle e,\ P(w)e\bigr\rangle_k.
\end{aligned} \tag{38}
\]
This completes the proof of Lemma 8.

Lemma 8 and $\|Y-\hat\mu(w)\|_k^2 = \|M(w)Y-\eta(w)\|_k^2$ imply that
\[
C_n(w) = L_n(w) + \frac{2\langle e, M(w)\mu-\eta(w)\rangle_k}{n} + \frac{\|e\|_k^2}{n} + \frac{2\hbar\operatorname{tr}\bigl(P(w)\Omega_n\bigr)}{n} - \frac{2\langle e, P(w)e\rangle_k}{n}. \tag{39}
\]


The goal is to choose $\hat w$ by minimizing $C_n(w)$. From (39), since the term $\|e\|_k^2/n$ does not depend on $w$, one only needs to select $w$ by minimizing
\[
L_n(w) + \frac{2\langle e, M(w)\mu-\eta(w)\rangle_k}{n} + \frac{2\hbar\operatorname{tr}\bigl(P(w)\Omega_n\bigr) - 2\langle e, P(w)e\rangle_k}{n}, \tag{40}
\]
where $w\in\mathcal{W}$.

where 119908 isinWCompared with 119871

119899(119908) it is sufficient to establish

that 119899minus1⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896and 119899

minus1ℏ tr(119875(119908)Ω

119899) minus

119899minus1⟨119890119879 119875(119908)119890⟩

119896are uniformly negligible for any 119908 isin W

More specifically to prove (34) we need to check

sup119908isinW

⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896

119899119877119899 (119908)

119901

997888rarr 0 (41)

sup119908isinW

10038161003816100381610038161003816ℏ tr (119875 (119908)Ω119899) minus ⟨119890

119879 119875(119908)119890⟩

119896

10038161003816100381610038161003816

119899119877119899 (119908)

119901

997888rarr 0 (42)

sup119908isinW

10038161003816100381610038161003816100381610038161003816

119871119899 (119908) minus 119877119899 (119908)

119877119899 (119908)

10038161003816100381610038161003816100381610038161003816

119901

997888rarr 0 (43)

where ldquo119901

997888rarrrdquo denotes the convergence in probabilityFollowing the idea of Li [29] we testify the main result

of our work that the minimizing of 119896-class generalizedinformation criterion is asymptotically optimalNowwe statethe main theorem

Theorem 9. Under assumptions (35) and (36), minimizing the $k$-class generalized information criterion $C_n(w)$ is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the $k$-GIC requires showing that (41), (42), and (43) are valid.

Firstly, we prove that (41) holds. For any $\zeta>0$, by Chebyshev's inequality, one has
\[
\Pr\Biggl\{\sup_{w\in\mathcal{W}} \frac{\bigl|\langle e, M(w)\mu-\eta(w)\rangle_k\bigr|}{nR_n(w)} > \zeta\Biggr\} \le \sum_{w\in\mathcal{W}} \frac{E\bigl[\langle e, M(w)\mu-\eta(w)\rangle_k\bigr]^2}{\bigl(\zeta nR_n(w)\bigr)^2}, \tag{44}
\]
which, by Lemma 1, is no greater than
\[
\zeta^{-2}k\bar{W} \sum_{w\in\mathcal{W}} n^{-2}\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2 \bigl(R_n(w)\bigr)^{-2}. \tag{45}
\]
Recalling the definition of $R_n(w)$, we get $\|M(w)\mu-\eta(w)\|_k^2 \le nR_n(w)$. Then (45) does not exceed $\zeta^{-2}k\bar{W}\sum_{w\in\mathcal{W}}(nR_n(w))^{-1}$. By assumption (36), one knows that (45) tends to zero. Thus, (41) is established.

To prove (42), it suffices to show that
\[
\frac{k\operatorname{tr}\bigl(P(w)\Omega_n\bigr) - \langle e, P(w)e\rangle_k}{nR_n(w)} \xrightarrow{\,p\,} 0, \tag{46}
\]
\[
\frac{(\hbar-k)\operatorname{tr}\bigl(P(w)\Omega_n\bigr)}{nR_n(w)} \xrightarrow{\,p\,} 0. \tag{47}
\]

By Chebyshev's inequality and Lemmas 4 and 5, for any $\epsilon>0$, we have
\[
\begin{aligned}
\Pr\Biggl\{\sup_{w\in\mathcal{W}} \biggl|\frac{\operatorname{tr}\bigl(P(w)\Omega_n\bigr) - \langle e, P(w)e\rangle}{nR_n(w)}\biggr| > \epsilon\Biggr\}
&\le \sum_{w\in\mathcal{W}} \frac{E\bigl[\operatorname{tr}(P(w)\Omega_n) - \langle e, P(w)e\rangle\bigr]^2}{\bigl(\epsilon nR_n(w)\bigr)^2}\\
&= \frac{1}{\epsilon^2}\sum_{w\in\mathcal{W}} \frac{\operatorname{Var}\bigl(e^{T}P(w)e\bigr)}{\bigl(nR_n(w)\bigr)^2}\\
&\le \frac{2C_1\Gamma^2}{\epsilon^2}\sum_{w\in\mathcal{W}} \frac{1}{\bigl(nR_n(w)\bigr)^2}.
\end{aligned} \tag{48}
\]
From (36), we know that (48) tends to zero. Thus, (46) holds.

Recalling Lemma 4 and (35), it yields
\[
\frac{\bigl|(\hbar-k)\operatorname{tr}\bigl(P(w)\Omega_n\bigr)\bigr|}{nR_n(w)} \le \frac{\bigl|C'(\hbar-k)\Gamma\bigr|}{nR_n(w)} \longrightarrow 0, \tag{49}
\]
which derives (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to

\[
\begin{aligned}
\bigl\|\mu-\hat\mu(w)\bigr\|_k^2 - \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2
&= \bigl\|\mu-\eta(w)-P(w)Y\bigr\|_k^2 - \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\
&= \bigl\|\mu-\eta(w)-P(w)\mu-P(w)e\bigr\|_k^2 - \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\
&= \bigl\|M(w)\mu-\eta(w)-P(w)e\bigr\|_k^2 - \bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\
&= \bigl\|P(w)e\bigr\|_k^2 - 2\bigl\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\ P(w)e\bigr\rangle_k.
\end{aligned} \tag{50}
\]

From the identity (50), we know that
\[
L_n(w) - R_n(w) = -\frac{2\bigl\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\ P(w)e\bigr\rangle_k}{n} + \frac{\bigl\|P(w)e\bigr\|_k^2 - k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{n}. \tag{51}
\]


Obviously, the proof of (43) requires verifying that
\[
\sup_{w\in\mathcal{W}} \frac{\bigl\|P(w)e\bigr\|_k^2 - k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{nR_n(w)} \xrightarrow{\,p\,} 0,\qquad
\sup_{w\in\mathcal{W}} \frac{\bigl\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\ P(w)e\bigr\rangle_k}{nR_n(w)} \xrightarrow{\,p\,} 0. \tag{52}
\]

To prove (52), it should be noticed that $\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr) = E\bigl(e^{T}P^{T}(w)P(w)e\bigr)$ and $\|P(w)e\|_k^2 = \langle e^{T}P^{T}(w),\ P(w)e\rangle_k$. In addition,
\[
\bigl\|P(w)\bigl(M(w)\mu-\eta(w)\bigr)\bigr\|^2 \le \lambda_{\max}^{2}\bigl(P(w)\bigr)\bigl\|M(w)\mu-\eta(w)\bigr\|^2 \le \bigl\|M(w)\mu-\eta(w)\bigr\|^2. \tag{53}
\]

Thus, for any $\zeta'>0$ and $\epsilon'>0$,
\[
\begin{aligned}
\Pr\Biggl\{\sup_{w\in\mathcal{W}} \frac{\|P(w)e\|_k^2 - k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{nR_n(w)} > \zeta'\Biggr\}
&\le \sum_{w\in\mathcal{W}} \frac{E\bigl[\|P(w)e\|_k^2 - kE\bigl(e^{T}P^{T}(w)P(w)e\bigr)\bigr]^2}{\bigl(nR_n(w)\zeta'\bigr)^2}\\
&= \frac{k^2}{\zeta'^2}\sum_{w\in\mathcal{W}} \frac{\operatorname{Var}\bigl(e^{T}P^{T}(w)P(w)e\bigr)}{\bigl(nR_n(w)\bigr)^2}\\
&\le \frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}} \frac{k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{\bigl(nR_n(w)\bigr)^2}\\
&\le \frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}} \frac{1}{nR_n(w)}, 
\end{aligned} \tag{54}
\]
\[
\begin{aligned}
\Pr\Biggl\{\sup_{w\in\mathcal{W}} \frac{\bigl\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\ P(w)e\bigr\rangle_k}{nR_n(w)} > \epsilon'\Biggr\}
&\le \sum_{w\in\mathcal{W}} \frac{E\bigl[\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\ P(w)e\rangle_k\bigr]^2}{\bigl(nR_n(w)\epsilon'\bigr)^2}\\
&\le \sum_{w\in\mathcal{W}} \frac{\bigl\|P(w)\bigl(M(w)\mu-\eta(w)\bigr)\bigr\|_k^2\, k\bar{W}}{\bigl(nR_n(w)\epsilon'\bigr)^2}\\
&\le \epsilon'^{-2}k\bar{W}\sum_{w\in\mathcal{W}} n^{-2}\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\bigl(R_n(w)\bigr)^{-2}.
\end{aligned} \tag{55}
\]

Combining (36) and (45), one knows that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) all hold. This completes the proof of Theorem 9.

In practice, the covariance matrix of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of $\Omega_n$ because of its special structure. In the special case where the random errors are independently and identically distributed with variance $\sigma^2$, a consistent estimator of $\sigma^2$ can be built for the constrained models (7) and (8). Let $m=Q$ in $\hat\mu_m$ and $\tau' = \operatorname{tr}(P_Q)$, where $Q$ corresponds to a "large" approximating model. Denote $\hat\sigma_Q^2 = (n-\tau')^{-1}(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)$. The coming theorem will show that $\hat\sigma_Q^2$ is a consistent estimator of $\sigma^2$.
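Computationally, $\hat\sigma_Q^2$ is just a residual variance with degrees of freedom corrected by $\tau' = \operatorname{tr}(P_Q)$; a minimal sketch on simulated data (the design matrix is a hypothetical stand-in for the "large" model $Q$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma2 = 400, 1.5

# A "large" approximating model Q (hypothetical design with 25 regressors)
XQ = rng.normal(size=(n, 25))
PQ = XQ @ np.linalg.solve(XQ.T @ XQ, XQ.T)    # hat matrix P_Q
tau = np.trace(PQ)                            # tau' = tr(P_Q), here 25

mu = XQ @ rng.normal(size=25)                 # mean lies in the big model
Y = mu + rng.normal(scale=np.sqrt(sigma2), size=n)

# sigma2_hat_Q = (n - tau')^{-1} (Y - mu_hat_Q)^T (Y - mu_hat_Q)
resid = Y - PQ @ Y
sigma2_hat = (resid @ resid) / (n - tau)
print(sigma2_hat)  # should be close to the true value 1.5 for this sample size
```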

Theorem 10. If $\tau'/n \to 0$ as $\tau'\to\infty$ and $n\to\infty$, then $\hat\sigma_Q^2 \xrightarrow{\,p\,} \sigma^2$ as $n\to\infty$.

Proof. Writing $\Lambda = D + XR^{-1}\tilde D$, one obtains
\[
\frac{(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)}{n-\tau'} = \frac{e^{T}M_Qe}{n-\tau'} + \frac{\bigl\|M_Q\Lambda\bigr\|^2}{n-\tau'} + \frac{2e^{T}M_Q\Lambda}{n-\tau'}, \tag{56}
\]
where $M_Q = I - P_Q$.

Since $E(e^{T}M_Qe) = \sigma^2(n-\tau')$, it leads to
\[
E\bigl|e^{T}M_Qe - \sigma^2(n-\tau')\bigr|^2 = E\bigl|e^{T}M_Qe - E(e^{T}M_Qe)\bigr|^2 = \operatorname{Var}\bigl(e^{T}M_Qe\bigr) = 2\sigma^4\operatorname{tr}(M_Q). \tag{57}
\]

For any $\varepsilon>0$, it follows from (57) that
\[
\Pr\Biggl\{\biggl|\frac{e^{T}M_Qe}{n-\tau'} - \sigma^2\biggr| > \varepsilon\Biggr\} \le \frac{E\bigl|e^{T}M_Qe - (n-\tau')\sigma^2\bigr|^2}{\varepsilon^2(n-\tau')^2} \le \frac{2\sigma^4}{\varepsilon^2(n-\tau')} \longrightarrow 0. \tag{58}
\]

Equation (58) implies that $(n-\tau')^{-1}(e^{T}M_Qe) \xrightarrow{\,p\,} \sigma^2$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_Q$. Then $0\le 1-\mathcal{I}_{Qt} < 1$ is the $t$th diagonal element of $M_Q$ and satisfies $\sum_{t=1}^{n}(1-\mathcal{I}_{Qt}) = n-\tau'$.

Because $\phi(z_{il},\psi_l)$ and $\phi(a_{sk},\psi_k)$ converge in mean square, we have $E(\Lambda_{Qt}^2)\to 0$ as $\tau'\to\infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t = 1,\dots,n$. Notice that

\[
\begin{aligned}
\frac{E\bigl(\Lambda^{T}M_Q\Lambda\bigr)}{n-\tau'} &= \frac{1}{n-\tau'}\sum_{t=1}^{n} E\bigl(\Lambda_{Qt}^2 M_{Q,tt}\bigr)\\
&= \frac{1}{n-\tau'}\sum_{t=1}^{n} E\bigl(\Lambda_{Qt}^2\bigr)\bigl[1-\mathcal{I}_{Qt}\bigr]\\
&\le \max_t E\bigl(\Lambda_{Qt}^2\bigr)\,\frac{1}{n-\tau'}\sum_{t=1}^{n}\bigl[1-\mathcal{I}_{Qt}\bigr]\\
&= \max_t E\bigl(\Lambda_{Qt}^2\bigr) \longrightarrow 0.
\end{aligned} \tag{59}
\]

The above expression implies that the second term on the right-hand side of (56) approaches zero. By a proof similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.


In the case of independently and identically distributed errors, if we replace $\sigma^2$ by $\hat\sigma_Q^2$ in the $k$-GIC, the $k$-GIC can be simplified to
\[
\hat C_n(w) = \frac{\bigl\|Y-\hat\mu(w)\bigr\|_k^2}{n} + \frac{2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr)}{n}. \tag{60}
\]

Here we intend to illustrate that the model selection procedure of minimizing $\hat C_n(w)$ is also asymptotically optimal.

Theorem 11. Assume that the random errors $e$ are i.i.d. with mean zero and variance $\sigma^2$. Under conditions (35) and (36), minimizing $\hat C_n(w)$ is still asymptotically optimal.

Proof. Using a technique similar to the derivation of (39), we obtain
\[
\begin{aligned}
n\hat C_n(w) &= \bigl\|Y-\hat\mu(w)\bigr\|_k^2 + 2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr)\\
&= nL_n(w) + 2\bigl\langle e, M(w)\mu-\eta(w)\bigr\rangle_k + 2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr) - 2\bigl\langle e, P(w)e\bigr\rangle_k + \|e\|_k^2\\
&= 2\hbar\bigl(\hat\sigma_Q^2-\sigma^2\bigr)\operatorname{tr}\bigl(P(w)\bigr) + 2\bigl\langle e, M(w)\mu-\eta(w)\bigr\rangle_k + nL_n(w)\\
&\quad + 2\hbar\sigma^2\operatorname{tr}\bigl(P(w)\bigr) - 2\bigl\langle e, P(w)e\bigr\rangle_k + \|e\|_k^2.
\end{aligned} \tag{61}
\]

With an appropriate modification of the proofs of (41)–(43), one only needs to verify that
\[
\sup_{w\in\mathcal{W}} \frac{\hbar\bigl|\sigma^2-\hat\sigma_Q^2\bigr|\operatorname{tr}\bigl(P(w)\bigr)}{nR_n(w)} \xrightarrow{\,p\,} 0, \tag{62}
\]
which is equivalent to showing that
\[
\sup_{w\in\mathcal{W}} \frac{\hbar^2\bigl((1/n)\bigl|\sigma^2-\hat\sigma_Q^2\bigr|\operatorname{tr}\bigl(P(w)\bigr)\bigr)^2}{R_n^2(w)} \xrightarrow{\,p\,} 0. \tag{63}
\]

From the proof of Theorem 10, we have
\[
\Pr\Biggl\{\biggl|\frac{\hbar e^{T}M_Qe}{n-\tau'} - \hbar\sigma^2\biggr| > \varepsilon' R_n^{-1/2}(w)\Biggr\} \le \frac{\hbar^2 E\bigl|e^{T}M_Qe-(n-\tau')\sigma^2\bigr|^2}{\varepsilon'^2(n-\tau')^2 R_n(w)} \le \frac{2\hbar^2\sigma^4}{\varepsilon'^2(n-\tau')R_n(w)}, \tag{64}
\]
where $\varepsilon'>0$.

By assumption (35), the above bound tends to zero. Thus, we obtain $R_n(w)^{-1}\bigl|\hbar\sigma^2-\hbar\hat\sigma_Q^2\bigr|^2 \xrightarrow{\,p\,} 0$. It follows from Lemma 3 and the definition of $R_n(w)$ that $\bigl(n^{-1}\operatorname{tr}(P(w))\bigr)^2$ is no greater than $R_n(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $\hat C_n(w)$ is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is

\[
y_i = x_i\theta + \sum_{l=1}^{1000} z_{il}\psi_l + e_i, \qquad \text{subject to } \theta = -\sum_{l=1}^{1000}\psi_l + 1,\quad i = 1,\dots,n, \tag{65}
\]

where $x_i$ follows a $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l = 2,\dots,1000$, are independently and identically distributed $N(0,1)$. The parameters $\psi_l$, $l = 1,\dots,1000$, are determined by the rule $\psi_l = \zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^2 = \zeta^2/(1+\zeta^2)$. The error $e_i$ is independent of $x_i$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_i$ is independently and identically distributed $N(0,1)$.

Case 2. $(e_1,\dots,e_n)$ follows a multivariate normal distribution $N(0,\Omega_n)$, where $\Omega_n$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_n$ is $\sigma_j^2$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\ne l$) off-diagonal element of $\Omega_n$ is $\Omega_{jl,n} = \exp(-0.5|x_{ij}-x_{il}|^2)$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_i$, respectively.

The sample size varies among $n = 50$, $100$, $150$, and $200$. The number of models is determined by $M = \lfloor 3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^2$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi = 500$. For the $k$-GIC, $k$ takes the value one and $\hbar$ adopts the effective number of parameters.

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV, [17]); (4) the Mallows model averaging estimator (MMA, [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

\[
\mathrm{AIC}_m = 2n\ln\Bigl(\frac{1}{n}\bigl\|Y-\hat\mu_m\bigr\|^2\Bigr) + 2\mathcal{K}_m,\qquad
\mathrm{BIC}_m = 2n\ln\Bigl(\frac{1}{n}\bigl\|Y-\hat\mu_m\bigr\|^2\Bigr) + \ln(n)\,\mathcal{K}_m. \tag{66}
\]
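Both criteria in (66) share the $2n\ln(\|Y-\hat\mu_m\|^2/n)$ goodness-of-fit term and differ only in the penalty weight on $\mathcal{K}_m$; a small sketch with hypothetical inputs:

```python
import numpy as np

def aic_bic(Y, mu_hat_m, K_m):
    """AIC_m and BIC_m as in (66); K_m is the number of parameters of model m."""
    n = len(Y)
    fit_term = 2 * n * np.log(((Y - mu_hat_m) ** 2).sum() / n)
    return fit_term + 2 * K_m, fit_term + np.log(n) * K_m

# Illustrative call on synthetic data (hypothetical fitted values)
rng = np.random.default_rng(5)
Y = rng.normal(size=50)
aic, bic = aic_bic(Y, np.zeros(50), K_m=3)
print(aic, bic)
```

For $n > e^2 \approx 7.4$, the BIC penalty $\ln(n)\mathcal{K}_m$ exceeds the AIC penalty $2\mathcal{K}_m$, so BIC favors smaller models.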

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_\ell, x_\ell, z_\ell\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is
\[
\mathrm{PE}(\pi) = \frac{1}{100}\sum_{\ell=1}^{100}\bigl(y_\ell - \hat\mu_\ell(\hat w)\bigr)^2, \tag{67}
\]


[Figure 1: Results for the Monte Carlo design under homoscedastic errors. Four panels plot prediction error (y-axis) against $R^2$ (x-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators: (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$.]

where $\hat w$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by
\[
\mathrm{PE} = \frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi), \tag{68}
\]
where $\Pi = 500$ is the number of replications. Obviously, a smaller PE implies a better estimation method. We consider PE under homoscedastic errors first. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^2$ is displayed on the $x$-axis.

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^2$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^2$, the AIC obtains a higher prediction error than the CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^2$ increases. In many situations, the PEs of the BIC estimator at large $R^2$ are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; this PE calculation is summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic case. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both averaging estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^2$ space.


[Figure 2: Results for the Monte Carlo design under correlated errors. Four panels plot prediction error (y-axis) against $R^2$ (x-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators: (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$.]

When $R^2 \le 0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with the CV a close second over a large region of the $R^2$ space. Besides, their relative efficiency depends closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6. Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are computationally cumbersome and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper attempts to develop a statistical program to address the selection problem of optimal tracking portfolios. We build theoretical models of tracking portfolios as constrained linear models. The selection of an optimal portfolio then boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problem of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods. It is found that the $k$-GIC gives the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control can be applied to study the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica: Journal of the Econometric Society, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_p$," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica: Journal of the Econometric Society, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003. With discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997. With comments and a rejoinder by the author.


[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.


selection algorithms to handle the selection problems of both variables and principal components.

Model averaging is another popular and widely used technique for model selection. The method is to average the estimators corresponding to different candidate models. Bayesian and frequentist are the two main schools of thought in model averaging. Although their spirit and objectives are similar, the two techniques differ in the inference and selection of models. For Bayesian model averaging, the basic paradigm was introduced by Leamer [7]. Owing to the difficulty of implementation, the approach was largely ignored until the 2000s. For recent developments of this method, the reader may refer to Brown et al. [8] and Rodney and Herman [9]. Compared with Bayesian model averaging, the frequentist approach focuses on model selection rather than model averaging; it has been considered by many authors, for instance, Hjort and Claeskens [10], Hansen [11], Liang et al. [12], Zhang and Liang [13], and Hansen and Racine [14].

Generally speaking, different methods of model selection need to construct distinct model selection criteria, including AIC [15], Mallows' $C_L$ [16], CV [17], BIC [18], GCV [19], the GMM J-statistic [20], and FPE$_\alpha$ [21]. Zhang et al. [22] employed the generalized information criterion for selecting the regularization parameters. To choose basis functions of splines, Xu and Huang [23] showed the optimality of the LsoCV criterion and designed an efficient Newton-type algorithm for this criterion. Focusing on the Kullback-Leibler divergence measure, So and Ando [24] defined a generalized predictive information criterion using the bias correction of an expected weighted loglikelihood estimator. Groen and Kapetanios [25] examined the AIC and BIC criteria to discuss consistent estimation of a factor-augmented regression.

The literature mentioned above pays more attention to unconstrained models with independently and identically distributed random errors. Recently, Lai and Xie [26] discussed model selection for constrained models, which was limited to the homoscedastic case. Instead of using unrestricted models or homoscedastic models, we develop a $k$-class generalized information criterion ($k$-GIC) to discuss the selection problem of approximately constrained linear models with dependent errors. The $k$-GIC is an extension of the GIC$_{\lambda_n}$ proposed by Shao [27] and includes some conventional model selection criteria such as BIC and GIC. We employ the technique of weighted average least squares to estimate the approximately constrained models and choose the weights by minimizing the $k$-class generalized information criterion. Our main result demonstrates that the $k$-class generalized information criterion is asymptotically equivalent to the average squared error. In other words, the weights selected by the $k$-GIC are asymptotically optimal. Moreover, we highlight two new results, which enrich the work of Lai and Xie [26]. One is that an estimate of the variance is given, and the estimate is proved to be consistent. The other is that the weights selected by the $k$-GIC are shown to remain asymptotically optimal when the true variance is replaced by the suggested estimate. The finite sample properties of model selection are examined by Monte Carlo simulation. The results of the simulation reveal that the proposed method of model selection is dominant over some alternative approaches.

The remainder of this paper begins with an illustration of the model set-up and average estimator in Section 2. Section 3 calculates the average squared error of the model average estimator. The $k$-GIC criterion is introduced and its asymptotic optimality is derived in Section 4. Section 5 states some results from simulation evidence, and Section 6 concludes.

2. Model Set-Up and Average Estimator

The core of risk investment is to build a tracking portfolio of stocks whose return mimics that of a chosen investment target. Let $y_i$ be the return from investing in a selected target and $x_{ij}$ be the historical return of the $j$th stock at time $i$. Assume that $p$ stocks are available for building a tracking portfolio of the target. Then a tracking portfolio consisting of all $p$ stocks can be represented by
$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + e_i, \quad (1)$$
where $\theta_j$ is an unknown parameter and $e_i$ is a random error. The left-hand side of (1) is the return of investing one dollar in the target. The right-hand side is the return of investing one dollar in the portfolio consisting of all $p$ stocks, plus random noise.

Because each parameter $\theta_j$ stands for the proportion of the investment in the corresponding stock relative to the total investment in the tracking portfolio, the sum of all parameters is one, namely,
$$\theta_1 + \cdots + \theta_p = 1, \quad (2)$$
which means 100 percent of the whole investment.

In practice, there exist a large number of stocks. These stocks compose various portfolios that may track the target to some degree. Among all possible tracking portfolios, an ideal tracking portfolio should be the one whose return is closest to the target's return. Therefore, we need to find such an optimal tracking portfolio. Because of the correspondence between a tracking portfolio and a restricted linear model, the aim of finding the optimal tracking portfolio can be accomplished by choosing an optimal restricted linear model.
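As a concrete illustration of (1)-(2), the simplest tracking portfolio can be fitted by least squares subject to the sum-to-one constraint. The sketch below uses simulated returns (the data and dimensions are our assumptions, not from the paper) and solves the equality-constrained least-squares problem through its KKT system:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4                                  # n periods, p stocks (simulated)
X = rng.normal(0.001, 0.02, (n, p))            # historical stock returns x_ij
theta_true = np.array([0.4, 0.3, 0.2, 0.1])    # true proportions, sum to one
y = X @ theta_true + rng.normal(0, 0.005, n)   # target return y_i

# Minimize ||y - X theta||^2 subject to 1^T theta = 1 via the KKT system:
# [2 X^T X  1] [theta ]   [2 X^T y]
# [1^T      0] [lambda] = [1      ]
ones = np.ones(p)
K = np.block([[2 * X.T @ X, ones[:, None]],
              [ones[None, :], np.zeros((1, 1))]])
rhs = np.concatenate([2 * X.T @ y, [1.0]])
sol = np.linalg.solve(K, rhs)
theta_hat = sol[:p]                            # estimated portfolio weights

print(theta_hat, theta_hat.sum())              # weights sum to 1
```

The same KKT formulation extends to the general restriction (4) by replacing the single sum-to-one row with the corresponding matrix constraint.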

In the following, we extend models (1) and (2) to consider a generalized constrained linear model for the problem of building an optimal tracking portfolio. Suppose that $y_i$, $i = 1, \ldots, n$, is a random observation at the fixed value $(x_i, z_i)$, where $x_i = (x_{i1}, \ldots, x_{ip})$ is a fixed-dimensional explanatory variable and $z_i = (z_{i1}, \ldots, z_{i\infty})$ is a countably infinite-dimensional explanatory variable. Consider the constrained linear model
$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + \sum_{l=1}^{\infty} z_{il}\psi_l + e_i, \quad \text{subject to} \quad (3)$$
$$r_s\theta_s = \sum_{k=1}^{\infty} a_{sk}\psi_k + b_s \quad (s = 1, \ldots, p), \quad (4)$$


where $p$ is a positive integer, $\theta_j$ and $\psi_l$ are parameters, $e_i$ is a random error, $r_s$ is a constant for restricting the $s$th parameter $\theta_s$, and $a_{sk}$ and $b_s$ are some constants. Assume that $\phi(z_{il}, \psi_l) = \sum_{l=1}^{\infty} z_{il}\psi_l$ and $\phi(a_{sk}, \psi_k) = \sum_{k=1}^{\infty} a_{sk}\psi_k$ converge in mean square.

In (3), the explanatory variable $x_{ij}$ is involved in the model on theoretical grounds or for other reasons, and $z_{il}$ is an additional explanatory variable for which we need to determine whether it should be included in the model. In the context of building a tracking portfolio, the fixed explanatory variable $x_{ij}$ stands for the historical return of the $j$th stock at time $i$, which must be selected by investors because of their personal preference or the stable earnings of this stock. In a tracking portfolio, investors need to select some alternative stocks from numerous stocks to realize their expected return. So the additional explanatory variable $z_{il}$ indicates the historical return of the $l$th alternative stock at time $i$. Since $z_{il}$ can be viewed as a series expansion, the identity (3) includes semiparametric models as a special form. In fact, the model (3) generalizes the models considered by Lai and Xie [26] and Liang et al. [12]. In addition, the parameters $\theta_j$ and $\psi_l$ denote, respectively, the proportions of the $j$th required stock and the $l$th alternative stock in a tracking portfolio. In (4), the parameter $\theta_s$ is adjusted by a linear combination of $\psi_k$ and $b_s$. The economic significance of (4) is that the proportion of each fixed investment varies with the proportions of all alternative investments. When investors change their preferences or have acquired new information on alternative stocks, they are capable of adjusting the proportion between required stocks and alternative stocks according to (4). This implies that the increase or decrease of alternative stocks can affect the proportion of each required stock in a portfolio. Besides, if we assume that $\psi_k = 0$, $k = 1, \ldots, \infty$, the model (4) becomes $r_s\theta_s = b_s$, which has been discussed by Lai and Xie [26].

Particularly, if we set $r_1 = \cdots = r_p = 1$, $a_{11} + \cdots + a_{p1} = \cdots = a_{1\infty} + \cdots + a_{p\infty} = -1$, and $b_1 + \cdots + b_p = 1$, the restricted equation (4) turns into (2).

Denote an index set $\mathcal{U} = \{K_1, \ldots, K_M\}$, where $M$ is a positive integer. Let $\Theta = (\theta_1, \ldots, \theta_p)^T$ and $\Psi = (\psi_1, \ldots, \psi_\infty)^T$, where "$T$" stands for the transpose operation. Due to the uncertain number of $z_{il}$ in formula (3), we consider a sequence of approximately restricted models (3) and (4) with $K_m \in \mathcal{U}$, where the $m$th model includes the first $K_m$ elements of $z_i$, that is, $z_{i1}, \ldots, z_{iK_m}$, and the parameter $\theta_s$ is restricted by a linear combination of $K_m$ elements of $\Psi$ and a constant $b_s$. Hence, the $m$th approximately restricted models (3) and (4) are
$$y_i = \sum_{j=1}^{p} x_{ij}\theta_j + \sum_{l=1}^{K_m} z_{il}\psi_l + d_i + e_i, \quad \text{subject to} \quad (5)$$
$$r_s\theta_s = \sum_{k=1}^{K_m} a_{sk}\psi_k + d'_s + b_s \quad (s = 1, \ldots, p), \quad (6)$$
where $d_i = \sum_{l=K_m+1}^{\infty} z_{il}\psi_l$ and $d'_s = \sum_{k=K_m+1}^{\infty} a_{sk}\psi_k$ are the approximation errors.

Set $Y = (y_1, \ldots, y_n)^T$, $a_s^m = (a_{s1}, \ldots, a_{sK_m})^T$, $A_m^{\aleph} = (a_1^{mT}, \ldots, a_p^{mT})^T$, $D = (d_1, \ldots, d_n)^T$, $D' = (d'_1, \ldots, d'_p)^T$, $\Psi_m = (\psi_1, \ldots, \psi_{K_m})^T$, $B = (b_1, \ldots, b_p)^T$, and $e = (e_1, \ldots, e_n)^T$. In matrix notation, the $m$th approximately restricted models (5) and (6) can be rewritten as
$$Y = X\Theta + Z_m\Psi_m + D + e, \quad \text{subject to} \quad (7)$$
$$R\Theta = A_m^{\aleph}\Psi_m + D' + B, \quad (8)$$
where $X$ is an $n \times p$ matrix whose $ij$th element is $x_{ij}$, $Z_m$ is an $n \times K_m$ matrix whose $ij$th element is $z_{ij}$, and $R$ is a $p \times p$ diagonal matrix whose $i$th diagonal element is $r_i$.

Hypothesize that $\{e_i\}_{i=1}^{n}$ satisfies $E(e_i \mid x_i, z_i) = 0$ and that its conditional covariance matrix $\Omega_n = E(ee^T \mid X, Z_m)$ is
$$\Omega_n = \begin{pmatrix} \sigma_1^2 & \varrho_1 & \cdots & \varrho_{n-1} \\ \varrho_1 & \sigma_2^2 & \cdots & \varrho_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \varrho_{n-1} & \varrho_{n-2} & \cdots & \sigma_n^2 \end{pmatrix}, \quad (9)$$
in which $E(e_ie_i \mid x_i, z_i) = \sigma_i^2$ and $E(e_ie_{i+j} \mid x_i, z_i) = \varrho_j$ with $i = 1, \ldots, n$ and $j = 1, \ldots, n-1$. Clearly, the random errors follow a heteroscedastic stationary Gaussian process.

Substituting (8) into (7) yields
$$Y_u = Z_{mu}\Psi_m + U, \quad (10)$$
where $Y_u = Y - XR^{-1}B$, $U = D + XR^{-1}D' + e$, and $Z_{mu} = Z_m + XR^{-1}A_m^{\aleph}$. By the method of least squares, the estimator of $\Psi_m$ is
$$\hat{\Psi}_m = (Z_{mu}^T Z_{mu})^{-1} Z_{mu}^T Y_u, \quad (11)$$
where $(Z_{mu}^T Z_{mu})^{-1}$ denotes the inverse of $Z_{mu}^T Z_{mu}$. In the $m$th approximating model (7), we set $\mu_m = X\Theta + Z_m\Psi_m$, so that $\mu = E(Y \mid X, Z_m) = \mu_m + D$. Thus, the estimator of $\mu_m$ is
$$\hat{\mu}_m = X\hat{\Theta} + Z_m\hat{\Psi}_m = \eta + P_m Y, \quad (12)$$
where $\eta = (I - P_m)XR^{-1}B$ and $P_m = Z_{mu}(Z_{mu}^T Z_{mu})^{-1} Z_{mu}^T$ is the "hat" matrix.

Let $w = (w_1, \ldots, w_M)^T$ be a weight vector, where $M < n$. Define a weight set $\mathcal{W}$ as
$$\mathcal{W} = \left\{ w \;\middle|\; w_m \in [0, 1],\ m = 1, \ldots, M,\ \sum_{m=1}^{M} w_m = 1 \right\}. \quad (13)$$
For all $m \le M$, the weighted average estimator of $\Psi_m$ is
$$\hat{\Psi}_w = \sum_{m=1}^{M} w_m \hat{\Psi}_m = \sum_{m=1}^{M} w_m (Z_{mu}^T Z_{mu})^{-1} Z_{mu}^T Y_u. \quad (14)$$
Naturally, the weighted average estimator of $\Theta$ is
$$\hat{\Theta}_w = R^{-1}B + R^{-1}A_m^{\aleph}\hat{\Psi}_w. \quad (15)$$
Furthermore, the weighted average estimator of $\mu$ is
$$\hat{\mu}(w) = \sum_{m=1}^{M} w_m \hat{\mu}_m = \eta(w) + P(w)Y, \quad (16)$$


where $P(w) = \sum_{m=1}^{M} w_m P_m$ and $\eta(w) = (I - P(w))XR^{-1}B$. It can be seen that the weighted estimator $\hat{\mu}(w)$ is an average estimator of $\hat{\mu}_m$, $m = 1, \ldots, M$. The weighted "hat" matrix $P(w)$ depends on the nonrandom regressors $Z_{mu}$ and the weight vector $w$. Under general conditions, the matrix $P(w)$ is symmetric but not idempotent.
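To make the estimators (10)-(16) concrete, the following sketch simulates toy data (the dimensions and the choices of $R$, $A_m$, and $B$ are our assumptions, not from the paper), computes the restricted least-squares fit of each candidate model, and averages the fits; it also checks that $P(w)$ is symmetric but, unlike each $P_m$, generally not idempotent:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 120, 3
X = rng.normal(size=(n, p))
Z = rng.normal(size=(n, 4))                    # candidate z-regressors
R = np.eye(p)                                  # restriction matrix R (assumed)
B = np.array([0.2, 0.5, 0.3])                  # constants b_s (assumed)
A = rng.normal(size=(p, 4)) * 0.1              # constraint coefficients a_sk
psi = np.array([0.4, -0.2, 0.0, 0.0])
theta = np.linalg.solve(R, A @ psi + B)        # Theta obeys (8) with D' = 0
Y = X @ theta + Z @ psi + rng.normal(0, 0.1, n)

Rinv = np.linalg.inv(R)
Yu = Y - X @ Rinv @ B                          # eq. (10): Y_u

def fit(Km):
    """Restricted LS for the model using the first Km z-columns."""
    Zmu = Z[:, :Km] + X @ Rinv @ A[:, :Km]     # Z_mu
    Pm = Zmu @ np.linalg.solve(Zmu.T @ Zmu, Zmu.T)   # hat matrix P_m
    eta = (np.eye(n) - Pm) @ X @ Rinv @ B
    return Pm, eta + Pm @ Y                    # fitted values, eq. (12)

(P1, mu1), (P2, mu2) = fit(2), fit(4)
w = np.array([0.3, 0.7])                       # a point in the weight set W
mu_w = w[0] * mu1 + w[1] * mu2                 # averaged fit, eq. (16)
Pw = w[0] * P1 + w[1] * P2                     # weighted hat matrix P(w)

print(np.allclose(Pw, Pw.T))                   # symmetric
print(np.allclose(Pw @ Pw, Pw))                # not idempotent in general
```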

For a positive integer $\mathcal{G}$, let $\xi$ be the maximal value of $\sigma_\iota^2$, $\iota = 1, \ldots, n$, and let $\lambda_j$, $j = 1, \ldots, \mathcal{G}$, be the nonzero eigenvalues of $\Omega_n$. Assume that both $\lambda_j$ and $\varrho_\iota$ are summable, namely,
$$\Gamma = \sum_{j=1}^{\mathcal{G}} \lambda_j < \infty, \qquad W = \xi + 2\sum_{\iota=1}^{n-1} |\varrho_\iota| < \infty. \quad (17)$$
Since the covariance matrix $\Omega_n$ determines the algebraic structure of the model average estimator, we discuss its properties in the following.

Lemma 1. For any $n$-dimensional vectors $a = (a_1, \ldots, a_n)^T$ and $b = (b_1, \ldots, b_n)^T$, one has $|a^T\Omega_n b| \le \|a\|\|b\|W$, where $\|\cdot\|$ is the Euclidean norm.

Proof. Through the definition of $\Omega_n$, one has
$$\left|a^T\Omega_n b\right| = \left|\sum_{l=1}^{n} a_l b_l \sigma_l^2 + \sum_{i=1}^{n-1} \varrho_i \sum_{l=1}^{n-i} (a_{l+i}b_l + a_l b_{l+i})\right| \le \xi\left|\sum_{l=1}^{n} a_l b_l\right| + \sum_{i=1}^{n-1} |\varrho_i| \left|\sum_{l=1}^{n-i} (a_{l+i}b_l + a_l b_{l+i})\right|. \quad (18)$$
Applying the Cauchy-Schwarz inequality, for $i \in \{1, \ldots, n-1\}$ one gets
$$\sum_{l=1}^{n-i} a_{l+i}b_l \le \sqrt{\sum_{l=1}^{n-i} a_{l+i}^2 \sum_{l=1}^{n-i} b_l^2} \le \sqrt{\sum_{l=1}^{n} a_l^2 \sum_{l=1}^{n} b_l^2} = \|a\|\|b\|. \quad (19)$$
Similarly, $\sum_{l=1}^{n-i} a_l b_{l+i} \le \|a\|\|b\|$ holds. Therefore,
$$\left|a^T\Omega_n b\right| \le \|a\|\|b\|\left(\xi + 2\sum_{i=1}^{n-1} |\varrho_i|\right) = \|a\|\|b\|W. \quad (20)$$
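Lemma 1 is easy to check numerically. The toy covariance below follows the structure of (9), with geometrically decaying autocovariances (our choice, for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
sigma2 = rng.uniform(0.5, 2.0, n)              # heteroscedastic sigma_i^2
rho = 0.5 ** np.arange(1, n)                   # summable autocovariances rho_j

Omega = np.diag(sigma2)
for j in range(1, n):                          # fill the off-diagonal bands of (9)
    Omega += np.diag(np.full(n - j, rho[j - 1]), k=j)
    Omega += np.diag(np.full(n - j, rho[j - 1]), k=-j)

W = sigma2.max() + 2 * np.abs(rho).sum()       # the constant W of (17)
a = rng.normal(size=n)
b = rng.normal(size=n)
lhs = abs(a @ Omega @ b)
rhs = np.linalg.norm(a) * np.linalg.norm(b) * W
print(lhs <= rhs)                              # True, as Lemma 1 guarantees
```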

Because the matrix $P(w)$ plays an important role in analyzing the problems of model selection, we state some of its properties. We set $\tau = \max\{K_1, \ldots, K_M\}$. Let $\lambda(P(w))$ and $\lambda_{\max}(P(w))$ denote an eigenvalue and the largest eigenvalue of $P(w)$, respectively.

Lemma 2. One has (i) $\operatorname{tr}(P(w)) \le \tau$ and (ii) $0 \le \lambda(P(w)) \le 1$, where $\operatorname{tr}(\cdot)$ denotes the trace operation.

Proof. It follows from $\operatorname{tr}(P_m) = K_m$ and $\operatorname{tr}(P(w)) = \sum_{m=1}^{M} w_m K_m \le \tau$ that (i) is established. Next, we consider (ii). Without loss of generality, let $\lambda_{1m} \ge \cdots \ge \lambda_{nm}$ and $\varphi_{1m}, \ldots, \varphi_{nm}$ be the eigenvalues and orthonormal eigenvectors of $P_m$, respectively. Observe that $P_m$ is idempotent, which implies that $\lambda_{im} \in \{0, 1\}$, $i = 1, \ldots, n$. For any $\varpi \in \mathbb{R}^n$, it can be seen that
$$\lambda(P(w)) = \frac{\varpi^T P(w)\varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m \frac{\varpi^T P_m \varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m \frac{\chi^T \Xi_m \chi}{\chi^T\chi} = \sum_{m=1}^{M} w_m \frac{\sum_{i=1}^{n} \lambda_{im}|\chi_i|^2}{\chi^T\chi}, \quad (21)$$
where $\Xi_m = \operatorname{diag}(\lambda_{1m}, \ldots, \lambda_{nm})$, $\varpi = \Phi_m\chi$, and $\Phi_m = (\varphi_{1m}, \ldots, \varphi_{nm})$. Due to $\lambda_{im} \in \{0, 1\}$, $i = 1, \ldots, n$, it means that
$$\lambda(P(w)) \le \sum_{m=1}^{M} w_m \frac{\sum_{i=1}^{n} |\chi_i|^2}{\chi^T\chi} = \sum_{m=1}^{M} w_m = 1, \qquad \lambda(P(w)) \ge \sum_{m=1}^{M} w_m \frac{\sum_{i=1}^{n} 0 \cdot |\chi_i|^2}{\chi^T\chi} = 0. \quad (22)$$

From Lemma 2(ii), we know that $P(w)$ is nonnegative definite.
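The conclusions of Lemma 2 can likewise be checked numerically (toy projection matrices; the setup is our assumption): the eigenvalues of $P(w)$ lie in $[0, 1]$ and $\operatorname{tr}(P(w))$ is at most $\tau = \max\{K_1, \ldots, K_M\}$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
Z = rng.normal(size=(n, 5))

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

hats = [hat(Z[:, :K]) for K in (2, 3, 5)]      # K_1 = 2, K_2 = 3, K_3 = 5
w = np.array([0.2, 0.3, 0.5])
Pw = sum(wm * Pm for wm, Pm in zip(w, hats))   # weighted hat matrix P(w)

eig = np.linalg.eigvalsh(Pw)
print(eig.min(), eig.max(), np.trace(Pw))      # eigenvalues in [0, 1], trace <= 5
```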

Lemma 3. Let $\alpha$ denote the number of eigenvalues of $P(w)$. Then $(\alpha^{-1}\operatorname{tr}(P(w)))^2 \le \alpha^{-1}\operatorname{tr}(P^2(w))$.

Proof. Let $\lambda_1, \ldots, \lambda_\alpha$ be the eigenvalues of $P(w)$. (i) If $\lambda_1 = \cdots = \lambda_\alpha = 0$, Lemma 3 holds trivially. (ii) Otherwise, suppose that some of the eigenvalues are nonzero. It is easy to see that $\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 \ge 0$, where $\bar{\lambda} = \alpha^{-1}\operatorname{tr}(P(w))$. Thus, it can be shown that
$$\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 = \sum_{i=1}^{\alpha}\lambda_i^2 - 2\bar{\lambda}\sum_{i=1}^{\alpha}\lambda_i + \alpha\bar{\lambda}^2 = \operatorname{tr}(P^2(w)) - \alpha(\alpha^{-1}\operatorname{tr}(P(w)))^2 \ge 0, \quad (23)$$
which implies that $(\alpha^{-1}\operatorname{tr}(P(w)))^2 \le \alpha^{-1}\operatorname{tr}(P^2(w))$.
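Lemma 3 is a Cauchy-Schwarz-type inequality between the mean and the mean square of the eigenvalues; a quick numerical check (toy weighted hat matrix, our setup):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 25
Z = rng.normal(size=(n, 6))

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

Pw = 0.4 * hat(Z[:, :2]) + 0.6 * hat(Z)        # a weighted hat matrix P(w)
alpha = n                                      # number of eigenvalues of P(w)
lhs = (np.trace(Pw) / alpha) ** 2
rhs = np.trace(Pw @ Pw) / alpha
print(lhs <= rhs)                              # True, as Lemma 3 guarantees
```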

Lemma 4. For any $w \in \mathcal{W}$, there exists $C_1 = C'^2$ such that $\operatorname{tr}(P(w)\Omega_n) \le C'\Gamma$, $\operatorname{tr}(\Omega_n P(w))^2 \le C_1\Gamma^2$, and $\operatorname{tr}(\Omega_n P^T(w)P(w))^2 \le C_1\Gamma\operatorname{tr}(P(w)\Omega_n P^T(w))$, in which $C' = \sup_{w\in\mathcal{W}}\lambda_{\max}(P(w))$.

Proof. Notice that $\lambda_{\min}(A)\lambda_{\max}(B) \le \lambda_{\max}(AB) \le \lambda_{\max}(A)\lambda_{\max}(B)$ and $\operatorname{tr}(AB) \le \lambda_{\max}(A)\operatorname{tr}(B) \le \operatorname{tr}(A)\operatorname{tr}(B)$, where $A$ and $B$ are any positive semidefinite matrices.

Since $P(w)$ and $\Omega_n$ are symmetric and nonnegative definite, the first inequality is established by $\operatorname{tr}(P(w)\Omega_n) \le \lambda_{\max}(P(w))\operatorname{tr}(\Omega_n) \le C'\Gamma$. Let $\lambda_i$ be an eigenvalue of $P(w)$. From the symmetry of $P(w)$, we know that there exists an $n \times n$ orthogonal matrix $\Psi$ such that $P(w) = \Psi\Upsilon\Psi^T$, where $\Upsilon = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Then it yields $\operatorname{tr}(\Omega_n P(w)) = \operatorname{tr}(\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2})$, where $\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2}$ is symmetric and nonnegative definite. The second inequality holds because
$$\operatorname{tr}(\Omega_n P(w))^2 \le \left[\operatorname{tr}(\Omega_n P(w))\right]^2 \le \left[\lambda_{\max}(P(w))\operatorname{tr}(\Omega_n)\right]^2 \le C_1\Gamma^2. \quad (24)$$
For the last formula, we note that
$$\begin{aligned} \operatorname{tr}(\Omega_n P^T(w)P(w))^2 &= \operatorname{tr}(P(w)\Omega_n P^T(w))^2 \\ &\le \lambda_{\max}(P(w)\Omega_n P^T(w))\operatorname{tr}(P(w)\Omega_n P^T(w)) \\ &\le \lambda_{\max}(P^2(w))\lambda_{\max}(\Omega_n)\operatorname{tr}(P(w)\Omega_n P^T(w)) \\ &\le \lambda_{\max}^2(P(w))\lambda_{\max}(\Omega_n)\operatorname{tr}(P(w)\Omega_n P^T(w)) \\ &\le C_1\Gamma\operatorname{tr}(P(w)\Omega_n P^T(w)). \end{aligned} \quad (25)$$
This completes the proof of Lemma 4.

Lemma 5. Let $A$ be a symmetric matrix and $X$ be a random vector with zero expectation. Then one has $\operatorname{Var}(X^TAX) = 2\operatorname{tr}(\operatorname{Var}(X)A\operatorname{Var}(X)A)$, where $\operatorname{Var}(\cdot)$ is the operation of variance.

Proof. The proof of this lemma is provided in Lai and Xie [26].
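Lemma 5 is the standard variance formula for a quadratic form in a Gaussian vector, and it can be verified by simulation (the Gaussian assumption and the toy dimensions are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 4
A = rng.normal(size=(d, d))
A = (A + A.T) / 2                              # symmetric A
L = rng.normal(size=(d, d))
V = L @ L.T + np.eye(d)                        # Var(X) = V, positive definite

var_theory = 2 * np.trace(V @ A @ V @ A)       # Lemma 5: 2 tr(VAVA)

Xs = rng.multivariate_normal(np.zeros(d), V, size=400_000)
q = np.einsum('ni,ij,nj->n', Xs, A, Xs)        # quadratic forms X^T A X
var_mc = q.var()
print(var_theory, var_mc)                      # should agree closely
```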

3. Average Squared Error

Denote an average squared error by
$$L_n(w) = \frac{1}{n}\|\mu - \hat{\mu}(w)\|_k^2 = \frac{(\mu - \hat{\mu}(w))^T k (\mu - \hat{\mu}(w))}{n}, \quad (26)$$
where $k$ is a fixed positive integer, which is often used to eliminate the boundary effect. Andrews [28] suggested that $k$ can take the value $\sigma^{-2}$ when the errors obey an independent and identical distribution with variance $\sigma^2$. The most common situation is $k = 1$. The average squared error $L_n(w)$ can be viewed as a measure of the accuracy between $\hat{\mu}(w)$ and $\mu$. Obviously, an optimal estimator attains the minimum value of $L_n(w)$. In other words, we should select a weight vector $w$ from $\mathcal{W}$ so that the average squared error $L_n(w)$ takes a value that is as small as possible.

In order to investigate the problem of weight selection, we give the expression of the conditional expected average squared error as follows:
$$R_n(w) = E(L_n(w) \mid X, Z_m). \quad (27)$$
From the definition of $R_n(w)$, the following lemma can be obtained.

Lemma 6. The conditional expected average squared error can be rewritten as
$$R_n(w) = \frac{\|M(w)\mu - \eta(w)\|_k^2}{n} + \frac{k\operatorname{tr}(P(w)\Omega_n P^T(w))}{n}, \quad (28)$$
where $M(w) = I - P(w)$.

Proof. A straightforward calculation of $L_n(w)$ leads to
$$\begin{aligned} nL_n(w) &= (\mu - \hat{\mu}(w))^T k (\mu - \hat{\mu}(w)) \\ &= (\mu - \eta(w) - P(w)(\mu + e))^T k (\mu - \eta(w) - P(w)(\mu + e)) \\ &= \left[\mu^T M^T(w) - \eta^T(w) - e^T P^T(w)\right] k \left[M(w)\mu - \eta(w) - P(w)e\right] \\ &= \|M(w)\mu - \eta(w)\|_k^2 - k(\mu - XR^{-1}B)^T M^T(w) P(w) e \\ &\quad + k e^T P^T(w) P(w) e - k e^T P^T(w) M(w)(\mu - XR^{-1}B). \end{aligned} \quad (29)$$
Using $E(e^T P^T(w) P(w) e \mid X, Z_m) = \operatorname{tr}(P(w)\Omega_n P^T(w))$ and taking conditional expectations on both sides of the above equality gives rise to Lemma 6.

4. The k-Class Generalized Information Criterion and Asymptotic Optimality

As the value of \(\mu\) is unknown, the average squared error \(L_n(w)\) cannot be used directly to select the weight vector \(w\). Thus, we suggest a k-class generalized information criterion (k-GIC) to choose the weight vector. Further, we will prove that the weight vector selected by the k-GIC minimizes the average squared error \(L_n(w)\).

The k-GIC for the restricted model average estimator is
\[
C_n(w) = \frac{\|Y - \hat\mu(w)\|_k^2}{n} + \frac{2\hbar}{n}\operatorname{tr}\left(P(w)\Omega_n\right), \tag{30}
\]
where \(\hbar\) is larger than one and satisfies assumption (35) mentioned below. The k-class generalized information criterion extends the generalized information criterion (GIC\(_{\lambda_n}\)) proposed by Shao [27]. Because \(\hbar\) can take different values, the k-GIC includes some common information criteria for model selection, such as the Mallows \(C_L\) criterion (\(k = \hbar = 1\)), the GIC criterion (\(k = 1\), \(\hbar \to \infty\)), the FPE\(_\alpha\) criterion (\(k = 1\), \(\hbar > 1\)), and the BIC criterion (\(k = 1\), \(\hbar = \log(n)\)).
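In computational terms, evaluating (30) only requires the fit \(\hat\mu(w)\), the matrix \(P(w)\), and \(\Omega_n\). A sketch, again assuming the weighted norm \(\|v\|_k^2 = k\,v^Tv\) (function and argument names are illustrative, not from the paper):

```python
import numpy as np

def k_gic(Y, mu_w, P_w, Omega_n, k=1.0, hbar=1.0):
    """k-GIC value C_n(w) of (30) for one candidate weight vector w.

    Y       : (n,) response vector
    mu_w    : (n,) model-average fit mu_hat(w)
    P_w     : (n, n) averaged hat-type matrix P(w)
    Omega_n : (n, n) error covariance matrix
    k, hbar : tuning constants of the criterion (hbar > 1)
    """
    n = len(Y)
    resid = Y - mu_w
    fit_term = k * (resid @ resid) / n                    # ||Y - mu_hat(w)||^2_k / n
    penalty = 2.0 * hbar * np.trace(P_w @ Omega_n) / n    # (2*hbar/n) tr(P(w) Omega_n)
    return fit_term + penalty
```

The weight vector minimizing this value over the candidate set \(\mathcal{W}\) is the \(\hat w\) studied below.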

6 Abstract and Applied Analysis

Lemma 7. If \(k = \hbar\), we have \(E(C_n(w) \mid X, Z_m) = R_n(w) + k\Gamma_n\).

Proof. Recalling the definition of \(C_n(w)\), one gets
\[
\begin{aligned}
nC_n(w) &= (Y - \hat\mu(w))^T k\,(Y - \hat\mu(w)) + 2k\operatorname{tr}\left(P(w)\Omega_n\right) \\
&= \left[M(w)Y - \eta(w)\right]^T k \left[M(w)Y - \eta(w)\right] + 2k\operatorname{tr}\left(P(w)\Omega_n\right) \\
&= 2k\operatorname{tr}\left(P(w)\Omega_n\right) + k e^T M^T(w)M(w)e \\
&\quad + k\left(M(w)\mu - \eta(w)\right)^T M(w)e + k e^T M^T(w)\left(M(w)\mu - \eta(w)\right) \\
&\quad + \|M(w)\mu - \eta(w)\|_k^2.
\end{aligned} \tag{31}
\]

Observe that
\[
\begin{aligned}
k e^T M^T(w)M(w)e &= e^T(I - P(w))^T k\,(I - P(w))e \\
&= k e^T e - k e^T P^T(w)e - k e^T P(w)e + k e^T P^T(w)P(w)e.
\end{aligned} \tag{32}
\]

Notice that \(E(k e^T e \mid X, Z_m) = k\operatorname{tr}(\Omega_n)\) and \(E(k e^T P^T(w)e \mid X, Z_m) = k\operatorname{tr}(P^T(w)\Omega_n)\). Moreover, we have \(E(k e^T P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n)\) and \(E(k e^T P^T(w)P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n P^T(w))\).

Thus, taking conditional expectations on both sides of (31), we obtain
\[
nE\left(C_n(w) \mid X, Z_m\right) = k\operatorname{tr}(\Omega_n) + \|M(w)\mu - \eta(w)\|_k^2 + k\operatorname{tr}\left(P(w)\Omega_n P^T(w)\right). \tag{33}
\]
It follows from (33) and Lemma 6 that Lemma 7 is established, as desired.

Lemma 7 shows that \(C_n(w)\) equals the conditional expected average squared error plus a bias term that does not depend on \(w\). In particular, as \(n\) approaches infinity, \(C_n(w)\) is an unbiased estimator of \(R_n(w)\).

The k-GIC criterion is designed to select the optimal weight vector \(\hat w\), which is chosen by minimizing \(C_n(w)\).

Obviously, the well-posed estimators of the parameters are \(\hat\Theta_{\hat w}\) and \(\hat\Psi_{\hat w}\) with the weight vector \(\hat w\). Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:
\[
\frac{L_n(\hat w)}{\inf_{w \in \mathcal{W}} L_n(w)} \xrightarrow{\,p\,} 1. \tag{34}
\]
If formula (34) holds, we know that the weight vector \(\hat w\) selected by \(C_n(w)\) realizes the minimum value of \(L_n(w)\). In other words, \(\hat w\) is equivalent to the weight vector selected by minimizing \(L_n(w)\) and is the optimal weight vector for \(\hat\mu(w)\). The asymptotic optimality of \(C_n(w)\) can be established under the following assumptions.

Assumption 1. Write \(\alpha_n = \inf_{w \in \mathcal{W}} nR_n(w)\). As \(n \to \infty\), we assume that
\[
\frac{\hbar^2}{\alpha_n} \longrightarrow 0, \tag{35}
\]
\[
\sum_{w \in \mathcal{W}} \frac{1}{\left(nR_n(w)\right)^{N'}} \longrightarrow 0, \tag{36}
\]
for a large \(\hbar\) and any positive integer \(1 \le N' < \infty\).

The above assumptions have been employed in much of the model selection literature. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of \(C_n(w)\).

Lemma 8. We have
\[
\|M(w)Y - \eta(w)\|_k^2 = nL_n(w) - 2\left\langle e^T, P(w)e\right\rangle_k + 2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k + \|e\|_k^2. \tag{37}
\]

Proof. Using \(\hat\mu(w) = P(w)Y + \eta(w)\), \(Y = \mu + e\), and the definition of \(L_n(w)\), one obtains
\[
\begin{aligned}
\|M(w)Y - \eta(w)\|_k^2 &= \|\mu + e - P(w)Y - \eta(w)\|_k^2 \\
&= \|e\|_k^2 + \|\mu - P(w)Y - \eta(w)\|_k^2 + 2\left\langle e^T, \mu - \eta(w) - P(w)Y\right\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\left\langle e^T, \mu - \eta(w) - P(w)(\mu + e)\right\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\left\langle e^T, (I - P(w))\mu - \eta(w) - P(w)e\right\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k - 2\left\langle e^T, P(w)e\right\rangle_k.
\end{aligned} \tag{38}
\]
This completes the proof of Lemma 8.

Lemma 8 and \(\|Y - \hat\mu(w)\|_k^2 = \|M(w)Y - \eta(w)\|_k^2\) imply that
\[
C_n(w) = L_n(w) + \frac{2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k}{n} + \frac{\|e\|_k^2}{n} + \frac{2\hbar\operatorname{tr}\left(P(w)\Omega_n\right)}{n} - \frac{2\left\langle e^T, P(w)e\right\rangle_k}{n}. \tag{39}
\]


The goal is to choose \(w\) by minimizing \(C_n(w)\). From (39), one only needs to select \(w\) by minimizing
\[
L_n(w) + \frac{2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k}{n} + \frac{2\hbar\operatorname{tr}\left(P(w)\Omega_n\right) - 2\left\langle e^T, P(w)e\right\rangle_k}{n}, \tag{40}
\]
where \(w \in \mathcal{W}\).

where 119908 isinWCompared with 119871

119899(119908) it is sufficient to establish

that 119899minus1⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896and 119899

minus1ℏ tr(119875(119908)Ω

119899) minus

119899minus1⟨119890119879 119875(119908)119890⟩

119896are uniformly negligible for any 119908 isin W

More specifically to prove (34) we need to check

sup119908isinW

⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896

119899119877119899 (119908)

119901

997888rarr 0 (41)

sup119908isinW

10038161003816100381610038161003816ℏ tr (119875 (119908)Ω119899) minus ⟨119890

119879 119875(119908)119890⟩

119896

10038161003816100381610038161003816

119899119877119899 (119908)

119901

997888rarr 0 (42)

sup119908isinW

10038161003816100381610038161003816100381610038161003816

119871119899 (119908) minus 119877119899 (119908)

119877119899 (119908)

10038161003816100381610038161003816100381610038161003816

119901

997888rarr 0 (43)

where ldquo119901

997888rarrrdquo denotes the convergence in probabilityFollowing the idea of Li [29] we testify the main result

of our work that the minimizing of 119896-class generalizedinformation criterion is asymptotically optimalNowwe statethe main theorem

Theorem 9. Under assumptions (35) and (36), minimizing the k-class generalized information criterion \(C_n(w)\) is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the k-GIC requires showing that (41), (42), and (43) are valid.

Firstly, we prove that (41) holds. For any \(\zeta > 0\), by Chebyshev's inequality, one has
\[
\Pr\left\{\sup_{w \in \mathcal{W}} \frac{\left|\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k\right|}{nR_n(w)} > \zeta\right\} \le \sum_{w \in \mathcal{W}} \frac{E\left[\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k\right]^2}{\left(\zeta nR_n(w)\right)^2}, \tag{44}
\]
which, by Lemma 1, is no greater than
\[
\zeta^{-2} k\bar{W} \sum_{w \in \mathcal{W}} n^{-2}\|M(w)\mu - \eta(w)\|_k^2 \left(R_n(w)\right)^{-2}. \tag{45}
\]
Recalling the definition of \(R_n(w)\), we get \(\|M(w)\mu - \eta(w)\|_k^2 \le nR_n(w)\). Then (45) does not exceed \(\zeta^{-2} k\bar{W} \sum_{w \in \mathcal{W}} (nR_n(w))^{-1}\). By assumption (36), one knows that (45) tends to zero. Thus, (41) is established.

To prove (42), it suffices to show that
\[
\frac{k\operatorname{tr}\left(P(w)\Omega_n\right) - \left\langle e^T, P(w)e\right\rangle_k}{nR_n(w)} \xrightarrow{\,p\,} 0, \tag{46}
\]
\[
\frac{(\hbar - k)\operatorname{tr}\left(P(w)\Omega_n\right)}{nR_n(w)} \xrightarrow{\,p\,} 0. \tag{47}
\]

By Chebyshev's inequality and Lemmas 4 and 5, for any \(\epsilon > 0\), we have
\[
\begin{aligned}
\Pr\left\{\sup_{w \in \mathcal{W}} \left|\frac{\operatorname{tr}\left(P(w)\Omega_n\right) - \left\langle e^T, P(w)e\right\rangle}{nR_n(w)}\right| > \epsilon\right\} &\le \sum_{w \in \mathcal{W}} \frac{E\left[\operatorname{tr}\left(P(w)\Omega_n\right) - \left\langle e^T, P(w)e\right\rangle\right]^2}{\left(\epsilon nR_n(w)\right)^2} \\
&= \frac{1}{\epsilon^2} \sum_{w \in \mathcal{W}} \frac{\operatorname{Var}\left(e^T P(w)e\right)}{\left(nR_n(w)\right)^2} \\
&\le \frac{2C_1\Gamma^2}{\epsilon^2} \sum_{w \in \mathcal{W}} \frac{1}{\left(nR_n(w)\right)^2}.
\end{aligned} \tag{48}
\]
From (36), we know that (48) tends to zero. Thus, (46) holds.

Recalling Lemma 4 and (35), it follows that
\[
\frac{\left|(\hbar - k)\operatorname{tr}\left(P(w)\Omega_n\right)\right|}{nR_n(w)} \le \frac{\left|C'(\hbar - k)\Gamma\right|}{nR_n(w)} \longrightarrow 0, \tag{49}
\]
which yields (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to

\[
\begin{aligned}
\|\mu - \hat\mu(w)\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 &= \|\mu - \eta(w) - P(w)Y\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|\mu - \eta(w) - P(w)\mu - P(w)e\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|M(w)\mu - \eta(w) - P(w)e\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|P(w)e\|_k^2 - 2\left\langle \mu^T M^T(w) - \eta^T(w), P(w)e\right\rangle_k.
\end{aligned} \tag{50}
\]

From identity (50), we know that
\[
L_n(w) - R_n(w) = -\frac{2\left\langle \mu^T M^T(w) - \eta^T(w), P(w)e\right\rangle_k}{n} + \frac{\|P(w)e\|_k^2 - k\operatorname{tr}\left(P(w)\Omega_n P^T(w)\right)}{n}. \tag{51}
\]


Obviously, the proof of (43) requires verifying that
\[
\sup_{w \in \mathcal{W}} \frac{\left|\|P(w)e\|_k^2 - k\operatorname{tr}\left(P(w)\Omega_n P^T(w)\right)\right|}{nR_n(w)} \xrightarrow{\,p\,} 0, \qquad
\sup_{w \in \mathcal{W}} \frac{\left|\left\langle \mu^T M^T(w) - \eta^T(w), P(w)e\right\rangle_k\right|}{nR_n(w)} \xrightarrow{\,p\,} 0. \tag{52}
\]

To prove (52), it should be noticed that \(\operatorname{tr}(P(w)\Omega_n P^T(w)) = E(e^T P^T(w)P(w)e)\) and \(\|P(w)e\|_k^2 = \langle e^T P^T(w), P(w)e\rangle_k\). In addition,
\[
\|P(w)(M(w)\mu - \eta(w))\|^2 \le \lambda_{\max}^2(P(w))\,\|M(w)\mu - \eta(w)\|^2 \le \|M(w)\mu - \eta(w)\|^2. \tag{53}
\]

Thus, for any \(\zeta' > 0\) and \(\epsilon' > 0\),
\[
\begin{aligned}
\Pr\left\{\sup_{w \in \mathcal{W}} \frac{\|P(w)e\|_k^2 - k\operatorname{tr}\left(P(w)\Omega_n P^T(w)\right)}{nR_n(w)} > \zeta'\right\} &\le \sum_{w \in \mathcal{W}} \frac{E\left[\|P(w)e\|_k^2 - kE\left(e^T P^T(w)P(w)e\right)\right]^2}{\left(nR_n(w)\zeta'\right)^2} \\
&= \frac{k^2}{\zeta'^2} \sum_{w \in \mathcal{W}} \frac{\operatorname{Var}\left(e^T P^T(w)P(w)e\right)}{\left(nR_n(w)\right)^2} \\
&\le \frac{2kC_1\Gamma}{\zeta'^2} \sum_{w \in \mathcal{W}} \frac{k\operatorname{tr}\left(P(w)\Omega_n P^T(w)\right)}{\left(nR_n(w)\right)^2} \\
&\le \frac{2kC_1\Gamma}{\zeta'^2} \sum_{w \in \mathcal{W}} \frac{1}{nR_n(w)},
\end{aligned} \tag{54}
\]
\[
\begin{aligned}
\Pr\left\{\sup_{w \in \mathcal{W}} \frac{\left\langle \mu^T M^T(w) - \eta^T(w), P(w)e\right\rangle_k}{nR_n(w)} > \epsilon'\right\} &\le \sum_{w \in \mathcal{W}} \frac{E\left[\left\langle \mu^T M^T(w) - \eta^T(w), P(w)e\right\rangle_k\right]^2}{\left(nR_n(w)\epsilon'\right)^2} \\
&\le \sum_{w \in \mathcal{W}} \frac{\|P(w)(M(w)\mu - \eta(w))\|_k^2\, k\bar{W}}{\left(nR_n(w)\epsilon'\right)^2} \\
&\le \epsilon'^{-2} k\bar{W} \sum_{w \in \mathcal{W}} n^{-2}\|M(w)\mu - \eta(w)\|_k^2 \left(R_n(w)\right)^{-2}. 
\end{aligned} \tag{55}
\]
Combining (36) and (45), one knows that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) all hold. This completes the proof of Theorem 9.

In practice, the covariance matrix of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of \(\Omega_n\) because of its special structure. In the special case when the random errors are independently and identically distributed with variance \(\sigma^2\), a consistent estimator of \(\sigma^2\) can be built for the constrained models (7) and (8). Let \(m = Q\) in \(\hat\mu_m\) and \(\tau' = \operatorname{tr}(P_Q)\), where \(Q\) corresponds to a "large" approximating model. Denote \(\hat\sigma_Q^2 = (n - \tau')^{-1}(Y - \hat\mu_Q)^T(Y - \hat\mu_Q)\). The coming theorem shows that \(\hat\sigma_Q^2\) is a consistent estimator of \(\sigma^2\).

Theorem 10. If \(\tau'/n \to 0\) as \(\tau' \to \infty\) and \(n \to \infty\), we have \(\hat\sigma_Q^2 \xrightarrow{\,p\,} \sigma^2\) as \(n \to \infty\).

Proof. Writing \(\Lambda = D + XR^{-1}\tilde{D}\), one obtains
\[
\frac{(Y - \hat\mu_Q)^T(Y - \hat\mu_Q)}{n - \tau'} = \frac{e^T M_Q e}{n - \tau'} + \frac{\|M_Q\Lambda\|^2}{n - \tau'} + \frac{2e^T M_Q\Lambda}{n - \tau'}, \tag{56}
\]
where \(M_Q = I - P_Q\).

Since \(E(e^T M_Q e) = \sigma^2(n - \tau')\), it follows that
\[
E\left|e^T M_Q e - \sigma^2(n - \tau')\right|^2 = E\left|e^T M_Q e - E\left(e^T M_Q e\right)\right|^2 = \operatorname{Var}\left(e^T M_Q e\right) = 2\sigma^4\operatorname{tr}\left(M_Q\right). \tag{57}
\]

For any \(\varepsilon > 0\), it follows from (57) that
\[
\Pr\left\{\left|\frac{e^T M_Q e}{n - \tau'} - \sigma^2\right| > \varepsilon\right\} \le \frac{E\left|e^T M_Q e - (n - \tau')\sigma^2\right|^2}{\varepsilon^2(n - \tau')^2} \le \frac{2\sigma^4}{\varepsilon^2(n - \tau')} \longrightarrow 0. \tag{58}
\]

Equation (58) implies that \((n - \tau')^{-1}(e^T M_Q e) \xrightarrow{\,p\,} \sigma^2\). Let \(\mathcal{I}_{Qt}\) be the \(t\)th diagonal element of the "hat" matrix \(P_Q\). Then \(0 \le 1 - \mathcal{I}_{Qt} < 1\) is the \(t\)th diagonal element of \(M_Q\) and satisfies \(\sum_{t=1}^n (1 - \mathcal{I}_{Qt}) = n - \tau'\).

Because \(\phi(z_{il}, \psi_l)\) and \(\phi(a_{sk}, \psi_k)\) converge in mean square, we have \(E(\Lambda_{Qt}^2) \to 0\) as \(\tau' \to \infty\), where \(\Lambda_{Qt}\) is the \(t\)th element of \(\Lambda\), \(t = 1, \ldots, n\). Notice that

\[
\begin{aligned}
\frac{E\left(\Lambda^T M_Q\Lambda\right)}{n - \tau'} &= \frac{1}{n - \tau'}\sum_{t=1}^n E\left(\Lambda_{Qt}^2 M_{Qtt}\right) = \frac{1}{n - \tau'}\sum_{t=1}^n E\left(\Lambda_{Qt}^2\right)\left[1 - \mathcal{I}_{Qt}\right] \\
&\le \max_t E\left(\Lambda_{Qt}^2\right)\frac{1}{n - \tau'}\sum_{t=1}^n \left[1 - \mathcal{I}_{Qt}\right] = \max_t E\left(\Lambda_{Qt}^2\right) \longrightarrow 0.
\end{aligned} \tag{59}
\]
The above expression implies that the second term on the right-hand side of (56) approaches zero. By a proof similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.


In the case of independently and identically distributed errors, if we replace \(\sigma^2\) by \(\hat\sigma_Q^2\) in the k-GIC, the k-GIC simplifies to
\[
C_n(w) = \frac{\|Y - \hat\mu(w)\|_k^2}{n} + \frac{2\hbar\hat\sigma_Q^2\operatorname{tr}\left(P(w)\right)}{n}. \tag{60}
\]
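Under i.i.d. errors the penalty in (60) needs only \(\hat\sigma_Q^2\) and the effective number of parameters \(\operatorname{tr}(P(w))\). A sketch mirroring the earlier criterion, again assuming \(\|v\|_k^2 = k\,v^Tv\) (names are illustrative):

```python
import numpy as np

def k_gic_iid(Y, mu_w, P_w, sigma2_hat_Q, k=1.0, hbar=1.0):
    """Simplified k-GIC of (60): Omega_n = sigma^2 I is replaced by the
    plug-in estimate sigma2_hat_Q, so only tr(P(w)) is needed."""
    n = len(Y)
    resid = Y - mu_w
    return (k * (resid @ resid) + 2.0 * hbar * sigma2_hat_Q * np.trace(P_w)) / n
```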

Here, we illustrate that the model selection procedure of minimizing this \(C_n(w)\) is also asymptotically optimal.

Theorem 11. Assume that the random errors \(e\) are i.i.d. with mean zero and variance \(\sigma^2\). Under conditions (35) and (36), the criterion \(C_n(w)\) in (60) remains asymptotically optimal.

Proof. Using a technique similar to the derivation of (39), we obtain
\[
\begin{aligned}
nC_n(w) &= \|Y - \hat\mu(w)\|_k^2 + 2\hbar\hat\sigma_Q^2\operatorname{tr}\left(P(w)\right) \\
&= nL_n(w) + 2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k + 2\hbar\hat\sigma_Q^2\operatorname{tr}\left(P(w)\right) - 2\left\langle e^T, P(w)e\right\rangle_k + \|e\|_k^2 \\
&= 2\hbar\left(\hat\sigma_Q^2 - \sigma^2\right)\operatorname{tr}\left(P(w)\right) + 2\left\langle e^T, M(w)\mu - \eta(w)\right\rangle_k + nL_n(w) \\
&\quad + 2\hbar\sigma^2\operatorname{tr}\left(P(w)\right) - 2\left\langle e^T, P(w)e\right\rangle_k + \|e\|_k^2.
\end{aligned} \tag{61}
\]

With an appropriate modification of the proofs of (41)–(43), one only needs to verify
\[
\sup_{w \in \mathcal{W}} \frac{\hbar\left|\sigma^2 - \hat\sigma_Q^2\right|\operatorname{tr}\left(P(w)\right)}{nR_n(w)} \xrightarrow{\,p\,} 0, \tag{62}
\]
which is equivalent to showing
\[
\sup_{w \in \mathcal{W}} \frac{\hbar^2\left((1/n)\left|\sigma^2 - \hat\sigma_Q^2\right|\operatorname{tr}\left(P(w)\right)\right)^2}{R_n^2(w)} \xrightarrow{\,p\,} 0. \tag{63}
\]

From the proof of Theorem 10, we have
\[
\Pr\left\{\left|\frac{\hbar e^T M_Q e}{n - \tau'} - \hbar\sigma^2\right| > \varepsilon' R_n^{-1/2}(w)\right\} \le \frac{\hbar^2 E\left|e^T M_Q e - (n - \tau')\sigma^2\right|^2}{\varepsilon'^2(n - \tau')^2 R_n(w)} \le \frac{2\hbar^2\sigma^4}{\varepsilon'^2(n - \tau')R_n(w)}, \tag{64}
\]

where \(\varepsilon' > 0\). By assumption (35), the above bound tends to zero. Thus, we obtain \(R_n(w)^{-1}|\hbar\sigma^2 - \hbar\hat\sigma_Q^2|^2 \xrightarrow{\,p\,} 0\). It follows from Lemma 3 and the definition of \(R_n(w)\) that \((n^{-1}\operatorname{tr}(P(w)))^2\) is no greater than \(R_n(w)\). Then formula (63) is confirmed. Therefore, we conclude that minimizing the \(C_n(w)\) of (60) is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite sample properties of the proposed restricted linear model selection. The data generating process is
\[
y_i = x_i\theta + \sum_{l=1}^{1000} z_{il}\psi_l + e_i, \quad \text{subject to } \theta = -\sum_{l=1}^{1000}\psi_l + 1, \quad i = 1, \ldots, n, \tag{65}
\]

where \(x_i\) follows the \(t(3)\) distribution, \(z_{i1} = 1\), and \(z_{il}\), \(l = 2, \ldots, 1000\), are independently and identically distributed \(N(0, 1)\). The parameters \(\psi_l\), \(l = 1, \ldots, 1000\), are determined by the rule \(\psi_l = \zeta l^{-1}\), where \(\zeta\) is a parameter selected to control the population \(R^2 = \zeta^2/(1 + \zeta^2)\). The error \(e_i\) is independent of \(x_i\) and \(z_{il}\). We consider two cases of the error distribution.

Case 1. \(e_i\) is independently and identically distributed \(N(0, 1)\).

Case 2. \((e_1, \ldots, e_n)\) obeys the multivariate normal distribution \(N(0, \Omega_n)\), where \(\Omega_n\) is an \(n \times n\) covariance matrix. The \(j\)th diagonal element of \(\Omega_n\) is \(\sigma_j^2\), generated from the uniform distribution on \((0, 1)\). The \((j, l)\)th (\(j \ne l\)) off-diagonal element of \(\Omega_n\) is \(\Omega_{jl,n} = \exp(-0.5|x_{ij} - x_{il}|^2)\), where \(x_{ij}\) and \(x_{il}\) denote the \(j\)th and \(l\)th elements of \(x_i\), respectively.
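One replication of design (65) under Case 1 can be sketched as follows, assuming the rule \(\psi_l = \zeta l^{-1}\) stated above (function and variable names are illustrative):

```python
import numpy as np

def simulate_case1(n, zeta, L=1000, rng=None):
    """Draw one sample from (65) with i.i.d. N(0, 1) errors (Case 1)."""
    rng = rng or np.random.default_rng()
    l = np.arange(1, L + 1)
    psi = zeta / l                        # psi_l = zeta * l^{-1}
    theta = 1.0 - psi.sum()               # restriction: theta = 1 - sum_l psi_l
    x = rng.standard_t(df=3, size=n)      # x_i ~ t(3)
    z = rng.normal(size=(n, L))
    z[:, 0] = 1.0                         # z_{i1} = 1
    e = rng.normal(size=n)                # e_i ~ N(0, 1)
    y = x * theta + z @ psi + e
    return y, x, z, theta, psi
```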

The sample size varies over \(n = 50, 100, 150\), and \(200\). The number of models is determined by \(M = \lfloor 3n^{1/3}\rfloor\), where \(\lfloor s\rfloor\) stands for the integer part of \(s\). We set \(\zeta\) so that \(R^2\) varies on a grid between 0.1 and 0.9. The number of simulation trials is \(\Pi = 500\). For the k-GIC, the value of \(k\) is one, and \(\hbar\) is the effective number of parameters.

To assess the performance of the k-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV, [17]); (4) the Mallows model averaging estimator (MMA, [11]); and (5) the k-GIC model selection estimator (k-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

\[
\mathrm{AIC}_m = 2n\ln\left(\frac{1}{n}\|Y - \hat\mu_m\|^2\right) + 2\mathcal{K}_m, \qquad
\mathrm{BIC}_m = 2n\ln\left(\frac{1}{n}\|Y - \hat\mu_m\|^2\right) + \ln(n)\,\mathcal{K}_m. \tag{66}
\]
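The two criteria in (66) can be computed directly from a candidate model's fit and its effective number of parameters \(\mathcal{K}_m\); a sketch (names are illustrative):

```python
import numpy as np

def aic_bic(Y, mu_m, K_m):
    """AIC_m and BIC_m of (66) for a candidate model m.

    Y    : (n,) response vector
    mu_m : (n,) fitted values of model m
    K_m  : effective number of parameters of model m
    """
    n = len(Y)
    log_mse = np.log(((Y - mu_m) ** 2).mean())   # ln((1/n)||Y - mu_m||^2)
    return 2 * n * log_mse + 2 * K_m, 2 * n * log_mse + np.log(n) * K_m
```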

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, \(\{y_\ell, x_\ell, z_\ell\}_{\ell=1}^{100}\) are generated as out-of-sample observations. In the \(\pi\)th simulation, the prediction error is
\[
\mathrm{PE}(\pi) = \frac{1}{100}\sum_{\ell=1}^{100}\left(y_\ell - \hat\mu_\ell(\hat w)\right)^2, \tag{67}
\]


[Figure 1: Results for the Monte Carlo design with homoscedastic errors. Panels (a)–(d) correspond to \(n = 50, M = 11\); \(n = 100, M = 14\); \(n = 150, M = 16\); and \(n = 200, M = 18\). Each panel plots the prediction error (vertical axis, roughly 0.6–1.8) against \(R^2\) (horizontal axis, 0.1–0.9) for the CV, MMA, AIC, BIC, and k-GIC estimators.]

where \(\hat w\) is selected by one of the five methods. Then the out-of-sample prediction error is calculated by
\[
\mathrm{PE} = \frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi), \tag{68}
\]
where \(\Pi = 500\) is the number of replications. Obviously, a smaller PE implies a better model estimator. We first consider PE under homoscedastic errors. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the \(y\)-axis and \(R^2\) is displayed on the \(x\)-axis.
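The evaluation in (67)–(68) amounts to averaging hold-out mean squared errors across replications; a sketch, with the fitting step abstracted away (names are illustrative):

```python
import numpy as np

def prediction_error(y_out, mu_out):
    """PE(pi) of (67): mean squared error over the 100 hold-out observations."""
    return float(((y_out - mu_out) ** 2).mean())

def average_pe(pe_values):
    """PE of (68): average of PE(pi) over the Pi replications."""
    return float(np.mean(pe_values))
```

In the full experiment, `mu_out` would be the out-of-sample fit \(\hat\mu_\ell(\hat w)\) produced by each of the five competing methods.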

We find that the k-GIC estimators are almost always the best among those considered. When \(R^2\) is very large, the MMA estimators can sometimes be marginally preferred to the k-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller \(R^2\), the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as \(R^2\) increases. In many situations, the PEs of the BIC estimator with a large \(R^2\) are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; the calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic case. The k-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both averaging estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the \(R^2\) space.


[Figure 2: Results for the Monte Carlo design with correlated errors. Panels (a)–(d) correspond to \(n = 50, M = 11\); \(n = 100, M = 14\); \(n = 150, M = 16\); and \(n = 200, M = 18\). Each panel plots the prediction error (vertical axis, roughly 0.6–1.8) against \(R^2\) (horizontal axis, 0.1–0.9) for the CV, MMA, AIC, BIC, and k-GIC estimators.]

When \(R^2 \le 0.2\), the BIC estimator outperforms the k-GIC estimator. Again, the AIC estimator is habitually the worst performing estimator, with CV a close second over a large region of the \(R^2\) space. Besides, the relative efficiency of the methods depends closely on the sample size, with the BIC estimator revealing increasing PE, and the remaining four estimators showing decreasing PE, as \(n\) increases.

6. Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are computationally cumbersome and cannot produce closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper develops a statistical procedure for selecting an optimal tracking portfolio. We model tracking portfolios as constrained linear models, so that the selection of an optimal portfolio boils down to choosing an optimal constrained linear model.

In the setting of unrestricted or homoscedastic models, a large number of works investigate the problems of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal


weights. We select the weights by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the k-GIC is compared against that of four other methods. It is found that the k-GIC gives the best performance in most cases.

Two limitations of our results are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control can be applied to the inference of the parametric estimators. Besides, the applications of this study can be extended to other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356-1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928-961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108-1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270-290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905-919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519-536, 2002.

[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879-899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175-1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053-1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194-200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38-46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csaki, Eds., pp. 267-281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_p$," Technometrics, vol. 15, no. 4, pp. 661-675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111-147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377-403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543-564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900-945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312-323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003-3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742-762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37-63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757-766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221-264, 1997, with comments and a rejoinder by the author.

Abstract and Applied Analysis 13

[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359-377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958-975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478-493, 1993.



where $p$ is a positive integer, $\theta_j$ and $\psi_l$ are parameters, $e_i$ is a random error, $r_s$ is a constant restricting the $s$th parameter $\theta_s$, and $a_{sk}$ and $b_s$ are constants. Assume that $\phi(z_{il},\psi_l)=\sum_{l=1}^{\infty} z_{il}\psi_l$ and $\phi(a_{sk},\psi_k)=\sum_{k=1}^{\infty} a_{sk}\psi_k$ converge in mean square.

In (3), the explanatory variable $x_{ij}$ is included in the model on theoretical grounds or for other a priori reasons, while $z_{il}$ is an additional explanatory variable whose inclusion in the model must still be decided. In the context of building a tracking portfolio, the fixed explanatory variable $x_{ij}$ stands for the historical return of the $j$th stock at time $i$, which must be held by the investor because of personal preference or the stable earnings of that stock. In a tracking portfolio, investors also select some alternative stocks from a large pool in order to realize their expected return, so the additional explanatory variable $z_{il}$ indicates the historical return of the $l$th alternative stock at time $i$. Since $z_{il}$ can be viewed as a series expansion, the identity (3) includes semiparametric models as a special case; in fact, model (3) generalizes the models considered by Lai and Xie [26] and Liang et al. [12]. In addition, the parameters $\theta_j$ and $\psi_l$ denote, respectively, the proportions of the $j$th required stock and the $l$th alternative stock in a tracking portfolio.

In (4), the parameter $\theta_s$ is adjusted by a linear combination of the $\psi_k$ and the constant $b_s$. The economic significance of (4) is that the proportion of each fixed investment varies with the proportions of all alternative investments. When investors change their preferences or acquire new information on alternative stocks, they can rebalance between required and alternative stocks according to (4). This implies that adding or removing alternative stocks affects the proportion of each required stock in the portfolio. Moreover, if we assume that $\psi_k = 0$ for all $k \ge 1$, model (4) becomes $r_s\theta_s = b_s$, which has been discussed by Lai and Xie [26]. In particular, if we set $r_1 = \cdots = r_p = 1$, $a_{11} + \cdots + a_{p1} = \cdots = a_{1\infty} + \cdots + a_{p\infty} = -1$, and $b_1 + \cdots + b_p = 1$, the restricted equation (4) turns into (2).

Denote an index set $\mathcal{U} = \{K_1, \ldots, K_M\}$, where $M$ is a positive integer. Let $\Theta = (\theta_1, \ldots, \theta_p)^T$ and $\Psi = (\psi_1, \psi_2, \ldots)^T$, where "$T$" stands for the transpose operation. Because the number of the $z_{il}$ in formula (3) is uncertain, we consider a sequence of approximately restricted models (3) and (4) indexed by $K_m \in \mathcal{U}$, where the $m$th model includes the first $K_m$ elements of $z_i$, that is, $z_{i1}, \ldots, z_{iK_m}$, and the parameter $\theta_s$ is restricted by a linear combination of the first $K_m$ elements of $\Psi$ and the constant $b_s$. Hence, the $m$th approximately restricted models (3) and (4) are

  $y_i = \sum_{j=1}^{p} x_{ij}\theta_j + \sum_{l=1}^{K_m} z_{il}\psi_l + d_i + e_i$, subject to  (5)

  $r_s\theta_s = \sum_{k=1}^{K_m} a_{sk}\psi_k + d'_s + b_s \quad (s = 1, \ldots, p)$,  (6)

where $d_i = \sum_{l=K_m+1}^{\infty} z_{il}\psi_l$ and $d'_s = \sum_{k=K_m+1}^{\infty} a_{sk}\psi_k$ are the approximation errors.

Set $Y = (y_1, \ldots, y_n)^T$, $a^m_s = (a_{s1}, \ldots, a_{sK_m})^T$, $A^m_\aleph = (a^{mT}_1, \ldots, a^{mT}_p)^T$, $D = (d_1, \ldots, d_n)^T$, $D' = (d'_1, \ldots, d'_p)^T$, $\Psi_m = (\psi_1, \ldots, \psi_{K_m})^T$, $B = (b_1, \ldots, b_p)^T$, and $e = (e_1, \ldots, e_n)^T$. In matrix notation, the $m$th approximately restricted models (5) and (6) can be rewritten as

  $Y = X\Theta + Z_m\Psi_m + D + e$, subject to  (7)

  $R\Theta = A^m_\aleph\Psi_m + D' + B$,  (8)

where $X$ is an $n \times p$ matrix whose $(i,j)$th element is $x_{ij}$, $Z_m$ is an $n \times K_m$ matrix whose $(i,j)$th element is $z_{ij}$, and $R$ is a $p \times p$ diagonal matrix whose $i$th diagonal element is $r_i$.
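As a small numerical sketch of the data-generating scheme (7)-(8) (not the authors' code; all dimensions and parameter values below are illustrative assumptions), note that once $\Psi$ is fixed the restriction (8) with $D'=0$ pins down $\Theta = R^{-1}(A^m_\aleph\Psi_m + B)$:

```python
import numpy as np

# Minimal sketch: simulate one approximately restricted model
# Y = X*Theta + Z*Psi + e with restriction R*Theta = A*Psi + B
# (approximation errors D and D' are set to zero here).
rng = np.random.default_rng(0)
n, p, K_m = 50, 2, 3

X = rng.normal(size=(n, p))        # returns of the p required stocks
Z = rng.normal(size=(n, K_m))      # returns of the K_m alternative stocks
R = np.diag([1.0, 2.0])            # R = diag(r_1, ..., r_p)
A = rng.normal(size=(p, K_m))      # restriction coefficients a_{sk}
B = np.array([0.5, 1.0])           # constants b_s
Psi = np.array([0.3, -0.2, 0.1])   # proportions of the alternative stocks

# Theta is pinned down by the restriction (8)
Theta = np.linalg.solve(R, A @ Psi + B)
e = rng.normal(scale=0.1, size=n)
Y = X @ Theta + Z @ Psi + e        # model (7)

assert np.allclose(R @ Theta, A @ Psi + B)  # restriction holds exactly
```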

Hypothesize that $\{e_i\}_{i=1}^{n}$ satisfies $E(e_i \mid x_i, z_i) = 0$ and that its conditional covariance matrix $\Omega_n = E(ee^T \mid X, Z_m)$ is

  $\Omega_n = \begin{pmatrix} \sigma^2_1 & \varrho_1 & \cdots & \varrho_{n-1} \\ \varrho_1 & \sigma^2_2 & \cdots & \varrho_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \varrho_{n-1} & \varrho_{n-2} & \cdots & \sigma^2_n \end{pmatrix}$,  (9)

in which $E(e_i e_i \mid x_i, z_i) = \sigma^2_i$ and $E(e_i e_{i+j} \mid x_i, z_i) = \varrho_j$ with $i = 1, \ldots, n$ and $j = 1, \ldots, n-1$. Clearly, the random errors follow a heteroscedastic stationary Gaussian process.
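The covariance structure in (9), heteroscedastic variances on the diagonal and stationary lag-dependent covariances off it, can be sketched as follows (the numerical values are illustrative assumptions only):

```python
import numpy as np

# Build the covariance matrix (9): sigma_i^2 on the diagonal and
# rho_j off the diagonal, depending only on the lag |i - j|.
n = 5
sigma2 = np.array([1.0, 1.2, 0.9, 1.1, 1.0])  # sigma_i^2 (heteroscedastic)
rho = 0.3 ** np.arange(1, n)                  # rho_1, ..., rho_{n-1}

Omega = np.empty((n, n))
for i in range(n):
    for j in range(n):
        Omega[i, j] = sigma2[i] if i == j else rho[abs(i - j) - 1]

assert np.allclose(Omega, Omega.T)            # symmetric by construction
assert np.all(np.linalg.eigvalsh(Omega) > 0)  # positive definite here
```

With these decaying $\varrho_j$, the matrix is strictly diagonally dominant and hence positive definite.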

Substituting (8) into (7) yields

  $Y_u = Z_{mu}\Psi_m + U$,  (10)

where $Y_u = Y - XR^{-1}B$, $U = D + XR^{-1}D' + e$, and $Z_{mu} = Z_m + XR^{-1}A^m_\aleph$. By the method of least squares, the estimator of $\Psi_m$ is

  $\hat{\Psi}_m = (Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}Y_u$,  (11)

where $(Z^T_{mu}Z_{mu})^{-1}$ denotes the inverse of $Z^T_{mu}Z_{mu}$. In the $m$th approximating model (7), we set $\mu_m = X\Theta + Z_m\Psi_m$, so that $\mu = E(Y \mid X, Z_m) = \mu_m + D$. Thus, the estimator of $\mu_m$ is

  $\hat{\mu}_m = X\hat{\Theta} + Z_m\hat{\Psi}_m = \eta + P_m Y$,  (12)

where $\eta = (I - P_m)XR^{-1}B$ and $P_m = Z_{mu}(Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}$ is the "hat" matrix.
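The restricted fit (10)-(12) can be sketched numerically; the data below are simulated placeholders, and the point of the final assertion is that the two forms of (12) agree:

```python
import numpy as np

# Restricted least squares: substitute the restriction into (7),
# regress Y_u = Y - X R^{-1} B on the design Z_mu = Z + X R^{-1} A.
rng = np.random.default_rng(1)
n, p, K_m = 40, 2, 3
X = rng.normal(size=(n, p))
Z = rng.normal(size=(n, K_m))
R = np.diag([1.0, 2.0])
A = rng.normal(size=(p, K_m))
B = np.array([0.5, 1.0])
Y = rng.normal(size=n)            # any response works for this identity

XRinv = X @ np.linalg.inv(R)
Z_mu = Z + XRinv @ A
Y_u = Y - XRinv @ B

Psi_hat = np.linalg.solve(Z_mu.T @ Z_mu, Z_mu.T @ Y_u)   # (11)
P_m = Z_mu @ np.linalg.solve(Z_mu.T @ Z_mu, Z_mu.T)      # hat matrix
Theta_hat = np.linalg.solve(R, A @ Psi_hat + B)          # from (8)

mu_hat = X @ Theta_hat + Z @ Psi_hat                     # (12), left form
eta = (np.eye(n) - P_m) @ XRinv @ B
assert np.allclose(mu_hat, eta + P_m @ Y)                # (12), right form
```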

Let $w = (w_1, \ldots, w_M)^T$ be a weight vector, where $M < n$. Define the weight set $\mathcal{W}$ as

  $\mathcal{W} = \{w \mid w_m \in [0,1],\ m = 1, \ldots, M,\ \sum_{m=1}^{M} w_m = 1\}$.  (13)

For all $m \le M$, the weighted average estimator of $\Psi_m$ is

  $\hat{\Psi}_w = \sum_{m=1}^{M} w_m\hat{\Psi}_m = \sum_{m=1}^{M} w_m (Z^T_{mu}Z_{mu})^{-1}Z^T_{mu}Y_u$.  (14)

Naturally, the weighted average estimator of $\Theta$ is

  $\hat{\Theta}_w = R^{-1}B + R^{-1}A^m_\aleph\hat{\Psi}_w$.  (15)

Furthermore, the weighted average estimator of $\mu$ is

  $\hat{\mu}(w) = \sum_{m=1}^{M} w_m\hat{\mu}_m = \eta(w) + P(w)Y$,  (16)


where $P(w) = \sum_{m=1}^{M} w_m P_m$ and $\eta(w) = (I - P(w))XR^{-1}B$.

It can be seen that the weighted estimator $\hat{\mu}(w)$ is an average of the estimators $\hat{\mu}_m$, $m = 1, \ldots, M$. The weighted "hat" matrix $P(w)$ depends on the nonrandom regressors $Z_{mu}$ and the weight vector $w$. In general, the matrix $P(w)$ is symmetric but not idempotent.
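The averaging step (16) can be sketched with nested candidate models built from the first $K_m$ columns of a design matrix (an illustrative setup, with $B = 0$ so that $\eta(w) = 0$); the last assertion checks the claim that $P(w)$ is symmetric but generally not idempotent:

```python
import numpy as np

# Model-average fit: mu_hat(w) = P(w) Y with P(w) a convex
# combination of the nested hat matrices P_m.
rng = np.random.default_rng(2)
n, K_list = 30, [1, 2, 3]
Z = rng.normal(size=(n, max(K_list)))
Y = rng.normal(size=n)

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

P_list = [hat(Z[:, :K]) for K in K_list]
w = np.array([0.2, 0.3, 0.5])                    # a point of the simplex (13)
P_w = sum(wm * Pm for wm, Pm in zip(w, P_list))  # P(w)

assert np.isclose(w.sum(), 1.0)
assert np.allclose(P_w, P_w.T)                   # symmetric
assert not np.allclose(P_w @ P_w, P_w)           # not idempotent in general
mu_w = P_w @ Y                                   # average fit (16)
assert np.allclose(mu_w, sum(wm * Pm @ Y for wm, Pm in zip(w, P_list)))
```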

For a positive integer $\mathcal{G}$, let $\xi$ be the maximal value of $\sigma^2_\iota$, $\iota = 1, \ldots, n$, and let $\lambda_j$, $j = 1, \ldots, \mathcal{G}$, be the nonzero eigenvalues of $\Omega_n$. Assume that both the $\lambda_j$ and the $\varrho_\iota$ are summable; namely,

  $\Gamma = \sum_{j=1}^{\mathcal{G}} \lambda_j < \infty, \qquad W = \xi + 2\sum_{\iota=1}^{n-1} |\varrho_\iota| < \infty$.  (17)

Since the covariance matrix $\Omega_n$ determines the algebraic structure of the model average estimator, we discuss its properties in the following.

Lemma 1. For any $n$-dimensional vectors $a = (a_1, \ldots, a_n)^T$ and $b = (b_1, \ldots, b_n)^T$, one has $|a^T\Omega_n b| \le \|a\|\|b\|W$, where $\|\cdot\|$ is the Euclidean norm.

Proof. By the definition of $\Omega_n$, one has

  $|a^T\Omega_n b| = \left|\sum_{l=1}^{n} a_l b_l \sigma^2_l + \sum_{i=1}^{n-1}\varrho_i\sum_{l=1}^{n-i}(a_{l+i}b_l + a_l b_{l+i})\right| \le \xi\left|\sum_{l=1}^{n} a_l b_l\right| + \sum_{i=1}^{n-1}|\varrho_i|\left|\sum_{l=1}^{n-i}(a_{l+i}b_l + a_l b_{l+i})\right|$.  (18)

Applying the Cauchy-Schwarz inequality, for $i \in \{1, \ldots, n-1\}$ one gets

  $\sum_{l=1}^{n-i} a_{l+i}b_l \le \sqrt{\sum_{l=1}^{n-i} a^2_{l+i}\sum_{l=1}^{n-i} b^2_l} \le \sqrt{\sum_{l=1}^{n} a^2_l \sum_{l=1}^{n} b^2_l} = \|a\|\|b\|$.  (19)

Similarly, $\sum_{l=1}^{n-i} a_l b_{l+i} \le \|a\|\|b\|$ holds. Therefore,

  $|a^T\Omega_n b| \le \|a\|\|b\|\left(\xi + 2\sum_{i=1}^{n-1}|\varrho_i|\right) = \|a\|\|b\|W$.  (20)
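The Lemma 1 bound is easy to check numerically for a covariance matrix of the form (9); the covariance values below are illustrative:

```python
import numpy as np

# Check the bound |a' Omega_n b| <= ||a|| ||b|| W of Lemma 1,
# where W = xi + 2 * sum_i |rho_i| as in (17).
rng = np.random.default_rng(3)
n = 6
sigma2 = rng.uniform(0.5, 1.5, size=n)
rho = 0.4 ** np.arange(1, n)

Omega = np.diag(sigma2)
for i in range(n):
    for j in range(n):
        if i != j:
            Omega[i, j] = rho[abs(i - j) - 1]

W = sigma2.max() + 2 * np.abs(rho).sum()
for _ in range(100):
    a, b = rng.normal(size=n), rng.normal(size=n)
    assert abs(a @ Omega @ b) <= np.linalg.norm(a) * np.linalg.norm(b) * W
```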

Because the matrix $P(w)$ plays an important role in the analysis of the model selection problem, we state some of its properties. Set $\tau = \max\{K_1, \ldots, K_M\}$, and let $\lambda(P(w))$ and $\lambda_{\max}(P(w))$ denote an eigenvalue and the largest eigenvalue of $P(w)$, respectively.

Lemma 2. One has (i) $\operatorname{tr}(P(w)) \le \tau$ and (ii) $0 \le \lambda(P(w)) \le 1$, where $\operatorname{tr}(\cdot)$ denotes the trace operation.

Proof. It follows from $\operatorname{tr}(P_m) = K_m$ and $\operatorname{tr}(P(w)) = \sum_{m=1}^{M} w_m K_m \le \tau$ that (i) is established. Next we consider (ii). Without loss of generality, let $\lambda_{1m} \ge \cdots \ge \lambda_{nm}$ and $\varphi_{1m}, \ldots, \varphi_{nm}$ be the eigenvalues and orthonormal eigenvectors of $P_m$, respectively. Observe that $P_m$ is idempotent, which implies that $\lambda_{im} \in \{0, 1\}$, $i = 1, \ldots, n$. Let $\varpi \in \mathbb{R}^n$ be an eigenvector of $P(w)$ associated with the eigenvalue $\lambda(P(w))$. It can be seen that

  $\lambda(P(w)) = \frac{\varpi^T P(w)\varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m \frac{\varpi^T P_m \varpi}{\varpi^T\varpi} = \sum_{m=1}^{M} w_m \frac{\chi^T \Xi_m \chi}{\chi^T\chi} = \sum_{m=1}^{M} w_m \sum_{i=1}^{n} \frac{\lambda_{im}|\chi_i|^2}{\chi^T\chi}$,  (21)

where $\Xi_m = \operatorname{diag}(\lambda_{1m}, \ldots, \lambda_{nm})$, $\varpi = \Phi_m\chi$, and $\Phi_m = (\varphi_{1m}, \ldots, \varphi_{nm})$. Due to $\lambda_{im} \in \{0, 1\}$, $i = 1, \ldots, n$, it follows that

  $\lambda(P(w)) \le \sum_{m=1}^{M} w_m \sum_{i=1}^{n} \frac{|\chi_i|^2}{\chi^T\chi} = \sum_{m=1}^{M} w_m = 1, \qquad \lambda(P(w)) \ge \sum_{m=1}^{M} w_m \sum_{i=1}^{n} \frac{0 \cdot |\chi_i|^2}{\chi^T\chi} = 0$.  (22)

From Lemma 2(ii), we know that $P(w)$ is nonnegative definite.
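Both parts of Lemma 2 can be illustrated numerically (nested candidate designs, illustrative dimensions):

```python
import numpy as np

# Illustrate Lemma 2: tr(P(w)) <= tau = max K_m, and every
# eigenvalue of P(w) lies in [0, 1].
rng = np.random.default_rng(4)
n, K_list = 20, [2, 4, 6]
Z = rng.normal(size=(n, max(K_list)))

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

w = np.array([0.5, 0.25, 0.25])
P_w = sum(wm * hat(Z[:, :K]) for wm, K in zip(w, K_list))

assert np.trace(P_w) <= max(K_list) + 1e-9               # Lemma 2(i)
eig = np.linalg.eigvalsh(P_w)
assert np.all(eig >= -1e-9) and np.all(eig <= 1 + 1e-9)  # Lemma 2(ii)
```

Here $\operatorname{tr}(P(w)) = 0.5\cdot 2 + 0.25\cdot 4 + 0.25\cdot 6 = 3.5 \le \tau = 6$.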

Lemma 3. Let $\alpha$ denote the number of eigenvalues of $P(w)$. Then $(\alpha^{-1}\operatorname{tr}(P(w)))^2 \le \alpha^{-1}\operatorname{tr}(P^2(w))$.

Proof. Let $\lambda_1, \ldots, \lambda_\alpha$ be the eigenvalues of $P(w)$. (i) If $\lambda_1 = \cdots = \lambda_\alpha = 0$, Lemma 3 holds trivially. (ii) Otherwise, suppose $\lambda_1 \ne 0, \ldots, \lambda_{\alpha_0} \ne 0$ and $\lambda_{\alpha_0+1} = \cdots = \lambda_\alpha = 0$ for some $1 \le \alpha_0 \le \alpha$. It is easy to see that $\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 \ge 0$, where $\bar{\lambda} = \alpha^{-1}\operatorname{tr}(P(w))$. Thus, it can be shown that

  $\sum_{i=1}^{\alpha}(\lambda_i - \bar{\lambda})^2 = \sum_{i=1}^{\alpha} \lambda^2_i - 2\bar{\lambda}\sum_{i=1}^{\alpha} \lambda_i + \alpha\bar{\lambda}^2 = \operatorname{tr}(P^2(w)) - \alpha(\alpha^{-1}\operatorname{tr}(P(w)))^2 \ge 0$,  (23)

which implies that $(\alpha^{-1}\operatorname{tr}(P(w)))^2 \le \alpha^{-1}\operatorname{tr}(P^2(w))$.

Lemma 4. For any $w \in \mathcal{W}$, there exists $C_1 = C'^2$ such that $\operatorname{tr}(P(w)\Omega_n) \le C'\Gamma$, $\operatorname{tr}[(\Omega_n P(w))^2] \le C_1\Gamma^2$, and $\operatorname{tr}[(\Omega_n P^T(w)P(w))^2] \le C_1\Gamma\operatorname{tr}(P(w)\Omega_n P^T(w))$, in which $C' = \sup_{w \in \mathcal{W}}\lambda_{\max}(P(w))$.

Proof. Notice that $\lambda_{\min}(A)\lambda_{\max}(B) \le \lambda_{\max}(AB) \le \lambda_{\max}(A)\lambda_{\max}(B)$ and $\operatorname{tr}(AB) \le \lambda_{\max}(A)\operatorname{tr}(B) \le \operatorname{tr}(A)\operatorname{tr}(B)$, where $A$ and $B$ are any positive semidefinite matrices.

Since $P(w)$ and $\Omega_n$ are symmetric and nonnegative definite, the first inequality is established by $\operatorname{tr}(P(w)\Omega_n) \le \lambda_{\max}(P(w))\operatorname{tr}(\Omega_n) \le C'\Gamma$. Let $\lambda_i$ be an eigenvalue of $P(w)$. By the symmetry of $P(w)$, there exists an $n \times n$ orthogonal matrix $\Psi$ such that $P(w) = \Psi\Upsilon\Psi^T$, where $\Upsilon = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Then it yields $\operatorname{tr}(\Omega_n P(w)) = \operatorname{tr}(\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2})$, where $\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2}$ is symmetric and nonnegative definite. The second inequality holds because

  $\operatorname{tr}[(\Omega_n P(w))^2] \le [\operatorname{tr}(\Omega_n P(w))]^2 \le [\lambda_{\max}(P(w))\operatorname{tr}(\Omega_n)]^2 \le C_1\Gamma^2$.  (24)

For the last formula, we note that

  $\operatorname{tr}[(\Omega_n P^T(w)P(w))^2] = \operatorname{tr}[(P(w)\Omega_n P^T(w))^2] \le \lambda_{\max}(P(w)\Omega_n P^T(w))\operatorname{tr}(P(w)\Omega_n P^T(w)) \le \lambda_{\max}(P^2(w))\lambda_{\max}(\Omega_n)\operatorname{tr}(P(w)\Omega_n P^T(w)) \le \lambda^2_{\max}(P(w))\lambda_{\max}(\Omega_n)\operatorname{tr}(P(w)\Omega_n P^T(w)) \le C_1\Gamma\operatorname{tr}(P(w)\Omega_n P^T(w))$.  (25)

This completes the proof of Lemma 4.

Lemma 5. Let $A$ be a symmetric matrix and let $X$ be a normally distributed random vector with zero expectation. Then one has $\operatorname{Var}(X^T A X) = 2\operatorname{tr}(\operatorname{Var}(X)\,A\operatorname{Var}(X)\,A)$, where $\operatorname{Var}(\cdot)$ is the variance operation.

Proof. The proof of this lemma is provided in Lai and Xie [26].
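The variance identity of Lemma 5 can be confirmed by a quick Monte Carlo experiment for a Gaussian vector (the dimension, covariance, and sample size below are illustrative assumptions):

```python
import numpy as np

# Monte Carlo check of Lemma 5: Var(X' A X) = 2 tr(Sigma A Sigma A)
# for X ~ N(0, Sigma) and symmetric A.
rng = np.random.default_rng(5)
d, N = 3, 400_000
C = rng.normal(size=(d, d))
Sigma = C @ C.T + d * np.eye(d)        # a positive definite covariance
A = rng.normal(size=(d, d))
A = (A + A.T) / 2                      # symmetrize

X = rng.multivariate_normal(np.zeros(d), Sigma, size=N)
q = np.einsum('ni,ij,nj->n', X, A, X)  # quadratic forms X' A X

theory = 2 * np.trace(Sigma @ A @ Sigma @ A)
assert abs(q.var() - theory) / theory < 0.05
```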

3. Average Squared Error

Denote the average squared error by

  $L_n(w) = \frac{1}{n}\|\mu - \hat{\mu}(w)\|^2_k = \frac{(\mu - \hat{\mu}(w))^T k(\mu - \hat{\mu}(w))}{n}$,  (26)

where $k$ is a fixed positive constant that is often used to eliminate the boundary effect. Andrews [28] suggested that $k$ can take the value $\sigma^{-2}$ when the errors obey an independent and identical distribution with variance $\sigma^2$; the most common choice is $k = 1$. The average squared error $L_n(w)$ can be viewed as a measure of the accuracy of $\hat{\mu}(w)$ for $\mu$. Obviously, an optimal estimator attains the minimum value of $L_n(w)$. In other words, we should select a weight vector $w$ from $\mathcal{W}$ that makes the average squared error $L_n(w)$ as small as possible.

In order to investigate the problem of weight selection, we give the expression of the conditional expected average squared error:

  $R_n(w) = E(L_n(w) \mid X, Z_m)$.  (27)

From the definition of $R_n(w)$, the following lemma can be obtained.

Lemma 6. The conditional expected average squared error can be rewritten as

  $R_n(w) = \frac{\|M(w)\mu - \eta(w)\|^2_k}{n} + \frac{k\operatorname{tr}(P(w)\Omega_n P^T(w))}{n}$,  (28)

where $M(w) = I - P(w)$.

Proof. A direct calculation of $L_n(w)$ leads to

  $nL_n(w) = (\mu - \hat{\mu}(w))^T k(\mu - \hat{\mu}(w)) = (\mu - \eta(w) - P(w)(\mu + e))^T k(\mu - \eta(w) - P(w)(\mu + e)) = [\mu^T M^T(w) - \eta^T(w) - e^T P^T(w)]\, k\, [M(w)\mu - \eta(w) - P(w)e] = \|M(w)\mu - \eta(w)\|^2_k - k(\mu - XR^{-1}B)^T M^T(w)P(w)e + ke^T P^T(w)P(w)e - ke^T P^T(w)M(w)(\mu - XR^{-1}B)$.  (29)

Using $E(e^T P^T(w)P(w)e \mid X, Z_m) = \operatorname{tr}(P(w)\Omega_n P^T(w))$ and taking conditional expectations on both sides of the above equality gives rise to Lemma 6.
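Lemma 6 can be illustrated by averaging $L_n(w)$ over many draws of the error vector and comparing with the closed form (28); here $B = 0$ (so $\eta(w) = 0$), $k = 1$, and a diagonal $\Omega_n$, all illustrative assumptions:

```python
import numpy as np

# Monte Carlo illustration of Lemma 6:
# mean of L_n(w) over error draws ~= R_n(w) from (28).
rng = np.random.default_rng(6)
n, K_list, N = 15, [1, 3], 200_000
Z = rng.normal(size=(n, max(K_list)))
mu = rng.normal(size=n)                 # true mean vector

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

w = np.array([0.4, 0.6])
P_w = sum(wm * hat(Z[:, :K]) for wm, K in zip(w, K_list))
M_w = np.eye(n) - P_w

Omega = np.diag(rng.uniform(0.5, 1.5, n))     # heteroscedastic, uncorrelated
E = rng.normal(size=(N, n)) * np.sqrt(np.diag(Omega))
fits = P_w @ (mu[:, None] + E.T)              # mu_hat(w) for each draw
L = ((mu[:, None] - fits) ** 2).sum(axis=0) / n

R_n = ((M_w @ mu) @ (M_w @ mu) + np.trace(P_w @ Omega @ P_w.T)) / n
assert abs(L.mean() - R_n) / R_n < 0.02
```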

4. The k-Class Generalized Information Criterion and Asymptotic Optimality

As the value of $\mu$ is unknown, the average squared error $L_n(w)$ cannot be used directly to select the weight vector $w$. Thus, we suggest a $k$-class generalized information criterion ($k$-GIC) to choose the weight vector. Further, we will prove that the weight vector selected by the $k$-GIC minimizes the average squared error $L_n(w)$ asymptotically.

The $k$-GIC for the restricted model average estimator is

  $C_n(w) = \frac{\|Y - \hat{\mu}(w)\|^2_k}{n} + \frac{2\hbar}{n}\operatorname{tr}(P(w)\Omega_n)$,  (30)

where $\hbar$ is larger than one and satisfies assumption (35) mentioned below. The $k$-class generalized information criterion extends the generalized information criterion ($\mathrm{GIC}_{\lambda_n}$) proposed by Shao [27]. Because $\hbar$ can take different values, the $k$-GIC includes some common information criteria for model selection, such as the Mallows $C_L$ criterion ($k = \hbar = 1$), the GIC criterion ($k = 1$, $\hbar \to \infty$), the $\mathrm{FPE}_\alpha$ criterion ($k = 1$, $\hbar > 1$), and the BIC criterion ($k = 1$, $\hbar = \log(n)$).
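A minimal sketch of weight selection by the criterion (30), in the Mallows-type case $k = \hbar = 1$ with $\Omega_n$ treated as known and a crude grid search over the simplex (all data and the candidate set are illustrative assumptions):

```python
import numpy as np
from itertools import product

# k-GIC weight selection with k = hbar = 1:
# C_n(w) = ||Y - mu_hat(w)||^2 / n + (2/n) tr(P(w) Omega).
rng = np.random.default_rng(7)
n, K_list = 60, [1, 2, 4]
Z = rng.normal(size=(n, max(K_list)))
psi = np.array([1.0, 0.5, 0.25, 0.125])
Omega = np.eye(n)                  # homoscedastic unit-variance errors
Y = Z @ psi + rng.normal(size=n)

def hat(Zm):
    return Zm @ np.linalg.solve(Zm.T @ Zm, Zm.T)

P_list = [hat(Z[:, :K]) for K in K_list]

def C_n(w):
    P_w = sum(wm * Pm for wm, Pm in zip(w, P_list))
    resid = Y - P_w @ Y
    return resid @ resid / n + 2.0 / n * np.trace(P_w @ Omega)

# crude search over a grid on the weight simplex (13)
grid = [w for w in product(np.linspace(0, 1, 11), repeat=3)
        if abs(sum(w) - 1) < 1e-9]
w_hat = min(grid, key=C_n)
assert abs(sum(w_hat) - 1) < 1e-9
```

In practice the minimization over $\mathcal{W}$ would be carried out with a quadratic-programming solver rather than a grid; the grid keeps the sketch self-contained.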


Lemma 7. If $k = \hbar$, we have $E(C_n(w) \mid X, Z_m) = R_n(w) + k\Gamma/n$.

Proof. Recalling the definition of $C_n(w)$, one gets

  $nC_n(w) = (Y - \hat{\mu}(w))^T k(Y - \hat{\mu}(w)) + 2k\operatorname{tr}(P(w)\Omega_n) = [M(w)Y - \eta(w)]^T k[M(w)Y - \eta(w)] + 2k\operatorname{tr}(P(w)\Omega_n) = 2k\operatorname{tr}(P(w)\Omega_n) + ke^T M^T(w)M(w)e + k(M(w)\mu - \eta(w))^T M(w)e + ke^T M^T(w)(M(w)\mu - \eta(w)) + \|M(w)\mu - \eta(w)\|^2_k$.  (31)

Observe that

  $ke^T M^T(w)M(w)e = e^T(I - P(w))^T k(I - P(w))e = ke^Te - ke^T P^T(w)e - ke^T P(w)e + ke^T P^T(w)P(w)e$.  (32)

Notice that $E(ke^Te \mid X, Z_m) = k\operatorname{tr}(\Omega_n)$ and $E(ke^T P^T(w)e \mid X, Z_m) = k\operatorname{tr}(P^T(w)\Omega_n)$. Moreover, we have $E(ke^T P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n)$ and $E(ke^T P^T(w)P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n P^T(w))$.

Thus, taking conditional expectations on both sides of (31), we obtain

  $nE(C_n(w) \mid X, Z_m) = k\operatorname{tr}(\Omega_n) + \|M(w)\mu - \eta(w)\|^2_k + k\operatorname{tr}(P(w)\Omega_n P^T(w))$.  (33)

It follows from (33), Lemma 6, and $\operatorname{tr}(\Omega_n) = \Gamma$ that Lemma 7 is established as desired.

Lemma 7 shows that $C_n(w)$ equals the conditional expected average squared error plus a bias term that does not depend on $w$. In particular, as $n$ approaches infinity, $C_n(w)$ becomes an unbiased estimator of $R_n(w)$.

The $k$-GIC criterion is designed to select the optimal weight vector $\hat{w}$; the optimal weight vector $\hat{w}$ is chosen by minimizing $C_n(w)$ over $\mathcal{W}$.

Obviously, the resulting estimators of the parameters are $\hat{\Theta}_{\hat{w}}$ and $\hat{\Psi}_{\hat{w}}$ with the weight vector $\hat{w}$. Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:

  $\frac{L_n(\hat{w})}{\inf_{w \in \mathcal{W}} L_n(w)} \xrightarrow{p} 1$.  (34)

If formula (34) holds, we know that the weight vector $\hat{w}$ selected by $C_n(w)$ asymptotically realizes the minimum value of $L_n(w)$. In other words, $\hat{w}$ is equivalent to the weight vector selected by minimizing $L_n(w)$ directly and is the optimal weight vector for $\hat{\mu}(w)$. The asymptotic optimality of $C_n(w)$ can be established under the following assumptions.

Assumptions 1. Write $\alpha_n = \inf_{w \in \mathcal{W}} nR_n(w)$. As $n \to \infty$, we assume that

  $\frac{\hbar^2}{\alpha_n} \to 0$,  (35)

  $\sum_{w \in \mathcal{W}} \frac{1}{(nR_n(w))^{N'}} \to 0$,  (36)

for a large $\hbar$ and any positive integer $1 \le N' < \infty$.

The above assumptions have been employed widely in the model selection literature. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of $C_n(w)$.

Lemma 8. We have

  $\|M(w)Y - \eta(w)\|^2_k = nL_n(w) - 2\langle e, P(w)e\rangle_k + 2\langle e, M(w)\mu - \eta(w)\rangle_k + \|e\|^2_k$,  (37)

where, for vectors $a$ and $b$, $\langle a, b\rangle_k$ denotes $ka^Tb$.

Proof. Using $\hat{\mu}(w) = P(w)Y + \eta(w)$, $Y = \mu + e$, and the definition of $L_n(w)$, one obtains

  $\|M(w)Y - \eta(w)\|^2_k = \|\mu + e - P(w)Y - \eta(w)\|^2_k = \|e\|^2_k + \|\mu - P(w)Y - \eta(w)\|^2_k + 2\langle e, \mu - \eta(w) - P(w)Y\rangle_k = \|e\|^2_k + nL_n(w) + 2\langle e, \mu - \eta(w) - P(w)(\mu + e)\rangle_k = \|e\|^2_k + nL_n(w) + 2\langle e, (I - P(w))\mu - \eta(w) - P(w)e\rangle_k = \|e\|^2_k + nL_n(w) + 2\langle e, M(w)\mu - \eta(w)\rangle_k - 2\langle e, P(w)e\rangle_k$.  (38)

This completes the proof of Lemma 8.

Lemma 8 and $\|Y - \hat{\mu}(w)\|^2_k = \|M(w)Y - \eta(w)\|^2_k$ imply that

  $C_n(w) = L_n(w) + \frac{2\langle e, M(w)\mu - \eta(w)\rangle_k}{n} + \frac{\|e\|^2_k}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_n)}{n} - \frac{2\langle e, P(w)e\rangle_k}{n}$.  (39)


The goal is to choose $\hat{w}$ by minimizing $C_n(w)$. Since $\|e\|^2_k/n$ does not depend on $w$, it follows from (39) that one only needs to select $\hat{w}$ by minimizing

  $L_n(w) + \frac{2\langle e, M(w)\mu - \eta(w)\rangle_k}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_n) - 2\langle e, P(w)e\rangle_k}{n}$,  (40)

where $w \in \mathcal{W}$.

Compared with $L_n(w)$, it is thus sufficient to establish that $n^{-1}\langle e, M(w)\mu - \eta(w)\rangle_k$ and $n^{-1}\hbar\operatorname{tr}(P(w)\Omega_n) - n^{-1}\langle e, P(w)e\rangle_k$ are uniformly negligible over $w \in \mathcal{W}$. More specifically, to prove (34), we need to check that

  $\sup_{w \in \mathcal{W}} \frac{|\langle e, M(w)\mu - \eta(w)\rangle_k|}{nR_n(w)} \xrightarrow{p} 0$,  (41)

  $\sup_{w \in \mathcal{W}} \frac{|\hbar\operatorname{tr}(P(w)\Omega_n) - \langle e, P(w)e\rangle_k|}{nR_n(w)} \xrightarrow{p} 0$,  (42)

  $\sup_{w \in \mathcal{W}} \left|\frac{L_n(w) - R_n(w)}{R_n(w)}\right| \xrightarrow{p} 0$,  (43)

where "$\xrightarrow{p}$" denotes convergence in probability.

Following the idea of Li [29], we now establish the main result of our work, namely, that minimizing the $k$-class generalized information criterion is asymptotically optimal. We state the main theorem.

Theorem 9. Under assumptions (35) and (36), minimizing the $k$-class generalized information criterion $C_n(w)$ is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the $k$-GIC requires showing that (41), (42), and (43) are valid.

First, we prove that (41) holds. For any $\zeta>0$, by Chebyshev's inequality, one has

$$\Pr\left\{\sup_{w\in\mathcal{W}}\frac{|\langle e^{T},M(w)\mu-\eta(w)\rangle_k|}{nR_n(w)}>\zeta\right\}\le\sum_{w\in\mathcal{W}}\frac{E[\langle e^{T},M(w)\mu-\eta(w)\rangle_k]^2}{(\zeta nR_n(w))^2},\quad(44)$$
which, by Lemma 1, is no greater than
$$\zeta^{-2}kW\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu-\eta(w)\|_k^2(R_n(w))^{-2}.\quad(45)$$
Recalling the definition of $R_n(w)$, we get $\|M(w)\mu-\eta(w)\|_k^2\le nR_n(w)$. Then (45) does not exceed $\zeta^{-2}kW\sum_{w\in\mathcal{W}}(nR_n(w))^{-1}$. By assumption (36), (45) tends to zero in probability. Thus (41) is established.

To prove (42), it suffices to show that
$$\frac{k\operatorname{tr}(P(w)\Omega_n)-\langle e^{T},P(w)e\rangle_k}{nR_n(w)}\stackrel{p}{\longrightarrow}0,\quad(46)$$
$$\frac{(\hbar-k)\operatorname{tr}(P(w)\Omega_n)}{nR_n(w)}\stackrel{p}{\longrightarrow}0.\quad(47)$$

By Chebyshev's inequality and Lemmas 4 and 5, for any $\epsilon>0$ we have
$$\Pr\left\{\sup_{w\in\mathcal{W}}\frac{|\operatorname{tr}(P(w)\Omega_n)-\langle e^{T},P(w)e\rangle|}{nR_n(w)}>\epsilon\right\}\le\sum_{w\in\mathcal{W}}\frac{E[\operatorname{tr}(P(w)\Omega_n)-\langle e^{T},P(w)e\rangle]^2}{(\epsilon nR_n(w))^2}=\frac{1}{\epsilon^2}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}(e^{T}P(w)e)}{(nR_n(w))^2}\le\frac{2\mathrm{C}_1\Gamma^2}{\epsilon^2}\sum_{w\in\mathcal{W}}\frac{1}{(nR_n(w))^2}.\quad(48)$$
From (36), we know that (48) tends to zero. Thus (46) holds.

Recalling Lemma 4 and (35), it follows that
$$\frac{|(\hbar-k)\operatorname{tr}(P(w)\Omega_n)|}{nR_n(w)}\le\frac{|\mathrm{C}'(\hbar-k)\Gamma|}{nR_n(w)}\longrightarrow0,\quad(49)$$

which yields (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to
$$\begin{aligned}\|\mu-\hat\mu(w)\|_k^2-\|M(w)\mu-\eta(w)\|_k^2&=\|\mu-\eta(w)-P(w)Y\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\&=\|\mu-\eta(w)-P(w)\mu-P(w)e\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\&=\|M(w)\mu-\eta(w)-P(w)e\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\&=\|P(w)e\|_k^2-2\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k.\end{aligned}\quad(50)$$
From the identity (50), we know that
$$L_n(w)-R_n(w)=-\frac{2\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{n}+\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{n}.\quad(51)$$


Obviously, the proof of (43) requires verifying that
$$\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{nR_n(w)}\stackrel{p}{\longrightarrow}0,\qquad\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{nR_n(w)}\stackrel{p}{\longrightarrow}0.\quad(52)$$

To prove (52), notice that $\operatorname{tr}(P(w)\Omega_nP^{T}(w))=E(e^{T}P^{T}(w)P(w)e)$ and $\|P(w)e\|_k^2=\langle e^{T}P^{T}(w),P(w)e\rangle_k$. In addition,
$$\|P(w)(M(w)\mu-\eta(w))\|^2\le\lambda_{\max}^2(P(w))\|M(w)\mu-\eta(w)\|^2\le\|M(w)\mu-\eta(w)\|^2.\quad(53)$$

Thus, for any $\zeta'>0$ and $\epsilon'>0$,
$$\begin{aligned}\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{nR_n(w)}>\zeta'\right\}&\le\sum_{w\in\mathcal{W}}\frac{E[\|P(w)e\|_k^2-kE(e^{T}P^{T}(w)P(w)e)]^2}{(nR_n(w)\zeta')^2}\\&=\frac{k^2}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}(e^{T}P^{T}(w)P(w)e)}{(nR_n(w))^2}\\&\le\frac{2k\mathrm{C}_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{(nR_n(w))^2}\\&\le\frac{2k\mathrm{C}_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{1}{nR_n(w)},\end{aligned}\quad(54)$$
$$\begin{aligned}\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{nR_n(w)}>\epsilon'\right\}&\le\sum_{w\in\mathcal{W}}\frac{E[\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k]^2}{(nR_n(w)\epsilon')^2}\\&\le\sum_{w\in\mathcal{W}}\frac{\|P(w)(M(w)\mu-\eta(w))\|_k^2\,kW}{(nR_n(w)\epsilon')^2}\\&\le\epsilon'^{-2}kW\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu-\eta(w)\|_k^2(R_n(w))^{-2}.\end{aligned}\quad(55)$$

Combining (36) and (45), both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) hold. This completes the proof of Theorem 9.

In practice, the covariance of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of $\Omega_n$ because of its special structure. In the special case where the random errors are independent and identically distributed with variance $\sigma^2$, a consistent estimator of $\sigma^2$ can be built from the constrained models (7) and (8). Let $m=Q$ in $\hat\mu_m$ and $\tau'=\operatorname{tr}(P_Q)$, where $Q$ corresponds to a "large" approximating model. Denote $\hat\sigma_Q^2=(n-\tau')^{-1}(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)$. The coming theorem shows that $\hat\sigma_Q^2$ is a consistent estimate of $\sigma^2$.

Theorem 10. If $\tau'/n\to0$ as $\tau'\to\infty$ and $n\to\infty$, then $\hat\sigma_Q^2\stackrel{p}{\longrightarrow}\sigma^2$ as $n\to\infty$.

Proof. Writing $\Lambda=D+XR^{-1}\widetilde{D}$, one obtains
$$\frac{(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)}{n-\tau'}=\frac{e^{T}M_Qe}{n-\tau'}+\frac{\|M_Q\Lambda\|^2}{n-\tau'}+\frac{2e^{T}M_Q\Lambda}{n-\tau'},\quad(56)$$
where $M_Q=I-P_Q$.

Since $E(e^{T}M_Qe)=\sigma^2(n-\tau')$, it follows that
$$E|e^{T}M_Qe-\sigma^2(n-\tau')|^2=E|e^{T}M_Qe-E(e^{T}M_Qe)|^2=\operatorname{Var}(e^{T}M_Qe)=2\sigma^4\operatorname{tr}(M_Q).\quad(57)$$

For any $\varepsilon>0$, it follows from (57) that
$$\Pr\left\{\left|\frac{e^{T}M_Qe}{n-\tau'}-\sigma^2\right|>\varepsilon\right\}\le\frac{E|e^{T}M_Qe-(n-\tau')\sigma^2|^2}{\varepsilon^2(n-\tau')^2}\le\frac{2\sigma^4}{\varepsilon^2(n-\tau')}\longrightarrow0.\quad(58)$$

Equation (58) implies that $(n-\tau')^{-1}(e^{T}M_Qe)\stackrel{p}{\longrightarrow}\sigma^2$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_Q$. Then $0\le1-\mathcal{I}_{Qt}<1$ is the $t$th diagonal element of $M_Q$ and satisfies $\sum_{t=1}^{n}(1-\mathcal{I}_{Qt})=n-\tau'$.

Because $\phi(z_{il},\psi_l)$ and $\phi(a_{sk},\psi_k)$ converge in mean square, we have $E(\Lambda_{Qt}^2)\to0$ as $\tau'\to\infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t=1,\dots,n$. Notice that

$$\begin{aligned}\frac{E(\Lambda^{T}M_Q\Lambda)}{n-\tau'}&=\frac{1}{n-\tau'}\sum_{t=1}^{n}E(\Lambda_{Qt}^2M_{Q,tt})=\frac{1}{n-\tau'}\sum_{t=1}^{n}E(\Lambda_{Qt}^2)[1-\mathcal{I}_{Qt}]\\&\le\max_t E(\Lambda_{Qt}^2)\cdot\frac{1}{n-\tau'}\sum_{t=1}^{n}[1-\mathcal{I}_{Qt}]=\max_t E(\Lambda_{Qt}^2)\longrightarrow0.\end{aligned}\quad(59)$$

The above expression implies that the second term on the right-hand side of (56) approaches zero. By an argument similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.


In the case of independently and identically distributed errors, if we replace $\sigma^2$ by $\hat\sigma_Q^2$ in the $k$-GIC, the $k$-GIC can be simplified to
$$C_n(w)=\frac{\|Y-\hat\mu(w)\|_k^2}{n}+\frac{2\hbar\hat\sigma_Q^2\operatorname{tr}(P(w))}{n}.\quad(60)$$

Here we show that the model selection procedure of minimizing this $C_n(w)$ is also asymptotically optimal.

Theorem 11. Assume that the random errors $e$ are i.i.d. with mean zero and variance $\sigma^2$. Under conditions (35) and (36), minimizing $C_n(w)$ in (60) is still asymptotically optimal.

Proof. Using a technique similar to the derivation of (39), we obtain
$$\begin{aligned}nC_n(w)&=\|Y-\hat\mu(w)\|_k^2+2\hbar\hat\sigma_Q^2\operatorname{tr}(P(w))\\&=nL_n(w)+2\langle e^{T},M(w)\mu-\eta(w)\rangle_k+2\hbar\hat\sigma_Q^2\operatorname{tr}(P(w))-2\langle e^{T},P(w)e\rangle_k+\|e\|_k^2\\&=2\hbar(\hat\sigma_Q^2-\sigma^2)\operatorname{tr}(P(w))+2\langle e^{T},M(w)\mu-\eta(w)\rangle_k\\&\quad+nL_n(w)+2\hbar\sigma^2\operatorname{tr}(P(w))-2\langle e^{T},P(w)e\rangle_k+\|e\|_k^2.\end{aligned}\quad(61)$$

With an appropriate modification of the proofs of (41)–(43), one only needs to verify that
$$\sup_{w\in\mathcal{W}}\frac{\hbar|\sigma^2-\hat\sigma_Q^2|\operatorname{tr}(P(w))}{nR_n(w)}\stackrel{p}{\longrightarrow}0,\quad(62)$$
which is equivalent to showing that
$$\sup_{w\in\mathcal{W}}\frac{\hbar^2((1/n)|\sigma^2-\hat\sigma_Q^2|\operatorname{tr}(P(w)))^2}{R_n^2(w)}\stackrel{p}{\longrightarrow}0.\quad(63)$$

From the proof of Theorem 10, we have
$$\Pr\left\{\left|\frac{\hbar e^{T}M_Qe}{n-\tau'}-\hbar\sigma^2\right|>\varepsilon'R_n^{1/2}(w)\right\}\le\frac{\hbar^2E|e^{T}M_Qe-(n-\tau')\sigma^2|^2}{\varepsilon'^2(n-\tau')^2R_n(w)}\le\frac{2\hbar^2\sigma^4}{\varepsilon'^2(n-\tau')R_n(w)},\quad(64)$$

where $\varepsilon'>0$. By assumption (35), the above bound tends to zero. Thus we obtain $R_n(w)^{-1}|\hbar\sigma^2-\hbar\hat\sigma_Q^2|^2\stackrel{p}{\longrightarrow}0$. It follows from Lemma 3 and the definition of $R_n(w)$ that $(n^{-1}\operatorname{tr}(P(w)))^2$ is no greater than $R_n(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $C_n(w)$ is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is
$$y_i=x_i\theta+\sum_{l=1}^{1000}z_{il}\psi_l+e_i,\quad\text{subject to }\theta=-\sum_{l=1}^{1000}\psi_l+1,\quad i=1,\dots,n,\quad(65)$$

where $x_i$ follows a $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l=2,\dots,1000$, are independently and identically distributed $N(0,1)$. The parameters $\psi_l$, $l=1,\dots,1000$, are determined by the rule $\psi_l=\zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^2=\zeta^2/(1+\zeta^2)$. The error $e_i$ is independent of $x_i$ and $z_{il}$. We consider two cases for the error distribution.

Case 1: $e_i$ is independently and identically distributed $N(0,1)$.

Case 2: $e_1,\dots,e_n$ follow a multivariate normal distribution $N(0,\Omega_n)$, where $\Omega_n$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_n$ is $\sigma_j^2$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\neq l$) off-diagonal element of $\Omega_n$ is $\Omega_{jl,n}=\exp(-0.5|x_{ij}-x_{il}|^2)$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_i$, respectively.
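The data generating process (65) under Case 1 can be sketched as follows (a minimal sketch with illustrative variable names; the rule $\psi_l=\zeta l^{-1}$ follows the description above).

```python
import numpy as np

def simulate_dgp(n, zeta, L=1000, rng=None):
    """Draw one sample from design (65) under Case 1 (i.i.d. N(0,1) errors).
    psi_l = zeta / l, and zeta controls the population R^2 = zeta^2/(1+zeta^2)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_t(df=3, size=n)       # x_i ~ t(3)
    Z = rng.standard_normal((n, L))
    Z[:, 0] = 1.0                          # z_{i1} = 1
    psi = zeta / np.arange(1, L + 1)       # psi_l = zeta * l^{-1}
    theta = 1.0 - psi.sum()                # restriction: theta = -sum(psi) + 1
    e = rng.standard_normal(n)             # Case 1 errors
    y = x * theta + Z @ psi + e
    return y, x, Z
```

Case 2 would replace `e` by a draw from $N(0,\Omega_n)$ with the banded covariance described above.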

The sample size varies over $n=50$, $100$, $150$, and $200$. The number of models is determined by $M=\lfloor3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^2$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi=500$. For the $k$-GIC, $k$ takes the value one and $\hbar$ adopts the effective number of parameters.

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV [17]); (4) the Mallows model averaging estimator (MMA [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

$$\mathrm{AIC}_m=2n\ln\frac{1}{n}\|Y-\hat\mu_m\|^2+2\mathcal{K}_m,\qquad\mathrm{BIC}_m=2n\ln\frac{1}{n}\|Y-\hat\mu_m\|^2+\ln(n)\mathcal{K}_m.\quad(66)$$

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_\ell,x_\ell,z_\ell\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is
$$\mathrm{PE}(\pi)=\frac{1}{100}\sum_{\ell=1}^{100}(y_\ell-\hat\mu_\ell(w))^2,\quad(67)$$


[Figure 1: Results for the Monte Carlo design under homoscedastic errors. Panels (a)–(d) plot the prediction error ($y$-axis) against $R^2\in[0.1,0.9]$ ($x$-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators, with (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$.]

where $w$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by
$$\mathrm{PE}=\frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi),\quad(68)$$
where $\Pi=500$ is the number of replications. Obviously, a smaller PE implies a better model estimator.

We consider PE under homoscedastic errors first. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^2$ is displayed on the $x$-axis.

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^2$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^2$, the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^2$ increases. In many situations, the PEs of the BIC estimator with a large $R^2$ are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic case. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both average estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^2$ space.


[Figure 2: Results for the Monte Carlo design under correlated errors. Panels (a)–(d) plot the prediction error ($y$-axis) against $R^2\in[0.1,0.9]$ ($x$-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators, with (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$.]

When $R^2\le0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with CV a close second over a large region of the $R^2$ space. Besides, their relative efficiency depends closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6. Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are computationally cumbersome and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper develops a statistical program to treat the selection problem of an optimal tracking portfolio. We build theoretical models of tracking portfolios as constrained linear models. The selection of an optimal portfolio then boils down to choosing an optimal constrained linear model.

In the setting of unrestricted or homoscedastic models, a large number of works investigate the problems of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal


weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods. It is found that the $k$-GIC gives the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control can be applied to study the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, Model Selection in Equations with Many Small Effects, Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] W. S. Rodney and K. D. Herman, Bayesian Model Selection with an Uninformative Prior, Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica: Journal of the Econometric Society, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_P$," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica: Journal of the Econometric Society, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997, with comments and a rejoinder by the author.

[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.



where $P(w)=\sum_{m=1}^{M}w_mP_m$ and $\eta(w)=(I-P(w))XR^{-1}B$.

It can be seen that the weighted estimator $\hat\mu(w)$ is an average of the estimators $\hat\mu_m$, $m=1,\dots,M$. The weighted "hat" matrix $P(w)$ depends on the nonrandom regressors $Z_{mu}$ and the weight vector $w$. In general, the matrix $P(w)$ is symmetric but not idempotent.

For a positive integer $\mathrm{G}$, let $\xi$ be the maximal value of $\sigma_\iota^2$, $\iota=1,\dots,n$, and let $\lambda_j$, $j=1,\dots,\mathrm{G}$, be the nonzero eigenvalues of $\Omega_n$. Assume that both $\lambda_j$ and $\varrho_\iota$ are summable, namely,
$$\Gamma=\sum_{j=1}^{\mathrm{G}}\lambda_j<\infty,\qquad W=\xi+2\sum_{\iota=1}^{n-1}|\varrho_\iota|<\infty.\quad(17)$$
Since the covariance matrix $\Omega_n$ determines the algebraic structure of the model average estimator, we discuss its properties in the following.

Lemma 1. For any $n$-dimensional vectors $a=(a_1,\dots,a_n)^{T}$ and $b=(b_1,\dots,b_n)^{T}$, one has $|a^{T}\Omega_nb|\le\|a\|\|b\|W$, where $\|\cdot\|$ is the Euclidean norm.

Proof. Through the definition of $\Omega_n$, one has
$$|a^{T}\Omega_nb|=\left|\sum_{l=1}^{n}a_lb_l\sigma_l^2+\sum_{i=1}^{n-1}\varrho_i\sum_{l=1}^{n-i}(a_{l+i}b_l+a_lb_{l+i})\right|\le\xi\left|\sum_{l=1}^{n}a_lb_l\right|+\sum_{i=1}^{n-1}|\varrho_i|\left|\sum_{l=1}^{n-i}(a_{l+i}b_l+a_lb_{l+i})\right|.\quad(18)$$

Applying the Cauchy–Schwarz inequality, for $i\in\{1,\dots,n-1\}$ one gets
$$\sum_{l=1}^{n-i}a_{l+i}b_l\le\sqrt{\sum_{l=1}^{n-i}a_{l+i}^2\sum_{l=1}^{n-i}b_l^2}\le\sqrt{\sum_{l=1}^{n}a_l^2\sum_{l=1}^{n}b_l^2}=\|a\|\|b\|.\quad(19)$$

Similarly, $\sum_{l=1}^{n-i}a_lb_{l+i}\le\|a\|\|b\|$ holds. Therefore,
$$|a^{T}\Omega_nb|\le\|a\|\|b\|\left(\xi+2\sum_{i=1}^{n-1}|\varrho_i|\right)=\|a\|\|b\|W.\quad(20)$$
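The bound (20) is easy to sanity-check numerically. The sketch below builds a banded $\Omega_n$ with diagonal $\sigma_l^2$ and constant lag-$i$ covariances $\varrho_i$ (an illustrative construction consistent with the expansion in (18), not a specification from the paper) and verifies the inequality of Lemma 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
sigma2 = rng.uniform(0.5, 1.5, size=n)       # diagonal entries sigma_l^2
varrho = 0.4 ** np.arange(1, n)              # lag-i off-diagonal entries
Omega = np.diag(sigma2)
for i in range(1, n):
    # constant covariance varrho_i along the i-th sub/super-diagonal
    Omega += varrho[i - 1] * (np.eye(n, k=i) + np.eye(n, k=-i))
W = sigma2.max() + 2 * np.abs(varrho).sum()  # the constant of (17)
a = rng.normal(size=n)
b = rng.normal(size=n)
lhs = abs(a @ Omega @ b)
rhs = np.linalg.norm(a) * np.linalg.norm(b) * W
assert lhs <= rhs                            # Lemma 1's bound holds
```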

Because the matrix $P(w)$ plays an important role in analyzing the problems of model selection, we state some of its properties. We set $\tau = \max\{K_{1},\dots,K_{M}\}$. Let $\lambda(P(w))$ and $\lambda_{\max}(P(w))$ denote an eigenvalue and the largest eigenvalue of $P(w)$, respectively.

Lemma 2. One has (i) $\operatorname{tr}(P(w)) \le \tau$ and (ii) $0 \le \lambda(P(w)) \le 1$, where $\operatorname{tr}(\cdot)$ denotes the trace operation.

Proof. It follows from $\operatorname{tr}(P_{m}) = K_{m}$ and $\operatorname{tr}(P(w)) = \sum_{m=1}^{M} w_{m}K_{m} \le \tau$ that (i) is established. Next we consider (ii). Without loss of generality, let $\lambda_{1m} \ge \cdots \ge \lambda_{nm}$ and $\varphi_{1m},\dots,\varphi_{nm}$ be the eigenvalues and orthonormal eigenvectors of $P_{m}$, respectively. Observe that $P_{m}$ is idempotent, which implies that $\lambda_{im} \in \{0,1\}$, $i = 1,\dots,n$. For any $\varpi \in \mathbb{R}^{n}$, it can be seen that
$$
\lambda(P(w)) = \frac{\varpi^{T}P(w)\varpi}{\varpi^{T}\varpi}
= \sum_{m=1}^{M} w_{m}\frac{\varpi^{T}P_{m}\varpi}{\varpi^{T}\varpi}
= \sum_{m=1}^{M} w_{m}\frac{\chi^{T}\Xi_{m}\chi}{\chi^{T}\chi}
= \sum_{m=1}^{M} w_{m}\frac{\sum_{i=1}^{n}\lambda_{im}|\chi_{i}|^{2}}{\chi^{T}\chi}, \quad (21)
$$
where $\Xi_{m} = \operatorname{diag}(\lambda_{1m},\dots,\lambda_{nm})$, $\varpi = \Phi_{m}\chi$, and $\Phi_{m} = (\varphi_{1m},\dots,\varphi_{nm})$. Due to $\lambda_{im} \in \{0,1\}$, $i = 1,\dots,n$, it means that
$$
\lambda(P(w)) \le \sum_{m=1}^{M} w_{m}\frac{\sum_{i=1}^{n} 1\cdot|\chi_{i}|^{2}}{\chi^{T}\chi} = \sum_{m=1}^{M} w_{m} = 1, \qquad
\lambda(P(w)) \ge \sum_{m=1}^{M} w_{m}\frac{\sum_{i=1}^{n} 0\cdot|\chi_{i}|^{2}}{\chi^{T}\chi} = 0. \quad (22)
$$

From Lemma 2(ii), we know that $P(w)$ is nonnegative definite.
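Lemma 2 can be checked numerically: a weighted average of projection ("hat") matrices, with nonnegative weights summing to one, has eigenvalues in $[0,1]$ and trace $\sum_{m} w_{m}K_{m} \le \tau$. A minimal sketch (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, dims = 30, [3, 5, 8]          # K_1, K_2, K_3: column counts of the candidate designs
w = np.array([0.2, 0.3, 0.5])    # weights: nonnegative, summing to one

# P_m = Z_m (Z_m^T Z_m)^{-1} Z_m^T is the projection onto the column space of Z_m
projections = []
for K in dims:
    Z = rng.standard_normal((n, K))
    projections.append(Z @ np.linalg.solve(Z.T @ Z, Z.T))

P_w = sum(wm * Pm for wm, Pm in zip(w, projections))   # the averaged matrix P(w)

eigvals = np.linalg.eigvalsh(P_w)            # P(w) is symmetric
assert eigvals.min() >= -1e-10               # Lemma 2(ii): lambda(P(w)) >= 0
assert eigvals.max() <= 1 + 1e-10            # Lemma 2(ii): lambda(P(w)) <= 1
assert np.trace(P_w) <= max(dims) + 1e-10    # Lemma 2(i): tr(P(w)) <= tau
```

Here $\operatorname{tr}(P(w)) = 0.2\cdot 3 + 0.3\cdot 5 + 0.5\cdot 8 = 6.1 \le \tau = 8$, matching the trace bound.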

Lemma 3. Let $\alpha$ denote the number of eigenvalues of $P(w)$. Then $(\alpha^{-1}\operatorname{tr}(P(w)))^{2} \le \alpha^{-1}\operatorname{tr}(P^{2}(w))$.

Proof. Let $\lambda_{1},\dots,\lambda_{\alpha}$ be the eigenvalues of $P(w)$. (i) If $\lambda_{1} = \cdots = \lambda_{\alpha} = 0$, Lemma 3 holds trivially. (ii) Otherwise, it is easy to see that $\sum_{i=1}^{\alpha}(\lambda_{i} - \bar{\lambda})^{2} \ge 0$, where $\bar{\lambda} = \alpha^{-1}\operatorname{tr}(P(w))$. Thus it can be shown that
$$
\sum_{i=1}^{\alpha}\left(\lambda_{i} - \bar{\lambda}\right)^{2}
= \sum_{i=1}^{\alpha}\lambda_{i}^{2} - 2\bar{\lambda}\sum_{i=1}^{\alpha}\lambda_{i} + \alpha\bar{\lambda}^{2}
= \operatorname{tr}(P^{2}(w)) - \alpha\left(\alpha^{-1}\operatorname{tr}(P(w))\right)^{2} \ge 0, \quad (23)
$$
which implies that $(\alpha^{-1}\operatorname{tr}(P(w)))^{2} \le \alpha^{-1}\operatorname{tr}(P^{2}(w))$.

Lemma 4. For any $w \in \mathcal{W}$, there exists $C_{1} = C'^{2}$ such that $\operatorname{tr}(P(w)\Omega_{n}) \le C'\Gamma$, $\operatorname{tr}[(\Omega_{n}P(w))^{2}] \le C_{1}\Gamma^{2}$, and $\operatorname{tr}[(\Omega_{n}P^{T}(w)P(w))^{2}] \le C_{1}\Gamma\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))$, in which $C' = \sup_{w\in\mathcal{W}}\lambda_{\max}(P(w))$.

Proof. Notice that $\lambda_{\min}(\mathbf{A})\lambda_{\max}(\mathbf{B}) \le \lambda_{\max}(\mathbf{A}\mathbf{B}) \le \lambda_{\max}(\mathbf{A})\lambda_{\max}(\mathbf{B})$ and $\operatorname{tr}(\mathbf{A}\mathbf{B}) \le \lambda_{\max}(\mathbf{A})\operatorname{tr}(\mathbf{B}) \le \operatorname{tr}(\mathbf{A})\operatorname{tr}(\mathbf{B})$, where $\mathbf{A}$ and $\mathbf{B}$ are any positive semidefinite matrices.

Since $P(w)$ and $\Omega_{n}$ are symmetric and nonnegative definite, the first inequality is established by $\operatorname{tr}(P(w)\Omega_{n}) \le \lambda_{\max}(P(w))\operatorname{tr}(\Omega_{n})$. Let $\lambda_{i}$ be an eigenvalue of $P(w)$. From the symmetry of $P(w)$, we know that there exists an $n \times n$ orthogonal matrix $\Psi$ such that $P(w) = \Psi\Upsilon\Psi^{T}$, where $\Upsilon = \operatorname{diag}(\lambda_{1},\dots,\lambda_{n})$. Then it yields $\operatorname{tr}(\Omega_{n}P(w)) = \operatorname{tr}(\Upsilon^{1/2}\Psi^{T}\Omega_{n}\Psi\Upsilon^{1/2})$, where $\Upsilon^{1/2}\Psi^{T}\Omega_{n}\Psi\Upsilon^{1/2}$ is symmetric and nonnegative definite. The second inequality holds because

$$
\operatorname{tr}[(\Omega_{n}P(w))^{2}] \le \left[\operatorname{tr}(\Omega_{n}P(w))\right]^{2} \le \left[\lambda_{\max}(P(w))\operatorname{tr}(\Omega_{n})\right]^{2} \le C_{1}\Gamma^{2}. \quad (24)
$$

For the last formula, we note that
$$
\begin{aligned}
\operatorname{tr}[(\Omega_{n}P^{T}(w)P(w))^{2}]
&= \operatorname{tr}[(P(w)\Omega_{n}P^{T}(w))^{2}] \\
&\le \lambda_{\max}(P(w)\Omega_{n}P^{T}(w))\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)) \\
&\le \lambda_{\max}(P^{2}(w))\lambda_{\max}(\Omega_{n})\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)) \\
&\le \lambda_{\max}^{2}(P(w))\lambda_{\max}(\Omega_{n})\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)) \\
&\le C_{1}\Gamma\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)). \quad (25)
\end{aligned}
$$
This completes the proof of Lemma 4.

Lemma 5. Let $\mathbf{A}$ be a symmetric matrix and $X$ be a random vector with zero expectation. Then one has $\operatorname{Var}(X^{T}\mathbf{A}X) = 2\operatorname{tr}(\operatorname{Var}(X)\mathbf{A}\operatorname{Var}(X)\mathbf{A})$, where $\operatorname{Var}(\cdot)$ is the variance operator.

Proof. The proof of this lemma is provided in Lai and Xie [26].
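The identity in Lemma 5 is the familiar variance formula for quadratic forms in Gaussian vectors (for non-Gaussian $X$ a fourth-cumulant term would be added, an assumption worth noting). Writing $X = V^{1/2}Z$ with $Z$ standard normal and $V = \operatorname{Var}(X)$ gives $\operatorname{Var}(X^{T}\mathbf{A}X) = 2\sum_{i}\lambda_{i}^{2}$, where the $\lambda_{i}$ are the eigenvalues of $V^{1/2}\mathbf{A}V^{1/2}$; the sketch below checks that this agrees with $2\operatorname{tr}(V\mathbf{A}V\mathbf{A})$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2            # symmetric A
L = rng.standard_normal((n, n)); V = L @ L.T + n * np.eye(n)  # Var(X), positive definite

# For Gaussian X ~ N(0, V): X^T A X = sum_i lam_i * chi2_1, with lam_i the
# eigenvalues of V^{1/2} A V^{1/2}, so Var(X^T A X) = 2 * sum_i lam_i^2.
vals, U = np.linalg.eigh(V)
Vhalf = U @ np.diag(np.sqrt(vals)) @ U.T                      # symmetric square root
lam = np.linalg.eigvalsh(Vhalf @ A @ Vhalf)

# cyclic invariance of the trace gives tr(VAVA) = tr((V^{1/2} A V^{1/2})^2)
assert np.isclose(2 * np.sum(lam**2), 2 * np.trace(V @ A @ V @ A))
```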

3. Average Squared Error

Denote the average squared error by

$$
L_{n}(w) = \frac{1}{n}\|\mu - \hat{\mu}(w)\|_{k}^{2} = \frac{(\mu - \hat{\mu}(w))^{T}k(\mu - \hat{\mu}(w))}{n}, \quad (26)
$$

where $k$ is a fixed positive constant, often used to eliminate the boundary effect. Andrews [28] suggested that $k$ can take the value $\sigma^{-2}$ when the errors are independently and identically distributed with variance $\sigma^{2}$. The most common choice is $k = 1$. The average squared error $L_{n}(w)$ can be viewed as a measure of the accuracy of $\hat{\mu}(w)$ relative to $\mu$. Obviously, an optimal estimator attains the minimum value of $L_{n}(w)$. In other words, we should select a weight vector $w$ from $\mathcal{W}$ that makes the average squared error $L_{n}(w)$ as small as possible.

In order to investigate the problem of weight selection, we give the expression of the conditional expected average squared error as follows:
$$
R_{n}(w) = E\left(L_{n}(w) \mid X, Z_{m}\right). \quad (27)
$$
From the definition of $R_{n}(w)$, the following lemma can be obtained.

Lemma 6. The conditional expected average squared error can be rewritten as
$$
R_{n}(w) = \frac{\|M(w)\mu - \eta(w)\|_{k}^{2}}{n} + \frac{k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))}{n}, \quad (28)
$$
where $M(w) = I - P(w)$.

Proof. A straightforward calculation of $L_{n}(w)$ leads to
$$
\begin{aligned}
nL_{n}(w) &= (\mu - \hat{\mu}(w))^{T}k(\mu - \hat{\mu}(w)) \\
&= \left(\mu - \eta(w) - P(w)(\mu + e)\right)^{T}k\left(\mu - \eta(w) - P(w)(\mu + e)\right) \\
&= \left[\mu^{T}M^{T}(w) - \eta^{T}(w) - e^{T}P^{T}(w)\right]k\left[M(w)\mu - \eta(w) - P(w)e\right] \\
&= \|M(w)\mu - \eta(w)\|_{k}^{2} - k\left(\mu - XR^{-1}B\right)^{T}M^{T}(w)P(w)e \\
&\quad + ke^{T}P^{T}(w)P(w)e - ke^{T}P^{T}(w)M(w)\left(\mu - XR^{-1}B\right). \quad (29)
\end{aligned}
$$
Using $E(e^{T}P^{T}(w)P(w)e \mid X, Z_{m}) = \operatorname{tr}(P(w)\Omega_{n}P^{T}(w))$ and taking conditional expectations on both sides of the above equality give rise to Lemma 6.

4. The $k$-Class Generalized Information Criterion and Asymptotic Optimality

As the value of $\mu$ is unknown, the average squared error $L_{n}(w)$ cannot be used directly to select the weight vector $w$. Thus, we suggest a $k$-class generalized information criterion ($k$-GIC) to choose the weight vector. Further, we will prove that the weight vector selected by the $k$-GIC asymptotically minimizes the average squared error $L_{n}(w)$.

The $k$-GIC for the restricted model average estimator is
$$
C_{n}(w) = \frac{\|Y - \hat{\mu}(w)\|_{k}^{2}}{n} + \frac{2\hbar}{n}\operatorname{tr}(P(w)\Omega_{n}), \quad (30)
$$

where $\hbar$ is larger than one and satisfies assumption (35) mentioned below. The $k$-class generalized information criterion extends the generalized information criterion ($\mathrm{GIC}_{\lambda_{n}}$) proposed by Shao [27]. Because $\hbar$ can take different values, the $k$-GIC includes some common information criteria for model selection, such as the Mallows $C_{L}$ criterion ($k = \hbar = 1$), the GIC criterion ($k = 1$, $\hbar \to \infty$), the $\mathrm{FPE}_{\alpha}$ criterion ($k = 1$, $\hbar > 1$), and the BIC criterion ($k = 1$, $\hbar = \log(n)$).
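Evaluating criterion (30) is mechanical once $\hat{\mu}(w)$, $P(w)$, and $\Omega_{n}$ are in hand. A minimal sketch, assuming $k = 1$ and a known homoscedastic $\Omega_{n} = \sigma^{2}I$ (so $\hbar = 1$ corresponds to the Mallows $C_{L}$ special case); the function name and toy design are illustrative:

```python
import numpy as np

def k_gic(Y, mu_hat_w, P_w, Omega_n, k=1.0, hbar=1.0):
    """k-GIC of (30): k*||Y - mu(w)||^2 / n + (2*hbar/n) * tr(P(w) Omega_n)."""
    n = len(Y)
    resid = Y - mu_hat_w
    return k * (resid @ resid) / n + 2.0 * hbar / n * np.trace(P_w @ Omega_n)

# toy example: a single candidate model with homoscedastic errors, sigma^2 = 1
rng = np.random.default_rng(3)
n, K = 40, 4
Z = rng.standard_normal((n, K))
Y = Z @ np.ones(K) + rng.standard_normal(n)
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)           # hat matrix of the model
crit = k_gic(Y, P @ Y, P, np.eye(n))            # hbar = k = 1: Mallows C_L case
assert crit > 0
```

With several candidate models, one would evaluate this criterion over a grid of weight vectors $w$ and keep the minimizer.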


Lemma 7. If $k = \hbar$, we have $E(C_{n}(w) \mid X, Z_{m}) = R_{n}(w) + k\Gamma/n$.

Proof. Recalling the definition of $C_{n}(w)$, one gets
$$
\begin{aligned}
nC_{n}(w) &= (Y - \hat{\mu}(w))^{T}k(Y - \hat{\mu}(w)) + 2k\operatorname{tr}(P(w)\Omega_{n}) \\
&= \left[M(w)Y - \eta(w)\right]^{T}k\left[M(w)Y - \eta(w)\right] + 2k\operatorname{tr}(P(w)\Omega_{n}) \\
&= 2k\operatorname{tr}(P(w)\Omega_{n}) + ke^{T}M^{T}(w)M(w)e \\
&\quad + k\left(M(w)\mu - \eta(w)\right)^{T}M(w)e + ke^{T}M^{T}(w)\left(M(w)\mu - \eta(w)\right) \\
&\quad + \|M(w)\mu - \eta(w)\|_{k}^{2}. \quad (31)
\end{aligned}
$$

Observe that
$$
\begin{aligned}
ke^{T}M^{T}(w)M(w)e &= e^{T}(I - P(w))^{T}k(I - P(w))e \\
&= ke^{T}e - ke^{T}P^{T}(w)e - ke^{T}P(w)e + ke^{T}P^{T}(w)P(w)e. \quad (32)
\end{aligned}
$$

Notice that $E(ke^{T}e \mid X, Z_{m}) = k\operatorname{tr}(\Omega_{n})$ and $E(ke^{T}P^{T}(w)e \mid X, Z_{m}) = k\operatorname{tr}(P^{T}(w)\Omega_{n})$. Moreover, we have $E(ke^{T}P(w)e \mid X, Z_{m}) = k\operatorname{tr}(P(w)\Omega_{n})$ and $E(ke^{T}P^{T}(w)P(w)e \mid X, Z_{m}) = k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))$.

Thus, taking conditional expectations on both sides of (31), we obtain
$$
nE(C_{n}(w) \mid X, Z_{m}) = k\operatorname{tr}(\Omega_{n}) + \|M(w)\mu - \eta(w)\|_{k}^{2} + k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)). \quad (33)
$$

It follows from (33) and Lemma 6 that Lemma 7 is established, as desired.

Lemma 7 shows that $C_{n}(w)$ equals the conditional expected average squared error plus a bias term. In particular, as $n$ approaches infinity, $C_{n}(w)$ is an unbiased estimator of $R_{n}(w)$.

The $k$-GIC criterion is designed to select the optimal weight vector $\hat{w}$; the optimal weight vector $\hat{w}$ is chosen by minimizing $C_{n}(w)$.

Obviously, the well-posed estimators of the parameters are $\hat{\Theta}_{\hat{w}}$ and $\hat{\Psi}_{\hat{w}}$ with the weight vector $\hat{w}$. Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:
$$
\frac{L_{n}(\hat{w})}{\inf_{w\in\mathcal{W}}L_{n}(w)} \xrightarrow{p} 1. \quad (34)
$$

If formula (34) holds, we know that the weight vector $\hat{w}$ selected by $C_{n}(w)$ can realize the minimum value of $L_{n}(w)$. In other words, the weight vector $\hat{w}$ is asymptotically equivalent to the weight vector selected by minimizing $L_{n}(w)$ and is the optimal weight vector for $\hat{\mu}(w)$. The asymptotic optimality of $C_{n}(w)$ can be established under the following assumptions.

Assumptions 1. Write $\alpha_{n} = \inf_{w\in\mathcal{W}} nR_{n}(w)$. As $n \to \infty$, we assume that
$$
\frac{\hbar^{2}}{\alpha_{n}} \to 0, \quad (35)
$$
$$
\sum_{w\in\mathcal{W}}\frac{1}{\left(nR_{n}(w)\right)^{N'}} \to 0, \quad (36)
$$

for a large $\hbar$ and any positive integer $1 \le N' < \infty$.

The above assumptions have been employed in much of the model selection literature. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of $C_{n}(w)$.

Lemma 8. We have
$$
\|M(w)Y - \eta(w)\|_{k}^{2} = nL_{n}(w) - 2\langle e^{T}, P(w)e\rangle_{k} + 2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k} + \|e\|_{k}^{2}. \quad (37)
$$

Proof. Using $\hat{\mu}(w) = P(w)Y + \eta(w)$, $Y = \mu + e$, and the definition of $L_{n}(w)$, one obtains
$$
\begin{aligned}
\|M(w)Y - \eta(w)\|_{k}^{2} &= \|\mu + e - P(w)Y - \eta(w)\|_{k}^{2} \\
&= \|e\|_{k}^{2} + \|\mu - P(w)Y - \eta(w)\|_{k}^{2} + 2\langle e^{T}, \mu - \eta(w) - P(w)Y\rangle_{k} \\
&= \|e\|_{k}^{2} + nL_{n}(w) + 2\langle e^{T}, \mu - \eta(w) - P(w)(\mu + e)\rangle_{k} \\
&= \|e\|_{k}^{2} + nL_{n}(w) + 2\langle e^{T}, (I - P(w))\mu - \eta(w) - P(w)e\rangle_{k} \\
&= \|e\|_{k}^{2} + nL_{n}(w) + 2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k} - 2\langle e^{T}, P(w)e\rangle_{k}. \quad (38)
\end{aligned}
$$
This completes the proof of Lemma 8.

Lemma 8 and $\|Y - \hat{\mu}(w)\|_{k}^{2} = \|M(w)Y - \eta(w)\|_{k}^{2}$ imply that
$$
C_{n}(w) = L_{n}(w) + \frac{2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}}{n} + \frac{\|e\|_{k}^{2}}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_{n})}{n} - \frac{2\langle e^{T}, P(w)e\rangle_{k}}{n}. \quad (39)
$$


The goal is to choose $\hat{w}$ by minimizing $C_{n}(w)$. From (39), since $\|e\|_{k}^{2}/n$ does not depend on $w$, one only needs to select $\hat{w}$ by minimizing
$$
L_{n}(w) + \frac{2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_{n}) - 2\langle e^{T}, P(w)e\rangle_{k}}{n}, \quad (40)
$$

where $w \in \mathcal{W}$. Compared with $L_{n}(w)$, it is sufficient to establish that $n^{-1}\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}$ and $n^{-1}\hbar\operatorname{tr}(P(w)\Omega_{n}) - n^{-1}\langle e^{T}, P(w)e\rangle_{k}$ are uniformly negligible for any $w \in \mathcal{W}$.

More specifically, to prove (34), we need to check
$$
\sup_{w\in\mathcal{W}}\frac{\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}}{nR_{n}(w)} \xrightarrow{p} 0, \quad (41)
$$
$$
\sup_{w\in\mathcal{W}}\frac{\left|\hbar\operatorname{tr}(P(w)\Omega_{n}) - \langle e^{T}, P(w)e\rangle_{k}\right|}{nR_{n}(w)} \xrightarrow{p} 0, \quad (42)
$$
$$
\sup_{w\in\mathcal{W}}\left|\frac{L_{n}(w) - R_{n}(w)}{R_{n}(w)}\right| \xrightarrow{p} 0, \quad (43)
$$

where "$\xrightarrow{p}$" denotes convergence in probability. Following the idea of Li [29], we verify the main result of our work: that minimizing the $k$-class generalized information criterion is asymptotically optimal. Now we state the main theorem.

Theorem 9. Under assumptions (35) and (36), minimizing the $k$-class generalized information criterion $C_{n}(w)$ is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the $k$-GIC requires showing that (41), (42), and (43) are valid.

Firstly, we prove that (41) holds. For any $\zeta > 0$, by Chebyshev's inequality, one has
$$
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\left|\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}\right|}{nR_{n}(w)} > \zeta\right\}
\le \sum_{w\in\mathcal{W}}\frac{E\left[\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k}\right]^{2}}{\left(\zeta nR_{n}(w)\right)^{2}}, \quad (44)
$$

which by Lemma 1 is no greater than
$$
\zeta^{-2}kW\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu - \eta(w)\|_{k}^{2}\left(R_{n}(w)\right)^{-2}. \quad (45)
$$

Recalling the definition of $R_{n}(w)$, we get $\|M(w)\mu - \eta(w)\|_{k}^{2} \le nR_{n}(w)$. Then (45) does not exceed $\zeta^{-2}kW\sum_{w\in\mathcal{W}}(nR_{n}(w))^{-1}$. By assumption (36), one knows that (45) tends to zero. Thus (41) is established.

To prove (42), it suffices to verify that
$$
\frac{k\operatorname{tr}(P(w)\Omega_{n}) - \langle e^{T}, P(w)e\rangle_{k}}{nR_{n}(w)} \xrightarrow{p} 0, \quad (46)
$$
$$
\frac{(\hbar - k)\operatorname{tr}(P(w)\Omega_{n})}{nR_{n}(w)} \xrightarrow{p} 0. \quad (47)
$$

By Chebyshev's inequality and Lemmas 4 and 5, for any $\epsilon > 0$, we have
$$
\begin{aligned}
\Pr\left\{\sup_{w\in\mathcal{W}}\left|\frac{\operatorname{tr}(P(w)\Omega_{n}) - \langle e^{T}, P(w)e\rangle}{nR_{n}(w)}\right| > \epsilon\right\}
&\le \sum_{w\in\mathcal{W}}\frac{E\left[\operatorname{tr}(P(w)\Omega_{n}) - \langle e^{T}, P(w)e\rangle\right]^{2}}{\left(\epsilon nR_{n}(w)\right)^{2}} \\
&= \frac{1}{\epsilon^{2}}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}(e^{T}P(w)e)}{\left(nR_{n}(w)\right)^{2}} \\
&\le \frac{2C_{1}\Gamma^{2}}{\epsilon^{2}}\sum_{w\in\mathcal{W}}\frac{1}{\left(nR_{n}(w)\right)^{2}}. \quad (48)
\end{aligned}
$$

From (36), we know that (48) tends to zero. Thus (46) holds.

Recalling Lemma 4 and (35), it yields
$$
\frac{\left|(\hbar - k)\operatorname{tr}(P(w)\Omega_{n})\right|}{nR_{n}(w)} \le \frac{\left|C'(\hbar - k)\Gamma\right|}{nR_{n}(w)} \to 0, \quad (49)
$$

which yields (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to

$$
\begin{aligned}
\|\mu - \hat{\mu}(w)\|_{k}^{2} - \|M(w)\mu - \eta(w)\|_{k}^{2}
&= \|\mu - \eta(w) - P(w)Y\|_{k}^{2} - \|M(w)\mu - \eta(w)\|_{k}^{2} \\
&= \|\mu - \eta(w) - P(w)\mu - P(w)e\|_{k}^{2} - \|M(w)\mu - \eta(w)\|_{k}^{2} \\
&= \|M(w)\mu - \eta(w) - P(w)e\|_{k}^{2} - \|M(w)\mu - \eta(w)\|_{k}^{2} \\
&= \|P(w)e\|_{k}^{2} - 2\langle\mu^{T}M^{T}(w) - \eta^{T}(w), P(w)e\rangle_{k}. \quad (50)
\end{aligned}
$$

From the identity (50), we know that
$$
L_{n}(w) - R_{n}(w) = -\frac{2\langle\mu^{T}M^{T}(w) - \eta^{T}(w), P(w)e\rangle_{k}}{n} + \frac{\|P(w)e\|_{k}^{2} - k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))}{n}. \quad (51)
$$


Obviously, the proof of (43) needs to verify that
$$
\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_{k}^{2} - k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))}{nR_{n}(w)} \xrightarrow{p} 0, \qquad
\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w) - \eta^{T}(w), P(w)e\rangle_{k}}{nR_{n}(w)} \xrightarrow{p} 0. \quad (52)
$$

To prove (52), it should be noticed that $\operatorname{tr}(P(w)\Omega_{n}P^{T}(w)) = E(e^{T}P^{T}(w)P(w)e)$ and $\|P(w)e\|_{k}^{2} = \langle e^{T}P^{T}(w), P(w)e\rangle_{k}$. In addition,
$$
\|P(w)(M(w)\mu - \eta(w))\|^{2} \le \lambda_{\max}^{2}(P(w))\|M(w)\mu - \eta(w)\|^{2} \le \|M(w)\mu - \eta(w)\|^{2}. \quad (53)
$$

Thus, for any $\zeta' > 0$ and $\epsilon' > 0$,
$$
\begin{aligned}
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_{k}^{2} - k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))}{nR_{n}(w)} > \zeta'\right\}
&\le \sum_{w\in\mathcal{W}}\frac{E\left[\|P(w)e\|_{k}^{2} - kE(e^{T}P^{T}(w)P(w)e)\right]^{2}}{\left(nR_{n}(w)\zeta'\right)^{2}} \\
&= \frac{k^{2}}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}(e^{T}P^{T}(w)P(w)e)}{\left(nR_{n}(w)\right)^{2}} \\
&\le \frac{2kC_{1}\Gamma}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))}{\left(nR_{n}(w)\right)^{2}} \\
&\le \frac{2kC_{1}\Gamma}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{1}{nR_{n}(w)}, \quad (54)
\end{aligned}
$$
$$
\begin{aligned}
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w) - \eta^{T}(w), P(w)e\rangle_{k}}{nR_{n}(w)} > \epsilon'\right\}
&\le \sum_{w\in\mathcal{W}}\frac{E\left[\langle\mu^{T}M^{T}(w) - \eta^{T}(w), P(w)e\rangle_{k}\right]^{2}}{\left(nR_{n}(w)\epsilon'\right)^{2}} \\
&\le \sum_{w\in\mathcal{W}}\frac{\|P(w)(M(w)\mu - \eta(w))\|_{k}^{2}\,kW}{\left(nR_{n}(w)\epsilon'\right)^{2}} \\
&\le \epsilon'^{-2}kW\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu - \eta(w)\|_{k}^{2}\left(R_{n}(w)\right)^{-2}. \quad (55)
\end{aligned}
$$

Combining (36) and (45), one knows that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) all hold. This completes the proof of Theorem 9.

In practice, the covariance of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of $\Omega_{n}$ because of its special structure. In the special case where the random errors are independently and identically distributed with variance $\sigma^{2}$, a consistent estimator of $\sigma^{2}$ can be built for the constrained models (7) and (8). Let $m = Q$ in $\hat{\mu}_{m}$ and $\tau' = \operatorname{tr}(P_{Q})$, where $Q$ corresponds to a "large" approximating model. Denote $\hat{\sigma}_{Q}^{2} = (n - \tau')^{-1}(Y - \hat{\mu}_{Q})^{T}(Y - \hat{\mu}_{Q})$. The coming theorem will show that $\hat{\sigma}_{Q}^{2}$ is a consistent estimate of $\sigma^{2}$.

Theorem 10. If $\tau'/n \to 0$ as $\tau' \to \infty$ and $n \to \infty$, we have $\hat{\sigma}_{Q}^{2} \xrightarrow{p} \sigma^{2}$ as $n \to \infty$.

Proof. Writing $\Lambda = D + XR^{-1}\tilde{D}$, one obtains
$$
\frac{(Y - \hat{\mu}_{Q})^{T}(Y - \hat{\mu}_{Q})}{n - \tau'} = \frac{e^{T}M_{Q}e}{n - \tau'} + \frac{\|M_{Q}\Lambda\|^{2}}{n - \tau'} + \frac{2e^{T}M_{Q}\Lambda}{n - \tau'}, \quad (56)
$$
where $M_{Q} = I - P_{Q}$.

Since $E(e^{T}M_{Q}e) = \sigma^{2}(n - \tau')$, it leads to
$$
E\left|e^{T}M_{Q}e - \sigma^{2}(n - \tau')\right|^{2} = E\left|e^{T}M_{Q}e - E(e^{T}M_{Q}e)\right|^{2} = \operatorname{Var}(e^{T}M_{Q}e) = 2\sigma^{4}\operatorname{tr}(M_{Q}). \quad (57)
$$

For any $\varepsilon > 0$, it follows from (57) that
$$
\Pr\left\{\left|\frac{e^{T}M_{Q}e}{n - \tau'} - \sigma^{2}\right| > \varepsilon\right\}
\le \frac{E\left|e^{T}M_{Q}e - (n - \tau')\sigma^{2}\right|^{2}}{\varepsilon^{2}(n - \tau')^{2}}
\le \frac{2\sigma^{4}}{\varepsilon^{2}(n - \tau')} \to 0. \quad (58)
$$

Equation (58) implies that $(n - \tau')^{-1}(e^{T}M_{Q}e) \xrightarrow{p} \sigma^{2}$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_{Q}$. Then $0 \le 1 - \mathcal{I}_{Qt} < 1$ is the $t$th diagonal element of $M_{Q}$ and satisfies $\sum_{t=1}^{n}(1 - \mathcal{I}_{Qt}) = n - \tau'$. Because $\phi(z_{il}, \psi_{l})$ and $\phi(a_{sk}, \psi_{k})$ converge in mean square, we have $E(\Lambda_{Qt}^{2}) \to 0$ as $\tau' \to \infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t = 1,\dots,n$.

the 119905th element of Λ 119905 = 1 119899 Notice that

119864 (Λ119879119872119876Λ)

(119899 minus 1205911015840)=

1

119899 minus 1205911015840

119899

sum

119905=1

119864 (Λ2

119876119905119872119876119905)

=1

119899 minus 1205911015840

119899

sum

119905=1

119864 (Λ2

119876119905) [1 minusI

119876119905]

le max119905

119864 (Λ2

119876119905)

1

119899 minus 1205911015840

119899

sum

119905=1

[1 minusI119876119905]

= max119905

119864 (Λ2

119876119905) 997888rarr 0

(59)

The above expression implies that the second term on the right-hand side of (56) approaches zero. By a similar argument, the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.
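The estimator $\hat{\sigma}_{Q}^{2}$ of Theorem 10 is simply the residual mean square of a large approximating model, normalized by $n - \tau'$ with $\tau' = \operatorname{tr}(P_{Q})$. A sketch under assumed i.i.d. normal errors (the design and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, K_Q, sigma2 = 2000, 25, 1.5
Z = rng.standard_normal((n, K_Q))               # design of the "large" model Q
Y = Z @ rng.standard_normal(K_Q) + np.sqrt(sigma2) * rng.standard_normal(n)

P_Q = Z @ np.linalg.solve(Z.T @ Z, Z.T)         # hat matrix of model Q
tau = np.trace(P_Q)                             # tau' = tr(P_Q), equal to K_Q here
resid = Y - P_Q @ Y
sigma2_hat = resid @ resid / (n - tau)          # sigma_hat^2_Q of Theorem 10

assert abs(sigma2_hat - sigma2) < 0.3           # close to the true sigma^2 = 1.5
```

The normalization by $n - \tau'$ rather than $n$ removes the downward bias of the naive residual variance, which is the degrees-of-freedom correction made explicit in (57)–(58).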


In the case of independently and identically distributed errors, if we replace $\sigma^{2}$ by $\hat{\sigma}_{Q}^{2}$ in the $k$-GIC, the $k$-GIC simplifies to
$$
C_{n}(w) = \frac{\|Y - \hat{\mu}(w)\|_{k}^{2}}{n} + \frac{2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w))}{n}. \quad (60)
$$

Here we intend to illustrate that the model selection procedure of minimizing $C_{n}(w)$ in (60) is also asymptotically optimal.

Theorem 11. Assume that the random errors $e$ are i.i.d. with mean zero and variance $\sigma^{2}$. Under conditions (35) and (36), minimizing $C_{n}(w)$ in (60) is still asymptotically optimal.

Proof. Using a technique similar to the derivation of (39), we obtain
$$
\begin{aligned}
nC_{n}(w) &= \|Y - \hat{\mu}(w)\|_{k}^{2} + 2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w)) \\
&= nL_{n}(w) + 2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k} + 2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w)) - 2\langle e^{T}, P(w)e\rangle_{k} + \|e\|_{k}^{2} \\
&= 2\hbar\left(\hat{\sigma}_{Q}^{2} - \sigma^{2}\right)\operatorname{tr}(P(w)) + 2\langle e^{T}, M(w)\mu - \eta(w)\rangle_{k} \\
&\quad + nL_{n}(w) + 2\hbar\sigma^{2}\operatorname{tr}(P(w)) - 2\langle e^{T}, P(w)e\rangle_{k} + \|e\|_{k}^{2}. \quad (61)
\end{aligned}
$$

With an appropriate modification of the proofs of (41)–(43), one only needs to verify that
$$
\sup_{w\in\mathcal{W}}\frac{\hbar\left|\sigma^{2} - \hat{\sigma}_{Q}^{2}\right|\operatorname{tr}(P(w))}{nR_{n}(w)} \xrightarrow{p} 0, \quad (62)
$$
which is equivalent to showing that
$$
\sup_{w\in\mathcal{W}}\frac{\hbar^{2}\left(n^{-1}\left|\sigma^{2} - \hat{\sigma}_{Q}^{2}\right|\operatorname{tr}(P(w))\right)^{2}}{R_{n}^{2}(w)} \xrightarrow{p} 0. \quad (63)
$$

From the proof of Theorem 10, we have
$$
\Pr\left\{\left|\frac{\hbar e^{T}M_{Q}e}{n - \tau'} - \hbar\sigma^{2}\right| > \varepsilon' R_{n}^{1/2}(w)\right\}
\le \frac{\hbar^{2}E\left|e^{T}M_{Q}e - (n - \tau')\sigma^{2}\right|^{2}}{\varepsilon'^{2}(n - \tau')^{2}R_{n}(w)}
\le \frac{2\hbar^{2}\sigma^{4}}{\varepsilon'^{2}(n - \tau')R_{n}(w)}, \quad (64)
$$

where $\varepsilon' > 0$. By assumption (35), one knows that the above bound is close to zero. Thus we obtain $R_{n}(w)^{-1}|\hbar\sigma^{2} - \hbar\hat{\sigma}_{Q}^{2}|^{2} \xrightarrow{p} 0$. It follows from Lemma 3 and the definition of $R_{n}(w)$ that $(n^{-1}\operatorname{tr}(P(w)))^{2}$ is no greater than $R_{n}(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $C_{n}(w)$ in (60) is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is

$$
y_{i} = x_{i}\theta + \sum_{l=1}^{1000} z_{il}\psi_{l} + e_{i}, \qquad \text{subject to } \theta = -\sum_{l=1}^{1000}\psi_{l} + 1, \quad i = 1,\dots,n, \quad (65)
$$

where $x_{i}$ follows the $t(3)$ distribution, $z_{i1} = 1$, and $z_{il}$, $l = 2,\dots,1000$, are independently and identically distributed $N(0,1)$. The parameters $\psi_{l}$, $l = 1,\dots,1000$, are determined by the rule $\psi_{l} = \zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^{2} = \zeta^{2}/(1 + \zeta^{2})$. The error $e_{i}$ is independent of $x_{i}$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_{i}$ is independently and identically distributed $N(0,1)$.

Case 2. $(e_{1},\dots,e_{n})$ follows the multivariate normal distribution $N(0, \Omega_{n})$, where $\Omega_{n}$ is an $n \times n$ covariance matrix. The $j$th diagonal element of $\Omega_{n}$ is $\sigma_{j}^{2}$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j \ne l$) off-diagonal element of $\Omega_{n}$ is $\Omega_{jl,n} = \exp(-0.5|x_{ij} - x_{il}|^{2})$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_{i}$, respectively.

The sample size is varied among $n = 50$, $100$, $150$, and $200$. The number of models is determined by $M = \lfloor 3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^{2}$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi = 500$. For the $k$-GIC, the value of $k$ is one and $\hbar$ adopts the effective number of parameters.
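The design above can be sketched as follows. The decay rule for $\psi_{l}$ is transcribed here as $\psi_{l} = \zeta/l$ and should be treated as an assumption, as should the helper name `generate_data`; Case 1 (homoscedastic) errors are used:

```python
import numpy as np

def generate_data(n, zeta, L=1000, rng=None):
    """Simulate from (65): y_i = x_i*theta + sum_l z_il*psi_l + e_i,
    with the linear restriction theta = -sum_l psi_l + 1 (Case 1 errors)."""
    rng = rng if rng is not None else np.random.default_rng()
    psi = zeta / np.arange(1, L + 1)      # assumed decay rule psi_l = zeta / l
    theta = -psi.sum() + 1                # the restriction in (65)
    x = rng.standard_t(df=3, size=n)      # x_i ~ t(3)
    z = rng.standard_normal((n, L))       # z_il ~ N(0, 1)
    z[:, 0] = 1.0                         # z_i1 = 1 (intercept column)
    e = rng.standard_normal(n)            # Case 1: e_i ~ N(0, 1)
    y = x * theta + z @ psi + e
    return y, x, z, theta

y, x, z, theta = generate_data(n=50, zeta=1.0, rng=np.random.default_rng(5))
assert y.shape == (50,) and z.shape == (50, 1000)
```

Case 2 would replace the last error draw by a single draw from $N(0, \Omega_{n})$ with the covariance described above.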

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV) [17]; (4) the Mallows model averaging estimator (MMA) [11]; and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

\[
\mathrm{AIC}_m = 2n \ln\Bigl(\frac{1}{n}\|Y - \hat{\mu}_m\|^2\Bigr) + 2K_m,
\qquad
\mathrm{BIC}_m = 2n \ln\Bigl(\frac{1}{n}\|Y - \hat{\mu}_m\|^2\Bigr) + \ln(n) K_m. \tag{66}
\]
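In code, (66) amounts to the following minimal sketch; the function name is ours, and `K_m` is the model-complexity term of (66).

```python
import numpy as np

def aic_bic(Y, mu_hat, K_m):
    """AIC_m and BIC_m of (66): 2n*ln(n^{-1}||Y - mu_m||^2) plus the
    penalty 2*K_m (AIC) or ln(n)*K_m (BIC)."""
    Y, mu_hat = np.asarray(Y, float), np.asarray(mu_hat, float)
    n = len(Y)
    base = 2 * n * np.log(np.sum((Y - mu_hat) ** 2) / n)
    return base + 2 * K_m, base + np.log(n) * K_m
```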

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_\ell, x_\ell, z_\ell\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is

\[
\mathrm{PE}(\pi) = \frac{1}{100}\sum_{\ell=1}^{100}\bigl(y_\ell - \hat{\mu}_\ell(\hat{w})\bigr)^2, \tag{67}
\]

10 Abstract and Applied Analysis

[Figure 1: Results for the Monte Carlo design with homoscedastic errors. Panels (a)-(d) plot prediction error against $R^2$ for $n = 50$ ($M = 11$), $n = 100$ ($M = 14$), $n = 150$ ($M = 16$), and $n = 200$ ($M = 18$); each panel compares the $k$-GIC, MMA, CV, AIC, and BIC estimators.]

where $\hat{w}$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by

\[
\mathrm{PE} = \frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi), \tag{68}
\]

where $\Pi = 500$ is the number of replications. Obviously, a smaller PE implies a better model estimator. We first consider PE under homoscedastic errors. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^2$ on the $x$-axis.
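The two-stage evaluation in (67)-(68) can be written directly as a sketch (function names are ours):

```python
import numpy as np

def pe_single(y_out, mu_out):
    """PE(pi) of (67): mean squared error over the out-of-sample draws."""
    return np.mean((np.asarray(y_out, float) - np.asarray(mu_out, float)) ** 2)

def pe_overall(pe_trials):
    """PE of (68): average of PE(pi) over the Pi replications."""
    return np.mean(pe_trials)
```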

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^2$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^2$, the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^2$ increases. In many situations, the PEs of the BIC estimator with a large $R^2$ are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic cases. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both average estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^2$ space.

[Figure 2: Results for the Monte Carlo design with correlated errors. Panels (a)-(d) plot prediction error against $R^2$ for $n = 50$ ($M = 11$), $n = 100$ ($M = 14$), $n = 150$ ($M = 16$), and $n = 200$ ($M = 18$); each panel compares the $k$-GIC, MMA, CV, AIC, and BIC estimators.]

When $R^2 \le 0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with the CV a close second over a large region of the $R^2$ space. Besides, their relative efficiency relies closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6 Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are cumbersome to compute and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper develops a statistical program for the selection problem of an optimal tracking portfolio. We build theoretical models of tracking portfolios as constrained linear models. The selection of an optimal portfolio then boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problem of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares, so the selection of an optimal constrained model is equivalent to finding a set of optimal weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods; the $k$-GIC is found to give the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and we can apply the techniques of linear functional analysis and stochastic control to consider the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356-1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928-961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108-1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270-290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905-919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519-536, 2002.

[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879-899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175-1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053-1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194-200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38-46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267-281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_P$," Technometrics, vol. 15, no. 4, pp. 661-675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111-147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377-403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543-564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900-945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312-323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003-3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742-762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37-63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757-766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221-264, 1997, with comments and a rejoinder by the author.

[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359-377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958-975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478-493, 1993.


$\operatorname{tr}(AB) \le \lambda_{\max}(A)\operatorname{tr}(B)$, where $A$ and $B$ are any positive semidefinite matrices.

Since $P(w)$ and $\Omega_n$ are symmetric and nonnegative definite, the first inequality is established by $\operatorname{tr}(P(w)\Omega_n) \le \lambda_{\max}(P(w))\operatorname{tr}(\Omega_n)$. Let $\lambda_i$ be the eigenvalues of $P(w)$. From the symmetry of $P(w)$, we know that there exists an $n \times n$ orthogonal matrix $\Psi$ such that $P(w) = \Psi\Upsilon\Psi^T$, where $\Upsilon = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Then it yields $\operatorname{tr}(\Omega_n P(w)) = \operatorname{tr}(\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2})$, where $\Upsilon^{1/2}\Psi^T\Omega_n\Psi\Upsilon^{1/2}$ is symmetric and nonnegative definite. The second inequality holds because

\[
\operatorname{tr}\bigl((\Omega_n P(w))^2\bigr) \le \bigl[\operatorname{tr}(\Omega_n P(w))\bigr]^2 \le \bigl[\lambda_{\max}(P(w))\operatorname{tr}(\Omega_n)\bigr]^2 \le C_1\Gamma^2. \tag{24}
\]

For the last formula, we note that

\[
\begin{aligned}
\operatorname{tr}\bigl((\Omega_n P^T(w)P(w))^2\bigr) &= \operatorname{tr}\bigl((P(w)\Omega_n P^T(w))^2\bigr) \\
&\le \lambda_{\max}\bigl(P(w)\Omega_n P^T(w)\bigr)\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr) \\
&\le \lambda_{\max}\bigl(P^2(w)\bigr)\lambda_{\max}(\Omega_n)\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr) \\
&\le \lambda_{\max}^2(P(w))\lambda_{\max}(\Omega_n)\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr) \\
&\le C_1\Gamma\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr). \tag{25}
\end{aligned}
\]

This completes the proof of Lemma 4.
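Both trace bounds above rest on the standard inequality $\operatorname{tr}(AB) \le \lambda_{\max}(A)\operatorname{tr}(B)$ for positive semidefinite matrices. A quick numerical check (our own illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_psd(n):
    """Draw a random symmetric positive semidefinite matrix M M^T."""
    m = rng.standard_normal((n, n))
    return m @ m.T

A, B = random_psd(5), random_psd(5)
lhs = np.trace(A @ B)
rhs = np.max(np.linalg.eigvalsh(A)) * np.trace(B)
```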

Lemma 5. Let $A$ be a symmetric matrix and $X$ a random vector with zero expectation. Then one has $\operatorname{Var}(X^T A X) = 2\operatorname{tr}\bigl(\operatorname{Var}(X)A\operatorname{Var}(X)A\bigr)$, where $\operatorname{Var}(\cdot)$ is the variance operator.

Proof. The proof of this lemma is provided in Lai and Xie [26].
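For Gaussian $X$, the identity of Lemma 5 can be verified by simulation. The sketch below uses arbitrary choices of $A$ and $\operatorname{Var}(X)$ that are our own:

```python
import numpy as np

rng = np.random.default_rng(7)

A = np.array([[2.0, 1.0], [1.0, 3.0]])       # symmetric A
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])   # Var(X)

# Right-hand side of Lemma 5: 2 tr(Var(X) A Var(X) A)
rhs = 2.0 * np.trace(Sigma @ A @ Sigma @ A)

# Monte Carlo estimate of Var(X^T A X) for zero-mean Gaussian X
X = rng.multivariate_normal(np.zeros(2), Sigma, size=400000)
q = np.einsum('ij,jk,ik->i', X, A, X)        # quadratic form per draw
mc = q.var()
```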

3. Average Squared Error

Denote the average squared error by

\[
L_n(w) = \frac{1}{n}\|\mu - \hat{\mu}(w)\|_k^2 = \frac{(\mu - \hat{\mu}(w))^T k (\mu - \hat{\mu}(w))}{n}, \tag{26}
\]

where $k$ is a fixed positive integer, which is often used to eliminate the boundary effect. Andrews [28] suggested that $k$ can take the value $\sigma^{-2}$ when the errors obey an independent and identical distribution with variance $\sigma^2$. The most common situation is $k = 1$. The average squared error $L_n(w)$ can be viewed as a measure of the accuracy of $\hat{\mu}(w)$ for $\mu$. Obviously, an optimal estimator attains the minimum value of $L_n(w)$. In other words, we should select a weight vector $w$ from $\mathcal{W}$ so that the average squared error $L_n(w)$ takes a value that is as small as possible.
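Computationally, (26) is just a weighted squared distance (a sketch; the function name is ours):

```python
import numpy as np

def average_squared_error(mu, mu_w, k=1):
    """L_n(w) of (26): n^{-1} * k * ||mu - mu(w)||^2."""
    d = np.asarray(mu, float) - np.asarray(mu_w, float)
    return k * (d @ d) / len(d)
```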

In order to investigate the problem of weight selection, we give the expression of the conditional expected average squared error as follows:

\[
R_n(w) = E\bigl(L_n(w) \mid X, Z_m\bigr). \tag{27}
\]

From the definition of $R_n(w)$, the following lemma can be obtained.

Lemma 6. The conditional expected average squared error can be rewritten as

\[
R_n(w) = \frac{\|M(w)\mu - \eta(w)\|_k^2}{n} + \frac{k\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr)}{n}, \tag{28}
\]

where $M(w) = I - P(w)$.

Proof. A straightforward calculation of $L_n(w)$ leads to

\[
\begin{aligned}
nL_n(w) &= (\mu - \hat{\mu}(w))^T k (\mu - \hat{\mu}(w)) \\
&= \bigl(\mu - \eta(w) - P(w)(\mu + e)\bigr)^T k \bigl(\mu - \eta(w) - P(w)(\mu + e)\bigr) \\
&= \bigl[\mu^T M^T(w) - \eta^T(w) - e^T P^T(w)\bigr] k \bigl[M(w)\mu - \eta(w) - P(w)e\bigr] \\
&= \|M(w)\mu - \eta(w)\|_k^2 - k\bigl(\mu - XR^{-1}B\bigr)^T M^T(w)P(w)e \\
&\quad + k e^T P^T(w)P(w)e - k e^T P^T(w)M(w)\bigl(\mu - XR^{-1}B\bigr). \tag{29}
\end{aligned}
\]

Using $E\bigl(e^T P^T(w)P(w)e \mid X, Z_m\bigr) = \operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr)$ and taking conditional expectations on both sides of the above equality give rise to Lemma 6.

4. The k-Class Generalized Information Criterion and Asymptotic Optimality

As the value of $\mu$ is unknown, the average squared error $L_n(w)$ cannot be used directly to select the weight vector $w$. Thus, we suggest a $k$-class generalized information criterion ($k$-GIC) to choose the weight vector. Further, we will prove that the weight vector selected by the $k$-GIC minimizes the average squared error $L_n(w)$.

The $k$-GIC for the restricted model average estimator is

\[
C_n(w) = \frac{\|Y - \hat{\mu}(w)\|_k^2}{n} + \frac{2\hbar}{n}\operatorname{tr}\bigl(P(w)\Omega_n\bigr), \tag{30}
\]

where $\hbar$ is larger than one and satisfies assumption (35) mentioned below. The $k$-class generalized information criterion extends the generalized information criterion ($\mathrm{GIC}_{\lambda_n}$) proposed by Shao [27]. Because $\hbar$ can take different values, the $k$-GIC includes some common information criteria for model selection, such as the Mallows $C_L$ criterion ($k = \hbar = 1$), the GIC criterion ($k = 1$, $\hbar \to \infty$), the $\mathrm{FPE}_\alpha$ criterion ($k = 1$, $\hbar > 1$), and the BIC criterion ($k = 1$, $\hbar = \log(n)$).
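Given the fitted values, criterion (30) can be evaluated directly. In the sketch below, `P_w` is the averaging hat matrix $P(w)$ and `Omega_n` the error covariance, both assumed available; the function name is ours.

```python
import numpy as np

def k_gic(Y, mu_w, P_w, Omega_n, k=1.0, hbar=1.0):
    """C_n(w) of (30): n^{-1}||Y - mu(w)||_k^2 + (2*hbar/n) tr(P(w) Omega_n).
    With k = hbar = 1 this reduces to the Mallows C_L criterion."""
    Y, mu_w = np.asarray(Y, float), np.asarray(mu_w, float)
    n = len(Y)
    fit = k * np.sum((Y - mu_w) ** 2) / n
    penalty = 2.0 * hbar * np.trace(P_w @ Omega_n) / n
    return fit + penalty
```

An optimal weight vector is then the minimizer of `k_gic` over the candidate weight set $\mathcal{W}$.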


Lemma 7. If $k = \hbar$, we have $E\bigl(C_n(w) \mid X, Z_m\bigr) = R_n(w) + k\Gamma/n$.

Proof. Recalling the definition of $C_n(w)$, one gets

\[
\begin{aligned}
nC_n(w) &= (Y - \hat{\mu}(w))^T k (Y - \hat{\mu}(w)) + 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr) \\
&= \bigl[M(w)Y - \eta(w)\bigr]^T k \bigl[M(w)Y - \eta(w)\bigr] + 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr) \\
&= 2k\operatorname{tr}\bigl(P(w)\Omega_n\bigr) + k e^T M^T(w)M(w)e \\
&\quad + k\bigl(M(w)\mu - \eta(w)\bigr)^T M(w)e + k e^T M^T(w)\bigl(M(w)\mu - \eta(w)\bigr) \\
&\quad + \|M(w)\mu - \eta(w)\|_k^2. \tag{31}
\end{aligned}
\]

Observe that

\[
\begin{aligned}
k e^T M^T(w)M(w)e &= e^T(I - P(w))^T k (I - P(w))e \\
&= k e^T e - k e^T P^T(w)e - k e^T P(w)e + k e^T P^T(w)P(w)e. \tag{32}
\end{aligned}
\]

Notice that $E(k e^T e \mid X, Z_m) = k\operatorname{tr}(\Omega_n)$ and $E(k e^T P^T(w)e \mid X, Z_m) = k\operatorname{tr}(P^T(w)\Omega_n)$. Moreover, we have $E(k e^T P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n)$ and $E(k e^T P^T(w)P(w)e \mid X, Z_m) = k\operatorname{tr}(P(w)\Omega_n P^T(w))$.

Thus, taking conditional expectations on both sides of (31), we obtain

\[
nE\bigl(C_n(w) \mid X, Z_m\bigr) = k\operatorname{tr}(\Omega_n) + \|M(w)\mu - \eta(w)\|_k^2 + k\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr). \tag{33}
\]

It follows from (33) and Lemma 6 that Lemma 7 is established, as desired.

Lemma 7 shows that $C_n(w)$ equals the conditional expected average squared error plus an error bias. In particular, as $n$ approaches infinity, $C_n(w)$ is an unbiased estimator of $R_n(w)$.

The $k$-GIC criterion is designed to select the optimal weight vector $\hat{w}$, which is chosen by minimizing $C_n(w)$.

Obviously, the well-posed estimators of the parameters are $\hat{\Theta}_{\hat{w}}$ and $\hat{\Psi}_{\hat{w}}$ with the weight vector $\hat{w}$. Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:

\[
\frac{L_n(\hat{w})}{\inf_{w \in \mathcal{W}} L_n(w)} \xrightarrow{\;p\;} 1. \tag{34}
\]

If formula (34) holds, we know that the weight vector $\hat{w}$ selected by $C_n(w)$ realizes the minimum value of $L_n(w)$. In other words, $\hat{w}$ is equivalent to the weight vector selected by minimizing $L_n(w)$ and is the optimal weight vector for $\hat{\mu}(w)$. The asymptotic optimality of $C_n(w)$ can be established under the following assumptions.

Assumptions 1. Write $\alpha_n = \inf_{w \in \mathcal{W}} nR_n(w)$. As $n \to \infty$, we assume that

\[
\frac{\hbar^2}{\alpha_n} \longrightarrow 0, \tag{35}
\]

\[
\sum_{w \in \mathcal{W}} \frac{1}{(nR_n(w))^{N'}} \longrightarrow 0, \tag{36}
\]

for a large $\hbar$ and any positive integer $1 \le N' < \infty$.

The above assumptions have been employed in much of the model selection literature. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of $C_n(w)$.

Lemma 8. We have

\[
\|M(w)Y - \eta(w)\|_k^2 = nL_n(w) - 2\langle e^T, P(w)e\rangle_k + 2\langle e^T, M(w)\mu - \eta(w)\rangle_k + \|e\|_k^2. \tag{37}
\]

Proof. Using $\hat{\mu}(w) = P(w)Y + \eta(w)$, $Y = \mu + e$, and the definition of $L_n(w)$, one obtains

\[
\begin{aligned}
\|M(w)Y - \eta(w)\|_k^2 &= \|\mu + e - P(w)Y - \eta(w)\|_k^2 \\
&= \|e\|_k^2 + \|\mu - P(w)Y - \eta(w)\|_k^2 + 2\langle e^T, \mu - \eta(w) - P(w)Y\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\langle e^T, \mu - \eta(w) - P(w)(\mu + e)\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\langle e^T, (I - P(w))\mu - \eta(w) - P(w)e\rangle_k \\
&= \|e\|_k^2 + nL_n(w) + 2\langle e^T, M(w)\mu - \eta(w)\rangle_k - 2\langle e^T, P(w)e\rangle_k. \tag{38}
\end{aligned}
\]

This completes the proof of Lemma 8.

Lemma 8 and $\|Y - \hat{\mu}(w)\|_k^2 = \|M(w)Y - \eta(w)\|_k^2$ imply that

\[
C_n(w) = L_n(w) + \frac{2\langle e^T, M(w)\mu - \eta(w)\rangle_k}{n} + \frac{\|e\|_k^2}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_n)}{n} - \frac{2\langle e^T, P(w)e\rangle_k}{n}. \tag{39}
\]


The goal is to choose $\hat{w}$ by minimizing $C_n(w)$. Since the term $\|e\|_k^2/n$ in (39) does not depend on $w$, one only needs to select $\hat{w}$ by minimizing

\[
L_n(w) + \frac{2\langle e^T, M(w)\mu - \eta(w)\rangle_k}{n} + \frac{2\hbar\operatorname{tr}(P(w)\Omega_n) - 2\langle e^T, P(w)e\rangle_k}{n}, \tag{40}
\]

where $w \in \mathcal{W}$.

Compared with $L_n(w)$, it is sufficient to establish that $n^{-1}\langle e^T, M(w)\mu - \eta(w)\rangle_k$ and $n^{-1}\hbar\operatorname{tr}(P(w)\Omega_n) - n^{-1}\langle e^T, P(w)e\rangle_k$ are uniformly negligible for any $w \in \mathcal{W}$. More specifically, to prove (34), we need to check

\[
\sup_{w \in \mathcal{W}} \frac{\langle e^T, M(w)\mu - \eta(w)\rangle_k}{nR_n(w)} \xrightarrow{\;p\;} 0, \tag{41}
\]

\[
\sup_{w \in \mathcal{W}} \frac{\bigl|\hbar\operatorname{tr}(P(w)\Omega_n) - \langle e^T, P(w)e\rangle_k\bigr|}{nR_n(w)} \xrightarrow{\;p\;} 0, \tag{42}
\]

\[
\sup_{w \in \mathcal{W}} \left|\frac{L_n(w) - R_n(w)}{R_n(w)}\right| \xrightarrow{\;p\;} 0, \tag{43}
\]

where "$\xrightarrow{\;p\;}$" denotes convergence in probability. Following the idea of Li [29], we verify the main result of our work: minimizing the $k$-class generalized information criterion is asymptotically optimal. Now we state the main theorem.

Theorem 9. Under assumptions (35) and (36), minimizing the $k$-class generalized information criterion $C_n(w)$ is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the $k$-GIC requires showing that (41), (42), and (43) are valid.

First, we prove that (41) holds. For any $\zeta > 0$, by Chebyshev's inequality, one has

\[
\Pr\left\{\sup_{w \in \mathcal{W}} \frac{\bigl|\langle e^T, M(w)\mu - \eta(w)\rangle_k\bigr|}{nR_n(w)} > \zeta\right\} \le \sum_{w \in \mathcal{W}} \frac{E\bigl[\langle e^T, M(w)\mu - \eta(w)\rangle_k\bigr]^2}{(\zeta nR_n(w))^2}, \tag{44}
\]

which, by Lemma 1, is no greater than

\[
\zeta^{-2} k W \sum_{w \in \mathcal{W}} n^{-2}\|M(w)\mu - \eta(w)\|_k^2 \bigl(R_n(w)\bigr)^{-2}. \tag{45}
\]

Recalling the definition of $R_n(w)$, we get $\|M(w)\mu - \eta(w)\|_k^2 \le nR_n(w)$. Then (45) does not exceed $\zeta^{-2} k W \sum_{w \in \mathcal{W}} (nR_n(w))^{-1}$. By assumption (36), (45) tends to zero. Thus, (41) is established.

To prove (42), it suffices to verify that

\[
\frac{k\operatorname{tr}(P(w)\Omega_n) - \langle e^T, P(w)e\rangle_k}{nR_n(w)} \xrightarrow{\;p\;} 0, \tag{46}
\]

\[
\frac{(\hbar - k)\operatorname{tr}(P(w)\Omega_n)}{nR_n(w)} \xrightarrow{\;p\;} 0. \tag{47}
\]

By Chebyshev's inequality and Lemmas 4 and 5, for any $\epsilon > 0$, we have

\[
\begin{aligned}
\Pr\left\{\sup_{w \in \mathcal{W}} \left|\frac{\operatorname{tr}(P(w)\Omega_n) - \langle e^T, P(w)e\rangle}{nR_n(w)}\right| > \epsilon\right\}
&\le \sum_{w \in \mathcal{W}} \frac{E\bigl[\operatorname{tr}(P(w)\Omega_n) - \langle e^T, P(w)e\rangle\bigr]^2}{(\epsilon nR_n(w))^2} \\
&= \frac{1}{\epsilon^2} \sum_{w \in \mathcal{W}} \frac{\operatorname{Var}\bigl(e^T P(w)e\bigr)}{(nR_n(w))^2} \\
&\le \frac{2C_1\Gamma^2}{\epsilon^2} \sum_{w \in \mathcal{W}} \frac{1}{(nR_n(w))^2}. \tag{48}
\end{aligned}
\]

From (36), we know that (48) is close to zero. Thus, (46) holds.

Recalling Lemma 4 and (35), it yields

\[
\frac{\bigl|(\hbar - k)\operatorname{tr}(P(w)\Omega_n)\bigr|}{nR_n(w)} \le \frac{\bigl|C'(\hbar - k)\Gamma\bigr|}{nR_n(w)} \longrightarrow 0, \tag{49}
\]

which derives (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to

\[
\begin{aligned}
\|\mu - \hat{\mu}(w)\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2
&= \|\mu - \eta(w) - P(w)Y\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|\mu - \eta(w) - P(w)\mu - P(w)e\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|M(w)\mu - \eta(w) - P(w)e\|_k^2 - \|M(w)\mu - \eta(w)\|_k^2 \\
&= \|P(w)e\|_k^2 - 2\langle \mu^T M^T(w) - \eta^T(w), P(w)e\rangle_k. \tag{50}
\end{aligned}
\]

From identity (50), we know that

\[
L_n(w) - R_n(w) = -\frac{2\langle \mu^T M^T(w) - \eta^T(w), P(w)e\rangle_k}{n} + \frac{\|P(w)e\|_k^2 - k\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr)}{n}. \tag{51}
\]


Obviously, the proof of (43) needs to verify that

\[
\sup_{w \in \mathcal{W}} \frac{\|P(w)e\|_k^2 - k\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr)}{nR_n(w)} \xrightarrow{\;p\;} 0,
\qquad
\sup_{w \in \mathcal{W}} \frac{\langle \mu^T M^T(w) - \eta^T(w), P(w)e\rangle_k}{nR_n(w)} \xrightarrow{\;p\;} 0. \tag{52}
\]

To prove (52), it should be noticed that $\operatorname{tr}\bigl(P(w)\Omega_n P^T(w)\bigr) = E\bigl(e^T P^T(w)P(w)e\bigr)$ and $\|P(w)e\|_k^2 = \langle e^T P^T(w), P(w)e\rangle_k$. In addition,

\[
\|P(w)(M(w)\mu - \eta(w))\|^2 \le \lambda_{\max}^2(P(w))\|M(w)\mu - \eta(w)\|^2 \le \|M(w)\mu - \eta(w)\|^2. \tag{53}
\]

Thus, for any $\zeta'>0$ and $\varepsilon'>0$,
$$
\begin{aligned}
&\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\left|\|P(w)e\|_{k}^{2}-k\operatorname{tr}\!\left(P(w)\Omega_{n}P^{T}(w)\right)\right|}{nR_{n}(w)}>\zeta'\right\}\\
&\quad\le\sum_{w\in\mathcal{W}}\frac{E\!\left[\|P(w)e\|_{k}^{2}-kE\!\left(e^{T}P^{T}(w)P(w)e\right)\right]^{2}}{\left(nR_{n}(w)\zeta'\right)^{2}}
=\frac{k^{2}}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}\!\left(e^{T}P^{T}(w)P(w)e\right)}{\left(nR_{n}(w)\right)^{2}}\\
&\quad\le\frac{2kC_{1}\Gamma}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{k\operatorname{tr}\!\left(P(w)\Omega_{n}P^{T}(w)\right)}{\left(nR_{n}(w)\right)^{2}}
\le\frac{2kC_{1}\Gamma}{\zeta'^{2}}\sum_{w\in\mathcal{W}}\frac{1}{nR_{n}(w)},
\end{aligned}
\tag{54}
$$

$$
\begin{aligned}
&\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\left|\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\,P(w)e\rangle_{k}\right|}{nR_{n}(w)}>\varepsilon'\right\}
\le\sum_{w\in\mathcal{W}}\frac{E\!\left[\langle \mu^{T}M^{T}(w)-\eta^{T}(w),\,P(w)e\rangle_{k}\right]^{2}}{\left(nR_{n}(w)\varepsilon'\right)^{2}}\\
&\quad\le\sum_{w\in\mathcal{W}}\frac{\left\|P(w)\left(M(w)\mu-\eta(w)\right)\right\|_{k}^{2}\,k_{\mathcal{W}}}{\left(nR_{n}(w)\varepsilon'\right)^{2}}
\le\varepsilon'^{-2}k_{\mathcal{W}}\sum_{w\in\mathcal{W}}n^{-2}\left\|M(w)\mu-\eta(w)\right\|_{k}^{2}\left(R_{n}(w)\right)^{-2}.
\end{aligned}
\tag{55}
$$

Combining (36) and (45), one sees that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) all hold. This completes the proof of Theorem 9.

In practice, the covariance of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of $\Omega_{n}$ because of the special structure of $\Omega_{n}$. In the special case where the random errors are independent and identically distributed with variance $\sigma^{2}$, a consistent estimator of $\sigma^{2}$ can be built for the constrained models (7) and (8). Let $m=Q$ in $\hat{\mu}_{m}$ and $\tau'=\operatorname{tr}(P_{Q})$, where $Q$ corresponds to a "large" approximating model. Denote $\hat{\sigma}_{Q}^{2}=(n-\tau')^{-1}(Y-\hat{\mu}_{Q})^{T}(Y-\hat{\mu}_{Q})$. The next theorem shows that $\hat{\sigma}_{Q}^{2}$ is a consistent estimate of $\sigma^{2}$.

Theorem 10. If $\tau'/n\rightarrow 0$ as $\tau'\rightarrow\infty$ and $n\rightarrow\infty$, then $\hat{\sigma}_{Q}^{2}\xrightarrow{\;p\;}\sigma^{2}$ as $n\rightarrow\infty$.

Proof. Writing $\Lambda=D+XR^{-1}\widetilde{D}$, one obtains
$$
\frac{(Y-\hat{\mu}_{Q})^{T}(Y-\hat{\mu}_{Q})}{n-\tau'}
=\frac{e^{T}M_{Q}e}{n-\tau'}+\frac{\|M_{Q}\Lambda\|^{2}}{n-\tau'}+\frac{2e^{T}M_{Q}\Lambda}{n-\tau'},
\tag{56}
$$
where $M_{Q}=I-P_{Q}$.

Since $E(e^{T}M_{Q}e)=\sigma^{2}(n-\tau')$, it follows that
$$
E\left|e^{T}M_{Q}e-\sigma^{2}(n-\tau')\right|^{2}
=E\left|e^{T}M_{Q}e-E(e^{T}M_{Q}e)\right|^{2}
=\operatorname{Var}\left(e^{T}M_{Q}e\right)=2\sigma^{4}\operatorname{tr}(M_{Q}).
\tag{57}
$$

For any $\varepsilon>0$, it follows from (57) that
$$
\Pr\left\{\left|\frac{e^{T}M_{Q}e}{n-\tau'}-\sigma^{2}\right|>\varepsilon\right\}
\le\frac{E\left|e^{T}M_{Q}e-(n-\tau')\sigma^{2}\right|^{2}}{\varepsilon^{2}(n-\tau')^{2}}
\le\frac{2\sigma^{4}}{\varepsilon^{2}(n-\tau')}\longrightarrow 0.
\tag{58}
$$

Equation (58) implies that $(n-\tau')^{-1}(e^{T}M_{Q}e)\xrightarrow{\;p\;}\sigma^{2}$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_{Q}$. Then $0\le 1-\mathcal{I}_{Qt}<1$ is the $t$th diagonal element of $M_{Q}$ and satisfies $\sum_{t=1}^{n}(1-\mathcal{I}_{Qt})=n-\tau'$. Because $\phi(z_{il},\psi_{l})$ and $\phi(a_{sk},\psi_{k})$ converge in mean square, we have $E(\Lambda_{Qt}^{2})\rightarrow 0$ as $\tau'\rightarrow\infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t=1,\ldots,n$. Notice that

$$
\begin{aligned}
\frac{E\left(\Lambda^{T}M_{Q}\Lambda\right)}{n-\tau'}
&=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\left(\Lambda_{Qt}^{2}M_{Q,tt}\right)
=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\left(\Lambda_{Qt}^{2}\right)\left[1-\mathcal{I}_{Qt}\right]\\
&\le\max_{t}E\left(\Lambda_{Qt}^{2}\right)\frac{1}{n-\tau'}\sum_{t=1}^{n}\left[1-\mathcal{I}_{Qt}\right]
=\max_{t}E\left(\Lambda_{Qt}^{2}\right)\longrightarrow 0.
\end{aligned}
\tag{59}
$$

The above expression implies that the second term on the right-hand side of (56) approaches zero. By an argument similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.
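Theorem 10 is easy to check numerically. The sketch below (Python; the design matrix and the ordinary least-squares fit standing in for the "large" approximating model are illustrative assumptions, not taken from the paper) computes $\hat{\sigma}_{Q}^{2}=(n-\tau')^{-1}(Y-\hat{\mu}_{Q})^{T}(Y-\hat{\mu}_{Q})$, where $\tau'=\operatorname{tr}(P_{Q})$ equals the number of regressors for a full-rank OLS fit:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 2000, 5, 1.0          # sample size, regressors in the "large" model, true sd

X = rng.standard_normal((n, p))     # illustrative design of the approximating model
beta = rng.standard_normal(p)
Y = X @ beta + sigma * rng.standard_normal(n)

# mu_hat = P_Q Y with P_Q = X (X'X)^{-1} X'; tau' = tr(P_Q) = p here
mu_hat = X @ np.linalg.solve(X.T @ X, X.T @ Y)
tau = p
sigma2_hat = (Y - mu_hat) @ (Y - mu_hat) / (n - tau)

print(round(sigma2_hat, 2))         # close to sigma^2 = 1 for large n
```

As $n$ grows with $\tau'/n\to 0$, the estimate concentrates around $\sigma^{2}$, consistent with the theorem.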


In the case of independently and identically distributed errors, if we replace $\sigma^{2}$ by $\hat{\sigma}_{Q}^{2}$ in the $k$-GIC, the $k$-GIC simplifies to
$$
C_{n}(w)=\frac{\|Y-\hat{\mu}(w)\|_{k}^{2}}{n}+\frac{2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w))}{n}.
\tag{60}
$$

Here we intend to illustrate that the model selection procedure of minimizing $C_{n}(w)$ is also asymptotically optimal.
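To make the weight-selection step concrete, here is a minimal Python sketch of minimizing a criterion of the form (60). It is an illustration under simplifying assumptions not made in the paper: two nested OLS submodels, $P(w)=wP_{1}+(1-w)P_{2}$ with $\eta(w)=0$, and $k=\hbar=1$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.standard_normal((n, 4))
Y = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)

def hat(Z):                          # OLS hat matrix of a candidate submodel
    return Z @ np.linalg.solve(Z.T @ Z, Z.T)

P1, P2 = hat(X[:, :2]), hat(X)       # small and "large" submodels (illustrative)

# sigma^2 estimated from the large model, in the spirit of Theorem 10
tau = np.trace(P2)
sigma2 = Y @ (np.eye(n) - P2) @ Y / (n - tau)

grid = np.linspace(0.0, 1.0, 21)     # candidate weights on the small submodel
def C_n(w):
    Pw = w * P1 + (1 - w) * P2
    resid = Y - Pw @ Y
    return (resid @ resid) / n + 2 * sigma2 * np.trace(Pw) / n

w_best = min(grid, key=C_n)
print(w_best, round(C_n(w_best), 3))
```

The selected weight trades residual fit against the effective number of parameters $\operatorname{tr}(P(w))$, which is the mechanism the optimality results above formalize.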

Theorem 11. Assume that the random error $e$ is i.i.d. with mean zero and variance $\sigma^{2}$. Under conditions (35) and (36), $C_{n}(w)$ is still asymptotically valid.

Proof. Using a technique similar to the derivation of (39), we obtain
$$
\begin{aligned}
nC_{n}(w)&=\|Y-\hat{\mu}(w)\|_{k}^{2}+2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w))\\
&=nL_{n}(w)+2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}
+2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}(P(w))-2\langle e^{T},P(w)e\rangle_{k}+\|e\|_{k}^{2}\\
&=2\hbar\left(\hat{\sigma}_{Q}^{2}-\sigma^{2}\right)\operatorname{tr}(P(w))+2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}\\
&\quad+nL_{n}(w)+2\hbar\sigma^{2}\operatorname{tr}(P(w))-2\langle e^{T},P(w)e\rangle_{k}+\|e\|_{k}^{2}.
\end{aligned}
\tag{61}
$$

With an appropriate modification of the proofs of (41)-(43), one only needs to verify
$$
\sup_{w\in\mathcal{W}}\frac{\hbar\left|\sigma^{2}-\hat{\sigma}_{Q}^{2}\right|\operatorname{tr}(P(w))}{nR_{n}(w)}\xrightarrow{\;p\;}0,
\tag{62}
$$
which is equivalent to showing
$$
\sup_{w\in\mathcal{W}}\frac{\hbar^{2}\left((1/n)\left|\sigma^{2}-\hat{\sigma}_{Q}^{2}\right|\operatorname{tr}(P(w))\right)^{2}}{R_{n}^{2}(w)}\xrightarrow{\;p\;}0.
\tag{63}
$$

From the proof of Theorem 10, we have
$$
\begin{aligned}
\Pr\left\{\left|\frac{\hbar e^{T}M_{Q}e}{n-\tau'}-\hbar\sigma^{2}\right|>\varepsilon' R_{n}^{-1/2}(w)\right\}
&\le\frac{\hbar^{2}E\left|e^{T}M_{Q}e-(n-\tau')\sigma^{2}\right|^{2}}{\varepsilon'^{2}(n-\tau')^{2}R_{n}(w)}\\
&\le\frac{2\hbar^{2}\sigma^{4}}{\varepsilon'^{2}(n-\tau')R_{n}(w)},
\end{aligned}
\tag{64}
$$

where $\varepsilon'>0$. By assumption (35), the right-hand side above tends to zero. Thus we obtain $R_{n}(w)^{-1}\left|\hbar\sigma^{2}-\hbar\hat{\sigma}_{Q}^{2}\right|^{2}\xrightarrow{\;p\;}0$. It follows from Lemma 3 and the definition of $R_{n}(w)$ that $\left(n^{-1}\operatorname{tr}(P(w))\right)^{2}$ is no greater than $R_{n}(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $C_{n}(w)$ is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is

$$
y_{i}=x_{i}\theta+\sum_{l=1}^{1000}z_{il}\psi_{l}+e_{i},
\quad\text{subject to}\quad
\theta=-\sum_{l=1}^{1000}\psi_{l}+1,
\qquad i=1,\ldots,n,
\tag{65}
$$

where $x_{i}$ follows the $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l=2,\ldots,1000$, are independently and identically distributed $N(0,1)$. The parameters $\psi_{l}$, $l=1,\ldots,1000$, are determined by the rule $\psi_{l}=\zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^{2}=\zeta^{2}/(1+\zeta^{2})$. The error $e_{i}$ is independent of $x_{i}$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_{i}$ is independently and identically distributed $N(0,1)$.

Case 2. $e_{1},\ldots,e_{n}$ obey a multivariate normal distribution $N(0,\Omega_{n})$, where $\Omega_{n}$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_{n}$ is $\sigma_{j}^{2}$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\neq l$) off-diagonal element of $\Omega_{n}$ is $\Omega_{jl,n}=\exp(-0.5|x_{ij}-x_{il}|^{2})$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_{i}$, respectively.

The sample size is varied over $n=50$, $100$, $150$, and $200$. The number of models is determined by $M=\lfloor 3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^{2}$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi=500$. For the $k$-GIC, the value of $k$ is one, and $\hbar$ adopts the effective number of parameters.

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV, [17]); (4) the Mallows model averaging estimator (MMA, [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as
$$
\mathrm{AIC}_{m}=2n\ln\left(\frac{1}{n}\|Y-\hat{\mu}_{m}\|^{2}\right)+2\mathcal{K}_{m},
\qquad
\mathrm{BIC}_{m}=2n\ln\left(\frac{1}{n}\|Y-\hat{\mu}_{m}\|^{2}\right)+\ln(n)\,\mathcal{K}_{m}.
\tag{66}
$$

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_{\ell},x_{\ell},z_{\ell}\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is
$$
\mathrm{PE}(\pi)=\frac{1}{100}\sum_{\ell=1}^{100}\left(y_{\ell}-\hat{\mu}_{\ell}(\hat{w})\right)^{2},
\tag{67}
$$


Figure 1: Results for the Monte Carlo design with homoscedastic errors. Panels (a)-(d) plot prediction error against $R^{2}$ for $n=50$ ($M=11$), $n=100$ ($M=14$), $n=150$ ($M=16$), and $n=200$ ($M=18$), comparing CV, MMA, AIC, BIC, and $k$-GIC.

where $\hat{w}$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated as
$$
\mathrm{PE}=\frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi),
\tag{68}
$$
where $\Pi=500$ is the number of replications. Obviously, a smaller PE implies a better estimation method. We consider PE under homoscedastic errors first. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^{2}$ on the $x$-axis.

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^{2}$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^{2}$, the AIC incurs a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^{2}$ increases. In many situations, the PEs of the BIC estimator with a large $R^{2}$ are quite poor relative to the other methods.

Next we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic case. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both average estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^{2}$ space.
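The evaluation in (67) and (68) is just an average of out-of-sample squared errors over replications. A minimal sketch (Python, with a hypothetical `predict` function standing in for a fitted model-average estimator and an invented true regression):

```python
import numpy as np

rng = np.random.default_rng(4)

def prediction_error(predict, n_out=100, n_reps=500):
    """Average out-of-sample squared error, in the spirit of (67)-(68)."""
    pe = np.empty(n_reps)
    for r in range(n_reps):
        x = rng.standard_normal(n_out)          # illustrative out-of-sample draws
        y = 2.0 * x + rng.standard_normal(n_out)
        pe[r] = np.mean((y - predict(x)) ** 2)  # PE(pi) as in (67)
    return pe.mean()                            # PE as in (68)

print(prediction_error(lambda x: 2.0 * x))      # close to the noise variance 1.0
```

A perfect predictor leaves only the irreducible noise variance, so PE values near 1 here mirror the lower envelope of the curves in Figures 1 and 2.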


Figure 2: Results for the Monte Carlo design with correlated errors. Panels (a)-(d) plot prediction error against $R^{2}$ for $n=50$ ($M=11$), $n=100$ ($M=14$), $n=150$ ($M=16$), and $n=200$ ($M=18$), comparing CV, MMA, AIC, BIC, and $k$-GIC.

When $R^{2}\le 0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with CV a close second over a large region of the $R^{2}$ space. Moreover, their relative efficiency depends closely on the sample size, with the BIC estimator revealing increasing PE, and the remaining four estimators showing decreasing PE, as $n$ increases.

6. Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are cumbersome to compute and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper attempts to develop a statistical procedure for the selection problem of an optimal tracking portfolio. We build theoretical models of tracking portfolios as constrained linear models. The selection of an optimal portfolio then boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problem of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal


weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulations, the performance of the $k$-GIC is compared against that of four other methods. The $k$-GIC is found to give the best performance in most cases.

Two limitations of our results are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model could be selected by the methods of linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control could be applied to study the inference of the parametric estimators. Besides, the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356-1378, 2000.
[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928-961, 2004.
[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108-1126, 2008.
[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270-290, 2009.
[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905-919, 2010.
[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, Model Selection in Equations with Many Small Effects, Oxford Bulletin of Economics and Statistics, 2013.
[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.
[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519-536, 2002.
[9] W. S. Rodney and K. D. Herman, Bayesian Model Selection with an Uninformative Prior, Oxford Bulletin of Economics and Statistics, 2003.
[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879-899, 2003.
[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175-1189, 2007.
[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053-1066, 2011.
[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194-200, 2011.
[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38-46, 2012.
[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267-281, Akademiai Kiado, Budapest, Hungary, 1973.
[16] C. L. Mallows, "Some comments on $C_P$," Technometrics, vol. 15, no. 4, pp. 661-675, 1973.
[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111-147, 1974.
[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.
[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377-403, 1979.
[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543-564, 1999.
[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900-945, 2003, with discussions and a rejoinder by the authors.
[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312-323, 2010.
[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003-3030, 2012.
[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742-762, 2013.
[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37-63, 2013.
[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757-766, 2008.
[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221-264, 1997, with comments and a rejoinder by the author.
[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359-377, 1991.
[29] K.-C. Li, "Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958-975, 1987.
[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478-493, 1993.


Lemma 7. If $k=\hbar$, we have $E\left(C_{n}(w)\mid X,Z_{m}\right)=R_{n}(w)+k\operatorname{tr}(\Omega_{n})/n$.

Proof. Recalling the definition of $C_{n}(w)$, one gets
$$
\begin{aligned}
nC_{n}(w)&=(Y-\hat{\mu}(w))^{T}k(Y-\hat{\mu}(w))+2k\operatorname{tr}(P(w)\Omega_{n})\\
&=\left[M(w)Y-\eta(w)\right]^{T}k\left[M(w)Y-\eta(w)\right]+2k\operatorname{tr}(P(w)\Omega_{n})\\
&=2k\operatorname{tr}(P(w)\Omega_{n})+ke^{T}M^{T}(w)M(w)e\\
&\quad+k\left(M(w)\mu-\eta(w)\right)^{T}M(w)e+ke^{T}M^{T}(w)\left(M(w)\mu-\eta(w)\right)\\
&\quad+\|M(w)\mu-\eta(w)\|_{k}^{2}.
\end{aligned}
\tag{31}
$$

Observe that
$$
\begin{aligned}
ke^{T}M^{T}(w)M(w)e&=e^{T}(I-P(w))^{T}k(I-P(w))e\\
&=ke^{T}e-ke^{T}P^{T}(w)e-ke^{T}P(w)e+ke^{T}P^{T}(w)P(w)e.
\end{aligned}
\tag{32}
$$

Notice that $E(ke^{T}e\mid X,Z_{m})=k\operatorname{tr}(\Omega_{n})$ and $E(ke^{T}P^{T}(w)e\mid X,Z_{m})=k\operatorname{tr}(P^{T}(w)\Omega_{n})$. Moreover, we have $E(ke^{T}P(w)e\mid X,Z_{m})=k\operatorname{tr}(P(w)\Omega_{n})$ and $E(ke^{T}P^{T}(w)P(w)e\mid X,Z_{m})=k\operatorname{tr}(P(w)\Omega_{n}P^{T}(w))$.

Thus, taking conditional expectations on both sides of (31), we obtain
$$
nE\left(C_{n}(w)\mid X,Z_{m}\right)
=k\operatorname{tr}(\Omega_{n})+\|M(w)\mu-\eta(w)\|_{k}^{2}+k\operatorname{tr}\left(P(w)\Omega_{n}P^{T}(w)\right).
\tag{33}
$$

It follows from (33) and Lemma 6 that Lemma 7 is established, as desired.
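The conditional-expectation identities used in this proof all reduce to the fact that $E(e^{T}Pe)=\operatorname{tr}(P\Omega_{n})$ for a fixed matrix $P$ when $E(e)=0$ and $\operatorname{Var}(e)=\Omega_{n}$. A quick Monte Carlo check (Python, with small illustrative matrices):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))
Omega = A @ A.T + n * np.eye(n)     # a positive definite covariance for e
P = rng.standard_normal((n, n))     # an arbitrary fixed matrix

L = np.linalg.cholesky(Omega)
draws = 200_000
E = L @ rng.standard_normal((n, draws))   # columns are e ~ N(0, Omega)
quad = ((P @ E) * E).sum(axis=0)          # e^T P e for every column

print(quad.mean(), np.trace(P @ Omega))   # sample mean matches tr(P Omega)
```

The same identity with $P$ replaced by $P^{T}(w)P(w)$ gives the last expectation above.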

Lemma 7 shows that $C_{n}(w)$ equals the conditional expected average squared error plus a bias term. In particular, as $n$ approaches infinity, $C_{n}(w)$ becomes an unbiased estimate of $R_{n}(w)$.

The $k$-GIC criterion is defined so as to select the optimal weight vector $\hat{w}$, which is chosen by minimizing $C_{n}(w)$.

Obviously, the well-posed estimators of the parameters are $\hat{\Theta}_{\hat{w}}$ and $\hat{\Psi}_{\hat{w}}$ with the weight vector $\hat{w}$. Under some regularity conditions, we intend to demonstrate that the selection procedure is asymptotically optimal in the following sense:
$$
\frac{L_{n}(\hat{w})}{\inf_{w\in\mathcal{W}}L_{n}(w)}\xrightarrow{\;p\;}1.
\tag{34}
$$
If formula (34) holds, the weight vector $\hat{w}$ selected by $C_{n}(w)$ realizes the minimum value of $L_{n}(w)$. In other words, $\hat{w}$ is equivalent to the weight vector selected by minimizing $L_{n}(w)$ and is the optimal weight vector for $\hat{\mu}(w)$. The asymptotic optimality of $C_{n}(w)$ can be established under the following assumptions.

Assumptions 1. Write $\alpha_{n}=\inf_{w\in\mathcal{W}}nR_{n}(w)$. As $n\rightarrow\infty$, we assume that
$$
\hbar^{2}\alpha_{n}^{-1}\longrightarrow 0,
\tag{35}
$$
$$
\sum_{w\in\mathcal{W}}\frac{1}{\left(nR_{n}(w)\right)^{N'}}\longrightarrow 0,
\tag{36}
$$
for a large $\hbar$ and any positive integer $1\le N'<\infty$.

The above assumptions have been employed in many works on model selection. For instance, expression (35) was used by Shao [27], and formula (36) was adopted by Li [29], Andrews [28], Shao [27], and Hansen [11].

The following lemma offers a bridge for proving the asymptotic optimality of $C_{n}(w)$.

Lemma 8. We have
$$
\|M(w)Y-\eta(w)\|_{k}^{2}
=nL_{n}(w)-2\langle e^{T},P(w)e\rangle_{k}
+2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}+\|e\|_{k}^{2}.
\tag{37}
$$

Proof. Using $\hat{\mu}(w)=P(w)Y+\eta(w)$, $Y=\mu+e$, and the definition of $L_{n}(w)$, one obtains
$$
\begin{aligned}
\|M(w)Y-\eta(w)\|_{k}^{2}&=\|\mu+e-P(w)Y-\eta(w)\|_{k}^{2}\\
&=\|e\|_{k}^{2}+\|\mu-P(w)Y-\eta(w)\|_{k}^{2}+2\langle e^{T},\mu-\eta(w)-P(w)Y\rangle_{k}\\
&=\|e\|_{k}^{2}+nL_{n}(w)+2\langle e^{T},\mu-\eta(w)-P(w)(\mu+e)\rangle_{k}\\
&=\|e\|_{k}^{2}+nL_{n}(w)+2\langle e^{T},(I-P(w))\mu-\eta(w)-P(w)e\rangle_{k}\\
&=\|e\|_{k}^{2}+nL_{n}(w)+2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}-2\langle e^{T},P(w)e\rangle_{k}.
\end{aligned}
\tag{38}
$$

This completes the proof of Lemma 8.

Lemma 8 and $\|Y-\hat{\mu}(w)\|_{k}^{2}=\|M(w)Y-\eta(w)\|_{k}^{2}$ imply that
$$
C_{n}(w)=L_{n}(w)
+\frac{2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}}{n}
+\frac{\|e\|_{k}^{2}}{n}
+\frac{2\hbar\operatorname{tr}(P(w)\Omega_{n})}{n}
-\frac{2\langle e^{T},P(w)e\rangle_{k}}{n}.
\tag{39}
$$


The goal is to choose $\hat{w}$ by minimizing $C_{n}(w)$. From (39), one only needs to select $\hat{w}$ by minimizing
$$
L_{n}(w)
+\frac{2\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}}{n}
+\frac{2\hbar\operatorname{tr}(P(w)\Omega_{n})-2\langle e^{T},P(w)e\rangle_{k}}{n},
\tag{40}
$$

where $w\in\mathcal{W}$. Compared with $L_{n}(w)$, it is sufficient to establish that $n^{-1}\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}$ and $n^{-1}\hbar\operatorname{tr}(P(w)\Omega_{n})-n^{-1}\langle e^{T},P(w)e\rangle_{k}$ are uniformly negligible for any $w\in\mathcal{W}$. More specifically, to prove (34), we need to check that
$$
\sup_{w\in\mathcal{W}}\frac{\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}}{nR_{n}(w)}\xrightarrow{\;p\;}0,
\tag{41}
$$
$$
\sup_{w\in\mathcal{W}}\frac{\left|\hbar\operatorname{tr}(P(w)\Omega_{n})-\langle e^{T},P(w)e\rangle_{k}\right|}{nR_{n}(w)}\xrightarrow{\;p\;}0,
\tag{42}
$$
$$
\sup_{w\in\mathcal{W}}\left|\frac{L_{n}(w)-R_{n}(w)}{R_{n}(w)}\right|\xrightarrow{\;p\;}0,
\tag{43}
$$

where "$\xrightarrow{\;p\;}$" denotes convergence in probability.

Following the idea of Li [29], we establish the main result of our work: minimizing the $k$-class generalized information criterion is asymptotically optimal. We now state the main theorem.

Theorem 9. Under assumptions (35) and (36), minimizing the $k$-class generalized information criterion $C_{n}(w)$ is asymptotically optimal; namely, (34) holds.

Proof. The asymptotic optimality of the $k$-GIC requires showing that (41), (42), and (43) are valid.

Firstly, we prove that (41) holds. For any $\zeta>0$, by Chebyshev's inequality one has
$$
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\left|\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}\right|}{nR_{n}(w)}>\zeta\right\}
\le\sum_{w\in\mathcal{W}}\frac{E\left[\langle e^{T},M(w)\mu-\eta(w)\rangle_{k}\right]^{2}}{\left(\zeta nR_{n}(w)\right)^{2}},
\tag{44}
$$
which by Lemma 1 is no greater than
$$
\zeta^{-2}k_{\mathcal{W}}\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu-\eta(w)\|_{k}^{2}\left(R_{n}(w)\right)^{-2}.
\tag{45}
$$
Recalling the definition of $R_{n}(w)$, we get $\|M(w)\mu-\eta(w)\|_{k}^{2}\le nR_{n}(w)$. Then (45) does not exceed $\zeta^{-2}k_{\mathcal{W}}\sum_{w\in\mathcal{W}}(nR_{n}(w))^{-1}$. By assumption (36), (45) tends to zero. Thus, (41) is established.

To prove (42), it suffices to show that
$$
\frac{k\operatorname{tr}(P(w)\Omega_{n})-\langle e^{T},P(w)e\rangle_{k}}{nR_{n}(w)}\xrightarrow{\;p\;}0,
\tag{46}
$$
$$
\frac{(\hbar-k)\operatorname{tr}(P(w)\Omega_{n})}{nR_{n}(w)}\xrightarrow{\;p\;}0.
\tag{47}
$$

By Chebyshev's inequality and Lemmas 4 and 5, for any $\epsilon>0$ we have
$$
\begin{aligned}
&\Pr\left\{\sup_{w\in\mathcal{W}}\left|\frac{\operatorname{tr}(P(w)\Omega_{n})-\langle e^{T},P(w)e\rangle}{nR_{n}(w)}\right|>\epsilon\right\}
\le\sum_{w\in\mathcal{W}}\frac{E\left[\operatorname{tr}(P(w)\Omega_{n})-\langle e^{T},P(w)e\rangle\right]^{2}}{\left(\epsilon nR_{n}(w)\right)^{2}}\\
&\quad=\frac{1}{\epsilon^{2}}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}\left(e^{T}P(w)e\right)}{\left(nR_{n}(w)\right)^{2}}
\le\frac{2C_{1}\Gamma^{2}}{\epsilon^{2}}\sum_{w\in\mathcal{W}}\frac{1}{\left(nR_{n}(w)\right)^{2}}.
\end{aligned}
\tag{48}
$$

From (36) we know that (48) is close to zero Thus (46)is reasonable

Recalling Lemma 4 and (35) yields

$$\frac{\bigl|(\hbar-k)\operatorname{tr}\bigl(P(w)\Omega_n\bigr)\bigr|}{nR_n(w)}\le\frac{\bigl|C'(\hbar-k)\Gamma\bigr|}{nR_n(w)}\longrightarrow 0,\tag{49}$$

which implies (47). Then (42) is proved.

Next, we show that expression (43) also holds. A straightforward calculation leads to

$$\begin{aligned}\bigl\|\mu-\hat\mu(w)\bigr\|_k^2-\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2&=\bigl\|\mu-\eta(w)-P(w)Y\bigr\|_k^2-\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\&=\bigl\|\mu-\eta(w)-P(w)\mu-P(w)e\bigr\|_k^2-\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\&=\bigl\|M(w)\mu-\eta(w)-P(w)e\bigr\|_k^2-\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\\&=\bigl\|P(w)e\bigr\|_k^2-2\bigl\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\bigr\rangle_k.\end{aligned}\tag{50}$$

From the identity (50), we know that

$$L_n(w)-R_n(w)=-\frac{2\bigl\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\bigr\rangle_k}{n}+\frac{\bigl\|P(w)e\bigr\|_k^2-k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{n}.\tag{51}$$

8 Abstract and Applied Analysis

Obviously, the proof of (43) requires verifying that

$$\sup_{w\in\mathcal{W}}\frac{\bigl|\|P(w)e\|_k^2-k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)\bigr|}{nR_n(w)}\xrightarrow{\,p\,}0,\qquad \sup_{w\in\mathcal{W}}\frac{\bigl|\bigl\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\bigr\rangle_k\bigr|}{nR_n(w)}\xrightarrow{\,p\,}0.\tag{52}$$

To prove (52), it should be noticed that $\operatorname{tr}(P(w)\Omega_nP^{T}(w))=E(e^{T}P^{T}(w)P(w)e)$ and $\|P(w)e\|_k^2=\langle e^{T}P^{T}(w),P(w)e\rangle_k$. In addition,

$$\bigl\|P(w)\bigl(M(w)\mu-\eta(w)\bigr)\bigr\|^2\le\lambda_{\max}^2\bigl(P(w)\bigr)\bigl\|M(w)\mu-\eta(w)\bigr\|^2\le\bigl\|M(w)\mu-\eta(w)\bigr\|^2.\tag{53}$$

Thus, for any $\zeta'>0$ and $\epsilon'>0$,

$$\begin{aligned}\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_k^2-k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{nR_n(w)}>\zeta'\right\}&\le\sum_{w\in\mathcal{W}}\frac{E\bigl[\|P(w)e\|_k^2-kE\bigl(e^{T}P^{T}(w)P(w)e\bigr)\bigr]^2}{\bigl(nR_n(w)\zeta'\bigr)^2}\\&=\frac{k^2}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}\bigl(e^{T}P^{T}(w)P(w)e\bigr)}{\bigl(nR_n(w)\bigr)^2}\\&\le\frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{k\operatorname{tr}\bigl(P(w)\Omega_nP^{T}(w)\bigr)}{\bigl(nR_n(w)\bigr)^2}\\&\le\frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{1}{nR_n(w)},\end{aligned}\tag{54}$$

$$\begin{aligned}\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\bigl\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\bigr\rangle_k}{nR_n(w)}>\epsilon'\right\}&\le\sum_{w\in\mathcal{W}}\frac{E\bigl[\bigl\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\bigr\rangle_k\bigr]^2}{\bigl(nR_n(w)\epsilon'\bigr)^2}\\&\le\sum_{w\in\mathcal{W}}\frac{\bigl\|P(w)\bigl(M(w)\mu-\eta(w)\bigr)\bigr\|_k^2\,kW}{\bigl(nR_n(w)\epsilon'\bigr)^2}\\&\le\epsilon'^{-2}kW\sum_{w\in\mathcal{W}}n^{-2}\bigl\|M(w)\mu-\eta(w)\bigr\|_k^2\bigl(R_n(w)\bigr)^{-2}.\end{aligned}\tag{55}$$

Combining (36) and (45), one knows that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that expressions (41), (42), and (43) all hold. This completes the proof of Theorem 9.

In practice, the covariance of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate of $\Omega_n$ because of the special structure of $\Omega_n$. In the special case when the random errors are independent and identically distributed with variance $\sigma^2$, a consistent estimator of $\sigma^2$ can be built for the constrained models (7) and (8). Let $m=Q$ in $\hat\mu_m$ and $\tau'=\operatorname{tr}(P_Q)$, where $Q$ corresponds to a "large" approximating model. Denote $\hat\sigma_Q^2=(n-\tau')^{-1}(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)$. The coming theorem shows that $\hat\sigma_Q^2$ is a consistent estimate of $\sigma^2$.

Theorem 10. If $\tau'/n\to 0$ as $\tau'\to\infty$ and $n\to\infty$, then $\hat\sigma_Q^2\xrightarrow{\,p\,}\sigma^2$ as $n\to\infty$.

Proof. Writing $\Lambda=D+XR^{-1}\widetilde{D}$, one obtains

$$\frac{(Y-\hat\mu_Q)^{T}(Y-\hat\mu_Q)}{n-\tau'}=\frac{e^{T}M_Qe}{n-\tau'}+\frac{\|M_Q\Lambda\|^2}{n-\tau'}+\frac{2e^{T}M_Q\Lambda}{n-\tau'},\tag{56}$$

where $M_Q=I-P_Q$.

Since $E(e^{T}M_Qe)=\sigma^2(n-\tau')$, it follows that

$$E\bigl|e^{T}M_Qe-\sigma^2(n-\tau')\bigr|^2=E\bigl|e^{T}M_Qe-E\bigl(e^{T}M_Qe\bigr)\bigr|^2=\operatorname{Var}\bigl(e^{T}M_Qe\bigr)=2\sigma^4\operatorname{tr}(M_Q).\tag{57}$$

For any $\varepsilon>0$, it follows from (57) that

$$\Pr\left\{\left|\frac{e^{T}M_Qe}{n-\tau'}-\sigma^2\right|>\varepsilon\right\}\le\frac{E\bigl|e^{T}M_Qe-(n-\tau')\sigma^2\bigr|^2}{\varepsilon^2(n-\tau')^2}\le\frac{2\sigma^4}{\varepsilon^2(n-\tau')}\longrightarrow 0.\tag{58}$$

Equation (58) implies that $(n-\tau')^{-1}(e^{T}M_Qe)\xrightarrow{\,p\,}\sigma^2$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_Q$. Then $0\le 1-\mathcal{I}_{Qt}<1$ is the $t$th diagonal element of $M_Q$ and satisfies $\sum_{t=1}^{n}(1-\mathcal{I}_{Qt})=n-\tau'$.

Because $\phi(z_{il},\psi_l)$ and $\phi(a_{sk},\psi_k)$ converge in mean square, we have $E(\Lambda_{Qt}^2)\to 0$ as $\tau'\to\infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t=1,\ldots,n$. Notice that

$$\begin{aligned}\frac{E\bigl(\Lambda^{T}M_Q\Lambda\bigr)}{n-\tau'}&=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\bigl(\Lambda_{Qt}^2M_{Q,tt}\bigr)=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\bigl(\Lambda_{Qt}^2\bigr)\bigl[1-\mathcal{I}_{Qt}\bigr]\\&\le\max_{t}E\bigl(\Lambda_{Qt}^2\bigr)\frac{1}{n-\tau'}\sum_{t=1}^{n}\bigl[1-\mathcal{I}_{Qt}\bigr]=\max_{t}E\bigl(\Lambda_{Qt}^2\bigr)\longrightarrow 0.\end{aligned}\tag{59}$$

The above expression implies that the second term on the right-hand side of (56) approaches zero. By an argument similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.
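Theorem 10's estimator can be computed directly from the residuals of a large approximating model. The sketch below is a minimal illustration under simplifying assumptions: the model $Q$ is taken to be an ordinary least-squares fit on a hypothetical regressor matrix `Z` (a stand-in for the constrained fit), so that $P_Q$ is the usual hat matrix and $\tau'=\operatorname{tr}(P_Q)$.

```python
import numpy as np

def sigma2_hat(Y, Z):
    """Variance estimate from a 'large' approximating model Q:
    sigma2_Q = (Y - mu_Q)'(Y - mu_Q) / (n - tau'),  tau' = tr(P_Q)."""
    P = Z @ np.linalg.pinv(Z)        # hat matrix P_Q of model Q
    tau = np.trace(P)                # tau' = tr(P_Q)
    resid = Y - P @ Y                # M_Q Y with M_Q = I - P_Q
    return float(resid @ resid) / (len(Y) - tau)

# sanity check: true error variance is 1 and tau'/n = 20/400 is small,
# so the estimate should be close to 1
rng = np.random.default_rng(0)
n, q = 400, 20
Z = rng.standard_normal((n, q))
Y = Z @ rng.standard_normal(q) + rng.standard_normal(n)
s2 = sigma2_hat(Y, Z)
```

Consistency here mirrors the theorem: the bias terms vanish as $\tau'\to\infty$ with $\tau'/n\to 0$.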


In the case of independently and identically distributed errors, if we replace $\sigma^2$ by $\hat\sigma_Q^2$ in the $k$-GIC, the $k$-GIC can be simplified to

$$C_n(w)=\frac{\bigl\|Y-\hat\mu(w)\bigr\|_k^2}{n}+\frac{2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr)}{n}.\tag{60}$$

Here we intend to illustrate that the model selection procedure of minimizing this $C_n(w)$ is also asymptotically optimal.

Theorem 11. Assume that the random errors $e$ are i.i.d. with mean zero and variance $\sigma^2$. Under conditions (35) and (36), minimizing $C_n(w)$ in (60) is still asymptotically optimal.

Proof. Using a technique similar to the derivation of (39), we obtain

$$\begin{aligned}nC_n(w)&=\bigl\|Y-\hat\mu(w)\bigr\|_k^2+2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr)\\&=nL_n(w)+2\bigl\langle e^{T},M(w)\mu-\eta(w)\bigr\rangle_k+2\hbar\hat\sigma_Q^2\operatorname{tr}\bigl(P(w)\bigr)-2\bigl\langle e^{T},P(w)e\bigr\rangle_k+\|e\|_k^2\\&=2\hbar\bigl(\hat\sigma_Q^2-\sigma^2\bigr)\operatorname{tr}\bigl(P(w)\bigr)+2\bigl\langle e^{T},M(w)\mu-\eta(w)\bigr\rangle_k\\&\quad+nL_n(w)+2\hbar\sigma^2\operatorname{tr}\bigl(P(w)\bigr)-2\bigl\langle e^{T},P(w)e\bigr\rangle_k+\|e\|_k^2.\end{aligned}\tag{61}$$

With an appropriate modification of the proofs of (41)–(43), one only needs to verify

$$\sup_{w\in\mathcal{W}}\frac{\hbar\bigl|\sigma^2-\hat\sigma_Q^2\bigr|\operatorname{tr}\bigl(P(w)\bigr)}{nR_n(w)}\xrightarrow{\,p\,}0,\tag{62}$$

which is equivalent to showing

$$\sup_{w\in\mathcal{W}}\frac{\hbar^2\bigl((1/n)\bigl|\sigma^2-\hat\sigma_Q^2\bigr|\operatorname{tr}(P(w))\bigr)^2}{R_n^2(w)}\xrightarrow{\,p\,}0.\tag{63}$$

From the proof of Theorem 10, we have

$$\Pr\left\{\left|\frac{\hbar e^{T}M_Qe}{n-\tau'}-\hbar\sigma^2\right|>\varepsilon'R_n^{-1/2}(w)\right\}\le\frac{\hbar^2E\bigl|e^{T}M_Qe-(n-\tau')\sigma^2\bigr|^2}{\varepsilon'^2(n-\tau')^2R_n(w)}\le\frac{2\hbar^2\sigma^4}{\varepsilon'^2(n-\tau')R_n(w)},\tag{64}$$

where $\varepsilon'>0$. By assumption (35), this bound tends to zero. Thus we obtain $R_n(w)^{-1}|\hbar\sigma^2-\hbar\hat\sigma_Q^2|^2\xrightarrow{\,p\,}0$. It follows from Lemma 3 and the definition of $R_n(w)$ that $(n^{-1}\operatorname{tr}(P(w)))^2$ is no greater than $R_n(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $C_n(w)$ is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is

$$y_i=x_i\theta+\sum_{l=1}^{1000}z_{il}\psi_l+e_i,\qquad\text{subject to }\ \theta=-\sum_{l=1}^{1000}\psi_l+1,\quad i=1,\ldots,n,\tag{65}$$

where $x_i$ follows the $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l=2,\ldots,1000$, are independently and identically distributed $N(0,1)$. The parameter $\psi_l$, $l=1,\ldots,1000$, is determined by the rule $\psi_l=\zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^2=\zeta^2/(1+\zeta^2)$. The error $e_i$ is independent of $x_i$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_i$ is independently and identically distributed $N(0,1)$.

Case 2. $(e_1,\ldots,e_n)$ follows the multivariate normal distribution $N(0,\Omega_n)$, where $\Omega_n$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_n$ is $\sigma_j^2$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\neq l$) off-diagonal element of $\Omega_n$ is $\Omega_{jl,n}=\exp(-0.5|x_{ij}-x_{il}|^2)$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_i$, respectively.
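The homoscedastic design in (65) is straightforward to generate. The sketch below follows one reading of the setup — in particular, the decay rule `psi_l = zeta / l` is an assumption recovered from the garbled text — and draws a single Case 1 sample with the restriction $\theta=1-\sum_l\psi_l$ imposed exactly.

```python
import numpy as np

def simulate_case1(n, zeta, L=1000, rng=None):
    """One Case 1 sample from DGP (65):
    y_i = x_i*theta + sum_l z_il*psi_l + e_i, theta = 1 - sum_l psi_l,
    x_i ~ t(3), z_i1 = 1, z_il ~ N(0,1) for l >= 2, e_i ~ N(0,1).
    psi_l = zeta / l is this sketch's assumed decay rule."""
    rng = rng or np.random.default_rng()
    psi = zeta / np.arange(1, L + 1)
    theta = 1.0 - psi.sum()                 # restriction in (65)
    x = rng.standard_t(df=3, size=n)
    z = np.column_stack([np.ones(n), rng.standard_normal((n, L - 1))])
    e = rng.standard_normal(n)
    return x * theta + z @ psi + e, x, z

y, x, z = simulate_case1(n=50, zeta=0.8, rng=np.random.default_rng(1))
```

Case 2 would only change the last draw, replacing `e` with one `multivariate_normal` draw under the covariance $\Omega_n$ described above.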

The sample size is varied among $n=50$, $100$, $150$, and $200$. The number of models is determined by $M=\lfloor 3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^2$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi=500$. For the $k$-GIC, the value of $k$ is one and $\hbar$ adopts the effective number of parameters.

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV, [17]); (4) the Mallows model averaging estimator (MMA, [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

$$\mathrm{AIC}_m=2n\ln\Bigl(\frac{1}{n}\bigl\|Y-\hat\mu_m\bigr\|^2\Bigr)+2\mathcal{K}_m,\qquad \mathrm{BIC}_m=2n\ln\Bigl(\frac{1}{n}\bigl\|Y-\hat\mu_m\bigr\|^2\Bigr)+\ln(n)\mathcal{K}_m.\tag{66}$$
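As a quick illustration, (66) can be evaluated for each candidate fit from its residual sum of squares and parameter count. This is a minimal sketch of the criteria as written in (66), with the fitted values $\hat\mu_m$ and count $\mathcal{K}_m$ supplied by the caller.

```python
import numpy as np

def aic_bic(Y, mu_m, K_m):
    """AIC_m and BIC_m as in (66): both share the fit term
    2n*ln(||Y - mu_m||^2 / n); AIC adds 2*K_m, BIC adds ln(n)*K_m."""
    Y = np.asarray(Y, dtype=float)
    mu_m = np.asarray(mu_m, dtype=float)
    n = len(Y)
    rss = float(np.sum((Y - mu_m) ** 2))
    fit = 2.0 * n * np.log(rss / n)
    return fit + 2.0 * K_m, fit + np.log(n) * K_m

# worked numbers: n = 4, rss = 1, so fit = 8*ln(1/4)
aic, bic = aic_bic([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0], K_m=1)
```

The model minimizing the criterion over the $M$ candidates is the one selected.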

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_\ell,x_\ell,z_\ell\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is

$$\mathrm{PE}(\pi)=\frac{1}{100}\sum_{\ell=1}^{100}\bigl(y_\ell-\hat\mu_\ell(w)\bigr)^2,\tag{67}$$

[Figure 1: Results for the Monte Carlo design under homoscedastic errors. Panels (a)–(d) correspond to $n=50$ ($M=11$), $n=100$ ($M=14$), $n=150$ ($M=16$), and $n=200$ ($M=18$); each panel plots the prediction error (vertical axis, roughly 0.6–1.8) against $R^2\in[0.1,0.9]$ for the CV, MMA, AIC, BIC, and $k$-GIC estimators.]

where $w$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by

$$\mathrm{PE}=\frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi),\tag{68}$$

where $\Pi=500$ is the number of replications. Obviously, a smaller PE implies a better model estimator. We consider PE under homoscedastic errors first. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^2$ is displayed on the $x$-axis.
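The evaluation metric in (67)–(68) is a plain average of hold-out squared errors; a minimal sketch:

```python
import numpy as np

def pe_single(y_out, mu_out):
    """(67): PE(pi) = average squared error over the hold-out sample."""
    return float(np.mean((np.asarray(y_out) - np.asarray(mu_out)) ** 2))

def pe_overall(pe_trials):
    """(68): PE = average of PE(pi) over the Pi replications."""
    return float(np.mean(pe_trials))

pe1 = pe_single([1.0, 2.0], [0.0, 2.0])   # (1 + 0) / 2 = 0.5
pe = pe_overall([pe1, 1.5])               # (0.5 + 1.5) / 2 = 1.0
```

In the study each replication would feed 100 hold-out points into `pe_single` and the 500 resulting values into `pe_overall`.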

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^2$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^2$, the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^2$ increases. In many situations, the PEs of the BIC estimator with a large $R^2$ are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic cases. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both averaging estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^2$ space.


[Figure 2: Results for the Monte Carlo design under correlated errors. Panels (a)–(d) correspond to $n=50$ ($M=11$), $n=100$ ($M=14$), $n=150$ ($M=16$), and $n=200$ ($M=18$); each panel plots the prediction error against $R^2\in[0.1,0.9]$ for the CV, MMA, AIC, BIC, and $k$-GIC estimators.]

When $R^2\le 0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with the CV a close second over a large region of the $R^2$ space. Besides, their relative efficiency depends closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6 Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are computationally cumbersome and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper attempts to develop a statistical program to address the selection problem of an optimal tracking portfolio. We build theoretical models of tracking portfolios as constrained linear models. Then the selection problem of an optimal portfolio boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problems of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods. It is found that the $k$-GIC gives the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control can be applied to study the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.

[6] J. L. Castle, J. A. Doornik, and D. F. Hendry, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] R. W. Strachan and H. K. van Dijk, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csaki, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_P$," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997, with comments and a rejoinder by the author.

[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_L$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_P$, $C_L$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.


Page 7: Research Article The Optimal Selection for Restricted ...downloads.hindawi.com/journals/aaa/2014/692472.pdf · stocks in the group as the regressors. Since the coe cient of a regressor

Abstract and Applied Analysis 7

The goal is to choose119908 by minimizing 119862119899(119908) From (39)

one only needs to select 119908 through minimizing

119871119899 (119908) +

2⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896

119899

+

2ℏ tr (119875 (119908)Ω119899) minus 2⟨119890119879 119875(119908)119890⟩

119896

119899

(40)

where 119908 isinWCompared with 119871

119899(119908) it is sufficient to establish

that 119899minus1⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896and 119899

minus1ℏ tr(119875(119908)Ω

119899) minus

119899minus1⟨119890119879 119875(119908)119890⟩

119896are uniformly negligible for any 119908 isin W

More specifically to prove (34) we need to check

sup119908isinW

⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896

119899119877119899 (119908)

119901

997888rarr 0 (41)

sup119908isinW

10038161003816100381610038161003816ℏ tr (119875 (119908)Ω119899) minus ⟨119890

119879 119875(119908)119890⟩

119896

10038161003816100381610038161003816

119899119877119899 (119908)

119901

997888rarr 0 (42)

sup119908isinW

10038161003816100381610038161003816100381610038161003816

119871119899 (119908) minus 119877119899 (119908)

119877119899 (119908)

10038161003816100381610038161003816100381610038161003816

119901

997888rarr 0 (43)

where ldquo119901

997888rarrrdquo denotes the convergence in probabilityFollowing the idea of Li [29] we testify the main result

of our work that the minimizing of 119896-class generalizedinformation criterion is asymptotically optimalNowwe statethe main theorem

Theorem9 Under assumptions (35) and (36) theminimizingof 119896-class generalized information criterion119862

119899(119908) is asymptot-

ically optimal namely (34) holds

Proof The asymptotically optimal property of 119896-GIC needsto show that (41) (42) and (43) are valid

Firstly we prove that (41) holds For any 120577 gt 0 byChebyshevrsquos inequality one has

Pr

sup119908isinW

10038161003816100381610038161003816⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896

10038161003816100381610038161003816

119899119877119899 (119908)

gt 120577

le sum

119908isinW

119864[⟨119890119879119872(119908)120583 minus 120578(119908)⟩

119896]2

(120577119899119877119899 (119908))

2

(44)

which by Lemma 1 is no greater than

120577minus2119896W sum

119908isinW

119899minus21003817100381710038171003817119872(119908)120583 minus 120578(119908)

10038171003817100381710038172

119896(119877119899 (119908))

minus2 (45)

Recalling the definition of 119877119899(119908) we get 119872(119908)120583 minus

120578(119908)2

119896le 119899119877

119899(119908) Then (45) does not exceed

120577minus2119896Wsum

119908isinW (119899119877119899(119908))minus1 By assumption (36) one knows

that (45) tends to zero in probabilityThus (41) is established

To prove (42) it suffices to testify that

119896 tr (119875 (119908)Ω119899) minus ⟨119890119879 119875(119908)119890⟩

119896

119899119877119899 (119908)

119901

997888rarr 0 (46)

(ℏ minus 119896) tr (119875 (119908)Ω119899)119899119877119899 (119908)

119901

997888rarr 0 (47)

By Chebyshevrsquos inequality and Lemmas 4 and 5 for any120598 gt 0 we have

Pr sup119908isinW

10038161003816100381610038161003816100381610038161003816100381610038161003816

tr (119875 (119908)Ω119899) minus ⟨119890119879 119875 (119908) 119890⟩

119899119877119899 (119908)

10038161003816100381610038161003816100381610038161003816100381610038161003816

gt 120598

le sum

119908isinW

119864[tr(119875(119908)Ω119899) minus ⟨119890

119879 119875(119908)119890⟩]

2

(120598119899119877119899(119908))2

=1

1205982sum

119908isinW

Var (119890119879119875 (119908) 119890)

(119899119877119899(119908))2

le2C1Γ2

1205982sum

119908isinW

1

(119899119877119899(119908))2

(48)

From (36) we know that (48) is close to zero Thus (46)is reasonable

Recalling Lemma 4 and (35) it yields

1003816100381610038161003816(ℏ minus 119896) tr (119875 (119908)Ω119899)1003816100381610038161003816

119899119877119899 (119908)

le

10038161003816100381610038161003816C1015840 (ℏ minus 119896) Γ

10038161003816100381610038161003816

119899119877119899 (119908)

997888rarr 0 (49)

which derives (47). Then (42) is proved.

Next we show that the expression (43) also holds. A straightforward calculation leads to

$$\begin{aligned}
\|\mu-\hat{\mu}(w)\|_k^2-\|M(w)\mu-\eta(w)\|_k^2
&=\|\mu-\eta(w)-P(w)Y\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\
&=\|\mu-\eta(w)-P(w)\mu-P(w)e\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\
&=\|M(w)\mu-\eta(w)-P(w)e\|_k^2-\|M(w)\mu-\eta(w)\|_k^2\\
&=\|P(w)e\|_k^2-2\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k.
\end{aligned}\tag{50}$$

From the identity (50) we know that

$$L_n(w)-R_n(w)=-\frac{2\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{n}+\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{n}.\tag{51}$$

8 Abstract and Applied Analysis

Obviously, the proof of (43) needs to verify that

$$\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{nR_n(w)}\xrightarrow{\;p\;}0,\qquad
\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{nR_n(w)}\xrightarrow{\;p\;}0.\tag{52}$$

To prove (52), it should be noticed that $\operatorname{tr}(P(w)\Omega_nP^{T}(w))=E(e^{T}P^{T}(w)P(w)e)$ and $\|P(w)e\|_k^2=\langle e^{T}P^{T}(w),P(w)e\rangle_k$. In addition,

$$\|P(w)(M(w)\mu-\eta(w))\|^2\le\lambda_{\max}^2(P(w))\|M(w)\mu-\eta(w)\|^2\le\|M(w)\mu-\eta(w)\|^2.\tag{53}$$

Thus, for any $\zeta'>0$ and $\epsilon'>0$,

$$\begin{aligned}
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\|P(w)e\|_k^2-k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{nR_n(w)}>\zeta'\right\}
&\le\sum_{w\in\mathcal{W}}\frac{E\left[\|P(w)e\|_k^2-kE(e^{T}P^{T}(w)P(w)e)\right]^2}{(nR_n(w)\zeta')^2}\\
&=\frac{k^2}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{\operatorname{Var}(e^{T}P^{T}(w)P(w)e)}{(nR_n(w))^2}\\
&\le\frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{k\operatorname{tr}(P(w)\Omega_nP^{T}(w))}{(nR_n(w))^2}\\
&\le\frac{2kC_1\Gamma}{\zeta'^2}\sum_{w\in\mathcal{W}}\frac{1}{nR_n(w)},
\end{aligned}\tag{54}$$

$$\begin{aligned}
\Pr\left\{\sup_{w\in\mathcal{W}}\frac{\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k}{nR_n(w)}>\epsilon'\right\}
&\le\sum_{w\in\mathcal{W}}\frac{E\left[\langle\mu^{T}M^{T}(w)-\eta^{T}(w),P(w)e\rangle_k\right]^2}{(nR_n(w)\epsilon')^2}\\
&\le\sum_{w\in\mathcal{W}}\frac{\|P(w)(M(w)\mu-\eta(w))\|_k^2\,k_{\mathcal{W}}}{(nR_n(w)\epsilon')^2}\\
&\le\epsilon'^{-2}k_{\mathcal{W}}\sum_{w\in\mathcal{W}}n^{-2}\|M(w)\mu-\eta(w)\|_k^2\left(R_n(w)\right)^{-2}.
\end{aligned}\tag{55}$$

Combining (36) and (45), one knows that both (54) and (55) tend to zero. In other words, (43) is confirmed. We conclude that the expressions (41), (42), and (43) are reasonable. This completes the proof of Theorem 9.

In practice, the covariance of the errors is usually unknown and needs to be estimated. However, it is difficult to build a good estimate for $\Omega_n$ in virtue of the special structure of $\Omega_n$. In the special case when the random errors are independently and identically distributed with variance $\sigma^2$, a consistent estimator of $\sigma^2$ can be built for the constrained models (7) and (8). Let $m=Q$ in $\hat{\mu}_m$ and $\tau'=\operatorname{tr}(P_Q)$, where $Q$ corresponds to a "large" approximating model. Denote $\hat{\sigma}_Q^2=(n-\tau')^{-1}(Y-\hat{\mu}_Q)^{T}(Y-\hat{\mu}_Q)$. The coming theorem will show that $\hat{\sigma}_Q^2$ is a consistent estimate of $\sigma^2$.
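Under i.i.d. errors this estimator is simply the residual sum of squares of the "large" approximating model divided by $n-\operatorname{tr}(P_Q)$. A minimal numerical sketch, assuming an ordinary least-squares projection for $P_Q$ and a synthetic design (all names and dimensions below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 200, 5
sigma = 1.5
X = rng.standard_normal((n, q))        # hypothetical "large" model regressors
beta = np.ones(q)
Y = X @ beta + sigma * rng.standard_normal(n)

# Hat matrix of the large approximating model: P_Q = X (X'X)^{-1} X'
P_Q = X @ np.linalg.solve(X.T @ X, X.T)
tau = np.trace(P_Q)                    # tau' = tr(P_Q); equals q for a full-rank X
mu_Q = P_Q @ Y                         # fitted values of the large model

# sigma^2_Q = (n - tau')^{-1} (Y - mu_Q)'(Y - mu_Q)
sigma2_Q = (Y - mu_Q) @ (Y - mu_Q) / (n - tau)
```

For this seeded draw `sigma2_Q` lands near the true error variance `sigma**2`, which is the content of Theorem 10 when `tau/n` is small.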

Theorem 10. If $\tau'/n\to0$ when $\tau'\to\infty$ and $n\to\infty$, we have $\hat{\sigma}_Q^2\xrightarrow{\;p\;}\sigma^2$ as $n\to\infty$.

Proof. Writing $\Lambda=D+XR^{-1}D^{*}$, one obtains

$$\frac{(Y-\hat{\mu}_Q)^{T}(Y-\hat{\mu}_Q)}{n-\tau'}=\frac{e^{T}M_Qe}{n-\tau'}+\frac{\|M_Q\Lambda\|^2}{n-\tau'}+\frac{2e^{T}M_Q\Lambda}{n-\tau'},\tag{56}$$

where $M_Q=I-P_Q$.

Since $E(e^{T}M_Qe)=\sigma^2(n-\tau')$, it leads to

$$E\left|e^{T}M_Qe-\sigma^2(n-\tau')\right|^2=E\left|e^{T}M_Qe-E(e^{T}M_Qe)\right|^2=\operatorname{Var}(e^{T}M_Qe)=2\sigma^4\operatorname{tr}(M_Q).\tag{57}$$

For any $\varepsilon>0$, it follows from (57) that

$$\Pr\left\{\left|\frac{e^{T}M_Qe}{n-\tau'}-\sigma^2\right|>\varepsilon\right\}\le\frac{E\left|e^{T}M_Qe-(n-\tau')\sigma^2\right|^2}{\varepsilon^2(n-\tau')^2}\le\frac{2\sigma^4}{\varepsilon^2(n-\tau')}\longrightarrow0.\tag{58}$$

Equation (58) implies that $(n-\tau')^{-1}(e^{T}M_Qe)\xrightarrow{\;p\;}\sigma^2$. Let $\mathcal{I}_{Qt}$ be the $t$th diagonal element of the "hat" matrix $P_Q$. Then $0\le1-\mathcal{I}_{Qt}<1$ is the $t$th diagonal element of $M_Q$ and satisfies $\sum_{t=1}^{n}(1-\mathcal{I}_{Qt})=n-\tau'$.

Because $\phi(z_{il},\psi_l)$ and $\phi(a_{sk},\psi_k)$ converge in mean square, we have $E(\Lambda_{Qt}^2)\to0$ as $\tau'\to\infty$, where $\Lambda_{Qt}$ is the $t$th element of $\Lambda$, $t=1,\dots,n$. Notice that

$$\begin{aligned}
\frac{E(\Lambda^{T}M_Q\Lambda)}{n-\tau'}&=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\left(\Lambda_{Qt}^2M_{Qt}\right)
=\frac{1}{n-\tau'}\sum_{t=1}^{n}E\left(\Lambda_{Qt}^2\right)\left[1-\mathcal{I}_{Qt}\right]\\
&\le\max_{t}E\left(\Lambda_{Qt}^2\right)\frac{1}{n-\tau'}\sum_{t=1}^{n}\left[1-\mathcal{I}_{Qt}\right]
=\max_{t}E\left(\Lambda_{Qt}^2\right)\longrightarrow0.
\end{aligned}\tag{59}$$

The above expression implies that the second term on the right-hand side of (56) approaches zero. By a proof similar to (59), the final term on the right-hand side of (56) also tends to zero. The proof of Theorem 10 is complete.


In the case of independently and identically distributed errors, if we replace $\sigma^2$ by $\hat{\sigma}_Q^2$ in the $k$-GIC, the $k$-GIC can be simplified to

$$C_n(w)=\frac{\|Y-\hat{\mu}(w)\|_k^2}{n}+\frac{2\hbar\hat{\sigma}_Q^2\operatorname{tr}(P(w))}{n}.\tag{60}$$
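When the candidate set is finite, minimizing $C_n(w)$ is a direct search. The sketch below specializes, purely for illustration, to unit-vector weights (pure model selection) over nested least-squares models, with $k=1$ and $\hbar=1$ assumed; in the paper's weighted-average setting $P_m$ would be replaced by $P(w)=\sum_m w_mP_m$ and the search run over the weight set $\mathcal{W}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 8
X = rng.standard_normal((n, p))
beta = 1.0 / (1 + np.arange(p))       # hypothetical decaying coefficients
Y = X @ beta + rng.standard_normal(n)

# Nested candidate models m = 1..p; P_m projects onto the first m columns.
def proj(Xm):
    return Xm @ np.linalg.solve(Xm.T @ Xm, Xm.T)

P = [proj(X[:, :m]) for m in range(1, p + 1)]

# sigma^2 estimated from the largest model Q (as in Theorem 10)
tau = np.trace(P[-1])
sigma2_Q = Y @ (np.eye(n) - P[-1]) @ Y / (n - tau)

# C_n for unit weights: ||Y - P_m Y||^2 / n + 2*hbar*sigma2_Q*tr(P_m)/n
hbar = 1.0                             # illustrative choice of hbar
C = [((Y - Pm @ Y) @ (Y - Pm @ Y)) / n + 2 * hbar * sigma2_Q * np.trace(Pm) / n
     for Pm in P]
best = int(np.argmin(C))               # index of the selected candidate model
```

With $\hbar=1$ this reduces to a Mallows-type $C_p$ trade-off between fit and effective model size.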

Here we intend to illustrate that the model selection procedure of minimizing $C_n(w)$ is also asymptotically optimal.

Theorem 11. Assume that the random error $e$ is i.i.d. with mean zero and variance $\sigma^2$. Under the conditions (35) and (36), $C_n(w)$ is still asymptotically valid.

Proof. Using a technique similar to that deriving (39), we obtain

$$\begin{aligned}
nC_n(w)&=\|Y-\hat{\mu}(w)\|_k^2+2\hbar\hat{\sigma}_Q^2\operatorname{tr}(P(w))\\
&=nL_n(w)+2\langle e^{T},M(w)\mu-\eta(w)\rangle_k+2\hbar\hat{\sigma}_Q^2\operatorname{tr}(P(w))-2\langle e^{T}P(w)e\rangle_k+\|e\|_k^2\\
&=2\hbar\left(\hat{\sigma}_Q^2-\sigma^2\right)\operatorname{tr}(P(w))+2\langle e^{T},M(w)\mu-\eta(w)\rangle_k\\
&\quad+nL_n(w)+2\hbar\sigma^2\operatorname{tr}(P(w))-2\langle e^{T}P(w)e\rangle_k+\|e\|_k^2.
\end{aligned}\tag{61}$$

With an appropriate modification of the proofs of (41)–(43), one only needs to verify

$$\sup_{w\in\mathcal{W}}\frac{\hbar\left|\sigma^2-\hat{\sigma}_Q^2\right|\operatorname{tr}(P(w))}{nR_n(w)}\xrightarrow{\;p\;}0,\tag{62}$$

which is equivalent to showing

$$\sup_{w\in\mathcal{W}}\frac{\hbar^2\left((1/n)\left|\sigma^2-\hat{\sigma}_Q^2\right|\operatorname{tr}(P(w))\right)^2}{R_n^2(w)}\xrightarrow{\;p\;}0.\tag{63}$$

From the proof of Theorem 10, we have

$$\Pr\left\{\left|\frac{\hbar e^{T}M_Qe}{n-\tau'}-\hbar\sigma^2\right|>\varepsilon'R_n^{-1/2}(w)\right\}\le\frac{\hbar^2E\left|e^{T}M_Qe-(n-\tau')\sigma^2\right|^2}{\varepsilon'^2(n-\tau')^2R_n(w)}\le\frac{2\hbar^2\sigma^4}{\varepsilon'^2(n-\tau')R_n(w)},\tag{64}$$

where $\varepsilon'>0$.

By assumption (35), one knows that the above bound is close to zero. Thus we obtain $R_n(w)^{-1}|\hbar\sigma^2-\hbar\hat{\sigma}_Q^2|^2\xrightarrow{\;p\;}0$. It follows from Lemma 3 and the definition of $R_n(w)$ that $(n^{-1}\operatorname{tr}(P(w)))^2$ is no greater than $R_n(w)$. Then the formula (63) is confirmed. Therefore, we conclude that minimizing $C_n(w)$ is also asymptotically optimal.

5. Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data generating process is

$$y_i=x_i\theta+\sum_{l=1}^{1000}z_{il}\psi_l+e_i,\qquad\text{subject to }\theta=-\sum_{l=1}^{1000}\psi_l+1,\quad i=1,\dots,n,\tag{65}$$

where $x_i$ follows the $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l=2,\dots,1000$, are independently and identically distributed $N(0,1)$. The parameter $\psi_l$, $l=1,\dots,1000$, is determined by the rule $\psi_l=\zeta l^{-1}$, where $\zeta$ is a parameter selected to control the population $R^2=\zeta^2/(1+\zeta^2)$. The error $e_i$ is independent of $x_i$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_i$ is independently and identically distributed $N(0,1)$.

Case 2. $e_1,\dots,e_n$ obey the multivariate normal distribution $N(0,\Omega_n)$, where $\Omega_n$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_n$ is $\sigma_j^2$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\neq l$) off-diagonal element of $\Omega_n$ is $\Omega_{jl,n}=\exp(-0.5|x_{ij}-x_{il}|^2)$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_i$, respectively.
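The design in (65) is straightforward to simulate. The sketch below is a scaled-down version showing Case 1 only, with 50 $z$-regressors rather than 1000 to keep it light and assuming the decay rule $\psi_l=\zeta l^{-1}$; Case 2 would draw $e$ from $N(0,\Omega_n)$ instead:

```python
import numpy as np

rng = np.random.default_rng(2)
n, L = 100, 50                        # L = 50 regressors instead of 1000, for brevity
zeta = 0.8

psi = zeta / np.arange(1, L + 1)      # assumed decay rule psi_l = zeta * l^{-1}
theta = -psi.sum() + 1.0              # the restriction in (65)

x = rng.standard_t(df=3, size=n)      # x_i ~ t(3)
Z = rng.standard_normal((n, L))
Z[:, 0] = 1.0                         # z_{i1} = 1

# Case 1: homoscedastic i.i.d. N(0,1) errors
e = rng.standard_normal(n)
y = x * theta + Z @ psi + e
```

The restriction ties $\theta$ to the $\psi_l$, so the constrained model has one fewer free parameter than its unconstrained counterpart.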

The sample size is varied between $n=50$, $100$, $150$, and $200$. The number of models is determined by $M=\lfloor3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^2$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi=500$. For the $k$-GIC, the value of $k$ takes one and $\hbar$ adopts the effective number of parameters.

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV [17]); (4) the Mallows model averaging estimator (MMA [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

$$\mathrm{AIC}_m=2n\ln\left(\frac{1}{n}\|Y-\hat{\mu}_m\|^2\right)+2K_m,\qquad
\mathrm{BIC}_m=2n\ln\left(\frac{1}{n}\|Y-\hat{\mu}_m\|^2\right)+\ln(n)K_m.\tag{66}$$
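The criteria in (66) depend only on the residual sum of squares and the parameter count $K_m$, so they are one-liners; the design matrix below is a hypothetical stand-in for a candidate model:

```python
import numpy as np

def aic_bic(Y, mu_hat, K_m):
    """AIC_m and BIC_m as in (66): 2n*ln((1/n)*||Y - mu_m||^2) + penalty."""
    n = len(Y)
    rss = float((Y - mu_hat) @ (Y - mu_hat))
    base = 2 * n * np.log(rss / n)
    return base + 2 * K_m, base + np.log(n) * K_m

rng = np.random.default_rng(3)
n, k = 80, 3
X = rng.standard_normal((n, k))                 # hypothetical candidate design
Y = X @ np.ones(k) + rng.standard_normal(n)
mu_hat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
aic, bic = aic_bic(Y, mu_hat, K_m=k)
```

For $n>e^2$ the BIC penalty $\ln(n)K_m$ exceeds the AIC penalty $2K_m$, so BIC favors smaller models.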

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_\ell,x_\ell,z_\ell\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is

$$\mathrm{PE}(\pi)=\frac{1}{100}\sum_{\ell=1}^{100}\left(y_\ell-\hat{\mu}_\ell(w)\right)^2,\tag{67}$$

Figure 1: Results for Monte Carlo design (homoscedastic errors). [Four panels: (a) n = 50, M = 11; (b) n = 100, M = 14; (c) n = 150, M = 16; (d) n = 200, M = 18. Each panel plots the prediction error (y-axis) against R² from 0.1 to 0.9 (x-axis) for CV, MMA, AIC, BIC, and k-GIC.]

where $w$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by

$$\mathrm{PE}=\frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi),\tag{68}$$

where $\Pi=500$ is the number of replications. Obviously, a smaller PE implies a better model estimator. We first consider PE under homoscedastic errors. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^2$ is displayed on the $x$-axis.
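Equations (67) and (68) average squared forecast errors over 100 hold-out points and $\Pi=500$ replications. A schematic sketch, with placeholder hold-out data and a placeholder predictor standing in for $\hat{\mu}_\ell(w)$:

```python
import numpy as np

def prediction_error(y_out, mu_out):
    """PE(pi) of (67): mean squared out-of-sample error over the hold-out set."""
    return float(np.mean((y_out - mu_out) ** 2))

rng = np.random.default_rng(4)
Pi = 500                                   # number of replications
pe = np.empty(Pi)
for it in range(Pi):
    y_out = rng.standard_normal(100)       # stand-in hold-out targets
    mu_out = np.zeros(100)                 # stand-in fitted predictor mu_hat(w)
    pe[it] = prediction_error(y_out, mu_out)

PE = pe.mean()                             # overall PE of (68)
```

In the actual experiment `mu_out` would come from refitting each of the five methods on the training sample of each replication.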

We find that the $k$-GIC estimators are almost the best estimators among those considered. When $R^2$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For a smaller $R^2$, the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^2$ increases. In many situations, the PEs of the BIC estimator with a large $R^2$ are quite poor relative to the other methods.

Next we discuss PE under correlated errors; the PE calculation is summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic case. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both averaging estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^2$ space.

Figure 2: Results for Monte Carlo design (correlated errors). [Four panels: (a) n = 50, M = 11; (b) n = 100, M = 14; (c) n = 150, M = 16; (d) n = 200, M = 18. Each panel plots the prediction error (y-axis) against R² from 0.1 to 0.9 (x-axis) for CV, MMA, AIC, BIC, and k-GIC.]

When $R^2\le0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst performing estimator, with CV a close second over a large region of the $R^2$ space. Besides, their relative efficiency relies closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6. Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are cumbersome to compute and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper attempts to develop a statistical procedure to consider the selection problem of an optimal tracking portfolio. We build the theoretical models of tracking portfolios as constrained linear models. Then the selection problem of an optimal portfolio boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problems of model selection. In distinction, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods. It is found that the $k$-GIC gives the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and we can apply the techniques of linear functional analysis and stochastic control to consider the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.
[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.
[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.
[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.
[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.
[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.
[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.
[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.
[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.
[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.
[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175–1189, 2007.
[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.
[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.
[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.
[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.
[16] C. L. Mallows, "Some comments on C_P," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.
[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.
[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.
[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.
[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543–564, 1999.
[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003, with discussions and a rejoinder by the authors.
[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.
[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.
[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.
[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.
[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.
[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997, with comments and a rejoinder by the author.
[28] D. W. K. Andrews, "Asymptotic optimality of generalized C_L, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.
[29] K.-C. Li, "Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.
[30] J. A. F. Machado, "Robust model selection and M-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article The Optimal Selection for Restricted ...downloads.hindawi.com/journals/aaa/2014/692472.pdf · stocks in the group as the regressors. Since the coe cient of a regressor

8 Abstract and Applied Analysis

Obviously the proof of (43) needs to verify that

sup119908isinW

119875(119908)1198902

119896minus 119896 tr (119875 (119908)Ω119899119875

119879(119908))

119899119877119899 (119908)

119901

997888rarr 0

sup119908isinW

⟨120583119879119872119879(119908) minus 120578

119879(119908) 119875(119908)119890⟩

119896

119899119877119899 (119908)

119901

997888rarr 0

(52)

To prove (52) it should be noticed thattr(119875(119908)Ω

119899119875119879(119908)) = 119864(119890

119879119875119879(119908)119875(119908)119890) and 119875(119908)119890

2

119896=

⟨119890119879119875119879(119908) 119875(119908)119890⟩

119896 In addition

1003817100381710038171003817119875 (119908) (119872 (119908) 120583 minus 120578 (119908))10038171003817100381710038172le 1205822

max (119875 (119908))1003817100381710038171003817119872(119908)120583 minus 120578(119908)

10038171003817100381710038172

le1003817100381710038171003817119872(119908)120583 minus 120578(119908)

10038171003817100381710038172

(53)

Thus there exist any 1205771015840 gt 0 and 1205981015840 gt 0 such that

Pr sup119908isinW

119875(119908)1198902

119896minus 119896 tr (119875 (119908)Ω119899119875

119879(119908))

119899119877119899 (119908)

gt 1205771015840

le sum

119908isinW

119864[119875(119908)1198902

119896minus 119896119864 (119890

119879119875119879(119908) 119875 (119908) 119890)]

2

(119899119877119899 (119908) 120577

1015840)2

=1198962

12057710158402sum

119908isinW

Var (119890119879119875119879 (119908) 119875 (119908) 119890)

(119899119877119899(119908))2

le2119896C1Γ

12057710158402sum

119908isinW

119896 tr (119875 (119908)Ω119899119875119879(119908))

(119899119877119899(119908))2

le2119896C1Γ

12057710158402sum

119908isinW

1

(119899119877119899 (119908))

(54)

Pr

sup119908isinW

⟨120583119879119872119879(119908) minus 120578

119879(119908) 119875(119908)119890⟩

119896

119899119877119899 (119908)

gt 1205981015840

le sum

119908isinW

119864[⟨120583119879119872119879(119908) minus 120578

119879(119908) 119875(119908)119890⟩

119896]2

(119899119877119899(119908)1205981015840)

2

le sum

119908isinW

1003817100381710038171003817119875(119908)(119872(119908)120583 minus 120578(119908))10038171003817100381710038172

119896119896W

(119899119877119899(119908)1205981015840)

2

le 1205981015840minus2119896W sum

119908isinW

119899minus21003817100381710038171003817119872(119908)120583 minus 120578(119908)

10038171003817100381710038172

119896(119877119899 (119908))

minus2

(55)

Combining (36) and (45) one knows that both (54)and (55) tend to zero In other words (43) is confirmedWe conclude that the expressions (41) (42) and (43) arereasonable This completes the proof of Theorem 9

In practice the covariance of errors is usually unknownand needs to be estimated However it is difficult to build agood estimate for Ω

119899in virtue of the special structure of Ω

119899

In the special case when the random errors are independentand identical distribution with variance 1205902 the consistentestimator of1205902 can be built for the constrainedmodels (7) and(8) Let 119898 = 119876 in 120583

119898and 1205911015840 = tr(119875

119876) where 119876 corresponds

to a ldquolargerdquo approximatingmodel Denote 2119876= (119899minus120591

1015840)minus1(119884minus

120583119876)119879(119884 minus 120583

119876) The coming theorem will show that 2

119876is a

consistent estimate of 1205902

Theorem 10 If 1205911015840119899 rarr 0 when 1205911015840 rarr infin and 119899 rarr infin wehave 2

119876

119901

997888rarr 1205902 as 119899 rarr infin

Proof Writing Λ = 119863 + 119883119877minus1119863995333 one obtains

(119884 minus 120583119876)119879(119884 minus 120583

119876)

119899 minus 1205911015840=119890119879119872119876119890

119899 minus 1205911015840+

1003817100381710038171003817119872119876Λ10038171003817100381710038172

119899 minus 1205911015840+2119890119879119872119876Λ

119899 minus 1205911015840

(56)

where119872119876= 119868 minus 119875

119876

Since 119864(119890119879119872119876119890) = 120590

2(119899 minus 120591

1015840) it leads to

11986410038161003816100381610038161003816119890119879119872119876119890 minus 1205902(119899 minus 120591

1015840)10038161003816100381610038161003816

2

= 11986410038161003816100381610038161003816119890119879119872119876119890 minus 119864(119890

119879119872119876119890)10038161003816100381610038161003816

2

= Var (119890119879119872119876119890) = 2120590

4 tr (119872119876)

(57)

For any 120576 gt 0 it follows from (57) that

Pr1003816100381610038161003816100381610038161003816100381610038161003816

119890119879119872119876119890

119899 minus 1205911015840minus 1205902

1003816100381610038161003816100381610038161003816100381610038161003816

gt 120576 le11986410038161003816100381610038161003816119890119879119872119876119890 minus (119899 minus 120591

1015840)120590210038161003816100381610038161003816

2

1205762(119899 minus 1205911015840)2

le21205904

1205762 (119899 minus 1205911015840)997888rarr 0

(58)

Equation (58) implies that (119899 minus 1205911015840)minus1(119890119879119872119876119890)119901

997888rarr 1205902 Let

I119876119905

be the 119905th diagonal element of the ldquohatrdquo matrix 119875119876Then

0 le 1minusI119876119905lt 1 is the 119905th diagonal element of119872

119876and satisfies

sum119899

119905=1(1 minusI

119876119905) = 119899 minus 120591

1015840Because 120601(119911

119894119897 120595119897) and 120601(119886

119904119896 120595119896) converge to mean

square we have 119864(Λ2119876119905) rarr 0 as 1205911015840 rarr infin where Λ

119876119905is

the 119905th element of Λ 119905 = 1 119899 Notice that

119864 (Λ119879119872119876Λ)

(119899 minus 1205911015840)=

1

119899 minus 1205911015840

119899

sum

119905=1

119864 (Λ2

119876119905119872119876119905)

=1

119899 minus 1205911015840

119899

sum

119905=1

119864 (Λ2

119876119905) [1 minusI

119876119905]

le max119905

119864 (Λ2

119876119905)

1

119899 minus 1205911015840

119899

sum

119905=1

[1 minusI119876119905]

= max119905

119864 (Λ2

119876119905) 997888rarr 0

(59)

The above expression implies that the second term on theright hand of (56) approaches to zero By the similar proof of(59) we obtain that the final term on the right-hand of (56)also tends to zero The proof of Theorem 10 is complete

Abstract and Applied Analysis 9

In the case of independently and identically distributed errors, if we replace $\sigma^{2}$ by $\hat{\sigma}_{Q}^{2}$ in the $k$-GIC, the $k$-GIC can be simplified to
$$C_{n}(w)=\frac{\left\|Y-\hat{\mu}(w)\right\|_{k}^{2}}{n}+\frac{2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}\left(P(w)\right)}{n}.\tag{60}$$

Here we intend to illustrate that the model selection procedure of minimizing $C_{n}(w)$ is also asymptotically optimal.

Theorem 11. Assume that the random error $e$ is i.i.d. with mean zero and variance $\sigma^{2}$. Under conditions (35) and (36), $C_{n}(w)$ is still asymptotically valid.

Proof. Using a technique similar to that used to derive (39), we obtain
$$\begin{aligned}nC_{n}(w)&=\left\|Y-\hat{\mu}(w)\right\|_{k}^{2}+2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}\left(P(w)\right)\\&=nL_{n}(w)+2\left\langle e^{T}\left(M(w)\mu-\eta(w)\right)\right\rangle_{k}+2\hbar\hat{\sigma}_{Q}^{2}\operatorname{tr}\left(P(w)\right)-2\left\langle e^{T}P(w)e\right\rangle_{k}+\left\|e\right\|_{k}^{2}\\&=2\hbar\left(\hat{\sigma}_{Q}^{2}-\sigma^{2}\right)\operatorname{tr}\left(P(w)\right)+2\left\langle e^{T}\left(M(w)\mu-\eta(w)\right)\right\rangle_{k}+nL_{n}(w)\\&\quad+2\hbar\sigma^{2}\operatorname{tr}\left(P(w)\right)-2\left\langle e^{T}P(w)e\right\rangle_{k}+\left\|e\right\|_{k}^{2}.\end{aligned}\tag{61}$$

With an appropriate modification of the proofs of (41)–(43), one only needs to verify
$$\sup_{w\in\mathcal{W}}\frac{\hbar\left|\sigma^{2}-\hat{\sigma}_{Q}^{2}\right|\operatorname{tr}\left(P(w)\right)}{nR_{n}(w)}\xrightarrow{p}0,\tag{62}$$
which is equivalent to showing
$$\sup_{w\in\mathcal{W}}\frac{\hbar^{2}\left((1/n)\left|\sigma^{2}-\hat{\sigma}_{Q}^{2}\right|\operatorname{tr}\left(P(w)\right)\right)^{2}}{R_{n}^{2}(w)}\xrightarrow{p}0.\tag{63}$$

From the proof of Theorem 10, we have
$$\begin{aligned}\Pr\left\{\left|\frac{\hbar e^{T}M_{Q}e}{n-\tau'}-\hbar\sigma^{2}\right|>\varepsilon'R_{n}^{-1/2}(w)\right\}&\le\frac{\hbar^{2}E\left|e^{T}M_{Q}e-(n-\tau')\sigma^{2}\right|^{2}}{\varepsilon'^{2}(n-\tau')^{2}R_{n}(w)}\\&\le\frac{2\hbar^{2}\sigma^{4}}{\varepsilon'^{2}(n-\tau')R_{n}(w)},\end{aligned}\tag{64}$$
where $\varepsilon'>0$. By assumption (35), the above bound tends to zero. Thus we obtain $R_{n}(w)^{-1}\left|\hbar\sigma^{2}-\hbar\hat{\sigma}_{Q}^{2}\right|^{2}\xrightarrow{p}0$. It follows from Lemma 3 and the definition of $R_{n}(w)$ that $\left(n^{-1}\operatorname{tr}(P(w))\right)^{2}$ is no greater than $R_{n}(w)$. Then formula (63) is confirmed. Therefore, we conclude that minimizing $C_{n}(w)$ is also asymptotically optimal.

5 Monte Carlo Simulation

In this section, Monte Carlo simulations are performed to investigate the finite-sample properties of the proposed restricted linear model selection. The data-generating process is

$$y_{i}=x_{i}\theta+\sum_{l=1}^{1000}z_{il}\psi_{l}+e_{i},\qquad\text{subject to}\quad\theta=-\sum_{l=1}^{1000}\psi_{l}+1,\quad i=1,\ldots,n,\tag{65}$$

where $x_{i}$ follows the $t(3)$ distribution, $z_{i1}=1$, and $z_{il}$, $l=2,\ldots,1000$, are independently and identically distributed $N(0,1)$. The parameters $\psi_{l}$, $l=1,\ldots,1000$, are determined by the rule $\psi_{l}=\zeta^{l-1}$, where $\zeta$ is a parameter selected to control the population $R^{2}=\zeta^{2}/(1+\zeta^{2})$. The error $e_{i}$ is independent of $x_{i}$ and $z_{il}$. We consider two cases of the error distribution.

Case 1. $e_{i}$ is independently and identically distributed $N(0,1)$.

Case 2. $e_{1},\ldots,e_{n}$ obey the multivariate normal distribution $N(0,\Omega_{n})$, where $\Omega_{n}$ is an $n\times n$ covariance matrix. The $j$th diagonal element of $\Omega_{n}$ is $\sigma_{j}^{2}$, generated from the uniform distribution on $(0,1)$. The $(j,l)$th ($j\neq l$) off-diagonal element of $\Omega_{n}$ is $\Omega_{jl,n}=\exp(-0.5|x_{ij}-x_{il}|^{2})$, where $x_{ij}$ and $x_{il}$ denote the $j$th and $l$th elements of $x_{i}$, respectively.

The sample size is varied among $n=50$, $100$, $150$, and $200$. The number of models is determined by $M=\lfloor 3n^{1/3}\rfloor$, where $\lfloor s\rfloor$ stands for the integer part of $s$. We set $\zeta$ so that $R^{2}$ varies on a grid between $0.1$ and $0.9$. The number of simulation trials is $\Pi=500$. For the $k$-GIC, the value of $k$ is one and $\hbar$ adopts the effective number of parameters.
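Reading the coefficient rule as $\psi_{l}=\zeta^{l-1}$, one draw from the design (65) with Case 1 errors can be sketched as follows. Function and variable names are our own; the restriction $\theta=1-\sum_{l}\psi_{l}$ is imposed by construction:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, zeta, L=1000):
    """Draw one sample from the design in (65); a sketch of the stated setup."""
    psi = zeta ** np.arange(L)           # psi_l = zeta^(l-1), l = 1, ..., L
    theta = 1.0 - psi.sum()              # restricted coefficient: theta + sum(psi) = 1
    x = rng.standard_t(df=3, size=n)     # x_i ~ t(3)
    z = rng.standard_normal((n, L))      # z_il ~ N(0, 1), l >= 2
    z[:, 0] = 1.0                        # z_i1 = 1 (intercept column)
    e = rng.standard_normal(n)           # Case 1: i.i.d. N(0, 1) errors
    y = x * theta + z @ psi + e
    return y, x, z, theta, psi

y, x, z, theta, psi = simulate(n=50, zeta=0.5)
```

Case 2 would replace `e` with a draw from $N(0,\Omega_{n})$ built as described above.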

To assess the performance of the $k$-GIC, we consider five estimators: (1) the AIC model selection estimator (AIC); (2) the BIC model selection estimator (BIC); (3) the leave-one-out cross-validated model selection estimator (CV, [17]); (4) the Mallows model averaging estimator (MMA, [11]); and (5) the $k$-GIC model selection estimator ($k$-GIC). Following Machado [30], the AIC and BIC are defined, respectively, as

$$\begin{aligned}\mathrm{AIC}_{m}&=2n\ln\left(\frac{1}{n}\left\|Y-\hat{\mu}_{m}\right\|^{2}\right)+2\mathcal{K}_{m},\\\mathrm{BIC}_{m}&=2n\ln\left(\frac{1}{n}\left\|Y-\hat{\mu}_{m}\right\|^{2}\right)+\ln(n)\,\mathcal{K}_{m}.\end{aligned}\tag{66}$$
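The definitions in (66) translate directly into code. The sketch below takes the fitted vector $\hat{\mu}_{m}$ and the parameter count $\mathcal{K}_{m}$ as inputs (names are illustrative):

```python
import numpy as np

def aic_bic(Y, mu_hat, K):
    """AIC_m and BIC_m as defined in (66); K is the parameter count K_m."""
    n = Y.shape[0]
    rss_n = np.sum((Y - mu_hat) ** 2) / n   # (1/n) * ||Y - mu_hat||^2
    aic = 2 * n * np.log(rss_n) + 2 * K
    bic = 2 * n * np.log(rss_n) + np.log(n) * K
    return aic, bic
```

The two criteria share the goodness-of-fit term and differ only in the penalty weight, $2$ versus $\ln(n)$, so BIC penalizes model size more heavily once $n>e^{2}\approx 7.4$.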

We employ the out-of-sample prediction error to evaluate each estimator. For each replication, $\{y_{\ell},x_{\ell},z_{\ell}\}_{\ell=1}^{100}$ are generated as out-of-sample observations. In the $\pi$th simulation, the prediction error is
$$\mathrm{PE}(\pi)=\frac{1}{100}\sum_{\ell=1}^{100}\left(y_{\ell}-\hat{\mu}_{\ell}(w)\right)^{2},\tag{67}$$


Figure 1: Results for the Monte Carlo design under homoscedastic errors. [Four panels: (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$. Each panel plots the prediction error ($y$-axis) against $R^{2}$ from 0.1 to 0.9 ($x$-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators.]

where $w$ is selected by one of the five methods. Then the out-of-sample prediction error is calculated by

$$\mathrm{PE}=\frac{1}{\Pi}\sum_{\pi=1}^{\Pi}\mathrm{PE}(\pi),\tag{68}$$

where $\Pi=500$ is the number of replications. Obviously, a smaller PE implies a better estimation method. We first consider PE under homoscedastic errors. The prediction error calculations are summarized in Figure 1. The four panels depict results for a variety of sample sizes. In each panel, PE is displayed on the $y$-axis and $R^{2}$ is displayed on the $x$-axis.
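Formulas (67) and (68) amount to two averages over the holdout points and over the replications; a minimal sketch (names are our own):

```python
import numpy as np

def prediction_error(y_out, mu_out):
    """PE(pi) of (67): mean squared prediction error over the holdout sample."""
    return np.mean((y_out - mu_out) ** 2)

def average_pe(pe_values):
    """PE of (68): average of PE(pi) over the Pi replications."""
    return np.mean(pe_values)
```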

We find that the $k$-GIC estimators are almost always the best among those considered. When $R^{2}$ is very large, the MMA estimators can sometimes be marginally preferred to the $k$-GIC estimators. In each panel, the AIC and CV have quite similar prediction errors. For smaller $R^{2}$, the AIC obtains a higher prediction error than CV; however, the AIC estimators yield smaller PEs than the CV estimators as $R^{2}$ increases. In many situations, the PEs of the BIC estimator with a large $R^{2}$ are quite poor relative to the other methods.

Next, we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those found in the homoscedastic cases. The $k$-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both average estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the $R^{2}$ space.


Figure 2: Results for the Monte Carlo design under correlated errors. [Four panels: (a) $n=50$, $M=11$; (b) $n=100$, $M=14$; (c) $n=150$, $M=16$; (d) $n=200$, $M=18$. Each panel plots the prediction error ($y$-axis) against $R^{2}$ from 0.1 to 0.9 ($x$-axis) for the CV, MMA, AIC, BIC, and $k$-GIC estimators.]

When $R^{2}\le 0.2$, the BIC estimator outperforms the $k$-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with CV a close second over a large region of the $R^{2}$ space. Besides, their relative efficiency depends closely on the sample size, with the BIC estimator revealing increasing PE and the remaining four estimators showing decreasing PE as $n$ increases.

6 Conclusions

In risk investment, an important subject is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance scheme. However, those methods are computationally cumbersome and cannot obtain closed-form solutions for some complex problems. To make up for the defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper attempts to develop a statistical procedure for the selection problem of the optimal tracking portfolio. We build theoretical models of tracking portfolios as constrained linear models. Then the selection of an optimal portfolio boils down to choosing an optimal constrained linear model.

In the setting of unrestricted models or homoscedastic models, a large number of works investigate the problems of model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares. Thus, the selection of an optimal constrained model is equivalent to finding a series of optimal


weights. We select the weights by minimizing a $k$-class generalized information criterion ($k$-GIC), which is an estimate of the average squared error from the model average fit. The procedure of selecting weights is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of the $k$-GIC is compared against that of four other methods. It is found that the $k$-GIC gives the best performance in most cases.

There are two limitations of our results which are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain to be answered by future research. In this work, we mainly adopt the method of regression analysis to solve the selection problem. In fact, some alternative mathematical tools can also be employed to explore the theoretical properties of model selection. For example, the optimal model can be selected by the methods of linear optimization or quadratic programming, and we can apply the techniques of linear functional analysis and stochastic control to study the inference of the parametric estimators. Besides, we mention that the applications of this study can also be extended to some other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csaki, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on $C_{P}$," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997, with comments and a rejoinder by the author.


[28] D. W. K. Andrews, "Asymptotic optimality of generalized $C_{L}$, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for $C_{p}$, $C_{L}$, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and $M$-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.



cross-validation and generalized cross-validation in regressionwith heteroskedastic errorsrdquo Journal of Econometrics vol 47 no2-3 pp 359ndash377 1991

[29] K-C Li ldquoAsymptotic optimality for 119862119901 119862119871 cross-validation

and generalized cross-validation discrete index setrdquoTheAnnalsof Statistics vol 15 no 3 pp 958ndash975 1987

[30] J A F Machado ldquoRobust model selection and119872-estimationrdquoEconometric Theory vol 9 no 3 pp 478ndash493 1993


10 Abstract and Applied Analysis

[Figure 1: Out-of-sample prediction error (y-axis) plotted against R² = 0.1, 0.3, 0.5, 0.7, 0.9 (x-axis) for the k-GIC, MMA, CV, AIC, and BIC estimators under homoscedastic errors. Panels: (a) n = 50, M = 11; (b) n = 100, M = 14; (c) n = 150, M = 16; (d) n = 200, M = 18.]

Figure 1: Results for Monte Carlo design.

where w is selected by one of the five methods. The out-of-sample prediction error is then calculated as

PE = (1/Π) Σ_{π=1}^{Π} PE(π),   (68)

where Π = 500 is the number of replications. Clearly, a smaller PE indicates a better estimation method. We first consider PE under homoscedastic errors. The prediction-error calculations are summarized in Figure 1. The four panels depict results for a range of sample sizes; in each panel, PE is displayed on the y-axis and R² on the x-axis.
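The averaged prediction-error computation in (68) can be sketched as follows; the data-generating process, the placeholder coefficients, and the least-squares fit standing in for the weighted-average estimator are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)
n_rep = 500                      # Pi: number of Monte Carlo replications
n, p = 50, 4                     # hypothetical sample size and regressor count

def prediction_error(rng):
    """One replication: fit on a training sample, evaluate on fresh data."""
    beta = np.ones(p)                                # placeholder coefficients
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]     # stand-in for the averaged estimator
    X_new = rng.normal(size=(n, p))
    y_new = X_new @ beta + rng.normal(size=n)
    return np.mean((y_new - X_new @ b_hat) ** 2)     # PE(pi)

# PE = (1/Pi) * sum over replications, as in (68)
pe = np.mean([prediction_error(rng) for _ in range(n_rep)])
```

With unit error variance, pe should sit slightly above 1 because of estimation noise in b_hat, mirroring how a smaller PE signals a better estimator in the comparison above.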

We find that the k-GIC estimators are almost always the best among those considered. When R² is very large, the MMA estimators can sometimes be marginally preferred to the k-GIC estimators. In each panel, AIC and CV have quite similar prediction errors: for smaller R², AIC incurs a higher prediction error than CV, but as R² increases the AIC estimators yield smaller PEs than the CV estimators. In many situations, the PEs of the BIC estimator at large R² are quite poor relative to the other methods.

Next we discuss PE under correlated errors; the PE calculations are summarized in Figure 2. Broadly speaking, the conclusions are similar to those in the homoscedastic case. The k-GIC estimator frequently yields the most accurate estimates, followed by the MMA estimator, and both average estimators enjoy significantly smaller PEs than the other three estimators over a large portion of the R² space.


[Figure 2: Out-of-sample prediction error (y-axis) plotted against R² = 0.1, 0.3, 0.5, 0.7, 0.9 (x-axis) for the k-GIC, MMA, CV, AIC, and BIC estimators under correlated errors. Panels: (a) n = 50, M = 11; (b) n = 100, M = 14; (c) n = 150, M = 16; (d) n = 200, M = 18.]

Figure 2: Results for Monte Carlo design.

When R² ≤ 0.2, the BIC estimator outperforms the k-GIC estimator. Again, the AIC estimator is habitually the worst-performing estimator, with CV a close second over a large region of the R² space. Moreover, the relative efficiency of the methods depends closely on sample size, with the BIC estimator revealing increasing PE, and the remaining four estimators decreasing PE, as n increases.

6. Conclusions

In risk investment, an important problem is to find an optimal portfolio. The commonly used techniques are optimization methods based on the mean-variance framework. However, those methods are computationally cumbersome and cannot deliver closed-form solutions for some complex problems. To make up for these defects of mean-variance analysis, an alternative methodology for obtaining an optimal portfolio is to use model selection. This paper develops a statistical procedure for the selection problem of an optimal tracking portfolio. We model tracking portfolios as constrained linear models, so the selection of an optimal portfolio boils down to choosing an optimal constrained linear model.
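A minimal numeric sketch of a constrained linear model, using the textbook equality-constrained least-squares formula rather than the paper's exact estimator; the data, the true weights, and the sum-to-one restriction are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p))                     # hypothetical stock returns
beta_true = np.array([0.5, 0.3, 0.2])           # true tracking-portfolio weights
y = X @ beta_true + 0.1 * rng.normal(size=n)    # target returns to be tracked

# Restriction R @ beta = r: the portfolio weights sum to one.
R = np.ones((1, p))
r = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y                       # unrestricted OLS estimate
# Standard equality-constrained least squares:
#   b_R = b_OLS - (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R b_OLS - r)
correction = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T) @ (R @ b_ols - r)
b_restricted = b_ols - correction               # satisfies R @ b_restricted = r exactly
```

The correction term projects the unrestricted estimate onto the constraint set, so the fitted portfolio always respects the restriction regardless of sampling noise.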

In the setting of unrestricted models or homoscedastic models, a large number of works investigate model selection. In contrast, we discuss model selection for constrained models with dependent errors. The restricted models are estimated by the method of weighted average least squares, so the selection of an optimal constrained model is equivalent to finding a set of optimal weights. We select the weights by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error of the model-average fit. The weight-selection procedure is proved to be asymptotically optimal. Through Monte Carlo simulation, the performance of k-GIC is compared against that of four other methods; the k-GIC gives the best performance in most cases.
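The weight-selection step can be illustrated with a Mallows-type criterion as a stand-in for the k-GIC (the exact penalty is defined in the body of the paper and is not reproduced here): among candidate weight vectors on the unit simplex, pick the one minimizing the residual sum of squares plus a complexity penalty.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
n = 100
X = rng.normal(size=(n, 4))
y = X @ np.array([1.0, 0.8, 0.1, 0.05]) + rng.normal(size=n)

# Candidate nested models: model k uses the first k regressors.
def fitted(k):
    b = np.linalg.lstsq(X[:, :k], y, rcond=None)[0]
    return X[:, :k] @ b

fits = np.column_stack([fitted(k) for k in (1, 2, 3, 4)])   # columns mu_hat_k
sigma2 = np.mean((y - fits[:, -1]) ** 2)    # error-variance estimate, largest model

# Discrete grid of weight vectors on the unit simplex (step 0.25).
grid = [w for w in product(np.arange(0.0, 1.25, 0.25), repeat=4)
        if abs(sum(w) - 1.0) < 1e-9]

def criterion(w):
    """Mallows-type stand-in for the k-GIC: fit term plus complexity penalty."""
    w = np.asarray(w)
    resid = y - fits @ w
    eff_size = w @ np.array([1, 2, 3, 4])   # effective number of parameters
    return resid @ resid + 2.0 * sigma2 * eff_size

best_w = min(grid, key=criterion)           # selected averaging weights
```

A discrete grid keeps the search trivial for illustration; in practice the criterion is minimized over the full simplex, e.g. by quadratic programming.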

Two limitations of our results are open for further research. First, what is the asymptotic distribution of the parametric estimators? Second, can the theory be generalized to allow for continuous weights? These questions remain for future work. In this paper we mainly adopt regression analysis to solve the selection problem; alternative mathematical tools could also be employed to explore the theoretical properties of model selection. For example, the optimal model could be selected by linear optimization or quadratic programming, and the techniques of linear functional analysis and stochastic control could be applied to study the inference of the parametric estimators. The applications of this study can also be extended to other fields, including risk management, ruin theory, and factor analysis.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the Shandong Institute of Business and Technology (SIBT Grant 521014306203).

References

[1] K. Knight and W. Fu, "Asymptotics for lasso-type estimators," The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.

[2] J. Fan and H. Peng, "Nonconcave penalized likelihood with a diverging number of parameters," The Annals of Statistics, vol. 32, no. 3, pp. 928–961, 2004.

[3] H. Zou and M. Yuan, "Composite quantile regression and the oracle model selection theory," The Annals of Statistics, vol. 36, no. 3, pp. 1108–1126, 2008.

[4] M. Caner, "Lasso-type GMM estimator," Econometric Theory, vol. 25, no. 1, pp. 270–290, 2009.

[5] C. Y. Tang and C. Leng, "Penalized high-dimensional empirical likelihood," Biometrika, vol. 97, no. 4, pp. 905–919, 2010.

[6] L. C. Jennifer, A. D. Jurgen, and F. H. David, "Model selection in equations with many small effects," Oxford Bulletin of Economics and Statistics, 2013.

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] W. S. Rodney and K. D. Herman, "Bayesian model selection with an uninformative prior," Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csake, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on Cp," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003. With discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997. With comments and a rejoinder by the author.

[28] D. W. K. Andrews, "Asymptotic optimality of generalized C_L, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and M-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article The Optimal Selection for Restricted ...downloads.hindawi.com/journals/aaa/2014/692472.pdf · stocks in the group as the regressors. Since the coe cient of a regressor

Abstract and Applied Analysis 11

Pred

ictio

n er

ror

18

14

10

06

CVMMAAIC

BIC

01 03 05 07 09

R

2

n = 50M = 11

k-GIC

(a)

Pred

ictio

n er

ror

18

14

10

06

CVMMAAIC

BIC

01 03 05 07 09

R

2

n = 100M = 14

k-GIC

(b)

Pred

ictio

n er

ror

18

14

10

06

CVMMAAIC

BIC

01 03 05 07 09

R

2

n = 150M = 16

k-GIC

(c)

Pred

ictio

n er

ror

18

14

10

06

CVMMAAIC

BIC

01 03 05 07 09

R

2

n = 200M = 18

k-GIC

(d)

Figure 2 Results for Monte Carlo design

When 2 le 02 the BIC estimator outperforms the 119896-GICestimator Again the AIC estimator is habitually the worstperforming estimator with the CV being a close second in alarge region of the 2 space Besides their relative efficiencyrelies closely on sample size with the BIC estimator revealingincreasing PE and the remaining four estimators showingdecreasing PE as 119899 increases

6 Conclusions

In risk investment an important subject is to find an optimalportfolio The commonly used techniques are the opti-mization methods based on the scheme of mean-varianceHowever those methods are cumbersome in computingand cannot obtain the closed solutions for some complex

problems To make up the defects of mean-variance analternative methodology for obtaining an optimal portfoliois to use model selection This paper attempts to developa statistical program to consider the selection problem ofoptimal tracking portfolio We build the theoretical modelsof tracking portfolios by constrained linear models Thenthe selection problems of optimal portfolio boil down tochoosing an optimal constrained linear model

In the setting of unrestricted models or homoscedasticmodels a large number of works investigate the problemsof model selection In distinction we discuss the modelselection for constrained models with dependent errors Therestricted models are estimated by the method of weightedaverage least squares Thus the selection of an optimalconstrained model is equivalent to finding a series of optimal

12 Abstract and Applied Analysis

weights We select the weights by minimizing a 119896-class gen-eralized information criterion (119896-GIC) which is an estimateof the average squared error from the model average fit Theprocedure of selecting weights is proved to be asymptoticallyoptimal Through Monte Carlo simulation the performanceof 119896-GIC is compared against that of four other methods Itis found that the 119896-GIC gives the best performance in mostcases

There are two limitations of our results which are openfor further research First what is the asymptotic distributionof the parametric estimators Second can the theory begeneralized to allow for continuous weightsThese questionsremain to be answered by future research In this work wemainly adopt the method of regression analysis to solve theselection problem In fact some alternative mathematicaltools can also be employed to explore the theoretical prop-erties of model selection For example the optimal modelcan be selected by the methods of linear optimization orquadratic programming and we can apply the techniques oflinear functional analysis and stochastic control to considerthe inferences of parametric estimator Besides we mentionthat the applications of this study can also be extended tosome other fields including risk management ruin theoryand factor analysis

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgment

This work was supported by the Shandong Institute ofBusiness and Technology (SIBT Grant 521014306203)

References

[1] K Knight and W Fu ldquoAsymptotics for lasso-type estimatorsrdquoThe Annals of Statistics vol 28 no 5 pp 1356ndash1378 2000

[2] J Fan and H Peng ldquoNonconcave penalized likelihood with adiverging number of parametersrdquo The Annals of Statistics vol32 no 3 pp 928ndash961 2004

[3] H Zou and M Yuan ldquoComposite quantile regression and theoracle model selection theoryrdquo The Annals of Statistics vol 36no 3 pp 1108ndash1126 2008

[4] M Caner ldquoLasso-type GMM estimatorrdquo Econometric Theoryvol 25 no 1 pp 270ndash290 2009

[5] C Y Tang and C Leng ldquoPenalized high-dimensional empiricallikelihoodrdquo Biometrika vol 97 no 4 pp 905ndash919 2010

[6] L C Jennifer A D Jurgen and F H David Model Selectionin Equations With Many Small Effects Oxford Bulletin ofEconomics and Statistics 2013

[7] E E Leamer Specification Searches John Wiley amp Sons NewYork NY USA 1978

[8] P J Brown M Vannucci and T Fearn ldquoBayes model averagingwith selection of regressorsrdquo Journal of the Royal StatisticalSociety B Statistical Methodology vol 64 no 3 pp 519ndash5362002

[9] W S Rodney and K D Herman Bayesian Model SelectionWith An Uninformative Prior Oxford Bulletin of Economicsand Statistics 2003

[10] N L Hjort and G Claeskens ldquoFrequentist model averageestimatorsrdquo Journal of the American Statistical Association vol98 no 464 pp 879ndash899 2003

[11] B E Hansen ldquoLeast squares model averagingrdquo EconometricaJournal of the Econometric Society vol 75 no 4 pp 1175ndash11892007

[12] H Liang G Zou A T K Wan and X Zhang ldquoOptimal weightchoice for frequentist model average estimatorsrdquo Journal of theAmerican Statistical Association vol 106 no 495 pp 1053ndash10662011

[13] X Zhang and H Liang ldquoFocused information criterion andmodel averaging for generalized additive partial linear modelsrdquoThe Annals of Statistics vol 39 no 1 pp 194ndash200 2011

[14] B E Hansen and J S Racine ldquoJackknife model averagingrdquoJournal of Econometrics vol 167 no 1 pp 38ndash46 2012

[15] H Akaike ldquoInformation theory and an extension of themaximum likelihood principlerdquo in Proceedings of the 2ndInternational Symposium on Information Theory B N Petrovand F Csake Eds pp 267ndash281 Akademiai Kiado BudapestHungary 1973

[16] C L Mallows ldquoSome comments on CPrdquo Technometrics vol 15no 4 pp 661ndash675 1973

[17] M Stone ldquoCross-validatory choice and assessment of statisticalpredictionsrdquo Journal of the Royal Statistical Society B Method-ological vol 36 pp 111ndash147 1974

[18] G Schwarz ldquoEstimating the dimension of a modelrdquoTheAnnalsof Statistics vol 6 no 2 pp 461ndash464 1978

[19] P Craven and G Wahba ldquoSmoothing noisy data with splinefunctions Estimating the correct degree of smoothing by themethod of generalized cross-validationrdquo Numerische Mathe-matik vol 31 no 4 pp 377ndash403 1979

[20] D W K Andrews ldquoConsistent moment selection proceduresfor generalized method of moments estimationrdquo EconometricaJournal of the Econometric Society vol 67 no 3 pp 543ndash5641999

[21] G Claeskens and N L Hjort ldquoThe focused informationcriterionrdquo Journal of theAmerican Statistical Association vol 98no 464 pp 900ndash945 2003With discussions and a rejoinder bythe authors

[22] Y Zhang R Li and C-L Tsai ldquoRegularization parameterselections via generalized information criterionrdquo Journal of theAmerican Statistical Association vol 105 no 489 pp 312ndash3232010

[23] G Xu and J Z Huang ldquoAsymptotic optimality and efficientcomputation of the leave-subject-out cross-validationrdquo TheAnnals of Statistics vol 40 no 6 pp 3003ndash3030 2012

[24] M K P So and T Ando ldquoGeneralized predictive informationcriteria for the analysis of feature eventsrdquo Electronic Journal ofStatistics vol 7 pp 742ndash762 2013

[25] J J J Groen and G Kapetanios ldquoModel selection criteria forfactor-augmented regressionsrdquo Oxford Bulletin of Economicsand Statistics vol 75 no 1 pp 37ndash63 2013

[26] S Lai and Q Xie ldquoA selection problem for a constrainedlinear regression modelrdquo Journal of Industrial and ManagementOptimization vol 4 no 4 pp 757ndash766 2008

[27] J Shao ldquoAn asymptotic theory for linear model selectionrdquoStatistica Sinica vol 7 no 2 pp 221ndash264 1997 With commentsand a rejoinder by the author

Abstract and Applied Analysis 13

[28] D W K Andrews ldquoAsymptotic optimality of generalized 119862119871

cross-validation and generalized cross-validation in regressionwith heteroskedastic errorsrdquo Journal of Econometrics vol 47 no2-3 pp 359ndash377 1991

[29] K-C Li ldquoAsymptotic optimality for 119862119901 119862119871 cross-validation

and generalized cross-validation discrete index setrdquoTheAnnalsof Statistics vol 15 no 3 pp 958ndash975 1987

[30] J A F Machado ldquoRobust model selection and119872-estimationrdquoEconometric Theory vol 9 no 3 pp 478ndash493 1993

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 12: Research Article The Optimal Selection for Restricted ...downloads.hindawi.com/journals/aaa/2014/692472.pdf · stocks in the group as the regressors. Since the coe cient of a regressor

12 Abstract and Applied Analysis

weights We select the weights by minimizing a 119896-class gen-eralized information criterion (119896-GIC) which is an estimateof the average squared error from the model average fit Theprocedure of selecting weights is proved to be asymptoticallyoptimal Through Monte Carlo simulation the performanceof 119896-GIC is compared against that of four other methods Itis found that the 119896-GIC gives the best performance in mostcases

There are two limitations of our results which are openfor further research First what is the asymptotic distributionof the parametric estimators Second can the theory begeneralized to allow for continuous weightsThese questionsremain to be answered by future research In this work wemainly adopt the method of regression analysis to solve theselection problem In fact some alternative mathematicaltools can also be employed to explore the theoretical prop-erties of model selection For example the optimal modelcan be selected by the methods of linear optimization orquadratic programming and we can apply the techniques oflinear functional analysis and stochastic control to considerthe inferences of parametric estimator Besides we mentionthat the applications of this study can also be extended tosome other fields including risk management ruin theoryand factor analysis

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgment

This work was supported by the Shandong Institute ofBusiness and Technology (SIBT Grant 521014306203)

References

[1] K Knight and W Fu ldquoAsymptotics for lasso-type estimatorsrdquoThe Annals of Statistics vol 28 no 5 pp 1356ndash1378 2000

[2] J Fan and H Peng ldquoNonconcave penalized likelihood with adiverging number of parametersrdquo The Annals of Statistics vol32 no 3 pp 928ndash961 2004

[3] H Zou and M Yuan ldquoComposite quantile regression and theoracle model selection theoryrdquo The Annals of Statistics vol 36no 3 pp 1108ndash1126 2008

[4] M Caner ldquoLasso-type GMM estimatorrdquo Econometric Theoryvol 25 no 1 pp 270ndash290 2009

[5] C Y Tang and C Leng ldquoPenalized high-dimensional empiricallikelihoodrdquo Biometrika vol 97 no 4 pp 905ndash919 2010

[6] L C Jennifer A D Jurgen and F H David Model Selectionin Equations With Many Small Effects Oxford Bulletin ofEconomics and Statistics 2013

[7] E. E. Leamer, Specification Searches, John Wiley & Sons, New York, NY, USA, 1978.

[8] P. J. Brown, M. Vannucci, and T. Fearn, "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 64, no. 3, pp. 519–536, 2002.

[9] W. S. Rodney and K. D. Herman, Bayesian Model Selection With An Uninformative Prior, Oxford Bulletin of Economics and Statistics, 2003.

[10] N. L. Hjort and G. Claeskens, "Frequentist model average estimators," Journal of the American Statistical Association, vol. 98, no. 464, pp. 879–899, 2003.

[11] B. E. Hansen, "Least squares model averaging," Econometrica: Journal of the Econometric Society, vol. 75, no. 4, pp. 1175–1189, 2007.

[12] H. Liang, G. Zou, A. T. K. Wan, and X. Zhang, "Optimal weight choice for frequentist model average estimators," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1053–1066, 2011.

[13] X. Zhang and H. Liang, "Focused information criterion and model averaging for generalized additive partial linear models," The Annals of Statistics, vol. 39, no. 1, pp. 194–200, 2011.

[14] B. E. Hansen and J. S. Racine, "Jackknife model averaging," Journal of Econometrics, vol. 167, no. 1, pp. 38–46, 2012.

[15] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proceedings of the 2nd International Symposium on Information Theory, B. N. Petrov and F. Csaki, Eds., pp. 267–281, Akademiai Kiado, Budapest, Hungary, 1973.

[16] C. L. Mallows, "Some comments on C_p," Technometrics, vol. 15, no. 4, pp. 661–675, 1973.

[17] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society B: Methodological, vol. 36, pp. 111–147, 1974.

[18] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.

[19] P. Craven and G. Wahba, "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation," Numerische Mathematik, vol. 31, no. 4, pp. 377–403, 1979.

[20] D. W. K. Andrews, "Consistent moment selection procedures for generalized method of moments estimation," Econometrica: Journal of the Econometric Society, vol. 67, no. 3, pp. 543–564, 1999.

[21] G. Claeskens and N. L. Hjort, "The focused information criterion," Journal of the American Statistical Association, vol. 98, no. 464, pp. 900–945, 2003, with discussions and a rejoinder by the authors.

[22] Y. Zhang, R. Li, and C.-L. Tsai, "Regularization parameter selections via generalized information criterion," Journal of the American Statistical Association, vol. 105, no. 489, pp. 312–323, 2010.

[23] G. Xu and J. Z. Huang, "Asymptotic optimality and efficient computation of the leave-subject-out cross-validation," The Annals of Statistics, vol. 40, no. 6, pp. 3003–3030, 2012.

[24] M. K. P. So and T. Ando, "Generalized predictive information criteria for the analysis of feature events," Electronic Journal of Statistics, vol. 7, pp. 742–762, 2013.

[25] J. J. J. Groen and G. Kapetanios, "Model selection criteria for factor-augmented regressions," Oxford Bulletin of Economics and Statistics, vol. 75, no. 1, pp. 37–63, 2013.

[26] S. Lai and Q. Xie, "A selection problem for a constrained linear regression model," Journal of Industrial and Management Optimization, vol. 4, no. 4, pp. 757–766, 2008.

[27] J. Shao, "An asymptotic theory for linear model selection," Statistica Sinica, vol. 7, no. 2, pp. 221–264, 1997, with comments and a rejoinder by the author.

Abstract and Applied Analysis 13

[28] D. W. K. Andrews, "Asymptotic optimality of generalized C_L, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, vol. 47, no. 2-3, pp. 359–377, 1991.

[29] K.-C. Li, "Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: discrete index set," The Annals of Statistics, vol. 15, no. 3, pp. 958–975, 1987.

[30] J. A. F. Machado, "Robust model selection and M-estimation," Econometric Theory, vol. 9, no. 3, pp. 478–493, 1993.


