Stock Return Predictability, Conditional Asset Pricing Models and Portfolio Selection
Ane Tamayo1
London Business School
Regent’s Park
London NW1 4SA2
First Draft: September, 2000
This draft: May, 2002
1This paper is part of my dissertation at the University of Rochester. I thank the members of my dissertation committee, Gregory Bauer, Christopher Jones, John Long, and especially the committee chair, Jay Shanken, for helpful discussions and comments. The comments of Andreas Gintschel, Ludger Hentschel, Stanimir Markov, Micah Officer, William Schwert, Raman Uppal, Michela Verardo, and seminar participants at the European Finance Association Meetings at Barcelona, Centro de Estudios Monetarios y Financieros (CEMFI), Frank Russell Company, HEC School of Management, London Business School, Ohio State University, Universidad Carlos III, Universitat Pompeu Fabra, and University of Rochester are also appreciated. I am responsible for any remaining errors.
2Telephone: +44 (0)20 7262 5050, Ext.: 3410. Fax: +44 (0)20 7724 6573. E-mail: [email protected].
Abstract
I examine an investor’s portfolio allocation problem across multiple risky assets in the presence of return predictability when, in addition to the predictability evidence, the investor uses conditional asset pricing models to guide him in the portfolio selection decision. I also explore how the uncertainty associated with the model dynamics affects the investor’s optimal portfolio. To analyze this, I introduce Bayesian techniques that have not been used before in the asset pricing literature.
Using a market index and a small capitalization or a value portfolio, I find that the sample evidence on predictability plays a major role in the investor’s portfolio allocation decision. The optimal portfolio also depends on his beliefs about the extent to which this predictability can be attributed to time variation in risk premia and betas. Finally, I show that the portfolio allocation decision is also affected by the investor’s uncertainty about the beta risk dynamics.
1 Introduction
Consider an investor who observes a number of variables that may predict stock returns and has
some knowledge of asset pricing theory. How can he use this information to allocate funds between a
riskless asset and a portfolio of risky assets? In this paper, I address this question by examining the
portfolio allocation problem of a Bayesian investor when returns may be predictable. In addition
to the predictability evidence, the investor uses asset pricing models to guide him in the portfolio
selection decision. In particular, since he conditions on ex-ante information, a natural benchmark is
provided by conditional asset pricing models, in which the stock return predictability is attributed
to time variation in risk premia and risk exposures or betas. The use of conditional models, however,
introduces further uncertainty into the problem related to the unobservable dynamics of expected
returns, risk premia and betas. As a first attempt to deal with this type of model uncertainty,
I suggest a simple econometric framework in which betas and risk premia are latent, stochastic
functions of the predictive variables.
Although the predictability evidence goes back at least to the late 70s,1 its effect on portfolio
allocation decisions (i.e., its economic significance) has not been explored until recently. In partic-
ular, Kandel and Stambaugh (1996) have shown that the asset allocation decision of a Bayesian
investor between a market index and a riskless asset is affected by the predictability evidence even
though this evidence could be regarded as insignificant using standard statistical measures.2 With
the exception of a few studies (e.g., Avramov, 2000; Bauer, 2000; Cremers, 2001), however, previous
literature has reduced the asset choice to a market index and a riskless asset. Allowing for multiple
risky assets is interesting because there are considerable cross-sectional differences in the time series
predictability of returns. For example, expected returns on small capitalization stocks are more
sensitive to changes in several predictive variables, such as dividend yields and default spreads,
than expected returns on large capitalization stocks (e.g., Fama and French, 1989; Harvey, 1989).
Thus, one contribution of this paper is to provide further evidence on the economic significance of
predictability when the investor allocates his funds across multiple risky assets.
Once several risky assets are introduced into the problem, the potential usefulness of asset
pricing models becomes an important consideration. For example, Pastor (2000) has shown that
an investor’s beliefs about the validity of an (unconditional) asset pricing model can largely affect
his optimal portfolio (see also Jorion, 1991; Grauer and Hakansson, 1995). These studies, however,
assume that returns are identically, independently distributed and focus on unconditional models.
In the presence of return predictability, the relevant benchmark is provided by conditional asset
1For example, see Fama and Schwert (1977) for an early study. More recent studies include Keim and Stambaugh (1986), Fama and French (1989), Goetzmann and Jorion (1993), and Kothari and Shanken (1997).
2Other studies examining the economic significance of return predictability include Pesaran and Timmermann (1995), Kim and Omberg (1996), Brennan et al (1997), Campbell and Viceira (1999), Barberis (2000), Avramov (2000), and Shanken and Tamayo (2001).
pricing models, in which the return predictability is captured by time variation in risk premia and
betas. Thus, in this paper, I extend the previous literature by formally considering the role of
conditional asset pricing models in portfolio selection problems. Although conditional asset pricing
models have been widely examined, there is no evidence about the extent to which departures from
the models and prior beliefs affect an investor’s optimal portfolio.3 For example, if an investor
dogmatically believes in a conditional model, his optimal portfolio should be a combination of only
benchmark portfolios that expose investors to priced sources of risk. At the other extreme, if he
dogmatically believes that the predictability cannot be explained by an asset pricing model, he
should ignore any evidence supporting the model. Finally, if the investor does not hold dogmatic
beliefs, his optimal portfolio should be affected by the sample evidence on time variation in alphas,
betas, and risk premia, and the model’s ability to explain average returns.
The use of conditional models, however, complicates the analysis because it introduces further
uncertainty into the problem. Prior literature has shown (e.g., Zellner and Chetty, 1965; Bawa,
Brown and Klein, 1979; and, more recently, Kandel and Stambaugh, 1996; Anderson et al, 1999;
Barberis, 2000; Maenhout, 2000; Pastor, 2000; Avramov, 2000) that the optimal portfolio of a
Bayesian investor is affected by parameter or, more broadly, model uncertainty. In a conditional
setting the existence of model uncertainty is an important concern, because finance theory provides
only a vague indication of how expected returns, risk premia and betas vary over time. Hence,
investors face additional uncertainty associated with the unobservable dynamics of these inputs to
the portfolio selection problem. Although this type of uncertainty has been addressed in several
theoretical papers, there is little empirical evidence about its effect on asset pricing. In this paper, I
empirically explore how uncertainty about the beta dynamics affects an investor’s optimal portfolio
and, more generally, suggest an econometric approach to incorporate uncertainty about the model
dynamics into the problem.
The model that I suggest treats betas, and potentially risk premia, as latent (or unobservable)
variables and stochastic functions of some observable instruments. Previous literature has modeled
betas as deterministic functions of ex-ante observable variables even though deterministic models
are unlikely to be perfect descriptions of the true dynamics.4 The possibility of model misspecifi-
cation has therefore been ignored in these studies. The framework that I suggest allows for model
misspecification and incorporates the investor’s uncertainty about the beta dynamics into the prob-
lem. In particular, by using stochastic betas, the investor can specify a prior distribution for the
error terms in the beta equation to reflect his uncertainty about the beta model. In addition, he
3Conditional models have been studied by Bollerslev et al (1988), Harvey (1989), Shanken (1990), Bodurtha and Mark (1991), Ng (1991), Ferson and Harvey (1991, 1993, 1999), Evans (1994), Ferson and Korajczyk (1995), Braun et al (1995), He et al (1996), Ghysels (1998), and Lewellen (1999) among others.
4Stochastic models for betas have been previously suggested by Rosenberg (1973) and Ohlson and Rosenberg (1983), among others. However, these studies model betas as functions of lagged betas only and, more importantly, do not present a framework to address directly the investor’s uncertainty about the beta dynamics.
can separate his uncertainty about the beta dynamics from his beliefs about the asset pricing model
(alphas). For example, if an investor believes in the conditional CAPM but is uncertain about the
true beta process, he will probably set very tight priors around zero for alphas and allow for fairly
diffuse priors for the regression parameters and the residual variance in the beta equation. Given
that the time variation in alphas and betas plays very different roles in portfolio allocation problems,
it is important to consider these two sources of model uncertainty separately.
The framework that I adopt is Bayesian. Bayesian methods provide a convenient way to explore
portfolio selection problems because they account for parameter uncertainty by considering the
predictive distribution of returns. Furthermore, an investor’s priors about the model can be formally
incorporated into the portfolio selection problem. To estimate the posterior distribution of the
model parameters, I provide a new Bayesian estimation method that could be applied to other
problems. In particular, I use a data augmentation algorithm via Markov Chain Monte Carlo
methods, which consists of augmenting the data (i.e., the returns and predictive variables) with
a latent variable, the betas, and using the Gibbs sampler or the Metropolis-Hastings algorithm
to estimate the posterior distributions of the parameters. Also, I provide some insight into the
stochastic nature of betas and examine to what extent the simplification of deterministic, time-
varying betas is warranted.
To illustrate the framework suggested here, I examine the optimal portfolio of an investor
who allocates funds across a riskfree asset, a value-weighted market index (benchmark asset) and
a portfolio of small capitalization or value stocks (non-benchmark assets). The investor considers
the conditional CAPM as a reference point and uses the dividend yield, term and default spreads
as predictive variables. I examine the optimal portfolio under two scenarios for the price of risk
(the ratio of the expected return on the market to the variance), constant and time-varying price
of risk; and under two scenarios for the dynamics of beta, deterministic and stochastic betas.
In my empirical analysis, I find that the sample evidence on return predictability affects an
investor’s portfolio allocation decision, which is consistent with previous studies (e.g. Kandel and
Stambaugh, 1996; Barberis, 2000). The optimal portfolio mix, however, depends largely on the
investor’s beliefs about the source of predictability. As expected, if an investor dogmatically be-
lieves that the predictability can be captured by a conditional model, the optimal allocation to
non-benchmark assets does not depend on the predictability evidence. However, if he allows for
conditional mispricing, the predictability evidence plays a major role in his allocation decision. I
also find that incorporating the investor’s uncertainty about the beta dynamics into the problem is
economically important and that allowing for time variation in the risk premia makes the optimal
portfolio even more sensitive to the predictability evidence.
This paper proceeds as follows. Section two presents a general framework to investigate how
predictability in asset returns affects an investor’s portfolio choice. First it briefly reviews the
literature on portfolio selection in a Bayesian setting and then introduces the framework suggested
in this paper to incorporate both return predictability and the investor’s beliefs about the source
of predictability in the portfolio selection problem. Section three presents the modeling framework
and Bayesian methodology used in this paper. It also discusses ways to examine the economic
significance of the evidence on predictability and the source of this predictability. Section four
describes the data and the priors. The empirical results are presented in sections five and six.
Section seven concludes the paper.
2 Portfolio Selection Under Time Series Return Predictability
2.1 General Framework and Previous Work5
Consider a risk averse investor with a one-period investment horizon who invests in a riskless
asset and a portfolio of $(K + N)$ risky assets. $K$ of the risky assets are benchmark portfolios
that expose the investor to priced sources of risk and $N$ of them are “non-benchmark” portfolios.
Let $\omega$ denote the fraction of the investor’s portfolio allocated to the riskless asset and $w$ denote
the $(K + N) \times 1$ vector of weights in the risky assets, such that $w'\iota_{K+N} = 1$, where $\iota_{K+N}$ is a
conformable vector of ones. The investor’s wealth at the end of month $T + 1$ is
$$ W_{T+1} = W_T \left(1 + r_{f,T} + (1 - \omega)\, w' r_{T+1}\right), \qquad (1) $$
where $r_{f,T}$ is the rate of return on the riskless asset, observed at the end of month $T$, and $r_{T+1}$ is
the $(K + N) \times 1$ vector of returns on the risky assets in month $T + 1$ in excess of $r_{f,T}$.
Let $\Phi_T$ be the information that the investor observes at the end of month $T$. The investor
chooses $\omega$ and $w$ so as to maximize the expected utility of his wealth at the end of month $T + 1$,
$$ \max_{\omega,\, w} \int U(W_{T+1})\, p(r_{T+1} \mid \Phi_T)\, dr_{T+1}, \qquad (2) $$
where $U(\cdot)$ is the investor’s utility function and $p(r_{T+1} \mid \Phi_T)$ is the density of $r_{T+1}$ conditional on
$\Phi_T$, also referred to as the predictive density.
To derive the predictive density $p(r_{T+1} \mid \Phi_T)$ and, hence, obtain the optimal portfolio, a Bayesian
investor updates his beliefs about the model’s parameters, $\Theta$, and specifies the likelihood function
for the future observation, $p(r_{T+1} \mid \Theta, \Phi_T)$. After observing the data, the investor’s beliefs about the
parameters are summarized by the posterior distribution of $\Theta$, which is proportional to the product of the prior
distribution and the likelihood function
$$ p(\Theta \mid \Phi_T) \propto p(\Theta)\, L(\Theta \mid \Phi_T). \qquad (3) $$
5For more details about the general framework for portfolio selection in a Bayesian setting, see, for example, Bawa and Brown (1976) and, more recently, Kandel and Stambaugh (1996), Barberis (2000), and Pastor (2000).
The predictive density is then obtained by integrating the product of the likelihood function for
the future observation and the posterior distribution of $\Theta$ with respect to the parameters
$$ p(r_{T+1} \mid \Phi_T) = \int p(r_{T+1} \mid \Theta, \Phi_T)\, p(\Theta \mid \Phi_T)\, d\Theta. \qquad (4) $$
By integrating over the parameter space, the Bayesian investor explicitly takes into account
parameter uncertainty or estimation risk in the portfolio allocation problem.
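The integration over the parameter space can be approximated by simulation. The following is a minimal Python sketch, not the procedure used in this paper: it assumes a single risky asset with i.i.d. normal excess returns, the standard diffuse prior on the mean and variance, and power utility with a hypothetical risk aversion of 5. The function names, the riskless rate and the grid search are illustrative assumptions.

```python
import numpy as np


def predictive_draws(excess_returns, n_draws=20000, seed=0):
    """Simulate from the predictive density of next period's excess return.

    Assumes i.i.d. normal returns and the diffuse prior p(mu, sigma2) ∝ 1/sigma2,
    so sigma2 | data is scaled inverse chi-square and mu | sigma2, data is normal.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(excess_returns, dtype=float)
    n = r.size
    r_bar = r.mean()
    sum_sq = np.sum((r - r_bar) ** 2)
    # Posterior draws of the parameters.
    sigma2 = sum_sq / rng.chisquare(n - 1, size=n_draws)
    mu = rng.normal(r_bar, np.sqrt(sigma2 / n))
    # One future return per parameter draw integrates the parameters out.
    return rng.normal(mu, np.sqrt(sigma2))


def optimal_risky_share(pred, rf=0.003, gamma=5.0):
    """Grid search for the risky share that maximizes expected power utility."""
    grid = np.linspace(0.0, 0.99, 100)
    expected_utility = [
        np.mean((1.0 + rf + share * pred) ** (1.0 - gamma) / (1.0 - gamma))
        for share in grid
    ]
    return float(grid[int(np.argmax(expected_utility))])


if __name__ == "__main__":
    # Hypothetical data: 120 months of simulated excess returns.
    data = np.random.default_rng(1).normal(0.006, 0.045, size=120)
    pred = predictive_draws(data)
    print("predictive mean:", pred.mean(), "risky share:", optimal_risky_share(pred))
```

Because each simulated return uses its own parameter draw, the simulated predictive distribution is more dispersed than a normal distribution evaluated at the point estimates, which is how estimation risk enters the allocation.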
From the discussion above, it becomes clear that the optimal portfolio depends on the model
(likelihood function) that the investor specifies (in addition to his prior beliefs and utility function).
Traditionally, most studies on portfolio choice in a Bayesian setting have assumed that returns
are independently and identically distributed (i.i.d.) over time (e.g., Klein and
Bawa, 1976; Bawa et al, 1979; Jobson and Korkie, 1980; Jorion, 1985, 1991; Frost and Savarino,
1986). Starting with the pioneering work of Kandel and Stambaugh (1996), some recent work has
analyzed the portfolio allocation problem when returns may be predictable.
Kandel and Stambaugh (1996) examine the portfolio choice between a market index and a
riskless asset of an investor who conditions on dividend yield. They find that the optimal portfolio
may depend on the current level of dividend yield even though its predictive ability could be
regarded as insignificant using standard statistical measures. Barberis (2000) extends the analysis
to long-horizon investors and finds that investors allocate more to stocks the longer their horizon as
a result of the predictability evidence. Finally, Shanken and Tamayo (2001) explore the portfolio
allocation problem when both expected returns and risk are time-varying. They show that an
investor’s optimal portfolio is affected by his prior beliefs about whether the predictive ability of
the dividend yield is due to changes in risk or mispricing. Studies extending the analysis to multiple
risky assets include Avramov (2000), Bauer (2000) and Cremers (2001).6
Most of this work, however, has not considered the potential usefulness of asset pricing models
in portfolio selection problems, mainly because it tends to focus on a single risky asset. As Pastor
(2000) puts it, these studies represent “data-based” approaches to portfolio selection; they specify
a functional form for the distribution of returns and estimate the parameters from the data. When
investors can choose from a wider set of assets, however, they are likely to also use theoretical models
to help them in their portfolio allocation decision. In contrast to the “data-based” approach, this
“model-based” approach specifies an asset pricing model and the optimal portfolio of every investor
is a combination of benchmark portfolios that expose investors to priced sources of risk.7 Pastor
(2000) examines an approach to portfolio allocation that lies in between the “data-” and “model-
based” approaches in an i.i.d. context. He finds that an investor’s beliefs about a model’s pricing
ability can affect his asset allocation decision.
6Other studies analyzing the asset choice problem between a market index and a riskless asset in the presence of time-varying expected returns include Kim and Omberg (1996), Brennan et al (1997), and Campbell and Viceira (1999). These studies, however, ignore parameter uncertainty.
7The notion of “data” versus “model-based” approaches to portfolio selection was first introduced in Pastor (2000).
The portfolio selection problem presented in this paper also incorporates a “data-and-model-
based” approach. However, I relax the i.i.d. setting by allowing for time series predictability in
returns. Accordingly, the investor analyzed here focuses on conditional rather than unconditional
asset pricing models. These models are discussed in the next subsection.
2.2 Portfolio Selection Using Conditional Models
Consider an investor who observes $M$ predictive variables (including a constant), $Z_T = [z_1', \ldots, z_T']'$,
at the beginning of month $T$, where $z_t = [1, z_{1,t}, \ldots, z_{M-1,t}]$ is a $1 \times M$ vector for $t = 1, \ldots, T$, and $Z_T$
is a subset of all the information that investors use to set prices, $\Omega_T$. The investor decides upon the
optimal weight on asset $i$, which can be a benchmark or non-benchmark portfolio, based partly on
the evidence from the time-series regression
$$ r_{i,t+1} = z_t b_i + \varepsilon_{i,t+1}, \qquad (5) $$
for $i = 1, \ldots, K + N$ and $t = 1, \ldots, T$, and $\varepsilon_{i,t+1} \sim N(0, \sigma^2_{\varepsilon_i,t+1})$.
In addition to the evidence from this predictive regression, the investor uses, at least indirectly,
asset pricing models to guide him in the portfolio selection problem. In particular, he may believe
that the expected returns on some assets (non-benchmark assets) are partly explained by the
assets’ exposures to priced sources of risk, whose realizations are replicated by the returns on the
benchmark assets. Hence, the investor follows a “data-and-model” based approach to portfolio
selection and, since expected returns may be time-varying, the natural reference point is provided
by conditional asset pricing models.
According to a conditional model, the cross-sectional and time series variation in expected
returns can be explained by a model with time-varying betas and risk premia. If there are $K$
priced sources of risk, the expected return on non-benchmark asset $i$, $i = 1, \ldots, N$, conditional on
all available information is given by
$$ E[r_{i,t+1} \mid \Omega_t] = E[\lambda_{t+1} \mid \Omega_t]\, \beta_{it}, \qquad (6) $$
where $\lambda_{t+1}$ is a $1 \times K$ vector of risk premia, $\beta_{it}$ is a $K \times 1$ vector of conditional factor loadings and $E(\cdot \mid \Omega_t)$
denotes expectation conditional on $\Omega_t$. If the conditional asset pricing model holds, the optimal
portfolio of every investor should be a combination of the $K$ benchmark portfolios.
The investor, however, may not believe that a conditional model can capture all the time-series
return predictability or price assets.8 Furthermore, even if he is fairly confident that a model could
hold conditional on all available information, he may want to account for model departures in his
8As in previous studies (e.g. Kandel and Stambaugh, 1996; Pastor, 2000), the investor in this paper should not be viewed as a representative investor, since equilibrium cannot be obtained if all investors have the same beliefs and use the same model (likelihood function) as this investor.
analysis, given that he uses a subset of information, $Z_T$. This investor can use both a conditional
model and the predictive regression in (5) to help him in the portfolio selection problem.
Assume, for ease of exposition, that the investor chooses between a market index (the benchmark
asset) and a portfolio of non-benchmark assets. Assume also that he uses the conditional
Sharpe-Lintner CAPM as a reference point in the asset allocation decision, where the market index is a
proxy for the market portfolio (hence, $K = N = 1$). The investor could combine the “data-based”
and “model-based” approaches to portfolio selection by examining the following system for the
non-benchmark asset
$$ r_{t+1} = \alpha_t + \beta_t r_{m,t+1} + \varepsilon_{t+1}, \qquad (7) $$
$$ \alpha_t = z_t \delta_\alpha, $$
$$ \beta_t = z_t \delta_\beta + u^*_{\beta t}, $$
where $\delta_\alpha$ and $\delta_\beta$ are $M \times 1$ vectors of parameters, $\varepsilon_{t+1} \sim N(0, \sigma^2_{\varepsilon,t+1})$ and $u^*_{\beta t} \sim N(0, \sigma^{*2}_\beta)$. For the
benchmark asset he still uses the predictive regression in (5).9
The models presented in (5) for the benchmark asset and in (7) for the non-benchmark asset
provide a general framework to analyze the role that beliefs play in an investor’s asset allocation.
In particular, it is possible to study how the optimal portfolio depends on the investor’s beliefs
about predictability, and the ability of a conditional model to capture the predictability and price
assets.
To make the specification of prior beliefs easier, express the predictive variables, $z_t$, as deviations
from their means. In this case, assuming stationarity, the first elements of the parameter vectors
$b_m$ in (5), and $\delta_\alpha$ and $\delta_\beta$ in (7), denoted by $\bar{\lambda}$, $\bar{\alpha}$, and $\bar{\beta}$ respectively, are the long run means of the
market risk premia, alphas and betas. Hence, the simple case in which the investor has dogmatic
beliefs that expected returns are constant can be represented by setting very tight priors around
$b_m = [\lambda_0 \;\; 0'_{M-1}]'$ for the benchmark asset in (5), and $\delta_\alpha = [\alpha_0 \;\; 0'_{M-1}]'$, $\delta_\beta = [\beta_0 \;\; 0'_{M-1}]'$ and
$\sigma^{*2}_\beta = 0$ for the non-benchmark asset in (7), where $0_{M-1}$ is an $(M - 1) \times 1$ vector of zeros and
$\lambda_0$, $\alpha_0$, and $\beta_0$ are the (long run) prior means of the risk premium, alpha, and beta. Cases in
which the investor allows for predictability can also be easily explored. For example, if an investor
dogmatically believes in the conditional model, he will have very tight priors that the parameters
for alpha equal zero, i.e., $\delta_\alpha = 0_M$. At the other extreme, if he dogmatically believes that the
predictability cannot be explained by an asset pricing model, he will center $\delta_\alpha$ away from zero or
set diffuse priors around $\delta_\alpha = 0_M$, and maybe $\sigma^{*2}_\beta = 0$, $b_m = [\bar{\lambda} \;\; 0'_{M-1}]'$ and $\delta_\beta = [\bar{\beta} \;\; 0'_{M-1}]'$. Finally, if
the investor does not hold dogmatic beliefs, he will not specify tight priors and his optimal portfolio
will be affected by the sample evidence on time variation in alphas, betas, and risk premia, and the
model’s ability to explain average returns.
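How tight or diffuse priors translate into posterior beliefs about mispricing can be illustrated with a textbook normal-normal update. The sketch below is an illustration under strong simplifying assumptions, not the estimation method of this paper: it treats a single constant alpha whose sample estimate `alpha_hat` has a known standard error `se`, and all names and numbers are hypothetical.

```python
def posterior_alpha(alpha_hat, se, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean and standard deviation of alpha under a normal prior.

    The posterior mean is a precision-weighted average of the prior mean
    and the sample estimate (known-variance normal model).
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * alpha_hat)
    return post_mean, post_var ** 0.5


if __name__ == "__main__":
    alpha_hat, se = 0.004, 0.002  # hypothetical monthly alpha estimate and standard error
    for label, prior_sd in [("dogmatic", 1e-5), ("moderate", 0.002), ("diffuse", 1.0)]:
        mean, sd = posterior_alpha(alpha_hat, se, prior_sd=prior_sd)
        print(f"{label:9s} prior: posterior mean alpha = {mean:.6f} (sd {sd:.6f})")
```

With a very tight prior around zero the posterior alpha stays essentially at zero whatever the sample says, while a diffuse prior leaves the posterior close to the sample estimate; intermediate priors shrink the estimate part of the way toward the model.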
9Note that, for the non-benchmark asset, (5) is nested in (7).
2.3 Uncertainty about the Model Dynamics
One problem with conditional models is that finance theory provides very little indication of
how expected returns, risk premia and betas vary over time. Furthermore, empirical analyses of
conditional models necessarily use only part of all the available information. These factors introduce
additional uncertainty into the problem related to the model dynamics. A simplified approach to
deal with this type of uncertainty is to treat the risk premia and betas as latent (or unobservable)
variables and model them as stochastic functions of some observable instruments.
For the non-benchmark assets, I therefore assume that the investor considers the system in
(7). A similar framework was first suggested by Shanken (1990) to test conditional asset pricing
models but, like most of the previous literature, he assumes that betas are deterministic functions
of the predictive variables.10 In contrast, I model them as stochastic functions (i.e., I allow for
error terms in the beta equations), which allows me to incorporate into the problem the investor’s
uncertainty about the beta dynamics by specifying prior distributions for the regression parameters,
$\delta_\beta$, and the variance of the error term in the beta equation, $\sigma^{*2}_\beta$.11 Furthermore, the investor can
separate his uncertainty about the beta dynamics (i.e., $\delta_\beta$ and $\sigma^{*2}_\beta$) from his beliefs about the
asset pricing model (i.e., $\delta_\alpha$). For example, if he thinks that the conditional CAPM is likely to
hold but is uncertain about the true beta process, he will set very tight priors around zero for the
alpha parameters, $\delta_\alpha$, and allow for fairly diffuse priors for the regression parameters, $\delta_\beta$, and the
residual variance $\sigma^{*2}_\beta$ in the beta equation. Given that time variation in alphas and betas plays very
different roles in portfolio allocation problems (see section 3.3), it is important to consider these
two sources of uncertainty separately.
The investor may also want to account for uncertainty about the risk premia dynamics. Again,
he could model the expected excess returns on the benchmark assets as latent variables, $\lambda_t$, using
the following model
$$ r_{m,t+1} = \lambda_t + \nu_{t+1}, \qquad (8) $$
$$ \lambda_t = z_t \delta_\lambda + u^*_{\lambda t}, $$
where the investor’s uncertainty is reflected by his prior beliefs about $\delta_\lambda$ and $\sigma^{*2}_\lambda$. The practical
implementation of this system, however, presents econometric problems because empirically the
variance of $u^*_{\lambda t}$ is very small compared to the variance of $\nu_{t+1}$. Pitt and Shephard (1999) show that
in this case, if the error terms $u^*_{\lambda t}$ are highly autocorrelated, algorithms to estimate models like (8)
converge very slowly or even fail to converge. So, unless we have strong priors about the parameters
10Betas have been modeled as deterministic functions of macroeconomic and financial variables (e.g., Harvey, 1989; Shanken, 1990; Ferson and Harvey, 1991, 1993, 1999; Ferson and Korajczyk, 1995; He et al, 1996), firm specific variables (e.g., Ferson and Korajczyk, 1995 and Lewellen, 1999), and lagged values of betas, variances or covariances (e.g., Bollerslev et al, 1988; Bodurtha and Mark, 1991; Ng, 1991; Evans, 1994). See Tamayo (2000) for further discussion on tests of conditional asset pricing models when betas are stochastic versus deterministic.
11In addition, stochastic models allow for potential misspecification in the model.
governing the model, its estimation may present difficulties (the same applies to modeling alphas
as latent variables). Given this problem, in my empirical analysis I do not model the risk premia or
alphas as stochastic processes, although in principle one could do so by specifying strong priors.12
As I show in the empirical section, the estimation of stochastic betas does not pose the problem
described above because the variance of $u^*_{\beta t}$ is large compared to the variance of $\varepsilon_{t+1}$.
3 Modeling Framework and Bayesian Methodology
In the empirical analysis, I restrict the study to $K = N = 1$; that is, the investor chooses
between a market index (benchmark portfolio) and one non-benchmark portfolio. The methodology,
however, can be applied to a larger number of benchmark and non-benchmark assets. The general
case is derived in the appendices.
I model the error terms in the beta equations as first order autoregressive, AR(1), processes.
These errors are likely to be serially correlated since variables other than $z_t$ (i.e., the “omitted”
variables) could capture persistent components in beta. More generally, model misspecification can
also lead to autocorrelated errors in the beta equation. Therefore, for the non-benchmark asset,
the investor estimates the following model13
$$ r_{t+1} = \alpha_t + \beta_t r_{m,t+1} + \varepsilon_{t+1}, \qquad (9) $$
$$ \alpha_t = z_t \delta_\alpha, $$
$$ \beta_t = (z_t - \rho_\beta z_{t-1})\, \delta_\beta + \rho_\beta \beta_{t-1} + u_{\beta t}, $$
where $\rho_\beta$ is the autoregressive parameter, and $\varepsilon_{t+1} \sim N(0, \sigma^2_\varepsilon)$ and $u_{\beta t} \sim N(0, \sigma^2_\beta)$ are
homoskedastic error terms.14 For the benchmark asset, he estimates the regression
$$ r_{m,t+1} = z_t b_m + \varepsilon_{m,t+1}, \qquad (10) $$
where $\varepsilon_{m,t+1} \sim N(0, \sigma^2_m)$. The latter regression is also estimated with random coefficients, $b_{m,t}$, to
allow for conditional heteroskedasticity, as explained in the next subsection.
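The beta equation in (9) is the quasi-differenced form of a regression of beta on the instruments with AR(1) errors. The following Python sketch simulates such a process and checks the equivalence numerically. It is illustrative only: the names `delta_beta`, `rho` and `sigma_u` are labels chosen here for the regression coefficients, the autoregressive parameter and the innovation standard deviation, and the parameter values are hypothetical.

```python
import numpy as np


def simulate_beta(z, delta_beta, rho, sigma_u, rng):
    """Simulate beta_t = z_t @ delta_beta + u*_t, where u*_t follows an AR(1)."""
    n_obs = z.shape[0]
    shocks = rng.normal(0.0, sigma_u, size=n_obs)
    u_star = np.empty(n_obs)
    # Start the AR(1) error from its stationary distribution.
    u_star[0] = rng.normal(0.0, sigma_u / np.sqrt(1.0 - rho ** 2))
    for t in range(1, n_obs):
        u_star[t] = rho * u_star[t - 1] + shocks[t]
    return z @ delta_beta + u_star, shocks


def quasi_differenced_beta(z, beta, shocks, delta_beta, rho):
    """Rebuild beta_t (t >= 1) as (z_t - rho z_{t-1}) delta + rho beta_{t-1} + u_t."""
    return (z[1:] - rho * z[:-1]) @ delta_beta + rho * beta[:-1] + shocks[1:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_obs = 240
    # Constant plus two demeaned instruments (hypothetical values).
    z = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])
    delta_beta = np.array([1.1, 0.2, -0.1])
    beta, shocks = simulate_beta(z, delta_beta, rho=0.8, sigma_u=0.05, rng=rng)
    rebuilt = quasi_differenced_beta(z, beta, shocks, delta_beta, rho=0.8)
    print("max abs difference:", np.max(np.abs(rebuilt - beta[1:])))
```

Setting `rho` to zero and `sigma_u` to zero in this sketch gives betas that are exact linear functions of the instruments, the deterministic special case discussed in the text.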
One appealing feature of the model for beta is that it nests many specifications previously
suggested in the literature. For example, assuming that $\rho_\beta = 0$ and ignoring the error terms in
the $\beta_t$ equation, this is equivalent to the specifications in, for example, Shanken (1990), Ferson and
Harvey (1991, 1993, 1999), and Lewellen (1999) where $\beta_t$ is linearly related to ex-ante observable
12By specifying strong priors about the model parameters, one “guides” the simulation algorithm and, hence, it is easier to achieve convergence.
13This result is derived by modeling the error term as an AR(1) process, $u^*_{\beta t} = \rho_\beta u^*_{\beta,t-1} + u_{\beta t}$. After substituting in $\beta_t = z_t \delta_\beta + u^*_{\beta t}$, this yields $\beta_t = z_t \delta_\beta + \rho_\beta (\beta_{t-1} - z_{t-1} \delta_\beta) + u_{\beta t} = (z_t - \rho_\beta z_{t-1})\, \delta_\beta + \rho_\beta \beta_{t-1} + u_{\beta t}$.
14Note that due to leverage effects, the residual variance of the non-benchmark asset could vary with the betas. This is left to future research.
variables. Another case, related to the GARCH specifications in Evans (1994), and Braun et al
(1995), amounts to letting all the elements of $z_t$ but the constant equal 0 at any $t$.
3.1 Assumptions About the Price of Risk
One important input to the portfolio selection problem, and central to the variation in betas,
is the conditional variance of the risk premia. Modeling general forms of heteroskedasticity in a
Bayesian framework adds an additional complication to the problem. This is examined in Shanken
and Tamayo (2001) and is not pursued here.15 Instead, I make some simplifying assumptions and
investigate the investor’s portfolio allocation problem under two different scenarios for the price of
risk (the ratio of the expected return on the benchmark asset to its conditional variance).
In the first scenario, I assume that the investor has dogmatic priors that the price of risk is
constant over time. I model the expected return on the benchmark asset using (5) and vary the
conditional variance accordingly so that the price of risk remains constant. Hence, in this case, I
do not need to model the conditional variance. The assumption of constant price of risk has been
adopted in several tests of conditional asset pricing models although it has been rejected empirically
in several studies (e.g., Harvey, 1989). Nonetheless, it provides a starting point in the analysis of
the economic significance of time series return predictability in the presence of multiple assets.
In the second scenario, I relax the constant price of risk assumption and model the conditional
variance of the benchmark asset as a function of the predictive variables using a regression model
with random coefficients. For $k = 1, \ldots, K$ benchmark assets, the model in (10) is extended to a
model with random coefficients
$r_{m,t+1} = X_t\delta_{m,t} + e_{t+1}$ (11)
$\delta_{m,t} = \delta_m + v_t$
where $e_{t+1} \sim N(0, \sigma^2)$, $v_t \sim N(0, \Sigma_{vv})$ and $\Sigma_{vv}$ is an $L \times L$ matrix, with $L$ the dimension of $X_t$.16 This regression model
with random coefficients and constant variance can be transformed into a regression model with
deterministic coefficients and a heteroskedastic error term
$r_{m,t+1} = X_t\delta_m + e^*_{t+1}$ (12)
where $e^*_{t+1} = X_t v_t + e_{t+1}$. The variance of $e^*_{t+1}$ conditional on $X_t$ is
$\mathrm{Var}[e^*_{t+1}|X_t] = X_t\Sigma_{vv}X_t' + \sigma^2.$ (13)
If the predictive variables are expressed as deviations from their means, then the model in
(13) relates the conditional return variance to the volatility of the predictive variables, typically
macroeconomic and financial variables. This specification is in the spirit of Schwert (1989), who
relates the stock market volatility to the time-varying volatility of a variety of economic variables.
Although this random coefficient model is by no means the only way to model the variances, it
does provide a convenient approach to model them using the methodology developed in this paper.
15 Shanken and Tamayo (2001) model the conditional variance of the market as a function of the dividend yield
and use importance sampling to obtain the parameter estimates.
16 A similar framework could be used to allow for heteroskedasticity in the residual variances of the non-benchmark
assets.
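The conditional variance in (13) is a quadratic form in the predictive variables plus a constant. The sketch below evaluates it for one observation; the covariance matrix of the random coefficients, the residual variance, and the choice to give the constant term a non-random coefficient are all hypothetical values chosen for illustration.

```python
def conditional_variance(x, sigma_vv, sigma2):
    """Evaluate x * Sigma_vv * x' + sigma2 for a single row vector x."""
    n = len(x)
    quad = sum(x[i] * sigma_vv[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + sigma2


# Hypothetical inputs: x = (constant, demeaned predictor 1, demeaned predictor 2).
# The first row and column are zero, i.e. the intercept is assumed non-random.
SIGMA_VV = [[0.0, 0.00, 0.00],
            [0.0, 0.04, 0.01],
            [0.0, 0.01, 0.09]]
SIGMA2 = 1.0

var_at_mean = conditional_variance([1.0, 0.0, 0.0], SIGMA_VV, SIGMA2)
var_away = conditional_variance([1.0, 1.0, -1.0], SIGMA_VV, SIGMA2)
print(var_at_mean, var_away)
```

With demeaned predictors the variance is at its minimum when the predictors sit at their means and rises as they move away, which is the sense in which (13) ties return volatility to the state of the predictive variables.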
3.2 Bayesian Methodology
Priors
For the non-benchmark portfolio, I assume independent conjugate priors for the parameters in
(9), which simplifies the computation for the posteriors considerably:17
$p_0(\delta_\alpha, \delta_\beta, \phi_\beta, \sigma^{*2}_\beta, \sigma^2_\varepsilon) = p_0(\delta_\alpha)\,p_0(\delta_\beta)\,p_0(\phi_\beta)\,p_0(\sigma^{*2}_\beta)\,p_0(\sigma^2_\varepsilon).$ (14)
In particular, I specify a normal distribution as the prior distributions of $\delta_\alpha$ and $\delta_\beta$, a normal
distribution truncated to the stationary region (-1,1) as the prior of $\phi_\beta$ and an inverse gamma
distribution as the prior of $\sigma^{*2}_\beta$, the long run residual variance in the beta equation. The prior of
$\sigma^2_\beta$ is implied by the priors of $\sigma^{*2}_\beta$ and $\phi_\beta$ (for further details, see Tamayo, 2000). The prior of $\sigma^2_\varepsilon$
is assumed to be noninformative, $p_0(\sigma_\varepsilon) \propto \sigma_\varepsilon^{-1}$.
Finally, for the benchmark portfolio, I assume that the priors of the parameters in model (10)
are also independent conjugate priors:
$p_0(\delta_m, \sigma^2) = p_0(\delta_m)\,p_0(\sigma^2).$ (15)
I specify a normal distribution as the prior distribution of $\delta_m$. The prior of $\sigma^2$ is assumed to
be noninformative. The actual values for all the prior distributions are discussed in section 4.1 and
Appendix V.
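To make the prior structure for the beta equation concrete, the sketch below draws from a truncated normal prior for the autoregressive parameter and an inverse gamma prior for the long run variance, and then computes the innovation variance they imply. The hyperparameters are hypothetical, and the mapping between the long run and innovation variances is the standard stationary AR(1) relation, which is an assumption here rather than something stated in the text.

```python
import random


def draw_truncated_normal(mean, sd, lo=-1.0, hi=1.0, rng=random):
    """Rejection sampler for a normal distribution truncated to (lo, hi)."""
    while True:
        x = rng.gauss(mean, sd)
        if lo < x < hi:
            return x


def draw_inverse_gamma(shape, scale, rng=random):
    """If G ~ Gamma(shape, rate=scale), then 1/G ~ InverseGamma(shape, scale).

    random.gammavariate takes (shape, scale), so the rate is passed as 1/rate.
    """
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def draw_beta_equation_prior(rng=random):
    """One joint prior draw (phi, long run variance, implied innovation variance)."""
    phi = draw_truncated_normal(0.5, 0.5, rng=rng)        # hypothetical hyperparameters
    long_run_var = draw_inverse_gamma(3.0, 0.2, rng=rng)  # hypothetical hyperparameters
    # Stationary AR(1): long run variance = innovation variance / (1 - phi^2).
    innovation_var = long_run_var * (1.0 - phi ** 2)
    return phi, long_run_var, innovation_var


print(draw_beta_equation_prior(random.Random(0)))
```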
Estimation of Posteriors: Data Augmentation using MCMC Methods
The estimation of the models when beta is a deterministic function of the explanatory variables
follows standard Bayesian results (see Appendix I). When beta is a stochastic function, as in (9),
the system is a state-space model and its estimation is more complex.18 In a classical framework,
it is estimated using maximum likelihood and the Kalman filter. In a Bayesian framework, it can
be estimated using the two alternative methods that I briefly describe in this subsection.19 Further
details are in Appendices I and III, and Tamayo (2000). Notice that the models with random
coefficients are also state-space models and can therefore be estimated using the approach that I
use to estimate model (9).
17 I use $p_0$ and $p$ to denote prior and posterior distributions of the parameters respectively.
18 State-space models provide a useful framework to express dynamic systems that involve unobserved state variables
or stochastic coefficients. A state-space model consists of two equations: a measurement equation and a transition
equation (also called state equation). The measurement equation describes the relation between the data and the
unobservable state variables (or stochastic coefficients). The transition equation describes the dynamics of the state
variables (or stochastic coefficients) and has the form of a first-order difference equation in the state vector. In the
context of this paper, the measurement equation is the asset pricing model and the transition equation is the equation for
beta. See Harvey (1989) for further details.
The problem with the estimation of the state-space model proposed in (9) is that the functional
form of the joint posterior density of the parameters, $p(\delta_\alpha, \delta_\beta, \phi_\beta, \sigma^2_\beta, \sigma^2_\varepsilon|\Phi_T)$, is unknown.
However, it is possible to obtain a sample from it using a data augmentation algorithm via Markov
Chain Monte Carlo (MCMC) methods.20 The basic idea of data augmentation is to augment
the observed data ($\Phi_T$) with a latent variable $\beta$ in order to obtain the augmented posterior
$p(\delta_\alpha, \delta_\beta, \phi_\beta, \sigma^2_\beta, \sigma^2_\varepsilon|\beta, \Phi_T)$. Given the assumption of independent priors, this posterior can
be decomposed into:
$p(\delta_\alpha, \delta_\beta, \phi_\beta, \sigma^2_\beta, \sigma^2_\varepsilon|\beta, \Phi_T) = p(\delta_\beta, \phi_\beta, \sigma^2_\beta|\beta, \delta_\alpha, \sigma^2_\varepsilon, \Phi_T)$ (16)
$\times\, p(\delta_\alpha|\beta, \sigma^2_\varepsilon, \Phi_T) \times p(\sigma^2_\varepsilon|\delta_\alpha, \beta, \Phi_T).$
These posteriors are either analytically tractable or can be simulated. Therefore, it is possible
to sample from them by using a data augmentation algorithm via the Gibbs sampler, when the
posteriors have familiar distributions, or via the Metropolis-Hastings algorithm, when the posterior
distributions are unfamiliar (see Appendix III). Finally, there are two alternative methods to sample
from the distribution of $\beta$. The first one obtains the posterior distribution of $\beta$ based on an
approach suggested by Jacquier, Polson and Rossi (1994), and Kim, Shephard and Chib (1998)
for stochastic volatility models, and is derived in Tamayo (2000). The second method uses the
simulation smoother for state space models of De Jong and Shephard (1995), and is described in
Appendix II.
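The step that draws the latent beta path given the other parameters can be sketched as follows. This is not either of the two methods cited above: it swaps in a plain forward filtering, backward sampling pass for a scalar AR(1) beta, which targets the same conditional distribution in this simple setting. The model form (return = alpha + beta_t times the benchmark return + noise, with beta_t mean-reverting to mu), the function name, and all parameter values are assumptions made for illustration; a full sampler would alternate this step with draws of the parameters.

```python
import math
import random


def ffbs_beta(y, x, alpha, mu, phi, q, h, rng):
    """Draw one path of beta_t given parameters and data.

    Measurement: y_t = alpha + beta_t * x_t + eps_t,  eps_t ~ N(0, h)
    Transition:  beta_t = mu + phi * (beta_{t-1} - mu) + u_t,  u_t ~ N(0, q)
    """
    T = len(y)
    m, C = [0.0] * T, [0.0] * T
    a, P = mu, q / (1.0 - phi ** 2)        # stationary prior for the first state
    for t in range(T):                      # forward Kalman filter
        if t > 0:
            a = mu + phi * (m[t - 1] - mu)
            P = phi ** 2 * C[t - 1] + q
        f = x[t] ** 2 * P + h               # forecast variance of y_t
        k = P * x[t] / f                    # Kalman gain
        m[t] = a + k * (y[t] - alpha - a * x[t])
        C[t] = P - k * x[t] * P
    beta = [0.0] * T
    beta[-1] = rng.gauss(m[-1], math.sqrt(C[-1]))
    for t in range(T - 2, -1, -1):          # backward sampling
        a_next = mu + phi * (m[t] - mu)
        P_next = phi ** 2 * C[t] + q
        g = C[t] * phi / P_next
        mean = m[t] + g * (beta[t + 1] - a_next)
        var = C[t] - g * phi * C[t]
        beta[t] = rng.gauss(mean, math.sqrt(max(var, 0.0)))
    return beta


# Simulated data with hypothetical parameter values.
rng = random.Random(0)
T, alpha, mu, phi, q, h = 300, 0.0, 1.0, 0.6, 0.09, 0.25
x = [rng.gauss(0.5, 4.0) for _ in range(T)]
true_beta, b = [], mu
for t in range(T):
    b = mu + phi * (b - mu) + rng.gauss(0.0, math.sqrt(q))
    true_beta.append(b)
y = [alpha + true_beta[t] * x[t] + rng.gauss(0.0, math.sqrt(h)) for t in range(T)]

draws = [ffbs_beta(y, x, alpha, mu, phi, q, h, rng) for _ in range(50)]
post_mean = [sum(d[t] for d in draws) / len(draws) for t in range(T)]
```

Averaging the sampled paths gives a smoothed estimate of beta that can be compared with the simulated truth.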
3.3 Predictive Return Density and Portfolio Allocation
The predictive density of the excess returns can be easily computed from the output obtained
from the data augmentation via the MCMC algorithm. Formally, the predictive density of $r_{T+1}$ is
obtained by integrating the joint predictive density of $r_{T+1}$ and $r_{m,T+1}$ with respect to $r_{m,T+1}$:
$p(r_{T+1}|\Phi_T) = \int p(r_{T+1}, r_{m,T+1}|\Phi_T)\,dr_{m,T+1} = \int p(r_{T+1}|r_{m,T+1}, \Phi_T)\,p(r_{m,T+1}|\Phi_T)\,dr_{m,T+1}.$ (17)
This predictive density and the predictive density of $r_{m,T+1}$ can be obtained by integrating the
product of the likelihood functions for the future observations and the posterior distributions of
the parameters with respect to the parameters:
$p(r_{m,T+1}|\Phi_T) = \iint p(r_{m,T+1}|\delta_m, \sigma, \Phi_T)\,p(\delta_m, \sigma|\Phi_T)\,d\delta_m\,d\sigma$ (18)
and
$p(r_{T+1}|\Phi_T) = \iiiint p(r_{T+1}|\delta_\alpha, \beta_T, \sigma_\varepsilon, r_{m,T+1}, \Phi_T)\,p(r_{m,T+1}|\Phi_T)$ (19)
$\times\, p(\delta_\alpha, \sigma_\varepsilon|\beta_T, \Phi_T)\,p(\beta_T|\Phi_T)\,dr_{m,T+1}\,d\delta_\alpha\,d\sigma_\varepsilon\,d\beta_T.$ (20)
19 The Bayesian approach has several advantages over the classical one. First, the estimation of beta is based on
the joint posterior distribution of the parameters rather than on distributions conditional on the maximum likelihood
estimates of the rest of the parameters. Second, the Bayesian approach does not rely on asymptotics. The exact small-sample
distribution of the parameters can be computed using a data augmentation algorithm via Markov Chain Monte Carlo
methods.
20 MCMC is a simulation technique that generates a sample from a target distribution. It specifies the transition
probability of a Markov process with the property that its limiting distribution is the desired distribution. The
Markov chain is iterated a large number of times and, under a set of criteria, the resulting sample is a sample from
the desired distribution. For further discussion, see Chib and Greenberg (1994), Gilks et al. (1997), and Tanner
(1996).
The computation of the optimal portfolio weights is greatly simplified in a mean-variance framework
because, in this case, the investor only cares about the first two moments of the predictive
distribution, $E$ and $V$.21 Therefore, following Pastor (2000) and Pastor and Stambaugh (2000), I
consider a risk-averse mean-variance investor who maximizes the mean-variance objective
$U_p = \mu_p - \frac{1}{2}\gamma\sigma^2_p,$ (21)
where $\mu_p$ is the expected rate of return on the portfolio, $\sigma^2_p$ is the variance, and $\gamma$ is the risk
aversion parameter.
The solution to the expected-utility maximization problem in (2), given the mean-variance
objective (21), is
$w^* = \gamma^{-1}V^{-1}E,$ (22)
where $w^*$ is the $(N + K) \times 1$ vector of portfolio weights. These weights do not necessarily sum
to one because they are also affected by the investment in the riskfree asset, which is given by
$1 - \iota_{N+K}'w^*$. Hence, they represent the proportion of an investor's wealth allocated to each asset.
When $N = K = 1$ the optimal weights in (22) are given by (see Appendix IV)
$w^*_{nb} = \gamma^{-1}\left(\frac{E_\alpha}{\mathrm{Var}(\varepsilon_{T+1}|\Phi_T)}\right) = \gamma^{-1}\left(\frac{E_\alpha}{[1, E_m]\,\mathrm{Cov}(\alpha_T, \beta_T|\Phi_T)\,[1, E_m]' + \mathrm{Var}(\beta_T|\Phi_T)V_m + E(\sigma^2_\varepsilon|\Phi_T)}\right)$ (23)
and
$w^*_m = \gamma^{-1}\frac{E_m}{V_m} - E_\beta w^*_{nb},$ (24)
where $w^*_{nb}$ and $w^*_m$ are the optimal weights on the non-benchmark asset and the market index
respectively. The conditional moments are calculated using the predictive densities of the returns.
21 In more general situations, it is possible to draw a sample from the predictive densities by sequentially sampling
from the posterior distributions and computing the future observations $r_{T+1}$ and $r_{m,T+1}$ using the models in (9) and
(10).
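The sequential sampling idea in footnote 21 can be sketched as follows: for each posterior draw of the parameters, simulate the benchmark return and then the non-benchmark return, and summarize the simulated pairs by their mean vector and covariance matrix. The dictionary keys, the normal sampling distributions, and the illustrative posterior draws are all assumptions, not output of the paper's sampler.

```python
import random


def predictive_moments(posterior_draws, rng):
    """Monte Carlo mean vector and covariance matrix of (r, r_m) one step ahead.

    Each posterior draw is a dict with hypothetical keys:
      'mu_m', 'sd_m'          conditional mean and st. dev. of the benchmark return
      'alpha', 'beta', 'sd_e' intercept, beta and residual st. dev. of the other asset
    """
    sims = []
    for d in posterior_draws:
        r_m = rng.gauss(d["mu_m"], d["sd_m"])
        r = d["alpha"] + d["beta"] * r_m + rng.gauss(0.0, d["sd_e"])
        sims.append((r, r_m))
    n = len(sims)
    e = [sum(s[i] for s in sims) / n for i in range(2)]
    v = [[sum((s[i] - e[i]) * (s[j] - e[j]) for s in sims) / (n - 1)
          for j in range(2)] for i in range(2)]
    return e, v


# Illustrative posterior draws (made up): parameters scattered around fixed values.
rng = random.Random(42)
draws = [{"mu_m": rng.gauss(0.5, 0.2), "sd_m": 4.0,
          "alpha": rng.gauss(0.4, 0.15), "beta": rng.gauss(0.9, 0.1), "sd_e": 3.0}
         for _ in range(5000)]
E, V = predictive_moments(draws, rng)
print(E, V)
```

Because the parameters vary across draws, the simulated covariance matrix reflects parameter uncertainty in addition to the sampling variation in returns.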
The expressions in (23) and (24) provide some insight into what affects the optimal weights.
First, parameter uncertainty reduces the total allocation to risky assets. For example, the denom-
inator in (23) takes parameter uncertainty into account through the covariance matrix of alpha
and beta. In a classical approach to portfolio selection, the parameters are fixed and, hence,
$\mathrm{Cov}(\alpha_T, \beta_T|\Phi_T) = 0_{2\times 2}$. In a Bayesian setting, the parameters are random variables and, hence,
$\mathrm{Cov}(\alpha_T, \beta_T|\Phi_T)$ is a positive definite matrix.22 Second, uncertainty about the beta dynamics increases
the variance of the predictive distribution of the returns on the non-benchmark assets (see Appendix
IV) and, as a result, reduces the allocation to the non-benchmark asset. Third, ceteris paribus,
time variation in betas does not affect an investor’s optimal allocation to the non-benchmark as-
set but his allocation to the benchmark asset could be affected. Finally, conditional mispricing
(time-varying alphas) can affect the optimal weights on both the benchmark and non-benchmark
assets.
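A small numerical check of the two-asset case may help. The sketch below computes the weights directly from the general mean-variance rule and from the alpha and beta decomposition, treating alpha, beta and the residual variance as known point values (so the parameter-uncertainty terms are collapsed into a single residual variance). All inputs are hypothetical.

```python
def mean_variance_weights(e, v, gamma):
    """Weights (1/gamma) * V^{-1} * E for two risky assets, ordered
    (non-benchmark, benchmark)."""
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    inv = [[v[1][1] / det, -v[0][1] / det],
           [-v[1][0] / det, v[0][0] / det]]
    return [(inv[i][0] * e[0] + inv[i][1] * e[1]) / gamma for i in range(2)]


def weights_from_alpha_beta(alpha, beta, resid_var, e_m, v_m, gamma):
    """Decomposed form: the non-benchmark weight depends on alpha and the
    residual variance; the benchmark weight is adjusted for beta exposure."""
    w_nb = alpha / (gamma * resid_var)
    w_m = e_m / (gamma * v_m) - beta * w_nb
    return [w_nb, w_m]


# Hypothetical monthly inputs (percent and percent squared).
alpha, beta, resid_var, e_m, v_m, gamma = 0.4, 0.9, 9.0, 0.5, 16.0, 2.8
e = [alpha + beta * e_m, e_m]
v = [[beta ** 2 * v_m + resid_var, beta * v_m],
     [beta * v_m, v_m]]

w_direct = mean_variance_weights(e, v, gamma)
w_closed = weights_from_alpha_beta(alpha, beta, resid_var, e_m, v_m, gamma)
riskless = 1.0 - sum(w_direct)

# Extra residual variance (e.g. from parameter or beta uncertainty) shrinks the
# non-benchmark weight when alpha is positive.
w_uncertain = weights_from_alpha_beta(alpha, beta, resid_var + 1.5, e_m, v_m, gamma)
print(w_direct, w_closed, riskless, w_uncertain)
```

The two routes agree, and the last line shows the direction of the uncertainty effect discussed above under these assumed inputs.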
3.4 Economic Significance of Return Predictability and Source of Predictability
As in Kandel and Stambaugh (1996), I study the economic significance of return predictability
by examining the sensitivity of the optimal allocation to the most recent observation of the predictive
variables, $X_T$. I compare the optimal allocation $w^*$, the solution to the investor's maximization
problem in (2), to a suboptimal allocation $w^s$, which is derived by solving (2) when the most recent
observation $X_T$ is replaced by a different observation $X^s_T$. Although the posterior distributions of
the parameters are the same under both samples, the densities of the future observations $r_{T+1}$
and $r_{m,T+1}$ differ because the most recent values of the predictive variables are different.23 Hence,
differences between $w^*$ and $w^s$ document the effect that the sample evidence on predictability has
on the optimal allocation.
A second way to assess the economic significance of the sample is to examine whether the
optimal portfolio is sensitive to the source of predictability, namely model mispricing, changes in
betas or risk premia. In this case, I compare the optimal allocation $w^{*M}$ to a suboptimal allocation
$w^{sM}$ that is obtained by solving (2) under an alternative model (or likelihood). For example, one
model may be the CAPM with constant alphas and time-varying betas and the alternative model
the CAPM with time-varying alphas and betas. The information set, $\Phi_T$, is assumed to be the same
in both cases. Therefore, differences between $w^{*M}$ and $w^{sM}$ reflect the role that the investor's asset
pricing model plays in his portfolio allocation decision. Unlike the case discussed above, now the
posterior distributions of the parameters are different under the two models because the likelihood
functions and priors differ.24
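22 For the market weights, parameter uncertainty is also reflected through $V_m$, which increases.
23 Hence, $p(\Theta|\Phi_T) = p(\Theta|\Phi^s_T)$, $p(r_{T+1}|\Theta, r_{m,T+1}, \Phi_T) \neq p(r_{T+1}|\Theta, r_{m,T+1}, \Phi^s_T)$ and $p(r_{m,T+1}|\Theta, \Phi_T) \neq p(r_{m,T+1}|\Theta, \Phi^s_T)$.
24 Hence, $p^{*M}(\Theta|\Phi_T) \neq p^{sM}(\Theta|\Phi_T)$, $p^{*M}(r_{T+1}|\Theta, r_{m,T+1}, \Phi_T) \neq p^{sM}(r_{T+1}|\Theta, r_{m,T+1}, \Phi_T)$ and $p^{*M}(r_{m,T+1}|\Theta, \Phi_T) \neq$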
Finally, I also analyze the economic significance of the sample by computing the investor's
expected utility loss, measured as the difference in certainty equivalent returns (CER), if he were
to hold a suboptimal portfolio instead of the one he perceives to be optimal. The CER
comparison (the CER for the optimal portfolio minus the CER for the suboptimal portfolio, $U^* - U^s$
or $U^{*M} - U^{sM}$) is done using one common predictive distribution: the distribution associated with
the optimal portfolio (i.e., the predictive distribution obtained under the sample $\Phi_T$ or under the
investor's model). As Kandel and Stambaugh (1996) emphasize, it is important to compute CERs
using one common probability distribution because otherwise it is difficult to interpret differences
in CERs.
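The CER comparison can be sketched with the mean-variance objective in (21), evaluating both portfolios under the same predictive moments. The moments, the risk aversion and the suboptimal weights below are hypothetical; a diagonal covariance matrix is used only so that the optimal weights can be written down without a matrix inverse.

```python
def certainty_equivalent(w, e, v, gamma):
    """Mean-variance objective: w'E - 0.5 * gamma * w'Vw."""
    n = len(w)
    mean = sum(w[i] * e[i] for i in range(n))
    var = sum(w[i] * v[i][j] * w[j] for i in range(n) for j in range(n))
    return mean - 0.5 * gamma * var


def cer_loss(w_opt, w_sub, e, v, gamma):
    """CER of the optimal portfolio minus CER of the suboptimal one,
    both evaluated under the same predictive moments (e, v)."""
    return certainty_equivalent(w_opt, e, v, gamma) - certainty_equivalent(w_sub, e, v, gamma)


# Hypothetical predictive moments under the investor's own sample or model.
gamma = 2.8
e = [0.8, 0.5]
v = [[25.0, 0.0],
     [0.0, 16.0]]
w_opt = [e[i] / (gamma * v[i][i]) for i in range(2)]   # optimal under (e, v)
w_sub = [0.004, 0.020]                                 # weights chosen under other beliefs

loss = cer_loss(w_opt, w_sub, e, v, gamma)
print(loss)
```

When the first portfolio is optimal under the common distribution, the loss equals half the risk aversion times the quadratic form of the weight difference in the covariance matrix, so it is never negative.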
4 Specification of Priors and Data
4.1 Specification of the Priors in the Empirical Test
In the empirical analysis, I assume that the investor is very uncertain about the degree of
predictability (i.e., the parameters in the model) but has dogmatic beliefs about the source of
predictability (i.e., the model). The former assumption (i.e., fairly diffuse priors given a model)
allows me to investigate the role that the sample evidence on predictability plays in portfolio
allocation decisions. The latter assumption (i.e., dogmatic priors for the models) allows me to
investigate the role of the source of predictability.25
For example, consider an investor who believes that the return predictability may not be cap-
tured by a conditional asset pricing model. By setting diffuse priors for the parameters in (10),
it is possible to examine how the predictability evidence affects his optimal portfolio. Conversely,
consider now an investor who dogmatically believes that the predictability can be captured by a
conditional asset pricing model but that the model may misprice assets on average.26 By setting
diffuse priors for the parameters in a constant alpha, time-varying beta model, it is possible to analyze
how the time variation in risk premia and betas affects the investor's optimal portfolio. Finally,
the role that model mispricing or, more generally, the source of predictability (i.e., the model) plays
in the portfolio allocation decision can be analyzed by comparing the portfolio allocation of these
two investors.
The assumption of diffuse priors given a model is also convenient econometrically because
it simplifies the computation of the posterior distributions of the parameters when some of the
predictive variables, such as the dividend yield, are endogenous regressors. If the regressors are
endogenous, the predictive regression should be estimated simultaneously with a model for the
predictive variables, unless one has very strong priors that the errors are independent across the
models or specifies diffuse priors for the model parameters (see Stambaugh, 1999). In this paper
I assume that the priors are fairly diffuse and I do not estimate the models for the predictive
variables.27
24 (continued) $p^{sM}(r_{m,T+1}|\Theta, \Phi_T)$. The superscripts $*M$ and $sM$ denote the investor's optimal model and an alternative model
respectively.
25 To save space, the specific priors used in this paper are discussed in Appendix V.
26 If the investor dogmatically believes that the predictability can be captured by a conditional model and that the
model can price assets on average, then he will only invest in the benchmark asset and the risk free asset.
4.2 Predictive Variables
I use the dividend yield, term spread, and default spread as the investor's conditioning
information. The predictive ability of these variables has also been documented by, among others,
Keim and Stambaugh (1986) and Fama and French (1988, 1989), and they have been used in conditional asset
pricing models by Harvey (1989), Ferson and Harvey (1991, 1993, 1999), Ferson and Korajczyk (1995),
and He et al. (1996). Fama and French (1989) suggest that the dividend yield and default spread
capture a component in expected returns that is related to long term economic conditions. They
also argue that the term spread captures a component in expected returns that is related to the
business cycle and is less persistent.
The descriptive statistics for these variables are presented in Table I, Panel A. The dividend
yield is calculated as a function of the value-weighted market returns with and without distributions
and is computed as in Fama and French (1988, 1989). The mean of the dividend yield is 3.55%
per annum and the standard deviation is 0.94%. The default spread is the difference between the
yields on BAA grade bonds and AAA bonds and is obtained from the Federal Reserve database.
The mean of the default spread is 0.085% per month, standard deviation 0.038%. Finally, the term
spread is the difference between the yields on ten-year government bonds and the one-month Treasury
bill. Its mean is 0.12% per month, with a standard deviation of 0.12%.
In the empirical analysis, I standardize the predictive variables in order to make the interpre-
tation of the slope coefficients easier. Following Fama and French, I do not use the dividend yield
and default spread together in the same regression because they are highly correlated (0.68).
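Standardizing a predictor means subtracting its sample mean and dividing by its sample standard deviation, so that a slope coefficient measures the response to a one-standard-deviation move. A minimal sketch, using a made-up series rather than the paper's data:

```python
import statistics


def standardize(series):
    """Return the series with sample mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)
    return [(x - mean) / sd for x in series]


# Hypothetical dividend yield observations (percent per annum).
dividend_yield = [3.1, 3.4, 2.9, 4.2, 4.8, 3.6, 2.7, 3.3]
z = standardize(dividend_yield)
print(z)
```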
4.3 Benchmark and Non-Benchmark Portfolios
Starting with Banz (1981), a considerable number of studies have shown that small capitalization
stocks earn higher returns on average than predicted by the CAPM (e.g., Fama and French,
1992, 1993; Chan et al., 1995), although in recent years they have underperformed the market
index.28 Also, there is evidence that value (high book-to-market) stocks earn higher returns on
average than growth (low book-to-market) stocks after controlling for beta risk, size and other firm
characteristics (e.g., Fama and French, 1992, 1993; Chan et al., 1995).
27 Although a system could be estimated using a vector autoregression such as the one in Appendix III for �� and ��.
28 Among the variables that explain the cross-section of average returns, size (Banz, 1981) and book-to-market
ratio (Stattman, 1980; Chan et al., 1991) have emerged as the most relevant ones (e.g., Fama and French, 1992).
From an asset allocation perspective, Pastor (2000) finds that an investor who is not very
confident about the unconditional CAPM should short the size premium (the return on SMB,
small minus large capitalization stocks) over several periods, including the 1990s. On the other
hand, he shows that even an investor with strong beliefs in the unconditional CAPM should invest
a considerable amount in the value premium (the return on HML, high minus low book-to-market
stocks).
Like previous studies, I focus on the size and value anomalies and use the smallest capitalization
and the highest book-to-market (BM) quintiles as the non-benchmark assets. As the benchmark
asset, I use the value-weighted CRSP market index. In addition to the riskfree asset, the investor
is assumed to invest in the market index and a portfolio of small capitalization stocks or a portfolio
of value stocks. I consider these non-benchmark portfolios for two reasons. First, the time series
predictability of the returns on the size and value portfolios presents cross-sectional differences
that could have important implications for asset allocation. And second, there is considerable
evidence suggesting that conditional asset pricing models cannot explain the average returns on
these portfolios, or fully capture the time variation in their expected returns.
To calculate the returns on the size (BM) quintiles, I sort on the basis of size (BM) all NYSE,
AMEX and NASDAQ stocks with market value data on CRSP for the current month (book data
on Compustat for the previous fiscal year).29 Stocks are divided into size (BM) quintile portfolios
using only NYSE stocks to calculate the breakpoints to avoid a disproportionate number of stocks
in some portfolios (e.g., smallest capitalization). The sample is from January 1963 to December
1998 because book data is not available prior to 1963. Descriptive statistics for the excess returns
on the market index, and the highest BM and smallest capitalization quintiles are provided in Table
I, Panel B.
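The sorting rule described above (breakpoints from NYSE stocks only, applied to all stocks) can be sketched as follows. The function names, the made-up market values, the quantile interpolation method, and the treatment of ties at a breakpoint are all assumptions; the text does not specify them.

```python
import bisect
import statistics


def nyse_breakpoints(nyse_values, n_groups=5):
    """Cut points that split the NYSE stocks into n_groups groups."""
    return statistics.quantiles(nyse_values, n=n_groups, method="inclusive")


def assign_group(value, breakpoints):
    """Group index for one stock: 1 = lowest, len(breakpoints) + 1 = highest.
    A value equal to a breakpoint goes to the higher group (an assumption)."""
    return bisect.bisect_right(breakpoints, value) + 1


# Hypothetical market values (in millions): ten NYSE stocks and six smaller
# stocks listed elsewhere.
nyse_sizes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
other_sizes = [2, 3, 5, 8, 12, 25]

cuts = nyse_breakpoints(nyse_sizes)
groups = [assign_group(s, cuts) for s in nyse_sizes + other_sizes]
counts = {g: groups.count(g) for g in range(1, 6)}
print(cuts, counts)
```

In this example the smallest group ends up holding more stocks than the others because the non-NYSE stocks are mostly small, while the breakpoints themselves are unaffected by them.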
In order to provide some insight into the cross-sectional differences that arise in the time series
predictability of returns, Table I, Panel C presents the maximum likelihood estimates from regressing
the excess returns on the predictive variables. These estimates should be interpreted with caution
given the small-sample problems documented by Stambaugh (1999). The dividend yield predicts
excess returns on the value and size portfolios. A one-standard deviation increase in the dividend
yield predicts, ceteris paribus, a 0.50% and 0.46% increase per month in the excess returns on
the value and size portfolios respectively. The evidence on predictability using the dividend yield
for the market index is insignificant using standard statistical measures. However, as Kandel and
Stambaugh (1996) observe, the economic significance of this predictability is not readily conveyed
by standard statistical measures (see next sections). The predictive ability of the term spread
is also statistically insignificant using standard statistical measures. Finally, the default spread
predicts excess returns on all three portfolios. A one-standard deviation increase in the default
spread results in a 0.79% increase in the excess return on the value portfolio, and 0.65% and 0.37%
increases in the excess returns on the size portfolio and market index respectively.
29 To ensure that book data is known to investors when computing BM, I do not use book data until six months
after the fiscal year end.
5 Posteriors of the Model Parameters
Tables II and III present the posterior means and standard deviations of the model parameters
for the value and size portfolios respectively. The predictive variables in Panels A and C (Panels B
and D) are the dividend yield and term spread (default and term spreads). In Panels A and B,
I assume that betas are either constant or deterministic functions of the predictive variables. In
Panels C and D, I assume that betas are stochastic functions of the predictive variables and, hence,
implicitly allow for uncertainty about the beta dynamics.
As previously documented in the literature, value stocks earn higher returns on average than
predicted by the CAPM (i.e., the long run mean of alpha, the $\delta_{\alpha 1}$ parameter, is positive; see the second
column in Table II).30 Roughly half of the average monthly return (around 0.48% per month or
5.9% per annum) cannot be explained by the CAPM. Further, the posterior standard deviations
of the $\delta_{\alpha 1}$ parameter are small relative to the means and the prior standard deviations, indicating
that the data strongly support the evidence on mispricing. The long run betas (the $\delta_{\beta 1}$ parameter)
are smaller than one, suggesting that value stocks are less risky than the market portfolio proxy,
which is also consistent with previous studies (e.g., Fama and French, 1992).
To examine to what extent the predictability evidence is captured by a conditional CAPM, I
compare the results in Tables I and II. This comparison is meaningful because I specify diffuse
priors given a model and the predictive variables are standardized.31 Starting with Panel A of
Table II, I find that the predictive ability of the dividend yield cannot be fully attributed to time
variation in risk premia and betas (i.e., the $\delta_{\alpha 2}$ parameter is not equal to zero). Of the 0.50%
increase per month in the value portfolio return associated with a one-standard deviation increase
in yield (Table I), nearly 40% can be attributed to time variation in the market risk premia (the
parameters associated with the dividend yield decrease from 0.499 in Table I to 0.310 in the constant
beta, time varying risk premia model in Table II). A further 10% can be attributed to time variation
in betas (the parameter decreases to 0.261 when betas are also allowed to be time-varying). Hence,
at most, I can explain roughly 50% of the dividend yield predictive ability using the conditional
CAPMs suggested in this paper. The weak predictive ability of the term spread, however, can be
explained by a conditional CAPM.
30 Note that since I have standardized the predictive variables, $\delta_{\alpha 1}$ and $\delta_{\beta 1}$ represent the unconditional or long run
means of alpha and betas respectively.
31 Under diffuse priors, the means of the posterior distributions of the parameters are the same as maximum
likelihood estimates of the parameters.
Similar findings are reported in Panel B using the default spread as the predictive variable. Of
the 0.79% increase per month in the value portfolio return associated with a one-standard deviation
increase in the default spread (Table I), roughly 45% can be attributed to time variation in the
market risk premia and another 7% to time variation in betas (the parameters decrease from 0.787
in Table I to 0.436 in the constant beta, time varying risk premia model and to 0.378 when betas
are also allowed to vary in Table II).
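The percentages quoted in the last two paragraphs follow from the reported coefficients. The short calculation below reproduces them; the function name is hypothetical, and the inputs are the figures cited in the text.

```python
def share_explained(total, after_risk_premia, after_betas):
    """Fractions of a predictive slope attributed to time-varying risk premia,
    to time-varying betas, and to both combined."""
    by_premia = (total - after_risk_premia) / total
    by_betas = (after_risk_premia - after_betas) / total
    return by_premia, by_betas, by_premia + by_betas


# Slopes for the value portfolio: Table I, then the two models in Table II.
dividend_yield = share_explained(0.499, 0.310, 0.261)
default_spread = share_explained(0.787, 0.436, 0.378)

for name, shares in (("dividend yield", dividend_yield), ("default spread", default_spread)):
    print(name, [round(100 * s, 1) for s in shares])
```

The dividend yield shares come out near 38%, 10% and 48%, and the default spread shares near 45%, 7% and 52%, in line with the rounded figures in the text.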
The inability of the conditional CAPM to capture all the predictive ability of the dividend
yield and default spread persists when betas are modeled as stochastic functions of the variables,
as shown in Panels C and D. The means of the conditional alphas are smaller now (the standard
deviation of the parameters is similar), which suggests lower conditional mispricing. However, much
of the predictability evidence, around 40-45%, remains unexplained by the conditional CAPM. The
long-run alpha is also a bit smaller for the value portfolio when betas are stochastic functions.
Regarding the dynamics of betas, Table II shows that betas vary with both dividend yields
and default spreads. For example, an investor with diffuse priors about whether betas vary with
the default spread will conclude that a one-standard deviation increase in default predicts an
increase in betas of 0.083, if he dogmatically believes that the predictability can be explained
within a conditional CAPM, or of 0.061, if he has diffuse beliefs about the source of predictability.
Furthermore, Panels C and D suggest that betas are stochastic, rather than deterministic, functions
of the predictive variables. The posterior means of the long run standard deviation of the errors
in the beta equation, $\sigma^*_\beta$, are around 0.33 (recall that the prior is diffuse), and the means of the
autoregressive parameters are around 0.6 (0.2 standard deviation). Although the mean of $\sigma^*_\beta$ can
be slightly biased upwards in small samples when the true value of $\sigma^*_\beta$ is really small (around
0.01; see Tamayo, 2001), it is surprisingly large. Thus, the evidence in Panels C and D suggests
that much of the time variation in betas cannot be captured by the dividend yield, and default and
term spreads.
Turning to the evidence for the size portfolio in Table III, I find that small stocks earn on average
higher returns (positive $\delta_{\alpha 1}$ parameter) than predicted by the CAPM, which is consistent with prior
literature. However, these abnormal returns are smaller than those previously documented (e.g.,
Fama and French, 1993) as a result of including the 1990s in the sample. Also, the posterior
standard deviations of the $\delta_{\alpha 1}$ parameter are large compared to their means, suggesting that the
sample evidence on average mispricing is weak. The annualized average abnormal returns vary from
0.28% for the static CAPM in Panels A and B, to 1.34% for the conditional CAPM with stochastic
betas in Panels C and D. It is interesting to note that models with stochastic betas yield larger
average abnormal returns for the size portfolio. This suggests that allowing for heteroskedasticity
in the variance of the size portfolio results in larger mispricing, which is consistent with previous
findings (e.g., Schwert and Seguin, 1990). Also, consistent with previous evidence, the long run
betas are larger than one.
Comparing the results in Tables I and III, I find that a substantial proportion of the size return
predictability seems to be captured by a conditional CAPM. For example, of the 0.65% increase per
month in returns associated with a one-standard deviation increase in default (Table I), roughly
65-70% can be attributed to time variation in the market risk premia (the parameters decrease
from 0.65 in Table I to 0.216 in the constant beta, time varying risk premia model in Table III).
Furthermore, the predictability of the alphas is insignificant using traditional frequentist methods
although, as I show in the next section, their economic significance is non-trivial.
Finally, I find that the betas of the size portfolio are negatively related to the dividend yield and
term spread, which is not what one would expect a priori based on economic intuition. However,
the standard deviations of these parameters are large compared to the means, suggesting that
this evidence is weak. The sample evidence also suggests that betas are stochastic functions of
the explanatory variables (Panels C and D). The posterior means of the long run residual standard
deviation in the beta equation, $\sigma^*_\beta$, are around 0.37, and the means of the autoregressive parameters
are around 0.45 (0.18 standard deviation). Thus, much of the time variation in betas is not captured
by the predictive variables used in this paper.
6 Economic Significance of Predictability
To examine the economic significance of return predictability, I compute the (non-normalized)
portfolio weights, Sharpe ratio and differences in certainty equivalent returns (CERs) under different
values of the predictive variables.32 The evidence is presented in Tables IV-XI. The columns labeled
“Mean” compute the optimal portfolios when the predictive variables are at their long run means.
In the other columns, the predictive variable of interest is assumed to be one-standard deviation
above or below its long run mean. To provide a realistic economic scenario, I account for the
correlation across the predictive variables by considering how a change in one of the variables affects
the other one.33,34 The role that the sample evidence on predictability plays in asset allocation can
be analyzed by comparing the optimal portfolio weights and Sharpe ratios along a given row.
32 The tangency portfolio weights are not reported but they can be easily obtained by normalizing the weights of
the risky assets. In the tables, I present the non-normalized weights because they provide a clearer picture about the
effect of predictability on the amount invested in each asset since they take into account leverage effects.
33 For example, in the column labeled “∆d/p”, I assume that the dividend yield is one-standard deviation above
its mean and set the term spread at its expected value conditional on the value of the dividend yield.
34 Mathematically, I compute the Cholesky decomposition of the covariance matrix of the predictive variables, which
yields a 2 × 2 upper triangular matrix (because there are two predictive variables in each model specification). Since
the variables are standardized, the upper off-diagonal element is equivalent to the slope coefficient from regressing
the other predictive variable on the variable of interest.
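The construction in footnotes 33 and 34 can be sketched for two standardized predictors. The correlation used below is a hypothetical value, and the function names are illustrative.

```python
import math


def cholesky_upper_2x2(cov):
    """Upper-triangular U with U'U = cov for a 2 x 2 positive definite matrix."""
    u11 = math.sqrt(cov[0][0])
    u12 = cov[0][1] / u11
    u22 = math.sqrt(cov[1][1] - u12 ** 2)
    return [[u11, u12], [0.0, u22]]


def scenario(cov, shock_sd=1.0):
    """Values of (variable of interest, other variable) when the first variable
    is shock_sd standard deviations from its mean and the second is set at its
    conditional expectation."""
    u = cholesky_upper_2x2(cov)
    return shock_sd * u[0][0], shock_sd * u[0][1]


# Standardized predictors with an assumed correlation of 0.3.
rho = 0.3
cov = [[1.0, rho], [rho, 1.0]]
print(cholesky_upper_2x2(cov), scenario(cov), scenario(cov, shock_sd=-1.0))
```

For standardized variables the off-diagonal element equals the correlation, which is also the slope from regressing the second variable on the first, so a one-standard-deviation move in the variable of interest shifts the other variable by that amount.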
Additional insight into the economic significance of the sample evidence can be obtained by
computing the investor’s expected utility loss, measured in terms of certainty equivalent returns
(CER), if he were to hold a suboptimal portfolio. As discussed in section 3.4, I present two analyses
of CERs. In Tables IV-IX, I examine how much an investor should be compensated, in terms of
CER, if he were to ignore the predictability evidence. I compute the optimal weights assuming
that the value of the predictive variable of interest is one-standard deviation above/below its mean.
The suboptimal portfolio weights are calculated at the mean values of the predictive variables.
For example, the column ∆d/p compares the CER for the optimal portfolio, which is computed
assuming that the dividend yield is one-standard deviation above its mean, to the CER for a
suboptimal portfolio, which is computed assuming that the dividend yield is at its mean. In Tables
X-XI, I extend the CER analysis by computing the expected utility loss that an investor would
suffer if he were to hold the optimal portfolio of an investor who believes in an alternative model.
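A minimal sketch of the CER comparison, assuming mean-variance preferences (the exact definition in section 3.4 may differ). All moments below are hypothetical; only the risk aversion of 2.8 comes from the text. The suboptimal weights are computed at the mean of the predictive variables, and both portfolios are evaluated under the moments that hold when the predictor is one standard deviation above its mean.

```python
import numpy as np

def optimal_weights(mu, cov, gamma):
    """Mean-variance weights on the risky assets: (1/gamma) * cov^{-1} mu."""
    return np.linalg.solve(cov, mu) / gamma

def cer(w, mu, cov, gamma):
    """Certainty equivalent excess return of holding weights w."""
    return float(w @ mu - 0.5 * gamma * (w @ cov @ w))

gamma = 2.8                                  # risk aversion used in the tables
cov = np.array([[0.0020, 0.0017],
                [0.0017, 0.0024]])           # hypothetical monthly covariance matrix
mu_at_mean = np.array([0.0055, 0.0100])      # hypothetical expected excess returns
mu_high = np.array([0.0080, 0.0150])         # hypothetical, predictor one s.d. above mean

w_opt = optimal_weights(mu_high, cov, gamma)     # uses the predictability evidence
w_sub = optimal_weights(mu_at_mean, cov, gamma)  # ignores it

# Compensation required to hold the suboptimal portfolio.
cer_loss = cer(w_opt, mu_high, cov, gamma) - cer(w_sub, mu_high, cov, gamma)

# Under mean-variance utility the loss is a quadratic form in the weight gap.
gap = w_opt - w_sub
quadratic_form = 0.5 * gamma * float(gap @ cov @ gap)
```

In this setting `cer_loss` and `quadratic_form` coincide, which shows that the loss is never negative and grows with the distance between the two portfolios.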
Tables IV-IX are derived under different assumptions about the time variation in expected
returns and covariance matrix. I examine three cases: (i) time variation in expected returns only;
(ii) proportional time variation in market expected return and risk, or constant price of risk case;
(iii) (non-proportional) time variation in market expected return and risk, or time-varying price of
risk case.35 In all the tables the risk aversion parameter is assumed to be 2.8, which is the value
at which the investor would allocate 100% of his funds to the value-weighted index if he were only
to invest in the riskless asset and the market index (i.e., before introducing any additional risky
assets).
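The calibration of the risk aversion parameter can be sketched as follows, assuming a mean-variance investor who chooses between the riskless asset and the market index only. The market excess return of 0.55% per month is the figure quoted in this section; the market standard deviation `sd_m` is a hypothetical value chosen for illustration and is not reported here.

```python
def market_only_weight(mu_m, var_m, gamma):
    """Optimal weight on the market when it is the only risky asset."""
    return mu_m / (gamma * var_m)

mu_m = 0.0055   # monthly market excess return quoted in the text (0.55%)
sd_m = 0.0443   # hypothetical monthly market standard deviation

# Risk aversion at which the investor is fully invested in the market.
gamma_full_investment = mu_m / sd_m ** 2

# Weight implied by the risk aversion of 2.8 used in the tables.
weight_at_2_8 = market_only_weight(mu_m, sd_m ** 2, 2.8)
```

With these inputs the implied risk aversion is close to 2.8 and the market weight is close to 100%.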
6.1 Time Variation in Expected Returns Only
Following previous studies, I first examine the case in which expected returns may be time
varying but the return covariance matrix is constant. The regression model underlying this case is
equivalent to a time-varying alpha, constant beta regression model under the diffuse prior assump-
tion. The results are reported in Tables IV and V. The predictive variables in Panel A (Panel B)
are the dividend yield and term spread (default and term spreads).
Consider first the evidence for the value portfolio in Table IV. The investor’s optimal allocation
is sensitive to the level of the dividend yield and default spread, and, to a lesser extent, term spread.
For example, a one standard deviation increase in the dividend yield above its mean increases the
total allocation to the risky assets from 111% (216.30-104.77) to 156% (355.58-199.42) and the
riskless borrowing from 11% to 56%. Conversely, a one standard deviation decrease in the dividend
yield reduces the total allocation to the risky assets to approximately 66% (75.71-9.18), with the
remaining 33% being invested in the riskless asset. The Sharpe ratio increases (decreases) from
0.21 to 0.33 (0.10) with a one standard deviation increase (decrease) in yield. Finally, the investor
would have to be compensated by a 3.1% riskless return per annum to ignore the predictive ability
of the dividend yield.
35 Note that the time variation in expected returns only case also corresponds to a time-varying price of risk case.
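The allocation figures for the dividend yield follow directly from the two risky-asset weights quoted from Table IV. The short check below reproduces the arithmetic; the dictionary keys are labels introduced here for illustration.

```python
# (value portfolio weight, market index weight) in percent, as quoted in the text.
positions = {
    "mean": (216.30, -104.77),
    "plus_one_sd": (355.58, -199.42),
    "minus_one_sd": (75.71, -9.18),
}

summary = {}
for state, (w_value, w_market) in positions.items():
    total_risky = w_value + w_market      # total allocation to risky assets
    riskless = 100.0 - total_risky        # negative values mean borrowing
    summary[state] = (round(total_risky, 2), round(riskless, 2))
```

The totals are 111.53%, 156.16% and 66.53%, with riskless positions of -11.53%, -56.16% and 33.47% respectively.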
The economic significance of the predictability evidence using the default spread is even stronger.
The total allocation to the risky assets increases (decreases) from 113% to 197% (28%) when the
default spread moves by one standard deviation above (below) its mean. Furthermore, the Sharpe
ratio nearly doubles if the default spread increases by one standard deviation (from 0.21 to 0.39),
and it is dramatically reduced (to 0.04) if the default spread decreases by one standard deviation.
The investor would also demand a higher compensation in terms of certainty equivalent returns to
ignore the predictive ability of the default spread; he would require a certainty equivalent return
of approximately 6.70% per annum.
Finally, the economic significance of return predictability using the term spread is considerably
lower. Although the total allocation to the risky assets and the Sharpe ratio are sensitive to the
value of the term spread, the investor would not require a large compensation, in terms of CERs,
to ignore its predictive ability.
So far I have discussed the effect of predictability on the total allocation to risky assets. One
interesting finding in Table IV is how the allocation to risky assets is actually split between the
market index and the value portfolio. When the dividend yield is at its long run mean, the weight
on the value portfolio is 216%, which is partly financed by shorting the market index by 104%.
Recall that the risk aversion parameter is 2.8, which is the value at which the investor would invest
100% of his wealth in the market index if he were to invest only in the index and the riskfree asset.
The introduction of the value portfolio in the investor’s opportunity set results in short positions in
the market index. One would naturally expect a reduction in the market index weights given the
positive alphas of the value portfolio. In this particular case, the CAPM cannot explain nearly 50%
of the average monthly return on the value portfolio (the long run alpha is about 0.48%, nearly
50% of the 1% monthly return - see Table II). The value portfolio mispricing is nearly as large as
the excess return on the market index, 0.55%. Furthermore, the value portfolio is not much riskier
(measured by the standard deviation) than the market index and actually its beta is smaller than
one. As a result, the investor shifts his allocation to the value portfolio when his opportunity set
is expanded.
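The mechanism behind the short position in the market can be sketched with a two-asset mean-variance problem, ignoring parameter uncertainty. The alpha and the market excess return are the figures quoted above; the beta, the market volatility and the residual volatility are hypothetical values, so the resulting weights are only of the same order as those in Table IV.

```python
import numpy as np

gamma = 2.8        # risk aversion used in the tables
mu_m = 0.0055      # monthly market excess return quoted in the text
alpha = 0.0048     # monthly long run alpha of the value portfolio quoted in the text
beta = 0.94        # hypothetical beta (the text only says it is below one)
sigma_m = 0.0443   # hypothetical market standard deviation
sigma_e = 0.028    # hypothetical residual standard deviation

# Moments of (market, value) implied by r_value = alpha + beta * r_market + e.
mu = np.array([mu_m, alpha + beta * mu_m])
cov = np.array([[sigma_m ** 2, beta * sigma_m ** 2],
                [beta * sigma_m ** 2, beta ** 2 * sigma_m ** 2 + sigma_e ** 2]])

# Direct solution of the mean-variance problem.
w_full = np.linalg.solve(cov, mu) / gamma

# Equivalent decomposition: the value weight depends only on alpha and
# residual risk; the market weight is the market-only demand minus a hedge
# of the market exposure that comes with the value portfolio.
w_value = alpha / (gamma * sigma_e ** 2)
w_market = mu_m / (gamma * sigma_m ** 2) - beta * w_value
```

The decomposition makes the two effects explicit: a large alpha relative to residual risk produces a large value weight, and the hedge term `beta * w_value` then pushes the market weight below zero.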
Increases in the predictive variables lead to even more extreme portfolio allocations. For ex-
ample, if the dividend yield (default spread) increases by one standard deviation above its long
run mean, the value portfolio weight increases from approximately 216% to 355% (407%), and the
market index is shorted an additional 95% (103%). These large changes in the optimal portfolio
composition are a consequence of the large conditional alphas. In particular, a one standard devi-
ation increase in dividend yield (default spread) increases the model mispricing by an additional
0.31% (0.44%) per month (see Table II).
Decreases in the predictive variables affect the optimal portfolio in the opposite direction. The
investor allocates less to the value portfolio and more to the market index. For example, a one
standard deviation decrease in the default spread reduces the value portfolio weight to 30%, the market is only
marginally shorted, and the remaining 72% is invested in the riskless asset.
Turning to the evidence for the size portfolio in Table V, I find that although the evidence on
predictability can be regarded as statistically insignificant using classical statistical methods, it still
plays a role in the portfolio allocation decision. For example, the optimal allocation to risky assets
increases (decreases) from 99% to 164% (34%) when the default spread moves by one standard
deviation above (below) its mean; the Sharpe ratio changes from 0.12 to 0.22 (0.05). Measured in
terms of utility loss, however, the evidence on return predictability is not as strong as for the value
portfolio. In order to ignore the predictability evidence, the investor would have to be compensated
by a riskless return of 1.60% or 2.36% per annum depending on whether the predictive variable is
the dividend yield or default spread. I also find that the investor does not short the market index
any longer. When the predictive variables are at their means, the investor’s allocation to the size
portfolio is very small, around 7%. This is not surprising given the small unconditional alphas and
the large uncertainty surrounding them (see Table III). Conditioning on information, however, the
alphas become larger and, hence, the allocation to the size portfolio increases. For example, a one
standard deviation increase in the dividend yield (default spread) increases the model mispricing by
an additional 0.26% (0.22%) per month and the optimal weight on the size portfolio to 80% (58%).
The investor’s allocation to the market index also increases with the predictive variables since (i) a
large proportion of the size return predictability can be attributed to time variation in risk premia;
and (ii) there is large uncertainty surrounding the conditional alphas. Finally, decreases in the
predictive variables do actually yield short positions in the size portfolio.
In sum, the evidence in Tables IV and V suggests that, assuming that the return covariance
matrix is constant, the predictability evidence plays an important role in the investor’s asset allo-
cation decision. An investor who originally invests 100% of his wealth in the market index reduces
his allocation to the index once his opportunity set is expanded to include a value or size portfolio.
In particular, if he were to choose between a value portfolio and the market index, he would con-
siderably short the index (assuming diffuse priors). In the next sections, I explore what happens
when the portfolio risk also varies with the predictive variables.
6.2 Constant Price of Risk
Tables VI and VII present the portfolio weights, Sharpe ratio and differences in certainty
equivalent returns (CERs) for the constant price of risk case. Table VI reports the evidence for
the value portfolio and Table VII for the size portfolio. As before, the predictive variables in Panels
A and C (Panels B and D) are the dividend yield and term spread (default and term spreads). In
Panels A and B, I assume that betas are either constant or deterministic functions of the predictive
variables; each panel presents four sets of results derived under different assumptions about the
source of predictability (i.e., time variation in risk premia, beta and/or conditional mispricing). In
Panels C and D, I assume that betas are stochastic functions of the predictive variables; each panel
presents two sets of results derived under different assumptions about the conditional alphas.
Overall, I find that the investor’s optimal allocation is very sensitive to the levels of the dividend
yield and default spread, but not the term spread. The economic significance of return predictability,
however, depends on the investor’s beliefs about the source of predictability, namely, time variation
in risk premia and risk and/or model mispricing. The results for the size portfolio are a bit less
extreme than the results for the value portfolio although they convey a similar message. Thus, in
the discussion that follows, I focus mainly on the findings for the value portfolio in Table VI.
The first set of results in Panels A and B assume that the investor allows for predictability and
does not use asset pricing models to guide him in the portfolio allocation decision. As in Tables
IV and V, the regression model underlying this case is equivalent to a time-varying alpha, constant
beta regression model under the diffuse prior assumption. However, the price of risk is assumed
to be constant now. Compared to the results in Table IV, I find that the total allocation to the
risky assets becomes a bit less sensitive to the predictability evidence. This is to be expected given
that the constant price of risk assumption implies that both the expected return on the market
index and the variance move proportionally (hence, a positive expected return effect in the market
weights is offset by a negative variance effect). The more surprising finding is the magnitude of
the results. For example, if the default spread increases by one standard deviation above its long
run mean, the total allocation to risky assets increases from 113% to 124% (instead of 197% as
in Table IV). As before, the value portfolio weight increases from 219% to 407% and the investor
decreases his position in the market from -106% to -283%.36 The short position becomes larger
than in Table IV because increases in the predictive variables increase not only expected returns
but also risk. Finally, despite the constant price of risk assumption, the predictability evidence is
still economically significant. For example, the Sharpe ratio increases from 0.21 to 0.36 with a one
standard deviation increase in the default spread above its mean. The investor would have to be compensated
by a riskless return of 4.80% to ignore this evidence.
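The weights quoted above can be checked against the constant price of risk logic with a small sketch. It assumes a mean-variance investor with a constant beta, for whom the market weight equals the market-only demand minus beta times the value portfolio weight, and it takes the market-only demand to be 100% in every state (the calibration of the risk aversion parameter together with the constant price of risk). The function name is introduced here for illustration, and estimation risk is ignored.

```python
def implied_beta(w_value, w_market, market_only_weight=1.0):
    """Beta implied by w_market = market_only_weight - beta * w_value."""
    return (market_only_weight - w_market) / w_value

# Weights quoted in the text (as fractions of wealth).
beta_at_mean = implied_beta(2.19, -1.06)   # default spread at its mean
beta_at_high = implied_beta(4.07, -2.83)   # default spread one s.d. above its mean

total_at_mean = 2.19 - 1.06
total_at_high = 4.07 - 2.83
```

Both states imply a beta of roughly 0.94, consistent with a constant beta below one. The total allocation then rises only by about (1 - beta) times the change in the value weight, which is why it moves from 113% to 124% rather than to 197%.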
The second and third sets of results in Panels A and B explore the opposite cases: the investor
uses asset pricing models and believes that the return predictability is due to time variation in risk
premia and/or betas (i.e., he does not allow for conditional mispricing). More specifically, in the
second set of results, the investor believes that all the predictability is due to time variation in
risk premia.37 Given the assumption of constant price of risk, in this case, the optimal allocation
does not depend on the value of the predictive variables and the investor is happy to ignore the
predictability evidence. In the third set of results, the investor also allows for time variation in
betas. In this case, the value portfolio weights do not change with the predictive variables but
the market index weights change due to the time variation in betas.38 Thus, the magnitude of
these changes reflects the extent to which the predictability in betas is economically significant. As
shown in Table VI, the predictability in betas is not economically important (although its statistical
significance is large, see Table II). In particular, allowing for time variation in risk premia and betas
yields allocations very similar to those obtained when allowing for time variation in risk premia only.
Furthermore, the Sharpe ratios are nearly identical and, in both cases, the investor would require a
very small compensation to ignore the predictability evidence.
36 The allocation to the non-benchmark asset is the same as in Table IV because it depends on the alphas but not on the assumption about the price of risk.
The fourth set of results in Panels A and B are derived under the assumption that the investor
believes that the return predictability may not be completely captured by a conditional CAPM; he
holds fairly diffuse priors about the source of predictability and, hence, allows for time variation
in risk premia, betas and alphas. The results in Table VI indicate that the investor increases
his investment in the value portfolio when the dividend yield or default spread increase, which is
(partly) financed by shorting the market index further. Since part of the predictability is captured
by the time-varying betas, the optimal portfolio is a bit less sensitive to the level of the predictive
variables than in the constant beta case (first set of results). For example, the investor increases
his optimal portfolio from 221% to 387% (339%) when the default spread (dividend yield) increases
by one standard deviation above its mean; the Sharpe ratio increases from 0.21 to 0.35 (0.31);
and the investor would have to be compensated, in terms of CER, by 3.74% (1.88%) per annum in
order to ignore the predictability evidence, around 1% less than an investor who does not allow for
time-varying betas.
One interesting finding is that the total allocation to the risky assets can decrease when the
predictive variables increase. For example, if the investor has diffuse priors about the source of
predictability (last set of results), the total allocation to risky assets is reduced from 117% to 111%
(115% to 109%) when the default spread (dividend yield) increases by one standard deviation. This is
driven by the time variation in betas (apart from the assumption of constant price of risk): when the
betas increase with the predictive variables, the covariance between the value portfolio and market
index becomes larger, increasing thereby the portfolio risk. Also, allowing for time-varying betas
introduces further parameter uncertainty into the problem, which makes the predictive residual
variance larger and the investor more reluctant to invest in the asset.
37 This is equivalent to a regression model with constant alpha and beta.
38 Since the price of risk is assumed to be constant, changes in the market index weights are driven by time-varying alphas and betas while changes in the non-benchmark portfolio weights are mainly driven by time variation in alphas (see (23) and (24)).
The last two panels of Table VI (Panels C and D) are derived under the assumption that
the investor models the betas as stochastic functions of the predictive variables. As discussed in
section 3.3, modeling betas as stochastic functions introduces further uncertainty into the portfolio
selection problem. Thus, holding the rest of the parameters and the sample constant, the allocation
to the non-benchmark asset should be lower than in the deterministic beta case. In Table III,
however, I find that the distribution of the rest of the parameters depends on the nature of beta.
In particular, for the value portfolio, there is a bit less evidence of CAPM mispricing when beta is
a stochastic function of the predictive variables. This finding reinforces the negative impact on the
value portfolio weights of the increased beta uncertainty. For example, when the default spread is
at its mean, an investor with diffuse priors about the source of return predictability allocates 200%
of his wealth to the value portfolio if betas are stochastic functions (as opposed to 221% if betas
are deterministic functions).
Panels C and D also show that when betas are stochastic functions of the predictive variables, the
economic significance of return predictability is reduced. For example, the maximum compensation,
in terms of CERs, the investor would demand to ignore the predictability evidence is 1.99% (versus
3.78% in Panels A and B). Nonetheless, the predictability evidence impacts the optimal portfolio
composition and the Sharpe ratio. For example, if the investor has diffuse priors about the source
of predictability and the default spread increases by one standard deviation above its mean, the
allocation to the value portfolio increases by an additional 85% (versus 165% in Panel B); the
Sharpe ratio increases from 0.20 to 0.30. Note that although the predictive ability of the dividend
yield would be regarded as insignificant using standard statistical measures, changes in dividend
yield still have a considerable impact on the optimal portfolio weights and Sharpe ratio.
One interesting finding in Panels C and D is that when the investor holds dogmatic priors that
the predictability can be captured by a conditional CAPM (i.e., constant alpha, stochastic beta
model), changes in the predictive variables affect the optimal allocation to the value portfolio.39
For example, when the default spread increases by one standard deviation over its long run mean,
the optimal allocation to the value portfolio (market index) changes from 191% to 154% (-80% to
-53%). Note that, unlike in Panels A and B, increases in the predictive variables actually reduce
the optimal allocation to the value portfolio and increase the allocation to the market index. This
is due to the increase in uncertainty associated with the stochastic nature of beta. Increases in the
predictive variables increase both the market expected return and risk, which, together with the
larger uncertainty in beta, results in larger covariance risk (see A4.15).
Finally, the results in Panels C and D also indicate that the time variation in betas not captured
by the predictive variables is not only economically significant but also larger than the economic
significance of the time variation in betas captured by the variables. For example, a one standard
deviation increase in betas (unrelated to the predictive variables) reduces the allocation to the
market index by an additional 60-65%; the investor would require a compensation of 1.39-1.54%
riskless return per annum to ignore the stochastic nature of betas.
39 Recall from the discussion in Panels A and B that if the betas are deterministic functions of the predictive variables and the alphas are constant, then the optimal allocation to the non-benchmark asset is not sensitive to changes in the predictive variables (given the constant price of risk assumption).
Turning briefly to the evidence for the size portfolio in Table VII, I find that although the
predictability is weaker than for the value portfolio, it still affects the investor’s optimal portfolio. As
before, if the investor does not allow for conditional mispricing (i.e., assumes that alphas are always
constant), the optimal portfolio is insensitive to the predictability evidence, given the constant price
of risk assumption. In contrast, if he allows for conditional mispricing, the optimal portfolio depends
on the levels of the dividend yield, default spread and, to a lesser extent, term spread. For example,
if the investor has fairly diffuse views about the source of return predictability (last set of results
in Panels A and B), his allocation to risky assets decreases from 98% to 87% when the dividend
yield increases by one standard deviation above its mean; the Sharpe ratio increases from 0.12 to
0.17 and the investor would have to be compensated by a riskless return of 1.53% per annum to
ignore the predictability evidence. Note that the investor revises his allocation to the size portfolio
substantially as a result of the changes in conditional alphas, even though these alphas would be
regarded as insignificant using standard statistical measures. Finally, note that increases in the
dividend yield and default spread actually reduce the total allocation to the risky assets (unlike in
Table V, which assumes that the return covariance matrix is constant).
Again, Panels C and D of Table VII report the results when the investor explicitly accounts for
uncertainty about the beta model dynamics by modelling betas as stochastic functions. As shown
in Table III, when betas are stochastic functions, there is stronger evidence of non-zero average
alphas. As a result, even though the stochastic nature of betas introduces further uncertainty into
the problem, the optimal allocation to the size portfolio is larger than in Panels A and B when
the predictive variables are at their mean values. On the other hand, the economic significance
of the predictability evidence becomes weaker when betas are modeled as stochastic functions. In
particular, the investor would not require a large compensation, in terms of CERs, to ignore the
predictability evidence. Although changes in the predictive variables do lead to changes in the
portfolio composition or the Sharpe ratio, the magnitudes of these changes are not as large as
in Panels A and B. Finally, the time variation in betas not captured by the predictive variables
does not seem to be economically significant either.
Summarizing, the evidence in Tables VI and VII suggests that, assuming that the price of risk
is constant, the predictability evidence plays an important role in the investor’s asset allocation
decision. The economic significance of return predictability, however, depends on the source of
predictability, namely, time variation in risk premia and risk and/or model mispricing. The eco-
nomic significance of the time variation in betas captured by the predictive variables is not large.
The economic significance of the CAPM departures is considerable, especially for the value port-
folio. Finally, allowing betas to be stochastic rather than deterministic functions of the predictive
variables seems economically significant for the value portfolio but not for the size portfolio.
6.3 Time-Varying Price of Risk with Time Variation in Expected Returns and Variances
Tables VIII and IX present the portfolio weights, Sharpe ratio and differences in certainty
equivalent returns (CERs) assuming that the price of risk is time-varying. Unlike in Tables IV and
V, in which the covariance matrix was assumed to be constant (hence, the price of risk was also
time-varying), now I allow for time variation in both the market expected return and risk. Since
the market expected return and risk do not necessarily move proportionally, the price of risk can
be time-varying. Table VIII reports the evidence for the value portfolio and Table IX for the size
portfolio. As before, the predictive variables in Panels A and C (Panels B and D) are the dividend
yield and term spread (default and term spreads). In Panels A and B, betas are either constant or
deterministic functions of the predictive variables; in Panels C and D, they are stochastic functions
of the predictive variables.
I estimate the conditional market variance using the random coefficient model described in
section 3.1, which relates the return volatility to the volatility of the dividend yield, default and
term spreads in the spirit of Schwert (1989). When the predictive variables are the dividend yield
and term spread, the posterior means of the parameters (in %) are [0.490, 0.116, 0.272]’
and the posterior standard deviations are [0.205, 0.230, 0.219]’, for the intercept, dividend
yield and term spread respectively. The means of the residual standard deviations of the random
coefficients are 1.883 and 1.104 for the coefficients associated with the dividend yield and term
spread respectively. When the predictive variables are the default and term spreads, the posterior
means of the parameters are [0.559, 0.351, 0.113]’ and the posterior standard deviations are
[0.207, 0.193, 0.226]’, for the intercept, default and term spreads respectively. The means
of the residual standard deviations of the random coefficients are 0.110 and 1.210 for the coefficients
associated with the default and term spreads.
Overall, the results in Tables VIII and IX reinforce the findings discussed so far: the investor’s
optimal portfolio is sensitive to the levels of the dividend yield, default spread, and, to a lesser
extent, term spread; the evidence is stronger for the value portfolio than for the size portfolio; and
the economic significance depends on the source of predictability. I also find that the economic
significance of the evidence on predictability is slightly larger than in the constant price of risk
case and, when the predictive variables decrease, it is also larger than for the constant covariance
matrix case. There are two main differences between the results in Tables VIII and IX and the
results discussed so far.
First, when the predictive variables decrease, the optimal portfolios consist of smaller/shorter
positions in the market index and the investor requires a larger compensation to ignore the pre-
dictability evidence. In the proposed model for the variance, the conditional variance increases with
changes in the predictive variables regardless of the sign of these changes. Hence, decreases in the
predictive variables predict higher market risk and lower expected returns, which ultimately results
in shorter positions in the market. For example, as shown in Table VIII, Panel B, if the investor
has diffuse priors about the source of predictability (e.g., deterministic alpha and beta case) and
the default spread decreases by one standard deviation, he shorts the market by 10% (in Table VI,
Panel B, he invests 53% in the market); the investor has to be compensated by a riskless return of
5.22% per annum to ignore the predictability evidence (versus 3.78% in Table VI).
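The sign-independence of the variance effect can be illustrated with a minimal sketch of one common random coefficient specification, r = a + (b + u)'x + e with independent shocks; the model in section 3.1 may differ in its details. The coefficient standard deviations are the posterior means reported above for the dividend yield and term spread (in %); the idiosyncratic standard deviation `resid_sd` is hypothetical.

```python
import numpy as np

def conditional_variance(x, coef_sd, resid_sd):
    """Variance of r = a + (b + u)'x + e given x, with independent u_k and e."""
    x = np.asarray(x, dtype=float)
    coef_sd = np.asarray(coef_sd, dtype=float)
    return resid_sd ** 2 + float(np.sum((coef_sd * x) ** 2))

coef_sd = [1.883, 1.104]   # random coefficient s.d. (dividend yield, term spread), from the text
resid_sd = 4.0             # hypothetical idiosyncratic s.d. in % per month

# Standardized predictors: at the mean, and dividend yield one s.d. above/below.
var_at_mean = conditional_variance([0.0, 0.0], coef_sd, resid_sd)
var_up = conditional_variance([1.0, 0.0], coef_sd, resid_sd)
var_down = conditional_variance([-1.0, 0.0], coef_sd, resid_sd)
```

Because the predictors enter the variance through their squares, `var_up` and `var_down` are equal and both exceed `var_at_mean`. A decrease in a predictive variable therefore lowers the expected return while raising risk.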
Second, the optimal portfolio is more sensitive to the level of the term spread now because
changes in the term spread affect the market variance. For example, when the predictive variables
are the dividend yield and term spread (Table VIII, Panel A), an investor who has diffuse beliefs
about the predictability evidence would require a CER of 0.99%-1.72% to ignore the predictive
ability of the term spread.
6.4 Comparison of CERs Across Models
In the previous tables, I have analyzed how much an investor should be compensated, in terms
of CER, if he were to ignore the predictability evidence. Another way to examine the economic
significance of predictability and the source of predictability is to compute how much an investor
should be compensated if he were to hold a portfolio derived under an alternative model. In this
case, the sample is held constant across models and the differences in CERs are driven by differences
across models only.
The analysis is presented in Tables X-XI for the value and size portfolios respectively. Since I
find that the results do not depend much on whether the price of risk is constant or time-varying,
I present the results for the constant price of risk only. I assume that the predictive variables are at
one standard deviation above their long run mean. Again, the results are similar if the predictive
variables are at one standard deviation below their mean.
The results in Tables X and XI reinforce the findings discussed so far. The investor requires a
considerable compensation: (i) if he believes that there may be conditional mispricing (i.e., alphas
may be time-varying) and is forced to hold the portfolio of an investor who dogmatically believes
that a conditional CAPM can explain the predictability in returns (i.e., alphas are always constant),
or vice versa; and (ii) if he is to ignore the stochastic versus deterministic/constant nature of beta.
The investor, however, does not require a large compensation in order to ignore the time variation
in betas captured by the predictive variables. Again, the results for the value portfolio are stronger
than for the size portfolio.
Some of the CERs are large, especially for the value portfolio. For example, if an investor does
not allow for conditional mispricing but allows for uncertainty about the beta dynamics, he would
have to be compensated by a riskless return of 11.14% per annum to hold the portfolio of an investor
with opposite beliefs (i.e., predictability cannot be captured by a conditional CAPM and betas are
constant) when the default spread is one standard deviation above its mean. As shown in Tables X
and XI, other differences in CER range from 0% to 9.63%. These large differences in CER suggest
that it is important to incorporate into the portfolio allocation problem the investor’s beliefs about
the source of return predictability.
7 Conclusions
This paper examines how an investor’s optimal allocation across multiple risky assets is affected
by the sample evidence on predictability, and the investor’s beliefs about both predictability and
the ability of a model to capture the predictability and price assets. I present a “data-and-model”
based approach to portfolio selection in which returns may be predictable and this predictability
may be captured by a conditional asset pricing model. I also introduce a general econometric
framework to incorporate uncertainty about the model dynamics into the problem.
Using the dividend yield, default and term spreads as predictive variables, I find that the sample
evidence on predictability plays an important role in an investor’s portfolio allocation decision
across multiple risky assets. The sensitivity of the optimal portfolio to the predictability evidence
depends on the investor’s priors about the source of predictability, namely, model mispricing or
time variation in risk premia and betas. In particular, the optimal portfolio of an investor who
attributes the predictability evidence to time variation in risk premia and betas is rather insensitive
to the level of the predictive variables. Conversely, if the investor allows for time variation in
alphas, the predictability evidence plays a major role in his portfolio allocation decision regardless
of whether he allows betas to change over time or not. The effect of predictability is also important
in terms of the expected utility. For example, an investor who believes that betas are time-varying
but alphas are constant would suffer a considerable loss in expected utility if he were forced to hold
the optimal portfolio of an investor who allows for time variation in alphas. Likewise, an investor
who allows for time variation in alphas would suffer a large utility loss if he were to hold the portfolio of
an investor who believes that alphas are constant.
Finally, I find that it is important to incorporate an investor’s uncertainty about the beta
dynamics into the portfolio selection problem. An investor who allows for model misspecification in betas
would suffer a considerable loss in expected utility if he were to ignore the model misspecification
possibility. Interestingly, the sample evidence on the stochastic nature of beta is economically
more significant than the sample evidence on time variation in betas associated with the predictive
variables used in this paper.
In sum, when examining the portfolio allocation problem in the presence of predictability and
multiple risky assets, it is important to incorporate into the problem: (i) an investor’s belief about
predictability and its source; (ii) his uncertainty about the model dynamics; and
(iii) the model’s pricing abilities. The optimal portfolios differ substantially depending on these
factors.
The framework introduced in this paper could be extended to examine other questions. It could
be applied to multifactor models and other assets. The latter may be particularly interesting given
the cross-sectional differences that arise in the time series predictability of returns. The framework
discussed here could also be extended to allow investors to have multiple period investment horizons.
More informative priors and other forms of conditional volatility could also be explored. Finally,
the out-of-sample performance of portfolios resulting from different beliefs in the models could be
examined.
APPENDIX I:
Bayesian Estimation of the Conditional CAPM with Deterministic $\alpha$ and $\beta$
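The conditional asset pricing model in (7) can be rewritten in vector notation as
\[ r = X\theta + \varepsilon \tag{A1.1} \]
where
• $r = (r_1', \ldots, r_N')'$ is a $TN \times 1$ vector and $r_i$ is a $T \times 1$ vector for $i = 1, \ldots, N$,
• $\theta$ is a $2NK \times 1$ vector, $\theta = [\theta_{\alpha 1}', \theta_{\beta 1}', \ldots, \theta_{\alpha N}', \theta_{\beta N}']'$, and $\theta_{\alpha i}$ and $\theta_{\beta i}$ are $(K \times 1)$ vectors,
• $X$ is a $TN \times 2NK$ matrix,
\[ X = \begin{pmatrix} Z_1 & R_{m,t+1} Z_1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & Z_N & R_{m,t+1} Z_N \end{pmatrix}, \]
where $Z_i$ is a $T \times K$ matrix of explanatory variables for asset $i = 1, \ldots, N$ and $R_{m,t+1}$ is a $T \times T$ diagonal matrix with diagonal elements $\{r_{m,t+1}\}$ for $t = 1, \ldots, T$,
• and $\varepsilon \sim N(0, \Sigma \otimes I_T)$.
Assuming independent normal and inverted Wishart priors for $\theta$ and $\Sigma$ respectively,
\[ \theta \sim N(\bar\theta, \bar V) \tag{A1.2} \]
\[ \Sigma^{-1} \sim W(\bar S^{-1}, \bar\nu), \tag{A1.3} \]
the conditional posterior distributions of $\theta$ and $\Sigma$ are given by:
\[ \theta \mid (\Sigma^{-1}, r, X) \sim N(\tilde\theta, \tilde V), \tag{A1.4} \]
\[ \Sigma^{-1} \mid (\theta, r, X) \sim W\left( (\bar S + S)^{-1}, \bar\nu + T \right), \tag{A1.5} \]
where
\[ \tilde V^{-1} = \bar V^{-1} + X' (\Sigma^{-1} \otimes I_T) X, \tag{A1.6} \]
\[ \tilde\theta = \tilde V \left[ \bar V^{-1} \bar\theta + X' (\Sigma^{-1} \otimes I_T)\, r \right], \tag{A1.7} \]
and $S = [s_{ij}]$ is an $N \times N$ matrix with elements $s_{ij} = (r_i - X_i\theta_i)'(r_j - X_j\theta_j)$, with $X_i = [Z_i,\ R_{m,t+1} Z_i]$ and $\theta_i = (\theta_{\alpha i}', \theta_{\beta i}')'$.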
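The two conditional posteriors in (A1.4) and (A1.5) can be alternated in a Gibbs sampler. The following is a minimal Python sketch of that alternation, not the code used in the paper: the function names, the simulated data, the prior values, and the NumPy-only Wishart draw (valid for integer degrees of freedom) are all illustrative assumptions.

```python
import numpy as np

def draw_theta(r, X, Sigma_inv, theta_bar, V_bar_inv, T, rng):
    """Draw theta | Sigma^{-1}, r, X from the normal in (A1.4)."""
    W = np.kron(Sigma_inv, np.eye(T))                  # Sigma^{-1} (x) I_T
    V_tilde = np.linalg.inv(V_bar_inv + X.T @ W @ X)   # (A1.6)
    V_tilde = 0.5 * (V_tilde + V_tilde.T)              # symmetrize numerically
    theta_tilde = V_tilde @ (V_bar_inv @ theta_bar + X.T @ W @ r)  # (A1.7)
    return rng.multivariate_normal(theta_tilde, V_tilde)

def draw_sigma_inv(r, X, theta, S_bar, nu_bar, T, N, rng):
    """Draw Sigma^{-1} | theta, r, X from the Wishart in (A1.5)."""
    resid = (r - X @ theta).reshape(N, T)   # row i holds the residuals of asset i
    S = resid @ resid.T                     # cross-product matrix [s_ij]
    scale = np.linalg.inv(S_bar + S)
    df = nu_bar + T                         # integer degrees of freedom assumed
    A = np.linalg.cholesky(scale) @ rng.standard_normal((N, df))
    return A @ A.T                          # sum of df outer products ~ Wishart(scale, df)

# Hypothetical data: N = 2 assets, K = 1 (intercept only), T = 60 months.
rng = np.random.default_rng(0)
T, N, K = 60, 2, 1
Z = np.ones((T, K))
rm = rng.normal(0.005, 0.04, T)
X = np.zeros((N * T, 2 * N * K))
for i in range(N):
    X[i * T:(i + 1) * T, 2 * K * i:2 * K * (i + 1)] = np.hstack([Z, rm[:, None] * Z])
theta_true = np.array([0.001, 1.0, 0.002, 0.8])
r = X @ theta_true + rng.normal(0.0, 0.02, N * T)

theta_bar, V_bar_inv = np.zeros(2 * N * K), np.eye(2 * N * K) / 100.0
S_bar, nu_bar = 1e-4 * np.eye(N), 3
Sigma_inv = np.eye(N) / 0.02**2
for _ in range(50):                         # a short Gibbs run
    theta = draw_theta(r, X, Sigma_inv, theta_bar, V_bar_inv, T, rng)
    Sigma_inv = draw_sigma_inv(r, X, theta, S_bar, nu_bar, T, N, rng)
```

The stacking order of `r` (asset by asset) matches the definition of $r$ above, which is why the residuals can be reshaped into an $N \times T$ array before forming the cross-products.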
APPENDIX II:
Application of The Simulation Smoother for Time Series Models of De Jong and
Shephard (1995) to Sample the Latent Betas
In this appendix, I show how to sample the latent betas using the simulation smoother for time
series models of De Jong and Shephard (1995). Following the time series literature, the latent betas
are defined as a stack of state vectors in a state space form, and $p(\beta \mid y)$ is assumed to
be Gaussian. This sampling method offers computational advantages, especially for elaborate
models.
Formulation of the State Space Model
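Let $\beta_t = (\beta_{1t}, \ldots, \beta_{Nt})'$, for $t = 1, \ldots, T$, be the vector of the latent betas for the $N$ assets. Given
the vector of alphas $\alpha_t$, for $t = 1, \ldots, T$, the conditional CAPM can be rewritten in state space
form as:
\[ y_t = H_t \beta_{t-1} + \varepsilon_t \tag{A2.1} \]
\[ \beta_t = W_t \theta_\beta + \Phi \beta_{t-1} + u_t, \tag{A2.2} \]
where
• $y_t$ is an $N \times 1$ vector with elements $\{y_{it} = r_{it} - \alpha_{it}\}$, $i = 1, \ldots, N$,
• $H_t = \operatorname{diag}(r_{mt}, \ldots, r_{mt})$ is an $N \times N$ diagonal matrix,
• $W_t = \begin{pmatrix} Z_{1t} - \phi_1 Z_{1,t-1} & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & Z_{Nt} - \phi_N Z_{N,t-1} \end{pmatrix}$ is an $N \times NK$ matrix of explanatory variables for the latent data and $Z_{it}$ is a $1 \times K$ vector of explanatory variables for asset $i = 1, \ldots, N$,
• $\theta_\beta = (\theta_{\beta 1}', \ldots, \theta_{\beta N}')'$ is an $NK \times 1$ vector of parameters, where $\theta_{\beta i} = (\theta_{\beta i 1}, \ldots, \theta_{\beta i K})'$ are $K \times 1$ vectors for $i = 1, \ldots, N$,
• $\Phi = \operatorname{diag}(\phi_1, \ldots, \phi_N)$ is an $N \times N$ diagonal matrix of autoregressive parameters,
• $\varepsilon_t$ and $u_t$ are independently, identically distributed as $N(0, \Sigma)$ and $N(0, \Sigma_u)$ respectively,
• and the initial conditional beta, $\beta_0$, is given by $\beta_0 = W_0 \theta_\beta + u_0$.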
Equation (A2.1) is usually referred to as the measurement equation and equation (A2.2) as the
state or transition equation.
To use the simulation smoother, rewrite the error terms, $\varepsilon_t$ and $u_t$, as
\[ \varepsilon_t = G_t v_t \tag{A2.3} \]
\[ u_t = F_t v_t \tag{A2.4} \]
where
• $v_t \sim N(0, I)$ is an $(N + N) \times 1$ vector of innovations,
• $G_t = G = (\Sigma^{1/2} \;\; 0)$ is an $N \times (N + N)$ matrix, so that $GG' = \Sigma$, where $0$ is an $N \times N$ matrix of zeros,
• and $F_t = F = (0 \;\; \Sigma_u^{1/2})$ is an $N \times (N + N)$ matrix, so that $FF' = \Sigma_u$, where $0$ is an $N \times N$ matrix of zeros.
The state space model can be rewritten as
\[ y_t = H_t \beta_{t-1} + G v_t \tag{A2.5} \]
\[ \beta_t = W_t \theta_\beta + \Phi \beta_{t-1} + F v_t. \tag{A2.6} \]
The Simulation Smoother 40
The simulation smoother draws the state disturbances conditional on the data in two steps.
Step I: Kalman filter: for $t = 1, \ldots, T$ run:
\[ e_t = y_t - H_t a_t \tag{A2.7} \]
\[ D_t = H_t P_t H_t' + GG' \tag{A2.8} \]
\[ K_t = (\Phi P_t H_t') D_t^{-1} \tag{A2.9} \]
\[ L_t = \Phi - K_t H_t \tag{A2.10} \]
\[ a_{t+1} = W_t \theta_\beta + \Phi a_t + K_t e_t \tag{A2.11} \]
\[ P_{t+1} = \Phi P_t L_t' + FF' \tag{A2.12} \]
where $a_1 = W_0 \theta_\beta$ and $P_1 = F_0 F_0'$ are the initial values for the filter. On this Kalman filter pass, the
quantities $e_t$, $D_t$ and $L_t$ are stored. The equations A2.7-A2.12 can be interpreted as follows: A2.7
is the innovation equation; A2.8 is the innovation variance; A2.9 is the Kalman gain; A2.11 is the
updating equation; A2.12 is the mean squared error of the prediction.
40 Note that the notation used for the simulation smoother applies to this section only.
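Step II: Backward recursion: for $t = T, \ldots, 1$ run:
\[ C_t = FF' - FF' U_t FF' \tag{A2.13} \]
\[ \kappa_t \sim N(0, C_t) \tag{A2.14} \]
\[ V_t = FF' U_t L_t \tag{A2.15} \]
\[ \eta_t = FF' r_t + \kappa_t \tag{A2.16} \]
\[ r_{t-1} = H_t' D_t^{-1} e_t + L_t' r_t - V_t' C_t^{-1} \kappa_t \tag{A2.17} \]
\[ U_{t-1} = H_t' D_t^{-1} H_t + L_t' U_t L_t + V_t' C_t^{-1} V_t \tag{A2.18} \]
and store $\eta_t$. The $N \times 1$ vector $\eta_t$ is a draw from $p(F v_t \mid y)$.
The simulated $\beta_t$ is obtained recursively from
\[ \beta_t = W_t \theta_\beta + \Phi \beta_{t-1} + \eta_t, \tag{A2.19} \]
where $\beta_0 = W_0 \theta_\beta + \eta_0$.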
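The two passes (A2.7)-(A2.19) can be sketched for a single asset, where every quantity is a scalar. The Python code below is a minimal illustration under stated assumptions rather than the implementation used in the paper: the function name, the argument names and the simulated data are hypothetical, the backward pass starts from zero values of the two recursion variables (the usual convention in De Jong and Shephard, 1995), and the initial state is drawn with one extra backward step at $t = 0$.

```python
import numpy as np

def simulation_smoother(y, h, c, phi, g2, f2, a1, p1, rng):
    """Draw the states x_1..x_{n+1} given y for the scalar model
         y_t     = h_t x_t + eps_t,        eps_t ~ N(0, g2)
         x_{t+1} = c_t + phi x_t + eta_t,  eta_t ~ N(0, f2)
         x_1 ~ N(a1, p1).
    In the appendix's notation x_t plays the role of beta_{t-1}."""
    n = len(y)
    e, d, ell = np.empty(n), np.empty(n), np.empty(n)
    a, p = a1, p1
    for t in range(n):                       # Step I: Kalman filter
        e[t] = y[t] - h[t] * a               # innovation (A2.7)
        d[t] = h[t] * p * h[t] + g2          # innovation variance (A2.8)
        k = phi * p * h[t] / d[t]            # Kalman gain (A2.9)
        ell[t] = phi - k * h[t]              # (A2.10)
        a = c[t] + phi * a + k * e[t]        # (A2.11)
        p = phi * p * ell[t] + f2            # (A2.12)
    eta = np.empty(n)
    r, u = 0.0, 0.0                          # assumed terminal values
    for t in range(n - 1, -1, -1):           # Step II: backward recursion
        cvar = f2 - f2 * u * f2              # (A2.13)
        kappa = rng.normal(0.0, np.sqrt(cvar))   # (A2.14)
        v = f2 * u * ell[t]                  # (A2.15)
        eta[t] = f2 * r + kappa              # (A2.16)
        r_prev = h[t] * e[t] / d[t] + ell[t] * r - v * kappa / cvar   # (A2.17)
        u_prev = h[t] ** 2 / d[t] + ell[t] ** 2 * u + v * v / cvar    # (A2.18)
        r, u = r_prev, u_prev
    # Extra step for the initial state, using its prior variance p1.
    c0 = p1 - p1 * u * p1
    x = np.empty(n + 1)
    x[0] = a1 + p1 * r + rng.normal(0.0, np.sqrt(c0))
    for t in range(n):                       # rebuild the states as in (A2.19)
        x[t + 1] = c[t] + phi * x[t] + eta[t]
    return x

# Hypothetical example: an AR(1) state observed with very little noise.
rng = np.random.default_rng(0)
n, phi, g2, f2 = 50, 0.9, 1e-4, 0.25
x_true = np.empty(n + 1)
x_true[0] = rng.normal(0.0, np.sqrt(f2 / (1 - phi**2)))
for t in range(n):
    x_true[t + 1] = phi * x_true[t] + rng.normal(0.0, np.sqrt(f2))
h, c = np.ones(n), np.zeros(n)
y = h * x_true[:n] + rng.normal(0.0, np.sqrt(g2), n)
draw = simulation_smoother(y, h, c, phi, g2, f2, 0.0, f2 / (1 - phi**2), rng)
```

Because the measurement noise in this example is tiny, the sampled states stay close to the observations; with realistic noise the draws spread around the smoothed mean.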
APPENDIX III:
Derivation of the Conditional Posterior Distributions of the Parameters in the Beta
Equation
This appendix derives the conditional posterior distributions of the parameters in the beta
equation. At each step, the posterior distribution of the parameter of interest is conditioned on
fixed values of all the other parameters (previously simulated).
Prior Distributions
In the derivation of the posterior distributions, I assume independent priors for $\theta_\beta$, $\phi$ and $\Sigma_u^*$,
the long run covariance matrix of the residuals, which is given by
\[ \Sigma_u^* = \Phi \Sigma_u^* \Phi + \Sigma_u. \tag{A3.1} \]
In particular, I specify a normal distribution as the prior distribution of $\theta_\beta$, a normal distribution
truncated to the stationary region as the prior for $\phi$ and an inverse Wishart distribution as the
prior for $\Sigma_u^*$:
\[ \theta_\beta \sim N(\bar\theta_\beta, \bar V_\theta), \tag{A3.2} \]
\[ \phi \sim N(\bar\phi, \bar V_\phi)\, I(\phi \in (-1, 1)), \tag{A3.3} \]
\[ \Sigma_u^{*-1} \sim W(\bar S^{-1}, \bar\nu). \tag{A3.4} \]
The prior of $\Sigma_u$ is implied by the prior distribution of $\Sigma_u^*$. However, given the prior of $\Sigma_u^*$, the
prior distribution of $\Sigma_u$ turns out not to be a familiar distribution. As I discuss later in this
appendix, this poses no problem because we can use the Metropolis-Hastings algorithm to draw
from the posterior of $\Sigma_u$.
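Equation (A3.1) defines the long run covariance matrix only implicitly. As a minimal Python sketch (the function name and the numerical values are illustrative assumptions), it can be solved by vectorization when the autoregressive matrix is diagonal with all coefficients inside $(-1, 1)$; each element then equals the corresponding innovation covariance divided by one minus the product of the two autoregressive coefficients.

```python
import numpy as np

def long_run_cov(phi, sigma_u):
    """Solve S = Phi S Phi + sigma_u for S, with Phi = diag(phi), |phi_i| < 1."""
    n = len(phi)
    Phi = np.diag(phi)
    # vec(Phi S Phi) = (Phi kron Phi) vec(S) for a symmetric (diagonal) Phi
    lhs = np.eye(n * n) - np.kron(Phi, Phi)
    return np.linalg.solve(lhs, sigma_u.reshape(-1)).reshape(n, n)

phi = np.array([0.6, 0.9])                           # hypothetical AR coefficients
sigma_u = np.array([[0.04, 0.01], [0.01, 0.09]])     # hypothetical innovation covariance
sigma_star = long_run_cov(phi, sigma_u)
closed_form = sigma_u / (1.0 - np.outer(phi, phi))   # element-wise solution
```

In the single-asset case this reduces to the innovation variance divided by $1 - \phi^2$, the long run variance reported in the tables.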
Conditional Posteriors of the $\theta_\beta$ Parameter Vector and the Innovations Covariance
Matrix $\Sigma_u$
In order to derive the conditional posterior densities of the parameters $\theta_\beta$ and $\Sigma_u$, write the
equations for beta as a system of equations
\[ \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_N \end{pmatrix} = \begin{pmatrix} Z_1 - \phi_1 Z_{1,-1} & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & Z_N - \phi_N Z_{N,-1} \end{pmatrix} \begin{pmatrix} \theta_{\beta 1} \\ \vdots \\ \theta_{\beta N} \end{pmatrix} + \begin{pmatrix} \phi_1 \beta_{1,-1} \\ \vdots \\ \phi_N \beta_{N,-1} \end{pmatrix} + \begin{pmatrix} u_1 \\ \vdots \\ u_N \end{pmatrix} \tag{A3.5} \]
where, for assets $i = 1, \ldots, N$, $\beta_i$ is a $T \times 1$ vector and $\beta_{i,-1}$ is a $T \times 1$ vector of betas lagged one
period; $Z_i$ is a $T \times K$ matrix of explanatory variables and $Z_{i,-1}$ is a $T \times K$ matrix of explanatory
variables lagged one period; $\theta_{\beta i}$ is a $K \times 1$ vector of parameters; and the $\phi_i$'s are (scalar) autoregressive
coefficients.
Conditional on the values of $\phi$, the system in (A3.5) can be expressed as
\[ Y = B - (\Phi \otimes I_T) B_{-1} = Z^* \theta_\beta + u, \tag{A3.6} \]
where $B = (\beta_1', \ldots, \beta_N')'$ is a $TN \times 1$ vector of the latent betas; $Z^*$ is a $TN \times NK$ sparse matrix
of “new” explanatory variables with elements $\{Z_i - \phi_i Z_{i,-1}\}$; $\theta_\beta = (\theta_{\beta 1}', \ldots, \theta_{\beta N}')'$ is an $NK \times 1$ vector
of parameters; $\Phi = \operatorname{diag}(\phi)$ is an $N \times N$ diagonal matrix of the autoregressive parameters and
$\phi = (\phi_1, \ldots, \phi_N)'$ is an $N \times 1$ vector of autoregressive coefficients; and $u = (u_1', \ldots, u_N')'$ is a
$TN \times 1$ vector of residuals, which are independently, identically distributed as
$u \sim N(0, \Sigma_u \otimes I_T)$.
To derive the conditional posteriors of interest, divide the $T$ observations into the initial observation,
$t = 0$, and the rest of the observations, $t = 1, \ldots, T - 1$. Then the model in (A3.6) can be
rewritten as follows.
For $t = 0$, let $Y_0$ be an $N \times 1$ vector and $Z_0^*$ the corresponding $N \times NK$ matrix of explanatory
variables. Since $B_{-1} = 0$, $Y_0$ equals the conditional betas at time $t = 0$, $B_0$. Hence at $t = 0$, the
model is given by
\[ Y_0 = B_0 = Z_0^* \theta_\beta + u_0, \tag{A3.7} \]
where $u_0 \sim N(0, \Sigma_u^*)$ and $\Sigma_u^* = \Phi \Sigma_u^* \Phi + \Sigma_u$.
For the rest of the observations, $t > 0$, let $Y_r = (B_r - (\Phi \otimes I_{T-1}) B_{-1,r})$ be a $(T-1)N \times 1$ vector
and $Z_r^*$ the corresponding $(T-1)N \times NK$ matrix of explanatory variables.41 For $t > 0$, the system
is
\[ Y_r = Z_r^* \theta_\beta + u_r, \tag{A3.8} \]
where $u_r \sim N(0, \Sigma_u \otimes I_{T-1})$.
41 The subscript $r$ denotes the rest of the observations. Notice that the matrix $B_{-1,r}$ is lagged one period.
Likelihood
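The likelihood function is proportional to:
\[ L(Y \mid Z) \propto |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 (Y_0 - Z_0^*\theta_\beta)' \Sigma_u^{*-1} (Y_0 - Z_0^*\theta_\beta) \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 (Y_r - Z_r^*\theta_\beta)' (\Sigma_u^{-1} \otimes I_{T-1}) (Y_r - Z_r^*\theta_\beta) \right\}. \tag{A3.9} \]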
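Conditional Posterior Distribution of $\theta_\beta$
Define:
\[ \hat\theta_0 = \left( Z_0^{*\prime} \Sigma_u^{*-1} Z_0^* \right)^{-1} Z_0^{*\prime} \Sigma_u^{*-1} Y_0, \qquad \hat\theta_r = \left( Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Z_r^* \right)^{-1} Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Y_r, \]
where $\hat\theta_0$ and $\hat\theta_r$ are the estimators of $\theta_\beta$ from the initial observation and from the rest of the observations. As a function of $\hat\theta_0$ and $\hat\theta_r$, the likelihood function
is proportional to
\[ L(Y \mid Z) \propto |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 (Y_0 - Z_0^*\hat\theta_0)' \Sigma_u^{*-1} (Y_0 - Z_0^*\hat\theta_0) - \tfrac12 (\theta_\beta - \hat\theta_0)' Z_0^{*\prime} \Sigma_u^{*-1} Z_0^* (\theta_\beta - \hat\theta_0) \right\} \]
\[ \cdot\; |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 (Y_r - Z_r^*\hat\theta_r)' (\Sigma_u^{-1} \otimes I_{T-1}) (Y_r - Z_r^*\hat\theta_r) - \tfrac12 (\theta_\beta - \hat\theta_r)' Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Z_r^* (\theta_\beta - \hat\theta_r) \right\}. \tag{A3.10} \]
Combining the likelihood in (A3.10) and the prior in (A3.2), the density kernel of the posterior
distribution of $\theta_\beta$ is given by
\[ \exp\left\{ -\tfrac12 (\theta_\beta - \hat\theta_0)' Z_0^{*\prime} \Sigma_u^{*-1} Z_0^* (\theta_\beta - \hat\theta_0) - \tfrac12 (\theta_\beta - \hat\theta_r)' Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Z_r^* (\theta_\beta - \hat\theta_r) - \tfrac12 (\theta_\beta - \bar\theta_\beta)' \bar V_\theta^{-1} (\theta_\beta - \bar\theta_\beta) \right\}. \tag{A3.11} \]
Therefore, the posterior distribution of $\theta_\beta$ is given by
\[ \theta_\beta \sim N(\tilde\theta_\beta, \tilde V_\theta), \]
where
\[ \tilde V_\theta^{-1} = \bar V_\theta^{-1} + Z_0^{*\prime} \Sigma_u^{*-1} Z_0^* + Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Z_r^*, \tag{A3.12} \]
\[ \tilde\theta_\beta = \tilde V_\theta \left( \bar V_\theta^{-1} \bar\theta_\beta + Z_0^{*\prime} \Sigma_u^{*-1} Y_0 + Z_r^{*\prime} (\Sigma_u^{-1} \otimes I_{T-1}) Y_r \right). \tag{A3.13} \]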
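Conditional Posterior Distribution of $\Sigma_u$
Let $S_0 = [s_{ij}]$ and $S_r = [s_{ij}]$ with elements $s_{ij} = (Y_i - Z_i^* \theta_{\beta i})' (Y_j - Z_j^* \theta_{\beta j})$ for $t = 0$ and $t =
1, \ldots, T - 1$ respectively. As a function of $S_0$ and $S_r$ the likelihood function is proportional to
\[ L(Y \mid Z) \propto |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{*-1} S_0) \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{-1} S_r) \right\}. \tag{A3.14} \]
Combining the likelihood in (A3.14) with the prior in (A3.4), the density kernel of the posterior
distribution of $\Sigma_u$ is given by
\[ |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{*-1} S_0) \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{-1} S_r) \right\} \cdot |\Sigma_u^*|^{-(\bar\nu + N + 1)/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{*-1} \bar S) \right\} \]
\[ = g(\Sigma_u) \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{-1} S_r) \right\}, \tag{A3.15} \]
where
\[ g(\Sigma_u) = |\Sigma_u^*|^{-(\bar\nu + N + 2)/2} \exp\left\{ -\tfrac12 \operatorname{tr}\left( \Sigma_u^{*-1} (S_0 + \bar S) \right) \right\}. \]
Since the exact form of the distribution of $\Sigma_u$ is unknown, I use the Metropolis-Hastings algorithm
to sample from a density whose kernel is given by (A3.15). In particular, I use the inverse
Wishart distribution with parameters $S_r$ and $T - 1$ as a proposal distribution in the Metropolis-Hastings
algorithm. Therefore, given a “current” value of $\Sigma_u$ and the corresponding $\Sigma_u^*$, I sample a
candidate value $\Sigma_u^c$ from $W^{-1}(S_r, T - 1)$ and accept it with probability
\[ \min\left\{ 1, \frac{g(\Sigma_u^c)}{g(\Sigma_u)} \right\}, \]
where $\Sigma_u^{*c} = \Phi \Sigma_u^{*c} \Phi + \Sigma_u^c$.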
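2) Conditional Posteriors of the $\phi$ Vector of Autoregressive Parameters
Conditional on $\theta_\beta$ and $\Sigma_u$, rewrite (A3.5) as
\[ \begin{pmatrix} \beta_1 - Z_1 \theta_{\beta 1} \\ \vdots \\ \beta_N - Z_N \theta_{\beta N} \end{pmatrix} = H \begin{pmatrix} \phi_1 \\ \vdots \\ \phi_N \end{pmatrix} + \begin{pmatrix} u_1 \\ \vdots \\ u_N \end{pmatrix}, \tag{A3.16} \]
where
•
\[ H = \begin{pmatrix} \beta_{1,-1} - Z_{1,-1} \theta_{\beta 1} & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & 0 & \beta_{N,-1} - Z_{N,-1} \theta_{\beta N} \end{pmatrix}, \tag{A3.17} \]
• and $u \sim N(0, \Sigma_u \otimes I_T)$ and $\Sigma_u$ is the covariance matrix of the residuals.
For simplicity express (A3.16) as
\[ Y = H\phi + u, \]
with $Y = [(\beta_1 - Z_1 \theta_{\beta 1})', \ldots, (\beta_N - Z_N \theta_{\beta N})']'$.
As before, divide the $T$ observations into the initial observation, $t = 0$, and the rest of the
observations, $t = 1, \ldots, T - 1$, and define $Y_0$, $Y_r$, $H_0$ and $H_r$ accordingly.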
Likelihood
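The likelihood function is proportional to
\[ L(Y \mid H) \propto |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 Y_0' \Sigma_u^{*-1} Y_0 \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 (Y_r - H_r\phi)' (\Sigma_u^{-1} \otimes I_{T-1}) (Y_r - H_r\phi) \right\}. \tag{A3.18} \]
Define
\[ \hat\phi_r = \left( H_r' (\Sigma_u^{-1} \otimes I_{T-1}) H_r \right)^{-1} H_r' (\Sigma_u^{-1} \otimes I_{T-1}) Y_r \tag{A3.19} \]
and rewrite the likelihood as proportional to
\[ L(Y \mid H) \propto |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 Y_0' \Sigma_u^{*-1} Y_0 \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 (Y_r - H_r\hat\phi_r)' (\Sigma_u^{-1} \otimes I_{T-1}) (Y_r - H_r\hat\phi_r) - \tfrac12 (\phi - \hat\phi_r)' H_r' (\Sigma_u^{-1} \otimes I_{T-1}) H_r (\phi - \hat\phi_r) \right\}. \tag{A3.20} \]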
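Combining the likelihood in (A3.20) with the prior in (A3.3), the density kernel of the posterior
distribution of $\phi$ is given by42
\[ |\bar V_\phi|^{-1/2} \exp\left\{ -\tfrac12 (\phi - \bar\phi)' \bar V_\phi^{-1} (\phi - \bar\phi) \right\} \cdot |\Sigma_u^*|^{-(\bar\nu + N + 1)/2} \exp\left\{ -\tfrac12 \operatorname{tr}(\Sigma_u^{*-1} \bar S) \right\} \]
\[ \cdot\; |\Sigma_u^*|^{-1/2} \exp\left\{ -\tfrac12 Y_0' \Sigma_u^{*-1} Y_0 \right\} \cdot |\Sigma_u|^{-(T-1)/2} \exp\left\{ -\tfrac12 (\phi - \hat\phi_r)' H_r' (\Sigma_u^{-1} \otimes I_{T-1}) H_r (\phi - \hat\phi_r) \right\}, \tag{A3.21} \]
which is proportional to
\[ h(\phi) \cdot \exp\left\{ -\tfrac12 \left[ (\phi - \hat\phi_r)' H_r' (\Sigma_u^{-1} \otimes I_{T-1}) H_r (\phi - \hat\phi_r) + (\phi - \bar\phi)' \bar V_\phi^{-1} (\phi - \bar\phi) \right] \right\}, \]
where
\[ h(\phi) = |\Sigma_u^*|^{-(\bar\nu + N + 2)/2} \cdot \exp\left\{ -\tfrac12 \operatorname{tr}\left( \Sigma_u^{*-1} \bar S + Y_0' \Sigma_u^{*-1} Y_0 \right) \right\}. \]
42 Note that the posterior distribution of $\phi$ is also influenced by the prior of $\Sigma_u^*$.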
Let
\[ \tilde V_\phi^{-1} = \bar V_\phi^{-1} + H_r' (\Sigma_u^{-1} \otimes I_{T-1}) H_r, \tag{A3.22} \]
\[ \tilde\phi = \tilde V_\phi \left[ \bar V_\phi^{-1} \bar\phi + H_r' (\Sigma_u^{-1} \otimes I_{T-1}) Y_r \right]. \tag{A3.23} \]
The full conditional posterior distribution of $\phi$ is the multivariate normal $N(\phi \mid \tilde\phi, \tilde V_\phi)$
truncated to the $(-1, 1)$ region and multiplied by the factor $h(\phi)$. I sample from this distribution using
a Metropolis-Hastings algorithm that takes the truncated multivariate normal component as a
proposal distribution. Therefore, given a “current” value of $\phi$ and the corresponding $\Sigma_u^*$, I sample a
candidate value $\phi^c$ from the truncated normal and accept it with probability
\[ \min\left\{ 1, \frac{h(\phi^c)}{h(\phi)} \right\}, \]
where $\Sigma_u^{*c} = \Phi^c \Sigma_u^{*c} \Phi^c + \Sigma_u$.
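For a single asset the Metropolis-Hastings step above reduces to scalar arithmetic. The Python sketch below illustrates it under stated assumptions: all function and variable names are hypothetical, the truncated normal is drawn by simple rejection, the long run variance is the innovation variance divided by $1 - \phi^2$, and the data and the value of the prior scale are simulated or made up for illustration.

```python
import numpy as np

def log_h(phi, sigma_u2, y0, s_bar, nu_bar, n_assets=1):
    """Log of the factor h(phi) for one asset."""
    s_star = sigma_u2 / (1.0 - phi**2)      # long run variance implied by phi
    return (-0.5 * (nu_bar + n_assets + 2) * np.log(s_star)
            - 0.5 * (s_bar + y0**2) / s_star)

def draw_truncated_normal(mean, sd, rng):
    """Rejection draw from N(mean, sd^2) restricted to (-1, 1)."""
    while True:
        x = rng.normal(mean, sd)
        if -1.0 < x < 1.0:
            return x

def mh_step_phi(phi, y_r, h_r, y0, sigma_u2, phi_bar, v_bar, s_bar, nu_bar, rng):
    """One Metropolis-Hastings update of the scalar autoregressive parameter."""
    v_tilde = 1.0 / (1.0 / v_bar + h_r @ h_r / sigma_u2)              # (A3.22)
    phi_tilde = v_tilde * (phi_bar / v_bar + h_r @ y_r / sigma_u2)    # (A3.23)
    cand = draw_truncated_normal(phi_tilde, np.sqrt(v_tilde), rng)
    log_ratio = (log_h(cand, sigma_u2, y0, s_bar, nu_bar)
                 - log_h(phi, sigma_u2, y0, s_bar, nu_bar))
    return cand if np.log(rng.uniform()) < log_ratio else phi

# Hypothetical data: deviations of beta from its conditional mean follow an AR(1).
rng = np.random.default_rng(1)
T, phi_true, sigma_u2 = 300, 0.6, 0.09
dev = np.empty(T)
dev[0] = rng.normal(0.0, np.sqrt(sigma_u2 / (1 - phi_true**2)))
for t in range(1, T):
    dev[t] = phi_true * dev[t - 1] + rng.normal(0.0, np.sqrt(sigma_u2))

phi, draws = 0.0, []
for _ in range(500):
    phi = mh_step_phi(phi, dev[1:], dev[:-1], dev[0], sigma_u2,
                      phi_bar=0.9, v_bar=0.16, s_bar=0.1, nu_bar=1, rng=rng)
    draws.append(phi)
draws = np.array(draws)
```

The normal part of the conditional posterior serves as the proposal, so only the factor that depends on the long run variance enters the acceptance ratio.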
APPENDIX IV:
Moments of the Predictive Density when N=1, K=1
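The conditional moments
\[ \tilde E = [\tilde E_1 \;\; \tilde E_2]' = [E(r_{t+1} \mid D) \;\; E(r_{m,t+1} \mid D)]' \tag{A4.1} \]
\[ \tilde V = \begin{bmatrix} \tilde V_1 & \tilde V_{12} \\ \tilde V_{12} & \tilde V_2 \end{bmatrix} = \begin{bmatrix} \operatorname{var}(r_{t+1} \mid D) & \operatorname{cov}(r_{t+1}, r_{m,t+1} \mid D) \\ \operatorname{cov}(r_{t+1}, r_{m,t+1} \mid D) & \operatorname{var}(r_{m,t+1} \mid D) \end{bmatrix} \tag{A4.2} \]
can be obtained from the posterior distribution of the parameters, where $D$ denotes the conditioning data. Let a tilde ($\sim$) denote the
posterior mean of the parameter of interest. The conditional moments are given by
\[ \tilde E_1 = \tilde\alpha_t + \tilde E_2 \tilde\beta_t \tag{A4.3} \]
\[ \tilde E_2 = Z_t \tilde\theta_m \tag{A4.4} \]
\[ \tilde V_1 = [1, \tilde E_2] \operatorname{cov}(\alpha_t, \beta_t \mid D) [1, \tilde E_2]' + \tilde\beta_t^2 \tilde V_2 + \operatorname{var}(\beta_t \mid D) \tilde V_2 + E(\sigma_\varepsilon^2 \mid D) \tag{A4.5} \]
\[ \tilde V_2 = Z_t \operatorname{cov}(\theta_m, \theta_m \mid D) Z_t' + E(\sigma_m^2 \mid D) \tag{A4.6} \]
\[ \tilde V_{12} = \tilde\beta_t \tilde V_2 \tag{A4.7} \]
where
\[ \tilde\alpha_t = Z_t \tilde\theta_\alpha \tag{A4.8} \]
\[ \tilde\beta_t = (Z_t - \tilde\phi Z_{t-1}) \tilde\theta_\beta + \tilde\phi \tilde\beta_{t-1} \tag{A4.9} \]
\[ \operatorname{var}(\alpha_t \mid D) = Z_t \operatorname{cov}(\theta_\alpha, \theta_\alpha \mid D) Z_t' \tag{A4.10} \]
\[ \operatorname{var}(\beta_t \mid D) = Z_t \operatorname{cov}(\theta_\beta, \theta_\beta \mid D) Z_t' + E(\sigma_u^{*2} \mid D) \tag{A4.11} \]
\[ \operatorname{cov}(\alpha_t, \beta_t \mid D) = Z_t \operatorname{cov}(\theta_\alpha, \theta_\beta \mid D) Z_t'. \tag{A4.12} \]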
Optimal Weights in the Assets when N=1, K=1
I derive an equivalent representation for the optimal weights in the benchmark and non-
benchmark assets. This representation provides some insight into the components of the weights
and happens to be very helpful to interpret the results in this paper.
The optimal weights $w$ for an investor who maximizes the mean-variance objective in (21) are
given by
\[ w^* = \gamma^{-1} \tilde V^{-1} \tilde E. \tag{A4.13} \]
Note that these weights are proportional to the tangency portfolio weights.
Substituting the moments for their expressions in (A4.3)-(A4.7), we obtain that
\[ w_1^* = \gamma^{-1} \frac{\tilde\alpha_t}{\widetilde{\operatorname{var}}(\varepsilon_t \mid D)} \tag{A4.14} \]
\[ \phantom{w_1^*} = \gamma^{-1} \left( \frac{\tilde\alpha_t}{[1, \tilde E_2] \operatorname{cov}(\alpha_t, \beta_t \mid D) [1, \tilde E_2]' + \operatorname{var}(\beta_t \mid D) \tilde V_2 + E(\sigma_\varepsilon^2 \mid D)} \right), \tag{A4.15} \]
and
\[ w_2^* = \gamma^{-1} \frac{\tilde E_2}{\tilde V_2} - \tilde\beta_t w_1^*, \tag{A4.16} \]
where $\widetilde{\operatorname{var}}(\varepsilon_t \mid D)$ is the predictive residual variance of the non-benchmark asset, and $w_1^*$ and $w_2^*$
are the optimal portfolio weights on the non-benchmark and benchmark assets respectively.
The denominator of $w_1^*$ is the residual variance from the predictive distribution of returns and
incorporates the effect of parameter uncertainty through the covariance matrix of alpha and betas.
If one were to ignore parameter uncertainty, it is clear from (A4.14) that the optimal portfolio
weight on the non-benchmark asset would be larger and, vice versa, the optimal portfolio weight
on the benchmark asset would be smaller. Also, from (A4.11) it becomes clear that the stochastic
nature of beta increases the denominator of $w_1^*$ and, hence, the allocation to the non-benchmark
asset is reduced.
It is easy to show that the squared Sharpe ratio of the portfolio is given by
\[ SR_p^2 = \frac{\tilde\alpha_t^2}{[1, \tilde E_2] \operatorname{cov}(\alpha_t, \beta_t \mid D) [1, \tilde E_2]' + \operatorname{var}(\beta_t \mid D) \tilde V_2 + E(\sigma_\varepsilon^2 \mid D)} + \frac{\tilde E_2^2}{\tilde V_2}, \tag{A4.17} \]
which is the sum of the squared Sharpe ratio of the non-benchmark asset, $\tilde\alpha_t^2 / \widetilde{\operatorname{var}}(\varepsilon_t \mid D)$, and the squared Sharpe ratio
of the benchmark. The moments are derived under the predictive distributions. Note also that the
squared Sharpe ratio of the tangency portfolio is also given by (A4.17). Hence, the Sharpe ratio of
the portfolio in (A4.13) and the tangency portfolio are the same, which is not surprising given that
the weights in (A4.13) are proportional to the tangency weights.
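The equivalence between the matrix formula (A4.13) and the closed-form weights (A4.14) and (A4.16), and the Sharpe ratio decomposition in (A4.17), can be checked numerically. The following Python sketch uses made-up monthly moments and a made-up risk-aversion value; only the algebra is taken from the appendix.

```python
import numpy as np

# Hypothetical predictive moments for one non-benchmark asset and the benchmark.
gamma = 5.0          # assumed risk aversion
alpha = 0.002        # posterior mean of alpha_t
beta = 1.1           # posterior mean of beta_t
E2, V2 = 0.005, 0.0019        # predictive mean and variance of the benchmark
resid_var = 0.0008   # predictive residual variance of the non-benchmark asset

E = np.array([alpha + beta * E2, E2])                  # (A4.3), (A4.4)
V = np.array([[beta**2 * V2 + resid_var, beta * V2],   # (A4.5), (A4.7)
              [beta * V2, V2]])

w_matrix = np.linalg.solve(V, E) / gamma               # (A4.13)
w1 = alpha / resid_var / gamma                         # (A4.14)
w2 = E2 / V2 / gamma - beta * w1                       # (A4.16)

sr2_matrix = E @ np.linalg.solve(V, E)                 # squared Sharpe ratio
sr2_sum = alpha**2 / resid_var + E2**2 / V2            # (A4.17)
```

Raising `resid_var`, for example to reflect parameter uncertainty or a stochastic beta, lowers `w1` and, through the last term of (A4.16), raises `w2` when alpha and beta are positive.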
APPENDIX V:
Priors Used in the Paper
The specific priors used in this paper are:
• Prior Distribution of $\theta_\beta$
I shrink the long run mean of beta towards the market beta and specify a prior suggesting
no predictability. However, since I let the prior standard deviations be quite large, the prior
is nearly noninformative. The prior distribution of $\theta_\beta$ is
$\theta_\beta \sim N\left( (1 \;\; 0_{K-1})', \bar V_{\theta_\beta} \right)$
where $\bar V_{\theta_\beta} = \operatorname{diag}(1_K)$, and $0_{K-1}$ and $1_K$ are $(K - 1) \times 1$ and $K \times 1$ vectors of zeros and
ones.
• Prior Distribution of $\phi$
I consider a fairly noninformative prior for $\phi$. In particular, I assume that it is normally
distributed as $\phi \sim N(0.9, 0.16)$ truncated to the stationary region (-1,1).
• Prior Distribution of $\sigma_u^{*2}$
$\sigma_u^{*2}$ reflects the departures from the model specified for beta. The strength of the prior
depends on $\bar\nu$, which corresponds to the number of data points we have observed in order
to specify the prior. Tamayo (2000) points out that the latent nature of beta makes very
small values of $\bar\nu$ assign a lot of weight to the prior of $\sigma_u^{*2}$. This is due to the fact that beta
is latent rather than observable and, hence, the investor needs to observe a larger amount of
data to estimate the beta parameter precisely. Thus, $\bar\nu = 1$ for the latent betas is roughly
equivalent to observing 7 or 8 data points of actually observable data. In this paper I assume
that $\bar\nu = 1$, a very noninformative prior.
• Prior Distribution of $\theta_\alpha$
I assume that the prior mean of $\theta_\alpha$ equals $0_K$, the true value under the null hypothesis that
the conditional CAPM is an adequate characterization of expected returns. This implies
that the conditional CAPM can capture the time series predictability of returns and price
average returns. However, again I let the prior standard deviations be quite large, 1, so these
priors are nearly noninformative. The prior distribution of $\theta_\alpha$ is $\theta_\alpha \sim N(0_K', \bar V_{\theta_\alpha})$, where
$\bar V_{\theta_\alpha} = \operatorname{diag}(1_K)$ and $0_K$ and $1_K$ are $K \times 1$ vectors of zeros and ones respectively.
• Prior Distribution of $\theta_m$
I set the first element in $\theta_m$, the long run mean of the excess return on the market, equal to
0.005 (or 6% premium over the riskfree asset per annum) and specify a prior suggesting no
predictability. Again, I let the prior standard deviations be quite large so that these priors
are nearly uninformative. Therefore, the prior distribution of $\theta_m$ is:
$\theta_m \sim N\left( (0.005 \;\; 0_K)', \bar V_{\theta_m} \right)$
where $\bar V_{\theta_m} = \operatorname{diag}(0.5, 1_K)$ and $0_K$ and $1_K$ are $K \times 1$ vectors of zeros and ones respectively.
• Prior Distributions of $\sigma_\varepsilon^2$ and $\sigma_m^2$
The priors for the parameters $\sigma_\varepsilon^2$ and $\sigma_m^2$ are assumed to be diffuse.
References
[1] Anderson, E, Lars Hansen and Thomas Sargent, 1999, “Risk and robustness in equilibrium”,
Working Paper, Stanford University.
[2] Avramov, Doron, 2000, “Stock return predictability and model uncertainty”, Working Paper,
University of Maryland.
[3] Avramov, Doron, 2000, “Stock return predictability and asset pricing models”, Working Paper,
University of Maryland.
[4] Banz, Rolf, 1981, “The relationship between returns and market value of common stocks”,
Journal of Financial Economics, 9, 13-18.
[5] Barberis, Nicholas, 2000, “Investing for the long run when returns are predictable”, Journal
of Finance, 55, 225-264.
[6] Bauer, Gregory, 2001, Working Paper, University of Rochester.
[7] Bawa, Vijay, Stephen Brown, and Roger Klein, 1979, Estimation Risk and Optimal Portfolio
Choice, North Holland, Amsterdam.
[8] Berger, James O., 1985, Statistical Decision Theory and Bayesian Analysis, 2nd ed., Springer
Verlag, New York.
[9] Brennan, Michael, Eduardo Schwartz, and Ronald Lagnado, 1997, “Strategic asset allocation”,
Journal of Economic Dynamics and Control, 21, 1377-1403.
[10] Bodurtha, James N. and Nelson C. Mark, 1991, “Testing the CAPM with time-varying risk”,
Journal of Finance, 46, 1485-1505.
[11] Bollerslev, Tim, Robert F. Engle, and Jeffrey M. Wooldridge, 1988, “A capital asset pricing
model with time varying covariances”, Journal of Political Economy, 96, 116-131.
[12] Bossaerts, P. and P. Hillion, 1999, “Implementing statistical criteria to select return forecasting
models: what do we learn?”, Review of Financial Studies, 12, 405-428.
[13] Campbell, John and Luis Viceira, 1999, “Consumption and portfolio decisions when expected
returns are time-varying”, Quarterly Journal of Economics, 114, 433-495.
[14] Casella, George and Edward I. George, 1992, “Explaining the Gibbs sampler”, The American
Statistician 46, 167-174.
[15] Chan, Louis K.C., Narasimhan Jegadeesh, and Josef Lakonishok, 1995, “Evaluating the performance
of value versus glamour: the impact of selection bias”, Journal of Financial Economics,
38, 269-296.
[16] Chib, Siddhartha and Edward Greenberg, 1995, “Understanding the Metropolis-Hastings algorithm”,
The American Statistician, 49, 327-335.
[17] Chib, Siddhartha and Edward Greenberg, 1996, “Markov Chain Monte Carlo simulation methods
in econometrics”, Econometric Theory, 12, 409-431.
[18] De Jong, Piet and Neil Shephard, 1995, “Efficient sampling from the smoothing density in
time series models”, Biometrika 82, 339-350.
[19] Evans, Martin D., 1994, “Expected returns, time-varying risk and risk premia”, Journal of
Finance 49, 655-679.
[20] Fama, Eugene and William Schwert, 1977, “Asset returns and inflation”, Journal of Financial
Economics, 5, 115-146.
[21] Fama, Eugene and Kenneth French, 1988, “Dividend yields and expected stock returns”, Jour-
nal of Financial Economics, 22, 3-27.
[22] Fama, Eugene and Kenneth French, 1989, “Business conditions and expected returns on stocks
and bonds”, Journal of Financial Economics, 25, 23-49.
[23] Fama, Eugene and Kenneth French, 1992, “The cross-section of expected stock returns”,
Journal of Finance, 47, 427-465.
[24] Fama, Eugene and Kenneth French, 1993, “Common risk factors in the returns on stocks and
bonds”, Journal of Financial Economics, 33, 3-56.
[25] Ferson, Wayne E. and Campbell R. Harvey, 1991, “The variation of economic risk premiums”,
Journal of Political Economy, 99, 385-415.
[26] Ferson, Wayne E. and Campbell R. Harvey, 1993, “The risk and predictability of international
equity returns”, Review of Financial Studies, 6, 527-566.
[27] Ferson, Wayne E. and Campbell R. Harvey, 1999, “Conditioning variables and the cross-section
of stock returns”, Journal of Finance, 54, 1325-1360.
[28] Ferson, Wayne E and Robert A. Korajczyk, 1995, “Do arbitrage pricing models explain the
predictability of stock returns?”, Journal of Business 68, 309-349.
[29] Foster, F. D., T. Smith, and R. E. Whaley, 1997, “Assessing goodness-of-fit of asset pricing
models: the distribution of maximal R2”, Journal of Finance, 52, 591-607.
[30] Frost, Peter and James Savarino, 1986, “An empirical Bayes approach to efficient portfolio
selection”, Journal of Financial and Quantitative Analysis, 21, 293-305.
[31] Gilks, W.R., S. Richardson and D.J. Spiegelhalter, 1996, Markov Chain Monte Carlo in Prac-
tice, Chapman and Hall, London.
[32] Ghysels, Eric, 1998, “On stable factor structures in the pricing of risk: do time varying betas
help or hurt?”, Journal of Finance, 53, 549-573.
[33] Goetzmann, W.N. and P. Jorion, 1993, “Testing the predictive power of dividend yields”,
Journal of Finance, 48, 663-679.
[34] Grauer, Robert and Nils Hakansson, 1995, “Stein and CAPM estimators of the means in asset
allocation”, International Review of Financial Analysis, 4, 35-66.
[35] Hansen, Lars P. and Scott F. Richard, 1987, “The role of conditioning information in deducing
testable restrictions implied by dynamic asset pricing models”, Econometrica, 50, 1029-1054.
[36] Harvey, Andrew, 1989, Forecasting, Structural Time Series Models and the Kalman Filter,
Cambridge University Press, Cambridge.
[37] Harvey, Campbell R., 1989, “Time-varying conditional covariances in tests of asset pricing
models”, Journal of Financial Economics, 24, 289-318.
[38] Harvey, Campbell R., 1991, “The world price of covariance risk”, Journal of Finance 46,
111-157.
[39] Harvey, Campbell R. and Guofu Zhou, 1990, “Bayesian inference in asset pricing test”, Journal
of Financial Economics, 26, 221-254.
[40] He, Jia, Raymond Kan, Lilian Ng, and Chu Zhang, 1996, “Tests of the relations among
marketwide factors, firm-specific variables, and stock returns using a conditional asset pricing
model”, Journal of Finance, 51, 1891-1908.
[41] Jacquier, Eric, Nicholas G. Polson and Peter E. Rossi, 1994, “Bayesian analysis of stochastic
volatility models”, Journal of Business and Economic Statistics, 12, 371-417.
[42] Jagannathan, Ravi and Zhenyu Wang, 1996, “The conditional CAPM and the cross-section of
expected returns”, Journal of Finance, 51, 3-53.
[43] Jobson, J. D. and Robert Korkie, 1980, “Estimation for Markowitz efficient portfolios”, Journal
of the American Statistical Association, 75, 544-554.
[44] Jorion, Philippe, 1985, “International portfolio diversification with estimation risk”, Journal
of Business, 58, 259-278.
[45] Jorion, Philippe, 1991, “Bayesian and CAPM estimators of the means: implications for port-
folio selection”, Journal of Banking and Finance, 15, 717-727.
[46] Kandel, Shmuel, Robert McCulloch and Robert F. Stambaugh, 1995, “Bayesian inference and
portfolio efficiency”, Review of Financial Studies, 8, 1-53.
[47] Kandel, Shmuel and Robert F. Stambaugh, 1996, “On the predictability of asset returns: an
asset-allocation perspective”, Journal of Finance, 51, 385-424.
[48] Keim, Donald and Robert Stambaugh, 1986, “Predicting returns in the stock and bond mar-
kets”, Journal of Financial Economics, 17, 357-390.
[49] Kim, Sangjoon, Neil Shephard and Siddhartha Chib, 1998, “Stochastic volatility: likelihood
inference and comparison of ARCH models”, Review of Economic Studies, 65, 361-393.
[50] Kim, Tong Suk and Edward Omberg, 1996, “Dynamic nonmyopic portfolio behavior”, Review
of Financial Studies, 9, 141-161.
[51] Klein, Roger W. and Vijay S. Bawa, 1976, “The effect of estimation risk on optimal portfolio
choice”, Journal of Financial Economics, 3, 215-231.
[52] Kothari, S.P. and Jay Shanken, 1997, “Book-to-market, dividend yield and expected market
returns: a time series analysis”, Journal of Financial Economics, 44, 169-203.
[53] Lewellen, Jonathan, 1999, “The time series relation among expected returns, risk and book-
to-market”, Journal of Financial Economics, 54, 5-44.
[54] Maenhout, Pascal, 2000, “Portfolio rules and asset pricing”, Working Paper, Insead.
[55] McCulloch, Robert and Peter E. Rossi, 1990, “Posterior predictive and utility-based ap-
proaches to testing the arbitrage pricing theory”, Journal of Financial Economics, 28, 7-38.
[56] McCulloch, Robert and Peter E. Rossi, 1991, “A Bayesian approach to testing the arbitrage
pricing theory”, Journal of Econometrics, 49, 141-168.
[57] Ohlson, James and Barr Rosenberg, 1982, “Systematic risk of the CRSP equal-weighted com-
mon stock index: a history estimated by stochastic-parameter regression”, Journal of Business,
55, 121-145.
[58] Pastor, Lubos, 2000, “Portfolio selection and asset pricing models”, Journal of Finance, 55,
179-224.
[59] Pastor, Lubos and Robert Stambaugh, 1999, “Cost of equity capital and model mispricing”,
Journal of Finance, 54, 67-121.
[60] Pastor, Lubos and Robert Stambaugh, 2000, “Evaluating and investing in equity mutual
funds”, Wharton School Working Paper.
[61] Pesaran, M. H. and A. Timmermann, 1995, “Predictability of stock returns: robustness and
economic significance”, Journal of Finance, 50, 1201-1228.
[62] Rosenberg, Barr, 1973, “Random coefficient models: the analysis of a cross section of time
series by stochastically convergent parameter regression”, Annals of Economic and Social Mea-
surement 2, 399-428.
[63] Schwert, William, 1989, “Why does stock market volatility change over time?”, Journal of
Finance, 44, 1115-1153.
[64] Shanken, Jay, 1987, “A Bayesian approach to testing portfolio efficiency”, Journal of Financial
Economics, 19, 195-216.
[65] Shanken, Jay, 1990, “Intertemporal asset pricing: an empirical investigation”, Journal of
Econometrics, 45, 99-120.
[66] Shanken, Jay and Ane Tamayo, 2001, “Dividend yield and stock return predictability: mis-
pricing or risk?”, Working Paper, University of Rochester.
[67] Tamayo, Ane, 2000, “An Examination of conditional asset pricing models when betas are
stochastic”, Working Paper, University of Rochester.
[68] Tanner, Martin A., 1996, Tools for Statistical Inference, Springer-Verlag, New York.
[69] Zellner, Arnold, 1971, An Introduction to Bayesian Inference in Econometrics, Wiley and
Sons, New York.
[70] Zellner, Arnold and Karuppan Chetty, 1965, “Prediction and decision problems in regression
models from a Bayesian point of view”, Journal of the American Statistical Association, 60,
608-616.
Table I: Descriptive Statistics
Panel A: Predictive Variables (%)
Dividend yield is the annual dividend yield on the value-weighted CRSP market index. Default spread is the average monthly yield to maturity of corporate bonds rated BAA minus the AAA corporate bond yield. Term spread is the difference between the average monthly yield of a 10-year government bond and a 1-month Treasury bill. Sample period: 1:63–12:98

Variable          Mean     Median   Maximum   Minimum   Std. Dev.
Dividend Yield    3.5567   3.4119   6.2675    1.5592    0.9406
Default Spread    0.0849   0.0746   0.2242    0.0267    0.0385
Term Spread       0.1285   0.1254   0.5165    -0.3933   0.1208

Panel B: Monthly Excess Returns on the Portfolios (%)
Stocks from the population of NYSE, AMEX and NASDAQ are divided into size and book-to-market quintiles using NYSE breakpoints. Sample period: 1:63–12:98

Portfolio   Mean     Median   Maximum   Minimum    Std. Dev.
High BM     1.0010   0.9311   35.2383   -16.8129   5.0033
Small       0.6672   0.9761   30.6653   -30.0860   6.1952
Market      0.5496   0.7706   16.0360   -23.0920   4.3776

Panel C: Maximum Likelihood Estimates of the Predictive Regression
This table presents the OLS regression estimates from regressing the excess returns on the market index and the value-weighted portfolios on the predictive variables. Standard errors in parentheses. Sample period: 1:63–12:98

Variable          Constant (%)   D/P (%)   Spread (%)   Default (%)
Market Index      0.549          0.210     0.231
                  (0.210)        (0.210)   (0.210)
                  0.549                    0.153        0.371
                  (0.210)                  (0.215)      (0.215)
Value Portfolio   1.001          0.499     0.224
                  (0.239)        (0.240)   (0.240)
                  1.001                    0.060        0.787
                  (0.238)                  (0.244)      (0.244)
Size Portfolio    0.667          0.462     0.145
                  (0.297)        (0.298)   (0.298)
                  0.667                    0.010        0.650
                  (0.297)                  (0.306)      (0.304)
Table II: Posterior Means and Standard Deviations of the Model Parameters Value Portfolio
This table presents the posterior means and standard deviations of the model parameters. The rows that are shaded present the results for models with time-varying alphas, which are assumed to be deterministic functions of the predictive variables. The other rows present the results for models with constant alphas. In the first column, cons refers to a constant beta model, dete to a time-varying, deterministic beta model and stoc to a time-varying, stochastic beta model. The general model for Panels A and B is:
\[ r_{t+1} = Z_t \theta_\alpha + Z_t \theta_\beta r_{m,t+1} + \varepsilon_{t+1} \]
where $\varepsilon \sim N(0, \sigma_\varepsilon^2)$. The general model for Panels C and D is:
\[ r_{t+1} = Z_t \theta_\alpha + \beta_t r_{m,t+1} + \varepsilon_{t+1} \]
\[ \beta_t = (Z_t - \phi_\beta Z_{t-1}) \theta_\beta + \phi_\beta \beta_{t-1} + u_{\beta t} \]
where $\varepsilon \sim N(0, \sigma_\varepsilon^2)$, $u_{\beta t} \sim N(0, \sigma_{u\beta}^2)$ and $\sigma_{u\beta}^{*2} = \sigma_{u\beta}^2 / (1 - \phi^2)$ is the long run variance. Panels A and C use the dividend yield and term spread as predictive variables (hence, $Z_t$ is a 1x3 vector where its first element is an intercept, the second one the dividend yield and the third one the term spread). Panels B and D use the default and term spreads as predictive variables (hence, $Z_t$ is a 1x3 vector where its first element is an intercept, the second one the default spread and the third one the term spread). Sample period is 1:63-12:98.
Panel A: Dividend Yield and Term Spread
θα1 (%) c
θα2 (%) d/p
θα3 (%) term
θβ1 c
θβ2 d/p
θβ3 term
σε (%)
Cons beta 0.478 0.950 2.796 (0.135) (0.031) (0.005) Cons beta 0.479 0.310 0.006 0.947 2.786 (0.135) (0.135) (0.134) (0.031) (0.005) Dete beta 0.486 0.932 0.053 -0.033 2.773 (0.135) (0.031) (0.027) (0.027) (0.005) Dete beta 0.487 0.261 0.031 0.931 0.042 -0.035 2.778 (0.135 (0.104) (0.135) (0.031) (0.027) (0.027) (0.005) Panel B: Default Spread and Term Spread
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3   σε (%)
                   c  default     term        c  default     term
Cons beta      0.478        -        -    0.950        -        -    2.796
             (0.135)                    (0.031)                    (0.005)
Cons beta      0.483    0.436   -0.084    0.942        -        -    2.771
             (0.134)  (0.137)  (0.137)  (0.031)                    (0.005)
Dete beta      0.473        -        -    0.925    0.083   -0.061    2.773
             (0.135)                    (0.032)  (0.031)  (0.028)  (0.005)
Dete beta      0.484    0.378   -0.054    0.922    0.061   -0.059    2.757
             (0.134)  (0.141)  (0.137)  (0.032)  (0.032)  (0.028)  (0.005)
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3       φβ     σ*uβ   σε (%)
                   c      d/p     term        c      d/p     term
Stoc beta      0.410        -        -    0.946    0.046   -0.025    0.614    0.337    2.289
             (0.124)                    (0.051)  (0.047)  (0.038)  (0.194)  (0.027)  (0.005)
Stoc beta      0.415    0.207    0.041    0.949    0.030   -0.022    0.572    0.334    2.297
             (0.124)  (0.131)  (0.126)  (0.049)  (0.046)  (0.038)  (0.209)  (0.027)  (0.004)
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3       φβ     σ*uβ   σε (%)
                   c      def     term        c      def     term
Stoc beta      0.416        -        -    0.945    0.056   -0.036    0.661    0.338    2.309
             (0.124)                    (0.058)  (0.064)  (0.040)  (0.205)  (0.031)  (0.005)
Stoc beta      0.429    0.332   -0.023    0.948    0.024   -0.034    0.603    0.332    2.296
             (0.124)  (0.132)  (0.128)  (0.052)  (0.051)  (0.040)  (0.209)  (0.028)  (0.005)
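To illustrate the stochastic beta specification of Panels C and D, the following minimal Python sketch simulates a beta path whose deviation from its conditional mean follows an AR(1) process, and computes the long-run standard deviation implied by the innovation standard deviation and the persistence parameter. It is only an illustration under stated assumptions, not the estimation procedure behind the table: the function names, the seed and the inputs are hypothetical, and `z_theta` stands for a user-supplied sequence of conditional means.

```python
import random


def long_run_sd(sigma_u, phi):
    """Long-run s.d. of the AR(1) deviation: sigma_u / sqrt(1 - phi^2)."""
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly inside (-1, 1)")
    return sigma_u / (1.0 - phi ** 2) ** 0.5


def simulate_beta(z_theta, phi, sigma_u, seed=0):
    """Simulate beta_t = z_theta[t] + d_t, where d_t = phi * d_{t-1} + u_t.

    z_theta is a hypothetical sequence of conditional means (Z_t theta_beta);
    u_t is drawn from N(0, sigma_u^2).
    """
    rng = random.Random(seed)
    # Assumption: start the deviation from its stationary distribution.
    d = rng.gauss(0.0, long_run_sd(sigma_u, phi))
    betas = []
    for mean_beta in z_theta:
        betas.append(mean_beta + d)
        d = phi * d + rng.gauss(0.0, sigma_u)
    return betas


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the table.
    path = simulate_beta([1.0] * 12, phi=0.6, sigma_u=0.25)
    print(len(path), long_run_sd(0.25, 0.6))
```

With a persistence of zero the long-run and innovation standard deviations coincide; as the persistence approaches one, the long-run value grows without bound for a fixed innovation standard deviation.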
Table III: Posterior Means and Standard Deviations of the Model Parameters. Size Portfolio
This table presents the posterior means and standard deviations of the model parameters. The rows that are shaded present the results for models with time-varying alphas, which are assumed to be deterministic functions of the predictive variables. The other rows present the results for models with constant alphas. In the first column, cons refers to a constant beta model, dete to a time-varying, deterministic beta model and stoc to a time-varying, stochastic beta model. The general model for Panels A and B is:
$$r_{t+1} = Z_t\theta_\alpha + Z_t\theta_\beta\, r_{m,t+1} + \varepsilon_{t+1},$$
where $\varepsilon \sim N(0, \sigma_\varepsilon^2)$. The general model for Panels C and D is:
$$r_{t+1} = Z_t\theta_\alpha + \beta_t\, r_{m,t+1} + \varepsilon_{t+1},$$
$$\beta_t = (Z_t - \phi_\beta Z_{t-1})\theta_\beta + \phi_\beta \beta_{t-1} + u_{\beta t},$$
where $\varepsilon \sim N(0, \sigma_\varepsilon^2)$, $u_{\beta t} \sim N(0, \sigma_{u\beta}^2)$ and $\sigma_{u\beta}^{*2} = \sigma_{u\beta}^2/(1-\phi_\beta^2)$ is the long-run variance. Panels A and C use the dividend yield and term spread as predictive variables (hence, $Z_t$ is a 1x3 vector where its first element is an intercept, the second one the dividend yield and the third one the term spread). Panels B and D use the default and term spreads as predictive variables (hence, $Z_t$ is a 1x3 vector where its first element is an intercept, the second one the default spread and the third one the term spread). Sample period is 1:63-12:98.
Panel A: Dividend Yield and Term Spread
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3   σε (%)
                   c      d/p     term        c      d/p     term
Cons beta      0.023        -        -    1.169        -        -    3.506
             (0.170)                    (0.038)                    (0.008)
Cons beta      0.025    0.259   -0.125    1.168        -        -    3.502
             (0.170)  (0.170)  (0.170)  (0.039)                    (0.008)
Dete beta      0.040        -        -    1.168   -0.022   -0.043    3.506
             (0.170)                    (0.040)  (0.033)  (0.034)  (0.008)
Dete beta      0.041    0.300    0.104    1.170   -0.035   -0.042    3.500
             (0.170)  (0.175)  (0.170)  (0.040)  (0.034)  (0.034)  (0.008)
Panel B: Default Spread and Term Spread
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3   σε (%)
                   c  default     term        c  default     term
Cons beta      0.023        -        -    1.169        -        -    3.506
             (0.170)                    (0.039)                    (0.008)
Cons beta      0.026    0.216   -0.168    1.166        -        -    3.506
             (0.170)  (0.173)  (0.173)  (0.039)                    (0.008)
Dete beta      0.035        -        -    1.160    0.011   -0.044    3.508
             (0.170)                    (0.040)  (0.039)  (0.035)  (0.008)
Dete beta      0.039    0.228   -0.150    1.160   -0.001   -0.041    3.508
             (0.171)  (0.180)  (0.174)  (0.040)  (0.041)  (0.035)  (0.008)
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3       φβ     σ*uβ   σε (%)
                   c      d/p     term        c      d/p     term
Stoc beta      0.099        -        -    1.152   -0.008   -0.039    0.490    0.376    3.083
             (0.164)                    (0.055)  (0.051)  (0.047)  (0.185)  (0.044)  (0.008)
Stoc beta      0.111    0.260   -0.151    1.157   -0.021   -0.035    0.451    0.377    3.081
             (0.165)  (0.172)  (0.166)  (0.054)  (0.051)  (0.047)  (0.191)  (0.044)  (0.008)
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
             θα1 (%)  θα2 (%)  θα3 (%)      θβ1      θβ2      θβ3       φβ     σ*uβ   σε (%)
                   c      def     term        c      def     term
Stoc beta      0.099        -        -    1.150    0.007   -0.041    0.482    0.373    3.085
             (0.164)                    (0.054)  (0.052)  (0.049)  (0.183)  (0.043)  (0.008)
Stoc beta      0.110    0.159   -0.176    1.152   -0.001   -0.039    0.453    0.374    3.090
             (0.165)  (0.174)  (0.169)  (0.054)  (0.052)  (0.048)  (0.183)  (0.043)  (0.008)
Table IV: Weights, Sharpe Ratio and Differences in CERs. Value Portfolio – Time Variation in Expected Returns Only
This table presents the optimal (non-normalized) weights on the risky assets, the Sharpe ratio and differences in CER when the investor invests in a value portfolio and a market index. The weights are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. In the Table, cons refers to constant and dete to a time-varying, deterministic alpha and/or beta model. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability in expected returns only (dete alpha, cons beta)
Weight Value           216.30   355.58    75.71   220.60   211.01
Weight Market         -104.77  -199.42    -9.18   -66.25  -142.35
Sharpe Ratio             0.21     0.33     0.10     0.25     0.18
CER (% per annum)        0.00     3.10     3.14     0.60     0.06
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability in expected returns only (dete alpha, cons beta)
Weight Value           219.07   407.42    29.71   222.81   214.32
Weight Market         -106.40  -210.28    -1.58   -67.23  -144.63
Sharpe Ratio             0.21     0.39     0.04     0.25     0.18
CER (% per annum)        0.00     6.71     6.76     0.60     0.06
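The two formulas in the caption of Table IV, the non-normalized weights $A^{-1}V^{-1}E$ and the maximum Sharpe ratio $\sqrt{E'V^{-1}E}$, can be sketched for the two-asset case as follows. This is a minimal Python illustration, not the code used to produce the table: the function names are hypothetical, and the example inputs are made-up numbers rather than moments of the predictive distribution.

```python
import math


def inv2(v):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = v
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def optimal_weights(e, v, a_risk):
    """Non-normalized weights A^{-1} V^{-1} E for two risky assets."""
    vi = inv2(v)
    return [(vi[i][0] * e[0] + vi[i][1] * e[1]) / a_risk for i in range(2)]


def max_sharpe(e, v):
    """Maximum ex-ante Sharpe ratio sqrt(E' V^{-1} E)."""
    vi = inv2(v)
    quad = sum(e[i] * vi[i][j] * e[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)


if __name__ == "__main__":
    # Hypothetical excess-return mean vector and covariance matrix.
    e = [1.0, 2.0]
    v = [[2.0, 0.0], [0.0, 4.0]]
    print(optimal_weights(e, v, a_risk=2.0))
    print(max_sharpe(e, v))
```

Note that the risk aversion parameter scales the weights but does not enter the Sharpe ratio, which is why the Sharpe ratio rows of the table depend only on the predictive moments.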
Table V: Weights, Sharpe Ratio and Differences in CERs. Size Portfolio – Time Variation in Expected Returns Only
This table presents the optimal (non-normalized) weights on the risky assets when the investor invests in a size portfolio and a market index, which are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The normalized weights, not reported, are given by $V^{-1}E/(\iota_2'V^{-1}E)$. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. The rows that are shaded present the results for models with time-varying alphas. The other rows present the results for models with constant alphas. In the second column, cons refers to a constant beta model and dete to a time-varying, deterministic beta model. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability in expected returns only (dete alpha, cons beta)
Weight Value             7.02    80.28   -66.28   -27.61    41.61
Weight Market           91.81    43.74   139.91   174.83     8.82
Sharpe Ratio             0.12     0.19     0.10     0.18     0.08
CER (% per annum)        0.00     1.60     1.60     0.85     0.85
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability in expected returns only (dete alpha, cons beta)
Weight Value             7.40    58.32   -43.55   -27.05    41.82
Weight Market           91.36   105.55    77.22   174.26     8.51
Sharpe Ratio             0.12     0.22     0.05     0.18     0.08
CER (% per annum)        0.00     2.36     2.36     0.86     0.86
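The CER rows compare an optimal portfolio with a suboptimal one under the same predictive moments. The sketch below shows one way such a difference could be computed. It rests on assumptions that the captions do not spell out: the certainty equivalent is taken to be the mean-variance form $w'E - (A/2)\,w'Vw$, and annualization is taken to mean multiplying a monthly difference by 12. The function names and inputs are hypothetical.

```python
def portfolio_cer(w, e, v, a_risk):
    """Assumed mean-variance certainty equivalent: w'E - (A/2) w'Vw."""
    n = len(w)
    mean = sum(w[i] * e[i] for i in range(n))
    var = sum(w[i] * v[i][j] * w[j] for i in range(n) for j in range(n))
    return mean - 0.5 * a_risk * var


def cer_difference(w_opt, w_sub, e, v, a_risk, periods_per_year=12):
    """Annualized C* - Ca, with both portfolios evaluated under the same E and V.

    Assumption: annualization multiplies the per-period difference by
    periods_per_year (12 for monthly data).
    """
    gap = portfolio_cer(w_opt, e, v, a_risk) - portfolio_cer(w_sub, e, v, a_risk)
    return periods_per_year * gap


if __name__ == "__main__":
    # Hypothetical moments and weights, not taken from the tables.
    e = [1.0, 2.0]
    v = [[2.0, 0.0], [0.0, 4.0]]
    print(cer_difference([0.25, 0.25], [0.0, 0.0], e, v, a_risk=2.0))
```

Under this assumed form, the difference is zero when the two portfolios coincide and positive whenever the suboptimal weights deviate from $A^{-1}V^{-1}E$.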
Table VI: Weights, Sharpe Ratio and Differences in CERs. Value Portfolio – Constant Price of Risk
This table presents the optimal (non-normalized) weights on the risky assets, the Sharpe ratio and differences in CER when the investor invests in a value portfolio and a market index. The weights are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value           216.30   355.58    75.78   220.38   211.22
Weight Market         -104.77  -236.62    28.26  -108.63   -99.96
Sharpe Ratio             0.21     0.32     0.11     0.23     0.19
CER (% per annum)        0.00     2.62     2.66     0.00     0.00
Predictability only in risk premia (constant alpha and beta)
Weight Value           214.08   213.90   214.27   213.87   214.30
Weight Market         -103.42  -103.24  -103.60  -103.22  -103.62
Sharpe Ratio             0.21     0.22     0.20     0.23     0.19
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value           219.99   219.52   219.84   219.07   220.04
Weight Market         -105.02  -116.17   -93.28   -97.01  -112.24
Sharpe Ratio             0.21     0.23     0.20     0.23     0.20
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value           221.34   338.65   102.15   235.76   204.95
Weight Market         -106.14  -229.45     9.10  -111.49   -97.91
Sharpe Ratio             0.21     0.31     0.13     0.24     0.19
CER (% per annum)        0.00     1.88     1.92     0.00     0.00
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value           219.07   406.71    29.76   222.59   214.54
Weight Market         -106.40  -283.20    71.96  -109.72  -102.13
Sharpe Ratio             0.21     0.36     0.07     0.23     0.19
CER (% per annum)        0.00     4.80     4.86     0.00     0.00
Predictability only in risk premia (constant alpha and beta)
Weight Value           214.08   213.72   214.45   213.87   214.30
Weight Market         -103.42  -103.07  -103.77  -103.22  -103.62
Sharpe Ratio             0.21     0.24     0.18     0.23     0.19
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value           214.17   213.32   214.38   213.22   214.24
Weight Market          -98.07  -112.20   -83.27   -88.04  -107.33
Sharpe Ratio             0.21     0.24     0.18     0.23     0.19
CER (% per annum)        0.00     0.12     0.02     0.04     0.02
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value           221.46   386.69    53.59   232.37   208.55
Weight Market         -104.22  -275.28    53.17  -103.59  -101.94
Sharpe Ratio             0.21     0.35     0.08     0.24     0.19
CER (% per annum)        0.00     3.74     3.78     0.00     0.00
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value           191.84   170.96   215.63   168.57   220.04   191.84   191.84
Weight Market          -81.47   -60.48   -94.19   -55.43  -113.42  -146.17   -16.77
Sharpe Ratio             0.19     0.20     0.19     0.20     0.19     0.19     0.19
CER (% per annum)        0.00     0.10     0.08     0.09     0.09     1.39     1.39
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value           194.06   259.83   108.22   188.76   197.87   194.06   194.06
Weight Market          -84.12  -154.33     0.57   -74.94   -92.10  -148.90   -19.34
Sharpe Ratio             0.20     0.26     0.13     0.22     0.17     0.20     0.20
CER (% per annum)        0.00     0.63     0.85     0.01     0.00     1.40     1.40
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆def     ∇def    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value           190.87   153.67   244.61   167.76   218.89   190.87   190.87
Weight Market          -80.43   -52.62  -119.52   -54.52  -112.22  -144.97   -15.89
Sharpe Ratio             0.20     0.21     0.18     0.21     0.19     0.20     0.20
CER (% per annum)        0.00     0.27     0.30     0.09     0.09     1.38     1.38
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value           200.50   285.75    60.93   196.18   202.96   200.50   200.50
Weight Market          -90.01  -175.47    43.25   -80.25   -98.19  -156.55   -23.46
Sharpe Ratio             0.20     0.30     0.08     0.22     0.18     0.20     0.20
CER (% per annum)        0.00     1.11     1.99     0.00     0.00     1.54     1.47
Table VII: Weights, Sharpe Ratio and Differences in CERs. Size Portfolio – Constant Price of Risk
This table presents the optimal (non-normalized) weights on the risky assets, the Sharpe ratio and differences in CER when the investor invests in a size portfolio and a market index. The weights are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value             7.02    80.21   -66.34   -27.58    41.65
Weight Market           91.81     6.33   177.47   132.21    51.36
Sharpe Ratio             0.12     0.17     0.12     0.15     0.10
CER (% per annum)        0.00     1.13     1.14     0.25     0.25
Predictability only in risk premia (constant alpha and beta)
Weight Value             6.54     6.53     6.54     6.53     6.54
Weight Market           92.36    92.36    92.35    92.37    92.35
Sharpe Ratio             0.12     0.15     0.10     0.15     0.09
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value            11.28    11.26    11.27    11.23    11.28
Weight Market           86.82    87.11    86.57    87.37    86.32
Sharpe Ratio             0.12     0.15     0.10     0.15     0.09
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value            11.79    96.61   -73.26   -16.64    40.25
Weight Market           86.20    -9.56   188.37   118.76    51.19
Sharpe Ratio             0.12     0.17     0.12     0.15     0.10
CER (% per annum)        0.00     1.53     1.53     0.17     0.17
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value             7.40    58.22   -43.63   -27.03    41.86
Weight Market           91.36    32.08   150.89   131.53    51.16
Sharpe Ratio             0.12     0.17     0.08     0.15     0.10
CER (% per annum)        0.00     0.55     0.55     0.25     0.25
Predictability only in risk premia (constant alpha and beta)
Weight Value             6.54     6.52     6.55     6.53     6.54
Weight Market           92.36    92.37    92.34    92.37    92.35
Sharpe Ratio             0.12     0.16     0.06     0.15     0.09
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value             9.88     9.84     9.89     9.83     9.88
Weight Market           88.54    88.56    88.54    89.00    88.12
Sharpe Ratio             0.13     0.16     0.06     0.15     0.09
CER (% per annum)        0.00     0.00     0.00     0.00     0.00
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value            10.99    66.24   -44.65   -16.55    38.57
Weight Market           87.25    23.76   152.24   118.52    53.65
Sharpe Ratio             0.13     0.18     0.08     0.15     0.10
CER (% per annum)        0.00     0.65     0.66     0.16     0.16
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value            28.22    25.85    30.79    25.52    31.24    28.22    28.22
Weight Market           67.47    70.42    64.26    71.57    62.79    56.86    78.08
Sharpe Ratio             0.13     0.15     0.10     0.15     0.10     0.13     0.13
CER (% per annum)        0.00     0.00     0.00     0.00     0.00     0.04     0.04
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value            31.66    96.05   -45.48    -9.30    81.34    31.66    31.66
Weight Market           63.38    -9.01   153.61   110.43     3.06    51.45    75.30
Sharpe Ratio             0.13     0.18     0.11     0.15     0.12     0.13     0.13
CER (% per annum)        0.00     0.96     1.16     0.39     0.47     0.05     0.05
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆def     ∇def    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value            28.15    23.95    33.71    25.49    31.12    28.15    28.15
Weight Market           67.61    72.49    61.15    71.67    62.98    57.11    78.12
Sharpe Ratio             0.13     0.17     0.07     0.15     0.10     0.13     0.13
CER (% per annum)        0.00     0.00     0.00     0.00     0.00     0.04     0.04
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value            31.27    55.79    -3.79    -7.96    78.69    31.27    31.27
Weight Market           63.98    36.26   104.39   108.86     6.32    52.29    75.68
Sharpe Ratio             0.13     0.18     0.06     0.15     0.12     0.13     0.13
CER (% per annum)        0.00     0.15     0.22     0.36     0.43     0.05     0.05
Table VIII: Weights, Sharpe Ratio and Differences in CERs. Value Portfolio – Time-Varying Price of Risk
This table presents the optimal (non-normalized) weights on the risky assets, the Sharpe ratio and differences in CER when the investor invests in a value portfolio and a market index. The weights are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value           219.50   360.99    76.80   223.83   214.10
Weight Market          -87.84  -221.92    -0.15   -39.58  -154.23
Sharpe Ratio             0.21     0.32     0.10     0.26     0.18
CER (% per annum)        0.00     2.66     3.42     0.73     1.37
Predictability only in market expected return and variance (constant alpha and beta)
Weight Value           217.69   217.60   217.60   217.66   217.66
Weight Market          -86.88   -86.93  -134.20   -34.50  -158.36
Sharpe Ratio             0.21     0.22     0.19     0.26     0.18
CER (% per annum)        0.00     0.00     0.69     0.73     1.37
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value           223.25   222.96   222.62   222.80   222.96
Weight Market          -88.09   -99.72  -123.16   -28.06  -166.60
Sharpe Ratio             0.22     0.22     0.19     0.26     0.18
CER (% per annum)        0.00     0.04     0.39     0.95     1.67
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value           224.62   343.96   103.44   239.80   207.67
Weight Market          -89.23  -214.79   -19.49   -42.79  -152.07
Sharpe Ratio             0.22     0.30     0.12     0.27     0.17
CER (% per annum)        0.00     1.90     2.39     0.99     1.72
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value           223.72   416.06    30.34   227.50   218.83
Weight Market          -96.82  -202.83     8.66   -73.97  -136.74
Sharpe Ratio             0.22     0.39     0.05     0.25     0.19
CER (% per annum)        0.00     6.70     6.83     0.22     0.64
Predictability only in market expected return and variance (constant alpha and beta)
Weight Value           217.31   217.61   217.61   217.57   217.57
Weight Market          -92.81   -17.59  -169.52   -66.35  -137.30
Sharpe Ratio             0.22     0.28     0.18     0.24     0.19
CER (% per annum)        0.00     1.69     1.76     0.22     0.64
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value           223.17   222.93   222.63   222.64   222.75
Weight Market          -94.01   -28.85  -159.97   -62.21  -143.04
Sharpe Ratio             0.22     0.28     0.18     0.24     0.19
CER (% per annum)        0.00     1.26     1.32     0.31     0.78
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value           226.16   396.09    54.51   237.77   212.61
Weight Market          -94.61  -195.22   -10.39   -67.93  -136.43
Sharpe Ratio             0.22     0.38     0.06     0.25     0.18
CER (% per annum)        0.00     5.07     5.22     0.45     0.10
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value           210.11   196.93   197.24   203.85   205.80   210.11   210.11
Weight Market          -78.79   -75.40  -105.07   -15.64  -151.15  -149.65    -7.92
Sharpe Ratio             0.20     0.21     0.17     0.25     0.16     0.20     0.20
CER (% per annum)        0.00     0.05     0.46     0.88     1.57     1.24     1.24
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value           212.18   298.43    99.24   227.40   185.44   212.18   212.18
Weight Market          -81.36  -172.28   -18.63   -38.43  -131.59  -152.18   -10.53
Sharpe Ratio             0.20     0.27     0.11     0.26     0.15     0.20     0.20
CER (% per annum)        0.00     0.95     2.13     0.90     1.64     1.24     1.24
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆def     ∇def    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value           199.99   196.98   199.21   193.71   195.08   199.99   199.99
Weight Market          -75.22    -5.12  -143.13   -40.00  -117.99  -142.67    -7.77
Sharpe Ratio             0.20     0.27     0.16     0.23     0.17     0.20     0.20
CER (% per annum)        0.00     1.35     1.41     0.28     0.73     1.35     1.35
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value           209.98   362.91    49.82   225.90   181.72   209.98   209.98
Weight Market          -85.03  -160.67    -9.15   -67.17  -108.02  -154.72   -15.34
Sharpe Ratio             0.21     0.35     0.06     0.24     0.16     0.21     0.21
CER (% per annum)        0.00     4.55     4.88     0.37     0.92     1.44     1.44
Table IX: Weights, Sharpe Ratio and Differences in CERs. Size Portfolio – Time-Varying Price of Risk
This table presents the optimal (non-normalized) weights on the risky assets, the Sharpe ratio and differences in CER when the investor invests in a size portfolio and a market index. The weights are given by $A^{-1}V^{-1}E$, where $A$ is the risk aversion parameter, and $E$ and $V$ are the first two moments of the predictive distribution of the excess returns. The maximum Sharpe ratio is the ex-ante Sharpe ratio perceived by an investor who invests in the two assets and is given by $\sqrt{E'V^{-1}E}$. CER is the difference in certainty equivalent returns, C*-Ca, where C* (Ca) is the CER for the optimal (suboptimal) portfolio. The optimal (suboptimal) portfolio is computed assuming that the predictive variable of interest is one standard deviation above or below its mean (at their mean values). The difference in CER is annualized. In the column labeled by “Mean”, the predictive variables are at their mean values. In the columns labeled by “∆” (“∇”) the predictive variables are one standard deviation above (below) their means. Sample period: 1:63-12:98.
Panel A: Dividend Yield and Term Spread
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value             7.12    81.43   -67.23   -28.01    42.18
Weight Market          111.65    24.73   151.07   205.03    -0.85
Sharpe Ratio             0.13     0.16     0.11     0.19     0.07
CER (% per annum)        0.00     1.15     1.85     0.99     1.63
Predictability only in market expected return and variance (constant alpha and beta)
Weight Value             6.65     6.64     6.64     6.64     6.64
Weight Market          121.19   112.06    64.78   164.54    40.68
Sharpe Ratio             0.13     0.14     0.09     0.19     0.05
CER (% per annum)        0.00     0.00     0.69     0.73     1.37
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value            11.45    11.43    11.42    11.43    11.43
Weight Market          106.59   106.73    58.95   159.46    34.59
Sharpe Ratio             0.13     0.14     0.09     0.19     0.05
CER (% per annum)        0.00     0.00     0.70     0.74     1.39
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value            11.97    98.13   -74.18   -16.92    40.78
Weight Market          105.96     8.55   162.03   191.40    -1.00
Sharpe Ratio             0.13     0.17     0.11     0.19     0.07
CER (% per annum)        0.00     1.55     2.27     0.92     1.57
Panel B: Default and Term Spreads
                         Mean     ∆def     ∇def    ∆term    ∇term
Predictability but no asset pricing model (deterministic alpha, constant beta)
Weight Value             7.56    59.56   -44.47   -27.62    42.70
Weight Market          105.14   119.70    89.13   172.60    19.62
Sharpe Ratio             0.13     0.23     0.06     0.17     0.09
CER (% per annum)        0.00     2.26     2.33     0.48     0.89
Predictability only in market expected return and variance (constant alpha and beta)
Weight Value             6.64     6.64     6.64     6.64     6.64
Weight Market          106.19   181.41    29.48   132.61    61.67
Sharpe Ratio             0.13     0.22     0.04     0.17     0.08
CER (% per annum)        0.00     1.69     1.76     0.22     0.64
Predictability in risk premia and betas (constant alpha, deterministic beta)
Weight Value            10.09    10.08    10.06    10.06    10.07
Weight Market          102.25   177.47    25.59   129.12    57.32
Sharpe Ratio             0.13     0.22     0.04     0.17     0.08
CER (% per annum)        0.00     1.69     1.76     0.23     0.65
Predictability in risk premia, beta and model mispricing (deterministic alpha and beta)
Weight Value            11.22    67.85   -45.42   -16.93    39.32
Weight Market          100.94   111.09    90.39   159.33    22.18
Sharpe Ratio             0.13     0.23     0.06     0.17     0.09
CER (% per annum)        0.00     2.37     2.45     0.39     0.82
Panel C: Dividend Yield and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆d/p     ∇d/p    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value            30.32    28.90    28.91    29.62    29.84    30.32    30.32
Weight Market           85.02    86.76    38.98   139.32    12.90    73.63    96.42
Sharpe Ratio             0.13     0.15     0.09     0.19     0.06     0.13     0.13
CER (% per annum)        0.00     0.00     0.70     0.76     1.42     0.03     0.03
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value            34.02   107.37   -42.71   -10.80    77.70    34.02    34.02
Weight Market           80.61    -2.03   122.89   184.42   -44.14    67.80    93.43
Sharpe Ratio             0.13     0.18     0.10     0.19     0.09     0.13     0.13
CER (% per annum)        0.00     1.11     1.94     1.17     1.80     0.04     0.04
Panel D: Default and Term Spread – Incorporating Uncertainty About the Beta Model Dynamics
                         Mean     ∆def     ∇def    ∆term    ∇term    ∆beta    ∇beta
Predictability in risk premia and betas (constant alpha, stochastic beta)
Weight Value            29.30    28.99    29.22    28.59    28.77    29.30    29.30
Weight Market           80.25   155.87     3.57   108.61    35.21    69.31    91.18
Sharpe Ratio             0.14     0.22     0.05     0.17     0.09     0.14     0.14
CER (% per annum)        0.00     1.69     1.76     0.24     0.67     0.03     0.03
Predictability in risk premia and betas (deterministic alpha, stochastic beta)
Weight Value            32.56    67.52    -3.28    -8.93    72.76    32.56    32.56
Weight Market           76.47   112.04    41.06   150.32   -17.19    64.29    88.64
Sharpe Ratio             0.14     0.23     0.04     0.17     0.11     0.14     0.14
CER (% per annum)        0.00     1.96     2.04     0.61     1.01     0.04     0.04
Table X: Economic Significance of Evidence on the Source of Predictability: Difference in CER Across Models. Value Portfolio – Constant Price of Risk
This table presents the difference in certainty equivalent returns (CER), C*m-Cam, across models. C*m is the CER for the optimal portfolio and Cam is the CER for the suboptimal portfolio. The optimal portfolio is computed using the model that the investor perceives as the optimal one. In the table, the optimal model is the one in the first column. The suboptimal portfolio is computed using an alternative model, which is the one given in the second column. The difference in CER’s is annualized and is in percentages. In the column labeled by “∆dyld” the dividend yield is one standard deviation above its mean. The same convention applies to the rest of the columns. In the first two columns, cons refers to a constant model; dete to a time-varying, deterministic model; and stoc to a time-varying stochastic model. Sample period: 1:63-12:98.
Optimal                 Non-optimal                ∆dyld   ∆term   ∆beta    ∆def   ∆term   ∆beta
Cons alpha, cons beta   Dete alpha, cons beta       2.70    0.28       -    5.25    0.01       -
                        Cons alpha, dete beta       0.02    0.07       -    0.03    0.07       -
                        Dete alpha, dete beta       2.12    0.15       -    4.49    0.15       -
                        Cons alpha, stoc beta       0.26    0.26    1.46    0.48    0.15    1.40
                        Dete alpha, stoc beta       0.32    0.04    1.49    0.71    0.04    1.40
Dete alpha, cons beta   Cons alpha, cons beta       2.70    0.01       -    5.12    0.01       -
                        Cons alpha, dete beta       2.52    0.08       -    5.02    0.80       -
                        Dete alpha, dete beta       0.07    0.12       -    0.11    0.12       -
                        Cons alpha, stoc beta       5.54    0.37    1.48    8.85    0.37    1.56
                        Dete alpha, stoc beta       1.20    0.09    1.51    2.00    0.09    1.59
Cons alpha, dete beta   Cons alpha, cons beta       0.03    0.10       -    0.05    0.10       -
                        Dete alpha, cons beta       2.60    0.09       -    5.17    0.09       -
                        Dete alpha, dete beta       1.94    0.04       -    4.20    0.04       -
                        Cons alpha, stoc beta       0.30    0.29    1.64    0.50    0.29    1.67
                        Dete alpha, stoc beta       0.24    0.06    1.66    0.67    0.06    1.72
Dete alpha, dete beta   Cons alpha, cons beta       2.09    0.17       -    4.27    0.17       -
                        Dete alpha, cons beta       0.07    0.11       -    0.11    0.11       -
                        Cons alpha, dete beta       1.92    0.04       -    4.11    0.04       -
                        Cons alpha, stoc beta       3.77    0.55    1.65    7.61    0.55    1.75
                        Dete alpha, stoc beta       0.79    0.20    1.67    1.50    0.20    1.79
Cons alpha, stoc beta   Cons alpha, cons beta       0.30    0.29       -    0.60    0.29       -
                        Dete alpha, cons beta       5.06    0.41       -   11.14    0.41       -
                        Cons alpha, dete beta       0.32    0.33       -    0.62    0.33       -
                        Dete alpha, dete beta       4.10    0.64    0.00    9.63    0.64    0.00
                        Dete alpha, stoc beta       1.17    0.11    0.00    2.85    0.11    0.00
Dete alpha, stoc beta   Cons alpha, cons beta       0.34    0.05       -    0.84    0.05       -
                        Dete alpha, cons beta       1.31    0.19       -    2.43    0.10       -
                        Cons alpha, dete beta       0.26    0.07       -    0.79    0.07       -
                        Dete alpha, dete beta       0.84    0.22    0.00    1.85    0.22    0.00
                        Cons alpha, stoc beta       1.16    0.11    0.00    2.77    0.11    0.00
Table XI: Economic Significance of Evidence on the Source of Predictability: Difference in CER Across Models. Size Portfolio – Constant Price of Risk
This table presents the difference in certainty equivalent returns (CER), C*m-Cam, across models. C*m is the CER for the optimal portfolio and Cam is the CER for the suboptimal portfolio. The optimal portfolio is computed using the model that the investor perceives as the optimal one. In the table, the optimal model is the one in the first column. The suboptimal portfolio is computed using an alternative model, which is the one given in the second column. The difference in CER’s is annualized and is in percentages. In the column labeled by “∆dyld” the dividend yield is one standard deviation above its mean. The same convention applies to the rest of the columns. In the first two columns, cons refers to a constant model; dete to a time-varying, deterministic model; and stoc to a time-varying stochastic model. Sample period: 1:63-12:98.
Optimal                 Non-optimal                ∆dyld   ∆term   ∆beta    ∆def   ∆term   ∆beta
Cons alpha, cons beta   Dete alpha, cons beta       1.14    0.24       -    0.58    0.24       -
                        Cons alpha, dete beta       0.00    0.00       -    0.00    0.00       -
                        Dete alpha, dete beta       1.71    0.10       -    0.77    0.11       -
                        Cons alpha, stoc beta       0.08    0.08    0.14    0.07    0.08    0.15
                        Dete alpha, stoc beta       1.68    0.05    0.18    0.52    0.04    0.19
Dete alpha, cons beta   Cons alpha, cons beta       1.15    0.25       -    0.58    0.07       -
                        Cons alpha, dete beta       1.00    0.32       -    0.51    0.29       -
                        Dete alpha, dete beta       0.06    0.03       -    0.01    0.02       -
                        Cons alpha, stoc beta       0.62    0.60    0.13    0.25    0.57    0.13
                        Dete alpha, stoc beta       0.06    0.07    0.17    0.00    0.08    0.17
Cons alpha, dete beta   Cons alpha, cons beta       0.00    0.00       -    0.00    0.00       -
                        Dete alpha, cons beta       1.00    0.33       -    0.50    0.29       -
                        Dete alpha, dete beta       1.54    0.16       -    0.68    0.15       -
                        Cons alpha, stoc beta       0.05    0.04    0.10    0.04    0.05    0.12
                        Dete alpha, stoc beta       1.51    0.09    0.13    0.45    0.07    0.15
Dete alpha, dete beta   Cons alpha, cons beta       1.72    0.11       -    0.77    0.12       -
                        Dete alpha, cons beta       0.06    0.03       -    0.01    0.02       -
                        Cons alpha, dete beta       1.54    0.16       -    0.69    0.15       -
                        Cons alpha, stoc beta       1.06    0.37    0.09    0.38    0.38    0.11
                        Dete alpha, stoc beta       0.00    0.01    0.13    0.02    0.02    0.14
Cons alpha, stoc beta   Cons alpha, cons beta       0.09    0.08       -    0.08    0.08       -
                        Dete alpha, cons beta       0.68    0.67       -    0.30    0.64       -
                        Cons alpha, dete beta       0.05    0.05       -    0.05    0.06       -
                        Dete alpha, dete beta       1.15    0.40    0.00    0.45    0.41    0.00
                        Dete alpha, stoc beta       1.12    0.28    0.00    0.25    0.26    0.00
Dete alpha, stoc beta   Cons alpha, cons beta       1.86    0.06       -    0.61    0.05       -
                        Dete alpha, cons beta       0.06    0.08       -    0.00    0.09       -
                        Cons alpha, dete beta       1.66    0.10       -    0.53    0.07       -
                        Dete alpha, dete beta       0.00    0.01    0.00    0.03    0.02    0.00
                        Cons alpha, stoc beta       1.13    0.28    0.00    0.25    0.26    0.00