bayesian solutions for the factor zoo: we just ran two
TRANSCRIPT
Bayesian Solutions for the Factor Zoo
We Just Ran Two Quadrillion Modelslowast
Svetlana Bryzgalovadagger Jiantao HuangDagger Christian Julliardsect
November 18 2019
Abstract
We propose a novel and simple Bayesian estimation and model selection procedure for cross-sectional asset pricing Our approach that allows for both tradable and non-tradable factorsand is applicable to high dimensional cases has several desirable properties First weak andspurious factors lead to diffuse and centered at zero posteriors for their market price of riskmaking such factors easily detectable Second posterior inference is robust to the presence ofsuch factors Third we show that flat priors for risk premia lead to improper marginal like-lihoods rendering model selection invalid Therefore we provide a novel prior that is diffusefor strong factors but shrinks away useless ones under which posterior probabilities are wellbehaved and can be used for factor and (non necessarily nested) model selection as well asmodel averaging in large scale problems We apply our method to a very large set of factorsproposed in the literature and analyse 225 quadrillion possible models gaining novel insightson the empirical drivers of asset returns
Keywords Cross-sectional asset pricing factor models model evaluation multiple testing datamining p-hacking Bayesian methodsJEL codes G12 C11 C12 C52 C58
lowastAny errors or omissions are the responsibility of the authors Christian Julliard thanks the Economic and Social Research
Council (UK) [grant number ESK0023091] for financial support For helpful comments discussions and suggestions we thank
Mikhail Chernov Pierre Collin-Dufresne Aureo de Paula Marcelo Fernandes Rodrigo Guimaraes Alexander Michaelides
Olivier Scaillet and George TauchendaggerLondon Business School sbryzgalovalondoneduDaggerDepartment of Finance London School of Economics JHuang27lseacuksectDepartment of Finance FMG and SRC London School of Economics and CEPR cjulliardlseacuk
I Introduction
In the last decade or so two observations have come to the forefront of the empirical asset pricing
literature First at current production rates in the near future we will have more sources of em-
pirically ldquoidentifiedrdquo risk than stock returns to price with these factors ndash the so called factors zoo
phenomenon (see eg Harvey Liu and Zhu (2016)) Second given the commonly used estimation
methods in empirical asset pricing useless factors (ie factors whose true covariance with asset
returns is asymptotically zero) are not only likely to appear empirically relevant but also invali-
date inference regarding the true sources of risk (see eg Gospodinov Kan and Robotti (2019))
Nevertheless to the best of our knowledge no general method has been suggested to date that
i) is applicable to both tradeable and non tradeable factors can ii) handle the very large factor
zoo and iii) remains valid under model misspecification while iv) being robust to the spurious
inference problem And that is what we provide
As stressed by Harvey (2017) in his AFA presidential address the first observation naturally
calls for a Bayesian solution ndash and we develop one Furthermore we show that the two fundamental
problems above are tightly connected and a naive Bayesian approach to model selection may fail in
the presence of spurious factors Hence we correct it and apply our method to the zoo of traded and
non-traded factors proposed in the literature jointly evaluating 225 quadrillion models and gaining
novel insights on the empirical drivers of asset returns In particular we find that only a handful
of factors proposed in the previous literature are robust explanators of the cross-section of asset
returns and a four robust factor model easily outperforms canonical factor models Nevertheless
we also show that the lsquotruersquo latent stochastic discount factor (SDF) is dense is the space of empirical
asset pricing factors ie a large set of factors is needed to fully capture its pricing implications
Nonetheless the SDF-implied maximum Sharpe ratio in the economy is not unrealistically high
First we develop a very simple Bayesian version of the canonical Fama and MacBeth (1973)
regression method that is applicable to both traded and non-traded factors This approach makes
useless factors easily detectable in finite sample while delivering sharp posteriors for the strong
factorsrsquo risk premia (ie leaving inference about them unaffected) The result is quite intuitive
Useless factors make frequentist inference unreliable since when factor exposures go to zero risk
premia are no more identified We show that exactly the same phenomenon causes the posterior
credible intervals of risk premia to become diffuse and centered at zero which makes them easily
detectable in empirical applications This contribution is meant to add to the empirical researcher
toolset an approach for robust inference that is as easy to implement as eg the canonical Shanken
(1992) correction of the standard errors
Second the main intent of this paper is to provide a method for handling inference on the
entirety of the factor zoo at once This naturally calls for the use of model posterior probabilities
But we show that under flat priors for risk premia model and factors selection based on marginal
likelihoods (ie on posterior model probabilities or Bayes factors) is unreliable asymptotically
useless factors get selected with probability one This is due to the fact that lack of identification
generates an unbounded manifold for the risk premia parameters over which the likelihood surface
1
is totally flat1 Hence integration applied directly to the likelihood as if it were an unnormalized
probability distribution function (pdf) produces improper marginal ldquoposteriorsrdquo As a result in the
presence of identification failure naive Bayesian inference has the same weakness as the frequentist
one This observation however not only illustrates the nature of the problem but also suggests
how to restore inference use suitable non-informative but yet non-flat priors
Third building upon the literature on predictor selection (see eg Ishwaran Rao et al (2005)
and Giannone Lenza and Primiceri (2018)) we provide a novel (continuous) ldquospike-and-slabrdquo prior
that restores the validity of model selection based on posterior model probabilities and Bayes factors
The prior is uninformative (the ldquoslabrdquo) for strong factors but shrinks away (the ldquospikerdquo) useless
factors This approach is similar in spirit to a ridge regression and acts as a (Tikhonov-Phillips)
regularization of the likelihood function of the cross-sectional regression needed to estimate risk
premia A distinguishing feature of our prior is that the prior variance of a factorrsquos risk premium is
proportional to its correlation with the test asset returns Hence when a useless factor is present
the prior variance of its risk premium converges to zero so the shrinkage dominates and forces its
posterior distribution to concentrated around zero Not only this prior restores integrability but
also i) makes it computationally feasible to analyse quadrillions of alternative factor models ii)
allows the researcher to encode prior beliefs about the sparsity of the true SDF without imposing
hard thresholds and iii) shrinks the estimate of useless factorsrsquo risk premia toward zero We regard
this novel spike-and-slab prior approach as a solution for the high-dimensional inference problem
generated by the factor zoo
Our method is easy to implement and in all of our simulations has good finite sample properties
even when the cross-section of test assets is large We investigate its performance for risk premia
estimation model evaluation and factor selection in a range of simulation designs that mimic the
stylized features of returns Our simulations account for potential model misspecification and the
presence of either strong or useless factors in the model The use of posterior sampling naturally
allows to build credible confidence intervals not only for risk premia but also other statistics of
interest such as the cross-sectional R2 that is notoriously hard to estimate precisely (Lewellen
Nagel and Shanken (2010))
We show that whenever risk premia are well identified both our method and the frequentist
approach provide valid confidence intervals for model parameters with empirical coverage being
close to its nominal size The posterior distribution for useless factors however is reliably centered
around zero and quickly revels them even in a relatively short sample We find that the posterior
of strong factors is largely unaffected by the identification failure with the posterior coverage
corresponding to its nominal size as well In other words the Bayesian approach restores reliable
statistical inference in the model
We also demonstrate the pitfalls of flat priors for risk premia with the same simulation design
their use leads to selecting useless factor with probability approaching 1 for all the sample sizes
However our spike-and-slab prior seems to successfully eliminate them from the model while
1This is similar to the effect of ldquoweak instrumentsrdquo in IV estimations as discussed in Sims (2007)
2
retaining the true sources of risk
Our results have important empirical implications for the estimation of popular linear factor
models and their comparison We jointly evaluate 51 factors proposed in the previous literature
yielding a total of 225 quadrillion possible models to analyze and find that only a handful of
factors are robust explanators of the cross-section of asset returns (the Fama and French (1992)
ldquohigh-minus-lowrdquo proxy for the value premium the market index as well the adjusted versions
of the ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos
(2018))
Jointly the four robust factors provide a model that is compared to the previous empirical
literature one order of magnitude more likely to have generated the observed asset returns (itrsquos
posterior probability is about 90) However we show that with very high probability the ldquotruerdquo
latent SDF is dense in the space of factors ie capturing its characteristics requires the use of 24-25
factors Nevertheless the SDF-implied maximum Sharpe ratio is not excessive suggesting a high
degree of commonality in terms of captured risks among the factors in the zoo
Furthermore we apply our useless factors detection method to a selection of popular linear
SDFs We find that many non-traded factors such as consumption proxies labour factors or the
consumption-to-wealth ratio cay are only weakly identified at best and are characterised by a
substantial degree of model misspecification and uncertainty
I1 Closely related literature
There are numerous contributions to the literature that rely on the use of Bayesian tools in finance
especially in the areas of asset allocation (for an excellent overview see Fabozzi Huang and Zhou
(2010) and Avramov and Zhou (2010)) model selection (eg Barillas and Shanken (2018)) and
performance evaluation (Baks Metrick and Wachter (2001) Pastor and Stambaugh (2002) Jones
and Shanken (2005) Harvey and Liu (2019)) In this paper therefore we aim to provide only an
overview of the literature that is most closely related to our paper
While there are multiple papers in the literature that adopt a Bayesian approach to analyse
linear factor models and portfolio choice most of them focus on the time-series regressions where
the intercepts thanks to factors being directly traded (or using their mimicking portfolios) can be
interpreted as the vector of pricing errors ndash the αrsquos In this case the use of test assets actually
becomes irrelevant since the problem of comparing model performance is reduced to the spanning
tests of one set of factors by another (eg Barillas and Shanken (2018))
Our paper instead develops a method that can be applied to both tradable and non-tradable
factors As a result we focus on the general pricing performance in the cross-section of asset returns
(which is no longer irrelevant) and show that there is a tight link between the use of the most
popular diffuse priors for the risk premia and the failure of the standard estimation techniques
in the presence of useless factors (eg Kan and Zhang (1999a))
Shanken (1987) and Harvey and Zhou (1990) were probably the first contributions to the lit-
erature that adapted the Bayesian framework to the analysis of portfolio choice and developed
3
GRS-type tests (cf Gibbons Ross and Shanken (1989)) for mean-variance efficiency While
Shanken (1987) was the first to examine the posterior odds ratio for portfolio alphas in the linear
factor model Harvey and Zhou (1990) set the benchmark by imposing the priors both diffuse and
informative directly on the deep model parameters Using the data on 12 industry portfolios they
reject the mean-variance efficiency of the market return with a high degree of certainty
Pastor and Stambaugh (2000) and Pastor (2000) directly assign a prior distribution to the
pricing errors α α sim N (0κΣ) where Σ is the variance-covariance matrix of returns and κ isin R+
and apply it to the Bayesian portfolio choice problem The intuition behind their prior is that it
imposes a degree of shrinkage on the alphas so that whenever factor models are misspecified the
pricing errors cannot be too large a priori hence placing a bound on the Sharpe ratio achievable
in this economy Therefore a diffuse prior for the pricing errors α in general should be avoided
Barillas and Shanken (2018) extend the aforementioned prior to derive a closed-form solution
for the Bayesrsquo factor and use it to compare different linear factor models exploiting the time series
dimension of the data2 In a recent paper Goyal He and Huh (2018) also extend the notion of
distance between alternative model specifications and highlight the tension between the power of
the GRS-type tests and the absolute return-based measures of mispricing
Last but not the least there is a general close connection between the Bayesian approach
to model selection or parameter estimation and the shrinkage-based one Garlappi Uppal and
Wang (2007) impose a set of different priors on expected returns and the variance-covarince matrix
and find that the shrinkage-based analogue leads to superior empirical performance Anderson
and Cheng (2016) develop a Bayesian model-averaging approach to portfolio choice with model
uncertainty being one of the key ingredients that yield robust asset allocation and superior out-of-
sample performance of the strategies Finally the shrinkage-based approach to recovering the SDF
of Kozak Nagel and Santosh (2019) can also be interpreted from a Bayesian perspective Within
a universe of characteristic-managed portfolios the authors assign prior distributions to expected
returns3 and their posterior maximum likelihood estimators resemble a ridge regression Instead
we work directly with tradable and nontradable factors and consider (endogenously) heterogenous
priors for factor risk premia λ The dispersion of our prior for each λ directly depends on the
correlation between test assets and the factor so that it mimics the strength of the identification
of the factor risk premium
Naturally our paper also contributes to the very active (and growing) body of work that criti-
cally evaluates existing findings in the empirical asset pricing literature and tries to develop a robust
methodology There is ample empirical evidence that most linear asset pricing models are misspec-
ified (eg Chernov Lochstoer and Lundeby (2019) He Huang and Zhou (2018) Giglio and Xiu
(2018)) Gospodinov Kan and Robotti (2014) develop a general approach for misspecification-
robust inference that provides valid confidence interval for the pseudo-true values of the risk premia
2Chib Zeng and Zhao (forthcoming) show that the improper prior specification of Barillas and Shanken (2018)is problematic and propose a new class of priors that leads to valid model comparison
3Or equivalently the coefficients b when the linear stochastic discount factor is represented as mt = 1 minus (ft minusE[ft])
⊤b where ft and E denote respectively a vector of factors and the unconditional expectation operator
4
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
I Introduction
In the last decade or so two observations have come to the forefront of the empirical asset pricing
literature First at current production rates in the near future we will have more sources of em-
pirically ldquoidentifiedrdquo risk than stock returns to price with these factors ndash the so called factors zoo
phenomenon (see eg Harvey Liu and Zhu (2016)) Second given the commonly used estimation
methods in empirical asset pricing useless factors (ie factors whose true covariance with asset
returns is asymptotically zero) are not only likely to appear empirically relevant but also invali-
date inference regarding the true sources of risk (see eg Gospodinov Kan and Robotti (2019))
Nevertheless to the best of our knowledge no general method has been suggested to date that
i) is applicable to both tradeable and non tradeable factors can ii) handle the very large factor
zoo and iii) remains valid under model misspecification while iv) being robust to the spurious
inference problem And that is what we provide
As stressed by Harvey (2017) in his AFA presidential address the first observation naturally
calls for a Bayesian solution ndash and we develop one Furthermore we show that the two fundamental
problems above are tightly connected and a naive Bayesian approach to model selection may fail in
the presence of spurious factors Hence we correct it and apply our method to the zoo of traded and
non-traded factors proposed in the literature jointly evaluating 225 quadrillion models and gaining
novel insights on the empirical drivers of asset returns In particular we find that only a handful
of factors proposed in the previous literature are robust explanators of the cross-section of asset
returns and a four robust factor model easily outperforms canonical factor models Nevertheless
we also show that the lsquotruersquo latent stochastic discount factor (SDF) is dense is the space of empirical
asset pricing factors ie a large set of factors is needed to fully capture its pricing implications
Nonetheless the SDF-implied maximum Sharpe ratio in the economy is not unrealistically high
First we develop a very simple Bayesian version of the canonical Fama and MacBeth (1973)
regression method that is applicable to both traded and non-traded factors This approach makes
useless factors easily detectable in finite sample while delivering sharp posteriors for the strong
factorsrsquo risk premia (ie leaving inference about them unaffected) The result is quite intuitive
Useless factors make frequentist inference unreliable since when factor exposures go to zero risk
premia are no more identified We show that exactly the same phenomenon causes the posterior
credible intervals of risk premia to become diffuse and centered at zero which makes them easily
detectable in empirical applications This contribution is meant to add to the empirical researcher
toolset an approach for robust inference that is as easy to implement as eg the canonical Shanken
(1992) correction of the standard errors
Second the main intent of this paper is to provide a method for handling inference on the
entirety of the factor zoo at once This naturally calls for the use of model posterior probabilities
But we show that under flat priors for risk premia model and factors selection based on marginal
likelihoods (ie on posterior model probabilities or Bayes factors) is unreliable asymptotically
useless factors get selected with probability one This is due to the fact that lack of identification
generates an unbounded manifold for the risk premia parameters over which the likelihood surface
1
is totally flat1 Hence integration applied directly to the likelihood as if it were an unnormalized
probability distribution function (pdf) produces improper marginal ldquoposteriorsrdquo As a result in the
presence of identification failure naive Bayesian inference has the same weakness as the frequentist
one This observation however not only illustrates the nature of the problem but also suggests
how to restore inference use suitable non-informative but yet non-flat priors
Third building upon the literature on predictor selection (see eg Ishwaran Rao et al (2005)
and Giannone Lenza and Primiceri (2018)) we provide a novel (continuous) ldquospike-and-slabrdquo prior
that restores the validity of model selection based on posterior model probabilities and Bayes factors
The prior is uninformative (the ldquoslabrdquo) for strong factors but shrinks away (the ldquospikerdquo) useless
factors This approach is similar in spirit to a ridge regression and acts as a (Tikhonov-Phillips)
regularization of the likelihood function of the cross-sectional regression needed to estimate risk
premia A distinguishing feature of our prior is that the prior variance of a factorrsquos risk premium is
proportional to its correlation with the test asset returns Hence when a useless factor is present
the prior variance of its risk premium converges to zero so the shrinkage dominates and forces its
posterior distribution to concentrated around zero Not only this prior restores integrability but
also i) makes it computationally feasible to analyse quadrillions of alternative factor models ii)
allows the researcher to encode prior beliefs about the sparsity of the true SDF without imposing
hard thresholds and iii) shrinks the estimate of useless factorsrsquo risk premia toward zero We regard
this novel spike-and-slab prior approach as a solution for the high-dimensional inference problem
generated by the factor zoo
Our method is easy to implement and in all of our simulations has good finite sample properties
even when the cross-section of test assets is large We investigate its performance for risk premia
estimation model evaluation and factor selection in a range of simulation designs that mimic the
stylized features of returns Our simulations account for potential model misspecification and the
presence of either strong or useless factors in the model The use of posterior sampling naturally
allows to build credible confidence intervals not only for risk premia but also other statistics of
interest such as the cross-sectional R2 that is notoriously hard to estimate precisely (Lewellen
Nagel and Shanken (2010))
We show that whenever risk premia are well identified both our method and the frequentist
approach provide valid confidence intervals for model parameters with empirical coverage being
close to its nominal size The posterior distribution for useless factors however is reliably centered
around zero and quickly revels them even in a relatively short sample We find that the posterior
of strong factors is largely unaffected by the identification failure with the posterior coverage
corresponding to its nominal size as well In other words the Bayesian approach restores reliable
statistical inference in the model
We also demonstrate the pitfalls of flat priors for risk premia with the same simulation design
their use leads to selecting useless factor with probability approaching 1 for all the sample sizes
However our spike-and-slab prior seems to successfully eliminate them from the model while
1This is similar to the effect of ldquoweak instrumentsrdquo in IV estimations as discussed in Sims (2007)
2
retaining the true sources of risk
Our results have important empirical implications for the estimation of popular linear factor
models and their comparison We jointly evaluate 51 factors proposed in the previous literature
yielding a total of 225 quadrillion possible models to analyze and find that only a handful of
factors are robust explanators of the cross-section of asset returns (the Fama and French (1992)
ldquohigh-minus-lowrdquo proxy for the value premium the market index as well the adjusted versions
of the ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos
(2018))
Jointly the four robust factors provide a model that is compared to the previous empirical
literature one order of magnitude more likely to have generated the observed asset returns (itrsquos
posterior probability is about 90) However we show that with very high probability the ldquotruerdquo
latent SDF is dense in the space of factors ie capturing its characteristics requires the use of 24-25
factors Nevertheless the SDF-implied maximum Sharpe ratio is not excessive suggesting a high
degree of commonality in terms of captured risks among the factors in the zoo
Furthermore we apply our useless factors detection method to a selection of popular linear
SDFs We find that many non-traded factors such as consumption proxies labour factors or the
consumption-to-wealth ratio cay are only weakly identified at best and are characterised by a
substantial degree of model misspecification and uncertainty
I1 Closely related literature
There are numerous contributions to the literature that rely on the use of Bayesian tools in finance
especially in the areas of asset allocation (for an excellent overview see Fabozzi Huang and Zhou
(2010) and Avramov and Zhou (2010)) model selection (eg Barillas and Shanken (2018)) and
performance evaluation (Baks Metrick and Wachter (2001) Pastor and Stambaugh (2002) Jones
and Shanken (2005) Harvey and Liu (2019)) In this paper therefore we aim to provide only an
overview of the literature that is most closely related to our paper
While there are multiple papers in the literature that adopt a Bayesian approach to analyse
linear factor models and portfolio choice most of them focus on the time-series regressions where
the intercepts thanks to factors being directly traded (or using their mimicking portfolios) can be
interpreted as the vector of pricing errors ndash the αrsquos In this case the use of test assets actually
becomes irrelevant since the problem of comparing model performance is reduced to the spanning
tests of one set of factors by another (eg Barillas and Shanken (2018))
Our paper instead develops a method that can be applied to both tradable and non-tradable
factors As a result we focus on the general pricing performance in the cross-section of asset returns
(which is no longer irrelevant) and show that there is a tight link between the use of the most
popular diffuse priors for the risk premia and the failure of the standard estimation techniques
in the presence of useless factors (eg Kan and Zhang (1999a))
Shanken (1987) and Harvey and Zhou (1990) were probably the first contributions to the lit-
erature that adapted the Bayesian framework to the analysis of portfolio choice and developed
3
GRS-type tests (cf Gibbons Ross and Shanken (1989)) for mean-variance efficiency While
Shanken (1987) was the first to examine the posterior odds ratio for portfolio alphas in the linear
factor model Harvey and Zhou (1990) set the benchmark by imposing the priors both diffuse and
informative directly on the deep model parameters Using the data on 12 industry portfolios they
reject the mean-variance efficiency of the market return with a high degree of certainty
Pastor and Stambaugh (2000) and Pastor (2000) directly assign a prior distribution to the
pricing errors α α sim N (0κΣ) where Σ is the variance-covariance matrix of returns and κ isin R+
and apply it to the Bayesian portfolio choice problem The intuition behind their prior is that it
imposes a degree of shrinkage on the alphas so that whenever factor models are misspecified the
pricing errors cannot be too large a priori hence placing a bound on the Sharpe ratio achievable
in this economy Therefore a diffuse prior for the pricing errors α in general should be avoided
Barillas and Shanken (2018) extend the aforementioned prior to derive a closed-form solution
for the Bayesrsquo factor and use it to compare different linear factor models exploiting the time series
dimension of the data2 In a recent paper Goyal He and Huh (2018) also extend the notion of
distance between alternative model specifications and highlight the tension between the power of
the GRS-type tests and the absolute return-based measures of mispricing
Last but not the least there is a general close connection between the Bayesian approach
to model selection or parameter estimation and the shrinkage-based one Garlappi Uppal and
Wang (2007) impose a set of different priors on expected returns and the variance-covarince matrix
and find that the shrinkage-based analogue leads to superior empirical performance Anderson
and Cheng (2016) develop a Bayesian model-averaging approach to portfolio choice with model
uncertainty being one of the key ingredients that yield robust asset allocation and superior out-of-
sample performance of the strategies Finally the shrinkage-based approach to recovering the SDF
of Kozak Nagel and Santosh (2019) can also be interpreted from a Bayesian perspective Within
a universe of characteristic-managed portfolios the authors assign prior distributions to expected
returns3 and their posterior maximum likelihood estimators resemble a ridge regression Instead
we work directly with tradable and nontradable factors and consider (endogenously) heterogenous
priors for factor risk premia λ The dispersion of our prior for each λ directly depends on the
correlation between test assets and the factor so that it mimics the strength of the identification
of the factor risk premium
Naturally our paper also contributes to the very active (and growing) body of work that criti-
cally evaluates existing findings in the empirical asset pricing literature and tries to develop a robust
methodology There is ample empirical evidence that most linear asset pricing models are misspec-
ified (eg Chernov Lochstoer and Lundeby (2019) He Huang and Zhou (2018) Giglio and Xiu
(2018)) Gospodinov Kan and Robotti (2014) develop a general approach for misspecification-
robust inference that provides valid confidence interval for the pseudo-true values of the risk premia
2Chib Zeng and Zhao (forthcoming) show that the improper prior specification of Barillas and Shanken (2018)is problematic and propose a new class of priors that leads to valid model comparison
3Or equivalently the coefficients b when the linear stochastic discount factor is represented as mt = 1 minus (ft minusE[ft])
⊤b where ft and E denote respectively a vector of factors and the unconditional expectation operator
4
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
is totally flat1 Hence integration applied directly to the likelihood as if it were an unnormalized
probability distribution function (pdf) produces improper marginal ldquoposteriorsrdquo As a result in the
presence of identification failure naive Bayesian inference has the same weakness as the frequentist
one This observation however not only illustrates the nature of the problem but also suggests
how to restore inference use suitable non-informative but yet non-flat priors
Third building upon the literature on predictor selection (see eg Ishwaran Rao et al (2005)
and Giannone Lenza and Primiceri (2018)) we provide a novel (continuous) ldquospike-and-slabrdquo prior
that restores the validity of model selection based on posterior model probabilities and Bayes factors
The prior is uninformative (the ldquoslabrdquo) for strong factors but shrinks away (the ldquospikerdquo) useless
factors This approach is similar in spirit to a ridge regression and acts as a (Tikhonov-Phillips)
regularization of the likelihood function of the cross-sectional regression needed to estimate risk
premia A distinguishing feature of our prior is that the prior variance of a factorrsquos risk premium is
proportional to its correlation with the test asset returns Hence when a useless factor is present
the prior variance of its risk premium converges to zero so the shrinkage dominates and forces its
posterior distribution to concentrated around zero Not only this prior restores integrability but
also i) makes it computationally feasible to analyse quadrillions of alternative factor models ii)
allows the researcher to encode prior beliefs about the sparsity of the true SDF without imposing
hard thresholds and iii) shrinks the estimate of useless factorsrsquo risk premia toward zero We regard
this novel spike-and-slab prior approach as a solution for the high-dimensional inference problem
generated by the factor zoo
Our method is easy to implement and in all of our simulations has good finite sample properties
even when the cross-section of test assets is large We investigate its performance for risk premia
estimation model evaluation and factor selection in a range of simulation designs that mimic the
stylized features of returns Our simulations account for potential model misspecification and the
presence of either strong or useless factors in the model The use of posterior sampling naturally
allows to build credible confidence intervals not only for risk premia but also other statistics of
interest such as the cross-sectional R2 that is notoriously hard to estimate precisely (Lewellen
Nagel and Shanken (2010))
We show that whenever risk premia are well identified both our method and the frequentist
approach provide valid confidence intervals for model parameters with empirical coverage being
close to its nominal size The posterior distribution for useless factors however is reliably centered
around zero and quickly revels them even in a relatively short sample We find that the posterior
of strong factors is largely unaffected by the identification failure with the posterior coverage
corresponding to its nominal size as well In other words the Bayesian approach restores reliable
statistical inference in the model
We also demonstrate the pitfalls of flat priors for risk premia with the same simulation design
their use leads to selecting useless factor with probability approaching 1 for all the sample sizes
However our spike-and-slab prior seems to successfully eliminate them from the model while
1This is similar to the effect of ldquoweak instrumentsrdquo in IV estimations as discussed in Sims (2007)
2
retaining the true sources of risk
Our results have important empirical implications for the estimation of popular linear factor
models and their comparison We jointly evaluate 51 factors proposed in the previous literature
yielding a total of 225 quadrillion possible models to analyze and find that only a handful of
factors are robust explanators of the cross-section of asset returns (the Fama and French (1992)
ldquohigh-minus-lowrdquo proxy for the value premium the market index as well the adjusted versions
of the ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos
(2018))
Jointly the four robust factors provide a model that is compared to the previous empirical
literature one order of magnitude more likely to have generated the observed asset returns (itrsquos
posterior probability is about 90) However we show that with very high probability the ldquotruerdquo
latent SDF is dense in the space of factors ie capturing its characteristics requires the use of 24-25
factors Nevertheless the SDF-implied maximum Sharpe ratio is not excessive suggesting a high
degree of commonality in terms of captured risks among the factors in the zoo
Furthermore we apply our useless factors detection method to a selection of popular linear
SDFs We find that many non-traded factors such as consumption proxies labour factors or the
consumption-to-wealth ratio cay are only weakly identified at best and are characterised by a
substantial degree of model misspecification and uncertainty
I1 Closely related literature
There are numerous contributions to the literature that rely on the use of Bayesian tools in finance
especially in the areas of asset allocation (for an excellent overview see Fabozzi Huang and Zhou
(2010) and Avramov and Zhou (2010)) model selection (eg Barillas and Shanken (2018)) and
performance evaluation (Baks Metrick and Wachter (2001) Pastor and Stambaugh (2002) Jones
and Shanken (2005) Harvey and Liu (2019)) In this paper therefore we aim to provide only an
overview of the literature that is most closely related to our paper
While there are multiple papers in the literature that adopt a Bayesian approach to analyse
linear factor models and portfolio choice most of them focus on the time-series regressions where
the intercepts thanks to factors being directly traded (or using their mimicking portfolios) can be
interpreted as the vector of pricing errors ndash the αrsquos In this case the use of test assets actually
becomes irrelevant since the problem of comparing model performance is reduced to the spanning
tests of one set of factors by another (eg Barillas and Shanken (2018))
Our paper instead develops a method that can be applied to both tradable and non-tradable
factors As a result we focus on the general pricing performance in the cross-section of asset returns
(which is no longer irrelevant) and show that there is a tight link between the use of the most
popular diffuse priors for the risk premia and the failure of the standard estimation techniques
in the presence of useless factors (eg Kan and Zhang (1999a))
Shanken (1987) and Harvey and Zhou (1990) were probably the first contributions to the lit-
erature that adapted the Bayesian framework to the analysis of portfolio choice and developed
3
GRS-type tests (cf Gibbons Ross and Shanken (1989)) for mean-variance efficiency While
Shanken (1987) was the first to examine the posterior odds ratio for portfolio alphas in the linear
factor model Harvey and Zhou (1990) set the benchmark by imposing the priors both diffuse and
informative directly on the deep model parameters Using the data on 12 industry portfolios they
reject the mean-variance efficiency of the market return with a high degree of certainty
Pastor and Stambaugh (2000) and Pastor (2000) directly assign a prior distribution to the
pricing errors α α sim N (0κΣ) where Σ is the variance-covariance matrix of returns and κ isin R+
and apply it to the Bayesian portfolio choice problem The intuition behind their prior is that it
imposes a degree of shrinkage on the alphas so that whenever factor models are misspecified the
pricing errors cannot be too large a priori hence placing a bound on the Sharpe ratio achievable
in this economy Therefore a diffuse prior for the pricing errors α in general should be avoided
Barillas and Shanken (2018) extend the aforementioned prior to derive a closed-form solution
for the Bayesrsquo factor and use it to compare different linear factor models exploiting the time series
dimension of the data2 In a recent paper Goyal He and Huh (2018) also extend the notion of
distance between alternative model specifications and highlight the tension between the power of
the GRS-type tests and the absolute return-based measures of mispricing
Last but not the least there is a general close connection between the Bayesian approach
to model selection or parameter estimation and the shrinkage-based one Garlappi Uppal and
Wang (2007) impose a set of different priors on expected returns and the variance-covarince matrix
and find that the shrinkage-based analogue leads to superior empirical performance Anderson
and Cheng (2016) develop a Bayesian model-averaging approach to portfolio choice with model
uncertainty being one of the key ingredients that yield robust asset allocation and superior out-of-
sample performance of the strategies Finally the shrinkage-based approach to recovering the SDF
of Kozak Nagel and Santosh (2019) can also be interpreted from a Bayesian perspective Within
a universe of characteristic-managed portfolios the authors assign prior distributions to expected
returns3 and their posterior maximum likelihood estimators resemble a ridge regression Instead
we work directly with tradable and nontradable factors and consider (endogenously) heterogenous
priors for factor risk premia λ The dispersion of our prior for each λ directly depends on the
correlation between test assets and the factor so that it mimics the strength of the identification
of the factor risk premium
Naturally our paper also contributes to the very active (and growing) body of work that criti-
cally evaluates existing findings in the empirical asset pricing literature and tries to develop a robust
methodology There is ample empirical evidence that most linear asset pricing models are misspec-
ified (eg Chernov Lochstoer and Lundeby (2019) He Huang and Zhou (2018) Giglio and Xiu
(2018)) Gospodinov Kan and Robotti (2014) develop a general approach for misspecification-
robust inference that provides valid confidence interval for the pseudo-true values of the risk premia
2Chib Zeng and Zhao (forthcoming) show that the improper prior specification of Barillas and Shanken (2018)is problematic and propose a new class of priors that leads to valid model comparison
3Or equivalently the coefficients b when the linear stochastic discount factor is represented as mt = 1 minus (ft minusE[ft])
⊤b where ft and E denote respectively a vector of factors and the unconditional expectation operator
4
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
retaining the true sources of risk
Our results have important empirical implications for the estimation of popular linear factor
models and their comparison We jointly evaluate 51 factors proposed in the previous literature
yielding a total of 225 quadrillion possible models to analyze and find that only a handful of
factors are robust explanators of the cross-section of asset returns (the Fama and French (1992)
ldquohigh-minus-lowrdquo proxy for the value premium the market index as well the adjusted versions
of the ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos
(2018))
Jointly the four robust factors provide a model that is compared to the previous empirical
literature one order of magnitude more likely to have generated the observed asset returns (itrsquos
posterior probability is about 90) However we show that with very high probability the ldquotruerdquo
latent SDF is dense in the space of factors ie capturing its characteristics requires the use of 24-25
factors Nevertheless the SDF-implied maximum Sharpe ratio is not excessive suggesting a high
degree of commonality in terms of captured risks among the factors in the zoo
Furthermore we apply our useless factors detection method to a selection of popular linear
SDFs We find that many non-traded factors such as consumption proxies labour factors or the
consumption-to-wealth ratio cay are only weakly identified at best and are characterised by a
substantial degree of model misspecification and uncertainty
I1 Closely related literature
There are numerous contributions to the literature that rely on the use of Bayesian tools in finance
especially in the areas of asset allocation (for an excellent overview see Fabozzi Huang and Zhou
(2010) and Avramov and Zhou (2010)) model selection (eg Barillas and Shanken (2018)) and
performance evaluation (Baks Metrick and Wachter (2001) Pastor and Stambaugh (2002) Jones
and Shanken (2005) Harvey and Liu (2019)) In this paper therefore we aim to provide only an
overview of the literature that is most closely related to our paper
While there are multiple papers in the literature that adopt a Bayesian approach to analyse
linear factor models and portfolio choice most of them focus on the time-series regressions where
the intercepts thanks to factors being directly traded (or using their mimicking portfolios) can be
interpreted as the vector of pricing errors ndash the αrsquos In this case the use of test assets actually
becomes irrelevant since the problem of comparing model performance is reduced to the spanning
tests of one set of factors by another (eg Barillas and Shanken (2018))
Our paper instead develops a method that can be applied to both tradable and non-tradable
factors As a result we focus on the general pricing performance in the cross-section of asset returns
(which is no longer irrelevant) and show that there is a tight link between the use of the most
popular diffuse priors for the risk premia and the failure of the standard estimation techniques
in the presence of useless factors (eg Kan and Zhang (1999a))
Shanken (1987) and Harvey and Zhou (1990) were probably the first contributions to the lit-
erature that adapted the Bayesian framework to the analysis of portfolio choice and developed
3
GRS-type tests (cf Gibbons Ross and Shanken (1989)) for mean-variance efficiency While
Shanken (1987) was the first to examine the posterior odds ratio for portfolio alphas in the linear
factor model Harvey and Zhou (1990) set the benchmark by imposing the priors both diffuse and
informative directly on the deep model parameters Using the data on 12 industry portfolios they
reject the mean-variance efficiency of the market return with a high degree of certainty
Pastor and Stambaugh (2000) and Pastor (2000) directly assign a prior distribution to the
pricing errors α α sim N (0κΣ) where Σ is the variance-covariance matrix of returns and κ isin R+
and apply it to the Bayesian portfolio choice problem The intuition behind their prior is that it
imposes a degree of shrinkage on the alphas so that whenever factor models are misspecified the
pricing errors cannot be too large a priori hence placing a bound on the Sharpe ratio achievable
in this economy Therefore a diffuse prior for the pricing errors α in general should be avoided
Barillas and Shanken (2018) extend the aforementioned prior to derive a closed-form solution
for the Bayesrsquo factor and use it to compare different linear factor models exploiting the time series
dimension of the data2 In a recent paper Goyal He and Huh (2018) also extend the notion of
distance between alternative model specifications and highlight the tension between the power of
the GRS-type tests and the absolute return-based measures of mispricing
Last but not the least there is a general close connection between the Bayesian approach
to model selection or parameter estimation and the shrinkage-based one Garlappi Uppal and
Wang (2007) impose a set of different priors on expected returns and the variance-covarince matrix
and find that the shrinkage-based analogue leads to superior empirical performance Anderson
and Cheng (2016) develop a Bayesian model-averaging approach to portfolio choice with model
uncertainty being one of the key ingredients that yield robust asset allocation and superior out-of-
sample performance of the strategies Finally the shrinkage-based approach to recovering the SDF
of Kozak Nagel and Santosh (2019) can also be interpreted from a Bayesian perspective Within
a universe of characteristic-managed portfolios the authors assign prior distributions to expected
returns3 and their posterior maximum likelihood estimators resemble a ridge regression Instead
we work directly with tradable and nontradable factors and consider (endogenously) heterogenous
priors for factor risk premia λ The dispersion of our prior for each λ directly depends on the
correlation between test assets and the factor so that it mimics the strength of the identification
of the factor risk premium
Naturally our paper also contributes to the very active (and growing) body of work that criti-
cally evaluates existing findings in the empirical asset pricing literature and tries to develop a robust
methodology There is ample empirical evidence that most linear asset pricing models are misspec-
ified (eg Chernov Lochstoer and Lundeby (2019) He Huang and Zhou (2018) Giglio and Xiu
(2018)) Gospodinov Kan and Robotti (2014) develop a general approach for misspecification-
robust inference that provides valid confidence interval for the pseudo-true values of the risk premia
2Chib Zeng and Zhao (forthcoming) show that the improper prior specification of Barillas and Shanken (2018)is problematic and propose a new class of priors that leads to valid model comparison
3Or equivalently the coefficients b when the linear stochastic discount factor is represented as mt = 1 minus (ft minusE[ft])
⊤b where ft and E denote respectively a vector of factors and the unconditional expectation operator
4
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
GRS-type tests (cf Gibbons Ross and Shanken (1989)) for mean-variance efficiency While
Shanken (1987) was the first to examine the posterior odds ratio for portfolio alphas in the linear
factor model Harvey and Zhou (1990) set the benchmark by imposing the priors both diffuse and
informative directly on the deep model parameters Using the data on 12 industry portfolios they
reject the mean-variance efficiency of the market return with a high degree of certainty
Pastor and Stambaugh (2000) and Pastor (2000) directly assign a prior distribution to the
pricing errors α α sim N (0κΣ) where Σ is the variance-covariance matrix of returns and κ isin R+
and apply it to the Bayesian portfolio choice problem The intuition behind their prior is that it
imposes a degree of shrinkage on the alphas so that whenever factor models are misspecified the
pricing errors cannot be too large a priori hence placing a bound on the Sharpe ratio achievable
in this economy Therefore a diffuse prior for the pricing errors α in general should be avoided
Barillas and Shanken (2018) extend the aforementioned prior to derive a closed-form solution
for the Bayesrsquo factor and use it to compare different linear factor models exploiting the time series
dimension of the data2 In a recent paper Goyal He and Huh (2018) also extend the notion of
distance between alternative model specifications and highlight the tension between the power of
the GRS-type tests and the absolute return-based measures of mispricing
Last but not the least there is a general close connection between the Bayesian approach
to model selection or parameter estimation and the shrinkage-based one Garlappi Uppal and
Wang (2007) impose a set of different priors on expected returns and the variance-covarince matrix
and find that the shrinkage-based analogue leads to superior empirical performance Anderson
and Cheng (2016) develop a Bayesian model-averaging approach to portfolio choice with model
uncertainty being one of the key ingredients that yield robust asset allocation and superior out-of-
sample performance of the strategies Finally the shrinkage-based approach to recovering the SDF
of Kozak Nagel and Santosh (2019) can also be interpreted from a Bayesian perspective Within
a universe of characteristic-managed portfolios the authors assign prior distributions to expected
returns3 and their posterior maximum likelihood estimators resemble a ridge regression Instead
we work directly with tradable and nontradable factors and consider (endogenously) heterogenous
priors for factor risk premia λ The dispersion of our prior for each λ directly depends on the
correlation between test assets and the factor so that it mimics the strength of the identification
of the factor risk premium
Naturally our paper also contributes to the very active (and growing) body of work that criti-
cally evaluates existing findings in the empirical asset pricing literature and tries to develop a robust
methodology There is ample empirical evidence that most linear asset pricing models are misspec-
ified (eg Chernov Lochstoer and Lundeby (2019) He Huang and Zhou (2018) Giglio and Xiu
(2018)) Gospodinov Kan and Robotti (2014) develop a general approach for misspecification-
robust inference that provides valid confidence interval for the pseudo-true values of the risk premia
2Chib Zeng and Zhao (forthcoming) show that the improper prior specification of Barillas and Shanken (2018)is problematic and propose a new class of priors that leads to valid model comparison
3Or equivalently the coefficients b when the linear stochastic discount factor is represented as mt = 1 minus (ft minusE[ft])
⊤b where ft and E denote respectively a vector of factors and the unconditional expectation operator
4
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
Giglio and Xiu (2018) exploit the invariance principle of the PCA and recover the risk premia asso-
ciated with the given factor from the projection on the span of latent factors driving a cross-section
of asset returns Uppal Zaffaroni and Zviadadze (2018) adopt a similar approach by recovering
latent factors from the residuals of the asset pricing model effectively completing the span of the
SDF Daniel Mota Rottke and Santos (2018) instead focus on the construction of the cross-
sectional factors and note that many well-established tradable portfolios such as HML and SMB
can be substantially improved in asset pricing tests by hedging their unpriced component (that does
not carry a risk premium) We do not take a stand on the origin of the factors or the completion of
the model space Instead we consider the whole universe of potential models that can be created
from the set of observable candidates factors proposed in the empirical literature As such our
analysis explicitly takes into account both standard specifications that have been successfully used
in numerous papers (eg Fama-French 3-factor models or nondurable consumption growth) as
well as the combinations of the factors that have never been explicitly tested
Following Harvey Liu and Zhu (2016) a large literature has tried to understand which of
the existing factors (or their combinations) drive the cross-section of asset returns Giglio Feng
and Xiu (2019) combine cross-sectional asset pricing regressions with the double-selection LASSO
of Belloni Chernozhukov and Hansen (2014) to provide valid uniform inference on the selected
sources of risk Huang Li and Zhou (2018) use a reduced rank approach to select among not
only the observable factors but their total span effectively allowing for sparsity not necessarily in
the observable set of factors but their combinations as well Kelly Pruitt and Su (2019) build a
latent factor model for stock returns with factor loadings being a linear function of the company
characteristics and find that only a small subset of the latter provide substantial independent
information relevant for asset pricing Our approach does not take a stand on whether there exists
a single combination of factors that substantially outperforms other model specifications Instead
we let the data speak and find out that the cross-sectional likelihood across the many models is
rather flat meaning the data is not informative enough to reliably indicate that there is a single
dominant specification
Finally our paper naturally contributes to the literature on weak identification in asset pricing
Starting from the seminal papers of Kan and Zhang (1999a 1999b) identification of risk premia has
been shown to be challenging for traditional estimation procedures Kleibergen (2009) demonstrates
that the two-pass regression of Fama-MacBeth lead to biased estimates of the risk premia and
spuriously high significance levels Moreover useless factors often crowd out the impact of the true
sources of risk in the model and lead to seemingly high levels of cross-sectional fit (Kleibergen and
Zhan (2015)) Gospodinov Kan and Robotti (2014 2019) demonstrate that most of the estimation
techniques used to recover risk premia in the cross-section are rendered invalid in the presence of
useless factors and propose alternative procedures that effectively eliminate the impact of these
factors from the model We build upon the intuition developed in these papers and formulate
the Bayesian solution to the problem by providing a prior that directly reflects the strength of
the factor Whenever the vector of correlation coefficients between asset returns and a factor is
close to zero the prior variance of λ for this specific factor also goes to zero and the penalty for
5
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
the risk premium converges to infinity effectively shrinking the posterior of the useless factorsrsquo
risk premia toward zero Therefore our priors are particularly robust to the presence of spurious
factors Conversely they are very diffuse for the strong factors with the posterior reflecting the
full impact of the likelihood
II Inference in Linear Factor Models
This section introduces the notation and reviews the main results of the Fama-MacBeth (FM)
regression method (see Fama and MacBeth (1973)) We focus on classic linear factor models for
cross-sectional asset returns Suppose that there are K factors ft = (f1t fKt)⊤ t = 1 T
which could be either tradable or non-tradable To simplify exposition we consider mean zero
factors that have also been demeaned in sample so that we have both E[ft] = 0K and f = 0K where
E[] denotes the unconditional expectation and the upper bar denotes the sample mean operator
The returns of N test assets in excess of the risk free rate are denoted by Rt = (R1t RNt)⊤
In the FM procedure the factor exposures of asset returns βf isin RNtimesK are recovered from
the linear regression
Rt = a+ βfft + 983171t (1)
where 9831711 983171Tiidsim N (0N Σ) and a isin RN Given the mean normalization of ft we have E[Rt] = a
The risk premia associated with the factors λf isin RK are then estimated from the cross-
sectional regression
R = λc1N + 983141βfλf +α (2)
where 983141βf denotes the time series estimates λc is a scalar average mispricing that should be equal to
zero under the null of the model being correctly specified 1N denotes an N -dimensional vector of
ones and α isin RN is the vector of pricing errors in excess of λc If the model is correctly specified
it implies the parameter restriction a = E[Rt] = λc1N + βfλf Therefore we can rewrite the
two-step Fama-MacBeth regression into one equation as
Rt = λc1N + βfλf + βfft + 983171t (3)
Equation (3) is particularly useful in our simulation study Note that the intercept λc is included
in (2) and (3) in order to separately evaluate the ability of the model to explain the average level
of the equity premium and the cross-sectional variation of asset returns
Let B⊤ = (aβf ) and F⊤t = (1f⊤
t ) and consider the matrices of stacked time series observa-
tions
R =
983091
983109983109983107
R⊤1
R⊤T
983092
983110983110983108 F =
983091
983109983109983107
F⊤1
F⊤T
983092
983110983110983108 983171 =
983091
983109983109983107
983171⊤1
983171⊤T
983092
983110983110983108
The regression in (1) can then be rewritten as R = FB + 983171 yielding the time-series estimates of
6
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
(aβf ) and Σ
983141B =
983075983141a⊤
983141β⊤f
983076= (F⊤F )minus1F⊤R 983141Σ =
1
T(Rminus F 983141B)⊤(Rminus F 983141B)
In the second step the OLS estimates of the factor risk premia are
983141λ = (983141β⊤ 983141β)minus1 983141β⊤R (4)
where 983141β = (1N 983141βf ) and λ⊤ = (λc λf ⊤) The canonical Shanken (1992) corrected covariance matrix
of the estimated risk premia is4
983141σ2(983141λ) = 1
T
983147(983141β⊤ 983141β)minus1 983141β⊤ 983141Σ983141β(983141β⊤ 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf )983148 (5)
where 983141Σf is the sample estimate of the variance-covariance matrix of the factors ft There are
two sources of estimation uncertainty in the OLS estimates of λ First we do not know the test
assetsrsquo expected returns but instead estimate them as sample means R According to the time-
series regression R sim N (a 1T Σ) asymptotically Second if β is known the asymptotic covariance
matrix of 983141λ is simply 1T (β
⊤β)minus1β⊤ 983141Σβ(β⊤β)minus1 The extra term (1 + λ⊤fΣ
minus1f λf ) is included to
account for the fact that βf is estimated
Alternatively we can run a (feasible) GLS regression in the second stage obtaining the estimates
983141λ = (983141β⊤ 983141Σminus1 983141β)minus1 983141β⊤ 983141Σminus1R (6)
where 983141Σ = 1T 983141983171
⊤983141983171 and 983141983171 denotes the OLS residuals and with the associated covariance matrix of
the estimates
983141σ2(983141λ) = 1
T(983141β⊤ 983141Σminus1 983141β)minus1(1 + 983141λ⊤
f983141Σminus1f
983141λf ) (7)
Equations (4) and (6) make it clear that in the presence of a spurious (or useless) factor ie
such that βj = CradicT C isin RN risk premia are no longer identified Furthermore their estimates
diverge leading to inference problems for both the useless and the strong (ie βj ∕rarr 0 as T rarr infin)
factors (see eg Kan and Zhang (1999b)) In the presence of such an identification failure the
cross-sectional R2 also becomes untrustworthy If a useless factor is included into the two-pass
regression the OLS R2 tend to be highly inflated (although the GLS R2 is less affected)5
4An alternative way (see eg Cochrane (2005) Page 242) to account for the uncertainty from ldquogenerated regres-sorsrdquo such as βf is to estimate the whole system in GMM The moments are
gT (aβf λ) =
983061IN otimes IK+1
A⊤
983062983091
983107E[Rt minus aminus βfft]
E[(Rt minus aminus βfft)otimes f⊤t ]
E[Rt minus λc1N minus βfλf ]
983092
983108 = 0
where β = (1N βf ) A⊤ = β⊤ for OLS and A⊤ = β⊤Σminus1 for GLS Also note that GLS estimation is not the sameas efficient GMM estimation
5For example Kleibergen and Zhan (2015) derive the asymptotic distribution of the R2 under the assumptionthat a few unknown factors are able to explain expected asset returns and show that in the presence of a useless
7
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
This problem arises not only when using the Fama-MacBeth two-step procedure Kan and
Zhang (1999a) point out that the identification condition in the GMM test of linear stochastic
discount factor models fails when a useless factor is included Moreover this leads to overrejection
of the hypothesis of a zero risk premium for the useless factor under the Wald test and the power
of the over-identifying restriction test decreases Gospodinov Kan and Robotti (2019) document
similar problems within the maximum likelihood estimation and testing framework
Consequently several papers have attempted to develop alternative statistical procedures that
are robust to the presence of useless factors Kleibergen (2009) proposes several novel statis-
tics whose large sample distributions are unaffected by the failure of the identification condition
Gospodinov Kan and Robotti (2014) derive robust standard errors for the GMM estimates of
factor risk premia in the linear stochastic factor framework and prove that t-statistics calculated
using their standard errors are robust even when the model is misspecified and a useless factor is
included Bryzgalova (2015) introduces a LASSO-like penalty term in the cross-sectional regression
to shrink the risk premium of the useless factor towards zero
In this paper we provide a Bayesian inference and model selection framework that i) can be
easily used for robust inference in the presence and detection of useless factors (section II1) and
ii) can be used for both model selection and model averaging even in the presence of a very large
number of candidate (traded or non traded and possibly useless) risk factors ndash ie the entire factor
zoo
II1 Bayesian Fama-MacBeth
This section introduces our hierarchical Bayesian Fama-MacBeth (BFM) estimation method A
formal derivation is presented in Appendix A1 To start with letrsquos consider the time-series regres-
sion We assume that the time series error terms follow an iid multivariate Gaussian distribution
(the approach at the cost of analytical solutions could be generalized to accommodate different
distributional assumptions) ie 983171 sim MVN (0TtimesN Σ otimes IT ) The likelihood of the data (RF ) is
then
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr
983147Σminus1(Rminus FB)⊤(Rminus FB)
983148983070
The time-series regression is always valid even in the presence of a spurious factor For simplicity
we choose the non-informative Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 Note that this prior
is flat in the B dimension The posterior distribution of (BΣ) is therefore
B|Σ data sim MVN983059983141BolsΣotimes (F⊤F )minus1
983060 (8)
Σ|data sim W -1983059T minusK minus 1 T Σ
983060 (9)
where 983141Bols and Σ denote the canonical OLS based estimates and W -1 is the inverse-Wishart
distribution (a multivariate generalization of the inverse-gamma distribution) From the above
factor the OLS R2 is more likely to be inflated than its GLS counterpart
8
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78
we can sample the posterior distribution of the parameters (BΣ) by first drawing the covariance
matrix Σ from the inverse-Wishart distribution conditional on the data and then drawing B from
a multivariate normal distribution conditional on the data and the draw of Σ
If the model is correctly specified in the sense that all true factors are included expected
returns of the assets should be fully explained by their risk exposure β and the prices of risk λ
ie E[Rt] = βλ But since given our mean normalisation of the factors E[Rt] = a we have the
least square estimate (β⊤β)minus1β⊤a Therefore we can define our first estimator
Definition 1 (Bayesian Fama-MacBeth (BFM)) The posterior distribution of λ conditional
on B Σ and the data is a Dirac distribution at (β⊤β)minus1β⊤a A draw (λ(j)) from the posterior
distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j) from the Normal-
inverse-Wishart in (8)-(9) and computing (β⊤(j)β(j))
minus1β⊤(j)a(j)
The posterior distribution of λ defined above accounts both for the uncertainty about the
expected returns (via the sampling of a) and the uncertainty about the factor loadings (via the
sampling of β) Note that differently from the frequentist case in equation (5) there is no ldquoextra
termrdquo (1 + λ⊤fΣ
minus1f λf ) to account for the fact that βf is estimated The reason being that it
is unnecessary to explicitly adjust standard errors of λ in the Bayesian approach since we keep
updating βf in each simulation step automatically incorporating the uncertainty about βf into the
posterior distribution of λ Furthermore it is quite intuitive from the above definition of the BFM
estimator why we expect posterior inference to detect weak and spurious factors in finite sample
For such factors the near singularity of (β⊤(j)β(j))
minus1 will cause the draws for λ(j) to diverge as in
the frequentist case Nevertheless the posterior uncertainty about factor loadings and risk premia
will cause β⊤(j)a(j) to flip sign across draws causing the posterior distribution of λ to put substantial
probability mass on both values above and below zero Hence centered posterior credible intervals
will tend to include zero with high probability
In addition to the price of risk λ we are also interested in estimating the cross-sectional fit of
the model ie the cross-sectional R2 Once we obtain the posterior draws of the parameters we
can easily obtain the posterior distribution of the cross-sectional R2 defined as
R2ols = 1minus (aminus βλ)⊤(aminus βλ)
(aminus a1N )⊤(aminus a1N ) (10)
where a = 1N
983123Ni ai That is for each posterior draw of (a β λ) we can construct the corre-
sponding draw for the R2 from equation (10) hence tracing out its posterior distribution We can
think of equation (10) as the population R2 where a β and λ are unknown After observing
the data we infer the posterior distribution of a β and λ and from these we can recover the
distribution of the R2
However realistically the models are rarely true Therefore one might want to allow for the
presence of pricing errors α in the cross-sectional regression6 This can be easily accommodated
6As we will show in the next section this is essential for model selection
9
within our Bayesian framework since in this case the data generating process in the second stage
becomes a = βλ + α If we further assume that pricing error αi follows an independent and
identical normal distribution N (0σ2) the likelihood function in the second step becomes7
p(data|λσ2β) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070 (11)
In the cross-sectional regression the ldquodatardquo are the expected risk premia a and the factor loadings
β albeit these quantities are not directly observable to the researcher Hence in the above we
are conditioning on the knowledge of these quantities which can be sampled from the first step
Normal-inverse-Wishart posterior distribution (8)-(9) Conceptually this is not very different from
the Bayesian modeling of latent variables In the benchmark case we assume a Jeffreysrsquo diffuse
prior8 for (λσ2) π(λσ2) prop σminus2 In Appendix A1 we show that the posterior distribution of
(λσ2) is then
λ|σ2BΣ data sim N
983091
983109983107(β⊤β)minus1β⊤a983167 983166983165 983168983141λ
σ2(β⊤β)minus1
983167 983166983165 983168Σλ
983092
983110983108 (12)
σ2|BΣ data sim Γminus1
983075N minusK minus 1
2(aminus βλ)⊤(aminus βλ)
2
983076 (13)
where Γminus1 denotes the inverse-gamma distribution The conditional distribution in equation (12)
makes it clear that the posterior takes into account both the uncertainty about the market price
of risk stemming from the first stage uncertainty about the β and a (that are drawn from the
Normal-inverse-Wishart posterior in equations (8)-(9)) and the random pricing errors α that have
the conditional posterior variance in equation (13) If test assetsrsquo expected excess returns are fully
explained by β there are no pricing errors and σ2(β⊤β)minus1 converges to zero otherwise this layer
of uncertainty always exists
Note also that we can think of the posterior distribution of (β⊤β)minus1β⊤a as a Bayesian decision
makerrsquos belief about the dispersion of the Fama-MacBeth OLS estimates after observing the data
RtftTt=1 Alternatively when pricing errors α are assumed to be zero under the null hypothesis
the posterior distribution of λ in equation (12) collapses to a degenerate distribution where λ
equals (β⊤β)minus1β⊤a with probability one
Often the cross sectional step of the Fama-MacBeth estimation is performed via GLS rather than
least squares In our setting under the null of the model this leads to λ = (β⊤Σminus1β)minus1β⊤Σminus1a
Therefore we define the following GLS estimator
Definition 2 (Bayesian Fama-MacBeth GLS (BFM-GLS)) The posterior distribution of λ
conditional on B Σ and the data is a Dirac distribution at (β⊤Σminus1β)minus1β⊤Σminus1a A draw (λ(j))
7We derive a formulation with non-spherical cross-sectional pricing errors that leads to a GLS type estimator inAppendix A2
8As shown in the next subsection in the presence of useless factors such prior is not appropriate for modelselection based on Bayes factors and posterior probabilities since it does not lead to proper marginal likelihoodsTherefore we introduce therein a novel prior for model selection
10
from the posterior distribution of λ conditional on the data only is obtained by drawing B(j) and Σ(j)
from the Normal-inverse-Wishart in equations (8)-(9) and computing (β⊤(j)Σ
minus1(j)β(j))
minus1β⊤(j)Σ
minus1(j)a(j)
From the posterior sampling of the parameters in the above definition we can also obtain the
posterior distribution of the cross-sectional GLS R2 defined as
R2gls = 1minus
(aminus βλgls)⊤Σminus1(aminus βλgls)
(aminus a1N )⊤Σminus1(aminus a1N ) (14)
Once again we can think of equation (14) as the population GLS R2 that is a function of the
unknown quantities a β and λ But after observing the data we infer the posterior distribution
of the parameters and from these we recover the posterior distribution of the R2gls
Remark 1 (Generated factors) Often factors are estimated as eg in the case of principal
components (PCs) and factor mimicking portfolios (albeit the latter is not needed in our setting)
This generates an additional layer of uncertainty normally ignored in empirical analysis due to
the associated asymptotic complexities Nevertheless it is relatively easy to adjust the Bayesian
estimators of risk premia to account for this uncertainty In the case of a mimicking portfolio
under a diffuse prior and Normal errors the posterior distribution of the portfolio weights follow the
standard Normal-inverse-Gamma of Gaussian linear regression models (see eg Lancaster (2004))
Similarly in the case of principal components as factors under a diffuse prior the covariance
matrix from which the PCs are constructed follow an inverse-Wishart distribution9 Hence the
posterior distributions in Definitions 1 and 2 can account for the generated factors uncertainty by
first drawing from an inverse-Wishart the covariance matrix from which PCs are constructed or
the Normal-inverse-Gamma posterior of the mimicking portfolios coefficients and then sampling
the remaining parameters as explained in the definitions
II2 Model selection
In the previous subsection we have derived simple Bayesian estimators that deliver in finite sam-
ple credible intervals robust to the presence of spurious factors and avoid over-rejecting the null
hypothesis of zero risk premia for such factors
However given the plethora of risk factors that have been proposed in the literature a robust
approach for models selection across non-necessarily nested models and that can handle poten-
tially a very large number of possible models as well as both traded and non-traded factors is
of paramount importance for empirical asset pricing The canonical way of selecting models and
testing hypothesis within the Bayesian framework is through Bayesrsquo factors and posterior prob-
abilities and that is the approach we present in this section This is for instance the approach
suggested by Barillas and Shanken (2018) The key elements of novelty of the proposed method
are that i) our procedure is robust to the presence of spurious and weak factors ii) it is directly
9Based on these two observations Allena (2019) proposes a generalisation of Barillas and Shanken (2018) modelcomparison approach for these type of factors
11
applicable to both traded and non-traded factors and iii) it selects models based on their cross-
sectional performance (rather than the time series one) ie on the basis of the risk premia that the
factors command
In this subsection we show first that flat priors for risk premia (as the Jeffreysrsquo priors used
in section II1 for illustrative purposes) are not suitable for model selection in the presence of
spurious factors Given the close analogy between frequentist testing and Bayesian inference with
flat priors this is not too surprising But the novel insight is that the problem arises exactly
because of the use of flat priors and can therefore be fixed by using non-flat yet non-informative
priors Second we introduce ldquospike-and-slabrdquo priors that are robust to the presence of spurious
factors and particularly powerful in high-dimensional model selection ie when one wants as in
our empirical application to test all factors in the zoo
II21 Pitfalls of Flat Priors for Risk Premia
We start this section by discussing why flat priors for risk premia as the Jeffreysrsquo prior are not
desirable in model selection Since we want to focus and select models based on the cross-sectional
asset pricing properties of the factors for simplicity we retain Jeffreysrsquo priors for the time series
parameter (aβf Σ) of the first-step regression
In order to perform model selection we relax the (null) hypothesis that models are correctly
specified and allow instead for the presence of cross-sectional pricing errors That is we consider
the cross-sectional regression a = βλ + α For illustrative purposes we focus on spherical errors
but all the results in this and the following subsections can be generalized to the non-spherical error
setting in Appendix A2
Similar to many Bayesian variable selection problems we introduce a vector of binary latent
variables γ⊤ = (γ0 γ1 γK) where γj isin 0 1 When γj = 1 it indicates that the factor j
(with associated loadings βj) should be included into the model and vice versa The number of
included factors is simply given by pγ =983123K
j=0 γj Note that we do not shrink the intercept so
γ0 is always equal to 1 (as the common intercept plays the role of the first ldquofactorrdquo) The notation
βγ = [βj ]γj=1 represents a pγ-columns sub-matrix of β
When testing whether the risk premium of factor j is zero the null hypothesis is H0 λj = 0
In our notation this null hypothesis can be expressed as H0 γj = 0 while the alternative is
H1 γj = 1 This is a small but important difference relative to the canonical frequentist testing
approach for useless factors the risk premium is not identified hence testing whether it is equal to
any given value is per se problematic Nevertheless as we show in the next section with appropriate
priors whether a factor should be included or not is a well-defined question even in the presence
of useless factors
In the Bayesian framework the prior distribution of parameters under the alternative hypothesis
should be carefully specified Generally speaking the priors for nuisance parameters such as β σ2
and Σ do not greatly influence the cross-sectional inference But as we are about to show this is
not the case for the priors about risk premia
12
Recall that when considering multiple models say wlog model γ and model γ prime by Bayesrsquo
theorem we have that the posterior probability of model γ is
Pr(γ|data) = p(data|γ)p(data|γ) + p(data|γ prime)
where we have given equal prior probability to each model and p(data|γ) denotes the marginal
likelihood of the model indexed by γ In Appendix A3 we show that when using a Jeffreysrsquo prior
(that is flat for λ) the marginal likelihood is
p(data|γ) prop (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ983059Nminuspγ+1
2
983060
983059N σ2
γ
2
983060Nminuspγ+1
2
(15)
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
Therefore if model γ includes a useless factor (whose β asymptotically converges to zero) the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero sending the marginal likelihood
in (15) to infinity As a result the posterior probability of the model containing the spurious factor
go to one Consequently under a flat prior for risk premia the model containing a useless factor
will always be selected asymptotically However the posterior distribution of λ for the spurious
factor is robust and particularly disperse in any finite sample
Moreover it is highly likely that conclusions based on the posterior coverage of λ contradict
those arising from Bayesrsquo factors When the prior distribution of λj is too diffuse under the
alternative hypothesis H1 the Bayesrsquo factor tends to favor H0 over H1 even though the estimate
of λj is far from 0 The reason is that even though H0 seems quite unlikely based on posterior
coverages the data is even more unlikely under H1 than under H0 Therefore a disperse prior for
λj may push the posterior probabilities to favor H0 and make it fail to identity true factors This
phenomenon is the so called ldquoBartlett Paradoxrdquo (see Bartlett (1957))
Note also that flat hence improper priors for the risk premia are not legitimate since they render
the posterior model probabilities arbitrary Suppose that we are testing the null H0 λj = 0 Under
the null hypothesis the prior for (λσ2) is λj = 0 and π(λminusj σ2) prop 1
σ2 However the prior under the
alternative hypothesis is π(λj λminusj σ2) prop 1
σ2 Since the marginal likelihoods of data p(data|H0)
and p(data|H1) are both undetermined we cannot define the Bayesrsquo factor p(data|H1)p(data|H0)
(see eg
Cremers (2002) Chib Zeng and Zhao (forthcoming)) In contrast for nuisance parameters such
as σ2 we can continue to assign improper priors Since both hypotheses H0 and H1 include σ2
the prior for it will be offset in the Bayesrsquo factor and in the posterior probabilities Therefore we
can only assign improper priors for common parameters10 Similarly we can still assign improper
priors for β and Σ in the first time series step
The final reason why it might be undesirable to use Jeffreysrsquo prior in the second step is that it
does not impose any shrinkage on the parameters This is problematic given the large number of
10See Kass and Raftery (1995) (and also Cremers (2002)) for more detailed discussion
13
members of the factor zoo while we have only limited time-series observations of both factors and
test asset returns
In the next subsection we propose an appropriate prior for risk premia that is both robust to
spurious factors and can be used for model selection even when dealing with a very large number
of potential models
II22 Spike and Slab Prior for Risk Premia
In order to make sure that the integration of the marginal likelihood is well-behaved we propose
a novel prior specification for the factorsrsquo risk premia λ⊤f = (λ1 λK)11 Since the inference in
time-series regression is always valid we only modify the priors of the cross-sectional regression
parameters
The prior that we propose belongs to the so-called ldquospike-and-slabrdquo family For exemplifying
purposes in this section we introduce a Dirac spike so that we can easily illustrate its implications
for model selection In the next subsection we generalize the approach to a ldquocontinuous spikerdquo
prior and study its finite sample performance in our simulation setup
In particular we model the uncertainty underlying the model selection problem with a mixture
prior π(λσ2γ) prop π(λ|σ2γ)π(σ2)π(γ) for the risk premium of the j-th factor When γj = 1
and hence the factor should be included in the model the prior follows a normal distribution given
by λj |σ2 γj = 1 sim N (0σ2ψj) where ψj is a quantity that we will be defining below When instead
γj = 0 and the corresponding risk factor should not be included in the model the prior is a Dirac
distribution at zero For the cross-sectional variance of the pricing errors we keep the same prior
that would arise with Jeffreysrsquo approach π(σ2) prop σminus2
Let D denote a diagonal matrix with elements cψminus11 middot middot middot ψminus1
K and Dγ the sub-matrix of D
corresponding to model γ We can then express the prior for the risk factors λγ of model γ as
λγ |σ2γ sim N (0σ2Dminus1γ )
Note that c is a small positive number since we do not shrink the common intercept λc of the
cross-sectional regression
Given the above prior specification we sample the posterior distribution by sequentially drawing
from the conditional distributions of the parameters (ie we use a Gibbs sampling algorithm) The
crucial steps in addition to the sampling of the times series parameters from the posteriors in
equations (8)-(9) are as follows
Sampling λγ
Note that
11We do not shrink the intercept λc
14
p(λ|dataσ2γ) prop p(data|λσ2γ)π(λ|σ2γ)
prop (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 exp
983069minus 1
2σ2[(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ ]
983070
= (2π)minuspγ2 |Dγ |
12 (σ2)minus
N+pγ2 e
983069minus
(λγminusλγ )⊤(β⊤γ βγ+Dγ )(λγminusλγ )
2σ2
983070
e
983153minusSSRγ
2σ2
983154
where SSRγ = a⊤aminus a⊤βγ(β⊤γ βγ +Dγ)
minus1β⊤γ a = minλγ(aminus βγλγ)
⊤(aminus βγλγ) + λ⊤γDγλγ
Note that SSRγ is the minimized sum of squared errors with generalised ridge regression penalty
term λ⊤γDγλγ That is our prior modelling is analogous to introducing a Tikhonov-Phillips
regularisation (see Tikhonov Goncharsky Stepanov and Yagola (1995) Phillips (1962)) in the
cross-sectional regression step and has the same rationale delivering a well defined marginal
likelihood in the presence of rank deficiency (that in our settings arises in the presence of useless
factors) However in our setting the shrinkage applied to the factors is heterogeneous since we
rely on the partial correlation between factors and test assets to set ψj as
ψj = ψ times ρ⊤j ρj (16)
where ρj is an N times 1 vector of correlation coefficients between factor j and the test assets and
ψ isin R+ is a tuning parameter which controls the shrinkage over all the factors12 When the
correlation between fjt and Rt is very low as in the case of a useless factor the penalty for λj
which is the reciprocal of ψρ⊤j ρj is very large and dominates the sum of squared errors
Let λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a and σ2(λγ) = σ2(β⊤
γ βγ +Dγ)minus1 the posterior distribution of
λγ is
λγ |dataσ2γ sim N (λγ σ2(λγ))
The above equation makes it clear why this Bayesian formulation is robust to spurious factors
When β converges to zero (β⊤γ βγ +Dγ) is dominated by Dγ so the identification condition for
the risk premia no longer fails When a factor is spurious its correlation with test assets converges
to zero hence the penalty for this factor ψminus1j goes to infinity As a result the posterior mean
of λγ λγ = (β⊤γ βγ +Dγ)
minus1β⊤γ a is shrunk towards zero and the posterior variance term σ2(λ)
approaches σ2Dminus1γ Consequently the posterior distribution of λ for a spurious factor is nearly the
same as its prior In contrast for a normal factor that has non-zero covariance with test assets the
information contained in β dominates the prior information since in this case the absolute size of
Dγ is small relative to β⊤γ βγ
Using our priors and integrating out λ yields
p(data|σ2γ) =
983133p(data|λσ2γ)π(λ|σ2γ)dλ prop (σ2)minus
N2
|Dγ |12
|β⊤γ βγ +Dγ |
12
exp
983069minusSSRγ
2σ2
983070
12Alternatively we could have set ψj = ψ times β⊤j βj where βj is an N times 1 vector However ρj has the advantage
of being invariant to the units in which the factors are measured
15
Sampling σ2
From the Bayesrsquo theorem we have that the posterior of σ2 given by
p(σ2|dataγ) prop p(data|σ2γ)π(σ2) prop (σ2)minusN2minus1 exp
983069minusSSRγ
2σ2
983070
hence the posterior distribution of σ2 is an inverse-Gamma σ2|dataγ sim Γminus1983059N2
SSRγ
2
983060
Finally we obtain the marginal likelihood of the data in model γ by integrating out σ2
p(data|γ) =983133
p(data|σ2γ)π(σ2)dσ2 prop |Dγ |12
|β⊤γ βγ +Dγ |
12
1
(SSRγ2)N2
When comparing two models using posterior model probabilities is equivalent to simply using the
ratio of the marginal likelihoods ie the Bayesrsquo factor defined as
BFγγprime = p(data|γ)p(data|γ prime)
where we have given equal prior probability to model γ and model γprime
Remark 2 (Bayes Factor) Consider two nested linear factor models γ and γ prime The only dif-
ference between γ and γ prime is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a
(K minus 1) times 1 vector of model index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) where without
loss of generality we have assumed that the factor p is ordered last The Bayesrsquo factor is then
BFγγprime =
983061SSRγprime
SSRγ
983062N2 983059
1 + ψpβ⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp
983060minus 12 (17)
The above result is proved in Appendix A4
Since β⊤p [IN minus βγprime(β⊤
γprimeβγprime + Dγprime)minus1β⊤γprime ]βp is always positive ψp plays an important role in
variable selection For a strong and useful factor that can substantially reduce pricing errors the
first term in equation (17) dominates and the Bayesrsquo factor will be much greater than 1 hence
providing evidence in favour of model γ
Remember that SSRγ = minλγ(a minus βγλγ)⊤(a minus βγλγ) + λ⊤
γDγλγ hence we always have
SSRγ le SSRγprime in sample There are two effects of increasing ψp i) when ψp is large the penalty
for λp is small hence it is easier to minimise SSRγ and SSRγprimeSSRγ becomes much larger than
1 ii) large ψp decreases the second term in equation (17) lowering the Bayesrsquo factor and acting as
a penalty for dimensionality
A particularly interesting case is when the factor is useless βp converges to zero but the
penalty term 1ψp prop 1ρ⊤pρp goes to infinity On the one hand the first term in equation (17) will
converge to 1 on the other hand since ψp asymp 0 in large sample the second term in equation (17)
will also be around 1 Therefore the Bayesrsquo factor for a useless factor will go to 1 asymptotically13
13But in finite sample it may deviate from its asymptotic value so we should not use 1 as a threshold when testingthe null hypothesis H0 γp = 0
16
In contrast a useful factor should be able to greatly reduce the sum of squared errors SSRγ so
the Bayesrsquo factor will be dominated by SSRγ yielding a value substantially above 1
Remark 3 (Level Factors) Identification failure of factorsrsquo risk premia can arise in the presence
of lsquolevel factorsrsquo exposure to which is non-zero but lacks cross-sectional spread ie βj rarr c1N with
c ∕= 0 These factors help explain the average level of returns but not the their cross-sectional
dispersion and hence are collinear with the common cross-sectional intercept Our approach can
handle this case by using variance standardised variables in the estimation and replacing the penalty
in (16) with ψj = ψtimes983145ρj⊤983145ρj where 983145ρj is the cross-sectionally demeaned vector of correlations with
asset returns ie 983145ρj = ρj minus983059
1N
983123Ni=1 ρji
983060times 1N
II23 Continuous Spike
We extend the Dirac spike-and-slab prior by encoding a continuous spike for λj when γj equals
0 Following the literature on Bayesian variable selection (see eg George and McCulloch (1993)
George and McCulloch (1997) and Ishwaran Rao et al (2005)) we model the uncertainty under-
lying model selection with a mixture prior π(λσ2γω) = π(λ|σ2γ)π(σ2)π(γ|ω)π(ω) which is
specified as following
λj |γj σ2 sim N (0 r(γj)ψjσ2)
Note the introduction of a new element r(γj) in the prior and where r(1) = 1 and r(0) = r As
we explain below the additional parameter vector ω encodes our prior beliefs about the sparsity
of the true model
Redefine D as a diagonal matrix with elements c (r(γ1)ψ1)minus1 (r(γK)ψK)minus1 where ψj is
given as before by equation (16) In matrix notation the prior for λ is λ|σ2γ sim N (0σ2Dminus1)
The term r(γj)ψj in Dminus1 is set to be small or large depending on whether γj = 0 or γj = 1 In the
empirical implementation we force r to be much less than 1 since we intend to shrink λj towards
zero when γj is 014 Hence the spike component concentrates the mass of λ towards zero whereas
the slab component allows λ to take values over a much wider range Therefore the posterior
distribution of λ is very similar to the case of a Dirac spike in section II22
We use Gibbs sampling (ie sequential sampling from conditional distributions) to draw the
posterior distribution of the parameters (λγωσ2) where as explained below ω encodes our
prior beliefs about the sparsity of the true model
Sampling λγ
Combining the likelihood and the prior for λ we have
p(λ|dataσ2γ) prop p(data|λσ2γ)p(λ|σ2γ) prop exp
983069minus 1
2σ2
983147λ⊤(β⊤β +D)λminus 2λ⊤β⊤a
983148983070
Therefore defining λ = (β⊤β+D)minus1β⊤a and σ2(λ) = σ2(β⊤β+D)minus1 the posterior distribution
14We can set r = 00001 In our framework r is essentially a tune parameter hence we need to choose a reasonablevalue such that we can identify useful factor but exclude spurious ones
17
of λ can be expressed as λ|dataσ2γω sim N (λ σ2(λ))
Sampling γjKj=1
Even though the prior on model index γ could be simply set to be π(γ) = 12K we interpret
π(γj = 1|ωj) = ωj as our prior belief about the sparsity of the true model As in the literature on
predictors selection we assign the following prior distribution to (γω)
π(γj = 1|ωj) = ωj ωj sim Beta(aω bω)
Different hyper-parameters aω and bω determine whether we favor more parsimonious models or
not15
Given a ωj the Bayes factor for the j-th risk then factor is
p(γj = 1|dataλωσ2γminusj)
p(γj = 0|dataλωσ2γminusj)=
ωj
1minus ωj
p(λj |γj = 1σ2)
p(λj |γj = 0σ2)
If we had instead imposed ωj = 05 as in section II22 the Bayesrsquo factor would be fully determined
byp(λj |γj=1σ2)p(λj |γj=0σ2)
Sampling ω
From Bayesrsquo theorem we have
p(ωj |dataλγσ2) prop π(ωj)π(γj |ωj) prop ωγjj (1minus ωj)
1minusγjωaωminus1j (1minus ωj)
bωminus1
prop ωγj+aωminus1j (1minus ωj)
1minusγj+bωminus1
Therefore the posterior distribution of ωj is ωj |dataλγσ2 sim Beta(γj + aω 1minus γj + bω)
Sampling σ2
Finally
p(σ2|dataωλγ) prop (σ2)minusN+K+1
2minus1 exp
983069minus 1
2σ2[(aminus βλ)⊤(aminus βλ) + λ⊤Dλ]
983070
Hence the posterior distribution of σ2 is
σ2|dataωλγ sim Γminus1
983061N +K + 1
2(aminus βλ)⊤(aminus βλ) + λ⊤Dλ
2
983062
Note that when ωj is a constant 05 and r converges to 0 the continuous slab-and-spike prior
is equivalent to the one with a Dirac spike in section II22 However the MCMC algorithm of
the continuous spike setting is particularly useful in the high dimensional case Imagine that there
are 30 candidate factors in the factor zoo In the Dirac spike-and-slab prior case we have to
15We set aω = bω = 2 in the benchmark case However we could assign aω = 1 and bω = 2 in order to select asparser model
18
calculate the posterior model probabilities for 230 different models Given that we update (aβ)
in each sampling round posterior probabilities for all models are necessarily re-computed for every
new draw of these quantities rendering the computational cost very large In contrast using our
continuous spike approach we can simply use the posterior mean of γj to approximate the posterior
marginal probability of the j-th factor
III Simulation
We build a simple setting for a linear factor model that includes both strong and irrelevant factors
and allows for potential model misspecification
The cross-section of asset returns mimics empirical properties of 25 Fama-French portfolios
sorted by size and value We generate both factors and test asset returns from normal distributions
assuming that HML is the only useful factor A misspecified model also includes pricing errors from
the two-step Fama-MacBeth procedure which makes the vector of simulated expected returns equal
to their sample mean estimates of 25 Fama-French portfolios Finally a spurious factor is simulated
from an independent normal distribution with mean zero and standard deviation 1
ftuselessiidsim N (0 (1)2)
ftHMLiidsim N (rHML σ
2HML)
ftHML = ftHML minus macrftHML
Rt|ftHMLiidsim
983099983103
983101N (λc1N + β
983059λHML + ftHML
983060 Σ) if the model is correct
N (R+ βftHML Σ) if the model is misspecified
where factor loadings risk premia and variance-covariance matrix of returns are equal to their
OLS sample estimates from time series and two-pass Fama-MacBeth regressions of 25 size and
value portfolios on HML All the model parameters are estimated on monthly data from July 1963
to December 2017
To better illustrate the properties of the frequentist and bayesian approaches we consider 3
estimation setups
(a) the model includes only a strong factor (HML)
(b) the model includes only a useless factor
(c) the model includes both strong and useless factors
Each setting can be correctly or incorrectly specified with the following sample sizes T = 100
200 600 1000 and 20000 We compare the performance of the OLSGLS standard frequentist
and Bayesian Fama-MacBeth estimators (FM and BFM correspondingly) with the focus on risk
premia recovery testing and identification of strong and useless factors for model comparison
19
III1 Estimating risk premia via Bayesian Fama-MacBeth
Since it is unlikely that in most empirical settings a linear factor model is correctly specified we
focus our discussion on the case that allows for model misspecification We report similar simulation
results for the case of correct model specification in Appendix C
Table 1 Tests of risk premia in a misspecified model with a strong factor
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0107 0052 0007 0115 0076 0013 -422 4530200 0094 0067 0006 0118 0072 0015 -303 5621
FM 600 0097 0053 0006 0098 0047 0011 389 55751000 0088 0048 0012 0103 0060 0012 865 532820000 0109 0056 0010 0110 0060 0008 2486 3655
100 0063 0026 0006 0032 0013 0001 -356 3609200 0086 0041 0012 0079 0039 0008 -367 4689
BFM 600 0087 0043 0013 0087 0047 0009 -067 52611000 0095 0046 0009 0093 0046 0009 440 534620000 0096 0052 0012 0100 0052 0012 2390 3601
Panel B GLS
100 0246 0173 0067 0251 0154 0070 1720 7464200 0165 0107 0031 0171 0105 0035 5337 8045
FM 600 0137 0076 0020 0137 0073 0016 6942 83871000 0131 0072 0019 0132 0072 0015 7341 841620000 0122 0074 0014 0113 0061 0015 8034 8283
100 0139 0083 0022 0143 0083 0024 3127 6815200 0131 0072 0014 0121 0069 0019 4858 7299
BFM 600 0114 0061 0013 0126 0067 0018 6594 80091000 0100 0048 0009 0106 0058 0008 7063 814920000 0092 0050 0012 0100 0048 0012 8022 8267
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor The true value of the cross-sectional R2
adj is 3055(8175) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervals and their size for BFMestimates are constructed using posterior coverage of Fama-MacBeth estimates of λ The last two columns report the5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimatesfor FM and its posterior mode for BFM
Table 1 presents the size of the tests for risk premia and confidence intervals for cross-sectional
R2 in probably the most relevant (and best-case) scenario for empirical applications It compares
the performance of frequentist and Bayesian Fama-MacBeth estimators for the case when the model
is misspecified and includes a single cross-sectional factor that is priced and strongly identified
proxied by HML Since the model is misspecified cross-sectional R2 never reaches 100 even for
T = 20 000 with the population value of 31 (82) for OLS (GLS) As expected both FM and
BFM estimators are very similar to each other and provide confidence intervals of correct size In
case of the standard FM approach they are constructed using standard t-statistics adjusted for
Shanken correction and in case of the BFM we rely on the quantiles of the posterior distribution
to form the credible confidence intervals for parameters The last two columns also report the
quantiles of the mode of the posterior distribution of R2 across the simulations
20
Table 2 summarizes risk premia estimation for the same cross-section of 25 expected returns
but using a useless (spurious) factor as a candidate cross-sectional factor As expected the standard
Fama-MacBeth estimator fails to recognize the rank failure in the second stage and conventional
risk premia estimates and t-statistics are inflated Indeed it is widely known since Kan and Zhang
(1999) that if the model is misspecified then t-statistics of the spurious factors tend to infinity
asymptotically This is confirmed in Panels A and B for T = 20 000 the probability of finding a
useless factor to have a t-stat above 196 is almost 50 for FM-OLS and over 80 for FM-GLS
Furthermore cross-sectional measures of fit such as R2 and related quantities are substantially
inflated even though itrsquos true value is 0 for both OLS and GLS settings in-sample estimates
produced by the frequentist approach not only have a much wider empirical support (for example
from -4 to over 40 for FM-OLS) but also uncertainty that does not decrease with the sample
size
Table 2 Tests of risk premia in a misspecified model with a useless factor
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0072 0037 0010 0055 0011 0001 -429 4184200 0078 0043 0005 0084 0021 0001 -418 4524
FM 600 0089 0043 0014 0223 0116 0013 -428 43701000 0093 0052 0014 0333 0187 0027 -429 450120000 0213 0135 0067 0698 0488 0172 -427 4363
100 0035 0011 0001 0001 0000 0000 -247 -036200 0041 0013 0001 0008 0002 0000 -261 -016
BFM 600 0071 003 0003 0031 0006 0002 -287 0191000 0047 002 0001 0039 0017 0002 -298 08420000 0034 0013 0000 0091 0043 0012 -336 1140
Panel B GLS
100 0238 0144 0066 0305 0199 0062 -347 3857200 0152 0091 0028 0292 0189 0067 -375 1953
FM 600 0126 0066 0017 0407 0314 0148 -385 16431000 0117 0063 0013 0510 0401 0239 -405 156920000 0104 0039 0005 0864 0847 0768 -311 1333
100 0128 0070 0019 0047 0018 0002 -218 1154200 0107 0060 0014 0034 0011 0000 -275 891
BFM 600 0093 0046 0008 0042 0012 0001 -325 6621000 0083 0031 0004 0061 0028 0004 -339 51920000 0023 0006 0000 0099 0049 0011 -304 138
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor The true value of the cross-sectional R2 is zero
The Bayesian approach to Fama-MacBeth regressions successfully overcomes the hurdle of use-
less factors As Table 2 demonstrates both BFM-OLS and BFM-GLS are able to identify the spu-
rious factor with the posterior distribution providing credible confidence bounds with the proper
control for the size of the test (eg for T = 20 000 the probability to reject the null of no risk
premia attached to a useless factor when using a test with the nominal size of 10 is 91) Rec-
ognizing a useless factor cross-sectional measures of fit are more conservative and overall tighter
Why does the bayesian approach work when the frequentist fails The argument is probably
best summarized by Figure 1 that plots a posterior distribution of λuseless for BFM from one of
21
minus4 minus2 0 2 4
00
02
04
06
08
λuseless
Density
BFMminusOLSFMminusOLS
Figure 1 Risk premia estimates of a useless factor
The graph presents the posterior distribution (blue line) of λuseless from the BFM-OLS estimation in a misspecifiedmodel with a useless factor based on a single simulation with T = 1 000 The red line depicts the asymptoticdistribution of Fama-MacBeth estimate of risk premium under the normal approximation
the simulations along with the pseudo-true value of risk premium defined as 0 in this case In this
particular simulation Fama-MacBeth OLS estimate of λuseless is -119 with Shanken-corrected
t-statistics equal to -255 so according to traditional hypothesis testing we would reject the null
of λuseless = 0 even at 1 The posterior distribution of BFM estimates of risk premium (the
blue line in Figure 1) behaves rather differently it is centered around 0 and overall more spread
out with a confidence interval (minus1603 1201) Intuitively the main driving force behind it
is the fact that in BFM β is updated continuously when β is close to zero the posterior draws
of β will be positive or negative randomly which implies that the conditional expectation of λ in
Equation 12 will also flip sign depending on the draw As a result the posterior distribution of
λuseless is centered around 0 and so is the confidence interval The same logic applies to the case
of BFM-GLS
Finally Table 3 combines the insights for true and irrelevant factors and presents the simulation
results for the most realistic model setup that includes both useless and strong factors As expected
in the conventional case of frequentist Fama-MacBeth estimation the useless factor is often found
to be a significant predictor of the asset returns its OLS (GLS) t-statistic would be above a
5-critical value in over 60 (80) of the simulations On the contrary the bayesian confidence
intervals have the right coverage and reject the null of no risk premia attached to the spurious
factor with frequency asymptotically approaching the size of the tests
The crowding out of the true factors by the useless ones could also be an important empirical
concern When the model is misspecified the presence of spurious factors can also bias the risk
premia estimates for the strong ones and often leads to their crowding out of the model Panel
A in Table 3 serves as a good illustration of this possibility with risk premia estimates for the
22
Table 3 Tests of risk premia in a misspecified model with useless and strong factors
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0082 0039 0008 0121 0067 0016 0099 0023 0001 -513 5663200 0096 0044 0005 0157 0100 0034 0129 0039 0005 127 6190
FM 600 0093 0034 0014 0212 0147 0071 0264 0129 0022 840 61781000 0102 0046 0010 0261 0194 0098 0380 0199 0056 1184 624820000 0114 0054 0009 0289 0229 0152 0848 0633 0240 2507 6076
100 0035 0012 0001 0028 0007 0001 0004 0001 0000 -211 4033200 0049 0017 0001 0067 0031 0004 0011 0003 0000 -175 4828
BFM 600 005 0018 0004 0099 0047 0005 0047 0014 0002 1020 55721000 0041 0021 0003 0102 0048 0011 0071 0035 0004 1487 569520000 0017 0007 0000 0087 0033 0007 0099 0055 0012 2480 5466
Panel B GLS
100 0219 0155 0057 0224 0135 0066 0303 0198 0064 1911 7775200 0155 0092 0028 0149 0090 0024 0263 0183 0061 5537 8171
FM 600 0121 0068 0015 0116 0064 0016 0391 0293 0134 6948 84331000 0115 0061 0013 0115 0057 0012 0487 0387 0216 7305 847420000 0084 0050 0009 0100 0041 0005 0864 0836 0757 7979 8424
100 0122 0069 0016 0129 0070 0017 0046 0017 0002 3243 6869200 0112 0056 0012 0099 0048 0012 0031 0012 0000 4844 7355
BFM 600 0096 0049 0011 0086 0045 0009 0049 0016 0002 6576 80301000 0081 0036 0007 0073 0032 0006 0058 0030 0003 7064 815420000 0027 0005 0000 0022 0007 0000 0098 0047 0013 7974 8259
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value of the
cross-sectional R2adj is 3055 (8175) for the OLS (GLS) estimation
strong factor are clearly biased in the frequentist estimation by the identification failure in case of
the frequentist approach Again in this case BFM provides reliable albeit conservative confidence
bounds for model parameters
III11 Evaluating cross-sectional fit
In addition to risk premia estimates it is often useful to understand the quality of cross-sectional
fit of the model Indeed the increase in cross-sectional R2 is often interpreted as measuring the
economic importance of the predictor contrary to the statistical one implied by the risk premia
significance It is well-known however that the average values of R2 are not always informative
about the true model performance its sample distribution often suffers from a large estimation un-
certainty (see eg Stock (1991) and Lewellen Nagel and Shanken (2010)) ad has a non-standard
distribution when the matrix of β has reduced rank (Kleibergen and Zhan (2015) Gospodinov
Kan and Robotti (2019)) In this section we further investigate the properties of cross-sectional
R2 in the frequentist and Bayesian Fama-MacBeth regressions
Figure 2 shows the distribution of cross-sectional OLS R2 across a large number of simulations
for the asymptotic case of T = 20 000 and a misspecified process for returns As Panel (a)
illustrates if the model is strongly identified the distribution of posterior mode of R2 for BFM
tends to coincide with that of conventional Fama-MacBeth procedure as expected The major
23
difference emerges whenever a useless factor is included into the candidate set of variables Indeed
it is well-known that in this case the distribution of conventional measures of fit is nonstandard
and often inflated (Kleibergen and Zhan (2015)) This is further confirmed in Figures 2 b) and
c) that show that under the presence of spurious factors conventional Fama-MacBeth R2 has an
extremely spreadout right tail of the distribution which makes it easy to find a substantial increase
of fit whenever the model is simply not identified This unfortunate property of the frequentist
approach is not shared by the inference with BFM Indeed the mode of the posterior distribution
of R2 is generally tightly centered around the true values The slight bump to the right tail of the
distribution comes from the fact that whenever a spurious factor is included into the model with a
small probability (based on t-statistic cut-off this is equal to the size of the test see eg Table 3
Panel A) its fit will be similar to that of the frequentist estimation
00 02 04 06 08 10
02
46
810
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(a) Model with a strong factor
00 02 04 06 08
010
2030
40
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(b) Model with a useless factor
00 02 04 06 08 10
02
46
8
Radj2
Dens
ity
Posterior ModeFMminusOLSTrue Radj
2
(c) Model with strong and useless factors
Figure 2 Cross-sectional distribution of R2adj in models with strong and useless factors
Figures (a)-(c) show the asymptotic distribution of cross-sectional R2 under different model specifications across 1000simulations of sample size T = 20 000 Blue lines correspond to the distribution of the posterior mode for R2
adj whilegreen lines depict the pointwise sample distribution of cross-sectional fit evaluated at Fama-MacBeth risk premiaestimates The dark yellow dash line stands for the true value of R2
adj in the model
24
However the pointwise distribution of cross-sectional R2 across the simulations is only part of
the story as it does not reveal the in-sample estimation uncertainty and whether the confidence
intervals are credible in reflecting it While BFM incorporates this uncertainty directly into the
shape of its posterior distribution one needs to rely on bootstrap-like algorithms to build a similar
analogue in the frequentist case As frequentist benchmark we use the approach Lewellen Nagel
and Shanken (2010) to construct the confidence interval for R2 Details on this procedure can be
found in Appendix A5
minus05 00 05 10
00
05
10
15
20
25
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(a) Model with a useless factor
minus04 minus02 00 02 04 06 08 10
00
05
10
15
20
Radj2
Den
sity
BFM PosteriorFM Radj
2
5th95th CI of BFM5th95th CI of LNSTrue Radj
2
(b) Model with strong and useless factors
Figure 3 The estimation uncertainty of cross-sectional R2Figures (a) and (b) show the posterior density of the cross-sectional R2 (blue) in one representative simulationalong with its 90 confidence interval (red) The dark yellow line denotes the true value of R2 while the green linedepicts its in-sample Fama-MacBeth estimate with the confidence interval constructed following Lewellen Nagel andShanken (2010) (black)
Figure 3 presents the posterior distribution of cross-sectional R2 for a model that contains a
useless factor (and potentially a strong one too) and contrasts it with a frequentist value and
the confidence interval around it Consider for example Panel a) The fact that the in-sample
Fama-MacBeth estimate of cross-sectional fit (51) is substantially higher than the mode of the
posterior distribution (-2 which is close to the true value of R2adj about minus4) is not surprising
given the previous results on the pointwise distribution of the estimates What is quite interesting
however is the coverage of the confidence interval constructed via the simulation-based approach of
Lewellen Nagel and Shanken (2010) Not only does it not include true value of the cross-sectional
fit but in fact in this particular simulation it suggests that R2adj should be between 42 and
100 A similar mismatch between the seemingly high levels of cross-sectional fit produced by a
frequentist approach and their true values can also be observed in Panel b) of Figure 3 for the
case of including both strong and a useless factors
In section A6 of the Appendix we show that the performance of the Bayesian Fama-MacBeth
method is robust to the use of a larger cross-sectional dimension ndash ie the above discussed properties
25
hold in a larger N setting
III2 Bayes factors
How well do flat and spike-and-slab priors work empirically in selecting relevant and detecting
spurious factors in the cross-section of asset returns We revisit the theoretical results from Section
II1 using the same simulation design used to evaluate the estimation of risk premia
Table 4 The probability of retaining risk factors using Bayes factors
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0813 0784 0758 0722 0693 0662600 0929 0915 0896 0876 0851 08341000 0972 0963 0957 0951 0937 0924
Spike-and-Slab Prior fstrong 200 0605 0570 0538 0505 0476 0447600 0853 0830 0807 0784 0762 07351000 0954 0940 0928 0912 0896 0875
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0996 0988 0967 0919 0822600 0998 0998 0995 0988 0977 09431000 1000 1000 1000 0994 0983 0965
Spike-and-Slab Prior fuseless 200 0325 0188 0106 0057 0024 0013600 0072 0025 0006 0002 0001 00001000 0051 0015 0001 0000 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0924 0897 0874 0848 0821 0799600 0988 0985 0976 0974 0965 09581000 0998 0996 0996 0995 0992 0987
fuseless 200 0984 0960 0910 0811 0702 0584600 0999 0993 0985 0954 0913 08541000 1000 1000 0995 0986 0966 0945
Spike-and-Slab Prior fstrong 200 0627 0591 0552 0509 0474 0452600 0860 0830 0802 0783 0758 07271000 0956 0942 0927 0911 0895 0875
fuseless 200 0260 0128 0071 0031 0019 0010600 0080 0028 0010 0004 0001 00011000 0058 0013 0005 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
Consider a cross-section of 25 portfolios that is actually loading on 2 systematic sources of risk
with the econometrician potentially observing at most only one of them a strong (and priced) ft
However there is also a second candidate factor available which is orthogonal to asset returns
and essentially useless We compute Bayes factors corresponding to each of the potential sources
of risk and document the empirical probability of retaining the variable in the model across 1000
26
simulations Again we consider models that contain either strong or useless factors or a com-
bination of both and different sample sizes (T = 200 600 and 1000) In each case we run the
Gibbs sampling algorithm derived using continuous spike-and-slab prior and then approximate the
marginal probability of each factor by the posterior mean of γj The decision rule is based on a
range of critical values 55-65 such that whenever the posterior mean of γj is above a particular
threshold we retain the factor Finally we also compute the probability of retaining a factor under
Jeffreys prior which would be the standard in the literature
Table 4 summarizes our findings When only a true risk factor is included in the candidate set
(Panel A) both Jeffreyrsquos and spike-and-slab priors successfully identify it with a high probability
especially in large sample For small sample sizes however Jeffreys prior clearly has a somewhat
higher power of detecting a true risk factor This outcome is not surprising as the estimator does
not impose additional shrinkage on the risk premium The difference however vanishes for larger
sample sizes
The difference between the two priors becomes drastic whenever useless factors are included
in the model (Panels B Panels B and C in Table 4) As discussed in Section II21 since the
matrix β⊤γ βγ is nearly singular and its determinant goes to zero under a flat prior the posterior
probability of including a spurious factor in the model converges to 1 asymptotically For example
the probability of misidentifying a spurious factor as being the true source of risk is almost 1
under Jeffreys prior even for a very short sample This in turn makes the overall process of model
selection invalid
Overall we find the asymptotic behavior of the spike-and-slab prior encouraging for variable
and model selection While often somewhat conservative in very short samples it successfully
eliminates the impact of the spurious factors from the model and identifies the true sources of
risk
IV Empirical Applications
In this section we apply our Bayesian approach to a large set of factors proposed in the previous
literature First we use the Bayesian Fama-MacBeth method to analyse several notable factor
models (subsection IV1) Second we consider 51 tradable and non-tradable factors yielding more
than two quadrillion possible models and employ our spike and slab priors to compute factorsrsquo
posterior probabilities and implied risk premia (subsections IV2 and IV3) Third we compare
the performance of a (low dimensional) robust model constructed with only the factors that have
high posterior probability to the one of several notable factor models (subsection IV4) Fourth
we estimate the degree of sparsity (in terms of linear factors) of the true latent stochastic discount
factor as well as the SDF-implied maximum Sharpe ratio (subsection IV5)
27
IV1 Some notable factor models
In this section we illustrate the differences between the frequentist and Bayesian FM estimation
(both OLS and GLS) for several candidate models In particular we estimate a set of linear
factor models on the returns of the standard 25 Fama-French portfolios sorted by size and value
using frequentist and Bayesian Fama-MacBeth estimators We use monthly data over the 197001-
201712 sample for tradable factors and whenever possible nontradables For factors available
only at quarterly frequency the sample is 1952Q1-2017Q3 (whenever possible) A full description
of the data and models used can be found in Appendix B Additional empirical results for other
candidate factors and cross-sections (eg 25 Fama-French + 17 Industry portfolios) can be found
in Appendix C
Tables 5 and 6 summarize the performance of several leading factor models For the classical FM
approach we report point estimates of risk premia with their Shanken-corrected t-statistics and
the cross-sectional R2 along with its 90 confidence interval (constructed following the method-
ology of Lewellen Nagel and Shanken (2010)) For BFM we report the posterior mean of risk
premia estimates and the posterior median and mode of R2 along with the centered 90 posterior
coverage We chose to report both the median and the mode for cross-sectional fit because the of
the shape of its distribution which is often heavily skewed
Carhart (1997) 4-factor model OLS and GLS Fama-MacBeth estimates of risk premia indicate
that size value and momentum (SMB HML and UMD correspondingly) are significant drivers of
the cross-section of test assets The market factor does not command a significant risk premium
which is a typical finding for this model Cross-sectional fit seems to be high with R2 over 70 even
though it comes with rather wide confidence bounds according to Lewellen Nagel and Shanken
(2010) approach The Bayesian estimation indicates that part of the model success is due to the fact
that this cross section of test assets does not have much exposure to momentum especially after
one controls for the conventional Fama-French factors While still marginally significant its risk
premium is substantially lower under both BFM and BFM-GLS estimation with tighter bounds
for R2 too On the contrary both HML and SMB have virtually identical risk prices under both
FM and Bayesian estimations
Hou Xue and Zhang (2014) q-factor model emphasized the role of investment and profitability
in matching the cross-section of equity returns and true to the data we find these factors strongly
priced Both estimation strategies give identical parameter estimates and largely consistent mea-
sures of cross-sectional fit This in turn implies that all the risk premia are strongly identified
for this model when estimated on a cross-section of 25 Fama-French portfolios When industry
portfolios are among the test assets (see Appendix C) risk premia and R2 decline and some of the
parameters lose significance but broadly speaking the model performs in a qualitatively similar
way
Liquidity-adjusted CAPM of Pastor and Stambaugh (2003) seems to suffer from identification
failure as the risk premium on the liquidity factor ceases to remain significant when BFM is used
in estimation Wide confidence bounds and uncertain cross-sectional fit provide a stark difference to
28
Table 5 Tradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0489 7063 0703 6432 6329[-0244 1222] [3160 9400] [-0061 1426] [4826 7646]
MKT 0120 -0101[-0631 0870] [-0822 0683]
SMB 0171 0164[0100 0241] [0089 0232]
HML 0404 0396[0331 0477] [0330 0466]
UMD 2445 1806[0955 3936] [0259 3328]
q-factor model Intercept 0912 6567 0922 6062 6123Hou Xue and Zhang (2014) [0286 1539] [3040 8680] [0276 1560] [4131 7640]
ROE 0394 0377[0016 0771] [-0020 0789]
IA 0387 0385[0203 0571] [0208 0580]
ME 0274 0268[0169 0379] [0158 0376]
MKT -0371 -0378[-0995 0252] [-1005 0272]
Liquidity-CAPM Intercept 0973 3624 1162 3409 3027Pastor and Stambaugh (2000) [-0084 2030] [-909 10000] [0175 2120] [-239 6146]
LIQ 3057 1785[0727 5388] [-1237 4150]
MKT -0281 -0449[-1350 0788] [-1371 0509]
Panel A GLS
Carhart (1997) Intercept 1017 8964 1083 8587 863[0389 1645] [8200 9760] [0458 1717] [8085 9105]
MKT -0434 -0504[-1065 0196] [-1150 0122]
SMB 0191 0189[0150 0233] [0150 0230]
HML 0356 0356[0313 0400] [0316 0395]
UMD 1626 1264[0479 2772] [0077 2401]
q-factor model Intercept 1305 5503 1277 4728 4854Hou Xue and Zhang (2014) [0779 1831] [2440 9640] [0702 1879] [3245 6419]
ROE 0295 0266[-0026 0615] [-0087 0640]
IA 0270 0265[0104 0437] [0093 0450]
ME 0251 0246[0161 0341] [0144 0345]
MKT -0749 -0720[-1268 -0229] [-1292 -0156]
Liquidity-CAPM Intercept 1244 4938 1256 5298 4317Pastor and Stambaugh (2000) [0664 1824] [2691 9891] [0738 1749] [1250 6653]
LIQ 1141 0775[-0232 2514] [-0450 2116]
MKT -0664 -0678[-1242 -0086] [-1176 -0162]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of 25 Fama-French monthly excess returns Each model is estimated via OLS and GLS We reportpoint estimates and 5 confidence intervals for risk premia which are constructed based on the asymptotic normaldistribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nagel and Shanken(2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λ denoted byλj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (595) credible intervals and denote significance at the 90 95 and 99 level respectively
29
Table 6 Nontradable factors and 25 Fama-French portfolios sorted by size and value
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled CCAPM Intercept 1046 2567 1791 3436 2919[-0848 2940] [-1429 10000] [0001 3723] [-476 6207]
cay 1817 0791[-0653 4288] [-1347 2686]
∆Cnd 0713 0303[-0030 1456] [-0462 0951]
∆Cnd times cay 0804 0301[-1645 3253] [-1911 2270]
HC-CAPM Intercept 3243 -122 3090 354 957[1228 5257] [-909 3345] [0790 5259] [-748 4431]
∆Y 0464 0085[-0213 1140] [-1119 1058]
MKT -0719 -0656[-2680 1242] [-2859 1558]
Durable CCAPM Intercept 2214 5238 2780 471 4078[-1037 5465] [2800 10000] [-0184 5751] [120 6991]
∆Cnd 0743 0357[-0025 1511] [-0207 0832]
∆Cd -0057 0014[-0719 0605] [-0668 0693]
MKT 0083 -0495[-3322 3489] [-3395 2555]
Panel B GLS
Scaled CCAPM Intercept 2180 -1024 2257 -658 -313[0825 3536] [-1429 6457] [1221 3258] [-1187 1517]
cay 0435 0256[-0774 1643] [-0688 1217]
∆Cnd 0118 0089[-0266 0502] [-0214 0407]
∆Cnd times cay 0141 0063[-1005 1286] [-0845 0938]
HC-CAPM Intercept 2730 5636 2759 5824 4926[1458 4002] [3018 8364] [1379 4095] [967 7507]
∆Y -0421 -0241[-0742 -0099] [-0598 0114]
MKT -0717 -0740[-1979 0545] [-2073 0622]
Durable CCAPM Intercept 2960 4454 2841 5474 4099[0547 5374] [286 7829] [1102 4558] [-241 7215]
∆Cnd 0105 0052[-0265 0475] [-0201 0311]
∆Cd 0055 0025[-0390 0501] [-0286 0327]
MKT -0941 -0822[-3314 1432] [-2528 0895]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of 25 Fama-French quarterly excess returns Each model is estimated via OLS and GLS Wereport point estimates and 5 confidence intervals for risk premia which are constructed based on the asymptoticnormal distribution and cross-sectional R2 and its (5 95) confidence level constructed as in Lewellen Nageland Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide the posterior mean of λdenoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as wellas its (5 95) credible intervals and denote significance at the 90 95 and 99 level respectively
the pointwise estimates and their seemingly high significance levels under the standard frequentist
approach
Conditional CCAPM of Lettau and Ludvigson (2001) is weakly identified at best Unlike the
basic FM estimation that indicates a relative empirical success of the model the Bayesian approach
reveals most risk premia to be substantially lower losing all the accompanying statistical and
30
economic significance This is particularly pronounced in the BFM-GLS that delivers both risk
premia and cross-sectional R2 close to zero
Labour-adjusted CAPM of Jagannathan and Wang (1996) extends the classic CAPM framework
by introducing a proxy for human capital and finds it strongly priced in the cross-sections of stocks
returns The BFM estimates of risk premia are substantially lower and no longer significant with
the same patterns observed under both OLS and GLS procedures
Durable CCAPM of Yogo (2006) in the linearized version included the durable consumption
factor and found that its impact is priced in a number of cross-sections sorted by size and value past
betas and other characteristics Even though the Lewellen Nagel and Shanken (2010) approach
indicates a really wide support for the cross-sectional R2 the model found empirical support in the
data We find that both durable and nondurable consumption are weak predictors of the cross-
section of returns as the magnitude of their risk premia substantially declines and is no longer
significant The model is still characterized by a wide confidence interval for R2 but overall its
pricing ability is questionable at best
Appendix C provides additional empirical results on the performance of both frequentist and
bayesian Fama-MacBeth estimators In many cases when the models are well specified and strongly
identified in the data there is almost no distinction between the two approaches One notable
difference however are the confidence intervals of the R2 that are often notoriously wide in
the frequentist case There are also cases however when the difference in model performance
becomes large affecting both risk premia estimates and measures of cross-sectional fit Similar
to Gospodinov Kan and Robotti (2019) we caution the reader against blindly relying on the
estimates produced by conventional Fama-MacBeth procedure and advocate a robust approach to
inference
IV2 Sampling two quadrillion models
We now turn our attention to a large cross-section of candidate asset pricing factors In particular
we focus on 51 (both traded and non) monthly factors available from October 1973 to December
2016 (ie T = 600) Factors are described in details in Table B1 of Appendix B In choosing
the cross-section of assets to price we follow Lewellen Nagel and Shanken (2010) and employ 25
Fama-French size and book-to-market portfolios plus 30 Industry portfolio (ie N = 55) Since
we do not restrict the maximum number of factors to be included all the possible combinations
of factors give us a total of 251 possible specifications ie 225 quadrillion models Note that each
model involves 55 time series regressions and one cross-section regression ie we jointly evaluate
the equivalent of 126 quadrillion regressions
We employ the continuous spike-and-slab approach of section II23 since it is the most suited
for handling a very large number of possible models and report both the posterior (given the data)
of each factor (ie E [γj |data] forallj) as well as the posterior means of the factorsrsquo risk premia (ie
E [λj |data] forallj) computed as the Bayesian Model Average16 across all the models considered
16If we are interested in some quantity ∆ that is well-defined for every model j = 1 M (eg risk premia
31
The posterior evaluation is performed and reported over a wide range for the parameter (ψ in
equation (16)) that controls the degree of shrinkage of potentially useless factorsrsquo risk premia from
ψ = 1 (ie very strong shrinkage) to ψ = 100 (making the shrinkage virtually irrelevant) The
prior for each factor inclusion is a Beta(1 1) (ie a uniform on [0 1]) yielding a prior expectation
for γj equal to 5017
000
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
1 5 10 20 50 100
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB STRev
IPGrowth BEH_PEAD NONDUR SERV LIQ_TR
PE DeltaSLOPE BW_ISENT DEFAULT Oil
DIV REAL_UNC UNRATE TERM HJTZ_ISENT
UMD CMA LIQ_NT FIN_UNC STOCK_ISS
NetOA MACRO_UNC INV_IN_ASS COMP_ISSUE ACCR
HML ROA RMW CMA LTRev
INTERM_CAP_RATIO ASS_Growth DISSTR PERF BAB
IA MGMT ROE O_SCORE BEH_FIN
GR_PROF QMJ RMW SKEW SMB
HML_DEVIL
Figure 4 Posterior factor probabilities
Posterior probabilities of factors E [γj |data] estimated over the 197310-201612 sample using a cross-section of 25Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the continuous spike andslab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models The 51 factors considered aredescribed in Table B1 of Appendix B The prior distribution for the j-th factor inclusion is a Beta(1 1) yielding aprior expectation of γj = 50 Posterior probabilities are plotted for ψ isin [1 100]
Figure 4 plots the posterior probabilities of the 51 factors as a function of the parameter ψ
The corresponding values are reported in Table 7 Overall the inclusion of only four factors finds
maximum Sharpe ratio etc) from Bayesrsquo theorem we have
E [∆|data] =M983131
j=0
E [∆|datamodel = j] Pr (model = j|data)
where E [∆|datamodel = j] = limLrarrinfin1L
983123Li=1 ∆(θ
(j)i ) and
983153θ(j)i
983154L
i=1denotes draws from the posterior distribution
of the parameters of model j That is the (Bayesian model average) expectation of ∆ conditional on only thedata is simply the weighted average of the expectation in every model with weights equal to the modelsrsquo posteriorprobabilities See eg Raftery Madigan and Hoeting (1997) Hoeting Madigan Raftery and Volinsky (1999)
17Using a Beta(2 2) that still implies a prior probability of factor inclusion of 50 but lower probabilities forvery dense and very sparse models we obtain virtually identical results Furthermore using a prior in favour of moresparse factor models (ie a Beta(2 8)) the empirical findings are very similar to the ones reported
32
Table 7 Posterior factor probabilities E [γj |data] and risk premia in 225 quadrillion models
E [γj |data] E [λj |data]ψ ψ
Factors 1 5 10 20 50 100 1 5 10 20 50 100 F
HML 0860 0909 0916 0903 0882 0877 0192 0288 0301 0300 0292 0293 0377MKT 0730 0807 0831 0851 0837 0778 0108 0271 0370 0476 0565 0572 0514MKT 0501 0630 0705 0748 0749 0707 0047 0184 0285 0385 0467 0472 0563SMB 0654 0662 0580 0500 0429 0370 0076 0138 0133 0117 0103 0102 0215STRev 0506 0526 0511 0530 0568 0550 0003 0013 0026 0049 0106 0158 0438IPGrowth 0508 0508 0501 0502 0508 0502 0000 -0001 -0001 -0002 -0004 -0007 0121BEH PEAD 0486 0512 0478 0492 0511 0517 0004 0012 0018 0027 0049 0072 0619NONDUR 0475 0499 0490 0504 0504 0509 0000 0002 0003 0005 0011 0018 0170SERV 0504 0481 0497 0502 0491 0497 0000 0000 0000 0000 -0001 -0001 0052LIQ TR 0494 0493 0485 0485 0506 0507 0000 0005 0010 0020 0048 0083 0438PE 0495 0494 0502 0490 0494 0492 -0002 -0004 -0005 -0006 -0008 -0009 6401DeltaSLOPE 0478 0490 0506 0498 0482 0511 0000 0000 -0001 -0001 -0002 -0005 0096BW ISENT 0489 0491 0496 0498 0504 0477 0000 0002 0003 0005 0010 0014 0094DEFAULT 0497 0498 0490 0497 0484 0490 0000 0000 0000 0001 0001 0002 0313Oil 0495 0504 0489 0490 0494 0483 0002 0011 0019 0034 0068 0108 0852DIV 0488 0491 0486 0494 0491 0505 0000 0000 0000 -0001 -0002 -0003 0876REAL UNC 0479 0490 0503 0490 0498 0484 0000 0000 0000 0000 0000 0000 0043UNRATE 0483 0499 0484 0490 0490 0486 0000 -0001 -0002 -0003 -0006 -0008 1083TERM 0470 0476 0480 0497 0503 0501 0000 0003 0005 0009 0018 0030 0942HJTZ ISENT 0483 0480 0491 0487 0489 0477 0000 0000 0000 -0001 0000 0000 0210UMD 0531 0537 0536 0495 0422 0384 0028 0069 0089 0100 0102 0108 0646CMA 0499 0504 0491 0495 0466 0442 0001 0000 -0002 -0005 -0008 -0012 0242LIQ NT 0477 0478 0494 0493 0481 0471 -0003 -0005 -0004 -0002 0009 0019 0501FIN UNC 0481 0477 0471 0477 0489 0491 0000 0000 0000 0000 -0001 -0001 0096STOCK ISS 0515 0537 0509 0473 0433 0374 -0032 -0081 -0096 -0103 -0110 -0105 0515NetOA 0504 0504 0481 0489 0423 0402 0005 0015 0022 0032 0040 0042 0544MACRO UNC 0476 0483 0479 0471 0463 0430 0000 0000 0000 0000 0000 0000 0073INV IN ASS 0479 0476 0479 0461 0454 0440 0001 0002 0005 0010 0023 0029 0549COMP ISSUE 0514 0518 0503 0481 0393 0363 0037 0083 0101 0111 0105 0108 0497ACCR 0483 0477 0486 0472 0436 0415 -0009 -0012 -0009 0001 0015 0015 0343HML 0491 0481 0479 0469 0431 0376 -0002 -0001 0001 0004 0008 0010 0251ROA 0517 0514 0499 0473 0399 0319 0041 0105 0141 0162 0154 0123 0551RMW 0508 0481 0483 0465 0397 0347 0002 0008 0013 0017 0017 0016 0219CMA 0560 0530 0499 0432 0351 0292 0031 0056 0062 0058 0051 0044 0351LTRev 0485 0491 0474 0442 0402 0350 -0007 -0025 -0032 -0035 -0036 -0029 0252INTERM CAP RATIO 0499 0477 0452 0451 0380 0324 0027 0037 0054 0082 0103 0104 0790ASS Growth 0473 0470 0458 0439 0380 0327 0001 0005 0005 0000 -0005 -0003 0525DISSTR 0505 0487 0456 0410 0340 0269 0043 0094 0105 0104 0085 0070 0475PERF 0541 0497 0448 0390 0319 0259 -0054 -0079 -0072 -0064 -0054 -0049 0651BAB 0460 0448 0433 0405 0372 0325 -0010 -0012 -0008 0006 0027 0037 0921IA 0497 0465 0425 0385 0325 0257 -0010 -0011 -0009 -0008 -0008 -0007 0409MGMT 0516 0469 0424 0379 0302 0244 0028 0055 0058 0056 0050 0044 0631ROE 0482 0446 0414 0364 0299 0233 0009 0024 0027 0028 0024 0018 0555O SCORE 0475 0426 0403 0368 0281 0229 -0009 -0014 -0015 -0018 -0016 -0012 002BEH FIN 0483 0410 0360 0321 0238 0196 -0016 -0008 -0003 0000 0003 0003 076GR PROF 0459 0405 0366 0314 0249 0206 0011 0005 0008 0011 0012 0014 0199QMJ 0502 0408 0369 0300 0232 0178 0030 0030 0026 0018 0014 0011 0405RMW 0464 0395 0356 0302 0245 0193 0015 0025 0028 0027 0025 0020 0292SKEW 0481 0388 0345 0287 0219 0179 -0001 0000 0000 0000 0000 0000 0438SMB 0336 0305 0319 0312 0311 0277 0019 0032 0041 0047 0054 0051 0257HML DEVIL 0434 0326 0289 0242 0183 0152 0014 0013 0009 0005 0003 0002 0356
Posterior probabilities of factors E [γj |data] and posterior mean of factor risk premia E [λj |data] computed usingthe continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225 quadrillion models Theprior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50 The last column reportssample average returns for the tradable factors The data is monthly 197310 to 201612 Test assets cross-sectionof 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered are described inTable B1 of Appendix B Numbers denoted with the asterisk in the last column correspond to the return on thefactor-mimicking portfolio of the nontradable factor constructed by a linear projection of its values on the set of 51test assets and scaled to have the same volatility as the original nontradable factor
33
substantial support in our empirical analysis First the celebrated Fama-French HML (high-minus-
low) designed to capture the so-called lsquovalue premiumrsquo is a strong determinant of the cross-section
of asset returns For ψ = 10 (a reasonable benchmark) its posterior probability is about 895 and
only for very strong shrinkage (ψ = 1) the posterior probability gets reduced to 639 Second
the market factor in the version of Daniel Mota Rottke and Santos (2018) (MKTlowast that is
meant to have hedged out the unpriced risk contained in the market index) has also high posterior
probability (856 for ψ = 10) Third the simple market factor (MKT) seems also to be a robust
source of priced risk albeit with an empirical performance weaker than MKTlowast Fourth albeit to
a lesser extent SMBlowast the Daniel Mota Rottke and Santos (2018) version of the small-minus-big
Fama-French factor (meant to capture the so called lsquosizersquo premium) seems also to contain relevant
information for pricing the cross-section of asset returns with a posterior probability in the 60-
70 range for most values of ψ Beside the ones above mentioned all other factors have posterior
probabilities of about 50 or less for all values of ψ Interestingly the results are not very sensitive
to the choice of ψ
In addition to the posterior probabilities of the factors Table 7 reports the posterior means
of the factor risk premia computed as Bayesian Model Average ie the weighted average of the
posterior means in each possible factor model specification with weights equal to the posterior
probability of each specification being the true data generating process (see eg Roberts (1965)
Geweke (1999) Madigan and Raftery (1994)) The results are not very sensitive to the choice of
ψ except when considering very small values for of the shrinkage parameter ψ since in this case
posterior means are shrunk toward zero Interestingly the estimated price of risk for the market
factor is positive despite it being very often estimated as a negative quantity when considering
multifactor models and not dissimilar from the market excess return over the same period More
generally there is a clear pattern in cross-sectionally estimated (ie ex post) factor risk premia and
their simple time series average estimates (reported in the last column of Table 7) for the robust
four factors (HML MKTlowast MKT and SMBlowast) ex post risk premia are very similar to the time series
estimnates while the opposite holds true for the other factors In other words robust factors seems
to price themselves well (since theoretically their own beta is one) while other factors donrsquot
IV3 Estimating 26 million sparse factor models
Instead of drawing unrestricted factor models specifications we now constraint models to have a
maximum of 5 factors ndash ie we are imposing sparsity on the implied stochastic discount factor as
in most of the previous empirical asset pricing literature that has tried to identify low dimensional
factor models to explain the cross-section of asset returns Given our set of 51 factors this approach
yields about 26 millions of possible models (ie the equivalent of about 147 million time series and
cross-sectional regressions) Since we now do not sample the possible models posterior probabilities
are computed using the marginal likelihoods of all these models ie the posterior probability of
model γj is computed as
Pr(γj |data) =p(data|γj)983123i p(data|γi)
34
000
1000
2000
3000
4000
5000
6000
7000
8000
1 5 10 20 50 60
Post
erio
r pro
babi
lity
ψ
HML MKT MKT SMB PERF
STOCK_ISS COMP_ISSUE CMA ROA UMD
BEH_PEAD STRev NONDUR DISSTR MGMT
BW_ISENT TERM FIN_UNC LIQ_TR DeltaSLOPE
IPGrowth Oil SERV DEFAULT NetOA DIV PE REAL_UNC HJTZ_ISENT UNRATE
INTERM_CAP_RATIO LIQ_NT QMJ MACRO_UNC INV_IN_ASS
ACCR LTRev CMA ASS_Growth HML
RMW BAB IA O_SCORE ROE
SMB SKEW BEH_FIN GR_PROF RMW
HML_DEVIL
Figure 5 Posterior factor probabilities
Posterior probabilities of factors Pr [γj = 1|data] estimated over the 197310-201612 sample using a cross-section of25 Fama-French size and book-to-market and 30 Industry test asset portfolios computed using the Dirac spike andslab approach of section II22 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models The prior probability of a factor being included is about 1038 The 51 factors considered aredescribed in Table B1 of Appendix B Posterior probabilities are plotted for ψ isin [1 100]
where we have assigned equal prior probability to all possible specifications and p(data|γj) denotes
the marginal likelihood of the j-th model To both simplify the numerical computation and to
illustrate the qualities of the approach we use the Dirac spike-and-slab prior of section II22 since
we can leverage its closed form solution for the marginal likelihoods18
Posterior factor probabilities are reported in Figure 5 and jointly with the Bayesian model
averaging of risk premia across the sparse models in Table C15 of the Appendix Note that in
this case all factors have an ex ante probability of being included equal to 1038 The results
are strikingly similar to the ones in Table 7 and Figure 4 as before only for four factors (HML
MKTlowast MKT and SMBlowast) we observe a marked increase in the posterior probability of inclusion
after observing the data Furthermore these factors seems to price themselves well ndash ie the BMA
of their risk premia are very similar to the sample average of their excess returns ndash while other
factors do not
35
Table 8 Factor Models with highest posterior probability (Dirac spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEADDeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232 983232 983232IABABDISSTRROEMGMT 983232 983232O SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMB
Probability () 00666 00599 00574 00522 00503 00460 00437 00428 00423 00393
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 26 millioncandidate models and a model prior probability of the order of 10minus7 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
36
IV4 A robust factors model
The previous subsections suggest that only a small number of factors (HML MKTlowast MKT and to a
lesser extent SMBlowast) are robust explanators of the cross-section of asset returns Furthermore Table
8 that reports the ten factor model specifications with the highest posterior probabilities19 shows
that these robust factors tend to be almost always included in the most likely models both HML
and MKTlowast are featured in all ten specifications while MKT is excluded only once and SMBlowast is
included in the five most likely specification plus the tenth one Note that the posterior probabilities
in Table 8 might appear small in absolute terms but are actually four orders of magnitude larger
than the prior model probabilities (equal to one over the number of models considered)
Therefore a natural question is whether the four factors identified as robust in the above
analysis do indeed deliver a significantly better cross-sectional asset pricing model We answer this
questions by comparing the performance of a four factor model with HML MKTlowast MKT and SMBlowast
as factors to the one of several notable factor models20 In particular Table 9 reports the model
posterior probabilities for the specifications considered ie the probability of any of these models
being the true data generating process Strikingly for almost any value of ψ the model posterior
probabilities are in the single digits range for all models but the robust factors one the probability
of this specification is always higher than 85 except when using a very strong shrinkage (in which
case it is reduced to 65) Furthermore for ψ in the most salient range (10-20) the posterior
probability of the robust factors model is about 90
Table 9 Posterior probabilities of notable models vs robust factors
ψmodel 1 5 10 20 50 100
CAPM 002 001 001 002 004 008Fama and French (1992) 005 001 001 001 000 000Fama and French (2016) 009 006 005 004 003 002Carhart (1997) 005 001 001 001 001 001Hou Xue and Zhang (2015) 002 000 000 000 000 000Pastor and Stambaugh (2000) 001 000 000 001 001 002Asness Frazzini and Pedersen (2014) 010 003 002 002 002 001Robust Factors Model 065 088 090 090 089 085
Posterior model probabilities for the specifications in the first column for different values of ψ computed using theDirac spike-and-slab prior The models and their factors are described in Appendix B The model in the last rowuses the HML MKT MKTlowast and SBMlowast factors described in Table B1 The data is monthly 197310 to 201612 Testassets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios
18Alternatively one could use the continuous spike-and-slab approach employed in the previous subsection anddrop the draws of specifications with more than 5 factors but this significantly reduces computational efficiency
19Table 8 focuses on the Dirac spike-and-slab specification with ψ = 10 Very similar results are reported in tablesC16-C18 of Appendix C for other values of ψ and for the continuous prior case
20Note that the correlation between MKT and MKTlowast is not too large being about 064
37
IV5 The sparsity of the stochastic discount factor
We have shown that a robust model with only four ndash robust ndash factors is much more likely to
capture the true stochastic discount factor than all the notable models considered (see Table 9)
Nevertheless are only four factors likely to deliver a sufficiently accurate representation of the true
latent stochastic discount factor
Thanks to our Bayesian method this question can be easily answered In particular by using
our estimations of about 225 quadrillion models and their posterior probabilities we can compute
the posterior distribution of the dimensionality of the lsquotruersquo model That is for any integer number
between one and fifty-one we can compute the posterior probability of the SDF being a function
of that number of factors
Figure 6 reports the posterior distributions of the dimensionality of the SDF for various values
of ψ These distributions are also summarized in Table 10 For the most salient values of ψ (10 and
20) the posterior mean of the number of factors in the true SDF is in the 24-25 range and the 95
posterior credible intervals are contained in the 17 to 32 factors range That is there is substantial
evidence that the SDF is dense in the space of the linear factors considered given the factors at
hand a relative large number of them is needed to provide an accurate representation of the true
latent SDF Since most of the literature has focused on very low dimension linear factor models
this finding suggests that most empirical results therein have been affected by a large degree of
misspecification
It is worth noticing that as Figure 6 and Table 10 show for very large ψ ie with basically
a flat prior for factor risk premia the posterior dimensionality is reduced This is due to two
phenomena we have already outlined First if some of the factors are useless (and our analysis
points in this direction) under a flat prior they do tend to have a higher posterior probability and
drive out the true sources of priced risk Second a flat prior for the risk premia can generate a
ldquoBartlett Paradoxrdquo (see the discussion in section II21 and Bartlett (1957))
Table 10 The posterior dimensionality of the stochastic discount factor
ψ1 5 10 20 50 100
mean 2570 2525 2460 2370 2203 2046median 26 25 25 24 22 2025 19 18 18 17 15 145 20 20 19 18 16 1595th 31 31 30 30 28 26975th 33 32 32 31 29 27
Summary statistics of the posterior density of the true model number of factors for various values of ψ Estimated
over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry
test asset portfolios using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1)
38
0 10 20 30 40 50
000
004
008
012
Model Dimensionality
Freq
uenc
y
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 6 Posterior density of the dimensionality of the stochastic discount factor
Posterior density of the true model having the number of factors listed on the horizontal axis Estimated over the
197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30 Industry test asset
portfolios computed using the continuous spike and slab approach of section II23 and 51 factors yielding 251 asymp 225
quadrillion models The prior for each factor inclusion is a Beta(1 1) yielding a prior expectation for γj equal to 50
The 51 factors considered are described in Table B1 of Appendix B Posterior densities are plotted for ψ isin [1 100]
Note that if the factors proposed in the literature were to capture different and uncorrelated
sources of risk one might worry that a SDF that is dense in the space of factors might imply
unrealistically high Sharpe ratios (see eg the discussion in Kozak Nagel and Santosh (2019))
Since given a factor model the SDF-implied maximum Sharpe ratio is just a function of the
factorsrsquo risk premia and covariance matrix our Bayesian method allows to construct the posterior
distribution of the maximum Sharpe ratio for each of the 2 quadrilion models considered Therefore
using the posterior probabilities of each possible model specification we can actually construct
the (Bayesian Model Averaging) posterior distribution of the SDF-implied maximum Sharpe ratio
(conditional on the data only)
Figure 7 and table 11 report respectively the posterior distribution of the (annualized) SDF-
implied maximum Sharpe ratio and its summary statistics for several values of the parameter ψ
Except when a very strong shrinkage (small ψ) is imposed (and hence risk premia and consequently
Sharpe ratios are shrunk toward zero) the posterior distribution of the Sharpe ratio are quite
similar for all values of ψ Furthermore despite the SDF being dense in the space of factors the
posterior maximum Sharpe ratio does not appear to be unrealistically high eg for ψ isin [10 20]
its posterior mean is about 074ndash080 and the 95 posterior credible intervals are in the 050ndash114
range Interestingly Ghosh Julliard and Taylor (2016 2018) provides a non-parametric estimate
of the pricing kernel extracted using an information-theoretic approach and wide cross-sections of
equity portfolios and find SDF-implied maximum Sharpe ratios of very similar magnitude
39
00 05 10 15
01
23
4
Sharpe Ratio
Dens
ity
ψ=1ψ=5ψ=10ψ=20ψ=50ψ=100
Figure 7 Posterior density of the Sharpe ratio of the model-implied stochastic discount factor
Posterior density of the Sharpe ratio of the model-implied stochastic discount factor for various values of ψ isin [1 100]
Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and book-to-market and 30
Industry test asset portfolios computed using the continuous spike and slab approach of section II23 51 factors
described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225 quadrillion possible models
The prior for each factor inclusion is a Beta(1 1)
Table 11 Posterior distribution of the Sharpe ratio of the model-implied stochastic discount factor
ψ1 5 10 20 50 100
mean 044 067 074 080 087 092median 043 066 073 079 086 09125 026 044 050 054 059 0625 029 047 053 058 063 06695th 060 089 099 108 117 124975th 064 094 105 114 124 133
Summary statistics of the posterior distribution of the maximal Sharpe ratio of the model-implied SDF for various
values of ψ isin [1 100] Estimated over the 197310-201612 sample using a cross-section of 25 Fama-French size and
book-to-market and 30 Industry test asset portfolios computed using the continuous spike and slab approach of
section II23 51 factors described in Table B1 of Appendix B and Bayesian Model Averaging of the 251 asymp 225
quadrillion possible models The prior for each factor inclusion is a Beta(1 1)
V Extensions
In additions to the extensions formalized in remarks 1 (on how to handle generated factors such as
principal components and factor mimicking portfolios) and 3 (on how to handle the identification
failure generated by lsquolevel factorsrsquo) our method can be feasibly extended to encompass several
salient generalizations
40
First based on economic considerations one might possibly want to bound the maximum risk
premia (or the maximum Sharpe ratios) associated with the factors This can be achieved by re-
placing the Gaussian distributions in our spike-and-slab priors with (rescaled and centered) Beta
distributions since the latter have bounded support Furthermore for the sake of expositional
simplicity and closed form solutions we have focused on regularising spike and slab priors with
exponential tails Nevertheless our approach that shrinks useless factors based on their correla-
tion with asset returns could be as well implemented using polynomial tailed (ie heavy-tailed)
mixing priors (see Polson and Scott (2011) for a general discussion of priors for regularisation and
shrinkage)21 The rational for using heavy tailed priors is that when the likelihood has thick tails
while the prior has thin tail if the likelihood peak moves too far from the prior mean the posterior
eventually eventually reverts toward the prior This is illustrated in Figure D2 of the Appendix
Nevertheless note that this mechanism (first pointed out in Jeffreys (1961)) is actually desirable
in our settings in order to shrink the risk premia of useless factors toward zero22
Second Lewellen Nagel and Shanken (2010) points out that the first pass time-series regression
is often affected by having a strong factor structure in the residuals Given the hierarchical structure
of our Bayesian approach one can add latent linear components in the time series regression of asset
returns on factors reformulate the time series estimation step as a state-space problem and filter
the latent components (eg via Kalman filter) The posterior sampling of the time series parameters
would then be enriched by the drawing of the added terms as in Bryzgalova and Julliard (2018)
Furthermore one could allow the latent time series factors to be potential priced in the cross-
section (again as in Bryzgalova and Julliard (2018)) This extension would increase the numerical
complexity of the procedure in the time-series step but would nonetheless leave unchanged the
method proposed in this paper at the cross-sectional step (with the only difference that the time
series loadings of the latent factors could be included in the cross-sectional step as if these latent
factors were observable) This extended approach would lead to valid posterior inference and model
selection
Third again thanks to the hierarchical structure of our method time varying time series betas
could be accommodated by adopting the time varying parameters approach of Primiceri (2005)
in the time series step And since in our approach the asset specific expected risk premia are
a parameter estimated in the time series step this extension would also allow for time variation
in asset risk premia Furthermore albeit this would increase the numerical complexity of the
cross-sectional inference step the time varying parameters formulation could also be used for the
modelling of the factor risk premia
21For example albeit alternatives with more desirable properties exist our spike and slab could be implementedusing a Cauchy prior with location parameter set to zero and scale parameter proportional to ψj as defined in equation(16)
22Risk premia have a natural support hence the prior can be chosen (adjusting the paramater ψ) to attach highprobability to it And since useless factors will tend to generate heavy tailed likelihoods with posterior peaks for riskpremia that deviate toward infinity the risk premia of such factors will be shrunk toward the prior mean if the priorhas thin-tails
41
VI Conclusions
We have developed a novel (Bayesian) method for the analysis of linear factor models in asset
pricing The approach can handle the quadrillions of models generated by the zoo of traded and
non traded factors and delivers inference that is robust to the common identification failures and
spurious inference problems caused by useless factors
We have applied our approach to the study of more than two quadrillion factor model spec-
ification and have found that 1) only a handful of factors (the Fama and French (1992) ldquohigh-
minus-lowrdquo proxy for the value premium the market index as well the adjusted versions of the
ldquosmall-minus-bigrdquo size factor and the market factor of Daniel Mota Rottke and Santos (2018))
seem to be robust explanators of the cross-sections of asset returns 2) jointly the four robust fac-
tors provide a model that is compared to the previous empirical literature one order of magnitude
more likely to have generated the observed asset returns (itrsquos posterior probability is about 90)
3) with very high probability the ldquotruerdquo latent stochastic discount factor is dense in the space of
factors proposed in the previous literature ie capturing its characteristics requires the use of 24-25
factors (at the posterior mean of the SDF sparsity) 4) despite being dense in the space of factors
the SDF-implied maximum Sharpe ratio is not excessive suggesting a high degree of commonality
in terms of captured risks among the factors in the zoo
As a byproduct of our novel framework for empirical asset pricing we provide a very simple
Bayesian version of the Fama and MacBeth (1973) regression method (BFM) We show that this
simple procedure (that does neither require optimisation nor tuning parameters and is not harder
to implement than eg the Shanken (1992) correction for standard errors) makes useless factors
easily detectable in finite sample In extensive simulations the BFM and its GLS analogue (BFM-
GLS) perform well even with relatively small time and large cross-sectional dimensions We apply
BFM and BFM-GLS to several notable factor models and document that a range of non-traded
factors such as consumption proxies labour factors or the consumption-to-wealth ratio are only
weakly identified at best and are characterised by a substantial degree of model misspecification
and uncertainty
Finally thanks to its hierarchical structure our framework is extremely flexible and can accom-
modate and deliver robust inference in the presence of 1) pre-estimated factors (eg mimicking
portfolios and principal components) 2) latent priced and unpriced factors 3) time varying betas
as well as asset and factor risk premia
42
References
Allena R (2019) ldquoComparing Asset Pricing Models with Traded and Non-Traded Factorsrdquo working paper
Asness C and A Frazzini (2013) ldquoThe Devil in HMLs Detailsrdquo Journal of Portfolio Management 39 49ndash68
Asness C S A Frazzini and L H Pedersen (2019) ldquoQuality minus Junkrdquo Review of Accounting Studies24(1) 34ndash112
Avramov D and G Zhou (2010) ldquoBayesian Portfolio Analysisrdquo Annual Review of Financial Economics 225ndash47
Baker M and J Wurgler (2006) ldquoInvestor Sentiment and the Cross-Section of Stock Returnsrdquo Journal ofFinance 61 1645ndash80
Baks K P A Metrick and J Wachter (2001) ldquoShould Investors Avoid All Actively Managed Mutual FundsA Study in Bayesian Performance Evaluationrdquo Journal of Finance 56 45ndash85
Ball R (1985) ldquoAnomalies in Relationships between Securitiesrsquo Yields and Yield-Surrogatesrdquo Journal of FinancialEconomics 6 103ndash126
Barillas F and J Shanken (2018) ldquoComparing Asset Pricing Modelsrdquo The Journal of Finance 73(2) 715ndash754
Bartlett M S (1957) ldquoComment on a Statistical Paradox by DV Lindleyrdquo Biometrika 44(1-2) 533ndash534
Basu S (1977) ldquoInvestment Performance of Common Stocks in Relation to their Price-Earnings Ratio A Test ofthe Efficient Market Hypothesisrdquo Journal of Finance 32 663ndash82
Belloni A V Chernozhukov and C Hansen (2014) ldquoInference on treatment effects after selection amonghigh-dimensional controlsrdquo The Review of Economic Studies 81 608ndash650
Breeden D T M R Gibbons and R H Litzenberger (1989) ldquoEmpirical Test of the Consumption-OrientedCAPMrdquo The Journal of Finance 44(2) 231ndash262
Bryzgalova S (2015) ldquoSpurious Factors in Linear Asset Pricing Modelsrdquo working paper
Bryzgalova S and C Julliard (2018) ldquoConsumption in Asset Returnsrdquo London School of EconomicsManuscript
Campbell J Y (1996) ldquoUnderstanding Risk and Returnrdquo Journal of Political Economy 104(2) 298ndash345
Campbell J Y J Hilscher and J Szilagyi (2008) ldquoIn Search of Distress Riskrdquo Journal of Finance 632899ndash2939
Carhart M M (1997) ldquoOn Persistence in Mutual Fund Performancerdquo The Journal of Finance 52(1) 57ndash82
Chan K N-F Chen and D A Hsieh (1985) ldquoAn Exploratory Investigation of the Firm Size Effectrdquo Journalof Financial Economics 14 451ndash71
Chen L R Novy-Marx and L Zhang (2010) ldquoA New Three-Factor Modelrdquo working paper
Chen N-F S A Ross and R Roll (1986) ldquoEconomic Forces and the Stock Marketrdquo Journal of Business 59383ndash403
Chernov M L Lochstoer and S Lundeby (2019) ldquoConditional Dynamics and the Multi-Horizon Risk-ReturnTrade-Offrdquo working paper
Chib S X Zeng and L Zhao (forthcoming) ldquoOn Comparing Asset Pricing Modelsrdquo The Journal of Finance
Cochrane J H (2005) Asset Pricing Revised Edition Princeton University Press
Cooper M J H Gulen and M J Schill (2008) ldquoAsset Growth and the Cross-Section of Stock ReturnsrdquoJournal of Finance 63 1609ndash52
Cremers K M (2002) ldquoStock Return Predictability A Bayesian Model Selection Perspectiverdquo The Review ofFinancial Studies 15(4) 1223ndash1249
Daniel K D Hirshleifer and L Sun (2019) ldquoShort- and Long-Horizon Behavioral Factorsrdquo Review ofFinancial Studies
Daniel K L Mota S Rottke and T Santos (2018) ldquoThe Cross-Section of Risk and Returnrdquo workingpaper
Daniel K and S Titman (2006) ldquoMarket reactions to tangible and intangible informationrdquo Journal of Finance61 1605ndash43
Fabozzi F D Huang and G Zhou (2010) ldquoRobust Portfolios Contributions from Operations Research andFinancerdquo Annals of Operations Research 172 191ndash220
43
Fama E F and K French (2008) ldquoDissecting Anomaliesrdquo Journal of Finance 63 1653ndash78
Fama E F and K R French (1992) ldquoThe Cross-Section of Expected Stock Returnsrdquo The Journal of Finance47 427ndash465
(1993) ldquoCommon Risk Factors in the Returns on Stocks and Bondsrdquo The Journal of Financial Economics33 3ndash56
Fama E F and K R French (2015) ldquoA Five-Factor Asset Pricing Modelrdquo Journal of Financial Economics116 1ndash22
(2016) ldquoDissecting Anomalies with a Five-Factor Modelrdquo Review of Financial Studies 29 69ndash103
Fama E F and J D MacBeth (1973) ldquoRisk Return and Equilibrium Empirical Testsrdquo Journal of PoliticalEconomy 81(3) 607ndash636
Ferson W and C Harvey (1991) ldquoThe Variation of Economic Risk Premiumsrdquo Journal of Political Economy99 385ndash415
Frazzini A and L H Pedersen (2014) ldquoBetting against Betardquo Journal of Financial Economics 111(1) 1ndash25
Garlappi L R Uppal and T Wang (2007) ldquoPortfolio Selection with Parameter and Model Uncertainty AMulti-Prior Approachrdquo Review of Financial Studies 20 41ndash81
George E I and R E McCulloch (1993) ldquoVariable Selection via Gibbs Samplingrdquo Journal of the AmericanStatistical Association 88 881ndash889
(1997) ldquoApproaches for Bayesian Variable Selectionrdquo Statistica Sinica 7 339ndash373
Gertler M and E Grinols (1982) ldquoUnemployment Inflation and Common Stock Returnsrdquo Journal of MoneyCredit and Banking 14 216ndash33
Geweke J (1999) ldquoUsing Simulation Methods for Bayesian Econometric Models Inference Development andCommunicationrdquo Econometric Reviews 18(1) 1ndash73
Ghosh A C Julliard and A Taylor (2016) ldquoWhat is the Consumption-CAPM missing An Information-Theoretic Framework for the Analysis of Asset Pricing Modelsrdquo Review of Financial Studies 30 442ndash504
(2018) ldquoAn Information-Theoretic Asset Pricing Modelrdquo London School of Economics Manuscript
Giannone D M Lenza and G E Primiceri (2018) ldquoEconomic Predictions with Big Data The Illusion ofSparsityrdquo FRB of New York Staff Report No 847
Gibbons M R S A Ross and J Shanken (1989) ldquoA Test of the Efficiency of a Given Portfoliordquo Econometrica57(5) 1121ndash1152
Giglio S G Feng and D Xiu (2019) ldquoTaming the factor Zoordquo Journal of Finance forthcoming
Giglio S and D Xiu (2018) ldquoAsset Pricing with Omitted Factorsrdquo working paper
Gospodinov N R Kan and C Robotti (2014) ldquoMisspecification-Robust Inference in Linear Asset-PricingModels with Irrelevant Risk Factorsrdquo The Review of Financial Studies 27(7) 2139ndash2170
(2019) ldquoToo Good to be True Fallacies in Evaluating Risk Factor Modelsrdquo Journal of Financial Economics132(2) 451ndash471
Goyal A Z L He and S-W Huh (2018) ldquoDistance-Based Metrics A Bayesian Solution to the Power andExtreme-Error Problems in Asset-Pricing Testsrdquo Swiss Finance Institute Research Paper No 18-78 available atSSRN httpsssrncomabstract=3286327
Hall R E (1978) ldquoThe Stochastic Implications of the Life Cycle Permanent Income Hypothesisrdquo Journal ofPolitical Economy 86(6) 971ndash87
Harvey C and Y Liu (2019) ldquoCross-sectional Alpha Dispersion and Performance Evaluationrdquo Journal ofFinancial Economics forthcoming
Harvey C and G Zhou (1990) ldquoBayesian Inference in Asset Pricing Testsrdquo Journal of Financial Economics26 221ndash254
Harvey C R (2017) ldquoPresidential Address The Scientific Outlook in Financial Economicsrdquo The Journal ofFinance 72(4) 1399ndash1440
Harvey C R Y Liu and H Zhu (2016) ldquo and the Cross-Section of Expected Returnsrdquo The Review ofFinancial Studies 29(1) 5ndash68
He A D Huang and G Zhou (2018) ldquoNew Factors Wanted Evidence from a Simple Specification Testrdquoworking paper available at SSRN httpsssrncomabstract=3143752
44
He Z B Kelly and A Manela (2017) ldquoIntermediary Asset Pricing New Evidence from Many Asset ClassesrdquoJournal of Financial Economics 126 1ndash35
Hirshleifer D H Kewei S Teoh and Y Zhang (2004) ldquoDo Investors Overvalue Firms with Bloated BalanceSheetsrdquo Journal of Accounting and Economics 38 297ndash331
Hoeting J A D Madigan A E Raftery and C T Volinsky (1999) ldquoBayesian Model Averaging ATutorialrdquo Statistical Science 14(4) 382ndash401
Hou K C Xue and L Zhang (2015) ldquoDigesting Anomalies An Investment Approachrdquo The Review of FinancialStudies 28(3) 650ndash705
Huang D F Jiang J Tu and G Zhou (2015) ldquoInvestor Sentiment Aligned A Powerful Predictor of StockReturnsrdquo Review of Financial Studies 28 791ndash837
Huang D J Li and G Zhou (2018) ldquoShrinking Factor Dimension A Reduced-Rank Approachrdquo workingpaper available at SSRN httpsssrncomabstract=3205697
Ishwaran H J S Rao et al (2005) ldquoSpike and Slab Variable Selection Frequentist and Bayesian StrategiesrdquoThe Annals of Statistics 33(2) 730ndash773
Jagannathan R and Z Wang (1996) ldquoThe Conditional CAPM and the Cross-Section of Expected ReturnsrdquoThe Journal of Finance 51(1) 3ndash53
Jeffreys H (1961) Theory of Probability Oxford Calendron Press
Jegadeesh N and S Titman (1993) ldquoReturns to Buying Winners and Selling Losers Implications for StockMarket Efficiencyrdquo Journal of Finance 48 65ndash91
(2001) ldquoProfitability of Momentum Strategies An Evaluation of Alternative Explanationsrdquo Journal ofFinance 56 699ndash720
Jones C and J Shanken (2005) ldquoMutual Fund Performance with Learning across Fundsrdquo Journal of FinancialEconomics 78 507ndash552
Jurado K S Ludvigson and S Ng (2015) ldquoMeasuring Uncertaintyrdquo American Economic Review 105 1177ndash1215
Kan R and C Zhang (1999a) ldquoGMM Tests of Stochastic Discount Factor Models with Useless Factorsrdquo Journalof Financial Economics 54(1) 103ndash127
(1999b) ldquoTwo-Pass Tests of Asset Pricing Models with Useless Factorsrdquo the Journal of Finance 54(1)203ndash235
Kass R E and A E Raftery (1995) ldquoBayes Factorsrdquo Journal of the American Statistical Association 90(430)773ndash795
Kelly B S Pruitt and Y Su (2019) ldquoCharacteristics are covariances A unified model of risk and returnrdquoJournal of Financial Economics 134 501ndash524
Kleibergen F (2009) ldquoTests of Risk Premia in Linear Factor Modelsrdquo Journal of Econometrics 149(2) 149ndash173
Kleibergen F and Z Zhan (2015) ldquoUnexplained Factors and their Effects on Second Pass R-squaredsrdquo Journalof Econometrics 189(1) 101ndash116
Kozak S S Nagel and S Santosh (2019) ldquoShrinking the Cross-Sectionrdquo Journal of Financial Economicsforthcoming
Lancaster T (2004) Introduction to Modern Bayesian Econometrics Wiley
Langlois H (2019) ldquoMeasuring Skewness Premiardquo Journal of Financial Economics
Lettau M and S Ludvigson (2001) ldquoResurrecting the (C) CAPM A Cross-Sectional Test when Risk Premiaare Time-Varyingrdquo Journal of political economy 109(6) 1238ndash1287
Lewellen J S Nagel and J Shanken (2010) ldquoA Skeptical Appraisal of Asset Pricing Testsrdquo Journal ofFinancial Economics 96(2) 175ndash194
Lintner J (1965) ldquoSecurity Prices Risk and Maximal Gains from Diversificationrdquo Journal of Finance 20587ndash615
Ludvigson S S Ma and S Ng (2019) ldquoUncertainty and Business Cycles Exogenous Impulse or EndogenousResponserdquo American Economic Journal Macroeconomics
Madigan D and A E Raftery (1994) ldquoModel Selection and Accounting for Model Uncertainty in GraphicalModels Using Occamrsquos Windowrdquo Journal of the American Statistical Association 89(428) 1535ndash1546
45
Novy-Marx R (2013) ldquoThe Other Side of Value The Gross Profitability Premiumrdquo Journal of Financial Eco-nomics 108 1ndash28
Ohlson J A (1980) ldquoFinancial Ratios and the Probabilistic Prediction of Bankruptcyrdquo Journal of AccountingResearch 18 109ndash131
Pastor L (2000) ldquoPortfolio Selection and Asset Pricing Modelsrdquo The Journal of Finance 55(1) 179ndash223
Pastor L and R F Stambaugh (2000) ldquoComparing Asset Pricing Models an Investment Perspectiverdquo Journalof Financial Economics 56(3) 335ndash381
Pastor L and R F Stambaugh (2002) ldquoInvesting in Equity Mutual Fundsrdquo Journal of Financial Economics63 351ndash380
Pastor L and R F Stambaugh (2003) ldquoLiquidity Risk and Expected Stock Returnsrdquo Journal of Politicaleconomy 111(3) 642ndash685
Phillips D L (1962) ldquoA Technique for the Numerical Solution of Certain Integral Equations of the First KindrdquoJ ACM 9(1) 84ndash97
Polson N G and J G Scott (2011) ldquoShrink Globally Act Locally Sparse Bayesian Regularization and Pre-dictionrdquo in Bayesian Statistics 9 ed by J M Bernardo M J Bayarri J O Berger A P Dawid D HeckermanA F M Smith and M West Oxford University Press
Primiceri G E (2005) ldquoTime Varying Structural Vector Autoregressions and Monetary Policyrdquo Review of Eco-nomic Studies 72(3) 821ndash852
Raftery A E D Madigan and J A Hoeting (1997) ldquoBayesian Model Averaging for Linear RegressionModelsrdquo Journal of the American Statistical Association 92(437) 179ndash191
Ritter J R (1991) ldquoThe Long-Run Performance of Initial Public Offeringsrdquo Journal of Finance 46 3ndash27
Roberts H V (1965) ldquoProbabilistic Predictionrdquo Journal of the American Statistical Association 60(309) 50ndash62
Shanken J (1987) ldquoA Bayesian Approach to Testing Portfolio Efficiencyrdquo Journal of Financial Economics 19195ndash215
(1992) ldquoOn the Estimation of Beta-Pricing Modelsrdquo The Review of Financial Studies 5 1ndash33
Sharpe W F (1964) ldquoCapital Asset Prices A Theory of Market Equilibrium under Conditions of Riskrdquo Journalof Finance 19(3) 425ndash42
Sims C A (2007) ldquoThinking about Instrumental Variablesrdquo mimeo
Sloan R (1996) ldquoDo Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future EarningsrdquoThe Accounting Review 71 289ndash315
Stambaugh R F and Y Yuan (2016) ldquoMispricing Factorsrdquo Review of Financial Studies 30 1270ndash1315
Stock J H (1991) ldquoConfidence Intervals for the Largest Autoregressive Root in US Macroeconomic Time SeriesrdquoJournal of Monetary Economics 28(3) 435ndash459
Tikhonov A A Goncharsky V Stepanov and A G Yagola (1995) Numerical Methods for the Solutionof Ill-Posed Problems Mathematics and Its Applications Springer Netherlands
Titman S K J Wei and F Xie (2004) ldquoCapital Investments and Stock Returnsrdquo Journal of Financial andQuantitative Analysis 39 677ndash700
Uppal R P Zaffaroni and I Zviadadze (2018) ldquoCorrecting Misspecified Stochastic Discount Factorsrdquoworking paper
Yogo M (2006) ldquoA Consumption-Based Explanation of Expected Stock Returnsrdquo Journal of Finance 61(2)539ndash580
46
A Additional Derivations
A1 Derivation of the posterior distribution in section II1
Letrsquos consider first the time-series regression We assume that 983171tiidsim N (0Σ) or 983171 sim MVN (0TtimesN Σotimes
IT ) The likelihood of data (RF ) is therefore
p(data|BΣ) = (2π)minusNT2 |Σ|minus
T2 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
After assigning the Jeffreysrsquo prior for (BΣ) π(BΣ) prop |Σ|minusN+1
2 we simplify the likelihood
function by exploiting the fact that OLS estimated residuals are orthogonal to the regressors
(Rminus FB)⊤(Rminus FB) = [Rminus FBols minus F (B minus Bols)]⊤[Rminus FBols minus F (B minus Bols)]
= (Rminus FBols)⊤(Rminus FBols) + (B minus Bols)
⊤F⊤F (B minus Bols)
= T Σ+ (B minus Bols)⊤F⊤F (B minus Bols)
where
Bols =
983075a⊤
β⊤
983076= (F⊤F )minus1F⊤R Σols =
1
T(Rminus FBols)
⊤(Rminus FBols)
Therefore the posterior distribution in the first step is
p(BΣ|data) prop (2π)minusNT2 |Σ|minus
T+N+12 exp
983069minus1
2tr[Σminus1(Rminus FB)⊤(Rminus FB)]
983070
prop |Σ|minusT+N+1
2 eminus12tr[Σminus1(T Σ)]eminus
12tr[Σminus1(BminusBols)
⊤F⊤F (BminusBols)]
Hence the posterior distribution of B conditional on data and Σ is
p(B|Σ data) prop exp
983069minus1
2tr[Σminus1(B minus Bols)
⊤F⊤F (B minus Bols)]
983070
and the above is the kernel of the multivariate normal in equation (8)
If we further integrate out B it is easy to show that
p(Σ|data) prop |Σ|minusT+NminusK
2 exp
983069minus1
2tr[Σminus1(T Σ)]
983070
Therefore the posterior distribution of Σ is the inverse-Wishart in equation (9)
Recall that β = (1N βf ) λ⊤ = (λc λ⊤
f ) If we assume that the pricing error αi follows an
independent and idencitical normal distribution N (0σ2) the likelihood function in the second step
is
p(data|λσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
where data in the second step include (aβf ) drawn from the first step Assuming the diffuse
47
Jeffreysrsquo prior π(λσ2) prop 1σ2 the posterior distribution of (λσ2) is
p(λσ2|dataBΣ) prop (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ)⊤(aminus βλ)
983070
= (σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βλ+ β(λminus λ))⊤(aminus βλ+ β(λminus λ))
983070
= (σ2)minusN+2
2 exp
983069minusN σ2
2σ2
983070exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
there4 p(λ|σ2 dataBΣ) prop exp
983083minus(λminus λ)⊤β⊤β(λminus λ)
2σ2
983084
where λ = (β⊤β)minus1β⊤a and σ2 = (aminusβλ)⊤(aminusβλ)N Therefore the posterior conditional distribution
of λ is the one in equation (12) Finally we can derive the posterior distribution of σ2 by integrating
out λ
p(σ2|dataBΣ) =
983133p(λσ2|dataBΣ)dσ2 prop (σ2)minus
NminusK+12 exp
983069minusN σ2
2σ2
983070
hence obtaining the posterior distribution in equation (13)
A2 Non-spherical pricing errors
Our framework can also easily accommodate non-spherical cross-sectional pricing errors To see
this note that under the null of the model we can rewrite equation (1) as Rt = βλ + βfft + 983171t
Consider the cross-sectional regression and let ET define the sample mean operator Since ET [Rt] =
βλ+ET [βfft]+ET [983171t] = βλ+ET [983171t] the pricing error α should be equal to ET [983171t] Hence under
the hypothesis that the model is correctly specified and in the spirit of the central limit theorem
a suitable distributional assumption for the pricing errors α in the second step is
α|Σ sim N
9830610N
1
TΣ
983062
Implying the distribution of R equiv ET [Rt] sim N (βλ 1T Σ) The likelihood function in the second step
is then
p(data|λ) = (2π)minusN2
9830559830559830559830551
TΣ
983055983055983055983055minus 1
2
exp
983069minusT
2(Rminus βλ)⊤Σminus1(Rminus βλ)
983070
prop exp
983069minusT
2(λ⊤β⊤Σminus1βλminus 2λ⊤β⊤Σminus1R)
983070
Hence we can now define the following estimator
Definition 3 (Bayesian Fama-MacBeth GLS2 (BFM-GLS2)) The BFM-GLS2 posterior dis-
tribution of λ is
λ|dataBΣ sim N (λΣλ) (18)
48
where λ = (β⊤Σminus1β)minus1β⊤Σminus1R Σλ = 1T (β
⊤Σminus1β)minus1 and where β and Σ are drawn from the
Normal-inverse-Wishart in (8)-(9)
In the above the conditional expectation λ = (β⊤Σminus1β)minus1β⊤Σminus1R is essentially the Fama-
MacBeth GLS estimate of λ and Σλ is just the standard covariance matrix of the GLS estimates
Different from our OLS version we incorporate the uncertainty of R by drawing λ from the normal
distribution with variance matrix 1T (β
⊤Σminus1β)minus1
In this case two forces are at play to make useless factors detectable in finite sample First as
for the BFM and BFM-GLS cases useless factors will generate posterior draws with diverging 983141λand flipping sing Second differently from our OLS version we incorporate the uncertainty of R
by drawing λ from the normal distribution with variance matrix 1T (β
⊤Σminus1β)minus1
A3 Formal derivation of the flat prior pitfall for risk premia
Following the derivation in section A1 the likelihood function in the second step is
p(data|γλσ2) = (2πσ2)minusN2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070(19)
Assigning a Jeffreysrsquo prior to the parameters23 (λσ2) the marginal likelihood function conditional
on model index γ is
p(data|γ) =983133 983133
p(data|γλσ2)π(λσ2|γ)dλdσ2
prop983133 983133
(σ2)minusN+2
2 exp
983069minus 1
2σ2(aminus βγλγ)
⊤(aminus βγλγ)
983070dλdσ2
=
983133 983133(σ2)minus
N+22 exp
983083minusN σ2
γ
2σ2
983084exp
983083minus(λγ minus λγ)
⊤β⊤γ βγ(λγ minus λγ)
2σ2
983084dλdσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
983133(σ2)minus
Nminuspγ+1
2 exp
983083minusN σ2
γ
2σ2
983084dσ2
= (2π)pγ+1
2 |β⊤γ βγ |minus
12
Γ(Nminuspγ+1
2 )
(N σ2
γ
2 )Nminuspγ+1
2
where λγ = (β⊤γ βγ)
minus1β⊤γ a σ
2γ =
(aminusβγ λγ)⊤(aminusβγ λγ)N and Γ denotes the Gamma function
A4 Proof of Remark 2
Proof Consider two nested linear factor models γ and γ prime The only difference between γ and γ prime
is γp γp equals 1 in model γ but 0 in model γ prime Let γminusp denote a (K minus 1) times 1 vector of model
index excluding γp γ⊤ = (γ⊤minusp 1) and γ prime⊤ = (γ⊤
minusp 0) Suppose that we rearrange the ordering of
23More precisely the priors for (λσ2) are π(λγ σ2) prop 1
σ2 and λminusγ = 0
49
factors such that factor p is the last one To begin with we introduce some matrix notations
βγ = (βγprime βp) Dγ =
983075Dγprime
1ψp
983076 β⊤
γ βγ +Dγ =
983075β⊤γprimeβγprime +Dγprime β⊤
γprimeβp
β⊤p βγprime β⊤
p βp + 1ψp
983076
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|β⊤γ βγ +Dγ | = |β⊤
γprimeβγprime +Dγprime |times |β⊤p βp +
1
ψpminus β⊤
p βγprime(β⊤γprimeβγprime +Dγprime)minus1β⊤
γprimeβp|
|Dγ | = |Dγprime |times 1
ψp
Equipped with the above we have by direct calculation
p(data|γj = 1γminusj)
p(data|γj = 0γminusj)=
|Dγ |12
|β⊤γ βγ+Dγ |
12
1983059
SSRγ2
983060N2
|Dγprime |
12
|β⊤γprimeβγprime+D
γprime |12
1983061
SSRγprime2
983062N2
=
983061SSRγprime
SSRγ
983062N2983061|Dγ ||Dγprime |
983062 12
983075|β⊤
γprimeβγprime +Dγprime ||β⊤
γ βγ +Dγ |
983076 12
=
983061SSRγprime
SSRγ
983062N2
ψminus 1
2p
983055983055983055983055β⊤p βp +
1
ψpminus β⊤
p βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprimeβp
983055983055983055983055minus 1
2
=
983061SSRγprime
SSRγ
983062N29830611 + ψpβ
⊤p
983063IN minus βγprime
983059β⊤γprimeβγprime +Dγprime
983060minus1β⊤γprime
983064βp
983062minus 12
where β⊤p
983147IN minus βγprime(β⊤
γprimeβγprime +Dγprime)minus1β⊤γprime
983148βp = minb(βpminusβγprimeb)⊤(βpminusβγprimeb)+ b⊤Dγprimeb which
is the minimal value of the penalised sum of squared errors when we use βγprime to predict βp
A5 Confidence Intervals for R2 in Fama-MacBeth Estimation
In the spirit of Stock (1991) and Lewellen Nagel and Shanken (2010) for each of the simulations
we use the following approach to construct the confidence interval for cross-sectional R2 produced
by the standard Fama-MacBeth estimation
First we choose a hypothetical true cross-sectional R2 denoted by R2h The expected test asset
returns E[Rt] are assumed to follow E[Rt] = hβλ+α where β and λ are sample estimates of β
and λ from historical data and αiiidsim N (0σ2
e) The constants h and σ2e are chosen to match the
hypothetical cross-sectional R2
Let R denote the vector of historical average asset returns and V arN (x) denote the sample
cross-sectional variance of vector x The population cross-sectional R2 is therefore
R2h =
h2V arN (βλ)
V arN (E[Rt])=
h2V arN (βλ)
h2V arN (βλ) + σ2e
At the same time wersquod like to maintain the historical cross-sectional dispersion of test asset returns
50
so we further have the following equation
V arN (E[Rt]) = h2V arN (βλ) + σ2e = V arN (ET [Rt])
h2 =R2
hV arN (ET [Rt])
V arN (βλ)
σ2e = (1minusR2
h)V arN (ET [Rt])
Solving the system for h and σ2e we simulate a vector of pricing errors from a normal distribution
αiiidsim N (0σ2
e) Since the sample variance of the draws αi is generally different from σ2e with a non-
zero cross-sectional covariance with β we adjust the vector α by (1) subtracting CovN (βλα)
V arN (βλ)βλ
from α such as the sample covariance between α and βλ is zero (2) multiplying α by σeσN (α) in
order that sample cross-sectional variance of α equals σ2e
Second we simulate a random sample of both factors and test asset returns Factors are drawn
with replacement from their empirical sample while test assets returns Rt are assumed to follow a
parametric normal distribution that is
Rt|ftiidsim N (hβλ+α+ βft Σ)24
We then use Fama-MacBeth two-step approach to estimateR2 for every simulated sample ftRtTt=1
The second step is repeated for 1000 times For each hypothetical R2 between 0 and 1 we find the
(5 95) confidence interval in the simulation Finally build a 90 confidence interval for R2 by
including those values of hypothetically true R2 whose 90 confidence interval contains R2
The confidence intervals for GLS R2 can be found in a similar way with the only difference
of focusing on cross-sectional R2 for a linear combination of Rt ie Σminus 12Rt Let Rt = Σminus 1
2Rt
β = Σminus 12 β and 983171t = Σminus 1
2 983171tiidsim N (0 IN ) The simulation then relies on Rt rather than Rt
Rt|ftiidsim N (hβλ+α+ βft IN )
A6 Large N behavior
In this section we investigate the properties of the BFM procedure in estimating the risk premia
and R-squared as well successfully identifying irrelevant factors in the model when applied to a
large cross-section For the sake of brevity here we present the overview of the simulation results
with Tables C4 - C9 summarizing the output
We consider the same simulation design as described at the beginning of Section III except
for the choice of the cross-section of test assets which time series and cross-sectional features we
mimic In our baseline case in the previous subsections we built a cross-section to emulate the 25
Fama-French portfolios sorted by size and value Now instead we consider the properties of the
following composite cross-sections to simulate returns
24Σ is the sample estimate of covariance matrix of random errors in time-series regression in equation (1)
51
(a) N = 55 25 Fama-French portfolios sorted by size and value and 30 industry portfolios
(b) N = 100 25 Fama-French portfolios sorted by size and value 30 industry portfolios 25
profitability and investment portfolios 10 momentum portfolios and 10 long-term reversal
portfolios
The rest of the simulation design stays unchanged ie the strong factor mimics the behavior of
HML with its betas and risk premia corresponding to their in-sample values cross-sectional R2adj
as well as portfolio average returns and variance of the residuals
Tables C4 and C7 focus on the case of including only a strong factor into the model that is
inherently misspecified and estimated on a large cross-section (N = 55 and N = 100 respectively)
Risk premia estimates recovered by BFM are centered around the pseudo-true values and confi-
dence intervals produced by the posterior coverage have the correct size Confidence intervals for
R2adj are equally centered around the true values and overall become quite tight for a large sample
size (eg T ge 600) The GLS version of the cross-sectional fit is slightly lower than than the true
value and this is largely due to the estimation errors in the weight matrix that are particularly
important for a large cross-section but overall are very close to the true values
Tables C5 and C8 focus on the model with just a useless factor As expected the standard
Fama-MacBeth estimation yields confidence intervals for the risk premia with wrong size rejecting
the zero risk premia for a useless factor with increasing frequency as T becomes large In contrast
the BFM inference remains valid with its empirical rejection rates being close to the true size of
the tests
Finally Tables C6 and C9 consider the mixed case of estimating a model that includes both
strong and useless factors Again BFM correctly identifies the presence of a spurious factor in
the specification and if anything becomes somewhat more conservative underrejecting the zero
risk premia associated with it This conservative inference is also shared by the estimates of the
strong factor risk premia with the latter being particularly evident in a large sample estimation
(T = 20 000 N = 100) Again the reason for this additional estimation uncertainty seems to lie
in the large N behavior of the cross-sectional regression (betas and the weight matrix in case of
the GLS) The posterior of a cross-sectional measure of fit remains relatively tight around the true
value
B Data
CAPM Following Sharpe (1964) and Lintner (1965) the only risk factor is the excess return on
market portfolio which is proxied by a value-weighted portfolio of all CRSP firms incorporated in
US and listed on the NYSE AMEX or NASDAQ We use 1-month Treasury rate as a proxy for
risk-free rate The data comes from Ken Frenchrsquos website
Fama-French 3 factor model Fama and French (1992) extend CAPM by introducing two
additional factors SMB and HML where SMB is the return difference between portfolios of stocks
52
with small and large market capitalizations and HML is the return difference between portfolios of
stocks with high and low book-to-market ratio Again the data comes from Ken Frenchrsquos website
Carhart (1997) This paper extends Fama-French 3 factor model by including a momentum
factor UMD (also available from Ken French website)
q-factor model Hou Xue and Zhang (2015) introduce a four-factor model that includes market
excess return a new size factor (ME) proxied by the return difference between large and small
stocks an investment factor (IA) proxied by the return difference of stocks with high and low
investment-to-asset ratio and finally the profitability factor (ROE) created by sorting stocks based
on their return-on-equity ratio We receive the data from the authors An alternative is Fama-
French five factor model but we only show the result of the first one since factors in these two
papers are extremely similar
Liquidity Pastor and Stambaugh (2003) created a liquidity factor based on the fact that order
flows result in larger return reversals when liquidity is lower We download the monthly tradable
liquidity factor from Stambaughrsquos website
Quality-minus-junk Asness Frazzini and Pedersen (2019) introduced the QMJ factor and
demonstrated that profitable growing well-managed companies referred to as lsquoqualityrsquo firms
command a higher rate of return The factor is available from AQR data library
CCAPM Real growth rate in nondurable consumption per capita (quarterly data) is computed
from the data on consumption levels available at St Louis FRED
Scaled CAPM Lettau and Ludvigson (2001) considered a conditional SDFmt+1 = atminusbtMKTt+1
where at and bt are linear function of conditional information cayt We download cayt from the
authorsrsquo website
Scaled CCAPM Similar to conditional CAPM but the SDF includes nondurable consumption
growth instead of the market factor
HC-CCAPM Jagannathan and Wang (1996) added a labor factor to CAPM Following their
paper we compute the returns on human capital as
Rlabort =
Ltminus1 + Ltminus2
Ltminus2 + Ltminus3minus 1 (20)
where Lt is the disposable labor income per capita
Scaled HC-CCAPM Lettau and Ludvigson (2001) extend Jagannathan and Wang (1996) by
considering a conditional SDF in which cay is the only conditional information
Durable consumption model Yogo (2006) emphasized the role of durable consumption goods
in explaining high returns of small stocks and value stocks We consider a three-factor model
market excess return non-durable consumption growth and durable (real) consumption growth Cd
(seasonally adjusted at annual rates) The data is from Yogorsquos website 1952Q1 to 2001Q4
53
B1 Factor list
Table B1 List of the factors for cross-sectional asset pricing models
Factor ID Factor name and description Reference Sourceconstruction
MKT Market excess return Sharpe (1964) Lintner(1965)
Ken French website
SMB Size factor constructed as a long-shortportfolio of stocks sorted by their mar-ket cap (small-minus-big)
Fama and French (1992) Ken French website
HML Value factor constructed as a long-short portfolio of stocks sorted by theirbook-to-market ratio (high-minus-low)
Fama and French (1992) Ken French website
RMW Profitability factor constructed as along-short portfolio of stocks sorted bytheir profitability (robust-minus-weak)
Fama and French (2015) Ken French website
CMA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment activity (conservative-minus-aggressive)
Fama and French (2015) Ken French website
UMD Momentum factor constructed as along-short portfolio of stocks sorted bytheir 12-2 cumulative previous return(up-minus-down)
Carhart (1997) Je-gadeesh and Titman(1993)
Ken French website
STREV Short-term reversal factor constructedas a long-short portfolio of stockssorted by their previous month return
Jegadeesh and Titman(1993)
Ken French website
LTREV Long-term reversal factor constructedas a long-short portfolio of stockssorted by their cumulative return ac-crued in the previous 60-13 months
Jegadeesh and Titman(2001)
Ken French website
q IA Investment factor constructed as along-short portfolio of stocks sorted bytheir investment-to-capital
Hou Xue and Zhang(2015)
Lu Zhang
q ROE Profitability factor constructed as along-short portfolio of stocks sorted bytheir return on equity
Hou Xue and Zhang(2015)
Lu Zhang
LIQ NT Liquidity factor computed as the av-erage of individual-stock measures es-timated with daily data (residual pre-dictability controlling for the marketfactor)
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
LIQ TR Liquidity factor constructed as a long-short portfolio of stocks sorted by theirexposure to LIQ NT
Pastor and Stambaugh(2003)
Robert Stambaughwebsite
MGMT Mispricing factor constructed as acombination of anomalies related tofirmrsquos management practices (stock is-sue accruals asset growth etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
PERF Mispricing factor constructed as acombination of anomalies related tofirmrsquos performance (profitability dis-tress return on assets etc)
Stambaugh and Yuan(2016)
Robert Stambaughwebsite
ACCR Accruals factor constructed as a long-short portfolio of stocks sorted bychanges in operating working capitalper split-adjusted share from the fiscalyear end t-2 to t-1 divided by book eq-uity per share in t-1
Sloan (1996) Robert Stambaughwebsite
54
DISSTR Distress factor constructed as a long-short portfolio of stocks sorted by thepredicted failure probability
Campbell Hilscher andSzilagyi (2008)
Robert Stambaughwebsite
ASS Growth Asset growth factor constructed as along-short portfolio of stocks sorted bygrowth rate of total assets in the previ-ous fiscal year
Cooper Gulen andSchill (2008)
Robert Stambaughwebsite
COMP ISSUE Composite issue factor constructed asa long-short portfolio of stocks sortedby the growth in the firmrsquos total marketvalue of equity above that of the stockrsquosrate of return
Daniel and Titman(2006)
Robert Stambaughwebsite
GR PROF Gross profitability factor constructedas a long-short portfolio of stockssorted by the ratio of gross profit toassets creates abnormal benchmark-adjusted returns
Novy-Marx (2013) Robert Stambaughwebsite
INV IN ASS Investment-in-assets factor con-structed as a long-short portfolio ofstocks sorted by the annual change ingross property plant and equipmentplus the annual change in inventoriesscaled by lagged book value of theassets
Titman Wei and Xie(2004)
Robert Stambaughwebsite
NetOA Net operating assets factor con-structed as a long-short portfolio ofstocks sorted by net operating assets
Hirshleifer Kewei Teohand Zhang (2004)
Robert Stambaughwebsite
OSCORE Ohlson O-score factor constructed as along-short portfolio of stocks sorted bythe predicted value of a distress mea-sure
Ohlson (1980) Robert Stambaughwebsite
ROA Return-on-assets factor constructed asa long-short portfolio of stocks sortedby their return on assets
Chen Novy-Marx andZhang (2010)
Robert Stambaughwebsite
STOCK ISS Stock issuance factor constructed asa long-short portfolio of stocks sortedby their annual log change in split-adjusted shares outstanding
Ritter (1991) Fama andFrench (2008)
Robert Stambaughwebsite
INTERM CR Innovations to the intermediariesrsquo cap-ital (equity) ratio
He Kelly and Manela(2017)
Asaf Manela website
BAB Betting-against-beta factor con-structed as a portfolio that holdslow-beta assets leveraged to a betaof 1 and that shorts high-beta assetsde-leveraged to a beta of 1
Frazzini and Pedersen(2014)
AQR data library
HML DEVIL A version of the HML factor that relieson the current price level to sort thestocks into long and short legs
Asness and Frazzini(2013)
AQR data library
QMJ Auality-minus-junk factor constructedas a long-short portfolio of stockssorted by the combination of theirsafety profitability growth and thequality of management practices
Asness Frazzini andPedersen (2019)
AQR data library
FIN UNC A measure of financial uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
REAL UNC A measure of real economic uncertainty Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
55
MACRO UNC A measure of macroeconomic uncer-tainty
Jurado Ludvigson andNg (2015) LudvigsonMa and Ng (2019)
Sydney Ludvigsonwebsite
TERM Term spread measured as the differ-ence in 10 year Treasury bonds and fedfunds rate
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DELTA SLOPE Change in the difference between a10-year Treasury bond yield and a 3-month Treasury bill yield
Ferson and Harvey(1991)
FRED-MD database
CREDIT Credit spread measured as the dif-ference between Moodyrsquos BAA corpo-rate bond yields and 10-year Treasurybonds
Chen Ross and Roll(1986) Fama and French(1993)
FRED-MD database
DIV Dividend yield Campbell (1996) FRED-MD databasePE Price-earnings ratio Basu (1977) Ball (1985) FRED-MD database
BW INV SENT Investor sentiment measure aggre-gated from a set of indices orthogonalto macroeconomic fundamentals
Baker and Wurgler(2006)
Dashan Huang web-site
HJTZ INV SENT Investor sentiment measure extractedwith PLS from Baker and Wurgler(2006) proxies
Huang Jiang Tu andZhou (2015)
Dashan Huang web-site
BEH PEAD Short-term behavioral factor reflectingpost-earnings announcement drift
Daniel Hirshleifer andSun (2019)
Kent Daniel website
BEH FIN Long-term behavioral factor predom-inantly capturing the impact of shareissuance and correction
Daniel Hirshleifer andSun (2019)
Kent Daniel website
MKTlowast Market factor with a hedged unpricedcomponent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SMBlowast SMB with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
HMLlowast HML with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
RMWlowast RMW with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
CMAlowast CMA with a hedged unpriced compo-nent
Daniel Mota Rottkeand Santos (2018)
Kent Daniel website
SKEW Systematic skewness factor con-structed as a long-short portfolio ofstocks sorted on the their predictedsystematic skewness rank
Langlois (2019) Hugues Langloiswebsite
NONDUR Nondurable consumption growth (realchain-weighted per capita)
Chen Ross and Roll(1986) Breeden Gib-bons and Litzenberger(1989)
Monthly consump-tion expenditurethe chain-weightedprice index andpopulation data arefrom BEA
SERV Growth rate (real chain-weighted percapita) service expenditure
Breeden Gibbons andLitzenberger (1989)Hall (1978)
Monthly expenditurefor services thechain-weighted priceindex and popula-tion data are fromBEA
UNRATE Unemployment rate Gertler and Grinols(1982)
FRED-MD database
IND PROD Growth rate of industrial production Chan Chen and Hsieh(1985) Chen Ross andRoll (1986)
Industrial Produc-tion Index is fromthe Board of Gover-nors of the FederalReserve System
56
OIL Monthly growth rate of the ProducerPrice index for Crude Petroleum (do-mestic production)
Chen Ross and Roll(1986)
PPI is from US Bu-reau of Labor Statis-tics
The table presents the list of factors used in Section IV2 For each of the variables we present their identificationindex the nature of the factor and the source of data for downloading andor constructing the time series
57
C Additional Tables
Table C1 Tests of risk premia in a correctly specified model with a strong factor
λintercept λHML R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0115 0049 0017 0114 0050 0014 -417 4429200 0101 0055 0012 0096 0047 0010 096 6773
FM 600 0106 0053 0011 0100 0048 0011 3052 84561000 0096 0050 0011 0102 0044 0010 4786 901820000 0102 0042 0012 0100 0049 0007 9620 9943
100 0063 0027 0006 0034 0010 0002 -337 3516200 0107 0055 0012 0080 0037 0005 -299 4906
BFM 600 0095 0050 0007 0080 0035 0007 431 69131000 0092 0054 0008 0092 0047 0012 3291 781820000 0108 0054 0012 0110 0052 0008 9654 9849
Panel B GLS
100 0221 0156 0061 0212 0137 0064 1062 7380200 0153 0088 0024 0144 0088 0024 5923 8571
FM 600 0117 0062 0017 0115 0060 0014 8338 94461000 0106 0053 0011 0112 0052 0011 9003 964920000 0100 0058 0010 0103 0047 0011 9947 9981
100 0151 0088 0026 0149 0086 0026 2642 6569200 0123 0066 0016 0124 0066 0015 4907 7352
BFM 600 0106 0055 0012 0105 0056 0011 7640 87301000 0107 0054 0011 0103 0051 0011 8512 915420000 0098 0057 0009 0100 0051 0013 9915 9950
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in a correctlyspecified model with an intercept and a strong factor Hypothetical true value of R2
adj is 100 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
58
Table C2 Tests of risk premia in a correctly specified model with a useless factor
λintercept λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0087 0033 0008 0019 0003 0000 -419 4844200 0071 0034 0006 0052 0011 0000 -420 5684
FM 600 0077 0031 0005 0131 0037 0000 -422 58501000 0071 0035 0006 0205 0066 0002 -416 612820000 0133 0077 0027 0709 0479 0140 -411 7007
100 0039 0011 0002 0001 0001 0000 -229 -012200 0038 0013 0001 0002 0000 0000 -232 030
BFM 600 0048 0012 0002 0017 0006 0001 -221 1261000 0048 0011 0000 0031 0010 0000 -214 16020000 0007 0000 0000 0103 0051 0014 -176 5793
Panel B GLS
100 0215 0130 0052 0211 0117 0033 -310 3727200 0141 0080 0023 0139 0072 0013 -373 2282
FM 600 0108 0053 0014 0142 0065 0010 -377 21161000 0097 0053 0009 0171 0082 0012 -371 209220000 0108 0047 0010 0621 0559 0372 -259 1697
100 0136 0074 0022 0029 0011 0001 -194 1060200 0112 0060 0014 0012 0004 0000 -239 837
BFM 600 0097 0047 0010 0010 0002 0000 -265 7491000 0096 0047 0009 0010 0002 0000 -276 83220000 0071 0034 0004 0077 0033 0004 -326 766
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a correctly specified model with an intercept and a useless factor The true value of R2 is 0 Fama-MacBeth esti-mates are constructed using OLS (GLS) two-step cross-sectional regressions with standard errors including Shankencorrection Confidence intervals for BFM estimates are constructed using a posterior distribution of Fama-MacBethestimates of λ The last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simula-tions evaluated at the simulation point estimates for FM and its posterior mode for BFM
59
Table C3 Tests of risk premia in a correctly specified model with useless and strong factors
λintercept λHML λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
100 0073 0038 0006 0071 0033 0006 0043 0012 0000 -445 5930200 0073 0035 0006 0090 0045 0008 0028 0006 0000 1083 7484
FM 600 0076 0033 0005 0100 0046 0009 0027 0005 0000 3873 85721000 0073 0034 0006 0090 0046 0009 0026 0005 0000 5415 911320000 0087 0032 0006 0061 0026 0005 0025 0008 0000 9687 9947
100 0039 0013 0001 0023 0007 0001 0002 0001 0000 -197 4174200 0048 0023 0000 0046 0015 0000 0002 0001 0000 003 5372
BFM 600 0054 0025 0003 0062 0026 0006 0002 0000 0000 4056 72821000 0084 0034 0006 0064 0021 0002 0002 0000 0000 5763 808920000 0071 0033 0005 0069 0032 0004 0004 0000 0000 9704 9860
Panel B GLS
100 0193 0129 0043 0205 0133 0043 0180 0121 0031 1405 7480200 0135 0080 0021 0141 0075 0024 0129 0062 0010 6015 8683
FM 600 0101 0054 0010 0109 0052 0009 0091 0037 0004 8331 94231000 0104 0055 0010 0105 0048 0011 0085 0039 0003 8995 963220000 0095 0047 0006 0101 0052 0005 0088 0035 0004 9944 9981
100 0133 0074 0023 0132 0074 0023 0026 0009 0001 2796 6680200 0106 0054 0012 0106 0053 0012 0013 0004 0000 4899 7362
BFM 600 0094 0047 0009 0093 0046 0009 0007 0001 0000 7654 87351000 0090 0045 0010 0091 0044 0007 0005 0001 0000 8530 914620000 0092 0043 0008 0092 0046 0008 0004 0001 0000 9914 9951
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and
λstrong λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor The true value
of the cross-sectional R2adj is 100 Fama-MacBeth estimates are constructed using OLS (GLS) two-step cross-
sectional regressions with standard errors including Shanken correction Confidence intervals for BFM estimates areconstructed using a posterior distribution of Fama-MacBeth estimates of λ The last two columns report the 5th and95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at the simulation point estimates for FMand its posterior mode for BFM
60
Table C4 Tests of risk premia in a misspecified model with a strong factor (N = 55)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0107 0058 0014 0115 0059 0012 -186 1305200 0091 0046 0009 0102 0052 0011 -184 1485600 0104 0052 0013 0109 0060 0014 -166 14951000 0100 0056 0009 0123 0061 0013 -123 135120000 0104 0049 0009 0109 0048 0009 353 803
BFM 100 0017 0004 0000 0002 0000 0000 -155 -067200 0046 0017 0004 0023 0006 0000 -168 273600 0089 0042 0012 0071 0036 0005 -174 8441000 0093 0047 0007 0089 0040 0007 -171 95920000 0101 0048 0011 0099 0042 0008 329 777
Panel B GLS
FM 100 0509 0428 0305 0513 0431 0305 2093 6471200 0295 0206 0102 0274 0187 0091 3999 6657600 0170 0109 0042 0176 0113 0034 5585 71321000 0170 0106 0034 0176 0105 0031 6010 723220000 0145 0082 0020 0141 0073 0022 6917 7182
BFM 100 0265 0181 0086 0262 0186 0079 1872 5919200 0163 0100 0032 0147 0086 0024 3401 5842600 0116 0060 0017 0118 0058 0014 5125 65951000 0120 0064 0016 0121 0063 0014 5689 686520000 0106 0053 0009 0095 0048 0013 6897 7163
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimates of a cross-section of 55 test assets The true valueof the cross-sectional R2
adj is 572 (7071) in OLS (GLS) estimation Fama-MacBeth estimates are constructedusing OLS (GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidenceintervals and their size for BFM estimates are constructed using posterior coverage of Fama-MacBeth estimates of λThe last two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluatedat the simulation point estimates for FM and its posterior mode for BFM The test assets mimic the time series andcross-sectional properties of 25 Fama-French size-value portfolios and 30 industry portfolios while the strong factorproxies the HML factor (all the data available from Ken French website)
61
Table C5 Tests of risk premia in a misspecified model with a useless factor (N = 55)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0095 0050 0013 0084 0031 0003 -186 2080200 0084 0042 0007 0093 0038 0004 -186 1801600 0103 0051 0011 0155 0077 0011 -187 13621000 0101 0051 0009 0226 0131 0033 -187 141720000 0191 0126 0050 0716 0650 0460 -187 988
BFM 100 0013 0002 0000 0000 0000 0000 -105 -047200 0039 0015 0001 0002 0000 0000 -128 -043600 0076 0040 0006 0004 0000 0000 -144 -0491000 0082 0035 0004 0022 0009 0000 -150 -02620000 0129 0059 0005 0078 0037 0008 -168 -020
Panel B GLS
FM 100 0492 0411 0287 0540 0468 0302 -134 2318200 0267 0196 0088 0364 0274 0153 -159 1359600 0166 0101 0035 0408 0323 0198 -162 10091000 0154 0089 0028 0475 0401 0266 -155 86820000 0155 0100 0044 0806 0772 0705 -068 668
BFM 100 0259 0180 0085 01695 0105 0041 -014 1285200 0162 0094 0030 0065 0027 0005 -089 615600 0108 0058 0013 0056 0022 0003 -127 5311000 0112 0057 0014 0064 0021 0004 -138 47920000 0051 0020 0001 0093 0049 0011 -072 172
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 55 portfolios Thetrue value of the cross-sectional R2 is zero The test assets mimic the time series and cross-sectional properties of 25Fama-French size-value portfolios and 30 industry portfolios
62
Table C6 Tests of risk premia in a misspecified model with useless and strong factors (N = 55)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 100 0097 0052 0014 0099 0045 0009 0108 0041 0006 -318 2477200 0085 0041 0006 0090 0049 0011 0117 0051 0008 -298 2350600 0095 0045 0013 0108 0060 0012 0191 0100 0015 -212 21211000 0083 0043 0006 0125 0072 0016 0258 0171 0047 -155 213720000 0109 0055 0012 0146 0086 0038 0766 0704 0535 267 1778
BFM 100 0014 0002 0000 0000 0000 0000 0000 0000 0000 -141 161200 0037 0014 0001 0017 0005 0000 0002 0000 0000 -203 726600 0069 0031 0005 0058 0027 0002 0007 0000 0000 -227 10961000 0064 0024 0003 0069 0027 0005 0026 0008 0001 -209 117320000 0028 0006 0000 0068 0033 0004 0080 0042 0009 262 873
Panel B GLS
FM 100 0488 0406 0282 0495 0411 0281 0537 0465 0306 2201 6613200 0263 0194 0088 0252 0167 0077 0359 0279 0151 4047 6707600 0157 0094 0032 0161 0093 0027 0398 0313 0199 5581 71591000 0146 0087 0029 0144 0090 0026 0475 0393 0251 5994 725020000 0140 0094 0035 0134 0089 0041 0789 0747 0667 6888 7237
BFM 100 0253 0182 0081 0260 0176 0077 0168 0104 0039 2036 6007200 0156 0092 0028 0140 0082 0021 0063 0029 0004 3451 5844600 0105 0058 0013 0108 0049 0011 0054 0024 0002 5125 66141000 0105 0054 0015 0111 0056 0009 0057 0024 0003 5678 687320000 0051 0019 0001 0045 0018 0001 0093 0045 0013 6873 7157
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
and λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor estimated on a cross-section
of 55 portfolios The true value of the cross-sectional R2adj is 572 (7071) in OLS (GLS) estimation The test
assets mimic the time series and cross-sectional properties of 25 Fama-French size-value portfolios and 30 industryportfolios while the strong factor proxies the HML factor
63
Table C7 Tests of risk premia in a misspecified model with a strong factor (N = 100)
λc λstrong R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0044 0010 0109 0055 0012 -098 1310600 0102 0052 0013 0116 0057 0019 -042 12311000 0099 0056 0011 0117 0060 0014 031 114520000 0099 0044 0010 0116 0068 0017 431 748
BFM 200 0016 0004 0000 0006 0001 0000 -082 074600 0076 0034 0006 0060 0028 0005 -086 7901000 0081 0046 0007 0076 0037 0007 -072 85920000 0097 0044 0011 0106 0057 0014 418 742
Panel B GLS
FM 200 0495 0411 0287 0481 0403 0268 2938 5885600 0255 0169 0077 0257 0178 0073 4736 62091000 0234 0149 0059 0205 0129 0049 5198 635220000 0166 0100 0035 0170 0097 0036 6142 6398
BFM 200 0249 0163 0070 0233 0158 0073 2567 5296600 0132 0082 0023 0137 0077 0020 4287 56771000 0125 0073 0021 0112 0060 0014 4849 597020000 0101 0056 0012 0098 0053 0013 6114 6375
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for the pseudo-true values of λlowast
i in amisspecified model with an intercept and a strong factor estimated on a cross-section of 100 portfolios The truevalue of R2
adj is 585 (6297) for the OLS (GLS) estimation Fama-MacBeth estimates are constructed using OLS(GLS) two-step cross-sectional regressions with standard errors including Shanken correction Confidence intervalsand their size for BFM estimates are constructed using a posterior distribution of Fama-MacBeth estimates of λ Thelast two columns report the 5th and 95th percentiles of cross-sectional R2
adj across 1000 simulations evaluated at thesimulation point estimates for FM and its posterior mode for BFM The simulations design follows the methodologydescribed in Section III with the test assets mimicking the composite cross-section of 25 Fama-French size-BMportfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentum portfolios as well as 10long-term reversal portfolios (all available from Ken French website) A strong factor mimics the behavior of HML
64
Table C8 Tests of risk premia in a misspecified model with a useless factor (N = 100)
λc λuseless R2adj
T 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0091 0041 0009 0147 0079 0015 -101 1481600 0110 0062 0010 0280 0180 0064 -100 13551000 0096 0053 0007 0341 0248 0101 -100 122220000 0221 0151 0061 0805 0759 0643 -100 1220
BFM 200 0014 0003 0000 0000 0000 0000 -050 002600 0070 0030 0003 0014 0002 0000 -066 0261000 0073 0034 0005 0026 0010 0001 -071 02720000 0097 0035 0003 0091 0045 0008 -081 109
Panel B GLS
FM 200 0472 0397 0272 0530 0454 0320 -072 1495600 0230 0149 0064 0458 0363 0235 -052 9691000 0196 0122 0048 0487 0407 0275 -037 86120000 0106 0063 0021 0760 0717 0640 171 615
BFM 200 0244 0161 0066 0150 0092 0031 -016 769600 0129 0073 0021 0078 0035 0008 -055 6761000 0118 0066 0019 0070 0033 0006 -060 62420000 0076 0034 0005 0093 0049 0009 172 399
The table shows the frequency of rejecting the null hypothesisH0 λi = λlowasti for pseudo-true value of λc and λlowast
useless = 0in a misspecified model with an intercept and a useless factor estimated on a cross-section of 100 portfolios The truevalue of R2 is zero The test assets mimic the time series and cross-sectional properties of composite cross-section of25 Fama-French size-BM portfolios 30 industry portfolios 25 profitability and investment portfolios 10 momentumportfolios as well as 10 long-term reversal portfolios while the strong factor mimics the behavior of HML (all thedata is available from Ken French website)
Table C9 Tests of risk premia in a misspecified model with useless and strong factors (N = 100)
λc λstrong λuseless R2adj
T 10 5 1 10 5 1 10 5 1 5th 95th
Panel A OLS
FM 200 0088 0043 0009 0098 0045 0008 0187 0104 0026 -124 2118600 0103 0056 0011 0104 0064 0015 0319 0217 0081 -020 19571000 0105 0049 0009 0115 0058 0013 0389 0284 0134 070 182620000 0102 0057 0014 0140 0075 0028 0823 0782 0684 400 1766
BFM 200 0012 0003 0000 0005 0000 0000 0000 0000 0000 -051 541600 0062 0025 0003 0046 0021 0002 0016 0004 0000 -063 10771000 0062 0029 0005 0057 0025 0003 0029 0008 0001 -005 107820000 0026 0008 0001 0042 0015 0000 0089 0039 0011 399 818
Panel B GLS
FM 200 0467 0392 0268 0471 0383 0254 0523 0454 0315 2971 5943600 0227 0149 0059 0228 0154 0060 0455 0363 0227 4745 62221000 0191 0118 0046 0178 0114 0036 0480 0401 0262 5207 636820000 0098 0054 0020 0095 0062 0024 0749 0707 0621 6121 6416
BFM 200 0243 0160 0066 0230 0155 0072 0152 0087 0031 2613 5350600 0126 0072 0022 0132 0071 0017 0074 0033 0007 4268 56921000 0117 0067 0017 0109 0057 0013 0068 0034 0006 4864 597920000 0068 0032 0002 0066 0034 0006 0087 0047 0009 6102 6371
The table shows the frequency of rejecting the null hypothesis H0 λi = λlowasti for pseudo-true values of λc and λstrong
λlowastuseless equiv 0 in a misspecified model with an intercept a strong and a useless factor on the cross-section of 100
portfolios The true value of R2adj is 585 (6297) in OLS (GLS) estimation The test assets mimic the time series
and cross-sectional properties of composite cross-section of 25 Fama-French size-BM portfolios 30 industry portfolios25 profitability and investment portfolios 10 momentum portfolios as well as 10 long-term reversal portfolios whilethe strong factor mimics the behavior of HML (all the data is available from Ken French website)
65
Table C10 The probability of retaining risk factors using BF
T 55 57 59 61 63 65
Panel A strong factors
Jeffreys Prior fstrong 200 0860 0845 0830 0812 0792 0771600 0987 0985 0985 0983 0981 09791000 0998 0998 0997 0996 0996 0995
Spike-and-Slab Prior fstrong 200 0749 0718 0685 0654 0618 0586600 0982 0979 0978 0972 0970 09641000 0996 0996 0996 0995 0994 0992
Panel B useless factors
Jeffreys Prior fuseless 200 1000 0995 0982 0940 0862 0726600 1000 0999 0998 0988 0971 09201000 1000 1000 1000 0999 0990 0971
Spike-and-Slab Prior fuseless 200 0419 0248 0149 0083 0040 0022600 0119 0062 0028 0012 0007 00041000 0083 0028 0011 0001 0000 0000
Panel C strong and useless factors
Jeffreys Prior fstrong 200 0928 0912 0891 0878 0860 0838600 0994 0994 0992 0991 0991 09891000 0999 0999 0999 0999 0999 0999
fuseless 200 0955 0894 0788 0642 0489 0360600 0957 0895 0764 0618 0461 03541000 0957 0893 0787 0645 0483 0357
Spike-and-Slab Prior fstrong 200 0753 0725 0697 0666 0629 0592600 0982 0980 0979 0972 0970 09611000 0996 0996 0996 0995 0994 0994
fuseless 200 0283 0154 0085 0044 0023 0011600 0077 0031 0006 0002 0000 00001000 0053 0008 0000 0000 0000 0000
The table shows the frequency of retaining risk factors for different choice sets across 1000 simulations of differentsize (T=200 600 and 1000) In Panel A the candidate risk factor is truly cross-sectionally priced and stronglyidentified while in Panel B they are not Panel C summarizes the case of using both strong and useless candidatefactors in the model A candidate factor is retained in the model if its marginal posterior probability p(γi = 1|data)is greater than a certain threshold ie 55 57 59 61 63 and 65
66
Table C11 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 1421 1769 1404 -091 1461[0627 2215] [-435 6661] [0595 2256] [-403 5211]
MKT -0647 -0631[-1429 0135] [-1458 0163]
Fama and French (1992) Intercept 1273 6071 1249 5604 5566[0730 1816] [2000 8514] [0682 1834] [4275 6946]
MKT -0686 -0664[-1227 -0145] [-1242 -0102]
SMB 0140 0140[0075 0205] [0073 0205]
HML 0380 0379[0322 0438] [0319 0439]
Quality-minus-junk Intercept 0626 7442 0648 6947 6879Asness Frazzini and Pedersen (2014) [-0048 1300] [4480 9520] [-0055 1308] [5548 7946]
QMJ 0369 0358[0148 0589] [0130 0591]
MKT -0097 -0116[-0763 0568] [-0772 0580]
SMB 0209 0206[0157 0260] [0151 0262]
HML 0338 0340[0288 0388] [0289 0392]
Panel B GLS
CAPM Intercept 1362 4805 1323 4706 4265[0941 1783] [2696 10000] [0846 1802] [1411 6530]
MKT -0788 -0749[-1206 -0369] [-1227 -0287]
Fama and French (1992) Intercept 1397 8849 1353 8543 8543[0924 1870] [8286 9771] [0846 1878] [8003 8989]
MKT -0827 -0783[-1298 -0356] [-1311 -0283]
SMB 0179 0178[0145 0213] [0142 0215]
HML 0348 0350[0312 0385] [0310 0391]
Quality-minus-junk Intercept 0966 8948 0982 8736 8642Asness Frazzini and Pedersen (2014) [0395 1537] [8320 9640] [0383 1616] [8120 9090]
QMJ 0378 0359[0204 0552] [0172 0539]
MKT -0425 -0438[-0984 0133] [-1052 0157]
SMB 0197 0196[0160 0234] [0156 0235]
HML 0359 0359[0321 0398] [0320 0399]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French sizevalue portfolios Each model is estimated viaOLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
67
Table C12 Two-pass regressions with tradable factors 25 Fama-French portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1444 662 1814 -202 545[0155 2732] [-435 9374] [0106 3664] [-423 4175]
∆Cnd 0383 0254[-0221 0988] [-0462 0956]
Scaled CAPM Intercept 4489 1617 3731 1739 2565[1479 7499] [-1429 6114] [-0268 7514] [-391 5783]
cay 1141 0563[-0198 2480] [-2262 3085]
MKT -2119 -1444[-4890 0653] [-5241 2643]
MKT times cay -8554 -5188[-19227 2119] [-17911 9415]
Scaled HC-CAPM Intercept 4405 1386 3343 3895 3659[1182 7629] [-2632 5453] [-0500 7182] [-006 6630]
cay 1038 0552[-0444 2520] [-1957 2848]
∆Y 0437 0034[-0421 1295] [-0995 0928]
MKT -2049 -1148[-5031 0933] [-4946 2772]
∆Y times cay 1428 0724[-1047 3903] [-3458 4645]
MKT times cay -7782 -4127[-18579 3015] [-17926 10532]
Panel B GLS
CCAPM Intercept 2274 -251 2336 -303 -056[1386 3162] [-435 9896] [1360 3266] [-403 1109]
∆Cnd 0139 0088[-0114 0391] [-0222 0383]
Scaled CAPM Intercept 2623 4949 2668 5626 4567[0794 4451] [1657 7829] [0859 4376] [439 7146]
cay 0362 0205[-0759 1483] [-0853 1277]
MKT -0596 -0642[-2412 1220] [-2325 1149]
MKT times cay 0644 0310[-5695 6984] [-5988 6529]
Scaled HC-CAPM Intercept 2486 5014 2604 5832 4698[0510 4463] [1158 7726] [0801 4372] [558 7371]
cay 0361 0199[-0902 1623] [-0879 1269]
∆Y -0415 -0242[-0878 0048] [-0672 0157]
MKT -0479 -0588[-2441 1482] [-2357 1241]
∆Y times cay 0395 0150[-1761 2552] [-1718 2036]
MKT times cay 1158 0477[-5888 8205] [-5890 6968]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French sizevalue portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructed basedon the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructed as inLewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we provide theposterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and 99level respectively
68
Table C13 Two-pass regressions with tradable factors 25 Fama-French + 17 industry portfolios
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Carhart (1997) Intercept 0755 4720 0780 3835 3769
[0167 1342] [803 8559] [0158 1395] [2222 5427]
MKT -0162 -0188
[-0759 0435] [-0808 0432]
SMB 0144 0142
[0082 0206] [0078 0207]
HML 0320 0316
[0251 0388] [0245 0388]
UMD 0759 0673
[-0360 1878] [-0476 1802]
q-factor model Intercept 0744 4505 0761 3647 3600
Hou Xue and Zhang (2015) [0244 1243] [138 8338] [0239 1287] [1711 5508]
ROE 0173 0156
[-0139 0485] [-0154 0481]
IA 0287 0279
[0117 0456] [0117 0455]
ME 0230 0221
[0135 0325] [0121 0320]
MKT -0199 -0211
[-0703 0304] [-0741 0311]
Liquidity Factor Intercept 0958 277 0963 -124 620
Pastor and Stambaugh (2003) [0417 1499] [-513 4113] [0377 1527] [-435 3340]
LIQ 0210 0153
[-1001 1421] [-1077 1445]
MKT -0274 -0281
[-0822 0274] [-0851 0340]
Panel B GLS
Carhart (1997) Intercept 0839 8826 0909 8486 8397
[0467 1210] [7562 9557] [0487 1339] [7895 8812]
MKT -0235 -0309
[-0610 0140] [-0737 0120]
SMB 0162 0161
[0134 0190] [0130 0193]
HML 0351 0350
[0316 0386] [0312 0390]
UMD 1426 1184
[0832 2020] [0541 1861]
q-factor model Intercept 1107 4289 1103 3742 3631
Hou Xue and Zhang (2015) [0761 1454] [027 7341] [0714 1486] [2022 5216]
ROE 0270 0247
[0057 0482] [0008 0502]
IA 0277 0263
[0134 0420] [0107 0428]
ME 0221 0216
[0156 0287] [0144 0288]
MKT -0524 -0520
[-0869 -0179] [-0900 -0138]
Liquidity Factor Intercept 1186 2897 1159 1811 2373
Pastor and Stambaugh (2003) [0874 1497] [-197 7687] [0803 1519] [438 5253]
LIQ 0089 0065
[-0567 0744] [-0696 0871]
MKT -0598 -0571
[-0909 -0288] [-0936 -0212]
69
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CAPM Intercept 0966 467 0957 -113 306
[0427 1506] [-250 5490] [0382 1522] [-246 2863]
MKT -0277 -0269
[-0823 0269] [-0838 0311]
Fama-French (1992) Intercept 1016 4330 1002 3425 3370
[0540 1491] [182 8166] [0517 1500] [2194 4614]
MKT -0445 -0430
[-0916 0026] [-0912 0068]
SMB 0147 0144
[0086 0208] [0077 0208]
HML 0316 0313
[0248 0383] [0242 0383]
Quality-minus-junk Intercept 0836 4931 0836 3861 3934
Asness et all (2019) [0323 1350] [914 8449] [0314 1365] [2459 5577]
QMJ 0153 0147
[-0065 0372] [-0066 0373]
MKT -0293 -0289
[-0796 0210] [-0814 0233]
SMB 0179 0175
[0114 0243] [0105 0240]
HML 0302 0300
[0234 0369] [0230 0373]
Panel B GLS
CAPM Intercept 1192 3060 1170 2602 2573
[0884 1500] [058 7848] [0815 1517] [600 5118]
MKT -0604 -0583
[-0911 -0298] [-0923 -0227]
Fama-French (1992) Intercept 1182 8616 1150 8272 8234
[0853 1510] [7303 9353] [0788 1519] [7736 8662]
MKT -0596 -0564
[-0923 -0268] [-0937 -0198]
SMB 0160 0160
[0132 0187] [0129 0191]
HML 0349 0348
[0314 0384] [0310 0386]
Quality-minus-junk Intercept 0970 8674 0977 8227 8276
Asness et all (2019) [0606 1334] [7451 9335] [0561 1369] [7759 8693]
QMJ 0293 0278
[0159 0427] [0125 0439]
MKT -0393 -0399
[-0754 -0033] [-0791 0012]
SMB 0166 0165
[0138 0194] [0134 0196]
HML 0353 0352
[0318 0388] [0315 0392]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with tradable risk factorson a cross-section of monthly excess returns for 25 Fama-French and 17 industry portfolios Each model is estimatedvia OLS and GLS We report point estimates and 5 confidence intervals for risk premia which are constructedbased on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence level constructedas in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimation we providethe posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode and median of thecross-sectional R2 as well as its (5 95) credible intervals and denote significance at the 90 95 and99 level respectively
70
Table C14 Two-pass regressions with nontradable factors 25 Fama-French + 17 industry port-folios
Fama-MacBeth Bayesian Estimation
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
CCAPM Intercept 1817 356 1975 -124 209
[0905 2729] [-250 6208] [0795 3135] [-245 2680]
∆Cnd 0189 0123
[-0216 0594] [-0285 0502]
Scaled CAPM Intercept 2141 1528 2344 298 1306
[0529 3754] [-789 9568] [0692 3944] [-451 4069]
cay 1408 0574
[-0182 2999] [-0935 1939]
MKT 0108 -0142
[-1471 1686] [-1747 1499]
cay timesMKT -2554 -2159
[-8811 3702] [-8813 4620]
Scaled CCAPM Intercept 1607 1709 1983 95 1468
[0394 2820] [-789 10000] [0761 3180] [-372 4189]
cay 1137 0511
[-0394 2669] [-0871 1865]
∆Cnd 0452 0175
[-0103 1007] [-0261 0598]
cay times∆Cnd -0102 0030
[-1752 1547] [-1175 1292]
HC-CAPM Intercept 2185 611 2221 -147 644
[0720 3650] [-513 6005] [0667 3735] [-429 3093]
∆Y 0398 0201
[-0011 0808] [-0290 0667]
MKT 0123 0050
[-1392 1638] [-1513 1650]
Panel B GLS
CCAPM Intercept 2351 264 2442 -057 218
[1700 3003] [-250 9795] [1612 3258] [-204 1296]
∆Cnd 0211 0132
[0034 0387] [-0081 0351]
Scaled CAPM Intercept 2516 3951 2653 4332 3470
[1467 3566] [1045 7411] [1616 3705] [360 6539]
cay 0783 0424
[0056 1511] [-0311 1119]
MKT -0504 -0639
[-1548 0541] [-1674 0406]
cay timesMKT 3785 2327
[-0412 7981] [-2053 6791]
Scaled CCAPM Intercept 2244 331 2378 296 356
[1378 3109] [-789 8166] [1539 3237] [-476 1690]
cay 0742 0425
[-0006 1489] [-0316 1104]
∆Cnd 0160 0119
[-0078 0398] [-0116 0342]
cay times∆Cnd 0355 0190
[-0343 1053] [-0455 0840]
HC-CAPM Intercept 2908 3810 2808 4454 3356
[2053 3762] [854 7372] [1809 3826] [194 6591]
∆Y -0155 -0089
[-0381 0072] [-0357 0174]
MKT -0892 -0793
[-1742 -0043] [-1803 0208]
71
FM BFM
Model Factors λj R2adj λj R2
adjmode R2adjmedian
Panel A OLS
Scaled HC-CAPM Intercept 2099 1372 2243 1599 2186
[0549 3649] [-1389 4875] [0491 3955] [-121 4771]
cay 1124 0529
[-0326 2574] [-1066 1981]
∆Y 0088 0106
[-0314 0489] [-0399 0599]
MKT 0169 -0068
[-1340 1679] [-1805 1744]
cay times∆Y 1277 0579
[-1361 3914] [-1930 2919]
cay timesMKT -2125 -1605
[-8058 3808] [-8349 5354]
Durable CCAPM Intercept 2281 2316 2220 1423 1459
[0025 4537] [-789 10000] [0403 4028] [-359 4178]
∆Cnd 0544 0194
[0054 1034] [-0163 0525]
∆Cd 0481 0104
[-0101 1062] [-0353 0531]
MKT -0102 -0063
[-2466 2262] [-1948 1863]
Panel B GLS
Scaled HC-CAPM Intercept 2662 3848 2698 4312 3502
[1626 3698] [433 7267] [1555 3844] [231 6466]
cay 0726 0416
[-0029 1481] [-0324 1146]
∆Y -0165 -0088
[-0440 0110] [-0372 0198]
MKT -0652 -0685
[-1684 0379] [-1816 0438]
cay times∆Y 0681 0333
[-0648 2009] [-1017 1616]
cay timesMKT 3266 2123
[-0950 7482] [-2173 6502]
Durable CCAPM Intercept 1958 2926 1994 412 2558
[0634 3281] [-789 6763] [0831 3125] [-269 6336]
∆Cnd 0164 0065
[-0065 0394] [-0126 0260]
∆Cd 0289 0123
[-0011 0589] [-0117 0377]
MKT 0002 -0026
[-1322 1326] [-1165 1125]
The table summarises risk premia estimates and cross-sectional fit for a selection of models with nontradable riskfactors on a cross-section of quarterly excess returns for 25 Fama-French and 17 industry portfolios Each modelis estimated via OLS and GLS We report point estimates and 5 confidence intervals for risk premia which areconstructed based on the asymptotic normal distribution and cross-sectional R2 and its (5 95) confidence levelconstructed as in Lewellen Nagel and Shanken (2010) for FM estimation In Bayesian Fama-MacBeth estimationwe provide the posterior mean of λ denoted by λj its (25 975) credible intervals the posterior mode andmedian of the cross-sectional R2 as well as its (5 95) credible intervals and denote significance at the90 95 and 99 level respectively
72
Table C15 Posterior factor probabilities and risk premia of 26 million sparse models
Pr [γj = 1|data] E [λj |data]ψ ψ
factors 1 5 10 20 50 100 1 5 10 20 50 100
HML 0455 0702 0720 0695 0646 0614 0106 0231 0249 0245 0229 0219MKTlowast 0274 0513 0594 0658 0700 0702 0050 0211 0321 0440 0553 0591MKT 0086 0178 0311 0446 0550 0576 0009 0067 0163 0286 0408 0454SMBlowast 0246 0350 0317 0264 0210 0188 0039 0113 0120 0110 0095 0089PERF 0136 0160 0147 0125 0098 0083 -0020 -0050 -0053 -0049 -0041 -0036STOCK ISS 0099 0117 0125 0123 0110 0099 -0008 -0029 -0042 -0050 -0053 -0051COMP ISSUE 0097 0101 0104 0103 0096 0088 0010 0029 0041 0051 0059 0059CMA 0111 0114 0104 0091 0075 0066 0008 0018 0019 0019 0017 0016ROA 0089 0079 0083 0094 0102 0097 0010 0023 0034 0051 0068 0070UMD 0091 0097 0097 0090 0078 0071 0006 0020 0026 0029 0030 0030BEH PEAD 0081 0069 0069 0074 0089 0108 0001 0002 0004 0007 0016 0030STRev 0078 0064 0063 0067 0086 0112 0000 0002 0003 0007 0022 0048NONDUR 0079 0065 0064 0067 0079 0097 0000 0000 0001 0001 0003 0006DISSTR 0085 0079 0076 0070 0060 0053 0010 0030 0039 0043 0042 0040MGMT 0090 0083 0077 0068 0055 0047 0007 0017 0019 0019 0017 0015BW ISENT 0079 0064 0062 0063 0069 0077 0000 0000 0000 0001 0002 0004TERM 0078 0063 0061 0062 0069 0078 0000 0000 0001 0001 0004 0007FIN UNC 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0000LIQ TR 0078 0063 0060 0061 0066 0075 -0000 0000 0001 0002 0007 0015DeltaSLOPE 0078 0063 0061 0061 0066 0073 -0000 -0000 -0000 -0000 -0000 -0001IPGrowth 0078 0063 0061 0061 0066 0072 -0000 -0000 -0000 -0000 -0001 -0002Oil 0078 0063 0061 0061 0066 0072 -0000 0001 0001 0002 0004 0009SERV 0078 0063 0061 0061 0065 0072 -0000 -0000 -0000 -0000 -0000 -0000DEFAULT 0078 0063 0060 0060 0064 0069 -0000 -0000 0000 0000 0000 0000NetOA 0079 0063 0061 0062 0065 0065 0001 0003 0004 0007 0014 0018DIV 0078 0063 0060 0060 0064 0069 0000 -0000 -0000 -0000 -0000 -0000PE 0078 0063 0060 0060 0064 0068 -0000 -0001 -0001 -0002 -0004 -0008REAL UNC 0078 0063 0060 0060 0064 0067 0000 0000 0000 0000 0000 0000HJTZ ISENT 0078 0063 0060 0060 0064 0068 -0000 -0000 -0000 -0000 -0000 -0001UNRATE 0078 0063 0060 0060 0064 0068 0000 -0000 -0000 -0000 -0001 -0002INTERM CAP RATIO 0084 0065 0061 0062 0062 0059 0009 0013 0018 0027 0038 0041LIQ NT 0079 0063 0060 0060 0063 0066 -0001 -0001 -0002 -0002 -0003 -0004QMJ 0113 0090 0069 0051 0035 0028 0012 0016 0014 0010 0007 0005MACRO UNC 0078 0063 0060 0059 0060 0061 0000 0000 0000 0000 0000 0000INV IN ASS 0078 0062 0059 0059 0060 0061 0000 0000 0001 0001 0003 0006ACCR 0081 0067 0062 0059 0056 0055 -0002 -0005 -0006 -0006 -0005 -0004LTRev 0078 0064 0063 0062 0059 0053 -0001 -0005 -0008 -0012 -0016 -0017CMAlowast 0079 0062 0059 0058 0059 0058 0000 0000 -0000 -0001 -0002 -0002ASS Growth 0078 0060 0056 0055 0055 0052 -0001 -0001 -0001 -0004 -0008 -0011HMLlowast 0079 0062 0058 0055 0053 0049 -0001 -0001 -0001 -0001 -0001 -0001RMWlowast 0077 0058 0052 0047 0040 0035 0000 0001 0001 0002 0002 0002BAB 0079 0059 0051 0045 0039 0034 -0003 -0003 -0003 -0001 0001 0003IA 0083 0060 0051 0043 0035 0030 -0003 -0004 -0004 -0003 -0003 -0003O SCORE 0077 0056 0047 0040 0032 0027 -0003 -0004 -0005 -0005 -0005 -0005ROE 0076 0055 0047 0040 0032 0027 0001 0004 0005 0005 0004 0003SMB 0039 0024 0027 0042 0067 0077 0002 0002 0004 0007 0014 0017SKEW 0090 0062 0047 0034 0024 0019 -0000 -0000 -0000 -0000 -0000 -0000BEH FIN 0079 0051 0040 0031 0023 0019 -0006 -0005 -0003 -0001 -0000 -0000GR PROF 0077 0047 0038 0031 0025 0020 0004 0003 0003 0003 0003 0003RMW 0073 0045 0036 0030 0023 0018 0003 0004 0004 0003 0003 0003HML DEVIL 0063 0036 0027 0020 0015 0012 0002 0003 0002 0001 0000 0000
Posterior probabilities of factors Pr [γj = 1|data] and posterior mean of factor risk premia E [λj |data] computedusing the Dirac spike and slab approach of section II22 51 factors and all possible models with up to 5 factorsyielding about 26 million candidate models The prior probability of a factor being included is about 1038 Thedata is monthly 197310 to 201612 Test assets cross-section of 25 Fama-French size and book-to-market and 30Industry portfolios The 51 factors considered are described in Table B1 of Appendix B
73
Table C16 Factor models with highest posterior probability (Dirac spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232SMB 983232 983232 983232STRevBW ISENTLIQ TRUNRATENONDURTERMCOMP ISSUE 983232 983232OilBEH PEAD 983232DeltaSLOPEINV IN ASSIPGrowthDEFAULTUMD 983232ROA 983232 983232 983232 983232 983232 983232 983232REAL UNCPECMA 983232ACCRSERVSTOCK ISS 983232 983232 983232DIVMACRO UNCFIN UNCLIQ NTCMANetOAHJTZ ISENTLTRevRMWHMLINTERM CAP RATIOASS GrowthPERF 983232IABABDISSTR 983232ROEMGMTO SCOREQMJBEH FINGR PROFSKEWRMWHML DEVILSMBProbability () 01080 00964 00771 00710 00709 00668 00661 00624 00600 00560
Factors and posterior model probabilities of ten most likely specifications computed using the Dirac spike and slabapproach of section II22 ψ = 20 51 factors and all possible models with up to 5 factors yielding about 26 millionmodels and a model prior probability of the order of 10minus7 Specifications organised by columns with the symbol 983232indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612 Test assetscross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factors considered aredescribed in Table B1 of Appendix B
74
Table C17 Factor Models with highest posterior probability (continuous spike-and-slab ψ = 10)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232 983232 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232NONDUR 983232 983232 983232 983232SERV 983232 983232 983232LIQ TR 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232BW ISENT 983232 983232DEFAULT 983232 983232 983232 983232 983232 983232Oil 983232 983232 983232DIV 983232 983232 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232 983232UNRATE 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232 983232CMAlowast 983232 983232 983232 983232 983232 983232LIQ NT 983232 983232 983232 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232NetOA 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232 983232 983232 983232INV IN ASS 983232 983232 983232COMP ISSUE 983232 983232 983232 983232 983232ACCR 983232 983232 983232 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232 983232 983232 983232 983232 983232INTERM CAP RATIO 983232 983232 983232 983232 983232 983232ASS Growth 983232 983232 983232 983232 983232 983232DISSTR 983232 983232 983232 983232PERF 983232 983232 983232BAB 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232 983232ROE 983232 983232 983232O SCORE 983232BEH FIN 983232 983232 983232 983232 983232GR PROF 983232 983232 983232 983232 983232QMJ 983232 983232RMW 983232SKEW 983232 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00111 00111 00111 00100 00100 00100 00100 00100 00100 00100
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
75
Table C18 Factor models with highest posterior probability (Continuous spike-and-slab ψ = 20)
modelfactor 1 2 3 4 5 6 7 8 9 10
HML 983232 983232 983232 983232 983232 983232 983232 983232 983232MKTlowast 983232 983232 983232 983232 983232 983232 983232MKT 983232 983232 983232 983232 983232 983232 983232 983232SMBlowast 983232 983232 983232STRev 983232 983232 983232 983232 983232 983232 983232 983232IPGrowth 983232 983232 983232BEH PEAD 983232 983232 983232 983232 983232 983232NONDUR 983232 983232 983232SERV 983232 983232 983232 983232 983232 983232LIQ TR 983232 983232 983232 983232 983232PE 983232 983232 983232 983232 983232 983232 983232DeltaSLOPE 983232 983232 983232 983232 983232 983232BW ISENT 983232 983232 983232 983232 983232DEFAULT 983232 983232 983232 983232 983232Oil 983232 983232DIV 983232 983232 983232 983232REAL UNC 983232 983232 983232 983232 983232UNRATE 983232 983232 983232 983232 983232TERM 983232 983232 983232 983232HJTZ ISENT 983232 983232 983232 983232 983232 983232 983232 983232UMD 983232 983232 983232 983232CMAlowast 983232 983232LIQ NT 983232 983232 983232 983232 983232FIN UNC 983232 983232 983232 983232 983232 983232STOCK ISS 983232 983232 983232 983232 983232NetOA 983232 983232 983232 983232 983232 983232 983232MACRO UNC 983232 983232INV IN ASS 983232 983232 983232 983232COMP ISSUE 983232 983232 983232 983232ACCR 983232 983232 983232HMLlowast 983232 983232 983232 983232 983232 983232 983232ROA 983232 983232 983232 983232 983232 983232RMWlowast 983232 983232 983232 983232 983232CMA 983232 983232 983232 983232 983232 983232LTRev 983232 983232 983232INTERM CAP RATIO 983232 983232 983232ASS Growth 983232 983232 983232 983232DISSTR 983232 983232 983232 983232 983232PERF 983232BAB 983232 983232 983232 983232IA 983232 983232 983232 983232MGMT 983232 983232 983232ROE 983232 983232 983232 983232 983232O SCORE 983232 983232 983232 983232 983232BEH FIN 983232GR PROF 983232 983232 983232QMJ 983232 983232RMW 983232 983232SKEW 983232 983232SMB 983232HML DEVIL 983232 983232Probability () 00133 00122 00111 00111 00100 00100 00100 00100 00100 00089
Factors and posterior model probabilities of ten most likely specifications computed using the continuous spike andslab approach of section II23 ψ = 10 51 factors and all possible models with up to 5 factors yielding about 225quadrillion models and a model prior probability of the order of 10minus16 Specifications organised by columns with thesymbol 983232 indicating that the factor in the corresponding row is included The data is monthly 197310 to 201612Test assets cross-section of 25 Fama-French size and book-to-market and 30 Industry portfolios The 51 factorsconsidered are described in Table B1 of Appendix B
76
D Additional Figures
Panel A OLS
00 01 02 03 04 05
02
46
810
λstrong
Density
BFMFM
(a) Strong factor
minus2 minus1 0 1 2
00
02
04
06
08
10
λuseless
Density
BFMFM
(b) Useless factor
Panel B GLS
025 030 035
05
1015
2025
30
λstrong
Density
BFMFM
(c) Strong factor
minus20 minus15 minus10 minus05 00 05 10
00
04
08
12
λuseless
Density
BFMFM
(d) Useless factor
Figure D1 Posterior distribution of the risk premia estimates in a misspecified model that includesboth strong and irrelevant factors
The graph presents posterior distribution of risk premia estimates for a misspecified model with both strong anduseless factors in one representative simulation Panels (a) and (b) display posterior distribution of the BFM-OLSestimates of risk premia along with the frequentist distribution implied by the point estimates and standard errorsPanels (c) and (d) report the same objects for GLS T =1000
77
0 5 10
00
04
08
12
Parameter
Den
sity
PriorLikelihood ILikelihood IILikelihood IIIPosterior IPosterior IIPosterior III
Figure D2 Posterior distributions with heavy-tailed likelihood and thin-tailed prior
Posterior shapes generated by a thin-tails (Gaussian) prior and a heavy-tails (Cauchy) likelihood as the likelihood
peak moves away from the prior mean Posteriors I II and III are generated respectively by the likelihood I II and
III
78