Management Science, Vol. 57, No. 10, October 2011, pp. 1752–1770. ISSN 0025-1909, EISSN 1526-5501. http://dx.doi.org/10.1287/mnsc.1110.1377
© 2011 INFORMS
Dynamic Portfolio Optimization with Transaction Costs: Heuristics and Dual Bounds
David B. Brown, James E. Smith
Fuqua School of Business, Duke University, Durham, North Carolina 27708 {[email protected], [email protected]}
We consider the problem of dynamic portfolio optimization in a discrete-time, finite-horizon setting. Our general model considers risk aversion, portfolio constraints (e.g., no short positions), return predictability, and transaction costs. This problem is naturally formulated as a stochastic dynamic program. Unfortunately, with nonzero transaction costs, the dimension of the state space is at least as large as the number of assets, and the problem is very difficult to solve with more than one or two assets. In this paper, we consider several easy-to-compute heuristic trading strategies that are based on optimizing simpler models. We complement these heuristics with upper bounds on the performance with an optimal trading strategy. These bounds are based on the dual approach developed in Brown et al. (Brown, D. B., J. E. Smith, P. Sun. 2009. Information relaxations and duality in stochastic dynamic programs. Oper. Res. 58(4) 785–801). In this context, these bounds are given by considering an investor who has access to perfect information about future returns but is penalized for using this advance information. These heuristic strategies and bounds can be evaluated using Monte Carlo simulation. We evaluate these heuristics and bounds in numerical experiments with a risk-free asset and 3 or 10 risky assets. In many cases, the performance of the heuristic strategy is very close to the upper bound, indicating that the heuristic strategies are very nearly optimal.
Key words: dynamic programming; portfolio optimization
History: Received August 10, 2010; accepted April 16, 2011, by Dimitris Bertsimas, optimization. Published online in Articles in Advance July 15, 2011.
1. Introduction
Dynamic portfolio theory, dating from the seminal work of Mossin (1968), Samuelson (1969), and Merton (1969, 1971), provides a rigorous framework for determining optimal investment strategies in idealized environments that assume there are no transaction costs. The solutions to these models rely heavily on the absence of transaction costs. For example, in some models, the optimal solutions recommend holding constant fractions of the investor's wealth in different assets. To implement such a strategy, an investor must continually buy and sell assets in order to maintain the target fractions as asset prices fluctuate. In practice, transactions are costly, and continual rebalancing can be quite expensive. If expected returns vary over time, the situation may be even worse as the investor continually trades to adjust to a moving target.
Constantinides (1979) studied a general discrete-time model of portfolio optimization with transaction costs and obtained the strongest structural results in the setting with power utility, proportional transaction costs, and two assets. Specifically, he showed that there is a two-dimensional convex cone of asset positions where it is optimal to not trade, and that when the asset position is outside of this cone, it is optimal to trade to bring the asset position to the boundary of the cone. Davis and Norman (1990) and Shreve and Soner (1994) analyzed continuous-time versions of this model with one risky and one risk-free asset and established analogous results. Muthuraman (2007) and others have developed numerical methods for the case with a single risky asset.
Of course, practitioners must actually contend with multiple risky assets and, unfortunately, this problem is much more difficult to solve. The portfolio optimization problem is naturally formulated as a stochastic dynamic program. With no transaction costs, the optimal investments typically depend on the investor's wealth but not the investor's asset positions. However, with transaction costs, the optimal investments depend on the investor's initial asset positions, and the dimension of the state space is at least as large as the number of assets considered. The resulting dynamic program suffers from the "curse of dimensionality" and is very difficult to solve with more than one or two risky assets.¹
In this paper, we consider the problem of dynamic portfolio optimization in a discrete-time, finite-horizon setting. Our general model considers risk aversion (we maximize the expected utility of terminal wealth), portfolio constraints (e.g., no short positions), the possibility of predictable returns, and convex transaction costs. We introduce several easy-to-compute heuristic trading strategies that are based on solving simpler optimization problems. The first heuristic is a "cost-blind" strategy that follows the trading strategy given by ignoring transaction costs; this is considered primarily as a benchmark for evaluating the performance of the other heuristics and bounds. The second heuristic is a "one-step" strategy that can be viewed as approximating the dynamic programming recursion by taking the continuation value to be the value function for the model that ignores transaction costs; transaction costs are considered in the current period only. Finally, we consider a "rolling buy-and-hold" strategy where, in each period, we solve an optimization problem with transaction costs with the simplifying assumption that there will be no further opportunities to trade over a fixed horizon; the continuation value at the end of the horizon is again taken to be the value function for the model that ignores transaction costs.
¹ The special case with a quadratic utility and quadratic transaction costs and no portfolio constraints can be formulated as a linear quadratic control problem that is straightforward to solve; see Gârleanu and Pedersen (2009).
We complement these heuristics with upper bounds on the performance with an optimal trading strategy. These bounds are based on the dual approach developed in Brown et al. (2009). In this context, the bounds are given by considering an investor who has access to perfect information about future returns but is penalized for using this advance information. The dual approach of Brown et al. (2009) generalizes the dual approach developed for option pricing problems by Rogers (2002), Haugh and Kogan (2004), and Andersen and Broadie (2004) (see also related earlier work by Davis and Karatzas 1994) to consider general stochastic dynamic programs; this generalization is essential for the application to portfolio optimization problems. In this paper, we generate penalties using the approach for "good penalties" suggested in Brown et al. (2009) and develop a new gradient-based approach for generating penalties that exploits the convex structure of the underlying stochastic dynamic program. These heuristic strategies and bounds can be simultaneously evaluated using Monte Carlo simulation.
We evaluate these heuristics and bounds in numerical experiments with a risk-free asset and 3 risky assets (with predictable returns) or with 10 risky assets (without predictability). The results are promising: the run times are reasonable and, in many cases, the performance of the heuristic strategy is very close to the upper bound, indicating that the heuristic strategies are very nearly optimal.
Brandt (2010) provides a recent survey of research in portfolio optimization and touches briefly on issues related to transaction costs. Closer to this paper, Akian et al. (1996), Leland (2000), Muthuraman and Kumar (2006), and Lynch and Tan (2010) consider portfolio choice problems with multiple risky assets and transaction costs in various settings. These papers develop analytic frameworks for the case with many assets, but focus on numerical examples with two risky assets; the numerical methods employed are based on grid approximations of the state space and do not scale well for problems with more risky assets. Muthuraman and Zha (2008) describe a numerical procedure that scales better: their procedure assumes a particular form of trading strategy, estimates the value function given this trading strategy using simulation, and then updates the trading strategy. The computational effort required by this scheme scales polynomially in the number of assets. Their example with seven risky assets requires approximately 62 hours to compute a trading strategy; there is no guarantee that the resulting strategy is optimal or any indication of how much better one might do with an optimal strategy. Chryssikou (1998) studies portfolio optimization with quadratic transaction costs and no constraints using a pair of heuristic strategies, one of which is similar to our rolling buy-and-hold strategy.
We view our contributions to be (i) the study of some easy-to-compute heuristic trading strategies that may be useful in practice and (ii) the development of a dual bounding approach that can be used to evaluate the quality of these and other heuristics. Both the heuristics and dual bounds are fairly flexible and can be adapted to problems with different forms of utility functions, different forms of transaction costs (provided they are convex), different forms of portfolio constraints, and different models of returns.
There is also a recent literature that uses dual methods to evaluate portfolio strategies with portfolio constraints. For example, Haugh et al. (2006) and Haugh and Jain (2011), following Cvitanic and Karatzas (1992) and others, use Lagrange multiplier methods to "dualize" portfolio constraints in continuous-time portfolio allocation models; see Rogers (2003) for a review and synthesis of related theory. In contrast, we treat portfolio constraints directly and "dualize" the nonanticipativity constraints that require the investor to use only the information available at the time a decision is made; our penalties can be viewed as Lagrange multipliers associated with these nonanticipativity constraints. This interpretation is discussed in more detail in Brown et al. (2009).
In the next section, we describe our basic model and note some structural properties of the model. In §3, we describe our heuristics. In §4, we describe our approach to dual bounds in general and discuss the specific bounds we consider in our numerical experiments. In §5, we describe our numerical experiments and results. The online appendix (available at http://faculty.fuqua.duke.edu/~dbbrown/bio/papers/brown_smith_2011_online_appendix.pdf) contains proofs of the propositions in the paper as well as some detailed assumptions and results for the numerical experiments.
2. The Portfolio Optimization Model
Time is discrete and indexed as $t = 0, \ldots, T$, with $t = 0$ being the current period and $T$ being the terminal period. There are $n$ risky assets and a risk-free asset (cash). The risk-free rate $r_f$ is assumed to be known and constant over time. The returns of the risky assets are stochastic and denoted by $r_t = (r_{t,1}, \ldots, r_{t,n})$, where $r_{t,i} \ge 0$ is the (gross) return on asset $i$ from period $t-1$ to period $t$.
The monetary values of the risky asset holdings at the beginning of period $t$ are described by the vector $x_t = (x_{t,1}, \ldots, x_{t,n})$; the cash position in period $t$ is denoted $c_t$. We let the trade vector $a_t = (a_{t,1}, \ldots, a_{t,n})$ denote the amounts of risky assets bought (if $a_{t,i} > 0$) or sold (if $a_{t,i} < 0$) in period $t$. The transaction costs associated with trade vector $a_t$ are given by $\kappa(a_t)$. In our general analysis and approach, we will assume that $\kappa(a_t)$ is a nonnegative and convex function of the trades $a_t$ with $\kappa(0) = 0$. In our numerical experiments, we will focus on the special case of proportional transaction costs with
$$\kappa(a_t) = \sum_{i=1}^{n} \left(\delta_i^+ a_{t,i}^+ - \delta_i^- a_{t,i}^-\right), \tag{1}$$
where $a_{t,i}^+ = \max(a_{t,i}, 0)$ and $a_{t,i}^- = \min(a_{t,i}, 0)$ denote the positive and negative components of the trades, and $\delta_i^+$ and $\delta_i^- \ge 0$ are the proportional costs for buying and selling, respectively, asset $i$. Alternatively, we could use a quadratic function for transaction costs to capture a "linear price impact," where trades lead to temporary linear changes in prices. Many other forms are also possible.
Taking transaction costs into account, the asset holdings and cash position evolve according to
$$x_{t+1} = r_{t+1} \cdot (x_t + a_t), \tag{2}$$
$$c_{t+1} = r_f \left(c_t - \mathbf{1}'a_t - \kappa(a_t)\right). \tag{3}$$
Here the dot ($\cdot$) denotes the componentwise product of two vectors (so $x_{t+1,i} = r_{t+1,i}(x_{t,i} + a_{t,i})$) and $\mathbf{1}$ is an $n$-vector of ones. The investor's wealth $w_t$ in period $t$ is the sum of the total dollar holdings across the risky assets and cash, i.e.,
$$w_t = \mathbf{1}'x_t + c_t. \tag{4}$$
The investor's goal is to maximize the expected utility of terminal wealth, $\mathbb{E}[U(w_T)]$, where $U$ is a nondecreasing and concave utility function. Note that in this formulation, we define wealth in terms of the market value of the portfolio. We could have instead defined wealth in terms of the liquidation value of the portfolio, including the transaction costs associated with liquidation. In this case, we would take $w_t = \mathbf{1}'x_t - \kappa(-x_t) + c_t$. Our general approach works in either case, although the numerical results would obviously be somewhat different.
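The transition in Equations (1)–(4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`proportional_cost`, `step`, `wealth`) and the numbers in the example are assumptions introduced here.

```python
def proportional_cost(trades, buy_cost, sell_cost):
    """Proportional transaction cost of Equation (1); sells are negative trades."""
    return sum(
        b * max(a, 0.0) - s * min(a, 0.0)
        for a, b, s in zip(trades, buy_cost, sell_cost)
    )


def step(x, c, trades, returns, rf, buy_cost, sell_cost):
    """Equations (2) and (3): trade, pay costs out of cash, then earn returns."""
    cost = proportional_cost(trades, buy_cost, sell_cost)
    x_next = [r * (xi + ai) for r, xi, ai in zip(returns, x, trades)]
    c_next = rf * (c - sum(trades) - cost)
    return x_next, c_next


def wealth(x, c):
    """Market-value wealth of Equation (4)."""
    return sum(x) + c


# Hypothetical example: buy 10 of asset 1, sell 20 of asset 2, 1% costs each way.
x1, c1 = step(
    x=[100.0, 50.0], c=50.0, trades=[10.0, -20.0],
    returns=[1.1, 0.9], rf=1.0, buy_cost=[0.01, 0.01], sell_cost=[0.01, 0.01],
)
print(x1, c1, wealth(x1, c1))
```

In this example the trade costs 0.3 in total, and that amount is deducted from cash before the risk-free return is applied.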
We will assume that the investor's trades $a_t$ in period $t$ are restricted to a closed, convex set $\mathcal{A}_t(x_t, c_t)$. In our numerical experiments, we will focus on the case where the investor is not allowed to have short positions in risky assets or cash, so given an asset position $(x_t, c_t)$, the allowed trades are
$$\mathcal{A}_t(x_t, c_t) = \{a_t \in \mathbb{R}^n : x_t + a_t \ge 0,\ c_t - \mathbf{1}'a_t - \kappa(a_t) \ge 0\}. \tag{5}$$
We will also consider numerical results for the case where short positions are allowed, but there is a margin requirement that limits the total (long or short) position in risky assets to be no more than $l$ times the investor's posttrade wealth. In this case, the set of allowed trades is also convex and can be written as
$$\mathcal{A}_t(x_t, c_t) = \left\{a_t \in \mathbb{R}^n : \sum_{i=1}^{n} |x_{t,i} + a_{t,i}| \le l\left(\mathbf{1}'x_t + c_t - \kappa(a_t)\right)\right\}. \tag{6}$$
In general, we consider sets of allowed trades $\mathcal{A}_t(x_t, c_t)$ defined in terms of a set $\mathcal{H}_t$ of allowed final positions (or holdings): $a_t \in \mathcal{A}_t(x_t, c_t)$ if and only if $(x_t + a_t,\ c_t - \mathbf{1}'a_t - \kappa(a_t)) \in \mathcal{H}_t$. We assume that the allowed set of final positions $\mathcal{H}_t$ is closed, convex, and nondecreasing in $c_t$ (if $(x_t, c_t^1) \in \mathcal{H}_t$ and $c_t^1 \le c_t^2$, then $(x_t, c_t^2) \in \mathcal{H}_t$). This implies that $\mathcal{A}_t(x_t, c_t)$ is convex for each $(x_t, c_t)$.
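The two constraint sets used in the numerical experiments, the no-short set (5) and the margin set (6), can be checked for a candidate trade as follows. This is a hedged sketch: the function names and the convention of passing the already-computed transaction cost as `cost` are assumptions made here for illustration.

```python
def posttrade_cash(c, trades, cost):
    """Cash left after paying for the trades and their transaction cost."""
    return c - sum(trades) - cost


def no_short_ok(x, c, trades, cost):
    """Membership test for the no-short set of Equation (5)."""
    holdings_ok = all(xi + ai >= 0.0 for xi, ai in zip(x, trades))
    return holdings_ok and posttrade_cash(c, trades, cost) >= 0.0


def margin_ok(x, c, trades, cost, limit):
    """Membership test for the margin set of Equation (6).

    `limit` plays the role of l: gross risky exposure may not exceed
    `limit` times posttrade wealth.
    """
    gross = sum(abs(xi + ai) for xi, ai in zip(x, trades))
    posttrade_wealth = sum(x) + c - cost
    return gross <= limit * posttrade_wealth


# Hypothetical position: 60 and 40 in two risky assets, no cash.
print(no_short_ok([60.0, 40.0], 0.0, [-10.0, 5.0], cost=0.15))       # feasible
print(no_short_ok([60.0, 40.0], 0.0, [10.0, 0.0], cost=0.10))        # cash would go negative
print(margin_ok([60.0, 40.0], 0.0, [-100.0, 0.0], cost=1.0, limit=2.0))
```

The last call shows a short position of 40 in the first asset that violates (5) but satisfies (6) with a limit of 2.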
We will allow the possibility that returns exhibit some degree of predictability. To model this, we let $z_t$ denote a vector of observable market state variables that provides information about the returns $r_{t+1}$ of the risky assets. We will assume that $z_t$ follows a Markov process. The returns $r_{t+1}$ may depend on $z_t$, but, given $z_t$, the returns are assumed to be conditionally independent of prior returns and earlier values of the market state variable.
This portfolio optimization problem can be formulated as a stochastic dynamic program with state variables consisting of the current positions in risky assets and cash $(x_t, c_t)$ and the market state variable $(z_t)$. We take the terminal value function to be the utility of terminal wealth, $V_T(x_T, c_T, z_T) = U(\mathbf{1}'x_T + c_T)$, and earlier value functions $V_t$ are given recursively as
$$V_t(x_t, c_t, z_t) = \max_{a_t \in \mathcal{A}_t(x_t, c_t)} W_t(a_t, x_t, c_t, z_t), \tag{7}$$
$$W_t(a_t, x_t, c_t, z_t) = \mathbb{E}\left[V_{t+1}\left(r_{t+1} \cdot (x_t + a_t),\ r_f(c_t - \mathbf{1}'a_t - \kappa(a_t)),\ z_{t+1}\right) \mid z_t\right]. \tag{8}$$
In (8), expectations are taken over the random asset returns $r_{t+1}$ and the next-period market state $z_{t+1}$. We will assume that these expectations (and other expectations in the paper) are well defined and that the maxima in (7) are attained by some set of trades.
The following proposition states some key properties of this portfolio optimization model.
Proposition 2.1 (Properties of the Portfolio Optimization Model).
(1) For any market state $z_t$, $V_t(x_t, c_t, z_t)$ is nondecreasing in cash $c_t$ and jointly concave in the asset position $(x_t, c_t)$.
(2) For any market state $z_t$, $W_t(a_t, x_t, c_t, z_t)$ is jointly concave in the trades $a_t$ and asset position $(x_t, c_t)$.
Thus, for any given market state $z_t$ and asset and cash position $(x_t, c_t)$, the optimization problem (7) is convex: we are maximizing a concave function over a convex set. Unfortunately, the dimension of the state space makes the portfolio optimization problem very difficult to solve, even with just a few risky assets. For example, suppose the market state variable $z_t$ is one dimensional. If we approximated the state space using a grid with 20 points for this market state variable and 100 points for each of the $n + 1$ asset positions, the state space would consist of $20 \times 100^{n+1}$ states. To determine the value function $V_t(x_t, c_t, z_t)$ on this grid, we would have to solve the optimization problem (7) for each of these $20 \times 100^{n+1}$ states in each period. In our numerical examples with $n = 3$ risky assets and predictability, the state space would include $20 \times 100^4 = 2$ billion elements. With $n = 10$ risky assets and no predictability, the state space would include $100^{11} = 10^{22}$ elements. Moreover, each of these optimization problems involves expectations (8) over the $(n+1)$-dimensional space of $(r_{t+1}, z_{t+1})$ outcomes, and we would have to somehow interpolate between grid points when solving for the optimal trades.
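The grid counts quoted above follow from simple arithmetic, reproduced here as a short Python check. The helper name `grid_states` and its parameters are assumptions introduced for illustration.

```python
def grid_states(n_risky, asset_points=100, market_points=1):
    """Number of grid states: one axis per risky asset, one for cash, and
    optionally one for a one-dimensional market state variable."""
    return market_points * asset_points ** (n_risky + 1)


print(grid_states(3, market_points=20))  # 3 risky assets with predictability
print(grid_states(10))                   # 10 risky assets, no market state
```

Each additional risky asset multiplies the number of grid states by 100, which is why grid-based dynamic programming becomes impractical beyond one or two risky assets.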
If there are no transaction costs ($\kappa = 0$), the portfolio optimization problem can be greatly simplified by taking the dynamic programming state variables to be the current wealth ($w_t$) and market state variable ($z_t$); we no longer need to consider the specific asset positions $(x_t, c_t)$. In this simpler dynamic program, the decision variables are the posttrade positions in risky assets $\hat{x}_t = x_t + a_t$. Let $\mathcal{X}_t(w_t)$ denote the set of possible posttrade positions in risky assets given initial wealth $w_t$; that is, $\mathcal{X}_t(w_t) = \{\hat{x}_t : (\hat{x}_t,\ w_t - \mathbf{1}'\hat{x}_t) \in \mathcal{H}_t\}$. For example, with no transaction costs, the case described by (5), where the investor is not allowed to have short positions, corresponds to a feasible set of posttrade asset positions of the form
$$\mathcal{X}_t(w_t) = \{\hat{x}_t \in \mathbb{R}^n : \hat{x}_t \ge 0,\ \mathbf{1}'\hat{x}_t \le w_t\}. \tag{9}$$
We can then write the recursion for this "frictionless model" as follows: The terminal value function is $V_T^f(w_T, z_T) = U(w_T)$, and earlier value functions are
$$V_t^f(w_t, z_t) = \max_{\hat{x}_t \in \mathcal{X}_t(w_t)} W_t^f(\hat{x}_t, w_t, z_t), \tag{10}$$
$$W_t^f(\hat{x}_t, w_t, z_t) = \mathbb{E}\left[V_{t+1}^f\left(r_{t+1}'\hat{x}_t + r_f(w_t - \mathbf{1}'\hat{x}_t),\ z_{t+1}\right) \mid z_t\right]. \tag{11}$$
This frictionless model also has a convex structure and its results can be related to those of the more complicated model with transaction costs.
Proposition 2.2 (Properties of the Frictionless Portfolio Optimization Model).
(1) For any market state $z_t$, $V_t^f(w_t, z_t)$ is nondecreasing and concave in wealth $w_t$.
(2) For any market state $z_t$, $W_t^f(\hat{x}_t, w_t, z_t)$ is jointly concave in the posttrade asset positions $\hat{x}_t$ and wealth $w_t$.
(3) For any market state $z_t$ and asset position $(x_t, c_t)$, $V_t(x_t, c_t, z_t) \le V_t^f(\mathbf{1}'x_t + c_t, z_t)$.
Thus, to solve the frictionless model, we need to solve a convex optimization problem for each market state $z_t$ and wealth $w_t$. For example, if the market state variable $z_t$ is one dimensional, we could solve this dynamic program on a two-dimensional grid involving $z_t$ and $w_t$. The expectations over $(r_{t+1}, z_{t+1})$ in (11) will still be high dimensional if we have many assets, but can be evaluated using various methods. In our numerical experiments, we will approximate these expectations using discrete approximations of the underlying distributions; see §5.1 below.
If the investor has a power utility function, the frictionless model simplifies further. Specifically, suppose
$$U(w_T) = \frac{1}{1-\gamma}\, w_T^{1-\gamma}, \tag{12}$$
where $\gamma > 0$ is the coefficient of relative risk aversion; in the case where $\gamma = 1$, $U(w_T) = \ln(w_T)$. We can then write the value function as
$$V_t^f(w_t, z_t) = \frac{1}{1-\gamma}\, w_t^{1-\gamma}\, \phi_t(z_t), \tag{13}$$
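where $\phi_t(z_t)$ is defined recursively with $\phi_T(z_T) = 1$ and
$$\frac{1}{1-\gamma}\, \phi_t(z_t) = \max_{\theta_t \in \mathcal{X}_t(1)} \mathbb{E}\left[\frac{1}{1-\gamma}\left(r_{t+1}'\theta_t + r_f(1 - \mathbf{1}'\theta_t)\right)^{1-\gamma} \phi_{t+1}(z_{t+1}) \,\Big|\, z_t\right]. \tag{14}$$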
Here $\theta_t = (\theta_1, \ldots, \theta_n)$ are the posttrade fractions of wealth $w_t$ invested in the risky assets. In this case, the dimension of the state space is equal to the dimension of the market state variable $z_t$.
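To make the power-utility recursion concrete, the following Python sketch solves one step of (14) by grid search in a deliberately simple special case: a single risky asset, a no-short constraint, returns with a discrete distribution that does not depend on the market state (so the continuation factor is a constant), and a risk-aversion coefficient different from 1. The function names, the grid search, and the two-point return distribution are assumptions introduced for illustration; they are not the discretization used in the paper.

```python
import math


def one_period_power(returns, probs, rf, gamma, grid=1001):
    """Grid search over the risky fraction theta in [0, 1] for one step of (14).

    Assumes gamma != 1 and a constant continuation factor, so the factor
    drops out of the maximization.
    Returns (best_theta, growth), where growth is the ratio of the
    period-t factor to the period-(t+1) factor in (13).
    """
    best_theta, best_value = 0.0, -math.inf
    for k in range(grid):
        theta = k / (grid - 1)
        expected_utility = sum(
            p * (r * theta + rf * (1.0 - theta)) ** (1.0 - gamma)
            for r, p in zip(returns, probs)
        ) / (1.0 - gamma)
        if expected_utility > best_value:
            best_theta, best_value = theta, expected_utility
    growth = (1.0 - gamma) * best_value
    return best_theta, growth


def continuation_factor(t, horizon, growth):
    """Factor in (13) at period t when every period has the same growth ratio."""
    return growth ** (horizon - t)


# Hypothetical two-point return: +30% or -20% with equal probability.
theta, growth = one_period_power([1.3, 0.8], [0.5, 0.5], rf=1.0, gamma=3.0)
print(theta, growth, continuation_factor(0, 12, growth))
```

With a risk-aversion coefficient above 1, the utility is negative, so a smaller factor corresponds to better prospects; this is why the growth ratio in the example falls below 1 when the risky asset is attractive.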
Note that if the next-period market state $z_{t+1}$ and the returns $r_{t+1}$ are independent given $z_t$, then Equation (14) factors and it is optimal to pursue a myopic trading strategy that maximizes the expected utility of next period's wealth. However, if the next-period market state $z_{t+1}$ and the returns $r_{t+1}$ are not independent and the investor has a relative risk-aversion coefficient $\gamma > 1$ (as is considered to be typical), the optimal trading strategies will include some degree of hedging against unfavorable changes in the market state variable: compared to the myopic strategies, these strategies tend to have higher next-period wealth in scenarios with poor future prospects (i.e., with high values of $\phi_{t+1}(z_{t+1})$) and lower next-period wealth in scenarios with better future prospects (i.e., with low values of $\phi_{t+1}(z_{t+1})$).
3. Some Heuristic Trading Strategies
Given that the portfolio optimization model with transaction costs is difficult to solve, it is natural to consider heuristic trading strategies based on solutions to approximate models that are simpler to solve. We will consider several such heuristic strategies.
3.1. Cost-Blind Strategies
First, we consider cost-blind strategies that ignore transaction costs and simply follow the optimal trading strategy recommended by the frictionless model. To simulate such a strategy, we need to first solve the dynamic program for the frictionless model (10) to find the optimal posttrade asset positions $\hat{x}_t$ or fractions $\theta_t = \hat{x}_t / w_t$ for each period $t$, market state $z_t$, and wealth level $w_t$; we do this once and store the results. In the body of the simulation, in each period, we choose trades $a_t$ to move the investor to the recommended fractions $\theta_t$ for the current market state $z_t$ and wealth level $w_t$. Given this trade, we generate random returns $r_{t+1}$ for the risky assets and calculate next-period asset positions $(x_{t+1}, c_{t+1})$ using Equations (2) and (3), deducting transaction costs from the cash position. We then generate the next-period market state $z_{t+1}$ and continue the simulation process for the next period.
Although this cost-blind strategy may perform reasonably well when the transaction costs are small or when it is optimal to put all of the investor's wealth in a single asset, with larger transaction costs and more balanced investments, we would expect this strategy to trade too much in pursuit of marginal improvements in asset positions that do not exceed the cost of executing the trade.
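The simulation loop for a cost-blind strategy can be sketched as follows. The sketch assumes constant target fractions (as would arise with power utility and no predictability), a symmetric proportional cost rate, and a hypothetical two-point return generator; none of these names or parameter values come from the paper.

```python
import random


def cost_blind_trades(x, c, theta):
    """Trades that move risky holdings to theta_i times current wealth,
    ignoring the transaction costs the trades will incur."""
    w = sum(x) + c
    return [th * w - xi for th, xi in zip(theta, x)]


def simulate_cost_blind(x, c, theta, periods, rf, cost_rate, draw_returns, rng):
    """Simulate one path; returns (terminal wealth, total costs paid)."""
    total_cost = 0.0
    for _ in range(periods):
        trades = cost_blind_trades(x, c, theta)
        cost = cost_rate * sum(abs(a) for a in trades)  # symmetric case of (1)
        total_cost += cost
        returns = draw_returns(rng)
        x = [r * (xi + ai) for r, xi, ai in zip(returns, x, trades)]  # Eq. (2)
        c = rf * (c - sum(trades) - cost)                             # Eq. (3)
    return sum(x) + c, total_cost


def two_point_returns(rng):
    """Hypothetical return model: +30% or -20% with equal probability."""
    return [1.3 if rng.random() < 0.5 else 0.8]


w_cost, cost_paid = simulate_cost_blind(
    [80.0], 20.0, [0.25], 12, 1.0, 0.01, two_point_returns, random.Random(0)
)
print(w_cost, cost_paid)
```

Because the strategy rebalances to the target fractions in every period, it pays transaction costs in every period in which returns move the portfolio away from the target, which illustrates the over-trading concern described above.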
3.2. One-Step Strategies
Second, we consider a heuristic strategy where the investor uses the value function from the frictionless model as an approximate continuation value, but includes transaction costs in the current period. In this case, in each period, the investor chooses trades $a_t$ that solve
$$\max_{a_t \in \mathcal{A}_t(x_t, c_t)} \mathbb{E}\left[V_{t+1}^f\left(r_{t+1}'(x_t + a_t) + r_f(c_t - \mathbf{1}'a_t - \kappa(a_t)),\ z_{t+1}\right) \mid z_t\right]. \tag{15}$$
To simulate such a one-step strategy, we first solve the dynamic program for the frictionless model (10) to determine the value function $V_t^f(w_t, z_t)$; we do this in advance of the simulation and store the value function. In the body of the simulation, for each period in each trial, we solve the optimization problem (15) to find the recommended trade $a_t$ for the then-prevailing $(x_t, c_t, z_t)$ scenario. We then generate random returns $r_{t+1}$ for the risky assets, calculate next-period asset positions $(x_{t+1}, c_{t+1})$ using Equations (2) and (3), generate a new market state $z_{t+1}$, and continue to the next period. Note that the objective function in (15) is concave in the trades $a_t$ (this follows from the assumption that the transaction cost function $\kappa(a_t)$ is convex and the fact that $V_t^f(w_t, z_t)$ is nondecreasing and concave in wealth $w_t$; see Proposition 2.2), so (15) is a convex optimization problem. The optimization problem is, however, complicated by the presence of the high-dimensional expectation in (15): these expectations must be evaluated to calculate the objective function for any candidate trade $a_t$.
Whereas the cost-blind strategies are likely to trade too much because they neglect the costs of trading, we would expect these one-step strategies to trade too little because they underestimate the benefits of moving toward the optimal position without transaction costs. Or, put another way, the frictionless model underestimates the cost of being out of the optimal position because it assumes the investor can costlessly adjust the position in the next period. If the optimal asset positions change slowly over time, moving toward the optimal position in one period may provide benefits in future periods as well as the current one.
Although we would have to solve the full portfolio optimization problem (7) to exactly capture the long-term impacts of adjusting portfolio positions, we can perhaps approximate this effect by reducing the transaction costs $\kappa(a_t)$ appearing in the objective function in (15). There are a variety of ways we might modify these costs and we can experiment to find a good modification. In our numerical experiments, we consider monthly trades and focus on the case where we adjust the transaction costs by dividing $\kappa(a_t)$ by a time-dependent constant. Specifically, we focus on the case where we divide by the smaller of 6 or the number of periods remaining $(T - t)$. The intuitive interpretation of this adjustment is that the benefit of adjusting the asset positions lasts approximately six months. We will call these trading strategies modified one-step strategies and consider alternative divisors in §5.4.
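The one-step problem (15) and its modified variant can be illustrated with a small Python sketch. It assumes a single risky asset, power utility with risk aversion different from 1, returns independent of the market state (so the frictionless continuation value is proportional to the power utility of next-period wealth), a symmetric proportional cost rate, the no-short constraint (5), and a plain grid search in place of a convex solver. The function name, the `divisor` argument, and all numbers are assumptions made for illustration.

```python
def one_step_trade(x, c, returns, probs, rf, gamma, cost_rate,
                   divisor=1.0, grid=2000):
    """Grid search for the trade in one risky asset that solves (15).

    `divisor` scales down the cost seen by the optimizer (modified one-step
    strategy); feasibility is still checked with the full cost.
    """
    def objective(a):
        charged = cost_rate * abs(a) / divisor
        cash = c - a - charged
        return sum(
            p * (r * (x + a) + rf * cash) ** (1.0 - gamma)
            for r, p in zip(returns, probs)
        ) / (1.0 - gamma)

    max_buy = c / (1.0 + cost_rate)  # keeps true posttrade cash nonnegative
    candidates = [-x * k / grid for k in range(grid + 1)]           # sells
    candidates += [max_buy * k / grid for k in range(1, grid + 1)]  # buys
    best, best_value = 0.0, objective(0.0)
    for a in candidates:
        value = objective(a)
        if value > best_value:
            best, best_value = a, value
    return best


params = dict(returns=[1.3, 0.8], probs=[0.5, 0.5], rf=1.0, gamma=3.0,
              cost_rate=0.01)
print(one_step_trade(27.0, 73.0, **params))                # near target: no trade
print(one_step_trade(80.0, 20.0, **params))                # overweight: sell some
print(one_step_trade(80.0, 20.0, divisor=6.0, **params))   # modified: sell more
```

In this example the frictionless target fraction is roughly 0.27. The plain one-step strategy stops selling at a fraction of about 0.33, which is consistent with the expectation that it trades too little, and dividing the cost by 6 moves the posttrade position closer to the frictionless target.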
3.3. Rolling Buy-and-Hold Strategies
Finally, we consider a heuristic trading strategy where, in each period, the investor chooses trades to maximize the expected utility of wealth at some horizon $h$ periods into the future, taking transaction costs into account, but assuming that there will be no opportunities to adjust the portfolio over this time horizon. As in the one-step strategy, the continuation value at the horizon is approximated by the value function for the frictionless model. That is, in period $t$, the investor chooses trades $a_t$ that solve
$$\max_{a_t \in \mathcal{A}_t(x_t, c_t)} \mathbb{E}\left[V_{t+h}^f\left((r_{t+h} \cdot \ldots \cdot r_{t+1})'(x_t + a_t) + r_f^h(c_t - \mathbf{1}'a_t - \kappa(a_t)),\ z_{t+h}\right) \mid z_t\right], \tag{16}$$
with the understanding that we use the terminal utility $U(w_T)$ in place of $V_{t+h}^f(w_{t+h})$ whenever $t + h > T$. Although this objective function assumes that there are no future opportunities to adjust the portfolio over this time horizon, when simulating or executing this strategy, the investor solves the same problem in the next period. As with the one-step strategies, the objective function in (16) is concave in $a_t$ and, when simulating with this strategy, we must solve the convex optimization problem (16) once for each period of each simulated trial.
As with the modification of the one-step strategies, some degree of experimentation may be required to identify a good horizon $h$ for a particular problem. In our numerical experiments with monthly trading, we will focus on the case where the horizon $h$ is six months, but will consider alternatives in §5.4. We will refer to these heuristic trading strategies as the rolling buy-and-hold strategies. Chryssikou (1998) studies a similar heuristic, but with the horizon fixed at the terminal period $T$, i.e., with terminal utility $U(w_T)$ in place of $V_{t+h}^f(w_{t+h})$ in (16).
4. Dual Bounds
We can evaluate the heuristic strategies of §3 using simulation, and we can rerun these simulations with variations of these strategies (e.g., adjusting the modification of transaction costs for the one-step strategies or the horizon for the rolling buy-and-hold strategies) in an attempt to improve their performance. When doing these experiments, it would be helpful to know how much better we could possibly do. The frictionless model provides an upper bound on performance (see Proposition 2.2), but when transaction costs are substantial, this "no-transaction-cost bound" may be rather weak.
In this section, we will derive upper bounds on performance using the dual approach developed in Brown et al. (2009). This dual approach consists of two elements: (i) we relax the "nonanticipativity" constraints that require the trading decisions to depend only on the information available at the time the decision is made, and (ii) we impose penalties that punish violations of these nonanticipativity constraints. We first describe this dual approach in general and then describe the penalties we will use in our numerical experiments. As we will see, these dual bounds are typically tighter than the bound given by the model that ignores transaction costs.
4.1. The Dual Approach
In our discussion of the dual bounds, it helps to introduce notation to describe the full sequences of market states $z = (z_0, \ldots, z_T)$, returns $r = (r_1, \ldots, r_T)$, and trades $a = (a_0, \ldots, a_{T-1})$. Using this notation, we write the pretrade asset and cash positions as $x_t(a, r)$ and $c_t(a)$ and wealth as $w_t(a, r)$; these are calculated according to Equations (2)–(4) and are given by
$$x_t(a, r) = \sum_{\tau=0}^{t-1} (r_t \cdot \ldots \cdot r_{\tau+1}) \cdot a_\tau + (r_t \cdot \ldots \cdot r_1) \cdot x_0,$$
$$c_t(a) = \sum_{\tau=0}^{t-1} r_f^{t-\tau}\left(-\mathbf{1}'a_\tau - \kappa(a_\tau)\right) + r_f^t c_0,$$
and $w_t(a, r) = \mathbf{1}'x_t(a, r) + c_t(a)$. Similarly, we write the set of feasible trade sequences $a$ as $\mathcal{A}(r)$. Note that for any given return sequence $r$, the position in risky assets $x_t(a, r)$ is linear in the trade sequence $a$, and with convex transaction costs, the cash position $c_t(a)$ and wealth $w_t(a, r)$ are concave in the trade sequence $a$.
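The closed-form expressions for the positions can be checked numerically against the period-by-period recursion in Equations (2) and (3). The sketch below does this for a symmetric proportional cost rate; the function names and the sample data are assumptions made for illustration, and `returns[s]` holds the return vector for period `s + 1`.

```python
def cost(trade_vector, rate):
    """Symmetric proportional cost: rate times total absolute trade."""
    return rate * sum(abs(a) for a in trade_vector)


def positions_recursive(x0, c0, trades, returns, rf, rate):
    """Apply Equations (2) and (3) period by period."""
    x, c = list(x0), c0
    for a, r in zip(trades, returns):
        x = [ri * (xi + ai) for ri, xi, ai in zip(r, x, a)]
        c = rf * (c - sum(a) - cost(a, rate))
    return x, c


def positions_closed_form(x0, c0, trades, returns, rf, rate):
    """Evaluate the summation formulas for the positions directly."""
    t, n = len(trades), len(x0)

    def growth(i, start):
        # Product of asset i's returns over periods start+1, ..., t.
        g = 1.0
        for s in range(start, t):
            g *= returns[s][i]
        return g

    x = [
        growth(i, 0) * x0[i]
        + sum(growth(i, tau) * trades[tau][i] for tau in range(t))
        for i in range(n)
    ]
    c = rf ** t * c0 + sum(
        rf ** (t - tau) * (-sum(trades[tau]) - cost(trades[tau], rate))
        for tau in range(t)
    )
    return x, c


# Hypothetical data: two risky assets, three periods.
trades = [[5.0, -3.0], [0.0, 2.0], [-4.0, 1.0]]
returns = [[1.05, 0.97], [0.92, 1.10], [1.01, 1.03]]
print(positions_recursive([50.0, 30.0], 20.0, trades, returns, 1.002, 0.01))
print(positions_closed_form([50.0, 30.0], 20.0, trades, returns, 1.002, 0.01))
```

The closed form makes the structure visible: risky holdings are linear in the trades, while cash is a sum of concave terms because each trade's cost enters with a negative sign.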
A trading strategy can be viewed as a function $\alpha(r, z)$ that maps from sequences of returns $r$ and market states $z$ to a trade sequence $a$. A trading strategy $\alpha$ is feasible if (i) $\alpha(r, z)$ is in $\mathcal{A}(r)$ for each $(r, z)$, and (ii) $\alpha$ is nonanticipative in that the trade $a_t$ selected in period $t$ depends only on what is known in period $t$; that is, $a_t$ depends on the market states $(z_0, \ldots, z_t)$ and the asset returns $(r_1, \ldots, r_t)$, but not the future market states or returns. We let $\mathbb{A}$ denote this set of feasible strategies. In this notation, we can rewrite the portfolio optimization problem (7) compactly as
$$\max_{\alpha \in \mathbb{A}} \mathbb{E}\left[U(w_T(\alpha(r, z), r))\right]. \tag{17}$$
Here the expectations are taken over sequences of returns $r$ and market states $z$, with trading strategy $\alpha$ selecting trades in each $(r, z)$ scenario.
In deriving dual upper bounds for this problem, we will focus on a "perfect information relaxation" that assumes the investor knows all market states $z$ and all asset returns $r$ before making any trading decisions. The penalties $\pi(a, r, z)$ depend on the sequence of trades $a$, returns $r$, and market states $z$ in a given scenario; we say a penalty $\pi$ is dual feasible if $\mathbb{E}[\pi(\alpha(r, z), r, z)] \le 0$ for any feasible trading strategy $\alpha$. We can then state the duality result as follows.
Proposition 4.1 (Dual Bound). For any feasible trading strategy $\alpha$ and any dual feasible penalty $\pi$,
$$\mathbb{E}\left[U(w_T(\alpha(r, z), r))\right] \le \mathbb{E}\left[\max_{a \in \mathcal{A}(r)} \left\{U(w_T(a, r)) - \pi(a, r, z)\right\}\right]. \tag{18}$$
The problem on the right of (18) is perhaps easiest to understand by considering how we estimate this expression using simulation. In each trial of the simulation, we generate a sequence of market states $z$ and asset returns $r$, drawing samples according to their joint stochastic process. We then solve a deterministic "inner problem" of the form
$$\max_{a \in \mathcal{A}(r)} \left\{U(w_T(a, r)) - \pi(a, r, z)\right\} \tag{19}$$
to find the sequence of trades $a$ in $\mathcal{A}(r)$ that maximizes the penalized objective, $U(w_T(a, r)) - \pi(a, r, z)$, assuming perfect foresight, i.e., assuming that the full sequences of market states $z$ and asset returns $r$ are known. We obtain an estimate of the dual bound (18) by averaging the optimal values from these inner problems across the trials of the simulation. Note that because wealth $w_T(a, r)$ is concave in the trade sequence $a$ and the utility function is increasing and concave in wealth, the utility of final wealth is concave in $a$ for a given return sequence $r$. If we consider penalties $\pi(a, r, z)$ that are convex in the trade sequence $a$ for given sequences of returns $r$ and market states $z$, the inner problem (19) will be a deterministic convex optimization problem in $a$ and will not be difficult to solve.
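It is not hard to see that inequality (18) holds with any dual feasible penalty for any feasible strategy $\alpha$. To see this, note that
$$\begin{aligned}
\mathbb{E}[U(w_T(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r}))]
&\le \mathbb{E}[U(w_T(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r})) - \pi(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r}, \mathbf{z})] \\
&\le \mathbb{E}\Big[\max_{\mathbf{a} \in \mathcal{A}(\mathbf{r})} \{U(w_T(\mathbf{a}, \mathbf{r})) - \pi(\mathbf{a}, \mathbf{r}, \mathbf{z})\}\Big].
\end{aligned} \tag{20}$$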
The first inequality follows from the assumption that $\pi$ is dual feasible; because $\alpha$ is assumed to be a feasible trading strategy, this implies $\mathbb{E}[\pi(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r}, \mathbf{z})] \le 0$. The second inequality follows from the fact that the value with perfect foresight must meet or exceed the value of any feasible trading strategy $\alpha$: The investor with perfect foresight could choose the sequence of trades that would be chosen by $\alpha$ in each scenario and obtain the same value. However, the investor with perfect foresight can usually do better by choosing a different sequence of trades that maximizes the penalized objective in the given scenario.
The bound (18) is a special case of the weak duality result from Brown et al. (2009). Brown et al. (2009) also show that strong duality holds in this framework in that there exists an optimal penalty $\pi^*$ such that the inequality in (18) will hold with equality with this $\pi^*$ and an optimal strategy $\alpha^*$; “complementary slackness” also holds in that an optimal penalty $\pi^*$ will lead to trades $\mathbf{a}$ in the dual problem that match those of an optimal strategy.
Note that the penalty $\pi = 0$ is trivially dual feasible. In this case, the inner problem (19) amounts to finding an optimal trading strategy given perfect knowledge of all future returns, and the dual bound (18) is the expected utility with perfect information. This inner problem is straightforward to solve, but the bound is typically quite weak. To obtain tighter bounds, we need to choose a penalty that reduces the benefit provided by having advance knowledge of future market states and returns. In addition, to ensure reasonable computational times, the penalties should be easy to compute and lead to an inner problem (19) that is easy to solve. We will consider two types of penalties. First we will consider penalties that are constructed following the prescription for “good penalties” from Brown et al. (2009). Second, we will consider a new type of penalty that exploits the convex structure of the primal problem.
4.2. Penalties Based on Approximate Value Functions
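Brown et al. (2009) suggest constructing penalties by choosing a sequence of generating functions $(g_0, \ldots, g_T)$ that approximate the continuation value functions for the dynamic programming model. In this context, the generating functions $g_t(\mathbf{a}, \mathbf{r}, \mathbf{z})$ may depend on the full sequences of returns $\mathbf{r}$ and market states $\mathbf{z}$, but depend only on trades up to period $t$, $(\mathbf{a}_0, \ldots, \mathbf{a}_t)$. The penalty is then taken to be
$$\pi(\mathbf{a}, \mathbf{r}, \mathbf{z}) = \sum_{t=0}^{T} \big( g_t(\mathbf{a}, \mathbf{r}, \mathbf{z}) - \mathbb{E}[g_t(\mathbf{a}, \mathbf{r}, \mathbf{z}) \mid \mathbf{r}_1, \ldots, \mathbf{r}_t, z_0, \ldots, z_t] \big). \tag{21}$$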
Brown et al. (2009) show that penalties constructed this way are dual feasible. Moreover, if we take the generating functions $g_t(\mathbf{a}, \mathbf{r}, \mathbf{z})$ to be the optimal continuation values $V_{t+1}(\mathbf{x}_{t+1}(\mathbf{a}, \mathbf{r}), c_{t+1}(\mathbf{a}), z_{t+1})$ for the original portfolio optimization program (7), the resulting “ideal penalty” is optimal: it provides a dual bound equal to the optimal value for the primal, and the optimal trades in the dual problem are feasible and optimal for the primal problem.
We can approximate this ideal penalty using approximations of the continuation values as generating functions. For example, consider the continuation values for the one-step strategies; we could approximate the continuation value using the continuation value from the frictionless model (10) by taking the generating function $g_t(\mathbf{a}, \mathbf{r}, \mathbf{z})$ to be $V^f_{t+1}(w_{t+1}(\mathbf{a}, \mathbf{r}), z_{t+1})$. Although this frictionless value function is reasonably easy to compute, it leads to an inner problem that is not easy to solve: $V^f_{t+1}(w_{t+1}(\mathbf{a}, \mathbf{r}), z_{t+1})$ is a concave function of the trades $\mathbf{a}$, but when used to generate a penalty $\pi$ using (21), $V^f_{t+1}$ enters into the objective for the inner problem with both positive and negative signs, leading to an objective function that is neither convex nor concave and an inner problem that is generally not easy to solve.
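To overcome this difficulty, we will take $g_t(\mathbf{a}, \mathbf{r}, \mathbf{z})$ to be a linear approximation of $V^f_{t+1}(w_{t+1}(\mathbf{a}, \mathbf{r}), z_{t+1})$ based on a first-order Taylor series expansion in the trades $\mathbf{a}$ around the trades $\mathbf{a}^*$ given by some fixed trading strategy. For example, in the case with proportional transaction costs given by Equation (1), the linear approximation of $V^f_{t+1}$ yields a generating function of the form
$$\begin{aligned}
g_t(\mathbf{a}, \mathbf{r}, \mathbf{z}) ={}& V^f_{t+1}(w_{t+1}(\mathbf{a}^*, \mathbf{r}), z_{t+1}) + V^{f\prime}_{t+1}(w_{t+1}(\mathbf{a}^*, \mathbf{r}), z_{t+1}) \\
&\cdot \sum_{\tau=1}^{t} \sum_{i=1}^{n} \left( \frac{\partial w_{t+1}}{\partial a^+_{\tau,i}} (a^+_{\tau,i} - a^{*+}_{\tau,i}) + \frac{\partial w_{t+1}}{\partial a^-_{\tau,i}} (a^-_{\tau,i} - a^{*-}_{\tau,i}) \right),
\end{aligned} \tag{22}$$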
where $V^{f\prime}_{t+1}$ denotes the derivative of $V^f_{t+1}$ with respect to wealth and $a^+_{t,i}$, $a^-_{t,i}$, $a^{*+}_{t,i}$, and $a^{*-}_{t,i}$ denote the positive and negative components of $\mathbf{a}$ and $\mathbf{a}^*$. With a power utility function, this derivative can be calculated analytically as $V^{f\prime}_t(w_t, z_t) = w_t^{-\gamma} \phi_t(z_t)$, where $\phi_t(z_t)$ is determined when solving the frictionless model (14); in other cases, it may have to be estimated numerically. The partial derivatives in (22) are given by
$$\frac{\partial w_{t+1}}{\partial a^+_{\tau,i}} = \prod_{\tau'=\tau+1}^{t+1} r_{\tau',i} - r_f^{\,t+1-\tau} (1 + \delta^+_i),$$
$$\frac{\partial w_{t+1}}{\partial a^-_{\tau,i}} = \prod_{\tau'=\tau+1}^{t+1} r_{\tau',i} - r_f^{\,t+1-\tau} (1 - \delta^-_i).$$
With a generating function of the form of (22), using (21) we obtain a dual feasible penalty $\pi$ that is linear in the trades $\mathbf{a}$ (or, more precisely, linear in the positive and negative components of $\mathbf{a}$), for any sequence of returns $\mathbf{r}$ and market states $\mathbf{z}$. The objective for the inner problem (19) is then concave in $\mathbf{a}$ and the resulting inner problem is a deterministic convex optimization problem that is not difficult to solve. Note that there are high-dimensional expectations (over returns $\mathbf{r}_{t+1}$ and the market state $z_{t+1}$) in the definition of the penalty (21); however, these expectations only affect the weights associated with the trades in this linear penalty, and the weights need only be calculated once when solving the inner problem in a simulated scenario. In our numerical experiments, we will consider bounds generated by penalties of this form, taking $\mathbf{a}^*$ to be the trades suggested by the modification of the one-step heuristic strategy. We will call these bounds the modified one-step bounds.
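We can construct a similar penalty using a generating function based on the analogue of the continuation value used to determine the rolling buy-and-hold strategy,
$$\mathbb{E}\big[ V^f_{t+h}\big( (\mathbf{r}_{t+h} \cdot \ldots \cdot \mathbf{r}_{t+2})' \mathbf{x}_{t+1}(\mathbf{a}, \mathbf{r}) + r_f^{\,h-1} c_{t+1}(\mathbf{a}),\, z_{t+h} \big) \,\big|\, z_{t+1} \big]. \tag{23}$$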
Note that this is a function of the period-$(t+1)$ market state $z_{t+1}$ (by conditioning) and returns $\mathbf{r}_{t+1}$ (through $\mathbf{x}_{t+1}$), but does not depend on later market states or returns because these are integrated out in the expectations. Here too we will consider a generating function based on a first-order Taylor series expansion of (23) in the trades $\mathbf{a}$ around the trades $\mathbf{a}^*$ given by some heuristic strategy; the details are provided in the online appendix. As with the modified one-step penalty, this leads to a dual feasible penalty $\pi$ that is linear in the positive and negative components of $\mathbf{a}$ for any sequence of returns $\mathbf{r}$ and market states $\mathbf{z}$. In our numerical experiments, we will consider bounds generated by penalties of this form, taking $\mathbf{a}^*$ to be the trades suggested by a rolling buy-and-hold strategy. We will call these the rolling buy-and-hold dual bounds.
4.3. Gradient-Based Penalties
We will also consider gradient-based penalties that exploit the convex structure of the primal optimization problem. To understand the motivation for this approach, assume (for the sake of this motivating discussion) that the utility of terminal wealth $U(w_T(\mathbf{a}, \mathbf{r}))$ is differentiable in the sequence of trades $\mathbf{a}$ and that an optimal trading strategy $\alpha^*$ for the portfolio optimization problem with transaction costs is known. Then suppose we take the penalty $\pi(\mathbf{a}, \mathbf{r}, \mathbf{z})$ to be
$$\pi(\mathbf{a}, \mathbf{r}, \mathbf{z}) = \nabla_{\mathbf{a}} U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))' (\mathbf{a} - \alpha^*(\mathbf{r}, \mathbf{z})), \tag{24}$$
where $\nabla_{\mathbf{a}} U(w_T(\mathbf{a}, \mathbf{r}))$ is the gradient of terminal utility with respect to the trade sequence $\mathbf{a}$ for a given sequence of returns $\mathbf{r}$. Note that this penalty is linear in the trade sequence $\mathbf{a}$.
We can view the primal problem (17) as a convex optimization problem with the decision variables being the trading strategy $\alpha$; this will be formalized in the proof of the proposition below. The first-order conditions for this optimization problem can be shown to imply that
$$\mathbb{E}[\pi(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r}, \mathbf{z})] = \mathbb{E}[\nabla_{\mathbf{a}} U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))' (\alpha(\mathbf{r}, \mathbf{z}) - \alpha^*(\mathbf{r}, \mathbf{z}))] \le 0 \tag{25}$$
for any feasible strategy $\alpha$. This means the penalty (24) is dual feasible.
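Now consider the deterministic inner problem (19) given by this penalty:
$$\max_{\mathbf{a} \in \mathcal{A}(\mathbf{r})} \{ U(w_T(\mathbf{a}, \mathbf{r})) - \nabla_{\mathbf{a}} U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))' (\mathbf{a} - \alpha^*(\mathbf{r}, \mathbf{z})) \}. \tag{26}$$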
Because the penalty is linear in the trade sequence $\mathbf{a}$, the inner problem (26) is a convex optimization problem, and its first-order conditions are necessary and sufficient for an optimal solution. The gradient of the objective function in (26) with respect to the trade sequence $\mathbf{a}$ is
$$\nabla_{\mathbf{a}} U(w_T(\mathbf{a}, \mathbf{r})) - \nabla_{\mathbf{a}} U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r})). \tag{27}$$
Now note that if we take the trade sequence to be that selected by the optimal strategy, i.e., $\mathbf{a} = \alpha^*(\mathbf{r}, \mathbf{z})$, the gradient (27) is equal to zero. Because this $\mathbf{a}$ is in $\mathcal{A}(\mathbf{r})$ and it sets the gradient equal to zero, this $\mathbf{a}$ must be an optimal solution for the inner problem (26). Moreover, with $\mathbf{a} = \alpha^*(\mathbf{r}, \mathbf{z})$, the penalty (24) is zero and the objective for the inner problem (26) reduces to $U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))$ and the dual bound is $\mathbb{E}[U(w_T(\alpha^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))]$. Thus, the penalty (24) is optimal: it yields a dual trading strategy that is optimal for the primal problem and a dual bound equal to the optimal value for the primal.
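A small numerical check may make this argument concrete. The sketch below is a toy instance, not the paper's model: one period, one risky asset, two equally likely return scenarios, log utility, no transaction costs, and a grid search in place of a convex solver. All names (`SCENARIOS`, `gradient_penalty`, `dual_bound`) are hypothetical. With no information revealed before trading, a feasible strategy is a single trade, so the optimal strategy is one number.

```python
import math

R_F = 1.0
SCENARIOS = [(1.3, 0.5), (0.8, 0.5)]    # (gross risky return, probability), assumed values
GRID = [i / 600 for i in range(601)]    # feasible trades: 0 <= a <= 1

def wealth(a, r):
    """Terminal wealth from initial wealth 1 with a moved into the risky asset."""
    return a * r + (1.0 - a) * R_F

def expected_utility(a):
    return sum(p * math.log(wealth(a, r)) for r, p in SCENARIOS)

# Primal problem: the optimal non-anticipative trade (analytically 5/6 here).
a_star = max(GRID, key=expected_utility)
primal_value = expected_utility(a_star)

def gradient_penalty(a, r):
    """Gradient of log utility in a, evaluated at a_star, times (a - a_star)."""
    grad = (r - R_F) / wealth(a_star, r)
    return grad * (a - a_star)

def dual_bound(penalty):
    """Expected value of the perfect-foresight inner problems."""
    return sum(p * max(math.log(wealth(a, r)) - penalty(a, r) for a in GRID)
               for r, p in SCENARIOS)

zero_bound = dual_bound(lambda a, r: 0.0)
gradient_bound = dual_bound(gradient_penalty)
print(primal_value, gradient_bound, zero_bound)
```

In this toy case the gradient-based bound coincides with the primal optimum up to rounding, whereas the zero-penalty bound is noticeably looser.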
Of course, in practice we do not know the optimal strategy $\alpha^*$ for the portfolio optimization problem and cannot use the penalty (24). We can, however, approximate the original problem and use similar penalties based on the optimal solution to this approximate problem. For example, we can approximate the original problem by considering the frictionless model (10). If we take $\hat{w}_T(\mathbf{a}, \mathbf{r})$ to be the terminal wealth without transaction costs and $\hat{A}$ to be the set of feasible trading strategies without transaction costs (and let $\hat{\mathcal{A}}(\mathbf{r})$ be the set of feasible trades in the approximate model given returns $\mathbf{r}$), we can then write the approximate optimization problem based on the frictionless model as
$$\max_{\hat{\alpha} \in \hat{A}} \mathbb{E}[U(\hat{w}_T(\hat{\alpha}(\mathbf{r}, \mathbf{z}), \mathbf{r}))]. \tag{28}$$
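Let $\hat{\alpha}^*(\mathbf{r}, \mathbf{z})$ be an optimal trading strategy for this approximating frictionless model and consider a gradient-based penalty of the form of (24), but with $\hat{w}_T$ in place of $w_T$ and $\hat{\alpha}^*$ in place of $\alpha^*$, i.e.,
$$\hat{\pi}(\mathbf{a}, \mathbf{r}, \mathbf{z}) = \nabla_{\mathbf{a}} U(\hat{w}_T(\hat{\alpha}^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))' (\mathbf{a} - \hat{\alpha}^*(\mathbf{r}, \mathbf{z})). \tag{29}$$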
The argument leading to (25) requires the strategy to be optimal for the chosen wealth function (i.e., for (28)), but it does not require the strategy to be optimal for the true wealth function with transaction costs. If the set of feasible strategies for the approximate model $\hat{A}$ includes those for the real model $A$ (i.e., $\mathcal{A}(\mathbf{r}) \subseteq \hat{\mathcal{A}}(\mathbf{r})$ for all $\mathbf{r}$), then $\mathbb{E}[\hat{\pi}(\alpha(\mathbf{r}, \mathbf{z}), \mathbf{r}, \mathbf{z})] \le 0$ will hold for all $\alpha$ in $A$, and this approximate penalty $\hat{\pi}$ will be dual feasible for the original problem. However, the inner problem with this approximate penalty,
$$\max_{\mathbf{a} \in \mathcal{A}(\mathbf{r})} \{ U(w_T(\mathbf{a}, \mathbf{r})) - \nabla_{\mathbf{a}} U(\hat{w}_T(\hat{\alpha}^*(\mathbf{r}, \mathbf{z}), \mathbf{r}))' (\mathbf{a} - \hat{\alpha}^*(\mathbf{r}, \mathbf{z})) \}, \tag{30}$$
will generally not be optimized by taking the trade sequence to be $\mathbf{a} = \hat{\alpha}^*(\mathbf{r}, \mathbf{z})$. Nevertheless, because the penalty is dual feasible, the dual problem with this approximate penalty will provide a valid upper bound on the performance of any feasible trading strategy.
We can use this gradient-based approach with a variety of approximations of the wealth function as long as the approximate wealth function is concave and we can identify an optimal strategy for the approximate problem. The following proposition formalizes this gradient-based approach to penalties.
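Proposition 4.2 (Gradient-Based Penalties). Let $\hat{\alpha}^*$ be an optimal trading strategy for the portfolio choice problem (28) with modified terminal wealth $\hat{w}_T(\mathbf{a}, \mathbf{r})$, assumed concave in $\mathbf{a}$, and modified allowable trades $\hat{\mathcal{A}}(\mathbf{r})$, assumed convex and satisfying $\mathcal{A}(\mathbf{r}) \subseteq \hat{\mathcal{A}}(\mathbf{r})$ for each return sequence $\mathbf{r}$. Consider the penalty $\hat{\pi}$ given by Equation (29):
(1) $\hat{\pi}$ is dual feasible.
(2) If $\hat{w}_T(\mathbf{a}, \mathbf{r}) = w_T(\mathbf{a}, \mathbf{r})$ and $\hat{\mathcal{A}}(\mathbf{r}) = \mathcal{A}(\mathbf{r})$ for each return sequence $\mathbf{r}$, then the dual bound (18) holds with equality with penalty $\hat{\pi}$.
(3) If $w_T(\mathbf{a}, \mathbf{r}) \le \hat{w}_T(\mathbf{a}, \mathbf{r})$, then
$$\max_{\mathbf{a} \in \mathcal{A}(\mathbf{r})} \{ U(w_T(\mathbf{a}, \mathbf{r})) - \hat{\pi}(\mathbf{a}, \mathbf{r}, \mathbf{z}) \} \le U(\hat{w}_T(\hat{\alpha}^*(\mathbf{r}, \mathbf{z}), \mathbf{r})). \tag{31}$$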
The first two parts of the proposition formalize the results discussed earlier. We will discuss the last part of the proposition in a moment. Note that our definition of gradient-based penalties $\hat{\pi}$ in Equation (29) implicitly assumes that $U$ and $\hat{w}_T$ are differentiable so the necessary gradients exist and $\hat{\pi}$ corresponds to a directional derivative. If the gradient does not exist, we can define $\hat{\pi}$ in terms of the directional derivative instead; this directional derivative will exist whenever $U \circ \hat{w}_T$ is concave. The use of the directional derivatives is discussed in more detail in the proof of the result in the online appendix.
In our numerical experiments, we will consider two examples of gradient-based penalties. The first is based on the frictionless model, as discussed above. In this case, the approximate terminal wealth $\hat{w}_T(\mathbf{a}, \mathbf{r})$ is given by taking the transaction costs to be zero, and the allowable trade sequences $\hat{\mathcal{A}}(\mathbf{r})$ are the same as in the original model but without the transaction costs: $\mathcal{A}(\mathbf{r}) \subseteq \hat{\mathcal{A}}(\mathbf{r})$ then follows from our assumption that the set of feasible asset positions $\mathcal{X}_t$ is nondecreasing in cash. In this frictionless model, the gradient of the utility of terminal wealth is $\nabla_{\mathbf{a}} U(\hat{w}_T(\mathbf{a}, \mathbf{r})) = U'(\hat{w}_T(\mathbf{a}, \mathbf{r})) \nabla_{\mathbf{a}} \hat{w}_T(\mathbf{a}, \mathbf{r})$, where $U'$ is the derivative of the utility function and $\nabla_{\mathbf{a}} \hat{w}_T(\mathbf{a}, \mathbf{r})$ is an $nT \times 1$ vector with entries corresponding to trade $\mathbf{a}_t$ given by
$$\nabla_{\mathbf{a}_t} \hat{w}_T(\mathbf{a}, \mathbf{r}) = (\mathbf{r}_T \cdot \ldots \cdot \mathbf{r}_{t+1}) - r_f^{\,T-t} \mathbf{1}.$$
We will call the resulting penalty the frictionless gradient-based penalty.
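As an illustration, the frictionless wealth gradient can be computed directly from a simulated return path. The sketch below assumes frictionless dynamics in which a trade made in period t is funded from cash and earns the risky returns from period t+1 onward; the function names and the list-of-lists layout (`returns[t][i]` holding the period-(t+1) gross return of asset i) are choices made for this example, not notation from the paper.

```python
from math import prod

def frictionless_wealth(trades, returns, r_f, x0, c0):
    """Terminal wealth with zero transaction costs.

    trades[t][i]  : amount of asset i bought (sold if negative) in period t, t = 0..T-1
    returns[t][i] : gross return of asset i over period t+1
    """
    x, c = list(x0), c0
    for a_t, r_next in zip(trades, returns):
        c = r_f * (c - sum(a_t))                                  # cash pays for the trades
        x = [r * (xi + ai) for r, xi, ai in zip(r_next, x, a_t)]  # positions earn returns
    return sum(x) + c

def frictionless_gradient(returns, r_f):
    """Entry [t][i] is the derivative of terminal wealth with respect to trades[t][i]."""
    T, n = len(returns), len(returns[0])
    return [[prod(returns[s][i] for s in range(t, T)) - r_f ** (T - t)
             for i in range(n)]
            for t in range(T)]

returns = [[1.10, 0.90], [1.05, 1.20], [0.95, 1.00]]
print(frictionless_gradient(returns, r_f=1.01))
```

Because frictionless wealth is linear in the trades, a unit bump in a single trade changes terminal wealth by exactly the corresponding gradient entry, which gives a simple way to check such code.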
Note that with this approximation, the wealth with transaction costs $w_T(\mathbf{a}, \mathbf{r})$ is less than or equal to the wealth without transaction costs $\hat{w}_T(\mathbf{a}, \mathbf{r})$ (for all $\mathbf{a}$ and $\mathbf{r}$), so the last part of Proposition 4.2 applies: the optimal values for the inner problems with this penalty (on the left side of (31)) will be less than or equal to the utility of final wealth with no transaction costs (on the right side of (31)) for every $\mathbf{r}$ and $\mathbf{z}$. This implies that the dual bounds using this frictionless gradient penalty must be at least as tight as the no-transaction-cost bound given by the value function for the frictionless model.
Although the frictionless gradient penalty leads to tighter bounds than the frictionless model, we can perhaps do better if we somehow incorporate the effects of transaction costs in the approximate model. The key for the gradient penalty approach is to do this in a way that still allows us to find the optimal solution for the approximate model. One way to do this is to consider a variation of the original model where the transaction costs depend on the posttrade asset positions rather than the trades. In this case, the transaction costs are of the form $\hat{\kappa}(\hat{\mathbf{x}}_t)$, where $\hat{\mathbf{x}}_t = \mathbf{x}_t + \mathbf{a}_t$ and the cash position evolves according to $c_{t+1} = r_f (c_t - \mathbf{1}' \mathbf{a}_t - \hat{\kappa}(\mathbf{x}_t + \mathbf{a}_t))$, rather than Equation (3). In such a case, we can represent the portfolio problem as a dynamic program like that of the frictionless model (10) with wealth $w_t$ and the market state $z_t$ as state variables and posttrade asset positions $\hat{\mathbf{x}}_t$ as decision variables, without considering the specifics of the asset positions $(\mathbf{x}_t, c_t)$.
In our numerical experiments, we will consider a modified gradient-based penalty where we take $\hat{\kappa}(\hat{\mathbf{x}}_t)$ to be proportional to the posttrade asset positions $\hat{\mathbf{x}}_t$. Specifically, we will take the proportional fee for the asset positions to be equal to the proportional fee for trades divided by the number of periods ($T$) in the model. More generally, we could consider $\hat{\kappa}(\hat{\mathbf{x}}_t)$ that assume transaction costs are proportional to the difference between the posttrade position $\hat{\mathbf{x}}_t$ and some reference position. For example, we might take this reference position to be the asset allocation recommended by the frictionless model for the same market state or, alternatively, the initial (period 0) asset position. Our modified gradient bound can be viewed as an example of this general form where the reference position is taken to be a zero position (i.e., with zero investment in each risky asset), which is the assumed initial position in the experiments. Of course, there are a number of possible variations on these ideas, and we could experiment to perhaps find better bounds.
5. Numerical Experiments
In this section, we describe the experiments that we use to test the proposed trading strategies and dual bounds. We first describe the details of the models considered and then discuss the run times and numerical results. We then consider variations on the heuristics and bounds, as well as the constraints.
5.1. Model Details
We will test the heuristic strategies and dual bounds by evaluating these heuristics and bounds in a series of simulations with varying parameter values. In all cases, we begin by solving the dynamic program for the frictionless model (Equations (10) and (11)) for the given parameter values; we also solve the analogous dynamic program used to determine the modified gradient penalty of §4.3. We then repeatedly generate random sequences of market states and returns. For each sequence of market states and returns, we “run” the heuristic strategies of §3, determining the sequence of trades selected by the heuristic and the corresponding terminal wealth and utility. We also solve the inner problem for each of the dual bounds in this same scenario. We repeat this simulation process for a given number of trials.
In our experiments, we assume monthly time steps, proportional transaction costs $\kappa(\mathbf{a}_t) = \delta \sum_{i=1}^{n} |a_{t,i}|$, and power utilities. We will consider a variety of parameters:
• time horizons $T$ of 6, 12, 24, and 48 months;
• transaction cost rates $\delta$ of 0.5%, 1.0%, and 2.0%;
• relative risk-aversion coefficients $\gamma$ of 1.5, 3.0, or 8.0, reflecting low, medium, and high degrees of risk aversion.
In §§5.2–5.4, we will focus on the case with constraint (5) ruling out short positions; we consider limited leverage constraints of the form of (6) in §5.5. In all cases, we assume the investor starts with all wealth invested in cash and normalize wealth to one, i.e., we assume $\mathbf{x}_0 = \mathbf{0}$ and $c_0 = 1$. We will consider 1,000 trials in each simulation.
We consider two different models of returns. The first highlights the role of predictability and the second considers a larger number of risky assets.
5.1.1. Model with Three Risky Assets and Predictability. We first consider a model with three risky assets and one market state variable, based on Lynch (2001); Lynch studied the impact of predictability on portfolio choices, without considering transaction costs. Specifically, letting $\boldsymbol{\rho}_t = \ln \mathbf{r}_t$, the model assumes returns and market states evolve according to
$$\begin{bmatrix} \boldsymbol{\rho}_{t+1} \\ z_{t+1} \end{bmatrix} = \begin{bmatrix} \mathbf{a}_r + \mathbf{b}_r z_t \\ a_z + b_z z_t \end{bmatrix} + \begin{bmatrix} \mathbf{e}_{t+1} \\ v_{t+1} \end{bmatrix}, \tag{32}$$
where the stochastic increments $(\mathbf{e}_{t+1}, v_{t+1})$ are multivariate normal with mean zero and covariance $\Sigma_{ev}$. The three risky assets correspond to value-weighted equity portfolios sorted by the size of the underlying firms (i.e., small-, medium-, and large-cap stocks), and the market state variable is a normalized index reflecting the dividend yield (specifically, the continuously compounded 12-month dividend yield on the value-weighted New York Stock Exchange portfolio). Lynch (2001) estimates this model using data from 1927 to 1996. The returns are inflation adjusted and the risk-free rate $r_f$ is 1.00042. The other numerical assumptions are discussed in the online appendix.
In this model, the market state variable has a significant impact on expected returns. With no transaction costs and medium risk aversion ($\gamma = 3.0$), we find that with high values of $z_t$ (i.e., with a large dividend yield), the investor should invest heavily in a mix of small- and medium-sized stocks and hold no cash. With $z_t = 0$, the investor should invest in a mix of all three assets, while holding substantial reserves in cash. With negative values for $z_t$, the investor should invest most of his wealth in cash. In our numerical experiments, we will assume that the initial market state is neutral (i.e., $z_0 = 0$).
We follow Lynch (2001) and use discrete approximations of the uncertainties to calculate expectations. We approximate the market state variable using a grid with 19 points. The idiosyncratic returns ($\mathbf{e}_{t+1}$ in Equation (32)) are approximated using a Gaussian quadrature approach with three points per asset. This Gaussian quadrature approximation exactly matches the mean and covariance structure for log returns $\boldsymbol{\rho}_t$ and matches higher-order moments (3rd–5th) of this joint distribution as well; see, e.g., Judd (1998) for an introduction to Gaussian quadrature methods. Taken together, the joint distribution for returns and the market state variable are approximated using a four-dimensional grid with a total of $3^3 \times 19 = 513$ elements. This discrete approximation scheme is used to calculate the expectations required to solve the dynamic programming model for the frictionless model, to evaluate the expectations in the optimization problems for the heuristic trading strategies (in §3), and to evaluate the expectations appearing in the penalties based on approximate value functions (in §4.2). For consistency, we also use this discrete approximation in the simulations, i.e., we generate sample returns and market states from this grid according to the probabilities of the discrete approximation.
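To illustrate the grid size and the moment-matching property, the sketch below builds a product grid from the standard three-point Gauss-Hermite rule for a standard normal variable (nodes $-\sqrt{3}, 0, \sqrt{3}$ with probabilities $1/6, 2/3, 1/6$). It treats the shocks as independent standard normals, which is a simplification made here; handling the correlated shocks in Equation (32) would require an additional linear transformation that is not shown. The function names are hypothetical.

```python
import math
from itertools import product

# Three-point Gauss-Hermite rule for a standard normal variable.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def moment(k):
    """k-th moment of the three-point discrete approximation."""
    return sum(w * x ** k for x, w in zip(NODES, WEIGHTS))

def shock_grid(n_assets):
    """Product grid of (point, probability) pairs for independent shocks."""
    grid = []
    for idx in product(range(3), repeat=n_assets):
        point = tuple(NODES[i] for i in idx)
        prob = math.prod(WEIGHTS[i] for i in idx)
        grid.append((point, prob))
    return grid

grid = shock_grid(3)
print(len(grid), len(grid) * 19)   # return-shock points, and points with 19 market states
print([round(moment(k), 12) for k in range(6)])
```

The first five moments of the discrete rule agree with those of a standard normal (0, 1, 0, 3, 0), consistent with the moment-matching property described above.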
5.1.2. Model with 10 Risky Assets and No Predictability. We also consider examples with 10 risky assets and no predictability. In this case, we assume that the asset returns follow a discrete-time multivariate geometric Brownian motion process, a special case of (32) with $\boldsymbol{\rho}_t = \ln \mathbf{r}_t$ evolving according to $\boldsymbol{\rho}_{t+1} = \mathbf{a}_r + \mathbf{e}_{t+1}$, where the stochastic increments $\mathbf{e}_{t+1}$ are multivariate normal with mean zero and covariance $\Sigma_e$. In this example, the 10 risky assets correspond to five equity indices (S&P 500, Russell 2000 Value, MSCI World Gross, Russell 1000 Value Index, Russell MidCap Index), three bond indices (Lehman Brothers’ U.S. government and corporate bond indices and Lehman Brothers’ Fixed-Rate Mortgage-Backed Securities Index), a real estate index trust (NAREIT), and a composite index of one- to five-year U.S. Treasuries. The parameters were estimated using monthly return data from 1981 to 2006 and are provided in the online appendix. The risk-free rate $r_f = 1.00048$ is the average return on three-month U.S. treasuries, estimated from the same data set. These returns are not inflation adjusted.
We also use discrete approximations of the uncertainties to calculate expectations in this model. The idiosyncratic returns ($\mathbf{e}_{t+1}$) are approximated using a multidimensional quadrature (or cubature) formula in Stroud (1971, p. 317) that includes $2^n + 2n$ points and exactly matches the first five moments of the return distribution. With 10 assets, the return distribution is approximated using a 10-dimensional grid with a total of $2^{10} + 2 \times 10 = 1{,}044$ elements. There are a variety of different approaches we could use to calculate expectations in these models, and there is a trade-off between the accuracy of the approximation and the amount of work involved. Stroud (1971) provides a comprehensive review of multidimensional quadrature formulas; one such formula matches three moments of the underlying distribution and involves only $2n$ points. Alternatively, we might consider evaluating these expectations using Monte Carlo or quasi–Monte Carlo methods; see, e.g., Judd (1998) or Glasserman (2004) for discussions of these approaches.
Table 1 Run Times (Seconds) Required to Evaluate Heuristic Strategies and Dual Bounds

Column groups: Simple DP models (columns 2–3), Heuristic strategies (columns 4–7), Dual bounds (columns 8–12).

| Horizon (T) | No trans. cost model | Modified trans. cost | Cost blind | One-step | Modified one-step | Rolling buy-and-hold | Zero penalty | Modified one-step | Rolling buy-and-hold | Frictionless gradient based | Modified gradient |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Three-asset model with predictability | | | | | | | | | | | |
| 6 | 5.1 | 5.3 | 5.7 | 293.1 | 291.1 | 285.2 | 9.1 | 12.0 | 13.6 | 9.0 | 10.8 |
| 12 | 10.1 | 10.4 | 10.2 | 622.3 | 591.9 | 583.9 | 12.2 | 17.0 | 21.3 | 12.0 | 14.7 |
| 24 | 20.0 | 20.8 | 20.0 | 1,214.8 | 1,180.9 | 1,215.4 | 21.6 | 33.0 | 41.8 | 22.2 | 29.9 |
| 48 | 39.9 | 41.8 | 40.9 | 2,482.6 | 2,363.9 | 2,558.7 | 72.2 | 80.0 | 105.1 | 62.4 | 89.8 |
| Ten-asset model without predictability | | | | | | | | | | | |
| 6 | 0.8 | 0.8 | 9.5 | 743.0 | 839.0 | 870.2 | 10.3 | 14.8 | 20.6 | 12.4 | 13.1 |
| 12 | 1.5 | 1.5 | 16.9 | 1,485.5 | 1,643.7 | 1,784.9 | 20.4 | 25.5 | 34.4 | 21.3 | 22.2 |
| 24 | 2.8 | 3.1 | 34.0 | 3,020.5 | 3,243.0 | 3,354.9 | 55.4 | 60.3 | 77.8 | 57.7 | 62.9 |
| 48 | 5.6 | 6.3 | 66.6 | 6,150.6 | 6,742.6 | 7,238.9 | 269.0 | 221.4 | 250.1 | 253.4 | 291.4 |
5.2. Run Times
Table 1 provides the run times for evaluating the heuristics and dual bounds in a simulation with 1,000 trials for the two different return models and four different time horizons ($T$). In all cases, we consider the case with risk-aversion coefficient $\gamma = 3$ and the transaction cost rate $\delta = 0.01$; changes in these two parameters do not appreciably affect the run times. These computations were run on a Dell personal computer with a 2.55 GHz Intel Core 2 Quad CPU processor and 3.25 GB of RAM, running Windows XP. The calculations were done using Matlab with a single processor; the run times were estimated using Matlab’s Profiler utility. In our calculations, we used the general purpose MOSEK optimization toolbox for Matlab to solve the convex optimization problems. We could almost certainly improve the run times by developing more specialized code for the particular forms of optimization problems that we consider.
The first two columns in Table 1 report the times required to solve the two simple dynamic programs: the frictionless model and the model with modified transaction costs that is used to calculate the modified gradient bound. These models must be solved once, before running the simulation. As expected, the run times for these two models are quite similar and grow linearly with the number of periods in the model. The models without predictability take less time to solve: although they have more assets (10 rather than 3), they do not involve a market state variable, and there is only one scenario to evaluate in each period, as opposed to the 19 market states considered in the model with predictability.
Most of the time in the simulation is spent evaluating the heuristic strategies. The cost-blind heuristic is quite easy to evaluate, because we simply move to the posttrade asset allocations recommended by the frictionless model. The other heuristics require solving a convex optimization problem in each period to determine the trades that optimize the heuristic’s objective in that scenario. The run times thus grow linearly with the number of periods and the number of trials. The complexity of each of these convex optimization problems grows more than linearly in the number of assets (in theory no worse than polynomially), but this depends on the details of the optimization methods used.
The dual bounds take less time to calculate. Here we solve one deterministic inner problem for each trial; the number of decision variables is the number of assets $n$ times the number of periods $T$, or $2nT$ when we decompose the trades into their positive and negative components. The run times grow linearly in the number of trials, and the complexity of the convex optimization problem grows more than linearly in the number of decision variables involved ($nT$ or $2nT$); this polynomial growth is evident in the run times in Table 1 for the dual problems with increasing horizon $T$. The run times required to evaluate the heuristic strategies are longer than the run times for the dual problems because the optimization problems for the heuristic strategies involve high-dimensional expectations (over returns and market states) to calculate objective function values for each setting of the decision variables.
Finally, remember that the run times in Table 1 are the times required to evaluate the quality of the heuristic strategies and dual bounds. In practice, if we want to use the modified one-step or rolling buy-and-hold heuristics to recommend a trade, we need only solve the corresponding optimization problems once for the current state and period. Dividing the run times in Table 1 by the 1,000 trials in the simulation and the number of periods considered ($T$), we see that trades recommended by these heuristic strategies can be determined in a fraction of a second on a desktop PC.
5.3. Results
In each simulation, for each heuristic and dual bound, we calculate
• the average utility of final wealth or value for the dual bound; and
• the mean “turnover,” defined as the average volume of trade in each period, $(1/T) \sum_{t=0}^{T-1} \sum_{i=1}^{n} |a_{t,i}|$.
To simplify the interpretation of the results, we will convert the utilities and bounds to annualized certainty equivalent returns. Given a time horizon of $T$ months and a mean utility calculated in a simulation of $\bar{u}$, the annualized certainty equivalent return is defined as the constant annual return $r$ that yields utility $\bar{u}$, i.e., the $r$ that solves
$$\bar{u} = U(w_0 r^{T/12}), \tag{33}$$
where $w_0$ is the initial wealth. We estimate mean standard errors for these certainty equivalent returns and the duality gaps (the differences between upper and lower bounds on optimal returns) using the “delta method” (see, e.g., Casella and Berger 2002, p. 240) based on a first-order Taylor series expansion of the certainty equivalent formula (i.e., the inverse of Equation (33)).
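The conversion in Equation (33) and the delta-method standard error can be sketched as follows. The sketch assumes the usual power utility form $U(w) = w^{1-\gamma}/(1-\gamma)$ with $\gamma \neq 1$, which is an assumption of this example rather than a form restated in this section; the function names are hypothetical.

```python
def power_utility(w, gamma):
    """Assumed power utility, U(w) = w**(1 - gamma) / (1 - gamma), gamma != 1."""
    return w ** (1.0 - gamma) / (1.0 - gamma)

def annualized_certainty_equivalent(mean_utility, gamma, T_months, w0=1.0):
    """Solve mean_utility = U(w0 * r**(T/12)) for the annual gross return r."""
    ce_wealth = ((1.0 - gamma) * mean_utility) ** (1.0 / (1.0 - gamma))
    return (ce_wealth / w0) ** (12.0 / T_months)

def ce_standard_error(mean_utility, se_utility, gamma, T_months, w0=1.0):
    """Delta method: |dr/du| times the standard error of the mean utility."""
    r = annualized_certainty_equivalent(mean_utility, gamma, T_months, w0)
    dr_du = r * 12.0 / (T_months * (1.0 - gamma) * mean_utility)
    return abs(dr_du) * se_utility

# Round trip: a constant 5% annual return over 48 months with gamma = 3.
u = power_utility(1.05 ** (48 / 12), 3.0)
print(annualized_certainty_equivalent(u, 3.0, 48))
```

The derivative used in `ce_standard_error` follows from differentiating the logarithm of the inverse of Equation (33) with respect to the mean utility.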
To reduce the variance in our estimates of the expected utilities, we use a simple control variate technique (see, e.g., Glasserman 2004, p. 185) using the utilities for the frictionless model as a control variate. Specifically, for a given strategy $\alpha$, we estimate its expected utility as
$$\bar{u} = \frac{1}{S} \sum_{s=1}^{S} \Big\{ U(w_T(\alpha(\mathbf{r}_s, \mathbf{z}_s), \mathbf{r}_s)) + \beta \big[ V^f_0(w_0, z_0) - U(w^f_T(\alpha^f(\mathbf{r}_s, \mathbf{z}_s), \mathbf{r}_s)) \big] \Big\}, \tag{34}$$
where $S$ is the number of trials, $\mathbf{r}_s$ and $\mathbf{z}_s$ are the sequences of returns and market states in trial $s$, $\alpha(\mathbf{r}_s, \mathbf{z}_s)$ and $\alpha^f(\mathbf{r}_s, \mathbf{z}_s)$ are the trades for the chosen strategy and frictionless strategy in trial $s$, and $w_T$ and $w^f_T$ are the terminal wealths with and without transaction costs. Here $V^f_0(w_0, z_0)$ is the expected utility for the frictionless model in the initial state; this is computed before we begin the simulation. The term inside the square brackets in (34) has zero mean, so adjusting the estimate of expected utility by adding this term does not bias the estimate. The regression coefficient $\beta$ in (34) is given as $(\sigma_y / \sigma_x) \rho_{xy}$, where $\sigma_x$ is the standard deviation of $U(w^f_T(\alpha^f(\mathbf{r}_s, \mathbf{z}_s), \mathbf{r}_s))$, $\sigma_y$ is the standard deviation of $U(w_T(\alpha(\mathbf{r}_s, \mathbf{z}_s), \mathbf{r}_s))$, and $\rho_{xy}$ is the correlation between these quantities. The estimates for the gradient-based dual bounds of §4.3 are similarly adjusted using control variates.
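The control variate adjustment in Equation (34) reduces to a few lines once the per-trial utilities are available. The sketch below is a generic implementation with hypothetical argument names: `y` holds the simulated utilities for the strategy being evaluated, `x` holds the frictionless utilities from the same trials, and `x_mean_exact` is the frictionless expected utility computed before the simulation. The coefficient is estimated from the sample, using the identity that the standard-deviation ratio times the correlation equals the covariance divided by the variance of the control.

```python
from statistics import mean

def control_variate_estimate(y, x, x_mean_exact):
    """Mean of y adjusted by beta * (exact mean of x - sample mean of x)."""
    my, mx = mean(y), mean(x)
    cov_xy = mean((yi - my) * (xi - mx) for yi, xi in zip(y, x))
    var_x = mean((xi - mx) ** 2 for xi in x)
    beta = cov_xy / var_x          # equals (sigma_y / sigma_x) * rho_xy
    return my + beta * (x_mean_exact - mx)

# Illustrative data only: y is an exact linear function of the control x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi + 1.0 for xi in x]
print(control_variate_estimate(y, x, x_mean_exact=3.0))
```

When the two utilities are perfectly correlated, as in this artificial example, the adjusted estimate recovers the value implied by the exact control mean regardless of sampling error in the control; in practice the correlation is imperfect and the adjustment removes only part of the variance.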
Table 2 shows the simulation results for the 3-asset model with predictability for a time horizon ($T$) of 12 months; results for the other time horizons are shown in Table A3 in the online appendix. Figure 1 summarizes the results for all time horizons, transaction costs, and risk-aversion levels, showing the certainty equivalent returns for the best heuristic policy (the bottom end of the error bar) and the best dual bound (the upper end of the error bar); the length of the error bar thus represents the duality gap for a particular set of parameters. In these results, the annualized certainty equivalent returns are stated in percentage terms. For example, in the first row of Table 2, we see that in the case with risk-aversion coefficient $\gamma = 1.5$ and transaction cost rate $\delta = 0.5\%$, the modified one-step heuristic has an annualized certainty equivalent return quoted as 6.57%; this corresponds to an estimated value of $r$ in Equation (33) of 1.0657. The mean standard errors are also quoted in percentage terms; the 95% confidence interval on $r$ for the modified one-step strategy in this case is $1.0657 \pm 1.96 \times 0.0013$. The turnover means that an investor following the modified one-step strategy would execute trades averaging 9.5% of his initial wealth in each period.
In Table 2, we see that the modified one-step and rolling buy-and-hold heuristic strategies perform similarly and consistently outperform the cost-blind strategy and the (unmodified) one-step strategy. The rolling buy-and-hold strategy "wins" in most cases, but its performance is typically only slightly better than that of the modified one-step strategy. In most of these cases, the cost-blind strategy performs substantially worse than these two heuristic strategies, with larger differences occurring when the transaction costs are larger and when the investor is less risk averse (has a low value of $\gamma$). Looking at the turnover, we see that, as expected, the modified one-step and rolling buy-and-hold strategies trade less than the cost-blind strategy, but more than the (unmodified) one-step strategy.
Examining the dual bounds, we see that the zero-penalty bound performs very poorly, as expected: an investor with perfect foresight can achieve high returns, even with transaction costs. The bounds with penalties are much better and are also substantially better than the simple no-transaction-cost bound given by using the value function for the frictionless model. There is no consistent winner among the dual bounds: the modified one-step, modified gradient, and rolling buy-and-hold bounds all perform best in some cases. The modified gradient bounds win more often with higher levels of risk aversion (higher values of $\gamma$); this may be because the quality of the first-order Taylor series approximation underlying the modified one-step and rolling buy-and-hold bounds degrades with higher levels of risk aversion. The frictionless gradient-based bound consistently outperforms the no-transaction-cost bound (as it must, based on Proposition 4.2(3)), but is never the best of the dual bounds. Examining the mean standard errors, we see that the dual bounds with penalties are quite precisely estimated; the mean standard errors are typically much smaller than the mean standard errors associated with the modified one-step and rolling buy-and-hold heuristic strategies.

Table 2 Results for the 3-Asset Model with Predictability

All values are in percent. Columns 5 to 8 are heuristic strategies, columns 9 to 14 are dual bounds, and columns 15 to 17 report the best performance. In the mean standard error rows, the Gap column gives the mean standard error of the gap.

| Horizon ($T$) | Risk-aversion coeff. ($\gamma$) | Trans. cost rate ($\delta$) (%) | Measure (%) | Heuristic: cost blind | Heuristic: one-step | Heuristic: modified one-step | Heuristic: rolling buy-and-hold | Bound: zero penalty | Bound: modified one-step | Bound: rolling buy-and-hold | Bound: frictionless gradient-based | Bound: modified gradient | Bound: no-trans.-cost | Best strategy | Best upper bound | Gap |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 1.5 | 0.5 | CE return | 4.66 | 5.98 | 6.57 | 6.56 | 54.54 | 6.77 | 7.22 | 6.95 | 6.73 | 7.25 | 6.57 (modified one-step) | 6.73 (modified gradient) | 0.16 |
| | | | Mean std. error | 0.02 | 0.18 | 0.13 | 0.13 | 0.48 | 0.00 | 0.02 | 0.01 | 0.00 | | | | 0.13 |
| | | | Turnover | 41.7 | 6.1 | 9.5 | 9.3 | 70.5 | 8.1 | 16.2 | 4.5 | 7.1 | 42.3 | | | |
| 12 | 1.5 | 1.0 | CE return | 2.13 | 2.83 | 6.01 | 6.01 | 50.05 | 6.30 | 6.40 | 6.69 | 6.24 | 7.25 | 6.01 (rolling buy-and-hold) | 6.24 (modified gradient) | 0.23 |
| | | | Mean std. error | 0.05 | 0.18 | 0.14 | 0.15 | 0.46 | 0.01 | 0.02 | 0.01 | 0.00 | | | | 0.15 |
| | | | Turnover | 41.2 | 2.5 | 8.4 | 8.3 | 54.5 | 7.2 | 9.3 | 4.1 | 6.9 | 42.3 | | | |
| 12 | 1.5 | 2.0 | CE return | -2.74 | 0.51 | 5.03 | 5.03 | 43.42 | 5.41 | 5.32 | 6.23 | 5.35 | 7.25 | 5.03 (modified one-step) | 5.32 (rolling buy-and-hold) | 0.29 |
| | | | Mean std. error | 0.09 | 0.00 | 0.19 | 0.19 | 0.43 | 0.01 | 0.01 | 0.02 | 0.01 | | | | 0.19 |
| | | | Turnover | 40.0 | 0.0 | 6.7 | 6.6 | 40.2 | 6.9 | 7.9 | 3.5 | 6.6 | 42.3 | | | |
| 12 | 3 | 0.5 | CE return | 2.69 | 3.20 | 3.49 | 3.50 | 50.66 | 3.78 | 4.41 | 3.82 | 3.69 | 3.99 | 3.50 (rolling buy-and-hold) | 3.69 (modified gradient) | 0.19 |
| | | | Mean std. error | 0.01 | 0.11 | 0.07 | 0.07 | 0.48 | 0.00 | 0.02 | 0.00 | 0.00 | | | | 0.07 |
| | | | Turnover | 21.1 | 3.1 | 6.2 | 6.0 | 70.5 | 5.6 | 15.8 | 2.7 | 5.6 | 21.3 | | | |
| 12 | 3 | 1.0 | CE return | 1.40 | 1.65 | 3.20 | 3.21 | 46.39 | 3.56 | 3.68 | 3.66 | 3.43 | 3.99 | 3.21 (rolling buy-and-hold) | 3.43 (modified gradient) | 0.22 |
| | | | Mean std. error | 0.02 | 0.09 | 0.10 | 0.10 | 0.46 | 0.01 | 0.01 | 0.01 | 0.00 | | | | 0.10 |
| | | | Turnover | 21.0 | 1.2 | 4.9 | 4.7 | 54.5 | 5.2 | 8.2 | 2.5 | 5.4 | 21.3 | | | |
| 12 | 3 | 2.0 | CE return | -1.12 | 0.51 | 2.69 | 2.69 | 40.03 | 3.18 | 3.07 | 3.39 | 2.97 | 3.99 | 2.69 (rolling buy-and-hold) | 2.97 (modified gradient) | 0.28 |
| | | | Mean std. error | 0.05 | 0.00 | 0.12 | 0.12 | 0.42 | 0.01 | 0.01 | 0.01 | 0.01 | | | | 0.12 |
| | | | Turnover | 20.7 | 0.0 | 3.5 | 3.4 | 40.2 | 5.2 | 6.5 | 2.2 | 5.1 | 21.3 | | | |
| 12 | 8 | 0.5 | CE return | 1.30 | 1.50 | 1.60 | 1.60 | 40.83 | 1.76 | 2.36 | 1.73 | 1.69 | 1.79 | 1.60 (rolling buy-and-hold) | 1.69 (modified gradient) | 0.09 |
| | | | Mean std. error | 0.00 | 0.04 | 0.03 | 0.03 | 0.55 | 0.00 | 0.02 | 0.00 | 0.00 | | | | 0.03 |
| | | | Turnover | 8.1 | 1.2 | 2.4 | 2.3 | 70.5 | 3.6 | 15.2 | 1.0 | 3.2 | 8.1 | | | |
| 12 | 8 | 1.0 | CE return | 0.81 | 0.93 | 1.49 | 1.50 | 37.24 | 1.72 | 1.81 | 1.67 | 1.59 | 1.79 | 1.50 (rolling buy-and-hold) | 1.59 (modified gradient) | 0.10 |
| | | | Mean std. error | 0.01 | 0.03 | 0.04 | 0.04 | 0.50 | 0.01 | 0.01 | 0.00 | 0.00 | | | | 0.04 |
| | | | Turnover | 8.1 | 0.5 | 1.9 | 1.8 | 54.5 | 3.2 | 6.5 | 0.9 | 3.1 | 8.1 | | | |
| 12 | 8 | 2.0 | CE return | -0.17 | 0.51 | 1.31 | 1.31 | 31.78 | 1.69 | 1.57 | 1.57 | 1.43 | 1.79 | 1.31 (rolling buy-and-hold) | 1.43 (modified gradient) | 0.12 |
| | | | Mean std. error | 0.02 | 0.00 | 0.05 | 0.05 | 0.42 | 0.01 | 0.01 | 0.01 | 0.00 | | | | 0.05 |
| | | | Turnover | 8.0 | 0.0 | 1.3 | 1.3 | 40.2 | 3.6 | 4.7 | 0.8 | 3.0 | 8.1 | | | |

Gap summary: average 0.19, minimum 0.09, maximum 0.29.

Figure 1 Bounds on Optimal Returns in the 3-Asset Example
[Plot not reproduced in this transcript. Labels recovered from the figure: risk-aversion coefficient ($\gamma$) = 1.5, 3.0, and 8.0; transaction cost rate $\delta$ = 0.0%, 0.5%, 1.0%, and 2.0%; horizontal axis: horizon ($T$), 0 to 50; vertical axis: annualized certainty equivalent return (%), 0.0 to 8.0.]
The duality gaps (the difference between certainty equivalent returns for the best bound and best strategy) are often quite small. In these cases, there is relatively little room for improvement on the heuristic strategies. In Table 2, the gaps range from 0.09% to 0.29% and average 0.19%. The results are somewhat better for shorter time horizons and somewhat worse for the longer time horizons. With $T = 48$ months, the gaps for the 3-asset case average 0.32%. The duality gaps also appear to increase with higher transaction costs and with lower risk aversion. The worst gap for all of the 3-asset cases is 0.62% for the low risk aversion, large transaction cost case ($\gamma = 1.5$; $\delta = 0.02$) with the long time horizon ($T = 48$). The mean standard errors for the duality gaps are close to the mean standard errors for the heuristic strategies because the mean standard errors for the heuristic strategies dominate those of the dual bound.
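The gap summary quoted above can be reproduced from the Gap column of Table 2 with a small Python sketch. The function names are illustrative; the nine gap values and the first-row best bound and best strategy (6.73 and 6.57) are taken from Table 2, in percentage points.

```python
def duality_gap(best_bound, best_strategy):
    """Gap between the best dual bound and the best heuristic (percentage points)."""
    return best_bound - best_strategy


def summarize(gaps):
    """Average, minimum, and maximum of a list of gaps, rounded to two decimals."""
    return {
        "average": round(sum(gaps) / len(gaps), 2),
        "minimum": min(gaps),
        "maximum": max(gaps),
    }


# Gap column of Table 2 (T = 12), one entry per parameter case.
table2_gaps = [0.16, 0.23, 0.29, 0.19, 0.22, 0.28, 0.09, 0.10, 0.12]

first_row_gap = round(duality_gap(6.73, 6.57), 2)
summary = summarize(table2_gaps)
print(first_row_gap, summary)
```

Because the table entries are already rounded, a gap recomputed from the rounded bound and strategy values can differ from the reported gap by 0.01 in some rows.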
Table 3 shows the simulation results for the 10-asset model without predictability for a time horizon ($T$) of 12 months; results for the other time horizons are shown in Table A4 in the online appendix. These results are summarized in Figure 2. Here we find that the cost-blind strategy performs much better, because with no predictability, the cost-blind strategies trade much less: there is some rebalancing of the portfolio in response to idiosyncratic gains or losses on particular assets but no large-scale changes in the asset positions in response to changes in the market state variable. For example, in the low risk-aversion cases ($\gamma = 1.5$), the cost-blind strategy calls for placing all wealth in two assets, rebalancing these positions over time in response to idiosyncratic gains or losses. The modified one-step and rolling buy-and-hold heuristics place all of their wealth in these same two assets, but do not rebalance in subsequent periods. In many cases, the duality gap is very close to zero, suggesting the heuristic strategies are nearly optimal.²
5.4. Tuning the Heuristics
In general, we find the results of §5.3 encouraging, particularly given how little effort has been made to fine-tune the heuristics and penalties used. In many cases, the duality gaps are quite small, and the heuristic strategies are probably good enough for most practical applications. Where the duality gaps are larger, we may be able to improve performance by varying the heuristics and/or tightening the bounds by varying the penalties. For example, in the modified one-step strategy we reduced the transaction costs in the optimization problem (15) by dividing the costs by the smaller of 6 and the number of periods remaining. This rule apparently performs reasonably well in most cases considered here, but we could perhaps do better with different scaling rules. Similarly, we took the horizon $h$ in the rolling buy-and-hold objective (16) to be 6; again, we could perhaps do better with a different horizon.
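The cost-scaling rule mentioned above (divide the costs by the smaller of 6 and the number of periods remaining) can be written as a one-line helper. This is only a sketch of the scaling step for a proportional cost rate; the function and parameter names are hypothetical, and the optimization problem (15) in which the scaled cost is used is not shown here.

```python
def scaled_cost_rate(cost_rate, periods_remaining, max_divisor=6):
    """Scale a proportional transaction cost rate for the modified one-step heuristic.

    The rate is divided by the smaller of `max_divisor` and the number of
    periods remaining, so costs are spread over at most `max_divisor` periods.
    """
    if periods_remaining < 1:
        raise ValueError("periods_remaining must be at least 1")
    return cost_rate / min(max_divisor, periods_remaining)


# Example with a 2% cost rate and a 12-period horizon.
for remaining in (12, 6, 3, 1):
    print(remaining, scaled_cost_rate(0.02, remaining))
```

Changing `max_divisor` corresponds to trying a different scaling rule; with one period remaining the full cost rate applies.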
In Table 4, we report results with varying divisors and horizons for two cases where the duality gaps were largest; the bold entries in each row of the table correspond to the best heuristic and dual bound in each case. In the case with the largest duality gap for the 3-asset model ($\gamma = 1.5$; $\delta = 0.02$, $T = 48$), we can cut the gap of 0.62% to 0.47% by considering a longer horizon or larger divisor. Similarly, for the case with the largest gap in Table 3 ($\gamma = 8.0$; $\delta = 0.02$, $T = 12$) with a gap of 0.52%, we can improve the performance of the heuristics and reduce the gap to 0.03%. In this case, it appears that a rolling buy-and-hold strategy with a longer time horizon is nearly optimal.
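The search over divisors and horizons for the 3-asset case can be illustrated with the Table 4 values. The sketch below takes, for each setting, the best (highest) heuristic return and the best (lowest) penalized bound and reports the setting with the smallest gap. Variable and function names are illustrative, and because the inputs are rounded table entries, some recomputed gaps differ from the reported ones by 0.01.

```python
# Table 4, three assets with predictability (T = 48, gamma = 1.5, delta = 0.02).
settings = [3, 6, 9, 12, 15, 18, 21, 24]  # horizon (h) or divisor
heuristics = {
    "modified one-step": [5.77, 6.33, 6.42, 6.43, 6.40, 6.37, 6.34, 6.32],
    "rolling buy-and-hold": [5.76, 6.31, 6.42, 6.45, 6.45, 6.44, 6.42, 6.40],
}
bounds = {
    "modified one-step": [7.33, 7.04, 6.99, 6.93, 6.92, 6.92, 6.93, 6.93],
    "rolling buy-and-hold": [7.29, 6.97, 7.13, 7.57, 8.14, 8.69, 9.21, 9.66],
}


def gaps_by_setting(heuristics, bounds):
    """Gap (best bound minus best heuristic) for each divisor/horizon setting."""
    best_heuristic = [max(column) for column in zip(*heuristics.values())]
    best_bound = [min(column) for column in zip(*bounds.values())]
    return [round(b - h, 2) for h, b in zip(best_heuristic, best_bound)]


gaps = gaps_by_setting(heuristics, bounds)
best_index = min(range(len(gaps)), key=gaps.__getitem__)
print(settings[best_index], gaps[best_index])  # 15 0.47
```

The smallest gap, 0.47, occurs at a divisor or horizon of 15, which matches the reduction from 0.62% to 0.47% described in the text.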
5.5. Results with Short Sales and Borrowing
In the numerical experiments presented thus far, we have focused exclusively on the case where short positions and borrowing are not allowed. Alternatively, we can consider cases where borrowing and short positions are allowed and the investor faces a constraint on the total leverage allowed, i.e., a constraint of the form of (6) for a given leverage limit $l$. The set of feasible trades is larger in this case (and grows larger as $l$ increases), and one might wonder how the heuristics and dual bounds perform with leverage.
² There are a few cases where the estimated duality gap is slightly negative. In these cases, the estimated gaps are small compared to their mean standard errors; we believe the true gap is very close to zero and the negative estimate is a result of sampling error.
Table 3 Results for the 10-Asset Model Without Predictability

All values are in percent. Columns 5 to 8 are heuristic strategies, columns 9 to 14 are dual bounds, and columns 15 to 17 report the best performance. In the mean standard error rows, the Gap column gives the mean standard error of the gap.

| Horizon ($T$) | Risk-aversion coeff. ($\gamma$) | Trans. cost rate ($\delta$) (%) | Measure (%) | Heuristic: cost blind | Heuristic: one-step | Heuristic: modified one-step | Heuristic: rolling buy-and-hold | Bound: zero penalty | Bound: modified one-step | Bound: rolling buy-and-hold | Bound: frictionless gradient-based | Bound: modified gradient | Bound: no-trans.-cost | Best strategy | Best upper bound | Gap |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 1.5 | 0.5 | CE return | 11.13 | 11.76 | 13.06 | 13.06 | 53.76 | 13.06 | 13.06 | 13.37 | 13.08 | 13.62 | 13.06 (rolling buy-and-hold) | 13.06 (rolling buy-and-hold) | 0.00 |
| | | | Mean std. error | 0.15 | 0.04 | 0.00 | 0.00 | 0.31 | 0.00 | 0.00 | 0.00 | 0.00 | | | | 0.00 |
| | | | Turnover | 26.5 | 6.5 | 8.3 | 8.3 | 121.0 | 8.3 | 8.3 | 3.8 | 7.6 | 8.9 | | | |
| 12 | 1.5 | 1.0 | CE return | 9.45 | 5.91 | 12.50 | 12.50 | 46.07 | 12.50 | 12.50 | 13.16 | 12.55 | 13.62 | 12.50 (rolling buy-and-hold) | 12.50 (rolling buy-and-hold) | 0.00 |
| | | | Mean std. error | 0.15 | 0.00 | 0.00 | 0.00 | 0.27 | 0.00 | 0.00 | 0.01 | 0.00 | | | | 0.00 |
| | | | Turnover | 26.3 | 0.0 | 8.3 | 8.3 | 90.0 | 8.3 | 8.3 | 3.2 | 7.6 | 8.9 | | | |
| 12 | 1.5 | 2.0 | CE return | 7.23 | 5.91 | 11.39 | 11.39 | 36.40 | 11.39 | 11.39 | 12.83 | 11.48 | 13.62 | 11.39 (rolling buy-and-hold) | 11.39 (rolling buy-and-hold) | 0.00 |
| | | | Mean std. error | 0.13 | 0.00 | 0.00 | 0.00 | 0.24 | 0.00 | 0.00 | 0.02 | 0.00 | | | | 0.00 |
| | | | Turnover | 22.3 | 0.0 | 8.2 | 8.2 | 49.6 | 8.2 | 8.2 | 2.5 | 7.5 | 8.9 | | | |
| 12 | 3 | 0.5 | CE return | 11.26 | 8.81 | 11.34 | 11.34 | 52.32 | 11.37 | 11.36 | 11.71 | 11.38 | 11.91 | 11.34 (modified one-step) | 11.36 (rolling buy-and-hold) | 0.02 |
| | | | Mean std. error | 0.00 | 0.06 | 0.01 | 0.01 | 0.28 | 0.00 | 0.00 | 0.00 | 0.00 | | | | 0.01 |
| | | | Turnover | 9.8 | 3.3 | 8.3 | 8.3 | 121.0 | 8.3 | 8.3 | 3.2 | 7.6 | 9.8 | | | |
| 12 | 3 | 1.0 | CE return | 10.62 | 5.92 | 10.79 | 10.79 | 44.84 | 10.82 | 10.81 | 11.54 | 10.85 | 11.91 | 10.79 (modified one-step) | 10.81 (rolling buy-and-hold) | 0.02 |
| | | | Mean std. error | 0.00 | 0.00 | 0.01 | 0.01 | 0.25 | 0.00 | 0.00 | 0.01 | 0.00 | | | | 0.01 |
| | | | Turnover | 9.7 | 0.0 | 8.3 | 8.3 | 90.0 | 8.3 | 8.3 | 2.9 | 7.6 | 9.8 | | | |
| 12 | 3 | 2.0 | CE return | 9.35 | 5.91 | 9.24 | 9.23 | 35.36 | 9.81 | 9.81 | 11.23 | 9.80 | 11.91 | 9.35 (cost blind) | 9.80 (modified gradient) | 0.45 |
| | | | Mean std. error | 0.00 | 0.00 | 0.04 | 0.04 | 0.22 | 0.00 | 0.00 | 0.02 | 0.00 | | | | 0.00 |
| | | | Turnover | 9.7 | 0.0 | 6.0 | 5.9 | 49.6 | 8.2 | 8.2 | 2.5 | 7.5 | 9.8 | | | |
| 12 | 8 | 0.5 | CE return | 9.09 | 7.00 | 9.16 | 9.16 | 48.05 | 9.25 | 9.22 | 9.60 | 9.21 | 9.74 | 9.16 (rolling buy-and-hold) | 9.21 (modified gradient) | 0.05 |
| | | | Mean std. error | 0.00 | 0.05 | 0.01 | 0.01 | 0.33 | 0.00 | 0.00 | 0.00 | 0.00 | | | | 0.01 |
| | | | Turnover | 9.9 | 1.3 | 8.3 | 8.3 | 121.0 | 8.3 | 8.3 | 2.4 | 7.8 | 9.9 | | | |
| 12 | 8 | 1.0 | CE return | 8.45 | 5.91 | 8.49 | 8.49 | 41.26 | 8.75 | 8.74 | 9.47 | 8.68 | 9.74 | 8.49 (modified one-step) | 8.68 (modified gradient) | 0.19 |
| | | | Mean std. error | 0.00 | 0.00 | 0.03 | 0.03 | 0.29 | 0.00 | 0.00 | 0.01 | 0.00 | | | | 0.03 |
| | | | Turnover | 9.9 | 0.0 | 6.8 | 6.8 | 90.0 | 8.3 | 8.3 | 2.3 | 7.8 | 9.9 | | | |
| 12 | 8 | 2.0 | CE return | 7.19 | 5.91 | 7.15 | 7.14 | 32.45 | 8.64 | 8.60 | 9.24 | 7.71 | 9.74 | 7.19 (cost blind) | 7.71 (modified gradient) | 0.52 |
| | | | Mean std. error | 0.00 | 0.00 | 0.08 | 0.08 | 0.25 | 0.01 | 0.01 | 0.01 | 0.00 | | | | 0.00 |
| | | | Turnover | 9.8 | 0.0 | 2.3 | 2.3 | 49.6 | 8.1 | 8.2 | 2.1 | 7.7 | 9.9 | | | |

Gap summary: average 0.14, minimum 0.00, maximum 0.52.
Figure 2 Bounds on Optimal Returns in the 10-Asset Example
[Plot not reproduced in this transcript. Labels recovered from the figure: risk-aversion coefficient ($\gamma$) = 1.5, 3.0, and 8.0; transaction cost rate $\delta$ = 0.0%, 0.5%, 1.0%, and 2.0%; horizontal axis: horizon ($T$), 0 to 50; vertical axis: annualized certainty equivalent return (%), 5.0 to 15.0.]
To investigate this, we conducted a series of numerical experiments where we consider leverage constraints with varying limits. We focus on the 10-asset model without predictability and the case with $T = 12$, $\gamma = 3$, and $\delta = 1.0\%$ and consider leverage limits $l$ varying from 1 to 5. In this study, we varied the divisors and horizons for the modified one-step and rolling buy-and-hold heuristics and dual bounds (as in §5.4), considering values of 6, 9, and 12. The results are presented in Table 5.
In Table 5, we see that the certainty equivalent returns for the dual bounds and no-transaction-cost bounds are all increasing with the leverage limit, reflecting the larger set of feasible trades.³ The zero-penalty bound in particular increases greatly with higher leverage limits as the larger feasible sets allow the investor with advance knowledge of asset returns to more effectively exploit the arbitrage opportunities provided by such information. The performance of the modified one-step and rolling buy-and-hold heuristic strategies need not improve with larger feasible sets; the larger feasible sets lead to higher values for the heuristics' objective function (in (15) and (16)), but this may not actually lead to better performance.

Table 4 Certainty Equivalent Returns (%) with Varying Parameters for Heuristics

Three assets with predictability ($T = 48$, $\gamma = 1.5$, $\delta = 0.02$)

| Horizon ($h$) or divisor | 3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 |
|---|---|---|---|---|---|---|---|---|
| Heuristic: modified one-step | 5.77 | 6.33 | 6.42 | 6.43 | 6.40 | 6.37 | 6.34 | 6.32 |
| Heuristic: rolling buy-and-hold | 5.76 | 6.31 | 6.42 | 6.45 | 6.45 | 6.44 | 6.42 | 6.40 |
| Bound: modified one-step | 7.33 | 7.04 | 6.99 | 6.93 | 6.92 | 6.92 | 6.93 | 6.93 |
| Bound: rolling buy-and-hold | 7.29 | 6.97 | 7.13 | 7.57 | 8.14 | 8.69 | 9.21 | 9.66 |
| Gap: mean | 1.52 | 0.64 | 0.56 | 0.48 | 0.47 | 0.49 | 0.51 | 0.52 |
| Gap: std. error | 0.13 | 0.11 | 0.10 | 0.09 | 0.08 | 0.08 | 0.08 | 0.09 |

Ten assets without predictability ($T = 12$, $\gamma = 8$, $\delta = 0.02$)

| Horizon ($h$) or divisor | 3 | 6 | 9 | 12 |
|---|---|---|---|---|
| Heuristic: modified one-step | 6.26 | 7.15 | 7.54 | 7.61 |
| Heuristic: rolling buy-and-hold | 6.27 | 7.14 | 7.54 | 7.61 |
| Bound: modified one-step | 11.63 | 8.64 | 7.93 | 7.83 |
| Bound: rolling buy-and-hold | 11.71 | 8.60 | 7.86 | 7.64 |
| Gap: mean | 5.36 | 1.45 | 0.32 | 0.03 |
| Gap: std. error | 0.08 | 0.08 | 0.05 | 0.03 |
Overall, the duality gaps widen with higher leverage limits; the worst case is with a leverage limit $l = 5.0$, which has a gap of 0.47%. Although we could perhaps do better with more sophisticated heuristics and bounds, we note that in this worst case, the modified one-step and rolling buy-and-hold heuristics (with certainty equivalent returns of approximately 13.15%) greatly outperform the cost-blind strategy (return of 9.72%), and the dual bounds with penalties (13.61%) are much tighter than the no-transaction-cost bound (18.33%). Thus, these heuristics and dual bounds greatly outperform strategies and bounds that simply ignore transaction costs.
6. Conclusion
In this paper, we have studied some easy-to-compute heuristics for managing portfolios with transaction costs and developed a dual approach for examining the quality of these heuristics. The approach is general in that we can consider a variety of utility functions, a variety of forms for transaction costs (provided they are convex functions), and a variety of constraint sets (provided they are convex), as well as a variety of different models for returns. Our numerical experiments are promising: the run times, even without using customized optimization software, are reasonable, and in many cases, the performance of the heuristic strategy is very close to the upper bound, indicating that the heuristic strategies are very nearly optimal.
Frankly, we were surprised that these heuristics performed so well. At a high level, the key issue is to manage the trade-off between improving asset positions and minimizing transaction costs. These heuristics capture this trade-off in a relatively crude but apparently effective manner. In studying these heuristics, we find the dual upper bounds particularly helpful: When the bounds tell us that the performance of these heuristics is nearly optimal, we know that there is little to be gained by considering more complicated heuristics. Given the complexity of the full dynamic programming model with transaction costs, we find this quite reassuring.

³ Note that with power utility and a return model with log-normal returns, it is never optimal to borrow or take short positions because these positions lead to a positive probability of an infinite negative utility. However, with our discrete approximations of the return distribution, it is optimal to borrow and take short positions in this numerical example.

Table 5 Results for the 10-Asset Example with Varying Leverage Limit

Columns 3 to 6 are heuristic strategies, columns 7 to 12 are dual bounds, and columns 13 to 15 report the best performance. In the mean standard error rows, the Gap column gives the mean standard error of the gap.

| Leverage limit ($l$) | Measure | Heuristic: cost blind | Heuristic: one-step | Heuristic: modified one-step | Heuristic: rolling buy-and-hold | Bound: zero penalty | Bound: modified one-step | Bound: rolling buy-and-hold | Bound: frictionless gradient-based | Bound: modified gradient | Bound: no-trans.-cost | Best strategy | Best upper bound | Gap |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.0 | Best divisor/horizon | | | 6 | 6 | | 6 | 6 | | | | | | |
| | CE return (%) | 10.62 | 5.91 | 10.79 | 10.79 | 74.46 | 10.82 | 10.81 | 11.54 | 10.85 | 11.91 | 10.79 (modified one-step) | 10.81 (rolling buy-and-hold) | 0.02 |
| | Mean std. error (%) | 0.00 | 0.00 | 0.01 | 0.01 | 0.46 | 0.00 | 0.00 | 0.01 | 0.00 | | | | 0.00 |
| | Turnover (%) | 9.8 | 0.0 | 8.3 | 8.3 | 166.5 | 8.3 | 8.3 | 2.9 | 7.6 | 9.8 | | | |
| 2.0 | Best divisor/horizon | | | 6 | 6 | | 6 | 6 | | | | | | |
| | CE return (%) | 12.14 | 5.91 | 12.38 | 12.37 | 174.12 | 12.68 | 12.73 | 14.45 | 12.84 | 15.03 | 12.38 (modified one-step) | 12.68 (modified one-step) | 0.31 |
| | Mean std. error (%) | 0.00 | 0.00 | 0.01 | 0.01 | 1.42 | 0.00 | 0.00 | 0.01 | 0.00 | | | | 0.01 |
| | Turnover (%) | 21.3 | 0.0 | 18.7 | 18.7 | 417.4 | 16.4 | 16.7 | 4.8 | 14.7 | 21.8 | | | |
| 3.0 | Best divisor/horizon | | | 9 | 9 | | 9 | 9 | | | | | | |
| | CE return (%) | 12.09 | 5.92 | 13.14 | 13.18 | 314.06 | 13.60 | 13.53 | 16.10 | 13.54 | 16.89 | 13.18 (rolling buy-and-hold) | 13.53 (rolling buy-and-hold) | 0.35 |
| | Mean std. error (%) | 0.01 | 0.00 | 0.04 | 0.04 | 3.21 | 0.01 | 0.01 | 0.02 | 0.00 | | | | 0.05 |
| | Turnover (%) | 35.4 | 0.0 | 24.2 | 23.8 | 788.7 | 22.5 | 24.2 | 6.9 | 21.8 | 36.6 | | | |
| 4.0 | Best divisor/horizon | | | 12 | 12 | | 12 | 12 | | | | | | |
| | CE return (%) | 10.96 | 5.92 | 13.23 | 13.24 | 504.62 | 14.07 | 13.62 | 16.83 | 13.63 | 17.73 | 13.24 (rolling buy-and-hold) | 13.62 (rolling buy-and-hold) | 0.38 |
| | Mean std. error (%) | 0.01 | 0.00 | 0.11 | 0.10 | 6.27 | 0.03 | 0.01 | 0.03 | 0.00 | | | | 0.10 |
| | Turnover (%) | 50.0 | 0.0 | 25.7 | 24.8 | 1,325.6 | 23.3 | 15.8 | 8.0 | 28.6 | 52.4 | | | |
| 5.0 | Best divisor/horizon | | | 12 | 12 | | 12 | 12 | | | | | | |
| | CE return (%) | 9.72 | 5.92 | 13.12 | 13.15 | 757.54 | 14.21 | 13.61 | 17.39 | 13.72 | 18.33 | 13.15 (rolling buy-and-hold) | 13.61 (rolling buy-and-hold) | 0.47 |
| | Mean std. error (%) | 0.02 | 0.00 | 0.15 | 0.13 | 11.24 | 0.07 | 0.01 | 0.03 | 0.01 | | | | 0.13 |
| | Turnover (%) | 63.5 | 0.0 | 25.7 | 24.8 | 2,081.9 | 26.3 | 16.8 | 8.5 | 35.3 | 67.3 | | | |
Acknowledgments
The authors thank the anonymous associate editor and referees for their helpful comments.
ReferencesAkian, M., J. L. Menaldi, A. Sulem. 1996. On an investment-
consumption model with transaction costs. SIAM J. ControlOptim. 34(1) 329–364.
Andersen, L., M. Broadie. 2004. Primal-dual simulation algorithmfor pricing multidimensional American options. ManagementSci. 50(9) 1222–1234.
Brandt, M. W. 2010. Portfolio choice problems. Y. Aït-Sahalia,L. P. Hansen, eds. Handbook of Financial Econometrics. Elsevier,Oxford, UK, 269–336.
Brown, D. B., J. E. Smith, P. Sun. 2009. Information relaxationsand duality in stochastic dynamic programs. Oper. Res. 58(4)785–801.
Casella, G., R. L. Berger. 2002. Statistical Inference, 2nd ed. DuxburyPress, Pacific Grove, CA.
Chryssikou, E. 1998. Multiperiod portfolio optimization in the pres-ence of transaction costs, Ph.D. dissertation, Sloan School ofManagement, Massachusetts Institute of Technology, Boston.
Constantinides, G. 1979. Multiperiod consumption and investmentbehavior with convex transaction costs. Management Sci. 25(11)1127–1137.
Cvitanic, J., I. Karatzas. 1992. Convex duality in constrained port-folio optimization. Ann. Appl. Probab. 2(4) 767–818.
Davis, M. H. A., I. Karatzas. 1994. A deterministic approach to opti-mal stopping. F. P. Kelly, ed. Probability, Statistics and Optimiza-tion: A Tribute to Peter Whittle. John Wiley & Sons, Chichester,UK, 455–466.
Davis, M. H. A., A. Norman. 1990. Portfolio selection with transac-tion costs. Math. Oper. Res. 15(4) 676–713.
Gârleanu, N., L. Pedersen. 2009. Dynamic trading with predictablereturns and transaction costs. Working paper, Haas School ofBusiness, University of California, Berkeley, Berkeley.
Glasserman, P. 2004. Monte Carlo Methods in Financial Engineering.Springer-Verlag, New York.
Haugh, M. B., A. Jain. 2011. The dual approach to portfolio eval-uation: A comparison of the static, myopic and generalizedbuy-and-hold strategies. Quant. Finance 11(1) 81–99.
Haugh, M. B., L. Kogan. 2004. Pricing American options: A dualityapproach. Oper. Res. 52(2) 258–270.
Haugh, M. B., L. Kogan, J. Wang. 2006. Evaluating portfolio poli-cies: A duality approach. Oper. Res. 54(3) 405–418.
Judd, K. L. 1998. Numerical Methods in Economics. MIT Press,Cambridge, MA.
Leland, H. E. 2000. Optimal portfolio management with transactioncosts and capital gains taxes. Technical report, Haas School ofBusiness, University of California, Berkeley, Berkeley.
Lynch, A. W. 2001. Portfolio choice and equity characteristics: Char-acterizing the hedging demands induced by return predictabil-ity. J. Financial Econom. 62(1) 67–130.
Lynch, A. W., S. Tan. 2010. Multiple risky assets, transaction costsand return predictability: Allocation rules and implications forU.S. investors. J. Financial Quant. Anal. 45(4) 1015–1053.
Merton, R. C. 1969. Lifetime portfolio selection under uncertainty:The continuous time case. Rev. Econom. Statist. 51(3) 247–257.
Merton, R. C. 1971. Optimum consumption and portfolio rules in acontinuous time model. J. Econom. Theory 3(4) 373–413.
Mossin, J. 1968. Optimal multiperiod portfolio policies. J. Bus. 41(2)215–229.
Muthuraman, K. 2007. A computational scheme for optimalinvestment-consumption with proportional transaction costs.J. Econom. Dynam. Control 31(4) 1132–1159.
Muthuraman, K., S. Kumar. 2006. Multi-dimensional portfolio opti-mization with proportional transaction costs. Math. Finance16(2) 301–335.
Muthuraman, K., H. Zha. 2008. Simulation based portfolio opti-mization for large portfolios with transaction costs. Math.Finance 18(1) 115–134.
Rogers, L. C. G. 2002. Monte Carlo valuation of American options.Math. Finance 12(3) 271–286.
Rogers, L. C. G. 2003. Duality in constrained optimal investmentand consumption problems: A synthesis. P. Bank, F. Baudoin,H. Föllmer, L. C. G. Rogers, M. Soner, N. Touzi, eds. Paris-Princeton Lectures on Mathematical Finance 2002 (Lecture Notes inMathematics), Vol. 1814. Springer, Berlin, 95–131.
Samuelson, P. 1969. Lifetime portfolio selection by dynamic stochas-tic programming. Rev. Econom. Statist. 51(3) 239–246.
Shreve, S. E., H. M. Soner. 1994. Optimal investment and consump-tion with transaction costs. Ann. Appl. Probab. 4(3) 609–692.
Stroud, A. H. 1971. Approximate Calculation of Multiple Integrals.Prentice-Hall, Englewood Cliffs, NJ.