3. Decision making under uncertainty
Certainty and Uncertainty
Economic agents choose actions on the basis of the consequences that
the chosen actions produce. Other factors (the state of the world)
may interact with an action to produce a particular consequence.
A = set of feasible actions
S = set of possible states of the world
C = set of consequences
A combination of an action a ∈ A and a state s ∈ S will produce a
particular consequence c ∈ C.
(s, a) → c = f(s, a).
Uncertainty about the state of the world is often modelled by a
probability measure on S.
• Choosing an action “a” determines a consequence for each state
of the world, f(s, a). The decision over actions in A can therefore
be viewed as a decision over state-dependent consequences.
Write (c_{11}, c_{21}, ..., c_{s1}) as the state-contingent consequences
associated with action a_1. Choosing a_1 over a_2 is the same as
choosing (c_{11}, ..., c_{s1}) over (c_{12}, ..., c_{s2}).
• If f is constant with respect to the state of the world, then the
decision is taken under certainty.
Alternative viewpoint – choice of probability distribution over outcomes
The relationship among actions, states of the world and consequences
is described by f : S × A → C.
Since a probability measure is defined on S, there is an induced
probability distribution on the set of consequences for each action.
Consider an action a ∈ A and any (measurable) subset of consequences
K ⊂ C:
Prob{K} := Prob{s ∈ S | f(s, a) ∈ K}.
The probability of a particular consequence equals the probability
of the states of the world which lead to this consequence, given a
particular action.
Hence, the choice of an action amounts to the choice of a probability
distribution on consequences (like the choice of different gambles
or investment choices – a choice among alternative probability
distributions).
Example
Consider a price-taking firm which maximizes profit by choosing a
single input, labor ℓ. Let φ(ℓ) denote the production function, and
let w and p be the prices of input and output.
The firm’s profit function: π(ℓ) = pφ(ℓ) − wℓ.
Action = choice of input level ℓ; consequence = profit π(ℓ).
Assume there are two states, s_1 and s_2, and two input levels, ℓ_1 and ℓ_2.
Prob{π(s_1, ℓ)} = Prob{s_1},  Prob{π(s_2, ℓ)} = Prob{s_2}.
Let the production function be
φ(s, ℓ) = √ℓ for s = s_1 (rainy), and φ(s, ℓ) = 2√ℓ for s = s_2 (sunny).
Assume Prob{s_1} = 3/4, Prob{s_2} = 1/4; p = 2 and w = 1.
Choosing ℓ = 1 implies Prob{π = 1} = 3/4, Prob{π = 3} = 1/4.
Choosing ℓ = 4 implies Prob{π = 0} = 3/4, Prob{π = 4} = 1/4.
In the state-space approach, a choice of action ℓ_i is a choice of the
state-contingent profit level (π(s_1, ℓ_i), π(s_2, ℓ_i)).
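The induced-distribution view of this example can be sketched in a few lines of Python. This is our own illustration (the function names are not from the notes), taking p = 2 and w = 1, the input price consistent with the profit levels listed above:

```python
# Sketch of the firm example: an action (labor choice) induces a
# probability distribution over profit consequences.
def profit(state, labor, p=2.0, w=1.0):
    """pi(s, l) = p*phi(s, l) - w*l, with phi(s1, l) = sqrt(l), phi(s2, l) = 2*sqrt(l)."""
    phi = labor ** 0.5 if state == "s1" else 2.0 * labor ** 0.5
    return p * phi - w * labor

def induced_distribution(labor, probs=(("s1", 0.75), ("s2", 0.25))):
    """Map an action to the induced distribution {profit: probability}."""
    dist = {}
    for s, ps in probs:
        pi = profit(s, labor)
        dist[pi] = dist.get(pi, 0.0) + ps
    return dist

print(induced_distribution(1.0))  # {1.0: 0.75, 3.0: 0.25}
print(induced_distribution(4.0))  # {0.0: 0.75, 4.0: 0.25}
```

Each action is thus identified with a probability distribution over consequences, exactly as in the text.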
Objects of choice can be viewed either as
• state-contingent outcomes
• probability distributions.
Formalism
Given a set of outcomes C and a probability distribution on the
set of states, each action induces a probability distribution on the
outcomes in C.
If the set of consequences is finite, C = {c_1, ..., c_n}, then each action
determines a vector of probabilities from the set
Δ_n = { (p_1, ..., p_n) ∈ R_+^n | Σ_{i=1}^n p_i = 1 }
with p_i = Prob({s ∈ S | f(s, a) = c_i}). Here, Δ_n is an (n − 1)-
dimensional simplex.
Von Neumann-Morgenstern utility index – utility function over outcomes u(c_i).
Given a von Neumann-Morgenstern utility function u, one can treat
the expected utility representation Σ_{i=1}^n p_i u(c_i) as a function of the
probability distribution (p_1, ..., p_n). Define a utility function on
probabilities as
U(p_1, ..., p_n) = Σ_{i=1}^n p_i u(c_i).
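As a quick illustration, the map from a point of Δ_n to expected utility is a one-liner; the helper below is our own sketch, not part of the notes:

```python
import math

# Expected utility U on the probability simplex, given a vNM index u.
def expected_utility(p, outcomes, u):
    """U(p_1, ..., p_n) = sum_i p_i * u(c_i) for a point p of the simplex."""
    assert abs(sum(p) - 1.0) < 1e-12 and min(p) >= 0.0  # p must lie in the simplex
    return sum(pi * u(c) for pi, c in zip(p, outcomes))

outcomes = [1.0, 4.0, 9.0]
# With the concave index u = sqrt: 0.5*1 + 0.25*2 + 0.25*3 = 1.75
print(expected_utility([0.5, 0.25, 0.25], outcomes, math.sqrt))  # 1.75
```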
Theorem (existence of representation of preferences by a continuous utility function)
Assumptions on a preference ordering over probability distributions:
1. Completeness requires the ordering to order any pair of probability
distributions in Δ_n.
2. Transitivity: if p ≽ q and q ≽ r, then p ≽ r.
3. Continuity: for a continuous transformation of a probability
distribution p into another probability distribution q, where q ≻ p,
the course of the transformation passes through a probability
distribution that is indifferent to any probability distribution ranked
between p and q. That is, preferences over probability distributions
do not change abruptly.
Existence of utility function on ∆n
If a preference ordering over the probability distributions in Δ_n
satisfies completeness, transitivity and continuity, there exists a utility
function U : Δ_n → R that represents this preference ordering. The
utility function U(·) is unique up to a monotone transformation.
[One can take any strictly increasing function f : R → R, say f(x) =
exp(x), to obtain another equivalent utility function Ũ(p) = f(U(p)).]
This representation evaluates a probability distribution p = (p_1, ..., p_n)
over outcomes (c_1, ..., c_n) by forming a weighted average of the
utilities u(c_i) derived from the different outcomes, using the
probabilities as weights, i.e. by computing the expected utility.
Independence axiom
The preference relation on ∆n represented by the utility function
U(·) satisfies for any p, q, r ∈ ∆n and any α ∈ [0,1]
U(αp + (1− α)r) ≥ U(αq + (1− α)r) iff U(p) ≥ U(q).
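Since an expected-utility functional is linear in the probabilities, it satisfies the independence axiom automatically. A quick numeric check (our own sketch, with an illustrative √-index and arbitrary distributions):

```python
import math

# Verify the independence axiom for one linear (expected-utility) functional:
# mixing both sides with the same r and weight alpha preserves the ranking.
outcomes = [1.0, 4.0, 9.0]

def U(p):
    """Expected utility with the concave index u(c) = sqrt(c)."""
    return sum(pi * math.sqrt(c) for pi, c in zip(p, outcomes))

def mix(a, p, r):
    """The compound lottery a*p + (1 - a)*r."""
    return tuple(a * pi + (1 - a) * ri for pi, ri in zip(p, r))

p, q, r, alpha = (1.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.2, 0.3, 0.5), 0.4
lhs = U(mix(alpha, p, r)) >= U(mix(alpha, q, r))
rhs = U(p) >= U(q)
print(lhs == rhs)  # True: the ranking is unchanged by mixing with r
```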
• One can decompose any two probability distributions into parts
that are identical and parts that are different.
Ranking of different parts of a compound probability distribution
Example
An investor is indifferent between X and Y; Z is a third prospect. The
investor should then be indifferent between these two gambles:
• X with probability p and Z with probability 1 − p;
• Y with probability p and Z with probability 1 − p.
If a person were indifferent between having a Ford or a Datsun, she
would be indifferent between buying a lottery ticket for $10 that gave
a 1 in 500 chance of winning a Ford and a ticket for $10 that gave the
same chance of winning a Datsun.
Allais Paradox (1952)
C1 = 5 million, C2 = 1 million, C3 = 0
       Prob{C_1}   Prob{C_2}   Prob{C_3}
  p       0           1           0
  q       0.1         0.89        0.01
  r       0.1         0           0.9
  s       0           0.11        0.89
Most people prefer p over q (they do not consider the 10% chance of
winning 5 million worth the 1% risk of losing the sure one million).
Most people prefer r over s. According to the independence axiom,
one of the following must be true for the preferences.
(i) U(p) > U(q) and U(s) > U(r), or
(ii) U(q) > U(p) and U(r) > U(s), or
(iii) U(p) = U(q) and U(r) = U(s).
The actual behaviors in the experiment violate the independence axiom.
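The violation can also be seen mechanically: since q − p = r − s componentwise, any functional that is linear in the probabilities assigns the same value to U(q) − U(p) and U(r) − U(s), so the two preference gaps must have the same sign, whatever the index u. A short check (our own sketch):

```python
# Allais lotteries over outcomes (5 million, 1 million, 0).
p = (0.0, 1.0, 0.0)
q = (0.1, 0.89, 0.01)
r = (0.1, 0.0, 0.9)
s = (0.0, 0.11, 0.89)

# q - p and r - s componentwise (rounded to absorb float noise):
diff_pq = tuple(round(b - a, 10) for a, b in zip(p, q))
diff_sr = tuple(round(b - a, 10) for a, b in zip(s, r))
print(diff_pq == diff_sr)  # True: U(q)-U(p) = U(r)-U(s) for every linear U
```

Hence preferring p over q while preferring r over s cannot be rationalized by any expected-utility index.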
Theorem (expected utility representation)
A utility function U on ∆n satisfies the independence axiom iff there
is a utility function over outcomes u : C → R such that for all p and
q ∈ ∆n
U(p) ≥ U(q) iff Σ_{i=1}^n p_i u(c_i) ≥ Σ_{i=1}^n q_i u(c_i).
Remark
1. While the utility function U in the earlier representation theorem
is unique up to a monotone transformation, utility indexes are
unique only up to a positive affine transformation: V(x) = a + bU(x),
for a, b ∈ R and b > 0.
2. Utility indexes must be bounded in order that a well-defined
expected utility function exists. An example is the failure in the
St. Petersburg Paradox when the linear utility index U(x) = x is
used.
Certainty equivalent, risk premium and risk aversion
1. The certainty equivalent of a probability distribution F is the
real number C(F) that satisfies
u(C(F)) = ∫_C u(x) dF(x) ≜ U(F).
2. The risk premium is the real number q(F) that satisfies
q(F) = μ(F) − C(F),
where μ(F) = ∫_C x dF(x) = expected value of F.
Would he prefer to receive the expected value of a lottery with
certainty rather than to receive the lottery itself? The agent is
risk-averse if q(F) > 0,
risk-neutral if q(F) = 0,
risk-loving if q(F) < 0,
for all probability distributions F.
Alternative approach: whether the investor prefers a probability
distribution to its expected value.
Consider u(μ(F) − q(F)) = u(C(F)) = ∫ u(x) dF(x) ≜ U(F). Since
u(x) is strictly increasing, we have
q(F) ⋛ 0 ⟺ u(μ(F)) ⋛ U(F),
where μ(F) denotes the expected value of the distribution F and
U(F) is the expected utility of the distribution F.
Consider an arbitrary distribution F that is concentrated on the two
outcomes x_1 and x_2:
u(μ(F)) = u(p_1 x_1 + p_2 x_2) ⋛ p_1 u(x_1) + p_2 u(x_2) = U(F)
depending on whether the agent is risk-averse, risk-neutral or
risk-loving.
A function u : R → R is concave, linear or convex according as
u(λx_1 + (1 − λ)x_2) ⋛ λu(x_1) + (1 − λ)u(x_2), 0 ≤ λ ≤ 1.
Conclusion: an expected-utility maximizing agent is risk-averse,
risk-neutral or risk-loving according as u(x) is concave, linear or
convex.
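The certainty equivalent and risk premium of a discrete gamble are easy to compute numerically. The sketch below is our own illustration with the concave index u(x) = √x, for which the premium comes out positive, as the conclusion above predicts:

```python
import math

# Certainty equivalent C(F) = u^-1(E[u(x)]) and risk premium mu(F) - C(F)
# for a discrete gamble, with u(x) = sqrt(x) (a concave, risk-averse index).
def certainty_equivalent(gamble, u, u_inv):
    """gamble: list of (outcome, probability) pairs."""
    eu = sum(p * u(x) for x, p in gamble)   # expected utility U(F)
    return u_inv(eu)

gamble = [(100.0, 0.5), (0.0, 0.5)]
u, u_inv = math.sqrt, lambda y: y * y
ce = certainty_equivalent(gamble, u, u_inv)
mean = sum(p * x for x, p in gamble)
print(ce, mean - ce)   # 25.0 25.0 -> q(F) = mu(F) - C(F) > 0: risk aversion
```

The agent would accept 25 for sure in place of a 50-50 gamble on 100, even though the gamble's expected value is 50.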
Stochastic dominance
• Knowing the utility function, we have full information on
preferences. Using the maximum expected utility criterion, we
obtain a complete ordering of all the investments under
consideration.
• What happens if we have only partial information on preferences
(say, prefer more to less and/or risk aversion)?
• For example, in the First Order Stochastic Dominance Rule,
we only consider the class of utility functions, call it U_1, such that
u′ ≥ 0. This is a very general assumption and it does not assume
any specific utility function.
Dominance in U1
Investment A dominates investment B in U_1 if for all utility functions
u ∈ U_1, E_A[u(x)] ≥ E_B[u(x)], or equivalently U(F_A) ≥ U(F_B),
and for at least one utility function the inequality is strict.
Efficient set in U1 (not being dominated)
An investment is included in the efficient set if there is no other
investment that dominates it.
Inefficient set in U1 (being dominated)
The inefficient set includes all inefficient investments. An inefficient
investment is one that is dominated by at least one investment in
the efficient set.
The partition into efficient and inefficient sets depends on the choice
of the class of utility functions. In general, the smaller the efficient
set relative to the feasible set, the better for the decision maker.
First order stochastic dominance
Can we argue that Investment A is better than Investment B? It
is still possible that the return from investing in B is 11% while the
return from investing in A is only 8%.
• By looking at the cumulative probability distributions, we observe
that at every return level, the probability of obtaining that return
or less is at least as high under B as under A.
Recall that for each action a ∈ A, there is an induced probability
distribution on C (the set of all consequences). To compare
two choices of action, we examine their corresponding probability
distributions.
Definition
A probability distribution F dominates another probability
distribution G according to the first-order stochastic dominance if
F(x) ≤ G(x) for all x ∈ C.
Lemma
F dominates G by FSD if and only if
∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)
for all strictly increasing expected utility indexes u(x).
Proof
Let a and b be the smallest and largest values F and G can take on.
Consider
∫_a^b u(x) d[F(x) − G(x)] = u(x)[F(x) − G(x)]|_a^b − ∫_a^b u′(x)[F(x) − G(x)] dx,
where the boundary term vanishes since F(a) = G(a) = 0 and
F(b) = G(b) = 1. Hence
∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x) ⟺ −∫_a^b u′(x)[F(x) − G(x)] dx ≥ 0.
Thus,
F(x) ≤ G(x) for all x ⟺ ∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)
for all u with u′(x) > 0.
A ≥_FSD B iff r_A =_d r_B + α, where α ≥ 0.
That is, asset A’s rate of return is equal in distribution to asset B’s
rate of return plus a non-negative random variable α.
This arises from the relation
E[u(1 + r_A)] = E[u(1 + r_B + α)] ≥ E[u(1 + r_B)].
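For discrete distributions, the FSD condition F(x) ≤ G(x) can be checked directly on a common grid of outcomes. The helper below is our own sketch, with hypothetical return distributions:

```python
# Check first-order stochastic dominance for two discrete distributions
# by comparing their CDFs at every outcome of either distribution.
def cdf(dist, x):
    """dist: {outcome: probability}; returns F(x) = P(X <= x)."""
    return sum(p for v, p in dist.items() if v <= x)

def fsd_dominates(F, G):
    """True if F(x) <= G(x) at every point of the combined support."""
    grid = sorted(set(F) | set(G))
    return all(cdf(F, x) <= cdf(G, x) for x in grid)

A = {8: 0.5, 11: 0.5}   # hypothetical returns (%) for investment A
B = {5: 0.5, 11: 0.5}   # B shifts probability mass toward lower returns
print(fsd_dominates(A, B))  # True
print(fsd_dominates(B, A))  # False
```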
Second order stochastic dominance
If both investments turn out the worst, the investor obtains 6%
from A and only 5% from B. If the second worst return occurs, the
investor obtains 8% from A rather than 9% from B. If he is risk
averse, he should be willing to give up 1% of return at a higher
return level in order to obtain an extra 1% at a lower return level.
If risk aversion is assumed, then A is preferred to B.
Definition
A probability distribution F dominates another probability
distribution G according to the second order stochastic dominance
if for all x ∈ C,
∫_{−∞}^x F(y) dy ≤ ∫_{−∞}^x G(y) dy.
According to SSD, A is preferred over B since the sum of cumulative
probability for A is always less than or equal to that for B.
Theorem
If F dominates G by SSD, then
∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x)
for all increasing and concave expected utility indexes u(x).
Proof
∫_a^b u(x) d[F(x) − G(x)] = −∫_a^b u′(x)[F(x) − G(x)] dx
= −u′(x) ∫_a^x [F(y) − G(y)] dy |_a^b + ∫_a^b u″(x) ∫_a^x [F(y) − G(y)] dy dx
= −u′(b) ∫_a^b [F(y) − G(y)] dy + ∫_a^b u″(x) ∫_a^x [F(y) − G(y)] dy dx.
Given that u′(b) > 0 and u″(x) < 0,
∫_C u(x) dF(x) ≥ ∫_C u(x) dG(x) if ∫_a^x [F(y) − G(y)] dy ≤ 0 for all x.
Example
F(x) = 0 if x < 1; x − 1 if 1 ≤ x ≤ 2; 1 if x ≥ 2.
G(x) = 0 if x < 0; x/3 if 0 ≤ x ≤ 3; 1 if x ≥ 3.
F dominates G by SSD since ∫_{−∞}^x F(y) dy ≤ ∫_{−∞}^x G(y) dy
for all x. F(x) is seen to be more concentrated (less dispersed).
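This example can be verified with the exact antiderivatives of the two CDFs (F is uniform on [1, 2], G is uniform on [0, 3]); the check below is our own sketch:

```python
# Exact integrals of the CDFs from -infinity to x for the SSD example.
def int_F(x):
    """F uniform on [1, 2]: integral of F up to x."""
    if x < 1.0: return 0.0
    if x <= 2.0: return (x - 1.0) ** 2 / 2.0
    return 0.5 + (x - 2.0)

def int_G(x):
    """G uniform on [0, 3]: integral of G up to x."""
    if x < 0.0: return 0.0
    if x <= 3.0: return x * x / 6.0
    return 1.5 + (x - 3.0)

xs = [i * 0.01 for i in range(-100, 501)]             # grid over [-1, 5]
print(all(int_F(x) <= int_G(x) + 1e-12 for x in xs))  # True: F dominates G by SSD
```

The two integrals coincide for x ≥ 3, so the dominance is weak there and strict below.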
Sufficient rules and necessary rules for second order stochastic
dominance
Sufficient rule 1: the FSD rule is sufficient for SSD.
Proof: If F dominates G by FSD, then F(x) ≤ G(x) for all x.
This implies ∫_a^x [G(y) − F(y)] dy ≥ 0.
Remark
The efficient set according to SSD is smaller than that of FSD.
Since the SSD rule adds risk aversion to the FSD rule, more pairs of
investments can be ranked: every element of the inefficient set
according to FSD stays in the inefficient set of SSD, while some
FSD-efficient investments become SSD-inefficient.
Sufficient rule 2:
Min_F(x) ≥ Max_G(x) is a sufficient rule for SSD.
Example

        F                  G
   x      p(x)        x      p(x)
   5      1/2         2      3/4
   10     1/2         4      1/4

Min_F(x) = 5 ≥ Max_G(x) = 4 so that F(x) ≤ G(x). Hence, F
dominates G.
Min_F(x) ≥ Max_G(x) ⇒ FSD ⇒ SSD ⇒ E_F[u(x)] ≥ E_G[u(x)] for all u ∈ U_2.
Necessary rule 1 (geometric means)
Given a risky project with the distribution (x_i, p_i), i = 1, ..., n, the
geometric mean X_geo is defined as
X_geo = x_1^{p_1} ··· x_n^{p_n} = ∏_{i=1}^n x_i^{p_i}, x_i ≥ 0.
Taking logarithms on both sides,
ln X_geo = Σ p_i ln x_i = E[ln X].
Xgeo(F ) ≥ Xgeo(G) is a necessary condition for dominance of F over G by SSD.
Proof
Suppose F dominates G by SSD; we have
E_F[u(x)] ≥ E_G[u(x)] for all u ∈ U_2.
Since u(x) = ln x ∈ U_2,
E_F[ln x] = ln X_geo(F) ≥ E_G[ln x] = ln X_geo(G).
Since the logarithm is an increasing function, we deduce
X_geo(F) ≥ X_geo(G). Therefore,
F dominates G by SSD ⇒ X_geo(F) ≥ X_geo(G).
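The necessary rule is easy to apply numerically. The sketch below (our own) computes the geometric means of the two prospects from the example under Sufficient rule 2, where F dominates G:

```python
# Geometric mean Xgeo = prod_i x_i ** p_i of a discrete prospect.
def geometric_mean(prospect):
    """prospect: list of (outcome, probability) pairs, outcomes > 0."""
    g = 1.0
    for x, p in prospect:
        g *= x ** p
    return g

F = [(5.0, 0.5), (10.0, 0.5)]   # dominates G (Min_F = 5 >= Max_G = 4)
G = [(2.0, 0.75), (4.0, 0.25)]
print(geometric_mean(F))   # sqrt(50) ~ 7.071
print(geometric_mean(G))   # 2**0.75 * 4**0.25 ~ 2.378
```

Since F dominates G by SSD, the rule requires Xgeo(F) ≥ Xgeo(G), which indeed holds here.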
Necessary rule 2 (left-tail rule)
Suppose F dominates G by SSD; then
Min_F(x) ≥ Min_G(x),
that is, the left tail of G must be “thicker”.
Proof by contradiction: suppose Min_F(x) < Min_G(x), and write
x_k = Min_G(x). Below x_k, G is still zero while F is already positive.
Observe that
∫_{−∞}^{x_k} [G(y) − F(y)] dy = ∫_{−∞}^{x_k} [0 − F(y)] dy < 0,
implying F does not dominate G by SSD, a contradiction. Hence, if
F dominates G, then Min_F(x) ≥ Min_G(x).
Skewness and portfolio analysis (Third-order stochastic dominance)
Skewness is a measure of asymmetry of a distribution, defined by
μ_3/σ^3, where μ_3 is the third central moment. For example, the
normal distribution has zero skewness, while the log-normal return
distribution exhibits positive skewness.
Empirical studies show that investors prefer positive skewness: all
else constant, they prefer portfolios with a higher probability of very
large payoffs.
Portfolio analysis is based on the first three moments of return
distribution rather than just mean and variance.
The utility of an agent can be constructed in terms of the moments
of the probability distributions.
For any distribution function p,
μ(p) = ∫ w dp(w),
σ²(p) = ∫ [w − μ(p)]² dp(w).
Question: How can a utility function V(μ, σ) be justified in terms
of expected utility theory?
Two possibilities
1. Placing restrictions on the probability distribution p.
2. Placing restrictions on the expected utility function u(·) defined
on consequences.
Formulation
Consider an expected utility function defined on wealth levels u(w)
and a wealth distribution function p. Write p(·|M) to represent
the distribution function p determined by M , where M is the set of
moments of the distribution. For the expected utility of a probability
distribution
V(M) ≜ U(p(·|M)) = ∫ u(w) dp(w|M).
The expected utility is a function of all moments of the distribution
p.
If a distribution is completely described by its first two moments
(µ, σ), then the expected utility function based upon this distribution
will be a function of these two moments. The normal distribution
is the only distribution that is fully characterized by its first two
moments.
Quadratic utility
— places no constraints on the distribution function p.
u(w) = αw² + w, α ∈ R.
For an arbitrary p,
∫ u(w) dp(w) = α ∫ w² dp(w) + ∫ w dp(w) = α[σ²(p) + μ(p)²] + μ(p).
Hence, for a quadratic expected utility index u(w), the expected
utility function depends exclusively on μ(p) and σ²(p).
When α < 0, u(w) is decreasing in w for w > −1/(2α) (violating the
axiom of non-satiation). Also, for α < 0, the quadratic utility
demonstrates increasing absolute risk aversion:
R_a(w) = −2α/(2αw + 1) and R_a′(w) = 4α²/(2αw + 1)² > 0.
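The mean-variance property of quadratic utility can be checked directly: two quite different wealth distributions with the same mean and variance must give the same expected utility. The sketch below is our own check, with an illustrative α = −0.01:

```python
# For a quadratic index u(w) = alpha*w^2 + w, expected utility depends on
# the wealth distribution only through its mean mu and variance sigma^2.
ALPHA = -0.01

def eu_from_moments(mu, var, alpha=ALPHA):
    """alpha*(sigma^2 + mu^2) + mu, as derived in the text."""
    return alpha * (var + mu * mu) + mu

def eu_direct(sample, alpha=ALPHA):
    """Average of u(w) over an equally weighted sample of wealth levels."""
    return sum(alpha * w * w + w for w in sample) / len(sample)

# Two different samples engineered to share mean 10 and variance 4:
sample_a = [8.0, 12.0]
sample_b = [10.0 - 6.0 ** 0.5, 10.0, 10.0 + 6.0 ** 0.5]
for s in (sample_a, sample_b):
    mu = sum(s) / len(s)
    var = sum((w - mu) ** 2 for w in s) / len(s)
    print(round(eu_direct(s), 9), round(eu_from_moments(mu, var), 9))  # 8.96 8.96
```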
Two-asset portfolio analysis – risky asset and riskfree asset
• absolute risk aversion and the demand function for the risky asset
Let a denote the number of units of the risky asset,
b denote the number of units of the riskfree asset;
r_s = return from the risky asset in state s,
R = return from the riskless asset.
The return from the portfolio (a, b) in state s is
W_s(a, b) = r_s a + R b.
Let the price of the risky asset be q and the price of the riskless
asset be the numeraire.
The investor’s budget constraint is W_0 = aq + b, where W_0 is the
initial wealth of the investor; hence b = W_0 − qa. We assume no
short selling, so that a > 0.
Assume a finite set of states S = {1, · · · , s} with probability distri-
bution p = (p1, · · · , ps).
The optimization problem of an expected utility-maximizing investor:
choose (a, b) to maximize
Σ_{s∈S} p_s u(W_s(a, b))
subject to qa + b = W_0. Substituting the budget constraint, choose
a to maximize
Σ_{s∈S} p_s u(RW_0 + (r_s − Rq)a).
The first order condition is
Σ_{s∈S} p_s u′(RW_0 + (r_s − Rq)a)(r_s − Rq) = 0.
If the investor is risk-averse, u″(·) is strictly negative, so the second
order condition
Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a)(r_s − Rq)² < 0
is satisfied automatically.
A solution to the first order condition must be a maximum if the
investor is risk averse.
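The first order condition is easy to solve numerically. The sketch below is our own illustration under log utility, with all parameter values (state probabilities, returns, prices, wealth) chosen purely for illustration:

```python
# Solve the portfolio FOC sum_s p_s u'(R*W0 + (r_s - R*q)*a)(r_s - R*q) = 0
# by bisection, with u(w) = ln(w) so that u'(w) = 1/w.
states = [(0.75, 0.90), (0.25, 1.80)]   # (p_s, r_s): probabilities and risky returns
R, q, W0 = 1.05, 1.0, 100.0             # riskless return, risky price, initial wealth

def foc(a):
    """Derivative of expected utility with respect to a under log utility."""
    return sum(p * (r - R * q) / (R * W0 + (r - R * q) * a) for p, r in states)

lo, hi = 0.0, 200.0                      # foc(0) > 0 and foc(200) < 0 here
for _ in range(100):
    mid = (lo + hi) / 2
    if foc(mid) > 0:
        lo = mid
    else:
        hi = mid
print(round(lo, 6))  # 70.0: the optimal number of units of the risky asset
```

Since the FOC is strictly decreasing in a (by the second order condition), the bisection converges to the unique maximum.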
Question
Is the demand for number of units of the risky asset increasing or
decreasing in initial wealth?
Define a(W_0) as the demand function for the risky asset, which is the
optimal solution to the portfolio choice problem.
Lemma
a′(W_0) > 0 if R_a′(x) < 0,
a′(W_0) = 0 if R_a′(x) = 0,
a′(W_0) < 0 if R_a′(x) > 0.
Proof
Differentiate the first order condition with respect to W_0:
Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)R
+ Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)² a′(W_0) = 0.
Solving for a′(W_0):
a′(W_0) = −[Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)²]^{−1}
R Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq).
If the investor is risk-averse, u″(·) < 0. Hence, the sign of a′(W_0)
is the same as the sign of
Σ_{s∈S} p_s u″(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq),
where the factor (r_s − Rq) can be positive or negative.
Recall the definition R_a(x) = −u″(x)/u′(x); the above term can be
expressed as
−Σ_{s∈S} p_s u′(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)
R_a(RW_0 + (r_s − Rq)a(W_0)).
For all s ∈ S,
(r_s − Rq) R_a(RW_0) ⋚ (r_s − Rq) R_a(RW_0 + (r_s − Rq)a(W_0))
if and only if R_a′(x) ⋛ 0.
Take the case R_a′(x) < 0:
(i) for r_s − Rq > 0, R_a(RW_0) > R_a(RW_0 + (r_s − Rq)a(W_0));
(ii) for r_s − Rq < 0, R_a(RW_0) < R_a(RW_0 + (r_s − Rq)a(W_0)).
This is easier to visualize if we write
y = r_s − Rq, x_0 = RW_0, λ = a(W_0) > 0.
We have
y R_a(x_0) > y R_a(x_0 + λy) iff R_a′(x) < 0.
Lastly, consider R_a′(x) < 0; the sign of a′(W_0) depends on the sign
of
−Σ_{s∈S} p_s u′(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)
R_a(RW_0 + (r_s − Rq)a(W_0))
> −R_a(RW_0) Σ_{s∈S} p_s u′(RW_0 + (r_s − Rq)a(W_0))(r_s − Rq)
= 0 [by the first order condition].
Hence a′(W_0) > 0. When the absolute risk aversion is a decreasing
function, investors invest more in the risky asset when the initial
wealth level is higher.
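This comparative-statics result can be checked numerically. The sketch below (our own, with illustrative parameters) uses log utility, for which absolute risk aversion R_a(w) = 1/w is decreasing, so the optimal holding should rise with initial wealth:

```python
# Under DARA (log utility, Ra(w) = 1/w), the optimal risky holding a(W0)
# increases with initial wealth W0. All parameter values are illustrative.
states = [(0.75, 0.90), (0.25, 1.80)]   # (p_s, r_s)
R, q = 1.05, 1.0                         # riskless return, risky-asset price

def optimal_a(W0):
    """Bisect the FOC sum_s p_s (r_s - R*q) / (R*W0 + (r_s - R*q)*a) = 0."""
    r_min = min(r for _, r in states)
    lo, hi = 0.0, 0.99 * R * W0 / (R * q - r_min)   # keep wealth positive in all states
    for _ in range(200):
        mid = (lo + hi) / 2
        foc = sum(p * (r - R * q) / (R * W0 + (r - R * q) * mid)
                  for p, r in states)
        if foc > 0:
            lo = mid
        else:
            hi = mid
    return lo

print(optimal_a(100.0) < optimal_a(150.0))  # True: a'(W0) > 0 under DARA
```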
Consider the two-asset portfolio again, where one asset is riskfree
and the other is risky. Define the elasticity of demand of the risky
asset with respect to wealth by
η = (da/a) / (dW_0/W_0).
For a risk-averse investor, show that
η ⋚ 1 according as dR_R/dW ⋛ 0,
where R_R(W) = −W u″(W)/u′(W) is the relative risk aversion.
Proof
Recall that W = W_0(1 + r_f) + a(r − r_f) and
η = 1 + [(da/dW_0) W_0 − a]/a.
From the previous result on da/dW_0, we have
η = 1 + [W_0(1 + r_f) E[u″(W)(r − r_f)] + a E[u″(W)(r − r_f)²]] / (a E[−u″(W)(r − r_f)²])
= 1 + E[u″(W){W_0(1 + r_f) + a(r − r_f)}(r − r_f)] / (a E[−u″(W)(r − r_f)²])
= 1 + E[u″(W) W (r − r_f)] / (a E[−u″(W)(r − r_f)²])
= 1 + E[R_R(W) u′(W)(r − r_f)] / (a E[u″(W)(r − r_f)²]).
Since u″(W) < 0 for a concave utility function, we have
sign(η − 1) = −sign(E[R_R(W) u′(W)(r − r_f)]).
Suppose R_R(W) is an increasing function; then
R_R(W) = R_R(W_0(1 + r_f) + a(r − r_f))
≥ R_R(W_0(1 + r_f)) when r ≥ r_f,
< R_R(W_0(1 + r_f)) when r < r_f.
By conditioning on the sign of r − r_f, we have
E[R_R(W) u′(W)(r − r_f)]
= E[R_R(W) u′(W)(r − r_f) | r − r_f ≥ 0] Prob(r − r_f ≥ 0)
+ E[R_R(W) u′(W)(r − r_f) | r − r_f < 0] Prob(r − r_f < 0).
Consider E[R_R(W) u′(W)(r − r_f) | r − r_f < 0]; since u′(W) > 0 and
R_R(W) > 0, we have
R_R(W)(r − r_f) > R_R(W_0(1 + r_f))(r − r_f) for r − r_f < 0,
so that
E[R_R(W) u′(W)(r − r_f)] > R_R(W_0(1 + r_f)) E[u′(W)(r − r_f)] = 0,
and hence η < 1.
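The boundary case is also instructive: when relative risk aversion is constant, the optimal holding is proportional to wealth, so η = 1 exactly. The sketch below (our own, with illustrative parameters) checks this numerically for a CRRA index u(W) = W^(1−γ)/(1−γ):

```python
# Under constant relative risk aversion u(W) = W**(1-g)/(1-g), the optimal
# risky holding a is proportional to W0, so the wealth elasticity eta = 1.
states = [(0.75, 0.90), (0.25, 1.80)]   # (p_s, gross risky return), illustrative
Rf, g = 1.05, 3.0                        # gross riskfree return, relative risk aversion

def optimal_a(W0):
    """Bisect the FOC sum_s p_s * W_s**(-g) * (r_s - Rf) = 0."""
    r_min = min(r for _, r in states)
    lo, hi = 0.0, 0.99 * Rf * W0 / (Rf - r_min)   # keep wealth positive
    for _ in range(200):
        mid = (lo + hi) / 2
        foc = sum(p * (Rf * W0 + (r - Rf) * mid) ** (-g) * (r - Rf)
                  for p, r in states)
        if foc > 0:
            lo = mid
        else:
            hi = mid
    return lo

ratio = optimal_a(200.0) / optimal_a(100.0)
print(round(ratio, 6))  # 2.0: doubling wealth doubles a, i.e. eta = 1
```

This sits exactly between the two cases of the theorem: increasing R_R gives η < 1, decreasing R_R gives η > 1.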