Probability theory and assessment - Utrecht University


Probability theory and assessment

77 / 401

Probability basics

Probability theory offers a mathematical language for describing uncertainties:

[Figure: probability scale running from 0 through 0.5 to 1, with p(disease d) = 0.35 marked]

In a medical problem, a probability p(disease d) = 0.35 for a patient is interpreted as:

• in a population of 1000 similar patients, 350 have the disease d;

• the attending physician assesses the likelihood that the patient has the disease d at 0.35.

78 / 401

Sources of information

Sources of probabilistic information are:

• (experimental) data;
• literature;
• human experts.

The first two sources are the most reliable.

79 / 401

A frequentist’s probabilities

• in theory, a probability Pr(d) is the relative frequency with which d occurs in an infinitely repeated experiment;

• in practice, the probability is estimated from (sufficient) experimental data.

Example
In a clinical study, 10000 men over 40 years of age have been examined for hypertension:

hypertension    no hypertension
1983            8017

The probability of hypertension in a man aged 45 is estimated to be

$p(\text{hypertension}) = \frac{1983}{10000} \approx 0.20$

80 / 401


Subjective probabilities

For assessing probabilities with experts, various tools are available:

• probability scales;
• probability wheels;
• betting models;
• lottery models.

A subjective probability is based upon personal knowledge and experience.

81 / 401

Probability scales — an example

For their new soda, an expert from Colaco has to assess the probability that the soda will turn out a national success:

• the expert is asked, using mathematical notation, to assess:

p(national success) = ?

• the expert is asked to indicate the probability on a scale from 0 to 100% certainty:

[Figure: vertical probability scale with numerical anchors 100, 85, 75, 50, 25, 15, 0 and verbal anchors (almost) certain, probable, expected, fifty-fifty, uncertain, improbable, (almost) impossible]

82 / 401

Probability wheels

A probability wheel is composed of two coloured faces and a hand:

The expert is asked to adjust the area of the red face so that the probability of the hand stopping there equals the probability of interest.

83 / 401

Betting models — an example

For their new soda, an expert from Colaco is asked to assess the probability of a national success:

• the expert is offered two bets:

bet d:  national success → x euro;   national failure → −y euro
bet d̄:  national success → −x euro;  national failure → y euro

• if the expert is indifferent between d and d̄, then

x · Pr(n) − y · (1 − Pr(n)) = y · (1 − Pr(n)) − x · Pr(n)

from which we find $\Pr(n) = \frac{y}{x + y}$.
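The indifference condition of the betting model can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the stakes x = 30, y = 70 are assumptions chosen for illustration.

```python
def betting_probability(x: float, y: float) -> float:
    """Probability implied by indifference between the two bets.

    Bet d pays x on success and -y on failure; the mirrored bet pays
    -x on success and y on failure. Equal expected payoffs give
    Pr(n) = y / (x + y).
    """
    if x <= 0 or y <= 0:
        raise ValueError("stakes must be positive")
    return y / (x + y)


def expected_payoffs(x: float, y: float, pr: float):
    """Expected payoff of bet d and of the mirrored bet at probability pr."""
    bet = x * pr - y * (1 - pr)
    mirrored = y * (1 - pr) - x * pr
    return bet, mirrored


# Hypothetical stakes, chosen only for illustration.
pr = betting_probability(30.0, 70.0)
bet, mirrored = expected_payoffs(30.0, 70.0, pr)
```

At the implied probability both bets have the same expected payoff (zero, up to rounding), which is what indifference between them expresses.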

84 / 401


Lottery models — an example

For their new soda, an expert from Colaco is asked to assess the probability of a national success:

• the expert is offered two lotteries:

lottery d:  national success → Hawaiian trip;  national failure → chocolate bar
lottery d̄:  p(outcome) → Hawaiian trip;  p(not outcome) → chocolate bar

• if the expert is indifferent between d and d̄, then Pr(n) = p(outcome);

• the second lottery is termed the reference lottery.

85 / 401

Coherence and calibration

• subjective probabilities are coherent if they adhere to the postulates of probability theory;

• subjective probabilities are well calibrated if they reflect true frequencies.

86 / 401

Overconfidence and underconfidence

• a human expert is an overconfident assessor if, compared with the true frequencies, his subjective probabilities show a tendency towards the extremes;

• a human expert is an underconfident assessor if, compared with the true frequencies, his subjective probabilities show a tendency away from the extremes.

87 / 401

Heuristics

When assessing probabilities, people tend to use simple cognitive heuristics:

• representativeness: the probability of an outcome is based upon the similarity with a stereotype outcome;

• availability: the probability of an outcome is based upon the ease with which similar outcomes are recalled;

• anchoring-and-adjusting: the probability of an outcome is assessed by adjusting an initially chosen anchor probability.

88 / 401


Pitfalls

Using the representativeness heuristic when assessing probabilities can introduce biases:

• the prior probabilities, or base rates, are insufficiently taken into account;
• the assessments are based upon insufficient samples;
• the weights of the characteristics of the stereotype outcome are insufficiently taken into consideration;
• . . .

89 / 401

Pitfalls — cntd.

Using the availability heuristic when assessing probabilities can introduce biases:

• the ease of recall from memory is influenced by recency, rareness, and the past consequences for the assessor;
• the ease of recall is further influenced by external stimuli;
• . . .

90 / 401

Pitfalls — cntd.

Using the anchoring-and-adjusting heuristic when assessing probabilities can introduce biases:

• the assessor does not choose an appropriate anchor;
• the assessor does not adjust the anchor to a sufficient extent;
• . . .

91 / 401

Continuous chance variables

In decision trees, chance variables are discrete:

• a discrete variable C takes a single value from a non-empty finite set of values {c1, . . . , cn}, n ≥ 2;
• the distribution associated with C is a probability mass function, which assigns a probability to each value ci of C.

In reality, chance variables can also be continuous:

• a continuous variable C takes a single value within a non-empty range of values [a, b], a < b;
• the distribution associated with C is a probability distribution function, which defines a probability for any interval [x, y] ⊆ [a, b].

92 / 401


Using continuous chance variables

The following procedure provides for modelling a continuous variable C in a decision tree:

1 construct a cumulative distribution function for C:
• pivoting on the values of C, or
• pivoting on the cumulative probabilities for C;

2 approximate the probability distribution function for C by a probability mass function for a discrete chance variable C′:
• using the Pearson-Tukey method, or
• using bracket medians.

93 / 401

Pivoting on values

For a real-estate agency, the demand for luxury apartments is a continuous chance variable A:

[Figure: probability distribution of A; horizontal axis "no. of apartments" from 50 to 130, vertical axis from 0 to 0.35]

A cumulative distribution function is constructed by

• assessing the cumulative probabilities for a number of values of A:

Pr(A ≤ 65) = 0.08    Pr(A ≤ 90) = 0.90
Pr(A ≤ 80) = 0.65    Pr(A ≤ 110) = 0.97

• and drawing a curve through them:

[Figure: cumulative distribution of A; horizontal axis "no. of apartments" from 50 to 130, vertical axis "cum. distribution" from 0 to 1]

94 / 401

Pivoting on values — cntd.

For assessing cumulative probabilities by pivoting on the values of a variable under study, lottery models can be used.

Example

lottery d:  A ≤ 90 apartments → Hawaiian trip;  A > 90 apartments → chocolate bar
reference lottery:  p(outcome) → Hawaiian trip;  p(not outcome) → chocolate bar

Other elicitation tools can be used as well.

95 / 401

Pivoting on cumulative probabilities

Reconsider the demand for luxury apartments, modelled by a continuous chance variable A:

[Figure: probability distribution of A; horizontal axis "no. of apartments" from 50 to 130, vertical axis from 0 to 0.35]

A cumulative distribution function is constructed by

• finding the values for A that give a number of cumulative probabilities:

Pr(A ≤ a1) = 0.05    Pr(A ≤ a3) = 0.50
Pr(A ≤ a2) = 0.35    Pr(A ≤ a4) = 0.95

• and drawing a curve through them:

[Figure: cumulative distribution of A; horizontal axis "no. of apartments" from 50 to 130, vertical axis "cum. distribution" from 0 to 1]

96 / 401


Pivoting on cumulative probabilities — cntd.

For assessing cumulative probabilities by pivoting on these cumulative probabilities, lottery models can be used.

Example

lottery d:  A ≤ x apartments → Hawaiian trip;  A > x apartments → chocolate bar
reference lottery:  p(outcome) = 0.35 → Hawaiian trip;  p(not outcome) = 0.65 → chocolate bar

Other elicitation tools can be used as well.

97 / 401

The Pearson-Tukey method

The Pearson-Tukey method approximates the distribution function for a continuous variable C by a probability mass function over a discrete variable C′:

• find the values c1, c2, c3 for which

Pr(C ≤ c1) = 0.05    Pr(C ≤ c2) = 0.50    Pr(C ≤ c3) = 0.95

• construct the discrete chance variable C′ with values c1, c2, c3 and probabilities

Pr(C′ = c1) = 0.185    Pr(C′ = c2) = 0.63    Pr(C′ = c3) = 0.185

98 / 401

The Pearson-Tukey method — an example

For the real-estate agency, the demand for luxury apartments is a continuous chance variable A, for which probabilities need to be assessed:

• for the cumulative probabilities

Pr(A ≤ a1) = 0.05    Pr(A ≤ a2) = 0.50    Pr(A ≤ a3) = 0.95

we have found

a1 = 62    a2 = 77    a3 = 99

• the variable A′ is constructed with

Pr(A′ = 62) = 0.185    Pr(A′ = 77) = 0.63    Pr(A′ = 99) = 0.185
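The three-point approximation is easy to reproduce in code. The sketch below is illustrative only: the function names are assumptions, and it presumes the three fractiles are distinct, since they are used as dictionary keys.

```python
def pearson_tukey(c05: float, c50: float, c95: float) -> dict:
    """Three-point approximation: weights 0.185, 0.63, 0.185 on the
    0.05, 0.50 and 0.95 fractiles of the continuous variable."""
    return {c05: 0.185, c50: 0.63, c95: 0.185}


def expected_value(pmf: dict) -> float:
    """Expected value of a discrete variable given as {value: probability}."""
    return sum(value * prob for value, prob in pmf.items())


# Fractiles of the apartment demand A taken from the example.
demand = pearson_tukey(62, 77, 99)
mean_demand = expected_value(demand)  # roughly 78.3 apartments
```

The resulting mean of about 78.3 is consistent with the expected demand of 78 apartments that the slides report for this method.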

99 / 401

Bracket medians (Clemen)

The method of bracket medians approximates the distribution function for a continuous variable C by a probability mass function over a discrete variable C′:

1 for a number of equally likely intervals, for example five, find the values of C for which

Pr(C ≤ c1) = 0       Pr(C ≤ c4) = 0.60
Pr(C ≤ c2) = 0.20    Pr(C ≤ c5) = 0.80
Pr(C ≤ c3) = 0.40    Pr(C ≤ c6) = 1.00

2 for each interval [ci, ci+1], i = 1, . . . , 5, establish the bracket median mi such that

Pr(ci ≤ C ≤ mi) = Pr(mi ≤ C ≤ ci+1)

3 construct the discrete chance variable C′ with

Pr(C′ = m1) = Pr(C′ = m2) = Pr(C′ = m3) = Pr(C′ = m4) = Pr(C′ = m5) = 0.20

100 / 401


Bracket medians – more general and practical

With the bracket medians method, using n equally likely intervals, steps 1 and 2 basically provide for finding values mi, i = 1, . . . , n, of the chance variable C, such that

$\Pr(C \le m_{k+1}) = \frac{1}{2n} + k \cdot \frac{1}{n}, \quad k = 0, \ldots, n - 1$

to construct, in step 3, the discrete chance variable C′ with

$\Pr(C' = m_i) = \frac{1}{n}, \quad i = 1, \ldots, n$

101 / 401

Bracket medians — an example

For the real-estate agency, the demand for luxury apartments is a continuous chance variable A, for which probabilities need to be assessed:

• for the five cumulative probabilities

Pr(A ≤ m1) = 0.10    Pr(A ≤ m4) = 0.70
Pr(A ≤ m2) = 0.30    Pr(A ≤ m5) = 0.90
Pr(A ≤ m3) = 0.50

we have found

m1 = 65    m2 = 73    m3 = 77    m4 = 81    m5 = 89

• the variable A′ is constructed with

Pr(A′ = 65) = 0.20    Pr(A′ = 81) = 0.20
Pr(A′ = 73) = 0.20    Pr(A′ = 89) = 0.20
Pr(A′ = 77) = 0.20
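The bracket-median construction can be sketched in a few lines. This is an illustration under stated assumptions rather than the lecturer's code: the function names are hypothetical, and the medians are the five values read off in the example.

```python
def bracket_median_levels(n: int) -> list:
    """Cumulative probabilities 1/(2n) + k/n, k = 0..n-1, at which the
    bracket medians are read off the cumulative distribution."""
    return [1 / (2 * n) + k / n for k in range(n)]


def bracket_median_pmf(medians: list) -> list:
    """Give each of the n bracket medians the same probability 1/n."""
    n = len(medians)
    return [(m, 1 / n) for m in medians]


levels = bracket_median_levels(5)               # 0.1, 0.3, 0.5, 0.7, 0.9
pmf = bracket_median_pmf([65, 73, 77, 81, 89])  # medians from the example
mean_demand = sum(m * p for m, p in pmf)        # about 77 apartments
```

For n = 5 the levels reproduce the cumulative probabilities 0.10 to 0.90 used in the example, and the mean of the discretised demand comes out at about 77 apartments.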

102 / 401

The real-estate agency revisited

Reconsider the decision problem for the real-estate agency:

• with the Pearson-Tukey method, the agency's decision tree includes a chance node with

Pr(A′ = 62) = 0.185
Pr(A′ = 77) = 0.63
Pr(A′ = 99) = 0.185

and the expected demand is 78 apartments;

• using bracket medians, the agency's tree includes a chance node with

Pr(A′ = 65) = 0.20
Pr(A′ = 73) = 0.20
Pr(A′ = 77) = 0.20
Pr(A′ = 81) = 0.20
Pr(A′ = 89) = 0.20

and the expected demand is 77 apartments.

103 / 401

M.C. Airport: probability assessments Ia

For each of the identified chance variables, the outcome will depend on the chosen strategy, that is, on activity and site in 1975, 1985 and 1995.

It is assumed that the chance variables are all probabilistically independent of each other, therefore

$\Pr(C) = \Pr(C_1 \wedge C_2 \wedge C_3 \wedge C_4 \wedge C_5 \wedge C_6) = \prod_{i=1}^{6} \Pr(C_i)$

Probabilities are to be assessed for

• the outcome of each $C_i$, i = 1, . . . , 6, for each of the ∼ 100 decision alternatives of D

(where D captures the sequence $D^{75}, D^{85}, D^{95}$)

104 / 401


M.C. Airport: probability assessments Ib

For each of the identified chance variables, the outcome will depend on the chosen strategy, that is, on activity and site in 1975, 1985 and 1995.

Alternatively, probabilities can be assessed for

• the outcome of each $C_i$, i = 1, . . . , 6, for the 16 decision alternatives of each $D^j$, j = 75, 85, 95.

This second option

• requires fewer assessments;
• requires easier assessments ($C_i^j$ vs $C_i$);
• assumes that $C_i^k$ is independent of $C_i^j$ given $D^k$, j = 75, 85, k = 85, 95;
• requires that for some functions $f_i$, i = 1, . . . , 6,

$\Pr(C_i) = f_i(\Pr(C_i^{75}), \Pr(C_i^{85}), \Pr(C_i^{95}))$.

105 / 401

M.C. Airport: probability assessments IIa

The required probabilities were established from

• information from previous studies;
• government administrators (group consensus).

For each of the $C_i^j$ (i = 1, . . . , 6, j = 75, 85, 95) cumulative distributions were assessed using

• the fractile method, and
• consistency checks.

Distributions for $C_i$ were derived from $C_i^j$ by defining

$C_i \equiv \frac{C_i^{75} + C_i^{85} + C_i^{95}}{3}$

106 / 401

M.C. Airport: probability assessments IIb

Example
Consider the 1975 noise impact of the 'all activity at Texcoco' alternative. To establish $\Pr(C_6^{75} \mid D^{75} = \text{T-IDMG})$, the following numbers are assessed:

min #people = ?                          400,000
max #people = ?                          800,000
Pr(#people ≤ a1) = 0.5   ⇒  a1 = ?       640,000
Pr(#people ≤ a2) = 0.25  ⇒  a2 = ?       540,000
Pr(#people ≤ a3) = 0.75  ⇒  a3 = ?       700,000
etc.

107 / 401

Introduction to utility theory and assessment

108 / 401


The appraisal of consequences

For various decision problems, the fundamental objectives have a natural numerical scale:

• money;
• percentages;
• length of life;
• . . .

For other decision problems, not all objectives have such a scale:

• reputation;
• attractiveness;
• quality of life;
• . . .

For such objectives, a proxy scale may be used, or a new numerical scale needs to be designed.

109 / 401

The gangrene problem

A 68-year-old man is suffering from diabetic gangrene of an injured foot. The attending physician has to choose between two decision alternatives:

• to amputate the leg below the knee, which involves a small risk of death;

• to wait:

• if untreated, the gangrene may heal;

• if the gangrene expands, an amputation above the knee becomes necessary, which involves a larger risk of death.

110 / 401

The gangrene problem — continued

The elements of the gangrene problem are organised in the following decision tree:

• amputation below knee:
    survive (p = 0.99) → amputated below knee
    die (p = 0.01) → death
• wait:
    recover (p = 0.70) → cured
    worsen (p = 0.30) → amputation above knee:
        survive (p = 0.90) → amputated above knee
        die (p = 0.10) → death

Before the decision tree can be evaluated, the various consequences need to be assigned numerical appraisals.

111 / 401

The gangrene problem — continued

Reconsider the gangrene problem and compare the following appraisals of the consequences:

• amputation below knee (0.89):
    survive (p = 0.99) → 0.9
    die (p = 0.01) → 0
• wait (0.92):
    recover (p = 0.70) → 1.0
    worsen (p = 0.30) → amputation above knee (0.72):
        survive (p = 0.90) → 0.8
        die (p = 0.10) → 0

112 / 401


Lotteries

DEFINITION

A lottery is simply a probability distribution over a known, finite set of outcomes, that is, it is a pair L = (R, Pr) with

• R = {r1, . . . , rn}, n ≥ 2, is a set of rewards;
• Pr is a probability distribution over R, with $\sum_{i=1,\ldots,n} \Pr(r_i) = 1$.

A lottery L is commonly denoted

L = [p1, r1; . . . ; pi, ri; . . . ; pn, rn]

where pi = Pr(ri), i = 1, . . . , n;

occasionally, we write [R] for short.

[Figure: graphical representation of a lottery as a chance node with branches labelled p1, . . . , pi, . . . , pn leading to rewards r1, . . . , ri, . . . , rn]

113 / 401

Types of lottery

There exist different types of lottery:

• a lottery is termed a certain lottery if it has a single reward r with Pr(r) = 1;
• a lottery is called a simple lottery if all its rewards are certain lotteries;
• a lottery is coined a compound lottery if at least one of its rewards is not a certain lottery.

114 / 401

The gangrene problem revisited

Reconsider the gangrene problem. In the problem, several types of lottery occur:

• the certain lottery [1.0, amputated above knee];
• the simple lottery [0.99, amputated below knee; 0.01, death];
• the compound lottery [0.7, cured; 0.3, [0.9, amputated above knee; 0.1, death]].

115 / 401

A preference ordering

DEFINITION

Let L be a set of lotteries. A binary relation ⪰ is termed a preference ordering on L if ⪰ adheres to the following properties:

• for all Li, Lj ∈ L, we have that Li ⪰ Lj or Lj ⪰ Li;
• for all Li, Lj, Lk ∈ L, if Li ⪰ Lj and Lj ⪰ Lk, then Li ⪰ Lk.

If Li ⪰ Lj, we say that Li is preferred over Lj.

If Li ⪰ Lj and Lj ⪰ Li, we say that Li and Lj are equivalent; we then write Li ∼ Lj.

116 / 401


The gangrene problem revisited

Reconsider the gangrene problem:

• a preference ordering on the set of all lotteries of the problem specifies a total ordering on the four certain lotteries:

L1 = amputated below knee
L2 = death
L3 = cured
L4 = amputated above knee

it seems evident that L3 ⪰ L1 ⪰ L4 ⪰ L2;

• the ordering also specifies a total ordering on the more complex lotteries:

L5 = [0.99, amputated below knee; 0.01, death]
L6 = [0.90, amputated above knee; 0.10, death]

it seems evident that L5 ⪰ L6.

117 / 401

The gangrene problem — continued

Reconsider the gangrene problem, with L3 ⪰ L1 ⪰ L4 ⪰ L2:

L1 = amputated below knee
L2 = death
L3 = cured
L4 = amputated above knee

For

L7 = [0.99, amputated below knee; 0.01, death]
L8 = [0.70, cured; 0.30, [0.90, amputated above knee; 0.10, death]]

does L7 ⪰ L8 hold, or is L8 ⪰ L7?

118 / 401

The continuity axiom (aka Archimedean axiom)

Let L be a set of lotteries and let ⪰ be a preference ordering on L. Then, the continuity axiom asserts:

for all Li, Lj, Lk ∈ L with Li ⪰ Lj ⪰ Lk, there is a probability p such that [p, Li; (1 − p), Lk] ∼ Lj

Consider three (certain) lotteries Li ⪰ Lj ⪰ Lk with rewards ri, rj, rk. The axiom states that there exists a probability p such that

[1.0, rj] ∼ [p, ri; (1 − p), rk]

• p is termed the calibration probability for Lj and [p, Li; (1 − p), Lk];
• Lj is termed the certainty equivalent for [p, Li; (1 − p), Lk].

119 / 401

The independence axiom (or: substitutability)

Let L be a set of lotteries and let ⪰ be a preference ordering on L. Then, the independence axiom asserts:

for all Li, Lj, Lk ∈ L with Li ∼ Lj and for each probability p, we have that [p, Li; (1 − p), Lk] ∼ [p, Lj; (1 − p), Lk]

Consider three lotteries Li, Lj, Lk. The independence axiom states that if Li and Lj are equivalent, then so are

[p, Li; (1 − p), Lk] and [p, Lj; (1 − p), Lk]

for all probabilities p.

120 / 401


The unequal probability axiom (or: monotonicity)

Let L be a set of lotteries and let ⪰ be a preference ordering on L. Then, the unequal probability axiom asserts:

for all Li, Lj ∈ L with Li ⪰ Lj and for all probabilities p, p′ with p ≥ p′, we have [p, Li; (1 − p), Lj] ⪰ [p′, Li; (1 − p′), Lj]

Consider two lotteries Li, Lj with Li ⪰ Lj. The unequal probability axiom states that

[p, Li; (1 − p), Lj] ⪰ [p′, Li; (1 − p′), Lj]

for all probabilities p ≥ p′.

121 / 401

The compound lottery axiom

Let L be a set of lotteries and let ⪰ be a preference ordering on L. Then, the compound lottery axiom asserts:

for all Li, Lj ∈ L, Lj = [q, Lm; (1 − q), Ln], Lm, Ln ∈ L, 0 ≤ q ≤ 1, and for each probability p, we have [p, Li; (1 − p), Lj] ∼ [p, Li; (1 − p) · q, Lm; (1 − p) · (1 − q), Ln].

Consider two lotteries Li, Lk with Lk = [q, Li; (1 − q), Lj], Lj ∈ L, 0 ≤ q ≤ 1. The compound lottery axiom states that

[p, Li; (1 − p), [q, Li; (1 − q), Lj]] ∼ [p + (1 − p) · q, Li; (1 − p) · (1 − q), Lj]

for all probabilities p. The axiom is also termed “no fun in gambling”.
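The reduction that the compound lottery axiom licenses can be illustrated as follows. This is a hedged sketch: the list-of-pairs encoding of a lottery, the function name, and the values p = 0.4, q = 0.5 are assumptions made for the example, not notation from the slides.

```python
from collections import defaultdict

# A lottery is encoded as a list of (probability, reward) pairs; a reward is
# either an outcome label (str) or another lottery (list). Illustrative only.


def reduce_lottery(lottery):
    """Flatten a compound lottery into a simple one by multiplying the
    probabilities along each path and merging equal outcomes."""
    flat = defaultdict(float)
    for prob, reward in lottery:
        if isinstance(reward, list):
            for outcome, inner_prob in reduce_lottery(reward).items():
                flat[outcome] += prob * inner_prob
        else:
            flat[reward] += prob
    return dict(flat)


p, q = 0.4, 0.5  # hypothetical probabilities
compound = [(p, "Li"), (1 - p, [(q, "Li"), (1 - q, "Lj")])]
simple = reduce_lottery(compound)  # Li: p + (1-p)*q, Lj: (1-p)*(1-q)

gangrene = [(0.7, "cured"),
            (0.3, [(0.9, "amputated above knee"), (0.1, "death")])]
gangrene_simple = reduce_lottery(gangrene)
```

Applied to the compound lottery of the gangrene problem, the reduction gives probabilities of roughly 0.70, 0.27 and 0.03 for cured, amputated above knee and death.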

122 / 401

A rational preference ordering

DEFINITION

Let L be a set of lotteries. A preference ordering ⪰ on L is a rational preference ordering if it adheres to the Von Neumann–Morgenstern utility axioms:

• the continuity axiom;
• the independence axiom;
• the unequal probability axiom;
• the compound lottery axiom.

123 / 401

The assumption of finiteness

Consider the following decision problem:

• lose all your money → bankruptcy
• gamble:
    win billions (p = 0.99999) → riches
    everything bad (p = 0.00001) → infinite misery

The Bayes criterion for choosing between decision alternatives does not help much if the problem involves consequences of infinite appraisal.

124 / 401


The main theorem of utility theory

THEOREM

Let L be a set of lotteries and let ⪰ be a rational preference ordering on L. Then, there exists a real function u on L such that

• for all Li, Lj ∈ L, we have that Li ⪰ Lj iff u(Li) ≥ u(Lj);
• for each [p1, L1; . . . ; pn, Ln] ∈ L, we have that

$u([p_1, L_1; \ldots; p_n, L_n]) = \sum_{i=1,\ldots,n} p_i \cdot u(L_i)$.

The function u is termed a utility function on L.

125 / 401

The main theorem — a sketchy proof

Consider two lotteries L and L′ with the rewards r1 ⪰ · · · ⪰ rn, n ≥ 1:

L = [p1, r1; · · · ; pn, rn]    L′ = [p′1, r1; · · · ; p′n, rn]

For each reward ri we can find a value 0 ≤ ui ≤ 1, such that

[ui, r1; (1 − ui), rn] ∼ ri

For the two lotteries, we then have that

$L \sim \left[ \sum_{i=1,\ldots,n} p_i \cdot u_i,\ r_1;\ \left(1 - \sum_{i=1,\ldots,n} p_i \cdot u_i\right),\ r_n \right]$

$L' \sim \left[ \sum_{i=1,\ldots,n} p'_i \cdot u_i,\ r_1;\ \left(1 - \sum_{i=1,\ldots,n} p'_i \cdot u_i\right),\ r_n \right]$

So, L ⪰ L′ if and only if $\sum_{i=1,\ldots,n} p_i \cdot u_i \ge \sum_{i=1,\ldots,n} p'_i \cdot u_i$.

126 / 401

Some notes

The main theorem of utility theory implies that:

• a lottery with highest utility is a most preferred lottery;
• any rational preference ordering on a set of lotteries is encoded uniquely by the utilities of its certain lotteries.

The main theorem does not imply that:

• a lottery with highest expected reward is a most preferred lottery.

127 / 401

Utility versus expected reward

Example
Consider the set of euro rewards R = {0, 10, 20, 50}, and the following two lotteries over R:

L1 = [0.0, 0; 1.0, 20], and L2 = [0.7, 10; 0.3, 50]

Although the expected reward of L2 exceeds that of L1 (22 versus 20 euro), the decision maker may express the preference L1 ≻ L2 without being irrational! Consider two possible utility functions over the rewards, expressing that more money is preferred to less:

u1(0) = 0, u1(10) = 0.2, u1(20) = 0.4, u1(50) = 1

u2(0) = 0, u2(10) = 0.1, u2(20) = 0.6, u2(50) = 1

Note that u1(L1) < u1(L2), but u2(L1) > u2(L2).

Note: even if objectives have a natural numerical scale, preferences (over lotteries) may be such that a utility function is required!
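The comparison in this example can be reproduced with a short calculation. The sketch below is illustrative: the helper name `expectation` and the pair-list encoding of lotteries are assumptions, and the last entry of the second utility function is read as u2(50) = 1.

```python
def expectation(lottery, value=lambda r: r):
    """Expected value of value(reward) over (probability, reward) pairs."""
    return sum(p * value(r) for p, r in lottery)


L1 = [(0.0, 0), (1.0, 20)]
L2 = [(0.7, 10), (0.3, 50)]

u1 = {0: 0.0, 10: 0.2, 20: 0.4, 50: 1.0}
u2 = {0: 0.0, 10: 0.1, 20: 0.6, 50: 1.0}  # assumes u2(50) = 1

reward = [expectation(L) for L in (L1, L2)]          # 20 and 22 euro
util_1 = [expectation(L, u1.get) for L in (L1, L2)]  # about 0.40 and 0.44
util_2 = [expectation(L, u2.get) for L in (L1, L2)]  # about 0.60 and 0.37
```

Under u1 the ranking agrees with expected reward (L2 above L1), whereas under u2 the ranking is reversed, even though both utility functions prefer more money to less.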

128 / 401


Strategic equivalence

DEFINITION

Let L be a set of lotteries. Two utility functions ui, uj on L are strategically equivalent, written ui ∼ uj, if they imply the same preference ordering on L.

Example: Consider the following two utility functions for the gangrene problem:

u1(amputated below knee) = 0.9
u1(death) = 0
u1(cured) = 1.0
u1(amputated above knee) = 0.8

and

u2(amputated below knee) = 2.98
u2(death) = 1
u2(cured) = 3.2
u2(amputated above knee) = 2.76

The functions u1 and u2 are strategically equivalent.

129 / 401

A linear transformation of utilities

THEOREM

Let L be a set of lotteries and let ui, uj be two utility functions on L, then

ui ∼ uj ⇐⇒ uj = a · ui + b,

for some constants a, b with a > 0.

A utility function is unique up to a positive linear transformation.
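For the two gangrene utility functions given in these slides, the constants a and b can be recovered and checked. This is a minimal sketch under assumptions: the function name and the choice of "death" and "cured" as the two anchor outcomes are illustrative.

```python
import math

u1 = {"amputated below knee": 0.9, "death": 0.0,
      "cured": 1.0, "amputated above knee": 0.8}
u2 = {"amputated below knee": 2.98, "death": 1.0,
      "cured": 3.2, "amputated above knee": 2.76}


def fit_linear(u, v, low, high):
    """Solve v = a * u + b from two anchor outcomes with different utilities."""
    a = (v[high] - v[low]) / (u[high] - u[low])
    b = v[low] - a * u[low]
    return a, b


a, b = fit_linear(u1, u2, "death", "cured")
# Strategically equivalent if a > 0 and the relation holds for every outcome.
equivalent = a > 0 and all(math.isclose(u2[k], a * u1[k] + b) for k in u1)
```

Here the fit gives a = 2.2 and b = 1 (up to rounding), and the relation u2 = a · u1 + b holds for all four consequences.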

130 / 401

Normalisation

Let X be an attribute with values x1 ⪯ . . . ⪯ xn, n > 1.

Often, utility functions are normalised such that, for example,

u(x1) = 0 and u(xn) = 1

These values only set the origin of u(X) and the unit of measurement.

If we decide to consider, for example, xi ≺ x1 or xj ≻ xn, then u(xi) < 0, and u(xj) > 1, respectively.

131 / 401

The gangrene problem revisited

Reconsider the gangrene problem with

u(amputated below knee) = 0.9
u(death) = 0
u(cured) = 1.0
u(amputated above knee) = 0.8

For the two lotteries

L5 = [0.99, amputated below knee; 0.01, death]
L6 = [0.90, amputated above knee; 0.10, death]

we have that

u(L5) = 0.99 · u(amputated below knee) + 0.01 · u(death) = 0.99 · 0.9 ≈ 0.89

u(L6) = 0.90 · u(amputated above knee) + 0.10 · u(death) = 0.90 · 0.8 = 0.72

So, L5 ⪰ L6.

132 / 401


The gangrene problem — continued

Reconsider the gangrene problem with

u(amputated below knee) = 0.9
u(death) = 0
u(cured) = 1.0
u(amputated above knee) = 0.8

For the two lotteries

L7 = [0.99, amputated below knee; 0.01, death]
L8 = [0.70, cured; 0.30, [0.90, amputated above knee; 0.10, death]]

we have that

u(L7) = 0.99 · 0.9 ≈ 0.89

u(L8) = 0.70 · 1.0 + 0.3 · 0.9 · 0.8 ≈ 0.92

So, L8 ⪰ L7.

133 / 401
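The evaluation of the two lotteries can be written as a small recursive computation. The sketch is illustrative only: the pair-list encoding and the names `UTILITY` and `utility` are assumptions; the utilities and probabilities are those of the gangrene problem.

```python
UTILITY = {"cured": 1.0, "amputated below knee": 0.9,
           "amputated above knee": 0.8, "death": 0.0}


def utility(lottery):
    """Expected utility of a lottery given as (probability, reward) pairs;
    a reward is an outcome label or a nested lottery (list)."""
    total = 0.0
    for prob, reward in lottery:
        value = utility(reward) if isinstance(reward, list) else UTILITY[reward]
        total += prob * value
    return total


L7 = [(0.99, "amputated below knee"), (0.01, "death")]
L8 = [(0.70, "cured"),
      (0.30, [(0.90, "amputated above knee"), (0.10, "death")])]

u7, u8 = utility(L7), utility(L8)  # roughly 0.89 and 0.92
best = "wait" if u8 > u7 else "amputate below knee"
```

With these utilities waiting has the higher expected utility, in line with the values 0.89 and 0.92 shown in the appraised decision tree.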

Utility assessment

• subjective assessment
    • direct methods
        • magnitude estimation/production
        • ratio estimation/production
    • indirect (behavioural) methods
        • based on reference gambles
• “objective” assessment
    • choose a mathematical function

134 / 401

Direct methods

Magnitude estimation or production can be done using a utility scale:

[Figure: utility scale running from 0 through 0.5 to 1.0]

Example
Reconsider the diabetic gangrene treatment example. To assess the utilities for the different treatment consequences, a patient is asked one of the following types of question:

• “How do you value life after an above-knee amputation?” (estimation);

• “Which consequence do you associate with a utility of 0.2?” (production)

135 / 401

Direct methods

Ratio estimation or production:

Example
My sister is interested in buying a new car. To assess the utilities for the different car options, she is asked one of the following types of question:

• “How much more do you value a Volvo than a Fiat?” (estimation);

• “Which car seems to you twice as valuable as a Fiat?” (production)

This method was used to assess the empirical utility of money (Galanter, 1962).

empirical utility function for monetary gain: $\sim \sqrt{x}$

empirical utility function for monetary loss: $\sim -x^2$

136 / 401


Reference gamble

Consider a set of consequences for which utilities are to be assessed. Let ci, cj, ck be consequences from that set such that ci ⪰ cj ⪰ ck.

A reference gamble is a choice between two lotteries:

1 the certain lottery L = [1.0, cj]

2 the simple lottery L′ = [p, ci; (1 − p), ck]

• L′ is the reference lottery of the “gamble”; if u(ci) = 1.0 and u(ck) = 0, then L′ is called a standard reference lottery;

• for fixed ci, cj, and ck, the probability p for which L ∼ L′ is the indifference probability for L and L′;

• for fixed p and ci, ck, the consequence cj for which L ∼ L′ is the certainty equivalent, or indifference point, for L′.

137 / 401

Assessment using certainty equivalents

The utilities of a decision maker for the possible consequences in a decision problem can be assessed using several certainty equivalents:

1 elicit a preference order on consequences from most preferred (c+) to least preferred (c−);
2 assign the first two points of the utility function:

u(c+) = 1 and u(c−) = 0

3 create a standard reference gamble with p = 0.5;
4 elicit the certainty equivalent cCE for the reference lottery;
5 compute the utility of this indifference point;
6 create two reference gambles with p = 0.5 and consequences c+ and cCE, and cCE and c−, respectively;
7 repeat steps 4 – 6 until enough points are found to draw the utility curve.

138 / 401

Computing a utility

Consider the lotteries L and L′ of a reference gamble:

L = [1.0, cj] and L′ = [p, ci; (1 − p), ck]

where ci ⪰ cj ⪰ ck. Assume that the utilities u(ci) and u(ck) for consequences ci and ck are known.

Question:
How do we compute u(cj) for consequence cj?

Answer:
If either

p is the indifference probability for L and L′, or

cj is the certainty equivalent of L′

then L ∼ L′ and consequently

1.0 · u(cj) = p · u(ci) + (1 − p) · u(ck).

From this equation we can solve u(cj).

139 / 401
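Solving the indifference equation for u(cj) is a one-line computation. The sketch below is illustrative; the function and parameter names are assumptions, and the two calls use the numbers of the computer-replacement and gangrene examples in these slides.

```python
def utility_from_gamble(p: float, u_best: float, u_worst: float) -> float:
    """Utility of the certain consequence cj when [1.0, cj] is judged
    equivalent to the reference lottery [p, ci; (1 - p), ck]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return p * u_best + (1 - p) * u_worst


# Certainty equivalent of 200 euro for a 50/50 gamble between the best
# and the worst replacement cost (utilities 1 and 0).
u_200 = utility_from_gamble(0.5, u_best=1.0, u_worst=0.0)

# Indifference probability 0.9 for "amputated below knee" against a
# gamble between cured (utility 1) and death (utility 0).
u_below_knee = utility_from_gamble(0.9, u_best=1.0, u_worst=0.0)
```

With a standard reference lottery (utilities 1 and 0 at the extremes) the computed utility simply equals p, which is why the two examples yield 0.5 and 0.9.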

An example

Suppose you have an old computer for which components are bound to need replacement in the near future. You have a number of replacement alternatives that will cost you between € 50 and € 500. What is your utility function?

Two points can be fixed:

u(€ 500) = 0 and u(€ 50) = 1

You are then presented with the following gamble:

[1.0, cCE] versus [0.5, 50; 0.5, 500]

Suppose that cCE = 200 is the indifference point for this gamble. We then have that

1.0 · u(200) = 0.5 · u(50) + 0.5 · u(500) = 0.5

140 / 401


Assessment using probability equivalents

The utilities of a decision maker for the possible consequences in a decision problem can be assessed using several probability equivalents:

1 elicit a preference order on consequences from most preferred (c+) to least preferred (c−);

2 assign the first two points of the utility function:

u(c+) = 1 and u(c−) = 0

3 create a standard reference gamble for enough intermediate outcomes;

4 elicit the indifference probability for the lotteries in each of the gambles;

5 compute the utility of an intermediate outcome using this indifference probability.

141 / 401

An example

Reconsider the decision problem for treatment of diabetic gangrene. The possible consequences of treatment are:

cured 1.0
amputated below knee ?
amputated above knee ?
death 0.0

The patient is given the following gamble:

L = [1.0, amputated below knee] and L′ = [p, cured; (1 − p), death]

Suppose that p = 0.9 is the indifference probability for the lotteries. We then have that

u(amputated below knee) = 0.9 · u(cured) + 0.1 · u(death) = 0.9

Direct vs indirect methods

direct:
• roots in psychophysics
• inferior in both validity & reliability
• easily applied to risky tasks with complex consequences

indirect:
• roots in utility axioms
• time consuming
• irrelevant "gaming" effect
• distasteful / unethical
• unsuitable for measuring very small or very large utilities

Example: extreme utilities

For the diabetic gangrene decision problem, we could use the following gambles:

[1.0, amputated above knee] and [p, amputated below knee; (1 − p), death]

[1.0, amputated below knee] and [q, cured; (1 − q), amputated above knee]
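
One way to read these two gambles (an interpretation of the slide's figure, so treat it as an assumption) is that "amputated above knee" is compared with [p, amputated below knee; (1 − p), death], and "amputated below knee" with [q, cured; (1 − q), amputated above knee]. With u(cured) = 1 and u(death) = 0, the two indifference equations form a small linear system in the two unknown utilities. The sketch below solves it; the function name and the values p = 0.5 and q = 0.9 are hypothetical and only serve to make the example run.

```python
def solve_chained_gambles(p, q, u_cured=1.0, u_death=0.0):
    """Solve the two indifference equations
        u_above = p * u_below + (1 - p) * u_death
        u_below = q * u_cured + (1 - q) * u_above
    for u_above and u_below (requires (1 - q) * p != 1)."""
    u_below = (q * u_cured + (1 - q) * (1 - p) * u_death) / (1 - (1 - q) * p)
    u_above = p * u_below + (1 - p) * u_death
    return u_above, u_below


# Hypothetical indifference probabilities, for illustration only.
u_above, u_below = solve_chained_gambles(p=0.5, q=0.9)
print(round(u_above, 4), round(u_below, 4))
```

Because each gamble only involves neighbouring outcomes, neither indifference probability has to be extremely close to 0 or 1, which is the point of using intermediate outcomes as reference points.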


Subjective utility assessments

• can change over time
• are required for unknown / unexperienced outcomes
• are influenced by framing and certainty effects
• are influenced by third parties
• are not comparable from person to person
• . . .

Risk attitudes


An Example

Suppose you are given the choice between the following two “games”:

Game 1: [0.5, € 5; 0.5, € −1]

Game 2: [0.5, € 6; 0.5, € −2]

Do you have a clear preference for playing one or the other?

Risk-neutral preferences

Let X be an attribute with values x1 ≤ . . . ≤ xn, n ≥ 2, measured in some unit. Let L = [p, xi; (1 − p), xj] be a lottery over X.

A decision maker is risk-neutral if each additional unit of X is valued with the same increase in utility.

DEFINITION

A decision maker is risk-neutral if

u(p · xi + (1 − p) · xj) = p · u(xi) + (1 − p) · u(xj)

that is, the utility function is linear.

Note that u′′(X) = 0.


An Example

Suppose you are given the choice between the following two “games”:

Game 1: [0.5, € 52; 0.5, € −2]

Game 2: [0.5, € 5000; 0.5, € −4950]

Which game would you prefer to play?

Risk-averse preferences

Let X be an attribute with values x1 ≤ . . . ≤ xn, n ≥ 2, measured in some unit. Let L = [p, xi; (1 − p), xj] be a lottery over X.

A decision maker is risk-averse if each additional unit of X is valued with a smaller increase in utility.

DEFINITION

A decision maker is risk-averse if

u(p · xi + (1 − p) · xj) > p · u(xi) + (1 − p) · u(xj)

that is, the utility function is concave.

Note that u′′(X) < 0.

An Example

Suppose you are given the choice between the following two “games”:

Game 1: [0.2, € 10; 0.8, € 0.10]

Game 2: [0.8, € 2.50; 0.2, € 1]

Which game would you prefer to play?

Risk-prone preferences

Let X be an attribute with values x1 ≤ . . . ≤ xn, n ≥ 2, measured in some unit. Let L = [p, xi; (1 − p), xj] be a lottery over X.

A decision maker is risk-prone, or risk-seeking, if each additional unit of X is valued with a larger increase in utility.

DEFINITION

A decision maker is risk-prone if

u(p · xi + (1 − p) · xj) < p · u(xi) + (1 − p) · u(xj)

that is, the utility function is convex.

Note that u′′(X) > 0.
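
The three definitions compare the utility of a lottery's expected value with the lottery's expected utility. A minimal sketch of that comparison follows; the function name `risk_attitude`, the tolerance and the three sample utility functions are assumptions chosen for illustration. It tests a single lottery only, so it illustrates the definitions rather than proving an attitude over the whole range of X.

```python
import math


def risk_attitude(u, x_i, x_j, p=0.5, tol=1e-12):
    """Classify u on the lottery [p, x_i; (1 - p), x_j]."""
    utility_of_mean = u(p * x_i + (1 - p) * x_j)
    mean_of_utility = p * u(x_i) + (1 - p) * u(x_j)
    if abs(utility_of_mean - mean_of_utility) <= tol:
        return "risk-neutral"
    if utility_of_mean > mean_of_utility:
        return "risk-averse"
    return "risk-prone"


print(risk_attitude(lambda x: 2 * x + 1, 0, 10))  # linear utility
print(risk_attitude(math.sqrt, 0, 10))            # concave utility
print(risk_attitude(lambda x: x ** 2, 0, 10))     # convex utility
```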


Risk attitudes

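[Figure: utility curve u(X) plotted against X, with regions labelled RA (risk averse), RS (risk seeking) and RA.]

The zero-illusion curve:

[Figure: utility curve u(X) plotted against X, with regions labelled RS and RA.]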

Discounting

• the process of translating future rewards to their present value;

• necessary to perform "cost-benefit" analyses now.

Examples

• the time value of money
  • the value of a euro depends on when it is available
  • euros can be invested to yield more euros

• the time value of life
  • life years in the future are less valuable than today (?)
  • life years are valued relative to money

A discounting factor

A discounting factor indicates how much less a patient values each successive year of life, compared to the previous year.

The utility function for length of life with a constant discounting factor δ is approximated by

$$u(x) = \int_{t=0}^{x} e^{-\delta \cdot t} \, dt$$

for life-expectancy x.

An example

Consider the following ‘standard reference gamble’:

[1.0, x years] ∼ [0.50, 25 years; 0.50, 0 years]

For a utility function for length of life with a constant discounting factor δ = 0.02, we find:

$$u(0 \text{ years}) = \int_{t=0}^{0} e^{-0.02 \cdot t} \, dt = 0$$

and

$$u(25 \text{ years}) = \int_{t=0}^{25} e^{-0.02 \cdot t} \, dt = 19.67$$

The patient should be indifferent about the choice between the two lotteries for a life-expectancy x for which

u(x) = 0.5 · 19.67 + 0.5 · 0 = 9.835

We find from

$$\int_{t=0}^{x} e^{-0.02 \cdot t} \, dt = 9.835$$

that x ≈ 11 years.
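
The numbers in this example can be reproduced with the closed form of the integral, u(x) = (1 − e^(−δ·x)) / δ, and its inverse. The sketch below assumes that closed form; the function names `u_life` and `life_equivalent` are introduced here for illustration.

```python
import math


def u_life(x, delta=0.02):
    """Integral of exp(-delta * t) for t from 0 to x."""
    return (1 - math.exp(-delta * x)) / delta


def life_equivalent(target_utility, delta=0.02):
    """Life-expectancy x with u_life(x) == target_utility (inverse of u_life)."""
    return -math.log(1 - delta * target_utility) / delta


u25 = u_life(25)
x = life_equivalent(0.5 * u25 + 0.5 * u_life(0))
print(round(u25, 2))  # 19.67
print(round(x, 2))    # close to 11 years
```

With discounting, the certain equivalent of a 50/50 gamble over 0 and 25 years is about 11 years, below the expected 12.5 years.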


The risk-premium

Let X be an attribute with values x1 ≤ . . . ≤ xn, n ≥ 2, measured in some unit. Let u(X) be a utility function over X.

Consider a lottery L = [p, xi; (1 − p), xj] over X. Let xC be the certainty equivalent of lottery L and xE the expected value of L.

The risk premium RP of L is defined as:

$$RP = \begin{cases} x_E - x_C & \text{if } u(X) \text{ is increasing} \\ x_C - x_E & \text{if } u(X) \text{ is decreasing} \end{cases}$$

An example

Let u(x) = −e^(−0.2x) for all x ∈ [0, 50]. Consider the lottery L = [0.5, 0; 0.5, 10]. Compute the risk premium for this lottery.

The expected value xE of the lottery is

0.5 · 0 + 0.5 · 10 = 5

The certainty equivalent xC of the lottery is determined from

u(xC) = 0.5 · u(0) + 0.5 · u(10) = −0.5 − 0.5 · e^(−2) ≈ −0.57

and equals xC = −ln(0.57) / 0.2 ≈ 2.83.

As u(x) is an increasing function, the risk premium for L is

RP = 5 − 2.83 = 2.17
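
The computation above can be checked numerically. The sketch below uses the utility function and lottery of this example; the function names are assumptions introduced for illustration, and `u_inverse` is simply the algebraic inverse of u(x) = −e^(−0.2x).

```python
import math


def u(x):
    return -math.exp(-0.2 * x)


def u_inverse(v):
    """Inverse of u: the x for which u(x) == v (requires v < 0)."""
    return -math.log(-v) / 0.2


expected_value = 0.5 * 0 + 0.5 * 10
expected_utility = 0.5 * u(0) + 0.5 * u(10)
certainty_equivalent = u_inverse(expected_utility)

# u is increasing, so RP = xE - xC.
risk_premium = expected_value - certainty_equivalent
print(round(certainty_equivalent, 2), round(risk_premium, 2))  # 2.83 2.17
```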

Exercise

Let u(x) = − log(x + 30) for all x > −30. Consider the lottery L = [0.5, −20; 0.5, −10]. What is the lottery’s risk premium? Is the decision maker risk averse or risk prone?

Risk averseness & risk premium

A decision maker is

• risk averse iff his/her risk premium is positive for all nondegenerate lotteries;

• decreasingly risk averse iff he/she is risk averse and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] decreases (↓ 0) as x increases;

• increasingly risk averse iff he/she is risk averse and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] increases (↑ ∞) as x increases;

• constantly risk averse iff he/she is risk averse and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] remains constant for all x.
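
To see how the risk premium of a lottery [0.5, x − h; 0.5, x + h] can behave as x grows, the sketch below evaluates it for two increasing, concave utility functions. Both functions, the constant a = 0.1 and the values of x and h are assumptions chosen for illustration: the natural logarithm gives a risk premium that shrinks with x, while the exponential utility −e^(−a·x) gives a constant one.

```python
import math


def risk_premium(u, u_inv, x, h):
    """Risk premium of [0.5, x - h; 0.5, x + h] for an increasing utility u."""
    expected_value = x
    expected_utility = 0.5 * u(x - h) + 0.5 * u(x + h)
    return expected_value - u_inv(expected_utility)


A = 0.1  # hypothetical constant of the exponential utility


def exp_u(x):
    return -math.exp(-A * x)


def exp_u_inv(v):
    return -math.log(-v) / A


xs = (10.0, 20.0, 40.0)
log_rp = [risk_premium(math.log, math.exp, x, 5.0) for x in xs]
exp_rp = [risk_premium(exp_u, exp_u_inv, x, 5.0) for x in xs]

print([round(rp, 3) for rp in log_rp])  # decreasing in x
print([round(rp, 3) for rp in exp_rp])  # the same for every x
```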


Risk proneness & risk premium

A decision maker is

• risk prone iff his/her risk premium is negative for all nondegenerate lotteries;

• decreasingly risk prone iff he/she is risk prone and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] increases (↑ 0) as x increases;

• increasingly risk prone iff he/she is risk prone and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] decreases (↓ −∞) as x increases;

• constantly risk prone iff he/she is risk prone and
  • his/her risk premium for any lottery [0.5, x − h; 0.5, x + h] remains constant for all x.


The degree of risk aversion / proneness – Introduction

Information regarding a decision maker’s risk attitude is given by

• RP: sign indicates aversion vs proneness; magnitude captures the degree of this behaviour, for one specific lottery!

• u′′(X): sign indicates aversion vs proneness; magnitude conveys no relevant information, since strategically equivalent functions capture the same risk attitude:

[Figure: two plots of u(X) against X, one for u(x) = −3e^(−x) and one for u(x) = 1 − e^(−x). Each marks x − h, x and x + h on the X axis, the expected value xE, the certainty equivalent, and the risk premium RP between them.]
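
The two functions named here, u(x) = −3e^(−x) and u(x) = 1 − e^(−x), differ only by a positive linear transformation (the first equals 3 times the second, minus 3). The sketch below checks that they give the same certainty equivalent, and hence the same risk premium, for a lottery [0.5, x − h; 0.5, x + h], even though their second derivatives differ by a factor of 3. The values x = 2 and h = 1 and all function names are assumptions chosen for illustration.

```python
import math


def certainty_equivalent(u, u_inv, x, h):
    """Certainty equivalent of the lottery [0.5, x - h; 0.5, x + h]."""
    return u_inv(0.5 * u(x - h) + 0.5 * u(x + h))


def u1(x):
    return -3 * math.exp(-x)


def u1_inv(v):
    return -math.log(-v / 3)


def u2(x):
    return 1 - math.exp(-x)


def u2_inv(v):
    return -math.log(1 - v)


x, h = 2.0, 1.0  # hypothetical lottery [0.5, 1; 0.5, 3]
ce1 = certainty_equivalent(u1, u1_inv, x, h)
ce2 = certainty_equivalent(u2, u2_inv, x, h)
print(round(ce1, 4), round(ce2, 4))          # equal certainty equivalents
print(round(x - ce1, 4), round(x - ce2, 4))  # equal risk premiums
```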


The risk-aversion function

DEFINITION

Consider a utility function u(X), and let σ ∈ {+, −} denote the sign of u′(x). The risk-aversion function R(X) for u(X) is then defined by

$$R(X) = -\sigma \, \frac{u''(X)}{u'(X)}$$

• if R(X) > 0 then the decision maker is risk averse;
• if R(X) < 0 then the decision maker is risk prone;
• if R(X) = 0 then the decision maker is risk neutral.

THEOREM

R(X) is increasing (decreasing, constant) iff the decision maker’s risk premium for any lottery [0.5, x − h; 0.5, x + h] is increasing (decreasing, constant) for increasing x.

An example

Let u(x) = 1 − e^(−x/900), x > 0, be a utility function. Find the risk-aversion function for X.

We have that

$$u'(x) = \frac{1}{900} \cdot e^{-x/900}, \quad x > 0$$

and

$$u''(x) = \frac{-1}{900^2} \cdot e^{-x/900}, \quad x > 0$$

Since u(x) is an increasing function in x (u′(x) > 0 for all x > 0), the risk-aversion function is defined as

$$R(X) = -\frac{u''(X)}{u'(X)}$$

We conclude that R(X) = 1/900 and that u(X) models constant risk aversion.
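
This result can be checked numerically. The sketch below approximates u′ and u′′ with central finite differences and forms R(x) = −σ · u′′(x) / u′(x); the function name `risk_aversion`, the step size and the evaluation points are assumptions chosen for illustration, and the finite differences are a stand-in for the analytic derivatives used above.

```python
import math


def u(x):
    return 1 - math.exp(-x / 900)


def risk_aversion(u, x, step=1.0):
    """Approximate R(x) = -sigma * u''(x) / u'(x) with central differences."""
    d1 = (u(x + step) - u(x - step)) / (2 * step)
    d2 = (u(x + step) - 2 * u(x) + u(x - step)) / step ** 2
    sigma = 1 if d1 > 0 else -1  # sign of u'(x)
    return -sigma * d2 / d1


for x in (100.0, 1000.0, 5000.0):
    print(x, risk_aversion(u, x), 1 / 900)
```

The approximation stays close to 1/900 at every evaluation point, in line with constant risk aversion.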