
Problems for Solution
from
The Theory of Probability
and
Selected Solutions

Santosh S. Venkatesh
University of Pennsylvania

Contents

Preface

A ELEMENTS

I Probability Spaces
II Conditional Probability
III A First Look at Independence
IV Probability Sieves
V Numbers Play a Game of Chance
VI The Normal Law
VII Probabilities on the Real Line
VIII The Bernoulli Schema
IX The Essence of Randomness
X The Coda of the Normal

B FOUNDATIONS

XI Distribution Functions and Measure
XII Random Variables
XIII Great Expectations
XIV Variations on a Theme of Integration
XV Laplace Transforms
XVI The Law of Large Numbers
XVII From Inequalities to Concentration
XVIII Poisson Approximation
XIX Convergence in Law, Selection Theorems
XX Normal Approximation

Index

Preface

Several generations of students and teaching assistants worked out problems and contributed to the selected solutions assembled here. I should especially acknowledge Wei Bi, Shao Chieh Fang, Gaurav Kasbekar, Jonathan Nukpezah, Alireza Tahbaz Salehi, Shahin Shahrampour, Evangelos Vergetis, and Zhengwei Wu. I must confess immediately that not all the solutions have seen the same level of scrutiny; and, while I have expended some effort to bring some uniformity of style and presentation to the material, it may occasionally lack some polish. It struck me, however, that a reader desirous of an early view of the results may be willing to forgive the deficiencies in the current crude assemblage and, accordingly, that it may be preferable to provide a rough and ready collection of solutions now, however unpolished, unvetted, and incomplete, and not wait till I had had the time to proof-read them carefully and present them in a uniform format and style while filling in the gaps.

References to ToP are, of course, to The Theory of Probability; references to Equations, Examples, Theorems, Sections, Chapters, and so on point to the main text; the referencing conventions are as in ToP. As a navigational aid, page headings are labelled by chapter and problem; the detailed index that I have provided may also be of use.

There are inevitably errors in the text lying in wait to be discovered. I can only hope that these are of the obvious kind that do not cause great consternation and apologise for these in advance. I would certainly very much appreciate receiving word of errors and ambiguities both here and in the main text of ToP.

January 1, 2015: The task of compiling solutions proceeds slowly. The current version of the palimpsest provides solutions for more than 75% of the problems: of the 560 problems assigned for solution in ToP, 428 now have solutions spelled out in detail. In many instances I have provided two or three different approaches to solution to illustrate different perspectives; and the solutions frequently flesh out themes and generalisations suggested by the patterns of attack. Table 1 below identifies the current list of problems whose solutions appear in this manuscript.

Part | Chapter | Chapter title | Solutions provided | Not ready for prime time
A | I | Probability Spaces | 1–30 | 31–34
A | II | Conditional Probabilities | 1–19, 22–31 | 20, 21
A | III | A First Look at Independence | 1–20 | 21–23
A | IV | Probability Sieves | 1–26, 29 | 27, 28
A | V | Numbers Play a Game of Chance | 1–18 | 19–21
A | VI | The Normal Law | 1–11 | 12, 13
A | VII | Probabilities on the Real Line | 1–22 | none
A | VIII | The Bernoulli Schema | 1–26, 28–38 | 27
A | IX | The Essence of Randomness | 1–36 | none
A | X | The Coda of the Normal | 1–12, 14–22, 24–33 | 13, 23
B | XI | Distribution Functions and Measure | 1–5, 7–11 | 6, 12, 13
B | XII | Random Variables | 1–24 | 25, 26
B | XIII | Great Expectations | 1–5, 7–23 | 6, 24–26
B | XIV | Variations on a Theme of Integration | 1–11, 13–22, 24–27, 30–36, 39–51, 54–56 | 12, 23, 28, 29, 37, 38, 52, 53
B | XV | Laplace Transforms | 1–11, 17, 20–31 | 12–16, 18, 19
B | XVI | The Law of Large Numbers | 1–4, 11, 14–20 | 5–10, 12, 13, 21–31
B | XVII | From Inequalities to Concentration | 1 | 2–35
B | XVIII | Poisson Approximation | 4, 5, 20–23 | 1–3, 6–19
B | XIX | Convergence in Law, Selection Theorems | 1–4, 6, 11, 17, 19, 20 | 5, 7–10, 12–16, 18
B | XX | Normal Approximation | 1–3, 6–8, 10–14 | 4, 5, 9, 15–19
C | XXI | Sequences, Functions, Spaces | none | none

Table 1: List of compiled solutions.

Part A

ELEMENTS

I

Probability Spaces

Notation for generalised binomial coefficients will turn out to be useful going forward and I will introduce them here for ease of reference. As a matter of convention, for real $t$ and integer $k$, we define
\[
\binom{t}{k} =
\begin{cases}
\dfrac{t(t-1)(t-2)\cdots(t-k+1)}{k!} & \text{if } k \ge 0,\\[1ex]
0 & \text{if } k < 0.
\end{cases}
\]

Problems 1–5 deal with these generalised binomial coefficients.
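The convention above translates directly into a few lines of exact arithmetic. The following is only a sketch for experimenting with Problems 1–3; the function name `gen_binom` is an assumption of this illustration and does not come from ToP.

```python
from fractions import Fraction
from math import comb


def gen_binom(t, k):
    """Generalised binomial coefficient t(t-1)...(t-k+1)/k!, taken to be 0 if k < 0."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)
    for j in range(k):
        # After this step, result equals the coefficient with lower index j + 1.
        result = result * (Fraction(t) - j) / (j + 1)
    return result


# Spot checks of the identities in Problems 1-3 for a non-integer t.
t = Fraction(5, 3)
pascal_ok = all(gen_binom(t, k - 1) + gen_binom(t, k) == gen_binom(t + 1, k)
                for k in range(-2, 8))
half_ok = all(gen_binom(Fraction(-1, 2), k) == (-1) ** k * comb(2 * k, k) * Fraction(1, 4 ** k)
              for k in range(8))
```

Exact rationals are used so that the identities can be compared with `==` rather than up to rounding error.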

1. Pascal’s triangle. Prove that $\binom{t}{k-1} + \binom{t}{k} = \binom{t+1}{k}$.

SOLUTION: The identity is a generalisation of Pascal’s triangle which, for integer $t$, may be verified by an elementary combinatorial argument. Simple algebraic factorisation is all that is needed to verify the general case of the identity for real-valued $t$. The identity is obvious for $k \le 0$ and, for $k \ge 1$, we have
\[
\binom{t}{k-1} + \binom{t}{k}
= \frac{t(t-1)\cdots(t-k+2)}{(k-1)!} + \frac{t(t-1)\cdots(t-k+1)}{k!}
= \frac{t(t-1)\cdots(t-k+2)}{k!}\bigl(k + (t-k+1)\bigr)
= \binom{t+1}{k}.
\]

2. If $t > 0$, show that $\binom{-t}{k} = (-1)^k \binom{t+k-1}{k}$ and hence that $\binom{-1}{k} = (-1)^k$ and $\binom{-2}{k} = (-1)^k (k+1)$ if $k \ge 0$.

SOLUTION: By factoring out the terms $-1$ from the numerator, we have
\[
\binom{-t}{k} = \frac{(-t)(-t-1)(-t-2)\cdots(-t-k+1)}{k!}
= (-1)^k \frac{t(t+1)(t+2)\cdots(t+k-1)}{k!}
= (-1)^k \binom{t+k-1}{k}.
\]
If $k \ge 0$, then, for $t = 1$, this reduces to the identity $\binom{-1}{k} = (-1)^k$ while, for $t = 2$, we obtain the useful result that $\binom{-2}{k} = (-1)^k (k+1)$.


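3. Show $\binom{1/2}{k} = (-1)^{k-1} \frac{1}{k} \binom{2k-2}{k-1} 2^{-2k+1}$ and $\binom{-1/2}{k} = (-1)^k \binom{2k}{k} 2^{-2k}$.

SOLUTION: By factoring out $1/2$ from each of the terms in the product in the numerator, we have
\[
\begin{aligned}
\binom{1/2}{k}
&= \frac{\frac12\bigl(\frac12 - 1\bigr)\bigl(\frac12 - 2\bigr)\cdots\bigl(\frac12 - k + 1\bigr)}{k!}\\
&= \Bigl(\frac12\Bigr)^k \frac{1(1-2)\cdots\bigl(1 - 2(k-1)\bigr)}{k(k-1)!}
 = \Bigl(\frac12\Bigr)^k (-1)^{k-1} \frac{1\cdot 3\cdot 5\cdot 7\cdots(2k-3)}{k(k-1)!}\\
&= \Bigl(\frac12\Bigr)^k (-1)^{k-1} \frac{1\cdot 3\cdot 5\cdot 7\cdots(2k-3)\,\bigl(2\cdot 4\cdots(2k-2)\bigr)}{k(k-1)!\,\bigl(2\cdot 4\cdots(2k-2)\bigr)}\\
&= \Bigl(\frac12\Bigr)^k (-1)^{k-1} \frac{(2k-2)!}{k\,2^{k-1}(k-1)!\,(k-1)!}
 = (-1)^{k-1}\frac1k \binom{2k-2}{k-1} 2^{-2k+1}.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
\binom{-1/2}{k}
&= \frac{-\frac12\bigl(-\frac12 - 1\bigr)\bigl(-\frac12 - 2\bigr)\cdots\bigl(-\frac12 - k + 1\bigr)}{k!}
 = \Bigl(-\frac12\Bigr)^k \frac{1\cdot 3\cdot 5\cdots(2k-1)}{k!}\\
&= \Bigl(-\frac12\Bigr)^k \frac{1\cdot 3\cdot 5\cdots(2k-1)\,\bigl(2\cdot 4\cdots(2k)\bigr)}{k!\,\bigl(2\cdot 4\cdots(2k)\bigr)}
 = \Bigl(-\frac12\Bigr)^k \frac{(2k)!}{2^k\,k!\,k!}
 = (-1)^k \binom{2k}{k} 2^{-2k}.
\end{aligned}
\]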

4. Prove Newton’s binomial theorem
\[
(1+x)^t = 1 + \binom{t}{1}x + \binom{t}{2}x^2 + \binom{t}{3}x^3 + \cdots = \sum_{k=0}^{\infty} \binom{t}{k} x^k,
\]
the series converging for all real $t$ whenever $|x| < 1$. Thence, if $t = n$ is any positive integer obtain the usual binomial formula $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$.

SOLUTION: When $t = n$ is a positive integer we have the classical binomial expansion $(1+x)^n = \binom{n}{0}x^0 + \binom{n}{1}x^1 + \cdots + \binom{n}{n}x^n$ known from antiquity. The generalisation of this venerable result to real powers $t$ is due to Isaac Newton. This formula was the harbinger of the calculus.

Taylor’s theorem with remainder says that if $f$ is $n+1$ times continuously differentiable then
\[
f(x) = \sum_{k=0}^{n} f^{(k)}(0)\frac{x^k}{k!} + R_n(x)
\]
where the remainder is given by the expression
\[
R_n(x) = \frac{1}{n!}\int_0^x t^n f^{(n+1)}(x-t)\,dt.
\]
The derivatives are easy to compute for the given function $f(x) = (1+x)^t$ and are compactly expressed in terms of the falling factorial notation:
\[
f^{(k)}(x) = t(t-1)(t-2)\cdots(t-k+1)(1+x)^{t-k} = t^{\underline{k}}(1+x)^{t-k} \qquad (k \ge 0).
\]


It follows that
\[
f(x) = \sum_{k=0}^{n} \frac{t^{\underline{k}}}{k!} x^k + R_n(x)
\]
and we now only need to show that $R_n(x) \to 0$ for each $x$. But this is easy to see as the derivatives of $f$ grow only exponentially fast with order. Indeed, if $|x| < 1$, then we may select a positive function $M(x)$ so that
\[
|f^{(n+1)}(t)| \le \max\bigl\{1, t^{\underline{n+1}}\bigr\}(1+|x|)^{t-n-1} \le M(x)^{n+1}
\]
uniformly for all $t$ in the closed and bounded interval $|t| \le |x|$. It follows that
\[
|R_n(x)| \le \frac{M(x)^{n+1}}{n!}\int_0^x t^n\,dt = \frac{\bigl(M(x)x\bigr)^{n+1}}{(n+1)!},
\]
and the bound on the right converges to zero as $n$ tends to infinity because of the super-exponential growth of the factorial function. Allowing $n \to \infty$ in Taylor’s formula, we hence obtain Newton’s binomial theorem
\[
(1+x)^t = \sum_{k=0}^{\infty} \binom{t}{k} x^k \qquad (|x| < 1).
\]
Specialising to the case when $t = n$ is a positive integer, we see that $\binom{n}{k} = n^{\underline{k}}/k! = 0$ if $k > n$ and the usual binomial formula results.
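A quick numerical sanity check of the series, not a substitute for the proof, can be sketched as follows. The helper name `newton_series` and the default number of terms are assumptions of this illustration; each term is built from the previous one via the ratio $\binom{t}{k+1}/\binom{t}{k} = (t-k)/(k+1)$.

```python
def newton_series(t, x, terms=200):
    """Partial sum of sum_{k>=0} binom(t, k) x^k in floating point."""
    total, term = 0.0, 1.0          # term starts at binom(t, 0) x^0 = 1
    for k in range(terms):
        total += term
        # binom(t, k+1) x^(k+1) = binom(t, k) x^k * (t - k)/(k + 1) * x
        term *= (t - k) / (k + 1) * x
    return total


approx_sqrt = newton_series(0.5, 0.3)      # compare with (1 + 0.3) ** 0.5
approx_recip = newton_series(-1.0, 0.5)    # geometric series, compare with 1 / 1.5
finite_case = newton_series(3, 2.0, terms=10)  # integer t: terms vanish for k > 3
```

For integer $t = n$ the factor $(t - k)$ becomes zero at $k = n$, so the sum terminates and $|x| < 1$ is no longer needed, in line with the closing remark of the solution.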

5. Continuation. With $t = -1$, the previous problem reduces to the geometric series
\[
\frac{1}{1+x} = 1 - x + x^2 - x^3 + x^4 - \cdots
\]
convergent for all $|x| < 1$. By integrating this expression termwise derive the Taylor expansion of the natural logarithm
\[
\log(1+x) = x - \tfrac12 x^2 + \tfrac13 x^3 - \tfrac14 x^4 + \cdots
\]
convergent for $|x| < 1$. (Unless explicitly noted otherwise, our logarithms are always to the natural or "Napier" base $e$.) Hence derive the alternative forms
\[
\log(1-x) = -x - \tfrac12 x^2 - \tfrac13 x^3 - \tfrac14 x^4 - \cdots,
\qquad
\tfrac12 \log\Bigl(\frac{1+x}{1-x}\Bigr) = x + \tfrac13 x^3 + \tfrac15 x^5 + \cdots.
\]

SOLUTION: Setting $t = -1$ in Newton’s binomial theorem we obtain the geometric series
\[
\frac{1}{1+x} = 1 - x + x^2 - x^3 + x^4 - \cdots
\]
which converges for $|x| < 1$. By a formal term by term integration of both sides from $0$ to $x$, we obtain
\[
\log(1+x) = x - \frac12 x^2 + \frac13 x^3 - \frac14 x^4 + \cdots, \tag{1}
\]
the term-wise integration easily seen to be permissible as
\[
\begin{aligned}
\biggl| \int_0^x \bigl( (-1)^n t^n + (-1)^{n+1} t^{n+1} + \cdots \bigr)\,dt \biggr|
&\le \int_0^x t^n (1 + t + t^2 + \cdots)\,dt\\
&= \int_0^x \frac{t^n}{1-t}\,dt
 \le \frac{1}{1-x}\int_0^x t^n\,dt
 = \frac{x^{n+1}}{(n+1)(1-x)} \to 0 \qquad (n \to \infty).
\end{aligned}
\]
Setting $x \leftarrow -x$ in (1), we obtain the companion series,
\[
\log(1-x) = -x - \frac12 x^2 - \frac13 x^3 - \frac14 x^4 - \cdots, \tag{2}
\]
and combining (1,2) we see that
\[
\frac12 \log\frac{1+x}{1-x} = \frac12\bigl(\log(1+x) - \log(1-x)\bigr) = x + \frac13 x^3 + \frac15 x^5 + \cdots \qquad (|x| < 1),
\]
as asserted.
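The three logarithmic series can be compared numerically with the library logarithm. This is a minimal sketch; the function names and the truncation at 200 terms are choices made for this illustration only.

```python
import math


def log1p_series(x, terms=200):
    """Partial sum of x - x^2/2 + x^3/3 - ... for log(1 + x), |x| < 1."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


def half_log_ratio_series(x, terms=200):
    """Partial sum of x + x^3/3 + x^5/5 + ... for (1/2) log((1 + x)/(1 - x))."""
    return sum(x ** (2 * j + 1) / (2 * j + 1) for j in range(terms))


x = 0.5
err_plus = abs(log1p_series(x) - math.log(1 + x))
err_minus = abs(log1p_series(-x) - math.log(1 - x))   # series (2) is (1) at -x
err_ratio = abs(half_log_ratio_series(x) - 0.5 * math.log((1 + x) / (1 - x)))
```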

Problems 6–18 deal with sample spaces in intuitive settings.

6. A fair coin is tossed repeatedly. What is the probability that on the $n$th toss: (a) a head appears for the first time? (b) heads and tails are balanced? (c) exactly two heads have occurred?

SOLUTION: (a) A head occurs for the first time on the $n$th toss if, and only if, the $n$th toss results in a head and the first $n-1$ tosses result in tails. As all $2^n$ sequences of $n$ heads and tails are equally likely for the given problem, the probability in question is $2^{-n}$.

(b) Heads and tails can only be balanced in $n$ tosses of a coin if $n$ is even. In this case, there are precisely $\binom{n}{n/2}$ ways of specifying the locations of the heads in the sequence and so, for even $n$, the probability that heads and tails are balanced is $\binom{n}{n/2} 2^{-n}$. For odd $n$, the probability of balance is zero.

(c) There are $\binom{n}{2}$ sequences of $n$ heads and tails containing precisely two heads. The probability of seeing precisely two heads is hence $\binom{n}{2} 2^{-n}$.
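For small $n$ the three answers can be confirmed by listing all $2^n$ equally likely sequences. The sketch below does exactly that; the function name `coin_probabilities` is an assumption of this illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb


def coin_probabilities(n):
    """Return (P(first head on toss n), P(balanced), P(exactly two heads)) by enumeration."""
    first_head = balanced = two_heads = 0
    for seq in product("HT", repeat=n):
        heads = seq.count("H")
        if heads == 1 and seq[-1] == "H":   # n - 1 tails followed by a head
            first_head += 1
        if 2 * heads == n:                  # only possible when n is even
            balanced += 1
        if heads == 2:
            two_heads += 1
    total = 2 ** n
    return (Fraction(first_head, total),
            Fraction(balanced, total),
            Fraction(two_heads, total))


even_case = coin_probabilities(6)
odd_case = coin_probabilities(5)
```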

7. Six cups and six saucers come in pairs, two pairs are red, two are white, and two are blue. If cups are randomly assigned to saucers find the probability that no cup is upon a saucer of the same colour.

SOLUTION: It is simplest to systematically account for all possibilities of arrangements. For purposes of enumeration we may consider the saucers to be labelled from 1 through 6, the cups likewise. Formally, each permutation $(\Pi_1, \dots, \Pi_6)$ of $(1, \dots, 6)$ represents a sample point of the experiment and corresponds to the cup-saucer matchings $i \mapsto \Pi_i$. As $(\Pi_1, \dots, \Pi_6)$ ranges over all permutations, we range over all $6!$ assignments of cups to saucers. Of these, those arrangements where all cups are on saucers of a different colour are of two types.

1. Both cups of a given colour are placed on matching saucers of a different colour. A little introspection shows that this means that each pair of cups of a given colour must be matched with a pair of saucers of a different colour. If, say, both red cups are placed on blue saucers then both blue cups must be placed on white saucers which means in turn that both white cups must be placed on red saucers. The red cups may be placed on the blue saucers in $2!$ ways, the blue cups on the white saucers in $2!$ ways, and the white cups on the red saucers in $2!$ ways, for a total of $2! \times 2! \times 2! = 8$ possibilities with the assignments red → blue → white. Interchanging the rôles of blue and white, there are 8 more assignments of the form red → white → blue. There are hence a total of 16 arrangements with both cups of each colour placed on matching saucers of a different colour.

2. The cups of a given colour are placed on saucers of differing colours, neither of the original colour. We may systematically enumerate the possibilities as follows. The first red cup may be placed on one of the four saucers that are not red; once a saucer is selected, the second red cup has two possibilities in the saucers of the remaining colour; there are thus 8 ways in which we may deploy the red cups on one blue and one white saucer. Now both red cups are deployed, one on a blue saucer and one on a white saucer. The two blue cups must now be placed one on the remaining white saucer and one on a red saucer. The red saucer may be selected in two ways and the blue cup to be placed on it may be selected in two ways; there are thus 4 ways in which the blue cups may be placed on a white and a red saucer. Finally, there remain two white cups, one blue and one red saucer, and both deployments of cups on saucers give rise to valid arrangements. In total we have $8 \times 4 \times 2 = 64$ arrangements of the cups on the saucers so that the cups of a given colour are placed on saucers of differing colours, neither of the original colour.

Combining the two cases, there are $16 + 64 = 80$ arrangements of cups on saucers so that no cup is on a saucer of a matching colour. The associated probability is hence $(16 + 64)/6! = 1/9$, or about 11 percent.
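With only $6! = 720$ sample points, the count can be checked by brute force. The sketch below labels cups and saucers 0 through 5 with colours given by the string `COLOURS` (a labelling assumed here for illustration) and separates the two types described above.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

COLOURS = "RRWWBB"  # colour of cup i and of saucer i, for i = 0..5


def count_arrangements():
    """Count colour-mismatched assignments, split into type 1 (pairs kept together) and type 2."""
    paired = split = 0
    for perm in permutations(range(6)):   # cup i sits on saucer perm[i]
        if any(COLOURS[cup] == COLOURS[saucer] for cup, saucer in enumerate(perm)):
            continue                      # some cup is on a saucer of its own colour
        # Type 1: both cups of every colour land on saucers of one common colour.
        if all(COLOURS[perm[i]] == COLOURS[perm[i + 1]] for i in (0, 2, 4)):
            paired += 1
        else:
            split += 1
    return paired, split


paired, split = count_arrangements()
probability = Fraction(paired + split, factorial(6))
```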

8. Birthday paradox. In a group of $n$ unrelated individuals, none born on a leap year, what is the probability that at least two share a birthday? Show that this probability exceeds one-half if $n \ge 23$. [Make natural assumptions for the probability space.]

SOLUTION: We suppose that the year consists of 365 days (we disregard leap years), that individuals are equally likely to be born on any day (a careful analysis of data shows that this is not strictly true but it’s certainly a reasonable starting point; our results are at least conservative), and that births are independent of one another (twins and triplets need not apply). The number of arrangements of $n$ birthdays so that no two are the same is, in the falling factorial notation, $365^{\underline{n}} := 365 \cdot (365-1) \cdot (365-2) \cdots (365-n+1)$. As there are $365^n$ arrangements in total, the probability that no two birthdays coincide is given by $365^{\underline{n}}/365^n$ and, accordingly, the probability that two or more birthdays coincide is given by $1 - 365^{\underline{n}}/365^n$. For $n = 23$ this evaluates to approximately 0.5073 which many people find surprising. For $n = 30$, the probability is in excess of 70%.

An asymptotic estimate which is quite remarkably good may be arrived at for the probability. We begin by estimating
\[
\frac{N^{\underline{n}}}{N^n} = \frac{N}{N}\cdot\frac{N-1}{N}\cdot\frac{N-2}{N}\cdots\frac{N-n+1}{N}
= 1\cdot\Bigl(1 - \frac1N\Bigr)\Bigl(1 - \frac2N\Bigr)\cdots\Bigl(1 - \frac{n-1}{N}\Bigr).
\]
Taking logarithms of both sides, we see then that
\[
\log\frac{N^{\underline{n}}}{N^n} = \sum_{k=1}^{n-1} \log\Bigl(1 - \frac kN\Bigr).
\]
The Taylor series (2) for the logarithm shows that, for each fixed $n$ and $1 \le k \le n-1$, we have $\log\bigl(1 - \frac kN\bigr) = -\frac kN - \xi_N$, where the order term $\xi_N \le k^2/N^2 < n^2/N^2$ for all sufficiently large $N$. It follows that
\[
\log\frac{N^{\underline{n}}}{N^n} = -\sum_{k=1}^{n-1}\frac kN - \xi_N = -\frac{n(n-1)}{2N} - \xi_N
\]
where $\xi_N < n^3/N^2$. We now have our desired estimate. Suppose $K_N$ is any positive sequence satisfying $K_N/N^{2/3} \to 0$ as $N \to \infty$. If $n \le K_N$ then $N^{\underline{n}}/N^n \sim e^{-(n-1)n/2N}$ where the asymptotic equivalence is to be taken to mean that the ratio of the two sides tends to one as $N \to \infty$.

With $N = 365$ and $n = 23$, the asymptotic estimate for the probability that two or more birthdays coincide yields the approximation $1 - e^{-22\times 23/(2\times 365)} = 0.500\cdots$ which differs from the exact answer only in about 7 parts in 1000.
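The exact probability and the asymptotic estimate are both easy to evaluate. The following sketch assumes the uniform, independent birthday model of the solution; the function names are hypothetical.

```python
import math


def p_shared_exact(n, N=365):
    """1 - N^(falling n) / N^n: probability that at least two of n birthdays coincide."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (N - k) / N
    return 1.0 - p_distinct


def p_shared_approx(n, N=365):
    """Asymptotic estimate 1 - exp(-n(n-1)/(2N))."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * N))


exact_23 = p_shared_exact(23)
approx_23 = p_shared_approx(23)
gap_23 = exact_23 - approx_23
```

Evaluating `p_shared_exact(22)` alongside `p_shared_exact(23)` shows that 23 is the smallest group size for which the probability crosses one-half under these assumptions.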

9. Lottery. A lottery specifies a random subset $R$ of $r$ out of the first $n$ natural numbers by picking one at a time. Determine the probabilities of the following events: (a) there are no consecutive numbers, (b) there is exactly one pair of consecutive numbers, (c) the numbers are drawn in increasing order. Suppose that you have picked your own random set of $r$ numbers. What is the probability that (d) your selection matches $R$?, (e) exactly $k$ of your numbers match up with numbers in $R$?

SOLUTION: (a) $\binom{n-r+1}{r} \big/ \binom{n}{r}$: one can show by a direct combinatorial argument that the number of $r$-sets with no consecutive numbers is given by $\binom{n-r+1}{r}$ but a demonstration by induction on $n$ is even faster. The base of the induction is obvious as, if $n < 2r-1$, then there are no $r$-sets without consecutive numbers, and, when $n = 2r-1$, there is precisely one $r$-set consisting of the odd numbers $1, 3, \dots, 2r-1$ which has no consecutive numbers: $1 = \binom{(2r-1)-r+1}{r}$. Now fix $r \ge 0$ and consider the situation for a generic value of $n \ge 2r-1$. If we increase $n$ to $n+1$ then the $r$-sets with no consecutive numbers are of two types. Type (i) $r$-sets do not include $n+1$: but then $n+1$ is irrelevant and the situation reverts to that for $n$; by induction hypothesis, there are hence precisely $\binom{n-r+1}{r}$ type (i) $r$-sets. Type (ii) $r$-sets include $n+1$: such $r$-sets cannot include $n$ so that the remaining $r-1$ numbers are selected from 1 through $n-1$ with no consecutive numbers; by induction hypothesis again, there are hence precisely $\binom{(n-1)-(r-1)+1}{r-1}$ type (ii) $r$-sets. Thus, the number of $r$-sets selected from $1, \dots, n, n+1$ so as to not include any consecutive numbers is given by
\[
\binom{n-r+1}{r} + \binom{(n-1)-(r-1)+1}{r-1}
= \binom{n-r+1}{r} + \binom{n-r+1}{r-1}
= \binom{(n+1)-r+1}{r},
\]
the final step by Pascal’s triangle. This completes the induction.

(b) $(r-1)\binom{n-r+1}{r-1} \big/ \binom{n}{r}$: proof by induction on $n$ and $r$. The base case for $r = n = 2$ is obvious. Now consider the effect of moving from $n$ to $n+1$. The $r$-sets of $1, \dots, n-1, n, n+1$ which contain precisely one pair of consecutive numbers are of three types. Type (i) $r$-sets do not include $n+1$: but this situation reverts to that for $n$ and so, by induction hypothesis, the number of type (i) $r$-sets is equal to $(r-1)\binom{n-(r-1)}{r-1}$. Type (ii) $r$-sets include $n+1$ but not $n$: but then, we must select $r-1$ integers from $1, \dots, n-1$ in such a way that there is precisely one pair of consecutive numbers; by induction hypothesis, the number of type (ii) $r$-sets is equal to $(r-2)\binom{(n-1)-(r-2)}{r-2}$. Type (iii) $r$-sets include both $n$ and $n+1$: but then the remaining $r-2$ numbers cannot include $n-1$ and must be chosen from $1, \dots, n-2$ in such a way as to obtain no pair of consecutive numbers; by part (a) there are hence exactly $\binom{(n-2)-(r-2)+1}{r-2}$ type (iii) $r$-sets. Accordingly, the total number of $r$-sets selected from $1, \dots, n, n+1$ so as to contain precisely one pair of consecutive numbers is given by
\[
\begin{aligned}
(r-1)&\binom{n-(r-1)}{r-1} + (r-2)\binom{(n-1)-(r-2)}{r-2} + \binom{(n-2)-(r-2)+1}{r-2}\\
&= (r-1)\binom{n-(r-1)}{r-1} + (r-1)\binom{(n-1)-(r-2)}{r-2}\\
&= (r-1)\biggl[\binom{n-r+1}{r-1} + \binom{n-r+1}{r-2}\biggr]
 = (r-1)\binom{(n+1)-r+1}{r-1},
\end{aligned}
\]
the final step again by Pascal’s triangle. This concludes the induction.

(c) $1/r!$: for any selection of $r$ numbers, precisely one out of the $r!$ equally likely arrangements is in increasing order.

(d) $1 \big/ \binom{n}{r}$: self-explanatory.

(e) $\binom{r}{k}\binom{n-r}{r-k} \big/ \binom{n}{r}$: specify the $k$ numbers that match and select the remaining numbers from those excluded from $R$.
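The counts in parts (a), (b), and (e) lend themselves to a brute-force check over all $r$-subsets for small $n$. This sketch uses hypothetical helper names; in part (e) it fixes the player's selection to $\{1, \dots, r\}$, which by symmetry is an assumption that costs no generality.

```python
from itertools import combinations
from math import comb


def consecutive_pairs(subset):
    """Number of adjacent pairs (a, a + 1) contained in the subset."""
    s = sorted(subset)
    return sum(1 for a, b in zip(s, s[1:]) if b - a == 1)


def count_with_pairs(n, r, pairs):
    """Number of r-subsets of {1..n} containing exactly `pairs` consecutive pairs."""
    return sum(1 for s in combinations(range(1, n + 1), r)
               if consecutive_pairs(s) == pairs)


def count_with_matches(n, r, k):
    """Number of r-subsets of {1..n} sharing exactly k elements with {1..r}."""
    mine = set(range(1, r + 1))
    return sum(1 for s in combinations(range(1, n + 1), r)
               if len(mine.intersection(s)) == k)


cases = [(n, r) for n in range(2, 10) for r in range(1, n + 1)]
part_a_ok = all(count_with_pairs(n, r, 0) == comb(n - r + 1, r) for n, r in cases)
part_b_ok = all(count_with_pairs(n, r, 1) == (r - 1) * comb(n - r + 1, r - 1)
                for n, r in cases)
part_e_ok = all(count_with_matches(n, r, k) == comb(r, k) * comb(n - r, r - k)
                for n, r in cases for k in range(r + 1))
```

Dividing any of these counts by `comb(n, r)` gives the corresponding probability.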

10. Poker. Hands at poker (see Example 3.4) are classified in the following categories: a one pair is a hand containing two cards of the same rank and three cards of disparate ranks in three other ranks; a two pair contains a pair of one rank, another pair of another rank, and a card of rank other than that of the two pairs; a three-of-a-kind contains three cards of one rank and two cards with ranks differing from each other and the rank of the cards in the triple; a straight contains five cards of consecutive ranks, not all of the same suit; a flush has all five cards of the same suit, not in sequence; a full house contains three cards of one rank and two cards of another rank; a four-of-a-kind contains four cards of one rank and one other card; and a straight flush contains five cards in sequence in the same suit. Determine their probabilities.

SOLUTION: The sample space consists of the $\binom{52}{5} = 2{,}598{,}960$ selections of five cards from a standard pack, the probability measure naturally uniform. The probabilities of the various hands are obtained by systematic enumeration, the form of the solutions given below suggestive of the line of thought.
\[
\begin{aligned}
\text{One pair:}\quad & \frac{13\cdot\binom{12}{3}\cdot 6\cdot 4^3}{\binom{52}{5}} = \frac{1760}{4165} = 0.422569;\\
\text{Two pair:}\quad & \frac{\binom{13}{2}\cdot 11\cdot 6\cdot 6\cdot 4}{\binom{52}{5}} = \frac{198}{4165} = 0.047539;\\
\text{Three-of-a-kind:}\quad & \frac{13\cdot\binom{12}{2}\cdot 4\cdot 4^2}{\binom{52}{5}} = \frac{88}{4165} = 0.021128;\\
\text{Straight:}\quad & \frac{9\cdot 4^5}{\binom{52}{5}} = \frac{192}{54145} = 0.003546;\\
\text{Full house:}\quad & \frac{13\cdot 12\cdot 4\cdot 6}{\binom{52}{5}} = \frac{6}{4165} = 0.001441;\\
\text{Four-of-a-kind:}\quad & \frac{13\cdot 12\cdot 4}{\binom{52}{5}} = \frac{1}{4165} = 0.000240;\\
\text{Straight flush:}\quad & \frac{9\cdot 4}{\binom{52}{5}} = \frac{3}{216580} = 0.000014;\\
\text{Royal flush:}\quad & \frac{4}{\binom{52}{5}} = \frac{1}{649740} = 1.53908\times 10^{-6}.
\end{aligned}
\]
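The reduced fractions can be reproduced with exact arithmetic. The sketch below transcribes the counting expressions exactly as they appear in the solution (including $9\cdot 4^5$ for straights); the dictionary names are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb

TOTAL = comb(52, 5)  # 2,598,960 five-card hands

# Hand counts written in the same form as the solution's numerators.
HAND_COUNTS = {
    "one pair":        13 * comb(12, 3) * comb(4, 2) * 4 ** 3,
    "two pair":        comb(13, 2) * 11 * comb(4, 2) * comb(4, 2) * 4,
    "three-of-a-kind": 13 * comb(12, 2) * comb(4, 3) * 4 ** 2,
    "straight":        9 * 4 ** 5,
    "full house":      13 * 12 * comb(4, 3) * comb(4, 2),
    "four-of-a-kind":  13 * 12 * 4,
    "straight flush":  9 * 4,
    "royal flush":     4,
}

# Fraction reduces each ratio to lowest terms automatically.
PROBABILITIES = {name: Fraction(count, TOTAL) for name, count in HAND_COUNTS.items()}
```

Calling `float(PROBABILITIES["one pair"])` and so on gives the decimal values for comparison with the table.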

11. Occupancy configurations, the Maxwell–Boltzmann distribution. An occupancy configuration of $n$ balls placed in $r$ urns is an arrangement $(k_1, \dots, k_r)$ where urn $i$ has $k_i$ balls in it. If the balls and urns are individually distinguishable, determine the number of distinguishable arrangements leading to a given occupancy configuration $(k_1, \dots, k_r)$ where $k_i \ge 0$ for each $i$ and $k_1 + \cdots + k_r = n$. If balls are distributed at random into the urns determine thence the probability that a particular occupancy configuration $(k_1, \dots, k_r)$ is discovered. These numbers are called the Maxwell–Boltzmann statistics in statistical physics. While it was natural to posit that physical particles were distributed in this fashion it came as a nasty jar to physicists to discover that no known particles actually behaved in accordance with common sense and followed this law.¹

SOLUTION: The number of ways in which $k_1$ balls can be selected to be placed in the first urn is $\binom{n}{k_1}$. Of the remaining balls, the number of ways $k_2$ balls can be selected to be placed in the second urn is $\binom{n-k_1}{k_2}$. Of the remaining, the number of ways we can specify $k_3$ balls to be placed in the third urn is $\binom{n-k_1-k_2}{k_3}$. And, proceeding in this fashion, the number of ways the occupancy configuration $(k_1, \dots, k_r)$ can be achieved is
\[
\begin{aligned}
\binom{n}{k_1}&\binom{n-k_1}{k_2}\binom{n-k_1-k_2}{k_3}\cdots\binom{n-k_1-\cdots-k_{r-1}}{k_r}\\
&= \frac{n!}{k_1!\,(n-k_1)!}\cdot\frac{(n-k_1)!}{k_2!\,(n-k_1-k_2)!}\cdot\frac{(n-k_1-k_2)!}{k_3!\,(n-k_1-k_2-k_3)!}\cdots\frac{(n-k_1-\cdots-k_{r-1})!}{k_r!\,(n-k_1-\cdots-k_r)!}.
\end{aligned}
\]
Factors cancel successively in the denominators and numerators of the terms of the expression on the right which is now seen to simplify to the form
\[
\frac{n!}{k_1!\,k_2!\cdots k_r!}.
\]
On the other hand, there are clearly $r^n$ distributions of $n$ balls in $r$ urns, the assumptions of the Maxwell–Boltzmann distributions giving each equal probability. Writing $P_{\mathrm{MB}}(k_1, \dots, k_r)$ for the probability that the occupancy configuration $(k_1, \dots, k_r)$ arises in the Maxwell–Boltzmann distribution, we see then that
\[
P_{\mathrm{MB}}(k_1, \dots, k_r) = \frac{n!}{k_1!\,k_2!\cdots k_r!}\,r^{-n} \qquad (k_1, \dots, k_r \ge 0;\ k_1 + \cdots + k_r = n).
\]

¹ S. Weinberg, The Quantum Theory of Fields. Cambridge: Cambridge University Press, 2000.
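For small $n$ and $r$ the multinomial formula can be checked against a direct enumeration of all $r^n$ placements of distinguishable balls. The function names below are hypothetical and the sketch is meant only as an illustration of the counting argument.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial, prod


def mb_probability(config):
    """n! / (k_1! ... k_r!) * r^(-n) for the occupancy configuration `config`."""
    n, r = sum(config), len(config)
    ways = factorial(n) // prod(factorial(k) for k in config)
    return Fraction(ways, r ** n)


def mb_by_enumeration(n, r):
    """Tabulate occupancy configurations over all r^n equally likely placements."""
    counts = Counter()
    for placement in product(range(r), repeat=n):   # placement[b] = urn of ball b
        config = tuple(placement.count(urn) for urn in range(r))
        counts[config] += 1
    return {config: Fraction(c, r ** n) for config, c in counts.items()}


table = mb_by_enumeration(4, 3)
formula_matches = all(mb_probability(config) == p for config, p in table.items())
```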

12. Continuation, the Bose–Einstein distribution. Suppose now that the balls are indistinguishable, the urns distinguishable. Let $A_{n,r}$ be the number of distinguishable arrangements of the $n$ balls into the $r$ urns. By lexicographic arrangement of the distinct occupancy configurations, show the validity of the recurrence $A_{n,r} = A_{n,r-1} + A_{n-1,r-1} + \cdots + A_{1,r-1} + A_{0,r-1}$ with boundary conditions $A_{n,1} = 1$ for $n \ge 1$ and $A_{1,r} = r$ for $r \ge 1$. Solve the recurrence for $A_{n,r}$. [The standard mode of analysis is by a combinatorial trick by arrangement of the urns sequentially with sticks representing urn walls and identical stones representing balls. The alternative method suggested here provides a principled recurrence as a starting point instead.] The expression $1/A_{n,r}$ represents the probability that a given occupancy configuration is discovered assuming all occupancy configurations are equally likely. These are the Bose–Einstein statistics of statistical physics and have been found to apply to bosons such as photons, certain atomic nuclei like those of the carbon-12 and helium-4 atoms, gluons which underlie the strong nuclear force, and W and Z bosons which mediate the weak nuclear force.

SOLUTION: Guessing the pattern (not easy). With $n = 5$, a systematic enumeration shows that $A_{5,1} = 1$, $A_{5,2} = 6$ and $A_{5,3} = 21$. A diligent examination of such cases may lead one to a shrewd suspicion of the answer; and once the answer is guessed it is easy to verify by induction. But, it must be admitted, seeing the pattern is not easy. Generating functions provide a principled path to solution.

A recursive approach, with a generatingfunctionological solution: If there is only one urn then all the balls have to go into it, whence $A_{n,1} = 1$ for $n \ge 1$. If, on the other hand, there is only one ball then it may go into any of the $r$ urns, whence $A_{1,r} = r$ for $r \ge 1$. This establishes the end-points of a recurrence. Now suppose urn 1 has $k_1 = k$ balls. Then the remaining $n-k$ balls must be distributed in the remaining $r-1$ urns. Summing over the possible values for $k$ gives the recurrence $A_{n,r} = A_{n,r-1} + A_{n-1,r-1} + \cdots + A_{1,r-1} + A_{0,r-1}$ for $n, r \ge 1$.

It will simplify summation limits to expand this recurrence to the entire integer lattice on the plane. To begin, it will be convenient to set boundary conditions
\[
A_{0,0} = 1, \qquad A_{n,r} = 0 \quad (n < 0 \text{ or } r < 0),
\]
which enables us to expand the basic recurrence to everywhere on the integer lattice excepting only the origin:
\[
A_{n,r} = \sum_{k \le n} A_{k,r-1} \qquad \bigl[(n,r) \ne (0,0)\bigr]. \tag{$\dagger$}
\]
(We should verify immediately that the recurrence yields $A_{1,r} = r$ for $r \ge 1$ and $A_{n,1} = 1$ for $n \ge 1$, as it must.) Now introduce the sequence of generating functions
\[
G_r(s) = \sum_n A_{n,r} s^n \quad\text{with } G_0(s) = 1. \tag{$\dagger'$}
\]
Suppose $r \ne 0$. Then the recurrence $(\dagger)$ is valid for all $n$. Multiplying both sides of $(\dagger)$ by $s^n$ and summing over all $n$ we then obtain
\[
G_r(s) = \sum_n A_{n,r} s^n = \sum_n \sum_{k \le n} A_{k,r-1} s^n \qquad (r \ne 0).
\]
The interchange in the order of summation that is indicated is natural, right, and effective, and leads to a summable geometric series. For $|s| < 1$, we now obtain
\[
G_r(s) = \sum_k A_{k,r-1} \sum_{n \ge k} s^n = \sum_k A_{k,r-1} \frac{s^k}{1-s} = \frac{G_{r-1}(s)}{1-s} \qquad (r \ne 0).
\]
The recurrence is trivial for $r < 0$. For $r > 0$ we may repeatedly run the process and, by induction, obtain
\[
G_r(s) = \frac{G_{r-1}(s)}{1-s} = \frac{G_{r-2}(s)}{(1-s)^2} = \cdots = \frac{G_0(s)}{(1-s)^r}.
\]
As $G_0(s) = 1$, we see hence that
\[
G_r(s) = (1-s)^{-r} = \sum_{n=0}^{\infty} \binom{-r}{n} (-s)^n = \sum_{n=0}^{\infty} \binom{n+r-1}{n} s^n \qquad (r > 0), \tag{$\dagger''$}
\]
the first step is by the binomial theorem (Problem 4), the second by the negative binomial identity (Problem 2). Comparing the sums in $(\dagger', \dagger'')$ termwise we may write down by inspection the solution
\[
A_{n,r} = \binom{n+r-1}{n},
\]
the terms being non-zero only for $n \ge 0$ and $r \ge 1$ by virtue of the conventions for the binomial coefficients.

Another approach, combinatorial this time: Imagine the urns arranged sequentially from left to right, a vertical bar or “stick” representing the divide between two successive urns. The arrangement of $n$ indistinguishable balls (or “stones”) into $r$ distinguishable urns may then be seen to be equivalent to an arrangement of sticks and stones as shown in Figure 1. Each distinguishable deployment is in 1-1 correspondence with an arrangement of $r-1$ sticks and $n$ stones and there are precisely $\binom{r-1+n}{n}$ such arrangements.

[Figure 1: Arrangement of $n = 7$ indistinguishable balls into $r = 8$ distinguishable urns. The thick vertical bars at either end represent immovable barriers; the thin vertical bars represent movable sticks corresponding to urn partitions.]

Writing $P_{\mathrm{BE}}(k_1, \dots, k_r)$ for the probability of the occupancy configuration under the Bose–Einstein statistics, we see that
\[
P_{\mathrm{BE}}(k_1, \dots, k_r) = \frac{1}{A_{n,r}} = \frac{1}{\binom{n+r-1}{n}} \qquad (k_1, \dots, k_r \ge 0;\ k_1 + \cdots + k_r = n).
\]
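The extended recurrence can be run mechanically and compared with the closed form. This is a minimal sketch that assumes the boundary conventions stated in the solution ($A_{0,0} = 1$ and $A_{n,r} = 0$ when $n < 0$ or $r < 0$); memoisation via `lru_cache` is a convenience of this illustration, not part of the argument.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def A(n, r):
    """Number of arrangements of n indistinguishable balls in r urns via the recurrence."""
    if n < 0 or r < 0:
        return 0
    if (n, r) == (0, 0):
        return 1
    # A_{n,r} = sum over k <= n of A_{k,r-1}; terms with k < 0 vanish.
    return sum(A(k, r - 1) for k in range(n + 1))


closed_form_ok = all(A(n, r) == comb(n + r - 1, n)
                     for n in range(8) for r in range(1, 8))
small_cases = (A(5, 1), A(5, 2), A(5, 3))
```

The Bose–Einstein probability of any single configuration is then `1 / A(n, r)`.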

13. Continuation, the Fermi–Dirac distribution. With balls indistinguishable and urns distinguishable, suppose that the only legal occupancy configurations $(k_1, \dots, k_r)$ are those where each $k_i$ is either 0 or 1 only and $k_1 + \cdots + k_r = n$, and suppose further that all legal occupancy configurations are equally likely. The conditions impose the constraint $n \le r$ and prohibit occupancy configurations where an urn is occupied by more than one ball. Now determine the probability of observing a legal occupancy configuration $(k_1, \dots, k_r)$. These are the Fermi–Dirac statistics and have been found to apply to fermions such as electrons, neutrons, and protons.

SOLUTION: Since an urn can contain either no balls or precisely one ball, each valid Fermi–Dirac occupancy configuration $(k_1, \dots, k_r)$ is completely specified by identification of the $n$ occupied urns. There are thus precisely $\binom{r}{n}$ legal occupancy configurations. Writing $P_{\mathrm{FD}}(k_1, \dots, k_r)$ for the probability of the occupancy configuration $(k_1, \dots, k_r)$ under the Fermi–Dirac statistics, we see that
\[
P_{\mathrm{FD}}(k_1, \dots, k_r) = \frac{1}{\binom{r}{n}} \qquad \bigl(k_1, \dots, k_r \in \{0, 1\};\ k_1 + \cdots + k_r = n;\ 0 \le n \le r\bigr).
\]

14. Chromosome breakage and repair. Each of $n$ sticks is broken into a long part and a short part, the parts jumbled up and recombined pairwise to form $n$ new sticks. Find the probability (a) that the parts will be joined in the original order, and (b) that all long parts are paired with short parts.²

SOLUTION: (a) The total number of possibilities for recombination is
\[
\frac{\binom{2n}{2}\binom{2n-2}{2}\cdots\binom{2}{2}}{n!} = (2n-1)(2n-3)\cdots(1).
\]
The original order is just one outcome of these possibilities and so the probability that the sticks are recombined in their original order is simply
\[
\frac{1}{(2n-1)(2n-3)\cdots(1)} = \frac{(2n)(2n-2)\cdots 4\cdot 2}{(2n)!} = \frac{2^n n!}{(2n)!}.
\]

(b) Lay out, say, the short parts in any order. The long parts may then be paired with the short parts in $n!$ ways and so the probability of recombination with a proper pairing of shorts and longs is given by
\[
\frac{n!}{(2n-1)(2n-3)\cdots(1)} = \frac{2^n (n!)^2}{(2n)!} = \frac{2^n}{\binom{2n}{n}}.
\]

15. Spread of rumours. In a small town of $n$ people a person passes a titbit of information to another person. A rumour is now launched with each recipient of the information passing it on to a randomly chosen individual. What is the probability that the rumour is told $r$ times without (a) returning to the originator, (b) being repeated to anyone. Generalisation: redo the calculations if each person tells the rumour to $m$ randomly selected people.

SOLUTION: (a) The first person can pass on the rumour to any of $n-1$ people. Starting with the first recipient, each person receiving the rumour can pass it on also in $n-1$ ways of which $n-2$ possibilities do not return the gossip to the originator. Thus, the probability that the rumour is told $r$ times without returning to the originator is
\[
\frac{(n-1)(n-2)^{r-1}}{(n-1)(n-1)^{r-1}} = \Bigl(1 - \frac{1}{n-1}\Bigr)^{r-1}.
\]
If each person tells the rumour to $m$ randomly selected people, by a similar argument the probability becomes
\[
\frac{\binom{n-1}{m}\binom{n-2}{m}^{r-1}}{\binom{n-1}{m}\binom{n-1}{m}^{r-1}}
= \biggl[\frac{(n-2)^{\underline{m}}}{(n-1)^{\underline{m}}}\biggr]^{r-1}
= \biggl[\prod_{j=0}^{m-1}\Bigl(1 - \frac{1}{n-1-j}\Bigr)\biggr]^{r-1}.
\]

² If sticks represent chromosomes broken by, say, X-ray irradiation, then a recombination of two long parts or two short parts causes cell death. See D. G. Catcheside, “The effect of X-ray dosage upon the frequency of induced structural changes in the chromosomes of Drosophila Melanogaster”, Journal of Genetics, vol. 36, pp. 307–320, 1938.



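(b) The chance that the rumour does not return to anyone is
\[
\frac{(n-1)^{\underline{r}}}{(n-1)^r} = \prod_{j=0}^{r-1}\Bigl(1 - \frac{j}{n-1}\Bigr).
\]
The generalisation when the rumour is told each time to $m$ randomly selected people is
\[
\frac{\binom{n-1}{m}\binom{n-1-(m+1)}{m}\binom{n-1-(2m+1)}{m}\cdots\binom{n-1-((r-1)m+1)}{m}}{\binom{n-1}{m}^r}
= \prod_{j=1}^{r-1}\prod_{k=0}^{m-1}\Bigl(1 - \frac{jm+1}{n-1-k}\Bigr),
\]
the first factor in the numerator cancelling against one factor in the denominator so that the product on the right begins at $j = 1$.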
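For the basic case $m = 1$, both answers can be verified by enumerating every chain of $r$ tellings in a small town. The sketch below labels the originator as person 0 and assumes, as in the solution, that each teller picks uniformly among the other $n-1$ people; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def rumour_probabilities(n, r):
    """Return (P(never returns to originator), P(never repeated to anyone)) for m = 1."""
    avoid_origin = no_repeat = total = 0
    for chain in product(range(n), repeat=r):    # chain[i] hears the rumour at telling i + 1
        path = (0,) + chain                      # person 0 is the originator
        if any(a == b for a, b in zip(path, path[1:])):
            continue                             # nobody tells the rumour to themselves
        total += 1                               # (n - 1)^r equally likely chains remain
        if 0 not in chain:
            avoid_origin += 1
        if len(set(path)) == r + 1:              # every telling reaches a new person
            no_repeat += 1
    return Fraction(avoid_origin, total), Fraction(no_repeat, total)


p_a, p_b = rumour_probabilities(5, 4)
formula_a = Fraction(3, 4) ** 3                  # (1 - 1/(n-1))^(r-1) with n = 5, r = 4
formula_b = Fraction(1) * Fraction(3, 4) * Fraction(2, 4) * Fraction(1, 4)
```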

16. Keeping up with the Joneses. The social disease of keeping up with the Joneses may be parodied in this invented game of catch-up. Two protagonists are each provided with an $n$-sided die the faces of which show the numbers $1, 2, \dots, n$. Both social climbers start at the foot of the social ladder. Begin the game by having the first player roll her die and move as many steps up the ladder as show on the die face. The second player, envious of the progress of her rival, takes her turn next, rolls her die, and moves up the ladder as many steps as show on her die face. The game now progresses apace. At each turn, whichever player is lower on the ladder rolls her die and moves up as many steps as indicated on the die face. The game terminates at the first instant when both players end up on the same ladder step. (At which point, the two rivals realise the futility of it all and as the social honours are now even they resolve to remain friends and eschew social competition.) Call each throw of a die a turn and let $N$ denote the number of turns before the game terminates. Determine the distribution of $N$. [Continued in Problem VII.7.]

SOLUTION: It is easy to get confused in this problem but if one keeps in mind the salient feature that, at each trial, the person who is lower on the social rung can match her competitor in exactly one out of the $n$ possible outcomes for her throw then the situation clarifies. By the rules of the game, the rungs of the ladder at which the competitors dwell after each throw are distinct till the final rung at which parity is obtained. Suppose $\tau$ throws were made in total. There will then be $\tau-1$ distinct ladder rungs selected, in increasing order, say, $1 \le L_1 < L_2 < \cdots < L_{\tau-1}$, the final throw landing the competitor moving last onto the rung $L_{\tau-1}$ occupied by her rival. Suppose the first competitor throws her die $\tau_1$ times before parity is obtained, the outcomes being $k^{(1)}_1, k^{(1)}_2, \dots, k^{(1)}_{\tau_1}$; likewise, suppose the second competitor throws her die a total of $\tau_2$ times with outcomes $k^{(2)}_1, k^{(2)}_2, \dots, k^{(2)}_{\tau_2}$. Clearly, $\tau_1 + \tau_2 = \tau$. With $S^{(1)}_1 = k^{(1)}_1$, the partial sums $S^{(1)}_{i+1} = S^{(1)}_i + k^{(1)}_{i+1}$ of the first competitor then select an increasing subsequence

$$\bigl(S^{(1)}_1 =\bigr)\, L_{i_1} < L_{i_2} < \cdots < L_{i_{\tau_1-1}} < L_{i_{\tau_1}} = L_{\tau-1}\, \bigl(= S^{(1)}_{\tau_1}\bigr)$$

of the ladder rung sequence terminating in the rung $L_{\tau-1}$. Likewise, the corresponding sequence of partial sums for the second competitor, $S^{(2)}_1 = k^{(2)}_1$, $S^{(2)}_{i+1} = S^{(2)}_i + k^{(2)}_{i+1}$, selects another increasing subsequence

$$\bigl(S^{(2)}_1 =\bigr)\, L'_{i_1} < L'_{i_2} < \cdots < L'_{i_{\tau_2-1}} < L'_{i_{\tau_2}} = L_{\tau-1}\, \bigl(= S^{(2)}_{\tau_2}\bigr)$$

of the ladder rung sequence which interleaves the ladder rung sequence of the first competitor and intersects it only in the final step.



We now work systematically, moving a competitor up the ladder rung sequence until she leapfrogs her opponent then switching to her opponent and moving her up until she, in turn, leapfrogs her opponent, then switching back to the first competitor, and so on, until finally parity is achieved. Up through the penultimate step, the situation is as follows: the competitor currently under consideration is at ladder rung $L$, her opponent at ladder rung $L' > L$ where $1 \le L' - L \le n-1$. The competitor under consideration now moves up to rung $L + k$ where $k$ is the outcome of her die throw. If $L + k < L'$ we repeat the process; if $L + k > L'$, we switch focus to her competitor. In either case, $L + k \ne L'$ and there are hence $n-1$ legal values that $k$ can assume. This situation persists from the second through the penultimate throws. The first throw which has $n$ possibilities sets the initial bar; and in the final throw which determines parity, the competitor playing catch-up throws exactly the value $L' - L$ to land on the same rung $L_{\tau-1}$ as her opponent. There are thus exactly $n\cdot(n-1)^{\tau-2}\cdot 1$ sequences of throws of the dice which result in parity achieved for the first time on the $\tau$th throw. As there are $n^{\tau}$ possibilities for $\tau$ throws, it follows that the distribution of the stopping time $N$ of the game is given by

$$P\{N = \tau\} = \frac{n\cdot(n-1)^{\tau-2}}{n^{\tau}} = \Bigl(1-\frac{1}{n}\Bigr)^{\tau-2}\cdot\frac{1}{n} \qquad (\tau \ge 2).$$

This is the geometric distribution with parameter $1/n$, shifted so that its support begins at $\tau = 2$.
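The counting argument can be checked by brute force for small dice. The sketch below plays the game on every possible sequence of $\tau$ throws and counts those on which parity first occurs at throw $\tau$; the count should be $n(n-1)^{\tau-2}$. The function names `stopping_turn` and `count_first_parity` are illustrative, and the rule that the initial tie at the foot of the ladder goes to the first player is read off from the problem statement.

```python
from itertools import product


def stopping_turn(throws):
    """Play the catch-up game on a fixed sequence of die throws.

    Returns the turn at which both players first share a rung,
    or None if that does not happen within the given throws.
    """
    pos = [0, 0]  # ladder positions of the first and second player
    for turn, k in enumerate(throws, start=1):
        # The lower player moves; the initial tie goes to the first player.
        mover = 0 if pos[0] <= pos[1] else 1
        pos[mover] += k
        if pos[0] == pos[1]:
            return turn
    return None


def count_first_parity(n, tau):
    """Count length-tau throw sequences on which parity first occurs at throw tau."""
    return sum(
        1
        for throws in product(range(1, n + 1), repeat=tau)
        if stopping_turn(throws) == tau
    )


for n in range(2, 5):
    for tau in range(2, 7):
        assert count_first_parity(n, tau) == n * (n - 1) ** (tau - 2)
```

Dividing the count by $n^{\tau}$ recovers the probability $P\{N = \tau\}$ displayed above.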

17. The hot hand. A particular basketball player historically makes one basket for every two shots she takes. During a game in which she takes very many shots there is a period during which she seems to hit every shot; at some point, say, she makes five shots in a row. This is clearly evidence that she is on a purple patch (has a “hot hand”) where, temporarily at least, her chances of making a shot are much higher than usual, and so the team tries to funnel the ball to her to milk her run of successes as much as possible. Is this good thinking? Propose a model probability space for this problem and specify the event of interest.

SOLUTION: The natural sample space for this model problem is an unending sequence of fair coin tosses, each such sequence representing one sample point or “outcome” of this gedanken experiment. In a succession of coin tosses, say that a first success run of length $r$ occurs at trial $n$ if a succession of $r$ successes occurs for the first time at trials $n-r+1$, $n-r+2$, ..., $n$. The occurrence of a first success run at epoch $n$ triggers a renewal and we consider a statistically identical process to restart with trial $n+1$ and, thereafter, each time a success run of length $r$ is first observed. For instance, the renewal epochs for success runs of length 3 are identified by a comma followed by a little added space for visual delineation in the following sequence of coin flips with 1 representing success and 0 representing failure: 0110111, 1000010111, 0111, 111, 101 ··· . With this convention, success runs of length $r$ determine a renewal process. A similar convention and terminology applies to failure runs, as well as to runs of either type. The sample space is continuous and may be identified with the unit interval as we have seen in Examples I.7.6 and I.7.7.

The event of interest is now identified with those sequences for which there exists (at least one) success run of length five in the first $n$ trials. The probability measure is uniform in this interval (Examples I.7.7 and I.7.8).
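The renewal convention is easy to misread, so a small sketch may help: it scans a 0/1 sequence, records each epoch at which a success run of length $r$ is completed, and restarts the count after each such epoch. The function names `renewal_epochs` and `has_success_run` are illustrative only; the example sequence is the one quoted in the solution.

```python
def renewal_epochs(flips, r):
    """Trial numbers (1-based) at which a success run of length r is completed.

    After each completed run the count restarts, so overlapping runs are not
    counted twice.
    """
    epochs, run = [], 0
    for n, x in enumerate(flips, start=1):
        run = run + 1 if x == 1 else 0
        if run == r:
            epochs.append(n)
            run = 0  # renewal: the process starts afresh at trial n + 1
    return epochs


def has_success_run(flips, r):
    """The event of interest: at least one success run of length r occurs."""
    return len(renewal_epochs(flips, r)) > 0


# The sequence from the solution, segment by segment.
segments = ["0110111", "1000010111", "0111", "111", "101"]
flips = [int(c) for c in "".join(segments)]
print(renewal_epochs(flips, 3))  # [7, 17, 21, 24]
```

The printed epochs are the cumulative lengths of the first four comma-delimited segments, which is exactly what the commas in the solution mark.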

18. Continuation, success run probability. What is the probability that, in the normal course of things, the player makes five (or more) shots in a row somewhere among a string of 50 consecutive shot attempts? If the chance is small then the occurrence of a run of successes somewhere in an observed sequence of attempts would suggest either that an unlikely event has transpired or that the player was temporarily in an altered state (or, “in the zone”) while the run was in progress. Naturally enough, we would then attribute the observed run to a temporary change in odds (a hot hand) and not to the occurrence of an unlikely event. If, on the other hand, it turns out to be not at all unlikely that there will be at least one moderately long success run somewhere in a string of attempts, then one cannot give the hot hand theory any credence. [This problem has attracted critical interest³ and illustrates a surprising and counter-intuitive aspect of the theory of fluctuations. The analysis is elementary but not at all easy; Problems XV.20–27 provide a principled scaffolding on which problems of this type can be considered.]

³ T. Gilovich, R. Vallone, and A. Tversky, “The hot hand in basketball: On the misperception of random sequences”, Cognitive Psychology, vol. 17, pp. 295–314, 1985. Their conclusion? That the hot hand theory is a widespread cognitive illusion affecting all beholders, players, coaches, and fans. Public reaction to the story was one of disbelief. When the celebrated cigar-puffing coach of the Boston Celtics, Red Auerbach, was told of Gilovich and his study, he grunted, “Who is this guy? So he makes a study. I couldn’t care less”. Auerbach’s quote is reported in D. Kahneman, Thinking, Fast and Slow. New York: Farrar, Straus, and Giroux, 2011, p. 117.

SOLUTION: For each $n$, write $u_n$ for the probability that there is a success run of length 5 at trial $n$. Likewise, let $f_n$ represent the probability that there is a first success run of length 5 at trial $n$. Finally, let $q_n$ denote the probability of no success run of length 5 through trial $n$. By additivity, $q_n = 1 - \sum_{k \le n} f_k$ and so it suffices to determine $f_k$ for each $k$. It turns out to be more convenient to first determine $\{u_n, n \ge 1\}$.

Now, there will be five consecutive successes at trials $n-4$, $n-3$, $n-2$, $n-1$, and $n$ if, and only if, for some $k$ with $0 \le k \le 4$, a success run of length five terminated at trial $n-k$ and there were $k$ consecutive successes that followed at trials $n-k+1$, ..., $n$. By additivity, it follows that $2^{-5} = u_n 2^{-0} + u_{n-1}2^{-1} + \cdots + u_{n-4}2^{-4}$, or, equivalently,

$$u_n = 2^{-5} - u_{n-1}2^{-1} - u_{n-2}2^{-2} - u_{n-3}2^{-3} - u_{n-4}2^{-4}.$$

The boundary conditions are clear: $u_1 = u_2 = u_3 = u_4 = 0$. We may now churn the recurrence through to evaluate $u_n$ systematically. Computing the first few values in the sequence, we have

$$\begin{aligned}
u_5 &= 2^{-5} = 0.03125,\\
u_6 &= 2^{-5} - u_5 2^{-1} = 2^{-6} = 0.015625,\\
u_7 &= 2^{-5} - u_6 2^{-1} - u_5 2^{-2} = 2^{-5} - 2^{-7} - 2^{-7} = 2^{-6} = 0.015625,\\
u_8 &= 2^{-5} - u_7 2^{-1} - u_6 2^{-2} - u_5 2^{-3} = 2^{-5} - 2^{-7} - 2^{-8} - 2^{-8} = 2^{-6} = 0.015625,\\
u_9 &= 2^{-5} - u_8 2^{-1} - u_7 2^{-2} - u_6 2^{-3} - u_5 2^{-4} = 2^{-5} - 2^{-7} - 2^{-8} - 2^{-9} - 2^{-9} = 2^{-6} = 0.015625,\\
u_{10} &= 2^{-5} - u_9 2^{-1} - u_8 2^{-2} - u_7 2^{-3} - u_6 2^{-4} = 2^{-6}\bigl(2 - (1 - 2^{-4})\bigr) = 0.016602.
\end{aligned}$$

A few more iterations show that the sequence quickly converges to the stationary solution satisfying $u = 2^{-5} - u2^{-1} - u2^{-2} - u2^{-3} - u2^{-4}$, or,

$$u = \frac{2^{-5}}{1 + 2^{-1} + 2^{-2} + 2^{-3} + 2^{-4}} = \frac{2^{-5}}{2 - 2^{-4}} = \frac{1}{62} = 0.016129.$$

Numerical evaluation shows that the sequence has essentially converged to its limiting value by $n = 20$ (with an error of less than one part in one million).

Now to determine the values $\{f_n, n \ge 1\}$. The boundary conditions are clear: $f_1 = f_2 = f_3 = f_4 = 0$. Now, if there is a success run of length five at trial $n$ then there must exist some $k \le n$ for which (i) there was a first success run of length five at $k$, and (ii) the next $n-k$ trials result in a success run of length five culminating at trial $n$. Summing over the possibilities, we see that, for $n \ge 1$,

$$u_n = f_1u_{n-1} + f_2u_{n-2} + \cdots + f_{n-1}u_1 + f_n \quad\text{or}\quad f_n = u_n - f_1u_{n-1} - f_2u_{n-2} - \cdots - f_{n-1}u_1.$$

Churning through the first few values in the recurrence yields

$$\begin{aligned}
f_5 &= u_5 = 2^{-5} = 0.03125,\\
f_6 &= u_6 - f_5u_1 = u_6 = 2^{-6} = 0.015625,\\
f_7 &= u_7 - f_6u_1 - f_5u_2 = u_7 = 2^{-6} = 0.015625,\\
f_8 &= u_8 - f_7u_1 - f_6u_2 - f_5u_3 = u_8 = 2^{-6} = 0.015625,\\
f_9 &= u_9 - f_8u_1 - f_7u_2 - f_6u_3 - f_5u_4 = u_9 = 2^{-6} = 0.015625,\\
f_{10} &= u_{10} - f_9u_1 - f_8u_2 - f_7u_3 - f_6u_4 - f_5u_5 = 2^{-6}\bigl(2 - (1 - 2^{-4})\bigr) - 2^{-10} = 2^{-6} = 0.015625,\\
f_{11} &= u_{11} - f_{10}u_1 - f_9u_2 - f_8u_3 - f_7u_4 - f_6u_5 - f_5u_6 = 31\cdot 2^{-11} = 0.0151367,
\end{aligned}$$

and numerical evaluation shows that

$$q_{50} = 1 - \sum_{k=1}^{50} f_k = \frac{126135883035101}{281474976710656} = 0.448125$$

rounded up to six decimal places. The chance that there is a success run of length five somewhere in a sequence of fifty tosses of a fair coin is in excess of 55%. It is not at all unlikely hence that the observed run is merely an aspect of chance fluctuations and no mysterious “hot hand” need be proposed to explain the phenomenon. Problems XV.20–27 sketch a more satisfying general, analytical solution.
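The two recurrences are mechanical enough to run in exact arithmetic. The following is a minimal sketch, assuming a fair coin by default; the names `run_probabilities` and `no_run_count` are illustrative and not from ToP. As an independent cross-check (an addition, not part of the solution above) it also counts directly the 0/1 strings of length $n$ that contain no run of $r$ ones, which must equal $2^n q_n$ for a fair coin.

```python
from fractions import Fraction


def run_probabilities(r, n_max, p=Fraction(1, 2)):
    """Return lists (u, f, q) indexed by trial number 0..n_max.

    u[n]: probability of a success run of length r at trial n (renewal sense),
    f[n]: probability of a first success run of length r at trial n,
    q[n]: probability of no success run of length r through trial n.
    """
    u = [Fraction(0)] * (n_max + 1)
    f = [Fraction(0)] * (n_max + 1)
    for n in range(r, n_max + 1):
        # p^r = u_n + u_{n-1} p + ... + u_{n-r+1} p^{r-1}
        u[n] = p**r - sum(u[n - k] * p**k for k in range(1, r))
    for n in range(r, n_max + 1):
        # u_n = f_n + f_{n-1} u_1 + ... + f_1 u_{n-1}
        f[n] = u[n] - sum(f[k] * u[n - k] for k in range(1, n))
    q, total = [Fraction(1)], Fraction(0)
    for n in range(1, n_max + 1):
        total += f[n]
        q.append(1 - total)
    return u, f, q


def no_run_count(r, n):
    """Number of 0/1 strings of length n with no run of r consecutive ones."""
    a = [2**k for k in range(r)]  # lengths 0..r-1: every string qualifies
    for _ in range(r, n + 1):
        a.append(sum(a[-r:]))  # condition on the position of the last zero
    return a[n]


u, f, q = run_probabilities(5, 50)
assert q[50] == Fraction(no_run_count(5, 50), 2**50)
print(float(1 - q[50]))  # chance of at least one run of five in fifty tosses
```

Working with `Fraction` avoids any accumulation of rounding error over the fifty steps, so the agreement between the renewal recurrence and the direct count is exact.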

The concluding Problems 19–34 are of a theoretical character.



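19. De Morgan’s laws. Show that $\bigl(\bigcup_\lambda A_\lambda\bigr)^c = \bigcap_\lambda A_\lambda^c$ and $\bigl(\bigcap_\lambda A_\lambda\bigr)^c = \bigcup_\lambda A_\lambda^c$ where $\lambda$ takes values in an arbitrary index set $\Lambda$, possibly infinite.

SOLUTION: The set implications for the first of de Morgan’s laws are given by

$$\omega \in \Bigl(\bigcup_\lambda A_\lambda\Bigr)^c \iff \omega \notin \bigcup_\lambda A_\lambda \iff \forall\lambda: \omega \notin A_\lambda \iff \forall\lambda: \omega \in A_\lambda^c \iff \omega \in \bigcap_\lambda A_\lambda^c$$

and so $\bigl(\bigcup_\lambda A_\lambda\bigr)^c = \bigcap_\lambda A_\lambda^c$. Likewise, the corresponding set implications for the second of de Morgan’s laws are given by

$$\omega \in \Bigl(\bigcap_\lambda A_\lambda\Bigr)^c \iff \omega \notin \bigcap_\lambda A_\lambda \iff \exists\lambda_0 \text{ s.t. } \omega \notin A_{\lambda_0} \iff \omega \in A_{\lambda_0}^c \iff \omega \in \bigcup_\lambda A_\lambda^c,$$

and so $\bigl(\bigcap_\lambda A_\lambda\bigr)^c = \bigcup_\lambda A_\lambda^c$.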

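20. Show that $\bigl(\bigcup_j A_j\bigr) \setminus \bigl(\bigcup_j B_j\bigr)$ and $\bigl(\bigcap_j A_j\bigr) \setminus \bigl(\bigcap_j B_j\bigr)$ are both subsets of $\bigcup_j (A_j \setminus B_j)$. When is there equality?

SOLUTION: By de Morgan’s laws and the distributivity of unions and intersections,

$$\begin{aligned}
\Bigl(\bigcup_j A_j\Bigr) \setminus \Bigl(\bigcup_j B_j\Bigr) &= \Bigl(\bigcup_j A_j\Bigr) \cap \Bigl(\bigcup_k B_k\Bigr)^c = \Bigl(\bigcup_j A_j\Bigr) \cap \bigcap_k B_k^c\\
&= \bigcup_j \Bigl(A_j \cap \bigcap_k B_k^c\Bigr) \overset{(a)}{\subseteq} \bigcup_j \bigl(A_j \cap B_j^c\bigr) = \bigcup_j (A_j \setminus B_j).
\end{aligned}$$

Equality holds in step (a) if, and only if, $A_j \setminus B_j$ is disjoint from $\bigcup_{k \ne j} B_k$ for every $j$.

By a similar argument,

$$\begin{aligned}
\Bigl(\bigcap_j A_j\Bigr) \setminus \Bigl(\bigcap_j B_j\Bigr) &= \bigcap_k A_k \cap \Bigl(\bigcap_j B_j\Bigr)^c = \bigcap_k A_k \cap \Bigl(\bigcup_j B_j^c\Bigr)\\
&= \bigcup_j \Bigl(B_j^c \cap \bigcap_k A_k\Bigr) \overset{(b)}{\subseteq} \bigcup_j \bigl(B_j^c \cap A_j\bigr) = \bigcup_j (A_j \setminus B_j).
\end{aligned}$$

Equality holds in step (b) if, and only if, $A_j \setminus B_j$ is disjoint from $\bigcup_{k \ne j} A_k^c$ for every $j$.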

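21. σ-algebras containing two sets. Suppose $A$ and $B$ are non-empty subsets of $\Omega$ and $A \cap B \ne \emptyset$. If $A \subseteq B$ determine the smallest σ-algebra containing both $A$ and $B$. Repeat the exercise if $A \not\subseteq B$.

SOLUTION:

If $A \subseteq B$: $\mathcal{F} = \bigl\{\emptyset, \Omega, A, B, A^c, B^c, A \cup B^c, A^c \cap B\bigr\}$.

If $A \not\subseteq B$: $\mathcal{F} = \bigl\{\emptyset, \Omega, A, B, A^c, B^c, A \cup B, A \cap B, (A \cup B)^c, (A \cap B)^c, A \cup B^c, A^c \cup B, A \cap B^c, A^c \cap B, A \triangle B, (A \triangle B)^c\bigr\}$.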



22. Indicator functions. Suppose $A$ is any subset of a universal set $\Omega$. The indicator for $A$, denoted $1_A$, is the function $1_A(\omega)$ which takes value 1 when $\omega$ is in $A$ and value 0 when $\omega$ is not in $A$. Indicators provide a very simple characterisation of the symmetric difference between sets. Recall that if $A$ and $B$ are subsets of some universal set $\Omega$, we define $A \triangle B := (A \cap B^c) \cup (B \cap A^c) = (A \setminus B) \cup (B \setminus A)$. This is equivalent to the statement that $1_{A \triangle B} = 1_A + 1_B \pmod 2$ where, on the right-hand side, the addition is modulo 2. Likewise, intersection has the simple representation, $1_{A \cap B} = 1_A \cdot 1_B \pmod 2$. Verify the following properties of symmetric differences (and cast a thought to how tedious the verification would be otherwise): (a) $A \triangle B = B \triangle A$ (the commutative property), (b) $(A \triangle B) \triangle C = A \triangle (B \triangle C)$ (the associative property), (c) $(A \triangle B) \triangle (B \triangle C) = A \triangle C$, (d) $(A \triangle B) \triangle (C \triangle D) = (A \triangle C) \triangle (B \triangle D)$, (e) $A \triangle B = C$ if, and only if, $A = B \triangle C$, (f) $A \triangle B = C \triangle D$ if, and only if, $A \triangle C = B \triangle D$. In view of their indicator characterisations, it now becomes natural to identify symmetric difference with “addition”, $A \oplus B := A \triangle B$, and intersection with “multiplication”, $A \otimes B := A \cap B$.

SOLUTION: (a, b) The commutative and associative properties fall into our lap from the indicator representation as addition is commutative and associative: $1_A + 1_B = 1_B + 1_A \pmod 2$ and $1_A + (1_B + 1_C) = (1_A + 1_B) + 1_C \pmod 2$. (c) As $1_B + 1_B = 0 \pmod 2$, it follows that

$$1_{(A \triangle B) \triangle (B \triangle C)} = 1_{A \triangle B} + 1_{B \triangle C} = 1_A + 1_B + 1_B + 1_C = 1_A + 1_C \pmod 2.$$

(d) Using the commutative and associative properties of addition,

$$1_{(A \triangle B) \triangle (C \triangle D)} = 1_{A \triangle B} + 1_{C \triangle D} = 1_A + 1_B + 1_C + 1_D = (1_A + 1_C) + (1_B + 1_D) = 1_{(A \triangle C) \triangle (B \triangle D)} \pmod 2.$$

(e) If $A \triangle B = C$ then $1_A + 1_B = 1_C \pmod 2$. By adding $1_B$ to both sides, we obtain $1_A = 1_A + 1_B + 1_B = 1_B + 1_C \pmod 2$, that is, $A = B \triangle C$; the converse follows in the same way. (f) If $1_{A \triangle B} = 1_{C \triangle D}$, proceed similarly by adding $1_B$ and $1_C$ to both sides.
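Properties (a) through (f) can be spot-checked numerically. The sketch below is only an illustration on randomly drawn subsets of a small universe, not a proof; it relies on the fact that Python's built-in set operator `^` is the symmetric difference and `&` the intersection. The helper names `indicator`, `add_mod2`, and `check_identities` are illustrative.

```python
import random


def indicator(subset, universe):
    """Indicator vector of subset over the ordered universe."""
    return [1 if w in subset else 0 for w in universe]


def add_mod2(x, y):
    """Coordinatewise addition modulo 2 of two indicator vectors."""
    return [(a + b) % 2 for a, b in zip(x, y)]


def check_identities(trials=500, size=4, seed=0):
    """Check the indicator characterisations and (a)-(f) on random subsets."""
    rng = random.Random(seed)
    universe = list(range(size))
    for _ in range(trials):
        A, B, C, D = (
            {w for w in universe if rng.random() < 0.5} for _ in range(4)
        )
        ia, ib = indicator(A, universe), indicator(B, universe)
        ok = (
            indicator(A ^ B, universe) == add_mod2(ia, ib)
            and indicator(A & B, universe) == [a * b for a, b in zip(ia, ib)]
            and A ^ B == B ^ A                              # (a)
            and (A ^ B) ^ C == A ^ (B ^ C)                  # (b)
            and (A ^ B) ^ (B ^ C) == A ^ C                  # (c)
            and (A ^ B) ^ (C ^ D) == (A ^ C) ^ (B ^ D)      # (d)
            and ((A ^ B == C) == (A == B ^ C))              # (e)
            and ((A ^ B == C ^ D) == (A ^ C == B ^ D))      # (f)
        )
        if not ok:
            return False
    return True


assert check_identities()
```

The universe is kept small on purpose so that the hypotheses of (e) and (f) are satisfied reasonably often among the random draws.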

23. Why is the family of events called an algebra? Suppose $\mathcal{F}$ is a non-empty family of subsets of a universal set $\Omega$ that is closed under complementation and finite unions. First argue that $\mathcal{F}$ is closed under intersections, set differences, and symmetric differences. Now, for $A$ and $B$ in $\mathcal{F}$, define $A \otimes B := A \cap B$ and $A \oplus B := A \triangle B$. Show that equipped with the operations of “addition” ($\oplus$) and “multiplication” ($\otimes$), the system $\mathcal{F}$ may be identified as an algebraic system in the usual sense of the word. That is, addition is commutative, associative, and has $\emptyset$ as the zero element, i.e., $A \oplus B = B \oplus A$, $A \oplus (B \oplus C) = (A \oplus B) \oplus C$, and $A \oplus \emptyset = A$, and multiplication is commutative, associative, distributes over addition, and has $\Omega$ as the unit element, i.e., $A \otimes B = B \otimes A$, $A \otimes (B \otimes C) = (A \otimes B) \otimes C$, $A \otimes (B \oplus C) = (A \otimes B) \oplus (A \otimes C)$, and $A \otimes \Omega = A$.



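SOLUTION: Pick $A, B \in \mathcal{F}$. We know that $A \cap B = \bigl(A^c \cup B^c\bigr)^c$. Since $\mathcal{F}$ is closed under complementation and finite unions, one can conclude that $A \cap B \in \mathcal{F}$. Again consider $A, B \in \mathcal{F}$. By assumption, $B^c \in \mathcal{F}$. By the previous result, $A \setminus B = A \cap B^c \in \mathcal{F}$.

Now consider the symmetric set difference

$$A \triangle B = (A \cup B) \setminus (A \cap B).$$

By assumption, $(A \cup B) \in \mathcal{F}$, and the first part of the problem implies that $(A \cap B) \in \mathcal{F}$ as well. Therefore, the second part implies that their difference is also in $\mathcal{F}$.

We now define

$$A \otimes B = A \cap B \quad\text{and}\quad A \oplus B = A \triangle B$$

and show that these operations define an algebra:

$$\begin{aligned}
A \oplus \emptyset &= A \triangle \emptyset = A,\\
A \otimes \Omega &= A \cap \Omega = A,\\
A \oplus B &= A \triangle B = (A \cup B) \setminus (A \cap B) = (B \cup A) \setminus (B \cap A) = B \triangle A = B \oplus A,\\
(A \otimes B) \oplus (A \otimes C) &= \bigl[(A \cap B) \cup (A \cap C)\bigr] \setminus (A \cap B \cap C)\\
&= [A \cap (B \cup C)] \setminus [A \cap (B \cap C)] = A \otimes (B \oplus C),\\
A \oplus (B \oplus C) &= A \triangle \bigl[(B \cup C) \cap \bigl(B^c \cup C^c\bigr)\bigr]\\
&= \bigl[A \cup \bigl((B \cup C) \cap (B^c \cup C^c)\bigr)\bigr] \cap \bigl[A^c \cup \bigl(B^c \cap C^c\bigr) \cup (B \cap C)\bigr]\\
&= (A \cup B \cup C) \cap \bigl(A \cup B^c \cup C^c\bigr) \cap \bigl(A^c \cup B^c \cup C\bigr) \cap \bigl(A^c \cup B \cup C^c\bigr)\\
&= \bigl[C \cup \bigl((A \cup B) \cap (A^c \cup B^c)\bigr)\bigr] \cap \bigl[C^c \cup \bigl((A \cup B^c) \cap (A^c \cup B)\bigr)\bigr]\\
&= [C \cup (A \triangle B)] \cap \bigl[C^c \cup (A \triangle B)^c\bigr]\\
&= (A \oplus B) \oplus C.
\end{aligned}$$

The remaining expressions are trivial to verify.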

24. Show that a σ-algebra $\mathcal{F}$ is closed under countable intersections.

SOLUTION: Suppose $A_j \in \mathcal{F}$ for $j \ge 1$. Then $A_j^c \in \mathcal{F}$ for each $j$ (closure of $\mathcal{F}$ under complementation), hence $\bigcup_j A_j^c \in \mathcal{F}$ (closure of $\mathcal{F}$ under countable unions), and finally $\bigcap_j A_j = \bigl(\bigcup_j A_j^c\bigr)^c \in \mathcal{F}$ (de Morgan’s laws and another appeal to closure of $\mathcal{F}$ under complementation).

25. If $\mathcal{A} = \{A_1, \dots, A_n\}$ is a finite family of sets what is the maximum number of elements that $\sigma(\mathcal{A})$ can have?

SOLUTION: For each $j$, write $A^{(1)}_j = A_j$ and $A^{(2)}_j = A_j^c$. Then the sets of the form $B(i_1, i_2, \dots, i_n) = A^{(i_1)}_1 \cap A^{(i_2)}_2 \cap \cdots \cap A^{(i_n)}_n$ as $i_1, i_2, \dots, i_n$ vary over $\{1, 2\}$ form a (disjoint) partition $\mathcal{B}$ of $\Omega$. The sets of the form $B(i_1, \dots, i_n)$ are the “atomic” sets engendered by the “parents” $A_1, A_2, \dots, A_n$. There are at most $2^n$ such atomic sets (clearly); there will be exactly $2^n$ atomic sets in the partition if each of the intersections is non-empty. Every element in the σ-algebra $\sigma(\mathcal{A})$ is a finite union of atomic sets of the form $B(i_1, \dots, i_n) \in \mathcal{B}$, each such union creating a distinct element of $\sigma(\mathcal{A})$. It follows that $\operatorname{card} \sigma(\mathcal{A}) \le 2^{2^{\operatorname{card} \mathcal{A}}}$ with equality whenever the atomic partition $\mathcal{B}$ of $\Omega$ has $2^{\operatorname{card} \mathcal{A}}$ elements.
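On a finite universe the atomic construction can be carried out literally. The following is a minimal sketch, with illustrative function names `atoms` and `generated_sigma_algebra` that are not from ToP: points are grouped by which of the parent sets contain them, and the generated σ-algebra is then the collection of all unions of the resulting non-empty atoms.

```python
from itertools import combinations


def atoms(omega, family):
    """Non-empty atomic sets: points grouped by their membership pattern."""
    cells = {}
    for w in omega:
        signature = tuple(w in A for A in family)
        cells.setdefault(signature, set()).add(w)
    return [frozenset(c) for c in cells.values()]


def generated_sigma_algebra(omega, family):
    """All unions of atoms, i.e. the sigma-algebra generated by the family."""
    cells = atoms(omega, family)
    sigma = set()
    for size in range(len(cells) + 1):
        for chosen in combinations(cells, size):
            sigma.add(frozenset().union(*chosen))
    return sigma


# Three parent sets in general position on eight points: 2^3 atoms.
omega = range(8)
family = [{w for w in omega if (w >> j) & 1} for j in range(3)]
print(len(generated_sigma_algebra(omega, family)))  # 256 = 2^(2^3)

# The two cases of Problem 21 on a four-point universe.
print(len(generated_sigma_algebra(range(4), [{0}, {0, 1}])))     # 8
print(len(generated_sigma_algebra(range(4), [{0, 1}, {1, 2}])))  # 16
```

The first example attains the bound $2^{2^n}$ with $n = 3$ because every one of the eight membership patterns is realised by some point; the nested pair $\{0\} \subseteq \{0, 1\}$ realises only three patterns and so yields $2^3 = 8$ sets.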

26. For any $a < b$ show that the open interval $(a, b)$ may be obtained via a countable number of operations on half-closed intervals by showing that $(a, b)$ may be represented as the countable union of the half-closed intervals $\bigl(a, b - (b-a)/n\bigr]$ as $n$ varies over all integers $\ge 1$. Likewise show how the closed interval $[a, b]$ and the reversed half-closed interval $[a, b)$ may be obtained from half-closed intervals by a countable number of operations. Show how to generate the singleton point $\{a\}$ by a countable number of operations on half-closed intervals.

SOLUTION: Generation of intervals of various types from half-closed intervals.

Open intervals: $(a, b) = \bigcup_n (a, b - 1/n]$;

Closed intervals: $[a, b] = \bigcap_n (a - 1/n, b]$;

Left-closed, right-open intervals: $[a, b) = \bigcap_n (a - 1/n, b) = \bigcap_n \bigcup_m (a - 1/n, b - 1/m]$;

Singleton: $\{a\} = \bigcap_n (a - 1/n, a]$.

27. Continuation. Show that intervals of any type are obtainable by countable operations on intervals of a given type.

SOLUTION: The following steps illustrate the sequence $(a, b] \to (a, b) \to [a, b) \to [a, b] \to (a, b]$:

$(a, b) = \bigcup_n (a, b - 1/n]$;

$[a, b) = \bigcap_n (a - 1/n, b)$;

$[a, b] = \bigcap_n [a, b + 1/n)$;

$(a, b] = \bigcup_n [a + 1/n, b]$.

28. Increasing sequences of sets. Suppose $\{A_n, n \ge 1\}$ is an increasing sequence of events, $A_n \subseteq A_{n+1}$ for each $n$. Let $A = \bigcup_{n \ge 1} A_n$. Show that $P(A_n) \to P(A)$ as $n \to \infty$.

SOLUTION: With the convention $A_0 = \emptyset$, define the sequence of sets

$$C_n = A_n \setminus A_{n-1} \qquad (n \ge 1).$$

(Observe that $C_1 = A_1$.) The sequence of sets $\{C_n, n \ge 1\}$ is pairwise disjoint and, by construction, it is clear that $A_n = \bigcup_{k=1}^{n} C_k$ and $A = \bigcup_{n \ge 1} A_n = \bigcup_{k \ge 1} C_k$. We may hence express $A$ as a countable union of disjoint sets. Countable additivity now yields

$$P(A) = P\Bigl(\bigcup_{k \ge 1} C_k\Bigr) = \sum_{k=1}^{\infty} P(C_k) = \lim_{n \to \infty} \sum_{k=1}^{n} P(C_k) = \lim_{n \to \infty} P\Bigl(\bigcup_{k=1}^{n} C_k\Bigr) = \lim_{n \to \infty} P(A_n).$$

In words, probability measure is continuous from below.

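29. Decreasing sequences of sets. Suppose $\{B_n, n \ge 1\}$ is a decreasing sequence of events, $B_n \supseteq B_{n+1}$ for each $n$. Let $B = \bigcap_{n \ge 1} B_n$. Show that $P(B_n) \to P(B)$ as $n \to \infty$.

SOLUTION: If $\{B_n, n \ge 1\}$ is a decreasing sequence of sets then $B_n^c \subseteq B_{n+1}^c$ and, by de Morgan’s laws, $\bigcup_{n \ge 1} B_n^c = B^c$. We may hence apply the result of Problem 28 to obtain $P(B_n^c) \to P(B^c)$, whence $P(B_n) = 1 - P(B_n^c) \to 1 - P(B^c) = P(B)$. This proves continuity from above.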

30. Subadditivity. Suppose that $P$ is a set function on the σ-algebra $\mathcal{F}$ satisfying Axioms 1, 2, and 3 of probability measure. If it is true that for all sequences $B_1, B_2, \dots, B_n, \dots$ of sets in $\mathcal{F}$ satisfying $A \subseteq \bigcup_n B_n$ we have $P(A) \le \sum_n P(B_n)$ then show that $P$ satisfies Axiom 4. It is frequently easier to verify the continuity axiom by this property.

SOLUTION: Suppose $A_n \downarrow \emptyset$ is a sequence of events decreasing monotonically to the empty set. As the monotone property of measure depends only on the additivity and positivity axioms, we see that $P(A_n) \ge P(A_{n+1})$ for each $n$, whence $\{P(A_n), n \ge 1\}$ is a bounded, decreasing sequence of positive numbers, hence has a (positive) limit $P(A_n) \downarrow p \ge 0$. We now proceed to show that in fact $p = 0$.

Fix any $\epsilon > 0$. By definition of limit, we may now select $n = n(\epsilon)$ sufficiently large so that $p \le P(A_{n+m}) \le P(A_n) < p + \epsilon$ for all $m \ge 1$. As $A_n \supseteq A_{n+m}$ for all $m \ge 1$, we see that $A_n = (A_n \setminus A_{n+m}) \cup A_{n+m}$ is the union of disjoint sets and so, by the positivity and additivity axioms,

$$P(A_n \setminus A_{n+m}) = P(A_n) - P(A_{n+m}) < (p + \epsilon) - p = \epsilon \qquad (m \ge 1).$$

On the other hand,

$$A_n \setminus A_{n+m} = (A_n \setminus A_{n+1}) \cup (A_{n+1} \setminus A_{n+2}) \cup \cdots \cup (A_{n+m-1} \setminus A_{n+m})$$

is a union of disjoint sets and so, by additivity again,

$$P(A_n \setminus A_{n+m}) = \sum_{j=n}^{n+m-1} P(A_j \setminus A_{j+1}).$$

It follows hence that

$$\sum_{j=n}^{n+m-1} P(A_j \setminus A_{j+1}) < \epsilon$$

for every $m \ge 1$ and, a fortiori, by passing to the limit,

$$\lim_{m \to \infty} \sum_{j=n}^{n+m-1} P(A_j \setminus A_{j+1}) = \sum_{j=n}^{\infty} P(A_j \setminus A_{j+1}) \le \epsilon.$$

On the other hand, $A_n = \bigcup_{j=n}^{\infty} (A_j \setminus A_{j+1})$, and by the assumed subadditivity of the set function $P$, we have

$$P(A_n) \le \sum_{j=n}^{\infty} P(A_j \setminus A_{j+1}).$$

We conclude that $0 \le P(A_n) \le \epsilon$, eventually, for a sufficiently large choice of $n = n(\epsilon)$. As $\epsilon > 0$ may be chosen arbitrarily small, we are led inexorably to the conclusion that $P(A_n) \to 0$ as $n \to \infty$.

31. A semiring of sets. Suppose $\Omega$ is the set of all rational points in the unit interval $[0, 1]$ and let $\mathcal{A}$ be the set of all intersections of the set $\Omega$ with arbitrary open, closed, and half-closed subintervals of $[0, 1]$. Show that $\mathcal{A}$ has the following properties: (a) $\emptyset \in \mathcal{A}$; (b) $\mathcal{A}$ is closed under intersections; and (c) if $A_1$ and $A$ are elements of $\mathcal{A}$ with $A_1 \subseteq A$ then $A$ may be represented as a finite union $A = A_1 \cup A_2 \cup \cdots \cup A_n$ of pairwise disjoint sets in $\mathcal{A}$, the given set $A_1$ being the first term in the expansion. A family of sets with the properties (a), (b), and (c) is called a semiring.

32. Continuation. For each selection of $0 \le a \le b \le 1$, let $A_{a,b}$ be the set of points obtained by intersecting $\Omega$ with any of the intervals $(a, b)$, $[a, b]$, $[a, b)$, or $(a, b]$. Define a set function $Q$ on the sets of $\mathcal{A}$ by the formula $Q(A_{a,b}) = b - a$. Show that $Q$ is additive but not countably additive, hence not continuous. [Hint: Although $Q(\Omega) = 1$, $\Omega$ is a countable union of single-element sets, each of which has $Q$-measure zero.]

33. Continuation. Suppose $A \in \mathcal{A}$ and $A_1, A_2, \dots, A_n, \dots$ is a sequence of pairwise disjoint subsets of $A$, all belonging to $\mathcal{A}$. Show that $\sum_n Q(A_n) \le Q(A)$.

34. Continuation. Show that there exists a sequence $B_1, B_2, \dots, B_n, \dots$ of sets in $\mathcal{A}$ satisfying $A \subseteq \bigcup_n B_n$ but $Q(A) > \sum_n Q(B_n)$. This should be contrasted with Problem 30.


II

Conditional Probability

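1. Dice. If three dice are thrown what is the probability that one shows a 6 given that no two show the same face? Repeat for $n$ dice where $2 \le n \le 6$.

SOLUTION: A CONDITIONAL APPROACH TO SOLUTION. If $n$ dice are thrown, let $A_j = A^{(n)}_j$ denote the event that die $j$ shows a 6 and let $A = A^{(n)} = A^{(n)}_1 \cup A^{(n)}_2 \cup \cdots \cup A^{(n)}_n$ denote the event that some die shows a 6. Let $B = B^{(n)}$ denote the event that the dice all show different face values. We are interested in the conditional probability

$$P\bigl(A^{(n)} \mid B^{(n)}\bigr) = P\bigl(A^{(n)}_1 \cup \cdots \cup A^{(n)}_n \mid B^{(n)}\bigr) = \frac{P\bigl((A^{(n)}_1 \cap B^{(n)}) \cup \cdots \cup (A^{(n)}_n \cap B^{(n)})\bigr)}{P\bigl(B^{(n)}\bigr)}.$$

The events $\bigl\{A^{(n)}_j \cap B^{(n)}, 1 \le j \le n\bigr\}$ are mutually exclusive as at most one die can show a 6 if all face values are different. Accordingly, by additivity,

$$P\bigl(A^{(n)} \mid B^{(n)}\bigr) = \frac{P\bigl(A^{(n)}_1 \cap B^{(n)}\bigr)}{P\bigl(B^{(n)}\bigr)} + \cdots + \frac{P\bigl(A^{(n)}_n \cap B^{(n)}\bigr)}{P\bigl(B^{(n)}\bigr)} = \frac{n\,P\bigl(A^{(n)}_1 \cap B^{(n)}\bigr)}{P\bigl(B^{(n)}\bigr)}$$

as, by the symmetry inherent in the situation, each of the terms on the right contributes the same amount. A consideration of small values of $n$ lays bare the general pattern.

THE CASE $n = 1$ is trivial as it is obvious that $P\bigl(A^{(1)}\bigr) = 1/6$, $P\bigl(B^{(1)}\bigr) = 1$, and $P\bigl(A^{(1)} \cap B^{(1)}\bigr) = P\bigl(A^{(1)}\bigr) = 1/6$, whence

$$P\bigl(A^{(1)} \mid B^{(1)}\bigr) = \frac{P\bigl(A^{(1)} \cap B^{(1)}\bigr)}{P\bigl(B^{(1)}\bigr)} = \frac{1}{6}.$$

WHEN $n = 2$ we see that $P\bigl(B^{(2)}\bigr) = (6 \times 5)/(6 \times 6)$ and $P\bigl(A^{(2)}_1 \cap B^{(2)}\bigr) = (1 \times 5)/(6 \times 6)$, whence

$$P\bigl(A^{(2)}_1 \mid B^{(2)}\bigr) = \frac{P\bigl(A^{(2)}_1 \cap B^{(2)}\bigr)}{P\bigl(B^{(2)}\bigr)} = \frac{1 \times 5}{6 \times 6} \bigg/ \frac{6 \times 5}{6 \times 6} = \frac{1}{6}.$$

It follows that $P\bigl(A^{(2)} \mid B^{(2)}\bigr) = 2/6$.

WHEN $n = 3$ we see that $P\bigl(B^{(3)}\bigr) = (6 \times 5 \times 4)/(6 \times 6 \times 6)$ and $P\bigl(A^{(3)}_1 \cap B^{(3)}\bigr) = (1 \times 5 \times 4)/(6 \times 6 \times 6)$, whence

$$P\bigl(A^{(3)}_1 \mid B^{(3)}\bigr) = \frac{P\bigl(A^{(3)}_1 \cap B^{(3)}\bigr)}{P\bigl(B^{(3)}\bigr)} = \frac{1 \times 5 \times 4}{6 \times 6 \times 6} \bigg/ \frac{6 \times 5 \times 4}{6 \times 6 \times 6} = \frac{1}{6}.$$

Thus, $P\bigl(A^{(3)} \mid B^{(3)}\bigr) = 3/6$.

THE CASE OF GENERAL $1 \le n \le 6$: the pattern is now clear. We have $P\bigl(B^{(n)}\bigr) = 6^{\underline{n}}/6^n$ and $P\bigl(A^{(n)}_1 \cap B^{(n)}\bigr) = 5^{\underline{n-1}}/6^n$ where the falling factorial notation $x^{\underline{k}} = x(x-1)\cdots(x-k+1)$ helps consolidate expressions. It follows that

$$P\bigl(A^{(n)}_1 \mid B^{(n)}\bigr) = \frac{P\bigl(A^{(n)}_1 \cap B^{(n)}\bigr)}{P\bigl(B^{(n)}\bigr)} = \frac{5^{\underline{n-1}}}{6^n} \bigg/ \frac{6^{\underline{n}}}{6^n} = \frac{1}{6},$$

and so

$$P\bigl(A^{(n)} \mid B^{(n)}\bigr) = \frac{n}{6} \qquad (1 \le n \le 6).$$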

A DIRECT COMBINATORIAL APPROACH. The simple form of the solution suggests that there may be a direct combinatorial pathway to the answer. And indeed there is. In a random permutation of (1, ..., 6), the face value 6 is equally likely to occur at any of the six locations. Identifying the first $n$ locations as the face values of the $n$ dice (with $1 \le n \le 6$), the probability that one die shows 6 given that all die faces are different is $n/6$.
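The answer $n/6$ is small enough to confirm by listing every outcome. The sketch below enumerates all $6^n$ throws, keeps those with distinct faces, and computes the conditional probability as an exact fraction; the function name `prob_six_given_distinct` is illustrative.

```python
from fractions import Fraction
from itertools import product


def prob_six_given_distinct(n):
    """P(some die shows 6 | all n dice show different faces), by enumeration."""
    distinct = [
        throw for throw in product(range(1, 7), repeat=n)
        if len(set(throw)) == n
    ]
    with_six = [throw for throw in distinct if 6 in throw]
    return Fraction(len(with_six), len(distinct))


for n in range(1, 7):
    assert prob_six_given_distinct(n) == Fraction(n, 6)
```

At $n = 6$ the conditioning event forces every face to appear, so the conditional probability is 1, in agreement with $6/6$.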

2. Keys. An individual somewhat the worse for drink has a collection of $n$ keys of which one fits his door. He tries the keys at random discarding keys as they fail. What is the probability that he succeeds on the $r$th trial?

SOLUTION: The order in which the keys are tried is a random permutation of the $n$ keys, and there are $n!$ such permutations in all. With the right key fixed at the $r$th place, the remaining keys may be arranged in $(n-1)!$ ways. The probability of the event that he succeeds on the $r$th trial is hence $(n-1)!/n! = 1/n$ for each $1 \le r \le n$.
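A short enumeration makes the uniformity over trials concrete. This is a minimal sketch under the assumption, taken from the solution, that the order of trial is a uniformly random permutation of the keys; key 0 stands for the key that fits and the function name `success_distribution` is illustrative.

```python
from fractions import Fraction
from itertools import permutations


def success_distribution(n):
    """Probabilities that the fitting key (key 0) is tried on trial 1, ..., n."""
    counts = [0] * n
    total = 0
    for order in permutations(range(n)):
        counts[order.index(0)] += 1  # position of the fitting key in this order
        total += 1
    return [Fraction(c, total) for c in counts]


print(success_distribution(5))  # five entries, each equal to 1/5
```

Each position receives exactly $(n-1)!$ of the $n!$ orderings, which is the count used in the solution.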

3. Let $\pi_1, \dots, \pi_n$ be a random permutation of the numbers 1, ..., $n$. If you are told that $\pi_k > \pi_1$, ..., $\pi_k > \pi_{k-1}$, what is the probability that $\pi_k = n$?

SOLUTION: Write $A$ for the event $\{\pi_k = n\}$ and $B$ for the event $\{\pi_k > \pi_1, \dots, \pi_k > \pi_{k-1}\}$. As all permutations are equally likely, the digit $n$ is equally likely to occur at any of the $n$ locations and hence $P(A) = 1/n$. To compute the probability of $B$, we observe that the first $k$ digits in the permutation take values in a sub-collection $J = \{j_1, \dots, j_k\}$ (in some order). Given that the first $k$ digits take values in this set, the largest of these digits is equally likely to occur at any of the first $k$ locations. This conditional probability is invariant with respect to the specification of the sub-collection $J$ and so $P(B) = 1/k$ (by a trivial exercise of total probability by summing over all possibilities for the sub-collection $J$). As it is clear that the occurrence of $A$ implies the occurrence of $B$, we see that $P(A \cap B) = P(A) = 1/n$. It follows that

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} = \frac{1/n}{1/k} = \frac{k}{n}.$$
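Both $P(B) = 1/k$ and the conditional probability $k/n$ can be confirmed by enumerating all permutations for a small $n$. The sketch below does so in exact arithmetic; the function name `record_probabilities` is illustrative.

```python
from fractions import Fraction
from itertools import permutations


def record_probabilities(n, k):
    """Return (P(B), P(A | B)) for a random permutation of 1..n.

    B: the k-th entry exceeds all earlier entries; A: the k-th entry equals n.
    """
    total = in_b = in_a_and_b = 0
    for perm in permutations(range(1, n + 1)):
        total += 1
        if perm[k - 1] == max(perm[:k]):
            in_b += 1
            if perm[k - 1] == n:
                in_a_and_b += 1
    return Fraction(in_b, total), Fraction(in_a_and_b, in_b)


for k in range(1, 7):
    p_b, p_a_given_b = record_probabilities(6, k)
    assert p_b == Fraction(1, k)
    assert p_a_given_b == Fraction(k, 6)
```

For $k = n$ the conditioning event already forces $\pi_n = n$, and the formula duly returns 1.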
