finite mathematics - dartmouth college

Finite Mathematics

TSILB1

Version 4.0A0, 5 October 1998

1 This Space Intentionally Left Blank. Contributors include: John G. Kemeny, J. Laurie Snell, and Gerald L. Thompson. Additional work by: Peter Doyle. Copyright (C) 1998 Peter G. Doyle. Derived from works Copyright (C) 1957, 1966, 1974 John G. Kemeny, J. Laurie Snell, Gerald L. Thompson. This work is freely redistributable under the terms of the GNU Free Documentation License.

Chapter 2

Sets and subsets

2.1 Introduction

A well-defined collection of objects is known as a set. This concept, in its complete generality, is of great importance in mathematics since all of mathematics can be developed by starting from it.

The various pieces of furniture in a given room form a set. So do the books in a given library, or the integers between 1 and 1,000,000, or all the ideas that mankind has had, or the human beings alive between one billion B.C. and ten billion A.D. These are all examples of finite sets, that is, sets having a finite number of elements. All the sets discussed in this book will be finite sets.

There are two essentially different ways of specifying a set. One can give a rule by which it can be determined whether or not a given object is a member of the set, or one can give a complete list of the elements in the set. We shall say that the former is a description of the set and the latter is a listing of the set. For example, we can define a set of four people as (a) the members of the string quartet which played in town last night, or (b) four particular persons whose names are Jones, Smith, Brown, and Green. It is customary to use braces to surround the listing of a set; thus the set above should be listed {Jones, Smith, Brown, Green}.

We shall frequently be interested in sets of logical possibilities, since the analysis of such sets is very often a major task in the solving of a problem. Suppose, for example, that we were interested in the successes of three candidates who enter the presidential primaries (we assume there are no other entries). Suppose that the key primaries will be held in New Hampshire, Minnesota, Wisconsin, and California. Assume that candidate A enters all the primaries, that B does not contest New Hampshire’s primary, and that C does not contest Wisconsin’s. A list of the logical possibilities is given in Figure 2.1. Since the New Hampshire and Wisconsin primaries can each end in two ways, and the Minnesota and California primaries can each end in three ways, there are in all 2 · 2 · 3 · 3 = 36 different logical possibilities as listed in Figure 2.1.

A set that consists of some members of another set is called a subset of that set. For example, the set of those logical possibilities in Figure 2.1 for which the statement “Candidate A wins at least three primaries” is true is a subset of the set of all logical possibilities. This subset can also be defined by listing its members: {P1, P2, P3, P4, P7, P13, P19}.
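The two ways of specifying a subset can be compared in a short sketch. Figure 2.1 is not reproduced here, so the ordering below is an assumption: the primaries are taken in the order New Hampshire, Minnesota, Wisconsin, California, with the last primary varying fastest. This ordering is consistent with the listings quoted in the text. The names `CONTESTANTS`, `possibilities`, and `listing` are illustrative only.

```python
from itertools import product

# Candidates contesting each primary, in the order the primaries are named.
# Assumption: this enumeration order reproduces the labels P1..P36 of Figure 2.1.
CONTESTANTS = {
    "New Hampshire": "AC",   # B does not contest
    "Minnesota": "ABC",
    "Wisconsin": "AB",       # C does not contest
    "California": "ABC",
}

# possibilities[k - 1] is the tuple of winners for logical possibility Pk.
possibilities = list(product(*CONTESTANTS.values()))

def listing(rule):
    """Turn a description (a rule on outcomes) into a listing of labels."""
    return [f"P{k}" for k, outcome in enumerate(possibilities, start=1) if rule(outcome)]

# Description: "Candidate A wins at least three primaries".
a_wins_three = listing(lambda outcome: outcome.count("A") >= 3)

print(len(possibilities))
print(a_wins_three)
```

Under the stated ordering, the rule and the listing given in the text pick out the same subset.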

In order to discuss all the subsets of a given set, let us introduce the following terminology. We shall call the original set the universal set, one-element subsets will be called unit sets, and the set which contains no members the empty set. We do not introduce special names for other kinds of subsets of the universal set. As an example, let the universal set U consist of the three elements {a, b, c}. The proper subsets of U are those sets containing some but not all of the elements of U. The proper subsets consist of three two-element sets, namely, {a, b}, {a, c}, and {b, c}, and three unit sets, namely, {a}, {b}, and {c}. To complete the picture, we also consider the universal set a subset (but not a proper subset) of itself, and we consider the empty set ∅, which contains no elements of U, as a subset of U. At first it may seem strange that we should include the sets U and ∅ as subsets of U, but the reasons for their inclusion will become clear later.

We saw that the three-element set above had 8 = 2^3 subsets. In general, a set with n elements has 2^n subsets, as can be seen in the following manner. We form subsets P of U by considering each of the elements of U in turn and deciding whether or not to include it in the subset P. If we decide to put every element of U into P, we get the universal set, and if we decide to put no element of U into P, we get the empty set. In most cases we will put some but not all the elements into P and thus obtain a proper subset of U. We have to make n decisions, one for each element of the set, and for each decision we have to choose between two alternatives. We can make these decisions in 2 · 2 · ... · 2 = 2^n ways, and hence this is the number of different subsets of U that can be formed. Observe that our formula would not have been so simple if we had not included the universal set and the empty set as subsets of U.
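The counting argument can be checked on the three-element example. The following is a minimal sketch; the helper name `subsets` is ours, and "proper" is used in the sense of the text (some but not all of the elements of U).

```python
from itertools import combinations

def subsets(universe):
    """List every subset of `universe`, from the empty set up to the universal set."""
    elements = sorted(universe)
    return [set(c)
            for size in range(len(elements) + 1)
            for c in combinations(elements, size)]

U = {"a", "b", "c"}
all_subsets = subsets(U)

# Proper subsets in the sense of the text: some but not all elements of U.
proper = [s for s in all_subsets if s and s != U]

print(len(all_subsets), 2 ** len(U))
print(len(proper))
```

The same formula gives 2 ** 36 = 68,719,476,736 subsets for a universal set of 36 elements.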


Figure 2.1: ♦


In the example of the voting primaries above there are 2^36 or about 70 billion subsets. Of course, we cannot deal with this many subsets in a practical problem, but fortunately we are usually interested in only a few of the subsets. The most interesting subsets are those which can be defined by means of a simple rule such as “the set of all logical possibilities in which C loses at least two primaries”. It would be difficult to give a simple description for the subset containing the elements {P1, P4, P14, P30, P34}. On the other hand, we shall see in the next section how to define new subsets in terms of subsets already defined.

Example 2.1 We illustrate the two different ways of specifying sets in terms of the primary voting example. Let the universal set U be the logical possibilities given in Figure 2.1.

1. What is the subset of U in which candidate B wins more primaries than either of the other candidates?

[Ans. {P11,P12,P17,P23,P26,P28,P29}.]

2. What is the subset in which the primaries are split two and two?

[Ans. {P5,P8,P10,P15,P21,P30,P31,P35}.]

3. Describe the set {P1,P4,P19,P22}.

[Ans. The set of possibilities for which A wins in Minnesota and California.]

4. How can we describe the set {P18, P24, P27}?

[Ans. The set of possibilities for which C wins in California, and the other primaries are split three ways.]

♦

Exercises

1. In the primary example, give a listing for each of the following sets.

(a) The set in which C wins at least two primaries.


(b) The set in which the first three primaries are won by the same candidate.

(c) The set in which B wins all four primaries.

2. The primaries are considered decisive if a candidate can win three primaries, or if he or she wins two primaries including California. List the set in which the primaries are decisive.

3. Give simple descriptions for the following sets (referring to the primary example).

(a) {P33, P36}.

(b) {P10, P11, P12, P28, P29, P30}.

(c) {P6, P20, P22}.

4. Joe, Jim, Pete, Mary, and Peg are to be photographed. They want to line up so that boys and girls alternate. List the set of all possibilities.

5. In Exercise 4, list the following subsets.

(a) The set in which Pete and Mary are next to each other.

(b) The set in which Peg is between Joe and Jim.

(c) The set in which Jim is in the middle.

(d) The set in which Mary is in the middle.

(e) The set in which a boy is at each end.

6. Pick out all pairs in Exercise 5 in which one set is a subset of the other.

7. A TV producer is planning a half-hour show. He or she wants to have a combination of comedy, music, and commercials. If each is allotted a multiple of five minutes, construct the set of possible distributions of time. (Consider only the total time allotted to each.)

8. In Exercise 7, list the following subsets.

(a) The set in which more time is devoted to comedy than to music.

(b) The set in which no more time is devoted to commercials than to either music or comedy.

(c) The set in which exactly five minutes is devoted to music.

(d) The set in which all three of the above conditions are satisfied.

9. In Exercise 8, find two sets, each of which is a proper subset of the set in 8a and also of the set in 8c.

10. Let U be the set of paths in Figure ??. Find the subset in which

(a) Two balls of the same color are drawn.

(b) Two different color balls are drawn.

11. A set has 101 elements. How many subsets does it have? How many of the subsets have an odd number of elements?

[Ans. 2^101; 2^100.]

12. Do Exercise 11 for the case of a set with 102 elements.

2.2 Operations on subsets

In Chapter ?? we considered the ways in which one could form new statements from given statements. Now we shall consider an analogous procedure, the formation of new sets from given sets. We shall assume that each of the sets that we use in the combination is a subset of some universal set, and we shall also want the newly formed set to be a subset of the same universal set. As usual, we can specify a newly formed set either by a description or by a listing.

If P and Q are two sets, we shall define a new set P ∩ Q, called the intersection of P and Q, as follows: P ∩ Q is the set which contains those and only those elements which belong to both P and Q. As an example, consider the logical possibilities listed in Figure 2.1. Let P be the subset in which candidate A wins at least three primaries, i.e., the set {P1, P2, P3, P4, P7, P13, P19}; let Q be the subset in which A wins the first two primaries, i.e., the set {P1, P2, P3, P4, P5, P6}. Then the intersection P ∩ Q is the set in which both events take place, i.e., where A wins the first two primaries and wins at least three primaries. Thus P ∩ Q is the set {P1, P2, P3, P4}.


Figure 2.2: ♦

If P and Q are two sets, we shall define a new set P ∪ Q, called the union of P and Q, as follows: P ∪ Q is the set that contains those and only those elements that belong either to P or to Q (or to both). In the example in the paragraph above, the union P ∪ Q is the set of possibilities for which either A wins the first two primaries or wins at least three primaries, i.e., the set {P1, P2, P3, P4, P5, P6, P7, P13, P19}.

To help in visualizing these operations we shall draw diagrams, called Venn diagrams, which illustrate them. We let the universal set be a rectangle and let subsets be circles drawn inside the rectangle. In Figure 2.2 we show two sets P and Q as shaded circles. Then the doubly crosshatched area is the intersection P ∩ Q and the total shaded area is the union P ∪ Q.

If P is a given subset of the universal set U, we can define a new set P̃ called the complement of P as follows: P̃ is the set of all elements of U that are not contained in P. For example, if, as above, Q is the set in which candidate A wins the first two primaries, then Q̃ is the set {P7, P8, ..., P36}. The shaded area in Figure 2.3 is the complement of the set P. Observe that the complement of the empty set ∅ is the universal set U, and also that the complement of the universal set is the empty set.

Sometimes we shall be interested in only part of the complement of a set. For example, we might wish to consider the part of the complement of the set Q that is contained in P, i.e., the set P ∩ Q̃. The shaded area in Figure 2.4 is P ∩ Q̃.

Figure 2.3: ♦

Figure 2.4: ♦

A somewhat more suggestive definition of this set can be given as follows: Let P − Q be the difference of P and Q, that is, the set that contains those elements of P that do not belong to Q. Figure 2.4 shows that P ∩ Q̃ and P − Q are the same set. In the primary voting example above, the set P − Q can be listed as {P7, P13, P19}.

The complement of a subset is a special case of a difference set, since we can write Q̃ = U − Q. If P and Q are nonempty subsets whose intersection is the empty set, i.e., P ∩ Q = ∅, then we say that they are disjoint subsets.
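The operations defined so far correspond closely to the built-in set operations of Python. The sketch below repeats the primary voting example, with the integers 1 to 36 standing in for the labels P1 to P36 (a convenience of ours, not a convention of the text).

```python
U = set(range(1, 37))            # 1..36 stand for P1..P36
P = {1, 2, 3, 4, 7, 13, 19}      # A wins at least three primaries
Q = {1, 2, 3, 4, 5, 6}           # A wins the first two primaries

intersection = P & Q             # P ∩ Q
union = P | Q                    # P ∪ Q
complement_Q = U - Q             # the complement of Q, written as U − Q
difference = P - Q               # P − Q

# The difference is the part of the complement of Q that lies in P.
same_set = difference == (P & complement_Q)

# P − Q and Q have no element in common, so they are disjoint.
disjoint = difference.isdisjoint(Q)

print(sorted(intersection), sorted(difference), same_set, disjoint)
```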

Example 2.2 In the primary voting example let R be the set in which A wins the first three primaries, i.e., the set {P1, P2, P3}; let S be the set in which A wins the last two primaries, i.e., the set {P1, P7, P13, P19, P25, P31}. Then R ∩ S = {P1} is the set in which A wins the first three primaries and also the last two, that is, he or she wins all the primaries. We also have

R ∪ S = {P1, P2, P3, P7, P13, P19, P25, P31},

which can be described as the set in which A wins the first three primaries or the last two. The set in which A does not win the first three primaries is R̃ = {P4, P5, ..., P36}. Finally, we see that the difference set R − S is the set in which A wins the first three primaries but not both of the last two. This set can be found by taking from R the element P1 which it has in common with S, so that R − S = {P2, P3}. ♦

Exercises

1. Draw Venn diagrams for P ∩ Q, P ∩ Q̃, P̃ ∩ Q, P̃ ∩ Q̃.

2. Give a step-by-step construction of the diagram for (P − Q) ∪(P ∩ Q).

3. Venn diagrams are also useful when three subsets are given. Construct such a diagram, given the subsets P, Q, and R. Identify each of the eight resulting areas in terms of P, Q, and R.

4. In testing blood, three types of antigens are looked for: A, B, and Rh. Every person is classified doubly. He or she is Rh positive if he or she has the Rh antigen, and Rh negative otherwise. He or she is type AB, A, or B depending on which of the other antigens he or she has, with type O having neither A nor B. Draw a Venn diagram, and identify each of the eight areas.


Figure 2.5: ♦

5. Considering only two subsets, the set X of people having antigen A, and the set Y of people having antigen B, define (symbolically) the types AB, A, B, and O.

6. A person can receive blood from another person if he or she has all the antigens of the donor. Describe in terms of X and Y the sets of people who can give to each of the four types. Identify these sets in terms of blood types.

7. The tabulation in Figure 2.5 records the reaction of a number of spectators to a television show. All the categories can be defined in terms of the following four: M (male), G (grown-up), L (liked), V (very much). How many people fall into each of the following categories?

(a) M .

[Ans. 34.]

(b) L.

(c) V .

(d) M ∩ G ∩ L ∩ V .

[Ans. 2.]

(e) M ∩G ∩ L.

(f) (M ∩G) ∪ (L ∩ V ).


(g) ˜M ∩G.

[Ans. 48.]

(h) M ∪ G.

(i) M −G.

(j) [M − (G ∩ L ∩ V )].

8. In a survey of 100 students, the numbers studying various languages were found to be: Spanish, 28; German, 30; French, 42; Spanish and German, 8; Spanish and French, 10; German and French, 5; all three languages, 3.

(a) How many students were studying no language?

[Ans. 20.]

(b) How many students had French as their only language?

[Ans. 30.]

(c) How many students studied German if and only if they studied French?

[Ans. 38.]

[Hint: Draw a Venn diagram with three circles, for French, German, and Spanish students. Fill in the numbers in each of the eight areas, using the data given above. Start from the end of the list and work back.]

9. In a later survey of the 100 students (see Exercise 8) the numbers studying the various languages were found to be: German only, 18; German but not Spanish, 23; German and French, 8; German, 26; French, 48; French and Spanish, 8; no language, 24.

(a) How many students took Spanish?

[Ans. 18.]

(b) How many took German and Spanish but not French?

[Ans. None.]

(c) How many took French if and only if they did not take Spanish?


[Ans. 50.]

10. The report of one survey of the 100 students (see Exercise 8) stated that the numbers studying the various languages were: all three languages, 5; German and Spanish, 10; French and Spanish, 8; German and French, 20; Spanish, 30; German, 23; French, 50. The surveyor who turned in this report was fired. Why?
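The hint to Exercise 8 describes a procedure that can be carried out mechanically: start from the innermost area of the Venn diagram and work outward. The sketch below fills in the eight areas for the data of Exercise 8 and reproduces the three answers printed there. The variable names are ours.

```python
# Survey data from Exercise 8.
total = 100
S, G, F = 28, 30, 42          # Spanish, German, French
SG, SF, GF = 8, 10, 5         # pairs of languages
SGF = 3                       # all three languages

# Work back from the end of the list, as the hint suggests.
only_SG = SG - SGF            # Spanish and German but not French
only_SF = SF - SGF            # Spanish and French but not German
only_GF = GF - SGF            # German and French but not Spanish
only_S = S - only_SG - only_SF - SGF
only_G = G - only_SG - only_GF - SGF
only_F = F - only_SF - only_GF - SGF
none = total - (only_S + only_G + only_F + only_SG + only_SF + only_GF + SGF)

# "German if and only if French": both languages, or neither of the two.
german_iff_french = (only_GF + SGF) + (only_S + none)

print(none, only_F, german_iff_french)
```

The same bookkeeping applied to the report in Exercise 10 is one way to look for an inconsistency.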

2.3 The relationship between sets and compound statements

The reader may have observed several times in the preceding sections that there was a close connection between sets and statements, and between set operations and compounding operations. In this section we shall formalize these relationships.

If we have a number of statements relative to a set of logical possibilities, there is a natural way of assigning a set to each statement. First of all, we take the set of logical possibilities as our universal set. Then to each statement we assign the subset of logical possibilities of the universal set for which that statement is true. This idea is so important that we embody it in a formal definition.

Definition. Let U be a set of logical possibilities, let p be a statement relative to it, and let P be that subset of the possibilities for which p is true; then we call P the truth set of p.

If p and q are statements, then p ∨ q and p ∧ q are also statements and hence must have truth sets. To find the truth set of p ∨ q, we observe that it is true whenever p is true or q is true (or both). Therefore we must assign to p ∨ q the logical possibilities which are in P or in Q (or both); that is, we must assign to p ∨ q the set P ∪ Q. On the other hand, the statement p ∧ q is true only when both p and q are true, so that we must assign to p ∧ q the set P ∩ Q.

Thus we see that there is a close connection between the logical operation of disjunction and the set operation of union, and also between conjunction and intersection. A careful examination of the definitions of union and intersection shows that the word “or” occurs in the definition of union and the word “and” occurs in the definition of intersection. Thus the connection between the two theories is not surprising.

Since the connective “not” occurs in the definition of the complement of a set, it is not surprising that the truth set of ¬p is P̃. This follows since ¬p is true when p is false, so that the truth set of ¬p contains all logical possibilities for which p is false, that is, the truth set of ¬p is P̃.

Figure 2.6: ♦

The truth sets of two propositions p and q are shown in Figure 2.6. Also marked on the diagram are the various logical possibilities for these two statements. The reader should pick out in this diagram the truth sets of the statements p ∨ q, p ∧ q, ¬p, and ¬q.

The connection between a statement and its truth set makes it possible to “translate” a problem about compound statements into a problem about sets. It is also possible to go in the reverse direction. Given a problem about sets, think of the universal set as being a set of logical possibilities and think of a subset as being the truth set of a statement. Hence we can “translate” a problem about sets into a problem about compound statements.

So far we have discussed only the truth sets assigned to compound statements involving ∨, ∧, and ¬. All the other connectives can be defined in terms of these three basic ones, so that we can deduce what truth sets should be assigned to them. For example, we know that p → q is equivalent to ¬p ∨ q (see Figure ??). Hence the truth set of p → q is the same as the truth set of ¬p ∨ q, that is, it is P̃ ∪ Q. The Venn diagram for p → q is shown in Figure 2.7, where the shaded area is the truth set for the statement. Observe that the unshaded area in Figure 2.7 is the set P − Q = P ∩ Q̃, which is the truth set of the statement p ∧ ¬q. Thus the shaded area is the complement ~(P − Q) = ~(P ∩ Q̃), which is the truth set of the statement ¬(p ∧ ¬q). We have thus discovered the fact that p → q, ¬p ∨ q, and ¬(p ∧ ¬q) are equivalent. It is always the case that two compound statements are equivalent if and only if they have the same truth sets. Thus we can test for equivalence by checking whether they have the same Venn diagram.

Figure 2.7: ♦

Suppose that p is a statement that is logically true. What is its truth set? Now p is logically true if and only if it is true in every logically possible case, so that the truth set of p must be U. Similarly, if p is logically false, then it is false for every logically possible case, so that its truth set is the empty set ∅.

Finally, let us consider the implication relation. Recall that p implies q if and only if the conditional p → q is logically true. But p → q is logically true if and only if its truth set is U, that is, the complement ~(P − Q) = U, or P − Q = ∅. From Figure 2.4 we see that if P − Q is empty, then P is contained in Q. We shall symbolize the containing relation as follows: P ⊂ Q means “P is a subset of Q”. We conclude that p → q is logically true if and only if P ⊂ Q.
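The translation between statements and truth sets can be tried out directly. In the sketch below the universal set consists of the four logical possibilities for the pair (p, q); the helper `truth_set` is an illustrative name of ours. Equivalence is tested by comparing truth sets, and implication by testing the subset relation.

```python
from itertools import product

# The universal set: the four logical possibilities for (p, q).
U = set(product([True, False], repeat=2))

def truth_set(statement):
    """Subset of U on which the statement (a function of p and q) is true."""
    return {case for case in U if statement(*case)}

P = truth_set(lambda p, q: p)
Q = truth_set(lambda p, q: q)

# p -> q, written in the equivalent form (not p) or q.
conditional = truth_set(lambda p, q: (not p) or q)

# Equivalent statements have equal truth sets.
equals_union_form = conditional == (U - P) | Q        # complement of P, union Q
equals_complement_form = conditional == U - (P - Q)   # complement of P − Q

# q implies (p -> q) exactly when Q is a subset of the truth set of p -> q.
q_implies_conditional = Q <= conditional

# p does not imply q: P is not a subset of Q, and p -> q is not logically true.
p_implies_q = P <= Q
conditional_logically_true = conditional == U

print(equals_union_form, equals_complement_form, q_implies_conditional, p_implies_q)
```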

Figure 2.8 supplies a “dictionary” for translating from statement language to set language, and back. To each statement relative to a set of possibilities U there corresponds a subset of U, namely the truth set of the statement. This is shown in lines 1 and 2 of the figure. To each connective there corresponds an operation on sets, as illustrated in the next four lines. And to each relation between statements there corresponds a relation between sets, examples of which are shown in the last two lines of the figure.

Figure 2.8: ♦

Example 2.3 Prove by means of a Venn diagram that the statement [p ∨ (¬p ∨ q)] is logically true. The assigned set of this statement is [P ∪ (P̃ ∪ Q)], and its Venn diagram is shown in Figure 2.9. In that figure the set P is shaded vertically, and the set P̃ ∪ Q is shaded horizontally. Their union is the entire shaded area, which is U, so that the compound statement is logically true. ♦

Figure 2.9: ♦

Figure 2.10: ♦

Example 2.4 Prove by means of Venn diagrams that p ∨ (q ∧ r) is equivalent to (p ∨ q) ∧ (p ∨ r). The truth set of p ∨ (q ∧ r) is the entire shaded area in diagram (a) of Figure 2.10, and the truth set of (p ∨ q) ∧ (p ∨ r) is the doubly shaded area in diagram (b). Since these two sets are equal, we see that the two statements are equivalent. ♦

Example 2.5 Show by means of a Venn diagram that q implies p → q. The truth set of p → q is the shaded area in Figure 2.7. Since this shaded area includes the set Q, we see that q implies p → q. ♦

Exercises

Note. In Exercises 1, 2, and 3, find first the truth set of each statement.

1. Use Venn diagrams to test which of the following statements are logically true or logically false.

(a) p ∨ ¬p.

[Ans. logically true.]

(b) p ∧ ¬p.

[Ans. logically false.]

(c) p ∨ (¬p ∧ q).

(d) p → (q → p).

[Ans. logically true.]

(e) p ∧ ¬(q → p).

[Ans. logically false.]

2. Use Venn diagrams to test the following statements for equivalences.

(a) p ∨ ¬q.

(b) ¬(p ∧ q).

(c) ¬(q ∧ ¬p).

(d) p → ¬q.

(e) ¬p ∨ ¬q.

[Ans. 2a and 2c equivalent; 2b and 2d and 2e equivalent.]

3. Use Venn diagrams for the following pairs of statements to test whether one implies the other.

(a) p; p ∧ q.

(b) p ∧ ¬q; ¬p → ¬q.

(c) p → q; q → p.

(d) p ∧ q; p ∧ ¬q.

4. Devise a test for inconsistency of p and q, using Venn diagrams.

5. Three or more statements are said to be inconsistent if they cannot all be true. What does this state about their truth sets?

6. Consider these three statements.

If this is a good course, then I will work hard in it.

If this is not a good course, then I shall get a bad grade in it.


I will not work hard, but I will get a good grade in this course.

(a) Assign variables to the components of each of these statements.

(b) Bring the statements into symbolic form.

(c) Find the truth sets of the statements.

(d) Test for consistency.

[Ans. Inconsistent.]

Note. In Exercises 7, 8, and 9, assign to each set a statement having it as a truth set.

7. Use truth tables to find which of the following sets are empty.

(a) (P ∪Q) ∩ (P ∪ Q).

(b) (P ∩Q) ∩ (Q ∩R).

(c) (P ∩Q)− P .

(d) (P ∪ R) ∩ (P ∪ Q)

[Ans. 7b and 7c.]

8. Use truth tables to find out whether the following sets are all different.

(a) P ∩ (Q ∪R).

(b) (R −Q) ∪ (Q− R).

(c) (R ∪Q) ∩ ˜(R ∩Q).

(d) (P ∩Q) ∪ (P ∩R).

(e) (P ∩Q ∩ R) ∪ (P ∩ Q ∩R) ∪ (P ∩Q ∩ R) ∪ (P ∩ Q ∩ R).

9. Use truth tables for the following pairs of sets to test whether one is a subset of the other.

(a) P ;P ∩Q.

(b) P ∩ Q;Q ∩ P .

(c) P −Q;Q− P .


(d) P ∩ Q;P ∪Q.

10. Show, both by the use of truth tables and by the use of Venn diagrams, that p ∧ (q ∨ r) is equivalent to (p ∧ q) ∨ (p ∧ r).

11. The symmetric difference of P and Q is defined to be (P − Q) ∪ (Q − P). What connective corresponds to this set operation?

12. Let p, q, r be a complete set of alternatives (see Section ??). What can we say about the truth sets P, Q, R?

2.4 The abstract laws of set operations

The set operations which we have introduced obey some very simple abstract laws, which we shall list in this section. These laws can be proved by means of Venn diagrams or they can be translated into statements and checked by means of truth tables.

The abstract laws given below bear a close resemblance to the elementary algebraic laws with which the student is already familiar. The resemblance can be made even more striking by replacing ∪ by + and ∩ by ×. For this reason, a set, its subsets, and the laws of combination of subsets are considered an algebraic system, called a Boolean algebra, after the British mathematician George Boole, who was the first person to study them from the algebraic point of view. Any other system obeying these laws, for example, the system of compound statements studied in Chapter ??, is also known as a Boolean algebra. We can study any of these systems from either the algebraic or the logical point of view.

Below are the basic laws of Boolean algebras. The proofs of these laws will be left as exercises.

The laws governing union and intersection:


A1. A ∪ A = A.

A2. A ∩ A = A.

A3. A ∪ B = B ∪ A.

A4. A ∩ B = B ∩ A.

A5. A ∪ (B ∪ C) = (A ∪ B) ∪ C.

A6. A ∩ (B ∩ C) = (A ∩ B) ∩ C.

A7. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

A8. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

A9. A ∪ U = U.

A10. A ∩ ∅ = ∅.

A11. A ∪ ∅ = A.

A12. A ∩ U = A.

The laws governing complements:

B1. ~(Ã) = A.

B2. A ∪ Ã = U.

B3. A ∩ Ã = ∅.

B4. ~(A ∪ B) = Ã ∩ B̃.

B5. ~(A ∩ B) = Ã ∪ B̃.

B6. Ũ = ∅.

The laws governing set-differences:

C1. A − B = A ∩ B̃.

C2. U − A = Ã.

C3. A − U = ∅.

C4. A − ∅ = A.

C5. ∅ − A = ∅.

C6. A − A = ∅.

C7. (A − B) − C = A − (B ∪ C).

C8. A − (B − C) = (A − B) ∪ (A ∩ C).

C9. A ∪ (B − C) = (A ∪ B) − (C − A).

C10. A ∩ (B − C) = (A ∩ B) − (A ∩ C).
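Besides Venn diagrams and truth tables, a law can be tested by trying every choice of A, B, and C among the subsets of a small universal set. The sketch below does this for a selection of the laws over a three-element universe of our choosing. It is a check on one example, offered as an illustration and not as a replacement for the proofs asked for in the exercises. The names `subsets`, `comp`, and `LAWS` are ours.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})

def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def comp(A):
    """Complement of A relative to U."""
    return U - A

# Each law is a function of A, B, C returning True when the law holds.
LAWS = {
    "A7": lambda A, B, C: A & (B | C) == (A & B) | (A & C),
    "A8": lambda A, B, C: A | (B & C) == (A | B) & (A | C),
    "B1": lambda A, B, C: comp(comp(A)) == A,
    "B4": lambda A, B, C: comp(A | B) == comp(A) & comp(B),
    "B5": lambda A, B, C: comp(A & B) == comp(A) | comp(B),
    "C1": lambda A, B, C: A - B == A & comp(B),
    "C7": lambda A, B, C: (A - B) - C == A - (B | C),
    "C8": lambda A, B, C: A - (B - C) == (A - B) | (A & C),
    "C9": lambda A, B, C: A | (B - C) == (A | B) - (C - A),
    "C10": lambda A, B, C: A & (B - C) == (A & B) - (A & C),
}

# Every ordered triple of subsets: 8 * 8 * 8 = 512 cases.
triples = list(product(subsets(U), repeat=3))
results = {name: all(law(A, B, C) for A, B, C in triples) for name, law in LAWS.items()}

for name, holds in results.items():
    print(name, holds)
```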

Exercises

1. Test laws in the group A1–A12 by means of Venn diagrams.

2. “Translate” the A-laws into laws about compound statements.Test these by truth tables.

3. Test the laws in groups B and C by Venn diagrams.

4. “Translate” the B- and C-laws into laws about compound statements. Test these by means of truth tables.

5. Derive the following results from the 28 basic laws.

(a) A = (A ∩ B) ∪ (A ∩ B̃).

(b) A ∪ B = (A ∩ B) ∪ (A ∩ B̃) ∪ (Ã ∩ B).

(c) A ∩ (A ∪ B) = A.

(d) A ∪ (Ã ∩ B) = A ∪ B.

6. From the A- and B-laws and from C1, derive C2–C6.

7. Use A1–A12 and C2–C10 to derive B1, B2, B3, and B6.

Supplementary exercises.

Note. Use the following definitions in these exercises: Let + be symmetric difference (see Exercise 11), × be intersection, let 0 be ∅ and 1 be U.

8. From A2, A4, and A6 derive the properties of multiplication.

9. Find corresponding properties for addition.

10. Set up addition and multiplication tables for 0 and 1.

11. What do A× 0, A× 1, A+ 0, and A+ 1 equal?

[Ans. 0; A; A; Ã.]

12. Show that

A× (B + C) = (A×B) + (A× C).

13. Show that the following equation is not always true.

A+ (B × C) = (A +B)× (A + C).


Figure 2.11: ♦

2.5 Two-digit number systems

In the decimal number system one can write any number by using only the ten digits, 0, 1, 2, ..., 9. Other number systems can be constructed which use either fewer or more digits. Probably the simplest number system is the binary number system which uses only the digits 0 and 1. We shall consider all the possible ways of forming number systems using only these two digits.

The two basic arithmetical operations are addition and multiplication. To understand any arithmetic system, it is necessary to know how to add or multiply any two digits together. Thus to understand the decimal system, we had to learn a multiplication table and an addition table, each of which had 100 entries. To understand the binary system, we have to learn a multiplication and an addition table, each of which has only four entries. These are shown in Figure 2.11. The multiplication table given there is completely determined by the two familiar rules that multiplying a number by zero gives zero, and multiplying a number by one leaves it unchanged. For addition, we have only the rule that the addition of zero to a number does not change that number. The latter rule is sufficient to determine all but one of the entries in the addition table in Figure 2.11. We must still decide what shall be the sum 1 + 1.

What are the possible ways in which we can complete the addition table? The only one-digit numbers that we can use are 0 and 1, and these lead to interesting systems. Of the possible two-digit numbers, we see that 00 and 01 are the same as 0 and 1 and so do not give anything new. The number 11 or any greater number would introduce a “jump” in the table, hence the only other possibility is 10. The addition tables of these three different number systems are shown in Figure 2.12, and they all have the multiplication table shown in Figure 2.11. Each of these systems is interesting in itself as the interpretations below show.

Figure 2.12: ♦

Let us say that the parity of a positive integer is the fact of its being odd or even. Consider now the number system having the addition table (a) in Figure 2.12 and let 0 represent “even” and 1 represent “odd”. The tables above now tell how the parity of a combination of two positive integers is related to the parity of each. Thus 0 · 1 = 0 tells us that the product of an even number and an odd number is even, while 1 + 1 = 0 tells us that the sum of two odd numbers is even, etc. Thus the first number system is that which we get from the arithmetic of the positive integers if we consider only the parity of numbers.

The second number system, which has the addition table (b) in Figure 2.12, has an interpretation in terms of sets. Let 0 correspond to the empty set ∅ and 1 correspond to the universal set U. Let the addition of numbers correspond to the union of sets and let the multiplication of numbers correspond to the intersection of sets. Then 0 · 1 = 0 tells us that ∅ ∩ U = ∅ and 1 + 1 = 1 tells us that U ∪ U = U. The student should give the interpretations for the other arithmetic computations possible for this number system.
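The first two interpretations can be checked by a short computation. The sketch below assumes, as the text states, that 1 + 1 = 0 in system (a) and 1 + 1 = 1 in system (b); the figures themselves are not reproduced, and the sample range and the three-element universal set are arbitrary choices of ours.

```python
# System (a): 0 = even, 1 = odd. Compare the tables with ordinary integers.
def parity(n):
    return n % 2

samples = range(1, 8)
# Addition table (a) is "exclusive or" on digits; multiplication is "and".
parity_add_ok = all(parity(m + n) == parity(m) ^ parity(n) for m in samples for n in samples)
parity_mul_ok = all(parity(m * n) == parity(m) & parity(n) for m in samples for n in samples)

# System (b): 0 = empty set, 1 = universal set; + is union, · is intersection.
U = frozenset({"a", "b", "c"})
as_set = {0: frozenset(), 1: U}
as_digit = {s: d for d, s in as_set.items()}

union_table = {(x, y): as_digit[as_set[x] | as_set[y]] for x in (0, 1) for y in (0, 1)}
intersection_table = {(x, y): as_digit[as_set[x] & as_set[y]] for x in (0, 1) for y in (0, 1)}

print(parity_add_ok, parity_mul_ok)
print(union_table)
print(intersection_table)
```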

Finally, the third number system, which has the addition table in (c) of Figure 2.12, is the so-called binary number system. Every ordinary integer can be written as a binary integer. Thus the binary 0 corresponds to the ordinary 0, and the binary unit 1 to the ordinary single unit. The binary number 10 means a “unit of higher order” and corresponds to the ordinary number two (not to ten). The binary number 100 then means two times two or four. In general, if b_n b_{n−1} ... b_2 b_1 b_0 is a binary number, where each digit is either 0 or 1, then the corresponding ordinary integer I is given by the formula

I = b_n · 2^n + b_{n−1} · 2^{n−1} + ... + b_2 · 2^2 + b_1 · 2 + b_0.

Thus the binary number 11001 corresponds to 2^4 + 2^3 + 1 = 16 + 8 + 1 = 25. The table in Figure 2.13 shows some binary numbers and their decimal equivalents.


Figure 2.13: ♦
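The formula for the ordinary integer corresponding to a binary number translates directly into a few lines of code. The function name `binary_to_integer` is ours; Python's built-in `int(digits, 2)` performs the same conversion and is used here only as a cross-check.

```python
def binary_to_integer(digits):
    """Evaluate b_n·2^n + ... + b_1·2 + b_0 for a string of binary digits."""
    n = len(digits) - 1
    # The digit at position i (from the left) carries the weight 2^(n - i).
    return sum(int(b) * 2 ** (n - i) for i, b in enumerate(digits))

value = binary_to_integer("11001")
agrees_with_builtin = all(
    binary_to_integer(d) == int(d, 2) for d in ["0", "1", "10", "100", "11001"]
)

print(value, agrees_with_builtin)
```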

Because electronic circuits are particularly well adapted to performing computations in the binary system, modern high-speed electronic computers are frequently constructed to work in the binary system.

Example 2.6 As an example of a computation, let us multiply 5 by 5 in the binary system. Since the binary equivalent of 5 is the number 101, the multiplication is done as follows.

        1 0 1
      × 1 0 1
      -------
        1 0 1
      0 0 0
    1 0 1
    ---------
    1 1 0 0 1

The answer is the binary number 11001, which we saw above was equivalent to the decimal integer 25, the answer we expected to get. ♦
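The layout of Example 2.6 can be imitated in a short sketch. The function below forms one shifted partial product for each digit of the multiplier, as in the written computation. As a simplification of ours, the partial products are then added with Python's ordinary integer arithmetic instead of column by column with the binary addition table.

```python
def binary_multiply(x, y):
    """Multiply two binary digit strings by the schoolbook method.

    Returns the shifted partial products and the product as a binary string.
    """
    partials = []
    for shift, digit in enumerate(reversed(y)):
        # Each partial product is either x or a row of zeros,
        # moved `shift` places to the left.
        row = (x if digit == "1" else "0" * len(x)) + "0" * shift
        partials.append(row)
    total = sum(int(row, 2) for row in partials)
    return partials, format(total, "b")

partials, product = binary_multiply("101", "101")
print(partials)
print(product)
```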

Exercises

1. Complete the interpretations of the addition and multiplication tables for the number systems representing

(a) parity,

(b) the sets U and ∅.

2. (a) What are the binary numbers corresponding to the integers 11, 52, 64, 98, 128, 144?

[Partial Ans. 1100010 corresponds to 98.]

(b) What decimal integers correspond to the binary numbers 1111, 1010101, 1000000, 11011011?

[Partial Ans. 1010101 corresponds to 85.]

3. Carry out the following operations in the binary system. Check your answer.

(a) 29 + 20.

(b) 9 · 7.

4. Of the laws listed below, which apply to each of the three systems?

(a) x+ y = y + x.

(b) x+ x = x.

(c) x+ x+ x = x.

5. Interpret a + b to be the larger of the two numbers a and b, and a · b to be the smaller of the two. Write tables of “addition” and “multiplication” for the digits 0 and 1. Compare the result with the three systems given above.

[Ans. Same as the U , ∅ system.]

6. What do the laws A1–A10 of Section 2.4 tell us about the second number system established above?

7. The first number system above (about parity) can be interpreted to deal with the remainders of integers when divided by 2. An even number leaves 0, an odd number leaves 1. Construct tables of addition and multiplication for remainders of integers when divided by 3. [Hint: These will be 3 by 3 tables.]

8. Given a set of four elements, suppose that we want to number its subsets. For a given subset, write down a binary number as follows: The first digit is 1 if and only if the first element is in the subset, the second digit is 1 if and only if the second element is in the subset, etc. Prove that this assigns a unique number, from 0 to 15, to each subset.

9. In a multiple choice test the answers were numbered 1, 2, 4, and 8. The students were told that there might be no correct answer, or that one or more answers might be correct. They were told to add together the numbers of the correct answers (or to write 0 if no answer was correct).

(a) By using the result of Exercise 8, show that the resulting number gives the instructor all the information he or she wants.

(b) On a given question the correct sum was 7. Three students put down 4, 8, and 15, respectively. Which answer was most nearly correct? Which answer was worst?

[Ans. 15 best, 8 worst.]

10. In the ternary number system, numbers are expressed to the base 3, so that 201 in this system stands for $2 \cdot 3^2 + 0 \cdot 3 + 1 \cdot 1 = 19$.

(a) Write the numbers from 1 through 30 in this notation.

(b) Construct a table of addition and multiplication for the digits 0, 1, 2.

(c) Carry out the multiplication of 5 · 5 in this system. Check your answer.

11. Explain the meaning of the numeral “2907” in our ordinary (base 10) notation, in analogy to the formula given for the binary system.

12. Show that the addition and multiplication tables set up in Exercise 10 correspond to one of our three systems.

2.6 Voting coalitions

As an application of our set concepts, we shall consider the significance of voting coalitions in voting bodies. Here the universal set is a set of human beings which form a decision-making body. For example, the universal set might be the members of a committee, or of a city council, or of a convention, or of the House of Representatives, etc. Each member can cast a certain number of votes. The decision as to whether or not a measure is passed can be decided by a simple majority rule, or two-thirds majority, etc.

Suppose now that a subset of the members of the body forms a coalition in order to pass a measure. The question is whether or not they have enough votes to guarantee passage of the measure. If they have enough votes to carry the measure, then we say they form a winning coalition. If the members not in the coalition can pass a measure of their own, then we say that the original coalition is a losing coalition. Finally, if the members of the coalition cannot carry their measure, and if the members not in the coalition cannot carry their measure, then the coalition is called a blocking coalition.

Let us restate these definitions in set-theoretic terms. A coalition C is winning if its members have enough votes to carry an issue; coalition C is losing if the complementary coalition $\tilde{C}$ is winning; and coalition C is blocking if neither C nor $\tilde{C}$ is a winning coalition.

The following facts are immediate consequences of these definitions. The complement of a winning coalition is a losing coalition. The complement of a losing coalition is a winning coalition. The complement of a blocking coalition is a blocking coalition.

Example 2.7 A committee consists of six members each having one vote. A simple majority vote will carry an issue. Then any coalition of four or more members is winning, any coalition with one or two members is losing, and any three-person coalition is blocking. ♦

Example 2.8 Suppose in Example 2.7 one of the six members (say the chair) is given the additional power to break ties. Then any three-person coalition of which the chair is a member is winning, while the other three-person coalitions are losing; hence there are no blocking coalitions. The other coalitions are as in Example 2.7. ♦

Example 2.9 Let the universal set U be the set {x, y, w, z}, where x and y each has one vote, w has two votes, and z has three votes. Suppose it takes five votes to carry a measure. Then the winning coalitions are: {z, w}, {z, x, y}, {z, w, x}, {z, w, y}, and U. The losing coalitions are the complements of these sets. Blocking coalitions are: {z}, {z, x}, {z, y}, {w, x}, {w, y}, and {w, x, y}. ♦

The last example shows that it is not always necessary to list all members of a winning coalition. For example, if the coalition {z, w} is winning, then it is obvious that the coalition {z, w, y} is also winning. In general, if a coalition C is winning, then any other set that has C as a subset will also be winning. Thus we are led to the notion of a minimal winning coalition. A minimal winning coalition is a winning coalition which contains no smaller winning coalition as a subset. Another way of stating this is that a minimal winning coalition is a winning coalition such that, if any member is lost from the coalition, then it ceases to be a winning coalition.

If we know the minimal winning coalitions, then we know everything that we need to know about the voting problem. The winning coalitions are all those sets that contain a minimal winning coalition, and the losing coalitions are the complements of the winning coalitions. All other sets are blocking coalitions.

In Example 2.7 the minimal winning coalitions are the sets containing four members. In Example 2.8 the minimal winning coalitions are the three-member coalitions that contain the tie-breaking member and the four-member coalitions that do not contain the tie-breaking member. The minimal winning coalitions in the third example are the sets {z, w} and {z, x, y}.
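For a small weighted voting body the three kinds of coalition can be listed mechanically. The following Python sketch checks Example 2.9; the function names and the dictionary representation of the votes are our own choices, and the code assumes the quota is more than half of the total vote, so that a coalition and its complement cannot both win.

```python
from itertools import combinations


def classify_coalitions(votes, quota):
    """Sort every subset of the members into winning, losing, or blocking."""
    members = sorted(votes)
    total = sum(votes.values())
    winning, losing, blocking = [], [], []
    for size in range(len(members) + 1):
        for combo in combinations(members, size):
            coalition = frozenset(combo)
            strength = sum(votes[m] for m in coalition)
            if strength >= quota:
                winning.append(coalition)        # the coalition can carry a measure
            elif total - strength >= quota:
                losing.append(coalition)         # its complement can carry a measure
            else:
                blocking.append(coalition)       # neither side can carry a measure
    return winning, losing, blocking


def minimal_winning(winning):
    """Keep the winning coalitions that contain no smaller winning coalition."""
    return [c for c in winning if not any(other < c for other in winning)]


votes = {"x": 1, "y": 1, "w": 2, "z": 3}   # Example 2.9
winning, losing, blocking = classify_coalitions(votes, quota=5)
for coalition in minimal_winning(winning):
    print(sorted(coalition))
```

Listing only the minimal winning coalitions loses no information: every other winning coalition is a superset of one of them, and the losing and blocking coalitions follow from the winning ones.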

Sometimes there are committee members who have special powers or lack of power. If a member can pass any measure he or she wishes without needing anyone else to vote with him or her, then we call him or her a dictator. Thus member x is a dictator if and only if {x} is a winning coalition. A somewhat weaker but still very powerful member is one who can by himself or herself block any measure. If x is such a member, then we say that x has veto power. Thus x has veto power if and only if {x} is a blocking coalition. Finally if x is not a member of any minimal winning coalition, we shall call him or her a powerless member. Thus x is powerless if and only if any winning coalition of which x is a member is a winning coalition without him or her.

Example 2.10 An interesting example of a decision-making body is the Security Council of the United Nations. (We discuss the rules prior to 1966.) The Security Council has eleven members consisting of the five permanent large-nation members called the Big Five, and six small-nation members. In order that a measure be passed by the Council, seven members including all of the Big Five must vote for the measure. Thus the seven-member sets made up of the Big Five plus two small nations are the minimal winning coalitions. Then the losing coalitions are the sets that contain no Big Five member and at most four small nations. The blocking coalitions are the sets that are neither winning nor losing. In particular, a unit set that contains one of the Big Five as a member is a blocking coalition. This is the sense in which a Big Five member has a veto. [The possibility of “abstaining” is immaterial in the above discussion.]

In 1966 the number of small-nation members was increased to 10. A measure now requires the vote of nine members, including all of the Big Five. (See Exercise 11.) ♦
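The pre-1966 rule of Example 2.10 can be written as a short test on sets. In the Python sketch below the member labels (`B1` to `B5` for the Big Five, `s1` to `s6` for the small nations) and the function names are hypothetical, chosen only for illustration.

```python
BIG_FIVE = frozenset(f"B{i}" for i in range(1, 6))
SMALL = frozenset(f"s{i}" for i in range(1, 7))
COUNCIL = BIG_FIVE | SMALL


def is_winning(coalition):
    """Seven votes are needed, and they must include all of the Big Five."""
    return BIG_FIVE <= coalition and len(coalition) >= 7


def status(coalition):
    """Apply the definitions: look at the coalition, then at its complement."""
    if is_winning(coalition):
        return "winning"
    if is_winning(COUNCIL - coalition):
        return "losing"
    return "blocking"


print(status(BIG_FIVE | {"s1", "s2"}))   # winning
print(status(frozenset({"B1"})))          # blocking
print(status(frozenset({"s1", "s2"})))    # losing
```

The second line of output is the veto: a single Big Five member cannot pass a measure, but nothing can pass without that member.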

Exercises

1. A committee has w, x, y, and z as members. Member w has two votes, the others have one vote each. List the winning, losing, and blocking coalitions.

2. A committee has n members, each with one vote. It takes a majority vote to carry an issue. What are the winning, losing, and blocking coalitions?

3. The Board of Estimate of New York City consists (that is, consisted at one time) of eight members with voting strength as follows:

    s. Mayor                          4 votes
    t. Controller                     4
    u. Council President              4
    v. Brooklyn Borough President     2
    w. Manhattan Borough President    2
    x. Bronx Borough President        2
    y. Richmond Borough President     2
    z. Queens Borough President       2

A simple majority is needed to carry an issue. List the minimal winning coalitions. List the blocking coalitions. Do the same if we give the mayor the additional power to break ties.

4. A company has issued 100,000 shares of common stock and each share has one vote. How many shares must a stockholder have to be a dictator? How many to have a veto?

[Ans. 50,001; 50,000.]

5. In Exercise 4, if the company requires a two-thirds majority vote to carry an issue, how many shares must a stockholder have to be a dictator or to have a veto?

[Ans. At least 66,667; at least 33,334.]


6. Prove that if a committee has a dictator as a member, then the remaining members are powerless.

7. We can define a maximal losing coalition in analogy to the minimal winning coalitions. What is the relation between the maximal losing and minimal winning coalitions? Do the maximal losing coalitions provide all relevant information?

8. Prove that any two minimal winning coalitions have at least one member in common.

9. Find all the blocking coalitions in the Security Council example (Example 2.10).

10. Prove that if a member has veto power and if he or she together with any one other member can carry a measure, then the distribution of the remaining votes is irrelevant.

11. Find the winning, losing, and blocking coalitions in the Security Council, using the revised (1966) structure.

Suggested reading.

Birkhoff, G., and S. MacLane, A Survey of Modern Algebra, 1953, Chapter XI.

Tarski, A., Introduction to Logic, 2d rev. ed., 1946, Chapter IV.

Allendoerfer, C. B., and C. O. Oakley, Principles of Mathematics, 1955, Chapter V.

Johnstone, H. W., Jr., Elementary Deductive Logic, 1954, Part Three.

Breuer, Joseph, Introduction to the Theory of Sets, 1958.

Fraenkel, A. A., Abstract Set Theory, 1953.

Kemeny, John G., Hazleton Mirkil, J. Laurie Snell, and Gerald L. Thompson, Finite Mathematical Structures, 1959, Chapter 2.

Chapter 3

Partitions and counting

3.1 Partitions

The problems to be studied in this chapter can be most conveniently described in terms of partitions of a set. A partition of a set U is a subdivision of the set into subsets that are disjoint and exhaustive, i.e., every element of U must belong to one and only one of the subsets. The subsets $A_i$ in the partition are called cells. Thus $[A_1, A_2, \ldots, A_r]$ is a partition of U if two conditions are satisfied: (1) $A_i \cap A_j = \emptyset$ if $i \neq j$ (the cells are disjoint) and (2) $A_1 \cup A_2 \cup \ldots \cup A_r = U$ (the cells are exhaustive).

Example 3.1 If U = {a, b, c, d, e}, then [{a, b}, {c, d, e}] and [{b, c, e}, {a}, {d}] and [{a}, {b}, {c}, {d}, {e}] are three different partitions of U. The last is a partition into unit sets. ♦

The process of going from a fine to a less fine analysis of a set of logical possibilities is actually carried out by means of a partition. For example, let us consider the logical possibilities for the first three games of the World Series if the Yankees play the Dodgers. We can list the possibilities in terms of the winner of each game as

{YYY, YYD, YDY, DYY, DDY, DYD, YDD, DDD}.

We form a partition by putting all the possibilities with the same number of wins for the Yankees in a single cell,

[{YYY}, {YYD, YDY, DYY}, {DDY, DYD, YDD}, {DDD}].

Thus, if we wish the possibilities to be Yankees win three games, win two, win one, win zero, then we are considering a less detailed analysis obtained from the former analysis by identifying the possibilities in each cell of the partition.

If $[A_1, A_2, \ldots, A_r]$ and $[B_1, B_2, \ldots, B_s]$ are two partitions of the same set U, we can obtain a new partition by considering the collection of all subsets of U of the form $A_i \cap B_j$ (see Exercise 7). This new partition is called the cross-partition of the original two partitions.

Example 3.2 A common use of cross-partitions is in the problem of classification. For example, from the set U of all life forms we can form the partition [P, A] where P is the set of all plants and A is the set of all animals. We may also form the partition [E, F] where E is the set of extinct life forms and F is the set of all existing life forms. The cross-partition

[P ∩ E, P ∩ F, A ∩ E, A ∩ F]

gives a complete classification according to the two separate classifications. ♦
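A cross-partition is easy to compute when the cells are given as Python sets. The sketch below uses a hypothetical helper `cross_partition` and, as a simplifying assumption of ours, leaves out any intersection that turns out to be empty; the definition in the text forms every intersection $A_i \cap B_j$.

```python
def cross_partition(first, second):
    """Form the cells A_i ∩ B_j, taking the A_i in order and then the B_j."""
    cells = []
    for a in first:
        for b in second:
            cell = a & b
            if cell:  # assumption: empty intersections are not listed
                cells.append(cell)
    return cells


def is_partition(cells, universe):
    """Check the two conditions: the cells are disjoint and exhaustive."""
    covered = set()
    for cell in cells:
        if covered & cell:
            return False      # two cells share an element
        covered |= cell
    return covered == universe


U = set(range(1, 7))
first = [{1, 2, 3}, {4, 5, 6}]
second = [{1, 4}, {2, 3, 5, 6}]
print(cross_partition(first, second))   # [{1}, {2, 3}, {4}, {5, 6}]
print(is_partition(cross_partition(first, second), U))   # True
```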

Many of the examples with which we shall deal in the future will relate to processes which take place in stages. It will be convenient to use partitions and cross-partitions to represent the stages of the process. The graphical representation of such a process is, of course, a tree. For example, suppose that the process is such that we learn in succession the truth values of a series of statements relative to a given situation. If U is the set of logical possibilities for the situation, and p is a statement relative to U, then the knowledge of the truth value of p amounts to knowing which cell of the partition $[P, \tilde{P}]$ contains the actual possibility. Recall that P is the truth set of p, and $\tilde{P}$ is the truth set of ¬p. Suppose now we discover the truth value of a second statement q. This information can again be described by a partition, namely, $[Q, \tilde{Q}]$. The two statements together give us information which can be represented by the cross-partition of these two partitions,

$$[P \cap Q,\ P \cap \tilde{Q},\ \tilde{P} \cap Q,\ \tilde{P} \cap \tilde{Q}].$$

That is, if we know the truth values of p and q, we also know which of the cells of this cross-partition contains the particular logical possibility describing the given situation. Conversely, if we knew which cell contained the possibility, we would know the truth values for the statements p and q.

The information obtained by the additional knowledge of the truth value of a third statement r, having a truth set R, can be represented by the cross-partition of the three partitions $[P, \tilde{P}]$, $[Q, \tilde{Q}]$, $[R, \tilde{R}]$. This cross-partition is

$$[P \cap Q \cap R,\ P \cap Q \cap \tilde{R},\ P \cap \tilde{Q} \cap R,\ P \cap \tilde{Q} \cap \tilde{R},\ \tilde{P} \cap Q \cap R,\ \tilde{P} \cap Q \cap \tilde{R},\ \tilde{P} \cap \tilde{Q} \cap R,\ \tilde{P} \cap \tilde{Q} \cap \tilde{R}].$$

Notice that now we have the possibility narrowed down to being in one of $8 = 2^3$ possible cells. Similarly, if we knew the truth values of n statements, our partition would have $2^n$ cells.

If the set U were to contain $2^{20}$ (approximately one million) logical possibilities, and if we were able to ask yes-no questions in such a way that the knowledge of the truth value of each question would cut the number of possibilities in half each time, then we could determine in 20 questions any given possibility in the set U. We could accomplish this kind of questioning, for example, if we had a list of all the possibilities and were allowed to ask “Is it in the first half?” and, if the answer is yes, then “Is it in the first one-fourth?”, etc. In practice we ordinarily do not have such a list, and we can only approximate this procedure.
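The halving procedure can be simulated when the possibilities are simply numbered 0, 1, 2, and so on. The Python sketch below is our own illustration (the name `identify` is hypothetical); each pass through the loop plays the part of one question “Is it in the first half?”.

```python
def identify(size, secret):
    """Locate `secret` among the numbers 0 .. size-1 by halving; count the questions."""
    low, high = 0, size          # the remaining candidates are low .. high-1
    questions = 0
    while high - low > 1:
        middle = (low + high) // 2
        questions += 1
        if secret < middle:      # answer "yes": keep the first half
            high = middle
        else:                    # answer "no": keep the second half
            low = middle
    return low, questions


print(identify(2 ** 20, 777_777))   # (777777, 20)
```

Because $2^{20}$ halves evenly at every step, exactly twenty questions are used whatever the secret possibility is.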

Example 3.3 In the familiar radio game of twenty questions it is not unusual for a contestant to try to carry out a partitioning of the above kind. For example, he or she may know that he or she is trying to guess a city. He or she might ask, “Is the city in North America?” and if the answer is yes, “Is it in the United States?” and if yes, “Is it west of the Mississippi?” and if no, “Is it in the New England states?”, etc. Of course, the above procedure does not actually divide the possibilities exactly in half each time. The more nearly the answer to each question comes to dividing the possibilities in half, the more certain one can be of getting the answer in twenty questions, if there are at most a million possibilities. ♦

Exercises

1. If U is the set of integers from 1 to 6, find the cross-partitions of the following pairs of partitions

(a) [{1, 2, 3}, {4, 5, 6}] and [{1, 4}, {2, 3, 5, 6}].

[Ans. [{1}, {2, 3}, {4}, {5, 6}].]

(b) [{1, 2, 3, 4, 5}, {6}] and [{1, 3, 5}, {2, 6}, {4}].

2. A coin is thrown three times. List the possibilities according to which side turns up each time. Give the partition formed by putting in the same cell all those possibilities for which the same number of heads occur.

3. Let p and q be two statements with truth sets P and Q. What can be said about the cross-partition of $[P, \tilde{P}]$ and $[Q, \tilde{Q}]$ in the case that

(a) p implies q.

[Ans. $P \cap \tilde{Q} = \emptyset$.]

(b) p is equivalent to q.

(c) p and q are inconsistent.

4. Consider the set of eight states consisting of Illinois, Colorado, Michigan, New York, Vermont, Texas, Alabama, and California.

(a) Show that in three “yes” or “no” questions one can identify any one of the eight states.

(b) Design a set of three “yes” or “no” questions which can be answered independently of each other and which will serve to identify any one of the states.

5. An unabridged dictionary contains about 600,000 words and 3000 pages. If a person chooses a word from such a dictionary, is it possible to identify this word by twenty “yes” or “no” questions? If so, describe the procedure that you would use and discuss the feasibility of the procedure. (One approach is the following. Use 12 questions to locate the page, but then you may need 9 questions to locate the word.)

6. Jones has two parents, each of his or her parents had two parents, each of these had two parents, etc. Tracing a person’s family tree back 40 generations (about 1000 years) gives Jones $2^{40}$ ancestors, which is more people than have been on the earth in the last 1000 years. What is wrong with this argument?

7. Let $[A_1, A_2, A_3]$ and $[B_1, B_2]$ be two partitions. Prove that the cross-partition of the two given partitions really is a partition, that is, it satisfies requirements (1) and (2) for partitions.

8. The cross-partition formed from the truth sets of n statements has $2^n$ cells. As seen in Chapter ??, the truth table of a statement compounded from n statements has $2^n$ rows. What is the relationship between these two facts?

9. Let p and q be statements with truth sets P and Q. Form the partition $[P \cap Q,\ P \cap \tilde{Q},\ \tilde{P} \cap Q,\ \tilde{P} \cap \tilde{Q}]$. State in each case below which of the cells must be empty in order to make the given statement a logically true statement.

(a) p → q

(b) p ↔ q

(c) p ∨ ¬p

(d) p

10. A partition $[A_1, A_2, \ldots, A_n]$ is said to be a refinement of the partition $[B_1, B_2, \ldots, B_m]$ if every $A_j$ is a subset of some $B_k$. Show that a cross-partition of two partitions is a refinement of each of the partitions from which the cross-partition is formed.

11. Consider the partition of the people in the United States determined by classification according to states. The classification according to county determines a second partition. Show that this is a refinement of the first partition. Give a third partition which is different from each of these and is a refinement of both.

12. What can be said concerning the cross-partition of two partitions, one of which is a refinement of the other?

13. Given nine objects, of which it is known that eight have the same weight and one is heavier, show how, in two weighings with a pan balance, the heavy one can be identified.

14. Suppose that you are given thirteen objects, twelve of which are the same, but one is either heavier or lighter than the others. Show that, with three weighings using a pan balance, it is possible to identify the odd object. (A complete solution to this problem is given on page 42 of Mathematical Snapshots, second edition, by H. Steinhaus.)

15. A subject can be completely classified by introducing several simple subdivisions and taking their cross-partition. Thus, courses in college may be partitioned according to subject, level of advancement, number of students, hours per week, interests, etc. For each of the following subjects, introduce five or more partitions. How many cells are there in the complete classification (cross-partition) in each case?

(a) Detective stories.

(b) Diseases.

16. Assume that in a given generation x men are Republicans and y are Democrats and that the total number of men remains at 50 million in each generation. Assume further that it is known that 20 per cent of the sons of Republicans are Democrats and 30 per cent of the sons of Democrats are Republicans in any generation. What conditions must x and y satisfy if there are to be the same number of Republicans in each generation? Is there more than one choice for x and y? If not, what must x and y be?

[Partial Ans. There are 30 million Republicans.]

17. Assume that there are 30 million Democratic and 20 million Republican men in the country. It is known that p per cent of the sons of Democrats are Republicans, and q per cent of the sons of Republicans are Democrats. If the total number of men remains 50 million, what condition must p and q satisfy so that the number in each party remains the same? Is there more than one choice of p and q?

3.2 The number of elements in a set

The remainder of this chapter will be devoted to certain counting problems. For any set X we shall denote by n(X) the number of elements in the set.

Suppose we know the number of elements in certain given sets and wish to know the number in other sets related to these by the operations of unions, intersections, and complementations. As an example, consider the following problem.

Suppose that we are told that 100 students take mathematics, and 150 students take economics. Can we then tell how many take either mathematics or economics? The answer is no, since clearly we would also need to know how many students take both courses. If we know that no student takes both courses, i.e., if we know that the two sets of students are disjoint, then the answer would be the sum of the two numbers or 250 students.

Figure 3.1: ♦

In general, if we are given disjoint sets A and B, then it is true that $n(A \cup B) = n(A) + n(B)$. Suppose now that A and B are not disjoint as shown in Figure 3.1. We can divide the set A into disjoint sets $A \cap \tilde{B}$ and $A \cap B$. Similarly, we can divide B into the disjoint sets $\tilde{A} \cap B$ and $A \cap B$. Thus,

$$n(A) = n(A \cap \tilde{B}) + n(A \cap B),$$

$$n(B) = n(\tilde{A} \cap B) + n(A \cap B).$$

Adding these two equations, we obtain

$$n(A) + n(B) = n(A \cap \tilde{B}) + n(\tilde{A} \cap B) + 2n(A \cap B).$$

Since the sets $A \cap \tilde{B}$, $\tilde{A} \cap B$, and $A \cap B$ are disjoint sets whose union is $A \cup B$, the right-hand side equals $n(A \cup B) + n(A \cap B)$, and we obtain the formula

$$n(A \cup B) = n(A) + n(B) - n(A \cap B),$$

which is valid for any two sets A and B.
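The formula is easy to check on concrete sets. In the Python sketch below the enrollment lists are invented for illustration (100 mathematics students and 150 economics students, with an assumed overlap of 30), and `union_count` is a hypothetical helper name.

```python
def union_count(a, b):
    """n(A ∪ B) = n(A) + n(B) - n(A ∩ B)."""
    return len(a) + len(b) - len(a & b)


# Invented data: students are numbered; numbers 70..99 take both courses.
mathematics = set(range(0, 100))     # 100 students
economics = set(range(70, 220))      # 150 students

print(len(mathematics & economics))          # 30
print(union_count(mathematics, economics))   # 220
print(len(mathematics | economics))          # 220
```

With no overlap the correction term is zero and the count is simply 100 + 150 = 250, as in the text.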

Example 3.4 Let p and q be statements relative to a set U of logical possibilities. Denote by P and Q the truth sets of these statements. The truth set of p ∨ q is P ∪ Q and the truth set of p ∧ q is P ∩ Q. Thus the above formula enables us to find the number of cases where p ∨ q is true if we know the number of cases for which p, q, and p ∧ q are true. ♦


Figure 3.2: ♦

Example 3.5 More than two sets. It is possible to derive formulas for the number of elements in a set which is the union of more than two sets (see Exercise 6), but usually it is easier to work with Venn diagrams. For example, suppose that the registrar of a school reports the following statistics about a group of 30 students: 19 take mathematics. 17 take music. 11 take history. 12 take mathematics and music. 7 take history and mathematics. 5 take music and history. 2 take mathematics, history, and music. We draw the Venn diagram in Figure 3.2 and fill in the numbers for the number of elements in each subset working from the bottom of our list to the top. That is, since 2 students take all three courses, and 5 take music and history, then 3 take history and music but not mathematics, etc. Once the diagram is completed we can read off the number which take any combination of the courses. For example, the number which take history but not mathematics is 3 + 1 = 4. ♦
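The bottom-to-top filling of the Venn diagram in Example 3.5 can be written out as arithmetic. The Python sketch below is our own illustration; the function name `venn_regions` and the region labels are hypothetical.

```python
def venn_regions(total, maths, music, history,
                 maths_music, history_maths, music_history, all_three):
    """Fill in the eight regions of a three-set Venn diagram, innermost first."""
    regions = {"maths, music and history": all_three}
    # Students in exactly two of the courses.
    regions["music and history only"] = music_history - all_three
    regions["history and maths only"] = history_maths - all_three
    regions["maths and music only"] = maths_music - all_three
    # Students in exactly one course: remove everything already placed.
    regions["history only"] = (history - all_three
                               - regions["music and history only"]
                               - regions["history and maths only"])
    regions["music only"] = (music - all_three
                             - regions["music and history only"]
                             - regions["maths and music only"])
    regions["maths only"] = (maths - all_three
                             - regions["history and maths only"]
                             - regions["maths and music only"])
    # Whoever is left takes none of the three courses.
    regions["none"] = total - sum(regions.values())
    return regions


regions = venn_regions(30, 19, 17, 11, 12, 7, 5, 2)
# History but not mathematics, as computed in the example:
print(regions["music and history only"] + regions["history only"])   # 4
```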

Example 3.6 Cancer studies. The following reasoning is often found in statistical studies on the effect of smoking on the incidence of lung cancer. Suppose a study has shown that the fraction of smokers among those who have lung cancer is greater than the fraction of smokers among those who do not have lung cancer. It is then asserted that the fraction of smokers who have lung cancer is greater than the fraction of nonsmokers who have lung cancer. Let us examine this argument.

Figure 3.3: ♦

Let S be the set of all smokers in the population, and C be the set of all people with lung cancer. Let $a = n(S \cap C)$, $b = n(\tilde{S} \cap C)$, $c = n(S \cap \tilde{C})$ and $d = n(\tilde{S} \cap \tilde{C})$, as indicated in Figure 3.3. The fractions in which we are interested are

$$p_1 = \frac{a}{a+b}, \quad p_2 = \frac{c}{c+d}, \quad p_3 = \frac{a}{a+c}, \quad p_4 = \frac{b}{b+d},$$

where $p_1$ is the fraction of those with lung cancer that smoke, $p_2$ the fraction of those without lung cancer that smoke, $p_3$ the fraction of smokers who have lung cancer, and $p_4$ the fraction of nonsmokers who have cancer.

The argument above states that if $p_1 > p_2$, then $p_3 > p_4$. The hypothesis,

$$\frac{a}{a+b} > \frac{c}{c+d},$$

is true if and only if ac + ad > ac + bc, that is, if and only if ad > bc. The conclusion

$$\frac{a}{a+c} > \frac{b}{b+d}$$

is true if and only if ab + ad > ab + bc, that is, if and only if ad > bc. Thus the two statements $p_1 > p_2$ and $p_3 > p_4$ are in fact equivalent statements, so that the argument is valid. ♦
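The equivalence can also be checked numerically with exact fractions. The Python sketch below is only an illustration of ours: it tries every choice of positive counts a, b, c, d up to a small bound, which is a spot check and not a proof.

```python
from fractions import Fraction
from itertools import product


def comparisons_agree(a, b, c, d):
    """Check that p1 > p2, p3 > p4, and ad > bc are all true or all false."""
    p1 = Fraction(a, a + b)   # smokers among those with lung cancer
    p2 = Fraction(c, c + d)   # smokers among those without lung cancer
    p3 = Fraction(a, a + c)   # lung cancer among smokers
    p4 = Fraction(b, b + d)   # lung cancer among nonsmokers
    return (p1 > p2) == (p3 > p4) == (a * d > b * c)


# Every combination of counts from 1 to 7 (positive, so no denominator is zero).
print(all(comparisons_agree(a, b, c, d)
          for a, b, c, d in product(range(1, 8), repeat=4)))   # True
```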

Exercises

1. In Example 3.5, find

(a) The number of students that take mathematics but do not take history.


[Ans. 12.]

(b) The number that take exactly two of the three courses.

(c) The number that take one or none of the courses.

2. In a chemistry class there are 20 students, and in a psychology class there are 30 students. Find the number in either the psychology class or the chemistry class if

(a) The two classes meet at the same hour.

[Ans. 50.]

(b) The two classes meet at different hours and 10 students are enrolled in both courses.

[Ans. 40.]

3. If the truth set of a statement p has 10 elements, and the truth set of a statement q has 20 elements, find the number of elements in the truth set of p ∨ q if

(a) p and q are inconsistent.

(b) p and q are consistent and there are two elements in the truth set of p ∧ q.

4. If p is a statement that is true in ten cases, and q is a statement that is true in five cases, find the number of cases in which both p and q are true if p ∨ q is true in ten cases. What relation holds between p and q?

5. Assume that the incidence of lung cancer is 16 per 100,000, and that it is estimated that 75 per cent of those with lung cancer smoke and 60 per cent of those without lung cancer smoke. (These numbers are fictitious.) Estimate the fraction of smokers with lung cancer, and the fraction of nonsmokers with lung cancer.

[Ans. 20 and 10 per 100,000.]

6. Let A, B, and C be any three sets of a universal set U. Draw a Venn diagram and show that

$$n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(A \cap B) - n(A \cap C) - n(B \cap C) + n(A \cap B \cap C).$$


7. Analyze the data given below and draw a Venn diagram like that in Figure 3.2. Assuming that every student in the school takes one of the courses, find the total number of students in the school.

(a) First case:

28 students take English.
23 students take French.
23 students take German.
12 students take English and French.
11 students take English and German.
8 students take French and German.
5 students take all three courses.

(b) Second case:

36 students take English.
23 students take French.
13 students take German.
6 students take English and French.
11 students take English and German.
4 students take French and German.
1 student takes all three courses.

(Comment on this result.)

8. Suppose that in a survey concerning the reading habits of students it is found that:

60 per cent read magazine A; 50 per cent read magazine B; 50 per cent read magazine C; 30 per cent read magazines A and B; 20 per cent read magazines B and C; 30 per cent read magazines A and C; 10 per cent read all three magazines.

(a) What per cent read exactly two magazines?

[Ans. 50.]

(b) What per cent do not read any of the magazines?

[Ans. 10.]

9. If p and q are equivalent statements and n(P) = 10, what is n(P ∪ Q)?

10. If p implies q, prove that $n(P \cup \tilde{Q}) = n(P) + n(\tilde{Q})$.


11. On a transcontinental airliner, there are 9 boys, 5 American children, 9 men, 7 foreign boys, 14 Americans, 6 American males, and 7 foreign females. What is the number of people on the plane?

[Ans. 33.]


12. Prove that $n(\tilde{A}) = n(U) - n(A)$.

13. Show that $n(\tilde{A} \cap \tilde{B}) = n(\widetilde{A \cup B}) = n(U) - n(A \cup B)$.

14. In a collection of baseball players there are ten who can play only outfield positions, five who can play only infield positions but cannot pitch, three who can pitch, four who can play any position but pitcher, and two who can play any position at all. How many players are there in all?

[Ans. 22.]

15. Ivyten College awarded 38 varsity letters in football, 15 in basketball, and 20 in baseball. If these letters went to a total of 58 men and only three of these men lettered in all three sports, how many men received letters in exactly two of the three sports?

[Ans. 9.]

16. Let U be a finite set. For any two sets A and B define the “distance” from A to B to be $d(A, B) = n(A \cap \tilde{B}) + n(\tilde{A} \cap B)$.

(a) Show that d(A,B) ≥ 0. When is d(A,B) = 0?

(b) If A, B, and C are nonintersecting sets, show that

d(A,C) ≤ d(A,B) + d(B,C).

(c) Show that for any three sets A, B, and C

d(A,C) ≤ d(A,B) + d(B,C).

Figure 3.4: ♦

3.3 Permutations

We wish to consider here the number of ways in which a group of n different objects can be arranged. A listing of n different objects in a certain order is called a permutation of the n objects. We consider first the case of three objects, a, b, and c. We can exhibit all possible permutations of these three objects as paths of a tree, as shown in Figure 3.4. Each path exhibits a possible permutation, and there are six such paths. We could also list these permutations as follows: abc, bca, acb, cab, bac, cba. If we were to construct a similar tree for n objects, we would find that the number of paths could be found by multiplying together the numbers n, n − 1, n − 2, continuing down to the number 1. The number obtained in this way occurs so often that we give it a symbol, namely n!, which is read “n factorial”. Thus, for example, 3! = 3 · 2 · 1 = 6, 4! = 4 · 3 · 2 · 1 = 24, etc. For reasons which will be clear later, we define 0! = 1. Thus we can say there are

n! different permutations of n distinct objects.

Example 3.7 In the game of Scrabble, suppose there are seven lettered blocks from which we try to form a seven-letter word. If the seven letters are all different, we must consider 7! = 5040 different orders. ♦

Example 3.8 A quarterback has a sequence of ten plays. Suppose his or her coach instructs him or her to run through the ten-play sequence without repetition. How much freedom is left to the quarterback? He or she may choose any one of 10! = 3,628,800 orders in which to call the plays. ♦
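The counts in this section can be confirmed with Python's standard library: `itertools.permutations` generates the listings and `math.factorial` computes n!. The short sketch below is only an illustration.

```python
from itertools import permutations
from math import factorial

# All orderings of the three objects a, b, c.
listings = ["".join(order) for order in permutations("abc")]
print(listings)        # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
print(len(listings))   # 6

print(factorial(3))    # 6
print(factorial(7))    # 5040
print(factorial(10))   # 3628800
print(factorial(0))    # 1
```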


Example 3.9 How many ways can n people be seated around a circular table? When this question is asked, it is usually understood that two arrangements are different only if at least one person has a different person on the right in the two arrangements. Consider then one person in a fixed position. There are (n − 1)! ways in which the other people may be seated. We have now counted all the arrangements we wish to consider different. ♦
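The count (n − 1)! can be checked by brute force for small n. The Python sketch below, with a hypothetical function name of our own, lists every seating in a row and then rotates each one so that the first person always comes first; seatings that agree after this rotation give everyone the same right-hand neighbor.

```python
from itertools import permutations
from math import factorial


def circular_arrangements(people):
    """Count seatings around a round table, treating rotations as the same."""
    distinct = set()
    for order in permutations(people):
        # Rotate so that the first person named sits in the fixed position.
        start = order.index(people[0])
        distinct.add(order[start:] + order[:start])
    return len(distinct)


print(circular_arrangements("abcde"))   # 24
print(factorial(5 - 1))                  # 24
```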

A general principle. There are many counting problems for which it is not possible to give a simple formula for the number of possible cases. In many of these the only way to find the number of cases is to draw a tree and count them (see Exercise 4). In some problems, the following general principle is useful.

If one thing can be done in exactly r different ways, for each of these a second thing can be done in exactly s different ways, for each of the first two, a third can be done in exactly t ways, etc., then the sequence of things can be done in the product of the numbers of ways in which the individual things can be done, i.e., r · s · t ways.

The validity of the above general principle can be established by thinking of a tree representing all the ways in which the sequence of things can be done. There would be r branches from the starting position. From the ends of each of these r branches there would be s new branches, and from each of these t new branches, etc. The number of paths through the tree would be given by the product r · s · t.

Example 3.10 The number of permutations of n distinct objects is a special case of this principle. If we were to list all the possible permutations, there would be n possibilities for the first, for each of these n − 1 for the second, etc., until we came to the last object, for which there is only one possibility. Thus there are n(n − 1) · · · 1 = n! possibilities in all. ♦

Example 3.11 If there are three roads from city x to city y and two roads from city y to city z, then there are 3 · 2 = 6 ways that a person can drive from city x to city z passing through city y. ♦
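
The road count of Example 3.11 can be reproduced by listing the routes. This is a sketch under a simplifying assumption: the options at each stage do not depend on earlier choices, which is the case that `itertools.product` handles. The road labels and the function name `all_sequences` are hypothetical.

```python
from itertools import product

def all_sequences(*stages):
    """List every way of making one choice at each stage, in order."""
    return list(product(*stages))

# Hypothetical labels for the roads of Example 3.11.
roads_x_to_y = ["r1", "r2", "r3"]
roads_y_to_z = ["s1", "s2"]

routes = all_sequences(roads_x_to_y, roads_y_to_z)
# Number of listed routes next to the product of the stage sizes.
print(len(routes), len(roads_x_to_y) * len(roads_y_to_z))
```

When the available options change with earlier choices (as with permutations), the principle still applies as long as the number of options at each stage is fixed, but the listing has to be done with a tree rather than a plain product.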

Example 3.12 Suppose there are n applicants for a certain job. Three interviewers are asked independently to rank the applicants according to their suitability for the job. It is decided that an applicant will be hired if he or she is ranked first by at least two of the three interviewers. What fraction of the possible reports would lead to the acceptance of some candidate? We shall solve this problem by finding the fraction of the reports which do not lead to an acceptance and subtracting this answer from 1. Frequently, an indirect attack of this kind on a problem is easier than the direct approach. The total number of reports possible is $(n!)^3$, since each interviewer can rank the applicants in n! different ways. If a particular report does not lead to the acceptance of a candidate, it must be true that each interviewer has put a different applicant in first place. This can be done in n(n − 1)(n − 2) different ways by our general principle. For each possible set of first choices, there are $[(n-1)!]^3$ ways in which the remaining applicants can be ranked by the interviewers. Thus the number of reports which do not lead to acceptance is $n(n-1)(n-2)[(n-1)!]^3$. Dividing this number by $(n!)^3$ we obtain

\[ \frac{(n-1)(n-2)}{n^2} \]

as the fraction of reports which fail to accept a candidate. The fraction which leads to acceptance is found by subtracting this fraction from 1, which gives

\[ \frac{3n-2}{n^2}. \]

For the case of three applicants, we see that 7/9 of the possibilities lead to acceptance. Here the procedure might be criticized on the grounds that even if the interviewers are completely ineffective and are essentially guessing there is a good chance that a candidate will be accepted on the basis of the reports. For n equal to ten, the fraction of acceptances is only .28, so that it is possible to attach more significance to the interviewers’ ratings, if they reach a decision. ♦
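
For small n the fraction in Example 3.12 can be confirmed by listing every report. The sketch below is an illustrative check only (the name `acceptance_fraction` is hypothetical): it enumerates all triples of rankings and counts those in which some applicant is placed first by at least two interviewers, then prints the result beside (3n − 2)/n².

```python
from fractions import Fraction
from itertools import permutations, product

def acceptance_fraction(n):
    """Fraction of reports in which some applicant is first for at least two of three interviewers."""
    rankings = list(permutations(range(n)))
    accepted = 0
    total = 0
    for report in product(rankings, repeat=3):
        total += 1
        firsts = [ranking[0] for ranking in report]
        # With three interviewers, acceptance happens exactly when the
        # three first choices are not all different.
        if len(set(firsts)) < 3:
            accepted += 1
    return Fraction(accepted, total)

for n in (3, 4):
    print(n, acceptance_fraction(n), Fraction(3 * n - 2, n * n))
```

The enumeration visits $(n!)^3$ reports, so it is feasible only for very small n; the closed formula is what makes n = 10 tractable.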

Exercises

1. In how many ways can five people be lined up in a row for a group picture? In how many ways if it is desired to have three in the front row and two in the back row?

[Ans. 120; 120.]

2. Assuming that a baseball team is determined by the players and the position each is playing, how many teams can be made from 13 players if

(a) Each player can play any position?

(b) Two of the players can be used only as pitchers?

3. Grades of A, B, C, D, or F are assigned to a class of five students.

(a) How many ways may this be done, if no two students receive the same grade?

[Ans. 120.]

(b) Two of the students are named Smith and Jones. How many ways can grades be assigned if no two students receive the same grade and Smith must receive a higher grade than Jones?

[Ans. 60.]

(c) How many ways may grades be assigned if only grades of A and F are assigned?

[Ans. 32.]

4. A certain club wishes to admit seven new members, four of whom are Republicans and three of whom are Democrats. Suppose the club wishes to admit them one at a time and in such a way that there are always more Republicans among the new members than there are Democrats. Draw a tree to represent all possible ways in which new members can be admitted, distinguishing members by their party only.

5. There are three different routes connecting city A to city B. How many ways can a round trip be made from A to B and back? How many ways if it is desired to take a different route on the way back?

[Ans. 9; 6.]

6. How many different ways can a ten-question multiple-choice exam be answered if each question has three possibilities, a, b, and c? How many if no two consecutive answers are the same?

7. Modify Example 3.12 so that, to be accepted, an applicant must be first in two of the interviewers’ ratings and must be either first or second in the third interviewer’s rating. What fraction of the possible reports lead to acceptance in the case of three applicants? In the case of n?

[Ans. 4/9; $4/n^2$.]

8. A town has 1240 registered Republicans. It is desired to contact each of these by phone to announce a meeting. A committee of r people devise a method of phoning s people each and asking each of these to call t new people. If the method is such that no person is called twice,

(a) How many people know about the meeting after the phoning?

(b) If the committee has 40 members and it is desired that all 1240 Republicans be informed of the meeting and that s and t should be the same, what should they be?

9. In the Scrabble example (Example 3.7), suppose the letters are Q, Q, U, F, F, F, A. How many distinguishable arrangements are there for these seven letters?

[Ans. 420.]

10. How many different necklaces can be made

(a) If seven different sized beads are available?

[Ans. 360.]

(b) If six of the beads are the same size and one is larger?

[Ans. 1.]

(c) If the beads are of two sizes, five of the smaller size and two of the larger size?

[Ans. 3.]

11. Prove that two people in Columbus, Ohio, have the same initials.

12. Find the number of distinguishable arrangements for each of the following collections of five symbols. (The same letters with different subscripts indicate distinguishable objects.)

(a) $A_1, A_2, B_1, B_2, B_3$.

[Ans. 120.]

(b) $A, A, B_1, B_2, B_3$.

[Ans. 60.]

(c) $A, A, B, B, B$.

13. Show that the number of distinguishable arrangements possible for n objects, $n_1$ of type 1, $n_2$ of type 2, etc., for r different types is

\[ \frac{n!}{n_1!\, n_2! \cdots n_r!}. \]

14. (a) How many four digit numbers can be formed from the digits 1, 2, 3, 4, using each digit only once?

(b) How many of these numbers are less than 3000?

[Ans. 12.]

15. How many license plates can be made if they are to contain five symbols, the first two being letters and the last three digits?

16. How many signals can a ship show if it has seven flags and a signal consists of five flags hoisted vertically on a rope?

[Ans. 2520.]

17. We must arrange three green, two red, and four blue books on a single shelf.

(a) In how many ways can this be done if there are no restrictions?

(b) In how many ways if books of the same color must be grouped together?

(c) In how many ways if, in addition to the restriction in 17b, the red books must be to the left of the blue books?

(d) In how many ways if, in addition to the restrictions in 17b and 17c, the red and blue books must not be next to each other?

[Ans. 288.]


18. A youngster has three shades of nail polish with which to paint his or her fingernails. In how many ways can he or she do this (each nail being one solid color) if there are no more than two different shades on each hand?

[Ans. 8649.]

3.4 Counting partitions

Up to now we have not had occasion to consider the partitions [{1, 2}, {3, 4}] and [{3, 4}, {1, 2}] of the integers from 1 to 4 as being different partitions. Here it will be convenient to do so, and to indicate this distinction we shall use the term ordered partition. An ordered partition with r cells is a partition with r cells (some of which may be empty), with a particular order specified for the cells.

We are interested in counting the number of possible ordered partitions with r cells that can be formed from a set of n objects having a prescribed number of elements in each cell. We consider first a special case to illustrate the general procedure.

Suppose that we have eight students, A, B, C, D, E, F, G, and H, and we wish to assign these to three rooms, Room 1, which is a triple room, Room 2, a triple room, and Room 3, a double room. In how many different ways can the assignment be made? One way to assign the students is to put them in the rooms in the order in which they arrive, putting the first three in Room 1, the next three in Room 2, and the last two in Room 3. There are 8! ways in which the students can arrive, but not all of these lead to different assignments. We can represent the assignment corresponding to a particular order of arrival as follows,

|BCA|DFE|HG|.

In this case, B, C, and A are assigned to Room 1, D, F, and E to Room 2, and H and G to Room 3. Notice that orders of arrival which simply change the order within the rooms lead to the same assignment. The number of different orders of arrival which lead to the same assignment as the one above is the number of arrangements which differ from the given one only in that the arrangement within the rooms is different. There are 3! · 3! · 2! such orders of arrival, since we can arrange the three in Room 1 in 3! different ways, for each of these the ones in Room 2 in 3! different ways, and for each of these, the ones in Room 3 in 2! ways. Thus we can divide the 8! different orders of arrival into groups of 3! · 3! · 2! different orders such that all the orders of arrival in a single group lead to the same room assignment. Since there are 3! · 3! · 2! elements in each group and 8! elements altogether, there are $\frac{8!}{3!\,3!\,2!}$ groups, or this many different room assignments.
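
The grouping argument can be imitated directly. The sketch below is an illustration only (the name `room_assignments` is hypothetical): it runs through all 8! orders of arrival, fills the rooms in order, forgets the order inside each room, and counts the distinct assignments that remain.

```python
from itertools import permutations
from math import factorial

def room_assignments(students, sizes):
    """Collect distinct assignments obtained by filling rooms in order of arrival."""
    assignments = set()
    for arrival in permutations(students):
        rooms, start = [], 0
        for size in sizes:
            # Order inside a room does not matter, so store each room as a frozenset.
            rooms.append(frozenset(arrival[start:start + size]))
            start += size
        assignments.add(tuple(rooms))
    return assignments

count = len(room_assignments("ABCDEFGH", (3, 3, 2)))
# Brute-force count next to 8!/(3! 3! 2!).
print(count, factorial(8) // (factorial(3) * factorial(3) * factorial(2)))
```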

The same argument could be carried out for n elements and r rooms, with $n_1$ in the first, $n_2$ in the second, etc. This would lead to the following result. Let $n_1, n_2, \ldots, n_r$ be nonnegative integers with

\[ n_1 + n_2 + \cdots + n_r = n. \]

Then:

The number of ordered partitions with r cells $[A_1, A_2, \ldots, A_r]$ of a set of n elements with $n_1$ in the first cell, $n_2$ in the second, etc. is

\[ \frac{n!}{n_1!\, n_2! \cdots n_r!}. \]

We shall denote this number by the symbol

\[ \binom{n}{n_1, n_2, \ldots, n_r}. \]

Note that this symbol is defined only if $n_1 + n_2 + \cdots + n_r = n$.

The special case of two cells is particularly important. Here the problem can be stated equivalently as the problem of finding the number of subsets with r elements that can be chosen from a set of n elements. This is true because any choice defines a partition $[A, \tilde{A}]$, where A is the set of elements chosen and $\tilde{A}$ is the set of remaining elements.

The number of such partitions is $\frac{n!}{r!\,(n-r)!}$, and hence this is also the number of subsets with r elements. Our notation $\binom{n}{r,\,n-r}$ for this case is shortened to $\binom{n}{r}$.

Notice that $\binom{n}{n-r}$ is the number of subsets with n − r elements which can be chosen from n, which is the number of partitions of the form $[\tilde{A}, A]$ above. Clearly, this is the same as the number of $[A, \tilde{A}]$ partitions. Hence $\binom{n}{r} = \binom{n}{n-r}$.
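
The counting formula for ordered partitions translates into a few lines of code. The following is a minimal sketch, with `multinomial` a hypothetical helper name; it refuses cell sizes that do not add up to n, in line with the remark that the symbol is defined only in that case, and it checks the two-cell symmetry against Python's built-in `math.comb`.

```python
from math import comb, factorial

def multinomial(n, cells):
    """Number of ordered partitions of n elements with the given cell sizes."""
    if sum(cells) != n:
        raise ValueError("cell sizes must add up to n")
    count = factorial(n)
    for size in cells:
        # Each intermediate quotient is itself an integer.
        count //= factorial(size)
    return count

print(multinomial(8, [3, 3, 2]))   # two triple rooms and one double room
# Two-cell case: choosing 5 of 7 is the same as leaving out 2 of 7.
print(multinomial(7, [5, 2]), comb(7, 5), comb(7, 2))
```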

Example 3.13 A college has scheduled six football games during a season. How many ways can the season end in two wins, three losses, and one tie? From each possible outcome of the season, we form a partition, with three cells, of the opposing teams. In the first cell we put the teams which our college defeats, in the second the teams to which our college loses, and in the third cell the teams which our college ties. There are $\binom{6}{2,3,1} = 60$ such partitions, and hence 60 ways in which the season can end with two wins, three losses, and one tie. ♦

Example 3.14 In the game of bridge, the hands N, E, S, and W determine a partition of the 52 cards having four cells, each with 13 elements. Thus there are $\frac{52!}{13!\,13!\,13!\,13!}$ different bridge deals. This number is about $5.3645 \cdot 10^{28}$, or approximately 54 billion billion billion deals. ♦
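
Because Python integers have unlimited precision, the number of bridge deals can be evaluated exactly. This short sketch simply computes the quotient from Example 3.14 and also prints it in scientific notation for comparison with the approximation quoted there.

```python
from math import factorial

# 52! / (13! 13! 13! 13!), computed with exact integer arithmetic.
deals = factorial(52) // factorial(13) ** 4
print(deals)
# Rounded form for comparison with the figure in the text.
print(f"{deals:.4e}")
```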

Example 3.15 The following example will be important in probability theory, which we take up in the next chapter. If a coin is thrown six times, there are $2^6$ possibilities for the outcome of the six throws, since each throw can result in either a head or a tail. How many of these possibilities result in four heads and two tails? Each sequence of six heads and tails determines a two-cell partition of the numbers from one to six as follows: In the first cell put the numbers corresponding to throws which resulted in a head, and in the second put the numbers corresponding to throws which resulted in tails. We require that the first cell should contain four elements and the second two elements. Hence the number of the $2^6$ possibilities which lead to four heads and two tails is the number of two-cell partitions of six elements which have four elements in the first cell and two in the second cell. The answer is $\binom{6}{4} = 15$. For n throws of a coin, a similar analysis shows that there are $\binom{n}{r}$ different sequences of H’s and T’s of length n which have exactly r heads and n − r tails. ♦
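
The coin-tossing count can be checked by listing the sequences. The sketch below (with the hypothetical name `sequences_with_heads`) filters all $2^n$ sequences of H and T and compares the number kept with the binomial coefficient.

```python
from itertools import product
from math import comb

def sequences_with_heads(n, r):
    """List the length-n sequences of 'H' and 'T' that contain exactly r heads."""
    return ["".join(s) for s in product("HT", repeat=n) if s.count("H") == r]

found = sequences_with_heads(6, 4)
# Number of sequences found next to the binomial coefficient for n = 6, r = 4.
print(len(found), comb(6, 4))
print(found[:3])
```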

Exercises

1. Compute the following numbers.

(a) $\binom{7}{5}$

[Ans. 21.]

(b) $\binom{3}{2}$

(c) $\binom{7}{2}$

(d) $\binom{250}{249}$

[Ans. 250.]

(e) $\binom{5}{0}$

(f) $\binom{5}{1,2,2}$

(g) $\binom{4}{2,0,2}$

[Ans. 6.]

(h) $\binom{2}{1,1,1}$

2. Give an interpretation for $\binom{n}{0}$ and also for $\binom{n}{n}$. Can you now give a reason for making 0! = 1?

3. How many ways can nine students be assigned to three triple rooms? How many ways if one particular pair of students refuse to room together?

[Ans. 1680; 1260.]

4. A group of seven boys and ten girls attends a dance. If all the boys dance in a particular dance, how many choices are there for the girls who dance? For the girls who do not dance? How many choices are there for the girls who do not dance, if three of the girls are sure to be asked to dance?

5. Suppose that a course is given at three different hours. If fifteen students sign up for the course,

(a) How many possibilities are there for the ways the students could distribute themselves in the classes?

[Ans. $3^{15}$.]

(b) How many of the ways would give the same number of students in each class?

[Ans. 756,756.]

6. A college professor anticipates teaching the same course for the next 35 years. So as not to become bored with his or her jokes, he or she decides to tell exactly three jokes every year and in no two years to tell exactly the same three jokes. What is the minimum number of jokes that will accomplish this? What is the minimum number if he or she determines never to tell the same joke twice?

7. How many ways can you answer a ten-question true-false exam, marking the same number of answers true as you do false? How many if it is desired to have no two consecutive answers the same?

8. From three Republicans and three Democrats, find the number of committees of three which can be formed

(a) With no restrictions.

[Ans. 20.]

(b) With three Republicans and no Democrats.

[Ans. 1.]

(c) With two Republicans and one Democrat.

[Ans. 9.]

(d) With one Republican and two Democrats.

[Ans. 9.]

(e) With no Republicans and three Democrats.

[Ans. 1.]

What is the relation between your answers to the five parts of this question?

9. Exercise 8 suggests that the following should be true.

\[ \binom{2n}{n} = \binom{n}{0}\binom{n}{n} + \binom{n}{1}\binom{n}{n-1} + \binom{n}{2}\binom{n}{n-2} + \cdots + \binom{n}{n}\binom{n}{0}. \]

Show that it is true.

10. A student needs to choose two electives from six possible courses.

(a) How many ways can he or she make his or her choice?

[Ans. 15.]

(b) How many ways can he or she choose if two of the courses meet at the same time?

[Ans. 14.]

(c) How many ways can he or she choose if two of the courses meet at 10 o’clock, two at 11 o’clock, and there are no other conflicts among the courses?

[Ans. 13.]


11. Consider a town in which there are three plumbers, A, B, and C. On a certain day six residents of the town telephone for a plumber. If each resident selects a plumber from the telephone directory, in how many ways can it happen that

(a) Three residents call A, two residents call B, and one resident calls C?

[Ans. 60.]

(b) The distribution of calls to the plumbers is three, two, and one?

[Ans. 360.]

12. Two committees (a labor relations committee and a quality control committee) are to be selected from a board of nine men. The only rules are (1) the two committees must have no members in common, and (2) each committee must have at least four men. In how many ways can the two committees be appointed?

13. A group of ten people is to be divided into three committees of three, three, and six members, respectively. The chair of the group is to serve on all three committees and is the only member of the group who serves on more than one committee. In how many ways can the committee assignments be made?

[Ans. 756.]

14. In a class of 20 students, grades of A, B, C, D, and F are to be assigned. Omit arithmetic details in answering the following.

(a) In how many ways can this be done if there are no restrictions?

[Ans. $5^{20}$.]

(b) In how many ways can this be done if the grades are assigned as follows: 2 A’s, 3 B’s, 10 C’s, 3 D’s, and 2 F’s?

(c) In how many ways can this be done if the following rules are to be satisfied: exactly 10 C’s; the same number of A’s as F’s; the same number of B’s as D’s; always more B’s than A’s?

[Ans. $\binom{20}{5,10,5} + \binom{20}{1,4,10,4,1} + \binom{20}{2,3,10,3,2}$.]

15. Establish the identity

\[ \binom{n}{r}\binom{r}{k} = \binom{n}{k}\binom{n-k}{r-k} \]

for n ≥ r ≥ k in two ways, as follows:

(a) Replace each expression by a ratio of factorials and show that the two sides are equal.

(b) Consider the following problem: From a set of n people a committee of r is to be chosen, and from these r people a steering subcommittee of k people is to be selected. Show that the two sides of the identity give two different ways of counting the possibilities for this problem.

3.5 Some properties of the numbers $\binom{n}{j}$

The numbers $\binom{n}{j}$ introduced in the previous section will play an important role in our future work. We give here some of the more important properties of these numbers.

A convenient way to obtain these numbers is given by the famous Pascal triangle, shown in Figure 3.5. To obtain the triangle we first write the 1’s down the sides. Any of the other numbers in the triangle has the property that it is the sum of the two adjacent numbers in the row just above. Thus the next row in the triangle is 1, 6, 15, 20, 15, 6, 1. To find the number $\binom{n}{j}$ we look in the row corresponding to the number n and see where the diagonal line corresponding to the value of j intersects this row. For example, $\binom{4}{2} = 6$ is in the row marked n = 4 and on the diagonal marked j = 2.

The property of the numbers $\binom{n}{j}$ upon which the triangle is based is

\[ \binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}. \]

Figure 3.5: ♦

This fact can be verified directly (see Exercise 6), but the following argument is interesting in itself. The number $\binom{n+1}{j}$ is the number of subsets with j elements that can be formed from a set of n + 1 elements. Select one of the n + 1 elements, x. The $\binom{n+1}{j}$ subsets can be partitioned into those that contain x and those that do not. The latter are subsets of j elements formed from n objects, and hence there are $\binom{n}{j}$ such subsets. The former are constructed by adding x to a subset of j − 1 elements formed from n elements, and hence there are $\binom{n}{j-1}$ of them. Thus

\[ \binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}. \]
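
The addition rule is all that is needed to generate the triangle row by row. The following is a small sketch (the function name `pascal_rows` is an illustrative choice) that starts from the single entry 1 in row n = 0 and builds each later row from the one above it.

```python
def pascal_rows(n_max):
    """Return rows 0..n_max of the Pascal triangle, built from the addition rule."""
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        # The sides are 1; each interior entry is the sum of the two
        # adjacent entries in the row just above.
        rows.append([1] + [prev[j - 1] + prev[j] for j in range(1, len(prev))] + [1])
    return rows

for n, row in enumerate(pascal_rows(6)):
    print(n, row)
```

Entry j of row n is the number of j-element subsets of an n-element set, so `pascal_rows(6)[4][2]` is the value found in row n = 4 on the diagonal j = 2.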

If we look again at the Pascal triangle, we observe that the numbers in a given row increase for a while, and then decrease. We can prove this fact in general by considering the ratio of two successive terms,

\[ \frac{\binom{n}{j+1}}{\binom{n}{j}} = \frac{n!}{(j+1)!\,(n-j-1)!} \cdot \frac{j!\,(n-j)!}{n!} = \frac{n-j}{j+1}. \]

The numbers increase as long as the ratio is greater than 1, i.e., n − j > j + 1. This means that $j < \frac{1}{2}(n-1)$. We must distinguish the case of an even n from an odd n. For example, if n = 10, j must be less than $\frac{1}{2}(10-1) = 4.5$. Hence for j up to 4 the terms are increasing, from j = 5 on, the terms decrease. For n = 11, j must be less than $\frac{1}{2}(11-1) = 5$. For j = 5, (11 − j)/(j + 1) = 1. Hence, up to j = 5 the terms increase, then $\binom{11}{5} = \binom{11}{6}$, and then the terms decrease.
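
The two cases n = 10 and n = 11 can be inspected directly. The sketch below is only a numerical illustration of the ratio argument (the name `row_shape` is hypothetical): for a given row it reports the values of j at which the next term is larger, and the positions at which the row attains its maximum.

```python
from math import comb

def row_shape(n):
    """Return (rising, peak): the j with C(n, j+1) > C(n, j), and the positions of the row maximum."""
    row = [comb(n, j) for j in range(n + 1)]
    rising = [j for j in range(n) if row[j + 1] > row[j]]
    top = max(row)
    peak = [j for j, value in enumerate(row) if value == top]
    return rising, peak

for n in (10, 11):
    print(n, row_shape(n))
```

For even n the maximum occurs at a single position, while for odd n it is shared by two neighboring positions.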

Exercises

1. Extend the Pascal triangle to n = 16. Save the result for later use.

2. Prove that

\[ \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n, \]

using the fact that a set with n elements has $2^n$ subsets.

3. For a set of ten elements prove that there are more subsets with five elements than there are subsets with any other fixed number of elements.

4. Using the fact that $\binom{n}{r+1} = \frac{n-r}{r+1} \cdot \binom{n}{r}$, compute $\binom{30}{s}$ for s = 1, 2, 3, 4 from the fact that $\binom{30}{0} = 1$.

[Ans. 30; 435; 4060; 27,405.]

5. There are $\binom{52}{13}$ different possible bridge hands. Assume that a list is made showing all these hands, and that in this list the first card in every hand is crossed out. This leaves us with a list of twelve-card hands. Prove that at least two hands in the latter list contain exactly the same cards.

6. Prove that

\[ \binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j} \]

using only the fact that

\[ \binom{n}{j} = \frac{n!}{j!\,(n-j)!}. \]

7. Construct a triangle in the same way that the Pascal triangle was constructed, except that whenever you add two numbers, use parity addition (table (a) in Figure 2.12). Construct the triangle for 16 rows. What does this triangle tell you about the numbers in the Pascal triangle? Use this result to check your triangle in Exercise 1.

8. In the triangle obtained in Exercise 7, what property do the rows 1, 2, 4, 8, and 16 have in common? What does this say about the numbers in the corresponding rows of the Pascal triangle? What would you predict for the terms in the 32nd row of the Pascal triangle?

9. For the following table state how one row is obtained from the preceding row and give the relation of this table to the Pascal triangle.

1  1   1   1    1    1    1
1  2   3   4    5    6    7
1  3   6  10   15   21   28
1  4  10  20   35   56   84
1  5  15  35   70  126  210
1  6  21  56  126  252  462
1  7  28  84  210  462  924

10. Referring to the table in Exercise 9, number the columns starting with 0, 1, 2, . . . and number the rows starting with 1, 2, 3, . . .. Let f(n, r) be the element in the nth column and the rth row. The table was constructed by the rule

\[ f(n, r) = f(n-1, r) + f(n, r-1) \]

for n > 0 and r > 1, and f(n, 1) = f(0, r) = 1 for all n and r. Verify that

\[ f(n, r) = \binom{n+r-1}{n} \]

satisfies these conditions and is in fact the only choice for f(n, r) which will satisfy the conditions.

11. Consider a set $\{a_1, a_2, a_3\}$ of three objects which cannot be distinguished from one another. Then the ordered partitions with two cells which could be distinguished are: $[\{a_1, a_2, a_3\}, \emptyset]$, $[\{a_1, a_2\}, \{a_3\}]$, $[\{a_1\}, \{a_2, a_3\}]$, $[\emptyset, \{a_1, a_2, a_3\}]$. List all such ordered partitions with three cells. How many are there?

[Ans. 10.]

12. Let f(n, r) be the number of distinguishable ordered partitions with r cells which can be formed from a set of n indistinguishable objects. Show that f(n, r) satisfies the conditions

\[ f(n, r) = f(n-1, r) + f(n, r-1) \]

for n > 0 and r > 1, and f(n, 1) = f(0, r) = 1 for all n and r. [Hint: Show that f(n, r − 1) is the number of partitions which have the last cell empty and f(n − 1, r) is the number which have at least one element in the last cell.]

13. Using the results of Exercises 10 and 12, show that the number of distinguishable ordered partitions with r cells which can be formed from a set of n indistinguishable objects is

\[ \binom{n+r-1}{n}. \]

14. Assume that a mail carrier has seven letters to put in three mail boxes. How many ways can this be done if the letters are not distinguished?

[Ans. 36.]

15. For n ≥ r ≥ k ≥ s show that the identity

\[ \binom{n}{r}\binom{r}{k}\binom{k}{s} = \binom{n}{s}\binom{n-s}{k-s}\binom{n-k}{r-k} \]

holds by replacing each binomial coefficient by a ratio of factorials.

16. Establish the identity in Exercise 15 in another way by showing that the two sides of the expression are simply two different ways of counting the number of solutions to the following problem: From a set of n people a subset of r is to be chosen; from the set of r people a subset of k is to be chosen; and from the set of k people a subset of s people is to be chosen.

17. Generalize the identity in Exercises 15 and 16 to solve the problem of finding the number of ways of selecting a t-element subset from an s-element subset from a k-element subset from an r-element subset of an n-element set, where n ≥ r ≥ k ≥ s ≥ t.


3.6 Binomial and multinomial theorems

It is sometimes necessary to expand products of the form $(x+y)^3$, $(x+2y+11z)^5$, etc. In this section we shall consider systematic ways of carrying out such expansions.

Consider first the special case $(x+y)^3$. We write this as

\[ (x+y)^3 = (x+y)(x+y)(x+y). \]

To perform the multiplication, we choose either an x or a y from each of the three factors and multiply our choices together; we do this for all possible choices and add the results. We represent a particular set of choices by a two-cell partition of the numbers 1, 2, 3. In the first cell we put the numbers which correspond to factors from which we chose an x. In the second cell we put the numbers which correspond to factors from which we chose a y. For example, the partition [{1, 3}, {2}] corresponds to a choice of x from the first and third factors and y from the second. The product so obtained is $xyx = x^2y$. The coefficient of $x^2y$ in the expansion of $(x+y)^3$ will be the number of partitions which lead to a choice of two x’s and one y, that is, the number of two-cell partitions of three elements with two elements in the first cell and one in the second, which is $\binom{3}{2} = 3$. More generally, the coefficient of the term of the form $x^j y^{3-j}$ will be $\binom{3}{j}$ for j = 0, 1, 2, 3. Thus we can write the desired expansion as

\[ (x+y)^3 = \binom{3}{3}x^3 + \binom{3}{2}x^2y + \binom{3}{1}xy^2 + \binom{3}{0}y^3 = x^3 + 3x^2y + 3xy^2 + y^3. \]

The same argument carried out for the expansion $(x+y)^n$ leads to the binomial theorem of algebra.

Binomial theorem. The expansion of $(x+y)^n$ is given by

\[ \binom{n}{n}x^n + \binom{n}{n-1}x^{n-1}y + \binom{n}{n-2}x^{n-2}y^2 + \cdots + \binom{n}{1}xy^{n-1} + \binom{n}{0}y^n. \]
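
The theorem gives a direct recipe for the numerical coefficients of an expansion. As a minimal sketch, the hypothetical helper `binomial_expansion` below treats the binomial as (p·a + q·b)^n with numbers p and q, and returns the coefficients of $a^n, a^{n-1}b, \ldots, b^n$ in that order.

```python
from math import comb

def binomial_expansion(p, q, n):
    """Coefficients of (p*a + q*b)**n, ordered from a**n down to b**n."""
    # The term containing a**(n-j) * b**j has coefficient C(n, j) * p**(n-j) * q**j.
    return [comb(n, j) * p ** (n - j) * q ** j for j in range(n + 1)]

print(binomial_expansion(1, 1, 3))    # coefficients of (x + y)^3
print(binomial_expansion(1, -2, 3))   # coefficients of (a - 2b)^3
```

Setting p = q = 1 returns a row of the Pascal triangle, and a negative q produces the alternating signs seen when a difference is expanded.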

Example 3.16 Let us find the expansion for $(a-2b)^3$. To fit this into the binomial theorem, we think of x as being a and y as being −2b. Then we have

\[ (a-2b)^3 = a^3 + 3a^2(-2b) + 3a(-2b)^2 + (-2b)^3 = a^3 - 6a^2b + 12ab^2 - 8b^3. \]

♦

We turn now to the problem of expanding the trinomial $(x+y+z)^3$. Again we write

\[ (x+y+z)^3 = (x+y+z)(x+y+z)(x+y+z). \]

This time we choose either an x or y or z from each of the three factors. Our choice is now represented by a three-cell partition of the set of numbers {1, 2, 3}. The first cell has the numbers corresponding to factors from which we choose an x, the second cell the numbers corresponding to factors from which we choose a y, and the third those from which we choose a z. For example, the partition [{1, 3}, ∅, {2}] corresponds to a choice of x from the first and third factors, no y’s, and a z from the second factor. The term obtained is $xzx = x^2z$. The coefficient of the term $x^2z$ in the expansion is thus the number of three-cell partitions with two elements in the first cell, none in the second, and one in the third. There are $\binom{3}{2,0,1} = 3$ such partitions. In general, the coefficient of the term of the form $x^a y^b z^c$ in the expansion of $(x+y+z)^3$ will be

\[ \binom{3}{a, b, c} = \frac{3!}{a!\,b!\,c!}. \]

Finding this way the coefficient for each possible a, b, and c, we obtain

\[ (x+y+z)^3 = x^3 + y^3 + z^3 + 3x^2y + 3xy^2 + 3yz^2 + 3y^2z + 3xz^2 + 3x^2z + 6xyz. \]

The same method can be carried out in general for finding the expansion of $(x_1 + x_2 + \cdots + x_r)^n$. From each factor we choose either an $x_1$, or $x_2$, . . . , or $x_r$, form the product and add these products for all possible choices. We will have $r^n$ products, but many will be equal. A particular choice of one term from each factor determines an r-cell partition of the numbers from 1 to n. In the first cell we put the numbers of the factors from which we choose an $x_1$, in the second cell those from which we choose $x_2$, etc. A particular choice gives us a term of the form $x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$ with $n_1 + n_2 + \cdots + n_r = n$. The corresponding partition has $n_1$ elements in the first cell, $n_2$ in the second, etc. For each such partition we obtain one such term. Hence the number of these terms which we obtain is the number of such partitions, which is

\[ \binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\,n_2! \cdots n_r!}. \]

Thus we have the multinomial theorem.

Multinomial theorem. The expansion of $(x_1 + x_2 + \cdots + x_r)^n$ is found by adding all terms of the form

\[ \binom{n}{n_1, n_2, \ldots, n_r} x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r} \]

where $n_1 + n_2 + \cdots + n_r = n$.
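
The choice-by-choice argument behind the theorem can be carried out mechanically for small cases. The sketch below is illustrative only (both function names are hypothetical): it makes every possible choice of one variable from each of the n factors, tallies the resulting exponent patterns, and allows the tallies to be compared with the multinomial coefficients.

```python
from collections import Counter
from itertools import product
from math import factorial

def expand_by_choices(r, n):
    """Tally, for each exponent tuple, the choices of one variable per factor that produce it."""
    counts = Counter()
    for choice in product(range(r), repeat=n):
        # choice[i] is the index of the variable taken from factor i.
        exponents = tuple(choice.count(v) for v in range(r))
        counts[exponents] += 1
    return counts

def multinomial_coefficient(cells):
    """n! / (n1! n2! ... nr!), where n is the sum of the cell sizes."""
    result = factorial(sum(cells))
    for size in cells:
        result //= factorial(size)
    return result

coeffs = expand_by_choices(3, 3)   # the trinomial (x + y + z)^3
# Coefficients of x^2 z and of xyz, then the number of distinct terms.
print(coeffs[(2, 0, 1)], coeffs[(1, 1, 1)], len(coeffs))
print(all(coeffs[e] == multinomial_coefficient(e) for e in coeffs))
```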

Exercises

1. Expand by the binomial theorem

(a) $(x+y)^4$.

(b) $(1+x)^5$.

(c) $(x-y)^3$.

(d) $(2x+a)^4$.

(e) $(2x-3y)^3$.

(f) $(100-1)^5$.

2. Expand

(a) $(x+y+z)^4$.

(b) $(2x+y-z)^3$.

(c) $(2+2+1)^3$. (Evaluate two ways.)

3. (a) Find the coefficient of the term $x^2y^3z^2$ in the expansion of $(x+y+z)^7$.

[Ans. 210.]

(b) Find the coefficient of the term $x^6y^3z^2$ in the expression $(x-2y+5z)^{11}$.

[Ans. −924,000.]

4. Using the binomial theorem prove that

(a)

\[ \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n. \]

(b)

\[ \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots \pm \binom{n}{n} = 0 \]

for n > 0.

5. Using an argument similar to the one in Section 3.6, prove that

\[ \binom{n+1}{i, j, k} = \binom{n}{i-1, j, k} + \binom{n}{i, j-1, k} + \binom{n}{i, j, k-1}. \]

6. Let f(n, r) be the number of terms in the multinomial expansion of

\[ (x_1 + x_2 + \cdots + x_r)^n \]

and show that

\[ f(n, r) = \binom{n+r-1}{n}. \]

[Hint: Show that the conditions of Exercise 10 are satisfied by showing that f(n, r − 1) is the number of terms which do not have $x_r$ and f(n − 1, r) is the number which do.]

7. How many terms are there in each of the expansions:

(a) $(x+y+z)^6$?

[Ans. 28.]

(b) $(a+2b+5c+d)^4$?

[Ans. 35.]

(c) $(r+s+t+u+v)^6$?

[Ans. 210.]

8. Prove that $k^n$ is the sum of the numbers $\binom{n}{r_1, r_2, \ldots, r_k}$ for all choices of $r_1, r_2, \ldots, r_k$ such that

\[ r_1 + r_2 + \cdots + r_k = n. \]



9. Show that the problem given in Exercise 15b can also be solved by a multinomial coefficient, and hence show that

\[ \binom{n}{n-r, r-k, k} = \binom{n}{r}\binom{r}{k} = \binom{n}{k}\binom{n-k}{r-k}. \]

10. Show that the problem given in Exercise 16 can also be solved by a multinomial coefficient, and hence show that

\[ \binom{n}{n-r, r-k, k-s, s} = \binom{n}{r}\binom{r}{k}\binom{k}{s} = \binom{n}{s}\binom{n-s}{k-s}\binom{n-k}{r-k}. \]

11. If a + b + c = n, show that

\[ \binom{n}{a, b, c} = \binom{n}{a}\binom{n-a}{b}. \]

12. If a + b + c + d = n, show that

\[ \binom{n}{a, b, c, d} = \binom{n}{a}\binom{n-a}{b}\binom{n-a-b}{c}. \]

13. If $n_1 + n_2 + \cdots + n_r = n$, guess a formula that relates the multinomial coefficient to a product of binomial coefficients. [Hint: Use the formulas in Exercises 11 and 12 to guide you.]

14. Use Exercises 11, 12, and 13 to show that the multinomial coefficients can always be obtained by taking products of suitable numbers in the first n rows of the Pascal triangle.

3.7 Voting power

We return to the problem raised in Section 2.6. Now we are interested not only in coalitions, but also in the power of individual members. We will develop a numerical measure of voting power that was suggested by L. S. Shapley and M. Shubik. While the measure will be explained in detail below, for the reasons for choosing this particular measure the reader is referred to the original paper.

First of all we must realize that the number of votes a member controls is not in itself a good measure of his or her power. If x has three votes and y has one vote, it does not necessarily follow that x has three times the power that y has. Thus if the committee has just three members {x, y, z} and z also has only one vote, then x is a dictator and y is powerless.

The basic idea of the power index is found in considering various alignments of the committee members on a number of issues. The n members are ordered $x_1, x_2, \ldots, x_n$ according to how likely they are to vote for the measure. If the measure is to carry, we must persuade $x_1$ and $x_2$ up to $x_i$ to vote for it until we have a winning coalition. If $\{x_1, x_2, \ldots, x_i\}$ is a winning coalition but $\{x_1, x_2, \ldots, x_{i-1}\}$ is not winning, then $x_i$ is the crucial member of the coalition. We must persuade him or her to vote for the measure, and he or she is the one hardest to persuade of the i necessary members. We call $x_i$ the pivot.

For a purely mathematical measure of the power of a member we do not consider the views of the members. Rather we consider all possible ways that the members could be aligned on an issue, and see how often a given member would be the pivot. That means considering all permutations, and there will be n! of them. In each permutation one member will be the pivot. The frequency with which a member is the pivot of an alignment is a good measure of his or her voting power.

Definition. The voting power of a member of a committee is the number of alignments in which he or she is pivotal divided by the total number of alignments. (The total number of alignments, of course, is n! for a committee of n members.)
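For small committees the definition can be applied mechanically by listing all n! alignments. The following is a minimal sketch, assuming the committee decides by weighted votes with a fixed number of votes needed to carry a measure; the names `voting_power`, `votes`, and `quota` are ours, and committees with other rules (a veto, for instance) need a different test for "winning".

```python
from fractions import Fraction
from itertools import permutations

def voting_power(votes, quota):
    """Voting power of each member, found by listing all n! alignments.

    votes[i] is the number of votes member i controls; a coalition
    wins once its votes total at least `quota` (assumed attainable).
    """
    n = len(votes)
    pivots = [0] * n
    total = 0
    for alignment in permutations(range(n)):
        total += 1
        running = 0
        for member in alignment:
            running += votes[member]
            if running >= quota:      # this member turns the coalition into a winning one
                pivots[member] += 1
                break
    return [Fraction(count, total) for count in pivots]

print(voting_power([1, 1, 1], quota=2))   # three members, majority rule
print(voting_power([3, 1, 1], quota=3))   # x has three votes, y and z one each
```

The second call corresponds to the committee {x, y, z} described above, where x turns out to be a dictator. Since the loop visits n! alignments, this approach is practical only for small n.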

Example 3.17 If all n members have one vote each, and it takes a majority vote to carry a measure, it is easy to see (by symmetry) that each member is pivot in 1/n of the alignments. Hence each member has power equal to 1/n. Let us illustrate this for n = 3. There are 3! = 6 alignments. It takes two votes to carry a measure; hence the second member is always the pivot. The alignments are: 123, 132, 213, 231, 312, 321. The pivot is the middle member of each alignment. Each member is pivot twice, hence has power 2/6 = 1/3. ♦

Example 3.18 Reconsider Example 2.9 of Section 2.6 from this point of view. There are 24 permutations of the four members. We will list them, with the pivot emphasized:

wxyz  wxzy  wyxz  wyzx  wzxy  wzyx
xwyz  xwzy  xywz  xyzw  xzwy  xzyw
yxwz  yxzw  ywxz  ywzx  yzxw  yzwx
zxyw  zxwy  zyxw  zywx  zwxy  zwyx


We see that z has power of 14/24, w has 6/24, x and y have 2/24 each. (Or, simplified, they have 7/12, 3/12, 1/12, 1/12 power, respectively.) We note that these ratios are much further apart than the ratio of votes, which is 3 : 2 : 1 : 1. Here three votes are worth seven times as much as the single vote and more than twice as much as two votes. ♦

Example 3.19 Reconsider Example 2.10 of Section 2.6. By an analysis similar to the ones used so far it can be shown that in the Security Council of the United Nations before 1966, each of the Big Five had 76/385 or approximately .197 power, while each of the small nations had approximately .002 power. (See Exercise 12.) This reproduces our intuitive feeling that, while the small nations in the Security Council are not powerless, nearly all the power is in the hands of the Big Five.

The voting powers according to the 1966 revision will be worked out in Exercise 13. ♦

Example 3.20 In a committee of five each member has one vote, but the chair has veto power. Hence the minimal winning coalitions are three-member coalitions including the chair. There are 5! = 120 permutations. The pivot cannot come before the chair, since without the chair we do not have a winning coalition. Hence, when the chair is in place number 3, 4, or 5, he or she is the pivot. This happens in 3/5 of the permutations. When he or she is in position 1 or 2, then the number 3 member is pivot. The number of permutations in which the chair is in one of the first two positions and a given member is third is 2 · 3! = 12. Hence the chair has power 3/5, and each of the others has power 1/10. ♦
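The count in Example 3.20 can be confirmed by brute force over the 120 permutations. This is a sketch only; the member labels and the function name `wins` are hypothetical, and the winning rule is written directly from the description of the minimal winning coalitions above.

```python
from fractions import Fraction
from itertools import permutations

MEMBERS = ["chair", "a", "b", "c", "d"]   # hypothetical labels

def wins(coalition):
    """A coalition wins if it has at least three members, one of whom is the chair."""
    return "chair" in coalition and len(coalition) >= 3

pivot_count = {m: 0 for m in MEMBERS}
alignments = list(permutations(MEMBERS))
for alignment in alignments:
    coalition = set()
    for member in alignment:
        coalition.add(member)
        if wins(coalition):           # first member whose arrival makes the coalition win
            pivot_count[member] += 1
            break

power = {m: Fraction(k, len(alignments)) for m, k in pivot_count.items()}
print(power)
```

Because the rule is expressed as a test on coalitions rather than on vote totals, the same loop works for any committee whose winning coalitions can be described.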

Exercises

1. A committee of three makes decisions by majority vote. Write out all permutations, and calculate the voting powers if the three members have

(a) One vote each.

[Ans. 1/3, 1/3, 1/3.]

(b) One vote for two of them, two votes for the third.

[Ans. 1/6, 1/6, 2/3.]

(c) One vote for two of them, three votes for the third.

[Ans. 0, 0, 1.]

(d) One, two, and three votes, respectively.

[Ans. 1/6, 1/6, 2/3.]

(e) Two votes each for two of them, and three votes for the third.

[Ans. 1/3, 1/3, 1/3.]

2. Prove that in any decision-making body the sum of the powers of the members is 1.

3. What is the power of a dictator? What is the power of a “powerless” member? Prove that your answers are correct.

4. A large company issued 100,000 shares. These are held by three stockholders, who have 50,000, 49,999, and one share, respectively. Calculate the powers of the three members.

[Ans. 2/3, 1/6, 1/6.]

5. A committee consists of 100 members having one vote each, plus a chairman who can break ties. Calculate the power distribution. (Do not try to write out all permutations!)

6. In Exercise 5, give the chairman a veto instead of the power to break ties. How does this change the power distribution?

[Ans. The chairman has power 50/101.]

7. How are the powers in Exercise 1 changed if the committee requires a 3/4 vote to carry a measure?

8. If in a committee of five, requiring majority decisions, each member has one vote, then each has power 1/5. Now let us suppose that two members team up, and always vote the same way. Does this increase their power? (The best way to represent this situation is by allowing only those permutations in which these two members are next to each other.)

[Ans. Yes, the pair’s power increases from .4 to .5.]


9. If the minimal winning coalitions are known, show that the power of each member can be determined without knowing anything about the number of votes that each member controls.

10. Answer the following questions for a three-man committee.

(a) Find all possible sets of minimal winning coalitions.

(b) For each set of minimal winning coalitions find the distribution of voting power.

(c) Verify that the various distributions of power found in Exercises 1 and 7 are the only ones possible.

11. In Exercise 1 parts 1(a) and 1(e) have the same answer, and parts 1(b) and 1(d) and Exercise 4 also have the same answer. Use the results of Exercise 9 to find a reason for these coincidences.

12. Compute the voting power of one of the Big Five in the Security Council of the United Nations as follows:

(a) Show that for the nation to be pivotal it must be in the number 7 spot or later.

(b) Show that there are $\binom{6}{2}\,6!\,4!$ permutations in which the nation is pivotal in the number 7 spot.

(c) Find similar formulas for the number of permutations in which it is pivotal in the number 8, 9, 10, or 11 spot.

(d) Use this information to find the total number of permutations in which it is pivotal, and from this compute the power of the nation.

13. Apply the method of Exercise 12 to the revised voting scheme in the Security Council (10 small-nation members, and 9 votes required to carry a measure). What is the power of a large nation? Has the power of one of the small nations increased or decreased?

[Ans. 421/2145 (nearly the same as before); decreased.]

3.8 Techniques for counting

We know that there is no single method or formula for solving all counting problems. There are, however, some useful techniques that can be learned. In this section we shall discuss two problems that illustrate important techniques.

The first problem illustrates the importance of looking for a general pattern in the examination of special cases. We have seen in Section 3.2 and Exercise 6 of that section, that the following formulas hold for the number of elements in the union of two and three sets, respectively.

$$n(A_1 \cup A_2) = n(A_1) + n(A_2) - n(A_1 \cap A_2),$$

$$\begin{aligned}
n(A_1 \cup A_2 \cup A_3) = {} & n(A_1) + n(A_2) + n(A_3) \\
& - n(A_1 \cap A_2) - n(A_1 \cap A_3) - n(A_2 \cap A_3) \\
& + n(A_1 \cap A_2 \cap A_3).
\end{aligned}$$

On the basis of these formulas we might conjecture that the number of elements in the union of any finite number of sets could be obtained by adding the numbers in each of the sets, then subtracting the numbers in each possible intersection of two sets, then adding the numbers in each possible intersection of three sets, etc. If this is correct, the formula for the union of four sets should be

$$\begin{aligned}
n(A_1 \cup A_2 \cup A_3 \cup A_4) = {} & n(A_1) + n(A_2) + n(A_3) + n(A_4) \\
& - n(A_1 \cap A_2) - n(A_1 \cap A_3) - n(A_1 \cap A_4) \\
& - n(A_2 \cap A_3) - n(A_2 \cap A_4) - n(A_3 \cap A_4) \\
& + n(A_1 \cap A_2 \cap A_3) + n(A_1 \cap A_2 \cap A_4) \\
& + n(A_1 \cap A_3 \cap A_4) + n(A_2 \cap A_3 \cap A_4) \\
& - n(A_1 \cap A_2 \cap A_3 \cap A_4).
\end{aligned} \tag{3.1}$$

Let us try to establish this formula. We must show that if u is an element of at least one of the four sets, then it is counted exactly once on the right-hand side of 3.1. We consider separately the cases where u is in exactly 1 of the sets, exactly 2 of the sets, etc.

For instance, if u is in exactly two of the sets it will be counted twice in the terms of the right-hand side of 3.1 that involve single sets, once in the terms that involve the intersection of two sets, and not at all in the terms that involve the intersections of three or four sets. Again, if u is in exactly three of the sets it will be counted three times in the terms involving single sets, twice in the terms involving intersections of two sets, once in the terms involving the intersections of three sets, and not at all in the last term involving the intersection of all four sets. Considering each possibility we have the following table.

Number of sets that contain u    Number of times it is counted
1                                1
2                                2 − 1
3                                3 − 3 + 1
4                                4 − 6 + 4 − 1

We see from this that, in every case, u is counted exactly once on the right-hand side of 3.1. Furthermore, if we look closely, we detect a pattern in the numbers in the right-hand column of the above table. If we put a −1 in front of these numbers we have

1    −1 + 1
2    −1 + 2 − 1
3    −1 + 3 − 3 + 1
4    −1 + 4 − 6 + 4 − 1

We now recognize that these numbers are simply the numbers in the first four rows of the Pascal triangle, but with alternating + and − signs. Since we put a −1 in each row of the table, we want to show that the sum of each row is 0. If that is true, it should be a general property of the Pascal triangle. That is, if we put alternating signs in the jth row of the Pascal triangle, we should get a sum of 0. But this is indeed the case, since, by the binomial theorem, for j > 0,

$$\begin{aligned}
0 = \pm(1 - 1)^j &= 1 - \binom{j}{1} + \binom{j}{2} - \binom{j}{3} + \cdots \pm 1 \\
&= -1 + \binom{j}{1} - \binom{j}{2} + \binom{j}{3} - \cdots \mp 1.
\end{aligned}$$
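The vanishing of the alternating row sums is easy to check numerically for the first several rows. The following is a small sketch; the function names are ours, and `math.comb` supplies the binomial coefficients.

```python
from math import comb

def alternating_row_sum(j):
    """1 - C(j,1) + C(j,2) - ... : row j of the Pascal triangle with alternating signs."""
    return sum((-1) ** i * comb(j, i) for i in range(j + 1))

def times_counted(j):
    """C(j,1) - C(j,2) + C(j,3) - ... : the same row without its leading 1, signs reversed."""
    return sum((-1) ** (i + 1) * comb(j, i) for i in range(1, j + 1))

for j in range(1, 9):
    print(j, alternating_row_sum(j), times_counted(j))
```

For each j > 0 the first sum is 0 and the second is 1, in agreement with the table for j = 1, 2, 3, 4.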

Thus we have not only seen why the formula works for the case of four sets, but we have also found the method for proving the formula for the general case. That is, suppose we wish to establish that the number of elements in the union of n sets may be obtained as an alternating sum by adding the numbers of elements in each of the sets, subtracting the numbers of elements in each pairwise intersection of the sets, adding the numbers of elements in each intersection of three sets, etc. Consider an element u that is in exactly j of the sets. Let us see how many times u will be counted in the alternating sum. If it is in j of the sets, it will first be counted j times in the sum of the elements in the sets by themselves. For u to be in the intersection of two sets, we must choose two of the j sets to which it belongs. This can be done in $\binom{j}{2}$ different ways. Hence an amount $\binom{j}{2}$ will be subtracted from the sum. To be in the intersection of three sets, we must choose three of the j sets containing u. This can be done in $\binom{j}{3}$ different ways. Thus, an amount $\binom{j}{3}$ will be added to the sum, etc. Hence the total number of times u will be counted by the alternating sum is

$$\binom{j}{1} - \binom{j}{2} + \binom{j}{3} - \cdots \pm 1.$$

We have just seen that, if we add −1 to this sum, we obtain 0. Hence the sum itself must always be 1. That is, no matter how many sets u is in, it will be counted exactly once by the alternating sum, and this is true for every element u in the union. We have thus established the general formula

$$\begin{aligned}
n(A_1 \cup A_2 \cup \cdots \cup A_n) = {} & n(A_1) + n(A_2) + \cdots + n(A_n) \\
& - n(A_1 \cap A_2) - n(A_1 \cap A_3) - \cdots \\
& + n(A_1 \cap A_2 \cap A_3) + n(A_1 \cap A_2 \cap A_4) + \cdots \\
& - \cdots \\
& \pm n(A_1 \cap A_2 \cap \cdots \cap A_n).
\end{aligned} \tag{3.2}$$

This formula is called the inclusion-exclusion formula for the number of elements in the union of n sets. It can be extended to formulas for counting the number of elements that occur in two of the sets, three of the sets, etc. See Exercises 21, 25, and 27.
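Formula 3.2 translates directly into a short program: for each k, take every choice of k of the sets, and add or subtract the size of their intersection. The sketch below is one way to write this; the function name `union_size` and the four example sets are hypothetical, chosen only so that the result can be compared with a direct count of the union.

```python
from itertools import combinations

def union_size(sets):
    """Size of the union of `sets`, computed by the alternating sum of formula 3.2."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1           # add for odd k, subtract for even k
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

# Hypothetical example sets, chosen only for illustration.
A1 = {1, 2, 3, 4}
A2 = {3, 4, 5}
A3 = {4, 5, 6, 7}
A4 = {1, 7, 8}
sets = [A1, A2, A3, A4]
print(union_size(sets), len(A1 | A2 | A3 | A4))
```

The two printed numbers agree. Note that the alternating sum has $2^n - 1$ terms, so for actual computation a direct count of the union is far cheaper; the formula is useful when only the sizes of the intersections are known.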

Example 3.21 In a high school the following language enrollments are recorded for the senior class.

English    150
French      75
German      35
Spanish     50

Also, the following overlaps are noted.

Taking English and French             70
Taking English and German             30
Taking English and Spanish            40
Taking French and German               5
Taking English, French and German      2

If every student takes at least one language, how many seniors are there?

Let E, F, G, and S be the sets of students taking English, French, German, and Spanish, respectively. Using formula 3.1 and ignoring empty sets, we have

$$\begin{aligned}
n(E \cup F \cup G \cup S) = {} & n(E) + n(F) + n(G) + n(S) \\
& - n(E \cap F) - n(E \cap G) - n(E \cap S) - n(F \cap G) \\
& + n(E \cap F \cap G) \\
= {} & 150 + 75 + 35 + 50 - 70 - 30 - 40 - 5 + 2 \\
= {} & 167.
\end{aligned}$$

Since every student takes at least one language, the total number of students is 167. ♦

Example 3.22 The four words

TABLE, BASIN, CLASP, BLUSH

have the following interesting properties. Each word consists of five different letters. Any two words have exactly two letters in common. Any three words have one letter in common. No letter occurs in all four words. How many different letters are there?

Letting the words be sets of letters, there are $\binom{4}{1}$ ways of taking these sets one at a time, $\binom{4}{2}$ ways of taking them two at a time, etc. Hence formula 3.2 gives

$$\binom{4}{1} \cdot 5 - \binom{4}{2} \cdot 2 + \binom{4}{3} \cdot 1 - \binom{4}{4} \cdot 0 = 12$$

as the number of distinct letters. The reader should verify this answer by direct count. ♦
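The direct count suggested at the end of Example 3.22, together with the stated properties of the four words, can be checked as follows. This is only a sketch; the variable names are ours.

```python
from itertools import combinations

words = ["TABLE", "BASIN", "CLASP", "BLUSH"]
letters = [set(w) for w in words]

# Sizes of every k-fold intersection, for k = 1, 2, 3, 4.
for k in range(1, 5):
    sizes = [len(set.intersection(*group)) for group in combinations(letters, k)]
    print(k, sizes)

distinct = set().union(*letters)
print(len(distinct), sorted(distinct))
```

The four lines of sizes show 5 for each single word, 2 for each pair, 1 for each triple, and 0 for all four words, and the union contains 12 letters.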

It often happens that a counting problem can be formulated in a number of different ways that sound quite different but that are in fact equivalent. And in one of these ways the answer may suggest itself readily. To illustrate how a reformulation can make a hard-sounding problem easy, we give an alternate method for solving the problem considered in Exercise 13.

The problem is to count the number of ways that n indistinguishable objects can be put into r cells. For instance, if there are three objects and three cells, the number of different ways can be enumerated as follows (using ◦ for object and bars to indicate the sides of the cells):

| ◦◦◦ |     |     |
| ◦◦  | ◦   |     |
| ◦◦  |     | ◦   |
| ◦   | ◦◦  |     |
| ◦   | ◦   | ◦   |
| ◦   |     | ◦◦  |
|     | ◦◦◦ |     |
|     | ◦◦  | ◦   |
|     | ◦   | ◦◦  |
|     |     | ◦◦◦ |

We see that in this case there are ten ways the task can be accomplished. But the answer for the general case is not clear.

If we look at the problem in a slightly different manner, the answer suggests itself. Instead of putting the objects in the cells, we imagine putting the cells around the objects. In the above case we see that three cells are constructed from four bars. Two of these bars must be placed at the ends. The two other bars together with the three objects we regard as occupying five intermediate positions. Of these five intermediate positions we must choose two of them for bars and three for the objects. Hence the total number of ways we can accomplish the task is $\binom{5}{2} = \binom{5}{3} = 10$, which is the answer we got by counting all the ways.

For the general case we can argue in the same manner. We have r cells and n objects. We need r + 1 bars to form the r cells, but two of these must be fixed on the ends. The remaining r − 1 bars together with the n objects occupy r − 1 + n intermediate positions. And we must choose r − 1 of these for the bars and the remaining n for the objects. Hence our task can be accomplished in

$$\binom{n + r - 1}{r - 1} = \binom{n + r - 1}{n}$$

different ways.
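The argument with bars and objects is also a recipe for generating the arrangements: choose which r − 1 of the n + r − 1 intermediate positions hold bars, and read off how many objects fall between consecutive bars. The sketch below follows that recipe; the function name `arrangements` is ours, and the letter o stands in for the symbol ◦.

```python
from itertools import combinations
from math import comb

def arrangements(n, r):
    """Yield each way to put n identical objects into r cells, as a tuple of cell counts."""
    positions = n + r - 1                      # intermediate positions
    for bars in combinations(range(positions), r - 1):
        counts = []
        previous = -1
        for bar in bars + (positions,):        # a sentinel closes the last cell
            counts.append(bar - previous - 1)  # objects between consecutive bars
            previous = bar
        yield tuple(counts)

for counts in arrangements(3, 3):
    print("|" + "|".join("o" * c for c in counts) + "|")

assert len(list(arrangements(3, 3))) == comb(5, 2) == 10
```

Each choice of bar positions gives a different tuple of cell counts, so the number of tuples produced is exactly the binomial coefficient in the formula.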

Example 3.23 Seven people enter an elevator that will stop at five floors. In how many different ways can the people leave the elevator if we are interested only in the number that depart at each floor, and do not distinguish among the people? According to our general formula, the answer is

$$\binom{7 + 5 - 1}{7} = \binom{11}{7} = 330.$$

Suppose we are interested in finding the number of such possibilities in which at least one person gets off at each floor. We can then arbitrarily assign one person to get off at each floor, and the remaining two can get off at any floor. They can get off the elevator in

$$\binom{2 + 5 - 1}{2} = \binom{6}{2} = 15$$

different ways. ♦
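Both counts in Example 3.23 are small enough to confirm by listing every possible tuple of departures per floor. The following brute-force sketch does so; the variable names are ours.

```python
from itertools import product
from math import comb

PEOPLE, FLOORS = 7, 5

# Every tuple (k1, ..., k5) of per-floor departure counts that sums to 7.
outcomes = [k for k in product(range(PEOPLE + 1), repeat=FLOORS) if sum(k) == PEOPLE]
no_floor_empty = [k for k in outcomes if min(k) >= 1]

print(len(outcomes), comb(PEOPLE + FLOORS - 1, PEOPLE))
print(len(no_floor_empty), comb((PEOPLE - FLOORS) + FLOORS - 1, PEOPLE - FLOORS))
```

Each printed line shows the brute-force count next to the value given by the formula.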

Exercises

1. The survey discussed in Exercise 8 has been enlarged to include a fourth magazine D. It was found that no one who reads either magazine A or magazine B reads magazine D. However, 10 per cent of the people read magazine D and 5 per cent read both C and D. What per cent of the people do not read any magazine?

[Ans. 5 per cent.]

2. A certain college administers three qualifying tests. They announce the following results: “Of the students taking the tests, 2 per cent failed all three tests, 6 per cent failed tests A and B, 5 per cent failed A and C, 8 per cent failed B and C, 29 per cent failed test A, 32 per cent failed B, and 16 per cent failed C.” How many students passed all three qualifying tests?

3. Four partners in a game require a score of exactly 20 points to win. In how many ways can they accomplish this?

[Ans. $\binom{23}{3}$.]

4. In how many ways can eight apples be distributed among four boys? In how many ways can this be done if each boy is to get at least one apple?

5. Suppose we have n balls and r boxes with n ≥ r. Show that the number of different ways that the balls can be put into the boxes so that there is at least one ball in every box is $\binom{n-1}{r-1}$.

6. Identical prizes are to be distributed among five boys. It is observed that there are 15 ways that this can be done if each boy is to get at least one prize. How many prizes are there?

[Ans. 7.]

7. Let $p_1, p_2, \ldots, p_n$ be n statements relative to a possibility space U. Show that the inclusion-exclusion formula gives a formula for the number of elements in the truth set of the disjunction $p_1 \vee p_2 \vee \cdots \vee p_n$ in terms of the numbers of elements in the truth sets of conjunctions formed from subsets of these statements.

8. A boss asks his or her secretary to put letters written to seven different persons into addressed envelopes. Find the number of ways that this can be done so that at least one person gets his or her own letter. [Hint: Use the result of Exercise 7, letting $p_i$ be the statement “The ith person gets his or her own letter”.]

[Ans. 3186.]

9. Consider the numbers from 2 to 10 inclusive. Let $A_2$ be the set of numbers divisible by 2 and $A_3$ the set of numbers divisible by 3. Find $n(A_2 \cup A_3)$ by using the inclusion-exclusion formula. From this result find the number of prime numbers between 2 and 10 (where a prime number is a number divisible only by itself and by 1). [Hint: Be sure to count the numbers 2 and 3 among the primes.]

10. Use the method of Exercise 9 to find the number of prime numbers between 2 and 100 inclusive. [Hint: Consider first the sets $A_2$, $A_3$, $A_5$, and $A_7$.]

[Ans. 25.]

11. Verify that the following formula gives the number of elements in the intersection of three sets.

$$\begin{aligned}
n(A_1 \cap A_2 \cap A_3) = {} & n(A_1) + n(A_2) + n(A_3) \\
& - n(A_1 \cup A_2) - n(A_1 \cup A_3) - n(A_2 \cup A_3) \\
& + n(A_1 \cup A_2 \cup A_3).
\end{aligned}$$

12. Show that if we replace ∩ by ∪ and ∪ by ∩ in formula 3.2, we get a valid formula for the number of elements in the intersection of n sets. [Hint: Apply the inclusion-exclusion formula to the left-hand side of

$$n(\tilde{A}_1 \cup \tilde{A}_2 \cup \cdots \cup \tilde{A}_n) = n(U) - n(A_1 \cap A_2 \cap \cdots \cap A_n).]$$

13. For n ≤ m prove that

$$\binom{m}{0}\binom{n}{0} + \binom{m}{1}\binom{n}{1} + \binom{m}{2}\binom{n}{2} + \cdots + \binom{m}{n}\binom{n}{n} = \binom{m+n}{n}$$

by carrying out the following two steps:

(a) Show that the left-hand side counts the number of ways of choosing equal numbers of men and women from sets of m men and n women.

(b) Show that the right-hand side also counts the same number by showing that we can select equal numbers of men and women by selecting any subset of n persons from the whole set, and then combining the men selected with the women not selected.

14. By an ordered partition of n with r elements we mean a sequence of r nonnegative integers, possibly some 0, written in a definite order, and having n as their sum. For instance, [1, 0, 3] and [3, 0, 1] are two different ordered partitions of 4 with three elements. Show that the number of ordered partitions of n with r elements is $\binom{n+r-1}{n}$.

15. Show that the number of different possibilities for the outcomes of rolling n dice is $\binom{n+5}{n}$.

Note. The next few exercises illustrate an important counting technique called the reflection principle. In Figure 3.6 we show a path from the point (0, 2) to the point (7, 1). We shall be interested in counting the number of paths of this type where at each step the path moves one unit to the right, and either one unit up or one unit down. We shall see that this model is useful for analyzing voting outcomes.

16. Show that the number of different paths leading from the point (0, 2) to (7, 1) is $\binom{7}{3}$. [Hint: Seven decisions must be made, of which three moves are up and the rest down.]


[Figure 3.6: a path from the point (0, 2) to the point (7, 1).]

17. Show that the number of different paths from (0, 2) to (7, 1) which touch the x-axis at least once is the same as the total number of paths from the point (0, −2) to the point (7, 1). [Hint: Show that for every path to be counted from (0, 2) that touches the x-axis, there corresponds a path from (0, −2) to (7, 1) obtained by reflecting the part of the path to the first touching point through the x-axis. A specific example is shown in Figure 3.7.]

18. Use the results of Exercises 16 and 17 to find the number of paths from (0, 2) to (7, 1) that never touch the x-axis.

[Ans. 14.]

19. Nine votes are cast in a race between two candidates A and B. Candidate A wins by one vote. Find the number of ways the ballots can be counted so that candidate A is leading throughout the entire count. [Hint: The first vote counted must be for A. Counting the remaining eight votes corresponds to a path from (1, 1) to (9, 1). We want the number of paths that never touch the x-axis.]

[Ans. 14.]

20. Let the symbol $n_r^{(k)}$ stand for “the number of elements that are in k or more of the r sets $A_1, A_2, \ldots, A_r$”. Show that $n_3^{(1)} = n(A_1 \cup A_2 \cup A_3)$.


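[Figure 3.7: a path from (0, 2) to (7, 1) that touches the x-axis, and the corresponding reflected path from (0, −2).]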

21. Show that

$$\begin{aligned}
n_3^{(2)} &= n((A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)) \\
&= n(A_1 \cap A_2) + n(A_1 \cap A_3) + n(A_2 \cap A_3) - 2n(A_1 \cap A_2 \cap A_3)
\end{aligned}$$

by using the inclusion-exclusion formula. Also develop an independent argument for the last formula.

22. Use Exercise 21 to find the number of letters that appear two or more times in the three words TABLE, BASIN, and CLASP.

23. Give an interpretation for $n_3^{(1)} - n_3^{(2)}$.

24. Use Exercise 23 to find the number of letters that occur exactly once in the three words of Exercise 22.

25. Develop a general argument like that in Exercise 21 to show that

$$\begin{aligned}
n_4^{(2)} = {} & n(A_1 \cap A_2) + n(A_1 \cap A_3) + n(A_1 \cap A_4) \\
& + n(A_2 \cap A_3) + n(A_2 \cap A_4) + n(A_3 \cap A_4) \\
& - 2[n(A_1 \cap A_2 \cap A_3) + n(A_1 \cap A_2 \cap A_4) \\
& \qquad + n(A_1 \cap A_3 \cap A_4) + n(A_2 \cap A_3 \cap A_4)] \\
& + 3n(A_1 \cap A_2 \cap A_3 \cap A_4).
\end{aligned}$$

26. Use Exercise 25 to find the number of letters used two or more times in the four words of Example 3.22.

27. From the formulas in Exercises 21 and 25 guess the general formula for $n_r^{(2)}$ and develop a general argument to establish its correctness.

Suggested reading.

Shapley, L. S., and M. Shubik, “A Method for Evaluating the Distribution of Power in a Committee System”, The American Political Science Review 48 (1954), pp. 787–792.

Whitworth, W. A., Choice and Chance, with 1000 Exercises, 1934.

Goldberg, S., Probability: An Introduction, 1960.

Parzen, E., Modern Probability Theory and Its Applications, 1960.

Chapter 4

Probability theory

4.1 Introduction

We often hear statements of the following kind: “It is likely to rain today”, “I have a fair chance of passing this course”, “There is an even chance that a coin will come up heads”, etc. In each case our statement refers to a situation in which we are not certain of the outcome, but we express some degree of confidence that our prediction will be verified. The theory of probability provides a mathematical framework for such assertions.

Consider an experiment whose outcome is not known. Suppose that someone makes an assertion p about the outcome of the experiment, and we want to assign a probability to p. When statement p is considered in isolation, we usually find no natural assignment of probabilities. Rather, we look for a method of assigning probabilities to all conceivable statements concerning the outcome of the experiment. At first this might seem to be a hopeless task, since there is no end to the statements we can make about the experiment. However we are aided by a basic principle:

Fundamental assumption. Any two equivalent statements will be assigned the same probability.

As long as there are a finite number of logical possibilities, there are only a finite number of truth sets, and hence the process of assigning probabilities is a finite one. We proceed in three steps: (1) we first determine U, the possibility set, that is, the set of all logical possibilities, (2) to each subset X of U we assign a number called the measure m(X), (3) to each statement p we assign m(P), the measure of its truth set P, as a probability. The probability of statement p is denoted by Pr[p].


The first step, that of determining the set of logical possibilities, is one that we considered in the previous chapters. It is important to recall that there is no unique method for analyzing logical possibilities. In a given problem we may arrive at a very fine or a very rough analysis of possibilities, causing U to have many or few elements.

Having chosen U, the next step is to assign a number to each subset X of U, which will in turn be taken to be the probability of any statement having truth set X. We do this in the following way.

Assignment of a measure. Assign a positive number (weight) to each element of U, so that the sum of the weights assigned is 1. Then the measure of a set is the sum of the weights of its elements. The measure of the set ∅ is 0.

In applications of probability to scientific problems, the analysis of the logical possibilities and the assignment of measures may depend upon factual information and hence can best be done by the scientist making the application.

Once the weights are assigned, to find the probability of a particular statement we must find its truth set and find the sum of the weights assigned to elements of the truth set. This problem, which might seem easy, can often involve considerable mathematical difficulty. The development of techniques to solve this kind of problem is the main task of probability theory.

Example 4.1 An ordinary die is thrown. What is the probability that the number which turns up is less than four? Here the possibility set is U = {1, 2, 3, 4, 5, 6}. The symmetry of the die suggests that each face should have the same probability of turning up. To make this so, we assign weight 1/6 to each of the outcomes. The truth set of the statement “The number which turns up is less than four” is {1, 2, 3}. Hence the probability of this statement is 3/6 = 1/2, the sum of the weights of the elements in its truth set. ♦
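The three steps (possibility set, weights, truth set) can be mirrored in a few lines of code. The following is a minimal sketch under our own conventions: the possibility set is the set of keys of a dictionary of weights, a statement is represented by a function that is true or false on each outcome, and the names `measure` and `probability` are hypothetical.

```python
from fractions import Fraction

def measure(weights, subset):
    """m(X): the sum of the weights of the elements of X."""
    return sum((weights[x] for x in subset), Fraction(0))

def probability(weights, statement):
    """Pr[p]: the measure of the truth set of the statement p."""
    truth_set = {x for x in weights if statement(x)}
    return measure(weights, truth_set)

# Example 4.1: an ordinary die, each face with weight 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
assert sum(die.values()) == 1
print(probability(die, lambda face: face < 4))
```

Exact fractions are used instead of floating-point numbers so that the weights sum to exactly 1.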

Example 4.2 A gambler attends a race involving three horses A, B, and C. He or she feels that A and B have the same chance of winning but that A (and hence also B) is twice as likely to win as C is. What is the probability that A or C wins? We take as U the set {A, B, C}. If we were to assign weight a to the outcome C, then we would assign weight 2a to each of the outcomes A and B. Since the sum of the weights must be 1, we have 2a + 2a + a = 1, or a = 1/5. Hence we assign weights 2/5, 2/5, 1/5 to the outcomes A, B, and C, respectively. The truth set of the statement “Horse A or C wins” is {A, C}. The sum of the weights of the elements of this set is 2/5 + 1/5 = 3/5. Hence the probability that A or C wins is 3/5. ♦
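The step of solving 2a + 2a + a = 1 amounts to dividing each relative chance by the total. A short sketch of this, with a hypothetical helper `normalize` that expects whole-number relative chances:

```python
from fractions import Fraction

def normalize(relative):
    """Scale whole-number relative chances so that the resulting weights sum to 1."""
    total = sum(relative.values())
    return {outcome: Fraction(value, total) for outcome, value in relative.items()}

# Example 4.2: A and B are each twice as likely to win as C.
weights = normalize({"A": 2, "B": 2, "C": 1})
print(weights)
print(weights["A"] + weights["C"])   # Pr[A or C wins]
```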

Exercises

1. Assume that there are n possibilities for the outcome of a given experiment. How should the weights be assigned if it is desired that all outcomes be assigned the same weight?

2. Let U = {a, b, c}. Assign weights to the three elements so that no two have the same weight, and find the measures of the eight subsets of U.

3. In an election Jones has probability 1/2 of winning, Smith has probability 1/3, and Black has probability 1/6.

(a) Construct U.

(b) Assign weights.

(c) Find the measures of the eight subsets.

(d) Give a pair of nonequivalent predictions which have the same probability.

4. Give the possibility set U for each of the following experiments.

(a) An election between candidates A and B is to take place.

(b) A number from 1 to 5 is chosen at random.

(c) A two-headed coin is thrown.

(d) A student is asked for the day of the year on which his or her birthday falls.

5. For which of the cases in Exercise 4 might it be appropriate to assign the same weight to each outcome?

6. Suppose that the following probabilities have been assigned to the possible results of putting a penny in a certain defective peanut-vending machine: The probability that nothing comes out is 1/2. The probability that either you get your money back or you get peanuts (but not both) is 1/3.

(a) What is the probability that you get your money back and also get peanuts?

[Ans. 1/6.]

(b) From the information given, is it possible to find the probability that you get peanuts?

[Ans. No.]

7. A die is loaded in such a way that the probability of each face is proportional to the number of dots on that face. (For instance, a 6 is three times as probable as a 2.) What is the probability of getting an even number in one throw?

[Ans. 4/7.]

8. If a coin is thrown three times, list the eight possibilities for the outcomes of the three successive throws. A typical outcome can be written (HTH). Determine a probability measure by assigning an equal weight to each outcome. Find the probabilities of the following statements.

(a) r: The number of heads that occur is greater than the number of tails.

[Ans. 1/2.]

(b) s: Exactly two heads occur.

[Ans. 3/8.]

(c) t: The same side turns up on every throw.

[Ans. 1/4.]

9. For the statements given in Exercise 8, which of the following equalities are true?

(a) Pr[r ∨ s] = Pr[r] + Pr[s]

(b) Pr[s ∨ t] = Pr[s] + Pr[t]

(c) Pr[r ∨ ¬r] = Pr[r] + Pr[¬r]

(d) Pr[r ∨ t] = Pr[r] + Pr[t]

10. Which of the following pairs of statements (see Exercise 8) are inconsistent? (Recall that two statements are inconsistent if their truth sets have no element in common.)

(a) r, s

[Ans. consistent.]

(b) s, t

[Ans. inconsistent.]

(c) r, ¬r

[Ans. inconsistent.]

(d) r, t

[Ans. consistent.]

11. State a theorem suggested by Exercises 9 and 10.

12. An experiment has three possible outcomes, a, b, and c. Let p be the statement “the outcome is a or b”, and q be the statement “the outcome is b or c”. Assume that weights have been assigned to the three outcomes so that Pr[p] = 2/3 and Pr[q] = 5/6. Find the weights.

[Ans. 1/6, 1/2, 1/3.]

13. Repeat Exercise 12 if Pr[p] = 1/2 and Pr[q] = 3/8.

4.2 Properties of a probability measure

Before studying special probability measures, we shall consider some general properties of such measures which are useful in computations and in the general understanding of probability theory.

Three basic properties of a probability measure are

(A) m(X) = 0 if and only if X = ∅.

(B) 0 ≤ m(X) ≤ 1 for any set X.

(C) For two sets X and Y,

m(X ∪ Y) = m(X) + m(Y)

if and only if X and Y are disjoint, i.e., have no elements in common.


The proofs of properties (A) and (B) are left as an exercise (seeExercise 16). We shall prove (C). We observe first that m(X) +m(Y )is the sum of the weights of the elements of X added to the sum of theweights of Y . If X and Y are disjoint, then the weight of every elementof X ∪ Y is added once and only once, and hence m(X) + m(Y ) =m(X ∪ Y ). Assume now that X and Y are not disjoint. Here theweight of every element contained in both X and Y , i.e., in X ∩ Y ,is added twice in the sum m(X) + m(Y ). Thus this sum is greaterthan m(X ∪ Y ) by an amount m(X ∩ Y ). By (A) and (B), if X ∩ Yis not the empty set, then m(X ∩ Y ) > 0. Hence in this case we havem(X) + m(Y ) > m(X ∩ Y ). Thus if X and Y are not disjoint, theequality in (C) does not hold. Our proof shows that in general we have

(C′) For any two sets X and Y , m(X∪Y ) = m(X)+m(Y )−m(X∩Y ).

Since the probabilities for statements are obtained directly from the probability measure m(X), any property of m(X) can be translated into a property about the probability of statements. For example, the above properties become, when expressed in terms of statements,

(a) Pr[p] = 0 if and only if p is logically false.

(b) 0 ≤ Pr[p] ≤ 1 for any statement p.

(c) The equality

Pr[p ∨ q] = Pr[p] + Pr[q]

holds if and only if p and q are inconsistent.

(c′) For any two statements p and q,

Pr[p ∨ q] = Pr[p] + Pr[q]− Pr[p ∧ q].

Another property of a probability measure which is often useful in computation is

(D) m(X̃) = 1 − m(X), where X̃ is the complement of X,

or, in the language of statements,

(d) Pr[¬p] = 1− Pr[p].

The proofs of (D) and (d) are left as an exercise (see Exercise 17).

It is important to observe that our probability measure assigns probability 0 only to statements which are logically false, i.e., which are false for every logical possibility. Hence, a prediction that such a statement will be true is certain to be wrong. Similarly, a statement is assigned probability 1 only if it is true in every case, i.e., logically true. Thus the prediction that a statement of this type will be true is certain to be correct. (While these properties of a probability measure seem quite natural, it is necessary, when dealing with infinite possibility sets, to weaken them slightly. We consider in this book only the finite possibility sets.)

We shall now discuss the interpretation of probabilities that are not 0 or 1. We shall give only some intuitive ideas that are commonly held concerning probabilities. While these ideas can be made mathematically more precise, we offer them here only as a guide to intuitive thinking.

Suppose that, relative to a given experiment, a statement has been assigned probability p. From this it is often inferred that if a sequence of such experiments is performed under identical conditions, the fraction of experiments which yield outcomes making the statement true would be approximately p. The mathematical version of this is the “law of large numbers” of probability theory (which will be treated in Section 4.10). In cases where there is no natural way to assign a probability measure, the probability of a statement is estimated experimentally. A sequence of experiments is performed and the fraction of the experiments which make the statement true is taken as the approximate probability for the statement.

A second and related interpretation of probabilities is concerned with betting. Suppose that a certain statement has been assigned probability p. We wish to offer a bet that the statement will in fact turn out to be true. We agree to give r dollars if the statement does not turn out to be true, provided that we receive s dollars if it does turn out to be true. What should r and s be to make the bet fair? If it were true that in a large number of such bets we would win s a fraction p of the time and lose r a fraction 1 − p of the time, then our average winning per bet would be sp − r(1 − p). To make the bet fair we should make this average winning 0. This will be the case if sp = r(1 − p) or if r/s = p/(1 − p). Notice that this determines only the ratio of r and s. Such a ratio, written r : s, is said to give odds in favor of the statement.

Definition. The odds in favor of an outcome are r : s (r to s), if the probability of the outcome is p, and r/s = p/(1 − p). Any two numbers having the required ratio may be used in place of r and s. Thus 6 : 4 odds are the same as 3 : 2 odds.


Example 4.3 Assume that a probability of 3/4 has been assigned to a certain horse winning a race. Then the odds for a fair bet would be 3/4 : 1/4. These odds could be equally well written as 3 : 1, 6 : 2 or 12 : 4, etc. A fair bet would be to agree to pay $3 if the horse loses and receive $1 if the horse wins. Another fair bet would be to pay $6 if the horse loses and win $2 if the horse wins. ♦
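The conversion between a probability and odds can be sketched in a few lines of Python. The function names `odds_in_favor` and `probability_from_odds` are hypothetical, introduced only for this illustration; the second one uses the relation p = r/(r + s) that Exercise 9 below asks the reader to prove.

```python
from fractions import Fraction

def odds_in_favor(p):
    """Return (r, s) in lowest terms with r/s = p/(1 - p), for 0 < p < 1."""
    p = Fraction(p)
    if not 0 < p < 1:
        raise ValueError("odds are defined here only for 0 < p < 1")
    ratio = p / (1 - p)
    return ratio.numerator, ratio.denominator

def probability_from_odds(r, s):
    """Invert the definition: odds r : s correspond to probability r/(r + s)."""
    return Fraction(r, r + s)

print(odds_in_favor(Fraction(3, 4)))   # the horse of Example 4.3
print(probability_from_odds(6, 2))     # 6 : 2 odds describe the same bet
```

Because `Fraction` reduces to lowest terms, equivalent odds such as 6 : 2 and 12 : 4 all map back to the same probability.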

Exercises

1. Let p and q be statements such that Pr[p ∧ q] = 1/4, Pr[¬p] = 1/3, and Pr[q] = 1/2. What is Pr[p ∨ q]?

[Ans. 11/12.]

2. Using the result of Exercise 1, find Pr[¬p ∧ ¬q].

3. Let p and q be statements such that Pr[p] = 1/2 and Pr[q] = 2/3. Are p and q consistent?

[Ans. Yes.]

4. Show that, if Pr[p] + Pr[q] > 1, then p and q are consistent.

5. A student is worried about his or her grades in English and Art. The student estimates that the probability of passing English is .4, that the probability of passing at least one course is .6, but that the probability of passing both courses is only .1. What is the probability that the student will pass Art?

[Ans. .3.]

6. Given that a school has grades A, B, C, D, and F, and that a student has probability .9 of passing a course, and .6 of getting a grade lower than B, what is the probability that the student will get a C or D?

[Ans. 1/2.]

7. What odds should a person give on a bet that a six will turn up when a die is thrown?

8. Referring to Example 4.2, what odds should the man be willing to give for a bet that either A or B will come in first?

9. Prove that if the odds in favor of a given statement are r : s, then the probability that the statement will be true is r/(r + s).

10. Using the result of Exercise 9 and the definition of “odds”, show that if the odds are r : s that a statement is true, then the odds are s : r that it is false.

11. A gambler is willing to give 5 : 4 odds that the Dodgers will win the World Series. What must the probability of a Dodger victory be for this to be a fair bet?

[Ans. 5/9.]

12. A statistician has found through long experience that if he or she washes the car, it rains the next day 85 per cent of the time. What odds should the statistician give that this will occur next time?

13. A gambler offers 1 : 3 odds that A will occur, 1 : 2 odds that B will occur. The gambler knows that A and B cannot both occur. What odds should he or she give that A or B will occur?

[Ans. 7 : 5.]

14. A gambler offers 3 : 1 odds that A will occur, 2 : 1 odds that B will occur. The gambler knows that A and B cannot both occur. What odds should he or she give that A or B will occur?

15. Show from the definition of a probability measure that m(X) = 1 if and only if X = U.

16. Show from the definition of a probability measure that properties (A), (B) of the text are true.

17. Prove property (D) of the text. Why does property (d) follow from this property?

18. Prove that if R, S, and T are three sets that have no element in common,

m(R ∪ S ∪ T) = m(R) + m(S) + m(T).

19. If X and Y are two sets such that X is a subset of Y, prove that m(X) ≤ m(Y).

20. If p and q are two statements such that p implies q, prove that Pr[p] ≤ Pr[q].

21. Suppose that you are given n statements and each has been assigned a probability less than or equal to r. Prove that the probability of the disjunction of these statements is less than or equal to nr.

22. The following is an alternative proof of property (C′) of the text. Give a reason for each step.

(a) X ∪ Y = (X ∩ Ỹ) ∪ (X ∩ Y) ∪ (X̃ ∩ Y).

(b) m(X ∪ Y) = m(X ∩ Ỹ) + m(X ∩ Y) + m(X̃ ∩ Y).

(c) m(X ∪ Y ) = m(X) +m(Y )−m(X ∩ Y ).

23. If X, Y, and Z are any three sets, prove that, for any probability measure,

m(X ∪ Y ∪ Z) = m(X) +m(Y ) +m(Z)

−m(X ∩ Y )−m(Y ∩ Z)−m(X ∩ Z)

+m(X ∩ Y ∩ Z).

24. Translate the result of Exercise 23 into a result concerning three statements p, q, and r.

25. A man offers to bet “dollars to doughnuts” that a certain event will take place. Assuming that a doughnut costs a nickel, what must the probability of the event be for this to be a fair bet?

[Ans. 20/21.]

26. Show that the inclusion-exclusion formula 3.2 is true if n is replaced by m. Apply this result to

Pr[p1 ∨ p2 ∨ . . . ∨ pn].


4.3 The equiprobable measure

We have already seen several examples where it was natural to assign the same weight to all possibilities in determining the appropriate probability measure. The probability measure determined in this manner is called the equiprobable measure. The measure of sets in the case of the equiprobable measure has a very simple form. In fact, if U has n elements and if the equiprobable measure has been assigned, then for any set X, m(X) is r/n, where r is the number of elements in the set X. This is true since the weight of each element in X is 1/n, and hence the sum of the weights of elements of X is r/n.

The particularly simple form of the equiprobable measure makes it easy to work with. In view of this, it is important to observe that a particular choice for the set of possibilities in a given situation may lead to the equiprobable measure, while some other choice will not. For example, consider the case of two throws of an ordinary coin. Suppose that we are interested in statements about the number of heads which occur. If we take for the possibility set the set U = {HH, HT, TH, TT} then it is reasonable to assign the same weight to each outcome, and we are led to the equiprobable measure. If, on the other hand, we were to take as possible outcomes the set U = {no H, one H, two H}, it would not be natural to assign the same weight to each outcome, since one head can occur in two different ways, while each of the other possibilities can occur in only one way.

Example 4.4 Suppose that we throw two ordinary dice. Each die can turn up a number from 1 to 6; hence there are 6 · 6 possibilities. We assign weight 1/36 to each possibility. A prediction that is true in j cases will then have probability j/36. For example, “The sum of the dice is 5” will be true if we get 1 + 4, 2 + 3, 3 + 2, or 4 + 1. Hence the probability that the sum of the dice is 5 is 4/36 = 1/9. The sum can be 12 in only one way, 6 + 6. Hence the probability that the sum is 12 is 1/36. ♦
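The counting in Example 4.4 can be reproduced by listing the 36 outcomes and counting those in which a statement is true. This is a minimal sketch; the helper name `pr` and the encoding of a statement as a Python function are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely outcomes of two dice.
U = list(product(range(1, 7), repeat=2))

def pr(statement):
    """Probability under the equiprobable measure: (number of true cases) / n."""
    true_cases = [u for u in U if statement(u)]
    return Fraction(len(true_cases), len(U))

print(pr(lambda u: u[0] + u[1] == 5))    # "the sum of the dice is 5"
print(pr(lambda u: u[0] + u[1] == 12))   # "the sum of the dice is 12"
```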

Example 4.5 Suppose that two cards are drawn successively from a deck of cards. What is the probability that both are hearts? There are 52 possibilities for the first card, and for each of these there are 51 possibilities for the second. Hence there are 52 · 51 possibilities for the result of the two draws. We assign the equiprobable measure. The statement “The two cards are hearts” is true in 13 · 12 of the 52 · 51 possibilities. Hence the probability of this statement is 13 · 12/(52 · 51) = 1/17. ♦

Example 4.6 Assume that, on the basis of a predictive index applied to students A, B, and C when entering college, it is predicted that after four years of college the scholastic record of A will be the highest, C the second highest, and B the lowest of the three. Suppose, in fact, that these predictions turn out to be exactly correct. If the predictive index has no merit at all and hence the predictions amount simply to guessing, what is the probability that such a prediction will be correct? There are 3! = 6 orders in which the men might finish. If the predictions were really just guessing, then we would assign an equal weight to each of the six outcomes. In this case the probability that a particular prediction is true is 1/6. Since this probability is reasonably large, we would hesitate to conclude that the predictive index is in fact useful, on the basis of this one experiment. Suppose, on the other hand, it predicted the order of six men correctly. Then a similar analysis would show that, by guessing, the probability is 1/6! = 1/720 that such a prediction would be correct. Hence, we might conclude here that there is strong evidence that the index has some merit. ♦
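The two probabilities in Example 4.6 follow from counting orderings, which the short Python sketch below makes explicit. The function name `chance_of_guessing_order` is hypothetical; the brute-force part simply lists all rankings of A, B, and C.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def chance_of_guessing_order(n):
    """Probability that one fixed guess matches a random ranking of n people."""
    return Fraction(1, factorial(n))

# Brute-force check for three students: count rankings that match the guess.
guess = ("A", "C", "B")
rankings = list(permutations("ABC"))
matches = sum(1 for ranking in rankings if ranking == guess)
print(Fraction(matches, len(rankings)))
print(chance_of_guessing_order(6))
```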

Exercises

1. A letter is chosen at random from the word “random”. What is the probability that it is an n? That it is a vowel?

[Ans. 1/6; 1/3.]

2. An integer between 3 and 12 inclusive is chosen at random. What is the probability that it is an even number? That it is even and divisible by three?

3. A card is drawn at random from a pack of playing cards.

(a) What is the probability that it is either a heart or the king of clubs?

[Ans. 7/26.]

(b) What is the probability that it is either the queen of hearts or an honor card (i.e., ten, jack, queen, king, or ace)?

[Ans. 5/13.]

4. A word is chosen at random from the set of words

U = {men, bird, ball, field, book}.

Let p, q, and r be the statements:

p: The word has two vowels.

q: The first letter of the word is b.

r: The word rhymes with cook.

Find the probability of the following statements.

(a) p.

(b) q.

(c) r.

(d) p ∧ q.

(e) (p ∨ q) ∧ ¬r.

(f) p → q.

[Ans. 4/5.]

5. A single die is thrown. Find the probability that

(a) An odd number turns up.

(b) The number which turns up is greater than two.

(c) A seven turns up.

6. In the Primary voting example of Section 2.1, assume that all 36 possibilities in the elections are equally likely. Find

(a) The probability that candidate A wins more states than either B or C.

[Ans. 7/18.]

(b) That all the states are won by the same candidate.

[Ans. 1/36.]

(c) That every state is won by a different candidate.

[Ans. 0.]

7. A single die is thrown twice. What value for the sum of the two outcomes has the highest probability? What value or values of the sum have the lowest probability of occurring?

8. Two boys and two girls are placed at random in a row for a picture. What is the probability that the boys and girls alternate in the picture?

[Ans. 1/3.]

9. A certain college has 500 students and it is known that

300 read French.
200 read German.
50 read Russian.
20 read French and Russian.
30 read German and Russian.
20 read German and French.
10 read all three languages.

If a student is chosen at random from the school, what is the probability that the student

(a) Reads two and only two languages?

(b) Reads at least one language?

10. Suppose that three people enter a restaurant which has a row of six seats. If they choose their seats at random, what is the probability that they sit with no seats between them? What is the probability that there is at least one empty seat between any two of them?

11. Find the probability of obtaining each of the following poker hands. (A poker hand is a set of five cards chosen at random from a deck of 52 cards.)

(a) Royal flush (ten, jack, queen, king, ace in a single suit).

[Ans. $4/\binom{52}{5} = .0000015$.]

(b) Straight flush (five in a sequence in a single suit, but not a royal flush).

[Ans. $(40 - 4)/\binom{52}{5} = .000014$.]

(c) Four of a kind (four cards of the same face value).

[Ans. $624/\binom{52}{5} = .00024$.]

(d) Full house (one pair and one triple of the same face value).

[Ans. $3744/\binom{52}{5} = .0014$.]

(e) Flush (five cards in a single suit but not a straight or royal flush).

[Ans. $(5148 - 40)/\binom{52}{5} = .0020$.]

(f) Straight (five cards in a row, not all of the same suit).

[Ans. $(10{,}240 - 40)/\binom{52}{5} = .0039$.]

(g) Straight or better.

[Ans. .0076.]

12. If ten people are seated at a circular table at random, what is the probability that a particular pair of people are seated next to each other?

[Ans. 2/9.]

13. A room contains a group of n people who are wearing badges numbered from 1 to n. If two people are selected at random, what is the probability that the larger badge number is a 3? Answer this problem assuming that n = 5, 4, 3, 2.

[Ans. 1/5; 1/3; 2/3; 0.]

14. In Exercise 13, suppose that we observe two men leaving the room and that the larger of their badge numbers is 3. What might we guess as to the number of people in the room?

15. Find the probability that a bridge hand will have suits of

(a) 5, 4, 3, and 1 cards.

[Ans. $\dfrac{4!\binom{13}{5}\binom{13}{4}\binom{13}{3}\binom{13}{1}}{\binom{52}{13}} = .129$.]


(b) 6, 4, 2, and 1 cards.

[Ans. .047.]

(c) 4, 4, 3, and 2 cards.

[Ans. .216.]

(d) 4, 3, 3, and 3 cards.

[Ans. .105.]

16. There are $\binom{52}{13} = 6.35 \times 10^{11}$ possible bridge hands. Find the probability that a bridge hand dealt at random will be all of one suit. Estimate roughly the number of bridge hands dealt in the entire country in a year. Is it likely that a hand of all one suit will occur sometime during the year in the United States?

17. Find the probability of not having a pair in a hand of poker.

18. Find the probability of a “bust” hand in poker. [Hint: A hand is a “bust” if there is no pair, and it is neither a straight nor a flush.]

[Ans. .5012.]

19. In poker, find the probability of having

(a) Exactly one pair.

[Ans. .4226.]

(b) Two pairs.

[Ans. .0475.]

(c) Three of a kind.

[Ans. .0211.]

20. Verify from Exercises 11, 18, 19 that the probabilities for all possible poker hands add up to one (within a rounding error).

21. A certain French professor announces that he or she will select three out of eight pages of text to put on an examination and that each student can choose one of these three pages to translate.


(a) What is the maximum number of pages that a student should prepare in order to be certain of being able to translate a page that he or she has studied?

(b) Smith decides to study only four of the eight pages. What is the probability that one of these four pages will appear on the examination?

4.4 Two nonintuitive examples

There are occasions in probability theory when one finds a problem for which the answer, based on probability theory, is not at all in agreement with one’s intuition. It is usually possible to arrange a few wagers that will bring one’s intuition into line with the mathematical theory. A particularly good example of this is provided by the matching birthdays problem.

Assume that we have a room with r people in it and we propose the bet that there are at least two people in the room having the same birthday, i.e., the same month and day of the year. We ask for the value of r which will make this a fair bet. Few people would be willing to bet even money on this wager unless there were at least 100 people in the room. Most people would suggest 150 as a reasonable number. However, we shall see that with 150 people the odds are approximately 4,100,000,000,000,000 to 1 in favor of two people having the same birthday, and that one should be willing to bet even money with as few as 23 people in the room.

Let us first find the probability that in a room with r people, no two have the same birthday. There are 365 possibilities for each person’s birthday (neglecting February 29). There are then 365^r possibilities for the birthdays of r people. We assume that all these possibilities are equally likely. To find the probability that no two have the same birthday we must find the number of possibilities for the birthdays which have no day represented twice. The first person can have any of 365 days for his or her birthday. For each of these, if the second person is to have a different birthday, there are only 364 possibilities for his or her birthday. For the third person, there are 363 possibilities if he or she is to have a different birthday than the first two, etc. Thus the probability that no two people have the same birthday in a group of r people is

\[ q_r = \frac{365 \cdot 364 \cdots (365 - r + 1)}{365^r}. \]


Figure 4.1: ♦

The probability that at least two people have the same birthday is then p_r = 1 − q_r. In Figure 4.1 the values of p_r and the odds for a fair bet, p_r : (1 − p_r), are given for several values of r.
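The product formula for the probability of no shared birthday is easy to evaluate directly. The following Python sketch is one way to do so, under the same assumptions as the text (365 equally likely days); the function names `q` and `p` are chosen here to mirror the symbols above and are not part of the original.

```python
from fractions import Fraction

def q(r):
    """Probability that no two of r people share a birthday (365 equally likely days)."""
    result = Fraction(1)
    for k in range(r):
        result *= Fraction(365 - k, 365)
    return result

def p(r):
    """Probability that at least two of r people share a birthday."""
    return 1 - q(r)

# Smallest r for which the bet is at least even.
r = 1
while p(r) < Fraction(1, 2):
    r += 1
print(r, float(p(r)))

# Odds in favor of a match with 150 people, as the ratio p : (1 - p).
print(float(p(150) / q(150)))
```

The loop stops at the first group size whose probability of a match reaches 1/2, which is the group size for an even-money bet discussed above.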

We consider now a second problem in which intuition does not lead to the correct answer. A hat-check clerk has checked n hats, but they have become hopelessly scrambled. The clerk hands back the hats at random. What is the probability that at least one head gets its own hat? For this problem some people’s intuition would lead them to guess that for a large number of hats this probability should be small, while others guess that it should be large. Few people guess that the probability is neither large nor small and essentially independent of the number of hats involved.

Let pj be the statement “the jth head gets its own hat back”. We wish to find Pr[p1 ∨ p2 ∨ . . . ∨ pn]. We know from Exercise 26 that a probability of this form can be found from the inclusion-exclusion formula. We must add all probabilities of the form Pr[pi], then subtract the sum of all probabilities of the form Pr[pi ∧ pj], then add the sum of all probabilities of the form Pr[pi ∧ pj ∧ pk], etc.

However, each of these probabilities represents the probability that a particular set of heads get their own hats back. These probabilities are very easy to compute. Let us find the probability that out of n heads some particular m of them get back their own hats. There are n! ways that the hats can be returned. If a particular m of them are to get their own hats there are only (n − m)! ways that it can be done. Hence the probability that a particular m heads get their own hats back is

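\[ \frac{(n - m)!}{n!}. \]

There are $\binom{n}{m}$ different ways we can choose m heads out of n. Hence the mth group of terms contributes

\[ \binom{n}{m} \cdot \frac{(n - m)!}{n!} = \frac{1}{m!} \]

to the alternating sum. Thus

\[ \Pr[p_1 \vee p_2 \vee \ldots \vee p_n] = 1 - \frac{1}{2!} + \frac{1}{3!} - \frac{1}{4!} + \ldots \pm \frac{1}{n!}, \]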

where the + sign is chosen if n is odd and the − sign if n is even. In Figure 4.2, these numbers are given for the first few values of n.

It can be shown that, as the number of hats increases, the probabilities approach a number 1 − (1/e) = .632121 . . ., where the number e = 2.71828 . . . is a number that plays an important role in many branches of mathematics.
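The alternating sum can be compared with a direct count over all ways of returning the hats. The Python sketch below does this for small n; the function names are hypothetical, and the brute-force count is practical only for a handful of hats because it lists all n! arrangements.

```python
from itertools import permutations
from math import e, factorial

def pr_some_match(n):
    """1 - 1/2! + 1/3! - ... ± 1/n!, the alternating sum derived above."""
    return sum((-1) ** (m + 1) / factorial(m) for m in range(1, n + 1))

def pr_some_match_brute_force(n):
    """Fraction of the n! returns in which at least one head gets its own hat."""
    hits = sum(
        1
        for hats in permutations(range(n))
        if any(hats[i] == i for i in range(n))
    )
    return hits / factorial(n)

for n in range(1, 8):
    print(n, round(pr_some_match(n), 6), round(pr_some_match_brute_force(n), 6))

print(1 - 1 / e)  # the limiting value quoted in the text
```

The printed columns agree for each n, and by n = 7 the values already match 1 − (1/e) to about four decimal places.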

Exercises

1. What odds should you be willing to give on a bet that at least two people in the United States Senate have the same birthday?

[Ans. 3,300,000 : 1.]


Figure 4.2: ♦

2. What is the probability that in the House of Representatives at least two men have the same birthday?

3. What odds should you be willing to give on a bet that at least two of the Presidents of the United States have had the same birthday? Would you win the bet?

[Ans. More than 4 : 1; Yes. Polk and Harding were born on Nov. 2.]

4. What odds should you be willing to give on the bet that at least two of the Presidents of the United States have died on the same day of the year? Would you win the bet?

[Ans. More than 2.7 : 1; Yes. Jefferson, Adams, and Monroe all died on July 4.]

5. Four men check their hats. Assuming that the hats are returned at random, what is the probability that exactly four men get their own hats? Calculate the answer for 3, 2, 1, 0 men.

[Ans. 1/24; 0; 1/4; 1/3; 3/8.]

6. A group of 50 knives and forks attend a dance. The partners for a dance are chosen by lot (knives dance with forks). What is the approximate probability that no knife dances with its own fork?


7. Show that the probability that, in a group of r people, exactly one pair has the same birthday is

\[ t_r = \binom{r}{2} \frac{365 \cdot 364 \cdots (365 - r + 2)}{365^r}. \]

8. Show that

\[ t_r = \binom{r}{2} \frac{q_r}{366 - r}, \]

where t_r is defined in Exercise 7, and q_r is the probability that no pair has the same birthday.

9. Using the result of Exercise 8 and the results given in Figure 4.1, find the probability of exactly one pair of people with the same birthday in a group of r people, for r = 15, 20, 25, 30, 40, 50.

[Ans. .22; .32; .38; .38; .26; .12.]

10. What is the approximate probability that there has been exactly one pair of Presidents with the same birthday?

11. Find a formula for the probability of having more than one coincidence of birthdays among n people, i.e., of having at least two pairs of identical birthdays, or of three or more people having the same birthday. [Hint: Take the probability of at least one coincidence, and subtract the probability of having exactly one pair.]

12. Compute the probability of having more than one coincidence of birthdays when there are 20, 25, 30, 40, or 50 people in the room.

13. What is the smallest number of people you need in order to have a better than even chance of finding more than one coincidence of birthdays?

[Ans. 36.]

14. Is it very surprising that there was more than one coincidence of birthdays among the dates on which Presidents died?

15. A game of solitaire is played as follows: A deck of cards is shuffled, and then the player turns the cards up one at a time. As the player turns the cards, he or she calls out the names of the cards in a standard order, say “two of clubs”, “three of clubs”, etc. The object of the game is to go through the entire deck without once calling out the name of the card one turns up. What is the probability of winning? How does the probability change if one uses a single suit in place of a whole deck?

4.5 Conditional probability

Suppose that we have a given U and that measures have been assigned to all subsets of U. A statement p will have probability Pr[p] = m(P). Suppose we now receive some additional information, say that statement q is true. How does this additional information alter the probability of p?

The probability of p after the receipt of the information q is called its conditional probability, and it is denoted by Pr[p|q], which is read “the probability of p given q”. In this section we will construct a method of finding this conditional probability in terms of the measure m.

If we know that q is true, then the original possibility set U has been reduced to Q and therefore we must define our measure on the subsets of Q instead of on the subsets of U. Of course, every subset X of Q is a subset of U, and hence we know m(X), its measure before q was discovered. Since q cuts down on the number of possibilities, its new measure m′(X) should be larger.

The basic idea on which the definition of m′ is based is that, while we know that the possibility set has been reduced to Q, we have no new information about subsets of Q. If X and Y are subsets of Q, and m(X) = 2 · m(Y), then we will want m′(X) = 2 · m′(Y). This will be the case if the measures of subsets of Q are simply increased by a proportionality factor m′(X) = k · m(X), and all that remains is to determine k. Since we know that 1 = m′(Q) = k · m(Q), we see that k = 1/m(Q) and our new measure on subsets of Q is determined by the formula

\[ m'(X) = \frac{m(X)}{m(Q)}. \tag{4.1} \]

How does this affect the probability of p? First of all, the truth set of p has been reduced. Because all elements of Q̃, the complement of Q, have been eliminated, the new truth set of p is P ∩ Q and therefore

\[ \Pr[p|q] = m'(P \cap Q) = \frac{m(P \cap Q)}{m(Q)} = \frac{\Pr[p \wedge q]}{\Pr[q]}. \tag{4.2} \]


Note that if the original measure m is the equiprobable measure, then the new measure m′ will also be the equiprobable measure on the set Q.

We must take care that the denominators in 4.1 and 4.2 be different from zero. Observe that m(Q) will be zero if Q is the empty set, which happens only if q is self-contradictory. This is also the only case in which Pr[q] = 0, and hence we make the obvious assumption that our information q is not self-contradictory.

Example 4.7 In an election, candidate A has a .4 chance of winning, B has .3 chance, C has .2 chance, and D has .1 chance. Just before the election, C withdraws. Now what are the chances of the other three candidates? Let q be the statement that C will not win, i.e., that A or B or D will win. Observe that Pr[q] = .8, hence all the other probabilities are increased by a factor of 1/.8 = 1.25. Candidate A now has .5 chance of winning, B has .375, and D has .125. ♦
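The rescaling in Example 4.7 is formula 4.1 applied to each remaining outcome. A minimal Python sketch follows; the function name `conditional_measure` and the dictionary representation of the weights are assumptions made for this illustration.

```python
from fractions import Fraction

# Weights before C withdraws (Example 4.7).
m = {
    "A": Fraction(4, 10),
    "B": Fraction(3, 10),
    "C": Fraction(2, 10),
    "D": Fraction(1, 10),
}

def conditional_measure(m, Q):
    """Return m' on Q, where m'(x) = m(x) / m(Q) as in formula 4.1."""
    m_Q = sum(m[x] for x in Q)
    if m_Q == 0:
        raise ValueError("the given information must not be self-contradictory")
    return {x: m[x] / m_Q for x in Q}

m_prime = conditional_measure(m, {"A", "B", "D"})
for name in sorted(m_prime):
    print(name, float(m_prime[name]))
```

The guard against m(Q) = 0 corresponds to the requirement above that the information q not be self-contradictory.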

Example 4.8 A family is chosen at random from the set of all families having exactly two children (not twins). What is the probability that the family has two boys, if it is known that there is a boy in the family? Without any information being given, we would assign the equiprobable measure on the set U = {BB, BG, GB, GG}, where the first letter of the pair indicates the sex of the younger child and the second that of the older. The information that there is a boy causes U to change to {BB, BG, GB}, but the new measure is still the equiprobable measure. Thus the conditional probability that there are two boys given that there is a boy is 1/3. If, on the other hand, we know that the first child is a boy, then the possibilities are reduced to {BB, BG} and the conditional probability is 1/2. ♦
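Both conditional probabilities in Example 4.8 can be obtained by restricting the possibility set and counting. In this sketch the helper `pr_given` is hypothetical, and "the first child" is read as the first letter of each pair, as in the example.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a pair of letters, all four pairs equally likely.
U = list(product("BG", repeat=2))

def pr_given(p, q):
    """Pr[p|q] = Pr[p ∧ q] / Pr[q] under the equiprobable measure on U."""
    Q = [u for u in U if q(u)]
    return Fraction(sum(1 for u in Q if p(u)), len(Q))

def two_boys(u):
    return u == ("B", "B")

print(pr_given(two_boys, lambda u: "B" in u))      # there is a boy in the family
print(pr_given(two_boys, lambda u: u[0] == "B"))   # the first child is a boy
```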

A particularly interesting case of conditional probability is that in which Pr[p|q] = Pr[p]. That is, the information that q is true has no effect on our prediction for p. If this is the case, we note that

Pr[p ∧ q] = Pr[p]Pr[q]. (4.3)

And the case Pr[q|p] = Pr[q] leads to the same equation. Whenever equation 4.3 holds, we say that p and q are independent. Thus if q is not a self-contradiction, p and q are independent if and only if Pr[p|q] = Pr[p].


Example 4.9 Consider three throws of an ordinary coin, where we consider the eight possibilities to be equally likely. Let p be the statement “A head turns up on the first throw” and q be the statement, “A tail turns up on the second throw”. Then Pr[p] = Pr[q] = 1/2 and Pr[p ∧ q] = 1/4 and therefore p and q are independent statements. ♦

While we have an intuitive notion of independence, it can happen that two statements, which may not seem to be independent, are in fact independent. For example, let r be the statement “The same side turns up all three times”. Let s be the statement “At most one head occurs”. Then r and s are independent statements (see Exercise 10).
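Independence in the sense of equation 4.3 can be tested mechanically by enumerating the eight outcomes. The Python sketch below does so for the statements of Example 4.9; the helper names `pr` and `independent` are assumptions of this illustration, and a numerical check of this kind is not a substitute for the proof requested in Exercise 10.

```python
from fractions import Fraction
from itertools import product

U = list(product("HT", repeat=3))  # eight equally likely outcomes

def pr(statement):
    """Probability of a statement under the equiprobable measure on U."""
    return Fraction(sum(1 for u in U if statement(u)), len(U))

def independent(a, b):
    """Test equation 4.3: Pr[a ∧ b] = Pr[a] Pr[b]."""
    return pr(lambda u: a(u) and b(u)) == pr(a) * pr(b)

def p(u):
    return u[0] == "H"           # a head on the first throw

def q(u):
    return u[1] == "T"           # a tail on the second throw

def r(u):
    return len(set(u)) == 1      # the same side all three times

def s(u):
    return u.count("H") <= 1     # at most one head

print(independent(p, q), independent(r, s))
```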

An important use of conditional probabilities arises in the following manner. We wish to find the probability of a statement p. We observe that there is a complete set of alternatives q1, q2, . . . , qn such that the probability Pr[qi] as well as the conditional probabilities Pr[p|qi] can be found for every i. Then in terms of these we can find Pr[p] by

Pr[p] = Pr[q1]Pr[p|q1] + Pr[q2]Pr[p|q2] + . . .+ Pr[qn]Pr[p|qn].

The proof of this assertion is left as an exercise (see Exercise 13).

Example 4.10 A psychology student once studied the way mathematicians solve problems and contended that at times they try too hard to look for symmetry in a problem. To illustrate this she asked a number of mathematicians the following problem: Fifty balls (25 white and 25 black) are to be put in two urns, not necessarily the same number of balls in each. How should the balls be placed in the urns so as to maximize the chance of drawing a black ball, if an urn is chosen at random and a ball drawn from this urn? A quite surprising number of mathematicians answered that you could not do any better than 1/2 by the symmetry of the problem. In fact one can do a good deal better by putting one black ball in urn 1, and all the 49 other balls in urn 2. To find the probability in this case let p be the statement “A black ball is drawn”, q1 the statement “Urn 1 is drawn” and q2 the statement “Urn 2 is drawn”. Then q1 and q2 are a complete set of alternatives so

Pr[p] = Pr[q1]Pr[p|q1] + Pr[q2]Pr[p|q2].

But Pr[q1] = Pr[q2] = 1/2 and Pr[p|q1] = 1, Pr[p|q2] = 24/49. Thus

Pr[p] = 1/2 · 1 + 1/2 · 24/49 = 73/98 = .745.

When told the answer, a number of the mathematicians that had said 1/2 replied that they thought there had to be the same number of balls in each urn. However, since this had been carefully stated not to be necessary, they also had fallen into the trap of assuming too much symmetry. ♦
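The computation in Example 4.10 is an instance of the formula Pr[p] = Pr[q1]Pr[p|q1] + Pr[q2]Pr[p|q2], and it can be evaluated for every possible way of filling the urns. The sketch below is illustrative only: the function name `pr_black` is hypothetical, and the treatment of an empty urn (it contributes nothing, so choosing it counts as failing to draw a black ball) is an assumption that the example itself does not address.

```python
from fractions import Fraction

def pr_black(black_in_urn1, white_in_urn1, black_total=25, white_total=25):
    """Pr[black] when an urn is chosen at random, then a ball from that urn."""
    urns = [
        (black_in_urn1, white_in_urn1),
        (black_total - black_in_urn1, white_total - white_in_urn1),
    ]
    total = Fraction(0)
    for black, white in urns:
        if black + white == 0:
            # Assumption: drawing from an empty urn counts as a failure.
            continue
        total += Fraction(1, 2) * Fraction(black, black + white)
    return total

print(pr_black(1, 0))      # one black ball alone in urn 1
print(pr_black(12, 13))    # a nearly symmetric split

# Exhaustive search over every way of filling urn 1.
best = max(
    ((b, w) for b in range(26) for w in range(26)),
    key=lambda bw: pr_black(*bw),
)
print(best, pr_black(*best))
```

The exhaustive search is a numerical check rather than a proof; Exercise 14 below asks for the argument that this arrangement is in fact the best possible.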

Exercises

1. A card is drawn at random from a pack of playing cards. What is the probability that it is a 5, given that it is between 2 and 7 inclusive?

2. A die is loaded in such a way that the probability of a given number turning up is proportional to that number (e.g., a 6 is three times as likely to turn up as a 2).

(a) What is the probability of rolling a 3 given that an odd number turns up?

[Ans. 1/3.]

(b) What is the probability of rolling an even number given that a number greater than three turns up?

[Ans. 2/3.]

3. A die is thrown twice. What is the probability that the sum of the faces which turn up is greater than 10, given that one of them is a 6? Given that the first throw is a 6?

[Ans. 3/11; 1/3.]

4. Referring to Exercise 9 of Section 4.3, what is the probability that the student selected studies German if

(a) He or she studies French?

(b) He or she studies French and Russian?

(c) He or she studies neither French nor Russian?

5. In the primary voting example of Section 2.1, assuming that the equiprobable measure has been assigned, find the probability that A wins at least two primaries, given that B drops out of the Wisconsin primary.

[Ans. 7/9.]

6. If Pr[¬p] = 1/4 and Pr[q|p] = 1/2, what is Pr[p ∧ q]?

[Ans. 3/8.]

7. A student takes a five-question true-false exam. What is the probability that the student will get all answers correct if

(a) The student is only guessing?

(b) The student knows that the instructor puts more true than false questions on his or her exams?

(c) The student also knows that the instructor never puts three questions in a row with the same answer?

(d) The student also knows that the first and last questions must have the opposite answer?

(e) The student also knows that the answer to the second problem is “false”?

8. Three persons, A, B, and C, are placed at random in a straight line. Let r be the statement “B is to the right of A” and let s be the statement “C is to the right of A”.

(a) What is the Pr[r ∧ s]?

[Ans. 1/3.]

(b) Are r and s independent?

[Ans. No.]

9. Let a deck of cards consist of the jacks and queens chosen from a bridge deck, and let two cards be drawn from the new deck. Find

(a) The probability that the cards are both jacks, given that one is a jack.

[Ans. 3/11 = .27.]

(b) The probability that the cards are both jacks, given that one is a red jack.

[Ans. 5/13 = .38.]

(c) The probability that the cards are both jacks, given that one is the jack of hearts.

[Ans. 3/7 = .43.]

10. Prove that statements r and s in Example 4.9 are independent.

11. The following example shows that r may be independent of p and q without being independent of p ∧ q and p ∨ q. We throw a coin twice. Let p be “The first toss comes out heads”, q be “The second toss comes out heads”, and r be “The two tosses come out the same”. Compute Pr[r], Pr[r|p], Pr[r|q], Pr[r|p ∧ q], Pr[r|p ∨ q].

[Ans. 1/2, 1/2, 1/2, 1, 1/3.]

12. Prove that for any two statements p and q,

Pr[p] = Pr[p ∧ q] + Pr[p ∧ ¬q].

13. Let p be any statement and q1, q2, q3 be a complete set of alternatives. Prove that

Pr[p] = Pr[q1]Pr[p|q1] + Pr[q2]Pr[p|q2] + Pr[q3]Pr[p|q3].

14. Prove that the procedure given in Example 4.10 does maximize the chance of getting a black ball. [Hint: Show that you can assume that one urn contains more black balls than white balls and then consider what is the best that could be achieved, first in the urn with more black than white balls, and then in the urn with more white than black balls.]


15. Assume that p and q are independent statements relative to a given measure. Prove that each of the following pairs of statements is independent relative to this same measure.

(a) p and ¬q.

(b) ¬p and q.

(c) ¬p and ¬q.


16. Prove that for any three statements p, q, and r,

Pr[p ∧ q ∧ r] = Pr[p] · Pr[q|p] · Pr[r|p ∧ q].

17. A coin is thrown twice. Let p be the statement “Heads turns up on the first toss” and q the statement “Heads turns up on the second toss”. Show that it is possible to assign a measure to the possibility space {HH, HT, TH, TT} so that these statements are not independent.

18. A multiple-choice test question lists five alternative answers, of which just one is correct. If a student has done the homework, then he or she is certain to identify the correct answer; otherwise, the student chooses an answer at random. Let p be the statement “The student does the homework” and q the statement “The student answers the question correctly”. Let Pr[p] = a.

(a) Find a formula for Pr[p|q] in terms of a.

(b) Show that Pr[p|q] ≥ Pr[p] for all values of a. When does the equality hold?

19. A coin is weighted so that heads has probability .7, tails has probability .2, and it stands on edge with probability .1. What is the probability that it does not come up heads, given that it does not come up tails?

[Ans. 1/8.]

20. A card is drawn at random from a deck of playing cards. Are the following pairs of statements independent?

(a) p: A jack is drawn. q: A black card is drawn.

(b) p: An even numbered heart is drawn. q: A red card smaller than a five is drawn.

21. A simple genetic model for the color of a person’s eyes is the following: There are two kinds of color-determining genes, B and b, and each person has two color-determining genes. If both are b, he or she has blue eyes; otherwise he or she has brown eyes. Assume that one-quarter of the people have two B genes, one-quarter of the people have two b genes, and the rest have one B gene and one b gene.


(a) If a person has brown eyes, what is the probability that he or she has two B genes?

Assume that a child’s mother and father have brown eyes and blue eyes, respectively.

(b) What is the probability that the child will have brown eyes?

(c) If the child has brown eyes, what is the probability that the mother has two B genes?

[Ans. 1/2.]

22. Three red, three green, and three blue balls are to be put into three urns, with at least two balls in each urn. Then an urn is selected at random and two balls withdrawn.

(a) How should the balls be put in the urns in order to maximize the probability of drawing two balls of different color? What is the probability?

[Ans. 1.]

(b) How should the balls be put in the urns in order to maximize the probability of withdrawing a red and a green ball? What is the maximum probability?

[Ans. 7/10.]

4.6 Finite stochastic processes

We consider here a very general situation which we will specialize in later sections. We deal with a sequence of experiments where the outcome on each particular experiment depends on some chance element. Any such sequence is called a stochastic process. (The Greek word “stochos” means “guess”.) We shall assume a finite number of experiments and a finite number of possibilities for each experiment. We assume that, if all the outcomes of the experiments which precede a given experiment were known, then both the possibilities for this experiment and the probability that any particular possibility will occur would be known. We wish to make predictions about the process as a whole. For example, in the case of repeated throws of an ordinary coin we would assume that on any particular experiment we have two outcomes, and the probability of each of these outcomes is 1/2 regardless of any other outcomes. We might be interested, however, in the probabilities of statements of the form, “More than two-thirds of the throws result in heads”, or “The number of heads and tails which occur is the same”, etc. These are questions which can be answered only when a probability measure has been assigned to the process as a whole. In this section we show how a probability measure can be assigned, using the given information. In the case of coin tossing, the probabilities (hence also the possibilities) on any given experiment do not depend upon the previous results. We will not make any such restriction here since the assumption is not true in general.

Figure 4.3: ♦

We shall show how the probability measure is constructed for a particular example, and the procedure in the general case is similar. We assume that we have a sequence of three experiments, the possibilities for which are indicated in Figure 4.3. The set of all possible outcomes which might occur on any of the experiments is represented by the set {a, b, c, d, e, f}. Note that if we know that outcome b occurred on the first experiment, then we know that the possibilities on experiment two are {a, e, d}. Similarly, if we know that b occurred on the first experiment and a on the second, then the only possibilities for the third are {c, f}. We denote by pa the probability that the first experiment results in outcome a, and by pb the probability that outcome b occurs in the first experiment. We denote by pb,d the probability that outcome d occurs on the second experiment, which is the probability computed on the assumption that outcome b occurred on the first experiment. Similarly for pb,a, pb,e, pa,a, pa,c. We denote by pbd,c the probability that outcome c occurs on the third experiment, the latter probability being computed on the assumption that outcome b occurred on the first experiment and d on the second. Similarly for pba,c, pba,f, etc. We have assumed that these numbers are given and the fact that they are probabilities assigned to possible outcomes would mean that they are positive and that pa + pb = 1, pb,a + pb,e + pb,d = 1, and pbd,a + pbd,c = 1, etc.

It is convenient to associate each probability with the branch of the tree that connects to the branch point representing the predicted outcome. We have done this in Figure 4.3 for several branches. The sum of the numbers assigned to branches from a particular branch point is 1, e.g., pb,a + pb,e + pb,d = 1.

A possibility for the sequence of three experiments is indicated by a path through the tree. We define now a probability measure on the set of all paths. We call this a tree measure. To the path corresponding to outcome b on the first experiment, d on the second, and c on the third, we assign the weight pb · pb,d · pbd,c. That is the product of the probabilities associated with each branch along the path being considered. We find the weight of each path through the tree in the same way.

Before showing the reason for this choice, we must first show that it determines a probability measure, in other words, that the weights are positive and the sum of the weights is 1. The weights are products of positive numbers and hence positive. To see that their sum is 1 we first find the sum of the weights of all paths corresponding to a particular outcome, say b, on the first experiment and a particular outcome, say d, on the second. We have

pb · pb,d · pbd,a + pb · pb,d · pbd,c = pb · pb,d [pbd,a + pbd,c] = pb · pb,d.

For any other first two outcomes we would obtain a similar result. For example, the sum of the weights assigned to paths corresponding to outcome a on the first experiment and c on the second is pa · pa,c. Notice that when we have verified that we have a probability measure, this will be the probability that the first experiment results in a and the second experiment results in c.

Next we find the sum of the weights assigned to all the paths corresponding to the cases where the outcome of the first experiment is b. We find this by adding the sums corresponding to the different possibilities for the second experiment. But by our preceding calculation this is

pb · pb,a + pb · pb,e + pb · pb,d = pb [pb,a + pb,e + pb,d] = pb.

Similarly, the sum of the weights assigned to paths corresponding to the outcome a on the first experiment is pa. Thus the sum of all weights is pa + pb = 1. Therefore we do have a probability measure. Note that we have also shown that the probability that the outcome of the first experiment is a has been assigned probability pa in agreement with our given probability.
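The construction can be sketched in Python. Since Figure 4.3 is not reproduced here, the tree below is hypothetical: its shape follows the possibilities named in the text where they are given, but every branch probability is invented for illustration, and the names TREE, path_weights, measure, and pr are assumptions of this sketch.

```python
from fractions import Fraction as F

# Hypothetical three-stage tree. Each node maps an outcome to
# (branch probability, subtree). All numbers are invented for illustration.
TREE = {
    "a": (F(1, 3), {
        "a": (F(1, 2), {"b": (F(1, 1), {})}),
        "c": (F(1, 2), {"d": (F(1, 4), {}), "e": (F(3, 4), {})}),
    }),
    "b": (F(2, 3), {
        "a": (F(1, 5), {"c": (F(1, 2), {}), "f": (F(1, 2), {})}),
        "e": (F(2, 5), {"a": (F(1, 1), {})}),
        "d": (F(2, 5), {"a": (F(1, 3), {}), "c": (F(2, 3), {})}),
    }),
}

def path_weights(tree, prefix=(), weight=F(1)):
    """Yield (path, weight) for every complete path through the tree.
    The weight is the product of the branch probabilities along the path."""
    for outcome, (prob, subtree) in tree.items():
        path, w = prefix + (outcome,), weight * prob
        if subtree:
            yield from path_weights(subtree, path, w)
        else:
            yield path, w

measure = dict(path_weights(TREE))

def pr(event):
    """Probability of a statement, given as a predicate on paths."""
    return sum((w for path, w in measure.items() if event(path)), F(0))

print(sum(measure.values()))               # total weight of all paths
print(pr(lambda path: path[0] == "b"))     # Pr[X1 = b]
print(measure[("b", "d", "c")])            # weight of the path b, d, c
```

Because the weights are exact fractions, the total can be compared with 1 directly, and the probability of any statement about the process is obtained by summing the weights of the paths in its truth set.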

To see the complete connection of our new measure with the given probabilities, let Xj = z be the statement “The outcome of the jth experiment was z”. Then the statement [X1 = b ∧ X2 = d ∧ X3 = c] is a compound statement that has been assigned probability pb · pb,d · pbd,c. The statement [X1 = b ∧ X2 = d] we have noted has been assigned probability pb · pb,d and the statement [X1 = b] has been assigned probability pb. Thus

Pr[X3 = c | X2 = d ∧ X1 = b] = (pb · pb,d · pbd,c) / (pb · pb,d) = pbd,c,

Pr[X2 = d | X1 = b] = (pb · pb,d) / pb = pb,d.

Thus we see that our probabilities, computed under the assumption that previous results were known, become the corresponding conditional probabilities when computed with respect to the tree measure. It can be shown that the tree measure which we have assigned is the only one which will lead to this agreement. We can now find the probability of any statement concerning the stochastic process from our tree measure.

Example 4.11 Suppose that we have two urns. Urn 1 contains two black balls and three white balls. Urn 2 contains two black balls and one white ball. An urn is chosen at random and a ball chosen from this urn at random. What is the probability that a white ball is chosen? A hasty answer might be 1/2 since there are an equal number of black and white balls involved and everything is done at random. However, it is hasty answers like this (which is wrong) which show the need for a more careful analysis.

We are considering two experiments. The first consists in choosing the urn and the second in choosing the ball. There are two possibilities for the first experiment, and we assign p1 = p2 = 1/2 for the probabilities of choosing the first and the second urn, respectively. We then assign p1,W = 3/5 for the probability that a white ball is chosen, under the assumption that urn 1 is chosen. Similarly we assign p1,B = 2/5, p2,W = 1/3, p2,B = 2/3. We indicate these probabilities on their possibility tree in Figure 4.4. The probability that a white ball is drawn is then found from the tree measure as the sum of the weights assigned to paths which lead to a choice of a white ball. This is 1/2 · 3/5 + 1/2 · 1/3 = 7/15. ♦

Figure 4.4: ♦
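The arithmetic of Example 4.11 can be reproduced with exact fractions. This is a minimal sketch; the dictionary layout and the name pr_color are choices made here, not notation from the text.

```python
from fractions import Fraction as F

# First experiment: which urn is chosen.
urn_prob = {1: F(1, 2), 2: F(1, 2)}
# Contents of each urn: number of black (B) and white (W) balls.
contents = {1: {"B": 2, "W": 3}, 2: {"B": 2, "W": 1}}

def pr_color(color):
    """Sum the weights of all paths (urn, color) ending in the given color."""
    total = F(0)
    for urn, p_urn in urn_prob.items():
        balls = contents[urn]
        total += p_urn * F(balls[color], sum(balls.values()))
    return total

print(pr_color("W"))  # 7/15
print(pr_color("B"))  # 8/15
```

The two values add to 1, as they must, and neither equals the hasty answer 1/2.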

Example 4.12 Suppose that a drunkard leaves a bar which is on a corner which he or she knows to be one block from home. He or she is unable to remember which street leads home, and proceeds to try each of the streets at random without ever choosing the same street twice until he or she goes on the one which leads home. What possibilities are there for the trip home, and what is the probability for each of these possible trips? We label the streets A, B, C, and Home. The possibilities together with typical probabilities are given in Figure 4.5. The probability for any particular trip, or path, is found by taking the product of the branch probabilities. ♦
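The trips of Example 4.12 can be enumerated mechanically. The sketch below assumes that at each attempt the drunkard picks uniformly among the streets not yet tried, which is how the phrase “at random without ever choosing the same street twice” is read here; the names STREETS, trips, and measure are illustrative.

```python
from fractions import Fraction as F

STREETS = ("A", "B", "C", "Home")

def trips(remaining=STREETS, path=(), weight=F(1)):
    """Yield (trip, weight) for every possible trip.
    Each untried street is chosen with equal probability (an assumption)."""
    for street in remaining:
        w = weight * F(1, len(remaining))
        new_path = path + (street,)
        if street == "Home":
            yield new_path, w            # the trip ends at home
        else:
            rest = tuple(s for s in remaining if s != street)
            yield from trips(rest, new_path, w)

measure = dict(trips())

for trip, w in sorted(measure.items(), key=lambda item: len(item[0])):
    print(" -> ".join(trip), w)
```

Each printed weight is the product of the branch probabilities along that trip, and the weights of all trips add up to 1.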

Example 4.13 Assume that we are presented with two slot machines, A and B. Each machine pays the same fixed amount when it pays off. Machine A pays off each time with probability 1/2, and machine B with probability 1/4. We are not told which machine is A. Suppose that we choose a machine at random and win. What is the probability that we chose machine A? We first construct the tree (Figure 4.6) to show the possibilities and assign branch probabilities to determine a tree measure. Let p be the statement “Machine A was chosen” and q be the statement “The machine chosen paid off”. Then we are asked for

Pr[p|q] = Pr[p ∧ q] / Pr[q].


Figure 4.5: ♦

Figure 4.6: ♦


The truth set of the statement p ∧ q consists of a single path which has been assigned weight 1/4. The truth set of the statement q consists of two paths, and the sum of the weights of these paths is 1/2 · 1/2 + 1/2 · 1/4 = 3/8. Thus Pr[p|q] = (1/4)/(3/8) = 2/3. Thus if we win, it is more likely that we have machine A than B and this suggests that next time we should play the same machine. If we lose, however, it is more likely that we have machine B than A, and hence we would switch machines before the next play. (See Exercise 9.) ♦
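The reasoning of Example 4.13 can be sketched as a small function. The names prior, pay, and posterior are chosen for this illustration; the numbers are those of the example.

```python
from fractions import Fraction as F

prior = {"A": F(1, 2), "B": F(1, 2)}   # machine chosen at random
pay = {"A": F(1, 2), "B": F(1, 4)}     # probability that each machine pays off

def posterior(won):
    """Conditional probability of each machine, given a win or a loss.
    Each path weight is prior times the branch probability of the result."""
    weights = {m: prior[m] * (pay[m] if won else 1 - pay[m]) for m in prior}
    total = sum(weights.values())       # Pr[q] or Pr[not q]
    return {m: w / total for m, w in weights.items()}

after_win = posterior(True)
after_loss = posterior(False)
print(after_win["A"])                    # 2/3
print(after_loss["B"] > after_loss["A"])  # True: after a loss, B is more likely
```

Calling posterior(False) gives the probabilities behind the remark that a loss makes machine B the more likely one.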

Exercises

1. The fractions of Republicans, Democrats, and Independent voters in cities A and B are

City A: .30 Republican, .40 Democratic, .30 Independent;

City B: .40 Republican, .50 Democratic, .10 Independent.

A city is chosen at random and two voters are chosen successively and at random from the voters of this city. Construct a tree measure and find the probability that two Democrats are chosen. Find the probability that the second voter chosen is an Independent voter.

[Ans. .205; .2.]

2. A coin is thrown. If a head turns up, a die is rolled. If a tail turns up, the coin is thrown again. Construct a tree measure to represent the two experiments and find the probability that the die is thrown and a six turns up.

3. An athlete wins a certain tournament if he or she can win two consecutive games out of three played alternately with two opponents A and B. A is a better player than B. The probability of winning a game when B is the opponent is 2/3. The probability of winning a game when A is the opponent is only 1/3. Construct a tree measure for the possibilities for three games, assuming that he or she plays alternately but plays A first. Do the same assuming that he or she plays B first. In each case find the probability that he or she will win two consecutive games. Is it better to play two games against the strong player or against the weaker player?

[Ans. 10/27; 8/27; better to play strong player twice.]

4. Construct a tree measure to represent the possibilities for four throws of an ordinary coin. Assume that the probability of a head on any toss is 1/2 regardless of any information about other throws.

5. A student claims to be able to distinguish beer from ale. The student is given a series of three tests. In each test, the student is given two cans of beer and one of ale and asked to pick out the ale. If the student gets two or more correct we will admit the claim. Draw a tree to represent the possibilities (either right or wrong) for the student’s answers. Construct the tree measure which would correspond to guessing and find the probability that the claim will be established if the student guesses on every trial.

6. A box contains three defective light bulbs and seven good ones. Construct a tree to show the possibilities if three consecutive bulbs are drawn at random from the box (they are not replaced after being drawn). Assign a tree measure and find the probability that at least one good bulb is drawn out. Find the probability that all three are good if the first bulb is good.

[Ans. 119/120; 5/12.]

7. In Example 4.12, find the probability that the drunkard reaches home after trying at most one wrong street.

8. In Example 4.13, find the probability that machine A was chosen, given that we lost.

9. In Example 4.13, assume that we make two plays. Find the probability that we win at least once under the assumption

(a) That we play the same machine twice.

[Ans. 19/32.]

(b) That we play the same machine the second time if and only if we won the first time.

[Ans. 20/32.]


10. A chess player plays three successive games of chess. The player’s psychological makeup is such that the probability of winning a given game is (1/2)^(k+1), where k is the number of games won so far. (For instance, the probability of winning the first game is 1/2, the probability of winning the second game if the player has already won the first game is 1/4, etc.) What is the probability that the player will win at least two of the three games?

11. Before a political convention, a political expert has assigned the following probabilities. The probability that the President will be willing to run again is 1/2. If the President is willing to run, the President and his or her Vice President are sure to be nominated and have probability 3/5 of being elected again. If the President does not run, the present Vice President has probability 1/10 of being nominated, and any other presidential candidate has probability 1/2 of being elected. What is the probability that the present Vice President will be re-elected?

[Ans. 13/40.]

12. There are two urns, A and B. Urn A contains one black and one red ball. Urn B contains two black and three red balls. A ball is chosen at random from urn A and put into urn B. A ball is then drawn at random from urn B.

(a) What is the probability that both balls drawn are of the same color?

[Ans. 7/12.]

(b) What is the probability that the first ball drawn was red, given that the second ball drawn was black?

[Ans. 2/5.]

13. Assume that in the World Series each team has probability 1/2 of winning each game, independently of the outcomes of any other game. Assign a tree measure. (See Section ?? for the tree.) Find the probability that the series ends in four, five, six, and seven games, respectively.


14. Assume that in the World Series one team is stronger than the other and has probability .6 for winning each of the games. Assign a tree measure and find the following probabilities.

(a) The probability that the stronger team wins in 4, 5, 6, and 7 games, respectively.

(b) The probability that the weaker team wins in 4, 5, 6, and 7 games, respectively.

(c) The probability that the series ends in 4, 5, 6, and 7 games, respectively.

[Ans. .16; .27; .30; .28.]

(d) The probability that the strong team wins the series.

[Ans. .71.]

15. Redo Exercise 14 for the case of two poorly matched teams, where the better team has probability .9 of winning a game.

[Ans. (c) .66; .26; .07; .01; (d) .997.]

16. In the World Series from 1905 through 1965 (excluding series of more than seven games) there were 11 four-game, 14 five-game, 13 six-game, and 20 seven-game series. Which of the assumptions in Exercises 13, 14, 15 comes closest to predicting these results? Is it a good fit?

[Ans. .6; No.]

17. Consider the following assumption concerning World Series: Ninety per cent of the time the two teams are evenly matched, while 10 per cent of the time they are poorly matched, with the better team having probability .9 of winning a game. Show that this assumption comes closer to predicting the actual outcomes than those considered in Exercise 16.

18. We are given three coins. Coin A is fair while coins B and C are loaded: B has probability .6 of heads and C has probability .4 of heads. A game is played by tossing a coin twice starting with coin B. If a head is obtained, B is tossed again, otherwise the second coin to be tossed is chosen at random from A and C.


(a) Draw the tree for this game, assigning branch and path weights.

(b) Let p be the statement “The first toss results in heads” and let q be the statement “The second toss results in heads”. Find Pr[p], Pr[q], Pr[q|p].

[Ans. .6; .54; .6.]

19. A and B play a series of games for which they are evenly matched. A player wins the series either by winning two games in a row, or by winning a total of three games. Construct the tree and the tree measure.

(a) What is the probability that A wins the series?

(b) What is the probability that more than three games need to be played?

20. In a room there are three chests, each chest contains two drawers, and each drawer contains one coin. In one chest each drawer contains a gold coin; in the second chest each drawer contains a silver coin; and in the last chest one drawer contains a gold coin and the other contains a silver coin. A chest is picked at random and then a drawer is picked at random from that chest. When the drawer is opened, it is found to contain a gold coin. What is the probability that the other drawer of that same chest will also contain a gold coin?

[Ans. 2/3.]

4.7 Bayes’s probabilities

The following situation often occurs. Measures have been assigned in a possibility space U. A complete set of alternatives, p1, p2, . . . , pn has been singled out. Their probabilities are determined by the assigned measure. (Recall that a complete set of alternatives is a set of statements such that for any possible outcome one and only one of the statements is true.) We are now given that a statement q is true. We wish to compute the new probabilities for the alternatives relative to this information. That is, we wish the conditional probabilities Pr[pj|q] for each pj. We shall give two different methods for obtaining these probabilities.

The first is by a general formula. We illustrate this formula for the case of four alternatives: p1, p2, p3, p4. Consider Pr[p2|q]. From the definition of conditional probability,

Pr[p2|q] = Pr[p2 ∧ q] / Pr[q].

But since p1, p2, p3, p4 are a complete set of alternatives,

Pr[q] = Pr[p1 ∧ q] + Pr[p2 ∧ q] + Pr[p3 ∧ q] + Pr[p4 ∧ q].

Thus

Pr[p2|q] = Pr[p2 ∧ q] / (Pr[p1 ∧ q] + Pr[p2 ∧ q] + Pr[p3 ∧ q] + Pr[p4 ∧ q]).

Since Pr[pj ∧ q] = Pr[pj]Pr[q|pj], we have the desired formula

Pr[p2|q] = Pr[p2]Pr[q|p2] / (Pr[p1]Pr[q|p1] + Pr[p2]Pr[q|p2] + Pr[p3]Pr[q|p3] + Pr[p4]Pr[q|p4]).

Similar formulas apply for the other alternatives, and the formula generalizes in an obvious way to any number of alternatives. In its most general form it is called Bayes’s theorem.

Example 4.14 Suppose that a freshman must choose among mathematics, physics, chemistry, and botany as his or her science course. On the basis of the interest he or she expressed, his or her adviser assigns probabilities of .4, .3, .2 and .1 to the student’s choosing each of the four courses, respectively. The adviser does not hear which course the student actually chose, but at the end of the term the adviser hears that he or she received an A in the course chosen. On the basis of the difficulties of these courses the adviser estimates the probability of the student getting an A in mathematics to be .1, in physics .2, in chemistry .3, and in botany .9. How can the adviser revise the original estimates as to the probabilities of the student taking the various courses? Using Bayes’s theorem we get

Pr[The student took math | The student got an A] = (.4)(.1) / [(.4)(.1) + (.3)(.2) + (.2)(.3) + (.1)(.9)] = 4/25 = .16.

Similar computations assign probabilities of .24, .24, and .36 to the other three courses. Thus the new information, that the student received an A, had little effect on the probability of having taken physics or chemistry, but it has made mathematics less likely, and botany much more likely. ♦
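The general formula translates directly into a short function. The sketch below applies it to the numbers of Example 4.14; the function name bayes and the dictionary keys are choices made for this illustration.

```python
from fractions import Fraction as F

def bayes(priors, likelihoods):
    """Return Pr[pj | q] for every alternative pj.
    priors[j] = Pr[pj]; likelihoods[j] = Pr[q | pj]."""
    joint = {j: priors[j] * likelihoods[j] for j in priors}   # Pr[pj and q]
    total = sum(joint.values())                               # Pr[q]
    return {j: w / total for j, w in joint.items()}

priors = {"math": F(4, 10), "physics": F(3, 10),
          "chemistry": F(2, 10), "botany": F(1, 10)}
pr_a = {"math": F(1, 10), "physics": F(2, 10),
        "chemistry": F(3, 10), "botany": F(9, 10)}

revised = bayes(priors, pr_a)
for course, p in revised.items():
    print(course, p, float(p))
```

The denominator is computed once and shared by all alternatives, which is why all four revised probabilities come out of a single pass.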


It is important to note that knowing the conditional probabilities of q relative to the alternatives is not enough. Unless we also know the probabilities of the alternatives at the start, we cannot apply Bayes’s theorem. However, in some situations it is reasonable to assume that the alternatives are equally probable at the start. In this case the factors Pr[p1], . . . , Pr[p4] cancel from our basic formula, and we get the special form of the theorem:

If Pr[p1] = Pr[p2] = Pr[p3] = Pr[p4] then

Pr[p2|q] = Pr[q|p2] / (Pr[q|p1] + Pr[q|p2] + Pr[q|p3] + Pr[q|p4]).

Example 4.15 In a sociological experiment the subjects are handed four sealed envelopes, each containing a problem. They are told to open one envelope and try to solve the problem in ten minutes. From past experience, the experimenter knows that the probability of their being able to solve the hardest problem is .1. With the other problems, they have probabilities of .3, .5, and .8. Assume the group succeeds within the allotted time. What is the probability that they selected the hardest problem? Since they have no way of knowing which problem is in which envelope, they choose at random, and we assign equal probabilities to the selection of the various problems. Hence the above simple formula applies. The probability of their having selected the hardest problem is

.1 / (.1 + .3 + .5 + .8) = 1/17.

♦
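The special form for equally probable alternatives can be checked against the general formula. In this sketch the function name bayes_equal_priors is an assumption, and the explicit prior of 1/4 is included only to confirm that the common factor cancels.

```python
from fractions import Fraction as F

def bayes_equal_priors(likelihoods):
    """Special form: with equal priors, Pr[pj | q] is Pr[q | pj]
    divided by the sum of all the Pr[q | pk]."""
    total = sum(likelihoods)
    return [x / total for x in likelihoods]

# Probabilities of solving each of the four problems, hardest first.
solve = [F(1, 10), F(3, 10), F(5, 10), F(8, 10)]
special = bayes_equal_priors(solve)

# General formula with explicit equal priors of 1/4, for comparison.
prior = F(1, 4)
joint = [prior * x for x in solve]
general = [w / sum(joint) for w in joint]

print(special[0])          # 1/17
print(special == general)  # True: the common prior cancels
```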

The second method of computing Bayes’s probabilities is to draw a tree, and then to redraw the tree in a different order. This is illustrated in the following example.

Example 4.16 There are three urns. Each urn contains one white ball. In addition, urn I contains one black ball, urn II contains two, and urn III contains three. An urn is selected and one ball is drawn. The probabilities for selecting the three urns are 1/6, 1/2, and 1/3, respectively. If we know that a white ball is drawn, how does this alter the probability that a given urn was selected?

First we construct the ordinary tree and tree measure, in Figure 4.7.


Figure 4.7: ♦

Figure 4.8: ♦


Figure 4.9: ♦

Next we redraw the tree, using the ball drawn as stage 1, and the urn selected as stage 2. (See Figure 4.8.) We have the same paths as before, but in a different order. So the path weights are read off from the previous tree. The probability of drawing a white ball is

1/12 + 1/6 + 1/12 = 1/3.

This leaves the branch weights of the second stage to be computed. But this is simply a matter of division. For example, the branch weights for the branches starting at W must be 1/4, 1/2, 1/4 to yield the correct path weights. Thus, if a white ball is drawn, the probability of having selected urn I has increased to 1/4, the probability of having picked urn III has fallen to 1/4, while the probability of having chosen urn II is unchanged (see Figure 4.9). ♦

This method is particularly useful when we wish to compute all the conditional probabilities. We will apply the method next to Example 4.14. The tree and tree measure for this example in the natural order is shown in Figure 4.10. In that figure the letters M, P, C, and B stand for mathematics, physics, chemistry, and botany, respectively.

The tree drawn in reverse order is shown in Figure 4.11. Each path in this tree corresponds to one of the paths in the original tree. Therefore the path weights for this new tree are the same as the weights assigned to the corresponding paths in the first tree. The two branch weights at the first level represent the probability that the student receives an A or that he or she does not receive an A. These probabilities are also easily obtained from the first tree. In fact,

Pr[A] = .04 + .06 + .06 + .09 = .25

and

Pr[¬A] = 1 − Pr[A] = .75.


Figure 4.10: ♦

Figure 4.11: ♦


Figure 4.12: ♦

We have now enough information to obtain the branch weights at the second level, since the product of the branch weights must be the path weights. For example, to obtain pA,M we have

.25 · pA,M = .04; pA,M = .16.

But pA,M is also the conditional probability that the student took math given that he or she got an A. Hence this is one of the new probabilities for the alternatives in the event that the student received an A. The other branch probabilities are found in the same way and represent the probabilities for the other alternatives. By this method we obtain the new probabilities for all alternatives under the hypothesis that the student receives an A as well as the hypothesis that the student does not receive an A. The results are shown in the completed tree in Figure 4.12.
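The tree-reversal method can be sketched as a single function that takes the branch weights of the original tree and returns those of the reversed tree. It is shown here with the numbers of Example 4.16; the same function applies unchanged to Example 4.14. The name reverse_tree and the dictionary layout are assumptions of this sketch.

```python
from fractions import Fraction as F

def reverse_tree(stage1, stage2):
    """stage1[x] = Pr[x]; stage2[x][y] = Pr[y | x].
    Returns (first, second) for the reversed tree, where
    first[y] = Pr[y] and second[y][x] = Pr[x | y]."""
    # Path weights are the same in both trees.
    path = {(x, y): stage1[x] * p for x in stage1 for y, p in stage2[x].items()}
    # First-level branch weights of the reversed tree: add up path weights.
    first = {}
    for (x, y), w in path.items():
        first[y] = first.get(y, F(0)) + w
    # Second-level branch weights: divide path weight by first-level weight.
    second = {y: {} for y in first}
    for (x, y), w in path.items():
        second[y][x] = w / first[y]
    return first, second

urns = {"I": F(1, 6), "II": F(1, 2), "III": F(1, 3)}
draw = {
    "I": {"W": F(1, 2), "B": F(1, 2)},
    "II": {"W": F(1, 3), "B": F(2, 3)},
    "III": {"W": F(1, 4), "B": F(3, 4)},
}

ball, urn_given_ball = reverse_tree(urns, draw)
print(ball["W"])             # 1/3
print(urn_given_ball["W"])   # branch weights 1/4, 1/2, 1/4 for urns I, II, III
```

Because the function returns the second-level weights for every first-stage outcome, it yields the revised probabilities under both hypotheses at once, which is the advantage of the method noted above.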

Exercises

1. Urn I contains 7 red and 3 black balls and urn II contains 6 red and 4 black balls. An urn is chosen at random and two balls are drawn from it in succession without replacement. The first ball is red and the second black. Show that it is more probable that urn II was chosen than urn I.

2. A gambler is told that one of three slot machines pays off with probability 1/2, while each of the other two pays off with probability 1/3.

(a) If the gambler selects one at random and plays it twice, what is the probability that he or she will lose the first time and win the second?

[Ans. 25/108.]

(b) If the gambler loses the first time and wins the second, what is the probability he or she chose the favorable machine?

[Ans. 9/25.]

3. During the month of May the probability of a rainy day is .2. The Dodgers win on a clear day with probability .7, but on a rainy day only with probability .4. If we know that they won a certain game in May, what is the probability that it rained on that day?

[Ans. 1/8.]

4. Construct a diagram to represent the truth sets of various statements occurring in the previous exercise.

5. On a multiple-choice exam there are four possible answers for each question. Therefore, if a student knows the right answer, he or she has probability 1 of choosing correctly; if the student is guessing, he or she has probability 1/4 of choosing correctly. Let us further assume that a good student will know 90 per cent of the answers, a poor student only 50 per cent. If a good student has the right answer, what is the probability that he or she was only guessing? Answer the same question about a poor student, if the poor student has the right answer.

[Ans. 1/37; 1/5.]

6. Three economic theories are proposed at a given time, which appear to be equally likely on the basis of existing evidence. The state of the American economy is observed the following year, and it turns out that its actual development had probability .6 of happening according to the first theory; and probabilities .4 and .2 according to the others. How does this modify the probabilities of correctness of the three theories?

7. Let p1, p2, p3, and p4 be a set of equally likely alternatives. Let Pr[q|p1] = a, Pr[q|p2] = b, Pr[q|p3] = c, Pr[q|p4] = d. Show that if a + b + c + d = 1, then the revised probabilities of the alternatives relative to q are a, b, c, and d, respectively.

8. In poker, Smith holds a very strong hand and bets a considerable amount. The probability that Smith’s opponent, Jones, has a better hand is .05. With a better hand Jones would raise the bet with probability .9, but with a poorer hand Jones would raise only with probability .2. Suppose that Jones raises, what is the new probability that he or she has a winning hand?

[Ans. 9/47.]

9. A rat is allowed to choose one of five mazes at random. If we know that the probabilities of his or her getting through the various mazes in three minutes are .6, .3, .2, .1, .1, and we find that the rat escapes in three minutes, how probable is it that he or she chose the first maze? The second maze?

[Ans. 6/13; 3/13.]

10. Three men, A, B, and C, are in jail, and one of them is to be hanged the next day. The jailer knows which man will hang, but must not announce it. Man A says to the jailer, “Tell me the name of one of the other two who will not hang. If both are to go free, just toss a coin to decide which to say. Since I already know that at least one of them will go free, you are not giving away the secret.” The jailer thinks a moment and then says, “No, this would not be fair to you. Right now you think the probability that you will hang is 1/3, but if I tell you the name of one of the others who is to go free, your probability of hanging increases to 1/2. You would not sleep as well tonight.” Was the jailer’s reasoning correct?

[Ans. No.]

11. One coin in a collection of 8 million coins has two heads. The rest are fair coins. A coin chosen at random from the collection is tossed ten times and comes up heads every time. What is the probability that it is the two-headed coin?

12. Referring to Exercise 11, assume that the coin is tossed n times and comes up heads every time. How large does n have to be to make the probability approximately 1/2 that you have the two-headed coin?

[Ans. 23.]

13. A statistician will accept job a with probability 1/2, job b with probability 1/3, and job c with probability 1/6. In each case he or she must decide whether to rent or buy a house. The probabilities of buying are 1/3 if he or she takes job a, 2/3 if he or she takes job b, and 1 if he or she takes job c. Given that the statistician buys a house, what are the probabilities of having taken each job?

[Ans. .3; .4; .3.]

14. Assume that chest X-rays for detecting tuberculosis have the following properties. For people having tuberculosis the test will detect the disease 90 out of every 100 times. For people not having the disease the test will in 1 out of every 100 cases diagnose the patient incorrectly as having the disease. Assume that the incidence of tuberculosis is 5 persons per 10,000. A person is selected at random, given the X-ray test, and the radiologist reports the presence of tuberculosis. What is the probability that the person in fact has the disease?

4.8 Independent trials with two outcomes

In the preceding section we developed a way to determine a probability measure for any sequence of chance experiments where there are only a finite number of possibilities for each experiment. While this provides the framework for the general study of stochastic processes, it is too general to be studied in complete detail. Therefore, in probability theory we look for simplifying assumptions which will make our probability measure easier to work with. It is desired also that these assumptions be such as to apply to a variety of experiments which would occur in practice. In this book we shall limit ourselves to the study of two types of processes. The first, the independent trials process, will be considered in the present section. This process was the first one to be studied extensively in probability theory. The second, the Markov chain process, is a process that is finding increasing application, particularly in the social and biological sciences, and will be considered in Section 4.13.

A process of independent trials applies to the following situation. Assume that there is a sequence of chance experiments, each of which consists of a repetition of a single experiment, carried out in such a way that the results of any one experiment in no way affect the results in any other experiment. We label the possible outcomes of a single experiment by $a_1, \ldots, a_r$. We assume that we are also given probabilities $p_1, \ldots, p_r$ for each of these outcomes occurring on any single experiment, the probabilities being independent of previous results. The tree representing the possibilities for the sequence of experiments will have the same outcomes from each branch point, and the branch probabilities will be assigned by assigning probability $p_j$ to any branch leading to outcome $a_j$. The tree measure determined in this way is the measure of an independent trials process. In this section we shall consider the important case of two outcomes for each experiment. The more general case is studied in Section 4.11.

In the case of two outcomes we arbitrarily label one outcome “success” and the other “failure”. For example, in repeated throws of a coin we might call heads success, and tails failure. We assume there is given a probability p for success and a probability q = 1 − p for failure. The tree measure for a sequence of three such experiments is shown in Figure 4.13. The weights assigned to each path are indicated at the end of the path. The question which we now ask is the following. Given an independent trials process with two outcomes, what is the probability of exactly x successes in n experiments? We denote this probability by f(n, x; p) to indicate that it depends upon n, x, and p.

Assume that we had a tree for this general situation, similar to the tree in Figure 4.13 for three experiments, with the branch points labeled S for success and F for failure. Then the truth set of the statement “Exactly x successes occur” consists of all paths which go through x branch points labeled S and n − x labeled F. To find the probability of this statement we must add the weights for all such paths. We are helped first by the fact that our tree measure assigns the same weight to any such path, namely $p^x q^{n-x}$. The reason for this is that every branch leading to an S is assigned probability p, and every branch leading to F is assigned probability q, and in the product there will be x p’s and (n − x) q’s. To find the desired probability we need only find the number of paths in the truth set of the statement “Exactly x successes occur”. With each such path we associate an ordered partition of the integers from 1 to n which has two cells, x elements in the first and n − x in the second. We do this by putting the numbers of the experiments on which success occurred in the first cell and those for which failure occurred in the second cell. Since there are $\binom{n}{x}$ such partitions there are also this number of paths in the truth set of the statement considered. Thus we have proved:

Figure 4.13: ♦

In an independent trials process with two outcomes the probability of exactly x successes in n experiments is given by

$$f(n, x; p) = \binom{n}{x} p^x q^{n-x}.$$
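The counting argument can be checked by brute force for small n. The following Python sketch (the function names and the values n = 6, p = 0.3 are illustrative choices, not taken from the text) enumerates every path of the tree as a sequence of S's and F's, adds the weights of the paths with exactly x successes, and compares the sum with the formula.

```python
from itertools import product
from math import comb, isclose


def f(n, x, p):
    """Closed form: probability of exactly x successes in n trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


def f_by_enumeration(n, x, p):
    """Add the tree-measure weights of all paths with exactly x S's."""
    q = 1 - p
    total = 0.0
    for path in product("SF", repeat=n):
        if path.count("S") == x:
            # Every such path has the same weight p^x q^(n-x).
            total += p**x * q**(n - x)
    return total


if __name__ == "__main__":
    n, p = 6, 0.3  # small illustrative values; the tree has 2**n paths
    for x in range(n + 1):
        assert isclose(f(n, x, p), f_by_enumeration(n, x, p))
    print("formula agrees with path enumeration for n =", n)
```

Enumeration is only practical for small n, since the number of paths doubles with each additional experiment; the closed form is what one would use in practice.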

Example 4.17 Consider n throws of an ordinary coin. We label heads “success” and tails “failure”, and we assume that the probability is 1/2 for heads on any one throw independently of the outcome of any other throw. Then the probability that exactly x heads will turn up is

$$f(n, x; \tfrac{1}{2}) = \binom{n}{x} \left(\tfrac{1}{2}\right)^{n}.$$

For example, in 100 throws the probability that exactly 50 heads will turn up is

$$f(100, 50; \tfrac{1}{2}) = \binom{100}{50} \left(\tfrac{1}{2}\right)^{100},$$

which is approximately .08. Thus we see that it is quite unlikely that exactly one-half of the tosses will result in heads. On the other hand, suppose that we ask for the probability that nearly one-half of the tosses will be heads. To be more precise, let us ask for the probability that the number of heads which occur does not deviate by more than 10 from 50. To find this we must add f(100, x; 1/2) for x = 40, 41, . . . , 60. If this is done, we obtain a probability of approximately .96. Thus, while it is unlikely that exactly 50 heads will occur, it is very likely that the number of heads which occur will not deviate from 50 by more than 10. ♦
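The two numbers in Example 4.17 can be reproduced with a few lines of Python. This is a minimal sketch; the helper name f simply mirrors the notation f(n, x; p) and the variable names are illustrative.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


# Probability of exactly 50 heads in 100 throws.
exactly_50 = f(100, 50, 0.5)

# Probability that the number of heads is between 40 and 60 inclusive.
within_10 = sum(f(100, x, 0.5) for x in range(40, 61))

print(round(exactly_50, 2), round(within_10, 2))
```

The printed values agree with the approximations .08 and .96 quoted in the example.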

Example 4.18 Assume that we have a machine which, on the basis of data given, is to predict the outcome of an election as either a Republican victory or a Democratic victory. If two identical machines are given the same data, they should predict the same result. We assume, however, that any such machine has a certain probability q of reversing the prediction that it would ordinarily make, because of a mechanical or electrical failure. To improve the accuracy of our prediction we give the same data to r identical machines, and choose the answer which the majority of the machines give. To avoid ties we assume that r is odd. Let us see how this decreases the probability of an error due to a faulty machine.

Consider r experiments, where the jth experiment results in success if the jth machine produces the prediction which it would make when operating without any failure of parts. The probability of success is then p = 1 − q. The majority decision will agree with that of a perfectly operating machine if we have more than r/2 successes. Suppose, for example, that we have five machines, each of which has a probability of .1 of reversing the prediction because of a parts failure. Then the probability for success is .9, and the probability that the majority decision will be the desired one is

f(5, 3; 0.9) + f(5, 4; 0.9) + f(5, 5; 0.9)

which is found to be approximately .991 (see Exercise 3).

Thus the above procedure decreases the probability of error due to machine failure from .1 in the case of one machine to .009 for the case of five machines. ♦
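The majority-vote calculation of Example 4.18 generalizes to any odd number r of machines. The sketch below assumes, as the example does, that the machines fail independently; the function name majority_correct is a hypothetical label introduced here.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


def majority_correct(r, q):
    """Probability that more than r/2 of r machines give the correct
    prediction, when each reverses its answer with probability q."""
    if r % 2 == 0:
        raise ValueError("r must be odd to avoid ties")
    p = 1 - q
    return sum(f(r, x, p) for x in range(r // 2 + 1, r + 1))


if __name__ == "__main__":
    for r in (1, 3, 5):
        print(r, round(majority_correct(r, 0.1), 3))
```

With q = .1 the value for r = 5 rounds to .991, the figure given in the example, and for r = 1 it is simply .9.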


Exercises

1. Compute for n = 4, n = 8, n = 12, and n = 16 the probability of obtaining exactly 1/2 heads when an ordinary coin is thrown n times.

[Ans. .375; .273; .226; .196.]

2. Compute for n = 4, n = 8, n = 12, and n = 16 the probability that the fraction of heads deviates from 1/2 by less than 1/5.

[Ans. .375; .711; .854; .923.]

3. Verify that the probability .991 given in Example 4.18 is correct.

4. Assume that Peter and Paul match pennies four times. (In matching pennies, Peter wins a penny with probability 1/2, and Paul wins a penny with probability 1/2.) What is the probability that Peter wins more than Paul? Answer the same for five throws. For the case of 12,917 throws.

[Ans. 5/16; 1/2; 1/2.]

5. If an ordinary die is thrown four times, what is the probability that exactly two sixes will occur?

6. In a ten-question true-false exam, what is the probability of getting 70 per cent or better by guessing?

[Ans. 11/64.]

7. Assume that, every time a batter comes to bat, he or she has probability .3 for getting a hit. Assuming that hits form an independent trials process and that the batter comes to bat four times, in what fraction of the games would he or she expect to get at least two hits? At least three hits? Four hits?

[Ans. .348; .084; .008.]

8. A coin is to be thrown eight times. What is the most probable number of heads that will occur? What is the number having the highest probability, given that the first four throws resulted in heads?

9. A small factory has ten workers. The workers eat their lunch at one of two diners, and they are just as likely to eat in one as in the other. If the proprietors want to be more than .95 sure of having enough seats, how many seats must each of the diners have?

[Ans. Eight seats.]

10. Suppose that five people are chosen at random and asked if they favor a certain proposal. If only 30 per cent of the people favor the proposal, what is the probability that a majority of the five people chosen will favor the proposal?

11. In Example 4.18, if the probability for a machine reversing its answer due to a parts failure is .2, how many machines would have to be used to make the probability greater than .89 that the answer obtained would be that which a machine with no failure would give?

[Ans. Three machines.]

12. Assume that it is estimated that a torpedo will hit a ship with probability 1/3. How many torpedoes must be fired if it is desired that the probability for at least one hit should be greater than .9?

13. A student estimates that, if he or she takes four courses, he or she has probability .8 of passing each course. If he or she takes five courses, he or she has probability .7 of passing each course, and if he or she takes six courses he or she has probability .5 for passing each course. The student’s only goal is to pass at least four courses. How many courses should he or she take for the best chance of achieving this goal?

[Ans. 5.]


14. In a certain board game players move around the board, and each turn consists of a player’s rolling a pair of dice. If a player is on the square Park Bench, he or she must roll a seven or doubles before being allowed to move out.

(a) What is the probability that a player stuck on Park Bench will be allowed to move out on the next turn?

[Ans. 1/3.]

(b) How many times must a player stuck on Park Bench roll before the chances of getting out exceed 3/4?

[Ans. 4.]

15. A restaurant orders five pieces of apple pie and five pieces of cherry pie. Assume that the restaurant has ten customers, and the probability that a customer will ask for apple pie is 3/4 and for cherry pie is 1/4.

(a) What is the probability that the ten customers will all be able to have their first choice?

(b) What number of each kind of pie should the restaurant order if it wishes to order ten pieces of pie and wants to maximize the probability that the ten customers will all have their first choice?

16. Show that it is more probable to get at least one ace with 4 dice than at least one double ace in 24 throws of two dice.

17. A thick coin, when tossed, will land “heads” with a probability of 5/12, “tails” with a probability of 5/12, and will land on edge with a probability of 1/6. If it is tossed six times, what is the probability that it lands on edge exactly two times?

[Ans. .2009.]

18. Without actually computing the probabilities, find the value of x for which f(20, x; .3) is largest.

19. A certain team has probability 2/3 of winning whenever it plays.

(a) What is the probability the team will win exactly four out of five games?

[Ans. 80/243.]

(b) What is the probability the team will win at most four out of five games?

[Ans. 211/243.]

(c) What is the probability the team will win exactly four games out of five if it has already won the first two games of the five?

[Ans. 4/9.]

4.9 A problem of decision

In the preceding sections we have dealt with the problem of calculating the probability of certain statements based on the assumption of a given probability measure. In a statistics problem, one is often called upon to make a decision in a case where the decision would be relatively easy to make if we could assign probabilities to certain statements, but we do not know how to assign these probabilities. For example, if a vaccine for a certain disease is proposed, we may be called upon to decide whether or not the vaccine should be used. We may decide that we could make the decision if we could compare the probability that a person vaccinated will get the disease with the probability that a person not vaccinated will get the disease. Statistical theory develops methods to obtain from experiments some information which will aid in estimating these probabilities, or will otherwise help in making the required decision. We shall illustrate a typical procedure.

Smith claims to have the ability to distinguish ale from beer and has bet Jones a dollar to that effect. Now Smith does not mean that he or she can distinguish beer from ale every single time, but rather a proportion of the time which is significantly greater than 1/2.

Assume that it is possible to assign a number p which represents the probability that Smith can pick out the ale from a pair of glasses, one containing ale and one beer. We identify p = 1/2 with having no ability, p > 1/2 with having some ability, and p < 1/2 with being able to distinguish, but having the wrong idea which is the ale. If we knew the value of p, we would award the dollar to Jones if p were ≤ 1/2, and to Smith if p were > 1/2. As it stands, we have no knowledge of p and thus cannot make a decision. We perform an experiment and make a decision as follows.

Smith is given a pair of glasses, one containing ale and the other beer, and is asked to identify which is the ale. This procedure is repeated ten times, and the number of correct identifications is noted. If the number correct is at least eight, we award the dollar to Smith, and if it is less than eight, we award the dollar to Jones.

We now have a definite procedure and shall examine this procedure both from Jones’s and Smith’s points of view. We can make two kinds of errors. We may award the dollar to Smith when in fact the appropriate value of p is ≤ 1/2, or we may award the dollar to Jones when the appropriate value for p is > 1/2. There is no way that these errors can be completely avoided. We hope that our procedure is such that each bettor will be convinced that, if he or she is right, he or she will very likely win the bet.

Jones believes that the true value of p is 1/2. We shall calculate the probability of Jones winning the bet if this is indeed true. We assume that the individual tests are independent of each other and all have the same probability 1/2 for success. (This assumption will be unreasonable if the glasses are too large.) We have then an independent trials process with p = 1/2 to describe the entire experiment. The probability that Jones will win the bet is the probability that Smith gets fewer than eight correct. From the table in Figure 4.14 we compute that this probability is approximately .945. Thus Jones sees that, if he or she is right, it is very likely that he or she will win the bet.

Smith, on the other hand, believes that p is significantly greater than 1/2. If Smith believes that p is as high as .9, we see from Figure 4.14 that the probability of Smith’s getting eight or more correct is .930. Then both parties will be satisfied by the bet.

Suppose, however, that Smith thinks the value of p is only about .75. Then the probability that Smith will get eight or more correct and thus win the bet is .526. There is then only an approximately even chance that the experiment will discover Smith’s abilities, and Smith probably will not be satisfied with this. If Smith really thinks his or her ability is represented by a p value of about 3/4, we would have to devise a different method of awarding the dollar. We might, for example, propose that Smith win the bet if he or she gets seven or more correct. Then, if Smith has probability 3/4 of being correct on a single trial, the probability that Smith will win the bet is approximately .776. If p = 1/2 the probability that Jones will win the bet is about .828 under this new arrangement. Jones’s chances of winning are thus decreased, but Smith may be able to convince him or her that it is a fairer arrangement than the first procedure.
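The probabilities read from the table in Figure 4.14 can also be computed directly from the formula of Section 4.8. The Python sketch below does this for the two decision rules discussed above; the function name smith_wins and its parameter names are hypothetical labels introduced for illustration.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


def smith_wins(n, k, p):
    """Probability that Smith gets at least k of n identifications
    correct when each is correct with probability p."""
    return sum(f(n, x, p) for x in range(k, n + 1))


if __name__ == "__main__":
    for k in (8, 7):
        jones = 1 - smith_wins(10, k, 0.5)   # Jones is right: p = 1/2
        smith_90 = smith_wins(10, k, 0.9)    # Smith is right with p = .9
        smith_75 = smith_wins(10, k, 0.75)   # Smith is right with p = .75
        print(k, round(jones, 3), round(smith_90, 3), round(smith_75, 3))
```

Lowering the threshold from eight to seven trades some of Jones’s protection for a better chance that Smith’s ability is recognized, which is the compromise described in the text.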

Figure 4.14: ♦

In the above example, it was possible to make two kinds of errors. The probability of making these errors depended on the way we designed the experiment and the method we used for the required decision. In some cases we are not too worried about the errors and can make a relatively simple experiment. In other cases, errors are very important, and the experiment must be designed with that fact in mind. For example, the possibility of error is certainly important in the case that a vaccine for a given disease is proposed, and the statistician is asked to help in deciding whether or not it should be used. In this case it might be assumed that there is a certain probability p that a person will get the disease if not vaccinated, and a probability r that the person will get it if he or she is vaccinated. If we have some knowledge of the approximate value of p, we are then led to construct an experiment to decide whether r is greater than p, equal to p, or less than p. The first case would be interpreted to mean that the vaccine actually tends to produce the disease, the second that it has no effect, and the third that it prevents the disease; so that we can make three kinds of errors. We could recommend acceptance when it is actually harmful, we could recommend acceptance when it has no effect, or finally we could reject it when it actually is effective. The first and third might result in the loss of lives, the second in the loss of time and money of those administering the test. Here it would certainly be important that the probability of the first and third kinds of errors be made small. To see how it is possible to make the probability of both errors small, we return to the case of Smith and Jones.

Suppose that, instead of demanding that Smith make at least eight correct identifications out of ten trials, we insist that Smith make at least 60 correct identifications out of 100 trials. (The glasses must now be very small.) Then, if p = 1/2, the probability that Jones wins the bet is about .98; so that we are extremely unlikely to give the dollar to Smith when in fact it should go to Jones. (If p < 1/2 it is even more likely that Jones will win.) If p > 1/2 we can also calculate the probability that Smith will win the bet. These probabilities are shown in the graph in Figure 4.15. The dashed curve gives for comparison the corresponding probabilities for the test requiring eight out of ten correct. Note that with 100 trials, if p is 3/4, the probability that Smith wins the bet is nearly 1, while in the case of eight out of ten, it was only about 1/2. Thus in the case of 100 trials, it would be easy to convince both Smith and Jones that whichever one is correct is very likely to win the bet.

Thus we see that the probability of both types of errors can be made small at the expense of having a large number of experiments.


Figure 4.15: ♦

Exercises

1. Assume that in the beer and ale experiment Jones agrees to pay Smith if Smith gets at least nine out of ten correct.

(a) What is the probability of Jones paying Smith even though Smith cannot distinguish beer and ale, and guesses?

[Ans. .011.]

(b) Suppose that Smith can distinguish with probability .9. What is the probability of not collecting from Jones?

[Ans. .264.]

2. Suppose that in the beer and ale experiment Jones wishes the probability to be less than .1 that Smith will be paid if, in fact, Smith guesses. How many of ten trials must Jones insist that Smith get correct to achieve this?

3. In the analysis of the beer and ale experiment, we assume that the various trials were independent. Discuss several ways that error can enter, because of the nonindependence of the trials, and how this error can be eliminated. (For example, the glasses in which the beer and ale were served might be distinguishable.)

4. Consider the following two procedures for testing Smith’s ability to distinguish beer from ale.

(a) Four glasses are given at each trial, three containing beer and one ale, and Smith is asked to pick out the one containing ale. This procedure is repeated ten times. Smith must guess correctly seven or more times. Find the probability that Smith wins by guessing.

[Ans. .003.]

(b) Ten glasses are given to Smith, and Smith is told that five contain beer and five ale, and asked to name the five that contain ale. Smith must choose all five correctly. Find the probability that Smith wins by guessing.

[Ans. .004.]

(c) Is there any reason to prefer one of these two tests over the other?

5. A testing service claims to have a method for predicting the order in which a group of freshmen will finish in their scholastic record at the end of college. The college agrees to try the method on a group of five students, and says that it will adopt the method if, for these five students, the prediction is either exactly correct or can be changed into the correct order by interchanging one pair of adjacent students in the predicted order. If the method is equivalent to simply guessing, what is the probability that it will be accepted?

[Ans. 1/24.]

6. The standard treatment for a certain disease leads to a cure in 1/4 of the cases. It is claimed that a new treatment will result in a cure in 3/4 of the cases. The new treatment is to be tested on ten people having the disease. If seven or more are cured, the new treatment will be adopted. If three or fewer people are cured, the treatment will not be considered further. If the number cured is four, five, or six, the results will be called inconclusive, and a further study will be made. Find the probabilities for each of these three alternatives, first under the assumption that the new treatment has the same effectiveness as the old, and second under the assumption that the claim made for the treatment is correct.

7. Three students debate the intelligence of Springer spaniels. One claims that Springers are mostly (say 90 per cent of them) intelligent. A second claims that very few (say 10 per cent) are intelligent, while a third one claims that a Springer is just as likely to be intelligent as not. They administer an intelligence test to ten Springers, classifying them as intelligent or not. They agree that the first student wins the bet if eight or more are intelligent, the second if two or fewer, the third in all other cases. For each student, calculate the probability that he or she wins the bet, if he or she is right.

[Ans. .930; .930; .890.]

8. Ten students take a test with ten problems. Each student on each question has probability 1/2 of being right, if he or she does not cheat. The instructor determines the number of students who get each problem correct. If the instructor finds on four or more problems there are fewer than three or more than seven correct, he or she considers this convincing evidence of communication between the students. Give a justification for the procedure. [Hint: The table in Figure 4.14 must be used twice, once for the probability of fewer than three or more than seven correct answers on a given problem, and the second time to find the probability of this happening on four or more problems.]

4.10 The law of large numbers

In this section we shall study some further properties of the independent trials process with two outcomes. In Section 4.8 we saw that the probability for x successes in n trials is given by

$$f(n, x; p) = \binom{n}{x} p^x q^{n-x}.$$

In Figure 4.16 we show these probabilities graphically for n = 8 and p = 3/4. In Figure 4.17 we have done similarly for the case of n = 7 and p = 3/4.

Figure 4.16: ♦

Figure 4.17: ♦

We see in the first case that the values increase up to a maximum value at x = 6 and then decrease. In the second case the values increase up to a maximum value at x = 5, have the same value for x = 6, and then decrease. These two cases are typical of what can happen in general.
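The two cases can be checked numerically. The sketch below uses exact rational arithmetic (Python's fractions module) so that a tie between two neighboring values is detected exactly rather than up to rounding; the function name most_probable is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n trials (exact if p is a Fraction)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def most_probable(n, p):
    """Return the list of values x at which f(n, x; p) is largest."""
    probs = [f(n, x, p) for x in range(n + 1)]
    top = max(probs)
    return [x for x, value in enumerate(probs) if value == top]


if __name__ == "__main__":
    p = Fraction(3, 4)
    print(most_probable(8, p))   # single maximum
    print(most_probable(7, p))   # two equal maxima
```

For n = 8 the list contains only x = 6, and for n = 7 it contains both x = 5 and x = 6, in agreement with the description of the two figures.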

Consider the ratio of the probability of x + 1 successes in n trials to the probability of x successes in n trials, which is

$$\frac{\binom{n}{x+1} p^{x+1} q^{n-x-1}}{\binom{n}{x} p^{x} q^{n-x}} = \frac{n-x}{x+1} \cdot \frac{p}{q}.$$

This ratio will be greater than one as long as (n − x)p > (x + 1)q or as long as x < np − q. If np − q is not an integer, the values $\binom{n}{x} p^x q^{n-x}$ increase up to a maximum value, which occurs at the first integer greater than np − q, and then decrease. In case np − q is an integer, the values $\binom{n}{x} p^x q^{n-x}$ increase up to x = np − q, are the same for x = np − q and x = np − q + 1, and then decrease.

Thus we see that, in general, values near np will occur with the largest probability. It is not true that one particular value near np is highly likely to occur, but only that it is relatively more likely than a value further from np. For example, in 100 throws of a coin, np = 100 · 1/2 = 50. The probability of exactly 50 heads is approximately .08. The probability of exactly 30 is approximately .00002.

More information is obtained by studying the probability of a given deviation of the proportion of successes x/n from the number p; that is, by studying for ε > 0,

$$\Pr\left[\left|\frac{x}{n} - p\right| < \epsilon\right].$$

For any fixed n, p, and ε, the latter probability can be found by adding all the values of f(n, x; p) for values of x for which the inequality p − ε < x/n < p + ε is true. In Figure 4.18 we have given these probabilities for the case p = .3 with various values for ε and n. In the first column we have the case ε = .1. We observe that as n increases, the probability that the fraction of successes deviates from .3 by less than .1 tends to the value 1. In fact to four decimal places the answer is 1 after n = 400. In column two we have the same probabilities for the smaller value of ε = .05. Again the probabilities are tending to 1 but not so fast. In the third column we have given these probabilities for the case ε = .02. We see now that even after 1000 trials there is still a reasonable chance that the fraction x/n is not within .02 of the value of p = .3. It is natural to ask if we can expect these probabilities also to tend to 1 if we increase n sufficiently. The answer is yes and this is assured by one of the fundamental theorems of probability called the law of large numbers. This theorem asserts that, for any ε > 0,

$$\Pr\left[\left|\frac{x}{n} - p\right| < \epsilon\right]$$

tends to 1 as n increases indefinitely.

Figure 4.18: ♦

Figure 4.19: ♦

It is important to understand what this theorem says and what it does not say. Let us illustrate its meaning in the case of coin tossing. We are going to toss a coin n times and we want the probability to be very high, say greater than .99, that the fraction of heads which turn up will be very close, say within .001 of the value .5. The law of large numbers assures us that we can have this if we simply choose n large enough. The theorem itself gives us no information about how large n must be. Let us however consider this question.
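The probabilities Pr[|x/n − p| < ε] discussed above can be computed by adding the appropriate values of f(n, x; p). The following Python sketch does so with exact rational arithmetic, so that the strict inequality is tested without rounding error at the boundary. The function name prob_within and the particular values of n are illustrative assumptions; the table in Figure 4.18 is not reproduced here, so no entry-by-entry comparison is attempted.

```python
from fractions import Fraction
from math import comb


def prob_within(n, p, eps):
    """Pr[|x/n - p| < eps] for n independent trials.

    p and eps should be Fractions so that the comparison is exact.
    """
    q = 1 - p
    total = Fraction(0)
    for x in range(n + 1):
        if abs(Fraction(x, n) - p) < eps:
            total += comb(n, x) * p**x * q**(n - x)
    return float(total)


if __name__ == "__main__":
    p = Fraction(3, 10)
    for eps in (Fraction(1, 10), Fraction(1, 20), Fraction(1, 50)):
        row = [round(prob_within(n, p, eps), 4) for n in (100, 400, 1000)]
        print(float(eps), row)
```

For ε = .1 the value is already 1 to four decimal places at n = 400, while for ε = .02 it is still well below 1 at n = 1000, which is the behavior described in the text.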

To say that the fraction of the times success results is near p is the same as saying that the actual number of successes x does not deviate too much from the expected number np. To see the kind of deviations which might be expected we can study the value of Pr[|x − np| ≥ d]. A table of these values for p = .3 and various values of n and d is given in Figure 4.19. Let us ask how large d must be before a deviation as large as d could be considered surprising. For example, let us see for each n the value of d which makes Pr[|x − np| ≥ d] about .04. From the table, we see that d should be 7 for n = 50, 9 for n = 80, 10 for n = 100, etc. To see deviations which might be considered more typical we look for the values of d which make Pr[|x − np| ≥ d] approximately 1/3. Again from the table, we see that d should be 3 or 4 for n = 50, 4 or 5 for n = 80, 5 for n = 100, etc. The answers to these two questions are given in the last two columns of the table. An examination of these numbers shows us that deviations which we would consider surprising are approximately $\sqrt{n}$ while those which are more typical are about one-half as large or $\sqrt{n}/2$.

This suggests that $\sqrt{n}$, or a suitable multiple of it, might be taken as a unit of measurement for deviations. Of course, we would also have to study how Pr[|x − np| ≥ d] depends on p. When this is done, one finds that $\sqrt{npq}$ is a natural unit; it is called a standard deviation. It can be shown that for large n the following approximations hold.

$$\Pr[|x - np| \ge \sqrt{npq}] \approx .3174$$

$$\Pr[|x - np| \ge 2\sqrt{npq}] \approx .0455$$

$$\Pr[|x - np| \ge 3\sqrt{npq}] \approx .0027$$

That is, a deviation from the expected value of one standard deviation is rather typical, while a deviation of as much as two standard deviations is quite surprising and three very surprising. For values of p not too near 0 or 1, the value of $\sqrt{pq}$ is approximately 1/2. Thus these approximations are consistent with the results we observed from our table.

For large n, $\Pr[x - np \ge k\sqrt{npq}]$ and $\Pr[x - np \le -k\sqrt{npq}]$ can be shown to be approximately the same. Hence these probabilities can be estimated for k = 1, 2, 3 by taking 1/2 the values given above.

Example 4.19 In throwing an ordinary coin 10,000 times, the expected number of heads is 5000, and the standard deviation for the number of heads is $\sqrt{10{,}000 \cdot \frac{1}{2} \cdot \frac{1}{2}} = 50$. Thus the probability that the number of heads which turn up deviates from 5000 by as much as one standard deviation, or 50, is approximately .317. The probability of a deviation of as much as two standard deviations, or 100, is approximately .046. The probability of a deviation of as much as three standard deviations, or 150, is approximately .003. ♦
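The approximations used in Example 4.19 can be compared with exact sums of binomial probabilities. The sketch below works with logarithms (via math.lgamma) because the binomial coefficients for n = 10,000 are far too large for ordinary floating-point numbers; the function names are hypothetical. Since the number of heads is an integer, the exact sums differ slightly from the limiting values .3174, .0455, and .0027.

```python
from math import exp, lgamma, log, sqrt


def log_f(n, x, p):
    """Natural logarithm of f(n, x; p), computed without overflow."""
    return (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
            + x * log(p) + (n - x) * log(1 - p))


def prob_deviation_at_least(n, p, d):
    """Pr[|x - np| >= d] as a direct sum of binomial probabilities."""
    mean = n * p
    return sum(exp(log_f(n, x, p))
               for x in range(n + 1) if abs(x - mean) >= d)


if __name__ == "__main__":
    n, p = 10_000, 0.5
    sd = sqrt(n * p * (1 - p))   # standard deviation, here 50
    for k in (1, 2, 3):
        print(k, round(prob_deviation_at_least(n, p, k * sd), 4))
```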

Example 4.20 Assume that in a certain large city, 900 people are chosen at random and asked if they favor a certain proposal. Of the 900 asked, 550 say they favor the proposal and 350 are opposed. If, in fact, the people in the city are equally divided on the issue, would it be unlikely that such a large majority would be obtained in a sample of 900 of the citizens? If the people were equally divided, we would assume that the 900 people asked would form an independent trials process with probability \(\frac{1}{2}\) for a “yes” answer and \(\frac{1}{2}\) for a “no” answer. Then the standard deviation for the number of “yes” answers in 900 trials is \(\sqrt{900(\frac{1}{2})(\frac{1}{2})} = 15\). Then it would be very unlikely that we would obtain a deviation of more than 45 from the expected number of 450. The fact that the deviation in the sample from the expected number was 100, then, is evidence that the hypothesis that the voters were equally divided is incorrect. The assumption that the true proportion is any value less than \(\frac{1}{2}\) would also lead to the fact that a number as large as 550 favoring in a sample of 900 is very unlikely. Thus we are led to suspect that the true proportion is greater than \(\frac{1}{2}\). On the other hand, if the number who favored the proposal in the sample of 900 were 465, we would have only a deviation of one standard deviation, under the assumption of an equal division of opinion. Since such a deviation is not unlikely, we could not rule out this possibility on the evidence of the sample. ♦
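The reasoning in this example amounts to measuring the observed deviation in units of the standard deviation. A minimal sketch of that step follows; the helper name `standard_units` is an illustrative choice, not notation from the text.

```python
import math

def standard_units(n: int, p: float, observed: int) -> float:
    """Deviation of `observed` from the expected value np, in standard deviations."""
    sd = math.sqrt(n * p * (1 - p))
    return (observed - n * p) / sd

print(standard_units(900, 0.5, 550))  # deviation of 100 against a standard deviation of 15
print(standard_units(900, 0.5, 465))  # deviation of 15 against a standard deviation of 15
```

A sample count of 550 lies well beyond three standard deviations, while 465 lies exactly one standard deviation from 450.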

Example 4.21 A certain Ivy League college would like to admit 800 students in their freshman class. Experience has shown that if they admit 1250 students they will have acceptances from approximately 800. If they admit as many as 50 too many students they will have to provide additional dormitory space. Let us find the probability that this will happen assuming that the acceptances of the students can be considered to be an independent trials process. We take as our estimate for the probability of an acceptance \(p = \frac{800}{1250}\). Then the expected number of acceptances is 800 and the standard deviation for the number of acceptances is \(\sqrt{1250 \cdot .64 \cdot .36} \approx 17\). The probability that the number accepted is three standard deviations or 51 from the mean is approximately .0027. This probability takes into account a deviation above the mean or below the mean. Since in this case we are only interested in a deviation above the mean, the probability we desire is half of this or approximately .0013. Thus we see that it is highly unlikely that the college will have to have new dormitory space under the assumptions we have made. ♦

We finish this discussion of the law of large numbers with some final remarks about the interpretation of this important theorem.

Of course no matter how large n is we cannot prevent the coin from coming up heads every time. If this were the case we would observe a fraction of heads equal to 1. However, this is not inconsistent with the theorem, since the probability of this happening is \((\frac{1}{2})^n\) which tends to 0 as n increases. Thus a fraction of 1 is always possible, but becomes increasingly unlikely.

The law of large numbers is often misinterpreted in the following manner. Suppose that we plan to toss the coin 1000 times and after 500 tosses we have already obtained 400 heads. Then we must obtain less than one-half heads in the remaining 500 tosses to have the fraction come out near \(\frac{1}{2}\). It is tempting to argue that the coin therefore owes us some tails and it is more likely that tails will occur in the last 500 tosses. Of course this is nonsense, since the coin has no memory. The point is that something very unlikely has already happened in the first 500 tosses. The final result can therefore also be expected to be a result not predicted before the tossing began.

We could also argue that perhaps the coin is a biased coin but this would make us predict more heads than tails in the future. Thus the law of averages, or the law of large numbers, should not give you great comfort if you have had a series of very bad hands dealt you in your last 100 poker hands. If the dealing is fair, you have the same chance as ever of getting a good hand.

Early attempts to define the probability p that success occurs on a single experiment sounded like this. If the experiment is repeated indefinitely, the fraction of successes obtained will tend to a number p, and this number p is called the probability of success on a single experiment. While this fails to be satisfactory as a definition of probability, the law of large numbers captures the spirit of this frequency concept of probability.

Exercises

1. If an ordinary die is thrown 20 times, what is the expected number of times that a six will turn up? What is the standard deviation for the number of sixes that turn up?

[Ans. \(\frac{10}{3}\); \(\frac{5}{3}\).]

2. Suppose that an ordinary die is thrown 450 times. What is the expected number of throws that result in either a three or a four? What is the standard deviation for the number of such throws?

3. In 16 tosses of an ordinary coin, what is the expected number of heads that turn up? What is the standard deviation for the number of heads that occur?

[Ans. 8; 2.]

4. In 16 tosses of a coin, find the exact probability that the number of heads that turn up differs from the expected number by (a) as much as one standard deviation, and (b) by more than one standard deviation. Do the same for the case of two standard deviations, and for the case of three standard deviations. Show that the approximations given for large n lie between the values obtained, but are not very accurate for so small an n.

[Ans. .454; .210; .077; .021; .004; .001.]

5. Consider n independent trials with probability p for success. Let r and s be numbers such that p < r < s. What does the law of large numbers say about \(\Pr[r < \frac{x}{n} < s]\) as we increase n indefinitely? Answer the same question in the case that r < p < s.

6. A drug is known to be effective in 20 per cent of the cases where it is used. A new agent is introduced, and in the next 900 times the drug is used it is effective 250 times. What can be said about the effectiveness of the drug?

7. In a large number of independent trials with probability p for success, what is the approximate probability that the number of successes will deviate from the expected number by more than one standard deviation but less than two standard deviations?

[Ans. .272.]

8. What is the approximate probability that, in 10,000 throws of an ordinary coin, the number of heads which turn up lies between 4850 and 5150? What is the probability that the number of heads lies in the same interval, given that in the first 1900 throws there were 1600 heads?

9. Suppose that it is desired that the probability be approximately .95 that the fraction of sixes that turn up when a die is thrown n times does not deviate by more than .01 from the value \(\frac{1}{6}\). How large should n be?

[Ans. Approximately 5555.]


10. Suppose that for each roll of a fair die you lose $1 when an odd number comes up and win $1 when an even number comes up. Then after 10,000 rolls you can, with approximately 84 per cent confidence, expect to have lost not more than $(how much?).

11. Assume that 10 per cent of the people in a certain city have cancer. If 900 people are selected at random from the city, what is the expected number which will have cancer? What is the standard deviation? What is the approximate probability that more than 108 of the 900 chosen have cancer?

[Ans. 90; 9; .023.]

12. Suppose that in Exercise 11, the 900 people are chosen at random from those people in the city who smoke. Under the hypothesis that smoking has no effect on the incidence of cancer, what is the expected number in the 900 chosen that have cancer? Suppose that more than 120 of the 900 chosen have cancer, what might be said concerning the hypothesis that smoking has no effect on the incidence of cancer?

13. In Example 4.20, we made the assumption in our calculations that, if the true proportion of voters in favor of the proposal were p, then the 900 people chosen at random represented an independent trials process with probability p for a “yes” answer, and 1 − p for a “no” answer. Give a method for choosing the 900 people which would make this a reasonable assumption. Criticize the following methods.

(a) Choose the first 900 people in the list of registered Republicans.

(b) Choose 900 names at random from the telephone book.

(c) Choose 900 houses at random and ask one person from each house, the houses being visited in the mid-morning.

14. For n throws of an ordinary coin, let \(t_n\) be such that

\[\Pr\left[-t_n < \frac{x}{n} - \frac{1}{2} < t_n\right] = .997,\]

where x is the number of heads that turn up. Find \(t_n\) for \(n = 10^4\), \(n = 10^6\), and \(n = 10^{20}\).

[Ans. .015; .0015; .000,000,000,15.]

15. Assume that a calculating machine carries out a million operations to solve a certain problem. In each operation the machine gives the answer \(10^{-5}\) too small, with probability \(\frac{1}{2}\), and \(10^{-5}\) too large, with probability \(\frac{1}{2}\). Assume that the errors are independent of one another. What is a reasonable accuracy to attach to the answer? What if the machine carries out \(10^{10}\) operations?

[Ans. ±.01; ±1.]

16. A computer tosses a coin 1 million times, and obtains 499,588 heads. Is this number reasonable?

4.11 Independent trials with more than two outcomes

By extending the results of Section 4.8, we shall study the case of independent trials in which we allow more than two outcomes. We assume that we have an independent trials process where the possible outcomes are \(a_1, a_2, \ldots, a_k\), occurring with probabilities \(p_1, p_2, \ldots, p_k\), respectively. We denote by

\[f(r_1, r_2, \ldots, r_k; p_1, p_2, \ldots, p_k)\]

the probability that, in \(n = r_1 + r_2 + \ldots + r_k\) such trials, there will be \(r_1\) occurrences of \(a_1\), \(r_2\) occurrences of \(a_2\), etc. In the case of two outcomes this notation would be \(f(r_1, r_2; p_1, p_2)\). In Section 4.8 we wrote this as \(f(n, r_1; p_1)\) since \(r_2\) and \(p_2\) are determined from n, \(r_1\), and \(p_1\). We shall indicate how this probability is found in general, but carry out the details only for a special case. We choose k = 3, and n = 5 for purposes of illustration. We shall find \(f(1, 2, 2; p_1, p_2, p_3)\).

We show in Figure 4.20 enough of the tree for this process to indicate the branch probabilities for a path (heavy lined) corresponding to the outcomes \(a_2, a_3, a_1, a_2, a_3\). The tree measure assigns weight \(p_2 \cdot p_3 \cdot p_1 \cdot p_2 \cdot p_3 = p_1 \cdot p_2^2 \cdot p_3^2\) to this path.

Figure 4.20: ♦

There are, of course, other paths through the tree corresponding to one occurrence of \(a_1\), two of \(a_2\), and two of \(a_3\). However, they would all be assigned the same weight \(p_1 \cdot p_2^2 \cdot p_3^2\) by the tree measure. Hence to find \(f(1, 2, 2; p_1, p_2, p_3)\) we must multiply this weight by the number of paths having the specified number of occurrences of each outcome.

We note that the path \(a_2, a_3, a_1, a_2, a_3\) can be specified by the three-cell partition [{3}, {1, 4}, {2, 5}] of the numbers from 1 to 5. Here the first cell shows the experiment which resulted in \(a_1\), the second cell shows the two that resulted in \(a_2\), and the third shows the two that resulted in \(a_3\). Conversely, any such partition of the numbers from 1 to 5 with one element in the first cell, two in the second, and two in the third corresponds to a unique path of the desired kind. Hence the number of paths is the number of such partitions. But this is

\[\binom{5}{1, 2, 2} = \frac{5!}{1!\,2!\,2!}\]

(see 3.4), so that the probability of one occurrence of \(a_1\), two of \(a_2\), and two of \(a_3\) is

\[\binom{5}{1, 2, 2} \cdot p_1 \cdot p_2^2 \cdot p_3^2.\]
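The path count can be checked by brute force. The sketch below labels the outcomes a1, a2, a3 as 1, 2, 3 (a labeling chosen here for convenience), lists every sequence of five trials, and keeps those with one, two, and two occurrences respectively.

```python
from itertools import product
from math import factorial

# Every path through the tree is a sequence of five outcomes drawn from {1, 2, 3}.
paths = [path for path in product((1, 2, 3), repeat=5)
         if (path.count(1), path.count(2), path.count(3)) == (1, 2, 2)]

print(len(paths))                                                    # paths found by listing
print(factorial(5) // (factorial(1) * factorial(2) * factorial(2)))  # 5!/(1!2!2!)
print((2, 3, 1, 2, 3) in paths)                                      # the heavy-lined path
```

Both counts come out to 30, so the listing and the partition formula agree for this case.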

The above argument carried out in general leads, for the case of independent trials with outcomes \(a_1, a_2, \ldots, a_k\) occurring with probabilities \(p_1, p_2, \ldots, p_k\), to the following.

The probability for \(r_1\) occurrences of \(a_1\), \(r_2\) occurrences of \(a_2\), etc., is given by

\[f(r_1, r_2, \ldots, r_k; p_1, p_2, \ldots, p_k) = \binom{n}{r_1, r_2, \ldots, r_k} p_1^{r_1} \cdot p_2^{r_2} \cdot \ldots \cdot p_k^{r_k}.\]
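The general formula translates directly into a short function. This is only a sketch: the names `multinomial_coefficient` and `f` and the sample probabilities are illustrative assumptions, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from math import factorial

def multinomial_coefficient(counts):
    """Number of ways to split sum(counts) trials into cells of the given sizes."""
    result = factorial(sum(counts))
    for r in counts:
        result //= factorial(r)
    return result

def f(counts, probs):
    """Probability of counts[i] occurrences of outcome i in sum(counts) trials."""
    value = Fraction(multinomial_coefficient(counts))
    for r, p in zip(counts, probs):
        value *= Fraction(p) ** r
    return value

# The special case k = 3, n = 5, with hypothetical probabilities 1/2, 1/4, 1/4.
p1, p2, p3 = Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)
print(f((1, 2, 2), (p1, p2, p3)) == 30 * p1 * p2**2 * p3**2)
```

With two outcomes the same function reduces to the binomial probabilities of Section 4.8.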


Example 4.22 A die is thrown 12 times. What is the probability that each number will come up twice? Here there are six outcomes, 1, 2, 3, 4, 5, 6 corresponding to the six sides of the die. We assign each outcome probability \(\frac{1}{6}\). We are then asked for

\[f(2, 2, 2, 2, 2, 2; \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6})\]

which is

\[\binom{12}{2, 2, 2, 2, 2, 2} (\tfrac{1}{6})^2 (\tfrac{1}{6})^2 (\tfrac{1}{6})^2 (\tfrac{1}{6})^2 (\tfrac{1}{6})^2 (\tfrac{1}{6})^2 = .0034.\]

♦

Example 4.23 Suppose that we have an independent trials process with four outcomes \(a_1, a_2, a_3, a_4\) occurring with probability \(p_1, p_2, p_3, p_4\), respectively. It might be that we are interested only in the probability that \(r_1\) occurrences of \(a_1\) and \(r_2\) occurrences of \(a_2\) will take place with no specification about the number of each of the other possible outcomes. To answer this question we simply consider a new experiment where the outcomes are \(a_1, a_2, \bar{a}_3\). Here \(\bar{a}_3\) corresponds to an occurrence of either \(a_3\) or \(a_4\) in our original experiment. The corresponding probabilities would be \(p_1\), \(p_2\), and \(\bar{p}_3\) with \(\bar{p}_3 = p_3 + p_4\). Let \(\bar{r}_3 = n - (r_1 + r_2)\). Then our question is answered by finding the probability in our new experiment for \(r_1\) occurrences of \(a_1\), \(r_2\) of \(a_2\), and \(\bar{r}_3\) of \(\bar{a}_3\), which is

\[\binom{n}{r_1, r_2, \bar{r}_3} p_1^{r_1} \cdot p_2^{r_2} \cdot \bar{p}_3^{\bar{r}_3}.\]

♦

The same procedure can be carried out for experiments with any number of outcomes where we specify the number of occurrences of some particular outcomes. For example, if a die is thrown ten times the probability that a one will occur exactly twice and a three exactly three times is given by

\[\binom{10}{2, 3, 5} (\tfrac{1}{6})^2 (\tfrac{1}{6})^3 (\tfrac{4}{6})^5 = .043.\]
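The lumping step can be sketched as follows. The function name `lumped_probability` is an illustrative assumption; it treats every outcome other than the two of interest as a single third outcome, as in Example 4.23.

```python
from fractions import Fraction
from math import factorial

def lumped_probability(n, r1, r2, p1, p2):
    """Probability of r1 occurrences of a1 and r2 of a2 in n trials,
    with every other outcome lumped into a single third outcome."""
    r3 = n - (r1 + r2)      # occurrences of the lumped outcome
    p3 = 1 - p1 - p2        # probability of the lumped outcome
    ways = factorial(n) // (factorial(r1) * factorial(r2) * factorial(r3))
    return ways * p1**r1 * p2**r2 * p3**r3

sixth = Fraction(1, 6)
prob = lumped_probability(10, 2, 3, sixth, sixth)   # two ones and three threes in ten throws
print(prob, float(prob))
```

For the die thrown ten times the lumped outcome has probability 4/6 and must occur five times, and the result rounds to .043.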


Exercises

1. Suppose that in a city 60 per cent of the population are Democrats, 30 per cent are Republicans, and 10 per cent are Independents. What is the probability that if three people are chosen at random there will be one Republican, one Democrat, and one Independent voter?

[Ans. .108.]

2. Three horses, A, B, and C, compete in four races. Assuming that each horse has an equal chance in each race, what is the probability that A wins two races and B and C win one each? What is the probability that the same horse wins all four races?

[Ans. \(\frac{4}{27}\); \(\frac{1}{27}\).]

3. Assume that in a certain large college 40 per cent of the students are freshmen, 30 per cent are sophomores, 20 per cent are juniors, and 10 per cent are seniors. A committee of eight is chosen at random from the student body. What is the probability that there are equal numbers from each class on the committee?

4. Let us assume that when a batter comes to bat, he or she has probability .6 of being put out, .1 of getting a walk, .2 of getting a single, .1 of getting an extra base hit. If he or she comes to bat five times in a game, what is the probability that

(a) The batter gets two walks and three singles?

[Ans. .0008.]

(b) The batter gets a walk, a single, an extra base hit (and is out twice)?

[Ans. .043.]

(c) The batter has a perfect day (i.e., is never out)?

[Ans. .010.]

5. Assume that a single torpedo has a probability \(\frac{1}{2}\) of sinking a ship, probability \(\frac{1}{4}\) of damaging it, and probability \(\frac{1}{4}\) of missing. Assume further that two damaging shots sink the ship. What is the probability that four torpedoes will succeed in sinking the ship?

[Ans. \(\frac{251}{256}\).]

6. Jones, Smith, and Green live in the same house. The mailman has observed that Jones and Smith receive the same amount of mail on the average, but that Green receives twice as much as Jones (and hence also twice as much as Smith). If he or she has four letters for this house, what is the probability that each resident receives at least one letter?

7. If three dice are thrown, find the probability that there is one six and two fives, given that all the outcomes are greater than three.

[Ans. \(\frac{1}{9}\).]

8. An athlete plays a tournament consisting of three games. In each game he or she has probability \(\frac{1}{2}\) for a win, \(\frac{1}{4}\) for a loss, and \(\frac{1}{4}\) for a draw, independently of the outcomes of other games. To win the tournament he or she must win more games than he or she loses. What is the probability that he or she wins the tournament?

9. Assume that in a certain course the probability that a student chosen at random will get an A is .1, that he or she will get a B is .2, that he or she will get a C is .4, that he or she will get a D is .2, and that he or she will get an F is .1. What distribution of grades is most likely in the case of four students?

[Ans. One B, two C’s, one D.]

10. Let us assume that in a World Series game a batter has probability \(\frac{1}{4}\) of getting no hits, \(\frac{1}{2}\) for getting one hit, \(\frac{1}{4}\) for getting two hits, assuming that the probability of getting more than two hits is negligible. In a four-game World Series, find the probability that the batter gets

(a) Exactly two hits.

[Ans. \(\frac{7}{64}\).]

(b) Exactly three hits.

[Ans. \(\frac{7}{32}\).]

(c) Exactly four hits.

[Ans. \(\frac{35}{128}\).]

(d) Exactly five hits.

[Ans. \(\frac{7}{32}\).]

(e) Fewer than two hits or more than five.

[Ans. \(\frac{23}{128}\).]

11. Gypsies sometimes toss a thick coin for which heads and tails are equally likely, but which also has probability \(\frac{1}{5}\) of standing on edge (i.e., neither heads nor tails). What is the probability of exactly one head and four tails in five tosses of a gypsy coin?

12. A family car is driven by the father, two sons, and the mother. The fenders have been dented four times, three times while the mother was driving. Is it fair to say that the mother is a worse driver than the men?

4.12 Expected value

In this section we shall discuss the concept of expected value. Although it originated in the study of gambling games, it enters into almost any detailed probabilistic discussion.

Definition. If in an experiment the possible outcomes are numbers, \(a_1, a_2, \ldots, a_k\), occurring with probability \(p_1, p_2, \ldots, p_k\), then the expected value is defined to be

\[E = a_1 p_1 + a_2 p_2 + \ldots + a_k p_k.\]

The term “expected value” is not to be interpreted as the value that will necessarily occur on a single experiment. For example, if a person bets $1 that a head will turn up when a coin is thrown, he or she may either win $1 or lose $1. The expected value is \((1)(\frac{1}{2}) + (-1)(\frac{1}{2}) = 0\), which is not one of the possible outcomes. The term, expected value, had its origin in the following consideration. If we repeat an experiment with expected value E a large number of times, and if we expect \(a_1\) a fraction \(p_1\) of the time, \(a_2\) a fraction \(p_2\) of the time, etc., then the average that we expect per experiment is E. In particular, in a gambling game E is interpreted as the average winning expected in a large number of plays. Here the expected value is often taken as the value of the game to the player. If the game has a positive expected value, the game is said to be favorable; if the game has expected value zero it is said to be fair; and if it has negative expected value it is described as unfavorable. These terms are not to be taken too literally, since many people are quite happy to play games that, in terms of expected value, are unfavorable. For instance, the buying of life insurance may be considered an unfavorable game which most people choose to play.
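The definition and the three labels can be sketched in a few lines. The names `expected_value` and `classify` are illustrative, and exact fractions are assumed for the probabilities so that the check that they sum to 1 is exact.

```python
from fractions import Fraction

def expected_value(values, probs):
    """E = a1*p1 + a2*p2 + ... + ak*pk for numeric outcomes and their probabilities."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(a * p for a, p in zip(values, probs))

def classify(e):
    """Label a game by the sign of its expected value to the player."""
    if e > 0:
        return "favorable"
    if e < 0:
        return "unfavorable"
    return "fair"

half = Fraction(1, 2)
e = expected_value([1, -1], [half, half])   # the $1 bet on heads
print(e, classify(e))
```

For the coin bet the expected value is 0, so the game is fair even though 0 is not a possible outcome of a single play.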

Example 4.24 For the first example of the application of expected value we consider the game of roulette as played at Monte Carlo. There are several types of bets which the gambler can make, and we consider two of these.

The wheel has the number 0 and the numbers from 1 to 36 marked on equally spaced slots. The wheel is spun and a ball comes to rest in one of these slots. If the player puts a stake, say of $1, on a given number, and the ball comes to rest in this slot, then he or she receives from the croupier 36 times the stake, or $36. The player wins $35 with probability \(\frac{1}{37}\) and loses $1 with probability \(\frac{36}{37}\). Hence his or her expected winnings are

\[35 \cdot \frac{1}{37} - 1 \cdot \frac{36}{37} = -.027.\]

This can be interpreted to mean that in the long run the player can expect to lose about 2.7 per cent of his or her stakes.

A second way to play is the following. A player may bet on “red” or “black”. The numbers from 1 to 36 are evenly divided between the two colors. If a player bets on “red”, and a red number turns up, the player receives twice the stake. If a black number turns up, the player loses the stake. If 0 turns up, then the wheel is spun until it stops on a number different from 0. If this is black, the player loses; but if it is red, the player receives only the original stake, not twice it. For this type of play, the player wins $1 with probability \(\frac{18}{37}\), breaks even with probability \(\frac{1}{2} \cdot \frac{1}{37}\), and loses $1 with probability \(\frac{18}{37} + \frac{1}{2} \cdot \frac{1}{37}\). Hence his or her expected winning is

\[1 \cdot \frac{18}{37} + 0 \cdot \frac{1}{74} - 1 \cdot \frac{37}{74} = -.0135.\]

In this case the player can expect to lose about 1.35 per cent of his or her stakes in the long run. Thus the expected loss in this case is only half as great as in the previous case. ♦
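Both roulette figures can be reproduced with exact fractions. This sketch uses the winnings and probabilities stated in the example (a $35 win on a single number, and the respin rule for 0 when betting on red); the variable names are illustrative.

```python
from fractions import Fraction

# Stake of $1 on a single number: win $35 with probability 1/37, lose $1 otherwise.
single = 35 * Fraction(1, 37) - 1 * Fraction(36, 37)

# Stake of $1 on "red": after a 0 the respin is red or black with equal chance.
p_win = Fraction(18, 37)
p_even = Fraction(1, 2) * Fraction(1, 37)
p_lose = Fraction(18, 37) + Fraction(1, 2) * Fraction(1, 37)
red = 1 * p_win + 0 * p_even - 1 * p_lose

print(single, float(single))
print(red, float(red))
```

The exact values are −1/37 and −1/74, which makes the factor of one half between the two expected losses explicit.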


Example 4.25 A player rolls a die and receives a number of dollars corresponding to the number of dots on the face which turns up. What should the player pay for playing, to make this a fair game? To answer this question, we note that the player wins 1, 2, 3, 4, 5 or 6 dollars, each with probability \(\frac{1}{6}\). Hence, the player’s expected winning is

\[1(\tfrac{1}{6}) + 2(\tfrac{1}{6}) + 3(\tfrac{1}{6}) + 4(\tfrac{1}{6}) + 5(\tfrac{1}{6}) + 6(\tfrac{1}{6}) = 3.5.\]

Thus if the player pays $3.50, the expected winnings will be zero. ♦

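Example 4.26 What is the expected number of successes in the case of four independent trials with probability \(\frac{1}{3}\) for success? We know that the probability of x successes is \(\binom{4}{x}(\frac{1}{3})^x(\frac{2}{3})^{4-x}\). Thus

\[
\begin{aligned}
E &= 0 \cdot \binom{4}{0}(\tfrac{1}{3})^0(\tfrac{2}{3})^4 + 1 \cdot \binom{4}{1}(\tfrac{1}{3})^1(\tfrac{2}{3})^3 + 2 \cdot \binom{4}{2}(\tfrac{1}{3})^2(\tfrac{2}{3})^2 \\
&\quad + 3 \cdot \binom{4}{3}(\tfrac{1}{3})^3(\tfrac{2}{3})^1 + 4 \cdot \binom{4}{4}(\tfrac{1}{3})^4(\tfrac{2}{3})^0 \\
&= 0 + \frac{32}{81} + \frac{48}{81} + \frac{24}{81} + \frac{4}{81} = \frac{108}{81} = \frac{4}{3}.
\end{aligned}
\]

In general, it can be shown that in n trials with probability p for success, the expected number of successes is np. ♦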
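The term-by-term sum can be repeated for other values of n and p as a spot check of the general statement. This is a numerical check, not a proof, and the function name is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def expected_successes(n, p):
    """Expected number of successes, summed term by term from the binomial probabilities."""
    q = 1 - p
    return sum(x * comb(n, x) * p**x * q**(n - x) for x in range(n + 1))

p = Fraction(1, 3)
print(expected_successes(4, p))             # the sum worked out in Example 4.26
print(expected_successes(4, p) == 4 * p)    # agrees with n * p
```

Trying other pairs, such as n = 10 with p = 2/5, gives the same agreement with np.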

Example 4.27 In the game of craps a pair of dice is rolled by one of the players. If the sum of the spots shown is 7 or 11, he or she wins. If it is 2, 3, or 12, he or she loses. If it is another sum, he or she must continue rolling the dice until he or she either repeats the same sum or rolls a 7. In the former case he or she wins, in the latter he or she loses. Let us suppose that he or she wins or loses $1. Then the two possible outcomes are +1 and −1. We will compute the expected value of the game. First we must find the probability that he or she will win.

We represent the possibilities by a two-stage tree shown in Figure 4.21. While it is theoretically possible for the game to go on indefinitely, we do not consider this possibility. This means that our analysis applies only to games which actually stop at some time.

Figure 4.21: ♦

The branch probabilities at the first stage are determined by thinking of the 36 possibilities for the throw of the two dice as being equally likely and taking in each case the fraction of the possibilities which correspond to the branch as the branch probability. The probabilities for the branches at the second level are obtained as follows. If, for example, the first outcome was a 4, then when the game ends, a 4 or 7 must have occurred. The possible outcomes for the dice were

{(3, 1), (1, 3), (2, 2), (4, 3), (3, 4), (2, 5), (5, 2), (1, 6), (6, 1)}.

Again we consider these possibilities to be equally likely and assign to the branch considered the fraction of the outcomes which correspond to this branch. Thus to the 4 branch we assign a probability \(\frac{3}{9} = \frac{1}{3}\). The other branch probabilities are determined in a similar way. Having the tree measure assigned, to find the probability of a win we must simply add the weights of all paths leading to a win. If this is done, we obtain \(\frac{244}{495}\). Thus the player’s expected value is

\[1 \cdot (\tfrac{244}{495}) + (-1) \cdot (\tfrac{251}{495}) = -\frac{7}{495} = -.0141.\]

Hence the player can expect to lose 1.41 per cent of his or her stakes in the long run. It is interesting to note that this is just slightly less favorable than the losses in betting on “red” in roulette. ♦
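The sum over winning paths of the two-stage tree can be carried out mechanically. In this sketch the second-stage branch probability for a point is taken, as in the example, to be the fraction of equally likely outcomes showing that point among those showing the point or a 7; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Probability of each sum when two ordinary dice are rolled.
sum_prob = {}
for a, b in product(range(1, 7), repeat=2):
    sum_prob[a + b] = sum_prob.get(a + b, 0) + Fraction(1, 36)

win = sum_prob[7] + sum_prob[11]                 # immediate win on the first roll
for point in (4, 5, 6, 8, 9, 10):
    # Second-stage branch: the point is repeated before a 7 appears.
    repeat = sum_prob[point] / (sum_prob[point] + sum_prob[7])
    win += sum_prob[point] * repeat

expected = 1 * win + (-1) * (1 - win)
print(win, expected, float(expected))
```

The computed win probability is 244/495 and the expected value is −7/495, matching the figures in the example.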


Exercises

1. Suppose that A tosses two coins and receives $2 if two heads appear, $1 if one head appears, and nothing if no heads appear. What is the expected value of the game to A?

[Ans. $1.]

2. Smith and Jones are matching coins. If the coins match, Smith gets $1, and if they do not, Jones gets $1.

(a) If the game consists of matching twice, what is the expected value of the game for Smith?

(b) Suppose that Smith quits if he or she wins the first round, and plays the second round if he or she loses the first. Jones is not allowed to quit. What is the expected value of the game for Smith?

3. If five coins are thrown, what is the expected number of heads that will turn up?

[Ans. \(\frac{5}{2}\).]

4. A coin is thrown until the first time a head comes up or until three tails in a row occur. Find the expected number of times the coin is thrown.

5. A customer wishes to purchase a five cent newspaper. The customer has in his or her pocket one dime and five pennies. The news agent offers to let the customer have the paper in exchange for one coin drawn at random from the customer’s pocket.

(a) Is this a fair proposition and, if not, to whom is it favorable?

[Ans. Favorable to customer.]

(b) Answer the same question assuming that the news agent demands two coins drawn at random from the customer’s pocket.

[Ans. Fair proposition.]

6. A bets 50 cents against B’s x cents that, if two cards are dealt from a shuffled pack of ordinary playing cards, both cards will be of the same color. What value of x will make this bet fair?


7. Prove that if the expected value of a given experiment is E, and if a constant c is added to each of the outcomes, the expected value of the new experiment is E + c.

8. Prove that, if the expected value of a given experiment is E, and if each of the possible outcomes is multiplied by a constant k, the expected value of the new experiment is k · E.

9. A gambler plays the following game: A card is drawn from a bridge deck; if it is an ace, the gambler wins $5; if it is a jack, a queen or a king, he or she wins $2; for any other card he or she loses $1. What is the expected winning per play?

10. An urn contains two black and three white balls. Balls are successively drawn from the urn without replacement until a black ball is obtained. Find the expected number of draws required.

11. Using the result of Exercises 13 and 14 of Section 4.6, find the expected number of games in the World Series (a) under the assumption that each team has probability \(\frac{1}{2}\) of winning each game and (b) under the assumption that the stronger team has probability .6 of winning each game.

[Ans. 5.81; 5.75.]

12. Suppose that we modify the game of craps as follows: On a 7 or 11 the player wins $2, on a 2, 3, or 12 he or she loses $3; otherwise the game is as usual. Find the expected value of the new game, and compare it with the old value.

13. Suppose that in roulette at Monte Carlo we place 50 cents on “red” and 50 cents on “black”. What is the expected value of the game? Is this better or worse than placing $1 on “red”?

14. Betting on “red” in roulette can be described roughly as follows. We win with probability .49, get our money back with probability .01, and lose with probability .50. Draw the tree for three plays of the game, and compute (to three decimals) the probability of each path. What is the probability that we are ahead at the end of three bets?

[Ans. .485.]


15. Assume that the odds are r : s that a certain statement will be true. If a gambler receives s dollars if the statement turns out to be true, and gives r dollars if not, what is his or her expected winning?

16. Referring to Exercise 9 of Section 4.3, find the expected number of languages that a student chosen at random reads.

17. Referring to Exercise 5 of Section 4.4, find the expected number of men who get their own hats.

[Ans. 1.]

18. A pair of dice is rolled. Each die has the number 1 on two opposite faces, the number 2 on two opposite faces, and the number 3 on two opposite faces. The “roller” wins a dollar if

(i) the sum of four occurs on the first roll; or

(ii) the sum of three or five occurs on the first roll and the same sum occurs on a subsequent roll before the sum of four occurs.

Otherwise he or she loses a dollar.

(a) What is the probability that the person rolling the dice wins?

[Ans. \(\frac{23}{45}\).]

(b) What is the expected value of the game?

[Ans. \(\frac{1}{45}\).]

4.13 Markov chains

In this section we shall study a more general kind of process than the ones considered in the last three sections.

We assume that we have a sequence of experiments with the following properties. The outcome of each experiment is one of a finite number of possible outcomes \(a_1, a_2, \ldots, a_r\). It is assumed that the probability of outcome \(a_j\) on any given experiment is not necessarily independent of the outcomes of previous experiments but depends at most upon the outcome of the immediately preceding experiment. We assume that there are given numbers \(p_{ij}\) which represent the probability of outcome \(a_j\) on any given experiment, given that outcome \(a_i\) occurred on the preceding experiment. The outcomes \(a_1, a_2, \ldots, a_r\) are called states, and the numbers \(p_{ij}\) are called transition probabilities. If we assume that the process begins in some particular state, then we have enough information to determine the tree measure for the process and can calculate probabilities of statements relating to the overall sequence of experiments. A process of the above kind is called a Markov chain process.

Figure 4.22: ♦

The transition probabilities can be exhibited in two different ways. The first way is that of a square array. For a Markov chain with states \(a_1\), \(a_2\), and \(a_3\), this array is written as

\[
P = \begin{pmatrix}
p_{11} & p_{12} & p_{13} \\
p_{21} & p_{22} & p_{23} \\
p_{31} & p_{32} & p_{33}
\end{pmatrix}.
\]

Such an array is a special case of a matrix. Matrices are of fundamental importance to the study of Markov chains as well as being important in the study of other branches of mathematics. They will be studied in detail in ??.

A second way to show the transition probabilities is by a transition diagram. Such a diagram is illustrated for a special case in Figure 4.22. The arrows from each state indicate the possible states to which a process can move from the given state.

The matrix of transition probabilities which corresponds to this diagram is the matrix

\[
\begin{array}{c|ccc}
 & a_1 & a_2 & a_3 \\ \hline
a_1 & 0 & 1 & 0 \\
a_2 & 0 & \frac{1}{2} & \frac{1}{2} \\
a_3 & \frac{1}{3} & 0 & \frac{2}{3}
\end{array}
\]

An entry of 0 indicates that the transition is impossible.

Notice that in the matrix P the sum of the elements of each row is 1. This must be true in any matrix of transition probabilities, since the elements of the ith row represent the probabilities for all possibilities when the process is in state \(a_i\).

The kind of problem in which we are most interested in the study of Markov chains is the following. Suppose that the process starts in state i. What is the probability that after n steps it will be in state j? We denote this probability by \(p_{ij}^{(n)}\). Notice that we do not mean by this the nth power of the number \(p_{ij}\). We are actually interested in this probability for all possible starting positions i and all possible terminal positions j. We can represent these numbers conveniently again by a matrix. For example, for n steps in a three-state Markov chain we write these probabilities as the matrix

\[
P^{(n)} = \begin{pmatrix}
p_{11}^{(n)} & p_{12}^{(n)} & p_{13}^{(n)} \\
p_{21}^{(n)} & p_{22}^{(n)} & p_{23}^{(n)} \\
p_{31}^{(n)} & p_{32}^{(n)} & p_{33}^{(n)}
\end{pmatrix}.
\]

Example 4.28 Let us find for a Markov chain with transition probabilities indicated in Figure 4.22 the probability of being at the various possible states after three steps, assuming that the process starts at state \(a_1\). We find these probabilities by constructing a tree and a tree measure as in Figure 4.23.

The probability \(p_{13}^{(3)}\), for example, is the sum of the weights assigned by the tree measure to all paths through our tree which end at state \(a_3\). That is,

\[p_{13}^{(3)} = 1 \cdot \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} \cdot \frac{2}{3} = \frac{7}{12}.\]

Similarly

\[p_{12}^{(3)} = 1 \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\]

and

\[p_{11}^{(3)} = 1 \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}.\]


Figure 4.23: ♦

By constructing a similar tree measure, assuming that we start at state \(a_2\), we could find \(p_{21}^{(3)}\), \(p_{22}^{(3)}\), and \(p_{23}^{(3)}\). The same is true for \(p_{31}^{(3)}\), \(p_{32}^{(3)}\), and \(p_{33}^{(3)}\). If this is carried out (see Exercise 7) we can write the results in matrix form as follows:

\[
P^{(3)} = \begin{array}{c|ccc}
 & a_1 & a_2 & a_3 \\ \hline
a_1 & \frac{1}{6} & \frac{1}{4} & \frac{7}{12} \\
a_2 & \frac{7}{36} & \frac{7}{24} & \frac{37}{72} \\
a_3 & \frac{4}{27} & \frac{7}{18} & \frac{25}{54}
\end{array}
\]

Again the rows add up to 1, corresponding to the fact that if we start at a given state we must reach some state after three steps. Notice now that all the elements of this matrix are positive, showing that it is possible to reach any state from any state in three steps. In the next chapter we will develop a simple method of computing \(P^{(n)}\). ♦
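The tree-measure computation can be mimicked directly by adding up path weights. The sketch below is a brute-force illustration, not the simpler method promised for the next chapter; the matrix entries are those read from Figure 4.22, with states a1, a2, a3 stored at indices 0, 1, 2, and the function name `n_step` is an assumption made here.

```python
from fractions import Fraction as F
from itertools import product

# Transition matrix for Figure 4.22; state a1 is index 0, a2 is 1, a3 is 2.
P = [[F(0), F(1), F(0)],
     [F(0), F(1, 2), F(1, 2)],
     [F(1, 3), F(0), F(2, 3)]]

def n_step(P, n):
    """P^(n) by the tree measure: add the weights of all n-step paths from i to j."""
    size = len(P)
    result = [[F(0)] * size for _ in range(size)]
    for i in range(size):
        for path in product(range(size), repeat=n):
            weight, state = F(1), i
            for nxt in path:
                weight *= P[state][nxt]   # branch probability along the path
                state = nxt
            result[i][state] += weight
    return result

for row in n_step(P, 3):
    print([str(x) for x in row], sum(row))
```

Each printed row reproduces the corresponding row of the matrix above and sums to 1. The number of paths grows as 3 to the power n, so this approach is only practical for small n.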

Example 4.29 Suppose that we are interested in studying the way in which a given state votes in a series of national elections. We wish to make long-term predictions and so will not consider conditions peculiar to a particular election year. We shall base our predictions only on the past history of the outcomes of the elections, Republican or Democratic. It is clear that a knowledge of these past results would influence our predictions for the future. As a first approximation, we assume that the knowledge of the past beyond the last election would not cause us to change the probabilities for the outcomes on the next election. With this assumption we obtain a Markov chain with two states R and D and matrix of transition probabilities

\[
\begin{array}{c|cc}
 & R & D \\ \hline
R & 1 - a & a \\
D & b & 1 - b
\end{array}
\]


The numbers $a$ and $b$ could be estimated from past results as follows. We could take for $a$ the fraction of the previous years in which the outcome has changed from Republican in one year to Democratic in the next year, and for $b$ the fraction of reverse changes.

We can obtain a better approximation by taking into account the previous two elections. In this case our states are RR, RD, DR, and DD, indicating the outcome of two successive elections. Being in state RR means that the last two elections were Republican victories. If the next election is a Democratic victory, we will be in state RD. If the election outcomes for a series of years are DDDRDRR, then our process has moved from state DD to DD to DR to RD to DR, and finally to RR. Notice that the first letter of the state to which we move must agree with the second letter of the state from which we came, since these refer to the same election year. Our matrix of transition probabilities will then have the form

$$
\begin{array}{c|cccc}
 & RR & DR & RD & DD \\ \hline
RR & 1 - a & 0 & a & 0 \\
DR & b & 0 & 1 - b & 0 \\
RD & 0 & 1 - c & 0 & c \\
DD & 0 & d & 0 & 1 - d
\end{array}.
$$

Again the numbers $a$, $b$, $c$, and $d$ would have to be estimated. The study of this example is continued in ??. ♦
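To make the bookkeeping concrete, here is a minimal Python sketch. It lists the two-election states visited for a given string of outcomes, and it estimates $a$ and $b$ for the two-state chain by counting changes. We assume here that "fraction" means the fraction among the elections that followed a Republican (respectively Democratic) victory, so that the estimates are transition probabilities; this reading, the function names, and the use of the short sample string as data are all illustrative assumptions.

```python
from fractions import Fraction

def two_election_states(outcomes):
    """List the states (pairs of successive outcomes) visited by the chain."""
    return [outcomes[i:i + 2] for i in range(len(outcomes) - 1)]

def estimate_a_b(outcomes):
    """Estimate a = Pr[R -> D] and b = Pr[D -> R] by counting changes.

    Assumes each party wins at least one election before the last one,
    so that neither denominator is zero.
    """
    pairs = two_election_states(outcomes)
    from_r = [s for s in pairs if s[0] == "R"]
    from_d = [s for s in pairs if s[0] == "D"]
    a = Fraction(sum(s == "RD" for s in from_r), len(from_r))
    b = Fraction(sum(s == "DR" for s in from_d), len(from_d))
    return a, b

history = "DDDRDRR"  # the sample sequence used in the text
print(two_election_states(history))  # ['DD', 'DD', 'DR', 'RD', 'DR', 'RR']
print(estimate_a_b(history))
```

A history of only seven elections is of course far too short for a serious estimate; it is used only because its state sequence is worked out in the text.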

Example 4.30 The following example of a Markov chain has been used in physics as a simple model for diffusion of gases. We shall see later that a similar model applies to an idealized problem in changing populations.

We imagine $n$ black balls and $n$ white balls which are put into two urns so that there are $n$ balls in each urn. A single experiment consists in choosing a ball from each urn at random and putting the ball obtained from the first urn into the second urn, and the ball obtained from the second urn into the first. We take as state the number of black balls in the first urn. If at any time we know this number, then we know the exact composition of each urn. That is, if there are $j$ black balls in urn 1, there must be $n - j$ black balls in urn 2, $n - j$ white balls in urn 1, and $j$ white balls in urn 2. If the process is in state $j$, then after the next exchange it will be in state $j - 1$, if a black ball is chosen from urn 1 and a white ball from urn 2. It will be in state $j$ if a ball of the same color is drawn from each urn. It will be in state $j + 1$ if a white ball is drawn from urn 1 and a black ball from urn 2. The transition probabilities are then given by (see Exercise 12)

$$
\begin{aligned}
p_{j,j-1} &= \left(\frac{j}{n}\right)^2, \quad j > 0,\\
p_{jj} &= \frac{2j(n-j)}{n^2},\\
p_{j,j+1} &= \left(\frac{n-j}{n}\right)^2, \quad j < n,\\
p_{jk} &= 0 \quad \text{otherwise.}
\end{aligned}
$$

A physicist would be interested, for example, in predicting the composition of the urns after a certain number of exchanges have taken place. Certainly any predictions about the early stages of the process would depend upon the initial composition of the urns. For example, if we started with all black balls in urn 1, we would expect that for some time there would be more black balls in urn 1 than in urn 2. On the other hand, it might be expected that the effect of this initial distribution would wear off after a large number of exchanges. We shall see later, in ??, that this is indeed the case. ♦
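The transition probabilities of the urn model are easy to tabulate by machine. The sketch below builds the full matrix for a given $n$ directly from the three formulas above and prints it for $n = 3$; the function name `urn_transition_matrix` and the choice $n = 3$ are ours, made only for illustration.

```python
from fractions import Fraction

def urn_transition_matrix(n):
    """Transition matrix for the exchange model with n balls in each urn.

    State j (0 <= j <= n) is the number of black balls in the first urn.
    """
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        if j > 0:
            # black ball from urn 1 and white ball from urn 2
            P[j][j - 1] = Fraction(j, n) ** 2
        # a ball of the same color from each urn
        P[j][j] = Fraction(2 * j * (n - j), n * n)
        if j < n:
            # white ball from urn 1 and black ball from urn 2
            P[j][j + 1] = Fraction(n - j, n) ** 2
    return P

P = urn_transition_matrix(3)
for row in P:
    print([str(x) for x in row], "row sum =", sum(row))
```

Each row sums to 1, as it must for a matrix of transition probabilities.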

Exercises

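1. Draw a state diagram for the Markov chain with transition probabilities given by the following matrices.

$$
\begin{pmatrix} \frac12 & \frac12 & 0 \\ 0 & 1 & 0 \\ \frac12 & 0 & \frac12 \end{pmatrix}, \quad
\begin{pmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & \frac12 & \frac12 \\ 0 & 0 & \frac12 & \frac12 \end{pmatrix}.
$$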


Figure 4.24: ♦

2. Give the matrix of transition probabilities corresponding to the transition diagrams in Figure 4.24.

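3. Find the matrix $P^{(2)}$ for the Markov chain determined by the matrix of transition probabilities

$$P = \begin{pmatrix} \frac12 & \frac12 \\ \frac13 & \frac23 \end{pmatrix}.$$

[Ans. $\begin{pmatrix} \frac{5}{12} & \frac{7}{12} \\ \frac{7}{18} & \frac{11}{18} \end{pmatrix}$.]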

4. What is the matrix of transition probabilities for the Markov chain in Example 4.30, for the case of two white balls and two black balls?

5. Find the matrices $P^{(2)}$, $P^{(3)}$, $P^{(4)}$ for the Markov chain determined by the transition probabilities

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Find the same for the Markov chain determined by the matrix

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$


6. Suppose that a Markov chain has two states, $a_1$ and $a_2$, and transition probabilities given by the matrix

$$\begin{pmatrix} \frac13 & \frac23 \\ \frac12 & \frac12 \end{pmatrix}.$$

By means of a separate chance device we choose a state in which to start the process. This device chooses $a_1$ with probability $\frac12$ and $a_2$ with probability $\frac12$. Find the probability that the process is in state $a_1$ after the first step. Answer the same question in the case that the device chooses $a_1$ with probability $\frac13$ and $a_2$ with probability $\frac23$.

[Ans. $\frac{5}{12}$; $\frac{4}{9}$.]

7. Referring to the Markov chain with transition probabilities indicated in Figure 4.22, construct the tree measures and determine the values of

$$p^{(3)}_{21},\ p^{(3)}_{22},\ p^{(3)}_{23}$$

and

$$p^{(3)}_{31},\ p^{(3)}_{32},\ p^{(3)}_{33}.$$

8. A certain calculating machine uses only the digits 0 and 1. It is supposed to transmit one of these digits through several stages. However, at every stage there is a probability $p$ that the digit which enters this stage will be changed when it leaves. We form a Markov chain to represent the process of transmission by taking as states the digits 0 and 1. What is the matrix of transition probabilities?

9. For the Markov chain in Exercise 8, draw a tree and assign a tree measure, assuming that the process begins in state 0 and moves through three stages of transmission. What is the probability that the machine after three stages produces the digit 0, i.e., the correct digit? What is the probability that the machine never changed the digit from 0?

Figure 4.25: ♦

10. Assume that a man’s profession can be classified as professional, skilled laborer, or unskilled laborer. Assume that of the sons of professional men 80 per cent are professional, 10 per cent are skilled laborers, and 10 per cent are unskilled laborers. In the case of sons of skilled laborers, 60 per cent are skilled laborers, 20 per cent are professional, and 20 per cent are unskilled laborers. Finally, in the case of unskilled laborers, 50 per cent of the sons are unskilled laborers, and 25 per cent each are in the other two categories. Assume that every man has a son, and form a Markov chain by following a given family through several generations. Set up the matrix of transition probabilities. Find the probability that the grandson of an unskilled laborer is a professional man.

[Ans. .375.]

11. In Exercise 10 we assumed that every man has a son. Assume instead that the probability a man has a son is .8. Form a Markov chain with four states. The first three states are as in Exercise 10, and the fourth state is such that the process enters it if a man has no son, and that the state cannot be left. This state represents families whose male line has died out. Find the matrix of transition probabilities and find the probability that an unskilled laborer has a grandson who is a professional man.

[Ans. .24.]

12. Explain why the transition probabilities given in Example 4.30 are correct.


13. Five points are marked on a circle. A process moves clockwise from a given point to its neighbor with probability $\frac23$, or counterclockwise to its neighbor with probability $\frac13$.

(a) Considering the process to be a Markov chain process, find the matrix of transition probabilities.

(b) Given that the process starts in state 3, what is the probability that it returns to the same state in two steps?

14. In northern New England, years for apples can be described as good, average, or poor. Suppose that following a good year the probabilities of good, average, or poor years are respectively .4, .4, and .2. Following a poor year the probabilities of good, average, or poor years are .2, .4, and .4 respectively. Following an average year the probabilities that the next year will be good or poor are each .2, and of an average year, .6.

(a) Set up the transition matrix of this Markov chain.

(b) 1965 was a good year. Compute the probabilities for 1966, 1967, and 1968.

[Ans. For 1967: .28, .48, .24.]

15. In Exercise 14 suppose that there is probability $\frac14$ for a good year, $\frac12$ for an average year, and $\frac14$ for a poor year. What are the probabilities for the following year?

16. A teacher in an oversized mathematics class finds, after grading all homework papers for the first two assignments, that it is necessary to reduce the amount of time spent in such grading. He therefore designs the following system: Papers will be marked satisfactory or unsatisfactory. All papers of students receiving a mark of unsatisfactory on any assignment will be read on each of the two succeeding days. Of the remaining papers, the teacher will read one-fifth, chosen at random. Assuming that each paper has a probability of one-fifth of being classified “unsatisfactory”,

(a) Set up a three-state Markov chain to describe the process.

(b) Suppose that a student has just handed in a satisfactory paper. What are the probabilities for the next two assignments?

17. In another model for diffusion, it is assumed that there are two urns which together contain $N$ balls numbered from 1 to $N$. Each second a number from 1 to $N$ is chosen at random, and the ball with the corresponding number is moved to the other urn. Set up a Markov chain by taking as state the number of balls in urn 1. Find the transition matrix.

4.14 The central limit theorem

We continue our discussion of the independent trials process with two outcomes. As usual, let $p$ be the probability of success on a trial, and $f(n, p; x)$ be the probability of exactly $x$ successes in $n$ trials.

In Figure 4.26 we have plotted bar graphs which represent $f(n, .3; x)$ for $n = 10$, 50, 100, and 200. We note first of all that the graphs are drifting off to the right. This is not surprising, since their peaks occur at $np$, which is steadily increasing. We also note that while the total area is always 1, this area becomes more and more spread out.

We want to redraw these graphs in a manner that prevents the drifting and the spreading out. First of all, we replace $x$ by $x - np$, assuring that our peak always occurs at 0. Next we introduce a new unit for measuring the deviation, which depends on $n$, and which gives comparable scales. As we saw in Section 4.10, the standard deviation $\sqrt{npq}$ is such a unit.

We must still ensure that probabilities are represented by areas in the graph. In Figure 4.26 this is achieved by having a unit base for each rectangle, and having the probability $f(n, p; x)$ as height. Since we are now representing a standard deviation as a single unit on the horizontal axis, we must take $f(n, p; x)\sqrt{npq}$ as the heights of our rectangles. The resulting curves for $n = 50$ and 200 are shown in Figures 4.27 and 4.28, respectively.

We note that the two figures look very much alike. We have also shown in Figure 4.28 that it can be approximated by a bell-shaped curve. This curve represents the function

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

and is known as the normal curve. It is a fundamental theorem of probability theory that as $n$ increases, the appropriately rescaled bar graphs more and more closely approach the normal curve. The theorem is known as the Central Limit Theorem, and we have illustrated it graphically.
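The rescaling can also be illustrated numerically. The following Python sketch computes, for $p = .3$ and $n = 50$ and 200, the height $f(n, p; x)\sqrt{npq}$ of the rescaled bar at two values of $x$ and compares it with the height of the normal curve at the corresponding point $(x - np)/\sqrt{npq}$. The function names and the particular values of $x$ are ours, chosen only for illustration.

```python
from math import comb, exp, pi, sqrt

def binomial(n, p, x):
    """f(n, p; x): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def normal_curve(t):
    """Height of the normal curve at t."""
    return exp(-t * t / 2) / sqrt(2 * pi)

def rescaled_bar(n, p, x):
    """Return (t, height): x in standard units and f(n, p; x) * sqrt(npq)."""
    sd = sqrt(n * p * (1 - p))
    return (x - n * p) / sd, binomial(n, p, x) * sd

for n in (50, 200):
    peak = round(0.3 * n)
    for x in (peak, peak + round(sqrt(n * 0.21))):  # the peak, and about one s.d. above it
        t, height = rescaled_bar(n, 0.3, x)
        print(f"n={n:3d} x={x:3d} t={t:5.2f} bar={height:.4f} curve={normal_curve(t):.4f}")
```

The bar heights and the curve heights should be close in each case, and closer for $n = 200$ than for $n = 50$.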

Figure 4.26: ♦

Figure 4.27: ♦

Figure 4.28: ♦

Figure 4.29: ♦

More precisely, the theorem states that for any two numbers $a$ and $b$, with $a < b$,

$$\Pr\left[a < \frac{x - np}{\sqrt{npq}} < b\right]$$

approaches the area under the normal curve between $a$ and $b$, as $n$ increases. This theorem is particularly interesting in that the normal curve is symmetric about 0, while $f(n, p; x)$ is symmetric about the expected value $np$ only for the case $p = \frac12$. It should also be noted that we always arrive at the same normal curve, no matter what the value of $p$ is.

In Figure 4.29 we give a table for the area under the normal curve between 0 and $d$. Since the total area is 1, and since it is symmetric about the origin, we can compute arbitrary areas from this table. For example, suppose that we wish the area between $-1$ and $+2$. The area between 0 and 2 is given in the table as .477. The area between $-1$ and 0 is the same as between 0 and 1, and hence is given as .341. Thus the total area is .818. The area outside the interval $(-1, 2)$ is then $1 - .818 = .182$.
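When no table is at hand, the same areas can be obtained by machine. The sketch below uses the error function `erf` from Python's standard `math` module, together with the fact that the area under the normal curve between 0 and $d$ equals $\frac12\,\mathrm{erf}(d/\sqrt{2})$. The function names are ours; `area_between` assembles an arbitrary area from "table entries" by symmetry, in the same way as the hand computation.

```python
from math import erf, sqrt

def area_0_to(d):
    """Area under the normal curve between 0 and d, for d >= 0 (a table entry)."""
    return 0.5 * erf(d / sqrt(2))

def area_between(a, b):
    """Area between a and b (a < b), assembled from table entries by symmetry."""
    def signed(t):
        return area_0_to(t) if t >= 0 else -area_0_to(-t)
    return signed(b) - signed(a)

inside = area_between(-1, 2)
print(f"between 0 and 2: {area_0_to(2):.3f}")
print(f"between 0 and 1: {area_0_to(1):.3f}")
print(f"between -1 and 2: {inside:.4f}, outside: {1 - inside:.4f}")
```

Carrying more decimal places gives an area of about .8186 rather than .818; the small difference comes from rounding the two table entries before adding them.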

Example 4.31 Let us find the probability that $x$ differs from the expected value $np$ by as much as $d$ standard deviations. We have

$$\Pr\left[|x - np| \ge d\sqrt{npq}\right] = \Pr\left[\left|\frac{x - np}{\sqrt{npq}}\right| \ge d\right],$$

and hence the approximate answer should be the area outside the interval $(-d, d)$ under the normal curve. For $d = 1, 2, 3$ we obtain $1 - (2 \cdot .341) = .318$, $1 - (2 \cdot .477) = .046$, $1 - (2 \cdot .4987) = .0026$, respectively. These agree with the values given in Section 4.10, to within rounding errors. In fact, the Central Limit Theorem is the basis of those estimates. ♦
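As a check on the three values just obtained from the table, the area outside $(-d, d)$ can be computed directly. This short sketch again relies on `math.erf`, using the fact that the area between $-d$ and $d$ is $\mathrm{erf}(d/\sqrt{2})$; the function name `outside` is ours.

```python
from math import erf, sqrt

def outside(d):
    """Area under the normal curve outside the interval (-d, d)."""
    return 1 - erf(d / sqrt(2))

for d in (1, 2, 3):
    print(d, round(outside(d), 4))
```

The printed values agree with .318, .046, and .0026 to within the rounding of the table entries.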

Example 4.32 In Example 4.19 we considered the example of throwing a coin 10,000 times. The expected number of heads that turn up is 5000 and the standard deviation is $\sqrt{10{,}000 \cdot \frac12 \cdot \frac12} = 50$. We observed that a deviation of more than two standard deviations (or 100) was very unlikely. On the other hand, consider the probability of a deviation of less than .1 standard deviation, that is, of a deviation of less than five. The area from 0 to .1 under the normal curve is .040 and hence the probability of a deviation from 5000 of less than five is approximately .08. Thus, while a deviation of 100 is very unlikely, it is also very unlikely that a deviation of less than five will occur. ♦

Example 4.33 The normal approximation can be used to estimate the individual probabilities $f(n, p; x)$ for large $n$. For example, let us estimate $f(200, .3; 65)$. The graph of the probabilities $f(200, .3; x)$ was given in Figure 4.28 together with the normal approximation. The desired probability is the area of the bar corresponding to $x = 65$. An inspection of the graph suggests that we should take the area under the normal curve between 64.5 and 65.5 as an estimate for this probability. Since $np = 60$, in normalized units this is the area between

$$\frac{4.5}{\sqrt{200(.3)(.7)}} \quad \text{and} \quad \frac{5.5}{\sqrt{200(.3)(.7)}},$$

or between .6944 and .8487. Our table is not fine enough to find this area, but from more complete tables, or by machine computation, this area may be found to be .046 to three decimal places. The exact value to three decimal places is .045. This procedure gives us a good estimate.

If we check all of the values of $f(200, .3; x)$ we find in each case that we would make an error of at most .001 by using the normal approximation. There is unfortunately no simple way to estimate the error caused by the use of the Central Limit Theorem. The error will clearly depend upon how large $n$ is, but it also depends upon how near $p$ is to 0 or 1. The greatest accuracy occurs when $p$ is near $\frac12$. ♦
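The "machine computation" mentioned in this example might look like the following Python sketch. It computes the exact binomial probability of 65 successes in 200 trials with $p = .3$, and the normal estimate obtained as the area under the normal curve over the base of the corresponding bar. The function names are ours, and the area is computed with `math.erf` rather than from a table.

```python
from math import comb, erf, sqrt

def binomial(n, p, x):
    """Exact probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def normal_estimate(n, p, x):
    """Area under the normal curve over the bar for x, from x - 0.5 to x + 0.5."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    lo = (x - 0.5 - mean) / sd   # lower edge in normalized units
    hi = (x + 0.5 - mean) / sd   # upper edge in normalized units
    return 0.5 * (erf(hi / sqrt(2)) - erf(lo / sqrt(2)))

exact = binomial(200, 0.3, 65)
approx = normal_estimate(200, 0.3, 65)
print(f"exact = {exact:.4f}, normal estimate = {approx:.4f}")
```

Looping the same comparison over all $x$ from 0 to 200 is one way to examine how large the error of the approximation can be.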

Example 4.34 Suppose that a drug has been administered to a number of patients and found to be effective a fraction $\bar{p}$ of the time. Assuming an independent trials process, it is natural to take $\bar{p}$ as an estimate for the unknown probability $p$ for success on any one trial. It is useful to have a method of estimating the reliability of this estimate. One method is the following. Let $x$ be the number of successes for the drug given to $n$ patients. Then by the Central Limit Theorem

$$\Pr\left[\left|\frac{x - np}{\sqrt{npq}}\right| \le 2\right] \approx .95.$$

This is the same as saying

$$\Pr\left[\left|\frac{x/n - p}{\sqrt{pq/n}}\right| \le 2\right] \approx .95.$$

Putting $\bar{p} = x/n$, we have

$$\Pr\left[|\bar{p} - p| \le 2\sqrt{pq/n}\right] \approx .95.$$

Using the fact that $pq \le \frac14$ (see Exercise 12), so that $2\sqrt{pq/n} \le 1/\sqrt{n}$, we have

$$\Pr\left[|\bar{p} - p| \le \frac{1}{\sqrt{n}}\right] \ge .95.$$

This says that no matter what $p$ is, with probability $\ge .95$, the true value will not deviate from the estimate $\bar{p}$ by more than $\frac{1}{\sqrt{n}}$. It is customary then to say that

$$\bar{p} - \frac{1}{\sqrt{n}} \le p \le \bar{p} + \frac{1}{\sqrt{n}}$$

with confidence .95. The interval

$$\left[\bar{p} - \frac{1}{\sqrt{n}},\ \bar{p} + \frac{1}{\sqrt{n}}\right]$$

is called a 95 per cent confidence interval. Had we started with

$$\Pr\left[\left|\frac{x - np}{\sqrt{npq}}\right| \le 3\right] \approx .99,$$

we would have obtained the 99 per cent confidence interval

$$\left[\bar{p} - \frac{3}{2\sqrt{n}},\ \bar{p} + \frac{3}{2\sqrt{n}}\right].$$

For example, if in 400 trials the drug is found effective 124 times, or .31 of the times, the 95 per cent confidence interval for $p$ is

$$\left[.31 - \frac{1}{20},\ .31 + \frac{1}{20}\right] = [.26, .36]$$

and the 99 per cent confidence interval is

$$\left[.31 - \frac{3}{40},\ .31 + \frac{3}{40}\right] = [.235, .385].$$

♦
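The confidence intervals of this example are simple enough to compute in a few lines. The sketch below is one possible arrangement: it takes the number of successes and the number of trials, and returns the interval whose half-width is $1/\sqrt{n}$ for the 95 per cent level and $3/(2\sqrt{n})$ for the 99 per cent level, both obtained from the bound $pq \le \frac14$. The function name and the `level` parameter are ours.

```python
from math import sqrt

def confidence_interval(successes, n, level=95):
    """Conservative confidence interval for p, using the bound pq <= 1/4.

    level=95 corresponds to two standard deviations (half-width 1/sqrt(n));
    level=99 corresponds to three (half-width 3/(2*sqrt(n))).
    """
    deviations = {95: 2, 99: 3}[level]
    estimate = successes / n                  # the observed fraction of successes
    half_width = deviations / (2 * sqrt(n))
    return estimate - half_width, estimate + half_width

print(confidence_interval(124, 400))       # close to (.26, .36)
print(confidence_interval(124, 400, 99))   # close to (.235, .385)
```

Because the bound $pq \le \frac14$ is used, these intervals are somewhat wider than necessary when the true value of $p$ is far from $\frac12$.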

Exercises

1. Let $x$ be the number of successes in $n$ trials of an independent trials process with probability $p$ for success. Let $x^\star = \frac{x - np}{\sqrt{npq}}$. For large $n$ estimate the following probabilities.

(a) Pr[x⋆ < −2.5].

[Ans. .006.]

(b) Pr[x⋆ < 2.5].

(c) Pr[x⋆ ≥ −.5].

(d) Pr[−1.5 < x⋆ < 1].

[Ans. .774.]

2. A coin is biased in such a way that a head comes up with probability .8 on a single toss. Use the normal approximation to estimate the probability that in a million tosses there are more than 800,400 heads.


3. Plot a graph of the probabilities $f(10, .5; x)$. Plot a graph also of the normalized probabilities as in Figures 4.27 and 4.28.

4. An ordinary coin is tossed one million times. Let $x$ be the number of heads which turn up. Estimate the following probabilities.

(a) $\Pr[499{,}500 < x < 500{,}500]$.

[Ans. Approximately .682.]

(b) $\Pr[499{,}000 < x < 501{,}000]$.

(c) $\Pr[498{,}500 < x < 501{,}500]$.


5. Assume that a baseball player has probability .37 of getting a hit each time he or she comes to bat. Find the probability of getting an average of .388 or better if he or she comes to bat 300 times during the season. (In 1957 Ted Williams had a batting average of .388 and Mickey Mantle had an average of .353. If we assume this difference is due to chance, we may estimate the probability of a hit as the combined average, which is about .37.)

[Ans. .242.]

6. A true-false examination has 48 questions. Assume that the probability that a given student knows the answer to any one question is $\frac34$. A passing score is 30 or better. Estimate the probability that the student will fail the exam.

7. In Example 4.21 of Section 4.10, assume that the school decides to admit 1296 students. Estimate the probability that they will have to have additional dormitory space.

8. Peter and Paul each have 20 pennies. They agree to match pennies 400 times, keeping score but not paying until the 400 matches are over. What is the probability that one of the players will not be able to pay? Answer the same question for the case that Peter has 10 pennies and Paul has 30.

9. In tossing a coin 100 times, the probability of getting 50 heads is, to three decimal places, .080. Estimate this same probability using the Central Limit Theorem.

[Ans. .080.]

10. A standard medicine has been found to be effective in 80 per cent of the cases where it is used. A new medicine for the same purpose is found to be effective in 90 of the first 100 patients on which the medicine is used. Could this be taken as good evidence that the new medication is better than the old?

11. In the Weldon dice experiment, 12 dice were thrown 26,306 times and the appearance of a 5 or a 6 was considered to be a success. The mean number of successes observed was, to four decimal places, 4.0524. Is this result significantly different from the expected average number of 4?

[Ans. Yes.]

12. Prove that $pq \le \frac14$. [Hint: write $p = \frac12 + x$.]

13. Suppose that out of 1000 persons interviewed 650 said that they would vote for Mr. Big for mayor. Construct the 99 per cent confidence interval for $p$, the proportion in the city that would vote for Mr. Big.

14. Opinion pollsters in election years usually poll about 3000 voters. Suppose that in an election year 51 per cent favor candidate A and 49 per cent favor candidate B. Construct 95 per cent confidence limits for candidate A winning.

[Ans. [.492, .528].]

15. In an experiment with independent trials we are going to estimate $p$ by the fraction $\bar{p}$ of successes. We wish our estimate to be within .02 of the correct value with probability .95. Show that 2500 observations will always suffice. Show that if it is known that $p$ is approximately .1, then 900 observations would be sufficient.


16. An experimenter has an independent trials process and he or she has a hypothesis that the true value of $p$ is $p_0$. The experimenter decides to carry out a number of trials, and from the observed r calculate the 95 per cent confidence interval for $p$. The experimenter will reject $p_0$ if it does not fall within these limits. What is the probability that he or she will reject $p_0$ when in fact it is correct? Should he or she accept $p_0$ if it does fall within the confidence interval?

17. A coin is tossed 100 times and turns up heads 61 times. Using the method of Exercise 16 test the hypothesis that the coin is a fair coin.

[Ans. Reject.]

18. Two railroads are competing for the passenger traffic of 1000 passengers by operating similar trains at the same hour. If a given passenger is equally likely to choose one train as the other, how many seats should the railroad provide if it wants to be sure that its seating capacity is sufficient in 99 out of 100 cases?

[Ans. 537.]

4.15 Gambler’s ruin

In this section we will study a particular Markov chain, which is interesting in itself and has far-reaching applications. Its name, “gambler’s ruin”, derives from one of its many applications. In the text we will describe the chain from the gambling point of view, but in the exercises we will present several other applications.

Let us suppose that you are gambling against a professional gambler, or gambling house. You have selected a specific game to play, on which you have probability p of winning. The gambler has made sure that the game is in his or her favor, so that p < 1/2. However, in most situations p will be close to 1/2. (The cases p = 1/2 and p > 1/2 are considered in the exercises.)

At the start of the game you have A dollars, and the gambler has B dollars. You bet $1 on each game, and play until one of you is ruined. What is the probability that you will be ruined? Of course, the answer depends on the exact values of p, A, and B. We will develop a formula for the ruin-probability in terms of these three given numbers.


Figure 4.30: ♦

First we will set the problem up as a Markov chain. Let N = A + B, the total amount of money in the game. As states for the chain we choose the numbers 0, 1, 2, . . . , N. At any one moment the position of the chain is the amount of money you have. The initial position is shown in Figure 4.30.

If you win a game, your money increases by $1, and the gambler’s fortune decreases by $1. Thus the new position is one state to the right of the previous one. If you lose a game, the chain moves one step to the left. Thus at any step there is probability p of moving one step to the right, and probability q = 1 − p of one step to the left. Since the probabilities for the next position are determined by the present position, it is a Markov chain.

If the chain reaches 0 or N, we stop. When 0 is reached, you are ruined. When N is reached, you have all the money, and you have ruined the gambler. We will be interested in the probability of your ruin, i.e., the probability of reaching 0.

Let us suppose that p and N are fixed. We actually want the probability of ruin when we start at A. However, it turns out to be easier to solve a problem that appears much harder: Find the ruin-probability for every possible starting position. For this reason we introduce the notation x_i, to stand for the probability of your ruin if you start in position i (that is, if you have i dollars).

Let us first solve the problem for the case $N = 5$. We have the unknowns $x_0, x_1, x_2, x_3, x_4, x_5$. Suppose that we start at position 2. The chain moves to 3, with probability $p$, or to 1, with probability $q$. Thus

$$\Pr[\text{ruin} \mid \text{start at 2}] = \Pr[\text{ruin} \mid \text{start at 3}] \cdot p + \Pr[\text{ruin} \mid \text{start at 1}] \cdot q,$$

using the conditional probability formula, with a set of two alternatives. But once it has reached state 3, a Markov chain behaves just as if it had been started there. Thus

$$\Pr[\text{ruin} \mid \text{start at 3}] = x_3.$$

And, similarly,

$$\Pr[\text{ruin} \mid \text{start at 1}] = x_1.$$

We obtain the key relation

$$x_2 = p x_3 + q x_1.$$

We can modify this as follows:

$$(p + q)x_2 = p x_3 + q x_1,$$

$$p(x_2 - x_3) = q(x_1 - x_2),$$

$$x_1 - x_2 = r(x_2 - x_3),$$

where $r = p/q$ and hence $r < 1$. When we write such an equation for each of the four “ordinary” positions, we obtain

$$
\begin{aligned}
x_0 - x_1 &= r(x_1 - x_2),\\
x_1 - x_2 &= r(x_2 - x_3),\\
x_2 - x_3 &= r(x_3 - x_4),\\
x_3 - x_4 &= r(x_4 - x_5).
\end{aligned}
\tag{4.4}
$$

We must still consider the two extreme positions. Suppose that the chain reaches 0. Then you are ruined, hence the probability of your ruin is 1. If instead the chain reaches $N = 5$, the gambler drops out of the game, and you can’t be ruined. Thus

$$x_0 = 1, \quad x_5 = 0. \tag{4.5}$$

If we substitute the value of $x_5$ in the last equation of (4.4), we have $x_3 - x_4 = r x_4$. This in turn may be substituted in the previous equation, etc. We thus have the simpler equations

$$
\begin{aligned}
x_4 &= 1 \cdot x_4,\\
x_3 - x_4 &= r x_4,\\
x_2 - x_3 &= r^2 x_4,\\
x_1 - x_2 &= r^3 x_4,\\
x_0 - x_1 &= r^4 x_4.
\end{aligned}
\tag{4.6}
$$

Let us add all the equations. We obtain

$$x_0 = (1 + r + r^2 + r^3 + r^4)\,x_4.$$

From (4.5) we have that $x_0 = 1$. We also use the simple identity

$$(1 - r)(1 + r + r^2 + r^3 + r^4) = 1 - r^5.$$

And then we solve for $x_4$:

$$x_4 = \frac{1 - r}{1 - r^5}.$$

If we add the first two equations in (4.6), we have that $x_3 = (1 + r)x_4$. Similarly, adding the first three equations, we solve for $x_2$, and adding the first four equations we obtain $x_1$. We now have our entire solution,

$$x_1 = \frac{1 - r^4}{1 - r^5}, \quad x_2 = \frac{1 - r^3}{1 - r^5}, \quad x_3 = \frac{1 - r^2}{1 - r^5}, \quad x_4 = \frac{1 - r^1}{1 - r^5}. \tag{4.7}$$
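The solution for $N = 5$ can be verified by substituting it back into the key relation at each of the four ordinary positions. The following Python sketch does this with exact fractions for one illustrative value, $p = \frac25$; the function name `ruin_probabilities` and the choice of $p$ are ours, and any other $p < \frac12$ could be used in its place.

```python
from fractions import Fraction

def ruin_probabilities(p, N=5):
    """Return [x_0, ..., x_N] with x_i = (1 - r^(N-i)) / (1 - r^N), r = p/q."""
    r = p / (1 - p)
    return [(1 - r ** (N - i)) / (1 - r ** N) for i in range(N + 1)]

p = Fraction(2, 5)   # an arbitrary illustrative value with p < 1/2
q = 1 - p
x = ruin_probabilities(p)

# Boundary values at the two extreme positions.
assert x[0] == 1 and x[5] == 0
# Key relation x_i = p*x_(i+1) + q*x_(i-1) at the ordinary positions.
for i in range(1, 5):
    assert x[i] == p * x[i + 1] + q * x[i - 1]

print([str(v) for v in x])
```

Since the arithmetic is exact, the assertions confirm that the fractions satisfy the equations for this value of $p$, not merely to within rounding error.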

The same method will work for any value of $N$. And it is easy to guess from (4.7) what the general solution looks like. If we want $x_A$, the answer is a fraction like those in (4.7). In the denominator the exponent of $r$ is always $N$. In the numerator the exponent is $N - A$, or $B$. Thus the ruin-probability is

$$x_A = \frac{1 - r^B}{1 - r^N}. \tag{4.8}$$

We recall that $A$ is the amount of money you have, $B$ is the gambler’s stake, $N = A + B$, $p$ is your probability of winning a game, and $r = p/(1 - p)$.

In Figure 4.31 we show some typical values of the ruin-probability. Some of these are quite startling. If the probability p is as low as .45 (odds against you on each game 11:9) and the gambler has 20 dollars to put up, you are almost sure to be ruined. Even in a nearly fair game, say p = .495, with each of you having $50 to start with, there is a .731 chance for your ruin.
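Values of the ruin-probability like those just quoted are easily computed from formula 4.8. The following Python sketch is one way to do it; the function name is ours, and the stakes in the second and third calls are illustrative choices rather than entries taken from Figure 4.31.

```python
def ruin_probability(p, A, B):
    """Formula 4.8: probability of your ruin when you start with A against B.

    Assumes p is not 1/2, so that r = p/(1 - p) is different from 1.
    """
    r = p / (1 - p)
    return (1 - r ** B) / (1 - r ** (A + B))

print(round(ruin_probability(0.495, 50, 50), 3))   # the nearly fair game: 0.731
print(round(ruin_probability(0.45, 20, 20), 3))    # illustrative stakes
print(round(ruin_probability(0.45, 1000, 20), 3))  # a much richer player, same B
```

Comparing the last two lines shows how little your own fortune matters once the gambler's stake is fixed.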

It is worth examining the ruin-probability formula 4.8 more closely. Since the denominator is always less than 1, your probability of ruin is at least 1 − r^B. This estimate does not depend on how much money you have, only on p and B. Since r is less than 1, by making B large enough, we can make r^B practically 0, and hence make it almost certain that you will be ruined.

Suppose, for example, that a gambler wants to have probability .999 of ruining you. (You can hardly call him or her a gambler under those circumstances!) The gambler must make sure that r^B < .001. For example, if p = .495, the gambler needs $346 to have probability .999 of ruining you, even if you are a millionaire. If p = .48, the gambler needs only $87. And even for the almost fair game with p = .499, $1727 will suffice.
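The stakes quoted in this paragraph can be reproduced by finding the smallest whole number B for which r^B falls below .001. The sketch below does this with logarithms; the function name `stake_needed` and its `ruin_target` parameter are ours.

```python
from math import floor, log

def stake_needed(p, ruin_target=0.999):
    """Smallest whole-dollar stake B with r^B < 1 - ruin_target, r = p/(1 - p) < 1."""
    r = p / (1 - p)
    # r^B < eps  is equivalent to  B > log(eps) / log(r), since log(r) < 0.
    return floor(log(1 - ruin_target) / log(r)) + 1

for p in (0.495, 0.48, 0.499):
    print(p, stake_needed(p))   # 346, 87, and 1727 dollars respectively
```

This uses only the lower bound 1 − r^B on the ruin-probability, so the stake found is sufficient no matter how much money you bring to the game.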


Figure 4.31: ♦


There are two ways that gamblers achieve this goal. Small gambling houses will fix the odds quite a bit in their favor, making r much less than 1. Then even a relatively small bank of B dollars suffices to assure them of winning. Larger houses, with B quite sizable, can afford to let you play nearly fair games.

Exercises

1. An urn has nine white balls and 11 black balls. A ball is drawn, and replaced. If it is white, you win five cents, if black, you lose five cents. You have a dollar to gamble with, and your opponent has fifty cents. If you keep on playing till one of you loses all his or her money, what is the probability that you will lose your dollar?

[Ans. .868.]

2. Suppose that you are shooting craps, and you always hold the dice. You have $20, your opponent has $10, and $1 is bet on each game; estimate your probability of ruin.

3. Two government agencies, A and B, are competing for the same task. A has 50 positions, and B has 20. Each year one position is taken away from one of the agencies, and given to the other. If 52 per cent of the time the shift is from A to B, what do you predict for the future of the two agencies?

[Ans. One agency will be abolished. B survives with probability .8, A with probability .2.]

4. What is the approximate value of x_A if you are rich, and the gambler starts with $1?

5. Consider a simple model for evolution. On a small island there is room for 1000 members of a certain species. One year a favorable mutant appears. We assume that in each subsequent generation either the mutants take one place from the regular members of the species, with probability .6, or the reverse happens. Thus, for example, the mutation disappears in the very first generation with probability .4. What is the probability that the mutants eventually take over? [Hint: See Exercise 4.]


[Ans. $\frac13$.]

6. Verify that the proof of formula 4.8 in the text is still correct when $p > \frac12$. Interpret formula 4.8 for this case.

7. Show that if $p > \frac12$, and both parties have a substantial amount of money, your probability of ruin is approximately $1/r^A$.

8. Modify the proof in the text to apply to the case $p = \frac12$. What is the probability of your ruin?

[Ans. $B/N$.]

9. You are matching pennies. You have 25 pennies to start with, and your opponent has 35. What is the probability that you will win all his or her pennies?

10. Jones lives on a short street, about 100 steps long. At one end of the street is Jones’s home, at the other a lake, and in the middle a bar. One evening Jones leaves the bar in a state of intoxication, and starts to walk at random. What is the probability that Jones will fall into the lake if

(a) Jones is just as likely to take a step to the right as to the left?

[Ans. 1/2.]

(b) Jones has probability .51 of taking a step towards home?

[Ans. .119.]

11. You are in the following hopeless situation: You are playing a game in which you have only 1/3 chance of winning. You have $1, and your opponent has $7. What is the probability of your winning all his or her money if

(a) You bet $1 each time?

[Ans. 1/255.]

(b) You bet all your money each time?

[Ans. 1/27.]

12. Repeat Exercise 11 for the case of a fair game, where you have probability 1/2 of winning.

13. Modify the proof in the text to compute y_i, the probability of reaching state N = 5.

14. Verify, in Exercise 13, that x_i + y_i = 1 for every state. Interpret.

Note: The following exercises deal with the following ruin problem: A and B play a game in which A has probability 2/3 of winning. They keep playing until either A has won six times or B has won three times.

15. Set up the process as a Markov chain whose states are (a, b), where a is the number of times A won, and b the number of B wins.

16. For each state compute the probability of A winning from that position. [Hint: Work from higher a- and b-values to lower ones.]

17. What is the probability that A reaches his or her goal first?

[Ans. 1024/2187.]

18. Suppose that payments are made as follows: If A wins six games, A receives $1, if B wins three games then A pays $1. What is the expected value of the payment, to the nearest penny?

Suggested reading.

Cramer, Harald, The Elements of Probability Theory, Part I, 1955.

Feller, W., An Introduction to Probability Theory and its Applications, 1950.

Goldberg, S., Probability: An Introduction, 1960.

Mosteller, F., Fifty Challenging Problems in Probability with Solutions, 1965.

Neyman, J., First Course in Probability and Statistics, 1950.

Parzen, E., Modern Probability Theory and Its Applications, 1960.

Whitworth, W. A., Choice and Chance, with 1000 Exercises, 1934.
