
Problems on Discrete Mathematics¹

Chung-Chih Li

Kishan Mehrotra

Syracuse University, New York

LaTeX at January 11, 2007

(Part II)

¹ No part of this book can be reproduced without permission from the authors.

Contents

Preface

Acknowledgment

I Basic Concepts

0 Preliminary
    0.1 Conventions
    0.2 Patterns of theorems and proof

1 Sets
    1.1 Definitions and Basic Theorems
        1.1.1 Definitions
        1.1.2 Basic Theorems
    1.2 Problems
    1.3 Solutions

2 Logic
    2.1 Definitions
        2.1.1 Propositional Logic
        2.1.2 Predicate Logic
        2.1.3 Predicates and Sets
    2.2 Logical Proof
        2.2.1 Laws of Logic
        2.2.2 Rules of Inference
        2.2.3 Inference Rules for Quantified Predicates
    2.3 DNF and CNF
        2.3.1 The DNF of a given wff
        2.3.2 The CNF of a given wff
        2.3.3 A shortcut to find the DNF and CNF
    2.4 Problem
    2.5 Solutions

3 Mathematical Induction
    3.1 Concepts
        3.1.1 Necessary Conditions of Using Mathematical Induction
        3.1.2 The Underlying Theory of Mathematical Induction
        3.1.3 Mathematical Induction of the First Form (Weak Induction)
        3.1.4 Mathematical Induction of the Second Form (Strong Induction)
    3.2 Mathematical Induction and Recursive Definition
        3.2.1 Recursive Definitions for Functions
        3.2.2 Recursive Definitions for Sets and Structural Induction
    3.3 Nested Induction
        3.3.1 The underlying logic of nested induction
    3.4 Problems
    3.5 Solutions

4 Relations
    4.1 Definitions, Theorems, and Comments
        4.1.1 Definitions
        4.1.2 Theorems
    4.2 Problems
    4.3 Solutions

5 Functions
    5.1 Definitions, Theorems, and Comments
        5.1.1 Definitions
        5.1.2 Theorems
    5.2 The Pigeonhole Principle
    5.3 Asymptotic Notations
    5.4 Problems
    5.5 Solutions

II Specific Topics

6 Integers
    6.1 Floor and Ceiling Functions
    6.2 Divisibility
    6.3 Greatest Common Divisor
    6.4 Congruence
    6.5 Solving Linear Congruence Equations
    6.6 Solving Linear Congruence Equations with multiple variables
    6.7 Applications
        6.7.1 Chinese Remainder Theorem
        6.7.2 Fermat’s Little Theorem and Euler’s Theorem
        6.7.3 RSA Cryptosystem
    6.8 Problems
    6.9 Solutions

7 Binomial Theorem and Counting
    7.1 The Binomial Theorem
    7.2 Principles and Typical Problems for Counting
        7.2.1 Urns and Balls Model
        7.2.2 Summary
    7.3 Problems
    7.4 Solutions

8 Recurrence Relations and Generating Functions
    8.1 Recurrence Relations
        8.1.1 Definitions
    8.2 Solving Recurrence Relations
        8.2.1 Repeated Substitution Method
        8.2.2 Characteristic Root Method
        8.2.3 Generating Function Method
        8.2.4 An Example
    8.3 Problems
    8.4 Solutions

9 Discrete Probability
    9.1 Definitions and Terminologies
        9.1.1 Examples and Discussion
    9.2 Theorems of Probability
    9.3 Problems
    9.4 Solutions

III Appendices

A Loop Invariance

B Sample Quizzes

Part II

Specific Topics

Chapter 6

Integers: Preliminary Background for Number Theoretics & Cryptography

The integers were created by God; all else is the work of man.

– Ludwig Kronecker


It is fair to say that the concept of integers is the very first mathematical concept grasped by everyone. We can count almost as soon as we can barely speak, at the age of three or four. Within a few years, through education, everyone acquires a fair skill in basic operations on integers such as addition, subtraction, multiplication, and division. These four basic arithmetic operations are arguably more than enough for our daily life. However, number theory, the study of integers, has been developed far beyond daily applications. In fact, numerous civilizations independently conceived mechanical procedures (algorithms) for solving problems related to integers from the earliest times. Most of these ancient results were discovered by amateur mathematicians called numerologists, either for fun or for spiritual purposes. More recently, with the development of modern computers, number theory has turned out to be an indispensable tool in many important applications such as coding theory and cryptography. Since the theory has developed into an independent and deep branch of mathematics in its own right, it is worthwhile for us to study the subject once again and carefully analyze the mathematical properties behind this seemingly elementary-school topic. We will find that integers are not as naive as they look.

6.1 Floor and Ceiling Functions

Let $\mathbb{R}$ denote the set of all real numbers, $\mathbb{Z}$ the set of all integers, and $\mathbb{N}$ the set of all natural numbers. By convention, $0 \notin \mathbb{N}$. In number theory, we study integers only. We use the following two functions to trim any real number into an integer.

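Definition 6.1: Floor function $\lfloor\,\cdot\,\rfloor$ and ceiling function $\lceil\,\cdot\,\rceil$. Let $t \in \mathbb{R}$.

• $\lfloor t \rfloor$ is the largest integer $a$ such that $a \le t$.

• $\lceil t \rceil$ is the smallest integer $a$ such that $t \le a$.

Here are some examples:

\[ \lfloor 1.1 \rfloor = 1, \quad \lfloor -2.1 \rfloor = \lfloor -2.3 \rfloor = -3, \quad \lfloor \pi \rfloor = 3, \]

\[ \lceil 1.1 \rceil = 2, \quad \lceil -2.1 \rceil = \lceil -2.3 \rceil = -2, \quad \lceil \pi \rceil = 4. \]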
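The examples above can also be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library functions `math.floor` and `math.ceil` as stand-ins for the floor and ceiling functions of Definition 6.1; the variable names are only illustrative.

```python
import math

# Values taken from the examples above.
samples = [1.1, -2.1, -2.3, math.pi]

floors = [math.floor(t) for t in samples]
ceils = [math.ceil(t) for t in samples]

print(floors)  # [1, -3, -3, 3]
print(ceils)   # [2, -2, -2, 4]
```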

We can easily check their correctness according to the definitions above. It is also easy to see that neither the floor nor the ceiling function is injective. We observe that, for an arbitrary $x \in \mathbb{R}$, $\lfloor x \rfloor$ may not be the nearest integer to $x$. How about $\lfloor x + \frac{1}{2} \rfloor$? Consider the following theorem.

© Chung-Chih Li, Kishan Mehrotra


Theorem 6.1 For any number $x$, if $(x + \frac{1}{2}) \notin \mathbb{Z}$, then $\lfloor x + \frac{1}{2} \rfloor$ is the unique nearest integer to $x$.

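Proof: Suppose $x + \frac{1}{2} \notin \mathbb{Z}$. Let $x = k + s$, where $k \in \mathbb{Z}$ and $0 \le s < 1$. Since $x + \frac{1}{2} \notin \mathbb{Z}$, it follows that $s \ne \frac{1}{2}$. We observe that (1) if $0 \le s < \frac{1}{2}$, then the integer nearest to $x$ is $k$, and (2) if $\frac{1}{2} < s < 1$, then the integer nearest to $x$ is $k + 1$. Thus, it is sufficient to prove that

1. if $0 \le s < \frac{1}{2}$, then $\lfloor x + \frac{1}{2} \rfloor = k$, and

2. if $\frac{1}{2} < s < 1$, then $\lfloor x + \frac{1}{2} \rfloor = k + 1$.

Case 1:

\[
\begin{aligned}
0 \le s < \tfrac{1}{2} &\Rightarrow \tfrac{1}{2} \le s + \tfrac{1}{2} < 1 \\
&\Rightarrow k + \tfrac{1}{2} \le k + s + \tfrac{1}{2} < k + 1 \\
&\Rightarrow k + \tfrac{1}{2} \le x + \tfrac{1}{2} < k + 1 \\
&\Rightarrow k < x + \tfrac{1}{2} < k + 1 \\
&\Rightarrow \lfloor x + \tfrac{1}{2} \rfloor = k.
\end{aligned}
\]

Case 2:

\[
\begin{aligned}
\tfrac{1}{2} < s < 1 &\Rightarrow 1 < s + \tfrac{1}{2} < 1 + \tfrac{1}{2} \\
&\Rightarrow k + 1 < k + s + \tfrac{1}{2} < k + 1 + \tfrac{1}{2} \\
&\Rightarrow k + 1 < x + \tfrac{1}{2} < k + 1 + \tfrac{1}{2} \\
&\Rightarrow \lfloor x + \tfrac{1}{2} \rfloor = k + 1.
\end{aligned}
\]

□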

The result of the above theorem is not particularly important, but its proof provides another chance to become familiar with mathematical arguments of this kind.
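As a sanity check of Theorem 6.1 (not a replacement for the proof), the sketch below tests the claim on exact rational sample points. It assumes Python's `fractions.Fraction` to avoid floating-point rounding; the helper names `nearest_integer` and `is_unique_nearest` are hypothetical and introduced only for this illustration.

```python
import math
from fractions import Fraction

def nearest_integer(x):
    """Return floor(x + 1/2); assumes x + 1/2 is not an integer."""
    return math.floor(x + Fraction(1, 2))

def is_unique_nearest(x, n):
    """Check that n is strictly closer to x than both of its neighbours."""
    return abs(x - n) < abs(x - (n - 1)) and abs(x - n) < abs(x - (n + 1))

# Exact sample points k/10 for k = -30..30, skipping those where x + 1/2 is an integer.
points = [Fraction(k, 10) for k in range(-30, 31)]
points = [x for x in points if (x + Fraction(1, 2)).denominator != 1]

all_ok = all(is_unique_nearest(x, nearest_integer(x)) for x in points)
print(all_ok)  # True
```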

Definition 6.2: For $a \in \mathbb{R}$, the absolute value of $a$, denoted by $|a|$, is defined as follows.

\[
|a| =
\begin{cases}
a & \text{if } a \ge 0; \\
-a & \text{if } a < 0.
\end{cases}
\]

6.2 Divisibility

Among the four arithmetic operations, division is the most interesting one. We will take a close look at integer division in this section.

Definition 6.3: For $a, b \in \mathbb{Z}$, we say $a$ divides $b$ iff there exists an integer $k$ such that $ak = b$. We use $a \mid b$ to denote that $a$ divides $b$.



Definition 6.4: $p \in \mathbb{N}$ is said to be prime if and only if $p \ge 2$ and $p$ has no positive divisors except 1 and $p$ itself.

Definition 6.5: Let $a, b \in \mathbb{N}$. We say that $a$ and $b$ are relatively prime to each other if and only if there is no positive integer other than 1 that divides both $a$ and $b$.

Note that 1 is a natural number without positive divisors except 1 and itself, but by convention we do not consider 1 a prime number, because otherwise many interesting theorems would become trivial.

Theorem 6.2 Every integer greater than 1 can be presented as a product of primes. Moreover, for every $n \in \mathbb{Z}$ with $n > 1$, $n$ can be uniquely factorized as

\[ n = p_1 p_2 \cdots p_k, \quad \text{where } p_1 \le p_2 \le \cdots \le p_k \text{ are primes.} \]

Proof: We will prove the moreover-part of the theorem by mathematical induction. When $n = 2$, it is self-evident. Suppose the statement is true for every integer from 2 up to $n$, where $n \ge 2$. Consider $n + 1$. If $n + 1$ is a prime, then the statement is automatically true for $n + 1$. Suppose $n + 1$ is not a prime. Then there are two integers $a$ and $b$ such that $2 \le a \le n$, $2 \le b \le n$, and $n + 1 = ab$. By the inductive hypothesis, we can uniquely factorize $a$ and $b$ as

\[ a = p_1 p_2 \cdots p_i \quad \text{and} \quad b = q_1 q_2 \cdots q_j, \]

where $p_1 \le p_2 \le \cdots \le p_i$ and $q_1 \le q_2 \le \cdots \le q_j$. Since $p_1 p_2 \cdots p_i$ and $q_1 q_2 \cdots q_j$ are unique, with some proper arrangement, $n + 1$ can be uniquely presented as

\[ n + 1 = ab = r_1 r_2 \cdots r_{i+j}, \]

where $r_1 \le r_2 \le \cdots \le r_{i+j}$ are prime numbers. □
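To see the factorization of Theorem 6.2 concretely, here is a minimal sketch in Python that produces the sorted prime factors by trial division. Trial division is an assumption of this example (the text does not prescribe a factoring method), and `prime_factors` is a hypothetical helper name.

```python
def prime_factors(n):
    """Return the primes p1 <= p2 <= ... <= pk with n = p1 * p2 * ... * pk (n > 1)."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:   # divide out p as often as possible
            factors.append(p)
            n //= p
        p += 1
    if n > 1:               # whatever is left is itself prime
        factors.append(n)
    return factors

print(prime_factors(360))   # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))    # [97]
```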

Algorithms: The algorithm is a very old concept, dating from ancient times in many civilizations. An algorithm is nothing more or less than a recipe for carrying out a sequence of operations to solve a problem. In mathematics, an algorithm that correctly constructs the required mathematical objects, or gives answers to mathematical problems, can perfectly well be considered a mathematical proof. In fact, some mathematicians, known as constructivists or intuitionists, maintain the doctrine that giving an effective algorithm is the only legitimate way to prove mathematical theorems. Their philosophy is straightforward: if you claim that something exists, then you have to provide an effective way to build it or find it, and an algorithm is such an effective way.

Let’s take a look at the notion of algorithms. Here we borrow the definition from the Encyclopedia of Computer Science and Engineering: An algorithm is “the precise characterization of a method of solving a problem”. The phrase in the definition that needs emphasis is “precise characterization.” The encyclopedia further notes that any algorithm must have the following properties:

1. Finiteness. Application of the algorithm to a particular set of data must result in a finite sequence of actions.

2. Unique Initialization. The action that starts the algorithm must be unique.

3. Unique Succession. Each action in the algorithm must be followed by a unique successor action.

4. Solution. The algorithm must terminate with a solution to the problem, or it must indicate that for the given data the problem is insoluble by the algorithm.

Except for the first property, readers should not take properties 2, 3, and 4 too seriously. Properties 2 and 3 are redundant in the sense that they neither enhance nor undermine the power of algorithms. The last property requires the correctness of an algorithm for the problem of interest. In fact, any algorithm does solve some problem; it just may not be the one we want to solve. If the problem is given and we are asked to write an algorithm to solve it, it is usually the last property that is most difficult to verify.

In the following we prove a theorem by giving a “correct” algorithm. We will repeatedly use the theorem in this chapter. To most of us, the theorem is so basic that it can be understood intuitively. Nevertheless, a formal proof is demanded. Formally proving the correctness of a given algorithm, however, is a big pain in the neck and is way beyond the scope of this book. Here we simply follow the algorithm and comprehend its correctness by our intuition.

Theorem 6.3 Let $a, b \in \mathbb{Z}$ and $b \ne 0$. There exist unique integers $q$ and $r$ such that

\[ a = qb + r, \quad \text{where } 0 \le r < |b|. \]

Note that we require $r$ to be nonnegative. Such an $r$ is called the remainder of $a$ divided by $b$.

Proof: As we mentioned earlier, an easy way to prove the theorem is to write an algorithm that takes two integers $a$ and $b$ and, if $b \ne 0$, outputs the correct $q$ and $r$.

Consider the algorithm in Figure 6.1. The algorithm would be a bit easier if we restricted the input numbers to be positive. □

    Input a, b
    r ← a; q ← 0;
    while not 0 ≤ r < |b| do
        if a × b ≥ 0
            then r ← r − b;
            else r ← r + b;
        endif
        q ← q + 1;
    endwhile
    if a × b ≥ 0
        then return (q, r);
        else return (−q, r);
    endif

Figure 6.1: The Division Algorithm

To get a better idea about how this algorithm works, run the division algorithm on the following inputs as an exercise.

a = 10, b = 3; a = 10, b = −3; a = −10, b = 3; a = −10, b = −3.
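One way to work through the exercise is to transcribe the pseudocode of Figure 6.1 into a runnable form. The sketch below is such a transcription in Python; the function name `divide` and the guard against $b = 0$ are additions made for this illustration.

```python
def divide(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < |b|, following Figure 6.1."""
    if b == 0:
        raise ValueError("b must be nonzero")
    r, q = a, 0
    same_sign = a * b >= 0
    while not (0 <= r < abs(b)):
        # Move r toward the interval [0, |b|) one step of size |b| at a time.
        r = r - b if same_sign else r + b
        q += 1
    return (q, r) if same_sign else (-q, r)

for a, b in [(10, 3), (10, -3), (-10, 3), (-10, -3)]:
    q, r = divide(a, b)
    print(f"a={a}, b={b}: q={q}, r={r}")
```

Note that Python's built-in `divmod` follows a different convention when the divisor is negative (its remainder takes the sign of the divisor), so it does not always return the nonnegative remainder required by Theorem 6.3.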

Although an algorithm can be considered a formal mathematical proof, we have to face a few challenges. How can you be sure that the algorithm indeed does what it is supposed to do? Even if it does, how can you be sure that all other algorithms for the same problem always give the same answer on the same input, so that we can claim the uniqueness of $q$ and $r$? Unfortunately, it is in principle impossible to prove correctness and uniqueness by simply giving an algorithm. We do not intend to answer these questions and complete our proof here. Our purpose is to get a feel for the idea that an algorithm can serve as a mathematical tool to prove theorems. Moreover, Theorem 6.3 lies at the very foundation of the entire number theory.

6.3 Greatest Common Divisor

Definition 6.6: Given two integers $m$ and $n$, we use $\gcd(m, n)$ to denote the greatest common divisor of $m$ and $n$, which is the largest positive integer that divides both $m$ and $n$.

Consider the following examples: $\gcd(12, 16) = 4$, $\gcd(-315, 91) = 7$, and $\gcd(10, 0) = 10$. By convention, we take $\gcd(0, 0) = 0$. We can rewrite the definition of “relatively prime” as follows:



Definition 6.7: Integers $a$ and $b$ are said to be relatively prime to each other if $\gcd(a, b) = 1$.
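The gcd examples and the definition of relatively prime can be tried out directly. This is a small sketch assuming Python's standard `math.gcd`, which returns a nonnegative result and also yields 0 for `gcd(0, 0)`, in line with the convention stated here; `relatively_prime` is a hypothetical helper name.

```python
import math

print(math.gcd(12, 16))    # 4
print(math.gcd(-315, 91))  # 7
print(math.gcd(10, 0))     # 10
print(math.gcd(0, 0))      # 0

def relatively_prime(a, b):
    """Definition 6.7: a and b are relatively prime iff gcd(a, b) == 1."""
    return math.gcd(a, b) == 1

print(relatively_prime(35, 12))  # True
print(relatively_prime(12, 16))  # False
```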

We present some useful properties related to gcd in the following theorems. Most of the theorems are intuitively understandable and can be verified by our intuition. However, intuition is built up from experience, not from rigorous mathematical arguments. As we go deeper into the theory, intuition no longer helps. (Have a quick peek at Theorem 6.10 to see if you can be convinced by intuition.) Here we present proofs in terms of mathematical arguments that meet a certain level of rigor.

Theorem 6.4 Suppose $a, b \in \mathbb{N}$ with $a = da'$ and $b = db'$. Then $\gcd(a, b) = d$ iff $a'$ and $b'$ are relatively prime, i.e., $\gcd(a', b') = 1$.

Proof: Given $a, b \in \mathbb{N}$ with $a = da'$ and $b = db'$. Let $\gcd(a, b) = d$ and $\gcd(a', b') = k$. By contradiction, assume $k \ne 1$. We have $a = dka''$ and $b = dkb''$ for some $a'', b'' \in \mathbb{N}$, so $dk$ is a common divisor of $a$ and $b$. Since $a, b \in \mathbb{N}$, it follows that $k \ne 0$ and $dk > d$. This contradicts the assumption that $d$ is the greatest common divisor of $a$ and $b$.

For the other direction, assume $a = da'$, $b = db'$, and $\gcd(a', b') = 1$. It is clear that $d$ is a common divisor of $a$ and $b$. Since $\gcd(a', b') = 1$, there is no common divisor other than 1 that can be extracted from $a'$ and $b'$. Thus, it is impossible to obtain a common divisor of $a$ and $b$ bigger than $d$. Therefore, $\gcd(a, b) = d$. □

Theorem 6.5 Let $a, b, m \in \mathbb{N}$. We have

\[ \gcd(ma, mb) = m \gcd(a, b). \]

Proof: Let $\gcd(a, b) = d$. By Theorem 6.4, there are $a'$ and $b'$ such that $a = da'$, $b = db'$, and $\gcd(a', b') = 1$. Clearly, $ma = mda'$ and $mb = mdb'$. By the other direction of Theorem 6.4, since $\gcd(a', b') = 1$, it follows that $md$ is the greatest common divisor of $ma$ and $mb$. Thus, $\gcd(ma, mb) = md = m \gcd(a, b)$. □

Theorem 6.6 For any $a, b, x, y \in \mathbb{Z}$, $xa + yb$ is divisible by $\gcd(a, b)$.

Proof: Let $\gcd(a, b) = d$. There must be two integers $a'$ and $b'$ such that $a = da'$ and $b = db'$. We have

\[ xa + yb = xda' + ydb' = (xa' + yb')d. \]

Since $x$, $y$, $a'$, and $b'$ are all integers, $xa' + yb'$ must be an integer too, say $k$. Therefore, $xa + yb = kd$, which is divisible by $d$. □



Theorem 6.7 Let $a, b \in \mathbb{N}$ with $a \ge b$. Then,

\[ \gcd(a, b) = \gcd(a - b, b). \]

Proof: Let $a, b \in \mathbb{N}$ with $a \ge b$ and $\gcd(a, b) = d$. By Theorem 6.4, there are $a'$ and $b'$ such that $a = da'$, $b = db'$, and $\gcd(a', b') = 1$. Also, $a - b = d(a' - b')$. One can verify that if $a'$ and $b'$ are relatively prime, then so are $a' - b'$ and $b'$. Thus, $\gcd(a' - b', b') = 1$. With Theorem 6.5, we have

\[ \gcd(a - b, b) = \gcd(d(a' - b'), db') = d \gcd(a' - b', b') = d = \gcd(a, b). \]

□
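Theorems 6.5, 6.6, and 6.7 can be spot-checked numerically on small inputs. The sketch below is only an empirical check under the assumption that `math.gcd` computes the gcd of Definition 6.6; the ranges are arbitrary choices for this illustration, and passing the check is of course not a proof.

```python
import math
from itertools import product

N = range(1, 25)

# Theorem 6.5: gcd(ma, mb) = m * gcd(a, b)
thm_6_5 = all(math.gcd(m * a, m * b) == m * math.gcd(a, b)
              for a, b, m in product(N, N, range(1, 6)))

# Theorem 6.6: gcd(a, b) divides x*a + y*b
thm_6_6 = all((x * a + y * b) % math.gcd(a, b) == 0
              for a, b in product(N, N)
              for x, y in product(range(-3, 4), repeat=2))

# Theorem 6.7: gcd(a, b) = gcd(a - b, b) whenever a >= b
thm_6_7 = all(math.gcd(a, b) == math.gcd(a - b, b)
              for a, b in product(N, N) if a >= b)

print(thm_6_5, thm_6_6, thm_6_7)  # True True True
```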

Definition 6.8: Given two nonzero integers $a$ and $b$, we use $\mathrm{lcm}(a, b)$ to denote the least common multiple of $a$ and $b$, which is the least positive integer that is divisible by both $a$ and $b$.

The following theorem connects the concepts of gcd and lcm.

Theorem 6.8 Let $a$ and $b$ be two natural numbers. We have

\[ \mathrm{lcm}(a, b) = \frac{ab}{\gcd(a, b)}. \]

Proof: Let $\gcd(a, b) = d$. By Theorem 6.4, we have that $a = da'$, $b = db'$, and $\gcd(a', b') = 1$. Thus,

\[ \frac{ab}{\gcd(a, b)} = da'b'. \]

It is clear that $da'b'$ is a common multiple of $a$ and $b$. What remains to prove is that $da'b'$ is the least one. Let $m$ be any positive common multiple of $a$ and $b$, i.e., $a \mid m$ and $b \mid m$. Since $a \mid m$ and $a = da'$, there is some $k \in \mathbb{N}$ such that $m = da'k$. Also, since $b \mid m$, we have $db' \mid da'k$. It follows that $b' \mid a'k$. Since $\gcd(a', b') = 1$, we have $b' \mid k$. It is clear that, if $b' \mid k$, then $b' \le k$, and hence $da'b' \le da'k = m$. Therefore, any positive common multiple of $a$ and $b$ must be greater than or equal to $da'b'$. □
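Theorem 6.8 gives a direct way to compute the lcm from the gcd. As a minimal sketch in Python, the function `lcm` below applies the formula, and `lcm_by_search` (a brute-force helper introduced only for comparison) finds the least common multiple straight from Definition 6.8.

```python
import math

def lcm(a, b):
    """Least common multiple of natural numbers a and b via Theorem 6.8."""
    return a * b // math.gcd(a, b)

def lcm_by_search(a, b):
    """Smallest positive multiple of a that b also divides (brute force)."""
    m = a
    while m % b != 0:
        m += a
    return m

print(lcm(12, 16))            # 48
print(lcm_by_search(12, 16))  # 48
```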

Theorem 6.9 Let $a, b$ be integers. If $m$ is a multiple of both $a$ and $b$, then $m$ is also a multiple of $\mathrm{lcm}(a, b)$.

Proof: Suppose $\gcd(a, b) = d$, and let $a = da'$, $b = db'$. Thus, $\gcd(a', b') = 1$ and $\mathrm{lcm}(a, b) = da'b'$. Since $m$ is a common multiple of $a$ and $b$, there exist integers $k_1$ and $k_2$ such that $m = k_1 a = k_2 b$. We have

\[ m = k_1 d a' = \frac{k_1}{b'}\, d a' b' = \frac{k_1}{b'}\, \mathrm{lcm}(a, b). \]

We will now prove that $\frac{k_1}{b'}$ is an integer. From the assumption we have

\[ k_1 d a' = k_2 d b' \;\Rightarrow\; \frac{k_1}{b'}\, a' = k_2. \]

Since $k_2$ is an integer and $\gcd(a', b') = 1$, $\frac{k_1}{b'}$ must be an integer. □

In the following, we present a theorem that is important in the sense that it can help us simplify many proofs. The proof of the theorem itself is not trivial.

Theorem 6.10 For any $a, b \in \mathbb{Z}$, there exist integers $x$ and $y$ such that

\[ xa + yb = \gcd(a, b). \]

Proof: It is clear that if one of $a$ and $b$ is zero, we can easily assign 1 or 0 to $x$ and $y$ to satisfy the theorem. For simplicity, we can assume $a, b \in \mathbb{N}$ without loss of generality. Let

\[ s = \min\{xa + yb > 0 : x, y \in \mathbb{Z}\}. \tag{6.1} \]

That is, $s$ is the smallest natural number among all possible values $xa + yb$ with $x, y \in \mathbb{Z}$. We argue that, for any $x, y \in \mathbb{Z}$, $s \mid (xa + yb)$. Let $s = ua + vb$ for some $u, v \in \mathbb{Z}$. Also, let $a = da'$, $b = db'$ and $\gcd(a, b) = d$. Thus,

\[ s = ua + vb = d(ua' + vb'). \]

Fix $x, y \in \mathbb{Z}$. By the division algorithm, we have $q, r \in \mathbb{Z}$ such that

\[ xa + yb = sq + r, \quad 0 \le r < s. \]

It follows that

\[ r = xa + yb - sq = xa + yb - uqa - vqb = (x - uq)a + (y - vq)b. \]

Thus, $r$ is also a linear combination of $a$ and $b$ with integral coefficients, and $r < s$. Together with the assumption in (6.1), we conclude that the only possible value for $r$ is 0. Therefore, $s \mid (xa + yb)$ for any $x, y \in \mathbb{Z}$. Consequently, $s \mid a$ (when $x = 1$, $y = 0$) and $s \mid b$ (when $x = 0$, $y = 1$). In other words, $s$ is a common divisor of $a$ and $b$. Thus, $s \le \gcd(a, b)$. By Theorem 6.6, $\gcd(a, b) \mid s$, and hence $\gcd(a, b) \le s$. Therefore, $s = \gcd(a, b) = ua + vb$ for some $u, v \in \mathbb{Z}$. □
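The key object in this proof is the smallest positive value of $xa + yb$. The sketch below approximates it by searching a finite box of coefficients and compares the result with the gcd. The bound of 20 on $|x|$ and $|y|$ is an assumption that happens to be large enough for the sample inputs; the function name is hypothetical.

```python
import math
from itertools import product

def smallest_positive_combination(a, b, bound=20):
    """Smallest value x*a + y*b > 0 with |x|, |y| <= bound (a finite stand-in for (6.1))."""
    values = [x * a + y * b
              for x, y in product(range(-bound, bound + 1), repeat=2)
              if x * a + y * b > 0]
    return min(values)

for a, b in [(12, 16), (242, 165), (35, 12)]:
    print(a, b, smallest_positive_combination(a, b), math.gcd(a, b))
```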

Theorem 6.11 Consider $a, b, q \in \mathbb{Z}$. If both $a$ and $b$ are divisible by $q$, then $\gcd(a, b)$ is divisible by $q$.

Proof: Suppose $a = qa'$ and $b = qb'$, where $q$, $a'$ and $b'$ are all integers. Let $\gcd(a, b) = d$. By Theorem 6.10, there exist integers $x, y$ such that $d = xa + yb$. Thus,

\[ d = xqa' + yqb' \;\Rightarrow\; \frac{d}{q} = xa' + yb'. \]

Because $x$, $y$, $a'$, and $b'$ are integers, $\frac{d}{q}$ must be an integer. Therefore, $q \mid d$. □

Theorem 6.12 Let $a, b \in \mathbb{Z}$. There exist $x, y \in \mathbb{Z}$ such that $xa + yb = 1$ iff $\gcd(a, b) = 1$.

Proof: Suppose $xa + yb = 1$ where $x$ and $y$ are integers. Let $a = da'$, $b = db'$ and $\gcd(a, b) = d$. Without loss of generality, we may assume $d \ne 0$. We have

\[ xda' + ydb' = 1 \;\Longrightarrow\; d = \frac{1}{xa' + yb'}. \]

The only case in which both $d$ and $xa' + yb'$ are integers is when $d = xa' + yb' = 1$. The other direction of this theorem is simply a special case of Theorem 6.10. □

The following theorem is a corollary of Theorem 6.12.

Theorem 6.13 If $a, b_1, b_2, \ldots, b_n \in \mathbb{Z}$ and $\gcd(a, b_1) = \gcd(a, b_2) = \cdots = \gcd(a, b_n) = 1$, then $\gcd(a, b_1 b_2 \cdots b_n) = 1$.

Proof: By Theorem 6.10 we find $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ such that

\[
\begin{aligned}
1 &= x_1 a + y_1 b_1; \\
1 &= x_2 a + y_2 b_2; \\
&\;\;\vdots \\
1 &= x_n a + y_n b_n.
\end{aligned}
\]

Thus,

\[ 1^n = (x_1 a + y_1 b_1)(x_2 a + y_2 b_2) \cdots (x_n a + y_n b_n). \]

Therefore, $1 = Aa + B(b_1 b_2 \cdots b_n)$, where $A$ is a polynomial in $a, x_1, \ldots, x_n$, $b_1, \ldots, b_n$ and $y_1, \ldots, y_n$, and $B = y_1 y_2 \cdots y_n$. Since both $A$ and $B$ are integers, by Theorem 6.12, $\gcd(a, b_1 b_2 \cdots b_n) = 1$. □

Euclid’s Algorithm: Our next question is: how do we find the greatest common divisor of any two integers? This is not a particularly difficult problem; by brute force, we can check 1, 2, ..., up to the smaller of the two to see which are common divisors, and pick the greatest one. This always works, but we also want to solve the problem efficiently. The great Greek mathematician Euclid, about 2300 years ago, gave an elegant algorithm, now known as Euclid’s algorithm, to solve this problem. It is one of the oldest algorithms. Before we introduce the algorithm, we first make some observations. Let $m$ and $n$ be any two integers.

\[ \gcd(m, n) = \gcd(n, m) \tag{6.2} \]

\[ \gcd(-m, -n) = \gcd(-m, n) = \gcd(m, -n) = \gcd(m, n) \tag{6.3} \]

Equation (6.2) implies that, without loss of generality, we can assume that the first argument of our algorithm is never less than the second one. Equation (6.3) implies that we can confine our attention to non-negative integers. Our second observation is:

\[ \gcd(m, 0) = m, \quad \gcd(0, n) = n, \quad \text{and} \quad \gcd(0, 0) = 0. \tag{6.4} \]

In other words, if one of the two integers is 0, then the other integer is the gcd. This tells us when to terminate our algorithm. Together with Theorem 6.7, we have an idea about how to proceed in our algorithm to guarantee that the algorithm will reach the terminating condition.

As we recursively call the algorithm with new arguments $m - n$ and $n$, we can subtract as many $n$’s as possible from $m$, so long as the difference remains non-negative. To find how many times $n$ can be subtracted from $m$, we simply use the division algorithm:

\[ m = q \times n + r, \]

where $0 \le r \le n - 1$. Let $q = \lfloor \frac{m}{n} \rfloor$. It is clear that we can remove $n$ from $m$ exactly $q$ times. With this background we are ready to present this easy but not trivial algorithm. Note that we have decided to use non-negative integers $m$ and $n$ only. The algorithm is shown in Figure 6.2.

    function gcd(m, n)
        if n = 0
            then return m;
            else return gcd(n, m − ⌊m/n⌋ n);
        endif
    endfunction

Figure 6.2: Euclid’s Algorithm

A nonrecursive version of Euclid’s algorithm is shown in Figure 6.3.

    function gcd(m, n)
        repeat while n ≠ 0
            q ← ⌊m/n⌋;
            r ← m − q × n;
            m ← n;
            n ← r;
        endrepeat
        return m;
    endfunction

Figure 6.3: Nonrecursive Euclid’s Algorithm

Consider the following two examples. Let $q = \lfloor \frac{m}{n} \rfloor$ and $r = m - \lfloor \frac{m}{n} \rfloor n$.

\[
\begin{aligned}
m &= n \times q + r \\
946 &= 726 \times 1 + 220 \\
726 &= 220 \times 3 + 66 \\
220 &= 66 \times 3 + 22 \\
66 &= 22 \times 3 + 0
\end{aligned}
\]

\[
\begin{aligned}
m &= n \times q + r \\
1247 &= 98 \times 12 + 71 \\
98 &= 71 \times 1 + 27 \\
71 &= 27 \times 2 + 17 \\
27 &= 17 \times 1 + 10 \\
17 &= 10 \times 1 + 7 \\
10 &= 7 \times 1 + 3 \\
7 &= 3 \times 2 + 1 \\
3 &= 1 \times 3 + 0
\end{aligned}
\]

Therefore, $\gcd(946, 726) = 22$ and $\gcd(1247, 98) = 1$. □
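The worked examples can be reproduced with a short program. The following Python sketch implements the nonrecursive form of Euclid's algorithm for non-negative inputs and additionally records each division step; the function name `gcd_steps` and the step log are additions made for this illustration.

```python
def gcd_steps(m, n):
    """Nonrecursive Euclid's algorithm, also recording each step (m, n, q, r)."""
    steps = []
    while n != 0:
        q = m // n          # floor(m / n) for non-negative m and positive n
        r = m - q * n
        steps.append((m, n, q, r))
        m, n = n, r
    return m, steps

g, steps = gcd_steps(946, 726)
for m, n, q, r in steps:
    print(f"{m} = {n} x {q} + {r}")
print("gcd =", g)  # gcd = 22
```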

Extended Euclid’s Algorithm: Recall that Theorem 6.10 states that for any integers $a$ and $b$, there are integers $x$ and $y$ such that $xa + yb = \gcd(a, b)$. As we mentioned earlier, it is an important theorem in the sense that it simplifies many proofs of interesting theorems. Moreover, the values of $x$ and $y$ are needed for solving linear congruence equations, which will be introduced in Section 6.5. The proof given for Theorem 6.10 is logically perfect, but it says nothing about how to actually find the values of $x$ and $y$. (They do exist, all right!) A proof of that kind is called “nonconstructive”. Clearly, a correct algorithm to find $x$ and $y$ is a legitimate proof of the existence claim stated in Theorem 6.10. We call this kind of proof “constructive”: an effective procedure to construct the claimed objects is provided. In the following we give an algorithm to actually find the values of $x$ and $y$ in Theorem 6.10.

Since the idea for finding values of $x$ and $y$ in Theorem 6.10 is in fact contained in Euclid’s algorithm, and we only make some modifications, the algorithm is called the Extended Euclid’s Algorithm. Again, we make some observations first, which will help us present our arguments. (i) It is obvious that if $n = 0$, then $x = 1$, $y = 0$. This gives the terminating condition. (ii) If $n \ne 0$, we set $n_1 = m - \lfloor \frac{m}{n} \rfloor n$, $m_1 = n$ and apply the method recursively. That is, we compute $x_1, y_1$ such that

\[ x_1 m_1 + y_1 n_1 = \gcd(m_1, n_1) = \gcd(m, n). \]

To make this presentation easier to follow, we rewrite $m = m_0$, $n = n_0$, $x = x_0$, and $y = y_0$. Suppose that we have obtained $x_1$ and $y_1$ such that

\[ x_1 m_1 + y_1 n_1 = \gcd(m_1, n_1) = \gcd(m_0, n_0). \]



    function egcd(m, n)
        if n = 0
            then return (1, 0);
            else (x, y) ← egcd(n, m − ⌊m/n⌋ n);
        endif
        return (y, x − ⌊m/n⌋ y);
    endfunction

Figure 6.4: Extended Euclid’s Algorithm

    function egcd(m, x0, y0; n, x1, y1)
        if n = 0
            then return (x0, y0);
        endif
        r ← m − ⌊m/n⌋ n;
        x2 ← x0 − ⌊m/n⌋ x1;
        y2 ← y0 − ⌊m/n⌋ y1;
        return egcd(n, x1, y1; r, x2, y2);
    endfunction

Figure 6.5: Extended Euclid’s Algorithm (V.2)

Substituting for $m_1$ and $n_1$ in terms of $m_0$ and $n_0$, we get

\[
\begin{aligned}
\gcd(m_0, n_0) &= x_1 n_0 + y_1 \left(m_0 - \left\lfloor \tfrac{m_0}{n_0} \right\rfloor n_0\right) \\
&= y_1 m_0 + \left(x_1 - \left\lfloor \tfrac{m_0}{n_0} \right\rfloor y_1\right) n_0.
\end{aligned}
\]

Thus, $x_0 = y_1$ and $y_0 = x_1 - \lfloor \frac{m_0}{n_0} \rfloor y_1$. In general, the following result is obtained:

\[
x_{i-1} = y_i, \qquad y_{i-1} = x_i - \left\lfloor \tfrac{m_{i-1}}{n_{i-1}} \right\rfloor y_i.
\]

In the last step we have $x_n = 1$ and $y_n = 0$. We can find $(x_{i-1}, y_{i-1})$ from $(x_i, y_i)$. Then, find $(x_{i-2}, y_{i-2})$ from $(x_{i-1}, y_{i-1})$, and so on, until we obtain the desired values of $x_0$ and $y_0$. We build these steps into the algorithm shown in Figure 6.4. We can also build these steps into a forward version as shown in Figure 6.5. Given any nonnegative integers $m$ and $n$, the function will be called by egcd(m, 1, 0; n, 0, 1). □

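The algorithm in Figure 6.5 is easier for us to work through with paper and pencil. Consider 242 and 165. To find $x$ and $y$ such that $\gcd(242, 165) = 242x + 165y$, we have

    m/n    x0/x1   y0/y1   ⌊m/n⌋
    242      1       0
    165      0       1       1
     77      1      −1       2
     11     −2       3       7
      0

Each row below the first two is obtained from the two rows above it, using the quotient $\lfloor m/n \rfloor$ in the last column. The result shows that $x = -2$ and $y = 3$.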
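Both variants of the extended algorithm can be sketched in Python and checked against this example. The recursive function follows the shape of Figure 6.4. The forward variant is written as a loop in the spirit of Figure 6.5 and maintains the invariant that the current $m$ equals $x_0 M + y_0 N$ and the current $n$ equals $x_1 M + y_1 N$ for the original inputs $M, N$. The name `egcd_forward` is an assumption of this sketch, and both functions assume non-negative inputs.

```python
def egcd(m, n):
    """Recursive extended Euclid: return (x, y) with x*m + y*n == gcd(m, n)."""
    if n == 0:
        return (1, 0)
    x, y = egcd(n, m % n)          # m % n equals m - floor(m/n)*n for non-negative inputs
    return (y, x - (m // n) * y)

def egcd_forward(m, n):
    """Forward (iterative) variant, started from coefficient pairs (1, 0) and (0, 1)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while n != 0:
        q = m // n
        m, n = n, m - q * n
        # New coefficients express the new remainder in terms of the original inputs.
        x0, y0, x1, y1 = x1, y1, x0 - q * x1, y0 - q * y1
    return (x0, y0)

print(egcd(242, 165))          # (-2, 3)
print(egcd_forward(242, 165))  # (-2, 3)
```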

6.4 Congruence

Congruence is a term used in number theory to express statements about divisibility. As we have seen earlier, a division gives two numbers, a quotient and a remainder. In number theory, we are interested in remainders. What is left when an integer is divided by a given divisor turns out to be an important property in many applications. Here is the easiest example: odd numbers and even numbers, where we fix the divisor to be 2. Numbers are separated into two categories. In each category, all numbers share the property that they leave the same remainder when divided by 2. Clearly, if we change the divisor to a bigger number, we can separate numbers into more categories according to the remainders left by the division. For convenience, we define the following notation.

Definition 6.9: Let $a$ and $m$ be two integers with $m \ne 0$. The remainder of $a$ divided by $m$ is denoted by $(a \bmod m)$.

Definition 6.10: Let $a, b, m \in \mathbb{Z}$ with $m \ne 0$. If $(a \bmod m) = (b \bmod m)$, we say that $m$ is a modulus of $a$ and $b$.

Note that, although the division algorithm is not limited to positive divisors, we generally confine our attention to positive moduli (plural of modulus). Thus, when $m$ serves as a modulus, we simply let $m \in \mathbb{N}$ for the time being. Recall that $0 \notin \mathbb{N}$ under our conventions. Thus, for any $m \in \mathbb{N}$, $(a \bmod m)$ is well-defined.

Definition 6.11: Let $a, b \in \mathbb{Z}$ and $m \in \mathbb{N}$. We say that $a$ is congruent to $b$ modulo $m$ iff $(a \bmod m) = (b \bmod m)$. We denote this by

\[ a \equiv b \pmod{m}. \]

Alternatively, we also use $a \equiv_m b$ as a standard notation for $a \equiv b \pmod{m}$. Since $\equiv_m$ is symmetric (i.e., if $a \equiv_m b$, then $b \equiv_m a$), we can simply say that $a$ and $b$ are congruent modulo $m$. It is also easy to verify that $\equiv_m$ is reflexive and transitive. Therefore, $\equiv_m$ is an equivalence relation over the integers. Consequently, $\equiv_m$ induces equivalence classes over $\mathbb{Z}$.
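For a positive modulus, Python's `%` operator returns the nonnegative remainder of the division algorithm, so the definition of congruence translates directly into code. A minimal sketch follows; `congruent` is a hypothetical helper name.

```python
def congruent(a, b, m):
    """a ≡ b (mod m) iff a and b leave the same remainder when divided by m > 0."""
    return a % m == b % m

print(-10 % 3)                # 2, the nonnegative remainder of -10 divided by 3
print(congruent(17, 5, 12))   # True
print(congruent(-1, 11, 12))  # True
print(congruent(8, 20, 7))    # False
```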

Definition 6.12: Let $a \in \mathbb{Z}$ and $m \in \mathbb{N}$. $[a]_m \subseteq \mathbb{Z}$ is defined by

\[ [a]_m = \{x : x \in \mathbb{Z} \text{ and } x \equiv a \pmod{m}\}. \]

$[a]_m$ is called the congruence class (or residue class) of $a$ modulo $m$.

In other words, $[a]_m$ is the set of integers that leave the same remainder as $a$ when divided by $m$. Clearly, for any $m \in \mathbb{N}$ as a divisor, there are $m$ possible remainders, namely $0, 1, \ldots, m - 1$. We generalize this fact into the following theorem.

Theorem 6.14 For any $m \in \mathbb{N}$, there are exactly $m$ distinct residue classes modulo $m$.

Proof: The theorem follows directly from the division algorithm: the remainder $r$ satisfies $0 \le r < m$. Thus,

\[ [0]_m, [1]_m, \ldots, [m-1]_m \tag{6.5} \]

are $m$ distinct residue classes. For any $x \in \mathbb{Z}$, by the division algorithm, the remainder is unique, and hence $[x]_m$ must be one of the residue classes listed in (6.5). □

Definition 6.13: Let $m \in \mathbb{N}$. We say that $\{a_0, a_1, \ldots, a_{m-1}\}$ is a complete system of residues modulo $m$ if

\[ [a_0]_m \cup [a_1]_m \cup \cdots \cup [a_{m-1}]_m = \mathbb{Z}. \]

Recall the definition of partition from Chapter 4. If $\{a_0, a_1, \ldots, a_{m-1}\}$ is a complete system of residues mod $m$, one can verify that

\[ \{[a_0]_m, [a_1]_m, \ldots, [a_{m-1}]_m\} \]

forms a partition of $\mathbb{Z}$.
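The partition property can be illustrated on a finite window of integers. The sketch below assumes $m = 5$ and an arbitrarily chosen complete system of residues; the window $-20, \ldots, 20$ stands in for $\mathbb{Z}$, which cannot be enumerated, and `residue_class` is a hypothetical helper.

```python
def residue_class(a, m, window):
    """Elements of the residue class of a modulo m that fall inside a finite window."""
    return {x for x in window if x % m == a % m}

m = 5
window = range(-20, 21)
representatives = [10, -4, 7, 3, 99]   # remainders 0, 1, 2, 3, 4 respectively

classes = [residue_class(a, m, window) for a in representatives]

covers = set().union(*classes) == set(window)                # union is the whole window
disjoint = sum(len(c) for c in classes) == len(set(window))  # no element counted twice
print(covers, disjoint)  # True True
```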

Theorem 6.15 Let a, b ∈ Z and m ∈ N. We have,

a ≡ b (mod m) ⇐⇒ m|(a− b).

Proof: By definition, if $a \equiv b \pmod{m}$, then $a = xm + r$ and $b = ym + r$ for some integers $x$, $y$, and $r$ with $0 \le r < m$. Since $a - b = (x - y)m$, it follows that $m \mid (a - b)$. For the other direction, suppose $(a - b) = km$ for some integer $k$. By the division algorithm, there are unique integers $q$ and $r$ with $0 \le r < m$ such that $b = qm + r$. Therefore, $a = km + b = (k + q)m + r$. By definition, $a \equiv b \pmod{m}$. □

Theorem 6.16 Let a, b ∈ Z and m ∈ N. We have,

a ≡ b (mod m) ⇐⇒ ∀k ∈ Z[a ≡ b + km (mod m)].

Proof: It is clear that if a − b is a multiple of m, then for any integer k, a − (b + km) is also a multiple of m. Conversely, taking k = 0 recovers a ≡ b (mod m). We directly use Theorem 6.15 to obtain this theorem. □

Theorem 6.17 Let a, b ∈ Z and m ∈ N. Suppose a ≡ b (mod m) and d ∈ N is a common divisor of a, b, and m. We have

a/d ≡ b/d (mod m/d).

Proof: By the division algorithm and the assumption, there are p, q, r ∈ Z with 0 ≤ r < m such that a = pm + r and b = qm + r. Again, by the division algorithm, r = sd + t for some integers s and t with 0 ≤ t < d. Dividing both equations by d gives

a/d = p × (m/d) + r/d;
b/d = q × (m/d) + r/d.

Since d|a and d|m, a/d and m/d are integers. It follows that r/d must be an integer (that is, t = 0). Also,

0 < d and 0 ≤ r < m =⇒ 0 ≤ r/d < m/d.

Thus, r/d is the remainder of a/d divided by m/d. Likewise, r/d is also the remainder of b/d divided by m/d. Therefore, a/d ≡ b/d (mod m/d). □

Note that, in general, if d ∈ N is a common divisor of a and b only, a ≡ b (mod m) does not imply a/d ≡ b/d (mod m). For example, 8 ≡ 20 (mod 6). Taking d = 4, we obtain 2 ≢ 5 (mod 6). In other words, ac ≡ bc (mod m) does not imply a ≡ b (mod m). Nevertheless, in some special cases, the implication does hold (see Theorems 6.20 and 6.21).

Unlike division, the other three arithmetic operations (+, −, and ×) do preserve the equivalence relation ≡m in the following sense.

Theorem 6.18 Suppose a ≡ b (mod m) and x ≡ y (mod m). We have:

1. (a + x) ≡ (b + y) (mod m).

2. (a− x) ≡ (b− y) (mod m).

3. (a× x) ≡ (b× y) (mod m).



Proof: Since the proofs for properties 1 and 2 are straightforward, we skip them. For property 3, consider the division algorithm and let p1, p2, q1, q2, r, and r′ be the integers obtained by the algorithm such that a = p1m + r, b = p2m + r, x = q1m + r′, and y = q2m + r′. Thus,

a × x = p1q1m^2 + (p1r′ + q1r)m + rr′;
b × y = p2q2m^2 + (p2r′ + q2r)m + rr′.

Therefore, a × x ≡ rr′ (mod m) and b × y ≡ rr′ (mod m). By transitivity, we have (a × x) ≡ (b × y) (mod m). □

Proof: Here we provide another proof of property 3. By assumption, a − b and x − y are both multiples of m. Consequently, so is (a − b)(x − y). Consider

(a − b)(x − y) = ax − ay − bx + by
= ax − by − ay − bx + 2by
= ax − by − y(a − b) − b(x − y).

Thus, ax − by = (a − b)(x − y) + y(a − b) + b(x − y) must be a multiple of m, and hence ax ≡ by (mod m). □

Theorem 6.19 Suppose a ≡ b (mod m). Then, for any integer n ≥ 0, we have a^n ≡ b^n (mod m).

Proof: The case n = 0 is clear, since a^n = b^n = 1. For n > 0, use the assumption that a ≡ b (mod m) and repeatedly apply the multiplication rule in Theorem 6.18 to get the result. (A formal proof requires mathematical induction. Try it!) □

Theorem 6.20 Let a, b, c ∈ Z and m ∈ N. Suppose gcd(c,m) = 1. Then,

ac ≡ bc (mod m) ⇐⇒ a ≡ b (mod m).

Proof: Suppose ac ≡ bc (mod m), with c ≠ 1 and m ≠ 1 (the remaining cases are trivial). By Theorem 6.15, ac − bc = km for some integer k. In other words, c(a − b)/m is an integer (i.e., k). But gcd(c, m) = 1, so m shares no common factor with c other than 1. Thus, it must be the case that m|(a − b), and hence a ≡ b (mod m). The other direction is a trivial application of the multiplication rule in Theorem 6.18. (Hint: c ≡ c (mod m).) □

The previous theorem can be generalized as follows.

Theorem 6.21 Let a, b, c ∈ Z and m ∈ N. Suppose gcd(c, m) = d. Then,

ac ≡ bc (mod m) ⇐⇒ a ≡ b (mod m/d).



Proof: We omit the proof, which is similar to the proof of Theorem 6.20. (Compare the proofs of the next two theorems.) □

Theorem 6.22 Suppose gcd(m,n) = 1. We have

[a ≡ b (mod m) and a ≡ b (mod n)] ⇐⇒ [a ≡ b (mod mn)].

Proof: Let gcd(m, n) = 1. For one direction, suppose a ≡ b (mod m) and a ≡ b (mod n). Then, there are some integers p and q such that a − b = pm and a − b = qn. Thus, pm = qn and p = qn/m. Since gcd(m, n) = 1, it follows that q/m must be an integer. Let q = km for some integer k. Therefore, a − b = qn = kmn, and hence a ≡ b (mod mn). For the other direction, consider a − b = kmn = (km) · n = (kn) · m. □

Theorem 6.23

[a ≡ b (mod m) and a ≡ b (mod n)] ⇐⇒ [a ≡ b (mod lcm(m,n))].

Proof: Suppose gcd(m, n) = d. By Theorem 6.4, we have m = dm′ and n = dn′ where gcd(m′, n′) = 1. Also, suppose a ≡ b (mod m) and a ≡ b (mod n). Thus, there are some integers p and q such that a − b = pm = pdm′ and a − b = qn = qdn′. Thus, pm = qn and p = qn′/m′. Since gcd(m′, n′) = 1, it follows that q/m′ is an integer, i.e., q = km′ for some integer k. Therefore, a − b = qn = km′n = kdm′n′, and hence a ≡ b (mod dm′n′). By Theorem 6.8, dm′n′ = lcm(m, n). Therefore, a ≡ b (mod lcm(m, n)). The other direction is straightforward: consider a − b = kdm′n′ = kn′m = km′n. □

Note that the argument in the proof above is unnecessarily involved; our purpose is to let readers become familiar with the definition of congruence and with some of the theorems already proven. Theorem 6.23 is in fact almost self-evident. Here is the argument: if a − b is a common multiple of m and n, then, by Theorem 6.9, a − b is also a multiple of lcm(m, n).

Definition 6.14: Let a ∈ Z and m ∈ N. If b ∈ Z and ab ≡ 1 (mod m), we say that b is a multiplicative inverse of a modulo m.

Theorem 6.24 Let a ∈ Z and m ∈ N. If gcd(a, m) = 1, then there is a multiplicative inverse of a modulo m.

Proof: By Theorem 6.10, there exist integers x and y such that xa + ym = gcd(a, m) = 1. Thus, m|(xa − 1), and hence x is a multiplicative inverse of a. □
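
The proof of Theorem 6.24 is constructive, so it can be turned into a short program. The following Python sketch is only an illustration: the function names extended_gcd and mod_inverse are our own choices, and the iterative form of the extended Euclid's algorithm is one of several possible formulations, not necessarily the one presented earlier in this chapter.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g, for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: old_r == old_x*a + old_y*b and r == x*a + y*b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, m):
    """Return the multiplicative inverse of a modulo m that lies in Z_m."""
    g, x, _ = extended_gcd(a % m, m)  # a % m is non-negative and congruent to a
    if g != 1:
        raise ValueError("a has no inverse modulo m because gcd(a, m) != 1")
    return x % m


print(mod_inverse(3, 7))  # 5, since 3 * 5 = 15 ≡ 1 (mod 7)
```

Reducing the result with x % m picks the representative of the inverse that lies in {0, 1, . . . , m − 1}.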



For convenience, we use a⁻ to denote the multiplicative inverse of a, especially when we restrict the numbers of interest to Zm = {0, 1, 2, . . . , m − 1}. It is clear that if a multiplicative inverse exists, we can use the extended Euclid’s algorithm to find it, and, moreover, there is one and only one in Zm. It is also clear that aa⁻ ≡ a⁻a ≡ 1 (mod m).

Theorem 6.25 Let a, b, c ∈ Z and m ∈ N. Suppose ab ≡ c (mod m) and a⁻ exists. Then b ≡ a⁻c (mod m).

Proof: Since ab ≡ c (mod m), we have a⁻ab ≡ a⁻c (mod m). By definition, aa⁻ = km + 1 for some k ∈ Z. Thus, a⁻ab = kbm + b ≡ a⁻c (mod m). By Theorem 6.16, b ≡ a⁻c (mod m). □

6.5 Solving Linear Congruence Equations

Consider the following equation with one variable x:

3x ≡ 5 (mod 7). (6.6)

By trial and error, we may find x = 4 to be a solution to equation (6.6), for 3 × 4 = 12 ≡ 5 (mod 7). Are there any other solutions? How about the equation

3x ≡ 5 (mod 6)? (6.7)

The values 0, 1, 2, 3, 4, and 5 don’t seem to work. But how far should we try before we announce that there is no solution to equation (6.7)? Is there a systematic way to solve equations of this kind? With the background we have learned from previous sections, we are now in a good position to answer these questions by presenting a method for solving equations of this kind.

Definition 6.15: Let a, b ∈ Z and m ∈ N. The following equation:

ax ≡ b (mod m), (6.8)

is called a linear congruence equation with variable x ranging over Z.

Solving a linear congruence equation means finding the values of x such that equation (6.8) holds. In other words, we want to find the integer solutions x such that ax − b is a multiple of m. First, we examine under what condition solutions exist.

According to Theorem 6.15, we can rewrite (6.8) and ask for the solutions x of the following equation:

ax − b = km, for some k ∈ Z; or
ax + mk = b, for some k ∈ Z. (6.9)

Consider equation (6.9). By Theorem 6.6, if gcd(a, m) = d, then ax + mk must be a multiple of d. Moreover, by Theorem 6.10, there exist integers x′ and k′ such that x′a + k′m = d. Putting these together, if d divides b, then there exists an integer x satisfying the equality in (6.9). On the other hand, if gcd(a, m) does not divide b, no such x exists. Consider the following congruence equation:

2x ≡ 3 (mod 4). (6.10)

We claim that there is no solution for this equation. Why? An easy explanation is that, for any integer x, 2x − 3 is an odd number, and it is impossible for an odd number to be a multiple of the even number 4. Thus, there is no solution for (6.10). How about the equation in (6.7)? We generalize our discussion in the following theorem.

Theorem 6.26 For any a, b ∈ Z and m ∈ N, the linear congruence equation ax ≡ b (mod m) has a solution iff gcd(a, m)|b.

Proof: Suppose x0 ∈ Z is a solution of the equation. Then, there is an integer k such that

x0a + km = b.

By Theorem 6.6, gcd(a,m) divides b.

Conversely, suppose gcd(a, m) = d and d|b. Let a = da′, m = dm′, and b = db′. We argue that

a′x ≡ b′ (mod m′) (6.11)

has a solution. By Theorem 6.4, gcd(a′, m′) = 1. By Theorem 6.10, there are integers x′ and y′ such that

x′a′ + y′m′ = gcd(a′,m′) = 1.

Multiply b′ on both sides to get

b′x′a′ + b′y′m′ = b′. (6.12)

Clearly, since b′y′ is an integer, b′x′a′ − b′ is a multiple of m′. Thus, b′x′a′ ≡ b′ (mod m′), and hence b′x′ is a solution to equation (6.11). Multiplying both sides of (6.12) by d, we have

b′x′da′ + b′y′dm′ = db′.

Thus, b′x′a + b′y′m = b, which means b′x′a ≡ b (mod m), and b′x′ is also a solution to ax ≡ b (mod m). □

Note that, by contrapositive, we have a useful statement: if gcd(a, m) ∤ b, then ax ≡ b (mod m) has no solution. Also, the proof above already makes clear how to actually find a solution, if any, to a given linear congruence equation. We summarize the procedure in Figure 6.6.



Given ax ≡ b (mod m).

Step 1: Use the extended Euclid’s algorithm to find integers x′ and y′ such that x′a + y′m = d = gcd(a, m).

Step 2: If d does not divide b, then stop (no solution exists).

Step 3: x = x′b/d.

Output x as a solution.

Figure 6.6: Solving Linear Congruence Equation

Note that, if your extended Euclid’s algorithm requires a to be positive and the given a < 0, then you have to change the sign of a when you call the extended Euclid’s algorithm and take x = −x′b/d in step 3.
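
The procedure in Figure 6.6 can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the names extended_gcd and solve_linear_congruence are hypothetical, and instead of flipping the sign of a negative a, the sketch first replaces a by (a mod m), which is non-negative and congruent to a, so the same steps apply.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g, for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_congruence(a, b, m):
    """Return one solution x in Z_m of a*x ≡ b (mod m), or None if none exists."""
    # Step 1: find x1 with x1*a + y1*m = d = gcd(a, m).
    d, x1, _ = extended_gcd(a % m, m)
    # Step 2: no solution unless d divides b.
    if b % d != 0:
        return None
    # Step 3: x = x1 * b / d, reduced into Z_m for convenience.
    return (x1 * (b // d)) % m


print(solve_linear_congruence(3, 5, 7))  # 4, the solution of (6.6)
print(solve_linear_congruence(3, 5, 6))  # None, since gcd(3, 6) = 3 does not divide 5
```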

It seems that the solution, if any, given by the algorithm above is not the only solution. Our next question: what are the others?

Theorem 6.27 Given a linear congruence equation ax ≡ b (mod m), if x0 is a solution to the equation, then [x0]_{m′} is the set of all solutions, where m′ = m/gcd(a, m).

Proof: Let x0 be a solution and i be some integer such that x0 + i is another solution. Thus, m|(ax0 − b) and m|(a(x0 + i) − b). Consider

(a(x0 + i) − b)/m = (ax0 − b)/m + ai/m.

Clearly, i must be an integer that makes ai/m an integer. Let gcd(a, m) = d, a = da′, and m = dm′. We have

ai/m = da′i/(dm′) = a′i/m′.

Since gcd(a′, m′) = 1, i must be a multiple of m′. Conversely, every multiple i of m′ makes ai/m an integer. Therefore, every element in the following set is a solution.

S = {x0 + km′ : k ∈ Z and m′ = m/gcd(a, m)}.

Clearly, S = [x0]_{m′} where m′ = m/gcd(a, m).

What remains is to argue that S contains all solutions. Let a = da′ and m = dm′. Since x0 is a solution, ax0 − b = k0m for some integer k0. Fix any solution x to the equation. Since x is a solution, we also have ax − b = km for some integer k. Consider

x − x0 = (b + km)/a − (b + k0m)/a = (k − k0)m/a = (k − k0)m′/a′ = ((k − k0)/a′) × m′.

Since (x − x0) ∈ Z and gcd(a′, m′) = 1, it follows that (k − k0)/a′ ∈ Z. Therefore, x − x0 is a multiple of m′, and hence x ∈ [x0]_{m′}. □
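
Theorem 6.27 tells us how to list every solution that lies in Zm: there are exactly gcd(a, m) of them, spaced m′ apart. The sketch below is an illustration only; the function name all_solutions is hypothetical, and it relies on Python's built-in pow(x, -1, n) for the modular inverse, which assumes Python 3.8 or later.

```python
from math import gcd


def all_solutions(a, b, m):
    """List, in increasing order, all x in Z_m with a*x ≡ b (mod m)."""
    d = gcd(a, m)
    if b % d != 0:
        return []  # Theorem 6.26: no solution unless gcd(a, m) divides b
    m1 = m // d                # m' in the text
    a1 = (a // d) % m1         # a' reduced modulo m'
    b1 = b // d                # b'
    # Solve a' x ≡ b' (mod m'); the inverse exists because gcd(a', m') = 1.
    x0 = 0 if m1 == 1 else (pow(a1, -1, m1) * b1) % m1
    # Theorem 6.27: the solutions are x0, x0 + m', x0 + 2m', ...
    return [x0 + k * m1 for k in range(d)]


print(all_solutions(9, 9, 12))  # [1, 5, 9]
print(all_solutions(3, 5, 7))   # [4]
```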

Theorem 6.28 Let m, n ∈ N with gcd(m, n) = 1. Then, x0 is a solution to ax ≡ b (mod mn) iff x0 is a solution to the following system:

ax ≡ b (mod m),
ax ≡ b (mod n).

Proof: This theorem follows directly from Theorem 6.23, since lcm(m, n) = mn when gcd(m, n) = 1. □

Recall the definition of multiplicative inverse from Definition 6.14, and Theorem 6.25. If ax ≡ b (mod m) and a⁻ exists, then we have

a⁻ax ≡ x ≡ a⁻b (mod m).

Note that the existence of a⁻ is sufficient, but not necessary, for the existence of solutions to the linear congruence equation.

6.6 Solving Linear Congruence Equations with multiple variables

The linear congruence equation with one variable shown in equation (6.8) is the easiest form. Nevertheless, we have spent a great deal of time learning how to solve it, because it is the basic form to which more complicated congruence equations can be reduced. More precisely, solving a complicated congruence equation may be reduced to solving a series of equations in the form of (6.8). In many cases we also need the techniques introduced in the next section to solve congruence equations. Solving more general linear congruence equations is a difficult subject and is well beyond the scope of this book. Here we just briefly examine another form of linear congruence equations, in which we have more than one variable.

Definition 6.16: Let a1, a2, . . . , an, b ∈ Z and n, m ∈ N, and let x1, x2, . . . , xn be variables ranging over Z. The equation

a1x1 + a2x2 + · · ·+ anxn ≡ b (mod m), (6.13)

is called a linear congruence equation with n variables.


Given a congruence in (6.13), let gcd(a_1, a_2, . . . , a_n, m) = d. It is easy to see that, if d ∤ b, then there is no solution to (6.13). Moreover, since we can cancel d from both sides and the modulus, we can assume d = 1 without loss of generality. Let this be the case and assume that

gcd(a_1, a_2, . . . , a_{n−1}, m) = d′.

We know that gcd(a_n, d′) = 1, because otherwise d ≠ 1 (here we assume d′ ≠ 1). Therefore, there is a solution to

a_n x_n ≡ b (mod d′).

Using the method introduced in the previous section, we obtain the solution set [u]_{d′}. For convenience, we restrict our solutions to Zm and find a u such that 0 ≤ u < d′. Let m′ = m/d′. Thus, every value in

u, u + d′, u + 2d′, . . . , u + (m′ − 1)d′ (6.14)

is a possible value of x_n in the solutions to (6.13). Then, we substitute each value in (6.14) for x_n to remove the variable x_n and repeat the process above until no variables are left. Consider the following congruence as an example, where we want to find (x, y) ∈ Z12 × Z12 for the equation.

9x + 4y ≡ 5 (mod 12). (6.15)

Since gcd(9, 4, 12) = 1, which divides 5, we can proceed. To remove the variable y, consider gcd(9, 12) = 3 and solve

4y ≡ 5 (mod 3),

where the modulus 3 is the value of gcd(9, 12). Equivalently, we solve

y ≡ 2 (mod 3)

to get y = 2, 5, 8, or 11. Then, we substitute each value for y in (6.15) and solve the resulting congruence. Here we just consider one case: y = 2. The other cases are similar. If y = 2, equation (6.15) yields

9x + 8 ≡ 5 (mod 12).

We can further simplify as follows:

9x + 8 + 4 ≡ 5 + 4 (mod 12)
9x + 12 ≡ 9 (mod 12)
9x ≡ 9 (mod 12)
3x ≡ 3 (mod 4).

Thus, x = 1, 5, or 9. Therefore, the solutions in this case are (1, 2), (5, 2), and (9, 2).
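
Because the example restricts (x, y) to Z12 × Z12, the result is easy to cross-check by exhaustive search. The following Python sketch is a brute-force check, not the elimination method described above; the function name solve_two_variables is hypothetical.

```python
def solve_two_variables(a1, a2, b, m):
    """Return all (x, y) in Z_m x Z_m with a1*x + a2*y ≡ b (mod m), by exhaustive search."""
    return [(x, y)
            for y in range(m)
            for x in range(m)
            if (a1 * x + a2 * y - b) % m == 0]


solutions = solve_two_variables(9, 4, 5, 12)

# The values of y that occur agree with y ≡ 2 (mod 3).
print(sorted({y for _, y in solutions}))        # [2, 5, 8, 11]

# The solutions with y = 2 agree with the case worked out above.
print([(x, y) for x, y in solutions if y == 2])  # [(1, 2), (5, 2), (9, 2)]
```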



6.7 Applications

6.7.1 Chinese Remainder Theorem

Let’s consider the following game. A person thinks of an integer less than 60 and divides the number by 3, 4, and 5, respectively. Then, the person tells you the three remainders. Based on the three remainders, you are asked to figure out what the number is. Does the number always exist? Are the three remainders sufficient to locate the number? About 400 A.D., a Chinese mathematician named Sun Tsu gave a beautiful theorem, now known as the Chinese Remainder Theorem, to answer these questions.

The game we just mentioned is a simplified form of the theorem. Here we translate the game into the notation we have learned. Suppose that x is the chosen integer. We are given the following simultaneous congruence equations:

x ≡ a (mod 3),
x ≡ b (mod 4),
x ≡ c (mod 5),

where a, b, and c are the remainders of x divided by 3, 4, and 5, respectively. Our goal is to find a solution for x with 0 ≤ x < 60.

The idea for solving the above simultaneous congruences is straightforward. We start with the first congruence equation and solve it by using the technique introduced in Section 6.5. In this simple form, the solution always exists, which is 3k + a, for any k ∈ Z. Then, put the solution into the second congruence equation to obtain a new equation with variable k, and solve it. Repeat the process until all equations are solved or an unsolvable equation is encountered. Consider the following example:

x ≡ 2 (mod 3), (6.16)
x ≡ 2 (mod 4), (6.17)
x ≡ 3 (mod 5). (6.18)

First, we solve (6.16) to obtain the solution set {3k + 2 : k ∈ Z}. But not every value in the solution set is a solution to (6.17), i.e., not every k for 3k + 2 gives a solution to (6.17). Thus, we put 3k + 2 as x into (6.17) to have

3k + 2 ≡ 2 (mod 4),

or, equivalently,

3k ≡ 0 (mod 4). (6.19)

The solution of (6.19) is k = 4u, u ∈ Z. Thus, the solution candidates we have so far are

x = 3k + 2 = 3× 4u + 2 = 12u + 2, u ∈ Z.



Putting this x into the last equation (6.18), we have

12u + 2 ≡ 3 (mod 5),

or equivalently,

12u ≡ 1 (mod 5). (6.20)

The solution to (6.20) is u = 3 + 5v, v ∈ Z. Therefore, the solution to the original simultaneous congruences is

x = 12 · (3 + 5v) + 2 = 60v + 38, v ∈ Z.

Thus, x = 38 is one of the solutions, where v = 0. The set of all solutions of the system of congruence equations is [38]_60. Clearly, if we restrict x ∈ Z60, then 38 is the only solution.
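
The substitution process used in this example can be sketched in Python as follows. This is an illustration under our own assumptions: the function name solve_by_substitution is hypothetical, the current solution set is kept in the form x = x0 + Mk, and the modular inverse is computed with pow(x, -1, n), which assumes Python 3.8 or later.

```python
from math import gcd


def solve_by_substitution(congruences):
    """Solve x ≡ a_i (mod m_i) for a list of pairs (a_i, m_i).

    Return (x0, M), meaning the solution set is {x0 + M*k : k in Z},
    or None if some intermediate congruence has no solution.
    """
    x0, M = 0, 1  # so far every integer x = x0 + M*k is a candidate
    for a, m in congruences:
        # Substitute x = x0 + M*k and solve M*k ≡ a - x0 (mod m) for k.
        d = gcd(M, m)
        rhs = a - x0
        if rhs % d != 0:
            return None  # unsolvable equation encountered
        m1 = m // d
        k0 = 0 if m1 == 1 else (pow((M // d) % m1, -1, m1) * (rhs // d)) % m1
        # k = k0 + m1*u, hence x = (x0 + M*k0) + (M*m1)*u.
        x0 += M * k0
        M *= m1
        x0 %= M
    return x0, M


print(solve_by_substitution([(2, 3), (2, 4), (3, 5)]))  # (38, 60)
```

In the last step of this run, k0 = 3 and m1 = 5, which matches u = 3 + 5v in the text.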

Let us review the above solution. To be able to solve the congruences (6.16), (6.19), and (6.20), we need

gcd(1, 3)|2, gcd(3, 4)|0, and gcd(12, 5)|1. (6.21)

Clearly, if the divisors 3, 4, and 5 are pairwise relatively prime, then the gcd’s in (6.21) are all 1 and the divisibility conditions in (6.21) hold, and hence there exists a solution to the simultaneous congruences. If this is not the case, then the equations (6.16), (6.19), and (6.20) are not guaranteed to all be solvable. Numerous nontrivial variations of the Chinese Remainder Theorem have been discovered by other mathematicians, including Fibonacci (1202) and Euler (1734). The formulation we state in the following is probably the easiest one. Nevertheless, the method can be generalized to solve a more general form of simultaneous congruence system as shown in (6.25).

Theorem 6.29 (Chinese Remainder Theorem) Let the congruence system be given as follows.

x ≡ a_1 (mod m_1),
x ≡ a_2 (mod m_2),
. . .
x ≡ a_n (mod m_n). (6.22)

If m_1, . . . , m_n are pairwise relatively prime, then the system has a solution:

x = ∑_{1≤i≤n} a_i x_i (M/m_i), (6.23)

where M = m_1 · · · m_n, and x_i is a solution of

(M/m_i) x ≡ 1 (mod m_i). (6.24)



Proof: Suppose that m_1, . . . , m_n are pairwise relatively prime. It is clear that for each i, M/m_i and m_i are relatively prime, i.e., gcd(M/m_i, m_i) = 1, and hence, for every i ∈ {1, 2, . . . , n}, the congruence in (6.24) has a solution.

Next, we prove that (6.23) is indeed a solution to every congruence equation in (6.22). In other words, we want to prove that, for every j = 1, . . . , n,

(∑_{1≤i≤n} a_i x_i (M/m_i)) − a_j

is a multiple of m_j. It is clear that, if i ≠ j, then a_i x_i (M/m_i) is a multiple of m_j. Thus, what remains to prove is that

a_j x_j (M/m_j) − a_j = a_j (x_j (M/m_j) − 1)

is a multiple of m_j. Since x_j is a solution of (M/m_j) x ≡ 1 (mod m_j), there exists an integer k such that

x_j (M/m_j) − 1 = k m_j.

Therefore, a_j (x_j (M/m_j) − 1) = a_j k m_j is also a multiple of m_j. □
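
Formula (6.23) translates almost literally into code. The Python sketch below is an illustration only: the function name crt is our own, and it assumes Python 3.8 or later for math.prod and for pow(x, -1, n), which raises ValueError when the inverse required in (6.24) does not exist.

```python
from math import prod


def crt(residues, moduli):
    """Return the solution in Z_M of x ≡ a_i (mod m_i), following formula (6.23).

    The moduli are assumed to be pairwise relatively prime.
    """
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i              # M / m_i
        x_i = pow(M_i, -1, m_i)     # a solution of (M / m_i) x ≡ 1 (mod m_i), see (6.24)
        x += a_i * x_i * M_i
    return x % M


print(crt([2, 2, 3], [3, 4, 5]))  # 38
```

For the system (6.16) to (6.18), the three terms are 2 · 2 · 20, 2 · 3 · 15, and 3 · 3 · 12, whose sum 278 reduces to 38 modulo 60, in agreement with the substitution method.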

Here we consider a slightly more general form of simultaneous linear congruence systems, where the coefficients of the variable can be any integers.

a_1 x ≡ b_1 (mod m_1),
a_2 x ≡ b_2 (mod m_2),
. . .
a_n x ≡ b_n (mod m_n). (6.25)

Clearly, if one of the congruences in system (6.25) has no solution, then the system has no solution either. Thus, we first check that gcd(a_i, m_i)|b_i for every 1 ≤ i ≤ n. If this is the case, we can solve each congruence in (6.25) independently. Let [u_i]_{m′_i} be the set of solutions to a_i x ≡ b_i (mod m_i), where m′_i = m_i/gcd(a_i, m_i). Clearly, the set of solutions to (6.25) is the intersection

[u_1]_{m′_1} ∩ [u_2]_{m′_2} ∩ · · · ∩ [u_n]_{m′_n},

which is simply the set of solutions of the following system:

x ≡ u_1 (mod m′_1),
x ≡ u_2 (mod m′_2),
. . .
x ≡ u_n (mod m′_n).

Then, we use the method in Theorem 6.29 to solve the system above, provided that m′_1, . . . , m′_n are pairwise relatively prime.



6.7.2 Fermat’s Little Theorem and Euler’s Theorem

This is one of Fermat’s best known theorems and was stated by Fermat in 1640. It is known as his “little theorem” to distinguish it from his “great” theorem.¹ The theorem was once thought to be one of the least applicable theorems in mathematics. Just for fun, Fermat wanted to derive a condition to construct “big” perfect numbers.² But with the rise of computer science applications such as coding theory and cryptography, Fermat’s little theorem has become one of the most useful tools from number theory.

Theorem 6.30 (Fermat’s Little Theorem) If p is a prime number and p ∤ n, then

n^(p−1) ≡ 1 (mod p).

Proof: Let i, j ∈ Z with 1 ≤ i, j ≤ (p− 1) and

in ≡ jn (mod p)

That is, (i − j)n = kp for some integer k. Since p is a prime number and p ∤ n, it follows that gcd(p, n) = 1. Therefore, if p|(i − j)n, then it must be the case that p|(i − j). But since 1 ≤ i, j ≤ (p − 1), the only possible case is (i − j) = 0, and hence i = j. Therefore,

[in ≡ jn (mod p)] =⇒ [i = j]. (6.26)

The contrapositive of (6.26) is:

[i ≠ j] =⇒ [in ≢ jn (mod p)]. (6.27)

In other words, there are exactly (p − 1) distinct residue classes modulo p in the following list:

[1n]_p, [2n]_p, . . . , [(p − 1)n]_p. (6.28)

It is clear that 1n, 2n, . . . , and (p − 1)n are not multiples of p. Thus, none of the residue classes in (6.28) is equal to [0]_p. By Theorem 6.14, there are exactly p distinct residue classes mod p. After removing [0]_p, they are

[1]_p, [2]_p, . . . , [p − 1]_p. (6.29)

Therefore, the residue classes in (6.29) and (6.28) are exactly the same, except perhaps in a different order. By the multiplicative property in Theorem 6.18, we have

1 · 2 · · · (p − 1) ≡ n · 2n · · · (p − 1)n (mod p)
≡ (1 · 2 · · · (p − 1)) · n^(p−1) (mod p).

Since p is a prime, it follows that gcd(1 · 2 · · · (p − 1), p) = 1. By Theorem 6.20, we can cancel 1 · 2 · · · (p − 1) from both sides and obtain the result:

1 ≡ n^(p−1) (mod p). □

¹ The “great” theorem, best known as Fermat’s “last theorem”, was stated by Fermat himself, who failed to provide a proof. More than 300 years after it was first stated, the theorem was finally proved in 1994.

² A perfect number is a natural number that equals the sum of its proper factors. For example, 6 and 28 are perfect numbers since 6 = 1 + 2 + 3 and 28 = 1 + 2 + 4 + 7 + 14, while 8 is not since 8 ≠ 1 + 2 + 4. Perfect numbers turn out to be very rare. There are only 4 perfect numbers less than 10000!
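
The key step in the proof of Fermat's little theorem, namely that multiplying 1, 2, . . . , p − 1 by n merely permutes the nonzero residues modulo p, can be checked numerically. The Python sketch below is only an illustration for one small case; the function name multiples_mod_p is hypothetical.

```python
def multiples_mod_p(n, p):
    """Return the sorted residues of 1n, 2n, ..., (p-1)n modulo p."""
    return sorted((i * n) % p for i in range(1, p))


p, n = 7, 3
# The residues of 3, 6, 9, 12, 15, 18 modulo 7 are a rearrangement of 1, ..., 6.
print(multiples_mod_p(n, p))  # [1, 2, 3, 4, 5, 6]
# Consequently n^(p-1) ≡ 1 (mod p); the three-argument pow computes the power modulo p.
print(pow(n, p - 1, p))       # 1
```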

Euler’s Phi Function and Theorem: Fermat’s little theorem was generalized by Euler. Euler’s theorem gives us an easy way to find the residue class of a composed number. In other words, if m is a huge composed number, we can use Euler’s theorem to find the remainder of m when divided by d quickly, without actually performing the division algorithm.

Definition 6.17: Define ϕ : N → N, where ϕ(m) is the total number of residue classes mod m whose elements are relatively prime to m. This function is called Euler’s Phi function.

For example, let the modulus m = 3. There are three residue classes, [0]_3, [1]_3, and [2]_3. Numbers in [0]_3 are not relatively prime to 3. Therefore, ϕ(3) = 2. If m = 4, there are 4 residue classes. But numbers in [0]_4 and [2]_4 are not relatively prime to 4. Therefore, ϕ(4) = 2. In view of these observations, we can redefine Euler’s Phi function as follows.

Definition 6.18: Euler’s Phi function, ϕ(m), is the total number of elements in {0, 1, 2, . . . , m − 1} that are relatively prime to m.

Clearly, if p is a prime number, then ϕ(p) = p − 1. Note that 0 is not relatively prime to any m > 1.

Theorem 6.31 Let p be a prime number. Then, for any e ∈ N, we have

ϕ(p^e) = p^e − p^(e−1).

Proof: Since p is a prime number, only multiples of p are not relatively prime to p^e. Among 0, 1, 2, . . . , p^e − 1, they are

0, p, 2p, 3p, . . . , p^e − p.

Since p^e − p = (p^(e−1) − 1)p, it follows that there are p^(e−1) numbers among 0, 1, 2, . . . , p^e − 1 not relatively prime to p^e. Therefore, ϕ(p^e) = p^e − p^(e−1). □



Theorem 6.32 If a, b ∈ N are two integers such that gcd(a, b) = 1, then

ϕ(ab) = ϕ(a)ϕ(b).

Proof: Let A ⊆ Za, B ⊆ Zb, and C ⊆ Zab be the sets of residues relatively prime to a, b, and ab, respectively, given as follows.

A = {a_1, a_2, . . . , a_{ϕ(a)}},
B = {b_1, b_2, . . . , b_{ϕ(b)}},
C = {c_1, c_2, . . . , c_{ϕ(ab)}}.

Moreover, we assume all elements in each set are distinct. In other words, |A| = ϕ(a), and similarly for B and C. Our task is to define a bijection f : C → A × B. If such a function exists, then |C| = |A × B| = |A| × |B|, and hence ϕ(ab) = ϕ(a)ϕ(b).

For each x ∈ C, define f(x) = (r_a, r_b), where x ∈ [r_a]_a and x ∈ [r_b]_b. Since x ∈ C, by assumption, gcd(x, ab) = 1. Also,

[gcd(x, ab) = 1] =⇒ [gcd(x, a) = 1 and gcd(x, b) = 1].

It follows that there must exist some r_a ∈ A and r_b ∈ B that satisfy the definition of f. Moreover, by the division algorithm, such r_a and r_b are uniquely determined, which ensures that f is single valued. Therefore, for every x ∈ C, f(x) is well defined in A × B.

What remains is to argue that f is a bijection. Fix an r_a ∈ A and an r_b ∈ B. Consider the following system:

x ≡ r_a (mod a),
x ≡ r_b (mod b). (6.30)

Clearly, if x satisfies both congruences in (6.30), then f(x) = (r_a, r_b). Since gcd(a, b) = 1, by the Chinese Remainder Theorem, the congruence system (6.30) has a solution, say, x′. Moreover,

(gcd(x′, a) = 1 and gcd(x′, b) = 1) =⇒ gcd(x′, ab) = 1.

In other words, x′ ∈ [c_i]_ab for some c_i ∈ C. Thus, f is surjective. Moreover, if f(x) = f(y) for some x, y ∈ C, then x ≡ y (mod a) and x ≡ y (mod b); by Theorem 6.22, x ≡ y (mod ab), and hence x = y since both lie in Zab. Thus, f is injective. □

With Theorems 6.31 and 6.32, we can easily find the value of ϕ(n). For example, to compute ϕ(210), we factorize 210 = 2 · 3 · 5 · 7 first. Then,

ϕ(210) = ϕ(2 · 3 · 5 · 7)
= ϕ(2) · ϕ(3) · ϕ(5) · ϕ(7)
= 1 · 2 · 4 · 6
= 48.

For another example, consider 2016. Since 2016 = 2^5 · 3^2 · 7, we have

ϕ(2016) = ϕ(2^5 · 3^2 · 7)
= ϕ(2^5) · ϕ(3^2) · ϕ(7)
= (2^5 − 2^4) · (3^2 − 3) · 6
= 576.

That is, there are 576 numbers in Z2016 relatively prime to 2016.
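
The two computations above follow a fixed recipe: factorize, apply Theorem 6.31 to each prime power, and multiply the results by Theorem 6.32. A minimal Python sketch of this recipe is given below; the function names factorize and phi are our own, and the trial-division factorization is chosen only for brevity and is practical for small numbers only.

```python
def factorize(n):
    """Return the prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi(n):
    """Euler's Phi function computed from the prime factorization of n."""
    result = 1
    for p, e in factorize(n).items():
        result *= p**e - p**(e - 1)  # Theorem 6.31 for each prime power
    return result                    # Theorem 6.32 justifies the product


print(phi(210))   # 48
print(phi(2016))  # 576
```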

Theorem 6.33 (Euler’s Theorem) Let a, m ∈ N with gcd(a, m) = 1. Then,

a^ϕ(m) ≡ 1 (mod m).

Proof: The idea is similar to the proof of Fermat’s little theorem. Let ϕ(m) = n and let p_1, p_2, . . . , p_n be the natural numbers less than m and relatively prime to m. Thus, the following classes are the residue classes relatively prime to m.

[p_1]_m, [p_2]_m, . . . , [p_n]_m. (6.31)

Let i ∈ N with 1 ≤ i ≤ n. Since gcd(a, m) = 1 and gcd(p_i, m) = 1, it follows that gcd(ap_i, m) = 1, and therefore [ap_i]_m is a residue class relatively prime to m. Therefore,

[ap_1]_m, [ap_2]_m, . . . , [ap_n]_m (6.32)

are also residue classes relatively prime to m. With the same argument used in the proof of Fermat’s Little Theorem, we conclude that the residue classes in (6.31) and (6.32) are indeed the same. Similarly,

p_1 p_2 · · · p_n ≡ ap_1 · ap_2 · · · ap_n ≡ p_1 p_2 · · · p_n · a^n (mod m).

Since gcd(p_1 p_2 · · · p_n, m) = 1, by Theorem 6.20, we can cancel p_1 p_2 · · · p_n from both sides to have 1 ≡ a^n (mod m), as we want. □

6.7.3 RSA Cryptosystem

The RSA cryptosystem is probably the most important application of number theory of all time. It is a public-key cryptosystem named for its inventors, Rivest, Shamir, and Adleman, who put forward the idea in their celebrated paper in 1977. The RSA algorithm laid the foundation of cryptography for an entire generation of modern cryptographers and remains one of the most widely used encryption methods in today’s communication technology. The three inventors received the 2002 Turing Award, which is considered the Nobel Prize of computer science.



Consider the following situation. Bob wants Alice to send him secret information over an insecure channel. Due to some unavoidable difficulties, Bob and Alice can’t meet in person or communicate through a secure channel, and thus, it is impossible for them to share a secret key without the fear of it being intercepted. How can Alice transfer the secret information to Bob in this situation? A public-key cryptosystem is a solution. Here is the idea:

1. Bob chooses public-keys and secret-keys. He then keeps the secret-keys to himself and announces the public-keys to Alice.

2. Alice gets the public-keys and uses them to encrypt the information she intends to send to Bob. Then, she sends off the encrypted text.

3. Bob receives the encrypted text and uses the secret-keys to decrypt it.

Note that a third person may also have the public-keys, but only Bob has the secret-keys. Thus, only Bob can retrieve the information from the encrypted text. If necessary, Bob can openly teach Alice how to encrypt the information with the public-keys. Without knowledge of the secret-keys, a third person can’t³ feasibly reverse the encryption procedure to obtain the original information, even if the encryption algorithm is given. Now, we present the RSA cryptosystem.

RSA Cryptosystem:

1. Bob does the following:

(a) Choose two different big prime numbers p and q.
(b) Calculate m and n such that m = pq and n = ϕ(pq).
(c) Find a and b such that ab ≡ 1 (mod n).
(d) Tell Alice a and m. (Keep b, p, q, n in a safe.)

2. Alice does the following:

(a) Calculate t = (s^a mod m), where s is the secret information she wants to send.
(b) Send t to Bob.

3. Bob receives t and reads the result of (t^b mod m).

Clearly, if s = (t^b mod m), then Bob does get the secret information from Alice. To see that this is the case, fix an s with 0 < s < min(p, q).⁴ Thus, gcd(s, m) = 1 since p and q are primes and m = pq. Then,

t^b ≡ (s^a)^b (mod m)
≡ s^(ab) (mod m), recall that ab ≡ 1 (mod n)
≡ s^(1+kn) (mod m), where n = ϕ(m), k ∈ Z
≡ s · s^(kϕ(m)) (mod m)
≡ s · 1 (mod m), by Euler’s Theorem
≡ s (mod m).

³ Well, we believe he/she can’t, unless P = NP.

⁴ In fact, 0 < s < min(p, q) is more restrictive than necessary. Any s ∈ Zpq can be correctly encrypted and decrypted. See the discussion below.

Note that we have restricted the message s to be smaller than min(p, q) in order to ensure that gcd(s, m) = 1, so that we can apply Euler’s Theorem. In fact, this restriction can be removed. Let s ∈ Zpq, i.e., 0 ≤ s ≤ pq − 1. It is obvious that s cannot be greater than (pq − 1) due to the modulus m = pq. Suppose s ≠ 0 and gcd(s, m) > 1, and assume without loss of generality that p divides s. Then, s = s′p for some integer s′ with 1 ≤ s′ ≤ q − 1. It follows that gcd(s′, q) = 1 and gcd(s, q) = 1. As above, by Euler’s Theorem applied to the modulus q, we can verify that

s^ϕ(m) ≡ s^(ϕ(p)ϕ(q)) ≡ s^(ϕ(p)(q−1)) ≡ 1 (mod q).

Therefore, there is some k ∈ Z such that

1 = s^ϕ(m) + kq.

Multiplying each side by s,

s = s · s^ϕ(m) + s · kq = s^(1+ϕ(m)) + s′kpq = s^(1+ϕ(m)) + s′km.

Thus,

s ≡ s^(1+ϕ(m)) (mod m).

Based on the discussion above, the only two restrictions for the RSA algorithm to work are p ≠ q and s ∈ Zpq. If s = 0 or s = 1, obviously, the plaintext and ciphertext will be the same and there will be no secret at all.
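
To see the whole scheme at work, here is a toy Python sketch of the three steps. It is an illustration only and in no way a secure implementation: the primes p = 61 and q = 53, the exponent a = 17, and the function names make_keys, encrypt, and decrypt are all our own assumptions, real keys are vastly larger, and the modular inverse uses pow(x, -1, n), which assumes Python 3.8 or later.

```python
from math import gcd


def make_keys(p, q, a):
    """Bob's step: return the public pair (a, m) and the secret pair (b, m)."""
    m = p * q
    n = (p - 1) * (q - 1)  # n = phi(pq), by Theorems 6.31 and 6.32
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to phi(pq)")
    b = pow(a, -1, n)      # ab ≡ 1 (mod n)
    return (a, m), (b, m)


def encrypt(s, public_key):
    """Alice's step: t = s^a mod m."""
    a, m = public_key
    return pow(s, a, m)


def decrypt(t, secret_key):
    """Bob's step: s = t^b mod m."""
    b, m = secret_key
    return pow(t, b, m)


public_key, secret_key = make_keys(61, 53, 17)  # toy primes, far too small in practice
t = encrypt(65, public_key)
print(decrypt(t, secret_key))  # 65
```

With these toy values one can also check that every s in Zpq, including multiples of p or q, is recovered correctly, as argued in the discussion above.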

Clearly, if a third person knows p or q, then the third person can derive b from a as Bob does and break the system. In practice, we choose two very large prime numbers (about 100 decimal digits each was once typical; considerably larger primes are recommended nowadays). This should be sufficient for many kinds of sensitive information such as credit card numbers, bank accounts, license numbers, and so on. For textual information such as classified documents, we can divide the information into small blocks. Finding two prime numbers with 100 or more digits is not trivial but can be easily done with today’s computers. However, factorizing the product of two sufficiently large prime numbers is considered infeasible with today’s number theory and technology. Thus, the security of an RSA cryptosystem depends on the intractability of factorizing m.



6.8 Problems

Problem 1: For what integers a does the following equation hold?

|2a| = |2−a|.

Problem 2: Consider functions g and h defined as follows.

g(x) = x− bxc;h(x) = x− dxe.

For which x ∈ R is |g(x)| = |h(x)|? Prove your answer.

Problem 3: Prove that for all odd integers n, 8 divides n2 − 1.

Problem 4: According to the values of a and b given in the following, find,by the division algorithm, the values of q and r such that,

a = qb + r, where 0 ≤ r < |b|.(i) a = 387, b = 28; (ii) a = 191, b = −14; (iii) a = −78, b = 15;(iv) a = −105, b = −7.

Problem 5: Let n ≥ 0 be an integer. Without using mathematical induc-tion, prove that 5 divides n(n4 − 1).

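Problem 6: Let $m \in \mathbb{N}$, $x \in \mathbb{Z}$. Prove that

\[ \left\lceil \frac{x}{m} \right\rceil = \left\lfloor \frac{x + m - 1}{m} \right\rfloor. \]

Problem 7: Let $m, n \in \mathbb{Z}$ where $m > n > 0$ and

\[ r = m - \left\lfloor \frac{m}{n} \right\rfloor n. \]

Show that $r < \frac{m}{2}$.

Problem 8: Let $t \in \mathbb{Z}$. Prove that

\[ \left\lceil \frac{1}{2} \left\lfloor \frac{t}{2} \right\rfloor \right\rceil = \left\lfloor \frac{t + 2}{4} \right\rfloor. \]

Problem 9: Prove that the result of the above problem holds for any $t \in \mathbb{R}$.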

Problem 10: Egyptian mathematicians in 1800 B.C. represented rational numbers between 0 and 1 as sums of unit fractions $1/a + 1/b + \cdots + 1/k$, where $a, b, \ldots, k$ were distinct positive integers. E.g., they wrote $2/5$ as $1/3 + 1/15$. Prove that it is always possible to do this in a systematic way as described below: If $0 < m/n < 1$, then for $q = \lceil n/m \rceil$,

\[ \frac{m}{n} = \frac{1}{q} + \left\{ \text{representation of } \frac{m}{n} - \frac{1}{q} \right\}. \]

Show that the above procedure terminates after a finite number of steps.



Problem 11: Find the base-8 representation of 100 and the base-9 representation of 1000. Show details of your work.

Problem 12: Let $b \ge 2$ be an integer. Write a recursive algorithm to find, for any $n \in \mathbb{N}$, the base-$b$ expression of $n$.

Problem 13: Let $a$, $b$, $x$, and $y$ be integers such that $\gcd(a, b) = ax + by$. Prove that $x$ and $y$ are relatively prime.

Problem 14: Let $a$, $b$ be relatively prime integers. Suppose $m$ is any integer such that $a \mid m$ and $b \mid m$. Prove that $ab \mid m$.

Problem 15: Find integers $a, b, x, y$ such that $ax + by = 2$, but $\gcd(a, b) \ne 2$. Explain how that can be possible.

Problem 16: Find $\gcd(242, 165)$ and $\gcd(17296, 18416)$.

Problem 17: Define $F_0 = 0$, $F_1 = 1$, and for $n \ge 2$,

\[ F_n = F_{n-1} + F_{n-2}. \]

This sequence is called the Fibonacci sequence. Let $m, n \in \mathbb{N}$ and $m > n \ge 0$. Prove by mathematical induction that if Euclid's algorithm performs $k$ recursive calls to find the gcd of $m$ and $n$, then

\[ m \ge F_{k+2} \quad \text{and} \quad n \ge F_{k+1}. \]

Problem 18: Find $x, y \in \mathbb{Z}$ such that

\[ \gcd(375, 275) = 375x + 275y. \]

Problem 19: Prove that if $x$ and $y$ are odd numbers, then it is impossible to find an integer $a$ such that $x^2 + y^2 = a^2$.

Problem 20: Prove that if $x$ and $y$ are not divisible by 3, then it is impossible to find an integer $a$ such that $x^2 + y^2 = a^2$.

Problem 21: Prove that if $2^n - 1$ is prime, then $n$ is prime. Is the converse true?

Problem 22: Find all $m \ge 1$ such that $27 \equiv 9 \pmod{m}$.

Problem 23: Which of the elements of $A$ are congruent to which other elements mod 3? mod 7? Explain.

\[ A := \{687, 589, 931, 847, 527\}. \]

Problem 24: For each pair $(x, m)$ in $B$, find the least positive integer $r$ such that $x \equiv r \pmod{m}$. Explain.

\[ B := \{(19, 2), (131, 5), (84, 14), (141, 17)\}. \]



Problem 25: Find the solutions, if any, to the following congruences:

1. 5x ≡ 9 (mod 17).

2. 18y ≡ 8 (mod 15).

3. 12z ≡ 15 (mod 42).

Problem 26: Let $m, n \ge 1$ and $a \in \mathbb{Z}$. Prove that if $a^n \equiv 1 \pmod{m}$ then $\gcd(a, m) = 1$.

Problem 27: Do the numbers 19, 8, −3, −5, 10, 5 form a complete residue system modulo 6?

Problem 28: Let $m \in \mathbb{N}$ and $s \in \mathbb{Z}$, and let $C$ be any complete system of residues mod $m$. Prove that

\[ C + s := \{c + s : c \in C\} \]

is also a complete system of residues mod $m$.

Problem 29: Find the least positive residue of $15^{35}$ mod 19. Show your steps and explain your procedure.

Problem 30: Find the least positive residue of $29^{36}$ mod 17.

Problem 31: Let $\gcd(t, m) = 1$, and let $a_0, \ldots, a_{m-1}$ be any complete system of residues mod $m$. Prove that $ta_0, \ldots, ta_{m-1}$ is also a complete system of residues mod $m$.

Problem 32: Without referring to calendars, show that the calendar for December 1976 is the same as that for July 1987.

Problem 33: Let $f(x) = 13x^3 - 5x^2 + 14x - 10$. Compute the least positive residue of $f(12)$ mod 7.

Problem 34: Let $m \ge 1$. Prove that for all $a, x \in \mathbb{Z}$, if $x \in [a]_m$, then $\gcd(x, m) = \gcd(a, m)$.

Problem 35: In the following, calculate the least positive residues.

1. $2^{14} \pmod{17}$

2. $3^{100} \pmod{5}$

Problem 36: Solve the following three congruence systems, respectively.

\[
\begin{cases} x \equiv 3 \pmod{5} \\ x \equiv 2 \pmod{6} \\ x \equiv 3 \pmod{7} \end{cases}
\qquad
\begin{cases} y \equiv 2 \pmod{5} \\ y \equiv 7 \pmod{13} \\ y \equiv 11 \pmod{8} \end{cases}
\qquad
\begin{cases} z \equiv 7 \pmod{16} \\ z \equiv 1 \pmod{9} \\ z \equiv 2 \pmod{25} \end{cases}
\]

Problem 37: Prove that for all integers $m \ge 0$,

\[ (2^m - 1)(2^m - 2)(2^m - 4) \equiv 0 \pmod{7}. \]



Problem 38: Find all $n \ge 0$ for which

\[ 3^n + 4^n \equiv 0 \pmod{7}. \]

Problem 39: Let $m$ and $n$ be positive integers. Prove that the partition of $\mathbb{Z}$ given by congruence mod $m$ is a refinement of that for congruence mod $n$ iff $m$ is a multiple of $n$.

Problem 40: Define a sequence $a_0, a_1, a_2, \ldots$ of integers by the recursion $a_{n+2} = a_{n+1} + a_n$ (for all $n \ge 0$) and the initial conditions $a_0 = 0$ and $a_1 = 1$. Prove (by induction?) that $a_{5k} \equiv 0 \pmod{5}$ for all $k \ge 0$.

Problem 41: Find $\varphi(18)$, $\varphi(40)$, and $\varphi(72)$.

Problem 42: Prove that $2^{20} \equiv 1 \pmod{75}$ by using Euler's theorem and Theorem 6.23.

Problem 43: Prove that for any $m \ge 1$,

\[ \varphi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right). \]

Here $p \mid m$ means that $p$ is a prime dividing $m$. The product is taken over all such primes.

Problem 44: (i) Prove that if $n$ is odd, then $\varphi(2n) = \varphi(n)$.

(ii) Is there an $n \ge 1$ for which $\varphi(3n) = \varphi(n)$?

(iii) Prove that if $n$ is even, then $\varphi(2n) = 2\varphi(n)$.

Problem 45: Find all integers $n \ge 1$ such that $\varphi(n) = 8$.



6.9 Solutions

Solution 1: Since $2^a > 0$ and $2^{-a} > 0$ for every integer $a$, we can drop the absolute value signs. Therefore,

\[ |2^a| = |2^{-a}| \;\Rightarrow\; 2^a = 2^{-a} \;\Rightarrow\; 2^a 2^a = 1 \;\Rightarrow\; 2^a = 1 \;\Rightarrow\; a = 0. \]

□

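Solution 2: For all $x \in \mathbb{R}$ we know that $x \ge \lfloor x \rfloor$ and $x \le \lceil x \rceil$. Therefore, for all $x \in \mathbb{R}$, $g(x) \ge 0$ and $h(x) \le 0$. Let $x = k + s$, where $k \in \mathbb{Z}$, $s \in \mathbb{R}$ and $0 \le s < 1$.

\[ |g(x)| = |h(x)| \;\Rightarrow\; g(x) = -h(x) \;\Rightarrow\; x - \lfloor x \rfloor = \lceil x \rceil - x \;\Rightarrow\; 2x = \lceil x \rceil + \lfloor x \rfloor. \]

Thus, $2k + 2s = \lceil k + s \rceil + \lfloor k + s \rfloor$.

case 1: If $s = 0$, i.e., if $x$ is an integer, then the above equality holds for all $k \in \mathbb{Z}$.

case 2: If $0 < s < 1$, then $\lceil k + s \rceil = k + 1$ and $\lfloor k + s \rfloor = k$. If the above equality holds, then $2k + 2s = 2k + 1$. Therefore, $s = 1/2$.

From the two cases above, the solution set of $|g(x)| = |h(x)|$ is

\[ \left\{ 0, \pm\tfrac{1}{2}, \pm 1, \pm\left(1 + \tfrac{1}{2}\right), \pm 2, \pm\left(2 + \tfrac{1}{2}\right), \pm 3, \ldots \right\}. \]

□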

Solution 3: Let $n = 2k + 1$, $k \in \mathbb{Z}$. Then $n^2 - 1 = (2k + 1)^2 - 1 = 4k(k + 1)$.

Let $k'$ range over $\mathbb{Z}$.

case 1: If $k = 2k'$, then $n^2 - 1 = 8k'(2k' + 1)$.



case 2: If $k = 2k' + 1$, then

\[ n^2 - 1 = 4(2k' + 1)(2k' + 2) = 8(2k' + 1)(k' + 1). \]

In either case, $n^2 - 1$ is divisible by 8. □

Solution 4:

(i) $387 = 28 \times 13 + 23$, so $q = 13$, $r = 23$.

(ii) $191 = (-14) \times (-13) + 9$, so $q = -13$, $r = 9$.

(iii) $-78 = 15 \times (-6) + 12$, so $q = -6$, $r = 12$.

(iv) $-105 = (-7) \times 15 + 0$, so $q = 15$, $r = 0$.

□

Solution 5: The solution is easy to follow if we write $n(n^4 - 1)$ as

\[ (n - 1)\,n\,(n + 1)(n^2 + 1). \]

Any $n \in \mathbb{Z}$ can be written as $n = 5q + r$, where $q \in \mathbb{Z}$ and $0 \le r < 5$. We have 5 cases. In each case one of the factors of $n(n^4 - 1)$ is divisible by 5, as seen below.

case 1: $n = 5q$; $n$ itself is divisible by 5.

case 2: $n = 5q + 1$; $n - 1$ is divisible by 5.

case 3: $n = 5q + 2$;

\[ n^2 + 1 = (5q + 2)^2 + 1 = 25q^2 + 20q + 4 + 1 = 5(5q^2 + 4q + 1). \]

Therefore $n^2 + 1$ is divisible by 5.

case 4: $n = 5q + 3$;

\[ n^2 + 1 = (5q + 3)^2 + 1 = 25q^2 + 30q + 9 + 1 = 5(5q^2 + 6q + 2). \]

Therefore $n^2 + 1$ is divisible by 5.

case 5: $n = 5q + 4$; $n + 1$ is divisible by 5.

□



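Solution 6: By the division algorithm we get $x = qm + r$, where $q, r \in \mathbb{Z}$ and $0 \le r < m$.

Therefore,

\[ \left\lceil \frac{x}{m} \right\rceil = \left\lceil q + \frac{r}{m} \right\rceil = \begin{cases} q & \text{if } r = 0, \\ q + 1 & \text{if } r > 0, \end{cases} \]

and

\[ \left\lfloor \frac{x + m - 1}{m} \right\rfloor = \left\lfloor q + \frac{r}{m} + 1 - \frac{1}{m} \right\rfloor = \begin{cases} q & \text{if } r = 0, \\ q + 1 & \text{if } 0 < r < m. \end{cases} \]

Therefore,

\[ \left\lceil \frac{x}{m} \right\rceil = \left\lfloor \frac{x + m - 1}{m} \right\rfloor. \]

□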
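The identity of Solution 6 is the usual way to compute a ceiling with integer floor division only. The short Python sketch below is an illustration, not part of the original text; the name `ceil_div` is an assumption. It relies on the fact that Python's `//` is floor division for negative operands as well.

```python
def ceil_div(x: int, m: int) -> int:
    """Ceiling of x/m for an integer x and a positive integer m.

    Uses ceil(x/m) = floor((x + m - 1) / m) from Solution 6.
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    return (x + m - 1) // m   # // is floor division, also for negative x
```

A brute-force comparison against `-((-x) // m)`, another exact integer expression for the ceiling, over a range of $x$ and $m$ is a quick sanity check.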

Solution 7: By the division algorithm, $0 \le r < n$. Consider the following two cases:

case 1: $0 < n \le \frac{m}{2}$. In this case, $r < \frac{m}{2}$ follows immediately.

case 2: $\frac{m}{2} < n < m$. In this case it is obvious that $1 < \frac{m}{n} < 2$, so $\lfloor \frac{m}{n} \rfloor = 1$. Consequently, $r = m - \lfloor \frac{m}{n} \rfloor n = m - n < m - \frac{m}{2}$. Therefore, $r < \frac{m}{2}$.

□

Solution 8: Problem 8 is a special case of Problem 9; hence its solution is obtained from the following solution. □

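Solution 9: Let $t \in \mathbb{R}$. By the division algorithm, $t = 2k + r$, where $0 \le r < 2$, $r \in \mathbb{R}$, and $k \in \mathbb{Z}$. Substituting $t = 2k + r$ in the left side, we get

\[ \left\lceil \frac{1}{2} \left\lfloor \frac{t}{2} \right\rfloor \right\rceil = \left\lceil \frac{1}{2} \left\lfloor \frac{2k + r}{2} \right\rfloor \right\rceil = \left\lceil \frac{1}{2} \left\lfloor k + \frac{r}{2} \right\rfloor \right\rceil = \left\lceil \frac{1}{2} k \right\rceil, \quad \text{because } 0 \le \frac{r}{2} < 1. \]

Similarly, substituting $t = 2k + r$ in the right side, we get

\[ \left\lfloor \frac{t + 2}{4} \right\rfloor = \left\lfloor \frac{2k + r + 2}{4} \right\rfloor = \left\lfloor \frac{k}{2} + \frac{r + 2}{4} \right\rfloor = \left\lfloor \frac{k}{2} + s \right\rfloor, \]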



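where $\frac{1}{2} \le s = \frac{r + 2}{4} < 1$. The integer $k$ is either odd or even.

case 1: If $k$ is even, then $k = 2n$ for some $n$,

\[ \left\lceil \frac{k}{2} \right\rceil = \lceil n \rceil = n, \quad \text{and} \quad \left\lfloor \frac{k}{2} + s \right\rfloor = \lfloor n + s \rfloor = n. \]

Thus, the equality holds.

case 2: If $k$ is odd, then $k = 2n + 1$ for some $n$,

\[ \left\lceil \frac{k}{2} \right\rceil = \left\lceil n + \frac{1}{2} \right\rceil = n + 1, \]

and

\[ \left\lfloor \frac{k}{2} + s \right\rfloor = \left\lfloor n + \frac{1}{2} + s \right\rfloor = \left\lfloor n + 1 + \left(s - \frac{1}{2}\right) \right\rfloor = n + 1, \quad \text{because } 0 \le s - \frac{1}{2} < \frac{1}{2}. \]

Therefore, in both cases the equality holds. □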

Solution 10: First we consider a variant of the division algorithm. Let $m, n \in \mathbb{Z}$ where $m \ne 0$; then there exist unique integers $q, r$ such that

\[ n = mq - r, \quad \text{and} \quad 0 \le r < |m|, \tag{6.33} \]

where $q = \lceil \frac{n}{m} \rceil$ and $r = mq - n$. In addition, let $0 < m < n$ so that $0 \le r < m$.

Dividing both sides of (6.33) by $nq$ and rearranging gives

\[ \frac{m}{n} = \frac{1}{q} + \frac{r}{nq}. \]

Now, we construct a recursive algorithm to express a ratio as the Egyptians did.

    f(m, n)
        if m = 1 then print(1/n); stop;
        q ← ⌈n/m⌉;
        r ← mq − n;
        if r = 0 then print(1/q); stop;
        print(1/q +);
        f(r, nq);
    end f

How do we know this algorithm will stop for any integers $n, m$ with $0 < m < n$? Because we know that $r$ is strictly smaller than $m$. Therefore, if we recursively apply $f$ to $r$ as the first argument, eventually the first argument will become 0 or 1 and no more recursive calls will be made. □
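The recursive procedure $f$ above can be sketched in Python as a loop. This is a minimal illustration under stated assumptions: the function name `egyptian` and the choice to return the list of denominators (rather than print them) are not from the text. The loop treats the case $m = 1$ like any other, since then $q = n$ and $r = 0$.

```python
from fractions import Fraction


def egyptian(m: int, n: int) -> list:
    """Denominators q1, q2, ... with m/n = 1/q1 + 1/q2 + ..., for 0 < m < n."""
    if not 0 < m < n:
        raise ValueError("need 0 < m < n")
    denominators = []
    while m > 0:
        q = -(-n // m)             # q = ceil(n / m), computed with integers only
        denominators.append(q)
        m, n = m * q - n, n * q    # remainder r = mq - n over the new denominator nq
    return denominators


def unit_fraction_sum(denominators) -> Fraction:
    """Exact sum of 1/q over the given denominators."""
    return sum((Fraction(1, q) for q in denominators), Fraction(0))
```

For the example in Problem 10, `egyptian(2, 5)` yields the denominators 3 and 15. The first argument strictly decreases on every pass, which is exactly the termination argument given above.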

Solution 11: In both cases we use the division algorithm and successively write the remainders:

    100 = 12 × 8 + 4
     12 =  1 × 8 + 4
      1 =  0 × 8 + 1

Therefore, $100 = (144)_8$.

    1000 = 111 × 9 + 1
     111 =  12 × 9 + 3
      12 =   1 × 9 + 3
       1 =   0 × 9 + 1

Therefore, $1000 = (1331)_9$. □

Solution 12:

    b(n)
        if n = 0 then stop;
        k ← ⌊n/b⌋;
        r ← n − bk;
        b(k); print(r); stop;
    end b

□
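A runnable counterpart of the pseudocode above, as a hedged sketch: the name `to_base` and the decision to return a list of digits (most significant first) are assumptions made for the example. Unlike the pseudocode, it returns the single digit 0 for $n = 0$ instead of printing nothing.

```python
def to_base(n: int, b: int) -> list:
    """Digits of n in base b, most significant first (n >= 0, b >= 2)."""
    if b < 2 or n < 0:
        raise ValueError("need b >= 2 and n >= 0")
    if n < b:                       # a single digit: recursion bottoms out
        return [n]
    # Digits of floor(n / b), followed by the remainder n - b * floor(n / b).
    return to_base(n // b, b) + [n % b]
```

Applied to Problem 11, `to_base(100, 8)` gives the digits 1, 4, 4 and `to_base(1000, 9)` gives 1, 3, 3, 1.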

Solution 13: Let $a = da'$, $b = db'$ and $d = \gcd(a, b) = ax + by$ for some integers $x$ and $y$. We have

\[ d = ax + by = da'x + db'y \;\Longrightarrow\; 1 = a'x + b'y. \]

Because $a'$ and $b'$ are integers, by Theorem 6.12, $\gcd(x, y) = 1$. □

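Solution 14: Suppose $a \mid m$, $b \mid m$, $m \in \mathbb{Z}$. Since $a$ and $b$ are relatively prime integers, we obtain

\[
\begin{aligned}
\gcd(a, b) = 1 &\;\Rightarrow\; xa + yb = 1 \text{ for some } x, y \in \mathbb{Z} \\
&\;\Rightarrow\; mxa + myb = m \\
&\;\Rightarrow\; \frac{mxa}{ab} + \frac{myb}{ab} = \frac{m}{ab} \\
&\;\Rightarrow\; \frac{m}{b}\,x + \frac{m}{a}\,y = \frac{m}{ab}.
\end{aligned}
\]

Since $\frac{m}{b}$, $x$, $\frac{m}{a}$, and $y$ are all integers, $\frac{m}{ab}$ is an integer too, and hence $ab \mid m$. □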

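Solution 15: Let $a = b = x = y = 1$. It is easy to see that $ax + by = 2$, but $\gcd(a, b) = 1 \ne 2$. This is possible because every linear combination $ax + by$ is a multiple of $\gcd(a, b)$, but it need not be equal to $\gcd(a, b)$. □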

Solution 16: Let $q = \lfloor \frac{m}{n} \rfloor$ and $r = m - \lfloor \frac{m}{n} \rfloor n$.

    m     = n × q + r
    242   = 165 × 1 + 77
    165   = 77 × 2 + 11
    77    = 11 × 7 + 0

Therefore, $\gcd(242, 165) = 11$.

    m     = n × q + r
    18416 = 17296 × 1 + 1120
    17296 = 1120 × 15 + 496
    1120  = 496 × 2 + 128
    496   = 128 × 3 + 112
    128   = 112 × 1 + 16
    112   = 16 × 7 + 0

Therefore, $\gcd(18416, 17296) = 16$.

□
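The division steps tabulated above can be reproduced mechanically. The following Python sketch of Euclid's algorithm is illustrative; the name `euclid_gcd` and the returned list of `(m, n, q, r)` rows are assumptions chosen to mirror the layout $m = n \times q + r$.

```python
def euclid_gcd(m: int, n: int):
    """Return (gcd, steps), where each step is a row (m, n, q, r) with m = n*q + r."""
    steps = []
    while n != 0:
        q, r = divmod(m, n)        # q = floor(m / n), r = m - q * n
        steps.append((m, n, q, r))
        m, n = n, r                # continue with the pair (n, r)
    return m, steps
```

Calling `euclid_gcd(18416, 17296)` returns the gcd together with the six rows shown in the second table.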

Solution 17: The statement is correct, which will be proven by mathematical induction as follows.

• Inductive Basis: $k = 1$. Under the condition $m, n \in \mathbb{N}$ and $0 \le n < m$, and because the algorithm makes one recursive call, the smallest values of $m, n$ are 2, 1, respectively. Since $F_{k+2} = F_3 = 2$ and $F_{k+1} = F_2 = 1$, the base is proven.

• Inductive Hypothesis: Suppose the algorithm makes $k$ recursive calls and $F_{k+2} \le m$, $F_{k+1} \le n$.



• Inductive Step: Suppose the algorithm makes $k + 1$ recursive calls. After the first recursive call the new arguments are $n$ and $m - \lfloor m/n \rfloor n$. Let

\[ m' = n, \tag{6.34} \]

\[ n' = m - \left\lfloor \frac{m}{n} \right\rfloor n. \tag{6.35} \]

By the assumption we know that $\gcd(m', n')$ will make $k$ recursive calls, and by the inductive hypothesis we also know that

\[ F_{k+2} \le m', \tag{6.36} \]

\[ F_{k+1} \le n'. \tag{6.37} \]

From (6.34) and (6.36) we have $F_{k+2} \le n$. The given condition, $n < m$, implies that $1 \le \lfloor \frac{m}{n} \rfloor$. Thus, from (6.35) we have

\[ n' \le m - n \;\Longrightarrow\; n' + n \le m \;\Longrightarrow\; F_{k+1} + F_{k+2} \le n' + n \le m \;\Longrightarrow\; F_{k+3} \le m. \]

□

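Solution 18: $\gcd(375, 275) = 375x + 275y$.

    m/n    x0/x1   y0/y1    q
    375      1       0
    275      0       1      1
    100      1      −1      2
     75     −2       3      1
     25      3      −4      3
      0

Here the last column holds the quotient obtained when the first-column entry of the previous row is divided by that of the current row; each row satisfies (first column) $= 375 \cdot x + 275 \cdot y$.

Therefore, $x = 3$, $y = -4$. □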
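The table of Solution 18 is one run of the extended Euclidean algorithm. The sketch below is illustrative only; the name `extended_gcd` and the variable names `x0, x1, y0, y1` (borrowed from the column headings) are assumptions. Each pass of the loop produces one new row.

```python
def extended_gcd(m: int, n: int):
    """Return (g, x, y) with g = gcd(m, n) = m*x + n*y."""
    x0, y0 = 1, 0                  # coefficients expressing the current m
    x1, y1 = 0, 1                  # coefficients expressing the current n
    while n != 0:
        q = m // n                 # quotient for this row
        m, n = n, m - q * n        # next pair of remainders
        x0, x1 = x1, x0 - q * x1   # keep m = 375*x0 + 275*y0 style invariants
        y0, y1 = y1, y0 - q * y1
    return m, x0, y0
```

For the inputs 375 and 275 the result is the triple (25, 3, −4), in agreement with the table.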

Solution 19: Let $x = 2p + 1$, $y = 2q + 1$ where $p, q \in \mathbb{Z}$. Then,

\[
\begin{aligned}
x^2 + y^2 &= (2p + 1)^2 + (2q + 1)^2 \\
&= 4p^2 + 4p + 1 + 4q^2 + 4q + 1 \\
&= 4(p^2 + p + q^2 + q) + 2 \\
&= 4K + 2
\end{aligned}
\]



where $K = p^2 + p + q^2 + q$, which is an integer.

Suppose it is possible to find an integer $a$ such that $a^2 = x^2 + y^2$. Then there are two possible cases:

case 1: $a = 2k$ for some integer $k$. Then $a^2 = 4k^2$, and

\[ 4K + 2 = 4k^2 \;\Longrightarrow\; K + \frac{1}{2} = k^2. \]

This is impossible, because both $K$ and $k^2$ are integers.

case 2: $a = 2k + 1$ for some integer $k$. Then $a^2 = 4k^2 + 4k + 1$, and

\[ 4K + 2 = 4k^2 + 4k + 1 \;\Longrightarrow\; K + \frac{1}{4} = k^2 + k. \]

This is also impossible, because $K$, $k$, and $k^2$ are integers.

Therefore, $x^2 + y^2 = a^2$ is impossible. □

Solution 20: If $x$ is not divisible by 3, then there are two cases: $x = 3p + 1$ and $x = 3p + 2$ for some integer $p$. Similarly, if $y$ is not divisible by 3, there are two cases: $y = 3q + 1$ and $y = 3q + 2$. Also, the integer $a$ itself has 3 cases: $a = 3k$, $a = 3k + 1$, and $a = 3k + 2$. Altogether, we have $2 \times 2 \times 3$ cases. The proof for each case is essentially the same. Here we just prove one case: $x = 3p + 2$, $y = 3q + 2$, $a = 3k + 2$, where $p, q, k \in \mathbb{Z}$.

\[
\begin{aligned}
x^2 + y^2 &= (3p + 2)^2 + (3q + 2)^2 \\
&= 9p^2 + 12p + 4 + 9q^2 + 12q + 4 \\
&= 3(3p^2 + 4p + 3q^2 + 4q) + 8 \\
&= 3K + 8
\end{aligned}
\]

where $K = 3p^2 + 4p + 3q^2 + 4q$, which is an integer. Similarly,

\[
\begin{aligned}
a^2 &= (3k + 2)^2 \\
&= 9k^2 + 12k + 4 \\
&= 3(3k^2 + 4k) + 4 \\
&= 3K' + 4
\end{aligned}
\]

where $K' = 3k^2 + 4k$, which is an integer. Therefore,

\[ 3K + 8 = 3K' + 4 \;\Longrightarrow\; K + \frac{4}{3} = K'. \]



Since both $K$ and $K'$ are integers, this is impossible. □

Solution 21: Suppose that $2^n - 1$ is a prime and $n$ is not a prime. If $n$ is not a prime, then there exist $p, q \in \mathbb{Z}$ with $p, q \ge 2$ such that $n = pq$.

\[
\begin{aligned}
2^n - 1 &= 2^{pq} - 1 \\
&= (2^p)^q - 1 \\
&= (2^p - 1)\left((2^p)^{q-1} + (2^p)^{q-2} + \cdots + 1\right).
\end{aligned}
\]

Because $p, q \ge 2$, both $2^p - 1$ and $(2^p)^{q-1} + (2^p)^{q-2} + \cdots + 1$ are greater than or equal to 2. Therefore, $2^n - 1$ is not a prime number, which contradicts the assumption.

The converse is false. Here is a counterexample: $n = 11$, which is prime. We have $2^n - 1 = 2047 = 23 \times 89$, which is not prime. □

Solution 22: $27 \equiv 9 \pmod{m}$ means that $27 - 9 = km$ for some integer $k$. Therefore, we have to find all possible $m$ such that $(27 - 9)/m$ is an integer. Exactly the positive factors of 18 satisfy this requirement. Therefore, $m \in \{1, 2, 3, 6, 9, 18\}$. □

Solution 23: In this question we seek the elements of $A$ that have the same remainder after being divided by 3 and 7, respectively; such elements are congruent to each other.

    687 = 229 × 3 + 0 =  98 × 7 + 1
    589 = 196 × 3 + 1 =  84 × 7 + 1
    931 = 310 × 3 + 1 = 133 × 7 + 0
    847 = 282 × 3 + 1 = 121 × 7 + 0
    527 = 175 × 3 + 2 =  75 × 7 + 2

Therefore,

\[
\begin{aligned}
589 \equiv 931 \equiv 847 &\equiv 1 \pmod{3}, \\
687 \equiv 589 &\equiv 1 \pmod{7}, \\
931 \equiv 847 &\equiv 0 \pmod{7}.
\end{aligned}
\]

□



Solution 24:

We find the remainder of $x$ after dividing by $m$ for each pair $x$ and $m$.

    x   = k × m + r
    19  = 9 × 2 + 1
    131 = 26 × 5 + 1
    84  = 6 × 14 + 0
    141 = 8 × 17 + 5

Therefore,

\[
\begin{aligned}
19 &\equiv 1 \pmod{2}, \\
131 &\equiv 1 \pmod{5}, \\
84 &\equiv 0 \pmod{14}, \\
141 &\equiv 5 \pmod{17}.
\end{aligned}
\]

□

Solution 25:

1. To solve $5x \equiv 9 \pmod{17}$, we first note that $\gcd(5, 17) = 1$. Clearly, $\gcd(5, 17) \mid 9$, and hence the given congruence has a solution. Using the extended Euclid's algorithm, we obtain

\[ \gcd(5, 17) = 1 = 5 \times 7 + 17 \times (-2). \]

Thus, $x_0 = 7 \times 9 = 63$ is a solution. Moreover, $\mathrm{lcm}(5, 17)/5 = 17$. Thus, $[63]_{17}$, or equivalently $[12]_{17}$, is the set of all solutions.

2. $18y \equiv 8 \pmod{15}$ has no solution, because $\gcd(18, 15) = 3 \nmid 8$.

3. $12z \equiv 15 \pmod{42}$ has no solution, because $\gcd(12, 42) = 6 \nmid 15$.

□
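The solvability test and the construction of a solution used in Solution 25 can be sketched as follows. The code is an illustration under stated assumptions: the name `solve_congruence` is hypothetical, and it uses Python's built-in modular inverse `pow(a, -1, m)` (Python 3.8+) in place of a hand-run extended Euclid's algorithm.

```python
from math import gcd


def solve_congruence(a: int, c: int, m: int) -> list:
    """All x in 0..m-1 with a*x ≡ c (mod m); empty list when gcd(a, m) does not divide c."""
    d = gcd(a, m)
    if c % d != 0:
        return []                                      # no solution exists
    a_red, c_red, m_red = a // d, c // d, m // d       # reduced congruence, gcd(a_red, m_red) = 1
    x0 = (pow(a_red, -1, m_red) * c_red) % m_red       # unique solution mod m_red
    return [x0 + k * m_red for k in range(d)]          # d solutions mod m
```

For the three congruences of Problem 25 this returns one residue, 12, for the first and an empty list for the other two.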

Solution 26: Let $a^n \equiv 1 \pmod{m}$. Therefore, $a^n - 1 = km$ for some $k \in \mathbb{Z}$.

\[ a^n - km = 1 \;\Rightarrow\; (a^{n-1})a + (-k)m = 1 \;\Rightarrow\; \gcd(a, m) = 1, \text{ by Theorem 6.12.} \]

□



Solution 27: Given a set of integers of size $m$, the easiest way to see whether the given set is a complete residue system modulo $m$ is to use the division algorithm to find the remainder of each element divided by $m$. If the remainders cover all of the integers in

\[ \{0, 1, 2, \ldots, m - 1\}, \]

then the given set is a complete residue system modulo $m$. Conversely, if we find that two of them have the same remainder, then the given set is not a complete residue system modulo $m$, because we need $m$ distinct equivalence classes to cover the entire $\mathbb{Z}$. Therefore, this problem is simply to find the remainders and check them. By using the division algorithm, we know

\[ 19 \in [1]_6, \quad 8 \in [2]_6, \quad -3 \in [3]_6, \quad -5 \in [1]_6, \quad 10 \in [4]_6, \quad 5 \in [5]_6. \]

Therefore, $\{19, 8, -3, -5, 10, 5\}$ is not a complete residue system modulo 6, because $[19]_6 = [-5]_6$. □

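Solution 28: We want to prove that if $s \in \mathbb{Z}$, then

\[ \mathbb{Z} = \bigcup_{c \in C} [c]_m \;\Longrightarrow\; \mathbb{Z} = \bigcup_{c \in C} [c + s]_m. \]

Let $k \in \mathbb{Z}$. Since $C$ is a complete residue system modulo $m$, if $s \in \mathbb{Z}$, then

\[ k - s \in \mathbb{Z} \;\Rightarrow\; \exists c \in C,\ k - s \in [c]_m \;\Rightarrow\; k - s \equiv c \pmod{m}. \]

Such a $c$ must exist because $C$ is a complete system of residues mod $m$. By the properties stated in Theorem 6.18 and the fact that $s \equiv s \pmod{m}$, we have

\[
\begin{aligned}
(k - s) + s &\equiv c + s \pmod{m}, \\
k &\equiv c + s \pmod{m}, \\
k &\in [c + s]_m.
\end{aligned}
\]

That means

\[ \forall k \in \mathbb{Z},\ \exists \mu \in (C + s),\ k \in [\mu]_m, \]

where $\mu = c + s$. Therefore, $C + s$ is a complete residue system modulo $m$. □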



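Solution 29: There are many ways to find the least positive residue of $15^{35}$ mod 19. We can use Theorem 6.18 directly to solve this problem. In the following reduction we use the facts that $225 \equiv 16 \pmod{19}$, $256 \equiv 9 \pmod{19}$, etc.

\[
\begin{aligned}
15^{35} = 15^{2 \cdot 17 + 1} = 225^{17} \cdot 15 &\equiv 16^{17} \cdot 15 = (16^2)^8 \cdot 16 \cdot 15 = 256^8 \cdot 240 \\
&\equiv 9^8 \cdot 12 = 81^4 \cdot 12 \\
&\equiv 5^4 \cdot 12 = 25^2 \cdot 12 \\
&\equiv 6^2 \cdot 12 = 36 \cdot 12 \\
&\equiv 17 \cdot 12 = 204 \\
&\equiv 14 \pmod{19}.
\end{aligned}
\]

Alternatively, we can use Fermat's theorem. We observe that 15 and 19 are relatively prime and $35 \ge 19$. If we take $p = 19$, $a = 15$, we get

\[ 15^{18} \equiv 1 \pmod{19}. \]

This removes a big part of the exponent of 15, and the rest of the exponent can be simplified as before:

\[
\begin{aligned}
15^{35} = 15^{18} \cdot 15^{17} &\equiv 1 \cdot 15^{17} = (15^2)^8 \cdot 15 = 225^8 \cdot 15 \\
&\equiv 16^8 \cdot 15 = 2^{32} \cdot 15 = 2^{18} \cdot 2^{14} \cdot 15 \\
&\equiv 1 \cdot 2^{14} \cdot 15 = 64^2 \cdot 60 \\
&\equiv 7^2 \cdot 3 \\
&\equiv 11 \cdot 3 \\
&\equiv 14 \pmod{19}.
\end{aligned}
\]

□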
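The first reduction above repeatedly squares and reduces modulo 19. The same idea, organized by the binary digits of the exponent, gives the standard square-and-multiply method. The Python sketch below is illustrative and not the procedure of the text; the name `power_mod` is an assumption, and Python's built-in three-argument `pow` computes the same value.

```python
def power_mod(base: int, exponent: int, modulus: int) -> int:
    """base**exponent mod modulus by repeated squaring (exponent >= 0)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # current binary digit is 1
            result = result * base % modulus
        base = base * base % modulus           # square, then reduce
        exponent >>= 1                         # move to the next binary digit
    return result
```

Running `power_mod(15, 35, 19)` reproduces the residue 14 obtained by both hand computations.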

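Solution 30: Observe that 29 and 17 are relatively prime and $36 \ge 17$. By Fermat's theorem, $29^{16} \equiv 1 \pmod{17}$, and $29^{36} = (29^2)^{16} \cdot 29^4$. Since $29 \equiv 12 \pmod{17}$, we get

\[
\begin{aligned}
29^{36} \equiv 1 \cdot 29^4 &\equiv 12^4 = 144^2 \\
&\equiv 8^2 = 64 \\
&\equiv 13 \pmod{17}.
\end{aligned}
\]

□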

Solution 31: From the definition of the complete system of residues mod $m$ we obtain the following result.



Result: Let $S$ be a set of integers with size $m$. $S$ is a complete system of residues mod $m$ if and only if no two elements in $S$ are congruent mod $m$ to each other.

The proof of this result, left as an exercise, can be obtained by an application of the pigeonhole principle.

Let $a_0, \ldots, a_{m-1}$ be a complete system of residues mod $m$. Let $0 \le i, j < m$, $i \ne j$, $t$ relatively prime to $m$, and $[ta_i]_m = [ta_j]_m$. Then it follows that

\[
\begin{aligned}
ta_i \equiv ta_j \pmod{m} &\;\Rightarrow\; ta_i - ta_j = km \text{ for some } k \in \mathbb{Z} \\
&\;\Rightarrow\; t(a_i - a_j) = km \text{ for some } k \in \mathbb{Z} \\
&\;\Rightarrow\; m \mid t(a_i - a_j) \\
&\;\Rightarrow\; m \mid (a_i - a_j) \text{ because } \gcd(t, m) = 1 \\
&\;\Rightarrow\; a_i - a_j = k'm \text{ for some } k' \in \mathbb{Z} \\
&\;\Rightarrow\; [a_i]_m = [a_j]_m.
\end{aligned}
\]

This is a contradiction, because $a_0, \ldots, a_{m-1}$ is a complete system of residues mod $m$, and from the corollary we know that $[a_i]_m \ne [a_j]_m$ if and only if $a_i \ne a_j$. Therefore, the assumption that $[ta_i]_m = [ta_j]_m$ is wrong. □

Solution 32: The number of days between December 1, 1976, and July 1, 1987, is 3864, which is $552 \cdot 7$. If the number of days between the first days of two months of equal length is a multiple of 7, then the two months have the same monthly calendar. December and July both have 31 days. Therefore, December 1976 and July 1987 share the same monthly calendar. Try cal 1976 and cal 1987 on Unix. □
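The day count can be double-checked with Python's standard `datetime` module. This is only a sanity check of the arithmetic, not a substitute for the counting argument; the helper names `days_in_month` and `same_month_calendar` are assumptions made for the example.

```python
from datetime import date


def days_in_month(year: int, month: int) -> int:
    """Length of the given month, via the first day of the following month."""
    next_year = year + 1 if month == 12 else year
    next_month = 1 if month == 12 else month + 1
    return (date(next_year, next_month, 1) - date(year, month, 1)).days


def same_month_calendar(y1: int, m1: int, y2: int, m2: int) -> bool:
    """True when both months start on the same weekday and have the same length."""
    gap = (date(y2, m2, 1) - date(y1, m1, 1)).days
    return gap % 7 == 0 and days_in_month(y1, m1) == days_in_month(y2, m2)
```

The difference `date(1987, 7, 1) - date(1976, 12, 1)` has 3864 days, matching the figure used above.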

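Solution 33: $f(x) = 13x^3 - 5x^2 + 14x - 10$.

We will use the following congruence relations:

\[
\begin{aligned}
13 &\equiv 6 \equiv -1 \pmod{7}, \\
12 &\equiv 5 \equiv -2 \pmod{7}, \\
14 &\equiv 0 \pmod{7}.
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
f(12) &\equiv 13 \cdot 12^3 - 5 \cdot 12^2 + 14 \cdot 12 - 10 \\
&\equiv 6 \cdot (-2)^3 - (-2) \cdot (-2)^2 + 0 - 3 \pmod{7} \\
&= -48 + 8 - 3 = -43 \\
&\equiv 6 \pmod{7}.
\end{aligned}
\]

□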



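Solution 34: Recall Theorem 6.6, stating that, for any $a, b, x, y \in \mathbb{Z}$, $\gcd(a, b) \mid (xa + yb)$.

Let $m \ge 1$, $a, x \in \mathbb{Z}$. We want to prove that

\[ x \in [a]_m \;\Longrightarrow\; \gcd(x, m) = \gcd(a, m). \]

If $x \in [a]_m$, we know $x - a = km$ for some $k \in \mathbb{Z}$. Therefore,

\[ x + (-k)m = a, \tag{6.38} \]

\[ a + km = x. \tag{6.39} \]

Let $\gcd(x, m) = d$ and $\gcd(a, m) = d'$; thus $d \mid m$ and $d' \mid m$. From (6.38) we know $a$ is a linear combination of $x$ and $m$. Thus, $d \mid a$ and

\[ d \mid a,\ d \mid m \;\Longrightarrow\; d \mid \gcd(a, m) \;\Longrightarrow\; d \mid d'. \tag{6.40} \]

Similarly, from (6.39), we have $d' \mid x$ and

\[ d' \mid x,\ d' \mid m \;\Longrightarrow\; d' \mid \gcd(x, m) \;\Longrightarrow\; d' \mid d. \tag{6.41} \]

Therefore, $d = d'$ and $\gcd(x, m) = \gcd(a, m)$. □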

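Solution 35: (i)

\[
\begin{aligned}
2^{14} &\equiv (2^4)^3 \cdot 2^2 = 16^3 \cdot 4 \\
&\equiv (-1)^3 \cdot 4 = -4 \\
&\equiv 13 \pmod{17}.
\end{aligned}
\]

(ii) We use Fermat's little theorem for this problem,

\[ 3^{100} \equiv (3^{25})^4 \equiv 1 \pmod{5}, \]

because 5 is a prime and $5 \nmid 3^{25}$. □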

Solution 36:

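• To solve

\[ x \equiv 3 \pmod{5}, \quad x \equiv 2 \pmod{6}, \quad x \equiv 3 \pmod{7}, \]

we first note that 5, 6, and 7 are relatively prime to each other. Therefore, we can use the Chinese Remainder Algorithm to find the solutions.

Let $M = 5 \cdot 6 \cdot 7 = 210$. We first solve the following equations separately.

\[ c_1 \cdot \frac{M}{5} \equiv 1 \pmod{5}, \qquad c_2 \cdot \frac{M}{6} \equiv 1 \pmod{6}, \qquad c_3 \cdot \frac{M}{7} \equiv 1 \pmod{7}. \]

That is,

\[
\begin{aligned}
c_1 \cdot 42 &\equiv 1 \pmod{5} \\
c_2 \cdot 35 &\equiv 1 \pmod{6} \\
c_3 \cdot 30 &\equiv 1 \pmod{7}
\end{aligned}
\quad\Rightarrow\quad
\begin{aligned}
c_1 &\in [3]_5 \\
c_2 &\in [5]_6 \\
c_3 &\in [4]_7
\end{aligned}
\]

After we get one solution for each of $c_1$, $c_2$, and $c_3$, we can find our first solution, $x_0$, for the simultaneous congruence,

\[
\begin{aligned}
x_0 &= c_1 \cdot 42 \cdot 3 + c_2 \cdot 35 \cdot 2 + c_3 \cdot 30 \cdot 3 \\
&= 3 \cdot 42 \cdot 3 + 5 \cdot 35 \cdot 2 + 4 \cdot 30 \cdot 3 \\
&= 1088.
\end{aligned}
\]

Therefore, the solution set is $[1088]_{210}$, or equivalently, $x \in [38]_{210}$.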

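• To solve

\[ y \equiv 2 \pmod{5}, \quad y \equiv 7 \pmod{13}, \quad y \equiv 11 \pmod{8}, \]

we first check that 5, 13, and 8 are relatively prime to each other. Let $M = 5 \cdot 13 \cdot 8 = 520$. First, we solve the following equations separately.

\[ c_1 \cdot \frac{M}{5} \equiv 1 \pmod{5}, \qquad c_2 \cdot \frac{M}{13} \equiv 1 \pmod{13}, \qquad c_3 \cdot \frac{M}{8} \equiv 1 \pmod{8}. \]

That is,

\[
\begin{aligned}
c_1 \cdot 104 &\equiv 1 \pmod{5} \\
c_2 \cdot 40 &\equiv 1 \pmod{13} \\
c_3 \cdot 65 &\equiv 1 \pmod{8}
\end{aligned}
\quad\Rightarrow\quad
\begin{aligned}
c_1 &\in [-1]_5 \\
c_2 &\in [1]_{13} \\
c_3 &\in [1]_8
\end{aligned}
\]

Then we can find a solution $y_0$,

\[
\begin{aligned}
y_0 &= c_1 \cdot 104 \cdot 2 + c_2 \cdot 40 \cdot 7 + c_3 \cdot 65 \cdot 11 \\
&= -1 \cdot 104 \cdot 2 + 1 \cdot 40 \cdot 7 + 1 \cdot 65 \cdot 11 \\
&= 787 \equiv 267 \pmod{520}.
\end{aligned}
\]

Therefore, the solution set is $[267]_{520}$.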

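• To solve

\[ z \equiv 7 \pmod{16}, \quad z \equiv 1 \pmod{9}, \quad z \equiv 2 \pmod{25}, \]

we first make sure that 16, 9, and 25 are relatively prime to each other. Let $M = 16 \cdot 9 \cdot 25 = 3600$, and solve the following equations separately.

\[ c_1 \cdot \frac{M}{16} \equiv 1 \pmod{16}, \qquad c_2 \cdot \frac{M}{9} \equiv 1 \pmod{9}, \qquad c_3 \cdot \frac{M}{25} \equiv 1 \pmod{25}. \]

That is,

\[
\begin{aligned}
c_1 \cdot 225 &\equiv 1 \pmod{16} \\
c_2 \cdot 400 &\equiv 1 \pmod{9} \\
c_3 \cdot 144 &\equiv 1 \pmod{25}
\end{aligned}
\quad\Rightarrow\quad
\begin{aligned}
c_1 &\in [1]_{16} \\
c_2 &\in [-2]_9 \\
c_3 &\in [4]_{25}
\end{aligned}
\]

\[
\begin{aligned}
z_0 &= c_1 \cdot 225 \cdot 7 + c_2 \cdot 400 \cdot 1 + c_3 \cdot 144 \cdot 2 \\
&= 1 \cdot 225 \cdot 7 + (-2) \cdot 400 \cdot 1 + 4 \cdot 144 \cdot 2 \\
&= 1927.
\end{aligned}
\]

Therefore, the solution set is $[1927]_{3600}$.

□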
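The three computations of Solution 36 follow one pattern, which the sketch below captures. It is an illustration under stated assumptions: the name `crt` is hypothetical, the moduli are assumed pairwise relatively prime (as verified in each bullet above), and Python's `pow(x, -1, m)` (3.8+) supplies the inverses $c_i$.

```python
from math import prod


def crt(residues, moduli):
    """Return (x, M): the unique x in 0..M-1 with x ≡ r_i (mod m_i), where M is the product of the moduli."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                # product of the other moduli, M / m_i
        c = pow(Mi, -1, m)         # c_i with c_i * (M / m_i) ≡ 1 (mod m_i)
        x += c * Mi * r
    return x % M, M
```

For the first system, `crt([3, 2, 3], [5, 6, 7])` returns 38 together with the modulus 210.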

Solution 37: Observe that 7 is a prime number, and 7 cannot divide $2^q$ for any positive integer $q$. Thus, we can make use of Fermat's little theorem and

\[ (2^q)^6 \equiv 1 \pmod{7} \quad \text{for any integer } q \ge 1. \]

Moreover, by the division algorithm we know that for any positive integer $m$, there are unique $q$ and $r$ such that

\[ m = 6q + r, \quad 0 \le r < 6. \]



Therefore, for $m \ge 0$ and $\alpha = 1, 2, 4$, we get $2^m - \alpha \equiv 2^{6q+r} - \alpha \equiv 2^{6q} 2^r - \alpha \equiv 2^r - \alpha \pmod{7}$. Consequently, $(2^m - 1)(2^m - 2)(2^m - 4) \equiv (2^r - 1)(2^r - 2)(2^r - 4) \pmod{7}$. Therefore, we only have to prove that for each possible $r$ the latter product is congruent to 0. For any integer $r$ with $0 \le r < 6$, one of the factors of the above product is congruent to 0 modulo 7; e.g., if $r = 4$, then $2^r - 2 = 16 - 2 = 14 \equiv 0 \pmod{7}$.

Therefore, in all cases $(2^m - 1)(2^m - 2)(2^m - 4) \equiv 0 \pmod{7}$.

Alternatively, we can expand the product and reduce modulo 7:

\[
\begin{aligned}
(2^m - 1)(2^m - 2)(2^m - 4) &= (2^{2m} - 3 \cdot 2^m + 2)(2^m - 4) \\
&= 2^{3m} - 3 \cdot 2^{2m} + 2 \cdot 2^m - 4 \cdot 2^{2m} + 12 \cdot 2^m - 8 \\
&= 8^m - 7 \cdot 2^{2m} + 14 \cdot 2^m - 8 \\
&\equiv 1^m - 0 \cdot 2^{2m} + 0 \cdot 2^m - 1 \pmod{7} \\
&\equiv 0 \pmod{7}.
\end{aligned}
\]

□

Solution 38: We can use Fermat's little theorem to solve this problem easily. We observe that 7 is a prime number and, for any positive integer $k$, $7 \nmid 3^k$ and $7 \nmid 4^k$; by Fermat's little theorem, $3^6 \equiv 4^6 \equiv 1 \pmod{7}$. Given any integer $n$, by using the division algorithm we can find unique $p$ and $r$ such that $n = 6p + r$, where $0 \le r < 6$.

\[
\begin{aligned}
3^n + 4^n &= 3^{6p+r} + 4^{6p+r} \\
&= (3^p)^6 3^r + (4^p)^6 4^r \\
&\equiv 1 \cdot 3^r + 1 \cdot 4^r \pmod{7} \\
&\equiv 3^r + 4^r \pmod{7}.
\end{aligned}
\]

Therefore, $n$ is a solution of

\[ 3^n + 4^n \equiv 0 \pmod{7} \]

if and only if $r$ is a solution of

\[ 3^r + 4^r \equiv 0 \pmod{7}. \]

In other words, if $r$ is a solution, so is any element in $[r]_6$. Let's check all possible $r$:

$r = 0$: $3^0 + 4^0 = 2 \not\equiv 0 \pmod{7}$.

$r = 1$: $3^1 + 4^1 = 7 \equiv 0 \pmod{7}$.

$r = 2$: $3^2 + 4^2 = 25 \not\equiv 0 \pmod{7}$.



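$r = 3$: $3^3 + 4^3 \equiv 2 \cdot 3 + 2 \cdot 4 \equiv 0 \pmod{7}$.

$r = 4$: $3^4 + 4^4 \equiv 2^2 + 2^2 \not\equiv 0 \pmod{7}$.

$r = 5$: $3^5 + 4^5 \equiv 12 + 16 \equiv 0 \pmod{7}$.

Therefore, the solutions are

\[ n \in [1]_6 \cup [3]_6 \cup [5]_6, \]

that is, the odd integers $n \ge 0$.

□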

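Solution 39: Given any positive integers $m$ and $n$, let $R_m$ and $R_n$ be the two partitions of $\mathbb{Z}$ given by congruences mod $m$ and $n$ respectively. That is,

\[ R_m = \{[0]_m, [1]_m, \ldots, [m-1]_m\}, \]

\[ R_n = \{[0]_n, [1]_n, \ldots, [n-1]_n\}. \]

We want to prove that

\[ R_m \text{ is a refinement of } R_n \iff m = kn,\ k \in \mathbb{Z}. \]

1. ($\Longleftarrow$) Suppose $m = kn$ for some integer $k$. Consider any $[i]_m \in R_m$. We want to prove that there exists a $[j]_n \in R_n$ such that $[i]_m \subseteq [j]_n$.

Let $x, y \in [i]_m$ and $x \ne y$. Consequently, $x \equiv y \pmod{m}$, or in other words, $x - y = qm$ for some $q \in \mathbb{Z}$. Substituting $m = kn$, we get $x - y = qkn$, or $x \equiv y \pmod{n}$. Thus, $x, y \in [j]_n$ where $[j]_n \in R_n$, and $R_m$ is a refinement of $R_n$. (Think: why do we need two elements, $x$ and $y$, in the above proof?)

2. ($\Longrightarrow$) Suppose $R_m$ is a refinement of $R_n$. Let $[i]_m \in R_m$, $[j]_n \in R_n$, and $[i]_m \subseteq [j]_n$. It is clear that $i \in [i]_m$ and $j \in [j]_n$.

\[ (i \in [i]_m,\ [i]_m \subseteq [j]_n) \;\Longrightarrow\; i \in [j]_n. \]

Therefore, $i \equiv j \pmod{n}$, and hence $[i]_n = [j]_n$. Moreover,

\[
\begin{aligned}
i + m \in [i]_m &\;\Rightarrow\; i + m \in [j]_n && \text{because } [i]_m \subseteq [j]_n \\
&\;\Rightarrow\; i + m \in [i]_n && \text{because } [j]_n = [i]_n \\
&\;\Rightarrow\; i + m \equiv i \pmod{n} \\
&\;\Rightarrow\; i + m - i = kn,\ k \in \mathbb{Z} \\
&\;\Rightarrow\; m = kn,\ k \in \mathbb{Z}.
\end{aligned}
\]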



Therefore, Rm is a refinement of Rn iff m = kn for some integer k. 2

Solution 40: We prove this by mathematical induction on $k$.

• Inductive Basis: For $k = 0$, $a_0 = 0 \equiv 0 \pmod{5}$.

• Inductive Hypothesis: Assume $a_{5k} \equiv 0 \pmod{5}$.

• Inductive Step: We want to prove that $a_{5(k+1)} \equiv 0 \pmod{5}$. Towards this goal we repeatedly use $a_n = a_{n-1} + a_{n-2}$ to expand $a_{5(k+1)}$ in the following equalities:

\[
\begin{aligned}
a_{5(k+1)} &= a_{5k+5} \\
&= a_{5k+3} + a_{5k+4} \\
&= a_{5k+1} + a_{5k+2} + a_{5k+2} + a_{5k+3} \\
&= a_{5k+1} + 2a_{5k+2} + a_{5k+3} \\
&= a_{5k+1} + 2a_{5k+2} + a_{5k+1} + a_{5k+2} \\
&= 2a_{5k+1} + 3a_{5k+2} \\
&= 2a_{5k+1} + 3(a_{5k} + a_{5k+1}) \\
&= 3a_{5k} + 5a_{5k+1} \\
&\equiv 0 + 0 \pmod{5} \\
&\equiv 0 \pmod{5}.
\end{aligned}
\]

Therefore, for all $k \ge 0$, $a_{5k} \equiv 0 \pmod{5}$. □

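Solution 41:

\[ \varphi(18) = \varphi(2 \cdot 3^2) = \varphi(2)\varphi(3^2) = 1 \cdot (3^2 - 3^1) = 6. \]

\[ \varphi(40) = \varphi(2^3 \cdot 5) = \varphi(2^3)\varphi(5) = (2^3 - 2^2) \cdot 4 = 16. \]

\[ \varphi(72) = \varphi(2^3 \cdot 3^2) = \varphi(2^3)\varphi(3^2) = (2^3 - 2^2) \cdot (3^2 - 3) = 24. \]

□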
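The prime-power computations above generalize to a short routine. The sketch below is illustrative: the name `phi` is an assumption, and it finds the prime factors by trial division and uses the equivalent product form $\varphi(n) = n \prod_{p \mid n} (1 - 1/p)$ (the formula of Problem 43), kept in integer arithmetic.

```python
def phi(n: int) -> int:
    """Euler's totient of n >= 1, using trial division to find the prime factors."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:      # strip every copy of the prime p
                n //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if n > 1:                      # one prime factor larger than sqrt remains
        result -= result // n
    return result
```

The values `phi(18)`, `phi(40)`, and `phi(72)` agree with the three results computed by hand.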



Solution 42:

By Euler's Theorem, if $\gcd(a, m) = 1$, then $a^{\varphi(m)} \equiv 1 \pmod{m}$. By Theorem 6.23, if $a \equiv b \pmod{m_1}$ and $a \equiv b \pmod{m_2}$, then $a \equiv b \pmod{\mathrm{lcm}(m_1, m_2)}$.

Using $75 = 3 \cdot 5^2$ we can apply Theorem 6.23. Note that $\varphi(5^2) = 5^2 - 5 = 20$. Since 5 is a prime and $\gcd(2, 5) = 1$, we have $\gcd(2, 5^2) = 1$, and by Euler's theorem

\[ 2^{\varphi(5^2)} \equiv 1 \pmod{5^2} \;\Rightarrow\; 2^{20} \equiv 1 \pmod{25}. \]

Since 3 is a prime, and 3 cannot divide $2^{10}$, by Fermat's theorem we have

\[ (2^{10})^2 \equiv 1 \pmod{3} \;\Rightarrow\; 2^{20} \equiv 1 \pmod{3}. \tag{6.42} \]

Because $\mathrm{lcm}(3, 25) = 75$, by Theorem 6.23 we have $(2^{10})^2 \equiv 1 \pmod{75}$. Therefore, $2^{20} \equiv 1 \pmod{75}$. □

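Solution 43: Given any integer $m \ge 1$, we can factor $m$ into a product of powers of distinct prime numbers:

\[ m = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}. \]

Thus,

\[
\begin{aligned}
\varphi(m) &= \varphi(p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}) \\
&= \varphi(p_1^{e_1}) \varphi(p_2^{e_2}) \cdots \varphi(p_n^{e_n}) \\
&= (p_1^{e_1} - p_1^{e_1 - 1})(p_2^{e_2} - p_2^{e_2 - 1}) \cdots (p_n^{e_n} - p_n^{e_n - 1}) \\
&= p_1^{e_1}\left(1 - \frac{1}{p_1}\right) p_2^{e_2}\left(1 - \frac{1}{p_2}\right) \cdots p_n^{e_n}\left(1 - \frac{1}{p_n}\right) \\
&= m \prod_{p \mid m} \left(1 - \frac{1}{p}\right).
\end{aligned}
\]

Comment: In $\prod_{p \mid m} (1 - \frac{1}{p})$ it is better to let $m$ range over all integers strictly greater than 1, unless we define the empty product $\prod \emptyset = 1$.

□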

Solution 44: (i) If $n$ is odd, then $\gcd(2, n) = 1$. Therefore,

\[ \varphi(2n) = \varphi(2)\varphi(n) = 1 \cdot \varphi(n) = \varphi(n). \]

□

(ii) Because $n \ge 1$, we know $\varphi(n) \ne 0$. Let's discuss it in cases.



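Case 1: $\gcd(3, n) = 1$.

\[ \varphi(3n) = \varphi(3)\varphi(n) = 2 \cdot \varphi(n) \ne \varphi(n). \]

Case 2: $\gcd(3, n) \ne 1$.

Because 3 is a prime number, if $\gcd(3, n) \ne 1$, then $3 \mid n$. Let $n = 3^k m$, where $k \ge 1$ and $\gcd(3, m) = 1$.

\[ \varphi(n) = \varphi(3^k m) = \varphi(3^k)\varphi(m) = (3^k - 3^{k-1})\varphi(m). \]

\[ \varphi(3n) = \varphi(3^{k+1} m) = \varphi(3^{k+1})\varphi(m) = (3^{k+1} - 3^k)\varphi(m). \]

Because $3^k - 3^{k-1} \ne 3^{k+1} - 3^k$, we get $\varphi(n) \ne \varphi(3n)$.

In both cases, $\varphi(n) \ne \varphi(3n)$. □

(iii) If $n$ is even, then we can write $n$ as $2^k m$, where $k \ge 1$ and $\gcd(2, m) = 1$.

\[ \varphi(2n) = \varphi(2 \cdot 2^k m) = \varphi(2^{k+1})\varphi(m) = (2^{k+1} - 2^k)\varphi(m). \]

\[ 2\varphi(n) = 2\varphi(2^k m) = 2\varphi(2^k)\varphi(m) = 2(2^k - 2^{k-1})\varphi(m) = (2^{k+1} - 2^k)\varphi(m). \]

Therefore, $\varphi(2n) = 2\varphi(n)$. □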

Solution 45: Any integer $n \ge 1$ can be represented as a product of prime powers, $n = p_1^{e_1} p_2^{e_2} \cdots p_n^{e_n}$, and

\[ \varphi(n) = \varphi(p_1^{e_1}) \varphi(p_2^{e_2}) \cdots \varphi(p_n^{e_n}). \]

Thus, we find all possible combinations of $p_1^{e_1}, p_2^{e_2}, \ldots, p_n^{e_n}$ that make $\varphi(n) = 8$.



To do this, let's list the values of $\varphi(p_i^{e_i})$ in a systematic manner:

    ϕ(2) = 1     ϕ(2^2) = 2    ϕ(2^3) = 4    ϕ(2^4) = 8
    ϕ(3) = 2     ϕ(3^2) = 6    ϕ(3^3) = 18   · · ·
    ϕ(5) = 4     ϕ(5^2) = 20   · · ·
    ϕ(7) = 6     ϕ(7^2) = 42   · · ·
    ϕ(11) = 10   · · ·

Although $\varphi$ is not a monotone increasing function, the integers in each column, representing $\lambda p \cdot \varphi(p^e)$, are monotone increasing. Likewise, the integers in each row, representing $\lambda e \cdot \varphi(p^e)$, are monotone increasing. Here $p$ ranges over prime numbers and $e$ over natural numbers. Therefore, we don't have to consider the $p$ and $e$ where $\varphi(p^e)$ is greater than 8.

It is not difficult to search the table and find that there are only 5 possible numbers that satisfy the desired property:

    ϕ(2)ϕ(3)ϕ(5) = 8    n = 30
    ϕ(2^2)ϕ(5) = 8      n = 20
    ϕ(2^3)ϕ(3) = 8      n = 24
    ϕ(2^4) = 8          n = 16
    ϕ(3)ϕ(5) = 8        n = 15

Therefore, $\varphi(n) = 8$ for $n \in \{30, 20, 24, 16, 15\}$. □
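The table search can be cross-checked by brute force. The sketch below is an illustration, not the method of the text. The names `phi_by_counting` and `totient_preimage` are assumptions, and the search bound relies on the standard inequality $\varphi(n) \ge \sqrt{n/2}$, which is not stated in this chapter: it implies that $\varphi(n) = 8$ forces $n \le 2 \cdot 8^2 = 128$.

```python
from math import gcd


def phi_by_counting(n: int) -> int:
    """Euler's totient straight from the definition: count k in 1..n coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def totient_preimage(value: int, bound: int) -> list:
    """All n in 1..bound with phi(n) == value, in increasing order."""
    return [n for n in range(1, bound + 1) if phi_by_counting(n) == value]
```

With the bound 128, `totient_preimage(8, 128)` lists the same five integers found by the table search.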


Chapter 7

Binomial Theorem and Counting

“What’s one and one and one and one and one and one and one and one and one and one?”

“I don’t know,” said Alice, “I lost count.”

“She can’t do addition,” said the Red Queen.

– Lewis Carroll


7.1 The Binomial Theorem

Study of the Binomial Theorem is important for several reasons. In its simplest form this celebrated theorem gives a general formula for writing $(1 + x)^n$ in powers of $x$, where $n$ is a positive integer. Moreover, this result can be extended to the case when $n$ is a real number. Another important reason to study the Binomial Theorem is that the coefficients in the expansion of $(1 + x)^n$ have many interesting counting properties.

Theorem 7.1 (The Binomial Theorem) For all nonnegative integers $n$,

\[
\begin{aligned}
(1 + x)^n &= \sum_{k=0}^{n} \binom{n}{k} x^k \\
&= 1 + nx + \frac{n(n-1)}{2} x^2 + \cdots + \frac{n!}{k!(n-k)!} x^k + \cdots + x^n.
\end{aligned}
\]

The coefficient $\binom{n}{k}$ is known as the Binomial Coefficient, which is the same as the number of ways to choose $k$ items out of $n$. For example, to evaluate $1.04^4$ we can apply the above theorem in a straightforward manner:

\[
\begin{aligned}
1.04^4 = (1 + 0.04)^4 &= 1 + \binom{4}{1} 0.04 + \binom{4}{2} 0.04^2 + \binom{4}{3} 0.04^3 + \binom{4}{4} 0.04^4 \\
&= 1 + 4 \times 0.04 + 6 \times 0.0016 + 4 \times 0.000064 + 0.00000256 \\
&= 1.16985856.
\end{aligned}
\]

The above result applies to the more general case where we wish to evaluate $(a + x)^n$, due to the equality $(a + x)^n = a^n (1 + \frac{x}{a})^n$.

The interpretation of the Binomial Coefficient can be extended so that it applies when $n$ is any real number, thus extending the application of the above Binomial Theorem. Note that for any integers $n$ and $k$ with $0 \le k \le n$, we observe that

\[ \binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1) \cdots (n-k+1)}{k!}. \tag{7.1} \]

The definition of $\binom{n}{k}$ can be generalized in terms of equation (7.1) above for real values of $n$ as follows.

Definition 7.1: For any real number $r$ and nonnegative integer $k$, define

\[ \binom{r}{k} = \frac{r(r-1) \cdots (r-k+1)}{k!}. \]



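Although the question “How many ways can we select $k$ items out of $-\frac{1}{2}$?” is absurd, there is a perfectly legitimate interpretation of $\binom{-1/2}{k}$: For $k = 0, 1, 2, \ldots$,

\[
\begin{aligned}
\binom{-\frac{1}{2}}{k} &= \frac{(-\frac{1}{2})(-\frac{1}{2} - 1) \cdots (-\frac{1}{2} - k + 1)}{k!} \\
&= \frac{(-\frac{1}{2})(-\frac{3}{2}) \cdots (-\frac{2k-1}{2})}{k!} \\
&= (-1)^k \, \frac{1 \times 3 \times 5 \times \cdots \times (2k-1)}{2^k \, k!} \\
&= (-1)^k \, \frac{1 \times 2 \times 3 \times \cdots \times (2k-1) \times (2k)}{2^k \, k! \times 2 \times 4 \times \cdots \times (2k)} \\
&= \frac{(-1)^k}{4^k} \, \frac{(2k)!}{k! \, k!} \\
&= \frac{(-1)^k}{4^k} \binom{2k}{k}.
\end{aligned}
\]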

In brief, we now have a general result:

\[ (1 + x)^n = \sum_{k \ge 0} \binom{n}{k} x^k. \tag{7.2} \]

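As an application of the above result we can approximate the value of $1.04^{-\frac{1}{2}}$ to the desired degree of accuracy as follows:

\[
\begin{aligned}
(1 + 0.04)^{-\frac{1}{2}} &= \sum_{k \ge 0} \binom{-\frac{1}{2}}{k} (0.04)^k \\
&= \sum_{k \ge 0} (-1)^k \frac{1}{4^k} \binom{2k}{k} (0.04)^k \\
&= 1 - \frac{1}{4}\binom{2}{1} 0.04 + \frac{1}{4^2}\binom{4}{2} 0.04^2 - \frac{1}{4^3}\binom{6}{3} 0.04^3 + \frac{1}{4^4}\binom{8}{4} 0.04^4 - \frac{1}{4^5}\binom{10}{5} 0.04^5 + \cdots \\
&\approx 1 - 0.02 + 0.0006 - 0.00002 + 0.0000007 \\
&= 0.9805807.
\end{aligned}
\]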

The expansion in Equation (7.2), applicable to all real values of $n$, is known as the Binomial Series Theorem.

The Binomial Theorem can be proved by an application of mathematical induction, whereas a proof of the Binomial Series Theorem can be obtained by the Taylor Series Expansion.

Comment: The Binomial Series Theorem, although applicable in general, generates a convergent series only if $|x| < 1$.
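Definition 7.1 and Equation (7.2) translate directly into a few lines of code. The Python sketch below is illustrative: the names `gen_binom` and `binomial_series` are assumptions, floating-point arithmetic is used, and the series is simply truncated after a fixed number of terms, which is reasonable only when $|x| < 1$.

```python
def gen_binom(r: float, k: int) -> float:
    """Generalized binomial coefficient r(r-1)...(r-k+1)/k! for real r and integer k >= 0."""
    result = 1.0
    for i in range(k):
        result *= (r - i) / (i + 1)    # build the product one factor at a time
    return result


def binomial_series(r: float, x: float, terms: int) -> float:
    """Partial sum of (1 + x)^r = sum over k >= 0 of C(r, k) x^k, using the first `terms` terms."""
    return sum(gen_binom(r, k) * x ** k for k in range(terms))
```

With `r = -0.5` and `x = 0.04` the partial sums approach $1.04^{-1/2}$ quickly, and with `r = 4` five terms reproduce the finite expansion of $1.04^4$.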



The multinomial theorem extends the Binomial Theorem to sums of more than two terms.

Theorem 7.2

\[ (x_1 + x_2 + \cdots + x_m)^n = \sum \frac{n!}{i_1!\,i_2!\cdots i_m!}\; x_1^{i_1} x_2^{i_2}\cdots x_m^{i_m}, \]

where each $i_k$ is a nonnegative integer and $\sum_{\ell=1}^{m} i_\ell = n$.

For instance, for nonnegative integer values of n and given x, y, and z,

\[ (x+y+z)^n = \sum_{\substack{0\le i,j,k\le n \\ i+j+k=n}} \frac{n!}{i!\,j!\,k!}\; x^i y^j z^k. \]

The Binomial Coefficients satisfy several important equalities. Some basic identities are listed in the following theorems (be careful of the ranges of the variables in each theorem):

Theorem 7.3 (Symmetry Identity) If n, k ∈ Z, and 0 ≤ k ≤ n, then

\[ \binom{n}{k} = \binom{n}{n-k}. \]

Theorem 7.4 (Absorption Identities) If k ∈ Z, r ∈ R, and k ≠ 0, then

\[ \binom{r}{k} = \frac{r}{k}\binom{r-1}{k-1}, \]

\[ k\binom{r}{k} = r\binom{r-1}{k-1}, \]

\[ (r-k)\binom{r}{k} = r\binom{r-1}{k}. \]

Theorem 7.5 (Addition Formula) If k ∈ Z and r ∈ R, then

\[ \binom{r}{k} = \binom{r-1}{k} + \binom{r-1}{k-1}. \]

Theorem 7.6 (Knight’s-move Identity) If n, k ∈ Z and 0 ≤ k, then

\[ \sum_{0\le i\le k}\binom{n-i}{k-i} = \binom{n+1}{k}. \]



Theorem 7.7 (Summation of Index Identities) If r ∈ R and m, n ∈ Z, then

\[ \sum_{k\le n}\binom{r+k}{k} = \binom{r+n+1}{n}, \]

and

\[ \sum_{0\le k\le n}\binom{k}{m} = \binom{n+1}{m+1}. \]

Theorem 7.8 (Negating Upper Index Identity) If k ∈ Z and r ∈ R, then

\[ \binom{r}{k} = (-1)^k\binom{k-r-1}{k}. \]

Theorem 7.9 (Vandermonde Identity) If r, s ∈ R and m, n ∈ Z, then

\[ \sum_{k\in\mathbb{Z}}\binom{r}{m+k}\binom{s}{n-k} = \binom{r+s}{n+m}. \]
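Several of the identities above can be spot-checked numerically for non-integer upper indices. The sketch below is illustrative only: the function `binom` is a hypothetical helper implementing Definition 7.1, and it assumes the common convention that the coefficient is 0 when the lower index is negative. The sample values of r and s are arbitrary.

```python
from fractions import Fraction

def binom(r, k):
    """Definition 7.1 for integer k >= 0; assumed to be 0 when k < 0."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)
    for i in range(k):
        result = result * (Fraction(r) - i) / (i + 1)
    return result

r, s = Fraction(5, 2), Fraction(-7, 3)   # arbitrary rational test values

# Theorem 7.5 (addition formula)
addition_ok = all(binom(r, k) == binom(r - 1, k) + binom(r - 1, k - 1)
                  for k in range(8))
# Theorem 7.8 (negating the upper index)
negation_ok = all(binom(r, k) == (-1) ** k * binom(k - r - 1, k)
                  for k in range(8))
# Theorem 7.9 (Vandermonde) in the special case m = 0
vandermonde_ok = all(
    sum(binom(r, k) * binom(s, n - k) for k in range(n + 1)) == binom(r + s, n)
    for n in range(8)
)
print(addition_ok, negation_ok, vandermonde_ok)
```

A numerical check of this kind is not a proof, but it is a quick way to catch a misremembered index range before using an identity.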

7.2 Principles and Typical Problems for Counting

In this section we first present some basic rules and principles for counting problems. We next consider four cases of counting in terms of the “urns and balls” model and show how this model can help us evaluate many interesting problems.

Theorem 7.10 (Rule of Sum) Let A and B be two events. Then,

\[ |A\cup B| = \begin{cases} |A|+|B| & \text{if } A\cap B=\emptyset, \\ |A|+|B|-|A\cap B| & \text{otherwise.} \end{cases} \]

In words, suppose there are n ways in which event A occurs, and m ways in which event B occurs. There are then n + m ways in which one of the events A or B occurs, provided A ∩ B = ∅. If the two sets are not disjoint, a correction by |A ∩ B| is necessary to avoid double counting.

The above idea can be generalized to more than two sets and gives an important result known as the inclusion-exclusion principle. To present the result succinctly, we first introduce some notation. Let $A_1, A_2, \ldots, A_n$ be n events. Let $a_i = |A_i|$ for $1 \le i \le n$, $a_{ij} = |A_i \cap A_j|$ for $1 \le i < j \le n$, $a_{ijk} = |A_i \cap A_j \cap A_k|$ for $1 \le i < j < k \le n$, etc. Finally, let $S_1 = \sum_{i} a_i$, $S_2 = \sum_{i<j} a_{ij}$, $S_3 = \sum_{i<j<k} a_{ijk}$, etc., so that each unordered pair, triple, and so on is counted exactly once. We then have the following theorem.

Theorem 7.11 (Inclusion-Exclusion Principle)

\[ |A_1\cup A_2\cup\cdots\cup A_n| = S_1 - S_2 + S_3 - \cdots \pm S_n = \sum_{i=1}^{n}(-1)^{i-1}S_i. \]
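The alternating sum in the inclusion-exclusion principle can be sketched in a few lines. This is an illustrative example under stated assumptions: the function name and the sample events (multiples of 2, 3, and 5 among 1..30) are hypothetical choices, not taken from the text. Each S_i is computed as a sum over unordered choices of i sets.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """Compute |A_1 ∪ ... ∪ A_n| as S_1 - S_2 + S_3 - ... ± S_n."""
    total = 0
    for size in range(1, len(sets) + 1):
        # S_size: sum of intersection sizes over all unordered choices of `size` sets
        s = sum(len(set.intersection(*group))
                for group in combinations(sets, size))
        total += (-1) ** (size - 1) * s
    return total

# Hypothetical example: multiples of 2, 3, and 5 among 1..30
universe = range(1, 31)
events = [{x for x in universe if x % d == 0} for d in (2, 3, 5)]

by_formula = union_size_by_inclusion_exclusion(events)
by_direct_count = len(set().union(*events))
print(by_formula, by_direct_count)
```

Here S_1 = 15 + 10 + 6, S_2 = 5 + 3 + 2, and S_3 = 1, so the formula gives 31 − 10 + 1 = 22, in agreement with the direct count.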

Theorem 7.12 (Rule of Product) Suppose A and B are two events such that |A| = n and |B| = m. There are then n × m ways in which the event A × B occurs. In other words, |A × B| = |A| × |B|.

More generally, Theorem 7.12 extends to more than two sets. If $A_1, A_2, \ldots, A_n$ are n events such that $|A_i| = m_i$, then

\[ |A_1\times A_2\times\cdots\times A_n| = m_1\times m_2\times\cdots\times m_n. \]

Theorem 7.13 (Permutation) The number of permutations of n distinguishable objects is given by n!.

For example, the six possible permutations of three objects a, b, c are (a, b, c), (a, c, b), (b, a, c), (b, c, a), (c, a, b), (c, b, a). The number n! is also equal to the number of bijective functions between two n-sets.

Theorem 7.14 (Combination of n objects taken k at a time) In short, we say n choose k. The number of ways to select k objects from n distinguishable objects is $\binom{n}{k}$.

This is also equal to the coefficient of $x^k$ in the expansion of $(1 + x)^n$. For this reason, $\binom{n}{k}$ is also known as the Binomial Coefficient. Another popular notation for this expression is C(n, k). Note that

\[ C(n,k) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \]

For example, $C(5, 2) = \binom{5}{2} = \frac{5!}{2!\,3!} = 10$, and represents the number of ways we can select 2 persons out of 5.



Theorem 7.15 Let A be a collection of n objects that are not all distinct. Suppose there are m kinds of objects in A, among which there are $r_1$ objects of the 1st kind, $r_2$ objects of the 2nd kind, ..., and $r_m$ objects of the mth kind. The total number of permutations of these n objects is

\[ \frac{n!}{r_1!\,r_2!\cdots r_m!}. \]

Here $r_1 + r_2 + \cdots + r_m = n$, and if all these n objects are distinct, then m = n and $r_1 = r_2 = \cdots = r_m = 1$.

Theorem 7.16 (Permutation of n objects taken k at a time) The number of ways to select and permute k objects from n distinguishable objects is given by $\binom{n}{k}\,k!$.

This is also denoted by P(n, k) and is called the number of permutations of k objects out of n. For example, we can choose two letters out of a, b, c, d in $\binom{4}{2} = 6$ different ways. These six selections are:

{a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}.

However, the set {a, b} represents the selection of a and b; it does not order them. Two distinct permutations of these letters are (a, b) and (b, a). Thus, by the product rule there are 6 × 2 = 12 different ordered selections of two letters out of four. The number of injective functions from an r-set to an n-set is exactly equal to P(n, r).
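The counts in Theorems 7.14 to 7.16 can be confirmed by direct enumeration. The sketch below uses only the Python standard library; the multiset "aabbb" is a hypothetical example chosen to illustrate Theorem 7.15 and does not appear in the text.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

letters = "abcd"
selections = list(combinations(letters, 2))    # unordered selections: C(4, 2)
arrangements = list(permutations(letters, 2))  # ordered selections: P(4, 2)

print(len(selections), comb(4, 2))
print(len(arrangements), perm(4, 2), comb(4, 2) * factorial(2))

# Theorem 7.15: distinct arrangements of 2 a's and 3 b's (hypothetical example)
distinct = set(permutations("aabbb"))
print(len(distinct), factorial(5) // (factorial(2) * factorial(3)))
```

Enumerating and then deduplicating, as in the last step, is only practical for small inputs; the closed formulas are what make larger cases tractable.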

7.2.1 Urns and Balls Model

Consider the case of n urns and b balls, where n and b are positive integers. There is no specific order among the balls and an urn can hold as many as b balls.

There are four distinct cases, depending upon whether the balls are distinguishable or indistinguishable and the urns are distinguishable or indistinguishable, as seen in Figure 7.1.

We consider the four cases shown in Figure 7.1, enumerate the number of ways to distribute balls among the urns, and show how the model applies to other interesting applications.



[Figure 7.1 panels 1a-1c, 2a-2c, 3a-3c, and 4a-4c: four balls placed in two urns. Balls are numbered 1-4 and urns are labeled I and II in the rows where they are distinguishable.]

1. Consider the first row of the figure, in which the balls and urns are both distinguishable. We can tell the difference between 1a, 1b, and 1c.

2. In the second row the balls are indistinguishable and the urns are distinguishable. In this case we cannot tell the difference between 2a and 2b, but we can tell that 2b and 2c are two different ways in which to distribute four balls.

3. In the third row the balls are distinguishable, but the urns are indistinguishable. In this case, 3a and 3c are the same way in which to distribute four balls, while 3b is a different way.

4. In the last case, shown in the fourth row, balls and urns are both indistinguishable. Thus, 4a, 4b, and 4c are identical: they are the same way in which to distribute four balls among two urns.

Figure 7.1: Distribution of four balls in two urns under four different assumptions



Theorem 7.17 The number of ways to distribute b distinguishable balls into n distinguishable urns is given by $n^b$.

This observation is straightforward; there are n different ways to distribute each ball. This number is also equal to the number of functions from a b-set to an n-set.

Theorem 7.18 The number of ways to distribute b indistinguishable balls into n distinguishable urns is given by

\[ \binom{b+n-1}{b}. \]

Proof: A simple observation gives the desired result. Consider the following figure where we have distributed six balls among three distinguishable urns; a “o” denotes a ball, and a “|” denotes a boundary between two urns. In the figure the first urn has two balls, the second urn is empty, and the third urn has four balls.

o o | | o o o o

Note that in this example we have used two |’s to represent two boundaries between three urns. So, in general, we need n − 1 sticks to represent the boundaries between n urns. We seek the solution of the problem: How many ways are there to distribute n − 1 |’s among n + b − 1 distinct places? In other words, how many ways are there to select n − 1 objects from n + b − 1 distinguishable objects? The answer is

\[ \binom{b+n-1}{n-1} = \binom{b+n-1}{b}. \]

□

The number $\binom{b+n-1}{b}$ is also equal to the number of permutations of b identical objects of one kind and n − 1 identical objects of the other kind. Finally, it also equals the number of different nonnegative integer solutions of

\[ x_1 + x_2 + \cdots + x_n = b. \]
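The equivalence with nonnegative integer solutions can be checked by brute force for small values. In this minimal sketch, the function name `count_solutions` and the choice n = 3, b = 6 (matching the six-ball, three-urn picture in the proof) are illustrative assumptions.

```python
from itertools import product
from math import comb

def count_solutions(n, b):
    """Brute-force count of nonnegative integer solutions of x_1 + ... + x_n = b."""
    return sum(1 for xs in product(range(b + 1), repeat=n) if sum(xs) == b)

n, b = 3, 6
brute = count_solutions(n, b)
formula = comb(b + n - 1, b)
print(brute, formula)
```

The brute-force search examines (b + 1)^n tuples, so it is useful only as a sanity check; the binomial coefficient gives the same count directly.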

Theorem 7.19 The number of ways to put b distinguishable balls into n indistinguishable urns is

\[ \sum_{j=0}^{n}\sum_{k=0}^{j}\frac{(-1)^k}{j!}\binom{j}{k}(j-k)^b. \]

The proof of Theorem 7.19 is lengthy and requires several intermediate results. Consequently we break it into several small theorems.



Theorem 7.20 The number of m-digit integers each containing the digits 0 through 9 at least once is given by

\[ \sum_{k=0}^{10}(-1)^k\binom{10}{k}(10-k)^m. \]

Proof: Let $B_i$ be the set of all m-digit integers that do not contain the digit i, for i = 0, 1, . . . , 9, and let $A_i = \overline{B_i}$. Clearly, the set $A_i$ consists of all m-digit integers that contain the digit i at least once. We wish to find $|A_0 \cap A_1 \cap \cdots \cap A_9|$. But

\begin{align*}
|A_0\cap A_1\cap\cdots\cap A_9| &= |\overline{B_0}\cap\overline{B_1}\cap\cdots\cap\overline{B_9}| \\
&= |\overline{B_0\cup B_1\cup\cdots\cup B_9}| \\
&= S_0 - |B_0\cup B_1\cup\cdots\cup B_9|, \tag{7.3}
\end{align*}

where $S_0$ denotes the number of all m-digit integers. Obviously, $S_0 = 10^m$; there are ten possible digits for each of the m possible positions. From the inclusion-exclusion principle,

\[ |B_0\cup B_1\cup\cdots\cup B_9| = S_1 - S_2 + S_3 - \cdots + S_9 - S_{10}, \]

where $S_1 = \sum_{i=0}^{9}|B_i|$, $S_2 = \sum_{i<j}|B_i\cap B_j|$, $S_3 = \sum_{i<j<k}|B_i\cap B_j\cap B_k|$, etc. In turn, $|B_i| = (10-1)^m$ because all possible digits except i are allowed, $|B_i\cap B_j| = (10-2)^m$ because all possible digits except i and j are allowed, and $|B_i\cap B_j\cap B_k| = (10-3)^m$ because all possible digits except i, j, k are allowed. There are $\binom{10}{1}$ ways to choose i, $\binom{10}{2}$ ways to choose i and j, $\binom{10}{3}$ ways to choose i, j and k, etc. Therefore, $S_1 = \binom{10}{1}(10-1)^m$, $S_2 = \binom{10}{2}(10-2)^m$, $S_3 = \binom{10}{3}(10-3)^m$, etc. Finally,

\begin{align*}
|B_0\cup B_1\cup\cdots\cup B_9| &= \binom{10}{1}(10-1)^m - \binom{10}{2}(10-2)^m + \binom{10}{3}(10-3)^m - \cdots \\
&= \sum_{k=1}^{10}(-1)^{k-1}\binom{10}{k}(10-k)^m.
\end{align*}

Substitution of this value in Equation (7.3), together with $S_0 = 10^m = \binom{10}{0}(10-0)^m$, gives the desired result:

\[ |A_0\cap A_1\cap\cdots\cap A_9| = \sum_{k=0}^{10}(-1)^k\binom{10}{k}(10-k)^m. \]

□



Theorem 7.21 The number of ways to put b distinguishable balls in j distinguishable urns so that no urn is empty is

\[ \sum_{k=0}^{j}(-1)^k\binom{j}{k}(j-k)^b. \]

Proof: The proof is similar to that of Theorem 7.20. We want all j urns to be used (just like all ten digits) while distributing b balls (just like all m positions). □

The number obtained for Theorem 7.21 is equal to the number of surjective functions from a b-set to a j-set.
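The surjection count can be compared against a direct enumeration of functions. This is a small sketch with hypothetical function names; a function from a b-set to a j-set is represented as a tuple of b values drawn from range(j), and it is surjective exactly when all j values occur.

```python
from itertools import product
from math import comb

def surjections_formula(b, j):
    """Inclusion-exclusion formula of Theorem 7.21."""
    return sum((-1) ** k * comb(j, k) * (j - k) ** b for k in range(j + 1))

def surjections_brute_force(b, j):
    """Count tuples (f(1), ..., f(b)) over range(j) that use every value."""
    return sum(1 for f in product(range(j), repeat=b) if len(set(f)) == j)

print(surjections_formula(5, 3), surjections_brute_force(5, 3))
print(surjections_formula(3, 3), surjections_brute_force(3, 3))
```

When b = j the surjections are exactly the bijections, so the second line agrees with 3! = 6.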

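Theorem 7.22 The number of ways to put b distinguishable balls in j indistinguishable urns so that no urn is empty is

\[ \frac{1}{j!}\sum_{k=0}^{j}(-1)^k\binom{j}{k}(j-k)^b. \tag{7.4} \]

Proof: The proof is straightforward. There are j! permutations of j distinguishable urns, so each distribution into j indistinguishable nonempty urns corresponds to exactly j! of the distributions counted in Theorem 7.21. □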

Proof of Theorem 7.19: There are n indistinguishable urns, and when b distinguishable balls are distributed among them some of them may be empty. Let there be j nonempty urns for j = 0, 1, . . . , n. For a fixed j the number of ways of distribution is given by Theorem 7.22. Thus by the rule of sum we obtain the desired result, i.e.,

\[ \sum_{j=0}^{n}\sum_{k=0}^{j}\frac{(-1)^k}{j!}\binom{j}{k}(j-k)^b. \]

□
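The double sum of Theorem 7.19 can be checked against a brute-force enumeration of the ways to split a set of labeled balls into at most n unlabeled nonempty groups. The sketch below is illustrative; both function names are hypothetical. The enumeration places each ball either into an already opened group or into a new group, which generates every grouping exactly once.

```python
from fractions import Fraction
from math import comb, factorial

def theorem_7_19(b, n):
    """Double sum of Theorem 7.19, evaluated exactly."""
    total = Fraction(0)
    for j in range(n + 1):
        inner = sum((-1) ** k * comb(j, k) * (j - k) ** b for k in range(j + 1))
        total += Fraction(inner, factorial(j))
    return total

def count_groupings(items, n):
    """Brute force: ways to split `items` into at most n unlabeled nonempty groups."""
    def place(i, groups):
        if i == len(items):
            return 1
        count = 0
        for g in groups:            # put item i into an existing group
            g.append(items[i])
            count += place(i + 1, groups)
            g.pop()
        if len(groups) < n:         # or open a new group with item i
            groups.append([items[i]])
            count += place(i + 1, groups)
            groups.pop()
        return count
    return place(0, [])

print(theorem_7_19(4, 2), count_groupings([1, 2, 3, 4], 2))
print(theorem_7_19(5, 3), count_groupings([1, 2, 3, 4, 5], 3))
```

For four balls and two urns both methods give 8, which is the number of distinct pictures possible in the third row of Figure 7.1.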

In how many ways can b indistinguishable balls be distributed among n indistinguishable urns? An answer to this question is not available in closed form. Several approximations and asymptotic expressions are available; we present partial insight into the problem.

Let p(n) denote the number of distinct ways a positive integer n can be written as a sum of positive integers, where the order of the summands does not matter. For example, five can be written as the sum of the elements of any one of the following collections, so p(5) = 7.

\[
\begin{array}{lllll}
\{5\} & \{4,1\} & \{3,1,1\} & \{2,1,1,1\} & \{1,1,1,1,1\} \\
      & \{3,2\} & \{2,2,1\} & &
\end{array}
\]

Likewise, p(6) = 11, as can be seen from Table 7.1.



\[
\begin{array}{llllll}
\{6\} & \{5,1\} & \{4,1,1\} & \{3,1,1,1\} & \{2,1,1,1,1\} & \{1,1,1,1,1,1\} \\
      & \{4,2\} & \{3,2,1\} & \{2,2,1,1\} & & \\
      & \{3,3\} & \{2,2,2\} & & &
\end{array}
\]

Table 7.1: The partitions of integer 6

p(n) is known as the number of ways to partition an integer n. What does p(n) have to do with the problem of distributing balls into urns? Consider, for example, the third column of Table 7.1, in which each set containing three elements represents a case of distributing six indistinguishable balls into three indistinguishable urns provided none of the three urns is empty. We have one case in the first column, three cases in the second column, three cases in the third column, and so on. Altogether, we have 11 cases, which is the number of ways to distribute six indistinguishable balls into six indistinguishable urns. Note that some of the six urns may be empty in these 11 cases. If only three urns are available, then the ways to distribute six indistinguishable balls into three indistinguishable urns are the cases in Table 7.1 up to the third column. On the other hand, if there are more than six urns, then we can think of a table that is similar to Table 7.1 with more columns attached, but in which all columns after the sixth are empty, because we cannot distribute six balls into more than six nonempty urns.

In general, let p(b, n) denote the number of ways to distribute b indistinguishable balls among n indistinguishable urns. A recurrence relation, stated in the following theorem, allows us to evaluate p(b, n) in terms of its smaller values.

Theorem 7.23 For all positive integers b and n,

\[ p(b,n) = \sum_{1\le k\le n} p(b-k,\,k). \tag{7.5} \]

Proof: From the definitions of p(n) and p(b, n) it is easy to see that p(n, n) = p(n) for all n. We adopt the conventions that p(0, n) = 1 for n ≥ 1 (there is exactly one way to distribute no balls) and that p(b, n) = 0 if either b < 0 or n ≤ 0. Therefore,

\[ \sum_{1\le k\le n} p(b-k,\,k) = \sum_{k=1}^{\min(b,n)} p(b-k,\,k). \]

Without loss of generality, assume b ≥ n. The recurrence relation follows from the fact that p(b, n) is the sum of the numbers of ways to distribute b balls among n urns in the cases of one nonempty urn, two nonempty urns, three nonempty urns, ..., and n nonempty urns. For each k where 1 ≤ k ≤ n, to make sure that there are exactly k nonempty urns, we first give each of k urns a ball, then we distribute the remaining b − k balls among these k urns without restriction: there are p(b − k, k) ways. Applying the rule of sum, we have the result. □
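The recurrence (7.5) translates directly into a memoized function. This is a minimal sketch, assuming the base conventions p(0, n) = 1 (one way to distribute no balls) and p(b, n) = 0 when b < 0 or n ≤ 0; the function name `p` mirrors the notation of the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(b, n):
    """Ways to distribute b indistinguishable balls among n indistinguishable urns."""
    if b == 0:
        return 1          # assumed convention: exactly one way to place no balls
    if b < 0 or n <= 0:
        return 0
    # k counts the nonempty urns; each gets one ball, the rest go anywhere among them
    return sum(p(b - k, k) for k in range(1, n + 1))

print(p(5, 5))   # partitions of 5
print(p(6, 6))   # partitions of 6, as in Table 7.1
print(p(6, 3))   # first three columns of Table 7.1
```

The three printed values correspond to p(5), p(6), and the number of entries in the first three columns of Table 7.1 (1 + 3 + 3).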



7.2.2 Summary

We summarize the most typical counting problems in this section. Sometimes two problems share the same underlying concept, although they may look very different. Thus, we can solve them by using the same method. Here we put some related problems together followed by their solutions.

1. Rule of Sum: Suppose there are n ways in which event A occurs and m ways in which event B occurs; then there are n + m ways in which one of the events A or B occurs. [Note: We assume that A ∩ B = ∅.]

2. Rule of Product: Suppose there are n ways in which event A occurs and m ways in which event B occurs; then there are n × m ways in which both events A and B occur. [Note: We assume that A ∩ B = ∅.]

3. Related problems:
(a) The number of permutations of n distinguishable objects.
(b) The number of bijective functions between two n-sets.
Formula: $n!$.
Notation: P(n).

4. Related problems:
(a) The number of ways to select r objects from n distinguishable objects.
(b) The coefficient of $x^r$ in the expansion of $(1 + x)^n$.
Formula: $\binom{n}{r}$.
Notation: C(n, r).

5. Related problems:
(a) The number of ways to select and permute r objects from n distinguishable objects.
(b) The number of injective functions from an r-set to an n-set.
Formula: $\binom{n}{r}\,r!$.
Notation: P(n, r).

6. Related problems:
(a) The number of permutations of $a_1$ objects of the 1st kind, $a_2$ objects of the 2nd kind, . . ., $a_n$ objects of the nth kind.
Formula: $\dfrac{(a_1 + a_2 + \cdots + a_n)!}{a_1!\,a_2!\cdots a_n!}$.

7. Related problems:
(a) The number of ways to distribute b distinguishable balls into n distinguishable urns.
(b) The number of functions from a b-set to an n-set.
Formula: $n^b$.

8. Related problems:
(a) The number of ways to distribute b indistinguishable balls into n distinguishable urns.
(b) The number of permutations of b identical objects of one kind and n − 1 identical objects of the other kind.
(c) The number of different nonnegative integer solutions of $x_1 + x_2 + \cdots + x_n = b$.
Formula: $\binom{b+n-1}{b}$.
Notation: Some textbooks denote this number as s(b, n).

9. Related problems:
(a) The number of ways to distribute b distinguishable balls into n distinguishable urns, provided none of the urns is empty.
(b) The number of surjective functions from a b-set to an n-set.
Formula: $\sum_{0\le j}(-1)^j\binom{n}{j}(n-j)^b$.

10. Related problems:
(a) The number of ways to distribute b distinguishable balls into n indistinguishable urns, provided none of the urns is empty.
(b) The number of ways to partition a b-set into n cells.
Formula: $\dfrac{1}{n!}\sum_{0\le j}(-1)^j\binom{n}{j}(n-j)^b$.
Notation: This number is also known as the Stirling number of the second kind, written S(b, n) or $\genfrac{\{}{\}}{0pt}{}{b}{n}$.



7.3 Problems

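Problem 1: Calculate $\binom{5}{3}$ and $\binom{-7}{3}$ directly from the definition.

Problem 2: Prove that, for integer n ≥ 0,

\[ \sum k\binom{n}{k} = n2^{n-1}. \]

Problem 3: Prove that, for integer n ≥ 0,

\[ \sum k(k-1)\binom{n}{k} = n(n-1)2^{n-2}. \]

Problem 4: Prove that, for integer n ≥ 0,

\[ \sum k^2\binom{n}{k} = n(n+1)2^{n-2}. \]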

Problem 5: Let n be any positive integer. Prove that

\[ \sum_{0\le k}\binom{n}{2k+1} = 2^{n-1}. \]

Problem 6: Prove, for integers n, k, where n ≥ 0 and k ≥ 1,

\[ \binom{n}{k} = \binom{n}{k-1}(n-k+1)/k. \]

Problem 7: Prove, for integers n ≥ 0 and k ≥ 1,

\[ \binom{n}{k} = (n/k)\binom{n-1}{k-1}. \]

Problem 8: For any real number n and integer k ≥ 0 the following equality is given by Definition 7.1.

\[ \binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}. \]

Prove by differentiating both sides of the following equation with respect to x that this definition is appropriate to extend the Binomial Theorem to real number n.

\[ (1+x)^n = \sum_{0\le k}\binom{n}{k}x^k. \]



Problem 9: Let

\[ f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n. \]

Prove that $(x^n)f(1/x)$ is the polynomial $\sum a_{n-k}x^k$. What is the degree of the new polynomial?

Problem 10: Find the coefficient of $x^4$ in the expansion of

\[ \left(2 + \frac{x}{4}\right)^{10}. \]

Problem 11: Prove

\[ \sum_{0\le k} 2^{6-k}\binom{6}{k} = 3^6. \]

Problem 12: Prove, for any integer n ≥ 0,

\[ \sum_{0\le k} 2^{n-k}\binom{n}{k} = 3^n. \]

Problem 13: Use the Binomial Theorem to calculate $(1.2)^{-1.2}$, accurate to three decimal places.

Problem 14: Prove that for all integers n and k, where n ≥ 2 and k ≥ 2,

\[ \binom{n}{k} = \binom{n-2}{k-2} + 2\binom{n-2}{k-1} + \binom{n-2}{k}. \]

Problem 15: For k ≥ 0 and m ≥ 0, prove that

\[ \sum_{0\le j\le k}\binom{m-j}{k-j} = \binom{m+1}{k}. \]

Problem 16: For k ≥ 0 and m ≥ 0, prove that

\[ \sum_{0\le n\le m}\binom{n}{k} = \binom{m+1}{k+1}. \]

Problem 17: Let n ≥ 0. Prove, for all r and k such that 0 ≤ r ≤ n and k ≥ 0,

\[ \binom{n}{k} = \sum_{0\le j}\binom{r}{j}\binom{n-r}{k-j}. \]

[Hint: Use the Vandermonde identity given on page 274.]



Problem 18: For all integers j ≥ 0, find integers $a_k$ (which depend also on j) so that

\[ \sum_{0\le k} a_k\binom{n}{k} = \binom{n+3}{j} \]

holds for all n ∈ N. [Hint: Try a few specific cases first.]

Problem 19: Use the Binomial Theorem to prove that for all x, −1 < x < 1,

\[ \frac{1}{1-x^2} = 1 + x^2 + x^4 + x^6 + x^8 + \cdots. \]

[Hint: $1 - x^2 = (1 + x)(1 - x)$, or let $-x^2 = u$.]

Problem 20: You are at a corner on a rectangular grid of streets. You want to go to a corner m blocks east and n blocks north. You may travel only north or east. Prove that the total number of different routes you may take is $\binom{m+n}{m}$.

Problem 21: Consider the following street map.

[Street map: a rectangular grid of streets with the corners A, B, C, and D marked.]

The map shows that B is three blocks east and one block north from A, C is three blocks east and two blocks north from B, and D is four blocks east and two blocks north from C. Suppose we want to move from A to D via B or C but not both, and we are allowed to move only north or east. How many different routes can we take?

Problem 22: In Manhattan the streets are one way, as shown in the fol-lowing map.

[Street map: a grid of one-way streets with the corners A and B marked.]



Suppose we are allowed to drive east and north only, and have to follow the street direction. How many different routes are there to drive from A to B in the above map?

Problem 23: Let n and r be nonnegative integers such that r ≥ n. Consider the multinomial expansion of

\[ (x_1 + \cdots + x_r)^n. \]

1. Prove that the number of terms in this expansion in which none of the $x_i$’s has exponent 2 or greater is $\binom{r}{n}$.

2. Prove also that each such term has coefficient n!.

Problem 24: Let f be a function defined as

f(x) = (1 + 2x)10.

Find the coefficient of $x^{10}$ in the expansion of

f(1− x2

6).

[Hint: what is f(1− x2

6 )?

Problem 25: We have proved that, for nonnegative integers n, r,

\[ \binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1} \]

by algebra. Give a combinatorial explanation for this equality; i.e., you are allowed to use only the property that $\binom{n}{r}$ is the number of ways to choose r objects from n distinguishable objects.

Problem 26: Suppose you have a computer with eight empty slots for interface cards, two parallel ports for printers, and four serial ports for modems, scanners or mice. Suppose you have three interface cards, one printer, one mouse, and one modem. In how many ways can you connect them to your computer?

Problem 27: From a laser printer n wires lead. They are all to be connected somehow to k terminals on a computer. How many ways are there to make these connections

1. if no two wires may be connected to the same terminal?

2. in general?

Problem 28: Students in a class of 43 vote on a date for an exam. Each student votes for just one of the five possible days. How many different ways may they vote?



Problem 29: Five students each choose a different topic from a list of eight topics for their final report. How many choices may the class make?

Problem 30: Consider a circle with n points marked on it. Let n ≥ 2. Suppose you want to draw a chord by connecting two of those n points. How many different chords can you draw?

Problem 31: Consider a circle with n points marked on it. Let n ≥ 3. In how many different ways can you draw a triangle by connecting 3 of these n points?

Problem 32: Consider a convex polygon with n vertices where n ≥ 3. Suppose you want to draw a triangle by connecting 3 vertices, but the triangle cannot share any sides with the polygon. In how many ways can you draw such a triangle?

Problem 33: How many different full houses with kings are there in a hand of poker?

Problem 34: How many different full houses with diamonds are there in a hand of poker?

Problem 35: How many strings are there of five 0’s and five 1’s which start with 101?

Problem 36: How many strings are there of seven 0’s, two 1’s, and one 2?

Problem 37: How many eight-digit long strings of 0’s and 1’s are there having no more than two 1’s? Express your answer in terms of binomial coefficients and explain.

Problem 38: You have one of each of the following coins: penny, nickel, dime, quarter, half-dollar, and dollar. How many different amounts of money can you make with these coins? Explain.

Problem 39: A set of 60 people are surveyed. Among them, there are 26 people who like pizza, 32 people who like ice cream, and 30 people who like tofu. There are 14 people who like both pizza and ice cream, seven people who like both pizza and tofu, two people who like both ice cream and tofu, and two people who like all three. How many dislike all three?

Problem 40: How many strings of length five in the letters a, b, c, d, e have two or more consecutive a’s?

Problem 41: In how many integers between 1 and 10,000 does the digit 7 appear? (These integers are expressed in base 10.)

Problem 42: Among all permutations f of {1, 2, 3, 4, 5}, how many have f(1) odd? f(2) odd? Both odd?



Problem 43: The 26 letters of the alphabet are written in a row (once each) so that no two vowels are together. How many ways may this be done? (Vowels: a, e, i, o, u.)

Problem 44: How many permutations of a 6-set consist of exactly two cycles? Explain.

Problem 45: Seven people divide into three teams to play Trivial Pursuit. If the only restrictions are that no team may be empty, everyone is on some team, and no two teams overlap, how many ways are there to choose the teams?

Problem 46: Seven people enter an elevator in the basement. Each exits at floor 1, 2, 3, or 4. How many different ways can this happen?

Problem 47: Consider the 5-set X = {a, b, c, d, e}. How many 12-reps are there from X in which a appears at least twice and d at least three times?

Problem 48: How many possible 5-letter words are there using the English alphabet? What type of ball and urn problem is this?

Problem 49: 1. How many surjections are there from a 5-set to a 5-set? Explain.

2. How many are there from a 5-set to a 3-set?

Problem 50: Consider w + x + y + z = 10.

1. If w ≥ 2, how many solutions in nonnegative integers are there to the equation?

2. How many in positive integers?

Problem 51: How many nonnegative integer solutions are there to the following inequality?

12 ≤ w + x + y + z ≤ 14.

Problem 52: You have eight Hershey’s kisses (identical pieces of candy), all of which you give to four people. How many ways are there to distribute this candy?

Problem 53: There are seven copies of one book, eight of a second book, and nine of a third book. How many ways can two people divide them if each takes 12 books? Explain.

Problem 54: Suppose that balls and urns are distinguishable, that urns are stacks (so the order of balls within them matters) and that even the order of the urns makes a difference. In how many ways can b balls be put in u urns?

Problem 55: Same as the previous problem. What if the order of urns does not make a difference, but urns are still stacks?



Problem 56: How many ways can b indistinguishable balls be placed in u distinguishable urns if each urn must contain at least k balls?

Problem 57: How many partitions are there of an 8-set

1. into two cells, one of three elements, the other of five elements?

2. into three cells, one of two elements, and the other two of three elements each?

Problem 58: Let S(n, m) be the Stirling number of the second kind; i.e., the number of ways to partition an n-set into m cells. Prove that

\[ S(n,m) = S(n-1,m-1) + m\,S(n-1,m). \]

[Hint: Hold the nth element aside first; then in which cell would you put the nth element?]

Problem 59: Let A, B be sets, where |A| = 7 and |B| = 5. How many injections are there from A to B? And from B to A?

Problem 60: Let A, B be sets, where |A| = 7 and |B| = 5. How many surjections are there from A to B? And from B to A?

Problem 61: A set of ten people are formed into committees of four members each in such a way that every 3-subset of people are members together of exactly two committees. How many committees are there altogether?

Problem 62: Let

A = {1, 2, 3, 4, 5, 6, 7, 8}, and B = {1, 3, 5, 7}.

Suppose we want to pick six elements from A and two elements from B and arrange them into a string. How many different strings can be constructed?



7.4 Solutions

Solution 1: By definition, $\binom{5}{3}$ is the coefficient of $x^3$ in the polynomial:

\[ (1+x)^5 = 1 + 5x^1 + 10x^2 + 10x^3 + 5x^4 + x^5. \]

Therefore, $\binom{5}{3} = 10$. This definition does not work for $\binom{-7}{3}$. We have to use Definition 7.1, and get

\[ \binom{-7}{3} = \frac{(-7)(-7-1)(-7-2)}{3!} = -84. \]

□

Solution 2: Let $f(x) = (1+x)^n$, n ≥ 0. By the Binomial Theorem, we know that

\begin{align*}
f(x) &= (1+x)^n \tag{7.6} \\
&= \binom{n}{0}x^0 + \binom{n}{1}x^1 + \cdots + \binom{n}{n}x^n. \tag{7.7}
\end{align*}

The first derivative of f with respect to x can be computed from either (7.6) or (7.7), and the two results must agree. Thus,

\begin{align*}
f'(x) &= n(1+x)^{n-1} \\
&= 1\binom{n}{1}x^0 + 2\binom{n}{2}x^1 + \cdots + n\binom{n}{n}x^{n-1} = \sum k\binom{n}{k}x^{k-1}.
\end{align*}

Letting x = 1, we get

\[ n2^{n-1} = \sum k\binom{n}{k}. \tag{7.8} \]

□

Solution 3: Let $f(x) = (1+x)^n$, n ≥ 0. Similar to the previous problem, we will find the second derivatives of (7.6) and (7.7) with respect to x. From the previous problem, we have

\[ f'(x) = n(1+x)^{n-1} = \sum k\binom{n}{k}x^{k-1}. \]

Thus,

\[ f''(x) = n(n-1)(1+x)^{n-2} = \sum (k-1)k\binom{n}{k}x^{k-2}. \]

Letting x = 1, we have

\[ n(n-1)2^{n-2} = \sum k(k-1)\binom{n}{k}. \tag{7.9} \]

□

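Solution 4: Recall the following properties of sums. If f(k) and g(k) are two polynomials in k and a is a constant, then we have the following two equalities:

\[ \sum\big(f(k)+g(k)\big) = \sum f(k) + \sum g(k), \]

and

\[ \sum a f(k) = a\Big(\sum f(k)\Big). \]

From (7.9), we have

\begin{align*}
n(n-1)2^{n-2} &= \sum (k-1)k\binom{n}{k} \\
&= \sum (k^2-k)\binom{n}{k} \\
&= \sum \left(k^2\binom{n}{k} - k\binom{n}{k}\right) \\
&= \sum k^2\binom{n}{k} - \sum k\binom{n}{k}.
\end{align*}

From (7.8), we know that $\sum k\binom{n}{k} = n2^{n-1}$. Thus,

\begin{align*}
\sum k^2\binom{n}{k} &= \sum k\binom{n}{k} + n(n-1)2^{n-2} \\
&= n2^{n-1} + n(n-1)2^{n-2} \\
&= \big(2n + n(n-1)\big)2^{n-2} \\
&= n(n+1)2^{n-2}.
\end{align*}

□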

Solution 5: Let n be any positive integer, and let

\begin{align*}
f(x) &= (1+x)^n \\
&= \binom{n}{0}x^0 + \binom{n}{1}x^1 + \cdots + \binom{n}{2k}x^{2k} + \binom{n}{2k+1}x^{2k+1} + \cdots + \binom{n}{n}x^{n}.
\end{align*}

Letting x = 1, we have $f(1) = 2^n$, and

\[ 2^n = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{2k} + \binom{n}{2k+1} + \cdots + \binom{n}{n}. \tag{7.10} \]

Similarly, letting x = −1, we have f(−1) = 0, and

\[ 0 = \binom{n}{0} - \binom{n}{1} + \cdots + \binom{n}{2k} - \binom{n}{2k+1} + \cdots + (-1)^n\binom{n}{n}. \tag{7.11} \]

Subtracting (7.11) from (7.10), we have

\[ 2^n = 2\binom{n}{1} + 2\binom{n}{3} + \cdots + 2\binom{n}{2k+1} + 2\binom{n}{2k+3} + \cdots \]

Therefore,

\[ 2^n = \sum_{0\le k} 2\binom{n}{2k+1} = 2\times\sum_{0\le k}\binom{n}{2k+1}, \]

or

\[ 2^{n-1} = \sum_{0\le k}\binom{n}{2k+1}. \]

□

Solution 6: For integers n ≥ 0 and k ≥ 1,

\begin{align*}
\binom{n}{k} &= \frac{n!}{k!\,(n-k)!} \\
&= \frac{n!\,(n-k+1)}{k(k-1)!\,(n-k+1)\,(n-k)!} \\
&= \frac{n!}{(k-1)!\,(n-(k-1))!}\times\frac{n-k+1}{k} \\
&= \binom{n}{k-1}(n-k+1)/k.
\end{align*}

□



Solution 7: For integers n ≥ 0 and k ≥ 1,

\begin{align*}
\binom{n}{k} &= \frac{n!}{k!\,(n-k)!} \\
&= \frac{n(n-1)!}{k(k-1)!\,(n-1-(k-1))!} \\
&= \frac{n}{k}\times\frac{(n-1)!}{(k-1)!\,(n-1-(k-1))!} \\
&= (n/k)\binom{n-1}{k-1}.
\end{align*}

□

Solution 8: Let n be any real number, and suppose the expansion of $(1+x)^n$ has coefficients $a_k$ for each natural number k, i.e.,

\[ (1+x)^n = a_0 + a_1x^1 + a_2x^2 + \cdots + a_kx^k + \cdots \]

We keep differentiating with respect to x on both sides until $a_k$ on the right hand side becomes a constant term, i.e., $a_k$ becomes the coefficient of $x^0$. Thus, we have to differentiate both sides k times, and then obtain

\[ n(n-1)\cdots(n-k+1)(1+x)^{n-k} = (k!)\,a_kx^0 + \cdots, \]

where every omitted term contains a positive power of x. Letting x = 0, we get $n(n-1)\cdots(n-k+1) = (k!)\,a_k$. Therefore,

\[ a_k = \frac{n(n-1)\cdots(n-k+1)}{k!} = \binom{n}{k}. \]

□

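Solution 9: Let $f(x) = a_0 + a_1x + \cdots + a_nx^n$. Then,

\begin{align*}
x^n f\!\left(\tfrac{1}{x}\right) &= x^n\left(a_0 + a_1\left(\tfrac{1}{x}\right) + a_2\left(\tfrac{1}{x}\right)^2 + \cdots + a_n\left(\tfrac{1}{x}\right)^n\right) \\
&= a_0x^n + a_1\left(\frac{x^n}{x}\right) + a_2\left(\frac{x^n}{x^2}\right) + \cdots + a_n\left(\frac{x^n}{x^n}\right) \\
&= a_0x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_nx^{n-n} \\
&= a_{n-0}x^0 + a_{n-1}x^1 + \cdots + a_{n-n}x^n \\
&= \sum a_{n-k}x^k.
\end{align*}

This is a polynomial of degree n in x. □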



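Solution 10: We observe that

\[ \left(2 + \frac{x}{4}\right)^{10} = 2^{10}\left(1 + \frac{x}{2^3}\right)^{10}. \tag{7.12} \]

By the Binomial Theorem, the coefficient of $x^4$ in $\left(1 + \frac{x}{2^3}\right)^{10}$ is

\[ \binom{10}{4}\left(\frac{1}{2^3}\right)^4. \]

Therefore, the coefficient of $x^4$ in (7.12) is

\[ 2^{10}\binom{10}{4}\left(\frac{1}{2^3}\right)^4 = \frac{105}{2}. \]

□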

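Solution 11: By the Binomial Theorem,

\[ \left(1+\frac{1}{x}\right)^6 = \sum_{0\le k}\binom{6}{k}\left(\frac{1}{x}\right)^k. \]

Multiplying both sides of the equation by $x^6$ gives

\begin{align*}
x^6\left(1+\frac{1}{x}\right)^6 &= x^6\sum_{0\le k}\binom{6}{k}\left(\frac{1}{x}\right)^k, \\
x^6\left(\frac{x+1}{x}\right)^6 &= \sum_{0\le k}\binom{6}{k}x^{6-k}, \\
(x+1)^6 &= \sum_{0\le k}\binom{6}{k}x^{6-k}.
\end{align*}

Finally, letting x = 2, we have

\[ 3^6 = \sum_{0\le k}2^{6-k}\binom{6}{k}. \]

□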

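Solution 12: This is a generalization of Problem 11. By the Binomial Theorem,

\[ \left(1+\frac{1}{x}\right)^n = \sum_{0\le k}\binom{n}{k}\left(\frac{1}{x}\right)^k. \]

Multiplying both sides of the equation by $x^n$ gives

\begin{align*}
x^n\left(1+\frac{1}{x}\right)^n &= x^n\sum_{0\le k}\binom{n}{k}\left(\frac{1}{x}\right)^k, \\
x^n\left(\frac{x+1}{x}\right)^n &= \sum_{0\le k}\binom{n}{k}x^{n-k}, \\
(x+1)^n &= \sum_{0\le k}\binom{n}{k}x^{n-k}.
\end{align*}

Letting x = 2, we get

\[ \sum_{0\le k}2^{n-k}\binom{n}{k} = 3^n. \]

□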

Solution 13: By the Binomial Theorem,

\[ (1.2)^{-1.2} = (1+0.2)^{-1.2} = \sum_{k\ge 0}\binom{-1.2}{k}(0.2)^k. \]

We calculate the terms up to k = 6 in the following.

\begin{align*}
k=0:\quad & \binom{-1.2}{0}(0.2)^0 = 1 \\
k=1:\quad & \binom{-1.2}{1}(0.2)^1 = \frac{-1.2}{1}\cdot(0.2)^1 = -0.24 \\
k=2:\quad & \binom{-1.2}{2}(0.2)^2 = \frac{(-1.2)(-2.2)}{2\cdot 1}\cdot(0.2)^2 = 0.0528 \\
k=3:\quad & \binom{-1.2}{3}(0.2)^3 = \frac{(-1.2)(-2.2)(-3.2)}{3\cdot 2\cdot 1}\cdot(0.2)^3 = -0.011264 \\
k=4:\quad & \binom{-1.2}{4}(0.2)^4 = \frac{(-1.2)(-2.2)(-3.2)(-4.2)}{4\cdot 3\cdot 2\cdot 1}\cdot(0.2)^4 \approx 0.002365 \\
k=5:\quad & \binom{-1.2}{5}(0.2)^5 = \frac{(-1.2)(-2.2)\cdots(-5.2)}{5\cdot 4\cdot 3\cdot 2\cdot 1}\cdot(0.2)^5 \approx -0.000492 \\
k=6:\quad & \binom{-1.2}{6}(0.2)^6 = \frac{(-1.2)(-2.2)\cdots(-6.2)}{6\cdot 5\cdot 4\cdot 3\cdot 2\cdot 1}\cdot(0.2)^6 \approx 0.000102
\end{align*}

We note that when k = 6 we get approximately 0.000102 (positive, since the numerator has six negative factors), which is too small to affect the required accuracy. In other words, only the first six terms, k = 0, 1, . . . , 5, are significant to the required accuracy. Therefore,

\[ (1.2)^{-1.2} \approx 1 - 0.24 + 0.0528 - 0.011264 + 0.002365 - 0.000492 = 0.803409. \]

After rounding, we get $(1.2)^{-1.2} \approx 0.803$. □



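Solution 14: Let $n \ge 2$ and $k \ge 2$. Using the addition formula given on page 273, we have

\begin{align*}
\binom{n}{k} &= \binom{n-1}{k-1} + \binom{n-1}{k} \\
&= \left[\binom{n-2}{k-2} + \binom{n-2}{k-1}\right] + \left[\binom{n-2}{k-1} + \binom{n-2}{k}\right] \\
&= \binom{n-2}{k-2} + 2\binom{n-2}{k-1} + \binom{n-2}{k}.
\end{align*}

□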

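Solution 15: Let $k \ge 0$ and $m \ge k-1$. By repeatedly applying the addition formula, we have

\begin{align*}
\binom{m+1}{k} &= \binom{m}{k} + \binom{m}{k-1} \\
&= \binom{m}{k} + \binom{m-1}{k-1} + \binom{m-1}{k-2} \\
&= \binom{m}{k} + \binom{m-1}{k-1} + \binom{m-2}{k-2} + \binom{m-2}{k-3} \\
&= \binom{m}{k} + \cdots + \binom{m-k}{k-k} + \binom{m-k}{k-k-1}.
\end{align*}

Because $\binom{m-k}{-1} = 0$, we have

\[ \binom{m+1}{k} = \binom{m}{k} + \cdots + \binom{m-k}{k-k} = \sum_{0 \le j \le k} \binom{m-j}{k-j}. \]

In the case $0 \le m < k-1$, both sides are equal to 0. □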



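Solution 16: The idea is similar to Problem 15 except that the addition formula is applied to the other term. Let $k \ge 0$ and $m \ge 0$.

\begin{align*}
\binom{m+1}{k+1} &= \binom{m}{k} + \binom{m}{k+1} \\
&= \binom{m}{k} + \binom{m-1}{k} + \binom{m-1}{k+1} \\
&= \binom{m}{k} + \binom{m-1}{k} + \binom{m-2}{k} + \binom{m-2}{k+1} \\
&= \binom{m}{k} + \binom{m-1}{k} + \binom{m-2}{k} + \cdots + \binom{1}{k} + \binom{1}{k+1} \\
&= \binom{m}{k} + \binom{m-1}{k} + \binom{m-2}{k} + \cdots + \binom{1}{k} + \binom{0}{k} + \binom{0}{k+1}.
\end{align*}

Because $k \ge 0$, we know that $k + 1 > 0$, and thus $\binom{0}{k+1} = 0$. Therefore,

\[ \binom{m+1}{k+1} = \binom{m}{k} + \binom{m-1}{k} + \cdots + \binom{0}{k} = \sum_{0 \le n \le m} \binom{n}{k}. \]

□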
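The identities of Solutions 14, 15, and 16 can be spot-checked numerically for small parameters. This is only a sanity check under our own choice of ranges, not a proof; for Solution 15 the sketch restricts to $m \ge k$ so that `math.comb` never receives a negative argument.

```python
from math import comb

def check_solution_14(n, k):
    return comb(n, k) == comb(n - 2, k - 2) + 2 * comb(n - 2, k - 1) + comb(n - 2, k)

def check_solution_15(m, k):
    # Restricted to m >= k so that every upper index stays nonnegative.
    return comb(m + 1, k) == sum(comb(m - j, k - j) for j in range(k + 1))

def check_solution_16(m, k):
    return comb(m + 1, k + 1) == sum(comb(n, k) for n in range(m + 1))

ok_14 = all(check_solution_14(n, k) for n in range(2, 15) for k in range(2, 15))
ok_15 = all(check_solution_15(m, k) for k in range(0, 10) for m in range(k, 15))
ok_16 = all(check_solution_16(m, k) for m in range(0, 15) for k in range(0, 15))
print(ok_14, ok_15, ok_16)
```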

Solution 17: This problem can be proven directly by applying the Vandermonde identity given on page 274. Substitute $s = n - r$, $m = 0$, and $n = k$ in the Vandermonde identity to get

\begin{align*}
\sum_{j \in \mathbb{Z}} \binom{r}{0 + j}\binom{n - r}{k - j} &= \binom{r + (n - r)}{0 + k}, \\
\sum_{j \in \mathbb{Z}} \binom{r}{j}\binom{n - r}{k - j} &= \binom{n}{k}. \tag{7.13}
\end{align*}

Since $\binom{r}{j} = 0$ for all $j < 0$, (7.13) can be rewritten as

\[ \binom{n}{k} = \sum_{0 \le j} \binom{r}{j}\binom{n - r}{k - j}. \]

□
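The final identity of Solution 17 can be tested for small values. The sketch below assumes $0 \le r \le n$; the sum stops at $j = k$ because $\binom{n-r}{k-j} = 0$ for $j > k$.

```python
from math import comb

def vandermonde_sum(n, r, k):
    """Sum over 0 <= j <= k of C(r, j) * C(n - r, k - j)."""
    return sum(comb(r, j) * comb(n - r, k - j) for j in range(k + 1))

ok = all(
    comb(n, k) == vandermonde_sum(n, r, k)
    for n in range(10)
    for r in range(n + 1)
    for k in range(12)
)
print(ok)
```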

Solution 18: The left-hand side of the given equation is

\[ a_0\binom{n}{0} + a_1\binom{n}{1} + \cdots + a_k\binom{n}{k} + \cdots \tag{7.14} \]

In this problem we ask: for all $k \ge 0$, what is $a_k$ such that (7.14) is equal to $\binom{n+3}{j}$, where $j$ is a fixed nonnegative integer? By the addition formula, we have

\begin{align*}
\binom{n+3}{j} &= \binom{n+2}{j-1} + \binom{n+2}{j} \\
&= \binom{n+1}{j-2} + \binom{n+1}{j-1} + \binom{n+1}{j-1} + \binom{n+1}{j} \\
&= \binom{n+1}{j-2} + 2\binom{n+1}{j-1} + \binom{n+1}{j} \\
&= \binom{n}{j-3} + \binom{n}{j-2} + 2\left(\binom{n}{j-2} + \binom{n}{j-1}\right) + \binom{n}{j-1} + \binom{n}{j} \\
&= \binom{n}{j-3} + 3\binom{n}{j-2} + 3\binom{n}{j-1} + \binom{n}{j}.
\end{align*}

We observe that each term in the above expression can be indexed as

\[ a_{j-3}\binom{n}{j-3} + a_{j-2}\binom{n}{j-2} + a_{j-1}\binom{n}{j-1} + a_j\binom{n}{j} = \sum_{j-3 \le k \le j} a_k\binom{n}{k}, \tag{7.15} \]

where $a_{j-3} = 1$, $a_{j-2} = 3$, $a_{j-1} = 3$, $a_j = 1$. We also note that 1, 3, 3, 1 follows the pattern of the Pascal Triangle, that is,

\[ a_{j-3} = \binom{3}{0}, \quad a_{j-2} = \binom{3}{1}, \quad a_{j-1} = \binom{3}{2}, \quad a_j = \binom{3}{3}. \]

To make this clearer, write

\[ a_{j-3} = \binom{3}{(j-3)-j+3}, \quad a_{j-2} = \binom{3}{(j-2)-j+3}, \quad a_{j-1} = \binom{3}{(j-1)-j+3}, \quad a_j = \binom{3}{j-j+3}. \]

Therefore, we conclude that, for $j - 3 \le k \le j$,

\[ a_k = \binom{3}{k-j+3}. \]

Since $a_k = \binom{3}{k-j+3} = 0$ for $k < j - 3$ or $k > j$, we can extend (7.15) to all $k$ and get

\[ \sum_{0 \le k} a_k\binom{n}{k} = \binom{n+3}{j}, \quad \text{where } a_k = \binom{3}{k-j+3}. \]

□
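The closed form for $a_k$ in Solution 18 can be checked numerically. In this sketch the helper `a` returns 0 whenever $k - j + 3$ falls outside $0, \ldots, 3$, matching the convention used in the solution; the names and test ranges are our own.

```python
from math import comb

def a(k, j):
    """Coefficient a_k = C(3, k - j + 3), taken as 0 outside the range 0..3."""
    t = k - j + 3
    return comb(3, t) if 0 <= t <= 3 else 0

def left_hand_side(n, j):
    # Terms with k > j vanish because a(k, j) = 0 there.
    return sum(a(k, j) * comb(n, k) for k in range(j + 1))

ok = all(
    left_hand_side(n, j) == comb(n + 3, j)
    for n in range(10)
    for j in range(14)
)
print(ok)
```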

Solution 19: We have two approaches to prove that, for $-1 < x < 1$,

\[ \frac{1}{1 - x^2} = 1 + x^2 + x^4 + x^6 + x^8 + \cdots \]



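Method 1: Let $u = -x^2$. By the Binomial Theorem, we get

\begin{align*}
\frac{1}{1 - x^2} &= (1 + u)^{-1} \\
&= 1 + \binom{-1}{1}u^1 + \binom{-1}{2}u^2 + \cdots \\
&= 1 - u^1 + u^2 - u^3 + u^4 + \cdots \\
&= 1 - (-x^2)^1 + (-x^2)^2 - (-x^2)^3 + \cdots \\
&= 1 + x^2 + x^4 + x^6 + x^8 + \cdots
\end{align*}

□

Method 2: We know $1 - x^2 = (1 + x)(1 - x)$, and

\[ \frac{1}{1 - x^2} = \frac{1}{(1 - x)(1 + x)} = \frac{1}{2}\left(\frac{1}{1 - x} + \frac{1}{1 + x}\right). \]

By the Binomial Theorem,

\begin{align*}
\frac{1}{1 - x} &= (1 - x)^{-1} \\
&= 1 + \binom{-1}{1}(-x)^1 + \binom{-1}{2}(-x)^2 + \cdots \\
&= 1 + x + x^2 + x^3 + x^4 + \cdots \tag{7.16} \\
\frac{1}{1 + x} &= (1 + x)^{-1} \\
&= 1 + \binom{-1}{1}x^1 + \binom{-1}{2}x^2 + \cdots \\
&= 1 - x + x^2 - x^3 + x^4 - \cdots \tag{7.17}
\end{align*}

Therefore,

\[ \frac{1}{1 - x^2} = \frac{(7.16) + (7.17)}{2} = 1 + x^2 + x^4 + x^6 + x^8 + \cdots \]

□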



Solution 20: Consider the following figure.

[Figure: a grid of city blocks with A at the southwest corner and B at the northeast corner. Arrows mark the streets as northbound or eastbound, and a sample path from A to B through small black circles is labeled with the moves a, a, b, a, b, a, a, a, b, b, a.]

Suppose we want to move from A to B, where B is 7 blocks east and 4 blocks north of A, and we are allowed to move only north or east.

Let’s use two different approaches to solve this problem.

Method 1: Let an “a” denote one eastward move and a “b” denote one northward move. Clearly, we have to make exactly 7 eastward moves and 4 northward moves to reach B from A. For example, if we traveled through the small black circles in the figure to reach B from A, the sequence of moves we made would be

\[ a \cdot a \cdot b \cdot a \cdot b \cdot a \cdot a \cdot a \cdot b \cdot b \cdot a. \tag{7.18} \]

If we pick a different order of those a’s and b’s, we get another path to B. Therefore, the number of different ways from A to B is the number of different permutations of the string shown in (7.18). It is

\[ \frac{(4 + 7)!}{4!\,7!} = \binom{4 + 7}{4} = \binom{4 + 7}{7}. \]

In general, if the destination is $m$ blocks to the east and $n$ blocks to the north, then we need a sequence of $m + n$ moves that consists of $m$ eastward and $n$ northward moves. The number of different ways from A to B is the number of such sequences, i.e., the permutations of $m + n$ objects in which there are $m$ objects of one kind (“a”) and $n$ objects of the other kind (“b”). The number is given by

\[ \frac{(m + n)!}{n!\,m!} = \binom{m + n}{m} = \binom{m + n}{n}. \]

□
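The closed form of Method 1 can be compared against a direct count. The sketch below is our own illustration: it counts monotone paths by dynamic programming, using the fact that every corner is reached either from the south or from the west.

```python
from math import comb

def count_paths(east, north):
    """Count north/east lattice paths to a corner `east` blocks east and `north` blocks north."""
    # ways[i][j] = number of paths to the corner i blocks north and j blocks east of A.
    ways = [[1] * (east + 1) for _ in range(north + 1)]
    for i in range(1, north + 1):
        for j in range(1, east + 1):
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1]
    return ways[north][east]

print(count_paths(7, 4), comb(7 + 4, 4))
```

The recurrence is the addition formula in disguise, which is why the table reproduces the binomial coefficients.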

Method 2: Let “a” and “b” carry the same meaning. We will use $(a + b)$ to denote one possible move. One may consider “+” as “or”, so that $(a + b)$ means an eastward move or a northward move. Suppose we want to make two moves. All possible ways can be represented as the product of two moves, as shown in the following:

\[ (a + b) \cdot (a + b) = a \cdot a + a \cdot b + b \cdot a + b \cdot b. \]

Each term represents a distinct possible path. For example, $a \cdot b$ means an eastward move followed by a northward move. Among those different paths, some will reach the same destination; for example, $a \cdot b$ and $b \cdot a$ both stop at the same point.

Consider the example we discussed in Method 1 again. We want to move from A to B. To do so, we need 11 moves. All possible ways of making 11 moves can be represented in the expansion of

\[ (a + b)^{11}. \tag{7.19} \]

In the expansion, the terms with 7 a’s and 4 b’s are the routes from A to B, whatever the order of the a’s and b’s. Each such term contributes 1 to the coefficient of $a^7b^4$ in the expansion of (7.19). Therefore, the coefficient of $a^7b^4$ is the answer we want.

In general, if the destination is $m$ blocks to the east and $n$ blocks to the north, the number of different routes is the coefficient of $a^mb^n$ in the expansion of

\[ (a + b)^{m+n}. \tag{7.20} \]

The remaining question is how to find the coefficient of $a^mb^n$ without actually expanding (7.20). Here is the most general form of the Binomial Theorem:

For any real number $r$,

\[ (x + y)^r = \sum_{0 \le k} \binom{r}{k} x^{r-k} y^k. \tag{7.21} \]

Since the proof is similar to the proof of Problem 8, we just give the idea here: Let $a_k$ be the coefficient of $x^{r-k}y^k$ in the expansion. Take the $k$th partial derivative of $(x + y)^r$ with respect to $y$, i.e., $\partial^k (x + y)^r / \partial y^k$. Then evaluate it at $x = 1$, $y = 0$, which gives the value of $k! \cdot a_k$. If $r$ is a positive integer, (7.21) can be rewritten as

\[ (x + y)^r = \sum_{k=0}^{r} \binom{r}{k} x^{r-k} y^k. \]

Therefore, the coefficient of $a^mb^n$ in the expansion of $(a + b)^{m+n}$ is $\binom{m+n}{m}$. □
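The idea of Method 2 can be made concrete by expanding $(a + b)^{11}$ as a sum of words, without collecting terms first. The sketch below is our own illustration: each word is one route, and grouping the words by their numbers of a’s and b’s gives the coefficients.

```python
from collections import Counter
from itertools import product
from math import comb

def expansion_coefficients(total_moves):
    """Expand (a + b)**total_moves into words and group them by (#a, #b)."""
    counts = Counter()
    for word in product("ab", repeat=total_moves):
        counts[(word.count("a"), word.count("b"))] += 1
    return counts

coeffs = expansion_coefficients(11)
print(coeffs[(7, 4)], comb(11, 7))
```

This enumerates $2^{11}$ words, so it is only practical for small exponents; the Binomial Theorem gives the same coefficient directly.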



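Solution 21: Let $R_{A \to B}$ denote the set of different ways to move from A to B, let $R_{A \to B \to C}$ denote the set of different ways to move from A to C via B, and let $|R_{A \to B}|$ be the size of $R_{A \to B}$. In other words, $|R_{A \to B}|$ is the number of different ways to move from A to B. Similar notation will be used for other sets. The problem can be restated as: what is the size of

\[ (R_{A \to B \to D} \cup R_{A \to C \to D}) - R_{A \to B \to C \to D}? \]

It is easy to see that

\[ R_{A \to B \to C \to D} = (R_{A \to B \to D} \cap R_{A \to C \to D}) \subseteq (R_{A \to B \to D} \cup R_{A \to C \to D}). \]

Recall that if $X$ and $Y$ are sets and $X \subseteq Y$, then $|Y - X| = |Y| - |X|$. Therefore,

\[ |(R_{A \to B \to D} \cup R_{A \to C \to D}) - R_{A \to B \to C \to D}| = |R_{A \to B \to D} \cup R_{A \to C \to D}| - |R_{A \to B \to C \to D}|. \]

Using the idea in Problem 20 we find:

\begin{gather*}
|R_{A \to B}| = \binom{3+1}{3} = 4, \quad |R_{A \to C}| = \binom{6+3}{6} = 84, \quad |R_{B \to C}| = \binom{3+2}{3} = 10, \\
|R_{B \to D}| = \binom{7+4}{7} = 330, \quad |R_{C \to D}| = \binom{4+2}{4} = 15.
\end{gather*}

Thus,

\begin{align*}
|R_{A \to B \to D}| &= |R_{A \to B}| \times |R_{B \to D}| = 1320, \\
|R_{A \to C \to D}| &= |R_{A \to C}| \times |R_{C \to D}| = 1260, \\
|R_{A \to B \to C \to D}| &= |R_{A \to B}| \times |R_{B \to C}| \times |R_{C \to D}| = 600,
\end{align*}

and

\begin{align*}
|R_{A \to B \to D} \cup R_{A \to C \to D}| &= |R_{A \to B \to D}| + |R_{A \to C \to D}| - |R_{A \to B \to D} \cap R_{A \to C \to D}| \\
&= |R_{A \to B \to D}| + |R_{A \to C \to D}| - |R_{A \to B \to C \to D}| = 1980.
\end{align*}

Therefore,

\[ |(R_{A \to B \to D} \cup R_{A \to C \to D}) - R_{A \to B \to C \to D}| = 1980 - 600 = 1380. \]

□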

Solution 22: Since the westward and southward directions will never be chosen, we can simply ignore the streets with westward and southward directions. Use the idea in Problem 20 to get

\[ \binom{3 + 2}{2} = 10 \]

different routes. □



Solution 23: Suppose $n, r \in \mathbb{N}_0$ and $r \ge n$. Let $x_{j_1}x_{j_2}\cdots x_{j_n}$ be a general term in the expansion of

\[ \underbrace{(x_1 + \cdots + x_r)}_{T_1}\underbrace{(x_1 + \cdots + x_r)}_{T_2} \cdots \underbrace{(x_1 + \cdots + x_r)}_{T_n}. \tag{7.22} \]

Each $T_k$ contributes one $x_i$ from $x_1, \ldots, x_r$ to $x_{j_1}x_{j_2}\cdots x_{j_n}$. In other words, we choose one $x_i$ from $T_k$ for $x_{j_1}x_{j_2}\cdots x_{j_n}$. We don’t want any $x_i$ to be selected twice, so that none of the $x_i$’s in $x_{j_1}x_{j_2}\cdots x_{j_n}$ has an exponent greater than one. There are $n$ $T_i$’s, which means that we choose $n$ objects from $r$ objects without repetition. There are $\binom{r}{n}$ ways to do so, and each such selection forms a different $x_{j_1}x_{j_2}\cdots x_{j_n}$ in which none of the $x_i$’s has an exponent of two or greater. □

[Part 2] Fix a term $x_{j_1}x_{j_2}\cdots x_{j_n}$ in the expansion of (7.22). The coefficient of $x_{j_1}x_{j_2}\cdots x_{j_n}$ is the number of ways in which the $x_{j_i}$’s can arise from the factors. For example, if $x_{j_1} = x_{j_2} = \cdots = x_{j_n} = x_1$, that forms the term $x_1^n$, and the corresponding coefficient is 1 because every $T_i$ must contribute $x_1$ without any other choice. For another example, consider $x_1^{n-1}x_2$. The factor $x_2$ may come from any one of the $T_i$; there are $n$ choices, and hence the corresponding coefficient is $n$.

In the problem, we need $x_{j_i} \ne x_{j_k}$ for all $i \ne k$. Consider $x_{j_1}$ first: $x_{j_1}$ may come from $T_1, T_2, \ldots,$ or $T_n$; there are $n$ choices. After $x_{j_1}$ has been chosen, $x_{j_2}$ has $n - 1$ choices because there are $n - 1$ remaining $T_i$’s. Similarly, $x_{j_3}$ has $n - 2$ choices, and so on. For $x_{j_n}$, there is only one choice. Therefore, there are $n!$ ways to choose all $n$ distinct $x_i$’s in $x_{j_1}x_{j_2}\cdots x_{j_n}$, and hence the corresponding coefficient is $n!$. □

Besides the combinatorial approach described above, we can also use an analytic approach to prove this result.

Consider

\begin{align*}
& (x_1 + x_2 + \cdots + x_r)^n \tag{7.23} \\
&= \cdots + a(x_{j_1}x_{j_2}\cdots x_{j_n}) + \cdots \tag{7.24}
\end{align*}

Here (7.24) is the expansion of (7.23), in which we consider the term $a(x_{j_1}x_{j_2}\cdots x_{j_n})$ only. Take the partial derivative with respect to $x_{j_1}, x_{j_2}, \ldots,$ and $x_{j_n}$ of both (7.23) and (7.24). That is,

\[ \frac{\partial^n (x_1 + x_2 + \cdots + x_r)^n}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_n}} = \frac{\partial^n (\cdots + a(x_{j_1}x_{j_2}\cdots x_{j_n}) + \cdots)}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_n}}. \]

Because $r \ge n$, by the pigeonhole principle, any term in the expansion of (7.23) other than $x_{j_1}x_{j_2}\cdots x_{j_n}$ lacks at least one of the variables $x_{j_1}, x_{j_2}, \ldots, x_{j_n}$. Therefore,

\[ \frac{\partial^n (\cdots + a(x_{j_1}x_{j_2}\cdots x_{j_n}) + \cdots)}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_n}} = a, \]

because every term other than $x_{j_1}x_{j_2}\cdots x_{j_n}$ is annihilated by at least one of the partial differentiations $\partial x_{j_1}, \partial x_{j_2}, \ldots,$ or $\partial x_{j_n}$. And

\[ \frac{\partial^n (x_1 + x_2 + \cdots + x_r)^n}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_n}} = n!, \]

because all of $x_{j_1}, x_{j_2}, \ldots, x_{j_n}$ are among $x_1, x_2, \ldots, x_r$. Thus, $a = n!$. □
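Both parts of Solution 23 can be checked by brute force for small $r$ and $n$. In the sketch below (our own illustration), each way of picking one variable from every factor $T_1, \ldots, T_n$ is a tuple of indices, and a monomial is represented by the sorted tuple.

```python
from collections import Counter
from itertools import product
from math import comb, factorial

def expand(r, n):
    """Expand (x_1 + ... + x_r)**n; keys are sorted tuples of the chosen variable indices."""
    terms = Counter()
    for choice in product(range(1, r + 1), repeat=n):
        terms[tuple(sorted(choice))] += 1
    return terms

def squarefree_summary(r, n):
    """Return (number of terms with all exponents 1, set of their coefficients)."""
    terms = expand(r, n)
    squarefree = {m: c for m, c in terms.items() if len(set(m)) == n}
    return len(squarefree), set(squarefree.values())

print(squarefree_summary(5, 3), comb(5, 3), factorial(3))
```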

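Solution 24: Let $f(x) = (1 + 2x)^{10}$. Then,

\[ f\left(1 - \frac{x^2}{6}\right) = \left(1 + 2\left(1 - \frac{x^2}{6}\right)\right)^{10} = \left(3 - \frac{x^2}{3}\right)^{10}. \]

By the Binomial Theorem,

\[ \left(3 - \frac{x^2}{3}\right)^{10} = \sum_{k=0}^{10} \binom{10}{k} 3^{10-k}\left(-\frac{x^2}{3}\right)^k. \]

On the right-hand side of the above equation, $k = 5$ is the only term that gives $x^{10}$. Therefore, the coefficient of $x^{10}$ in $f(1 - \frac{x^2}{6})$ is the coefficient of the term with $k = 5$ in the summation; it is

\[ \binom{10}{5} 3^5 \left(-\frac{1}{3}\right)^5 = -252. \]

□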

Solution 25: We want to pick $r$ objects from $n$ different objects. Let’s keep an eye on the $n$th object. We have the following two possible cases.

1. The $n$th object is not selected. It means that we must pick $r$ objects from the remaining $n - 1$ objects. There are $\binom{n-1}{r}$ different ways to do so.

2. The $n$th object is selected. It means that we must pick $r - 1$ objects from the remaining $n - 1$ different objects. There are $\binom{n-1}{r-1}$ different ways to do so.

Thus, by the rule of sum, we have

\[ \binom{n-1}{r} + \binom{n-1}{r-1} \]

different ways to choose $r$ objects from $n$ different objects. □



Solution 26: Three interface cards have to be plugged into the associated slots. The 1st interface card has 8 choices, the 2nd has 7 choices, and the 3rd has 6 choices. The printer has to be connected to one of the two parallel ports, so it has 2 choices. The mouse and the modem have to be connected to the serial ports; they have 4 and 3 choices respectively. Therefore, the total number of ways to connect this system is

\[ (8 \times 7 \times 6) \times (2) \times (4 \times 3) = 8064. \]

□

Solution 27:

1. No two wires are connected to the same terminal, and because we are asked to connect all wires to some terminals, we may assume that $k \ge n$ to make the connection possible. For the first wire, there are $k$ terminals to choose from. For the second wire, there are $k - 1$ terminals to choose from because one of the $k$ terminals has been connected to the first wire. For the $n$th wire, there are $k - n + 1$ terminals left to choose from. Therefore, we have

\[ k(k - 1)(k - 2) \cdots (k - n + 1) = \binom{k}{n} \times n! \]

different connections.

2. In general, every wire has $k$ choices. Therefore, we have $k^n$ different connections.

□

Solution 28: Every student has 5 choices, and we have 43 students. Therefore, we have

\[ \overbrace{5 \times 5 \times \cdots \times 5}^{43} = 5^{43} \]

different ways. □



Solution 29: The 1st student has 8 choices, the 2nd student has 7 choices, the 3rd student has 6 choices, the 4th student has 5 choices, and the 5th student has 4 choices. Therefore, the answer is $8 \times 7 \times 6 \times 5 \times 4 = 6720$. □

Solution 30: To draw a chord, we can pick any two different points out of the given $n$ points on the circle. Therefore, the answer is $\binom{n}{2}$. □

Solution 31: Three points are required to draw a triangle. Any three points on the circle will do. Hence, the answer is $\binom{n}{3}$. □

Solution 32: In some counting problems it is convenient to find the objects that do not satisfy the desired conditions (unqualified objects). In these cases we can use the following strategy:

Step 1: Count all possible objects.
Step 2: Count all unqualified objects.
Step 3: Remove the unqualified objects from all possible objects.

For this problem, all possible triangles are determined by any three different vertices; there are $\binom{n}{3}$ of them. The triangles sharing sides with the polygon are the unqualified objects. They can be classified into two categories:

Case 1: Sharing one side of the polygon.
Case 2: Sharing two sides of the polygon.

The following figure illustrates these cases. The figure consists of a polygon with $n$ vertices where only 6 vertices and 5 sides are actually drawn. In this figure $\triangle 124, \triangle 12k, \ldots, \triangle 12(n-1)$ share only one side of the polygon, whereas $\triangle 123$ and $\triangle 243$ share two sides.

[Figure: part of a polygon with vertices labeled $n - 1$, $n$, 1, 2, 3, 4, and $k$.]

Case 1: Sharing one side. Fix one side (for example, side 1–2). There are $n - 4$ vertices left such that a triangle can be drawn without using an additional side of the polygon. Since the polygon has $n$ sides, altogether we have $n(n - 4)$ triangles that share one side with the polygon.

Case 2: Sharing two sides. We observe that each vertex and its adjacent sides determine a triangle sharing two sides with the polygon. For example, in the given figure, vertex 2 determines sides 1–2 and 2–3, which are common with $\triangle 123$. There are $n$ distinct vertices; thus, we have $n$ distinct triangles that share two sides with the polygon.

Therefore, the number of triangles that satisfy the conditions of the problem is $\binom{n}{3} - n(n - 4) - n$. □
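The three-step strategy of Solution 32 can be compared with a direct enumeration. The sketch below is our own check: it labels the vertices $0, \ldots, n-1$ around the polygon and counts the triples in which no two vertices are adjacent, which is the same as sharing no side with the polygon. We assume $n \ge 4$, since the case analysis does not apply to a triangle.

```python
from itertools import combinations
from math import comb

def triangles_without_polygon_sides(n):
    """Count vertex triples of an n-gon in which no two chosen vertices are adjacent."""
    def adjacent(u, v):
        return (u - v) % n in (1, n - 1)

    return sum(
        1
        for tri in combinations(range(n), 3)
        if not any(adjacent(u, v) for u, v in combinations(tri, 2))
    )

def formula(n):
    """All triangles, minus those sharing one side, minus those sharing two sides."""
    return comb(n, 3) - n * (n - 4) - n

for n in range(4, 10):
    print(n, triangles_without_polygon_sides(n), formula(n))
```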

Solution 33: There are two cases.

Two kings and three of another denomination:

We choose any two kings out of four, then choose another denomination out of the remaining 12 denominations, i.e., A, 2, 3, ..., 10, J, Q, and select three cards out of the four of that kind. Therefore, the total number is

\[ \binom{4}{2}\binom{12}{1}\binom{4}{3} = 288. \]

Three kings and two of another denomination:

In this case, we choose three kings out of four, choose another denomination out of the remaining 12 denominations, and select two cards out of the four of that kind. Therefore, the total number is

\[ \binom{4}{3}\binom{12}{1}\binom{4}{2} = 288. \]

Altogether, we have $288 + 288 = 576$ different full houses with kings. □

Solution 34: We will solve this problem by using two approaches.

Method 1, the strategy in Problem 32: The number of different full houses with diamonds = (the number of different full houses) − (the number of different full houses without diamonds). We have

\[ \binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} \tag{7.25} \]

different full houses. Out of these, there are

\[ \binom{13}{1}\binom{3}{3}\binom{12}{1}\binom{3}{2} \tag{7.26} \]

full houses without diamonds. (In the above expression we use $\binom{3}{3}$ and $\binom{3}{2}$ because we have to choose cards from the 3 remaining suits.) Therefore, the number of different full houses with diamonds is

\[ (7.25) - (7.26) = 3276. \]

□

Method 2, the direct method: There are two cases: (i) a diamond among the 2 cards of a kind, and (ii) a diamond among the 3 cards of a kind. If we simply add the corresponding counts of cases (i) and (ii), we will count some full houses twice since the two cases are not exclusive. We have to remove the overlapping case: (iii) diamonds are in both the 3 cards of a kind and the 2 cards of a kind. Therefore, from the sum of the counts of cases (i) and (ii) we subtract the count of case (iii).

(i) A diamond in the 2 cards of a kind:

\[ \binom{13}{1}\binom{3}{1}\binom{12}{1}\binom{4}{3}. \tag{7.27} \]

The factor $\binom{3}{1}$ is used above because a diamond card must be selected; thus, we need to select one more card out of the remaining 3.

(ii) A diamond in the 3 cards of a kind:

\[ \binom{13}{1}\binom{3}{2}\binom{12}{1}\binom{4}{2}. \tag{7.28} \]

(iii) Diamonds in both:

\[ \binom{13}{1}\binom{3}{2}\binom{12}{1}\binom{3}{1}. \tag{7.29} \]

Therefore, the answer is $(7.27) + (7.28) - (7.29) = 3276$. □
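Both methods of Solution 34 can be checked by listing every full house. The sketch below is our own illustration; it records only the ranks and suits involved, using the letter D for diamonds, and the suit letters are arbitrary labels.

```python
from itertools import combinations, permutations

SUITS = "SHDC"      # D stands for diamonds; the other letters are arbitrary labels
RANKS = range(13)

def full_houses():
    """Yield every full house as (suits of the three of a kind, suits of the pair)."""
    for triple_rank, pair_rank in permutations(RANKS, 2):
        for triple_suits in combinations(SUITS, 3):
            for pair_suits in combinations(SUITS, 2):
                yield triple_suits, pair_suits

hands = list(full_houses())
in_pair = sum(1 for t, p in hands if "D" in p)                # case (i)
in_triple = sum(1 for t, p in hands if "D" in t)              # case (ii)
in_both = sum(1 for t, p in hands if "D" in t and "D" in p)   # case (iii)
with_diamond = sum(1 for t, p in hands if "D" in t or "D" in p)

print(len(hands), in_pair, in_triple, in_both, with_diamond)
```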

Solution 35: Since the first three digits are fixed as 101, this question is in fact asking how many strings there are of four 0’s and three 1’s. In other words, how many permutations are there of four 0’s and three 1’s? We use Theorem 7.15 and get

\[ \frac{7!}{4!\,3!} = 35. \]

□



Solution 36: As in the previous problem, we use Theorem 7.15 and get

\[ \frac{(7 + 2 + 1)!}{7!\,2!\,1!} = \frac{10!}{7!\,2!\,1!} = 360. \]

□

Solution 37: The set of strings that satisfy the requirement of this problem is the union of the following three sets.

A = 8-strings without 1,
B = 8-strings with one 1,
C = 8-strings with two 1’s.

By Theorem 7.15,

\[ |A| = \frac{8!}{8!\,0!}, \quad |B| = \frac{8!}{7!\,1!}, \quad |C| = \frac{8!}{6!\,2!}. \]

Clearly, A, B, and C are disjoint. Therefore, the size of the union is

\[ \frac{8!}{8!} + \frac{8!}{7!\,1!} + \frac{8!}{6!\,2!} = 37. \]

□

Another explanation is: A is the set of strings of length 8 where we pick 0 places for 1’s, for which there are $\binom{8}{0}$ choices; B is the set of strings of length 8 where we pick 1 place for a 1, for which there are $\binom{8}{1}$ choices; for C, we pick 2 places for 1’s, for which there are $\binom{8}{2}$ choices. Therefore, altogether, we have

\[ \binom{8}{0} + \binom{8}{1} + \binom{8}{2} = 37 \]

choices. □

Solution 38: Suppose we want to pick some of the 6 coins. For each coin, we have two choices: choose the coin or do not choose it. After we make 6 decisions, one for each coin, we will have a certain amount of money at hand. There are $2^6 = 64$ different ways to make these 6 decisions. Therefore, there are 64 different possible amounts of money. There is an important observation: distinct combinations give distinct amounts. For example, 30 cents can be obtained only by picking a quarter and a nickel; no other combination of coins will produce a sum equal to 30 cents. □



Solution 39: Let
P: people who like pizza,
I: people who like ice cream,
T: people who like tofu.

The following facts are given: there are 60 people surveyed, $|P| = 26$, $|I| = 32$, $|T| = 30$, $|P \cap I| = 14$, $|P \cap T| = 7$, $|I \cap T| = 10$, and $|P \cap I \cap T| = 2$. Thus, using the principle of inclusion-exclusion, we have

\[ |P \cup I \cup T| = 26 + 32 + 30 - 14 - 7 - 10 + 2 = 59, \]

where $P \cup I \cup T$ is the set of people who like at least one of the three. Therefore,

\[ 60 - |P \cup I \cup T| = 60 - 59 = 1 \]

is the number of people who dislike all three. □

Solution 40: It is possible to count the number of valid strings directly, but this becomes very difficult when the length of the strings is large. Instead, we can count the invalid strings and subtract that count from the number of all possible strings.

Let’s solve the problem in both ways. One may find the advantages and disadvantages of each of them.

Method 1: (indirect) The number of all possible strings is

\[ 5^5 = 3125, \tag{7.30} \]

because each place has 5 choices and a string of length 5 has 5 places.

We will classify the invalid strings into the following disjoint groups.

1. Strings without a: There are

\[ 4^5 = 1024 \tag{7.31} \]

such strings, because each place has 4 choices: b, c, d, e.

2. Strings with one a: First we construct strings of length 4 made from b, c, d, e, and reserve one place for an a to insert later. We will get strings of length 5 with exactly one a after insertion. There are $4^4$ strings of length 4 without a. Consider such a string, $l_1l_2l_3l_4$, and mark by numbered boxes the places into which a can be inserted.

[1] $l_1$ [2] $l_2$ [3] $l_3$ [4] $l_4$ [5]

After we insert one a into one of the five boxes above, we get a string of length 5 with exactly one a. There are $\binom{5}{1}$ choices of boxes, and putting an a in a different box will result in a different string. Therefore, the total number of strings with exactly one a is

\[ \binom{5}{1}4^4 = 1280. \tag{7.32} \]

3. Strings with two separated a’s: Similarly, we reserve two places for two a’s and count the number of strings of length 3 made from b, c, d, e. There are $4^3$ such strings. Again, consider a string of length 3 without a, $l_1l_2l_3$.

[1] $l_1$ [2] $l_2$ [3] $l_3$ [4]

We will insert two a’s into two distinct boxes above, with $\binom{4}{2}$ choices. Therefore, the total number of strings with two separated a’s is

\[ \binom{4}{2}4^3 = 384. \tag{7.33} \]

4. Strings with three separated a’s: Again, we reserve three places for three a’s and count the number of strings of length 2 made from b, c, d, e. There are $4^2$ such strings. And again, consider a string of length 2 without a, $l_1l_2$.

[1] $l_1$ [2] $l_2$ [3]

Similarly, we will insert three a’s into the three boxes above. There are $\binom{3}{3}$ choices. Therefore, the total number of strings with three separated a’s is

\[ \binom{3}{3}4^2 = 16. \tag{7.34} \]

It is impossible to construct a string of length 5 with 4 or more separated a’s. Therefore, we conclude that the total number of strings of length 5 with two or more consecutive a’s is

\[ (7.30) - (7.31) - (7.32) - (7.33) - (7.34) = 421. \]

□

Method 2: (direct) Let’s agree on the following conventions.

a : A place occupied by a.

• : A place to be filled by any one of b, c, d, e.

? : A place to be filled by any one of a, b, c, d, e.

Therefore, there are 4 choices for • and 5 choices for ?.

We will sum up the following 4 cases:

1. Strings with 2 consecutive a’s: We have the following subcases,

(a) a a • ? ? : 1 × 1 × 4 × 5 × 5
(b) • a a • ? : 4 × 1 × 1 × 4 × 5
(c) ? • a a • : 5 × 4 × 1 × 1 × 4
(d) ? ? • a a : 5 × 5 × 4 × 1 × 1

Altogether, there are 100 + 80 + 80 + 100 = 360 ways. However, if we put aa in ? ? in both cases (a) and (d), we get duplicates. Hence we have overcounted by 4, because there are 4 such strings: aabaa, aacaa, aadaa, and aaeaa. Thus, we have to subtract 4 from 360, and hence there are 356 distinct strings in this case.

2. Strings with 3 consecutive a’s: We have the following subcases,

(a) a a a • ? : 1 × 1 × 1 × 4 × 5
(b) • a a a • : 4 × 1 × 1 × 1 × 4
(c) ? • a a a : 5 × 4 × 1 × 1 × 1

Therefore, the total number is 20 + 16 + 20 = 56.

3. Strings with 4 consecutive a’s: We have the following subcases,

(a) a a a a • : 1 × 1 × 1 × 1 × 4
(b) • a a a a : 4 × 1 × 1 × 1 × 1

We have 8 such strings in this case.

4. Strings with 5 consecutive a’s: The only string is aaaaa.

Altogether, we have 356 + 56 + 8 + 1 = 421 different strings of length 5 with two or more consecutive a’s.

Question: Why is the first method better? (Try length 6.) □
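Method 1 generalizes to any length, which makes it easy to compare with a brute-force count. The sketch below is our own: `by_complement` uses the box argument of Method 1, where a string of length $L$ with $j$ separated a’s is obtained by choosing $j$ of the $L - j + 1$ boxes.

```python
from itertools import product
from math import comb

def brute_force(length, alphabet="abcde"):
    """Count strings over the alphabet that contain "aa" somewhere."""
    return sum(1 for s in product(alphabet, repeat=length) if "aa" in "".join(s))

def by_complement(length, others=4):
    """Method 1: all strings minus those whose a's are pairwise separated."""
    separated = sum(
        comb(length - j + 1, j) * others ** (length - j)
        for j in range(length + 1)
    )
    return (others + 1) ** length - separated

print(brute_force(5), by_complement(5))
```

For length 6 the same function applies unchanged, whereas the case analysis of Method 2 would have to be redone by hand.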

Solution 41:

Method 1: We don’t have to consider the five-digit number 10,000, because it is obvious that there is no 7 in that number. Therefore, this problem is asking: if we construct numbers by filling one of 0, 1, ..., 9 in each of four boxes, then how many different numbers are there with at least one 7 in the boxes?

To answer this question, we put a 7 in one of the boxes and define the sets A, B, C, and D of integers that are obtained by filling 0, 1, ..., or 9 in the remaining three boxes.

A: 7 _ _ _

B: _ 7 _ _

C: _ _ 7 _

D: _ _ _ 7

Now, the question is: what is $|A \cup B \cup C \cup D|$? It is clear that

\[ |A| + |B| + |C| + |D| = 4 \times 10^3 = 4000, \tag{7.35} \]

but one should be careful: unlike the groups in Problem 40, the four sets are not disjoint. Thus, we have to find the size of the intersection of each pair, for example $|A \cap B|$. $A \cap B$ is the set of 4-digit numbers with 7’s in the first two places. There are $10^2$ numbers in this case. We have $\binom{4}{2}$ different pairs. Thus,

\[ |A \cap B| + |A \cap C| + |A \cap D| + |B \cap C| + |B \cap D| + |C \cap D| = \binom{4}{2}10^2 = 600. \tag{7.36} \]

Likewise, the intersection of three sets contains ten elements, and we have $\binom{4}{3}$ such intersections. We have

\[ |A \cap B \cap C| + |A \cap B \cap D| + |A \cap C \cap D| + |B \cap C \cap D| = \binom{4}{3}10 = 40. \tag{7.37} \]

Finally,

\[ |A \cap B \cap C \cap D| = \binom{4}{4}10^0 = 1. \tag{7.38} \]

By the inclusion-exclusion principle, we have

\begin{align*}
|A \cup B \cup C \cup D| &= (7.35) - (7.36) + (7.37) - (7.38) \\
&= \binom{4}{1}10^3 - \binom{4}{2}10^2 + \binom{4}{3}10 - \binom{4}{4}10^0 \\
&= 3439.
\end{align*}

□

There is a much easier approach to this problem, given in Method 2.

Method 2: We count all integers from 1 to 10,000 and remove those without any 7.

There are $(9^4 - 1) + 1$ numbers without 7. Note that we have to subtract one because 0000 does not count, although it is counted among the $9^4$ numbers, and add 1 for the last number, 10,000. Therefore, the answer is $10{,}000 - 9^4 = 3439$. □
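Both methods of Solution 41 can be checked against a direct scan of the integers from 1 to 10,000. The following sketch is our own illustration; the function names are not from the text.

```python
from math import comb

def brute_force(limit=10_000):
    """Count integers in 1..limit whose decimal form contains the digit 7."""
    return sum(1 for n in range(1, limit + 1) if "7" in str(n))

def inclusion_exclusion(digits=4):
    """Method 1: alternating sum over the number of positions forced to be 7."""
    return sum(
        (-1) ** (i + 1) * comb(digits, i) * 10 ** (digits - i)
        for i in range(1, digits + 1)
    )

def by_complement(digits=4):
    """Method 2: all digit strings minus those that avoid the digit 7."""
    return 10 ** digits - 9 ** digits

print(brute_force(), inclusion_exclusion(), by_complement())
```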



Solution 42:

1. f(1) is odd.

In this case, let’s select the value for $f(1)$ first. $f(1)$ has 3 choices. After one odd number has been used for $f(1)$, 4 numbers remain for $f(2), \ldots, f(5)$. There are $4!$ ways to assign them. Therefore, altogether, there are

\[ 3 \times 4! = 72 \]

different permutation functions.

2. f(2) is odd.

A little trick here is that we do not select a value for $f(1)$ before we do so for $f(2)$. If we chose any number in 1, 2, 3, 4, 5 for $f(1)$ first, the number of choices for $f(2)$ would depend on whether the number chosen for $f(1)$ is even or odd. This would introduce a little difficulty, since we would have to argue by cases. Therefore, we start with $f(2)$ and then fill in the rest of the function. The answer is the same as the previous one, i.e.,

\[ 3 \times 4! = 72. \]

3. Both f(1) and f(2) are odd.

We select the two values for $f(1)$ and $f(2)$ first. $f(1)$ has 3 choices, and afterwards $f(2)$ has 2 choices. Then there are 3, 2, and 1 choices for the rest of the function respectively. Therefore, we have

\[ 3 \times 2 \times 3! = 36. \]

□
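The three counts of Solution 42 are small enough to verify by listing all $5! = 120$ permutations of $\{1, 2, 3, 4, 5\}$. In the sketch below (our own), the tuple entry `p[i]` plays the role of $f(i + 1)$.

```python
from itertools import permutations

perms = list(permutations(range(1, 6)))  # p[i] is the value f(i + 1)

f1_odd = sum(1 for p in perms if p[0] % 2 == 1)
f2_odd = sum(1 for p in perms if p[1] % 2 == 1)
both_odd = sum(1 for p in perms if p[0] % 2 == 1 and p[1] % 2 == 1)

print(f1_odd, f2_odd, both_odd)
```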

Solution 43: We first permute the 21 consonants. There are $21!$ permutations. For each such permutation, we can insert the five vowels into five of the 22 places between every two consecutive consonants plus the head and tail positions, as shown in the following.

[1] $c_1$ [2] $c_2$ [3] ··· [21] $c_{21}$ [22]

In such a way, no two vowels will be together. We have $\binom{22}{5}$ ways to choose 5 places for the vowels. Moreover, a different order of the vowels will result in a different string. There are $5!$ permutations of the 5 vowels. Therefore, there are

\[ 21! \times \binom{22}{5} \times 5! \]

ways. □



Solution 44: To facilitate the count, we make a simple observation: a cycle of a permutation constitutes an equivalence class. Therefore, if $f$ is a permutation on a 6-set $A$ and $f$ consists of two cycles, then $f$ forms two equivalence classes on $A$. In other words, $f$ partitions $A$ into two cells. Unfortunately, there is no easy way to count the number of all possible $f$. We have to count case by case. First, we count the number of ways to partition a 6-set into 2 cells. Then, we count the number of possible permutations for each partition.

We observe that if $f$, as a permutation on $X$, forms a cycle on $S$, where $S \subseteq X$ and $|S| = n$, then there are $(n - 1)!$ different possibilities for $f$ on $S$.

For a 6-set, we have three cases of partitions into 2 cells.

1-cell, 5-cell: In this case, there are $\binom{6}{1}$ partitions. We have $(1 - 1)!$ permutation on the 1-cell and $(5 - 1)!$ permutations on the 5-cell. Therefore, we have

\[ \binom{6}{1} \times 0! \times 4! = 144 \]

different permutations in this case.

2-cell, 4-cell: In this case, there are $\binom{6}{2}$ partitions. We have $(2 - 1)!$ permutation on the 2-cell and $(4 - 1)!$ permutations on the 4-cell. Therefore, we have

\[ \binom{6}{2} \times 1! \times 3! = 90 \]

different permutations in this case.

3-cell, 3-cell: In this case, there are $\binom{6}{3}/2$ partitions. We have $(3 - 1)!$ permutations on each 3-cell. Therefore, we have

\[ \frac{\binom{6}{3}}{2} \times 2! \times 2! = 40 \]

different permutations in this case.

Altogether, we have 274 different permutations on a 6-set that consist of exactly two cycles. □


Solution 45: This is the set partition problem. Seven people form a 7-set, and the three teams are the three cells in a partition of the 7-set. By using the related formula, we have

\[ \frac{1}{3!}\left(3^7 - \binom{3}{1}2^7 + \binom{3}{2}1^7\right) = 301 \]

different ways to form three teams from seven people. □
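The set partition count of Solution 45 can be computed in two independent ways. The sketch below is our own: `partitions_formula` follows the inclusion-exclusion formula used above, while `partitions_recursive` uses the standard recurrence in which the last element either joins one of the $k$ existing cells or forms a new cell by itself.

```python
from math import comb, factorial

def partitions_formula(n, k):
    """Number of partitions of an n-set into k nonempty cells, by inclusion-exclusion."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)

def partitions_recursive(n, k):
    """The same number via the recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * partitions_recursive(n - 1, k) + partitions_recursive(n - 1, k - 1)

print(partitions_formula(7, 3), partitions_recursive(7, 3))
```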

Solution 46: This problem is a bit ambiguous. If each person is considered distinct, then each of the seven people has 4 choices. Therefore, we have $4^7$ ways to let these people exit the elevator. □

If it does not matter who exits the elevator, then the number of ways is the number of nonnegative integer solutions of the following equation.

\[ f_1 + f_2 + f_3 + f_4 = 7. \]

By using the related formula, we have

\[ s(4, 7) = \binom{4 + 7 - 1}{7} = \binom{10}{7} = 120 \]

different ways to let people exit. □

Solution 47: This problem can be translated into counting the number of nonnegative integer solutions of the following equation:

\[ a + b + c + d + e = 12, \tag{7.39} \]

with the constraints that $a \ge 2$ and $d \ge 3$. Let $a = a' + 2$, $d = d' + 3$. Then, (7.39) simplifies to

\[ a' + b + c + d' + e = 7. \]

Therefore, the number of nonnegative integer solutions is

\[ s(5, 7) = \binom{5 + 7 - 1}{7} = 330. \]

□
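The substitution trick used in Solutions 46 and 47 can be checked by enumerating the solutions directly. In the sketch below (our own), `s` is the formula $s(v, t) = \binom{v + t - 1}{t}$ quoted in the text, and `brute_force` counts integer solutions with given lower bounds on the variables.

```python
from itertools import product
from math import comb

def s(variables, total):
    """Number of nonnegative integer solutions of x_1 + ... + x_variables = total."""
    return comb(variables + total - 1, total)

def brute_force(total, lower_bounds):
    """Count integer solutions with x_i >= lower_bounds[i] that sum to `total`."""
    ranges = [range(lo, total + 1) for lo in lower_bounds]
    return sum(1 for values in product(*ranges) if sum(values) == total)

# Solution 46 (second reading): four floors, seven indistinguishable people.
print(brute_force(7, [0, 0, 0, 0]), s(4, 7))
# Solution 47: a >= 2 and d >= 3, total 12, reduced to s(5, 7).
print(brute_force(12, [2, 0, 0, 3, 0]), s(5, 7))
```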

Solution 48: Each place for a letter has 26 choices, and we have 5 places to be filled. Therefore, we have $26^5$ different 5-letter words.

This problem is the same as the distinguishable balls, distinguishable urns problem, where 26 is the number of urns and 5 is the number of balls. □

Solution 49:

1. The number of surjections from a 5-set to a 5-set is

\[ 5^5 - \binom{5}{1}4^5 + \binom{5}{2}3^5 - \binom{5}{3}2^5 + \binom{5}{4}1^5 = 120. \]

Alternatively, if the sizes of two sets are the same, then any surjection between them is also a bijection. Therefore, the answer to this problem is the same as the number of different bijections from a 5-set to a 5-set. There are $5! = 120$ different bijections. □

2. The number of surjections from a 5-set to a 3-set is

\[ 3^5 - \binom{3}{1}2^5 + \binom{3}{2}1^5 = 243 - 96 + 3 = 150. \]

□
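The inclusion-exclusion formula for surjections can be compared with a direct enumeration of all functions. The sketch below is our own illustration: a function from an $m$-set to an $n$-set is a tuple of $m$ images, and it is onto exactly when all $n$ values occur.

```python
from itertools import product
from math import comb

def surjections_formula(m, n):
    """Number of surjections from an m-set onto an n-set, by inclusion-exclusion."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))

def surjections_brute_force(m, n):
    """Enumerate all n**m functions and keep those whose image has n elements."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

print(surjections_formula(5, 5), surjections_brute_force(5, 5))
print(surjections_formula(5, 3), surjections_brute_force(5, 3))
```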

Solution 50:

1. The idea for counting the number of nonnegative integer solutions to $w + x + y + z = 10$ with the restriction $w \ge 2$ is this: we first take 2 from 10 and give it to $w$, then find the number of ways to distribute the remaining amount ($10 - 2$) among four nonnegative integer variables by using the standard formula.

Let $w = w' + 2$. If $w' \ge 0$, then $w \ge 2$ as required. Thus,

\[ w + x + y + z = 10 \;\Rightarrow\; (w' + 2) + x + y + z = 10 \;\Rightarrow\; w' + x + y + z = 8. \tag{7.40} \]

The number of nonnegative integer solutions to (7.40) is

\[ \binom{8 + 4 - 1}{8} = 165. \]

2. For this problem, all variables have to be at least 1. As in the above solution, let $w = w' + 1$, $x = x' + 1$, $y = y' + 1$, and $z = z' + 1$. We obtain

\[ w' + x' + y' + z' = 6. \tag{7.41} \]

The number of nonnegative integer solutions to (7.41) is

\[ \binom{6 + 4 - 1}{6} = 84. \]

□

Solution 51: The value of $w + x + y + z$ can be 12, 13, or 14. Therefore, we add the numbers of nonnegative integer solutions to the following equations:

\begin{align*}
w + x + y + z &= 12, \\
w + x + y + z &= 13, \\
w + x + y + z &= 14.
\end{align*}

There are

\[ \binom{12 + 3}{12} + \binom{13 + 3}{13} + \binom{14 + 3}{14} = 1695 \]

nonnegative integer solutions. □

Solution 52: Let $a, b, c, d$ denote the numbers of Hershey’s kisses given to the 4 people respectively. We count the number of nonnegative integer solutions to the following equation:

\[ a + b + c + d = 8. \]

There are

\[ \binom{4 + 8 - 1}{8} = \binom{11}{8} = 165 \]

different ways to distribute the kisses. □

Solution 53: We notice that $(7 + 8 + 9) - 12 = 12$. Thus, after the first person chooses 12 books, the other person has to take the remaining 12 books without any choice. Therefore, we only have to count the number of ways the first person can select 12 books. Suppose the first person picks $a$ copies of the first book, $b$ copies of the second book, and $c$ copies of the third book. Since there are seven copies of the first book, eight copies of the second book, and nine copies of the third book, we know that $0 \le a \le 7$, $0 \le b \le 8$, $0 \le c \le 9$, and

\[ a + b + c = 12. \tag{7.42} \]



Let $S$ be the number of nonnegative integer solutions to (7.42). If there is no constraint on $a, b, c$, then

\[ S = \binom{3 + 12 - 1}{12} = 91. \]

However, these solutions allow $a$, $b$, $c$ to take values as large as 12, whereas the constraints rule out $a > 7$, $b > 8$, and $c > 9$. Therefore, we have to remove the impossible solutions. Let $T$ be the answer to this problem. By the principle of inclusion-exclusion, the desired answer is

\[ T = S - S_{a \ge 8} - S_{b \ge 9} - S_{c \ge 10} + S_{a \ge 8,\, b \ge 9} + S_{b \ge 9,\, c \ge 10} + S_{a \ge 8,\, c \ge 10} - S_{a \ge 8,\, b \ge 9,\, c \ge 10}, \]

where, for example, $S_{a \ge 8,\, b \ge 9}$ is the number of solutions to (7.42) with the constraints $a \ge 8$ and $b \ge 9$.

It is easy to see that $S_{a \ge 8,\, b \ge 9}$, $S_{b \ge 9,\, c \ge 10}$, $S_{a \ge 8,\, c \ge 10}$, and $S_{a \ge 8,\, b \ge 9,\, c \ge 10}$ are all equal to zero, because (7.42) has no nonnegative solution in which, for example, $a \ge 8$ and $b \ge 9$ (the sum would be at least 17). Therefore, we only have to count the nonnegative integer solutions of the following equations, using the same idea as in the solution of Problem 50.

$S_{a \ge 8}$: Let $a = a' + 8$, $a' \ge 0$. Then $a + b + c = 12 \Rightarrow a' + b + c = 4$, and

\[ S_{a \ge 8} = \binom{3 + 4 - 1}{4} = 15. \]

$S_{b \ge 9}$: Let $b = b' + 9$, $b' \ge 0$. Then $a + b + c = 12 \Rightarrow a + b' + c = 3$, and

\[ S_{b \ge 9} = \binom{3 + 3 - 1}{3} = 10. \]

$S_{c \ge 10}$: Let $c = c' + 10$, $c' \ge 0$. Then $a + b + c = 12 \Rightarrow a + b + c' = 2$, and

\[ S_{c \ge 10} = \binom{3 + 2 - 1}{2} = 6. \]

Therefore, $T = 91 - 15 - 10 - 6 = 60$. □
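The inclusion-exclusion count can be cross-checked by listing the admissible triples $(a, b, c)$ directly. The sketch below is illustrative only; the helper names are assumptions made for this example.

```python
from math import comb

def stars_and_bars(total, k):
    """Nonnegative integer solutions of x1 + ... + xk = total (0 if total < 0)."""
    return comb(total + k - 1, total) if total >= 0 else 0

def bounded_count(total, a_max, b_max, c_max):
    """Directly count (a, b, c) with a + b + c = total and each variable within its cap."""
    return sum(
        1
        for a in range(a_max + 1)
        for b in range(b_max + 1)
        if 0 <= total - a - b <= c_max
    )

S = stars_and_bars(12, 3)
# Subtract the solutions with a >= 8, b >= 9, c >= 10; the pairwise terms vanish.
T = S - stars_and_bars(12 - 8, 3) - stars_and_bars(12 - 9, 3) - stars_and_bars(12 - 10, 3)
print(S, T, bounded_count(12, 7, 8, 9))
```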



Solution 54: First, we distribute $b$ indistinguishable balls into $u$ distinguishable urns. There are $\binom{b+u-1}{b}$ different ways. Secondly, consider the following figure, where the distribution is represented as $b$ indistinguishable balls (squares) and $u - 1$ indistinguishable vertical bars.

[Figure: a row of $b$ squares split by $u - 1$ vertical bars into groups labeled urn 1, urn 2, ..., urn $u$.]

If we label the balls to make them distinguishable, then we obtain

[Figure: the same row with the balls labeled; urn 1 holds 1 2 3 4, urn 2 holds 5 6 7, ..., urn $u$ holds $x \cdots b$.]

Since each urn is a stack, we may imagine that the leftmost ball in each urn is at the bottom and the rightmost ball is on the top of the stack. Clearly, a different arrangement of the $b$ balls results in a different distribution, and there are $b!$ different arrangements in each case. Finally, since the order of the $u$ urns makes a difference, there are $u!$ ways to permute the urns. Therefore, the answer is

\[ b! \binom{b + u - 1}{b} u!. \]

□

Solution 55: The argument is exactly the same as in the solution of Problem 54, except that we do not permute the $u$ urns. Therefore, the answer is

\[ b! \binom{b + u - 1}{b}. \]

□

Solution 56: We first give $k$ balls to each urn, then distribute the remaining $b - uk$ balls among the $u$ urns. The number of ways is

\[ \binom{b - uk + u - 1}{u - 1}. \]

□



Solution 57:

1. First, we select 3 objects out of the 8-set to form the cell of 3 elements. There are

\[ \binom{8}{3} = \frac{8 \cdot 7 \cdot 6}{3!} = 56 \]

different ways to do so. The remaining 5 elements of the 8-set form the other cell. There is only one way to do so, i.e., pick them all: $\binom{5}{5} = 1$. Therefore, there are $56 \cdot 1 = 56$ different partitions.

2. As in part 1, we select 3 elements to form the first 3-cell; there are 56 ways. From the remaining 5 elements, we select another 3 elements to form the second 3-cell; there are $\binom{5}{3} = 10$ ways. Finally, the remaining 2 elements form the last cell; there is one way. Altogether, we get $56 \cdot 10 \cdot 1 = 560$ combinations. But in a partition of a set we cannot tell the difference between the first 3-element cell and the second 3-element cell. For example,

\[ \{\{1,2,3\}, \{4,5,6\}, \{7,8\}\} \quad\text{and}\quad \{\{4,5,6\}, \{1,2,3\}, \{7,8\}\} \]

are considered the same partition. Therefore, the 560 combinations have to be divided by 2. The answer is

\[ \frac{\binom{8}{3}\binom{5}{3}\binom{2}{2}}{2} = 280. \]

□
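The division by 2 in part 2 can be seen concretely by generating the partitions and letting a set data structure discard the duplicates. This is a small brute-force sketch, with `partitions_with_sizes` a name invented for the illustration.

```python
from itertools import combinations

def partitions_with_sizes(elements, sizes):
    """Return the set of partitions of `elements` whose cells have the given sizes."""
    results = set()

    def build(remaining, sizes_left, cells):
        if not sizes_left:
            # A partition is an unordered collection of cells, so store it as a
            # frozenset of frozensets; reordered copies collapse into one entry.
            results.add(frozenset(cells))
            return
        for cell in combinations(sorted(remaining), sizes_left[0]):
            build(remaining - set(cell), sizes_left[1:], cells + [frozenset(cell)])

    build(set(elements), list(sizes), [])
    return results

print(len(partitions_with_sizes(range(8), [3, 5])))     # part 1: 56
print(len(partitions_with_sizes(range(8), [3, 3, 2])))  # part 2: 280
```

In part 2 the recursion visits 560 ordered selections, but only 280 distinct partitions survive in the result set.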

Solution 58: Suppose we want to partition an $n$-set into $m$ cells. We have the following two cases.

case 1: The $n$th element forms its own cell. In this case, we partition the remaining $(n-1)$-set into $m - 1$ cells. There are $S(n-1, m-1)$ different ways to do so.

case 2: The $n$th element is in one of the $m$ cells together with other elements. In this case, we partition the $(n-1)$-set into $m$ cells; there are $S(n-1, m)$ different ways to do so. The $n$th element then has to be put into one of the $m$ cells, which gives $m$ choices. Altogether, we have $m\,S(n-1, m)$ different ways.

By the rule of sum, there are $S(n-1, m-1) + m\,S(n-1, m)$ different ways to partition an $n$-set into $m$ cells. □

Solution 59: Let $A, B$ be sets, where $|A| = 7$ and $|B| = 5$.

1. From $A$ to $B$: Since we do not consider partial functions, all elements in $A$ have to be assigned a value in $B$. By the pigeonhole principle, it is impossible to assign a distinct value in $B$ to every element in $A$. Therefore, there is no injection from $A$ to $B$.

2. From $B$ to $A$: We have $7 \times 6 \times 5 \times 4 \times 3 = 2520$ injections.

□

Solution 60: Let $A, B$ be sets, where $|A| = 7$ and $|B| = 5$. We can use the standard formula to find the number of surjections between $A$ and $B$.

1. From $A$ to $B$: We have

\[ 5^7 - \binom{5}{1} 4^7 + \binom{5}{2} 3^7 - \binom{5}{3} 2^7 + \binom{5}{4} 1^7 = 16800 \]

surjections from $A$ to $B$.

2. From $B$ to $A$: We can mindlessly use the standard formula to get the answer, which is 0.

Alternatively, we can argue directly by the pigeonhole principle that an onto function from $B$ to $A$ is impossible: to cover all 7 elements of $A$, at least one element in $B$ would have to map to at least two elements in $A$, and such a mapping is not a function.

□
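Both parts can be reproduced with a few lines of Python. The sketch below assumes the standard inclusion-exclusion formula quoted in the solution; the function names are illustrative, and the brute-force version simply enumerates all functions as tuples of images.

```python
from itertools import product
from math import comb

def surjections(m, n):
    """Number of onto functions from an m-set to an n-set, by inclusion-exclusion."""
    return sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))

def surjections_brute(m, n):
    """Enumerate all n**m functions and keep those that hit every target element."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

print(surjections(7, 5), surjections_brute(7, 5))  # from A to B
print(surjections(5, 7), surjections_brute(5, 7))  # from B to A
```

The second line shows that the formula really does return 0 when the domain is smaller than the codomain.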



Solution 61: Let us first understand what this problem means. For illustration, let $P = \{1, 2, 3, 4, 5\}$ denote the set of 5 people. A way to form committees of 4 members can be represented by a set of 4-subsets of $P$. Let $E$ be such a set,

\[ E = \{\{1,2,3,4\}, \{3,4,5,1\}, \{5,1,2,3\}, \{2,3,4,5\}, \{4,5,1,2\}\}. \]

One can verify that any 3-subset of $P$ is contained in exactly two elements of $E$. For example, $\{1,2,3\} \subset \{1,2,3,4\}$, $\{1,2,3\} \subset \{5,1,2,3\}$, and no other element of $E$ contains $\{1,2,3\}$. Therefore, $E$, the largest set of 4-subsets of $P$ (that is, the set of all of them), is a way to form the committees that satisfies the criteria of this problem. The number of members (committees) in $E$ is simply $\binom{5}{4}$. Unfortunately, if $P$ is a 6-set, the set of all 4-subsets of $P$ no longer works. For example, if we add 6 to $P$, then $\{1,2,3,6\} \in E$ is also a valid committee and $\{1,2,3\}$ is a subset of it. Consequently, $\{1,2,3\}$ will be contained in more than two committees in $E$.

Let us go back to the case of 5 people and define a set $F$, called the flag set, as follows:

\[ F = \{(X, T) \mid X \in E,\ |T| = 3,\ T \subset X\}. \]

That is, $T$ is a 3-subset of $X$. For example, for $\{1,2,3,4\} \in E$, we have

\[ (\{1,2,3,4\}, \{1,2,3\}) \in F, \quad (\{1,2,3,4\}, \{1,2,4\}) \in F, \quad (\{1,2,3,4\}, \{1,3,4\}) \in F, \quad (\{1,2,3,4\}, \{2,3,4\}) \in F. \]

To count the number of elements in $F$, we can use the property of $X$ and its relation with $E$:

\[ |F| = |E| \times \binom{4}{3} = 4 \times |E|, \tag{7.43} \]

because each member of $E$ contributes 4 choices of $T$. Now, let us count $F$ by using the property of $T$. If $E$ has the property required by this problem, then any 3-subset of $P$ is contained in exactly two elements of $E$. In other words, for every 3-subset $T$, we can find exactly two pairs $(X_1, T)$ and $(X_2, T)$ in $F$, where $T \subset X_1$ and $T \subset X_2$. Since $P$ has $\binom{5}{3}$ 3-subsets, we have

\[ |F| = \binom{5}{3} \times 2 = 20. \tag{7.44} \]

Since the sizes of $F$ obtained from (7.43) and (7.44) must agree, we have

\[ 4 \times |E| = 20, \]

and hence $|E| = 5$, which agrees with the number of elements of $E$ seen above.



Now, suppose $|P| = 10$. There are $\binom{10}{3}$ ways to select a 3-subset from $P$. We modify the corresponding number in (7.44) and calculate the size of a valid $E$ in the same way. That is,

\[ |E| \times \binom{4}{3} = \binom{10}{3} \times 2. \]

Therefore, $|E| = 240 / 4 = 60$. □
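The double-counting argument can be replayed for the 5-person case, where $E$ is known explicitly. The Python sketch below is an illustration under that assumption; `committee_count` is a hypothetical helper that only evaluates the counting identity and does not construct a committee system for 10 people.

```python
from collections import Counter
from itertools import combinations
from math import comb

P = range(1, 6)
E = [frozenset(X) for X in combinations(P, 4)]  # all 4-subsets of P
F = [(X, frozenset(T)) for X in E for T in combinations(sorted(X), 3)]  # flag set

containing = Counter(T for _, T in F)  # number of committees containing each 3-subset
print(len(E), len(F), set(containing.values()))

def committee_count(people, copies=2, size=4, sub=3):
    """|E| forced by double counting: |E| * C(size, sub) = C(people, sub) * copies."""
    return comb(people, sub) * copies // comb(size, sub)

print(committee_count(5), committee_count(10))
```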

Solution 62: If we examine the number of ways to pick numbers from $A$ and $B$, discuss the overlapping cases, and check for possible duplicate numbers, the task becomes almost impossible. As in many counting problems, we can find a much easier way if we look at the problem from a different point of view.

The idea is that we do not care where the 8 numbers come from. For example, suppose we pick 1, 1, 2, 3, 4, 5, 6, 7. We do not care in which way these 8 numbers were formed: $\{1,2,4,5,6,7\} \cup \{1,3\}$, $\{1,2,3,4,6,7\} \cup \{1,5\}$, or $\{1,2,3,4,5,6\} \cup \{1,7\}$. All we want to know is that in this case we can have $\frac{8!}{2!}$ different strings.

We have the following three cases.

Case 1: All numbers selected from $A$ and $B$ are distinct. $\{1,2,3,4,5,6,7,8\}$ is the only case, and we can have

\[ 8! = 40320 \tag{7.45} \]

different strings.

Case 2: Exactly two numbers are the same among the numbers selected from $A$ and $B$. For each such selection, we have $\frac{8!}{2!}$ different strings.

Now, in how many ways can we select numbers from $A$ and $B$ so that exactly two numbers are the same? Since one of the two equal numbers must come from $B$, we have $\binom{4}{1}$ cases. Among the eight selected numbers, two are the same, and the remaining six distinct numbers are selected from the remaining seven numbers. Thus, we have $\binom{4}{1}\binom{7}{6}$ different selections. Therefore, we can have

\[ \binom{4}{1}\binom{7}{6}\frac{8!}{2!} = 564480 \tag{7.46} \]

different strings.

Case 3: There are exactly two duplicated numbers (two pairs of equal numbers) among the numbers selected from $A$ and $B$. For example, we may pick $\{2,3,4,5,6,7\}$ from $A$ and $\{3,5\}$ from $B$. For each such selection, we have $\frac{8!}{2!\,2!}$ different strings.



As in the previous case, the two duplicated numbers must come from $B$, in $\binom{4}{2}$ ways, and the remaining four distinct numbers are selected from the remaining six numbers in $\binom{6}{4}$ ways. Thus, we have $\binom{4}{2}\binom{6}{4}$ different selections and

\[ \binom{4}{2}\binom{6}{4}\frac{8!}{2!\,2!} = 907200 \tag{7.47} \]

different strings.

Altogether, we have $(7.45) + (7.46) + (7.47) = 1512000$ different strings. □
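The arithmetic of the three cases is easy to recheck mechanically. The snippet below only evaluates the three expressions (7.45) to (7.47) as they are stated in the solution; it does not re-derive the case analysis, and the variable names are chosen for this illustration.

```python
from math import comb, factorial

case1 = factorial(8)                                                   # (7.45)
case2 = comb(4, 1) * comb(7, 6) * factorial(8) // factorial(2)         # (7.46)
case3 = comb(4, 2) * comb(6, 4) * factorial(8) // (factorial(2) ** 2)  # (7.47)

print(case1, case2, case3, case1 + case2 + case3)
```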


Chapter 8

Recurrence Relations and Generating Functions

And God said unto Moses: “I am that I am.”

– (Exodus, III, 14)


The concepts behind induction and recursion are intimately related. As we have seen in Chapter 3 (Mathematical Induction), many examples show that the properties of recursively defined functions or sets can be proved by mathematical induction. In the study of the foundations of mathematics, “recursive” and “computable” carry the same meaning; that is, a function $f$ can be recursively defined if and only if $f$ can be computed by a program.

However, just being computable (being recursively definable) does not mean much in terms of efficiency. Whenever possible, we want the function's “closed form”, so that we can compute it efficiently. For example, we prefer to use $n(n + 1)/2$ rather than a recursive program to compute the sum of the first $n$ natural numbers.

A formula that recursively defines a function is called a “recurrence relation” or a “recurrence equation”. Solving a recurrence equation means finding a closed form of the function defined by the recurrence equation. In this chapter, we emphasize how to solve a given recurrence equation; a few examples are given to illustrate why a recurrence equation formulation of a given problem is preferable.

Some methods are suitable for solving certain kinds of recurrence equations, but there is no universal method for solving all kinds of recurrence equations. We have to, unfortunately, study different methods to find the solutions of different types of recurrence equations.

The generating function is an important subject in mathematics with applications in many diverse areas. Without pondering too much on the properties of generating functions, we use them as a tool to solve some recurrence relations.

8.1 Recurrence Relations

A recurrence equation expresses the value $a_n$ of a sequence in terms of some or all of its past values $a_{n-1}, a_{n-2}, \ldots$. In its most general form, a recurrence equation is defined as follows:

\[ a_n = f(a_{n-1}, a_{n-2}, \ldots, a_0), \]

where $f$ is a given function. The following are three illustrative examples of recurrence relations.

Example 8.1 $a_1 = 2$, and $a_n = 4a_{n-1} - 2$ for $n \ge 2$.

Example 8.2 $a_1 = 2$, $a_2 = 7$, and $a_n = a_{n-1}^2 + 2a_{n-2}$ for $n \ge 3$.

Example 8.3 $a_1 = 1$, $a_2 = 2$, $a_3 = 1$, and $a_n = 3a_{n-1} - a_{n-2} + 2a_{n-3}$ for $n \ge 4$.



Recurrence equations are also known as difference equations. Recurrence equations are valuable not only in mathematics and computer science but also in many other disciplines. In this chapter, our goal is to find an explicit solution of a recurrence equation and, more importantly, to understand the key mathematical ideas that lead to such solutions. Remember, sometimes it is natural to describe the solution of a problem in terms of a recurrence equation. The following example illustrates this point.

Example 8.4 A national car rental company allows customers one-way rental from one city to another city. Each month it finds that one fifth of the cars that start the month in New York City end it in Washington, D.C., and one sixth of the cars that start the month in Washington, D.C. end it in New York City. If the initial inventory in each city is 1000 cars, describe the situation after $n$ months.

A solution of this problem is most conveniently obtained in terms of recurrence equations. If $N_n$ and $W_n$ denote the numbers of cars at the beginning of the $n$th month in New York City and Washington, D.C., respectively, then the numbers of cars at the beginning of the $(n+1)$th month satisfy:

\[ N_{n+1} = \tfrac{4}{5} N_n + \tfrac{1}{6} W_n, \qquad W_{n+1} = \tfrac{1}{5} N_n + \tfrac{5}{6} W_n, \]

where $N_0 = W_0 = 1000$. It is easy to see that the values of $N_n$ and $W_n$ can be obtained for $n = 1, 2, \ldots$ from these equations. Another important exercise is to study the behavior of $N_n$ and $W_n$ as $n$ gets larger. □
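To get a feel for how $N_n$ and $W_n$ behave, one can simply iterate the two equations. The following Python sketch does this with exact fractions; the function name `simulate` and the choice of 60 months are assumptions made for the illustration.

```python
from fractions import Fraction

def simulate(months, n0=1000, w0=1000):
    """Iterate the two recurrences and return the list of (N_n, W_n) pairs."""
    n, w = Fraction(n0), Fraction(w0)
    history = [(n, w)]
    for _ in range(months):
        # Both right-hand sides use the values from the start of the month.
        n, w = Fraction(4, 5) * n + Fraction(1, 6) * w, Fraction(1, 5) * n + Fraction(5, 6) * w
        history.append((n, w))
    return history

history = simulate(60)
for month in (0, 1, 2, 12, 60):
    n, w = history[month]
    print(month, round(float(n), 2), round(float(w), 2))
```

The total $N_n + W_n$ stays at 2000, and setting $N_{n+1} = N_n$ in the first equation gives $N/5 = W/6$, so the inventories drift toward the split $N : W = 5 : 6$.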

The following example shows that recurrence relations can sometimes give us an easier way to understand and solve a problem.

Example 8.5 Find the number of distinct partitions of a set of size $n$ into $k$ blocks. Let $A = \{a_1, a_2, \ldots, a_n\}$. A moment's thought shows that the desired number is not easy to obtain directly; there are too many ways to put $n$ elements into $k$ subsets. However, a recurrence equation is not difficult to build. Let $S_{n,k}$ denote the desired number, i.e., the number of partitions of $A$ into $k$ blocks. Then,

\[ S_{n+1,k+1} = S_{n,k} + (k + 1) S_{n,k+1}. \tag{8.1} \]

Why? A simple explanation follows. Let $B = A \cup \{a_{n+1}\}$, and suppose we wish to obtain a partition of $B$ into $(k + 1)$ blocks. There are two possibilities.

Case 1: Consider a partition of $A$ into $(k + 1)$ blocks and put $a_{n+1}$ in any one of the blocks. Since there are $S_{n,k+1}$ distinct partitions of $A$ into $(k + 1)$ blocks and $a_{n+1}$ can be placed in any one of the $k + 1$ blocks, the total number of ways to do this is $(k + 1) S_{n,k+1}$.


Case 2: Consider any partition of $A$ into $k$ blocks and add to it the singleton set $\{a_{n+1}\}$. There are $S_{n,k}$ ways to obtain distinct partitions in this manner.

This justifies Equation (8.1). In addition, it is easy to verify that $S_{n,1} = S_{n,n} = 1$ for $n \ge 1$.

It may not be possible to find a closed form for $S_{n,k}$, but clearly we can find the values of $S_{n,k}$ by using Equation (8.1). For example, $S_{1,1} = S_{2,1} = S_{2,2} = S_{3,1} = 1$ and

\[ S_{3,2} = S_{2,1} + 2 S_{2,2} = 1 + 2 = 3, \]

\[ S_{4,2} = S_{3,1} + 2 S_{3,2} = 1 + 2 \times 3 = 7, \]

\[ S_{4,3} = S_{3,2} + 3 S_{3,3} = 3 + 3 = 6, \]

etc. □
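Equation (8.1) translates almost literally into a memoized function. The sketch below is one possible rendering; the name `stirling2` and the boundary conventions for $k = 0$ and $k > n$ are assumptions added so that the recursion terminates.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-set into k blocks, computed via Equation (8.1)."""
    if k == n:
        return 1          # every element in its own block (also covers n = k = 0)
    if k == 0 or k > n:
        return 0          # assumed boundary values: no such partition exists
    # Shifted form of (8.1): S(n, k) = S(n-1, k-1) + k * S(n-1, k)
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

print(stirling2(3, 2), stirling2(4, 2), stirling2(4, 3))  # 3 7 6
```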

One systematic method for solving a recurrence equation is similar to the solution procedure for differential equations. Students familiar with the latter should not be surprised that difference equations are discrete versions of differential equations.

In Example 8.1, $a_1 = 2$ is known as the initial condition and the other equation defines the recurrence. Similarly, in Example 8.2, $a_1 = 2$, $a_2 = 7$ are the initial conditions and the other equation defines the desired recurrence equation. It is possible to obtain all other values of the sequence by repeatedly substituting the previously obtained values. In Example 8.1, $a_2 = 4a_1 - 2 = 8 - 2 = 6$, $a_3 = 4a_2 - 2 = 24 - 2 = 22$, etc. However, one of the main purposes behind the study of recurrence equations is to be able to solve a given recurrence equation in closed form, i.e., to write the value of $a_n$ in terms of $n$ so that we don't have to evaluate it repeatedly by substitution. In this chapter we study how to solve linear recurrence equations, an important subclass of general recurrence equations.

8.1.1 Definitions

Definition 8.1: A recurrence equation for $a_n$ is called linear if $a_n$ can be written as a linear function of its past values $a_{n-1}$, $a_{n-2}$, etc.

Example 8.1 above is a linear recurrence equation, whereas Example 8.2 is a nonlinear recurrence equation because $a_n$ is a nonlinear function of $a_{n-1}$.

Definition 8.2: The order of a linear recurrence equation is $k$ if $a_n$ is a linear function of the $k$ past values $a_{n-1}, a_{n-2}, \ldots, a_{n-k}$. In some cases, not all of these past values are present in the equation. For this reason the order of the recurrence equation is defined as the difference between the largest and smallest subscripts in the equation.

Example 8.1 above is a linear recurrence equation of order 1. The linear recurrence equation of Example 8.3 is of order 3.

Definition 8.3: A linear recurrence equation has constant coefficients if the coefficients of $a_{n-1}$, $a_{n-2}$, etc., are all constant, i.e., do not depend on the indices $n$, $n - 1$, etc.

For example, the recurrence equation in Example 8.3 has constant coefficients, whereas the recurrence equation $a_n = n a_{n-1} + 4$ does not satisfy the constant coefficient criterion.

In this chapter we confine our attention to linear recurrence equations with constant coefficients of orders 1 and 2.

Definition 8.4: A linear recurrence equation is called homogeneous if $a_n$ is a linear function of the past values $a_{n-1}$, $a_{n-2}$, etc., only and does not contain any additional terms; otherwise, it is called nonhomogeneous.

Example 8.1 above is a nonhomogeneous linear recurrence equation and the recurrence equation given in Example 8.3 is a homogeneous linear recurrence equation.

A nonhomogeneous linear recurrence equation consists of two parts: the nonhomogeneous component $f(n)$ and the rest of the equation, called the homogeneous component. In Example 8.1, the homogeneous part is $a_n = 4a_{n-1}$ and the nonhomogeneous component is $f(n) = -2$.

Definition 8.5: The solution of a given linear recurrence equation is called the general solution. It consists of two parts, the first part obtained from the homogeneous part and the second part contributed by the nonhomogeneous component $f(n)$. This general solution must satisfy the given set of initial conditions. A solution so obtained is known as the particular solution.

8.2 Solving Recurrence Relations

The solution of a given recurrence equation can be obtained by several methods, three of which are popular. When the recurrence equation is of order 1, it is convenient to obtain its solution by the method of repeated substitutions. The second method is similar to the solution procedure used to solve differential equations, and the third method is via generating functions. Each of the methods has distinct advantages.

8.2.1 Repeated Substitution Method

This method is also called the resubstitution method. As the name indicates, the idea is to:



1. Substitute the value of $a_{n-1}$ in the given equation, then the value of $a_{n-2}$, then the value of $a_{n-3}$, etc. These values are obtained from the given recurrence equation by replacing $n$ by $n - 1$, $n - 2$, etc. in the defining equation.

2. Guess the solution of the recurrence equation from the above observations.

3. Prove the guessed result by mathematical induction.

Comment: The repeated substitution method works well for linear recurrence equations of order 1 and may or may not work for equations of order 2. Results can also be obtained when the coefficients in the linear recurrence equation are not constants.

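Example 8.6 For $n \in \mathbb{N}_0$,

\[ f(n) = \begin{cases} 2 & \text{if } n = 0; \\ 3 + 2f(n-1) & \text{if } n \ge 1. \end{cases} \]

\begin{align*}
f(n) &= 3 + 2f(n-1) \\
&= 3 + 2(3 + 2f(n-2)) \\
&= 3 \times (1 + 2) + 2^2 f(n-2) \\
&= 3 \times (1 + 2) + 2^2 (3 + 2f(n-3)) \\
&= 3 \times (1 + 2 + 2^2) + 2^3 f(n-3) \\
&= \cdots \\
&= 3(1 + 2 + 2^2 + \cdots + 2^{n-1}) + 2^n f(0) \\
&= 3 \times (2^n - 1) + 2^{n+1} \\
&= 5 \times 2^n - 3.
\end{align*}

Therefore, for $n \in \{0, 1, 2, \ldots\}$, $f(n) = 5 \times 2^n - 3$. □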

Is this the correct solution? The verification can be made by mathematical induction; see the problem section.
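Before attempting the induction proof, it can be reassuring to compare the guessed closed form with the recursive definition for small $n$. The sketch below does only that; it is a finite check, not a substitute for the proof, and the function names are chosen for this example.

```python
def f_recursive(n):
    """Evaluate f by unrolling the definition: f(0) = 2, f(n) = 3 + 2 f(n-1)."""
    value = 2
    for _ in range(n):
        value = 3 + 2 * value
    return value

def f_closed(n):
    """The closed form guessed by repeated substitution."""
    return 5 * 2 ** n - 3

print([f_recursive(n) for n in range(6)])
print(all(f_recursive(n) == f_closed(n) for n in range(50)))
```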

Comment: One should compare the different representations of recurrence equations in Example 8.6 and in the previous examples. The notations $a_n$ and $f(n)$ carry the same meaning.

8.2.2 Characteristic Root Method

Solving Nonhomogeneous, Constant-Coefficient, Linear Difference Equations

This method is suitable for solving nonhomogeneous, constant-coefficient, first- or second-order linear difference equations. Theoretically, we can use this method to solve difference equations of order higher than 2, but a higher order introduces more characteristic roots and makes the job more involved.

With this method the solution is obtained in several steps. First, we outline the major steps and then describe their implementation.

1. The first step is to remove the nonhomogeneous part from the given recurrence equation, thus obtaining the reduced homogeneous equation.

2. The second step is to find the solution of this homogeneous equation using the method described below. The solution so obtained contains some unknown coefficients that are determined later.

3. In the third step we obtain the solution associated with the nonhomogeneous part and combine it with the solution of the homogeneous part.

4. Finally, using the given initial conditions and, if necessary, the first few values of the sequence obtained from the given equation, we obtain the unknown constants.

Solution of a Homogeneous Recurrence Equation

For convenience of presentation we consider a homogeneous recurrence equation with constant coefficients of order 2. Suppose that we wish to find a solution of

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2}, \]

where $c_1$ and $c_2$ are two given constants. If $a_n = r^n$ is a solution of the given equation, then the following condition must be satisfied:

\[ r^n = c_1 r^{n-1} + c_2 r^{n-2}. \]

In other words, $r$ is a solution of the quadratic equation $r^2 - c_1 r - c_2 = 0$. This quadratic equation is known as the characteristic equation of the given homogeneous recurrence equation. Suppose that $r_1$ and $r_2$ denote the two roots of the characteristic equation (for convenience of presentation we do not consider the case when the two roots are complex). It can then be verified that both $a_n = A_1 r_1^n$ and $a_n = A_2 r_2^n$ are solutions of the given homogeneous recurrence equation, where $A_1$ and $A_2$ are two constants. More generally, $a_n = A_1 r_1^n + A_2 r_2^n$ is a solution of the given homogeneous recurrence equation.

This general solution has only one remaining detail that requires further investigation. What is the general solution of the given recurrence equation if $r_1 = r_2$? It would be unreasonable to suggest that the solution is given by $(A_1 + A_2) r_1^n$, because this is equivalent to a solution given by one root of the characteristic equation, whereas a quadratic equation has two roots. In the case of two equal roots the general solution of the given second-order recurrence equation is given by $(A_1 + A_2 n) r_1^n$.

In summary, to find the general solution of $a_n = c_1 a_{n-1} + c_2 a_{n-2}$:

Step 1: Obtain the associated characteristic equation $r^2 - c_1 r - c_2 = 0$.

Step 2: Find the two roots, $r_1$ and $r_2$, of this equation. If

1. the roots are different, then the general solution is $a_n = A_1 r_1^n + A_2 r_2^n$;

2. the roots are equal, then the general solution is $a_n = (A_1 + A_2 n) r_1^n$.

Step 3: The two unknown constants, $A_1$ and $A_2$, are determined by the given initial conditions.
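The three steps can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not in the text: the initial conditions are taken to be $a_0$ and $a_1$, only real roots are handled (as in the discussion above), floating-point arithmetic is used, and a double root equal to zero is not supported. The name `solve_homogeneous` is hypothetical.

```python
import math

def solve_homogeneous(c1, c2, a0, a1):
    """Return a function n -> a_n for a_n = c1*a_{n-1} + c2*a_{n-2} (real roots only)."""
    disc = c1 * c1 + 4 * c2                  # discriminant of r^2 - c1 r - c2 = 0
    if disc < 0:
        raise ValueError("complex roots are not handled in this sketch")
    if disc == 0:
        r = c1 / 2                           # double root; must be nonzero here
        A1 = a0                              # n = 0: (A1 + 0) r^0 = a0
        A2 = a1 / r - a0                     # n = 1: (A1 + A2) r = a1
        return lambda n: (A1 + A2 * n) * r ** n
    r1 = (c1 + math.sqrt(disc)) / 2
    r2 = (c1 - math.sqrt(disc)) / 2
    A2 = (a1 - a0 * r1) / (r2 - r1)          # from a0 = A1 + A2, a1 = A1 r1 + A2 r2
    A1 = a0 - A2
    return lambda n: A1 * r1 ** n + A2 * r2 ** n

distinct = solve_homogeneous(5, -6, 0, 1)    # roots 3 and 2
double = solve_homogeneous(4, -4, 1, 4)      # double root 2
print([round(distinct(n)) for n in range(6)])
print([round(double(n)) for n in range(6)])
```

Rounding is used only because the roots are computed in floating point; with exact arithmetic the two printed lists would be the sequences $3^n - 2^n$ and $(1 + n)2^n$.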

Solution of the Nonhomogeneous Part

In this subsection we consider the part of the solution of the given recurrence equation that is governed by the nonhomogeneous part. First we consider the simple situation where the nonhomogeneous part, $f(n)$, is given as

\[ f(n) = a^n p(n), \]

where $p(n)$ is a polynomial in $n$ of degree $e$. For example, $f(n) = 2^n (3n^2 + 4n + 5)$. Here $a = 2$ and $p(n) = 3n^2 + 4n + 5$. Note that $p(n)$ is a polynomial in $n$ of degree $e = 2$.

For such functions, the particular solution is governed by the characteristic equation $(r - a)^{e+1}$. Because this characteristic equation has $e + 1$ equal roots, its solution is $(A_1 n^e + A_2 n^{e-1} + \cdots + A_{e+1}) a^n$. However, it is possible that a root of the characteristic equation of the homogeneous part is equal to $a$. For this reason, the characteristic equation $(r - a)^{e+1}$ is combined with the characteristic equation of the homogeneous part of the given recurrence equation, and the new characteristic equation describes the general solution of the problem. We illustrate the complete procedure below.

We solve

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + a^n p(n), \]

where $p(n)$ is a polynomial in $n$ of degree $e$.

The homogeneous part generates the characteristic equation $(r^2 - c_1 r - c_2)$ and the nonhomogeneous part generates the characteristic equation $(r - a)^{e+1}$. The combined characteristic equation is

\[ (r^2 - c_1 r - c_2)(r - a)^{e+1} = (r - r_1)(r - r_2)(r - a)^{e+1}, \]



where $r_1$ and $r_2$ are the two roots of the quadratic equation $r^2 - c_1 r - c_2 = 0$. The solution of the given recurrence equation depends on the values of $r_1$ and $r_2$. All possible cases are considered below. For convenience of presentation we take $e = 2$.

Case 1. If $r_1, r_2, a$ are all distinct, then the general solution is

\[ a_n = A r_1^n + B r_2^n + (C n^2 + D n + E) a^n. \]

Case 2. If $r_1 = r_2$, $r_1 \ne a$, then the general solution is

\[ a_n = (A + Bn) r_1^n + (C n^2 + D n + E) a^n. \]

Case 3. If $r_1 \ne r_2$, $r_2 = a$, then the general solution is

\[ a_n = A r_1^n + (B n^3 + C n^2 + D n + E) a^n. \]

Similarly, if $r_1 \ne r_2$, $r_1 = a$, then the general solution is

\[ a_n = A r_2^n + (B n^3 + C n^2 + D n + E) a^n. \]

Case 4. If $r_1 = r_2 = a$, then the general solution is

\[ a_n = (A n^4 + B n^3 + C n^2 + D n + E) a^n. \]

Finally, we find the (five) unknown coefficients from the first five values of the sequence; two initial conditions are generally known, and three more values of the sequence can be generated from the given recurrence equation. The solution of a first-order nonhomogeneous recurrence equation can be obtained in exactly the same manner; in this case the homogeneous component generates a characteristic equation of degree 1.

In more general situations the nonhomogeneous component may contain more than one expression. For example, suppose that the nonhomogeneous component of the given recurrence equation is

\[ f_1(n) + f_2(n) = a_1^n p_1(n) + a_2^n p_2(n). \]

In this case we find two characteristic equations, one for each $f_i(n)$, and combine them with the characteristic equation of the homogeneous part. Depending on the number of common roots, a polynomial of appropriate degree in $n$ is generated. The following result is repeatedly applied to obtain the desired general solution from the combined characteristic equation.

If a root, $r^*$, of the characteristic equation has multiplicity $m$, then its contribution to the general solution of the given recurrence equation is a polynomial in $n$ of degree $m - 1$ multiplied by $(r^*)^n$, i.e., $(A_1 n^{m-1} + A_2 n^{m-2} + \cdots + A_{m-1} n + A_m)(r^*)^n$.



We illustrate the above solution procedure by means of some examples.

Example 8.7 Solve the difference equation

\[ a_n = a_{n-1} + n^2, \qquad a_0 = 0. \]

Solution. This is a first-order difference equation with nonhomogeneous component $f(n) = n^2$, which can be viewed as $n^2 \cdot 1^n$. Thus $p(n) = n^2$ is a polynomial in $n$ of degree 2 and $a = 1$. The homogeneous part of the recurrence equation is $a_n - a_{n-1} = 0$, whose characteristic equation is $(r - 1)$, and the nonhomogeneous component generates the characteristic equation $(r - 1)^3$. Thus the combined characteristic equation is

\[ (r - 1)(r - 1)^3 = (r - 1)^4. \]

This characteristic equation has one root, 1, with multiplicity 4. Hence the general solution of the given recurrence equation is

\[ a_n = (A n^3 + B n^2 + C n + D) 1^n = A n^3 + B n^2 + C n + D. \]

The first four values of the sequence are $a_0 = 0$, $a_1 = 1$, $a_2 = 5$, $a_3 = 14$; one of them is the given initial condition and the other three are obtained from the given recurrence equation.

Using these four values we find $A$, $B$, $C$, and $D$. We solve the system of four linear equations

\begin{align*}
D &= 0 \\
A + B + C + D &= 1 \\
8A + 4B + 2C + D &= 5 \\
27A + 9B + 3C + D &= 14
\end{align*}

whose solution is

\[ A = \tfrac{1}{3}, \quad B = \tfrac{1}{2}, \quad C = \tfrac{1}{6}, \quad \text{and} \quad D = 0. \]

Therefore, the solution of the given recurrence equation is

\[ a_n = \tfrac{1}{3} n^3 + \tfrac{1}{2} n^2 + \tfrac{1}{6} n. \]

□
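The linear system in this example is small enough to solve by hand, but the same step can be automated. The sketch below uses a plain Gauss-Jordan elimination over exact fractions; `solve_linear` is a helper written for this illustration and assumes the system has a unique solution.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination over exact fractions; assumes a unique solution."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]    # scale pivot row to 1
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# First four values generated from a_n = a_{n-1} + n^2 with a_0 = 0.
values = [0]
for n in range(1, 4):
    values.append(values[-1] + n * n)

# Row n encodes A n^3 + B n^2 + C n + D = a_n.
matrix = [[n ** 3, n ** 2, n, 1] for n in range(4)]
A, B, C, D = solve_linear(matrix, values)
print(A, B, C, D)
```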

Example 8.8 Solve

\[ a_n = 3a_{n-1} - 2a_{n-2} + 2^n + n^2, \qquad a_0 = 1,\ a_1 = 2. \]

Solution. This is a second-order, nonhomogeneous linear difference equation with $f(n) = 2^n + n^2$. It can be viewed as the sum of two functions

\[ f_1(n) = 2^n p_1(n) \quad\text{and}\quad f_2(n) = 1^n p_2(n), \]

where $p_1(n) = 1$ is a polynomial of degree 0 and $p_2(n) = n^2$ is a polynomial of degree 2.

The homogeneous part of the equation has the characteristic equation

\[ (r^2 - 3r + 2) = (r - 1)(r - 2) \]

and the nonhomogeneous parts have the characteristic equations $(r - 2)$ and $(r - 1)^3$, respectively. Hence, the solution of the given recurrence equation is obtained from the characteristic equation

\[ (r - 2)(r - 1)(r - 2)(r - 1)^3. \]

This equation has two distinct roots, $r = 2$ and $r = 1$, of multiplicity 2 and 4, respectively. Hence, the general solution of the given recurrence equation is

\[ a_n = (An + B) 2^n + (C n^3 + D n^2 + E n + F). \]

$A, B, C, D, E$, and $F$ are obtained from the two given initial conditions, $a_0 = 1$, $a_1 = 2$, together with $a_2, a_3, a_4, a_5$, which are obtained from the given recurrence equation by substituting $n = 2, 3, 4, 5$, respectively.

We solve the system of six linear equations obtained by substituting $n = 0, 1, \ldots, 5$ into the general solution and obtain the final solution:

\[ a_n = (2n + 8) 2^n - \left( \tfrac{1}{3} n^3 + \tfrac{5}{2} n^2 + \tfrac{49}{6} n + 7 \right). \]
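Since the six-equation system is not written out, a quick way to gain confidence in the final formula is to compare it with values produced directly by the recurrence. The sketch below does this with exact fractions; the function names are chosen for the illustration.

```python
from fractions import Fraction

def a_closed(n):
    """Closed form stated at the end of Example 8.8."""
    poly = Fraction(1, 3) * n ** 3 + Fraction(5, 2) * n ** 2 + Fraction(49, 6) * n + 7
    return (2 * n + 8) * 2 ** n - poly

def a_iterated(count):
    """First `count` terms from a_n = 3a_{n-1} - 2a_{n-2} + 2^n + n^2, a_0 = 1, a_1 = 2."""
    seq = [1, 2]
    for n in range(2, count):
        seq.append(3 * seq[-1] - 2 * seq[-2] + 2 ** n + n * n)
    return seq[:count]

terms = a_iterated(15)
print(terms[:6])
print(all(a_closed(n) == terms[n] for n in range(15)))
```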

8.2.3 Generating Function Method

In this section we consider basic properties of generating functions and describe how to use them to find the solution of a given difference equation.

Definition 8.6: Given a sequence of numbers $a_0, a_1, \ldots, a_n, \ldots$, the associated generating function is

\[ A(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n + \cdots, \]

where $z$ is a variable.

Let $A(z) = \sum_{n \ge 0} a_n z^n$, $B(z) = \sum_{n \ge 0} b_n z^n$, and $C(z) = \sum_{n \ge 0} c_n z^n$. We have the following properties:

1. $A(z) + B(z) = C(z)$ if and only if $\forall n \ge 0\ [a_n + b_n = c_n]$.

2. $A(z) \times B(z) = C(z)$ if and only if $\forall n \ge 0\ [c_n = \sum_{i=0}^{n} a_i \cdot b_{n-i}]$.
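The two properties say that generating functions are added and multiplied coefficient by coefficient, the product being a convolution. A minimal sketch on truncated coefficient lists (the representation and the function names are assumptions made for this example):

```python
def add_series(a, b):
    """Coefficients of A(z) + B(z), truncated to the shorter input."""
    return [x + y for x, y in zip(a, b)]

def multiply_series(a, b):
    """Coefficients of A(z) * B(z): c_n = sum of a_i * b_(n-i) for i = 0..n."""
    terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(terms)]

ones = [1] * 6                       # 1/(1 - z) = 1 + z + z^2 + ...
print(add_series(ones, ones))        # [2, 2, 2, 2, 2, 2]
print(multiply_series(ones, ones))   # [1, 2, 3, 4, 5, 6]
```

The second line gives the coefficients $i + 1$, the first few coefficients of $1/(1 - z)^2$.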



Some Useful Power Series and Their Closed Forms

In order to use generating functions effectively, it is important to know some important sums. The following is a list of some useful power series and their closed forms. One may refer to calculus textbooks for their proofs.

Let $s$ be any real number and $n$ be any natural number.

\begin{align*}
\frac{1}{1 - sz} &= \sum_{0 \le i} s^i z^i \\
\frac{1}{(1 - sz)^2} &= \sum_{0 \le i} (i + 1) s^i z^i \\
(1 + sz)^n &= \sum_{0 \le i \le n} \binom{n}{i} s^i z^i \\
\frac{1}{(1 + sz)^n} &= \sum_{0 \le i} \binom{-n}{i} s^i z^i \\
\ln(1 + sz) &= \sum_{1 \le i} \frac{(-1)^{i+1}}{i} s^i z^i \\
-\ln(1 - sz) &= \sum_{1 \le i} \frac{1}{i} s^i z^i \\
e^{sz} &= \sum_{0 \le i} \frac{1}{i!} s^i z^i
\end{align*}

The connection between the closed forms of power series, generating functions, and recurrence relations can be seen in the examples considered below.

Using Generating Functions to Solve Recurrence Equations

In brief, the key steps and ideas behind the use of a generating function to solve a recurrence equation are:

Step 1: Write down the generating function of the given sequence.

Step 2: Use the given relation between $a_n$ and $a_{n-1}$, etc., to remove the recurrence from the generating function and simplify it.

Step 3: Expand the generating function so obtained in powers of $z$.

Step 4: Equate the coefficients of $z^n$ in the two different expressions for the generating function of the same sequence, and thus obtain the solution of the given recurrence equation.



The following example explains the above steps.

Example 8.9 Consider the recurrence:

\[ a_0 = 1, \text{ and for } n \ge 1,\ a_n = 2a_{n-1} + 1. \]

In the first step, all we need to do is write the generating function:

\[ A(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n + \cdots. \]

Next, by looking at the given recurrence relation we realize that for all $n \ge 1$, occurrences of $a_n - 2a_{n-1}$ can be replaced by 1. To utilize this property we multiply $A(z)$ by $2z$ and subtract the result from $A(z)$, i.e., perform the following manipulations.

\begin{align*}
A(z) &= a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \cdots + a_n z^n + \cdots \\
2zA(z) &= \phantom{a_0 +{}} 2a_0 z + 2a_1 z^2 + 2a_2 z^3 + \cdots + 2a_{n-1} z^n + \cdots \\
(1 - 2z)A(z) &= a_0 + 1z + 1z^2 + 1z^3 + \cdots + 1z^n + \cdots
\end{align*}

We have

\[ (1 - 2z)A(z) = 1 + z + z^2 + z^3 + \cdots = \frac{1}{1 - z} \]

because $a_0 = 1$. Consequently,

\[ A(z) = \frac{1}{(1 - 2z)(1 - z)} = \frac{2}{1 - 2z} + \frac{-1}{1 - z}. \]

We know that

\[ \frac{2}{1 - 2z} = \sum_{0 \le i} 2 \cdot 2^i z^i \quad\text{and}\quad \frac{-1}{1 - z} = \sum_{0 \le i} (-1) \cdot z^i. \]

Thus,

\[ A(z) = \sum_{0 \le i} 2 \cdot 2^i z^i + \sum_{0 \le i} (-1) \cdot z^i = \sum_{0 \le i} (2^{i+1} - 1) z^i. \]

Therefore, $a_n = 2^{n+1} - 1$. □
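The example can be replayed numerically: the coefficients of $1/((1 - 2z)(1 - z))$ are obtained by convolving the coefficients of $1/(1 - 2z)$ and $1/(1 - z)$, and they should match both the recurrence and the closed form. The sketch below checks the first ten terms; the truncation length is an arbitrary choice.

```python
N = 10

# Terms generated directly from a_0 = 1, a_n = 2 a_{n-1} + 1.
recurrence = [1]
for n in range(1, N):
    recurrence.append(2 * recurrence[-1] + 1)

pow2 = [2 ** i for i in range(N)]   # coefficients of 1/(1 - 2z)
ones = [1] * N                      # coefficients of 1/(1 - z)
# Coefficients of the product A(z) = 1/((1 - 2z)(1 - z)) by convolution.
product = [sum(pow2[i] * ones[n - i] for i in range(n + 1)) for n in range(N)]

closed = [2 ** (n + 1) - 1 for n in range(N)]
print(recurrence == product == closed)  # True
```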

8.2.4 An Example

In this subsection we solve a recurrence equation by all three methods discussed above. Consider the following recurrence equation:

\[ a_0 = 1, \text{ and for all } n \ge 1,\ 2a_n = a_{n-1} + 2^n. \]



Repeated Substitution Method

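The recurrence equation can be rewritten as $a_n = \frac{1}{2}a_{n-1} + 2^{n-1}$ for all $n \ge 1$. Substituting $n-1$ in place of $n$ gives $a_{n-1} = \frac{1}{2}a_{n-2} + 2^{n-2}$. Likewise, we can replace $n$ by $n-2$. In the following development we make these substitutions and simplify:

\[
\begin{aligned}
a_n &= \tfrac{1}{2}a_{n-1} + 2^{n-1} \\
&= \tfrac{1}{2}\left(\tfrac{1}{2}a_{n-2} + 2^{n-2}\right) + 2^{n-1} && \text{(substituting } a_{n-1} = \tfrac{1}{2}a_{n-2} + 2^{n-2}\text{)} \\
&= \tfrac{1}{2^2}a_{n-2} + 2^{n-3} + 2^{n-1} \\
&= \tfrac{1}{2^2}\left(\tfrac{1}{2}a_{n-3} + 2^{n-3}\right) + (2^{n-3} + 2^{n-1}) && \text{(substituting } a_{n-2} = \tfrac{1}{2}a_{n-3} + 2^{n-3}\text{)} \\
&= \tfrac{1}{2^3}a_{n-3} + 2^{n-5} + 2^{n-3} + 2^{n-1} \\
&\;\;\vdots \\
&= \tfrac{1}{2^k}a_{n-k} + 2^{n-2k+1} + \cdots + 2^{n-3} + 2^{n-1} && \text{(guessing the answer)} \\
&\;\;\vdots \\
&= \tfrac{1}{2^n}a_0 + 2^{n-2n+1} + \cdots + 2^{n-3} + 2^{n-1} && \text{(let } k = n\text{)} \\
&= \tfrac{1}{2^n} + 2^{-n+1} + \cdots + 2^{n-3} + 2^{n-1} && (a_0 = 1) \\
&= \tfrac{1}{2^n} + \frac{2^{-n+1} - 2^{n-1} \cdot 2^2}{1 - 2^2} && \text{(sum of the geometric progression)} \\
&= \tfrac{1}{2^n} + \tfrac{1}{3}\left(2^{n+1} - 2^{-n+1}\right) \\
&= \tfrac{1}{2^n} + \tfrac{1}{3}\left(2 \cdot 2^n - 2 \cdot \tfrac{1}{2^n}\right) \\
&= \tfrac{1}{3} \cdot \tfrac{1}{2^n} + \tfrac{2}{3} \cdot 2^n.
\end{aligned}
\]

Thus we have shown that the solution of the recurrence equation is $a_n = \frac{1}{3} \cdot \frac{1}{2^n} + \frac{2}{3} \cdot 2^n$.

□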

Characteristic Root Method

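Since $a_n = \frac{1}{2}a_{n-1} + 2^{n-1}$, the characteristic polynomial is $(r - \frac{1}{2})(r - 2)$, where $(r - \frac{1}{2})$ is contributed by the homogeneous part $a_n - \frac{1}{2}a_{n-1} = 0$ and $(r - 2)$ is contributed by the nonhomogeneous part $2^n$. Therefore, the general solution is

\[ a_n = A\left(\tfrac{1}{2}\right)^n + B\,2^n. \]

We know $a_0 = 1$ and $a_1 = \frac{1}{2}a_0 + 2^0 = \frac{3}{2}$. To find the values of $A$ and $B$ we substitute $n = 0$ and $n = 1$ in the above general solution and solve the following equations:

\[
\begin{aligned}
1 &= a_0 = A\left(\tfrac{1}{2}\right)^0 + B\,2^0 = A + B, \\
\tfrac{3}{2} &= a_1 = A\left(\tfrac{1}{2}\right)^1 + B\,2^1 = \tfrac{1}{2}A + 2B.
\end{aligned}
\]

We have $A = \frac{1}{3}$ and $B = \frac{2}{3}$. Therefore, the solution is

\[ a_n = \frac{1}{3} \cdot \left(\frac{1}{2}\right)^n + \frac{2}{3} \cdot 2^n. \]

□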

Generating Function Method

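Consider the following generating function $G(z)$:

\[ G(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n + \cdots \tag{8.2} \]

Multiplying (8.2) by $\frac{1}{2}z$, we get

\[ \tfrac{1}{2}zG(z) = \tfrac{1}{2}a_0 z + \tfrac{1}{2}a_1 z^2 + \tfrac{1}{2}a_2 z^3 + \cdots + \tfrac{1}{2}a_n z^{n+1} + \cdots \tag{8.3} \]

Taking (8.2) − (8.3), and using $a_n - \frac{1}{2}a_{n-1} = 2^{n-1}$, we get

\[
\begin{aligned}
\left(1 - \tfrac{1}{2}z\right)G(z) &= a_0 + \left(a_1 - \tfrac{1}{2}a_0\right)z + \left(a_2 - \tfrac{1}{2}a_1\right)z^2 + \cdots + \left(a_n - \tfrac{1}{2}a_{n-1}\right)z^n + \cdots \\
&= 1 + 2^{1-1}z + 2^{2-1}z^2 + \cdots + 2^{n-1}z^n + \cdots \\
&= 1 + \tfrac{1}{2}\left[2z + (2z)^2 + \cdots + (2z)^n + \cdots\right] \\
&= 1 + \frac{1}{2} \cdot \frac{2z}{1 - 2z} && \text{(sum of the geometric progression)} \\
&= 1 + \frac{z}{1 - 2z}.
\end{aligned}
\]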

Therefore,

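\[
\begin{aligned}
G(z) &= \frac{1}{1 - \frac{1}{2}z} + \frac{z}{\left(1 - \frac{1}{2}z\right)(1 - 2z)} \\
&= \frac{1}{1 - \frac{1}{2}z} + \frac{2z}{(2 - z)(1 - 2z)}.
\end{aligned}
\tag{8.4}
\]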

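Now we want to decompose the second term on the right-hand side of (8.4) into the following form:

\[
\begin{aligned}
\frac{2z}{(2 - z)(1 - 2z)} &= \frac{A}{2 - z} + \frac{B}{1 - 2z} \\
&= \frac{A - 2zA + 2B - Bz}{(2 - z)(1 - 2z)} \\
&= \frac{(A + 2B) + (-2A - B)z}{(2 - z)(1 - 2z)}.
\end{aligned}
\]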



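Comparing numerators, we have $A + 2B = 0$ and $2A + B = -2$. Solving these equations gives $A = -\frac{4}{3}$ and $B = \frac{2}{3}$. Therefore, the generating function $G(z)$ is

\[
\begin{aligned}
G(z) &= \frac{1}{1 - \frac{1}{2}z} + \frac{-\frac{4}{3}}{2 - z} + \frac{\frac{2}{3}}{1 - 2z} \\
&= \frac{1}{1 - \frac{1}{2}z} - \frac{2}{3} \cdot \frac{1}{1 - \frac{1}{2}z} + \frac{2}{3} \cdot \frac{1}{1 - 2z} \\
&= \frac{1}{3} \cdot \frac{1}{1 - \frac{1}{2}z} + \frac{2}{3} \cdot \frac{1}{1 - 2z}.
\end{aligned}
\]

Therefore, the solution is

\[ a_n = \frac{1}{3} \cdot \left(\frac{1}{2}\right)^n + \frac{2}{3} \cdot 2^n. \]

□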
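All three methods arrive at the same closed form, which is easy to confirm in exact arithmetic. The sketch below is an illustration only (the helper names are ours); it uses Python's `fractions.Fraction` so that the powers of $\frac{1}{2}$ are not rounded.

```python
from fractions import Fraction


def a_by_recurrence(n):
    """a_0 = 1 and 2*a_k = a_{k-1} + 2^k for k >= 1, in exact arithmetic."""
    value = Fraction(1)
    for k in range(1, n + 1):
        value = (value + 2 ** k) / 2
    return value


def a_by_formula(n):
    """Closed form (1/3)(1/2)^n + (2/3) 2^n derived in Section 8.2.4."""
    return Fraction(1, 3) * Fraction(1, 2) ** n + Fraction(2, 3) * 2 ** n


if __name__ == "__main__":
    for n in range(6):
        print(n, a_by_recurrence(n), a_by_formula(n))
```

Agreement on a finite range does not replace the derivations above, but it catches arithmetic slips such as a wrong partial-fraction coefficient.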



Notation: $\mathbf{N}_0 = \{0, 1, 2, 3, \ldots\}$, $\mathbf{N} = \{1, 2, 3, \ldots\}$, and $\mathbf{R}$ is the set of real numbers.

8.3 Problems

In Problems 1 through 6, find a recurrence relation in terms of previous values. Also give the boundary conditions.

Problem 1: Find a recurrence relation for the sum of the first $n$ positive odd integers in terms of the sum of the first $(n-1)$ positive odd integers.

Problem 2: Find a recurrence relation for the maximum number of pieces of a pizza made by $n$ straight cuts.

Problem 3: Find a recurrence relation for the number of ways to put $n$ cents in a machine using identical pennies, nickels, dimes, and quarters.

Problem 4: Recall the recurrence relation for $r$-combinations from $n$ objects:

\[ \binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}. \]

Find a similar recurrence relation for $r$-permutations from $n$ objects, i.e., for $P(n, r)$.

Problem 5: Find a recurrence relation for the number of $n$-digit binary sequences with no consecutive 1's.

Problem 6: Find a recurrence relation for the maximum number of nodes in a binary tree of depth $d$.

Problem 7: Find a recurrence relation for the number of ways to fully parenthesize an expression of $n$ variables:

\[ x_1 + x_2 + x_3 + \cdots + x_n. \]

For example, $((x_1 + x_2) + x_3)$ and $(x_1 + (x_2 + x_3))$ are the only two ways to fully parenthesize $x_1 + x_2 + x_3$. Parenthesization is an important subject in compilers.

Hint: Use the following as a starting point. For $i = 1, \ldots, n-1$,

\[ x_1 + \cdots + x_n = ((x_1 + \cdots + x_i) + (x_{i+1} + \cdots + x_n)). \]

Use the Repeated Substitution Method to guess the solutions of the recurrence relations in Problems 8 through 11, and verify the correctness of your guesses by mathematical induction.

Problem 8: Solve the following recurrence relation: For $n \in \mathbf{N}_0$,

\[ f(n) = \begin{cases} 2 & \text{if } n = 0; \\ 3 + f(n-1) & \text{if } n \ge 1. \end{cases} \]



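Problem 9: Solve the following recurrence relation: For $n \in \mathbf{N}_0$,

\[ f(n) = \begin{cases} 2 & \text{if } n = 0; \\ 3f(n-1) + 2 & \text{if } n \ge 1. \end{cases} \]

Problem 10: Solve the following recurrence relation: For $n \in \mathbf{N}$,

\[ f(n) = \begin{cases} 1 & \text{if } n = 1; \\ 3f\left(\left\lceil \frac{n}{3} \right\rceil\right) & \text{if } n \ge 2. \end{cases} \]

Problem 11: Solve the following recurrence relation: For $n \in \mathbf{N}$,

\[ f(n) = \begin{cases} c & \text{if } n = 1; \\ a f\left(\left\lceil \frac{n}{b} \right\rceil\right) + cn & \text{if } n \ge 2, \end{cases} \]

where $b \in \mathbf{N}$, $a, c \in \mathbf{R}$, and $b > 1$.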

Problem 12: Use the characteristic root method to solve the following linear homogeneous recurrence relation.

Let $f(0) = 1$, $f(1) = -1$, and for $n \in \mathbf{N}_0$,

\[ f(n+2) + 2f(n+1) - 3f(n) = 0. \]

Problem 13: Let $f_n$ be the Fibonacci numbers, i.e., $f_0 = f_1 = 1$ and $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$. Define $a_n$ as

\[ a_n = \frac{f_n}{f_{n-1}}, \quad\text{for } n \in \mathbf{N}. \]

Give a recurrence relation to compute $a_n$ and solve the relation.

Hint: Assume that $a_n$ converges to $r$ as $n \to \infty$.

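Problem 14: Solve the following recurrence relation:

\[ a_0 = 0, \quad a_1 = -1, \quad a_n - 7a_{n-1} + 12a_{n-2} = 0 \ \text{ for } n \ge 2. \]

Problem 15: Solve the following recurrence relation:

\[ a_1 = 1, \quad a_2 = 1, \quad a_n + 2a_{n-1} - 15a_{n-2} = 0 \ \text{ for } n \ge 3. \]

Problem 16: Solve the following recurrence relation:

\[ a_0 = 2, \quad a_1 = 0, \quad -2a_n + 18a_{n-2} = 0 \ \text{ for } n \ge 2. \]

Problem 17: Solve the following recurrence relation:

\[ a_1 = 2, \quad a_2 = 6, \quad a_n - 4a_{n-1} + 4a_{n-2} = 0 \ \text{ for } n \ge 3. \]

Problem 18: Solve the following recurrence relation:

\[ a_1 = 5, \quad a_2 = -5, \quad a_n + 6a_{n-1} + 9a_{n-2} = 0 \ \text{ for } n \ge 3. \]

Problem 19: Solve the following recurrence relation:

\[ a_0 = 1, \quad a_1 = 2, \quad a_n - 5a_{n-1} + 6a_{n-2} = 2n + 1 \ \text{ for } n \ge 2. \]

Problem 20: Solve the following recurrence relation:

\[ a_0 = 1, \quad a_1 = -1, \quad a_n - 3a_{n-1} + 2a_{n-2} = n \ \text{ for } n \ge 2. \]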

Problem 21: Given $a_0 = 0$, $a_1 = 1$, $a_2 = 4$, $a_3 = 13$, and

\[ a_n + b\,a_{n-1} + c\,a_{n-2} = 0 \ \text{ for } n \ge 2, \]

where $b$ and $c$ are two unknown numbers, find $a_n$ in closed form.

Problem 22: Given $a_0 = 0$, $a_1 = 1$, and

\[ 3a_n - 10a_{n-1} + 3a_{n-2} = 3^n \ \text{ for } n \ge 2, \]

solve the equation.

Problem 23: Given $a_0 = 0$, $a_1 = 1$, and

\[ 5a_n - 6a_{n-1} + a_{n-2} = n^2\left(\tfrac{1}{5}\right)^n \ \text{ for } n \ge 2, \]

solve the equation.



8.4 Solutions

Solution 1: If $n = 1$, the sum of the first $n$ positive odd integers is simply 1. If $n \ge 2$, the sum of the first $n$ positive odd integers is the $n$th positive odd integer plus the sum of the first $n-1$ positive odd integers. Therefore, we can define the sum of the first $n$ positive odd integers, $f(n)$, by the following difference equation. For $n \in \mathbf{N}$,

\[ f(n) = \begin{cases} 1 & \text{if } n = 1; \\ f(n-1) + (2n - 1) & \text{if } n \ge 2. \end{cases} \]

Note: the $n$th positive odd integer is $2n - 1$, not $2n + 1$.

Alternatively, you can let $n$ range over $\mathbf{N}_0$ and take $f(0) = 0$ as the initial point of the difference equation. □

Solution 2: Let $f(n)$ be the maximum number of pieces into which a pizza is cut by $n$ straight cuts.

It is clear that if $n = 0$, we have the whole pizza. Thus, $f(0) = 1$.

To maximize the number of pieces obtained by $n$ straight cuts, we must not use parallel cuts and we must not have any three cuts that intersect at the same point. In other words, if we have a pizza with $n-1$ cuts that yield the maximum number of pieces, then the $n$th cut must cross all previous $n-1$ cuts at $n-1$ new, distinct points. That means the $n$th cut must run through $n$ of the $f(n-1)$ pieces produced by the $n-1$ previous cuts. The $n$th cut splits each of these $n$ pieces into two. Therefore, the $n$th cut introduces $n$ more pieces. For example, in the following figure the 4th cut has split each of the pieces $p_1$, $p_2$, $p_3$, and $p_4$ into two parts.

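[Figure: a pizza with four straight cuts labeled 1 to 4; the 4th cut passes through the pieces $p_1$, $p_2$, $p_3$, and $p_4$.]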

We can use the following difference equation to represent these numbers. For $n \in \mathbf{N}_0$,

\[ f(n) = \begin{cases} 1 & \text{if } n = 0; \\ f(n-1) + n & \text{if } n \ge 1. \end{cases} \]

□
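The recurrence of Solution 2 is easy to iterate. As a sketch only (the function names are ours), the code below evaluates it and compares the result with the closed form $1 + n(n+1)/2$; that closed form is not stated in the text, but it follows directly by summing $1 + 1 + 2 + \cdots + n$.

```python
def max_pieces(n):
    """f(0) = 1 and f(k) = f(k-1) + k: the k-th cut adds k new pieces."""
    pieces = 1
    for cut in range(1, n + 1):
        pieces += cut
    return pieces


def max_pieces_closed(n):
    """Closed form 1 + n(n+1)/2, obtained by summing the increments."""
    return 1 + n * (n + 1) // 2


if __name__ == "__main__":
    print([max_pieces(n) for n in range(6)])
```

For four cuts both functions give 11 pieces, consistent with the figure in which the 4th cut adds four pieces to the seven already present.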

Solution 3: Let $n \in \mathbf{N}_0$, and let $f(n)$ be the number of ways to put $n$ cents into a machine using identical pennies, nickels, dimes, or quarters.

For $n = 0$, the only way is to do nothing, so $f(0) = 1$. For $1 \le n \le 4$, we have to put in all pennies. Thus, $f(n) = 1$ for $1 \le n \le 4$.

When $n = 5$, consider the first coin we put into the machine. We can put in a penny or a nickel. If we put in a penny, then we have $f(5-1)$ ways to put in the remaining 4 cents. If we put in a nickel, then we have $f(5-5)$ ways to put in the remaining 0 cents. Therefore, we have $f(5) = f(5-1) + f(5-5)$ ways to put in 5 cents, i.e., $f(5) = 1 + 1 = 2$. The two ways are

1[1111] and 5[ ],

where [1111] is the way to put in 4 cents, and [ ] is the way to put in 0 cents. For $6 \le n \le 9$, the situation is similar to the case $n = 5$: we have two choices for the first coin (a penny or a nickel). Thus, we have

\[ f(n) = f(n-1) + f(n-5) \quad\text{for } 5 \le n \le 9. \]

For example, if $n = 6$, we have the following ways:

1[11111], 1[5], and 5[1],

where [11111] and [5] are the ways to put in $6 - 1$ cents, and [1] is the way to put in $6 - 5$ cents. The idea is the same for larger values of $n$. For $10 \le n < 25$, we have three choices for the first coin (a dime is another option). Likewise, when $n \ge 25$, we can use any kind of coin as the first coin. Thus, the following difference equation covers all four cases. For $n \in \mathbf{N}_0$,

\[
f(n) = \begin{cases}
1 & \text{if } 0 \le n \le 4; \\
f(n-1) + f(n-5) & \text{if } 5 \le n \le 9; \\
f(n-1) + f(n-5) + f(n-10) & \text{if } 10 \le n \le 24; \\
f(n-1) + f(n-5) + f(n-10) + f(n-25) & \text{if } 25 \le n.
\end{cases}
\]

□
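The four cases of Solution 3 collapse into one rule: sum $f(n - c)$ over every coin value $c \le n$. The memoized Python sketch below illustrates this (the names `COINS` and `ways` are ours). Note that, as in the solution, the order in which coins are inserted matters, so 1[5] and 5[1] count as different ways.

```python
from functools import lru_cache

COINS = (1, 5, 10, 25)  # penny, nickel, dime, quarter


@lru_cache(maxsize=None)
def ways(n):
    """Number of ordered ways to insert n cents, one coin at a time."""
    if n == 0:
        return 1  # the only way is to insert nothing
    # Condition on the first coin inserted; only coins worth at most n qualify.
    return sum(ways(n - coin) for coin in COINS if coin <= n)


if __name__ == "__main__":
    print([ways(n) for n in range(11)])
```

For $1 \le n \le 4$ only the penny qualifies, which reproduces $f(n) = 1$; for $5 \le n \le 9$ the penny and nickel qualify, and so on.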

Solution 4: Let $n, r \in \mathbf{N}_0$ and let $P(n, r)$ denote the number of $r$-permutations from $n$ distinct objects.



It is clear that $P(n, 0) = 1$ and $P(n, r) = 0$ for $n < r$. Suppose $r \ne 0$ and $n \ge r$. Consider the selection procedure with respect to the $n$th object. We have the following two choices.

1. The $n$th object is not selected. In this case, we permute $r$ objects out of the remaining $n-1$ objects in $P(n-1, r)$ ways.

2. The $n$th object is selected. In this case, we first find the $(r-1)$-permutations from the remaining $n-1$ objects in $P(n-1, r-1)$ ways. For each $(r-1)$-permutation, we have to insert the $n$th object into it, and there are $r$ ways to do so. In other words, we obtain $r$ different $r$-permutations from each $(r-1)$-permutation. Therefore, we have $rP(n-1, r-1)$ $r$-permutations.

By the rule of sum, we have

\[ P(n, r) = P(n-1, r) + rP(n-1, r-1). \]

□
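As a quick check of Solution 4, the recurrence can be compared with the standard count $n!/(n-r)!$, which Python exposes as `math.perm` (available from Python 3.8). The function name `P` below simply mirrors the notation of the solution; this is a sketch, not part of the original text.

```python
from functools import lru_cache
from math import perm


@lru_cache(maxsize=None)
def P(n, r):
    """r-permutations of n distinct objects via the recurrence of Solution 4."""
    if r == 0:
        return 1
    if n < r:
        return 0
    # Either the n-th object is left out, or it is inserted into one of the
    # r positions of an (r-1)-permutation of the other n-1 objects.
    return P(n - 1, r) + r * P(n - 1, r - 1)


if __name__ == "__main__":
    print(P(5, 3), perm(5, 3))
```

The two boundary conditions are exactly the ones stated in the solution, and `math.perm(n, r)` also returns 0 when $r > n$.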

Solution 5: Let $n \in \mathbf{N}$, and let $f(n)$ be the number of $n$-digit binary sequences with no consecutive 1's. The sequences for the first few values of $n$ are shown below (a space separates a shorter valid string from the suffix 0 or 01 attached to it):

n = 1    n = 2    n = 3    n = 4
0        0 0      00 0     000 0
1        1 0      10 0     100 0
         01       01 0     010 0
                  0 01     001 0
                  1 01     101 0
                           00 01
                           10 01
                           01 01

From the table, $f(1) = 2$ and $f(2) = 3$. For $n = 3$, we obtain the valid strings as follows: (i) we take all valid strings of length 2 and attach 0 at the end of each; (ii) we take all valid strings of length 1 and attach 01 at the end of each. Likewise, for $n = 4$, the valid strings are all valid strings of length 3 with 0 attached and all valid strings of length 2 with 01 attached. In general, we have the following two procedures to generate valid strings of length $n \ge 3$ from shorter valid strings.

i. Take the valid strings of length $n-1$ and attach 0 at the end of each string. We obtain $f(n-1)$ new strings in this case.

ii. Take the valid strings of length $n-2$ and attach 01 at the end of each string. We obtain $f(n-2)$ new strings in this case.



It is clear that both procedures above generate valid strings of length $n$. But we have to argue that the strings generated by the two procedures cover all valid strings of length $n$. Let $s$ be a valid string of length $n \ge 3$, and let $s'$ and $s''$ denote two substrings of $s$, where $s'$ consists of the first $n-1$ digits of $s$ and $s''$ consists of the first $n-2$ digits of $s$. There are two cases:

1. The $n$th digit of $s$ is 0. In this case, $s'$ can be any valid string of length $n-1$. Thus, the procedure described in i. generates $s$.

2. The $n$th digit of $s$ is 1. In this case the $(n-1)$th digit of $s$ must be 0 and $s''$ can be any valid string of length $n-2$. Thus, the procedure described in ii. generates $s$.

Finally, by the rule of sum, we have

\[ f(n) = \begin{cases} 2 & \text{if } n = 1; \\ 3 & \text{if } n = 2; \\ f(n-1) + f(n-2) & \text{if } n \ge 3. \end{cases} \]

□
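The counting argument of Solution 5 can be cross-checked by brute force for small $n$. The sketch below (function names are ours) enumerates all $2^n$ binary strings, keeps those without the substring 11, and compares the count with the recurrence.

```python
from itertools import product


def count_by_recurrence(n):
    """f(1) = 2, f(2) = 3, f(n) = f(n-1) + f(n-2) for n >= 3."""
    if n == 1:
        return 2
    if n == 2:
        return 3
    older, newer = 2, 3
    for _ in range(3, n + 1):
        older, newer = newer, older + newer
    return newer


def count_by_enumeration(n):
    """Count n-digit binary strings that do not contain two adjacent 1's."""
    return sum(
        1 for bits in product("01", repeat=n) if "11" not in "".join(bits)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_by_recurrence(n), count_by_enumeration(n))
```

The enumeration is exponential in $n$ and is meant only as a check on small cases.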

Solution 6: To maximize the number of nodes in a binary tree, we have to maximize the number of nodes at each level.¹ Let us first find a recurrence relation, $l(d)$, and its solution for the maximum number of nodes at level $d$ in a binary tree. It is clear that $l(0) = 1$ and $l(d) = 2 \times l(d-1)$ for $d \ge 1$, because the root is the only node at level 0, and each node at level $d-1$ must have two children in order to maximize the number of nodes at the next level. Therefore,

\[ l(d) = 2\,l(d-1) = 2^2\,l(d-2) = \cdots = 2^d. \]

Let $n(d)$ be the maximum number of nodes in a binary tree of depth $d$. We note that $n(0) = 1$, and for $d \ge 1$, $n(d)$ is the maximum number of nodes in a binary tree of depth $d-1$ plus the maximum number of nodes at level $d$. Thus, we obtain the difference equation:

\[ n(d) = \begin{cases} 1 & \text{if } d = 0; \\ n(d-1) + 2^d & \text{if } d \ge 1. \end{cases} \]

□

¹ We define the root to be at level 0, and any child of a node at level $k$ to be at level $k+1$. Thus, the leaves of a binary tree of depth $d$ are at levels $\le d$.



Solution 7: Let $f(n)$ be the number of ways to fully parenthesize the expression:

\[ x_1 + x_2 + x_3 + \cdots + x_n. \]

When $n = 1$, we have $(x_1)$ only, thus $f(1) = 1$. (In general we omit the parentheses when $n = 1$.) When $n = 2$, $(x_1 + x_2)$ is the only way to parenthesize. Thus $f(2) = 1$. When $n = 3$, we have two ways to parenthesize: $(x_1 + (x_2 + x_3))$ and $((x_1 + x_2) + x_3)$. Therefore, $f(3) = 2$.

In general, after we put in the outermost pair of parentheses, we can exhaustively list all possible ways to place the second outermost pairs of parentheses:

\[
\begin{array}{rl}
\text{case } 1 & (x_1 + (x_2 + x_3 + \cdots + x_n)); \\
2 & ((x_1 + x_2) + (x_3 + \cdots + x_n)); \\
\cdots & \cdots \\
i & ((x_1 + \cdots + x_i) + (x_{i+1} + \cdots + x_n)); \\
\cdots & \cdots \\
n-1 & ((x_1 + \cdots + x_{n-1}) + x_n).
\end{array}
\]

Note that $((\cdots) + (\cdots) + (\cdots))$ is not fully parenthesized because it can be further parenthesized as $(((\cdots) + (\cdots)) + (\cdots))$ or $((\cdots) + ((\cdots) + (\cdots)))$. Consider the $i$th case above. We have $f(i)$ ways to parenthesize the first part, $x_1 + \cdots + x_i$, and $f(n-i)$ ways to parenthesize the second part, $x_{i+1} + \cdots + x_n$. By the product rule, there are $f(i) \times f(n-i)$ ways to further parenthesize in this case. Summing over $i = 1, \ldots, n-1$ by the sum rule, we obtain the following equation:

\[ f(n) = \begin{cases} 1 & \text{if } n = 1; \\ \displaystyle\sum_{i=1}^{n-1} f(i)\,f(n-i) & \text{if } n \ge 2. \end{cases} \]

□
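The recurrence of Solution 7 can be evaluated with memoization. The sketch below is an illustration (names are ours). It also compares the values with the Catalan numbers $C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1}$; that identification is a well-known fact about this recurrence and is not derived in the text.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def parenthesizations(n):
    """f(1) = 1 and f(n) = sum_{i=1}^{n-1} f(i) f(n-i) for n >= 2."""
    if n == 1:
        return 1
    # i is the position of the outermost split (x_1..x_i) + (x_{i+1}..x_n).
    return sum(parenthesizations(i) * parenthesizations(n - i) for i in range(1, n))


def catalan(m):
    """The m-th Catalan number, binom(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    print([parenthesizations(n) for n in range(1, 8)])
```

The first values 1, 1, 2 match $f(1)$, $f(2)$, and $f(3)$ computed by hand in the solution.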

Solution 8: To solve the recurrence equation

\[ f(n) = \begin{cases} 2 & \text{if } n = 0; \\ 3 + f(n-1) & \text{if } n \ge 1, \end{cases} \]

we apply the resubstitution method. Thus,

\[
\begin{aligned}
f(n) &= 3 + f(n-1) \\
&= 3 + (3 + f(n-2)) \\
&= 3 \times 2 + f(n-2) \\
&= 3 \times 2 + (3 + f(n-3)) \\
&= 3 \times 3 + f(n-3) \\
&= \cdots \\
&= 3n + f(0) \\
&= 3n + 2.
\end{aligned}
\]

Therefore, for $n \in \mathbf{N}_0$, $f(n) = 3n + 2$. □

Proof by Mathematical Induction:

The basis step is clearly satisfied: $f(0) = 3 \cdot 0 + 2 = 2$. For the hypothesis, assume that $f(n) = 3n + 2$. For the inductive step, we use the definition of $f(n+1)$ and the hypothesis to obtain

\[ f(n+1) = 3 + f(n) = 3 + (3n + 2) = 3(n+1) + 2. \]

□

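Solution 9: Given

\[ f(n) = \begin{cases} 2 & \text{if } n = 0; \\ 3f(n-1) + 2 & \text{if } n \ge 1, \end{cases} \]

by applying the resubstitution method we obtain:

\[
\begin{aligned}
f(n) &= 3f(n-1) + 2 \\
&= 3(3f(n-2) + 2) + 2 \\
&= 3^2 f(n-2) + 3 \cdot 2 + 2 \\
&= 3^2(3f(n-3) + 2) + 3 \cdot 2 + 2 \\
&= 3^3 f(n-3) + 3^2 \cdot 2 + 3 \cdot 2 + 2 \\
&= \cdots \\
&= 3^n f(0) + 3^{n-1} \cdot 2 + \cdots + 3 \cdot 2 + 2 \\
&= 3^n \cdot 2 + 3^{n-1} \cdot 2 + \cdots + 3 \cdot 2 + 2 \\
&= 3^{n+1} - 1.
\end{aligned}
\]

Therefore, $f(n) = 3^{n+1} - 1$ for $n \in \mathbf{N}_0$.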




Proof by Mathematical Induction:

The basis step is satisfied because $f(0) = 3^1 - 1 = 2$. For the hypothesis, assume that $f(n) = 3^{n+1} - 1$. For the inductive step, we use the definition of $f(n+1)$ and the hypothesis to obtain

\[
\begin{aligned}
f(n+1) &= 3f(n) + 2 \\
&= 3(3^{n+1} - 1) + 2 \\
&= 3^{n+2} - 1.
\end{aligned}
\]

□

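Solution 10: First, we prove that if $n, a, b \in \mathbf{N}$, then

\[ \left\lceil \frac{\lceil n/a \rceil}{b} \right\rceil = \left\lceil \frac{n}{ab} \right\rceil. \tag{8.5} \]

Let $n = abp + r$, where $0 \le r < ab$ and $p$ is an integer. If $r = 0$, the equality (8.5) follows immediately. If $r \ne 0$, i.e., $0 < r < ab$, then

\[ \left\lceil \frac{n}{ab} \right\rceil = \left\lceil \frac{abp + r}{ab} \right\rceil = p + 1, \]

and

\[ \left\lceil \frac{n}{a} \right\rceil = \left\lceil \frac{abp + r}{a} \right\rceil = \left\lceil bp + \frac{r}{a} \right\rceil = bp + k, \quad 1 \le k \le b, \]

because $0 < r < ab$ implies $0 < \frac{r}{a} < b$, which, in turn, implies that the smallest integer not less than $\frac{r}{a}$ cannot be bigger than $b$. And, since $1 \le k \le b$ implies $\frac{1}{b} \le \frac{k}{b} \le 1$, we have

\[ \left\lceil \frac{\lceil n/a \rceil}{b} \right\rceil = \left\lceil \frac{bp + k}{b} \right\rceil = \left\lceil p + \frac{k}{b} \right\rceil = p + 1 = \left\lceil \frac{n}{ab} \right\rceil. \]

Therefore,

\[ \left\lceil \frac{\lceil n/3^k \rceil}{3} \right\rceil = \left\lceil \frac{n}{3^{k+1}} \right\rceil, \quad k \ge 0. \tag{8.6} \]

□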

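To solve the recurrence equation

\[ f(n) = \begin{cases} 1 & \text{if } n = 1; \\ 3f\left(\left\lceil \frac{n}{3} \right\rceil\right) & \text{if } n \ge 2, \end{cases} \]

we use the result given in (8.6) and the resubstitution method; we have

\[
\begin{aligned}
f(n) &= 3f\left(\left\lceil \tfrac{n}{3} \right\rceil\right) \\
&= 3\left(3f\left(\left\lceil \tfrac{\lceil n/3 \rceil}{3} \right\rceil\right)\right) \\
&= 3^2 f\left(\left\lceil \tfrac{n}{3^2} \right\rceil\right) \\
&= 3^3 f\left(\left\lceil \tfrac{n}{3^3} \right\rceil\right) \\
&= \cdots \\
&= 3^k f\left(\left\lceil \tfrac{n}{3^k} \right\rceil\right) && \text{where } k = \lceil \log_3 n \rceil \\
&= 3^k f(1) \\
&= 3^{\lceil \log_3 n \rceil}.
\end{aligned}
\]

Note: If $n$ is a power of 3, then $f(n) = n$. □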


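Proof by Mathematical Induction:

The basis step is satisfied because $f(1) = 3^{\lceil \log_3 1 \rceil} = 3^0 = 1$. For the hypothesis, assume that $f(m) = 3^{\lceil \log_3 m \rceil}$ for all $1 \le m \le n$. For $n = 1$ the inductive step is immediate, since $f(2) = 3f(1) = 3 = 3^{\lceil \log_3 2 \rceil}$. For $n \ge 2$ we consider two cases.

case 1: $n = 3^k$ where $k \ge 1$.

\[
\begin{aligned}
f(n+1) &= f(3^k + 1) \\
&= 3f\left(\left\lceil \tfrac{3^k + 1}{3} \right\rceil\right) \\
&= 3f\left(\left\lceil 3^{k-1} + \tfrac{1}{3} \right\rceil\right) \\
&= 3f(3^{k-1} + 1) \\
&= 3 \cdot 3^{\lceil \log_3 (3^{k-1} + 1) \rceil}.
\end{aligned}
\tag{8.7}
\]

Since $3^{k-1} < 3^{k-1} + 1 < 3^k$, we have $\lceil \log_3(3^{k-1} + 1) \rceil = k$, and thus (8.7) $= 3 \cdot 3^k = 3^{k+1}$. And, since $3^k < 3^k + 1 < 3^{k+1}$, we have $\lceil \log_3(3^k + 1) \rceil = k + 1$. Therefore, $f(n+1) = 3^{\lceil \log_3 (n+1) \rceil}$.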



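case 2: $3^{k-1} < n < 3^k$ where $k \ge 1$.

\[
\begin{aligned}
3^{k-1} < n < 3^k &\Rightarrow 3^{k-1} + 1 < n + 1 < 3^k + 1 \\
&\Rightarrow 3^{k-1} + 1 < n + 1 \le 3^k \qquad (8.8) \\
&\Rightarrow 3^{k-2} + \tfrac{1}{3} < \tfrac{n+1}{3} \le 3^{k-1} \\
&\Rightarrow k - 2 < \log_3 \left\lceil \tfrac{n+1}{3} \right\rceil \le k - 1 \\
&\Rightarrow \left\lceil \log_3 \left\lceil \tfrac{n+1}{3} \right\rceil \right\rceil = k - 1.
\end{aligned}
\]

\[
\begin{aligned}
f(n+1) &= 3f\left(\left\lceil \tfrac{n+1}{3} \right\rceil\right) \\
&= 3 \cdot 3^{\left\lceil \log_3 \lceil (n+1)/3 \rceil \right\rceil} \\
&= 3 \cdot 3^{k-1} \\
&= 3^k.
\end{aligned}
\]

From (8.8), $k - 1 < \log_3(n+1) \le k$. Thus, $\lceil \log_3(n+1) \rceil = k$. Therefore $f(n+1) = 3^{\lceil \log_3 (n+1) \rceil}$. □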
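Both the ceiling identity (8.5) and the closed form of Solution 10 can be tested with integer arithmetic only, which avoids floating-point logarithms. The sketch below is illustrative; `ceil_div` and `ceil_log3` are helper names introduced here.

```python
def ceil_div(n, d):
    """Ceiling of n/d for positive integers, without floating point."""
    return -(-n // d)


def f(n):
    """f(1) = 1 and f(n) = 3 f(ceil(n/3)) for n >= 2."""
    return 1 if n == 1 else 3 * f(ceil_div(n, 3))


def ceil_log3(n):
    """Smallest k with 3^k >= n, i.e. the ceiling of log_3 n for n >= 1."""
    k, power = 0, 1
    while power < n:
        power *= 3
        k += 1
    return k


def identity_8_5_holds(n, a, b):
    """Check ceil(ceil(n/a)/b) == ceil(n/(ab)) for one triple."""
    return ceil_div(ceil_div(n, a), b) == ceil_div(n, a * b)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 9, 10, 27, 28):
        print(n, f(n), 3 ** ceil_log3(n))
```

In particular $f(27) = 27$, in line with the note that $f(n) = n$ when $n$ is a power of 3, while $f(28) = 81$.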

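Solution 11: We again use the result in (8.5) repeatedly, together with the resubstitution method, to solve the recurrence equation

\[ f(n) = \begin{cases} c & \text{if } n = 1; \\ a f\left(\left\lceil \frac{n}{b} \right\rceil\right) + cn & \text{if } n \ge 2, \end{cases} \]

where $a, c \in \mathbf{R}$, $b \in \mathbf{N}$, and $b > 1$.

Resubstitution Method:

\[
\begin{aligned}
f(n) &= a f\left(\left\lceil \tfrac{n}{b} \right\rceil\right) + cn \\
&= a\left(a f\left(\left\lceil \tfrac{\lceil n/b \rceil}{b} \right\rceil\right) + c\left\lceil \tfrac{n}{b} \right\rceil\right) + cn \\
&= a^2 f\left(\left\lceil \tfrac{n}{b^2} \right\rceil\right) + ac\left\lceil \tfrac{n}{b} \right\rceil + cn \\
&= a^3 f\left(\left\lceil \tfrac{n}{b^3} \right\rceil\right) + a^2 c\left\lceil \tfrac{n}{b^2} \right\rceil + ac\left\lceil \tfrac{n}{b} \right\rceil + cn \\
&= \cdots \\
&= a^k f\left(\left\lceil \tfrac{n}{b^k} \right\rceil\right) + a^{k-1} c\left\lceil \tfrac{n}{b^{k-1}} \right\rceil + \cdots + a^2 c\left\lceil \tfrac{n}{b^2} \right\rceil + ac\left\lceil \tfrac{n}{b} \right\rceil + cn && \text{where } k = \lceil \log_b n \rceil \\
&= a^k c + a^{k-1} c\left\lceil \tfrac{n}{b^{k-1}} \right\rceil + \cdots + a^2 c\left\lceil \tfrac{n}{b^2} \right\rceil + ac\left\lceil \tfrac{n}{b} \right\rceil + cn \\
&= a^k c\left\lceil \tfrac{n}{b^k} \right\rceil + a^{k-1} c\left\lceil \tfrac{n}{b^{k-1}} \right\rceil + \cdots + a^2 c\left\lceil \tfrac{n}{b^2} \right\rceil + ac\left\lceil \tfrac{n}{b} \right\rceil + a^0 c\left\lceil \tfrac{n}{b^0} \right\rceil \\
&= c \sum_{i=0}^{\lceil \log_b n \rceil} a^i \left\lceil \frac{n}{b^i} \right\rceil.
\end{aligned}
\]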



It is difficult to simplify the above formula further. But in many applications, e.g., algorithm analysis, an asymptotic solution is accurate enough for our purpose: $O$ gives an upper bound, $\Omega$ gives a lower bound, and $\Theta$ gives an approximation. In order to get rid of the ceiling functions in the formula we assume that $n$ is a power of $b$. In this way, we can simplify the formula as follows.

\[
\begin{aligned}
c \sum_{i=0}^{\log_b n} a^i \frac{n}{b^i} &= cn \sum_{i=0}^{\log_b n} \left(\frac{a}{b}\right)^i \\
&= cn\, \frac{1 - \left(\frac{a}{b}\right)^{\log_b n + 1}}{1 - \frac{a}{b}} && \text{(assume } a \ne b\text{)} \\
&= cn\, \frac{b}{a - b} \left( \frac{a^{\log_b n + 1}}{nb} - 1 \right) \\
&= \frac{c\, a^{\log_b n + 1}}{a - b} - \frac{cnb}{a - b}.
\end{aligned}
\tag{8.9}
\]

Thus, we have the following results.

\[
(8.9) \in \begin{cases}
\Theta(n) & \text{if } a < b; \\
\Theta(n \log n) & \text{if } a = b; \\
\Theta(n^{\log_b a}) & \text{if } a > b.
\end{cases}
\]

Note 1: A precise proof of the result by mathematical induction is very involved because of the ceiling function. You can assume that $n$ ranges over the powers of $b$.

Note 2: For this problem, $b$ must be a natural number because, otherwise, $\left\lceil \frac{\lceil n/b \rceil}{b} \right\rceil = \left\lceil \frac{n}{b^2} \right\rceil$ is incorrect in general. Here is a counterexample: if $n = 2$ and $b = 1.5$, then $\left\lceil \frac{n}{b^2} \right\rceil = 1$ and $\left\lceil \frac{\lceil n/b \rceil}{b} \right\rceil = 2$.

□
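The summation formula obtained by resubstitution in Solution 11 holds for every $n$, not only for powers of $b$, and this can be spot-checked. The sketch below restricts $a$ and $c$ to integers so that all arithmetic stays exact; that restriction, and the function names, are assumptions made for this illustration.

```python
def ceil_div(n, d):
    """Ceiling of n/d for positive integers."""
    return -(-n // d)


def f_recursive(n, a, b, c):
    """f(1) = c and f(n) = a f(ceil(n/b)) + c n for n >= 2."""
    if n == 1:
        return c
    return a * f_recursive(ceil_div(n, b), a, b, c) + c * n


def f_sum(n, a, b, c):
    """c * sum_{i=0}^{k} a^i ceil(n / b^i), with k = ceil(log_b n)."""
    k, power = 0, 1
    while power < n:  # smallest k with b^k >= n
        power *= b
        k += 1
    return c * sum(a ** i * ceil_div(n, b ** i) for i in range(k + 1))


if __name__ == "__main__":
    for n in (1, 2, 3, 8, 9, 100):
        print(n, f_recursive(n, 3, 2, 5), f_sum(n, 3, 2, 5))
```

The check covers the exact formula only; the asymptotic statements that follow it still rely on the simplification for $n$ a power of $b$.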

Solution 12: Let $f(0) = 1$, $f(1) = -1$, and for $n \ge 0$,

\[ f(n+2) + 2f(n+1) - 3f(n) = 0. \]

We solve the equation by the characteristic root method. We first find the characteristic polynomial, $r^2 + 2r - 3$, and solve its associated characteristic equation, $r^2 + 2r - 3 = 0$, to obtain its two roots: $r = 1$ and $r = -3$. Since the two roots are distinct, the general solution to the equation is:

\[ f(n) = A \cdot 1^n + B(-3)^n. \]


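The initial conditions $f(0) = 1$ and $f(1) = -1$ generate the following equations:

\[
\begin{aligned}
1 &= A \cdot 1^0 + B \cdot (-3)^0, \\
-1 &= A \cdot 1^1 + B \cdot (-3)^1.
\end{aligned}
\]

Solving the above equations gives $A = B = \frac{1}{2}$. That gives the solution:

\[ f(n) = \frac{1}{2} \cdot 1^n + \frac{1}{2} \cdot (-3)^n = \frac{1}{2} + \frac{1}{2} \cdot (-3)^n. \]

□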

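Solution 13: Let $f_n$ be the Fibonacci numbers, i.e., $f_0 = f_1 = 1$ and $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$. Define

\[ a_n = \frac{f_n}{f_{n-1}}, \quad\text{for } n \in \mathbf{N}. \]

For $n = 1$, $a_1 = \frac{f_1}{f_0} = \frac{1}{1} = 1$, and for $n \ge 2$,

\[ a_n = \frac{f_n}{f_{n-1}} = \frac{f_{n-1} + f_{n-2}}{f_{n-1}} = 1 + \frac{f_{n-2}}{f_{n-1}} = 1 + \frac{1}{f_{n-1}/f_{n-2}} = 1 + \frac{1}{a_{n-1}}. \]

Therefore,

\[ a_n = \begin{cases} 1 & \text{if } n = 1; \\ 1 + \dfrac{1}{a_{n-1}} & \text{if } n \ge 2. \end{cases} \]

□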

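We are interested in the value of $\lim_{n \to \infty} a_n$. Assume $\lim_{n \to \infty} a_n = r$. That means that, as $n$ approaches infinity, $a_n$ and $a_{n-1}$ both approach $r$. Thus, we have

\[
\begin{aligned}
\lim_{n \to \infty} a_n = 1 + \frac{1}{\lim_{n \to \infty} a_{n-1}} &\Longrightarrow r = 1 + \frac{1}{r} \\
&\Longrightarrow 0 = r^2 - r - 1 \\
&\Longrightarrow r = \frac{1 + \sqrt{5}}{2} \ \text{ or } \ r = \frac{1 - \sqrt{5}}{2}.
\end{aligned}
\]

It is clear that the negative root cannot be the solution because all $a_n$ are positive. Therefore,

\[ \lim_{n \to \infty} a_n = \frac{1 + \sqrt{5}}{2}. \]

□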
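The limit in Solution 13 is assumed to exist rather than proved, so a numerical look at the sequence is reassuring. The following sketch (names are ours) iterates $a_n = 1 + 1/a_{n-1}$ exactly with `fractions.Fraction` and compares a late term with $(1 + \sqrt{5})/2$.

```python
from fractions import Fraction
from math import sqrt


def ratios(count):
    """Return [a_1, ..., a_count] with a_1 = 1 and a_n = 1 + 1/a_{n-1}."""
    current = Fraction(1)
    values = [current]
    for _ in range(count - 1):
        current = 1 + 1 / current
        values.append(current)
    return values


if __name__ == "__main__":
    golden = (1 + sqrt(5)) / 2
    for n, a_n in enumerate(ratios(10), start=1):
        print(n, a_n, float(a_n) - golden)
```

The printed differences alternate in sign and shrink quickly, which is consistent with convergence to the positive root.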



Solution 14: Let $a_0 = 0$, $a_1 = -1$, and for $n \ge 2$, $a_n - 7a_{n-1} + 12a_{n-2} = 0$.

Step 1: The associated characteristic equation is $r^2 - 7r + 12 = 0$ and its two roots are $r = 3$ and $r = 4$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) all characteristic roots are distinct. Thus, the general solution to the given recurrence equation is

\[ a_n = A\,3^n + B\,4^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 0: & 0 = A + B, \\
n = 1: & -1 = 3A + 4B.
\end{array}
\quad\Longrightarrow\quad A = 1,\ B = -1.
\]

Therefore, $a_n = 3^n - 4^n$, $n \ge 0$.

□

Solution 15: Let $a_1 = 1$, $a_2 = 1$, and for $n \ge 3$, $a_n + 2a_{n-1} - 15a_{n-2} = 0$.

Step 1: The associated characteristic equation is $r^2 + 2r - 15 = 0$ and its two roots are $r = -5$ and $r = 3$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) all characteristic roots are distinct. Thus, the general solution is

\[ a_n = A(-5)^n + B\,3^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$. [Note: $n$ starts from 1.]

\[
\begin{array}{ll}
n = 1: & 1 = -5A + 3B, \\
n = 2: & 1 = 25A + 9B.
\end{array}
\quad\Longrightarrow\quad A = -\frac{1}{20},\ B = \frac{1}{4}.
\]

Therefore,

\[ a_n = -\frac{1}{20} \times (-5)^n + \frac{1}{4} \times 3^n = \frac{1}{4}(-5)^{n-1} + \frac{1}{4} \times 3^n, \quad n \ge 1. \]

□

Solution 16: Let a0 = 2, a1 = 0, and for n ≥ 2, −2an + 18an−2 = 0.



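Step 1: The associated characteristic equation is $-2r^2 + 18 = 0$ and its two roots are $r = 3$ and $r = -3$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) all characteristic roots are distinct. Thus, the general solution is

\[ a_n = A(-3)^n + B\,3^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 0: & 2 = A + B, \\
n = 1: & 0 = -3A + 3B.
\end{array}
\quad\Longrightarrow\quad A = 1,\ B = 1.
\]

Therefore, $a_n = (-3)^n + 3^n$, $n \ge 0$.

□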

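Solution 17: Let $a_1 = 2$, $a_2 = 6$, and for $n \ge 3$, $a_n - 4a_{n-1} + 4a_{n-2} = 0$.

Step 1: The associated characteristic equation is $r^2 - 4r + 4 = 0$ and its two roots are $r = 2$ and $r = 2$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) the two characteristic roots are the same. Thus, the general solution is

\[ a_n = (A + Bn)2^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 1: & 2 = (A + B) \times 2, \\
n = 2: & 6 = (A + 2B) \times 4.
\end{array}
\quad\Longrightarrow\quad A = \frac{1}{2},\ B = \frac{1}{2}.
\]

Therefore,

\[ a_n = \left(\frac{1}{2} + \frac{n}{2}\right)2^n = (1 + n)2^{n-1}, \quad n \ge 1. \]

□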

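Solution 18: Let $a_1 = 5$, $a_2 = -5$, and for $n \ge 3$, $a_n + 6a_{n-1} + 9a_{n-2} = 0$.

Step 1: The associated characteristic equation is $r^2 + 6r + 9 = 0$ and its two roots are $r = -3$ and $r = -3$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) the two characteristic roots are the same. Thus, the general solution is

\[ a_n = (A + Bn)(-3)^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 1: & 5 = (A + B)(-3), \\
n = 2: & -5 = (A + 2B)(-3)^2.
\end{array}
\quad\Longrightarrow\quad A = -\frac{25}{9},\ B = \frac{10}{9}.
\]

Therefore,

\[ a_n = \left(-\frac{25}{9} + \frac{10}{9}n\right)(-3)^n = 5(2n - 5)(-3)^{n-2}, \quad n \ge 1. \]

□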

Solution 19: Let $a_0 = 1$, $a_1 = 2$, and for $n \ge 2$, $a_n - 5a_{n-1} + 6a_{n-2} = 2n + 1$.

This is a nonhomogeneous recurrence equation, hence its solution is obtained from two parts, as shown in Step 2 below.

Step 1: The associated characteristic equation is $r^2 - 5r + 6 = 0$ and its two roots are $r = 2$ and $r = 3$.

Step 2: (i) We note that the nonhomogeneous part is nonzero; it is

\[ (2n + 1) \cdot 1^n. \]

Since 1 is not one of the characteristic roots of the homogeneous part, the solution to the nonhomogeneous part is a polynomial of degree 1 (the degree of the nonhomogeneous part, $2n + 1$). Let it be

\[ Cn + D. \]

The solution to the nonhomogeneous part must also satisfy the difference equation. Thus,

\[ (Cn + D) - 5(C(n-1) + D) + 6(C(n-2) + D) = 2n + 1. \]

After simplifying, we obtain

\[ 2Cn + (2D - 7C) = 2n + 1. \]


By comparing the coefficients of $n$ and the constant terms on both sides of the above equality, we get

\[ 2C = 2, \qquad 2D - 7C = 1. \]

Therefore, $C = 1$, $D = 4$, and the solution to the nonhomogeneous part is

\[ n + 4. \]

(ii) Since all characteristic roots are distinct, the general solution to the homogeneous part of the recurrence equation is $a_n = A\,2^n + B\,3^n$. By combining this solution with the solution to the nonhomogeneous part, $n + 4$, we get

\[ a_n = A\,2^n + B\,3^n + n + 4. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 0: & 1 = A + B + 0 + 4, \\
n = 1: & 2 = 2A + 3B + 1 + 4.
\end{array}
\quad\Longrightarrow\quad A = -6,\ B = 3.
\]

Therefore,

\[ a_n = -6 \cdot 2^n + 3 \cdot 3^n + n + 4 = -3 \cdot 2^{n+1} + 3^{n+1} + n + 4, \quad n \ge 0. \]

□

Solution 20: Let $a_0 = 1$, $a_1 = -1$, and for $n \ge 2$, $a_n - 3a_{n-1} + 2a_{n-2} = n$.

As in the previous problem, this is a nonhomogeneous recurrence equation. The solution is obtained from two parts.

Step 1: The associated characteristic equation is $r^2 - 3r + 2 = 0$ and its two roots are $r = 1$ and $r = 2$.

Step 2: (i) We note that the nonhomogeneous part is nonzero; it is $n \cdot 1^n$, and 1 is one of the characteristic roots. Thus, the solution to the nonhomogeneous part is a polynomial of degree 2. Let it be

\[ Bn^2 + Cn + D. \]

Instead of finding the values of $B$, $C$, $D$ at this point, we leave them unsolved until the final step.

(ii) By combining the result obtained in (i) with the contribution of the other characteristic root, $r = 2$, we get the general solution

\[ a_n = A\,2^n + (Bn^2 + Cn + D)1^n. \]



Step 3: Since we have 4 unknown constants, $A$, $B$, $C$, and $D$, we need at least 4 initial values to solve for them. We already have $a_0 = 1$ and $a_1 = -1$; two more are needed.

\[
\begin{aligned}
a_2 &= 3a_1 - 2a_0 + 2 = -3 - 2 + 2 = -3, \\
a_3 &= 3a_2 - 2a_1 + 3 = -9 + 2 + 3 = -4.
\end{aligned}
\]

These four initial conditions give the following linear equations:

\[
\begin{array}{ll}
n = 0: & 1 = A + 0 + 0 + D, \\
n = 1: & -1 = 2A + B + C + D, \\
n = 2: & -3 = 4A + 4B + 2C + D, \\
n = 3: & -4 = 8A + 9B + 3C + D.
\end{array}
\]

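A typical systematic method for solving the above system is to reduce the augmented matrix of coefficients to an upper triangular matrix via elementary row operations, as shown below.²

\[
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & 1 \\
2 & 1 & 1 & 1 & -1 \\
4 & 4 & 2 & 1 & -3 \\
8 & 9 & 3 & 1 & -4
\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & -1 & -3 \\
0 & 4 & 2 & -3 & -7 \\
0 & 9 & 3 & -7 & -12
\end{array}\right]
\]

\[
\Longrightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & -1 & -3 \\
0 & 0 & -2 & 1 & 5 \\
0 & 0 & -6 & 2 & 15
\end{array}\right]
\Longrightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & -1 & -3 \\
0 & 0 & -2 & 1 & 5 \\
0 & 0 & 0 & -1 & 0
\end{array}\right]
\]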

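Using the last matrix and working backwards gives the following four equations, which are very easy to solve.

\[
\begin{aligned}
-D = 0 &\Longrightarrow D = 0, \\
-2C + D = 5 &\Longrightarrow C = -\tfrac{5}{2}, \\
B + C - D = -3 &\Longrightarrow B = -\tfrac{1}{2}, \\
A + D = 1 &\Longrightarrow A = 1.
\end{aligned}
\]

Therefore,

\[ a_n = 2^n - \frac{n^2 + 5n}{2}, \quad n \ge 0. \]

□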
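Closed forms obtained by the characteristic root method are easy to validate against the original recurrence. The sketch below does this for Solutions 19 and 20; the helper `extend` and the variable names are ours and serve only as an illustration.

```python
from fractions import Fraction


def extend(initial, step, upto):
    """Extend [a_0, a_1] using a_n = step(n, a_{n-1}, a_{n-2}) up to index `upto`."""
    values = list(initial)
    for n in range(len(values), upto + 1):
        values.append(step(n, values[n - 1], values[n - 2]))
    return values


# Solution 19: a_n - 5a_{n-1} + 6a_{n-2} = 2n + 1 with a_0 = 1, a_1 = 2.
recurrence_19 = extend([1, 2], lambda n, p, q: 5 * p - 6 * q + 2 * n + 1, 15)
closed_19 = [-6 * 2 ** n + 3 * 3 ** n + n + 4 for n in range(16)]

# Solution 20: a_n - 3a_{n-1} + 2a_{n-2} = n with a_0 = 1, a_1 = -1.
recurrence_20 = extend([1, -1], lambda n, p, q: 3 * p - 2 * q + n, 15)
closed_20 = [2 ** n - Fraction(n * n + 5 * n, 2) for n in range(16)]

if __name__ == "__main__":
    print(recurrence_19[:5], closed_19[:5])
    print(recurrence_20[:5], [int(v) for v in closed_20[:5]])
```

The same pattern applies to any of Solutions 14 through 18 by swapping in the corresponding recurrence, starting index, and closed form.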

² One can refer to any textbook on linear algebra for more details.



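Solution 21: Given $a_0 = 0$, $a_1 = 1$, $a_2 = 4$, $a_3 = 13$, and

\[ a_n + b\,a_{n-1} + c\,a_{n-2} = 0 \ \text{ for } n \ge 2, \]

we use the given initial values to solve for $b$ and $c$. We know that

\[
\begin{aligned}
a_3 + b\,a_2 + c\,a_1 &= 0, \\
a_2 + b\,a_1 + c\,a_0 &= 0.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
13 + 4b + c &= 0, \\
4 + b + 0 &= 0.
\end{aligned}
\]

By solving the above equations, we get $b = -4$, $c = 3$. Then we have a standard difference equation and sufficient information to solve it: $a_0 = 0$, $a_1 = 1$, and

\[ a_n - 4a_{n-1} + 3a_{n-2} = 0 \ \text{ for } n \ge 2. \]

Step 1: The associated characteristic equation is $r^2 - 4r + 3 = 0$ and its two roots are $r = 3$ and $r = 1$.

Step 2: We note that (i) the nonhomogeneous part is 0 and (ii) all characteristic roots are distinct. Thus, the general solution is

\[ a_n = A\,3^n + B\,1^n. \]

Step 3: We use the initial conditions to solve the following equations for the unknown constants $A$ and $B$.

\[
\begin{array}{ll}
n = 0: & 0 = A + B, \\
n = 1: & 1 = 3A + B.
\end{array}
\quad\Longrightarrow\quad A = \frac{1}{2},\ B = -\frac{1}{2}.
\]

Therefore,

\[ a_n = \frac{1}{2}3^n - \frac{1}{2}, \quad n \ge 0. \]

□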

Solution 22: Let $a_0 = 0$, $a_1 = 1$, and for $n \ge 2$, $3a_n - 10a_{n-1} + 3a_{n-2} = 3^n$.

This is a nonhomogeneous recurrence equation. The solution is obtained from two parts.

Step 1: The associated characteristic equation is $3r^2 - 10r + 3 = 0$ and its two roots are $r = \frac{1}{3}$ and $r = 3$.

Step 2: (i) We note that the nonhomogeneous part is nonzero; it is $3^n$, and 3 is one of the characteristic roots. Thus, the solution to the nonhomogeneous part is the product of $3^n$ and a polynomial of degree 1. Let it be

\[ (Bn + C)3^n. \]

(ii) By combining the result obtained in (i) with the contribution of the other characteristic root, $r = \frac{1}{3}$, we get the general solution

\[ a_n = A\left(\tfrac{1}{3}\right)^n + (Bn + C)3^n. \]

Step 3: Since we have three unknown constants $A$, $B$, and $C$, we need at least 3 initial values to solve for them. We already have $a_0 = 0$ and $a_1 = 1$; $a_2$ can be obtained from the given difference equation.

\[ 3a_2 = 3^2 + 10a_1 - 3a_0 = 9 + 10 \quad\Longrightarrow\quad a_2 = \frac{19}{3}. \]

Now $A$, $B$, and $C$ can be determined from the following system:

\[
\begin{array}{ll}
n = 0: & 0 = A + 0 + C, \\
n = 1: & 1 = \frac{1}{3}A + 3B + 3C, \\
n = 2: & \frac{19}{3} = \frac{1}{9}A + 18B + 9C.
\end{array}
\]

By solving the system, we get

\[ A = \frac{3}{64}, \quad B = \frac{3}{8}, \quad C = -\frac{3}{64}. \]

Therefore,

\[ a_n = \frac{3}{64} \cdot \left(\frac{1}{3}\right)^n + \left(\frac{3}{8} \cdot n - \frac{3}{64}\right)3^n, \quad n \ge 0. \]

□

Solution 23: Let $a_0 = 0$, $a_1 = 1$, and for $n \ge 2$, $5a_n - 6a_{n-1} + a_{n-2} = n^2\left(\frac{1}{5}\right)^n$.

This is a nonhomogeneous recurrence equation. The solution is obtained from two parts.

Step 1: The associated characteristic equation is $5r^2 - 6r + 1 = 0$ and its two roots are $r = \frac{1}{5}$ and $r = 1$.


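Step 2: (i) We note that the nonhomogeneous part is

\[ n^2\left(\tfrac{1}{5}\right)^n, \]

and $\frac{1}{5}$ is one of the characteristic roots; thus the solution to the nonhomogeneous part is the product of $\left(\frac{1}{5}\right)^n$ and a polynomial of degree 3 (the degree of $n^2$ plus 1). Let it be

\[ (An^3 + Bn^2 + Cn + D)\left(\tfrac{1}{5}\right)^n. \]

(ii) The other characteristic root is 1. Combining with the result in (i), the general solution is

\[ a_n = (An^3 + Bn^2 + Cn + D)\left(\tfrac{1}{5}\right)^n + E \cdot 1^n. \]

Step 3: Since we have 5 unknown constants, we need at least 5 initial values to solve for them. We already have $a_0 = 0$ and $a_1 = 1$, and we generate three more by using the given difference equation together with $a_0$ and $a_1$. We get

\[ a_2 = \frac{154}{5^3}, \quad a_3 = \frac{808}{5^4}, \quad a_4 = \frac{4094}{5^5}. \]

We have the following system:

\[
\begin{aligned}
0 &= 0 + 0 + 0 + D + E, \\
1 &= \tfrac{A}{5} + \tfrac{B}{5} + \tfrac{C}{5} + \tfrac{D}{5} + E, \\
\tfrac{154}{5^3} &= \tfrac{8A}{5^2} + \tfrac{4B}{5^2} + \tfrac{2C}{5^2} + \tfrac{D}{5^2} + E, \\
\tfrac{808}{5^4} &= \tfrac{27A}{5^3} + \tfrac{9B}{5^3} + \tfrac{3C}{5^3} + \tfrac{D}{5^3} + E, \\
\tfrac{4094}{5^5} &= \tfrac{64A}{5^4} + \tfrac{16B}{5^4} + \tfrac{4C}{5^4} + \tfrac{D}{5^4} + E.
\end{aligned}
\]

After solving the system, we get

\[ A = -\frac{1}{60}, \quad B = -\frac{7}{80}, \quad C = -\frac{79}{480}, \quad D = -\frac{843}{640}, \quad E = \frac{843}{640}. \]

□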
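A five-by-five system with fractional coefficients is error-prone by hand, so exact arithmetic is a convenient way to double-check the constants of Solution 23. The sketch below is an illustration under our own naming: it builds $a_0, \ldots, a_{11}$ from the recurrence, solves the system for $A, \ldots, E$ by Gauss-Jordan elimination over `fractions.Fraction`, and then tests the resulting closed form on indices that were not used to set up the system.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination, exactly."""
    size = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(size):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# a_0 = 0, a_1 = 1, and 5 a_n = 6 a_{n-1} - a_{n-2} + n^2 (1/5)^n for n >= 2.
a = [Fraction(0), Fraction(1)]
for n in range(2, 12):
    a.append((6 * a[n - 1] - a[n - 2] + n * n * Fraction(1, 5) ** n) / 5)

# Row n encodes a_n = (A n^3 + B n^2 + C n + D)(1/5)^n + E for n = 0..4.
system = [[Fraction(n ** k, 5 ** n) for k in (3, 2, 1, 0)] + [Fraction(1)] for n in range(5)]
A, B, C, D, E = solve(system, a[:5])


def closed(n):
    """Closed form with the constants obtained from the linear system."""
    return (A * n ** 3 + B * n ** 2 + C * n + D) * Fraction(1, 5) ** n + E


if __name__ == "__main__":
    print(A, B, C, D, E)
    print(all(closed(n) == a[n] for n in range(12)))
```

Checking indices 5 through 11 matters: any five constants reproduce the five values used to build the system, but only the correct ones keep matching the recurrence afterwards.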


Chapter 9

Discrete Probability

Mathematics is the name we give to the collection of all possible patterns and interrelationships.

– John D. Barrow


9.1 Definitions and Terminologies

In our everyday life we encounter situations for which it is not possible to predict the exact outcome. Such situations arise in most of our scientific studies as well. The theory of probability helps us to understand such experiments in the sense that it allows us to calculate the likelihood of a complex collection of possible outcomes. When we talk about probability, everybody appears to have some idea of what it means. However, it is easy to go wrong in precise numerical evaluation; two seemingly correct arguments may give completely different answers. It is, therefore, important to develop a formal approach.

In our study, we will concentrate on discrete probability theory, i.e., where probabilities are obtained by summation instead of by integration.

Definition 9.1: Experiments for which the outcomes cannot be predicted with certainty are called random experiments, and the collection of all possible outcomes is known as the sample space, or just space, typically denoted as a set $S$.

Definition 9.2: Any subset of a sample space $S$ is called an event. Any element in a sample space is called an elementary event.

Definition 9.3: A rule that assigns a numerical value $\Pr(A)$ to each event $A \subseteq S$ is known as a probability distribution if it satisfies the following three conditions:

1. For any subset $A$ of $S$, $0 \le \Pr(A) \le 1$.

2. $\Pr(\emptyset) = 0$, $\Pr(S) = 1$.

3. If $A$ and $B$ are two disjoint subsets of $S$, then

\[ \Pr(A \cup B) = \Pr(A) + \Pr(B). \]

This rule is also known as the addition rule.

Definition 9.4: The value $\Pr(A)$ is known as the probability of event $A$. A sample space together with a probability distribution is known as a probability space.

Definition 9.5: Let $A$ and $B$ be two events such that $\Pr(A) \ne 0$. The conditional probability of event $B$, given event $A$, denoted $\Pr(B|A)$, is defined as:

\[ \Pr(B|A) = \frac{\Pr(A \cap B)}{\Pr(A)}. \]

Definition 9.6: Events $A$ and $B$ are independent if and only if

\[ \Pr(A \cap B) = \Pr(A)\Pr(B). \]



Comment: An alternative and more descriptive definition of independence is

\[ \Pr(B|A) = \Pr(B), \]

provided that $\Pr(A)$ is not zero. This definition says that the conditional probability of $B$, given $A$, is the same as its unconditional probability. In other words, event $A$ does not influence the occurrence of event $B$.

Definition 9.7: A random variable is a function defined on the elementary events $s \in S$ of a probability space:

\[ X : S \to \mathbf{R}, \qquad \Pr(X = x) = \Pr(\{s \in S : X(s) = x\}). \]

Definition 9.8: The odds of an event $A$ is the ratio $\Pr(A)/\Pr(A')$, where $A'$ denotes the complement of $A$.

Definition 9.9: The probability distribution function (pdf) of a random variable describes how the probability is distributed over the possible values that the random variable can take. We denote this by $p(x)$. For example, if the random variable $X$ takes values $0, 1, 2, \ldots, n$ with positive probability and all other values are impossible, then

\[ p(i) = \Pr(X = i), \quad i = 0, 1, \ldots, n \]

denotes its pdf. Another related function is

\[ F(x) = \Pr(X \le x) = \sum_{j \le x} p(j). \]

This function is called the cumulative distribution function (cdf) or just the distribution function (df).

Definition 9.10: Let X and Y be two random variables. We say that X and Y are independent random variables if

Pr(X = x and Y = y) = Pr(X = x) ·Pr(Y = y)

for all x and y.

How is this definition related to the events in a sample space? This is explained in the following discussion. Given a ∈ R we can find the set of elementary events such that X = a. The set that generates X = a can be written as X^{-1}(a). Likewise, for b ∈ R we denote by Y^{-1}(b) the set of all elementary events such that Y = b. In view of this observation we say that two random variables X and Y are independent if and only if for all a, b ∈ R

\[ \Pr(X^{-1}(a) \cap Y^{-1}(b)) = \Pr(X^{-1}(a)) \Pr(Y^{-1}(b)). \]



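For example, let S = {00, 01, 10, 11} and define

\[ \Pr(00) = \tfrac{1}{2}, \quad \Pr(01) = \tfrac{1}{4}, \quad \Pr(10) = \tfrac{1}{8}, \quad \text{and} \quad \Pr(11) = \tfrac{1}{8}. \]

Define X = the 1st coordinate of x and Y = the 2nd coordinate of x for any x ∈ S. In this example

  x    Pr(x)   X(x)   Y(x)
  00   1/2     0      0
  01   1/4     0      1
  10   1/8     1      0
  11   1/8     1      1

we have

\[ \Pr(\{x \mid X(x) = 0 \ \&\ Y(x) = 0\}) = \Pr(\{00, 01\} \cap \{00, 10\}) = \Pr(\{00\}) = \tfrac{1}{2}, \]

\[ \Pr(\{x \mid X(x) = 0\}) = \Pr(\{00, 01\}) = \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{3}{4}, \]

\[ \Pr(\{x \mid Y(x) = 0\}) = \Pr(\{00, 10\}) = \tfrac{1}{2} + \tfrac{1}{8} = \tfrac{5}{8}. \]

Since 1/2 ≠ 3/4 × 5/8 = 15/32, X and Y are not independent.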
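The independence test of Definition 9.10 can also be checked mechanically. The following is a minimal sketch, assuming the four-point space of the example above; the helper names `prob` and `independent` are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import product

# Probability of each elementary event in S = {00, 01, 10, 11}.
pr = {
    "00": Fraction(1, 2),
    "01": Fraction(1, 4),
    "10": Fraction(1, 8),
    "11": Fraction(1, 8),
}

def X(s):
    """First coordinate of the elementary event."""
    return int(s[0])

def Y(s):
    """Second coordinate of the elementary event."""
    return int(s[1])

def prob(event):
    """Addition rule: sum Pr(s) over all s satisfying the predicate."""
    return sum((p for s, p in pr.items() if event(s)), Fraction(0))

def independent(U, V):
    """Check Pr(U=a and V=b) == Pr(U=a) * Pr(V=b) for all values a, b."""
    u_vals = {U(s) for s in pr}
    v_vals = {V(s) for s in pr}
    return all(
        prob(lambda s: U(s) == a and V(s) == b)
        == prob(lambda s: U(s) == a) * prob(lambda s: V(s) == b)
        for a, b in product(u_vals, v_vals)
    )

print(independent(X, Y))
```

Exact rational arithmetic (`Fraction`) is used so that the equality test is not disturbed by floating-point rounding.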

Definition 9.11: Let X be a random variable. The mean (also known as the expected value and typically denoted by µ or E(X)) of X is defined as

\[ E(X) = \sum_{x} x \cdot \Pr(X = x). \]

Definition 9.12: Let X be a random variable. The median of X is defined to be the set of all x such that

\[ \Pr(X \le x) \ge \tfrac{1}{2} \quad \text{and} \quad \Pr(X \ge x) \ge \tfrac{1}{2}. \]

Definition 9.13: Let X be a random variable. The mode of X is defined to be the set of all x such that

Pr(X = x) ≥ Pr(X = x′) for all values x′ of X.

Definition 9.14: The variance of a random variable, typically denoted as σ², is defined as

\[ \mathrm{var}(X) = \sum_{x} (x - E(X))^2 \Pr(X = x). \]

The square root of variance is known as the standard deviation.



9.1.1 Examples and Discussion

Example 9.1 Let S = {(1, 1), (1, 2), . . . , (1, 6), (2, 1), . . . , (6, 6)} be the sample space of rolling a regular die twice.

The sample space S contains 36 elements. Here, (2, 3) is an elementary event and represents the fact that the first face value is 2 and the second face value is 3. Two rolls such as (2, 3) and (3, 2) are considered different. An event can be written as a subset, e.g., {(2, 3), (1, 4)}, or can be expressed in words, e.g., “both face values are the same.” In set notation the latter event is {(1, 1), (2, 2), . . . , (6, 6)}. We usually assume that the die is “fair” (regular), i.e., each of the six face values will be seen with equal probability 1/6. The probability of the event that “both face values are the same” is 6/36, obtained by the simple formula

\[ \Pr(\text{event}) = \frac{\text{number of points favorable to the event}}{\text{number of points in the sample space}}. \]

It is important to note that this formula is applicable only if all elementary events in the sample space are assigned equal probabilities. If the elementary events occur with different probabilities, then the probability of an event is obtained by the addition rule given above, i.e.,

\[ \Pr(A) = \sum_{x \in A} \Pr(x). \]

A random variable X could be defined by the property:

X = the value equal to the sum on the spots on the dice roll. (9.1)

For example, X((2, 3)) = 5. [To make our notation simpler we prefer to write X((a, b)) as X(a, b).] Another random variable Y could be defined as:

Y := the value equal to the absolute value of the difference on the spots on the dice roll. (9.2)

For example, Y (2, 3) = |2− 3| = 1.

Consider the random variable X we just defined. The pdf and cdf of X are easily obtained (see table below) if each of the two dice is fair.

  x      2      3      4      5      6      7      8      9      10     11     12
  p(x)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
  F(x)   1/36   3/36   6/36   10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36

It is easy to verify that the mean, median, and mode of the random variable X are all 7. □
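The table for X can be reproduced by enumerating the 36 equally likely rolls. This is a small sketch, assuming two fair dice as in Example 9.1; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 rolls give each sum, then convert to probabilities.
counts = Counter(a + b for a, b in product(faces, repeat=2))
pdf = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

# cdf: running total of the pdf, F(x) = Pr(X <= x).
cdf = {}
running = Fraction(0)
for x, p in pdf.items():
    running += p
    cdf[x] = running

# Mean (Definition 9.11), median set (9.12), and mode set (9.13).
mean = sum(x * p for x, p in pdf.items())
medians = [
    x for x in pdf
    if cdf[x] >= Fraction(1, 2) and 1 - cdf[x] + pdf[x] >= Fraction(1, 2)
]
modes = [x for x, p in pdf.items() if p == max(pdf.values())]

print(mean, medians, modes)
```

Here `1 - cdf[x] + pdf[x]` is Pr(X ≥ x), which is what the second condition of Definition 9.12 requires.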

Example 9.2 Here we consider another example where a die is rolled twice, except that it is a loaded (not a fair) die. Suppose that the probabilities of face values are distributed as shown below:

  x       1     2     3      4      5      6
  Pr(x)   1/3   1/3   1/12   1/12   1/12   1/12

In this example, the probability of the event “both face values are the same” is

\[ \frac{1}{9} + \frac{1}{9} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} = \frac{1}{4}. \]

For the random variable Y, defined in (9.2), we get the following pdf and cdf:

  x      0        1        2         3         4         5
  p(x)   36/144   46/144   20/144    18/144    16/144    8/144
  F(x)   36/144   82/144   102/144   120/144   136/144   144/144

How do we obtain the above pdf? We illustrate it for one case: p(1) = Pr(Y = 1). Clearly, Y = 1 when the difference in the face values is 1. That is, Pr(Y = 1) = Pr(A) where

A = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3), (4, 5), (5, 4), (5, 6), (6, 5)}.

In turn, for example, Pr(1, 2) = 1/3 · 1/3 = 1/9 and Pr(2, 3) = 1/3 · 1/12 = 1/36, etc. Therefore,

\[ p(1) = \frac{1}{9} + \frac{1}{9} + \frac{1}{36} + \frac{1}{36} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} + \frac{1}{144} = \frac{46}{144}. \]

The mean of Y with respect to this loaded die is

\[ 0 \times \frac{36}{144} + 1 \times \frac{46}{144} + 2 \times \frac{20}{144} + 3 \times \frac{18}{144} + 4 \times \frac{16}{144} + 5 \times \frac{8}{144} = \frac{244}{144} \approx 1.694. \]

The median of Y is 1 (note that Pr(Y ≤ 1) = 82/144 ≥ 0.5, and Pr(Y ≥ 1) = 108/144 ≥ 0.5), and the mode of Y is also 1. □
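When the elementary events are not equally likely, the addition rule replaces simple counting. The sketch below, under the assumption that the two rolls of the loaded die are independent (as the products 1/3 · 1/3 in the text imply), accumulates Pr(a)·Pr(b) over all pairs to rebuild the pdf of Y; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Face probabilities of the loaded die from Example 9.2.
face_pr = {
    1: Fraction(1, 3), 2: Fraction(1, 3),
    3: Fraction(1, 12), 4: Fraction(1, 12),
    5: Fraction(1, 12), 6: Fraction(1, 12),
}

# pdf of Y = |a - b|: add Pr(a) * Pr(b) for every ordered pair (a, b).
pdf = {d: Fraction(0) for d in range(6)}
for a, b in product(face_pr, repeat=2):
    pdf[abs(a - b)] += face_pr[a] * face_pr[b]

mean = sum(d * p for d, p in pdf.items())
same_face = pdf[0]  # event "both face values are the same"

print([p * 144 for p in pdf.values()], float(mean))
```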

Comment: A sample space may take different forms, one of which may be more convenient than the others. For example, when a die is tossed two times the most natural sample space is

{(1, 1), (1, 2), . . . , (6, 6)}.

However, if we are interested in events relating to the sum of the face values of the two tosses, then we might as well choose another sample space,

{2, 3, . . . , 12}.

It is obvious that the latter sample space will not be useful for answering some of the other questions, e.g., what is the probability that both face values are the same, whereas the first sample space answers this question easily.

9.2 Theorems of Probability

Theorem 9.1: Let A′ denote the complementary event of A. Then, for any event A,

Pr(A′) = 1−Pr(A).

Theorem 9.2: For arbitrary events A and B,

Pr(A ∪B) = Pr(A) + Pr(B)−Pr(A ∩B).

Theorem 9.3: Let A and B be two events. Then,

Pr(A ∩B) = Pr(A)Pr(B|A) = Pr(B)Pr(A|B).

Theorem 9.4: (Bayes’ Theorem) Let A1, . . . , An be n events such that (i) A1 ∪ A2 ∪ . . . ∪ An = S, and (ii) Ai ∩ Aj = ∅ for all i ≠ j, where S denotes the sample space. Then, for any event B with Pr(B) ≠ 0,

\[ \Pr(A_i \mid B) = \frac{\Pr(A_i) \Pr(B \mid A_i)}{\sum_{j=1}^{n} \Pr(A_j) \Pr(B \mid A_j)} \]

for any i = 1, . . . , n.

Theorem 9.5: For any two random variables X and Y ,

E(X + Y ) = E(X) + E(Y ).

Theorem 9.6: For any random variable X and constant a,

E(aX) = aE(X).

Theorem 9.7: For two independent random variables X and Y ,

E(XY ) = E(X)E(Y ).

Theorem 9.8: The variance of a random variable X is often easy to evaluate using the following formula:

\[ \mathrm{var}(X) = \sum_{x} x^2 \Pr(X = x) - E(X)^2. \]



Theorem 9.9: (Chebyshev’s Inequality) Let X be any random variable with expected value µ and standard deviation σ. Then, for any constant k > 1,

\[ \Pr(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}. \]



9.3 Problems

Problem 1: Let A = {a, b, c, d, e, f}. Suppose we want to pick three letters from A. List the sample space of this experiment.

Problem 2: Three letters are chosen at random from the word “steve.”

1. Describe the sample space.

2. Describe event E where “one of the three letters is t.” What is the probability of this event? Assume all selections are equally likely.

3. Describe event E, where “one or more of the three letters is e.” What is its probability? Assume all selections are equally likely.

Problem 3: Three letters are chosen at random from “calculator.” What is the probability that

1. one of them is a t?

2. one or more of them is a c?

Assume all selections are equally likely.

Problem 4: Ten students, p1, p2, · · · , p10, are randomly arranged in a line.

1. Describe the sample space.

2. Describe event E, where “p1 and p2 are next to each other.” What is its probability?

Problem 5: When you throw three fair dice, what is the probability that the sum of the numbers on the top faces is 10?

Problem 6: Define S := {0, 1, . . . , n} and Pr(k) = $\binom{n}{k} 2^{-n}$.

1. Verify that (S, Pr) is a sample space.

2. Define two random variables, f and g, as

for x ∈ S, f(x) = x, g(x) = 1.

Calculate the expected values of the random variables f and g.

Problem 7: In bridge a 52-card deck is dealt out in four hands of 13 cards each. What is the probability that your hand has no ace? [Assume that all 52! shuffles are equally likely.]

Problem 8: Suppose the probability of an event is x/y, where x and y are positive real numbers. Express the odds of the event in terms of x and y.

Problem 9: You throw a fair die. If an even number turns up you get that number of dollars. If an odd number turns up you pay that number of dollars. What is the value of your expectation? Explain.



Problem 10: You throw two dice. If the sum of the faces is even, you win that much; if odd, you lose that amount. What is your expectation? Explain.

Problem 11: Let A and B be disjoint events in a sample space. Prove that A and B are not independent unless Pr(A) or Pr(B) is 0.

Problem 12: Suppose S is a sample space of exactly two points:

S = {a, b} and Pr(a)Pr(b) ≠ 0.

Prove that no two random variables f and g on S are independent unless one is a constant function.

Problem 13: Suppose there is a machine that at the press of a button returns an integer from 2 to 101, all with equal probability.

1. Players A and B agree to gamble as follows: A pays d dollars to play the game. B presses the button and pays p dollars to A, where p is the smallest prime dividing the integer returned by the machine. What should d be so that the game is “fair” (i.e., the expected value of each player’s winnings is zero)?

2. Suppose A plays the above game ten times at a fee of d = $10 per play. What are A’s expected net winnings?

Problem 14: The World Series ends as soon as one team wins four games. There are no ties, and there are only two teams. What is the probability that the series lasts for seven games if

1. the teams are evenly matched;

2. one team is a 3-to-2 favorite (the true odds) over the other?

Problem 15: Suppose we want to pick two numbers from {1, 2, · · · , 100} randomly. What is the probability that the sum of the two picked numbers is divisible by 5?

Problem 16: Let (S, Pr) be a sample space and f a random variable on (S, Pr). We define a set T and a function G as follows:

T = {v | v = f(a), a ∈ S}; G : T → [0, 1], and for all v ∈ T,

\[ G(v) = \sum_{a \in S,\ f(a) = v} \Pr(a). \]

Prove that (T, G) is a sample space.

Problem 17: A box contains one black ball and two white balls. You are asked to pick balls from the box until the black ball is chosen for the first time. What is the expected time for picking the black ball if

(i) you don’t have to replace the chosen white ball in the box?

(ii) you have to replace the chosen white ball in the box?

Problem 18: Throwing two dice, you win |x − y| dollars, where x and y are the face numbers. Show that your expected win is 35/18.

Problem 19: You are to throw three dice and win the difference between the maximum and minimum face numbers. Show that your expectation is 35/12.

Problem 20: If the probability of winning a tennis game is p, what is the probability that after six games of tennis you lead your opponent four games to two? Answer the question for (i) p = 0.5, (ii) p = 0.6, and (iii) p = 0.4.

Problem 21: A class of 15 students has three computer terminals available in the lab. Each student uses a terminal with probability 1/10. What is the probability that one or more students from this class will have to wait to use a terminal? Calculate the probability accurate to two significant figures.

Problem 22: On the average, 15 out-of-state cars pass a certain point on a road per hour. What is the probability that exactly four out-of-state cars pass that point in a 12-minute period?

Problem 23: Let S = {1, 2, 3, 4, 5, 6} and let Pr, f, and g be defined as follows:

  s   Pr(s)   f(s)   g(s)
  1   1/4     1      3
  2   1/6     0      3
  3   1/6     1      3
  4   1/6     0      6
  5   1/6     1      6
  6   1/12    0      6

1. Prove that (S, Pr) is a sample space.

2. Let {1, 2, 6} and {2, 4} be two events. Are they independent? Why?

3. Are f and g two independent random variables on (S, Pr)? Why?

Problem 24: For f and g defined above, find

1. E(f + g).

2. Var(f + g).

Problem 25: Over a large set of inputs a program runs twice as often as it aborts. What is the probability that of the next 6 attempts, 4 or more will run?



Problem 26: A bag has 1 counter marked 1, 2 counters marked 4, 3 counters marked 9, and so on up to n counters marked n² (where n is a positive integer). You draw one counter and are paid the amount shown on it. Show that your expectation is equal to the number of counters.

Problem 27: Show that for 10,000 flips of a balanced coin, the probability is at least 0.99 that the proportion of heads will fall between 0.45 and 0.55.



9.4 Solutions

Solution 1: The sample space is the collection of all possible ways to pick three letters from {a, b, c, d, e, f}. We know that there are $\binom{6}{3} = 20$ ways to do so. They are:

abc, abd, abe, abf, acd, ace, acf, ade, adf, aef,
bcd, bce, bcf, bde, bdf, bef, cde, cdf, cef, def.

□

Solution 2: Three letters are chosen at random from steve.

1. The sample space is

{ste, stv, sev, see, tev, tee, eev}.

One must pay special attention to the probabilities of the events in the sample space. The probabilities are not equal to each other. For example,

\[ \Pr(ste) = \frac{2}{10}, \]

because we have $\binom{5}{3} = 10$ ways to select three letters from five different positions in steve, and the e in the event ste can come from either the first or the second e in steve. Thus, we have two ways to pick ste. The following table shows the probabilities of the events.

        ste    stv    sev    see    tev    tee    eev
  Pr    2/10   1/10   2/10   1/10   2/10   1/10   1/10

2. The event “one of the three letters is t” is E = {ste, stv, tev, tee}, and its probability is

\[ \Pr(ste) + \Pr(stv) + \Pr(tev) + \Pr(tee) = \frac{6}{10}. \]

3. The event “one or more of the three letters is e” is

E = {ste, sev, see, tev, tee, eev},

and its probability is

\[ \Pr(ste) + \Pr(sev) + \Pr(see) + \Pr(tev) + \Pr(tee) + \Pr(eev) = \frac{9}{10}. \]


Alternatively, E = S − {stv}, where S is the entire sample space, and “stv” is the only event without an “e.” Thus, the probability of E is

\[ 1 - \Pr(stv) = \frac{9}{10}. \]

□
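The unequal probabilities in Solution 2 come from treating the five letter positions of “steve” as distinct. The sketch below enumerates the ten position choices and groups them by the letters selected; it is an illustration only, and the sorted-letter keys (for example `"est"` for ste) are an artifact of this sketch rather than notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

word = "steve"

# combinations() works on positions, so the two e's are counted separately.
picks = Counter("".join(sorted(c)) for c in combinations(word, 3))
total = sum(picks.values())
pr = {letters: Fraction(n, total) for letters, n in picks.items()}

# Probabilities of the two events asked for in Problem 2.
p_t = sum(p for letters, p in pr.items() if "t" in letters)
p_e = sum(p for letters, p in pr.items() if "e" in letters)

print(total, len(pr), p_t, p_e)
```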

Solution 3: First, we consider the sample space consisting of all possible selections of three letters out of ten. We can pick three letters out of ten, randomly, in $\binom{10}{3}$ ways. All events in the sample space have an equal probability $1/\binom{10}{3}$. Now, the remaining problem is how to evaluate the probability of the events in which we are interested.

1. “All selections that contain one t”: First we pick the t and then, in $\binom{9}{2}$ ways, we pick the other two letters. Therefore, we have $\binom{9}{2}$ ways to make a selection containing the t, and

\[ \Pr(\text{the selection contains one t}) = \frac{\binom{9}{2}}{\binom{10}{3}} = \frac{3}{10}. \]

2. We have two c’s in the word calculator. Therefore, we have two kinds of events: with one c and with two c’s. We evaluate their probabilities separately. For one c, we pick either one of the two c’s and then pick two more letters (other than c) in $2 \times \binom{8}{2}$ ways. For two c’s we select one more letter from the remaining eight letters in $\binom{8}{1}$ ways. Therefore,

\[ \Pr(\text{the selection has at least one c}) = \frac{2 \times \binom{8}{2}}{\binom{10}{3}} + \frac{\binom{8}{1}}{\binom{10}{3}} = \frac{8}{15}. \]

□

Solution 4:

1. The sample space is the set of all possible permutations of p1, p2, · · · , p10. Thus, the size of the sample space is 10!. And, since the students are randomly lined up, each order has the same probability, 1/10!.

2. We consider p1 and p2 together as one student A. There are 9! permutations of the nine students A, p3, p4, · · · , p10. For each permutation obtained in this manner, we have two ways to place p1 and p2 next to each other: p1p2 or p2p1. Therefore,

\[ \Pr(p_1 \text{ and } p_2 \text{ are next to each other}) = \frac{9! \times 2}{10!} = 0.2. \]

□

Solution 5: It is not difficult to exhaustively list all possible results of rolling three dice. However, if the number of dice is more than four, the exhaustive method becomes cumbersome. We use an alternative approach to answer this problem.

We mark the three dice to make them distinguishable. The size of the sample space is 6³. Let x, y, and z denote the numbers on the top faces of the three dice. We want to know how many integer solutions of

\[ x + y + z = 10 \qquad (9.3) \]

there are when 1 ≤ x ≤ 6, 1 ≤ y ≤ 6, 1 ≤ z ≤ 6.

An equivalent question, obtained by substituting x′ = x − 1, y′ = y − 1, and z′ = z − 1, is: How many integer solutions of

\[ x' + y' + z' = 7 \qquad (9.4) \]

are there when 0 ≤ x′ ≤ 5, 0 ≤ y′ ≤ 5, 0 ≤ z′ ≤ 5? There are

\[ \binom{7 + 3 - 1}{7} = 36 \]

nonnegative integer solutions of (9.4) when the upper bounds are ignored. We must remove solutions in which x′ ≥ 6, y′ ≥ 6, or z′ ≥ 6; at most one of these can hold because the sum is 7. There are

\[ \binom{1 + 3 - 1}{2} \times 3 = 9 \]

such solutions. Thus we have 36 − 9 = 27 different ways in which the sum of the numbers on the three faces is 10. Therefore, the desired probability is

\[ \frac{27}{6^3} = \frac{1}{8}. \]

□
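Since 6³ = 216 is small, the count of 27 can be cross-checked by brute force. The following sketch compares exhaustive enumeration with the counting formula used above; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Exhaustive count of ordered rolls (x, y, z) with x + y + z = 10.
brute = sum(1 for roll in product(range(1, 7), repeat=3) if sum(roll) == 10)

# Stars-and-bars count of solutions to x' + y' + z' = 7, minus the
# solutions where one variable is at least 6 (three choices of variable).
formula = comb(7 + 3 - 1, 7) - 3 * comb(1 + 3 - 1, 2)

probability = Fraction(brute, 6 ** 3)
print(brute, formula, probability)
```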

Solution 6: Recall that $\sum_{k=0}^{n} \binom{n}{k} = 2^n$ and $\sum_{k=0}^{n} k \binom{n}{k} = n 2^{n-1}$.



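1. To verify that (S, Pr) is a well-defined sample space we check that S = {0, 1, . . . , n} and $\Pr(k) = \binom{n}{k} 2^{-n} \ge 0$. Moreover,

\[ \sum_{k \in S} \Pr(k) = \sum_{k=0}^{n} \binom{n}{k} 2^{-n} = 2^{-n} \sum_{k=0}^{n} \binom{n}{k} = 2^{-n} 2^{n} = 1. \]

2. Let f : S → R; f(x) = x. Then

\[ E(f) = \sum_{x \in S} f(x) \Pr(x) = \sum_{x=0}^{n} x \binom{n}{x} 2^{-n} = 2^{-n} \sum_{x=0}^{n} x \binom{n}{x} = 2^{-n} \, n \, 2^{n-1} = \frac{n}{2}. \]

If g : S → R; g(x) = 1, then

\[ E(g) = \sum_{x \in S} g(x) \Pr(x) = \sum_{x \in S} \Pr(x) = 1. \]

□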

Solution 7: The size of the sample space is $\binom{52}{13}$. The number of ways to pick 13 cards from 48 cards (after removing the four aces) is $\binom{48}{13}$. Therefore,

\[ \Pr(\text{a hand without an ace}) = \frac{\binom{48}{13}}{\binom{52}{13}} = \frac{39 \cdot 38 \cdot 37 \cdot 36}{52 \cdot 51 \cdot 50 \cdot 49}. \]

Alternatively, consider the probability of drawing a card that is not an ace: the probability that the first card is not an ace is 48/52, the probability that the second card is not an ace (given that the first was not) is 47/51, . . . , and the probability that the 13th card is not an ace is 36/40. Therefore, the probability of drawing 13 cards without an ace is

\[ \frac{48}{52} \times \frac{47}{51} \times \cdots \times \frac{36}{40} = \frac{39 \cdot 38 \cdot 37 \cdot 36}{52 \cdot 51 \cdot 50 \cdot 49}. \]

□

Solution 8: In this problem, Pr(E) = x/y. Therefore, the odds of event E are

\[ \frac{\Pr(E)}{1 - \Pr(E)} = \frac{x/y}{1 - x/y} = \frac{x}{y - x}. \]

□

Solution 9: The set of possible outcomes of throwing a die is

S = {1, 2, 3, 4, 5, 6}.

Define a random variable

\[ f(x) = \begin{cases} x & \text{if } x \text{ is even,} \\ -x & \text{if } x \text{ is odd,} \end{cases} \qquad (9.5) \]

where a positive number means you will win that amount of money, while a negative number indicates the amount you pay. Then,

\[ E(f) = \sum_{x \in S} f(x) \cdot \frac{1}{6} = \frac{1}{6}(-1 + 2 - 3 + 4 - 5 + 6) = \frac{1}{2}. \]

Therefore, you expect to win 1/2 dollar per game. □

Solution 10: For this problem, we can define the sample space and the associated probability function in two different ways. Each has advantages and disadvantages.

Method 1: Define the sample space S = {2, 3, . . . , 12} and the random variable

\[ f(x) = \begin{cases} x & \text{if } x \text{ is even,} \\ -x & \text{if } x \text{ is odd.} \end{cases} \]



Then, the probability distribution function is relatively difficult to obtain, but the expected value is easy to compute. The following table shows the probability of each event in the sample space.

  x       2      3      4      5      6      7      8      9      10     11     12
  f(x)    2      −3     4      −5     6      −7     8      −9     10     −11    12
  Pr(x)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

Thus,

\[ E(f) = \sum_{x=2}^{12} f(x) \cdot \Pr(x) = 0. \]

Method 2: If the sample space is S = {1, 2, 3, 4, 5, 6} × {1, 2, 3, 4, 5, 6}, then the probability distribution function is easy to describe: for all (x, y) ∈ S, Pr(x, y) = 1/36. However, extra work is required to define the random variable and to calculate its expectation. We define the random variable as:

\[ f(x, y) = \begin{cases} (x + y) & \text{if } x + y \text{ is even,} \\ -(x + y) & \text{if } x + y \text{ is odd.} \end{cases} \]

It can be seen that the expectation is the same. □
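Both choices of sample space can be evaluated directly. The sketch below, assuming fair dice, computes the expectation of Solution 9 and the two methods of Solution 10; the closed form `6 - abs(s - 7)` for the number of rolls with sum s is a convenience of this sketch, not something stated in the text.

```python
from fractions import Fraction
from itertools import product

def payoff(total):
    """Win the amount if it is even, lose it if it is odd."""
    return total if total % 2 == 0 else -total

# Solution 9: a single fair die.
e_single = sum(Fraction(payoff(x), 6) for x in range(1, 7))

# Solution 10, Method 1: sample space {2, ..., 12} with unequal probabilities.
pr_sum = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}
e_method1 = sum(payoff(s) * p for s, p in pr_sum.items())

# Solution 10, Method 2: 36 equally likely ordered pairs.
e_method2 = sum(
    Fraction(payoff(x + y), 36) for x, y in product(range(1, 7), repeat=2)
)

print(e_single, e_method1, e_method2)
```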

Solution 11: Let A and B be two disjoint events. Since A and B are disjoint events, Pr(A ∩ B) = Pr(∅) = 0. Suppose A and B are independent; then

Pr(A ∩ B) = Pr(A)Pr(B).

Therefore,

Pr(A)Pr(B) = 0,

which means that at least one of Pr(A) and Pr(B) is 0. □

Solution 12: Let f and g be two independent random variables on (S, Pr), where S = {a, b} and Pr(a)Pr(b) ≠ 0. We want to prove that at least one of f and g is a constant function.

Let u and v be two values in the range of f and of g respectively, i.e., f^{-1}(u) ≠ ∅ and g^{-1}(v) ≠ ∅. Because f and g are independent, we have

Pr({s | f(s) = u & g(s) = v}) = Pr({s | f(s) = u}) Pr({s | g(s) = v}).

We have the following cases:

1. {s | f(s) = u & g(s) = v} = ∅. In this case, we have the following subcases:

(a) {s | f(s) = u} = {a} and {s | g(s) = v} = {b}.
(b) {s | f(s) = u} = {b} and {s | g(s) = v} = {a}.

In both subcases we obtain Pr(a)Pr(b) = 0, which contradicts our assumptions.

2. {s | f(s) = u & g(s) = v} = {a}. We also have the following subcases:

(a) {s | f(s) = u} = {a} and {s | g(s) = v} = {a}. In this subcase

Pr(a) = Pr(a)Pr(a),

and hence Pr(a) = 1, Pr(b) = 0, a contradiction.
(b) {s | f(s) = u} = {a, b} and {s | g(s) = v} = {a}. In this subcase f is a constant u, because for all s ∈ {a, b}, f(s) = u.
(c) {s | f(s) = u} = {a} and {s | g(s) = v} = {a, b}. Similarly, g is a constant.

3. {s | f(s) = u & g(s) = v} = {b}. The arguments for this case are similar to those for the previous case.

4. {s | f(s) = u & g(s) = v} = {a, b}. It is clear that in this case both f and g are constants.

□

Solution 13: Part 1: The set of all possible outcomes is S = {2, 3, . . . , 101}. Given a prime p ∈ S, define

A(p) = {x : x ∈ S, p divides x, and there is no prime q < p such that q divides x}

and n(p) = the number of elements in A(p). In this notation it is easy to observe that the expected amount of dollars paid by B to A is

\[ \sum_{p \in S,\ p \text{ is a prime}} \frac{p \, n(p)}{100}, \]

because for each x ∈ A(p) B pays p dollars to A. For each prime p ∈ S the associated value of n(p) is given in the following table. A brief comment on how to obtain it follows the table.



  prime : n(p)    prime : n(p)    prime : n(p)
   2 : 50          29 : 1          67 : 1
   3 : 17          31 : 1          71 : 1
   5 : 7           37 : 1          73 : 1
   7 : 4           41 : 1          79 : 1
  11 : 1           43 : 1          83 : 1
  13 : 1           47 : 1          89 : 1
  17 : 1           53 : 1          97 : 1
  19 : 1           59 : 1         101 : 1
  23 : 1           61 : 1

p = 2: It is easy to see that n(2) = 50 because 2 divides 50 integers in S.

p = 3: There are ⌊100/3⌋ = 33 integers divisible by 3 in S. But out of these 33 integers ⌊100/6⌋ = 16 are divisible by 2 and therefore belong to A(2). Hence, n(3) = 33 − 16 = 17.

p = 5: To find the number of elements divisible by 5 but not by 2 or 3 we obtain the number of elements divisible by 5 (there are ⌊100/5⌋ = 20 such elements) and then subtract the integers divisible by 2 × 5 = 10 and by 3 × 5 = 15 but without counting such elements twice. Thus, n(5) = 20 − ⌊100/10⌋ − ⌊100/15⌋ + ⌊100/30⌋ = 20 − 10 − 6 + 3 = 7.

p = 7: Using the above argument,

\[ n(7) = \left\lfloor \tfrac{100}{7} \right\rfloor - \left\lfloor \tfrac{100}{14} \right\rfloor - \left\lfloor \tfrac{100}{21} \right\rfloor - \left\lfloor \tfrac{100}{35} \right\rfloor + \left\lfloor \tfrac{100}{42} \right\rfloor + \left\lfloor \tfrac{100}{70} \right\rfloor = 14 - 7 - 4 - 2 + 2 + 1 = 4. \]

Values for the remaining primes in S can be obtained in a similar manner. Alternatively, a simple observation shows that for all remaining primes in S the value of n(p) is 1. We know that every integer can be uniquely expressed as a product of prime numbers. For any x ∈ S, if it is a multiple of p ≥ 11, the other factor must be smaller than 11, thus divisible by a prime smaller than 11, and x cannot be counted in A(p), except when x = p.

In summary, the expected amount of money B pays to A is

\[ \sum_{p \in S,\ p \text{ is a prime}} \frac{p \times n(p)}{100} = 13.58. \]

Therefore, for d = 13.58 the game is a fair game. □

Part 2: If A pays $10 per play, and expects to gain $13.58 per play, then the amount of money that A expects to gain in ten plays is

$13.58 × 10 − $10 × 10 = $35.80.

□
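The table of n(p) and the fair fee can be recomputed by finding the smallest prime factor of every integer from 2 to 101. This is a minimal sketch; the helper `smallest_prime_factor` and the dictionary `n` are names introduced here for illustration.

```python
from fractions import Fraction

def smallest_prime_factor(x):
    """Return the smallest prime dividing x (x itself when x is prime)."""
    d = 2
    while d * d <= x:
        if x % d == 0:
            return d
        d += 1
    return x

outcomes = range(2, 102)  # the machine returns 2, 3, ..., 101

# n[p] = number of outcomes whose smallest prime factor is p.
n = {}
for x in outcomes:
    p = smallest_prime_factor(x)
    n[p] = n.get(p, 0) + 1

# Expected payment from B to A per play, and A's net over ten $10 plays.
expected_payout = Fraction(sum(p * count for p, count in n.items()), len(outcomes))
net_ten_plays = 10 * (expected_payout - 10)

print(n[2], n[3], n[5], n[7], float(expected_payout), float(net_ten_plays))
```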



Solution 14: The World Series lasts seven games if in the first six games each team wins three games. The order in which each team wins three games is immaterial. There are

\[ \frac{6!}{3! \cdot 3!} = 20 \]

possible sequences for the first six games in which each team wins three. We don’t care which team wins the 7th game.

1. Let the two teams be evenly matched. For each possible sequence of matches, the probability is (1/2)⁶. Thus the total probability is

\[ \left(\tfrac{1}{2}\right)^6 \times 20 = \frac{5}{16}. \]

2. If one team is a 3-to-2 favorite over the other, then the probability of the better team winning is 3/(2 + 3) = 3/5, and the probability of the other team winning is 1 − 3/5 = 2/5. Therefore, for each sequence of six games the probability is (3/5)³(2/5)³. Thus, the total probability is

\[ \left(\tfrac{3}{5}\right)^3 \left(\tfrac{2}{5}\right)^3 \times 20 = \frac{864}{3125}. \]

From a comparison of the two numbers above (5/16 ≈ 0.313 and 864/3125 ≈ 0.276), we conclude that if the two teams are not evenly matched, we have less chance of seeing seven games. □
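The two probabilities follow from one formula in p, the chance that a given team wins a single game. A short sketch, with the function name `pr_seven_games` chosen here for illustration:

```python
from fractions import Fraction
from math import comb

def pr_seven_games(p):
    """Probability that the first six games split 3-3 when one team
    wins each game independently with probability p."""
    return comb(6, 3) * p ** 3 * (1 - p) ** 3

even = pr_seven_games(Fraction(1, 2))
favorite = pr_seven_games(Fraction(3, 5))

print(even, favorite, favorite < even)
```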

Solution 15: In order to find two integers such that their sum is divisible by 5 we consider the following partition of {1, 2, . . . , 100}:

C1 = {1, 6, 11, . . . , 96}, C2 = {2, 7, 12, . . . , 97}, C3 = {3, 8, 13, . . . , 98},
C4 = {4, 9, 14, . . . , 99}, C5 = {5, 10, 15, . . . , 100}.

Each of the above sets has 20 elements. If one of the two numbers is selected from C1, then the other must be chosen from C4; if one of the two numbers is selected from C2, then the other must be chosen from C3; otherwise both must be selected from C5. Therefore, there are

\[ \binom{20}{1}\binom{20}{1} + \binom{20}{1}\binom{20}{1} + \binom{20}{2} = 990 \]

different ways to select two numbers from {1, 2, . . . , 100} such that their sum is divisible by five. Divide 990 by $\binom{100}{2} = 4950$, the size of the sample space, to get the probability 0.20. □
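The count of 990 can be confirmed by listing all unordered pairs, since there are only 4950 of them. A brief sketch with illustrative variable names:

```python
from fractions import Fraction
from itertools import combinations

# All unordered pairs of distinct numbers from {1, ..., 100}.
pairs = list(combinations(range(1, 101), 2))

# Pairs whose sum is divisible by 5.
good = sum(1 for a, b in pairs if (a + b) % 5 == 0)

probability = Fraction(good, len(pairs))
print(good, len(pairs), probability)
```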



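Solution 16: To prove that (T, G) is a sample space, we note that G(v) ≥ 0 for every v ∈ T and then prove that

\[ \sum_{v \in T} G(v) = 1. \qquad (9.6) \]

By the definition of G(v), the left-hand side of Equation (9.6) can be rewritten as:

\[ \sum_{v \in T} \Bigl( \sum_{a \in S,\ f(a) = v} \Pr(a) \Bigr). \qquad (9.7) \]

Every a ∈ S has exactly one value f(a) ∈ T, so each Pr(a) appears exactly once in (9.7). Hence (9.7) is equal to $\sum_{a \in S} \Pr(a) = 1$. Therefore, (T, G) is a sample space. □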

Solution 17:

1. If the balls are drawn without replacement, then the number of balls drawn to get the first black ball and the associated probabilities are:

  Number of balls   Probability
  1                 1/3
  2                 2/3 · 1/2 = 1/3
  3                 2/3 · 1/2 · 1 = 1/3

Thus, the expected number of balls drawn is 1 × 1/3 + 2 × 1/3 + 3 × 1/3 = 2.

2. If the balls are drawn with replacement, then the probabilities of choosing a white ball or a black ball do not change; they remain 2/3 and 1/3 respectively for each draw of a ball. Moreover, it is conceivable that we may never see a black ball. Suppose that n balls are drawn to see the first black ball, i.e., the black ball is picked in the nth draw. The probability for this to happen is (2/3)^{n−1} · 1/3, where (2/3)^{n−1} is the probability of drawing a white ball n − 1 times in a row and 1/3 is the probability of drawing the black ball in the nth attempt. Thus, the expected number of draws is:

\[ E = \sum_{n \ge 1} n \cdot \frac{1}{3} \cdot \left(\frac{2}{3}\right)^{n-1}. \]

Using the methods of Chapter 7, or directly from the identity $\sum_{n \ge 1} n x^{n-1} = 1/(1 - x)^2$ for |x| < 1 with x = 2/3, it can be seen that E = (1/3) × 9 = 3.

□
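Both parts of Solution 17 can be checked numerically. The sketch below enumerates the six orderings of the three balls for the case without replacement and sums the first 199 terms of the series for the case with replacement; the ball labels and the truncation point are choices made for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# Without replacement: every ordering of the three balls is equally likely,
# and the waiting time is the position of the black ball.
balls = ["B", "W1", "W2"]
orders = list(permutations(balls))
e_without = Fraction(sum(order.index("B") + 1 for order in orders), len(orders))

# With replacement: partial sum of n * (1/3) * (2/3)^(n-1); the neglected
# tail is far below the tolerance used in any practical comparison.
e_with = sum(n * (1 / 3) * (2 / 3) ** (n - 1) for n in range(1, 200))

print(e_without, round(e_with, 9))
```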



Solution 18: There are six possible values of |x − y|. These values and the associated probabilities are shown below:

  |x − y|       0      1       2      3      4      5
  Probability   6/36   10/36   8/36   6/36   4/36   2/36

For each case, we count the number of possible values of x and y, and then divide the obtained number by 36 to get the probability. We illustrate the procedure for one case: |x − y| = 3. First note that x and y behave in a symmetric manner, and second note that the possible pairs of x and y such that x < y are (1, 4), (2, 5), and (3, 6). Thus, there are six possible cases.

The expectation of winning |x − y| dollars is

\[ 0 \times \frac{6}{36} + 1 \times \frac{10}{36} + 2 \times \frac{8}{36} + 3 \times \frac{6}{36} + 4 \times \frac{4}{36} + 5 \times \frac{2}{36} = \frac{35}{18}. \]

□

Solution 19: Let the face values of the three dice be x, y and z, m = max(x, y, z), and µ = min(x, y, z). We want to find the expectation of m − µ. The six possible values of m − µ and the associated probabilities are given below.

  m − µ         0       1        2        3        4        5
  Probability   6/216   30/216   48/216   54/216   48/216   30/216

To illustrate the probability calculations, consider the cases m − µ = 5 and m − µ = 4. How can we get m − µ = 5? This is possible only if at least one of x, y and z takes the value 6 and at least one takes the value 1. Of the 6³ = 216 outcomes, 5³ contain no 6, 5³ contain no 1, and 4³ contain neither, so by inclusion and exclusion there are 216 − 125 − 125 + 64 = 30 such outcomes. Thus the probability of m − µ = 5 is 30/216 = 5/36.

Note that m − µ = 4 is possible in two cases: the largest face value is 5 and the smallest is 1, or the largest face value is 6 and the smallest is 2. Consider the second case. All three values must lie between 2 and 6, at least one must be 6, and at least one must be 2; this is possible in 5³ − 2 × 4³ + 3³ = 125 − 128 + 27 = 24 ways. Exactly the same number of possibilities exists for the other case. Hence, the probability of m − µ = 4 is 2 × 24/216 = 48/216. In general, for d ≥ 1 there are 6 − d choices of the pair (µ, m) with m − µ = d, and each contributes (d + 1)³ − 2d³ + (d − 1)³ = 6d outcomes.

The expectation is therefore

\[ \frac{1 \times 30 + 2 \times 48 + 3 \times 54 + 4 \times 48 + 5 \times 30}{216} = \frac{630}{216} = \frac{35}{12}. \]

□



Solution 20: We can choose four out of six winning games in $\binom{6}{4}$ different ways, and if the probability of winning a game is p, then the probability of winning four out of six games is $\binom{6}{4} p^4 (1 - p)^2$. Thus:

  p     probability
  0.5   0.234
  0.6   0.311
  0.4   0.138

□

Solution 21: Three computer terminals are available. If three or fewer students try to use the terminals, then no student will be waiting in line. Therefore, the answer to this problem is:

\[ 1 - \Pr(\text{3 or fewer students use the terminals}) = 1 - \left[ \binom{15}{0}(0.9)^{15} + \binom{15}{1}(0.9)^{14}(0.1) + \binom{15}{2}(0.9)^{13}(0.1)^2 + \binom{15}{3}(0.9)^{12}(0.1)^3 \right] \approx 1 - 0.9444 = 0.056. \]

□
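Solutions 20 and 21 both evaluate binomial probabilities, so the arithmetic is easy to check with a few lines of code. The sketch below assumes, as the solutions do, that the games (or the students' decisions) are independent; `binom_pmf` is a helper name introduced here.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Solution 20: lead four games to two after six games.
lead = {p: binom_pmf(6, 4, p) for p in (0.5, 0.6, 0.4)}

# Solution 21: more than three of 15 students want a terminal.
p_wait = 1 - sum(binom_pmf(15, k, 0.1) for k in range(4))

for p, value in lead.items():
    print(p, round(value, 3))
print(round(p_wait, 3))
```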

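Solution 22: The expected number of out-of-state cars in a 12-minute period is

\[ \lambda = \frac{15}{60/12} = 3. \]

Using the formula $\Pr(k) = e^{-\lambda} \frac{\lambda^k}{k!}$, we get Pr(4) = 0.168. □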
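The Poisson value can be reproduced as follows. This is a small sketch; `poisson_pmf` is a helper name chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Pr(k) = e^(-lam) * lam^k / k! for a Poisson random variable."""
    return exp(-lam) * lam ** k / factorial(k)

# 15 cars per hour corresponds to 15 * (12 / 60) = 3 cars per 12 minutes.
lam = 15 * 12 / 60
p_four = poisson_pmf(4, lam)

print(lam, round(p_four, 3))
```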

Solution 23:

1. To prove that (S, Pr) is a sample space, we first check that for all s ∈ S, 0 ≤ Pr(s) ≤ 1. Then

\[ \sum_{s \in S} \Pr(s) = \frac{1}{4} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{12} = 1. \]

Therefore, (S, Pr) is a sample space. □



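2. Let {1, 2, 6} and {2, 4} be two events.

\[ \Pr(\{1, 2, 6\} \cap \{2, 4\}) = \Pr(\{2\}) = \frac{1}{6}, \]

\[ \Pr(\{1, 2, 6\}) = \frac{1}{4} + \frac{1}{6} + \frac{1}{12} = \frac{1}{2}, \]

\[ \Pr(\{2, 4\}) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]

Since

Pr({1, 2, 6} ∩ {2, 4}) = Pr({1, 2, 6}) Pr({2, 4}),

events {1, 2, 6} and {2, 4} are independent. □

3. f and g are not independent random variables for the following reasons:

\[ \Pr(\{s \mid f(s) = 0 \ \&\ g(s) = 3\}) = \Pr(\{2\}) = \frac{1}{6}, \]

\[ \Pr(\{s \mid f(s) = 0\}) = \Pr(\{2, 4, 6\}) = \frac{1}{6} + \frac{1}{6} + \frac{1}{12} = \frac{5}{12}, \]

\[ \Pr(\{s \mid g(s) = 3\}) = \Pr(\{1, 2, 3\}) = \frac{1}{4} + \frac{1}{6} + \frac{1}{6} = \frac{7}{12}, \]

and 1/6 ≠ 5/12 × 7/12.

□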

Solution 24:

1.

\[ E(f + g) = \sum_{s \in S} (f + g)(s) \cdot \Pr(s) = (f(1) + g(1)) \cdot \Pr(1) + \cdots + (f(6) + g(6)) \cdot \Pr(6) \]

\[ = 4 \cdot \frac{1}{4} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} + 7 \cdot \frac{1}{6} + 6 \cdot \frac{1}{12} = \frac{29}{6}. \]

2. Since f and g are not independent, we cannot use the equality

Var(f + g) = Var(f) + Var(g).

We have to calculate Var(f + g) directly, using Theorem 9.8:

\[ \mathrm{Var}(f + g) = E((f + g)^2) - (E(f + g))^2. \]



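\[ E((f + g)^2) = \sum_{s \in S} (f + g)^2(s) \cdot \Pr(s) = (f(1) + g(1))^2 \cdot \Pr(1) + \cdots + (f(6) + g(6))^2 \cdot \Pr(6) \]

\[ = 4^2 \cdot \frac{1}{4} + 3^2 \cdot \frac{1}{6} + 4^2 \cdot \frac{1}{6} + 6^2 \cdot \frac{1}{6} + 7^2 \cdot \frac{1}{6} + 6^2 \cdot \frac{1}{12} = \frac{304}{12}. \]

Therefore,

\[ \mathrm{Var}(f + g) = \frac{304}{12} - \frac{29^2}{6^2} = \frac{71}{36}. \]

□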
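The computations of Solutions 23 and 24 can be replayed with exact fractions. The sketch below encodes the table from Problem 23; the helper names `P` and `E` are illustrative shorthand for event probability and expectation.

```python
from fractions import Fraction as F

# Table from Problem 23.
pr = {1: F(1, 4), 2: F(1, 6), 3: F(1, 6), 4: F(1, 6), 5: F(1, 6), 6: F(1, 12)}
f = {1: 1, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0}
g = {1: 3, 2: 3, 3: 3, 4: 6, 5: 6, 6: 6}

def P(event):
    """Probability of an event given as a set of elementary events."""
    return sum((pr[s] for s in event), F(0))

def E(h):
    """Expected value of a random variable h defined on S."""
    return sum(h(s) * pr[s] for s in pr)

# Solution 23, part 2: the events {1, 2, 6} and {2, 4}.
A, B = {1, 2, 6}, {2, 4}
events_independent = P(A & B) == P(A) * P(B)

# Solution 23, part 3: test every pair of values of f and g.
rv_independent = all(
    P({s for s in pr if f[s] == a and g[s] == b})
    == P({s for s in pr if f[s] == a}) * P({s for s in pr if g[s] == b})
    for a in (0, 1)
    for b in (3, 6)
)

# Solution 24: mean and variance of f + g.
e_sum = E(lambda s: f[s] + g[s])
var_sum = E(lambda s: (f[s] + g[s]) ** 2) - e_sum ** 2

print(events_independent, rv_independent, e_sum, var_sum)
```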

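Solution 25: The odds that the program will run are 2:1. Therefore, Pr(a program will run) = 2/3. Let B denote the event that four or more programs will run and A_j denote the event that exactly j programs will run. Then,

\[ \Pr(B) = \Pr(A_4 \cup A_5 \cup A_6) = \Pr(A_4) + \Pr(A_5) + \Pr(A_6) \]

\[ = \binom{6}{4} \left(\tfrac{2}{3}\right)^4 \left(\tfrac{1}{3}\right)^2 + \binom{6}{5} \left(\tfrac{2}{3}\right)^5 \tfrac{1}{3} + \binom{6}{6} \left(\tfrac{2}{3}\right)^6 = \frac{240 + 192 + 64}{729} = \frac{496}{729} \approx 0.6804. \]

□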
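The three binomial terms of Solution 25 can be summed exactly. A brief sketch, assuming the six attempts are independent with run probability 2/3; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(2, 3)  # probability that a single attempt runs

# Pr(exactly j of 6 attempts run) for j = 4, 5, 6.
terms = {j: comb(6, j) * p ** j * (1 - p) ** (6 - j) for j in (4, 5, 6)}
pr_b = sum(terms.values())

print(terms, pr_b, float(pr_b))
```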

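Solution 26: There are $\sum_{1 \le k \le n} k = \frac{n(n+1)}{2}$ counters. The probability of choosing a counter marked k² and receiving the amount k² is $\frac{2k}{n(n+1)}$, for k = 1, 2, . . . , n. Hence, the expected amount received is

\[ \sum_{1 \le k \le n} \frac{2k^3}{n(n + 1)} = \frac{2 n^2 (n + 1)^2}{4 n (n + 1)} = \frac{n(n + 1)}{2}, \]

which equals the number of counters. □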
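The identity in Solution 26 can be spot-checked for small n. The sketch below builds the expectation from the definition rather than from the closed form; `expected_payment` is a name introduced here.

```python
from fractions import Fraction

def expected_payment(n):
    """Expectation when there are k counters marked k*k for k = 1..n."""
    total_counters = sum(range(1, n + 1))
    return sum(Fraction(k, total_counters) * k * k for k in range(1, n + 1))

# Compare with the number of counters, n(n+1)/2, for several n.
for n in (1, 2, 3, 10):
    print(n, expected_payment(n), n * (n + 1) // 2)
```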

Solution 27: For this problem

\[ \mu = 10{,}000 \times \frac{1}{2} = 5{,}000 \quad \text{and} \quad \sigma = \sqrt{10{,}000 \times \frac{1}{2} \times \frac{1}{2}} = \frac{100}{2} = 50, \]

and 1 − 1/k² = 0.99 yields k = 10. Thus, by Chebyshev’s inequality, we know that the probability is at least 0.99 that we will get between 5,000 − 10 × 50 = 4,500 and 5,000 + 10 × 50 = 5,500 heads. Hence, the probability is at least 0.99 that the proportion of heads will fall between 4500/10000 = 0.45 and 5500/10000 = 0.55. □


Part III

Appendices

Appendix A

Loop Invariance

The formal proof of correctness of algorithmic programs is an elusive problem in mathematics, mainly because of the loop structure that can be found in almost every practical programming language. The difficulty is simply that, in general, we don’t know how many times a loop will be executed. It is also clear that we cannot enumerate all possible input data for testing our programs. Mathematical induction is a natural way to solve this problem. However, when the loop becomes sophisticated, the proof becomes unacceptably cumbersome. To overcome this problem, the theorem of loop invariant was introduced. In fact, the idea behind the theorem of loop invariant does not go beyond mathematical induction, but it does simplify the proof of a program’s correctness.

Literally, a loop invariant of a loop means something that remains unchanged during the execution of the loop, no matter how many times, even forever, the loop is executed. Unfortunately, there are uncountably many mathematical truths that are always true no matter how you write your loops. For example: 2 = 2, 1 ≤ 2, |x + y| ≤ |x| + |y|, etc. By definition, they are all loop invariants, but mostly useless. Among those invariants, there are some that can help us prove the correctness of a given program. Finding a useful one can be difficult, and the topic is far beyond the scope of this note; we would have to spend an entire semester in Semantics of Formal Methods on it.

The purpose of this handout is just to give you the concept of loop invariant, how to express a condition in terms of a logical proposition, how to prove that a given condition is a loop invariant of a program, and how to use the invariant to prove the correctness of a program, i.e., we want to be mathematically sure that a program does what it is supposed to do. This is an excellent example of applying mathematical induction to computer science.


We will start with a proof of correctness of a given program containing a loop. Then, given a proposition, we will prove, again by mathematical induction, that the proposition is a loop invariant of the program. In other words, we will prove that the given proposition is always true no matter how many times the loop is executed. Afterward, we will make use of the loop invariant to prove the correctness of the program. Finally, we will abstract the concept of loop invariant into a theorem, and prove by mathematical induction that the theorem can be applied to all loops. Once we accept the theorem, we can prove the correctness of any given loop without using mathematical induction explicitly.

§§§

Consider the program in the following flowchart. We want to prove that if the program ever halts, then ans = x₀ + y₀.

[Flowchart]

Start
x ← x₀; y ← y₀
a: test x ≠ 0
    true: x ← x − 1; y ← y + 1; go back to a
    false: ans ← y; Halt
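For readers who prefer code to flowcharts, the program can be transcribed as follows. This is a minimal Python sketch, not the authors' notation: the function name `add_by_counting` is hypothetical, and it assumes x0 is a natural number, since otherwise the loop never reaches x = 0.

```python
def add_by_counting(x0, y0):
    """Transcription of the flowchart; assumes x0 is a natural number."""
    x, y = x0, y0
    while x != 0:            # the test performed at point a
        x, y = x - 1, y + 1  # loop body: x <- x - 1, y <- y + 1
    ans = y                  # executed once the test fails
    return ans


# Spot-check the claim ans = x0 + y0 on a few inputs.
for x0 in range(6):
    for y0 in range(-3, 4):
        assert add_by_counting(x0, y0) == x0 + y0
```

Testing a handful of inputs only illustrates the claim; it cannot cover every possible input, which is exactly why a proof is needed.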

Since the value of x is the only factor that changes the computational path of the program, and its value is initialized to x₀, we will prove the correctness of this program by induction on the value of x. Without loss of generality, suppose x ranges over ω.¹ Consider the following predicate.

S(n) : If the program reaches point a with any value of y and x = n, then the program halts with ans = x + y.

If we can prove that S(n) is true for all n ∈ ω, then given any values for x₀ and y₀, the program will reach point a for the first time with x = x₀ and y = y₀, and since S(x₀) is true, we can conclude that the program will halt with ans = x + y = x₀ + y₀. Therefore, to prove that the program is correct, it suffices to prove that S(n) is true for all n ∈ ω.

¹ ω = {0, 1, 2, . . .}.

Inductive Base: S(0)

If the program reaches point a with any value of y and x = 0, then the next step is the test x ≠ 0, which is false. Then ans ← y will be executed and the program will halt with ans = y = 0 + y = x + y. Therefore, S(0) = true. The inductive base holds.

Inductive Hypothesis: Suppose S(n) = true.

Inductive Step: S(n + 1)

Suppose that the program reaches point a at the moment t₀ with any value of y and x = n + 1, and suppose y = k at the moment t₀. Since n + 1 ≠ 0, the test x ≠ 0 will be followed by the execution of x ← x − 1 and y ← y + 1. Then the program will reach point a again at the moment t₁ with new values of x and y, i.e., x = n and y = k + 1. From the inductive hypothesis that S(n) is true, we know that the program will halt with ans = x + y, where x and y have the values at the moment t₁, i.e., x = n and y = k + 1. Thus, the program will halt with ans = x + y = n + (k + 1) = (n + 1) + k. Back at the moment t₀, we can claim that the program will halt with ans = x + y, where x = n + 1 and y = k. Therefore, S(n + 1) = true. □

§§§

Loop Invariant : x + y = x₀ + y₀.

We want to prove that x + y = x₀ + y₀ is a loop invariant of the program. In other words, we want to prove that x + y = x₀ + y₀ is always true no matter how many times the loop is executed.

L(n) : If the program reaches point a after the loop has been executed n times, then x + y = x₀ + y₀.

Inductive Base: L(0)

If the loop has not been executed, then x = x₀ and y = y₀. Therefore, x + y = x₀ + y₀. L(0) = true. The inductive base holds.

Inductive Hypothesis: Suppose L(n) = true.

Inductive Step: L(n + 1)

Suppose at the moment t₀, the loop has been executed n times. Since L(n) is true, x + y = x₀ + y₀. Suppose also that at this moment t₀, x = k and y = l. If the loop is executed one more time, then x will be decreased by 1 and y will be increased by 1, and then the program will reach point a at the moment t₁. At this moment t₁, x + y = (k − 1) + (l + 1) = k + l = x₀ + y₀. Therefore, L(n + 1) = true. □



§§§

Now, let’s see how to use the loop invariant to prove the correctness of the program. You will see that the proof is simplified a lot.

The idea is very simple. When the program reaches the beginning of the loop (i.e., point a), we have the condition x + y = x₀ + y₀, and we know that it is a loop invariant. Eventually, if the program ever exits the loop, the proposition of the loop invariant is still true, and in addition the condition to exit the loop must be true. In our program, the exit condition is x = 0. Therefore, y = 0 + y = x + y = x₀ + y₀. After ans ← y, the program will halt with ans = x₀ + y₀. The program is correct. □

§§§

Let’s generalize the idea. Consider the following flowchart of a standard loop in some program.

[Flowchart]

Start
a: test B
    true: execute S; go back to a
    false: exit the loop at point b

Let P and B be two propositions, where ¬B serves as the exit condition of the loop, i.e., when B is not true, the program will exit the loop. Let P have the following property:

𝒫: If (P ∧ B) is true before the program fragment S is executed, then P is still true after S is executed.

We have the following lemma.

Lemma: For all n ∈ ω, I(n) is true, where I(n) is defined as,

I(n) : If P is true when the loop is entered, then after the loop has been executed n times, if ever, P is still true.

Inductive Base: I(0)

If P is true at the beginning of the loop and the loop has not been executed, then P is true, because nothing has been done. Therefore, I(0) = true. The inductive base holds.



Inductive Hypothesis: Suppose I(n) = true.

Inductive Step: I(n + 1)

Suppose P is true at the beginning of the execution of the loop, and suppose the loop has been executed n times at the moment t₀. Since I(n) is true, P is true at the moment t₀. If the loop is executed one more time, that means B is true at the moment t₀. Therefore, (P ∧ B) is true, and because of the property 𝒫, P is still true after the loop has been executed one more time. Therefore, I(n + 1) = true. □

The above lemma proves that, if the proposition P has the property 𝒫 with respect to any loop of any algorithmic program, then P is a loop invariant of the loop. Now, we are in a position to state the following theorem.

Theorem: Suppose P has the property 𝒫. If P is true when the loop is entered, then (P ∧ ¬B) is true when the loop is exited, if ever.

Proof: If the loop is exited, then the loop has been executed finitely many times, say n times. From the lemma, P is true. And B must be false to exit the loop. Therefore, (P ∧ ¬B) is true when the loop is exited. □
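The theorem can be mirrored by a small runtime harness that checks the hypotheses and the conclusion on one concrete run. The following Python sketch is only an illustration: `run_loop` and its parameters `state`, `B`, `S`, and `P` are hypothetical names chosen to match the symbols above, and a passing run is evidence for a single input, not a proof.

```python
def run_loop(state, B, S, P):
    """Run `while B(state): state = S(state)`, checking P as a loop invariant."""
    assert P(state), "P must hold when the loop is entered"
    while B(state):
        # Here P and B both hold, so S must preserve P.
        state = S(state)
        assert P(state), "S failed to preserve P"
    # Conclusion of the theorem: P and not B hold at the exit.
    assert P(state) and not B(state)
    return state


# The example program of this appendix, with state = (x, y).
x0, y0 = 7, 5
final = run_loop(
    (x0, y0),
    B=lambda s: s[0] != 0,               # loop test x != 0
    S=lambda s: (s[0] - 1, s[1] + 1),    # x <- x - 1, y <- y + 1
    P=lambda s: s[0] + s[1] == x0 + y0,  # invariant x + y = x0 + y0
)
ans = final[1]                           # ans <- y
print(final, ans)
```

Passing the test, the body, and the invariant as functions keeps the harness independent of any particular loop, which matches the generality of the theorem.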

Conclusion:

Let’s go back to the program, and see how easy it is to use the theorem to prove the correctness of the program. Given x + y = x₀ + y₀, we verify that the following property is satisfied.

𝒫: If ((x + y = x₀ + y₀) ∧ (x ≠ 0)) is true before the program fragment (x ← x − 1; y ← y + 1) is executed, then (x + y = x₀ + y₀) is still true after the fragment is executed.

The property holds because (x − 1) + (y + 1) = x + y. And, since x + y = x₀ + y₀ is true when the loop is entered, by the theorem ((x + y = x₀ + y₀) ∧ ¬(x ≠ 0)) is true when the loop is exited. Thus, x + y = 0 + y = y = x₀ + y₀ is true when the loop is exited. Therefore, after ans ← y is executed, ans = x₀ + y₀. □


Appendix B

Sample Quizzes

Quiz 1

Problem I. (20 points)

Let $X_b$ denote the base b representation of an integer.

1. $1234_5 + 221_7 = X_8$. Find X.

2. $711_8 + 100010110_2 = X_2$. Find X.

Problem II. (20 points)

Let gcd(a, b) denote the greatest common divisor of a and b.

1. a = 65536, b = $\sum_{i=1}^{32} i$. What is gcd(a, b)?

2. Find x and y such that, 75x + 30y = gcd(75, 30).

Problem III. (30 points)

1. (10 points) Give a counterexample to disprove each of the following,

(a) gcd(a, b) = d ⇒ gcd(a, db) = d.

(b) gcd(a, b) = d ⇒ gcd(a/d, b) = 1.

2. (20 points) Let gcd(a, b) = d, and a = da′, b = db′. Prove that

gcd(a′, d) = 1 ⇒ gcd(a, db) = d.

[hint: Use Lemma 11.1, page 239.]


Problem IV. (30 points)

Let t ∈ R, and t ∉ Z. Prove or disprove that

$\lfloor |t| \rfloor = \begin{cases} |\lceil t \rceil| - 1 & \text{if } t > 0; \\ |\lfloor t \rfloor| - 1 & \text{if } t < 0. \end{cases}$

Quiz 2

Problem I. (10 points) Let ϕ be the Euler phi function defined in the textbook. Find the values of the following two.

1. (5 points) ϕ(160).

2. (5 points) ϕ(400).

Problem II. (15 points) Find all solutions of

24x ≡ 36 (mod 66).

Problem III. (15 points) Use Fermat’s theorem (P.282), proposition 3.18 (P.265), and Euler’s theorem (P.285) to prove that

$29^{72} \equiv 1 \pmod{210}$.

Problem IV. (20 points) Find the least positive residue of 8470 (mod 13).

Problem V. (20 points) Let ϕ be the Euler phi function defined in the textbook. Prove that, for n ∈ N, if 3|n, then ϕ(2ϕ(n)) = 2ϕ(ϕ(n)).

[Hint: homework problem 48]

Problem VI. (20 points) Find all the solutions of

x ≡ 2 (mod 5)
x ≡ 3 (mod 6)
x ≡ 4 (mod 7)

Quiz 3

Problem I : (20 points) Define function f : R → R as the following.

$f(x) = (1 + x)^{20}$.

Find the coefficient of $x^{15}$ in the expansion of $f(x - \frac{1}{2})$.



[Note: Simplify your answer. Do not leave any binomial coefficients in your answer. For example, you have to find the value, 6, for $\binom{4}{2}$. ]

Problem II : (20 points) Let n ∈ R, prove that

$\sum_{0 \le k} \binom{n}{2k} = \sum_{0 \le k} \binom{n}{2k+1} = 2^{n-1}$.

Problem III : (20 points) Suppose we want to construct strings of length 7 consisting of letters in {a, b, c, d, e, f}, and those strings must satisfy

1. start with ba,

2. have no two or more consecutive a’s.

How many such strings can we have?

[Note: Simplify your answer. Do not leave any binomial coefficients in your answer. ]

Problem IV : (20 points) Consider the following figure.

[Figure: a grid of city blocks with the points A, B, C, and D marked.]

B is 3 blocks east and 1 block north from A.
C is 3 blocks east and 2 blocks north from B.
D is 4 blocks east and 2 blocks north from C.

Suppose we want to move from A to D via B or C but not both, and we are only allowed to move north or east. How many different routes can we take?



[Note: You can leave binomial coefficients in your answer, but you have to explain why you propose them. ]

Problem V : (20 points) How many different terms are there in the expansion of

(x + y + z)⁷?

For example, (x + y + z)² = x² + y² + z² + 2xy + 2yz + 2xz. Thus, there are 6 different terms in the expansion of (x + y + z)².

[Note: Simplify your answer. Do not leave any binomial coefficients in your answer. ]

Quiz 4

Problem I : (15 points) Let x ∈ R, and m ∈ N. Prove that,

$\left\lceil \frac{\lceil x \rceil}{m} \right\rceil = \left\lceil \frac{x}{m} \right\rceil$.

Problem II : (20 points)

1. (10 points) Find integers u and v such that,

126u + 150v = gcd(126, 150).

2. (10 points) Find the integer solutions of the following equation.

22x ≡ 12 (mod 30).

Problem III : (15 points)

1. (5 points) What is the coefficient of x³ in the expansion of (1 + 3x)¹⁵?

[Note: You can leave the factorial operator but not binomial coefficients, i.e., $\binom{n}{k}$, in your answer. ]

2. (10 points) Let n, k ∈ N. Prove that

$\binom{1}{k} + \binom{2}{k} + \cdots + \binom{n}{k} = \binom{n+1}{k+1}$



Problem IV : (15 points)

1. (5 points) 10 students A, B, C, D, . . . , J line up randomly. What is the probability that A, B, C, D are consecutive in the line?

2. (10 points) How many nonnegative integer solutions are there for the following equation?

a + b + c + d = 15, where 2 ≤ d ≤ 10.

[Note: For both sub-problems, you can leave the factorial operator but not binomial coefficients, i.e., $\binom{n}{k}$, in your answers. ]

Problem V : (15 points) Define the following sets.

A = {1, 2, 3, 4, 5, 6, 7, 8}
B = {1, 3, 5, 7}
C = {2, 4, 6, 8}

1. (4 points) How many surjections are there from B to C?

2. (5 points) How many surjections are there from A to B?

3. (6 points) Suppose we pick 6 numbers from A and 2 numbers from B, and arrange them into a string. How many different strings can we have?

[Note: Please find out the values. ]

Problem VI : (20 points) Let S = {1, 2, 3, 4, 5, 6}. Define Pr, f and g on S in the following table.

s    Pr(s)    f(s)    g(s)
1    1/4      1       3
2    1/6      0       3
3    1/6      1       3
4    1/6      0       6
5    1/6      1       6
6    1/12     0       6

1. (3 points) Prove that (S, Pr) is a sample space.

2. (3 points) Let {1, 2, 6} and {2, 4} be two events. Are they independent? Why?

3. (3 points) Are f and g two independent variables on (S,Pr)? Why?

4. (4 points) Find E(f + g).


5. (7 points) Find Var(f + g).

Quiz 5

Problem I : (25 points) Prove that, if x and y are odd integers, then there is no integer a such that x² + y² = a². [hint: a = 2k + 1 or a = 2k.]

Problem II : (30 points) The Chinese remainder theorem says that if m₁, m₂, . . . , mₙ are relatively prime to each other, then there must exist solutions for the system below, but the theorem does not say that if m₁, m₂, . . . , mₙ are not relatively prime to each other, then there are no solutions. Consider

x ≡ b₁ (mod m₁)
x ≡ b₂ (mod m₂)
...
x ≡ bₙ (mod mₙ)

Suppose m₁, m₂, . . . , mₙ are not relatively prime to each other, and we somehow have a solution x₀ for the above system. Then,

1. What are the remaining solutions?

2. Explain your answer.

Problem III : (20 points) Explain why, if gcd(a, m) ∤ b, then ax ≡ b (mod m) has no solution.

Problem IV : (25 points) Use the binomial theorem to prove that, for all x, −1 < x < 1,

$\frac{1}{1 - x^2} = 1 + x^2 + x^4 + x^6 + x^8 + \cdots$

[Hint: 1 − x² = (1 + x)(1 − x). ]

Quiz 6



Problem I : (25 points) Prove that, for all x, −1 < x < 1,

$\frac{1 + x^2}{(1 + x)^2 (1 - x)^2} = \sum_{0 \le k} (2k + 1) x^{2k}$.

[Hint: Use the binomial theorem to find the power series of $\frac{1}{(1+x)^2}$ and $\frac{1}{(1-x)^2}$ first. ]

Problem II : (25 points) Prove that, for all x, −1 < x < 1,

$\frac{1/2}{(1 + x)^2} + \frac{1/2}{(1 - x)^2} = \sum_{0 \le k} (2k + 1) x^{2k}$.

[Hint: Use the binomial theorem to find the power series of $\frac{1}{(1+x)^2}$ and $\frac{1}{(1-x)^2}$ first. ]

Problem III : (25 points) Prove that, for all k ∈ N,

$(k!)!$ is divisible by $(k!)^{(k-1)!}$.

Problem IV : (25 points)

