
DISCRETE MATHEMATICS THROUGH

GUIDED DISCOVERY:

CLASSNOTES FOR MATH 399—Fall 2004

Kenneth P. Bogart¹

Department of Mathematics, Dartmouth College

Mary E. Flahive²

Department of Mathematics, Oregon State University

June 6, 2005

¹ This author was supported by National Science Foundation Grant Number DUE-0087466 for his development of the original notes.

² This author is supported by National Science Foundation Grant Number DUE-0410641 for her adaptation of the original notes.

Preface

Dear Math 399 Student:

These notes serve as an introduction to discrete mathematics. The notes are a work in progress, an adaptation for our class of a book by Dr. Kenneth Bogart, Dartmouth College. The federal government is supporting both the development of the original notes and this adaptation.

You’ve already read the words Guided Discovery in the title. These notes consist almost entirely of problems. During a usual class period you will work in groups on problems, and I’ll walk around listening. Most of the time my guidance will take the form of a question, and sometimes a hint; either one might at first seem unrelated to the problem. I do this because I’m to be a guide. The problems in these notes (and my perhaps cryptic hints) are designed to lead you to discover for yourself and to prove for yourself. There is considerable evidence that this leads to deep learning and understanding, and that’s the reason for the math department’s support of this endeavor. We will discuss the format of the course more on Monday. Please read on so you can learn more about this philosophy of learning.

Much of your experience in mathematics courses before this term probably has had the following flavor: For every problem there is a method that has already been taught in the class, and your job is to figure out which method applies and then to apply it. That is not the case here, and it is also not the case in the companion advanced calculus course. Here you will discover the method for yourself, usually based on some simplified examples. Then in later problems you may recognize a pattern that suggests you might try to use the method again or to try a modified method. We know this is new for many of you. The point of learning in this way is that you are learning how to discover ideas and methods for yourself, and not just applying methods that someone else has told you about. And that is what doing mathematics is really about.

Another result of the way more elementary math is learned is the belief that if a problem can’t be solved in ten or twenty minutes, then it can’t be solved at all. There will be problems in this book that take hours of hard thought. Many of these problems were first conceived and solved by professional mathematicians, and they spent days or weeks on them. But then how can you be expected to solve them at all? The difference is that the solution is already known to the author and the notes have been constructed to give you a context in which to think. Even though some of the problems are so open-ended that you might start them without any idea of the answer, the context and the leading examples that precede them have prepared a structure for you to work within. That doesn’t mean you’ll get them right away, but you will get a solid satisfaction when you see what you can figure out with concentrated thought. Besides, you will be working with the others in your group and you can ask me for hints, since this is guided discovery.

There’s also another way in which I function as a guide: Each week I will assign some problems to be collected the next Monday, and I will carefully read some of these written solutions and comment on many of them. You should try to write up answers to all the problems that you work on, even if I haven’t asked you to hand them in. In your writeup, if you claim something is true, you must explain why it is true; that is, you should prove it. When you write up a problem, remember that the reader has to be able to “get” your ideas and understand exactly what you are saying from the words you have written. For this to work, it might be best either to forget that I probably know how to do the problem or to think that I’ve never seen your particular solution.

One further caution: Don’t expect the wording of the problem to suggest the answer, because these problems aren’t designed that way. If you later have a job which uses mathematics, the problems are going to be open-ended because nobody will have done them before. So, working on open-ended problems now should help to prepare you to do mathematics and to apply mathematics in other areas later on.

As you print out more of the book, you will see that it’s divided into three parts: Course Notes, Supplementary Sections, and Review Material. The Course Notes are divided into chapters, and I will guide your progress so that most of you will begin each chapter at the same time. Some of you will work quickly, and those students will work on additional problems in the Supplementary Sections. Others of you might find the Review Material helpful to remedy some deficiency in your math background. Many of you will proceed linearly through the Course Notes (also returning to problems which at first might stump you).

Above all, this book is dedicated to the principle that doing mathematics is fun. As long as you know that some of the problems are going to require more than one attempt before you hit on the main idea, you can relax and enjoy your successes. And also you can trust that as you work more and more problems and share ideas within your group, many problems that at first seemed intractable will later become a source of satisfaction.

Mary Flahive

September 2004


Contents

Preface

I COURSE NOTES

1 Beginning Combinatorics
    1.1 What is Combinatorics?
    1.2 Basic Counting Principles
    1.3 Functions and their Directed Graphs
    1.4 The Pigeonhole Principle
    1.5 The Bijection Principle and Counting Subsets
    1.6 Additional Problems for Chapter 1

2 Product Principle, Revisited
    2.1 The General Product Principle
    2.2 Counting Functions
    2.3 The Quotient Principle
    2.4 Equivalence Relations
    2.5 Equivalence Classes
    2.6 Additional Problems for Chapter 2

3 Bijection Principle, Revisited
    3.1 Lattice Paths and Catalan Numbers
    3.2 Undirected Graphs
    3.3 Trees
    3.4 Labelled Trees and Prüfer Codes
    3.5 Additional Problems for Chapter 3

4 Inductive Reasoning in Discrete Mathematics
    4.1 The Principle of Mathematical Induction
    4.2 Inductive Definitions
    4.3 Recurrences
    4.4 Proof of the General Product Principle
    4.5 Spanning Trees
    4.6 Shortest Paths in Graphs
    4.7 Additional Problems for Chapter 4

5 Distribution Problems
    5.1 The Idea of Distributions
    5.2 Ordered-functions
    5.3 Multisets and Compositions of integers
    5.4 Broken Permutations and Lah Numbers
    5.5 Additional Problems for Chapter 5

6 Generating Functions
    6.1 Using Pictures to Visualize Counting
    6.2 Generating Functions
        6.2.1 Generating polynomials
        6.2.2 Generating functions
    6.3 Generating Functions and Recurrences
    6.4 Additional Problems for Chapter 6

7 Recurrences, Revisited
    7.1 Rabbits
    7.2 The Method of Partial Fractions
    7.3 Nonnegative recurrences
    7.4 Additional Exercises for Chapter 7

8 The Principle of Inclusion and Exclusion
    8.1 The Size of a Union of Sets
        8.1.1 Unions of two or three sets
        8.1.2 Unions of an arbitrary finite number of sets
    8.2 The Principle of Inclusion and Exclusion
    8.3 Applications of Inclusion and Exclusion
        8.3.1 Multisets with restricted numbers of elements
        8.3.2 The Ménage Problem
        8.3.3 Counting onto functions
    8.4 The chromatic polynomial of a graph
        8.4.1 Deletion-Contraction and Chromatic Polynomials
    8.5 Additional Problems for Chapter 8

II SUPPLEMENTARY SECTIONS

1 Ramsey Numbers
    1.1 The Generalized Pigeonhole Principle
    1.2 Ramsey Numbers
    1.3 The Existence of Ramsey Numbers
    1.4 A Bit of Asymptotic Combinatorics

2 Permutation Groups
    2.1 The rotations of a square
    2.2 Groups of permutations
    2.3 The symmetric group
    2.4 The dihedral group
    2.5 The cycle decomposition of a permutation
    2.6 Additional Exercises for Supplementary Chapter 2

3 Group Actions
    3.1 Groups acting on colorings of sets
    3.2 Orbits
    3.3 The Cauchy-Frobenius-Burnside Theorem
    3.4 Pólya-Redfield Enumeration Theory
    3.5 The Orbit-Fixed Point Theorem
    3.6 The Pólya-Redfield Theorem
    3.7 Additional Exercises for Supplementary Chapter 3

III REVIEW MATERIAL

A More on Functions and Digraphs
    A.1 Functions
    A.2 Digraphs

B More on Equivalence Relations

C More on the Principle of Mathematical Induction

Part I

COURSE NOTES


Chapter 1

Beginning Combinatorics

As you glance at the pages of these notes you will see that some of the problems are marked with symbols, although many problems might not be marked in this way. A summary of the meaning of the symbols appears below.

· essential for this or the next section
• essential
◦ motivational material
+ summary
→ especially interesting
∗ difficult

1.1 What is Combinatorics?

Combinatorial mathematics arises from studying how we can combine objects into arrangements. For example, we might be combining sports teams into a tournament, samples of tires into groups for testing on cars, students into classes to compare approaches to teaching a subject, or members of a tennis club into pairs to play tennis. There are many questions that can be asked about such arrangements of objects. Here we will focus on questions about how many ways we may combine the objects into arrangements of the desired type. These are called counting problems.

Sometimes combinatorial mathematicians ask if a certain arrangement is possible. For instance, if we have ten baseball teams and each team has to play each other team once, can we schedule the whole series if we only have the fields available at enough times for forty games? Sometimes combinatorial mathematicians ask if all the arrangements we might be able to make have a certain desirable property. For instance, do all ways of testing five brands of tires on five different cars compare each brand with each other brand on at least one common car?

Counting problems (and problems of the other sorts described above) arise throughout physics, biology, computer science, statistics, and many other subjects. If we wanted to demonstrate all these relationships we would have to take detours into all of these subjects. Instead, although we will give some important applications, we will usually phrase our discussions around either everyday experience or mathematical experience so that you won’t have to learn a new context before learning the mathematics.

1.2 Basic Counting Principles

◦1. Five schools plan to send their baseball teams to a tournament in which each team must play each other team exactly once. How many games must be played?

•2. Now some number n of schools plan to send their baseball teams to a tournament in which each team must play each other team exactly once. Let us think of the teams as numbered 1 through n.

(a) How many games does Team 1 have to play?

(b) How many additional games (other than the one with Team 1) does Team 2 have to play?

(c) How many additional games (other than those with the first i − 1 teams) does Team i have to play?

(d) In terms of your answers to the previous parts of this problem, what is the total number of games that must be played?

Hint. If you have trouble doing this problem for general n, work on n = 6 before studying the general n.

•3. One of the schools sending its team to the tournament has to travel some distance, and so the school is making sandwiches for team members to eat along the way. There are three choices for the kind of bread and five choices for the kind of filling. How many different kinds of sandwiches are available?

+4. An ordered pair (a, b) consists of two “members” (or “coordinates”) which we have labelled here as a and b. We say a is the first member of the pair and b is the second member of the pair. If M is an m-element set and N is an n-element set, how many ordered pairs are there whose first member is in M and whose second member is in N? Review your solution to the previous problems from this perspective.

◦5. Since a sandwich by itself is pretty boring, students from the school in Problem 3 are offered a choice of a drink (from among five different kinds), a sandwich (with the choices as in Problem 3), and a fruit (from among four different kinds). In how many ways may a student make a choice of a lunch, if every lunch is a choice of these three items?

•6. The coach of the team in Problem 3 knows of an ice cream shop along the way where she plans to stop to buy each team member a triple-decker cone. The store offers 12 different flavors of ice cream, and triple-decker cones are made only in homemade waffle cones. (Here we’re allowing repeated flavors; in fact, a triple-decker with three scoops of the same flavor is even possible. Be sure to count Strawberry, Vanilla, Chocolate as different from Chocolate, Vanilla, Strawberry, etc.)

(a) How many possible triple-decker cones will be available to the team members?

(b) How many triple-deckers have three different kinds of ice cream?

•7. The idea of a function is ubiquitous in mathematics. A function f from a set S to a set T is a relationship between the two sets that associates to each element x in the set S exactly one member f(x) in the set T. We will come back to the ideas of function and relationship in more detail and from different points of view from time to time. This quick review should probably allow you to answer this question.

(a) Using f, g, . . . to stand for various functions, list all the different functions from the set {1, 2} to the set {a, b}. For example, you might start with the function f given by

f(1) = a and f(2) = b.

(b) Let’s look at the last part in a different way. Instead of asking for a list of all the functions, suppose we simply asked how many functions there are from the set {1, 2} to the set {a, b}. Count the number of functions without writing an exhaustive list.

(c) How many functions are there from the 3-element set {1, 2, 3} to the 2-element set {a, b}?

(d) How many functions are there from the 2-element set {a, b} to the 3-element set {1, 2, 3}?

(e) How many functions are there from any 3-element set to any 12-element set?

(f) Re-do Problem 6(a) by constructing a function from the 3-element set of positions in the triple-decker to the 12-element set of flavors. Explicitly describe your function (in words). This idea of using functions to count is very powerful, and is one of the foundations of combinatorics.

8. A function f is called a one-to-one function (often called an injection) if, whenever x is different from y, f(x) is different from f(y).

(a) How many one-to-one functions are there from a 3-element set to a 2-element set?

(b) How many one-to-one functions are there from a 3-element set to a 12-element set?

(c) Re-do Problem 6(b).

•9. A group of three hungry team members from Problem 6 notices it would be cheaper to buy three pints of ice cream to share. (And they would also then get more ice cream!)

(a) In how many ways may they choose three pints of two different flavors? How is this problem different from Problem 6?

(b) In how many ways may they choose three pints with at least two different flavors?

(c) In how many ways can they choose three pints of ice cream with no restrictions on repeating flavors?

In the last part of Problem 9 it was helpful to break the question into cases that you could solve by previous methods. After doing that you could then figure out the answer to the whole question by using the answers from the specific cases. Because this is a fairly common strategy, some special terminology is helpful. Two sets are said to be disjoint if they have no elements in common. For example, the sets {1, 3, 12} and {6, 4, 8, 2} are disjoint, but {1, 3, 12} and {3, 5, 7} are not disjoint sets. Three or more sets are said to be mutually disjoint if no two of them have any elements in common. To solve Problem 9(c), the set of all possible choices of three pints of ice cream can be broken into three mutually disjoint sets: three pints of different flavors; three pints with two different flavors; three pints of the same flavor.
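The definition of mutually disjoint sets can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the helper name `mutually_disjoint` is hypothetical, and the example sets are the ones used in the paragraph above.

```python
from itertools import combinations

def mutually_disjoint(*sets):
    """Return True when no two of the given sets have an element in common."""
    # Compare every pair of sets; a single shared element spoils disjointness.
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))

print(mutually_disjoint({1, 3, 12}, {6, 4, 8, 2}))   # the disjoint pair from the text
print(mutually_disjoint({1, 3, 12}, {3, 5, 7}))      # these two share the element 3
```

Checking pairs is enough because "mutually disjoint" is defined in terms of pairs of sets.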


•10. (a) What can you say about the size of the union of a finite number of finite mutually disjoint sets? Does this have anything to do with any of the previous problems?

(b) What can you say about the size of the union of m mutually disjoint sets, each of the same size n? Identify where you used this result in your solutions to the previous problems.

The problems you’ve just completed contain among them kernels of the fundamentals of enumerative combinatorics. For example, in your solution to Problem 10(a) you just stated the Sum Principle (illustrated in Figure 1.1), and in Problem 10(b), the Product Principle (illustrated in Figure 1.2). These are two of the most basic principles of combinatorics, and they form a foundation on which we will develop many other counting principles.

Figure 1.1: The union of these two disjoint sets has size 17.

Figure 1.2: The union of four disjoint sets of size 5.

You may have noticed some standard mathematical words and phrases, such as set, ordered pair, function and so on, are creeping into the problems. One of our goals in these notes is to show how most counting problems can be recognized as standard mathematical objects. Since most of the intellectual content of these notes is in the problems, it is natural that definitions of concepts will often be within problems.¹ Problem 4 is meant to suggest that the question we asked in Problem 3 was really a problem of counting all the ordered pairs consisting of a bread choice and a filling choice. We use A × B to stand for the set of all ordered pairs whose first member is in A and whose second member is in B, and we call A × B the Cartesian product of A and B. Therefore you can think of Problem 3 as asking you for the size of the Cartesian product of M and N, where M is the set of all bread types and N is the set of all possible fillings; that is, the number of different kinds of sandwiches equals the number of elements in the Cartesian product M × N.
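To see a Cartesian product as a concrete set of ordered pairs, here is a small Python sketch. It is an illustration only: the menu below is hypothetical and deliberately smaller than the one in Problem 3, and `itertools.product` is simply one standard-library way to list all ordered pairs.

```python
from itertools import product

# Hypothetical menu, deliberately smaller than the one in Problem 3.
breads = {"rye", "wheat"}
fillings = {"cheese", "hummus", "egg"}

# breads x fillings: every ordered pair (bread, filling).
sandwiches = set(product(breads, fillings))

for bread, filling in sorted(sandwiches):
    print(bread, filling)
print(len(sandwiches))
```

Note that the pairs are ordered: ("rye", "egg") is in the product, while ("egg", "rye") is not, because the first member must come from the set of breads.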

When a set S is a union of m mutually disjoint sets $B_1, B_2, \dots, B_m$, the sets $B_1, B_2, \dots, B_m$ are said to form a partition of the set S. (Note that a partition of S is a set of sets.) So that the set S is not confused with the sets $B_i$ into which we have divided it, the sets $B_1, B_2, \dots, B_m$ are often called the blocks of the partition. Using this language of partitions, the Sum Principle and the Product Principle translate to:

The Sum Principle
If we have a partition of a finite set S, then the size of S is the sum of the sizes of the blocks of the partition.

The Product Principle
If we have a partition of a finite set S into m blocks, each of the same size n, then S has size mn.

You’ll notice that in both of these statements we talk about a partition of a finite set. We could modify our language a bit to cover infinite sets, but whenever we talk about the size of a set in what follows, we will be working with finite sets. In order to avoid possible complications in the future, let us agree that when we refer to the size of a set, we are implicitly assuming the set is finite.
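The two principles can be checked on a small example. The sketch below is an illustration under stated assumptions: the helper `is_partition` and the sets `s`, `uneven`, and `even` are hypothetical names chosen here, and a partition is stored as a Python list of sets (Python sets cannot contain ordinary sets) even though mathematically it is a set of sets.

```python
from itertools import combinations

def is_partition(blocks, s):
    """Check the definition: the blocks are mutually disjoint and their union is s."""
    disjoint = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))
    return disjoint and set().union(*blocks) == s

s = set(range(1, 13))                                    # the set {1, ..., 12}
uneven = [{1, 2, 3, 4, 5}, {6, 7}, {8, 9, 10, 11, 12}]   # blocks of different sizes
even = [{x for x in s if x % 3 == r} for r in range(3)]  # m = 3 blocks of size n = 4

print(is_partition(uneven, s), is_partition(even, s))
print(len(s) == sum(len(block) for block in uneven))     # Sum Principle
print(len(s) == len(even) * len(even[0]))                # Product Principle
```

The second check is a special case of the first: when every block has the same size n, adding up m block sizes is the same as multiplying m by n.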

1.3 Functions and their Directed Graphs

Relations between subsets of the set of real numbers can be graphed in the Cartesian plane, and it is helpful to remember how you used a graph of a relation to determine whether the relation is a function on the real numbers. To do this you would check that each vertical straight line crosses the graph of the relation in exactly one point. You might also recall how to determine whether such a function is one-to-one by examining each horizontal straight line to see if it crosses the graph of the function at most one time. If each horizontal line crosses the graph in at most one point, the function is one-to-one. If even one horizontal line crosses the graph in more than one point, the function is not one-to-one.

¹ When you come across an unfamiliar term in a problem, it was most likely defined earlier. If you check for it in the index, you will probably be able to find the definition of the term.

The domain and co-domain of the functions we work with here will usually both be finite and will often contain objects which are not numbers. Because of this, graphs in the Cartesian plane are usually not available. But all is not lost, since there is another kind of graph called a directed graph or digraph that is especially useful when dealing with functions between finite sets. Figure 1.3 has several examples of digraphs of functions.

In analyzing the graphs in Figure 1.3 you should notice the following: When we want to draw the digraph that represents a relation from a set S to a set T, we draw a line of dots called vertices to represent the elements of S and another (usually parallel) line of vertices to represent the elements of T. (Part (e) is slightly different.) We then draw an arrow from the vertex for x ∈ S to the vertex for y ∈ T if and only if x is related to y. Such arrows are called edges. Because there is an inherent order in a relation, we must be careful to draw an arrow and not just a line segment. When our relation is a function, f : S → T, one arrow is drawn from each x ∈ S to its corresponding f(x) ∈ T. Sometimes, as in part (e) of Figure 1.3, we have a function from a set S to itself. In that case we usually simplify the picture by drawing only one set of vertices representing the elements of S. Digraphs can often be more enlightening if we experiment with the relation to find a nice placement of the vertices rather than putting them in a row.

There is a simple test for whether a digraph of a relation from S to T is a digraph of a function from S to T: There must be one and only one arrow leaving each vertex representing an element of S, and the arrow must end at a vertex representing an element of T. This test works because the fact that there is only one arrow means that each x in S is related to exactly one element of T, which is the definition of function. The fact that there is one arrow means that f(x) is defined for each x in S, and we sometimes emphasize this by saying the function is well-defined.
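The arrow test just described can be written as a short check. This is a minimal sketch, assuming a digraph is stored as a set of (tail, head) pairs; the name `is_function` and the second example relation are hypothetical, while the first example is the function of Figure 1.3(e).

```python
def is_function(edges, s, t):
    """Arrow test: exactly one arrow leaves each vertex of s, and it ends in t."""
    for x in s:
        heads = [y for (tail, y) in edges if tail == x]
        if len(heads) != 1 or heads[0] not in t:
            return False
    return True

# Figure 1.3(e): f(x) = x + 2 mod 6 on {0, 1, 2, 3, 4, 5}.
shift = {(x, (x + 2) % 6) for x in range(6)}
print(is_function(shift, set(range(6)), set(range(6))))

# A hypothetical relation: two arrows leave "a" and no arrow leaves "c".
relation = {("a", 0), ("a", 1), ("b", 2)}
print(is_function(relation, {"a", "b", "c"}, {0, 1, 2}))
```

The check looks only at arrows leaving vertices of S; it says nothing yet about how many arrows arrive at each vertex of T.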

Figure 1.3: What is a digraph of a function? (The digraphs themselves are not reproduced here; the panels are as follows.)

(a) The function given by $f(x) = x^2$ on the domain {1, 2, 3, 4, 5}.

(b) The function from the set {0, 1, 2, 3, 4, 5, 6, 7} to the set of triples of zeros and ones given by f(x) = the binary representation of x.

(c) The function from the set {−2, −1, 0, 1, 2} to the set {0, 1, 2, 3, 4} given by $f(x) = x^2$.

(d) Not the digraph of a function.

(e) The function from {0, 1, 2, 3, 4, 5} to {0, 1, 2, 3, 4, 5} given by f(x) = x + 2 mod 6.

11. In how many ways can you pass out nine different candies to three children? Set up your solution as a problem about counting functions. In how many ways can you pass out the candy if each child must get at least one piece? Exactly three pieces?

12. Suppose we have n distinguishable balls. How many ways can we paint each of them with one color, chosen from red, black, green and blue?

•13. A function f : S → T is an onto function (also called a surjection) if each element of T is f(x) for at least one x ∈ S.

(a) Choose finite sets S and T and a function from S to T that is one-to-one but not onto. Draw the digraph of your function.

(b) Choose finite sets S and T and a function from S to T that is onto but not one-to-one. Draw the digraph of your function.

•14. Digraphs of functions can help us visualize the ideas of one-to-one functions and onto functions. In each of the following parts devise a test similar to the one we described for testing when a digraph is the digraph of a function.

(a) What does the digraph of a one-to-one function from a finite set X to a finite set Y look like? (Remember that in order to be a one-to-one function, a relation must be a function.)

(b) What does the digraph of an onto function from a finite set X to a finite set Y look like?

(c) What does the digraph of a bijection (a function that is both one-to-one and onto) from a finite set X to a finite set Y look like?

If you find you still have difficulty working with functions, it would be a good idea to work through Appendix A outside of class. It covers functions and also digraphs in more detail.

1.4 The Pigeonhole Principle

◦15. US coins are all marked with the year in which they were made. How many coins do you need to guarantee that on (at least) two of them, the date has the same last digit? (When we say “to guarantee that on (at least) two of them,...” we mean that you can find two coins with the same last digit. You might be able to find three with that last digit, or you might be able to find one pair with the last digit 1 and one pair with the last digit 9, or any combination of equal last digits, as long as there is at least one pair with the same last digit.)

There are many ways to explain your answer to Problem 15. For example, you can separate the coins into stacks or blocks according to the last digit of their date. That is, you can put all the coins with a given last digit in a block together (putting no other coins in that block) and repeat this process until all coins are in some block. Using the terminology we introduced earlier, this gives a partition of your set of coins into blocks of coins with the same last digit. If no two coins have the same last digit, then each block has at most one coin. Since there are only ten digits, there are at most ten non-empty blocks and by the Sum Principle there are at most ten coins. So, with ten coins it is possible to have no two with the same last digit, but with eleven coins some block must have at least two coins in order for the sum of the sizes of at most ten blocks to be 11. This is one explanation of why we need eleven coins in Problem 15. This kind of situation arises often in combinatorial situations, and so rather than always using the Sum Principle to explain our reasoning, we enunciate another principle which we can think of as a variant of the Sum Principle.

The Pigeonhole Principle
If we partition a set with more than n elements into n blocks, then at least one block has more than one element.

The Pigeonhole Principle gets its name from the idea of a grid of little boxes that might be used to sort mail or used as mailboxes for a group of people in an office. The boxes in such grids are sometimes called pigeonholes in analogy with the stacks of boxes used to house homing pigeons back when homing pigeons were used to carry messages. People will sometimes state this principle in a more colorful way as “if we put more than n pigeons into n pigeonholes, then some pigeonhole contains more than one pigeon.”
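The coin argument above can be acted out in a few lines of Python. This is only a sketch: the function name `blocks_by_last_digit` and the particular dates are hypothetical, chosen so that the ten-coin collection has all last digits different.

```python
from collections import defaultdict

def blocks_by_last_digit(years):
    """Partition a collection of coin dates into blocks sharing a last digit."""
    blocks = defaultdict(list)
    for year in years:
        blocks[year % 10].append(year)
    return dict(blocks)

ten_coins = list(range(1990, 2000))   # hypothetical dates, last digits 0 through 9
eleven_coins = ten_coins + [2004]     # any eleventh date lands in an occupied block

print(max(len(b) for b in blocks_by_last_digit(ten_coins).values()))
print(max(len(b) for b in blocks_by_last_digit(eleven_coins).values()))
```

With ten coins the largest block can still have size 1; with eleven coins the largest block has at least two coins, whichever date is added, because there are at most ten blocks.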

16. Prove that any function from a set of size n to a set of size less than n cannot be one-to-one.

•17. Prove that if f is a one-to-one function between finite sets of equal size, then f must be onto. Compare this with Problem 13(a).

•18. Prove that if f is an onto function between finite sets of equal size, then f must be one-to-one. Compare this with Problem 13(b).

You’ve just proved a shortcut for showing that a function between two finite sets of the same size is a bijection: you need only show that the function is either one-to-one or onto, because Problems 17 and 18 combine to say that you get the other property for free.

Students who are interested in learning more about Pigeonhole-type problems are encouraged to work through the problems in Chapter 1 in the Supplementary Sections in the back of the book.


1.5 The Bijection Principle and Counting Subsets

The digraphs marked (a), (b), and (e) in Figure 1.3 are digraphs of bijections. Your description in Problem 14 of the digraph of a bijection illustrates another fundamental principle of combinatorial mathematics:

The Bijection Principle
Two sets have the same size if and only if there is a bijection between the sets.

It is surprising how this innocent-sounding principle gives insight into some otherwise very complicated counting arguments. From now on, for any positive integer n we will use the notation [n] for the set {1, . . . , n}. For example, [4] equals the set {1, 2, 3, 4}.
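For small finite sets, both the notation [n] and the property of being a bijection can be checked directly. The sketch below is an illustration under stated assumptions: a function is stored as a Python dict, the names `bracket` and `is_bijection` are hypothetical, and the two examples are panels (e) and (c) of Figure 1.3.

```python
def bracket(n):
    """The set [n] = {1, ..., n}."""
    return set(range(1, n + 1))

def is_bijection(f, s, t):
    """f is a dict defined on s; check that it is one-to-one and onto t."""
    images = [f[x] for x in s]
    one_to_one = len(set(images)) == len(images)   # no value is hit twice
    onto = set(images) == set(t)                   # every element of t is hit
    return one_to_one and onto

print(bracket(4))

# Figure 1.3(e): f(x) = x + 2 mod 6 on {0, ..., 5}.
shift = {x: (x + 2) % 6 for x in range(6)}
print(is_bijection(shift, set(range(6)), set(range(6))))

# Figure 1.3(c): f(x) = x^2 from {-2, ..., 2} to {0, ..., 4}.
square = {x: x * x for x in range(-2, 3)}
print(is_bijection(square, set(range(-2, 3)), set(range(5))))
```

The two sets in panel (c) have the same size, yet the squaring function is not a bijection between them; the Bijection Principle only asserts that some bijection exists when the sizes agree.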

19. The binary representation of a positive integer m is an ordered list $a_1 a_2 \dots a_k$ of zeros and ones (also called a binary k-string) such that

$$m = a_1 2^{k-1} + a_2 2^{k-2} + \cdots + a_k 2^0.$$

Let n be a fixed positive integer. For this problem, let S be the set of binary representations of numbers between 0 and $2^n - 1$, and let T be the set of all subsets of [n]. Note that the empty set ∅ is a subset of every set.

(a) For n = 2, write out the sets S and T, and then describe a bijection from S to T.

(b) Using your strategy from part (a), describe a bijection from S to T for general n. Explain why your map is a bijection from S to T.

(c) Find the number of subsets of [n].
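The defining formula for a binary representation in Problem 19 can be evaluated term by term. The following Python sketch is an illustration only; the function name `value_of` is hypothetical, and a binary k-string is assumed to be written as a Python string of the characters 0 and 1.

```python
def value_of(bits):
    """Evaluate a_1*2^(k-1) + a_2*2^(k-2) + ... + a_k*2^0 for a binary k-string."""
    k = len(bits)
    # enumerate(..., start=1) pairs each digit a_i with its index i = 1, ..., k.
    return sum(int(a) * 2 ** (k - i) for i, a in enumerate(bits, start=1))

print(value_of("101"))   # 5, as in Figure 1.3(b)
print(value_of("011"))   # 3, as in Figure 1.3(b)
```

The second example shows that a leading zero contributes nothing to the sum, which is why Figure 1.3(b) can list every number from 0 to 7 as a triple of zeros and ones.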

You may have seen the notation $\binom{n}{k}$, which stands for the number of ways to choose a k-element subset from an n-element set. The symbol $\binom{n}{k}$ is read as “n choose k” and is called a binomial coefficient because for fixed n, the binomial coefficients $\binom{n}{k}$ are the coefficients in the expansion of binomial powers $(x + y)^n$. Sometimes $\binom{n}{k}$ is written as C(n, k), or $_nC_k$, but we don’t use that notation here. Also, another common way to read the binomial coefficient notation is “the number of combinations of n things taken k at a time,” but we’ve found that can cause confusion and so we won’t read the notation that way in this course.
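As a small illustration of what the symbol counts, the following sketch lists the k-element subsets of [4] and counts them. The function name `count_k_subsets` is invented for this example; `math.comb` is Python's built-in binomial coefficient and is shown only for comparison.

```python
from itertools import combinations
from math import comb

def count_k_subsets(n, k):
    """Count the k-element subsets of {1, ..., n} by listing them all."""
    return sum(1 for _ in combinations(range(1, n + 1), k))

# Compare the direct count with the built-in binomial coefficient for n = 4.
for k in range(5):
    print(k, count_k_subsets(4, k), comb(4, k))
```

Listing subsets quickly becomes impractical as n grows, which is one reason the later problems look for ways to compute these numbers without listing.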


→20. A basketball team has 12 players and only five players can play at any given time during a game. In the following parts, use binomial coefficients in your answers whenever possible.

(a) In how many ways may the coach choose the five players?

(b) To be more realistic, the five players playing a game normally consist of two guards, two forwards, and one center. If there are five guards, four forwards, and three centers on the team, in how many ways can the coach choose two guards, two forwards, and one center?
Hint. The coach is making a sequence of decisions. Can you figure out how many choices the coach has for each decision in the sequence?

(c) What if one of the centers is equally skilled at playing forward?

•21. Let C be the set of all k-element subsets of [n] that contain the number n, and let D be the set of all k-element subsets of [n] that don’t contain n.

(a) Find C and D for n = 5 and k = 2.

(b) Let C′ be the set of (k − 1)-element subsets of [n − 1]. Describe a bijection from C to C′. (A verbal description is fine.)

(c) Let D′ be the set of k-element subsets of [n − 1]. Describe a bijection from D to D′. (A verbal description is fine.)

(d) Based on the two previous parts, express the sizes of C and D in terms of binomial coefficients involving n − 1.

(e) Apply the Sum Principle to C and D to obtain a formula that expresses $\binom{n}{k}$ in terms of two binomial coefficients involving n − 1.

You have just derived Pascal’s Equation, which is the basis for the famous Pascal’s Triangle, the triangle in Figure 1.4. In that figure, we number the rows and columns so that the top row is the 0-th row and the initial entry of a row will be called the 0-th number in the row. Then the n-th row is set up as follows: the number of k-element subsets of an n-element set is the k-th number over in the n-th row. Looking at your formula from Problem 21, you’ll see that it doesn’t say anything about $\binom{n}{k}$ when k = 0 or k = n, but otherwise it says that each entry is the sum of the two that are above it and just to the left or right.

22. Just for practice, use your formula to get the 8-th row of Pascal’s Triangle.


Figure 1.4: Pascal’s Triangle

                     1
                  1     1
               1     2     1
            1     3     3     1
         1     4     6     4     1
      1     5    10    10     5     1
   1     6    15    20    15     6     1
1     7    21    35    35    21     7     1
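The rows of Figure 1.4 can be regenerated mechanically from the rule that each interior entry is the sum of the two entries above it, with a 1 at each end of every row. The sketch below does this for rows 0 through 7; the function name `pascal_rows` is invented for this example.

```python
def pascal_rows(last_row):
    """Return rows 0..last_row of Pascal's Triangle as lists of integers."""
    rows = [[1]]
    for n in range(1, last_row + 1):
        prev = rows[-1]
        # Ends of the row are 1; each interior entry adds the two entries above.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows

for row in pascal_rows(7):
    print(" ".join(str(x) for x in row))
```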

→23. Without writing out any more complete rows, write enough of Pascal’s Triangle to get a numerical answer for the first question in Problem 9 (on page 9). Try to do this as efficiently as possible. You should be able to get the answer by writing down at most 10 numbers (including the answer).

•24. In how many ways may we pass out k (identical) ping-pong balls to n children if each child may get at most one?

Hint. Ask yourself “What is a problem like this doing in the middle of a bunch of problems about counting subsets of a set? Is it related? Or is it supposed to give a break from sets?”

25. In Chapter 2 we’ll derive a formula for calculating $\binom{n}{k}$ directly. For now, use Pascal’s Triangle to find the numerical answers to Problem 20.

26. There is a bijection that lets us give another proof of the fact that a set of size n has $2^n$ subsets. Namely, for each subset A of [n], define a function (traditionally denoted by $\chi_A$) as follows:²

$$\chi_A(i) = \begin{cases} 1 & \text{if } i \in A, \\ 0 & \text{if } i \notin A. \end{cases}$$

The function $\chi_A$ is called the characteristic function of A. Notice that the characteristic function is a function from [n] to {0, 1}.

(a) For practice, consider the function $\chi_{\{1,3\}}$ for the subset S = {1, 3} of the set {1, 2, 3, 4}.

• Show $\chi_{\{1,3\}}(1) = 1$ and $\chi_{\{1,3\}}(2) = 0$.
• Find $\chi_{\{1,3\}}(3)$ and $\chi_{\{1,3\}}(4)$.

² The symbol χ is the Greek letter chi, which is pronounced “Ki,” where the i sounds like “eye.”

(b) Let S be the set of all subsets of [n]. Let T be the set of all functions from [n] to {0, 1}. We define a map f : S → T by f(A) = $\chi_A$ for all A ∈ S. Explain why f is a bijection.
Hint. Work with n = 3 first if you can’t see what to do for general n.

(c) Why does the fact that f is a bijection prove that [n] has $2^n$ subsets?

The proofs in Problems 19 and 26 use essentially the same bijection, but they interpret sequences of zeros and ones differently and so end up being different proofs. They both are proofs of the following theorem.

Theorem 1. Any n-element set has $2^n$ subsets.
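Theorem 1 can be checked against a direct listing for small n. The sketch below is a numerical check only, not one of the proofs the problems ask for; the function name `all_subsets` is invented for this example.

```python
from itertools import combinations

def all_subsets(n):
    """List every subset of {1, ..., n}, smallest subsets first."""
    elements = range(1, n + 1)
    return [set(c) for k in range(n + 1) for c in combinations(elements, k)]

# Compare the number of subsets listed with 2**n for n = 0, ..., 5.
for n in range(6):
    print(n, len(all_subsets(n)), 2 ** n)
```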

→27. Analysis of Pascal’s Triangle shows that the table is symmetric about its central axis; that is, it gives the same number for $\binom{n}{k}$ as it gives for $\binom{n}{n-k}$. Whenever two quantities are counted by the same formula it is good for our insight to find a bijection that demonstrates the two sets have the same size. In fact this is a guiding principle of research in combinatorial mathematics. Find a bijection that proves that $\binom{n}{k}$ equals $\binom{n}{n-k}$.

Hint. The first thing you need to decide is “What are the two sets whose elements we are counting?” Then it will be easier to think of a bijection between these two sets. (Note that these two sets are sets of sets.)

Now that you’re at the end of the first chapter, I’d like you to think about how you’ve approached these problems. There are some fairly general techniques that can be mentioned. As you work on a problem, think about why you are doing what you are doing. Is it helping you? If your current approach doesn’t feel right, try to see why. Is this a problem you can decompose into simpler problems? Can you see a way to make up a simple example, even a silly one, of what the problem is asking you to do? If a problem is asking you to do something for every value of an integer n, then what happens with simple values of n like 0, 1, and 2? Don’t worry about making mistakes, because mistakes often lead mathematicians to their best insights.


1.6 Additional Problems for Chapter 1

In the following problems it’s fine to give an answer in terms of products, sums, quotients, and binomial coefficients, rather than finding a numerical answer.

1. Suppose there are twenty team members in Problem 6 (on page 5), and that each member orders a triple-decker cone from the shop. How many group orders are possible?

→2. We can write n as a sum of n ones. How many plus signs do we use in this sum? In how many ways may we write n as a sum of a list of k positive numbers? Such a list is called a composition of n into k parts.

Hint. Binomial coefficients.

3. In the last problem we defined a composition of n into k parts. What is the total number of compositions of n (into any number of parts)?

→4. In a circular ice cream dish we are going to put four scoops of ice cream of four distinct flavors chosen from among twelve flavors. Assuming we place four scoops of the same size as if they were at the corners of a square, and recognizing that moving the dish doesn’t change the way in which we have put the ice cream into the dish, in how many ways may we choose the ice cream and put it into the dish?

→5. A tennis club has 4n members. To specify a doubles match, we choose two teams of two people. In how many ways may we arrange the members into doubles matches so that each player is in (exactly) one doubles match? In how many ways may we do it if we specify in addition who serves first on each team?

6. A town has n street lights running along the north side of Main Street. The poles on which they are mounted need to be painted so that they do not rust. In how many ways may they be painted with red, white, blue, and green if an even number of them are to be painted green?

Hint. First try for small n.

Chapter 2

Product Principle, Revisited

There are two especially helpful variants of the Product Principle, which we will call the General Product Principle and the Quotient Principle. Although you have already used both of these ideas in earlier problems, they occur frequently enough for us to focus specifically on them in this chapter.

2.1 The General Product Principle

One version of the Product Principle applies directly in problems such as Problem 5 and Problem 6 (on page 5), where we were not simply taking a union of m mutually disjoint sets of size n, but rather m disjoint sets of size n, each of which was itself a union of m′ mutually disjoint sets of size n′. This is a cumbersome way to think about this type of counting problem, and it’s better to think in terms of making a sequence of choices as in the next problem.

•28. Suppose we make a sequence of m choices, where

• there are $k_1$ possible first choices, and
• for each way of making the first i − 1 choices, there are $k_i$ ways to make the i-th choice.

In how many different ways may we make our sequence of choices? (At this time you need not prove your answer correct. Just write it down.)

The counting principle you gave in the last problem is called the General Product Principle. A proof of this generalization from the original Product Principle is outlined in problems in Chapter 4 (beginning on page 47).


For right now, you may accept it as another counting principle and use it wherever you like.

+ 29. Re-do Problem 6 (on page 5) by applying the General Product Principle.

→30. A tennis club has 2n members, and you want to pair up the members in twos for singles matches.

(a) Find the number of pairings for some small values of 2n, say for the cases of 2, 4, and 6 members.

(b) In how many ways may we pair up all 2n members of the club?
Hint. Suppose you have an alphabetical list of the names of the members of the club. First write the process of pairing as a sequence of choices. Then apply the General Product Principle.

(c) Suppose that, in addition to specifying who plays whom, for each pairing we also want to specify who serves first. Use your answer to part (b) and the General Product Principle to show there are $2^n \prod_{i=0}^{n-1} (2n - 2i - 1)$ possible pairings under these rules.¹

•31. Use the General Product Principle to count the number of bijections on the set [k]. Explain.

2.2 Counting Functions

Now we return to Problem 7 (on page 5) and use the General Product Principle to justify (or perhaps finish) the question about the number of functions from a 3-element set to a 12-element set. Recall that we are using the notation [n] for the set {1, . . . , n}.

+ 32. Consider functions f from the set [2] to the set [12].

(a) How many functions are there with f(2) = 1? With f(2) = 2? With f(2) = 3? With f(2) = i for any fixed i between 1 and 12?

(b) The set of functions from [2] to [12] is the union of 12 sets: the set of f with f(2) = 1, the set of f with f(2) = 2, . . . , the set of f with f(2) = 12. How many functions does each of these sets have? From the Product Principle, what can you conclude about the number of functions in the union of these 12 sets?

¹ The Pi notation or product notation we use here for products works just like the Sigma notation works for sums.


(c) Find the number of functions from [2] to [12].

+ 33. Now consider the set of functions from [3] to [12].

(a) For each i, let $S_i$ be the set of functions f from [3] to [12] with f(3) = i. What is the size of $S_i$? What is the size of the union of the sets $S_i$?

(b) How many functions are there from [3] to [12]?

(c) Use the General Product Principle to justify your answer in Problem 7(f).

+ 34. (a) Based on the examples you’ve seen so far (including those in Problem 7), formulate a conjecture about how many functions there are from the set [m] to the set [n].

(b) Prove your conjecture is true.

(c) A common notation for the set of all functions from a set M to a set N is $N^M$. Why is this a good notation?

+ 35. Now suppose we are thinking about a set S of functions f from [m] to some (finite) set X. (For example, in Problem 6(a) (on page 5) we had the set of functions from the three possible places for scoops in an ice-cream cone to 12 flavors of ice cream.) Suppose there are $k_1$ choices for f(1). (In Problem 6, $k_1 = 12$, because there were 12 ways to choose the first scoop.) Suppose that for each choice of f(1) there are $k_2$ choices for f(2). (In Problem 6, $k_2 = 12$ when the second flavor could be the same as the first, but $k_2 = 11$ when the flavors had to be different.) In general, suppose that for each choice of f(1), f(2), . . . , f(i − 1), there are $k_i$ choices for f(i). (In Problem 6(b), where the flavors have to be different, for each choice of f(1) and f(2), there are 10 choices for f(3).) What we have assumed so far about the functions in S may be summarized as:

• There are $k_1$ choices for f(1).
• For each choice of f(1), f(2), . . . , f(i − 1), there are $k_i$ choices for f(i).

How many functions are in the set S? Is there any practical difference between the result of this problem and the General Product Principle (as given in Problem 28)?

The point of Problem 35 is that although originally we stated the General Product Principle very informally, to be more mathematically precise it is a statement about counting sets of functions.


•36. This problem revisits the question: How many subsets does a set S with n elements have? (This is your third proof of this fact. For the other two proofs, refer to Problems 19 and 26.)

(a) For the specific case of n = 3, describe a sequence of three decisions which could be made to yield subsets of [3]. Apply the General Product Principle to find the number of subsets of [3]. Re-work this from a function point of view.

(b) Use the functional interpretation of the General Product Principle to prove that a set with n elements has $2^n$ subsets.

37. In how many ways may we pass out k distinct pieces of fruit to n children (with no restriction on how many pieces of fruit a child may get)?

Hint. You can identify each possible distribution of fruit as a function.

◦38. Assuming k ≤ n, in how many ways may we pass out k distinct pieces of fruit to n children if each child may get at most one? What is the number if k > n? Assume for both questions that we pass out all the fruit. Note that each of these is a list of k distinct things chosen from a set S (of children), which is usually called a k-element permutation of S.

39. Find the number of one-to-one functions from a k-element set to an n-element set.

Donald Knuth invented the notation $n^{\underline{k}}$, read “n to the k falling” or “n to the k down,” for the number you just found:

$$n^{\underline{k}} = n(n-1)\cdots(n-k+1) = \prod_{i=1}^{k} (n - i + 1). \qquad (2.1)$$
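Equation (2.1) translates directly into a short loop. The sketch below computes the product exactly as written and compares one value with a direct count of 3-element permutations of a 5-element set; the function name `falling` is invented for this example.

```python
from itertools import permutations

def falling(n, k):
    """n to the k falling: the product of (n - i + 1) for i = 1, ..., k."""
    result = 1  # the empty product (k = 0) is 1
    for i in range(1, k + 1):
        result *= n - i + 1
    return result

# All lists of 3 distinct letters chosen from a 5-letter set.
lists = list(permutations("abcde", 3))
print(falling(5, 3), len(lists))
```

Note that the product contains a factor of 0 whenever k > n, so the function returns 0 in that case.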

•40. Express $n^{\underline{k}}$ as a quotient of factorials.

2.3 The Quotient Principle

In Table 2.1 we have listed all 3-element permutations of the 5-element set {a, b, c, d, e}, in such a way that each row consists of all 3-element permutations of some subset of {a, b, c, d, e}. Because a given k-element subset can be listed as a k-element permutation in k! ways, there are 3! = 6 permutations in each row.


Table 2.1: The 3-element permutations of {a, b, c, d, e} organized by rows according to which 3-element set they permute.

abc  acb  bac  bca  cab  cba
abd  adb  bad  bda  dab  dba
abe  aeb  bae  bea  eab  eba
acd  adc  cad  cda  dac  dca
ace  aec  cae  cea  eac  eca
ade  aed  dae  dea  ead  eda
bcd  bdc  cbd  cdb  dbc  dcb
bce  bec  cbe  ceb  ebc  ecb
bde  bed  dbe  deb  ebd  edb
cde  ced  dce  dec  ecd  edc

Since each 3-element permutation appears exactly once in Table 2.1, the table is a partition of the set of 3-element permutations of {a, b, c, d, e}. Each row is a block of size six and consists of all 3-element permutations of some 3-element subset of {a, b, c, d, e}. Since there are ten rows, we see that there are ten 3-element subsets of {a, b, c, d, e}. An alternate way to see this is to observe that we partitioned the set of all sixty 3-element permutations of {a, b, c, d, e} into some number q of blocks, each of the same size six. Thus by the Product Principle, q · 6 = 60, so q = 10.
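The partition in Table 2.1 can be rebuilt by grouping each permutation under the set of letters it uses. The following sketch does this and reports the number of blocks and the block sizes; the variable names are chosen for this example only.

```python
from itertools import permutations
from collections import defaultdict

blocks = defaultdict(list)
for p in permutations("abcde", 3):
    # Two lists land in the same block when they use the same three letters.
    blocks[frozenset(p)].append("".join(p))

sizes = {len(block) for block in blocks.values()}
total = sum(len(block) for block in blocks.values())
print(len(blocks), sizes, total)
```

Each key of `blocks` is a 3-element subset, so counting the blocks is the same as counting the rows of the table.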

•41. Rather than restricting ourselves to n = 5 and k = 3, we can partition the set of all k-element permutations of an n-element set S into blocks. We do so by letting $B_K$ be the set (block) of all k-element permutations of K for each k-element subset K of S. Thus as in our preceding example, each block consists of all permutations of some subset K of our n-element set. For example, the permutations of K = {a, b, c} are listed in the first row of Table 2.1. The questions that follow are about the corresponding partition of the set of k-element permutations of S, where S and k are arbitrary.

(a) How many permutations are there in a block?

(b) Since S has n elements, what does Problem 38 tell you about the total number of k-element permutations of S?

(c) Describe a bijection between the set of blocks of the partition and the set of k-element subsets of S.

(d) What formula does this give you for the number $\binom{n}{k}$ of k-element subsets of an n-element set?
Hint. You can make good use of the Product Principle here.

The last problem gives the formula promised in Chapter 1.

Theorem 2. Let n, k be nonnegative integers with k ≤ n. Then

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
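As a quick check of Theorem 2, the formula can be compared with a direct count of subsets for small n and k. This is a minimal sketch; the function name `binom` is invented for this example.

```python
from math import factorial
from itertools import combinations

def binom(n, k):
    """n! / (k! (n-k)!) for 0 <= k <= n, as in Theorem 2."""
    # Integer division is exact here because the quotient is a whole number.
    return factorial(n) // (factorial(k) * factorial(n - k))

# Compare the formula with a direct listing of k-element subsets.
checked = all(
    binom(n, k) == sum(1 for _ in combinations(range(n), k))
    for n in range(8)
    for k in range(n + 1)
)
print(binom(5, 3), checked)
```

The value for n = 5 and k = 3 agrees with the ten rows of Table 2.1.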

•42. Use Theorem 2 to find numerical answers to Problem 20 (on page 14).

→43. While the formula in Theorem 2 is very useful, it doesn’t give us a sense of how big the binomial coefficients are. We can get a very rough idea, for example, of the size of $\binom{2n}{n}$ by recognizing that we can write $(2n)^{\underline{n}}/n!$ as

$$\frac{2n}{n} \cdot \frac{2n-1}{n-1} \cdots \frac{n+1}{1},$$

and each quotient is at least 2, so the product is at least $2^n$. If this were an accurate estimate, it would mean the fraction of n-element subsets of a 2n-element set would be about $2^n/2^{2n} = 1/2^n$, which becomes very small as n becomes large. However, it is pretty clear the approximation will not be a very good one, because some of the terms in that product are much larger than 2. In fact, if $\binom{2n}{k}$ were the same for every k, then each would be the fraction $\frac{1}{2n+1}$ of $2^{2n}$. This is much larger than the fraction $\frac{1}{2^n}$. But our intuition (and also Pascal’s Triangle) suggests that $\binom{2n}{n}$ is much larger than $\binom{2n}{1}$ and is likely larger than $\binom{2n}{n-1}$, so we can be sure our approximation is a bad one. For estimates like this, James Stirling developed a formula to approximate n! when n is large, namely n! is about $\sqrt{2\pi n}\, n^n/e^n$. In fact the ratio of n! to this expression approaches 1 as n becomes infinite.² We write this as

$$n! \approx \sqrt{2\pi n}\, \frac{n^n}{e^n}.$$

We read this notation as n! is asymptotic to $\sqrt{2\pi n}\, \frac{n^n}{e^n}$. Use Stirling’s Formula to show that the fraction of subsets of size n in a 2n-element set is approximately $1/\sqrt{\pi n}$, which is a much bigger fraction than $\frac{1}{2^n}$.

² Proving this takes more of a detour than is advisable here. However, there is an elementary proof which you can work through in the problems at the end of Section 1 of Chapter 1 of Introductory Combinatorics by Kenneth P. Bogart, Harcourt Academic Press (2000).
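The statement in Problem 43 that the ratio of n! to Stirling’s expression approaches 1 can be observed numerically. The sketch below prints that ratio for a few values of n; it illustrates the trend and is not a proof. The function name `stirling` and the sample values of n are choices made for this example.

```python
from math import factorial, sqrt, pi, e

def stirling(n):
    """Stirling's approximation sqrt(2 pi n) * n**n / e**n, as a float."""
    # Writing n**n / e**n as (n / e)**n keeps the intermediate values smaller.
    return sqrt(2 * pi * n) * (n / e) ** n

sample = (1, 5, 10, 50, 100)
ratios = [factorial(n) / stirling(n) for n in sample]
for n, r in zip(sample, ratios):
    print(n, round(r, 5))
```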


•44. In how many ways may n people sit around a round table? (Assume that when people are sitting around a round table, all that really matters is who is to each person’s right. For example, if we can get one arrangement of people around the table from another by having everyone get up and move to the right one place and sit back down, then we get an equivalent arrangement of people. Notice that you can get a list from a seating arrangement by marking a place at the table, and then listing the people at the table, starting at that place and moving around to the right.) There are at least two different ways of doing this problem. Try to find them both.

•45. Describe a way to partition the n-element permutations of the n people into blocks so that there is a bijection between the set of blocks of the partition and the set of inequivalent arrangements of the n people around a round table. What method of solution for the last problem does this correspond to?

•46. In this sequence of problems you have been using the Product Principle in a new way. One of the ways in which we previously stated the Product Principle was “If we partition a set into m blocks each of size n, then the set has size m · n.” In Problems 41(c) and 45 we knew the size p of a set P of permutations of a set, and we knew we had partitioned P into some unknown number of blocks, each of a certain known size r. If we let q stand for the number of blocks, what does the Product Principle tell us about p, q, and r? What do we get when we solve for q?

The formula you found in Problem 46 is so useful that we are going to single it out as another principle.

The Quotient Principle
If we partition a set P of size p into q blocks, each of which has size r, then q = p/r.

The Quotient Principle is really just a restatement of the Product Principle, but thinking about it as a principle in its own right often leads us to find solutions to problems. Notice that it does not always yield a formula for the number of blocks of a partition, since it only works when all the blocks have the same size.


Next we introduce the idea of an equivalence relation. We will see what equivalence relations have to do with partitions and discuss the Quotient Principle from that point of view.

2.4 Equivalence Relations

So far we’ve used relations primarily to talk about functions. There is another kind of relation, called an equivalence relation, that gives another slant on the Quotient Principle.

Equivalence relations are in the background of some of the problems considered at the start of these notes. In Problem 9 (on page 6) with three distinct flavors, it was probably tempting to say there are 12 flavors for the first pint, 11 for the second, and 10 for the third, so there are 12 · 11 · 10 ways to choose the pints of ice cream. However, once the pints have been chosen, bought, and put into a bag, there is no way to tell which is first, which is second and which is third. What we just counted is the number of lists of three distinct flavors, that is, one-to-one functions from the set {1, 2, 3} into the set of ice cream flavors. Two of those lists become equivalent once the ice cream purchase is made if they have the same flavors of ice cream. In other words, two of those lists are equivalent (are related) if they list the same subset of the set of ice cream flavors. To visualize this relation with a digraph, we would need one vertex for each of the 12 · 11 · 10 lists, which is not feasible, and even with five flavors of ice cream we would need one vertex for each of 5 · 4 · 3 = 60 lists. So for now we will work with the easier-to-draw question of choosing three pints of ice cream of different flavors from a choice of four flavors of ice cream.

47. Suppose we have four flavors of ice cream: V(anilla), C(hocolate), S(trawberry) and P(each). Draw the directed graph whose vertices consist of all lists of three distinct flavors of the ice cream, and whose edges connect two lists if they list the same three flavors. This graph makes it pretty clear in how many “really different” ways we may choose 3 flavors out of four. How many is it?

→48. Now suppose again we are choosing three distinct flavors of ice cream out of four possible flavors, but instead of putting scoops in a cone or choosing pints, we are going to have the three scoops arranged symmetrically in a circular dish. Just as when we chose three pints, we can describe a selection of ice cream in terms of which one goes in the dish first, which one goes in second (say to the right of the first), and which one goes in third (say to the right of the second scoop, which makes it to the left of the first scoop). But again, two of these lists will sometimes be equivalent; that is, once they are in the dish, we can’t tell which one went in first. Think about what makes two lists of flavors equivalent, and draw the directed graph whose vertices consist of all lists of three of the flavors of ice cream and whose edges connect two lists between which we cannot distinguish as dishes of ice cream. How many dishes of ice cream can we distinguish from one another?

49. Draw the digraph for Problem 44 in the special case where we have four people sitting around the table.

In the last few problems, we began with a set of lists and then said when two lists were equivalent representations of the objects we are trying to count. Then you drew the directed graph for the relation of equivalence. Check that each of those digraphs has an arrow from each vertex (list) to itself. This is what is meant when we say a relation is reflexive. Also, see that whenever you have an arrow from one vertex to a second, there is an arrow from the second back to the first. This is what is meant when we say a relation is symmetric. There is another property of those relations you have graphed. Namely, whenever you have an arrow from $L_1$ to $L_2$ and an arrow from $L_2$ to $L_3$, then there is an arrow from $L_1$ to $L_3$. This is what is meant when we say a relation is transitive.

You also undoubtedly have noticed that each of these directed graphs divides up into clumps of mutually connected vertices, and in the next section we will see that this is what equivalence relations are all about. Before we look into this further, let’s be a bit more precise in our description of what it means for a relation to be reflexive, symmetric or transitive. First of all, writing (a, b) ∈ R to say that a is related to b is somewhat messy, and it is really more common to write aRb to mean that a is related to b. (For example, if our relation is the “less than” relation on {1, 2, 3}, you are much more likely to use x < y than you are (x, y) ∈ <, aren’t you?)

• If R is a relation on a set X, we say R is reflexive if xRx for every x ∈ X.

• If R is a relation on a set X, we say R is symmetric if xRy holds whenever yRx holds.

• If R is a relation on a set X, we say R is transitive if whenever both xRy and yRz, then xRz as well.
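The three definitions above can be tested mechanically when a relation on a small set is stored as a set of ordered pairs. In the sketch below, the function names and the `same_parity` relation are made up for illustration; the “less than” relation on {1, 2, 3} is the example mentioned in the text.

```python
def is_reflexive(R, X):
    # xRx must hold for every x in X.
    return all((x, x) in R for x in X)

def is_symmetric(R):
    # Whenever xRy holds, yRx must hold too.
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # Whenever xRy and yRz hold, xRz must hold too.
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

X = {1, 2, 3}
less_than = {(x, y) for x in X for y in X if x < y}
same_parity = {(x, y) for x in X for y in X if (x - y) % 2 == 0}

for name, R in (("less_than", less_than), ("same_parity", same_parity)):
    print(name, is_reflexive(R, X), is_symmetric(R), is_transitive(R))
```

A relation that passes one test can fail the others, so all three properties have to be checked separately.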


We call a relation an equivalence relation if it is reflexive, symmetric and transitive. Each relation of equivalence in Problems 47, 48 and 49 had these three properties, and so each is an equivalence relation. Can you visualize the same three properties in the relations of equivalence that you would use in Problems 41(c) and 44?

50. In how many ways may we string n distinct beads on a necklace without a clasp? Here assume the necklace is made by stringing the beads on a string, and then carefully knotting the two ends of the string together in a way that the joint can’t be seen. Assume someone can pick up the necklace, move it around in space and put it back down, giving an apparently different way of stringing the beads that is equivalent to the first.

51. In the necklace problem our lists are lists of beads. What makes two lists equivalent for the purpose of describing a necklace? Verify explicitly that this relationship of equivalence is reflexive, symmetric, and transitive.

Work through Appendix B if you need more practice with checking whether a relation is an equivalence relation.

2.5 Equivalence Classes

We now look at the clumping behavior you observed in the digraphs of the equivalence relations in the last section.

52. Suppose that R is an equivalence relation on a set X and for each x ∈ X, define the set $C_x = \{y \in X : yRx\}$.

(a) If $C_x$ and $C_z$ have an element y in common, what can you conclude about $C_x$ and $C_z$ (besides the fact that they have an element in common!)? Be explicit about what property(ies) of equivalence relations justify your answer.

(b) Why is every element of X in some set $C_x$? Be explicit about what property(ies) of equivalence relations you are using to answer this question. Notice that we might simultaneously denote a set by $C_x$ and $C_y$. Explain why the union of the sets $C_x$ is X.

(c) Explain why two distinct sets $C_x$ and $C_z$ are disjoint. What do these sets have to do with the “clumping” you saw in the digraphs of Problems 47 and 48?


The sets $C_x$ in Problem 52 are called the equivalence classes of the equivalence relation R. You have just proved that if R is an equivalence relation on the set X, then each element of X is in exactly one equivalence class of R. That is,

Theorem 3. If R is an equivalence relation on X, then the set of equivalence classes of R is a partition of X.
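Theorem 3 can be watched in action on a small example. The sketch below forms the sets $C_x$ for a relation that is not taken from these notes (two integers are related when they leave the same remainder on division by 3, an assumption made purely for illustration) and checks that the distinct classes cover X without overlapping. The function name `equivalence_classes` is also invented here.

```python
def equivalence_classes(X, related):
    """Collect the sets C_x = {y in X : y R x}; repeated sets are merged."""
    return {frozenset(y for y in X if related(y, x)) for x in X}

X = set(range(9))
# Illustrative relation: a R b when a and b have the same remainder mod 3.
classes = equivalence_classes(X, lambda a, b: (a - b) % 3 == 0)

covers_X = set().union(*classes) == X
no_overlap = sum(len(c) for c in classes) == len(X)
print(sorted(sorted(c) for c in classes), covers_X, no_overlap)
```

Because the classes are stored in a set, two elements x and z with $C_x = C_z$ contribute only one class, which matches the remark in Problem 52 that one set may carry several names.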

53. In Problem 44 the equivalence classes correspond to seating arrangements. For each of Problems 41(c), 47, 48, and 50, what does an equivalence class correspond to? (Four answers are expected here.)

54. Given the partition {1, 3}, {2, 4, 6}, {5} of the set {1, 2, 3, 4, 5, 6}, define two elements of {1, 2, 3, 4, 5, 6} to be related if they are in the same part of the partition. That is, define 1 to be related to 3 (and 1 and 3 each related to itself), define 2 and 4, 2 and 6, and 4 and 6 to be related (and each of 2, 4, and 6 to be related to itself), and define 5 to be related to itself. Show that this relation is an equivalence relation.

55. Suppose $P = \{S_1, S_2, S_3, \ldots, S_k\}$ is a partition of S. Define two elements of S to be related if they are in the same set $S_i$, and otherwise not to be related. Show that this relation is an equivalence relation on the set S. Show that the equivalence classes of the equivalence relation are the sets $S_i$.

In Problem 55 you just proved that every partition of a set gives rise to (or induces) an equivalence relation, and that the classes of the induced equivalence relation are the blocks of the original partition. Thus Problems 52 and 55 prove the following theorem.

Theorem 4. A relation R is an equivalence relation on a set S if and only if S may be partitioned into sets $S_1, S_2, \ldots, S_n$ in such a way that xRy if and only if x and y are in the same block $S_i$ of the partition.

In each of Problems 44, 47, 48, and 50 what we were doing was counting the number of equivalence classes of an equivalence relation. There was a special structure to the problems that made this somewhat easy to do. For example, in Problem 47, we had a set of 4 · 3 · 2 = 24 lists of three distinct flavors chosen from V, C, S, and P. Each list was equivalent to 3 · 2 · 1 = 3! = 6 lists, including itself, since the order in which we selected the three flavors was unimportant. Thus the set of all 4 · 3 · 2 lists was a union of some number n of equivalence classes, each of size 6. By the Product Principle, if we have a union of n disjoint sets, each of size 6, the union has 6n elements. But we already knew that the union was the set of all 24 lists of three distinct letters chosen from our four letters. Thus we have 6n = 24, so that we have n = 4 equivalence classes. In Problem 48 there is a subtle change. If we choose the flavors V, C, and S, and arrange them in the dish with C to the right of V and S to the right of C, then the scoops are in different relative positions than if we arrange them instead with S to the right of V and C to the right of S. Thus the order in which the scoops go into the dish is somewhat important, but only somewhat, because putting in V first, then C to its right and S to its right is the same as putting in S first, then V to its right and C to its right. In this case, each list of three flavors is equivalent to only three lists, including itself, and so if there are n equivalence classes, we have 3n = 24. This gives 24/3 = 8 equivalence classes.
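Both counts in this paragraph can be reproduced by grouping the 24 lists into classes. In the sketch below, a class for the pints is represented by the set of flavors used, and a class for the dish by the set of three rotations of a list; these representations and the helper name `rotations` are choices made for this example.

```python
from itertools import permutations

# The 24 lists of three distinct flavors chosen from V, C, S, and P.
lists = list(permutations("VCSP", 3))

# Pints in a bag: two lists are equivalent when they use the same flavors.
pint_classes = {frozenset(p) for p in lists}

def rotations(p):
    """The three lists that describe the same circular dish as p."""
    return frozenset(p[i:] + p[:i] for i in range(3))

# Dish: two lists are equivalent when one is a rotation of the other.
dish_classes = {rotations(p) for p in lists}

print(len(lists), len(pint_classes), len(dish_classes))
```

Every dish class contains exactly three lists, which is what lets the count be found by dividing 24 by 3.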

56. If we have an equivalence relation that divides a set with k elements into equivalence classes each of size m, what is the number n of equivalence classes? Explain why. How can this be used to compute the number of different necklaces in Problem 50?

We restate the result of this problem in the next theorem. Note that it is exactly what we call the Quotient Principle.

Theorem 5. If an equivalence relation on a set of size k has equivalence classes each of the same size m, then the number of equivalence classes is k/m.

57. In how many ways may we attach two identical red beads and two identical blue beads to the corners of a square (with one bead per corner) if the square is free to move around in (three-dimensional) space?

58. What are the equivalence classes (write them out as sets of lists) in the last problem? Why can’t we use Theorem 5 to compute the number of equivalence classes?

→59. (This problem has already been given as Problem 30 on page 20. Now we have several ways to approach the problem.) A tennis club has 2n members. We want to pair up the members by twos for singles matches.

(a) In how many ways may we pair up all the members of the club? Give at least two solutions different from the one you gave in Problem 30. (You may not have done Problem 30. In that case, see if you can find three solutions.)

(b) Suppose that in addition to specifying who plays whom, for each pairing we say who serves first. Now in how many ways may we specify our pairs? Try to find as many solutions as you can.


∗1. Suppose we plan to put six distinct computers in a network as shown in Figure 2.1. The lines indicate which computers can communicate directly with which others. Consider two ways of assigning computers to the nodes of the network different if there are two computers that communicate directly in one assignment and that don’t communicate directly in the other. In how many different ways can we assign computers to the network?

Figure 2.1: A computer network.

→2. In as many ways as you can, show that

$$\binom{n}{k}\binom{n-k}{m} = \binom{n}{m}\binom{n-m}{k}.$$

∗3. We have n identical ping-pong balls. In how many ways may we paint them red, white, blue, and green?

∗4. We have n identical ping-pong balls. In how many ways may we paint them red, white, blue, and green if we use green paint on an even number of them?

Chapter 3

Bijection Principle, Revisited

60. We introduced the Bijection Principle on page 13:

Two sets have the same size if and only if there is a bijection between them.

Identify at least two places where the Bijection Principle was used in Chapter 2.

3.1 Lattice Paths and Catalan Numbers

◦61. In a part of some city, all streets run either north-south or east-west, and there are no dead ends. Suppose we are standing on a street corner. In how many ways may we walk to a corner that is four blocks north and six blocks east, using as few blocks as possible?

·62. Problem 61 has a geometric interpretation in a coordinate plane. A lattice path in the plane is a “curve” composed of line segments that either go from a point (i, j) to the point (i + 1, j) or from a point (i, j) to the point (i, j + 1), where i and j are integers. (Thus lattice paths always move either up or to the right.) The length of the path is the number of such line segments.

(a) What is the length of a lattice path from (0, 0) to (m, n)?

(b) Show there are exactly $\binom{m+n}{n}$ lattice paths from (0, 0) to (m, n).

(c) How many lattice paths are there from (i, j) to (m, n), assuming i, j, m, and n are all integers?


◦63. A school play requires a ten-dollar donation per person; the donation goes into the student activity fund. Assume that each person who comes to the play pays with either a ten-dollar bill or a twenty-dollar bill. The teacher who is collecting the money forgot to get change before the event. If there are always at least as many people who have paid with a ten as a twenty as they arrive, the teacher won’t have to give anyone an IOU for change. Suppose 2n people come to the play, and exactly half of them pay with ten-dollar bills.

(a) Describe a bijection between the set of sequences of tens and twenties given to the teacher and the set of lattice paths from (0, 0) to (n, n). (Be sure to explain why your map is a function, one-to-one, and onto.)

(b) What is the geometric interpretation of a sequence that does not require the teacher to give any IOUs?
Hint. Each such sequence corresponds to a path that can’t cross over (but may touch) a certain line.

Notice that a lattice path from (0, 0) to (n, n) stays inside (or on the edges of) the square whose sides are the x-axis, the y-axis, the line x = n and the line y = n. It may or may not stay within or on the triangle whose sides are the x-axis, the line x = n and the line y = x. Any lattice path that does stay within this triangle is called a Catalan path. In Figure 3.1 we show the lattice points which form the triangle for n = 4, whose sides are the x-axis, the line x = 4 and the line y = x. The next sequence of problems is designed to compute $C_n$, the number of Catalan paths from (0, 0) to (n, n). An important tool in these problems is the Feller Reflection Principle, in which a lattice path P from (0, 0) to (n, n) which is not Catalan is transformed into a lattice path from (−1, 1) to (n, n).

Figure 3.1: The Catalan paths from (0, 0) to (i, i) for i = 0, 1, 2, 3, 4. The number of paths to the point (i, i) is written just above the point. [Figure not reproduced; the numbers shown are 1, 1, 2, 5, 14 for i = 0, 1, 2, 3, 4.]
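The small counts 1, 1, 2, 5, 14 recorded in Figure 3.1 can be reproduced by listing every lattice path from (0, 0) to (n, n) and keeping those that stay within the triangle. The sketch below does this directly from the definitions; the function names are ours, and it is a brute-force check rather than the counting argument the following problems ask for.

```python
from itertools import combinations
from math import comb

def lattice_paths(n):
    """Yield every lattice path from (0, 0) to (n, n) as a string of R and U steps."""
    for right_positions in combinations(range(2 * n), n):
        chosen = set(right_positions)
        yield "".join("R" if i in chosen else "U" for i in range(2 * n))

def stays_in_triangle(path):
    """Return True if the path never rises above the line y = x."""
    x = y = 0
    for step in path:
        if step == "R":
            x += 1
        else:
            y += 1
        if y > x:
            return False
    return True

def count_catalan_paths(n):
    """Count the Catalan paths from (0, 0) to (n, n) by exhaustive search."""
    return sum(1 for path in lattice_paths(n) if stays_in_triangle(path))

catalan_counts = [count_catalan_paths(n) for n in range(5)]
all_path_counts = [comb(2 * n, n) for n in range(5)]  # every path, Catalan or not
print(catalan_counts)
print(all_path_counts)
```

Comparing the two printed lists shows how many paths fail to be Catalan for each n, which is the quantity the reflection argument is designed to count.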


64. Let P be a lattice path from (0, 0) to (n, n) which is not Catalan.

(a) Show the path P must have at least one point on the line y = x + 1. Let Q be the point whose x-coordinate is least among all these points of intersection with the line y = x + 1.

(b) Take the part of path P which lies from (0, 0) to this point Q, and reflect it about the line y = x + 1. (That is, working backwards from Q, replace every upstep with a step one unit to the left and every right step with a step one unit down.) Show that this new path is a lattice path from (−1, 1) to (n, n).
Hint. Try it on a few non-Catalan lattice paths from (0, 0) to (4, 4).

(c) Find a bijection between non-Catalan lattice paths from (0, 0) to (n, n) and lattice paths from (−1, 1) to (n, n). Be sure you check that you have a bijection.

(d) Use the Bijection Principle to find a formula for the Catalan number $C_n$, defined to be the number of Catalan paths from (0, 0) to (n, n).
Hint. Noting that a path either touches the line y = x + 1 or it doesn’t, use the Sum Principle (which could be called the Difference Principle here).

3.2 Undirected Graphs

In Section 1.3 we introduced the idea of a directed graph (or digraph), and here we talk about undirected graphs, usually simply called graphs, which consist of vertices and edges. We describe vertices and edges in much the same way as points and lines are described in geometry: We don’t really say what vertices and edges are, but we say what they do.

A graph consists of a set V called a vertex set and a set E called an edge set. Each member of V is called a vertex (plural: vertices) and each member of E is called an edge (plural: edges). Associated with each edge are two (not necessarily different) vertices called its endpoints. We draw pictures of graphs by drawing points to represent the vertices and line segments (curved if we choose) whose endpoints are at vertices to represent the edges.

In Figure 3.2 we show three pictures of graphs. Each grey circle in the figure represents a vertex; each line segment represents an edge. You will note that we have labelled the vertices; we can choose labels or not as we please. The third graph also shows that it is possible to have an edge that connects a vertex to itself or it is possible to have two or more edges between two vertices. The degree of a vertex is the number of times it appears as the endpoint of edges; thus the degree of y in the third graph in the figure is four.

Figure 3.2: Three different graphs. [Figure not reproduced; the vertex labels are 1 through 8 in the first graph, a through f in the second, and v, w, x, y, z in the third.]

◦65. (a) What is the degree of each vertex in the graph on the left in Figure 3.2?

(b) For each graph in Figure 3.2 is the number of vertices of odd degree even or odd?

→·66. The sum of the degrees of the vertices of a (finite) graph is related in a natural way to the number of edges.

(a) What is the relationship?

(b) Find a proof of your statement in part (a).

·67. What can you say about the number of vertices of odd degree in a graph?

A walk in a graph is an alternating sequence $v_0\, e_1\, v_1 \ldots e_k\, v_k$ of vertices and edges such that all consecutive vertices $v_{i-1}$ and $v_i$ are the endpoints of the edge $e_i$. A graph is called connected if for any pair of vertices there is a walk starting at one vertex and ending at the other vertex.
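The definitions of degree and connectedness translate directly into a small program. The sketch below stores a graph as a list of vertices and a list of edges, each edge being a pair of endpoints, so that loops and repeated edges are allowed. The graph used here is a made-up example and is not one of the graphs in Figure 3.2; the function names are ours.

```python
# A hypothetical graph: two parallel edges between q and r, a loop at r,
# and a separate edge between s and t.
vertices = ["q", "r", "s", "t"]
edges = [("q", "r"), ("q", "r"), ("r", "r"), ("s", "t")]

def degrees(vertices, edges):
    """Count how many times each vertex appears as an endpoint of an edge."""
    count = {v: 0 for v in vertices}
    for u, v in edges:
        count[u] += 1
        count[v] += 1  # a loop has the same endpoint twice, so it adds 2
    return count

def is_connected(vertices, edges):
    """Check whether every vertex can be reached by a walk from the first one."""
    if not vertices:
        return True
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        current = stack.pop()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(vertices)

print(degrees(vertices, edges))
print(is_connected(vertices, edges))
print(is_connected(vertices, edges + [("r", "s")]))
```

Adding the single edge between r and s joins the two pieces, which is why the last call reports a connected graph.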


68. Which of the graphs in Figure 3.2 are connected?

◦69. A path in a graph is a walk with no repeated vertices. Find the longest path you can in the third graph of Figure 3.2.

◦70. A cycle in a graph is a walk (with at least one edge) whose first and last vertex are the same but which has no other repeated vertices or edges. Which graphs in Figure 3.2 have cycles? What is the largest number of edges in a cycle in the second graph in Figure 3.2? What is the smallest number of edges in a cycle in the third graph in Figure 3.2?

3.3 Trees

◦71. A connected graph with no cycles is called a tree. Which graphs (if any) in Figure 3.2 are trees?

72. In a tree with n vertices, given two vertices, how many paths can you find between them? Prove that you are correct.

73. Find all trees with at most 4 vertices. Give an argument which shows that every tree with at least two vertices has at least one vertex of degree 1. For every n ≥ 3, explain how to construct a tree with n vertices that has n − 1 vertices of degree 1.

74. For any tree with n ≥ 2 vertices, remove one of its vertices of degree 1 and the edge containing that vertex (but do not remove the other endpoint of the edge). Prove that the graph that remains is a tree.

→·75. On the basis of your examples of trees with at most 4 vertices, make a conjecture about the relationship between the number of vertices and edges in a tree.

76. Prove your conjecture from Problem 75 by contradiction. For this, assume there is at least one counterexample, that is, a tree which does not have the correct number of edges. Let $n_0$ be the smallest number of vertices among all such counterexamples.¹ Then prove that a counterexample cannot exist.

¹ The fact that every nonempty set of positive integers has a smallest element is called the Well-Ordering Principle. In an axiomatic development of numbers, one takes the Well-Ordering Principle or some equivalent principle as an axiom.


77. Show that any tree with at least two vertices must have at least two vertices of degree 1. Show that this is the best possible result: For all n ≥ 2, find a tree with n vertices that has exactly two vertices of degree 1.

Hint. One approach to the problem is to use facts that we already know about degrees of vertices and edges.

3.4 Labelled Trees and Prufer Codes

Figure 3.3: The three labelled trees on three vertices. [Figure not reproduced; the trees shown are the paths 1–2–3, 2–3–1, and 2–1–3.]

→78. How many labelled trees are there on the vertex set {1, 2}? On the vertex set {1, 2, 3}? (Note that when we label the vertices of our tree, we use the convention that the tree which has edges between vertices 1 and 2 and between vertices 2 and 3 is different from the tree that has edges between vertices 1 and 3 and between vertices 2 and 3. See Figure 3.3.) How many (labelled) trees are there on four vertices? How many (labelled) trees are there with five vertices? You don’t have a lot of data to guess from, but try to guess a formula for the number of labelled trees with vertex set [n].

Hint. When you get to four and especially five vertices, draw all the unlabelled trees you can think of, and then figure out in how many different ways you can put labels on the vertices.

We are now going to introduce a method that will prove the formula you just guessed in the last problem. Given a tree with n ≥ 2 vertices which has been labelled with elements of [n], we define a sequence $b_1, b_2, \ldots$ of integers inductively as follows:

• If the tree has two vertices, the sequence consists of one entry, the label of the vertex with the larger label. Otherwise, let $a_1$ be the lowest-numbered vertex of degree 1 in the tree.² Let $b_1$ be the label of the unique vertex in the tree adjacent to $a_1$ and write down $b_1$. For example, in the first graph in Figure 3.2, $a_1$ is 1 and $b_1$ is 2.

• Given $a_1$ through $a_{i-1}$, let $a_i$ be the lowest-numbered vertex of degree 1 in the tree³ you get by deleting $a_1$ through $a_{i-1}$ and let $b_i$ be the unique vertex in this new tree adjacent to $a_i$. For example, in the first graph in Figure 3.2, $a_2 = 2$ and $b_2 = 3$. Then $a_3 = 5$ and $b_3 = 4$.

We use B to stand for the sequence of $b_i$’s obtained in this way. For the tree (the first graph) in Figure 3.2, the sequence B is 2344378. At this point, you should work as a group to draw some other labelled trees on eight vertices and to construct the associated sequence B.
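The construction of B can be carried out mechanically, which is handy for checking the sequences you build by hand. The sketch below follows the two bullet points above. Since the figure is not reproduced here, the edge list is an assumption: it is the tree we reconstructed from the values of $a_i$ and $b_i$ quoted in the text, and the function name is ours.

```python
def leaf_removal_sequence(n, edges):
    """Return the sequence b_1, b_2, ... for a tree on vertices 1..n."""
    adjacent = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    sequence = []
    for _ in range(n - 1):
        # a_i: the lowest-numbered vertex of degree 1 in the current tree.
        a = min(v for v in adjacent if len(adjacent[v]) == 1)
        # b_i: the unique vertex adjacent to a_i.
        b = next(iter(adjacent[a]))
        sequence.append(b)
        # Delete a_i and its edge before looking for the next leaf.
        adjacent[b].remove(a)
        del adjacent[a]
    return sequence

# Assumed edge list, reconstructed from the sequence stated in the text.
example_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (3, 7), (7, 8)]
B = leaf_removal_sequence(8, example_edges)
print(B)
```

When only two vertices remain, the lowest-numbered leaf is the smaller label, so the entry written down is the larger label, in agreement with the first bullet point.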

79. (a) How long will the sequence B be if it is computed from a labelled tree with n vertices (labelled with 1 through n)?

(b) From your examples, what can you say about the last member of the sequence of $b_i$’s? Explain.

(c) Can you tell from the sequence of $b_i$’s what $a_1$ is?

The sequence $b_1, b_2, \ldots, b_{n-2}$ is called a Prufer coding or Prufer code for the tree. Thus the Prufer code for the labelled tree of Figure 3.2 is B = 234437. Notice that we do not include the term $b_{n-1}$ in the Prufer code because we know it is n.

Let S be the set of all labelled trees on nine vertices. For each graph G ∈ S, define P(G) to be the Prufer code for G. Why is the relation { (G, P(G)) : G ∈ S } a function with domain S? What is a co-domain for this function? (At this point, you don’t have to find the smallest co-domain.)

80. Play the following game in your group: Each of you should individually choose a G ∈ S and secretly find its Prufer code, P(G). The other members of the group should try to find a labelled graph which has your sequence as its Prufer code. Is there only one? What does that say about the function P?

81. Now as a group, write down any sequence of seven integers from [9]. Try to find a graph G such that this sequence is P(G). Do this for several different sequences. What does this say about the function P?

² Notice that we’re using the existence of such a vertex, which you proved in Problem 73, as well as Problem 77.

³ In Problem 74 you proved that a tree always results at each stage.


82. Find a bijection between the set of labelled trees with n vertices and another set you can “count” that will tell you how many labelled trees there are on n labelled vertices.

I want to thank the Fall 2003 Math 399 class for the idea of writing the last sequence of problems as a game.

In addition to providing us with a way to count labelled trees, there is a good bit of interesting information encoded in the Prufer code for a tree.

83. What can you say about the vertices of degree one from the Prufer code for a tree labelled with the integers from 1 to n?
Hint. What vertex or vertices in the sequence $b_1, b_2, \ldots, b_{n-1}$ can have degree 1?

84. What can you say about the Prufer code for a tree with exactly two vertices of degree 1 (and perhaps some vertices with other degrees as well)? Does this characterize such trees?

→85. What can you determine about the degree of the vertex labelled i from the Prufer code of the tree?
Hint. If a vertex has degree 1, how many times does it appear in the Prufer code of the tree? What about a vertex of degree 2?

→86. What is the number of (labelled) trees on n vertices with three vertices of degree 1? (Assume they are labelled with the integers 1 through n.)
Hint. How many vertices appear exactly once in the Prufer code of the tree and how many appear exactly twice?


·1. Write down a list of all sixteen 0-1 sequences of length four, starting with 0000 in such a way that each entry differs from the previous one by changing just one digit. This is called a Gray Code. That is, a Gray Code for 0-1 sequences of length n is a list of the sequences so that each entry differs from the previous one in exactly one place. Can you describe how to get a Gray Code for 0-1 sequences of length five from the one you found for sequences of length 4? Can you describe how to prove that there is a Gray code for sequences of length n?

→2. Use the idea of a Gray Code from Problem 1 to prove bijectively that the number of even-sized subsets of an n-element set equals the number of odd-sized subsets of an n-element set.

→3. A list of parentheses is said to be balanced if there are the same number of left parentheses as right, and as we count from left to right we always find at least as many left parentheses as right parentheses. For example, (((()()))()) is balanced and neither ((()) nor (()()))(() is balanced. How many balanced lists of n left and n right parentheses are there?

Hint. Catalan numbers.

→∗4. How many labelled trees on n vertices have exactly four vertices of degree 1?

→∗5. The degree sequence of a graph is a list of the degrees of the vertices in non-increasing order. For example the degree sequence of the first graph in Figure 3.2 is (3, 3, 2, 2, 1, 1, 1, 1). For a graph with vertices labelled 1 through n, the ordered degree sequence of the graph is the sequence $d_1, d_2, \ldots, d_n$ in which $d_i$ is the degree of vertex i.

(a) How many labelled trees are there on n vertices with ordered degree sequence $d_1, d_2, \ldots, d_n$?

(b) How many labelled trees are there on n vertices with the degree sequence in which the degree d appears $i_d$ times?


Chapter 4

Inductive Reasoning in Discrete Mathematics

4.1 The Principle of Mathematical Induction

One way of looking at the Principle of Mathematical Induction is that it tells us that if we know the “small” cases of a theorem and we can derive each other case of the theorem from a smaller case, then the theorem is true in all cases. However, this particular way of stating the principle is unnecessarily restrictive because it requires us to derive each case from the immediately preceding case, or from some other previous case. This restriction can be weakened, and removing it leads us to a more general statement of the Principle of Mathematical Induction which people often call the Strong Principle of Mathematical Induction. We will refer to it simply as the Principle of Mathematical Induction.

The Principle of Mathematical Induction

In order to prove a statement about an integer n, if we can

1. prove our statement when n = b and

2. prove that the statements we get with n = b, n = b + 1, . . . , n = k − 1 imply the statement with n = k,

then our statement is true for all integers n ≥ b.

87. What postage do you think we can make using only three-cent and five-cent stamps? If you have an unlimited supply of these two types of stamps, do you think that there is a number N such that if n ≥ N, then we can make n cents worth of postage?

You probably see that we can make n cents worth of postage as long as n is at least 8. However, for instance you didn’t try to make 13 cents in postage by working with the fact that four 3-cent stamps made 12 cents. Rather, you saw that you could get 10 cents (two fives) and then could add a 3-cent stamp to that to get 13 cents. Thus we need to use more than one “base case” if we want to prove by induction that we are correct. Here’s a possible proof:

We know that we can make 8 cents with one 3-cent stamp and one 5-cent stamp. We let k be a number greater than 8, and assume that it is possible to make any amount between 8 and k − 1 cents in postage with three- and five-cent stamps. Now if k is less than 11, it is 9 or 10. We can make 9 with three 3-cent stamps and 10 with two 5-cent stamps. (So we can assume that k − 1 ≥ 10, that is, k − 3 ≥ 8.) Since k − 3 is between 8 and k − 1 (inclusive), then by our inductive hypothesis we know that k − 3 cents worth of postage can be made with some combination of 3- and 5-cent stamps. Adding another 3-cent stamp makes k cents worth of postage. Thus by the Principle of Mathematical Induction, we can make n cents in stamps with three- and five-cent stamps for each n ≥ 8.

Let’s break down this argument further. Some people might say that we really had three base cases (namely, n = 8, 9, and 10) in the proof above and once we had proved those three consecutive base cases, then we could reduce any other case to one of these base cases by successively subtracting 3 (and we can achieve the stamp difference with 3-cent stamps). That is an appropriate way to look at the proof. In fact, a computer scientist might say that if we want to write a program that figures out how to make n cents in postage, we use one method for the cases n = 8 to n = 10, and then a general method (namely, adding 3-cent stamps) for all the other cases. So to write a program it is important to think in terms of having multiple base cases. How do you know what your base cases are? You have to do (at least) as many base cases as your proof requires. For instance, if we had used only the two base cases of k = 8 and k = 9 then the method could not possibly get k = 10, since k − 3 = 7 cannot be made using only 3-cent and 5-cent stamps.
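The program that the computer scientist has in mind might look like the sketch below. It is one possible version, with a function name of our choosing: the three base cases are stored in a table, and every other case is reduced to a smaller one by setting aside one 3-cent stamp, just as in the proof.

```python
def stamps(n):
    """Return (threes, fives) with 3 * threes + 5 * fives == n, for n >= 8."""
    if n < 8:
        raise ValueError("this method only handles n >= 8")
    # The three base cases used in the proof.
    base_cases = {8: (1, 1), 9: (3, 0), 10: (0, 2)}
    if n in base_cases:
        return base_cases[n]
    # Inductive step: make n - 3 cents, then add one more 3-cent stamp.
    threes, fives = stamps(n - 3)
    return threes + 1, fives

for n in range(8, 16):
    print(n, stamps(n))
```

If the entry for 10 is removed from the table, the call stamps(10) reduces to stamps(7) and fails, which mirrors the remark above about needing all three base cases.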

88. What postage can be made with five- and six-cent stamps? Do you think that there is a number N such that if n ≥ N, then we can make n cents worth of postage?


+ 89. Prove by induction on m ≥ 1 that there are $n^m$ functions from [m] to [n]. So, this proves the conjecture in Problem 34(a) (on page 21) without appealing to the General Product Principle.

Hint. Think of using the (ordinary or simple) Product Principle.

If you feel uncomfortable with the Principle of Mathematical Induction, you should read Appendix C. Reading the appendix is not required and most of you should continue with the rest of the book without reading it. Inductive reasoning will be used in the rest of this chapter as well as in the rest of the book.

4.2 Inductive Definitions

We have already used factorial notation. One way to describe n! is to define it in two stages: The base case of 0! = 1 and the inductive step n! = n(n − 1)! for n > 0. By the Principle of Mathematical Induction, this pair of equations defines n! for all nonnegative integers n. For this reason we call the definition an inductive definition. (An inductive definition is sometimes called a recursive definition.) We can often get very easy proofs of useful facts by using inductive definitions.
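An inductive definition is also a recipe for computing. As a small illustration (the function name is ours), the two stages of the definition of n! become the two branches of a recursive function:

```python
def factorial(n):
    """Compute n! from the inductive definition 0! = 1 and n! = n * (n - 1)!."""
    if n < 0:
        raise ValueError("n! is defined here only for nonnegative integers")
    if n == 0:
        return 1                     # base case
    return n * factorial(n - 1)      # inductive step

print([factorial(n) for n in range(6)])
```

Each call with n > 0 refers only to the value at n − 1, so the computation always works its way down to the base case.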

→90. An inductive definition of $a^n$ for nonnegative n is given by $a^0 = 1$ and $a^n = a \cdot a^{n-1}$. (Notice the similarity to the inductive definition of n!.) We remarked above that inductive definitions often give us easy proofs of useful facts. Here we apply this inductive definition to prove two useful facts about exponents that you have been using almost since you learned the meaning of exponents.

(a) Use this definition to prove the rule of exponents $a^{m+n} = a^m a^n$ for nonnegative m and n.
Hint. This may look difficult because one can’t decide in advance on whether to try to induct on m, on n, on their sum, or on some other quantity which involves m and/or n.

(b) Use this definition to prove the rule of exponents $a^{mn} = (a^m)^n$.

+ 91. Suppose that f is a function on the nonnegative integers such that f(0) = 0 and f(n) = n + f(n − 1). Prove that f(n) = n(n + 1)/2. Notice that this gives another proof that 1 + 2 + · · · + n = n(n + 1)/2, because this sum satisfies the two conditions for f. (The sum is taken to be 0 when n = 0 because it has no terms.)


4.3 Recurrences

92. How is the number of subsets of an n-element set related to the number of subsets of an (n − 1)-element set? Prove that you are correct.

93. Explain why it is that the number of bijections from an n-element set to an n-element set is equal to n times the number of bijections from an (n − 1)-element set to an (n − 1)-element set.

We can summarize the observations in the last two problems as follows. If $s_n$ stands for the number of subsets of an n-element set, then

$$s_n = 2s_{n-1}, \tag{4.1}$$

and if $b_n$ stands for the number of bijections from an n-element set to an n-element set, then

$$b_n = n\,b_{n-1}. \tag{4.2}$$

Equations 4.1 and 4.2 are examples of recurrence equations, which are sometimes called recurrence relations, or simply recurrences. A recurrence is an equation that expresses the n-th term $a_n$ of a sequence in terms of other values $a_i$ for i < n. Other examples of recurrences are

$$a_n = a_{n-1} + 7, \tag{4.3}$$

$$a_n = 3a_{n-1} + 2^n, \tag{4.4}$$

$$a_n = a_{n-3} + 3a_{n-2}, \tag{4.5}$$

$$a_n = a_1 a_{n-1} + a_2 a_{n-2} + \cdots + a_{n-1} a_1. \tag{4.6}$$

A solution to a recurrence is any sequence that satisfies the recurrence. Thus the sequence given by $s_n = 2^n$ is a solution to Recurrence 4.1, and note that $s_n = 17 \cdot 2^n$ and $s_n = -13 \cdot 2^n$ are also solutions to Recurrence 4.1. What this shows is that a recurrence can have infinitely many solutions, but in a given problem there is generally one solution that is of interest to us. For example, if we are interested in the number of subsets of a set, then the solution to Recurrence 4.1 that we care about is $s_n = 2^n$. Notice this is the only solution we have mentioned that begins with $s_0 = 1$, the number of subsets of the empty set.
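It is easy to test numerically that the three sequences just mentioned satisfy Recurrence 4.1 and that only one of them starts at 1. The sketch below checks the first twenty terms of each; the helper names are ours, and a finite check of this kind is evidence, not a proof.

```python
def scaled_power_of_two(c):
    """Return the sequence s with s(n) = c * 2**n."""
    return lambda n: c * 2 ** n

def satisfies_recurrence_4_1(s, terms=20):
    """Check s(n) == 2 * s(n - 1) for n = 1, ..., terms - 1."""
    return all(s(n) == 2 * s(n - 1) for n in range(1, terms))

candidates = {c: scaled_power_of_two(c) for c in (1, 17, -13)}
for c, s in candidates.items():
    print(c, satisfies_recurrence_4_1(s), s(0))
```

All three candidates pass the recurrence test, but their initial values are 1, 17, and −13, so the condition $s_0 = 1$ singles out the first one.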

94. Use induction to show that there is only one solution to Recurrence 4.1 that begins with $s_0 = 1$. This gives another proof of the theorem on the number of subsets at the end of Chapter 1 (page 16).


95. A first-order recurrence is one which expresses $a_n$ in terms of $a_{n-1}$ (to the first power) and other functions of n, but does not include any of $a_0, \ldots, a_{n-2}$ in the equation.

(a) Which of the Recurrences 4.1 through 4.6 are first-order recurrences?

(b) Show that there is one and only one sequence $a_n$ that has all of the following properties: it is defined for every nonnegative integer n; it satisfies a given first-order recurrence; and it satisfies $a_0 = a$ for some fixed constant a.

Figure 4.1: The Towers of Hanoi Puzzle

→96. The Towers of Hanoi Puzzle (refer to Figure 4.1) has three rods rising from a rectangular base and n rings of different sizes. At the beginning of the puzzle all rings are stacked on one rod in order of decreasing size. A legal move consists of moving a ring from one rod to another in such a way that it does not land on top of a smaller ring. If $m_n$ is the number of moves required to move all the rings from the initial rod to another rod of your choosing, give a recurrence for $m_n$. Explain.

·97. In Problem 66 (on page 36) you proved that the sum of the degrees of the vertices of a (finite) graph equals twice the number of edges in the graph. There are several proofs of this fact.

(a) Find a proof that uses induction on the number of edges.

(b) Find a proof that uses induction on the number of vertices.

4.4 Proof of the General Product Principle

We have now reached the point where you can prove the General Product Principle which you’ve already used in Section 2.1. The next problem can be considered as a warmup exercise. Recall that the simplest form of the Sum Principle says

The size of the union of two disjoint (finite) sets is the sum of their sizes.

98. Use the simplest form of the Sum Principle to prove the Sum Principle for partitions of a set:

If we have a partition of a finite set S, then the size of S is the sum of the sizes of the blocks of the partition.

Hint. You should choose a useful variable to induct on. The number of blocks in the partition? The size of the first block of the partition? The size of the set we are partitioning? Or something else?

In Problem 28 (on page 19) we gave the following form of the General Product Principle:

If we make a sequence of m choices for which

• there are $k_1$ possible first choices, and

• for each way of making the first i − 1 choices, there are $k_i$ ways to make the i-th choice,

then we may make our sequence of choices in $k_1 k_2 \cdots k_m$ ways.

In Problem 35 we stated the General Product Principle (on page 21) as follows:

Let S be a set of functions f from [n] to some set X. Suppose that

• there are $k_1$ choices for f(1), and

• for each choice of f(1), f(2), . . . , f(i − 1), there are $k_i$ choices for f(i).

Then the number of functions in the set S is $k_1 k_2 \cdots k_n$.
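To see the second formulation at work on a small case, take S to be the set of one-to-one functions from [3] to [5]. There are 5 choices for f(1), then 4 for f(2), then 3 for f(3), whatever the earlier choices were. The sketch below (names are ours, and the example is our own) counts S by listing every function and compares the result with the product $k_1 k_2 k_3$.

```python
from itertools import product
from math import prod

def count_one_to_one_functions(n, x_size):
    """Count one-to-one functions from [n] to [x_size] by listing all functions."""
    X = range(1, x_size + 1)
    # A function f from [n] to X is recorded as the tuple (f(1), ..., f(n)).
    return sum(1 for f in product(X, repeat=n) if len(set(f)) == n)

choices = [5, 4, 3]  # k_1, k_2, k_3 for one-to-one functions from [3] to [5]
brute_force_count = count_one_to_one_functions(3, 5)
print(brute_force_count, prod(choices))
```

The agreement of the two numbers illustrates the principle in one instance; it does not replace the inductive proof asked for below.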

You may use either of these ways of stating the General Product Principle in the following problem.

+ 99. Prove either General Product Principle from the Product Principle:

If we have a partition of a finite set S into m blocks, each of the same size n, then S has size mn.


4.5 Spanning Trees

Many of the applications of trees arise from trying to find an efficient way to connect all the vertices of a graph by a path. For example, in a telephone network, at any given time we have a certain number of wires (or microwave channels, or cellular channels) available for use. These wires or channels go from one specific place to another specific place, and so the wires or channels may be thought of as edges of a graph and the places where the wires connect may be thought of as vertices of that graph. A tree whose vertices are all of the vertices of the graph G and whose edges are some of the edges of the graph G is called a spanning tree of G. A spanning tree for a telephone network gives a way to route calls between any two vertices in the network that uses the minimum number of wires. For example, Figure 4.2 contains all spanning trees of the graph on the far left of the figure.

Figure 4.2: A graph and all its spanning trees.

100. Show that every connected graph has a spanning tree. It is possible to find a proof that starts with the graph and works “down” towards the spanning tree and to find a proof that starts with just the vertices and works “up” towards the spanning tree. Try to find both kinds of proof.

Our motivation for talking about spanning trees was the idea of finding a minimum number of edges needed to connect all the vertices of a communication network together. In many cases the edges of a communication network have costs associated with them. For example, one cell-phone operator might charge another one when a customer of the first uses an antenna of the other.

Suppose a company has offices in a number of cities and wants to put together a communication network connecting its various locations with high-speed communication lines, and to do so at minimum cost. This can be modeled by a graph whose vertices are the cities in which it has offices and whose edges represent possible communications lines between the cities. Of course there will not necessarily be lines between each pair of cities; in fact the company will not want to pay for a line connecting city i and city j if it can already connect them indirectly by using other lines it has chosen. From this discussion we see that the company has a graph theory model of the problem and will want to choose a spanning tree of minimum cost among all spanning trees of the communications graph. This special tree is often called a minimal spanning tree for the graph. For this type of application, numbers are assigned to the edges of a graph and the sum of the numbers on the edges of a spanning tree will be called the cost of the spanning tree.
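For a very small graph, the definitions of spanning tree and cost can be applied directly: list every set of edges of the right size, keep those that form a spanning tree, and compare costs. The sketch below does exactly that for a made-up graph on four cities (the graph, the costs, and the function names are our assumptions). It is deliberately naive, it assumes the graph is connected, and it is not the efficient method the next problem asks you to find.

```python
from itertools import combinations

def is_spanning_tree(vertices, chosen):
    """True if the chosen edges have |V| - 1 members and contain no cycle."""
    if len(chosen) != len(vertices) - 1:
        return False
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v, _cost in chosen:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True

def cost(tree):
    """Sum of the numbers on the edges of the tree."""
    return sum(c for _u, _v, c in tree)

def cheapest_spanning_tree(vertices, edges):
    """Search all spanning trees of a connected graph for one of least cost."""
    trees = [t for t in combinations(edges, len(vertices) - 1)
             if is_spanning_tree(vertices, t)]
    return min(trees, key=cost)

# Hypothetical network: each edge is (city, city, cost).
cities = [1, 2, 3, 4]
lines = [(1, 2, 1), (2, 3, 2), (3, 4, 3), (1, 4, 4), (1, 3, 5)]
best = cheapest_spanning_tree(cities, lines)
print(best, cost(best))
```

The check in is_spanning_tree relies on the fact that a graph with |V| − 1 edges and no cycle must be connected, so it is a tree.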

→101. Describe an inductive method (or better, two methods different in at least one aspect) for finding a spanning tree of minimum cost in a connected graph whose edges are labelled with costs. Prove that your method(s) work by contradiction, beginning by assuming there exists a spanning tree whose cost is lower.

Hint. Think of selecting one edge of the tree at a time.

The method you used in Problem 101 is called a greedy method, because each time you made a choice of an edge, you chose the least costly edge available to you.

There are two operations on graphs that we can apply to get a recurrencewhich will allow us to compute the number of spanning trees of a graph.Each operation is applied to an edge e of a graph G. The first is calleddeletion: we delete the edge e from the graph by removing it from the edgeset (but don’t remove either of its endpoints). Work through Figure 4.3for an example of how a sequence of edge deletions can be used to get aspanning tree.

The second operation is called contraction of an edge. Intuitively, we contract an edge by shrinking its length until its endpoints coincide and we let the rest of the graph “go along for the ride.” To be more precise, we contract the edge e with endpoints v and w as follows:

(a) remove from the edge set all edges having either v or w (or both) for an endpoint;

(b) remove v and w from the vertex set;


Figure 4.3: Deleting two appropriate edges from this graph gives a spanning tree.

(c) add a new vertex E to the vertex set;

(d) for each remaining vertex that had an edge removed in part (a), add an edge from the vertex to E;

(e) add an edge from E to E for any edge other than e whose endpoints were in the set {v, w}.

The wording for this process of contraction is more complicated than the process. Go through the examples in Figure 4.4 to get a better understanding of the idea.

Figure 4.4: The results of contracting three different edges in a graph.

We use G \ e (read as G minus e) to represent the graph that results from deleting e from G, and we use G/e (read as G contract e) for the result of contracting e in G.
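The two operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a graph is stored as a set of vertices plus a list of edges (pairs of endpoints), an edge is identified by its position in that list, and the function names `delete` and `contract` are ours, not standard ones. We also assume that contraction keeps one new edge for each edge removed in step (a), so parallel edges survive.

```python
def delete(vertices, edges, e):
    """Return G minus e: drop edge number e, keep every vertex."""
    return set(vertices), [edge for idx, edge in enumerate(edges) if idx != e]


def contract(vertices, edges, e, new_vertex="E"):
    """Return G contract e: merge the endpoints of edge number e into new_vertex."""
    v, w = edges[e]
    merged = {v, w}
    # Steps (b) and (c): remove v and w, then add the new vertex E.
    new_vertices = (set(vertices) - merged) | {new_vertex}
    new_edges = []
    for idx, (a, b) in enumerate(edges):
        if idx == e:
            continue  # the contracted edge itself disappears
        # Steps (a), (d), (e): every endpoint equal to v or w is redirected to E.
        # An edge other than e with both endpoints in {v, w} becomes a loop at E.
        a2 = new_vertex if a in merged else a
        b2 = new_vertex if b in merged else b
        new_edges.append((a2, b2))
    return new_vertices, new_edges


# A triangle on vertices 1, 2, 3; edge number 0 joins 1 and 2.
triangle_vertices = {1, 2, 3}
triangle_edges = [(1, 2), (2, 3), (1, 3)]
deleted = delete(triangle_vertices, triangle_edges, 0)
contracted = contract(triangle_vertices, triangle_edges, 0)
```

Contracting one side of the triangle leaves two parallel edges between E and vertex 3, which is why the sketch stores edges in a list rather than a set.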

→·102. (a) How do the number of spanning trees of G not containing the edge e and the number of spanning trees of G containing e relate to the number of spanning trees of G \ e and G/e?

Hint. If you have a spanning tree of G that contains e, is the graph that results from that tree by contracting e still a tree?

(b) Use #(G) to represent the number of spanning trees of a graph G (so that, for example, #(G/e) equals the number of spanning trees of G/e). Find an expression for #(G) in terms of #(G/e) and #(G \ e). This expression is called the deletion-contraction recurrence. In what sense can this be considered a recurrence?

(c) Use the recurrence of the previous part repeatedly to show that the graph in Figure 4.5 has twenty-one spanning trees.

Figure 4.5: A graph.

4.6 Shortest Paths in Graphs

Let us return to the application we considered in the last section. Suppose that a company has a main office in one city and regional offices in other cities. Most of the communication in the company is between the main office and the regional offices, so the company wants to find a spanning tree that minimizes not the total cost over all possible edges, but rather the cost of communication between the main office and each of the regional offices. It is not clear that such a spanning tree even exists, and this problem is a special case of the following. We have a connected graph with nonnegative numbers (called weights) assigned to each edge. The (weighted) length of a path in the graph is the sum of the weights of its edges, and the distance between two vertices is the least (weighted) length of any path between the two vertices. There are two optimization (actually minimization) problems inherent here: Given a vertex v, what is the distance between v and each other vertex? Given a vertex v, can you find a spanning tree in G such that the length of the path in the spanning tree from v to each vertex x is the distance from v to x in G?

Consider the following inductive process, which is known as Dijkstra’s algorithm. The algorithm is applied to a weighted graph whose vertices are labelled 1 to n.


• Let d(1) := 0. Let d(i) := ∞ for all other i.
Let v(1) := 1. Let v(j) := 0 for all other j.
For each i and j, let w(i, j) be the minimum weight of an edge between i and j, or ∞ if there are no such edges.
Let k := 1. Let t := 1.

• For each i, if d(i) > d(k) + w(k, i) let d(i) = d(k) + w(k, i).

• Among those i with v(i) = 0, choose one for which d(i) is a minimum, and let k = i. Increase t by 1. Let v(i) = 1.

• Repeat the previous two steps until t = n.
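If you would like to check your hand computations, the steps above can be transcribed almost line for line into Python. The sketch below is one possible transcription, not the only one: the function name `dijkstra_distances` and the input format (a list of triples `(i, j, weight)`) are our own choices, and we read "repeat until t = n" as "stop as soon as t equals n".

```python
import math


def dijkstra_distances(n, weighted_edges):
    """Follow the steps above on vertices 1..n and return [d(1), ..., d(n)]."""
    inf = math.inf
    # w[i][j]: minimum weight of an edge between i and j, or infinity if none.
    w = [[inf] * (n + 1) for _ in range(n + 1)]
    for i, j, weight in weighted_edges:
        w[i][j] = w[j][i] = min(w[i][j], weight)
    d = [inf] * (n + 1)  # index 0 is unused so that d[i] matches d(i)
    d[1] = 0
    v = [0] * (n + 1)
    v[1] = 1
    k, t = 1, 1
    while t < n:
        # Second bullet: update each d(i) using the most recently chosen vertex k.
        for i in range(1, n + 1):
            if d[i] > d[k] + w[k][i]:
                d[i] = d[k] + w[k][i]
        # Third bullet: among vertices with v(i) = 0, pick one with least d(i).
        k = min((i for i in range(1, n + 1) if v[i] == 0), key=lambda i: d[i])
        t += 1
        v[k] = 1
    return d[1:]


# A small example graph of our own on six vertices.
example_edges = [
    (1, 2, 7), (1, 3, 9), (1, 6, 14), (2, 3, 10), (2, 4, 15),
    (3, 4, 11), (3, 6, 2), (4, 5, 6), (5, 6, 9),
]
example_distances = dijkstra_distances(6, example_edges)  # [0, 7, 9, 20, 20, 11]
```

Running the process by hand on the same graph and comparing the successive values of d(i) is a good way to get a feel for why the process works.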

103. (a) Apply Dijkstra’s algorithm to several graphs.

(b) Show that at the end of the algorithm, each d(i) equals the distance from vertex 1 to vertex i.

104. In every connected graph, is there always a spanning tree such that for every vertex i, the distance from vertex 1 to vertex i given by the algorithm in Problem 103 is the distance from vertex 1 to vertex i in the tree?


1. Give an inductive definition of the product notation $\prod_{i=1}^{n} a_i$.

2. Use the inductive definition of $a^n$ to prove that $(ab)^n = a^n b^n$ for all nonnegative integers n.

→3. We draw n mutually intersecting circles in the plane so that each one crosses each other one exactly twice and no three intersect in the same point. Find a recurrence for the number $r_n$ of regions into which the plane is divided by n circles. (One circle divides the plane into two regions, the inside and the outside.) Find the number of regions with n circles. For what values of n can you draw a Venn diagram showing all the possible intersections of n sets using circles to represent each of the sets?

Hint. If we have n − 1 circles drawn in such a way that they define $r_{n-1}$ regions, and we draw a new circle, each time it crosses another circle, except for the last time, it finishes dividing one region into two parts and starts dividing a new region into two parts.

Another Hint. Compare $r_n$ with the number of subsets of an n-element set.

→4. A hydrocarbon molecule is a molecule whose only atoms are either carbon atoms or hydrogen atoms. In a simple molecular model of a hydrocarbon, a carbon atom will bond to exactly four other atoms, and a hydrogen atom will bond to exactly one other atom. Such a model is shown in Figure 4.6. We represent a hydrocarbon compound with a graph whose vertices are labelled with C’s and H’s so that each C vertex has degree four and each H vertex has degree one. A hydrocarbon is called an “alkane” if the graph is a tree. Common examples are methane (natural gas), butane (one version of which is shown in Figure 4.6), propane, hexane (ordinary gasoline), octane (to make gasoline burn more slowly), etc.

Figure 4.6: A model of a butane molecule.

(a) How many vertices are labelled H in the graph of an alkane with exactly n vertices labelled C?

(b) An alkane is called butane if it has four carbon atoms. Why do we say one version of butane is shown in Figure 4.6?

5. (a) Find a recurrence for the number of ways to divide 2n people into sets of two for tennis games. (Don’t worry about who serves first.)

(b) Give a recurrence for the number of ways to divide 2n people into sets of two for tennis games and to determine who serves first.

→6. Give a recurrence for the number of ways to divide 4n people into sets of four for games of bridge. (Don’t worry about how they sit around the bridge table or who is the first dealer.)

Chapter 5

Distribution Problems

5.1 The Idea of Distributions

Many of the problems we solved in earlier chapters may be considered to be distribution problems, that is, problems which involve distributing objects (such as pieces of fruit or ping-pong balls) to recipients (such as children). The ways of viewing counting problems as distribution problems can be somewhat indirect. For example, in Problem 24 (on page 15) you probably worked through the fact that the number of ways to pass out k ping-pong balls to n children so that no child gets more than one ball is the number of ways that we may choose a k-element subset of an n-element set. We can think that the children are the recipients and the identical ping-pong balls are the objects we are distributing, and that the distribution is done in such a way that each recipient gets at most one ball. Those children who receive a ball form the k-element subset of the n-element set of children.

It can be helpful to have more than one way to think of solving these problems. Another popular model for distributions is to think of putting balls in boxes rather than distributing objects to recipients. Passing out identical objects is modeled by putting identical balls into boxes. Passing out distinct objects is modeled by putting distinct balls into boxes. So, when we are passing out objects to recipients, we may think of the objects as being either identical or distinct. We may also think of the recipients as being either identical (grocery bags in the case of putting fruit into bags in the grocery store) or distinct (children in the case of passing fruit out to children). We may restrict the distributions to those that give at least one object to each recipient, or those that give exactly one object to each recipient, or those that give at most one object to each recipient, or we may have no such restrictions. If the objects are distinct, it may be that the order in which the objects are received is relevant (think about ice cream in 3-decker cones) or that the order in which the objects are received is irrelevant (think about dropping a handful of candy into a child’s trick-or-treat bag).

105. Consider the distribution of k distinct objects to n distinct recipients, with different conditions on how the objects are received. The first row in the following table is already filled in. Fill in the other rows (except the ones with ?).

Conditions                                Number of Ways    Mathematical Model
No conditions                             $n^k$             functions
Each gets at most one
Each gets exactly one
Order matters                             ?                 ?
Order matters, each gets at least one     ?                 ?

106. Consider the distribution of k identical objects to n distinct recipients, with different conditions on how the objects are received. Fill in all entries in the table (except any with ?).

Conditions                  Number of Ways    Mathematical Model
No conditions               ?                 ?
Each gets at least one      ?                 ?
Each gets at most one
Each gets exactly one

107. Consider the distribution of k objects to n identical recipients, with different conditions on how the objects are received. Fill in the entries of the table (except for the entries with ?).

Objects      Conditions                                    Number of Ways
Distinct     Each gets at most one
Distinct     Each gets exactly one
Distinct     Order matters                                 ?
Distinct     Order matters and each gets at least one      ?
Identical    Each gets at most one
Identical    Each gets exactly one


The goal of this chapter is to develop methods that will allow you to fill in the question marks in the tables. I will hand out copies of a complete summary table after everyone has finished this chapter.

5.2 Ordered-functions

108. Suppose we wish to place k distinct books onto the shelves of a bookcase with n shelves. Assume that the shelves are long enough so that all of the books would fit on any of the shelves. Also, let’s imagine that once we are done putting books on the shelves, we push the books on a shelf as far to the left as we can. This means that we are only thinking about how the books sit relative to each other, not about the exact places where we put any book. Since the books are all different, we can number them as the first book, the second book and so on.

(a) How many places are there to place the first book?

(b) When we go to place the second book, if we decide to place it on the shelf that already has a book does it matter if we place it to the left or right of the book that is already there? How many places are there where we can place the second book?

(c) In how many ways may we place the ith book into the bookcase?

(d) In how many ways may we place all k books?

109. Suppose we wish to place the books in Problem 108 (satisfying the assumptions we made there) so that each shelf gets at least one book. Now in how many ways may we place the books?

Hint. How can you make sure that each shelf gets at least one book before you start the process described in Problem 108?

110. Here’s another way to think about the last problem. Imagine first lining up the k books in a row, which gives k − 1 spaces between them. Choose n − 1 of these spaces in which to slide a piece of paper as a divider. Now put the books before the first divider on shelf one, and the books after divider i on shelf i + 1. This gives an arrangement of the books on the shelves so that every shelf has a book. Use this method to find the number of different arrangements in Problem 109.

For any given arrangement of books in the bookcase, the assignment of a book to the shelf on which it is put is simply a function from the set of books to the set of shelves. But that function only records which shelf each book is on. It doesn’t say which book sits to the left of which others on the shelf, information which is an important part of how the books are arranged on the shelves. In other words, the order in which the books appear on the shelves also matters. So, in order for our assignment to record all the information it must assign an ordered list of books to each shelf. We will call such a map an ordered-function.¹ More precisely, an ordered-function from a set S to a set T is a map that assigns an (ordered) list of elements of S to elements of T in such a way that each element of S appears on one and only one of the lists. An ordered-onto-function is one which assigns a list to every element of T (as in Problem 109).

In Problem 108 you showed that the number of ordered-functions from a k-element set to an n-element set is $\prod_{i=1}^{k} (n + i - 1)$. This product occurs frequently enough that it has a name; it is called the kth rising factorial power of n and is denoted by $n^{\overline{k}}$. It is read as “n to the k rising.” (This notation is due to Don Knuth, who also suggested the notation for falling factorial powers which we defined earlier on page 22.)
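For small values of n and k you can compare the product formula with a direct count. The sketch below is an illustration only; the function names are ours. The first function evaluates the product, and the second builds every bookcase arrangement by inserting the books one at a time in every possible place, as in Problem 108, and then counts the arrangements.

```python
from math import prod


def rising_factorial(n, k):
    """The product of (n + i - 1) for i = 1, ..., k; the empty product is 1."""
    return prod(n + i - 1 for i in range(1, k + 1))


def count_ordered_functions(n, k):
    """Count arrangements of books 1..k on n shelves by building them all."""
    # An arrangement is a tuple of n shelves; each shelf is a tuple of books
    # read from left to right.
    arrangements = {tuple(() for _ in range(n))}
    for book in range(1, k + 1):
        next_arrangements = set()
        for shelves in arrangements:
            for s in range(n):
                shelf = shelves[s]
                # The new book may go at either end of the shelf or between
                # any two books already there.
                for pos in range(len(shelf) + 1):
                    new_shelf = shelf[:pos] + (book,) + shelf[pos:]
                    next_arrangements.add(shelves[:s] + (new_shelf,) + shelves[s + 1:])
        arrangements = next_arrangements
    return len(arrangements)


# Three books on two shelves: 2 * 3 * 4 = 24 arrangements either way.
formula_value = rising_factorial(2, 3)
brute_force_value = count_ordered_functions(2, 3)
```

The brute-force count grows very quickly, so it is only practical for a handful of books.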

5.3 Multisets and Compositions of integers

In the last section you considered distinct objects (books) distributed to distinct recipients (shelves in a bookcase). What happens when we have indistinguishable objects?

•111. In how many ways may the bookstore distribute brand-new copies of the same calculus text on n empty shelves in the store? Here we assume each shelf is wide enough to hold all the books and it is possible for some shelves to get no books. Write your final answer as a binomial coefficient.

Hint. Use Problem 108.

¹ We hyphenate the word “ordered-function” because an ordered-function from S to T is in general not a function from S to T. The phrase ordered-function is not a standard one, because there is as yet no standard name for the result of an ordered distribution problem.

A multiset chosen from a set S may be thought of as a kind of generalized subset of S in which repeated elements are allowed. To determine a multiset we must say how many times (allowing the possibility of zero times) each member of S appears in the multiset. The number of times an element appears is called its multiplicity. For example if we choose three identical red marbles, six identical blue marbles and four identical green marbles from a bag of red, blue, green, white and yellow marbles then the multiplicity of a red marble in our multiset is three and the multiplicity of a yellow marble is zero. The size of a multiset is the sum of the multiplicities of its elements. For example if we choose three identical red marbles, six identical blue marbles and four identical green marbles, then the size of our multiset of marbles is 13.
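If you like to experiment on a computer, Python's `collections.Counter` behaves much like a multiset: it stores a multiplicity for each element and reports zero for elements that were never chosen. The variable names below are ours, and the marble counts are the ones from the example above.

```python
from collections import Counter

# The marbles we chose: three red, six blue and four green.
marbles = Counter({"red": 3, "blue": 6, "green": 4})

# The set S we chose from also contains white and yellow marbles.
colors = ["red", "blue", "green", "white", "yellow"]

# Multiplicity of each member of S; a Counter returns 0 for a missing element.
multiplicity = {color: marbles[color] for color in colors}

# The size of the multiset is the sum of the multiplicities.
size = sum(marbles.values())  # 13
```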

•112. (a) Find a bijection between arrangements of identical books on n shelves and multisets chosen from an n-element set.

(b) What is the number of multisets of size k that can be chosen from an n-element set?

→113. (Optional) Your answer in the previous problem is expressible as a binomial coefficient. Since a binomial coefficient counts subsets, find a bijection between subsets of something and multisets chosen from a set S.

114. How many solutions are there in nonnegative integers to the equation $x_1 + x_2 + \cdots + x_m = r$, where m and r are fixed positive integers?

·115. In how many ways may we put k identical books onto n distinct shelves if each shelf must get at least one book?

A composition of the positive integer k into n parts is a list of n positive integers that sum to k.

·116. How many compositions are there of an integer k into n parts? Compare this with Problem 114.

→117. (Optional) Your answer in Problem 116 can be expressed as a binomial coefficient. This means it should be possible to interpret a composition as a subset of some set. Find a bijection between compositions of k into n parts and certain subsets of some set. Explain explicitly how to get the composition from the subset and the subset from the composition.

Hint. If we line up k identical books, how many adjacencies are there in between books?

·118. Explain the connection between compositions of k into n parts and the problem of distributing k identical objects to n distinct recipients so that each recipient gets at least one.

119. In how many ways can we distribute k identical objects to n distinct recipients so that each recipient gets at least m? (Here we assume that k ≥ mn.)

5.4 Broken Permutations and Lah Numbers

120. How many ways may you stack 5 distinct books into 3 identical boxes so that each box contains at least one book? How is this problem different from arranging 5 distinct books in a bookcase with 3 bookshelves (in such a way that each shelf gets at least one book)? Here, the order of books in a stack makes a difference.

You might have thought to first partition the five books into three blocks and then to follow that by ordering the books within the blocks of the partition. This turns out not to be a useful combinatorial way of visualizing the problem because the number of ways to order the books in the various blocks depends on the sizes of the blocks and not just the number of blocks.

121. In this problem we want to count the number of ways to stack k distinct books into n identical boxes so that there is at least one book in every box.

(a) First consider the set S of all arrangements of the k distinct books on n distinct shelves in which every shelf has at least one book. When are two of the arrangements in S the same as far as what you’re asked to count in this problem? Define this idea of sameness as an equivalence relation on S. Explain why every equivalence class has the same size. What is this size?

(b) Find the number of ways to stack k distinct books into n identical boxes so that there is a stack in every box. This number is usually denoted by L(k, n) and is called a Lah number.

122. Explain why the number of ways to stack k distinct books into n identical boxes is $\sum_{i=1}^{n} L(k, i)$.

123. Fill in the entries with ?s in the tables at the beginning of this chapter.


1. Answer each of the following questions.


(a) In how many ways may we pass out k identical pieces of candy to n children?

(b) In how many ways may we pass out k distinct pieces of candy to n children?

(c) In how many ways may we pass out k identical pieces of candy to n children so that each gets at most one? (Assume k ≤ n.)

(d) In how many ways may we pass out k distinct pieces of candy to n children so that each gets at most one? (Assume k ≤ n.)

(e) In how many ways may we pass out k identical pieces of candy to n children so that each gets at least one? (Assume k ≥ n.)

2. The neighborhood improvement committee has been given r trees to distribute to s families living along one side of a street. Unless otherwise specified, it doesn’t matter where a family plants the trees it gets.

(a) In how many ways can they distribute all of them if the trees are distinct, there are more families than trees, and each family can get at most one tree?

(b) In how many ways can they distribute all of them if the trees are distinct and any family can get any number?

(c) In how many ways can they distribute all the trees if the trees are identical, there are no more trees than families, and any family receives at most one?

(d) In how many ways can they distribute all the trees if they are identical and anyone may receive any number of trees?

(e) In how many ways can all the trees be distributed and planted if the trees are distinct, any family can get any number, and a family must plant its trees in an evenly spaced row along the road?

(f) Answer the question in part (e) assuming that every family must get a tree.

(g) Answer the question in part (d) assuming that each family must get at least one tree.

Chapter 6

Generating Functions

6.1 Using Pictures to Visualize Counting

Suppose you want to choose a snack of three pieces of fruit from among apples, pears and bananas. Since we have not precluded choosing two or three of the same fruit, all your choices can be symbolically represented as

[apple][apple][apple] + [pear][pear][pear] + [banana][banana][banana] + [apple][apple][pear] + [apple][apple][banana] + [apple][pear][pear] + [pear][pear][banana] + [apple][banana][banana] + [pear][banana][banana] + [apple][pear][banana].

(Why doesn’t [picture] appear?) Here we are using a picture of a piece of fruit to represent taking a piece of that fruit. For instance, [apple] stands for taking an apple; [apple][pear] represents taking an apple and a pear; and [apple][apple] for taking two apples. You can think of the plus sign as standing for the exclusive or; that is, [apple] + [banana] would stand for “I take an apple or a banana but not both.” We extend this similarity with mathematical notation by condensing our expression to

[apple]^3 + [pear]^3 + [banana]^3 + [apple]^2[pear] + [apple]^2[banana] + [apple][pear]^2 + [pear]^2[banana] + [apple][banana]^2 + [pear][banana]^2 + [apple][pear][banana]. (6.1)

In this notation [apple]^3 stands for choosing three apples, while [apple]^2[banana] represents a choice of two apples and a banana, and so on. What our notation in (6.1) is really doing is giving us a convenient way to list all three-element multisets chosen from the set {[apple], [pear], [banana]}. This approach was inspired by George Polya’s paper “Picture Writing,” in the December 1956 issue of The American Mathematical Monthly. While we are taking a somewhat more formal approach than Polya, it is still completely in the spirit of his work.

Suppose now that we plan to choose between one and three apples, between one and two pears, and between one and two bananas, and that we don’t place any other restriction on the total number of fruit to be chosen. In a somewhat clumsy way we could describe our fruit selections as

[apple][pear][banana] + [apple][pear][banana]^2 + [apple][pear]^2[banana] + [apple][pear]^2[banana]^2 + [apple]^2[pear][banana] + · · · + [apple]^2[pear]^2[banana]^2 + [apple]^3[pear][banana] + · · · + [apple]^3[pear]^2[banana]^2. (6.2)

•124. (a) Using an A in place of the picture of an apple, a P in place of the picture of a pear, and a B in place of the picture of a banana, write out the entire expression intended in (6.2), that is, without any dots for left-out terms. (You may use pictures instead of letters if you prefer, but it gets tedious quite quickly!) Now expand the product $(A + A^2 + A^3)(P + P^2)(B + B^2)$ and compare the result with your expression.

(b) Substitute an x for each of A, P and B in the expression you found in part (a). Expand the result in powers of x and give an interpretation of the coefficient of $x^n$.

Expanding

([apple] + [apple]^2 + [apple]^3)([pear] + [pear]^2)([banana] + [banana]^2) (6.3)

gives the expression in (6.2). This means that (6.2) and (6.3) each describes the number of multisets we can choose from the set {[apple], [pear], [banana]} in which [apple] appears between one and three times, and [pear] and [banana] each appear once or twice. We interpret (6.2) as describing each individual multiset we can choose, and we interpret (6.3) as saying that we first decide how many apples we will take, and then decide how many pears we will take, and then decide how many bananas. At this stage it might seem a bit magical that doing ordinary algebra with the second formula yields the first, but in fact we could define addition and multiplication with these pictures more formally so we could explain in detail why things work out.

You’ve seen in our descriptions of the ways of choosing fruit that we’ve treated the pictures of the fruit as if they were variables. In the theory of generating functions, we will associate variables (or polynomials or even power series) with members of a set.

Here we adapt language introduced by George Polya to describe how to associate variables with the members of a set. By a picture of a member of a set S we will mean a variable, or perhaps a product of powers of variables (or even a polynomial in the variables). A function P that assigns a picture P(s) to each member s ∈ S will be called a picture function. The picture enumerator for a picture function P defined on a set S will be the sum of the pictures of the elements in S, which we can write symbolically as

$E_P(S) = \sum_{s \in S} P(s)$.

If S is the set of our three fruit, then A is the picture of an apple, and A + P + B is the picture enumerator of the picture function on S. Likewise, when S is the set of all multisets of fruit with from one to three apples, one to two pears, and one to two bananas, then (6.2) is the picture enumerator for S. We have chosen this language because the picture enumerator lists (that is, enumerates) all elements of S according to their pictures.

•125. (a) What should be the picture of taking no apples? Find a polynomial in the variable A that says you should take between zero and three apples.

(b) Write a picture enumerator that says we may take between zero and three apples, between zero and three pears, and between zero and three bananas.

Recall that a tree is determined by its edge set. Suppose you have a tree whose vertex set is [n], and that we define the picture of the vertex i to be $x_i$ and $x_i x_j$ to be the picture of the edge with endpoints i and j. We then define the picture of a tree T to be the product

$P(T) = \prod_{\{i,j\} \in \mathcal{T}} x_i x_j$ (6.4)

where $\mathcal{T} = \{\{i, j\} : i \text{ and } j \text{ are connected by an edge in the tree } T\}$.

126. Explain why the above picture P(T) can be re-written as $\prod_{i=1}^{n} x_i^{\deg(i)}$.

127. For each n ≥ 1, let $S_n$ be the set of all trees with vertex set [n]. For each tree $T \in S_n$ use the picture P(T) given in (6.4). Find the picture enumerators $E_P(S_n)$ for each of n = 2, 3, 4. In each case, factor the polynomials as completely as possible.

128. Explain why $x_1 x_2 \cdots x_n$ is always a factor of the picture of any tree on n vertices.

129. (Optional)

(a) Write down the picture of a tree on five vertices with one vertex of degree four, say vertex i.

(b) If a tree on five vertices has a vertex of degree three, what are the possible degrees of the other vertices? What can you say about the picture of a tree with a vertex of degree three?

(c) If a tree on five vertices has no vertices of degree three or four, what can you say about the picture of the tree?

(d) Write down the picture enumerator for all trees on five vertices.

Hint. Remember the formula involving degrees and edges.

→130. (Optional) As above, for n ≥ 1 let $S_n$ be the set of all trees with vertex set [n]. Prove that the picture enumerator $E_P(S_n)$ equals $x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2}$.

Hint. When you factor out $x_1 x_2 \cdots x_n$ from the enumerator of trees, the result is a sum of terms of degree n − 2.

·131. Remember back when we used $A^2$ to stand for taking two apples, and $P^3$ to stand for taking three pears? Back then we used the product $A^2 P^3$ to stand for taking two apples and three pears, which means we chose the picture of the ordered pair (2 apples, 3 pears) to be the juxtaposition [apple]^2[pear]^3, the product of the pictures of a multiset of two apples and a multiset of three pears.

Show that if $S_1$ and $S_2$ are sets with picture functions $P_1$ and $P_2$ defined on them, and if we define the picture of an ordered pair $(x_1, x_2) \in S_1 \times S_2$ to be $P((x_1, x_2)) = P_1(x_1) P_2(x_2)$, then the picture enumerator of P on the set $S_1 \times S_2$ is $E_{P_1}(S_1) E_{P_2}(S_2)$. We call this the Product Principle for Picture Enumerators.

•132. Suppose you want to choose a snack of between zero and three apples, between zero and three pears, and between zero and three bananas.

(a) Write a polynomial in one variable x in which the coefficient of $x^n$ is the number of ways to choose a snack with n pieces of fruit.

(b) Suppose an apple costs 20 cents, a banana costs 25 cents, and a pear costs 30 cents. What should you substitute for A, P, and B in Problem 125(b) in order to get a polynomial in which the coefficient of $x^n$ is the number of ways to choose a selection of fruit that costs n cents?

(c) Suppose an apple has 40 calories, a pear has 60 calories, and a banana has 80 calories. What should you substitute for A, P, and B in Problem 125(b) in order to get a polynomial in which the coefficient of $x^n$ is the number of ways to select fruit with a total of n calories?


•133. In this problem, you want to choose a subset of the set [n]. For each i from 1 to n, use $x_i$ to be the picture of choosing i to be in the subset.

(a) What is the picture enumerator for either choosing i or not choosing i to be in the subset?

(b) What is the picture enumerator for all possible choices of subsets of [n]? What should be substituted for $x_i$ in order to get a polynomial in x such that the coefficient of $x^k$ is the number of ways to choose a k-element subset of [n]?

(c) You have just proved a special case of what theorem?

6.2 Generating Functions

6.2.1 Generating polynomials

In your solution to Problem 133 you saw that we can think of the process of expanding the polynomial $(1 + x)^n$ as a way of “generating” the binomial coefficients $\binom{n}{k}$ as the coefficients of $x^k$ in the expansion of $(1 + x)^n$. For this reason, we call $(1 + x)^n$ the generating polynomial for the binomial coefficients $\binom{n}{k}$. More generally, the generating polynomial for a finite sequence $a_0, \ldots, a_n$ is the polynomial $\sum_{i=0}^{n} a_i x^i$.

In Problem 132(a) you converted the picture enumerator for selecting between zero and three each of apples, pears, and bananas to the generating polynomial of the finite sequence $a_0, \ldots, a_n$ in which $a_i$ is the number of such fruit snacks which contain i pieces of fruit. When you substituted $x^c$ for each fruit picture (where c is the number of calories in that particular kind of fruit), the resulting polynomial was the generating polynomial for the number of fruit selections with i calories. Also remember that the original picture enumerator was obtained by multiplying three picture enumerators:

$1 + A + A^2 + A^3$,   $1 + P + P^2 + P^3$,   $1 + B + B^2 + B^3$.

Substituting $x^c$ for each fruit picture where c is now the cost of the fruit (as in Problem 132(b)), these picture enumerators become

$1 + x^{20} + x^{40} + x^{60}$,   $1 + x^{30} + x^{60} + x^{90}$,   $1 + x^{25} + x^{50} + x^{75}$,

where in each case the coefficient of $x^i$ gives the number of selections of that particular fruit which cost i cents. The Product Principle for Picture Enumerators therefore translates directly into a Product Principle for Generating Polynomials. Before stating this principle, please note that in each of the above instances we had:


• a finite set S of possible fruit selections (for instance, from zero to three apples);

• an associated value function defined from S to the nonnegative integers (for instance, the cost of the fruit selection or the number of calories in the fruit selection);

• and a polynomial that is the generating polynomial for the number of elements s ∈ S which have the value i. We will call this polynomial the generating polynomial associated with the value.

The Product Principle for Generating Polynomials

Let $S_1, S_2$ be finite sets with value functions $v_1, v_2$. If $G_i(x)$ is the generating polynomial associated with the value $v_i$, then the coefficient of $x^k$ in the polynomial $G_1(x) G_2(x)$ is the number of ordered pairs $(s_1, s_2) \in S_1 \times S_2$ such that $v_1(s_1) + v_2(s_2) = k$.
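The principle can be checked numerically on the three cost polynomials above. The following sketch is only an illustration: it stores a polynomial as a Python dictionary from exponents to coefficients (our own choice of representation), multiplies the three polynomials by applying the principle twice, and compares each coefficient with a direct count of the fruit selections of that cost. The names `multiply`, `snacks` and `count_by_cost` are ours.

```python
from itertools import product


def multiply(p, q):
    """Multiply polynomials stored as {exponent: coefficient} dictionaries."""
    result = {}
    for i, a in p.items():
        for j, b in q.items():
            result[i + j] = result.get(i + j, 0) + a * b
    return result


# Zero to three of each fruit, with exponents recording the cost in cents.
apples = {0: 1, 20: 1, 40: 1, 60: 1}
pears = {0: 1, 30: 1, 60: 1, 90: 1}
bananas = {0: 1, 25: 1, 50: 1, 75: 1}

snacks = multiply(multiply(apples, pears), bananas)


def count_by_cost(n):
    """Directly count the selections (a apples, p pears, b bananas) costing n cents."""
    return sum(
        1
        for a, p, b in product(range(4), repeat=3)
        if 20 * a + 30 * p + 25 * b == n
    )


# For example, 50 cents buys an apple and a pear, or two bananas.
ways_to_spend_50 = snacks.get(50, 0)  # 2
```

The coefficients of the product add up to 64, the total number of ways to choose how many apples, pears and bananas to take.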

6.2.2 Generating functions

Generating functions are also defined for infinite sequences, and they are cleverly defined in such a way that the generating function for an infinite sequence $\{a_i\}_{i \ge 0}$ with only finitely many nonzero terms (say $a_i = 0$ for all $i > N$) is the same as the generating polynomial for the finite sequence $a_0, \ldots, a_N$. The generating function for $\{a_i : i \ge 0\}$ is the expression $\sum_{i=0}^{\infty} a_i x^i$, a formal power series. When we talk here about formal power series, we’re thinking of the series as a convenient way of representing the terms of sequences that interest us. You will see that they are convenient for our purposes because the sum and product of formal power series are defined in a way that captures properties of the sequences that are important in discrete mathematics. The sum of two series is coefficientwise addition, that is,


Addition of Formal Power Series

$\left( \sum_{i=0}^{\infty} a_i x^i \right) + \left( \sum_{j=0}^{\infty} b_j x^j \right) = \sum_{k=0}^{\infty} (a_k + b_k) x^k$.
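One way to see that no convergence is involved is to notice that a formal power series is nothing more than its sequence of coefficients. In the sketch below, which is our own illustration and not a standard library feature, a series is represented by a Python function that returns the coefficient of $x^k$ for any given k, and the sum is formed coefficient by coefficient exactly as in the displayed rule.

```python
def add_series(a, b):
    """Coefficientwise sum: the coefficient of x^k in the sum is a(k) + b(k)."""
    return lambda k: a(k) + b(k)


def ones(k):
    """Coefficients of 1 + x + x^2 + x^3 + ..."""
    return 1


def naturals(k):
    """Coefficients of 0 + x + 2x^2 + 3x^3 + ..."""
    return k


total = add_series(ones, naturals)
first_terms = [total(k) for k in range(5)]  # [1, 2, 3, 4, 5]
```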

Before defining multiplication of two formal power series, we want to emphasize that in calculus (and in analysis in general) we are interested in whether or not a power series is a function, and so there it is important to know for what values of x the power series converges. In discrete mathematics, power series can be purely formal objects; that is, even though we use the phrase generating function, we don’t require the power series to actually represent a function and so we don’t have to worry about convergence.¹

Now on to multiplication.

◦134. (a) What is the coefficient of $x^2$ in the polynomial

$$(a_0 + a_1 x + a_2 x^2)(b_0 + b_1 x + b_2 x^2 + b_3 x^3)\,?$$

What is the coefficient of $x^4$?

(b) In part (a), why is there a $b_0$ and a $b_1$ in your expression for the coefficient of $x^2$ but there is not a $b_0$ or a $b_1$ in your expression for the coefficient of $x^4$? What is the coefficient of $x^4$ in

$$(a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4)(b_0 + b_1 x + b_2 x^2 + b_3 x^3 + b_4 x^4)\,?$$

Express this coefficient in the form

$$\sum_{i=0}^{4} \text{something},$$

where the “something” is an expression you need to figure out.

¹ An historical aside: Before settling on our current definition of the word “function”, the word evolved through several meanings, starting with very imprecise meanings and ending with our current definition. The terminology “generating function” may be thought of as an example of one of the earlier uses of the term function.

◦135. The point of Problem 134 is that when the sequences $\{a_i\}$ and $\{b_j\}$ are finite (or, equivalently for our purposes, when $\{a_i\}$ and $\{b_j\}$ are infinite sequences with $a_i = 0$ for $i > n$ and $b_j = 0$ for $j > m$), then there is a very nice formula for the coefficient of $x^k$ in the product

$$\left(\sum_{i=0}^{n} a_i x^i\right)\left(\sum_{j=0}^{m} b_j x^j\right).$$

Write this formula explicitly.

•136. Assuming that the rules of polynomial arithmetic apply to formal power series, write down a formula for the coefficient $c_k$ of $x^k$ in the product

$$\left(\sum_{i=0}^{\infty} a_i x^i\right)\left(\sum_{j=0}^{\infty} b_j x^j\right).$$

The expression you obtained in Problem 136 defines the product of formal power series. That is, we define the product of two formal power series $\sum_{i=0}^{\infty} a_i x^i$ and $\sum_{j=0}^{\infty} b_j x^j$ to be $\sum_{k=0}^{\infty} c_k x^k$, where $c_k$ is the expression you found in Problem 136. For ease of reference, write the proper coefficient $c_k$ in the following formula:

Multiplication of Formal Power Series

$$\left(\sum_{i=0}^{\infty} a_i x^i\right)\left(\sum_{j=0}^{\infty} b_j x^j\right) = \sum_{k=0}^{\infty} \underline{\qquad\qquad}\; x^k .$$

◦137. Use the definition of multiplication to find the product $(1-x)\sum_{k=0}^{n} x^k$ and the product $(1-x)\sum_{k=0}^{\infty} x^k$.

Since your expression for the product of two formal power series was derived using usual polynomial algebra, it should not be surprising that the product of formal power series also satisfies the usual rules of polynomial algebra, such as the associative law and the commutative law. We could explicitly state these rules and prove that they are all valid for formal power series multiplication, but for our purposes that’s excessive and so we’ll just move on to using the algebra of generating functions to solve problems. Also, because the algebra of generating functions is the same whether the sequence is finite or infinite, the Product Principle for Generating Polynomials (as given on page 68) can be shown to hold for all generating functions, and mathematical induction can be used to extend this principle from two sets to any finite number of sets:

The Product Principle for Generating Functions

Suppose each of the sets $S_1, S_2, \ldots, S_n$ has a value function defined from the set to the nonnegative integers. For each $i$, let $G_i(x)$ be the generating function associated with the value on the set $S_i$. Then the generating function for the number of $n$-tuples of each possible total value is the product

$$G(x) = G_1(x)\, G_2(x) \cdots G_n(x) .$$

•138. (Optional) It is possible to give a proof of the Product Principle for Generating Functions that does not rely on the Product Principle for Picture Enumerators, and that is worked through in this problem. Suppose that we have two sets $S_1$ and $S_2$. Let $v_1$ (here $v$ stands for value) be a function from $S_1$ to the nonnegative integers and let $v_2$ be a function from $S_2$ to the nonnegative integers. Define a new function $v$ on the set $S_1 \times S_2$ by $v(x_1, x_2) = v_1(x_1) + v_2(x_2)$. Suppose further that $\sum_{i=0}^{\infty} a_i x^i$ is the generating function for the number of elements $x_1 \in S_1$ of value $i$, that is, with $v_1(x_1) = i$. Suppose also that $\sum_{j=0}^{\infty} b_j x^j$ is the generating function for the number of elements $x_2$ of $S_2$ of value $j$, that is, with $v_2(x_2) = j$. Prove that the coefficient of $x^k$ in

$$\left(\sum_{i=0}^{\infty} a_i x^i\right)\left(\sum_{j=0}^{\infty} b_j x^j\right)$$

is the number of ordered pairs $(x_1, x_2)$ in $S_1 \times S_2$ with total value $k$, that is, with $v_1(x_1) + v_2(x_2) = k$. This is called the Product Principle for Generating Functions.

Hint. If this problem appears difficult, the most likely reason is that the definitions are all new and symbolic. Focus on what it means for $\sum_{k=0}^{\infty} c_k x^k$ to be the generating function for ordered pairs of total value $k$. In particular, how do we get an ordered pair with total value $k$? What do we need to know about the values of the components of the ordered pair?

•139. Write the product $(1-x)\sum_{k=0}^{\infty} x^k$ as a power series. How does this help you find a formula for $(1-x)^{-n}$ as a formal power series whose coefficients involve binomial coefficients? What does this formula tell you about how we should define $\binom{-n}{k}$ when $n$ is positive?

140. Suppose once again that $i$ is an integer between 1 and $n$. In Problem 137 you encountered the formal power series $\sum_{k=0}^{\infty} x^k$ in which the coefficient of every $x^k$ is 1, an example of an infinite geometric series. In this problem it will be useful to interpret this series as a generating function in which the coefficient 1 is the number of multisets of size $k$ chosen from the singleton set $\{i\}$. Namely, there is only one way to choose a multiset of size $k$ from $\{i\}$: choose $i$ exactly $k$ times. Express the generating function in which the coefficient of $x^k$ is the number of $k$-element multisets chosen from $[n]$ as a power of another power series. What does Problem 112 (on page 59) tell you about what this generating function equals?

141. Express the generating function for the number of multisets of size $k$ chosen from $[n]$ (where $n$ is fixed but $k$ can be any nonnegative integer) as 1 over something relatively simple.

For future reference, fill in the coefficients in the following power series representation:

$$\frac{1}{(1-x)^n} = \sum_{k=0}^{\infty} \underline{\qquad\qquad}\; x^k . \tag{6.5}$$

142. (a) Write down the generating function for the number of ways to distribute identical pieces of candy to three children so that no child gets more than 4 pieces.

(b) Using the fact that

$$(1-x)(1 + x + x^2 + \cdots + x^{n-1}) = 1 - x^n ,$$

write the generating function from part (a) as a quotient of polynomials.

(c) Under the restriction in part (a), in how many ways can we pass out exactly ten pieces of candy?

•143. Let $n$ and $j < m$ be positive integers.

(a) What is the generating function for the number of multisets chosen from an n-element set so that each element appears at least j times and less than m times?

(b) Write the generating function from part (a) as a quotient of polynomials, then as a product of a polynomial and a power series.

Hint. Interpret Problem 142 in terms of multisets.

6.3 Generating Functions and Recurrences

Recall that a recurrence for a sequence $\{a_n\}_{n \ge 0}$ expresses each $a_n$ in terms of values $a_i$ for $i < n$ (refer to Section 4.3, pages 46ff). For example, the equation $a_i = 3a_{i-1} + 2^i$ is a first-order linear constant coefficient recurrence. Algebraic manipulations with generating functions can sometimes uncover the solutions to a recurrence relation.
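Before doing any algebra, it can help to see the first few terms that a recurrence produces, and later to test a conjectured formula against them. Here is a minimal sketch for first-order recurrences of the form $a_i = 3a_{i-1} + 2^i$; the function name `first_order_terms`, its parameters, and the starting value $a_0 = 1$ are assumptions made for illustration, not part of these notes.

```python
def first_order_terms(a0, count, multiplier=3, forcing=lambda i: 2 ** i):
    """Return [a_0, ..., a_{count-1}] for a_i = multiplier * a_{i-1} + forcing(i)."""
    terms = [a0]
    for i in range(1, count):
        terms.append(multiplier * terms[-1] + forcing(i))
    return terms


# Hypothetical starting value a_0 = 1; any other a_0 works the same way.
print(first_order_terms(1, 6))

# A different forcing term, here 3^i, only requires changing the last argument.
print(first_order_terms(1, 6, forcing=lambda i: 3 ** i))
```

A closed formula found with generating functions should reproduce these lists term by term.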

144. Suppose that $a_i = 3a_{i-1} + 3^i$.

(a) Multiply both sides of the recurrence by $x^i$ and sum both the left-hand side and right-hand side from $i = 1$ to infinity. In the left-hand side use the fact that

$$\sum_{i=1}^{\infty} a_i x^i = \left(\sum_{i=0}^{\infty} a_i x^i\right) - a_0$$

and in the right-hand side, use the fact that

$$\sum_{i=1}^{\infty} a_{i-1} x^i = x\sum_{i=1}^{\infty} a_{i-1} x^{i-1} = x\sum_{j=0}^{\infty} a_j x^j = x\sum_{i=0}^{\infty} a_i x^i$$

(where we substituted $j$ for $i-1$ to see explicitly how to change the limits of summation, a surprisingly useful trick) to rewrite the equation in terms of the power series $\sum_{i=0}^{\infty} a_i x^i$. Solve the resulting equation for the power series $\sum_{i=0}^{\infty} a_i x^i$. You can save a lot of writing by using a variable like $y$ to stand for the power series.

(b) Use the previous part and our earlier results on power series to get a formula for $a_i$ in terms of $a_0$.


(c) Now suppose that $a_i = 3a_{i-1} + 2^i$. Repeat the previous two steps for this recurrence.

Hint. You may run into a product of the form $\sum_{i=0}^{\infty} a^i x^i$ times $\sum_{j=0}^{\infty} b^j x^j$. Note that in the product, the coefficient of $x^k$ is

$$\sum_{i=0}^{k} a^i b^{k-i} = b^k \sum_{i=0}^{k} \frac{a^i}{b^i} .$$

◦145. Suppose we deposit $5000 in a savings certificate that pays ten-percent interest and we also participate in a program to add $1000 to the certificate at the end of each year (from the end of the first year on) that follows (also subject to interest). Assuming we make the $5000 deposit at the end of year 0, and letting a_i be the amount of money in the account at the end of year i, write a recurrence for the amount of money the certificate is worth at the end of year n. Solve this recurrence. How much money do we have in the account (after our year-end deposit) at the end of ten years? At the end of 20 years?

6.4 Additional Exercises for Chapter 6


1. Let n be a fixed positive integer. Suppose there is an unlimited supply of identical pieces of candy. What is the generating function for the number of ways to pass out k pieces of candy to n children in such a way that each child gets between three and six pieces of candy (inclusive)? Find a formula for the number of ways to pass out the candy.

2. Let m and n be fixed nonnegative integers. Express the generating function for the number of k-element multisets of an n-element set such that no element appears more than m times as a quotient of two polynomials. Use this expression to get a formula for the number of k-element multisets of an n-element set such that no element appears more than m times.

3. We have some chairs which we are going to paint with red, white, blue, green, yellow and purple paint. Suppose that we may paint any number of chairs red or white, that we may paint at most one chair blue, at most three chairs green, only an even number of chairs yellow, and only a multiple of four chairs purple. In how many ways may we paint n chairs?

◦4. (a) In paying off a mortgage loan with initial amount A, annual interest rate p% (on a monthly basis) with a monthly payment of m, what recurrence describes the amount owed after n months of payments in terms of the amount owed after n − 1 months? Some technical details: You make the first payment after one month. The interest included in your monthly payment is computed at the monthly rate .01p/12. This interest rate is applied to the amount you owed immediately after making your last monthly payment.

(b) Find a formula for the amount owed after n months.

(c) Find a formula for the number of months needed to bring the amount owed to zero. Another technical point: If you were to make the standard monthly payment m in the last month, you might actually end up owing a negative amount of money. Therefore it is okay if the result of your formula for the number of months needed gives a non-integer number of months. The bank would just round up to the next integer and adjust your payment so your balance comes out to zero.

(d) What should the monthly payment be to pay off the loan over a period of 30 years?

Chapter 7

Recurrences, Revisited

7.1 Rabbits

The sequence of problems that follows describes a mathematical model of a fictional population of rabbits. We use the example of a rabbit population for historical reasons, and our goal is a classical sequence of numbers called the Fibonacci numbers. Fibonacci¹ introduced them in his book, Liber Abaci, published in 1202.

•146. Suppose we begin at the end of month 0 with 10 pairs (where a pair means one female and one male) of baby rabbits. For purposes of modeling the rabbit population we make three assumptions:

• Rabbits are mature and begin to reproduce after one month.

• Each mature pair produces two new pairs at the end of each month.

• No rabbit dies during our period of observation.

Let $a_n$ be the number of rabbit pairs we have at the end of month $n$. Show that $a_0 = 10$ and $a_n = a_{n-1} + 2a_{n-2}$. This is an example of a second-order recurrence which is linear and has constant coefficients. Using a method similar to that of Problem 144, show that

$$\sum_{i=0}^{\infty} a_i x^i = \frac{10}{1 - x - 2x^2} .$$

This is the generating function for the sequence $\{a_i\}$ giving the number of rabbit pairs in month $i$. We shall shortly see a method for converting this to a solution to the recurrence.

¹ Apparently Leonardo de Pisa was given the name Fibonacci posthumously. It is a shortening of “son of Bonacci” in Italian.

•147. In Fibonacci’s original problem, there is one pair of baby rabbits at the end of month 0 and each pair of mature rabbits produces one new pair at the end of each month, but otherwise the situation is the same as in Problem 146. Under these assumptions, find the generating function for the number of pairs of rabbits at the end of n months.

→148. Find the generating function for the solutions to the recurrence

$$a_i = 5a_{i-1} - 6a_{i-2} + 2^i .$$

The recurrences we have seen in this section are called second-order because they specify $a_i$ in terms of the two preceding terms $a_{i-1}$ and $a_{i-2}$. They are called linear because $a_{i-1}$ and $a_{i-2}$ each appear to the first power, and they are called constant coefficient because the coefficients of $a_{i-1}$ and $a_{i-2}$ are constants.
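A second-order linear constant-coefficient recurrence is determined by its two coefficients and its first two terms, so its terms are easy to generate by machine. The sketch below is illustrative only: the function name `second_order_terms` is hypothetical, and the value $a_1 = 10$ used for the rabbits of Problem 146 is an assumption (the ten baby pairs mature during month 1 and have not yet reproduced), not something stated in the problem.

```python
def second_order_terms(c1, c2, a0, a1, count):
    """Return the first `count` terms of a_n = c1 * a_{n-1} + c2 * a_{n-2}."""
    terms = [a0, a1]
    while len(terms) < count:
        terms.append(c1 * terms[-1] + c2 * terms[-2])
    return terms[:count]


# Rabbits of Problem 146: a_n = a_{n-1} + 2 a_{n-2}, a_0 = 10, assumed a_1 = 10.
print(second_order_terms(1, 2, 10, 10, 6))

# Fibonacci's original setting: a_n = a_{n-1} + a_{n-2} with a_0 = a_1 = 1.
print(second_order_terms(1, 1, 1, 1, 9))
```

Lists like these are useful for checking a generating function or a closed formula once you have derived one.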

7.2 The Method of Partial Fractions

You have been able to express all of the generating functions in the previous section in terms of the reciprocal of a quadratic polynomial. This is often called the rational-function representation of the generating function. In order to uncover the solution sequences, we need a power series representation for the generating function. It turns out that whenever you can factor a polynomial into linear factors (and over the complex numbers such a factorization exists for every polynomial) you can use that factorization and results from Chapter 6 to express its reciprocal in terms of power series.

•149. Express

$$\frac{1}{x-3} + \frac{2}{x-2}$$

as a single fraction.

◦150. In Problem 149 when you added numerical multiples of the reciprocals of first-degree polynomials you got a fraction in which the denominator is a quadratic polynomial. (This will always happen unless the two denominators are multiples of each other, because their least common denominator will simply be their product, a quadratic polynomial.) This leads us to ask whether a fraction whose denominator is a quadratic polynomial can always be expressed as a sum of fractions whose denominators are first-degree polynomials. Find numbers $c$ and $d$ so that

$$\frac{5x + 1}{(x-3)(x+5)} = \frac{c}{x-3} + \frac{d}{x+5} .$$

•151. In Problem 150 you may have simply guessed at values of $c$ and $d$, or you may have solved a matrix equation. Given constants $a$, $b$, $r_1$, and $r_2$ (with $r_1 \ne r_2$), write a matrix equation which can be solved to get $c$ and $d$ for which

$$\frac{ax + b}{(x - r_1)(x - r_2)} = \frac{c}{x - r_1} + \frac{d}{x - r_2} .$$

Writing the equations as in Problem 151 and solving them is called the Method of Partial Fractions, which you probably have already used in calculus. This method will let you find power series expansions for generating functions of the type in your answers to Problems 146 and 148.
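Once you have candidate values of $c$ and $d$, you can check them with exact rational arithmetic. The sketch below does not find the decomposition; it only tests a proposed one at a few sample points. The function name `agrees` and the worked example are hypothetical. Agreement at two or more points other than $r_1$ and $r_2$ is enough, since clearing denominators leaves two polynomials of degree at most one.

```python
from fractions import Fraction


def agrees(a, b, r1, r2, c, d, sample_points):
    """True if (a*x + b)/((x - r1)(x - r2)) equals c/(x - r1) + d/(x - r2)
    at every sample point (the points must differ from r1 and r2)."""
    for x in sample_points:
        x = Fraction(x)
        left = (a * x + b) / ((x - r1) * (x - r2))
        right = c / (x - r1) + d / (x - r2)
        if left != right:
            return False
    return True


# Hypothetical example: 2x / ((x - 1)(x + 1)) = 1/(x - 1) + 1/(x + 1).
points = [0, 2, 3, Fraction(1, 2), -5]
print(agrees(2, 0, 1, -1, 1, 1, points))   # a correct decomposition
print(agrees(2, 0, 1, -1, 1, 2, points))   # a deliberately wrong one
```

Using `Fraction` instead of floating point avoids rounding error, so a correct decomposition is reported as an exact match.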

•152. Use the Method of Partial Fractions to convert the generating function in Problem 146 into the form

$$\frac{c}{x - r_1} + \frac{d}{x - r_2} ,$$

and use this with results from Chapter 6 to find a formula for $a_n$.

•153. Use the quadratic formula to factor $x^2 + x - 1$, and then use the factors to find the partial fraction decomposition of $\dfrac{1}{x^2 + x - 1}$. (Hint: You can save yourself a tremendous amount of frustrating algebra if you arbitrarily choose one of the roots to be called $r_1$. Then call the other root $r_2$ and solve the problem using these algebraic symbols in place of the actual roots. Not only will you save yourself some work, but you will get a formula you could use in other problems. When you are done, substitute in the actual values of the solutions and simplify.)

154. Use the partial fraction decomposition you found in Problem 153 to write the generating function you found in Problem 147 in the form

$$\sum_{n=0}^{\infty} a_n x^n .$$

This gives an explicit formula for $a_n$, which is called Binet’s Formula. Find Binet’s Formula.


The polynomial $\mathrm{ch}(x) = x^2 - c_1 x - c_2$ obtained from the second-order linear constant-coefficient recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2}$ is usually called the characteristic polynomial of the recurrence and we call $\mathrm{ch}_R(x) = 1 - c_1 x - c_2 x^2$ its reciprocal polynomial.
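For experimenting with these two polynomials, the quadratic formula can be left to the computer. The following is a rough sketch: the function names are hypothetical, the sample coefficients are chosen only so that the roots come out simple, and `cmath` is used so that complex roots are handled as well.

```python
import cmath


def quadratic_roots(p, q, r):
    """Both complex roots of p*x^2 + q*x + r, assuming p != 0."""
    root_disc = cmath.sqrt(q * q - 4 * p * r)
    return ((-q + root_disc) / (2 * p), (-q - root_disc) / (2 * p))


def characteristic_roots(c1, c2):
    """Roots of the characteristic polynomial x^2 - c1*x - c2."""
    return quadratic_roots(1, -c1, -c2)


def reciprocal_roots(c1, c2):
    """Roots of the reciprocal polynomial 1 - c1*x - c2*x^2 (needs c2 != 0)."""
    return quadratic_roots(-c2, -c1, 1)


# Hypothetical coefficients: the recurrence a_n = 5 a_{n-1} - 6 a_{n-2}.
print(characteristic_roots(5, -6))
print(reciprocal_roots(5, -6))
```

Running this for several choices of `c1` and `c2` and comparing the two printed pairs is one way to look for a pattern.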

155. Find the characteristic and reciprocal polynomials for the Fibonacci recurrence and find relationships among the roots of these polynomials. Experiment with other second-order recurrences.

•156. When we have $a_0 = 1$ and $a_1 = 1$ (that is, when we begin with one pair of baby rabbits), the numbers $a_n$ are called Fibonacci Numbers.

(a) Use either the defining recurrence or Binet’s Formula to find the Fibonacci numbers $a_2$ through $a_8$. Are you amazed that Binet’s Formula produces integers, or for that matter are you surprised it even produces rational numbers? Why does the recurrence equation tell you that the Fibonacci numbers are all integers?

(b) Find an algebraic explanation (not using the recurrence equation) for what happens to make the square roots of five go away in Binet’s Formula. Think Binomial Theorem.

→157. Explain why there is a real number $b$ such that, for large values of $n$, the value of the $n$th Fibonacci number is almost exactly (but not quite) some constant times $b^n$. Find $b$ and the constant.

→∗158. As a challenge, see if you can find a way to show algebraically (not using the recurrence, but rather the formula you get by removing the square roots of five) that the formula for the Fibonacci numbers yields integers.

159. Use the technique of generating functions to solve the recurrence

$$a_n = 4a_{n-1} - 4a_{n-2} .$$

7.3 Nonnegative recurrences

160. A second-order recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2}$ with $c_1, c_2 \ge 0$ and $c_2 \ne 0$ is frequently called a nonnegative recurrence. This type of recurrence is important, since it often occurs in applications. Find an explicit formula for the solution sequences $\{a_n\}$ of a nonnegative second-order recurrence.


161. If $a_n = c_1 a_{n-1} + c_2 a_{n-2}$ is a nonnegative recurrence, show it has one positive root $r_0$ and that the other root $r$ satisfies $|r| < r_0$. Because of this property, $r_0$ is usually called the dominant root of the recurrence.

162. If $r_0$ is the dominant root of a nonnegative recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2}$, show there exists a positive constant $c$ such that

$$\lim_{n \to \infty} \frac{a_n}{r_0^n} = c .$$

Compare this with Problem 157.

163. Let $f(x) = c_0 x^n - c_1 x^{n-1} - \cdots - c_{n-1} x - c_n$ be a polynomial such that every $c_i$ is a nonnegative real number and $c_0, c_n \ne 0$. Such polynomials are often called nonnegative polynomials. If $f(x)$ is a nonnegative polynomial whose derivative $f'(x)$ has exactly one positive root, use calculus to show that $f(x)$ has exactly one positive real root.

164. If $f(x)$ is a nonnegative polynomial, must $f'(x)$ also be a nonnegative polynomial? Either prove that it is always true or find the general form of counterexamples.

165. Prove that any nonnegative polynomial, regardless of degree $n \ge 1$, has exactly one positive root, which is again called the dominant root of the nonnegative polynomial.

166. Suppose $r_0$ is the dominant root of the nonnegative polynomial $f(x) = x^n - c_1 x^{n-1} - \cdots - c_{n-1} x - c_n$. If $r$ is any other root of $f(x)$, show that

$$|r|^n \le c_1 |r|^{n-1} + \cdots + c_{n-1} |r| + c_n . \tag{7.1}$$

(This inequality holds even when $r$ is a non-real complex number, but if you don’t know much about complex numbers then just prove the result for real roots $r$.) Use this inequality to explain why the unique positive root $r_0$ is called the dominant root.

7.4 Additional Exercises for Chapter 7

◦1. (a) In paying off a mortgage loan with initial amount A, annual interest rate p% (on a monthly basis) with a monthly payment of m, what recurrence describes the amount owed after n months of payments in terms of the amount owed after n − 1 months? Some technical details: You make the first payment after one month. The interest included in your monthly payment is computed at the monthly rate .01p/12. This interest rate is applied to the amount you owed immediately after making your last monthly payment.

(b) Find a formula for the amount owed after n months.

(c) Find a formula for the number of months needed to bring the amount owed to zero. Another technical point: If you were to make the standard monthly payment m in the last month, you might actually end up owing a negative amount of money. Therefore it is okay if the result of your formula for the number of months needed gives a non-integer number of months. The bank would just round up to the next integer and adjust your payment so your balance comes out to zero.

(d) What should the monthly payment be to pay off the loan over a period of 30 years?

→2. Find a recurrence for the number of ways to divide a convex n-gon into triangles by means of non-intersecting diagonals. How do these numbers relate to the Catalan numbers?

3. One natural but oversimplified model for the growth of a tree is that all new wood grows from the previous year’s growth and is proportional to it in amount. To be more precise, assume that the (total) length of the new growth in a given year is the constant $c$ times the (total) length of new growth in the previous year. Write a recurrence for the total length $a_n$ of all the branches of the tree at the end of growing season $n$. Find the general solution to your recurrence. Assume that we begin with a one meter cutting of new wood (from the previous year) which branches out and grows a total of two meters of new wood in the first year. What will the total length of all the branches of the tree be at the end of $n$ years?

Chapter 8

The Principle of Inclusion and Exclusion

8.1 The Size of a Union of Sets

One of our very first counting principles was the Sum Principle which says that the size of a union of disjoint sets is the sum of their sizes. Computing the size of the union of overlapping sets quite naturally requires information about how they overlap. Taking such information into account will allow us to develop a powerful extension of the Sum Principle known as the “Principle of Inclusion and Exclusion.”

8.1.1 Unions of two or three sets

◦167. In a biology lab study of the effects of basic fertilizer ingredients on plants, 16 plants are treated with potash, 16 plants are treated with phosphate, and among these eight plants are treated with both phosphate and potash. No other treatments are used. How many plants receive at least one treatment? If 32 plants are studied, how many receive no treatment?

+ 168. Give a formula for the size of the union $A \cup B$ of two sets $A$ and $B$ in terms of the sizes $|A|$ of $A$, $|B|$ of $B$, and $|A \cap B|$ of $A \cap B$. If $A$ and $B$ are subsets of some “universal” set $U$, express the size of the complement $U - (A \cup B)$ in terms of the sizes $|U|$ of $U$, $|A|$ of $A$, $|B|$ of $B$, and $|A \cap B|$ of $A \cap B$.

Hint. Try drawing a Venn Diagram.


◦169. In Problem 167, just two fertilizers were used to treat all the sample plants. Now suppose there are three fertilizer treatments: 15 plants are treated with nitrates, 16 with potash, 16 with phosphate, 7 with nitrate and potash, 9 with nitrate and phosphate, 8 with potash and phosphate and 4 with all three. Now how many plants have been treated? If 32 plants were studied, how many received no treatment at all?

•170. Give a formula for the size of $A \cup B \cup C$ in terms of the sizes of $A$, $B$, $C$ and the intersections of these sets.

8.1.2 Unions of an arbitrary finite number of sets

•171. Conjecture a formula for the size of a union of sets

$$A_1 \cup A_2 \cup \cdots \cup A_n = \bigcup_{i=1}^{n} A_i$$

in terms of the sizes of the sets $A_i$ and their intersections.

The hardest part of generalizing Problem 170 to Problem 171 is probably finding a good notation to express your conjecture. In fact, for many people it would be easier to express the conjecture in words than to express it as a mathematical formula. We will describe some notation that will make your task easier. It is similar to the notation

$$E_P(S) = \sum_{s \in S} P(s)$$

that we used to stand for the sum of the pictures of the elements of a set $S$ when we introduced picture enumerators. Let us define

$$\bigcap_{i \in I} A_i$$

to mean the intersection over all elements $i$ in the set $I$ of $A_i$. Thus

$$\bigcap_{i \in \{1,3,4,6\}} A_i = A_1 \cap A_3 \cap A_4 \cap A_6 . \tag{8.1}$$

This kind of notation, consisting of an operator with a description underneath of the values of a dummy variable of interest to us, is a generalization of what you have already used with summation notation for sums and product notation for products.
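The indexed-intersection notation translates directly into a loop over the index set. As a small illustration (the family of sets below is invented, and the helper name `intersection_over` is hypothetical), one might write:

```python
from functools import reduce


def intersection_over(family, index_set):
    """Intersection of family[i] over all i in a nonempty index_set."""
    return reduce(set.intersection, (family[i] for i in index_set))


# Hypothetical family of sets A_1, ..., A_6, keyed by their index.
family = {
    1: {1, 2, 3, 4, 5},
    2: {2, 4, 6},
    3: {1, 3, 4, 5},
    4: {3, 4, 5, 6},
    5: {5},
    6: {0, 4, 5},
}

# The intersection in (8.1): A_1, A_3, A_4 and A_6.
print(intersection_over(family, {1, 3, 4, 6}))
```

The index set plays the role of the description written underneath the operator; this sketch assumes it is nonempty.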


•172. Use notation something like that of (8.1) to express the answer to Problem 171. Note there are many different correct ways to do this problem. Try to write down more than one and choose the nicest one you can. Say why you chose it because your view of what makes a formula nice may be different from somebody else’s. (Also, your best formula won’t necessarily involve all the elements of (8.1). The author’s version doesn’t use all those elements.)

•173. A group of n students carrying backpacks goes to a restaurant. The manager invites everyone to check their backpack at the check desk and everyone does. While they are eating, a child playing in the check room randomly moves around the claim check stubs. We will try to compute the probability that, at the end of the meal, at least one student receives his or her own backpack. This probability is the fraction of the total number of ways to return the backpacks in which at least one student gets his or her own backpack back.

(a) What is the total number of ways to pass back the backpacks?

(b) In how many of the distributions of backpacks to students does at least one student get his or her own backpack? It might be a good idea to first consider cases with n = 3, 4, and 5.

(c) What is the probability that at least one student gets the correct backpack?

(d) What is the probability that no student gets his or her own backpack?

(e) As the number of students becomes large, what does the probability that no student gets the correct backpack approach?

Problem 173 is “classically” called the hat check problem; the name comes from substituting hats for backpacks. It is also sometimes called the derangement problem. A derangement of an n-element set is a permutation of that set (thought of as a bijection on [n]) that maps no element of the set to itself. One can think of a way of handing back the backpacks as a permutation f of the students: f(i) is the owner of the backpack that student i receives. Then a derangement is a way to pass back the backpacks so that no student gets his or her own.
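For small n, the definition of a derangement can be applied by brute force, which gives data to test conjectures against. This is only a sketch: it labels the students 0, ..., n−1 rather than 1, ..., n, the function names are hypothetical, and checking all n! permutations is practical only for small n.

```python
from itertools import permutations


def is_derangement(perm):
    """perm[i] is the image of i; a derangement has no fixed point."""
    return all(image != i for i, image in enumerate(perm))


def count_derangements(n):
    """Count derangements of {0, ..., n-1} by checking every permutation."""
    return sum(1 for perm in permutations(range(n)) if is_derangement(perm))


for n in range(1, 6):
    print(n, count_derangements(n))
```

Dividing each count by n! gives the fraction of ways in which no student receives his or her own backpack.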

8.2 The Principle of Inclusion and Exclusion

The formula you have given in Problem 172 is often called the Principle of Inclusion and Exclusion for unions of sets. The reason is the pattern in the formula: it first adds (includes) all the sizes of the sets, then subtracts (excludes) all the sizes of the intersections of two sets, then adds (includes) all the sizes of the intersections of three sets, and so on. Notice that we haven’t yet proved the principle. There are a variety of proofs. Perhaps one of the most straightforward (though not the most elegant) is an inductive proof that relies on the fact that

$$A_1 \cup A_2 \cup \cdots \cup A_n = (A_1 \cup A_2 \cup \cdots \cup A_{n-1}) \cup A_n$$

and the formula for the size of a union of two sets.

174. Use induction to give a proof of your formula for the Principle of Inclusion and Exclusion.

→175. We get a more elegant proof if we ask for a picture enumerator for $A_1 \cup A_2 \cup \cdots \cup A_n$. So let us assume $A$ is a set with a picture function $P$ defined on it and that each set $A_i$ is a subset of $A$.

(a) By thinking about how we got the formula for the size of a union, write down instead a conjecture for the picture enumerator of a union. You could use a notation like $E_P\left(\bigcap_{i \in S} A_i\right)$ for the picture enumerator of the intersection of the sets $A_i$ for $i$ in a subset $S$ of $[n]$.

(b) If $x \in \bigcup_{i=1}^{n} A_i$, what is the coefficient of $P(x)$ in (the inclusion-exclusion side of) your formula for $E_P\left(\bigcup_{i=1}^{n} A_i\right)$?

Hint. Let $T$ be the set of all $i$ such that $x \in A_i$. In terms of $x$, what is different about the $i$ in $T$ and those not in $T$?

(c) If $x \notin \bigcup_{i=1}^{n} A_i$, what is the coefficient of $P(x)$ in (the inclusion-exclusion side of) your formula for $E_P\left(\bigcup_{i=1}^{n} A_i\right)$?

(d) How have you proved your conjecture for the picture enumerator of the union of the sets $A_i$?

(e) How can you get the formula for the Principle of Inclusion and Exclusion from your formula for the picture enumerator of the union?

176. Frequently when we apply the Principle of Inclusion and Exclusion, we will have a situation like that of Problem 173(d). That is, we will have a set $A$ and subsets $A_1, A_2, \ldots, A_n$ and we will want the size (or the probability) of the set of elements in $A$ that are not in the union. This set is known as the complement of the union of the $A_i$s in $A$, and is denoted by $A - \bigcup_{i=1}^{n} A_i$, or if $A$ is clear from context, by $\overline{\bigcup_{i=1}^{n} A_i}$. Give the formula for the size of $\overline{\bigcup_{i=1}^{n} A_i}$. The Principle of Inclusion and Exclusion generally refers to both this formula and the one for the union.


We can find a very elegant way of writing the formula in Problem 176 if we let $\bigcap_{i \in \emptyset} A_i = A$. For this reason, when we have a family of subsets $A_i$ of a set $A$, we define¹ $\bigcap_{i \in \emptyset} A_i$ to be $A$.

8.3 Applications of Inclusion and Exclusion

8.3.1 Multisets with restricted numbers of elements

177. In how many ways may we distribute k identical apples to n children so that no child gets more than four apples? Notice you know how to figure out how many ways the apples can be distributed so that child i gets five or more apples. Compare your result with your result in Problem 142 (on page 72).

8.3.2 The Menage Problem

A certain town has a large number of 8-year-old twins, who are all in the same third-grade class. The teacher asks n sets of twins to sit around a round table.

178. Let $A_i$ be the set of all such seatings in which the children in the $i$-th set of twins are sitting next to each other. Find $|A_i|$. Find $|A_i \cap A_j|$ for $i \ne j$.

179. For each of n = 4, n = 5, find the number of ways n sets of twins can be seated if no one may sit next to his or her twin.

→180. For general n, in how many ways can the n sets of twins be seated if no one may sit next to his or her twin?

→∗181. In this problem we are again seating n sets of twins around a round table, and now each set of twins has one boy and one girl. In how many ways can they sit so that no person is next to their twin and the genders alternate around the table? This problem is called the Menage Problem. (Hint: Reason somewhat as you did in Problem 180, noting that if the set of all twins who do sit side-by-side is nonempty, then the gender of the person at each place at the table is determined once we seat one couple in that set, or, for that matter, once we seat one person.)

¹ For those interested in logic and set theory, given a family of subsets $A_i$ of a set $A$, we define $\bigcap_{i \in S} A_i$ to be the set of all members $x$ of $A$ that are in $A_i$ for all $i \in S$. (Note that this allows $x$ to be in some other $A_j$s as well.) Then if $S = \emptyset$, our intersection consists of all members $x$ of $A$ that satisfy the statement “if $i \in \emptyset$, then $x \in A_i$.” But since the hypothesis of the ‘if-then’ statement is false, the statement itself is true for all $x \in A$. Therefore $\bigcap_{i : i \in \emptyset} A_i = A$.

8.3.3 Counting onto functions

•182. Given a function f from the k-element set K to the n-element set [n], we say f is in the set Ai if f(x) ≠ i for every x ∈ K. If f is an onto function, how many of these sets does f belong to? What is the number of functions from a k-element set onto an n-element set?

183. If we roll a die eight times, we get a sequence of 8 numbers: the number of dots on top on the first roll, the number on the second roll, and so on. If you want to get an actual numerical answer to any part of this problem, you will likely need either a computer algebra package, a programmable calculator, or a spreadsheet.

(a) What is the number of ways of rolling the die eight times so that each of the numbers one through six appears at least once in our sequence?

(b) What is the probability that we get a sequence in which all six numbers between one and six appear?

(c) How many times do we have to roll the die for the probability that all six numbers appear in our sequence to be at least 1/2?
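Once you have a formula for Problem 182, a brute-force count gives you something to test it against. The following is only a sketch of such a check, not a method for solving the problem: the function name count_onto_sequences is our own, and the search simply lists every sequence, so it is practical only for small k and n.

```python
from itertools import product

def count_onto_sequences(k, n):
    """Count length-k sequences over {1, ..., n} in which every value appears."""
    faces = range(1, n + 1)
    return sum(1 for seq in product(faces, repeat=k) if len(set(seq)) == n)

# A case small enough to list by hand: length-3 sequences that use both of 2 values.
print(count_onto_sequences(3, 2))

# Part (a) of Problem 183: eight rolls of a six-sided die (about 1.7 million sequences).
favorable = count_onto_sequences(8, 6)
print(favorable, favorable / 6**8)
```

Each such sequence is an onto function from the k positions to the n values, so the same function can be used to check your answer to Problem 182 for other small k and n.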

8.4 The chromatic polynomial of a graph

In Section 3.2 (on pages 35ff) we defined a graph to consist of a set V of elements called vertices and a set E of elements called edges such that each edge joins two vertices. A coloring of the vertices of a graph by the elements of a set C (of colors) is an assignment of an element of C to each vertex of the graph; that is, a function from the vertex set V of the graph to C. A coloring is called a proper coloring if for each edge joining two distinct vertices², the two vertices it joins have different colors. You may have heard of the famous Four Color Theorem of graph theory that says if there is a drawing of a graph in the plane so that no two edges cross (though they may touch at a vertex), then the graph has a proper coloring with four colors. Here we are interested in a different, though related, problem: namely, in how many ways may we properly color a graph (regardless of whether it can be drawn in the plane or not) using k or fewer colors?

² If a graph has a loop connecting some vertex to itself, the loop must of course connect a vertex to a vertex of the same color. Because of this, in this definition we consider only edges with two distinct vertices.

When we studied trees, we restricted ourselves to connected graphs. (Recall that a graph is connected if, for each pair of vertices, there is a walk between them.) Here, disconnected graphs will also be important to us.

184. Given a graph which might or might not be connected, define a relation on its set V of vertices by: For v, w ∈ V, vRw if and only if there is a path from v to w. Prove that this relation is an equivalence relation. Its equivalence classes are called the connected components of the graph.

Notice that the connected components depend on the edge set of the graph; that is, if we have a graph on the vertex set V with edge set E and another graph on the same vertex set V with edge set E′, then these two graphs could have different connected components. It is traditional to use the Greek letter γ (gamma)³ to represent the number of connected components of a graph. In particular, γ(V,E) stands for the number of connected components of the graph with vertex set V and edge set E.
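To experiment with how γ changes when the edge set changes, it can help to compute it mechanically. The sketch below uses a standard union-find structure; the function name count_components and the sample graphs are our own illustrations, not part of the notes.

```python
def count_components(vertices, edges):
    """Return gamma(V, E): the number of connected components of the graph (V, E)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # shorten the path as we go
            v = parent[v]
        return v

    components = len(parent)
    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw   # the edge merges two components into one
            components -= 1
    return components

V = [1, 2, 3, 4, 5]
print(count_components(V, [(1, 2), (2, 3), (4, 5)]))  # 2
print(count_components(V, [(1, 2), (4, 5)]))          # 3
print(count_components(V, []))                        # 5
```

The three calls use the same vertex set with three different edge sets, which is exactly the situation described above.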

We next show how the Principle of Inclusion and Exclusion may be used to compute the number of ways to color a graph properly using colors from a set C of c colors.

·185. Suppose we have a graph G with vertex set V and edge set E = {e1, e2, . . . , e|E|}. Suppose F is a subset of E. Suppose we have a set C of c colors with which to color the vertices.

(a) In terms of γ(V, F), in how many ways may we color the vertices of G so that each edge in F connects two vertices of the same color?

(b) Given a coloring of G, for each edge ei in E, let us consider the set Ai of all colorings in which both endpoints of ei are colored the same color. In which sets Ai does a proper coloring lie?

(c) Find a formula (which may involve summing over all subsets F of the edge set of the graph and using the number γ(V, F) of connected components of the graph with vertex set V and edge set F) for the number of proper colorings of G using colors in the set C.

³ The Greek letter gamma is pronounced gam-uh, where gam rhymes with ham.


The formula you found in Problem 185 involves powers of the number of colors, and so it is a polynomial function of c. It is called the chromatic polynomial of the graph G. Since we like to think about polynomials as having a variable x and we like to think of c as standing for some constant, people often use x as the notation for the number of colors we are using to color G. Frequently people will use χG(x) to stand for the number of ways to color G with x colors, and call χG(x) the chromatic polynomial of G.
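If you would like values of χG(x) to test a formula against, you can count proper colorings directly for small graphs. This is a brute-force sketch only; the name count_proper_colorings and the triangle used as an example are our own choices, and the search tries all x^|V| colorings, so it is useful only for small cases.

```python
from itertools import product

def count_proper_colorings(vertices, edges, x):
    """Count colorings of the vertices with colors 0, ..., x-1 that are proper."""
    vertices = list(vertices)
    total = 0
    for colors in product(range(x), repeat=len(vertices)):
        color = dict(zip(vertices, colors))
        # Following the definition of proper coloring, loops (u == w) are ignored.
        if all(color[u] != color[w] for u, w in edges if u != w):
            total += 1
    return total

triangle_V = [1, 2, 3]
triangle_E = [(1, 2), (2, 3), (1, 3)]
for x in range(5):
    print(x, count_proper_colorings(triangle_V, triangle_E, x))
```

A polynomial of degree d is determined by its values at d + 1 points, so a table like this one is enough to confirm or refute a proposed chromatic polynomial for a small graph.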

8.4.1 Deletion-Contraction and Chromatic Polynomials

→186. In Section 4.5 (on pages 50ff) we introduced the deletion-contraction recurrence for counting spanning trees of a graph.

(a) Figure out how the chromatic polynomial of a graph is related to the chromatic polynomials of the graphs resulting from deletion of an edge e and from contraction of that same edge e.

(b) Try to find a recurrence like the one for counting spanning trees that expresses the chromatic polynomial of a graph in terms of the chromatic polynomials of G − e and G/e for an arbitrary edge e.

(c) Use the recurrence from the last part to give another proof that the number of ways to color a graph with x colors is a polynomial function of x.

187. Use the deletion-contraction recurrence to reduce the computation of the chromatic polynomial of the graph in Figure 8.1 to computing chromatic polynomials that you can easily compute. (You can simplify your computations by thinking about the effect on the chromatic polynomial of deleting an edge that is a loop, or deleting one of several edges between the same two vertices.)

Figure 8.1: A graph. [Figure not reproduced: a graph on five vertices labelled 1 through 5.]


→188. (a) In how many ways may you properly color the vertices of a path on n vertices with x colors? Describe any dependence of the chromatic polynomial of a path on the number of vertices.

(b) In how many ways may you properly color the vertices of a cycle on n vertices with x colors? Describe any dependence of the chromatic polynomial of a cycle on the number of vertices.

189. In how many ways may you properly color the vertices of a tree on n vertices with x colors?

→190. What do you observe about the signs of the coefficients of the chromatic polynomial of the graph in Figure 8.1? What about the signs of the coefficients of the chromatic polynomial of a path? Of a cycle? Of a tree? Make a conjecture about the signs of the coefficients of a chromatic polynomial and prove it.


1. Each person attending a party has been asked to bring a prize. The person planning the party has arranged to give out exactly as many prizes as there are guests, but any person may win any number of prizes. If there are n guests, in how many ways may the prizes be given out so that nobody gets the prize that he or she brought?

2. There are m students attending a seminar in a room with n seats. The seminar is a long one, and in the middle the group takes a break. In how many ways may the students return to the room and sit down so that nobody is in the same seat as before?

3. What is the number of ways to pass out k pieces of candy from an unlimited supply of identical candy to n children (where n is fixed) so that each child gets between three and six pieces of candy (inclusive)? If you have done Additional Problem 1 in Chapter 6, compare your answer in that problem with your answer in this one.

→4. In how many ways may k distinct books be arranged on n shelves so that no shelf gets more than m books?

→5. Suppose that n children join hands in a circle for a game at nursery school. The game involves everyone falling down (and letting go). In how many ways may they join hands in a circle again so that nobody has the same person immediately to the right both times the group joins hands?

→∗6. Suppose that n people link arms in a folk-dance and dance in a circle. Later on they let go and dance some more, after which they link arms in a circle again. In how many ways can they link arms the second time so that no one links with a person with whom he or she linked arms before?

→∗7. (A challenge: the authors have not tried to solve this one!) Redo Problem 6 in the case that there are n men and n women and the genders must alternate in the circular arrangements.

→8. Suppose we take two graphs G1 and G2 with disjoint vertex sets, we choose one vertex on each graph, and connect these two vertices by an edge e to get a graph G12. How does the chromatic polynomial of G12 relate to the chromatic polynomials of G1 and G2?

Part II

SUPPLEMENTARY SECTIONS

Chapter 1

Ramsey Numbers

Mathematical Prerequisites: Work through Chapter 1. Some acquaintance with induction is required.

1.1 The Generalized Pigeonhole Principle

Generalized Pigeonhole Principle
If we partition a set with more than kn elements into n blocks, then at least one block has at least k + 1 elements.

Exercise 1.1. Prove the Generalized Pigeonhole Principle.

Exercise 1.2. All the powers of 5 end in the digit 5, and all the powers of 2 are even. Show that there exists an integer n such that if you take the first n powers of a prime other than two or five, one must have “01” as the last two digits.

Exercise 1.3. Show that in a set of six people, there is either a subset of at least three people who all know each other or a subset of at least three people none of whom know each other. (Here we assume that if Person 1 knows Person 2, then Person 2 knows Person 1.)

Hint. Look at it from the perspective of one person, say Person 1.

Exercise 1.4. Draw five circles labelled Al, Sue, Don, Pam, and Jo. Find a way to draw red and green lines between people so that every pair of people is joined by a line and there is neither a triangle consisting entirely of red lines nor a triangle consisting entirely of green lines. What does Exercise 1.3 tell you about the possibility of doing this with six people’s names? What does this exercise say about the conclusion of Exercise 1.3 holding when there are five people in our set rather than six?

1.2 Ramsey Numbers

Exercises 1.3 and 1.4 together show that six is the smallest number R with the property that if we have R people in a room, then there is either a set of (at least) three mutual acquaintances or a set of (at least) three mutual strangers. Another way to say the same thing is to say that six is the smallest number so that no matter how we connect six points in the plane (no three on a line) with red and green lines, we can find either a red triangle or a green triangle. There is a name for this property. The Ramsey Number R(m,n) is the smallest number R such that if we have R people in a room, then there is a set of at least m mutual acquaintances or at least n mutual strangers.
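The statement that six is the smallest such number can also be confirmed by exhaustive search, since K6 has only 15 edges and hence 2^15 colorings. The following sketch is a check rather than a proof you can learn from, and it reports only True or False; the function names are our own.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each pair (i, j) with i < j to 0 (red) or 1 (green)."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_coloring_has_mono_triangle(n):
    """Check all 2-colorings of the edges of K_n by exhaustive search."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, colors)))
        for colors in product((0, 1), repeat=len(edges))
    )

print(every_coloring_has_mono_triangle(5))  # False
print(every_coloring_has_mono_triangle(6))  # True
```

The number of colorings doubles with every added edge, so this approach stops being practical well before the Ramsey numbers that are still unknown.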

There is also a geometric description of Ramsey Numbers which uses the idea of a complete graph on R vertices. A complete graph on R vertices consists of R points in the plane, together with line segments (or curves) connecting each pair of vertices. As you may guess, a complete graph is a special case of something called a graph. The word graph will be defined in Section 3.2. The points are called vertices and the line segments are called edges.

Figure 1.1: Three ways to draw a complete graph K4 on four vertices.

In Figure 1.1 we show three different ways to draw a complete graph on four vertices. We use Kn to stand for a complete graph on n vertices. Our geometric description of R(3, 3) may be translated into the language of graph theory by saying R(3, 3) is the smallest number R such that if we color the edges of a KR with two colors, then we can find in our picture a K3 all of whose edges have the same color. The graph theory description of R(m,n) is that R(m,n) is the smallest number R such that if we color the edges of a KR with the colors red and green, then we can find in our picture either a Km all of whose edges are red or a Kn all of whose edges are green. Because we could have said our colors in the opposite order, we may conclude that R(m,n) = R(n,m). In particular R(n, n) is the smallest number R such that if we color the edges of a KR with two colors, then our picture contains a Kn all of whose edges have the same color.

Exercise 1.5. Since R(3, 3) = 6, an uneducated guess might be that R(4, 4) = 8. Show that this is not the case.

Hint. To get started, try to write down what it means to say R(4, 4) does not equal 8.

Exercise 1.6. Show that among ten people, there are either four mutual acquaintances or three mutual strangers. What does this say about R(4, 3)?

Hint. Review Exercise 1.3 and your solution of it.

Exercise 1.7. Show that among an odd number of people there is at least one person who is an acquaintance of an even number of people and therefore also a stranger to an even number of people.

Hint. Let ai be the number of acquaintances of person i.

Exercise 1.8. Find a way to color the edges of a K8 with red and green so that there is no red K4 and no green K3.

Hint. Often when there is a counter-example, there is one with a good deal of symmetry. (Caution: there is a difference between often and always!)

Exercise 1.9. Find R(4, 3).

Hint. There is a relevant exercise that you haven’t used yet.

As of this writing, relatively few Ramsey Numbers are known: R(3, n) for all n < 10; R(4, 4) = 18; and R(5, 4) = R(4, 5) = 25.

1.3 The Existence of Ramsey Numbers

We have just given two different descriptions of the Ramsey number R(m, n). However, if you look carefully, you will see that we never showed that Ramsey Numbers actually exist; we merely described what they were and showed that R(3, 3) and R(3, 4) exist by computing them directly.

Provided we can show that there is some number R such that for any group of R people, there are either m mutual acquaintances or n mutual strangers, we will know that the Ramsey Number R(m,n) exists, because it is the smallest such R. Mathematical induction will allow us to show that $\binom{m+n-2}{m-1}$ is one such R, which will then prove $R(m,n) \le \binom{m+n-2}{m-1}$.

The question is, what should we induct on, m or n? In other words, do we use the fact that with $\binom{m+n-3}{m-2}$ people in a room there are at least m − 1 mutual acquaintances or n mutual strangers, or do we use the fact that with at least $\binom{m+n-3}{n-2}$ people in a room there are at least m mutual acquaintances or at least n − 1 mutual strangers? It turns out that we use both. That is, we want to be able to simultaneously induct on m and n. One way to do that is to use yet another variation on the Principle of Mathematical Induction, the Principle of Double Mathematical Induction (which can be derived from one of our earlier ones).

The Principle of Double Mathematical Induction
In order to prove a statement about integers m and n, if we can

1. Prove the statement when m = a and n = b, for fixed integers a and b,

2. Show that the truth of the statement for values of m and n with a + b ≤ m + n < k implies the truth of the statement for m + n = k,

then we can conclude that the statement is true for all pairs of integers m ≥ a and n ≥ b.

Exercise 1.10. (a) Prove that if there are $\binom{m+n-2}{m-1}$ people in a room, then there are either at least m mutual acquaintances or at least n mutual strangers.

(b) Prove that R(m,n) exists.

Hint. Double induction, using Pascal’s Equation.
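To get a feel for how good the bound of Exercise 1.10 is, we can tabulate it next to the Ramsey numbers quoted at the end of Section 1.2. The sketch below also checks Pascal's Equation for this family of binomial coefficients on a small range; the function name ramsey_upper_bound is our own, and a numerical check like this is of course no substitute for the proof the exercise asks for.

```python
from math import comb

def ramsey_upper_bound(m, n):
    """The bound C(m+n-2, m-1) on R(m, n) from Exercise 1.10."""
    return comb(m + n - 2, m - 1)

# Ramsey numbers quoted in the text, compared with the bound.
known = {(3, 3): 6, (4, 4): 18, (5, 4): 25}
for (m, n), value in known.items():
    print((m, n), value, ramsey_upper_bound(m, n))

# Pascal's Equation, the identity named in the hint, checked on a small range.
assert all(
    ramsey_upper_bound(m, n)
    == ramsey_upper_bound(m - 1, n) + ramsey_upper_bound(m, n - 1)
    for m in range(2, 8) for n in range(2, 8)
)
```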

Exercise 1.11. Prove that R(m,n) ≤ R(m − 1, n) + R(m,n − 1).

Hint. Induction is actually not needed here. Rather, begin by explicitly stating what you need to prove.

Exercise 1.12. (a) What does the inequality in Exercise 1.11 tell us about R(4, 4)?

(b) Consider 17 people arranged in a circle such that each person is acquainted with the first, second, fourth, and eighth person to the right and the first, second, fourth, and eighth person to the left. Can you find a set of four mutual acquaintances? Can you find a set of four mutual strangers?

(c) What is R(4, 4)?
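After you have thought about part (b) by hand, you may want to confirm your answer by machine. The sketch below encodes the seating described there and searches all 2380 four-person subsets; the names acquainted and has_four_mutual are our own, and the printed answers are deliberately left for you to discover by running it.

```python
from itertools import combinations

PEOPLE = range(17)
STEPS = {1, 2, 4, 8}

def acquainted(i, j):
    """True if j sits 1, 2, 4, or 8 places to the right or left of i."""
    d = (i - j) % 17
    return d in STEPS or (17 - d) in STEPS

def has_four_mutual(relation):
    """Search all 4-element subsets for one in which every pair satisfies relation."""
    return any(
        all(relation(a, b) for a, b in combinations(group, 2))
        for group in combinations(PEOPLE, 4)
    )

print("four mutual acquaintances:", has_four_mutual(acquainted))
print("four mutual strangers:", has_four_mutual(lambda a, b: not acquainted(a, b)))
```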

Exercise 1.13. (Optional) Prove the inequality of Exercise 1.10 by induction on m + n.

Exercise 1.14. Use Stirling’s approximation (Problem 43) to convert the upper bound for R(n, n) that you get from Exercise 1.10 to a multiple of a power of an integer.

1.4 A Bit of Asymptotic Combinatorics

Exercise 1.14 gives us an upper bound on R(n, n). A very clever technique due to Paul Erdős, called the “probabilistic method,” will give a lower bound. Since both bounds are exponential in n, they show that R(n, n) grows exponentially as n gets large. An analysis of what happens to a function of n as n gets large is usually called an asymptotic analysis. The probabilistic method, at least in its simpler forms, can be expressed in terms of averages, so one does not need to know the language of probability in order to understand it. We will apply it to Ramsey numbers in the next problem. Combined with the result of Exercise 1.14, this problem will give us that $\sqrt{2}^{\,n} < R(n,n) < 2^{2n-2}$, so that we know that the Ramsey number R(n, n) grows exponentially with n.

Exercise 1.15. Suppose we have two integers n and m. We consider all possible ways to color the edges of the complete graph Km with two colors, say red and blue. For each coloring, we look at each n-element subset N of the vertex set M of Km. Then N together with the edges of Km connecting vertices in N forms a complete graph on n vertices. This graph, which we denote by KN, has its edges colored by the original coloring of the edges of Km.

(a) Why is it that if, for some coloring of the edges of Km, there is no n-element subset N ⊆ M such that all the edges of KN are colored the same color, then R(n, n) > m?
Hint. Use the definition of R(n, n).

(b) To apply the probabilistic method, we are going to compute the average, over all colorings of Km, of the number of sets N ⊆ M with |N| = n such that KN does have all its edges the same color. Explain why it is that if the average is less than 1, then for some coloring there is no set N such that KN has all its edges colored the same color. Why does this mean that R(n, n) > m?

(c) We call a KN monochromatic for a coloring c of Km if the color c(e) assigned to edge e is the same for every edge of KN. Let us define mono(c,N) to be 1 if N is monochromatic for c and to be 0 otherwise. Find a formula for the average number of monochromatic KNs over all colorings of Km that involves a double sum first over all edge colorings c of Km and then over all n-element subsets N ⊆ M of mono(c,N).
Hint. Note that there are $2^{\binom{m}{2}}$ graphs on a set of m vertices.

(d) Show that your formula for the average reduces to $2\binom{m}{n} \cdot 2^{-\binom{n}{2}}$.
Hint. If you interchange the order of summation so that you sum over subsets first and colorings second, you can take advantage of the fact that for a fixed subset N, you can count the number of colorings in which it is monochromatic.

(e) Explain why R(n, n) > m if $\binom{m}{n} \le 2^{\binom{n}{2}-1}$.

(f) Explain why $R(n,n) > \sqrt[n]{n!\,2^{\binom{n}{2}-1}}$.

(g) By using Stirling’s formula, show that if n is large enough, then $R(n,n) > \sqrt{2^n} = \sqrt{2}^{\,n}$. (Here large enough means large enough for Stirling’s formula to be reasonably accurate.)
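To see what parts (d) and (e) say numerically, we can evaluate the average from part (d) and find, for each n, the largest m that satisfies the condition in part (e). This is only an illustrative sketch; the function names average_mono and largest_m_from_part_e are our own, and the last printed column is the quantity $\sqrt{2}^{\,n}$ from part (g), shown for comparison.

```python
from fractions import Fraction
from math import comb

def average_mono(m, n):
    """Average number of monochromatic K_N over all 2-colorings of K_m, per part (d)."""
    return Fraction(2 * comb(m, n), 2 ** comb(n, 2))

def largest_m_from_part_e(n):
    """Largest m with C(m, n) <= 2^(C(n,2) - 1), the condition in part (e)."""
    m = n
    while comb(m + 1, n) <= 2 ** (comb(n, 2) - 1):
        m += 1
    return m

for n in range(3, 11):
    m = largest_m_from_part_e(n)
    print(n, m, float(average_mono(m, n)), 2 ** (n / 2))
```

Each printed m is a value for which part (e) gives R(n, n) > m, so the second column is a table of lower bounds obtained without constructing a single coloring.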

Chapter 2

Permutation Groups

Mathematical Prerequisites: Some acquaintance with equivalence relations is required. If you’ve already had some abstract algebra, you can progress through this fairly quickly.

Until now we have primarily thought of permutations as ways of listing the elements of a set. In this chapter we will find it very useful to think of permutations as functions. This will help us in using permutations to solve enumeration problems that involve counting the blocks of a partition in which the blocks don’t have the same size (and so cannot be solved by the Quotient Principle). We begin by studying the kinds of permutations that arise in situations where we have used the Quotient Principle in the past.

2.1 The rotations of a square

In Figure 2.1 we show a square with its four vertices labelled a, b, c, and d. We have also labelled the spots in the plane where each of these vertices falls with the label 1, 2, 3, or 4. Then we have shown the effect of rotating the square clockwise through 90, 180, 270, and 360 degrees (which is the same as rotating through 0 degrees). Underneath each of the rotated squares we have named the function that carries out the rotation. We use ρ, the Greek letter pronounced “row,” to stand for a 90 degree clockwise rotation. We use ρ² to stand for two 90 degree rotations, and so on. We can think of the function ρ as a function on the four element set [4], with ρ(1) = 2, ρ(2) = 3, ρ(3) = 4 and ρ(4) = 1. Notice that ρ is a permutation on the set [4]. And for any function ϕ (the Greek letter phi, usually pronounced “fee,” but sometimes “fie”) from the plane back to itself that may move the square around in the plane but otherwise leaves it in the same location, we let ϕ(i) be the label of the place where the vertex previously in position i is now.

Figure 2.1: The four possible results of rotating a square but maintaining its location. [Figure not reproduced; the vertex occupying each position after each rotation is listed below.]

Position:       1  2  3  4
identity:       a  b  c  d
ρ:              d  a  b  c
ρ²:             c  d  a  b
ρ³:             b  c  d  a
ρ⁴ = ρ⁰:        a  b  c  d

Exercise 2.1. The composition f ◦ g of two functions f and g is defined as usual by (f ◦ g)(x) = f(g(x)). Is ρ³ the composition of ρ and ρ²? Does the answer depend on the order in which we write ρ and ρ²? How is ρ² related to ρ? Is the composition of two permutations always a permutation?

In Exercise 2.1 you see that we can think of ρ² ◦ ρ as the result of first rotating by 90 degrees and then by another 180 degrees. In other words, the composition of two rotations is the same thing as first doing one and then doing the other. Of course there is nothing special about 90 degrees and 180 degrees. As long as we first do one rotation through a multiple of 90 degrees and then another rotation through a multiple of 90 degrees, the composition of these rotations is a rotation through a multiple of 90 degrees. If we first rotate by 90 degrees and then by 270 degrees then we have rotated by 360 degrees, which does nothing visible to the square. Thus we say that ρ⁴ is the “identity function.” In general the identity function on a set S, denoted by ι (the Greek letter iota, pronounced eye-oh-ta), is the function that takes each element of the set to itself. In symbols, ι(x) = x for every x ∈ S. Of course the identity function on a set is a permutation of that set.
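If you want to experiment with compositions of rotations, a permutation of [4] can be stored as a table of its values. The sketch below is one possible encoding, using a Python dictionary; the names compose, rho, and identity are our own.

```python
def compose(f, g):
    """Return f ∘ g, the function x -> f(g(x)), for permutations stored as dicts."""
    return {x: f[g[x]] for x in g}

rho = {1: 2, 2: 3, 3: 4, 4: 1}      # the 90 degree clockwise rotation from the text
identity = {x: x for x in rho}

power = identity
for n in range(1, 5):
    power = compose(power, rho)      # one more quarter turn
    print(n, power)

assert power == identity             # four quarter turns bring every vertex home
```

Changing the order of the arguments to compose is an easy way to explore the question about order in Exercise 2.1.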


2.2 Groups of permutations

Exercise 2.2. For any function ϕ from a set S to itself, we define $\varphi^n$ (for nonnegative integers n) inductively by $\varphi^0 = \iota$ and $\varphi^n = \varphi^{n-1} \circ \varphi$ for every positive integer n. If ϕ is a permutation, is $\varphi^n$ a permutation? Based on your experience with previous inductive proofs, what do you expect $\varphi^n \circ \varphi^m$ to be? What do you expect $(\varphi^m)^n$ to be? There is no need to prove these last two answers are correct.

Exercise 2.3. If we perform the composition ι ◦ ϕ for any function ϕ from S to S, what function do we get? What if we perform the composition ϕ ◦ ι?

What you have observed about iota in Exercise 2.3 is called the identity property of iota. In the context of permutations, people usually call the function ι “the identity” rather than calling it “iota.” Since rotating first by 90 degrees and then by 270 degrees has the same effect as doing nothing, we can think of the 270 degree rotation as undoing what the 90 degree rotation does. For this reason we say that in the rotations of the square, ρ³ is the “inverse” of ρ. In general, a function ϕ : T → S is called an inverse of a function σ : S → T (σ is the lower case Greek letter sigma) if ϕ ◦ σ = σ ◦ ϕ = ι. Since a permutation is a bijection, it has a unique inverse. And since the inverse of a bijection is a bijection, the inverse of a permutation is a permutation. We use ϕ⁻¹ to denote the inverse of the permutation ϕ.

Exercise 2.4. If f : S → T , g : T → X, and h : X → Y , is

h ◦ (g ◦ f) = (h ◦ g) ◦ f?

What does this say about the status of the associative law

ρ ◦ (σ ◦ ϕ) = (ρ ◦ σ) ◦ ϕ

in a permutation group?

We’ve seen that the rotations of the square are functions that return the square to its original location but may move the vertices to different places. In this way we create permutations of the vertices of the square. We’ve observed four important properties of these permutations:

• (Associative Property) The associative law holds for products of permutations.

• (Identity Property) These permutations include the identity permutation.

• (Inverse Property) Whenever these permutations include ϕ, they also include ϕ⁻¹.

• (Closure Property) Whenever these permutations include ϕ and σ, they also include ϕ ◦ σ.

A set of permutations with these four properties is called a permutation group, or a group of permutations. The concept of a permutation group is a special case of the concept of a group that one studies in abstract algebra. When we refer to a group in what follows, if you know what groups are in the more abstract sense, you may replace the term “permutation group” by the word “group”. If you do not know about groups in this more abstract sense, then you may assume we mean permutation group when we say group. We call the group of permutations corresponding to rotations of the square the rotation group of the square. There is a similar rotation group with n elements for any regular n-gon.
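The identity, inverse, and closure properties can be tested mechanically for any small set of permutations. In the sketch below, which is only one way to organize such a test, a permutation of [n] is stored as the tuple of its values, and the names compose, inverse, and check_properties are our own. Associativity is not tested here, since it concerns composition of functions in general rather than any particular set.

```python
from itertools import product

def compose(f, g):
    """Return f ∘ g for permutations of {1, ..., n} stored as tuples (f[i-1] = f(i))."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def inverse(f):
    """Return the permutation that undoes f."""
    inv = [0] * len(f)
    for i, image in enumerate(f, start=1):
        inv[image - 1] = i
    return tuple(inv)

def check_properties(perms):
    """Report which of the identity, inverse, and closure properties a set has."""
    perms = set(perms)
    n = len(next(iter(perms)))
    identity = tuple(range(1, n + 1))
    return {
        "identity": identity in perms,
        "inverse": all(inverse(f) in perms for f in perms),
        "closure": all(compose(f, g) in perms for f, g in product(perms, repeat=2)),
    }

rotations = {(1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3)}
print(check_properties(rotations))
print(check_properties({(2, 3, 4, 1), (4, 1, 2, 3)}))  # rho and its inverse alone
```

The second call shows that a set can have the inverse property without having the other two.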

Exercise 2.5. (a) How should we define $\varphi^{-n}$ for an element ϕ of a permutation group?

(b) Will the two standard rules for exponents

$a^m a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$

still hold in a group if one or more of the exponents may be negative? (No proof required yet.)

(c) Proving that $(\varphi^{-m})^n = \varphi^{-mn}$ when m and n are nonnegative is different from proving that $(\varphi^m)^{-n} = \varphi^{-mn}$ when m and n are nonnegative. Make a list of all such formulas we would need to prove in order to prove that the rules of exponents of part (b) do hold for all nonnegative and negative m and n.

(d) If the rules hold, give enough of the proof to show that you know how to do it; otherwise give a counterexample.

Exercise 2.6. If a finite set of permutations satisfies the closure property, is it a permutation group?

Hint. If $\sigma^i = \sigma^j$ and i ≠ j, what can you conclude about ι?

Exercise 2.7. There are three-dimensional geometric motions of the square that return the square to its original location but move some of the vertices to other positions. For example, if we flip the square around a diagonal, most of it moves out of the plane during the flip, but the square lands in the same location. Draw a figure like Figure 2.1 that shows all the possible results of such motions, including the ones shown in Figure 2.1. Do the corresponding permutations form a group?

Exercise 2.8. Let σ and ϕ be permutations.
(a) Why must σ ◦ ϕ have an inverse?
(b) Is (σ ◦ ϕ)⁻¹ = σ⁻¹ϕ⁻¹? (Prove or give a counterexample.)
(c) Is (σ ◦ ϕ)⁻¹ = ϕ⁻¹σ⁻¹? (Prove or give a counterexample.)

Exercise 2.9. Explain why the set of all permutations of four elements is a permutation group. How many elements does this group have? This group is called the symmetric group on four letters and is denoted by S4.

2.3 The symmetric group

In general, the set of all permutations of an n-element set is a group. It is called the symmetric group on n letters. We don’t have nice geometric descriptions (like rotations) for all its elements, and it would be inconvenient to have to write down something like “Let σ(1) = 3, σ(2) = 1, σ(3) = 4, and σ(4) = 2” each time we need to introduce a new permutation. We introduce a new notation for permutations that allows us to denote them reasonably compactly and compose them reasonably quickly: If σ is the permutation of the set [4] given by σ(1) = 3, σ(2) = 1, σ(3) = 4 and σ(4) = 2, we write

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}.$$

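We call this notation the two-row notation for permutations. In the two-row notation for a permutation σ of {a1, a2, . . . , an}, we write the numbers a1 through an in one row and we write σ(a1) through σ(an) in a row right below, enclosing both rows in parentheses. Notice that

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 4 & 3 \\ 1 & 3 & 2 & 4 \end{pmatrix},$$

although the second ordering of the columns is rarely used. If ϕ is given by

$$\varphi = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix},$$

then, by applying the definition of composition of functions, we may compute σ ◦ ϕ as shown in Figure 2.2.

Figure 2.2: How to multiply permutations in two-row notation. [Arrows of the original figure not reproduced.]

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}$$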

We don’t normally put the circle ◦ between two permutations in two-row notation when we are composing them, and we usually refer to the operation as multiplying the permutations or as the product of the permutations. To see how Figure 2.2 illustrates composition, notice that the arrow starting at 1 in ϕ goes to 4. Then from the 4 in ϕ it goes to the 4 in σ and then to 2. This illustrates that ϕ(1) = 4 and σ(4) = 2, so that σ(ϕ(1)) = 2.
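The arrow-following procedure just described translates directly into a short program. In this sketch a permutation of [n] is stored as its bottom row in two-row notation; the function names multiply and two_row are our own. The example reproduces the product of σ and ϕ computed above.

```python
def multiply(sigma, phi):
    """Product sigma ∘ phi of permutations given by their bottom rows in two-row notation."""
    # For each i, follow i to phi(i) and then follow phi(i) to sigma(phi(i)).
    return tuple(sigma[phi[i] - 1] for i in range(len(phi)))

def two_row(perm):
    """Format a permutation of {1, ..., n} in two-row notation."""
    top = " ".join(str(i) for i in range(1, len(perm) + 1))
    bottom = " ".join(str(v) for v in perm)
    return f"( {top} )\n( {bottom} )"

sigma = (3, 1, 4, 2)
phi = (4, 1, 2, 3)
print(two_row(multiply(sigma, phi)))
# ( 1 2 3 4 )
# ( 2 3 1 4 )
```

Note that the permutation written on the right is applied first, in keeping with the definition of composition of functions.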

Exercise 2.10. For practice, compute

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 3 & 5 & 1 & 2 \end{pmatrix}.$$

2.4 The dihedral group

We found four permutations that correspond to rotations of the square. In Exercise 2.7 you found four additional permutations that correspond to flips of the square in space. One flip fixes the vertices in the places labelled 1 and 3 and interchanges the vertices in the places labelled 2 and 4. Let us denote it by ϕ_{1|3}. One flip fixes the vertices in the positions labelled 2 and 4 and interchanges those in the positions labelled 1 and 3. Let us denote it by ϕ_{2|4}. One flip interchanges the vertices in the places labelled 1 and 2 and also interchanges those in the places labelled 3 and 4. Let us denote it by ϕ_{12|34}. The fourth flip interchanges the vertices in the places labelled 1 and 4 and interchanges those in the places labelled 2 and 3. Let us denote it by ϕ_{14|23}. Notice that ϕ_{1|3} is a permutation that takes the vertex in place 1 to the vertex in place 1 and the vertex in place 3 to the vertex in place 3, while ϕ_{12|34} is a permutation that takes the edge between places 1 and 2 to the edge between places 2 and 1 (which is the same edge) and takes the edge between places 3 and 4 to the edge between places 4 and 3 (which is the same edge). This should help to explain the similarity in the notation for the two different kinds of flips.

Exercise 2.11. Write down the two-row notation for ρ³, ϕ_{2|4}, ϕ_{12|34} and ϕ_{2|4} ◦ ϕ_{12|34}.

Exercise 2.12. (You may have already done this in Exercise 2.7, in which case you need not do it again!) In Exercise 2.7, if a rigid motion in three-dimensional space returns the square to its original location, in how many places can vertex number one land? Once the location of vertex number one is decided, how many possible locations are there for vertex two? Once the locations of vertex one and vertex two are decided, how many locations are there for vertex three? Answer the same question for vertex four. What does this say about the relationship between the four rotations and four flips described just before Exercise 2.11 and the permutations you described in Exercise 2.7?

The four rotations and the four flips of the square described before Exercise 2.11 form a group called the dihedral group of the square. Sometimes the group is denoted D8 because it has eight elements, and sometimes the group is denoted by D4 because it deals with four vertices. Let us agree to use the notation D4 for the dihedral group of the square. There is a similar dihedral group, denoted by Dn, of all the rigid motions of three-dimensional space that return a regular n-gon to its original location (but might put the vertices in different places).
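One way to see that the rotations and flips fit together into a single group of eight elements is to start with one rotation and one flip and keep composing until nothing new appears. The sketch below does this for the square; the names compose, generate, rho, and flip_1_3 are our own, and permutations are stored as tuples of values as an assumed encoding.

```python
def compose(f, g):
    """Return f ∘ g for permutations of {1, 2, 3, 4} stored as tuples."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def generate(generators):
    """Close a set of permutations under composition."""
    group = set(generators)
    frontier = list(group)
    while frontier:
        f = frontier.pop()
        for g in list(group):
            for h in (compose(f, g), compose(g, f)):
                if h not in group:
                    group.add(h)
                    frontier.append(h)
    return group

rho = (2, 3, 4, 1)         # the 90 degree rotation
flip_1_3 = (1, 4, 3, 2)    # the flip that fixes places 1 and 3
d4 = generate([rho, flip_1_3])
print(len(d4))             # 8
```

Printing the elements of d4 in two-row notation is a convenient way to check your answers to Exercise 2.11 once you have worked them out by hand.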

Exercise 2.13. Another view of the dihedral group of the square is that it is the group of all distance-preserving functions, also called isometries, from a square to itself. Notice that an isometry must be a bijection. (Why?) Any rigid motion of the square preserves the distances between all pairs of points in the square. However, it is conceivable that there might be some isometries that do not arise from rigid motions. (We will see some later on in the case of a cube.) Show that there are exactly eight isometries (distance-preserving functions) from a square to itself.

Hint. Once you know where the corners of the square go under the action of an isometry, how much do you know about the isometry?

Exercise 2.14. How many elements does the group Dn have? Prove that you are correct.

Exercise 2.15. In Figure 2.3 we show a cube with the positions of its vertices and faces labelled. As with motions of the square, we let ϕ(x) be the label of the place where the vertex previously in position x is now.


Figure 2.3: A cube with the positions of its vertices and faces labelled. The curved arrows point to the faces that are blocked by the cube. [Figure not reproduced: the vertex positions are labelled 1 through 8 and the faces are labelled t, u, f, b, l, and r.]

(a) Write in two-row notation the permutation ρ of the vertices that corresponds to rotating the cube 90 degrees around the vertical axis through the faces t (for top) and u (for underneath). (Rotate in a right-handed fashion around this axis, meaning that vertex 6 goes to the back and vertex 8 comes to the front.)

(b) Write in two-row notation the permutation ϕ that rotates the cube 120 degrees around the diagonal from vertex 1 to vertex 7 and carries vertex 8 to vertex 6.

(c) Compute the two-row notation for ρ ◦ ϕ.

(d) Is the permutation ρ ◦ ϕ a rotation of the cube around some axis? If so, say what the axis is and how many degrees we rotate around the axis. If ρ ◦ ϕ is not a rotation, give a geometric description of it.

Exercise 2.16. Let R be the group of permutations of the vertices of a cube that arise from rigid motions of the cube. How many permutations are in R? R is sometimes called the “rotation group” of the cube. Can you justify this?

Exercise 2.17. As with a two-dimensional figure, it is possible to talk about isometries of a three-dimensional figure. These are distance-preserving functions from the figure to itself. The function that reflects the cube in Figure 2.3 through a plane halfway between the bottom face and top face exchanges the vertices 1 and 5, 2 and 6, 3 and 7, and 4 and 8 of the cube. This function preserves distances between points in the cube. However, it cannot be achieved by a rigid motion of the cube because a rigid motion that takes vertex 1 to vertex 5, vertex 2 to vertex 6, vertex 3 to vertex 7, and vertex 4 to vertex 8 would not return the cube to its original location; rather it would put the bottom of the cube where its top previously was and would put the rest of the cube above that square rather than below it.

(a) How many elements are there in the group of permutations of [8] that correspond to isometries of the cube?

(b) Is every permutation of [8] that corresponds to an isometry either a rotation or a reflection?

Hint. Why is it sufficient to focus on permutations that take vertex 1 to itself?

We have seen that the dihedral group D4 contains a copy of the group of rotations of the square. When one group G of permutations of a set S is a subset of another group G′ of permutations of S, we say that G is a subgroup of G′. The reason we introduce this new word subgroup is to emphasize that the composition operation gives the same result whether it is performed in the larger group or the smaller group.

Exercise 2.18. Find all subgroups of the group D4 and explain why your list is complete.

Exercise 2.19. Can you find any subgroups of the symmetric group S4 with two elements? Three elements? Four elements? Six elements? (For each positive answer, describe a subgroup. For each negative answer, explain why not.)

2.5 The cycle decomposition of a permutation

The digraph of a permutation gives us a nice way to think about it. Notice that the product in Figure 2.2 is $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}$. We have drawn the directed graph of this permutation in Figure 2.4. You see that the graph has two directed cycles, the rather trivial one with vertex 4 pointing to itself, and the nontrivial one with vertex 1 pointing to vertex 2 pointing to vertex 3 which points back to vertex 1.


Figure 2.4: The directed graph of a permutation.


A permutation is called a cycle on [n] if its digraph on the set [n] consists of exactly one cycle. Thus $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ is a cycle but $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}$ is not a cycle by our definition. We write (1 2 3) or (2 3 1) or (3 1 2) to stand for the cycle $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. We can describe cycles in another way as well. A cycle of the permutation σ is a list $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i))$ that does not have repeated elements while the list $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i)\ \sigma^{n+1}(i))$ does have repeated elements.

Exercise 2.20. If the list $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i))$ does not have repeated elements but the list $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i)\ \sigma^{n+1}(i))$ does have repeated elements, then what is $\sigma^{n+1}(i)$?

We say $\sigma^j(i)$ is an element of the cycle $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i))$. Notice that the case j = 0 means i is an element of the cycle. Notice also that if j > n, $\sigma^j(i) = \sigma^{j-n-1}(i)$, so the distinct elements of the cycle are i, σ(i), $\sigma^2(i)$, through $\sigma^n(i)$. We think of the cycle $(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i))$ as representing the permutation σ restricted to the set of elements of the cycle. We say that the cycles

$(i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^n(i))$

and

$(\sigma^j(i)\ \sigma^{j+1}(i)\ \ldots\ \sigma^n(i)\ i\ \sigma(i)\ \sigma^2(i)\ \ldots\ \sigma^{j-1}(i))$

are equivalent. Equivalent cycles represent the same permutation on the set of elements of the cycle. For this reason, we consider equivalent cycles to be equal in the same way we consider 1/2 and 2/4 to be equal. In particular, this means that

$(i_1\ i_2\ \ldots\ i_n) = (i_j\ i_{j+1}\ \ldots\ i_n\ i_1\ i_2\ \ldots\ i_{j-1})$.

We will see that every permutation on [n] can be written as a product of cycles on the blocks in some partition of [n].
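The description of a cycle as the list i, σ(i), σ²(i), and so on, stopping just before the first repeat, translates directly into a short procedure. The following Python sketch is one way to carry it out; the dictionary encoding and the function name `cycles` are assumptions for illustration. It is applied to the product from Figure 2.2.

```python
def cycles(perm):
    """Return the cycles of a permutation given as a dict x -> perm[x].

    Each cycle is found by following i, perm[i], perm[perm[i]], ...
    until the starting element comes around again.
    """
    seen = set()
    result = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


sigma = {1: 2, 2: 3, 3: 1, 4: 4}   # the product shown in Figure 2.2
print(cycles(sigma))  # [(1, 2, 3), (4,)]
```

Starting each cycle at its smallest element is only a convention of this sketch; any of the equivalent lists described above would represent the same cycle.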

Exercise 2.21. Find the cycles of the permutations ρ, $\phi_{1|3}$ and $\phi_{12|34}$ in the group D4.

Exercise 2.22. Find the cycles of the permutation $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 4 & 6 & 2 & 9 & 7 & 1 & 5 & 8 \end{pmatrix}$.

Exercise 2.23. If two cycles of σ have an element in common, what can we say about them?

Exercise 2.23 leads almost immediately to the following theorem.

Theorem 6. For each permutation σ of a set S, there is a unique partition of S each of whose blocks is the set of elements of a cycle of σ.

More informally, we may say that every permutation partitions its domain into disjoint cycles. We call the set of cycles of a permutation the cycle decomposition of the permutation. In Exercises 2.21 and 2.22 you found the cycle decompositions of typical elements of the group D4 and of the permutation $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 4 & 6 & 2 & 9 & 7 & 1 & 5 & 8 \end{pmatrix}$.

Since the cycles of a permutation σ tell us σ(x) for every x in the domain of σ, the cycle decomposition of a permutation completely determines the permutation. Using our informal language, we can express this idea in the following corollary to Theorem 6.

Corollary 1. Every partition of a set S into cycles determines a unique permutation of S.

Exercise 2.24. Prove Theorem 6.

The group of all rotations of the square is simply the set of the four powers of the cycle ρ = (1 2 3 4). For this reason, it is called a cyclic group and often denoted by C4. Similarly, the rotation group of an n-gon is usually denoted by Cn.

Exercise 2.25. Write a recurrence for the number c(k, n) of permutations of [k] that have exactly n cycles, including 1-cycles. Use it to write a table of c(k, n) for k between 1 and 7 inclusive.

Exercise 2.26. A permutation σ is called an involution if $\sigma^2 = \iota$. When you write down the cycle decomposition of an involution, what is special about its cycles?

2.6 Additional Exercises for Supplementary Chapter 2

1. Show that a function from S to T has an inverse (defined on T) if and only if it is a bijection.

2. How many elements are in the dihedral group D3? The symmetric group S3? What can you conclude about D3 and S3?

3. A tetrahedron is a three-dimensional geometric figure with four vertices, six edges, and four triangular faces. Suppose we start with a tetrahedron in space and consider the set of all permutations of the vertices of the tetrahedron that correspond to moving the tetrahedron in space and returning it to its original location, perhaps with the vertices in different places.

(a) Explain why these permutations form a group.

(b) What is the size of this group?

(c) Write down in two-row notation a permutation that is not in this group.

4. Find a three-element subgroup of the group S3. Can you find a different three-element subgroup of S3?

5. Prove true or demonstrate false with a counterexample: “In a permutation group, $(\sigma\phi)^n = \sigma^n\phi^n$.”

6. If a group G acts on a set S, and if σ(x) = y, is there anything interesting we can say about the subgroups Fix(x) and Fix(y)?

Chapter 3

Group Actions

Mathematical Prerequisites: The material in Supplementary Chapter 2. The information on picture functions and picture enumerators from Chapter 6 is needed in Sections 3.4ff.

We defined the rotation group C4 and the dihedral group D4 as two groups of permutations on the vertices of a square. These permutations represent rigid motions of the square in the plane and in three-dimensional space respectively. The square has geometric features of interest other than its vertices; for example, its diagonals, or its edges. Any geometric motion of the square that returns it to its original location takes each diagonal to a possibly different diagonal, and takes each edge to a possibly different edge. In Figure 3.1 (AGAIN THE FIGURE STILL NEEDS FIXING) we show the results of the rotations of a square on its sides and diagonals.

Figure 3.1: The results on the sides and diagonals of rotating the square.

[Figure: copies of the square with vertices labelled 1, 2, 3, 4, each showing where the sides s1, s2, s3, s4 and the diagonals d13, d24 lie after a rotation, starting from the identity.]

The rotation group permutes the sides of the square and permutes the diagonals of the square as it rotates the square. Thus we say the rotation group “acts” on the sides and diagonals of the square.

Exercise 3.1. (a) Write down the two-line notation for the permutation $\overline{\rho}$ that a 90 degree rotation does to the sides of the square.

(b) Write down the two-line notation for the permutation $\overline{\rho^2}$ that a 180 degree rotation does to the sides of the square.

(c) Is $\overline{\rho^2} = \overline{\rho} \circ \overline{\rho}$? Why or why not?

(d) Write down the two-line notation for the permutation $\widehat{\rho}$ that a 90 degree rotation does to the diagonals d13 and d24 of the square.

(e) Write down the two-line notation for the permutation $\widehat{\rho^2}$ that a 180 degree rotation does to the diagonals of the square.

(f) Is $\widehat{\rho^2} = \widehat{\rho} \circ \widehat{\rho}$? Why or why not? What familiar permutation is $\widehat{\rho^2}$ in this case?

We have just seen that the fact that we have defined a permutation group as the permutations of some specific set doesn’t preclude us from thinking of the elements of that group as permuting the elements of some other set as well. We are going to say that the group D4 “acts” on the edges and diagonals of a square, and the group R of permutations of the vertices of a cube that arise from rigid motions of the cube “acts” on the edges, faces, diagonals, etc. of the cube.

Exercise 3.2. In Figure 2.3 we showed a cube with the positions of its vertices and faces labelled. As with the motions of the square, we let ϕ(x) be the label of the place where the vertex previously in position x is now.

(a) In Exercise 2.15 we wrote in two-row notation the permutation ρ of the vertices that corresponds to rotating the cube 90 degrees around the vertical axis through the faces t (for top) and u (for underneath). (We rotated in a right-handed fashion around this axis, meaning that vertex 6 goes to the back and vertex 8 comes to the front.) Write in two-row notation the permutation $\overline{\rho}$ of the faces that corresponds to this member ρ of R.

(b) In Exercise 2.15 we wrote in two-row notation the permutation ϕ that rotates the cube 120 degrees around the diagonal from vertex 1 to vertex 7 and carries vertex 8 to vertex 6. Write in two-row notation the permutation $\overline{\phi}$ of the faces that corresponds to this member of R.

(c) In Exercise 2.15 we computed the two-row notation for ρ ◦ ϕ. Now compute the two-row notation for $\overline{\rho} \circ \overline{\phi}$ ($\overline{\rho}$ was defined in part (a)), and write in two-row notation the permutation $\overline{\rho \circ \phi}$ of the faces that corresponds to the action of the permutation ρ ◦ ϕ on the faces of the cube. For this question it helps to think geometrically about what motion of the cube is carried out by ρ ◦ ϕ. What do you observe about $\overline{\rho \circ \phi}$ and $\overline{\rho} \circ \overline{\phi}$?

We say that a permutation group G acts on a set S if, for each member σ of G there is a permutation $\overline{\sigma}$ of S such that

$\overline{\sigma \circ \phi} = \overline{\sigma} \circ \overline{\phi}$

for all members σ and ϕ of G. In Exercise 3.2(c) you saw one example of this condition. If we think intuitively of ρ and ϕ as motions in space, then following the action of ϕ by the action of ρ does give us the action of ρ ◦ ϕ. We can also reason directly with the permutations in the group R of rigid motions (rotations) of the cube to show that R acts on the faces of the cube.
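The defining condition of a group action can be checked by machine on a small example, which is a useful way to test one's understanding before writing a proof. The Python sketch below takes the rotation group of the square and lets each rotation permute the two-element sets of vertex labels. The helper names `compose`, `induced`, and `action_condition_holds` are hypothetical, and a finite check like this one is only evidence, not a proof.

```python
from itertools import combinations


def compose(s, p):
    """Return s ∘ p for permutations stored as dicts: p is applied first."""
    return {x: s[p[x]] for x in p}


def induced(sigma, points):
    """Hypothetical helper: the permutation of two-element subsets induced by sigma."""
    pairs = [frozenset(c) for c in combinations(sorted(points), 2)]
    return {pair: frozenset(sigma[x] for x in pair) for pair in pairs}


def action_condition_holds(group, points):
    """Check that the induced map of s ∘ p equals induced(s) ∘ induced(p)."""
    for s in group:
        for p in group:
            lhs = induced(compose(s, p), points)
            rhs = compose(induced(s, points), induced(p, points))
            if lhs != rhs:
                return False
    return True


rho = {1: 2, 2: 3, 3: 4, 4: 1}
rho2 = compose(rho, rho)
rotations = [compose(rho2, rho2), rho, rho2, compose(rho2, rho)]  # identity, ρ, ρ², ρ³
print(action_condition_holds(rotations, {1, 2, 3, 4}))  # True
```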

Exercise 3.3. Show that a group G of permutations of a set S acts on S with $\overline{\phi} = \phi$ for all ϕ in G.

Exercise 3.4. The group D4 is a group of permutations of the set [4] as in Exercise 2.7. We are going to show in this problem how this group acts on the two-element subsets of [4]. In Exercise 3.9 we will see a natural geometric interpretation of this action. In particular, for each two-element subset {i, j} of [4] and each member σ of D4 we define $\overline{\sigma}(\{i, j\}) = \{\sigma(i), \sigma(j)\}$. Show that with this definition of $\overline{\sigma}$, the group D4 acts on the two-element subsets of [4].

Exercise 3.5. Suppose that σ and ϕ are permutations in the group R of rigid motions of the cube. We have argued already that each rigid motion sends a face to a face. Thus σ and ϕ both send the vertices on one face to the vertices on another face. Let {h, i, j, k} be the set of labels next to the vertices on a face F.

(a) What are the labels next to the vertices of the face F′ that F is sent to by ϕ? (The function ϕ may appear in your answer.)

(b) What are the labels next to the vertices of the face F′′ that F′ is sent to by σ?

(c) What are the labels next to the vertices of the face F′′′ that F is sent to by σ ◦ ϕ?

(d) How have you just shown that the group R acts on the faces?


3.1 Groups acting on colorings of sets

Recall that when you were asked in Problem 57 (on page 30) to find the number of ways to place two red beads and two blue beads at the corners of a square which is free to move in three-dimensional space, you were not able to use the Quotient Principle to answer the question. Instead you had to see that you could divide the set of six lists of two Rs and two Bs into two sets, one of size two in which the Rs and Bs alternated and one of size four in which the two reds (and therefore the two blues) would be side-by-side on the square. Saying that the square is free to move in space is equivalent to saying that two arrangements of beads on the square are equivalent if a member of the dihedral group carries one arrangement to the other. Thus an important ingredient in the analysis of such problems will be how a group can act on colorings of a set of vertices. We can describe the coloring of the square in Figure 3.2 as the function f with

f(1) = R, f(2) = R, f(3) = B, and f(4) = B,

but it is more compact and turns out to be more suggestive to represent the coloring in Figure 3.2 as the set of ordered pairs

{(1, R), (2, R), (3, B), (4, B)}. (3.1)

Figure 3.2: The colored square with coloring {(1, R), (2, R), (3, B), (4, B)}


This gives us an explicit list of which colors are assigned to which vertex. Then if we rotate the square through 90 degrees, we see that the set of ordered pairs becomes

{(ρ(1), R), (ρ(2), R), (ρ(3), B), (ρ(4), B)} (3.2)

which is the same as

{(2, R), (3, R), (4, B), (1, B)}

or, in a more natural order,

{(1, B), (2, R), (3, R), (4, B)}. (3.3)

The reordering we did in (3.3) suggests yet another simplification of notation. As long as we know that the first elements of our pairs are labelled by the members of [n] for some integer n and we are listing our pairs in increasing order by the first component, we can denote the coloring

{(1, B), (2, R), (3, R), (4, B)}

by BRRB. In the case where we have numbered the elements of the set S we are coloring, we will call this list of colors of the elements of S in order the standard notation for the coloring. We will call the ordering used in (3.3) the standard ordering of the coloring. Thus we have two natural ways to represent a coloring of a set as a function: as a set of ordered pairs and as a list. Different representations are useful for different things. For example, the representation by ordered pairs will provide a natural way to define the action of a group on colorings of a set. Given a coloring as a function f, we denote the set of ordered pairs

{(x, f(x))|x ∈ S},

suggestively as (S, f) for short. We use f(1)f(2) · · · f(n) to stand for a particular coloring (S, f) in the standard (list) notation.
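The two representations of a coloring, and the passage from (3.1) through (3.2) to (3.3), can be traced in a few lines of Python. This is only an illustrative sketch: the function names `to_standard` and `to_pairs` are made up here, and ρ is taken to be the rotation sending 1 to 2, 2 to 3, 3 to 4, and 4 to 1, as the computation leading to (3.3) indicates.

```python
def to_standard(pairs):
    """Turn a coloring given as a set of (element, color) pairs into list notation."""
    return "".join(color for _, color in sorted(pairs))


def to_pairs(standard):
    """Turn list notation such as 'RRBB' back into a set of ordered pairs."""
    return {(i, color) for i, color in enumerate(standard, start=1)}


coloring = {(1, "R"), (2, "R"), (3, "B"), (4, "B")}     # coloring (3.1)
rho = {1: 2, 2: 3, 3: 4, 4: 1}                          # rotation through 90 degrees
rotated = {(rho[x], color) for x, color in coloring}    # the set in (3.2)

print(to_standard(coloring))  # RRBB
print(to_standard(rotated))   # BRRB
```

Sorting the pairs by their first component is exactly the reordering step that turns (3.2) into the standard ordering (3.3).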

Exercise 3.6. Suppose now that instead of coloring the vertices of a square we color its edges. We will use the shorthand 12, 23, 34, and 41 to stand for the edges of the square between vertex 1 and vertex 2, vertex 2 and vertex 3, and so on. Then a coloring of the edges with 12 red, 23 blue, 34 red and 41 blue can be represented as

{(12, R), (23, B), (34, R), (41, B)}. (3.4)

If ρ is the rotation through 90 degrees, then we have a permutation $\overline{\rho}$ acting on the edges. This permutation acts on the colorings to give us a permutation $\overline{\overline{\rho}}$ of the set of colorings.

(a) What is $\overline{\overline{\rho}}$ of the coloring in (3.4)?

(b) What is $\overline{\overline{\rho^2}}$ of the coloring in (3.4)?

If G is a group that acts on the set S, we define the action of G on the colorings (S, f) by

$\overline{\overline{\sigma}}((S, f)) = \overline{\overline{\sigma}}(\{(x, f(x)) \mid x \in S\}) = \{(\overline{\sigma}(x), f(x)) \mid x \in S\}$. (3.5)

We have the two bars over σ, because σ is a permutation of one set (vertices) that gives us a permutation $\overline{\sigma}$ of a second set (edges), and then $\overline{\sigma}$ acts to give a permutation $\overline{\overline{\sigma}}$ of a third set (the set of colorings). For example, suppose we want to analyze colorings of the faces of a cube under the action of the rotation group of the cube as we have defined it on the vertices. Each vertex-permutation σ in the group gives a permutation $\overline{\sigma}$ of the faces of the cube. Then each permutation $\overline{\sigma}$ of the faces gives us a permutation $\overline{\overline{\sigma}}$ of the colorings of the faces. In the special case that G is a group of permutations of S rather than a group acting on S, (3.5) becomes

$\overline{\sigma}((S, f)) = \overline{\sigma}(\{(x, f(x)) \mid x \in S\}) = \{(\sigma(x), f(x)) \mid x \in S\}$.

In the case where G is the rotation group of the square acting on the vertices of the square, the example of acting on a coloring by ρ that we saw in (3.3) is an example of this kind of action. In the standard notation, when we act on a coloring by σ, the color in position i moves to position σ(i).

Exercise 3.7. Why does the action we have defined on colorings in (3.5) take a coloring to a coloring?

Exercise 3.8. Show that if G is a group of permutations of a set S, and f is a coloring function on S, then the equation

$\overline{\sigma}(\{(x, f(x)) \mid x \in S\}) = \{(\sigma(x), f(x)) \mid x \in S\}$

defines an action of G on the colorings (S, f) of S.

Hint. Before you try to show that $\overline{\sigma}$ actually is a permutation of the colorings, it would be useful to verify the second part of the definition of a group action, namely that $\overline{\sigma \circ \phi} = \overline{\sigma} \circ \overline{\phi}$.

3.2 Orbits

Exercise 3.9. For this problem refer back to Exercise 3.4.

(a) What is the set of two-element subsets that you get by computing $\overline{\sigma}(\{1, 2\})$ for all σ in D4?

(b) What is the multiset of two-element subsets that you get by computing $\overline{\sigma}(\{1, 2\})$ for all σ in D4?

(c) What is the set of two-element subsets you get by computing $\overline{\sigma}(\{1, 3\})$ for all σ in D4?

(d) What is the multiset of two-element subsets that you get by computing $\overline{\sigma}(\{1, 3\})$ for all σ in D4?

(e) Describe the two sets in parts (a) and (c) geometrically in terms of the square.

Exercise 3.10. This problem uses the notation for permutations in the dihedral group of the square introduced before Exercise 2.11. What is the effect of a 180 degree rotation $\rho^2$ on the diagonals of a square? What is the effect of the flip $\phi_{1|3}$ on the diagonals of a square? How many elements of D4 send each diagonal to itself; that is, fix each diagonal? How many elements of D4 interchange the diagonals of a square?

In Exercise 3.9 you saw that the action of the dihedral group D4 on two-element subsets of [4] seems to split them into two blocks, one with two elements and one with four elements. We call these two blocks the “orbits” of D4 acting on the two-element subsets of [4]. More generally, given a group G acting on a set S, the orbit of G determined by an element x of S is the set

{σ(x)|σ ∈ G},

and is denoted by Gx. In Exercise 3.9 it was possible to have Gx = Gy for x ≠ y. In fact in that problem, Gx = Gy for every y in Gx.
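The definition of an orbit as the set of all images σ(x) can be computed directly once the group is listed. The Python sketch below lists the eight elements of D4 as tuples (entry i-1 is the image of i, with the vertices numbered consecutively around the square, which is an assumption about the labelling) and computes the two orbits on two-element subsets mentioned above. The names `D4`, `act`, and `orbit` are illustrative only.

```python
# The eight elements of D4 as tuples: entry i-1 is the image of vertex i.
D4 = [
    (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),   # rotations
    (1, 4, 3, 2), (3, 2, 1, 4), (2, 1, 4, 3), (4, 3, 2, 1),   # flips
]


def act(sigma, subset):
    """Apply sigma to every member of a set of vertex labels."""
    return frozenset(sigma[x - 1] for x in subset)


def orbit(group, x):
    """The set of all images of x under the elements of the group."""
    return {act(sigma, x) for sigma in group}


side = frozenset({1, 2})
diagonal = frozenset({1, 3})
print(sorted(sorted(s) for s in orbit(D4, side)))      # [[1, 2], [1, 4], [2, 3], [3, 4]]
print(sorted(sorted(s) for s in orbit(D4, diagonal)))  # [[1, 3], [2, 4]]
```

Because a Python set discards repeats, this computation records only which subsets are reached, not how many group elements reach each one.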

Exercise 3.11. Suppose a group acts on a set S. Could an element of S be in two different orbits? (Say why or why not.)

Hint. If z ∈ Gx and z ∈ Gy, how can you use elements of G to explain the relationship between x and y?

Another Hint. Suppose σ is a fixed member of G. As τ ranges over G, which elements of G occur as τσ?

Exercise 3.11 almost completes the proof of the following theorem.

Theorem 7. Suppose a group G acts on a set S. The orbits of G form a partition of S.

It is probably worth pointing out that this theorem tells us that the orbit Gx is also the orbit Gy for y ∈ Gx.

Exercise 3.12. Complete the proof of Theorem 7.

Notice that thinking in terms of orbits actually hides some information about the action of our group. When we computed the multiset of all results of D4 acting on {1, 2}, we got an eight-element multiset containing each side twice. When we computed the multiset of all results of acting on {1, 3} with the elements of D4, we got an eight-element multiset containing each diagonal of the square four times. These multisets remind us that we are acting on our two-element sets with an eight-element group. The multiorbit of G determined by an element x of S is the multiset

{σ(x)|σ ∈ G},

and is denoted by $Gx_{\mathrm{multi}}$.

When we used the Quotient Principle to count circular seating arrangements or necklaces, we partitioned a set of lists of people or beads into blocks of equivalent lists. In the case of seating n people around a round table, what made two lists equivalent was, in retrospect, the action of the rotation group Cn. In the case of stringing n beads on a string to make a necklace, what made two lists equivalent was the action of the dihedral group Dn. Thus the blocks of our partitions were orbits of the rotation group or the dihedral group, and we were counting the number of orbits of the group action. In Problem 57 (on page 30), we were not able to apply the Quotient Principle because we had blocks of different sizes. However, these blocks were still orbits of the action of the group D4. And, even though the orbits have different sizes, we expect that each orbit corresponds naturally to a multiorbit and that the multiorbits all have the same size. Thus if we had a version of the Quotient Principle for a union of multisets, we could hope to use it to count the number of multiorbits.
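A multiorbit keeps the repeats that an orbit throws away, so a natural way to experiment with it is Python's `collections.Counter`, which stores each element together with its multiplicity. The sketch below reproduces the two multisets described above for D4 acting on {1, 2} and on {1, 3}. The tuple listing of D4 and the names `act` and `multiorbit` are assumptions made for this illustration.

```python
from collections import Counter

# The eight elements of D4 as tuples: entry i-1 is the image of vertex i.
D4 = [
    (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),   # rotations
    (1, 4, 3, 2), (3, 2, 1, 4), (2, 1, 4, 3), (4, 3, 2, 1),   # flips
]


def act(sigma, subset):
    """Apply sigma to every member of a set of vertex labels."""
    return frozenset(sigma[x - 1] for x in subset)


def multiorbit(group, x):
    """Count how many group elements send x to each possible image."""
    return Counter(act(sigma, x) for sigma in group)


sides = multiorbit(D4, frozenset({1, 2}))
diagonals = multiorbit(D4, frozenset({1, 3}))
print(sorted(sides.values()))      # [2, 2, 2, 2]
print(sorted(diagonals.values()))  # [4, 4]
```

In both cases the multiplicities add up to eight, the number of elements of D4.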

Exercise 3.13. (a) Find the orbit and multiorbit of D4 acting on the coloring

{(1, R), (2, R), (3, B), (4, B)},

or in standard notation RRBB, of the vertices of a square.

(b) How many group elements map the coloring RRBB to itself? What is the multiplicity of RRBB in its multiorbit?

(c) Find the orbit and multiorbit of D4 acting on the coloring

{(1, R), (2, B), (3, R), (4, B)}.

(d) How many elements of the group send the coloring RBRB to itself? What is the multiplicity of RBRB in its multiorbit?

Exercise 3.14. (a) If G is a group, how is the set {τσ|τ ∈ G} related to G?

(b) Use this to show that y is in the multiorbit $Gx_{\mathrm{multi}}$ if and only if $Gx_{\mathrm{multi}} = Gy_{\mathrm{multi}}$.

Exercise 3.14(b) tells us that, when G acts on S, each element x of S is in one and only one multiorbit. Since each orbit is a subset of a multiorbit and each element x of S is in one and only one orbit, this also tells us there is a bijection between the orbits of G and the multiorbits of G, so that we have the same number of orbits as multiorbits. When a group acts on a set, a group element σ is said to fix an element x ∈ S if σ(x) = x. The set of all group elements fixing an element x is denoted by Fix(x).

Exercise 3.15. Suppose a group G acts on a set S. What is special about the subset Fix(x) for an element x of S?

Exercise 3.16. Suppose a group G acts on a set S. What is the relationship of the multiplicity of x ∈ S in its multiorbit and the size of Fix(x)?

Exercise 3.17. What can you say about relationships between the multiplicity of an element y in the multiorbit $Gx_{\mathrm{multi}}$ and the multiplicities of other elements? Try to use this to get a relationship between the size of an orbit of G and the size of G.

Hint. How does the size of a multiorbit compare to the size of G?

We suggested earlier that a Quotient Principle for Multisets might prove useful. The Quotient Principle came from the Sum Principle, and we do not have a Sum Principle for Multisets. Such a principle would say that the size of a union of disjoint multisets is the sum of their sizes. We have not yet defined the union of multisets or disjoint multisets, because we haven’t needed the ideas until now. We define the union of two multisets S and T to be the multiset in which the multiplicity of an element x is the maximum¹ of the multiplicity of x in S and its multiplicity in T. Similarly, the union of a family of multisets is defined by defining the multiplicity of an element x to be the maximum of its multiplicities in the members of the family. (Remember that all our sets are finite.) Two multisets are said to be disjoint if no element is a member of both; that is, if no element has multiplicity 1 or more in both multisets. Since the size of a multiset is the sum of the multiplicities of its members, we immediately get the Sum Principle for Multisets, and from that the Product and Quotient Principles.

¹ We choose the maximum rather than the sum so that the union of sets is a special case of the union of multisets.


The Sum Principle for Multisets

The size of a union of disjoint multisets is the sum of their sizes.

The Product Principle for Multisets

The union of a set of m disjoint multisets, each of size n, has size mn.

The Quotient Principle for Multisets

If a p-element multiset is a union of q disjoint multisets, each of the same size r, then q = p/r.
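The union of multisets defined above, with the maximum of the multiplicities, happens to be what the `|` operator does for Python's `collections.Counter`, so the Sum Principle for Multisets is easy to try out. In this sketch the multisets `S`, `T`, `U`, `V` and the helper `size` are made-up examples, not objects from these notes.

```python
from collections import Counter


def size(multiset):
    """The size of a multiset is the sum of the multiplicities of its members."""
    return sum(multiset.values())


# Two multisets that share the element "a", so they are not disjoint.
S = Counter({"a": 2, "b": 1})
T = Counter({"a": 1, "c": 3})
union = S | T          # Counter's | keeps the maximum multiplicity of each element
print(dict(union))     # {'a': 2, 'b': 1, 'c': 3}
print(size(union), size(S) + size(T))   # 6 7

# Two disjoint multisets: here the sizes do add.
U = Counter({"x": 2})
V = Counter({"y": 3})
print(size(U | V), size(U) + size(V))   # 5 5
```

The first pair shows why disjointness is needed in the Sum Principle: the shared element is counted only up to its larger multiplicity.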

Exercise 3.18. How does the size of the union of the set of multiorbits of a group G acting on a set S relate to the number of multiorbits and the size of G?

Exercise 3.19. How does the size of the union of the set of multiorbits of a group G acting on a set S relate to the numbers |Fix(x)|?

Exercise 3.20. In Exercises 3.18 and 3.19 you computed the size of the union of the set of multiorbits of a group G acting on a set S in two different ways, getting two different expressions which must be equal. Write the equation that says they are equal and solve for the number of multiorbits, and therefore the number of orbits.

3.3 The Cauchy-Frobenius-Burnside Theorem

Exercise 3.21. In Exercise 3.20 you stated and proved a theorem that expresses the number of orbits in terms of the number of group elements fixing each element of S. It is often easier to find the number of elements fixed by a given group element than to find the number of group elements fixing an element of S.

(a) For this purpose, how does the sum $\sum_{x \in S} |\mathrm{Fix}(x)|$ relate to the number of ordered pairs (σ, x) (with σ ∈ G and x ∈ S) such that σ fixes x?

(b) Let χ(σ) denote the number of elements of S fixed by σ. How can the number of ordered pairs (σ, x) (with σ ∈ G and x ∈ S) such that σ fixes x be computed from χ(σ)? (It is okay to have a summation in your answer.)

(c) What does this tell you about the number of orbits?

Exercise 3.22. A second computation of the result of Exercise 3.21 can be done as follows.

(a) Let $\widehat{\chi}(\sigma, x) = 1$ if σ(x) = x and let $\widehat{\chi}(\sigma, x) = 0$ otherwise. Notice that $\widehat{\chi}$ is different from the χ in the previous problem, because it is a function of two variables. Use $\widehat{\chi}$ to convert the single summation in your answer to Exercise 3.20 into a double summation over elements x of S and elements σ of G.

(b) Reverse the order of the previous summation in order to convert it into a single sum involving the function χ given by

χ(σ) = the number of elements of S left fixed by σ.

In Exercise 3.21 you gave a formula for the number of orbits of a group G acting on a set S. This formula was first worked out by Cauchy in the case of the symmetric group, and then for more general groups by Frobenius. In his pioneering book on Group Theory, Burnside used this result as a lemma, and while he attributed the result to Cauchy and Frobenius in the first edition of his book, in later editions he did not. Later on, other mathematicians who used his book named the result “Burnside’s Lemma,” which is the name by which it is still most commonly known. Let us agree to call this result the Cauchy-Frobenius-Burnside Theorem, or CFB Theorem for short, in a compromise between historical accuracy and common usage.
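Once you have your own formula from Exercise 3.21, it is worth testing it numerically on a case whose answer is already known. The Python sketch below does this for Problem 57: it counts the orbits of D4 on the six colorings of the square with two Rs and two Bs by brute force, and separately computes the average of χ(σ) over the group, which is the quantity the CFB Theorem is usually stated in terms of. The tuple listing of D4 and the helper `act_on_coloring` are assumptions for illustration.

```python
from itertools import permutations

# The eight elements of D4 as tuples: entry i-1 is the image of vertex i.
D4 = [
    (1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3),   # rotations
    (1, 4, 3, 2), (3, 2, 1, 4), (2, 1, 4, 3), (4, 3, 2, 1),   # flips
]


def act_on_coloring(sigma, coloring):
    """In standard notation, the color in position i moves to position sigma(i)."""
    result = [None] * len(coloring)
    for i, color in enumerate(coloring, start=1):
        result[sigma[i - 1] - 1] = color
    return "".join(result)


colorings = {"".join(p) for p in permutations("RRBB")}   # the six lists

# Count orbits directly: each orbit is the set of images of one coloring.
orbits = {frozenset(act_on_coloring(s, c) for s in D4) for c in colorings}

# chi(sigma) summed over the group, then averaged.
fixed_total = sum(1 for s in D4 for c in colorings if act_on_coloring(s, c) == c)

print(len(colorings), len(orbits), fixed_total / len(D4))  # 6 2 2.0
```

The two orbits found here are the blocks of sizes two and four described at the start of Section 3.1.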

Exercise 3.23. In how many ways may we string four (identical) red, six (identical) blue, and seven (identical) green beads on a necklace?

Exercise 3.24. If we have an unlimited supply of identical red beads and identical blue beads, in how many ways may we string 17 of them on a necklace?

Exercise 3.25. If we have five (identical) red, five (identical) blue, and five (identical) green beads, in how many ways may we string them on a necklace?

Exercise 3.26. In how many ways may we paint the faces of a cube with six different colors, using all six?

Exercise 3.27. In how many ways may we paint the faces of a cube with two colors of paint? What if both colors must be used?

Hint. There are five kinds of elements in the rotation group of the cube. For example, there are six rotations by 90 degrees or 270 degrees around an axis connecting the centers of two opposite faces and there are 8 rotations (of 120 degrees and 240 degrees, respectively) around an axis connecting two diagonally opposite vertices.

Exercise 3.28. In how many ways may we color the edges of a (regular) (2n + 1)-gon free to move around in the plane (so it cannot be flipped) if we use red n times and blue n + 1 times? If this is a number you have seen before, identify it.

Hint. Is it possible for a nontrivial rotation to fix any coloring?

Exercise 3.29. In how many ways may we color the edges of a (regular) (2n + 1)-gon free to move in three-dimensional space so that n edges are colored red and n + 1 edges are colored blue? Your answer may depend on whether n is even or odd.

Exercise 3.30. How many different proper colorings with four colors are there of the vertices of a graph which is a cycle on five vertices? (If we get one coloring by rotating or flipping another one, they aren’t really different.)

Figure 3.3: A graph on six vertices.

[Figure: a graph with vertices labelled 1 through 6.]

Exercise 3.31. How many different proper colorings with four colors are there of the graph in Figure 3.3? Two graphs are the same if we can redraw one of the graphs, not changing the vertex set or edge set, so that it is identical to the other one. This is equivalent to permuting the vertices in some way so that when we apply the permutation to the endpoints of the edges to get a new edge set, the new edge set is equal to the old one. Such a permutation is called an automorphism of the graph. Thus two colorings are different if there is no automorphism of the graph that carries one to the other one.

Hint. There are 48 elements in the group of automorphisms of the graph.

Another Hint. For this problem, it may be easier to ask which group elements fix a coloring rather than which colorings are fixed by a group element.

3.4 Polya-Redfield Enumeration Theory

George Polya and Robert Redfield independently developed a theory of generating functions that describe the action of a group G on colorings of a set S by a set T when we know the action of G on S. Polya’s work on the subject is very accessible in its exposition, and so the subject has become popularly known as Polya theory, though Polya-Redfield theory would be a better name. In this section we develop the elements of this theory.

The idea of coloring a set S has many applications. For example, the set S might be the positions in a hydrocarbon molecule which are occupied by hydrogen, and the group could be the group of spatial symmetries of the molecule (that is, the group of permutations of the atoms of the molecule that move the molecule around so that in its final position the molecule cannot be distinguished from the original molecule). The colors could then be radicals (including hydrogen itself) that we could substitute for each hydrogen position in the molecule. Then the number of orbits of colorings is the number of chemically different compounds we could create by using these substitutions.² In Figure 3.4 we show two different ways to substitute the OH radical for a hydrogen atom in the chemical diagram we gave for butane (on page 54). We have colored one vertex of degree 1 with the radical OH and the rest with the atom H. There are only two distinct ways to do this, as the OH must either connect to an “end” C or a “middle” C. This shows that there are two different forms, called isomers, of the compound shown. This compound is known as butyl alcohol.

² There is a fascinating subtle issue of what makes two molecules different. For example, suppose we have a molecule in the form of a cube, with one atom at each vertex. If we interchange the top and bottom faces of the cube, each atom is still connected to exactly the same atoms as before. However, we cannot achieve this permutation of the vertices by a member of the rotation group of the cube. It could well be that the two versions of the molecule interact with other molecules in different ways, in which case we would consider them as chemically different. On the other hand, if the two versions interact with other molecules in the same way, we would have no reason to consider them chemically different. This kind of symmetry is an example of what is called chirality in chemistry.

Figure 3.4: The two different isomers of butyl alcohol.

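[Figure: two structural diagrams, each a chain of four carbon atoms with hydrogen atoms attached; in one the OH radical is attached to an end carbon, in the other to a middle carbon.]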

So think intuitively about some “figure” that has places to be colored. (For instance, the faces of a cube, the beads on a necklace, circles at the vertices of an n-gon, etc.) How can we picture the coloring? If we number the places to be colored, say 1 to n, then we have a standard way to represent our coloring. For example, if our colors are blue, green and red, then BBGRRGBG describes a typical coloring of 8 such places. Unless the places are somehow “naturally” numbered, this idea of a coloring imposes structure that is not really there. Even if the structure is there, visualizing our colorings in this way doesn’t “pull together” any common features of different colorings; we are simply visualizing all possible colorings. We have a group (think of it as symmetries of the figure you are imagining) that acts on the places. That group then acts in a natural way on the colorings of the places and we are interested in orbits of the colorings. Thus we want a picture that pulls together the common features of the colorings in an orbit.

One way to pull together similarities of colorings would be to let the letters we are using as pictures of colors commute as we did with our pictures in Section 6.1 (on pages 63ff). Then our picture BBGRRGBG becomes $B^3G^3R^2$, so our picture now records simply how many times we use each color. Think about how we defined the action of a group on the colorings of a set on which the group acts. You will see that acting with a group element won’t change how many times each color is used; it simply moves colors to different places. Thus the picture we now have of a given coloring is an equally appropriate picture for each coloring in an orbit. One natural question for us to ask is “How many orbits have a given picture?”
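The idea of a picture with commuting letters can be tried out on a computer. The following is a minimal sketch, not part of the original notes: it represents a coloring as a string of color letters, and the helper names `picture`, `format_picture`, and `rotate` are hypothetical. It assumes the group acts by cyclic rotation of the places, which is only one of the actions discussed in this chapter.

```python
from collections import Counter

def picture(coloring):
    """Picture of a coloring: how many times each color letter is used."""
    return Counter(coloring)

def format_picture(pic):
    """Render a picture such as {'B': 3, 'G': 3, 'R': 2} as 'B^3 G^3 R^2'."""
    return " ".join(f"{color}^{count}" for color, count in sorted(pic.items()))

def rotate(coloring, k):
    """Move the colors cyclically by k places (an assumed group action)."""
    k %= len(coloring)
    return coloring[k:] + coloring[:k]

coloring = "BBGRRGBG"
print(format_picture(picture(coloring)))

# Rotating the places moves colors around but never changes the picture.
print(all(picture(rotate(coloring, k)) == picture(coloring) for k in range(8)))
```

The second printed value illustrates the remark above that acting with a group element does not change how many times each color is used.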

Exercise 3.32. Suppose we draw identical circles at the vertices of a regular hexagon. Suppose we color these circles with two colors, red and blue.

(a) In how many ways may we color the set [6] using the colors red and blue?

(b) These colorings are partitioned into orbits by the action of the rotation group on the hexagon. Using our standard notation, write down all these orbits and observe how many orbits have each picture, assuming the picture of a coloring is the product of commuting variables representing the colors.

(c) Using the picture function of the previous part, write down the picture enumerator for the orbits of colorings of the vertices of a hexagon under the action of the rotation group.

In Exercise 3.32(c) we have a picture enumerator for pictures of orbits of the action of a group on colorings. As above, we ask how many orbits of the colorings have any given picture. We can think of a multivariable generating function in which the letters we use to picture individual colors are the variables, and the coefficient of a picture is the number of orbits with that picture. Such a generating function provides an answer to our natural question, and so it is this sort of generating function we will seek. Since the CFB Theorem was our primary tool for saying how many orbits we have, it makes sense to think about whether the CFB Theorem has an analog in terms of pictures of orbits.
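For readers who want to check hand computations of this kind, here is a brute-force sketch on a smaller figure than the hexagon: four circles at the corners of a square, colored red and blue, with the rotation group acting. The square is chosen here as an illustration and is not from the notes; the function names are hypothetical. Each orbit is found by listing all rotations of a coloring, and the result maps each picture to the number of orbits having that picture.

```python
from collections import Counter
from itertools import product

def rotations(coloring):
    """The orbit of a coloring (a tuple of colors) under cyclic rotation."""
    n = len(coloring)
    return {coloring[k:] + coloring[:k] for k in range(n)}

def orbit_picture_counts(n, colors):
    """Map each picture to the number of rotation orbits with that picture."""
    seen = set()
    counts = Counter()
    for coloring in product(colors, repeat=n):
        if coloring in seen:
            continue  # this coloring lies in an orbit we already counted
        seen |= rotations(coloring)
        pic = tuple(sorted(Counter(coloring).items()))
        counts[pic] += 1
    return counts

for pic, number_of_orbits in sorted(orbit_picture_counts(4, "RB").items()):
    print(pic, number_of_orbits)
```

For the square this lists six orbits in all, two of which use two reds and two blues (the reds adjacent, or the reds opposite).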

3.5 The Orbit-Fixed Point Theorem

Exercise 3.33. Suppose now we have a group G acting on a set and we have a picture function on that set with the additional feature that for each orbit of the group, all its elements have the same picture. In this circumstance we define the picture of an orbit or multiorbit to be the picture of any one of its members. The orbit enumerator Orb(G, S) is the sum of the pictures of the orbits. (Note that this is the same as the sum of the pictures of the multiorbits.) The fixed-point enumerator Fix(G, S) is the sum of the pictures of each of the fixed points of each of the elements of G. We are going to construct a generating function analog of the CFB Theorem. The main idea of the proof of the CFB Theorem was to try to compute in two different ways the number of elements (i.e., the sum of all the multiplicities of the elements) in the union of all the multiorbits of a group acting on a set. Suppose instead we try to compute the sum of all the pictures of all the elements in the union of the multiorbits of a group acting on a set. By thinking about how this sum relates to Orb(G, S) and Fix(G, S), find an analog of the CFB Theorem that relates these two enumerators. State and prove this theorem.


We will call the theorem of Exercise 3.33 the Orbit-Fixed Point Theorem. In order to apply the Orbit-Fixed Point Theorem, we need some basic facts about picture enumerators.

Exercise 3.34. Suppose that $P_1$ and $P_2$ are picture functions on sets $S_1$ and $S_2$ in the sense of Section 6.1 (beginning on page 64). Define $P$ on $S_1 \times S_2$ by $P(x_1, x_2) = P_1(x_1)P_2(x_2)$. How are $E_{P_1}$, $E_{P_2}$, and $E_P$ related? (You may have already done this problem in another context!)

Exercise 3.35. Suppose $P_i$ is a picture function on a set $S_i$ for all $i = 1, \ldots, k$. We define the picture of a k-tuple $(x_1, x_2, \ldots, x_k)$ to be the product of the pictures of its elements, i.e.

$$P((x_1, x_2, \ldots, x_k)) = \prod_{i=1}^{k} P_i(x_i).$$

How does the picture enumerator $E_P$ of the set $S_1 \times S_2 \times \cdots \times S_k$ of all k-tuples with $x_i \in S_i$ relate to the picture enumerators of the sets $S_i$? In the special case that $S_i = S$ for all i and $P_i = P$ for all i, what is $E_P(S^k)$?

Exercise 3.36. Use the Orbit-Fixed Point Theorem to determine the orbit enumerator for the colorings, with two colors (red and blue), of six circles placed at the vertices of a hexagon which is free to move in the plane. Compare the coefficients of the resulting polynomial with the various orbits you found in Exercise 3.32.

Exercise 3.37. Find the generating function (in variables R, B) for colorings of the faces of a cube with two colors (red and blue). What does the generating function tell you about the number of ways to color the cube (up to spatial movement) with various combinations of the two colors?

3.6 The Polya-Redfield Theorem

Polya’s (and Redfield’s) famed enumeration theorem deals with situations such as those in Exercises 3.36 and 3.37 in which we want a generating function for the set of all colorings of a set S using a set T of colors, where the picture of a coloring is the product of the multiset of colors it uses. We are again thinking of the colors as variables. The point of the next series of exercises is to analyze the solutions to Exercises 3.36 and 3.37 in order to see what Polya and Redfield saw (although they didn’t see it in this notation or using this terminology).


Exercise 3.38. In Exercise 3.36 we have four kinds of group elements: the identity (which fixes every coloring); the rotations through 60 or 300 degrees; the rotations through 120 and 240 degrees; and the rotation through 180 degrees. The fixed-point enumerator for the rotation group acting on the colorings of the hexagon is by definition the sum of the fixed-point enumerators of colorings fixed by the identity; of colorings fixed by 60- or 300-degree rotations; of colorings fixed by 120- or 240-degree rotations; and of colorings fixed by the 180-degree rotation. To the extent that you haven’t already done it in an earlier exercise, write down each of these enumerators (one for each kind of permutation) individually and factor each one (over the integers) as completely as you can.

Exercise 3.39. In Exercise 3.37 we have five different kinds of group elements. For each kind of element, to the extent that you haven’t already done it in an earlier exercise, write down the fixed-point enumerator for the elements of that kind. Factor the enumerators as completely as you can.

Exercise 3.40. In Exercise 3.38, each “kind” of group element has a “kind” of cycle structure. For example, a rotation through 180 degrees has three cycles of size two. What kind of cycle decomposition does a rotation through 60 or 300 degrees have? What kind of cycle decomposition does a rotation through 120 or 240 degrees have? Discuss the relationship between the cycle structures and the factored enumerators of fixed points of the permutations in Exercise 3.38.

Recall that we said that a group G of permutations acts on a set S if, for each member σ of G there is a permutation $\overline{\sigma}$ of S such that

$$\overline{\sigma \circ \varphi} = \overline{\sigma} \circ \overline{\varphi}$$

for all members σ, ϕ ∈ G. Since $\overline{\sigma}$ is a permutation of S, it has a cycle decomposition as a permutation of S (in addition to whatever cycle decomposition σ has in the original permutation group G).

Exercise 3.41. In Exercise 3.39, each “kind” of group element has a “kind” of cycle decomposition in the action of the rotation group of the cube on the faces of the cube. For example, a rotation of the cube through 180 degrees around a vertical axis through the centers of the top and bottom faces has two cycles of size two and two cycles of size one. To the extent that you haven’t already done it in an earlier exercise, answer the following questions. How many such rotations does the group have? What are the other “kinds” of group elements, and what are their cycle structures? Discuss the relationship between the cycle decomposition and the factored enumerator in Exercise 3.39.

The usual way of describing the Polya-Redfield Enumeration Theorem involves the “cycle indicator” or “cycle index” of a group acting on a set. Suppose we have a group G acting on a finite set S. Since each group element σ gives us a permutation $\overline{\sigma}$ of S, as such it has a decomposition into disjoint cycles as a permutation of S. Suppose $\overline{\sigma}$ has $c_1$ cycles of size 1, $c_2$ cycles of size 2, ..., $c_n$ cycles of size n. Then the cycle monomial of $\overline{\sigma}$ is

$$z(\overline{\sigma}) = z_1^{c_1} z_2^{c_2} \cdots z_n^{c_n}.$$

The cycle indicator or cycle index of G acting on S is

$$Z(G, S) = \frac{1}{|G|} \sum_{\sigma : \sigma \in G} z(\overline{\sigma}).$$
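The definitions of cycle monomial and cycle index can be made concrete with a short computation. The sketch below is an illustration only: it assumes a permutation of S = {0, 1, ..., n − 1} is stored as a list `perm` with `perm[i]` the image of `i`, and it uses the rotation group of a square (not the hexagon or cube of the exercises) as its example. All function names are hypothetical.

```python
from collections import Counter

def cycle_type(perm):
    """Count the cycles of each size in a permutation given as a list of images."""
    seen = [False] * len(perm)
    counts = Counter()
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:  # walk around the cycle containing start
            seen[i] = True
            i = perm[i]
            length += 1
        counts[length] += 1
    return counts

def cycle_monomial(perm):
    """Render the cycle monomial, e.g. 'z_2^2' for two cycles of size two."""
    return " ".join(f"z_{size}^{count}"
                    for size, count in sorted(cycle_type(perm).items()))

def rotation(n, k):
    """Rotation of the n vertices of an n-gon by k steps."""
    return [(i + k) % n for i in range(n)]

# Cycle monomials of the four rotations of a square, with multiplicities.
group = [rotation(4, k) for k in range(4)]
terms = Counter(cycle_monomial(p) for p in group)
print(terms, "divided by", len(group))
```

Reading off the output, the cycle index of the rotation group of the square acting on its vertices is $\frac{1}{4}(z_1^4 + z_2^2 + 2z_4)$.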

Exercise 3.42. (a) What is the cycle index for the group $D_6$ acting on the vertices of a hexagon?

(b) What is the cycle index for the group of rotations of the cube acting on the faces of the cube?

Exercise 3.43. How can you compute the orbit enumerator of G acting on colorings of S by a finite set T of colors from the cycle index of G acting on S? (Use t, thought of as a variable, as the picture of an element t of T.) State and prove the relevant theorem! This is Polya’s and Redfield’s famous enumeration theorem.

Exercise 3.44. Suppose we make a necklace by stringing 12 pieces of brightly colored plastic tubing onto a string and fastening the ends of the string together, and that we have ample supplies of blue, green, red, and yellow tubing available. Give a generating function in which the coefficient of $B^iG^jR^kY^h$ is the number of necklaces we can make with i blues, j greens, k reds, and h yellows. How many terms would this generating function have if you expanded it in terms of powers of B, G, R, and Y? Does it make sense to do this expansion? How many of these necklaces have 3 blues, 3 greens, 2 reds, and 4 yellows?

Exercise 3.45. What should we substitute for the variables representing colors in the orbit enumerator of G acting on the set of colorings of S by a set T of colors in order to compute the total number of orbits of G acting on the set of colorings? What should we substitute into the variables in the cycle index of a group G acting on a set S in order to compute the total number of orbits of G acting on the colorings of S by a set T? Find the number of ways to color the faces of a cube with four colors.

Exercise 3.46. We have red, green, and blue sticks all of the same length, with a dozen sticks of each color. We are going to make the skeleton of a cube by taking eight identical lumps of modeling clay and pushing three sticks into each lump so that the lumps become the vertices of the cube. (Clearly we won’t need all the sticks!) In how many different ways could we make our cube? How many cubes have four edges of each color? How many have two red, four green, and six blue edges?

Exercise 3.47. How many cubes can we make in Exercise 3.46 if the lumps of modeling clay can be any of four colors?

Figure 3.5: A possible computer network.

[Figure: a graph on six vertices labeled 1 through 6; the edges are not recoverable from this copy.]

Exercise 3.48. In Figure 3.5 we see a graph with six vertices. Suppose we have three different kinds of computers that can be placed at the six vertices of the graph to form a network. In how many different ways may the computers be placed? Note that two computer placements are the same if there is an automorphism of the graph that carries one to the other. (Refer to Exercise 3.31 for the definition of automorphism.)

Hint. The group of automorphisms contains D6 as a subgroup.

Another Hint. The permutations with four one-cycles and the two-cycle (1 4), (2 5), or (3 6) are in the group of automorphisms. Once you know the cycle structure of $D_6$ and $(1\ 4)D_6 = \{(1\ 4)\sigma \mid \sigma \in D_6\}$, you know the cycle structure of every element of the group.


Exercise 3.49. Two simple graphs on the set [n] with edge sets E and E′ (which we think of as sets of two-element sets for this exercise) are said to be isomorphic if there is a permutation σ of [n] which, in its action on two-element sets, carries E to E′. We say two graphs are different if they are not isomorphic. Thus the number of different graphs is the number of orbits of the set of all sets of two-element subsets of [n] under the action of the group $S_n$. We can represent an edge set by its characteristic function (as in Exercise 26 on page 15). That is, we define

$$\chi_E(\{u, v\}) = \begin{cases} 1 & \text{if } \{u, v\} \in E \\ 0 & \text{otherwise.} \end{cases}$$

Thus we can think of the set of graphs as a set of colorings with colors 0 and 1 of the set of all two-element subsets of [n]. The number of different graphs with vertex set [n] is thus the number of orbits of this set of colorings under the action of the symmetric group $S_n$ on the set of two-element subsets of [n]. Use this to find the number of different graphs on five vertices.

Hint. What does the symmetric group on five vertices have to do with this exercise?
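To see the definitions of Exercise 3.49 in action on a very small case, the following sketch represents an edge set as a set of two-element frozensets, builds its characteristic function, and tests isomorphism by trying every permutation of [n]. This brute-force search is only an illustration of the definition on three vertices; it is not the counting method the exercise asks for, and the function names are hypothetical.

```python
from itertools import combinations, permutations

def characteristic_function(n, edges):
    """Color each two-element subset of [n] with 1 if it is an edge, else 0."""
    edge_set = {frozenset(e) for e in edges}
    return {frozenset(pair): int(frozenset(pair) in edge_set)
            for pair in combinations(range(1, n + 1), 2)}

def act(sigma, edge_set):
    """Apply a permutation (a dict vertex -> vertex) to the endpoints of each edge."""
    return {frozenset(sigma[v] for v in edge) for edge in edge_set}

def isomorphic(n, edges1, edges2):
    """True if some permutation of [n] carries the first edge set to the second."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    vertices = list(range(1, n + 1))
    return any(act(dict(zip(vertices, image)), e1) == e2
               for image in permutations(vertices))

path_a = [(1, 2), (2, 3)]          # path with middle vertex 2
path_b = [(1, 2), (1, 3)]          # path with middle vertex 1
triangle = [(1, 2), (2, 3), (1, 3)]
print(isomorphic(3, path_a, path_b))    # same graph up to relabeling
print(isomorphic(3, path_a, triangle))  # different numbers of edges
```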

3.7 Additional Exercises for Supplementary Chapter 3

1. (a) If a group G acts on a set S, does $\sigma(f) = f \circ \sigma$ define a group action on the functions from S to a set T? Why or why not?

(b) If a group G acts on a set S, does $\sigma(f) = f \circ \sigma^{-1}$ define a group action on the functions from S to a set T? Why or why not?

(c) Is either of the possible group actions essentially the same as the action we described on colorings of a set, or is that an entirely different action?

2. Find the number of ways to color the faces of a tetrahedron with two colors.

3. Find the number of ways to color the faces of a tetrahedron with four colors so that each color is used.

4. Find the cycle index of the group of spatial symmetries of the tetrahedron acting on the vertices. Find the cycle index for the same group acting on the faces.

5. Find the generating function for the number of ways to color the faces of the tetrahedron with red, blue, green and yellow.

→6. Find the generating function for the number of ways to color the faces of a cube with four colors so that all four colors are used.

→7. How many different graphs are there on six vertices with seven edges?

→8. Show that if H is a subgroup of the group G, then H acts on G by σ(τ) = σ ◦ τ for all σ in H and τ in G. What is the size of an orbit of this action? How does the size of a subgroup of a group relate to the size of the group?

Part III

REVIEW MATERIAL


Appendix A

More on Functions and Digraphs

A.1 Functions

Exercise A.1. Consider the functions from S = {−2, −1, 0, 1, 2} to T = {1, 2, 3, 4, 5} defined by $f(x) = x + 3$, and $g(x) = x^5 - 5x^3 + 5x + 3$. Write down the set of all ordered pairs (x, f(x)) for x ∈ S, and the set of all ordered pairs (x, g(x)) for x ∈ S. Are the two functions the same or different?

Exercise A.1 points out how two functions which appear to be different are actually the same on some domain of interest to us. Most of the time when we are thinking about functions it is fine to think of a function casually as a relationship between two sets. In Exercise A.1 the set of ordered pairs you wrote down for each function is called the relation of the function. When we want to distinguish between the casual and the careful in talking about relationships, our casual term will be “relationship” and our careful term will be “relation.” So relation is a technical word in mathematics, and as such it has a technical definition. A relation from a set S to a set T is a set of ordered pairs whose first elements are in S and whose second elements are in T. Another way to say this is that a relation from S to T is a subset of the Cartesian product S × T.

A typical way to define a function f from a set S (called the domain of the function) to a set T (called the co-domain) is that f is a relation from S to T which relates each element of S to one and only one member of T. We use f(x) to stand for the element of T that is related to the element x of S, and we use the standard shorthand f : S → T for “f is a function from S to T”.
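The formal view of a function as a set of ordered pairs is easy to experiment with. The sketch below, an illustration with hypothetical helper names `relation` and `is_function`, builds the relations of the functions f and g of Exercise A.1 on S and checks the "one and only one" condition in the definition of a function.

```python
S = [-2, -1, 0, 1, 2]
T = {1, 2, 3, 4, 5}

def f(x):
    return x + 3

def g(x):
    return x**5 - 5 * x**3 + 5 * x + 3

def relation(func, domain):
    """The relation of a function: the set of ordered pairs (x, func(x))."""
    return {(x, func(x)) for x in domain}

def is_function(rel, domain, codomain):
    """True if rel relates each element of domain to exactly one element of codomain."""
    pairs_in_range = all(x in domain and y in codomain for x, y in rel)
    firsts = sorted(x for x, _ in rel)
    return pairs_in_range and firsts == sorted(domain)

print(sorted(relation(f, S)))
print(relation(f, S) == relation(g, S))  # same ordered pairs on S
print(is_function(relation(g, S), S, T))
```

Comparing the two sets of ordered pairs is exactly the comparison that makes f and g the same function on S, even though their formulas differ.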


Exercise A.2. Here are some questions that will help you get used to the formal idea of a relation and the related formal idea of a function. S will stand for a finite set of size s and T will stand for a finite set of size t.

(a) What is the size of the largest relation from S to T?

(b) What is the size of the smallest relation from S to T?

(c) What is the size of the relation of a function from S to T? That is, how many ordered pairs are in the relation of a function from S to T?

(d) Before working this and the next exercise, review the definitions of one-to-one function and onto function in Chapter 1. How many different elements must appear as second elements of the ordered pairs in the relation of a one-to-one function from S to T?

(e) What is the minimum size that S can have if there is an onto function from S to T?

Sketch of solution.

(a) st, because that is the size of the relation that has all the ordered pairs (x, y) with x ∈ S and y ∈ T.

(b) 0, because the empty set of ordered pairs is a relation.

(c) s.

(d) s, exactly one for each element of S.

(e) The size of S must be at least t.

Exercise A.3. When f is a function from S to T, the sets S and T play a big role in determining whether a function is one-to-one or onto. For the remainder of this exercise, let S and T stand for the set of nonnegative real numbers.

(a) If f : S → T is given by $f(x) = x^2$, is f one-to-one? Is f onto?

(b) Now assume for the rest of the exercise that S′ is the set of all real numbers and g : S′ → T is given by $g(x) = x^2$. Is g one-to-one? Is g onto?

(c) Assume for the rest of the exercise that T′ is the set of all real numbers and h : S → T′ is given by $h(x) = x^2$. Is h one-to-one? Is h onto?

(d) And if the function j : S′ → T′ is given by $j(x) = x^2$, is j one-to-one? Is j onto?

(e) If f : S → T is a function, we say that f maps x to y as another way to say that f(x) = y. Suppose S = T = {1, 2, 3}. Give a function from S to T that is not onto. Notice that two different members of S have mapped to the same element of T. Thus when we say that f associates one and only one element of T to each element of S, it is quite possible that the one and only one element f(1) that f maps 1 to is exactly the same as the one and only one element f(2) that f maps 2 to.


A.2 Digraphs

Figure A.1: The Alphabet Digraph. [Figure: a digraph on the vertices a, b, c, and d.]

In Figure A.1 we illustrate the digraph of the “comes before in alphabetical order” relation on the letters a, b, c, and d. We draw the arrow from a to b, for example, because a comes before b in alphabetical order. We try to choose the locations for the vertices so that the arrows capture what we are trying to illustrate as well as possible. Sometimes this entails redrawing our directed graph several times until we think the arrows capture the relationship well.

Exercise A.4. Draw the digraph of the “is a proper subset of” relation on the set of subsets of a two-element set. (Remember the empty set is a subset.) How many arrows would you have had to draw if this exercise asked you to draw the digraph for the subsets of a three-element set?

Exercise A.5. (a) Draw the digraph of the relation from the set {A, M, P, S} to the set {Sam, Mary, Pat, Ann, Polly, Sarah} given by “is the first letter of.”

(b) Draw the digraph of the relation from the set {Sam, Mary, Pat, Ann, Polly, Sarah} to the set {A, M, P, S} given by “has as its first letter.”

Exercise A.6. When we draw the digraph of a function f, we draw an arrow from the vertex representing x to the vertex representing f(x). One of the relations you considered in Exercise A.5 is the relation of a function.

(a) Which relation is the relation of a function?

(b) How does the digraph help you visualize that one relation is a function and the other is not?

Exercise A.7. Digraphs of functions help us to visualize whether or not they are onto or one-to-one. For example, let both S and T be the set {−2, −1, 0, 1, 2} and let S′ and T′ both be the set {0, 1, 2}. Let f(x) = 2 − |x|.

(a) Draw the digraph of the function f, assuming its domain is S and its range is T. Use the digraph to explain why or why not this function maps S onto T.

(b) Use the digraph of the previous part to explain whether or not the function is one-to-one.

(c) Draw the digraph of the function f, assuming its domain is S and its range is T′. Use the digraph to explain whether or not the function is onto.

(d) Use the digraph of the previous part to explain whether or not the function is one-to-one.

(e) Draw the digraph of the function f, assuming its domain is S′ and its range is T′. Use the digraph to explain whether the function is onto.

(f) Use the digraph of the previous part to explain whether the function is one-to-one.

(g) Suppose that the function f has domain S′ and range T. Draw the digraph of f and use it to explain whether f is onto.

(h) Use the digraph of the previous part to explain whether or not f is one-to-one.

A function from a set X to a set Y which is both one-to-one and onto is frequently called a bijection, especially in combinatorics. Your work in Exercise A.7 should show you that a digraph is the digraph of a bijection from X to Y when all four of the following properties hold:

• The vertices of the digraph represent the elements of X and Y (so that X is the possible domain and Y is the possible co-domain).

• Each vertex representing an element of X has one and only one arrow leaving it (so that f is indeed a function).

• Each vertex representing an element of Y has at least one arrow entering it (so that it is onto).

• Each vertex representing an element of Y has at most one arrow entering it (so it is one-to-one).

Of course, the last two properties can be combined to the requirement that each vertex representing an element of Y has exactly one arrow entering it.
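The four properties can be checked mechanically by counting arrows. The sketch below is one possible way to do so, assuming a digraph is given as a list of (tail, head) pairs; the function name `is_bijection_digraph` and the sample sets are hypothetical and are not taken from the exercises.

```python
def is_bijection_digraph(X, Y, arrows):
    """Check the four digraph properties of a bijection from X to Y."""
    arrows_leaving = {x: 0 for x in X}
    arrows_entering = {y: 0 for y in Y}
    for tail, head in arrows:
        # Every arrow must go from a vertex for X to a vertex for Y.
        if tail not in arrows_leaving or head not in arrows_entering:
            return False
        arrows_leaving[tail] += 1
        arrows_entering[head] += 1
    is_function = all(count == 1 for count in arrows_leaving.values())
    is_onto = all(count >= 1 for count in arrows_entering.values())
    is_one_to_one = all(count <= 1 for count in arrows_entering.values())
    return is_function and is_onto and is_one_to_one

X = [1, 2, 3]
Y = ["a", "b", "c"]
print(is_bijection_digraph(X, Y, [(1, "a"), (2, "b"), (3, "c")]))
print(is_bijection_digraph(X, Y, [(1, "a"), (2, "a"), (3, "c")]))  # "b" is missed
```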

Appendix B

More on Equivalence Relations

Exercise B.1. Which of the reflexive, symmetric and transitive properties does the < relation on the integers have?

Exercise B.2. A relation R on the set of ordered pairs of positive integers that you learned about in grade school in another notation is the relation that says (m, n) is related to (h, k) if mk = hn. Show that this relation is an equivalence relation. In what context did you learn about this relation in grade school?

Hint. To show a relation is an equivalence relation, you need to show it satisfies the definition of an equivalence relation.

Exercise B.3. Another relation that you may have learned about in school, perhaps in the guise of “clock arithmetic,” is the relation of equivalence modulo n. For integers (positive, negative, or zero) a and b, we write

$$a \equiv b \pmod{n}$$

to mean that a − b is an integer multiple of n, and in this case, we say that a is congruent to b modulo n. Show that the relation of congruence modulo n is an equivalence relation.
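Before writing a proof, it can help to test the three properties on a sample of integers. The sketch below does this for one modulus and a small range chosen arbitrarily here; the names are hypothetical. A check like this over finitely many integers is only a sanity test and is not a substitute for the proof the exercise asks for.

```python
def congruent(a, b, n):
    """True if a - b is an integer multiple of n (n a positive integer)."""
    return (a - b) % n == 0

n = 4
sample = range(-6, 7)

reflexive = all(congruent(a, a, n) for a in sample)
symmetric = all(congruent(b, a, n)
                for a in sample for b in sample if congruent(a, b, n))
transitive = all(congruent(a, c, n)
                 for a in sample for b in sample for c in sample
                 if congruent(a, b, n) and congruent(b, c, n))
print(reflexive, symmetric, transitive)
```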

Exercise B.4. Define a relation on the set of all lists of n distinct integers chosen from {1, 2, . . . , n}, by saying two lists are related if they have the same elements (though perhaps in a different order) in the first k places, and the same elements (though perhaps in a different order) in the last n − k places. Show this relation is an equivalence relation.


Appendix C

More on the Principle of Mathematical Induction

You should have already seen the Principle of Mathematical Induction in other courses. If you’ve been able to work through Section 4.1 of Chapter 4, there is no need to read this chapter. This section is provided in case you’ve found your background to be deficient, and you think spending time outside class reviewing this will be helpful.

Exercise C.1. (a) Write down a list of all subsets of {1, 2}. Don’t forget the empty set! Group the sets containing 2 separately from the others.

(b) Write down a list of the subsets of {1, 2, 3}. Group the sets containing 3 separately from the others.

(c) Look for a natural way to match up the subsets containing 2 in part (a) with those not containing 2. Look for a way to match up the subsets in part (b) containing 3 with those not containing 3.

(d) On the basis of the previous part, you should be able to find a bijection between the collection of subsets of {1, 2, . . . , n} containing n and those not containing n. (If you are having difficulty figuring out the bijection, try rethinking Parts (a) and (b), perhaps by doing a similar exercise with the set {1, 2, 3, 4}.) Describe the bijection and explain why it is a bijection. Explain why the number of subsets of {1, 2, . . . , n} containing n equals the number of subsets of {1, 2, . . . , n − 1}.

(e) Parts (a) and (b) suggest strongly that the number of subsets of an n-element set is $2^n$. In particular, the empty set has $2^0$ subsets; a one-element set has $2^1$ subsets: itself and the empty set; and in parts (a) and (b) we saw that two-element and three-element sets have $2^2$ and $2^3$ subsets, respectively. So there are certainly some values of n for which an n-element set has $2^n$ subsets. One way to prove that an n-element set has $2^n$ subsets for all values of n is to argue by contradiction. For this purpose, suppose there is a nonnegative integer n such that an n-element set doesn’t have exactly $2^n$ subsets. In that case there may be more than one such n, and so choose k to be the smallest such n. (Notice that k − 1 is still a positive integer, because k can’t be 0, 1, 2, or 3.) Since k was the smallest value of n for which the statement “An n-element set has $2^n$ subsets” is false, what do you know about the number of subsets of a (k − 1)-element set? What do you know about the number of subsets of the k-element set {1, 2, . . . , k} that don’t contain k? What do you know about the number of subsets of {1, 2, . . . , k} that do contain k? What does the Sum Principle tell you about the number of subsets of {1, 2, . . . , k}? Notice that this contradicts the way in which we chose k, and the only assumption that went into our choice of k was that “there is a nonnegative integer n such that an n-element set doesn’t have exactly $2^n$ subsets.” Since this assumption has led us to a contradiction, it must be false. What can you now conclude about the statement “for every nonnegative integer n, an n-element set has exactly $2^n$ subsets”?

Exercise C.2. Notice that the nth odd integer is 2n − 1, and so the expression

$$1 + 3 + 5 + \cdots + 2n - 1 \tag{C.1}$$

is the sum of the first n odd integers. Experiment a bit with the sum for the first few positive integers and guess its value in terms of n. Now apply the technique of Exercise C.1 to prove that you are right.

In Exercises C.1 and C.2, our proofs had several distinct elements: We had a statement involving an integer n. We knew the statement was true for the first few nonnegative integers in Exercise C.1 and for the first few positive integers in Exercise C.2. We wanted to prove that the statement was true for all nonnegative integers in Exercise C.1 and for all positive integers in Exercise C.2. In both cases we used the method of proof by contradiction: for that purpose we assumed that there was a value of n for which our formula wasn’t true. We then chose k to be the smallest value of n for which our formula wasn’t true. This meant that when n was k − 1, our formula was true (or else that k − 1 wasn’t a nonnegative integer in Exercise C.1 or that k − 1 wasn’t a positive integer in Exercise C.2). What we did next was the crux of the proof. We showed that the truth of our statement for n = k − 1 implied the truth of our statement for n = k. This gave us a contradiction to the assumption that there was an n that made the statement false. In fact, we will see that we can bypass entirely the use of proof by contradiction. We used it to help you discover the central ideas of the technique of proof by mathematical induction. The central core of mathematical induction is the proof that the truth of a statement about the integer n for n = k − 1 implies the truth of the statement for n = k. For example, once we know that a set of size 0 has $2^0$ subsets, if we have proved our implication, we can then conclude that a set of size 1 has $2^1$ subsets, from which we can conclude that a set of size 2 has $2^2$ subsets, from which we can conclude that a set of size 3 has $2^3$ subsets, and so on up to a set of size n having $2^n$ subsets for any nonnegative integer n we choose. In other words, although it was the idea of proof by contradiction that led us to think about such an implication, we can now do without the contradiction at all. What we need to prove a statement about n by this method is a place to start; that is, a value b of n for which we know the statement to be true, and then a proof that the truth of our statement for n = k − 1 implies the truth of the statement for n = k whenever k > b.

The Principle of Mathematical Induction

In order to prove a statement about an integer n, if we can

• Prove the statement when n = b, for some fixed integer b;

• Show that the truth of the statement for n = k − 1 implies the truth of the statement for n = k whenever k > b;

then we can conclude the statement is true for all integers n ≥ b.

As an example, let us return to Exercise C.1. The statement we wish to prove is the statement that “A set of size n has $2^n$ subsets.”

Our statement is true when n = 0, because a set of size 0 is the empty set, for which the only subset is the empty set, giving $1 = 2^0$ subsets. (This step of our proof is called a base step.)

Now suppose that k > 0 and every set with k − 1 elements has $2^{k-1}$ subsets. Suppose $S = \{a_1, a_2, \ldots, a_k\}$ is a set with k elements. We partition the subsets of S into two blocks. Block $B_1$ consists of the subsets that do not contain $a_k$ and block $B_2$ consists of the subsets that do contain $a_k$. Each set in $B_1$ is a subset of $\{a_1, a_2, \ldots, a_{k-1}\}$, and each subset of $\{a_1, a_2, \ldots, a_{k-1}\}$ is in $B_1$. Thus $B_1$ is the set of all subsets of $\{a_1, a_2, \ldots, a_{k-1}\}$. Therefore by our assumption in the first sentence of this paragraph, the size of $B_1$ is $2^{k-1}$. Consider the function from $B_2$ to $B_1$ which takes a subset of S including $a_k$ and removes $a_k$ from it. The set $B_2$ is the domain of this function, because every set in $B_2$ contains $a_k$. This function is onto, because if T is a set in $B_1$, then $T \cup \{a_k\}$ is a set in $B_2$ which the function sends to T. This function is one-to-one because if V and W are two different sets in $B_2$, then removing $a_k$ from both of them gives two different sets in $B_1$. Thus we have a bijection between $B_1$ and $B_2$, so $B_1$ and $B_2$ have the same size. Therefore by the Sum Principle the size of $B_1 \cup B_2$ is $2^{k-1} + 2^{k-1} = 2^k$. Therefore, S has $2^k$ subsets. This shows that if a set of size k − 1 has $2^{k-1}$ subsets, then a set of size k has $2^k$ subsets. Therefore by the principle of mathematical induction, a set of size n has $2^n$ subsets for every nonnegative integer n.

The first sentence of the last paragraph is called the inductive hypothesis. In an inductive proof we always make an inductive hypothesis as part of proving that the truth of our statement when n = k − 1 implies the truth of our statement when n = k. The last paragraph itself is called the inductive step of our proof. In an inductive step we derive the statement for n = k from the statement for n = k − 1, thus proving that the truth of our statement when n = k − 1 implies the truth of our statement when n = k. The last sentence in the last paragraph is called the inductive conclusion.

All inductive proofs should have a base step, an inductive hypothesis, an inductive step, and an inductive conclusion. There are a couple of details worth noticing. First, in this exercise, our base step was the case n = 0, or in other words, we had b = 0. However, in other proofs, b could be any integer, positive, negative, or 0. Second, our proof that the truth of our statement for n = k − 1 implies the truth of our statement for n = k required that k be at least 1, so that there would be an element $a_k$ we could remove in order to describe our bijection. However, the second condition in the statement of the Principle of Mathematical Induction only requires that we be able to prove the implication for k > b, which here means k > 0, so we were allowed to assume k > 0.
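The partition argument in the inductive step can be checked on small sets by brute force. The following is only a sketch in Python, not part of the proof: the choice S = {1, ..., k} with a_k = k and the function names all_subsets and check_inductive_step are assumptions made for illustration. Checking finitely many values of k does not prove the statement; it only shows what the blocks B1 and B2 look like in concrete cases.

```python
from itertools import combinations


def all_subsets(elements):
    """Return every subset of `elements`, each as a frozenset."""
    items = list(elements)
    return {
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    }


def check_inductive_step(k):
    """Check the partition argument for S = {1, ..., k} (illustrative choice).

    Returns the sizes of B1, B2, and the full set of subsets of S.
    """
    S = set(range(1, k + 1))
    a_k = k  # the element singled out in the proof (assumed to be k here)
    subsets = all_subsets(S)

    B1 = {T for T in subsets if a_k not in T}  # subsets that avoid a_k
    B2 = {T for T in subsets if a_k in T}      # subsets that contain a_k

    # B1 is exactly the set of all subsets of {a_1, ..., a_(k-1)}.
    assert B1 == all_subsets(S - {a_k})

    # Removing a_k sends B2 into B1. The image has as many sets as B2
    # (one-to-one) and equals B1 (onto), so the map is a bijection.
    image = {T - {a_k} for T in B2}
    assert len(image) == len(B2)
    assert image == B1

    return len(B1), len(B2), len(subsets)


for k in range(1, 6):
    size_b1, size_b2, total = check_inductive_step(k)
    print(k, size_b1, size_b2, total)
```

Each printed row lists k, the sizes of the two blocks, and the total number of subsets, so the Sum Principle step can be read off directly from the row.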

Exercise C.3. Use mathematical induction to prove your formula from Exercise C.2.

Exercise C.4. Experiment with various values of n in the sum

$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n \cdot (n+1)} = \sum_{i=1}^{n} \frac{1}{i \cdot (i+1)}.$$

Guess a formula for this sum and prove your guess is correct by induction.
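If you would like to do the experimenting in Exercise C.4 by machine, one possible sketch in Python follows. The function name partial_sum is an assumption made for illustration. The code uses exact fractions rather than decimals so that any pattern in the values is easier to see; it only tabulates values and does not replace the guess or the inductive proof.

```python
from fractions import Fraction


def partial_sum(n):
    """Exact value of 1/(1*2) + 1/(2*3) + ... + 1/(n*(n+1))."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))


# Tabulate the sum for the first few values of n.
for n in range(1, 9):
    print(n, partial_sum(n))
```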

Exercise C.5. For large values of n, which is larger, $n^2$ or $2^n$? A graph might help you decide what “large value” means here. Use mathematical induction to prove that you are correct.
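As an alternative to a graph, a short table of values can help with Exercise C.5. The sketch below is in Python; the function name compare and the range of n shown are assumptions made for illustration. The table only suggests where the comparison settles down; deciding what “large value” means and proving it are still the exercise.

```python
def compare(n):
    """Return n, n squared, 2 to the n, and a label for the larger one."""
    square = n ** 2
    power = 2 ** n
    if square > power:
        larger = "n^2"
    elif square < power:
        larger = "2^n"
    else:
        larger = "equal"
    return n, square, power, larger


# Tabulate both quantities for small nonnegative n.
for n in range(0, 11):
    print(*compare(n))
```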

Exercise C.6. What is wrong with the following attempt at an inductive proof that all integers in any consecutive set of n integers are equal for every positive integer n?

For an arbitrary integer i, all integers from i to i are equal, so our statement is true when n = 1. Now suppose k > 1 and all integers in any consecutive set of k − 1 integers are equal. Let S be a set of k consecutive integers. By the inductive hypothesis, the first k − 1 elements of S are equal and the last k − 1 elements of S are equal. Therefore all the elements in the set S are equal. Thus by the principle of mathematical induction, for every positive n, every n consecutive integers are equal.
