

GENERALIZING CONTEXTS AMENABLE TO GREEDY AND GREEDY-LIKE

ALGORITHMS

by

Yuli Ye

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Computer Science, University of Toronto

Copyright © 2013 by Yuli Ye


Abstract

Generalizing Contexts Amenable to Greedy and Greedy-Like Algorithms

Yuli Ye

Doctor of Philosophy

Graduate Department of Computer Science

University of Toronto

2013

One central question in theoretical computer science is how to solve problems accurately and quickly. Despite the encouraging development of various algorithmic techniques in the past, we are still only beginning to understand these techniques. One particularly interesting paradigm is the greedy algorithm paradigm. Informally, a greedy algorithm builds a solution to a problem incrementally by making locally optimal decisions at each step. Greedy algorithms are important in algorithm design as they are natural, conceptually simple to state and usually efficient. Despite the wide application of greedy algorithms in practice, their behaviour is not well understood. We do know, however, that in several specific settings greedy algorithms can achieve good results. This thesis focuses on examining contexts in which greedy and greedy-like algorithms are successful, and on extending them to more general settings. In particular, we investigate structural properties of graphs and set systems, families of special functions, and greedy approximation algorithms for several classic NP-hard problems in those contexts. A natural phenomenon we observe is a trade-off between the approximation ratio and the generality of those contexts.


Acknowledgements

It is my great honour to have studied theoretical computer science at the University of Toronto, and to have had a great advisor, Allan Borodin, helping me through one of the most important processes of my life. I am greatly indebted to him: for his long-time support, both emotional and financial, for his inspiration and guidance, and for his patience and encouragement. It has been a long journey, and without him, it would never have been possible.

My deepest thanks to my committee members, Charles Rackoff and Derek Corneil, for their careful reading of the thesis and their comments and suggestions. I particularly enjoyed the conversations I had with Derek. His passion for research and his philosophy of life set a true role model for me. Special thanks to Faith Ellen, for serving on my committee at earlier checkpoints and for many constructive and helpful comments on drafts of the thesis. It is my honour to have Magnús M. Halldórsson as my external examiner. I thank him for his helpful comments on the thesis and for his excellent research in the field of approximation algorithms. Many ideas in this thesis are borrowed from his papers and insights.

I would like to thank all my coauthors, especially Stephen Cook and John Brzozowski. It has been my privilege to work with them. I admire their work ethic and persistence, and have learnt a great deal from them. A very special thanks to Dai Le, a wonderful friend for the past four years. We had many enjoyable conversations, and walked almost every path on campus. It has been great to have you as a companion during these years. I would also like to thank Renqiang Min, Phuong Nguyen, Yuval Filmus, Brendan Lucier, Justin Ward, Joel Oren, Xiaodan Zhu and many others; it is impossible to enumerate everyone here.

Finally, thanks to my family in Toronto and China, my wife Lingling and my dear daughter April. I know you have waited for so long. The dream finally comes true.


Contents

1 Introduction
1.1 What is a Greedy Algorithm?
1.2 Greedy Algorithms: A Brief History
1.3 Approximation Algorithms
1.4 Why Study Greedy Algorithms?
1.5 A List of Problems
1.6 Overview of the Thesis

2 Greedy Algorithms on Special Structures
2.1 Chordal Graphs and Related Structures
2.1.1 Interval Selection and Colouring
2.1.2 Perfect Elimination Ordering
2.1.3 Extending Perfect Elimination Orderings
2.1.4 Inductive and Universal Neighbourhood Properties and Their Related Graph Classes
2.2 Natural Subclasses of the Four Families
2.2.1 Graphs Induced by the Job Interval Selection Problem
2.2.2 Planar Graphs
2.2.3 Disk and Unit Disk Graphs, Intersection Graphs of Convex Shapes
2.2.4 More Subclasses
2.3 Properties of G(ISk) and G(CCk)
2.4 Greedy Algorithms for G(ISk) and G(CCk)
2.4.1 Maximum Independent Set
2.4.2 Minimum Vertex Colouring
2.4.3 Minimum Vertex Cover
2.4.4 Weighted Maximum c-Colourable Subgraph
2.4.5 The Graph Class G(CC2)
2.5 Matroids and Chordoids
2.5.1 Matroids
2.5.2 Greedy Algorithms and Matroids
2.5.3 Chordoids

3 Greedy Algorithms for Special Functions
3.1 Linear Functions and Submodular Functions
3.2 Max-Sum Diversification
3.2.1 A Greedy Algorithm and Its Analysis
3.2.2 Further Discussions
3.3 Weakly Submodular Functions
3.3.1 Examples of Weakly Submodular Functions
3.3.2 Weakly Submodular Function Maximization
3.3.3 Further Discussions

4 Sum Colouring - A Case Study of Greedy Algorithms
4.1 Introduction
4.2 NP-Hardness for Penny Graphs
4.3 Approximation Algorithms for d-Claw-Free Graphs and their Subclasses
4.3.1 Compact Colouring for G(ISk)
4.3.2 Unit Square Graphs
4.4 Priority Inapproximation for Sum Colouring
4.4.1 Fixed Order and Adaptive Order
4.4.2 Deriving Lower Bounds
4.4.3 An Inapproximation Lower Bound for Sum Colouring
4.5 Conclusion

5 Greedy Algorithms with Weight Scaling
5.1 Introduction
5.2 Weight Scaling for Degrees
5.3 Weight Scaling for Claws
5.4 Conclusion

6 Conclusion

Bibliography


List of Figures

2.1 A chordal graph
2.2 A set of eight intervals ordered by non-decreasing finish-time
2.3 An optimal solution of interval selection on the input in Fig. 2.2
2.4 An optimal solution of interval colouring on the input in Fig. 2.2
2.5 The interval graph of Fig. 2.2
2.6 A graph in G(IS2) and G(IS2) but not in G(CC2) and G(CC2)
2.7 An example of the job interval selection problem
2.8 No triangular face
2.9 One triangular face
2.10 Two adjacent triangular faces
2.11 An example of a disk graph
2.12 Partition the plane into six regions
2.13 An example of a circular-arc graph
2.14 An example of the construction for k = 1
2.15 A vertex v in H and one of its independent neighbours u
2.16 A mapping from O to A
2.17 A maximum matching between N(v) and N2(v)
2.18 An example of a triangle weight decomposition of a graph

4.1 An optimal sum colouring of G
4.2 Unit interval graphs and proper interval graphs
4.3 Unit square graphs and proper intersection graphs of axis-parallel rectangles
4.4 Unit disk graphs and penny graphs
4.5 Transformation from planar graphs with maximum degree 3 to penny graphs
4.6 Transformation for straight pairs
4.7 Transformation for uneven pairs
4.8 Transformation for corner pairs
4.9 The edge gadget
4.10 Best colouring of the edge gadget when both u and v are coloured 1
4.11 Recolour v to improve the sum
4.12 The adjacent edge gadget
4.13 Transformation between two grid vertices
4.14 Corner cases in the first transformation
4.15 An overlapping adjacent pair
4.16 A degree-three corner
4.17 Two graphs for adaptive priority algorithms

5.1 An example for weight scaling


Chapter 1

Introduction

For many application areas, greedy strategies are a natural, conceptually simple, and efficient algorithmic approach. Although for the vast majority of optimization problems greedy algorithms are not optimal, in some specific settings, such as for simple graph and scheduling problems, greedy algorithms can find a global optimum. Natural questions to ask are what brings success to greedy algorithms and to what extent we can generalize this success. Before we investigate these questions, we first give some background and motivation; namely, understanding what greedy algorithms are, their history, settings in which they are effective and, in general, why they are an interesting subject to study.

1.1 What is a Greedy Algorithm?

Most greedy algorithms appear in the form of an iterative procedure that, at each step, makes locally optimal decisions with respect to a certain criterion. A commonly used example of a problem that can be solved using a greedy algorithm is interval selection.

Given a set of intervals, each represented by its start-time and finish-time, we are to select a subset of non-overlapping intervals with maximum cardinality.


Natural greedy algorithms for this problem decide at each step which interval to consider next and what to do about it. The keyword “greedy” is reflected in this decision-making at each step. Usually, this decision has the following characteristics:

• Local: decisions are based on information maintained locally for each input item. This can be the degree of a vertex, the weight of an edge, the distance to a vertex in a graph, or the index in a particular ordering.

• Irrevocable: once determined, decisions cannot be changed afterwards.

• Greedy: decisions are made so as to optimize a certain criterion.

Note that these characteristics may not always be present in designing a greedy algorithm. However, they do appear frequently. There are often many different choices for a greedy rule to optimize the decision at each step. For example, for the interval selection problem, the following rules may be used for selecting the next interval. At each step, we can choose: (1) an interval with the smallest processing time, or (2) an interval with the earliest start-time, or (3) an interval with the smallest number of conflicting intervals, or (4) an interval with the earliest finish-time. By symmetry, choosing the earliest start-time/finish-time is equivalent to choosing the latest finish-time/start-time. Although all these choices are reasonable for a greedy algorithm, only (4) leads to an optimal solution. Therefore, choosing the right greedy rule is crucial when designing a greedy algorithm.
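Rule (4), the earliest finish-time rule, can be sketched in a few lines; the following minimal Python illustration is ours (the representation of intervals as (start, finish) pairs and the function name are assumptions of this sketch, not notation from the thesis):

```python
def select_intervals(intervals):
    """Greedy interval selection: repeatedly take the interval with the
    earliest finish-time that is compatible with the intervals selected so far."""
    selected = []
    last_finish = float("-inf")
    # Sorting by finish-time lets each locally optimal choice be made in O(1):
    # the next compatible interval in this order has the earliest finish-time.
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:  # does not overlap any selected interval
            selected.append((start, finish))
            last_finish = finish
    return selected
```

For instance, on the input [(0, 10), (9, 11), (10, 20)] the sketch selects [(0, 10), (10, 20)], an optimal solution, whereas rule (1) (smallest processing time) would first take (9, 11) and end up with only one interval.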

1.2 Greedy Algorithms: A Brief History

Greedy algorithms are natural and they have been widely used in practice. Some of these algorithms are so natural and important that people rarely associate them with greedy algorithms. In fact, it was not until the early 1970s that the term “greedy algorithm” started emerging. As algorithm design became a standard course in computer science, the term became more popular, and people started to recognize greedy algorithms as a general way to solve problems.

Greedy algorithms have a long history. In the early 13th century, in the book Liber Abaci, Fibonacci described a process for finding a representation of a fraction by a sum of unit fractions (i.e., fractions of the form 1/n) with different denominators. The problem is known as the Egyptian fraction problem. The process described by Fibonacci is a greedy algorithm: at each step, it finds the largest unit fraction not exceeding the remaining fraction and subtracts it from the remaining fraction, until the remaining fraction becomes zero. Fibonacci showed that such a greedy process terminates in a number of steps no greater than the numerator of the original fraction. Hence it produces a finite representation.
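Fibonacci's process translates directly into code; here is a short Python sketch using exact rational arithmetic (the function name and interface are ours):

```python
from fractions import Fraction
import math

def egyptian(p, q):
    """Fibonacci's greedy algorithm for Egyptian fractions: repeatedly
    subtract the largest unit fraction not exceeding the remainder."""
    r = Fraction(p, q)
    denominators = []
    while r > 0:
        # The largest unit fraction 1/n with 1/n <= r has n = ceil(1/r);
        # math.ceil is exact on Fraction, so no rounding error occurs.
        n = math.ceil(1 / r)
        denominators.append(n)
        r -= Fraction(1, n)
    return denominators
```

For example, egyptian(4, 13) yields [4, 18, 468], i.e., 4/13 = 1/4 + 1/18 + 1/468; the remainders' numerators (4, 3, 1) strictly decrease, illustrating Fibonacci's termination bound.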

Greedy algorithms appear in many important applications. For example, in graph-theoretic problems, we have Prim's algorithm and Kruskal's algorithm for finding a minimum weight spanning tree. Prim's algorithm maintains a set S of discovered vertices; at each step, it adds to S the “closest” vertex to S. Kruskal's algorithm sorts all edges in non-decreasing order of weight; at each step, it selects the next edge if it does not create a cycle with previously selected edges. Like Prim's algorithm, Dijkstra's algorithm for finding shortest paths can also be viewed as a greedy algorithm: it maintains a set S of discovered vertices and, at each step, adds to S the vertex “closest” to the starting vertex via vertices in S. In data compression, we have Huffman's algorithm to generate the optimal prefix-code tree. Huffman's algorithm is greedy in the sense that at each step it extracts the two smallest elements from a set, combines them, and puts the newly combined element back into the set. In scheduling, Graham's list scheduling algorithm for minimizing completion time is a greedy algorithm: it sorts jobs in non-increasing order of processing time and, at each step, schedules the next job on the least loaded machine. Similarly, Johnson's first fit decreasing algorithm for bin packing is also a greedy algorithm: it sorts items in non-increasing order of size and, at each step, assigns the next item to the first bin into which it fits, opening a new bin if the item does not fit into any existing bin.


There are also important developments in studying special structures related to greedy algorithms. Matroids and related systems were first studied in the 1950s by Rado [76], Gale [34] and Edmonds [27]; this work was later extended by Korte and Lovász [61]. In particular, matroids characterize those hereditary set systems for which the natural greedy algorithm always optimizes linear objectives. Related to this development, a similar concept, known as the Monge property, is studied in the literature on transportation problems. In his 1963 paper, Hoffman [49] gave a necessary and sufficient condition, the existence of a Monge sequence, that determines when a certain transportation problem can be solved greedily. Greedy algorithms are also studied for optimizing special functions. An important result of Nemhauser, Wolsey and Fisher [71] states that, over a uniform matroid, the natural greedy algorithm achieves an e/(e − 1) approximation to the optimal value of a monotone submodular set function. This has led to applications in fields such as information retrieval [57] and natural language processing [69, 67, 68].

Recently, there has been growing interest in trying to better understand greedy algorithms. One key development is the priority framework initiated by Borodin, Nielsen and Rackoff [12] in 2002. It gives a precise model for greedy algorithms, so that their power and limitations can be analyzed. We briefly discuss the priority framework in Section 4.4.

1.3 Approximation Algorithms

Throughout the thesis, we focus on approximation algorithms for optimization problems. Most of these problems are NP-hard.

The celebrated result of NP-completeness [20] shows that hard problems exist. Karp's landmark paper on twenty-one NP-complete problems [56] shows they are abundant. In practice, finding optimal solutions for such problems is computationally intractable. They are commonly addressed with heuristics that provide a solution, but with no guarantee on the solution's quality. For many optimization problems, however, a sub-optimal solution that is

• close to the global optimum, and

• computationally feasible,

can sometimes be used as a good substitute for the computationally infeasible optimal solution. Algorithms that find such sub-optimal solutions are called approximation algorithms.

Central to the framework of approximation algorithms is the definition of the approximation ratio, a mathematical measure which bounds the worst-case performance of such algorithms. For a given optimization problem with objective function φ(·), we let σ be an input instance, and let A(σ) and O(σ) be the algorithm's solution and the optimal solution respectively. If the problem is a minimization problem, then the approximation ratio is defined to be the supremum, over the set of all possible input instances Σ, of the ratio between the value of the algorithm's solution and the value of the optimal solution:

ρ(A) = sup_{σ∈Σ} φ[A(σ)] / φ[O(σ)].

It is not hard to see that, for minimization problems, the approximation ratio is no less than one. For a maximization problem, we take the convention that the approximation ratio is defined to be the supremum of the ratio between the value of the optimal solution and the value of the algorithm's solution over the set of all possible input instances:

ρ(A) = sup_{σ∈Σ} φ[O(σ)] / φ[A(σ)],

so that the approximation ratio is always no less than one. Based on these definitions, it is desirable to have a polynomial-time algorithm with a ratio close to one. The approximation ratio provides a guarantee on the quality of the solution obtained in the worst case.

In some cases, trade-offs between the approximation ratio and the running time are possible. An approximation algorithm is a polynomial-time approximation scheme (PTAS) if it takes an instance of an optimization problem and a fixed parameter ε > 0 and, in time polynomial in the input size n, produces a solution that is within a factor 1 + ε of the optimal. An algorithm is a fully polynomial-time approximation scheme (FPTAS) if the running time is polynomial in both the input size n and 1/ε.

An optimization problem having a polynomial-time approximation algorithm with approximation ratio bounded by a constant is said to be approximable. The class of such problems is called APX. In this thesis, we focus on approximation algorithms with small constant approximation ratios. They provide a good guarantee on the quality of a solution. In practical settings, when inputs are not chosen adversarially, these algorithms can achieve better results than their worst-case bounds.

1.4 Why Study Greedy Algorithms?

In this section, we give a series of motivating examples to show that greedy algorithms are a powerful, subtle and interesting class of algorithms. We start with a greedy algorithm for the set cover problem.

Problem 1.4.1 Given a universe U of n elements and a collection S of m subsets of U, S = {S1, S2, ..., Sm}, such that U = S1 ∪ S2 ∪ ··· ∪ Sm, the set cover problem asks to find a cover C ⊆ S of minimum size such that U = ∪_{Si ∈ C} Si.

The following very simple and natural greedy algorithm has an Hn ≈ ln n approximation, where Hn = ∑_{i=1}^{n} 1/i is the nth harmonic number. This bound is tight up to a constant factor under the assumption that P ≠ NP [78].

GREEDY SET COVER
1: C = ∅
2: while C does not cover all elements in U do
3:     Pick a set Si ∈ S that covers the most uncovered elements in U
4:     Remove Si from S
5:     Add Si to C
6: end while
7: Return C
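The pseudocode above can be made runnable; the following Python sketch (the set-of-sets representation and names are ours) makes the locally optimal choice explicit, and assumes, as the problem statement does, that the sets jointly cover U:

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements (ties broken arbitrarily)."""
    uncovered = set(universe)
    remaining = [set(s) for s in sets]
    cover = []
    while uncovered:
        # Locally optimal (greedy) choice: maximize newly covered elements.
        best = max(remaining, key=lambda s: len(s & uncovered))
        remaining.remove(best)
        cover.append(best)
        uncovered -= best
    return cover
```

On U = {1, ..., 5} with sets {1,2,3}, {2,4}, {4,5}, {3,5}, the sketch picks {1,2,3} first (three new elements) and then {4,5}, returning a cover of size two.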

Now we look at a related problem, vertex cover.

Problem 1.4.2 Given a graph G = (V, E), the vertex cover problem asks to find a cover C ⊆ V with minimum size such that every edge in E is incident to at least one vertex in C.

The vertex cover problem can be viewed as a special case of the set cover problem by taking the universe to be the set of edges and each set in the collection to be the subset of edges incident to a particular vertex. Note that each edge appears in exactly two such sets, as it has two end vertices. This type of set cover problem is also referred to as the 2-frequency set cover problem. The following algorithm, known as the largest degree heuristic, is a direct translation of the greedy set cover algorithm above.

GREEDY VERTEX COVER
1: C = ∅
2: while C does not cover all edges in E do
3:     Pick a vertex v in G with the largest current degree
4:     Remove v and all its incident edges from G
5:     Add v to C
6: end while
7: Return C
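A runnable sketch of the largest degree heuristic, assuming edges are given as vertex pairs (the edge-list representation and names are ours):

```python
def greedy_vertex_cover(edges):
    """Largest degree heuristic: repeatedly add the vertex of largest
    current degree, then delete its incident edges."""
    remaining = set(map(frozenset, edges))
    cover = []
    while remaining:
        # Recompute current degrees over the edges not yet covered.
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        v = max(degree, key=degree.get)  # greedy choice: largest degree
        cover.append(v)
        remaining = {e for e in remaining if v not in e}
    return cover
```

On a star with centre 0 and leaves 1, 2, 3 the heuristic picks the centre and stops; on the path 1–2–3 it picks the middle vertex 2.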

Like the greedy set cover algorithm, the above greedy algorithm for vertex cover has an Hn approximation. However, as a special case of set cover, the vertex cover problem admits algorithms with better approximation ratios. The best-known ratio for vertex cover is two, and the bound is tight in the sense that a ratio of 2 − ε, for any constant ε > 0, is not possible assuming the unique games conjecture [58]. A greedy algorithm similar in style to the algorithm above that achieves approximation ratio two was obtained by Clarkson [19], using a slightly different greedy rule.


CLARKSON'S ALGORITHM
1: C = ∅
2: For all v ∈ V, let w(v) = 1
3: while C does not cover all edges in E do
4:     Let d(v) be the current degree of v
5:     Select v in G minimizing w(v)/d(v)
6:     For any neighbour u of v in G, let w(u) = w(u) − w(v)/d(v)
7:     Remove v and all its incident edges from G
8:     Add v to C
9: end while
10: Return C
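The numbered lines above translate directly into code; the following sketch uses exact rationals to avoid floating-point drift (the edge-list representation and names are ours):

```python
from fractions import Fraction

def clarkson_vertex_cover(edges):
    """A sketch of Clarkson's weighted greedy: select the vertex minimizing
    w(v)/d(v), charge that ratio to each neighbour's weight, and repeat."""
    remaining = set(map(frozenset, edges))
    vertices = {v for e in remaining for v in e}
    w = {v: Fraction(1) for v in vertices}  # line 2: w(v) = 1
    cover = []
    while remaining:
        deg = {}
        for e in remaining:
            for v in e:
                deg[v] = deg.get(v, 0) + 1
        # line 5: among vertices of positive current degree, minimize w(v)/d(v)
        v = min(deg, key=lambda u: w[u] / deg[u])
        ratio = w[v] / deg[v]
        # line 6: decrease each current neighbour's weight by w(v)/d(v)
        for e in remaining:
            if v in e:
                (u,) = e - {v}
                w[u] -= ratio
        cover.append(v)
        remaining = {e for e in remaining if v not in e}
    return cover
```

On the star with centre 0, the centre has ratio 1/3 against 1 for the leaves, so only the centre is picked; on a triangle, any run returns a cover of size two.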

Theorem 1.4.3 [19] Clarkson's Algorithm achieves an approximation ratio of two for the vertex cover problem.

Note that without line six, Clarkson's Algorithm is the same as the greedy vertex cover algorithm: w(v) always keeps the value one, and minimizing 1/d(v) at each step is the same as maximizing d(v). This demonstrates the flexibility of greedy algorithms, a flexibility that essentially comes from the greedy choice we can make at each step. We know very little about how this flexibility can translate into the design of better algorithms. In some cases, slightly changing the greedy rule used in an algorithm can make its behaviour mysterious and its analysis challenging. For example, consider the following algorithm for vertex cover.

ANOTHER GREEDY ALGORITHM FOR VERTEX COVER
1: C = ∅
2: while C does not cover all edges in E do
3:     For all v ∈ V, let d(v) be its current degree and N(v) be the set of its neighbours
4:     Select v in G maximizing ∑_{u∈N(v)} 1/(d(u) − 1)
5:     Remove v and all its incident edges from G
6:     Add v to C
7: end while

It seems difficult to find an input instance on which this algorithm achieves an approximation ratio greater than two, yet no proof of any constant approximation ratio is known.

1.5 A List of Problems

Throughout the thesis, we will discuss algorithms for many NP-hard problems. This section collects definitions for all these problems.

1. Weighted Maximum Independent Set (WMIS): Given a graph G = (V ,E) and a weight

function w : V → Z+, the goal is to find a subset S of vertices maximizing the total

weight of S such that no two vertices in S are adjacent in G .

2. Maximum Independent Set (MIS): This is the unweighted version of WMIS obtained by

taking the weight of each vertex to be one. The size of an MIS of a graph G is denoted

as α(G).

3. Weighted Minimum Vertex Cover (WMVC): Given a graph G = (V ,E) and a weight func-

tion w : V → Z+, the goal is to find a subset S of vertices minimizing the total weight

of S such that every edge is incident to at least one vertex in S.

4. Minimum Vertex Cover (MVC): This is the unweighted version of WMVC obtained by

taking the weight of each vertex to be one.

5. Weighted Maximum Clique (WMC): Given a graph G = (V ,E) and a weight function

w : V → Z+, the goal is to find a subset S of vertices maximizing the total weight of S

such that every two vertices in S are adjacent in G .

Page 18: by Yuli Ye - University of Toronto T-Space · Yuli Ye Doctor of Philosophy Graduate Department of Computer Science University of Toronto 2013 One central question in theoretical computer

CHAPTER 1. INTRODUCTION 10

6. Maximum Clique (MC): This is the unweighted version of WMC obtained by taking

the weight of each vertex to be one.

7. Weighted Maximum c-Colourable Subgraph (WCOLc ): Given a graph G = (V ,E) and

a weight function w : V → Z+, the goal is to find a subset S of vertices maximizing

the total weight of S such that S can be partitioned into c independent subsets. This

problem generalizes WMIS in the sense that WMIS is WCOLc with c = 1.

8. Maximum c-Colourable Subgraph (COLc ): This is the unweighted version of WCOLc

obtained by taking the weight of each vertex to be one.

9. Minimum Vertex Colouring (COL): Given a graph G = (V ,E), the goal is to colour the

vertices with a minimum number of colours such that no two vertices with the same

colour are adjacent in G . The minimum number of colours needed to colour G is

called the chromatic number of G , and is denoted as χ(G).

10. Minimum Clique Cover (MCC): Given a graph G = (V ,E), the goal is to colour the ver-

tices with a minimum number of colours such that every two vertices with the same

colour are adjacent in G .

11. Sum Colouring (SC): Given a graph G = (V, E), a proper colouring of G is an assignment c : V → Z+ such that for any two adjacent vertices u, v, c(u) ≠ c(v). The goal of the problem is to give a proper colouring of G such that ∑_{v∈V} c(v) is minimized.

For the rest of the thesis, from time to time, we will use the above acronyms to refer to

particular problems.
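As a concrete reference for the last definition, the SC objective can be computed by brute force on tiny graphs. This is an illustrative sketch only; the adjacency-set encoding of graphs is an assumption of the sketch, not notation from the thesis.

```python
from itertools import product

def min_colour_sum(adj):
    """Brute-force Sum Colouring: minimise the sum of c(v) over all
    proper colourings c : V -> {1, ..., n}.  Using n colours is always
    sufficient, so the search space is finite (and exponential)."""
    vs = list(adj)
    n = len(vs)
    best = None
    for cols in product(range(1, n + 1), repeat=n):
        c = dict(zip(vs, cols))
        # proper: adjacent vertices receive different colours
        if all(c[u] != c[w] for u in vs for w in adj[u]):
            if best is None or sum(cols) < best:
                best = sum(cols)
    return best
```

For example, on the path a-b-c the minimum colour sum is 4 (the endpoints get colour 1 and the middle vertex colour 2); note that minimising the number of colours and minimising the colour sum are different objectives in general.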

1.6 Overview of the Thesis

This thesis has two parts. The first half (Chapters 2 and 3) examines contexts in which greedy algorithms have good performance, and extends them to more general settings. Chapter 2


studies structural properties of graphs and set systems. In particular, we define general-

izations of chordal graphs called inductive k-independent graphs. We study properties of

such families of graphs, and we show that several natural classes of graphs are inductive k-

independent for small constants k; for example, planar graphs. For any fixed constant k, we

develop simple, polynomial time approximation algorithms for inductive k-independent

graphs for several well-studied NP-complete problems. For the extension to matroids, we

give a new definition of a hereditary set system by replacing the augmentation property of a

matroid by an ordered augmentation property. We present several related natural problems,

and give positive and negative results about optimization problems over such set systems.

In particular, the unweighted maximum independent set problem can be solved greedily

in linear time given an ordering of elements satisfying the ordered augmentation property,

while the corresponding weighted version of the problem is NP-hard.

Chapter 3 focuses on optimization problems for special classes of functions. We con-

sider the problem of maximizing a set function over a uniform matroid and over a general

matroid. We extend known results for modular and monotone submodular functions to

more general functions. One class of functions is the objective function in the max-sum di-

versification problem, which is a linear combination of a submodular function and the sum

of metric distances of a set. The other class of functions is weakly submodular functions,

which generalizes the objective function in max-sum diversification. We discuss greedy

(and local search) algorithms for problems optimizing these functions and obtain constant

approximation guarantees. In particular, for max-sum diversification, we obtain a greedy

2-approximation algorithm over a uniform matroid, and a 2-approximation local search al-

gorithm over a matroid. For weakly submodular functions, we obtain a 5.95-approximation

greedy algorithm over a uniform matroid, and a 14.5-approximation local search algorithm

over a matroid.

The second half of the thesis (Chapters 4 and 5) presents some results about the design

of greedy algorithms. Chapter 4 is a case study of greedy algorithms for the sum colouring


problem. In particular, we prove the problem is NP-hard for penny graphs, unit disk graphs

and unit square graphs. We design approximation algorithms for the class of d-claw-free graphs and its subclasses: in particular, a (d − 1)-approximation greedy algorithm for d-claw-free graphs and a 2-approximation algorithm for unit square graphs. We use the pri-

ority framework developed in [12], and give a priority inapproximability result for the sum

colouring problem on a specific subclass of d-claw-free graphs. Chapter 5 discusses the

weight scaling technique for designing a greedy algorithm. We focus on graph optimiza-

tion problems with weighted vertices. The weight scaling technique gives a scaling factor

for each vertex. These scaling factors can be used to produce an ordering in which a greedy

algorithm considers vertices. We prove general bounds for greedy algorithms using differ-

ent scaling factors, and provide a uniform view of several results in the literature. Chapter 6

concludes the thesis.


Chapter 2

Greedy Algorithms on Special Structures

Although relatively rare, there are problems for which simple greedy algorithms can achieve

an optimal solution. In many of those cases, it is the underlying structure of the prob-

lem that allows for the success of the algorithm. For example, although the maximum in-

dependent set problem is NP-hard for general graphs, a simple greedy algorithm solves it

for chordal graphs in polynomial-time. In this chapter, we discuss two different settings

in which greedy algorithms achieve good performance. First, we consider properties of

a graph based on the neighbourhood of nodes, extending chordal graphs and claw-free

graphs. Then we discuss set systems generalizing matroids and chordal graphs.

2.1 Chordal Graphs and Related Structures

Throughout this chapter, we focus on graphs which are simple, connected and undirected.

Vertices or edges of a graph may in some cases be weighted. Initially, we consider un-

weighted graphs. We start with an important graph class: chordal graphs.

The study of chordal graphs can be traced back to the late 1950s; the first definition of

chordal graphs is given by Hajnal and Surányi [40].

Definition 2.1.1 [40] A graph G is chordal if each cycle in G of length at least four has at least


one chord.

Figure 2.1: A chordal graph

Figure 2.1 shows a chordal graph with six vertices. The cycle 2-6-4-5-2 has a chord 5-6. Chordal graphs appear frequently under other names in the literature, such as tri-

angulated graphs, rigid circuit graphs, monotone transitive graphs, and perfect elimination

graphs. Chordal graphs have many different characterizations. Dirac [25] proved that a

graph is chordal if and only if every minimal vertex separator is a clique. Fulkerson and

Gross [33] showed that a graph is chordal if and only if it admits a perfect elimination or-

dering. This was also observed by Rose [82]. Based on this definition, the first linear time

recognition algorithm for chordal graphs [81] was devised using lexicographic breadth-first

search. Later, Tarjan and Yannakakis [83] gave an even simpler recognition algorithm for

chordal graphs using maximum cardinality search.

Chordal graphs also have a beautiful characterization in intersection graph theory. In-

dependently, Buneman [14], Gavril [36] and Walter [86] proved that a graph is chordal if and

only if it is an intersection graph of subtrees of a tree. In fact, for every chordal graph, there

is a subtree representation of a clique tree, and such a clique tree can be found in linear

time. See the work of Hsu and Ma [50].

Not only do chordal graphs have rich characterizations and efficient recognition algo-

rithms, they also contain many interesting subclasses, such as interval graphs, split graphs

and k-trees. Many NP-hard problems also become easy for chordal graphs. For example,

the maximum independent set problem, the maximum clique problem and minimum vertex


colouring problem can all be solved in linear time for chordal graphs. Most notably, each of

these algorithms is a greedy algorithm utilizing a perfect elimination ordering of a chordal

graph. We explore this phenomenon in this chapter. For the purpose of illustration, we start

with two problems on interval graphs.

2.1.1 Interval Selection and Colouring

The interval selection problem and the interval colouring problem are two examples of prob-

lems that admit optimal greedy algorithms.

Problem 2.1.2 Given a set S of n intervals where each interval Ik is a half-open interval

(sk , fk ] with a start-time sk and a finish-time fk , the goal of the interval selection problem is

to find a non-overlapping subset of S with maximum size.

Figure 2.2: A set of eight intervals ordered by non-decreasing finish-time

AN OPTIMAL GREEDY ALGORITHM FOR INTERVAL SELECTION

1: Sort all intervals according to non-decreasing finish-time

2: for i = 1, . . . ,n do

3: Select the i th interval if it does not overlap with anything selected before

4: end for

The set of intervals selected by the above greedy algorithm is shown in red in Fig. 2.3.

Note that it is impossible to select more than four non-overlapping intervals for the input in

Fig. 2.2; hence the solution produced is optimal.


Figure 2.3: An optimal solution of interval selection on the input in Fig. 2.2
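The earliest-finish-time rule above can be sketched as follows; the interval data in the usage example is invented for illustration and is not the instance of Fig. 2.2.

```python
def select_intervals(intervals):
    """Greedy interval selection: scan intervals in non-decreasing
    finish-time order, keeping each one that does not overlap any
    previously selected interval.  Intervals are half-open (s, f]."""
    chosen = []
    for s, f in sorted(intervals, key=lambda iv: iv[1]):
        # (s, f] and (s2, f2] are disjoint iff f2 <= s or f <= s2
        if all(f2 <= s or f <= s2 for s2, f2 in chosen):
            chosen.append((s, f))
    return chosen
```

For instance, `select_intervals([(0, 2), (1, 3), (3, 5), (4, 7), (6, 8)])` returns `[(0, 2), (3, 5), (6, 8)]`, an optimal selection of size three.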

Problem 2.1.3 Given a set S of n intervals where each interval Ik is a half-open interval

(sk , fk ] with a start-time sk and a finish-time fk , the goal of the interval colouring problem

is to assign a colour to each interval such that any two intervals with the same colour do not

overlap.

AN OPTIMAL GREEDY ALGORITHM FOR INTERVAL COLOURING

1: Sort all intervals according to non-decreasing start-time

2: for i = 1, . . . ,n do

3: Colour the i th interval with the first available colour j not used by any interval over-

lapping with the i th interval

4: end for

Figure 2.4: An optimal solution of interval colouring on the input in Fig. 2.2

The colour assigned to each interval by the above greedy algorithm is shown in Fig. 2.4. Observe that intervals 1, 2 and 3 pairwise overlap, so it is impossible to use fewer than three colours for the input in Fig. 2.2; hence the solution produced is optimal.
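The first-fit colouring rule can be sketched similarly; again the example data is illustrative only, and distinct intervals are assumed since intervals serve as dictionary keys here.

```python
def colour_intervals(intervals):
    """First-fit interval colouring: scan intervals in non-decreasing
    start-time order, assigning each the smallest colour (1, 2, ...)
    not used by an overlapping, already-coloured interval."""
    colouring = {}
    for s, f in sorted(intervals):
        # colours already taken by overlapping intervals (half-open test)
        used = {c for (s2, f2), c in colouring.items()
                if s < f2 and s2 < f}
        colour = 1
        while colour in used:
            colour += 1
        colouring[(s, f)] = colour
    return colouring
```

On `[(0, 4), (1, 5), (2, 6), (5, 8)]` the first three intervals pairwise overlap and receive colours 1, 2 and 3, while (5, 8] reuses colour 1; three colours is optimal, since the first three intervals form a clique in the interval graph.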

Note that the first algorithm utilizes a non-decreasing order of finish-time. The sec-

ond algorithm utilizes a non-decreasing order of start-time, which is equivalent to a non-

increasing order of finish-time. A natural question to ask is what property of such orderings


allows both greedy algorithms to be optimal, and to what extent this can be generalized.

2.1.2 Perfect Elimination Ordering

A key property of the ordering of intervals with non-decreasing finish-time is that for any

particular interval Ik = (sk , fk ], all its overlapping intervals appearing later in the ordering

also overlap with each other. More specifically, all of them must contain the time point fk .

The interval graph of a set of intervals is obtained by viewing each interval as a vertex, and

drawing an edge between two vertices if and only if the two intervals overlap. The perfect

elimination ordering is the generalization of the non-decreasing finish-time ordering of in-

tervals to the graph theoretical setting.

Definition 2.1.4 Given a graph G = (V ,E), a perfect elimination ordering is an ordering of

vertices such that for each vertex v, the neighbours of v that occur after v in the ordering form

a clique.

Figure 2.5: The interval graph of Fig. 2.2

Figure 2.5 shows the graph representation of Fig. 2.2. In particular, the vertices of the

graph are labeled according to the labelling of intervals in Fig. 2.2. The ordering of these

labels gives a perfect elimination ordering. In general, a perfect elimination ordering char-

acterizes the class of chordal graphs.

Theorem 2.1.5 [33, 82] A graph is a chordal graph if and only if it admits a perfect elimina-

tion ordering.
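Definition 2.1.4 can be checked mechanically; a minimal sketch, with graphs encoded as adjacency sets (an assumption of the sketch):

```python
def is_peo(adj, order):
    """Return True iff `order` is a perfect elimination ordering of the
    graph given by adjacency sets `adj`: for each vertex v, the
    neighbours of v occurring after v in `order` must form a clique."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if pos[u] > pos[v]]
        # every pair of later neighbours must be adjacent
        if any(u2 not in adj[u1]
               for i, u1 in enumerate(later) for u2 in later[i + 1:]):
            return False
    return True
```

Consistent with Theorem 2.1.5, a triangle with a pendant vertex (a chordal graph) admits a perfect elimination ordering, while no ordering of the chordless 4-cycle is one.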


Many NP-hard optimization problems can be solved exactly, or approximated well, on chordal graphs because of the existence of a perfect elimination ordering. In the subsequent subsections, we generalize perfect elimination orderings, thereby extending chordal graphs to more general graph classes. We show later in this chapter that the problems mentioned above admit good approximation algorithms on these natural extensions.

2.1.3 Extending Perfect Elimination Orderings

In the definition of a perfect elimination ordering, we call the subgraph induced by the

neighbours of v that occur after v in the ordering the inductive neighbourhood of v with

respect to the given ordering. The perfect elimination property states that for every v in the given ordering, the size of an MIS of the inductive neighbourhood of v is one. A natural extension is to relax this MIS size to a general parameter k.

Definition 2.1.6 [2, 89] Given a graph G = (V ,E), a k-independence ordering is an ordering

of vertices such that for each vertex v, the size of an MIS of the inductive neighbourhood of v

is at most k.

The minimum of such k over all possible orderings of vertices of a graph G is called the

inductive independence number¹ of that graph. We denote it by λ(G). This extension of a

perfect elimination ordering leads to a natural generalization of chordal graphs.

Definition 2.1.7 [2, 89] A graph G is inductive k-independent if λ(G) ≤ k.

Surprisingly, this extension seems to have only been relatively recently proposed in [2] and

not studied subsequently. It turns out to be a rich extension. We defer the discussion on

natural subclasses of this extension to Section 2.2.
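For tiny graphs, the parameter of Definition 2.1.6 can be evaluated by brute force. The following sketch (exponential time, illustration only, adjacency-set graph encoding assumed) computes the smallest k for which a given ordering is a k-independence ordering:

```python
from itertools import combinations

def mis_size(adj, verts):
    """Brute-force size of a maximum independent set of the subgraph
    induced by `verts` (exponential; fine for tiny examples)."""
    vs = list(verts)
    for r in range(len(vs), 0, -1):
        if any(all(v not in adj[u] for u, v in combinations(sub, 2))
               for sub in combinations(vs, r)):
            return r
    return 0

def ordering_independence(adj, order):
    """Largest MIS over all inductive neighbourhoods, i.e. the smallest
    k such that `order` is a k-independence ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((mis_size(adj, {u for u in adj[v] if pos[u] > pos[v]})
                for v in order), default=0)
```

On the 4-cycle, every ordering gives value 2 (the first vertex always keeps two non-adjacent neighbours), so λ(C4) = 2; a perfect elimination ordering of a chordal graph gives value 1.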

The way we extend perfect elimination orderings is to put a constraint on the inductive

neighbourhoods of all the vertices. In fact, similar concepts exist in the literature. For ex-

¹ Akcoglu et al. [2] refer to this as the directed local independence number.


ample, a graph is k-degenerate² [66] if every subgraph has a vertex of degree at most k. This

definition was extended to the weighted case in [54] and was referred to as weighted inductiveness. In [53] and more recently in [55], an inductive neighbourhood property based on

the size of an MCC is also studied. In the next subsection, we give a uniform view of graph

classes based on inductive neighbourhood properties.

2.1.4 Inductive and Universal Neighbourhood Properties and Their Related Graph Classes

We define our terminology first. Let G = (V ,E) be a graph of n vertices. If X ⊆ V , the sub-

graph of G induced by X is denoted by G[X ]. For a particular vertex vi ∈V , let d(vi ) denote

its degree and N (vi ) denote the set of neighbours of vi , excluding vi . Given an ordering of

vertices v1, v2, . . . , vn , we use Vi to denote the set of vertices {vi , . . . , vn}.

Let P be a graph property. It is closed on induced subgraphs if whenever P holds for a graph G, it also holds for any induced subgraph of G. A graph has an inductive neighbourhood property with respect to P if and only if there exists an ordering of vertices v1, v2, . . . , vn such that for any vi, P holds on G[N(vi) ∩ Vi]. The set of all graphs satisfying such an inductive neighbourhood property is denoted as G(P). Such an ordering of vertices is called an elimination ordering with respect to the property P. A graph has a universal neighbourhood property with respect to P if and only if for every vertex vi, P holds on G[N(vi)]. The set of all graphs satisfying such a universal neighbourhood property is denoted as Ḡ(P).

Proposition 2.1.8 If the property P is closed on induced subgraphs, then Ḡ(P) ⊆ G(P).

Proof: Let G be a graph with n vertices in the class Ḡ(P), and let v1, v2, . . . , vn be an arbitrary ordering of vertices. For any vertex vi, since the property P holds on G[N(vi)] and the property is closed on induced subgraphs, P holds on G[N(vi) ∩ Vi]. Therefore, the original ordering of vertices v1, v2, . . . , vn is an elimination ordering for G with respect to the property P, and hence G is in the class G(P).

Proposition 2.1.9 If the property P is closed on induced subgraphs, then for a graph G in

G(P ), any induced subgraph of G is also in G(P ).

Proof: We prove this by contradiction. Suppose the statement is false and let G be a graph

in G(P ) that contains an induced subgraph which is not in G(P ). Let G ′ be a minimum size

induced subgraph of G that is not in G(P ), and let N (v) denote the set of neighbours of v

in G and N ′(v) denote the set of neighbours of v in G ′. Note that for any vertex v in G ′, the

property P does not hold on G[N ′(v)]. Otherwise, deleting v from G ′ will create a smaller

induced subgraph of G that is not in G(P ). Let v1, v2, . . . , vn be an elimination ordering of

vertices in G with respect to the property P , and let vi be the first vertex in the ordering

that appears in G ′. Clearly, the property P holds on G[N (vi )∩Vi ], but does not hold on

G[N′(vi)]. Since N′(vi) ⊆ N(vi) ∩ Vi, G[N′(vi)] is an induced subgraph of G[N(vi) ∩ Vi], so P also holds on G[N′(vi)]; this is a contradiction.

Theorem 2.1.10 If the property P can be tested in O(p(n)) time, then a graph in Ḡ(P) can be recognized in O(np(n)) time.

Proof: The graph G is in Ḡ(P) if and only if the property P holds on G[N(v)] for every vertex v in G. Since the property P can be tested in O(p(n)) time, we can test it for all the vertices to determine if G is in Ḡ(P).

Theorem 2.1.11 If the property P is closed on induced subgraphs, and the property P can

be tested in O(p(n)) time, then a graph in G(P) can be recognized in O(n²p(n)) time by the following algorithm. Furthermore, letting Q be a queue, an elimination ordering with respect to the property P can be constructed and stored in Q.

RECURSIVE_TEST(G ,P )


1: if G is empty then

2: Return TRUE

3: end if

4: for all v ∈V do

5: if P holds on G[N (v)] then

6: Enqueue v to Q

7: Return RECURSIVE_TEST(G[V \ {v}],P )

8: end if

9: end for

10: Return FALSE

Proof: For a given graph G , if the above algorithm returns TRUE, then the ordering of ver-

tices given by Q is an elimination ordering with respect to property P . Therefore G is in

G(P ).

We prove the other direction by contradiction. Suppose the above algorithm fails to recognize some graph in G(P), and let G be a minimum counter-example; that is, G is in G(P) but RECURSIVE_TEST on (G, P) returns FALSE. There are two cases:

1. It returns FALSE because for every vertex v in the graph G , the property P does not

hold on G[N (v)]. Then G is not in G(P ), which is a contradiction.

2. It returns FALSE because the recursive call on (G[V \ {v}],P ) returns FALSE. Then since

the property P is closed on induced subgraphs, by Proposition 2.1.9, G[V \ {v}] is in

G(P ). Hence, G[V \ {v}] is a smaller counter-example. This is also a contradiction.

Note that each recursive call requires checking the remaining vertices of the graph (in the worst case), and reduces the number of vertices of the graph by one. Therefore, the overall running time of recognizing a graph in G(P) is O(n²) times the time to test the property P. Note that if G is in G(P), then the ordering of vertices in queue Q provides a certificate.


If G is not in G(P ), then the remaining graph at the termination of the RECURSIVE_TEST

provides a certificate on why G is not in G(P ).
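RECURSIVE_TEST can be sketched in Python with the property P supplied as a predicate on the current adjacency structure and a vertex set; the iterative loop below is equivalent to the recursion. This is an illustrative sketch under an adjacency-set graph encoding, not the thesis's implementation. With P = "is a clique", it computes a perfect elimination ordering and thus recognizes chordal graphs.

```python
from itertools import combinations

def recursive_test(adj, P):
    """Elimination test for membership in G(P): repeatedly find a
    vertex v such that P holds on its current neighbourhood, delete v,
    and enqueue it.  Returns the elimination ordering (the queue Q of
    the pseudocode) if one exists, and None otherwise."""
    g = {v: set(nbrs) for v, nbrs in adj.items()}
    Q = []
    while g:
        for v in list(g):
            if P(g, g[v]):
                Q.append(v)
                for u in g[v]:
                    g[u].discard(v)
                del g[v]
                break
        else:
            return None  # every remaining vertex fails P: not in G(P)
    return Q

def is_clique(g, S):
    """Property P for chordal graphs: the vertex set S is a clique."""
    return all(v in g[u] for u, v in combinations(S, 2))
```

Each pass over the remaining vertices either eliminates one vertex or fails, matching the O(n²p(n)) bound of the theorem.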

In this chapter, we focus on graphs with their inductive and universal neighbourhoods

satisfying the following two graph properties:

1. MCC ≤ k: the size of a minimum clique cover is no more than k. We denote the inductive and universal classes of such graphs by G(CCk) and Ḡ(CCk), respectively.

2. MIS ≤ k: the size of a maximum independent set is no more than k. We denote the inductive and universal classes of such graphs by G(ISk) and Ḡ(ISk), respectively.

Note that both properties are closed on induced subgraphs, and by Proposition 2.1.8, we have Ḡ(CCk) ⊆ G(CCk) and Ḡ(ISk) ⊆ G(ISk). It is not difficult to show the inclusion is proper for every positive integer k.

Theorem 2.1.12 For any positive integer k, Ḡ(CCk) ⊂ G(CCk) and Ḡ(ISk) ⊂ G(ISk).

Proof: Consider the complete bipartite graph Kk,k+1. Eliminating the side of size k + 1 first gives an ordering in which every inductive neighbourhood is an independent set of size at most k, so Kk,k+1 is in G(CCk) and G(ISk). However, any vertex on the side of size k has the entire side of size k + 1, an independent set of size k + 1, as its neighbourhood, so Kk,k+1 is not in Ḡ(CCk) or Ḡ(ISk).
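For k = 2, the claims made about K_{k,k+1} in this proof can be verified by brute force; the snippet below is purely illustrative.

```python
from itertools import combinations

def mis(adj, verts):
    # brute-force MIS size of the subgraph induced by `verts`
    vs = list(verts)
    return max(r for r in range(len(vs) + 1)
               if any(all(v not in adj[u] for u, v in combinations(sub, 2))
                      for sub in combinations(vs, r)))

# K_{2,3} with sides {a1, a2} and {b1, b2, b3}
side_a, side_b = {'a1', 'a2'}, {'b1', 'b2', 'b3'}
adj = {v: (side_b if v in side_a else side_a) for v in side_a | side_b}

# universally: each a-side vertex sees an independent set of size 3 > 2
assert max(mis(adj, adj[v]) for v in adj) == 3

# inductively: eliminating the larger side first keeps every inductive
# neighbourhood's MIS at most 2
order = ['b1', 'b2', 'b3', 'a1', 'a2']
pos = {v: i for i, v in enumerate(order)}
assert all(mis(adj, {u for u in adj[v] if pos[u] > pos[v]}) <= 2
           for v in order)
```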

Furthermore, as MIS ≤ k is a weaker property³ than MCC ≤ k, we have G(CCk) ⊆ G(ISk) and Ḡ(CCk) ⊆ Ḡ(ISk).

Theorem 2.1.13 For any positive integer k > 1, G(CCk) ⊂ G(ISk) and Ḡ(CCk) ⊂ Ḡ(ISk). For k = 1, G(CCk) = G(ISk) and Ḡ(CCk) = Ḡ(ISk).

Proof: We give an explicit construction to show separations between G(CCk) and G(ISk), and between Ḡ(CCk) and Ḡ(ISk).

length 2k +1. We then connect every vertex in the first cycle to every vertex in the second

cycle. See Fig. 2.6 for the case when k = 2. By symmetry, the induced subgraph G[N (v)] for

³ In fact, the gap between the size of an MIS and the size of an MCC can be arbitrarily large.


Figure 2.6: A graph in G(IS2) and Ḡ(IS2) but not in G(CC2) or Ḡ(CC2)

any v is an odd cycle C2k+1 plus two independent vertices, each connected to every vertex in the cycle. It is not difficult to see that the size of an MIS of the graph G[N(v)] is k, while the size of an MCC of the graph G[N(v)] is k + 1. Therefore, the construction is in Ḡ(ISk) and hence in G(ISk), but it is not in G(CCk) (the first vertex of any ordering has its entire neighbourhood as its inductive neighbourhood) and hence not in Ḡ(CCk).

This gives us four families of graphs having rich and interesting properties. Note that

G(CC1) = G(IS1) is exactly the class of chordal graphs, using the characterization in terms of

admitting a perfect elimination ordering. In the following section, we give natural examples

of graphs contained in these families.

2.2 Natural Subclasses of the Four Families

The four families we have defined in the previous section, G(CCk), G(ISk), Ḡ(CCk) and Ḡ(ISk), have many interesting subclasses. In this section, we give some natural examples of graphs in these families with small constant parameters.

2.2.1 Graphs induced by the Job Interval Selection Problem

In the job interval selection problem, we are given a set of jobs. Each job is a set of half-open

intervals on the real line. To schedule a job, exactly one of these intervals must be selected.


To schedule several jobs, the intervals selected for the jobs must not overlap. The objective

is to schedule as many jobs as possible under these constraints.

We can view the job interval selection problem as an MIS on a special input graph as

follows: The vertices of the graphs are intervals. Two vertices are adjacent if and only if the

corresponding intervals overlap or they belong to the same job. Fig. 2.7a shows an input

instance of the job interval selection problem. Each job is denoted by a particular colour;

the set of intervals associated with that job is coloured using that particular colour. The in-

tervals are labeled by non-decreasing finish-time. Fig. 2.7b shows the corresponding graph

of the input instance in Fig. 2.7a. Graphs induced by the job interval selection problem are

also know as strip graphs in [42]. They are a special type of 2-interval graphs.

Figure 2.7: An example of the job interval selection problem. (a) An input instance. (b) The corresponding input graph.

Observation 2.2.1 Any graph induced by the job interval selection problem is in G(CC2).

Proof: We order intervals by non-decreasing finish-time. We examine this ordering with re-

spect to the input graph. For ease of explanation, we do not distinguish between an interval

and its corresponding vertex. For any particular vertex v , we can partition the neighbours

of v appearing later in the ordering into two groups: those containing the finish-time of v

and those that belong to the same job as v . If a neighbour of v satisfies both conditions, it

does not matter which group we classify it into. Observe that within each group, any pair of

vertices are adjacent. In other words, the inductive neighbourhood of v can be covered by

two cliques. Therefore, any input graph of a job interval selection problem is in G(CC2).
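The two-group clique cover in this proof can be checked on a small invented instance (the jobs and intervals below are hypothetical, not from the thesis):

```python
from itertools import combinations

# hypothetical instance: job -> list of half-open intervals (s, f]
jobs = {'red': [(0, 2), (5, 7)], 'blue': [(1, 3), (4, 6)], 'green': [(2, 5)]}

# vertices are (job, interval); edges join overlapping intervals or
# intervals belonging to the same job
verts = [(j, iv) for j, ivs in jobs.items() for iv in ivs]

def adjacent(u, v):
    (j1, (s1, f1)), (j2, (s2, f2)) = u, v
    return u != v and (j1 == j2 or (s1 < f2 and s2 < f1))

# order by finish-time; the later neighbours of each v split into two
# cliques: intervals containing f_v, and intervals of the same job
order = sorted(verts, key=lambda v: v[1][1])
pos = {v: i for i, v in enumerate(order)}
for v in order:
    later = [u for u in order if pos[u] > pos[v] and adjacent(u, v)]
    g1 = [u for u in later if u[1][0] < v[1][1] <= u[1][1]]  # contain f_v
    g2 = [u for u in later if u[0] == v[0]]                  # same job
    assert set(later) <= set(g1) | set(g2)                   # cover
    for grp in (g1, g2):                                     # cliques
        assert all(adjacent(a, b) for a, b in combinations(grp, 2))
```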


2.2.2 Planar Graphs

Planar graphs are well-studied objects in the literature not only because of their numerous

applications but also due to the existence of many interesting properties. In this section, we

present a nice property of planar graphs in terms of their inductive neighbourhoods.

Theorem 2.2.2 Any planar graph is in G(CC3).

Proof: We proceed by contradiction. Suppose there are planar graphs which are not in G(CC3), and let G = (V, E) be a minimum size counter-example, so for any vertex v in G, the size of an MCC of G[N(v)] is greater than three; this forbids G from having vertices of degree three or less. We examine a planar embedding of G; in the sequel, we do not distinguish G from this planar embedding. A vertex-edge pair is a pair (v, e) such that edge e is incident to v. We count the number of faces, edges and vertices by first charging them to vertex-edge pairs, and then summing up the charges.

Let v be a vertex and d(v) be the degree of v. Let e be an edge incident to v; then the vertex-charge from v to (v, e) is 1/d(v), and the edge-charge from e to (v, e) is 1/2. For a face f, the number of boundary edges of f is called the degree of the face, denoted as d(f). A pair (v, e) is incident to a face if e is a boundary edge of that face. The face-charge from f to an incident pair (v, e) is 1/(2d(f)). Note that these charges are carefully chosen such that if we sum

up vertex-charges over all vertex-edge pairs, the sum is the number of vertices. If we sum up

edge-charges over all vertex-edge pairs, the sum is the number of edges. If we sum up face-

charges over all vertex-edge pairs, the sum is the number of faces. We provide an upper

bound on the number of faces and derive a contradiction using the Euler characteristic.

Depending on the degree of v of each vertex-edge pair (v,e), we have three cases:

1. The degree of v is four. Let A be the set of such vertices. Since the size of an MCC of G[N(v)] is greater than three, no two of the neighbours of v in G can be adjacent. Therefore, for each edge incident to v, the face to its left has degree at least four and so does the face to its right (note that they might be the same face). The face-charge of each such vertex-edge pair is at most 1/4. If we sum up all face-charges of vertex-edge pairs containing v, it is at most one.

2. The degree of v is five. Let B be the set of such vertices. Since the size of an MCC of G[N(v)] is greater than three, there are at most two triangular faces incident to v in G. We break it down further into three cases:

(a) If there is no triangular face incident to v in G, we denote the set of such vertices as B1 (see Fig. 2.8). Using a similar argument as above, the face-charge of each such vertex-edge pair is at most 1/4. If we sum up all face-charges of vertex-edge pairs containing v, it is at most 5/4.

Figure 2.8: No triangular face

(b) If there is exactly one triangular face incident to v in G, we denote the set of such vertices as B2 (see Fig. 2.9). Observe that the face-charge of each vertex-edge pair involved in the triangular face is at most 7/24. If we sum up all face-charges of vertex-edge pairs containing v, it is at most 4/3.

(c) If there are exactly two triangular faces incident to v in G, we denote the set of such vertices as B3 (see Fig. 2.10). Note that these two triangular faces must be adjacent, for otherwise the size of an MCC of G[N(v)] is no more than three. Using a similar argument, if we sum up all face-charges of vertex-edge pairs containing v, it is at most 17/12.

Figure 2.9: One triangular face

Figure 2.10: Two adjacent triangular faces

3. The degree of v is more than five. Let C be the set of such vertices. The face-charge of each such vertex-edge pair is at most 1/3. If we sum up all face-charges of vertex-edge pairs containing v, it is at most (1/3)d(v).

Let F denote the set of faces. We count the total number of vertices and edges, and provide an upper bound on the total number of faces, by summing up vertex-charges, edge-charges and face-charges, respectively, over all vertex-edge pairs. The total number of vertices is

|V| = |A| + |B| + |C|.

The total number of edges is

|E| = (1/2)[4|A| + 5|B| + ∑_{v∈C} d(v)] = 2|A| + (5/2)|B| + (1/2)∑_{v∈C} d(v).


For the total number of faces, we have

|F| ≤ |A| + (5/4)|B1| + (4/3)|B2| + (17/12)|B3| + (1/3)∑_{v∈C} d(v).

Therefore, bounding the Euler characteristic:

|V| − |E| + |F| ≤ |C| − (1/4)|B1| − (1/6)|B2| − (1/12)|B3| − (1/6)∑_{v∈C} d(v) ≤ |C| − (1/6)∑_{v∈C} d(v).

Since d(v) ≥ 6 for any (v, e) ∈ C, we have

|V| − |E| + |F| ≤ |C| − (1/6)∑_{v∈C} d(v) ≤ 0.

This contradicts the fact that G is planar, since every connected planar graph has Euler characteristic 2. Therefore any planar graph is in G(CC3).

2.2.3 Disk and Unit Disk Graphs, Intersection Graphs of Convex Shapes

Geometric intersection graphs play an important role in many applications. For example,

interval graphs in scheduling, disk graphs and unit disk graphs in wireless communication,

and intersecting rectangles in layout problems and bioinformatics. Due to geometric con-

straints, these graphs have many interesting properties. In this subsection, we study the

relationship between disk graphs, unit disk graphs, translates of a convex shape and the four families we have defined in the previous section: G(CCk), G(ISk), Ĝ(CCk) and Ĝ(ISk).

We restrict our attention to the two dimensional plane; each geometric object is a closed

set in R2. We define classes of geometric intersection graphs as follows. We are given a set of

geometric objects in the plane; these are the vertices of the intersection graph. Two vertices

are adjacent if and only if the two objects overlap; i.e., have a non-empty intersection. We

first consider disk graphs and unit disk graphs where the objects are (respectively) disks of

arbitrary radius and disks of fixed radius. Figure 2.11 shows a set of disks on the plane and

the corresponding disk graph. There are two important geometric properties of disks.


(a) A set of disks on the plane (b) The corresponding disk graph

Figure 2.11: An example of a disk graph

Observation 2.2.3 Given a set of disks on the plane and a particular disk s, the set of over-

lapping disks of s no smaller than s can be partitioned into six (possibly empty) subsets, such

that within each subset, any two disks overlap.

Proof: For a given disk s, we construct the partition into six subsets explicitly. In particular,

take the centre of s as the origin and partition the plane into six regions with an equal angle

of 60°; see Fig. 2.12 below. To be precise, each region includes its left boundary and excludes its right boundary.

Figure 2.12: Partition the plane into six regions

For any two disks no smaller than s, if they overlap with s and their centres lie in the same region, then they must overlap with each other due to geometric

constraints. In fact, this is true even if their centres lie on adjacent boundaries. Therefore,

we can partition the set of overlapping disks of s no smaller than s into six subsets based on

which region the centre of the overlapping disk lies in.
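The sector partition above is easy to make concrete. Below is a minimal Python sketch (the function name and the representation of a disk as an (x, y, r) triple are our own illustrative choices): each disk overlapping s and no smaller than s is assigned to one of six 60° sectors according to the angle of its centre.

```python
import math

def six_sector_partition(s, disks):
    """Partition the disks overlapping s that are no smaller than s into
    six groups by the 60-degree sector containing each disk's centre,
    as in Observation 2.2.3.  A disk is an (x, y, r) triple."""
    sx, sy, sr = s
    groups = [[] for _ in range(6)]
    for (x, y, r) in disks:
        if r < sr:
            continue                      # only disks no smaller than s
        if math.hypot(x - sx, y - sy) > sr + r:
            continue                      # does not overlap s
        angle = math.atan2(y - sy, x - sx) % (2 * math.pi)
        groups[int(angle // (math.pi / 3)) % 6].append((x, y, r))
    return groups
```

Within each group any two disks overlap, so each group induces a clique in the disk graph.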


Observation 2.2.4 Given a set of disks on the plane and a particular disk s, there are at most

five overlapping disks of s no smaller than s such that no two of them overlap.

Proof: We prove this by contradiction. Suppose there are six overlapping disks of s no

smaller than s such that no two of them overlap. By the pigeonhole principle, there are at

least two disks whose angle with respect to the centre of s is less or equal to 60◦. Therefore,

these two disks must overlap, which is a contradiction.

Theorem 2.2.5 Any disk graph is in G(IS5) and G(CC6).

Proof: This is immediate from Observations 2.2.3 and 2.2.4 if we order the set of disks by non-decreasing size.

Corollary 2.2.6 Any unit disk graph is in G(IS5) and G(CC6).

The properties of disk graphs and unit disk graphs inspire us to study more general ge-

ometric shapes: convex shapes. Here we focus on intersection graphs of uniform and non-

uniform translates of a convex shape. For a given shape, a uniform translate is the same

shape with the same size and orientation but a possibly different location. A non-uniform

translate is the same shape with the same orientation but a possibly different size and loca-

tion. It is clear that disk graphs are examples of intersection graphs of non-uniform trans-

lates and unit disk graphs are examples of intersection graphs of uniform translates.

In [59], Kim, Kostochka and Nakprasit proved that for an intersection graph of uniform

translates of a convex shape, if it has a clique number k then it is (3k −3)-degenerate. Al-

though the statement of that result is not immediately applicable, the proof shows that the

neighbourhood of the topmost object can be covered by three cliques. This leads to the

following theorem.

Theorem 2.2.7 [59] Any intersection graph of uniform translates of a convex shape is in

G(CC3).


A weaker version of Theorem 2.2.7, which is independently proved using a geometric argument, can be found in [89]. Similar to Observations 2.2.3 and 2.2.4, we can extend Theorem 2.2.5 and Corollary 2.2.6 to the following theorems.

Theorem 2.2.8 [89] Any intersection graph of uniform translates of a convex shape is in

G(I S5) and G(CC6).

Theorem 2.2.9 [89] Any intersection graph of non-uniform translates of a convex shape is in

G(I S5) and G(CC6).

2.2.4 More Subclasses

There are three more subclasses we want to mention: d-claw-free graphs, k-degenerate

graphs, and circular-arc graphs.

Definition 2.2.10 A graph is d-claw-free if every vertex has less than d independent neigh-

bours.

Note that the class of d-claw-free graphs is exactly the class G(I Sd−1). There are many

interesting subclasses of d-claw-free graphs; we give two examples.

1. Line Graphs: Given a graph G, its line graph L(G) is a graph such that each vertex of L(G) is an edge of G, and two vertices of L(G) are adjacent if and only if their corresponding edges share a vertex in G. Observe that line graphs are in G(CC2): any edge has two end vertices, and every other edge incident to it must share one of them, so the neighbours of any vertex in L(G) are covered by two cliques.

2. Intersection Graphs of k-Sets: Given a universe U of n elements and m subsets of U, each with at most k elements, its intersection graph is a graph such that each vertex is a subset, and two vertices are adjacent if and only if the two subsets have non-empty intersection. Observe that intersection graphs of k-sets are in G(CCk).
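The clique-cover argument for k-sets can be made explicit: all sets containing a fixed element pairwise intersect and hence form a clique, so the neighbours of a set S are covered by at most |S| ≤ k cliques. A small illustrative sketch (the function name is ours):

```python
def neighbour_clique_cover(sets, i):
    """For the intersection graph of the given list of sets, cover the
    neighbours of sets[i] with at most |sets[i]| <= k cliques: one
    clique per element of sets[i], since all sets sharing a fixed
    element pairwise intersect."""
    cliques = {e: [] for e in sets[i]}
    for j, t in enumerate(sets):
        if j == i or not (sets[i] & t):
            continue                     # not a neighbour of sets[i]
        e = next(iter(sets[i] & t))      # any element shared with sets[i]
        cliques[e].append(j)
    return cliques
```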


Definition 2.2.11 A graph is k-degenerate if every subgraph has a vertex of degree at most k.

A very general subclass of k-degenerate graphs is the class of graphs with treewidth at most

k. We give the definition of graphs with treewidth at most k in terms of k-trees. A k-tree can

be formed by starting with a clique of size k and then repeatedly adding vertices in such a

way that each added vertex has exactly k neighbours which form a clique. The graphs that

have treewidth at most k are exactly the subgraphs of k-trees, and for this reason they are

also called partial k-trees.

Graphs with treewidth at most k are quite general and useful. This class includes rich subclasses even for small values of k. For example, series-parallel graphs and outerplanar graphs are graphs with treewidth at most 2. Many graph problems, to be more precise, all problems

definable in monadic second-order logic, can be solved in polynomial time using dynamic

programming for graphs with bounded treewidth [21].

Definition 2.2.12 A graph is a circular-arc graph if it is the intersection graph of arcs of a

circle.

Given a set of arcs of a circle, the vertices of a circular-arc graph are these arcs, and two vertices are adjacent if and only if the two corresponding arcs overlap. See Fig. 2.13 for an example.

(a) A set of arcs of a circle (b) The corresponding circular-arc graph

Figure 2.13: An example of a circular-arc graph

Given a circular-arc graph, the length of an arc is the corresponding angle if we connect both end-points

to the centre of the circle. Consider the arc c with the smallest arc length. Every arc intersecting c contains either the left end-point of c or the right end-point of c, and arcs sharing a common point form a clique. Therefore, circular-arc graphs are in the class G(CC2).
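The argument above can be turned into a small sketch, assuming arcs are given as (start angle, length) pairs in radians (this representation and the function names are our own): the neighbours of the shortest arc c split into the arcs through c's left endpoint and those through its right endpoint, and each group forms a clique.

```python
import math

TWO_PI = 2 * math.pi

def contains(arc, point):
    """True if the arc (start, length), angles in radians, covers point."""
    start, length = arc
    return (point - start) % TWO_PI <= length

def two_clique_cover(arcs):
    """Cover the neighbours of the shortest arc c by two cliques: the
    arcs through c's left endpoint and the arcs through its right
    endpoint.  Any arc intersecting c but containing neither endpoint
    would lie inside c and hence be shorter, a contradiction."""
    ci = min(range(len(arcs)), key=lambda i: arcs[i][1])
    left = arcs[ci][0]
    right = (arcs[ci][0] + arcs[ci][1]) % TWO_PI
    left_clique, right_clique = [], []
    for i, a in enumerate(arcs):
        if i == ci:
            continue
        if contains(a, left):
            left_clique.append(i)
        elif contains(a, right):
            right_clique.append(i)
    return ci, left_clique, right_clique
```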


We have seen so far several natural examples of graphs in the families G(CCk), G(ISk), Ĝ(CCk) and Ĝ(ISk) for small values of k. In the next section, we study properties of the graph classes G(ISk) and G(CCk) when k is a small constant.

2.3 Properties of G(I Sk) and G(CCk)

First, we show that any graph in G(I Sk ) can be recognized in polynomial time using Theo-

rem 2.1.11.

Corollary 2.3.1 A k-independence ordering of a graph G in G(ISk) with n vertices can be constructed in O(k^2 n^{k+3}) time and linear space.

Proof: The property MIS ≤ k is closed under induced subgraphs and can be tested in time O(k^2 n^{k+1}). By Theorem 2.1.11, a graph in G(ISk) can be recognized in O(k^2 n^{k+3}) time and space linear in the size of the graph.

By an observation of Itai and Rodeh [52] and results in [28], we can improve this running time to O(n^{4.376}), O(n^{5.334}) and O(n^{6.220}) for k = 2, 3 and 4 respectively. If we allow an algorithm to use O(n^{k+2}) space, we can further improve the running time of the recognition algorithm.

Proposition 2.3.2 A k-independence ordering of a graph G in G(ISk) with n vertices can be constructed in O(k^2 n^{k+2}) time and O(n^{k+2}) space.

Proof: Given a graph G , we build a bipartite graph H = (A,B) in the following way. We

construct a vertex-node (a node in A) for each vertex in G and a subset-node (a node in B)

for each subset of size k + 1 in G . We connect a vertex-node to a subset-node with a red

edge if the vertex of the vertex-node is adjacent to all vertices in the subset-node and these

vertices form an independent set. We connect a vertex-node to a subset-node with a black


edge if the vertex of the vertex-node is one of the vertices in the subset-node. See Fig. 2.14 for an example of the construction for k = 1.

(a) A graph G (b) The corresponding graph H

Figure 2.14: An example of the construction for k = 1

Observe that constructing such a graph H takes O(k^2 n^{k+2}) time and O(n^{k+2}) space.

Once H is constructed, we construct an ordering of vertices of G in the following way. At

each step, we look for a vertex-node in A that is not incident to any red edge. The vertex

of such a vertex-node is then the next vertex added to the ordering. We then delete such a

vertex-node in A and all its neighbours in B together with all incident (black and red) edges,

and continue. If finally there are no vertices remaining in A, then we have constructed an

inductive k-independent ordering. Otherwise at a particular step, every vertex-node in A

has at least one red edge and we can conclude that G is not an inductive k-independent

graph.
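For intuition, here is a brute-force sketch of the generic recognition idea behind Theorem 2.1.11 and Proposition 2.3.2, without the space-time optimizations (the function name and the adjacency-dictionary representation are our own): repeatedly peel off a vertex whose remaining neighbourhood contains no independent set of size k + 1.

```python
from itertools import combinations

def k_independence_ordering(adj, k):
    """Try to build a k-independence ordering of the graph given as
    {vertex: set_of_neighbours}.  Returns an ordering, or None if the
    graph is not inductive k-independent.  Brute force: a vertex is
    removable if no (k+1)-subset of its remaining neighbourhood is
    independent."""
    adj = {v: set(ns) for v, ns in adj.items()}
    order = []
    while adj:
        for v in adj:
            nbrs = list(adj[v])
            independent = lambda sub: all(
                b not in adj[a] for a, b in combinations(sub, 2))
            if not any(independent(sub) for sub in combinations(nbrs, k + 1)):
                order.append(v)                 # v is removable
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                break
        else:
            return None     # every vertex has k+1 independent neighbours
    return order
```

A path is chordal (inductive 1-independent), while a 4-cycle is inductive 2-independent but not 1-independent; the sketch distinguishes these cases.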

It is known [26] that MIS is W[1]-complete for general graphs when it is parameterized by

the size of the maximum independent set. By a reduction from MIS, finding the inductive

independence number of a graph is also complete for W[1], hence it is unlikely to obtain

a fixed parameter tractable solution for recognizing a graph in G(ISk) for general k. But this does not exclude the possibility of improving the current time complexity for small fixed parameters, for example, when k = 2 or 3. It is interesting to note that recognizing a chordal

graph, i.e., a graph in G(IS1), can be done in linear time (in the number of edges), while the generic algorithm above runs in O(n^3) time. It seems quite possible to improve the running


time of the generic recognition algorithm; we leave this as an open question.

We are primarily interested in graphs with a small inductive independence number.

Graphs with a large inductive independence number are not interesting to us as they cannot

be recognized efficiently and do not provide good approximation bounds. We discuss the

latter in Section 2.4. In many specific cases, not only do we know a-priori that a graph has

a small inductive independence number, like those subclasses we discussed in Section 2.2,

but also a k-independence ordering with a desired k can be computed much more effi-

ciently than the time complexity bound provided by Proposition 2.3.2. We give several ob-

servations below which all follow immediately from the specific ordering discussed in Sec-

tion 2.2.

Observation 2.3.3 [89] A 2-independence ordering can be computed in O(n logn) time for

any input graph of the job interval selection problem with n intervals.

Observation 2.3.4 [89] A 3-independence ordering can be computed in O(n logn) time for

any intersection graph of n uniform translates of a convex shape.

Observation 2.3.5 [89] A 5-independence ordering can be computed in O(n logn) time for

any intersection graph of n non-uniform translates of a convex shape.

Next we bound the inductive independence number of a graph by the number of vertices

and edges in the graph.

Theorem 2.3.6 A graph G with n vertices and m edges has inductive independence number no more than

min{ ⌊n/2⌋, ⌊√m⌋, ⌊(√(1 + 4[n(n−1)/2 − m]) + 1)/2⌋ }.

Proof: Let λ(G) be the inductive independence number of G . We can then find an induced

subgraph H such that every vertex has at least λ(G) independent neighbours. Let v be a

vertex in H and u be one of its λ(G) independent neighbours. Note that u must again have

at least λ(G) independent neighbours. Furthermore, since u is not adjacent to any of the


other λ(G) − 1 independent neighbours of v, the two independent neighbour sets of u and v have to be disjoint; see Fig. 2.15 below.

Figure 2.15: A vertex v in H and one of its independent neighbours u

Therefore, the total number of vertices is at least 2λ(G), the total number of edges is at least λ(G)^2, and the total number of missed edges (non-edges) is at least 2·(λ(G) choose 2) = λ(G)(λ(G) − 1). Therefore, a graph G with n vertices and m edges has inductive independence number no more than

min{ ⌊n/2⌋, ⌊√m⌋, ⌊(√(1 + 4[n(n−1)/2 − m]) + 1)/2⌋ }.
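The bound of Theorem 2.3.6 is easy to evaluate directly; a minimal sketch (the function name is ours, and (n choose 2) is written as n(n−1)/2):

```python
import math

def inductive_independence_bound(n, m):
    """Upper bound on the inductive independence number of a graph
    with n vertices and m edges, following Theorem 2.3.6."""
    missed = n * (n - 1) // 2 - m        # number of non-edges
    return min(n // 2,
               math.isqrt(m),
               int((math.sqrt(1 + 4 * missed) + 1) / 2))
```

For the complete graph K6 (n = 6, m = 15) the bound evaluates to 1, matching the fact that complete graphs have inductive independence number 1.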

We now consider the class G(CCk). Unlike the property MIS ≤ k, the property MCC ≤ k is NP-hard to test for k > 2. Therefore, Theorem 2.1.11 no longer applies. However, the property MCC ≤ 2 can be tested in linear time for general graphs, as it amounts to testing bipartiteness (of the complement), which can be done in linear time by a greedy algorithm. Hence, the graph class G(CC2) can be recognized in polynomial time. By the RECURSIVE_TEST algorithm in Theorem 2.1.11, the following corollary is immediate.

Corollary 2.3.7 For any graph G in G(CC2) with n vertices and m edges, an elimination ordering with respect to the property MCC ≤ 2 can be constructed in O(mn^2) time and linear space.

2.4 Greedy Algorithms for G(I Sk) and G(CCk)

In this section, we focus on algorithmic aspects of the two families G(I Sk ) and G(CCk ). We

show that for several classic NP-hard problems, good approximation algorithms can be de-

veloped for graphs in these two classes; furthermore all these algorithms are greedy-like


algorithms. For simplicity, other than Subsection 2.4.4, we focus mainly on the unweighted

case of these problems. Note that the weighted case of maximum independent set requires

an extension to the stack algorithm [11].

Since G(CCk ) ⊆ G(I Sk ), we discuss algorithms for the class G(I Sk ) whenever possible

since a result for G(ISk) implies the same result for G(CCk) but not vice versa. A discussion of the graph class G(CC2) is given in Subsection 2.4.5, as by Corollary 2.3.7, graphs in

G(CC2) can be recognized in polynomial time.

For most algorithms, we are more concerned about their approximation ratios than their

precise time complexities. The running time of an algorithm for the graph class G(I Sk ) is

usually bounded by the running time for constructing a k-independence ordering. Never-

theless, for a fixed constant k, such running time is polynomial.

2.4.1 Maximum Independent Set

For general graphs, MIS is NP-hard and even NP-hard to approximate within a factor of n^{1−ε} for any constant ε > 0. However, for chordal graphs, MIS can be solved by a greedy algorithm

in polynomial time. We extend this result and show that a k-approximation for MIS can be

achieved on G(I Sk ).

A GREEDY ALGORITHM FOR MIS ON G(ISk)

1: Sort all vertices according to a k-independence ordering
2: for i = 1, . . . , n do
3: Select the i-th vertex if it is not adjacent to anything selected before
4: end for
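The pseudocode above is a single pass over the ordering; a minimal Python sketch (assuming adj maps each vertex to its neighbour set and order is a k-independence ordering):

```python
def greedy_mis(order, adj):
    """Greedy MIS: scan the vertices in a k-independence ordering and
    keep each vertex not adjacent to an already selected one."""
    selected = set()
    for v in order:
        if not (adj[v] & selected):
            selected.add(v)
    return selected
```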

Theorem 2.4.1 [2, 89] The above greedy algorithm achieves a k-approximation for MIS on

G(I Sk ).

Proof: We prove it using a charging argument. Let π be a k-independence ordering. Let O

be the optimal solution and A be the greedy solution. We order vertices in O and A accord-


ing to the k-independence ordering used by the algorithm; let v1, v2, . . . , vp and u1,u2, . . . ,uq

be the induced ordering of vertices in O and in A respectively according to π. We define the

following mapping from O to A : A vertex vi in O is mapped to the same vertex in A if vi

exists in A , or the first vertex in A that is adjacent to vi ; see Fig. 2.16 for an example. We

Figure 2.16: A mapping from O to A

observe the following properties of this mapping. First of all, every vertex in O maps to some

vertex in A ; for otherwise, A would include that vertex by the greedy selection rule. Fur-

thermore, no vertex in O maps to some vertex in A that appears later in the k-independence

ordering; for otherwise, A would include that vertex by the greedy selection rule. Since the

set of vertices that map to a particular vertex in A has to form an independent set, by the

definition of the k-independence ordering, there are at most k vertices in O that map to the

same vertex in A. Hence we can conclude that the size of O is at most k times the size of A. Therefore, the above algorithm is a k-approximation for MIS on G(ISk).

For the weighted case, a local ratio algorithm [2] can achieve the same approximation

ratio of k for graphs in G(ISk). We state the theorem below without a proof. This local ratio algorithm can be viewed as a two-pass greedy-like algorithm. A more general problem will

be discussed in detail in Subsection 2.4.4, and we will obtain a result that implies Theo-

rem 2.4.2.

Theorem 2.4.2 [2] There is a k-approximation local ratio algorithm for WMIS on G(I Sk ).


2.4.2 Minimum Vertex Colouring

The minimum vertex colouring problem is a well-studied NP-hard problem. For a graph

with n vertices, it is NP-hard to approximate the chromatic number within n^{1−ε} for any

fixed ε > 0. For chordal graphs, a greedy algorithm on the reverse of any perfect elimina-

tion ordering gives an optimal colouring. For graphs in G(I Sk ), the same greedy algorithm

achieves a k-approximation.

A GREEDY ALGORITHM FOR COL ON G(ISk)

1: Sort all vertices according to the reverse of a k-independence ordering
2: for i = 1, . . . , n do
3: Colour the i-th vertex with the first available colour not used by any of its neighbours
4: end for
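A minimal sketch of the above, assuming adj maps each vertex to its neighbour set and order is a k-independence ordering (the algorithm processes its reverse):

```python
def greedy_colouring(order, adj):
    """Greedy colouring: process vertices in the reverse of a
    k-independence ordering, giving each vertex the smallest colour
    not used on its already-coloured neighbours."""
    colour = {}
    for v in reversed(order):
        used = {colour[u] for u in adj[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour
```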

Theorem 2.4.3 The above greedy algorithm achieves a k-approximation for COL on G(I Sk ).

Proof: Let v1, v2, . . . , vn be a k-independence ordering, so the algorithm colours the vertices

according to the ordering vn, vn−1, . . . , v1. Let Vi = {vi, . . . , vn}; we prove by induction that the algorithm achieves a k-approximation for G[Vi] for all i from n to 1. The base case is clear,

since when i = n, G[Vn] is just a single vertex. Now we assume the statement holds for i > t ,

i.e., the number of colours ci used in the algorithm for G[Vi ] satisfies

ci ≤ k ·χ(G[Vi ]).

Now we consider i = t . There are three cases:

1. If ct = ct+1, then the statement holds trivially since

ct = ct+1 ≤ k ·χ(G[Vt+1]) ≤ k ·χ(G[Vt ]).

2. If χ(G[Vt ]) =χ(G[Vt+1])+1, then the statement also holds trivially since

ct ≤ ct+1 +1 ≤ k ·χ(G[Vt+1])+1 ≤ k(χ(G[Vt+1])+1) = k ·χ(G[Vt ]).


3. The only remaining case is when ct = ct+1 + 1, and χ(G[Vt ]) = χ(G[Vt+1]). Suppose

ct > k ·χ(G[Vt ]). Since we have to increase the number of colours, there exist ct+1

neighbours of vt , each having a different colour. These ct+1 neighbours together with

vt must be grouped into χ(G[Vt ]) colour classes in the optimal colouring. Therefore at

least one colour class in the optimal colouring will have at least (ct+1 + 1)/χ(G[Vt]) vertices from the set N(vt) ∩ Vt. Since

(ct+1 + 1)/χ(G[Vt]) = ct/χ(G[Vt]) > k,

we have one colour class containing more than k vertices from N(vt) ∩ Vt. This con-

tradicts the fact that v1, v2, . . . , vn is an inductive k-independent ordering.

This completes the induction; therefore the algorithm achieves a k-approximation for COL

on G(I Sk ).

2.4.3 Minimum Vertex Cover

The minimum vertex cover problem is one of the most celebrated problems in the area of

approximation algorithms, because there exist several simple 2-approximation algorithms,

yet for general graphs no known algorithm⁴ can achieve an approximation ratio better than

2−ε for any fixed ε> 0. The problem is NP-hard and NP-hard to approximate within a factor

of 1.36 [24]. In this subsection, we discuss approximation algorithms for MVC on G(I Sk ).

A graph is triangle-free if no three vertices in the graph form a triangle of edges. We first

discuss graphs in G(I Sk ) that are triangle-free.

MVC on Triangle-Free G(I Sk )

For a given vertex v, let N2(v) denote the set of vertices at distance two from v. We first prove the following lemma.

⁴In fact, the ratio 2 − ε for any fixed ε > 0 is not possible assuming the unique games conjecture; see [58].


Lemma 2.4.4 Given a triangle-free graph G in G(I Sk ), let v be a vertex of minimum degree,

then there is a matching of size d(v)−1 between N (v) and N2(v).

Proof: We colour vertices in N (v) red (big) and vertices in N2(v) blue (small). Let M be a

maximum matching between red and blue vertices; i.e., every edge in M has one red end

vertex and one blue end vertex, and each vertex in N (v) ∪ N2(v) occurs at most once in

M . Let R1 be the set of red vertices that participate in the matching, and R2 be the set of

remaining red vertices. Let B1 be the set of blue vertices that participate in the matching,

and B2 be the set of remaining blue vertices; see Fig. 2.17 below.

Figure 2.17: A maximum matching between N(v) and N2(v)

Note that no edge connects a vertex in R2 to a vertex in B2. Furthermore, since G is triangle-free, no edge connects

any two vertices in N (v). Suppose that |M | < d(v)−1, then R2 is non-empty. For any vertex

u ∈ R2, its neighbours are contained in the set B1∪{v}, therefore d(u) < d(v); this contradicts

the fact that v is a vertex of minimum degree. Therefore, |M | ≥ d(v)−1, and hence there is

a matching of size d(v)−1 between N (v) and N2(v).

We now consider the following greedy algorithm for a triangle-free, inductive k-independent graph G.

A GREEDY-LIKE ALGORITHM FOR MVC ON TRIANGLE-FREE G(I Sk )


1: C = ∅
2: while G is not empty do
3: Pick a vertex v with minimum degree
4: Let M be a matching of size d(v) − 1 between N(v) and N2(v)
5: Let u be a vertex in N(v) that is not in the matching M
6: Add u and the vertices matched by M to C, and remove them and their incident edges from G
7: Remove all isolated vertices from G
8: end while
9: Return C

Theorem 2.4.5 The above greedy-like algorithm achieves a (2 − 1/k)-approximation for MVC on triangle-free graphs in G(ISk).

Proof: At each step of the algorithm, let M′ = M ∪ {uv} and let S be the set of vertices added to the cover C. Observe that S covers all edges in M′ and has size 2d(v) − 1. Let S′ be a maximum size subset of vertices of M′ that covers M′ in an optimal solution. Note that |S′| ≥ d(v); furthermore, the set of edges in G covered by S′ is a subset of the edges in G covered by S. Since G is in G(ISk) and triangle-free, we have d(v) ≤ k. Therefore, the approximation ratio is at most (2d(v) − 1)/d(v) ≤ (2k − 1)/k = 2 − 1/k.

We can further improve the ratio in Theorem 2.4.5 to 2 − 2/(k + 1) using the following result of

Hochbaum [48], which is based on Nemhauser and Trotter’s decomposition scheme [72].

Theorem 2.4.6 [48] Let G be a weighted graph with n vertices and m edges. If it takes only s

steps to colour the vertices of G with c colours, then it takes only s + O(nm log n) steps to find a vertex cover whose weight is at most 2 − 2/c times the weight of an optimal vertex cover.

In order to use Theorem 2.4.6, we first prove the following two lemmas.

Lemma 2.4.7 If a graph in G(I Sk ) with n vertices is triangle-free, then a k-independence

ordering can be constructed in O(kn logn) time.


Proof: We store the set of vertices of the graph into a priority queue with updates, using

their degrees as their priorities. Since the graph is triangle-free, for any vertex v , N (v) is an

independent set. By Theorem 2.1.11, a k-independence ordering can be constructed at each

step by dequeuing the vertex of minimum degree and updating the degrees of its neigh-

bours in the priority queue. As there are at most k neighbours at each step, these updates take at most O(k log n) time. Therefore, a k-independence ordering can be constructed in O(kn log n) time.
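The priority-queue construction in the proof can be sketched as follows, using a binary heap with lazy deletion in place of a priority queue with updates (an implementation choice of ours):

```python
import heapq

def min_degree_ordering(adj):
    """Repeatedly remove a vertex of minimum degree, as in Lemma 2.4.7.
    For a triangle-free graph in G(IS_k) this yields a k-independence
    ordering, since every neighbourhood is an independent set.  Stale
    heap entries are skipped instead of being updated in place."""
    adj = {v: set(ns) for v, ns in adj.items()}
    heap = [(len(ns), v) for v, ns in adj.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        d, v = heapq.heappop(heap)
        if v not in adj or d != len(adj[v]):
            continue                       # stale entry, skip it
        order.append(v)
        for u in adj[v]:
            adj[u].discard(v)
            heapq.heappush(heap, (len(adj[u]), u))
        del adj[v]
    return order
```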

Lemma 2.4.8 If a graph in G(I Sk ) is triangle-free, then a simple greedy algorithm can provide

a valid colouring of its vertices using at most k +1 colours.

Proof: By Lemma 2.4.7, for a triangle-free graph in G(I Sk ), a k-independence ordering can

be constructed efficiently. Suppose we colour the vertices of the graph according to the

reverse of this ordering. Since the graph is triangle-free, whenever we colour a vertex v , at

most k neighbours of v are already coloured. Therefore, we would use at most k+1 colours.

By Theorem 2.4.6, Lemma 2.4.7 and Lemma 2.4.8, the following theorem is immediate.

Theorem 2.4.9 There is an O(mn log n) time algorithm that achieves a (2 − 2/(k + 1))-approximation for WMVC on triangle-free G(ISk).

Although Theorem 2.4.9 has a better approximation ratio than Theorem 2.4.5, its running time is somewhat worse than that of Theorem 2.4.5. Furthermore, the greedy algorithm of Theorem 2.4.5 is a much simpler combinatorial algorithm.

Note that Halperin's algorithm [45] can achieve a (2 − (1 − o(1))(2 ln ln k)/(ln k))-approximation for WMVC on triangle-free G(ISk), but it uses a more complicated semidefinite programming (SDP) relaxation of vertex cover.


MVC on G(I Sk )

We now discuss approximating MVC for all graphs in G(I Sk ); i.e., without the triangle-free

assumption. Note that for a given graph G = (V ,E), if S is an MIS of G then V \ S is an MVC

of G. For k = 1, the class G(IS1) is exactly the class of chordal graphs, and MVC can be solved optimally in polynomial time. For the remainder of this subsection, we assume k > 1. Note

that if a graph contains a triangle, adding all three vertices of the triangle to the cover introduces at most one extra vertex compared to an optimal cover, since any vertex cover must contain at least two vertices of every triangle. That means that if the approximation ratio we are aiming for is greater than 3/2, then we can remove a triangle, add its vertices to the cover and reduce to a smaller problem without sacrificing the approximation ratio of the algorithm. This leads to the following meta-algorithm for graphs in G(ISk).

A META-ALGORITHM FOR MVC ON G(ISk)

1: C = ∅
2: Remove all triangles from G and add their vertices to C
3: Let C′ be the cover returned by running on G an approximation algorithm for MVC on triangle-free G(ISk)
4: Return C ∪ C′
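A minimal Python sketch of this meta-algorithm (names and the brute-force triangle search are mine, for illustration only; `triangle_free_mvc` stands for any approximation algorithm for the triangle-free case, such as that of Theorem 2.4.5):

```python
def find_triangle(adj):
    """Return a triangle (u, v, w) if one exists, else None (brute force)."""
    for u in adj:
        for v in adj[u]:
            for w in adj[v]:
                if w != u and w in adj[u]:
                    return (u, v, w)
    return None

def remove_vertices(adj, vs):
    """Return a copy of the graph with the vertices in vs deleted."""
    vs = set(vs)
    return {u: {x for x in nbrs if x not in vs}
            for u, nbrs in adj.items() if u not in vs}

def meta_mvc(adj, triangle_free_mvc):
    """Remove triangles into the cover, then solve the triangle-free rest."""
    cover = set()
    tri = find_triangle(adj)
    while tri is not None:
        cover.update(tri)                 # add all three triangle vertices
        adj = remove_vertices(adj, tri)
        tri = find_triangle(adj)
    return cover | triangle_free_mvc(adj)
```

Replacing the brute-force search with a matrix-multiplication-based triangle detection yields the running time bounds discussed next.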

Note that removing all triangles can be done in matrix multiplication time O(n^ω) ≈ O(n^2.376), or in O(mn) time for sparse graphs. Combining this fact and the above meta-algorithm with Theorem 2.4.5 and Theorem 2.4.9, we have the following two theorems.

Theorem 2.4.10 For k > 1, there is a polynomial time algorithm that achieves a (2 − 1/k)-approximation for MVC on G(ISk). The algorithm runs in O(n^ω) ≈ O(n^2.376) time, or in O(mn) time for sparse graphs.

Theorem 2.4.11 For k > 2, there is an algorithm that runs in O(mn log n) time and achieves a (2 − 2/(k+1))-approximation for MVC on G(ISk).


For the weighted case, we can use a trick similar to the one used in [15] and obtain the following

result.

Theorem 2.4.12 For k > 2, there is a polynomial time algorithm that achieves a (2 − 2/(k+1))-approximation for WMVC on G(ISk).

Proof: This follows from the local ratio vertex cover algorithm of Bar-Yehuda and Even [8]

as we now explain. We do the following triangle weight decomposition. Consider a given

graph G in G(ISk) with weights on its vertices. If there is a triangle with positive weights on

all its three vertices, let wmin be the minimum weight of the three. Take out this triangle and

label these three vertices with wmin. Reduce the weight of the three vertices in the original

graph by wmin. Repeat the above until there is no triangle with positive weights on all its

vertices. An example of a triangle weight decomposition is shown in Fig. 2.18. After we

Figure 2.18: An example of a triangle weight decomposition of a graph. (a) The original graph; (b) a triangle weight decomposition.

have a triangle weight decomposition of a graph G, we obtain a resulting graph G_r together with a set of weighted triangles T_1, T_2, ..., T_p. We first take the vertices having weight 0 in G_r into C_1, and remove them from G_r; let the resulting graph be G′_r. It is not hard to see that G′_r is triangle-free. Furthermore, since G′_r is an induced subgraph of G, it is still in G(ISk). We then apply Theorem 2.4.6 to get a (2 − 2/(k+1))-approximation for G′_r; let C_2 be the resulting vertex cover of G′_r. Then C = C_1 ∪ C_2 is a vertex cover with approximation ratio (2 − 2/(k+1)) relative to the optimal vertex cover of G. To see this, we provide two observations:


1. The set C is a valid vertex cover for G. Suppose an edge e is not covered by C. Then e cannot be incident to a vertex of weight 0 in G_r; therefore, it must be an edge in G′_r. Since C_2 is a vertex cover for G′_r, C_2 covers e, and hence e is covered by C. This is a contradiction.

2. The total weight of C is no more than (2 − 2/(k+1)) times that of the optimal vertex cover of G. Let the weight of an optimal vertex cover of G′_r be w_0; then the weight of an optimal vertex cover of G_r is also w_0. Let C_opt be an optimal vertex cover of G. Then the weight of C_opt on G_r (taking the weights of G_r instead of G) is at least w_0. Since C_2 is a (2 − 2/(k+1))-approximation for G′_r, and the vertices in C_1 have weight 0, the weight of C on G_r is at most (2 − 2/(k+1))w_0. Now we add T_1, T_2, ..., T_p back to G_r successively, one at a time. Let w(T_i) be the weight of T_i. At each step i, the weight of C_opt on the resulting graph increases by at least (2/3)w(T_i), since at least two vertices of T_i are in C_opt, while the weight of C on the resulting graph increases by at most w(T_i). Therefore the final ratio between the weight of C and the weight of C_opt is at most

((2 − 2/(k+1))w_0 + ∑_{i=1}^p w(T_i)) / (w_0 + (2/3) ∑_{i=1}^p w(T_i)).

For k > 2, this ratio is at most 2 − 2/(k+1).

Therefore, the algorithm achieves a (2 − 2/(k+1))-approximation for WMVC on G(ISk).
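The triangle weight decomposition from the proof can be sketched as follows (an illustrative sketch with assumed helper names; the triangle search is brute force):

```python
def triangle_weight_decomposition(adj, weight):
    """Return (residual weights, list of (triangle, w_min)) for graph adj.

    Repeatedly find a triangle whose three vertices all have positive
    weight, record it together with the minimum weight among the three,
    and subtract that weight from all three vertices.
    """
    w = dict(weight)
    triangles = []
    while True:
        tri = next(((u, v, x)
                    for u in adj if w[u] > 0
                    for v in adj[u] if w[v] > 0
                    for x in adj[v] if x != u and x in adj[u] and w[x] > 0),
                   None)
        if tri is None:
            return w, triangles
        wmin = min(w[t] for t in tri)
        triangles.append((tri, wmin))
        for t in tri:
            w[t] -= wmin          # at least one vertex drops to weight 0
```

Each iteration zeroes out at least one vertex, so the loop terminates after at most n iterations.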

Similarly, Halperin’s algorithm gives a (2 − (1 − o(1)) · 2 ln ln k / ln k)-approximation for WMVC on G(ISk).

2.4.4 Weighted Maximum c-Colourable Subgraph

The interval selection problem discussed in Section 2.1.1 is often extended to multiple machines. For identical machines, the graph-theoretic formulation of this problem leads to a natural generalization of MIS. In this section, we discuss the weighted version of this generalization: WCOLc.


Recall that in the weighted maximum c-colourable subgraph problem, we are given a graph G = (V, E) with n vertices and m edges, and a weight function w : V → Z+. The goal is to find a subset S of vertices maximizing the total weight of S such that S can be partitioned into c independent subsets. This problem is also referred to as the weighted maximum c-partite induced subgraph problem in some graph theory literature [1]. The problem is known to be NP-hard [88] even for chordal graphs. Chakaravarthy and Roy [17] showed that for chordal graphs, the problem admits a simple and efficient 2-approximation algorithm. We strengthen this result and extend it to the graph class G(ISk).

Theorem 2.4.13 For all k ≥ 1 and c ≥ 1, there is a polynomial time algorithm that achieves a (k + 1 − 1/c)-approximation for WCOLc on G(ISk).

We describe an algorithm that achieves the approximation ratio of Theorem 2.4.13. The algorithm is called a stack algorithm, as modelled in [11]. For each colour class l, we allocate a stack S_l to temporarily store candidate vertices potentially assigned to that colour class. For each vertex v, let w_l(v) denote its updated weight with respect to the stack S_l.

A STACK ALGORITHM FOR WCOLc ON G(ISk)

1: Sort all vertices according to a k-independence ordering
2: for i = 1, ..., n do
3: Let w_l(v_i) = w(v_i) − ∑_{v_j ∈ S_l ∩ N(v_i)} w_l(v_j) for each colour class l
4: if w_l(v_i) ≤ 0 for all l = 1, ..., c then
5: Reject v_i without assigning any colour
6: else
7: Let h = argmax_{l=1,...,c} w_l(v_i) and push v_i onto S_h
8: end if
9: end for
10: for l = 1, ..., c do
11: while S_l is not empty do
12: Pop v out of S_l
13: if v is adjacent to any vertex with colour l then
14: Reject v without assigning any colour
15: else
16: Assign colour l to v
17: end if
18: end while
19: end for

We call lines 2 to 9 the push phase of the algorithm and lines 10 to 19 the pop phase of the algorithm. For simplicity, we do not distinguish between a stack and the set of vertices it contains at the end of the push phase; it should be clear from context which is meant. Let W_l be the total updated weight of the vertices in S_l, and let W = ∑_{l=1}^c W_l. Before proving Theorem 2.4.13, we give the following lemmas. Let M be a c × c matrix, and let Σ be the set of all permutations of {1, 2, ..., c}. For any σ ∈ Σ, let σ_i be the i-th element of the permutation.

Lemma 2.4.14 There exists a permutation σ ∈ Σ such that

∑_i M_{iσ_i} ≤ (1/c) ∑_{i,j} M_{ij}.

Proof: Suppose otherwise. Then for each permutation σ we have

∑_i M_{iσ_i} > (1/c) ∑_{i,j} M_{ij}.

We sum over all σ ∈ Σ. Since in total we have c! permutations, we have

∑_{σ∈Σ} ∑_i M_{iσ_i} > c! · (1/c) ∑_{i,j} M_{ij}.

Since each M_{ij} is counted exactly (c − 1)! times on the left hand side, we have

(c − 1)! ∑_{i,j} M_{ij} > (c − 1)! ∑_{i,j} M_{ij},

which is a contradiction.
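Since Lemma 2.4.14 is a pure averaging argument, it is easy to sanity-check by brute force over all c! permutations (an illustration only; the matrix is arbitrary):

```python
from itertools import permutations

def best_permutation_value(M):
    """Minimum of sum_i M[i][sigma(i)] over all permutations sigma."""
    c = len(M)
    return min(sum(M[i][s[i]] for i in range(c)) for s in permutations(range(c)))

M = [[5, 1, 9],
     [2, 7, 3],
     [8, 4, 6]]
total = sum(sum(row) for row in M)
# Some permutation achieves at most the average (1/c) * sum_{i,j} M[i][j].
assert best_permutation_value(M) <= total / len(M)
```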


Lemma 2.4.15 The solution of the algorithm has total weight at least W .

Proof: Let A_l be the set of vertices in the solution of the algorithm with colour l. For any given vertex v_i ∈ A_l, let S_l^i be the content of the stack S_l before v_i is pushed onto the stack. We have

w(v_i) = w_l(v_i) + ∑_{v_j ∈ S_l^i ∩ N(v_i)} w_l(v_j).

If we sum over all v_i ∈ A_l, we have

∑_{v_i∈A_l} w(v_i) = ∑_{v_i∈A_l} w_l(v_i) + ∑_{v_i∈A_l} ∑_{v_j ∈ S_l^i ∩ N(v_i)} w_l(v_j) ≥ ∑_{v_t∈S_l} w_l(v_t) = W_l.

The inequality holds because for any v_t ∈ S_l, we either have v_t ∈ S_l^i ∩ N(v_i) for some v_i ∈ A_l, or we have v_t ∈ A_l. Summing over the colour classes, we conclude that the solution of the algorithm has total weight at least W.

We now proceed to the proof of Theorem 2.4.13.

Proof: Let A be the solution of the algorithm and O be the optimal solution. For each given vertex v_i in O, let o_i be its colour class in O, and a_i its colour class in A if it is in A. Let S_{o_i}^i be the content of the stack S_{o_i} when the algorithm considers v_i. We then have three cases:

1. If v_i is rejected during the push phase of the algorithm, then we have

w(v_i) ≤ ∑_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j).

In this case, we charge w(v_i) to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Each w_{o_i}(v_j) can be charged at most k times coming from the same colour class.

2. If v_i is accepted into the same colour class during the push phase of the algorithm, then we have

w(v_i) = w_{o_i}(v_i) + ∑_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j).


In this case, we charge w(v_i) to w_{o_i}(v_i) and to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Note that they all appear in the same colour class o_i; w_{o_i}(v_i) is charged at most once and each w_{o_i}(v_j) is charged at most k times coming from the same colour class.

3. If v_i is accepted into a different colour class during the push phase of the algorithm, then we have

w(v_i) = w_{o_i}(v_i) + ∑_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j) ≤ w_{a_i}(v_i) + ∑_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j).

In this case, we charge w(v_i) to w_{a_i}(v_i) and to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Note that each w_{o_i}(v_j) appears in the same colour class o_i and is charged at most k times coming from the same colour class. However, w_{a_i}(v_i) in this case is in a different colour class a_i and is charged at most once coming from a different colour class.

If we sum over all v_i ∈ O, we have

∑_{v_i∈O} w(v_i) ≤ ∑_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i) + k ∑_{l=1}^c ∑_{v_t∈S_l} w_l(v_t).

The inequality holds because when we sum over all weights of v_i ∈ O, there are two types of charges to w_l(v_t) for any vertex v_t ∈ S_l: charges coming from the same colour class, of which there can be at most k, and charges coming from a different colour class. The latter only appear when v_i is accepted into a stack of a different colour class (compared to the optimal solution) during the push phase of the algorithm. Therefore there is at most one such charge, which leads to the extra term ∑_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i). Note that if we can permute the colour classes of the optimal solution so that o_i = a_i for every v_i ∈ A ∩ O, then this extra term disappears and we achieve a k-approximation. But it might be the case that no matter how we permute the colour classes of the optimal solution, we always have some v_i ∈ A ∩ O with o_i ≠ a_i.

We construct the weight matrix M in the following way. An assignment i → j assigns the colour class i of O to the colour class j of A. A vertex is misplaced with respect to the assignment i → j if it is in A ∩ O, its colour class in O is i, but its colour class in A is not j. We then let M_{ij} be the total updated weight of the misplaced vertices with respect to the assignment i → j. Note that the total weight of the matrix is (c − 1) ∑_{v_i∈A∩O} w_{a_i}(v_i), and applying Lemma 2.4.14, there exists a permutation of the colour classes of O such that

∑_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i) ≤ ((c − 1)/c) ∑_{v_i∈A∩O} w_{a_i}(v_i) ≤ ((c − 1)/c) ∑_{v_i∈A} w(v_i).

Therefore, we have

∑_{v_i∈O} w(v_i) ≤ ((c − 1)/c) ∑_{v_i∈A} w(v_i) + k ∑_{l=1}^c ∑_{v_t∈S_l} w_l(v_t) ≤ ((c − 1)/c) ∑_{v_i∈A} w(v_i) + kW.

By Lemma 2.4.15, we have

∑_{v_i∈O} w(v_i) ≤ ((c − 1)/c) ∑_{v_i∈A} w(v_i) + kW ≤ (k + 1 − 1/c) ∑_{v_i∈A} w(v_i).

Therefore, the algorithm achieves a (k + 1 − 1/c)-approximation for WCOLc on G(ISk).

Note that given a k-independence ordering, the running time of the stack algorithm for Theorem 2.4.13 is dominated by the push phase and can be bounded by O(min{m log c + n, m + cn}). The first quantity is obtained as follows: for each vertex, we maintain a priority queue of its updated weights for all the colour classes. An update occurs for each edge in the graph and the cost of such an update is O(log c); therefore the running time is bounded by O(m log c + n). For the second quantity, at each step we simply calculate the updated weight of the current vertex for all colour classes, and then find the best colour class to push that vertex onto the stack. Calculating the updated weights for all vertices costs O(m) time, and finding the best colour class for each vertex costs O(c) time. Therefore the running time is bounded by O(m + cn).
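A minimal Python sketch of the stack algorithm, assuming the vertices are already given in a k-independence ordering (data layout and names are mine; the priority-queue refinement is omitted, so the asymptotics are not optimized):

```python
def stack_wcol(order, adj, weight, c):
    """Push phase then pop phase; returns a partial colouring {vertex: class}."""
    stacks = [[] for _ in range(c)]
    on_stack = [dict() for _ in range(c)]          # vertex -> updated weight
    for v in order:                                # push phase
        upd = [weight[v] - sum(on_stack[l].get(u, 0) for u in adj[v])
               for l in range(c)]
        h = max(range(c), key=lambda l: upd[l])
        if upd[h] > 0:                             # otherwise reject v
            stacks[h].append(v)
            on_stack[h][v] = upd[h]
    colour = {}
    for l in range(c):                             # pop phase
        while stacks[l]:
            v = stacks[l].pop()
            if all(colour.get(u) != l for u in adj[v]):
                colour[v] = l                      # keep v in colour class l
    return colour
```

On a weighted path a-b-c with a heavy middle vertex and c = 1, the sketch keeps only the middle vertex, matching the intended behaviour of the push and pop phases.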

In general, by a result in [6], the existence of an r-approximation for WMIS always implies (using a greedy algorithm that repeatedly takes an r-approximate solution of WMIS in the remaining graph) an approximation algorithm with ratio (cr)^c / ((cr)^c − (cr − 1)^c) for WCOLc. Note that when c = 1, this ratio equals r; when c = 2 and r = 1, it equals 4/3. For the remaining cases, we have

(cr)^c / ((cr)^c − (cr − 1)^c) = 1 / (1 − (1 − 1/(cr))^c) ≤ 1 / (1 − e^{−1/r}).


This ratio is no more than r + 1 − 1/c for all choices of r and c. However, the running time of this algorithm is O(c(m + n)) given a k-independence ordering, which is slightly worse than that of the stack algorithm.

2.4.5 The Graph Class G(CC2)

The graph class G(CCk) is a subclass of G(ISk), hence all algorithms studied in this section apply to the graph class G(CCk). Furthermore, if an elimination ordering with respect to the property MCC ≤ k is given, then both WMC and MCC can be approximated within a ratio of k. However, as noted in Section 2.3, unlike the property MIS ≤ k, the property MCC ≤ k is NP-hard to test for k > 2. Therefore, Theorem 2.1.11 no longer applies. In this subsection, we focus on the graph class G(CC2).

The graph class G(CC2) contains several interesting subclasses such as line graphs, translates of a uniform rectangle, input graphs of the job interval selection problem, and circular-arc graphs. Furthermore, by Corollary 2.3.7, for a graph in G(CC2), an elimination ordering with respect to the property MCC ≤ 2 can be constructed in polynomial time. Here, we give an optimal algorithm for WMC and a 2-approximation algorithm for MCC on G(CC2).

Theorem 2.4.16 Given a graph in G(CC2), there is an algorithm that solves WMC in polynomial time.

Proof: Let v_1, v_2, ..., v_n be an elimination ordering with respect to the property MCC ≤ 2. For each v_i, let G_i = G[(N(v_i) ∪ {v_i}) ∩ V_i]. Since the size of an MCC on G_i is at most 2, the complement of G_i is a bipartite graph. Note that a WMIS in a bipartite graph can be determined in polynomial time [23][39], hence a WMC in G_i can be computed in polynomial time. We compute a WMC for each G_i, and the largest one is a WMC for G. To see why, consider any weighted maximum clique C of G, and let v_j be the vertex in C that appears first in the elimination ordering. Then, when we compute a WMC for G_j, we capture a weighted maximum clique of G.


Theorem 2.4.17 Given a graph in G(CC2), there is a polynomial time 2-approximation algorithm for MCC.

Proof: Let v_1, v_2, ..., v_n be an elimination ordering with respect to the property MCC ≤ 2. We construct an independent set S by repeatedly taking a vertex according to this elimination ordering and removing all its neighbours. For each v_i ∈ S, let G_i = G[(N(v_i) ∪ {v_i}) ∩ V_i]. Since the size of an MCC on G_i is at most 2, there are at most two cliques in an MCC on G_i. We take the union of those cliques over every G_i. It is clear that this is a clique cover for G of size at most 2|S|. Since S is an independent set, an MCC on G has size at least |S|. Therefore, the algorithm achieves a 2-approximation.

2.5 Matroids and Chordoids

The previous sections discuss graph structures based on inductive and universal neighbourhood properties. In this section, we consider set systems. In particular, we discuss matroids, an extension of matroids, and greedy algorithms on these set systems.

2.5.1 Matroids

Matroids are well studied objects in combinatorial optimization. A matroid M is a pair

(U, F), where U is a set of ground elements and F is a family of subsets of U, called independent sets, with the following properties:

• Trivial Property: ∅ ∈ F.

• Hereditary Property: If A ∈ F and B ⊂ A, then B ∈ F.

• Augmentation Property: If A, B ∈ F and |A| = |B| + 1, then there exists an element e ∈ A \ B such that B ∪ {e} ∈ F.


The maximal independent sets of a matroid are called bases. By the augmentation property,

all bases have the same cardinality. For a given subset A of U, the rank of A, denoted r(A), is the size of the largest independent set contained in A. The rank function satisfies the following properties for all A, B ⊆ U:

• Monotonicity: A ⊆ B implies r(A) ≤ r(B);

• Submodularity: r(A ∪ B) + r(A ∩ B) ≤ r(A) + r(B).

The definition of a matroid captures the key notion of independence from both linear algebra and graph theory. For example, consider a set of vectors S in a vector space and let F denote the family of linearly independent subsets of S; then (S, F) is a matroid. Given a simple graph, let E be the set of edges and let F be the set of forests; then (E, F) is also a matroid. We give two more examples of matroids which we use in this thesis.

1. Uniform Matroid: Given a set U of ground elements, let F be the set of all subsets

of U with no more than k elements. Then (U ,F ) is known as the uniform matroid of

rank k.

2. Partition Matroid: Given a set U of ground elements partitioned into sets U_1, U_2, ..., U_l, let F be the set of all subsets of U with no more than k_i elements from each part U_i, for i = 1, 2, ..., l. Then (U, F) is a partition matroid. Note that a uniform matroid is a special case of a partition matroid with l = 1.

2.5.2 Greedy Algorithms and Matroids

One interesting aspect of matroids is their connection to greedy algorithms. Given a matroid M = (U, F) and a positive weight function w : U → R+, there is a natural optimization problem associated with the matroid M and this weight function w, namely that of finding an independent set of maximum total weight. We call this problem the maximum independent set problem on matroids. Let |U| = n. Sort the elements of U in non-increasing order of weight.


Let x_i denote the i-th element in this order. The following "natural" greedy algorithm solves the problem optimally:

GREEDY ALGORITHM FOR MATROIDS

1: S = ∅
2: for i = 1, ..., n do
3: if S ∪ {x_i} ∈ F, add x_i to S
4: end for
5: return S

Theorem 2.5.1 [76] The above greedy algorithm optimally solves the weighted maximum

independent set problem on matroids.

Another interesting fact is the reverse direction of the implication for hereditary set systems.

Theorem 2.5.2 [27] Let (U, F) be a hereditary set system. If for every choice of a weight function w : U → R+, the above greedy algorithm constructs a feasible set of maximum total weight, then F is the set of independent sets of a matroid M with underlying ground set U.

Matroids give a characterization of hereditary set systems for which the "natural" greedy algorithm achieves the optimal solution for the maximum independent set problem.
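For instance, instantiating the natural greedy algorithm on the graphic matroid (where the independent sets are the forests of a graph) yields Kruskal's algorithm for a maximum-weight forest; a minimal sketch using union-find (names are mine):

```python
def greedy_max_weight_forest(n, weighted_edges):
    """Greedy over the graphic matroid: weighted_edges is a list of
    (weight, u, v) with vertices in range(n); returns a max-weight forest."""
    parent = list(range(n))
    def find(x):                       # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    forest = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                   # independence test: no cycle created
            parent[ru] = rv
            forest.append((w, u, v))
    return forest
```

The independence test "adding this edge keeps the set a forest" is exactly the matroid membership check S ∪ {x_i} ∈ F.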

2.5.3 Chordoids

Note that for the maximum independent set problem on a matroid, if the problem is unweighted, then any ordering of the elements gives an optimal solution for the greedy algorithm. This is different from the greedy algorithm for the maximum independent set problem on chordal graphs, where a specific ordering of the vertices has to be used. We extend the definition of a matroid by replacing the augmentation property with the following property, which we call the ordered augmentation property.

Page 64: by Yuli Ye - University of Toronto T-Space · Yuli Ye Doctor of Philosophy Graduate Department of Computer Science University of Toronto 2013 One central question in theoretical computer

CHAPTER 2. GREEDY ALGORITHMS ON SPECIAL STRUCTURES 56

Definition 2.5.3 A set system (U, F) satisfies the ordered augmentation property if there is a total ordering of the elements e_1, e_2, ..., e_n such that for any feasible set S ∈ F and any element e_i ∉ S, if S− ∪ {e_i} ∈ F and S+ ≠ ∅, where S− = {e_j | e_j ∈ S, j < i} and S+ = {e_j | e_j ∈ S, j > i}, then there exists an element e_k ∈ S+ such that S \ {e_k} ∪ {e_i} ∈ F.

It turns out that the ordered augmentation property is strictly weaker than the augmentation

property of matroids.

Proposition 2.5.4 The augmentation property implies the ordered augmentation property.

Proof: Let (U, F) be a set system that satisfies the augmentation property, and let e_1, e_2, ..., e_n be an arbitrary ordering of the elements of U. For any feasible set S ∈ F and any element e_i ∉ S, if S− ∪ {e_i} ∈ F and S+ ≠ ∅, then by the augmentation property, we can repeatedly augment the set starting from S− ∪ {e_i} using elements of S+ until its size equals the size of S. This implies that there exists an element e_k ∈ S+ such that S \ {e_k} ∪ {e_i} ∈ F.

Definition 2.5.5 Let C = (U, F) be a set system. If C satisfies the trivial property, the hereditary property and the ordered augmentation property, then C is a chordoid.

By Proposition 2.5.4, chordoids generalize matroids. We give four additional examples

of chordoids.

Example 2.5.6 Let U be a set of elements and let w : U → Z+ be a positive weight function on the elements of U. Let B be a positive integer and let F = {S ⊆ U | ∑_{e∈S} w(e) ≤ B}. Then (U, F) is a chordoid.

If we take an ordering of the elements in non-decreasing order of weights (breaking ties arbitrarily), then the set system satisfies the ordered augmentation property. Let S be an independent set, let e_1 ∈ S and e_2 ∉ S, and let S′ = S \ {e_1} ∪ {e_2}. If w(e_1) ≥ w(e_2), then S′ is independent, since ∑_{e∈S′} w(e) ≤ B. Note that since each element has a positive weight, any subset of an independent set is also independent; therefore the set system (U, F) is a chordoid. Note that the above constraint ∑_{e∈S} w(e) ≤ B is often referred to as a knapsack constraint. Therefore, any optimization problem over a knapsack constraint is an optimization problem over a chordoid.
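Concretely, the greedy algorithm that scans the elements in this non-decreasing weight order and adds an element whenever the budget allows returns a maximum-cardinality feasible set; a minimal sketch (the function name is mine):

```python
def greedy_knapsack_cardinality(weights, budget):
    """Chordoid greedy on a knapsack constraint: scan in non-decreasing
    weight order (the ordered augmentation ordering) and add greedily."""
    chosen, used = [], 0
    for w in sorted(weights):
        if used + w <= budget:
            chosen.append(w)
            used += w
    return chosen
```

For example, with weights [4, 1, 3, 2, 5] and budget 6, the greedy keeps the three lightest elements 1, 2, 3, and no feasible set of four elements exists.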

Example 2.5.7 Let G = (V, E) be a chordal graph, and let F be the family of independent sets of the graph G. Then (V, F) is a chordoid.

Note that any subset of an independent set is independent; therefore, this set system satisfies the hereditary property. We now examine the ordered augmentation property. Consider a perfect elimination ordering of the vertices. For any independent set S and any vertex v ∉ S, let S− be the set of vertices in S appearing earlier than v in the ordering and let S+ be the set of vertices in S appearing later than v in the ordering. By the definition of a perfect elimination ordering, at most one vertex in S+ can be adjacent to v. Hence, if S− ∪ {v} is independent, then to augment S with v while maintaining independence of the set, we need to remove at most one vertex from S+. Therefore, a perfect elimination ordering of a chordal graph satisfies the ordered augmentation property; the set system (V, F) is a chordoid.
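For illustration, the resulting greedy on a chordal graph scans a perfect elimination ordering and keeps every vertex with no previously chosen neighbour (a minimal sketch with assumed data layout):

```python
def greedy_mis_peo(peo, adj):
    """Greedy maximum independent set along a perfect elimination ordering."""
    chosen = set()
    for v in peo:
        if not (adj[v] & chosen):   # no earlier-chosen neighbour
            chosen.add(v)
    return chosen

# The path a-b-c-d is chordal, and a, b, c, d is a perfect elimination
# ordering of it (each vertex's later neighbours form a clique).
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
# greedy_mis_peo(['a', 'b', 'c', 'd'], adj) returns {'a', 'c'},
# which is a maximum independent set of the path.
```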

Example 2.5.8 Given a set of codewords U over some alphabet Σ, let F be the set of prefix-free subsets of U. Then (U, F) is a chordoid.

A subset of a prefix-free set is clearly prefix-free; hence the hereditary property is satisfied. Consider an ordering of all codewords in non-increasing order of length (breaking ties arbitrarily). Note that for any prefix-free set S and any codeword w, at most one codeword of smaller or equal length in S can be a prefix of w. Therefore, (U, F) satisfies the ordered augmentation property.
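The corresponding greedy for a maximum-cardinality prefix-free subset scans codewords in non-increasing order of length; a minimal sketch (the function name is mine):

```python
def greedy_prefix_free(codewords):
    """Chordoid greedy: keep each codeword, longest first, that has no
    prefix relation with anything kept so far."""
    kept = []
    for w in sorted(codewords, key=len, reverse=True):
        if all(not a.startswith(w) and not w.startswith(a) for a in kept):
            kept.append(w)
    return kept
```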

Example 2.5.9 Given a set U of partial vectors of length n of the form

(a1, a2, . . . , ai , ?, ?, . . . , ?).

The unknown entries are marked with ?. The number of known entries of a vector is called its

effective length. A subset S of partial vectors is independent if no matter what the unknown


values are, the subset S is linearly independent. Let F be the family of independent sets of U .

Then (U ,F ) is a chordoid.

Clearly, any subset of an independent set is independent. It remains to verify the ordered

augmentation property.

Proposition 2.5.10 A set system (U, F), where U is a set of partial vectors of length n, satisfies the ordered augmentation property when the vectors are ordered in non-increasing order of effective length.

Proof: Let v_1, v_2, ..., v_n be the partial vectors ordered in non-increasing order of effective length. In the sequel, we assume that all subsets of U are ordered using this ordering.

Let S be an independent set and v be a partial vector that is not in S. Let S− denote the

set of partial vectors in S appearing earlier than v in the ordering, and let S+ denote the set

of partial vectors in S appearing later than v . We assume that S−∪ {v} is independent, but

S ∪ {v} is not. An independent set is v-dependent if adding v to it makes it dependent. Let

D denote the set of the minimal v-dependent subsets of S. For any set D in D, define the

index of D to be the largest index (the position in the ordering v1, v2, . . . , vn) among all partial

vectors in D . Let m be the maximum index over all sets in D. We claim that S \ {vm}∪ {v} is

independent.

To see this, let T = {t_1, ..., t_k} be a set in D with index m. Then there exist non-zero constants α_1, α_2, ..., α_k such that

∑_{i=1}^k α_i · t_i + v = (0, 0, ..., 0, ?, ?, ..., ?),

where we adopt the convention that ? + a = ? and ? · a = ? for any real number a. Note that by our choice of T, t_k = v_m and the effective length of −∑_{i=1}^k α_i · t_i is no greater than the effective length of v. Furthermore, −∑_{i=1}^k α_i · t_i has the same values as v in all of v's known entries.


Now suppose that S \ {v_m} ∪ {v} is not independent. Let H = {h_1, ..., h_l} be a minimal subset of S \ {v_m} that is v-dependent. Then there exist non-zero constants β_1, β_2, ..., β_l such that

∑_{i=1}^l β_i · h_i + v = (0, 0, ..., 0, ?, ?, ..., ?).

Therefore, we have

∑_{i=1}^l β_i · h_i − ∑_{i=1}^k α_i · t_i = (0, 0, ..., 0, ?, ?, ..., ?),

where at least one coefficient, namely that of t_k, is non-zero. Therefore the original set S is not independent, which is a contradiction.

Given a chordoid (U, F), let u_1, u_2, ..., u_n be an ordering of the elements satisfying the ordered augmentation property. The following greedy algorithm is optimal for the maximum independent set problem over a chordoid.

AN OPTIMAL GREEDY ALGORITHM FOR MIS OVER A CHORDOID

1: S = ∅
2: for i = 1, ..., n do
3: Add u_i to S if the resulting set is independent
4: end for
5: Return S

Theorem 2.5.11 The greedy algorithm solves the maximum independent set problem optimally for a chordoid.

Proof: Let A be the greedy solution and O be an optimal solution. We gradually transform O into A, starting with O_0 = O.

Let π be an ordering of the elements satisfying the ordered augmentation property. We order the elements of A according to π: a_1, a_2, ..., a_m, and apply the following procedure at each step i, for i = 1, ..., m. If a_i ∈ O_{i−1}, then O_i = O_{i−1}. Otherwise, we add the element a_i to O_{i−1}. Note that no element of O_{i−1} \ A appears earlier than a_i in the ordering, for otherwise it would have been chosen by the greedy algorithm. By the ordered augmentation property, after adding a_i, we need to remove at most one element of O_{i−1} appearing later than a_i to maintain independence. We let the resulting set be O_i. We have two observations:

1. O_{i−1} has the same size as O_i for all i = 1, ..., m.

2. O_m = A.

The first observation holds because at each step we either do nothing, or add one element and remove at most one element; since no feasible set can be larger than the optimal solution, the size of O_i stays fixed for all i. For the second observation, it is not hard to see that A ⊆ O_m. Furthermore, O_m cannot contain any extra element, for otherwise that element would have been chosen by the greedy algorithm.

Therefore |A| = |O_m| = |O_0| = |O|; the greedy algorithm is optimal.

Theorem 2.5.12 The weighted maximum independent set problem for a chordoid is NP-hard.

Proof: This is immediate, since the knapsack problem is NP-hard and it is a special case of the weighted maximum independent set problem for a chordoid.

There are other set system characterizations for greedy algorithms in the literature, most

notably, greedoids in [61]. A set system (U ,F ) is a greedoid if it satisfies the trivial property,

the augmentation property and the following accessible property instead of the hereditary

property:

• Accessible Property: If A ∈ F and A ≠ ∅, then ∃e ∈ A such that A \ {e} ∈ F .

Chordoids differ from matroids and greedoids in two key aspects. First, for matroids and greedoids, the greedy algorithm is always optimal for the weighted maximum independent set problem, while this does not hold for all chordoids. Secondly, all bases of a matroid or a greedoid have the same size, while this is not true for chordoids.


Chapter 3

Greedy Algorithms for Special Functions

An optimization problem takes the form of optimizing an objective function subject to some

constraints. While the previous chapter deals with special structures yielding constraints, in

this chapter, we study special families of objective functions. In order to make a fair com-

parison among different objective functions, we fix our constraints, and consider a general

class of optimization problems of the following form: Given a universe U and a set function

f : 2U → R, we want to find a subset S of U with cardinality p, where p is a fixed constant,

maximizing f (S). The constraint of the problem is a very simple cardinality constraint,

which is also known as the uniform matroid constraint. The objective functions we consider in this chapter range from simple linear functions and submodular functions to more general ones: functions modelling diversity and weakly submodular functions. A set function is monotone if for all S ⊂ T ⊆ U , f (S) ≤ f (T ); it is normalized if f (∅) = 0. In this chapter, we restrict our attention to monotone and normalized functions.

3.1 Linear Functions and Submodular Functions

A set function f is linear if for all S,T ⊆U ,

f (S)+ f (T ) = f (S ∪T )+ f (S ∩T ).


Another way to view a linear function is that the contribution of an individual element e to a set is the value of that element, which is essentially f ({e}). Therefore, we can use the following simple greedy algorithm to optimize a linear function over a uniform matroid.

LINEAR FUNCTION MAXIMIZATION

1: for i = 1, . . . , p do

2: Choose an element giving most increase in value to the current set

3: Add that element to the current set

4: end for
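As a concrete illustration, the steps above can be sketched in a few lines of Python. This is an illustrative sketch (not code from the thesis), with hypothetical element names, exploiting the fact that for a linear function the marginal gain of an element is always f ({e}).

```python
# A minimal runnable sketch (illustrative, not code from the thesis) of the
# greedy algorithm above for a linear (modular) set function under |S| = p.
# For a linear function the marginal gain of e never changes: it is f({e}),
# so the greedy rule reduces to taking the p elements of largest value.

def greedy_linear(values, p):
    """values: dict mapping element -> f({e}); returns a greedy set of size p."""
    chosen = set()
    for _ in range(p):
        # choose the element giving the most increase in value to the current set
        best = max((e for e in values if e not in chosen), key=lambda e: values[e])
        chosen.add(best)
    return chosen

print(sorted(greedy_linear({"a": 5, "b": 1, "c": 3, "d": 2}, 2)))  # ['a', 'c']
```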

It is not hard to see that the above greedy algorithm solves the problem optimally. A more

general class of functions is the class of submodular functions. A set function f is submodular

if for all S,T ⊆U ,

f (S)+ f (T ) ≥ f (S ∪T )+ f (S ∩T ).

Note that submodular functions are often studied in the value oracle model [73], where the

only access to f (·) is through a black box returning f (S) for a given set S. We can also view

a submodular function in terms of marginal gain. An equivalent definition is that for all

S ⊂ T ⊂ T ∪ {x} ⊆U ,

f (S ∪ {x})− f (S) ≥ f (T ∪ {x})− f (T ).

This basically says the marginal gain of an element to a set is no greater than the gain to a

smaller subset.

For the problem of maximizing a submodular function over a uniform matroid, we can

use the same greedy algorithm.

SUBMODULAR FUNCTION MAXIMIZATION

1: for i = 1, . . . , p do

2: Choose an element giving most increase in value to the current set

3: Add that element to the current set

4: end for


For submodular functions, the greedy algorithm does not always find the optimal solu-

tion to the problem. However, a result of Nemhauser, Wolsey and Fisher [71] shows that

it achieves an approximation ratio of e/(e −1) ≈ 1.58. Furthermore, this bound is known to be tight, both in the value oracle model and for explicitly posed instances assuming P ≠ NP [31].
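The greedy rule is identical for submodular objectives; only the marginal gains change as S grows. Below is a minimal sketch using set coverage, a standard monotone submodular function, on a small assumed instance.

```python
# A sketch of the same greedy rule on a monotone submodular objective.  Set
# coverage, f(S) = |union of the chosen sets|, is a standard example of such
# a function; the small instance below is an illustrative assumption.

sets = {1: {"a", "b", "c"}, 2: {"c", "d"}, 3: {"d", "e"}, 4: {"f"}}

def coverage(S):
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def greedy_submodular(f, universe, p):
    S = set()
    for _ in range(p):
        # take the element with the largest marginal gain f(S + u) - f(S)
        u = max(sorted(universe - S), key=lambda e: f(S | {e}) - f(S))
        S.add(u)
    return S

picked = greedy_submodular(coverage, set(sets), 2)
print(sorted(picked), coverage(picked))  # [1, 3] 5
```

The `sorted` call only makes tie-breaking deterministic; any tie-breaking rule preserves the approximation guarantee.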

3.2 Max-Sum Diversification

We now turn our attention to more general functions. The linear functions or submodular

functions discussed in the previous section often model the quality of a given subset. For

some applications, this is not enough. For example, in portfolio management, allocating

equities only according to the total expected return might lead to a large potential risk as the

portfolio is not diversified. A similar situation occurs in information retrieval. For example,

in search engines, when prior knowledge of the user intent is not available, it is often better

for a search engine to diversify its displayed results to improve user satisfaction. In many

such situations, diversity is an important measure that must be brought into consideration.

Recently, there has been a rising interest in the notion of diversity, especially in the con-

text of social media and web search. The concept of diversity is not new, however: there is a rich and long line of research dealing with a similar concept in the literature of location theory, in particular on the placement of facilities on a network so as to maximize some function of the distances between facilities. This situation arises when proximity of facilities is undesirable, for example in the distribution of business franchises in a city. Such location problems are often referred to as dispersion problems; for more motivation and early work, see [29, 30, 64].

Analytical models for the dispersion problem assume that the given network is repre-

sented by a set V = {v1, v2, . . . , vn} of n vertices with a metric distance between every pair of

vertices. The objective is to locate p facilities (p ≤ n) among the n vertices, with at most one

facility per vertex, such that some function of distances between facilities is maximized.


Different objective functions are considered for the dispersion problems in the literature.

For example, the max-sum criterion (maximize the total distances between all pairs of fa-

cilities) in [87, 29, 77], the max-min criterion (maximize the minimum distance between a

pair of facilities) in [64, 29, 77], the max-mst criterion (maximize the weight of a minimum spanning tree on the facilities), and many other related criteria in [41, 18]. The general problem (even in the

metric case) for most of these criteria is NP-hard, and approximation algorithms have been

developed and studied; see [18] for a summary of previous known results.

In this section, we study a problem that extends the max-sum dispersion problem. We

first give the definition of a metric distance function.

Definition 3.2.1 Let U be the underlying ground set. A distance function d(·, ·), defined on every pair of elements of U , is metric if it satisfies the following properties:

1. Non-Negativity: For any x, y ∈U , d(x, y) ≥ 0.

2. Coincidence Axiom: For any x, y ∈U , d(x, y) = 0 if and only if x = y.

3. Symmetry: For any x, y ∈U , d(x, y) = d(y, x).

4. Triangle Inequality: For any x, y, z ∈U , d(x, y)+d(x, z) ≥ d(y, z).

Definition 3.2.2 Let U be the underlying ground set, and d(·, ·) a metric distance function. Given a fixed integer p, the goal of the max-sum dispersion problem is to find a subset S ⊆ U that maximizes ∑_{{u,v}⊆S} d(u, v) subject to |S| = p.

The max-sum dispersion problem is known to be NP-hard [46], but it is not known

whether or not it admits a PTAS. In [77], Ravi, Rosenkrantz and Tayi give a greedy algorithm

and prove that it achieves an approximation ratio of four. This was later improved by Hassin, Rubinstein and Tamir [47], who give a different algorithm with an approximation ratio of two; this remains the best known ratio today. We study a generalization of the max-sum

dispersion problem; we call it the max-sum diversification problem.


Definition 3.2.3 Let U be the underlying ground set, and d(·, ·) a metric distance function on every pair of elements of U . Let f (·) be a non-negative set function measuring the weight of any subset. Given a fixed integer p, the goal of the max-sum diversification problem is to find a subset S ⊆ U that:

maximizes f (S) + λ∑_{{u,v}⊆S} d(u, v)
subject to |S| = p,

where λ is a non-negative parameter specifying a desired trade-off between the two objectives, i.e., the quality of the set and the diversity of the set.

The max-sum diversification problem is first proposed and studied in the context of re-

sult diversification in [38]1, where the function f (·) is linear. In their paper, the value of f (S) measures the relevance of a given subset to a search query, and the value ∑_{{u,v}⊆S} d(u, v) gives a diversity measure on S. The parameter λ specifies a desired trade-off between relevance and diversity. They reduce the problem to the max-sum dispersion problem, and

using an algorithm in [47], they obtain an approximation ratio of two. We study the prob-

lem with more general weight functions: normalized, monotone submodular set functions.

Therefore, the problem also extends the submodular maximization problem discussed in

Section 3.1. Note that results in [38] no longer apply after extending the weight functions to

submodular set functions.

3.2.1 A Greedy Algorithm and Its Analysis

In this subsection, we give a non-oblivious greedy algorithm for the max-sum diversification

problem that achieves a 2-approximation. Before giving the algorithm, we first introduce

our notation. We extend the notion of the distance function to sets: for disjoint subsets S,T ⊆ U , let d(S) = ∑_{{u,v}⊆S} d(u, v) and d(S,T ) = ∑_{u∈S, v∈T } d(u, v).

1In fact, they have a slightly different but equivalent formulation.


Now we define various types of marginal gain. For any given subset S ⊆ U and an element u ∈ U \ S, let φ(S) be the value of the objective function. Let du(S) = ∑_{v∈S} d(u, v) be the marginal gain on the distance; let fu(S) = f (S ∪ {u}) − f (S) be the marginal gain on the weight; and let φu(S) = fu(S) + λdu(S) be the total marginal gain on the objective function. Let f ′u(S) = (1/2) fu(S), and φ′u(S) = f ′u(S) + λdu(S). We consider the following simple greedy algorithm:

A GREEDY ALGORITHM FOR MAX-SUM DIVERSIFICATION

1: S = ∅
2: while |S| < p do
3: Find u ∈ U \ S maximizing φ′u(S)
4: S = S ∪ {u}
5: end while
6: return S
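A runnable sketch of this greedy selection rule follows, under the assumption that f is linear (one weight per element) and that the metric d is given explicitly on unordered pairs; all names and data are illustrative, not from the thesis.

```python
# A runnable sketch (illustrative, not the thesis code) of the greedy step:
# each iteration maximizes the potential phi'_u(S) = (1/2) f_u(S) + lam * d_u(S)
# rather than the objective itself.  Here f is assumed linear (one weight per
# element) and the metric d is given explicitly on unordered pairs.

def diversify(weights, dist, lam, p):
    S = set()
    while len(S) < p:
        def potential(u):
            return 0.5 * weights[u] + lam * sum(dist[frozenset({u, v})] for v in S)
        S.add(max(sorted(set(weights) - S), key=potential))
    return S

weights = {"a": 4.0, "b": 1.0, "c": 0.0}
dist = {frozenset({"a", "b"}): 1.0, frozenset({"a", "c"}): 3.0, frozenset({"b", "c"}): 2.0}
print(sorted(diversify(weights, dist, lam=1.0, p=2)))  # ['a', 'c']
```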

Note that the above greedy algorithm is non-oblivious as it is not selecting the next ele-

ment with respect to the objective function but rather with respect to a closely related “potential function”. To show a bounded approximation ratio for the algorithm, we utilize the

following variation of a lemma in [77].

Lemma 3.2.4 Given a metric distance function d(·, ·) defined on U , and two disjoint subsets

X and Y of U , we have the following inequality:

(|X |−1)d(X ,Y ) ≥ |Y |d(X ).

Proof: For any x1, x2 ∈ X and y ∈ Y , by the triangle inequality, we have

d(x1, y)+d(x2, y) ≥ d(x1, x2).

Summing up over all unordered pairs of {x1, x2}, we have

(|X |−1)d(X , y) ≥ d(X ).


Summing up over all y , we have

(|X |−1)d(X ,Y ) ≥ |Y |d(X ).

Theorem 3.2.5 The greedy algorithm achieves a 2-approximation for the max-sum diversi-

fication problem with normalized, monotone submodular set functions.

Proof: Let O be an optimal solution, and G , the greedy solution at the end of the algorithm.

Let Gi be the greedy solution at the end of step i , i < p; and let A = O ∩Gi , B = Gi \ A and

C =O \ A. By lemma 3.2.4, we have the following three inequalities:

(|C |−1)d(B ,C ) ≥ |B |d(C ) (3.1)

(|C |−1)d(A,C ) ≥ |A|d(C ) (3.2)

(|A|−1)d(A,C ) ≥ |C |d(A) (3.3)

Furthermore, we have

d(A,C )+d(A)+d(C ) = d(O) (3.4)

Note that the algorithm clearly achieves the optimal solution if p = 1. If |C | = 1, then |A| = p − 1; since A ⊆ Gi and |Gi | = i ≤ p − 1, this forces i = p − 1 and Gi = A ⊂ O. Let v be

the element in C , and let u be the element taken by the greedy algorithm in the next step,

then φ′u(Gi ) ≥ φ′v (Gi ). Therefore,

(1/2) fu(Gi ) + λdu(Gi ) ≥ (1/2) fv (Gi ) + λdv (Gi ),

which implies

φu(Gi ) = fu(Gi ) + λdu(Gi )
≥ (1/2) fu(Gi ) + λdu(Gi )
≥ (1/2) fv (Gi ) + λdv (Gi )
≥ (1/2)φv (Gi );


and hence φ(G) ≥ (1/2)φ(O).

Now we can assume that p > 1 and |C | > 1. We apply the following non-negative multipliers to (3.1), (3.2), (3.3) and (3.4) and add them:

(3.1) ∗ 1/(|C |−1) + (3.2) ∗ (|C |−|B |)/(p(|C |−1)) + (3.3) ∗ i /(p(p −1)) + (3.4) ∗ i |C |/(p(p −1));

we then have

d(A,C ) + d(B ,C ) − [i |C |(p −|C |)/(p(p −1)(|C |−1))] d(C ) ≥ [i |C |/(p(p −1))] d(O).

Since p > |C |, it follows that

d(C ,Gi ) ≥ [i |C |/(p(p −1))] d(O).

By submodularity and monotonicity of f ′(·), we have

∑_{v∈C} f ′v (Gi ) ≥ f ′(C ∪Gi ) − f ′(Gi ) ≥ f ′(O) − f ′(G).

Therefore,

∑_{v∈C} φ′v (Gi ) = ∑_{v∈C} [ f ′v (Gi ) + λd({v},Gi )]
= ∑_{v∈C} f ′v (Gi ) + λd(C ,Gi )
≥ [ f ′(O) − f ′(G)] + [λi |C |/(p(p −1))] d(O).

Let ui+1 be the element taken at step (i +1); then we have

φ′ui+1(Gi ) ≥ (1/p)[ f ′(O) − f ′(G)] + [λi /(p(p −1))] d(O).

Summing over all i from 0 to p −1, we have

φ′(G) = ∑_{i=0}^{p−1} φ′ui+1(Gi ) ≥ [ f ′(O) − f ′(G)] + (λ/2) d(O).

Hence,

f ′(G) + λd(G) ≥ f ′(O) − f ′(G) + (λ/2) d(O),

and

φ(G) = f (G) + λd(G) ≥ (1/2)[ f (O) + λd(O)] = (1/2)φ(O).

This completes the proof.


Note that the approximation ratio of 2 obtained in Theorem 3.2.5 is tight with respect to

the greedy algorithm. Consider the following example.

Example 3.2.6 Let U be a set of 2p elements and let A and B be a bipartition of U , each

containing p elements. The weight of each element is 0. The distance function d(·, ·) is defined as follows: for distinct x, y ∈ U , d(x, y) = 2 if both x and y are in A; otherwise d(x, y) = 1.

Note that d(·, ·) is a metric distance function. Furthermore, it is possible for the greedy al-

gorithm to choose the set B as the solution to the problem. The optimal solution is A, and

φ(A) = 2φ(B). Therefore, the approximation ratio of 2 obtained in Theorem 3.2.5 is tight

with respect to the greedy algorithm.
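For a small p, Example 3.2.6 can also be checked by brute force. The script below (an illustrative check, not part of the thesis) verifies the triangle inequality and that φ(A) = 2φ(B) for p = 3 and λ = 1.

```python
# A brute-force check of the example for p = 3 and lam = 1 (illustrative, not
# from the thesis): all weights are 0, within-A distances are 2, all other
# distances are 1, and the optimum A is worth exactly twice the set B.
from itertools import combinations

p = 3
A = [("A", i) for i in range(p)]
B = [("B", i) for i in range(p)]

def d(x, y):
    if x == y:
        return 0
    return 2 if x[0] == "A" and y[0] == "A" else 1

def phi(S):  # all weights are 0, so phi is the dispersion term alone (lam = 1)
    return sum(d(u, v) for u, v in combinations(S, 2))

pts = A + B
assert all(d(x, y) + d(y, z) >= d(x, z) for x in pts for y in pts for z in pts)
print(phi(A), phi(B))  # 6 3
```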

3.2.2 Further Discussions

It is natural to extend the cardinality constraint of the max-sum diversification problem to

a general matroid constraint.

Definition 3.2.7 Let U be the underlying ground set, and let F be the set of independent subsets of U such that M = ⟨U ,F ⟩ is a matroid. Let d(·, ·) be a metric distance function measuring the distance between every pair of elements, and let f (·) be a non-negative set function measuring the total weight of any subset. The goal of the max-sum diversification problem with a matroid constraint is to find a subset S ∈ F that:

maximizes f (S) + λ∑_{{u,v}: u,v∈S} d(u, v),

where λ is a non-negative parameter specifying a desired trade-off between the two objectives.

As before, we let φ(S) be the value of the objective function for a set S. The greedy algorithm

in the previous subsection still applies, but it fails to achieve any constant approximation ra-

tio. Consider the following partition matroid. The set of ground elements U = {e1,e2,e3,e4}.

The bases of the matroid are

{e1,e2}, {e1,e3}, {e4,e2}, {e4,e3}.


This is a partition matroid with one block {e1,e4} and the other block {e2,e3} and one ele-

ment allowed per block. The weight of each element is 0. The distances between any pair of

elements2 are defined as follows:

d(e1,e2) = d(e1,e3) = d(e2,e3) = 1;

d(e1,e4) = d(e2,e4) = d(e3,e4) = n.

It is not hard to see that d(·, ·) is a metric distance function. Note that the greedy algo-

rithm may pick e1 during its first iteration. No matter what it picks during the second iter-

ation, the resulting solution has a value of 1. However, there is a basis with value n. There-

fore, the approximation ratio is unbounded. This is in contrast to the greedy algorithm of

Nemhauser, Wolsey and Fisher [71] for submodular function maximization, which achieves

a 2-approximation after replacing the uniform matroid constraint by a general matroid con-

straint. Note that the problem is trivial if the rank of the matroid is less than two. Therefore,

without loss of generality, we assume the rank is greater or equal to two. Let

{x, y} = argmax{x,y}∈F

[ f ({x, y})+λd(x, y)].

We consider the following oblivious local search algorithm:

MAX-SUM DIVERSIFICATION WITH A MATROID CONSTRAINT

1: Let S be a basis of M containing both x and y
2: while there exist u ∈ U \ S and v ∈ S such that S ∪ {u} \ {v} ∈ F and φ(S ∪ {u} \ {v}) > φ(S) do
3: S = S ∪ {u} \ {v}
4: end while
5: return S
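The swap-based search can be sketched on the partition-matroid example above (all weights 0, here with n = 10); the independence oracle, the starting basis and the swap order are illustrative choices, not prescribed by the thesis.

```python
# A sketch of the local search on the partition-matroid example above (blocks
# {e1, e4} and {e2, e3}, one element per block, all weights 0), here with
# n = 10; the independence oracle, starting basis and swap order are
# illustrative choices.
from itertools import combinations

n = 10
U = {"e1", "e2", "e3", "e4"}
dist = {frozenset(q): 1 for q in [("e1", "e2"), ("e1", "e3"), ("e2", "e3")]}
dist.update({frozenset(q): n for q in [("e1", "e4"), ("e2", "e4"), ("e3", "e4")]})
blocks = [{"e1", "e4"}, {"e2", "e3"}]

def independent(S):
    return all(len(S & b) <= 1 for b in blocks)

def phi(S):  # weights are 0, so the objective is the dispersion term alone
    return sum(dist[frozenset(q)] for q in combinations(S, 2))

def local_search(S):
    while True:
        swaps = ((S - {v}) | {u} for u in sorted(U - S) for v in sorted(S))
        better = next((T for T in swaps if independent(T) and phi(T) > phi(S)), None)
        if better is None:
            return S
        S = better

S = local_search({"e1", "e2"})  # a basis where the plain greedy can get stuck
print(sorted(S), phi(S))  # ['e2', 'e4'] 10
```

Starting from the value-1 basis {e1, e2}, a single improving swap reaches a basis containing e4 and value n, matching the contrast drawn with the greedy algorithm.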

It turns out that the above local search algorithm achieves an approximation ratio of 2. Note that if the rank of the matroid is two, then the algorithm is clearly optimal.

2For each pair (x, y), we only define d(x, y). The value of d(y, x) is the same as d(x, y).

From now on, we assume the rank of the matroid is greater than two. Before we prove the theorem, we

need a few lemmas. First, we state a result in [13].

Lemma 3.2.8 [13] For any two sets X ,Y ∈ F with |X | = |Y |, there is a bijective mapping

g : X → Y such that X ∪ {g (x)} \ {x} ∈F for any x ∈ X .

Let O be an optimal solution, and S, the solution at the end of the local search algorithm.

Let A = O ∩S, B = S \ A and C = O \ A. Since both S and O are bases of the matroid, they

have the same cardinality. Therefore, B and C have the same cardinality. By Lemma 3.2.8,

there is a bijective mapping g : B → C such that S ∪ {g (b)} \ {b} ∈ F for any b ∈ B . Let B ={b1,b2, . . . ,bt }, and let ci = g (bi ) for all i = 1, . . . , t . Without loss of generality, we assume t ≥ 2,

for otherwise, the algorithm is optimal by the local optimality condition.

Lemma 3.2.9 f (S) + ∑_{i=1}^{t} f (S ∪ {ci } \ {bi }) ≥ f (S \ {b1, . . . ,bt }) + ∑_{i=1}^{t} f (S ∪ {ci }).

Proof: Since f is submodular,

f (S)− f (S \ {b1}) ≥ f (S ∪ {c1})− f (S ∪ {c1} \ {b1})

f (S \ {b1})− f (S \ {b1,b2}) ≥ f (S ∪ {c2})− f (S ∪ {c2} \ {b2})

...

f (S \ {b1, . . . ,bt−1})− f (S \ {b1, . . . ,bt }) ≥ f (S ∪ {ct })− f (S ∪ {ct } \ {bt }).

Summing up these inequalities, we have

f (S) − f (S \ {b1, . . . ,bt }) ≥ ∑_{i=1}^{t} f (S ∪ {ci }) − ∑_{i=1}^{t} f (S ∪ {ci } \ {bi }),

and the lemma follows.

Lemma 3.2.10 ∑_{i=1}^{t} f (S ∪ {ci }) ≥ (t −1) f (S) + f (S ∪ {c1, . . . ,ct }).


Proof: Since f is submodular,

f (S ∪ {ct })− f (S) = f (S ∪ {ct })− f (S)

f (S ∪ {ct−1})− f (S) ≥ f (S ∪ {ct ,ct−1})− f (S ∪ {ct })

f (S ∪ {ct−2})− f (S) ≥ f (S ∪ {ct ,ct−1,ct−2})− f (S ∪ {ct ,ct−1})

...

f (S ∪ {c1})− f (S) ≥ f (S ∪ {c1, . . . ,ct })− f (S ∪ {c2, . . . ,ct })

Summing up these inequalities, we have

∑_{i=1}^{t} f (S ∪ {ci }) − t f (S) ≥ f (S ∪ {c1, . . . ,ct }) − f (S),

and the lemma follows.

Lemma 3.2.11 ∑_{i=1}^{t} f (S ∪ {ci } \ {bi }) ≥ (t −2) f (S) + f (O).

Proof: Combining Lemma 3.2.9 and Lemma 3.2.10, we have

f (S) + ∑_{i=1}^{t} f (S ∪ {ci } \ {bi })
≥ f (S \ {b1, . . . ,bt }) + ∑_{i=1}^{t} f (S ∪ {ci })
≥ f (S \ {b1, . . . ,bt }) + (t −1) f (S) + f (S ∪ {c1, . . . ,ct })
≥ (t −1) f (S) + f (O),

where the last step uses f (S \ {b1, . . . ,bt }) ≥ 0 (f is normalized and monotone) and O ⊆ S ∪ {c1, . . . ,ct }. Therefore, the lemma follows.

Lemma 3.2.12 If t > 2, then d(B ,C ) − ∑_{i=1}^{t} d(bi ,ci ) ≥ d(C ).

Proof: For any bi ,c j ,ck , we have

d(bi ,c j )+d(bi ,ck ) ≥ d(c j ,ck ).


Summing up these inequalities over all i , j ,k with i ≠ j , i ≠ k, j ≠ k, each d(bi ,c j ) with i ≠ j is counted (t −2) times, and each d(c j ,ck ) with j ≠ k is counted (t −2) times. Therefore

(t −2)[d(B ,C ) − ∑_{i=1}^{t} d(bi ,ci )] ≥ (t −2) d(C ),

and the lemma follows.

Lemma 3.2.13 ∑_{i=1}^{t} d(S ∪ {ci } \ {bi }) ≥ (t −2) d(S) + d(O).

Proof:

∑_{i=1}^{t} d(S ∪ {ci } \ {bi })
= ∑_{i=1}^{t} [d(S) + d(ci ,S \ {bi }) − d(bi ,S \ {bi })]
= t d(S) + ∑_{i=1}^{t} d(ci ,S \ {bi }) − ∑_{i=1}^{t} d(bi ,S \ {bi })
= t d(S) + ∑_{i=1}^{t} d(ci ,S) − ∑_{i=1}^{t} d(ci ,bi ) − ∑_{i=1}^{t} d(bi ,S \ {bi })
= t d(S) + d(C ,S) − ∑_{i=1}^{t} d(ci ,bi ) − d(A,B ) − 2d(B ).

There are two cases. If t > 2, then by Lemma 3.2.12, we have

d(C ,S) − ∑_{i=1}^{t} d(ci ,bi )
= d(A,C ) + d(B ,C ) − ∑_{i=1}^{t} d(ci ,bi )
≥ d(A,C ) + d(C ).

Furthermore, since d(S) = d(A) + d(B ) + d(A,B ), we have 2d(S) − d(A,B ) − 2d(B ) ≥ d(A). Therefore

∑_{i=1}^{t} d(S ∪ {ci } \ {bi })
= t d(S) + d(C ,S) − ∑_{i=1}^{t} d(ci ,bi ) − d(A,B ) − 2d(B )
≥ (t −2) d(S) + d(A,C ) + d(C ) + d(A)
≥ (t −2) d(S) + d(O).


If t = 2, then since the rank of the matroid is greater than two, A ≠ ∅. Let z be an element in A; then we have

2d(S) + d(C ,S) − ∑_{i=1}^{t} d(ci ,bi ) − d(A,B ) − 2d(B )
= d(A,C ) + d(B ,C ) − ∑_{i=1}^{t} d(ci ,bi ) + 2d(A) + d(A,B )
≥ d(A,C ) + d(c1,b2) + d(c2,b1) + d(A) + d(z,b1) + d(z,b2)
≥ d(A,C ) + d(A) + d(c1,c2)
≥ d(A,C ) + d(A) + d(C )
= d(O).

Therefore

∑_{i=1}^{t} d(S ∪ {ci } \ {bi })
= t d(S) + d(C ,S) − ∑_{i=1}^{t} d(ci ,bi ) − d(A,B ) − 2d(B )
≥ (t −2) d(S) + d(O).

This completes the proof.

Now we are ready to prove the theorem.

Theorem 3.2.14 The local search algorithm achieves an approximation ratio of 2 for the

max-sum diversification problem with a matroid constraint.

Proof: Since S is a locally optimal solution, we have φ(S) ≥φ(S ∪ {ci } \ {bi }) for all i . There-

fore, for all i we have

f (S)+λd(S) ≥ f (S ∪ {ci } \ {bi })+λd(S ∪ {ci } \ {bi }).

Summing up over all i , we have

t f (S) + λt d(S) ≥ ∑_{i=1}^{t} f (S ∪ {ci } \ {bi }) + λ ∑_{i=1}^{t} d(S ∪ {ci } \ {bi }).


By Lemma 3.2.11, we have

t f (S) + λt d(S) ≥ (t −2) f (S) + f (O) + λ ∑_{i=1}^{t} d(S ∪ {ci } \ {bi }).

By Lemma 3.2.13, we have

t f (S) + λt d(S) ≥ (t −2) f (S) + f (O) + λ[(t −2) d(S) + d(O)].

Therefore,

2 f (S) + 2λd(S) ≥ f (O) + λd(O),

and hence

φ(S) ≥ (1/2)φ(O);

this completes the proof.

Theorem 3.2.14 shows that even in the more general case with a matroid constraint, we

can still achieve an approximation ratio of 2. In fact, by Example 3.2.6, the set B in the

example is a locally optimal set; therefore, this ratio is tight. Note that with a small sacrifice

on the approximation ratio, the algorithm can be modified to run in polynomial time by

looking for an ε-improvement instead of an arbitrary improvement.

3.3 Weakly Submodular Functions

Submodular functions are well-studied objects in combinatorial optimization, game theory

and economics. The natural diminishing returns property makes them suitable for many

applications. In this section, we study an extension of submodular functions which also

generalizes the objective function in the max-sum diversification problem. Recall the defi-

nition of a submodular function: A function f (·) is submodular if for any two sets S and T ,

we have

f (S)+ f (T ) ≥ f (S ∪T )+ f (S ∩T ).


We consider the following variation: we call a function f (·) weakly submodular if for any two sets S and T , we have

|T | f (S) + |S| f (T ) ≥ |S ∩T | f (S ∪T ) + |S ∪T | f (S ∩T ).
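The definition can be checked by brute force on small ground sets. The sketch below (illustrative, not from the thesis) tests the inequality over all pairs of subsets, using f (S) = |S|², a function shown to be weakly submodular in Proposition 3.3.6.

```python
# A brute-force check of the weak submodularity inequality
#   |T| f(S) + |S| f(T) >= |S ∩ T| f(S ∪ T) + |S ∪ T| f(S ∩ T)
# over all pairs of subsets of a small ground set (an illustrative test
# harness, not from the thesis).
from itertools import chain, combinations

def subsets(U):
    return chain.from_iterable(combinations(U, r) for r in range(len(U) + 1))

def weakly_submodular(f, U):
    for S in map(set, subsets(U)):
        for T in map(set, subsets(U)):
            lhs = len(T) * f(S) + len(S) * f(T)
            rhs = len(S & T) * f(S | T) + len(S | T) * f(S & T)
            if lhs < rhs:
                return False
    return True

# f(S) = |S|^2 is weakly submodular (this is Proposition 3.3.6):
print(weakly_submodular(lambda S: len(S) ** 2, {1, 2, 3, 4}))  # True
```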

3.3.1 Examples of Weakly Submodular Functions

There are several natural examples of weakly submodular functions. Again, all functions

considered here are normalized and monotone.

Submodular Functions

From the definition, it is not obvious that submodular functions form a subclass of weakly submodular functions. First, we prove that this is the case.

Proposition 3.3.1 Any submodular function is weakly submodular.

Proof: Given a monotone submodular function f (·) and two subsets S and T , without loss

of generality, we assume |S| ≤ |T |, then

|T | f (S)+|S| f (T ) = |S|[ f (S)+ f (T )]+ (|T |− |S|) f (S).

By submodularity f (S)+ f (T ) ≥ f (T ∪S)+ f (T ∩S) and monotonicity f (S) ≥ f (S ∩T ), we

have

|T | f (S)+|S| f (T ) = |S|[ f (S)+ f (T )]+ (|T |− |S|) f (S)

≥ |S|[ f (S ∪T )+ f (S ∩T )]+ (|T |− |S|) f (S ∩T )

= |S| f (S ∪T )+|T | f (S ∩T )

= |S ∩T | f (S ∪T )+ (|S|− |S ∩T |) f (S ∪T )+|T | f (S ∩T ).

And again by monotonicity f (S ∪T ) ≥ f (S ∩T ), we have

(|S|− |S ∩T |) f (S ∪T )+|T | f (S ∩T ) ≥ (|S|+ |T |− |S ∩T |) f (S ∩T ) = |S ∪T | f (S ∩T ).


Therefore

|T | f (S)+|S| f (T ) ≥ |S ∩T | f (S ∪T )+|S ∪T | f (S ∩T );

the proposition follows.

Sum of Metric Distances of a Set

Let U be a metric space with a distance function d(·, ·). For any subset S, define d(S) to be the sum of distances induced by S; i.e.,

d(S) = ∑_{{u,v}: u,v∈S} d(u, v),

where d(u, v) measures the distance between u and v . We also extend the function to a pair of disjoint subsets S and T and define d(S,T ) to be the sum of distances between S and T ; i.e., d(S,T ) = ∑_{u∈S, v∈T } d(u, v). We have the following proposition.

Proposition 3.3.2 The sum of metric distances of a set is weakly submodular.

Proof: Given two subsets S and T of U , let A = S \ T , B = T \ S and C = S ∩ T . Observe that by the triangle inequality, we have

|B |d(A,C ) + |A|d(B ,C ) ≥ |C |d(A,B ).

Therefore,

|T |d(S) + |S|d(T )
= (|B | + |C |)[d(A) + d(C ) + d(A,C )] + (|A| + |C |)[d(B ) + d(C ) + d(B ,C )]
= |C |[d(A) + d(B ) + d(C ) + d(A,C ) + d(B ,C )] + (|A| + |B | + |C |)d(C )
  + |B |d(A) + |A|d(B ) + |B |d(A,C ) + |A|d(B ,C )
≥ |C |[d(A) + d(B ) + d(C ) + d(A,C ) + d(B ,C )] + |S ∪T |d(S ∩T ) + |C |d(A,B )
= |C |[d(A) + d(B ) + d(C ) + d(A,C ) + d(B ,C ) + d(A,B )] + |S ∪T |d(S ∩T )
= |S ∩T |d(S ∪T ) + |S ∪T |d(S ∩T ).


Average Non-Negative Segmentation Functions

Given an m × n matrix M and any subset S ⊆ [m], a segmentation function σ(S) is the sum, over the columns, of the maximum entry in that column among the rows whose indices appear in S; i.e., σ(S) = ∑_{j=1}^{n} max_{i∈S} Mi j . A segmentation function is average non-negative if for each row i , the sum of all entries of that row is non-negative; i.e., ∑_{j=1}^{n} Mi j ≥ 0.

We can use columns to model individuals and rows to model items; then each entry Mi j represents how much individual j likes item i . The average non-negative property basically requires that, for each item i , on average people do not hate it. Next, we show that an average non-negative segmentation function is weakly submodular. We first prove the following two lemmas.

prove the following two lemmas.

Lemma 3.3.3 An average non-negative segmentation function is monotone.

Proof: Let S be a proper subset of [m], and let e be an element of [m] that is not in S. If S is empty, then by the average non-negative property we have σ({e}) = ∑_{j=1}^{n} Me j ≥ 0 = σ(∅). Otherwise, by adding e to S we have max_{i∈S∪{e}} Mi j ≥ max_{i∈S} Mi j for all 1 ≤ j ≤ n. Therefore σ(S ∪ {e}) ≥ σ(S).

Lemma 3.3.4 For any two non-disjoint sets S and T and an average non-negative segmentation function σ(·), we have

σ(S) + σ(T ) ≥ σ(S ∪T ) + σ(S ∩T ).

This is also referred to as the meta-submodular property [60].

Proof: For any two non-disjoint sets S and T and an average non-negative segmentation function σ(·), let σ j (S) = max_{i∈S} Mi j . We show the stronger statement that for any j ∈ [n], we have

σ j (S) + σ j (T ) ≥ σ j (S ∪T ) + σ j (S ∩T ).


Let e be an element in S ∪T such that Me j is maximum. Without loss of generality, assume

e ∈ S, then σ j (S) =σ j (S ∪T ) = Me j . Since S ∩T ⊆ T , we have σ j (T ) ≥σ j (S ∩T ). Therefore,

σ j (S)+σ j (T ) ≥σ j (S ∪T )+σ j (S ∩T ).

Summing over all j ∈ [n], we have

σ(S)+σ(T ) ≥σ(S ∪T )+σ(S ∩T )

as desired.

Proposition 3.3.5 Any average non-negative segmentation function is weakly submodular.

Proof: Let S and T be any two sets, and let σ(·) be an average non-negative segmentation function. If S and T are non-disjoint, then by Lemma 3.3.4, S and T satisfy the submodular property, and hence they satisfy the weakly submodular property by Proposition 3.3.1. If S and T are disjoint, then |S ∩T | = 0 and |S ∪T | = |S| + |T |. By the monotonicity property of Lemma 3.3.3, we also have σ(S) ≥ σ(S ∩T ) and σ(T ) ≥ σ(S ∩T ). Therefore,

|S ∩T |σ(S ∪T ) + |S ∪T |σ(S ∩T ) ≤ |T |σ(S ∩T ) + |S|σ(S ∩T ) ≤ |T |σ(S) + |S|σ(T );

the weakly submodular property is also satisfied.

Squares of Cardinality of a Set

For a given set S, let f (S) = |S|². We show that this function is also weakly submodular.

Proposition 3.3.6 The square of the cardinality of a set is weakly submodular.


Proof: Given two subsets S and T of U, let a = |S \ T|, b = |T \ S| and c = |S ∩ T|. Then

|T| f(S) + |S| f(T)
= (b + c)(a + c)² + (a + c)(b + c)²
= (a + b + 2c)(b + c)(a + c)
= (a + b + 2c)(ab + ac + bc + c²)
≥ (a + b + 2c)(ac + bc + c²)
= (a + b + 2c)c(a + b + c)
= c(a + b + c)² + (a + b + c)c²
= |S ∩ T| f(S ∪ T) + |S ∪ T| f(S ∩ T).
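The chain of (in)equalities above reduces weak submodularity of f(S) = |S|² to an inequality in the non-negative integers a = |S \ T|, b = |T \ S|, c = |S ∩ T|, which can be spot-checked directly (a minimal sketch, not from the thesis):

```python
def square_inequality(a, b, c):
    # |T|f(S) + |S|f(T) >= |S∩T|f(S∪T) + |S∪T|f(S∩T) with f(X) = |X|^2,
    # written in terms of a = |S \ T|, b = |T \ S|, c = |S ∩ T|.
    lhs = (b + c) * (a + c) ** 2 + (a + c) * (b + c) ** 2
    rhs = c * (a + b + c) ** 2 + (a + b + c) * c ** 2
    return lhs >= rhs  # lhs - rhs = (a + b + 2c) * a * b >= 0

all_hold = all(square_inequality(a, b, c)
               for a in range(12) for b in range(12) for c in range(12))
```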

The Objective Function of Max-Sum Diversification

We first show a property of weakly submodular functions.

Lemma 3.3.7 Non-negative linear combinations of weakly submodular functions are weakly

submodular.

Proof: Consider weakly submodular functions f_1, f_2, ..., f_n and non-negative numbers α_1, α_2, ..., α_n. Let g(S) = Σ_{i=1}^n α_i f_i(S); then for any two sets S and T, we have

|T| g(S) + |S| g(T)
= |T| Σ_{i=1}^n α_i f_i(S) + |S| Σ_{i=1}^n α_i f_i(T)
= Σ_{i=1}^n α_i [|T| f_i(S) + |S| f_i(T)]
≥ Σ_{i=1}^n α_i [|S ∩ T| f_i(S ∪ T) + |S ∪ T| f_i(S ∩ T)]
= |S ∩ T| Σ_{i=1}^n α_i f_i(S ∪ T) + |S ∪ T| Σ_{i=1}^n α_i f_i(S ∩ T)
= |S ∩ T| g(S ∪ T) + |S ∪ T| g(S ∩ T).


Therefore, g (S) is weakly submodular.

Corollary 3.3.8 The objective function of the max-sum diversification problem is weakly sub-

modular.

Proof: This follows immediately from Propositions 3.3.1 and 3.3.2 and Lemma 3.3.7.

3.3.2 Weakly Submodular Function Maximization

In this subsection, we discuss a greedy approximation algorithm for maximizing weakly

submodular functions over a uniform matroid.

Given an underlying set U and a weakly submodular function f (·) defined on every sub-

set of U , the goal is to select a subset S maximizing f (S) subject to a cardinality constraint

|S| ≤ p. We consider the following greedy algorithm.

GREEDY ALGORITHM FOR WEAKLY SUBMODULAR FUNCTION MAXIMIZATION

1: S = ∅
2: while |S| < p do
3:   Find u ∈ U \ S maximizing f(S ∪ {u}) − f(S)
4:   S = S ∪ {u}
5: end while
6: return S
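In Python, the loop above might look as follows (an illustrative sketch, not code from the thesis; the objective f below is a segmentation-style monotone function chosen only for the demo):

```python
def greedy_maximize(universe, f, p):
    # Greedy for max f(S) subject to |S| <= p: repeatedly add the element
    # with the largest marginal gain f(S ∪ {u}) − f(S).
    S = set()
    while len(S) < p:
        u = max((x for x in universe if x not in S),
                key=lambda x: f(S | {x}) - f(S))
        S.add(u)
    return S

# Example objective (an assumption for illustration): a segmentation-style
# function f(S) = sum_j max_{i in S} M[i][j], which is monotone.
M = [[3, 1, 0], [0, 2, 2], [1, 1, 3], [2, 0, 1]]
f = lambda S: sum(max((M[i][j] for i in S), default=0) for j in range(3))
chosen = greedy_maximize(range(4), f, 2)
```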

Theorem 3.3.9 The above greedy algorithm achieves an approximation ratio ≈ 5.95.

Before getting into the proof, we first prove two algebraic identities.

Lemma 3.3.10

Σ_{j=1}^n ((i+1)/i)^{j−1} = i((i+1)/i)^n − i.


Proof: Note that the expression on the left-hand side is a geometric sum. Therefore, we have

Σ_{j=1}^n ((i+1)/i)^{j−1} = (((i+1)/i)^n − 1) / ((i+1)/i − 1) = i((i+1)/i)^n − i.

Lemma 3.3.11

Σ_{j=1}^n j((i+1)/i)^{j−1} = n i²((i+1)/i)^{n+1} − (n+1) i²((i+1)/i)^n + i².

Proof: Consider the function f(x) = Σ_{j=0}^n x^j with x ≠ 1; its derivative is f′(x) = Σ_{j=1}^n j x^{j−1}. Since f(x) is a geometric sum and x ≠ 1, we have

f(x) = (x^{n+1} − 1) / (x − 1).

Taking derivatives on both sides, we have

f′(x) = [(n+1)x^n(x − 1) − x^{n+1} + 1] / (x − 1)² = [n x^{n+1} − (n+1)x^n + 1] / (x − 1)².

Therefore, we have

Σ_{j=1}^n j x^{j−1} = [n x^{n+1} − (n+1)x^n + 1] / (x − 1)².

Substituting x with (i+1)/i, we have

Σ_{j=1}^n j((i+1)/i)^{j−1} = [n((i+1)/i)^{n+1} − (n+1)((i+1)/i)^n + 1] / ((i+1)/i − 1)²
= n i²((i+1)/i)^{n+1} − (n+1) i²((i+1)/i)^n + i².
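Both closed forms are easy to sanity-check numerically (a minimal sketch, with hypothetical helper names):

```python
def lemma_3_3_10(i, n):
    # sum_{j=1}^n ((i+1)/i)^(j-1) should equal i*((i+1)/i)^n - i.
    x = (i + 1) / i
    return sum(x ** (j - 1) for j in range(1, n + 1)), i * x ** n - i

def lemma_3_3_11(i, n):
    # sum_{j=1}^n j*((i+1)/i)^(j-1) should equal
    # n*i^2*x^(n+1) - (n+1)*i^2*x^n + i^2, with x = (i+1)/i.
    x = (i + 1) / i
    lhs = sum(j * x ** (j - 1) for j in range(1, n + 1))
    rhs = n * i * i * x ** (n + 1) - (n + 1) * i * i * x ** n + i * i
    return lhs, rhs
```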

Now we proceed to the proof of Theorem 3.3.9.

Proof: Let S_i be the greedy solution after the i-th iteration; i.e., |S_i| = i. Let O be an optimal solution, and let C_i = O \ S_i. Let m_i = |C_i| and C_i = {c_1, c_2, ..., c_{m_i}}. By the weak submodularity definition, we get the following m_i inequalities for each 0 < i < p:

(i + m_i − 1) f(S_i ∪ {c_1}) + (i + 1) f(S_i ∪ {c_2, ..., c_{m_i}}) ≥ i f(S_i ∪ {c_1, ..., c_{m_i}}) + (i + m_i) f(S_i)
(i + m_i − 2) f(S_i ∪ {c_2}) + (i + 1) f(S_i ∪ {c_3, ..., c_{m_i}}) ≥ i f(S_i ∪ {c_2, ..., c_{m_i}}) + (i + m_i − 1) f(S_i)
...
(i + 1) f(S_i ∪ {c_{m_i−1}}) + (i + 1) f(S_i ∪ {c_{m_i}}) ≥ i f(S_i ∪ {c_{m_i−1}, c_{m_i}}) + (i + 2) f(S_i)
i f(S_i ∪ {c_{m_i}}) + (i + 1) f(S_i) ≥ i f(S_i ∪ {c_{m_i}}) + (i + 1) f(S_i).

Multiplying the j-th inequality by ((i+1)/i)^{j−1} and summing them all up, we have

Σ_{j=1}^{m_i} (i + m_i − j)((i+1)/i)^{j−1} f(S_i ∪ {c_j}) + (i + 1)((i+1)/i)^{m_i−1} f(S_i)
≥ i f(S_i ∪ {c_1, ..., c_{m_i}}) + Σ_{j=1}^{m_i} (i + m_i − j + 1)((i+1)/i)^{j−1} f(S_i).

By monotonicity, we have f(S_i ∪ {c_1, ..., c_{m_i}}) ≥ f(O). Rearranging the inequality,

Σ_{j=1}^{m_i} (i + m_i − j)((i+1)/i)^{j−1} f(S_i ∪ {c_j}) ≥ i f(O) + Σ_{j=1}^{m_i−1} (i + m_i − j + 1)((i+1)/i)^{j−1} f(S_i).

By the greedy selection rule, we know that f(S_{i+1}) ≥ f(S_i ∪ {c_j}) for any 1 ≤ j ≤ m_i; therefore we have

Σ_{j=1}^{m_i} (i + m_i − j)((i+1)/i)^{j−1} f(S_{i+1}) ≥ i f(O) + Σ_{j=1}^{m_i−1} (i + m_i − j + 1)((i+1)/i)^{j−1} f(S_i).

For ease of notation, we let

a_i = Σ_{j=1}^{m_i} (i + m_i − j)((i+1)/i)^{j−1},   b_i = Σ_{j=1}^{m_i−1} (i + m_i − j + 1)((i+1)/i)^{j−1}.

We first simplify a_i and b_i:

a_i = Σ_{j=1}^{m_i} (i + m_i − j)((i+1)/i)^{j−1}
= Σ_{j=1}^{m_i} (i + m_i)((i+1)/i)^{j−1} − Σ_{j=1}^{m_i} j((i+1)/i)^{j−1}.


By Lemmas 3.3.10 and 3.3.11, we have

a_i = (i + m_i)[i((i+1)/i)^{m_i} − i] − m_i i²((i+1)/i)^{m_i+1} + (m_i + 1) i²((i+1)/i)^{m_i} − i²
= [i² + i m_i − m_i(i² + i) + (m_i + 1)i²]((i+1)/i)^{m_i} − 2i² − i m_i
= 2i²((i+1)/i)^{m_i} − 2i² − i m_i.

Similarly, we have

b_i = Σ_{j=1}^{m_i−1} (i + m_i − j + 1)((i+1)/i)^{j−1}
= Σ_{j=1}^{m_i−1} (i + m_i + 1)((i+1)/i)^{j−1} − Σ_{j=1}^{m_i−1} j((i+1)/i)^{j−1}
= (i + m_i + 1)[i((i+1)/i)^{m_i−1} − i] − (m_i − 1) i²((i+1)/i)^{m_i} + m_i i²((i+1)/i)^{m_i−1} − i²
= [i² + i m_i + i − (m_i − 1)(i² + i) + m_i i²]((i+1)/i)^{m_i−1} − 2i² − i m_i − i
= 2i(i + 1)((i+1)/i)^{m_i−1} − 2i² − i m_i − i
= 2i²((i+1)/i)^{m_i} − 2i² − i m_i − i.

Now let

a*_i = Σ_{j=1}^{p} (i + p − j)((i+1)/i)^{j−1},   b*_i = Σ_{j=1}^{p−1} (i + p − j + 1)((i+1)/i)^{j−1}.

We have

a*_i − a_i = b*_i − b_i ≥ 0.

Therefore,

a*_i f(S_{i+1}) − b*_i f(S_i) = a_i f(S_{i+1}) − b_i f(S_i) + (a*_i − a_i)[f(S_{i+1}) − f(S_i)].

Since f(·) is monotone, we have f(S_{i+1}) − f(S_i) ≥ 0. Therefore,

a*_i f(S_{i+1}) − b*_i f(S_i) ≥ a_i f(S_{i+1}) − b_i f(S_i) ≥ i f(O).


Then we have the following set of inequalities:

a*_1 f(S_2) ≥ 1·f(O) + b*_1 f(S_1)
a*_2 f(S_3) ≥ 2·f(O) + b*_2 f(S_2)
...
a*_{p−2} f(S_{p−1}) ≥ (p − 2) f(O) + b*_{p−2} f(S_{p−2})
a*_{p−1} f(S_p) ≥ (p − 1) f(O) + b*_{p−1} f(S_{p−1}).

Multiplying the i-th inequality by (Π_{j=1}^{i−1} a*_j) / (Π_{j=2}^{i} b*_j), summing them all up and ignoring the term b*_1 f(S_1), we get

(Π_{j=1}^{p−1} a*_j) / (Π_{j=2}^{p−1} b*_j) · f(S_p) ≥ Σ_{i=1}^{p−1} i (Π_{j=1}^{i−1} a*_j) / (Π_{j=2}^{i} b*_j) · f(O).

Therefore the approximation ratio satisfies

f(O) / f(S_p) ≤ [(Π_{j=1}^{p−1} a*_j) / (Π_{j=2}^{p−1} b*_j)] / [Σ_{i=1}^{p−1} i (Π_{j=1}^{i−1} a*_j) / (Π_{j=2}^{i} b*_j)]
= [Σ_{i=1}^{p−1} i (Π_{j=i+1}^{p−1} b*_j) / (Π_{j=i}^{p−1} a*_j)]^{−1}
= (Σ_{i=1}^{p−1} [(i / a*_i) · Π_{j=i+1}^{p−1} (b*_j / a*_j)])^{−1}.

Note that the approximation ratio is simply a function of p, and it converges³ to 5.95 as p tends to ∞. In particular, the approximation ratio is 3.74 when p = 10 and 5.62 when p = 100.
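The bound is straightforward to evaluate directly from the definitions of a*_i and b*_i; the sketch below (hypothetical helper name, not code from the thesis) computes it as a function of p. For small p the value can be checked by hand: at p = 2 the bound equals a*_1 = 4.

```python
def approx_ratio(p):
    # Bound on f(O)/f(S_p):
    #   1 / sum_{i=1}^{p-1} (i/a*_i) * prod_{j=i+1}^{p-1} (b*_j/a*_j),
    # with a*_i = sum_{j=1}^{p} (i+p-j) x^(j-1) and
    #      b*_i = sum_{j=1}^{p-1} (i+p-j+1) x^(j-1), where x = (i+1)/i.
    x = lambda i: (i + 1) / i
    a = {i: sum((i + p - j) * x(i) ** (j - 1) for j in range(1, p + 1))
         for i in range(1, p)}
    b = {i: sum((i + p - j + 1) * x(i) ** (j - 1) for j in range(1, p))
         for i in range(1, p)}
    total = 0.0
    for i in range(1, p):
        prod = 1.0
        for j in range(i + 1, p):
            prod *= b[j] / a[j]
        total += (i / a[i]) * prod
    return 1.0 / total
```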

3.3.3 Further Discussions

As discussed in Subsection 3.2.2, it is natural to consider the general matroid constraint for

the problem of weakly submodular function maximization. For this more general problem,

the greedy algorithm of the previous subsection no longer achieves any constant approximation

ratio. We consider the following oblivious local search algorithm:

WEAKLY SUBMODULAR FUNCTION MAXIMIZATION WITH A MATROID CONSTRAINT

³ This number is obtained by a computer program.


1: Let S be a basis of M

2: while there exist u ∈ U \ S and v ∈ S such that S ∪ {u} \ {v} ∈ F and f(S ∪ {u} \ {v}) > f(S) do

3: S = S ∪ {u} \ {v}

4: end while

5: return S
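A direct Python rendering of the local search is sketched below (illustrative only; the matroid is abstracted as an independence oracle, and the rank-2 uniform matroid instance is an assumption for the demo, not from the text):

```python
def local_search(universe, f, independent, r):
    # Oblivious local search: start from an arbitrary basis, then repeatedly
    # swap one element out and one element in while the objective improves.
    S = set(list(universe)[:r])  # an arbitrary basis of the uniform matroid
    improved = True
    while improved:
        improved = False
        for u in universe:
            if u in S:
                continue
            for v in list(S):
                T = (S - {v}) | {u}
                if independent(T) and f(T) > f(S):
                    S = T
                    improved = True
                    break
            if improved:
                break
    return S

# Illustrative instance (an assumption): rank-2 uniform matroid, element
# weights, and f(S) = (sum of weights in S)^2, which is monotone.
w = {0: 1, 1: 5, 2: 2, 3: 4}
f = lambda S: sum(w[x] for x in S) ** 2
best = local_search(range(4), f, lambda T: len(T) <= 2, 2)
```

For this objective every local optimum is the global one, so the search ends at the two heaviest elements.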

Before we prove the theorem, we need several lemmas. Let O be the optimal solution, and let S be the solution at the end of the local search algorithm. Let s be the size of a basis; let A = O ∩ S, B = S \ A and C = O \ A. By Lemma 3.2.8, there is a bijective mapping g: B → C such that S ∪ {b} \ {g(b)} ∈ F for any b ∈ B. Let B = {b_1, b_2, ..., b_t}, and let c_i = g(b_i)

for all i = 1, ..., t. We reorder b_1, b_2, ..., b_t in different ways. Let b′_1, b′_2, ..., b′_t be an ordering such that the corresponding c′_1, c′_2, ..., c′_t maximizes the sum Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}); and let b′′_1, b′′_2, ..., b′′_t be an ordering such that the corresponding c′′_1, c′′_2, ..., c′′_t minimizes the sum Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}).

Lemma 3.3.12 Given three non-increasing non-negative sequences:

α1 ≥α2 ≥ ·· · ≥αn ≥ 0,

β1 ≥β2 ≥ ·· · ≥βn ≥ 0,

x1 ≥ x2 ≥ ·· · ≥ xn ≥ 0.

Then we have

(Σ_{i=1}^n α_i x_i)(Σ_{i=1}^n β_i) ≥ (Σ_{i=1}^n β_i x_{n+1−i})(Σ_{i=1}^n α_i).


Proof: Consider the following:

n Σ_{i=1}^n α_i x_i = n α_1 x_1 + n α_2 x_2 + ··· + n α_n x_n
= Σ_{i=1}^n α_i x_1 + (n α_1 − Σ_{i=1}^n α_i) x_1 + n α_2 x_2 + ··· + n α_n x_n
≥ Σ_{i=1}^n α_i x_1 + (n α_1 + n α_2 − Σ_{i=1}^n α_i) x_2 + ··· + n α_n x_n
= Σ_{i=1}^n α_i x_1 + Σ_{i=1}^n α_i x_2 + (n α_1 + n α_2 − 2 Σ_{i=1}^n α_i) x_2 + ··· + n α_n x_n
...
≥ Σ_{i=1}^n α_i x_1 + Σ_{i=1}^n α_i x_2 + ··· + Σ_{i=1}^n α_i x_n + (n α_1 + n α_2 + ··· + n α_n − n Σ_{i=1}^n α_i) x_n
= Σ_{i=1}^n α_i · Σ_{i=1}^n x_i.

Similarly, we have

n Σ_{i=1}^n β_i x_{n+1−i} = n β_1 x_n + n β_2 x_{n−1} + ··· + n β_n x_1
= Σ_{i=1}^n β_i x_n + (n β_1 − Σ_{i=1}^n β_i) x_n + n β_2 x_{n−1} + ··· + n β_n x_1
≤ Σ_{i=1}^n β_i x_n + (n β_1 + n β_2 − Σ_{i=1}^n β_i) x_{n−1} + ··· + n β_n x_1
= Σ_{i=1}^n β_i x_n + Σ_{i=1}^n β_i x_{n−1} + (n β_1 + n β_2 − 2 Σ_{i=1}^n β_i) x_{n−1} + ··· + n β_n x_1
...
≤ Σ_{i=1}^n β_i x_n + Σ_{i=1}^n β_i x_{n−1} + ··· + Σ_{i=1}^n β_i x_1 + (n β_1 + n β_2 + ··· + n β_n − n Σ_{i=1}^n β_i) x_1
= Σ_{i=1}^n β_i · Σ_{i=1}^n x_i.

Therefore the lemma follows.

Lemma 3.3.13

Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i})
≤ s f(S) + Σ_{i=1}^t (s + 1 − i)((s+1)/s)^{i−1} f(S ∪ {c′_i} \ {b′_i}) − (s + 1)((s+1)/s)^{t−1} f(S \ {b′_1, ..., b′_t}).


Proof: By the definition of weakly submodular, we have

s f(S) + s f(S ∪ {c′_1} \ {b′_1}) ≥ (s − 1) f(S ∪ {c′_1}) + (s + 1) f(S \ {b′_1})
s f(S \ {b′_1}) + (s − 1) f(S ∪ {c′_2} \ {b′_2}) ≥ (s − 2) f(S ∪ {c′_2}) + (s + 1) f(S \ {b′_1, b′_2})
...
s f(S \ {b′_1, ..., b′_{t−1}}) + (s − t + 1) f(S ∪ {c′_t} \ {b′_t}) ≥ (s − t) f(S ∪ {c′_t}) + (s + 1) f(S \ {b′_1, ..., b′_t}).

Multiplying the i-th inequality by ((s+1)/s)^{i−1} and summing them all up, we get

s f(S) + Σ_{i=1}^t (s + 1 − i)((s+1)/s)^{i−1} f(S ∪ {c′_i} \ {b′_i})
≥ Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}) + (s + 1)((s+1)/s)^{t−1} f(S \ {b′_1, ..., b′_t}).

After rearranging the inequality, we get

Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i})
≤ s f(S) + Σ_{i=1}^t (s + 1 − i)((s+1)/s)^{i−1} f(S ∪ {c′_i} \ {b′_i}) − (s + 1)((s+1)/s)^{t−1} f(S \ {b′_1, ..., b′_t}).

Lemma 3.3.14

Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}) − Σ_{i=1}^t (s + t + 1 − i)((s+1)/s)^{i−1} f(S)
≥ s f(S ∪ {c′′_1, ..., c′′_t}) − (s + 1)((s+1)/s)^{t−1} f(S).

Proof: By the definition of weakly submodular, we have

(s + t − 1) f(S ∪ {c′′_1}) + (s + 1) f(S ∪ {c′′_2, ..., c′′_t}) ≥ s f(S ∪ {c′′_1, ..., c′′_t}) + (s + t) f(S)
...
(s + 1) f(S ∪ {c′′_{t−1}}) + (s + 1) f(S ∪ {c′′_t}) ≥ s f(S ∪ {c′′_{t−1}, c′′_t}) + (s + 2) f(S)
s f(S ∪ {c′′_t}) + (s + 1) f(S) ≥ s f(S ∪ {c′′_t}) + (s + 1) f(S).


Multiplying the i-th inequality by ((s+1)/s)^{i−1} and summing them all up, we have

Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}) + (s + 1)((s+1)/s)^{t−1} f(S)
≥ s f(S ∪ {c′′_1, ..., c′′_t}) + Σ_{i=1}^t (s + t + 1 − i)((s+1)/s)^{i−1} f(S).

Therefore, we have

Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i})
≥ s f(S ∪ {c′′_1, ..., c′′_t}) + Σ_{i=1}^t (s + t + 1 − i)((s+1)/s)^{i−1} f(S) − (s + 1)((s+1)/s)^{t−1} f(S).

Let

A = Σ_{i=1}^t (s − i)((s+1)/s)^{i−1},   B = Σ_{i=1}^t (s + 1 − i)((s+1)/s)^{i−1},
C = Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1},   D = Σ_{i=1}^t (s + t + 1 − i)((s+1)/s)^{i−1}.

Lemma 3.3.15

C Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}) ≥ A Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}).

Proof: This is immediate by Lemma 3.3.12.

Theorem 3.3.16 Let s be the size of a basis. The local search algorithm achieves an approximation ratio of at most 14.5 for arbitrary s, and of approximately 10.88 when s = 6. The ratio converges to 10.22 as s tends to ∞.

Proof: Since S is a locally optimal solution, we have

f(S) ≥ f(S ∪ {c′_i} \ {b′_i}).

Since f(S \ {b′_1, ..., b′_t}) ≥ 0, by Lemma 3.3.13, we have

Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}) ≤ s f(S) + Σ_{i=1}^t (s + 1 − i)((s+1)/s)^{i−1} f(S).


Therefore,

Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}) ≤ (s + B) f(S).

On the other hand, we have O ⊆ S ∪ {c′′_1, ..., c′′_t}; by monotonicity, we have f(O) ≤ f(S ∪ {c′′_1, ..., c′′_t}). By Lemma 3.3.14, we have

Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}) ≥ s f(O) + [D − (s + 1)((s+1)/s)^{t−1}] f(S).

By Lemma 3.3.12, we have

C Σ_{i=1}^t (s − i)((s+1)/s)^{i−1} f(S ∪ {c′_i}) ≥ A Σ_{i=1}^t (s + t − i)((s+1)/s)^{i−1} f(S ∪ {c′′_i}).

Therefore,

C(s + B) f(S) ≥ A s f(O) + A[D − (s + 1)((s+1)/s)^{t−1}] f(S).

Hence the approximation ratio satisfies

f(O) / f(S) ≤ [C B − A D + C s + A(s + 1)((s+1)/s)^{t−1}] / (A s) = (C B − A D + C s) / (A s) + ((s+1)/s)^t.

Simplifying the notation, we have

f(O) / f(S) ≤ [Σ_{i=1}^t (s² + s t + t i − s i)((s+1)/s)^{i−1} + Σ_{i=t+1}^{2t−1} t(2t − i)((s+1)/s)^{i−1}] / [Σ_{i=1}^t s(s − i)((s+1)/s)^{i−1}] + ((s+1)/s)^t.

The expression is monotonically increasing with t and is bounded from above⁴ by 14.5 for s > 1. In particular, it has an approximate value of 10.88 when s = 6. The ratio converges to 10.22 as s tends to ∞.

⁴ This number is obtained by a computer program.
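The final expression is easy to evaluate numerically; the sketch below (hypothetical helper name, not code from the thesis) computes the right-hand side for given s and t. Note that t ≤ s always holds, since B ⊆ S; hand computation gives the value 4 at (s, t) = (2, 1) and exactly 14.5 at (s, t) = (2, 2).

```python
def local_search_ratio(s, t):
    # Evaluates the bound on f(O)/f(S) for basis size s and swap count t <= s:
    #   [sum_{i=1}^t (s^2 + st + ti - si) x^(i-1)
    #    + sum_{i=t+1}^{2t-1} t(2t - i) x^(i-1)]
    #   / [sum_{i=1}^t s(s - i) x^(i-1)] + x^t,  with x = (s+1)/s.
    x = (s + 1) / s
    num = (sum((s * s + s * t + t * i - s * i) * x ** (i - 1)
               for i in range(1, t + 1))
           + sum(t * (2 * t - i) * x ** (i - 1)
                 for i in range(t + 1, 2 * t)))
    den = sum(s * (s - i) * x ** (i - 1) for i in range(1, t + 1))
    return num / den + x ** t
```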


Chapter 4

Sum Colouring - A Case Study of Greedy

Algorithms

In this chapter, we study greedy algorithms through a particular problem: the sum colour-

ing problem. We focus on the class of d-claw-free graphs and its subclasses, proving NP-

hardness and giving greedy approximation algorithms for the problem. Finally, we derive inapproximability lower bounds for the sum colouring problem on restricted families

of graphs using the priority framework developed in [12].

4.1 Introduction

The sum colouring problem (SC), also known as the chromatic sum problem, was formally

introduced in [62]. For a given graph G = (V ,E), a proper colouring of G is an assignment of

positive integers to its vertices φ : V → Z+ such that no two adjacent vertices are assigned

the same colour. The sum colouring problem seeks a proper colouring such that the sum of colours over all vertices, Σ_{v∈V} φ(v), is minimized; the minimum value is called the chromatic sum of the graph G. Sum colouring has many applications in job

scheduling and resource allocation. For example, consider an instance of job scheduling

in which one is given a set of jobs S each requiring unit execution time. One can view this


instance in a graph-theoretic sense: we construct a graph G whose vertex set is in one-to-

one correspondence with the set of input jobs S, and an edge exists between two vertices if

and only if the corresponding jobs conflict for resources. In other words, we consider the

underlying conflict graph G of the job scheduling instance. Finding the chromatic sum of G

corresponds to minimizing the average job completion time.

The sum colouring problem has been studied extensively in the literature. The problem

is NP-hard for general graphs [62], and cannot be approximated within n^{1−ε} for any constant ε > 0 unless ZPP = NP [5, 32]. Note that an optimal colouring of a graph does not necessarily

yield an optimal sum colouring for this graph. Consider a graph G and an optimal sum

colouring of G in Fig. 4.1. It uses three colours, while the chromatic number of G is two. In


Figure 4.1: An optimal sum colouring of G

fact, the gap between the chromatic number and the number of colours used in an optimal

sum colouring can be made arbitrarily large, even for the case of trees [62].
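The gap is easy to reproduce by brute force. The sketch below (an illustrative instance, not the graph of Fig. 4.1) uses the double star with two adjacent centres and three leaves each: its best proper 2-colouring has sum 12, while allowing a third colour brings the chromatic sum down to 11.

```python
import itertools

def chromatic_sum(n, edges, max_colour):
    # Brute-force minimum of sum(phi(v)) over proper colourings
    # phi: V -> {1, ..., max_colour}.
    best = None
    for phi in itertools.product(range(1, max_colour + 1), repeat=n):
        if all(phi[u] != phi[v] for u, v in edges):
            s = sum(phi)
            best = s if best is None else min(best, s)
    return best

# Double star: centres 0 and 1; leaves 2,3,4 attach to 0, leaves 5,6,7 to 1.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
two_colour_best = chromatic_sum(8, edges, 2)    # chromatic number is 2
unrestricted_best = chromatic_sum(8, edges, 4)  # allow more colours
```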

The sum colouring problem is polynomial time solvable for proper interval graphs [74]

and trees [62]. However, the problem is APX-hard for both bipartite graphs [7] and inter-

val graphs [70], which is a little surprising given that many NP-hard problems are solvable

in polynomial time for these two classes. The best known approximation algorithm for interval graphs has an approximation ratio of 1.796 [43]. For bipartite graphs, there is a 27/26-approximation [37].

In this chapter, we focus on the class of d-claw-free graphs and its subclasses. Recall that a graph is d-claw-free if every vertex has fewer than d independent neighbours. The class of


d-claw-free graphs is exactly the class G(IS_{d−1}) discussed in Chapter 2. Here we give subclasses of d-claw-free graphs in addition to those given in Subsection 2.2.4. All these subclasses fall into the category of geometric intersection graphs defined in Subsection 2.2.3.

1. Unit Interval Graphs: The vertices are unit intervals in a real line, and two vertices are

adjacent if and only if the two corresponding intervals overlap; see Fig. 4.2a.

2. Proper Interval Graphs: The vertices are intervals in a real line and no interval is prop-

erly contained in another interval. Two vertices are adjacent if and only if the two

corresponding intervals overlap; see Fig. 4.2b. It is known that the class of proper

interval graphs and the class of unit interval graphs coincide [80]. Furthermore, a ge-

ometric representation of a proper interval graph can be transformed to a geometric

representation of a unit interval graph in polynomial time using only expansion and

contraction of intervals [9].

(a) A unit interval graph (b) A proper interval graph

Figure 4.2: Unit interval graphs and proper interval graphs

3. Unit Square Graphs: The vertices are axis-parallel unit squares1 in a two dimensional

plane, and two vertices are adjacent if and only if the two corresponding squares over-

lap; see Fig. 4.3a.

4. Proper Intersection Graphs of Axis-Parallel Rectangles: The vertices are axis-parallel

rectangles in a two dimensional plane and the projection of any rectangle onto either

1 Note that here we do not allow unit squares to rotate. For the rest of this chapter, whenever we say unit squares, we mean axis-parallel unit squares.


the x-axis or y-axis is not properly contained in that of another rectangle. Two vertices

are adjacent if and only if the two corresponding rectangles intersect; see Fig. 4.3b.

(a) A unit square graph (b) A proper intersection graph of axis-parallel rectangles

Figure 4.3: Unit square graphs and proper intersection graphs of axis-parallel rectangles

5. Unit Disk Graphs: The vertices are unit disks in a two dimensional plane, and two

vertices are adjacent if and only if the two corresponding disks overlap; see Fig. 4.4a.

6. Penny Graphs: The vertices are unit disks in a two dimensional plane that do not share

a common interior point, and two vertices are adjacent if and only if the two corre-

sponding disks touch each other at the boundary; see Fig. 4.4b.

(a) A unit disk graph (b) A penny graph

Figure 4.4: Unit disk graphs and penny graphs


It is not hard to see that unit interval graphs and proper interval graphs are 3-claw-free;

unit square graphs and proper intersection graphs of axis-parallel rectangles are 5-claw-

free; and unit disk graphs and penny graphs are 6-claw-free. We first show the class of proper

intersection graphs of axis-parallel rectangles and the class of unit square graphs coincide.

Theorem 4.1.1 The class of proper intersection graphs of axis-parallel rectangles is the same

as the class of unit square graphs. Furthermore, a geometric representation of a proper inter-

section graph of axis-parallel rectangles can be transformed to a geometric representation a

unit square graph in polynomial time.

Proof: It is clear that unit square graphs are contained in the class of proper intersection

graphs of axis-parallel rectangles. We only need to show the reverse direction. Given a ge-

ometric representation of a proper intersection graph of axis-parallel rectangles, for each

axis, its projection is a proper interval graph. By applying on both x-axis and y-axis the

transformation given in [9], which converts a proper interval representation to a unit in-

terval representation using only expansion and contraction of intervals, a geometric repre-

sentation of a unit square graph can be constructed in polynomial time. Therefore, the two

classes coincide.

4.2 NP-Hardness for Penny Graphs

In this section, we show sum colouring is NP-hard for penny graphs. The reduction com-

bines ideas in [16] and [44], and reduces from the maximum independent set problem on

planar graphs with maximum degree 3. First, we make use of the following observation from

Valiant [85].

Lemma 4.2.1 [85] A planar graph G with maximum degree 3 can be embedded in the plane

using O(|V |2) units of area in such a way that its vertices are at integer coordinates and its


edges are drawn so that they are made up of line segments of the form x = i or y = j , for

integers i and j .

Given a planar graph G with maximum degree 3, we first apply Lemma 4.2.1 to draw its

embedding onto integer coordinates. Without loss of generality we assume those coordi-

nates are multiples of 8 units. We replace each vertex with a filled unit disk, and for each

edge uv , we replace it with luv tangent hollow unit disks where luv is the Manhattan dis-

tance between u and v . We call the resulting penny graph G ′. See figure 4.5. Note that there

are three types of adjacent pair of unit disks. A corner pair refers two adjacent disks such

that one of them is at the corner; an uneven pair refers two adjacent disks such that the cen-

tre of at least one of them does not lie on the grid; the rest of the pairs are straight pairs. It is

not hard to observe the following relationship between the sizes of maximum independent

sets of the two graphs.

Figure 4.5: Transformation from planar graphs with maximum degree 3 to penny graphs

Lemma 4.2.2 Let α(·) denote the size of the maximum independent set; then α(G′) = α(G) + Σ_{uv∈E} l_uv / 2.

Proof: We first show that α(G′) is at least α(G) + Σ_{uv∈E} l_uv / 2. Given a maximum independent set I of G, for any edge uv, at least one of u and v is not in I; hence we can add l_uv / 2 alternating disks for each edge uv to form an independent set of G′. Therefore, α(G′) ≥ α(G) + Σ_{uv∈E} l_uv / 2. On the other hand, given a maximum independent set I′ of G′, we can make the following modifications to I′ without changing the size of I′. For each edge uv in G, if both u and v are in I′, then the number of disks along the edge uv which are in I′ must be less than l_uv / 2. In this case, we can then remove, say v, from I′ and increase the number of disks in I′ along the edge uv by at least one. We keep doing that until for any edge uv in G there is at most one endpoint in I′.

It is clear that after such modifications, the vertices in I′ ∩ G form an independent set for G, and hence α(G′) ≤ α(G) + Σ_{uv∈E} l_uv / 2.

We now do a second transformation. The goal of this transformation is to insert a gadget

between two vertices (a pair of adjacent unit disks) and link the size of the maximum independent set to the chromatic sum. For each straight pair of adjacent unit disks, we do a

transformation as shown in Fig. 4.6. For each uneven pair of adjacent unit disks, we do a

transformation as shown in Fig. 4.7 and for each corner pair of adjacent unit disks, we do a

transformation as shown in Fig. 4.8.

Figure 4.6: Transformation for straight pairs

The purpose of the second transformation is that for each edge uv in G ′, we want to add

an edge gadget as shown in Fig 4.9. Because the original graph is a planar graph with max-

imum degree 3, we can add these edge gadgets in such a way that there are no overlapping


Figure 4.7: Transformation for uneven pairs

Figure 4.8: Transformation for corner pairs

disks and no two disks in different gadgets touch each other. We call the resulting

graph G ′′. We now prove the following lemma to complete the reduction.

Lemma 4.2.3 Let m be the number of edges in G′ and n be the number of vertices. Let α(G′) be the size of the maximum independent set of G′. Then the chromatic sum of G′′ is 8m + 2n − α(G′).

Proof: We first show that the chromatic sum of G′′ is at most 8m + 2n − α(G′). To see this, we give an explicit colouring of G′′. Let I be a maximum independent set of G′; we colour all vertices in I with colour 1. We then colour the remaining vertices in G′ with colour 2. Consider an edge gadget as depicted in Fig. 4.9. Since at least one of u and v is coloured



Figure 4.9: The edge gadget

with 2, without loss of generality, assume u has colour 2. We then colour y with 1, z with 3,

x with 2 and p, q with 1. Therefore, the chromatic sum of G ′′ is at most 8m +2n −α(G ′).

We now show the chromatic sum of G′′ is at least 8m + 2n − α(G′). Consider an optimal sum colouring; we first claim that all vertices in G′ coloured with 1 must form an independent set of G′. Suppose this is not the case and assume both u and v are coloured with 1. Then the best possible choice of colours leads to Fig. 4.10, which achieves the sum 12. If we recolour


Figure 4.10: Best colouring of the edge gadget when both u and v are coloured 1

v with 2, we achieve the sum 11 as shown in Fig. 4.11. However, recolouring v might lead to

a conflict in its adjacent edge gadgets. We claim that we can recolour each of its adjacent

edge gadgets to maintain at most its original sum. Let u′ be a vertex adjacent to v in G ′, and

y ′, z ′, x ′, p ′, q ′ be the vertices in this edge gadget, see Fig. 4.12 below. There are two cases:

1. If u′ is coloured with 2, then colour z ′ with 1, y ′ with 3, x ′ with 2, p ′, q ′ with 1. This

is the minimum possible, given that the colour of u′ does not change. Therefore, it

cannot exceed the original.



Figure 4.11: Recolour v to improve the sum


Figure 4.12: The adjacent edge gadget

2. If u′ is not coloured with 2, then colour z ′ with 2, y ′ with 1, x ′ with 3, p ′, q ′ with 1. This

is also the minimum possible, so it cannot exceed the original.

Therefore by recolouring v with 2 and properly recolouring all its adjacent gadgets, we can

reduce the total sum. This contradicts the fact that the original colouring was an optimal

sum colouring. Therefore, all vertices in G ′ coloured with 1 must form an independent set

of G′. For the remaining vertices in G′, each receives a colour of at least 2, and for each gadget, 8 is the best possible sum. Therefore, the chromatic sum of G′′ is at least 8m + 2n − α(G′).

Theorem 4.2.4 Sum colouring is NP-hard for penny graphs.

Proof: The NP-hardness follows immediately from Lemmas 4.2.1, 4.2.2 and 4.2.3.

Since penny graphs are special cases of unit disk graphs, we have the following corollary.

Corollary 4.2.5 Sum colouring is NP-hard for unit disk graphs.


Note that we can modify the two transformations using unit squares. For the first transformation, we replace all unit disks with unit squares and make two modifications:

1. Between any two grid vertices, if the original transformation contains an uneven pair

of unit disks, we do not use an uneven pair of unit squares, instead, we squeeze eight

unit squares into a seven-unit length; see Fig. 4.13.

Figure 4.13: Transformation between two grid vertices

2. In the degree-two or degree-three corner case, we shift the unit squares adjacent to the corner square by a tenth of a unit so that they no longer touch each other; see Fig. 4.14.

(a) A degree-two corner (b) A degree-three corner

Figure 4.14: Corner cases in the first transformation

Note that these two modifications are so small that we can treat them as if all unit squares were aligned perfectly, yet the underlying intersection graph is the same as the one produced using unit disks.

Now we describe the second transformation. We focus on the two slightly complicated

cases: overlapping adjacent pairs and degree-three corners. The rest of the cases can be

easily handled using similar gadgets. The case of overlapping adjacent pairs is illustrated in

Fig. 4.15 and the case of degree-three corners is illustrated in Fig. 4.16.

Based on the above observations, the following theorem is immediate.


Figure 4.15: An overlapping adjacent pair

Figure 4.16: A degree-three corner

Theorem 4.2.6 Sum colouring is NP-hard for unit square graphs.

Since proper intersection graphs of axis-parallel rectangles coincide with unit square

graphs, we have the following corollary.

Corollary 4.2.7 Sum colouring is NP-hard for proper intersection graphs of axis-parallel rect-

angles.


4.3 Approximation Algorithms for d-Claw-Free Graphs and

their Subclasses

In this section, we give greedy approximation algorithms for the sum colouring problem.

We first show a natural and simple greedy algorithm that achieves a k-approximation for

the graph class G(IS_k). We then further improve the ratio for the more specific class of

unit square graphs.

4.3.1 Compact Colouring for G(IS_k)

A (k + 1)-approximation for sum colouring on the class G(IS_k) was stated in [43] and a k-approximation was stated in [35], but a formal proof does not seem to exist in the literature.

We provide a formal proof here. We use the notion of compact colouring as defined in [5].

Definition 4.3.1 [5] A proper vertex colouring φ(·) is compact if and only if every vertex v

with φ(v) = i has a neighbour u with φ(u) = j for every j , 1 ≤ j ≤ i −1.

A compact colouring of a graph G is easily attainable; we can simply colour the vertices of G in a first-fit greedy fashion, assigning each vertex v the minimal colour that does not conflict with its previously assigned neighbours. This simple algorithm, which runs in linear time and can be used in online settings, yields the following result.
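The first-fit rule above can be sketched as follows (a minimal illustration, not from the thesis; the adjacency-dict representation of the graph is our own choice):

```python
def first_fit_colouring(graph, order):
    """First-fit greedy colouring: scan vertices in the given order and
    give each the smallest colour (1, 2, 3, ...) not already used by a
    previously coloured neighbour.  The result is a compact colouring."""
    colour = {}
    for v in order:
        used = {colour[u] for u in graph[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour

# A 4-cycle (n = 4, m = 4): by Theorem 4.3.2 the colouring sum is at most m + n = 8.
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
cols = first_fit_colouring(cycle, [1, 2, 3, 4])
print(cols, sum(cols.values()))  # {1: 1, 2: 2, 3: 1, 4: 2} 6
```

On the 4-cycle the greedy sum is 6, comfortably within the m + n = 8 bound of the theorem below.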

Theorem 4.3.2 [63, 84] Given a graph G, the sum of colours in a compact colouring obtained

using the first-fit greedy algorithm is at most m +n, where n is the number of vertices and m

is the number of edges.

Proof: For completeness, we include a proof here. We order the vertices in non-decreasing order of assigned colour, breaking ties arbitrarily. Let the i-th vertex in this ordering be vi.

Let ci be the colour assigned to vi and li be the number of neighbours of vi preceding vi

in the ordering. Note that ci ≤ li +1 because in the worst case the li neighbours of vi that


precede vi in the ordering have been assigned unique colours 1,2, . . . , li . Summing over all

vertices, we obtain:

∑_{i=1}^{n} c_i ≤ ∑_{i=1}^{n} (l_i + 1) = n + ∑_{i=1}^{n} l_i = n + m.

Lemma 4.3.3 For any graph in G(IS_k), its chromatic sum is at least n + (1/k)·m.

Proof: Given a graph G = (V, E) in G(IS_k), let φ(·) be a colouring of G that achieves the chromatic sum. Consider reconstructing G, one vertex at a time, in non-decreasing order of the assigned colours, breaking ties lexicographically. In other words, we define a total ordering ≺ such that u ≺ v if and only if φ(u) < φ(v), or φ(u) = φ(v) and u is lexicographically before v. Let N′(v) denote the set of neighbours of v that appear before v:

N′(v) = {u | u ∈ N(v) and u ≺ v}.

Note that ∑_{v∈V} |N′(v)| = m since every edge (u, v) is counted exactly once. If φ(v) = c, then the vertices in N′(v) are assigned colours smaller than c. Since G is in G(IS_k), each colour class contributes at most k vertices to N′(v); therefore |N′(v)| ≤ k · (c − 1). It follows that:

φ(v) = 1 + (c − 1) = 1 + (1/k) · k · (c − 1) ≥ 1 + (1/k) · |N′(v)|.

Summing over all vertices, we obtain:

∑_{v∈V} φ(v) ≥ ∑_{v∈V} (1 + (1/k) · |N′(v)|) = n + (1/k) · m.

Combining Theorem 4.3.2 with Lemma 4.3.3, we confirm the following result.

Theorem 4.3.4 For a graph in G(IS_k), the first-fit greedy algorithm achieves a k-approximation to the sum colouring problem.


4.3.2 Unit Square Graphs

Since unit square graphs are in G(IS_4), Theorem 4.3.4 shows that the first-fit greedy algorithm achieves a 4-approximation for the sum colouring problem. We can improve the approximation ratio by using structural properties of unit square graphs. A unit strip is an infinitely long region defined by the set of points {(x, y) | y ∈ [i, i + 1)} for some fixed i. First, we have the following observation:

Observation 4.3.5 Given a unit strip, consider the unit squares whose centres lie inside this strip, and let H be the intersection graph induced by those unit squares; then H is a unit interval graph.

Suppose we are given a geometric representation of a unit square graph, i.e., a list of (x, y)-coordinates for the centres of n congruent squares. Partition the plane using unit strips so that the centres of all unit squares are covered. Label the unit strips 1, 2, 3, . . . from top to bottom. A strip is odd if its label is odd, and even otherwise. We consider the following algorithm:

AN ALGORITHM FOR SC ON UNIT SQUARE GRAPHS

1: Optimally sum colour each odd strip with colour classes consisting of odd numbers

{1,3,5, . . . }

2: Optimally sum colour each even strip with colour classes consisting of even numbers

{2,4,6, . . . }

3: Return the resulting colouring
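The algorithm can be rendered in Python roughly as follows (our own sketch; the strip bookkeeping and the closed-square intersection test, centres differing by at most 1 in each coordinate, are our assumptions):

```python
def sum_colour_unit_squares(centres):
    """2-approximate sum colouring of a unit square graph given centre
    coordinates.  Odd strips use colours 1, 3, 5, ...; even strips use
    2, 4, 6, ...  Within a strip, squares are coloured first-fit from
    left to right."""
    # Greedily cover the sorted y-coordinates with unit strips [t, t + 1).
    order = sorted(range(len(centres)), key=lambda i: centres[i][1])
    strips, top = [], None
    for i in order:
        if top is None or centres[i][1] >= top + 1:
            strips.append([])
            top = centres[i][1]
        strips[-1].append(i)

    colour = {}
    for s, strip in enumerate(strips):
        start = 1 if s % 2 == 0 else 2  # odd strips: 1,3,...; even: 2,4,...
        for i in sorted(strip, key=lambda j: centres[j][0]):
            # Within one strip, two unit squares intersect iff |x_i - x_j| <= 1.
            used = {colour[j] for j in strip if j in colour
                    and abs(centres[i][0] - centres[j][0]) <= 1}
            c = start
            while c in used:
                c += 2
            colour[i] = c
    return colour

# Three squares: the first two overlap, the third is far to the right.
print(sum_colour_unit_squares([(0, 0), (0.5, 0.5), (3, 0)]))  # {0: 1, 1: 3, 2: 1}
```

Here all three centres fall into one (odd) strip, so the two overlapping squares get colours 1 and 3 and the isolated square reuses colour 1.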

Theorem 4.3.6 The above algorithm is a 2-approximation for the sum colouring problem on

unit square graphs. Furthermore, the algorithm can be viewed as a greedy algorithm and

runs in time O(n logn) provided that we are given the set of centres of the unit squares.

Proof: First observe that the graph can be partitioned into at most n non-empty unit strips,

and such a partition is easily obtainable. Just sort the y-coordinates of the centres and

greedily cover them with unit intervals. The strips induced by these unit intervals yield a


desired partition. Since each strip contains at least one centre and a centre can appear in at

most one strip, there are at most n unit strips.

Secondly, observe that the colouring given by the algorithm is a valid colouring for the

given graph. This is because:

1. two squares in different strips with the same parity cannot intersect each other;

2. the colouring is a proper colouring within each strip;

3. the colouring does not create any violation between two adjacent strips.

Finally, the algorithm uses at most twice the sum of an optimal sum-colouring. To see this,

let A be the colouring of G obtained by the algorithm and O be an optimal sum-colouring of

G . For each strip si , let Gi be the graph induced by si . Let Ai be the colouring of Gi in A, Bi

be an optimal sum-colouring of Gi , and Oi be the colouring of Gi in O. For convenience, for

a particular colouring C, we use sum(C) to denote its sum. By lines 1 and 2 of our algorithm, we have sum(Ai) ≤ 2·sum(Bi) for all i. Since Bi is an optimal sum-colouring of Gi, we have

sum(Bi ) ≤ sum(Oi ). Therefore, we have

sum(A) = ∑_i sum(Ai) ≤ 2 ∑_i sum(Bi) ≤ 2 ∑_i sum(Oi) = 2·sum(O).

Therefore, the algorithm is a 2-approximation for the sum colouring problem on unit square

graphs.

Note that by Observation 4.3.5, each strip is a unit interval graph. Therefore, an opti-

mal sum-colouring can be obtained by the first-fit left-to-right greedy algorithm [74]. The

partitioning of the graph into non-empty unit strips actually defines a total ordering of unit

squares. Therefore, the algorithm can be viewed as a greedy algorithm. Since there are at

most n strips, and optimally sum-colouring each strip using the desired colour class takes

linear time, the algorithm runs in time O(n logn) provided that we are given the set of cen-

tres of the unit squares. It runs in linear time in the number of squares if the centres are

pre-sorted in their x and y coordinates respectively.


By Theorems 4.1.1 and 4.3.6, we immediately have the following corollary.

Corollary 4.3.7 Given a geometric representation of a proper axis-parallel rectangle graph,

there is a polynomial time algorithm that achieves a 2-approximation for the sum colouring

problem.

In the next section, we show limitations of greedy algorithms using the priority frame-

work developed in [12].

4.4 Priority Inapproximation for Sum Colouring

We first give a brief introduction to the priority framework. The popularity of the class of greedy algorithms has made it an interesting object of study. As a general algorithmic

paradigm, one interesting question to ask about greedy algorithms is their ultimate power

and limitations in solving specific problems. However, this is impossible without a precise

definition of the object itself. Initiated by Borodin, Nielsen and Rackoff [12], the priority

framework focuses on the style of the algorithm. It is a useful model for analyzing greedy

algorithms and has led to a number of insightful results. This includes scheduling problems

in [12] and [79], graph problems in [22] and [10], facility location and set cover in [3], and

Max2Sat in [75]. The priority framework consist of two types of priority algorithms: fixed

order and adaptive order priority algorithms.

4.4.1 Fixed Order and Adaptive Order

Both fixed order and adaptive order priority algorithms use total orderings of all possible

input items. In a fixed order priority algorithm, a total ordering of all possible input items

is maintained throughout the algorithm and irrevocable decisions are made iteratively on

each input item according to this ordering. Let S be a problem instance that consists of a

set of input items. The structure of a fixed order priority algorithm is described as follows:


A FIXED ORDER PRIORITY ALGORITHM

1: Determine a total ordering on all possible input items

2: while S is not empty do

3: Let si be the first input item in S according to the ordering

4: Make an irrevocable decision on si

5: Remove si from S

6: end while

For adaptive order, the algorithm is allowed to reorder the remaining input items to de-

cide which item is considered next. The structure of an adaptive order priority algorithm is

described as follows:

AN ADAPTIVE ORDER PRIORITY ALGORITHM

1: while S is not empty do

2: Determine a total ordering on all possible input items (see discussion below)

3: Let si be the first input item in S according to the ordering

4: Make an irrevocable decision on si

5: Remove si from S

6: end while
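As an illustration (our own skeleton, with hypothetical callback names `ordering_key` and `decide`), the adaptive template can be written as a higher-order function; a fixed order priority algorithm is the special case where the ordering ignores the decisions made so far:

```python
def adaptive_priority(items, ordering_key, decide):
    """Adaptive-order priority template: repeatedly reorder the remaining
    items (the ordering may depend on the decisions made so far), take
    the first item, and commit to an irrevocable decision on it."""
    remaining, decisions = list(items), []
    while remaining:
        remaining.sort(key=lambda it: ordering_key(it, decisions))
        item = remaining.pop(0)
        decisions.append((item, decide(item, decisions)))
    return decisions

# Toy use: greedy independent set on the path 1-2-3, preferring low degree.
graph = {1: {2}, 2: {1, 3}, 3: {2}}
by_degree = lambda v, done: len(graph[v])
no_taken_neighbour = lambda v, done: all(u not in graph[v] for u, acc in done if acc)
result = adaptive_priority(graph, by_degree, no_taken_neighbour)
```

On this instance the template accepts vertices 1 and 3 and rejects vertex 2.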

4.4.2 Deriving Lower Bounds

One key contribution of the priority model is that it provides a formal framework where

lower bounds can be derived. This is often achieved by an adversary argument. The ad-

versary argument can be viewed as a game between a priority algorithm and an adversary:

the adversary is constructing an input instance while the algorithm is constructing a solu-

tion to that input instance. Let S be the set of all possible input items. For adaptive priority

algorithms, at each step, the algorithm determines a total ordering on S. The adversary selects an item e in S, presents it to the algorithm, and removes all items before e and possibly


some items after e from S. The algorithm then makes an irrevocable decision on e. The

game repeats until S is empty. There is one constraint for the adversary. Once the game is

completed, all items presented to the algorithm by the adversary must form a valid input

instance. Therefore, at the end, the adversary constructs an input instance, while the pri-

ority algorithm constructs a solution. A lower bound on the approximation ratio is obtained by examining the ratio of the value of the optimal solution of that instance to the value of the algorithm's solution.

4.4.3 An Inapproximation Lower Bound for Sum Colouring

We illustrate the idea by showing an inapproximation lower bound for the sum colouring

problem. We consider a natural vertex adjacency model. Each data item is a vertex repre-

sented by its label and the labels of its neighbours. We have the following inapproximation

result for the sum colouring problem.

Theorem 4.4.1 There is no adaptive priority algorithm in the vertex adjacency model for the sum colouring problem on planar 4-claw-free bipartite graphs that can achieve an approximation ratio better than 11/10.

Proof: We examine the two graphs in Fig. 4.17 below. The graph G1 on the left has 7 vertices: five vertices have degree two and two vertices have degree three. One can verify that the optimal solution for this graph is 10 by giving colour 1 to B, F, G, D and 2 to everything else. The graph G2 on the right also has 7 vertices: three vertices have degree two and four vertices have degree three. One can verify that the optimal solution for this graph is also 10 by giving colour 1 to A, G, F, E and 2 to everything else.

In the vertex adjacency model, any adaptive priority algorithm determines an initial or-

dering on all possible input items. The adversary will present to the algorithm the first input

item e having degree 2 or 3 in this ordering. There are four cases:


Figure 4.17: Two graphs for adaptive priority algorithms

• If e is a vertex of degree 2 and the algorithm assigns colour 1 to it, then the adversary chooses graph G1 and presents vertex A to the algorithm. The solution obtained by the algorithm is at least 11.

• If e is a vertex of degree 2 and the algorithm assigns a colour other than 1 to it, then the adversary chooses graph G1 and presents vertex B to the algorithm. The solution obtained by the algorithm is at least 11.

• If e is a vertex of degree 3 and the algorithm assigns colour 1 to it, then the adversary chooses graph G1 and presents vertex C to the algorithm. The solution obtained by the algorithm is at least 11.

• If e is a vertex of degree 3 and the algorithm assigns a colour other than 1 to it, then the adversary chooses graph G2 and presents vertex A to the algorithm. The solution obtained by the algorithm is at least 11.

In all of the above cases, the algorithm cannot achieve an approximation ratio better than 11/10.

4.5 Conclusion

In this chapter, we have discussed the sum colouring problem on restricted families of

graphs. We establish NP-hardness results and develop greedy approximation algorithms for interesting subclasses of d-claw-free graphs. Furthermore, we analyze the problem in the priority model and give an inapproximation lower bound. The overall approach is to study specializations and generalizations of graph classes, establishing new positive and negative results for greedy algorithms.

Although we have addressed greedy algorithms for sum colouring problems to some

extent, there are many open questions. We list a few of them below:

1. For unit square graphs, the sum colouring problem is NP-hard and we have a greedy

algorithm that achieves a 2-approximation. Can we improve it?

2. For unit disk graphs, the best known approximation ratio is 5 from Theorem 4.3.4. Can

we improve it?

3. The best known sum colouring algorithm for chordal graphs is a 4-approximation de-

rived from the repeated MIS approach in [5]. Can we improve it?


Chapter 5

Greedy Algorithms with Weight Scaling

In this chapter, we discuss the weight scaling technique in the design of greedy algorithms.

We focus on the problem of weighted maximum independent set for general graphs. The

weight scaling technique in this case produces a scaling factor f (v) for each vertex v based

purely on the structure of the given graph. These scaling factors are then used to guide the

design of our greedy algorithms. Two types of weight scaling are discussed in this chapter:

weight scaling for degrees and weight scaling for claws.

5.1 Introduction

Recall that for a given vertex v , its degree is the number of its neighbours and its claw-size

is the maximum number of its neighbours that are independent. The maximum degree and

maximum claw-size are important graph parameters which play essential roles in bounding

other properties as well as approximation ratios of many algorithms. However, they are

local properties, and do not reflect the global characteristic of the underlying graph. For

example, a star Sn has a central vertex of degree n and claw-size n, but the rest of the vertices all have degree 1. The goal of this chapter is to identify parameters that capture more global

characteristics of the given graph and yet still can be useful in bounding the approximation

ratios of some algorithms.


A key motivating example behind this development is the α-greedy algorithm of Lehmann,

O’Callaghan and Shoham [65] in single-minded combinatorial auctions. Here we consider

the combinatorial version of the problem, which we call the weighted set packing problem.

Given a set of m items and n players, each player i is interested in one subset Si and is will-

ing to pay vi to get exactly this subset. The goal of the weighted set packing problem is to

allocate subsets of items to players such that the social welfare (total amount of payments)

is maximized. Let α be a constant with 0 ≤ α ≤ 1; the α-greedy algorithm in [65] is stated

below.

THE α-GREEDY ALGORITHM

1: Sort all subsets S1, . . . , Sn non-increasingly according to v_i / |S_i|^α

2: Rename the subsets according to this ordering as T1, . . . ,Tn

3: for i = 1 → n do

4: Allocate the subset Ti to its player if all items in Ti are still available

5: end for
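A direct implementation of the algorithm above (our own rendering; subsets are Python sets and ties in the ordering are broken by list position):

```python
def alpha_greedy(subsets, values, alpha=0.5):
    """alpha-greedy for weighted set packing: rank bid i by
    values[i] / |subsets[i]|**alpha and allocate greedily, skipping any
    bid that requests an already-allocated item."""
    order = sorted(range(len(subsets)),
                   key=lambda i: values[i] / len(subsets[i]) ** alpha,
                   reverse=True)
    taken, winners = set(), []
    for i in order:
        if not (subsets[i] & taken):
            winners.append(i)
            taken |= subsets[i]
    return winners

# m = 4 items; four singleton bids of value 1 and one all-items bid of 2.5.
bids = [{0}, {1}, {2}, {3}, {0, 1, 2, 3}]
print(alpha_greedy(bids, [1, 1, 1, 1, 2.5]))  # the big bid ranks first: [4]
```

With α = 1/2 the all-items bid scores 2.5/√4 = 1.25 and wins; lowering its value to 1.5 drops its score below the singletons, and the four singleton bids win instead.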

Interestingly, the α-greedy algorithm achieves an approximation ratio of √m when α = 1/2. Note that setting α = 1/2 is the best possible choice for the α-greedy algorithm. To see this,

we consider two cases:

• If 0 ≤ α < 1/2, then consider the following instance. There are m items and m + 1 players. Each player i, for 1 ≤ i ≤ m, is interested in a singleton set containing only item i and is willing to pay a dollar for it. The last player is interested in getting everything for the price of m^α. It is not hard to see that the α-greedy algorithm will satisfy only the last player, leading to an approximation ratio of m^{1−α} > √m.

• If 1/2 < α ≤ 1, then consider the following instance. There are m items and 2 players. The first player is interested in the first item and is willing to pay a dollar for it. The second player is interested in getting everything for the price of m^α. It is not hard to see that the α-greedy algorithm will satisfy only the first player, leading to an approximation ratio of m^α > √m.

Note that not only is setting α = 1/2 the best possible choice for the α-greedy algorithm, the α-greedy algorithm with α = 1/2 is the best possible assuming NP is not equal to ZPP [65]. It

is somewhat surprising that such a simple algorithm achieves the state of the art.

Viewing the problem in a graph theoretical setting, if we make a vertex for each subset,

and connect two vertices if and only if the two subsets are non-disjoint, then we can con-

struct a graph, which we call an auction graph, and the subset allocation problem is essentially the weighted maximum independent set problem for the auction graph. It is not hard to see that the auction graph is (m + 1)-claw-free, and hence a simple greedy algorithm that takes vertices in non-increasing order of weight gives an m-approximation

for the weighted maximum independent set problem. Note that in this case, scaling the

value vi by a factor of |Si |α helps the algorithm to achieve a better approximation ratio. We

ask the question whether this phenomenon can be generalized.

For the remainder of this chapter, we focus on the problem of weighted maximum inde-

pendent set for general graphs. Given a graph G = (V ,E) of n vertices and a weight function

w : V →R+, we consider the following generic greedy algorithm.

A GENERIC GREEDY ALGORITHM WITH WEIGHT SCALING

1: S = ∅
2: Assign each vertex v a scaling factor f (v)
3: Sort vertices non-increasingly according to w(v)/f (v)
4: Rename the vertices according to this ordering as v1, . . . , vn
5: for i = 1 → n do
6: Add vi to S if no vertex in S is adjacent to vi

7: end for

8: Return S
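In Python the template reads as follows (a sketch under our own representation; the example weights and scaling factors are made up to show scaling changing the outcome):

```python
def scaled_greedy_mis(graph, w, f):
    """Generic greedy for weighted maximum independent set: visit vertices
    in non-increasing order of w(v)/f(v); add a vertex iff it has no
    neighbour already in the solution S."""
    S = set()
    for v in sorted(graph, key=lambda v: w[v] / f[v], reverse=True):
        if not (graph[v] & S):
            S.add(v)
    return S

# Path 1-2-3 with weights 2, 3, 2: with all factors equal the greedy takes
# the middle vertex (total weight 3); scaling the middle vertex down by 2
# makes the greedy take the two endpoints (total weight 4), the optimum.
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(scaled_greedy_mis(path, {1: 2, 2: 3, 3: 2}, {1: 1, 2: 1, 3: 1}))  # {2}
print(scaled_greedy_mis(path, {1: 2, 2: 3, 3: 2}, {1: 1, 2: 2, 3: 1}))  # {1, 3}
```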

Everything in this algorithm is fixed except that we need to determine for each vertex v a

scaling factor f (v). In the next two sections, we discuss two ways of choosing scaling factors.


5.2 Weight Scaling for Degrees

First, we give a few definitions. For a given graph G = (V ,E), and a weight scaling function

f : V → R+, the soft degree of a vertex v is

d_f(v) = ∑_{u∈N(v)} f(u)/f(v),

and the maximum soft degree of G with respect to f(·) is

∆_f = max_{v∈V} d_f(v).

The optimal maximum soft degree of G is

∆ = min_f ∆_f.

It turns out that the optimal maximum soft degree is the largest eigenvalue of the adjacency matrix of G. Before we prove that, we first show the following lemma. A weight scaling function f(·) is optimal if it produces an optimal maximum soft degree; it is fixed if for all

v ∈ V , d f (v) = c for some constant c. Without loss of generality, for the remainder of this

chapter, we assume the graph is finite, simple, connected and undirected.

Lemma 5.2.1 A weight scaling function f (·) is optimal if and only if it is fixed.

Proof: We first prove the “only if" direction. Suppose there is an optimal weight scaling

function f (·) which is not fixed. Let C be the set of vertices with the maximum soft degree.

Raise scaling factors of vertices in C by a factor of (1+ε) for a very small ε. Note that the soft

degree of a vertex in C may decrease, and the soft degree of a vertex outside C may increase.

We choose a suitable ε such that the new set of vertices with the maximum soft degree is

a subset of C , i.e., no vertex outside C obtains the maximum soft degree. Since the graph

is connected and C is a strict subset of V , at least one vertex in C must have a neighbour

outside C, therefore its soft degree must decrease. Hence, the size of the set of vertices with the maximum soft degree is strictly decreasing. We keep doing this until the maximum soft degree is reduced. This contradicts the assumption that f(·) is optimal. Therefore, if f(·) is optimal then f(·) is fixed.

(In this section, the optimality of a weight scaling function is always with respect to soft degrees.)

Suppose there is a fixed weight scaling function f(·) that is not optimal. Then by the argument above, there must exist another fixed function g(·) with ∆_g < ∆_f. Note that the ratio g(v)/f(v) is not constant over all vertices, for otherwise ∆_g = ∆_f. Let T be the set of vertices at which g(v)/f(v) is minimized. Since the graph is connected and T is a strict subset of V, there is a vertex v in T that has at least one neighbour not in T. We then have the following

inequality:

∑_{u∈N(v)} f(u)/f(v) = ∑_{u∈N(v)} f(u)·(g(v)/f(v))/g(v) < ∑_{u∈N(v)} f(u)·(g(u)/f(u))/g(v) = ∑_{u∈N(v)} g(u)/g(v).

Since both f and g are fixed, we have

∆_f = ∑_{u∈N(v)} f(u)/f(v)

and

∆_g = ∑_{u∈N(v)} g(u)/g(v),

hence ∆ f < ∆g , which is a contradiction. Therefore, if f (·) is fixed then f (·) is optimal.

Theorem 5.2.2 For any given graph G, ∆= λmax, where λmax is the largest eigenvalue of the

adjacency matrix of G.

Proof: By Lemma 5.2.1, a weight scaling function is optimal if and only if it is fixed. Let f(·) be an optimal function; then since it is fixed, we have for all v ∈ V,

∆ = ∑_{u∈N(v)} f(u)/f(v).

Therefore, ∆ · f(v) = ∑_{u∈N(v)} f(u). Let M be the adjacency matrix of G; then we have

∆ · [f(v1), f(v2), . . . , f(vn)]^t = M · [f(v1), f(v2), . . . , f(vn)]^t,


so ∆ is an eigenvalue of M. Since M has non-negative entries and f(v) > 0 for all v ∈ V, by the Perron–Frobenius theorem, ∆ = λmax, where λmax is the largest eigenvalue of the

adjacency matrix of G .

Also note that, by the above argument, an optimal weight scaling function is easily obtainable by computing the principal eigenvector of M, which can be done in polynomial time.
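For small graphs this computation can be sketched with power iteration (our own pure-Python sketch; we iterate with M + I so the method also converges on bipartite graphs, whose adjacency spectrum is symmetric):

```python
def principal_eigen(adj, iters=1000):
    """Power iteration on M + I for a connected graph given as a dict of
    neighbour sets.  Returns (lambda_max, f) where f is a positive
    principal eigenvector of M, i.e. an optimal weight scaling function
    whose maximum soft degree equals lambda_max."""
    f = {v: 1.0 for v in adj}
    lam = 1.0
    for _ in range(iters):
        g = {v: f[v] + sum(f[u] for u in adj[v]) for v in adj}
        lam = max(g.values())             # after normalising max(f) = 1,
        f = {v: g[v] / lam for v in adj}  # lam converges to lambda_max + 1
    return lam - 1.0, f

# Star K_{1,3}: lambda_max = sqrt(3), and every soft degree equals sqrt(3).
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
lam, f = principal_eigen(star)
```

After convergence, every vertex has soft degree ∑_{u∈N(v)} f(u)/f(v) equal to lam, i.e. f is fixed in the sense of Lemma 5.2.1.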

We now show that ∆ provides a bound on the approximation ratio of the generic greedy algorithm with weight scaling.

Theorem 5.2.3 Let f(·) be a weight scaling function. Then the generic greedy algorithm with f(·) achieves an approximation ratio of ∆_f for the weighted maximum independent set problem.

Proof: We compare the algorithm’s solution A with the optimal solution O. We concentrate

only on vertices in A′ = A \ O and O′ = O \ A. Let w(A′) = ∑_{v∈A′} w(v) and w(O′) = ∑_{v∈O′} w(v); we show w(O′)/w(A′) ≤ ∆_f. For each vertex u in O′, there must exist a vertex v in A′ such that v is adjacent to u and is considered before u in the greedy algorithm. Therefore, we have

w(u)/f(u) ≤ w(v)/f(v),

which implies

w(u) ≤ (f(u)/f(v)) · w(v).

Summing over all vertices in O′, we have

w(O′) = ∑_{u∈O′} w(u) ≤ ∑_{v∈A′} ∑_{u∈N(v)∩O′} (f(u)/f(v)) · w(v) ≤ ∑_{v∈A′} ∑_{u∈N(v)} (f(u)/f(v)) · w(v) ≤ ∆_f ∑_{v∈A′} w(v) = ∆_f · w(A′),

as desired. Therefore the greedy algorithm achieves a ∆ f -approximation for the weighted

maximum independent set problem. Note that if f is the optimal weight scaling function,

then the algorithm achieves a ∆-approximation.


Note that in the proof of Theorem 5.2.3, we relax the quantity ∑_{u∈N(v)∩O′} f(u)/f(v) to the larger quantity ∑_{u∈N(v)} f(u)/f(v). The first quantity requires all the u's to be in O′, and hence they are independent. This observation leads to the possibility of tightening

the approximation ratio. We examine the following simple example. Let G be the graph in

Fig. 5.1. The optimal weight scaling function f (·) assigns a scaling factor to each vertex as

Figure 5.1: An example for weight scaling

follows:

f(v1) = f(v3) = 1, f(v2) = f(v4) = (√17 + 1)/4,

which implies ∆ = (√17 + 1)/2 ≈ 2.56. However, for each i, 1 ≤ i ≤ 4, the neighbours of vi are

not independent. So if we look at the quantity ∑_{u∈N(vi)∩O′} f(u)/f(vi) for each vi, which forces the neighbours to be independent as they are in O′, we can obtain a better bound. Let δi = ∑_{u∈N(vi)∩O′} f(u)/f(vi); then we have:

δ1 ≤ (√17 + 1)/4, δ3 ≤ (√17 + 1)/4, δ2 ≤ (√17 − 1)/2, δ4 ≤ (√17 − 1)/2.

Therefore, the approximation ratio is bounded by

max_i δi ≤ (√17 − 1)/2 ≈ 1.56,

which is much better than before.

Observe further that if max_i δi is the quantity we are trying to minimize, the current choice of the weight scaling function is not optimal. Consider the following new weight scaling function f(·):

f(v1) = f(v3) = 1, f(v2) = f(v4) = √2,


it is not hard to see that

max_i δi ≤ √2 ≈ 1.41.
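Fig. 5.1 is not reproduced here, but the stated factors and bounds are consistent with the four-vertex graph K4 minus the edge (v1, v3); under that assumption they can be checked by brute force:

```python
from itertools import combinations

# Assumed reading of Fig. 5.1: K4 minus the edge (v1, v3).
adj = {1: {2, 4}, 2: {1, 3, 4}, 3: {2, 4}, 4: {1, 2, 3}}

def max_soft_claw(f):
    """Upper bound on max_i delta_i: for each vertex v, the largest
    f-weight of an independent subset of N(v), divided by f(v)."""
    best = 0.0
    for v in adj:
        for r in range(1, len(adj[v]) + 1):
            for sub in combinations(sorted(adj[v]), r):
                if all(b not in adj[a] for a, b in combinations(sub, 2)):
                    best = max(best, sum(f[u] for u in sub) / f[v])
    return best

eig = {1: 1.0, 3: 1.0, 2: (17 ** 0.5 + 1) / 4, 4: (17 ** 0.5 + 1) / 4}
new = {1: 1.0, 3: 1.0, 2: 2 ** 0.5, 4: 2 ** 0.5}
print(max_soft_claw(eig))  # (sqrt(17) - 1) / 2, about 1.56
print(max_soft_claw(new))  # sqrt(2), about 1.41
```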

This motivates the definition of a new measure, which we discuss in the next section.

5.3 Weight Scaling for Claws

Given a graph G = (V ,E), and a weight scaling function f : V →R+, for a vertex v , let Π(v, f )

denote any weighted maximum independent set in N (v) where the weights of the vertices

are given by f (·). The soft claw size of a vertex v is

λ_f(v) = ∑_{u∈Π(v,f)} f(u)/f(v),

and the maximum soft claw size of G with respect to f(·) is

Λ_f = max_{v∈V} λ_f(v).

The optimal maximum soft claw size of G is

Λ = min_f Λ_f.

Similar to Theorem 5.2.3, we have the following theorem.

Theorem 5.3.1 Let f(·) be a weight scaling function. Then the generic greedy algorithm with f(·) achieves an approximation ratio of Λ_f for the weighted maximum independent set problem.

Proof: The proof is essentially the same as the proof for Theorem 5.2.3. The key difference

is to observe that

∑_{u∈N(v)∩O′} f(u)/f(v) ≤ ∑_{u∈Π(v,f)} f(u)/f(v) ≤ Λ_f.

The remainder of the proof follows immediately.


It is clear that for any weight scaling function f(·), Λ_f ≤ ∆_f. However, a drawback of this definition is that computing the optimal maximum soft claw size, and a weight scaling function achieving it, might be difficult. There are two situations in which this definition is useful.

1. Although we study the weighted maximum independent set problem, the scaling factors are independent of the weights of the vertices. So if the underlying graph is fixed, and the weights of the vertices are the input to the problem, we can precompute scaling factors that give a good maximum soft claw size.

2. For some special graph classes, it might be easy to compute the optimal maximum soft claw size. Or, it might be easy to find a weight scaling function for which we can bound the maximum soft claw size; the α-greedy algorithm is an example of this case.

Theorem 5.3.2 [65] Given a weighted set packing problem with n players and m items, where each player i is interested in one subset Si and is willing to pay vi to get exactly this subset, let G be its auction graph, and let f(·) be a weight scaling function such that f(Si) = |Si|^α, 0 ≤ α ≤ 1. Then the maximum soft claw size of G with respect to f(·) is

    Λ_f ≤ max_{i ∈ [n]} |Si|^{1−2α} m^α.

In particular, Λ_f ≤ √m when α = 1/2.

Proof: For any subset Si (a vertex in the auction graph), by Hölder's inequality, we have

    ∑_{Sj ∈ Π(Si, f)} f(Sj) = ∑_{Sj ∈ Π(Si, f)} |Sj|^α ≤ |Π(Si, f)|^{1−α} (∑_{Sj ∈ Π(Si, f)} |Sj|)^α.

Since Π(Si, f) is an independent set, for any Sj ∈ Π(Si, f) and Sk ∈ Π(Si, f) with j ≠ k, we have Sj ∩ Sk = ∅. Therefore ∑_{Sj ∈ Π(Si, f)} |Sj| ≤ m, where m is the total number of items. Furthermore, since the sets in Π(Si, f) are all adjacent to Si, for any Sj ∈ Π(Si, f), Sj ∩ Si ≠ ∅. Hence, |Π(Si, f)| ≤ |Si|. Combining these two facts, we have

    ∑_{Sj ∈ Π(Si, f)} f(Sj) ≤ |Π(Si, f)|^{1−α} (∑_{Sj ∈ Π(Si, f)} |Sj|)^α ≤ |Si|^{1−α} m^α.


Therefore,

    Λ_f = max_{i ∈ [n]} ∑_{Sj ∈ Π(Si, f)} f(Sj) / f(Si) ≤ max_{i ∈ [n]} |Si|^{1−α} m^α / |Si|^α = max_{i ∈ [n]} |Si|^{1−2α} m^α.

Setting α = 1/2, we then have Λ_f ≤ √m as desired.
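Concretely, with the scaling f(Si) = |Si|^α the greedy of [65] considers bids in decreasing order of the scaled value vi / |Si|^α and accepts a bid whenever its set is disjoint from all bids accepted so far. The following sketch is our own illustration of that ordering rule (names and the example are ours, not from the thesis):

```python
def greedy_set_packing(bids, alpha=0.5):
    """Greedy set packing with weight scaling f(S) = |S|**alpha.
    bids is a list of (value, items) pairs, where items is a set.  Bids
    are considered in decreasing order of scaled value v / |S|**alpha and
    accepted whenever disjoint from all previously accepted bids."""
    order = sorted(bids, key=lambda b: b[0] / len(b[1]) ** alpha, reverse=True)
    accepted, used = [], set()
    for v, items in order:
        if used.isdisjoint(items):
            accepted.append((v, items))
            used |= items
    return accepted

# Unscaled greedy (alpha = 0) takes the single bid of value 4 first and
# gets total 4; with alpha = 1/2 the two singleton bids score higher
# (3 vs. 4/sqrt(2)), so both are accepted for total 6.
bids = [(4, {1, 2}), (3, {1}), (3, {2})]
```

The example shows why the scaling helps: dividing by |S|^α penalizes large sets, which is exactly what drives the √m guarantee at α = 1/2.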

This shows an application of weight scaling for claws. There are other examples in the

literature showing a similar application of this technique. For instance, the following result

in [4] can also be obtained via weight scaling for claws.

Theorem 5.3.3 [4] For each compact convex figure e, let A(e) be the area of e. Consider an intersection graph induced by a set S of compact convex figures with aspect ratio R, and a weight scaling function f(·) such that for each e ∈ S, f(e) = [A(e)]^{1/3}. Then Λ_f ∈ O(R^{4/3}).

5.4 Conclusion

In this chapter, we study the weight scaling technique in designing a greedy algorithm. This

technique computes a set of scaling factors based on the underlying structure of the prob-

lem. These scaling factors can then be used to produce an ordering of input items to be

considered by the greedy algorithm. This provides some “guidance” to the greedy algorithm,

and can often improve its approximation ratio.

There are results in the literature obtainable using these techniques, but they are often

implicit. We provide a uniform view of these results by defining the general framework of

weight scaling, and proving general results under this framework.

We primarily focused on weight scaling so as to produce a fixed ordering. Note that

we can apply these techniques dynamically; i.e., the scaling factors get updated during the

execution of the greedy algorithm. Clarkson’s algorithm for vertex cover [19] mentioned in

Section 1.4 is, in some sense, an example of dynamic weight scaling. We leave this as an

interesting research direction for studying greedy algorithms.
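As we recall it, Clarkson's modification repeatedly selects the vertex minimizing its current weight-to-degree ratio and charges that ratio to the remaining neighbours before deleting the vertex. The function below is our own rough sketch of this dynamic rescaling idea, not the thesis's or Clarkson's exact presentation:

```python
def clarkson_vertex_cover(adj, w):
    """Dynamic weight scaling for weighted vertex cover, in the spirit of
    Clarkson's modified greedy [19] (our reconstruction): repeatedly take
    the vertex v minimizing residual weight / residual degree, subtract
    that ratio from each remaining neighbour's weight, and delete v."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on copies
    w = dict(w)
    cover = []
    while any(adj.values()):                          # some edge remains
        v = min((u for u in adj if adj[u]), key=lambda u: w[u] / len(adj[u]))
        ratio = w[v] / len(adj[v])
        cover.append(v)
        for u in adj[v]:                              # rescale the neighbours
            w[u] -= ratio
            adj[u].discard(v)
        adj[v] = set()
    return cover

# Triangle with weights 1, 10, 10: the cheap vertex has the smallest
# weight-to-degree ratio and is chosen first, then one endpoint of the
# remaining edge finishes the cover.
cover = clarkson_vertex_cover({0: {1, 2}, 1: {0, 2}, 2: {0, 1}},
                              {0: 1.0, 1: 10.0, 2: 10.0})
```

The point of the sketch is that the "scaling factor" w(u)/d(u) of every vertex changes as the algorithm runs, which is precisely what distinguishes dynamic weight scaling from the fixed orderings studied in this chapter.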


Chapter 6

Conclusion

A traditional approach to algorithm design is to focus on a particular problem, and start with

simple algorithms for the problem, analyze them, identify their advantages and weaknesses,

and then make improvements and come up with better and more sophisticated designs.

This is a problem-oriented approach to algorithm design, and is usually very effective in

finding a good solution to that specific problem.

In this thesis, we have taken an orthogonal approach. We focus on greedy algorithms, a

particular algorithmic paradigm, and for each problem studied, we examine the generaliza-

tions and specializations of this problem, and find out to what extent greedy algorithms will

still work or produce a reasonable result. There are at least two benefits from this approach:

1. It gives us an opportunity to understand contexts where greedy algorithms work well

and see a trade-off between the performance of greedy algorithms (in terms of the

approximation ratio) and the generality of those contexts.

2. It allows us to explore the flexibility in designing a greedy algorithm, and to see the

power and limitations of this particular algorithmic paradigm.

One primary focus of the thesis is to consider generalizations and specializations of

a problem, varied by its underlying structure and its objective function. Studying an NP-hard graph problem over different graph classes is particularly interesting, as we have a well-developed theory of graph families and hierarchies based on their structural properties. Furthermore, since the problem is NP-hard, it is studied under the framework of approximation algorithms. The approximation ratio gives us a natural measure to assess the trade-off between performance and generality.

There are two major directions that we left unexplored in this thesis. The first one is

submodular maximization over graph structures. The submodular maximization problem

is well-studied in the literature over a set system and with a knapsack constraint, but has

not received much attention for graph problems. Many weighted graph optimization prob-

lems use a modular set function as an objective function. What if the objective function is

submodular? Furthermore, what is the role of greedy algorithms for such problems? The

second direction is dynamic weight scaling mentioned at the end of Chapter 5. We believe

this technique can lead to a more sophisticated design of greedy algorithms, which can po-

tentially compete with the state of the art for some problems.

In summary, we studied the design and analysis of greedy algorithms. We obtained a

collection of new results and witnessed evidence of trade-offs between the performance of a

greedy algorithm and the generality of a problem. We identified new structures and prop-

erties, and gave a uniform view of some existing results in the literature. However, we are

still at an early stage of understanding greedy and greedy-like algorithms, and such a

research program remains a challenge.


Bibliography

[1] Louigi Addario-Berry, W. Sean Kennedy, Andrew D. King, Zhentao Li, and Bruce A.

Reed. Finding a maximum-weight induced k-partite subgraph of an i-triangulated

graph. Discrete Applied Mathematics, 158(7):765–770, 2010.

[2] Karhan Akcoglu, James Aspnes, Bhaskar DasGupta, and Ming-Yang Kao. Opportunity-

cost algorithms for combinatorial auctions. In E. J. Kontoghiorghes, B. Rustem, and

S. Siokos, editors, Applied Optimization 74: Computational Methods in Decision-

Making, Economics and Finance, pages 455–479. Kluwer Academic Publishers, 2002.

[3] Spyros Angelopoulos and Allan Borodin. On the power of priority algorithms for facility

location and set cover. In APPROX, pages 26–39, 2002.

[4] Moshe Babaioff and Liad Blumrosen. Computationally-feasible truthful auctions for

convex bundles. In APPROX-RANDOM, pages 27–38, 2004.

[5] Amotz Bar-Noy, Mihir Bellare, Magnús M. Halldórsson, Hadas Shachnai, and Tami

Tamir. On chromatic sums and distributed resource allocation. Inf. Comput.,

140(2):183–202, 1998.

[6] Amotz Bar-Noy, Sudipto Guha, Joseph Naor, and Baruch Schieber. Approximating the

throughput of multiple machines in real-time scheduling. SIAM J. Comput., 31(2):331–

352, 2001.


[7] Amotz Bar-Noy and Guy Kortsarz. Minimum color sum of bipartite graphs. Journal of

Algorithms, 28(2):339 – 365, 1998.

[8] R. Bar-Yehuda and S. Even. A local-ratio theorem for approximating the weighted ver-

tex cover problem. In G. Ausiello and M. Lucertini, editors, Analysis and Design of Algo-

rithms for Combinatorial Problems, volume 109 of North-Holland Mathematics Stud-

ies, pages 27 – 45. North-Holland, 1985.

[9] Kenneth P. Bogart and Douglas B. West. A short proof that “proper = unit”. Discrete

Math., 201(1-3):21–23, 1999.

[10] Allan Borodin, Joan Boyar, and Kim S. Larsen. Priority algorithms for graph optimiza-

tion problems. In WAOA, pages 126–139, 2004.

[11] Allan Borodin, David Cashman, and Avner Magen. How well can primal-dual and local-

ratio algorithms perform? ACM Transactions on Algorithms, 7(3):29, 2011.

[12] Allan Borodin, Morten N. Nielsen, and Charles Rackoff. (Incremental) priority algo-

rithms. In SODA, pages 752–761, 2002.

[13] Richard A. Brualdi. Comments on bases in dependence structures. Bulletin of the Aus-

tralian Mathematical Society, 1(02):161–167, 1969.

[14] Peter Buneman. A Characterisation of Rigid Circuit Graphs. Discrete Mathematics,

9:205–212, 1974.

[15] Ayelet Butman, Danny Hermelin, Moshe Lewenstein, and Dror Rawitz. Optimization

problems in multiple-interval graphs. In SODA, pages 268–277, 2007.

[16] M.R. Cerioli, L. Faria, T.O. Ferreira, and F. Protti. On minimum clique partition and

maximum independent set on unit disk graphs and penny graphs: complexity and ap-

proximation. Electronic Notes in Discrete Mathematics, 18:73 – 79, 2004.


[17] Venkatesan T. Chakaravarthy and Sambuddha Roy. Approximating maximum weight k-

colorable subgraphs in chordal graphs. Inf. Process. Lett., 109(7):365–368, March 2009.

[18] Barun Chandra and Magnús M. Halldórsson. Approximation algorithms for dispersion

problems. J. Algorithms, 38(2):438–465, 2001.

[19] Kenneth L. Clarkson. A modification of the greedy algorithm for vertex cover. Inf. Pro-

cess. Lett., 16(1):23–25, 1983.

[20] Stephen A. Cook. The complexity of theorem-proving procedures. In Proceedings of

the third annual ACM symposium on Theory of computing, STOC ’71, pages 151–158,

New York, NY, USA, 1971. ACM.

[21] Bruno Courcelle. The monadic second-order logic of graphs I: Recognizable sets of finite graphs. Information and Computation, pages 12–75, 1990.

[22] Sashka Davis and Russell Impagliazzo. Models of greedy algorithms for graph prob-

lems. In SODA, pages 381–390, 2004.

[23] Celina M. Herrera de Figueiredo and Frédéric Maffray. Optimizing bull-free perfect

graphs. SIAM J. Discrete Math., 18(2):226–240, 2004.

[24] Irit Dinur and Samuel Safra. On the hardness of approximating minimum vertex cover.

Annals of Mathematics, 162:2005, 2004.

[25] G. Dirac. On rigid circuit graphs. Abhandlungen aus dem Mathematischen Seminar der

Universität Hamburg, 25:71–76, 1961.

[26] Rodney G. Downey and Michael R. Fellows. Fixed-parameter tractability and com-

pleteness II: On completeness for W[1]. Theor. Comput. Sci., 141(1&2):109–131, 1995.

[27] Jack Edmonds. Matroids and the greedy algorithm. Mathematical Programming,

1:127–136, 1971.


[28] Friedrich Eisenbrand and Fabrizio Grandoni. On the complexity of fixed parameter

clique and dominating set. Theor. Comput. Sci., 326(1-3):57–67, 2004.

[29] Erhan Erkut. The discrete p-dispersion problem. European Journal of Operational

Research, 46(1):48–60, May 1990.

[30] Erhan Erkut and Susan Neuman. Analytical models for locating undesirable facilities.

European Journal of Operational Research, 40(3):275–291, June 1989.

[31] Uriel Feige. A threshold of ln n for approximating set cover. J. ACM, 45(4):634–652,

1998.

[32] Uriel Feige and Joe Kilian. Zero knowledge and the chromatic number. J. Comput. Syst.

Sci., 57(2):187–199, 1998.

[33] D. R. Fulkerson and O. A. Gross. Incidence Matrices and Interval Graphs. Pacific J.

Math., 15:835–855, 1965.

[34] David Gale. Optimal assignments in an ordered set: An application of matroid theory.

Journal of Combinatorial Theory, 4(2):176 – 180, 1968.

[35] Rajiv Gandhi, Magnús M. Halldórsson, Guy Kortsarz, and Hadas Shachnai. Improved

bounds for scheduling conflicting jobs with minsum criteria. ACM Trans. Algorithms,

4(1):1–20, 2008.

[36] F. Gavril. The intersection graphs of subtrees in trees are exactly the chordal graphs.

Journal of Combinatorial Theory, Series B, 16(1):47 – 56, 1974.

[37] Krzysztof Giaro, Robert Janczewski, Marek Kubale, and Michal Malafiejski. A 27/26-

approximation algorithm for the chromatic sum coloring of bipartite graphs. In AP-

PROX, pages 135–145, 2002.

[38] Sreenivas Gollapudi and Aneesh Sharma. An axiomatic approach for result diversifica-

tion. In World Wide Web Conference Series, pages 381–390, 2009.


[39] Martin Grötschel, László Lovász, and Alexander Schrijver. The ellipsoid method and its

consequences in combinatorial optimization. Combinatorica, 1(2):169–197, 1981.

[40] A. Hajnal and J. Surányi. Über die Auflösung von Graphen in vollständige Teilgraphen. Ann. Univ. Sci. Budapest. Eötvös Sect. Math., 1:113–121, 1958.

[41] Magnús M. Halldórsson, Kazuo Iwano, Naoki Katoh, and Takeshi Tokuyama. Finding

subsets maximizing minimum structures. In Symposium on Discrete Algorithms, pages

150–159, 1995.

[42] Magnús M. Halldórsson and Ragnar K. Karlsson. Strip graphs: Recognition and

scheduling. In WG, pages 137–146, 2006.

[43] Magnús M. Halldórsson, Guy Kortsarz, and Hadas Shachnai. Sum coloring interval

and k-claw free graphs with application to scheduling dependent jobs. Algorithmica,

37(3):187–209, 2003.

[44] Magnús M. Halldórsson and Guy Kortsarz. Tools for multicoloring with applications to

planar graphs and partial k-trees. Journal of Algorithms, 42(2):334 – 366, 2002.

[45] Eran Halperin. Improved approximation algorithms for the vertex cover problem in

graphs and hypergraphs. In SODA, pages 329–337, 2000.

[46] P. Hansen and I. D. Moon. Dispersion facilities on a network. Presentation at the

TIMS/ORSA Joint National Meeting, Washington, D.C., 1988.

[47] Refael Hassin, Shlomi Rubinstein, and Arie Tamir. Approximation algorithms for max-

imum dispersion. Oper. Res. Lett., 21(3):133–137, 1997.

[48] Dorit S. Hochbaum. Efficient bounds for the stable set, vertex cover and set packing

problems. Discrete Applied Mathematics, 6(3):243 – 254, 1983.

[49] A. Hoffman. On simple linear programming problems. Proceedings of Symposia in Pure

Mathematics, 7:317 – 327, 1963.


[50] Wen-Lian Hsu and Tze-Heng Ma. Substitution decomposition on chordal graphs and

applications. In Wen-Lian Hsu and R. Lee, editors, ISA’91 Algorithms, volume 557 of

Lecture Notes in Computer Science, pages 52–60. Springer Berlin / Heidelberg, 1991.

[51] Sandy Irani. Coloring inductive graphs on-line. Algorithmica, 11(1):53–72, 1994.

[52] Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. SIAM J. Comput.,

7(4):413–423, 1978.

[53] Robert E. Jamison and Henry Martyn Mulder. Tolerance intersection graphs on binary

trees with constant tolerance 3. Discrete Mathematics, 215:115–131, 2000.

[54] Akihisa Kako, Takao Ono, Tomio Hirata, and Magnús M. Halldórsson. Approximation

algorithms for the weighted independent set problem in sparse graphs. Discrete Ap-

plied Mathematics, 157(4):617–626, 2009.

[55] Frank Kammer, Torsten Tholey, and Heiko Voepel. Approximation algorithms for in-

tersection graphs. In APPROX-RANDOM, pages 260–273, 2010.

[56] R. Karp. Reducibility among combinatorial problems. In R. Miller and J. Thatcher,

editors, Complexity of Computer Computations, pages 85–103. Plenum Press, 1972.

[57] David Kempe, Jon M. Kleinberg, and Éva Tardos. Maximizing the spread of influence

through a social network. In KDD, pages 137–146, 2003.

[58] Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within

2−ε. In IEEE Conference on Computational Complexity, pages 379–, 2003.

[59] Seog-Jin Kim, Alexandr V. Kostochka, and Kittikorn Nakprasit. On the chromatic num-

ber of intersection graphs of convex sets in the plane. Electr. J. Comb., 11(1), 2004.

[60] Jon Kleinberg, Christos Papadimitriou, and Prabhakar Raghavan. Segmentation prob-

lems. In Proceedings of the thirtieth annual ACM symposium on Theory of computing,

STOC ’98, pages 473–482, New York, NY, USA, 1998. ACM.


[61] Bernhard Korte and László Lovász. Mathematical structures underlying greedy algo-

rithms. In FCT, pages 205–209, 1981.

[62] E. Kubicka and A. J. Schwenk. An introduction to chromatic sums. In Proceedings of

the 17th conference on ACM Annual Computer Science Conference, CSC ’89, pages 39–

45, New York, NY, USA, 1989. ACM.

[63] Ewa Kubicka, Grzegorz Kubicki, and Dionysios Kountanis. Approximation algorithms

for the chromatic sum. In Proceedings of the The First Great Lakes Computer Science

Conference on Computing in the 90’s, pages 15–21, London, UK, 1991. Springer-Verlag.

[64] Michael J. Kuby. Programming models for facility dispersion: The p-dispersion and

maxisum dispersion problems. Geographical Analysis, 19(4):315–329, 1987.

[65] Daniel J. Lehmann, Liadan O’Callaghan, and Yoav Shoham. Truth revelation in approx-

imately efficient combinatorial auctions. J. ACM, 49(5):577–602, 2002.

[66] Don R. Lick and Arthur T. White. k-degenerate graphs. Canadian Journal of Mathe-

matics, 22:1082–1096, 1970.

[67] Hui Lin and Jeff Bilmes. Multi-document summarization via budgeted maximization

of submodular functions. In HLT-NAACL, pages 912–920, 2010.

[68] Hui Lin and Jeff Bilmes. A class of submodular functions for document summarization.

In North American chapter of the Association for Computational Linguistics/Human

Language Technology Conference (NAACL/HLT-2011), Portland, OR, June 2011.

[69] Hui Lin, Jeff Bilmes, and Shasha Xie. Graph-based submodular selection for extrac-

tive summarization. In Proc. IEEE Automatic Speech Recognition and Understanding

(ASRU), Merano, Italy, December 2009.

[70] M. Gonen. Coloring problems on interval graphs and trees. Master's thesis, The Open

Univ., Tel-Aviv, 2001.


[71] G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maxi-

mizing submodular set functions. Mathematical Programming, 1978.

[72] G. L. Nemhauser and L. E. Trotter. Vertex packings: Structural properties and algo-

rithms. Mathematical Programming, 8:232–248, 1975.

[73] G. L. Nemhauser and L. A. Wolsey. Best algorithms for approximating the maximum of

a submodular set function. Math. Oper. Res., 3(3):177–188, 1978.

[74] S. Nicoloso, Majid Sarrafzadeh, and X. Song. On the sum coloring problem on interval

graphs. Algorithmica, 23(2):109–126, 1999.

[75] Matthias Poloczek. Bounds on greedy algorithms for MAX SAT. In ESA, pages 37–48,

2011.

[76] R. Rado. Note on independence functions. Proceedings of the London Mathematical

Society, 3(7):300–320, 1957.

[77] S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for

dispersion problems. Operations Research, 42(2):299–310, March-April 1994.

[78] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In STOC, pages 475–484,

1997.

[79] Oded Regev. Priority algorithms for makespan minimization in the subset model. Inf.

Process. Lett., 84(3):153–157, 2002.

[80] F. S. Roberts. Indifference graphs. F. Harary (Ed.), Proof Techniques in Graph Theory,

pages 139–146, 1969.

[81] D. Rose, R. Tarjan, and G. Lueker. Algorithmic aspects of vertex elimination on graphs.

SIAM Journal on Computing, 5(2):266–283, 1976.


[82] Donald J Rose. Triangulated graphs and the elimination process. Journal of Mathemat-

ical Analysis and Applications, 32(3):597 – 609, 1970.

[83] R. Tarjan and M. Yannakakis. Simple linear-time algorithms to test chordality of graphs,

test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Jour-

nal on Computing, 13(3):566–579, 1984.

[84] J. Thomassen, P. Erdös, Y. Alavi, J. Malde, and A.J. Schwenk. Tight bounds on the chro-

matic sum of a connected graph. J. Graph Theory, 13:353–357, 1989.

[85] Leslie G. Valiant. Universality considerations in VLSI circuits. IEEE Trans. Computers,

30(2):135–140, 1981.

[86] James R. Walter. Representations of chordal graphs as subtrees of a tree. Journal of

Graph Theory, 2(3):265–267, 1978.

[87] D. W. Wang and Yue-Sun Kuo. A study on two geometric location problems. Inf. Process.

Lett., 28:281–286, August 1988.

[88] Mihalis Yannakakis and Fanica Gavril. The maximum k-colorable subgraph problem

for chordal graphs. Information Processing Letters, 24(2):133 – 137, 1987.

[89] Yuli Ye and Allan Borodin. Elimination graphs. In ICALP (1), pages 774–785, 2009.