
Approximating the unweighted k-set cover problem: greedy meets local search

Asaf Levin

August 21, 2008

Abstract

In the unweighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $\mathcal{F}$ of subsets of $E$. The problem is to compute a sub-collection $SOL \subseteq \mathcal{F}$ such that $\bigcup_{S_j \in SOL} S_j = E$ and its size $|SOL|$ is minimized. When $|S| \le k$ for all $S \in \mathcal{F}$ we obtain the unweighted k-set cover problem. It is well known that the greedy algorithm is an $H_k$-approximation algorithm for the unweighted k-set cover, where $H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the $k$-th harmonic number, and that this bound on the approximation ratio of the greedy algorithm is tight for all constant values of $k$. Since the set cover problem is a fundamental problem, there is an ongoing research effort to improve this approximation ratio using modifications of the greedy algorithm. The previous best improvement of the greedy algorithm is an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm. In this paper we present a new $\left(H_k - \frac{196}{390}\right)$-approximation algorithm for $k \ge 4$ that improves the previous best approximation ratio for all values of $k \ge 4$. Our algorithm is based on combining local search during various stages of the greedy algorithm.

1 Introduction

In the weighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $\mathcal{F}$ of subsets of $E$, where $\bigcup_{S \in \mathcal{F}} S = E$ and each $S \in \mathcal{F}$ has a positive cost $c_S$. The goal is to compute a sub-collection $SOL \subseteq \mathcal{F}$ such that $\bigcup_{S \in SOL} S = E$ and its cost $\sum_{S \in SOL} c_S$ is minimized. Such a sub-collection of subsets is called a cover. When we consider instances of the weighted set-cover such that each $S_j$ has at most $k$ elements ($|S| \le k$ for all $S \in \mathcal{F}$), we obtain the weighted k-set cover problem. The unweighted set cover problem and the unweighted k-set cover problem are the special cases of the weighted set cover and of the weighted k-set cover, respectively, where $c_S = 1$ for all $S \in \mathcal{F}$.

It is well known (see [3]) that a greedy algorithm is an $H_k$-approximation algorithm for the weighted k-set cover, where $H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the $k$-th harmonic number, and that this bound is tight even for the unweighted k-set cover problem (see [13, 17]). For unbounded values of $k$, Slavík [21] showed that the approximation ratio of the greedy algorithm for the unweighted set cover problem is $\ln n - \ln \ln n + \Theta(1)$. Feige [6] proved that unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$ the unweighted set cover problem cannot be approximated within a factor $(1 - \epsilon) \ln n$, for any $\epsilon > 0$. Raz and Safra [20] proved that if $P \ne NP$ then for some constant $c$, the unweighted set cover problem cannot be approximated within a factor $c \log n$. This result shows that the greedy algorithm is an asymptotically best possible approximation algorithm for the weighted and unweighted set cover problem (unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$). The unweighted k-set cover problem is known to be NP-complete [14] and MAX SNP-hard for all $k \ge 3$ [4, 15, 18].

Another algorithm for the weighted set cover problem by Hochbaum [11] has an approximation ratio that depends on the maximum number of subsets that contain any given element (the local-ratio algorithm of Bar-Yehuda and Even [2] has the same performance guarantee). See Paschos [19] for a survey on these results.

Author affiliation: Department of Statistics, The Hebrew University, Jerusalem, Israel. Email: [email protected]

In spite of the above bad news, Goldschmidt, Hochbaum and Yu [8] modified the greedy algorithm for the unweighted k-set cover and showed that the resulting algorithm has a performance guarantee of $H_k - \frac{1}{6}$. Halldórsson [9] presented an algorithm based on local search that has an approximation ratio of $H_k - \frac{1}{3}$ for the unweighted k-set cover, and a $(1.4 + \epsilon)$-approximation algorithm for the unweighted 3-set cover. Duh and Fürer [5] further improved this result and presented an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm for the unweighted k-set cover. We will base our algorithm on the algorithm of Duh and Fürer [5], and therefore we will review their algorithm and results in Section 2.2. All of these improvements [8, 9, 5] are based on running the greedy algorithm until each new subset covers at most $t$ new elements (where $t = 2$ in [8] and larger values of $t$ in [9, 5]) and then switching to another algorithm.

Regarding approximation algorithms for the weighted k-set cover problem within a factor better than $H_k$, a first improvement step was given by Fujito and Okumura [7], who presented an $\left(H_k - \frac{1}{12}\right)$-approximation algorithm for the k-set cover problem where the cost of each subset is either 1 or 2. More recently, Hassin and Levin [10] provided an $\left(H_k - \frac{k-1}{8k^9}\right)$-approximation algorithm for the general weighted k-set cover problem.

The maximum set packing problem is the following related problem: We are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $\mathcal{F}$ of subsets of $E$, where $\bigcup_{S \in \mathcal{F}} S = E$, and the goal is to compute a maximum size set packing, i.e., a sub-collection $\mathcal{F}' \subseteq \mathcal{F}$ of disjoint subsets. The relation between the maximum set packing problem and the unweighted set cover problem is that the fractional version of the maximum set packing problem is the dual linear program of the fractional version of the unweighted set cover problem. Hurkens and Schrijver [12] proved that a local-search algorithm for the maximum set packing problem, where each subset in $\mathcal{F}$ has at most $k$ elements, is a $\frac{2}{k}$-approximation algorithm. Therefore, this local-search algorithm has a better performance guarantee than the greedy selection rule that returns any maximal sub-collection. The greedy selection rule has an approximation ratio of $\frac{1}{k}$.

Paper overview: In Section 2 we review the greedy algorithm for the unweighted minimum k-set cover problem and its analysis, and the semi-local optimization algorithm of [5], and then we present our improved algorithm. We analyze its performance in Section 3, i.e., we show in Theorem 1 that our improved algorithm is an $\left(H_k - \frac{196}{390}\right)$-approximation algorithm for the unweighted k-set cover problem where $k \ge 4$, improving the earlier $\left(H_k - \frac{1}{2}\right)$-approximation algorithm of [5]. We conclude in Section 4 by discussing open questions.


2 Algorithms for the unweighted k-set cover problem

In Subsection 2.1 we review the greedy algorithm for the unweighted minimum k-set cover

problem, and its analysis. In Subsection 2.2 we review the semi-local optimization algorithm

of [5]. In Subsection 2.3 we present our improved algorithm.

Given an input to the unweighted k-set cover problem, we let the extended input be defined over the same set of elements, where the collection of subsets of the extended input is obtained from the input by including every subset of a subset in the input (i.e., the extended input is the closure of the input under taking subsets). We note that the extended input can be represented compactly by representing the maximal (under inclusion) subsets. A solution to the extended input is easily transformed into a solution for the original input by replacing each subset in the solution with a superset of it that is included in the input. This mapping can be maintained while creating the solution. To simplify the presentation of the algorithms, we assume that they are solving the extended input. We also assume that the optimal solution is with respect to the extended input.
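The closure under subsets and the mapping back to the original input can be sketched as follows. This is only an illustration under stated assumptions: the function names `extended_input` and `lift` are hypothetical, elements are assumed to be sortable, and the closure is enumerated explicitly (which is reasonable only because each set has at most $k$ elements) rather than represented compactly through the maximal subsets.

```python
from itertools import combinations

def extended_input(family):
    """Return a dict mapping every non-empty subset of an input set
    to one input set that contains it (hypothetical helper)."""
    superset_of = {}
    for s in family:
        s = frozenset(s)
        for size in range(1, len(s) + 1):
            for sub in combinations(sorted(s), size):
                # Keep the first superset found; any superset would do.
                superset_of.setdefault(frozenset(sub), s)
    return superset_of

def lift(solution, superset_of):
    """Replace each set of an extended-input solution by an input superset."""
    return [superset_of[frozenset(t)] for t in solution]

family = [{1, 2, 3}, {3, 4}]
ext = extended_input(family)
lifted = lift([{1, 2}, {4}], ext)
```

The lifted solution has the same number of sets as the extended-input solution and covers at least the same elements, so its cost does not increase.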

We start our study by stating a simplification lemma on the structure of the optimal

solution.

Lemma 1 Without loss of generality, we may assume that the optimal solution to (the extended input of) a set cover instance satisfies that each element is covered by exactly one subset of the optimum.

Proof: Let an optimal solution to the problem consist of a collection of sets $S_j$, $j \in J$, with $\bigcup_{j \in J} S_j = E$. We now construct another optimal solution formed of element-disjoint sets $S'_j$, where $S'_j \subseteq S_j$ for all $j \in J$. To do that, we assign each element $e \in E$ to the smallest-index set $S_j$, $j \in J$, that contains $e$, and for all values of $j$ we let $S'_j$ be the set of elements assigned to $S_j$. In the extended input the sets $S'_j$ for all $j$ belong to the collection $\mathcal{F}$, and the claim follows.
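The assignment used in the proof can be sketched in a few lines. This is an illustrative sketch, not part of the original paper: the function name `make_disjoint` is hypothetical, and the index of a set is assumed to be its position in the input list.

```python
def make_disjoint(cover):
    """Given sets S_j (indexed by list position), return sets S'_j where
    each element is kept only in the smallest-index set containing it."""
    assigned = set()
    disjoint = []
    for s in cover:
        part = set(s) - assigned  # elements not claimed by an earlier set
        assigned |= part
        disjoint.append(part)
    return disjoint

parts = make_disjoint([{1, 2, 3}, {3, 4}, {4, 5}])
```

The resulting sets are pairwise disjoint, each is a subset of the corresponding original set, and their union is unchanged, which is exactly what the lemma requires of the extended input.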

We define a $j$-set to be a set with $j$ elements. We fix an optimal solution $OPT$, and we say that a k-set is an optimal k-set if it is contained in $OPT$.

Given a partial cover $C$ and an algorithm $\mathcal{A}$, let $cost_{\mathcal{A}}(C)$ be the number of sets used by Algorithm $\mathcal{A}$ applied on the elements left uncovered by $C$, and let $cost_{\mathcal{A},1}(C)$ be the number of 1-sets among those.

2.1 The greedy algorithm

In this subsection we review the greedy algorithm for the unweighted k-set cover problem and

the proof of its performance guarantee.

The greedy algorithm starts with an empty collection of subsets in the solution and no element being covered. Then, it iterates the following procedure until all elements are covered: Let $w_S$ be the number of currently uncovered elements in a set $S \in \mathcal{F}$, and let the current ratio of $S$ be $r_S = \frac{1}{w_S}$. Let $S^*$ be a set such that $r_{S^*}$ is minimized. The algorithm adds $S^*$ to the collection of subsets of the solution, defines the elements of $S^*$ as covered, and assigns a price of $r_{S^*}$ to all the elements that are now covered but were uncovered prior to this iteration (i.e., the elements that were first covered by $S^*$).
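The procedure above can be sketched as follows. This is a minimal illustration, assuming the family is given as a list of Python sets; the function name `greedy_set_cover` and the tie-breaking rule (first set in list order) are assumptions, since the description leaves ties open.

```python
from fractions import Fraction

def greedy_set_cover(elements, family):
    """Unweighted greedy set cover that also records element prices."""
    uncovered = set(elements)
    solution = []
    price = {}
    while uncovered:
        # w_S is the number of uncovered elements of S; minimizing
        # r_S = 1 / w_S is the same as maximizing w_S.
        best = max(family, key=lambda s: len(uncovered & s))
        newly = uncovered & best
        if not newly:
            raise ValueError("the family does not cover all elements")
        for e in newly:
            price[e] = Fraction(1, len(newly))  # price r_S of the chosen set
        solution.append(best)
        uncovered -= newly
    return solution, price

family = [{1, 2, 3}, {4, 5, 6}, {1, 4}, {2, 5}, {3, 6}]
solution, price = greedy_set_cover({1, 2, 3, 4, 5, 6}, family)
```

Each chosen set distributes a total price of exactly 1 over the elements it newly covers, so the prices always sum to the number of sets in the greedy solution.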


Johnson [13], Lovász [17] and Chvátal [3] showed that the greedy algorithm is an $H_k$-approximation algorithm for the unweighted k-set cover.

Chvátal's proof is the following: first note that the cost of the greedy solution equals the sum of prices assigned to the elements. Second, consider a set $S$ that belongs to an optimal solution $OPT$. Then, $OPT$ pays 1 for $S$. Consider the elements of $S$ in the order in which they are covered by the greedy algorithm, breaking ties arbitrarily. When the algorithm covers the $i$-th element of $S$, the algorithm could, instead, choose $S$ as a feasible set with a current ratio of $\frac{1}{|S| - i + 1}$. Therefore, the price assigned to this element is at most $\frac{1}{|S| - i + 1}$. It follows that the total price assigned to the elements of $S$ is at most $\sum_{i=1}^{|S|} \frac{1}{|S| - i + 1} = \sum_{i=1}^{|S|} \frac{1}{i} \le H_k$, and therefore, the approximation ratio of the greedy algorithm is at most $H_k$.
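As a small worked instance of this bound (an illustration added here, not taken from the paper), take an optimal set $S$ with $|S| = 3$ in an instance with $k \ge 3$. The three elements of $S$, in the order in which greedy covers them, receive prices of at most $\frac{1}{3}$, $\frac{1}{2}$ and $1$:

```latex
% Worked instance with |S| = 3: the i-th covered element of S
% is priced at most 1 / (|S| - i + 1).
\[
  \underbrace{\tfrac{1}{3}}_{i=1}
  + \underbrace{\tfrac{1}{2}}_{i=2}
  + \underbrace{1}_{i=3}
  \;=\; \tfrac{11}{6} \;=\; H_3 \;\le\; H_k .
\]
```

Summing this bound over all sets of $OPT$ charges every element once (by Lemma 1), which gives a greedy cost of at most $H_k \cdot |OPT|$.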

2.2 The semi-local optimization algorithm

Duh and Fürer [5] suggested the following procedure to approximate the unweighted 3-set cover problem. In a pure local improvement step, we replace a number of sets with fewer sets to form a new cover with a reduced cost. To define a semi-local step, they observed (see also [8]) that once the 3-sets are selected, the remaining instance can be solved optimally in polynomial time by reduction to maximum matching. Hence, to solve the unweighted 2-set cover instance that results after selecting the 3-sets, they invoke the following algorithm A.

Algorithm A for optimally solving an unweighted 2-set cover instance

1. Find a maximum matching in the following graph: there is a vertex for each element, and an edge between two vertices if there is a 2-set consisting of this pair of elements.

2. Return the set of 2-sets corresponding to the edges of the maximum matching and the 1-sets of the elements left uncovered by the collection of 2-sets which we found.
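The two steps of Algorithm A can be sketched as follows. Note the loud assumption: the helper `max_matching` below is an exhaustive search that runs in exponential time and is only suitable for tiny examples; it stands in for a polynomial-time maximum matching algorithm (for example Edmonds' blossom algorithm), which is what the reduction actually relies on. The names `max_matching` and `algorithm_a` are hypothetical, and elements are assumed to be sortable.

```python
def max_matching(edges):
    """Exhaustive maximum matching (exponential time, tiny inputs only)."""
    edges = [tuple(e) for e in edges]
    best = []

    def extend(start, used, chosen):
        nonlocal best
        if len(chosen) > len(best):
            best = list(chosen)
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                chosen.append((u, v))
                extend(idx + 1, used | {u, v}, chosen)
                chosen.pop()

    extend(0, frozenset(), [])
    return best

def algorithm_a(elements, two_sets):
    """Cover `elements` with 2-sets from a maximum matching plus 1-sets."""
    elements = set(elements)
    # Step 1: one vertex per element, one edge per available 2-set.
    edges = [tuple(sorted(s)) for s in two_sets
             if len(s) == 2 and set(s) <= elements]
    matching = max_matching(edges)
    # Step 2: matched 2-sets plus a 1-set for every unmatched element.
    matched = {v for e in matching for v in e}
    cover = [frozenset(e) for e in matching]
    cover += [frozenset([e]) for e in sorted(elements - matched)]
    return cover

cover = algorithm_a({1, 2, 3, 4, 5}, [{1, 2}, {2, 3}, {3, 4}])
```

With $n$ uncovered elements and a maximum matching of size $m$, the returned cover uses $n - m$ sets, of which $n - 2m$ are 1-sets.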

Thus a local change in the 3-sets allows any global changes in the 2-sets and 1-sets, and such a change is called a semi-local change. They allowed the algorithm to remove one 3-set and insert at most a pair of 3-sets if one of the following happens: either the total cost is reduced, or the total cost remains the same and the number of 1-sets in the resulting solution is reduced (thus the total cost is the primary objective whereas the number of 1-sets is a secondary objective). This results in the approximation algorithm (Algorithm B below) for the unweighted k-set cover of [5], which is useful mainly for $k = 3$.


Algorithm B for approximating an unweighted k-set cover instance

1. Greedily build a maximal collection $\mathcal{C}$ of disjoint sets where each set in the collection contains at least three elements.

2. While there are sets $C \in \mathcal{C}$ and $C_1, C_2 \notin \mathcal{C}$ such that $\mathcal{C}' = (\mathcal{C} \setminus \{C\}) \cup \{C_1, C_2\}$ is a collection of disjoint sets where each set in the collection contains at least three elements, and such that the following condition holds: $cost_A(\mathcal{C}') + |\mathcal{C}'| < cost_A(\mathcal{C}) + |\mathcal{C}|$ or ($cost_A(\mathcal{C}') + |\mathcal{C}'| = cost_A(\mathcal{C}) + |\mathcal{C}|$ and $cost_{A,1}(\mathcal{C}')$
