un weighted sc
8/12/2019 Un Weighted Sc
Approximating the unweighted k-set cover problem: greedy
meets local search
Asaf Levin
August 21, 2008
Abstract
In the unweighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$
and a collection $F$ of subsets of $E$. The problem is to compute a sub-collection $SOL \subseteq F$
such that $\bigcup_{S_j \in SOL} S_j = E$ and its size $|SOL|$ is minimized. When $|S| \le k$ for all
$S \in F$ we obtain the unweighted k-set cover problem. It is well known that the greedy algorithm
is an $H_k$-approximation algorithm for the unweighted k-set cover, where
$H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the k-th harmonic number, and that this bound on the
approximation ratio of the greedy algorithm is tight for all constant values of $k$. Since the set
cover problem is a fundamental problem, there is an ongoing research effort to improve this
approximation ratio using modifications of the greedy algorithm. The previous best improvement of
the greedy algorithm is an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm. In this paper
we present a new $\left(H_k - \frac{196}{390}\right)$-approximation algorithm for $k \ge 4$ that
improves the previous best approximation ratio for all values of $k \ge 4$. Our algorithm is based
on combining local search during various stages of the greedy algorithm.
1 Introduction
In the weighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$
and a collection $F$ of subsets of $E$, where $\bigcup_{S \in F} S = E$ and each $S \in F$ has a
positive cost $c_S$. The goal is to compute a sub-collection $SOL \subseteq F$ such that
$\bigcup_{S \in SOL} S = E$ and its cost $\sum_{S \in SOL} c_S$ is minimized. Such a sub-collection
of subsets is called a cover. When we consider instances of the weighted set-cover such that each
$S_j$ has at most $k$ elements ($|S| \le k$ for all $S \in F$), we obtain the weighted k-set cover
problem. The unweighted set cover problem and the unweighted k-set cover problem are the special
cases of the weighted set cover and of weighted k-set cover, respectively, where $c_S = 1$ for all
$S \in F$.
It is well known (see [3]) that a greedy algorithm is an $H_k$-approximation algorithm for the
weighted k-set cover, where $H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the k-th harmonic number, and that
this bound is tight even for the unweighted k-set cover problem (see [13, 17]). For unbounded
values of $k$, Slavík [21] showed that the approximation ratio of the greedy algorithm for the
unweighted set cover problem is $\ln n - \ln\ln n + \Theta(1)$. Feige [6] proved that unless
$NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$ the unweighted set cover problem cannot be
approximated within a factor $(1 - \epsilon) \ln n$, for any $\epsilon > 0$. Raz and Safra [20]
proved that if $P \ne NP$ then for some constant $c$, the unweighted
Department of Statistics, The Hebrew University, Jerusalem, Israel. email [email protected]
set cover problem cannot be approximated within a factor $c \log n$. This result shows that the
greedy algorithm is an asymptotically best possible approximation algorithm for the weighted
and unweighted set cover problem (unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$). The
unweighted k-set cover problem is known to be NP-complete [14] and MAX SNP-hard for all $k \ge 3$
[4, 15, 18].
Another algorithm for the weighted set cover problem by Hochbaum [11] has an approximation
ratio that depends on the maximum number of subsets that contain any given element (the
local-ratio algorithm of Bar-Yehuda and Even [2] has the same performance guarantee). See
Paschos [19] for a survey on these results.
In spite of the above bad news, Goldschmidt, Hochbaum and Yu [8] modified the greedy
algorithm for the unweighted k-set cover and showed that the resulting algorithm has a
performance guarantee of $H_k - \frac{1}{6}$. Halldórsson [9] presented an algorithm based on local
search that has an approximation ratio of $H_k - \frac{1}{3}$ for the unweighted k-set cover, and a
$(1.4 + \epsilon)$-approximation algorithm for the unweighted 3-set cover. Duh and Fürer [5] further
improved this result and presented an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm for
the unweighted k-set cover. We will base our algorithm on the algorithm of Duh and Fürer [5], and
therefore we will review their algorithm and results in Section 2.2. All of these improvements
[8, 9, 5] are based on running the greedy algorithm until each new subset covers at most $t$ new
elements (where $t = 2$ in [8] and larger values of $t$ in [9, 5]) and then switching to another
algorithm.
Regarding approximation algorithms for the weighted k-set cover problem within a factor
better than $H_k$, a first improvement step was given by Fujito and Okumura [7] who presented an
$\left(H_k - \frac{1}{12}\right)$-approximation algorithm for the k-set cover problem where the cost
of each subset is either 1 or 2. More recently, Hassin and Levin [10] provided an
$\left(H_k - \frac{k-1}{8k^9}\right)$-approximation algorithm for the general weighted k-set cover
problem.
The maximum set packing problem is the following related problem: We are given a
set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $F$ of subsets of $E$, where
$\bigcup_{S \in F} S = E$, and the goal is to compute a maximum size set packing, i.e., a
sub-collection $F' \subseteq F$ of disjoint subsets. The relation between the maximum set packing
problem and the unweighted set cover problem is that the fractional version of the maximum set
packing problem is the dual linear program of the fractional version of the unweighted set cover
problem. Hurkens and Schrijver [12] proved that a local-search algorithm for the maximum set
packing problem, where each subset in $F$ has at most $k$ elements, is a
$\left(\frac{2}{k} - \epsilon\right)$-approximation algorithm. Therefore, this local-search
algorithm has a better performance guarantee than the greedy selection rule that returns any
maximal sub-collection. The greedy selection rule has an approximation ratio of $\frac{1}{k}$.
Paper overview: In Section 2 we review the greedy algorithm for the unweighted minimum
k-set cover problem, and its analysis, the semi-local optimization algorithm of [5],
and then we present our improved algorithm. We analyze its performance in Section 3, i.e.,
we show in Theorem 1 that our improved algorithm is an
$\left(H_k - \frac{196}{390}\right)$-approximation algorithm for the unweighted k-set cover problem
where $k \ge 4$, improving the earlier $\left(H_k - \frac{1}{2}\right)$-approximation algorithm of
[5]. We conclude in Section 4 by discussing open questions.
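To make the size of the improvement concrete, the two guarantees can be compared by direct arithmetic. The instance $k = 4$ is chosen here only for illustration, using $H_4 = \frac{25}{12}$:

\[
\frac{196}{390} - \frac{1}{2} = \frac{196 - 195}{390} = \frac{1}{390},
\qquad
H_4 - \frac{1}{2} = \frac{19}{12} \approx 1.5833,
\qquad
H_4 - \frac{196}{390} = \frac{411}{260} \approx 1.5808 .
\]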
2 Algorithms for the unweighted k-set cover problem
In Subsection 2.1 we review the greedy algorithm for the unweighted minimum k-set cover
problem, and its analysis. In Subsection 2.2 we review the semi-local optimization algorithm
of [5]. In Subsection 2.3 we present our improved algorithm.
Given an input to the unweighted k-set cover problem we let the extended input be defined
over the same set of elements where the collection of subsets of the extended input is obtained
from the input by including every subset of a subset in the input (i.e., the extended input
is the closure of the input under taking subsets). We note that the extended input can be
represented compactly by representing the maximal (under inclusion) subsets. A solution to
the extended input is easily transformed into a solution for the original input by replacing
each subset in the solution with a superset of it that is included in the input. This mapping
can be maintained while creating the solution. To simplify the presentation of the algorithms we
assume that they are solving the extended input. We also assume that the optimal solution
is with respect to the extended input.
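The compact representation and the mapping back to the original input can be sketched as follows. This is a minimal illustration, not part of the original presentation: the function names `in_extended_input` and `lift_to_original` are hypothetical, and the input sets are assumed to be given as a list of `frozenset` objects.

```python
def in_extended_input(candidate, input_sets):
    """Return True if `candidate` is a subset of some set of the input,
    i.e. if it belongs to the closure of the input under taking subsets."""
    candidate = frozenset(candidate)
    return any(candidate <= s for s in input_sets)


def lift_to_original(solution, input_sets):
    """Replace each subset of an extended-input solution by an input superset.

    The lifted collection covers at least the same elements and contains
    no more sets than `solution`."""
    lifted = []
    for subset in solution:
        superset = next(s for s in input_sets if frozenset(subset) <= s)
        if superset not in lifted:
            lifted.append(superset)
    return lifted
```

For example, with input sets {1, 2, 3} and {3, 4}, the extended-input solution {1, 2}, {3}, {4} lifts to the two input sets themselves.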
We start our study by stating a simplification lemma on the structure of the optimal
solution.
Lemma 1 Without loss of generality, we may assume that the optimal solution to the
(extended input of) a set cover instance satisfies that each element is covered by exactly one
subset of the optimum.
Proof: Let an optimal solution to the problem consist of a collection of sets $S_j$, $j \in J$,
with $\bigcup_{j \in J} S_j = E$. We now construct another optimal solution formed of
element-disjoint sets $S'_j$ where $S'_j \subseteq S_j$ for all $j \in J$. To do that, we assign
each element $e \in E$ to the smallest index set $S_j$, $j \in J$, that contains $e$, and for all
values of $j$ we let $S'_j$ be the set of elements assigned to $S_j$. In the extended input the
sets $S'_j$ for all $j$ belong to the collection $F$, and the claim follows.
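The construction in the proof can be sketched in a few lines. This is an illustration under the assumption that the cover is given as an ordered list of sets, so that the list position plays the role of the index $j$; the name `make_disjoint` is hypothetical.

```python
def make_disjoint(cover):
    """Assign each element to the lowest-index set of `cover` containing it.

    Returns a list S'_0, S'_1, ... of pairwise disjoint sets with
    S'_j a subset of cover[j] and the same union as `cover`."""
    assigned = set()
    disjoint = []
    for s in cover:
        fresh = frozenset(s) - assigned  # elements not claimed by an earlier set
        disjoint.append(fresh)
        assigned |= fresh
    return disjoint
```

In an optimal cover no set is redundant, so every resulting set is nonempty and the number of sets is unchanged.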
We define a j-set to be a set with $j$ elements. We fix an optimal solution $OPT$, and we
say that a k-set is an optimal k-set if it is contained in $OPT$.
Given a partial cover $\mathcal{C}$ and an algorithm $X$, let $cost_X(\mathcal{C})$ be the number
of sets used by Algorithm $X$ applied on the elements left uncovered by $\mathcal{C}$ and let
$cost_{X,1}(\mathcal{C})$ be the number of 1-sets among those.
2.1 The greedy algorithm
In this subsection we review the greedy algorithm for the unweighted k-set cover problem and
the proof of its performance guarantee.
The greedy algorithm starts with an empty collection of subsets in the solution and no
element being covered. Then, it iterates the following procedure until all elements are covered:
Let $w_S$ be the number of currently uncovered elements in a set $S \in F$, and the current
ratio of $S$ is $r_S = \frac{1}{w_S}$. Let $S^*$ be a set such that $r_{S^*}$ is minimized. The
algorithm adds $S^*$ to the collection of subsets of the solution, defines the elements of $S^*$
as covered, and assigns a price of $r_{S^*}$ to all the elements that are now covered but were
uncovered prior to this iteration (i.e., the elements that were first covered by $S^*$).
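The iteration described above can be sketched as follows. This is a minimal illustration, assuming the sets are given as `frozenset` objects and that ties are broken by input order; the name `greedy_set_cover` is hypothetical. Prices are kept as exact fractions so that their sum can be compared with the number of chosen sets.

```python
from fractions import Fraction


def greedy_set_cover(elements, sets):
    """Greedy unweighted set cover; returns (chosen sets, price per element)."""
    uncovered = set(elements)
    solution = []
    price = {}
    while uncovered:
        # Minimizing r_S = 1 / w_S is the same as maximizing w_S.
        best = max(sets, key=lambda s: len(uncovered & s))
        newly_covered = uncovered & best
        if not newly_covered:
            raise ValueError("the sets do not cover all elements")
        for e in newly_covered:
            price[e] = Fraction(1, len(newly_covered))
        solution.append(best)
        uncovered -= newly_covered
    return solution, price
```

Each chosen set distributes a total price of 1 among the elements it newly covers, so the prices sum to the number of sets in the solution.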
Johnson [13], Lovász [17] and Chvátal [3] showed that the greedy algorithm is an
$H_k$-approximation algorithm for the unweighted k-set cover.
Chvátal's proof is the following: first note that the cost of the greedy solution equals the
sum of prices assigned to the elements. Second, consider a set $S$ that belongs to an optimal
solution $OPT$. Then, $OPT$ pays 1 for $S$. Consider the elements of $S$ in the order in which
they are covered by the greedy algorithm, breaking ties arbitrarily. When the algorithm covers
the $i$-th element of $S$, the algorithm could, instead, choose $S$ as a feasible set with a
current ratio of at most $\frac{1}{|S| - i + 1}$. Therefore, the price assigned to this element is
at most $\frac{1}{|S| - i + 1}$. It follows that the total price assigned to the elements of $S$ is
at most $\sum_{i=1}^{|S|} \frac{1}{|S| - i + 1} = \sum_{i=1}^{|S|} \frac{1}{i} \le H_k$,
and therefore, the approximation ratio of the greedy algorithm is at most $H_k$.
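The last step, from the per-set bound to the approximation ratio, can be written out explicitly. Here $GREEDY$ denotes the greedy solution and $\mathrm{price}(e)$ the price assigned to element $e$. Both names are introduced only for this illustration. Since every element belongs to at least one set of $OPT$:

\[
|GREEDY| \;=\; \sum_{e \in E} \mathrm{price}(e)
\;\le\; \sum_{S \in OPT} \sum_{e \in S} \mathrm{price}(e)
\;\le\; \sum_{S \in OPT} H_{|S|}
\;\le\; H_k \cdot |OPT| .
\]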
2.2 The semi-local optimization algorithm
Duh and Fürer [5] suggested the following procedure to approximate the unweighted 3-set
cover problem. In a pure local improvement step, we replace a number of sets with fewer
sets to form a new cover with a reduced cost. To define a semi-local step, they observed (see
also [8]) that once the 3-sets are selected the remaining instance can be solved optimally in
polynomial time by reduction to maximum matching. Hence, to solve the unweighted 2-set
cover instance that results after selecting the 3-sets, they invoke the following Algorithm A.
Algorithm A for optimally solving an unweighted 2-set cover instance
1. Find a maximum matching in the following graph: there is a vertex for each element,
and an edge between two vertices if there is a 2-set consisting of this pair of elements.
2. Return the set of 2-sets corresponding to the edges of the maximum matching and
the 1-sets of the uncovered elements (by the collection of 2-sets which we found).
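The two steps of Algorithm A can be sketched as follows. Note the assumptions: the maximum matching is found here by an exhaustive search that takes exponential time and is only a stand-in suitable for tiny instances, whereas a polynomial-time implementation would use a general-graph matching algorithm such as Edmonds' blossom algorithm. The names `maximum_matching` and `algorithm_a` are hypothetical, elements are assumed to be mutually comparable (for example integers), and every 2-set is assumed to consist of two distinct elements of the instance.

```python
from functools import lru_cache


def maximum_matching(vertices, edges):
    """Exhaustive maximum matching (exponential time; small instances only)."""
    verts = sorted(vertices)
    index = {v: i for i, v in enumerate(verts)}
    adj = [0] * len(verts)  # adjacency as bitmasks
    for u, v in edges:
        adj[index[u]] |= 1 << index[v]
        adj[index[v]] |= 1 << index[u]

    @lru_cache(maxsize=None)
    def best(free):
        # `free` is a bitmask of vertices that are still undecided.
        if free == 0:
            return ()
        i = (free & -free).bit_length() - 1  # lowest undecided vertex
        rest = free & ~(1 << i)
        result = best(rest)  # option: leave vertex i unmatched
        candidates = adj[i] & rest
        while candidates:
            j = (candidates & -candidates).bit_length() - 1
            candidates &= candidates - 1  # drop the lowest candidate bit
            option = ((i, j),) + best(rest & ~(1 << j))
            if len(option) > len(result):
                result = option
        return result

    return [(verts[i], verts[j]) for i, j in best((1 << len(verts)) - 1)]


def algorithm_a(elements, two_sets):
    """Optimal cover of `elements` by the given 2-sets plus 1-sets."""
    matching = maximum_matching(elements, two_sets)
    matched = {v for edge in matching for v in edge}
    cover = [frozenset(edge) for edge in matching]
    cover += [frozenset([e]) for e in sorted(elements) if e not in matched]
    return cover
```

On a path with elements 1, 2, 3, 4 and 2-sets {1, 2}, {2, 3}, {3, 4}, the sketch returns two 2-sets. On a star with center 0 and leaves 1, 2, 3, only one 2-set can be used, so the cover consists of one 2-set and two 1-sets.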
Thus a local change in the 3-sets allows any global changes in the 2-sets and 1-sets and
such a change is called a semi-local change. They allowed the algorithm to remove one 3-set
and insert at most a pair of 3-sets if one of the following happens: either the total cost is
reduced, or the total cost remains the same and the number of 1-sets in the resulting solution
is reduced (thus the total cost is the primary objective whereas the number of 1-sets is a
secondary objective). This results in the approximation algorithm (Algorithm B below) for
the unweighted k-set cover of [5] which is useful mainly for $k = 3$.
Algorithm B for approximating an unweighted k-set cover instance
1. Greedily build a maximal collection $\mathcal{C}$ of disjoint sets where each set in the
collection contains at least three elements.
2. While there are sets $C \in \mathcal{C}$ and $C_1, C_2 \notin \mathcal{C}$ such that
$\mathcal{C}' = (\mathcal{C} \setminus \{C\}) \cup \{C_1, C_2\}$ is a collection of disjoint sets
where each set in the collection contains at least three elements, and such that the following
condition holds: $cost_A(\mathcal{C}') + |\mathcal{C}'| < cost_A(\mathcal{C}) + |\mathcal{C}|$ or
($cost_A(\mathcal{C}') + |\mathcal{C}'| = cost_A(\mathcal{C}) + |\mathcal{C}|$ and
$cost_{A,1}(\mathcal{C}')$