  • 8/12/2019 Un Weighted Sc

    Approximating the unweighted k-set cover problem: greedy meets local search

    Asaf Levin

    August 21, 2008

    Abstract

    In the unweighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $F$ of subsets of $E$. The problem is to compute a sub-collection $SOL \subseteq F$ such that $\bigcup_{S_j \in SOL} S_j = E$ and its size $|SOL|$ is minimized. When $|S| \le k$ for all $S \in F$ we obtain the unweighted $k$-set cover problem. It is well known that the greedy algorithm is an $H_k$-approximation algorithm for the unweighted $k$-set cover, where $H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the $k$-th harmonic number, and that this bound on the approximation ratio of the greedy algorithm is tight for all constant values of $k$. Since the set cover problem is a fundamental problem, there is an ongoing research effort to improve this approximation ratio using modifications of the greedy algorithm. The previous best improvement of the greedy algorithm is an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm. In this paper we present a new $\left(H_k - \frac{196}{390}\right)$-approximation algorithm for $k \ge 4$ that improves the previous best approximation ratio for all values of $k \ge 4$. Our algorithm is based on combining local search during various stages of the greedy algorithm.

    1 Introduction

    In the weighted set-cover problem we are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $F$ of subsets of $E$, where $\bigcup_{S \in F} S = E$ and each $S \in F$ has a positive cost $c_S$. The goal is to compute a sub-collection $SOL \subseteq F$ such that $\bigcup_{S \in SOL} S = E$ and its cost $\sum_{S \in SOL} c_S$ is minimized. Such a sub-collection of subsets is called a cover. When we consider instances of the weighted set-cover such that each set has at most $k$ elements ($|S| \le k$ for all $S \in F$), we obtain the weighted $k$-set cover problem. The unweighted set cover problem and the unweighted $k$-set cover problem are the special cases of the weighted set cover and of the weighted $k$-set cover, respectively, where $c_S = 1$ for all $S \in F$.

    It is well known (see [3]) that a greedy algorithm is an $H_k$-approximation algorithm for the weighted $k$-set cover, where $H_k = \sum_{i=1}^{k} \frac{1}{i}$ is the $k$-th harmonic number, and that this bound is tight even for the unweighted $k$-set cover problem (see [13, 17]). For unbounded values of $k$, Slavík [21] showed that the approximation ratio of the greedy algorithm for the unweighted set cover problem is $\ln n - \ln\ln n + \Theta(1)$. Feige [6] proved that unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$ the unweighted set cover problem cannot be approximated within a factor $(1 - \epsilon) \ln n$, for any $\epsilon > 0$. Raz and Safra [20] proved that if $P \ne NP$ then for some constant $c$, the unweighted set cover problem cannot be approximated within a factor $c \log n$. This result shows that the greedy algorithm is an asymptotically best possible approximation algorithm for the weighted and unweighted set cover problem (unless $NP \subseteq DTIME(n^{\mathrm{polylog}\, n})$). The unweighted $k$-set cover problem is known to be NP-complete [14] and MAX SNP-hard for all $k \ge 3$ [4, 15, 18]. Another algorithm for the weighted set cover problem by Hochbaum [11] has an approximation ratio that depends on the maximum number of subsets that contain any given element (the local-ratio algorithm of Bar-Yehuda and Even [2] has the same performance guarantee). See Paschos [19] for a survey on these results.

    Department of Statistics, The Hebrew University, Jerusalem, Israel. email [email protected]

    In spite of the above bad news, Goldschmidt, Hochbaum and Yu [8] modified the greedy algorithm for the unweighted $k$-set cover and showed that the resulting algorithm has a performance guarantee of $H_k - \frac{1}{6}$. Halldórsson [9] presented an algorithm based on local search that has an approximation ratio of $H_k - \frac{1}{3}$ for the unweighted $k$-set cover, and a $(1.4 + \epsilon)$-approximation algorithm for the unweighted 3-set cover. Duh and Fürer [5] further improved this result and presented an $\left(H_k - \frac{1}{2}\right)$-approximation algorithm for the unweighted $k$-set cover. We will base our algorithm on the algorithm of Duh and Fürer [5], and therefore we will review their algorithm and results in Section 2.2. All of these improvements [8, 9, 5] are based on running the greedy algorithm until each new subset covers at most $t$ new elements (where $t = 2$ in [8] and larger values of $t$ in [9, 5]) and then switching to another algorithm.

    Regarding approximation algorithms for the weighted $k$-set cover problem within a factor better than $H_k$, a first improvement step was given by Fujito and Okumura [7] who presented an $\left(H_k - \frac{1}{12}\right)$-approximation algorithm for the $k$-set cover problem where the cost of each subset is either 1 or 2. More recently, Hassin and Levin [10] provided an $\left(H_k - \frac{k-1}{8k^9}\right)$-approximation algorithm for the general weighted $k$-set cover problem.

    The maximum set packing problem is the following related problem: We are given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a collection $F$ of subsets of $E$, where $\bigcup_{S \in F} S = E$, and the goal is to compute a maximum size set packing, i.e., a sub-collection $F' \subseteq F$ of disjoint subsets. The relation between the maximum set packing problem and the unweighted set cover problem is that the fractional version of the maximum set packing problem is the dual linear program of the fractional version of the unweighted set cover problem. Hurkens and Schrijver [12] proved that a local-search algorithm for the maximum set packing problem where each subset in $F$ has at most $k$ elements is a $\left(\frac{2}{k} - \epsilon\right)$-approximation algorithm. Therefore, this local-search algorithm has a better performance guarantee than the greedy selection rule that returns any maximal sub-collection. The greedy selection rule has an approximation ratio of $\frac{1}{k}$.

    Paper overview: In Section 2 we review the greedy algorithm for the unweighted minimum $k$-set cover problem and its analysis, and the semi-local optimization algorithm of [5], and then we present our improved algorithm. We analyze its performance in Section 3, i.e., we show in Theorem 1 that our improved algorithm is an $\left(H_k - \frac{196}{390}\right)$-approximation algorithm for the unweighted $k$-set cover problem where $k \ge 4$, improving the earlier $\left(H_k - \frac{1}{2}\right)$-approximation algorithm of [5]. We conclude in Section 4 by discussing open questions.
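To make the size of the improvement concrete, the three bounds mentioned above can be compared with exact rational arithmetic. The following is only an illustrative sketch; the function names `harmonic` and `ratios` are hypothetical helpers and do not come from the paper.

```python
from fractions import Fraction


def harmonic(k):
    """Return H_k = 1 + 1/2 + ... + 1/k as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, k + 1))


def ratios(k):
    """Return (greedy bound, bound of [5], bound stated in this paper) for k >= 4."""
    h = harmonic(k)
    return h, h - Fraction(1, 2), h - Fraction(196, 390)


for k in range(4, 8):
    greedy, duh_furer, new = ratios(k)
    print(k, float(greedy), float(duh_furer), float(new))
```

Since $\frac{196}{390} - \frac{1}{2} = \frac{1}{390}$, the new bound is smaller than the bound of [5] by the same constant $\frac{1}{390}$ for every $k \ge 4$.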

    2 Algorithms for the unweighted k-set cover problem

    In Subsection 2.1 we review the greedy algorithm for the unweighted minimum k-set cover

    problem, and its analysis. In Subsection 2.2 we review the semi-local optimization algorithm

    of [5]. In Subsection 2.3 we present our improved algorithm.

    Given an input to the unweighted $k$-set cover problem we let the extended input be defined over the same set of elements where the collection of subsets of the extended input is obtained from the input by including every subset of a subset in the input (i.e., the extended input is the closure of the input under taking subsets). We note that the extended input can be represented compactly by representing the maximal (under inclusion) subsets. A solution to the extended input is easily transformed into a solution for the original input by replacing each subset in the solution with a superset of it that is included in the input. This mapping can be maintained while creating the solution. To simplify the presentation of the algorithms we assume that they are solving the extended input. We also assume that the optimal solution is with respect to the extended input.
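The transformation from a solution of the extended input back to a solution of the original input can be sketched as follows. This is a minimal illustration under the assumption that sets are stored as Python `frozenset` objects; the name `lift_to_original` is hypothetical.

```python
def lift_to_original(solution, original_sets):
    """Replace each subset chosen for the extended input by a superset from the original input."""
    lifted = []
    for subset in solution:
        # Any original set containing the chosen subset will do; take the first one found.
        # (next() raises StopIteration if the subset is not in the extended input.)
        superset = next(s for s in original_sets if subset <= s)
        if superset not in lifted:
            lifted.append(superset)
    return lifted


original = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5, 6})]
extended_solution = [frozenset({1, 2, 3}), frozenset({4}), frozenset({5, 6})]
print(lift_to_original(extended_solution, original))
```

The lifted solution never uses more sets than the solution of the extended input, and it covers at least the same elements.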

    We start our study by stating a simplification lemma on the structure of the optimal

    solution.

    Lemma 1 Without loss of generality, we may assume that the optimal solution to (the extended input of) a set cover instance satisfies that each element is covered by exactly one subset of the optimum.

    Proof: Let an optimal solution to the problem consist of a collection of sets $S_j$, $j \in J$, with $\bigcup_{j \in J} S_j = E$. We now construct another optimal solution formed of element-disjoint sets $S'_j$ where $S'_j \subseteq S_j$ for all $j \in J$. To do that, we assign each element $e \in E$ to the smallest index set $S_j$, $j \in J$, that contains $e$, and for all values of $j$ we let $S'_j$ be the set of elements assigned to $S_j$. In the extended input the sets $S'_j$ for all $j$ belong to the collection $F$, and the claim follows.
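The assignment used in the proof can be sketched in a few lines. This is an illustrative sketch only; the name `make_disjoint` is hypothetical, and the input is assumed to be a list of sets whose list order plays the role of the index $j$.

```python
def make_disjoint(cover):
    """Shrink the sets of a cover so that every element lies in exactly one of them."""
    assigned = set()
    disjoint = []
    for s in cover:  # index order: each element goes to the smallest-index set containing it
        remaining = frozenset(s) - assigned
        assigned |= remaining
        disjoint.append(remaining)
    return disjoint


print(make_disjoint([{1, 2, 3}, {3, 4}, {4, 5}]))
```

If the input is an optimal cover, no resulting set is empty, since an empty $S'_j$ would mean that $S_j$ could be dropped from the cover.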

    We define a $j$-set to be a set with $j$ elements. We fix an optimal solution $OPT$, and we say that a $k$-set is an optimal $k$-set if it is contained in $OPT$.

    Given a partial cover $\mathcal{C}$ and an algorithm $X$, let $\mathrm{cost}_X(\mathcal{C})$ be the number of sets used by Algorithm $X$ applied on the elements left uncovered by $\mathcal{C}$ and let $\mathrm{cost}_{X,1}(\mathcal{C})$ be the number of 1-sets among those.

    2.1 The greedy algorithm

    In this subsection we review the greedy algorithm for the unweighted k-set cover problem and

    the proof of its performance guarantee.

    The greedy algorithm starts with an empty collection of subsets in the solution and no element being covered. Then, it iterates the following procedure until all elements are covered: Let $w_S$ be the number of currently uncovered elements in a set $S \in F$, and let the current ratio of $S$ be $r_S = \frac{1}{w_S}$. Let $S^*$ be a set such that $r_{S^*}$ is minimized. The algorithm adds $S^*$ to the collection of subsets of the solution, defines the elements of $S^*$ as covered, and assigns a price of $r_{S^*}$ to all the elements that are now covered but were uncovered prior to this iteration (i.e., the elements that were first covered by $S^*$).
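The procedure described above can be sketched as follows. This is a minimal illustration, not the paper's own code; the function name `greedy_set_cover` and the representation of sets as `frozenset` objects are assumptions, and ties are broken by list order.

```python
from fractions import Fraction


def greedy_set_cover(elements, sets):
    """Greedy unweighted set cover; returns the chosen sets and the price of each element."""
    uncovered = set(elements)
    chosen, price = [], {}
    while uncovered:
        # Minimizing r_S = 1 / w_S is the same as maximizing w_S,
        # the number of currently uncovered elements in S.
        best = max(sets, key=lambda s: len(uncovered & s))
        newly = uncovered & best
        if not newly:
            raise ValueError("the sets do not cover all elements")
        for e in newly:
            price[e] = Fraction(1, len(newly))  # price r_S of each newly covered element
        chosen.append(best)
        uncovered -= newly
    return chosen, price


sets = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 5}), frozenset({3, 4, 6}), frozenset({5, 6})]
chosen, price = greedy_set_cover(range(1, 7), sets)
print(chosen, price)
```

Each chosen set distributes a total price of exactly 1 among the elements it newly covers, so the prices sum to the number of chosen sets.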

    Johnson [13], Lovász [17] and Chvátal [3] showed that the greedy algorithm is an $H_k$-approximation algorithm for the unweighted $k$-set cover.

    Chvátal's proof is the following: first note that the cost of the greedy solution equals the sum of prices assigned to the elements. Second, consider a set $S$ that belongs to an optimal solution $OPT$. Then, $OPT$ pays 1 for $S$. Consider the elements of $S$ in the order in which they are covered by the greedy algorithm, breaking ties arbitrarily. When the algorithm covers the $i$-th element of $S$, the algorithm could, instead, choose $S$ as a feasible set with a current ratio of at most $\frac{1}{|S|-i+1}$. Therefore, the price assigned to this element is at most $\frac{1}{|S|-i+1}$. It follows that the total price assigned to the elements of $S$ is at most $\sum_{i=1}^{|S|} \frac{1}{|S|-i+1} = \sum_{i=1}^{|S|} \frac{1}{i} \le H_k$, and therefore, the approximation ratio of the greedy algorithm is at most $H_k$.
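The last step of the argument can be written out as one chain of (in)equalities. By Lemma 1 we may assume that every element belongs to exactly one set of $OPT$, so the sum of prices over all elements splits into sums over the optimal sets. The notation $\mathrm{price}(e)$ for the price assigned to element $e$ is introduced here only for illustration.

```latex
\[
  \mathrm{cost}(\mathrm{greedy})
  = \sum_{e \in E} \mathrm{price}(e)
  = \sum_{S \in OPT} \sum_{e \in S} \mathrm{price}(e)
  \le \sum_{S \in OPT} H_k
  = H_k \cdot |OPT| .
\]
```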

    2.2 The semi-local optimization algorithm

    Duh and Fürer [5] suggested the following procedure to approximate the unweighted 3-set cover problem. In a pure local improvement step, we replace a number of sets with fewer sets to form a new cover with a reduced cost. To define a semi-local step, they observed (see also [8]) that once the 3-sets are selected the remaining instance can be solved optimally in polynomial time by reduction to maximum matching. Hence, to solve the unweighted 2-set cover instance that results after selecting the 3-sets, they invoke the following Algorithm A.

    Algorithm A for solving optimally unweighted 2-set cover instance

    1. Find a maximum matching in the following graph: there is a vertex for each element, and an edge between two vertices if there is a 2-set consisting of this pair of elements.

    2. Return the set of 2-sets corresponding to the edges of the maximum matching and the 1-sets of the elements left uncovered by these 2-sets.
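The two steps above can be sketched as follows. Note the swapped-in technique: to keep the example self-contained, the maximum matching is found here by exhaustive search, which takes exponential time and is suitable only for tiny instances; a polynomial-time implementation would use a general-graph matching algorithm instead. The names `max_matching_bruteforce` and `algorithm_a` are hypothetical.

```python
def max_matching_bruteforce(edges):
    """Return a maximum-cardinality matching by exhaustive search (exponential time)."""
    edges = list(edges)
    best = []

    def extend(start, used, chosen):
        nonlocal best
        if len(chosen) > len(best):
            best = list(chosen)
        for idx in range(start, len(edges)):
            u, v = edges[idx]
            if u not in used and v not in used:
                chosen.append((u, v))
                extend(idx + 1, used | {u, v}, chosen)
                chosen.pop()

    extend(0, frozenset(), [])
    return best


def algorithm_a(elements, two_sets):
    """Optimal cover of `elements` by the given 2-sets plus 1-sets (extended input assumed)."""
    matching = max_matching_bruteforce(two_sets)   # step 1: maximum matching
    covered = {x for pair in matching for x in pair}
    cover = [frozenset(pair) for pair in matching]  # step 2: 2-sets of the matching ...
    cover += [frozenset([e]) for e in elements if e not in covered]  # ... plus 1-sets
    return cover


print(algorithm_a([1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4)]))
```

The sketch relies on the extended input: every single element is available as a 1-set, so the elements missed by the matching can always be covered.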

    Thus a local change in the 3-sets allows any global changes in the 2-sets and 1-sets and such a change is called a semi-local change. They allowed the algorithm to remove one 3-set and insert at most a pair of 3-sets if one of the following happens: either the total cost is reduced, or the total cost remains the same and the number of 1-sets in the resulting solution is reduced (thus the total cost is the primary objective whereas the number of 1-sets is a secondary objective). This results in the approximation algorithm (Algorithm B below) for the unweighted $k$-set cover of [5] which is useful mainly for $k = 3$.
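The acceptance rule for a semi-local change is a lexicographic comparison: total cost first, number of 1-sets second. A minimal sketch, assuming each solution is summarized by a hypothetical pair `(total_cost, num_one_sets)`:

```python
def is_semi_local_improvement(old, new):
    """Return True if `new` beats `old`, where each is a (total_cost, num_one_sets) pair.

    Python compares tuples lexicographically, so the total cost is the primary
    objective and the number of 1-sets is the tie-breaking secondary objective.
    """
    return tuple(new) < tuple(old)


print(is_semi_local_improvement((5, 2), (4, 3)))  # lower total cost
print(is_semi_local_improvement((5, 2), (5, 1)))  # same cost, fewer 1-sets
```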

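    Algorithm B for approximating unweighted k-set cover instance

    1. Greedily build a maximal collection $\mathcal{C}$ of disjoint sets where each set in the collection contains at least three elements.

    2. While there are sets $C \in \mathcal{C}$ and $C_1, C_2 \notin \mathcal{C}$ such that $\mathcal{C}' = (\mathcal{C} \setminus \{C\}) \cup \{C_1, C_2\}$ is a collection of disjoint sets where each set in the collection contains at least three elements, and such that the following condition holds: $\mathrm{cost}_A(\mathcal{C}') + |\mathcal{C}'| < \mathrm{cost}_A(\mathcal{C}) + |\mathcal{C}|$ or ($\mathrm{cost}_A(\mathcal{C}') + |\mathcal{C}'| = \mathrm{cost}_A(\mathcal{C}) + |\mathcal{C}|$ and $\mathrm{cost}_{A,1}(\mathcal{C}')$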