![Page 1: A Dependent LP-Rounding Approach for the k-Median Problem Moses Charikar 1 Shi Li 1 1 Department of Computer Science Princeton University ICALP 2012, Warwick,](https://reader037.vdocuments.net/reader037/viewer/2022110116/551b1a99550346f70d8b63fd/html5/thumbnails/1.jpg)
A Dependent LP-Rounding Approach for the k-Median Problem
Moses Charikar¹ Shi Li¹
¹Department of Computer Science
Princeton University
ICALP 2012, Warwick, UK
Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median
k-Median as a Clustering Problem
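• Given: metric (X, d), k
• Partition X into k clusters
• Select a center for each cluster
• Minimize sum of distances to the centers: Σ_{x ∈ X} d(x, c(x)), where c(x) is the center of the cluster containing x
• Quantifies how well a set can be divided into k partitions
[Figure: example with k = 4]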
k-Median in Operations Research
• Given metric (F ∪ C, d), k
  o F : set of facilities
  o C : set of clients
• Open k facilities
• Connect each client to its nearest open facility
• Minimize total connection cost
[Figure: example with k = 4]
Related Problem: Facility Location Problem
• Given metric (F ∪ C, d), {fi ≥ 0 : i ∈ F}
  o F : set of facilities
  o C : set of clients
  o fi : facility cost of opening i
• Open a set F' ⊆ F of facilities
• Connect each client to its nearest open facility
• Minimize sum of facility cost and connection cost
[Figure: example instance]
Known Results
| | Approx. | Hardness of approx. |
|---|---|---|
| facility location | 1.488 [Li11] | 1.463 [GK98, Sri02] |
| k-median | 3+ε* [AGK+01] | 1+2/e−ε [JMS02] |

• *local search: if switching p facilities cannot improve a solution, then the solution is a (3+2/p)-approximation
• Integrality gap of the natural LP relaxation is between 2 and 3
  o the proof of the upper bound 3 is non-constructive
Our Results
• An LP-rounding approach for k-median
  o prove 3.25 approximation ratio
  o thus give a constructive proof that the integrality gap is at most 3.25
  o faster running time compared to the local search algorithm
  o potential to improve the 3+ε approximation
    • the upper bound 3.25 is not tight
    • our algorithm may already give an approximation ratio smaller than 3
Our Results
| | prev. best approx. ratio | our approx. ratio |
|---|---|---|
| k-facility location | [Zha06] | 3.25 |
| matroid median | 16 [KKN+11] | 9 |
| knapsack median | ≥ 1000 [Kum12] | 34 |
• k-facility location: facility location problem with constraint that at most k facilities can be open
• matroid median: the set of open facilities must be an independent set of a given matroid
• knapsack median problem: each facility has a cost, and the total cost of open facilities cannot exceed a budget B
Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median
Natural LP Relaxation
• yi ∈ {0,1}, i ∈ F : whether facility i is open
• xi,j ∈ {0,1}, i ∈ F, j ∈ C : whether client j is connected to facility i
• Constraints:
  o Every client j must be connected to 1 facility
  o Client j can only be connected to an open facility
  o We can open at most k facilities
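The LP itself did not survive in this transcript. A standard way to write the natural relaxation that matches the three constraints above is sketched below; the exact layout on the original slide is an assumption, and the integrality conditions yi, xi,j ∈ {0,1} are relaxed to non-negativity.

```latex
\begin{align*}
\min\ & \sum_{i \in F} \sum_{j \in C} d(i,j)\, x_{i,j} \\
\text{s.t.}\ & \sum_{i \in F} x_{i,j} = 1 && \forall j \in C \\
& x_{i,j} \le y_i && \forall i \in F,\ j \in C \\
& \sum_{i \in F} y_i \le k \\
& x_{i,j},\ y_i \ge 0 && \forall i \in F,\ j \in C
\end{align*}
```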
Canonical Instance
• km facilities
• every client j is connected to its nearest m facilities
• in the LP solution, yi = 1/m, xi,j ∈ {0, 1/m}
[Figure: facilities and clients, with client j marked]
Canonical Instance
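• Fj: the set of m facilities that j is connected to
• average distance from j to Fj: dav(j) = (1/m) Σ_{i ∈ Fj} d(i, j)
• maximum distance from j to Fj: dmax(j) = max_{i ∈ Fj} d(i, j)
• LP value = Σ_{j ∈ C} dav(j)
[Figure: facilities and clients, with client j marked]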
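As a small illustration of these quantities, the sketch below computes Fj, dav(j) and dmax(j) from a distance table. The function names and the `dist[j][i]` layout are hypothetical choices for this example, not something taken from the talk.

```python
def canonical_quantities(dist, m):
    """dist[j][i] = d(i, j) for client j and facility i.

    Returns one (F_j, d_av, d_max) triple per client, where F_j holds the
    indices of the m facilities nearest to client j.
    """
    result = []
    for row in dist:
        nearest = sorted(range(len(row)), key=lambda i: row[i])[:m]
        d_av = sum(row[i] for i in nearest) / m
        d_max = max(row[i] for i in nearest)
        result.append((nearest, d_av, d_max))
    return result


def lp_value(dist, m):
    """LP value of the canonical solution: the sum of d_av(j) over all clients."""
    return sum(d_av for _, d_av, _ in canonical_quantities(dist, m))
```

For a single client with distances `[1.0, 3.0, 10.0, 2.0]` and m = 2, the two nearest facilities are 0 and 3, so dav = 1.5 and dmax = 2.0.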
Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median
Pseudo-Approximation
• An (α, c)-pseudo approximation is a solution that opens at most αk facilities and whose connection cost is at most c times the optimal cost
• A warm-up: (1 + ε, O(1/ε))-pseudo approximation for k-median
Pseudo-Approximation
• Let m' = m / (1+ε), y'i = (1+ε)yi = 1/m'
• Every client only needs to connect to m' facilities
• We fractionally open km · (1/m') = (1+ε)k facilities
• Define F'j, d'av(j), d'max(j) similarly
[Figure: facilities and clients, with client j marked]
Pseudo-Approximation
• Two clients j and j' conflict if F'j ∩ F'j' ≠ ∅
• Select a set C' of clients such that no two clients in C' conflict with each other
[Figure: facilities and clients, with clients j and j' marked]
Pseudo-Approximation
• greedily construct C' ⊆ C with no conflicts
  o while C ≠ ∅
    • select j ∈ C with the minimum dav(j)
    • add j to C'
    • remove j and all clients that conflict with j from C
[Figure: facilities and clients]
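The greedy loop above can be sketched as follows. This is an illustrative version only: the function name, the list-of-sets representation of F'j, and the tie-breaking by client index are assumptions made for the example.

```python
def select_nonconflicting(F_prime, d_av):
    """F_prime[j]: set of facilities of client j; d_av[j]: its average distance.

    Returns C' as a list, in the order the clients were selected.
    """
    remaining = set(range(len(F_prime)))
    chosen = []
    while remaining:
        # pick the remaining client with the smallest d_av (ties: smallest index)
        j = min(remaining, key=lambda c: (d_av[c], c))
        chosen.append(j)
        # drop j and every client whose facility set intersects F'_j
        remaining = {c for c in remaining
                     if c != j and not (F_prime[c] & F_prime[j])}
    return chosen
```

By construction no two selected clients share a facility, and every discarded client conflicts with a selected client of no larger dav.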
Pseudo-Approximation
• open facilities
  o For every j ∈ C', randomly open 1 of the m' facilities in F'j
  o For any facility i that is not inside ∪_{j ∈ C'} F'j, open i with probability 1/m'
• connect each client to its nearest open facility
[Figure: facilities and clients]
Fact: every facility is open with probability 1/m'
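A minimal sketch of this opening step is given below, assuming the selected clients have pairwise disjoint sets F'j. The function name, the argument layout and the optional `rng` parameter are hypothetical and only serve the illustration.

```python
import random


def round_pseudo(num_facilities, F_prime, chosen, m_prime, rng=random):
    """Open one uniformly random facility in F'_j for each chosen client j,
    and every facility outside those sets independently with prob. 1/m'."""
    opened = set()
    covered = set()
    for j in chosen:
        members = sorted(F_prime[j])
        covered.update(members)
        opened.add(rng.choice(members))  # exactly one facility of F'_j
    for i in range(num_facilities):
        if i not in covered and rng.random() < 1.0 / m_prime:
            opened.add(i)
    return opened
```

Since each F'j has m' facilities, a facility inside a selected set is also opened with probability 1/m', which is the fact stated above.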
Pseudo-Approximation
Lemma: E[Cj] ≤ O(1/ε)·dav(j), where Cj is the connection cost of j
Proof: Enough to assume j ∉ C'
• ∃ j' ∈ C' s.t.
  o F'j ∩ F'j' ≠ ∅ and d'av(j') ≤ d'av(j)
• E[Cj] ≤ E[Cj'] + d(j, j')
        ≤ E[Cj'] + d'max(j) + d'max(j')
        ≤ d'av(j') + (1/ε)d'av(j) + (1/ε)d'av(j')
        ≤ (1+2/ε)d'av(j) ≤ (1+2/ε)dav(j)
[Figure: clients j and j' with facility sets F'j and F'j']
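One way to justify the O(1/ε) factor directly from the definitions is sketched below. It is not necessarily the argument used in the talk and its constants differ slightly from the slide. It assumes the canonical instance, that m' is an integer, that F'j consists of the m' facilities of Fj nearest to j, and that j was removed by a client j' ∈ C' with dav(j') ≤ dav(j). The first line uses that the m − m' facilities of Fj outside F'j are each at distance at least d'max(j) from j.

```latex
\begin{align*}
m \, d_{av}(j) &\ge (m - m')\, d'_{max}(j) = \tfrac{\varepsilon}{1+\varepsilon}\, m \, d'_{max}(j)
  \;\Longrightarrow\; d'_{max}(j) \le \left(1 + \tfrac{1}{\varepsilon}\right) d_{av}(j), \\
E[C_j] &\le E[C_{j'}] + d'_{max}(j) + d'_{max}(j') \\
       &\le d_{av}(j') + \left(1 + \tfrac{1}{\varepsilon}\right)\bigl(d_{av}(j) + d_{av}(j')\bigr) \\
       &\le \left(3 + \tfrac{2}{\varepsilon}\right) d_{av}(j) = O(1/\varepsilon)\, d_{av}(j).
\end{align*}
```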
Outline
• Introduction
• Linear Programming Relaxation
• Simple Pseudo-Approx. for k-median
• Our Algorithm for k-median
Barrier to Obtain True Approximation
• If ε=0, then F'j=Fj
• dmax(j) >> dav(j)
• With non-zero prob., j will be connected to facilities in Fj'
• The expected connection cost of j is unbounded compared to dav(j)
[Figure: clients j and j' with facility sets Fj and Fj']
Remove the Barrier
• Solution: j only “claims” close facilities in Fj
• Let Uj be the set of claimed facilities
• Use Uj to replace Fj in the algorithm
• New Barrier: |Uj| < m might happen
  o cannot always guarantee an open facility in Uj
[Figure: client j with Uj inside Fj]
Remove the New Barrier
• can guarantee |Uj| ≥ m/2
• |Uj ∪ Uj'| ≥ m if Uj and Uj' are disjoint
• pair the clients in C'
• always open 1 facility (possibly 2 facilities) in Uj ∪ Uj' for a matched pair (j, j')
[Figure: matched clients j and j' with claimed sets Uj and Uj']
Remove the New Barrier
• How to open facilities for a matched pair?
• m boxes in a line
• Permute the facilities in Uj and put them in the leftmost |Uj| boxes
• Permute the facilities in Uj' and put them in the rightmost |Uj'| boxes
• Open the facilities in a randomly selected box
[Figure: a line of m boxes; Uj fills boxes from the left, Uj' from the right]
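The box procedure can be sketched as below. The sketch reads "permute" as a uniformly random permutation, which is an assumption, and it assumes Uj and Uj' are disjoint, each of size at most m, with total size at least m. The function and parameter names are hypothetical.

```python
import random


def open_for_pair(U_j, U_jp, m, rng=random):
    """Place U_j in the leftmost boxes and U_j' in the rightmost boxes of a
    line of m boxes, then open everything in one uniformly random box."""
    left = list(U_j)
    right = list(U_jp)
    rng.shuffle(left)
    rng.shuffle(right)
    boxes = [[] for _ in range(m)]
    for pos, i in enumerate(left):    # leftmost |U_j| boxes
        boxes[pos].append(i)
    for pos, i in enumerate(right):   # rightmost |U_j'| boxes
        boxes[m - 1 - pos].append(i)
    return boxes[rng.randrange(m)]
```

Under these assumptions every box holds one or two facilities, so one or two facilities are opened, at most one from each of the two sets, and each facility is opened with probability 1/m.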
The Algorithm
• Filtering
  o 2 clients j and j' conflict if d(j, j') ≤ 4max{dav(j), dav(j')}
  o while C ≠ ∅
    • select j ∈ C that minimizes dav(j);
    • add j to C'
    • remove j and all clients that conflict with j from C
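A minimal sketch of the filtering phase with the distance-based conflict rule follows. The names, the client-to-client distance matrix `d`, and the tie-breaking by index are assumptions made for the example.

```python
def filter_clients(d, d_av):
    """d[a][b]: distance between clients a and b; d_av[j]: LP average distance.

    Returns C' as a list, in the order the clients were selected.
    """
    remaining = set(range(len(d_av)))
    chosen = []
    while remaining:
        j = min(remaining, key=lambda c: (d_av[c], c))
        chosen.append(j)
        # keep only the clients that do not conflict with j
        remaining = {c for c in remaining
                     if c != j and d[j][c] > 4 * max(d_av[j], d_av[c])}
    return chosen
```

For clients at positions 0, 1, 10, 30 on a line with dav values 1.0, 0.5, 2.0, 1.0, client 1 is selected first and removes client 0, after which clients 3 and 2 are selected in that order.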
The Algorithm
• Filtering
• Claiming
  o For any j ∈ C', let 2Rj be the distance between j and its nearest neighbor in C'
  o A facility i is claimed by j if
    • i ∈ Fj and
    • d(i, j) ≤ Rj
    i.e., Uj = Fj ∩ Ball(j, Rj)
Fact: any client j ∈ C' will claim at least m/2 and at most m facilities.
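The claiming rule can be sketched as follows, assuming C' contains at least two clients so that a nearest neighbor exists. The function name and the two distance tables (`d_cc` between clients, `d_cf[j][i]` from client j to facility i) are hypothetical.

```python
def claim(j, chosen, d_cc, F_j, d_cf):
    """Return (R_j, U_j) for a client j in C'.

    2 * R_j is the distance from j to its nearest neighbour in C', and
    U_j keeps the facilities of F_j that lie within distance R_j of j.
    """
    R_j = min(d_cc[j][c] for c in chosen if c != j) / 2
    U_j = {i for i in F_j if d_cf[j][i] <= R_j}
    return R_j, U_j
```

Because non-conflicting clients are more than 4·dav(j) apart, Rj exceeds 2·dav(j), so by an averaging argument fewer than half of the facilities in Fj can lie beyond Rj, which is consistent with the fact above.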
The Algorithm
• Filtering
• Claiming
• Matching
  o while there are at least 2 unmatched clients in C'
    • select 2 unmatched clients j and j' that minimize d(j, j')
    • match j and j'
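The greedy matching step might look like the sketch below. It rescans all unmatched pairs in every round, which is simple rather than efficient; the function name and return shape are assumptions made for the example.

```python
from itertools import combinations


def greedy_match(chosen, d):
    """Repeatedly match the closest pair of unmatched clients in C'.

    Returns the list of matched pairs and the set of leftover clients
    (empty, or a single client when |C'| is odd).
    """
    unmatched = set(chosen)
    pairs = []
    while len(unmatched) >= 2:
        a, b = min(combinations(sorted(unmatched), 2),
                   key=lambda p: d[p[0]][p[1]])
        pairs.append((a, b))
        unmatched -= {a, b}
    return pairs, unmatched
```

For clients at positions 0, 1, 10, 12, 30 on a line, the pairs (0, 1) and (2, 3) are matched in that order and client 4 stays unmatched.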
The Algorithm
• Filtering
• Claiming
• Matching
• Rounding
  o For each matched pair (j, j'), open 1 or 2 facilities in Uj ∪ Uj'
  o If there is an unmatched client j, open 0 or 1 facility in Uj
  o For each facility i that is not inside any Uj, open i with probability 1/m
  o Connect each client to its nearest open facility
Proof of Constant Approx. Ratio
Lemma: E[Cj] ≤ O(1)·dav(j), where Cj is the connection cost of j
Proof
• it is enough to assume j ∈ C'
  o Assume j ∉ C'; then there exists a client j' ∈ C' such that d(j', j) ≤ 4dav(j) and dav(j') ≤ dav(j)
  o Assume E[Cj'] ≤ α·dav(j')
  o E[Cj] ≤ d(j, j') + E[Cj'] ≤ 4dav(j) + α·dav(j') ≤ (4+α)dav(j)
• W.L.O.G., assume dav(j) = 1
Proof of Constant Approx. Ratio
[Figure: clients j, j1, j2 with claimed sets Uj, Uj1, Uj2 and radii Rj, Rj1, Rj2; j1 is the nearest neighbor of j in C', j2 is matched with j1; distances labeled 2Rj and 2Rj1 ≤ 2Rj]
• There is always 1 facility open in Uj1 ∪ Uj2
• Any facility in Uj1 ∪ Uj2 is at most 2Rj + 2Rj1 + Rj2 ≤ 5Rj away from j
• |Uj| ≥ m(1 − 1/Rj)
• with prob. at least 1 − 1/Rj, connect to a random facility in Uj
• only with prob. at most 1/Rj, connect to a facility that is at most 5Rj away
• E[Cj] ≤ 5 ∎
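The arithmetic behind the final bound can be filled in as follows. This is a sketch rather than the talk's own derivation. It assumes dav(j) = 1, that each facility of Uj is opened with probability 1/m with at most one of them open at a time, and that otherwise j falls back to a facility within 5Rj. Write p = 1 − |Uj|/m for the probability that no facility in Uj opens.

```latex
\begin{align*}
p\,R_j &\le 1
  && \text{(the } pm \text{ facilities in } F_j \setminus U_j \text{ are each farther than } R_j\text{)} \\
\frac{1}{m}\sum_{i \in U_j} d(i,j) &\le 1 - p\,R_j
  && \text{(since } \textstyle\sum_{i \in F_j} d(i,j) = m\text{)} \\
E[C_j] &\le \frac{1}{m}\sum_{i \in U_j} d(i,j) + p \cdot 5R_j \le 1 + 4\,p\,R_j \le 5
\end{align*}
```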
Proof of 3.25 approx. ratio
• complicated, details omitted
• rough idea: for a client j ∉ C'
  o j1 ∈ C' is the client that conflicts with and removes j in the filtering phase
  o j2 ∈ C' is the nearest neighbor of j1 in C'
  o j3 ∈ C' is the client matched with j2
  o Consider the nearest open facility of j in Fj ∪ Fj1 ∪ Uj2 ∪ Uj3
• Our algorithm opens k facilities in expectation
• Can be easily transformed so that it always opens k facilities
• Algorithm naturally extends to k-FL problem
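The statement about the expected number of open facilities follows from linearity of expectation, assuming the canonical instance with km facilities and that every facility is opened with marginal probability 1/m, as in the box procedure:

```latex
\[
E\bigl[\#\text{open facilities}\bigr] = \sum_{i \in F} \Pr[i \text{ is open}] = km \cdot \frac{1}{m} = k
\]
```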
Ongoing Work
• Joint work with Svensson, improved the best approximation ratio (3+ε) for k-median
Summary
• We introduced an LP-rounding algorithm for the k-median problem
  o proved 3.25 approximation ratio for the problem
  o it has potential to improve the decade-old 3+ε approximation
• Improved approximation algorithms for the following problems
  o k-facility location problem: 3.25
  o Matroid median problem: 9
  o Knapsack median problem: 34
Thanks