Approximation algorithms for sequential testing of Boolean functions Lisa Hellerstein Polytechnic Institute of NYU Joint work with Devorah Kletenik (Polytechnic Institute of NYU) and Amol Deshpande (U. of Maryland)


Post on 30-Dec-2015



Reference

Amol Deshpande, Lisa Hellerstein, Devorah Kletenik. Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover. arxiv.org

Evaluating a Boolean Function on given input

e.g. f(x1,x2,x3) = x1 ∨ (x2 ∧ x3)

What is f(0,0,1)?

Evaluating a Boolean Function on given input

e.g. f(0,0,1) = 0 ∨ (0 ∧ 1)

What is f(0,0,1)? Answer: 0

Evaluating a Boolean Function on unknown input

e.g. f(x1,x2,x3) = x1 ∨ (x2 ∧ x3)

f(?,?,?) =

Evaluating a Boolean Function on unknown input

e.g. f(x1,x2,x3) = x1 ∨ (x2 ∧ x3)

f(?,?,?) =

$3 to get x1

$2 to get x2

$6 to get x3

Evaluating a Boolean Function on unknown input

e.g. f(x1,x2,x3) = 0 ∨ (x2 ∧ x3)

f(0,?,?) =

$3 to get x1 (charged: $3)

$2 to get x2

$6 to get x3

Evaluating a Boolean Function on unknown input

e.g. f(x1,x2,x3) = 0 ∨ (x2 ∧ 1)

f(0,?,1) =

$3 to get x1 (charged: $3)

$2 to get x2

$6 to get x3 (charged: $6)

Evaluating a Boolean Function

e.g. f(x1,x2,x3) = 0 ∨ (1 ∧ 1)

f(0,1,1) = 1

$3 to get x1 (charged: $3)

$2 to get x2 (charged: $2)

$6 to get x3 (charged: $6)

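[Figure: decision tree showing the testing strategy used for x1 ∨ (x2 ∧ x3). Internal nodes are the variables x1, x2, x3, edges are labeled with test outcomes (=0, =1), and leaves give the value of f (0 or 1).]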

Stochastic Version

e.g. f(x1,x2, x3) = x1 ∨ ( x2 ∧ x3 )

f(?,?,?) =

$3 to get x1, Prob[x1 = 1] = 1/300

$2 to get x2, Prob[x2 = 1] = 8/9

$6 to get x3, Prob[x3 = 1] = 9/10

Assume the xi are independent

Sequential Testing Problem

Given

• Representation of a Boolean function f(x1,…,xn)

• Costs c1,…,cn (ci > 0)

• Probabilities p1,…,pn (0 < pi < 1), where pi = Prob[xi = 1]; assume independence

Task: Find a strategy for evaluating f on an unknown input that has minimum expected cost
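For a function as small as the running example, the minimum expected cost can be found by brute force. The sketch below is only an illustration and not a method from the talk: it reuses the costs and probabilities of the example f = x1 ∨ (x2 ∧ x3) and runs a dynamic program over partial assignments. The helper names (`completions`, `determined`, `min_expected_cost`) are made up for this example, and the approach is exponential in n.

```python
from functools import lru_cache

COSTS = (3, 2, 6)                 # cost of testing x1, x2, x3 (from the example)
PROBS = (1 / 300, 8 / 9, 9 / 10)  # Prob[xi = 1] (from the example)

def f(x):
    """f(x1, x2, x3) = x1 OR (x2 AND x3)."""
    return x[0] | (x[1] & x[2])

def completions(b):
    """Yield every full assignment consistent with b (None means untested)."""
    if None not in b:
        yield b
        return
    i = b.index(None)
    for v in (0, 1):
        yield from completions(b[:i] + (v,) + b[i + 1:])

def determined(b):
    """True if f takes the same value on every completion of b."""
    return len({f(x) for x in completions(b)}) == 1

@lru_cache(maxsize=None)
def min_expected_cost(b):
    """Minimum expected cost of finishing the evaluation from partial assignment b."""
    if determined(b):
        return 0.0
    best = float("inf")
    for i, v in enumerate(b):
        if v is not None:
            continue  # already tested
        if_one = min_expected_cost(b[:i] + (1,) + b[i + 1:])
        if_zero = min_expected_cost(b[:i] + (0,) + b[i + 1:])
        cost = COSTS[i] + PROBS[i] * if_one + (1 - PROBS[i]) * if_zero
        best = min(best, cost)
    return best

optimal = min_expected_cost((None, None, None))
print(round(optimal, 4))
```

Because the recursion visits all 3^n partial assignments, it is useful only as a reference point for checking approximation algorithms on tiny instances.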

Previous Results on Sequential Testing

• OR function

– Optimal strategy is to test xi in decreasing order of pi/ci

• Read-once DNF formula

– Poly-time algorithm [Boros & Unluyurt, Greiner et al.]

• k-of-n function (unweighted linear threshold)

– Poly-time algorithm [Ben-Dov, Salloum & Breuer, ...]

– Extension to double-regular functions [Boros & Unluyurt]

• CDNF formula (generalization of decision tree)

– Approximation algorithm, but only for monotone formulas with ci = 1, pi = 1/2 [Kaplan, Kushilevitz, Mansour]

• Linear Threshold Formula

– NP-hard [Cox]

– Easy for ci = 1, pi = 1/2 [Fiat & Pechyony, Boros & Unluyurt]
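The OR result in the list above is easy to check numerically. The following sketch assumes an OR of three variables and, purely for illustration, borrows the costs and probabilities of the earlier example. It compares the pi/ci ordering against every fixed test order. For OR, a fixed order is fully general, since testing continues only while every outcome so far is 0. The function names are hypothetical.

```python
from itertools import permutations

def expected_cost_or(order, costs, probs):
    """Expected cost of testing in `order`, stopping at the first variable equal to 1."""
    total, prob_all_zero = 0.0, 1.0
    for i in order:
        total += prob_all_zero * costs[i]  # test i is paid for only if all earlier tests gave 0
        prob_all_zero *= 1.0 - probs[i]
    return total

def ratio_order(costs, probs):
    """Indices sorted by decreasing p_i / c_i."""
    return sorted(range(len(costs)), key=lambda i: probs[i] / costs[i], reverse=True)

costs = [3, 2, 6]
probs = [1 / 300, 8 / 9, 9 / 10]

greedy = ratio_order(costs, probs)
greedy_cost = expected_cost_or(greedy, costs, probs)
best_cost = min(expected_cost_or(p, costs, probs) for p in permutations(range(3)))
print(greedy, greedy_cost, best_cost)
```

On these numbers the ratio order tests x2, then x3, then x1, and its expected cost matches the best of the six possible orders.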

Our main results

• O(log kd)-approximation algorithm for evaluating CDNF formula

– k is number of terms, d is number of clauses

– same factor as Kaplan et al., but works for non-monotone formulas and arbitrary ci and pi

– based on Adaptive Greedy algorithm of Golovin and Krause for Stochastic Submodular Set Cover

• 3-approximation algorithm for evaluating linear threshold formulas

– based on Dual Adaptive Greedy

• Dual Adaptive Greedy

– new algorithm for Stochastic Submodular Set Cover

Stochastic Submodular Set Cover

• Given

– Utility function g: {0,1,*}^n → ℤ≥0, e.g. g(*,1,0,*) = 15, where ∃ Q ∈ ℤ≥0 s.t. for all x ∈ {0,1}^n, g(x) = Q, and g is monotone and submodular with g(*,…,*) = 0. Q is called the goal utility

– Costs c1,…,cn (where ci > 0); ci is the cost of testing xi

– Probabilities p1,…,pn (0 ≤ pi ≤ 1), where pi = Prob[xi = 1]; assume independence

(Golovin and Krause, COLT 2010)

• Perform sequential tests on variables xi

• Current knowledge of xi values is represented by partial assignment b

e.g. (*,1,0,*)

Task: Find a strategy for achieving g(b) = Q that has minimum expected cost

g: {0,1,*}^n → ℤ≥0

• Def: g is monotone

– Given partial assignments b and c, if c extends b then g(c) ≥ g(b)

– "c extends b" means c is produced from b by changing some xi from * to 0 or 1

– Additional knowledge can only increase g, e.g. g(0,*,1) ≥ g(*,*,1)

• Def: g is submodular

– If c extends b and ci = bi = *, then g(c_{xi←1}) - g(c) ≤ g(b_{xi←1}) - g(b), and the same for xi←0

– Additional knowledge is less valuable (or equally valuable) the more you know already, e.g. g(0,*,1) - g(*,*,1) ≤ g(0,*,*) - g(*,*,*)
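Both definitions can be checked exhaustively when n is small. The sketch below is an illustration under assumptions: partial assignments are tuples over 0, 1 and "*", and the two utility functions (`count_known` and `count_known_squared`) are toy examples invented here, not functions from the talk. The first is monotone and submodular; the second is monotone but violates submodularity.

```python
from itertools import product

STAR = "*"

def extends(c, b):
    """True if c agrees with b wherever b is known (c may know more)."""
    return all(bi == STAR or bi == ci for bi, ci in zip(b, c))

def set_var(b, i, v):
    """Copy of b with entry i set to v."""
    return b[:i] + (v,) + b[i + 1:]

def is_monotone(g, n):
    parts = list(product((0, 1, STAR), repeat=n))
    return all(g(c) >= g(b) for b in parts for c in parts if extends(c, b))

def is_submodular(g, n):
    parts = list(product((0, 1, STAR), repeat=n))
    for b in parts:
        for c in parts:
            if not extends(c, b):
                continue
            for i in range(n):
                if c[i] != STAR:
                    continue  # need c_i = b_i = *; c_i = * already forces b_i = *
                for v in (0, 1):
                    if g(set_var(c, i, v)) - g(c) > g(set_var(b, i, v)) - g(b):
                        return False
    return True

def count_known(b):
    """Toy utility: number of variables tested so far."""
    return sum(v != STAR for v in b)

def count_known_squared(b):
    """Toy utility whose marginal gains grow, so it is not submodular."""
    return count_known(b) ** 2

print(is_monotone(count_known, 3), is_submodular(count_known, 3))
print(is_monotone(count_known_squared, 3), is_submodular(count_known_squared, 3))
```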

Adaptive Greedy Algorithm for SSSC

• Start with b = (*,...,*)

• While g(b) < Q

– test the xi with the largest expected increase in utility per unit cost, E[∆g]/ci

– update b with the value of xi

Thm [Golovin, Krause]: Expected-cost(Adaptive-Greedy) ≤ Expected-cost(OPT) · (ln(Q) + 1)
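The loop above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The utility `g_or` is an assumed example for an OR of n variables with Q = n (reaching Q as soon as a 1 is seen or all variables are 0), and the costs and probabilities are borrowed from the earlier example.

```python
STAR = "*"

def set_var(b, i, v):
    """Copy of b with entry i set to v."""
    return b[:i] + (v,) + b[i + 1:]

def adaptive_greedy(g, Q, costs, probs, x):
    """Run Adaptive Greedy on hidden input x; return (order of tests, total cost)."""
    b = (STAR,) * len(x)
    order, total = [], 0.0
    while g(b) < Q:
        def score(i):
            # expected increase in utility from testing x_i, per unit cost
            gain = (probs[i] * (g(set_var(b, i, 1)) - g(b))
                    + (1 - probs[i]) * (g(set_var(b, i, 0)) - g(b)))
            return gain / costs[i]
        untested = [i for i in range(len(x)) if b[i] == STAR]
        i = max(untested, key=score)
        b = set_var(b, i, x[i])  # perform the test: reveal the true value of x_i
        order.append(i)
        total += costs[i]
    return order, total

def g_or(b):
    """Assumed utility for OR: n if some entry is 1, else the number of known 0s."""
    return len(b) if 1 in b else sum(v == 0 for v in b)

costs = [3, 2, 6]
probs = [1 / 300, 8 / 9, 9 / 10]
print(adaptive_greedy(g_or, 3, costs, probs, (0, 0, 1)))
print(adaptive_greedy(g_or, 3, costs, probs, (0, 1, 0)))
```

The choice of the next test depends on the outcomes seen so far, which is what makes the strategy adaptive.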

Apply Adaptive Greedy to Sequential Testing of Boolean Function f

• Idea: Convert problem of evaluating Boolean function f into submodular set cover problem

• Construct utility function g such that g(b) = Q iff value of f is determined by b

• Adaptive Greedy is O(log Q)-approximation

• Challenges

– g must be submodular, monotone

– Don’t want Q to be too big

Utility function g for CDNF

CDNF formula f, given as both a DNF and a CNF:

DNF: x1x2 ∨ x2x3 ∨ x3x4

CNF: (x1 ∨ x3)(x2 ∨ x4)(x2 ∨ x3)

Intuition: We don’t know the value of f yet if there is a term that could still be satisfied and a clause that could still be falsified, i.e., if there is a (live term, live clause) pair

e.g. for b = (1,0,*,*), one such pair is ( x3x4 , (x2 ∨ x4) )

Let Q = total #(term, clause) pairs; here Q = 3*3 = 9

For b ∈ {0,1,*}^n, let g(b) = Q - [#(live term, live clause) pairs for b]

g(1,0,*,*) = 9 - 1*2 = 7
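The utility for the example formula can be computed directly from the definition. The sketch below assumes a monotone formula, encodes each term and clause as a set of 0-based variable indices, and uses "*" for untested variables. These encoding choices are assumptions made here for illustration.

```python
STAR = "*"

# Example formula from the slide (monotone; variables indexed from 0)
TERMS = [{0, 1}, {1, 2}, {2, 3}]    # x1x2 v x2x3 v x3x4
CLAUSES = [{0, 2}, {1, 3}, {1, 2}]  # (x1 v x3)(x2 v x4)(x2 v x3)
Q = len(TERMS) * len(CLAUSES)       # total number of (term, clause) pairs

def live_terms(b):
    """Terms with no variable known to be 0, so they could still be satisfied."""
    return [t for t in TERMS if all(b[i] != 0 for i in t)]

def live_clauses(b):
    """Clauses with no variable known to be 1, so they could still be falsified."""
    return [c for c in CLAUSES if all(b[i] != 1 for i in c)]

def g(b):
    """Q minus the number of (live term, live clause) pairs."""
    return Q - len(live_terms(b)) * len(live_clauses(b))

print(g((1, 0, STAR, STAR)))  # 7: one live term, two live clauses
```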

Utility function g for CDNF (continued)

g(b) = Q iff the value of f is determined by b

g(*,…,*) = 0, g is monotone, g is submodular

Thm: There is an O(log kd)-approximation algorithm for evaluating a CDNF formula, where k is the number of terms and d is the number of clauses.

Pf: Use Adaptive Greedy with the above utility function g. Since Q = kd, its expected cost is within a factor of O(log kd) of optimal.
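The first property, g(b) = Q exactly when b determines f, can be confirmed exhaustively for the four-variable example formula x1x2 ∨ x2x3 ∨ x3x4. This is a sanity check under the same assumed encoding (index sets for terms and clauses, "*" for untested variables), not a proof.

```python
from itertools import product

STAR = "*"
TERMS = [{0, 1}, {1, 2}, {2, 3}]
CLAUSES = [{0, 2}, {1, 3}, {1, 2}]
Q = len(TERMS) * len(CLAUSES)

def f(x):
    """Evaluate the DNF on a full assignment."""
    return int(any(all(x[i] == 1 for i in t) for t in TERMS))

def g(b):
    live_t = sum(all(b[i] != 0 for i in t) for t in TERMS)
    live_c = sum(all(b[i] != 1 for i in c) for c in CLAUSES)
    return Q - live_t * live_c

def determined(b):
    """True if f has the same value on every completion of b."""
    free = [i for i, v in enumerate(b) if v == STAR]
    values = set()
    for bits in product((0, 1), repeat=len(free)):
        x = list(b)
        for i, v in zip(free, bits):
            x[i] = v
        values.add(f(x))
    return len(values) == 1

partials = list(product((0, 1, STAR), repeat=4))
mismatches = [b for b in partials if (g(b) == Q) != determined(b)]
print(len(partials), len(mismatches))
```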

What about evaluating Boolean linear threshold formulas?

e.g. Given 2x1 + x2 - x3 ≥ 0, costs ci, and probabilities pi, we want a testing strategy for the xi that determines whether the inequality is satisfied
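To make "determine if the inequality is satisfied" concrete: a partial assignment settles the answer once the smallest possible value of the left-hand side is already ≥ 0, or the largest possible value is still < 0. The sketch below illustrates this stopping test for the example inequality. The function names and the "*" encoding are assumptions, and this is not the utility function used in the talk.

```python
STAR = "*"
WEIGHTS = (2, 1, -1)  # coefficients of 2*x1 + x2 - x3
THRESHOLD = 0

def value_range(b):
    """Smallest and largest value of the weighted sum over all completions of b."""
    lo = hi = 0
    for w, v in zip(WEIGHTS, b):
        if v == STAR:
            lo += min(w, 0)  # an untested variable can contribute 0 or w
            hi += max(w, 0)
        else:
            lo += w * v
            hi += w * v
    return lo, hi

def status(b):
    """1 if the inequality surely holds, 0 if it surely fails, None if undetermined."""
    lo, hi = value_range(b)
    if lo >= THRESHOLD:
        return 1
    if hi < THRESHOLD:
        return 0
    return None

print(status((STAR, STAR, STAR)), status((1, STAR, STAR)), status((0, 0, 1)))
```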

Idea: Find good utility function g with goal value Q. Gives O(log Q)-approximation.

Problem: There are threshold functions s.t. the goal value of any good utility function is 2^O(n).

Need new approach!

• Recall algorithms for set cover

– Standard Greedy Algorithm: (ln m + 1)-approximation for set cover

– Dual Greedy (Hochbaum): 2-approximation for vertex cover

• Is there an adaptive dual greedy algorithm for Stochastic Submodular Set Cover?

– There is now…
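As a reminder of how a dual greedy scheme works in the simplest setting, here is a sketch for weighted vertex cover. It follows the standard textbook primal-dual idea (raise the dual variable of an uncovered edge until an endpoint's constraint becomes tight, then take the tight vertices) and is not necessarily the exact formulation the slide cites. Integer weights are assumed so that tightness can be tested with `== 0`, and the small graph is made up for the example.

```python
from itertools import combinations

def dual_greedy_vertex_cover(weights, edges):
    """Raise each uncovered edge's dual until an endpoint is tight; return tight vertices."""
    slack = dict(weights)  # remaining slack of each vertex constraint
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # edge already covered
        delta = min(slack[u], slack[v])  # largest feasible increase of the edge's dual
        slack[u] -= delta
        slack[v] -= delta
        cover.update(w for w in (u, v) if slack[w] == 0)
    return cover

def optimal_cover_weight(weights, edges):
    """Brute-force optimum, for comparison on tiny graphs only."""
    vertices = list(weights)
    best = sum(weights.values())
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            if all(u in s or v in s for u, v in edges):
                best = min(best, sum(weights[w] for w in s))
    return best

weights = {"a": 3, "b": 2, "c": 6, "d": 1}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
cover = dual_greedy_vertex_cover(weights, edges)
cover_weight = sum(weights[w] for w in cover)
print(sorted(cover), cover_weight, optimal_cover_weight(weights, edges))
```

Each unit of dual value pays for at most two vertices, which is where the factor of 2 comes from.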

Adaptive Dual Greedy

• Generalization of Dual Greedy algorithm of Fujito for Submodular Set Cover

• Submodular Set Cover

– Special case of Stochastic Submodular Set Cover where pi = 0 or 1 for all i

• Fujito’s algorithm based on dual of Wolsey’s LP for submodular set cover

• Constraints in this dual LP have the form

$$\sum_{S \subseteq N} g_S(i)\, y_S \le c_i \quad \forall i \in N$$

where g_S(i) = increase in utility you get by testing xi, assuming you have already tested the items in S.

• The algorithm works by setting the variables y_S “greedily” to make constraints tight.

• OUR PROBLEM: In SSSC, the increase in utility depends on test outcomes. Until we do the tests, we don’t know the outcomes!

• Our solution:

– At every step in the algorithm, only consider coefficients g_S(i) of those y_S for which the variables in S have been tested already

– For each i, set g_S(i) to be the expected increase in utility from testing xi (if you’ve already tested xi, pretend you haven’t here)

– No objective function

Thm: Adaptive Dual Greedy is an α-approximation algorithm for the SSSC problem, where

$$\alpha = \max_{x,\,S} \frac{\sum_{j \in C(x)} g_{S,x}(j)}{Q - g(S,x)}$$

and the max is taken over all x in {0,1}^n and all proper prefixes S of the sequence of items tested while running the algorithm with test outcomes determined by x.

Theorem: There is a 3-approximation algorithm for sequential testing of linear threshold formulas.

Proof Sketch: Construct a good utility function g corresponding to the given linear threshold function, so that solving SSSC for g is equivalent to evaluating the function. Apply Dual Adaptive Greedy. Show that α ≤ 3.

Conclusion

• Approach to designing approximation algorithms for sequential testing of Boolean functions

• Adaptive Dual Greedy algorithm for SSSC problem

• Many open problems…