The Average Sensitivity of an Intersection of Halfspaces
Daniel Kane
Department of Mathematics, Stanford University
aladkeenin@gmail.com
June 2nd, 2014
D. Kane (Stanford) Intersections of Halfspaces June 2014 1 / 20
Average Sensitivity
Definition
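Given a Boolean function $f : \{\pm 1\}^n \to \{0,1\}$, we define the average sensitivity of $f$ to be
\[ \mathrm{AS}(f) = \mathbb{E}_{x \sim_u \{\pm 1\}^n}\left[\#\{i : f(x) \neq f(x^i)\}\right], \]
where $x^i$ is $x$ with the sign of the $i$th coordinate flipped. Define the noise sensitivity with parameter $\varepsilon$ to be
\[ \mathrm{NS}_\varepsilon(f) = \Pr(f(X) \neq f(Y)), \]
where $X$ is uniform on $\{\pm 1\}^n$ and $Y$ differs from $X$ on each coordinate independently with probability $\varepsilon$.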
Measure of complexity of Boolean function.
Applications to learning theory.
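Both quantities can be computed exactly by enumeration when n is small. The following is a minimal sketch, not part of the talk; the function names and the two example functions (`dictator`, `majority3`) are chosen here for illustration.

```python
from itertools import product


def average_sensitivity(f, n):
    """Exact AS(f): expected number of coordinates whose flip changes f(x)."""
    total = 0
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]  # x with coordinate i flipped
            total += f(x) != f(y)
    return total / 2 ** n


def noise_sensitivity(f, n, eps):
    """Exact NS_eps(f): X uniform, Y flips each coordinate of X w.p. eps."""
    total = 0.0
    for x in product((-1, 1), repeat=n):
        for flips in product((0, 1), repeat=n):
            y = tuple(-xi if b else xi for xi, b in zip(x, flips))
            if f(x) != f(y):
                weight = 1.0
                for b in flips:
                    weight *= eps if b else 1 - eps
                total += weight
    return total / 2 ** n


def dictator(x):
    """Indicator of the halfspace x_1 > 0."""
    return int(x[0] == 1)


def majority3(x):
    """Indicator of the halfspace x_1 + x_2 + x_3 > 0."""
    return int(sum(x) > 0)
```

For the dictator function a flip of the first coordinate always changes the value and no other flip does, so its average sensitivity is 1 and its noise sensitivity is exactly the flip probability.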
Sensitivity of Algebraically Defined Functions
There has been significant work in recent years on bounding the noise sensitivity of simple classes of algebraically defined functions. For example:
Indicator functions of halfspaces
Bounded degree polynomial threshold functions
Indicator functions of the intersection of a bounded number of halfspaces
Sensitivity of Halfspaces
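For $f : \{\pm 1\}^n \to \{0,1\}$ the indicator function of a halfspace:
Gotsman-Linial 1994: $\mathrm{AS}(f) \le 2^{1-n}\binom{n}{\lfloor n/2 \rfloor}(n - \lfloor n/2 \rfloor) = O(\sqrt{n})$
Benjamini-Kalai-Schramm 2001: $\mathrm{NS}_\varepsilon(f) = O(\varepsilon^{1/4})$
Peres 2004: $\mathrm{NS}_\varepsilon(f) = O(\varepsilon^{1/2})$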
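The Gotsman-Linial bound can be checked numerically for small n. The sketch below is an illustration rather than anything from the talk: it evaluates the closed form, compares it with the brute-force average sensitivity of the majority function (which attains the bound), and provides a helper for random halfspaces. All function names are chosen here.

```python
from itertools import product
from math import comb


def gotsman_linial_bound(n):
    """Closed form 2^(1-n) * C(n, floor(n/2)) * (n - floor(n/2))."""
    half = n // 2
    return 2 ** (1 - n) * comb(n, half) * (n - half)


def average_sensitivity(f, n):
    """Exact AS(f) by enumerating all points and all single-coordinate flips."""
    total = 0
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]
            total += f(x) != f(y)
    return total / 2 ** n


def majority(x):
    """Indicator of the halfspace sum_i x_i > 0."""
    return int(sum(x) > 0)


def halfspace(weights, theta):
    """Indicator of the halfspace sum_i w_i x_i > theta."""
    def f(x):
        return int(sum(w * xi for w, xi in zip(weights, x)) > theta)
    return f
```

For example, `gotsman_linial_bound(3)` and `average_sensitivity(majority, 3)` both evaluate to 1.5.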
Sensitivity of Threshold Functions
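For $f : \{\pm 1\}^n \to \{0,1\}$ a degree-$d$ polynomial threshold function:
Conjecture (Gotsman-Linial 1994): $\mathrm{AS}(f) = O(d\sqrt{n})$
Diakonikolas-Harsha-Klivans-Meka-Raghavendra-Servedio-Tan 2010: $\mathrm{AS}(f) \le 2^{O(d)} n^{1-1/(4d+6)}$, $\mathrm{NS}_\varepsilon(f) \le 2^{O(d)} \varepsilon^{1/(4d+6)}$
K. 2010: Gaussian analogue of Gotsman-Linial proved.
K. 2013: $\mathrm{AS}(f) \le \sqrt{n}\,\log^{O(d\log(d))}(n)\,2^{O(d^2\log(d))}$, $\mathrm{NS}_\varepsilon(f) \le \sqrt{\varepsilon}\,\log^{O(d\log(d))}(1/\varepsilon)\,2^{O(d^2\log(d))}$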
Sensitivity of Intersections of Halfspaces
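Let $f : \{\pm 1\}^n \to \{0,1\}$ be the indicator function of an intersection of at most $k$ halfspaces.
Recall that for $k = 1$, $\mathrm{AS}(f) = O(\sqrt{n})$, $\mathrm{NS}_\varepsilon(f) = O(\sqrt{\varepsilon})$
Trivial bound: $\mathrm{AS}(f) = O(k\sqrt{n})$, $\mathrm{NS}_\varepsilon(f) = O(k\sqrt{\varepsilon})$
Nazarov 2008: Gaussian surface area $\Gamma(f) = O(\sqrt{\log(k)})$, suggests $\mathrm{AS}(f) \le O(\sqrt{\log(k) n})$, $\mathrm{NS}_\varepsilon(f) = O(\sqrt{\log(k)\varepsilon})$
Harsha-Klivans-Meka 2010: For regular halfspaces, $\mathrm{NS}_\varepsilon(f) \le \log(k)^{O(1)}\varepsilon^{1/6}$
This talk: $\mathrm{AS}(f) = O(\sqrt{\log(k) n})$, $\mathrm{NS}_\varepsilon(f) = O(\sqrt{\log(k)\varepsilon})$, optimal up to constants if $k \le 2^n, 2^{\varepsilon^{-1}}$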
Unate Functions
The bulk of our argument requires only that linear threshold functions are monotonic in the following sense:
Definition
We say that a function $f : \{\pm 1\}^n \to \mathbb{R}$ is unate if for each $i$, $f$ is either non-increasing in the $i$th coordinate or non-decreasing in the $i$th coordinate.
The bulk of our work is encapsulated in the following Theorem:
Theorem
If $f_1, \ldots, f_k : \{\pm 1\}^n \to \{0,1\}$ are unate and $F = \bigvee_{i=1}^k f_i$ then
\[ \mathrm{AS}(F) = O(\sqrt{\log(k) n}). \]
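k = 1
Our argument depends on generalizing the correct proof for $k = 1$.
$f : \{\pm 1\}^n \to \{0,1\}$ unate
WLOG $f$ non-decreasing in each coordinate
\[ \mathrm{AS}(f) = \sum_{i=1}^n \mathbb{E}\left[|f(x) - f(x^i)|\right] = \sum_{i=1}^n \mathbb{E}\left[f(x^{i+}) - f(x^{i-})\right] = 2\,\mathbb{E}\left[f(x)\left(\sum_{i=1}^n x_i\right)\right] \le 2\,\mathbb{E}\left[\max\left(0, \sum_{i=1}^n x_i\right)\right] = O(\sqrt{n}). \]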
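The chain of equalities for k = 1 can be verified by brute force on a small monotone halfspace. In the sketch below, which is an illustration only, $x^{i+}$ and $x^{i-}$ are read as x with the i-th coordinate set to +1 and -1 respectively; the helper names and the particular weights are assumptions made here.

```python
from itertools import product


def average_sensitivity(f, n):
    """Exact AS(f) by enumerating all points and all single-coordinate flips."""
    cube = list(product((-1, 1), repeat=n))
    flips = sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:])
                for x in cube for i in range(n))
    return flips / len(cube)


def twice_correlation_with_sum(f, n):
    """2 * E[f(x) * sum_i x_i] for uniform x."""
    cube = list(product((-1, 1), repeat=n))
    return 2 * sum(f(x) * sum(x) for x in cube) / len(cube)


def twice_expected_positive_part(n):
    """2 * E[max(0, sum_i x_i)] for uniform x."""
    cube = list(product((-1, 1), repeat=n))
    return 2 * sum(max(0, sum(x)) for x in cube) / len(cube)


def monotone_halfspace(weights, theta):
    """Halfspace indicator; non-negative weights make it non-decreasing."""
    def f(x):
        return int(sum(w * xi for w, xi in zip(weights, x)) >= theta)
    return f


# Hypothetical example: integer weights keep the arithmetic exact.
f = monotone_halfspace((3, 1, 2, 1), 2)
```

With this `f` and n = 4, `average_sensitivity(f, 4)` agrees with `twice_correlation_with_sum(f, 4)`, and both are at most `twice_expected_positive_part(4)`.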
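Bound Based on Volume
Can actually strengthen the bound if $\mathbb{E}[f]$ is small.
Lemma
Let $S : \{\pm 1\}^n \to \{0,1\}$. If $\mathbb{E}[S(x)] = p$ then
\[ \mathbb{E}\left[S(x)\left(\sum_{i=1}^n x_i\right)\right] = O(p\sqrt{\log(1/p) n}). \]
Proof.
\[ \mathbb{E}\left[S(x)\left(\sum_{i=1}^n x_i\right)\right] \le \int_0^\infty \Pr\left(S(x)\left(\sum_{i=1}^n x_i\right) > y\right) dy \le \int_0^\infty \min\left(p, \Pr\left(\sum_{i=1}^n x_i > y\right)\right) dy \le O(p\sqrt{\log(1/p) n}). \]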
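The last inequality of the proof is stated without detail. One standard way to fill it in is sketched here, under two assumptions that the slides do not state: that $p \le 1/2$, and that the tail bound used is Hoeffding's inequality $\Pr(\sum_i x_i > y) \le e^{-y^2/(2n)}$. The cut-off $y_0 = \sqrt{2n\log(1/p)}$ is a name introduced for this sketch.

```latex
\begin{align*}
\int_0^\infty \min\Bigl(p, \Pr\Bigl(\sum_{i=1}^n x_i > y\Bigr)\Bigr)\,dy
  &\le \int_0^{y_0} p\,dy + \int_{y_0}^\infty e^{-y^2/(2n)}\,dy \\
  &\le p\,y_0 + \int_{y_0}^\infty \frac{y}{y_0}\,e^{-y^2/(2n)}\,dy \\
  &= p\,y_0 + \frac{n}{y_0}\,e^{-y_0^2/(2n)} \\
  &= p\sqrt{2n\log(1/p)} + p\sqrt{\frac{n}{2\log(1/p)}}
   = O\bigl(p\sqrt{\log(1/p)\,n}\bigr).
\end{align*}
```

The second term is absorbed into the first because $\log(1/p) \ge \log 2$ when $p \le 1/2$.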
Proof Overview
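Let $F_m := \bigvee_{i=1}^m f_i$
Let $S_m := F_m - F_{m-1}$
Let $p_m := \mathbb{E}[S_m]$
Proposition
$\mathrm{AS}(F_m) \le \mathrm{AS}(F_{m-1}) + O(p_m\sqrt{\log(1/p_m) n})$
Theorem follows since
$\mathrm{AS}(F) \le O(\sqrt{n})\sum_{i=1}^k p_i\sqrt{\log(1/p_i)}$
$\sum_{i=1}^k p_i \le 1$
$x\sqrt{\log(1/x)}$ is concave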
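The three facts listed can be combined through Jensen's inequality. A sketch, assuming $k \ge 2$ and writing $g(x) = x\sqrt{\log(1/x)}$ and $P = \sum_i p_i = \mathbb{E}[F]$ (both names are introduced here for illustration):

```latex
\begin{align*}
\mathrm{AS}(F) = \mathrm{AS}(F_k)
  &\le \sum_{i=1}^{k} O\bigl(p_i\sqrt{\log(1/p_i)\,n}\bigr)
   = O(\sqrt{n})\sum_{i=1}^{k} g(p_i)
  && \text{(telescoping, } \mathrm{AS}(F_0) = 0\text{)} \\
\sum_{i=1}^{k} g(p_i)
  &\le k\,g(P/k) = P\sqrt{\log(k/P)}
  && \text{(concavity of } g\text{)} \\
  &\le P\sqrt{\log(k)} + P\sqrt{\log(1/P)}
   \le \sqrt{\log(k)} + \tfrac{1}{\sqrt{2e}}
  && \bigl(P \le 1,\ \max_x g(x) = \tfrac{1}{\sqrt{2e}}\bigr).
\end{align*}
```

Together these give $\mathrm{AS}(F) = O(\sqrt{\log(k) n})$.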
Change in Sensitivities
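\[ \mathrm{AS}(F_m) - \mathrm{AS}(F_{m-1}) = \sum_{i=1}^n \mathbb{E}\left[\left|F_m(x) - F_m(x^i)\right| - \left|F_{m-1}(x) - F_{m-1}(x^i)\right|\right] \]
WLOG $f_m$ is non-decreasing in each coordinate
Claim
If $f_m$ is non-decreasing in each coordinate, then for each $x, i$,
\[ \left|F_m(x) - F_m(x^i)\right| - \left|F_{m-1}(x) - F_{m-1}(x^i)\right| \le x_i\left(\left(F_m(x) - F_m(x^i)\right) - \left(F_{m-1}(x) - F_{m-1}(x^i)\right)\right) = x_i\left(S_m(x) - S_m(x^i)\right). \]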
Proof of Claim
Case 1: $f_m(x) = f_m(x^i) = 0$
$F_m(x) = F_{m-1}(x)$ and $F_m(x^i) = F_{m-1}(x^i)$
Both sides of the inequality are 0.
Case 2: $f_m(x) = 1$ or $f_m(x^i) = 1$
WLOG $x_i = 1$
$f_m(x) \ge f_m(x^i)$ so $f_m(x) = F_m(x) = 1$
$x_i\left(F_m(x) - F_m(x^i)\right) \ge \left|F_m(x) - F_m(x^i)\right|$
$-x_i\left(F_{m-1}(x) - F_{m-1}(x^i)\right) \ge -\left|F_{m-1}(x) - F_{m-1}(x^i)\right|$
Result follows
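The Claim can also be checked exhaustively on small cubes. The sketch below is illustrative only: it takes $F_{m-1}$ to be an arbitrary Boolean function (the case analysis above uses nothing else about it) and $f_m$ to be a halfspace with non-negative weights. All helper names are introduced here.

```python
from itertools import product
import random


def flip(x, i):
    """Return x with the sign of coordinate i reversed."""
    return x[:i] + (-x[i],) + x[i + 1:]


def claim_holds(f_m, F_prev, n):
    """Check |dF_m| - |dF_{m-1}| <= x_i (S_m(x) - S_m(x^i)) for all x, i."""
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            y = flip(x, i)
            Fm_x = max(F_prev(x), f_m(x))  # F_m = F_{m-1} OR f_m
            Fm_y = max(F_prev(y), f_m(y))
            lhs = abs(Fm_x - Fm_y) - abs(F_prev(x) - F_prev(y))
            S_x = Fm_x - F_prev(x)         # S_m = F_m - F_{m-1}
            S_y = Fm_y - F_prev(y)
            if lhs > x[i] * (S_x - S_y):
                return False
    return True


def random_monotone_halfspace(n, rng):
    """Halfspace with non-negative weights, hence non-decreasing."""
    w = [rng.random() for _ in range(n)]
    t = rng.uniform(-1, 1)
    return lambda x: int(sum(wi * xi for wi, xi in zip(w, x)) >= t)


def random_boolean_function(n, rng):
    """Arbitrary 0/1-valued function on the cube, stored as a lookup table."""
    table = {x: rng.randint(0, 1) for x in product((-1, 1), repeat=n)}
    return table.__getitem__
```

The monotonicity hypothesis matters: with n = 1, `claim_holds(lambda x: int(x[0] == -1), lambda x: 0, 1)` returns `False`, because at x = (1,) the left side is 1 while the right side is -1.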
Difference of Sensitivities
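\[ \mathrm{AS}(F_m) - \mathrm{AS}(F_{m-1}) = \mathbb{E}\left[\sum_{i=1}^n \left|F_m(x) - F_m(x^i)\right| - \left|F_{m-1}(x) - F_{m-1}(x^i)\right|\right] \le \mathbb{E}\left[\sum_{i=1}^n x_i\left(S_m(x) - S_m(x^i)\right)\right] = 2\,\mathbb{E}\left[S_m(x)\left(\sum_{i=1}^n x_i\right)\right] = O(p_m\sqrt{\log(1/p_m) n}). \]
Theorem follows.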
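The equality with the factor 2 uses only that $x \mapsto x^i$ preserves the uniform distribution and negates the $i$th coordinate. A short sketch of that step, with the substitution variable $y = x^i$ introduced here:

```latex
\begin{align*}
\mathbb{E}\bigl[x_i\,S_m(x^i)\bigr]
  &= \mathbb{E}\bigl[(-y_i)\,S_m(y)\bigr]
   = -\,\mathbb{E}\bigl[x_i\,S_m(x)\bigr], \\
\mathbb{E}\Bigl[\sum_{i=1}^n x_i\bigl(S_m(x) - S_m(x^i)\bigr)\Bigr]
  &= 2\,\mathbb{E}\Bigl[S_m(x)\sum_{i=1}^n x_i\Bigr].
\end{align*}
```

The final $O(\cdot)$ bound is then the volume lemma applied to $S = S_m$ with $p = p_m$, which is legitimate because $F_{m-1} \le F_m$ makes $S_m$ take values in $\{0,1\}$.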
Tightness
Theorem
For $k \le 2^n$ there exists an $F$ given as the indicator function of an intersection of at most $k$ halfspaces so that
\[ \mathrm{AS}(F) = \Omega(\sqrt{\log(k) n}). \]
Consider union instead
$f$ halfspace with $\mathbb{E}[f] \le 1/(4k)$ and $\mathrm{AS}(f) = \Omega(\sqrt{\log(k) n}/k)$
$f_i$ random rotation of $f$
$F = \bigvee_{i=1}^k f_i$
Tightness
$\mathrm{AS}(F)$ at least
\[ 2^{-n}\sum_{i=1}^k \#\{x, y \text{ adjacent} : f_i(x) \neq f_i(y),\ f_j(x) = f_j(y) = 0 \text{ for all } j \neq i\} \]
$2^n\,\Omega(\sqrt{\log(k) n}/k)$ pairs with $f_i(x) \neq f_i(y)$
In expectation, half of them have $f_j(x) = f_j(y) = 0$
Expected sensitivity $\Omega(\sqrt{\log(k) n})$
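The count can be assembled as follows. The factor of one half is presumably a union bound. This reading assumes that the rotations are independent and that, under a random rotation, any fixed point lies in $f_j$ with probability $\mathbb{E}[f] \le 1/(4k)$; the slides do not spell this out.

```latex
\begin{align*}
\Pr\bigl(\exists\, j \neq i : f_j(x) = 1 \text{ or } f_j(y) = 1\bigr)
  &\le 2(k-1)\cdot\frac{1}{4k} < \frac{1}{2}, \\
\mathbb{E}\bigl[\mathrm{AS}(F)\bigr]
  &\ge 2^{-n}\sum_{i=1}^{k} \frac{1}{2}\cdot 2^{n}\,
     \Omega\!\left(\frac{\sqrt{\log(k)\,n}}{k}\right)
   = \Omega\bigl(\sqrt{\log(k)\,n}\bigr).
\end{align*}
```

Some choice of rotations therefore achieves at least the expected value.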
Noise Sensitivity
Theorem
Let $F$ be the indicator function of an intersection of at most $k$ halfspaces. Then
\[ \mathrm{NS}_\varepsilon(F) = O(\sqrt{\log(k)\varepsilon}). \]
Remark
This does not hold for intersections of unate functions.
Noise Sensitivity Bound
Let $\varepsilon = 1/m$. Generate $X, Y$ differing on an $\varepsilon$-fraction of coordinates in the following way:
Randomly divide coordinates into $m$ bins
Randomly fix relative signs of coordinates within each bin
Randomly determine bin signs to get $X$
Reverse signs in one random bin to get $Y$
After first two steps have intersection of halfspaces on $m$ coordinates.
\[ \mathrm{NS}_\varepsilon(F) = \mathbb{E}[\mathrm{AS}(F')/m] = O(\sqrt{\log(k)/m}) = O(\sqrt{\log(k)\varepsilon}). \]
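The four sampling steps can be sketched in code. This is an illustration under one assumption not stated on the slide: "randomly divide into bins" is read as assigning each coordinate to an independent, uniformly random bin. The function and variable names are chosen here. The second helper shows why a halfspace in x becomes a halfspace in the m bin signs once the first two steps are fixed.

```python
import random


def coupled_pair(n, m, rng):
    """Sample (X, Y) by the four binning steps; also return the choices made."""
    bins = [rng.randrange(m) for _ in range(n)]    # 1. bin of each coordinate
    rel = [rng.choice((-1, 1)) for _ in range(n)]  # 2. relative signs within bins
    s = [rng.choice((-1, 1)) for _ in range(m)]    # 3. one sign per bin gives X
    x = [rel[j] * s[bins[j]] for j in range(n)]
    b = rng.randrange(m)                           # 4. reverse one random bin
    y = [-x[j] if bins[j] == b else x[j] for j in range(n)]
    return x, y, bins, rel, s, b


def restricted_weights(w, bins, rel, m):
    """Weights W_c = sum over j in bin c of w_j * rel_j, so that
    sum_j w_j x_j = sum_c W_c s_c for every choice of bin signs s."""
    W = [0] * m
    for j, wj in enumerate(w):
        W[bins[j]] += wj * rel[j]
    return W
```

X and Y differ exactly on the coordinates of the reversed bin, so each coordinate differs with probability 1/m, and flipping one of the m bin signs is a single-coordinate flip of the restricted function $F'$. That is where the factor $\mathrm{AS}(F')/m$ comes from.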
Learning Result
Using standard reductions we obtain:
Corollary
The concept class of intersections of $k$ halfspaces with respect to the uniform distribution on $\{\pm 1\}^n$ is agnostically learnable with error $\mathrm{opt} + \varepsilon$ in time $n^{O(\log(k)\varepsilon^{-2})}$.
Remark
The problem of learning intersections of halfspaces was considered by Klivans-O'Donnell-Servedio, where they achieved a bound of $n^{O(k^2/\varepsilon^2)}$, which is substantially improved by the above.
Conclusion
We have proved essentially optimal bounds on the average sensitivity and noise sensitivity of intersections of a bounded number of halfspaces, and shown applications to learning theory. Potential further work could go into finding the correct constants for these bounds.
Acknowledgements
This work was done with the support of an NSF postdoctoral fellowship.
Itai Benjamini, Gil Kalai and Oded Schramm, Noise Sensitivity of Boolean Functions and Applications to Percolation, Inst. Hautes Études Sci. Publ. Math. 90, pp. 5–43 (2001).
Ilias Diakonikolas, Prahladh Harsha, Adam Klivans, Raghu Meka, Prasad Raghavendra, Rocco A. Servedio, Li-Yang Tan, Bounding the average sensitivity and noise sensitivity of polynomial threshold functions, Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), 2010.
Ilias Diakonikolas, Prasad Raghavendra, Rocco A. Servedio, Li-Yang Tan, Average sensitivity and noise sensitivity of polynomial threshold functions, http://arxiv.org/abs/0909.5011.
Craig Gotsman, Nathan Linial, Spectral properties of threshold functions, Combinatorica, Vol. 14(1), pp. 35–50, 1994.
Adam Kalai, Adam R. Klivans, Yishay Mansour, Rocco Servedio, Agnostically Learning Halfspaces, Foundations of Computer Science (FOCS), 2005.
Daniel M. Kane, The Correct Exponent for the Gotsman-Linial Conjecture, Conference on Computational Complexity (CCC), 2013.
Daniel M. Kane, The Gaussian Surface Area and Noise Sensitivity of Degree-d Polynomial Threshold Functions, Conference on Computational Complexity (CCC), pp. 205–210, 2010.
Prahladh Harsha, Adam R. Klivans, Raghu Meka, An Invariance Principle for Polytopes, Symposium on Theory of Computing (STOC), 2010.
Adam Klivans, Ryan O'Donnell and Rocco Servedio, Learning Intersections and Thresholds of Halfspaces, J. Computer Syst. Sci. 68, pp. 808–840 (2004).
Adam R. Klivans, Ryan O'Donnell, Rocco A. Servedio, Learning Geometric Concepts via Gaussian Surface Area, Proceedings of the 49th Foundations of Computer Science (FOCS), pp. 541–550, 2008.
Yuval Peres, Noise Stability of Weighted Majority, manuscript available at http://arxiv.org/abs/math/0412377.