On the Benefits of Adaptivity in Property Testing of Dense Graphs. Joint works with Mira Gonen and Oded Goldreich. Dana Ron, Tel-Aviv University

Post on 21-Dec-2015


1

On the Benefits of Adaptivity in Property Testing of Dense Graphs

Joint works with Mira Gonen and Oded Goldreich

Dana Ron

Tel-Aviv University

2

Graph Property Testing

Let G = (V,E) be a graph.

A testing algorithm for graph property P can query the graph G on the neighborhood relations of vertices in G.

If G has property P: accept w.h.p.

If G is “ε-far” from P: reject w.h.p.

“ε-far”: an “ε-fraction of the graph” should be modified to obtain P.

3

Models used for Testing Graph Properties

[Figure: n incidence lists with entries 1, 2, …, d (bounded-degree model); n × n adjacency matrix with a 1 in entry (u,v) (dense model)]

Bounded-Degree Graphs Model [Goldreich, R]: (graph is represented by n incidence lists of size d)

– queries: who is the i’th neighbor of v?

– ε-far: εdn edges should be modified.

– suitable: (almost-)regular sparse graphs (in particular, constant-degree graphs)

Dense Graphs Model [Goldreich, Goldwasser, R]: (graph is represented by n × n adjacency matrix)

– queries: is (u,v) ∈ E?

– ε-far: εn^2 edges should be modified.

– suitable: dense graphs
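The two query models can be sketched as small oracle classes. This is only an illustration: the class and method names (`DenseGraphOracle`, `BoundedDegreeOracle`, `far_threshold`) are hypothetical and not from the talk. Each oracle counts its queries, and `far_threshold` returns the number of edge modifications that "ε-far" refers to in that model.

```python
class DenseGraphOracle:
    """Dense model: answers 'is (u,v) in E?' and counts the queries made."""

    def __init__(self, n, edges):
        self.n = n
        self._edges = {frozenset(e) for e in edges}
        self.queries = 0

    def is_edge(self, u, v):
        self.queries += 1
        return frozenset((u, v)) in self._edges

    def far_threshold(self, eps):
        # eps-far in the dense model: eps * n^2 edges should be modified.
        return eps * self.n ** 2


class BoundedDegreeOracle:
    """Bounded-degree model: answers 'who is the i-th neighbor of v?'."""

    def __init__(self, incidence_lists, d):
        assert all(len(lst) <= d for lst in incidence_lists)
        self.n = len(incidence_lists)
        self.d = d
        self._lists = incidence_lists
        self.queries = 0

    def neighbor(self, v, i):
        # i is 1-based; returns None if v has fewer than i neighbors.
        self.queries += 1
        lst = self._lists[v]
        return lst[i - 1] if 1 <= i <= len(lst) else None

    def far_threshold(self, eps):
        # eps-far in the bounded-degree model: eps * d * n edges.
        return eps * self.d * self.n
```

For the same ε, the dense threshold is larger by a factor of n/d, which is why a sparse graph is trivially close to every non-trivial property in the dense model.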

4

Adaptive vs. non-adaptive testing

• A tester is adaptive if its queries depend on answers to previously asked queries.

• There are properties (e.g., Inverse Functions, certain Context-Free languages), for which there is an exponential gap between adaptive and non-adaptive testing [Fischer].

• There are properties (e.g. Monotonicity, Linear Codes) for which there is no such gap [Fischer], [Ben-Sasson, Harsha, Raskhodnikova]. (This is useful for proving lower bounds.)

5

Adaptive vs. non-adaptive testing of graph properties

• In the bounded-degree model, testers are “adaptive by nature” [Raskhodnikova & Smith].

• Can adaptivity be beneficial in the dense graphs model?

• [Alon, Fischer, Krivelevich, Szegedy] [Goldreich & Trevisan] showed that there is at most a quadratic gap in the query complexity between adaptive and non-adaptive testers in the dense graphs model.

• Is there an actual gap in the dense graphs model?

We answer this question positively.

6

Adaptive vs. non-adaptive testing in the Dense-graphs model

• We [Gonen & R] first reveal a gap by considering the well-studied problem of testing Bipartiteness: For every ε, testing bipartiteness of graphs with max-degree O(εn) can be done adaptively with Õ(1/ε^{3/2}) queries, while (by [Bogdanov, Trevisan]) Ω(1/ε^2) queries are necessary for non-adaptive algorithms.

• This result leaves open the question of whether there is a single property for which such a gap exists for every ε.

• We [Goldreich & R] then show that such a gap (4/3 in the exponent) exists for Clique-Collection, a larger gap (3/2) exists for BiClique-Collection, and we conjecture that a certain property gives an almost maximum gap (2−o(1)).

7

Part I:

A Gap for testing Bipartiteness

8

Testing Bipartiteness in Dense Graphs

• Bipartiteness Algorithm of [GGR]:

– Uniformly and independently select Θ(log(1/ε)/ε^2) vertices in graph.

– If subgraph induced by selected vertices is bipartite, then accept, otherwise, reject.

Query complexity and running time of algorithm: Õ(1/ε^4). Slight variant yields Õ(1/ε^3).

• Improved analysis of [Alon & Krivelevich]:

– Sufficient to randomly select only Õ(1/ε) vertices.

The query complexity and running time: Õ(1/ε^2).

• Are Õ(1/ε^2) queries on pairs of vertices necessary?
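The sample-and-check tester on this slide can be sketched as follows. The function names, the constant `c`, and the use of sampling without replacement (capped at n) are assumptions made for the sketch, not details from the talk. The tester queries all pairs in the sample and 2-colors the induced subgraph by BFS.

```python
import math
import random
from collections import deque


def is_bipartite(vertices, adj):
    """BFS 2-coloring; False as soon as an odd cycle is found."""
    color = {}
    for s in vertices:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False
    return True


def sample_bipartiteness_test(n, is_edge, eps, c=1.0, rng=None):
    """Return (accept, number_of_queries) for a graph on vertices 0..n-1."""
    rng = rng or random.Random()
    # Sample size ~ log(1/eps)/eps^2; the constant c is a placeholder.
    k = min(n, math.ceil(c * math.log(1 / eps) / eps ** 2))
    sample = rng.sample(range(n), k)
    adj = {v: [] for v in sample}
    queries = 0
    for i, u in enumerate(sample):
        for v in sample[i + 1:]:
            queries += 1
            if is_edge(u, v):
                adj[u].append(v)
                adj[v].append(u)
    return is_bipartite(sample, adj), queries
```

The tester never rejects a bipartite graph, since every induced subgraph of a bipartite graph is bipartite. The number of queries is quadratic in the sample size, which is where the Õ(1/ε^4) and Õ(1/ε^2) bounds come from.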

9

Lower Bounds [Bogdanov & Trevisan]

Ω(1/ε^2) for non-adaptive algorithms

Ω(1/ε^{3/2}) for adaptive algorithms

The lower bounds hold for graphs of small degree, that is, the degree of every vertex is Θ(εn).

10

New Result:

Describe an adaptive bipartiteness tester that performs Õ(1/ε^{3/2}) queries for graphs with maximum degree O(εn).

A variant of the algorithm tests the combined property of having degree O(εn) and being bipartite.

A few notes:

• The Ω(1/ε^2) lower bound of [BT] for non-adaptive algorithms holds in this low-degree case.

• Our tester matches the Ω(1/ε^{3/2}) lower bound of [BT] up to polylogarithmic factors in 1/ε.

• We also show that Õ(1/ε^{3/2}) queries suffice when (almost) all vertices have degree Ω(ε^{1/2} n).

11

Main Idea behind Algorithm: Apply techniques from the Bounded-Degree model

Consider G that is ε-far from being bipartite and has maximum degree O(εn).

Algorithm selects (uniformly at random) a subset of vertices T, where |T| = Θ(log(1/ε)/ε). Let G_T denote the subgraph induced by T.

Emulates the bipartiteness tester of [Goldreich & R] for bounded-degree graphs on G_T.

Show:

(1) W.h.p. G_T has small degree and is δ-far from bipartite for large δ (distance measured w.r.t. bounded-deg model).

(2) Can emulate algorithm efficiently using vertex-pair queries.

12

Analysis

Lemma 1: W.h.p. G_T is ε’-far from being bipartite for ε’ = Ω(ε/log(1/ε)) (in the dense graphs model).

Lemma 2: W.h.p. the degree of vertices in G_T is at most d = log^c(1/ε).

[Figure: G is ε-far; G_T is Ω(ε/log(1/ε))-far (in dense model) -> Ω(1/log^c(1/ε))-far (in bounded-deg model)]

The number of edges that should be removed from G_T is at least ε’|T|^2 = δ·d|T|, where δ = Ω(1/log^c(1/ε)).

⇒ G_T is δ-far from being bipartite in the bounded degree model.

13

Emulating the Algorithm for Bounded-Degree Graphs

In order to run the [GR] algorithm we have to emulate random walks in G_T by using vertex-pair queries only:

To perform a random walk step from a vertex v: perform all queries (v,u) for u in T, and take a random neighbor.

Each neighbor query takes |T| = Õ(1/ε) vertex-pair queries.

[Figure: vertex v and its neighbors u1, u2, u3, …, u_d(v) inside T]

[GR] Alg starts from Θ(1/δ) = log^c(1/ε) vertices in G_T, where δ is the distance of G_T from bipartiteness in the bounded-degree model, and performs Õ(|T|^{1/2}·poly(1/δ)) = Õ(1/ε^{1/2}) random walks from each start vertex, each of length log^c|T|·poly(1/δ) = log^c(1/ε).

Total cost = Õ(1/ε^{3/2}).
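A minimal sketch of the emulation step, assuming the walk simply moves to a uniformly random neighbor as the slide states. The exact transition probabilities of the [GR] walk are not given here, and the function names are hypothetical. Each step spends |T| − 1 vertex-pair queries to recover the neighbors of the current vertex inside T.

```python
import random


def random_walk_step(v, T, is_edge, rng):
    """One step from v inside G_T using only vertex-pair queries."""
    # Query (v, u) for every other u in T to learn v's neighbors in G_T.
    neighbors = [u for u in T if u != v and is_edge(v, u)]
    # Stay put if v is isolated in G_T (a choice made for this sketch).
    return rng.choice(neighbors) if neighbors else v


def random_walk(start, T, is_edge, length, rng):
    """Walk of the given length in G_T; returns the visited vertices."""
    path = [start]
    for _ in range(length):
        path.append(random_walk_step(path[-1], T, is_edge, rng))
    return path
```

A walk of length L therefore costs about L·|T| queries, which matches the per-step cost of Õ(1/ε) quoted on the slide.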

14

An Adaptive Testing Algorithm for graphs with degree O(εn)

• Uniformly at random select a subset of vertices T, where |T| = Θ(log(1/ε)/ε), and let G_T denote the subgraph induced by T.

• Uniformly and independently at random select Θ(log^c(1/ε)) vertices from T. Let the set of vertices selected be denoted by W.

• For each vertex v ∈ W, perform Θ(log^c(1/ε)/ε^{1/2}) random walks in G_T, each of length Θ(log^c(1/ε)).

• If an odd-length cycle is detected in the subgraph induced by all random walks then reject, otherwise accept.

Emulate the [GR] algorithm for testing bipartiteness of bounded-degree graphs (with appropriate setting of parameters).

Total complexity: O(log^c(1/ε)) start vertices times O(log^c(1/ε)/ε^{1/2}) walks per start vertex times O(log^c(1/ε)) steps per walk times O(log(1/ε)/ε) queries per step = O(log^{c’}(1/ε)/ε^{3/2}).

15

Part II:

A Gap for Standard Testing (for all ε)

16

Main Two Properties Studied

Clique Collection (denoted CC): Graphs that consist of a union of cliques (of any number and size). Property corresponds to a “perfect clustering”.

Bi-Clique Collection (denoted BCC): Graphs that consist of a union of bi-cliques (each bi-clique is a complete bipartite subgraph with each side an ind. set). Property corresponds to a “perfect bi-clustering” (e.g., when there is a relation over two types of entities).
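For concreteness, exact (non-testing) membership checks for the two properties might look like the sketch below. The function names are hypothetical, and the sketch assumes the cliques and bi-cliques are vertex-disjoint with no edges between them, and that an isolated vertex counts as a trivial clique or bi-clique. A graph is in CC exactly when the two endpoints of every edge have the same closed neighborhood. A graph is in BCC exactly when every connected component is bipartite and contains all edges between its two sides.

```python
from collections import deque


def build_adj(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_clique_collection(n, edges):
    adj = build_adj(n, edges)
    closed = [adj[v] | {v} for v in range(n)]
    # Every edge must join two vertices with identical closed neighborhoods.
    return all(closed[u] == closed[v] for u, v in edges)


def is_biclique_collection(n, edges):
    adj = build_adj(n, edges)
    side = {}
    for s in range(n):
        if s in side:
            continue
        side[s] = 0
        comp = [s]
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in side:
                    side[w] = 1 - side[u]
                    comp.append(w)
                    queue.append(w)
                elif side[w] == side[u]:
                    return False  # odd cycle: component is not bipartite
        left = sum(1 for v in comp if side[v] == 0)
        right = len(comp) - left
        comp_edges = sum(len(adj[v]) for v in comp) // 2
        if comp_edges != left * right:
            return False  # some pair across the two sides is missing
    return True
```

These checks read the whole graph. The testers in the talk only need to distinguish such graphs from graphs that are ε-far from the property, using far fewer queries.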

17

Main Results

Thm 1:

– There exists an adaptive tester for CC with query complexity and running time Õ(1/ε).

– Every non-adaptive algorithm for CC must have query complexity Ω(1/ε^{4/3}). (Furthermore, result is tight.)

Thm 2:

– There exists an adaptive tester for BCC with query complexity and running time Õ(1/ε).

– Every non-adaptive algorithm for BCC must have query complexity Ω(1/ε^{3/2}).

18

Conjecture: almost quadratic gap

A Super-Cycle of length t consists of t ind. sets that are arranged on a cycle such that each pair of adjacent ind. sets has a complete bipartite graph between them. A t-Super-Cycle Collection (t-SCC) is a union of super-cycles of length at most t.

Can show that every non-adaptive algorithm for t-SCC must have query complexity Ω(1/ε^{2−2/t}). Conjecture that there exists an adaptive tester for t-SCC with query complexity and running time Õ(1/ε).

19

Adaptive Algorithm for CC

Basic observation: If G = (V,E) ∈ CC then the following holds for every v ∈ V: for every two neighbors u and w of v, (u,w) ∈ E, and for every neighbor u and non-neighbor z, (u,z) ∉ E.

[Figure: vertex v with neighbors u and w and non-neighbor z]

High-level idea of Algorithm: select a sample of (“start”) vertices. For each start vertex v take a sample of neighbors and of non-neighbors of v and verify that the above holds. (To sample neighbors and non-neighbors, take a sample S and query all pairs (v,x) for x ∈ S.)

Remember: total number of queries should be Õ(1/ε), so should be thrifty…

[Figure: start vertex v, sample S, and N_S(v), the neighbors of v in S]

20

Adaptive Algorithm for CC

Let s1 = Θ(1), s2 = Θ(log^3(1/ε)), m = Θ(log(1/ε)). For j = 1,…,m select s1·2^j start vertices and for each start vertex v do:

1. Select random subset S of s2/(ε·2^j) vertices.

2. Determine N_S(v) (neighbors of v in S).

3. Select sample of s2/(ε·2^j) pairs in N_S(v) × N_S(v) and check that each is an edge (if too few neighbors then consider all pairs).

4. Select sample of s2/(ε·2^j) pairs in N_S(v) × (S \ N_S(v)) and check that each is not an edge.

If all checks pass then accept, otherwise, reject.

[Figure: start vertex v, sample S, and N_S(v)]

Query complexity and running time:

Σ_{j=1}^{m} s1·2^j · O(s2/(ε·2^j)) = Õ(1/ε)
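The four steps above can be sketched as follows. The concrete constants (`s1 = 1`, `s2 = ⌈log2(1/ε)^3⌉`, `m = ⌈log2(1/ε)⌉`) are placeholders for the Θ(·) settings on the slide, and the function name and the cap of |S| at n are assumptions of this sketch.

```python
import itertools
import math
import random


def adaptive_cc_test(n, is_edge, eps, s1=1, s2=None, rng=None):
    """Return (accept, number_of_queries) for a graph on vertices 0..n-1."""
    rng = rng or random.Random()
    log_inv = math.log2(1 / eps)
    if s2 is None:
        s2 = max(1, math.ceil(log_inv ** 3))  # stand-in for Theta(log^3(1/eps))
    m = max(1, math.ceil(log_inv))            # stand-in for Theta(log(1/eps))
    queries = 0

    def query(u, v):
        nonlocal queries
        queries += 1
        return is_edge(u, v)

    for j in range(1, m + 1):
        budget = math.ceil(s2 / (eps * 2 ** j))
        for _ in range(s1 * 2 ** j):
            v = rng.randrange(n)
            # Step 1: random subset S (capped at n, v itself excluded).
            S = [x for x in rng.sample(range(n), min(n, budget)) if x != v]
            # Step 2: determine N_S(v) with one query per sampled vertex.
            nbrs, non_nbrs = [], []
            for x in S:
                (nbrs if query(v, x) else non_nbrs).append(x)
            # Step 3: sampled pairs of neighbors must all be edges
            # (all pairs are checked when there are few neighbors).
            if len(nbrs) * (len(nbrs) - 1) // 2 <= budget:
                pairs = list(itertools.combinations(nbrs, 2))
            else:
                pairs = [tuple(rng.sample(nbrs, 2)) for _ in range(budget)]
            if any(not query(u, w) for u, w in pairs):
                return False, queries
            # Step 4: sampled (neighbor, non-neighbor) pairs must be non-edges.
            if nbrs and non_nbrs:
                for _ in range(budget):
                    if query(rng.choice(nbrs), rng.choice(non_nbrs)):
                        return False, queries
    return True, queries
```

The adaptivity is in steps 3 and 4: the pairs queried there depend on the answers obtained in step 2. A graph in CC passes every check, so it is accepted with probability 1.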

21

Illustrative example for Alg for CC

Consider the following case of a graph that should be rejected: The graph consists of subsets of size Θ(εn) where in each there is a constant fraction of missing edges.

In the first iteration select a constant number of vertices. For each v selected, select a subset S of size Õ(1/ε). W.h.p., |N_S(v)| = log^c(1/ε) and among them there is a pair with no edge between them.

Since the alg is adaptive it can “focus” on the neighbors. If it is not adaptive it “can’t know” where to ask the queries.

22

Lower bound for non-adaptive testing of CC

Define two families that are hard to distinguish non-adaptively in fewer than 1/ε^{4/3} queries.

First family: graphs consist of 1/(2ε) cliques, each of size 2εn;

Second family: graphs consist of 1/(4ε) bi-cliques, each of size 4εn;
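A small construction of one graph from each family, for checking the arithmetic. It assumes the two sides of every bi-clique have equal size 2εn, and it uses a fixed vertex labeling, whereas a hard distribution would presumably relabel vertices at random. The function names are hypothetical.

```python
import itertools


def clique_family_graph(n, eps):
    """1/(2*eps) disjoint cliques, each on 2*eps*n vertices."""
    size = round(2 * eps * n)
    count = round(1 / (2 * eps))
    edges = set()
    for c in range(count):
        block = range(c * size, (c + 1) * size)
        edges.update(itertools.combinations(block, 2))
    return edges


def biclique_family_graph(n, eps):
    """1/(4*eps) disjoint bi-cliques, each on 4*eps*n vertices (equal sides)."""
    size = round(4 * eps * n)
    count = round(1 / (4 * eps))
    half = size // 2
    edges = set()
    for c in range(count):
        left = range(c * size, c * size + half)
        right = range(c * size + half, (c + 1) * size)
        edges.update(itertools.product(left, right))
    return edges


def degrees(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg
```

With these sizes every vertex has degree 2εn − 1 in the first family and 2εn in the second, and both graphs have roughly εn^2 edges, so degree and density alone do not separate them.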

23

Summary of Part II

Main Conjecture: for every t ≥ 3 there is a property for which adaptive complexity is q = Õ(1/ε) while non-adaptive is Θ̃(q^{2−2/t}).

Proved for t = 3: CC, with Θ̃(q^{2−2/t}) = Θ̃(q^{4/3}).

Proved l.b. for t = 4: BCC, with Ω(q^{2−2/t}) = Ω(q^{3/2}), but u.b. is open.

For general t, holds for a promise problem.

24

Conclusions and Open problems

• Adaptivity can be beneficial in the dense-graphs model.

• The dense-graphs model is not all about combinatorics: algorithmic aspects play a role.

• What is the complexity of adaptive algorithms for testing bipartiteness of general graphs? (Comment: if all degrees are at least ε^{1/2} n then complexity is Õ(1/ε^{3/2}).)

• Is our conjecture correct? (I.e., does there exist a quadratic gap for the property we suggest, or for another property?)

• For what constants c ∈ [1,2] is there a gap (in the exponent) of a factor of c?

• Characterize the class of graph properties for which c = 1.

25

Thanks

26

Proof sketch of the main Lemma

Lemma: w.h.p. G_T is ε’-far from being bipartite for ε’ = Ω(ε/log(1/ε)).

Assume S,R has Property 1 and G_T has Property 2.

Consider any fixed partition (S1 ∪ R1, S2 ∪ R2) of T.

Property 1 ⇒ R spans at least (ε/16)|R|^2 conflicting edges.

Mapping from conflicting edges to violating edges: #conflicting edges mapped to each violating edge ≤ c’’·log(1/ε) (using Property 2).

⇒ there are at least ε’’|R|^2 violating edges with respect to (S1 ∪ R1, S2 ∪ R2), ε’’ ≥ ε/(c·log(1/ε)).

⇒ there are at least ε’|T|^2 violating edges with respect to (S1 ∪ R1, S2 ∪ R2), ε’ = ε’’/4.

27

Analysis

Proving the main result from the main Lemma:

1. Assume main Lemma holds: ε’|T|^2 edges need to be removed from G_T, where |T| = Õ(1/ε) and ε’ = Ω(ε/log(1/ε)).

2. W.h.p. the degree of vertices in G_T is at most d = polylog(1/ε).

⇒ The number of edges that should be removed from G_T is at least δ·d|T|, where δ = Ω(1/polylog(1/ε)).

⇒ G_T is δ-far from being bipartite in the bounded degree model.

Applying techniques from the bounded-degree graphs model (run the [GR] algorithm on G_T) we get our main result.

28

Emulating the Algorithm for Bounded-Degree Graphs

Main Lemma + Claim: w.h.p. over the choice of sample T, all vertices in G_T have degree at most d = O(log(1/ε)), and it is necessary to remove more than ε’|T|^2 edges in order to make it bipartite, for ε’ = Ω(ε/log(1/ε)).

⇒ G_T is δ-far from being bipartite in the bounded degree model, for δ = ε’|T|^2/(d|T|) = Ω(1/log(1/ε)).

29

Adaptive Algorithm for CC

Consider two extreme cases (of graphs that should be rejected): (1) Graph consists of subsets of size Θ(εn) where in each there is a constant fraction of missing edges. (2) Graph is Θ(ε)-close to a single clique.

In the first case, for any start vertex, it suffices to obtain a constant number of pairs of neighbors, but to get a neighbor requires a sample of Ω(1/ε) random vertices. In the second case need to obtain Ω(1/ε) pairs of neighbors, but a constant size sample suffices for each neighbor.

30

Analysis of Alg for CC

Accepts every graph in CC w.p. 1.

To prove that it rejects w.p. at least 2/3 graphs that are ε-far from CC, prove the contrapositive statement: if alg accepts w.p. > 1/3 then graph is ε-close to CC.

This is done by iteratively constructing a collection of (almost) cliques (with few edges between them) using the neighbor sets of “good” vertices: Vertex v is good if N(v) is close to being a clique and there are few edges going out of N(v).

31

Constructing (almost) Clique Collection

Quantify “goodness” according to “closeness” and “fewness”: (roughly) v is j-good if there are at most 2^j·ε·|N(v)||V| missing edges in N(v) and extra edges going out of N(v).

Definition is such that if the graph passes the test w.p. > 1/3 then the number of vertices that are not j-good and have degree at least 2^j·ε·|V| is at most 2^{-j}|V|.

Work in phases. In phase j consider j-good vertices that have degree at least 2^j·ε·|V| and relatively few neighbors already covered. Define new (almost) clique C_v based on N(v).

Can show that frac of uncovered vertices at start of phase j is 2-ji so “can afford” lower-quality cliques.

[Figure: vertex v and its neighborhood N(v)]