proximity oblivious testing oded goldreich weizmann institute of science joint work with dana ron

Proximity Oblivious Testing

Oded Goldreich, Weizmann Institute of Science

Joint work with Dana Ron

Property Testing: informal definition

A relaxation of a decision problem: For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any object having P).

Focus: sub-linear time algorithms – performing the task by inspecting the object at few locations.


Property Testing: the standard (one-sided error) def’n

A property $P = \bigcup_n P_n$, where $P_n$ is a set of functions with domain $D_n$.

The tester gets explicit input $n$ and $\epsilon$, and oracle access to a function with domain $D_n$.

• If $f \in P_n$ then $\Pr[T^f(n,\epsilon) \text{ accepts}] = 1$.

• If $f$ is $\epsilon$-far from $P_n$ then $\Pr[T^f(n,\epsilon) \text{ rejects}] > 2/3$.

Focus: query complexity $q(n,\epsilon) = q(\epsilon)$ ($\ll |D_n|$)

Terminology: $\epsilon$ is called the proximity parameter.

How does a tester use the proximity parameter?

Some testers use the proximity parameter merely to determine the number of times that a basic test is performed, where the basic test is oblivious of the proximity parameter. We call such basic tests proximity oblivious testers.

Example: the BLR (linearity) tester. On input $\epsilon$ (the proximity parameter) and oracle access to $f$, repeat the following test $O(1/\epsilon)$ times:

1. Select $x, y$ uniformly in $D_n$.

2. Accept iff $f(x)+f(y)=f(x+y)$.
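The two-step test above can be sketched in Python. This is only an illustration under stated assumptions: the domain is taken to be $\{0,1\}^n$ with bitwise XOR as addition, the range is $\{0,1\}$ with addition mod 2, and the names `blr_basic_test`, `blr_tester`, and the constant `c` in the repetition count are hypothetical choices, not taken from the talk.

```python
import math
import random


def blr_basic_test(f, n_bits, rng):
    """One proximity-oblivious check: accept iff f(x) + f(y) = f(x + y).

    Assumption: the domain is {0,1}^n_bits encoded as integers, so domain
    addition is bitwise XOR and range addition is XOR on single bits.
    """
    x = rng.getrandbits(n_bits)
    y = rng.getrandbits(n_bits)
    return (f(x) ^ f(y)) == f(x ^ y)


def blr_tester(f, n_bits, eps, c=3, seed=None):
    """Standard tester: repeat the basic test about c/eps times.

    The constant c is an illustrative assumption; the slide only says O(1/eps).
    """
    rng = random.Random(seed)
    repetitions = max(1, math.ceil(c / eps))
    # Accept only if every invocation of the basic test accepts.
    return all(blr_basic_test(f, n_bits, rng) for _ in range(repetitions))
```

Note that `blr_basic_test` never sees `eps`; only the outer loop does, which is the sense in which the basic test is proximity oblivious.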

Proximity Oblivious Testing: the basic definition

A property $P = \bigcup_n P_n$, where $P_n$ is a set of functions with domain $D_n$.

A P.O. Tester (POT) gets explicit input $n$ (but not $\epsilon$), and oracle access to a function with domain $D_n$.

• If $f \in P_n$ then $\Pr[T^f(n) \text{ accepts}] = 1$.

• If $f \notin P_n$ then $\Pr[T^f(n) \text{ rejects}] > \rho(\delta_P(f))$,

where $\rho : (0,1] \to (0,1]$ is the “detection rate” and $\delta_P(f)$ denotes the distance of $f$ from $P$.

Focus: constant query complexity $q(n) = q$ ($\ll |D_n|$)

N.B.: A standard tester is obtained by repeating the POT (i.e., on proximity parameter $\epsilon$, repeat $O(1/\rho(\epsilon))$ times).
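As a rough sketch of the repetition count in the N.B. above: if every function that is $\epsilon$-far is rejected by one POT invocation with probability at least $\rho(\epsilon)$ (this assumes $\rho$ is non-decreasing), then $t$ independent invocations all accept with probability at most $(1-\rho(\epsilon))^t$. The helper name `repetitions_for_standard_tester` and the `error` parameter are hypothetical.

```python
import math


def repetitions_for_standard_tester(rho, eps, error=1 / 3):
    """Smallest t with (1 - rho(eps))**t <= error.

    Assumption: rho is non-decreasing, so rho(eps) lower-bounds the
    rejection probability of every eps-far function.
    """
    p = rho(eps)
    if not 0 < p <= 1:
        raise ValueError("detection rate must lie in (0, 1]")
    if p == 1:
        return 1
    return math.ceil(math.log(error) / math.log(1 - p))
```

Since $-\ln(1-p) \ge p$, the returned value is at most $\lceil \ln 3 / \rho(\epsilon) \rceil$, which matches the $O(1/\rho(\epsilon))$ bound.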

Questions addressed in this work

1. Which “testable” properties have POTs?

2. How does the complexity of the standard tester obtained by repeating the POT compare to the complexity of the best possible standard tester?

These questions are studied mainly in two standard models of testing graph properties: (i) the adjacency matrix model and (ii) the bounded-degree model.

Example: the BLR (linearity) tester.

The complexity of the (std.) tester obtained by repeating the POT equals (up to a constant) the complexity of the best possible standard tester.

PART 1: In the adjacency matrix model

A graph $G=(V,E)$ is represented by a function $g:[N]\times[N]\to\{0,1\}$ (i.e., $g(u,v)=1$ iff $(u,v)$ is an edge in $G$).

The adjacency matrix model: two simple examples

A graph $G=(V,E)$ is represented by a function $g:[N]\times[N]\to\{0,1\}$.

Example 1: Clique. The property of being a clique has a “trivial” two-query POT with $\rho(\delta)=\delta$.

Example 2: BiClique. The property of being a biclique has a three-query POT with $\rho(\delta)=\delta$.

Select $s\in[N]$ arbitrarily, and random $u,v\in[N]$, and accept iff the induced subgraph is a biclique (i.e., has an even number of edges).
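The three-query BiClique test just described might look as follows in Python. This is a sketch under assumptions: vertices are numbered from 0, the oracle `g(u, v)` is symmetric with `g(u, u) == 0`, and the function name `biclique_pot` and the default choice `s=0` are hypothetical.

```python
import random


def biclique_pot(g, n, rng, s=0):
    """One invocation: fix s, pick random u and v, query the three pairs.

    Accept iff the subgraph induced by {s, u, v} has an even number of
    edges (0 or 2), which is exactly when it is a biclique.
    """
    u = rng.randrange(n)
    v = rng.randrange(n)
    edges = g(s, u) + g(s, v) + g(u, v)
    return edges % 2 == 0
```

With no self-loops, degenerate samples (such as `u == v` or `u == s`) always yield an even count, so a biclique is still accepted with probability 1.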

Example 2: analysis of the 3-query POT

Select $s\in[N]$ arbitrarily, and random $u,v\in[N]$, and accept iff the induced subgraph is a biclique (i.e., has an even number of edges).

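Analysis technique: consider the partition induced by $s$, namely $\Gamma(s)$ versus $[N] \setminus \Gamma(s)$, where $\Gamma(s)$ denotes the set of neighbors of $s$.

Suppose that the graph is $\delta$-far from Biclique. Then

#edges in same side + #non-edges between sides > $\delta N^2$.

For an edge in the same side, the induced subgraph has 1 or 3 edges; for a non-edge between sides, the induced subgraph has a single edge.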

Example 3: triangle-freeness [AFKS, Alon]

THM: Triangle-freeness has a 3-query POT with $\rho(\delta)=1/\mathrm{Tower}(1/\delta)$, but no O(1)-query POT with $\rho(\delta)=\mathrm{poly}(\delta)$.

The point is that being $\delta$-far from triangle-freeness means that $\delta N^2$ edges must be omitted to obtain a triangle-free graph, but this does not mean that the graph has $\delta N^3$ (nor $\mathrm{poly}(\delta) N^3$) triangles.

Conclusion: easy testability and POT-ness are “far from straightforward”.

Example 4: testing bipartiteness

THM: Bipartiteness has no O(1)-query POT.

PF: A graph can be $\delta$-far from Bipartiteness while all its O(1)-vertex induced subgraphs are bipartite. E.g., an odd-length super-cycle of equal-sized independent sets, with the number of sets growing with $1/\delta$, such that each adjacent pair of sets is connected by a complete bipartite graph.

Recall that Bipartiteness is efficiently testable with $\mathrm{poly}(1/\epsilon)$ queries.

Conclusion: easily testable properties may not have POTs.
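The super-cycle in the proof can be checked on a toy instance. The sketch below builds a super-cycle of `num_sets` independent sets (the sizes 5 and 2 used in the usage note are illustrative assumptions, as are the names `super_cycle` and `is_bipartite`) and lets one verify that the whole graph is not bipartite while every induced subgraph on fewer than `num_sets` vertices is.

```python
from collections import deque
from itertools import combinations


def super_cycle(num_sets, set_size):
    """Return (n, g): num_sets independent sets arranged on a cycle, with a
    complete bipartite graph between each pair of cyclically adjacent sets.
    Assumes num_sets >= 3."""
    n = num_sets * set_size

    def g(u, v):
        diff = (u // set_size - v // set_size) % num_sets
        return 1 if diff in (1, num_sets - 1) else 0

    return n, g


def is_bipartite(vertices, g):
    """2-colour the subgraph induced by `vertices` via breadth-first search."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in vertices:
                if w != u and g(u, w):
                    if w not in color:
                        color[w] = 1 - color[u]
                        queue.append(w)
                    elif color[w] == color[u]:
                        return False
    return True
```

For example, with `n, g = super_cycle(5, 2)`, the call `is_bipartite(range(n), g)` fails, whereas `is_bipartite(s, g)` succeeds for every 4-vertex subset `s` from `combinations(range(n), 4)`: any such subset misses at least one of the five sets, and what remains is a super-path.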

Characterization of graph properties having a POT

THM (oversimplified): Property P has an O(1)-query POT iff P equals the set of F-free graphs, where F is a fixed set of O(1)-size graphs.

PF idea: Given a POT, we derive a canonical POT (a la [GT]), which yields a characterization of P in terms of forbidden subgraphs (equiv., allowed induced subgraphs). In the other direction, use [AFKS].

Clarification: For a set of graphs F and a graph G, we say that G is F-free if no induced subgraph of G belongs to F.

THM (actual): Property $P = \bigcup_N P_N$ has an O(1)-query POT iff for some constant c and every N, it holds that $P_N$ equals the set of $F_N$-free graphs, where $F_N$ is a set of c-size graphs.
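To make F-freeness concrete, here is a minimal sketch of a test that samples c vertices and rejects iff they induce a forbidden graph. It is an illustration of the idea, not the construction from the paper; the names `induces_forbidden` and `f_freeness_pot`, and the encoding of each forbidden graph as a frozenset of pairs `(i, j)` with `i < j` over `range(c)`, are assumptions made for this example.

```python
import random
from itertools import permutations


def induces_forbidden(vertices, g, forbidden):
    """True iff the subgraph induced by `vertices` equals, under some
    relabeling, one of the edge sets in `forbidden`."""
    c = len(vertices)
    for perm in permutations(vertices):
        edges = frozenset(
            (i, j)
            for i in range(c)
            for j in range(i + 1, c)
            if g(perm[i], perm[j])
        )
        if edges in forbidden:
            return True
    return False


def f_freeness_pot(g, n, c, forbidden, rng):
    """Sample c distinct vertices; accept iff they induce no forbidden graph."""
    sample = rng.sample(range(n), c)
    return not induces_forbidden(sample, g, forbidden)
```

For instance, triangle-freeness corresponds to `forbidden = {frozenset({(0, 1), (0, 2), (1, 2)})}` with `c = 3`. Trying all relabelings is affordable only because c is a constant.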

Example 5: testing Clique Collection (CC)

THM: CC has a 3-query POT with $\rho(\delta)=O(\delta^2)$, and no O(1)-query POT can do better.

PF (of the lower bound): Consider a collection of $1/(4\delta)$ balanced bicliques, each of size $4\delta N$. This graph is $\delta$-far from CC while rejecting it requires hitting some biclique at least three times.
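The lower-bound graph can be explored by brute force on a small instance. The sketch assumes the natural 3-vertex check (reject iff the three vertices induce exactly two edges, the only 3-vertex graph that is not a clique collection); this check and the names `biclique_collection`, `cc_check_rejects`, and `rejection_fraction` are assumptions for illustration, not details from the talk.

```python
from itertools import combinations


def biclique_collection(num_bicliques, side):
    """Return (n, g) for a disjoint union of balanced bicliques,
    each with `side` vertices on either side."""
    size = 2 * side
    n = num_bicliques * size

    def g(u, v):
        if u == v or u // size != v // size:
            return 0  # no self-loops, no edges across different bicliques
        return 1 if (u % size < side) != (v % size < side) else 0

    return n, g


def cc_check_rejects(g, u, v, w):
    """Reject iff {u, v, w} induce a path of length two (exactly two edges)."""
    return g(u, v) + g(u, w) + g(v, w) == 2


def rejection_fraction(n, g):
    """Exact fraction of vertex triples that the 3-vertex check rejects."""
    triples = list(combinations(range(n), 3))
    rejected = sum(cc_check_rejects(g, *t) for t in triples)
    return rejected / len(triples)
```

Every rejected triple lies inside a single biclique, which is the "hitting some biclique at least three times" event from the proof; triples that span several bicliques induce at most one edge and are accepted.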

Recall that CC is efficiently testable with $\tilde{O}(1/\epsilon)$ queries [GR], and even $\tilde{O}(\epsilon^{-4/3})$ non-adaptive queries suffice.

Conclusion: The (std.) tester obtained by repeating the best POT may have significantly higher complexity than the standard tester.

Example 6: testing c-Clique Collection (c-CC)

THM: For every $c\ge 2$, the property c-CC has a (c+1)-query POT with $\rho(\delta)=O(\delta^{c/2})$, and no O(1)-query POT can do better.

PF (of the lower bound): Consider a graph consisting of c small cliques, each of size $\sqrt{\delta}N$, and a large clique of size $(1-\sqrt{\delta})N$. This graph is $\delta$-far from c-CC while rejecting it requires hitting each of the c small cliques.

Recall that c-CC is testable with $\tilde{O}(1/\epsilon)$ queries [GR], even non-adaptively!

Conclusion: The (std.) tester obtained by repeating the best POT may have tremendously higher complexity than the standard tester.

PART 2: In the bounded-degree model

A graph $G=(V,E)$ of degree bound d is represented by a function $g:[N]\times[d]\to[N]\cup\{0\}$ (i.e., $g(u,i)=v$ iff $v$ is the $i$th neighbor of $u$ in $G$, and $g(u,i)=0$ iff $u$ has fewer than $i$ neighbors).
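A minimal sketch of this representation, assuming vertices are numbered 1..N (so that 0 is free to mean "no such neighbor") and the graph is given as a dictionary of neighbor lists; the name `incidence_function` is hypothetical.

```python
def incidence_function(adjacency, d):
    """Build the oracle g(u, i) of the bounded-degree model.

    adjacency: dict mapping each vertex in 1..N to an ordered list of its
    neighbours (an assumed input format). Returns the i-th neighbour of u,
    or 0 if u has fewer than i neighbours.
    """
    def g(u, i):
        neighbours = adjacency[u]
        if len(neighbours) > d:
            raise ValueError("vertex exceeds the degree bound")
        if not 1 <= i <= d:
            raise ValueError("neighbour index must lie in 1..d")
        return neighbours[i - 1] if i <= len(neighbours) else 0

    return g
```

For the path 2 - 1 - 3 with `d = 2`, the oracle built from `{1: [2, 3], 2: [1], 3: [1]}` answers `g(1, 2)` with 3 and `g(2, 2)` with 0.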

The bounded-degree model: preliminaries to the characterization

DEF (generalized subgraph freeness): graphs with vertices marked full, semi-full, and partial such that a disallowed mapping of $F=([n],E_F)$ to $G=([N],E)$ satisfies

• for full vertex v, map(neigh(v)) = neigh(map(v))

• for semi-full vertex v, map(neigh(v)) = neigh(map(v)) $\cap$ map([n])

• for partial vertex v, map(neigh(v)) $\subseteq$ neigh(map(v))

E.g., induced (resp., non-induced) graph-freeness corresponds to the special case of using only semi-full (resp., partial) markings.
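The three marking conditions can be written out as a small checker. This is a sketch under assumptions: graphs are given as dictionaries from vertices to neighbor sets, the mapping is required to be injective (the slide does not say so explicitly), and the name `is_disallowed_mapping` is hypothetical.

```python
def is_disallowed_mapping(f_neigh, marks, g_neigh, mapping):
    """Check whether `mapping` (dict from F-vertices to G-vertices) meets
    the condition attached to every marked vertex of F.

    f_neigh, g_neigh: dicts from vertex to set of neighbours.
    marks: dict from F-vertex to 'full', 'semi-full', or 'partial'.
    """
    image = set(mapping.values())
    if len(image) != len(mapping):
        return False  # assumption: mappings must be injective
    for v, mark in marks.items():
        mapped = {mapping[w] for w in f_neigh[v]}
        actual = g_neigh[mapping[v]]
        if mark == "full":
            ok = mapped == actual
        elif mark == "semi-full":
            ok = mapped == actual & image
        elif mark == "partial":
            ok = mapped <= actual
        else:
            raise ValueError(f"unknown marking: {mark}")
        if not ok:
            return False
    return True
```

Mapping a 3-vertex path onto a triangle illustrates the special cases: with all vertices marked partial the mapping qualifies (non-induced containment), while with all vertices marked semi-full it does not, because the triangle has an extra edge between the images of the path's endpoints.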

Generalized subgraph freeness: non-propagation

DEF (abbrev.): a disallowed mapping of $F=([n],E_F)$ to $G=([N],E)$ satisfies

• for full vertex v, map(neigh(v)) = neigh(map(v))

• for semi-full vertex v, map(neigh(v)) = neigh(map(v)) $\cap$ map([n])

• for partial vertex v, map(neigh(v)) $\subseteq$ neigh(map(v)).

Def: F is non-propagating if there exists $\tau:(0,1]\to(0,1]$ such that if every mapping of every marked graph in F to the graph G uses a vertex in B, then G is $\tau(|B|/N)$-close to being F-free.

• Not all sets F are non-propagating.

• For any F with no full vertices, F is non-propagating.

• Degree-regularity is captured by a non-propagating F. Note that this is a non-hereditary property.

The bounded-degree model: characterization

Def: F is non-propagating if there exists $\tau:(0,1]\to(0,1]$ such that if every mapping of every marked graph in F to the graph G uses a vertex in B, then G is $\tau(|B|/N)$-close to being F-free.

• Not all sets F are non-propagating.

• For any F with no full vertices, F is non-propagating.

• Degree-regularity is captured by a non-propagating F. Note that this is a non-hereditary property.

THM (oversimplified): A property P has an O(1)-query POT iff for some non-propagating F it holds that P equals F-freeness.

OPEN: Can every generalized subgraph freeness property be captured by F-freeness for some non-propagating F ?

Other Models (of property testing)

THM: If property P is testable by a non-adaptive tester that (i) makes a number of queries that only depends on the proximity parameter and (ii) rejects based on a constant-sized “witness”, then P has a POT.

Note: strong codeword tests (cf. [GS]) correspond to POTs.

OPEN: Do codes of 1/polylog rate have O(1)-query codeword POT?

The End

The slides of this talk are available at

http://www.wisdom.weizmann.ac.il/~oded/T/pot.ppt

The paper itself is available at http://www.wisdom.weizmann.ac.il/~oded/p_testPOT.html

A companion paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_testAA.html

On the companion paper “Algorithmic Aspects of Property Testing in the Dense Graphs Model”

THM [GT]: If a graph property is testable by $q(N,\epsilon)$ queries then it is testable by a canonical tester of query complexity $O(q(N,\epsilon)^2)$.

A canonical tester inspects a random induced subgraph and accepts iff the inspected graph has a predetermined property.

Me (since 2001): “In this model, there is no room for algorithms -- property testing reduces to sheer combinatorics.”

Me (now): A finer examination (which cares for the quadratic blow-up) reveals the role of algorithms; as shown in the paper, adaptive algorithms outperform non-adaptive ones, which in turn outperform canonical testers.
