characterizations of hw and more.ppt [modalità compatibilità]frank/biss2009/characterizations of...


Hypertree Decompositions: Characterizations, Applications and Extensions

Prof. Francesco Scarcello
Bertinoro International Spring School 2009, 2-13 March 2009

Characterizations of Hypertree width

Is hypertree width
  a natural concept?
  a natural generalization of hypergraph acyclicity?

Are there nice characterizations
  in terms of logic?
  in terms of games?

Yes!

Characterizations of Hypertree width

Logical characterization: Loosely guarded logic

Game characterization: The robber and marshals game

Guarded Formulas

∃X̄ (g ∧ φ)

Guard atom g: free(φ) ⊆ var(g)

k-guarded Formulas (loosely guarded):

∃X̄ (g_1 ∧ g_2 ∧ … ∧ g_k ∧ φ)

k-guard: g_1 ∧ g_2 ∧ … ∧ g_k

GF(FO), GF_k(FO) are well-studied fragments of FO (Van Benthem '97, Grädel '99)

Logical Characterization of HW

Theorem: HW_k = GF_k(L)

From this general result, we also get a nice logical characterization of acyclic queries:

Corollary: HW_1 = ACYCLIC = GF(L)

An Example

∃X,Y,Z,T,U,W. (p(X,Y,Z) ∧ q(X,Y,T) ∧ r(Y,Z,U) ∧ s(T,W))

Is acyclic:

[Figure: join tree with p(X,Y,Z) on top, q(X,Y,T) and r(Y,Z,U) on the next level, and s(T,W) below q(X,Y,T).]

Indeed, there exists an equivalent guarded formula:

∃X,Y,Z. (p(X,Y,Z) ∧ ∃T. (q(X,Y,T) ∧ ∃W. s(T,W)) ∧ ∃U. r(Y,Z,U))

(The slide marks a guard atom and a guarded subformula.)
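The acyclicity claim for this example can be checked mechanically. The following is a minimal sketch, assuming the standard GYO reduction as the acyclicity test (the slides do not name a specific procedure); the function name `is_acyclic` is illustrative only.

```python
def is_acyclic(hyperedges):
    """GYO reduction: True iff repeated simplification empties the hypergraph."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop every vertex that occurs in exactly one hyperedge.
        for e in edges:
            for v in list(e):
                if sum(v in f for f in edges) == 1:
                    e.discard(v)
                    changed = True
        # Rule 2: drop one hyperedge that is empty or contained in another one.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


# Hyperedges of the example query: one per atom, over its variables.
example = [{"X", "Y", "Z"}, {"X", "Y", "T"}, {"Y", "Z", "U"}, {"T", "W"}]
triangle = [{"A", "B"}, {"B", "C"}, {"A", "C"}]
print(is_acyclic(example), is_acyclic(triangle))
```

The triangle is included only as a contrasting cyclic case; it does not come from the slides.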


Game characterization: Robber and Marshals

A robber and k marshals play the game on a hypergraph

The marshals have to capture the robber

The robber tries to elude her capture, by running arbitrarily fast on the vertices of the hypergraph

Robbers and Marshals: the rules

Each marshal stays on an edge of the hypergraph and controls all of its vertices at once

The robber can go from a vertex to another vertex running along the edges, but she cannot pass through vertices controlled by some marshal

The marshals win the game if they are able to monotonically shrink the moving space of the robber, and thus eventually capture her

Consequently, the robber wins if she can go back to some vertex previously controlled by marshals
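The rules above can be made concrete by computing the robber's moving space for a given marshal position. This is a minimal sketch, not the authors' formalization: the hypergraph used here (hyperedges a, b, c, d, e, f, g over the vertices V, P, R, S, X, Y, Z, T, U, W) is an assumption chosen for illustration, and `robber_space` is a hypothetical helper name.

```python
def robber_space(hyperedges, marshal_edges, robber_vertex):
    """Vertices the robber can reach without crossing a marshal-controlled vertex."""
    # Each marshal controls all vertices of the hyperedge it stands on.
    blocked = set().union(*(hyperedges[m] for m in marshal_edges))
    if robber_vertex in blocked:
        return set()
    space, frontier = {robber_vertex}, [robber_vertex]
    while frontier:
        v = frontier.pop()
        for edge in hyperedges.values():
            if v in edge:
                # The robber runs along any hyperedge containing v,
                # but only onto vertices that are not controlled.
                for u in edge - blocked:
                    if u not in space:
                        space.add(u)
                        frontier.append(u)
    return space


# Illustrative hypergraph (an assumption, one hyperedge per atom name).
H = {
    "a": {"S", "X", "T", "R"}, "b": {"S", "Y", "U", "P"},
    "c": {"T", "U", "Z"}, "d": {"W", "X", "Z"}, "e": {"Y", "Z"},
    "f": {"R", "P", "V"}, "g": {"X", "Y"},
}
print(robber_space(H, ["a", "b"], "V"))
print(robber_space(H, ["a", "b"], "Z"))
```

With two marshals on a and b, the robber is confined either to V alone or to the component containing Z and W; a monotone strategy must then shrink whichever component she picks.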

[Figures: nine animation frames of an example game on a hypergraph with vertices V, P, R, S, X, Y, Z, T, U, W. Frame titles, in order: "Step 0: the empty hypergraph"; "Step 1: first move of the marshals" (two frames); "Step 2a: shrinking the space" (three frames); "The capture"; "A different robber's choice"; "Step 2b: the capture".]

R&M Game and Hypertree Width

Let H be a hypergraph.

Theorem: H has hypertree width ≤ k if and only if k marshals have a winning strategy on H.

Corollary: H is acyclic if and only if one marshal has a winning strategy on H.

Winning strategies on H correspond to hypertree decompositions of H and vice versa.

Strategies and Decompositions

ans ← a(S,X,T,R) ∧ b(S,Y,U,P) ∧ c(T,U,Z) ∧ e(Y,Z) ∧ g(X,Y) ∧ f(R,P,V) ∧ d(W,X,Z)

[Figure: the query hypergraph on vertices V, P, R, S, X, Y, Z, T, U, W, repeated on each frame below next to the decomposition built so far.]

First choice of the two marshals: a(S,X,T,R), b(S,Y,U,P)

A possible choice for the robber

The capture: f(R,P,V)

The second choice for the robber

The marshals corner the robber: g(X,Y), c(T,Z,U)

The capture: g(X,Y), d(W,X,Z)

The decomposition built along the way has the node a(S,X,T,R), b(S,Y,U,P) as root, with children f(R,P,V) and g(X,Y), c(T,Z,U); the node g(X,Y), d(W,X,Z) is the child of g(X,Y), c(T,Z,U).
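The tree read off the frames above can be checked against the two basic decomposition conditions. This is a minimal sketch under stated assumptions: χ(p) is taken to be all variables of the atoms in λ(p), the node names n1 to n4 are hypothetical, and the special condition of hypertree decompositions is not checked.

```python
# Hyperedges of the query, one per atom.
H = {
    "a": {"S", "X", "T", "R"}, "b": {"S", "Y", "U", "P"},
    "c": {"T", "U", "Z"}, "d": {"W", "X", "Z"}, "e": {"Y", "Z"},
    "f": {"R", "P", "V"}, "g": {"X", "Y"},
}
# Decomposition tree as parent -> children, with the lambda label of each node.
tree = {"n1": ["n2", "n3"], "n2": [], "n3": ["n4"], "n4": []}
lam = {"n1": ["a", "b"], "n2": ["f"], "n3": ["g", "c"], "n4": ["g", "d"]}
# Assumption: chi(p) is the set of all variables covered by lambda(p).
chi = {p: set().union(*(H[h] for h in lam[p])) for p in lam}


def covers_all_edges(hyperedges, chi):
    """Every hyperedge must be contained in chi(p) for some node p."""
    return all(any(e <= c for c in chi.values()) for e in hyperedges.values())


def is_connected(tree, chi):
    """For every variable, the nodes containing it must form a subtree."""
    links = [(p, q) for p, kids in tree.items() for q in kids]
    for v in set().union(*chi.values()):
        nodes = sum(v in chi[p] for p in chi)
        shared = sum(v in chi[p] and v in chi[q] for p, q in links)
        # An induced forest has (nodes - edges) components; we need exactly one.
        if nodes - shared != 1:
            return False
    return True


width = max(len(atoms) for atoms in lam.values())
print(covers_all_edges(H, chi), is_connected(tree, chi), width)
```

The width is the largest number of atoms in a node label, which matches the two marshals used in the game.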

What about the clique-width?

Fixed MS2 formulae can be evaluated in polynomial time on bounded clique-width graphs (Courcelle '90; Courcelle, Makowsky, Rotics '00)

Bounded treewidth entails bounded clique-width, but not vice versa.

Some interesting questions arise:
  Are bounded clique-width queries tractable?
  Is it more powerful than hypertree width?
  Can the tractability barrier for MS2 queries be pushed any further?

A comparison

Some answers (Gottlob and Pichler '04)

Bounded clique-width (of the incidence graph) entails bounded hypertree width

Bounded clique-width queries may be answered in polynomial time

Evaluating fixed MS2 formulae on bounded hypertree width structures is NP-hard

Unlike treewidth, hypertree width is not useful in the MS2 setting

Limits to practical applications

If the width is not low, computing a hypertree decomposition is a very demanding task

e.g., we failed in finding a decomposition for a CSP with ~800 constraints (atoms), encountered in a NASA project
Solution: develop heuristics for computing hypertree decompositions, even if not the minimum width ones

Even the minimum width hypertree decompositions are not equivalent for real-world applications

e.g., it is not possible to answer database queries efficiently looking only at the structure of the hypergraph, without considering the data in the relations
Solution: we need a more refined notion that identifies our "preferred decompositions"

Are they equivalent?

[Figure: alternative hypertree decompositions of the same hypergraph, with nodes labelled by sets of hyperedges among H1, …, H8.]

Two Possible Approaches to Query Optimization

1. Data oriented:
  Maintain a dictionary with information on the data
  Exploit this information for finding the best query plan

2. Structure oriented:
  Look for structural properties of the query
  Possibly, answer the query in (output) polynomial time

Data or Structure?

All commercial DBMSs are equipped with data-oriented query planners

Many important theoretical papers deal with structural properties of queries

Wide tractability classes have been identified

Why?

Data Oriented:
  Good amount of statistical information available
  Efficient algorithms (mainly heuristics)
  Simple kind of queries (typically short queries)

Structure Oriented:
  Polynomial upper bound on the cost of answering the query
  Very good for long queries
  Able to deal with databases without information on the data

Can we combine these separate worlds?

What is the best choice?

[Figure: two alternative decomposition trees over the same nine atoms: p1(X,Y), p2(X,Z,W), p3(X,Z), p4(Z,S), p5(Y,T,X), p6(Y,X,L), p7(T,W), p8(T,R), p9(W,O).]

Some data oriented planners

Dynamic programming
Greedy
Iterative Dynamic Programming [Kossmann & Stocker, TODS, 2000]

Restrictions of the search space (number of candidate plans for n atoms):

Bushy tree: $\binom{2(n-1)}{n-1}\,(n-1)!$, i.e. 518,918,400 with 9 atoms

Left-deep tree: $n!$, i.e. 362,880 with 9 atoms
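The two figures quoted for 9 atoms follow directly from the counting formulas. A short arithmetic check, assuming the bushy-tree count is the binomial coefficient C(2(n-1), n-1) times (n-1)! and the left-deep count is n! (function names are illustrative):

```python
from math import comb, factorial


def bushy_plans(n):
    """Number of bushy join trees over n atoms: C(2(n-1), n-1) * (n-1)!."""
    return comb(2 * (n - 1), n - 1) * factorial(n - 1)


def left_deep_plans(n):
    """Number of left-deep join trees over n atoms: n!."""
    return factorial(n)


print(bushy_plans(9), left_deep_plans(9))
```

Even at 9 atoms the bushy search space is more than a thousand times larger than the left-deep one, which is why planners restrict the space or resort to heuristics.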

Weighted Hypertree Decompositions

• Hypertree decompositions having k-bounded width are no longer equivalent
• We want to find the best ones
• We need a way for weighting decompositions according to a given criterion

Hypertree Weighting Functions

Let H be a hypergraph. ω_H is any polynomial-time function that maps each hypertree decomposition HD = <T,χ,λ> of H to a real number, called the weight of HD.

Example: ω_H(HD) = max_{p ∈ vertices(T)} |λ(p)|

Example

Q0: ans ← s1(A,B,D) ∧ s2(B,C,D) ∧ s3(B,E) ∧ s4(D,G) ∧ s5(E,F,G) ∧ s6(E,H) ∧ s7(F,I) ∧ s8(G,J).

Minimum Lexicographical Decomposition

$\omega^{lex}_H(HD) = \sum_{i \ge 1} |\{p \in vertices(T) \text{ s.t. } |\lambda(p)| = i\}| \times B^{i-1}$, where $B = |edges(H)| + 1$

$\omega^{lex}_H(HD') = 6 \times 9^0 + 1 \times 9^1$

$\omega^{lex}_H(HD'') = 4 \times 9^0 + 3 \times 9^1$

[Figures: the two decompositions HD' and HD'' of H(Q0).]
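The lexicographic weight is easy to evaluate once the label sizes |λ(p)| of a decomposition are known. A minimal sketch, assuming the label sizes implied by the two expressions above (six nodes of size 1 and one of size 2 for HD'; four of size 1 and three of size 2 for HD''); `lex_weight` is a hypothetical helper name.

```python
def lex_weight(label_sizes, num_edges):
    """Sum of B**(|lambda(p)| - 1) over all nodes p, with B = |edges(H)| + 1."""
    base = num_edges + 1
    return sum(base ** (size - 1) for size in label_sizes)


# Q0 has 8 atoms, hence B = 9.
hd1 = [1, 1, 1, 1, 1, 1, 2]   # label sizes assumed for HD'
hd2 = [1, 1, 1, 1, 2, 2, 2]   # label sizes assumed for HD''
print(lex_weight(hd1, 8), lex_weight(hd2, 8))
```

Because B exceeds the number of hyperedges, one extra node of size i outweighs any number of nodes of smaller size that a decomposition could plausibly have, so comparing weights amounts to comparing the size histograms lexicographically from the largest size down.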

Classes of Hypertree decompositions

kHD_H: all hypertree decompositions of H having width at most k

kNHD_H: all hypertree decompositions of H in normal form having width at most k

Recall that deciding whether kHD_H ≠ ∅ is in LOGCFL, as well as deciding whether kNHD_H ≠ ∅

Minimal Hypertree Decompositions

Let H be a hypergraph, ω_H be a weighting function, and C_H be a class of hypertree decompositions.

HD ∈ C_H is [ω_H, C_H]-minimal if there exists no HD' ∈ C_H such that ω_H(HD') < ω_H(HD)

Global weighting functions

Not surprisingly, if we have no restriction on weighting functions, computing a [ω_H, kHD_H]-minimal hypertree decomposition is NP-hard

Hardness holds even for k=1, i.e., just looking for the best join trees of an acyclic hypergraph.

We need simpler, yet useful weighting functions

Vertex aggregation function

A first attempt of a "less global" evaluation:
- weight each vertex separately
- sum these values

Λ_H^v(HD) = Σ_{p ∈ vertices(T)} v_H(p)

Example: the lexicographical weighting function ω_H^lex

Tree aggregation functions

More general and powerful:
- weight vertices
- weight pairs of adjacent vertices
- generalize + to "any" operator ⊕ such that the underlying structure is a semiring

F_H^{⊕,v,e}(HD): the ⊕-aggregation of the vertex weights v_H(p) and of the edge weights e_H(p,q) over the decomposition tree T

Ex: F_H^{max,v,⊥}, where v_H(p) = |λ(p)|

Ex: F_H^{max,v,e^sep}, where e_H^sep(p,q) = |χ(p) ∩ χ(q)|

Negative results

Computing a [F_H^{⊕,v,e}, kHD_H]-minimal decomposition is NP-hard

• Hardness holds even if F_H^{⊕,v,e} is a vertex aggregation function
• We thus have to act on the second source of complexity: the class of possible decompositions we are interested in

Normal Form Hypertree Decompositions

Positive results

Computing a [F_H^{⊕,v,e}, kNHD_H]-minimal decomposition is in LOGCFL

Computing a [F_H^{⊕,v,e}, kNHD_H]-minimal decomposition is LOGCFL-hard

This problem is thus parallelizable (if v and e are parallelizable)

We have a precise characterization of its complexity

Note that the possible LOGCFL-hardness of treewidth and hypertree width is still an open problem

Hypertrees and Query Plans

[Figure: a hypertree decomposition whose node labels are, from top to bottom:]

j(J,X,Y,X',Y')

a(S,X,X',C,F), b(S,Y,Y',C',F')

j(_,X,Y,_,_), c(C,C',Z)        j(_,_,_,X',Y'), f(F,F',Z')

d(X,Z)    e(Y,Z)    g(X',Z'), f(F,_,Z')    h(Y',Z')

p(B,X',F)    q(B',X',F)

It is just a partial specification!

A Tree Aggregation Function for Query Optimization

Let HD = <T,χ,λ> be a hypertree decomposition for H(Q)

For each vertex p, let E(p) be the relational expression associated with it

v*_H(Q)(p) is the estimated cost of evaluating E(p)

e*_H(Q)(p,q) is the estimated cost of evaluating E(p) ⋉ E(q)

cost_H(Q) = F_H(Q)^{+,v*,e*}
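One natural reading of this cost function is a plain sum of per-vertex and per-edge estimates over the decomposition tree. The sketch below assumes that reading; the tree, the cost figures, and the name `tree_aggregate` are all hypothetical and are not taken from the slides.

```python
import operator
from functools import reduce


def tree_aggregate(tree, vertex_weight, edge_weight, combine=operator.add):
    """Combine all vertex weights and all (parent, child) edge weights."""
    weights = [vertex_weight[p] for p in tree]
    weights += [edge_weight[(p, q)] for p, kids in tree.items() for q in kids]
    return reduce(combine, weights)


# Hypothetical decomposition tree (parent -> children) and cost estimates.
tree = {"root": ["left", "right"], "left": [], "right": []}
v_star = {"root": 10, "left": 4, "right": 6}            # cost of evaluating E(p)
e_star = {("root", "left"): 3, ("root", "right"): 2}    # cost of combining E(p) and E(q)

total_cost = tree_aggregate(tree, v_star, e_star)        # the operator is +
largest = tree_aggregate(tree, v_star, {k: 0 for k in e_star}, combine=max)
print(total_cost, largest)
```

Passing `max` instead of `+`, with the edge weights zeroed out, gives a width-like measure, which is how a single aggregation scheme can cover both structural and cost-based weightings.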

Example

Consider again the conjunctive query:

ans ← a(S,X,X',C,F) ∧ b(S,Y,Y',C',F') ∧ c(C,C',Z) ∧ d(X,Z) ∧ e(Y,Z) ∧ f(F,F',Z') ∧ g(X',Z') ∧ h(Y',Z') ∧ j(J,X,Y,X',Y') ∧ p(B,X',F) ∧ q(B',X',F)

Quantitative Information

Cardinality of relations in the Database

<ATOM>             <CARDINALITY>
a(S,X,Xp,C,F)      4606
b(S,Y,Yp,Cp,Fp)    2808
c(C,Cp,Z)          1748
d(X,Z)             3756
e(Y,Z)             3554
f(F,Fp,Zp)         2892
g(Xp,Zp)           4573
h(Yp,Zp)           3390
j(J,X,Y,Xp,Yp)     4234

Attribute selectivity

<ATOM>             <VARIABLE>   <SELECTIVITY>
a(S,X,Xp,C,F)      S            14
a(S,X,Xp,C,F)      X            24
a(S,X,Xp,C,F)      Xp           16
a(S,X,Xp,C,F)      C            21
a(S,X,Xp,C,F)      F            15
b(S,Y,Yp,Cp,Fp)    S            17
b(S,Y,Yp,Cp,Fp)    Y            5
b(S,Y,Yp,Cp,Fp)    Yp           12
b(S,Y,Yp,Cp,Fp)    Cp           20
b(S,Y,Yp,Cp,Fp)    Fp           7
c(C,Cp,Z)          C            18
c(C,Cp,Z)          Cp           7
c(C,Cp,Z)          Z            19
d(X,Z)             X            18
d(X,Z)             Z            7
e(Y,Z)             Y            21
e(Y,Z)             Z            13

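Example: k = 2

[Figure: weighted decomposition for k = 2; value shown: 3,521,741]

Example: k = 3

[Figure: weighted decomposition for k = 3; value shown: 1,373,879]

Example: k = 4

[Figure: weighted decomposition for k = 4; value shown: 854,867]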

Weighted Hypertree Decompositions for DBMSs

Hypertrees vs Oracle: TPC-H queries

Inside PostgreSQL

Ongoing and Future Work

Thorough experimentation activity
Heuristics
Open problem on the Special Condition (related to the Sandwich Problem)

Some open questions

Special condition:
  It is not necessary for query tractability

Special Condition

[Figure: the hypertree decomposition from the "Hypertrees and Query Plans" slide, with the same node labels, annotated as follows.]

Each variable that disappeared at some vertex v does not appear in the subtrees rooted at v

Some open questions

Special condition:
  It is not necessary for query tractability (Generalized Hypertree Decomposition)
  Is it necessary for computing hypertree decompositions?

When are conjunctive queries tractable?
  Is a class of queries tractable iff it has bounded hypertree width?

The Hypergraph Sandwich Problem

The Sandwich Problem

H1 ≤ H2 if each hyperedge of H1 is included in some hyperedge of H2

Given two hypergraphs Hc and Hr, is there an acyclic hypergraph H s.t. Hc ≤ H ≤ Hr?

Generalized Hypertree Width reduces to the Sandwich problem

An Hr hyperedge for each union of k hyperedges of Hc
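The ordering and the construction of Hr can be sketched in a few lines. This is an illustration only: the triangle used as Hc is a hypothetical input, and `leq` and `k_unions` are names introduced here.

```python
from itertools import combinations


def leq(h1, h2):
    """H1 <= H2: every hyperedge of H1 is included in some hyperedge of H2."""
    return all(any(e <= f for f in h2) for e in h1)


def k_unions(hc, k):
    """One hyperedge of Hr for each union of k hyperedges of Hc."""
    return {frozenset().union(*combo) for combo in combinations(hc, k)}


# Hypothetical input: a triangle, which is cyclic.
hc = [frozenset("AB"), frozenset("BC"), frozenset("AC")]
hr = k_unions(hc, 2)
# A candidate sandwich hypergraph: a single hyperedge, trivially acyclic.
candidate = [frozenset("ABC")]
print(leq(hc, candidate), leq(candidate, hr))
```

Here the candidate sits between Hc and Hr and is acyclic, which corresponds to the triangle having generalized hypertree width at most 2.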

Structural methods without Structural decompositions

The Core

The core of a query Q is a query Q' s.t.:
1. atoms(Q') ⊆ atoms(Q)
2. There is a mapping h: var(Q) → var(Q') s.t., ∀ r(X) ∈ atoms(Q), r(h(X)) ∈ atoms(Q')
3. There is no query Q'' satisfying 1 and 2 and such that atoms(Q'') ⊂ atoms(Q')

Example:

[Figure: an example query Q, drawn as a graph with numbered nodes, and its core Q'.]

The core Q' is unique (up to isomorphism) and it is equivalent to Q

CORE is NP-hard

Deciding whether Q' is the core of Q is NP-hard (well-known)

For instance, let 3COL be the class of all 3-colourable graphs containing a triangle
Clearly, deciding whether G ∈ 3COL is NP-hard
It is easy to see that G ∈ 3COL ⇔ K3 is the core of G

[Figure: the example graphs Q and Q' again.]
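The equivalence G ∈ 3COL ⇔ K3 is the core of G can be tried on small graphs by brute force. The sketch below relies on that stated equivalence rather than computing cores directly; it is exponential in the number of vertices, consistent with the NP-hardness claim, and all function names are illustrative.

```python
from itertools import combinations, product


def three_colourable(vertices, edges):
    """Brute force: try every assignment of 3 colours to the vertices."""
    for colours in product(range(3), repeat=len(vertices)):
        colour = dict(zip(vertices, colours))
        if all(colour[u] != colour[v] for u, v in edges):
            return True
    return False


def has_triangle(vertices, edges):
    """True iff some three vertices are pairwise adjacent."""
    adjacent = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in adjacent for pair in combinations(trio, 2))
        for trio in combinations(vertices, 3)
    )


def k3_is_core(vertices, edges):
    """Membership in 3COL as defined on the slide: 3-colourable with a triangle."""
    return has_triangle(vertices, edges) and three_colourable(vertices, edges)


# Hypothetical test graphs.
triangle_with_tail = ([1, 2, 3, 4], [(1, 2), (2, 3), (1, 3), (3, 4)])
k4 = ([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
square = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
print(k3_is_core(*triangle_with_tail), k3_is_core(*k4), k3_is_core(*square))
```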

Promise Problems, treewidth, cores, and tractability

Tractability (promise version): "CSP(𝒜) is in polynomial time for an arbitrary class 𝒜 of problems if there is a polynomial time algorithm that,
- if its input consists of (the encoding of) two structures A ∈ 𝒜 and B, correctly decides whether there is a homomorphism from A to B;
- if the input is not of this form, then the answer of the algorithm may be arbitrary." (Grohe 03)

A sufficient condition: Let w ≥ 1 and 𝒜 be a class of structures such that the core of each structure in 𝒜 has tree-width at most w. Then CSP(𝒜) is in polynomial time. (Dalmau, Kolaitis, and Vardi 02)

It is necessary, for fixed-arity structures: Assume that FPT ≠ W[1]. Then for every recursively enumerable class 𝒜 of structures of bounded arity, the problem CSP(𝒜) is in polynomial time if, and only if, the cores of all structures in 𝒜 have bounded tree-width. (Grohe 03)

Some observations

The tractability result is definitely useful if we have some guarantee that our instances belong to the class 𝒜

In general, this is not the case, and checking this membership is NP-hard

If we do not have such a guarantee, (in general) the polynomial time algorithm for CSP(𝒜) gives neither a checkable justification for its answers, nor a way to compute a solution (unless P=NP)

An example

Consider such an algorithm Alg and take the class 3COL of 3-colourable graphs containing a triangle:

The core of any graph in 3COL has treewidth 2 (it is the triangle), therefore CSP(3COL) is in polynomial time (promise version)

Take a graph G: if Alg(G,K3) says "yes", this could be either
  correct, if G belongs to 3COL, or
  wrong, if G does not (and hence the answer of Alg is arbitrary, as the given instance does not belong to the class 𝒜 = 3COL at hand)

Thus, if Alg provides a (PTIME checkable) certificate for its answer (e.g., a solution for the instance (G,K3)), then P=NP

Why that?

Intuitively, one can try to feed the algorithm with the graph G plus "something", in order to build a solution node-by-node, calling Alg many times

What is wrong with that?
Such a "something" in general may lead us out of the class 𝒜 we started with (we may lose a possible initial guarantee of membership in 𝒜)

However, it works if the class has the property of being closed under adding such a "something"

Pebble Games and no-Promise Problems

Tractability (no-promise version): CSP(𝒜) is in polynomial time if there is a polynomial time algorithm that, given a pair (A,B),
- says "yes" only if (A,B) is a "yes" CSP instance;
- says "no" only if (A,B) is a "no" CSP instance;
- says "don't know" only if A does not belong to 𝒜.

A nice result: Let w ≥ 1 and 𝒜 be a class of structures having generalized hypertree width at most w. Then CSP(𝒜) is in polynomial time. (Chen and Dalmau, CP'05)

Notes:
  The algorithm does not involve computing the generalized hypertree width (which in fact would be NP-hard)
  The proof exploits pebble games
  The algorithm is based on a consistency-like procedure

The algorithm (database view)

ans ← a(S,X,T,R) ∧ b(S,Y,U,P) ∧ c(T,U,Z) ∧ e(Y,Z) ∧ g(X,Y) ∧ f(R,P,V) ∧ d(W,X,Z)

[Figure: the query hypergraph on vertices V, P, R, S, X, Y, Z, T, U, W.]

Compute the joins of all sets of (at most) k relations

Iteratively, remove tuples enforcing local consistency until
  some relation is empty, whence Output "No"
  or a fixpoint is reached

Try to compute a solution tuple by tuple
  If this fails, Output "Don't know"
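The first two steps of the procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm from the cited paper: relations are modelled as (schema, set of tuples) pairs, local consistency is enforced by pairwise semijoins, and the final solution-building step is left out. All function names are hypothetical.

```python
from itertools import combinations


def join(r1, r2):
    """Natural join of two relations given as (schema, set of tuples)."""
    (s1, t1), (s2, t2) = r1, r2
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    rows = set()
    for x in t1:
        for y in t2:
            if all(x[s1.index(a)] == y[s2.index(a)] for a in common):
                rows.add(x + tuple(y[s2.index(a)] for a in extra))
    return (tuple(s1) + tuple(extra), rows)


def semijoin(r1, r2):
    """Keep the tuples of r1 that agree with some tuple of r2 on shared attributes."""
    (s1, t1), (s2, t2) = r1, r2
    common = [a for a in s1 if a in s2]
    seen = {tuple(y[s2.index(a)] for a in common) for y in t2}
    return (s1, {x for x in t1 if tuple(x[s1.index(a)] for a in common) in seen})


def k_local_consistency(relations, k):
    """Return "No" if some view becomes empty, else "fixpoint"."""
    views = []
    for size in range(1, k + 1):
        for combo in combinations(relations, size):
            view = combo[0]
            for r in combo[1:]:
                view = join(view, r)
            views.append(view)
    if any(not rows for _, rows in views):
        return "No"
    changed = True
    while changed:
        changed = False
        for i in range(len(views)):
            for j in range(len(views)):
                if i == j:
                    continue
                reduced = semijoin(views[i], views[j])
                if len(reduced[1]) < len(views[i][1]):
                    views[i] = reduced
                    changed = True
                if not views[i][1]:
                    return "No"
    return "fixpoint"


# Hypothetical instance: 2-colouring a triangle, which has no solution.
neq2 = {(0, 1), (1, 0)}
triangle = [(("X", "Y"), neq2), (("Y", "Z"), neq2), (("X", "Z"), neq2)]
print(k_local_consistency(triangle, 1), k_local_consistency(triangle, 2))
```

On this instance, consistency over single relations reaches a fixpoint without detecting the inconsistency, while consistency over joins of two relations empties a view and answers "No".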

A database view (and an alternative proof)

ans ← a(S,X,T,R) ∧ b(S,Y,U,P) ∧ c(T,U,Z) ∧ e(Y,Z) ∧ g(X,Y) ∧ f(R,P,V) ∧ d(W,X,Z)

[Figure: the query hypergraph on vertices V, P, R, S, X, Y, Z, T, U, W.]

If there is a generalized hypertree decomposition of width at most k (in this example, k=2)

… enforcing pairwise consistency among all k-unions

a(S,X,T,R), b(S,Y,U,P)    g(X,Y), a(S,X,T,R)    g(X,Y), b(S,Y,U,P)
f(R,P,V)    g(X,Y), c(T,Z,U)    g(X,Y)    a(S,X,T,R), c(T,Z,U)
g(X,Y), d(W,X,Z)    b(S,Y,U,P), c(T,Z,U)    f(R,P,V), c(T,Z,U)

… means enforcing it on any subset of them

ab(S,X,T,R,Y,U,P)    ga(X,Y,S,T,R)    gb(X,Y,S,U,P)
f(R,P,V)    gc(X,Y,T,Z,U)    g(X,Y)    ac(S,X,T,R,Z,U)
gd(X,Y,W,Z)    bc(S,Y,U,P,T,Z)    fc(R,P,V,T,Z,U)

… even on those that may be arranged to form a generalized hypertree decomposition

ab(S,X,T,R,Y,U,P)
f(R,P,V)    gc(X,Y,T,Z,U)
gd(X,Y,W,Z)

Therefore,

• From hd properties, the resulting query is acyclic and equivalent to the original one
• By construction, it is locally consistent
• Thus, it is globally consistent, because it is acyclic

[Figure: the decomposition with nodes ab(S,X,T,R,Y,U,P); f(R,P,V) and gc(X,Y,T,Z,U); gd(X,Y,W,Z).]

What about the converse?

Let Q be a query:
  We know that k-local consistency entails global consistency if ghw(Q) ≤ k
  Is it the case that k-local consistency entails global consistency only if ghw(Q) ≤ k? (which means: for each database, …)

To decompose or not to decompose?

It depends on the applications…
Computing a (generalized) hypertree decomposition may be viewed as a clever way to achieve global consistency

With a decomposition:
  we compute the full reducer in O(m n^k log n)
  from the full reducer it is easy to enumerate all solutions
  but we have to add the cost of computing the decomposition

Without a decomposition:
  we decide the problem in O(m^{2k} n^{2k} log n)
  we compute a solution in O(m^{2k+1} n^{2k} log n)

Conclusion

Several hot open problems on uniform homomorphism remain to be solved.

For papers and further material see:

http://www.deis.unical.it/scarcello/Hypertrees/

and

http://www.dbai.tuwien.ac.at/staff/proj/hypertree/

and here…