Upload: anabel-terry

Post on 24-Dec-2015

Logical Query Languages

Motivation:

1. Logical rules extend more naturally to recursive queries than does relational algebra.

Used in SQL recursion.

2. Logical rules form the basis for many information-integration systems and applications.

Datalog

• Uses first-order predicate logic both to represent knowledge and as a language for expressing operations on relations.

• Example:

boss(E,M) :- manages(E,M)

boss(E,M) :- boss(E,N) & manages(N,M)

Substitute constants for the variables E, N, and M. If the substitution makes the right side true, then the left side must also be true.

Datalog Example

Likes(drinker, beer)
Sells(bar, beer, price)
Frequents(drinker, bar)

Happy(d) <- Frequents(d,bar) AND Likes(d,beer) AND Sells(bar,beer,p)

• Above is a rule.

• Left side = head.

• Right side = body = AND of subgoals.

• Head and subgoals are atoms.

Atom = predicate and arguments.
Predicate = relation name or arithmetic predicate, e.g. <.
Arguments are variables or constants.

• Subgoals (not head) may optionally be negated by NOT.

Meaning of Rules

Head is true of its arguments if there exist values for local variables (those in body, not in head) that make all of the subgoals true.

• If no negation or arithmetic comparisons, just natural join the subgoals and project onto the head variables.

Example

Above rule equivalent to Happy(d) = π_drinker(Frequents ⋈ Likes ⋈ Sells)
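
The join-and-project reading of the Happy rule can be sketched in Python. This is only an illustration: the sample tuples are hypothetical (the slides give the schemas but no data), and the nested comprehension stands in for the natural join.

```python
# Hypothetical sample data; only the relation schemas come from the slides.
frequents = {("ann", "Joe's Bar"), ("bob", "Sue's Bar")}
likes = {("ann", "Bud"), ("bob", "Miller")}
sells = {("Joe's Bar", "Bud", 2.50), ("Sue's Bar", "Bud", 3.00)}

# Happy(d) <- Frequents(d,bar) AND Likes(d,beer) AND Sells(bar,beer,p)
# Join the three relations on the shared variables, then keep only d.
happy = {
    d
    for (d, bar) in frequents
    for (d2, beer) in likes
    for (bar2, beer2, _price) in sells
    if d == d2 and bar == bar2 and beer == beer2
}
print(happy)
```

Here bob is not happy because the bar he frequents does not sell a beer he likes.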

Evaluation of Rules

Two, dual, approaches:

1. Variable-based: Consider all possible assignments of values to variables. If all subgoals are true, add the head to the result relation.

2. Tuple-based: Consider all assignments of tuples to subgoals that make each subgoal true. If the variables are assigned consistent values, add the head to the result.

Example: Variable-Based Assignment

S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)

R =
A B
1 2
2 3

• Only assignments that make first subgoal true:

1. x → 1, z → 2.
2. x → 2, z → 3.

• In case (1), y → 3 makes second subgoal true. Since (1,3) is not in R, the third subgoal is also true. Thus, add (x,y) = (1,3) to relation S.

• In case (2), no value of y makes the second subgoal true. Thus, S =

A B
1 3
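
A minimal sketch of variable-based evaluation for this rule follows. It assumes the variables only need to range over the values that occur in R, which is harmless here because x, y, and z each appear in a positive R subgoal.

```python
from itertools import product

R = {(1, 2), (2, 3)}

# Assumption: try only values that occur somewhere in R.
domain = {value for row in R for value in row}

# S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)
S = set()
for x, y, z in product(domain, repeat=3):
    # Keep the head tuple when every subgoal is true under this assignment.
    if (x, z) in R and (z, y) in R and (x, y) not in R:
        S.add((x, y))
print(S)
```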

Example: Tuple-Based Assignment

Trick: start with the positive (not negated), relational (not arithmetic) subgoals only.

S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)

R =
A B
1 2
2 3

• Four assignments of tuples to subgoals:

R(x,z)  R(z,y)
(1,2)   (1,2)
(1,2)   (2,3)
(2,3)   (1,2)
(2,3)   (2,3)

• Only the second gives a consistent value to z.

• That assignment also makes NOT R(x,y) true.

• Thus, (1,3) is the only tuple for the head.
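
The tuple-based approach can be sketched the same way. This is an illustrative reading of the slide, not a reference implementation: pair up tuples for the two positive subgoals, keep the pairs that agree on z, then test the negated subgoal.

```python
from itertools import product

R = {(1, 2), (2, 3)}

# S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)
S = set()
for (x, z_first), (z_second, y) in product(R, repeat=2):
    if z_first != z_second:
        continue  # the two tuples give z inconsistent values
    if (x, y) not in R:  # negated subgoal, checked last
        S.add((x, y))
print(S)
```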

Datalog Programs

• A collection of rules is a Datalog program.

• Predicates/relations divide into two classes:

EDB = extensional database = relation stored in DB.
IDB = intensional database = relation defined by one or more rules.

• A predicate must be IDB or EDB, not both. Thus, an IDB predicate can appear in the body or head of a rule; EDB only in the body.

Example

Convert the following SQL (Find the manufacturers of the beers Joe sells):

Beers(name, manf)
Sells(bar, beer, price)

SELECT manf
FROM Beers
WHERE name IN (
    SELECT beer
    FROM Sells
    WHERE bar = 'Joe''s Bar'
);

to a Datalog program.

JoeSells(b) <- Sells('Joe''s Bar', b, p)
Answer(m) <- JoeSells(b) AND Beers(b,m)

• Note: Beers, Sells = EDB; JoeSells, Answer = IDB.
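
As a rough illustration, the two-rule program can be evaluated one IDB predicate at a time. The tuples below are made up for the example; only the schemas and the rules come from the slide.

```python
# Hypothetical EDB contents.
beers = {("Bud", "ManfA"), ("Miller", "ManfB")}
sells = {("Joe's Bar", "Bud", 2.50), ("Sue's Bar", "Miller", 3.00)}

# JoeSells(b) <- Sells('Joe''s Bar', b, p)
joe_sells = {beer for (bar, beer, _price) in sells if bar == "Joe's Bar"}

# Answer(m) <- JoeSells(b) AND Beers(b,m)
answer = {manf for (name, manf) in beers if name in joe_sells}
print(answer)
```

JoeSells plays the role of the SQL subquery, and Answer plays the role of the outer SELECT.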

• sibling(X,Y) :- parent(X,Z) & parent(Y,Z) & X <> Y.

• cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & sibling(Xp,Yp).

• cousin(X,Y) :- parent(X,Xp) & parent(Y,Yp) & cousin(Xp,Yp).

• related(X,Y) :- sibling(X,Y).

• related(X,Y) :- related(X,Z) & parent(Y,Z).

• related(X,Y) :- related(Z,Y) & parent(X,Z).

Safety

A rule can make no sense if variables appear in funny ways.

Examples

• S(x) <- R(y)

• S(x) <- NOT R(x)

• S(x) <- R(y) AND x < y

In each of these cases, the result is infinite, even if the relation R is finite.

• To make sense as a database operation, we need to require three things of a variable x (= definition of safety). If x appears in either

1. The head,
2. A negated subgoal, or
3. An arithmetic comparison,

then x must also appear in a nonnegated, “ordinary” (relational) subgoal of the body.

• We insist that rules be safe, henceforth.

Safety (Contd.)

• Avoid rules that create infinite relations from finite ones by insisting that each variable appearing in the rule be “limited.”

Formally, a variable is limited if it is:

1. Any variable that appears as an argument in an ordinary predicate of the body;
2. Any variable X that appears in a subgoal X = a or a = X, where a is a constant;
3. Any variable X that appears in a subgoal X = Y or Y = X, where Y is a variable already known to be limited.

Safety (Contd.)

• P(X,Y) :- q(X,Z) & W = a & Y = W.

X and Z are limited by rule (1) because of the first subgoal in the body. W is limited by rule (2) because of the second subgoal, and therefore rule (3) tells us Y is limited because of the third subgoal.
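
The three conditions for limited variables can be sketched as a small closure computation. The representation is an assumption made for this example: terms are strings, variables start with an uppercase letter, and constants start with a lowercase letter. The helper names are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are capitalized, constants are not.
    return term[:1].isupper()

def limited_variables(ordinary_args, equalities):
    """ordinary_args: argument tuples of the ordinary (relational) subgoals.
    equalities: (left, right) pairs for subgoals of the form left = right."""
    # Condition (1): variables in ordinary subgoals are limited.
    limited = {t for args in ordinary_args for t in args if is_var(t)}
    changed = True
    while changed:
        changed = False
        for left, right in equalities:
            for var, other in ((left, right), (right, left)):
                if is_var(var) and var not in limited:
                    # Condition (2): equated to a constant.
                    # Condition (3): equated to an already limited variable.
                    if not is_var(other) or other in limited:
                        limited.add(var)
                        changed = True
    return limited

# P(X,Y) :- q(X,Z) & W = a & Y = W.
limited = limited_variables([("X", "Z")], [("W", "a"), ("Y", "W")])
print(sorted(limited))  # ['W', 'X', 'Y', 'Z']
```

The loop repeats until no new variable becomes limited, so the order in which the equality subgoals are written does not matter.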

Evaluating Nonrecursive Rules

Involves two steps:

1. Compute the relation defined by a rule body

2. Compute the relation for the nonrecursive predicate (the head of the rule)

• Algorithm 3.1: compute a relation for a rule body using relational algebra operations.

• Algorithm 3.2: evaluating nonrecursive rules using relational algebra operations.

(refer to handouts)

Rectified Rules

• Before applying Alg. 3.2, we rectify the rules.

• The purpose of rectifying the rules is to make all heads of the rules for a predicate p identical and of the form p(X1, ..., Xk) for distinct variables X1, ..., Xk.

• Consider all rules with p in the head, compute the relations for these rules, project onto the variables appearing in the heads and take the union.

Rectification (Contd.)

• Example: consider the predicate p defined by the rules

p(a,X,Y) :- r(X,Y).

p(X,Y,X) :- r(Y,X).

We rectify these rules by making both heads be p(U,V,W) and adding subgoals as follows.

p(U,V,W) :- r(X,Y) & U=a & V=X & W=Y.

p(U,V,W) :- r(Y,X) & U=X & V=Y & W=X.

Next, substituting for X, Y one of the new variables U,V, or W, as appropriate, we get

p(U,V,W) :- r(V,W) & U=a.

p(U,V,W) :- r(V,U) & W=U.
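
As a quick sanity check, one can evaluate the original rules and the rectified rules over the same relation and compare the results. The contents of r are hypothetical, and the constant a is represented by the string "a".

```python
# Hypothetical contents of r.
r = {(1, 2), (3, 3)}
a = "a"

# Original rules:
#   p(a,X,Y) :- r(X,Y).
#   p(X,Y,X) :- r(Y,X).
p_original = {(a, x, y) for (x, y) in r} | {(x, y, x) for (y, x) in r}

# Rectified rules:
#   p(U,V,W) :- r(V,W) & U=a.
#   p(U,V,W) :- r(V,U) & W=U.
p_rule1 = {(a, v, w) for (v, w) in r}
p_rule2 = {(u, v, u) for (v, u) in r}
p_rectified = p_rule1 | p_rule2  # union over all rules with head p

print(p_original == p_rectified)
```

Both versions define the same relation; rectification only changes the form of the heads so that the per-rule relations share one schema and can be unioned.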

Expressive Power of Datalog

• Nonrecursive Datalog = (classical) relational algebra. See discussion in text.

• Datalog simulates SQL select-from-where without aggregation and grouping.

• Recursive Datalog expresses queries that cannot be expressed in SQL without its recursion extension.

• But none of these languages has full expressive power (Turing completeness).

Recursion

• IDB predicate P depends on predicate Q if there is a rule with P in the head and Q in a subgoal.

• Draw a graph: nodes = IDB predicates, arc P → Q means P depends on Q.

• Cycles if and only if recursive.

Recursive Example

Sib(x,y) <- Par(x,p) AND Par(y,p) AND x <> y
Cousin(x,y) <- Sib(x,y)
Cousin(x,y) <- Par(x,xp) AND Par(y,yp) AND Cousin(xp,yp)
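
The cycle test on the dependency graph can be sketched as a reachability check. The dictionary below encodes the Sib/Cousin rules by hand (Par is EDB, so it is not a node); the function name is hypothetical.

```python
# Arc P -> Q when IDB predicate Q appears in the body of a rule with head P.
depends_on = {
    "Sib": set(),                 # body uses only the EDB predicate Par
    "Cousin": {"Sib", "Cousin"},  # from the two Cousin rules
}

def reachable(graph, start):
    """Return all nodes reachable from start by following one or more arcs."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen

# A predicate is recursive if it lies on a cycle, i.e. it can reach itself.
recursive_predicates = {p for p in depends_on if p in reachable(depends_on, p)}
print(recursive_predicates)
```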

Iterative Fixed-Point Evaluates Recursive Rules

[Flowchart: Start with IDB = ∅. Apply rules to IDB, EDB. Change to IDB? If yes, apply the rules again; if no, done.]
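
The loop in the flowchart can be sketched for the Sib/Cousin rules as follows. This is a naive evaluation that recomputes every IDB relation in each round; the Par tuples are hypothetical and are not the EDB used in the slides.

```python
# Hypothetical EDB: Par(child, parent).
par = {("b", "a"), ("c", "a"), ("f", "b"), ("g", "c"), ("j", "f"), ("k", "g")}

def apply_rules(par, sib, cousin):
    """One round: evaluate every rule body against the current IDB and EDB."""
    # Sib(x,y) <- Par(x,p) AND Par(y,p) AND x <> y
    new_sib = {(x, y) for (x, p) in par for (y, q) in par if p == q and x != y}
    # Cousin(x,y) <- Sib(x,y)
    # Cousin(x,y) <- Par(x,xp) AND Par(y,yp) AND Cousin(xp,yp)
    new_cousin = set(sib) | {
        (x, y) for (x, xp) in par for (y, yp) in par if (xp, yp) in cousin
    }
    return new_sib, new_cousin

sib, cousin = set(), set()  # start with IDB = empty
while True:
    new_sib, new_cousin = apply_rules(par, sib, cousin)
    if (new_sib, new_cousin) == (sib, cousin):
        break  # no change to the IDB: done
    sib, cousin = new_sib, new_cousin
print(sorted(cousin))
```

Because these rules contain no negation, each round can only add tuples, so on a finite EDB the loop must stop.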

Example

EDB Par =

[Figure: the relation Par drawn as a graph on the nodes a, b, c, d, e, f, g, h, i, j, k; arcs not shown.]

• Note, because of symmetry, Sib and Cousin facts appear in pairs, so we shall mention only (x,y) when both (x,y) and (y,x) are meant.

          Sib                                Cousin
Initial   ∅                                  ∅
Round 1   add: (b,c), (c,e), (g,h), (j,k)    ∅
Round 2                                      add: (b,c), (c,e), (g,h), (j,k)
Round 3                                      add: (f,g), (f,h), (g,i), (h,i), (i,k)
Round 4                                      add: (k,k), (i,j)

Another example

path(X,Y) :- arc(X,Y).

path(X,Y) :- path(X,Z) & path(Z,Y).

Datalog equation for the relation P corresponding to the path predicate:

P(X,Y) = A(X,Y) ∪ π_{X,Y}(P(X,Z) ⋈ P(Z,Y))

Find a solution to the equation if A ={(1,2), (2,3)}.
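
One way to find a solution is to iterate the equation starting from the empty relation until nothing changes; the result is the least fixed point. A minimal sketch:

```python
A = {(1, 2), (2, 3)}

P = set()
while True:
    # pi_{X,Y}(P(X,Z) join P(Z,Y)): compose P with itself on the shared Z.
    joined = {(x, y) for (x, z) in P for (z2, y) in P if z == z2}
    new_P = A | joined
    if new_P == P:
        break
    P = new_P
print(sorted(P))  # [(1, 2), (1, 3), (2, 3)]
```

The first round yields the two arcs, the second round adds (1,3), and the third round adds nothing.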

Stratified Negation

• Negation wrapped inside a recursion makes no sense.

• Even when negation and recursion are separated, there can be ambiguity about what the rules mean, and some one meaning must be selected.

• Stratified negation is an additional constraint on recursive rules (like safety) that solves both problems:

1. It rules out negation wrapped in recursion.

2. When negation is separate from recursion, it yields the intuitively correct meaning of rules (the stratified model).

Problem with Recursive Negation

Consider:

P(x) <- Q(x) AND NOT P(x)

• Q = EDB = {1,2}.

• Compute IDB P iteratively? Initially, P = ∅. Round 1: P = {1,2}. Round 2: P = ∅, etc., etc.
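
The oscillation is easy to reproduce. The sketch below applies the rule a fixed number of times (four rounds is an arbitrary choice for the illustration) and records P after each round.

```python
Q = {1, 2}

P = set()
history = [set(P)]
for _ in range(4):
    # P(x) <- Q(x) AND NOT P(x), evaluated against the previous round's P.
    P = {x for x in Q if x not in P}
    history.append(set(P))
print(history)  # [set(), {1, 2}, set(), {1, 2}, set()]
```

P never settles, so the iteration has no fixed point to converge to.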

Problem (Contd.)

p(X) :- r(X) & NOT q(X). P = R-Q

q(X) :- r(X) & NOT p(X). Q = R-P

Suppose R consists of a single tuple 1, i.e., R = {1}.

S1 : P = ∅ and Q = {1}.

S2 : P = {1} and Q = ∅.

Both S1 and S2 are solutions to the equations P = R-Q and Q = R-P. Both are minimal fixed points, and the rules don’t have a least fixed point.
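
Since R is tiny, the claim can be checked by brute force: enumerate every pair of subsets of R and keep those that satisfy both equations. The helper `subsets` is introduced only for this illustration.

```python
from itertools import combinations

R = {1}

def subsets(s):
    """All subsets of s, as a list of sets."""
    items = sorted(s)
    return [set(c) for n in range(len(items) + 1) for c in combinations(items, n)]

# Keep the pairs (P, Q) with P = R - Q and Q = R - P.
fixed_points = [
    (P, Q)
    for P in subsets(R)
    for Q in subsets(R)
    if P == R - Q and Q == R - P
]
print(fixed_points)  # [(set(), {1}), ({1}, set())]
```

Neither solution is contained in the other, which is why there is no least fixed point.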

Strata

Intuitively: stratum of an IDB predicate = maximum number of negations you can pass through on the way to an EDB predicate.

• Must not be ∞ in “stratified” rules.

• Define stratum graph:

Nodes = IDB predicates.
Arc P → Q if Q appears in the body of a rule with head P.
Label that arc “–” if Q is in a negated subgoal.

Example

P(x) <- Q(x) AND NOT P(x)

[Figure: stratum graph with the single node P and a self-loop P → P labeled “–”.]

Example

Which target nodes cannot be reached from any source node?

Reach(x) <- Source(x)
Reach(x) <- Reach(y) AND Arc(y,x)
NoReach(x) <- Target(x) AND NOT Reach(x)

[Figure: stratum graph with an arc NoReach → Reach labeled “–” and an unlabeled self-loop on Reach.]

Computing Strata

Stratum of an IDB predicate A = maximum number of “–” arcs on any path from A in the stratum graph.

Examples

• For first example, stratum of P is ∞.

• For second example, stratum of Reach is 0; stratum of NoReach is 1.

Stratified Negation

A Datalog program is stratified if every IDB predicate has a finite stratum.

Stratified Model

If a Datalog program is stratified, we can compute the relations for the IDB predicates lowest-stratum-first.
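
One possible way to compute strata is repeated relaxation over the arcs of the stratum graph. This is a sketch under stated assumptions, not the procedure from the slides: arcs are given as (head, body predicate, negated) triples, and a stratum that grows beyond the number of predicates is treated as infinite, since a finite stratum cannot exceed that bound.

```python
import math

def compute_strata(nodes, arcs):
    """nodes: IDB predicate names. arcs: (p, q, negated) for each arc p -> q."""
    stratum = {n: 0 for n in nodes}
    limit = len(nodes)  # any finite stratum is below this bound
    changed = True
    while changed:
        changed = False
        for p, q, negated in arcs:
            candidate = stratum[q] + (1 if negated else 0)
            if candidate > limit:
                candidate = math.inf  # a cycle through a "-" arc
            if candidate > stratum[p]:
                stratum[p] = candidate
                changed = True
    return stratum

# First example: P(x) <- Q(x) AND NOT P(x)
first = compute_strata(["P"], [("P", "P", True)])

# Second example: the Reach / NoReach rules.
second = compute_strata(
    ["Reach", "NoReach"],
    [("Reach", "Reach", False), ("NoReach", "Reach", True)],
)
print(first, second)
```

A program is stratified exactly when no value in the returned dictionary is infinite.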

Example

Reach(x) <- Source(x)
Reach(x) <- Reach(y) AND Arc(y,x)
NoReach(x) <- Target(x) AND NOT Reach(x)

• EDB:

Source = {1}.
Arc = {(1,2), (3,4), (4,3)}.
Target = {2,3}.

• First compute Reach = {1,2} (stratum 0).

• Next compute NoReach = {3}.

[Figure: nodes 1, 2, 3, 4 with arcs 1 → 2, 3 → 4, 4 → 3; node 1 is the source; nodes 2 and 3 are targets.]
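
Lowest-stratum-first evaluation of this example can be sketched as two steps: run the Reach rules to a fixed point, then apply the NoReach rule once against the finished Reach relation.

```python
source = {1}
arc = {(1, 2), (3, 4), (4, 3)}
target = {2, 3}

# Stratum 0: Reach(x) <- Source(x);  Reach(x) <- Reach(y) AND Arc(y,x)
reach = set()
while True:
    new_reach = source | {x for (y, x) in arc if y in reach}
    if new_reach == reach:
        break
    reach = new_reach

# Stratum 1: NoReach(x) <- Target(x) AND NOT Reach(x)
# Reach is complete by now, so the negation has a definite meaning.
no_reach = {x for x in target if x not in reach}
print(reach, no_reach)  # {1, 2} {3}
```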

SQL Recursion

WITH
    stuff that looks like Datalog rules
    an SQL query about EDB, IDB

• Rule =

[RECURSIVE] R(<arguments>) AS
    SQL query

Example

• Find Sally’s cousins, using EDB Par(child, parent).

WITH
    Sib(x,y) AS
        SELECT p1.child, p2.child
        FROM Par p1, Par p2
        WHERE p1.parent = p2.parent
            AND p1.child <> p2.child,

    RECURSIVE Cousin(x,y) AS
        Sib
            UNION
        (SELECT p1.child, p2.child
         FROM Par p1, Par p2, Cousin
         WHERE p1.parent = Cousin.x
            AND p2.parent = Cousin.y)

SELECT y
FROM Cousin
WHERE x = 'Sally';
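
To try the query, here is a sketch that runs it through Python's built-in sqlite3 module. The syntax is adapted to what SQLite accepts, which differs from the slide: RECURSIVE follows WITH directly, each definition is parenthesized, and the bare Sib is written as SELECT x, y FROM Sib. The Par rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Par (child TEXT, parent TEXT)")
# Hypothetical family: Ann and Bob are siblings; Sally and Tom are their children.
conn.executemany(
    "INSERT INTO Par VALUES (?, ?)",
    [("Sally", "Ann"), ("Tom", "Bob"), ("Ann", "Gran"), ("Bob", "Gran")],
)

query = """
WITH RECURSIVE
    Sib(x, y) AS (
        SELECT p1.child, p2.child
        FROM Par p1, Par p2
        WHERE p1.parent = p2.parent AND p1.child <> p2.child
    ),
    Cousin(x, y) AS (
        SELECT x, y FROM Sib
        UNION
        SELECT p1.child, p2.child
        FROM Par p1, Par p2, Cousin
        WHERE p1.parent = Cousin.x AND p2.parent = Cousin.y
    )
SELECT y FROM Cousin WHERE x = 'Sally'
"""
sallys_cousins = {row[0] for row in conn.execute(query)}
print(sallys_cousins)
```

Other database systems may need further syntax adjustments; the structure of one nonrecursive rule plus one recursive rule stays the same.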

Plan for Describing Legal SQL Recursion

• Define “monotonicity,” a property that generalizes “stratification.”

• Generalize stratum graph to apply to SQL queries instead of Datalog rules. (Non)monotonicity replaces NOT in subgoals.

• Define semantically correct SQL recursions in terms of stratum graph.

Monotonicity

If relation P is a function of relation Q (and perhaps other things), we say P is monotone in Q if adding tuples to Q cannot cause any tuple of P to be deleted.

Monotonicity Example

In addition to certain negations, an aggregation can cause nonmonotonicity.

Sells(bar, beer, price)

SELECT AVG(price)
FROM Sells
WHERE bar = 'Joe''s Bar';

• Adding to Sells a tuple that gives a new beer Joe sells will usually change the average price of beer at Joe’s.

• Thus, the former result, which might be a single tuple like (2.78) becomes another single tuple like (2.81), and the old tuple is lost.
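
The effect can be checked directly on sets of tuples. In this sketch the Sells rows and the two helper functions are hypothetical; a plain selection is included for contrast, since it is monotone.

```python
def avg_price_at(sells, bar):
    """Result of SELECT AVG(price) ... as a set holding one 1-tuple."""
    prices = [price for (b, _beer, price) in sells if b == bar]
    return {(sum(prices) / len(prices),)} if prices else set()

def beers_at(sells, bar):
    """Result of SELECT beer FROM Sells WHERE bar = ..., as a set of 1-tuples."""
    return {(beer,) for (b, beer, _price) in sells if b == bar}

sells = {("Joe's Bar", "Bud", 2.50), ("Joe's Bar", "Miller", 3.00)}
bigger = sells | {("Joe's Bar", "Export", 3.50)}  # add one tuple to Sells

old_avg, new_avg = avg_price_at(sells, "Joe's Bar"), avg_price_at(bigger, "Joe's Bar")
print(old_avg, new_avg)

# Monotone means: every old result tuple is still present after the addition.
avg_is_monotone_here = old_avg <= new_avg
selection_is_monotone_here = beers_at(sells, "Joe's Bar") <= beers_at(bigger, "Joe's Bar")
print(avg_is_monotone_here, selection_is_monotone_here)
```

The old average tuple disappears from the aggregate result, while the selection only grows.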

Generalizing Stratum Graph to SQL

• Node for each relation defined by a “rule.”

• Node for each subquery in the “body” of a rule.

• Arc P → Q if

(a) P is “head” of a rule, and Q is a relation appearing in the FROM list of the rule (not in the FROM list of a subquery), as argument of a UNION, etc.
(b) P is head of a rule, and Q is a subquery directly used in that rule (not nested within some larger subquery).
(c) P is a subquery, and Q is a relation or subquery used directly within P [analogous to (a) and (b) for rule heads].

• Label the arc – if P is not monotone in Q.

• Requirement for legal SQL recursion: finite strata only.

Example

For the Sib/Cousin example, there are three nodes: Sib, Cousin, and SQ (the second term of the union in the rule for Cousin).

• No nonmonotonicity, hence legal.

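[Figure: stratum graph on the nodes Sib, Cousin, and SQ, with arcs Cousin → Sib, Cousin → SQ, and SQ → Cousin; no arc is labeled “–”.]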

A Nonmonotonic Example

Change the UNION to EXCEPT in the rule for Cousin.

RECURSIVE Cousin(x,y) AS
    Sib
        EXCEPT
    (SELECT p1.child, p2.child
     FROM Par p1, Par p2, Cousin
     WHERE p1.parent = Cousin.x
        AND p2.parent = Cousin.y)

• Now, adding to the result of the subquery can delete Cousin facts; i.e., Cousin is nonmonotone in SQ.

• Infinite number of –’s in cycle, so illegal in SQL.

[Figure: stratum graph on the nodes Sib, Cousin, and SQ; the arc Cousin → SQ is labeled “–” and lies on the cycle between Cousin and SQ.]

Another Example: NOT Doesn’t Mean Nonmonotone

Leave Cousin as it was, but negate one of the conditions in the where-clause.

RECURSIVE Cousin(x,y) AS
    Sib
        UNION
    (SELECT p1.child, p2.child
     FROM Par p1, Par p2, Cousin
     WHERE p1.parent = Cousin.x
        AND NOT (p2.parent = Cousin.y))

• You might think that SQ depends negatively on Cousin, but it doesn’t.

If I add a new tuple to Cousin, all the old tuples still exist and yield whatever tuples in SQ they used to yield.
In addition, the new Cousin tuple might combine with old p1 and p2 tuples to yield something new.
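
A small check of this argument: compute the subquery SQ for one Cousin relation and for a larger one, and confirm that no SQ tuple is lost. The Par and Cousin tuples are hypothetical, and `sq` is a helper written only for this illustration.

```python
# Hypothetical EDB: Par(child, parent).
par = {("Sally", "Ann"), ("Tom", "Bob")}

def sq(par, cousin):
    """SELECT p1.child, p2.child FROM Par p1, Par p2, Cousin
    WHERE p1.parent = Cousin.x AND NOT (p2.parent = Cousin.y)"""
    return {
        (c1, c2)
        for (c1, parent1) in par
        for (c2, parent2) in par
        for (x, y) in cousin
        if parent1 == x and not (parent2 == y)
    }

cousin = {("Ann", "Bob")}
bigger_cousin = cousin | {("Bob", "Ann")}  # add one tuple to Cousin

before = sq(par, cousin)
after = sq(par, bigger_cousin)
print(before, after)
print(before <= after)  # every old SQ tuple survives the addition
```

The NOT applies to a condition on individual tuples, not to membership in Cousin, so growing Cousin can only grow SQ.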