Analyzing Ambiguity of Context-Free Grammars

JULY 18, 2007, CIAA'07, PRAGUE

Claus Brabrand, brabrand(at)brics.dk, DAIMI, University of Aarhus
Anders Møller, amoeller(at)brics.dk, DAIMI, University of Aarhus
Robert Giegerich, robert(at)TechFak.Uni-Bielefeld.de, University of Bielefeld, Germany

Page 2: Analyzing Ambiguity of Context-Free Grammars

Outline

- Introduction (and Motivation)
- Characterization of Ambiguity (aka. "Vertical-" and "Horizontal-" Ambiguity)
- Framework (for Analyzing Ambiguity)
- Regular Approximation (AMN)
- Assessment (Applications and Examples)
- Related Work
- Conclusion

Page 3: Analyzing Ambiguity of Context-Free Grammars

Motivation (for CFG Ambiguity)

1. Programming Languages (programming language as CFG):

    STM  : EXP ";"
         | "if" "(" EXP ")" STM
         | "if" "(" EXP ")" STM "else" STM
         | "while" "(" EXP ")" "do" STM

    EXP  : EXP "*" TERM | EXP "/" TERM | TERM

    TERM : TERM "+" FACT | TERM "-" FACT | FACT

    FACT : CONST | VAR

   Example program ("what the programmer intended"):

    int f() {
      if (b)
      if (c)
      f();
      else
      y++;
    }
    ...

2. Models of Real-World Physical Structures (physical structure model as CFG):

    P : "(" P ")" | "(" O ")"

    O : L P | P R | S P S | H

    L : "." L | "."

    R : "." R | "."

    S : "." S | "."

    H : "." H | "." "." "."

   Example input: AACGGAG CGGTGGC ATCGGAT CGACTTT

[Figure: in both settings a grammar G is fed to a parser, once labelled "Unambiguous" and once "Ambiguous"; results are labelled P, P' (programs) and M, M' (models). Further labels: "Engineer", "Computer Scientist", "beneficial...", "lethal...", "prediction of physical structure".]

Page 4: Analyzing Ambiguity of Context-Free Grammars

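Context-Free Grammar Ambiguity

Ambiguity: $\exists\, \omega \in \Sigma^*$: multiple derivation trees.

Ambiguity means there exist derivation trees $T \neq T'$ such that both derive the same string $s$.

[Figure: two distinct derivation trees T and T' over the same string s; a line marked "?" separates the "unambiguous" grammars from the "ambiguous" ones.]

However: Undecidable! i.e., no one can decide this line (the boundary between unambiguous and ambiguous grammars).

However^2…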

Page 5: Analyzing Ambiguity of Context-Free Grammars

However: Conservative Analysis!

Use conservative (over-)approximation:

- "Yes!" means "G guaranteed unambiguous!"
- Safely use any GLR parser on G ...and never get two parses at runtime!

[Figure: a "Yes!" region inside the "unambiguous" side of the unambiguous/ambiguous diagram, containing the grammar G.]

...just because it's undecidable, doesn't mean there aren't (good) conservative approximations! Indeed, the whole area of static analysis works on "side-stepping undecidability".

Page 6: Analyzing Ambiguity of Context-Free Grammars

Conservative Analysis (cont'd)

Undecidability means: "there'll always be a slack" (a "Don't know?" region between "unambiguous" and "ambiguous").

However, still useful! Possible interpretations of "Don't know?":

- Treat as error (reject grammar): "Please redesign your grammar" (as in LR(k))
- Treat as warning: "Here are some potential problems"

Page 7: Analyzing Ambiguity of Context-Free Grammars

Problems with Existing Solutions

1. Hard to reason (locally) about ambiguity: intricate structural property of a grammar.
2. Are "left-to-right" (or "right-to-left") biased: cannot handle "palindromic grammars" (...a serious problem for RNA analysis)!
3. Error messages: hard to "pin-point ambiguity" (in terms of grammar), e.g. "conflicts: 7 shift/reduce, 9 reduce/reduce". Also: would like "shortest examples" for debugging (...especially for grammar non-experts)!

Page 9: Analyzing Ambiguity of Context-Free Grammars

Terminology: Context-Free Grammar

$G = \langle N, \Sigma, s, \pi \rangle$, where:

- $N$: finite set of nonterminals
- $\Sigma$: finite set of terminals
- $s \in N$: start nonterminal
- $\pi : N \to \mathcal{P}(E^*)$: production function, $E = \Sigma \cup N$

$L : E^* \to \mathcal{P}(\Sigma^*)$: "language-of" operator, $L(G)$

Assume (trivially):

- Reachability (all $n \in N$ reachable from $s$)
- Productivity (all $n \in N$ derive some string)

Example:

    EXP : ID
        | EXP '+' EXP
        | EXP '*' EXP
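The two assumptions above (reachability and productivity) can be checked mechanically. The following is a minimal sketch, not the authors' implementation: the dictionary encoding of a grammar and the names `reachable`, `productive`, and `well_formed` are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: a grammar maps each nonterminal to its alternatives,
# each alternative being a tuple of symbols. A symbol counts as a nonterminal
# iff it is a key of the dictionary; everything else is a terminal.
Grammar = Dict[str, List[Tuple[str, ...]]]

EXP_GRAMMAR: Grammar = {
    "EXP": [("ID",), ("EXP", "+", "EXP"), ("EXP", "*", "EXP")],
}


def reachable(grammar: Grammar, start: str) -> Set[str]:
    """Nonterminals reachable from the start nonterminal."""
    seen = {start}
    work = [start]
    while work:
        n = work.pop()
        for alt in grammar[n]:
            for sym in alt:
                if sym in grammar and sym not in seen:
                    seen.add(sym)
                    work.append(sym)
    return seen


def productive(grammar: Grammar) -> Set[str]:
    """Nonterminals that derive at least one terminal string (fixpoint)."""
    prod: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for n, alts in grammar.items():
            if n in prod:
                continue
            # n is productive if some alternative uses only terminals
            # and nonterminals already known to be productive
            if any(all(s not in grammar or s in prod for s in alt) for alt in alts):
                prod.add(n)
                changed = True
    return prod


def well_formed(grammar: Grammar, start: str) -> bool:
    """True iff every nonterminal is both reachable and productive."""
    all_n = set(grammar)
    return reachable(grammar, start) == all_n and productive(grammar) == all_n
```

For the example grammar, `well_formed(EXP_GRAMMAR, "EXP")` holds, whereas a grammar whose only production is `S : S 'a'` fails the productivity check.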

Page 10: Analyzing Ambiguity of Context-Free Grammars

Vertical Unambiguity

"Vertical unambiguity" of G:

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow L(\alpha) \cap L(\alpha') = \emptyset$

(~ "reduce/reduce conflict" in [Yacc])

Example ("xy"):

    S : 'x' Y | X 'y'
    Y : 'y'
    X : 'x'

Vertically ambiguous string: xy
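To make the vertical condition concrete, here is a small brute-force sketch that enumerates all strings up to a length bound and reports alternatives of the same nonterminal that share a string. This is only a bounded search (it can find conflicts but cannot prove their absence) and is not the over-approximating analysis of the talk; the grammar encoding and all function names are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]  # nonterminal -> alternatives

# The "xy" example: S : 'x' Y | X 'y' ; Y : 'y' ; X : 'x'
XY_GRAMMAR: Grammar = {
    "S": [("x", "Y"), ("X", "y")],
    "Y": [("y",)],
    "X": [("x",)],
}


def strings_of(alt, grammar, lang, max_len) -> Set[str]:
    """Strings of length <= max_len derivable from the sentential form alt."""
    result = {""}
    for sym in alt:
        choices = lang[sym] if sym in grammar else {sym}
        result = {u + v for u in result for v in choices if len(u + v) <= max_len}
    return result


def bounded_languages(grammar: Grammar, max_len: int) -> Dict[str, Set[str]]:
    """For each nonterminal, all derivable strings of length <= max_len."""
    lang: Dict[str, Set[str]] = {n: set() for n in grammar}
    changed = True
    while changed:  # fixpoint iteration; terminates since the sets are bounded
        changed = False
        for n, alts in grammar.items():
            for alt in alts:
                for w in strings_of(alt, grammar, lang, max_len):
                    if w not in lang[n]:
                        lang[n].add(w)
                        changed = True
    return lang


def vertical_conflicts(grammar: Grammar, max_len: int):
    """Pairs of distinct alternatives of one nonterminal sharing a short string."""
    lang = bounded_languages(grammar, max_len)
    found = []
    for n, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                common = (strings_of(alts[i], grammar, lang, max_len)
                          & strings_of(alts[j], grammar, lang, max_len))
                if common:
                    shortest = min(common, key=lambda w: (len(w), w))
                    found.append((n, alts[i], alts[j], shortest))
    return found
```

On the example, `vertical_conflicts(XY_GRAMMAR, 4)` reports the two alternatives of `S` with the shared string `xy`.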

Page 11: Analyzing Ambiguity of Context-Free Grammars

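Horizontal Unambiguity

"Horizontal unambiguity" of G:

$\forall n \in N : \forall \alpha \in \pi(n) : \alpha = \alpha_l \alpha_r \Rightarrow L(\alpha_l) \Cap L(\alpha_r) = \emptyset$

where the "overlap" operator $\Cap : \mathcal{P}(\Sigma^*) \times \mathcal{P}(\Sigma^*) \to \mathcal{P}(\Sigma^*)$ is given by:

$X \Cap Y := \{\, xay \mid x, y \in \Sigma^* \wedge a \in \Sigma^+ \wedge x, xa \in L(X) \wedge y, ay \in L(Y) \,\}$

[Figure: "overlap" of a string x a y, where X covers both x and xa, and Y covers both ay and y.]

(~ "shift/reduce conflict" in [Yacc])

Example ("xay"):

    S : 'x' V W
    V : 'a' |
    W : 'a' 'y' | 'y'

Horizontally ambiguous string: xay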
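The overlap operator can be tried out directly on finite languages. The sketch below treats X and Y as finite Python sets of strings (an assumption for illustration; the analysis itself works on regular languages) and uses the hypothetical function name `overlap`.

```python
from typing import Set


def overlap(X: Set[str], Y: Set[str]) -> Set[str]:
    """{ xay | x, xa in X and y, ay in Y, with a nonempty } for finite X, Y."""
    result = set()
    for xa in X:
        for x in X:
            # x must be a proper prefix of xa, so that a is nonempty
            if len(x) < len(xa) and xa.startswith(x):
                a = xa[len(x):]
                for ay in Y:
                    # ay must start with a, and the remainder y must be in Y too
                    if ay.startswith(a) and ay[len(a):] in Y:
                        result.add(xa + ay[len(a):])
    return result
```

For the "xay" example, splitting `'x' V W` into `'x' V` and `W` gives the languages `{"x", "xa"}` and `{"ay", "y"}`, and `overlap({"x", "xa"}, {"ay", "y"})` evaluates to `{"xay"}`.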

Page 12: Analyzing Ambiguity of Context-Free Grammars

Characterization of Ambiguity

Theorem 1 (characterization):

G vertically unambiguous $\wedge$ G horizontally unambiguous $\Leftrightarrow$ G unambiguous

(left-hand side: "G is vertically and horizontally unambiguous")

Lemma 1a ("$\Rightarrow$", aka. "soundness"):

G vertically unambiguous $\wedge$ G horizontally unambiguous $\Rightarrow$ G unambiguous

Lemma 1b ("$\Leftarrow$", aka. "completeness"):

G vertically unambiguous $\wedge$ G horizontally unambiguous $\Leftarrow$ G unambiguous

The proofs are in the Tech. Report (straightforward induction proofs).

Note:

- Ambiguity fully characterized
- Still undecidable (...of course)
- Structural problem $\to$ finite number of linguistic problems

Page 14: Analyzing Ambiguity of Context-Free Grammars

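(Over-)Approximation (A)

(Over-)Approximation, $A$ (compare $L : E^* \to \mathcal{P}(\Sigma^*)$):

$A : E^* \to \mathcal{P}(\Sigma^*)$ with $\forall \alpha \in E^* : L(\alpha) \subseteq A(\alpha)$

Approximated vertical unambiguity of G (w.r.t. $A$):

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow A(\alpha) \cap A(\alpha') = \emptyset$

Approximated horizontal unambiguity of G (w.r.t. $A$), with $\Cap$ the overlap operator:

$\forall n \in N : \forall \alpha \in \pi(n) : \alpha = \alpha_l \alpha_r \Rightarrow A(\alpha_l) \Cap A(\alpha_r) = \emptyset$

$A$ decidable: emptiness of "$\cap$" and "$\Cap$" decidable (on co-dom($A$))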

Page 15: Analyzing Ambiguity of Context-Free Grammars

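Unambiguity Approximation

Proposition 2 (approximation soundness):

G approximated vertically unambiguous $\wedge$ G approximated horizontally unambiguous (w.r.t. $A$) $\Rightarrow$ G unambiguous

Proof:

"Larger sets don't overlap $\Rightarrow$ smaller sets don't overlap" (equivalently: "Conflicts w/ smaller sets $\Rightarrow$ conflicts w/ larger sets"):

$A(\alpha) \cap A(\alpha') = \emptyset \Rightarrow L(\alpha) \cap L(\alpha') = \emptyset$

$A(\alpha_l) \Cap A(\alpha_r) = \emptyset \Rightarrow L(\alpha_l) \Cap L(\alpha_r) = \emptyset$ (with $\Cap$ the overlap operator)

Hence approximated vertical and horizontal unambiguity imply vertical and horizontal unambiguity, and hence, by transitivity (via Theorem 1), G is unambiguous.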

Page 16: Analyzing Ambiguity of Context-Free Grammars

Compositionality (of A's)

Proposition 3 (compositionality):

$A, A'$ decidable (over-)approximations $\Rightarrow$ $A \cap A'$ decidable (over-)approximation

Proof: Follows from definition [proof omitted]

Also: "approximations are locally(!) compositional"

[Figure: three copies of the unambiguous/ambiguous diagram, labelled $A$, $A'$, and $A \cap A'$.]

Page 17: Analyzing Ambiguity of Context-Free Grammars

Are there any Approximations!?!

YES!; e.g., "The worst... ...approximation":

$A_{\Sigma^*}(\alpha) := \Sigma^*$ (everything, constant)

Almost useless: "Can only acquit totally trivial grammars:

    N : 'x'

as unambiguous"

...but safe(!)

[Figure: the "worst approximation" region in the unambiguous/ambiguous diagram.]

Page 19: Analyzing Ambiguity of Context-Free Grammars

Regular Approximation (AMN)!

$A_{MN}(\alpha) = [\text{Mohri-Nederhof}]_G(\alpha)$: CFG $\to$ REG (DFA) (over-)approximation, used as a "black-box".

Reference: "Regular Approximation of Context-Free Grammars through Transformation" [Mohri-Nederhof, 2000]

Properties of this "black-box":

- Good (over-)approximation!
- Produces regular languages: almost everything is decidable (constructively, via automata)!

Note: Works on a language-level, L(G), ... not on the structure-level of the grammar, G.

Page 21: Analyzing Ambiguity of Context-Free Grammars

Assessment (implementation)

Java impl.: "grambiguity" (510 lines), using:

- "dk.brics.automaton" [ http://www.brics.dk/automaton/ ]
- "dk.brics.grammar" [ http://www.brics.dk/grammar/ ]
- Java String Analyzer [ http://www.brics.dk/JSA/ ]

Grammar P:

    /* unambiguous */
    P[aPa]   : "a" P "a" ;
     [a]     | "a" ;
     [empty] | ;

    unambiguous grammar!

Grammar E:

    /* ambiguous */
    E[plus] : E "+" E ;
     [mult] | E "*" E ;
     [x]    | "x" ;

    *** (potential) vertical ambiguity detected: 'E[plus]' vs. 'E[mult]'
        shortest ambiguous string: "x*x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..0]' vs. 'E[mult:1..2]'
        shortest ambiguous string: "x*x*x"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..1]' vs. 'E[mult:2..2]'
        shortest ambiguous string: "x*x*x"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        4 (potential) horizontal ambiguities

Page 22: Analyzing Ambiguity of Context-Free Grammars

Examples: Palindromes and "Anti-palindromes"

Palindromic examples:

    P : "a" P "a" ;
      | "a" ;
      | ;

    unambiguous grammar!

    P : "a" P "a" ;
      | "b" P "b" ;
      | "b" ;
      | "a" ;
      | ;

    unambiguous grammar!

    P : "a" P "a" ;
      | ;

    unambiguous grammar!

    R : "a" R "b" ;
      | "b" R "a" ;
      | "a" "b" ;
      | "b" "a" ;

    unambiguous grammar!

    R : "a" R "b" ;
      | "b" R "a" ;
      | ;

    unambiguous grammar!

Note: all are non-LR-Regular grammars !!
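As a bounded sanity check of the "unambiguous grammar!" verdict, one can count derivation trees by hand for the second grammar above (`P : "a" P "a" | "b" P "b" | "b" | "a" | ;`). The sketch below is written specifically for that grammar and is not part of the tool; the names `parses` and `max_parses` are hypothetical.

```python
from itertools import product


def parses(w: str) -> int:
    """Number of derivation trees of w in P : aPa | bPb | b | a | (empty)."""
    # base alternatives: "a", "b", and the empty production
    total = 1 if w in ("", "a", "b") else 0
    # recursive alternatives: same letter at both ends around a nested P
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":
        total += parses(w[1:-1])
    return total


def max_parses(max_len: int) -> int:
    """Largest number of derivation trees over all strings up to max_len."""
    return max(
        parses("".join(chars))
        for k in range(max_len + 1)
        for chars in product("ab", repeat=k)
    )
```

No string up to the chosen bound has more than one derivation tree, which is consistent with (but, being bounded, weaker than) the analysis result.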

Page 23: Analyzing Ambiguity of Context-Free Grammars

...inherent in RNA Analysis!!!

"Predicting behavior of genes":

"Complementary base pairs" // 'G-C', 'A-U', and 'G-U':

    R : 'G' R 'C' | 'C' R 'G'
      | 'A' R 'U' | 'U' R 'A'
      | 'G' R 'U' | 'U' R 'G'
      |

Page 24: Analyzing Ambiguity of Context-Free Grammars

Examples: RNA Analysis (G1)

RNA Analysis (G1):

    /* ambiguous */
    S[aa]    : "(" S ")" ;
     [aS]    | "." S ;
     [Sa]    | S "." ;
     [SS]    | S S ;
     [empty] | ;

    %> java -jar Grambiguity.jar G1.cfg

    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[Sa]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[aa]' vs. 'S[SS]'
        shortest ambiguous string: "()"
    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[Sa]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[SS]' vs. 'S[empty]'
        shortest ambiguous string: ""
    *** (potential) horizontal ambiguity detected: 'S[SS:0..0]' vs. 'S[SS:1..1]'
        shortest ambiguous string: "."
    *** (potentially) ambiguous grammar:
        5 (potential) vertical ambiguities
        1 (potential) horizontal ambiguity

Page 25: Analyzing Ambiguity of Context-Free Grammars

Examples: RNA Analysis (G2)

RNA Analysis (G2):

    /* ambiguous */
    S[aPa]   : "(" P ")" ;
     [aS]    | "." S ;
     [Sa]    | S "." ;
     [SS]    | S S ;
     [empty] | ;
    P[aPa] : "(" P ")" ;
     [S]   | S ;

    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[Sa]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[aPa]' vs. 'S[SS]'
        shortest ambiguous string: "()"
    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[Sa]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[SS]' vs. 'S[empty]'
        shortest ambiguous string: ""
    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[S]'
        shortest ambiguous string: "()"
    *** (potential) horizontal ambiguity detected: 'S[SS:0..0]' vs. 'S[SS:1..1]'
        shortest ambiguous string: "."
    *** (potentially) ambiguous grammar:
        6 (potential) vertical ambiguities
        1 (potential) horizontal ambiguity

Page 26: Analyzing Ambiguity of Context-Free Grammars

Examples: RNA Analysis (G3-G6)

RNA Analysis (G3, G4, G5, G6), four grammars:

    S[aS]    : "." S ;
     [T]     | T ;
     [empty] | ;
    T[Ta]    : T "." ;
     [aSa]   | "(" S ")" ;
     [TaSa]  | T "(" S ")" ;

    S[LS]  : L S ;
     [L]   | L ;
    L[aFa] : "(" F ")" ;
     [a]   | "." ;
    F[aFa] : "(" F ")" ;
     [LS]  | L S ;

    S[aS]    : "." S ;
     [aSaS]  | "(" S ")" S ;
     [empty] | ;

    S[aPa] : "(" P ")" ;
     [aL]  | "." L ;
     [Ra]  | R "." ;
     [LS]  | L S ;
    L[aPa] : "(" P ")" ;
     [aL]  | "." L ;
    R[Ra]    : R "." ;
     [empty] | ;
    P[aPa] : "(" P ")" ;
     [aNa] | "(" N ")" ;
    N[aL] : "." L ;
     [Ra] | R "." ;
     [LS] | L S ;

    unambiguous grammar!

Similarly for 'G7' and 'G8' (using an unfolding trick)

Page 27: Analyzing Ambiguity of Context-Free Grammars

Examples: "voss" & "voss-light"

    P : "(" P ")" ;     // P: Closed structure
      | "(" O ")" ;
    O : L P ;           // O: Open structure
      | P R ;
      | S P S ;
      | H ;
    L : "." L ;         // L: Left bulge
      | "." ;
    R : "." R ;         // R: Right bulge
      | "." ;
    S : "." S ;         // S: Singlestrand
      | "." ;
    H : "." H ;         // H: Hairpin 3+loop
      | "." "." "." ;

    unambiguous grammar!

LR(k):

- LR(1) = 3 r/r conflicts
- LR(3) = 12 r/r conflicts
- LR(5) = 93 r/r conflicts
- LR(7) = 249 r/r conflicts
- LR(9) = 513 r/r conflicts
- ...

Page 28: Analyzing Ambiguity of Context-Free Grammars

Example: Java Expressions

    Exp[assign] : Exp1 "=" Exp ;
       [exp1]   | Exp1 ;
    Exp1[or]    : Exp1 "||" Exp2 ;
        [exp2]  | Exp2 ;
    Exp2[and]   : Exp2 "&&" Exp3 ;
        [exp3]  | Exp3 ;
    Exp3[eq]    : Exp3 "==" Exp4 ;
        [neq]   | Exp3 "!=" Exp4 ;
        [exp4]  | Exp4 ;
    Exp4[lt]    : Exp4 "<" Exp5 ;
        [leq]   | Exp4 "<=" Exp5 ;
        [gt]    | Exp4 ">" Exp5 ;
        [geq]   | Exp4 ">=" Exp5 ;
        [exp5]  | Exp5 ;

    /* -- cont'd -- */

    Exp5[add]   : Exp5 "+" Exp6 ;
        [sub]   | Exp5 "-" Exp6 ;
        [exp6]  | Exp6 ;
    Exp6[mul]   : Exp6 "*" Exp7 ;
        [div]   | Exp6 "/" Exp7 ;
        [exp7]  | Exp7 ;
    Exp7[not]   : "!" Exp7 ;
        [exp8]  | Exp8 ;
    Exp8[par]   : "(" Exp ")" ;
        [con]   | Con ;
    Con[num]    : "0" ;
       [id]     | "x" ;

    unambiguous grammar!

Page 29: Analyzing Ambiguity of Context-Free Grammars

Error Messages (Amb. Example)

Ambiguous Expressions:

    E[plus] : E "+" E ;
     [mult] | E "*" E ;
     [x]    | "x" ;

    *** (potential) vertical ambiguity detected: 'E[plus]' vs. 'E[mult]'
        shortest ambiguous string: "x*x+x"                  (precedence "+" vs. "*")
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"                  (assoc. of "+")
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"                  (assoc. of "+")
    *** (potential) horizontal ambiguity detected: 'E[mult:0..0]' vs. 'E[mult:1..2]'
        shortest ambiguous string: "x*x*x"                  (assoc. of "*")
    *** (potential) horizontal ambiguity detected: 'E[mult:0..1]' vs. 'E[mult:2..2]'
        shortest ambiguous string: "x*x*x"                  (assoc. of "*")
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        4 (potential) horizontal ambiguities
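The reported shortest strings can be confirmed to be genuinely ambiguous by counting derivation trees. The sketch below hard-codes the grammar `E : E "+" E | E "*" E | "x"` and counts trees with memoization; it is an independent illustration, not what the tool does, and `count_parses` is a hypothetical name.

```python
from functools import lru_cache


def count_parses(tokens: str) -> int:
    """Number of derivation trees of E -> E "+" E | E "*" E | "x" for tokens."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # number of ways E derives tokens[i:j]
        total = 1 if tokens[i:j] == "x" else 0
        # try every operator position k as the root of the tree
        for k in range(i + 1, j - 1):
            if tokens[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))
```

Each of `"x*x+x"`, `"x+x+x"`, and `"x*x*x"` has two derivation trees, while `"x+x"` has exactly one.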

Page 30: Analyzing Ambiguity of Context-Free Grammars

Benchmark Grammars

[Figure: diagram of grammar classes (LALR(1), LR(1), LR(2), ..., LR(8), LR(k), [OUR], UNAMBIGUOUS, AMBIGUOUS) with the benchmark grammars placed in it. Grammars shown: Base, R, P, G3, G7, G4, G5, G8, G6, Exp, O/E, Voss, Voss-light, G1 (5V+1H), G2 (6V+1H), Amb-Exp (1V+4H).]

Page 32: Analyzing Ambiguity of Context-Free Grammars

Related Work (Dynamic)

Dynamic disambiguation:

- "Disambiguation-by-convention": longest match, most specific match, …
- Customizable: [Bison v. 1.5+]: %dprec, %merge; [ASF+SDF]: "disambiguation filters"

Dynamic ambiguity interception:

- GLR ([Tomita], [Earley], [Bison], [ASF+SDF], …)

Page 33: Analyzing Ambiguity of Context-Free Grammars

Related Work (Static)

Static disambiguation:

- "Disambiguation-by-convention": first match, most specific match, …
- Customizable: [Yacc]: %left, %right, %nonassoc, %prec

Static ambiguity interception (our work goes here):

- LL(k), LALR(1), LR(k), LR-regular, …
- Sylvain Schmitz (ICALP 2007): "Conservative Ambiguity Detection in Context-Free Grammars"
  - Subsumes LR-regular
  - Incomparable to our technique

Example grammar shown on the slide:

    S : A A
    A : 'a' A 'a'
      | 'b'

Page 35: Analyzing Ambiguity of Context-Free Grammars

Conclusion

Advantages (of our approach):

- Characterization! Possible to reason (locally) about ambiguity
- (Composable) Analysis Framework
- Complete decision procedure for regular grammars
- Inherently parallelizable
- DFA counterexamples, and shortest (possibly) ambiguous string
- Not "left-to-right" or "right-to-left" biased: can handle palindromic grammars
- Well-suited for RNA analysis :)

Page 36: Analyzing Ambiguity of Context-Free Grammars

"Analyzing Ambiguity of Context-Free Grammars"

Conclusion (cont'd)

It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures. However, the fact that the problem is undecidable does not mean that there are no useful approximations to the problem.

We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this to conservatively approximate the problem based on local regular approximations and grammar unfoldings. As an application, we consider grammars that occur in RNA analysis in bioinformatics, and we demonstrate that our static analysis of context-free grammars is sufficiently precise and efficient to be practically useful.

Page 37: Analyzing Ambiguity of Context-Free Grammars

Thank you

Questions, please?

Page 38: Analyzing Ambiguity of Context-Free Grammars

Advertisement

Film about teaching/learning, based on educational research theories:

- Freely available on google video + on DVD (subtitles in 7 languages)
- Used on all continents for teaching teachers about teaching and learning
- 3,500+ DVDs (non-profit) sold in a few months
- 17,000+ online views
- Features epilogue by Prof. John Biggs

[ http://www.daimi.au.dk/~brabrand/short-film/ ]

Page 39: Analyzing Ambiguity of Context-Free Grammars

BONUS SLIDES

Page 40: Analyzing Ambiguity of Context-Free Grammars

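Asymptotic (Time) Complexity

Grammar dimensions ($n$ nonterminals, up to $v$ alternatives each, up to $h$ symbols per alternative):

    N_1 : e_{1,1} … e_{a,1}
        | …
        | e_{1,p} … e_{a,p}
    ...

- $n = |\mathcal{N}|$
- $v = \max\{\, |\pi(N)| : N \in \mathcal{N} \,\}$
- $h = \max\{\, |\alpha| : \alpha \in \pi(N),\ N \in \mathcal{N} \,\}$
- $g = nvh = |G|$

Costs:

- [Mohri-Nederhof]: $O(n^2 v h)$
- Vertical Amb: $O(n^3 v^4 h^4)$
- Horizontal Amb: $O(n^3 v^3 h^5)$
- Total: $O(n^3 v^3 h^4 (v+h)) \subseteq O(g^5)$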
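The size parameters above are easy to compute for a concrete grammar. The following sketch assumes a dictionary encoding of grammars (nonterminal to list of alternatives) and a hypothetical function name `grammar_size`; it only evaluates the definitions of n, v, h, and g and does not measure running time.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]  # nonterminal -> alternatives

# The ambiguous expression grammar: E : E "+" E | E "*" E | "x"
AMB_EXP: Grammar = {"E": [("E", "+", "E"), ("E", "*", "E"), ("x",)]}


def grammar_size(grammar: Grammar) -> Tuple[int, int, int, int]:
    """Return (n, v, h, g) for a non-empty grammar, with g = n * v * h."""
    n = len(grammar)                                  # number of nonterminals
    v = max(len(alts) for alts in grammar.values())   # max alternatives
    h = max(len(alt) for alts in grammar.values() for alt in alts)  # max length
    return n, v, h, n * v * h
```

For the ambiguous expression grammar this gives n = 1, v = 3, h = 3, and g = 9.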

Page 41: Analyzing Ambiguity of Context-Free Grammars

Other (cheaper) approximations

Use cheaper approximations first, e.g. $\langle F, M, L \rangle$:

- F: set of first chars
- M: set of middle chars
- L: set of last chars

Page 42: Analyzing Ambiguity of Context-Free Grammars

Example: Odd/Even

Keeping track of parity (odd/even):

    Start : Even ;
          | Odd ;
    Even  : "(" "(" Even ")" ")" ;
          | ;
    Odd   : "(" "(" Odd ")" ")" ;
          | "(" ")" ;

    unambiguous grammar!

$L(\mathit{Even}) = \{\, (^{2n}\, )^{2n} \mid n \geq 0 \,\}$, $L(\mathit{Odd}) = \{\, (^{2n+1}\, )^{2n+1} \mid n \geq 0 \,\}$

$A(\mathit{Even}) = \{\, (^{2n}\, )^{2m} \mid n, m \geq 0 \,\}$, $A(\mathit{Odd}) = \{\, (^{2n+1}\, )^{2m+1} \mid n, m \geq 0 \,\}$
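The two approximated languages are regular and disjoint, which is why the two alternatives of `Start` raise no vertical conflict. The sketch below writes them as Python regular expressions and searches for a common string up to a length bound; the names `A_EVEN`, `A_ODD`, and `in_both` are chosen here for illustration, and the search is a bounded sanity check rather than a proof.

```python
import re
from itertools import product

# A(Even): an even number of "(" followed by an even number of ")"
A_EVEN = re.compile(r"(\(\()*(\)\))*")
# A(Odd): an odd number of "(" followed by an odd number of ")"
A_ODD = re.compile(r"\((\(\()*\)(\)\))*")


def in_both(max_len: int):
    """Strings over '(' and ')' up to max_len that lie in both approximations."""
    hits = []
    for k in range(max_len + 1):
        for chars in product("()", repeat=k):
            w = "".join(chars)
            if A_EVEN.fullmatch(w) and A_ODD.fullmatch(w):
                hits.append(w)
    return hits
```

The approximations are strictly larger than the languages themselves: for instance `"()))"` matches `A_ODD` although it is not in L(Odd). Even so, `in_both` finds no common string, since the parity of the number of opening parentheses already separates the two sets.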

Page 43: Analyzing Ambiguity of Context-Free Grammars

AMN is Decidable!

- $X \cap Y = \emptyset$: constructively decidable (using DFAs): $O(|X_{DFA}| \cdot |Y_{DFA}|)$
- $X \Cap Y = \emptyset$ (overlap): constructively decidable (using DFAs): $O(|X_{DFA}| \cdot |Y_{DFA}|)$

Constructively decidable with potential counterexamples (as DFAs); i.e., we can extract shortest (potentially ambiguous) strings!
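For the plain intersection case, the constructive decision procedure with a shortest counterexample can be sketched as a breadth-first search over the product of two DFAs. This is a generic textbook construction under an assumed DFA encoding, not the code of the tool (which builds on dk.brics.automaton), and it does not cover the overlap operator; all names below are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple

# Assumed encoding: (start state, accepting states, transition table), where a
# missing (state, symbol) entry means the DFA rejects.
DFA = Tuple[int, Set[int], Dict[Tuple[int, str], int]]

# Strings over "a" with an even / odd number of letters, and multiples of 3.
EVEN_AS: DFA = (0, {0}, {(0, "a"): 1, (1, "a"): 0})
ODD_AS: DFA = (0, {1}, {(0, "a"): 1, (1, "a"): 0})
MULT3_AS: DFA = (0, {0}, {(0, "a"): 1, (1, "a"): 2, (2, "a"): 0})


def shortest_common_string(x: DFA, y: DFA, alphabet: str) -> Optional[str]:
    """Shortest string accepted by both DFAs, or None if the intersection is empty."""
    (sx, fx, dx), (sy, fy, dy) = x, y
    start = (sx, sy)
    queue = deque([(start, "")])
    seen = {start}
    while queue:  # visits each product state at most once
        (p, q), word = queue.popleft()
        if p in fx and q in fy:
            return word
        for a in alphabet:
            if (p, a) in dx and (q, a) in dy:
                nxt = (dx[(p, a)], dy[(q, a)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + a))
    return None
```

Because the search is breadth-first, the first accepting product state reached yields a shortest witness: `shortest_common_string(ODD_AS, MULT3_AS, "a")` gives `"aaa"`, while `EVEN_AS` and `ODD_AS` have no common string.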

Page 44: Analyzing Ambiguity of Context-Free Grammars

Decision Algorithm for ($X \Cap Y$)

For X, Y regular languages (NFAs), with $\Cap$ the overlap operator:

All overlappings, "xay" (as DFA's) (essentially a variant of "DFA product-construction", '$\times$')

[Figure: from X_NFA and Y_NFA, automata X'_NFA and Y'_NFA are built and combined into [X;Y]_NFA; a path through it spells x a y, where X covers x and xa, and Y covers ay and y.]

Page 45: Analyzing Ambiguity of Context-Free Grammars

Examples: RNA Analysis (G7)

RNA Analysis (G7, G8):

G7:

    S[aPa] : "(" P ")" ;
     [aL]  | "." L ;
     [Ra]  | R "." ;
     [LS]  | L S ;
    L[aPa] : "(" P ")" ;
     [aL]  | "." L ;
    R[Ra]    : R "." ;
     [empty] | ;
    P[aPa] : "(" P ")" ;
     [aNa] | "(" N ")" ;
    N[aL] : "." L ;
     [Ra] | R "." ;
     [LS] | L S ;

    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[aNa]'
        shortest ambiguous string: "(((.)"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        0 (potential) horizontal ambiguities

G8:

    S[aS]    : "." S ;
     [T]     | T ;
     [empty] | ;
    T[Ta]    : T "." ;
     [aPa]   | "(" P ")" ;
     [TaPa]  | T "(" P ")" ;
    P[aPa] : "(" P ")" ;
     [aNa] | "(" N ")" ;
    N[aS]   : "." S ;
     [Ta]   | T "." ;
     [TaPa] | T "(" P ")" ;

    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[aNa]'
        shortest ambiguous string: "(((.)"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        0 (potential) horizontal ambiguities

Note: these are all spurious errors due to imprecisions in the analysis

Page 46: Analyzing Ambiguity of Context-Free Grammars

Example: Expressions

Expressions:

    E[term] : T ;
     [plus] | E "+" T ;
    T[x]    : "x" ;
     [par]  | "(" E ")" ;

    *** (potential) vertical ambiguity detected: 'E[term]' vs. 'E[plus]'
        shortest ambiguous string: "x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        2 (potential) horizontal ambiguities

Note: General problem with non-linear recursive structures

However, there's a trick...

Page 47: Analyzing Ambiguity of Context-Free Grammars

Examples: Expressions (cont'd)

Expressions:

    E[term] : T ;
     [plus] | E "+" T ;
    T[x]    : "x" ;
     [par]  | "(" E ")" ;

Unfold trick: unfold wrt. '(' and ')' (inside/outside parentheses).

[Figure: the grammar G and its unfolding G_u (three copies of the productions of E and T), with the string $\omega$ = x+(x+(x+x)+x)+x, its unfolded counterpart $\omega_u$, and the corresponding trees AST and AST_u.]

Result after unfolding: unambiguous grammar!

Page 49: Analyzing Ambiguity of Context-Free Grammars

Proof (Lemma 1a): "$\Rightarrow$"

Lemma 1a:

G vertically unambiguous $\wedge$ G horizontally unambiguous $\Rightarrow$ G unambiguous

…contrapositively:

G ambiguous $\Rightarrow$ G vertically ambiguous $\vee$ G horizontally ambiguous

Proof structure:

- Assume G ambiguous (i.e. $\exists$ 2 der. trees for $\omega$)
- Show: G vertically ambiguous $\vee$ G horizontally ambiguous, by induction in max height of the 2 derivation trees

Page 50: Analyzing Ambiguity of Context-Free Grammars

Proof (Lemma 1a): "$\Rightarrow$" (Base)

Base case (height 1): The ambiguity means that $N$ derives $\omega$ by two derivation trees of height 1, via productions $\alpha$ and $\alpha'$.

However, this means that: $\alpha = t_0 t_1 .. t_{|\omega|-1} = \alpha'$ (i.e. the two trees must be the same); and so the result holds vacuously.

Page 51: Analyzing Ambiguity of Context-Free Grammars

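Proof (Lemma 1a): "$\Rightarrow$" (I.H.)

Induction step (height n): Assume induction hypothesis (for height n-1).

The ambiguity means that $N$ derives $\omega$ by two derivation trees:

- via $\alpha = \alpha_0 .. \alpha_{|\alpha|-1}$, where each $\alpha_i$ derives $\omega_i$ by a subtree $T_i$ (height n-1)
- via $\alpha' = \alpha'_0 .. \alpha'_{|\alpha'|-1}$, where each $\alpha'_{i'}$ derives $\omega'_{i'}$ by a subtree $T'_{i'}$ (height n-1)

with $\omega = \omega_0 .. \omega_{|\alpha|-1} = \omega'_0 .. \omega'_{|\alpha'|-1}$.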

Page 52: Analyzing Ambiguity of Context-Free Grammars

Proof (Lemma 1a): "$\Rightarrow$" ($\alpha \neq \alpha'$)

Case $\alpha \neq \alpha'$ (i.e. different production):

…but then $L(\alpha) \cap L(\alpha') \supseteq \{\omega\} \neq \emptyset$, i.e., we have a vertical ambiguity: G is not vertically unambiguous.

Page 53: Analyzing Ambiguity of Context-Free Grammars

Proof (Lemma 1a): "$\Rightarrow$" ($\alpha = \alpha'$, 1)

Case $\alpha = \alpha'$ (i.e., same prod. $\alpha$): i.e. "the top of the trees are the same"

Case $\forall i : \omega_i = \omega'_i$: ambiguity in subtree $i$ (i.e. $T_i$ & $T'_i$ ambiguously derive same $\omega_i$):

Induction hypothesis (on these subtrees) $\Rightarrow$ G vertically ambiguous $\vee$ G horizontally ambiguous

Page 54: Analyzing Ambiguity of Context-Free Grammars

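Proof (Lemma 1a): "$\Rightarrow$" ($\alpha = \alpha'$, 2)

Case $\alpha = \alpha'$ (i.e., same prod. $\alpha$):

Case $\exists i : \omega_i \neq \omega'_i$ (i.e., not $\forall i : \omega_i = \omega'_i$):

Now let $k = \min\{\, i \mid \omega_i \neq \omega'_i \,\}$

...then $\omega = xay'$ and:

$L(\alpha_0 .. \alpha_k) \Cap L(\alpha_{k+1} .. \alpha_{|\alpha|-1}) \supseteq \{xay'\} \neq \emptyset$ (with $\Cap$ the overlap operator), i.e. G is not horizontally unambiguous.

[Figure: the two derivation trees split $\omega$ around position $k$ as x ... y and x' ... y', with $\omega_k \neq \omega'_k$.]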

Page 55: Analyzing Ambiguity of Context-Free Grammars

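Proof (Lemma 1b): "$\Leftarrow$"

Lemma 1b:

G vertically unambiguous $\wedge$ G horizontally unambiguous $\Leftarrow$ G unambiguous

...contrapositively:

G vertically ambiguous $\vee$ G horizontally ambiguous $\Rightarrow$ G ambiguous

Proof: Assume G is not vertically unambiguous (vertical conflict):

$N \Rightarrow \alpha \Rightarrow^* a$, $N \Rightarrow \alpha' \Rightarrow^* a$, $L(\alpha) \cap L(\alpha') \supseteq \{a\}$, for some $N$

But then derive (using reachability + derivability of N):

$s \Rightarrow^* x N \omega \Rightarrow x \alpha \omega \Rightarrow^* x a \omega \Rightarrow^* x a y$

$s \Rightarrow^* x N \omega \Rightarrow x \alpha' \omega \Rightarrow^* x a \omega \Rightarrow^* x a y$

i.e. G ambiguous.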

Page 56: Analyzing Ambiguity of Context-Free Grammars

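Proof (Lemma 1b): "$\Leftarrow$" (cont'd)

Assume G is not horizontally unambiguous (horizontal conflict): Then for some $N \in \mathcal{N}$:

$N \Rightarrow \alpha_l \alpha_r$, where $L(\alpha_l) \Cap L(\alpha_r) \neq \emptyset$ (with $\Cap$ the overlap operator)

i.e.

$\exists x, y \in \Sigma^* : \exists a \in \Sigma^+ : x, xa \in L(\alpha_l) \wedge y, ay \in L(\alpha_r)$

But then derive (using reachability + derivability of N):

$s \Rightarrow^* v N \omega \Rightarrow v \alpha_l \alpha_r \omega \Rightarrow^* v x \alpha_r \omega \Rightarrow^* v x a y \omega \Rightarrow^* v x a y w$

$s \Rightarrow^* v N \omega \Rightarrow v \alpha_l \alpha_r \omega \Rightarrow^* v x a \alpha_r \omega \Rightarrow^* v x a y \omega \Rightarrow^* v x a y w$

i.e. G ambiguous.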