July 18, 2007, CIAA'07, Prague: "Analyzing Ambiguity of Context-Free Grammars"

Analyzing Ambiguity of Context-Free Grammars

Claus Brabrand, brabrand(at)brics.dk
DAIMI, University of Aarhus

Anders Møller, amoeller(at)brics.dk
DAIMI, University of Aarhus

Robert Giegerich, robert(at)TechFak.Uni-Bielefeld.de
University of Bielefeld, Germany

Outline

- Introduction (and Motivation)
- Characterization of Ambiguity (aka. "Vertical-" and "Horizontal-" Ambiguity)
- Framework (for Analyzing Ambiguity)
- Regular Approximation ($A_{MN}$)
- Assessment (Applications and Examples)
- Related Work
- Conclusion

Motivation (for CFG Ambiguity)

1. Programming Languages

    STM  : EXP ";"
         | "if" "(" EXP ")" STM
         | "if" "(" EXP ")" STM "else" STM
         | "while" "(" EXP ")" "do" STM
    EXP  : EXP "*" TERM | EXP "/" TERM | TERM
    TERM : TERM "+" FACT | TERM "-" FACT | FACT
    FACT : CONST | VAR

Example program (the indentation shows what the programmer intended):

    int f() {
      if (b)
        if (c)
          f();
      else
        y++;
    }

2. Models of Real-World Physical Structures

    P : "(" P ")" | "(" O ")"
    O : L P | P R | S P S | H
    L : "." L | "."
    R : "." R | "."
    S : "." S | "."
    H : "." H | "." "." "."

[Figure: two pipelines, each shown for an "Unambiguous" and an "Ambiguous" grammar G. Programming language (CFG): G is given to a parser (results P, P'). Physical structure model (CFG): G is used for prediction of physical structure (results M, M') on sequences such as AACGGAG, CGGTGGC, ATCGGAT, CGACTTT. Further labels: "Engineer", "Computer Scientist", "beneficial...", "lethal...".]

Context-Free Grammar Ambiguity

Ambiguity: $\exists \omega \in \Sigma^*$: multiple derivation trees for $\omega$.

Ambiguity means there exist derivation trees $T \neq T'$ for the same string.

[Figure: two distinct derivation trees $T$ and $T'$ deriving the same string.]

However: Undecidable! I.e., no one can decide the line separating "unambiguous" from "ambiguous" grammars.

However^2…

However: Conservative Analysis!

Use conservative (over-)approximation:

- "Yes!" means "G guaranteed unambiguous!"
- Safely use any GLR parser on G ...and never get two parses at runtime!

[Figure: the space of grammars divided into "unambiguous" and "ambiguous"; the analysis answers "Yes!" for a grammar G lying inside the unambiguous region.]

...just because it's undecidable doesn't mean there aren't (good) conservative approximations! Indeed, the whole area of static analysis works on "side-stepping undecidability".

Conservative Analysis (cont'd)

Undecidability means: "there'll always be a slack":

[Figure: between the "unambiguous" and "ambiguous" regions lies a "Don't know?" region.]

However, still useful! Possible interpretations of "Don't know?":

- Treat as error (reject grammar): "Please redesign your grammar" (as in LR(k))
- Treat as warning: "Here are some potential problems"

Problems with Existing Solutions

1. Hard to reason (locally) about ambiguity: it is an intricate structural property of a grammar.

2. Existing solutions are "left-to-right" (or "right-to-left") biased: they cannot handle "palindromic grammars" (...a serious problem for RNA analysis)!

3. Error messages: hard to "pin-point ambiguity" (in terms of the grammar). Also: would like "shortest examples" for debugging (...especially for grammar non-experts)! A typical message:

    conflicts: 7 shift/reduce, 9 reduce/reduce


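Terminology: Context-Free Grammar

$G = \langle N, \Sigma, s, \pi \rangle$

- $N$: finite set of nonterminals
- $\Sigma$: finite set of terminals
- $s \in N$: start nonterminal
- $\pi : N \to \mathcal{P}(E^*)$: production function, where $E = \Sigma \cup N$

Assume (trivially):

- Reachability (all $n \in N$ reachable from $s$)
- Productivity (all $n \in N$ derive some string)

$\mathcal{L}_G : E^* \to \mathcal{P}(\Sigma^*)$ is the "language-of" operator; $\mathcal{L}(G) = \mathcal{L}_G(s)$.

Example:

    EXP : ID
        | EXP '+' EXP
        | EXP '*' EXP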
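The two standing assumptions can be checked mechanically. The following is a minimal sketch, assuming a hypothetical dictionary encoding of grammars (nonterminal to list of productions, each a list of symbols); the names `reachable`, `productive`, and `well_formed` are illustrative and not part of the authors' tool.

```python
from typing import Dict, List, Set

# Hypothetical encoding: keys are nonterminals; every other symbol is a terminal.
Grammar = Dict[str, List[List[str]]]

EXP: Grammar = {
    "EXP": [["ID"], ["EXP", "+", "EXP"], ["EXP", "*", "EXP"]],
}


def reachable(g: Grammar, start: str) -> Set[str]:
    """Nonterminals reachable from the start nonterminal."""
    seen = {start}
    work = [start]
    while work:
        n = work.pop()
        for prod in g[n]:
            for sym in prod:
                if sym in g and sym not in seen:
                    seen.add(sym)
                    work.append(sym)
    return seen


def productive(g: Grammar) -> Set[str]:
    """Nonterminals that derive at least one terminal string (fixpoint iteration)."""
    done: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for n, prods in g.items():
            if n in done:
                continue
            # n is productive if some production uses only terminals
            # and nonterminals already known to be productive.
            if any(all(s not in g or s in done for s in p) for p in prods):
                done.add(n)
                changed = True
    return done


def well_formed(g: Grammar, start: str) -> bool:
    """Both assumptions from the slide: reachability and productivity."""
    return reachable(g, start) == set(g) and productive(g) == set(g)
```

For the EXP grammar above, `well_formed(EXP, "EXP")` holds, whereas a grammar whose only production is `S : S 'a'` fails the productivity check.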

Vertical Unambiguity

"Vertical unambiguity" of $G$:

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow \mathcal{L}_G(\alpha) \cap \mathcal{L}_G(\alpha') = \emptyset$

Example ("xy"):

    S : 'x' Y | X 'y'
    Y : 'y'
    X : 'x'

Vertically ambiguous string: xy

~ "reduce/reduce conflict" in [Yacc]
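For a non-recursive grammar such as the "xy" example, the languages involved are finite, so the vertical condition can be checked by plain enumeration. This is only an illustrative sketch under that assumption (the helper `lang` would not terminate on a recursive grammar); the encoding and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Grammar = Dict[str, List[List[str]]]

# The "xy" example: S : 'x' Y | X 'y' ; Y : 'y' ; X : 'x'
G_XY: Grammar = {
    "S": [["x", "Y"], ["X", "y"]],
    "Y": [["y"]],
    "X": [["x"]],
}


def lang(g: Grammar, alpha: List[str]) -> Set[str]:
    """L(alpha) by enumeration; assumes the grammar is non-recursive."""
    result = {""}
    for sym in alpha:
        if sym in g:
            piece: Set[str] = set()
            for prod in g[sym]:
                piece |= lang(g, prod)
        else:
            piece = {sym}
        result = {u + v for u in result for v in piece}
    return result


def vertical_conflicts(g: Grammar) -> List[Tuple[str, int, int, Set[str]]]:
    """All pairs of distinct productions of one nonterminal with a common string."""
    out = []
    for n, prods in g.items():
        for i in range(len(prods)):
            for j in range(i + 1, len(prods)):
                common = lang(g, prods[i]) & lang(g, prods[j])
                if common:
                    out.append((n, i, j, common))
    return out
```

On `G_XY` the only conflict is between the two productions of `S`, with witness string `xy`.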

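Horizontal Unambiguity

"Horizontal unambiguity" of $G$:

$\forall n \in N : \forall \alpha \in \pi(n) : \alpha = \alpha_l \alpha_r \Rightarrow \mathcal{L}_G(\alpha_l) ⩀ \mathcal{L}_G(\alpha_r) = \emptyset$

where the overlap operator $⩀ : \mathcal{P}(\Sigma^*) \times \mathcal{P}(\Sigma^*) \to \mathcal{P}(\Sigma^*)$ is given by:

$X ⩀ Y := \{\, xay \mid x, y \in \Sigma^* \wedge a \in \Sigma^+ \wedge x, xa \in X \wedge y, ay \in Y \,\}$

[Figure: "overlap" in a string x a y: the middle part a can be attributed either to X (as xa) or to Y (as ay).]

Example ("xay"):

    S : 'x' V W
    V : 'a' | ε
    W : 'a' 'y' | 'y'

Horizontally ambiguous string: xay

~ "shift/reduce conflict" in [Yacc]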
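The overlap operator is easy to compute when both languages are finite. The sketch below, with a hypothetical function name `overlap`, applies the definition directly to the "xay" example, where the split after `'x' V` gives the left language {x, xa} and the right language {y, ay}.

```python
from typing import Set


def overlap(X: Set[str], Y: Set[str]) -> Set[str]:
    """X ⩀ Y = { xay | a non-empty, x and xa in X, y and ay in Y } for finite sets."""
    out: Set[str] = set()
    for xa in X:
        for x in X:
            # x must be a proper prefix of xa, so that a is non-empty
            if len(x) < len(xa) and xa.startswith(x):
                a = xa[len(x):]
                for ay in Y:
                    if ay.startswith(a) and ay[len(a):] in Y:
                        out.add(x + ay)  # x + a + y
    return out


# Split S : ('x' V) (W): L('x' V) = {x, xa}, L(W) = {y, ay}
LEFT = {"x", "xa"}
RIGHT = {"y", "ay"}

# Split S : ('x') (V W): L('x') = {x}, L(V W) = {y, ay, aay}
LEFT_1 = {"x"}
RIGHT_1 = {"y", "ay", "aay"}
```

Here `overlap(LEFT, RIGHT)` contains exactly `xay`, while `overlap(LEFT_1, RIGHT_1)` is empty, so only the second split point of the production is horizontally ambiguous.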

Characterization of Ambiguity

Theorem 1 (characterization):

$G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\iff$ $G$ unambiguous

Lemma 1a ("$\Rightarrow$", aka. "soundness"):

$G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Rightarrow$ $G$ unambiguous

Lemma 1b ("$\Leftarrow$", aka. "completeness"):

$G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftarrow$ $G$ unambiguous

The proofs are in the Tech. Report (straightforward induction proofs).

Note:

- Ambiguity is fully characterized
- Still undecidable (...of course)
- A structural problem becomes a finite number of linguistic problems


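(Over-)Approximation ($A$)

(Over-)approximation, $A_G : E^* \to \mathcal{P}(\Sigma^*)$ (compare $\mathcal{L}_G : E^* \to \mathcal{P}(\Sigma^*)$):

$\forall \alpha \in E^* : \mathcal{L}_G(\alpha) \subseteq A_G(\alpha)$

Approximated vertical unambiguity:

$\forall n \in N : \forall \alpha, \alpha' \in \pi(n) : \alpha \neq \alpha' \Rightarrow A_G(\alpha) \cap A_G(\alpha') = \emptyset$

Approximated horizontal unambiguity:

$\forall n \in N : \forall \alpha \in \pi(n) : \alpha = \alpha_l \alpha_r \Rightarrow A_G(\alpha_l) ⩀ A_G(\alpha_r) = \emptyset$

$A$ decidable: emptiness of "$\cap$" and "$⩀$" is decidable (on co-dom($A$)).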

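Unambiguity Approximation

Proposition 2 (approximation soundness):

$G$ approximated vertically unambiguous $\wedge$ $G$ approximated horizontally unambiguous $\Rightarrow$ $G$ unambiguous

Proof:

"Larger sets don't overlap $\Rightarrow$ smaller sets don't overlap" (equivalently: "Conflicts w/ smaller sets $\Rightarrow$ conflicts w/ larger sets"):

$A_G(\alpha) \cap A_G(\alpha') = \emptyset \Rightarrow \mathcal{L}_G(\alpha) \cap \mathcal{L}_G(\alpha') = \emptyset$

$A_G(\alpha_l) ⩀ A_G(\alpha_r) = \emptyset \Rightarrow \mathcal{L}_G(\alpha_l) ⩀ \mathcal{L}_G(\alpha_r) = \emptyset$

Hence approximated vertical and horizontal unambiguity imply (exact) vertical and horizontal unambiguity, and hence, by transitivity via Theorem 1, $G$ is unambiguous.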

Compositionality (of $A$'s)

Proposition 3 (compositionality):

$A$, $A'$ decidable (over-)approximations $\Rightarrow$ $A \cap A'$ decidable (over-)approximation

Proof: Follows from definition [proof omitted]

Also: "approximations are locally(!) compositional"

[Figure: three copies of the "unambiguous / ambiguous" diagram, showing the regions covered by $A$, by $A'$, and by $A \cap A'$.]

Are there any Approximations!?!

YES!; e.g., "The worst approximation":

$A_{\Sigma^*}(\alpha) := \Sigma^*$ (everything; a constant)

Almost useless: it can only acquit totally trivial grammars, such as

    N : 'x'

as unambiguous, but it is safe(!)

[Figure: the "unambiguous / ambiguous" diagram with the small region covered by the worst approximation.]
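To see why the constant approximation is nearly useless, note that with $A(\alpha) = \Sigma^*$ any two productions intersect and any split point overlaps. The sketch below spells this out under two assumptions not stated on the slide: the alphabet is non-empty, and horizontal split points have non-empty left and right parts. The function name and grammar encoding are hypothetical.

```python
from typing import Dict, List

Grammar = Dict[str, List[List[str]]]


def acquitted_by_worst_approximation(g: Grammar) -> bool:
    """True iff the constant approximation A(alpha) = Sigma^* reports no conflict.

    Vertical check: Sigma^* intersects Sigma^*, so two productions always conflict.
    Horizontal check: Sigma^* overlaps Sigma^*, so any production with a split
    point (two or more symbols) always conflicts.
    """
    return all(
        len(prods) <= 1 and all(len(p) <= 1 for p in prods)
        for prods in g.values()
    )


TRIVIAL: Grammar = {"N": [["x"]]}
AMB_EXP: Grammar = {"E": [["E", "+", "E"], ["E", "*", "E"], ["x"]]}
```

Only grammars like `TRIVIAL` pass; anything with a choice or a sequence, such as `AMB_EXP`, is reported as potentially ambiguous.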


Regular Approximation ($A_{MN}$)!

$A_{MN}(\alpha) = [\text{Mohri-Nederhof}]_G(\alpha)$: a CFG to REG (DFA) (over-)approximation, from "Regular Approximation of Context-Free Grammars through Transformation" [Mohri-Nederhof, 2000].

Properties of this "black-box":

- Good (over-)approximation!
- Produces regular languages: almost everything is decidable (constructively, via automata)!

Note: it works on the language level, $\mathcal{L}(G)$, ...not on the structure level of the grammar, $G$.


Assessment (implementation)

Java implementation: "grambiguity" (510 lines), using:

- "dk.brics.automaton" [ http://www.brics.dk/automaton/ ]
- "dk.brics.grammar" [ http://www.brics.dk/grammar/ ]
- Java String Analyzer [ http://www.brics.dk/JSA/ ]

Grammar P:

    /* unambiguous */
    P[aPa]   : "a" P "a" ;
     [a]     | "a" ;
     [empty] | ;

Output:

    unambiguous grammar!

Grammar E:

    /* ambiguous */
    E[plus]  : E "+" E ;
     [mult]  | E "*" E ;
     [x]     | "x" ;

Output:

    *** (potential) vertical ambiguity detected: 'E[plus]' vs. 'E[mult]'
        shortest ambiguous string: "x*x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..0]' vs. 'E[mult:1..2]'
        shortest ambiguous string: "x*x*x"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..1]' vs. 'E[mult:2..2]'
        shortest ambiguous string: "x*x*x"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        4 (potential) horizontal ambiguities

Examples: Palindromes and "Anti-palindromes"

Palindromic examples (the analysis output follows each grammar):

    P : "a" P "a" ;
      | "a" ;
      | ;

    unambiguous grammar!

    P : "a" P "a" ;
      | "b" P "b" ;
      | "b" ;
      | "a" ;
      | ;

    unambiguous grammar!

    P : "a" P "a" ;
      | ;

    unambiguous grammar!

    R : "a" R "b" ;
      | "b" R "a" ;
      | "a" "b" ;
      | "b" "a" ;

    unambiguous grammar!

    R : "a" R "b" ;
      | "b" R "a" ;
      | ;

    unambiguous grammar!

Note: all are non-LR-Regular grammars!!

...inherent in RNA Analysis!!!

"Predicting behavior of genes":

"Complementary base pairs" ('G-C', 'A-U', and 'G-U'):

    R : 'G' R 'C' | 'C' R 'G'
      | 'A' R 'U' | 'U' R 'A'
      | 'G' R 'U' | 'U' R 'G'
      |

Examples: RNA Analysis (G1)

Grammar G1:

    /* ambiguous */
    S[aa]    : "(" S ")" ;
     [aS]    | "." S ;
     [Sa]    | S "." ;
     [SS]    | S S ;
     [empty] | ;

RNA Analysis (G1):

    %> java -jar Grambiguity.jar G1.cfg
    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[Sa]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[aa]' vs. 'S[SS]'
        shortest ambiguous string: "()"
    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[Sa]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[SS]' vs. 'S[empty]'
        shortest ambiguous string: ""
    *** (potential) horizontal ambiguity detected: 'S[SS:0..0]' vs. 'S[SS:1..1]'
        shortest ambiguous string: "."
    *** (potentially) ambiguous grammar:
        5 (potential) vertical ambiguities
        1 (potential) horizontal ambiguity

Examples: RNA Analysis (G2)

Grammar G2:

    /* ambiguous */
    S[aPa]   : "(" P ")" ;
     [aS]    | "." S ;
     [Sa]    | S "." ;
     [SS]    | S S ;
     [empty] | ;
    P[aPa]   : "(" P ")" ;
     [S]     | S ;

RNA Analysis (G2):

    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[Sa]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[aPa]' vs. 'S[SS]'
        shortest ambiguous string: "()"
    *** (potential) vertical ambiguity detected: 'S[aS]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[Sa]' vs. 'S[SS]'
        shortest ambiguous string: "."
    *** (potential) vertical ambiguity detected: 'S[SS]' vs. 'S[empty]'
        shortest ambiguous string: ""
    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[S]'
        shortest ambiguous string: "()"
    *** (potential) horizontal ambiguity detected: 'S[SS:0..0]' vs. 'S[SS:1..1]'
        shortest ambiguous string: "."
    *** (potentially) ambiguous grammar:
        6 (potential) vertical ambiguities
        1 (potential) horizontal ambiguity

Examples: RNA Analysis (G3-G6)

RNA Analysis (G3, G4, G5, G6); the four grammars shown on the slide:

    S[aS]    : "." S ;
     [T]     | T ;
     [empty] | ;
    T[Ta]    : T "." ;
     [aSa]   | "(" S ")" ;
     [TaSa]  | T "(" S ")" ;

    S[LS]    : L S ;
     [L]     | L ;
    L[aFa]   : "(" F ")" ;
     [a]     | "." ;
    F[aFa]   : "(" F ")" ;
     [LS]    | L S ;

    S[aS]    : "." S ;
     [aSaS]  | "(" S ")" S ;
     [empty] | ;

    S[aPa]   : "(" P ")" ;
     [aL]    | "." L ;
     [Ra]    | R "." ;
     [LS]    | L S ;
    L[aPa]   : "(" P ")" ;
     [aL]    | "." L ;
    R[Ra]    : R "." ;
     [empty] | ;
    P[aPa]   : "(" P ")" ;
     [aNa]   | "(" N ")" ;
    N[aL]    : "." L ;
     [Ra]    | R "." ;
     [LS]    | L S ;

Output:

    unambiguous grammar!

Similarly for 'G7' and 'G8' (using an unfolding trick).

Examples: "voss" & "voss-light"

    P : "(" P ")" ;      // P: Closed structure
      | "(" O ")" ;
    O : L P ;            // O: Open structure
      | P R ;
      | S P S ;
      | H ;
    L : "." L ;          // L: Left bulge
      | "." ;
    R : "." R ;          // R: Right bulge
      | "." ;
    S : "." S ;          // S: Singlestrand
      | "." ;
    H : "." H ;          // H: Hairpin 3+loop
      | "." "." "." ;

LR(k):

- LR(1) = 3 r/r conflicts
- LR(3) = 12 r/r conflicts
- LR(5) = 93 r/r conflicts
- LR(7) = 249 r/r conflicts
- LR(9) = 513 r/r conflicts
- ...

Our analysis:

    unambiguous grammar!

Example: Java Expressions

    Exp[assign] : Exp1 "=" Exp ;
       [exp1]   | Exp1 ;
    Exp1[or]    : Exp1 "||" Exp2 ;
        [exp2]  | Exp2 ;
    Exp2[and]   : Exp2 "&&" Exp3 ;
        [exp3]  | Exp3 ;
    Exp3[eq]    : Exp3 "==" Exp4 ;
        [neq]   | Exp3 "!=" Exp4 ;
        [exp4]  | Exp4 ;
    Exp4[lt]    : Exp4 "<" Exp5 ;
        [leq]   | Exp4 "<=" Exp5 ;
        [gt]    | Exp4 ">" Exp5 ;
        [geq]   | Exp4 ">=" Exp5 ;
        [exp5]  | Exp5 ;
    /* -- cont'd -- */
    Exp5[add]   : Exp5 "+" Exp6 ;
        [sub]   | Exp5 "-" Exp6 ;
        [exp6]  | Exp6 ;
    Exp6[mul]   : Exp6 "*" Exp7 ;
        [div]   | Exp6 "/" Exp7 ;
        [exp7]  | Exp7 ;
    Exp7[not]   : "!" Exp7 ;
        [exp8]  | Exp8 ;
    Exp8[par]   : "(" Exp ")" ;
        [con]   | Con ;
    Con[num]    : "0" ;
       [id]     | "x" ;

Output:

    unambiguous grammar!

Error Messages (Amb. Example)

Ambiguous Expressions:

    E[plus]  : E "+" E ;
     [mult]  | E "*" E ;
     [x]     | "x" ;

Output, with the cause of each message:

    *** (potential) vertical ambiguity detected: 'E[plus]' vs. 'E[mult]'
        shortest ambiguous string: "x*x+x"                  <-- precedence "+" vs. "*"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"                  <-- assoc. of "+"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"                  <-- assoc. of "+"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..0]' vs. 'E[mult:1..2]'
        shortest ambiguous string: "x*x*x"                  <-- assoc. of "*"
    *** (potential) horizontal ambiguity detected: 'E[mult:0..1]' vs. 'E[mult:2..2]'
        shortest ambiguous string: "x*x*x"                  <-- assoc. of "*"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        4 (potential) horizontal ambiguities
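The reported shortest strings can be cross-checked by brute force: count the derivation trees of a given string and see whether there is more than one. The sketch below is not the analysis from the talk (which works on regular approximations); it is a plain tree counter that assumes a grammar without empty productions and without unit cycles, so that every sub-span strictly shrinks. The names `GRAMMAR` and `count_trees` are hypothetical.

```python
from functools import lru_cache
from typing import Dict, Tuple

# E : E "+" E | E "*" E | "x"   (keys are nonterminals, other symbols terminals)
GRAMMAR: Dict[str, Tuple[Tuple[str, ...], ...]] = {
    "E": (("E", "+", "E"), ("E", "*", "E"), ("x",)),
}


def count_trees(grammar, start: str, text: str) -> int:
    """Number of derivation trees of `text` from `start`.

    Assumes no empty productions and no unit cycles, so recursion terminates.
    """

    @lru_cache(maxsize=None)
    def sym(s: str, i: int, j: int) -> int:
        # Trees deriving text[i:j] from the single symbol s.
        if s not in grammar:
            return 1 if text[i:j] == s else 0
        return sum(seq(p, i, j) for p in grammar[s])

    @lru_cache(maxsize=None)
    def seq(p: Tuple[str, ...], i: int, j: int) -> int:
        # Trees deriving text[i:j] from the symbol sequence p.
        if len(p) == 1:
            return sym(p[0], i, j)
        # The first symbol takes text[i:k] (non-empty); each remaining symbol
        # needs at least one character, which bounds k from above.
        return sum(
            sym(p[0], i, k) * seq(p[1:], k, j)
            for k in range(i + 1, j - len(p) + 2)
        )

    return sym(start, 0, len(text))
```

For example, `count_trees(GRAMMAR, "E", "x+x+x")` and `count_trees(GRAMMAR, "E", "x*x+x")` both return 2, matching the associativity and precedence ambiguities reported above, while `"x+x"` has a single tree.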

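Benchmark Grammars

[Figure: benchmark grammars placed within the nested classes LALR(1), LR(1), LR(2), LR(3), LR(4), LR(5), LR(6), LR(7), LR(8), ..., LR(k), and within the regions UNAMBIGUOUS, AMBIGUOUS and [OUR] (our analysis). Grammar labels shown: BaseR, P, G3, G4, G5, G6, G7, G8, Exp, O/E, Voss, Voss-light, G1 (5V+1H), G2 (6V+1H), Amb-Exp (1V+4H), where V and H count the reported vertical and horizontal ambiguities.]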


Related Work (Dynamic)

Dynamic disambiguation:

- "Disambiguation-by-convention": longest match, most specific match, …
- Customizable: [Bison v. 1.5+]: %dprec, %merge; [ASF+SDF]: "disambiguation filters"

Dynamic ambiguity interception:

- GLR ([Tomita], [Earley], [Bison], [ASF+SDF], …)

Related Work (Static)

Static disambiguation:

- "Disambiguation-by-convention": first match, most specific match, …
- Customizable: [Yacc]: %left, %right, %nonassoc, %prec

Static ambiguity interception (our work goes here):

- LL(k), LALR(1), LR(k), LR-regular, …
- Sylvain Schmitz (ICALP 2007): "Conservative Ambiguity Detection in Context-Free Grammars". Subsumes LR-regular; incomparable to our technique.

Example grammar shown on the slide:

    S : A A
    A : 'a' A 'a'
      | 'b'


Conclusion

Advantages (of our approach):

- Characterization! Possible to reason (locally) about ambiguity
- (Composable) analysis framework
- Complete decision procedure for regular grammars
- Inherently parallelizable
- DFA counterexamples, and shortest (possibly) ambiguous string
- Not "left-to-right" or "right-to-left" biased: can handle palindromic grammars
- Well-suited for RNA analysis :)

"Analyzing Ambiguity of Context-Free Grammars"

Conclusion (cont'd)

It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures. However, the fact that the problem is undecidable does not mean that there are no useful approximations to the problem.

We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this to conservatively approximate the problem based on local regular approximations and grammar unfoldings. As an application, we consider grammars that occur in RNA analysis in bioinformatics, and we demonstrate that our static analysis of context-free grammars is sufficiently precise and efficient to be practically useful.


Thank you

Questions, please?

Advertisement

Film about teaching/learning, based on educational research theories:

- Freely available on google video + on DVD (subtitles in 7 languages)
- Used on all continents for teaching teachers about teaching and learning
- 3,500+ DVDs (non-profit) sold in a few months
- 17,000+ online views
- Features epilogue by Prof. John Biggs


BONUS SLIDES

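Asymptotic (Time) Complexity

Grammar shape:

    N_1 : e_{1,1} … e_{a,1}
        | …
        | e_{1,p} … e_{a,p}

with $n$ nonterminals, at most $v$ productions per nonterminal, and at most $h$ symbols per production:

- $n = |N|$
- $v = \max\{\, |\pi(M)| : M \in N \,\}$
- $h = \max\{\, |\alpha| : \alpha \in \pi(M),\ M \in N \,\}$
- $g = nvh = |G|$

Costs:

- [Mohri-Nederhof]: $O(n^2 v h)$
- Vertical Amb: $O(n^3 v^4 h^4)$
- Horizontal Amb: $O(n^3 v^3 h^5)$
- Total: $O(n^3 v^3 h^4 (v+h)) \subseteq O(g^5)$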
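As a quick sanity check of the total, the two check costs can be added and bounded in terms of $g = nvh$; this worked step is a sketch that assumes $n, v, h \ge 1$.

```latex
\begin{align*}
O(n^3 v^4 h^4) + O(n^3 v^3 h^5)
  &= O\bigl(n^3 v^3 h^4 (v + h)\bigr) \\
n^3 v^4 h^4 &\le n^5 v^5 h^5 = g^5, \qquad
n^3 v^3 h^5 \le n^5 v^5 h^5 = g^5 \\
\text{hence}\quad n^3 v^3 h^4 (v + h) &\le 2 g^5 \in O(g^5)
\end{align*}
```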

Other (cheaper) approximations

Use cheaper approximations first, e.g.:

$\langle F, M, L \rangle$

- $F$: set of first chars
- $M$: set of middle chars
- $L$: set of last chars

Example: Odd/Even

Keeping track of parity (odd/even):

    Start : Even ;
          | Odd ;
    Even  : "(" "(" Even ")" ")" ;
          | ;
    Odd   : "(" "(" Odd ")" ")" ;
          | "(" ")" ;

Output:

    unambiguous grammar!

Languages and their approximations:

- $\mathcal{L}(\mathit{Even}) = \{\, (^{2n}\, )^{2n} \mid n \ge 0 \,\}$
- $\mathcal{L}(\mathit{Odd}) = \{\, (^{2n+1}\, )^{2n+1} \mid n \ge 0 \,\}$
- $A(\mathit{Even}) = \{\, (^{2n}\, )^{2m} \mid n, m \ge 0 \,\}$
- $A(\mathit{Odd}) = \{\, (^{2n+1}\, )^{2m+1} \mid n, m \ge 0 \,\}$
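The point of this example is that the approximations lose the matching of parentheses but keep their parity, which is enough to separate `Even` from `Odd`. A small sketch, writing the two approximated languages stated above as regular expressions (the names `A_EVEN`, `A_ODD`, and `in_both` are illustrative) and checking disjointness by exhaustive enumeration up to a length bound:

```python
import re
from itertools import product
from typing import List

# A(Even) = { (^{2n} )^{2m} },  A(Odd) = { (^{2n+1} )^{2m+1} }
A_EVEN = re.compile(r"(\(\()*(\)\))*")
A_ODD = re.compile(r"\((\(\()*\)(\)\))*")


def in_both(max_len: int) -> List[str]:
    """All strings over '(' and ')' up to max_len that lie in both approximations."""
    hits = []
    for n in range(max_len + 1):
        for chars in product("()", repeat=n):
            s = "".join(chars)
            if A_EVEN.fullmatch(s) and A_ODD.fullmatch(s):
                hits.append(s)
    return hits
```

`in_both(10)` finds no common string, so the vertical check for `Start : Even | Odd` succeeds on the approximations. The over-approximation is still visible: `"(())))"` is accepted by `A_EVEN` although it is not in the language of `Even`.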

$A_{MN}$ is Decidable!

- $X \cap Y = \emptyset$: constructively decidable (using DFAs), in $O(|X_{DFA}| \cdot |Y_{DFA}|)$
- $X ⩀ Y = \emptyset$: constructively decidable (using DFAs), in $O(|X_{DFA}| \cdot |Y_{DFA}|)$

"Constructively decidable" means: with potential counterexamples (as DFAs); i.e., we can extract shortest (potentially ambiguous) strings!
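For the intersection check, the standard product construction already gives both the emptiness test and a shortest witness. The sketch below is a generic breadth-first search over the product automaton, not the dk.brics.automaton implementation; the DFA encoding (start state, accepting set, partial transition dictionary) and all names are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple

# Hypothetical DFA encoding: (start, accepting states, transitions);
# a missing (state, char) entry means the word is rejected.
DFA = Tuple[int, Set[int], Dict[Tuple[int, str], int]]


def shortest_common_string(x: DFA, y: DFA, alphabet: str) -> Optional[str]:
    """Shortest string accepted by both DFAs, or None if the intersection is empty."""
    (sx, fx, tx), (sy, fy, ty) = x, y
    start = (sx, sy)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (p, q), word = queue.popleft()
        if p in fx and q in fy:
            return word  # BFS order guarantees minimal length
        for c in alphabet:
            if (p, c) in tx and (q, c) in ty:
                nxt = (tx[(p, c)], ty[(q, c)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + c))
    return None


# X = a*b, Y = all strings of length 2 over {a, b}, Z = a*
X: DFA = (0, {1}, {(0, "a"): 0, (0, "b"): 1})
Y: DFA = (0, {2}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 2, (1, "b"): 2})
Z: DFA = (0, {0}, {(0, "a"): 0})
```

The search visits at most $|X_{DFA}| \cdot |Y_{DFA}|$ product states, in line with the bound on the slide; here `shortest_common_string(X, Y, "ab")` yields `"ab"` and `shortest_common_string(X, Z, "ab")` yields `None`.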

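Decision Algorithm for ($X ⩀ Y$)

For $X$, $Y$ regular languages (NFAs):

- All overlappings, "xay" (as DFAs), are computed by essentially a variant of the "DFA product-construction".

[Figure: the automata $X_{NFA}$ and $Y_{NFA}$, the derived automata $X'_{NFA}$ and $Y'_{NFA}$, and the combined automaton $[X;Y]_{NFA}$; a path spelling $x\,a\,y$ corresponds to an overlap, where $a$ can be read either as the end of a string of $X$ or as the beginning of a string of $Y$.]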

Examples: RNA Analysis (G7, G8)

Grammar G7:

    S[aPa]   : "(" P ")" ;
     [aL]    | "." L ;
     [Ra]    | R "." ;
     [LS]    | L S ;
    L[aPa]   : "(" P ")" ;
     [aL]    | "." L ;
    R[Ra]    : R "." ;
     [empty] | ;
    P[aPa]   : "(" P ")" ;
     [aNa]   | "(" N ")" ;
    N[aL]    : "." L ;
     [Ra]    | R "." ;
     [LS]    | L S ;

RNA Analysis (G7):

    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[aNa]'
        shortest ambiguous string: "(((.)"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        0 (potential) horizontal ambiguities

Grammar G8:

    S[aS]    : "." S ;
     [T]     | T ;
     [empty] | ;
    T[Ta]    : T "." ;
     [aPa]   | "(" P ")" ;
     [TaPa]  | T "(" P ")" ;
    P[aPa]   : "(" P ")" ;
     [aNa]   | "(" N ")" ;
    N[aS]    : "." S ;
     [Ta]    | T "." ;
     [TaPa]  | T "(" P ")" ;

RNA Analysis (G8):

    *** (potential) vertical ambiguity detected: 'P[aPa]' vs. 'P[aNa]'
        shortest ambiguous string: "(((.)"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        0 (potential) horizontal ambiguities

Note: these are all spurious errors due to imprecisions in the analysis.

Example: Expressions

Expressions:

    E[term] : T ;
     [plus] | E "+" T ;
    T[x]    : "x" ;
     [par]  | "(" E ")" ;

Output:

    *** (potential) vertical ambiguity detected: 'E[term]' vs. 'E[plus]'
        shortest ambiguous string: "x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..0]' vs. 'E[plus:1..2]'
        shortest ambiguous string: "x+x+x"
    *** (potential) horizontal ambiguity detected: 'E[plus:0..1]' vs. 'E[plus:2..2]'
        shortest ambiguous string: "x+x+x"
    *** (potentially) ambiguous grammar:
        1 (potential) vertical ambiguity
        2 (potential) horizontal ambiguities

Note: This is a general problem with non-linear recursive structures.

However, there's a trick...

Examples: Expressions (cont'd)

Expressions:

    E[term] : T ;
     [plus] | E "+" T ;
    T[x]    : "x" ;
     [par]  | "(" E ")" ;

Unfold trick (inside/outside parentheses): unfold the grammar wrt. '(' and ')'. On the unfolded grammar the analysis reports:

    unambiguous grammar!

[Figure: the grammar $G$ and its unfolded version $G_u$, illustrated on the string x+(x+(x+x)+x)+x and its unfolded counterpart $u$; each nesting level of parentheses is derived by its own copy of the productions E : T | E "+" T and T : "x" | "(" E ")". Derivation trees of $G$ (AST) are related to derivation trees of $G_u$ (AST$_u$).]


Proof (Lemma 1a): "$\Rightarrow$"

Lemma 1a: $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Rightarrow$ $G$ unambiguous

...contrapositively: $G$ ambiguous $\Rightarrow$ $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous

Proof structure:

- Assume $G$ ambiguous (i.e., $\exists$ 2 derivation trees for some $\omega$)
- Show: $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous, by induction in the max height of the 2 derivation trees

Base case (height 1):

The ambiguity means that $N \Rightarrow \alpha$ and $N \Rightarrow \alpha'$, where both trees have height 1 and derive $\omega$. However, this means that $\alpha = t_0 t_1 \ldots t_{|\omega|-1} = \alpha'$ (i.e., the two trees must be the same); and so the result holds vacuously.

Induction step (height n):

Assume the induction hypothesis (for height n-1). The ambiguity means that there are two trees rooted at $N$:

- one applies $N \Rightarrow \alpha = \alpha_0 \ldots \alpha_{|\alpha|-1}$, with subtrees $T_i$ (of height at most n-1) deriving $\omega_i$ from $\alpha_i$;
- the other applies $N \Rightarrow \alpha' = \alpha'_0 \ldots \alpha'_{|\alpha'|-1}$, with subtrees $T'_i$ (of height at most n-1) deriving $\omega'_i$ from $\alpha'_i$;

and $\omega_0 \ldots \omega_{|\alpha|-1} = \omega = \omega'_0 \ldots \omega'_{|\alpha'|-1}$.

Case $\alpha \neq \alpha'$ (i.e., different production):

...but then $\mathcal{L}(\alpha) \cap \mathcal{L}(\alpha') \supseteq \{\omega\}$, i.e., we have a vertical ambiguity.

Case $\alpha = \alpha'$ (i.e., same production), i.e., "the tops of the trees are the same":

- Case $\forall i : \omega_i = \omega'_i$: the ambiguity is in some subtree $i$ (i.e., $T_i$ & $T'_i$ ambiguously derive the same $\omega_i$). The induction hypothesis (on these subtrees) gives a vertical or horizontal ambiguity.

- Case $\exists i : \omega_i \neq \omega'_i$: Now let $k = \min\{\, i \mid \omega_i \neq \omega'_i \,\}$. Write $\omega = x y = x' y'$ with $x = \omega_0 \ldots \omega_k$ and $x' = \omega'_0 \ldots \omega'_k$. Since $\omega_i = \omega'_i$ for $i < k$ and $\omega_k \neq \omega'_k$, the prefixes $x$ and $x'$ differ in length; assume w.l.o.g. $|x| < |x'|$. Then $x' = xa$ and $y = ay'$ for some $a \in \Sigma^+$, so $\omega = x a y'$ and

  $\mathcal{L}(\alpha_0 \ldots \alpha_k) ⩀ \mathcal{L}(\alpha_{k+1} \ldots \alpha_{|\alpha|-1}) \supseteq \{ x a y' \}$

  i.e., we have a horizontal ambiguity.

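Proof (Lemma 1b): "$\Leftarrow$"

Lemma 1b: $G$ vertically unambiguous $\wedge$ $G$ horizontally unambiguous $\Leftarrow$ $G$ unambiguous

...contrapositively: $G$ ambiguous $\Leftarrow$ $G$ vertically ambiguous $\vee$ $G$ horizontally ambiguous

Proof:

Assume $G$ vertically ambiguous (vertical conflict). Then for some $N$ with productions $\alpha \neq \alpha'$:

$N \Rightarrow \alpha \Rightarrow^* a, \quad N \Rightarrow \alpha' \Rightarrow^* a, \quad \mathcal{L}(\alpha) \cap \mathcal{L}(\alpha') \supseteq \{a\}$

But then derive (using reachability + derivability of $N$), for some $x, y \in \Sigma^*$ and $\beta \in E^*$:

$s \Rightarrow^* x N \beta \Rightarrow x \alpha \beta \Rightarrow^* x a \beta \Rightarrow^* x a y$

$s \Rightarrow^* x N \beta \Rightarrow x \alpha' \beta \Rightarrow^* x a \beta \Rightarrow^* x a y$

so $G$ is ambiguous.

Proof (Lemma 1b): "$\Leftarrow$" (cont'd)

Assume $G$ horizontally ambiguous (horizontal conflict). Then for some $N$:

$N \Rightarrow \alpha_l \alpha_r$, where $\mathcal{L}(\alpha_l) ⩀ \mathcal{L}(\alpha_r) \neq \emptyset$

i.e.

$\exists x, y \in \Sigma^* : \exists a \in \Sigma^+ : x, xa \in \mathcal{L}(\alpha_l) \wedge y, ay \in \mathcal{L}(\alpha_r)$

But then derive (using reachability + derivability of $N$), for some $v, w \in \Sigma^*$ and $\beta \in E^*$:

$s \Rightarrow^* v N \beta \Rightarrow v \alpha_l \alpha_r \beta \Rightarrow^* v x \alpha_r \beta \Rightarrow^* v x a y \beta \Rightarrow^* v x a y w$

$s \Rightarrow^* v N \beta \Rightarrow v \alpha_l \alpha_r \beta \Rightarrow^* v x a \alpha_r \beta \Rightarrow^* v x a y \beta \Rightarrow^* v x a y w$

so $G$ is ambiguous.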
