
Upload: posy-byrd

Post on 06-Jan-2018


Parsing

Context-Free Grammars, Parsing, Syntax Trees

Parsing

Produce the parse tree for a given program (token stream):

i := 1 ; sum := 0 ; while not i = 10 do { i = …
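A parser consumes tokens rather than raw characters, so it may help to see how a token stream like the one above could be produced. The following is a minimal sketch, not part of the slides: the token kinds (num, id, keywords, punctuation), the keyword list, and the names TOKEN_RE, KEYWORDS, and tokenize are all assumptions made for illustration.

```python
import re

# One alternative per token kind: number, word, or punctuation/operator.
# The set of operators is an assumption based on the sample program.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(:=|==|[;=+*(){}]))")
KEYWORDS = {"while", "not", "do"}  # assumed keyword list


def tokenize(text):
    """Split program text into (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, word, symbol = match.groups()
        if number is not None:
            tokens.append(("num", number))
        elif word is not None:
            # Keywords are their own kind; every other word is an identifier.
            tokens.append((word if word in KEYWORDS else "id", word))
        else:
            tokens.append((symbol, symbol))
        pos = match.end()
    return tokens


tokens = tokenize("i := 1 ; sum := 0 ; while not i = 10 do {")
```

The parser only needs the kind of each token (id, num, ;, and so on) to choose productions; the lexeme is kept so that it can label the leaves of the tree.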

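Parsing

[Figure: parse tree for the program. The root ; has two children: the assignment = (i, 1) and a second ; node. The second ; holds the assignment = (sum, 0) and the while node. The while node holds the condition not (== (i, 10)) and a body ; with the assignments = (i, + (i, 1)) and = (sum, + (sum, i)).]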

Recursive-Descent Parsing

Using a stack, match grammar symbols (terminal and nonterminal) with the tokens in the stream:

Input: i = 1 ; sum = 0 ; while not …
Stack: C _

Beginning with the grammar's "start symbol", pop the next stack symbol and try to match it with the next token in the stream.

Input: i = 1 ; sum = 0 ; while not …
Stack: C _

If they don't match, expand the symbol by the appropriate grammar production and push those symbols onto the stack, leftmost symbol on top.

Input: i = 1 ; sum = 0 ; while not …
Production: C ::= S ; C
Stack: S ; C _

Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol's node …

Input: i = 1 ; sum = 0 ; while not …
Production: C ::= S ; C
Stack: S ; C _

[Figure: parse tree so far. The root C has spawned the children S, ;, and C.]

Recursive-Descent Parsing

Pop the next stack symbol and try to match it with the next token in the stream.

Input: i = 1 ; sum = 0 ; while not …
Stack: S ; C _

If they don't match, expand the symbol by the appropriate grammar production and push those symbols onto the stack.

Input: i = 1 ; sum = 0 ; while not …
Production: S ::= id = num
Stack: id = num ; C _

Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol's node …

Input: i = 1 ; sum = 0 ; while not …
Production: S ::= id = num
Stack: id = num ; C _

[Figure: parse tree so far. S has spawned the children id (i), =, and num (1).]

Recursive-Descent Parsing

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token…

Input: i = 1 ; sum = 0 ; while not …
Stack: id = num ; C _

[Figure: parse tree with the id node (i) matched.]

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.

Input: = 1 ; sum = 0 ; while not …
Stack: = num ; C _

[Figure: parse tree with the = node matched.]

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.

Input: 1 ; sum = 0 ; while not i == …
Stack: num ; C _

[Figure: parse tree with the num node (1) matched.]

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.

Input: ; sum = 0 ; while not i == …
Stack: ; C _

[Figure: parse tree with the ; node matched.]

Recursive-Descent Parsing

Pop the next stack symbol and try to match it with the next token in the stream.

Input: sum = 0 ; while not i == 10 …
Stack: C _

If they don't match, expand the symbol by the appropriate grammar production and push those symbols onto the stack.

Input: sum = 0 ; while not i == 10 …
Production: C ::= S ; C
Stack: S ; C _

Parse tree nodes corresponding to the new symbols will spawn as children of the popped symbol's node …

Input: sum = 0 ; while not i == 10 …
Production: C ::= S ; C
Stack: S ; C _

[Figure: parse tree so far. The second C has spawned the children S (the assignment to sum), ;, and C (the while loop).]

Recursive-Descent Parsing

… (the parse continues in the same way through the rest of the program) …

Recursive-Descent Parsing

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.

Input: + i }
Stack: + id } _

[Figure: parse tree with the last + node matched.]

Pop the next stack symbol and try to match it with the next token in the stream. If they match, eat the token.

Input: i }
Stack: id } _

[Figure: parse tree with the last id node (i) matched.]

Recursive-Descent Parsing

If the symbol stack and token stream simultaneously become empty, then the program has parsed successfully.

Input: }
Stack: } _
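The pop, match, and expand loop walked through above can be sketched in Python. This is a minimal sketch under stated assumptions rather than the slides' own code: the slides only show the productions C ::= S ; C and S ::= id = num, so the ε alternative for C, the "$" end-of-input marker, and the names TABLE, kind, and parse are hypothetical and introduced only for illustration.

```python
# Hypothetical toy grammar (an assumption, not the slides' full grammar):
#   C ::= S ; C | ε
#   S ::= id = num
# Each entry maps (nonterminal on the stack, kind of next token) to the
# right-hand side to push. An empty list stands for ε.
TABLE = {
    ("C", "id"): ["S", ";", "C"],
    ("C", "$"): [],
    ("S", "id"): ["id", "=", "num"],
}
NONTERMINALS = {"C", "S"}


def kind(token):
    """Classify a raw token as num, id, or itself (punctuation, end marker)."""
    if token.isdigit():
        return "num"
    if token.isidentifier():
        return "id"
    return token


def parse(tokens, start="C"):
    """Return the list of expansions performed, or raise SyntaxError."""
    stream = list(tokens) + ["$"]  # "$" is an assumed end-of-input marker
    stack = ["$", start]           # the top of the stack is the end of the list
    pos = 0
    trace = []
    while stack:
        symbol = stack.pop()
        look = kind(stream[pos])
        if symbol in NONTERMINALS:
            rhs = TABLE.get((symbol, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {stream[pos]!r} while expanding {symbol}")
            trace.append((symbol, rhs))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif symbol == look:
            pos += 1                     # match: eat the token
        else:
            raise SyntaxError(f"expected {symbol!r}, found {stream[pos]!r}")
    if pos != len(stream):
        raise SyntaxError("input left over after the stack emptied")
    return trace


steps = parse("i = 1 ; sum = 0 ;".split())
```

Each (symbol, rhs) pair in the returned trace corresponds to one expansion step, which is also the point where the popped symbol's tree node would gain its children.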

Which production?

When we expand a nonterminal, how do we decide which production to use?

Input: i + 1
Grammar: E ::= E + num | num
Stack: E _

We must predict the structure of the program based on the next input token: predictive parsing.

Input: i + 1
Grammar: E ::= E + num | num
Stack: E _

If more than one production could apply for the same next input token, the grammar belongs to a class that can't be parsed by the predictive algorithm.

Input: i + 1
Grammar: E ::= E + num | num
Stack: E _

Unambiguous Arithmetic Grammar

E ::= E + T | T
T ::= T * F | F
F ::= num | (E)

We don't know if an E will have 1 term or many terms. Should we expand E to E + T or T?

Unambiguous Arithmetic Grammar

E ::= E + T | T
T ::= T * F | F
F ::= num | (E)

E ::= T { + T }*    (that is, T + T + T + …)

We do know that E must always start with T, followed by 0 or more + T's.

Unambiguous Arithmetic Grammar

E  ::= T E'          (replaces E ::= E + T | T)
E' ::= + T E' | ε
T  ::= T * F | F
F  ::= num | (E)

Solution: factor out the leading T, which removes the left recursion. E' is nullable and symbolizes an appended (+ T).

Now, what about T and F?

Unambiguous Arithmetic Grammar

E  ::= T E'
E' ::= + T E' | ε
T  ::= F T'          (replaces T ::= T * F | F)
T' ::= * F T' | ε
F  ::= num | (E)

The T/F interrelationship is isomorphic…
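The rewrite applied to E and then to T follows one mechanical pattern: A ::= A α | β becomes A ::= β A' and A' ::= α A' | ε. The following is a minimal sketch of that pattern, assuming a grammar stored as lists of symbol lists; the function name eliminate_left_recursion and the data layout are hypothetical, and the sketch handles only immediate left recursion (a rule whose own name is its first symbol).

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A ::= A a | b as A ::= b A' and A' ::= a A' | ε.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Returns a dict from nonterminal names to their new alternatives.
    The empty list [] stands for ε.
    """
    # Split the alternatives into those that start with A itself and the rest.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to rewrite
    tail = nonterminal + "'"
    return {
        nonterminal: [alt + [tail] for alt in others],
        tail: [alt + [tail] for alt in recursive] + [[]],
    }


grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["num"], ["(", "E", ")"]],
}
rewritten = {}
for name, alts in grammar.items():
    rewritten.update(eliminate_left_recursion(name, alts))
```

Applied to the three original rules, this yields the five-rule grammar shown on the last slide above; F is returned unchanged because it has no left-recursive alternative.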

The "First" Set

E  ::= T E'
E' ::= + T E' | ε
T  ::= F T'
T' ::= * F T' | ε
F  ::= num | (E)

First (T E') = { num , ( }
First (+ T E') = { + }
First (F T') = ??
…
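First sets can be computed by iterating until nothing changes. The sketch below is one common way to do it and is not taken from the slides: the names GRAMMAR, EPS, first_of_sequence, and compute_first are assumptions, and the marker ε is stored inside a First set to record that a nonterminal is nullable (the slides instead treat the null production as having an empty First set).

```python
EPS = "ε"  # marker recording that a symbol sequence can derive the empty string

# The left-recursion-free grammar from the slide; [] stands for ε.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["num"], ["(", "E", ")"]],
}


def first_of_sequence(symbols, first):
    """First set of a sequence of symbols, given First sets of nonterminals."""
    result = set()
    for sym in symbols:
        if sym not in GRAMMAR:        # terminal: it must start the sequence
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:     # not nullable: later symbols can't start it
            return result
    result.add(EPS)                   # every symbol was nullable (or rhs is ε)
    return result


def compute_first():
    """Grow the First sets of all nonterminals until a fixed point is reached."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, alts in GRAMMAR.items():
            for rhs in alts:
                new = first_of_sequence(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first()
first_t_e = first_of_sequence(["T", "E'"], FIRST)       # {"num", "("}
first_plus = first_of_sequence(["+", "T", "E'"], FIRST)  # {"+"}
```

Calling first_of_sequence on any other right-hand side, such as ["F", "T'"], answers the remaining questions on the slide in the same way.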

Predictive Parsing Table

Columns: next token (terminal) in input. Rows: next symbol (nonterminal) on stack.

        +         *         num       (         )
E                           T E'      T E'
E'      + T E'
T                           F T'      F T'
T'                * F T'
F                           num       (E)

Predictive Parsing Table

Place each production under each terminal in its First set.

One entry per (stack symbol, input token) pair, … otherwise??

Empty cells raise a parsing error.

But, not so fast…

Since the null production will have an empty First set, it doesn't appear in the table?

Input: ( 1 + 3 ) * 5 …
Production: E ::= T E'
Stack: E

But, not so fast…

Since the null production will have an empty First set, it doesn't appear in the table?

Input: 3 ) * 5 …
Stack: num T' E' ) T' E'

But, not so fast…

Since the null production will have an empty First set, it doesn't appear in the table?

Input: ) * 5 …
Stack: T' E' ) T' E'

Parse error?

The "Follow" Set

E  ::= T E'
E' ::= + T E' | ε
T  ::= F T'
T' ::= * F T' | ε
F  ::= num | (E)

Follow ( E' ) = { ) }
Follow ( T' ) = { ) , + }

Include a null production entry for each nullable nonterminal under all terminals in its Follow set …
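Follow sets can be computed with the same fixed-point style as First sets. The sketch below is an illustration under assumptions, not the slides' method: the First sets and the nullable nonterminals are hard-coded for this grammar to keep the example short, and an end-of-input marker "$" is added to Follow(E). The slides' table has no "$" column, so the computed sets match the slide once "$" is set aside.

```python
END = "$"  # assumed end-of-input marker; it does not appear in the slides' table

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["num"], ["(", "E", ")"]],
}
# Hard-coded for this grammar (terminals only); see the "First" set slide.
FIRST = {
    "E": {"num", "("}, "E'": {"+"},
    "T": {"num", "("}, "T'": {"*"},
    "F": {"num", "("},
}
NULLABLE = {"E'", "T'"}


def compute_follow(start="E"):
    """Grow the Follow sets of all nonterminals until a fixed point is reached."""
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, alts in GRAMMAR.items():
            for rhs in alts:
                # Walk right to left; `trailer` is what may follow the current symbol.
                trailer = set(follow[head])
                for sym in reversed(rhs):
                    if sym in GRAMMAR:
                        if not trailer <= follow[sym]:
                            follow[sym] |= trailer
                            changed = True
                        if sym in NULLABLE:
                            trailer = trailer | FIRST[sym]
                        else:
                            trailer = set(FIRST[sym])
                    else:
                        trailer = {sym}  # a terminal hides everything after it
    return follow


FOLLOW = compute_follow()
```

Every terminal in FOLLOW["E'"] and FOLLOW["T'"] is a column where the corresponding ε entry belongs in the parsing table.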

Complete Predictive Parsing Table

Columns: next token (terminal) in input. Rows: next symbol (nonterminal) on stack.

        +         *         num       (         )
E                           T E'      T E'
E'      + T E'                                  ε
T                           F T'      F T'
T'      ε         * F T'                        ε
F                           num       (E)
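With the completed table, the stack algorithm from the earlier walkthrough can run on the expression grammar. The following is a minimal sketch: the table entries are copied from the slide, while the "$" end-of-input column, the function name accepts, and the whitespace-separated input format are assumptions added for illustration.

```python
END = "$"  # assumed end-of-input marker (an extra column not shown on the slide)

# (nonterminal on stack, next token) -> right-hand side to push; [] stands for ε.
TABLE = {
    ("E", "num"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "num"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", END): [],
    ("F", "num"): ["num"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def accepts(tokens):
    """Return True if the token list is a valid expression, else False."""
    stream = ["num" if t.isdigit() else t for t in tokens] + [END]
    stack = [END, "E"]  # the top of the stack is the end of the list
    pos = 0
    while stack:
        symbol = stack.pop()
        look = stream[pos]
        if symbol in NONTERMINALS:
            rhs = TABLE.get((symbol, look))
            if rhs is None:
                return False             # empty cell: parse error
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif symbol == look:
            pos += 1                     # match: eat the token
        else:
            return False
    return pos == len(stream)


ok = accepts("( 1 + 3 ) * 5".split())  # True
bad = accepts("( 1 + 3".split())       # False: the closing ) never arrives
```

In the first call, the state that looked like a parse error on the earlier slide (T' on the stack, ) in the input) now finds the ε entry and simply pops T'.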

Recursive Descent Parsing

C
S ; C
id := num ; C
i := num ; C
i := 1

[Figure: syntax tree for the whole program, with := as the assignment nodes.]
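The same predictions can also be written as one function per nonterminal, with the language's call stack taking the place of the explicit symbol stack. The following is a minimal sketch for the expression grammar, not the slides' own code: the class and method names are hypothetical, the "$" end marker is an assumption, and E' and T' are written as loops, following the E ::= T { + T }* reading from the earlier slide.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" is an assumed end marker
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected=None):
        """Consume the next token, checking it first if `expected` is given."""
        token = self.peek()
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def parse_e(self):
        # E ::= T E'  with  E' ::= + T E' | ε  written as a loop
        node = self.parse_t()
        while self.peek() == "+":
            self.eat("+")
            node = ("+", node, self.parse_t())
        return node

    def parse_t(self):
        # T ::= F T'  with  T' ::= * F T' | ε  written as a loop
        node = self.parse_f()
        while self.peek() == "*":
            self.eat("*")
            node = ("*", node, self.parse_f())
        return node

    def parse_f(self):
        # F ::= num | (E): the next token alone decides which alternative
        token = self.peek()
        if token.isdigit():
            return int(self.eat())
        if token == "(":
            self.eat("(")
            node = self.parse_e()
            self.eat(")")
            return node
        raise SyntaxError(f"unexpected {token!r}")


def parse(tokens):
    """Parse a whole expression and return its syntax tree as nested tuples."""
    parser = Parser(tokens)
    tree = parser.parse_e()
    parser.eat("$")  # the whole input must be consumed
    return tree


tree = parse("( 1 + 3 ) * 5".split())  # ("*", ("+", 1, 3), 5)
```

Building the tree inside the loops makes + and * associate to the left, which is the shape the original left-recursive grammar described.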