
Upload: phoebe-blankenship

Post on 03-Jan-2016


LANGUAGE TRANSLATORS: WEEK 17

scom.hud.ac.uk/scomtlm/cis2380/

See Appel’s book chapter 3 for support reading

Last Week: Top-down, Table driven parsers

Next Few Weeks: Bottom-up, Table driven, Shift-Reduce (SR) Parser
    How SR Parsers Work
    How to Create SR Parsers
    Practical: How to use JavaCup (which creates SR Parsers)

This Week’s TUTORIALS: How SR Parsers Work + JavaCup


Parsing Direction

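[Figure: a parse tree with the Generating Symbol at the top and the token string "begin X = X + 1 ; end" at the bottom. TOP DOWN parsing works from the generating symbol towards the tokens; BOTTOM UP parsing works from the tokens towards the generating symbol.]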


Pushdown Automata

Table driven Parsers are “Pushdown Automata” (PDA)

PDAs are composed of a control, a stack and a table

The control decides what to do using the table, the current input, the top of the stack, and the current state (if any).

Stack operations (for stack K) include:
    “Pop(K)” – removes (discards) the top of K
    “Push(W,K)” – pushes symbols of string W from right to left on to the top of K
    “Top(K)” – reads the top of the stack

table operations are restricted to table look-up
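The three stack operations can be sketched in Python as below. This is only an illustration: the stack K is assumed to be a Python list whose last element is the top, and the function names simply mirror the slide's Pop/Push/Top.

```python
# Hypothetical helpers mirroring the slide's Pop(K), Push(W,K) and Top(K).
# Assumption: the stack K is a Python list and its LAST element is the top.

def top(K):
    """Read the top of stack K without removing it."""
    return K[-1]

def pop(K):
    """Remove (discard) the top of stack K."""
    K.pop()

def push(W, K):
    """Push the symbols of W from right to left, so W's first symbol ends up on top."""
    for symbol in reversed(W):
        K.append(symbol)

K = ["S"]                 # a stack holding one symbol
pop(K)                    # stack is now empty
push(["a", "B", "c"], K)  # pushes c, then B, then a
print(K)                  # ['c', 'B', 'a']
print(top(K))             # a
```

Pushing from right to left matters for top-down parsing: after replacing a non-terminal by the right-hand side W of a rule, the leftmost symbol of W is the next one the parser has to deal with, so it must be on top.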

[Figure: a PDA. The control reads the Input Sequence of Symbols, uses the table and the stack, and produces the Output.]


Table-Driven Parser

Table driven Parsing – Summary
    Parsing is carried out by a PDA.
    The PDA’s table is automatically created from the language’s grammar.
    Language-independent method (generate a different table for each language): the control and stack are fixed, depending on the type of Table-driven parser.

[Figure: a table-driven parser. The control reads the Input Sequence of Tokens, uses the table and the stack, and outputs a Parse Tree or an Error. Different Language? Change the Table.]


Top Down Parsing – Summary

Control Loop

K is the stack

n is the next token

case Top(K) of
    -- Non-Terminal:
        if lookup(Top(K),n) returns rule X ::= W then {Pop(K); Push(W, K);} else error
    -- Token:
        if n == Top(K) then {Pop(K); consume n;} else error
end case

[Figure: the control with its table and stack, reading the next token n. Table: indexed by Tokens and Non-Terminals; entries are references to rules X ::= W. Stack: contains NT/T symbols (Non-Terminals / Tokens).]


Top Down Parsing – Summary

Initialisation:

stack = Push(start symbol, empty stack)

n is the first token in the input stream

End condition

stack = empty stack

Input stream is empty
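The top-down control loop, initialisation and end condition can be put together as a short Python sketch. The grammar and table here are made up for illustration (they are not taken from the lecture or from Appel); they are just big enough to parse the token sequence for "begin X = X + 1 ; end", with "id" and "num" assumed as token names and "$" assumed as an end-of-input marker.

```python
# Hypothetical LL(1) table: (non-terminal, next token) -> right-hand side W
# of the rule X ::= W to apply. An empty list stands for an empty RHS.
TABLE = {
    ("Prog", "begin"): ["begin", "Stmts", "end"],
    ("Stmts", "id"): ["Stmt", ";", "Stmts"],
    ("Stmts", "end"): [],
    ("Stmt", "id"): ["id", "=", "Expr"],
    ("Expr", "id"): ["Operand", "Rest"],
    ("Expr", "num"): ["Operand", "Rest"],
    ("Rest", "+"): ["+", "Operand", "Rest"],
    ("Rest", ";"): [],
    ("Operand", "id"): ["id"],
    ("Operand", "num"): ["num"],
}
NON_TERMINALS = {nt for nt, _ in TABLE}
END = "$"  # assumed end-of-input marker

def parse(tokens):
    """Return True if the token list is accepted, False on a syntax error."""
    tokens = list(tokens) + [END]
    pos = 0
    K = ["Prog"]                       # initialisation: push the start symbol
    while K:                           # end condition (part 1): stack is empty
        n = tokens[pos]                # n is the next token
        symbol = K[-1]                 # Top(K)
        if symbol in NON_TERMINALS:
            W = TABLE.get((symbol, n)) # table look-up
            if W is None:
                return False           # null entry: syntax error
            K.pop()                    # Pop(K)
            K.extend(reversed(W))      # Push(W, K), right to left
        elif symbol == n:
            K.pop()                    # token matches: Pop(K) ...
            pos += 1                   # ... and consume n
        else:
            return False
    return tokens[pos] == END          # end condition (part 2): input is used up

print(parse(["begin", "id", "=", "id", "+", "num", ";", "end"]))  # True
print(parse(["begin", "id", "=", "end"]))                         # False
```

Only TABLE (and the start symbol) would change for a different language; the loop itself stays the same.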


Table-Driven Parser

Introducing: Bottom-up, Table-driven, Shift-Reduce (SR) Parser

Same “Table-Driven” idea, so:
    the architecture is the same
    the Table is auto-generated from the Grammar, as for the Top Down Parser

But:
    Control, Stack + Table are more elaborate
    Uses an explicit state
    More robust + popular than Top Down
    Popular Parser Generators available (e.g. JavaCup, YACC)

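[Figure: the BNF Grammar is fed to a Parser-Generator (JavaCup), which produces the table. The control uses the table and the stack to turn the Input Sequence of Tokens into a Parse Tree or an Error.]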


Introducing SR Parsing (Appel, p60)

Control Loop

K is the stack

n is the next token

s is Top(K), the current state

case lookup(n,s) of
    s x: Push(n,K); Push(x,K); consume n
    r y: Reduce K with rule y
    g z: Push(z,K)
end case

NB: any null entries give syntax errors

[Figure: the control with its table and stack, reading the next token n. Table: indexed by Tokens + Non-Terminals and by State Numbers; entries are Shift (s), Reduce (r), Goto (g) actions on the Stack / State. Stack: state No., NT/T, state No., NT/T, state No., etc.]


Introducing SR Parsing

Initialisation:

stack = Push(state 1, empty stack)

n is the first token in the input stream

End condition

Reach entry “accept” in the table.


“Shift-Reduce” Parsers - General Workings

Shift x: (means “Shift symbol, move to state x”)

Put symbol onto the top of the stack;

Put the new state number x on top of the stack

Reduce y: (means “Reduce with rule y”)

Match the RHS of rule y against the top of the stack and REMOVE all the matched symbols (together with their state numbers);

Push the LHS of rule y onto the top of the stack;

Look up the LHS of rule y + the state below it in the Table to find the next state s, then push s onto the stack.
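The shift, reduce and goto steps above can be tried out with a small Python sketch. Everything specific here is an assumption made for illustration: the two-rule grammar (rule 1: S ::= ( S ), rule 2: S ::= x), the hand-built table for it, the "$" end-of-input marker, and the tuple encoding of table entries. A generator such as JavaCup would normally produce the table. The stack alternates state numbers and symbols and starts with state 1.

```python
# Hypothetical grammar:  rule 1: S ::= ( S )     rule 2: S ::= x
RULES = {1: ("S", ["(", "S", ")"]), 2: ("S", ["x"])}

# Hand-built table for that grammar: (state, symbol) -> action.
# ("s", x) = shift and go to state x, ("r", y) = reduce with rule y,
# ("g", z) = goto state z, ("a", None) = accept. Missing entries are errors.
TABLE = {
    (1, "("): ("s", 3), (1, "x"): ("s", 4), (1, "S"): ("g", 2),
    (2, "$"): ("a", None),
    (3, "("): ("s", 3), (3, "x"): ("s", 4), (3, "S"): ("g", 5),
    (4, ")"): ("r", 2), (4, "$"): ("r", 2),
    (5, ")"): ("s", 6),
    (6, ")"): ("r", 1), (6, "$"): ("r", 1),
}

def parse(tokens):
    """Return True if the token list is accepted, False on a syntax error."""
    tokens = list(tokens) + ["$"]      # "$" is an assumed end-of-input marker
    pos = 0
    K = [1]                            # initialisation: push state 1
    while True:
        s = K[-1]                      # current state is the top of the stack
        n = tokens[pos]                # next token
        entry = TABLE.get((s, n))
        if entry is None:
            return False               # null entry: syntax error
        kind, arg = entry
        if kind == "s":                # Shift x
            K.append(n)                # put the symbol on the stack
            K.append(arg)              # put the new state x on the stack
            pos += 1                   # move on to the next token
        elif kind == "r":              # Reduce y
            lhs, rhs = RULES[arg]
            if rhs:                    # remove the matched RHS and its states
                del K[-2 * len(rhs):]
            below = K[-1]              # state now exposed below the LHS
            goto = TABLE.get((below, lhs))
            if goto is None:
                return False
            K.append(lhs)              # push the LHS of rule y
            K.append(goto[1])          # push the state found via the goto entry
        else:                          # reached "accept"
            return True

print(parse(["(", "x", ")"]))  # True
print(parse(["(", "x"]))       # False
```

For the input ( x ) the stack goes through 1; 1 ( 3; 1 ( 3 x 4; 1 ( 3 S 5; 1 ( 3 S 5 ) 6; 1 S 2, and the entry for state 2 and "$" is accept.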


LR Parsers - Summary

In this lecture:
    we have reviewed how TD (top-down) table driven parsers work
    we have seen HOW BU (bottom-up) table driven parsers work

During the week – go through trace in handout example

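NB The SR parser is called “LR”: the “L” means it reads the string from Left to right, and the “R” means it builds a Rightmost derivation in reverse (the parse tree is built bottom-up, from the tokens towards the generating symbol).

“Most” parsers are “LR(1)”: the “1” means they look at the 1 next token in the string (one token of lookahead).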