top down parsing

Page 1: Top Down Parsing

8 January 2004, Department of Software & Media Technology

Top Down Parsing

Recursive Descent Parsing

Top-down parsing:

– Build tree from root symbol
– Each production corresponds to one recursive procedure
– Each procedure recognizes an instance of a non-terminal and returns a tree fragment for that non-terminal

Page 2: Top Down Parsing

General model

Each right-hand side of a production provides the body for a function.

Each non-terminal on the right-hand side is translated into a call to the function that recognizes that non-terminal.

Each terminal on the right-hand side is translated into a call to the lexical scanner. If the resulting token is not the expected terminal, an error occurs.

Each recognizing function returns a tree fragment.

Page 3: Top Down Parsing

Example: parsing a declaration

FULL_TYPE_DECLARATION ::= type DEFINING_IDENTIFIER is TYPE_DEFINITION;

Translates into:

– get token type
– find a defining_identifier -- function call
– get token is
– recognize a type_definition -- function call
– get token semicolon

In practice, we already know that the first token is type; that is why this routine was called in the first place. Predictive parsing is guided by the next token.
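The translation above can be sketched in Python. This is a minimal illustration, not the parser the slides describe: the `Parser` class, the `expect` helper, the `declaration` dispatcher and the token list format are all assumptions, and TYPE_DEFINITION is reduced to a hypothetical stand-in (`new` followed by an identifier).

```python
KEYWORDS = {"type", "is", "new"}


class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["<eof>"]  # sentinel marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, terminal):
        # A terminal on the right-hand side: check the next token, then consume it.
        token = self.peek()
        if token != terminal:
            raise ParseError(f"expected {terminal!r}, got {token!r}")
        self.pos += 1
        return token

    def defining_identifier(self):
        token = self.peek()
        if not token.isidentifier() or token in KEYWORDS:
            raise ParseError(f"expected identifier, got {token!r}")
        self.pos += 1
        return ("identifier", token)

    def type_definition(self):
        # Stand-in only (assumption): TYPE_DEFINITION ::= new IDENTIFIER
        self.expect("new")
        return ("derived", self.defining_identifier())

    def full_type_declaration(self):
        # One statement per symbol of the right-hand side, in order.
        self.expect("type")
        name = self.defining_identifier()
        self.expect("is")
        definition = self.type_definition()
        self.expect(";")
        return ("full_type_declaration", name, definition)

    def declaration(self):
        # Predictive step: the next token selects the routine to call.
        if self.peek() == "type":
            return self.full_type_declaration()
        raise ParseError(f"no declaration starts with {self.peek()!r}")


tree = Parser(["type", "Count", "is", "new", "Integer", ";"]).declaration()
```

Because `declaration` has already looked at `type` before calling `full_type_declaration`, the first `expect` can never fail there; it is kept only so that the routine mirrors the production.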

Page 4: Top Down Parsing

Example: parsing a loop

FOR_STATEMENT ::= ITERATION_SCHEME loop STATEMENTS end loop;

Node1 := find_iteration_scheme; -- call function
get token loop
List1 := Sequence of statements -- call function
get token end
get token loop
get token semicolon
Result := build loop_node with Node1 and List1
return Result

Page 5: Top Down Parsing

Problem:

If there are multiple productions for a non-terminal, a mechanism is required to determine which production to use:

IF_STAT ::= if COND then STATS end if;

IF_STAT ::= if COND then STATS ELSIF_PART end if;

When the next token is if, which production should be used?

Page 6: Top Down Parsing

One Solution: factorize grammar

If several productions have the same prefix, rewrite them as a single production:

IF_STAT ::= if COND then STATS [ELSIF_PART] end if;

– Problem now reduces to recognizing whether an optional component (ELSIF_PART) is present
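A sketch of how the factored production might be parsed, in Python. The `Tokens` helper and the simplified grammar are assumptions made for illustration: COND is a single token, STATS is a sequence of single-token statements each followed by `;`, and ELSIF_PART is taken to be `elsif COND then STATS`. The point is the single lookahead test that decides whether the optional part is present.

```python
class ParseError(Exception):
    pass


class Tokens:
    def __init__(self, items):
        self.items = list(items) + ["<eof>"]  # sentinel marks end of input
        self.pos = 0

    def peek(self):
        return self.items[self.pos]

    def take(self, expected=None):
        token = self.items[self.pos]
        if token == "<eof>" or (expected is not None and token != expected):
            raise ParseError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token


STATS_END = {"elsif", "end", "<eof>"}  # tokens that cannot start a statement


def parse_cond(ts):
    return ("cond", ts.take())


def parse_stats(ts):
    stats = []
    while ts.peek() not in STATS_END:
        stats.append(ts.take())
        ts.take(";")
    return stats


def parse_if(ts):
    # Common prefix, parsed once regardless of which original production applies.
    ts.take("if")
    cond = parse_cond(ts)
    ts.take("then")
    stats = parse_stats(ts)
    # Optional component: one token of lookahead decides whether it is present.
    elsif_part = None
    if ts.peek() == "elsif":
        ts.take("elsif")
        elsif_cond = parse_cond(ts)
        ts.take("then")
        elsif_part = ("elsif", elsif_cond, parse_stats(ts))
    ts.take("end")
    ts.take("if")
    ts.take(";")
    return ("if", cond, stats, elsif_part)


plain = parse_if(Tokens("if c1 then a ; end if ;".split()))
with_elsif = parse_if(Tokens("if c1 then a ; elsif c2 then b ; end if ;".split()))
```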

Page 7: Top Down Parsing

Second Problem: Recursion

Grammar should not be left-recursive:

E ::= E + T | T

Problem: to find an E, start by finding an E…

– Original scheme leads to an infinite loop: the function for E calls itself before consuming any token
– Grammar is inappropriate for recursive descent

Page 8: Top Down Parsing

Solution to left-recursion

E ::= E + T | T means that eventually E expands into T + T + T …

Rewrite as:

– E ::= T E’
– E’ ::= + T E’ | epsilon

Informally: E’ is a possibly empty sequence of operator-term pairs (+ T), so E is a sequence of terms separated by an operator
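The rewrite can be applied mechanically to any non-terminal with immediate left recursion. The Python sketch below is one way to express it; the function name and the grammar representation (each alternative is a list of symbols, the empty list stands for epsilon) are assumptions. It also assumes there is no degenerate production of the form A ::= A.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A ::= A x | y  as  A ::= y A'  and  A' ::= x A' | epsilon."""
    # Tails x of the left-recursive alternatives A ::= A x
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives y that do not start with A
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to do
    tail = nonterminal + "'"
    return {
        nonterminal: [alt + [tail] for alt in others],
        tail: [alt + [tail] for alt in recursive] + [[]],  # [] is epsilon
    }


rewritten = remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

For the grammar on this slide, `rewritten` maps `E` to the single alternative `T E'` and `E'` to the alternatives `+ T E'` and epsilon.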

Page 9: Top Down Parsing

Recursion can involve multiple productions

A ::= B C | D
B ::= A E | F

– Can be rewritten as:

A ::= A E C | F C | D

– Now apply previous method
– A general algorithm exists to detect and remove left-recursion
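A sketch of the two ingredients in Python: detecting left recursion that runs through several productions, and the substitution step used above. The function names and the grammar representation are assumptions. The sketch treats any symbol without productions of its own as a terminal, and it assumes the grammar has no epsilon productions, so only the first symbol of each alternative matters.

```python
def left_recursive_nonterminals(grammar):
    """Return the non-terminals that can derive a string starting with themselves."""
    # Edge A -> B when some alternative of A starts with non-terminal B.
    starts = {
        head: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for head, alts in grammar.items()
    }
    result = set()
    for head in grammar:
        seen, stack = set(), list(starts[head])
        while stack:  # depth-first search for a path back to head
            symbol = stack.pop()
            if symbol == head:
                result.add(head)
                break
            if symbol not in seen:
                seen.add(symbol)
                stack.extend(starts[symbol])
    return result


def substitute_leading(grammar, head, inner):
    """Replace a leading `inner` in the alternatives of `head` by inner's bodies."""
    new_alternatives = []
    for alt in grammar[head]:
        if alt and alt[0] == inner:
            new_alternatives.extend(body + alt[1:] for body in grammar[inner])
        else:
            new_alternatives.append(alt)
    return new_alternatives


grammar = {"A": [["B", "C"], ["D"]], "B": [["A", "E"], ["F"]]}
recursive = left_recursive_nonterminals(grammar)
rewritten_a = substitute_leading(grammar, "A", "B")
```

Here `recursive` contains both A and B, and `rewritten_a` is the immediately left-recursive form A ::= A E C | F C | D shown on the slide.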

Page 10: Top Down Parsing

Further Problem

Transformation does not preserve associativity:

– E ::= E + T | T
– Parses a + b + c as (a + b) + c
– E ::= T E’, E’ ::= + T E’ | epsilon
– Parses a + b + c as a + (b + c)
– Incorrect for a - b - c: must rewrite tree
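The effect is easy to reproduce. The Python sketch below builds the tree directly from the rewritten grammar and then evaluates it. It is an illustration under assumptions: the function names are made up, T is reduced to an integer literal, and the operator set is extended with `-` so that the subtraction case can be shown.

```python
def parse_t(tokens, pos):
    return int(tokens[pos]), pos + 1  # T is just a number in this sketch


def parse_e_tail(tokens, pos, left):
    # E' ::= op T E' | epsilon, with the tree following the grammar shape
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        first_term, pos = parse_t(tokens, pos + 1)
        right, pos = parse_e_tail(tokens, pos, first_term)
        return (op, left, right), pos
    return left, pos  # epsilon: nothing more to add


def parse_e(tokens, pos=0):
    # E ::= T E'
    left, pos = parse_t(tokens, pos)
    return parse_e_tail(tokens, pos, left)


def evaluate(tree):
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) - evaluate(right)


tree, _ = parse_e(["10", "-", "4", "-", "3"])
value = evaluate(tree)
```

The tree comes out as `("-", 10, ("-", 4, 3))`, so `value` is 9 rather than the expected (10 - 4) - 3 = 3.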

Page 11: Top Down Parsing

In practice: use loop to find sequence of terms

Node1 := P_Term; -- call function that recognizes a term
loop
   exit when Token not in Token_Class_Binary_Addop;
   Node2 := New_Node (P_Binary_Adding_Operator);
   Scan; -- past operator
   Set_Left_Opnd (Node2, Node1);
   Set_Right_Opnd (Node2, P_Term); -- find next term
   Set_Op_Name (Node2);
   Node1 := Node2; -- operand for next operation
end loop;
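The same loop, sketched in runnable Python for comparison. The names are assumptions, a term is reduced to an integer literal, and tuples stand in for tree nodes. Each pass makes the tree built so far the left operand of the new node, which is what gives left associativity.

```python
ADDING_OPERATORS = {"+", "-"}


def parse_term(tokens, pos):
    return int(tokens[pos]), pos + 1  # a term is just a number in this sketch


def parse_simple_expression(tokens, pos=0):
    node, pos = parse_term(tokens, pos)  # first term
    while pos < len(tokens) and tokens[pos] in ADDING_OPERATORS:
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)  # scan past operator, find next term
        node = (op, node, right)  # tree so far becomes the left operand
    return node, pos


tree, end = parse_simple_expression(["10", "-", "4", "-", "3"])
```

Here `tree` is `("-", ("-", 10, 4), 3)`, the grouping (10 - 4) - 3.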

Page 12: Top Down Parsing

LL (1) Parsing

LL (1) grammars

If table construction is successful, grammar is LL (1): left-to-right, leftmost derivation with one-token lookahead.

If construction fails, can conceive of LL (2), etc.

Ambiguous grammars are never LL (k)

If a terminal is in First for two different productions of A, the grammar cannot be LL (1).

Grammars with left-recursion are never LL (k)

Some useful constructs are not LL (k)

Page 13: Top Down Parsing

Building LL (1) parse tables

Table indexed by non-terminal and token. Table entry is a production:

for each production P: A ::= α loop
   for each terminal a in First (α) loop
      T (A, a) := P;
   end loop;
   if epsilon in First (α) then
      for each terminal b in Follow (A) loop
         T (A, b) := P;
      end loop;
   end if;
end loop;

All other entries are errors. If two assignments conflict, parse table cannot be built.
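A compact Python sketch of the construction, including the First and Follow computations it depends on. The representation is an assumption: a grammar is a dict from non-terminal to a list of bodies, each body is a tuple of symbols, the empty tuple is the epsilon production, `"$"` marks end of input, and any symbol without productions is a terminal. A terminal spelled `epsilon` would clash with the marker used here.

```python
EPSILON = "epsilon"
END = "$"


def first_of_sequence(symbols, first):
    """First set of a sequence of grammar symbols."""
    result = set()
    for symbol in symbols:
        symbol_first = first.get(symbol, {symbol})  # a terminal is its own First
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)  # every symbol (or none at all) can vanish
    return result


def compute_first(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                new = first_of_sequence(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {head: set() for head in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                for i, symbol in enumerate(body):
                    if symbol not in grammar:
                        continue  # Follow is only defined for non-terminals
                    rest = first_of_sequence(body[i + 1:], first)
                    new = rest - {EPSILON}
                    if EPSILON in rest:
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


def build_table(grammar, start):
    first = compute_first(grammar)
    follow = compute_follow(grammar, first, start)
    table = {}
    for head, alternatives in grammar.items():
        for body in alternatives:
            body_first = first_of_sequence(body, first)
            lookaheads = body_first - {EPSILON}
            if EPSILON in body_first:
                lookaheads |= follow[head]
            for terminal in lookaheads:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at T({head}, {terminal}): not LL(1)")
                table[(head, terminal)] = body
    return table


grammar = {"E": [("T", "E'")], "E'": [("+", "T", "E'"), ()], "T": [("id",)]}
table = build_table(grammar, "E")
```

Entries missing from `table` are the error entries. Calling `build_table` on the left-recursive form E ::= E + T | T raises `ValueError`, because both alternatives of E have `id` in their First set.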

Page 14: Top Down Parsing

Left Recursion Removal & Left Factoring

Left Recursion Removal:

Left Factoring:

Page 15: Top Down Parsing

Syntax Tree Construction in LL(1)

First and Follow Sets

LL(k) Parsers (Extending the Lookahead)

Error Recovery in Top Down Parsers

Error Recovery in LL(1) Parsers