1 note as usual, these notes are based on the sebesta text. the tree diagrams in these slides are...

27
1 Note • As usual, these notes are based on the Sebesta text. • The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.

Post on 21-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

1

Note

• As usual, these notes are based on the Sebesta text.

• The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.

2

Context-free (CF) grammar

• A CF grammar is formally presented as a 4-tuple G=(T,NT,P,S), where:– T is a set of terminal symbols (the alphabet)– NT is a set of non-terminal symbols– P is a set of productions (or rules), where

PNT(TNT)*– SNT

3

Example 1

L1 = { 0, 00, 1, 11 }

G1 = ( {0,1}, {S}, { S0, S00, S1, S11 }, S )

4

Example 2L2 = { the dog chased the dog, the dog chased a

dog, a dog chased the dog, a dog chased a dog, the dog chased the cat, … }

G2 = ( { a, the, dog, cat, chased }, { S, NP, VP, Det, N, V }, { S NP VP, NP Det N, Det a | the, N dog | cat, VP V | VP NP, V chased }, S )

Notes: S = Sentence, NP = Noun Phrase , N = Noun

VP = Verb Phrase, V = Verb, Det = Determiner

5

Examples of lexemes and tokens

Lexemes Tokens

foo identifier

i identifier

sum identifier

-3 integer_literal

10 integer_literal

1 integer_literal

; statement_separator

= assignment_operator

6

BNF Fundamentals• Sample rules [p. 128]

<assign> → <var> = <expression><if_stmt> → if <logic_expr> then <stmt><if_stmt> → if <logic_expr> then <stmt> else <stmt>

• non-terminals/tokens surrounded by < and >• lexemes are not surrounded by < and >• keywords in language are in bold• → separates LHS from RHS• | expresses alternative expansions for LHS

<if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else

<stmt>• = is in this example a lexeme

7

BNF Rules• A rule has a left-hand side (LHS) and a right-

hand side (RHS), and consists of terminal and nonterminal symbols

• A grammar is often given simply as a set of rules (terminal and non-terminal sets are implicit in rules, as is start symbol)

8

Describing Lists

• There are many situations in which a programming language allows a list of items (e.g. parameter list, argument list).

• Such a list can typically be as short as empty or consisting of one item.

• Such lists are typically not bounded.

• How is their structure described?

9

Describing lists

• The are described using recursive rules.

• Here is a pair of rules describing a list of identifiers, whose minimum length is one:

<ident_list> ident | ident, <ident_list>

• Notice that ‘,’ is part of the object language

10

Example 3

L3 = { 0, 1, 00, 11, 000, 111, 0000, 1111, … }

G3 = ( {0,1}, {S,ZeroList,OneList},

{ S ZeroList | OneList,

ZeroList 0 | 0 ZeroList,

OneList 1 | 1 OneList }, S )

11

Derivation of sentences from a grammar

• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).

12

Example: derivation from G2

• Example: derivation of the dog chased a catS NP VP

Det N VP

the N VP

the dog VP

the dog V NP

the dog chased NP

the dog chased Det N

the dog chased a N

the dog chased a cat

13

Example: derivations from G3

• Example: derivation of 0 0 0 0S ZeroList

0 ZeroList 0 0 ZeroList 0 0 0 ZeroList 0 0 0 0

• Example: derivation of 1 1 1S OneList

1 OneList 1 1 OneList 1 1 1

14

Observations about derivations

• Every string of symbols in the derivation is a sentential form.

• A sentence is a sentential form that has only terminal symbols.

• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded.

• A derivation can be leftmost, rightmost, or neither.

15

An example grammar

<program> <stmt-list><stmt-list> <stmt>

| <stmt> ; <stmt-list> <stmt> <var> = <expr> <var> a

| b | c | d

<expr> <term> + <term> | <term> - <term>

<term> <var> | const

16

A leftmost derivation

<program> => <stmts>

=> <stmt>

=> <var> = <expr>

=> a = <expr>

=> a = <term> + <term>

=> a = <var> + <term>

=> a = b + <term>

=> a = b + const

17

Parse tree

• A hierarchical representation of a derivation

<program>

<stmts>

<stmt>

const

a

<var> = <expr>

<var>

b

<term> + <term>

18

Parse trees and compilation

• A compiler builds a parse tree for a program (or for different parts of a program).

• If the compiler cannot build a well-formed parse tree from a given input, it reports a compilation error.

• The parse tree serves as the basis for semantic interpretation/translation of the program.

19

Extended BNF

• Optional parts are placed in brackets [ ]<proc_call> -> ident [(<expr_list>)]

• Alternative parts of RHSs are placed inside parentheses and separated via vertical bars <term> → <term> (+|-) const

• Repetitions (0 or more) are placed inside braces { }<ident> → letter {letter|digit}

20

Comparison of BNF and EBNF• sample grammar fragment expressed in BNF

<expr> <expr> + <term>| <expr> - <term>| <term>

<term> <term> * <factor>| <term> / <factor>| <factor>

• same grammar fragment expressed in EBNF<expr> <term> {(+ | -) <term>}<term> <factor> {(* | /) <factor>}

21

Ambiguity in grammars

• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees

22

An ambiguous grammar for arithmetic expressions

<expr> <expr> <op> <expr> | const<op> / | -

<expr>

<expr> <expr>

<expr> <expr>

<expr>

<expr> <expr>

<expr> <expr>

<op>

<op>

<op>

<op>

const const const const const const- -/ /

<op>

23

Disambiguating the grammar

• If we use the parse tree to indicate precedence levels of the operators, we can remove the ambiguity.

• The following rules give / a higher precedence than -

<expr> <expr> - <term> | <term><term> <term> / const| const

<expr>

<expr> <term>

<term> <term>

const const

const/

-

24

Links to BNF-style grammars for actual programming languages

Below are some links to grammars for real programming languages. Look at how the grammars are expressed.

– http://www.schemers.org/Documents/Standards/R5RS/– http://www.sics.se/isl/sicstuswww/site/documentation.html

In the ones listed below, find the parts of the grammar that deal with operator precedence.

– http://java.sun.com/docs/books/jls/index.html– http://www.lykkenborg.no/java/grammar/JLS3.html– http://www.enseignement.polytechnique.fr/profs/informatique/Jean-Jacques.L

evy/poly/mainB/node23.html– http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html

25

Associativity of operators• When multiple operators appear in an expression, we

need to know how to interpret the expression.

• Some operators (e.g. +) are associative, meaning that the meaning of an expression with multiple instances of the operator is the same no matter how it is interpreted:

(a+b)+c = a+(b+c)

• Some operators (e.g. -) are not associative:(a-b)-c a-(b-c) e.g. try a=10, b=8, c=6

(10-8)-6 = -4 but 10-(8-6)=8

26

Associativity of Operators• Operator associativity can also be indicated by a

grammar

<expr> -> <expr> + <expr> | const (ambiguous)<expr> -> <expr> + const | const (unambiguous)

<expr><expr>

<expr>

<expr> const

const

const

+

+

27

Links to BNF-style grammars for actual programming languages

Below are some links to grammars for real programming languages. Look at how the grammars are expressed.

– http://www.schemers.org/Documents/Standards/R5RS/– http://www.sics.se/isl/sicstuswww/site/documentation.html

In the ones listed below, find the parts of the grammar that deal with operator associativity.

– http://java.sun.com/docs/books/jls/index.html– http://www.lykkenborg.no/java/grammar/JLS3.html– http://www.enseignement.polytechnique.fr/profs/informatique/Jean-

Jacques.Levy/poly/mainB/node23.html– http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html