languages and grammars

24
Languages and Grammars MSU CSE 260

Upload: helki

Post on 23-Jan-2016

54 views

Category:

Documents


1 download

DESCRIPTION

Languages and Grammars. MSU CSE 260. Outline. Introduction: E xample Phrase-Structure Grammars: Terminology, Definition, Derivation, Language of a Grammar, Examples Exercise 10.1 (1) Types of Phrase-Structure Grammars Derivation Trees: Example, Parsing Exercise 10.1 (2, 3) - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Languages and Grammars

Languages and Grammars

MSU CSE 260

Page 2: Languages and Grammars

Outline

• Introduction: Example

• Phrase-Structure Grammars: Terminology, Definition, Derivation, Language of a Grammar, Examples

– Exercise 10.1 (1)

• Types of Phrase-Structure Grammars• Derivation Trees: Example, Parsing

– Exercise 10.1 (2, 3)

• Backus-Naur Form

Page 3: Languages and Grammars

Introduction

• In the English language, the grammar determines whether a combination of words is a valid sentence.

• Are the following valid sentences?– The large rabbit hops quickly. Yes– The frog writes neatly. Yes– Swims quickly mathematician. No

• Grammars are concerned with the syntax (form) of a sentence, and NOT its semantics (or meaning.)

Page 4: Languages and Grammars

English Grammar

• Sentence: noun phrase followed by verb phrase;• Noun phrase: article adjective noun, or article

noun;• Verb phrase: verb adverb, or verb;• Article: a, or the;• Adjective: large, or hungry;• Noun: rabbit, or mathematician, or frog;• Verb: eats, or hops, or writes, or swims;• Adverb: quickly, or wildly, or neatly;

Page 5: Languages and Grammars

Example

• Sentence• Noun phrase verb phrase• Article adjective noun verb phrase• Article adjective noun verb adverb• the adjective noun verb adverb• the large noun verb adverb• the large rabbit verb adverb• the large rabbit hops adverb• the large rabbit hops quickly

Page 6: Languages and Grammars

Grammars and Computation

• Grammars are used as a model of computation.

• Grammars are used to:– generate the words of a language, and– determine whether a word is in a language.

Page 7: Languages and Grammars

Phrase-Structure Grammars Terminology

• Definitions. A vocabulary (or alphabet) V is a finite, nonempty set of elements called symbols.

• A word (or sentence) over V is a string of finite length of elements of V.

• The empty string (or null string,) denoted by , is the string containing no symbols.

• The set of all words over V is denoted by V*.• A language over V is a subset of V*.

Page 8: Languages and Grammars

Phrase-Structure Grammars

• A language can be specified by:– listing all the words in the language, or

– giving a set of criteria satisfied by its words, or

– using a grammar.

• A grammar provides:– a set of symbols, and

– a set of rules, called productions, for producing words by replacing strings by other strings: w0 w1.

Page 9: Languages and Grammars

Phrase-Structure GrammarDefinition

A phrase-structure grammar G = (V, T, S, P) consists of:– a vocabulary V,– a subset T of V consisting of terminal elements,– a start symbol S from V, and– a set P of productions.The set N = V-T consists of nonterminal symbols.Every production in P must contain at least one

nonterminal on its left side.

Page 10: Languages and Grammars

Phrase-structure Grammar Example

• G = {V, T, S, P}, where– V = {a, b, A, B, S},– T = {a, b},– S is the start symbol, and– P = { S Aba,

A BB,B ab,AB b}.

Page 11: Languages and Grammars

Phrase-Structure GrammarsDerivation

• Definition. Let G = (V, T, S,P) be a phrase-structure grammar.

Let w0 = lz0r and w1 = lz1r be strings over V.– If z0 z1 is a production of G, we say that:

w1 is directly derivable from w0 (denoted: w0 w1.)– If w0, w1, …, wn are strings over V such that:

w0 w1, w1 w2, …, wn-1 wn, we say that:

wn is derivable from w0 (denoted: w0 * wn.)Note. * should be on top of .

– The sequence of all steps used to obtain wn from w0 is called a derivation.

Page 12: Languages and Grammars

Example

• In the previous example grammar, the production: B ab makes the string Aaba directly derivable from string ABa.– ABa Aaba

• Also Aaba BBaba Bababa abababa– using: A BB, B ab, and B ab.

• So: ABa * abababaabababa is derivable from ABa.

Page 13: Languages and Grammars

Language of a Grammar

• Definition. Let G = (V, T, S, P) be a phrase-structure grammar.The language generated by G (or the language of G), denoted by L(G), is the set of all strings of terminals that are derivable from the start symbol S.

L(G) = {wT* | S * w}.

Page 14: Languages and Grammars

Example

• Let G = {V, T, S, P} be the grammar where:– V = {S, 0, 1},– T = {0, 1},– P = { S 11S,

S 0}.

• What is L(G)?– At any stage of the derivation we can either:

• add two 1s at the end of the string, or• terminate the derivation by adding a 0 at the end of the string.

– L(G)={0, 110, 11110, 1111110, …} = Set of all strings that begin with an even number of 1s and end with 0.

Page 15: Languages and Grammars

Exercise 10.1 (1)

Page 16: Languages and Grammars

Types of Grammars

• A type 0 (phrase-structure) grammar has no restrictions on its productions.

• A type 1 (or context-sensitive) grammar has productions only of forms:– w1 w2 with length of w2 length of w1, or– w1 .

• A type 2 (or context-free) grammar has productions only of the form A w2, where A is a single nonterminal symbol.

Page 17: Languages and Grammars

Types of Grammars – cont.

• A type 3 (or regular) grammar has productions only of the form:– A aB, or A a, where

• A and B are nonterminal symbols, and• a is a terminal symbol, or

– S .

• Note.– Every type 3 grammar is a type 2 grammar– Every type 2 grammar is a type 1 grammar– Every type 1 grammar is a type 0 grammar

Page 18: Languages and Grammars

Types of Grammars - Summary

Type Restrictions on productions w1w2

0 No restrictions

1 l(w1) l(w2), or w2=

2 w1=A where AN

3 w1=A, and w2=aB or w2=a, where

AN, BN, aT, or

w1=S and w2=

Page 19: Languages and Grammars

Derivation Trees

• For type 2 (context-free) grammars:

A derivation (or parse) tree, is an ordered rooted tree that represents a derivation in the language generated by a context-free grammar, where:– the root represents the starting symbol;

– the internal vertices represent nonterminal symbols;

– the leaves represent the terminal symbols;

– for a production A w, the vertex representing A will have children vertices that represent each symbol in w.

Page 20: Languages and Grammars

Example

• Derivation tree for:

the hungry rabbit eats quickly

sentence

noun phrase verb phrase

article adjective noun verb adverb

the hungry rabbit eats quickly

Page 21: Languages and Grammars

Exercise 10.1 (2, 3)

Page 22: Languages and Grammars

Parsing

• To determine whether a string is in the language generated by a grammar, use:– Top-down parsing:

• Begin with S and attempt to derive the word by successively applying productions, or

– Bottom-up parsing: • Work backward: Begin by inspecting the word and

apply productions backward.

Page 23: Languages and Grammars

Example

• Let G = {V, T, S, P} be the grammar where:– V = {a, b, c, A, B, C, S}, T = {a, b, c}, – Productions: Determine whether cbab is in L(G)?

S AB Top-down parsing: A Ca S ABB Ba S AB CaB B Cb S AB CaB cbaB B b S AB CaB cbaB cbab C cb Bottom-up parsing: C b Cab cbab

Ab Cab cbabAB Ab Cab cbabS AB Ab Cab cbab

Page 24: Languages and Grammars

Backus-Naur Form

• Used with type 2 (context-free) grammars; like for specification of programming languages:– Use ::= instead of – Enclose nonterminal symbols within < >– Group productions with same left side with symbol |

• Example.– <signed integer> ::= <sign><integer>– <sign> ::= + | -– <integer> ::= <digit> | <digit><integer>– <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9