LANGUAGE AND GRAMMARS © University of Liverpool COMP 319 slide 1

Post on 22-Dec-2015


LANGUAGE AND GRAMMARS


Contents
• Languages and Grammars
• Formal languages
• Formal grammars
• Generative grammars
• Analytic grammars
• Context-free grammars
• LL parsers
• LR parsers
• Rewrite systems
• L-systems


Software Engineering Foundation
Software engineering may be summarised by saying that it concerns the construction of programs to solve problems, and that there are three parts:
- Construction/engineering, and methods
- Problems, and problem solving, and
- Programs


Languages and grammar
• Languages are spoken and written (linguistics)
• To be effective they must be based on a shared set of rules – a grammar
• Grammars are introspective: they are based on and couched in language
• Natural language grammars are constantly shifting and locally negotiated
• A grammar is a formal language in which the rules of discourse are discussed and are the aim


Formal language concepts
• The concept emerges because of the need to define rules (for language)
• Formally, formal languages are collections of words composed of smaller, atomic units
• Issues of concern are
- the number and nature of the atomic units,
- the precision level required,
- the completeness of the formalism


Examples of formal languages
• The set of all words over {a, b}
• The set {a^n : n is a prime number}
• The set of syntactically correct programs in a given computer programming language
• The set of inputs upon which a certain Turing machine halts
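As a minimal sketch, membership in the first two example languages can be tested directly. The function names are hypothetical and chosen only for illustration. The last two examples are left out: the third needs a full parser for the chosen programming language, and the fourth is undecidable in general (it is the halting problem).

```python
def over_ab(word: str) -> bool:
    """True if word belongs to the set of all words over {a, b}."""
    return set(word) <= {"a", "b"}


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for short words."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def in_prime_power_of_a(word: str) -> bool:
    """True if word is a^n for some prime n."""
    return set(word) <= {"a"} and is_prime(len(word))
```

For instance, `in_prime_power_of_a("aaa")` holds because 3 is prime, while `in_prime_power_of_a("aaaa")` does not.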


Formal language specification
There are many ways in which a formal language can be specified, e.g.
• strings produced in a formal grammar
• strings produced by regular expressions
• the strings accepted by automata
• logic and other formalisms
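To illustrate that one language can be specified in more than one of these ways, the sketch below describes the language (ab)* both as a regular expression and as a small deterministic automaton. The language, the state numbering and the function names are assumptions made for this example and do not come from the slides.

```python
import re

# Specification 1: a regular expression for (ab)*.
PATTERN = re.compile(r"(ab)*")


def accepted_by_regex(word: str) -> bool:
    return PATTERN.fullmatch(word) is not None


# Specification 2: a deterministic automaton with start/accepting state 0.
# Missing transitions mean the word is rejected.
TRANSITIONS = {(0, "a"): 1, (1, "b"): 0}


def accepted_by_automaton(word: str) -> bool:
    state = 0
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False
    return state == 0
```

Both functions should agree on every input, which is what it means for the two specifications to define the same language.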


Language Production Operations
• Concatenation of strings drawn from the two languages
• Intersection (strings common to both languages) or union (strings in either language)
• Complement of one language
• Right quotient of one by the other
• Kleene star operation on one language
• Reverse of a language
• Shuffle combination of languages
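For finite languages, several of these operations can be sketched with Python sets of strings (union and intersection are simply `|` and `&`). The helper names are hypothetical, and the Kleene star is truncated at a maximum word length because the true result is infinite.

```python
def concat(l1: set, l2: set) -> set:
    """All strings uv with u drawn from l1 and v drawn from l2."""
    return {u + v for u in l1 for v in l2}


def reverse(lang: set) -> set:
    """Each word of the language written backwards."""
    return {w[::-1] for w in lang}


def right_quotient(l1: set, l2: set) -> set:
    """All u such that uv is in l1 for some v in l2."""
    return {w[: len(w) - len(v)] for w in l1 for v in l2 if w.endswith(v)}


def kleene_star(lang: set, max_len: int) -> set:
    """Words of lang* up to max_len symbols (the full set is infinite)."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = {u + v for u in frontier for v in lang
                    if len(u + v) <= max_len} - result
        result |= frontier
    return result
```

With `L1 = {"a", "ab"}` and `L2 = {"b"}`, `concat(L1, L2)` gives `{"ab", "abb"}` and `right_quotient({"ab", "abb"}, L2)` recovers `{"a", "ab"}`.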


Formal Grammars
• Noam Chomsky
- Linguist, philosopher at MIT
- 1956, papers on information and grammar
• Types of formal grammar
- Generative grammar
- Analytical grammar


Generative formal grammars
• Generative grammars: A set of rules by which all possible strings in the language to be described can be generated by successively rewriting strings, starting from a designated start symbol.
In effect it formalises an algorithm that generates strings in the language.


Analytic formal grammars
• Analytic grammars: A set of rules that assumes an arbitrary string as input, and which successively reduces or analyses that string to yield a final boolean “yes/no” that indicates whether that string is a member of the language described by the grammar.
In effect a parser or recogniser for a language.
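A minimal sketch of the analytic view, assuming the language a^n b^n as the target: the input is successively reduced by stripping one leading a and one trailing b, and the final answer is a boolean. The function name `recognise` is hypothetical.

```python
def recognise(word: str) -> bool:
    """Return True if word has the form a^n b^n (n >= 0)."""
    # Each reduction step removes a matching outer pair a...b.
    while word.startswith("a") and word.endswith("b"):
        word = word[1:-1]
    # The word is in the language only if it reduces to nothing.
    return word == ""
```

`recognise("aabb")` reduces "aabb" to "ab" to the empty string and answers yes; `recognise("aab")` gets stuck at "a" and answers no.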


Generative grammar components
Chomsky’s definition – essentially for linguistics but perfect for formal computing grammars; consists of the following components:
- A finite set N of nonterminal symbols
- A finite set Σ of terminal symbols disjoint from N
- A finite set P of production rules where a rule is of the form: string in (Σ ∪ N)* → string in (Σ ∪ N)*
- A symbol S in N that is identified as the start symbol


Generative grammar definition
• The language of a formal grammar G = (N, Σ, P, S) is denoted by L(G)
• It is defined as all those strings over Σ that can be generated by starting from the symbol S and then applying the rules in P until no more nonterminal symbols are present


A generative formal grammar
• Given the terminals {a, b}, nonterminals {S, A, B} where S is the special start symbol and
• Productions:
S → ABS
S → ε (the empty string)
BA → AB
BS → b
Bb → bb
Ab → ab
Aa → aa
Defines all the words of the form a^n b^n (i.e. n copies of a followed by n copies of b)
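The productions above can be explored mechanically. The sketch below is a breadth-first search over sentential forms, starting from S and applying every rule at every position; it is one possible illustration, and the names `RULES` and `generate` are hypothetical. Forms are cut off at `max_len + 1` symbols, which is enough here because only S → ε and BS → b shorten a form, and only by one symbol.

```python
from collections import deque

# The seven productions from the slide, as (left-hand side, right-hand side).
RULES = [("S", "ABS"), ("S", ""), ("BA", "AB"), ("BS", "b"),
         ("Bb", "bb"), ("Ab", "ab"), ("Aa", "aa")]


def generate(max_len: int) -> list:
    """Return all terminal words of length <= max_len derivable from S."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if all(symbol in "ab" for symbol in form):
            words.add(form)          # no nonterminals left: a word of L(G)
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:       # try the rule at every position
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return sorted(words, key=len)


print(generate(6))  # ['', 'ab', 'aabb', 'aaabbb']
```

Derivations that use S → ε after at least one S → ABS step get stuck with nonterminals still present, so they contribute no words.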


Context Free Grammars

• Theoretical basis of most programming languages.

• Easy to generate a parser using a compiler compiler.

• Two main approaches exist: top-down parsing e.g. LL parsers, and bottom-up parsing e.g. LR parsers.


LL parser
• Table based, top down parser for a subset of the context-free grammars (LL grammars).
• Parsing is Left to right, and constructs a Leftmost derivation of the sentence.
• LL(k) parsers use k tokens of look-ahead to parse the LL(k) grammar sentence.
• LL(1) grammars are popular because their parsers are fast: only the next token is considered in parsing decisions.


Table based LL parsing
Architecture

Input buffer: <null>
                     |
              +-------------+
 Stack        |             |
   S     <----|   Parser    |--> Output
   $          |             |
              +-------------+
                     ^
                     |
               +-----------+
               |  Parsing  |
               |   table   |
               +-----------+

• Consider the grammar
1. S → F
2. S → ( S + F )
3. F → 1
• This has the parsing table

      (   )   1   +   $
  S   2   -   1   -   -
  F   -   -   3   -   -

e.g. 1 and S implies rule 1, i.e. S on the stack is replaced with F and rule number 1 is output
Stack and Input same = delete
Stack and Input different = error
• Example input
( 1 + 1 ) $

Table based LL parsing
• Example input
( 1 + 1 ) $

input   stack       action            output
(       S$          parse ( S : 2     2
(       (S + F)$    ( ( delete        2
1       S + F)$     parse 1 S : 1     21
1       F + F)$     parse 1 F : 3     213
1       1 + F)$     1 1 delete        213
+       + F)$       + + delete        213
1       F)$         parse 1 F : 3     2133
1       1)$         1 1 delete        2133
)       )$          ) ) delete        2133
$       $           stop              2133
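The trace above can be reproduced with a short table-driven parser. This is a sketch under stated assumptions: tokens are single characters, the names `RULES`, `TABLE` and `ll1_parse` are hypothetical, and the stack is a Python list whose top is its last element.

```python
# Grammar rules by number: (left-hand side, right-hand side symbols).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}

# Parsing table: (nonterminal on stack, next input token) -> rule number.
TABLE = {("S", "("): 2, ("S", "1"): 1, ("F", "1"): 3}

NONTERMINALS = {"S", "F"}


def ll1_parse(text: str) -> list:
    """Return the sequence of rule numbers used, or raise SyntaxError."""
    tokens = list(text) + ["$"]
    stack = ["$", "S"]              # top of the stack is the last element
    position = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[position]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                raise SyntaxError(f"no rule for {top} on {look!r}")
            output.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            position += 1           # stack and input same: delete both
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return output


print(ll1_parse("(1+1)"))  # [2, 1, 3, 3]
```

The returned list matches the final output column of the trace, 2133.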


Parse Tree
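The slide's figure is not part of this transcript. As a hedged sketch, a parse tree for the input ( 1 + 1 ) can be rebuilt from the leftmost-derivation rule sequence 2, 1, 3, 3 produced by the LL parser; the nested-tuple representation and the name `build_tree` are assumptions made for this example.

```python
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}


def build_tree(rule_numbers):
    """Rebuild a parse tree from a leftmost derivation (list of rule numbers)."""
    remaining = iter(rule_numbers)

    def expand(symbol):
        if symbol not in ("S", "F"):
            return symbol                    # terminals are leaves
        lhs, rhs = RULES[next(remaining)]
        if lhs != symbol:
            raise ValueError(f"rule for {lhs} applied to {symbol}")
        # Children are expanded left to right, matching a leftmost derivation.
        return (symbol, [expand(child) for child in rhs])

    return expand("S")


tree = build_tree([2, 1, 3, 3])
# ('S', ['(', ('S', [('F', ['1'])]), '+', ('F', ['1']), ')'])
```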


Left Right Parser
• Bottom up parser for context-free grammars, used by many programming language compilers
• Parsing is Left to right, and produces a Rightmost derivation (in reverse).
• LR(k) parsers use k tokens of look-ahead.
• LR(1) is the most common type of parser used by many programming languages. It is almost always generated using a parser generator which constructs the parsing table; e.g. Simple LR parser (SLR), Look Ahead LR (LALR) e.g. Yacc, Canonical LR.


Left Right parser example
• Rules
(1) E → E * B
(2) E → E + B
(3) E → B
(4) B → 0
(5) B → 1
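A shift-reduce parser for these rules can be sketched as follows. The tables are LR(0) tables obtained by constructing the item sets for this grammar by hand; the transcript does not include the slide's own table, so the state numbering (0 to 8) and all names here are assumptions. Because the tables are LR(0), the reducing states reduce without consulting the look-ahead.

```python
# Rule number -> (left-hand side, number of right-hand side symbols).
RULES = {1: ("E", 3), 2: ("E", 3), 3: ("E", 1), 4: ("B", 1), 5: ("B", 1)}

# States that shift: state -> {token: next state}. "acc" marks acceptance.
SHIFT = {
    0: {"0": 1, "1": 2},
    3: {"*": 5, "+": 6, "$": "acc"},
    5: {"0": 1, "1": 2},
    6: {"0": 1, "1": 2},
}

# States that hold a single completed item: state -> rule to reduce by.
REDUCE = {1: 4, 2: 5, 4: 3, 7: 1, 8: 2}

# Goto table: (state, nonterminal) -> next state.
GOTO = {(0, "E"): 3, (0, "B"): 4, (5, "B"): 7, (6, "B"): 8}


def lr_parse(text: str) -> list:
    """Return the rule numbers in the order they are reduced."""
    tokens = list(text) + ["$"]
    states = [0]
    position = 0
    output = []
    while True:
        state = states[-1]
        if state in REDUCE:
            rule = REDUCE[state]
            lhs, size = RULES[rule]
            del states[-size:]                       # pop the handle
            states.append(GOTO[(states[-1], lhs)])   # then follow goto
            output.append(rule)
            continue
        target = SHIFT[state].get(tokens[position])
        if target is None:
            raise SyntaxError(f"unexpected {tokens[position]!r}")
        if target == "acc":
            return output
        states.append(target)                        # shift the token
        position += 1


print(lr_parse("1+1"))  # [5, 3, 5, 2]
```

Reading the output backwards gives a rightmost derivation: E ⇒ E + B ⇒ E + 1 ⇒ B + 1 ⇒ 1 + 1.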


Re-writing
• Rewriting is a general process involving strings and alphabets. It is classified according to what is rewritten, e.g. strings, terms, graphs, etc.
• A rewrite system is a set of equations that characterises a system of computation; it provides one method of automating theorem proving and is based on the use of rewrite rules.
• Examples of practical systems that use this approach include the software Mathematica.


Re-writing logic example
• ! ! A = A // eliminate double negative
• !(A AND B) = !A OR !B // De Morgan
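The two rules can be applied mechanically as left-to-right rewrite rules on terms. In this sketch a term is either a variable name or a tuple such as `("not", x)` or `("and", x, y)`; that encoding and the name `rewrite` are assumptions made for the example, not the representation used by any particular tool.

```python
def rewrite(term):
    """Apply double-negation elimination and De Morgan's rule bottom-up."""
    if isinstance(term, str):
        return term                              # a variable such as "A"
    operator, *arguments = term
    arguments = [rewrite(argument) for argument in arguments]
    if operator == "not" and isinstance(arguments[0], tuple):
        inner = arguments[0]
        if inner[0] == "not":
            return inner[1]                      # !!A  ->  A
        if inner[0] == "and":
            # !(A AND B)  ->  !A OR !B, then keep rewriting the new negations
            return ("or",
                    rewrite(("not", inner[1])),
                    rewrite(("not", inner[2])))
    return (operator, *arguments)


print(rewrite(("not", ("and", ("not", "A"), "B"))))
# ('or', 'A', ('not', 'B'))
```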


Re-writing in Mathematica (Wolfram)


L-systems
• Named after Aristid Lindenmayer (1925-1989), a Hungarian theoretical biologist and botanist who worked at the University of Utrecht (Netherlands)
• They are a formal grammar used to model the growth and morphology of plants and animals
• In plant and animal modelling a special form, the parametric L-system, is used – based on rewriting.
• Because of their recursive, parallel, and unlimited nature they lead to concepts of self-similarity and fractional dimension and fractal-like forms.


L-system structure
• The basic system is identical to formal grammars: G = {V, S, Ω, P}
• where
G is the grammar defined
V (the alphabet) is a set of symbols that can be replaced (variables)
S is a set of symbols that remain fixed (constants)
Ω (start, axiom or initiator) is a string from V, the initial state
P is a set of rules or productions defining the ways variables can be replaced by constants and other variables. Each rule consists of a LHS (predecessor) and RHS (successor)
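The defining feature of an L-system, compared with the grammars seen earlier, is that every variable is rewritten in the same step. A minimal sketch of that parallel rewriting follows; the name `l_system` is hypothetical, and productions are assumed to be a dictionary from a single-character predecessor to its successor string. Symbols with no production are treated as constants and copied unchanged.

```python
def l_system(axiom: str, productions: dict, steps: int) -> str:
    """Rewrite every symbol of the current string in parallel, `steps` times."""
    current = axiom
    for _ in range(steps):
        current = "".join(productions.get(symbol, symbol) for symbol in current)
    return current


# The algal growth rules A -> AB, B -> A, starting from the axiom A.
print(l_system("A", {"A": "AB", "B": "A"}, 4))  # ABAABABA
```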


Example 1: Fibonacci numbers
• V: A B
• C: none
• Ω: A
• P: p1: A → B, p2: B → AB
N=0 A
N=1 → B
N=2 → AB
N=3 → BAB
N=4 → ABBAB
N=5 → BABABBAB
N=6 → ABBABBABABBAB
N=7 → BABABBAB...
Counting lengths we get: 1, 1, 2, 3, 5, 8, 13, 21, ... (the Fibonacci numbers)
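The length sequence can be checked with a few lines of code. This sketch uses a generator of successive strings; the name `generations` is hypothetical.

```python
from itertools import islice


def generations(axiom: str, productions: dict):
    """Yield the axiom, then each successive parallel rewriting of it."""
    current = axiom
    while True:
        yield current
        current = "".join(productions.get(symbol, symbol) for symbol in current)


fibonacci_rules = {"A": "B", "B": "AB"}
lengths = [len(word) for word in islice(generations("A", fibonacci_rules), 8)]
print(lengths)  # [1, 1, 2, 3, 5, 8, 13, 21]
```

The lengths follow the Fibonacci recurrence because each B at one step comes from an A or a B of the previous step, and each A comes from a B.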


Example 2: Algal growth
• V: A B
• C: none
• Ω: A
• P: p1: A → AB, p2: B → A
N=0 A
N=1 → AB
N=2 → ABA
N=3 → ABAAB
N=4 → ABAABABA


COMP319 Software Engineering II

Example 3: Koch snowflake
• V: F
• C: + -
• Ω: F
• P: p1: F → F+F-F-F+F
N=0 F
N=1 → F+F-F-F+F
N=2 → F+F-F-F+F+F...
N=3 etc
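The strings become a drawing once each symbol is read as a turtle command. The slide does not state the interpretation, so the following is an assumption: F moves forward one unit, + turns left 90 degrees and - turns right 90 degrees. Under that reading this rule traces the right-angle variant of the Koch curve. The names `koch_string` and `trace` are hypothetical.

```python
def koch_string(steps: int) -> str:
    """Apply F -> F+F-F-F+F to every F, `steps` times, starting from F."""
    current = "F"
    for _ in range(steps):
        # str.replace rewrites all occurrences found in the original string,
        # so this is a parallel rewriting step; + and - are left unchanged.
        current = current.replace("F", "F+F-F-F+F")
    return current


def trace(commands: str) -> list:
    """Return the grid points visited, assuming 90 degree turns."""
    x, y = 0, 0
    dx, dy = 1, 0                      # start heading along the x axis
    points = [(x, y)]
    for command in commands:
        if command == "F":
            x, y = x + dx, y + dy
            points.append((x, y))
        elif command == "+":
            dx, dy = -dy, dx           # turn left
        elif command == "-":
            dx, dy = dy, -dx           # turn right
    return points


print(trace(koch_string(1)))
# [(0, 0), (1, 0), (1, 1), (2, 1), (2, 0), (3, 0)]
```

The point list can be handed to any plotting tool to draw the curve.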


Example 4: 3D Hilbert curve


Example 5: Branching
