
Rewrite Systems and Grammars

A rewrite system (or production system or rule-based system) is:

● a list of rules, and

● an algorithm for applying them.

Each rule has a left-hand side and a right-hand side.

Example rules:

S → aSb

aS → ε

aSb → bSabSa

A Rewrite System Formalism

A rewrite system formalism specifies:

● The form of the rules

● How simple-rewrite works:

● How to choose rules?

● When to quit?


simple-rewrite(R: rewrite system, w: initial string) =

1. Set working-string to w.

2. Until told by R to halt do:

Match the lhs of some rule against some part of working-string.

Replace the matched part of working-string with the rhs of the rule that was matched.

3. Return working-string.
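The procedure above leaves the choice of rule and the halting condition to the formalism. A minimal Python sketch, assuming one particular policy that the text does not fix (try rules in list order, rewrite the leftmost match, halt when no rule matches or after a hypothetical `max_steps` cap), might look like this:

```python
def simple_rewrite(rules, w, max_steps=100):
    """Rewrite w using rules, a list of (lhs, rhs) string pairs.

    Assumed policy (not fixed by the text): try rules in list order,
    rewrite the leftmost match, and halt when no rule matches or after
    max_steps replacements.
    """
    working = w
    for _ in range(max_steps):
        for lhs, rhs in rules:
            pos = working.find(lhs)
            if pos != -1:
                # Replace the matched part with the rhs of the matched rule.
                working = working[:pos] + rhs + working[pos + len(lhs):]
                break
        else:
            # No rule matched: the halting condition in this sketch.
            return working
    return working


# The empty string stands for the empty right-hand side.
erase_first = [("aS", ""), ("S", "aSb")]
grow_first = [("S", "aSb"), ("aS", "")]
```

With `erase_first` the call `simple_rewrite(erase_first, "SaS")` halts with `"b"`, while with `grow_first` the rule for S always matches first and only the step cap stops the loop. The rule order alone changes the outcome, which is why a formalism has to say how rules are chosen and when to quit.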


EXAMPLE:

w = SaS

Rules:

[1] S → aSb

[2] aS → ε

● What order to apply the rules?

● When to quit?
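To see why these two questions matter for w = SaS, the following sketch lists every string reachable in exactly one rewriting step. The helper name `one_step_rewrites` is hypothetical, and the empty string stands for an empty right-hand side.

```python
def one_step_rewrites(rules, w):
    """Return the set of strings obtained from w by one rule application."""
    results = set()
    for lhs, rhs in rules:
        start = w.find(lhs)
        while start != -1:
            # Rewrite this particular occurrence of lhs.
            results.add(w[:start] + rhs + w[start + len(lhs):])
            start = w.find(lhs, start + 1)
    return results


rules = [("S", "aSb"), ("aS", "")]
successors = one_step_rewrites(rules, "SaS")
```

Here `successors` contains three different strings (`"aSbaS"`, `"SaaSb"` and `"S"`), one for each way of picking a rule and a position, so the rewrite system is nondeterministic until the formalism settles the order.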

Rule Based Systems

● Expert systems

● Cognitive modeling

● Business practice modeling

● General models of computation

● Grammars

Grammars Define Languages

A grammar is a set of rules that are stated in terms of two alphabets:

• a terminal alphabet, ∑, that contains the symbols that make up the strings in L(G), and

• a nonterminal alphabet, the elements of which will function as working symbols that will be used while the grammar is operating. These symbols will disappear by the time the grammar finishes its job and generates a string.

• A grammar has a unique start symbol, often called S.

Grammars

A grammar G is a quadruple (V, ∑, R, S), where:

• V is the rule alphabet, which contains nonterminals and terminals,

• ∑ (the set of terminals) is a subset of V,

• R (the set of rules) is a finite subset of (V*(V - ∑)V*) × V*,

• S (the start symbol) is an element of V - ∑.

Grammars

An element (α, β) in R is called a production and is written as α → β.

α is referred to as the left-hand side of the production and β is referred to as the right-hand side of the production.

Notice that the left-hand side must contain at least one nonterminal.

Grammars

Example: G1 = (V, ∑, R, S) where

V = {A, S, 0, 1}

∑ = {0, 1}

R = {S → 0A1,

0A → 00A1,

A → ε}

Note: Grammars are generators, while automata are recognizers. A grammar defines a language in a recursive manner.
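The quadruple definition can be checked mechanically. The sketch below is one possible encoding, not a standard one: it assumes every symbol is a single character, represents a production α → β as a pair of strings with the empty string for ε, and uses a hypothetical helper `check_grammar` that reports which parts of the definition are violated.

```python
def check_grammar(V, sigma, R, S):
    """Return a list of ways (V, sigma, R, S) violates the definition."""
    problems = []
    nonterminals = V - sigma
    if not sigma <= V:
        problems.append("sigma is not a subset of V")
    if S not in nonterminals:
        problems.append("start symbol is not a nonterminal")
    for lhs, rhs in R:
        if not set(lhs + rhs) <= V:
            problems.append(f"{lhs} -> {rhs}: symbol outside V")
        if not any(sym in nonterminals for sym in lhs):
            problems.append(f"{lhs} -> {rhs}: no nonterminal on the left")
    return problems


# The example grammar with rules S -> 0A1, 0A -> 00A1, A -> epsilon.
V = {"A", "S", "0", "1"}
sigma = {"0", "1"}
R = [("S", "0A1"), ("0A", "00A1"), ("A", "")]
g1_problems = check_grammar(V, sigma, R, "S")
```

For this grammar `g1_problems` is empty. A rule such as 01 → A would be reported, because its left-hand side contains no nonterminal.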

Grammars

• A sentential form of a grammar G = (V, ∑, R, S) is defined recursively as follows:

1. S is a sentential form

2. If γαδ is a sentential form and α → β is a production in R, then γβδ is also a sentential form

• A sentential form of G containing no nonterminal symbol is called a sentence generated by G.

Grammars

The language generated by a grammar G, denoted by L(G), is the set of all sentences generated by G.

Terminology:

⇒G (directly derives) is a relation on V*. If α → β is a production in R, then γαδ ⇒G γβδ for all γ, δ in V*.

⇒^k : k-fold product of ⇒

⇒+ : transitive closure of ⇒

⇒* : reflexive transitive closure of ⇒

Grammars

If α ⇒^k β then there is a sequence of strings α_0, ..., α_i, α_{i+1}, ..., α_k such that

α_0 = α, α_{i-1} ⇒ α_i (1 ≤ i ≤ k), α_k = β.

This sequence of strings is called a derivation sequence of length k.
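As a small worked instance, assume the example grammar G1 with rules S → 0A1, 0A → 00A1 and A → ε. One derivation sequence of length 3 is:

```latex
\begin{align*}
\alpha_0 = S
  &\Rightarrow_{G_1} 0A1   && \text{by } S \to 0A1 \\
  &\Rightarrow_{G_1} 00A11 && \text{by } 0A \to 00A1 \\
  &\Rightarrow_{G_1} 0011  && \text{by } A \to \varepsilon
\end{align*}
```

Here α_0 = S, α_1 = 0A1, α_2 = 00A11 and α_3 = 0011, so S ⇒^3 0011 and therefore S ⇒* 0011.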

L(G) = { w | w in ∑* and S ⇒* w }

Example: L(G1) = {0^n 1^n | n > 0}
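The definition of L(G) suggests a brute-force way to list short sentences: explore sentential forms breadth-first from S and keep those with no nonterminals. The sketch below assumes single-character symbols and a hypothetical cap `max_form_len` on the length of sentential forms, so it only yields a finite part of the language.

```python
from collections import deque


def sentences(rules, start, nonterminals, max_form_len):
    """Return the sentences derivable through forms of bounded length."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        if not any(sym in nonterminals for sym in form):
            # No nonterminal left: the form is a sentence.
            found.add(form)
            continue
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                nxt = form[:pos] + rhs + form[pos + len(lhs):]
                if len(nxt) <= max_form_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                pos = form.find(lhs, pos + 1)
    return found


# Rules of the example grammar: S -> 0A1, 0A -> 00A1, A -> epsilon.
g1_rules = [("S", "0A1"), ("0A", "00A1"), ("A", "")]
short_sentences = sentences(g1_rules, "S", {"S", "A"}, max_form_len=7)
```

With the cap set to 7, `short_sentences` is the set containing `01`, `0011` and `000111`, which agrees with the stated language for n = 1, 2, 3.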

Classification of Grammars

Definition: A grammar G is said to be

1) Right-linear if each production in R is of the form A → xB or A → x, where A and B are in V-∑ and x is in ∑*.

2) Context-free if each production is of the form A → α, where A is in V-∑ and α is in V*.

3) Context-sensitive if each production is of the form α → β, where |α| ≤ |β|.

4) A grammar with no restrictions as above is called unrestricted.
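The four definitions can be turned into a simple check on the shape of the productions. The sketch below assumes single-character symbols and productions given as (lhs, rhs) string pairs. The function name `classify` is hypothetical. It tests the definitions in the order listed above and reports the first one that every production satisfies.

```python
def classify(rules, nonterminals):
    """Return the first class from the definition that all rules satisfy."""

    def is_right_linear(lhs, rhs):
        if len(lhs) != 1 or lhs not in nonterminals:
            return False
        # Allow at most one nonterminal, and only as the last symbol.
        body = rhs[:-1] if rhs and rhs[-1] in nonterminals else rhs
        return not any(sym in nonterminals for sym in body)

    def is_context_free(lhs, rhs):
        return len(lhs) == 1 and lhs in nonterminals

    def is_context_sensitive(lhs, rhs):
        return len(lhs) <= len(rhs)

    if all(is_right_linear(l, r) for l, r in rules):
        return "right-linear"
    if all(is_context_free(l, r) for l, r in rules):
        return "context-free"
    if all(is_context_sensitive(l, r) for l, r in rules):
        return "context-sensitive"
    return "unrestricted"


kind_a = classify([("S", "0S"), ("S", "1")], {"S"})
kind_b = classify([("S", "aSb"), ("S", "")], {"S"})
kind_c = classify([("S", "0A1"), ("0A", "00A1"), ("A", "")], {"S", "A"})
```

In this sketch `kind_a` is `"right-linear"`, `kind_b` is `"context-free"`, and `kind_c` is `"unrestricted"`: the rule 0A → 00A1 has two symbols on the left, and A → ε breaks the length condition |α| ≤ |β|.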


If a language L is generated by a 'type x' grammar, then L is said to be a 'type x' language.

Right-linear grammars generate right-linear languages, context-free grammars (CFGs) generate context-free languages (CFLs), and context-sensitive grammars (CSGs) generate context-sensitive languages (CSLs).

The four types are referred to as the Chomsky hierarchy.


Examples:

A → B | C

B → 0B | 1B | 011

C → 0D | 1C | ε

D → 0C | 1D

S → B
S → aSa
B → ε
B → bB

S → 0A1
S → 01
0A → 00A1
A → 01

S → 0A1
0A → 00A1
A → ε

Equivalence of regular languages and right-linear languages

Lemma 1

(1) ∅, (2) {ε}, (3) {a} for all a in ∑ are right-linear languages.

Proof:

In each case we construct a right-linear grammar generating the language.

(1) G = ({S}, ∅, ∅, S) is a right-linear grammar such that L(G) = ∅.

(2) G = ({S}, ∅, {S → ε}, S) is a right-linear grammar such that L(G) = {ε}.

(3) Ga = ({S, a}, {a}, {S → a}, S) is a right-linear grammar such that L(Ga) = {a}.


Lemma 2

If L1 and L2 are right-linear languages, then

(1) L1 ∪ L2, (2) L1L2, (3) L1* are right-linear languages.

Proof: Since L1 and L2 are right-linear languages, we can assume that there are right-linear grammars G1 = (V1, ∑, R1, S1) and G2 = (V2, ∑, R2, S2) such that L(G1) = L1 and L(G2) = L2.

We can assume that nonterminals of G1 and G2 are disjoint (otherwise rename the symbols).

Define the following grammars:

(1) G3 = (V1 ∪ V2 ∪ {S3}, ∑, R1 ∪ R2 ∪ {S3 → S1 | S2}, S3)

(2) G4 = (V1 ∪ V2, ∑, R4, S1) where R4 is defined as follows:

(i) if A → xB is in R1, then A → xB is in R4

(ii) if A → x is in R1, then A → xS2 is in R4

(iii) All productions in R2 are in R4

(3) G5 = (V1 ∪ {S5}, ∑, R5, S5) where R5 is defined as follows:

(i) if A → xB is in R1, then A → xB is in R5

(ii) if A → x is in R1, then A → xS5 and A → x are in R5

(iii) S5 → S1 | ε are in R5

Claim: (1) L(G3) = L(G1) ∪ L(G2),

(2) L(G4) = L(G1)L(G2), and

(3) L(G5) = (L(G1))*.
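The three constructions are mechanical enough to sketch in code. The encoding below is an assumption made for illustration: a production A → xB is the triple `(A, x, B)`, a production A → x is `(A, x, None)`, and a grammar is a pair `(rules, start)`. The helper `language` is hypothetical and enumerates only words up to a length bound, so it spot-checks the claim on small cases rather than proving it.

```python
def union(g1, g2, new_start="S3"):
    (r1, s1), (r2, s2) = g1, g2
    # S3 -> S1 | S2
    return (r1 + r2 + [(new_start, "", s1), (new_start, "", s2)], new_start)


def concat(g1, g2):
    (r1, s1), (r2, s2) = g1, g2
    # A -> x in R1 becomes A -> xS2; A -> xB is kept unchanged.
    r4 = [(a, x, b if b is not None else s2) for a, x, b in r1]
    return (r4 + r2, s1)


def star(g1, new_start="S5"):
    r1, s1 = g1
    r5 = []
    for a, x, b in r1:
        if b is None:
            r5 += [(a, x, new_start), (a, x, None)]
        else:
            r5.append((a, x, b))
    # S5 -> S1 | epsilon
    r5 += [(new_start, "", s1), (new_start, "", None)]
    return (r5, new_start)


def language(g, max_len):
    """Enumerate the words of length <= max_len generated by g."""
    rules, start = g
    found, seen, stack = set(), {("", start)}, [("", start)]
    while stack:
        prefix, nt = stack.pop()
        for a, x, b in rules:
            word = prefix + x
            if a != nt or len(word) > max_len:
                continue
            if b is None:
                found.add(word)
            elif (word, b) not in seen:
                seen.add((word, b))
                stack.append((word, b))
    return found


g_a = ([("S1", "a", "S1"), ("S1", "a", None)], "S1")  # one or more a's
g_b = ([("S2", "b", None)], "S2")                     # just the word b
```

On these two small grammars, `language(union(g_a, g_b), 2)` gives `a`, `aa` and `b`, `language(concat(g_a, g_b), 3)` gives `ab` and `aab`, and `language(star(g_b), 2)` gives the empty word, `b` and `bb`, as the claim predicts.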


Proof of claim (2):

If S1 ⇒* w1 in G1, then S1 ⇒* w1S2 in G4.

If S2 ⇒* x in G2, then S2 ⇒* x in G4, and hence S1 ⇒* w1S2 ⇒* w1x in G4.

So, L(G1)L(G2) ⊆ L(G4).

Now suppose S1 ⇒* w in G4. Since there are no productions of the form A → x in G4 that came out of G1, we can write the derivation in the form S1 ⇒* xS2 ⇒* xy, where w = xy. By construction, there must be derivations of the form S1 ⇒* x in G1 and S2 ⇒* y in G2. Hence, L(G4) ⊆ L(G1)L(G2).

Proofs of claims (1) and (3) are left as an exercise.


Theorem: Every regular language is a right-linear language.

Proof: Follows from Lemma 1 and inductive applications of Lemma 2 on the construction of the regular language.

Left-linear grammars

Definitions:

A grammar G = (V, ∑, R, S) is said to be left-linear if each production is of the form A → Bx or A → x, where A and B are in V-∑ and x is in ∑*.

A → B | C

B → B0 | B1 | 110

C → D0 | C1 | ε

D → C0 | D1

Equivalence of left-linear languages and regular languages

Let G = (V, ∑, R, S) be a left-linear grammar.

Let G' = (V, ∑, R', S) be the grammar obtained from G as follows:

If A → α is a production in R, add A → α^R to R', where α^R denotes the reverse of α.

Then L(G') = L(G)^R, and G' is a right-linear grammar.

Since L(G') is a regular language and regular languages are closed under reversal, L(G) = L(G')^R is also regular.
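The construction amounts to reversing every right-hand side. A minimal sketch, assuming single-character symbols and productions stored as (lhs, rhs) string pairs with the empty string for ε (the name `reverse_rules` is hypothetical):

```python
def reverse_rules(rules):
    """Map each production A -> alpha to A -> alpha reversed."""
    return [(lhs, rhs[::-1]) for lhs, rhs in rules]


# A left-linear grammar: every nonterminal on a right-hand side comes first.
left_linear = [
    ("A", "B"), ("A", "C"),
    ("B", "B0"), ("B", "B1"), ("B", "110"),
    ("C", "D0"), ("C", "C1"), ("C", ""),
    ("D", "C0"), ("D", "D1"),
]
right_linear = reverse_rules(left_linear)
```

After the reversal, B → B0 | B1 | 110 becomes B → 0B | 1B | 011, so every nonterminal on a right-hand side now comes last and the result is right-linear. Applying `reverse_rules` a second time gives back the original rules.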

If L is a regular language, then there is a left-linear grammar G such that L = L(G).
