regular expressions and languages - vidyarthiplus

41
www.vidyarthiplus.com RJ edition www.vidyarthiplus.com CS2303 THEORY OF COMPUTATION BE-CSE/IT=V-SEM/VII-SEM REGULATION 2008/2010 UNIT II - & REGULAR EXPRESSIONS AND LANGUAGES 1. What is a regular expression? A regular expression is a string that describes the whole set of strings according to certain syntax rules. These expressions are used by many text editors and utilities to search bodies of text for certain patterns etc. Definition is: Let _ be an alphabet. The regular expression over _ and the sets they denote are: i. _ is a r.e and denotes empty set. ii. _ is a r.e and denotes the set {_} iii. For each ‘a’ in _ , a+ is a r.e and denotes the set {a}. iv. If ‘r’ and ‘s’ are r.e denoting the languages R and S respectively then (r+s), (rs) and (r*) are r.e that denote the sets RUS, RS and R* respectively. 2. Differentiate L* and L+ _ L* denotes Kleene closure and is given by L* =U Li i=0 example : 0* ={_ ,0,00,000,…………………………………} Language includes empty words also. _ L+ denotes Positive closure and is given by L+= U Li i=1 q0 q1 3. What is Arden’s Theorem? Arden’s theorem helps in checking the equivalence of two regular expressions. Let P and Q be the two regular expressions over the input alphabet _. The regular expression R is given as : R=Q+RP Which has a unique solution as R=QP*.

Upload: others

Post on 24-Jan-2022

16 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

CS2303 THEORY OF COMPUTATION BE-CSE/IT=V-SEM/VII-SEM

REGULATION 2008/2010

UNIT II - & REGULAR EXPRESSIONS AND LANGUAGES

1. What is a regular expression? A regular expression is a string that describes the whole set of strings according to

certain syntax rules. These expressions are used by many text editors and utilities to search bodies of text for certain patterns etc. Definition is: Let _ be an alphabet. The regular expression over _ and the sets they denote are:

i. _ is a r.e and denotes empty set.

ii. _ is a r.e and denotes the set {_}

iii. For each ‘a’ in _ , a+ is a r.e and denotes the set {a}.

iv. If ‘r’ and ‘s’ are r.e denoting the languages R and S respectively then (r+s),

(rs) and (r*) are r.e that denote the sets RUS, RS and R* respectively.

2. Differentiate L* and L+ _

L* denotes Kleene closure and is given by L* =U Li i=0

example : 0* ={_ ,0,00,000,…………………………………}

Language includes empty words also.

_

L+ denotes Positive closure and is given by L+= U Li i=1 q0 q1

3. What is Arden’s Theorem? Arden’s theorem helps in checking the equivalence of two regular expressions. Let P and Q be the two regular expressions over the input alphabet _. The regular expression R is given as : R=Q+RP Which has a unique solution as R=QP*.

Page 2: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

4. Write a r.e to denote a language L which accepts all the strings which begin or end with either 00 or 11.

The r.e consists of two parts:

L1=(00+11) (any no of 0’s and 1’s) =(00+11)(0+1)*

L2=(any no of 0’s and 1’s)(00+11) =(0+1)*(00+11)

Hence r.e R=L1+L2 =[(00+11)(0+1)*] + [(0+1)* (00+11)]

5. Construct a r.e for the language over the set _={a,b} in which total number of a’s are divisible by 3

( b* a b* a b* a b*)*

6. what is: (i) (0+1)* (ii)(01)* (iii)(0+1) (iv)(0+1)+ (0+1)*= { _ , 0 , 1 , 01 , 10 ,001 ,101 ,101001,…………………}

Any combinations of 0’s and 1’s.

(01)*={_ , 01 ,0101 ,010101 ,…………………………………..}

All combinations with the pattern 01.

(0+1)= 0 or 1,No other possibilities.

(0+1)+= {0,1,01,10,1000,0101,………………………………….}

7. Reg exp denoting a language over _ ={1} having (i) even length of string (ii) odd length of a string

(i) Even length of string R=(11)*

(ii) Odd length of the string R=1(11)*

8. Reg exp for: (i) All strings over {0,1} with the substring ‘0101’ (ii) All strings beginning with ’11 ‘ and ending with ‘ab’ (iii) Set of all strings over {a,b}with 3 consecutive b’s. (iv) Set of all strings that end with ‘1’and has no substring ‘00’

(i)(0+1)* 0101(0+1)*

Page 3: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

(ii)11(1+a+b)* ab

(iii)(a+b)* bbb (a+b)*

(iv)(1+01)* (10+11)*

Page 4: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

9. Construct a r.e for the language which accepts all strings with atleast two c’s over the set Σ={c,b}

(b+c)* c (b+c)* c (b+c)*

10. What are the applications of Regular expressions and Finite automata Lexical analyzers and Text editors are two applications.

Lexical analyzers:

The tokens of the programming language can be expressed using regular expressions.

The lexical analyzer scans the input program and separates the tokens.For eg

identifier can be expressed as a regular expression

as: (letter)(letter+digit)*

If anything in the source language matches with this reg exp then it is

recognized as an identifier.The letter is{A,B,C,………..Z,a,b,c….z} and digit is {0,1,…9}.Thus reg exp identifies token in a language.

Text editors:

These are programs used for processing the text. For example

UNIX text editors uses the reg exp for substituting the strings such as: S/bbb*/b/

Gives the substitute a single blank for the first string of two or more blanks in a given line. In UNIX text editors any reg exp is converted to an NFA with Єtransitions, this NFA can be then simulated directly.

11. .Reg exp for the language that accepts all strings in which ‘a’ appears tripled overthe set Σ ={a}

reg exp=(aaa)*

12. .What are the applications of pumping lemma? Pumping lemma is used to check if a language is regular or not.

(i) (ii) (iii) (iv) (v)

Assume that the language(L) is regular. Select a constant ‘n’. Select a string(z) in L, such that |z|>n. Split the word z into u,v and w such that |uv|<=n and |v|>=1. You achieve a contradiction to pumping lemma that there exists an ‘i’ Such

that uvi w is not in L.Then L is not a regular language.

Page 5: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

13. What is the closure property of regular sets? The regular sets are closed under union, concatenation and Kleene closure.

r1Ur2= r1 +r2

r1.r2= r1r2

( r )*=r*

The class of regular sets are closed under complementation, substitution,

homomorphism and inverse homomorphism.

14. .Reg exp for the language such that every string will have atleast one ‘a’ followed by atleast one ‘b’.

R=a+b+

15. Write the exp for the language starting with and has no consecutive b’s . reg exp=(a+ab)*

16. Lists on the closure properties of Regular sets. (i) (ii) (iii) (iv) (v) (vi) (vii)

Union Concatenation Closure Complementation Intersection Transpose Substitutions

(viii) Homomorphism

17. Let R be any set of regular languages. IsUR regular? Prove it. Yes. Let P,Q be any two regular languages .As per theorem

L( R )=L(P UQ)

=L(P+Q)

Since ‘+’ is a operator for regular expresstions L( R ) is also regular.

Page 6: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

19. What are the three methods of conversion of DFA to RE? (i) (ii) (iii)

Regular Expression equation method Arden’s Theorem. State elimination technique,

20. What are the algorithms of minimization DFA? (i) (ii)

Myhill-Nerode Theorem Construction of πfinal from π.

PART – B

1. Explain the regular expression and its operation

REGULAR EXPRESSIONS

The regular expressions can define exactly the same language that the various forms of automata describe the regular language.

Operators of regular expressions

Building regular expressions

Precedence of regular expressions operators

Operators of regular expressions: Union

Union of two languages L and M is represented as L U M.

Ex: L={001,10,111} M={ є,001}

Then L U M={ є,001,10,111}

Concatenation Concatenation of two languages is represented as L.M or simply LM.

Ex: L={001,10,111} M={є,001}

Then L.M={001,10,111,001001,10001,111001}

Kleen closure or closure or star The kleen closure is represented as L*. there is also L+ which is the

Page 7: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

closure neglecting the є.

Page 8: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

If L={0,1} then L*= {0,1,00,01,10,11…}

BUILDING REGULAR EXPRESSIONS

The regular expressions are denoted by L(E). There are three parts of representing the language:

regular expressions denoting the languages {є} and ф respectively. Then, L (є) =є and L (ф) =ф.

If ais any symbol then a is a regular expression. This expression denotes the language {a}

is L(a)=a. A variable capitalized or italic such as L is a variable representing any language.

There are four induction steps:

If E and F are regular language then E+F is a regular expression denoting the union of L(E) and L(F) is represented as L(E+F)=L(E)+L(F).

If E and F are regular language then E.F is a regular expression denoting the concatenation of L(E) and L(F) is represented as L(EF)=L(E).L(F).

If E is a regular language then

E* is a regular expression denoting the closure of

L(E) is represented as L(E*)=(L(E))*

If E is a regular language then(k) (E) a parenthesized E is also a regular expression denoting the same language as . E formally as L ((E)) =L (E).

PRECEDENCE OF REGULAR EXPRESSION OPERATOR

The star operator is of highest precedence.

Page 9: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

The dot operator reserves the second All the unions are grouped with their operands.

FINITE AUTOMATA AND REGULAR EXPRESSION

There are two rules defining the regular expression:

Page 10: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Every language defined by one of their automata is also defined by a regular expression. Every language defined by a regular expression is defined by one of these automata.

From DFA’s to regular expression

Converting DFA’s to regular expression by eliminating states

Converting regular expression to automata.

From DFA’s to regular expression:

In DFA’s the set of strings are allowed to pass through some set of strings.

In regular expression the expression we generate at the end represent all possible paths.

2. If L=L(A) for some DFA A, then there is a regular expression R such that L=L(R).

Theorem: If L=L(A) for some DFA A, then there is a regular expression R such that L=L(R).

Proof:

Suppose that A’s state are {1,2,3….,n} for some integer n. by assuming that n has

positive integers are finite state. Regular expression is denoted as Rij

Where i,j – states W – set of strings such that w is the label of the path from state I to state j in A k- the path has no intermediate nodes whose number is greater than k.

i

j

The vertical dimensions represent the states from 1 at the bottom to n at the top.

The horizontal dimensions travel along the path i, j>k. To construct expression Rij(k) the inductive definition starting at k=0 and reaching k=n. Basis:

Page 11: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

The basis is k=0. Since all the states are numbered 1 or above the restriction on paths is that the paths must have intermediate states at all.

There are 2 kinds of paths:

An arc from node(state) i to node j.

A path of length 0 that consists of some node i.

Page 12: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

If i ≠ j, the case (1) is possible. We must examine the DFA A and find those input symbols a

such that there is a transition state i to state j on symbol a. If there is no such symbol a then Rij(0) = φ. If there is exactly one such symbol a then Rij(0)=a. If there are separate a1,a2…ak that label arcs from state I to state j then

Rij(k)=a1+a2+…+ak. If i=j,

The legal paths are the path of length 0 and all loops from i to itself. The path of length 0 is represented by regular expression ε since that has no symbols along it.

We add ε as ε+a1+a2…..+ak.

Induction: Suppose that there is a path from state I to state j that goes no state higher than has no

symbols than k.

There are 2 cases:

The path does not go through state k at all. In this case the label of the path is in the language of Rij(k-1)

at least once. Then we break the path

into several pieces. The first goes from state i to k, last goes from k to j and all the middle go from itself without passing through k.

\

Page 13: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

i

Represent the part of the path that gets to state k first time.

k Rik(k-1) (Rkk(k-1))* Rkj(k-1) The portion that goes from k to itself, zero times or more than

j

The part of that k leaves for the last time and goes to state j

Figure 2.2 conversion of Regular Expression

This is represented in regular expression as

Rij(k)= Rij(k-1) + Rik(k-1) (Rkk(k-1))* Rkj(k-1)

3. Let us convert DFA to regular expression

q3 q1 q2

Page 14: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Clearly the DFA accepts all the strings corresponding to regular expression 1*01(0+11)*.

The regular expressions for the path that can go through are

Page 15: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

=(1+ε)+ (1+ε) (1+ε)* (1+ε)

=1+1.1*.1 = 1.1*

R11(0) = 1.1*

Apply the same rule for k=2 & k=3. We consider R13(3) as the final state since it only has the start state 1 and final state 3 and the

No state

R1 1 (k-1) = 1+ε

R1 2 (k-1)=0

R1 3 (k-1)=φ

R2 1 (k-1)=φ

R2 2 (k-1) =ε

R2 3 (k-1)=1

R3 1 (k-1)=φ

R3 2 (k-1)=1

R3 3 (k-1)=0+ε

From theorem,

1 state R1 1 (k-1) = 1+ε

R1 2 (k-1)=1*0

R1 3 (k-1)=φ

R2 1 (k-1)=φ

R2 2 (k-1)=ε

R2 3 (k-1)=1

R3 1 (k-1)=φ

R3 2 (k-1)=1

R3 3 (k-1)=0+ε

2 state R1 1 (k-1) = 1+ε

R1 2 (k-1)=1*0

R1 3 (k-1)=1*01

R2 1 (k-1)=φ

R2 2 (k-1)=ε

R2 3 (k-1)=1

R3 1 (k-1)=φ

R3 2 (k-1)=1

R3 3 (k-1)=0+ε

Rij(k)= Rij(k-1)

Where k=1, Ri

+ Rik(k-1) (Rkk(k-1))* Rkj(k-1)

(0

Page 16: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

j (0)= Rij (0) + Rik (0) (Rkk (0))* Rkj ) R1 (0 + 1 (0

) = R11 ) R11 (0) (R11 (0))* R11 (0) termediate k value 2. Therefore,

Page 17: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

R13(2) =1*01(0+11)*

4. Explain the procedure Converting DFA’s to regular expression by eliminating states

Converting DFA’s to regular expression by eliminating states:

The DFA to regular expression can also be followed in NFA or even by ε NFA’s.

approach

to

duplicating work at some points.

For example all the expressions of (Rkk(k-1))*

The approach for constructing regular expression that we solve is by eliminating state. When we eliminate state s, all the path went through s no longer exist as shown in the figure below.

From the above figure , s is to be eliminated The predecessor are q1,q2…qk

The successor states are p1,p2…pm

We connect directly from q to p that moves through s. rather than single symbol it involves strings a simple way to regular expression.

The automata with regular expression as labels. The language of the automation is the union

Page 18: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

of overall paths from the start state to an accepting state of the language formed by

concatenation the language of regular expression along that path.

Each symbol a or ε if it is allowed can be thought as a regular expression whose language is

Page 19: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

a single string {a} or {ε} The possibility of some q’s and p’s by assuming that s is not among q’s and p’s even if there

is a loop from s to itself.

The regular expression Qi is a called labels of qi

The regular expression of Pm is called labels of pm The regular expression Rij on the arc from qi to pj for all i and j.

of eliminating state s, result

All arcs involving state is arc deleted.

For predecessor qi of s and for each successor pj of a, a regular expression thar represents all the paths that start at qi, go to s, perhaps loop around s zero or more times and finally go to pj

The expression for these paths is QiS*Pj. This expression is added to the arc from qi to pj. If there was no arc qi-> pj then first introduce one with regular expression φ.

The strategy for constructing a regular expression from a finite automation is as follows:

For each accepting state q, apply the above reduction process to produce an equivalent automation with regular expression labels on the arc. Eliminate all the states except q and the start state q0.

o

If q≠q0 then we shall be left with a two state automation that looks in the figure below

Page 20: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

(R+SU*T)SU*

L(R) or L(SU*T) => repeats start state itself number of

Page 21: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

times SU*T => path go to the accepting state via path L(S).

The desired regular expression is the sum of all the expression described from the

reduced automata for each accepting state by rules above.

4. Convert the NFA in the figure which accepts all strings containing 110.

Solution:

replace with regular expression

1. Using the figure we write

Q1=1 P1=1 R11=φ

A to c= φ +1φ1

=11 Eliminating state

B

2. Eliminating state c Q1=11 P1=0 R11=φ

A to c= φ +11φ0

=110

Page 22: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

3. The generic two state regular expression

Page 23: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

here R=0+1 S=110 T=φ U=0+1 = ((0+1)+ 110(0+1)* φ) 110(0+1)*

=(0+1)*.110.(0+1)*

5. Every language defined by a regular expression is also defined by a finite automation.

Converting regular expression to automata: Theorem:

Every language defined by a regular expression is also defined by a finite automation.

Proof: Suppose L=L® for regular expression R we show L=L(E) for some ε NFA E with

Exactly one accepting state No arcs into initial state

No arcs out of the accepting state

Basis: a. The language of automation is easily seen to be {ε}

b. The language shows the construction of {φ}

c. This automation for regular expression a is denoted by {a}.

Page 24: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Induction:

We assume statement of the theorem is true for the immediate sub

expression of a regular expression that is the language of these sub expression are also the language of ε NFA’s with a single accepting state.

There are four cases:

1. The expression is R+S for smaller expression R and S

2. The expression RS for some smaller expression R and S.

3. R* for some smaller expression R. that automation allows to go either a. Directly from the start state to the accepting state along a path labeled ε. That path

lets us accept ε which is in L(R*) no matter what R’s expression is.

b. To the start state of the automation for R., through that automation one or more times, and then the accepting state. this allows the path as L(R), L(R)L(R),

L(R)L(R)L® and so on this covering all strings on L(R*).

4. The expression is (R) for some smaller expression R. the automation for R also serves as

the automation for (R), parenthesis doesn’t change value.

Problem: 6. Let us convert the regular expression (0+1)*1(0+1)

Solution: Step 1:

Construct automation for 0+1.

Step 2: construct automation for (0+1)*

Page 25: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

7. APPLICATIONS OF REGULAR EXPRESSION

APPLICATIONS OF REGULAR EXPRESSION

A regular expression that gives a picture of the pattern we want to recognize is the medium of choice for applications that search for patterns in text.

The regular expressions are then compiled behind the scene into DFA or NFA which are stimulated to produce program that recognize patterns in text.

Regular expression in UNIX

Lexical analysis

Finding patterns in text

Regular expression in UNIX:

The UNIX extensions include certain features especially the ability to name and refer to previous strings that have matched a pattern that actually allow non-regular language

to be recognized.

The first enhancement to the regular expression notation concerns the font that most real applications deal with ASCII character set.

The regular expression allows writing character classes to represent large set

of characters as succinctly as possible.

The rules for characters are

o mean all the character from x to y in ASCII sequence o The symbol. (dot) stands for any characters.

Page 26: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

o The sequence [a1,a2,…ak] stands for both number and characters

Page 27: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Lexical analysis:

It scans the source program and recognize all tokens those substrings of consecutive characters that belong together logically.

Keywords and identifiers are common examples of tokens.

UNIX command lex and its GNU version flex accept as input a list of regular expression in the UNIX style, each followed by a bracketed section of code that

indicate what lexical analyzer is to do when it finds an instance of that token.

The one that creates tokens is called as lexical analyzer generator.

Regular expression notation is exactly as powerful as we need to describe tokens.

These commands are able to use regular expression to DFA conversion process to generate an efficient function that breaks source programs into tokens.

Finding patterns in text:

Automata could be used to search efficiently for a set of words in a large repository such as web

By using regular expression notation it becomes easy to describe the patterns at a high level with little effort and to modify description quickly when things go wrong

We want to scan a very large number of web pages and detect even addresses.

REGULAR GRAMMARS

A regular grammar is represented by G={V,T,P,S}

Where,

G= regular grammar V=set of non terminals P=product rules

S=start symbol where S€ V

Page 28: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

A->a A, B are the non terminals and a is the input symbol.

Construction of regular grammars and regular expression:

Construct a NFA with ε from regular expression

Eliminate ε transitions and convert it into equivalent DFA.

From constructed DFA the corresponding states become non terminal symbols and transition made are equivalent to production rules.

Problem:

8. Construct regular grammar for a*b(a+b)*

Solution:

Step1: convert into DFA

Step 2: convert it into ε closures

Page 29: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Step 3: eliminate the ε

Note: check whether the resultant automation is DFA or NFA, if it is DFA proceed to regular grammar else convert it from NFA to DFA and then go to regular grammar

Step 4: write the regular grammar G= {V, T, P, S}

V={ q0, q1}

T=a,b

q 0->bq0 q 0->b

q1->a q1

q1->b q1

q1-> a a1->b}

Page 30: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

9.

Explain The Pumping Lema For Regular Languages PROPERTIES OF REGULAR LANGUAGES Regular language has 3 major properties Pumping Lemma Closure properties Decision properties

PUMPING LEMMA FOR REGULAR LANGUAGES

Theorem:

Let L be a regular language. Then there exists a constant n (which depends on L) such that every string w in L such that |w|>=n, we can break w into three strings, w=xyz, such that

1. y≠є

2. |xy|<=n

3. For all k>=0, the string xykz is also in L

Proof:

Let L be regular. Then L=L(A) for some DFA A.

Let A has n states

Consider any string of length n or more, say m

o w=a1a2….. anan+1an+2 ….. Am o where m>=n

For i=0,1,2,……n define state pi to be d(q0,a1a2….. ai) where d is the transition function of A. then p0 =q0

By the pigeonhole principle, it is not possible for the n+1 different pi for i=0,1,2,….n to be distinct since there are only n state

Page 31: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

This indicates a repetition of some state and we can find two integers i and j, with

0<=i<j<=n such that pi=pj.

We can break w=xyz as follows

1. x=a1a2….. ai

2. y= ai+1ai+2….. aj

3. z= aj+1aj+2….. am

11. Explain the CLOSURE PROPERTIES OF REGULAR LANGUAGES

Closure properties express the idea that any language L formed from any regular language by certain operations, then L is also regular.

Closure Properties

Union (of two regular language is regular)

Intersection

Complement

Difference

Reversal

Homomorphism

Page 32: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Closure under Union

Let L and M are regular

Then these languages have regular expressions R and S i.e. L=L(R) and M=L(S)

LUM = L(R) U L(S) = L(R+S)

Since R and S are the regular expressions then R+S is also regular

This implies LUM is also regular (Union is Closed)

Closure under complementation

If L is regular then to prove that its complement Ľ (L bar), which is defined as S * - L, is

also regular.

If there exists any DFA that accepts Ľ, then we say that the complement of L is also

regular.

For this we learn to construct such DFA

Constructing a DFA to accept L complement

Since L is regular, there exists a DFA that accepts it.

Let L=L(R ).

Convert the regular expression R to an є-NFA.

Convert the є-NFA to a DFA by the subset construction.

Complement the accepting states of that DFA i.e. now the accepting states of this new DFA are the states other than the accepting states of previous one (i.e. Q-F)

Turn the complement DFA into a regular expression

Proving that the complement of a regular language is also regular

Page 33: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Let L=L(A) for some DFA A

Where A= (Q, S,d, q0,F)

Then Ľ=L(B) where B=(Q, S,d, q0,Q-F)

Then A and B are same other than the fact that their accepting states are different

Then any string w is in L(B) if and only if d*(q0,w) is in Q-F which occurs only if w is not in L(A). Hence Ľ is regular.

Example:

Prove that L(A) = (0+1)*01 is regular.

Prove that the language consisting of all strings that do not end in 01 is regular.

If this language is regular, draw a DFA accepting the language.

DFA accepting language that does Prove that there is a DFA that accepts all strings of L ∩ M

Let a string w ∈ L ∩ M

Let us assume that the newly constructed DFA is the required one

This means w ∈ L and w ∈ M

dL(qL,w) ∈FL and

Since L and M are regular, w is accepted by the corresponding automata i.e.

d (( qL, qM),w) ∈(FL,FM)

This indicates that

Page 34: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

dM (qM,w) ∈ FMnot end with 01

Page 35: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Closure under Intersection

Intersection of two regular languages L and M is written as L ∩ M

Demorgan’s Law

L∩M= L U M

If L is regular, then complement of L is also regular.

If M is regular, then complement of M is also regular.

The union of two regular languages is proved to be regular

Hence intersection of two regular languages is also regular.

Special DFA construct for intersection of two regular languages

Let L and M be the languages of automata

AL=(QL, S,dL, qL,FL) and

AM=(QM, S,dM, qM,FM)

Construct an automaton

A = (QL XQM, S,d, qLXqM,FLXFM)

Where the states are the pairs of type (p,q) such that p is in QL and q is in QM

The alphabet S is assumed to be the same.

Page 36: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

The transition function d = dLXdM such that

o d((p,q),a)= an intermediate state of A which is a pair of states obtained by transition of state p (in AL) and of state q (in AM)

Page 37: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

\\

d((p,q),a)=(dL(p,a) ,dM (q,a))

The start state of the new DFA =(qL ,qM)

Final states of the new DFA = cross products of FL and FM

Now to prove that L ∩ M is regular

Prove that there is a DFA that accepts all strings of L ∩ M

Let a string w ∈ L ∩ M

Let us assume that the newly constructed DFA is the required one

This means w ∈ L and w ∈ M

Since L and M are regular, w is accepted by the corresponding automata i.e.

dL(qL,w) ∈FL and dM (qM,w) ∈ FM

d (( qL, qM),w) ∈(FL,FM)

This indicates that

Hence L ∩ M is Regular

Closure under Difference

L-M = set of all strings that are in L and not in M

i.e. L-M = L ∩ M

Since Complement of M is regular as M is regular, Intersection of L and M is also regular

Hence the difference of two regular languages is regular

Example:

Let L=L(A) and M= L(B) where automata A and B are described by the following transition diagrams

Page 38: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Reversal of a regular language

Reversal of a string w = a1a2……an is the string anan-1……a1 denoted as wR.

Reversal of string 110101 is 101011

Reversal of string abaaa is aaaba

Reversal of a language L is the language consisting of the reversals of all its strings

LR= {wR | w∈ L}

Example: reversal of a language

Let L= {001,10,111} then

LR = {100,01,111} 12. Proving reversal of a regular language is regular: through DFA construct

Proving reversal of a regular language is regular: through DFA construct

Construct an automaton for LR

o Reverse all arcs in the transition diagram for A

o Make the start state of A the only accepting state for the new automaton

o Create a new start state P0 with transitions on є to all the accepting states of A

Page 39: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

L={100,1100,…101,1101,…}

Example: L=1*0(0+1)*

3.10

Page 40: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

LR = {001,0011,….101,1011,…}

First we draw a DFA for this language L

Example: drawing a new automaton for LR= (0+1)*01*

Reverse the arcs and make p as accepting state

Introduce a new start state p0 and draw an arc with symbol є from p0 to the final state of A

Homomorphism

A string homomorphism is a function on strings that works by substituting a particular string for each symbol.

Example : Let S= {0,1}; w=0011

o then function h(0)=ab

h(1)=є

Then h(w)= h(0)h(0)h(1)h(1)=ababєє

h(w)=abab

Page 41: REGULAR EXPRESSIONS AND LANGUAGES - Vidyarthiplus

www.vidyarthiplus.com RJ edition

www.vidyarthiplus.com

Homomorphism to a language is applied by applying it to each of the strings in the language

h(L) = {h(w) | w is in L}

Let L= 10*1

i.e. L={11,101,1001,…}

Then h(L) = {єє, єabє, єababє,…} or (ab)*

Theorem

If L is regular then h(L) is also regular

Let L=L( R) for regular expression R

Let E be an expression over S

Then let h(E) be an expression obtained by replacing each symbol a of S in E by h(a)

Then h(R) defines the language h (L).