Zvi Kohavi and Niraj K. Jha: Finite-state Recognizers


Post on 01-Apr-2015


Page 1: Zvi Kohavi and Niraj K. Jha 1 Finite-state Recognizers

Zvi Kohavi and Niraj K. Jha

Finite-state Recognizers

Page 2: Deterministic Recognizers

Treat an FSM as a recognizer that classifies input strings into two classes: strings it accepts and strings it rejects

Finite-state recognizer:
• The input tape is equivalent to a string of input symbols that enter the machine at successive times
• Finite-state control: Moore FSM
• States in which output symbol is 1 (0): accepting (rejecting) states
• A string is accepted by an FSM if and only if the state the FSM enters after having read the rightmost symbol is an accepting state
• Set of strings recognized by an FSM: all input strings that take the FSM from its starting state to an accepting state

[Figure: a finite control coupled through a head to a tape of input symbols.]
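The acceptance rule for a deterministic recognizer can be sketched in a few lines of Python. This is only an illustration: the two-state machine, the state names `even` and `odd`, and the function name `accepts` are assumptions made for the example and do not come from the slides.

```python
def accepts(transitions, outputs, start, string):
    """Run a deterministic Moore machine; accept if the last state entered outputs 1."""
    state = start
    for symbol in string:
        state = transitions[(state, symbol)]
    return outputs[state] == 1

# Hypothetical machine: accepts strings on {0,1} containing an odd number of 1's.
TRANSITIONS = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
OUTPUTS = {"even": 0, "odd": 1}   # output 1 marks an accepting state

print(accepts(TRANSITIONS, OUTPUTS, "even", "0100"))  # True: one 1
print(accepts(TRANSITIONS, OUTPUTS, "even", "0110"))  # False: two 1's
```

The null string is accepted exactly when the starting state itself has output 1, which is not the case for this machine.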

Page 3: Transition Graph

Example: a machine that accepts a string if and only if the string begins and ends with a 1, and every 0 in the string is preceded and followed by at least a single 1

Transition graph: consists of a set of vertices and various directed arcs connecting them
• At least one of the vertices is specified as a starting vertex
• Arcs are labeled with symbols from the input alphabet
• A vertex may have one or more I_i-successors or none
• It accepts a string if the string is described by at least one path emanating from a starting vertex and terminating at an accepting vertex
• It may be deterministic or non-deterministic

[Figure: (a) Deterministic state diagram. (b) Transition graph.]
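A transition graph accepts a string if at least one path spells it, so a simulation can track the set of vertices reachable so far. The sketch below assumes a two-vertex graph for the example language (A starting, B accepting, arcs A-1->B, B-1->B, B-0->A); these arcs are a reconstruction for illustration, since the figure's arcs are not recoverable from the transcript.

```python
def graph_accepts(arcs, starts, accepting, string):
    """Follow all paths at once by tracking the set of currently reachable vertices."""
    current = set(starts)
    for symbol in string:
        # A vertex with no successor for this symbol simply contributes nothing.
        current = {dst for src in current for dst in arcs.get((src, symbol), ())}
    return bool(current & accepting)

# Assumed arcs: a string must begin and end with 1, and every 0 must sit between two 1's.
ARCS = {("A", "1"): {"B"}, ("B", "1"): {"B"}, ("B", "0"): {"A"}}

print(graph_accepts(ARCS, {"A"}, {"B"}, "1011"))  # True
print(graph_accepts(ARCS, {"A"}, {"B"}, "1001"))  # False: the second 0 is not preceded by a 1
```

A missing arc plays the role of a rejecting "dead" state, which is why the graph needs fewer vertices than a completely specified state diagram.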

Page 4: Example

Example: 1110 and 11011 are accepted by the transition graph below, but 100 is rejected

Equivalent transition graphs: two or more graphs that recognize the same set of strings

• Each graph below accepts a string if and only if each 1 is preceded by at least two 0’s

[Figure: a transition graph with vertices A, B, C, and D, followed by equivalent transition graphs with vertices A, B, and C.]

Page 5: Graphs Containing λ-Transitions

λ-transitions: when no input symbol is used to make the transition

Example: Graph that recognizes a set of strings that start with an even number of 1’s, followed by an even number of 0’s, and end with substring 101

[Figure: (a) A graph containing a λ-transition. (b) An equivalent graph without λ-transitions.]
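One way to simulate a graph with λ-transitions is to extend the reachable set, after every step, with everything reachable through λ-arcs alone. The sketch below assumes a seven-vertex graph for the set (11)*(00)*101 with a single λ-arc from A to C; the arc layout is an illustrative guess and not necessarily the one drawn in the figure.

```python
def lambda_closure(vertices, lam):
    """All vertices reachable from `vertices` using only lambda-arcs."""
    closure, stack = set(vertices), list(vertices)
    while stack:
        v = stack.pop()
        for w in lam.get(v, ()):
            if w not in closure:
                closure.add(w)
                stack.append(w)
    return closure

def accepts_with_lambda(arcs, lam, start, accepting, string):
    current = lambda_closure({start}, lam)
    for symbol in string:
        moved = {w for v in current for w in arcs.get((v, symbol), ())}
        current = lambda_closure(moved, lam)
    return bool(current & accepting)

# Assumed graph: even number of 1's, then even number of 0's, then the substring 101.
ARCS = {
    ("A", "1"): {"B"}, ("B", "1"): {"A"},                       # pairs of 1's
    ("C", "0"): {"D"}, ("D", "0"): {"C"},                       # pairs of 0's
    ("C", "1"): {"E"}, ("E", "0"): {"F"}, ("F", "1"): {"G"},    # suffix 101
}
LAMBDA = {"A": {"C"}}   # the lambda-transition from A to C

print(accepts_with_lambda(ARCS, LAMBDA, "A", {"G"}, "1100101"))  # True
print(accepts_with_lambda(ARCS, LAMBDA, "A", {"G"}, "100101"))   # False: odd number of leading 1's
```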

Page 6: Converting Nondeterministic into Deterministic Graphs

Example: Transition graph and its transition table

[Figure: (a) Transition graph. (b) Transition table.]

• Successor table and deterministic graph:

[Figure: (a) Successor table. (b) State diagram of an equivalent deterministic machine.]

Page 7: Theorem

Theorem: Let S be a set of strings that can be recognized by a nondeterministic transition graph G_n. Then S can also be recognized by an equivalent deterministic graph G_d. Moreover, if G_n has p vertices, G_d will have at most 2^p vertices
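The bound of 2^p follows from the usual subset construction, in which each vertex of the deterministic graph is a set of vertices of the original graph. The sketch below applies it to a hypothetical three-vertex graph (it is not the graph from the preceding example); the names `determinize` and `ARCS` are assumptions for illustration.

```python
def determinize(arcs, start, accepting, alphabet):
    """Subset construction: each deterministic vertex is a frozenset of original vertices."""
    start_set = frozenset([start])
    table, todo = {}, [start_set]
    while todo:
        subset = todo.pop()
        if subset in table:
            continue
        table[subset] = {}
        for symbol in alphabet:
            succ = frozenset(w for v in subset for w in arcs.get((v, symbol), ()))
            table[subset][symbol] = succ
            todo.append(succ)
    # A subset is accepting if it contains at least one accepting original vertex.
    det_accepting = {s for s in table if s & accepting}
    return table, start_set, det_accepting

# Hypothetical nondeterministic graph with p = 3 vertices: accepts strings ending in 01.
ARCS = {("A", "0"): {"A", "B"}, ("A", "1"): {"A"}, ("B", "1"): {"C"}}
table, start_set, det_accepting = determinize(ARCS, "A", {"C"}, "01")

print(len(table) <= 2 ** 3)   # the deterministic graph stays within the 2^p bound
```

Only subsets reachable from the starting subset are generated, so in practice the deterministic graph is often much smaller than the worst case.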

Page 8: Regular Expressions

Example: Sets of strings and the corresponding expression
• Graph (a) recognizes set {101}: expression denoted as 101
• Graph (b) recognizes set {01,10}: expression = 01 + 10
• Graph (c) recognizes {0111,1011}: expression = 0111 + 1011
  – Concatenation of 01 + 10 and 11
• Graph (d) recognizes set {λ,1,11,111,1111,…}: expression = 1*

[Figure: transition graphs (a), (b), (c), and (d).]

Page 9: Regular Expressions (Contd.)

Example: 01(01)* = 01 + 0101 + 010101 + 01010101 + …
• R* = λ + R + R^2 + R^3 + …

Example: Set of strings on {0,1} beginning with a 0 and followed only by 1’s: 01*

Example: Set of strings on {0,1} containing exactly two 1’s: 0*10*10*

Example: Set of all strings on {0,1}: (0+1)* = λ + 0 + 1 + 00 + 01 + 10 + 11 + 000 + …

Example: Set of strings on {0,1} that begin with substring 11: 11(0+1)*
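These expressions can be tried out with Python's standard `re` module, as a rough sketch. The only translation assumed here is that the union operator `+` of the slides is written `|` in `re` syntax; the helper name `in_set` and the `EXAMPLES` table are made up for the illustration.

```python
import re

# Slide notation -> Python `re` syntax: union "+" becomes "|"; closure "*" is unchanged.
EXAMPLES = {
    "01*": r"01*",
    "0*10*10*": r"0*10*10*",
    "(0+1)*": r"(0|1)*",
    "11(0+1)*": r"11(0|1)*",
}

def in_set(slide_expr, string):
    """True if the whole `string` belongs to the set denoted by the slide expression."""
    return re.fullmatch(EXAMPLES[slide_expr], string) is not None

print(in_set("0*10*10*", "01001"))  # True: exactly two 1's
print(in_set("0*10*10*", "0111"))   # False: three 1's
print(in_set("11(0+1)*", "1101"))   # True: begins with 11
```

`re.fullmatch` is used rather than `re.search` because set membership requires the entire string to match the expression.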

Example: Transition graphs and the sets of strings they recognize

[Figure: (a) (01 + 10)*11. (b) (10*)*.]

Page 10: Definition and Basic Properties

Let A = {a_1, a_2, …, a_p} be a finite alphabet: then the class of regular expressions over alphabet A is defined recursively as follows:

• Any symbol a_1, a_2, …, a_p alone is a regular expression: as are the null string λ and the empty set φ
• If P and Q are regular expressions: then so is their concatenation PQ and their union P+Q
• If P is a regular expression: then so is its closure P*
• No other expressions are regular: unless they can be generated in a finite number of applications of the above rules

Recognizers for λ and φ:

[Figure: graphs accepting λ and φ.]

Page 11: Identities

φ + R = R

φR = Rφ = φ

λR = Rλ = R

λ* = λ

φ* = λ

Set of strings that can be described by a regular expression: regular set
• Not every set of strings is regular
• Set over {0,1}, which consists of k 0’s (for all k), followed by a 1, followed in turn by k 0’s, is not regular:
  010 + 00100 + 0001000 + … + 0^k10^k + …
  – Requires an infinite number of applications of the union operation
• However, certain infinite sums are regular
  – Set consisting of alternating 0’s and 1’s, starting and ending with a 1: 1(01)*

Page 12: Manipulating Regular Expressions

A regular set may be described by more than one regular expression
• Such expressions are called equivalent

Example: Alternating 0’s and 1’s, starting and ending with 1
• 1(01)* or (10)*1

Let P, Q, and R be regular expressions: then

R + R = R

PQ + PR = P(Q+R); PQ + RQ = (P + R)Q

R*R* = R*

RR* = R*R

(R*)* = R*

λ + RR* = R*

(PQ)*P = P(QP)*

(P + Q)* = (P*Q*)* = (P* + Q*)* = P*(QP*)* = (P*Q)*P*

λ + (P + Q)*Q = (P*Q)*

Page 13: Examples

Example: Prove that the set of strings in which every 0 is immediately followed by at least two 1’s can be described by both R1 and R2, where

R1 = λ + 1*(011)*(1*(011)*)*

R2 = (1 + 011)*

Proof: R1 = λ + 1*(011)*(1*(011)*)*

= (1*(011)*)*   [by λ + RR* = R*]

= (1 + 011)* = R2   [by (P + Q)* = (P*Q*)*]

Example: Prove the identity

(1 + 00*1) + (1 + 00*1)(0 +10*1)*(0 + 10*1) = 0*1(0 + 10*1)*

Proof: LHS = (1 + 00*1)[λ + (0 + 10*1)*(0 + 10*1)]

= [(λ + 00*)1][λ + (0 + 10*1)*(0 + 10*1)]

= 0*1(0 + 10*1)*   [by λ + RR* = R* and RR* = R*R]
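An identity like this can be sanity-checked, though not proved, by comparing the two sets on all strings up to some length. The sketch below does so with Python's `re` module, writing union as `|`; the helper `language` and the length bound 8 are arbitrary choices for the illustration.

```python
import re
from itertools import product

def language(pattern, max_len, alphabet="01"):
    """All strings of length at most `max_len` that fully match `pattern`."""
    compiled = re.compile(pattern)
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(chars))
    }

LHS = r"(1|00*1)|(1|00*1)(0|10*1)*(0|10*1)"
RHS = r"0*1(0|10*1)*"

# Agreement on all short strings supports the identity; the algebraic proof covers every length.
print(language(LHS, 8) == language(RHS, 8))
```

Both sides describe the strings on {0,1} with an odd number of 1's, which gives a second, independent way to read the result.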

Page 14: Transition Graphs Recognizing Regular Sets

Theorem: Every regular expression R can be recognized by a transition graph

Proof:

[Figure: graphs recognizing the elementary expressions R = λ, R = φ, and R = a_i.]

[Figure: (a) Graphs recognizing P and Q. (b) A graph recognizing P+Q. (c) A graph recognizing PQ. (d) A graph recognizing P*.]

Page 15: Example

Example: Construct a transition graph recognizing R = (0 + 1(01)*)*

[Figure: (a) R = P*; P = 0 + 1(01)*. (b) P = 0 + Q; Q = 1(01)*. (c) Q = 1T; T = (01)*. (d) Final step.]

Page 16: Example (Contd.)

Example: Prove that (P + Q)* = P*(QP*)*

[Figure: (a) Graph recognizing P*(QP*)*. (b) Equivalent graph with no λ-transitions. (c) Equivalent deterministic graph recognizing (P + Q)*.]

Page 17: Informal Techniques

Example: Construct a graph that recognizes P = (01 + (11 + 0)1*0)*11

[Figure: graph for Q = (11 + 0)1*0, followed by graph for P.]

Page 18: Example

Example: Construct a graph that recognizes R = (1(00)*1 + 01*0)*

[Figure: (a) Partial graph. (b) Complete graph.]

Page 19: Regular Sets Corresponding to Transition Graphs

The set of strings that can be recognized by a transition graph (hence, an FSM) is a regular set

Theorem: Let Q, P, and R be regular expressions on a finite alphabet. Then, if P does not contain λ:

• Equation R = Q + RP has a unique solution given by R = QP*
• Equation R = Q + PR has a unique solution given by R = P*Q
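The first equation can be explored numerically by working with finite sets of strings truncated to a maximum length. The sketch below checks that R = QP* satisfies R = Q + RP for one hypothetical choice of Q and P, and shows why the condition on λ matters; the sets, the bound `N`, and the helper names are all assumptions for illustration, and a bounded check is not a proof.

```python
from itertools import product

def concat(X, Y, n):
    """Concatenation of two finite sets of strings, truncated to length n."""
    return {x + y for x in X for y in Y if len(x) + len(y) <= n}

def star(X, n):
    """Closure X*, truncated to length n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, X, n) - result   # only strings not seen before
        result |= frontier
    return result

N = 8
Q, P = {"0"}, {"1", "00"}            # P does not contain the null string
R = concat(Q, star(P, N), N)         # candidate solution R = QP*
print(R == Q | concat(R, P, N))      # R satisfies R = Q + RP up to length N

# If P contains the null string, uniqueness fails: the set of all strings is also a solution.
P_bad = {"", "1"}
ALL = {"".join(c) for k in range(N + 1) for c in product("01", repeat=k)}
print(ALL == Q | concat(ALL, P_bad, N))       # a solution of R = Q + RP
print(ALL == concat(Q, star(P_bad, N), N))    # yet different from QP*
```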

Page 20: Systems of Equations

Example: Derive the set of strings recognized by the following transition graph

A = λ + A0 + B1 (1)

B = A0 + B1 + C0 (2)

C = B0 (3)

Substituting (3) into (2):

B = A0 + B1 + B00 = A0 + B(1 + 00) (4)

From the theorem:

B = A0(1 + 00)* (5)

Substituting (5) into (1):

A = λ + A0 + A0(1 + 00)*1 = λ + A(0 + 0(1 + 00)*1) (6)

From the theorem:

A = λ(0 + 0(1 + 00)*1)* = (0 + 0(1 + 00)*1)* (7)

Hence, the solution for C follows from (7), (5), and (3):

C = (0 + 0(1 + 00)*1)*0(1 + 00)*0
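The derived expression can be cross-checked against a direct simulation of the graph. In the sketch below the arcs are read off equations (1) to (3), taking a term Xa in the equation for Y to mean an arc from X to Y labeled a; treating A as the starting vertex and C as the accepting vertex is an assumption inferred from the equations, and union is written `|` in Python's `re` syntax.

```python
import re
from itertools import product

ARCS = {
    ("A", "0"): {"A", "B"},   # A0 appears in (1) and in (2)
    ("B", "1"): {"A", "B"},   # B1 appears in (1) and in (2)
    ("C", "0"): {"B"},        # C0 appears in (2)
    ("B", "0"): {"C"},        # B0 appears in (3)
}
START, ACCEPTING = "A", {"C"}   # assumed: the null string appears only in the equation for A

def graph_accepts(string):
    current = {START}
    for symbol in string:
        current = {w for v in current for w in ARCS.get((v, symbol), ())}
    return bool(current & ACCEPTING)

C_EXPR = re.compile(r"(0|0(1|00)*1)*0(1|00)*0")

# Strings (up to length 8) on which the graph and the derived expression disagree.
mismatches = [
    "".join(bits)
    for n in range(9)
    for bits in product("01", repeat=n)
    if graph_accepts("".join(bits)) != (C_EXPR.fullmatch("".join(bits)) is not None)
]
print(mismatches)
```

An empty list means the expression and the graph agree on every string tested.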

[Figure: transition graph with vertices A, B, and C.]

Page 21: Theorem

Theorem: The set of strings that take an FSM M from an arbitrary state S_i to another state S_j is a regular set

• Combining the two theorems:
  – An FSM recognizes a set of strings if and only if it is a regular set

Applications: the correspondence between regular sets and FSMs enables us to determine whether certain sets are regular

Example: Let R denote a regular set on alphabet A that can be recognized by machine M1

• Complement R’: set containing all the strings on A that are not contained in R

• R’ describes a regular set: since it can be recognized by a machine M2, which is obtained from M1 by complementing the output values associated with the states of M1
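The complement construction amounts to flipping every output bit while leaving states and transitions alone. The sketch below uses a hypothetical two-state machine M1; the state names and helper functions are assumptions for the example.

```python
def run(transitions, outputs, start, string):
    """Output of the state a deterministic, completely specified machine ends in."""
    state = start
    for symbol in string:
        state = transitions[(state, symbol)]
    return outputs[state]

def complement_outputs(outputs):
    """M2 keeps the states and transitions of M1 but flips every output bit."""
    return {state: 1 - bit for state, bit in outputs.items()}

# Hypothetical M1: output 1 exactly when the string read so far ends in 1.
T = {("s0", "0"): "s0", ("s0", "1"): "s1", ("s1", "0"): "s0", ("s1", "1"): "s1"}
OUT1 = {"s0": 0, "s1": 1}
OUT2 = complement_outputs(OUT1)

print(run(T, OUT1, "s0", "0101"), run(T, OUT2, "s0", "0101"))  # 1 0
```

The trick relies on M1 being deterministic and completely specified; flipping the accepting vertices of a nondeterministic transition graph does not, in general, give the complement.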

Page 22: Examples

Example: Let P&Q represent the intersection of sets P and Q
• Prove P&Q is regular
• Since P’ and Q’ are regular:
  – P’ + Q’ is regular
  – Hence, (P’ + Q’)’ is regular
  – Since P&Q = (P’ + Q’)’: P&Q is regular

Regular expressions containing complementation, intersection, union, concatenation, closure: extended regular expressions

Example: Consider the set of strings on {0,1} s.t. no string in the set contains three consecutive 0’s
• Set can be described by: [(0 + 1)*000(0 + 1)*]’
• More complicated expression if complementation not used: (1 + 01 + 001)*(λ + 0 + 00)
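The two descriptions can be compared on all short strings. In this sketch the complemented expression is evaluated directly as "does not contain 000", and the complement-free expression is written in Python `re` syntax with `(0|00)?` standing in for (λ + 0 + 00); the length bound 10 is arbitrary.

```python
import re
from itertools import product

# (1 + 01 + 001)*(λ + 0 + 00), with union written "|" and the optional tail as "(0|00)?".
WITHOUT_COMPLEMENT = re.compile(r"(1|01|001)*(0|00)?")

def has_no_000(s):
    """Membership in [(0 + 1)*000(0 + 1)*]': the string does not contain 000."""
    return "000" not in s

strings = ["".join(c) for n in range(11) for c in product("01", repeat=n)]
agree = all((WITHOUT_COMPLEMENT.fullmatch(s) is not None) == has_no_000(s) for s in strings)
print(agree)
```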

Page 23: Example

Example: Let M be an FSM whose input/output alphabet is {0,1}. Assume the machine has a designated starting state. Let z_1 z_2 … z_n denote the output sequence produced by M in response to input sequence x_1 x_2 … x_n. Define a set S_M, which consists of all the strings w s.t. w = z_1 x_1 z_2 x_2 … z_n x_n for any x_1 x_2 … x_n in (0 + 1)*. Prove that S_M is regular.

• Given the state diagram of M: replace each directed arc with two directed arcs and a new state, as shown in the figure

• Retain the original starting state: designate all the original states as accepting states

• The resulting nondeterministic graph recognizes S_M: thus S_M is regular

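[Figure: replace the arc from A to B labeled x/z with an arc labeled z from A to a new state, followed by an arc labeled x from the new state to B.]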
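The arc-splitting step can be sketched as follows. The Mealy machine below is hypothetical (it is not machine N of the next slide), and the names `split_arcs`, `interleave`, and the tuple used as the new intermediate state are assumptions for illustration.

```python
# Hypothetical Mealy machine: (state, input x) -> (next state, output z).
MEALY = {
    ("A", "0"): ("A", "0"), ("A", "1"): ("B", "1"),
    ("B", "0"): ("A", "1"), ("B", "1"): ("B", "0"),
}

def split_arcs(mealy):
    """Replace each arc src --x/z--> dst with src --z--> new state --x--> dst."""
    arcs = {}
    for (src, x), (dst, z) in mealy.items():
        mid = (src, x, z)                          # the new intermediate state
        arcs.setdefault((src, z), set()).add(mid)
        arcs.setdefault((mid, x), set()).add(dst)
    return arcs

def accepts(arcs, start, accepting, string):
    current = {start}
    for symbol in string:
        current = {w for v in current for w in arcs.get((v, symbol), ())}
    return bool(current & accepting)

def interleave(mealy, start, inputs):
    """Build w = z1 x1 z2 x2 ... zn xn for the given input sequence."""
    state, w = start, ""
    for x in inputs:
        state, z = mealy[(state, x)]
        w += z + x
    return w

ARCS = split_arcs(MEALY)
ORIGINAL_STATES = {"A", "B"}    # all original states are accepting

w = interleave(MEALY, "A", "0110")
print(w, accepts(ARCS, "A", ORIGINAL_STATES, w))
```

Because only the original states are accepting, strings of odd length, which end in one of the new intermediate states, are rejected.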

Page 24: Example (Contd.)

Example (contd.): Derive S_N for machine N shown below

[Figure: machine N, with arcs labeled 0/1, 1/0, 1/1, and 0/0 on states A and B. (a) Transition graph. (b) Equivalent deterministic form.]

Page 25: Two-way Recognizers

Two-way recognizer (or two-way machine): consists of a finite-state control coupled through a head to a tape

• Initially: the finite-state control is in its designated starting state, with its head scanning the leftmost square of the tape
• The machine then proceeds to read the symbols of the tape: one at a time
• In each cycle of computation: the machine examines the symbol currently scanned by the head, shifts the head one square to the right or left, and then enters a new (not necessarily distinct) state
• If the machine eventually moves off the tape on the right end entering an accepting state: the tape is accepted by the machine
• A machine can reject a tape: either by moving off its right end while entering a rejecting state or by looping within the tape
• Null string can be represented either by: the absence of an input tape or by a completely blank tape
• A machine accepts λ if and only if: its starting state is an accepting state

Page 26: Example

Example: A two-way machine recognizing set 100*

[Figure: (a) A loop. (b) Rejection of a tape.]

Page 27: Convenience of Using Two-way Machines

Two-way machines are as powerful as one-way machines w.r.t the class of tapes they can recognize

• However, for some computations: it is convenient to use two-way machines since they may require fewer states

Example: Consider the two-way machine shown in the table, which accepts a tape if and only if it contains at least three 1’s and at least two 0’s

• The minimal one-way machine that is equivalent to the two-way machine has 12 states: since it must examine the tapes for the appropriate number of 0’s and 1’s simultaneously
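A two-way machine for this set can be simulated as sketched below. The state table is a hypothetical seven-state design and is not copied from the slide's table: it assumes a left end marker ¢ on the leftmost square, scans right until it has seen three 1's, rewinds to the marker, and then scans right again looking for two 0's. Looping is detected by noticing a repeated (state, head position) pair.

```python
R, L = +1, -1   # head moves: one square to the right or to the left

# Hypothetical table: (state, scanned symbol) -> (next state, head move).
TABLE = {
    ("N0", "¢"): ("N0", R), ("N0", "0"): ("N0", R), ("N0", "1"): ("N1", R),
    ("N1", "0"): ("N1", R), ("N1", "1"): ("N2", R),
    ("N2", "0"): ("N2", R), ("N2", "1"): ("BACK", L),    # third 1 found: rewind
    ("BACK", "0"): ("BACK", L), ("BACK", "1"): ("BACK", L), ("BACK", "¢"): ("Z0", R),
    ("Z0", "1"): ("Z0", R), ("Z0", "0"): ("Z1", R),
    ("Z1", "1"): ("Z1", R), ("Z1", "0"): ("ACC", R),      # second 0 found
    ("ACC", "0"): ("ACC", R), ("ACC", "1"): ("ACC", R),
}

def two_way_accepts(table, start, accepting, string, end_marker="¢"):
    tape = end_marker + string
    state, pos, seen = start, 0, set()
    while 0 <= pos < len(tape):
        if (state, pos) in seen:       # same state on the same square again: a loop
            return False
        seen.add((state, pos))
        move = table.get((state, tape[pos]))
        if move is None:               # unspecified entry: treated here as rejection
            return False
        state, step = move
        pos += step
    # Accept only when the head leaves the right end in an accepting state.
    return pos == len(tape) and state in accepting

print(two_way_accepts(TABLE, "N0", {"ACC"}, "11100"))  # True
print(two_way_accepts(TABLE, "N0", {"ACC"}, "1100"))   # False: only two 1's
```

The two passes count 1's and 0's separately, which is how seven states suffice where a one-way machine has to track both counts at once.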

[Figure: (a) Rejecting a tape. (b) Accepting a tape.]