cse 311 foundations of computing i lecture 27 fsm limits, pattern matching autumn 2012 cse 311 1

40
CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 Autumn 2012 CSE 311 1

Upload: shanon-cox

Post on 17-Jan-2016

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

CSE 311 Foundations of Computing I

Lecture 27FSM Limits, Pattern Matching

Autumn 2012

Autumn 2012 CSE 311 1

Page 2: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Announcements• Reading assignments

– 7th Edition, Section 13.4– 6th Edition, Section 12.4– 5th Edition, Section 11.4

• Next week– 7th edition: 2.5 (Cardinality) + p. 201 and 13.5 – 6th edition: pp. 158-160 (Cardinality)+ p 177 and 12.5– 5th edition: Pages 233-236 (Cardinality) and 11.5

• Topic list and sample final exam problems have been posted• Comprehensive final, roughly 67% of material post midterm• Review session, Saturday, December 8, 10 am – noon (tentatively)• Final exam, Monday, December 10

– 2:30-4:20 pm or 4:30-6:20 pm, Kane 220.– If you have a conflict, contact instructors ASAP

Autumn 2012 CSE 311 2

Page 3: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Last lecture highlights

• NFAs from Regular Expressions

Autumn 2012 CSE 311 3

(01 1)*0

0

0

1

1

Page 4: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Last lecture highlights

• “Subset construction”: NFA to DFA

Autumn 2012 CSE 311 4

c

a

b

0

0,1

1

0NFA

a,b

DFA

0

c

1

b

b,c

1

0

a,b,c

1

0,1

0

0

1

10

Page 5: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

5

Converting an NFA to a regular expression

• Consider the DFA for the mod 3 sum– Accept strings from {0,1,2}* where the digits mod

3 sum of the digits is 0

Autumn 2012 CSE 311

t0 t2

t1

0

00

1 1

1

2

22

Page 6: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

6

Splicing out a node

• Label edges with regular expressions

Autumn 2012 CSE 311

t0 t2

t1

0

00

1 1

1

2

22

t0→t1→t0 : 10*2t0→t1→t2 : 10*1t2→t1→t0 : 20*2t2→t1→t2 : 20*1

Page 7: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

7

Finite Automaton without t1

Autumn 2012 CSE 311

t0 t2

R1

R1: 0 | 10*2R2: 2 | 10*1R3: 1 | 20*2R4: 0 | 20*1

R5: R1 | R2R4*R3

R4R2

R3

t0

R5

Final regular expression:(0 | 10*2 | (2 | 10*1)(0 | 20*1)*(1 |20*2))*

Page 8: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Generalized NFAs

• Like NFAs but allow– Parallel edges– Regular Expressions as edge labels

• NFAs already have edges labeled or a

• An edge labeled by A can be followed by reading a string of input chars that is in the language represented by A

• A string x is accepted iff there is a path from start to final state labeled by a regular expression whose language contains x

Autumn 2012 CSE 311 8

Page 9: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Starting from NFA

• Add new start state and final state

• Then eliminate original states one by one, keeping the same language, until it looks like:

• Final regular expression will be A

Autumn 2012 CSE 311 9

λλλ

A

Page 10: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

Only two simplification rules:• Rule 1: For any two states q1 and q2 with

parallel edges (possibly q1=q2), replace

• Rule 2: Eliminate non-start/final state q3 by replacing all

for every pair of states q1, q2 (even if q1=q2)Autumn 2012 CSE 311 10

q1q2

A

Bby A⋃B

q1q2

AB

C AB*Cq1 q3 q2 q1q2by

Page 11: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

11

Automata Theory Summary

• Every DFA is an NFA• Every NFA can be converted into a DFA that

accepts the same language– However there may be an exponential increase in

the number of states• Given a regular expression, we can construct

an NFA that recognizes it• Given an NFA we can construct an regular for

the strings accepted by itAutumn 2012 CSE 311

Page 12: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

What can Finite State Machines do?

• We’ve seen how we can get DFAs to recognize all regular languages

• What about some other languages we can generate with CFGs?– { 0n1n : n≥0 }?– Binary Palindromes?– Strings of Balanced Parentheses?

Autumn 2012 CSE 311 12

Page 13: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

A={0n1n : n≥0} cannot be recognized by any DFA

Consider the infinite set of strings S={, 0, 00, 000, 0000, ...}Claim: No two strings in S can end at the same

state of any DFA for A, so no such DFA can existProof: Suppose nm and 0n and 0m end at the same

state p. Since 0n1n is in A, following 1n after state p

must lead to a final state. But then the DFA would accept 0m1n which is a contradictionAutumn 2012 CSE 311 13

Page 14: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

The set B of binary palindromes cannot be recognized by any DFA

Consider the infinite set of strings S={, 0, 00, 000, 0000, ...}Claim: No two strings in S can end at the same

state of any DFA for B, so no such DFA can existProof: Suppose nm and 0n and 0m end at the same

state p. Since 0n10n is in B, following 10n after state p must lead to a final state.

But then the DFA would accept 0m10n which is a contradictionAutumn 2012 CSE 311 14

Page 15: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

The set P of strings of balanced parentheses cannot be recognized by any DFA

Autumn 2012 CSE 311 15

Page 16: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

The set P of strings {1j | j = n2} cannot be recognized by any DFA

Autumn 2012 CSE 311 16

Suppose 1j and 1k reach the same state p with j < k

1k1k(k-1) must reach an accepting state q1j1k(k-1) must reach the same accepting state q

Thus, j + k(k-1) = k2 – k + j must be a perfect squareIs that possible?

Page 17: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

17

Pattern Matching

• Given – a string s of n characters– a pattern p of m characters– usually m<<n

• Find– all occurrences of the pattern p in the string s

• Obvious algorithm: – try to see if p matches at each of the positions in s

• stop at a failed match and try the next position

Autumn 2012 CSE 311

Page 18: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

18

Pattern p = x x x x x y

String s = x x x x x x x x x y x x x x x x x x x x y x x

Autumn 2012 CSE 311

Page 19: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

19

Pattern p = x y x y y x y x y x x

String s = x y x x y x y x y y x y x y x y y x y x y x x

Autumn 2012 CSE 311

Page 20: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

20

x y x y y x y x y x xString s = x y x x y x y x y y x y x y x y y x y x y x x

Autumn 2012 CSE 311

Page 21: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

21

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 22: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

22

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y x y y x y x y x x

Autumn 2012 CSE 311

Page 23: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

23

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 24: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

24

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx y x y y x y x y x x

Autumn 2012 CSE 311

Page 25: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

25

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 26: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

26

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx y x y y x y x y x x

Autumn 2012 CSE 311

Page 27: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

27

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 28: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

28

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx y x y y x y x y x x

Autumn 2012 CSE 311

Page 29: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

29

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 30: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

30

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y x y x y x x

Autumn 2012 CSE 311

Page 31: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

31

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y

x y x y y x y x y x x

Autumn 2012 CSE 311

Page 32: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

32

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y

xx y x y y x y x y x x

Worst-case time O(mn)

Autumn 2012 CSE 311

Page 33: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

33

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx yx y x y

y x x y x y y x y x y x x x

x y xx

x x y x y y x x y x y y x y x y x

x

Lots of wasted work

Autumn 2012 CSE 311

Page 34: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

34

Better Pattern Matching via Finite Automata

• Build a DFA for the pattern (preprocessing) of size O(m)– Keep track of the ‘longest match currently active’– The DFA will have only m+1 states

• Run the DFA on the string n steps

• Obvious construction method for DFA will be O(m2) but can be done in O(m) time.

• Total O(m+n) time

Autumn 2012 CSE 311

Page 35: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

35

Building a DFA for the pattern

Pattern p=x y x y y x y x y x x

y y y xxx xy x y x

Autumn 2012 CSE 311

Page 36: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

36

Building a DFA for the pattern

Pattern p=x y x y y x y x y x x

y y y xxx xy x y x

Autumn 2012 CSE 311

y

Page 37: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

37

Building a DFA for the pattern

Pattern p=x y x y y x y x y x x

y y y xxx xy x y x

Autumn 2012 CSE 311

y x

y

x

x

Page 38: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

38

Building a DFA for the pattern

Pattern p=x y x y y x y x y x x

y y y xxx xy x y x

Autumn 2012 CSE 311

y x

y x

y

x

y

x x

Page 39: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

39

Building a DFA for the pattern

Pattern p=x y x y y x y x y x x

y y y xxx xy x y x

Autumn 2012 CSE 311

y x

y x

y

x

y

xy

yy

x

x

Page 40: CSE 311 Foundations of Computing I Lecture 27 FSM Limits, Pattern Matching Autumn 2012 CSE 311 1

40

Generalizing

• Can search for arbitrary combinations of patterns– Not just a single pattern– Build NFA for pattern then convert to DFA ‘on the fly’.

• Compare DFA constructed above with subset construction for the obvious NFA.

Autumn 2012 CSE 311