cse 311 foundations of computing i lecture 24 fsm limits, pattern matching autumn 2011 cse 3111

35
CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 Autumn 2011 CSE 311 1

Upload: abraham-edwards

Post on 19-Jan-2018

215 views

Category:

Documents


0 download

DESCRIPTION

Last lecture highlights NFAs from Regular Expressions Autumn 2011CSE 3113 (01  1)*

TRANSCRIPT

Page 1: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

CSE 311 Foundations of Computing I

Lecture 24FSM Limits, Pattern Matching

Autumn 2011

Autumn 2011 CSE 311 1

Page 2: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Announcements

• Reading assignments– 7th Edition, Section 13.4– 6th Edition, Section 12.4– 5th Edition, Section 11.4

Autumn 2011 CSE 311 2

Page 3: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Last lecture highlights

• NFAs from Regular Expressions

Autumn 2011 CSE 311 3

(01 1)*0

0

0

1

1

Page 4: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Last lecture highlights

• “Subset construction”: NFA to DFA

Autumn 2011 CSE 311 4

c

a

b0

0,1

1

0NFA

a,b

DFA

0

c

1

b

b,c

1

0

a,b,c

1

0,1

0

0

11

0

Page 5: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

What can Finite State Machines do?

• We’ve seen how we can get DFAs to recognize all regular languages

• What about some other languages we can generate with CFGs?– { 0n1n : n≥0 }?– Binary Palindromes?– Strings of Balanced Parentheses?

Autumn 2011 CSE 311 5

Page 6: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

A={0n1n : n≥0} cannot be recognized by any DFA

Consider the infinite set of strings S={, 0, 00, 000, 0000, ...}Claim: No two strings in S can end at the same

state of any DFA for A, so no such DFA can existProof: Suppose nm and 0n and 0m end at the same state p.

Since 0n1n is in A, following 1n after state p must lead to a final state.

But then the DFA would accept 0m1n which is a contradiction

Autumn 2011 CSE 311 6

Page 7: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

The set B of binary palindromes cannot be recognized by any DFA

Consider the infinite set of strings S={, 0, 00, 000, 0000, ...}Claim: No two strings in S can end at the same

state of any DFA for B, so no such DFA can existProof: Suppose nm and 0n and 0m end at the same state p.

Since 0n10n is in B, following 10n after state p must lead to a final state.

But then the DFA would accept 0m10n which is a contradiction

Autumn 2011 CSE 311 7

Page 8: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

The set P of strings of balanced parentheses cannot be recognized by any DFA

Autumn 2011 CSE 311 8

Page 9: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

9

Pattern Matching• Given

– a string, s, of n characters– a pattern, p, of m characters– usually m<<n

• Find– all occurrences of the pattern p in the string s

• Obvious algorithm: – try to see if p matches at each of the positions in s

• stop at a failed match and try the next position

Page 10: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

10

Pattern p = x y x y y x y x y x x

String s = x y x x y x y x y y x y x y x y y x y x y x x

Page 11: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

11

x y x y y x y x y x xString s = x y x x y x y x y y x y x y x y y x y x y x x

Page 12: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

12

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

x y x y y x y x y x x

Page 13: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

13

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y x y y x y x y x x

Page 14: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

14

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y y x y x y x x

Page 15: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

15

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx y x y y x y x y x x

Page 16: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

16

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x x

Page 17: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

17

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx y x y y x y x y x x

Page 18: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

18

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y x y y x y x y x x

Page 19: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

19

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx y x y y x y x y x x

Page 20: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

20

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

x y x y y x y x y x x

Page 21: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

21

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y x y x y x x

Page 22: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

22

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y

x y x y y x y x y x x

Page 23: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

23

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx y

x y x y yx

x y x y y x y x y x xx

x y xx

xx y x y y

xx y x y y x y x y x x

Worst-case time O(mn)

Page 24: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

24

x y x yString s = x y x x y x y x y y x y x y x y y x y x y x x

xx yx y x y

y x x y x y y x y x y x x x

x y xx

x x y x y y x x y x y y x y x y x

x

Lots of wasted work

Page 25: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

25

Better Pattern Matching via Finite Automata

• Build a DFA for the pattern (preprocessing) of size O(m)– Keep track of the ‘longest match currently active’– The DFA will have only m+1 states

• Run the DFA on the string n steps

• Obvious construction method for DFA will be O(m2) but can be done in O(m) time.

• Total O(m+n) time

Page 26: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

26

Building a DFA for the patternPattern p=x y x y y x y x y x x

y y y xxx xy x y x

Page 27: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

27

Preprocessing the patternPattern p=x y x y y x y x y x x

y y y xxx xy x y x

y

Page 28: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

28

Preprocessing the patternPattern p=x y x y y x y x y x x

y y y xxx xy x y x

yx

y

x

x

Page 29: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

29

Preprocessing the patternPattern p=x y x y y x y x y x x

y y y xxx xy x y x

yx

y

x

x

y

x

y

x

Page 30: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

30

Preprocessing the patternPattern p=x y x y y x y x y x x

y y y xxx xy x y x

yx

y

x

x

y

x

y

x

y

y

x

y

Page 31: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

31

Generalizing

• Can search for arbitrary combinations of patterns– Not just a single pattern– Build NFA for pattern then convert to DFA ‘on the fly’.

• Compare DFA constructed above with subset construction for the obvious NFA.

Page 32: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

A Quick Note...

• On how to convert NFAs and DFAs to equivalent regular expressions...

• We’ve already seen – DFAs and NFAs recognize the same languages– NFAs (and therefore DFAs) recognize any language

given by a regular expression

• This completes the equivalence of DFAs and regular expressions

Autumn 2011 CSE 311 32

Page 33: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Generalized NFAs • Like NFAs but allow

– Parallel edges– Regular Expressions as edge labels

• NFAs already have edges labeled or a• An edge labeled by A can be followed by reading a

string of input chars that is in the language represented by A

• A string x is accepted iff there is a path from start to final state labeled by a regular expression whose language contains x

Autumn 2011 CSE 311 33

Page 34: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Starting from NFA

• Add new start state and final state

• Then eliminate original states one by one, keeping the same language, until it looks like:

• Final regular expression will be AAutumn 2011 CSE 311 34

λλλ

A

Page 35: CSE 311 Foundations of Computing I Lecture 24 FSM Limits, Pattern Matching Autumn 2011 CSE 3111

Only two simplification rules:• Rule 1: For any two states q1 and q2 with parallel

edges (possibly q1=q2), replace

• Rule 2: Eliminate non-start/final state q3 by replacing all

for every pair of states q1, q2 (even if q1=q2)Autumn 2011 CSE 311 35

q1q2

A

Bby A⋃B

q1q2

AB

C AB*Cq1 q3 q2 q1q2by