# non-regular languages and the pumping lemma 2015-10-02 use the pumping lemma for regular...

Post on 04-Jun-2020

Non-Regular Languages and The Pumping Lemma

Foundations of Computer Science Theory

Regular Languages

• For the regular languages, we have seen that there is a “circle of conversions” from one representation to another:

[Figure: circle of conversions among RE, DFA, NFA, and ε-NFA]

Properties of Language Classes

• A language class is a set of languages
– Example: the regular languages

• Language classes have two important kinds of properties:
1. Closure properties
2. Decision properties

Closure Properties

• Recall that a closure property of a language class says that given any languages in the class, an operation (e.g., union) produces another language in the same class

• The regular languages are closed under union, concatenation, the Kleene star, intersection, difference, complement, and reversal

Decision Properties

• A decision property for a class of languages is an algorithm that takes a formal description of a language (e.g., an NFA or a regular expression) and determines whether or not some property holds

• For example, given a specific language L, we could ask, “Is the language L empty?” or “Is the language L finite?”

Why Decision Properties?

• If we think of a language as representing a certain protocol for processing data (i.e., a way of solving computational problems), then decision properties can tell us a lot about the behavior of the protocol
– For example, “Is the language finite?” could correspond to “Is there a way to solve the problem in a finite number of steps?”
– “Is the language empty?” could correspond to “Is there any way to solve the problem?”

The Emptiness Problem

• Our first decision property for regular languages is the question “Given a regular language, does the language contain any string at all?”

• Algorithm:
– Create an NFA for the language
– Compute the set of states reachable from the start state
– If at least one final state is reachable, then the answer is “yes”, otherwise the answer is “no”
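The reachability step above can be sketched in Python as a breadth-first search. This is a minimal illustration, not the course's implementation: the dictionary encoding of the transition function and the small example NFA (states A, B, C, D) are assumptions made up for the example.

```python
from collections import deque

def reachable_states(start, delta):
    """Return every state reachable from `start` by following transitions."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        # delta maps (state, symbol) -> set of next states; the symbol is irrelevant here
        for (src, _symbol), targets in delta.items():
            if src == state:
                for nxt in targets:
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return seen

def is_empty(start, finals, delta):
    """The language is empty exactly when no final state is reachable."""
    return not (reachable_states(start, delta) & set(finals))

# Hypothetical NFA: A --0--> B --0--> C, plus a state D that nothing leads to.
delta = {("A", "0"): {"B"}, ("B", "0"): {"C"}, ("D", "1"): {"C"}}
print(is_empty("A", {"C"}, delta))  # False: C is reachable, so some string is accepted
print(is_empty("A", {"D"}, delta))  # True: D is unreachable from A
```

Note that the function answers “Is the language empty?”, so it returns the negation of the “yes”/“no” answer to “does the language contain any string at all?”.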

The Membership Problem

• The membership problem asks, “Is string w in regular language L?”

• Algorithm:
– Create an NFA (or DFA) for the language
– Simulate the action of the NFA on the sequence of input symbols forming w
– If the simulation can end in a final state, the answer is “yes”, otherwise the answer is “no”

[Figure: DFA with start state A and states B and C, transitions labeled 0 and 1]

Here’s a DFA for all strings without consecutive 1’s. Test membership on an input string 01011. Not accepted.
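The membership test on 01011 can be traced with a short simulation. The transition table here is an assumption: the diagram did not survive in this transcript, so the table is reconstructed from the caption as the usual three-state DFA for "no consecutive 1's" (A = start, B = just read a single 1, C = dead state after 11).

```python
# Assumed transition table, reconstructed from the caption (not from the lost diagram).
DELTA = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}
START = "A"
FINALS = {"A", "B"}  # every state except the dead state C

def dfa_accepts(w):
    """Run the DFA on w and report whether it halts in a final state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINALS

print(dfa_accepts("01011"))  # False: the run is A, A, B, A, B, C
print(dfa_accepts("01010"))  # True: the run is A, A, B, A, B, A
```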

The Finiteness Problem

• The finiteness problem asks, “Is a given regular language finite?”

• Algorithm:
– Create an NFA for the language
– If the NFA has n states, and the NFA accepts only strings of length strictly less than n, then the language is finite
– If a given regular language is not finite then it is infinite

[Figure: NFA with start state A and states B and C, transitions labeled 0]

• If a regular language is infinite, it means that repetition is allowed when generating strings

• To define an infinite language we could use either:
– The Kleene star (such as in a regular expression), or
– Loops on states (such as in an NFA)

• Recall that an NFA for a regular language (finite or infinite) always has a finite number of states

• Algorithm:
– Construct an NFA for the language
– Test the NFA on strings that would force the NFA to go through a loop if one exists

The Infiniteness Problem

• If an n-state NFA accepts a string of length n or longer, then there must be a state that appears at least twice on the path from the start state to a final state
– This means that there must be a loop in the NFA, because there are at least n+1 states visited along this particular path

The Infiniteness Problem

Here’s an NFA for strings of consecutive 0’s that have at least two 0’s. It has 3 states. String 000, with length 3, is accepted. To accept this string, the NFA visits 3 + 1 = 4 states (A, B, B, C). State B appears twice (loop!). Notice that string 00 is also accepted, as is string 00000…

[Figure: NFA with start state A and states B and C, all transitions labeled 0]
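To see the loop at work, here is a small subset-style simulation of this NFA. The transitions (A to B on 0, B to B on 0, B to C on 0, with C final) are an assumption inferred from the description of the path A, B, B, C; the original diagram is not available in this transcript.

```python
# Assumed transitions for the 3-state NFA described above.
NFA_DELTA = {
    ("A", "0"): {"B"},
    ("B", "0"): {"B", "C"},  # the loop on B, plus the exit to the final state C
}

def nfa_accepts(w, start="A", finals=frozenset({"C"})):
    """Track the set of states the NFA could be in after each symbol."""
    current = {start}
    for symbol in w:
        current = {nxt for q in current for nxt in NFA_DELTA.get((q, symbol), set())}
    return bool(current & finals)

for w in ["", "0", "00", "000", "00000"]:
    print(repr(w), nfa_accepts(w))
```

Every string of two or more 0's is accepted because the loop on B can absorb any number of extra 0's.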

Let w = xyz be a string accepted by an NFA, with substrings x, y, and z, where y ≠ ε (i.e., |y| ≥ 1) and y takes the NFA around a loop from some state q back to q.

[Figure: NFA path with segments labeled x, y, and z through a state q]

Then $xy^iz$ is in the language for all i ≥ 0.

This statement implies that if we can find such a w (where y is not ε), then there are an infinite number of strings in L (i.e., L is infinite).

The Infiniteness Problem

Theorem: Let M = (Q, Σ, δ, s, F) be any NFA. If M accepts any string of length |Q| or greater, then the regular language recognized by M is infinite.

Proof: M starts in the start state, and each time M reads an input character, it moves to a state (possibly one it has already visited). So, in processing a string of length n, M visits a total of n + 1 states, counting repeats. If n + 1 > |Q|, then, by the pigeonhole principle, some state must get more than one visit. So, if n ≥ |Q|, then M must visit at least one state more than once on its accepting path. This implies that the accepting path contains a loop, which can be traversed any number of times, and each number of traversals yields a different accepted string. Hence the language is infinite.
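The theorem suggests a direct test, sketched here under stated assumptions. The NFA has no ε-transitions and is encoded as a dictionary (an encoding chosen for this example). The sketch also relies on a standard refinement that the slides do not state: if an n-state NFA accepts any string of length n or more, it accepts one of length between n and 2n − 1, so only those lengths need checking.

```python
def step(states, delta, alphabet):
    """States reachable from `states` by reading exactly one more symbol."""
    return {nxt for q in states for a in alphabet for nxt in delta.get((q, a), set())}

def is_infinite(states, alphabet, delta, start, finals):
    """True if the NFA accepts some string whose length is in [n, 2n - 1]."""
    n = len(states)
    current = {start}  # states reachable by strings of length 0
    for length in range(2 * n):
        # Here `current` holds the states reachable by strings of exactly `length` symbols.
        if length >= n and current & finals:
            return True
        current = step(current, delta, alphabet)
    return False

# Assumed example NFAs over {0}: one with a loop on B, one without.
with_loop = {("A", "0"): {"B"}, ("B", "0"): {"B", "C"}}
no_loop = {("A", "0"): {"B"}, ("B", "0"): {"C"}}
print(is_infinite({"A", "B", "C"}, {"0"}, with_loop, "A", {"C"}))  # True
print(is_infinite({"A", "B", "C"}, {"0"}, no_loop, "A", {"C"}))    # False
```

Tracking sets of states per length avoids enumerating individual strings, so the check takes a number of steps polynomial in the size of the NFA.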

The Infiniteness Problem

Theorem: There is a countably infinite number of regular languages.

Proof: The number of regular languages is bounded above by the number of possible finite automata. Given an alphabet, we could enumerate all possible NFAs (start with those with one state, then those with two states, then three states, etc.). Thus, the number of NFAs is countably infinite. Since every regular language is recognized by at least one NFA, there are no more regular languages than there are NFAs. And since there are infinitely many regular languages (for example, {$a^n$} is regular for each n ≥ 0), the regular languages must also be countably infinite.

How Many Languages are Regular?

Theorem: There is an uncountably infinite number of non-regular languages.

Proof: A language is a set of strings over a non-empty finite alphabet Σ. Thus, the set of all languages is the set of all sets of strings (the power set of Σ*, the set of all strings). Σ* is countably infinite, and we have already proven (using the diagonalization technique) that the power set of any countably infinite set is uncountably infinite. Since only countably many of these languages are regular, there is an uncountably infinite number of languages that are not regular.

So there must be many more non-regular languages than there are regular ones.

How Many Languages are Not Regular?

How Many Languages Are There?

• Every finite language is regular

• Some infinite languages are regular:
− a*b*
− {w ∈ {a, b}* : every a is immediately followed by b}

• Some infinite languages are not regular:
− {w ∈ {a, b}* : w = $a^nb^n$, n ≥ 0}
− {w ∈ {a, b}* : every a has a matching b somewhere, and the number of b’s is at least as great as the number of a’s}

Is a Language Regular?

Showing that a Language is Not Regular

• Recall that every regular language can be recognized by some finite automaton

• Recall also that finite automata can only use a finite amount of memory to record essential properties of the language

Question: What is the longest string that a 5-state NFA can accept without going through any loops?

Showing that a Language is Not Regular

Question: If an NFA with n states accepts any string of length ≥ n, how many strings does it accept?

For example, let L = bab*ab

w = babab is a string in this language, with x = ba, y = b, and z = ab.

Therefore, every string in xy*z must also be in L.

So L includes: baab, babab, babbab, babbbbbbbbbbab
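This example is easy to check mechanically. The sketch below uses Python's standard `re` module to confirm that pumping y = b any number of times keeps the string in bab*ab; the helper name `pumped` is made up for the illustration.

```python
import re

L = re.compile(r"bab*ab")    # the language from the example
x, y, z = "ba", "b", "ab"    # the decomposition w = xyz of w = babab

def pumped(i):
    """Build the string x y^i z."""
    return x + y * i + z

for i in range(5):
    w = pumped(i)
    print(i, w, bool(L.fullmatch(w)))
```

With i = 0 the loop is skipped entirely (baab), and with i = 1 we get back the original string babab.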

Showing that a Language is Not Regular

• To show that a language is not regular, we use the pumping lemma for regular languages
– In an NFA, “long” strings require that some states must be visited more than once (i.e., the NFA must contain at least one loop)

• Long strings can be “pumped” and still be accepted
– If a language contains at least one long string that cannot be pumped, then the language is not regular

Showing that a Language is Not Regular

For every regular language L, there is an integer k, such that for every string w in L of length ≥ k, we can write w = xyz such that:

1. |xy| ≤ k
2. |y| > 0
3. For all i ≥ 0, $xy^iz$ is in L

Here k can be taken to be the number of states in an NFA for L, and y corresponds to the first cycle in the NFA.
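For a concrete automaton, the decomposition w = xyz can be found by running the machine and cutting w at the first repeated state. The sketch below does this for a DFA; the function name `pump_split` and the three-state DFA for "no consecutive 1's" (with its assumed transition table) are illustrative choices, not part of the lemma.

```python
# Assumed DFA for strings without consecutive 1's (A start, C dead state).
DELTA = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}
FINALS = {"A", "B"}

def accepts(w, start="A"):
    state = start
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINALS

def pump_split(w, delta, start):
    """Split w = xyz at the first repeated state of the DFA run on w."""
    seen = {start: 0}  # state -> position at which it was first visited
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in seen:
            i = seen[state]
            return w[:i], w[i:pos], w[pos:]  # y labels the first cycle
        seen[state] = pos
    return None  # no state repeats, so w is shorter than the number of states

x, y, z = pump_split("1010", DELTA, "A")
print((x, y, z))  # ('', '10', '10')
print([accepts(x + y * i + z) for i in range(4)])  # [True, True, True, True]
```

Because the first repeat must occur within the first k symbols of a k-state machine, the split automatically satisfies |xy| ≤ k and |y| > 0.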

The Pumping Lemma

• Recall our earlier claim that {$a^nb^n$ : n > 0} is not a regular language

• Proof by contradiction:
– Suppose it is regular; then the pumping lemma gives an associated k
– Pick a string in L of length at least k: let’s pick w = $a^kb^k$
– Write w = xyz with |xy| ≤ k and |y| > 0, as the lemma guarantees
– Since |xy| ≤ k, both x and y must consist of only a’s
– Pump y up by choosing i = 2

• Then xyyz should be in L, but it is not because this string has more a’s than b’s

• For example, if k = 3, w = aaabbb, y = a, and i = 2, then aaaabbb should also be in L, but it is not because it has more a’s than b’s
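The argument can be sanity-checked by brute force for small k. The sketch below tries every split of w = a^k b^k allowed by the lemma and looks for one whose pumped string xyyz stays in L. A finite search like this is only an illustration and does not replace the proof; the function names are made up for the example.

```python
def in_L(s):
    """Membership in {a^n b^n : n > 0}."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n

def surviving_splits(k):
    """Splits w = xyz of a^k b^k with |xy| <= k and |y| > 0 for which xyyz is still in L."""
    w = "a" * k + "b" * k
    survivors = []
    for xy_len in range(1, k + 1):      # enforce |xy| <= k
        for x_len in range(xy_len):     # enforce |y| = xy_len - x_len > 0
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if in_L(x + y * 2 + z):
                survivors.append((x, y, z))
    return survivors

for k in range(1, 8):
    print(k, surviving_splits(k))  # every list is empty: no split survives pumping
```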

The Pumping Lemma

• What if we made a different choice for w?
– We can still prove that L = {$a^nb^n$ : n >
