pushdown automata - cs.rit.edurlc/courses/theory/classnotes/pushdownautomata.pdf · pushdown...

Pushdown Automata

Foundations of Computer Science Theory

Pushdown Automata

• The pushdown automaton (PDA) is an automaton equivalent to the context-free grammar in language-defining power

• However, only the non-deterministic PDA defines all of the context-free languages

• The deterministic version models parsers – Most programming languages have deterministic

PDAs

Pushdown Automata

• We can think of a PDA as an ε-NFA with the added power of being able to manipulate a stack

• Its moves are determined by: 1. The current state (of its “NFA”) 2. The current input symbol (or ε) and 3. The current symbol on top of its stack

Example of a PDA

q

0 1 1 1

X Y Z

State

Stack ←Top of Stack

Input

• Being nondeterministic, the PDA has a choice of next moves

• For each choice, the PDA could do any of the following: 1. Stay in same state 2. Change state 3. Replace the top symbol on the stack by a sequence

of zero or more symbols • Zero symbols = “pop” the stack • More than zero symbols = sequence of “pushes”

Changing States

PDA Formalism

• Formally, a PDA is described as: 1. A finite set of states (Q) 2. An input alphabet (Σ) 3. A stack alphabet (Γ) 4. A transition function (δ) 5. A start state (s, in Q) 6. A start symbol (S, in Γ) 7. A set of final states (F ⊆ Q)

Conventions

• a, b, … are input symbols – Sometimes we allow ε as a possible value

• …, X, Y, Z are stack symbols – If we write the stack contents using a string of individual

symbols, then we write the symbols so the top of stack is at the left end

• …, w, x, y, z are strings of input symbols • α, β,… are strings of stack symbols

The Transition Function

• The transition function takes three arguments: 1. A state in Q 2. An input, which is either a symbol in Σ or ε 3. A stack symbol in Γ, the symbol at the top of the stack

• δ(q, a, X) is a set of zero or more actions of the form (p, α) – Where q and p are states, a is an input character (can

be ε), X is the symbol at the top of the stack, and α is a string of stack symbols that will replace the symbol X

Actions of the PDA • If δ(q, a, X) contains (p, α) among its actions, then

the PDA could (if this action is chosen): 1. Change to state p 2. Remove a from the front of the input

• If a is ε then the remaining input does not change 3. Replace X on the top of the stack by α

• Although the PDA may have a choice of several different (p, α) pairs, it has to pick one pair and then do all three things associated with that pair – It can’t pick a next state from one pair, and a stack string

from another

Designing a PDA

• Example: Design a PDA to accept {0n1n : n ≥ 1} • The states:

– s = start state (stay in state s if we have seen only 0’s so far)

– p = another state (we’ve seen at least one 1 and may now proceed only if the inputs are also 1)

– f = final state (accept)

Designing a PDA

• The stack symbols: – S = start symbol (also marks the bottom of the

stack, so we know when we have counted the same number of 1’s as 0’s)

– X = marker (used to count the number of 0’s seen on the input)

• Push one X onto the stack for each 0 read on the input • Pop one X from the stack for each 1 read on the input • When the bottom-marker S again becomes the top

stack symbol, we know we have seen exactly as many 1’s as there were 0’s, so we accept

Designing a PDA

• The transitions: – δ(s, 0, S) = {(s, XS)} (X is now at the top of the stack) – δ(s, 0, X) = {(s, XX)} (Replace X on top of stack with

XX for each 0 read from the input) – δ(s, 1, X) = {(p, ε)} (When we see the first 1, go to

state p and pop one X, i.e. replace X with ε) – δ(p, 1, X) = {(p, ε)} (Pop one X for every 1 read) – δ(p, ε, S) = {(f, S)} (End of input - accept if S is at the

top of the stack)

Actions of the Example PDA

s

0 0 0 1 1 1

S


s

0 0 1 1 1

X S


s

0 1 1 1

X X S


s

1 1 1

X X X S


p

1 1

X X S


p

1

X S


p

S


f

S

Instantaneous Descriptions

• We can formalize this idea by using an instantaneous description (ID)

• An ID is a triple (q, w, α), where: 1. q is the current state 2. w is the remaining input 3. α is the stack contents (note: the top of the

stack is shown as the leftmost symbol of α)

The “Goes-To” Relation

• If δ(q, a, X) = (p, β) then we can write (q, aw, Xα) ⊦ (p, w, βα) for input string w and stack contents α – The symbol ⊦ is read as “goes to on one move” – This is similar to ⇒ in a grammar derivation

• Extend ⊦ to ⊦* (read as “goes to on zero or more moves”) – Basis: A ⊦* A – Induction: If A ⊦* B and B ⊦ C, then A ⊦* C

Example of Goes-To

• Using the previous example PDA, we can describe the sequence of moves by:

(s, 000111, S) ⊦ (s, 00111, XS) ⊦ (s, 0111, XXS) ⊦ (s, 111, XXXS) ⊦ (p, 11, XXS) ⊦ (p, 1, XS) ⊦ (p, ε, S) ⊦ (f, ε, S)

• Thus, (s, 000111, S) ⊦* (f, ε, S) • What would happen on input 0001111?

s

0 0 0 1 1 1

S

(s, 0001111, S) ⊦ (s, 001111, XS) ⊦ (s, 01111, XXS) ⊦ (s, 1111, XXXS) ⊦ (p, 111, XXS) ⊦ (p, 11, XS) ⊦ (p, 1, S) ⊦ (f, 1, S)

• The last move, where the state changes from p to f, is legal because a PDA can use an epsilon as input even if there is input remaining

• Since state f has no transitions, the sequence cannot be extended and the last 1 can never be consumed

• Thus 0001111 is not accepted, because the input is not completely consumed

Example of Goes-To

s

0 0 0 1 1 1 1

S

Language of a PDA

• One way to define the language of a PDA is by final state: – The language recognized by a PDA is the set of

strings w such that (s, w, S) ⊦* (f, ε, α) for final state f and any α

Language of a PDA

• Another way to define the language of a PDA is by empty stack: – The language recognized by a PDA is the set of

strings w such that (s, w, S) ⊦* (q, ε, ε) for any state q

– This is sometimes referred to as null stack

• A PDA may require both final state and empty stack to define the language

A PDA for AnBn = {anbn: n ≥ 0}

M = (Q, Σ, Γ, δ, s, F), where: Q = {s, f } /* the states Σ = {a, b} /* the input alphabet Γ = {a} /* the stack alphabet F = {s, f } /* the accepting states δ contains three transition rules:

(s, a, є), (s, a) /* (start-state, char, top-of-stack), (go-to-state, new-top-of-stack) (s, b, a), (f, є) (f, b, a), (f, є)

input char/top-of-stack/new-top-of-stack

This PDA accepts by both final state and empty stack.

A PDA for {anb2n: n ≥ 0}

M = (Q, Σ, Γ, δ, s, F), where: Q = {s} /* the states Σ = {(, )} /* the input alphabet Γ = {(} /* the stack alphabet F = {s} /* the accepting state δ contains two transition rules:

(s, (, є), (s, () /* (start-state, char, top-of-stack), (go-to-state, new-top-of-stack) (s, ), ( ), (s, є) **Important: є does not mean that the stack is empty, it just means

that it does not matter what is on the top of the stack

A PDA for Balanced Parentheses

M = (Q, Σ, Γ, δ, s, F), where: Q = {s, f} /* the states Σ = {a, b, c} /* the input alphabet Γ = {a, b} /* the stack alphabet F = {f} /* the accepting states δ contains: ((s, a, є), (s, a)) ((s, b, є), (s, b)) ((s, c, є), (f, є)) ((f, a, a), (f, є)) ((f, b, b), (f, є))

A PDA for {wcwR: w ∈ {a, b}*}

S → є S → aSa S → bSb

A PDA for PalEven ={wwR: w ∈ {a, b}*}

Grammar to generate palindromes of even length:

Note: this PDA is necessarily non-deterministic because it does not know when to transition from state s to state f.

A PDA for {w ∈ {a, b}* : #a(w) = #b(w)}

Non-deterministic PDA to recognize equal numbers of a’s and b’s.

Start with the PDA for n = m:

a/є/a

b/a/є

b/a/є

1 2

A PDA for L = {ambn : m ≠ n; m, n > 0}

• Case 1: If both the stack and the input are empty (m = n), halt and reject • Case 2: If the input is empty but the stack is not (m > n) (accept) • Case 3: If the stack is empty but the input is not (m < n) (accept)

Non-deterministic PDA to recognize unequal numbers of a’s and b’s:

Then consider n ≠ m:

a/є/a

b/a/є

b/a/є

$/a/є

є/a/є

2 1 3

Reducing Non-Determinism

a/є/a

b/a/є

b/a/є

1 2

Case 2: If the input is empty but the stack is not (m > n) (accept):

Use $ to indicate the end of the input stream.

Case 1: If both the stack and the input are empty, halt and reject:

Case 3: If the stack is empty but the input is not (m < n) (accept):


a/є/a

b/a/є

b/a/є

2 1 4 b/#/є

b/є/є

Use # to indicate an empty stack.


Putting all 3 cases together:

Eliminating Non-Determinism

• To be completely deterministic, there must be at most one choice of move for any state q, input symbol a, and stack symbol X

• In addition, there must not be a choice between using input ε or a real character

PDAs and CFLs

• Deterministic PDAs can recognize all of the regular languages – A PDA is a DFA when the PDA ignores its stack

• Deterministic PDAs can not recognize all of the context-free languages – We need non-deterministic PDAs to do that

PDAs and CFLs

• Every context-free language has a non-deterministic PDA (n-PDA) that recognizes it

• Conversely, the languages recognized by non-deterministic PDAs are context-free

Theorem: The class of languages recognized by n-PDAs is exactly the class of context-free languages

Lemma: Every context-free language is recognized by

some n-PDA • The proof is by construction • Two approaches: top-down parsing or bottom-up

parsing • We won’t show the proof here, but the main idea is

to let the stack keep track of the productions

PDAs and CFLs

Consider AnBnCn = {anbncn: n ≥ 0}. Now consider L = ¬ (AnBnCn). L is the union of two

languages: 1. L1 = {w ∈ {a, b, c}* : the letters are out of order}, plus 2. L2 = {aibjck: i, j, k ≥ 0 and (i ≠ j or j ≠ k)} (in other words, strings

with an unequal numbers of a’s, b’s, and c’s, where the a’s come before the b’s and the b’s come before the c’s)

A simple DFA can recognize L1. A PDA for L2 can be built similar to the one we created for accepting strings with an unequal number of a’s and b’s. Thus, ¬ (AnBnCn)is context-free, whereas AnBnCn is not context-free (as we will prove).

The Power of Non-Determinism

Are Context-Free Languages Closed Under Complement?

• ¬AnBnCn is context-free

• If the context-free languages were closed under complement, then ¬¬AnBnCn = AnBnCn would also be context-free

• But we can prove that it is not! – (Stay tuned)

pushdown automata - cs.rit.edurlc/courses/theory/classnotes/pushdownautomata.pdf · pushdown...

Documents