Unit 9: More Pushdown Automata, Context-free Languages, Pumping Lemma for CFL


Post on 06-Oct-2018


  • 1

    Unit 9

    More Pushdown Automata, Context-free Languages, Pumping Lemma for CFL

    Reading: Sipser, chapter 2.3

  • 2

    Properties of PDAs

    An NFA can only distinguish between |Q| different characterizations.

    A PDA can distinguish between an unlimited number of characterizations.

    A PDA can recognize non-regular languages because the stack can count.

    A PDA can count more than once, but it cannot mix counters: only one counter can be active at any time.

  • 3

    Example: L = {a^i b^i c^j d^j | i,j ≥ 0}

    Construct a PDA to recognize L.

    We will use the empty stack model.

    The basic idea:

    Push to the stack an A for each a, pop an A for each b.

    Push to the stack a C for each c, pop a C for each d.
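The basic idea above can be sketched as a small stack simulation. This is only an illustration, not the slide's formal PDA: the function name `accepts_aibicjdj` and the `phase` variable (which stands in for the PDA's finite control) are assumptions introduced here.

```python
def accepts_aibicjdj(w: str) -> bool:
    """Simulate a one-stack machine for {a^i b^i c^j d^j | i, j >= 0}."""
    stack = []   # holds only A's or only C's at any time
    phase = 0    # index into "abcd" of the block currently being read
    for ch in w:
        k = "abcd".find(ch)
        if k < phase:              # unknown symbol (k == -1) or block out of order
            return False
        phase = k
        if ch == "a":
            stack.append("A")      # push an A for each a
        elif ch == "b":
            if not stack or stack[-1] != "A":
                return False
            stack.pop()            # pop an A for each b
        elif ch == "c":
            if stack and stack[-1] == "A":
                return False       # some a was never matched by a b
            stack.append("C")      # push a C for each c
        else:                      # ch == "d"
            if not stack or stack[-1] != "C":
                return False
            stack.pop()            # pop a C for each d
    return not stack               # empty stack model: accept iff stack is empty


print(accepts_aibicjdj("aabbcd"), accepts_aibicjdj("aabcd"))
```

The stack is reused: it counts the a's first and, once it is empty again, counts the c's. This is what "one active counter at a time" means in practice.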

  • 4

    Properties of PDAs

    We had two ways to describe regular languages:

    Regular expressions (syntactic) and DFA / NFA (computational).

    How about context-free languages?

    CFG (syntactic) and PDA (computational).

  • 5

    CFG=PDA

    Theorem: A language is context-free iff some pushdown automaton recognizes it.

    Proof:

    CFL ⇒ PDA: we show that if L is a CFL then some PDA recognizes it.

    PDA ⇒ CFL: we show that if a PDA recognizes L then L is a CFL.

  • 6

    From CFG to PDA

    Proof idea: Use a PDA to simulate leftmost derivations.

    Leftmost derivation: a derivation of a string is a leftmost derivation if at every step the leftmost remaining variable is the one replaced.

    We use the stack to store the suffix that has not been derived so far.

    Any terminal symbols appearing before the leftmost variable are matched right away.

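  • 7

    Different derivations for the same parse tree

    String: 5 + 3 x 2

    CFG: E → E x E | E + E

    E → 0 | 1 | 2 | … | 9

    Leftmost derivation: E ⇒ E x E ⇒ E + E x E ⇒ 5 + E x E ⇒ 5 + 3 x E ⇒ 5 + 3 x 2

    Rightmost derivation: E ⇒ E x E ⇒ E x 2 ⇒ E + E x 2 ⇒ E + 3 x 2 ⇒ 5 + 3 x 2

    [Figure: the parse tree shared by both derivations, with root E]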

  • 8-12

    Running the PDA on 5 + 3 x 2. Each line shows the unread input and the stack (top of the stack on the left), following the leftmost derivation E ⇒ E x E ⇒ E + E x E ⇒ 5 + E x E ⇒ 5 + 3 x E ⇒ 5 + 3 x 2.

    Starting configuration:

    input: 5 + 3 x 2    stack: E$

    input: 5 + 3 x 2    stack: ExE$      (apply E → E x E)

    input: 5 + 3 x 2    stack: E+ExE$    (apply E → E + E)

    input: 5 + 3 x 2    stack: 5+ExE$    (apply E → 5)

    input: 3 x 2        stack: ExE$      (match 5 and +)

    input: 3 x 2        stack: 3xE$      (apply E → 3)

    input: 2            stack: E$        (match 3 and x)

    input: 2            stack: 2$        (apply E → 2)

    input: (empty)      stack: $         (match 2)

    The string 5 + 3 x 2 is accepted

  • 13

    From CFG to PDA

    Informally:

    1. Place the marker symbol $ and the start variable S on the stack.

    2. Repeat the following steps:

    If the top of the stack is a variable A: choose a rule A → α_1…α_k and substitute A with α_1…α_k.

    If the top of the stack is a terminal a: read the next input symbol and compare it to a. If they don't match, reject (die).

    If the top of the stack is $, go to the accept state.
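The informal procedure can be tried out with a breadth-first search over PDA configurations, which explores all nondeterministic choices of rules. This is a minimal sketch under stated assumptions: the function name `cfg_pda_accepts`, the dictionary encoding of the rules, and the `max_configs` safety limit are all introduced here for illustration. Branches are pruned when the stack holds more terminals than there is unread input, which is safe because every terminal on the stack must be matched against the input.

```python
from collections import deque


def cfg_pda_accepts(rules, start, w, max_configs=100_000):
    """rules maps each variable to a list of right-hand sides (tuples of symbols).

    A configuration is (input position, stack as a tuple, top first).
    The marker $ is implicit: an empty tuple means only $ is left.
    """
    initial = (0, (start,))
    seen = {initial}
    queue = deque([initial])
    while queue:
        pos, stack = queue.popleft()
        if not stack:                       # top of stack is $
            if pos == len(w):
                return True                 # accept: all input consumed
            continue
        top, rest = stack[0], stack[1:]
        if top in rules:                    # variable: try every rule for it
            successors = [(pos, tuple(rhs) + rest) for rhs in rules[top]]
        elif pos < len(w) and w[pos] == top:
            successors = [(pos + 1, rest)]  # terminal matches the next input symbol
        else:
            successors = []                 # mismatch: this branch dies
        for nxt in successors:
            pending_terminals = sum(1 for sym in nxt[1] if sym not in rules)
            if pending_terminals > len(w) - nxt[0] or nxt in seen:
                continue
            if len(seen) >= max_configs:
                raise RuntimeError("search limit reached")
            seen.add(nxt)
            queue.append(nxt)
    return False


# The expression grammar used earlier, with "x" as the multiplication sign.
RULES = {"E": [("E", "x", "E"), ("E", "+", "E")] + [(d,) for d in "0123456789"]}
print(cfg_pda_accepts(RULES, "E", "5+3x2"))
```

The pruning guarantees termination for the grammars used in these slides, but not for every grammar (for example one with a rule such as A → AA together with A → ε), which is why the sketch carries an explicit limit.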

  • 14

    From CFG to PDA

    For a given CFG G=(V,Σ,S,R), we construct a PDA P=(Q,Σ,Γ,δ,q0,F) where:

    Q = {q_start, q_loop, q_accpt}

    Γ = V ∪ Σ ∪ {$}

    q0 = q_start

    F = {q_accpt}

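  • 15

    From CFG to PDA

    We define δ as follows (shorthand notation):

    δ(q_start, ε, ε) = {(q_loop, S$)}

    δ(q_loop, ε, A) = {(q_loop, α_1…α_k) | for each rule A → α_1…α_k in R}

    δ(q_loop, a, a) = {(q_loop, ε)} for each a ∈ Σ

    δ(q_loop, ε, $) = {(q_accpt, ε)}

    [Figure: state diagram. q_start → q_loop on ε,ε→S$; self-loops on q_loop labeled ε,A→α_1…α_k for each rule A → α_1…α_k and a,a→ε for each a ∈ Σ; q_loop → q_accpt on ε,$→ε]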

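  • 16

    Example:

    Construct a PDA for the following CFG G:

    S → aTb | b

    T → Ta | ε

    L(G) = a*b

    [Figure: state diagram. q_start → q_loop on ε,ε→S$; self-loops on q_loop labeled ε,S→aTb; ε,S→b; ε,T→Ta; ε,T→ε; a,a→ε; b,b→ε; q_loop → q_accpt on ε,$→ε]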
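The construction can be written out as a short function that builds the transition relation in the shorthand notation (a whole string may be pushed in one step). This is a sketch, assuming a dictionary encoding chosen here for illustration: keys are (state, input symbol, popped symbol), values are sets of (state, pushed string), and the empty string or empty tuple stands for ε. The name `cfg_to_pda` is hypothetical.

```python
EPS = ""  # stands for ε


def cfg_to_pda(rules, terminals, start):
    """Build the shorthand transition relation of the three-state PDA."""
    delta = {}
    # Push the start variable on top of the marker $.
    delta[("q_start", EPS, EPS)] = {("q_loop", (start, "$"))}
    # One expansion move per grammar rule.
    for variable, bodies in rules.items():
        delta[("q_loop", EPS, variable)] = {("q_loop", tuple(b)) for b in bodies}
    # One matching move per terminal.
    for a in terminals:
        delta[("q_loop", a, a)] = {("q_loop", ())}
    # Pop the marker and accept.
    delta[("q_loop", EPS, "$")] = {("q_accpt", ())}
    return delta


# The example grammar: S -> aTb | b, T -> Ta | ε
G_RULES = {"S": [("a", "T", "b"), ("b",)], "T": [("T", "a"), ()]}
delta = cfg_to_pda(G_RULES, terminals="ab", start="S")
for key in delta:
    print(key, "->", sorted(delta[key]))
```

The entries printed for q_loop correspond one to one to the self-loop labels in the example diagram.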

  • 17

    From PDA to CFG

    First, we simplify the PDA:

    It has a single accept state q_f.

    $ is always popped exactly before accepting.

    Each transition is either a push or a pop, but not both.

  • 18

    From PDA to CFG

    single accept state q_f:

    [Figure: the original accept states are connected to the new accept state q_f by ε,ε→ε transitions]

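  • 19

    From PDA to CFG

    $ is always popped exactly before accepting:

    [Figure: a self-loop labeled {ε,A→ε | A ∈ Γ, A ≠ $} empties the stack, followed by a transition ε,$→ε into the accept state]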

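  • 20

    From PDA to CFG

    Each transition is either a push, or a pop:

    [Figure: a transition that pops a and pushes b is replaced by a pop of a followed by a push of b through a new intermediate state; a transition that neither pushes nor pops is replaced by a push of a stack symbol z followed by a pop of z]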

  • 21

    From PDA to CFG

    For any word w accepted by a PDA P=(Q,Σ,Γ,δ,q0,q_f), the run starts at q0 with an empty stack and ends at q_f with an empty stack.

    Definition: for any two states p,q ∈ Q we define L_{p,q} to be the language of all words w such that if we start at p with an empty stack and run on w, we can end at q with an empty stack.

    We define for L_{p,q} a variable A_{p,q} s.t. L_{p,q} = {w | A_{p,q} ⇒* w}

    Note that L(P) = L_{q0,q_f}

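  • 22

    From PDA to CFG

    Consider a word w ∈ L_{p,q}.

    While running w on P, the stack is empty at p and at q, but what happens in the middle?

    Two possibilities:

    Option 1: The stack is also empty at some point in the middle.

    Option 2: The stack is never empty in the middle.

    [Figure: stack height over time for both options; in option 1 the height returns to zero at an intermediate state r between p and q]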

  • 23

    From PDA to CFG

    Option 1: The stack is also empty in the middle

    If the stack becomes empty at some state r, then the word w ∈ L_{p,q} is a concatenation of a word from L_{p,r} and a word from L_{r,q}; thus L_{p,r} L_{r,q} ⊆ L_{p,q}.

    In the CFG we express this by a rule: A_{p,q} → A_{p,r} A_{r,q}

    [Figure: stack height from p to q touching zero at r; the part from p to r is generated by A_{p,r}, the part from r to q by A_{r,q}]

  • 24

    From PDA to CFG

    Option 2: The stack is never empty in the middle

    The symbol that is pushed at p is the symbol that is popped at q.

    Thus, if at p we read a symbol a and moved to r, while from state s we read a symbol b and moved to q, then a L_{r,s} b ⊆ L_{p,q}, and in the CFG we have A_{p,q} → a A_{r,s} b

    [Figure: stack height from p to q; a is read from p to r, b is read from s to q, and the part from r to s is generated by A_{r,s}]

  • 25

    From PDA to CFG

    Let P=(Q,Σ,Γ,δ,q0,q_f) be a given PDA.

    We construct a CFG G=(V,Σ,S,R) as follows*:

    V = {A_{p,q} | p,q ∈ Q}

    S = A_{q0,q_f}

    R is a set of rules constructed as follows:

    * Proof of correctness and further reading are in the supplementary material on the course web page.

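  • 26

    From PDA to CFG

    Add the following rules to R:

    1. For each p,q,r,s ∈ Q, t ∈ Γ, and a,b ∈ Σ ∪ {ε}, if (r,t) ∈ δ(p,a,ε) and (q,ε) ∈ δ(s,b,t), add a rule A_{p,q} → a A_{r,s} b

    2. For each p,q,r ∈ Q, add a rule A_{p,q} → A_{p,r} A_{r,q}

    3. For each p ∈ Q, add the rule A_{p,p} → ε

    [Figure: rule 1 corresponds to a transition p → r that reads a and pushes t, together with a transition s → q that reads b and pops t; rule 2 to a path p → r → q; rule 3 to the single state p]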

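  • 27

    Example:

    L(P) = {0^n # 1^n | n ≥ 0}

    [Figure: the original PDA. q_s → q_0 on ε,ε→$; self-loop on q_0 labeled 0,ε→A; q_0 → q_1 on #,ε→ε; self-loop on q_1 labeled 1,A→ε; q_1 → q_2 on ε,$→ε]

    [Figure: the simplified PDA. As above, except that the transition q_0 → q_1 is split into q_0 → q_3 on #,ε→z and q_3 → q_1 on ε,z→ε]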

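  • 28

    Example:

    start variable: A_{S,2}

    productions:

    A_{S,S} → A_{S,S} A_{S,S} | A_{S,0} A_{0,S} | A_{S,1} A_{1,S} | A_{S,2} A_{2,S}

    A_{S,1} → A_{S,S} A_{S,1} | A_{S,0} A_{0,1} | A_{S,1} A_{1,1}

    ...

    A_{S,S} → ε,  A_{0,0} → ε,  A_{1,1} → ε,  A_{2,2} → ε,  A_{3,3} → ε

    A_{S,2} → A_{0,1}

    A_{0,1} → 0 A_{0,1} 1

    A_{0,1} → # A_{3,3}

    [Figure: the simplified PDA for 0^n # 1^n, with states q_s, q_0, q_3, q_1, q_2]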
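The three kinds of rules can be generated mechanically. The following is a sketch, assuming a simplified PDA given as two lists: push transitions (p, a, t, r), meaning "in p read a, push t, go to r", and pop transitions (s, b, t, q), meaning "in s read b, pop t, go to q". The function name `pda_to_cfg` and this encoding are assumptions made here; a variable A_{p,q} is encoded as the pair (p, q) and ε as the empty string.

```python
from itertools import product


def pda_to_cfg(states, pushes, pops):
    """Return the rule set R as pairs (head, body)."""
    rules = set()
    # Rule type 1: a push of t at p matched with a pop of t into q.
    for (p, a, t, r), (s, b, t2, q) in product(pushes, pops):
        if t == t2:
            rules.add(((p, q), (a, (r, s), b)))
    # Rule type 2: A_{p,q} -> A_{p,r} A_{r,q} for all states p, r, q.
    for p, r, q in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))
    # Rule type 3: A_{p,p} -> ε.
    for p in states:
        rules.add(((p, p), ()))
    return rules


# The simplified PDA for 0^n # 1^n, with states named s, 0, 1, 2, 3.
STATES = ["s", "0", "1", "2", "3"]
PUSHES = [("s", "", "$", "0"), ("0", "0", "A", "0"), ("0", "#", "z", "3")]
POPS = [("3", "", "z", "1"), ("1", "1", "A", "1"), ("1", "", "$", "2")]
R = pda_to_cfg(STATES, PUSHES, POPS)
print(len(R))
```

Only three of the generated rules come from matching a push with a pop; they are the ones that actually produce terminals: A_{S,2} → A_{0,1}, A_{0,1} → 0 A_{0,1} 1 and A_{0,1} → # A_{3,3}.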

  • 29

    CFG=PDA

    We have shown that a language is context-free iff some pushdown automaton recognizes it.

    In particular, all regular languages can be generated by CFGs and so can be recognized by PDAs.

    The class of languages accepted by nondeterministic PDAs is strictly larger than the class accepted by deterministic PDAs.

  • 30

    The Context-free Languages

    [Figure: nested classes of languages: the regular languages inside the DPDA languages, inside the context-free languages]

  • 31

    Non context-free Languages

    Consider the language L = {a^i b^i c^i | i ≥ 0}.

    When trying to build a pushdown automaton that recognizes L, we can compare the number of a's with the number of b's or with the number of c's, but not with both.

    If we compare the number of a's to the number of b's, then we can't compare the c's with either of them, because at that stage the stack (or counter) is empty.
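The difficulty can be seen in a small experiment. The sketch below is not a proof that L is not context-free; it only shows one natural single-stack strategy failing. The names `naive_stack_attempt` and `in_L` are introduced here for illustration.

```python
def naive_stack_attempt(w: str) -> bool:
    """Push per a, pop per b; nothing is left to count the c's against."""
    stack = []
    phase = 0                       # index into "abc" of the current block
    for ch in w:
        k = "abc".find(ch)
        if k < phase:               # unknown symbol or block out of order
            return False
        phase = k
        if ch == "a":
            stack.append("A")
        elif ch == "b":
            if not stack:
                return False
            stack.pop()
        else:                       # ch == "c"
            if stack:
                return False        # some a was not matched by a b
            # The stack is empty here, so the number of c's goes unchecked.
    return not stack


def in_L(w: str) -> bool:
    """Direct membership test for {a^i b^i c^i | i >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n


print(naive_stack_attempt("aabbc"), in_L("aabbc"))
```

The attempt accepts every word a^i b^i c^k, including words with k ≠ i that are not in L.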

  • 32

    Non context-free Languages

    So some languages seem not to be context-free.

    The question is: which ones?

    This can be determined using the pumping lemma for context-free languages.

  • 33

    The Pumping Lemma - background

    Let L be a CFL and let G be a simple grammar (no unit or ε rules) generating it.

    Let w ∈ L be a long enough word (we will say later how long).

    The parse tree of w contains a long path from S to some leaf (terminal).

    On this long path some variable R must repeat (remember, w is long).

  • 34

    The Pumping Lemma - background

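    Divide w into uvxyz according to the parse tree, as in the figure.

    Each occurrence of R has a subtree under it.

    [Figure: parse tree with root S and two occurrences of R on one path; the leaves are divided into u, v, x, y, z, where the upper R spans vxy and the lower R spans x]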

  • 35

    The Pumping Lemma - background

    The upper occurrence of R has a larger subtree and generates vxy.

    The lower occurrence of R has a smaller subtree and generates only x.

    Both subtrees are generated by the same variable R.
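A concrete instance of this decomposition may help. The sketch below assumes the hypothetical grammar S → 0S1 | # for the language {0^n # 1^n | n ≥ 0} (this grammar is not from the slides) and the word w = 00#11, where the repeated variable R is S itself: the upper occurrence generates vxy = 0#1 and the lower one generates x = #. The helper names `pump` and `in_language` are introduced here.

```python
def in_language(w: str) -> bool:
    """Membership test for {0^n # 1^n | n >= 0}."""
    n = w.find("#")
    return n >= 0 and w == "0" * n + "#" + "1" * n


def pump(u: str, v: str, x: str, y: str, z: str, i: int) -> str:
    """Build u v^i x y^i z, i.e. use the upper subtree i times."""
    return u + v * i + x + y * i + z


# Decomposition of w = 00#11 read off the parse tree.
u, v, x, y, z = "0", "0", "#", "1", "1"
words = [pump(u, v, x, y, z, i) for i in range(4)]
print(words)
print(all(in_language(word) for word in words))
```

Taking i = 0 corresponds to replacing the upper subtree by the lower one, and i = 2 to replacing the lower subtree by a copy of the upper one.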

    That means if we substitute one for the other we

    will still obtain v
