Formal Languages and Chomsky Hierarchy

Upload: saurabh-singh

Post on 07-Jul-2018


  • 8/18/2019 Formal Languages and Chomsky Hierarchy[1]


    SAURABH SINGH

    Formal Languages, Automata and Grammar


    Language

    Language is the ability to acquire and use complex systems of communication, particularly the human ability to do so, and a language is any specific example of such a system.

    The system of words or signs that people use to express thoughts and feelings to each other, or any one of the systems of human language that are used and understood by a particular group of people.

    The scientific study of language is called linguistics.


    Linguistics

    In linguistics, formal languages are used for the scientific study of human language.

    Linguists privilege a generative approach, as they are interested in defining a (finite) set of rules stating the grammar based on which any reasonable sentence in the language can be constructed.

    A grammar does not describe the meaning of the sentences or what can be done with them in whatever context, but only their form.


    Chomsky Hierarchy

    Noam Chomsky (born 1928) is an American linguist, philosopher, cognitive scientist, historian, and activist.

    In Syntactic Structures (1957), Chomsky models knowledge of language using a formal grammar, claiming that formal grammars can explain the ability of a hearer/speaker to produce and interpret an infinite number of sentences with a limited set of grammatical rules and a finite set of terms.

    In Chomsky's theory, the human brain contains a limited set of rules for organizing language, known as Universal Grammar. The basic rules of grammar are hardwired into the brain, and manifest themselves without being taught.


    Chomsky Hierarchy

    Chomsky proposed a hierarchy that partitions formal grammars into classes with increasing expressive power, i.e. each successive class can generate a broader set of formal languages than the one before.

    Interestingly, modelling some aspects of human language requires a more complex formal grammar (as measured by the Chomsky hierarchy) than modelling others.

    Example: while a regular language is powerful enough to model English morphology (symbols, words), it is not powerful enough to model English syntax.


    Linguistics in Computer Science

    In computer science, formal languages are used for the precise definition of programming languages and, therefore, in the development of compilers.

    A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language).

    Computer scientists privilege a recognition approach based on abstract machines (automata) that take a sentence as input and decide whether it belongs to the reference language.


     Automata and Grammar

    Which class of formal languages is recognized by a given type of automaton?

    There is an equivalence between the Chomsky hierarchy and the different kinds of automata. Thus, theorems about formal languages can be dealt with in terms of either grammars or automata.


    Describing formal languages: generative approach

    Generative approach

    A language is the set of strings generated by a grammar.

    Generation process: start from the start symbol, expand with rewrite rules, and stop when a word of the language is generated.

    Pros and cons: the generative approach is appealing to humans. Grammars are formal, informative, compact, finite descriptions for possibly infinite languages, but are clearly inefficient if implemented naively.


    Describing formal languages: recognition approach

    Recognition approach

    A language is the set of strings accepted by an automaton.

    Recognition process: start in the initial state, make transitions to other states guided by the string symbols, and continue until the whole string has been read and an accept/reject state is reached.

    Pros and cons: the recognition approach is appealing to machines. Automata are formal, compact, low-level machines that can be implemented easily and efficiently, but can be hard for humans to understand.


    Formal languages: definition and basic notions

    Formal language

    A formal language is a set of words, that is, finite strings of symbols taken from the alphabet over which the language is defined.

    Alphabet: a finite, non-empty set of symbols.

    Example:
    Σ₁ = {0, 1}
    Σ₂ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
    Σ₃ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}
    Σ₄ = {a, b, c, …, z}

    Notation: a, b, c, . . . denote symbols.


    Formal languages: definition and basic notions

    String (or word) on an alphabet Σ: a finite sequence of symbols in Σ.

    Example: 1010 ∈ Σ₁*, 123 ∈ Σ₂*, hello ∈ Σ₄*

    Notation: ε is the empty string; v, w, x, y, z, . . . denote strings.

    |w| is the length of w (the number of symbols in w).

    Example: |a| = 1, |125| = 3, |ε| = 0


    Formal languages: definition and basic notions

    k-th power of an alphabet Σ: Σ^k = {a₁…a_k | a₁, …, a_k ∈ Σ}

    Example:
    Σ^0 = {ε} for any Σ
    Σ₁^1 = {0, 1}
    Σ₁^2 = {00, 01, 10, 11}

    Kleene closures of an alphabet Σ:
    Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
    Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …
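    The powers of an alphabet can be enumerated directly. The following is a minimal sketch, assuming symbols are single characters; the function names power and kleene_star_up_to are illustrative, and since Σ* is infinite only a finite slice of it is built.

```python
from itertools import product

def power(alphabet, k):
    """Return Σ^k: the set of all strings of length k over the alphabet."""
    return {"".join(symbols) for symbols in product(sorted(alphabet), repeat=k)}

def kleene_star_up_to(alphabet, max_len):
    """Return the finite slice of Σ* holding all strings of length 0..max_len."""
    words = set()
    for k in range(max_len + 1):
        words |= power(alphabet, k)
    return words

sigma1 = {"0", "1"}
print(power(sigma1, 0))          # the set containing only the empty string
print(sorted(power(sigma1, 2)))  # ['00', '01', '10', '11']
```

    Dropping the empty string from the result of kleene_star_up_to gives the corresponding slice of Σ+.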

     


    Formal languages: definition and basic notions

    String operations:

    vw is the concatenation of v and w
    v is a substring of w iff xvy = w for some strings x and y
    v is a prefix of w iff vy = w for some string y
    v is a suffix of w iff xv = w for some string x

    Example: wε = εw = w
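    These string operations map onto built-in Python string methods. A small sketch, assuming strings over single-character symbols and representing ε by the empty Python string; the helper names are illustrative.

```python
EPSILON = ""  # the empty string ε

def is_substring(v, w):
    """True iff w = x + v + y for some strings x and y."""
    return v in w

def is_prefix(v, w):
    """True iff w = v + y for some string y."""
    return w.startswith(v)

def is_suffix(v, w):
    """True iff w = x + v for some string x."""
    return w.endswith(v)

w = "hello"
print(w + EPSILON == EPSILON + w == w)  # True: ε is the identity for concatenation
```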


    Formal languages: definition and basic notions

    Formal language: mathematical definition

    A language over a given alphabet Σ is any subset of Σ*.

    Example:
    English, Chinese, . . .
    C, Pascal, Java, HTML, . . .
    the set of binary numbers whose value is prime: {10, 11, 101, 111, 1011, …}
    ∅ (the empty language)
    {ε}
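    A language can be described by a membership test. The sketch below checks membership in the language of binary primes, assuming numerals are written without leading zeros (an assumption, chosen to match the listed members); the function names are illustrative.

```python
def is_prime(n):
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def in_binary_primes(w):
    """True iff w is a binary numeral (no leading zeros) whose value is prime."""
    if w == "" or any(c not in "01" for c in w) or w[0] != "1":
        return False
    return is_prime(int(w, 2))

# Binary numerals of 0..11 that belong to the language.
members = [format(n, "b") for n in range(12) if in_binary_primes(format(n, "b"))]
print(members)  # ['10', '11', '101', '111', '1011']
```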


    Operations on languages

    Let L₁ and L₂ be languages over the alphabets Σ₁ and Σ₂, respectively. Then:

    L₁ ∪ L₂ = {w | w ∈ L₁ ∨ w ∈ L₂}
    complement of L₁ = {w ∈ Σ₁* | w ∉ L₁}
    L₁L₂ = {w₁w₂ | w₁ ∈ L₁ ∧ w₂ ∈ L₂}
    L₁* = {ε} ∪ L₁ ∪ L₁^2 ∪ …
    L₁ ∩ L₂ = {w | w ∈ L₁ ∧ w ∈ L₂}
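    For finite languages these operations are ordinary set operations. A minimal sketch follows, using Python sets of strings; the complement and the star are infinite in general, so the illustrative helpers complement_up_to and star_up_to only build the part up to a chosen length bound.

```python
from itertools import product

def union(l1, l2):
    return l1 | l2

def intersection(l1, l2):
    return l1 & l2

def concatenation(l1, l2):
    return {w1 + w2 for w1 in l1 for w2 in l2}

def complement_up_to(lang, alphabet, max_len):
    """Strings over the alphabet of length <= max_len that are not in lang."""
    universe = {"".join(p) for k in range(max_len + 1)
                for p in product(sorted(alphabet), repeat=k)}
    return universe - lang

def star_up_to(lang, max_len):
    """Strings of length <= max_len in the Kleene star of lang."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from lang.
        frontier = {u + w for u in frontier for w in lang
                    if len(u + w) <= max_len} - result
        result |= frontier
    return result

L1 = {"a", "ab"}
L2 = {"b", ""}
print(sorted(concatenation(L1, L2)))  # ['a', 'ab', 'abb']
```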

     


    Grammar

    A grammar is a tuple G = (V, T, S, P) where

    V is a finite, non-empty set of symbols called variables (or non-terminals or syntactic categories)

    T is an alphabet of symbols called terminals

    S ∈ V is the start (or initial) symbol of the grammar

    P is a finite set of productions α → β where α ∈ (V ∪ T)+ and β ∈ (V ∪ T)*

    In the example "I eat apple": V = {Sentence, Subject, Verb, Object}, T = {I, You, Eat, Buy, Pen, Apple}, S = Sentence, and P = {Sentence → Subject Verb Object, Subject → I | You, Verb → Eat | Buy, Object → Pen | Apple}.
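    The example grammar is small enough to enumerate its whole language. The sketch below is one possible encoding, assuming every production has a single variable on its left-hand side (true for this example, not for grammars in general); the function name generate is illustrative, and the search only terminates because this language is finite.

```python
from collections import deque

V = {"Sentence", "Subject", "Verb", "Object"}
T = {"I", "You", "Eat", "Buy", "Pen", "Apple"}
S = "Sentence"
P = {
    "Sentence": [["Subject", "Verb", "Object"]],
    "Subject": [["I"], ["You"]],
    "Verb": [["Eat"], ["Buy"]],
    "Object": [["Pen"], ["Apple"]],
}

def generate(start, productions, variables):
    """Expand the leftmost variable until only terminals remain."""
    words = []
    queue = deque([[start]])
    while queue:
        form = queue.popleft()
        idx = next((i for i, sym in enumerate(form) if sym in variables), None)
        if idx is None:
            words.append(" ".join(form))  # no variables left: a word of the language
            continue
        for rhs in productions[form[idx]]:
            queue.append(form[:idx] + rhs + form[idx + 1:])
    return words

sentences = generate(S, P, V)
print(len(sentences))  # 8
```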


    The Chomsky hierarchy: summary

    Level  Language type      Grammar rules           Accepting machines
    3      Regular            X → ε, X → Y, X → aY    NFAs (or DFAs)
    2      Context-free       X → β                   Nondeterministic pushdown automata
    1      Context-sensitive  α → β with |α| ≤ |β|    Nondeterministic linear bounded automata
    0      Unrestricted       α → β                   Turing machines (unrestricted)


    (Type-3) Regular Grammar

    Type-3 grammars generate regular languages. Type-3 grammars must have a single non-terminal on the left-hand side and a right-hand side consisting of a single terminal or a single terminal followed by a single non-terminal.

    The productions must be in the form X → a or X → aY

    where X, Y ∈ N (Non-terminal)

    and a ∈ T (Terminal)

    The rule S → ε is allowed if S does not appear on the right side of any rule.

    Example: S → aB, B → bB, B → ε

    What language does this define? ab*
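    The claim that this grammar defines ab* can be checked on short words. The sketch below enumerates every word the grammar derives up to a length bound and compares it with the regular expression ab*; the encoding of the rules as a dictionary and the name words_up_to are illustrative assumptions.

```python
import re

# S → aB, B → bB, B → ε  (the empty string stands for ε)
RULES = {"S": ["aB"], "B": ["bB", ""]}

def words_up_to(max_len):
    """All words derivable from S that contain at most max_len terminals."""
    found = set()
    stack = ["S"]
    while stack:
        form = stack.pop()
        # In a right-linear grammar the only variable, if any, is the last symbol.
        if form and form[-1] in RULES:
            prefix, var = form[:-1], form[-1]
            for rhs in RULES[var]:
                new_form = prefix + rhs
                terminals = sum(1 for c in new_form if c not in RULES)
                if terminals <= max_len:
                    stack.append(new_form)
        else:
            found.add(form)
    return found

print(sorted(words_up_to(3)))  # ['a', 'ab', 'abb']
```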


    Finite Automata

    A finite automaton is a 5-tuple M = (Q, Σ, δ, q0, F)

    Q is the set of states

    Σ is the alphabet

    δ : Q × Σ → Q is the transition function

    q0 ∈ Q is the start state

    F ⊆ Q is the set of accept states

    L(M) = the language of machine M = set of all strings machine M accepts


    Finite Automata

    M = (Q, Σ, δ, q0, F) where Q = {q0, q1, q2, q3}

    Σ = {0,1}

    δ : Q × Σ → Q transition function

    q0 ∈ Q is start state

    F = {q1, q2} ⊆ Q is the set of accept states

    [Figure: state diagram of M, with transitions labelled 0 and 1]


    Build an automaton that accepts all and only those strings that contain 001

    [Figure: state diagram with states q, q0, q00 and q001, with transitions labelled 0 and 1]
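    One automaton that solves this exercise can be written as a transition table. The sketch below uses the four state names from the diagram; the individual transitions are reconstructed as the usual "longest matched prefix of 001" construction and should be read as an assumption rather than a copy of the figure.

```python
# Each state records how much of the pattern 001 has just been seen.
DELTA = {
    ("q", "0"): "q0",      ("q", "1"): "q",
    ("q0", "0"): "q00",    ("q0", "1"): "q",
    ("q00", "0"): "q00",   ("q00", "1"): "q001",
    ("q001", "0"): "q001", ("q001", "1"): "q001",
}
START = "q"
ACCEPT = {"q001"}

def accepts(w):
    """Run the DFA on w and report whether it ends in an accept state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPT

print(accepts("110010"))  # True
print(accepts("0101"))    # False
```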


    Limits of Regular languages and finite automata

    What types of languages can't FAs accept? In other words, what limits are there on the complexity of regular languages?

    FAs lack memory, so that you can't have one part of a regular language dependent on another part.


    (Type-2) Context-free grammars

    Type-2 grammars generate context-free languages.

    The productions must be in the form A → γ, where A ∈ N (Non-terminal) and γ ∈ (T ∪ N)* (String of terminals and non-terminals).

    The languages generated by these grammars are recognized by a non-deterministic pushdown automaton.

    Example: S → X, X → ab | aXb

    L = {a^n b^n | n ≥ 1}

    Whereas in regular languages, nonterminals were restricted as to where they could appear in the rules, now they can appear anywhere. Hence the term context-free.
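    The recursive rule X → ab | aXb translates directly into a recursive membership test. A minimal sketch, with the illustrative function names derives_X and in_language:

```python
def derives_X(w):
    """True iff X derives w using X → ab | aXb."""
    if w == "ab":                      # X → ab
        return True
    # X → aXb: strip one a from the front and one b from the back.
    return len(w) >= 4 and w[0] == "a" and w[-1] == "b" and derives_X(w[1:-1])

def in_language(w):
    """S → X, so S derives exactly what X derives."""
    return derives_X(w)

print(in_language("aaabbb"))  # True
print(in_language("aab"))     # False
```

    The nesting depth of the recursion plays the role of the counter n, which is the kind of memory a finite automaton does not have.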

     


    Pushdown Automata

    Pushdown automata extend FAs in one very important way: we are now given a stack on which we can store information. This is a standard LIFO stack, where information gets pushed onto the top and popped off the top.

    This means that we can now choose transitions based not only on the input, but also based on what's on the top of the stack.

    We also now have transition actions available to us. We can either push a specific element to the top of the stack, or pop the top element off the stack.


    Pushdown Automaton (PDA)

    Finite control and a single unbounded stack; models a finite program plus one unbounded stack of bounded registers.

    Transitions (input symbol, stack top / replacement): a, W/AW; a, A/AA; b, A/ε; b, A/ε; X, W/ε

    Here W is the symbol at the bottom of the stack and X marks the end of the input.

    L = {a^n b^n : n ≥ 1}

    [Figure: runs of the PDA showing the stack contents after each input symbol]

    Input a a a b b b X: accepting

    Input a a a b b b b X: rejecting

    Input a a a b b X: rejecting
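    The three runs can be replayed with an explicit stack. The sketch below follows the listed transitions, reading W as the bottom-of-stack symbol and X as the end-of-input marker; the control state names push, pop and done are hypothetical, since the state labels of the diagram are not recoverable.

```python
def pda_accepts(symbols):
    """Deterministic run of the PDA on an input that ends with the marker X."""
    stack = ["W"]       # W sits at the bottom of the stack
    state = "push"      # hypothetical state names: push -> pop -> done
    for sym in symbols:
        top = stack[-1] if stack else None
        if state == "push" and sym == "a" and top in ("W", "A"):
            stack.append("A")                 # a, W/AW  and  a, A/AA
        elif state in ("push", "pop") and sym == "b" and top == "A":
            stack.pop()                       # b, A/ε
            state = "pop"
        elif state == "pop" and sym == "X" and top == "W":
            stack.pop()                       # X, W/ε
            state = "done"
        else:
            return False                      # no applicable transition: reject
    return state == "done"

print(pda_accepts("aaabbbX"))   # True
print(pda_accepts("aaabbbbX"))  # False
print(pda_accepts("aaabbX"))    # False
```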


    (Type-1) Context-Sensitive Grammar

    Type-1 grammars generate context-sensitive languages. The productions must be in the form

    α A β → α γ β

    where A ∈ N (Non-terminal) and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals). The strings α and β may be empty, but γ must be non-empty.

    The rule S → ε is allowed if S does not appear on the right side of any rule. The languages generated by these grammars are recognized by a linear bounded automaton.

    Example:
    S → abc | aAbc
    Ab → bA
    Ac → Bbcc
    bB → Bb
    aB → aa | aaA

    L = {a^n b^n c^n | n ≥ 1}

    Alternate Definition: P...
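    Derivations in this grammar can be explored mechanically. The sketch below runs a breadth-first search over sentential forms; because no rule of the example shortens a string, forms longer than the target can be discarded, which keeps the search finite. The rule encoding and the name derivable are illustrative assumptions.

```python
# S → abc | aAbc, Ab → bA, Ac → Bbcc, bB → Bb, aB → aa | aaA
RULES = [
    ("S", "abc"), ("S", "aAbc"),
    ("Ab", "bA"), ("Ac", "Bbcc"),
    ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA"),
]

def derivable(target):
    """Breadth-first search over sentential forms no longer than the target."""
    seen = {"S"}
    frontier = ["S"]
    while frontier:
        next_frontier = []
        for form in frontier:
            if form == target:
                return True
            for lhs, rhs in RULES:
                start = form.find(lhs)
                while start != -1:  # rewrite every occurrence of lhs
                    new = form[:start] + rhs + form[start + len(lhs):]
                    if len(new) <= len(target) and new not in seen:
                        seen.add(new)
                        next_frontier.append(new)
                    start = form.find(lhs, start + 1)
        frontier = next_frontier
    return False

print(derivable("aabbcc"))  # True, via S ⇒ aAbc ⇒ abAc ⇒ abBbcc ⇒ aBbbcc ⇒ aabbcc
print(derivable("aabbc"))   # False
```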


    Turing Machines

    Recall that NFAs are 'essentially memory-less', whilst NPDAs are equipped with memory in the form of a stack.

    To find the right kinds of machines for the top two Chomsky levels, we need to allow more general manipulation of memory.

    A Turing machine essentially consists of a finite-state control unit, equipped with a memory tape, infinite in both directions. Each cell on the tape contains a symbol drawn from a finite alphabet Γ.


    Turing Machines cont.

    At each step, the behaviour of the machine can depend on the current state of the control unit and the tape symbol at the current read position.

    Depending on these things, the machine may then overwrite the current tape symbol with a new symbol, shift the tape left or right by one cell, and jump to a new control state.

    This happens repeatedly until (let's say) the control unit enters some final state.


    Turing Machines cont.

    To use a Turing machine T as an acceptor for a language over Σ, set up the tape with the test string s ∈ Σ* written left-to-right starting at the read position, and with blank symbols everywhere else.

    Then let the machine run (maybe overwriting s), and if it enters the final state, declare that the original string s is accepted.

    The language accepted by T (written L(T)) consists of all strings s that are accepted in this way.

    Theorem: A set L ⊆ Σ* is generated by some unrestricted (Type 0) grammar if and only if L = L(T) for some Turing machine T. So both Type 0 grammars and Turing machines lead to the same class of recursively enumerable languages.


    Turing Machines cont.

    A Turing machine T consists of:

    A set Q of control states
    An initial state i ∈ Q
    A final (accepting) state f ∈ Q
    A tape alphabet Γ
    An input alphabet Σ ⊆ Γ
    A blank symbol ␣ ∈ Γ \ Σ
    A transition function δ : Q × Γ → Q × Γ × {L, R}
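    A machine of this shape can be simulated in a few lines. The sketch below makes several simplifying assumptions that are not part of the definition: the blank is written "_", the transition function is a partial dictionary (a missing entry means halt and reject), L and R move the read position rather than the tape, and a step limit guards against machines that never halt. The example machine EVEN_ONES, which accepts binary strings with an even number of 1s, is hypothetical.

```python
BLANK = "_"  # assumed spelling of the blank symbol

def run_tm(delta, initial, final, s, max_steps=10_000):
    """Return True if the machine reaches the final state on input s."""
    tape = {i: c for i, c in enumerate(s)}  # sparse tape, blank everywhere else
    state, pos = initial, 0
    for _ in range(max_steps):
        if state == final:
            return True
        key = (state, tape.get(pos, BLANK))
        if key not in delta:
            return False                    # no transition: halt and reject
        state, symbol, move = delta[key]
        tape[pos] = symbol                  # overwrite the current cell
        pos += 1 if move == "R" else -1
    raise RuntimeError("step limit reached without halting")

# Hypothetical example machine: even number of 1s over {0, 1}.
EVEN_ONES = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", BLANK): ("f", BLANK, "R"),
}

print(run_tm(EVEN_ONES, "even", "f", "1001"))  # True
print(run_tm(EVEN_ONES, "even", "f", "111"))   # False
```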



    Linear bounded automata

    Suppose we modify our model to allow just a finite tape, initially containing just the test string s with end markers on either side.

    The machine therefore has just a finite amount of memory, determined by the length of the input string. We call this a linear bounded automaton.

    (LBAs are sometimes defined as having tape length bounded by a constant multiple of the length of the input string; this doesn't make any difference in principle.)

    Theorem: A language L ⊆ Σ* is context-sensitive if and only if L = L(T) for some non-deterministic linear bounded automaton T.


    The Chomsky hierarchy: summary

    Level  Language type      Grammar rules           Accepting machines
    3      Regular            X → ε, X → Y, X → aY    NFAs (or DFAs)
    2      Context-free       X → β                   Nondeterministic pushdown automata
    1      Context-sensitive  α → β with |α| ≤ |β|    Nondeterministic linear bounded automata
    0      Unrestricted       α → β                   Turing machines (unrestricted)