Courtesy Costas Buch - RPI: Simplifications of Context-Free Grammars

Post on 19-Dec-2015

TRANSCRIPT

  • Slide 1
  • Courtesy Costas Buch - RPI: Simplifications of Context-Free Grammars
  • Slide 2
  • A Substitution Rule: substitute, yielding an equivalent grammar
  • Slide 3
  • A Substitution Rule: substitute, yielding an equivalent grammar
  • Slide 4
  • In general: substitute, yielding an equivalent grammar
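
The grammars on these slides did not survive transcription, so the following is only a minimal sketch of the substitution rule under stated assumptions. A grammar is represented as a dict from each variable to a set of right-hand sides (tuples of symbols); the function name `substitute` and the example grammar are illustrative, not taken from the slides. The sketch replaces every occurrence of the chosen variable at once, whereas the slides appear to substitute one production at a time.

```python
from itertools import product


def substitute(grammar, target):
    """Replace each occurrence of `target` in the other variables' bodies
    by every right-hand side of `target`. The productions of `target`
    itself are kept, so the result generates the same language."""
    result = {}
    for var, bodies in grammar.items():
        if var == target:
            result[var] = set(bodies)
            continue
        new_bodies = set()
        for body in bodies:
            # Each symbol offers either itself or all bodies of `target`.
            choices = [grammar[target] if s == target else [(s,)] for s in body]
            for pick in product(*choices):
                new_bodies.add(tuple(sym for part in pick for sym in part))
        result[var] = new_bodies
    return result


# Hypothetical example grammar: S -> aB, A -> aaA | abBc, B -> aA | b
example = {
    "S": {("a", "B")},
    "A": {("a", "a", "A"), ("a", "b", "B", "c")},
    "B": {("a", "A"), ("b",)},
}
substituted = substitute(example, "B")
```

After the call, `B` no longer appears in the bodies of `S` or `A`; its own productions remain and may later be removed as useless.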
  • Slide 5
  • Nullable Variables: a nullable variable is a variable that can derive the empty string
  • Slide 6
  • Removing Nullable Variables. Example grammar, with the nullable variable identified
  • Slide 7
  • Substitute: final grammar
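
As a rough illustration of the removal step, the sketch below first finds the nullable variables and then adds, for every production, each variant obtained by dropping nullable symbols. The dict-of-sets grammar representation (the empty tuple stands for the empty string), the function names, and the example grammar are assumptions made for illustration; the slide's own example grammar is not preserved in this transcript.

```python
from itertools import product


def nullable_variables(grammar):
    """Return the variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # A body made only of nullable symbols (or an empty body) is nullable.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable


def remove_nullable(grammar):
    """Drop empty productions and add the variants that skip nullable symbols."""
    nullable = nullable_variables(grammar)
    result = {}
    for var, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # A nullable symbol may be kept or omitted; other symbols are kept.
            choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for pick in product(*choices):
                candidate = tuple(sym for part in pick for sym in part)
                if candidate:  # never re-introduce an empty production
                    new_bodies.add(candidate)
        result[var] = new_bodies
    return result


# Hypothetical example: S -> aMb, M -> aMb | (empty string)
example = {
    "S": {("a", "M", "b")},
    "M": {("a", "M", "b"), ()},
}
without_nullable = remove_nullable(example)
```

Note that the result cannot generate the empty string, so this sketch is only language-preserving for grammars whose start variable is not nullable.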
  • Slide 8
  • Unit Productions. Unit production: a production with a single variable on both sides
  • Slide 9
  • Removing Unit Productions. Observation: a production of the form A -> A is removed immediately
  • Slide 10
  • Example grammar:
  • Slide 11
  • Substitute
  • Slide 12
  • Remove
  • Slide 13
  • Substitute
  • Slide 14
  • Remove repeated productions: final grammar
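
The slides remove unit productions by repeated substitute and remove steps. The sketch below uses a different but equivalent formulation, a unit-closure computation: for each variable it collects every variable reachable through unit productions alone and copies over their non-unit productions. The representation (dict from variable to a set of tuples), the function name, and the example grammar are assumptions for illustration.

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar with no productions of the form A -> B."""
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for var in grammar:
        # Variables reachable from `var` using unit productions only.
        reachable = {var}
        stack = [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body[0] not in reachable:
                    reachable.add(body[0])
                    stack.append(body[0])
        # Copy every non-unit production of the reachable variables.
        result[var] = {
            body
            for target in reachable
            for body in grammar[target]
            if not is_unit(body)
        }
    return result


# Hypothetical example: S -> aA, A -> a | B, B -> A | bb
example = {
    "S": {("a", "A")},
    "A": {("a",), ("B",)},
    "B": {("A",), ("b", "b")},
}
without_units = remove_unit_productions(example)
```

Because results are stored in sets, repeated productions collapse automatically, which corresponds to the "remove repeated productions" step.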
  • Slide 15
  • Useless Productions. Some derivations never terminate... Useless production
  • Slide 16
  • Another grammar: a variable not reachable from S. Useless production
  • Slide 17
  • In general: a variable is useful if it appears in some derivation from S of a string that contains only terminals; otherwise, the variable is useless
  • Slide 18
  • A production is useless if any of its variables is useless (the example marks the useless productions and the useless variables)
  • Slide 19
  • Removing Useless Productions. Example grammar:
  • Slide 20
  • First: find all variables that can produce strings with only terminals (Round 1, Round 2)
  • Slide 21
  • Keep only the variables that produce terminal symbols (the remaining variables are useless). Remove useless productions
  • Slide 22
  • Second: find all variables reachable from S. Use a dependency graph to identify the variables that are not reachable
  • Slide 23
  • Keep only the variables reachable from S (the remaining variables are useless). Remove useless productions. Final grammar
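
The two passes described above can be sketched as follows: a round-based search for variables that produce terminal strings, then a reachability walk from the start variable. The grammar representation (dict from variable to a set of tuples, with every variable present as a key), the function name `remove_useless`, and the example grammar are assumptions for illustration.

```python
def remove_useless(grammar, start="S"):
    """Remove variables that produce no terminal string or are unreachable."""
    variables = set(grammar)

    # First: find variables that can produce strings of terminals only.
    generating = set()

    def all_generating(body):
        return all(s not in variables or s in generating for s in body)

    changed = True
    while changed:  # each pass corresponds to one "round"
        changed = False
        for var, bodies in grammar.items():
            if var not in generating and any(all_generating(b) for b in bodies):
                generating.add(var)
                changed = True

    pruned = {
        var: {body for body in grammar[var] if all_generating(body)}
        for var in generating
    }

    # Second: keep only variables reachable from the start variable.
    reachable = set()
    stack = [start] if start in pruned else []
    while stack:
        var = stack.pop()
        if var in reachable:
            continue
        reachable.add(var)
        for body in pruned[var]:
            stack.extend(s for s in body if s in pruned and s not in reachable)

    return {var: pruned[var] for var in reachable}


# Hypothetical example: S -> aS | A | C, A -> a, B -> aa, C -> aCb
example = {
    "S": {("a", "S"), ("A",), ("C",)},
    "A": {("a",)},
    "B": {("a", "a")},
    "C": {("a", "C", "b")},
}
useful = remove_useless(example)
```

In this example `C` never terminates and `B` is not reachable from `S`, so both are dropped. The order of the passes matters: removing non-terminating variables first can make further variables unreachable.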
  • Slide 24
  • Removing All. Step 1: remove nullable variables. Step 2: remove unit productions. Step 3: remove useless variables
  • Slide 25
  • Normal Forms for Context-Free Grammars
  • Slide 26
  • Chomsky Normal Form: each production has the form A -> BC, where B and C are variables, or A -> a, where a is a terminal
  • Slide 27
  • Examples: one grammar not in Chomsky Normal Form, one grammar in Chomsky Normal Form
  • Slide 28
  • Conversion to Chomsky Normal Form. Example: a grammar not in Chomsky Normal Form
  • Slide 29
  • Introduce variables for terminals:
  • Slide 30
  • Introduce intermediate variable:
  • Slide 31
  • Introduce intermediate variable:
  • Slide 32
  • Final grammar in Chomsky Normal Form, shown next to the initial grammar
  • Slide 33
  • In general: from any context-free grammar (which doesn't produce the empty string) that is not in Chomsky Normal Form, we can obtain an equivalent grammar in Chomsky Normal Form
  • Slide 34
  • The Procedure. First remove: nullable variables, unit productions
  • Slide 35
  • Then, for every terminal symbol a: add a production T_a -> a, where T_a is a new variable, and in productions with two or more symbols on the right-hand side replace a with T_a
  • Slide 36
  • Replace any production A -> C1 C2 ... Cn with A -> C1 V1, V1 -> C2 V2, ..., V(n-2) -> C(n-1) Cn, where V1, ..., V(n-2) are new intermediate variables
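
A minimal sketch of the procedure, assuming nullable variables and unit productions have already been removed. The grammar representation (dict from variable to a set of tuples), the generated names `T_<terminal>` and `V<number>`, and the example grammar are assumptions for illustration; the sketch also assumes those generated names do not clash with existing variables.

```python
def to_cnf(grammar):
    """Convert a grammar without empty or unit productions to Chomsky Normal Form."""
    variables = set(grammar)
    result = {var: set() for var in grammar}
    terminal_var = {}
    counter = 0

    def var_for_terminal(terminal):
        # Introduce one new variable per terminal, with production T_a -> a.
        if terminal not in terminal_var:
            name = f"T_{terminal}"
            terminal_var[terminal] = name
            result[name] = {(terminal,)}
        return terminal_var[terminal]

    for var, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                result[var].add(body)  # A -> a is already in normal form
                continue
            symbols = [s if s in variables else var_for_terminal(s) for s in body]
            head = var
            # Break long bodies into a chain of two-variable productions.
            while len(symbols) > 2:
                counter += 1
                fresh = f"V{counter}"
                result.setdefault(fresh, set())
                result[head].add((symbols[0], fresh))
                head = fresh
                symbols = symbols[1:]
            result[head].add(tuple(symbols))
    return result


# Hypothetical example: S -> ABa, A -> aab, B -> Ac
example = {
    "S": {("A", "B", "a")},
    "A": {("a", "a", "b")},
    "B": {("A", "c")},
}
cnf = to_cnf(example)
```

Every production in `cnf` has either two variables or a single terminal on its right-hand side.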
  • Slide 37
  • Theorem: for any context-free grammar (which doesn't produce the empty string) there is an equivalent grammar in Chomsky Normal Form
  • Slide 38
  • Observations: Chomsky normal forms are good for parsing and proving theorems. It is very easy to find the Chomsky normal form for any context-free grammar
  • Slide 39
  • Greibach Normal Form: all productions have the form of a terminal symbol followed by zero or more variables
  • Slide 40
  • Examples: one grammar in Greibach Normal Form, one grammar not in Greibach Normal Form
  • Slide 41
  • Conversion to Greibach Normal Form:
  • Slide 42
  • Theorem: for any context-free grammar (which doesn't produce the empty string) there is an equivalent grammar in Greibach Normal Form
  • Slide 43
  • Observations: Greibach normal forms are very good for parsing. It is hard to find the Greibach normal form of a context-free grammar
  • Slide 44
  • Compilers
  • Slide 45
  • Program -> Compiler -> Machine Code
      Program: v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } ...
      Machine code: Add v,v,0 cmp v,5 jmplt ELSE THEN: add x, 12,v ELSE: WHILE: cmp x,3 ...
  • Slide 46
  • Lex
  • Slide 47
  • Lex: a lexical analyzer. A Lex program recognizes strings. For each kind of string found, the Lex program takes an action
  • Slide 48
  • Input -> Lex program -> Output
      Input: Var = 12 + 9; if (test > 20) temp = 0; else while (a < 20) temp++;
      Output: Identifier: Var, Operator: =, Integer: 12, Operator: +, Integer: 9, Semicolon: ;, Keyword: if, Parenthesis: (, Identifier: test, ....
  • Slide 49
  • In Lex, strings are described with regular expressions. Regular expressions in a Lex program:
      "if" "then"      /* keywords */
      "+" "-" "="      /* operators */
  • Slide 50
  • Regular expressions in a Lex program:
      (0|1|2|3|4|5|6|7|8|9)+      /* integers */
      (a|b|..|z|A|B|...|Z)+       /* identifiers */
  • Slide 51
  • Integers: (0|1|2|3|4|5|6|7|8|9)+ can be written as [0-9]+
  • Slide 52
  • Identifiers: (a|b|..|z|A|B|...|Z)+ can be written as [a-zA-Z]+
  • Slide 53
  • Each regular expression has an associated action (in C code). Examples:
      Regular expression    Action
      \n                    linenum++;
      [a-zA-Z]+             printf("identifier");
      [0-9]+                printf("integer");
  • Slide 54
  • Default action: ECHO; prints the identified string to the output
  • Slide 55
  • A small Lex program:
      %%
      [a-zA-Z]+   printf("Identifier\n");
      [0-9]+      printf("Integer\n");
      [ \t\n]     ;   /* skip spaces */
  • Slide 56
  • Input: 1234 test var 566 78 9800
      Output: Integer, Identifier, Identifier, Integer, Integer, Integer
  • Slide 57
  • Another program:
      %{
      int linenum = 1;
      %}
      %%
      [ \t]       ;   /* skip spaces */
      \n          linenum++;
      [a-zA-Z]+   printf("Identifier\n");
      [0-9]+      printf("Integer\n");
      .           printf("Error in line: %d\n", linenum);
  • Slide 58
  • Input (the + is on its third line): 1234 test var 566 78 9800 + temp
      Output: Integer, Identifier, Identifier, Integer, Integer, Integer, Error in line: 3, Identifier
  • Slide 59
  • Lex matches the longest input string. Example: with the regular expressions "if" and "ifend", the input ifend if is matched as "ifend" followed by "if"
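
To make the longest-match rule concrete, here is a small Python sketch that imitates it. It is not how Lex is implemented; it simply tries every rule at the current position and keeps the longest match, with ties going to the rule listed first. The rule names and the `tokenize` function are illustrative assumptions.

```python
import re

# Rules in priority order, mirroring the small Lex program above.
RULES = [
    ("Identifier", re.compile(r"[a-zA-Z]+")),
    ("Integer", re.compile(r"[0-9]+")),
    ("Skip", re.compile(r"[ \t\n]")),
]


def tokenize(text, rules=RULES):
    """Split `text` into (kind, lexeme) pairs using the longest-match rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, pattern in rules:
            match = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so ties are resolved in favor of the earlier rule.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            tokens.append(("Error", text[pos]))  # no rule applies
            pos += 1
            continue
        if best_name != "Skip":
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


# The "if" / "ifend" example: the longer keyword wins at position 0.
KEYWORD_RULES = [
    ("IF", re.compile(r"if")),
    ("IFEND", re.compile(r"ifend")),
    ("Skip", re.compile(r"[ \t\n]")),
]
keyword_tokens = tokenize("ifend if", KEYWORD_RULES)
```

Each rule is tried separately because alternation in Python's `re` module picks the first alternative that matches rather than the longest one.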
  • Slide 60
  • Internal Structure of Lex: regular expressions -> NFA -> DFA -> minimal DFA. The final states of the DFA are associated with actions
  • Slide 61
  • Compiler: program (input) -> lexical analyzer -> parser -> machine code (output)
  • Slide 62
  • A parser knows the grammar of the programming language
  • Slide 63
  • Parser grammar:
      PROGRAM -> STMT_LIST
      STMT_LIST -> STMT; STMT_LIST | STMT;
      STMT -> EXPR | IF_STMT | WHILE_STMT | { STMT_LIST }
      EXPR -> EXPR + EXPR | EXPR - EXPR | ID
      IF_STMT -> if (EXPR) then STMT | if (EXPR) then STMT else STMT
      WHILE_STMT -> while (EXPR) do STMT
  • Slide 64
  • The parser finds the derivation of a particular input
      Input: 10 + 2 * 5
      Grammar: E -> E + E | E * E | INT
      Derivation: E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5
  • Slide 65
  • Derivation tree (leaves 10, 2, 5) for the derivation E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5