Lisp: Lab Information
Donald F. Ross
11/05/2017 PS - Lisp - Lab Info

General Model
[Figure: program source text → lexical analysis → lexemes → tokens stream → syntax analysis → true or false]
What you need to write: is_id, is_number … etc. + grammar rules + symbol table fns
What can we re-use from the earlier labs?
• Conceptual model
  • parser <grammar rules>
  • get_token
  • match
  • lookahead
• Lisp program clichés
  • conditional expression (if / cond)
  • recursion (tail recursion) (no iteration!!!)
  • no (or few) variables
What code is provided?
• get-lex (state) – returns (c lexeme) list
  • extracts a lexeme (from input stream (ip))
• get-token (state)
  • state (the parse descriptor) is updated
• map-lexeme (lexeme) ;; partial code!
  • returns a (token lexeme) list
• create-parser-state (stream)
  • constructor for structure pstate
  • pstate(stream, lookahead, nextchar, status, symtab)
  • init to (ip, ( ), #\Space, 'OK, ( ) )
What code is provided?
• match (state symbol)
  • Compare symbol (expected token) with the token in state (token stream input)
• parser
  • driver function for the system
  • requires the i/p file name
  • checks to see if this is a “program”
    • i.e. according to the grammar – program is S (the start symbol)
    • G = (S, P, NT, T) (S – start symbol)
What code is required?
• is-id (lexeme), is-number (lexeme), … (you write these!)
  • test lexeme predicates
  • may require some help functions
• THINK ABOUT WHAT IS REQUIRED !!!
  • the input is a string
  • a string may be thought of as a list of characters
  • (check the character and string handling functions in Lisp + some other functions. Hint: for each; for all; (check suitable functions))
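One possible shape for these predicates, as a sketch only. Treating the string as a sequence of characters lets the standard function every play the role of "for all". The choice of every, alpha-char-p, alphanumericp and digit-char-p is an assumption about what the hint points at, and rejecting the empty string is an assumed design decision (map-lexeme maps "" to EOF).

```lisp
;; Sketch only: one way to write the lexeme predicates.
;; The exact definition of "identifier" and "number" is an assumption
;; based on the slide text (alphanumeric starting with an alpha; numeric).

(defun is-number (lexeme)
  "True if LEXEME is a non-empty string containing only decimal digits."
  (and (stringp lexeme)
       (plusp (length lexeme))
       (every #'digit-char-p lexeme)))

(defun is-id (lexeme)
  "True if LEXEME starts with a letter and contains only letters and digits."
  (and (stringp lexeme)
       (plusp (length lexeme))
       (alpha-char-p (char lexeme 0))
       (every #'alphanumericp lexeme)))
```

For example, (is-id "x1") and (is-number "2017") are true, while (is-id "1x") and (is-number "20a") are false.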
What code is required?
• map-lexeme must be completed
  • lexeme to token conversion
• The code for the grammar is required
  • program has been provided as an example
• parser is the driver program
  • you may think about modifications here
  • you may want to think about how you are going to run the test suite
What to think about
• The grammar code should be simple
  • Ts are handled using match (Ts → match)
  • NTs require a function applied to state (NTs → function)
  • conditional expressions are required
  • (tail) recursion must (!) be used
  • you may need to define some help predicates
  • your code should be easy to upgrade
  • use your knowledge from labs 1 & 2 !!!
• Ts = Terminal Symbols
• NTs = Non-Terminal Symbols
The code: defstruct (define a structure)
(defstruct pstate
  (stream)      ;; input stream (ip) - param
  (lookahead)   ;; (token lexeme) - a list
  (nextchar)    ;; next char after lexeme - a char
  (status)      ;; parse status OK / NOTOK - symbol
  (symtab)      ;; the symbol table - a list
)
This is a STATE DESCRIPTOR since the parsing process is actually a state process.
The code: defstruct (define a structure)
• Defines a structure with fields; pstate is the structure name
• Creates
  • a constructor make-pstate
      (make-pstate
        :stream ip           ;; input stream
        :lookahead ()        ;; empty list
        :nextchar #\Space    ;; space char
        :status 'OK          ;; symbol
        :symtab () )         ;; empty list
  • readers for the fields: (pstate-stream state), (pstate-lookahead … etc.
  • writers ;; state is an instance of structure pstate
      (setf (pstate-stream state) x)
      (setf (pstate-lookahead state) x)
  • a predicate (pstate-p x)
    • Test an object to see if it is of the defstruct defined data type
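The four kinds of generated function can be tried directly at the clisp prompt. The following is a minimal sketch; the variable name *demo-state* is made up for the illustration, and nil stands in for a real input stream.

```lisp
;; Sketch: exercising the functions that defstruct generates for pstate.
(defstruct pstate
  (stream) (lookahead) (nextchar) (status) (symtab))

;; constructor (*demo-state* is a hypothetical name; nil replaces a real stream)
(defparameter *demo-state*
  (make-pstate :stream nil :lookahead () :nextchar #\Space
               :status 'OK :symtab ()))

;; reader
(pstate-status *demo-state*)                       ; => OK

;; writer
(setf (pstate-lookahead *demo-state*) (list 'VAR "var"))
(pstate-lookahead *demo-state*)                    ; => (VAR "var")

;; predicate
(pstate-p *demo-state*)                            ; => T
(pstate-p "not a state")                           ; => NIL
```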
The code: parser
(defun parse (filename)
  (format t "~%------------------------------------------------------")
  (format t "~%--- Parsing program: ~S " filename)
  (format t "~%------------------------------------------------------~%")
  ;; with-open-file opens (and closes) the file itself, so it takes the
  ;; filename directly rather than the result of (open filename)
  (with-open-file (ip filename :direction :input)
    ;; state is bound locally instead of being set as a global variable
    (let ((state (create-parser-state ip)))                   ;; constructor
      (setf (pstate-nextchar state) (read-char ip nil 'EOF))  ;; writer
      (get-token state)                    ;; get first token
      (program state)                      ;; parse the program
      (check-end state)                    ;; check for extra symbols
      (symtab-display state)               ;; display symbol table
      (if (eq (pstate-status state) 'OK)   ;; reader
          (format t "Parse Successful. ~%")
          (format t "Parse Fail. ~%"))))
  (format t "------------------------------------------------------~%"))
The code: parser
• Parameters: filename
• Write message to screen (standard output)
• Open the file for reading using the filename
• If the input program is legal (program), output “Parse Successful” else output “Parse Fail”
• NB the parser descriptor is initialised (see below)
(defun create-parser-state (ip)
  (make-pstate
    :stream ip
    :lookahead ()
    :nextchar #\Space
    :status 'OK
    :symtab ()
))
() = empty list
The code: get-token
(defun get-token (state)
  (let ((result (get-lex state)))                  ;; (c lexeme)
    (setf (pstate-nextchar state) (first result))
    (setf (pstate-lookahead state)                 ;; (token lexeme)
          (map-lexeme (second result)))))          ;; return value is (token lexeme)
• get-lex returns a list; result is set to (c lexeme)
• (pstate-nextchar state) is set to c, the next character AFTER the lexeme
• (pstate-lookahead state) is set to map-lexeme applied to lexeme, giving a (token lexeme) list, i.e. lexeme → (token lexeme)
The code: map-lexeme
(defun map-lexeme (lexeme)
  (format t "Symbol: ~S ~%" lexeme)
  (list (cond
          ((string= lexeme "program") 'PROGRAM)
          ((string= lexeme "var"    ) 'VAR)
          ; ...
          ((string= lexeme "("      ) 'LP)
          ((string= lexeme ")"      ) 'RP)
          ((string= lexeme ""       ) 'EOF)
          ;
          ; ((is-id lexeme    ) 'ID)
          ; ((is-number lexeme) 'NUM)
          (t 'UNKNOWN))
        lexeme)
) ; NB the result is a list: (token lexeme) e.g. (VAR "var")
The code: map-lexeme
• Parameters: lexeme (a string)
• Write out the lexeme to the screen
• Return a list (of two elements) with the token and the corresponding lexeme e.g. (PROGRAM "program")
• Using
  • string= to compare the lexeme with a pattern
    • keywords ("program", "var", …)
    • symbols ( "(", ")", … )
  • is-id (a predicate for identifiers)
    • alphanumeric strings beginning with an alpha
  • is-number (a predicate for numbers)
    • numeric strings
• NB! You have to write is-id and is-number
The code: match
(defun match (state symbol)
  (if (eq symbol (token state))
      (get-token state)          ;; get next token
      (synerr1 state symbol)     ;; error message
  )
)
NB: identify the reader here
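To see match at work without the rest of the lab code, the sketch below replaces the lexer with a list of ready-made (token lexeme) pairs. The definitions of token, get-token and synerr1 here are stand-ins written for this illustration (assumptions, not the provided code), and demo-match is a hypothetical helper.

```lisp
;; Sketch: MATCH driven by a canned token list instead of a real input file.
(defstruct pstate
  (stream) (lookahead) (nextchar) (status) (symtab))

(defun token (state)
  "Reader (assumed definition): the token part of the (token lexeme) lookahead."
  (first (pstate-lookahead state)))

(defun get-token (state)
  "Stub: take the next (token lexeme) pair from a list kept in the stream slot."
  (setf (pstate-lookahead state)
        (or (pop (pstate-stream state)) (list 'EOF ""))))

(defun synerr1 (state symbol)
  "Stub error function: report the mismatch and mark the parse as failed."
  (format t "Syntax error: expected ~S, found ~S~%" symbol (token state))
  (setf (pstate-status state) 'NOTOK))

(defun match (state symbol)
  (if (eq symbol (token state))
      (get-token state)          ; expected token found: move on
      (synerr1 state symbol)))   ; otherwise report an error

(defun demo-match (pairs expected)
  "Hypothetical helper: apply MATCH to each symbol in EXPECTED, return the status."
  (let ((state (make-pstate :stream (copy-list pairs) :status 'OK)))
    (get-token state)
    (mapc (lambda (symbol) (match state symbol)) expected)
    (pstate-status state)))
```

With these stand-ins, (demo-match '((LP "(") (RP ")")) '(LP RP)) returns OK, while (demo-match '((LP "(") (VAR "var")) '(LP RP)) prints an error message and returns NOTOK.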
How to test
• Think about what you want to do FIRST!
• Stepwise development
  • Top down
  • Slowly (festina lente!)
  • Add 1 construct at a time: program header; var part; stat part
• Test a correct Pascal program first (test case #1)
• Run a clisp window + editor window
• Decide on which error conditions you can test + corresponding error messages
  • Test each error condition separately as you add the code
• Decide how you are going to run the test suite
Parser - summary
• Use what you know from lab 1 and lab 2
• NB: as in Prolog, the character after the lexeme must be kept
• Recall that parsing is a linear process (reading the ip stream)
• Reader + Lexer: read lexemes & return (c lexeme) list
• pstate defines a descriptor for the parser state at each stage
  • is initialised by make-pstate & updated by parse (ip, nextchar), get-token (lookahead, nextchar), error functions (status) and symbol table functions (symtab)
• Look for a constructor, readers & writers, a predicate
  • a constructor: make-pstate
  • a reader: (pstate-<fieldname> state)
  • a writer: (setf (pstate-<fieldname> state) x)
  • a predicate: (pstate-p x)
Parser – functional description
• Reader & Lexer
  • get-lex: state → char x lexeme
• Help functions
  • ctos: char → string
  • str-con: string x char → string
  • whitespace: character → Boolean
  • get-name: ip x lexeme x char → char x lexeme
  • get-number: ip x lexeme x char → char x lexeme
  • get-symbol: ip x lexeme x char → char x lexeme
• x is the cross product
• recall that the last character read must be kept and “passed forward” – (get-lex state) returns a pair (char lexeme)
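The three smallest help functions follow directly from their signatures. The bodies below are guesses made for illustration (the provided lab code may differ), including the assumed set of whitespace characters.

```lisp
;; Sketch only: possible bodies for the simplest help functions.
;; Only the signatures come from the slide; the bodies are assumptions.

(defun ctos (c)
  "char -> string"
  (string c))

(defun str-con (str c)
  "string x char -> string: append the character C to the end of STR."
  (concatenate 'string str (string c)))

(defun whitespace (c)
  "character -> Boolean. Non-characters (e.g. the symbol EOF) are not whitespace."
  (and (characterp c)
       (member c '(#\Space #\Tab #\Newline #\Return))
       t))
```

For example, (str-con "va" #\r) returns "var", which is how a lexeme can be built up one character at a time while the next character is passed forward.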
Parser – state descriptor (state)
• state (a record)
  • lookahead: (token lexeme)
  • stream: ip – the input stream pointer
  • nextchar: the last char read (after the token/lexeme)
  • status: the parse status – OK / not OK
  • symtab: the Symbol Table
• state describes the current state of the parser
• lookahead has now become a pair of values (token lexeme)
Parser – (the driver: parse)
• parse: filename → Boolean
• Description
  • Print a header
  • Open the input file & read the first character
  • Set state to the input stream (ip) and first character
  • Get the first token ( (get-token state) → (get-lex state) )
  • Call program (start of parse – (program state) )
  • Check for extra characters after the program text
  • Print the Symbol Table (symtab-display state)
  • Print the parse result: Parse Successful / Parse Fail
  • Print a footer
Parser – help functions
• get-token: state → state'
  • lookahead is set to (token lexeme)
• match: state x symbol → state'
  • if token = symbol (get-token state) else error message
  • application (match state 'BEGIN)
• map-lexeme: lexeme → (token lexeme)
  • return a (token lexeme) pair from a lexeme
• grammar functions for non-terminals
  • example
      (defun stat-part (state)
        (match state 'BEGIN)
        (stat-list state)
        (match state 'END)
        (match state 'FSTOP)
      )
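A repeating construct can be written in the same style with a conditional and a tail call instead of a loop. The sketch below uses a made-up rule, id-list ::= ID | ID COMMA id-list, which is not taken from the lab grammar. The token names ID, COMMA and COLON, the stubbed token, get-token and synerr1, and the helper demo-id-list are all assumptions for the illustration.

```lisp
;; Sketch: a recursive grammar function for a HYPOTHETICAL rule
;;   id-list ::= ID | ID COMMA id-list
(defstruct pstate (stream) (lookahead) (nextchar) (status) (symtab))

(defun token (state)            ; assumed reader for the current token
  (first (pstate-lookahead state)))

(defun get-token (state)        ; stub: tokens come from a list in the stream slot
  (setf (pstate-lookahead state)
        (or (pop (pstate-stream state)) (list 'EOF ""))))

(defun synerr1 (state symbol)   ; stub: just record the failure
  (declare (ignore symbol))
  (setf (pstate-status state) 'NOTOK))

(defun match (state symbol)
  (if (eq symbol (token state))
      (get-token state)
      (synerr1 state symbol)))

(defun id-list (state)
  (match state 'ID)                    ; terminal: handled by match
  (if (eq (token state) 'COMMA)        ; conditional decides whether to repeat
      (progn (match state 'COMMA)
             (id-list state))          ; tail call replaces iteration
      state))

(defun demo-id-list (pairs)
  "Hypothetical helper: parse PAIRS as an id-list, return (status next-token)."
  (let ((state (make-pstate :stream (copy-list pairs) :status 'OK)))
    (get-token state)
    (id-list state)
    (list (pstate-status state) (token state))))
```

With these stand-ins, (demo-id-list '((ID "a") (COMMA ",") (ID "b") (COLON ":"))) returns (OK COLON): both identifiers are consumed and the parse stops at the first token that does not belong to the list.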