Syntax and semantics {week 03}
The College of Saint Rose
CIS 433 – Programming Languages
David Goldschmidt, Ph.D.
From Concepts of Programming Languages, 9th edition, by Robert W. Sebesta, Addison-Wesley, 2010, ISBN 0-13-607347-6
Syntax
Syntax is the expected form or structure of the expressions, statements, and program units of a programming language
Syntax of a Java while statement: while ( <boolean_expr> ) <statement>
Partial syntax of an if statement: if ( <boolean_expr> ) <statement>
Semantics
Semantics is the meaning of the expressions, statements, and program units of a given programming language
Semantics of a Java while statement: while ( <boolean_expr> ) <statement>
Execute <statement> zero or more times, as long as <boolean_expr> evaluates to true
Syntax and semantics
Together, syntax and semantics define a programming language
Syntax errors are detected and reported by a compiler
Errors related to semantics are defects in program logic that cause incorrect results or program crashes
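To make the distinction concrete, here is a minimal sketch in Python (chosen only for brevity; the slides use Java). The two snippets are hypothetical: one is malformed and is rejected before it can run, the other is well formed but computes the wrong answer.

```python
# A malformed statement: the closing parenthesis is missing.
BAD_SYNTAX = "while (x < 3: x = x + 1"


def has_syntax_error(source: str) -> bool:
    """Return True if Python's compiler rejects the source text."""
    try:
        compile(source, "<example>", "exec")
        return False
    except SyntaxError:
        return True


def average(values):
    """Meant to return the mean, but contains a deliberate logic defect."""
    # Defect: divides by a fixed 2 instead of len(values).
    return sum(values) / 2


print(has_syntax_error(BAD_SYNTAX))   # True: caught before execution
print(has_syntax_error("x = 1 + 2"))  # False: well formed
print(average([2, 4, 6]))             # 6.0, although the mean is 4.0
```

The compiler catches the first problem without running anything. The second only shows up when the result is compared with what was intended.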
Defining syntax
Terminology to describe syntax:
A sentence is a string of characters over an alphabet of symbols
A language is a set of sentences
A lexeme is the lowest-level syntactic unit of a language (e.g. +, *, sum, while)
▪ One step above individual characters
A token is a category of lexemes
▪ e.g. identifier, equal_sign, integer_literal, etc.
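The lexeme/token relationship can be sketched with a small scanner. This is an illustration only: the token names and patterns below are assumptions made for the example, not a definition from the textbook.

```python
import re

# Hypothetical token categories; each pattern describes the lexemes in that category.
TOKEN_SPEC = [
    ("integer_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("plus_op", r"\+"),
    ("mult_op", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token, lexeme) pairs; raise ValueError on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not a token
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


print(tokenize("sum = 2"))
# [('identifier', 'sum'), ('equal_sign', '='), ('integer_literal', '2')]
```

Many different lexemes (sum, count, x) fall under the single token identifier. A fuller scanner would also separate reserved words such as while from ordinary identifiers.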
Language recognizers
A language recognizer reads an input string and determines whether it belongs to the given language
This is the syntax analysis part of a compiler or interpreter
input strings (source code) → language recognizer → accept or reject each input string
Language generators
A language generator produces syntactically acceptable strings of a given language
Not practical to generate all valid strings
Instead, inspect generator rules (the grammar) to determine if a sentence is acceptable for a given language
language generator → valid strings of the language
Noam Chomsky
In the mid-1950s, linguist Noam Chomsky (born 1928) developed four classes of generative grammars
Context-free grammars (CFGs) are useful for describing programming language syntax
Regular grammars are useful for describing valid tokens of a programming language
Backus-Naur Form (BNF)
John Backus introduced a formal notation for specifying programming language syntax in 1959 (for ALGOL 58); Peter Naur refined it in 1960 for the ALGOL 60 report
Backus-Naur Form (BNF) is nearly identical to Chomsky’s context-free grammars
Syntax of an assignment statement in BNF:
▪ <assign> → <var> = <expression> ;
BNF structure
Syntax of an assignment statement in BNF, given as a BNF rule (or production) defining <assign>:
    <assign> → <var> = <expression> ;
Left-hand side ( <assign> ): the abstraction being defined
Right-hand side ( <var> = <expression> ; ): the definition of <assign>
The definition consists of other abstractions, as well as lexemes and tokens
Example language
    <program> → begin <stmts> end
    <stmts> → <stmt> | <stmt> ; <stmts>
    <stmt> → <var> = <expr>
    <var> → a | b | c | d | e
    <expr> → <term> + <term> | <term> - <term>
    <term> → <var> | literal-integer-value
A vertical bar ( | ) indicates an OR
literal-integer-value is a token, which is simply a grouping of lexemes
Write a sentence that conforms to this grammar
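One way to explore the example grammar is to let a program play the role of a language generator. The sketch below expands abstractions by picking alternatives at random. The dictionary encoding, the function names, and the 0 to 999 range for integer literals are all assumptions made for illustration.

```python
import random

# The example grammar: each abstraction maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["begin", "<stmts>", "end"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"], ["e"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["literal-integer-value"]],
}


def generate(symbol, rng):
    """Expand one symbol into a list of lexemes."""
    if symbol == "literal-integer-value":
        # The token stands for any integer lexeme; the range is arbitrary.
        return [str(rng.randint(0, 999))]
    if symbol not in GRAMMAR:
        return [symbol]  # a plain lexeme such as begin, =, or ;
    lexemes = []
    for part in rng.choice(GRAMMAR[symbol]):
        lexemes.extend(generate(part, rng))
    return lexemes


def random_sentence(seed=None):
    """Return one randomly generated sentence of the language."""
    return " ".join(generate("<program>", random.Random(seed)))


print(random_sentence())
```

Each run prints a different sentence. Because <stmts> is recursive, the set of possible sentences is infinite, which is why enumerating them all is not practical.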
Derivations
A derivation is a repeated application of rules
Start with a start symbol and end with a sentence
    <program> => begin <stmts> end
              => begin <stmt> end
              => begin <var> = <expr> end
              => begin b = <expr> end
              => begin b = <term> + <term> end
              => begin b = <var> + <term> end
              => begin b = c + <term> end
              => begin b = c + 123 end
A grammar usually allows many derivations, often infinitely many
Example grammar:
    <program> → begin <stmts> end
    <stmts> → <stmt> | <stmt> ; <stmts>
    <stmt> → <var> = <expr>
    <var> → a | b | c | d | e
    <expr> → <term> + <term> | <term> - <term>
    <term> → <var> | literal-integer-value
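The derivation on this slide can be replayed mechanically. In the sketch below, each step replaces the leftmost abstraction with one of its alternatives, selected by index. The encoding and the function name are assumptions for illustration, and the token literal-integer-value is left in place where the slide substitutes the lexeme 123.

```python
# The example grammar: each abstraction maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["begin", "<stmts>", "end"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"], ["e"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["literal-integer-value"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Return every sentential form produced by applying the chosen alternatives."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost abstraction in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        # Replace it with the selected alternative of its rule.
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


# Alternative indices that reproduce the derivation of: begin b = c + <integer> end
for step in leftmost_derivation([0, 0, 0, 1, 0, 0, 2, 1]):
    print("=>", step)
```

A different list of indices yields a different sentence. A rightmost variant would only need to search for the last abstraction instead of the first.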
Leftmost and rightmost derivations
A leftmost derivation is one in which the leftmost abstraction is always the next one expanded
A rightmost derivation always expands the rightmost abstraction next
Write both a leftmost and rightmost derivation to obtain this sentence: begin d = 10 - a end
Why is the leftmost derivation important?
Example grammar:
    <program> → begin <stmts> end
    <stmts> → <stmt> | <stmt> ; <stmts>
    <stmt> → <var> = <expr>
    <var> → a | b | c | d | e
    <expr> → <term> + <term> | <term> - <term>
    <term> → <var> | literal-integer-value
Working with grammars
Given this simple grammar:
    <S> → <A> <B> <C>
    <A> → a <A> | a
    <B> → b <B> | b
    <C> → c <C> | c
Which of the following sentences are generated by this grammar?
▪ baaabbccc
▪ abc
▪ bbaabbaabbaabbaac
▪ aabbbbccccccccccccccccccccc
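Answers to questions like this can be checked with a small language recognizer. In this grammar, each of <A>, <B>, and <C> produces one or more copies of its letter, so the whole language happens to be describable by a regular expression. The sketch below relies on that observation; the function name is hypothetical.

```python
import re

# <S> -> <A> <B> <C>: one or more a's, then one or more b's, then one or more c's.
PATTERN = re.compile(r"a+b+c+")


def is_sentence(text: str) -> bool:
    """Accept the string only if the whole of it matches the grammar."""
    return PATTERN.fullmatch(text) is not None


print(is_sentence("aabbc"))  # True
print(is_sentence("cab"))    # False: the letters are out of order
```

This shortcut works only because the grammar is regular. Grammars with nesting, such as balanced parentheses, need a parser rather than a regular expression.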
What next? (i)
Write BNF for the following constructs from your favorite programming language:
Assignment statement
▪ Include operators +, -, *, /, %, ++, --
Complete while and if statements
Class header for Java/C++/C# etc.
What next? (ii)
Given this grammar:
    <assign> → <var> = <expr>
    <var> → A | B | C | D
    <expr> → <var> + <expr> | <var> * <expr> | ( <expr> ) | <var>
Show both leftmost and rightmost derivations for the following sentences:
▪ A = A * ( B + ( C * A ) )
▪ B = B * ( (D) + C )
▪ C = A + B + C * D + A
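Before writing a derivation by hand, it can help to confirm that a sentence belongs to the language at all. The following is a minimal recursive-descent recognizer for this grammar, offered as a sketch. It assumes lexemes are separated by whitespace, and the function names are hypothetical.

```python
VARS = {"A", "B", "C", "D"}


def parse_expr(tokens, pos):
    """Return the position just past one <expr> starting at pos, or None."""
    if pos < len(tokens) and tokens[pos] in VARS:
        # <expr> -> <var> + <expr> | <var> * <expr> | <var>
        if pos + 1 < len(tokens) and tokens[pos + 1] in ("+", "*"):
            return parse_expr(tokens, pos + 2)
        return pos + 1
    if pos < len(tokens) and tokens[pos] == "(":
        # <expr> -> ( <expr> )
        inner = parse_expr(tokens, pos + 1)
        if inner is not None and inner < len(tokens) and tokens[inner] == ")":
            return inner + 1
    return None


def is_assign(sentence):
    """Accept sentences of the form <var> = <expr> with nothing left over."""
    tokens = sentence.split()
    if len(tokens) < 3 or tokens[0] not in VARS or tokens[1] != "=":
        return False
    return parse_expr(tokens, 2) == len(tokens)


print(is_assign("A = B + ( C )"))  # True
print(is_assign("A = ( B ) + C"))  # False
```

The second call is rejected because, under this grammar, an operator may only follow a <var>, never a parenthesized expression.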
What next? (iii)
Use BNF to write a grammar for reverse Polish notation
Use <expr> as your start symbol
Valid sentences include:
▪ 5 8 19 + *
▪ 2 3 + 5 7 * /
▪ 2 3 + 5 7 * / 3 4 + * 1 -
▪ 9 9 * 8 7 * * 5 5 * * 4 -
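A candidate grammar can be sanity-checked against an independent recognizer. The sketch below does not use a grammar at all: it tracks how many operands would be on an evaluation stack. It assumes whitespace-separated lexemes, unsigned integer operands, and the four binary operators that appear in the sample sentences.

```python
OPERATORS = {"+", "-", "*", "/"}


def is_rpn(sentence: str) -> bool:
    """Return True if the sentence is a well-formed reverse Polish expression."""
    depth = 0  # number of operands that would currently be on the stack
    for lexeme in sentence.split():
        if lexeme.isdigit():
            depth += 1
        elif lexeme in OPERATORS:
            if depth < 2:
                return False  # a binary operator needs two operands
            depth -= 1        # two operands are replaced by one result
        else:
            return False      # unknown lexeme
    return depth == 1         # exactly one result must remain


print(is_rpn("5 8 19 + *"))  # True
print(is_rpn("5 +"))         # False
```

Any sentence derivable from a correct grammar should be accepted by this check, and any rejected sentence should have no derivation.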
What next? (iv)
Read and study Chapter 3
Do Exercises at the end of Chapter 3