PL-3 Handout

IT2009
Syntax and Semantics

Basics
• Syntax – the form of the expressions, statements, and program units of a programming language
• Semantics – the meaning of the expressions, statements, and program units of a programming
language
• Lexemes – include the numeric literals, operators, and special words of a programming language
• Token – a category of the lexemes of a language
Sample statement:
index = 2 * count + 17;
The lexemes and tokens of this statement are:
Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
• A language recognizer reads input strings of the language and decides whether the strings belong to
the language. A language generator creates sentences of a language.
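The lexeme/token split above can be sketched as a small lexical analyzer. This is a minimal illustration, not a production lexer: the regular expressions, and the token name used for `+`, are this sketch's own choices; the other token names come from the sample table.

```python
import re

# Token names follow the sample table; the patterns are assumptions of this sketch.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (lexeme, token) pairs; raise ValueError on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates lexemes but is not a lexeme itself
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the alternative that matched, i.e. the token
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<6} {token}")
```

The function also acts as a very small recognizer for the token level: a string containing a character that no pattern accepts is rejected with an error.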
Formal Methods of Describing Syntax

• Grammar – describes the syntax of programming languages; a collection of rules
• Noam Chomsky – described four (4) classes of generative devices or grammars that define four (4)
classes of language. Two of these grammar classes are regular and context-free grammars.
• Regular grammars – describe the forms of the tokens of the programming languages
• Context-free grammars – describe the syntax of whole programming languages
• Backus-Naur Form (BNF) – a natural notation for describing syntax, introduced by John Backus in 1959
and modified by Peter Naur in 1960
• Metalanguage – a language that is used to describe another language
• BNF uses abstraction for syntactic structures. Example: <assign> → <var> = <expression>
• Left-hand side (LHS) – the abstraction being defined; the text on the left-side of the arrow
• Right-hand side (RHS) – the definition of the LHS; the text to the right of the arrow
• Rule (or production) – a mixture of tokens, lexemes, and references to other abstractions
• The abstractions in a BNF description, or grammar, are called nonterminal symbols or nonterminals.
• The lexemes and tokens of the rules are called terminal symbols or terminals.
• Recursion is used to describe lists of syntactic elements in programming languages. A rule is recursive
if its LHS appears in its RHS.
• The sentences of the language are generated through a sequence of applications of the rules,
beginning with a special nonterminal of the grammar called the start symbol. This sequence of rule
applications is called a derivation.
Sample grammar:
<program> → begin <stmt_list> end
<stmt_list> → <stmt>
| <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var>
| <var> - <var>
| <var>
03 Handout 1 *Property of STI

Derivation of the sample grammar:

<program> => begin <stmt_list> end
=> begin <stmt> ; <stmt_list> end
=> begin <var> = <expression> ; <stmt_list> end
=> begin A = <expression> ; <stmt_list> end
=> begin A = <var> + <var> ; <stmt_list> end
=> begin A = B + <var> ; <stmt_list> end
=> begin A = B + C ; <stmt_list> end
=> begin A = B + C ; <stmt> end
=> begin A = B + C ; <var> = <expression> end
=> begin A = B + C ; B = <expression> end
=> begin A = B + C ; B = <var> end
=> begin A = B + C ; B = C end
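The derivation above can be replayed mechanically. The sketch below is one possible encoding, assuming the grammar is stored as a dictionary and each step is identified by the 0-based index of the chosen alternative; the names `GRAMMAR`, `CHOICES`, and `leftmost_derivation` are hypothetical.

```python
# Sample grammar: each nonterminal maps to its list of alternative RHSs.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(start, choices):
    """Apply one alternative per step to the leftmost nonterminal.

    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    forms = [form]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        forms.append(form)
    return forms


# Alternative numbers that reproduce the twelve steps of the derivation.
CHOICES = [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2]
steps = leftmost_derivation("<program>", CHOICES)
for i, form in enumerate(steps):
    print(("" if i == 0 else "=> ") + " ".join(form))
```

Changing the entries of `CHOICES` generates other sentences of the same language, which is the sense in which a grammar is a language generator.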
• Each of the strings in the derivation is called a sentential form.
• In a leftmost derivation, the replaced nonterminal is always the leftmost nonterminal in the sentential
form. In a rightmost derivation, the replaced nonterminal is always the rightmost nonterminal.
• Parse trees – hierarchical syntactic structures of the sentences of the languages
Sample grammar:
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
Sentence:
A = B * ( A + C )
Derivation:
<assign> => <id> = <expr>
=> A = <expr>
=> A = <id> * <expr>
=> A = B * <expr>
=> A = B * ( <expr> )
=> A = B * ( <id> + <expr> )
=> A = B * ( A + <expr> )
=> A = B * ( A + <id> )
=> A = B * ( A + C )
Parse Tree:
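As a rough illustration, the parse tree for this sentence can be rebuilt with a small recursive-descent parser that has one function per grammar rule. The tuple representation and the helper names (`parse`, `show`, `leaves`) are assumptions of this sketch, and the input is assumed to have its lexemes separated by spaces.

```python
def parse(sentence):
    """Parse <assign> → <id> = <expr> and return a nested (label, children...) tuple."""
    tokens = sentence.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        lexeme = peek()
        if lexeme is None or (expected is not None and lexeme != expected):
            raise SyntaxError(f"expected {expected or 'a lexeme'}, found {lexeme!r}")
        pos += 1
        return lexeme

    def parse_id():  # <id> → A | B | C
        if peek() not in ("A", "B", "C"):
            raise SyntaxError(f"expected A, B, or C, found {peek()!r}")
        return ("<id>", take())

    def parse_expr():  # <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
        if peek() == "(":
            return ("<expr>", take("("), parse_expr(), take(")"))
        left = parse_id()
        if peek() in ("+", "*"):
            return ("<expr>", left, take(), parse_expr())
        return ("<expr>", left)

    tree = ("<assign>", parse_id(), take("="), parse_expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected lexeme {peek()!r}")
    return tree


def show(node, depth=0):
    """Return the tree as indented lines, one symbol per line."""
    if isinstance(node, str):
        return ["  " * depth + node]
    lines = ["  " * depth + node[0]]
    for child in node[1:]:
        lines.extend(show(child, depth + 1))
    return lines


def leaves(node):
    """Collect the terminal symbols from left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1:] for leaf in leaves(child)]


tree = parse("A = B * ( A + C )")
print("\n".join(show(tree)))
```

Every internal node of the printed tree is a nonterminal and every leaf is a terminal; reading the leaves from left to right gives back the sentence.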

• A grammar that generates a sentential form for which there are two or more distinct parse trees
is said to be ambiguous. A grammar is ambiguous if the grammar generates a sentence with more
than one (1) leftmost derivation or rightmost derivation.
Sample of an ambiguous grammar:
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <expr>
| <expr> * <expr>
| ( <expr> )
| <id>
Sentence:
A = B + C * A
Parse Trees:
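To make the ambiguity concrete, the sketch below enumerates every parse tree that the `<expr>` rules allow for the right-hand side `B + C * A`. It is a brute-force illustration rather than a practical parsing method; the tuple representation and the names `expr_trees` and `bracket` are hypothetical.

```python
def expr_trees(tokens):
    """Return every parse tree of <expr> over the given list of lexemes."""
    results = []
    if len(tokens) == 1 and tokens[0] in ("A", "B", "C"):
        results.append(tokens[0])  # <expr> → <id>
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        for inner in expr_trees(tokens[1:-1]):  # <expr> → ( <expr> )
            results.append(("(", inner, ")"))
    for i, lexeme in enumerate(tokens):
        if lexeme in ("+", "*"):
            # <expr> → <expr> op <expr>, with this operator at the root
            for left in expr_trees(tokens[:i]):
                for right in expr_trees(tokens[i + 1:]):
                    results.append((left, lexeme, right))
    return results


def bracket(tree):
    """Render a tree with square brackets marking each operator's operands."""
    if isinstance(tree, str):
        return tree
    if tree[0] == "(":
        return "( " + bracket(tree[1]) + " )"
    left, op, right = tree
    return f"[{bracket(left)} {op} {bracket(right)}]"


trees = expr_trees("B + C * A".split())
for tree in trees:
    print(bracket(tree))
```

Two trees are found: one groups `C * A` below the `+`, the other groups `B + C` below the `*`. Because the grammar alone does not pick one, it cannot fix the relative precedence of the two operators.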
• An operator in an arithmetic expression that is generated lower in the parse tree has precedence
over an operator produced higher up in the tree.
Sample of an unambiguous grammar:
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <term>
| <term>
<term> → <term> * <factor>
| <factor>
<factor> → ( <expr> )
| <id>
Sentence:
A = B + C * A
Leftmost derivation:
<assign> => <id> = <expr>
=> A = <expr>
=> A = <expr> + <term>
=> A = <term> + <term>
=> A = <factor> + <term>
=> A = <id> + <term>
=> A = B + <term>
=> A = B + <term> * <factor>
=> A = B + <factor> * <factor>
=> A = B + <id> * <factor>
=> A = B + C * <factor>
=> A = B + C * <id>
=> A = B + C * A

Rightmost derivation:
<assign> => <id> = <expr>
=> <id> = <expr> + <term>
=> <id> = <expr> + <term> * <factor>
=> <id> = <expr> + <term> * <id>
=> <id> = <expr> + <term> * A
=> <id> = <expr> + <factor> * A
=> <id> = <expr> + <id> * A
=> <id> = <expr> + C * A
=> <id> = <term> + C * A
=> <id> = <factor> + C * A
=> <id> = <id> + C * A
=> <id> = B + C * A
=> A = B + C * A
Both the leftmost and the rightmost derivations are represented by the same parse tree.
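Parse Tree:
<assign>
  <id>
    A
  =
  <expr>
    <expr>
      <term>
        <factor>
          <id>
            B
    +
    <term>
      <term>
        <factor>
          <id>
            C
      *
      <factor>
        <id>
          A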
• When an expression includes two (2) operators that have the same precedence, a semantic rule is
required to specify which should have precedence. The rule is named associativity.
Sample grammar:
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <term>
| <term>
<term> → <term> * <factor>
| <factor>
<factor> → ( <expr> )
| <id>
Sentence:
A = B + C + A

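Parse Tree:
<assign>
  <id>
    A
  =
  <expr>
    <expr>
      <expr>
        <term>
          <factor>
            <id>
              B
      +
      <term>
        <factor>
          <id>
            C
    +
    <term>
      <factor>
        <id>
          A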
• When a grammar rule has its LHS also appearing at the beginning of its RHS, the rule is said to be left
recursive. This left recursion specifies left associativity.
• To indicate right associativity, right recursion can be used. A grammar rule is right recursive if the LHS
appears at the right end of the RHS.
• Extended BNF (or EBNF) – extended versions of BNF that increase the readability and writability of BNF
• Three (3) EBNF Extensions:
o Optional parts of an RHS are placed in brackets.
Example: <if_stmt> → if (<expression>) <statement> [else <statement>]
o Braces are used to indicate that the enclosed part can be repeated indefinitely or left out
altogether.
Example: <ident_list> → <identifier> {, <identifier>}
o When a single element must be chosen from a group, the options are placed in parentheses,
and separated by the OR operator, |.
Example: <term> → <term> (* | / | %) <factor>
BNF:
<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
<factor> → <exp> ** <factor>
| <exp>
<exp> → (<expr>)
| id
EBNF:
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
<factor> → <exp> {** <exp>}
<exp> → (<expr>)
| id
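The EBNF version maps almost directly onto code: each rule becomes a function, and each pair of braces becomes a loop. The evaluator below is a sketch under two assumptions that are not in the grammar itself: integer literals stand in for the terminal `id`, and the repeated `**` operands are folded from the right, as a right-recursive BNF rule for `**` would group them.

```python
import re


def evaluate(text):
    """Evaluate an arithmetic expression with one function per EBNF rule."""
    # Assumption: integers replace `id`; characters outside the pattern are ignored.
    tokens = re.findall(r"\d+|\*\*|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():  # <expr> → <term> {(+ | -) <term>}
        value = term()
        while peek() in ("+", "-"):  # the braces become a loop
            if take() == "+":
                value = value + term()
            else:
                value = value - term()
        return value

    def term():  # <term> → <factor> {(* | /) <factor>}
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value = value * factor()
            else:
                value = value / factor()
        return value

    def factor():  # <factor> → <exp> {** <exp>}
        operands = [exp()]
        while peek() == "**":
            take()
            operands.append(exp())
        # The braces do not record associativity, so fold from the right here.
        value = operands.pop()
        while operands:
            value = operands.pop() ** value
        return value

    def exp():  # <exp> → (<expr>) | id
        if peek() == "(":
            take()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            take()
            return value
        lexeme = peek()
        if lexeme is None or not lexeme.isdigit():
            raise SyntaxError(f"expected an operand, found {lexeme!r}")
        return int(take())

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected lexeme {peek()!r}")
    return result


print(evaluate("10 - 4 - 3"), evaluate("2 + 3 * 4"), evaluate("2 ** 3 ** 2"))
```

The loops in `expr` and `term` accumulate from the left, so `10 - 4 - 3` is treated as `(10 - 4) - 3`. The EBNF braces do not state this associativity on their own; the implementation has to supply it.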

Attribute Grammars
• Static semantics – consists of semantic rules that can be checked during program compilation
• Attribute grammar – a descriptive formalism that can describe both the syntax and static semantics
of a language; an extension of context-free grammars that consists of a grammar, a set of attributes, a set of
attribute computation functions, and a set of predicates, which together describe static semantic rules
• Attribute computation functions (or semantic functions) – specify how attribute values are computed
• Predicate functions – state the semantic rules of the language
• Two (2) Classes of Attributes:
o Synthesized attributes – pass semantic information up a parse tree; their values are computed from the
attributes attached to the children of a node
o Inherited attributes – pass semantic information down and across a parse tree; their values are computed
from the attributes attached to the parent (or siblings) of a node
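The two attribute classes can be illustrated with a small type check on an assignment of the form `<var> = <var> + <var>`. Everything specific here is an assumption made for the example: the attribute names `actual_type` and `expected_type`, the `int`/`real` typing rule, and the `DECLARED` symbol table.

```python
# Hypothetical symbol table: the declared type of each variable.
DECLARED = {"A": "real", "B": "int", "C": "int"}


def var_actual_type(name):
    """Leaf attribute: a variable's type is looked up from its declaration."""
    return DECLARED[name]


def expr_actual_type(operands):
    """Synthesized attribute: computed from the children's actual_type values."""
    child_types = [var_actual_type(name) for name in operands]
    return "int" if all(t == "int" for t in child_types) else "real"


def check_assign(target, operands):
    """Decorate an assignment's <expr> node and evaluate the predicate."""
    attributes = {
        # Inherited: passed to <expr> from its sibling, the target variable.
        "expected_type": var_actual_type(target),
        # Synthesized: passed up to <expr> from the operands below it.
        "actual_type": expr_actual_type(operands),
    }
    # Predicate function: the static semantic rule being enforced.
    attributes["ok"] = attributes["actual_type"] == attributes["expected_type"]
    return attributes


print(check_assign("A", ["A", "B"]))  # real variable assigned real + int
print(check_assign("B", ["A", "C"]))  # int variable assigned real + int
```

In the first call the predicate holds; in the second it fails, which is the kind of static semantic error a compiler can report before the program runs.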
Dynamic Semantics
• Dynamic semantics – consists of semantic rules that can be checked during program execution
• Semantic Description Methods:
o Operational semantics – method of describing the meaning of language constructs in terms
of their effects on an ideal machine
▪ Natural operational semantics: At the highest level, the interest is the final result of
the execution of a complete program.
▪ Structural operational semantics: At the lowest level, operational semantics can be
used to determine the precise meaning of a program through an examination of the
complete sequence of state changes that occur when the program is executed.
o Denotational semantics – formalizes the meanings of programming languages by constructing
mathematical objects (denotations) to describe the meanings of language constructs
o Axiomatic semantics – a tool for proving the correctness of a program
• Assertions (or predicates) – logical expressions used in axiomatic semantics
o Precondition – an assertion before a statement
o Postcondition – an assertion that follows a statement
• The weakest precondition is the least restrictive precondition that will guarantee the validity of the
associated postcondition.
• Inference rule – a method of inferring the truth of one (1) assertion on the basis of the values of
other assertions
The general form of an inference rule:
\[ \frac{S_1,\ S_2,\ \ldots,\ S_n}{S} \]
• The top part of an inference rule is called its antecedent; the bottom part is called its consequent.
• Axiom – a logical statement that is assumed to be true. Therefore, an axiom is an inference rule
without an antecedent.
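The idea of a weakest precondition can be checked by brute force on a small example. The statement `x = 2 * y - 3`, the postcondition `x > 25`, and the finite test range are all assumptions chosen for this sketch. Substituting the right-hand side for `x` in the postcondition gives `2 * y - 3 > 25`, which simplifies to `y > 14`.

```python
def statement(y):
    """The statement under study, x = 2 * y - 3; returns the new value of x."""
    return 2 * y - 3


def postcondition(x):
    return x > 25


def candidate_precondition(y):
    # From substituting 2 * y - 3 for x in the postcondition: y > 14.
    return y > 14


# Assumption: a finite range of integers stands in for all possible inputs.
SAMPLE = range(-100, 101)

# Valid: every state that satisfies the precondition ends in the postcondition.
is_valid = all(postcondition(statement(y)) for y in SAMPLE if candidate_precondition(y))

# Weakest: no state outside the precondition ends in the postcondition.
is_weakest = all(
    not postcondition(statement(y)) for y in SAMPLE if not candidate_precondition(y)
)

print(is_valid, is_weakest)
```

A stronger assertion such as `y > 20` would also pass the validity check, but it would fail the second check because it excludes states like `y = 15` that still lead to the postcondition.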
