Module-5-Syntax Directed Translation
5.1 Introduction
In the previous chapters, we discussed the first two phases of the compiler, i.e., the lexical
analysis phase and the syntax analysis phase. In this section, let us concentrate on the third
phase of the compiler, called semantic analysis. The main goal of semantic analysis is
to check the program for correctness and enable proper execution.
We know that the job of the parser is only to verify that the input program consists of
tokens arranged in syntactically valid combinations. In semantic analysis, we check whether
they form a sensible set of instructions in the programming language. For example:
If an identifier is already declared, the semantic analyzer checks whether the type of the
identifier is respected in all expressions and statements used in the program. That is, it
checks whether the type of the RHS of an assignment statement matches
the type of the LHS. It also checks whether the LHS is properly declared and whether
it is an assignable identifier.
Ex 1: int a = 10 + 20; // valid
Ex 2: char b[] = "Hello"; // valid
      int a = 10 + b; // syntactically valid but semantically invalid statement:
                      // invalid usage of identifier b in the expression
Ex 3: int a[5][5]; // syntactically valid
      a = 10 + 20; // syntactically valid but semantically invalid statement:
                   // the array name a is not assignable
It checks whether the type and number of parameters in the function definition and
the type and number of arguments in the function call are the same. If not,
appropriate error messages are displayed.
It checks whether the types of the operands are the same in an arithmetic operation. If not, it
displays appropriate error messages.
The index variable used in an array must be of integer type; otherwise it is a semantic error.
Appropriate type conversion is also done by the semantic analysis phase.
5.2 Syntax Directed Translation
Definition: Semantic analysis is the third phase of the compiler which acts as an interface
between syntax analysis phase and code generation phase. It accepts the parse tree from
the syntax analysis phase and adds the semantic information to the parse tree and
performs certain checks based on this information. It also helps in constructing the symbol
table with appropriate information. Some of the actions performed by the semantic analysis
phase are:
Type checking i.e., number and type of arguments in function call and in function
header of function definition must be same. Otherwise, it results in semantic error.
Object binding i.e., associating variables with respective function definitions
Automatic type conversion of integers in mixed mode of operations
Helps intermediate code generation.
Display appropriate error messages
The semantics of a language can be described very easily using two notations namely:
Syntax directed definition (SDD)
Syntax directed translation (SDT)
First, we shall discuss syntax-directed definitions and then we shall discuss
syntax-directed translation.
E → E1 + T
Now, let us consider the production, its derivation and corresponding parse tree as shown
below:
Production: E → E1 + T
Derivation: E ⇒ E1 + T
Parse tree: a root E with the three children E1, + and T
For example, a simple SDD for the production E→ E1 + T can be written as shown
below:
Observe that a semantic rule is associated with production where the attribute name val is
associated with each non-terminal used in the rule.
Definition: A rule that describes how to compute the values of the attributes
associated with a grammar symbol using the attribute values of other grammar symbols is
called a semantic rule.
For example, consider the production E → E1 + T. The attribute value of E, which
is on the LHS of the production and is denoted by E.val, can be calculated by adding the attribute
values of the symbols E1 and T, which are on the RHS of the production and are denoted by E1.val and
T.val, as shown below:
E.val = E1.val + T.val // Semantic rule
The attribute value for a node in the parse tree may depend on information from its
children nodes or its sibling nodes or parent nodes. Based on how the attribute values are
obtained we can classify the attributes. Now, let us see “What are the different types or
classifications of attributes?” There are two types of attributes namely:
Synthesized attribute
Inherited attribute
Definition: An attribute of a non-terminal A whose value is derived from the attribute values of
its children or of A itself is called a synthesized attribute. Thus, the values of
synthesized attributes are passed up from the children to the parent node in a bottom-up
manner.
For example, consider the production: E → E1 + T. Suppose, the attribute value
val of E on LHS (head) of the production is obtained by adding the attribute values E1.val
and T.val appearing on the RHS (body) of the production as shown below:
Production        Semantic Rule
E → E1 + T        E.val = E1.val + T.val
Parse tree with attribute values: the root has E.val = 30 and its children are E1.val = 10, + and T.val = 20
Now, attribute val with respect to E appearing on head of the production is called
synthesized attribute. This is because, the value of E.val which is 30, is obtained from the
children by adding the attribute values 10 and 20 as shown in above parse tree.
Introduction to Compiler Design - 5.5
In the production, D stands for declaration, T stands for type such as int and V stands for
the variable sum as in above declaration. The production, semantic rule and parse tree
along with attribute values is shown below:
id.entry
On similar line, the value int stored in V.inh is transferred to its child id.entry and
hence entry is inherited attribute of id and attribute value is denoted by id.entry
Note: With the help of the annotated parse tree, it is very easy for us to construct SDD
for a given grammar.
Definition: A parse tree showing the attribute values at each node is called an annotated
parse tree. The terminals in the annotated parse tree can have only synthesized attribute
values, and they are obtained directly from the lexical analyzer. So, there are no semantic
rules in the SDD (Syntax Directed Definition) to get the lexical values into
the terminals of the annotated parse tree. The other nodes in the annotated parse tree may have
either synthesized or inherited attributes. Note: Terminals can never have inherited
attributes.
For example, consider the partial annotated tree shown below:
E.val = 30
E1.val = 10 + T.val = 20
In the above partial annotated parse tree, the attribute values 10, 20 and 30 are stored in
E1.val, T.val and E.val respectively.
S → En
E→E+T|T
T→T*F|F
F → (E) | digit
S → Tn
T→T*F|F
F → digit
On similar lines we can write the semantic rules for the following productions as shown
below:
Productions       Semantic rules
S → En            S.val = E.val
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
F → (E)           F.val = E.val
Now, the final SDD along with productions and semantic rules is shown below:
Productions       Semantic Rules
S → En            S.val = E.val
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → F             T.val = F.val
F → (E)           F.val = E.val
F → digit         F.val = digit.lexval
Example 5.2: Write the grammar and syntax directed definition for a simple desk
calculator and show annotated parse tree for the expression (3+4)*(5+6)
5.8 Syntax Directed Translation
On similar lines we can write the semantic rules for the following productions as shown
below:
Productions       Semantic rules
S → En            S.val = E.val
E → E1 + T        E.val = E1.val + T.val
E → E1 – T        E.val = E1.val – T.val
T → T1 / F        T.val = T1.val / F.val
E → T             E.val = T.val
F → (E)           F.val = E.val
So, the final SDD for simple desk calculator can be written as shown below:
Productions       Semantic Rules
S → En            S.val = E.val
E → E1 + T        E.val = E1.val + T.val
E → E1 - T        E.val = E1.val - T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → T1 / F        T.val = T1.val / F.val
T → F             T.val = F.val
F → (E)           F.val = E.val
F → digit         F.val = digit.lexval
The annotated parse tree for the expression (3+4)*(5+6) consisting of attribute values for
each non-terminal is shown below:
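[Figure: annotated parse tree for (3+4)*(5+6)n. The leaves digit.lexval = 3, 4, 5 and 6 give the F.val values; the two parenthesized sums give E.val = 7 and E.val = 11, so that T1.val = 7 and F.val = 11 at the multiplication node; this yields T.val = 77, E.val = 77 and S.val = 77, with n (EOF) as the last child of S.]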
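The way the semantic rules of the desk calculator produce the value 77 can be sketched in Python. This is only an illustration under stated assumptions: the class name `DeskCalculator` and its helper methods are hypothetical, the `while` loops stand in for the left-recursive productions (each iteration plays the role of one reduction by E → E1 + T, T → T1 * F and so on), and `/` is taken to be Python's real division, which the text does not specify.

```python
import re

class DeskCalculator:
    """Computes the synthesized attribute val while parsing (illustrative sketch)."""

    def __init__(self, text):
        # Tokens: single digits, operators, parentheses and the end marker n.
        self.tokens = re.findall(r"\d|[-+*/()n]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        self.pos += 1
        return tok

    def S(self):
        e_val = self.E()
        self.take("n")
        return e_val                      # S -> E n:  S.val = E.val

    def E(self):
        val = self.T()                    # E -> T:    E.val = T.val
        while self.peek() in ("+", "-"):
            op = self.take()
            t_val = self.T()
            # E -> E1 + T or E -> E1 - T: combine E1.val with T.val
            val = val + t_val if op == "+" else val - t_val
        return val

    def T(self):
        val = self.F()                    # T -> F:    T.val = F.val
        while self.peek() in ("*", "/"):
            op = self.take()
            f_val = self.F()
            # T -> T1 * F or T -> T1 / F: combine T1.val with F.val
            val = val * f_val if op == "*" else val / f_val
        return val

    def F(self):
        tok = self.take()
        if tok == "(":
            e_val = self.E()
            self.take(")")
            return e_val                  # F -> (E):   F.val = E.val
        if tok.isdigit():
            return int(tok)               # F -> digit: F.val = digit.lexval
        raise SyntaxError(f"unexpected token {tok!r}")

print(DeskCalculator("(3+4)*(5+6)n").S())   # 77
```

Every `return` corresponds to one semantic rule of the SDD, so the value travels upwards exactly as in the annotated parse tree.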
Now, the question is “How do we construct an annotated parse tree? In what order do we
evaluate attributes?”
If we want to evaluate an attribute of a node of a parse tree, it is necessary to evaluate
all the attributes upon which its value depends.
If all attributes are synthesized, then we must evaluate the attributes of all of its
children before we can evaluate the attribute of the node itself.
With synthesized attributes, we can evaluate attributes in any bottom-up order.
Whether the attributes are synthesized or inherited, there is no single order in which the
attributes have to be evaluated. There can be one or more orders in which the
evaluation can be done.
For an example of how an annotated parse tree can be constructed, refer to example 5.1.
Before proceeding further, let us see, “What is circular dependency when evaluating the
attribute value of a node in an annotated parse tree?”
Definition: If the attribute value of a parent node depends on the attribute value of child
node and vice-versa, then we say, there exists a circular dependency between the child
node and parent node. In this situation, it is not possible to evaluate the attribute of either
parent node or the child node since one value depends on another value.
For example, consider the non-terminal A with synthesized attribute A.s and non-
terminal B with inherited attribute B.i with following productions and semantic rules:
Note that the above two semantic rules are circular in nature (see above figure):
to compute A.s we require the value of B.i, and to compute the value of B.i we require
the value of A.s. So, it is impossible to evaluate either A.s or B.i
without first evaluating the other.
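The circular dependency between A.s and B.i can be detected mechanically. The sketch below is an illustration only: it uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later), and the helper name `evaluation_order` is hypothetical. Only the two dependencies stated in the text are modelled, not the concrete semantic rules.

```python
from graphlib import TopologicalSorter, CycleError

# Each key is an attribute instance; its value is the set of attribute
# instances that must be evaluated before it.
circular = {
    "A.s": {"B.i"},   # A.s needs B.i
    "B.i": {"A.s"},   # B.i needs A.s
}

def evaluation_order(dependencies):
    """Return one valid evaluation order, or None if the dependencies are circular."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None

print(evaluation_order(circular))                       # None
print(evaluation_order({"E.val": {"E1.val", "T.val"}})) # E.val comes last
```

For the non-circular rule E.val = E1.val + T.val an order exists, and E.val is always placed after both of its operands.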
Example 5.3: Obtain SDD and annotated parse tree for the following grammar using top-
down approach:
T→T*F|F
F → digit
Let us see, “What happens if a top-down parser is used to parse the input 3*5 using the
above grammar?” We have already seen in previous chapters that when we use a
top-down parser, we first have to eliminate left recursion and then do the parsing. So, the
given grammar is not suitable for a top-down parser because of left recursion. So, let us
remove the left recursion. The grammar obtained after removing left recursion is shown
below:
T → F T'
T' → * F T1'
T' → ϵ
F → digit
To construct SDD for the above grammar, let us obtain the derivation tree for the
expression 2*3 with some of the attributes as shown below:
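[Figure: partial annotated parse tree for 2*3, showing F.val = 2 as the left child of T, its sibling T', the second operand digit.lexval = 3, and the ϵ leaf.]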
Observe the following points from the above partial annotated parse tree:
The values 2 and 3 are moved upwards till we get the node F. Since the attribute value
of F is obtained from its child, the attribute of F denoted by val is a synthesized
attribute, and F.val can be obtained using a semantic rule for the production
F → digit as shown below:
Production Semantic Rule Type
F → digit F.val = digit.lexval synthesized
Now, we need to multiply 2*3. Observe that there is no node with 2, * and 3 as its
children. So, we perform 2*3 from the top, using the production T' → * F T1'.
Now, the question is how to transfer 2 to T', how to multiply 2*3 and how to store
the result 6 in T1'. These activities can be done as shown below:
1) Observe from the partial annotated parse tree and the above scenario that the first
operand 2, already present in F.val (the left child of T), must be transferred
to the right child T' using the production T → F T' as shown below:
Production        Semantic Rule        Type
T → F T'          T'.inh = F.val       Inherited
Multiplying 2*3 means that we need to compute T' * F
and store the result in T1'. That is, take the inherited attribute value T'.inh,
multiply it by the synthesized attribute value F.val and store the result in T1'.inh.
This can be done as shown below:
Production        Semantic Rule                  Type
T' → * F T1'      T1'.inh = T'.inh * F.val       Inherited
With T'.inh = 2 and F.val = 3, this gives T1'.inh = 6.
Observe that the node T1', which is an instance of T', produces ϵ, and so
its synthesized attribute value T'.syn can be obtained using the production T' → ϵ
and its semantic rule as shown below:
Production        Semantic Rule          Type
T' → ϵ            T'.syn = T'.inh        Synthesized
[Figure: partial annotated parse tree in which F.val = 2 is passed to T'.inh = 2 and the lower T' node derives ϵ.]
The synthesized value T'.syn = 6 of the lower node, which we call T1'.syn, is transferred to its parent
T', giving T'.syn = 6, using the production T' → * F T1' and the semantic rule T'.syn = T1'.syn.
The annotated parse tree that shows the value of T.val is shown below:
T.val = 6
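So, the final SDD for the given grammar can be written as shown below:
Productions       Semantic Rules                 Type
T → F T'          T'.inh = F.val                 Inherited
                  T.val = T'.syn                 Synthesized
T' → * F T1'      T1'.inh = T'.inh * F.val       Inherited
                  T'.syn = T1'.syn               Synthesized
T' → ϵ            T'.syn = T'.inh                Synthesized
F → digit         F.val = digit.lexval           Synthesized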
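The flow of the inherited attribute inh and the synthesized attribute syn in this example can be sketched as a small recursive-descent evaluator in Python. This is an illustration, not the book's own code: the function name `evaluate_T` and its nested helpers are hypothetical. An inherited attribute is modelled as a function argument and a synthesized attribute as a return value.

```python
def evaluate_T(tokens):
    """Evaluate a token list such as ['2', '*', '3'] for the grammar T -> F T'."""
    pos = 0

    def F():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("digit expected")
        lexval = int(tokens[pos])
        pos += 1
        return lexval                     # F -> digit:  F.val = digit.lexval

    def T_prime(inh):
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            f_val = F()
            t1_inh = inh * f_val          # T' -> * F T1':  T1'.inh = T'.inh * F.val
            return T_prime(t1_inh)        #                 T'.syn = T1'.syn
        return inh                        # T' -> epsilon:  T'.syn = T'.inh

    def T():
        f_val = F()
        return T_prime(f_val)             # T -> F T':  T'.inh = F.val, T.val = T'.syn

    result = T()
    if pos != len(tokens):
        raise SyntaxError("unexpected input after expression")
    return result

print(evaluate_T(["2", "*", "3"]))   # 6
```

The left operand travels down through the argument `inh`, and the finished product travels back up through the return values, which mirrors the arrows in the annotated parse tree.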
Example 5.4: Obtain SDD for the following grammar using top-down approach:
S → En
E→E+T|T
T→T*F|F
F → ( E ) | digit
and obtain annotated parse tree for the expression (3 + 4 ) * (5 + 6)n
Solution: The given grammar has left recursion and hence it is not suitable for top-down
parser. To make it suitable for top-down parsing, we have to eliminate left recursion.
After eliminating left recursion (see example 2.13 for details), the following grammar is
obtained:
S → En
E → T E'
E' → + T E1' | ϵ
T → F T'
T' → * F T1' | ϵ
F → ( E ) | digit
Note: The variables S, E, T and F are present both in the given grammar and in the grammar
obtained after eliminating left recursion. So, only for the variables S, E, T and F we use the
attribute name v (stands for val), and for all other variables we use s for the synthesized
attribute and i for the inherited attribute.
The productions S → En, F → ( E ) and F → digit do not have left recursion, and they are retained
in the grammar obtained after eliminating left recursion. So, we can compute the attribute value of the LHS (head)
from the attribute values of the RHS (i.e., the children) for these productions, and hence they
have synthesized attributes. The productions, semantic rules and types of the attributes are
shown below:
Production        Semantic Rule        Type
S → En            S.v = E.v            Synthesized
F → (E)           F.v = E.v            Synthesized
F → d             F.v = d.lexval       Synthesized
Consider the following productions and write the annotated parse tree for the expression
2*3 with flow of information (See example 5.3 for detailed explanation) as shown below:
Productions: T → F T', T' → * F T1' | ϵ, F → digit
Annotated parse tree for 2*3: F.val = 2 is passed to its sibling as T'.inh = 2, the result comes back as T'.syn = 6, and so T.val = 6.
The above productions and their respective semantic rules along with the type of attribute
are shown below:
Production        Semantic rules
T → F T'          T'.inh = F.val
                  T.val = T'.syn
Exactly similar to the above we can write the semantic rules for the given productions as
shown below:
Production        Semantic rules
E → T E'          E'.inh = T.val
                  E.val = E'.syn
Combining all productions and semantic rules we can write the final SDD as shown
below:
Production Semantic Rule Type
S→En S.v = E.v Synthesized
The annotated parse tree that shows the values of each attributed value while evaluating
the following expression (3 + 4) * (5 + 6) is shown below:
[Figure: annotated parse tree for (3 + 4) * (5 + 6)n. The leaves carry d.lexval = 3, 4, 5 and 6. Inside the first parentheses, T.v = 3 is passed down as E'.i = 3, added to T.v = 4 to give E1'.i = 7, and returned as E'.s = 7, so E.v = 7; the second parentheses give E.v = 11 in the same way. At the top level, F.v = 7 is passed down as T'.i = 7, multiplied by F.v = 11 to give T1'.i = 77, and returned as T'.s = 77, so that T.v = 77, E'.i = E'.s = 77, E.v = 77 and S.v = 77.]
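The complete evaluation for this example can be sketched in Python along the lines of the grammar obtained after eliminating left recursion. This is a hedged illustration: the class name `LAttributedEvaluator` and its method names are hypothetical, and the rules for E' are assumed to be the additive analogue of the rules for T' (that is, E1'.i = E'.i + T.v, E'.s = E1'.s and, for E' → ϵ, E'.s = E'.i).

```python
class LAttributedEvaluator:
    """Recursive-descent sketch: inherited attributes are arguments, synthesized ones are results."""

    def __init__(self, text):
        self.toks = [c for c in text if not c.isspace()]
        self.pos = 0

    def look(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def match(self, tok):
        if self.look() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.look()!r}")
        self.pos += 1

    def S(self):                              # S -> E n
        e_v = self.E()
        self.match("n")
        return e_v                            # S.v = E.v

    def E(self):                              # E -> T E'
        t_v = self.T()
        return self.E_prime(t_v)              # E'.i = T.v, E.v = E'.s

    def E_prime(self, i):
        if self.look() == "+":                # E' -> + T E1'
            self.match("+")
            t_v = self.T()
            return self.E_prime(i + t_v)      # E1'.i = E'.i + T.v, E'.s = E1'.s (assumed)
        return i                              # E' -> epsilon: E'.s = E'.i (assumed)

    def T(self):                              # T -> F T'
        f_v = self.F()
        return self.T_prime(f_v)              # T'.i = F.v, T.v = T'.s

    def T_prime(self, i):
        if self.look() == "*":                # T' -> * F T1'
            self.match("*")
            f_v = self.F()
            return self.T_prime(i * f_v)      # T1'.i = T'.i * F.v, T'.s = T1'.s
        return i                              # T' -> epsilon: T'.s = T'.i

    def F(self):
        tok = self.look()
        if tok == "(":                        # F -> ( E )
            self.match("(")
            e_v = self.E()
            self.match(")")
            return e_v                        # F.v = E.v
        if tok is not None and tok.isdigit(): # F -> d
            self.pos += 1
            return int(tok)                   # F.v = d.lexval
        raise SyntaxError(f"unexpected token {tok!r}")

print(LAttributedEvaluator("(3 + 4) * (5 + 6)n").S())   # 77
```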
5.5 Evaluation order for SDD’s
The evaluation order to find attribute values in a parse tree using semantic rules can be
easily obtained with the help of dependency graph. While annotated parse tree shows the
values of attributes, a dependency graph helps us to determine how those values can be
computed.
5.5.1 Dependency graphs
Now, let us see “What is a dependency graph?”
Definition: A graph that shows the flow of information which helps in the computation of
the various attribute values in a particular parse tree is called a dependency graph. An edge
from one attribute instance to another attribute instance indicates that the attribute value
of the first is needed to compute the attribute value of the second.
[Figure: dependency graph laid over the parse tree for E → E1 + T, with val nodes attached to E, E1 and T.]
In the above figure, the dotted lines, along with the nodes connected to them, represent the
parse tree. The shaded nodes labeled val, together with the solid arrows that originate at one
node and end at another, form the dependency graph.
Example 5.5: Obtain the dependency graph for the annotated parse tree obtained in
example 5.3
The dependency graph for the annotated parse tree obtained in example 5.3 can be
written as shown below:
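[Figure: dependency graph for the annotated parse tree of example 5.3, with nodes numbered 1 to 9; nodes 1 and 2 are the lexval attributes of the digit leaves and node 9 is T.val.]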
Observe the following points from the above dependency graph
Nodes identified by numbers 1 and 2 represent the attribute lexval which is associated
with two leaves labeled digit
Nodes 3 and 4 represent the attribute val associated with two nodes labeled F.
The edge from node 1 to node 3 and from node 2 to node 4 indicates that in the
semantic rule, the attribute value F.val is obtained using attribute value digit.lexval
Nodes 5 and 6 represent the inherited attribute T'.inh which is associated with each of
the non-terminal T'
The edge from 3 to 5 indicates that T'.inh is obtained from its sibling F.val, and hence
T' has an inherited attribute named inh
The edge from 5 to 6 and another edge from 4 to 6 indicate that the two attribute
values T'.inh and F.val are multiplied to get the attribute value at node 6
The edge from 6 to 7 indicates that there is an ϵ-production, where the attribute value
T'.syn is obtained from T'.inh of the same node
Node 8 is obtained from node 7 and node 9 is obtained from
node 8; nodes 7, 8 and 9 are all synthesized attributes.
Finally, T.val at node 9 is obtained from its child at node 8.
Definition: A topological sort of a directed graph is a sequence of its nodes in which, for
every edge from A to B, node A appears before node B. Using the
dependency graph, we can write the order in which the various attribute
values in the parse tree can be evaluated. This ordering is nothing but a topological sort of the graph.
There may be one or more orders in which to evaluate the attribute values. If the dependency graph
has an edge from A to B, then the attribute corresponding to A must be evaluated before
the attribute at node B.
Example 5.6: Give the topological sort of the dependency graph obtained in example 5.5
(nodes 1 to 9).
Solution: The various topological sorts that can be obtained using the dependency graph
are shown below:
Topological sort 1: 1, 2, 3, 4, 5, 6, 7, 8, 9
Topological sort 2: 1, 3, 2, 4, 5, 6, 7, 8, 9
Topological sort 3: 1, 3, 5, 2, 4, 6, 7, 8, 9 and so on
Note: If there is a cycle in the dependency graph, topological sort does not exist.
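The topological sorts listed above can be reproduced with a short Python sketch. It uses Kahn's algorithm, which is one standard way to compute a topological sort and is chosen here only for illustration. The edge list is read off the description of the dependency graph in example 5.5, and the function name `topological_sort` is hypothetical.

```python
import heapq

def topological_sort(nodes, edges):
    """Kahn's algorithm; among the ready nodes, the smallest number is taken first."""
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for a, b in edges:                 # edge a -> b: a must be evaluated before b
        successors[a].append(b)
        indegree[b] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        n = heapq.heappop(ready)
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:       # all predecessors of m are done
                heapq.heappush(ready, m)
    if len(order) != len(indegree):
        raise ValueError("cycle in dependency graph: no topological sort exists")
    return order

# Edges of the dependency graph described in example 5.5.
EDGES = [(1, 3), (2, 4), (3, 5), (5, 6), (4, 6), (6, 7), (7, 8), (8, 9)]

print(topological_sort(range(1, 10), EDGES))   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Choosing a different ready node at each step gives the other valid orders, such as 1, 3, 5, 2, 4, 6, 7, 8, 9. When the graph contains a cycle, some nodes never become ready, and the function raises an error instead of returning an order.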
Given an SDD, it is very difficult to tell whether there exists a
cycle in the dependency graph corresponding to some parse tree. But, in practice, translations
can be implemented using classes of SDDs that guarantee an evaluation order, since
they do not permit dependency graphs with cycles.
Now, let us see “What are the different classes of SDDs that guarantee an evaluation order?”
The two classes of SDDs that guarantee an evaluation order are:
S-attributed definition
L-attributed definition
Now, let us see “What is S-attributed definition?”
postorder (N)
{
    for (each child C of N from left)
        postorder(C)
    end for
    evaluate the attributes associated with node N
}
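The postorder traversal can be tried out on a concrete parse tree. The following Python sketch is an illustration under stated assumptions: the class `ParseNode` and the helpers `evaluate`, `postorder` and `factor` are hypothetical names, the semantic rules are those of the desk-calculator SDD, and the tree is built by hand for the input 3*5+4n.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ParseNode:
    symbol: str                                   # grammar symbol, e.g. "E", "+", "digit"
    children: list = field(default_factory=list)
    lexval: Optional[int] = None                  # set only on digit leaves by the lexer
    val: Optional[int] = None                     # synthesized attribute

def evaluate(node):
    """Apply the semantic rule of the production used at this node."""
    kids = [c.symbol for c in node.children]
    if not kids:
        return                                    # terminals have no semantic rule
    if kids == ["digit"]:
        node.val = node.children[0].lexval        # F -> digit
    elif len(kids) == 1:
        node.val = node.children[0].val           # E -> T, T -> F
    elif kids[1] == "n":
        node.val = node.children[0].val           # S -> E n
    elif kids[0] == "(":
        node.val = node.children[1].val           # F -> ( E )
    elif kids[1] == "+":
        node.val = node.children[0].val + node.children[2].val   # E -> E1 + T
    elif kids[1] == "*":
        node.val = node.children[0].val * node.children[2].val   # T -> T1 * F
    else:
        raise ValueError(f"no semantic rule for {node.symbol} -> {' '.join(kids)}")

def postorder(node):
    for child in node.children:                   # visit the children from the left
        postorder(child)
    evaluate(node)                                # then evaluate the node itself

def factor(n):
    """Build the subtree F -> digit for the digit n."""
    return ParseNode("F", [ParseNode("digit", lexval=n)])

# Parse tree for 3*5+4n
t = ParseNode("T", [ParseNode("T", [factor(3)]), ParseNode("*"), factor(5)])
e = ParseNode("E", [ParseNode("E", [t]), ParseNode("+"), ParseNode("T", [factor(4)])])
s = ParseNode("S", [e, ParseNode("n")])

postorder(s)
print(s.val)   # 19
```

Because every attribute is synthesized, all children of a node already carry their values when `evaluate` reaches the node.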
Definition: An SDD without any side effects is called attribute grammar. The semantic
rules in an attribute grammar define the value of an attribute purely in terms of the values
of other attributes and constants. Attribute grammars have the following properties:
They do not have any side effects
They allow any evaluation order consistent with dependency graph.
For example, the SDD obtained from example 5.1 is an attribute grammar. For
simplicity, the SDDs that we have seen so far have semantic rules without side effects.
But, in practice, it is convenient to allow SDDs to have limited side effects, such as
printing the result computed by a desk calculator or interacting with the symbol table, and so
on.
L-attributed definition is the second class of SDD. The idea behind the class is that,
between the attributes associated with a production body, dependency graph edges can
go only from left to right, but not from right to left; hence the name L-attributed. Now, formally,
let us see “What is an L-attributed definition?”
For example, the SDD shown in example 5.3 is L-attributed. To see why, let us consider
the following productions and semantic rules:
Definition: The main job of a semantic rule is to compute the attribute value of each
non-terminal in the corresponding parse tree. Any activity performed other than
computing the attribute value is treated as a side effect in an SDD.
For example, attribute grammars have no side effects and allow any evaluation
order consistent with dependency graph. But, translation schemes impose left-to-right
evaluation and allow semantic actions to contain any program fragment. In practice,
translation involves side effects:
printing the result computed by a desk calculator
Interacting with symbol table. That is, a code generator might enter the type of an
identifier into a symbol table etc.
Now, let us see “How to control the side effects in SDD?” The side effects in SDD can be
controlled in one of the following ways:
Permitting side effects when attribute evaluation based on any topological sort of the
dependency graph produces a correct translation
Impose constraints in the evaluation order so that the same translation is produced for
any allowable order
For example, consider the SDD given in example 5.1. This SDD does not have any side
effects. Now, let us consider the first semantic rule and the corresponding production shown
below:
Production Semantic rule
S → En S.val = E.val
Let us modify the semantic rule and introduce a side effect by printing the value of E.val
as shown below:
Production Semantic rule
S → En print ( E.val)
The complete SDD of desk calculator with side effects is shown below:
Example 5.7: Write the grammar and SDD for a simple desk calculator with side effect
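Solution: The SDD for a simple desk calculator, with the side effect of printing the value
of an attribute after evaluation, can be written as shown below:
Productions       Semantic Rules
S → En            print(E.val)
E → E1 + T        E.val = E1.val + T.val
E → E1 - T        E.val = E1.val - T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → T1 / F        T.val = T1.val / F.val
T → F             T.val = F.val
F → (E)           F.val = E.val
F → digit         F.val = digit.lexval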
Example 5.8: Write the SDD for a simple type declaration and write the annotated parse
tree and the dependency graph for the declaration “float a, b, c”
Solution: The grammar for a simple type declaration can be written as shown below:
D→TL
T → int | float
L → L1 , id | id
Consider the parse tree for the declaration: “float a, b, c” where a, b and c are identifiers
represented by id1, id2 and id3 respectively along with partial annotated parse tree
showing the direction of evaluations:
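[Figure: partial annotated parse tree for float a, b, c. At the root D, T.type = float is copied to L.inh = float; each L node passes the type down with L1.inh = L.inh and supplies it to the identifier on its right (id3.entry, then id2.entry), and the lowest L node supplies it to id1.entry. The numbers in the figure show the direction of evaluation.]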
Finally, the SDD for the grammar can be written by looking into dependency graph as
shown below:
Productions       Semantic Rules
D → T L           L.inh = T.type
T → int           T.type = integer
T → float         T.type = float
L → L1 , id       L1.inh = L.inh
                  Addtype(L.inh, id.entry)
L → id            Addtype(L.inh, id.entry)
Note: Thus, we have an L-attributed definition with side effect of adding the type into
symbol table for the corresponding identifier. The dependency graph for type declaration
statement “float a, b, c” is shown below:
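[Figure: dependency graph for the declaration float a, b, c, with numbered nodes; node 1 is the entry attribute of id1.]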
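The SDD for type declarations, including its side effect on the symbol table, can be sketched in Python. The function names `addtype`, `type_of`, `id_list` and `declare` are hypothetical, and the symbol table is simplified to a dictionary from identifier names to type names; a real symbol-table entry would hold more than the type.

```python
def addtype(table, type_name, entry):
    """Side effect: record the type of the identifier in the symbol table."""
    table[entry] = type_name

def type_of(token):
    if token == "int":
        return "integer"                   # T -> int:    T.type = integer
    if token == "float":
        return "float"                     # T -> float:  T.type = float
    raise SyntaxError(f"unknown type {token!r}")

def id_list(table, tokens, inh):
    """L -> L1 , id | id, where tokens holds the identifiers and the commas."""
    if len(tokens) == 1:
        addtype(table, inh, tokens[0])     # L -> id:  Addtype(L.inh, id.entry)
        return
    if len(tokens) < 3 or tokens[-2] != ",":
        raise SyntaxError("expected a comma-separated identifier list")
    id_list(table, tokens[:-2], inh)       # L -> L1 , id:  L1.inh = L.inh
    addtype(table, inh, tokens[-1])        #                Addtype(L.inh, id.entry)

def declare(text):
    """D -> T L: returns the symbol table built for one declaration."""
    tokens = text.replace(",", " , ").split()
    if not tokens:
        raise SyntaxError("empty declaration")
    table = {}
    t_type = type_of(tokens[0])
    id_list(table, tokens[1:], inh=t_type) # L.inh = T.type
    return table

print(declare("float a, b, c"))   # {'a': 'float', 'b': 'float', 'c': 'float'}
```

The type travels down the list through the parameter `inh`, which plays the role of the inherited attribute L.inh, while `addtype` is the only place where a side effect occurs.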
Example 5.9: Write the SDD for a simple desk calculator. Write the annotated parse tree
for the expression 3*5+4n
Solution: The SDD for a simple desk calculator is shown below: (See example 5.2 for
detailed explanation)
S.val = 19
  E.val = 19
    E1.val = 15
      T.val = 15
        T1.val = 3
          F.val = 3
            digit.lexval = 3
        *
        F.val = 5
          digit.lexval = 5
    +
    T.val = 4
      F.val = 4
        digit.lexval = 4
  n (EOF)
Definition: The Syntax Directed Translation (in short SDT) is a context free grammar
with embedded semantic actions. The semantic actions are nothing but the sequence of
steps or program fragments that will be carried out when that production is used in the
derivation. The SDTs are used:
To build syntax trees for programming constructs.
To translate infix expressions into postfix notation
To evaluate expressions
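As an illustration of semantic actions embedded in a grammar, the translation of infix expressions into postfix notation can be sketched in Python. The translation scheme assumed here is not given in the text; it is the usual one for digits with + and -, namely E → T R, R → + T { print('+') } R | - T { print('-') } R | ϵ and T → digit { print(digit) }. The function name `infix_to_postfix` is hypothetical, and the output is collected in a list instead of being printed.

```python
def infix_to_postfix(text):
    """Translate an expression over single digits, + and - into postfix notation."""
    tokens = [c for c in text if not c.isspace()]
    out = []
    pos = 0

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos].isdigit():
            out.append(tokens[pos])        # action { print(digit) }
            pos += 1
        else:
            raise SyntaxError("digit expected")

    def rest():
        nonlocal pos
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            term()
            out.append(op)                 # action { print(op) } runs after the right operand

    term()
    rest()
    if pos != len(tokens):
        raise SyntaxError("unexpected input after expression")
    return "".join(out)

print(infix_to_postfix("6 + 4 - 2"))   # 64+2-
```

The position of each action inside the production body decides when it runs, which is why the operator is emitted only after its second operand.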
The main application of SDT in this section is the construction of syntax trees. Since
some compilers use syntax trees as an intermediate representation, a common form of SDD
turns its input string into a syntax tree. The syntax tree is also called an abstract syntax
tree. The parse tree is also called a concrete syntax tree.
Before proceeding further, let us see “What is a syntax tree? What is the difference
between syntax tree and parse tree?”
Definition: A syntax tree also called abstract syntax tree is a compressed form of parse
tree which is used to represent language constructs. In a syntax tree for an expression,
each interior node represents an operator and the children of the node represent the
operands of the operator. In general, any programming construct can be handled by
making up an operator for the construct and treat semantically meaningful components of
that construct as operands. For example, the syntax tree for the expression “6 + 4 – 2” is
shown below:
Note 1: In a syntax tree, all the operators and keywords appear as interior nodes.
Note 2: In a parse tree all operators and keywords appear as leaf nodes.
Example 5.10: For the following grammar show the parse tree and syntax tree for the
expression 3 *5 + 4:
E→E+T|E–T|T
T→ T*F | T/F | F
F→ (E) | digit | id
Solution: The annotated parse tree for the expression 3*5+4 is shown in example 5.9.
Now, the parse tree and syntax tree for the expression 3*5+4 are shown below:
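[Figure: parse tree (left) and syntax tree (right) for 3*5+4. In the parse tree, E derives E + T, the left E derives T1 * F with leaves 3 and 5, and the right T derives F with leaf 4. In the syntax tree, the root + has the children * and 4, and * has the children 3 and 5.]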
Now, let us see “How to construct syntax trees?” The syntax tree for expressions can be
constructed using two kinds of SDDs, namely:
S-attributed definition which is used for bottom-up parsing
L-attributed definition which is used for top-down parsing
Now, let us see “How to construct semantic rules that help us to create syntax trees for
expressions?” Each node in a syntax tree represents a programming construct and the
children of the node represent the meaningful components of that construct. A syntax-tree
node representing an expression E1 + E2 has the label + and two children representing the
sub-expressions E1 and E2.
The nodes of a syntax tree can be implemented by creating objects, where each object contains two or
more fields. The two functions that are useful in constructing the syntax trees are shown below:
Leaf(op, val) : This function is called only for terminals and creates
a leaf node containing two fields, namely:
op field holds the label for the node
val field holds the lexical value obtained from the lexical analyzer
Node(op, c1, c2, c3, …, cn) : This function is called to create an interior node with
the following fields:
op field holds the label for the node
c1, c2, c3, …, cn are references (or pointers) to the children of the node labeled op.
Example 5.11: Obtain the semantic rules to construct a syntax tree for simple arithmetic
expressions using bottom up approach
E→E+T|E–T|T
T→ T*F | T/F | F
F→ (E) | digit | id
It is easy to construct syntax trees with the help of SDDs (for details of the SDD for
arithmetic expressions, see example 5.2) as shown below:
For a production of the form E → E1 + E2, we have to use a rule that creates a node
with '+' in the op field and two children E1.node and E2.node that represent the
sub-expressions. This can be done by creating a new node using the function
Node() as shown below:
E → E1 + E2        E.node = new Node('+', E1.node, E2.node)
On similar lines, we can write the semantic rules for other productions consisting of
the operators +, –, * and / as shown below:
For the production of the form E → T and E→ ( T ) , no node is created, since E.node
is the same as that of T.node. So, semantic rules for the productions of the above form
can be written as shown below:
Production Semantic rules
E→T E.node = T.node
T→F T.node = F.node
F→(E) F.node = E.node
For a production of the form A → a, where a is a terminal, use the function Leaf()
with a and a.entry as the parameters if a is an identifier, or with a and a.val as the parameters if a
is a number or digit. So, the semantic rules for productions of the above form can be
written as shown below:
Production Semantic rules
F → digit F.node = new Leaf(digit, digit.val)
F → id F.node = new Leaf(id, id.entry)
So, the final set of SDDs used to construct syntax trees for simple expressions can be
written as shown below:
Production        Semantic rules
E → E1 + T        E.node = new Node('+', E1.node, T.node)
E → E1 – T        E.node = new Node('–', E1.node, T.node)
E → T             E.node = T.node
T → T1 * F        T.node = new Node('*', T1.node, F.node)
T → T1 / F        T.node = new Node('/', T1.node, F.node)
T → F             T.node = F.node
F → (E)           F.node = E.node
F → digit         F.node = new Leaf(digit, digit.val)
F → id            F.node = new Leaf(id, id.entry)
Note: The above semantic rules show a left-recursive grammar that is S-attributed (so all
attributes are synthesized). Now, let us see how to create a syntax tree using above
semantic rules.
Solution: Consider the arithmetic expression “a – 4 + c”. Here, we make use of the
following functions to create the nodes of syntax trees for expressions with binary
operators.
Node(op, left, right) creates an operator node with label op and two fields containing
pointers to the left and right sub-trees
Leaf(id, entry) creates an identifier node with label id and a field containing entry,
which is a pointer to the symbol-table entry for the identifier.
Leaf(digit, val) creates a number node with label digit and a field containing val,
which represents the value of the number.
The following sequence of function calls creates the syntax tree for the arithmetic
expression a – 4 + c as shown below:
Step 1: Create a leaf node for identifier a using the function Leaf() as shown below:
p1 = new Leaf(id, id.entry);
Here p1 points to a leaf labeled id whose second field points to the symbol-table entry for a.
Step 2: Create a leaf node for digit 4 using the function Leaf() as shown below:
p2 = new Leaf(digit, 4);
Here p2 points to a leaf labeled digit with the value 4.
Step 3: Create an interior node by passing '–' as the first argument and pointers to the
first two leaves p1 and p2 with the help of the function Node() as shown below:
p3 = new Node('–', p1, p2);
Here p3 points to a node labeled – whose children are the leaves p1 and p2.
Step 4: Create a leaf node for identifier c using the function Leaf() as shown below:
p4 = new Leaf(id, id.entry);
Here p4 points to a leaf labeled id whose second field points to the symbol-table entry for c.
Step 5: Create an interior node by passing '+' as the first argument, the pointer to the
left sub-tree identified by p3 and the pointer to the right sub-tree identified by p4 with the
help of the function Node() as shown below:
p5 = new Node('+', p3, p4);
Here p5 points to the root, labeled +, whose left child is the sub-tree p3 and whose right child is the leaf p4.
Thus, to get the above syntax tree for the expression “a – 4 + c”, the steps that are
used are:
p1 = new Leaf(id, id.entry);    (entry for a)
p2 = new Leaf(digit, 4);
p3 = new Node('–', p1, p2);
p4 = new Leaf(id, id.entry);    (entry for c)
p5 = new Node('+', p3, p4);
Note: If the above rules are evaluated during a postorder traversal of the parse tree, or
with reductions during a bottom-up parse, then the above sequence of steps ends with p5
pointing to the root of the constructed syntax tree.
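The five steps can be reproduced with a minimal Python sketch of Leaf() and Node(). The class layout and the helper `show` are assumptions made for illustration, and a symbol-table entry is stood in for by the identifier's name.

```python
class Leaf:
    def __init__(self, op, val):
        self.op = op                  # label of the leaf, e.g. "id" or "digit"
        self.val = val                # lexical value, or (here) the identifier's name

class Node:
    def __init__(self, op, *children):
        self.op = op                  # label of the interior node, e.g. "+"
        self.children = children      # references to the child nodes

def show(n):
    """Render a syntax tree in prefix form so that its shape can be inspected."""
    if isinstance(n, Leaf):
        return str(n.val)
    return "(" + " ".join([n.op] + [show(c) for c in n.children]) + ")"

p1 = Leaf("id", "a")                  # step 1: leaf for identifier a
p2 = Leaf("digit", 4)                 # step 2: leaf for the number 4
p3 = Node("-", p1, p2)                # step 3: interior node for a - 4
p4 = Leaf("id", "c")                  # step 4: leaf for identifier c
p5 = Node("+", p3, p4)                # step 5: root for (a - 4) + c

print(show(p5))   # (+ (- a 4) c)
```

The printed form makes the left associativity visible: the subtraction is the left child of the addition.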
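The parse tree and the annotated parse tree depicting the construction of a syntax tree for
the arithmetic expression “a – 4 + c” are shown below:
[Figure: annotated parse tree in which each E.node and T.node points to the corresponding
node of the syntax tree; the root E.node points to the + node, whose children are the – node
(with leaves id, to entry for a, and digit 4) and the leaf id (to entry for c)]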
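The five steps can be tried out with a minimal Python sketch. The class names mirror the functions Leaf() and Node() described above; the symbol table is a hypothetical dictionary in which the "entry" of an identifier is simply its name, and the postorder() helper is added only to make the shape of the tree visible.

```python
class Leaf:
    """Leaf of a syntax tree: a label ('id' or 'digit') plus one value field."""
    def __init__(self, label, value):
        self.label = label
        self.value = value   # symbol-table entry for an id, numeric value for a digit


class Node:
    """Interior node: an operator label and pointers to the two sub-trees."""
    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right


def postorder(n):
    """Return the labels met in a postorder walk of the tree rooted at n."""
    if isinstance(n, Leaf):
        return [str(n.value)]
    return postorder(n.left) + postorder(n.right) + [n.op]


# Hypothetical symbol table: the entry of an identifier is just its name.
symtab = {"a": "a", "c": "c"}

p1 = Leaf("id", symtab["a"])     # Step 1
p2 = Leaf("digit", 4)            # Step 2
p3 = Node("-", p1, p2)           # Step 3
p4 = Leaf("id", symtab["c"])     # Step 4
p5 = Node("+", p3, p4)           # Step 5: p5 is the root

print(" ".join(postorder(p5)))   # a 4 - c +
```

The left child of the root is the sub-tree for a – 4, which reflects the left associativity of the two operators.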
Example 5.13: Obtain the semantic rules to construct a syntax tree for simple arithmetic
expressions using top-down approach with operators + and –
Solution: The grammar to generate an arithmetic expression with two operators + and –
is shown below:
E → E + T | E – T | T
T → (E) | digit | id
The above grammar is not suitable for top-down parser since it has left recursion. After
eliminating left recursion, we get the following grammar:
E → T E'
E' → + T E' | – T E' | ϵ
T → (E) | digit | id
To construct the syntax tree, first obtain the SDD for the above grammar. The SDD for the
above grammar is shown below (for details refer to example 5.4)
Production Semantic Rule
The functions Leaf() and Node() are used to create the syntax tree. Now, syntax trees can be
constructed during top-down parsing using the following semantic rules:
[Figure: dependency graph with numbered attribute nodes, including E.node (13) and
id.entry (7)]
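As an illustration of top-down construction for the grammar E → T E', E' → + T E' | – T E' | ϵ, T → (E) | digit | id, here is a small recursive-descent sketch in Python. It assumes the usual textbook arrangement in which E' receives the tree built so far as an inherited attribute and returns the finished tree as a synthesized attribute; the names inh and syn in the comments, the token-list input format and the show() helper are assumptions made for this sketch and are not taken from the SDD table of this example.

```python
class Leaf:
    def __init__(self, label, value):
        self.label, self.value = label, value


class Node:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right


def show(n):
    """Fully parenthesised rendering, used only to make the tree shape visible."""
    if isinstance(n, Leaf):
        return str(n.value)
    return "(" + show(n.left) + " " + n.op + " " + show(n.right) + ")"


def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def E():                        # E -> T E'
        t_node = T()
        return E_prime(t_node)      # E'.inh = T.node ; E.node = E'.syn

    def E_prime(inh):               # E' -> + T E1' | - T E1' | epsilon
        if peek() in ("+", "-"):
            op = advance()
            t_node = T()
            # E1'.inh = new Node(op, E'.inh, T.node) ; E'.syn = E1'.syn
            return E_prime(Node(op, inh, t_node))
        return inh                  # E' -> epsilon : E'.syn = E'.inh

    def T():                        # T -> ( E ) | digit | id
        tok = advance()
        if tok == "(":
            node = E()
            if advance() != ")":
                raise SyntaxError("')' expected")
            return node
        if tok.isdigit():
            return Leaf("digit", int(tok))
        return Leaf("id", tok)

    return E()


tree = parse(["a", "-", "4", "+", "c"])
print(show(tree))   # ((a - 4) + c)
```

Although the parse tree of the transformed grammar leans to the right, passing the partial tree down through the inherited attribute yields the same left-leaning syntax tree as the bottom-up construction.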
Example 5.14: Obtain the semantic rules to construct a syntax tree for simple arithmetic
expressions with operators –, +, * and / using top-down approach
Solution: The grammar to generate an arithmetic expression with the operators +, –, * and /
is shown below:
E → E + T | E – T | T
T → T * F | T / F | F
F → (E) | digit | id
The above grammar is not suitable for top-down parser since it has left recursion. After
eliminating left recursion, we get the following grammar:
E → T E'
E' → + T E' | – T E' | ϵ
T → F T'
T' → * F T' | / F T' | ϵ
F → (E) | digit | id
To construct syntax tree, first obtain the SDD for the above grammar. The SDD for the
above grammar is shown below (For details refer example 5.4)
Now, let us see “What is the use of inherited attributes?” Top-down parsing requires a
grammar without left recursion, so if the grammar has left recursion, we have to eliminate
it. The resulting grammar needs inherited attributes. Inherited attributes are useful when
the structure of the parse tree differs from the syntax tree for the specified input. The
attributes can then be used to carry information from one part of the parse tree to another.
But sometimes, even though the grammar does not have left recursion, the language itself
demands inherited attributes. This can be explained by considering the array type as shown
below:
Example 5.15: “Give the syntax directed translation of type int [2][3] and also give the
semantic rules for the respective productions”
Solution: The grammar for multi-dimensional arrays can be written as shown below:
T → B C
B → int
B → float
C → [num] C1
C → ϵ
The parse tree to get the string int [2][3] is shown below:
[Figure: partial annotated parse tree for int [2][3]; B derives int, and the value integer is
passed down to the C nodes, e.g. C1.b = integer below [ 2 ]]
Observe the following points from the above partial annotated parse tree:
The type int is moved to the parent B, so the attribute t is synthesized. The productions
and equivalent semantic rules are shown below:
Production          Semantic rule
B → int             B.t = integer
B → float           B.t = float
The type integer has to be transferred from B.t on the left to C on the right using the
production T → B C, as shown by the arrow mark. Since the value comes from a sibling, it
is an inherited attribute, denoted by b. The production and equivalent semantic rule are
shown below:
Production          Semantic rule
T → B C             C.b = B.t
Now, the attribute value of C.b moves down and is copied into C1 using the production
C → [num] C1. Since the value comes from the parent, it must be inherited. The equivalent
semantic rule is shown below:
Production          Semantic rule
C → [num] C1        C1.b = C.b
Because of the production C → ϵ, the rightmost C in the above tree takes its
synthesized value C.t from its own inherited attribute value C.b. The semantic
rule is shown below:
Production          Semantic rule
C → ϵ               C.t = C.b
Now, the synthesized attribute value integer has to be moved from the rightmost
non-terminal up to the root node as shown below in the partial annotated parse tree:
[Figure: partial annotated parse tree for int [2][3] in which C1.b = integer is passed down
and the inner C carries C.t = array(3, integer)]
The attribute value C1.t is moved upwards to C.t using the production C → [num] C1.
The semantic rule can be written as shown below:
Production          Semantic rule
C → [num] C1        C.t = array(num.val, C1.t)
Finally, the attribute value of C.t is moved to T.t using the production T → B C. The
equivalent semantic rule is shown below:
Production          Semantic rule
T → B C             T.t = C.t
So, the final annotated parse tree is shown below:
[Figure: final annotated parse tree with root T.t = array(2, array(3, integer)), with
C1.b = integer below [ 2 ] and C.t = array(3, integer) for the inner C]
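The flow of the inherited attribute b downwards and the synthesized attribute t upwards can be sketched in Python as follows. Each function stands for one non-terminal of the grammar; the input format (a basic type name plus a list of dimension values) and the tuple representation of array types are assumptions made only for this sketch.

```python
def B(tok):
    # B -> int   : B.t = integer
    # B -> float : B.t = float
    return {"int": "integer", "float": "float"}[tok]


def C(dims, b):
    """dims holds the num.val values still to be consumed; b is the inherited C.b."""
    if not dims:                       # C -> epsilon : C.t = C.b
        return b
    num_val, rest = dims[0], dims[1:]
    c1_t = C(rest, b)                  # C1.b = C.b (passed down unchanged)
    return ("array", num_val, c1_t)    # C.t = array(num.val, C1.t)


def T(basic, dims):                    # T -> B C
    b_t = B(basic)
    return C(dims, b_t)                # C.b = B.t ; T.t = C.t


def show(t):
    """Render a type in the array(n, t) notation used in the text."""
    if isinstance(t, tuple):
        return "array(%d, %s)" % (t[1], show(t[2]))
    return t


print(show(T("int", [2, 3])))   # array(2, array(3, integer))
```

The basic type reaches the innermost C only through the parameter b, which is exactly the role of the inherited attribute; the array type is then assembled on the way back up through the return values.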
Definition: Consider a grammar that is parsed bottom-up and an SDD that is S-attributed. An
SDT can then be constructed such that each action is placed at the end of its production and
is executed only when the right hand side of the production is reduced to the left hand
side, i.e., on reduction of the body to the head of the production. SDTs with all actions at
the right end of the production bodies are called postfix SDTs or postfix
syntax-directed translations.
Example 5.16: Obtain Postfix SDT implementation of the desk calculator to evaluate the
given expression
Solution: The SDT can be easily obtained from the SDD shown in example 5.7: enclosing each
semantic rule of the SDD within braces and placing it at the end of its production gives
the SDT. The postfix SDT implementation of the desk calculator is shown below:
Productions         Actions
S → E n             { print(E.val) }
E → E1 + T          { E.val = E1.val + T.val }
E → T               { E.val = T.val }
T → T1 * F          { T.val = T1.val * F.val }
T → F               { T.val = F.val }
F → ( E )           { F.val = E.val }
F → digit           { F.val = digit.lexval }
In the above SDT observe that the actions enclosed within braces are present at the end of
each production.
Note: In a typical SDT, actions can be placed either at the beginning of the production or
at the end of the production or somewhere in between. Now, instead of evaluating the
arithmetic expression, let us convert the infix expression to postfix expression
Example 5.17: Write the SDD and annotate parse tree for converting an infix to postfix
expression
Note: The semantic rule for the production E → E1 + T in infix form is E.t = E1.t + T.t. But,
in the postfix notation, it is written as E.t = E1.t || T.t || '+', where the symbol || is
used as the concatenation operator. The annotated parse tree for converting the expression
2*3+4 is obtained by writing the parse tree for the given expression as shown below
(children are indented under their parent):
E.t = 2 3 * 4 +
    E.t = 2 3 *
        T.t = 2 3 *
            T1.t = 2
                F.t = 2
                    digit.lexval = 2
            *
            F.t = 3
                digit.lexval = 3
    +
    T.t = 4
        F.t = 4
            digit.lexval = 4
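The concatenation rule can be checked with a short Python sketch that computes the attribute t bottom-up over the parse tree of 2*3+4. The nested-tuple encoding of the parse tree and the function name t are assumptions made for this sketch; the grammar is restricted to the operators + and * over single digits, and a blank is used as separator when concatenating.

```python
# Parse tree of 2*3+4, written by hand: each interior node is (head, children),
# each terminal is a plain string.
tree = ("E", [
    ("E", [
        ("T", [
            ("T", [("F", ["2"])]),
            "*",
            ("F", ["3"]),
        ]),
    ]),
    "+",
    ("T", [("F", ["4"])]),
])


def t(node):
    """Return the synthesized attribute t (the postfix string) of a node."""
    head, kids = node
    if head == "F":                    # F -> digit : F.t = digit.lexval
        return kids[0]
    if len(kids) == 1:                 # E -> T, T -> F : copy the child's t
        return t(kids[0])
    left, op, right = kids             # E -> E1 + T, T -> T1 * F
    return t(left) + " " + t(right) + " " + op   # E.t = E1.t || T.t || '+'


print(t(tree))   # 2 3 * 4 +
```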
Example 5.18: Write the SDT for converting an infix to postfix expression. Show the
actions for translating the expression 2*3+4 into its equivalent postfix expression
Note: When we draw the parse tree for the translation scheme, indicate an action by
introducing a rightmost child and connect it to the parent using a dashed line as shown in
the parse tree.
[Figure: parse tree for 2*3+4 with actions attached as rightmost children: {print('+')}
under E → E + T, {print('*')} under T → T * F, and {print('2')}, {print('3')}, {print('4')}
under the F nodes that derive 2, 3 and 4]
Now, traversing in postorder and executing only the actions we get the postfix
expression:
23*4+
Now, the SDT for converting an infix expression to postfix expression can be easily
written using the above parse tree and is shown below:
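As a rough illustration of actions being executed while parsing, the following Python sketch attaches a print-like action to the end of each production, as the parse tree above suggests: E → E1 + T {print('+')}, T → T1 * F {print('*')} and F → digit {print(digit.lexval)}. Two assumptions are made here: the left-recursive productions are realised as loops in a hand-written parser (which leaves the order of the actions unchanged), and the actions append to a list instead of printing so that the result can be returned. The function name to_postfix is hypothetical.

```python
def to_postfix(text):
    out = []        # stands in for print(): every action appends here
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def expr():     # E -> E1 + T { print('+') } | T
        nonlocal pos
        term()
        while peek() == "+":
            pos += 1
            term()
            out.append("+")          # action at the end of E -> E1 + T

    def term():     # T -> T1 * F { print('*') } | F
        nonlocal pos
        factor()
        while peek() == "*":
            pos += 1
            factor()
            out.append("*")          # action at the end of T -> T1 * F

    def factor():   # F -> digit { print(digit.lexval) }
        nonlocal pos
        ch = peek()
        if ch is None or not ch.isdigit():
            raise SyntaxError("digit expected at position %d" % pos)
        pos += 1
        out.append(ch)               # action at the end of F -> digit

    expr()
    if pos != len(text):
        raise SyntaxError("unexpected input at position %d" % pos)
    return "".join(out)


print(to_postfix("2*3+4"))   # 23*4+
```

Because every action sits at the right end of its production, it runs only after both operands have been processed, which is what places the operator after its operands.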
Now, let us “Explain parser-stack implementation of Postfix SDTs”. Postfix SDTs can
be implemented during LR parsing by executing the actions whenever a reduction occurs.
This can be explained as shown below:
The grammar symbols to be reduced to the LHS of the production are present on top of
the stack. For example, for a production of the form A → X Y Z, assume the stack
contains X, Y and Z and these are the symbols to be reduced
Apart from placing grammar symbols on the top of the stack, place the attribute
values of the grammar symbols also on the stack
A grammar symbol may have more than one attribute and hence all the attributes
associated with a grammar symbol should be placed on the stack
So, the stack should be implemented as an array of records where each record on the stack
has the grammar symbol and its associated attributes. For the sake of convenience only one
attribute of each grammar symbol is shown in the figure:
grammar symbol      attribute
Z                   Z.z          (top)
Y                   Y.y
X                   X.x
Stack
If the attributes are synthesized and actions are present at the end of the production,
then we can compute the attribute value of the LHS of the production using the attribute
values of the RHS of the production, as the following example shows.
Example 5.18: Write the actions of desk calculator SDT so that they manipulate the
parser stack explicitly
Solution: The SDT for the desk calculator can be written as shown below (see example
5.16 for details):
S → E n             { print(E.val) }
E → E1 + T          { E.val = E1.val + T.val }
E → T               { E.val = T.val }
T → T1 * F          { T.val = T1.val * F.val }
T → F               { T.val = F.val }
F → ( E )           { F.val = E.val }
F → digit           { F.val = digit.lexval }
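The desk calculator can be implemented using the stack as shown below:
Consider the production S → E n and its stack contents:
n     top
E     top-1
Stack
Using the above SDT, the attribute value E.val, which is in the stack at position top-1, has
to be printed. This can be done using the statements:
print(stack[top-1]);
top = top - 1;
Consider the production E → E1 + T and its stack contents before and after reduction:
Before reduction        After reduction
T     top
+     top-1
E1    top-2             E     top-2
The value of E is computed from the values at positions top-2 and top and is stored at
position top-2, which becomes the new top of the stack:
stack[top-2] = stack[top-2] + stack[top];
top = top - 2;
The production T → T1 * F is handled in the same way with * in place of +.
But, for the productions E → T and T → F, no action is necessary because the height
of the stack does not change and the value stays where it is. The same holds for F → digit.
Consider the production F → ( E ) and its stack contents before and after reduction:
Before reduction        After reduction
)     top
E     top-1
(     top-2             F     top-2
The value of E at position top-1 is copied to position top-2, which becomes the new top:
stack[top-2] = stack[top-1];
top = top - 2;
The final actions for the respective productions are shown below:
Productions         Actions
S → E n             { print(stack[top-1]);
                      top = top - 1; }
E → E1 + T          { stack[top-2] = stack[top-2] + stack[top];
                      top = top - 2; }
E → T
T → T1 * F          { stack[top-2] = stack[top-2] * stack[top];
                      top = top - 2; }
T → F
F → ( E )           { stack[top-2] = stack[top-1];
                      top = top - 2; }
F → digit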
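The stack manipulation can be simulated with a small Python sketch. A Python list plays the role of the value stack, so stack[-1] corresponds to stack[top], stack[-2] to stack[top-1] and stack[-3] to stack[top-2]. No LR parsing table is built here: the sequence of shift and reduce moves for the input 3*5+4n is written out by hand, which is an assumption of this sketch, and the reductions without actions (E → T, T → F, F → digit) are simply omitted. All function names are hypothetical.

```python
stack = []   # value stack; stack[-1] corresponds to stack[top]


def shift(value):
    """Push a terminal: the lexical value for a digit, the symbol itself otherwise."""
    stack.append(value)


def reduce_add():                       # E -> E1 + T
    stack[-3] = stack[-3] + stack[-1]   # stack[top-2] = stack[top-2] + stack[top]
    del stack[-2:]                      # top = top - 2


def reduce_mul():                       # T -> T1 * F
    stack[-3] = stack[-3] * stack[-1]   # stack[top-2] = stack[top-2] * stack[top]
    del stack[-2:]                      # top = top - 2


def reduce_paren():                     # F -> ( E )
    stack[-3] = stack[-2]               # stack[top-2] = stack[top-1]
    del stack[-2:]                      # top = top - 2


def reduce_line():                      # S -> E n
    value = stack[-2]                   # print(stack[top-1])
    del stack[-1:]                      # top = top - 1
    return value


# Hand-written moves for the input 3*5+4n.
moves = [
    (shift, 3), (shift, "*"), (shift, 5), (reduce_mul,),
    (shift, "+"), (shift, 4), (reduce_add,),
    (shift, "n"), (reduce_line,),
]

result = None
for move in moves:
    result = move[0](*move[1:])   # only the last move returns a value

print(result)   # 19
```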
Now, let us see “When is the action part specified in a production executed?” An action
can be placed at any position within the body of the production. The action is performed
immediately after all symbols to its left are processed. Thus, if we have the production:
B → X {a} Y
then action a is performed after we have recognized X (if X is a terminal) or all terminals
derived from X (if X is a non-terminal).
Example 5.18: Write the SDT for converting an infix to prefix expression. Show the
actions for translating the expression 2*3+4 into its equivalent prefix expression
To convert an infix expression to prefix expression, write the parse tree to get the
expression 2*3+4 and print the operator or operand using the following rules:
If only one child is present (representing an operand), introduce a left sibling that
prints the operand
If any one of the children is an operator, introduce a leftmost sibling that prints the
operator
Note: When we draw the parse tree for the translation scheme, indicate an action by
introducing a leftmost child and connect it to the parent using a dashed line as shown in
the parse tree. The parse tree to get the expression “2 * 3 + 4” is shown below:
[Figure: parse tree for 2*3+4 with actions attached as leftmost children: {print('+')}
under E → E + T, {print('*')} under T → T * F, and {print('2')}, {print('3')}, {print('4')}
to the left of the operands 2, 3 and 4]
Now, traversing in preorder and executing only the actions we get the prefix expression:
+*234
Now, the SDT for converting an infix expression to prefix expression can be easily
written using the above parse tree and is shown below:
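The preorder traversal described above can be reproduced with a short Python sketch. The parse tree for 2*3+4 is written out by hand with each action inserted as the leftmost child of its parent, following the two rules given earlier; the tuple encoding, the action() helper and the function run_actions are assumptions made for this sketch, and the actions append to a list instead of printing.

```python
def action(text):
    """An action node; executing it appends text to the output."""
    return ("action", text)


# Parse tree of 2*3+4 with every action placed as the leftmost child.
# Interior nodes are (head, children); terminals are plain strings.
tree = ("E", [
    action("+"),
    ("E", [
        ("T", [
            action("*"),
            ("T", [("F", [action("2"), "2"])]),
            "*",
            ("F", [action("3"), "3"]),
        ]),
    ]),
    "+",
    ("T", [("F", [action("4"), "4"])]),
])


def run_actions(node, out):
    """Visit the tree in preorder and execute only the action nodes."""
    if isinstance(node, str):          # terminal: nothing to execute
        return
    head, payload = node
    if head == "action":
        out.append(payload)
        return
    for child in payload:              # children from left to right
        run_actions(child, out)


out = []
run_actions(tree, out)
print("".join(out))   # +*234
```

Since the action is the first child of its node, it is executed before anything below the operands is visited, which places each operator in front of its operands.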
Exercises:
8) Write the grammar and syntax directed definition for a simple desk calculator and
show annotated parse tree for the expression (3+4)*(5+6)
9) What is circular dependency when evaluating the attribute value of a node in an
annotated parse tree
10) What is the use of inherited attributes
11) Obtain SDD and annotated parse tree for the following grammar using top-down
approach:
T→T*F|F
F → digit
12) Obtain SDD for the following grammar using top-down approach:
S → E n
E→E+T|T
T→T*F|F
F → ( E ) | digit
13) Using the SDD of the previous exercise, obtain the annotated parse tree for the
expression (3 + 4) * (5 + 6)n
14) What is a dependency graph
15) Obtain the dependency graph for the annotated parse tree obtained in example 5.3
16) What is topological sort of the graph
17) Give the topological sort of the following dependency graph
18) What are the different classes of SDDs that guarantee an evaluation order
19) What is S-attributed definition
20) What is an attribute grammar
21) What is an L-attributed definition
22) What is a side effect in a SDD
23) How to control the side effects in SDD
24) Write the grammar and SDD for a simple desk calculator with side effect
25) Write the SDD for a simple type declaration and write the annotated parse tree and
the dependency graph for the declaration “float a, b, c”
26) Write the SDD for a simple desk calculator. Write the annotated parse tree for the
expression 3*5+4n
27) What is syntax directed translation
28) What is a syntax tree? What is the difference between syntax tree and parse tree?
29) For the following grammar show the parse tree and syntax tree for the expression
3 *5 + 4:
E→E+T|E–T|T
T→ T*F | T/F | F
F→ (E) | digit | id
30) How to construct semantic rules that help us to create syntax trees for the
expressions?
31) Obtain the semantic rules to construct a syntax tree for simple arithmetic expressions
using bottom up approach
32) Create a syntax tree for the expression “a – 4 + c”
33) Obtain the semantic rules to construct a syntax tree for simple arithmetic expressions
using top-down approach with operators + and –
34) Obtain the semantic rules to construct a syntax tree for simple arithmetic expressions
with operators -, +, * and / using top-down approach
35) Give the syntax directed translation of type int [2][3] and also give the semantic
rules for the respective productions
36) What is syntax directed translation scheme
37) What is Postfix Syntax-directed-translation or Postfix SDT
38) Obtain Postfix SDT implementation of the desk calculator to evaluate the given
expression
39) Write the SDD and annotate parse tree for converting an infix to postfix expression
40) Write the SDT for converting an infix to postfix expression. Show the actions for
translating the expression 2*3+4 into its equivalent postfix expression
41) Explain parser-stack implementation of Postfix SDTs
42) Write the actions of desk calculator SDT so that they manipulate the parser explicitly
43) When action part specified in the production is executed
44) Write the SDT for converting an infix to prefix expression. Show the actions for
translating the expression 2*3+4 into its equivalent prefix expression