Compiler Key3
11. a. i. Explain in detail about the role of lexical analyzer with the possible error
recovery actions. (6)
Few errors are discernible at the lexical level alone, because a lexical analyzer has a
very localized view of a source program. The simplest recovery strategy is “panic mode”
recovery: delete successive characters from the remaining input until the lexical
analyzer can find a well-formed token. Other possible error recovery actions are:
o Deleting an extraneous character
o Inserting a missing character
o Replacing an incorrect character by a correct character
o Transposing two adjacent characters
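Panic-mode recovery can be sketched as follows. This is only an illustration: the token set (identifiers, integers and a few operators) and the function name tokenize are assumptions, not part of the answer key.

```python
import re

# Hypothetical token set, chosen only for illustration.
WHITESPACE = re.compile(r"\s*")
TOKEN = re.compile(r"(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+\-*/();])")

def tokenize(source):
    """Return (tokens, errors); bad characters are skipped in panic mode."""
    tokens, errors, pos = [], [], 0
    while True:
        pos = WHITESPACE.match(source, pos).end()   # skip blanks between tokens
        if pos >= len(source):
            return tokens, errors
        match = TOKEN.match(source, pos)
        if match:
            tokens.append((match.lastgroup, match.group()))
            pos = match.end()
            continue
        # Panic mode: delete characters until a well-formed token can start.
        start = pos
        while (pos < len(source) and not source[pos].isspace()
               and not TOKEN.match(source, pos)):
            pos += 1
        errors.append((start, source[start:pos]))

tokens, errors = tokenize("pos = init @# + rate * 60")
print(tokens)
print(errors)   # [(11, '@#')]
```

The lexer reports the deleted characters with their position and carries on, so later errors in the same input can still be detected.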
NOV/DEC-'07/CS1352-Answer Key
ii. What is a compiler? Explain the various phases of compiler in detail, with a neat
sketch. (10)
The process of compilation is very complex. It is therefore customary, from both
the logical and the implementation point of view, to partition the compilation process
into several phases. A phase is a logically cohesive operation that takes as input one
representation of the source program and produces as output another representation. (2)
Source program is a stream of characters: E.g. pos = init + rate * 60 (6)
– Lexical analysis: groups characters into non-separable units, called tokens, and
generates the token stream: id1 = id2 + id3 * const
• The information about the identifiers must be stored somewhere (symbol
table).
– Syntax analysis: checks whether the token stream meets the grammatical
specification of the language and generates the syntax tree.
– Semantic analysis: checks whether the program has a meaning (e.g. if pos is a record
and init and rate are integers, then the assignment does not make sense).
[Figure: syntax tree for id1 := id2 + id3 * 60, and the same tree after semantic
analysis, with inttoreal applied to the constant 60]
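The lexical analysis step for the example statement can be sketched as below. The function name lex and the rule that every integer literal becomes const are assumptions made for this illustration.

```python
import re

def lex(source):
    """Return the token stream as text, plus the symbol table it builds."""
    symtab = {}    # identifier name -> entry number in the symbol table
    stream = []
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        if lexeme[0].isalpha() or lexeme[0] == "_":
            # The first occurrence of an identifier gets the next free entry.
            entry = symtab.setdefault(lexeme, len(symtab) + 1)
            stream.append(f"id{entry}")
        elif lexeme.isdigit():
            stream.append("const")
        else:
            stream.append(lexeme)
    return " ".join(stream), symtab

stream, symtab = lex("pos = init + rate * 60")
print(stream)   # id1 = id2 + id3 * const
print(symtab)   # {'pos': 1, 'init': 2, 'rate': 3}
```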
Error Handling
Detection of Different Errors Which Correspond to All Phases
Each phase must be able to deal with the errors it detects, so that compilation
can proceed and further errors can be detected.
Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program
(2)
(OR)
b. i. Give the minimized DFA for the following expression (a|b)*abb. (10)
Calculation of followpos:
Node    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {4}
4       {5}
5       {6}
6       -
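The DFA states can be derived from the followpos table as sketched below. Two things are assumptions, since the syntax tree is not reproduced here: the positions are numbered left to right in (a|b)*abb# (1 = a, 2 = b, 3 = a, 4 = b, 5 = b, 6 = #), and firstpos of the root is {1, 2, 3}. The name build_dfa is illustrative.

```python
symbol_at = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
start = frozenset({1, 2, 3})    # assumed firstpos(root)

def build_dfa(start, symbol_at, followpos, alphabet="ab", end_pos=6):
    """Build DFA states as sets of positions, naming them A, B, C, ..."""
    names = {start: "A"}
    worklist = [start]
    table = {}
    while worklist:
        state = worklist.pop(0)
        for sym in alphabet:
            # Union of followpos(p) over every position p in the state labelled sym.
            target = frozenset(
                q for p in state if symbol_at[p] == sym for q in followpos[p]
            )
            if target not in names:
                names[target] = chr(ord("A") + len(names))
                worklist.append(target)
            table[names[state], sym] = names[target]
    accepting = {name for state, name in names.items() if end_pos in state}
    return names, table, accepting

names, table, accepting = build_dfa(start, symbol_at, followpos)
for positions, name in names.items():
    print(name, sorted(positions))
print(accepting)   # {'D'}
```

Under these assumptions the states come out as A = {1, 2, 3}, B = {1, 2, 3, 4}, C = {1, 2, 3, 5} and D = {1, 2, 3, 6}.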
Position 6, which is associated with the end marker #, is in D. So, D is the final state.
DFA
[Figure: transition diagram of the DFA over states A, B, C and D, with start state A
and final state D]
Transition table:
States  Input symbol
        a   b
A       B   A
B       B   C
C       B   D
D       B   A
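The transition table can be checked by simulating it on a few strings. This is a small sketch; the function name accepts is illustrative, and A is taken as the start state and D as the final state.

```python
# Transition table copied from the answer above.
TRANSITIONS = {
    "A": {"a": "B", "b": "A"},
    "B": {"a": "B", "b": "C"},
    "C": {"a": "B", "b": "D"},
    "D": {"a": "B", "b": "A"},
}

def accepts(text, start="A", final="D"):
    """Run the DFA on text; True if it ends in the final state."""
    state = start
    for ch in text:
        if ch not in TRANSITIONS[state]:
            return False    # symbol outside the alphabet {a, b}
        state = TRANSITIONS[state][ch]
    return state == final

for word in ["abb", "aabb", "babb", "ab", "abba", ""]:
    print(repr(word), accepts(word))
```

Only the strings ending in abb are accepted, as expected for (a|b)*abb.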
The parser obtains a string of tokens from the lexical analyzer and verifies that the string
can be generated by the grammar for the source language. It should report any syntax error
in an intelligible fashion.
Errors can be lexical, syntactic, semantic or logical. The error handler in a parser has
simple-to-state goals:
o It should report the presence of errors clearly and accurately
o It should recover from each error quickly enough to be able to detect subsequent
errors
o It should not significantly slow down the processing of correct programs
(OR)
States   action                goto
         =    *    id   $      S   L   R
0             s4   s5          1   2   3
1                       acc
2        s6             r5
3                       r2
4             s4   s5              8   7
5        r4             r4
6             s11  s12             10  9
7        r3             r3
8        r5             r5
9                       r1
10                      r5
11            s11  s12             10  13
12                      r4
13                      r3
This grammar is LR(1), since its parsing table contains no multiply defined
entries.
States   action                goto
         =    *    id   $      S   L   R
0             s4   s5          1   2   3
1                       acc
2        s6             r5
3                       r2
4             s4   s5              8   7
5        r4             r4
6             s4   s5              8   9
7        r3             r3
8        r5             r5
9                       r1
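A table-driven LR driver for the ten-state table above can be sketched as follows. The grammar is not reproduced in this key, so the production numbering used here (1: S -> L = R, 2: S -> R, 3: L -> * R, 4: L -> id, 5: R -> L) is an assumption inferred from the reduce entries; the names ACTION, GOTO and parse are illustrative.

```python
# Assumed productions: number -> (head, length of right-hand side).
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1), 3: ("L", 2), 4: ("L", 1), 5: ("R", 1)}

ACTION = {
    0: {"*": "s4", "id": "s5"},
    1: {"$": "acc"},
    2: {"=": "s6", "$": "r5"},
    3: {"$": "r2"},
    4: {"*": "s4", "id": "s5"},
    5: {"=": "r4", "$": "r4"},
    6: {"*": "s4", "id": "s5"},
    7: {"=": "r3", "$": "r3"},
    8: {"=": "r5", "$": "r5"},
    9: {"$": "r1"},
}
GOTO = {0: {"S": 1, "L": 2, "R": 3}, 4: {"L": 8, "R": 7}, 6: {"L": 8, "R": 9}}

def parse(tokens):
    """Return the list of productions used, or None on a syntax error."""
    stack = [0]                      # stack of states
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            return None              # empty table entry: syntax error
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":         # shift: push the state, advance the input
            stack.append(number)
            pos += 1
        else:                        # reduce: pop |rhs| states, then take goto
            head, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)

print(parse(["id", "=", "*", "id"]))   # [4, 4, 5, 3, 5, 1]
print(parse(["id", "="]))              # None
```

The first call lists the reductions in the order they are applied, which is a rightmost derivation in reverse.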
ii. What are the reasons for using LR parser technique? (4)
13. a. i. Explain about the different type of three address statements. (8)
Three-address code is one of the intermediate representations. It is a sequence of
statements of the form x := y op z, where x, y, and z are names, constants or
compiler-generated temporaries and op is an operator, which can be an arithmetic or a
logical operator. E.g. x + y * z is translated as t1 := y * z and t2 := x + t1.
The reason for the term three-address code is that each statement usually contains
three addresses, two for the operands and one for the result. (2)
Implementation: (4)
o Quadruples
Record with four fields: op, arg1, arg2 and result.
o Triples
Record with three fields: op, arg1 and arg2. This avoids entering temporary
names into the symbol table; a temporary value is referred to by the position of
the statement that computes it.
o Indirect triples
List pointers to triples rather than listing the triples themselves.
For a := b * -c + b * -c
Quadruples
       op       arg1   arg2   result
(0)    uminus   c             t1
(1)    *        b      t1     t2
(2)    uminus   c             t3
(3)    *        b      t3     t4
(4)    +        t2     t4     t5
(5)    :=       t5            a
Triples
       op       arg1   arg2
(0)    uminus   c
(1)    *        b      (0)
(2)    uminus   c
(3)    *        b      (2)
(4)    +        (1)    (3)
(5)    assign   a      (4)
Indirect Triples
       statement
(0)    (14)
(1)    (15)
(2)    (16)
(3)    (17)
(4)    (18)
(5)    (19)
       op       arg1   arg2
(14)   uminus   c
(15)   *        b      (14)
(16)   uminus   c
(17)   *        b      (16)
(18)   +        (15)   (17)
(19)   assign   a      (18)
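The quadruple and triple forms for a := b * -c + b * -c can be generated mechanically. The sketch below assumes the expression is given as a nested tuple; the helper names gen_quads and quads_to_triples are illustrative, not from the answer key.

```python
def gen_quads(node, quads, counter):
    """Emit quadruples for an expression tree; return the name holding its value."""
    if isinstance(node, str):
        return node                          # a plain name needs no code
    op, *args = node
    operands = [gen_quads(arg, quads, counter) for arg in args]
    counter[0] += 1
    temp = f"t{counter[0]}"                  # fresh compiler-generated temporary
    arg2 = operands[1] if len(operands) > 1 else None
    quads.append((op, operands[0], arg2, temp))
    return temp

def quads_to_triples(quads):
    """Replace each temporary by the position of the statement computing it."""
    defined_at = {}                          # temporary -> statement position

    def ref(arg):
        return f"({defined_at[arg]})" if arg in defined_at else arg

    triples = []
    for position, (op, arg1, arg2, result) in enumerate(quads):
        if op == ":=":
            triples.append(("assign", result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            defined_at[result] = position
    return triples

expr = ("+", ("*", "b", ("uminus", "c")), ("*", "b", ("uminus", "c")))
quads = []
value = gen_quads(expr, quads, [0])
quads.append((":=", value, None, "a"))
triples = quads_to_triples(quads)
for position, quad in enumerate(quads):
    print(f"({position})", quad)
for position, triple in enumerate(triples):
    print(f"({position})", triple)
```

The two listings printed correspond row for row to the quadruple and triple tables above.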
(OR)
14. a. i. Construct the DAG for the following basic block: (6)
d:=b*c
e:=a+b
b:=b*c
a:=e-d
For dead-code elimination, delete from the dag any root (a node with no ancestors)
that has no live variables attached. Repeated application of this transformation will
remove all nodes from the dag that correspond to dead code.
(OR)
ii. Describe in detail about optimization of basic blocks with example. (6)
Code improving transformations:
Structure-preserving transformations
o Common subexpression elimination
o Dead-code elimination
Algebraic transformations, such as reduction in strength.
Structure-preserving transformations: (3)
These are implemented by constructing a dag for a basic block. A common
subexpression can be detected by noticing, as a new node m is about to be added,
whether there is an existing node n with the same children, in the same order, and
with the same operator. If so, n computes the same value as m and may be used in its
place.
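E.g. consider the basic block
d := b * c
e := a + b
b := b * c
a := e - d
In its dag, the node for b * c is created once and carries both labels d and b,
because the operands b and c are unchanged between the first and third statements.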
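The node-matching rule can be sketched as a small dag builder. The representation is an assumption made for illustration: each statement is a tuple (target, left, op, right), and the name build_dag is hypothetical.

```python
def build_dag(block):
    """Build a dag for statements of the form target := left op right."""
    nodes = []      # node id -> ("leaf", name, None) or (op, left id, right id)
    index = {}      # node key -> node id, used to find an existing node n
    current = {}    # variable -> id of the node that currently holds its value

    def node_for(name):
        if name not in current:             # first use: create a leaf node
            key = ("leaf", name, None)
            index[key] = len(nodes)
            nodes.append(key)
            current[name] = index[key]
        return current[name]

    for target, left, op, right in block:
        key = (op, node_for(left), node_for(right))
        if key not in index:                # no node with same operator and children
            index[key] = len(nodes)
            nodes.append(key)
        current[target] = index[key]        # attach the label to the node
    return nodes, current

block = [("d", "b", "*", "c"), ("e", "a", "+", "b"),
         ("b", "b", "*", "c"), ("a", "e", "-", "d")]
nodes, current = build_dag(block)
print(len(nodes))                     # 6
print(current["d"] == current["b"])   # True
```

Six nodes are created for the four statements: three leaves, one shared node for b * c, and one node each for a + b and e - d.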
For dead-code elimination, delete from the dag any root (a node with no ancestors)
that has no live variables attached. Repeated application of this transformation will
remove all nodes from the dag that correspond to dead code.
Use of algebraic identities: (3)
e.g. x + 0 = 0 + x = x
     x - 0 = x
     x * 1 = 1 * x = x
     x / 1 = x
Reduction in strength:
Replace an expensive operator by a cheaper one.
     x ** 2 = x * x
Constant folding:
Evaluate constant expressions at compile time and replace them by their values.
Commutative and associative laws can also be applied.
E.g. a = b + c
     e = c + d + b
Intermediate code:
     a = b + c
     t = c + d
     e = t + b
If t is not needed outside the block, change this to
     a = b + c
     e = a + d
using both the associativity and commutativity of +.
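The algebraic identities, reduction in strength and constant folding can be sketched as a single simplification step on one operation. The representation is an assumption made for illustration: operands are variable names (strings) or integer constants, and an unsimplified operation is returned as a tuple.

```python
import operator

# Operators that are folded when both operands are integer constants.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def simplify(op, x, y):
    """Simplify 'x op y'; return a name, a constant, or an (op, x, y) tuple."""
    if op in FOLDABLE and isinstance(x, int) and isinstance(y, int):
        return FOLDABLE[op](x, y)        # constant folding
    if op == "+" and y == 0:
        return x                         # x + 0 = x
    if op == "+" and x == 0:
        return y                         # 0 + x = x
    if op == "-" and y == 0:
        return x                         # x - 0 = x
    if op == "*" and y == 1:
        return x                         # x * 1 = x
    if op == "*" and x == 1:
        return y                         # 1 * x = x
    if op == "/" and y == 1:
        return x                         # x / 1 = x
    if op == "**" and y == 2:
        return ("*", x, x)               # reduction in strength
    return (op, x, y)

print(simplify("+", "x", 0))     # x
print(simplify("**", "x", 2))    # ('*', 'x', 'x')
print(simplify("*", 2, 30))      # 60
```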
(OR)
Temporaries are used to hold values that arise in the evaluation of expressions.
Local data is the data that is local to the execution of the procedure. Saved machine
status represents the status of the machine just before the procedure is called. The
control link (dynamic link) points to the activation record of the calling procedure.
The access link refers to non-local data held in other activation records. Actual
parameters are the values passed to the called procedure. The returned value field is
used by the called procedure to return a value to the calling procedure.
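The fields described above can be pictured as a record type. This is only an illustrative sketch: the class name, the field types and the order of the fields are assumptions, and a real compiler lays the record out in stack memory rather than as an object.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class ActivationRecord:
    returned_value: Any = None                           # set by the called procedure
    actual_parameters: list = field(default_factory=list)
    control_link: Optional["ActivationRecord"] = None    # caller's record (dynamic link)
    access_link: Optional["ActivationRecord"] = None     # record holding non-local data
    saved_machine_status: dict = field(default_factory=dict)
    local_data: dict = field(default_factory=dict)
    temporaries: dict = field(default_factory=dict)

def call(caller, args):
    """Create the record for a procedure called by 'caller' with 'args'."""
    return ActivationRecord(actual_parameters=list(args), control_link=caller)

main = ActivationRecord()
callee = call(main, [1, 2])
callee.returned_value = sum(callee.actual_parameters)
print(callee.control_link is main)   # True
print(callee.returned_value)         # 3
```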