Assignment 1
6th SEMESTER
COMPILER CONSTRUCTION [CSE304]
ASSIGNMENT - 1
SUBMITTED BY:
NAME – K. NAVEEN
ENROLLMENT NUMBER – A2305219369
B. TECH (CSE)
SUBMITTED TO
FACULTY NAME: Dr. Prabhishek Singh
1. How is bootstrapping of a compiler to more than one machine done? Discuss.
Answer 1- A compiler can be described by a T-diagram, which shows a compiler for source language S producing code for target machine T, itself implemented in language I. To bootstrap the compiler to a new machine, the compiler for S is written in S itself. An existing compiler for S that runs on machine A is used to compile this source, giving a cross-compiler that runs on A but generates code for the new machine B. The compiler source is then compiled once more, this time by the cross-compiler, which yields a compiler that both runs on and produces code for B. Repeating these steps with other targets ports the compiler to any number of machines.
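As a sketch in T-diagram notation (writing C[S, T, I] for a compiler from S to T implemented in I; this bracket notation is only an assumption used here for compactness):
C[S, B, S] compiled by C[S, A, A] gives C[S, B, A], a cross-compiler that runs on machine A and emits code for B.
C[S, B, S] compiled by C[S, B, A] gives C[S, B, B], a compiler that runs natively on machine B.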
2. What do you understand by a pass? Discuss the merits and demerits of multi-pass and single-
pass compilers.
Answer 2- A compiler pass is one complete traversal of the source program (or of an intermediate representation of it) by the compiler.
Single-pass compiler:
•Advantage: It is faster and needs less memory, because the source program is traversed only once.
•Disadvantage: It leaves little scope for code optimization, so the programs it produces are less efficient.
Multi-pass compiler:
•Advantages: It is very useful when optimizing code, since each pass can concentrate on one task and use information collected in earlier passes.
•Disadvantages: It is a slower process that takes more time and memory, because the code is traversed several times.
4. Why is it difficult to simulate an NFA? Discuss the method of constructing an NFA from a regular expression.
Answer 4- It is hard for a computer program to simulate an NFA directly because the transition function is multivalued: from a given state on a given input symbol (or on an ε-move) the automaton may go to several states, so the simulator must keep track of a whole set of possible states rather than a single one. Fortunately, an algorithm called the subset construction converts an NFA for any language into a DFA that recognizes the same language.
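A minimal sketch of the subset construction in Python (the small example NFA at the bottom, for (a|b)*ab, is an illustrative assumption and not part of the original answer):

def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon-moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(start, accept, delta, eps, alphabet):
    """Convert an NFA (with epsilon-moves) into an equivalent DFA whose states are sets of NFA states."""
    d_start = epsilon_closure({start}, eps)
    dfa_delta, seen, worklist = {}, {d_start}, [d_start]
    while worklist:
        S = worklist.pop()
        for a in alphabet:
            move = set()
            for s in S:
                move |= delta.get((s, a), set())
            T = epsilon_closure(move, eps)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    d_accept = {S for S in seen if accept in S}
    return d_start, d_accept, dfa_delta

# Illustrative NFA for (a|b)*ab: state 0 loops on a and b, state 1 guesses the final "ab", state 2 accepts.
delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}}
start, accepting, dfa = subset_construction(0, 2, delta, {}, 'ab')
print(len({s for s, _ in dfa}))   # 3 DFA states are reached for this example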
Thompson's construction algorithm, also called the McNaughton-Yamada-Thompson algorithm, is a method of transforming a regular expression into an equivalent nondeterministic finite automaton (NFA).
Method-
Step 1- Construct an NFA with null (ε) moves from the given regular expression: build a small two-state NFA for each basic symbol and combine the fragments bottom-up using the rules for union (|), concatenation and Kleene closure (*).
Step 2- If a deterministic machine is required, remove the null transitions from the NFA and convert it into its equivalent DFA using the subset construction.
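A rough sketch of Thompson's construction in Python. To keep it short, the regular expression is assumed to be given in postfix form with an explicit '.' for concatenation (so (a|b)*a(a|b) becomes ab|*a.ab|.); this input convention and the helper names are assumptions, not part of the notes.

EPS = None   # label used for epsilon (null) moves

class State:
    def __init__(self):
        self.edges = []            # list of (label, target state)

def add_edge(s, label, t):
    s.edges.append((label, t))

def thompson(postfix):
    """Build an NFA fragment (start, accept) for a postfix regular expression."""
    stack = []
    for c in postfix:
        if c == '.':                       # concatenation: join the two fragments with an epsilon-move
            s2, a2 = stack.pop()
            s1, a1 = stack.pop()
            add_edge(a1, EPS, s2)
            stack.append((s1, a2))
        elif c == '|':                     # union: new start/accept with epsilon-moves to both branches
            s2, a2 = stack.pop()
            s1, a1 = stack.pop()
            start, acc = State(), State()
            add_edge(start, EPS, s1); add_edge(start, EPS, s2)
            add_edge(a1, EPS, acc);   add_edge(a2, EPS, acc)
            stack.append((start, acc))
        elif c == '*':                     # Kleene closure: loop back plus a bypass
            s1, a1 = stack.pop()
            start, acc = State(), State()
            add_edge(start, EPS, s1);  add_edge(start, EPS, acc)
            add_edge(a1, EPS, s1);     add_edge(a1, EPS, acc)
            stack.append((start, acc))
        else:                              # an ordinary input symbol
            start, acc = State(), State()
            add_edge(start, c, acc)
            stack.append((start, acc))
    return stack.pop()

def matches(nfa, text):
    """Simulate the NFA by tracking sets of states (the on-the-fly form of the subset idea)."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            for label, t in stack.pop().edges:
                if label is EPS and t not in seen:
                    seen.add(t); stack.append(t)
        return seen
    start, acc = nfa
    current = closure({start})
    for ch in text:
        current = closure({t for s in current for label, t in s.edges if label == ch})
    return acc in current

nfa = thompson("ab|*a.ab|.")                       # (a|b)*a(a|b)
print(matches(nfa, "bab"), matches(nfa, "abb"))    # True False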
5. Explain all necessary phases and passes in a compiler design. Write down the purpose of
each pass. What is bootstrapping?
Answer 5- Compiler Phases
The compilation process is a sequence of phases. Each phase takes the source program in one representation and produces output in another representation, taking its input from the output of the previous phase.
The various phases of a compiler are:
Lexical Analysis:
The lexical analysis phase is the first phase of the compilation process. It takes source code as input, reads the source program one character at a time and converts it into meaningful lexemes. The lexical analyzer represents these lexemes in the form of tokens.
Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as input and generates a parse tree as output. In this phase, the parser checks whether the expression formed by the tokens is syntactically correct.
Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the parse tree follows the rules of the language. The semantic analyzer keeps track of identifiers, their types and expressions. The output of the semantic analysis phase is the annotated syntax tree.
Intermediate Code Generation
In intermediate code generation, the compiler translates the source code into an intermediate code. Intermediate code lies between the high-level language and the machine language. It should be generated in such a way that it can easily be translated into the target machine code.
Code Optimization
Code optimization is an optional phase. It is used to improve the intermediate code so that the program runs faster and takes less space. It removes unnecessary lines of code and rearranges the sequence of statements in order to speed up program execution.
Code Generation
Code generation is the final stage of the compilation process. It takes the optimized intermediate
code as input and maps it to the target machine language. The code generator translates the intermediate code into the machine code of the specified computer.
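For illustration, the classic assignment statement position = initial + rate * 60 passes through these phases roughly as follows (the exact token names, temporaries and final step are a sketch, not taken from the notes):
Lexical analysis:    id1 = id2 + id3 * 60            (tokens: id, =, id, +, id, *, number)
Syntax analysis:     parse tree for id1 = id2 + (id3 * 60)
Semantic analysis:   the constant 60 is converted to the type of rate: id1 = id2 + id3 * inttofloat(60)
Intermediate code:   t1 = inttofloat(60)
                     t2 = id3 * t1
                     t3 = id2 + t2
                     id1 = t3
Code optimization:   t1 = id3 * 60.0
                     id1 = id2 + t1
Code generation:     machine instructions that load id3, multiply by 60.0, add id2 and store the result in id1.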
A pass is one complete traversal of the source program. With respect to passes, compilers are organized in two ways: as multi-pass compilers or as one-pass compilers.
Multi-pass Compiler
•A multi-pass compiler processes the source code of a program several times.
•In the first pass, the compiler reads the source program, scans it, extracts the tokens and stores the result in an output file.
•In the second pass, the compiler reads the output file produced by the first pass, builds the syntax tree and performs the syntactic analysis. The output of this pass is a file that contains the syntax tree.
•In the third pass, the compiler reads the output file produced by the second pass and checks whether the tree follows the rules of the language. The output of the semantic analysis pass is the annotated syntax tree.
•Further passes continue in this way until the target output is produced.
One-pass Compiler
•A one-pass compiler traverses the program only once: it passes a single time through the parts of each compilation unit and translates each part directly into its final machine code.
•In a one-pass compiler, as each source line is processed it is scanned and its tokens are extracted.
•Then the syntax of the line is analyzed and the tree structure is built. After the semantic part, the code is generated.
•The same process is repeated for each line of code until the entire program is compiled.
Bootstrapping is the technique of writing a compiler in the very source language it is intended to compile: a small, simple version of the compiler is first built by other means (for example with an existing compiler on another machine), and it is then used to compile progressively more complete versions of itself.
Lex is a program that generates a lexical analyzer. It is used with the YACC parser generator. The lexical analyzer it generates is a program that transforms an input stream into a sequence of tokens; Lex produces this analyzer as C source code, which is then compiled like any other C program. Lex works as follows:
Firstly, the user writes a specification lex.l in the Lex language. The Lex compiler then
runs on the lex.l program and produces a C program lex.yy.c.
Finally, the C compiler compiles the lex.yy.c program and produces an object program a.out.
a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.
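On a typical Unix system this tool chain looks roughly like the following (command names vary between systems, e.g. flex and -lfl instead of lex and -ll):
lex lex.l          (the Lex compiler reads the specification and writes lex.yy.c)
cc lex.yy.c -ll    (the C compiler builds the scanner and links the Lex library, giving a.out)
./a.out < input    (the generated lexical analyzer reads the input stream and emits tokens)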
9. Discuss the implementation of the lookahead operator in lexical analysis.
Answer 9- A scanner (or lexical analyzer) is built from the finite automaton (FA) that corresponds to the set of regular expressions. In determining the end of a token, a scanner usually needs to examine an extra symbol from the input; this is called 1-symbol lookahead. The lookahead operator is the additional operator read by Lex in order to distinguish an additional pattern for a token. The lexical analyzer reads one character ahead of the valid lexeme and then retracts to produce the token. At times certain trailing characters are needed in the input to match the pattern. In such cases, the slash (/) is used in the Lex pattern to mark the end of the part of the pattern that matches the lexeme; whatever follows the slash must be present in the input but is not part of the token.
For example, in some languages keywords are not reserved, so the statements
IF(I,J)=5 and IF(condition) THEN
result in a conflict over whether IF is an array name or a keyword. To resolve this, the Lex rule for the keyword IF can be written with the lookahead operator as IF / \( .* \) {letter}, which matches IF only when it is followed by a parenthesized condition and a letter.
Lexical analysis scans the input string from left to right, one character at a time, to identify tokens. It uses two pointers to scan tokens: the lexeme-beginning pointer, which marks the start of the current lexeme, and the forward (lookahead) pointer, which moves ahead until the end of the lexeme is found and is then retracted.
Preliminary Scanning − Certain processes are best performed as characters are moved from the source file to the buffer. For example, comments can be deleted. Languages like FORTRAN, which ignore blanks, can have them deleted from the character stream, and strings of several blanks can be collapsed into one blank. Pre-processing the character stream in this way before lexical analysis saves the trouble of moving the lookahead pointer back and forth over a string of blanks.
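A small sketch of 1-symbol lookahead with retraction, written as hypothetical Python rather than Lex, distinguishing the tokens <, <= and << by peeking one character past the lexeme:

def scan_less(text, pos):
    """Return (token, next position) for an operator beginning with '<' at text[pos]."""
    forward = pos + 1                     # the forward pointer looks one symbol ahead
    if forward < len(text) and text[forward] == '=':
        return ('LE', forward + 1)        # lexeme '<='
    if forward < len(text) and text[forward] == '<':
        return ('SHL', forward + 1)       # lexeme '<<'
    return ('LT', forward)                # the peeked character is not consumed: retract to '<' alone

print(scan_less("<= b", 0))   # ('LE', 2)
print(scan_less("< b", 0))    # ('LT', 1)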
11. Construct the minimum state DFA for the following regular expression: (a|b)*a(a|b)
Answer 11- The regular expression (a|b)*a(a|b) describes every string over {a, b} whose second-to-last symbol is a. The minimum state DFA therefore only needs to remember the last two symbols read, which takes four states:
State A (start; neither of the last two symbols read is a): on a go to B, on b stay in A.
State B (last symbol a, the one before it not a): on a go to D, on b go to C.
State C (the symbol before the last is a, last symbol b) - accepting: on a go to B, on b go to A.
State D (both of the last two symbols are a) - accepting: on a stay in D, on b go to C.
A is the start state, C and D are the accepting states, and no two of the four states are equivalent, so this DFA is minimal.
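A quick check of this table in hypothetical Python, comparing the DFA against the defining property of the language (second-to-last symbol is a) for all short strings:

from itertools import product

# Transition table of the 4-state minimum DFA for (a|b)*a(a|b); C and D are accepting.
delta = {('A', 'a'): 'B', ('A', 'b'): 'A',
         ('B', 'a'): 'D', ('B', 'b'): 'C',
         ('C', 'a'): 'B', ('C', 'b'): 'A',
         ('D', 'a'): 'D', ('D', 'b'): 'C'}

def accepts(w):
    state = 'A'
    for ch in w:
        state = delta[(state, ch)]
    return state in {'C', 'D'}

# Every string of length up to 5 is accepted exactly when its second-to-last symbol is 'a'.
for n in range(6):
    for w in map(''.join, product('ab', repeat=n)):
        assert accepts(w) == (len(w) >= 2 and w[-2] == 'a')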
We can remove ambiguity on the basis of the following three steps -
a) Removal of useless symbols
b) Elimination of null productions
c) Elimination of unit productions
13. Consider the following programs and evaluate the number of tokens in each.
a) int max (i, j)
int i, j;
/* return max of i & j */
{
return i>j?i:j;
}
b) printf (“+lai x = %d”, i);
c) printf (“%d,&i=%x”,i, &i);
d) a+++++b
e) a+++b
f) #include<studio.h>
Answer 13-
a- 23
b- 7
c- 10
d- 5
e- 4
f- 7
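The counts for (d) and (e) follow from the longest-match (maximal munch) rule: the lexical analyzer always takes the longest prefix of the remaining input that forms a valid token, so a+++++b is split as a ++ ++ + b and a+++b as a ++ + b. A rough sketch of this rule in Python (the tiny operator set is an assumption made for the illustration):

import re

def tokens(s):
    # identifiers first, then '++' before '+' so the longer operator wins
    return re.findall(r'[A-Za-z_]\w*|\+\+|\+', s)

print(tokens("a+++++b"))   # ['a', '++', '++', '+', 'b']  -> 5 tokens
print(tokens("a+++b"))     # ['a', '++', '+', 'b']        -> 4 tokens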
14. Which of the following strings is recognized as a complete token without seeing the next character?
a) + b) ++ c) < d) (
Answer 14- b and d. No longer token begins with ++ or with (, so they can be emitted at once, whereas + may be the start of ++ and < may be the start of <= or <<, so the scanner must first look at the next character.