
Assignment 1


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY


AMITY UNIVERSITY UTTAR PRADESH NOIDA

6th SEMESTER
COMPILER CONSTRUCTION [CSE304]

ASSIGNMENT - 1

SUBMITTED BY:
NAME – K. NAVEEN
ENROLLMENT NUMBER – A2305219369
B. TECH (CSE)

BATCH 2019 – 2023, 6th SEMESTER, SECTION – 6CSE5(Y)

SUBMITTED TO
FACULTY NAME: Dr. Prabhishek Singh
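1. How is a compiler bootstrapped onto more than one machine? Discuss.
Answer 1- In T-diagram notation, SCIT denotes a compiler C that translates source language S into target language T and is itself implemented in language I.

The following steps produce a compiler for a new language L on machine A:

1. Create a compiler SCAA for a subset S of the desired language L. It is written in language A, so it runs on machine A.

2. Create a compiler LCSA for the full language L, written in the subset S.

3. Compile LCSA using the compiler SCAA to obtain LCAA. LCAA is a compiler for language L that runs on machine A and produces code for machine A.

To move the compiler to a second machine B, rewrite its code generator so that the compiler, written in L, produces code for B (LCLB). Compiling LCLB with LCAA gives the cross-compiler LCAB, which runs on A and produces code for B. Compiling LCLB with LCAB then gives LCBB, which runs on machine B.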
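
The three steps can be traced with a small model of T-diagrams. This is only an illustrative sketch: the `Compiler` tuple and the `compile_with` helper are hypothetical names introduced here, and the model only checks that the languages line up, it does not translate anything.

```python
from typing import NamedTuple


class Compiler(NamedTuple):
    source: str  # language the compiler accepts
    impl: str    # language (or machine) the compiler itself is written in
    target: str  # language (or machine) the compiler produces code for


def compile_with(program: Compiler, tool: Compiler) -> Compiler:
    """Translate the implementation of `program` using the compiler `tool`."""
    if program.impl != tool.source:
        raise ValueError("tool cannot read the language the program is written in")
    # Source and target of `program` stay the same; only the language it is
    # implemented in changes, from tool.source to tool.target.
    return program._replace(impl=tool.target)


scaa = Compiler(source="S", impl="A", target="A")  # step 1
lcsa = Compiler(source="L", impl="S", target="A")  # step 2
lcaa = compile_with(lcsa, scaa)                    # step 3
print(lcaa)
```

The result has source L, implementation A and target A, which is the compiler LCAA of step 3.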

2. What do you understand by a pass? Discuss the merits and demerits of multi-pass and single-pass compilers.
Answer 2- A compiler pass is one complete traversal of the entire program (the source text or an intermediate representation of it) by the compiler.
Single-pass compiler:
• Advantage: It compiles faster and needs less memory than a multi-pass compiler, because the program is read only once.
• Disadvantage: It generally produces less efficient code, because it has little scope for optimization.
Multi-pass compiler:
• Advantage: Later passes can analyze the output of earlier passes, which is very useful when optimizing code.
• Disadvantage: It is slower, because the program or its intermediate form is processed several times.

3. Why are translators needed?

Answer 3- Translators are needed to convert source code into machine code.
A translator takes a program written in a source language as input and converts it into a program in a target language as output.
It also detects and reports errors during translation.

4. Why is it difficult to simulate an NFA? Discuss the method of constructing an NFA from a regular expression.
Answer 4- It is hard for a computer program to simulate an NFA because the transition function is multivalued: from one state, the same input symbol (or a null move) may lead to several next states, so the program has to keep track of every possible path. Fortunately, an algorithm called the subset construction converts an NFA for any language into a DFA that recognizes the same language.
Thompson’s construction algorithm, also called the McNaughton-Yamada-Thompson algorithm, is a method of transforming a regular expression (RE) into an equivalent nondeterministic finite automaton (NFA).
Method-
Step 1- Construct an NFA with null moves from the given RE, building it up from the NFAs of its subexpressions (single symbols, union, concatenation and closure).
Step 2- If a DFA is required, remove the null transitions from the NFA and convert it into its equivalent DFA.
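
A minimal Python sketch of Step 1 is given below, together with a simulation that follows all possible paths at once by tracking a set of states. The names `NFA`, `symbol`, `concat`, `union`, `star`, `closure` and `accepts` are illustrative choices, not part of any library. As a simplification, concatenation is joined with a null move instead of merging the two states, which accepts the same language.

```python
import itertools

_ids = itertools.count()  # source of fresh state numbers
EPS = None                # label used for null (epsilon) moves


class NFA:
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def symbol(ch):
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, ch, f)])


def concat(m, n):
    return NFA(m.start, n.accept, m.edges + n.edges + [(m.accept, EPS, n.start)])


def union(m, n):
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, m.start), (s, EPS, n.start),
             (m.accept, EPS, f), (n.accept, EPS, f)]
    return NFA(s, f, m.edges + n.edges + extra)


def star(m):
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, m.start), (s, EPS, f),
             (m.accept, EPS, m.start), (m.accept, EPS, f)]
    return NFA(s, f, m.edges + extra)


def closure(nfa, states):
    """All states reachable from `states` using null moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, text):
    current = closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(nfa, moved)
    return nfa.accept in current


def a_or_b():
    return union(symbol("a"), symbol("b"))


# NFA for (a|b)*a(a|b): strings whose second-to-last symbol is a
nfa = concat(concat(star(a_or_b()), symbol("a")), a_or_b())
print(accepts(nfa, "bbaa"), accepts(nfa, "abb"))
```

Each call to `a_or_b()` builds a fresh sub-automaton, because the construction assumes that no states are shared between subexpressions.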

5. Explain all necessary phases and passes in compiler design. Write down the purpose of each pass. What is bootstrapping?
Answer 5- Compiler Phases
The compilation process is a sequence of phases. Each phase takes the source program in one representation and produces output in another representation. Each phase takes its input from the previous phase.
The phases of a compiler are as follows:
Lexical Analysis
Lexical analysis is the first phase of the compilation process. It takes source code as input. It reads the source program one character at a time and groups the characters into meaningful lexemes. The lexical analyzer represents these lexemes in the form of tokens.
Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as input and generates a parse tree as output. In this phase, the parser checks whether the expression formed by the tokens is syntactically correct.
Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the parse tree follows the rules of the language. The semantic analyzer keeps track of identifiers, their types and expressions. The output of the semantic analysis phase is the annotated syntax tree.
Intermediate Code Generation
In intermediate code generation, the compiler translates the source program into intermediate code. Intermediate code lies between the high-level language and the machine language. It should be generated in such a way that it can easily be translated into the target machine code.
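
As an illustration of intermediate code, the sketch below turns an arithmetic expression into three-address code, a common intermediate form in which each instruction has at most one operator. It borrows Python's own `ast` module as the parser, which is an assumption made for brevity. The function name `three_address` and the temporaries `t1`, `t2`, ... are hypothetical.

```python
import ast
import itertools


def three_address(expr):
    """Return (instructions, name holding the result) for an expression."""
    temps = itertools.count(1)
    code = []
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def walk(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = walk(node.left)    # operands are translated first
            right = walk(node.right)
            temp = f"t{next(temps)}"  # fresh temporary for this result
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = walk(ast.parse(expr, mode="eval").body)
    return code, result


code, result = three_address("a + b * c - d")
for line in code:
    print(line)
```

For `a + b * c - d` the multiplication is emitted first, because it is the innermost node of the tree.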
Code Optimization
Code optimization is an optional phase. It improves the intermediate code so that the generated program runs faster and takes less space. It removes unnecessary lines of code and rearranges the sequence of statements in order to speed up program execution.
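
One simple optimization of this kind is constant folding, sketched below. The instruction format, a tuple `(dest, left, op, right)`, and the function name `fold_constants` are assumptions made for this example. The sketch also assumes that every temporary is assigned only once.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Evaluate instructions whose operands are both known integers."""
    known = {}  # temporaries whose value is known at compile time
    out = []    # instructions that must still run at execution time
    for dest, left, op, right in code:
        left = known.get(left, left)     # substitute known values
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = OPS[op](left, right)  # computed now, line removed
        else:
            out.append((dest, left, op, right))
    return out, known


program = [
    ("t1", 2, "*", 3),
    ("t2", "x", "+", "t1"),
    ("t3", "t2", "-", 0),
]
optimized, constants = fold_constants(program)
print(optimized)
```

The instruction computing `t1` disappears and its value is used directly in the instruction for `t2`.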
Code Generation
Code generation is the final phase of the compilation process. It takes the optimized intermediate code as input and maps it to the target machine language. The code generator translates the intermediate code into the machine code of the specified computer.
A pass is a complete traversal of the source program. A compiler can be organized in two ways, as a multi-pass compiler or as a one-pass compiler.
Multi-pass Compiler
• A multi-pass compiler processes the source code of a program several times.
• In the first pass, the compiler reads the source program, scans it, extracts the tokens and stores the result in an output file.
• In the second pass, the compiler reads the output file produced by the first pass, builds the syntax tree and performs the syntax analysis. The output of this pass is a file that contains the syntax tree.
• In the third pass, the compiler reads the output file produced by the second pass and checks whether the tree follows the rules of the language. The output of this semantic analysis pass is the annotated syntax tree.
• Further passes continue in this way until the target output is produced.

One-pass Compiler
• A one-pass compiler traverses the program only once. It passes only once through the parts of each compilation unit and translates each part into its final machine code.
• In a one-pass compiler, when a source line is processed, it is scanned and its tokens are extracted.
• Then the syntax of the line is analyzed and the tree structure is built. After the semantic analysis, the code is generated.
• The same process is repeated for each line of code until the entire program is compiled.

Bootstrapping is widely used in compiler development. It is used to produce a self-hosting compiler, that is, a compiler that can compile its own source code. A bootstrap compiler is used to compile the compiler, and the compiled compiler can then be used to compile everything else as well as future versions of itself.

6. What do you understand by a lexical analyzer generator and the LEX compiler?

Answer 6- Lexical Analyzer
• It is also called a scanner. It takes as input the output of the preprocessor (which performs file inclusion and macro expansion), which is in pure high-level language.
• It reads the characters of the source program and groups them into lexemes (sequences of characters that “go together”).
• Each lexeme corresponds to a token.
• Tokens are defined by regular expressions, which are understood by the lexical analyzer.
• It also reports lexical errors (e.g., erroneous characters) and removes comments and white space.
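
A lexical analyzer of this kind can be sketched with Python's `re` module. The token names and patterns below are illustrative assumptions, not the definition of any particular language. Note that Python tries the alternatives in the order listed rather than choosing the longest match, so the order of `TOKEN_SPEC` matters.

```python
import re

# (token name, regular expression); earlier entries take priority
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),      # removed, never returned
    ("SKIP",    r"\s+"),            # white space, removed
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=<>]"),
    ("PUNCT",   r"[();{},]"),
    ("ERROR",   r"."),              # any other character is a lexical error
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(text):
    """Group characters into lexemes and return (token, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        if kind in ("COMMENT", "SKIP"):
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens


print(tokenize("x = y + 42; /* note */"))
```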

Lex is a program that generates lexical analyzers. It is commonly used with the YACC parser generator. A lexical analyzer is a program that transforms an input stream into a sequence of tokens. Lex reads a specification of the tokens and produces, as output, C source code that implements the lexical analyzer. Lex works as follows:
• First, a specification of the lexical analyzer, lex.l, is written in the Lex language. The Lex compiler then translates lex.l into a C program, lex.yy.c.
• Finally, the C compiler compiles lex.yy.c and produces an object program, a.out.
• a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.

7. Show the construction of an NFA for the following regular expression.

(a|b)*a(a|b)(a|b)
Answer 7-
8. Describe the various compiler writing tools.
Answer 8-
• Parser generator: It produces syntax analyzers (parsers) from a grammatical description of a programming language, typically a context-free grammar. It is useful because the syntax analysis phase is highly complex and writing a parser by hand takes considerable effort and time. Example: PIC, EQM
• Scanner generator: It generates lexical analyzers from a description of the tokens of a language written as regular expressions. It builds a finite automaton to recognize the regular expressions. Example: Lex
• Syntax-directed translation engines: They generate intermediate code in three-address format from a parse tree. These engines have routines that traverse the parse tree and produce the intermediate code. Each node of the parse tree is associated with one or more translations.
• Automatic code generators: They generate the machine language for a target machine. A collection of rules defines how each operation of the intermediate language is translated, and the code generator takes these rules as input. A template-matching process is used: an intermediate language statement is replaced by its equivalent machine language statement using templates.
• Data-flow analysis engines: They are used in code optimization. Data-flow analysis is a key part of code optimization; it gathers information about how values flow from one part of a program to another.
• Compiler construction toolkits: They provide an integrated set of routines that aid in building compiler components or in constructing the various phases of a compiler.

9. Discuss the implementation of the lookahead operator in lexical analysis.
Answer 9- A scanner (or lexical analyzer) is built from the finite automaton (FA) that corresponds to the set of regular expressions. In determining the end of a token, a scanner usually needs to examine an extra symbol from the input. This is called 1-symbol lookahead. The lookahead operator is an additional operator read by LEX to specify the context that must follow the pattern for a token. The lexical analyzer reads one or more characters beyond the valid lexeme and then retracts to produce the token. At times certain characters are required after the lexeme for the pattern to match. In such cases, a slash (/) marks the end of the part of the pattern that matches the lexeme.
For example, in some languages keywords are not reserved, so the statements
IF(I,J)=5 and IF(condition) THEN
create a conflict over whether IF is an array name or a keyword. To resolve this, the LEX rule for the keyword IF can be written as IF/\(.*\){letter}
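
The effect of such a rule can be imitated in Python, whose `re` module offers a lookahead assertion `(?=...)` that checks the following text without consuming it. This is only an analogue of the LEX operator, not LEX itself. The function `classify_if` is a hypothetical helper, and blanks are removed first on the assumption that the language ignores them.

```python
import re

# Matches IF only when it is followed by a parenthesized part and a letter.
# The lookahead text is checked but does not become part of the lexeme.
KEYWORD_IF = re.compile(r"IF(?=\(.*\)[A-Za-z])")


def classify_if(statement):
    """Decide whether a leading IF is a keyword or an ordinary identifier."""
    stripped = statement.replace(" ", "")  # blanks are not significant here
    match = KEYWORD_IF.match(stripped)
    return "keyword" if match else "identifier"


print(classify_if("IF(I,J)=5"))
print(classify_if("IF(condition) THEN"))
```

In the first statement the closing parenthesis is followed by `=`, so the lookahead fails and IF is treated as an array name. In the second it is followed by the letter T, so IF is the keyword.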

10. Discuss input buffering and preliminary scanning in lexical analysis.

Answer 10- Without buffering, the lexical analyzer would have to access secondary memory each time it identifies a token, which is time-consuming and costly. So the input is first read into a buffer and then scanned by the lexical analyzer.

The lexical analyzer scans the input string from left to right, one character at a time, to identify tokens. It uses two pointers to scan tokens −

• Begin Pointer (bptr) − It points to the beginning of the string to be read.

• Look Ahead Pointer (lptr) − It moves ahead to search for the end of the token.

Preliminary Scanning − Certain processes are best performed as characters are moved from the source file to the buffer. For example, comments can be deleted. In languages like FORTRAN, which ignore blanks, the blanks can be deleted from the character stream. Strings of several blanks can also be collapsed into one blank. Pre-processing the character stream before lexical analysis saves the trouble of moving the lookahead pointer back and forth over a string of blanks.
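
The two pointers and the preliminary scan can be sketched as follows. The buffer is modelled as an ordinary Python string, and the token rules are deliberately crude (a run of letters and digits, or one other character). The function names `preliminary_scan` and `scan` are illustrative.

```python
import re


def preliminary_scan(text):
    """Delete comments and collapse runs of blanks while filling the buffer."""
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    return re.sub(r"\s+", " ", text).strip()


def scan(buffer):
    tokens = []
    bptr = 0                       # begin pointer: start of the current lexeme
    while bptr < len(buffer):
        if buffer[bptr].isspace():
            bptr += 1
            continue
        lptr = bptr + 1            # lookahead pointer: searches for the end
        if buffer[bptr].isalnum():
            while lptr < len(buffer) and buffer[lptr].isalnum():
                lptr += 1
        tokens.append(buffer[bptr:lptr])
        bptr = lptr                # the next lexeme begins where this one ended
    return tokens


buffer = preliminary_scan("int  i /* counter */ =   10;")
print(buffer)
print(scan(buffer))
```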

11. Construct the minimum-state DFA for the following regular expression: (a|b)*a(a|b)

Answer 11-

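12. What is meant by an ambiguous grammar? How is ambiguity avoided?


Answer 12- A grammar is said to be ambiguous if, for some input string, there exists more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree.

There is no general algorithm that removes ambiguity from every grammar. In practice, ambiguity is avoided by rewriting the grammar so that operator precedence and associativity are built into the productions, or by adding disambiguating rules to the parser. The following simplification steps are usually applied to the grammar as well, although they do not remove ambiguity by themselves –
a) Removal of useless symbols
b) Elimination of null productions
c) Elimination of unit productions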
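
The definition of ambiguity can be checked on a small example. The sketch below counts the parse trees of a token string under the grammar E → E + E | E * E | id, which is a standard ambiguous grammar chosen here for illustration. The function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` for E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways to derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                # The operator at position k is taken as the root of the tree.
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

The string id + id * id has two parse trees, one with + at the root and one with * at the root, so the grammar is ambiguous.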

13. Consider the following program fragments and find the number of tokens in each.
a) int max (i, j)
int i, j;
/* return max of i & j */
{
return i>j?i:j;
}
b) printf ("+lai x = %d", i);
c) printf ("%d,&i=%x",i, &i);
d) a+++++b
e) a+++b
f) #include<stdio.h>
Answer 13-
a- 23
b- 7
c- 10
d- 5
e- 4
f- 7
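
These counts can be checked mechanically with a small token counter. The sketch below is a simplification and not a full C lexer: the patterns are assumptions that cover only the fragments above, comments and white space are skipped, and a string literal counts as one token. The two-character operators are listed before the single-character ones, so that `a+++++b` is split as a, ++, ++, +, b.

```python
import re

PATTERNS = [
    r"/\*.*?\*/",                    # comment (skipped)
    r"\s+",                          # white space (skipped)
    r'"[^"\n]*"',                    # string literal: one token
    r"[A-Za-z_]\w*",                 # identifier or keyword
    r"\d+",                          # integer constant
    r"\+\+|--",                      # tried before the single "+" and "-"
    r"[-+*/%<>=!&|?:;,.(){}\[\]#]",  # single-character tokens
]
TOKEN = re.compile("|".join(f"(?:{p})" for p in PATTERNS), re.DOTALL)


def count_tokens(text):
    count, pos = 0, 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        lexeme = match.group()
        if not (lexeme.isspace() or lexeme.startswith("/*")):
            count += 1
        pos = match.end()
    return count


program_a = """int max (i, j)
int i, j;
/* return max of i & j */
{
return i>j?i:j;
}"""
print(count_tokens(program_a), count_tokens("a+++++b"), count_tokens("a+++b"))
```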

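14. Which of the following strings can be recognized as a token without seeing the next character?
a) +          b) ++           c) <           d) (
Answer 14- b and d. In C, no longer token begins with ++ or (, whereas + may continue as ++ or += and < may continue as <= or <<.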
