Material For CAT 1

PART-A
1. In which phase is the parse tree generated?
a) Semantic analysis
b) Syntax analysis
c) Intermediate code generation
d) Code generation
2. When the expression total=5+3 is tokenized, what is the token category of total?
a) Assignment operator
b) Identifier
c) Integer Literal
d) Addition Operator
3. Leaf nodes in a parse tree indicate?
a) sub terminals
b) half-terminals
c) non-terminals
d) terminals
4. In which of the following phases of a compiler is an FSA (Finite State Automaton) used?
a) Code optimization
b) Code generation
c) Lexical analysis
d) Parser
5. What does the lexical analyzer take as input?
a) Tokens
b) Parse Tree
c) Source Code
d) Machine Code
6. In which derivation is the right-most non-terminal symbol replaced at each step?
a) Right look ahead
b) Right claim
c) Rightmost
d) Right non-terminal
7. Which of the following are labeled by operator symbols?
a) Root
b) Interior nodes
c) Leaves
d) Nodes
8. Which derivation is generated by the bottom-up parser?
a) Right-most derivation in reverse
b) Left-most derivation in reverse
c) Right-most derivation
d) Left-most derivation
9. A form of recursive-descent parsing that does not require any back-tracking is known as?
a) predictive parsing
b) non-predictive parsing
c) recursive parsing
d) non-recursive parsing
10. In which phase of the compiler is grammar checked?
a) Syntax Analysis
b) Code optimization
c) Semantic analysis
d) Code generation
11. What is the use of a symbol table in compiler design?
a) Finding a name’s scope
b) Type checking
c) Keeping all of the names of all entities in one place
d) Correcting error
12. Which of the following is a part of a compiler that takes as input a stream of characters
and produces as output a stream of words along with their associated syntactic categories?
a) Optimizer
b) Scanner
c) Parser
d) Sentinels
13. How will you speed up the lexical analysis phase in input buffering?
   1. Double the buffer size
   2. Introduce one more buffer
   3. Use sentinel character at the end of buffer
a. 1, 2, 3
b. 1, 2
c. 2, 3
d. 0,0
14. Which regular expression denotes zero or more instances of x or y?
a) (x+y)
b) (x+y)*
c) (x* + y)
d) (xy)*
15. Characters are grouped into tokens in which of the following phases of the compiler?
a) Code generator
b) Lexical analyzer
c) Parser
d) Code optimization
16. Which one of the following is a bottom-up parser?
a) Predictive Parser
b) Recursive Descent Parser
c) Non-recursive descent parser
d) Shift-Reduce Parser
17. Which of these does not belong to a CFG?
a) Terminal Symbol
b) Non-terminal Symbol
c) Start symbol
d) End Symbol
18. Which of the following derivations does a top-down parser use while parsing an input string?
a) Leftmost derivation
b) Leftmost derivation in reverse
c) Rightmost derivation
d) Rightmost derivation in reverse
19. What does LR stand for?
a) Right to left
b) Left to right
c) Left to right and Rightmost Derivation in reverse
d) Left to right reduction
20. In which phase of the compiler is grammar checked?
a) Syntax Analysis
b) Code optimization
c) Semantic analysis
d) Code generation
Part-B
1. What are the two parts of a compilation? Explain briefly.
Analysis and synthesis are the two parts of compilation.
· The analysis part breaks up the source program into constituent pieces and creates an
intermediate representation of the source program.
· The synthesis part constructs the desired target program from the intermediate
representation.
2. Differentiate Lexeme, Token, Pattern with example.
Token: a sequence of characters that has a collective meaning, e.g. the token id
(identifier).
Pattern: a rule describing the set of input strings for which the same token is produced
as output, e.g. "a letter followed by letters or digits" for the token id.
Lexeme: a sequence of characters in the source program that is matched by the pattern
for a token, e.g. in total=5+3 the lexeme total matches the pattern for the token id.
3. Construct NFA for the regular expression: (a+b)*
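The diagram for this answer is not reproduced here. As a hedged stand-in, the sketch below encodes one possible Thompson-style ε-NFA for (a+b)* as a transition table and simulates it. The state numbers 0 to 7 and the names `NFA`, `eps_closure` and `nfa_accepts` are illustrative assumptions; any equivalent NFA is an equally valid answer.

```python
EPS = ""  # label used for epsilon moves

# One possible Thompson-style NFA for (a+b)*: start state 0, accepting state 7.
# The state numbering is an assumption made for this sketch.
NFA = {
    (0, EPS): {1, 7},   # enter the loop, or skip it (zero repetitions)
    (1, EPS): {2, 4},   # choose the 'a' branch or the 'b' branch
    (2, "a"): {3},
    (4, "b"): {5},
    (3, EPS): {6},
    (5, EPS): {6},
    (6, EPS): {1, 7},   # repeat the loop, or leave it
}
START, ACCEPT = 0, {7}


def eps_closure(states):
    """Return every state reachable from `states` using only epsilon moves."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def nfa_accepts(text):
    """Simulate the NFA on `text` by tracking the set of current states."""
    current = eps_closure({START})
    for ch in text:
        moved = set()
        for s in current:
            moved |= NFA.get((s, ch), set())
        current = eps_closure(moved)
    return bool(current & ACCEPT)
```

Any string over {a, b}, including the empty string, should be accepted; a string containing any other character should be rejected.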
4. Define a Context-Free Grammar (CFG)
A context-free grammar G consists of the following:
· V is a set of non-terminals
· T is a set of terminals
· S is a start symbol
· P is a set of production rules
G can be represented as G = (V, T, S, P)
Production rules are given in the following form
Non-terminal → (V ∪ T)*
5. Eliminate the left recursion for the following grammar
E → E+T | T
T → T*F | F
F → (E) | id
After eliminating left recursion:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id
6.Draw a transition diagram to represent relational operators.

7. List the operations on string languages.
1. Union
Union is the most common set operation. For two languages L and M, the union is
denoted by:
L ∪ M = { s | s is in L or s is in M }
That is, a string s in the union can come either from language L or from language M.
If L = {a, b} and M = {c, d}, then L ∪ M = {a, b, c, d}
2. Concatenation
Concatenation joins each string of one language to each string of another language
in all possible ways. The concatenation of two languages is denoted by:
L ⋅ M = { st | s is in L and t is in M }
If L = {a, b} and M = {c, d}, then L ⋅ M = {ac, ad, bc, bd}
3. Kleene Closure
The Kleene closure of a language L is the set of strings obtained by concatenating L
zero or more times. It is denoted by:
L* = L^0 ∪ L^1 ∪ L^2 ∪ …, where L^0 = {ε}
If L = {a, b}, then L* = {ε, a, b, aa, ab, ba, bb, aaa, …}
4. Positive Closure
The positive closure of a language L is the set of strings obtained by concatenating L
one or more times. It is denoted by:
L+ = L^1 ∪ L^2 ∪ L^3 ∪ …
It is the Kleene closure without the term L^0, i.e. L+ excludes ε unless ε is in L itself.
If L = {a, b}, then L+ = {a, b, aa, ab, ba, bb, aaa, …}
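The four operations can be tried out on small finite languages. The sketch below represents a language as a Python set of strings; because the closures are infinite, the hypothetical helper `closure` only builds the powers of L up to a chosen bound `max_power`. The function names are assumptions made for illustration.

```python
from itertools import product


def union(L, M):
    """L ∪ M: strings that are in L or in M."""
    return L | M


def concat(L, M):
    """L ⋅ M: every string of L followed by every string of M."""
    return {s + t for s, t in product(L, M)}


def closure(L, max_power, include_empty=True):
    """Bounded closure: L^0 ∪ ... ∪ L^max_power (Kleene) when include_empty
    is True, or L^1 ∪ ... ∪ L^max_power (positive closure) otherwise."""
    result = {""} if include_empty else set()
    power = {""}                      # L^0 = {ε}, written as the empty string
    for _ in range(max_power):
        power = concat(power, L)      # L^(i+1) = L^i ⋅ L
        result |= power
    return result


L, M = {"a", "b"}, {"c", "d"}
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(closure(L, 2), key=lambda s: (len(s), s)))
print(sorted(closure(L, 2, include_empty=False), key=lambda s: (len(s), s)))
```

With `max_power=2` the Kleene closure contains the empty string, the two strings of length one and all four strings of length two, which shows that mixed strings such as ab and ba belong to the closure as well.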
8. Construct a parse tree for –(id + id )

9. Compute FIRST for all the non-terminals of the following grammar.
S → (L) | a
L → L, S | S
Remove left recursion from the productions for L. The grammar after eliminating left
recursion is:
S → (L) | a
L → SL’
L’ → ,SL’ | ε
FIRST sets:
FIRST(S) = { ( , a }
FIRST(L) = { ( , a }
FIRST(L’) = { "," , ε }
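The FIRST sets above can be cross-checked mechanically. The following is a minimal sketch of the usual fixed-point computation, assuming the grammar is stored as a dictionary from non-terminal to a list of production bodies, with an empty list standing for the ε-production. The names `GRAMMAR_Q9` and `first_sets` are hypothetical, and L’ is written with an ASCII apostrophe.

```python
EPS = "ε"

# Grammar after left-recursion removal; [] denotes the epsilon production.
GRAMMAR_Q9 = {
    "S": [["(", "L", ")"], ["a"]],
    "L": [["S", "L'"]],
    "L'": [[",", "S", "L'"], []],
}


def first_sets(grammar):
    """Iterate until no FIRST set changes (fixed-point computation)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for body in productions:
                before = len(first[nt])
                all_nullable = True
                for sym in body:
                    if sym not in grammar:            # terminal symbol
                        first[nt].add(sym)
                        all_nullable = False
                        break
                    first[nt] |= first[sym] - {EPS}   # non-terminal symbol
                    if EPS not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:                      # whole body can vanish
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


for nt, symbols in first_sets(GRAMMAR_Q9).items():
    print(nt, sorted(symbols))
```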
10. Eliminate Left Recursion from the following grammar.
S → Sab | T
After eliminating left recursion:
S → TS’
S’ → abS’ | ε
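The rule used here, A → Aα | β becomes A → βA’ and A’ → αA’ | ε, is easy to express as a small function. This is only a sketch for immediate left recursion; the function name and the list-of-symbols representation are assumptions, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(nt, productions):
    """Rewrite A -> A alpha | beta as A -> beta A' and A' -> alpha A' | epsilon.

    `productions` is a list of bodies, each a list of symbols.
    An empty list in the result stands for the epsilon production.
    """
    alphas = [body[1:] for body in productions if body and body[0] == nt]
    betas = [body for body in productions if not body or body[0] != nt]
    if not alphas:                      # nothing to do: no left recursion
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


print(eliminate_immediate_left_recursion("S", [["S", "a", "b"], ["T"]]))
```

Applied to S → Sab | T, the function returns the bodies T S' for S and a b S' or ε for S', matching the answer above.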
PART-C
1. Discuss the recognition of tokens.
How tokens are recognized
The lexical analyzer reads the source program character by character and produces a
stream of tokens.
A token may be an identifier (such as a variable name), an operator, a constant or a keyword.
Tokens are specified using regular expressions.
Tokens are recognized with the help of transition diagrams.
Recognition of tokens is done to separate the different tokens from one another.
Example: assume the following grammar fragment generates a specific language:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | ε
expr → term relop term
     | term
term → id
     | num
where the terminals if, then, else, relop, id, and num generate sets of strings given
by the following regular definitions:
if → if
then → then
else → else
relop → < | <= | = | <> | > | >=
id → letter ( letter | digit )*
num → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
where letter and digit are as defined previously.
For this language fragment the lexical analyzer will recognize the keywords if,
then, else, as well as the lexemes denoted by relop, id, and num. To simplify
matters, we assume keywords are reserved; that is, they cannot be used as identifiers.
The num represents the unsigned integer and real numbers of Pascal. In addition, we
assume lexemes are separated by white space, consisting of non-null sequences of
blanks, tabs, and newlines. The lexical analyzer will strip out white space. It will do
so by comparing a string against the regular definition ws, below.
delim → blank | tab | newline
ws → delim+
If a match for ws is found, the lexical analyzer does not return a token to the
parser.
Transition Diagram
Tokens can be recognized by finite automata.
A finite automaton (FA) is a simple idealized machine used to recognize patterns
within input taken from some character set (or alphabet) C. The job of an FA is to
accept or reject an input depending on whether the pattern defined by the FA
occurs in the input.
There are two notations for representing finite automata:
• Transition diagram
• Transition table
A transition diagram is a directed labeled graph whose nodes represent the states
and whose edges represent the transitions between states.
Every transition diagram has exactly one initial state, marked by an incoming arrow
(-->), and zero or more final states, each drawn as a double circle.
Example: a diagram in which state 1 is the initial state and state 3 is the final state.
As an intermediate step in the construction of a lexical analyzer, we first produce a
flowchart, called a transition diagram. Transition diagrams depict the actions that
take place when a lexical analyzer is called by the parser to get the next token.
The transition diagram is used to keep track of information about characters that are
seen as the forward pointer scans the input. It does that by moving from position to
position in the diagram as characters are read.
Components of Transition Diagram
Finite Automata for recognizing identifiers

Finite Automata for recognizing keywords
Finite Automata for recognizing numbers
Finite Automata for relational operators
Finite Automata for recognizing white spaces
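A transition diagram translates almost directly into code: each state becomes a branch, and each edge becomes a test on the next character. The sketch below hand-codes a diagram for the Pascal-style relational operators <, <=, <>, =, > and >=. The state numbers (0, 1, 6), the attribute names (LT, LE, NE, EQ, GT, GE) and the function name `relop` are assumptions made for illustration, not taken from the missing figure.

```python
def relop(text, pos=0):
    """Try to recognize a relational operator starting at text[pos].

    Returns (attribute, next_pos) on success, or None if no relop starts here.
    """
    state = 0
    while True:
        ch = text[pos] if pos < len(text) else ""   # "" marks end of input
        if state == 0:                # start state
            if ch == "<":
                state = 1
            elif ch == "=":
                return ("EQ", pos + 1)
            elif ch == ">":
                state = 6
            else:
                return None           # not a relational operator
            pos += 1
        elif state == 1:              # "<" has been seen
            if ch == "=":
                return ("LE", pos + 1)
            if ch == ">":
                return ("NE", pos + 1)
            return ("LT", pos)        # retract: ch is not part of the lexeme
        elif state == 6:              # ">" has been seen
            if ch == "=":
                return ("GE", pos + 1)
            return ("GT", pos)        # retract


print(relop("<= 10"))
print(relop("<x"))
```

The two "retract" returns correspond to the accepting states that are reached on any other character: the lookahead character is left in the input for the next token.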

2. What are the phases of a compiler? Explain the phases in detail. Write down the
output of each phase for the expression a := b + c * 60
Phases of a compiler: A compiler operates in phases. A phase is a logically interrelated
operation that takes the source program in one representation and produces output in
another representation.
The phases include:

1. Lexical analysis (“scanning”)
Reads in program, groups characters into “tokens”
2. Syntax analysis (“parsing”)
Structures token sequence according to grammar rules of the language.
3. Semantic analysis
Checks semantic constraints of the language.
4. Intermediate code generation
Translates to “lower level” representation.
5. Code optimization
Improves code quality.
6. Code generation
Maps the intermediate representation into the target language.
Phase-1: Lexical Analysis

The lexical analyzer reads the stream of characters making up the source program and
groups the characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces a token of the form
(token-name, attribute-value), which it passes on to the subsequent phase, syntax analysis.
• Token-name: an abstract symbol that is used during syntax analysis.
• Attribute-value: points to an entry in the symbol table for this token.
Example:
newval := oldval + 12
Tokens:
newval Identifier
:= Assignment operator
oldval Identifier
+ Add operator
12 Number
The lexical analyzer strips white space and reports lexical errors.
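To make the (token-name, attribute-value) form concrete, here is a minimal regex-based scanner for the example statement. It is a sketch under stated assumptions: the token names (`id`, `num`, `assign`, `add`), the 1-based symbol-table index used as the attribute of an identifier, and the function name `tokenize` are all illustrative choices.

```python
import re

# Regular definitions for the tiny example language (names are assumptions).
TOKEN_SPEC = [
    ("ws", r"[ \t\n]+"),
    ("num", r"\d+"),
    ("id", r"[A-Za-z][A-Za-z0-9]*"),
    ("assign", r":="),
    ("add", r"\+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for `source`."""
    symbol_table = []          # identifier lexemes in order of first appearance
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}: {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "ws":
            continue                                   # white space is stripped
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "num":
            tokens.append(("num", int(lexeme)))
        else:
            tokens.append((kind, None))                # operators need no attribute
    return tokens, symbol_table


print(tokenize("newval := oldval + 12"))
```

For the example statement the identifiers newval and oldval become symbol-table entries 1 and 2, and an unexpected character such as ? raises a lexical error instead of being silently dropped.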
Phase-2: Syntax Analysis

• Also called parsing.
• The parser uses the first components of the tokens produced by the lexical
analyzer to create a tree-like intermediate representation that depicts the
grammatical structure of the token stream.
• A typical representation is a syntax tree in which each interior node represents an
operation and the children of the node represent the arguments of the operation
Phase-3: Semantic Analysis

• The semantic analyzer uses the syntax tree and the information in the symbol
table to check the source program for semantic consistency with the language
definition.
• Gathers type information and saves it in either the syntax tree or the symbol table,
for subsequent use during intermediate-code generation.
• An important part of semantic analysis is type checking, where the compiler
checks that each operator has matching operands.
• For example, many programming language definitions require an array index to
be an integer; the compiler must report an error if a floating-point number is
used to index an array.
• Example:
newval := oldval+12
The type of the identifier newval must match the type of the expression
(oldval+12).
Example of a statement that is syntactically correct but semantically incorrect:
int a;
double sum;
char b;
sum = a + b;   (data type mismatch)
Phase-4: Intermediate Code Generation

After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation (a
program for an abstract machine). This intermediate representation should have
two important properties:
• it should be easy to produce and
• it should be easy to translate into the target machine.
A commonly used intermediate form is three-address code, which consists of a
sequence of assembly-like instructions with three operands per instruction. Each
operand can act like a register.
This phase bridges the analysis and synthesis phases of translation.
Example:
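For the expression in the question, three-address code can be produced by a small recursive-descent translator. The sketch below assumes the statement reads a := b + c * 60, that identifiers are numbered id1, id2, id3 in order of first appearance, and that no type conversion is needed; the names `three_address_code`, `temp1` and `temp2` are hypothetical, and there is no error handling.

```python
import re


def three_address_code(statement):
    """Translate `id := expr` with +, * and parentheses into three-address code."""
    tokens = re.findall(r"[A-Za-z]\w*|\d+|:=|[+*()]", statement)
    pos = 0
    code = []
    temp_count = 0
    symbols = {}                      # identifier -> id1, id2, ...

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def place_of(token):
        if token[0].isalpha():        # identifiers get symbol-table names
            symbols.setdefault(token, f"id{len(symbols) + 1}")
            return symbols[token]
        return token                  # numbers are used as they are

    def new_temp():
        nonlocal temp_count
        temp_count += 1
        return f"temp{temp_count}"

    def factor():
        if peek() == "(":
            advance()
            place = expr()
            advance()                 # consume ")"
            return place
        return place_of(advance())

    def term():                       # term -> factor (* factor)*
        left = factor()
        while peek() == "*":
            advance()
            right = factor()
            temp = new_temp()
            code.append(f"{temp} = {left} * {right}")
            left = temp
        return left

    def expr():                       # expr -> term (+ term)*
        left = term()
        while peek() == "+":
            advance()
            right = term()
            temp = new_temp()
            code.append(f"{temp} = {left} + {right}")
            left = temp
        return left

    target = place_of(advance())
    advance()                         # consume ":="
    code.append(f"{target} = {expr()}")
    return code


print("\n".join(three_address_code("a := b + c * 60")))
```

Because * binds tighter than +, the multiplication is emitted first and its temporary is then used as an operand of the addition.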
Phase-5: Code Optimization

• The compiler looks at large segments of the program to decide how to improve
performance
• The machine-independent code-optimization phase attempts to improve the
intermediate code so that better target code will result.
• Usually better means faster or shorter code, or target code that consumes less power.
• There are simple optimizations that significantly improve the running time of the
target program without slowing down compilation too much.
• Optimization cannot make an inefficient algorithm efficient - “only makes an
efficient algorithm more efficient”
Example:
The above intermediate code will be optimized as:
Temp1 = Id3 * 1
Id1 = Id2 + Temp1
Phase-6: Code Generation
• The last phase of translation is code generation.
• Takes as input an intermediate representation of the source program and maps it
into the target language
• If the target language is machine code, registers or memory locations are
selected for each of the variables used by the program.
• Then, the intermediate instructions are translated into sequences of machine
instructions that perform the same task.
• A crucial aspect of code generation is the judicious assignment of registers to
hold variables.
Example:
3. Construct a Deterministic Finite Automaton for the given regular expression:
(0+1)*01
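The language of (0+1)*01 is the set of binary strings that end in 01, so three states are enough: one for "no useful suffix yet", one for "last symbol was 0" and one for "last two symbols were 01". The sketch below writes this DFA as a transition table; the state names A, B, C and the function name `dfa_accepts` are assumptions chosen for illustration.

```python
# DFA for (0+1)*01. A: start, B: last symbol was 0, C: input so far ends in 01.
DFA = {
    ("A", "0"): "B", ("A", "1"): "A",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "0"): "B", ("C", "1"): "A",
}
START_STATE, FINAL_STATES = "A", {"C"}


def dfa_accepts(text):
    """Run the DFA; reject immediately on a symbol outside {0, 1}."""
    state = START_STATE
    for ch in text:
        if (state, ch) not in DFA:
            return False
        state = DFA[(state, ch)]
    return state in FINAL_STATES


for sample in ["01", "1101", "010", ""]:
    print(repr(sample), dfa_accepts(sample))
```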
4. Construct the parsing table for the grammar, find FIRST and FOLLOW, and show the
moves made by the predictive parser on input “id + id * id”.
E -> E + T | T
T -> T * F | F
F -> (E)
F-> id
Step1 : Eliminate Left Recursion:
After eliminating left-recursion the grammar is

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id
Step 2: Find First and Follow
First( ) :
FIRST(E) = { ( , id}
FIRST(E’) ={+ , ε }
FIRST(T) = { ( , id}
FIRST(T’) = {*, ε }
FIRST(F) = { ( , id }
Follow( ):
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, $, ) }
FOLLOW(T’) = { +, $, ) }
FOLLOW(F) = {+, * , $ , ) }
Step 3: Construct Parsing Table
Predictive Parsing Table:

Step 4 : Parse the given Input string: id+id*id
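The table and the parser moves can be reproduced with a short table-driven parser. In the sketch below the table `TABLE` is filled in from the FIRST and FOLLOW sets listed in Step 2 (a production A → α goes into M[A, a] for each a in FIRST(α), and an ε-production goes into M[A, b] for each b in FOLLOW(A)). The names `TABLE`, `NONTERMINALS` and `ll1_parse` are assumptions, and E’ and T’ are written with an ASCII apostrophe.

```python
EPSILON = ()   # an empty body stands for an epsilon production

# Predictive parsing table M[non-terminal, lookahead] -> production body.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPSILON, ("E'", "$"): EPSILON,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPSILON, ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPSILON, ("T'", "$"): EPSILON,
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens):
    """Return the productions used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                 # start symbol on top of the end marker
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no table entry for M[{top}, {look}]")
            used.append(f"{top} -> {' '.join(body) if body else 'ε'}")
            stack.extend(reversed(body))   # leftmost symbol ends up on top
        elif top == look:
            pos += 1                       # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return used


for step in ll1_parse(["id", "+", "id", "*", "id"]):
    print(step)
```

The printed productions are exactly those of the leftmost derivation of id + id * id, which is what a top-down predictive parser traces out.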
5. Construct the leftmost derivation, rightmost derivation, and derivation tree for the
following grammar with respect to the string “aaabbabbba”.
S → aB | bA
A → aS | bAA | a
B → bS | aBB | b
Leftmost Derivation
The process of deriving a string by expanding the leftmost non-terminal at each step is
called a leftmost derivation.
The geometrical representation of a leftmost derivation is called a leftmost derivation
tree.
S → aB
→ aaBB (Using B → aBB)
→ aaaBBB (Using B → aBB)
→ aaabBB (Using B → b)
→ aaabbB (Using B → b)
→ aaabbaBB (Using B → aBB)
→ aaabbabB (Using B → b)
→ aaabbabbS (Using B → bS)
→ aaabbabbbA (Using S → bA)
→ aaabbabbba (Using A → a)
Derivation Tree
Rightmost Derivation-
The process of deriving a string by expanding the rightmost non-terminal at each step
is called a rightmost derivation.
The geometrical representation of a rightmost derivation is called a rightmost
derivation tree.
S → aB
→ aaBB (Using B → aBB)
→ aaBaBB (Using B → aBB)
→ aaBaBbS (Using B → bS)
→ aaBaBbbA (Using S → bA)
→ aaBaBbba (Using A → a)
→ aaBabbba (Using B → b)
→ aaaBBabbba (Using B → aBB)
→ aaaBbabbba (Using B → b)
→ aaabbabbba (Using B → b)
Derivation Tree
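Both derivations above can be replayed step by step to confirm that each one really expands the leftmost (or rightmost) non-terminal and ends in the target string. The sketch below assumes single-character grammar symbols, which holds for this grammar; the names `GRAMMAR_Q5`, `derive`, `LEFTMOST` and `RIGHTMOST` are hypothetical.

```python
GRAMMAR_Q5 = {
    "S": ["aB", "bA"],
    "A": ["aS", "bAA", "a"],
    "B": ["bS", "aBB", "b"],
}


def derive(steps, leftmost=True):
    """Apply (head, body) steps starting from S; return all sentential forms."""
    form = "S"
    forms = [form]
    for head, body in steps:
        if body not in GRAMMAR_Q5[head]:
            raise ValueError(f"{head} -> {body} is not a production")
        positions = [i for i, ch in enumerate(form) if ch in GRAMMAR_Q5]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"step expands {head}, but {form[i]} is next in {form}")
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms


LEFTMOST = [("S", "aB"), ("B", "aBB"), ("B", "aBB"), ("B", "b"), ("B", "b"),
            ("B", "aBB"), ("B", "b"), ("B", "bS"), ("S", "bA"), ("A", "a")]
RIGHTMOST = [("S", "aB"), ("B", "aBB"), ("B", "aBB"), ("B", "bS"), ("S", "bA"),
             ("A", "a"), ("B", "b"), ("B", "aBB"), ("B", "b"), ("B", "b")]

print(" => ".join(derive(LEFTMOST, leftmost=True)))
print(" => ".join(derive(RIGHTMOST, leftmost=False)))
```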
6. Check whether the following grammar can be implemented using a predictive parser.
Check whether the string “abfg” is accepted or not using predictive parsing.
S → A
A → aB | Ad
B → bBC | f
C → g
Step 1: Eliminate Left Recursion
S → A
A → aBA’
A’ → dA’ | ε
B → bBC | f
C → g
Step 2: Find First and Follow

FIRST
FOLLOW
Step 3 : Construct Parsing Table
Step 4 : Parsing the Input String ==> “abfg”

7. Construct the LR(0) parsing table for the grammar. Check whether the input
string “aabb” is accepted or not.
S → AA
A → aA | b
Augmented Grammar
S’ → S
S → AA
A → aA | b
Canonical collection of LR(0) items for the given grammar (initial item set I0):
S’ → .S
S → .AA
A → .aA | .b
Constructing the DFA of item sets
Construction of parsing Table:
Parsing Input string:
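The item sets behind the parsing table can be generated with the standard closure and goto operations. The sketch below is a minimal version for this grammar only; an item is stored as a pair (production index, dot position), and the names `GRAMMAR_Q7`, `closure_items`, `goto_items` and `canonical_collection` are assumptions. The state numbers follow the order in which the sets are discovered and may differ from the numbering in a hand-drawn diagram.

```python
GRAMMAR_Q7 = [               # production 0 is the augmented start production
    ("S'", ("S",)),
    ("S", ("A", "A")),
    ("A", ("a", "A")),
    ("A", ("b",)),
]
NONTERMINALS_Q7 = {head for head, _ in GRAMMAR_Q7}
SYMBOLS = ["S", "A", "a", "b"]


def closure_items(items):
    """Add an item B -> .gamma for every non-terminal B right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR_Q7[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS_Q7:
            for i, (head, _) in enumerate(GRAMMAR_Q7):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto_items(items, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR_Q7[p][1]) and GRAMMAR_Q7[p][1][d] == symbol}
    return closure_items(moved) if moved else None


def canonical_collection():
    states = [closure_items({(0, 0)})]      # I0
    transitions = {}
    index = 0
    while index < len(states):              # states grows as new sets appear
        for symbol in SYMBOLS:
            target = goto_items(states[index], symbol)
            if target is None:
                continue
            if target not in states:
                states.append(target)
            transitions[(index, symbol)] = states.index(target)
        index += 1
    return states, transitions


states, transitions = canonical_collection()
for number, state in enumerate(states):
    items = []
    for prod, dot in sorted(state):
        head, body = GRAMMAR_Q7[prod]
        items.append(f"{head} -> {''.join(body[:dot])}.{''.join(body[dot:])}")
    print(f"I{number}:", ", ".join(items))
```

Transitions on terminals become shift entries and transitions on non-terminals become goto entries of the LR(0) table; a state whose item has the dot at the end of a production supplies the reduce entries.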
