
Unit 2


UNIT-II

SYNTAX ANALYSIS

Parser
• A parser is the phase of a compiler that takes the tokens coming from the lexical
analysis phase and groups them into grammatical structures.
• A parser takes its input in the form of a sequence of tokens and produces its output in
the form of a parse tree.
• Parsing is of two types: top-down parsing and bottom-up parsing.

TOP-DOWN PARSER:
• Top-down parsing includes recursive descent parsing and predictive parsing.
• Top-down parsing is used to construct a parse tree for an input string.
• In top-down parsing, the parse starts from the start symbol and transforms it into
the input string.
Recursive Descent Parsing: Recursive descent parsing is a type of top-down parsing
technique. This technique applies a procedure for every terminal and non-terminal entity. It
reads the input from left to right and constructs the parse tree from the top down, starting
at the root. As the technique works recursively, it is called recursive descent parsing.
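
As an illustration of one procedure per non-terminal, here is a minimal sketch of a recursive descent parser. The grammar (E -> T { '+' T }, T -> 'a' | '(' E ')') and all names such as parse_E and parse_T are assumptions made for this example and do not come from the notes. The result is a simplified nested-tuple tree rather than a full parse tree.

```python
# Hypothetical grammar (an assumption for this sketch):
#   E -> T { '+' T }
#   T -> 'a' | '(' E ')'

class ParseError(Exception):
    pass

def parse(tokens):
    """Parse a list of single-character tokens into a nested-tuple tree."""
    pos = 0  # index of the next unread token (input is read left to right)

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise ParseError(f"expected {tok!r} at position {pos}, found {peek()!r}")
        pos += 1

    def parse_E():            # one procedure for the non-terminal E
        node = parse_T()
        while peek() == '+':
            expect('+')
            node = ('+', node, parse_T())
        return node

    def parse_T():            # one procedure for the non-terminal T
        if peek() == 'a':
            expect('a')
            return 'a'
        if peek() == '(':
            expect('(')
            node = parse_E()
            expect(')')
            return node
        raise ParseError(f"unexpected token {peek()!r} at position {pos}")

    tree = parse_E()          # parsing starts from the start symbol
    if pos != len(tokens):
        raise ParseError(f"trailing input at position {pos}")
    return tree
```

For example, parse(list("a+(a+a)")) returns ('+', 'a', ('+', 'a', 'a')), while parse(list("a+")) raises ParseError.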

Back-tracking: A parsing technique that starts from the root node with the input pointer at
its initial position. If a derivation fails, the parser resets the input pointer and restarts
the process with a different rule.
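
A minimal sketch of back-tracking, assuming the grammar S -> aSa | bSb | ε (even-length palindromes over a and b) purely for illustration. The names GRAMMAR, derive, derive_seq and accepts are hypothetical. Each alternative is tried in turn, and when one fails the generator simply moves on to the next rule from the same input position.

```python
# Assumed grammar for this sketch: S -> a S a | b S b | ε
GRAMMAR = {"S": [["a", "S", "a"], ["b", "S", "b"], []]}

def derive(symbol, text, pos):
    """Yield every position where `symbol` can end if it starts at `pos`."""
    if symbol not in GRAMMAR:                 # terminal: must match the input
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for rhs in GRAMMAR[symbol]:               # try each rule; a failure falls through
        yield from derive_seq(rhs, text, pos)

def derive_seq(symbols, text, pos):
    """Yield every end position for a sequence of grammar symbols."""
    if not symbols:
        yield pos
        return
    for mid in derive(symbols[0], text, pos):
        yield from derive_seq(symbols[1:], text, mid)

def accepts(text):
    """True if some derivation from S consumes the whole input."""
    return any(end == len(text) for end in derive("S", text, 0))
```

With this sketch, accepts("abba") is True and accepts("ab") is False. On "abba" the parser first guesses that the inner S continues with another b, fails, and falls back to S -> ε.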

BOTTOM-UP PARSING: Bottom-up parsing works in the reverse direction of top-down
parsing. It starts from the input and traces the rightmost derivation in reverse until it
reaches the start symbol.

Shift-Reduce Parsing: Shift-reduce parsing works in two steps: the shift step and the reduce step.
Shift step: The shift step advances the input pointer to the next input symbol, which is
pushed (shifted) onto the stack.
Reduce step: When the top of the stack holds the complete right-hand side (RHS) of a grammar
rule, the parser replaces it with the left-hand side (LHS) of that rule.
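
The two steps can be sketched as follows, using the grammar E -> E+E | id that these notes use later when discussing ambiguity. This is a naive sketch under the assumption that the parser reduces whenever the top of the stack matches any RHS; practical shift-reduce parsers consult a parsing table to decide between shifting and reducing. The names RULES and shift_reduce are hypothetical.

```python
# Grammar as (LHS, RHS) pairs: E -> E + E | id
RULES = [("E", ["E", "+", "E"]), ("E", ["id"])]

def shift_reduce(tokens, start="E"):
    """Return (accepted, trace) for a greedy shift-reduce run over `tokens`."""
    stack, trace = [], []
    remaining = list(tokens)
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:          # RHS is on top of the stack
                del stack[-len(rhs):]             # reduce: pop the RHS ...
                stack.append(lhs)                 # ... and push the LHS
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:                                     # no reduction was possible
            if not remaining:
                break
            stack.append(remaining.pop(0))        # shift the next input symbol
            trace.append(f"shift {stack[-1]}")
    return stack == [start], trace
```

For the input ["id", "+", "id"] the trace is: shift id, reduce E -> id, shift +, shift id, reduce E -> id, reduce E -> E + E, and the input is accepted because only the start symbol remains on the stack.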

LR Parsing: The LR parser is one of the most efficient syntax analysis techniques, as it
works with a large class of context-free grammars. In LR parsing, L stands for the
left-to-right scanning of the input, and R stands for constructing a rightmost derivation
in reverse.

CONTEXT-FREE GRAMMAR:

A context-free grammar (CFG) is a formal grammar which is used to generate all the possible
patterns of strings in a given formal language.
It is defined as a 4-tuple −

G=(V,T,P,S)

• G is the grammar, which consists of a set of production rules. It is used to generate the
strings of a language.
• T is the finite set of terminal symbols. They are denoted by lower case letters.
• V is the finite set of non-terminal symbols. They are denoted by capital letters.
• P is a set of production rules, which are used for replacing non-terminal symbols (on
the left side of a production) in a string with other terminals or non-terminals (on the
right side of the production).
• S is the start symbol, which is used to derive the string.
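
One possible way to hold the four components in code is sketched below. The class name Grammar, its field names, and the validate method are assumptions made for illustration; productions are stored as (LHS, RHS) pairs where the RHS is a tuple of symbols and the empty tuple stands for ε.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset   # V
    terminals: frozenset      # T
    productions: tuple        # P, as pairs (lhs, rhs) with rhs a tuple of symbols
    start: str                # S

    def validate(self):
        """Return True if the components fit the definition G = (V, T, P, S)."""
        if self.nonterminals & self.terminals:
            return False                      # V and T must be disjoint
        if self.start not in self.nonterminals:
            return False                      # S must be a non-terminal
        symbols = self.nonterminals | self.terminals
        return all(
            lhs in self.nonterminals and all(s in symbols for s in rhs)
            for lhs, rhs in self.productions
        )

# The grammar with rules S -> aS and S -> ε (the language a*)
a_star = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a"}),
    productions=(("S", ("a", "S")), ("S", ())),
    start="S",
)
```

Here a_star.validate() returns True. A grammar whose non-terminal set also contains terminal symbols would fail the disjointness check.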
Example

Construct CFG for the language having any number of a's over the set ∑={a}

Solution

Regular Expression= a*

The production rules for the regular expression are as follows −

S → aS    (rule 1)

S → ε    (rule 2)

Now if we want to derive the string "aaaaaa", we can start with the start symbol, apply
rule 1 six times, and then apply rule 2:

String      Rule
S           (start)
aS          1
aaS         1
aaaS        1
aaaaS       1
aaaaaS      1
aaaaaaS     1
aaaaaa      2

The regular expression a* can generate the set of strings {ε, a, aa, aaa, ...}.

We can have the null string because S is the start symbol and rule 2 gives S → ε.
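
The same derivation can be generated for any number of a's. The helper below is a small sketch; the name derive_a_star is hypothetical, and the ε result is represented by the empty Python string.

```python
def derive_a_star(n):
    """Return the sentential forms in the derivation of 'a' * n."""
    current = "S"
    forms = [current]
    for _ in range(n):
        current = current.replace("S", "aS")   # rule 1: S -> aS
        forms.append(current)
    forms.append(current.replace("S", ""))     # rule 2: S -> ε
    return forms
```

For instance, derive_a_star(2) returns ['S', 'aS', 'aaS', 'aa'], and derive_a_star(0) returns ['S', ''], which is the derivation of the null string.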

Example
• The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc.
• The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
• The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε
Generation of Derivation Tree

A derivation tree or parse tree is an ordered rooted tree that graphically
represents the syntactic structure of a string derived from a context-free
grammar.

Representation Technique
• Root vertex − Must be labeled by the start symbol.
• Interior vertex − Labeled by a non-terminal symbol.
• Leaves − Labeled by a terminal symbol or ε.
If S → x1x2 …… xn is a production rule in a CFG, then the parse tree /
derivation tree has a root labeled S whose children, from left to right, are
x1, x2, ……, xn.
There are two different approaches to draw a derivation tree −

Top-down Approach −
• Starts with the starting symbol S
• Goes down to tree leaves using productions
Bottom-up Approach −
• Starts from tree leaves
• Proceeds upward to the root which is the starting symbol S
Derivation or Yield of a Tree

The derivation or the yield of a parse tree is the final string obtained by
concatenating the labels of the leaves of the tree from left to right, ignoring the
ε (null) leaves. However, if all the leaves are ε, the yield is ε.

Example

Let a CFG (N, T, P, S) be

N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε

One string derived from the above CFG is “abaabb”

S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
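
To make the yield concrete, the sketch below builds the parse tree that corresponds to this derivation and reads off its leaves. The tree encoding as (label, children) pairs and the names EPS, leaf and tree_yield are assumptions made for the example.

```python
EPS = "ε"

def leaf(label):
    """A leaf is a node with no children."""
    return (label, [])

def tree_yield(node):
    """Concatenate leaf labels from left to right, skipping ε leaves."""
    label, children = node
    if not children:
        return "" if label == EPS else label
    return "".join(tree_yield(child) for child in children)

# Parse tree for S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
empty_s = ("S", [leaf(EPS)])                              # S → ε
left = ("S", [leaf("a"), empty_s, leaf("b")])             # S → aSb, yields "ab"
inner = ("S", [leaf("a"), empty_s, leaf("b")])
right = ("S", [leaf("a"), inner, leaf("b")])              # yields "aabb"
tree = ("S", [left, right])                               # S → SS
```

Calling tree_yield(tree) returns "abaabb". A tree whose leaves are all ε yields the empty string.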


Sentential Form and Partial Derivation Tree

A partial derivation tree is a sub-tree of a derivation tree/parse tree such that,
for every vertex in it, either all of its children are in the sub-tree or none of
them are in the sub-tree.

Example

If in any CFG the productions are −

S → AB, A → aaA | ε, B → Bb| ε

the partial derivation tree can be the following −


If a partial derivation tree contains the root S, it is called a sentential form. The
above sub-tree is also in sentential form.
Leftmost and Rightmost Derivation of a String
• Leftmost derivation − A leftmost derivation is obtained by applying a
production to the leftmost variable in each step.
• Rightmost derivation − A rightmost derivation is obtained by applying a
production to the rightmost variable in each step.
Example

Let any set of production rules in a CFG be

X → X+X | X*X | X | a

over the terminal alphabet {a, +, *}.

The leftmost derivation for the string "a+a*a" may be −

X → X+X → a+X → a+X*X → a+a*X → a+a*a

The stepwise derivation of the above string is shown as below −


The rightmost derivation for the above string "a+a*a" may be −

X → X*X → X*a → X+X*a → X+a*a → a+a*a

The stepwise derivation of the above string is shown as below −
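
Both derivations of "a+a*a" can also be replayed mechanically. In the sketch below, the function name apply_steps and the idea of passing only the chosen right-hand sides are assumptions for illustration. Each step rewrites the leftmost or the rightmost X in the current sentential form.

```python
def apply_steps(start, steps, leftmost=True):
    """Rewrite the leftmost (or rightmost) 'X' once per step; return all forms."""
    form = start
    forms = [form]
    for rhs in steps:
        index = form.find("X") if leftmost else form.rfind("X")
        if index < 0:
            raise ValueError("no non-terminal left to rewrite")
        form = form[:index] + rhs + form[index + 1:]
        forms.append(form)
    return forms

leftmost_forms = apply_steps("X", ["X+X", "a", "X*X", "a", "a"], leftmost=True)
rightmost_forms = apply_steps("X", ["X*X", "a", "X+X", "a", "a"], leftmost=False)
```

Here leftmost_forms is ['X', 'X+X', 'a+X', 'a+X*X', 'a+a*X', 'a+a*a'] and rightmost_forms is ['X', 'X*X', 'X*a', 'X+X*a', 'X+a*a', 'a+a*a'], matching the two derivations given in the text.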


Left and Right Recursive Grammars
In a context-free grammar G, if there is a production of the form X → Xa, where X
is a non-terminal and ‘a’ is a string of terminals, it is called a left
recursive production. A grammar having a left recursive production is called
a left recursive grammar.
If in a context-free grammar G there is a production of the form X → aX, where X
is a non-terminal and ‘a’ is a string of terminals, it is called a right
recursive production. A grammar having a right recursive production is
called a right recursive grammar.
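
These two definitions translate directly into checks on a single production. The sketch below covers only direct recursion of the forms X → Xa and X → aX described above; the function names are hypothetical, and the RHS is assumed to be given as a list of symbols.

```python
def is_left_recursive(lhs, rhs):
    """True for a production of the form X -> X a (RHS starts with the LHS)."""
    return len(rhs) > 1 and rhs[0] == lhs

def is_right_recursive(lhs, rhs):
    """True for a production of the form X -> a X (RHS ends with the LHS)."""
    return len(rhs) > 1 and rhs[-1] == lhs

def grammar_is_left_recursive(productions):
    """A grammar is left recursive if any of its productions is."""
    return any(is_left_recursive(lhs, rhs) for lhs, rhs in productions)
```

For example, is_left_recursive("A", ["A", "a"]) is True, and is_right_recursive("S", ["a", "S"]) is True while is_left_recursive("S", ["a", "S"]) is False.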
Ambiguous grammar: A CFG is said to be ambiguous if there exists more than one
derivation tree for some input string, i.e., more than one LeftMost Derivation Tree
(LMDT) or more than one RightMost Derivation Tree (RMDT).
Let us consider this grammar: E -> E+E|id. We can create 2 parse trees from this
grammar to obtain the string id+id+id. The 2 parse trees generated by left-most
derivation group the string as (id+id)+id and as id+(id+id).
Both parse trees are derived from the same grammar rules, but the
parse trees are different. Hence the grammar is ambiguous.
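
The number of parse trees for a sum under E -> E+E | id can be counted by choosing which + sits at the root of the tree. The sketch below is an illustration only; the function name count_trees is hypothetical, and n is the number of id operands in the string.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(n):
    """Number of parse trees for a sum of n `id` operands under E -> E+E | id."""
    if n == 1:
        return 1                       # only E -> id applies
    # E -> E+E: the root '+' splits the operands into k on the left, n-k on the right
    return sum(count_trees(k) * count_trees(n - k) for k in range(1, n))
```

For id+id+id, count_trees(3) returns 2, which confirms that the string has more than one parse tree. For id+id, count_trees(2) returns 1, so ambiguity only shows up once there are at least two + operators.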
Unambiguous Grammar: A context-free grammar is called an unambiguous
grammar if there exists one and only one derivation tree or parse
tree for every string it generates. Example –
X -> AB
A -> Aa / a
B -> b
