Grammars
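A context-free grammar (CFG) is a formal grammar that generates all possible strings of a given formal language.
A context-free grammar G is defined by a 4-tuple:
G = (V, T, P, S)
where
• V is a finite set of non-terminal symbols (variables).
• T is a finite set of terminal symbols.
• P is a finite set of production rules.
• S is the start symbol, a member of V.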
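The 4-tuple can be written down directly as a data structure. The following is a minimal sketch, assuming every grammar symbol is a single character; the names Production, Grammar, in_set and grammar_is_well_formed are illustrative and do not come from the slides.

```c
#include <string.h>

/* One production rule lhs -> rhs; every symbol is a single character. */
struct Production {
    char lhs;
    const char *rhs;   /* "" stands for the empty string */
};

/* G = (V, T, P, S) */
struct Grammar {
    const char *nonterminals;              /* V */
    const char *terminals;                 /* T */
    const struct Production *productions;  /* P */
    size_t n_productions;
    char start;                            /* S */
};

/* Membership test that never treats the string terminator as a symbol. */
static int in_set(const char *set, char c) {
    return c != '\0' && strchr(set, c) != NULL;
}

/* Returns 1 if S is in V, every left-hand side is in V,
   and every right-hand-side symbol is in V or T; otherwise 0. */
int grammar_is_well_formed(const struct Grammar *g) {
    if (!in_set(g->nonterminals, g->start)) return 0;
    for (size_t i = 0; i < g->n_productions; i++) {
        const struct Production *p = &g->productions[i];
        if (!in_set(g->nonterminals, p->lhs)) return 0;
        for (const char *s = p->rhs; *s != '\0'; s++)
            if (!in_set(g->nonterminals, *s) && !in_set(g->terminals, *s))
                return 0;
    }
    return 1;
}
```

For instance, a grammar with V = "S", T = "+-abc", start symbol 'S' and the rules S → S+S, S → S-S, S → a, S → b, S → c passes the check, while dropping 'c' from T makes it fail.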
Derivation
A derivation is a sequence of production-rule applications that produces the input string from the start symbol. During parsing, two decisions have to be made at each step:
• Which non-terminal is to be replaced.
• Which production rule is used to replace it.
There are two options for choosing the non-terminal to replace: the left-most one or the right-most one.
Left-most Derivation
In a left-most derivation, the left-most non-terminal of the current sentential form is replaced at every step, so the input string is produced from left to right.
Example:
Production rules:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
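One left-most derivation of a-b+c can be replayed mechanically. The sketch below assumes the sentential form is stored as a C string in which 'S' is the only non-terminal; the function name derive_leftmost and the buffer convention are illustrative, not part of the slides.

```c
#include <string.h>

/* Replace the left-most 'S' in form with rhs, in place.
   cap is the size of the buffer holding form.
   Returns 1 on success, 0 if form has no 'S' or the result would not fit. */
int derive_leftmost(char *form, size_t cap, const char *rhs) {
    char *nt = strchr(form, 'S');            /* left-most non-terminal */
    if (nt == NULL) return 0;
    size_t rhs_len = strlen(rhs);
    size_t tail_len = strlen(nt + 1);        /* symbols after the 'S' */
    if (strlen(form) - 1 + rhs_len + 1 > cap) return 0;
    memmove(nt + rhs_len, nt + 1, tail_len + 1);  /* move tail and terminator */
    memcpy(nt, rhs, rhs_len);                /* write the right-hand side */
    return 1;
}
```

Starting from "S" and applying the right-hand sides "S+S", "S-S", "a", "b", "c" in that order gives the sentential forms S+S, S-S+S, a-S+S, a-b+S and finally a-b+c.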
Right-most Derivation
In a right-most derivation, the right-most non-terminal of the current sentential form is replaced at every step, so the input string is produced from right to left.
Example:
Production rules:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
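One possible right-most derivation of a-b+c is written out below as a worked example; at every step the right-most S is the one that is rewritten. The LaTeX assumes the amsmath package for the align* environment.

```latex
\begin{align*}
S &\Rightarrow S + S     && \text{by } S \to S + S \\
  &\Rightarrow S + c     && \text{by } S \to c \\
  &\Rightarrow S - S + c && \text{by } S \to S - S \\
  &\Rightarrow S - b + c && \text{by } S \to b \\
  &\Rightarrow a - b + c && \text{by } S \to a
\end{align*}
```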
Parse Tree
• A parse tree is a graphical representation of a derivation. Each node is labelled with a symbol, which is either a terminal or a non-terminal.
• In parsing, the string is derived from the start symbol, so the start symbol is the root of the parse tree.
• The parse tree reflects operator precedence. The deepest sub-tree is traversed first, so the operator in a parent node has lower precedence than the operator in its sub-tree.
A parse tree has the following properties:
• All leaf nodes are terminals.
• All interior nodes are non-terminals.
• Reading the leaves from left to right (an in-order traversal) gives the original input string.
Example:
Production rules:
T → T + T | T * T
T → a | b | c
Input:
a*b+c
[Figure: the parse tree for a*b+c, constructed step by step in Steps 1 to 4]
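The parse tree for a*b+c can also be written as a small data structure, which makes its properties easy to check: the leaves are terminals, the interior nodes are the non-terminal T, and the leaves read from left to right spell the input. This is a minimal sketch; the Node layout and the names yield and example_yield are illustrative assumptions, and the tree shown is the one in which * sits deeper than +.

```c
#include <stddef.h>

struct Node {
    char symbol;                     /* 'T' for the non-terminal, else a terminal */
    size_t n_children;               /* 0 for a leaf */
    const struct Node *children[3];  /* at most three symbols per right-hand side */
};

/* Append the leaves below n, left to right, to out (cap must be at least 1).
   len is the number of characters already written; returns the new length. */
size_t yield(const struct Node *n, char *out, size_t len, size_t cap) {
    if (n->n_children == 0) {
        if (len + 1 < cap) out[len++] = n->symbol;
        out[len] = '\0';
        return len;
    }
    for (size_t i = 0; i < n->n_children; i++)
        len = yield(n->children[i], out, len, cap);
    return len;
}

/* Build the tree for a*b+c and write its yield to out. */
void example_yield(char *out, size_t cap) {
    const struct Node a = {'a', 0, {NULL}}, b = {'b', 0, {NULL}}, c = {'c', 0, {NULL}};
    const struct Node star = {'*', 0, {NULL}}, plus = {'+', 0, {NULL}};
    const struct Node ta = {'T', 1, {&a}}, tb = {'T', 1, {&b}}, tc = {'T', 1, {&c}};
    const struct Node product = {'T', 3, {&ta, &star, &tb}};   /* T -> T * T */
    const struct Node sum = {'T', 3, {&product, &plus, &tc}};  /* T -> T + T */
    yield(&sum, out, 0, cap);
}
```

Calling example_yield with a 16-byte buffer fills it with the string a*b+c, the original input.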
Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation,
more than one rightmost derivation, or more than one parse tree for some input
string. If the grammar is not ambiguous, it is called unambiguous.
Example:
S → aSb | SS
S → ε
For the string aabb, the above grammar generates more than one parse tree.
An ambiguous grammar is not suitable for compiler construction. No general method can automatically detect and remove ambiguity (deciding whether an arbitrary context-free grammar is ambiguous is undecidable), but ambiguity can often be removed by rewriting the grammar.
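The ambiguity for aabb can be seen from two different leftmost derivations, each of which corresponds to a different parse tree. The derivations below are a worked illustration, with each line showing the result of rewriting the left-most S; the LaTeX assumes the amsmath package.

```latex
\begin{align*}
\text{Derivation 1:}\quad S &\Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb \\
\text{Derivation 2:}\quad S &\Rightarrow SS \Rightarrow aSbS \Rightarrow aaSbbS
                              \Rightarrow aabbS \Rightarrow aabb
\end{align*}
```

Derivation 1 never uses S → SS, while Derivation 2 starts with it and later erases the second S with S → ε.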
Parser
• A parser is the phase of a compiler that breaks the data coming from the
lexical analysis phase into smaller structural elements.
• A parser takes a sequence of tokens as input and produces a parse tree as
output.
Bottom Up Parsing
• Bottom-up parsing starts with the input symbols and constructs the parse tree up to
the start symbol by tracing out the rightmost derivation of the string in reverse.
Example
E → T
T → T * F
T → id
F → T
F → id
• The parse tree for the input string "id * id" is as follows:
[Figure: parse tree for id * id]
• Bottom-up parsing is classified into the following techniques:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table-Driven LR Parsing
a. LR(1)
b. SLR(1)
c. CLR(1)
d. LALR(1)
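To make the first technique concrete, here is a hand-written shift-reduce recognizer for the example grammar and inputs such as "id * id". It is only a sketch under stated assumptions: the token id is written as the single character 'i', blanks are not allowed, the choice of handle is hard-coded rather than table-driven, and the production F → T is never used. The function name shift_reduce is illustrative.

```c
#include <string.h>

/* Returns 1 if input (with 'i' standing for id) can be reduced to E, else 0. */
int shift_reduce(const char *input) {
    char stack[64];
    size_t top = 0;                 /* number of symbols on the stack */
    size_t pos = 0;                 /* next input position */
    size_t n = strlen(input);
    if (n >= sizeof stack) return 0;
    for (;;) {
        if (top >= 1 && stack[top - 1] == 'i') {
            /* reduce by F -> id after '*', otherwise by T -> id */
            stack[top - 1] = (top >= 2 && stack[top - 2] == '*') ? 'F' : 'T';
        } else if (top >= 3 && stack[top - 3] == 'T' &&
                   stack[top - 2] == '*' && stack[top - 1] == 'F') {
            top -= 2;               /* reduce by T -> T * F */
            stack[top - 1] = 'T';
        } else if (pos < n) {
            if (input[pos] != 'i' && input[pos] != '*') return 0;
            stack[top++] = input[pos++];   /* shift the next token */
        } else {
            break;                  /* nothing left to shift or reduce */
        }
    }
    if (top == 1 && stack[0] == 'T') stack[0] = 'E';   /* reduce by E -> T */
    return top == 1 && stack[0] == 'E';
}
```

For the input "i*i" the stack goes through i, T, T*, T*i, T*F, T and finally E, which is the rightmost derivation E ⇒ T ⇒ T*F ⇒ T*id ⇒ id*id read in reverse.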
Top Down Parsing Techniques
• Recursive Descent Parsing
• Predictive Parsing
• LL(k) grammars
Recursive Descent Parsing
• It is a top-down parsing technique that constructs the parse tree from the top, reading the input
from left to right.
• A procedure is associated with each non-terminal of the grammar.
• The technique recursively parses the input to build a parse tree, which may or may not
require backtracking.
• A form of recursive descent parsing that does not require any backtracking is called
predictive parsing.
• The technique is called recursive because its procedures call one another recursively,
mirroring the recursive nature of the context-free grammar.
Recursive Descent Parsing
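#include <stdio.h>
#include <stdlib.h>
int l;                                  /* lookahead character */
void match(int t) {                     /* consume t, or reject the input */
    if (l == t) l = getchar();
    else { puts("rejected"); exit(1); }
}
void Eprime(void) {                     /* E' -> + i E' | epsilon */
    if (l == '+') { match('+'); match('i'); Eprime(); }
}
void E(void) {                          /* E -> i E' */
    match('i');
    Eprime();
}
int main(void) {
    l = getchar();                      /* read the first lookahead */
    E();
    puts(l == '\n' || l == EOF ? "accepted" : "rejected");
    return 0;
}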
Recursive Descent Parsing
S → rXd | rZd
X → oa | ea
Z → ai
For the input string read, the parser starts with S and matches its yield against the left-most letter of the input,
‘r’. The first production of S (S → rXd) matches it, so the top-down parser advances to the next input
letter, ‘e’. The parser tries to expand the non-terminal ‘X’ and checks its productions from the left (X →
oa). This does not match the next input symbol, so the parser backtracks and tries the next
production of X (X → ea).
Now the parser matches all the input letters in order, and the string is accepted.
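The backtracking behaviour described above can be sketched as one procedure per non-terminal, each of which restores the input position before trying its next alternative. The helper match_literal and the names parse_X, parse_Z and parse_S are illustrative assumptions; the sketch backtracks only within a single non-terminal, which is enough for this grammar.

```c
#include <string.h>

/* If the input at *pos starts with lit, advance *pos past it and return 1. */
static int match_literal(const char *in, size_t *pos, const char *lit) {
    size_t n = strlen(lit);
    if (strncmp(in + *pos, lit, n) != 0) return 0;
    *pos += n;
    return 1;
}

/* X -> oa | ea */
static int parse_X(const char *in, size_t *pos) {
    size_t save = *pos;
    if (match_literal(in, pos, "oa")) return 1;
    *pos = save;                        /* backtrack, try the next alternative */
    if (match_literal(in, pos, "ea")) return 1;
    *pos = save;
    return 0;
}

/* Z -> ai */
static int parse_Z(const char *in, size_t *pos) {
    return match_literal(in, pos, "ai");
}

/* S -> rXd | rZd ; returns 1 if the whole input is derived from S. */
int parse_S(const char *in) {
    size_t pos = 0;
    if (match_literal(in, &pos, "r") && parse_X(in, &pos) &&
        match_literal(in, &pos, "d") && in[pos] == '\0')
        return 1;
    pos = 0;                            /* backtrack to the start, try S -> rZd */
    if (match_literal(in, &pos, "r") && parse_Z(in, &pos) &&
        match_literal(in, &pos, "d") && in[pos] == '\0')
        return 1;
    return 0;
}
```

With this sketch, parse_S accepts "read", "road" and "raid" and rejects strings such as "red".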
Prior to top-down parsing
• Checklist :
Once the common prefix of a rule's alternatives has been factored out, we say that the grammar has been “left-factored”, and the apparent ambiguity has
been removed. Repeating this for every rule left-factors a grammar completely.
Example
stmt -> if exp then stmt endif | if exp then stmt endif else stmt endif
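In this example both alternatives begin with the same tokens, if exp then stmt endif, so left factoring would give something like stmt -> if exp then stmt endif stmt_tail and stmt_tail -> else stmt endif | ε, where stmt_tail is a hypothetical new non-terminal. The first step, finding the common prefix, can be sketched as follows, assuming the alternatives are strings whose tokens are separated by single spaces; the function name common_prefix_len is illustrative.

```c
#include <stddef.h>

/* Length in characters of the longest common prefix of a and b
   that ends on a token boundary (a space or the end of a string). */
size_t common_prefix_len(const char *a, const char *b) {
    size_t i = 0, boundary = 0;
    for (;;) {
        int a_end = (a[i] == ' ' || a[i] == '\0');
        int b_end = (b[i] == ' ' || b[i] == '\0');
        if (a_end && b_end) boundary = i;   /* both tokens ended together */
        if (a[i] == '\0' || b[i] == '\0' || a[i] != b[i]) break;
        i++;
    }
    return boundary;
}
```

For the two alternatives of stmt the function returns 22, the length of "if exp then stmt endif"; whatever follows that prefix in each alternative becomes an alternative of the new non-terminal.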