Grammars

Context Free Grammar
A context-free grammar (CFG) is a formal grammar used to generate all the strings of a given formal language.
A context-free grammar G is defined by a 4-tuple:
G = (V, T, P, S)
where
V is a finite set of non-terminal symbols,
T is a finite set of terminal symbols,
P is a finite set of production rules, and
S is the start symbol, which is an element of V.

In a CFG, strings are derived from the start symbol. A string is derived by repeatedly replacing a non-terminal with the right-hand side of one of its productions, until every non-terminal has been replaced by terminal symbols.
Production rules:
S → aSa
S → bSb
S → c
Now check that the string abbcbba can be derived from this CFG:
S ⇒ aSa
⇒ abSba
⇒ abbSbba
⇒ abbcbba
By applying the productions S → aSa and S → bSb recursively, and finally applying the production S → c, we get the string abbcbba.
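The derivation of abbcbba can be replayed mechanically. The following is a minimal sketch, assuming the grammar S → aSa | bSb | c; the function name derive_palindrome is hypothetical. It picks the production from the target character at the position of S and records every sentential form.

```python
def derive_palindrome(target):
    """Return the sentential forms S => ... => target, or None if not derivable."""
    # Production chosen by the character that must appear where S currently stands.
    rhs_for = {"a": "aSa", "b": "bSb", "c": "c"}
    form, steps = "S", ["S"]
    while "S" in form:
        i = form.index("S")
        if i >= len(target) or target[i] not in rhs_for:
            return None
        form = form[:i] + rhs_for[target[i]] + form[i + 1:]
        if len(form) > len(target):
            return None  # the sentential form already outgrew the target
        steps.append(form)
    return steps if form == target else None


print(derive_palindrome("abbcbba"))
# ['S', 'aSa', 'abSba', 'abbSbba', 'abbcbba']
```

Strings such as abc are rejected because no sequence of these productions yields them.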
A derivation is a sequence of production-rule applications that produces the input string from the start symbol. During parsing we have to make two decisions at each step:
• Which non-terminal is to be replaced.
• Which production rule is used to replace it.
There are two standard ways to decide which non-terminal to replace: left-most derivation and right-most derivation.
Left-most Derivation
In a left-most derivation, the left-most non-terminal of the current sentential form is replaced at every step, so the string is produced from left to right.
Example:
Production rules:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
The left-most derivation is:
S ⇒ S+S
⇒ S-S+S
⇒ a-S+S
⇒ a-b+S
⇒ a-b+c
Right-most Derivation
In a right-most derivation, the right-most non-terminal of the current sentential form is replaced at every step, so the string is produced from right to left.
Example:
Production rules:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
The right-most derivation is:
S ⇒ S-S
⇒ S-S+S
⇒ S-S+c
⇒ S-b+c
⇒ a-b+c
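Both derivations of a-b+c can be reproduced with one small helper. This is a sketch under the assumption that S is the only non-terminal; the name apply_steps and the list-of-right-hand-sides encoding are illustrative only.

```python
def apply_steps(right_hand_sides, leftmost=True):
    """Replace the left-most (or right-most) S by each right-hand side in turn."""
    form = "S"
    forms = [form]
    for rhs in right_hand_sides:
        pos = form.find("S") if leftmost else form.rfind("S")
        if pos < 0:
            raise ValueError("no non-terminal left to replace")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms


# Left-most derivation: always expand the first S.
print(apply_steps(["S+S", "S-S", "a", "b", "c"], leftmost=True))
# ['S', 'S+S', 'S-S+S', 'a-S+S', 'a-b+S', 'a-b+c']

# Right-most derivation: always expand the last S.
print(apply_steps(["S-S", "S+S", "c", "b", "a"], leftmost=False))
# ['S', 'S-S', 'S-S+S', 'S-S+c', 'S-b+c', 'a-b+c']
```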
Parse tree
• A parse tree is a graphical representation of a derivation. Its nodes are labelled with symbols, which can be terminals or non-terminals.
• In parsing, the string is derived from the start symbol, so the root of the parse tree is the start symbol.
• A parse tree reflects the precedence of operators. The deepest sub-tree is evaluated first, so the operator in a parent node has lower precedence than the operator in its sub-tree.
A parse tree has these properties:
All leaf nodes are terminals.
All interior nodes are non-terminals.
An in-order traversal (reading the leaves from left to right) gives the original input string.
Example:
Production rules:
T → T + T | T * T
T → a | b | c
Input:
a*b+c
[Figure: parse tree for a*b+c, built in Steps 1 to 4]
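The parse tree for a*b+c can also be written down as nested data. The sketch below assumes one particular tree (the one in which * sits below +) and a hypothetical encoding: an interior node is a (label, children) pair and a leaf is a terminal string.

```python
# Parse tree for a*b+c with the grammar T -> T + T | T * T | a | b | c.
TREE = ("T", [
    ("T", [("T", ["a"]), "*", ("T", ["b"])]),
    "+",
    ("T", ["c"]),
])


def leaves(node):
    """Read the leaves from left to right; this gives back the input string."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(leaves(child) for child in children)


def depth_of(node, terminal, depth=0):
    """Depth of the first leaf equal to `terminal`, or None if it is absent."""
    if isinstance(node, str):
        return depth if node == terminal else None
    for child in node[1]:
        found = depth_of(child, terminal, depth + 1)
        if found is not None:
            return found
    return None


print(leaves(TREE))                              # a*b+c
print(depth_of(TREE, "*"), depth_of(TREE, "+"))  # 2 1
```

The operator * lies deeper in the tree than +, which matches the rule that the deeper operator is applied first.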
Ambiguity
A grammar is said to be ambiguous if some input string has more than one left-most derivation, more than one right-most derivation, or more than one parse tree. A grammar that is not ambiguous is called unambiguous.
Example:
S → aSb | SS
S → ε
For the string aabb, the above grammar generates at least two different parse trees.
An ambiguous grammar is not suitable for compiler construction. No general method can automatically detect and remove ambiguity, but a particular grammar can often be made unambiguous by rewriting it.
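Ambiguity can be made concrete by counting parse trees. The sketch below does this for the expression grammar S → S+S | S-S | a | b | c from the derivation examples, not for the ε-grammar above, because counting is simplest when every operand is a single letter. The function name count_parse_trees is an assumption.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Number of parse trees of `text` for S -> S+S | S-S | a | b | c."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive text[i:j] from S.
        if j - i == 1:
            return 1 if text[i] in "abc" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if text[k] in "+-":  # text[k] is the operator at the root
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(text)) if text else 0


print(count_parse_trees("a-b+c"))  # 2
```

A count greater than 1 for any string shows that the grammar is ambiguous. For a-b+c the two trees correspond to (a-b)+c and a-(b+c).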
Parser
• A parser is the phase of a compiler that breaks the data coming from the lexical analysis phase into smaller elements.
• A parser takes its input as a sequence of tokens and produces its output as a parse tree.
• Parsing is of two types: top-down parsing and bottom-up parsing.

Top down Parsing
• Top-down parsing is also known as recursive parsing or predictive parsing.
• Top-down parsing is used to construct a parse tree for an input string.
• In top-down parsing, the parse starts from the start symbol and transforms it into the input string.
• Recursive descent parsers and LL parsers are top-down parsers.
[Figure: parse tree representation of the input string "acdb"]
Bottom Up Parsing
• Bottom-up parsing is also known as shift-reduce parsing.
• Bottom-up parsing is used to construct a parse tree for an input string.
• In bottom-up parsing, the parse starts from the input symbols and constructs the parse tree up to the start symbol by tracing out the right-most derivation of the string in reverse.
Example
E → T
T → T * F
T → id
F → T
F → id
[Figure: parse tree representation of the input string "id * id"]
• Bottom-up parsing is classified into several techniques:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table-Driven LR Parsing
a. LR(1)
b. SLR(1)
c. CLR(1)
d. LALR(1)
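A shift-reduce parse of id * id can be sketched by hand. The code below is an illustration only: it assumes the simplified grammar E → T, T → T * F | id, F → id (the production F → T is left out to keep the choice of reduction deterministic), and the reduce decisions are hard-coded rather than taken from an LR table.

```python
def shift_reduce(tokens):
    """Return the reductions performed, in order, or None if the input is rejected."""
    stack, reductions = [], []
    rest = list(tokens)
    while True:
        if stack[-3:] == ["T", "*", "F"]:
            stack[-3:] = ["T"]
            reductions.append("T -> T * F")
        elif stack[-1:] == ["id"]:
            if stack[-2:-1] == ["*"]:      # an id after '*' is a factor
                stack[-1] = "F"
                reductions.append("F -> id")
            else:                          # otherwise it starts a term
                stack[-1] = "T"
                reductions.append("T -> id")
        elif stack == ["T"] and not rest:
            stack[:] = ["E"]
            reductions.append("E -> T")
        elif rest:
            stack.append(rest.pop(0))      # shift the next token
        else:
            break
    return reductions if stack == ["E"] else None


print(shift_reduce(["id", "*", "id"]))
# ['T -> id', 'F -> id', 'T -> T * F', 'E -> T']
```

Read backwards, the reductions give the right-most derivation E ⇒ T ⇒ T * F ⇒ T * id ⇒ id * id, which is what "tracing out the right-most derivation in reverse" means.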
Top Down Parsing Techniques
• Recursive Descent Parsing
• Predictive Parsing
• LL(k) grammars
Recursive Descent Parsing
• It is a top-down parsing technique that constructs the parse tree from the top while the input is read from left to right.
• A procedure is associated with each non-terminal of the grammar.
• This technique recursively parses the input to build a parse tree, and it may or may not require backtracking.
• A form of recursive descent parsing that does not require any backtracking is called predictive parsing.
• The technique is called recursive because it uses a context-free grammar, which is recursive in nature.
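/* Grammar:  E  -> i E'
             E' -> + i E' | epsilon */
#include <stdio.h>
#include <stdlib.h>
int l;                                   /* look-ahead character */
/* Consume the look-ahead if it equals t, otherwise stop with an error. */
void match(int t) { if (l == t) l = getchar(); else { puts("error"); exit(1); } }
void Eprime(void)                        /* E' -> + i E' | epsilon */
{
    if (l == '+') { match('+'); match('i'); Eprime(); }
}
void E(void) { match('i'); Eprime(); }   /* E -> i E' */
int main(void)
{
    l = getchar();                       /* read the first look-ahead */
    E();
    puts(l == '\n' || l == EOF ? "accepted" : "error");
    return 0;
}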
Example grammar:
S → rXd | rZd
X → oa | ea
Z → ai
Input string: read

For the input string read, a top-down parser behaves like this:
It starts with S from the production rules and matches its yield against the left-most letter of the input, i.e. 'r'. The very first production of S (S → rXd) matches it, so the parser advances to the next input letter, 'e'. The parser tries to expand the non-terminal X and checks its productions from the left (X → oa). This does not match the next input symbol, so the parser backtracks and tries the next production of X (X → ea).
Now the parser matches all the input letters in order, and the string is accepted.
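The backtracking behaviour described for the input read can be sketched with a generator that tries each alternative in order. The grammar S → rXd | rZd, X → oa | ea, Z → ai is taken from the example; the names GRAMMAR, expand and accepts are assumptions made for this illustration.

```python
GRAMMAR = {"S": ["rXd", "rZd"], "X": ["oa", "ea"], "Z": ["ai"]}


def expand(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from text[pos:]."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try the alternatives from the left; moving on to the next one
        # after a failed attempt is the backtracking step.
        for rhs in GRAMMAR[head]:
            yield from expand(rhs + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        yield from expand(rest, text, pos + 1)


def accepts(text):
    """True if the whole input can be derived from S."""
    return any(end == len(text) for end in expand("S", text, 0))


print(accepts("read"))  # True  (X -> oa fails, X -> ea succeeds)
print(accepts("raid"))  # True  (S -> rXd fails, S -> rZd succeeds)
print(accepts("rea"))   # False (the final 'd' is missing)
```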
Prior to top-down parsing
• Checklist:
1. Remove ambiguity, if possible, by rewriting the grammar.
2. Remove left recursion, otherwise the parser may enter an infinite loop.
3. Do left factoring.
Left-factoring
• In predictive parsing, the parser predicts which rule to use for a non-terminal by reading the next input symbols.
• Left factoring removes the uncertainty that arises when several productions for the same non-terminal begin with the same symbols.
• “Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing. The basic idea is that when it is not clear which of two alternative productions to use to expand a non-terminal A, we may be able to rewrite the A-productions to defer the decision until we have seen enough of the input to make the right choice.”
- Aho, Ullman, Sethi
• Here is a grammar rule whose alternatives cannot be told apart from their first symbols:
A → xP1 | xP2 | xP3 | xP4 … | xPn
where x and the Pi are strings of terminals and non-terminals, and x ≠ ε.
If we rewrite it as
A → xP'
P' → P1 | P2 | P3 … | Pn
we say that the grammar has been “left-factored”, and the apparent ambiguity has been removed. Repeating this for every rule left-factors the grammar completely.
Example
stmt → if exp then stmt endif | if exp then stmt endif else stmt endif
We can left-factor it as follows:
stmt → if exp then stmt endif ELSEFUNC
ELSEFUNC → else stmt endif | ε
The parser can now choose between the two alternatives of ELSEFUNC by looking at the next token, which removes the apparent ambiguity.
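The stmt example can be left-factored mechanically. The following is a minimal sketch that only factors out a prefix shared by all alternatives of one rule, which is a simplification of the general transformation; the function names and the token-list encoding are assumptions.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols that every alternative starts with."""
    prefix = []
    for column in zip(*alternatives):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(head, alternatives, new_name):
    """Rewrite head -> x P1 | ... | x Pn as head -> x new_name, new_name -> P1 | ... | Pn."""
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {head: alternatives}  # nothing to factor
    tails = [alt[len(prefix):] or ["ε"] for alt in alternatives]
    return {head: [prefix + [new_name]], new_name: tails}


rules = left_factor(
    "stmt",
    ["if exp then stmt endif".split(),
     "if exp then stmt endif else stmt endif".split()],
    "ELSEFUNC",
)
for name, alts in rules.items():
    print(name, "->", " | ".join(" ".join(alt) for alt in alts))
# stmt -> if exp then stmt endif ELSEFUNC
# ELSEFUNC -> ε | else stmt endif
```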

Left Recursion
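A rule such as T → T * F | id is immediately left-recursive: a recursive descent procedure for T would call itself before consuming any input. The usual textbook rewrite turns A → Aα | β into A → βA' and A' → αA' | ε. The sketch below applies that rewrite to a single rule; the function name and the A' naming scheme are assumptions, and indirect left recursion is not handled.

```python
def remove_immediate_left_recursion(head, alternatives):
    """Rewrite head -> head α | β as head -> β head', head' -> α head' | ε."""
    new = head + "'"
    # α parts: alternatives that start with the head itself, head stripped.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # β parts: all other alternatives.
    betas = [alt for alt in alternatives if not alt or alt[0] != head]
    if not alphas:
        return {head: alternatives}  # the rule is not left-recursive
    return {
        head: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [["ε"]],
    }


print(remove_immediate_left_recursion("T", [["T", "*", "F"], ["id"]]))
# {'T': [['id', "T'"]], "T'": [['*', 'F', "T'"], ['ε']]}
```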
Predictive Parsing
• A predictive parser is a recursive descent parser that can predict which production is to be used to expand a non-terminal. A predictive parser does not need backtracking.
• To accomplish this, the predictive parser uses a look-ahead pointer, which points to the next input symbols. To make the parser free of backtracking, the predictive parser puts some constraints on the grammar and accepts only a class of grammars known as LL(k) grammars.
• A table-driven predictive parser uses a stack and a parsing table to parse the input and generate a parse tree. Both the stack and the input contain an end symbol $, which marks the bottom of the stack and the end of the input. The parser consults the parsing table to decide what to do for each combination of input symbol and stack element.
• In recursive descent parsing, the parser may have more than one production to choose from for a single instance of input, whereas in a predictive parser each step has at most one production to choose. There may be instances where no production matches the input string, in which case the parse fails.
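The stack-and-table mechanism can be sketched for the small grammar E → i E', E' → + i E' | ε used in the recursive descent example. The parsing table below was filled in by hand for this grammar, and the names TABLE and ll1_parse are assumptions.

```python
# Parsing table: (non-terminal on the stack, look-ahead) -> right-hand side.
TABLE = {
    ("E", "i"): ["i", "E'"],
    ("E'", "+"): ["+", "i", "E'"],
    ("E'", "$"): [],              # E' -> ε when the input is exhausted
}
NONTERMINALS = {"E", "E'"}


def ll1_parse(text):
    """Return True if `text` is accepted by the table-driven parser."""
    tokens = list(text) + ["$"]   # end marker on the input
    stack = ["$", "E"]            # end marker at the bottom of the stack
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False      # empty table entry: syntax error
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1              # terminal (or $) matches the look-ahead
        else:
            return False
    return pos == len(tokens)


print(ll1_parse("i+i"))  # True
print(ll1_parse("i+"))   # False
```

Each step looks at exactly one table entry, so the parser never has to backtrack.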
LL Parser
• An LL parser accepts LL grammars. LL grammars are a subset of the context-free grammars, with some restrictions that simplify them and make them easy to implement. An LL grammar can be implemented by either of two algorithms: recursive descent or table-driven.
• An LL parser is denoted LL(k). The first L stands for scanning the input from left to right, the second L stands for left-most derivation, and k is the number of look-ahead symbols. Usually k = 1, in which case LL(k) is written LL(1).
A grammar G is LL(1) if, whenever A → α | β are two distinct productions of G:
• for no terminal a do both α and β derive strings beginning with a.
• at most one of α and β can derive the empty string.
• if β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
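The three LL(1) conditions can be checked with FIRST and FOLLOW sets. The code below is a sketch, assuming a grammar encoded as a dictionary from non-terminal to a list of alternatives (each a list of symbols) and "ε" for the empty string; all function names are illustrative.

```python
EPS = "ε"


def first_of(seq, first, grammar):
    """FIRST set of a sequence of symbols, given FIRST of each non-terminal."""
    out = set()
    for sym in seq:
        if sym == EPS:
            continue
        f = first[sym] if sym in grammar else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)  # every symbol of the sequence can vanish
    return out


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue
                    tail = first_of(alt[i + 1:], first, grammar)
                    new = (tail - {EPS}) | (follow[nt] if EPS in tail else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def is_ll1(grammar, start):
    first = first_sets(grammar)
    follow = follow_sets(grammar, start, first)
    for nt, alts in grammar.items():
        for i, alpha in enumerate(alts):
            for beta in alts[i + 1:]:
                fa = first_of(alpha, first, grammar)
                fb = first_of(beta, first, grammar)
                if fa & fb:  # conditions 1 and 2 (ε counts as shared)
                    return False
                if EPS in fb and fa & follow[nt]:  # condition 3
                    return False
                if EPS in fa and fb & follow[nt]:  # condition 3, roles swapped
                    return False
    return True


EXPR = {"E": [["i", "E'"]], "E'": [["+", "i", "E'"], [EPS]]}
PREFIX = {"S": [["a", "b"], ["a", "c"]]}
print(is_ll1(EXPR, "E"))    # True
print(is_ll1(PREFIX, "S"))  # False: both alternatives start with a
```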
