Unit-3 Flat
GRAMMAR
A context-free grammar (CFG) is a type of formal grammar used to describe the
syntax or structure of a formal language. A grammar is a 4-tuple (V, T, P, S):
V - It is the set of variables or nonterminal symbols.
T - It is the set of terminals.
P - It is the set of production rules, whose right-hand sides consist of
terminals and nonterminals.
S - It is the starting symbol, which is a member of V.
A grammar is said to be a context-free grammar if every production is of the form:
G -> (V∪T)*, where G ∊ V
The left-hand side G of a production can only be a single
variable; it cannot be a terminal.
The right-hand side can be any string of variables, terminals, or a
combination of both, including the empty string.
In other words, a grammar is context-free when every one of its productions rewrites
a single variable into some combination of 'V' variables and 'T' terminals.
For example, consider the grammar G = ({S}, {a, b}, P, S):
Here S is the starting symbol and the only variable.
{a, b} are the terminals, generally represented by small characters.
P is the set of productions:
S-> aS
S-> bSa
but
a->bSa, or
a->ba is not allowed in a CFG, because the left-hand side is a terminal,
which does not follow the CFG rule.
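The rule that every left-hand side must be a single variable can be checked mechanically. The following is a minimal Python sketch. Representing productions as (left, right) string pairs, and the function name is_context_free, are assumptions made for illustration and are not part of the notes.

```python
def is_context_free(variables, terminals, productions):
    """Return True if every production has the CFG form A -> (V ∪ T)*."""
    symbols = set(variables) | set(terminals)
    for left, right in productions:
        # The left-hand side must be exactly one variable.
        if left not in variables:
            return False
        # The right-hand side may be any string over V ∪ T (possibly empty).
        if any(ch not in symbols for ch in right):
            return False
    return True


V, T = {"S"}, {"a", "b"}
good = [("S", "aS"), ("S", "bSa")]   # the two valid productions above
bad = [("a", "bSa"), ("a", "ba")]    # a terminal on the left-hand side
print(is_context_free(V, T, good))
print(is_context_free(V, T, bad))
```

The first call reports the productions S-> aS and S-> bSa as valid, while the second rejects the productions whose left-hand side is the terminal a.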
In the computer science field, context-free grammars are frequently used, especially
in the areas of formal language theory, compiler development, and natural language
processing. It is also used for explaining the syntax of programming languages and
other formal languages.
Limitations of Context-Free Grammar
Despite their importance in compiler design and computer science, context-free
grammars have some limitations:
CFGs are limited in expressive power: neither English nor every rule of a
programming language (for example, that a variable must be declared before it is
used) can be fully expressed using a context-free grammar.
A context-free grammar can be ambiguous, meaning that more than one parse tree
can be generated for the same input.
For some grammars, parsing can be inefficient; a naive backtracking parser may
take exponential time.
Error reporting is less precise, because the grammar alone does not carry enough
information to produce detailed error messages.
CLASSIFICATION OF GRAMMAR
Context-free grammars (CFGs) can be classified on the basis of the following two
properties:
1) Based on the number of strings it generates.
If a CFG generates a finite number of strings, then the CFG is non-recursive
(the grammar is said to be a non-recursive grammar).
If a CFG can generate an infinite number of strings, then the grammar is said to
be a recursive grammar.
2) Based on the number of derivation trees.
If there is only one derivation tree for every string, then the CFG is unambiguous.
If some string has more than one leftmost derivation, rightmost derivation,
or parse tree, then the CFG is ambiguous.
During compilation, the parser uses the grammar of the language to make a parse
tree (or derivation tree) out of the source code. The grammar used must be
unambiguous; an ambiguous grammar must not be used for parsing.
Examples of Recursive and Non-Recursive Grammars
Recursive Grammars
1) S->SaS
S->b
The language(set of strings) generated by the above grammar is :{b, bab, babab,…},
which is infinite.
2) S-> Aa
A->Ab|c
The language generated by the above grammar is :{ca, cba, cbba …}, which is
infinite.
Note: A recursive context-free grammar that contains no useless rules necessarily
produces an infinite language.
Non-Recursive Grammars
S->Aa
A->b|c
The language generated by the above grammar is :{ba, ca}, which is finite.
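The languages listed for the three grammars above can be reproduced by enumerating derivations up to a length bound. This is only a sketch under stated assumptions: variables are single uppercase letters, the grammar has no ε-productions (so a sentential form never shrinks and long forms can be discarded), and the function name generate is hypothetical.

```python
from collections import deque


def generate(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    productions maps a variable to a list of right-hand-side strings.
    Assumes no ε-productions, so forms longer than max_len can be dropped.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Find the leftmost variable, if any.
        pos = next((i for i, ch in enumerate(form) if ch in productions), None)
        if pos is None:
            words.add(form)          # no variables left: a generated string
            continue
        for rhs in productions[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))


recursive_1 = {"S": ["SaS", "b"]}
recursive_2 = {"S": ["Aa"], "A": ["Ab", "c"]}
non_recursive = {"S": ["Aa"], "A": ["b", "c"]}
print(generate(recursive_1, "S", 5))    # ['b', 'bab', 'babab']
print(generate(recursive_2, "S", 4))    # ['ca', 'cba', 'cbba']
print(generate(non_recursive, "S", 10)) # ['ba', 'ca']
```

Raising max_len keeps producing new strings for the two recursive grammars, while the non-recursive grammar never yields more than its two strings.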
Types of Recursive Grammars
Based on the nature of the recursion in a recursive grammar, a recursive CFG can be
again divided into the following:
Left Recursive Grammar (having left Recursion)
Right Recursive Grammar (having right Recursion)
General Recursive Grammar(having general Recursion)
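The three kinds of recursion can be told apart by looking at where a variable reappears in its own right-hand side. The sketch below assumes the usual definitions, which the notes name but do not spell out: A -> Aα is left recursion, A -> αA is right recursion, and A -> αAβ (the variable strictly inside) is general recursion. It only detects direct recursion and assumes single-letter variables; the function name recursion_kinds is hypothetical.

```python
def recursion_kinds(productions):
    """Return the kinds of direct recursion found in a grammar."""
    kinds = set()
    for left, alternatives in productions.items():
        for rhs in alternatives:
            if left not in rhs:
                continue  # this alternative is not directly recursive
            if rhs.startswith(left):
                kinds.add("left")
            if rhs.endswith(left):
                kinds.add("right")
            if left in rhs[1:-1]:
                kinds.add("general")
    return kinds


print(recursion_kinds({"S": ["Aa"], "A": ["Ab", "c"]}))  # {'left'}
print(recursion_kinds({"S": ["aS", "b"]}))               # {'right'}
print(recursion_kinds({"S": ["aSb", "ab"]}))             # {'general'}
```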
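For example, the following grammar is not context-free, because the production
AB --> abc has more than one symbol on its left-hand side: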
S --> AB
AB --> abc
B --> b
Type 2: Context-Free Grammar: Type-2 grammars generate context-free
languages. The language generated by the grammar is recognized by a pushdown
automaton. In Type 2:
First of all, it should be Type 1.
The left-hand side of a production can have only one variable, and there is
no restriction on the right-hand side.
For example:
S --> AB
A --> a
B --> b
Type 3: Regular Grammar: Type-3 grammars generate regular languages. These
languages are exactly all languages that can be accepted by a finite-state
automaton. Type 3 is the most restricted form of grammar.
Type 3 should be in the given form only :
V --> VT / T (left-regular grammar)
(or)
V --> TV /T (right-regular grammar)
For example:
S --> a
The above form is called strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In this
form:
V --> VT* / T*. (extended left-regular grammar)
(or)
V --> T*V /T* (extended right-regular grammar)
For example :
S --> ab.
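The strict and extended right-regular forms can be checked with a few lines of code. This is a sketch, assuming single-character symbols and a hypothetical function name is_right_regular; the left-regular check is symmetric (look at the first symbol instead of the last).

```python
def is_right_regular(productions, variables, extended=False):
    """Check V -> TV | T (strict) or V -> T*V | T* (extended)."""
    for left, alternatives in productions.items():
        if left not in variables:
            return False
        for rhs in alternatives:
            # Strip one optional trailing variable; the rest must be terminals.
            body = rhs[:-1] if rhs and rhs[-1] in variables else rhs
            if any(ch in variables for ch in body):
                return False
            # The strict form allows exactly one terminal.
            if not extended and len(body) != 1:
                return False
    return True


print(is_right_regular({"S": ["a"]}, {"S"}))                  # True
print(is_right_regular({"S": ["ab"]}, {"S"}))                 # False
print(is_right_regular({"S": ["ab"]}, {"S"}, extended=True))  # True
```

The production S --> ab is therefore accepted only by the extended form, which matches the two examples above.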
Representation Technique
Top-down Approach −
Starts with the starting symbol S
Goes down to tree leaves using productions
Bottom-up Approach −
Starts from tree leaves
Proceeds upward to the root which is the starting symbol S
Derivation or Yield of a Tree
[Figures: example derivation trees]
PARSE TREE
o A parse tree is the graphical representation of a derivation. Each node is labeled
with a symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived using the start symbol. The root of the parse tree is
that start symbol.
o A parse tree follows the precedence of operators. The deepest sub-tree is traversed
first, so the operator in a parent node has lower precedence than the operator in its
sub-tree.
Example:
Production rules:
1. T → T + T | T * T
2. T → a | b | c
Input:
a * b + c
Steps 1 to 5: [Figures: the parse tree for a * b + c, built up one step at a time]
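A parse tree for the input a * b + c, and its yield, can be written down directly in code. This is an illustrative sketch only: the (label, children) tuple representation and the helper names leaf and tree_yield are assumptions, and the tree shown is one possible parse tree for this grammar, with + at the root so that a * b sits in the deeper sub-tree.

```python
# A parse tree node is (label, children); a leaf has an empty child list.
def leaf(symbol):
    return (symbol, [])


def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(tree_yield(child))
    return result


# T -> T + T at the root, T -> T * T in the left sub-tree.
product = ("T", [("T", [leaf("a")]), leaf("*"), ("T", [leaf("b")])])
tree = ("T", [product, leaf("+"), ("T", [leaf("c")])])
print(" ".join(tree_yield(tree)))  # a * b + c
```

Reading the leaves from left to right gives back the input string, which is what is meant by the yield of the tree.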
AMBIGUOUS GRAMMAR
Ambiguous grammar: A CFG is said to be ambiguous if there exists more than
one derivation tree for some input string, i.e., more than
one LeftMost Derivation Tree (LMDT) or RightMost Derivation Tree (RMDT).
Definition: G = (V,T,P,S) is a CFG that is said to be ambiguous if and only if there
exists a string in T* that has more than one parse tree, where:
V is a finite set of variables.
T is a finite set of terminals.
P is a finite set of productions of the form A -> α, where A is a variable and
α ∈ (V ∪ T)*.
S is a designated variable called the start symbol.
For Example:
1. Let us consider this grammar: E -> E+E|id
We can create two parse trees from this grammar for the string id+id+id. One groups
the string as (id+id)+id and the other as id+(id+id). [Figures: the two parse trees]
Both parse trees are derived from the same grammar rules, but they are
different. Hence the grammar is ambiguous.
2. Let us now consider the following grammar:
Set of alphabets ∑ = {0,…,9, +, *, (, )}
E -> I
E -> E + E
E -> E * E
E -> (E)
I -> ε | 0 | 1 | … | 9
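For the first grammar, E -> E+E | id, the number of distinct parse trees of a string can be counted by trying every + as the top-level operator. The following is a small sketch specific to that one grammar; the function name count_parse_trees and the token-list input are assumptions made for illustration.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E derives tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] == "+":
                # E -> E + E with this + as the root operator.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id"]))              # 1
print(count_parse_trees(["id", "+", "id", "+", "id"]))   # 2
```

A count greater than 1 for any string, as for id+id+id here, is enough to show that the grammar is ambiguous.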
1. Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of
some word in L.
2. There should not be any production of the form X → Y, where X and Y are non-terminals.
3. If ε is not in the language L, then there need not be any production X → ε.
For Example:
In the above example, the variable 'C' will never occur in the derivation of any string,
so the production C → ad is useless. So we will eliminate it, and the other
productions are written in such a way that variable C can never be reached from the
starting variable 'T'.
To remove the useless production A → aA, we first find all the variables that
never lead to a terminal string, such as variable 'A'. Then we remove all the
productions in which the variable 'A' occurs.
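Both kinds of useless symbols (variables that never lead to a terminal string, and variables that can never be reached from the start symbol) can be removed in two passes. The sketch below is an illustration under stated assumptions: variables are single uppercase letters, and the sample grammar is a hypothetical one chosen to be consistent with the names used in the text (start symbol T, A → aA, C → ad); it is not recovered from the notes.

```python
def remove_useless(productions, start):
    """Drop non-generating variables, then unreachable ones.

    productions maps each variable (one uppercase letter) to a list of
    right-hand-side strings; every other character is a terminal.
    """
    def usable(rhs, generating):
        return all(not ch.isupper() or ch in generating for ch in rhs)

    # 1. Find generating variables: those that derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for left, alternatives in productions.items():
            if left not in generating and any(
                usable(rhs, generating) for rhs in alternatives
            ):
                generating.add(left)
                changed = True
    kept = {
        left: [rhs for rhs in alts if usable(rhs, generating)]
        for left, alts in productions.items() if left in generating
    }
    # 2. Keep only variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in kept:
            continue
        reachable.add(var)
        stack.extend(ch for rhs in kept[var] for ch in rhs if ch.isupper())
    return {left: alts for left, alts in kept.items() if left in reachable}


# Hypothetical grammar for illustration only.
grammar = {
    "T": ["aaB", "abA", "aaT"],
    "A": ["aA"],          # never reaches a terminal string
    "B": ["ab", "b"],
    "C": ["ad"],          # never reachable from T
}
print(remove_useless(grammar, "T"))
```

The order of the two passes matters: removing non-generating variables first can make further variables unreachable, which the second pass then cleans up.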
Elimination of ε Production
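The productions of type S → ε are called ε productions. These productions
can be removed completely only from grammars that do not generate ε.
Step 1: First find all nullable non-terminal variables, that is, those which derive ε.
Step 2: For each production A → α, construct all productions A → x, where x is
obtained from α by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and remove the ε
productions.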
Example:
Remove the ε productions from the following CFG while preserving its meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
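Solution:
While removing the ε productions X → ε and Y → ε, we preserve the meaning of the
grammar by substituting ε for X and Y wherever they appear on a right-hand side.
Let us take
S → XYX
If the first X = ε then
S → YX
If the last X = ε then
S → XY
If Y = ε then
S → XX
If Y and one X are ε then
S → X
If both X are ε then
S → Y
Now, keeping the original production as well,
S → XYX | XY | YX | XX | X | Y
Now consider X → 0X. Substituting ε for X on the right-hand side gives X → 0, so
X → 0X | 0
Similarly Y → 1Y | 1
Collectively, the CFG with the ε productions removed is:
1. S → XYX | XY | YX | XX | X | Y
2. X → 0X | 0
3. Y → 1Y | 1
The production S → XYX must be kept; without it, strings such as 010 could no longer
be derived. The empty string itself, which the original grammar generates, is no
longer derivable.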
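The ε-elimination procedure can be sketched in code as follows. This is a minimal illustration, assuming single-letter uppercase variables and using the empty string "" to stand for ε; the function name remove_epsilon is hypothetical. Note that the sketch keeps the original right-hand side XYX among the results, since it is needed for strings such as 010.

```python
from itertools import product


def remove_epsilon(productions):
    """Eliminate ε-productions; "" stands for ε, variables are uppercase."""
    # Step 1: find the nullable variables.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for left, alternatives in productions.items():
            if left not in nullable and any(
                all(ch in nullable for ch in rhs) for rhs in alternatives
            ):
                nullable.add(left)
                changed = True
    # Steps 2 and 3: for each right-hand side, keep or drop every nullable
    # occurrence in all combinations, and discard the empty result.
    result = {}
    for left, alternatives in productions.items():
        new_alts = []
        for rhs in alternatives:
            choices = [(ch, "") if ch in nullable else (ch,) for ch in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate and candidate not in new_alts:
                    new_alts.append(candidate)
        result[left] = new_alts
    return result


grammar = {"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]}
for left, alts in remove_epsilon(grammar).items():
    print(left, "→", " | ".join(alts))
```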
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
For example:
1. S → 0A | 1B | C
2. A → 0S | 00
3. B → 1 | A
4. C → 01
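Solution:
S → C is a unit production. Replacing C with what it produces gives:
S → 0A | 1B | 01
Similarly, B → A is a unit production. Replacing A with what it produces gives:
B → 1 | 0S | 00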
Properties
Unit productions are the productions in which one non-terminal gives
exactly one other non-terminal, for example X->Y.
Remove unit production
Step 1 − To remove X->Y add production X->a to the grammar rule whenever Y->a
occurs in the grammar.
Step 2 − Now delete X->Y from the grammar
Step 3 − Repeat Step 1 and 2 until all unit productions are removed
Example
Consider the context free grammar given below and remove unit production
for the same.
S->0A|1B|C
A->0S|00
B->1|A
C->01
Explanation
Step 1
S->C is a unit production. While removing S->C we have to consider what C
gives, so that we can add a rule to S.
S->0A|1B|01
Step 2
B->A is also a unit production. Replacing A with what it gives:
B->1|0S|00
The final grammar is:
S->0A|1B|01
A->0S|00
B->1|0S|00
C->01
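The three steps for removing unit productions can be sketched as a short function. This is an illustration only, assuming single-letter variables so that a right-hand side equal to a variable name is a unit production; the function name remove_unit_productions is hypothetical. Instead of repeating steps 1 and 2 one production at a time, the sketch first collects every variable reachable through a chain of unit productions and then copies their non-unit rules, which gives the same result.

```python
def remove_unit_productions(productions):
    """Replace every unit production X -> Y by the non-unit rules of Y."""
    variables = set(productions)
    result = {}
    for left in productions:
        # Collect every variable reachable from `left` through unit steps.
        closure = [left]
        for var in closure:              # the list grows while we iterate
            for rhs in productions[var]:
                if rhs in variables and rhs not in closure:
                    closure.append(rhs)
        # Copy the non-unit right-hand sides of all those variables.
        new_alts = []
        for var in closure:
            for rhs in productions[var]:
                if rhs not in variables and rhs not in new_alts:
                    new_alts.append(rhs)
        result[left] = new_alts
    return result


grammar = {
    "S": ["0A", "1B", "C"],
    "A": ["0S", "00"],
    "B": ["1", "A"],
    "C": ["01"],
}
for left, alts in remove_unit_productions(grammar).items():
    print(left + "->" + "|".join(alts))
```

After the removal, C is no longer reachable from S, so C->01 could also be dropped as a useless production.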
For example:
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
The production rules of Grammar G1 satisfy the rules specified for CNF, so the
grammar G1 is in CNF. However, the production rules of Grammar G2 do not satisfy
the rules specified for CNF, as S → aA contains a terminal followed by a non-terminal. So
the grammar G2 is not in CNF.
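Whether a grammar is in CNF can be checked rule by rule. The sketch below assumes the usual CNF rules (a non-terminal gives two non-terminals, a non-terminal gives one terminal, and only the start symbol may give ε), single-character symbols with uppercase letters as variables, and a hypothetical function name is_cnf.

```python
def is_cnf(productions, start):
    """Check Chomsky normal form: A -> BC, A -> a, or start -> ε."""
    for left, alternatives in productions.items():
        for rhs in alternatives:
            if rhs == "":
                if left != start:        # only the start symbol may give ε
                    return False
            elif len(rhs) == 1:
                if rhs.isupper():        # unit production A -> B
                    return False
            elif len(rhs) == 2:
                if not (rhs[0].isupper() and rhs[1].isupper()):
                    return False
            else:
                return False             # more than two symbols
    return True


G1 = {"S": ["AB", "c"], "A": ["a"], "B": ["b"]}
G2 = {"S": ["aA"], "A": ["a"], "B": ["c"]}
print(is_cnf(G1, "S"))  # True
print(is_cnf(G2, "S"))  # False
```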
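Step 1: Eliminate the start symbol from the RHS. If the start symbol S appears on the
right-hand side of any production, create a new production:
1. S1 → S
where S1 is the new start symbol.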
Step 2: In the grammar, remove the null, unit and useless productions. You can refer
to the Simplification of CFG.
Step 3: Eliminate terminals from the RHS of the production if they exist with other
non-terminals or terminals. For example, production S → aA can be decomposed as:
1. S → RA
2. R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can
be decomposed as:
1. S → RS
2. R → AS
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
1. S → a | aA | B
2. A → aBB | ε
3. B → Aa | b
Solution:
Step 1: We will create a new production S1 → S, as the start symbol S appears on the
RHS. The grammar will be:
1. S1 → S
2. S → a | aA | B
3. A → aBB | ε
4. B → Aa | b
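Step 2: As grammar G1 contains the null production A → ε, its removal from the
grammar yields:
1. S1 → S
2. S → a | aA | B
3. A → aBB
4. B → Aa | b | a
Now, as the grammar contains the unit production S → B, its removal yields:
1. S1 → S
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Also remove the unit production S1 → S; its removal from the grammar yields:
1. S1 → a | aA | Aa | b
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Step 3: In the productions S1 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the terminal
a appears on the RHS together with non-terminals. Replacing it with a new variable X gives:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → XBB
4. B → AX | b | a
5. X → a
Step 4: In the production rule A → XBB, the RHS has more than two symbols; decomposing
it yields:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → RB
4. B → AX | b | a
5. X → a
6. R → XB
Hence, for the given grammar, this is the required CNF.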
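Steps 3 and 4 of the conversion are mechanical and can be sketched in code. The sketch assumes that ε and unit productions have already been removed, that each right-hand side is a list of symbols with uppercase symbols as variables, and that the caller supplies unused variable names; the function name to_binary_form is hypothetical. The new start symbol is written S1 here, following Step 1.

```python
def to_binary_form(productions, fresh_names):
    """Apply steps 3 and 4 to a grammar with no ε or unit productions."""
    fresh = iter(fresh_names)
    result = {left: [list(rhs) for rhs in alts]
              for left, alts in productions.items()}
    # Step 3: in right-hand sides of length >= 2, replace each terminal a
    # by a new variable X with the single production X -> a.
    proxy = {}
    for left in list(result):
        for rhs in result[left]:
            if len(rhs) < 2:
                continue
            for i, sym in enumerate(rhs):
                if not sym.isupper():
                    if sym not in proxy:
                        proxy[sym] = next(fresh)
                        result[proxy[sym]] = [[sym]]
                    rhs[i] = proxy[sym]
    # Step 4: split right-hand sides longer than two symbols.
    for left in list(result):
        for rhs in result[left]:
            while len(rhs) > 2:
                helper = next(fresh)
                result[helper] = [rhs[:2]]
                rhs[:2] = [helper]
    return result


# The grammar after step 2 of the worked example.
grammar = {
    "S1": [["a"], ["a", "A"], ["A", "a"], ["b"]],
    "S": [["a"], ["a", "A"], ["A", "a"], ["b"]],
    "A": [["a", "B", "B"]],
    "B": [["A", "a"], ["b"], ["a"]],
}
cnf = to_binary_form(grammar, ["X", "R"])
for left, alts in cnf.items():
    print(left, "→", " | ".join("".join(rhs) for rhs in alts))
```

With the fresh names X and R supplied in that order, the output uses the same helper variables as the worked example.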
For example:
The production rules of Grammar G1 satisfy the rules specified for GNF, so the
grammar G1 is in GNF. However, the production rules of Grammar G2 do not satisfy
the rules specified for GNF, as A → ε and B → ε contain ε (only the start symbol can
generate ε). So the grammar G2 is not in GNF.
Step 1: If the given grammar is not in CNF, convert it into CNF.
Step 2: If the context-free grammar contains left recursion, eliminate it.
Step 3: In the grammar, convert the given production rules into GNF form.
If any production rule in the grammar is not in GNF form, convert it.
Example:
1. S → XB | AA
2. A → a | SA
3. B → b
4. X → a
Solution:
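As the given grammar G is already in CNF and there is no direct left recursion, we can
skip step 1 and step 2 and go directly to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA in the
production rule A → SA as:
1. S → XB | AA
2. A → a | XBA | AAA
3. B → b
4. X → a
The production rules S → XB and A → XBA are not in GNF, so we substitute X → a in the
production rules S → XB and A → XBA as:
1. S → aB | AA
2. A → a | aBA | AAA
3. B → b
4. X → a
Now we remove the left recursion (A → AAA), and we get:
1. S → aB | AA
2. A → aC | aBAC
3. C → AAC | ε
4. B → b
5. X → a
Now we remove the null production C → ε, and we get:
1. S → aB | AA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B → b
5. X → a
The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA
for its first A:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B → b
5. X → a
The production rules C → AAC and C → AA are not in GNF either, so we substitute for
their first A in the same way:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → aCAC | aBACAC | aAC | aBAAC | aCA | aBACA | aA | aBAA
4. B → b
5. X → a
Hence, this is the GNF form of the grammar G.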
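A grammar can be tested for GNF by checking that every right-hand side is one terminal followed only by variables. The sketch below assumes single-character symbols with uppercase letters as variables and a hypothetical function name is_gnf. The second grammar in the example is obtained by substituting the A-productions for the leading A in S → AA, C → AAC and C → AA; that completed form is derived for this sketch.

```python
def is_gnf(productions, start):
    """Check Greibach normal form: A -> a B1 ... Bk, or start -> ε."""
    for left, alternatives in productions.items():
        for rhs in alternatives:
            if rhs == "":
                if left != start:        # only the start symbol may give ε
                    return False
                continue
            # First symbol must be a terminal, the rest variables.
            if rhs[0].isupper() or not all(ch.isupper() for ch in rhs[1:]):
                return False
    return True


# Grammar after removing the null production C → ε.
partial = {
    "S": ["aB", "AA"],
    "A": ["aC", "aBAC", "a", "aBA"],
    "C": ["AAC", "AA"],
    "B": ["b"],
    "X": ["a"],
}
# After substituting for the leading A (derived for this sketch).
complete = {
    "S": ["aB", "aCA", "aBACA", "aA", "aBAA"],
    "A": ["aC", "aBAC", "a", "aBA"],
    "C": ["aCAC", "aBACAC", "aAC", "aBAAC", "aCA", "aBACA", "aA", "aBAA"],
    "B": ["b"],
    "X": ["a"],
}
print(is_gnf(partial, "S"))   # False
print(is_gnf(complete, "S"))  # True
```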
PUMPING LEMMA
If L is a context-free language, there is a pumping length p such
that any string w ∈ L of length ≥ p can be written as w = uvxyz,
where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0, uvⁱxyⁱz ∈ L.
Applications of Pumping Lemma
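The pumping lemma is typically used to show that a language is not context-free: if some sufficiently long string has no valid split uvxyz, the language cannot be context-free. The sketch below tries every split of one string by brute force. The language { aⁿbⁿcⁿ : n ≥ 1 } is a standard textbook example assumed here, and the function names and the choice p = 4 are hypothetical. Checking a single p only illustrates the argument; a real proof has to reason about every possible p.

```python
def in_language(s):
    """Membership test for L = { a^n b^n c^n : n >= 1 }."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n


def pumpable_splits(w, p, max_i=3):
    """Return every split w = uvxyz with |vxy| <= p and vy != ε
    for which u v^i x y^i z stays in the language for i = 0..max_i."""
    survivors = []
    n = len(w)
    for a in range(n + 1):                             # u = w[:a]
        limit = min(a + p, n)                          # enforces |vxy| <= p
        for b in range(a, limit + 1):                  # v = w[a:b]
            for c in range(b, limit + 1):              # x = w[b:c]
                for d in range(c, limit + 1):          # y = w[c:d]
                    u, v, x, y, z = w[:a], w[a:b], w[b:c], w[c:d], w[d:]
                    if not v + y:
                        continue                       # vy must not be empty
                    if all(in_language(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        survivors.append((u, v, x, y, z))
    return survivors


p = 4
w = "a" * p + "b" * p + "c" * p
print(pumpable_splits(w, p))  # []
```

No split survives, because a window vxy of length at most p can touch at most two of the three letters, so pumping always unbalances the counts.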
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context
free.
Kleene Star
If L is a context free language, then L* is also context free.
Context Free Grammar (CFG) is of great practical importance.
Properties-
If L1 and L2 are two context free languages, then-
L1 ∪ L2 is also a context free language.
L1.L2 is also a context free language.
L1* and L2* are also context free languages.
L1 ∩ L2 is not necessarily a context free language.
L1′ and L2′ (the complements) are not necessarily context free languages.
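The closure under union, concatenation and Kleene star can be seen from simple grammar constructions: add a new start symbol S with S → S1 | S2, S → S1S2, or S → S1S | ε respectively. The sketch below shows these constructions, assuming the two grammars use disjoint variable names, that the new start name is unused, and that right-hand sides are lists of symbols (the empty list stands for ε). The function names are hypothetical.

```python
def union(g1, s1, g2, s2, new_start="S"):
    """Grammar for L1 ∪ L2: add S -> S1 | S2."""
    return {new_start: [[s1], [s2]], **g1, **g2}


def concatenation(g1, s1, g2, s2, new_start="S"):
    """Grammar for L1L2: add S -> S1 S2."""
    return {new_start: [[s1, s2]], **g1, **g2}


def star(g1, s1, new_start="S"):
    """Grammar for L1*: add S -> S1 S | ε (ε is the empty list)."""
    return {new_start: [[s1, new_start], []], **g1}


# L1 = { a^n b^n : n >= 1 }, L2 = { c^n : n >= 1 }
g1 = {"A": [["a", "A", "b"], ["a", "b"]]}
g2 = {"B": [["c", "B"], ["c"]]}
print(union(g1, "A", g2, "B"))
print(concatenation(g1, "A", g2, "B"))
print(star(g1, "A"))
```

No such construction exists for intersection or complement, which is why those two operations are not guaranteed to stay within the context-free languages.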