Unit-3 Flat

UNIT-3
GRAMMAR
A context-free grammar (CFG) is a type of formal grammar used to describe the
syntax or structure of a formal language. The grammar is a 4-tuple: (V,T,P,S).
V - It is the collection of variables or nonterminal symbols.
T - It is a set of terminals.
P - It is the production rules that consist of both terminals and
nonterminals.
S - It is the Starting symbol.
A grammar is said to be context-free if every production is of the form:
G -> (V∪T)*, where G ∊ V
• The left-hand side of a production (G in the form above) must be a single
variable; it cannot be a terminal.
• The right-hand side can be any string of variables and terminals, including
the empty string.
In other words, a grammar is context-free when every production rewrites a single
variable into some combination of variables and terminals.
For example, consider the grammar G = ({S}, {a, b}, P, S) with the productions:
S -> aS
S -> bSa
• Here S is the starting symbol and the only variable.
• {a,b} are the terminals, generally represented by lowercase characters.
• P is the set of productions.
On the other hand,
a -> bSa, or
a -> ba
are not context-free productions, because the left-hand side is a terminal
rather than a variable, which does not follow the CFG rule.
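The rule above can be checked mechanically. The following is a minimal sketch, assuming productions are given as (left-hand side, right-hand side) string pairs and that S is the only variable; the function name is_context_free is illustrative and not part of the notes.

```python
def is_context_free(productions, variables):
    """Return True if every left-hand side is exactly one variable."""
    return all(lhs in variables for lhs, _ in productions)

variables = {"S"}                            # assumption: S is the only variable
good = [("S", "aS"), ("S", "bSa")]
bad = [("a", "bSa"), ("a", "ba")]            # terminal on the left-hand side

print(is_context_free(good, variables))      # True
print(is_context_free(bad, variables))       # False
```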
In the computer science field, context-free grammars are frequently used, especially
in the areas of formal language theory, compiler development, and natural language
processing. It is also used for explaining the syntax of programming languages and
other formal languages.
Limitations of Context-Free Grammar
Despite their importance in compiler design and computer science, context-free
grammars have some limitations:
• CFGs have limited expressive power. Neither English nor a full programming
language can be described completely by a CFG; for example, a rule such as "a
variable must be declared before it is used" depends on context.
• A CFG can be ambiguous, meaning that the same input can have more than one
parse tree.
• Parsing with an arbitrary CFG can be inefficient: general algorithms take cubic
time in the length of the input, and naive backtracking parsers can take
exponential time.
• Error reporting is less precise, because the grammar alone does not carry enough
information to produce detailed error messages.
CLASSIFICATION OF GRAMMAR
Context Free Grammars (CFG) can be classified on the basis of following two
properties:
1) Based on the number of strings it generates.
• If the CFG generates a finite number of strings, the grammar is said to be
non-recursive.
• If the CFG can generate an infinite number of strings, the grammar is said to
be recursive.
During Compilation, the parser uses the grammar of the language to make a parse
tree(or derivation tree) out of the source code. The grammar used must be
unambiguous. An ambiguous grammar must not be used for parsing.
2) Based on the number of derivation trees.
• If every string has exactly one derivation tree, the CFG is unambiguous.
• If some string has more than one derivation tree (equivalently, more than one
leftmost derivation or more than one rightmost derivation), the CFG is ambiguous.
Examples of Recursive and Non-Recursive Grammars
Recursive Grammars
1) S->SaS
S->b
The language(set of strings) generated by the above grammar is :{b, bab, babab,…},
which is infinite.
2) S-> Aa
A->Ab|c
The language generated by the above grammar is :{ca, cba, cbba …}, which is
infinite.
Note: A recursive context-free grammar that contains no useless rules necessarily
produces an infinite language.
Non-Recursive Grammars
S->Aa
A->b|c
The language generated by the above grammar is :{ba, ca}, which is finite.
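The difference between the recursive and non-recursive examples can be seen by listing the strings each grammar generates up to a length bound. This is a small sketch, assuming uppercase letters are variables, all other characters are terminals, and the grammar has no ε productions; the helper name generate is hypothetical.

```python
from collections import deque

def generate(productions, start, max_len):
    """Enumerate terminal strings of length <= max_len derivable from start.

    Assumes no epsilon productions, so sentential forms never shrink and
    anything longer than max_len can be discarded.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        pos = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if pos is None:
            words.add(form)
            continue
        for rhs in productions[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(words, key=lambda w: (len(w), w))

recursive = {"S": ["SaS", "b"]}
non_recursive = {"S": ["Aa"], "A": ["b", "c"]}

print(generate(recursive, "S", 5))       # ['b', 'bab', 'babab']
print(generate(non_recursive, "S", 5))   # ['ba', 'ca']
```

Raising max_len keeps producing new strings for the recursive grammar, while the non-recursive grammar never yields more than ba and ca.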
Types of Recursive Grammars
Based on the nature of the recursion in a recursive grammar, a recursive CFG can be
again divided into the following:
 Left Recursive Grammar (having left Recursion)
 Right Recursive Grammar (having right Recursion)
 General Recursive Grammar(having general Recursion)
CHOMSKY HIERARCHY THEOREM

According to Chomsky hierarchy, grammar is divided into 4 types as follows:
1. Type 0 is known as unrestricted grammar.
2. Type 1 is known as context-sensitive grammar.
3. Type 2 is known as a context-free grammar.
4. Type 3 is known as regular grammar.
Type 0: Unrestricted Grammar:

Type-0 grammars include all formal grammars. The languages of Type 0 grammars are
recognized by a Turing machine. These languages are also known as the recursively
enumerable languages.
Grammar productions are of the form α → β, where
α is in ( V + T)* V ( V + T)*
β is in ( V + T )*
V : Variables
T : Terminals.
In type 0 there must be at least one variable on the Left side of production.
For example:
Sab --> ba
A --> S
Here, Variables are S, A, and Terminals a, b.
Type 1: Context-Sensitive Grammar
Type-1 grammars generate context-sensitive languages. The language generated by
the grammar is recognized by a Linear Bounded Automaton.
In Type 1:
• First of all, a Type 1 grammar should be Type 0.
• Grammar productions are of the form α → β with |α| ≤ |β|.
That is, the count of symbols in α is less than or equal to the count of symbols in β.
Also β ∈ (V + T)+
i.e. β cannot be ε.
For Example:
S --> AB
AB --> abc
B --> b
Type 2: Context-Free Grammar
Type-2 grammars generate context-free languages. The language generated by the
grammar is recognized by a pushdown automaton. In Type 2:
• First of all, it should be Type 1.
• The left-hand side of a production can have only one variable, and there is
no restriction on the right-hand side β.
For example:
S --> AB
A --> a
B --> b
Type 3: Regular Grammar
Type-3 grammars generate regular languages. These languages are exactly the
languages that can be accepted by a finite-state automaton. Type 3 is the most
restricted form of grammar.
Type 3 productions should be in one of the given forms only:
V --> VT / T (left-regular grammar)
(or)
V --> TV /T (right-regular grammar)
For example:
S --> a
The above form is called strictly regular grammar.
There is another form of regular grammar called extended regular grammar. In this
form:
V --> VT* / T*. (extended left-regular grammar)
(or)
V --> T*V /T* (extended right-regular grammar)
For example :
S --> ab.
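The restrictions of the four types can be turned into a small classifier. The sketch below is an assumption-laden illustration: each symbol is one character, ε productions are not handled, only the strictly regular forms are recognized as Type 3, and the Type 0 requirement of at least one variable on the left-hand side is not verified. The name grammar_type is hypothetical.

```python
def grammar_type(productions, variables):
    """Return the most restrictive Chomsky type (3, 2, 1 or 0).

    productions: list of (lhs, rhs) strings, one character per symbol.
    """
    def is_var(ch):
        return ch in variables

    def right_regular(rhs):   # forms T or TV
        return (len(rhs) == 1 and not is_var(rhs)) or \
               (len(rhs) == 2 and not is_var(rhs[0]) and is_var(rhs[1]))

    def left_regular(rhs):    # forms T or VT
        return (len(rhs) == 1 and not is_var(rhs)) or \
               (len(rhs) == 2 and is_var(rhs[0]) and not is_var(rhs[1]))

    single_var_lhs = all(len(l) == 1 and is_var(l) for l, _ in productions)
    if single_var_lhs:
        if all(right_regular(r) for _, r in productions) or \
           all(left_regular(r) for _, r in productions):
            return 3
        return 2
    if all(len(l) <= len(r) for l, r in productions):
        return 1
    return 0

# The examples from each type above.
print(grammar_type([("Sab", "ba"), ("A", "S")], {"S", "A"}))                    # 0
print(grammar_type([("S", "AB"), ("AB", "abc"), ("B", "b")], {"S", "A", "B"}))  # 1
print(grammar_type([("S", "AB"), ("A", "a"), ("B", "b")], {"S", "A", "B"}))     # 2
print(grammar_type([("S", "a")], {"S"}))                                        # 3
```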
LEFT MOST AND RIGHT MOST DERIVATION

Definition − A context-free grammar (CFG), consisting of a finite
set of grammar rules, is a quadruple (N, T, P, S) where
• N is a set of non-terminal symbols.
• T is a set of terminals, where N ∩ T = ∅.
• P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of
a production rule does not have any right context or left
context.
• S is the start symbol.
Example
• The grammar ({A}, {a, b, c}, P, A), P: A → aA, A → abc.
• The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
• The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε
Generation of Derivation Tree
A derivation tree or parse tree is an ordered rooted tree that
graphically represents the semantic information of a string derived
from a context-free grammar.
Representation Technique
• Root vertex − Must be labeled by the start symbol.
• Vertex − Labeled by a non-terminal symbol.
• Leaves − Labeled by a terminal symbol or ε.
If S → x1x2 …… xn is a production rule in a CFG, then the parse
tree / derivation tree has a vertex labeled S whose children, from
left to right, are labeled x1, x2, ……, xn.
There are two different approaches to draw a derivation tree −
Top-down Approach −
 Starts with the starting symbol S
 Goes down to tree leaves using productions
Bottom-up Approach −
 Starts from tree leaves
 Proceeds upward to the root which is the starting symbol S
Derivation or Yield of a Tree
The derivation or the yield of a parse tree is the final string
obtained by concatenating the labels of the leaves of the tree
from left to right, ignoring the Nulls (ε). However, if all the leaves
are Null, the derivation is Null.
Example
Let a CFG {N,T,P,S} be
N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”
S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
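The yield can be computed by a left-to-right walk over the leaves. The sketch below builds the parse tree for the derivation of "abaabb" by hand; the tuple representation (label, children) and the helper names are assumptions made for illustration.

```python
def tree_yield(node):
    """Concatenate leaf labels from left to right, skipping ε leaves."""
    label, children = node
    if not children:                       # a leaf
        return "" if label == "ε" else label
    return "".join(tree_yield(child) for child in children)

def leaf(label):
    return (label, [])

def s_eps():                               # S → ε
    return ("S", [leaf("ε")])

def s_asb(inner):                          # S → aSb
    return ("S", [leaf("a"), inner, leaf("b")])

# S → SS, first S gives "ab", second S gives "aabb".
tree = ("S", [s_asb(s_eps()), s_asb(s_asb(s_eps()))])
print(tree_yield(tree))                    # abaabb
```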
Sentential Form and Partial Derivation Tree

A partial derivation tree is a sub-tree of a derivation tree/parse
tree such that, for every vertex in it, either all of its children are
in the sub-tree or none of them are in the sub-tree.
Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb| ε
the partial derivation tree can be the following −
If a partial derivation tree contains the root S, it is called
a sentential form. The above sub-tree is also in sentential form.
Leftmost and Rightmost Derivation of a String
• Leftmost derivation − A leftmost derivation is obtained by
applying a production to the leftmost variable in each step.
• Rightmost derivation − A rightmost derivation is obtained by
applying a production to the rightmost variable in each step.
Example
Let the set of production rules in a CFG be
X → X+X | X*X | X | a
over the alphabet {a, +, *}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a + X*X → a+a*X → a+a*a
The stepwise derivation of the above string is shown as below −

The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
The stepwise derivation of the above string is shown as below −
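The two derivations of "a+a*a" can be replayed step by step. This is a minimal sketch, assuming the right-hand sides to apply are supplied in order and that X is the only variable; the function name derive is hypothetical.

```python
def derive(start, steps, variables, leftmost=True):
    """Apply each right-hand side in `steps` to the leftmost (or rightmost)
    variable of the current sentential form and return every form."""
    form = start
    forms = [form]
    for rhs in steps:
        positions = [i for i, ch in enumerate(form) if ch in variables]
        pos = positions[0] if leftmost else positions[-1]
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms

left = derive("X", ["X+X", "a", "X*X", "a", "a"], {"X"})
right = derive("X", ["X*X", "a", "X+X", "a", "a"], {"X"}, leftmost=False)

print(" → ".join(left))    # X → X+X → a+X → a+X*X → a+a*X → a+a*a
print(" → ".join(right))   # X → X*X → X*a → X+X*a → X+a*a → a+a*a
```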

Left and Right Recursive Grammars
In a context-free grammar G, if there is a production of the
form X → Xa, where X is a non-terminal and ‘a’ is a string of
terminals, it is called a left recursive production. A grammar
having a left recursive production is called a left recursive grammar.
Similarly, if there is a production of the form X → aX, where X is a
non-terminal and ‘a’ is a string of terminals, it is called a right
recursive production. A grammar having a right recursive production
is called a right recursive grammar.
PARSE TREE
o A parse tree is the graphical representation of a derivation. Each node is labeled
with a symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived using the start symbol. The root of the parse tree is
that start symbol.
o A parse tree follows the precedence of operators. The deepest sub-tree is traversed
first, so the operator in the parent node has lower precedence than the operator in
the sub-tree.
The parse tree follows these points:

o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o In-order traversal gives original input string.
Example:
Production rules:
1. T= T + T | T * T
2. T = a|b|c
Input:
a * b + c
Steps 1 to 5: (figures showing the parse tree for a * b + c being built step by step)
AMBIGUOUS GRAMMAR
Ambiguous grammar: A CFG is said to be ambiguous if there exists more than
one derivation tree for some input string, i.e., more than
one LeftMost Derivation Tree (LMDT) or RightMost Derivation Tree (RMDT).
Definition: A CFG G = (V,T,P,S) is said to be ambiguous if and only if there
exists a string in T* that has more than one parse tree, where
V is a finite set of variables,
T is a finite set of terminals,
P is a finite set of productions of the form A -> α, where A is a variable and
α ∈ (V ∪ T)*, and
S is a designated variable called the start symbol.
For Example:
1. Let us consider the grammar E -> E+E|id. We can create 2 parse trees from this
grammar for the string id+id+id: one groups the string as (id+id)+id and the other
as id+(id+id).
Both parse trees are derived from the same grammar rules, but they are different.
Hence the grammar is ambiguous.
2. Let us now consider the following grammar:
Set of alphabets ∑ = {0,…,9, +, *, (, )}
E -> I
E -> E + E
E -> E * E
E -> (E)
I -> ε | 0 | 1 | … | 9
From the above grammar, the string 3*2+5 can be derived in 2 ways:

I) First leftmost derivation      II) Second leftmost derivation
E => E*E                          E => E+E
  => I*E                            => E*E+E
  => 3*E                            => I*E+E
  => 3*E+E                          => 3*E+E
  => 3*I+E                          => 3*I+E
  => 3*2+E                          => 3*2+E
  => 3*2+I                          => 3*2+I
  => 3*2+5                          => 3*2+5
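Ambiguity of the expression grammar above can be confirmed by counting parse trees. The following is a sketch under stated assumptions: it handles only E -> I | E + E | E * E | (E) with I deriving a single digit (the I -> ε alternative is ignored), and the function name count_parse_trees is hypothetical.

```python
from functools import lru_cache

def count_parse_trees(text):
    """Count parse trees of `text` under E → E+E | E*E | (E) | I, I → digit."""
    tokens = tuple(text)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i].isdigit() else 0
        if j - i < 1:
            return 0
        total = 0
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)              # E → (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in "+*":                     # E → E+E or E → E*E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("3*2+5"))     # 2
print(count_parse_trees("(3*2)+5"))   # 1
```

A count greater than one for any string is enough to show that the grammar is ambiguous; adding parentheses to the input removes the second tree.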
Following are some examples of ambiguous grammar:
 S-> aS |Sa| Є
 E-> E +E | E*E| id
 A -> AA | (A) | a
 S -> SS|AB , A -> Aa|a , B -> Bb|b
Whereas following grammars are unambiguous:
 S -> (L) | a, L -> LS | S
 S -> AA , A -> aA , A -> b
Inherently ambiguous language: Let L be a context-free language (CFL). If
every context-free grammar G with L = L(G) is ambiguous, then L is
said to be an inherently ambiguous language. Ambiguity is normally a property of a
grammar rather than of a language, since an ambiguous grammar can often be replaced
by an unambiguous grammar for the same language; an inherently ambiguous language
is one for which no such replacement exists. An ambiguous grammar is unlikely to be
useful for a programming language, because two (or more) parse tree structures for
the same string (program) imply two different meanings (executable programs) for
the program. An inherently ambiguous language would be absolutely unsuitable as a
programming language, because we would not have any way of fixing a unique
structure for all its programs. For example, the following language is inherently
ambiguous:
L = {a^n b^n c^m} ∪ {a^n b^m c^m}
SIMPLIFICATION OF CFG-ELIMINATION OF USELESS SYMBOLS
As we have seen, various languages can be represented efficiently by a context-free
grammar. Not every grammar is optimized: a grammar may contain extra symbols
(non-terminals) that unnecessarily increase its length. Simplification of a grammar
means reducing it by removing useless symbols. The properties of a reduced grammar
are given below:
1. Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of
some word in L.
2. There should not be any production of the form X → Y where X and Y are non-terminals.
3. If ε is not in the language L, then there need not be any production X → ε.
Let us study the reduction process in detail.
Removal of Useless Symbols

A symbol is useless if it does not take part in the derivation of any string of the
language. This happens when the symbol cannot be reached from the start symbol, or
when it can never lead to a string of terminals. A variable with this property is
known as a useless variable.
For Example:
1. T → aaB | abA | aaT

2. A → aA
3. B → ab | b
4. C → ad
In the above example, the variable 'C' will never occur in the derivation of any string,
so the production C → ad is useless. We eliminate it, because the other
productions are written in such a way that variable C can never be reached from the
starting variable 'T'.
Production A → aA is also useless because there is no way to terminate it. If it never
terminates, then it can never produce a string. Hence this production can never take
part in any derivation.
To remove the useless production A → aA, we first find all the variables that can
never lead to a terminal string, such as variable 'A'. Then we remove all the
productions in which the variable 'A' occurs. After also dropping the unreachable
variable 'C', the reduced grammar is:
1. T → aaB | aaT
2. B → ab | b
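The two checks described above (can the variable lead to a terminal string, and can it be reached from the start symbol) can be sketched as follows. The representation is an assumption: productions are a dict from a variable to its right-hand sides, uppercase letters are variables, and the name remove_useless is hypothetical.

```python
def remove_useless(productions, start):
    """productions: dict variable -> list of right-hand sides (strings)."""
    # Phase 1: find variables that can derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in productions.items():
            if var in generating:
                continue
            if any(all(not s.isupper() or s in generating for s in body)
                   for body in bodies):
                generating.add(var)
                changed = True
    kept = {v: [b for b in bodies
                if all(not s.isupper() or s in generating for s in b)]
            for v, bodies in productions.items() if v in generating}

    # Phase 2: keep only variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in kept:
            continue
        reachable.add(var)
        for body in kept[var]:
            stack.extend(s for s in body if s.isupper())
    return {v: kept[v] for v in kept if v in reachable}

grammar = {"T": ["aaB", "abA", "aaT"], "A": ["aA"],
           "B": ["ab", "b"], "C": ["ad"]}
print(remove_useless(grammar, "T"))
# {'T': ['aaB', 'aaT'], 'B': ['ab', 'b']}
```

The non-generating variables are removed before the unreachable ones; doing it in the opposite order can leave useless symbols behind.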
Elimination of ε Production
Productions of the type S → ε are called ε productions. They can be removed
completely only from grammars that do not generate ε.
Step 1: First find all nullable non-terminals, i.e. the variables that derive ε.
Step 2: For each production A → a, construct all productions A → x, where x is
obtained from a by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and remove
the ε productions.
Example:
Remove the ε productions from the following CFG while preserving its meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
Solution:
Now, while removing the ε productions, we delete the rules X → ε and Y → ε. To
preserve the meaning of the CFG, we substitute ε on the right-hand side
wherever X and Y appear.
Let us take
1. S → XYX
If the first X at right-hand side is ε. Then
1. S → YX
Similarly if the last X in R.H.S. = ε. Then
1. S → XY
If Y = ε then
1. S → XX
If Y and X are ε then,
1. S → X
If both X are replaced by ε
1. S → Y
Now, keeping the original production S → XYX as well,
1. S → XYX | XY | YX | XX | X | Y
Now let us consider
1. X → 0X
If we place ε at right-hand side for X then,
1. X → 0
2. X → 0X | 0
Similarly Y → 1Y | 1
Collectively we can rewrite the CFG with removed ε production as
1. S → XYX | XY | YX | XX | X | Y
2. X → 0X | 0
3. Y → 1Y | 1
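The three steps can be sketched in code. The assumptions here are that productions are a dict from a variable to its right-hand sides, that the empty string "" stands for ε, and that each symbol is one character; the name eliminate_epsilon is hypothetical. Following Step 3, the original body XYX is kept alongside the shortened bodies.

```python
from itertools import product

def eliminate_epsilon(productions):
    """productions: dict variable -> list of bodies; "" stands for ε."""
    # Step 1: find the nullable variables.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in productions.items():
            if var not in nullable and any(
                    all(s in nullable for s in body) for body in bodies):
                nullable.add(var)
                changed = True

    # Steps 2 and 3: keep or drop every nullable symbol in every body,
    # then discard the empty bodies.
    result = {}
    for var, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[var] = new_bodies
    return result

grammar = {"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]}
for var, bodies in eliminate_epsilon(grammar).items():
    print(var, "→", " | ".join(bodies))
# S → XYX | XY | XX | X | YX | Y
# X → 0X | 0
# Y → 1Y | 1
```

Because S itself is nullable in this example, the resulting grammar generates the original language without the empty string.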
Removing Unit Productions

Unit productions are productions in which one non-terminal gives another
non-terminal. Use the following steps to remove unit productions:
Step 1: To remove X → Y, add the production X → a to the grammar whenever Y → a
occurs in the grammar.
Step 2: Now delete X → Y from the grammar.
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
For example:
1. S → 0A | 1B | C
2. A → 0S | 00
3. B → 1 | A
4. C → 01
Solution:
S → C is a unit production. While removing S → C, we have to consider what C
gives, so we add C's rule to S.
1. S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
1. B → 1 | 0S | 00
Thus finally we can write CFG without unit production as

1. S → 0A | 1B | 01
2. A → 0S | 00
3. B → 1 | 0S | 00
4. C → 01
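The removal of unit productions can be sketched as below. Instead of repeating Steps 1 and 2 literally, this version first collects, for each variable, every variable reachable through unit productions and then copies their non-unit bodies; the result is the same grammar. The dict representation and the name remove_unit_productions are assumptions for illustration.

```python
def remove_unit_productions(productions):
    """productions: dict variable -> list of bodies (strings).
    A unit production is a body consisting of a single variable."""
    variables = set(productions)
    result = {}
    for var in productions:
        # Every variable reachable from `var` through unit productions.
        closure, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in productions[current]:
                if body in variables and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # Copy the non-unit bodies, starting with var's own bodies.
        order = [var] + [v for v in productions if v in closure and v != var]
        bodies = []
        for other in order:
            for body in productions[other]:
                if body not in variables and body not in bodies:
                    bodies.append(body)
        result[var] = bodies
    return result

grammar = {"S": ["0A", "1B", "C"], "A": ["0S", "00"],
           "B": ["1", "A"], "C": ["01"]}
for var, bodies in remove_unit_productions(grammar).items():
    print(var, "→", " | ".join(bodies))
# S → 0A | 1B | 01
# A → 0S | 00
# B → 1 | 0S | 00
# C → 01
```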
ε-PRODUCTION AND UNIT PRODUCTION

All grammars are not always optimized, which means the grammar may
consist of some extra symbols (non-terminals) which increase the length of
grammar.
So, we have to reduce the grammar by removing the useless symbols.
Properties
The properties of a reduced grammar are explained below −
• Each non-terminal and terminal of G appears in the derivation of some word in L.
• There should not be any production of the form X->Y where X and Y are non-terminals.
• If epsilon is not in the language L, then there need not be any production X->ε.
The diagram given herewith describes the properties of a reduced grammar −
Unit productions are the productions in which one non-terminal gives
another non-terminal.
Remove unit production
The steps to remove the unit production are given below −
 Step 1 − To remove X->Y add production X->a to the grammar rule whenever Y->a
occurs in the grammar.
 Step 2 − Now delete X->Y from the grammar
 Step 3 − Repeat Step 1 and 2 until all unit productions are removed
Example
Consider the context free grammar given below and remove unit production
for the same.
S->0A|1B|C
A->0S|00
B->1|A
C->01
Explanation
Step 1
S->C is unit production but while removing S->C we have to consider what C
gives so we can add a rule to S.
S->0A|1B|01
Step 2
B->A is also unit production
B->1|0S|00
Finally, we can write CFG without unit production as follows −
S->0A|1B|01
A->0S|00
B->1|0S|00
C->01
NORMAL FORM FOR CFG-CHOMSKY NORMAL FORM AND GREIBACH NORMAL FORM
CNF stands for Chomsky normal form. A CFG (context-free grammar) is in
CNF if all production rules satisfy one of the following conditions:
o The start symbol generating ε. For example, S → ε.
o A non-terminal generating two non-terminals. For example, S → AB.
o A non-terminal generating a terminal. For example, S → a.
For example:
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
The production rules of grammar G1 satisfy the rules specified for CNF, so the
grammar G1 is in CNF. However, the production rules of grammar G2 do not satisfy
the rules specified for CNF, as S → aA contains a terminal followed by a
non-terminal. So the grammar G2 is not in CNF.
Steps for converting CFG into CNF

Step 1: Eliminate the start symbol from the RHS. If the start symbol S is on the
right-hand side of any production, create a new production:
1. S1 → S
where S1 is the new start symbol.
Step 2: In the grammar, remove the null, unit and useless productions. You can refer
to the Simplification of CFG.
Step 3: Eliminate terminals from the RHS of the production if they exist with other
non-terminals or terminals. For example, production S → aA can be decomposed as:
1. S → RA
2. R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can
be decomposed as:
1. S → RS
2. R → AS
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
1. S → a | aA | B
2. A → aBB | ε
3. B → Aa | b
Solution:
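Step 1: We create a new production S1 → S, with S1 as the new start symbol. (In this
grammar S does not actually appear on any right-hand side, so the step is not strictly
required, but adding it is harmless.) The grammar will be: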
1. S1 → S
2. S → a | aA | B
3. A → aBB | ε
4. B → Aa | b
Step 2: As grammar G1 contains the null production A → ε, its removal from the
grammar yields:
1. S1 → S
2. S → a | aA | B
3. A → aBB
4. B → Aa | b | a
Now, as grammar G1 contains the unit production S → B, its removal yields:
1. S1 → S
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Also remove the unit production S1 → S; its removal from the grammar yields:
1. S1 → a | aA | Aa | b
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Step 3: In the production rules S1 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the
terminal a exists on the RHS together with non-terminals. So we replace the terminal
a with X:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → XBB
4. B → AX | b | a
5. X → a
Step 4: In the production rule A → XBB, the RHS has more than two symbols;
decomposing it yields:
1. S1 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → RB
4. B → AX | b | a
5. X → a
6. R → XB
Hence, for the given grammar, this is the required CNF.
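Whether a grammar satisfies the three CNF conditions can be checked with a short function. This is a sketch, assuming one character per symbol, the dict keys as the set of variables, and "" for ε; because of the one-character assumption the new start symbol is written Z instead of S1. The name is_cnf is hypothetical.

```python
def is_cnf(productions, start):
    """productions: dict variable -> list of bodies; "" stands for ε."""
    variables = set(productions)
    for var, bodies in productions.items():
        for body in bodies:
            if body == "":                       # only the start symbol may give ε
                if var != start:
                    return False
            elif len(body) == 1:                 # must be a single terminal
                if body in variables:
                    return False
            elif len(body) == 2:                 # must be two non-terminals
                if not all(s in variables for s in body):
                    return False
            else:
                return False
    return True

g1 = {"S": ["AB", "c"], "A": ["a"], "B": ["b"]}
g2 = {"S": ["aA"], "A": ["a"], "B": ["c"]}
converted = {"Z": ["a", "XA", "AX", "b"], "S": ["a", "XA", "AX", "b"],
             "A": ["RB"], "B": ["AX", "b", "a"], "X": ["a"], "R": ["XB"]}

print(is_cnf(g1, "S"))          # True
print(is_cnf(g2, "S"))          # False
print(is_cnf(converted, "Z"))   # True
```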
GNF stands for Greibach normal form. A CFG (context-free grammar) is in
GNF if all the production rules satisfy one of the following conditions:
o The start symbol generating ε. For example, S → ε.
o A non-terminal generating a terminal. For example, A → a.
o A non-terminal generating a terminal which is followed by any number of
non-terminals. For example, S → aASB.
For example:
1. G1 = {S → aAB | aB, A → aA| a, B → bB | b}

2. G2 = {S → aAB | aB, A → aA | ε, B → bB | ε}
The production rules of grammar G1 satisfy the rules specified for GNF, so the
grammar G1 is in GNF. However, the production rules of grammar G2 do not satisfy
the rules specified for GNF, as A → ε and B → ε are ε productions (only the start
symbol can generate ε). So the grammar G2 is not in GNF.
Steps for converting CFG into GNF

Step 1: Convert the grammar into CNF.
If the given grammar is not in CNF, convert it into CNF using the steps described
above for Chomsky normal form.
Step 2: If the grammar contains left recursion, eliminate it.
Step 3: Convert the production rules into GNF form.
If any production rule in the grammar is not in GNF form, convert it.
Example:
1. S → XB | AA
2. A → a | SA
3. B → b
4. X → a
Solution:
As the given grammar G is already in CNF and there is no left recursion, we can
skip step 1 and step 2 and go directly to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA in the
production rule A → SA as:
1. S → XB | AA
2. A → a | XBA | AAA
3. B → b
4. X → a
The production rules S → XB and A → XBA are not in GNF, so we substitute X → a in the
production rules S → XB and A → XBA as:
1. S → aB | AA
2. A → a | aBA | AAA
3. B → b
4. X → a
Now we will remove left recursion (A → AAA), we get:
1. S → aB | AA
2. A → aC | aBAC
3. C → AAC | ε
4. B → b
5. X → a
Now we will remove null production C → ε, we get:
1. S → aB | AA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B → b
5. X → a
The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA in
the production rule S → AA (and likewise in C → AA) as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → AAC
4. C → aCA | aBACA | aA | aBAA
5. B → b
6. X → a
The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA
for the first A in the production rule C → AAC as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → aCAC | aBACAC | aAC | aBAAC
4. C → aCA | aBACA | aA | aBAA
5. B → b
6. X → a
Hence, this is the GNF form for the grammar G
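The GNF conditions can be verified in the same style. The sketch below assumes one character per symbol, the dict keys as the variables, and "" for ε; the name is_gnf is hypothetical. It is applied to the grammar before and after the conversion above.

```python
def is_gnf(productions, start):
    """productions: dict variable -> list of bodies; "" stands for ε."""
    variables = set(productions)
    for var, bodies in productions.items():
        for body in bodies:
            if body == "":
                if var != start:                 # only the start symbol may give ε
                    return False
                continue
            if body[0] in variables:             # must begin with a terminal
                return False
            if not all(s in variables for s in body[1:]):
                return False                     # rest must be non-terminals
    return True

original = {"S": ["XB", "AA"], "A": ["a", "SA"], "B": ["b"], "X": ["a"]}
converted = {
    "S": ["aB", "aCA", "aBACA", "aA", "aBAA"],
    "A": ["aC", "aBAC", "a", "aBA"],
    "C": ["aCAC", "aBACAC", "aAC", "aBAAC", "aCA", "aBACA", "aA", "aBAA"],
    "B": ["b"],
    "X": ["a"],
}

print(is_gnf(original, "S"))    # False
print(is_gnf(converted, "S"))   # True
```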
PUMPING LEMMA
If L is a context-free language, there is a pumping length p such
that any string w ∈ L of length ≥ p can be written as w = uvxyz,
where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0, uv^i x y^i z ∈ L.
Applications of Pumping Lemma
The pumping lemma is used to show that a language is not context
free. Let us take an example and show how this is done.
Problem
Find out whether the language L = {0^n 1^n 2^n | n ≥ 1} is context free or
not.
Solution
Assume L is context free. Then L must satisfy the pumping lemma.
At first, choose the number n of the pumping lemma. Then, take z
as 0^n 1^n 2^n.
Break z into uvwxy (the decomposition of the lemma, with the five parts
named u, v, w, x, y), where
|vwx| ≤ n and vx ≠ ε.
Hence vwx cannot involve both 0s and 2s, since the last 0 and the
first 2 are at least (n+1) positions apart. There are two cases −
Case 1 − vwx has no 2s. Then vx has only 0s and 1s. Then uwy,
which would have to be in L, has n 2s, but fewer than n 0s or 1s.
Case 2 − vwx has no 0s. Then vx has only 1s and 2s. Then uwy,
which would have to be in L, has n 0s, but fewer than n 1s or 2s.
In both cases uwy is not in L, so a contradiction occurs.
Hence, L is not a context-free language.
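For a fixed n the argument can be checked by brute force: try every split z = uvwxy with |vwx| ≤ n and vx ≠ ε, and test whether pumping keeps the string in L. This is only an illustration for one value of n, not a replacement for the proof; the function names and the choice to test i = 0, 1, 2 are assumptions.

```python
def in_language(s):
    """Membership test for L = {0^n 1^n 2^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n

def pumpable_splits(z, p, member, max_i=2):
    """Return every split z = u v w x y with |vwx| <= p and vx != ε such
    that u v^i w x^i y stays in the language for i = 0..max_i."""
    found = []
    n = len(z)
    for a in range(n + 1):                         # u = z[:a]
        for d in range(a, min(a + p, n) + 1):      # vwx = z[a:d]
            for b in range(a, d + 1):              # v = z[a:b]
                for c in range(b, d + 1):          # w = z[b:c], x = z[c:d]
                    u, v, w, x, y = z[:a], z[a:b], z[b:c], z[c:d], z[d:]
                    if not (v or x):
                        continue
                    if all(member(u + v * i + w + x * i + y)
                           for i in range(max_i + 1)):
                        found.append((u, v, w, x, y))
    return found

n = 4
z = "0" * n + "1" * n + "2" * n
print(pumpable_splits(z, n, in_language))   # []
```

The empty result means that no admissible split of z survives pumping, which is exactly the contradiction reached in the two cases above.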

CLOSURE PROPERTIES
Context-free languages are closed under −
 Union
 Concatenation
 Kleene Star operation
Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also
context free.
Example
Let L1 = { a^n b^n , n > 0}. The corresponding grammar G1 will have P:
S1 → aS1b | ab
Let L2 = { c^m d^m , m ≥ 0}. The corresponding grammar G2 will have P:
S2 → cS2d | ε
Union of L1 and L2, L = L1 ∪ L2 = { a^n b^n } ∪ { c^m d^m }
The corresponding grammar G will have the additional production
S → S1 | S2
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context
free.
Example
Concatenation of the languages L1 and L2, L = L1L2 = { a^n b^n c^m d^m }
The corresponding grammar G will have the additional production
S → S1 S2
Kleene Star
If L is a context free language, then L* is also context free.
Example
Let L = { a^n b^n , n ≥ 0}. The corresponding grammar G will have P:
S → aSb | ε
Kleene Star L1 = { a^n b^n }*
The corresponding grammar G1 will have the additional productions
S1 → SS1 | ε
Context-free languages are not closed under −
• Intersection − If L1 and L2 are context free languages, then
L1 ∩ L2 is not necessarily context free.
• Complement − If L1 is a context free language, then L1’ may
not be context free.
However, context-free languages are closed under intersection with a regular
language: if L1 is a regular language and L2 is a context free language, then
L1 ∩ L2 is a context free language.
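The grammar constructions for union, concatenation and Kleene star described earlier in this section each add a new start symbol with one or two productions. The sketch below assumes single-character variable names (A and B stand in for S1 and S2), disjoint variable sets, and "" for ε; all function names are hypothetical.

```python
def union(g1, s1, g2, s2, new_start="S"):
    """New start symbol with S → S1 | S2."""
    return {new_start: [s1, s2], **g1, **g2}

def concatenation(g1, s1, g2, s2, new_start="S"):
    """New start symbol with S → S1 S2."""
    return {new_start: [s1 + s2], **g1, **g2}

def kleene_star(g, s, new_start="Z"):
    """New start symbol with Z → S Z | ε."""
    return {new_start: [s + new_start, ""], **g}

g1 = {"A": ["aAb", "ab"]}     # a^n b^n, n > 0
g2 = {"B": ["cBd", ""]}       # c^m d^m, m >= 0

print(union(g1, "A", g2, "B"))
# {'S': ['A', 'B'], 'A': ['aAb', 'ab'], 'B': ['cBd', '']}
print(concatenation(g1, "A", g2, "B"))
# {'S': ['AB'], 'A': ['aAb', 'ab'], 'B': ['cBd', '']}
print(kleene_star(g1, "A"))
# {'Z': ['AZ', ''], 'A': ['aAb', 'ab']}
```

No comparable construction exists for intersection or complement, which is consistent with the non-closure results listed above.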
APPLICATIONS OF CONTEXT FREE GRAMMAR

A context Free Grammar (CFG) is a 4-tuple such that-
G = (V , T , P , S)
where-
 V = Finite non-empty set of variables / non-terminal symbols

 T = Finite set of terminal symbols
 P = Finite non-empty set of production rules of the form A → α where A ∈ V
and α ∈ (V ∪ T)*
 S = Start symbol
Example-01:
Consider a grammar G = (V , T , P , S) where-
V = { S }
T = { a , b }
P = { S → aSbS , S → bSaS , S → ε }
S = S (the start symbol)
• This grammar is an example of a context free grammar.
• It generates the strings having an equal number of a’s and b’s.
Example-02:
Consider a grammar G = (V , T , P , S) where-
V = { S }
T = { ( , ) }
P = { S → SS , S → (S) , S → ε }
S = S (the start symbol)
• This grammar is an example of a context free grammar.
• It generates the strings of balanced parentheses.
Applications-
Context Free Grammar (CFG) is of great practical importance. It is used for following
purposes-
 For defining programming languages

 For parsing the program by constructing syntax tree
 For translation of programming languages
 For describing arithmetic expressions
 For construction of compilers
Context Free Language-
Properties-
• The context free languages are closed under union.
• The context free languages are closed under concatenation.
• The context free languages are closed under Kleene closure.
• The context free languages are not closed under intersection and complement.
• The family of regular languages is a proper subset of the family of context free
languages.
• Each context free language is accepted by a pushdown automaton.
Remember
If L1 and L2 are two context free languages, then-
• L1 ∪ L2 is also a context free language.
• L1.L2 is also a context free language.
• L1* and L2* are also context free languages.
• L1 ∩ L2 is not necessarily a context free language.
• L1′ and L2′ are not necessarily context free languages.
Ambiguity in Context Free Grammar-

A grammar is said to be ambiguous if for a given string generated by the grammar, there
exists-
 more than one leftmost derivation
 or more than one rightmost derivation
 or more than one parse tree (or derivation tree).
