• The inductive definition of the language Lpal of palindromes over {0, 1} starts with a basis saying that a few obvious strings are in Lpal.
• Then exploits the idea that if a string is a palindrome, it must begin and end
with the same symbol.
• Further, when the first and last symbols are removed, the resulting string
must also be a palindrome.
An Example
• Lpal can be inductively described as follows:
• BASIS: ε, 0, and 1 are palindromes.
• INDUCTION: If w is a palindrome, so are 0w0 and 1w1. No string is a
palindrome of 0's and 1's unless it follows from this basis and these induction rules.
A context-free grammar for palindromes
• We will write the above rules formally as follows:
• P→ε
• P→0
• P→1
• P→0P0
• P→1P1
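• As a concrete aside (a minimal Python sketch of my own, not part of the original notes), the basis and induction rules above can be run in reverse to test membership in Lpal; the function name is_palindrome_01 is illustrative.

```python
def is_palindrome_01(w: str) -> bool:
    """Decide membership in Lpal by running the inductive rules backwards.

    Basis: epsilon, "0", and "1" are in Lpal.
    Induction: 0w0 and 1w1 are in Lpal whenever w is.
    """
    assert set(w) <= {"0", "1"}, "Lpal is defined over the alphabet {0, 1}"
    if len(w) <= 1:                       # basis: epsilon, 0, 1
        return True
    if w[0] == w[-1]:                     # induction rule applied in reverse:
        return is_palindrome_01(w[1:-1])  # strip the matching outer symbols
    return False

# A few sanity checks against the definition.
assert is_palindrome_01("") and is_palindrome_01("0110") and is_palindrome_01("10101")
assert not is_palindrome_01("011")
```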
Chomsky hierarchy of Languages and Grammar
• The classification of languages into four classes: recursively enumerable (Type 0), context-sensitive (Type 1), context-free (Type 2), and regular (Type 3).
Example: A context-free grammar for simple expressions
• We need two variables in this grammar: E represents expressions, and I represents identifiers.
• G = ({E, I}, T, P, E), where T is the set of symbols {+, *, (, ), a, b, 0, 1} and P is the following set of productions:
1. E→I
2. E→E+E
3. E→E*E
4. E→(E)
5. I→a
6. I→b
7. I→Ia
8. I→Ib
9. I→I0
10. I→I1
Derivations Using a Grammar
• We expand the start symbol using one of its productions (i.e., a production whose head is the start symbol), and then repeatedly replace a variable in the resulting string by the body of one of its productions, until only terminals remain.
• E⇒E*E⇒I*E⇒a*E⇒a*(E)⇒a*(E+E)⇒a*(I+E)⇒a*(a+E)⇒a*(a+I)⇒a*(a+I0)⇒a*(a+I00)⇒a*(a+b00)
Leftmost and Rightmost Derivations
• In order to restrict the number of choices we have in deriving a string,
it is often useful to require that at each step we replace the leftmost
variable by one of its production bodies. Such a derivation is called a
leftmost derivation, and we indicate that a derivation is leftmost by
using the relations =>lm and =>*lm, for one or many steps, respectively.
• Similarly, it is possible to require that at each step the rightmost
variable is replaced by one of its bodies. If so, we call this derivation
rightmost and use the symbols =>rm and =>*rm to indicate one or many
rightmost derivation steps, respectively.
Example:
• The inference that a*(a+b00) is in the language of variable E can be
reflected in a derivation of that string, starting with the string E.
Rightmost Derivation
• E =>rm E*E =>rm E*(E) =>rm E*(E+E) =>rm E*(E+I) =>rm E*(E+I0) =>rm E*(E+I00) =>rm E*(E+b00) =>rm E*(I+b00) =>rm E*(a+b00) =>rm I*(a+b00) =>rm a*(a+b00)
Leftmost Derivation
• E =>lm E*E =>lm I*E =>lm a*E =>lm a*(E) =>lm a*(E+E) =>lm a*(I+E) =>lm a*(a+E) =>lm a*(a+I) =>lm a*(a+I0) =>lm a*(a+I00) =>lm a*(a+b00)
Example: Leftmost Derivations
• Balanced-parentheses grammar:
S -> SS | (S) | ()
• S =>lm SS =>lm (S)S =>lm (())S =>lm (())()
• Thus, S =>*lm (())()
• S => SS => S() => (S)() => (())() is a derivation, but not
a leftmost derivation.
Example: Rightmost Derivations
• Balanced-parentheses grammar:
S -> SS | (S) | ()
• S =>rm SS =>rm S() =>rm (S)() =>rm (())()
• Thus, S =>*rm (())()
• S => SS => SSS => S()S => ()()S => ()()() is neither a
rightmost nor a leftmost derivation.
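• Leftmost and rightmost steps are easy to check mechanically. The sketch below is my own illustration for the balanced-parentheses grammar; the function names is_leftmost_step and is_rightmost_step are assumptions, not standard API.

```python
# Productions of the balanced-parentheses grammar S -> SS | (S) | ().
PRODUCTIONS = {"S": ["SS", "(S)", "()"]}

def is_leftmost_step(before: str, after: str) -> bool:
    """True if `after` follows from `before` by rewriting the LEFTMOST variable."""
    i = next((k for k, c in enumerate(before) if c in PRODUCTIONS), None)
    if i is None:                          # no variable left to rewrite
        return False
    return any(after == before[:i] + body + before[i + 1:]
               for body in PRODUCTIONS[before[i]])

def is_rightmost_step(before: str, after: str) -> bool:
    """Same check, but the RIGHTMOST variable must be the one rewritten."""
    i = next((k for k in range(len(before) - 1, -1, -1) if before[k] in PRODUCTIONS), None)
    if i is None:
        return False
    return any(after == before[:i] + body + before[i + 1:]
               for body in PRODUCTIONS[before[i]])

lm = ["S", "SS", "(S)S", "(())S", "(())()"]   # the leftmost example above
rm = ["S", "SS", "S()", "(S)()", "(())()"]    # the rightmost example above
assert all(is_leftmost_step(a, b) for a, b in zip(lm, lm[1:]))
assert all(is_rightmost_step(a, b) for a, b in zip(rm, rm[1:]))
```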
The Language of a Grammar
• If G is a CFG with start symbol S and terminal alphabet T, then L(G) = {w in T* | S =>* w}, the set of terminal strings derivable from S.
Example:
• G has productions S -> ε and S -> 0S1.
• L(G) = {0^n1^n | n ≥ 0}.
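• As a quick sanity check (my own sketch, not from the notes): the only derivation of 0^n1^n applies S -> 0S1 exactly n times and then S -> ε.

```python
def derive(n):
    """The unique derivation of 0^n 1^n: apply S -> 0S1 n times, then S -> epsilon."""
    forms, current = ["S"], "S"
    for _ in range(n):
        current = current.replace("S", "0S1", 1)   # S -> 0S1
        forms.append(current)
    forms.append(current.replace("S", "", 1))      # S -> epsilon
    return forms

print(derive(3))  # ['S', '0S1', '00S11', '000S111', '000111']
```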
Sentential Forms
• If G = (V, T, P, S) is a CFG, any string α in (V ∪ T)* such that S =>* α is a sentential form of G.
• If S =>*lm α, then α is a left-sentential form; if S =>*rm α, it is a right-sentential form.
Example: Parse Tree
S -> SS | (S) | ()
[Parse tree for (())(): the root S has two children labeled S; the left child expands as ( S ), with the inner S expanding as ( ), and the right child expands as ( ).]
Yield of a Parse Tree
• The concatenation of the labels of the leaves in left-to-right order (that is, in the order of a preorder traversal) is called the yield of the parse tree.
• Example: the yield of the parse tree shown above is (())().
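• The yield is easy to compute once a parse tree is represented explicitly. The sketch below uses my own nested-tuple encoding of the tree for (())(); it is an illustration, not a standard library API.

```python
# A parse tree as nested tuples: (label, children); a leaf has an empty child tuple.
Tree = tuple  # (label: str, children: tuple of subtrees)

def tree_yield(t: Tree) -> str:
    """Concatenate leaf labels in left-to-right (preorder) order; ε leaves contribute nothing."""
    label, children = t
    if not children:
        return "" if label == "ε" else label
    return "".join(tree_yield(c) for c in children)

leaf = lambda s: (s, ())
# The parse tree for (())() from the balanced-parentheses grammar S -> SS | (S) | ().
tree = ("S", (
    ("S", (leaf("("), ("S", (leaf("("), leaf(")"))), leaf(")"))),  # left child: ( S ), inner S -> ( )
    ("S", (leaf("("), leaf(")"))),                                  # right child: ( )
))
print(tree_yield(tree))  # (())()
```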
Parse Trees, Left- and Rightmost
Derivations
• For every parse tree, there is a unique leftmost, and a unique
rightmost derivation.
1. If there is a parse tree with root labeled A and yield w, then A =>*lm w.
2. If A =>*lm w, then there is a parse tree with root A and yield w.
Ambiguous Grammars
• A CFG is ambiguous if there is a string in the language that is the yield
of two or more parse trees.
• Example: S -> SS | (S) | ()
Example – Two parse trees for ()()()
[Two parse trees for ()()(): in the first, the root uses S -> SS with the left child itself expanding as SS (yielding ()()) and the right child as (); in the second, the left child is () and the right child expands as SS (yielding ()()).]
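• The two trees can be written down explicitly to confirm they have the same yield. The nested-tuple encoding below is my own illustration.

```python
# Two distinct parse trees that both yield ()()(); parse trees are (label, children) tuples.
def tree_yield(t) -> str:
    label, children = t
    return label if not children else "".join(tree_yield(c) for c in children)

L, R = ("(", ()), (")", ())
pair = ("S", (L, R))                          # S -> ()
tree1 = ("S", (("S", (pair, pair)), pair))    # left child uses S -> SS (yield ()()), right child is ()
tree2 = ("S", (pair, ("S", (pair, pair))))    # left child is (), right child uses S -> SS (yield ()())
assert tree_yield(tree1) == tree_yield(tree2) == "()()()"
print("two different parse trees, same yield -> the grammar is ambiguous")
```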
Example
• Consider the sentential form E + E * E. It has two derivations from E:
• (a) E => E+E => E+E*E
• (b) E => E*E => E+E*E
• These two derivations correspond to different parse trees, so the expression grammar is ambiguous.
Removing Ambiguity From Grammars
We can remove ambiguity solely on the basis of the following two properties:
1. Precedence: If different operators are used, we consider the precedence of the operators. The three
important characteristics are:
I. The level at which a production appears denotes the priority of the operator it uses.
II. Productions at higher levels use operators of lower priority. In the parse tree, the nodes at the top
levels, close to the root, contain the lower-priority operators.
III. Productions at lower levels use operators of higher priority. In the parse tree, the nodes at the lower
levels, close to the leaves, contain the higher-priority operators.
2. Associativity: If operators of the same precedence appear in a production, we have to consider their
associativity.
• If the associativity is left to right, we introduce left recursion in the production; the parse tree is then
left-recursive and grows on the left side. +, -, *, / are left-associative operators.
• If the associativity is right to left, we introduce right recursion in the production; the parse tree is then
right-recursive and grows on the right side. ^ is a right-associative operator.
Example
• An unambiguous grammar for the expression language introduces one variable per precedence level:
• I -> a | b | Ia | Ib | I0 | I1
• F -> I | (E)
• T -> F | T * F
• E -> T | E + T
• The higher-precedence operator * is introduced lower in the grammar, and the left recursion in T -> T * F and E -> E + T makes * and + left-associative.
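• A grammar layered by precedence translates directly into a parser. The recursive-descent sketch below is my own illustration of the example grammar above (the names parse, expression, term, and factor are mine); the left-recursive productions are handled with loops.

```python
import re

TOKEN = re.compile(r"[ab][ab01]*|[+*()]")   # identifiers per I, plus the operator symbols

def parse(expr: str):
    tokens = TOKEN.findall(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected=None):
        nonlocal pos
        tok = tokens[pos]
        assert expected is None or tok == expected, f"expected {expected}, got {tok}"
        pos += 1
        return tok

    def factor():                       # F -> ( E ) | I
        if peek() == "(":
            eat("("); node = expression(); eat(")")
            return node
        return eat()                    # an identifier token

    def term():                         # T -> T * F | F, written as a loop
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())
        return node

    def expression():                   # E -> E + T | T, written as a loop
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    tree = expression()
    assert pos == len(tokens), "trailing input"
    return tree

# * binds tighter than +, so a+b*a1 parses as a + (b * a1).
print(parse("a+b*a1"))   # ('+', 'a', ('*', 'b', 'a1'))
```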
Inherently Ambiguous CFLs
• However, for some context-free languages it is not possible to remove the ambiguity: every grammar for the language is ambiguous. Such languages are called inherently ambiguous; a standard example is {a^n b^n c^m d^m | n, m ≥ 1} ∪ {a^n b^m c^m d^n | n, m ≥ 1}.
Closure of CFL’s Under Union
• Let L and M be CFL’s with grammars G and H,
respectively.
• Assume G and H have no variables in common.
• Names of variables do not affect the language.
• Let S1 and S2 be the start symbols of G and H.
Closure Under Union – (2)
• Form a new grammar for L ∪ M by combining all the symbols and
productions of G and H.
• Then, add a new start symbol S.
• Add productions S -> S1 | S2.
Closure Under Union – (3)
• In the new grammar, all derivations start with S.
• The first step replaces S by either S1 or S2.
• In the first case, the result must be a string in L(G) = L, and in the
second case a string in L(H) = M.
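• The construction is mechanical. The sketch below is my own illustration, representing a grammar as a dict from each variable to its list of production bodies, with variables written as single uppercase letters.

```python
def union_grammar(g, start_g, h, start_h, new_start="S"):
    """Grammar for L(G) ∪ L(H): combine productions and add S -> S1 | S2."""
    assert not (set(g) & set(h)), "rename variables first so G and H share none"
    combined = {new_start: [start_g, start_h]}   # S -> S1 | S2
    combined.update(g)
    combined.update(h)
    return combined, new_start

G = {"A": ["0A1", ""]}          # L(G) = {0^n 1^n | n >= 0}, start symbol A
H = {"B": ["(B)", "BB", "()"]}  # L(H) = balanced parentheses, start symbol B
P, S = union_grammar(G, "A", H, "B")
print(S, P)                     # S {'S': ['A', 'B'], 'A': [...], 'B': [...]}
```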
Closure of CFL's Under Concatenation
• As for union, let L and M be CFL's with grammars G and H, respectively, having no variables in common, and let S1 and S2 be their start symbols.
Closure Under Concatenation – (2)
• Form a new grammar for LM by starting with all symbols and
productions of G and H.
• Add a new start symbol S.
• Add production S -> S1S2.
• Every derivation from S results in a string in L followed by one in M.
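• A companion sketch (same assumptions and grammar encoding as the union sketch above) for the concatenation construction:

```python
def concat_grammar(g, start_g, h, start_h, new_start="S"):
    """Grammar for L(G)L(H): combine productions and add S -> S1 S2."""
    assert not (set(g) & set(h)), "rename variables first so G and H share none"
    combined = {new_start: [start_g + start_h]}  # S -> S1 S2
    combined.update(g)
    combined.update(h)
    return combined, new_start

G = {"A": ["0A1", ""]}          # {0^n 1^n | n >= 0}
H = {"B": ["2B", "2"]}          # {2^i | i >= 1}
P, S = concat_grammar(G, "A", H, "B")
print(P["S"])                   # ['AB'] -- every derivation gives a string of L followed by one of M
```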
Closure Under Star
• Let L have grammar G with start symbol S1.
• Form a grammar for L* by adding a new start symbol S and the productions S -> S1S | ε; every string derived from S is a concatenation of zero or more strings of L.
Closure of CFL's Under Reversal
• If L is a CFL with grammar G, a grammar for L^R is obtained by reversing the body of every production of G.
• Example: reversing the bodies of S -> 0S1 | ε gives S -> 1S0 | ε, which generates {1^n0^n | n ≥ 0}, the reversal of {0^n1^n | n ≥ 0}.
Closure of CFL's Under Homomorphism
• Let L be a CFL with grammar G, and let h be a homomorphism on the terminal alphabet.
• A grammar for h(L) is obtained by replacing every terminal a in the production bodies of G by the string h(a).
Example: Closure Under Homomorphism
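• A small illustration of the construction; the grammar comes from the earlier {0^n1^n} example, but the particular homomorphism h below is my own choice, not taken from the original slide.

```python
def apply_homomorphism(productions, h):
    """Replace every terminal a in every production body by h(a); variables are left alone."""
    def map_body(body):
        return "".join(h.get(sym, sym) for sym in body)   # variables are not keys of h
    return {head: [map_body(b) for b in bodies] for head, bodies in productions.items()}

G = {"S": ["0S1", ""]}          # L(G) = {0^n 1^n | n >= 0}
h = {"0": "ab", "1": ""}        # h(0) = ab, h(1) = epsilon
print(apply_homomorphism(G, h)) # {'S': ['abS', '']} -- generates (ab)^n = h(L(G))
```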
Nonclosure Under Intersection
• Unlike the regular languages, the class of CFL's is not
closed under intersection.
• We know that L1 = {0^n1^n2^n | n ≥ 1} is not a CFL (use
the pumping lemma).
• However, L2 = {0^n1^n2^i | n ≥ 1, i ≥ 1} is.
• CFG: S -> AB, A -> 0A1 | 01, B -> 2B | 2.
• So is L3 = {0^i1^n2^n | n ≥ 1, i ≥ 1}.
• But L1 = L2 ∩ L3.
Nonclosure Under Difference
• We can prove something more general:
• Any class of languages that is closed under difference is closed under
intersection.
• Proof: L ∩ M = L – (L – M).
• Thus, if CFL’s were closed under difference, they would be closed
under intersection, but they are not.
Intersection with a Regular Language
• Intersection of two CFL’s need not be context free.
• But the intersection of a CFL with a regular
language is always a CFL.
• Proof involves running a DFA in parallel with a PDA,
and noting that the combination is a PDA.
• PDA’s accept by final state.
Simplification Of Context-free Grammars
• There are several ways in which one can restrict the format of
productions without reducing the generative power of context-free
grammars.
• If L is a nonempty context-free language then it can be generated by a
context-free grammar G with the following properties.
• Each variable and each terminal of G appears in the derivation of some word
in L.
• There are no productions of the form A -> B where A and B are variables.
Eliminating Useless Symbols
• Let G = (V, T, P, S) be a grammar. A symbol X is useful if there is a
derivation S =>* αXβ =>* w for some α, β, and w, where w is in T*. Otherwise X is
useless.
• Algorithm (keep only generating symbols):
• Step 1: Include in W1 every symbol that derives some terminal string, and initialize i = 1.
• Step 2: Include in Wi+1 every symbol of Wi, plus every symbol that has a production whose body consists only of symbols of Wi and terminals.
• Step 3: Increment i and repeat Step 2 until Wi+1 = Wi.
• Step 4: Keep only the production rules all of whose symbols are in Wi or are terminals.
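• A sketch of Steps 1–4 in Python (my own illustration; the function name generating_symbols is not from the notes), using the grammar of the example that appears after the second lemma below.

```python
def generating_symbols(productions, terminals):
    """Iteratively compute the set W of variables that derive some terminal string."""
    W = set()                                    # W1, W2, ... computed until stable
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in W and any(all(s in W or s in terminals for s in body)
                                     for body in bodies):
                W.add(head)                      # head derives a terminal string
                changed = True
    return W

P = {"S": ["AC", "B"], "A": ["a"], "C": ["c", "BC"], "E": ["aA", "e"]}
print(generating_symbols(P, {"a", "c", "e"}))    # {'A', 'C', 'E', 'S'} -- B is not generating
```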
Eliminating Useless Symbols
• Lemma: Given a CFG G = (V, T, P, S) we can effectively find an
equivalent CFG G' = (V', T', P', S) such that for each X in V' ∪ T'
there exist α and β in (V' ∪ T')* for which S =>* αXβ.
• Algorithm (keep only reachable symbols):
• Step 1: Include the start symbol in Y1 and initialize i = 1.
• Step 2: Include in Yi+1 all symbols of Yi, plus every symbol that can be derived from a symbol of Yi, and keep all production rules that have been applied.
• Step 3: Increment i and repeat Step 2 until Yi+1 = Yi.
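• A matching sketch for the reachability computation (my own illustration; the function name reachable_symbols is assumed). The grammar below is G' from the example that follows.

```python
def reachable_symbols(productions, start):
    """Start from the start symbol and keep every symbol appearing in a body of a reachable variable."""
    Y = {start}                                  # Y1 = {start symbol}
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in Y:
                for body in bodies:
                    for sym in body:
                        if sym not in Y:
                            Y.add(sym)
                            changed = True
    return Y

P = {"S": ["AC"], "A": ["a"], "C": ["c"], "E": ["aA", "e"]}   # G' from the example below
print(reachable_symbols(P, "S"))   # {'S', 'A', 'C', 'a', 'c'} -- E (and e) are unreachable
```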
Example
• Consider the grammar:
• S→AC|B
• A→a
• C→c|BC
• E→aA|e
• Apply the first lemma (generating symbols), with T = {a, c, e}:
• W1 = {A, C, E}, W2 = {A, C, E, S}, W3 = {A, C, E, S}
• G' = ({A, C, E, S}, {a, c, e}, P, S), where P: S→AC, A→a, C→c, E→aA|e
• Applying the second lemma then removes E (and the terminal e), since E is not reachable from S, leaving S→AC, A→a, C→c.
Eliminating Unit Productions
• Algorithm:
• Step 1: To eliminate a unit production A→B, add a production A→x to the grammar whenever B→x occurs in the grammar (x may be any production body).
• Step 2: Delete A→B from the grammar.
• Step 3: Repeat from Step 1 until all unit productions are eliminated.
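• A sketch of these steps in Python (my own illustration; the function name eliminate_unit_productions is assumed), applied to the grammar of the example that follows.

```python
def eliminate_unit_productions(productions):
    """Repeatedly replace each unit production A -> B by copies of B's productions."""
    prods = {head: set(bodies) for head, bodies in productions.items()}
    changed = True
    while changed:
        changed = False
        for head, bodies in prods.items():
            for body in list(bodies):
                if len(body) == 1 and body in prods:   # a unit production head -> body
                    bodies.remove(body)                # Step 2: delete it
                    for b in prods[body]:              # Step 1: copy body's productions
                        if b not in bodies and b != head:
                            bodies.add(b)
                    changed = True
    return {head: sorted(bodies) for head, bodies in prods.items()}

P = {"S": ["XY"], "X": ["a"], "Y": ["Z", "b"], "Z": ["M"], "M": ["N"], "N": ["a"]}
print(eliminate_unit_productions(P))
# {'S': ['XY'], 'X': ['a'], 'Y': ['a', 'b'], 'Z': ['a'], 'M': ['a'], 'N': ['a']}
```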
Example
• Remove unit productions from the following grammar:
• S→XY
• X→a
• Y→Z|b
• Z→M
• M→N
• N→a
• After removing unit productions:
• S→XY
• X→a
• Y→a|b
• Z→a
• M→a
• N→a
• After also removing unreachable symbols (Z, M, and N are no longer reachable from S):
• S→XY
• X→a
• Y→a|b
Normal forms for CFGs
• Normal forms: every production is required to be of a special form; standard examples are Chomsky Normal Form and Greibach Normal Form.