
Discussion: Context-Free Grammars: Questions On Homework or Exam?


CS 373: Theory of Computation Sariel Har-Peled and Madhusudan Parthasarathy

Discussion : Context-Free Grammars


March 3, 2009

Questions on homework or exam?


Any questions? Complaints, etc?

1 Context free grammar for languages with balance or without it

1.1 Balanced strings

Let $L_{a=b} = \{ a^n b^n \mid n \geq 1 \}$. Here is a grammar for this language: $S \to aSb \mid ab$.

1.2 Mixed balanced strings

Let $L^{mix}_{a=b}$ be the language of all strings over $\{a, b\}$ with an equal number of $a$s and $b$s, where the $a$s and $b$s might be mixed together. We have the following grammar $G$ for generating all strings that have the same number of $a$s and $b$s:

$$S \to aSb \mid bSa \mid SS \mid \epsilon,$$

where $S$ is the start variable.

Advice To TA: State the following lemma, and sketch its proof, but do not do the full proof in the discussion section. Point the interested students to the class notes.

Lemma 1.1 We have $L(G) = L^{mix}_{a=b}$.

Proof: It is easy to see that every string that $G$ generates has an equal number of $a$s and $b$s. As such, $L(G) \subseteq L^{mix}_{a=b}$.

For the other direction, we use induction on the length of a string $x \in L^{mix}_{a=b}$, where $2n = |x|$. For $n = 0$ we can generate $\epsilon$ by $G$. For $n = 1$, we can generate both $ab$ and $ba$ by $G$.

Now for $n > 1$, consider a balanced string of length $2n$, $x = x_1 x_2 x_3 \cdots x_{2n} \in L^{mix}_{a=b}$. Let $\#_c(y)$ be the number of appearances of the character $c$ in the string $y$. Let $\alpha_i = \#_a(x_1 \cdots x_i) - \#_b(x_1 \cdots x_i)$. Observe that $\alpha_0 = \alpha_{2n} = 0$.

If $\alpha_j = 0$ for some $1 < j < 2n$, then we can break $x$ into two words $y = x_1 \ldots x_j$ and $z = x_{j+1} \ldots x_{2n}$ that are both balanced. By induction, $y, z \in L(G)$, and as such $S \Rightarrow^* y$ and $S \Rightarrow^* z$. This implies that $S \Rightarrow SS \Rightarrow^* yz = x$. Namely, $x \in L(G)$.

The remaining case is that $\alpha_j \neq 0$ for $j = 2, \ldots, 2n-1$. If $x_1 = a$ then $\alpha_1 = 1$. Consecutive values of $\alpha$ differ by exactly one, so the sequence cannot become negative without passing through $0$. As such, for all $j = 1, \ldots, 2n-1$, we must have that $\alpha_j > 0$. But then $\alpha_{2n} = 0$, which implies that $\alpha_{2n-1} = 1$. We conclude that $x_1 = a$ and $x_{2n} = b$. As such, $x_2 \ldots x_{2n-1}$ is a balanced word, which by induction is generated by $G$. Thus, $x$ can be derived via $S \Rightarrow aSb \Rightarrow^* a x_2 x_3 \ldots x_{2n-1} b = x$. Thus, $x \in L(G)$. The case $x_1 = b$ is handled in a similar fashion, and implies that $x \in L(G)$ also in this case.

We conclude that $L^{mix}_{a=b} \subseteq L(G)$. Thus $L^{mix}_{a=b} = L(G)$.
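The case analysis in the proof is constructive, so it can be turned into a small procedure that builds a parse tree for any balanced word. The following Python sketch is only an illustration: the function names `parse` and `tree_yield` and the tuple encoding of parse trees are assumptions made here, not part of the notes.

```python
def parse(x: str):
    """Build a parse tree for a balanced word x over {a, b}.

    Mirrors the proof: split at the first proper balanced prefix (S -> SS),
    otherwise peel off the first and last characters (S -> aSb or S -> bSa).
    The tuple encoding of trees is a choice made for this sketch.
    """
    if set(x) - {"a", "b"} or x.count("a") != x.count("b"):
        raise ValueError(f"{x!r} is not a balanced word over {{a, b}}")
    if x == "":
        return ("eps",)                       # S -> epsilon
    alpha = 0                                 # #a(prefix) - #b(prefix)
    for j, ch in enumerate(x, start=1):
        alpha += 1 if ch == "a" else -1
        if alpha == 0 and j < len(x):         # proper balanced prefix found
            return ("SS", parse(x[:j]), parse(x[j:]))
    # The balance never returns to 0 before the end of the word,
    # so the first and last characters are different.
    return (x[0] + "S" + x[-1], parse(x[1:-1]))


def tree_yield(tree) -> str:
    """Read the derived word back off a parse tree."""
    if tree[0] == "eps":
        return ""
    if tree[0] == "SS":
        return tree_yield(tree[1]) + tree_yield(tree[2])
    rule = tree[0]                            # "aSb" or "bSa"
    return rule[0] + tree_yield(tree[1]) + rule[2]
```

For example, `parse("abba")` splits the word into `ab` and `ba`, and then uses $S \to aSb$ and $S \to bSa$ on the two halves.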

1.3 Unbalanced pair

Consider the following language: $L_{a \neq b} = \{ a^n b^m \mid n \neq m \text{ and } n, m \geq 0 \}$. If $n \neq m$ then either $n > m$ or $m > n$. Therefore we can design this grammar by first starting with the basic grammar for when $n = m$, and then transitioning into making more $a$s or more $b$s. Let $A$ be the non-terminal representing the choice to generate more $a$s than $b$s, and let $B$ be the non-terminal for the other case. One grammar that generates $L_{a \neq b}$ is therefore:

$$S \to aSb \mid aA \mid bB, \qquad A \to aA \mid \epsilon, \qquad B \to bB \mid \epsilon.$$
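A grammar like this can be sanity-checked by brute force: enumerate every word of bounded length that it derives and compare against the set definition of the language. The sketch below assumes an encoding chosen here for illustration (upper-case letters are variables, other characters are terminals, the empty string stands for $\epsilon$), and the helper name `words` is hypothetical. It only terminates for grammars in which every recursive step adds a terminal, which holds for this grammar.

```python
def words(rules, start, max_len):
    """All terminal words of length <= max_len derivable from `start`.

    Always expands the leftmost variable. Sentential forms with more than
    max_len terminals are pruned, which is what bounds the search.
    """
    seen, stack, out = {start}, [start], set()
    while stack:
        form = stack.pop()
        i = next((k for k, s in enumerate(form) if s.isupper()), None)
        if i is None:                 # no variables left: a terminal word
            out.add(form)
            continue
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            terminals = sum(not s.isupper() for s in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return out


# S -> aSb | aA | bB,  A -> aA | eps,  B -> bB | eps
UNBALANCED = {"S": ["aSb", "aA", "bB"], "A": ["aA", ""], "B": ["bB", ""]}

# The set definition: a^n b^m with n != m, restricted to length <= 6.
EXPECTED = {"a" * n + "b" * m
            for n in range(7) for m in range(7)
            if n != m and n + m <= 6}
```

Under these assumptions, `words(UNBALANCED, "S", 6) == EXPECTED` holds, so the grammar and the set definition agree on all words of length at most 6.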

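1.4 Balanced pair in a triple

Consider the language

$$L_4 = \{ a^i b^j c^k \mid i = j \text{ or } j = k \}.$$

We can essentially combine two copies of the grammar for balanced strings (one that works on $a$ and $b$, and one version that works on $b$ and $c$) in order to create a grammar that generates $L_4$:

$$\begin{aligned}
S &\to S_{a=b}\, C \mid A\, S_{b=c} \\
S_{a=b} &\to a S_{a=b} b \mid \epsilon \\
S_{b=c} &\to b S_{b=c} c \mid \epsilon \\
A &\to Aa \mid \epsilon \\
C &\to Cc \mid \epsilon.
\end{aligned}$$

Exercise 1.2 Derive a CFG for the language $L'_4 = \{ a^i b^j c^k \mid i = j \text{ or } j = k \text{ or } i = k \}$.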

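1.5 Unbalanced pair in a triple

Now consider the related language $L_2 = \{ a^i b^j c^k \mid i \neq j \text{ or } j \neq k \}$. We can essentially combine two copies of the grammar of Section 1.3 (with one version that works on $b$ and $c$) in order to create a grammar that generates $L_2$:

$$\begin{aligned}
S &\to S_{a \neq b}\, C \mid A\, S_{b \neq c} \\
S_{a \neq b} &\to a S_{a \neq b} b \mid aA \mid bB \\
A &\to Aa \mid \epsilon \\
B &\to Bb \mid \epsilon \\
S_{b \neq c} &\to b S_{b \neq c} c \mid bB \mid cC \\
C &\to Cc \mid \epsilon.
\end{aligned}$$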

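1.6 Anything but balanced

Let $\Sigma = \{a, b\}$, and let $\overline{L_{a=b}} = \Sigma^* \setminus \{ a^n b^n \mid n \geq 1 \}$. The idea is to first generate all words that contain a $b$ and, somewhere later in the word, an $a$. The grammar for these words is

$$S_1 \to ZbZaZ, \qquad Z \to aZ \mid bZ \mid \epsilon.$$

Clearly $L(S_1) \subseteq \overline{L_{a=b}}$. The only words we miss must have all their $a$s before their $b$s, that is, they are words of the form $a^i b^j$. Those with $i \neq j$, where $i, j \geq 0$, we already saw how to generate in Section 1.3. The only other word in $\overline{L_{a=b}}$ is the empty word, since the condition $n \geq 1$ excludes it from $\{ a^n b^n \}$. Putting everything together, we get the following grammar.

$$\begin{aligned}
S &\to S_1 \mid S_{a \neq b} \mid \epsilon \\
S_1 &\to ZbZaZ \\
Z &\to aZ \mid bZ \mid \epsilon \\
S_{a \neq b} &\to a S_{a \neq b} b \mid aA \mid bB \\
A &\to aA \mid \epsilon \\
B &\to bB \mid \epsilon.
\end{aligned}$$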
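The claim that the rule $ZbZaZ$ misses exactly the words with all $a$s before all $b$s can be checked exhaustively for short words. The sketch below is illustrative only: it replaces the grammar rules by equivalent regular expressions (an assumption made here for brevity), and the function names are hypothetical.

```python
import re
from itertools import product


def in_s1(w: str) -> bool:
    """Words of the shape Z b Z a Z: some b occurs before some a."""
    return re.search(r"b[ab]*a", w) is not None


def a_before_b(w: str) -> bool:
    """Words of the form a^i b^j with i, j >= 0."""
    return re.fullmatch(r"a*b*", w) is not None


def partition_holds(max_len: int) -> bool:
    """Every word over {a, b} up to max_len is in exactly one of the two sets."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if in_s1(w) == a_before_b(w):
                return False
    return True
```

Calling `partition_holds(10)` checks every word of length at most 10. What remains is deciding which words $a^i b^j$ belong to the complement, which is the part handled by the grammar of Section 1.3.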

Similar count

Consider the language

$$L = \{ w 0^n \mid w \in \{a, b\}^* \text{ and } \#_a(w) = n \},$$

where $\#_a(w)$ is the number of appearances of the character $a$ in $w$. The grammar for this language is $S \to \epsilon \mid bS \mid aS0$.
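To see how the three rules cooperate, it helps to write out a derivation explicitly: each $a$ read from $w$ uses $S \to aS0$ and so deposits one $0$ on the right, each $b$ uses $S \to bS$, and $S \to \epsilon$ finishes. The following Python sketch (the function name `derive` is an assumption made for this illustration) returns the list of sentential forms, or `None` when the word is not in the language.

```python
def derive(word: str):
    """Sentential forms of a derivation of `word` from S, or None if word is not in L.

    Grammar: S -> eps | bS | aS0.
    """
    w = word.rstrip("0")              # w is the part before the trailing zeros
    n = len(word) - len(w)            # number of trailing zeros
    if set(w) - {"a", "b"} or w.count("a") != n:
        return None
    forms, left, right = ["S"], "", ""
    for ch in w:
        left += ch
        if ch == "a":
            right += "0"              # S -> aS0 adds a matching 0
        forms.append(left + "S" + right)
    forms.append(left + right)        # S -> eps
    return forms
```

For instance, `derive("ba0")` gives `["S", "bS", "baS0", "ba0"]`, while `derive("a")` gives `None` because the single $a$ has no matching $0$.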

Inherent Ambiguity

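In lecture, the following ambiguous grammar representing basic mathematical expressions was discussed:

$$E \to E * E \mid E + E \mid N, \qquad N \to 0N \mid 1N \mid 0 \mid 1.$$

The ambiguity arises because there is no inherent preference for combining expressions with $*$ over $+$, or vice versa. This was then fixed by introducing a preference of $*$ over $+$:

$$E \to E + E \mid T, \qquad T \to N * T \mid N, \qquad N \to 0N \mid 1N \mid 0 \mid 1.$$

(The rule $E \to E + E$ still allows a chain of several $+$ signs to be grouped in more than one way; writing $E \to T + E \mid T$ instead removes that ambiguity as well.)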
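Ambiguity can be made concrete by counting parse trees. The sketch below does this for the two expression grammars, under two assumptions that are not certain from the text: that the operator competing with $+$ is written `*`, and that $E$ can derive a number $N$ in the first grammar. The function names are hypothetical.

```python
from functools import lru_cache


def is_num(s: str) -> bool:
    """True if s is derivable from N -> 0N | 1N | 0 | 1 (a non-empty bit string)."""
    return s != "" and set(s) <= {"0", "1"}


@lru_cache(maxsize=None)
def trees_flat(s: str) -> int:
    """Number of parse trees for s from E, where E -> E * E | E + E | N."""
    total = 1 if is_num(s) else 0
    for i, ch in enumerate(s):
        if ch in "+*":                # try this operator as the root
            total += trees_flat(s[:i]) * trees_flat(s[i + 1:])
    return total


@lru_cache(maxsize=None)
def trees_term(s: str) -> int:
    """Number of parse trees for s from T, where T -> N * T | N."""
    total = 1 if is_num(s) else 0
    for i, ch in enumerate(s):
        if ch == "*" and is_num(s[:i]):
            total += trees_term(s[i + 1:])
    return total


@lru_cache(maxsize=None)
def trees_prec(s: str) -> int:
    """Number of parse trees for s from E, where E -> E + E | T."""
    total = trees_term(s)
    for i, ch in enumerate(s):
        if ch == "+":
            total += trees_prec(s[:i]) * trees_prec(s[i + 1:])
    return total
```

With these definitions, `trees_flat("1+0*1")` is 2 (either operator can sit at the root), while `trees_prec("1+0*1")` is 1, because $*$ can only appear below $T$.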

However, some languages are inherently ambiguous: no unambiguous context-free grammar can generate them. Consider the following language:

$$L = \{ a^n b^n c^k d^k \mid n, k \geq 1 \} \cup \{ a^n b^k c^k d^n \mid n, k \geq 1 \}.$$

In other words, it is the language of words in $a^+ b^+ c^+ d^+$ where either:
1. the number of $a$s equals the number of $b$s and the number of $c$s equals the number of $d$s, or
2. the number of $a$s equals the number of $d$s and the number of $b$s equals the number of $c$s.

One ambiguous grammar that generates it is

$$S \to XY \mid Z, \quad X \to aXb \mid ab, \quad Y \to cYd \mid cd, \quad Z \to aZd \mid aTd, \quad T \to bTc \mid bc.$$

The reason why all grammars for this language must be ambiguous can be seen in strings of the form $a^n b^n c^n d^n$, $n \geq 1$. Any grammar needs some way of generating the string so that either the $a$s and $b$s are equal and the $c$s and $d$s are equal, or the $a$s and $d$s are equal and the $b$s and $c$s are equal. When generating equal $a$s and $b$s, it must still be possible to have the same number of $c$s and $d$s. When generating equal $a$s and $d$s, it must still be possible to have the same number of $b$s and $c$s. No matter what grammar is designed, strings of the form $a^n b^n c^n d^n$ end up having at least two possible parse trees. (This is of course only an intuitive explanation. A formal proof that any grammar for this language must be ambiguous is considerably more tedious and harder.)

A harder example
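Consider the following language:

$$L = \{ xy \mid x, y \in \{0, 1\}^*, \ |x| = |y|, \text{ and } x \neq y \}.$$

It should be clear that this language cannot be regular. However, it may not be obvious that we can in fact design a context-free grammar for it. The words $x$ and $y$ are guaranteed to be different if, for some $k$, the $k$th character is 0 in $x$ and 1 in $y$ (or vice versa). It is important to notice that we should not try to build $x$ and $y$ separately since, in a CFG, we would have no way to enforce them being of the same length. Instead, we just remember that if the string is of length $2n$, the first $n$ characters are considered $x$ and the second $n$ characters are $y$. Similarly, notice that we cannot choose $k$ ahead of time, for similar reasons.

So, consider the following string $w \in L$, in which the $k$th character of $x$ is 1 and the $k$th character of $y$ is 0:

$$w = \underbrace{x_1 x_2 \ldots x_{k-1}}_{k-1 \text{ chars}} \, 1 \, \underbrace{x_{k+1} \ldots x_n}_{n-k \text{ chars}} \ \underbrace{y_1 y_2 \ldots y_{k-1}}_{k-1 \text{ chars}} \, 0 \, \underbrace{y_{k+1} \ldots y_n}_{n-k \text{ chars}}.$$

In particular, let $z_1 z_2 \ldots z_{n-1} = x_{k+1} \ldots x_n \, y_1 y_2 \ldots y_{k-1}$. Then,

$$\begin{aligned}
w &= \underbrace{x_1 x_2 \ldots x_{k-1}}_{k-1 \text{ chars}} \, 1 \, z_1 z_2 \ldots z_{n-1} \, 0 \, \underbrace{y_{k+1} \ldots y_n}_{n-k \text{ chars}} \\
  &= \underbrace{\underbrace{x_1 x_2 \ldots x_{k-1}}_{k-1 \text{ chars}} \, 1 \, \underbrace{z_1 \ldots z_{k-1}}_{k-1 \text{ chars}}}_{=X} \ \underbrace{\underbrace{z_k \ldots z_{n-1}}_{n-k \text{ chars}} \, 0 \, \underbrace{y_{k+1} \ldots y_n}_{n-k \text{ chars}}}_{=Y}.
\end{aligned}$$

Now, $X$ is a word of odd length with 1 in the middle (and we definitely know how to generate this kind of word using context-free grammars). And $Y$ is a word of odd length, with 0 in the middle. In particular, any word of $L$ can be written as either $XY$ or $YX$, where $X$ and $Y$ are as above. Conversely, if $X$ has length $2i+1$ and $Y$ has length $2j+1$, then $XY$ has length $2(i+j+1)$, the middle of $X$ is at position $i+1$, and the middle of $Y$ is at position $(i+j+1) + (i+1)$, so the two halves of $XY$ differ at position $i+1$. We conclude that the grammar for this language is

$$S \to XY \mid YX, \qquad X \to DXD \mid 1, \qquad Y \to DYD \mid 0, \qquad D \to 0 \mid 1.$$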
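The re-bracketing argument can be tried out directly. The Python sketch below (function names are assumptions made for this illustration) cuts a word of $L$ into two odd-length pieces whose middle characters differ, and separately tests membership in the grammar's language by trying every possible cut, so the two descriptions can be compared on all short words.

```python
from itertools import product


def in_L(w: str) -> bool:
    """w = xy with |x| = |y| and x != y."""
    n, odd = divmod(len(w), 2)
    return odd == 0 and w[:n] != w[n:]


def split(w: str):
    """Cut w into two odd-length pieces with different middle characters.

    Follows the argument in the text: pick a position k (0-based here) where
    the halves differ; the first piece then has k characters on each side of
    its middle. Returns None if w is not in L.
    """
    if not in_L(w):
        return None
    n = len(w) // 2
    k = next(i for i in range(n) if w[i] != w[n + i])
    return w[:2 * k + 1], w[2 * k + 1:]


def in_grammar(w: str) -> bool:
    """Membership in L(S) for S -> XY | YX, checked by trying every cut."""
    for cut in range(1, len(w), 2):           # first piece has odd length
        left, right = w[:cut], w[cut:]
        if len(right) % 2 == 1 and left[len(left) // 2] != right[len(right) // 2]:
            return True
    return False


def agree(max_len: int) -> bool:
    """Compare the set definition with the grammar on all words up to max_len."""
    return all(in_L(w) == in_grammar(w)
               for n in range(max_len + 1)
               for w in map("".join, product("01", repeat=n)))
```

For example, `split("1000")` returns `("1", "000")`: the first piece is an $X$ (a lone 1) and the second a $Y$ (0 in the middle).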
