Lexi Cal
printf("Total = %d\n", score);
Two processes
• Scanner – deletion of comments, compaction
of consecutive whitespace characters into
one.
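A minimal sketch of the scanner step, assuming C-style comments (`/* ... */` and `// ...`). The function name `compact` is hypothetical, and the sketch deliberately ignores string literals, which a real scanner must protect.

```python
import re

# Matches a block comment or a line comment (assumed C-style syntax).
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def compact(source: str) -> str:
    """Delete comments, then squeeze each whitespace run into one blank."""
    # A comment is replaced by a blank so neighbouring tokens stay separated.
    without_comments = COMMENT.sub(" ", source)
    return re.sub(r"\s+", " ", without_comments).strip()


print(compact("int  x; /* counter */\n\ty = 1; // done"))
```

Replacing a comment with a blank rather than nothing keeps `a/* x */b` from fusing into `ab`.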
• fi(a==f(x))
Or equivalent
do 100 n=2,10,1
100 nfac=nfac*n
Tricky problems in Token recognition
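• Assignment: DO 5 I = 1.25 (Fortran ignores blanks, so this assigns 1.25 to the variable DO5I)
• Do loop: DO 5 I = 1,25 (only the comma reveals that DO is a keyword)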
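The two statements differ only after the `=` sign, so a recognizer has to look ahead before deciding what `DO` means. The sketch below is illustrative only: `classify` and the pattern `DO_LOOP` are hypothetical names, and the pattern covers just the simple `DO label var = start,limit` shape.

```python
import re

# DO <label> <variable> = <start> , <rest>, with blanks already removed.
DO_LOOP = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)")


def classify(statement: str) -> str:
    """Decide between a DO loop and an assignment by looking for the comma."""
    packed = statement.replace(" ", "").upper()  # blanks are not significant
    if DO_LOOP.fullmatch(packed):
        return "do-loop"
    if "=" in packed:
        return "assignment"
    return "unknown"


print(classify("DO 5 I = 1.25"))  # assignment to DO5I
print(classify("DO 5 I = 1,25"))  # do-loop
```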
Input Buffering
• Two-Buffer Scheme
• Examining ways to speed up reading the source program
– In the one-buffer technique, the lexeme being processed may be overwritten when the
buffer is reloaded.
– The two-buffer scheme handles large lookahead safely.
Buffer Pairs
• Two buffers of the same size, say 4096, are alternately reloaded.
• Two pointers to the input are maintained:
– Pointer lexeme_Begin marks the beginning of the current
lexeme.
– Pointer forward scans ahead until a pattern match is found.
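The buffer-pair idea can be sketched as follows. This is a simplified model, not a production reader: the class `BufferPair`, its method names, and the helper `split_words` are assumptions made for illustration, and the sketch only works while a lexeme plus its lookahead fits within one buffer length.

```python
import io


class BufferPair:
    """Two equal halves reloaded alternately; pointers are absolute offsets."""

    def __init__(self, stream, size=4096):
        self.stream, self.size = stream, size
        self.halves = ["", ""]
        self.blocks_loaded = 0   # how many blocks have been read so far
        self.lexeme_begin = 0    # start of the current lexeme
        self.forward = 0         # scans ahead of lexeme_begin

    def peek(self):
        """Return the character under forward, or "" at end of input."""
        block, offset = divmod(self.forward, self.size)
        while block >= self.blocks_loaded:  # crossed into an unloaded half
            self.halves[self.blocks_loaded % 2] = self.stream.read(self.size)
            self.blocks_loaded += 1
        half = self.halves[block % 2]
        return half[offset] if offset < len(half) else ""

    def advance(self):
        ch = self.peek()
        if ch:
            self.forward += 1
        return ch

    def lexeme(self):
        """Collect the text between lexeme_begin and forward, then reset."""
        chars = []
        for pos in range(self.lexeme_begin, self.forward):
            block, offset = divmod(pos, self.size)
            chars.append(self.halves[block % 2][offset])
        self.lexeme_begin = self.forward
        return "".join(chars)


def split_words(buffers):
    """Toy driver: lexemes are maximal runs of non-blank characters."""
    words = []
    while True:
        while buffers.peek() == " ":
            buffers.advance()
        buffers.lexeme_begin = buffers.forward
        if buffers.peek() == "":
            return words
        while buffers.peek() not in ("", " "):
            buffers.advance()
        words.append(buffers.lexeme())


print(split_words(BufferPair(io.StringIO("count = count + 1"), size=8)))
```

With `size=8` the second `count` starts exactly where the second half begins, so the example exercises a reload without losing the lexeme.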
Regular Expression
• Regular expressions describe all the languages that can be built
from the operators union, concatenation, and closure applied to
the symbols of some alphabet.
letter(letter|digit)*
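As a quick check of the pattern letter(letter|digit)*, here is a small sketch using Python's `re` module. It assumes `letter` means an ASCII letter and `digit` a decimal digit; the name `is_identifier` is hypothetical.

```python
import re

LETTER = "[A-Za-z]"  # assumption: ASCII letters only
DIGIT = "[0-9]"
IDENTIFIER = re.compile(f"{LETTER}({LETTER}|{DIGIT})*")


def is_identifier(text: str) -> bool:
    """True when the whole string matches letter(letter|digit)*."""
    return IDENTIFIER.fullmatch(text) is not None


print(is_identifier("nfac"), is_identifier("x1"), is_identifier("1x"))
```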
Specification of Patterns for Tokens: String Operations
• The concatenation of two strings x and y is
denoted by xy
• The exponentiation of a string s is defined by
$s^0 = \epsilon$ and $s^i = s^{i-1}s$ for $i > 0$;
note that $s\epsilon = \epsilon s = s$
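The string operations can be tried directly on Python strings, where the empty string `""` plays the role of ε. The function `power` is a hypothetical helper that follows the usual recursive definition: the zeroth power is the empty string, and each further power appends one more copy of s.

```python
def power(s: str, i: int) -> str:
    """Exponentiation of a string: s^0 is empty, s^i = s^(i-1) followed by s."""
    return "" if i == 0 else power(s, i - 1) + s


x, y = "ab", "cd"
print(x + y)            # concatenation xy
print(power(x, 3))      # ababab
print("" + x == x + "" == x)  # the empty string is the identity
```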
Recognition of Tokens
Transition Diagrams
• Patterns are converted into stylized flowcharts called transition diagrams
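[Figure: transition diagram for relop; only part of it survives here]
• State 4 (*): return(relop, LT)
• State 5, reached on =: return(relop, EQ)
• State 6, reached on >
• State 7, reached from 6 on =: return(relop, GE)
• State 8 (*), reached from 6 on other: return(relop, GT)
• States marked * retract the input by one character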
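A transition diagram like this one maps almost line for line onto a small state machine. The sketch below is an illustration, not the deck's own code: `next_relop` is a hypothetical name, and the `<` branch (states 1 to 3, returning LE and NE) is assumed from the usual textbook version of the diagram because it does not appear in the text above.

```python
def next_relop(text, pos=0):
    """Return ((token, attribute), next_position), or (None, pos) if no relop."""
    state, forward = 0, pos
    while True:
        ch = text[forward] if forward < len(text) else ""  # "" means end of input
        if state == 0:
            if ch == "<":
                state = 1
            elif ch == "=":
                return ("relop", "EQ"), forward + 1   # state 5
            elif ch == ">":
                state = 6
            else:
                return None, pos
            forward += 1
        elif state == 1:                              # assumed branch for '<'
            if ch == "=":
                return ("relop", "LE"), forward + 1   # state 2
            if ch == ">":
                return ("relop", "NE"), forward + 1   # state 3
            return ("relop", "LT"), forward           # state 4*: retract
        else:                                         # state 6
            if ch == "=":
                return ("relop", "GE"), forward + 1   # state 7
            return ("relop", "GT"), forward           # state 8*: retract


print(next_relop("x>=y", 1))
```

The starred states return `forward` unchanged, which is how the retraction shows up in code: the character that ended the lexeme is left for the next token.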
Two More...
• id: transition diagram with a loop labeled "letter or digit"
• delim: transition diagram with a loop labeled "delim"
RE to Automata
Minimizing
• The DFA for a(b|c)*
Example #2: Applying Minimization
Example # 4
• Minimize the following DFA:
[Figure: a DFA with states A, B, C, D, E over the inputs a and b, shown next to its minimized form with states A, B, D, E]
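Minimization can be sketched as repeated partition refinement: start from accepting versus non-accepting states and split any group whose members disagree on which group an input leads to. The transition table below is an assumption, namely the familiar five-state DFA for (a|b)*abb, since the figure's edges are not recoverable from the text; `minimize` is a hypothetical helper.

```python
def minimize(states, alphabet, delta, accepting):
    """Return the groups of equivalent states as a list of sets."""
    partition = [g for g in (set(accepting), set(states) - set(accepting)) if g]
    while True:
        def group_of(state):
            return next(i for i, g in enumerate(partition) if state in g)

        refined = []
        for group in partition:
            buckets = {}
            for s in sorted(group):
                # States stay together only if every input leads to the same group.
                key = tuple(group_of(delta[s][a]) for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            refined.extend(buckets.values())
        if len(refined) == len(partition):  # nothing was split: done
            return refined
        partition = refined


# Assumed example DFA for (a|b)*abb; E is the accepting state.
DELTA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
groups = minimize("ABCDE", "ab", DELTA, {"E"})
print(sorted(sorted(g) for g in groups))
```

On this table A and C end up in the same group, which leaves four states.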
From Regular Expression to DFA Directly
From Regular Expression to DFA Directly
(Algorithm)
• Augment the regular expression r with a
special end symbol # to make accepting states
important: the new expression is r#
• Construct a syntax tree for r#
• Traverse the tree to construct functions
nullable, firstpos, lastpos, and followpos
From Regular Expression to DFA Directly:
Syntax Tree of (a|b)*abb#
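Syntax tree, with each leaf labeled by its position number:
concatenation
  concatenation
    concatenation
      concatenation
        closure (*)
          alternation (|)
            a (position 1)
            b (position 2)
        a (position 3)
      b (position 4)
    b (position 5)
  # (position 6)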
From Regular Expression to DFA Directly:
Annotating the Tree
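• nullable(n): the subtree at node n generates a language that
includes the empty string
• A leaf labeled ε is nullable (true); a leaf labeled with a position is not
• For (a|b)*abb#: the root has firstpos {1, 2, 3} and lastpos {6};
the alternation node | has firstpos {1, 2} and lastpos {1, 2}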
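The four functions can be computed in one bottom-up pass over the syntax tree. The sketch below uses the standard rules for leaf, alternation, concatenation, and closure nodes; the tuple encoding of the tree and the names `leaf`, `cat`, `alt`, `star`, and `annotate` are assumptions made for this illustration.

```python
def leaf(symbol, position):
    return ("leaf", symbol, position)

def cat(left, right):
    return ("cat", left, right)

def alt(left, right):
    return ("or", left, right)

def star(child):
    return ("star", child)


def annotate(node, followpos):
    """Return (nullable, firstpos, lastpos) of node and fill in followpos."""
    kind = node[0]
    if kind == "leaf":
        position = node[2]
        followpos.setdefault(position, set())
        return False, {position}, {position}
    if kind == "star":
        _, first, last = annotate(node[1], followpos)
        for p in last:                  # the closure can loop back to its start
            followpos[p] |= first
        return True, first, last
    n1, f1, l1 = annotate(node[1], followpos)
    n2, f2, l2 = annotate(node[2], followpos)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    for p in l1:                        # concatenation: right part follows left
        followpos[p] |= f2
    return n1 and n2, (f1 | f2 if n1 else f1), (l1 | l2 if n2 else l2)


# (a|b)*abb# with positions 1..6
tree = cat(cat(cat(cat(star(alt(leaf("a", 1), leaf("b", 2))),
                       leaf("a", 3)), leaf("b", 4)), leaf("b", 5)), leaf("#", 6))
followpos = {}
nullable, firstpos, lastpos = annotate(tree, followpos)
print(nullable, firstpos, lastpos)
print(followpos)
```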
From Regular Expression to DFA Directly:
Algorithm
s0 := firstpos(root) where root is the root of the syntax tree
Dstates := {s0} and is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a do
        let U be the set of positions that are in followpos(p)
            for some position p in T,
            such that the symbol at position p is a
        if U is not empty and not in Dstates then
            add U as an unmarked state to Dstates
        end if
        Dtran[T,a] := U
    end do
end do
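The loop above translates fairly directly into Python. This is a sketch under stated assumptions: the names `build_dfa` and `accepts` are hypothetical, and the followpos table and position symbols are those of the (a|b)*abb# example, entered by hand.

```python
def build_dfa(start_positions, followpos, symbol_at, alphabet):
    """Build Dstates and Dtran from firstpos(root) and the followpos table."""
    start = frozenset(start_positions)
    dstates, unmarked, dtran = [start], [start], {}
    while unmarked:
        T = unmarked.pop()                       # mark T
        for a in alphabet:
            U = frozenset(q for p in T if symbol_at[p] == a
                          for q in followpos[p])
            if U and U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[T, a] = U                      # an empty U acts as a dead state
    return start, dstates, dtran


def accepts(word, start, dtran, end_position):
    """A state is accepting when it contains the position of #."""
    state = start
    for ch in word:
        state = dtran.get((state, ch), frozenset())
    return end_position in state


# Data for (a|b)*abb#: positions 1..6, with # at position 6.
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
start, dstates, dtran = build_dfa({1, 2, 3}, FOLLOWPOS, SYMBOL_AT, "ab")
print(len(dstates), accepts("aabb", start, dtran, 6))
```

Frozen sets are used so that each set of positions can serve as a dictionary key in `dtran`.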
From Regular Expression to DFA Directly:
Example
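Node   followpos
1      {1, 2, 3}
2      {1, 2, 3}
3      {4}
4      {5}
5      {6}
6      -
[Figure: the followpos graph on positions 1 to 6, and the resulting DFA with start state {1,2,3} and states {1,2,3,4}, {1,2,3,5}, {1,2,3,6}, with transitions labeled a and b]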