CD Manual

1.
CASE STUDY FOR COMPILER
1.OVERVIEW
Computers are a balanced mix of software and hardware. Hardware is just a piece of
mechanical device and its functions are being controlled by a compatible software. Hardware
understands instructions in the form of electronic charge, which is the counterpart of binary
language in software programming. Binary language has only two alphabets, 0 and 1. To
instruct, the hardware codes must be written in binary format, which is simply a series of 1s
and 0s. It would be a difficult and cumbersome task for computer programmers to write such
codes, which is why we have compilers to write such codes.
LANGUAGE PROCESSING SYSTEM

We have learnt that any computer system is made of hardware and software. The hardware
understands a language, which humans cannot understand. So we write programs in high-
level language, which is easier for us to understand and remember. These programs are then
fed into a series of tools and OS components to get the desired code that can be used by the
machine. This is known as Language Processing System. Compiler Design.
The high-level language is converted into binary language in various phases. A compiler is a
program that converts high-level language to assembly language. Similarly, an assembler is a
program that converts the assembly language to machine-level language. Let us first
understand how a program, using C compiler, is executed on a host machine.User writes a
program in C language (high-level language). The C compiler compiles the program and
translates it to assembly program (low-level language). An assembler then translates the
assembly program into machine code (object).
A linker tool is used to link all the parts of the program together for execution (executable
machine code). A loader loads all of them into memory and then the program is executed.
Before diving straight into the concepts of compilers, we should understand a few other tools
that work closely with compilers.
Preprocessor
A preprocessor, generally considered as a part of compiler, is a tool that produces input for
compilers. It deals with macro-processing, augmentation, file inclusion, language extension,
etc.
Interpreter
An interpreter, like a compiler, translates high-level language into low-level machine
language. The difference lies in the way they read the source code or input. A compiler reads
the whole source code at once, creates tokens, checks semantics, generates intermediate code,
executes the whole program and may involve many passes. In contrast, an interpreter reads a
statement from the input, converts it to an intermediate code, executes it, then takes the next
statement in sequence. If an error occurs, an interpreter stops execution and reports it;
whereas a compiler reads the whole program even if it encounters several errors.
Assembler
An assembler translates assembly language programs into machine code. The output of an
assembler is called an object file, which contains a combination of machine instructions as
well as the data required to place these instructions in memory.
Linker
Linker is a computer program that links and merges various object files together in order to
make an executable file. All these files might have been compiled by separate assemblers.
The major task of a linker is to search and locate referenced module/routines in a program
Compiler Design and to determine and to determine the memory location where these codes
will be loaded, making the program instruction to have absolute references.
Loader
Loader is a part of operating system and is responsible for loading executable files into
memory and execute them. It calculates the size of a program (instructions and data) and
creates memory space for it. It initializes various registers to initiate execution.
Cross-compiler
A compiler that runs on platform (A) and is capable of generating executable code for
platform (B) is called a cross-compiler.
Source-to-source Compiler
A compiler that takes the source code of one programming language and translates it into the
source code of another programming language is called a source-to-source compiler.A
compiler can broadly be divided into two phases based on the way they compile.
Analysis Phase
Known as the front-end of the compiler, the analysis phase of the compiler reads the source
program, divides it into core parts, and then checks for lexical, grammar, and syntax errors.
The analysis phase generates an intermediate representation of the source program and
symbol table, which should be fed to the Synthesis phase as input.
Synthesis Phase
Known as the back-end of the compiler, the synthesis phase generates the target program
with the help of intermediate source code representation and symbol table.
A compiler can have many phases and passes.
Pass : A pass refers to the traversal of a compiler through the entire program.
Phase : A phase of a compiler is a distinguishable stage, which takes input from the
previous stage, processes and yields output that can be used as input for the next stage. A
pass can have more than one phase.
2. CASE STUDY FOR LEX AND YACC TOOL
Lex is a program generator designed for lexical processing of character input streams. It
accepts a high-level, problem oriented specification for character string matching, and
produces a program in a general purpose language which recognizes regular expressions. The
regular expressions are specified by the user in the source specifications given to Lex. The
Lex written code recognizes these expressions in an input stream and partitions the input
stream into strings matching the expressions. At the boundaries between strings program
sections provided by the user are executed. The Lex source file associates the regular
expressions and the program fragments. As each expression appears in the input to the
program written by Lex, the corresponding fragment is executed.
The user supplies the additional code beyond expression matching needed to complete his
tasks, possibly including code written by other generators. The program that recognizes the
expressions is generated in the general purpose programming language employed for the
user's program fragments. Thus, a high level expression language is provided to write the
string expressions to be matched while the user's freedom to write actions is unimpaired. This
avoids forcing the user who wishes to use a string manipulation language for input analysis to
write processing programs in the same and often inappropriate string handling language.
Lex is not a complete language, but rather a generator representing a new language feature
which can be added to different programming languages, called ``host languages.'' Just as
general purpose languages can produce code to run on different computer hardware, Lex can
write code in different host languages. The host language is used for the output code
generated by Lex and also for the program fragments added by the user. Compatible run-time
libraries for the different host languages are also provided. This makes Lex adaptable to
different environments and different users. Each application may be directed to the
combination of hardware and host language appropriate to the task, the user's background,
and the properties of local implementations. At present, the only supported host language is
C, although Fortran (in the form of Ratfor [2] has been available in the past. Lex itself exists
on UNIX, GCOS, and OS/370; but the code generated by Lex may be taken anywhere the
appropriate compilers exist.
Lex turns the user's expressions and actions (called source in this memo) into the host
general-purpose language; the generated program is named yylex. The yylex program will
recognize expressions in a stream (called input in this memo) and perform the specified
actions for each expression as it is detected. See Figure 1.
+-------+
Source -> | Lex | -> yylex
+-------+
+-------+
Input -> | yylex | -> Output
+-------+
An overview of Lex
Figure 1
For a trivial example, consider a program to delete from the input all blanks or tabs at the
ends of lines.
%%
[ \t]+$ ;
is all that is required. The program contains a %% delimiter to mark the beginning of the
rules, and one rule. This rule contains a regular expression which matches one or more
instances of the characters blank or tab (written \t for visibility, in accordance with the C
language convention) just prior to the end of a line. The brackets indicate the character class
made of blank and tab; the + indicates `òne or more ...''; and the $ indicates `ènd of line,'' as
in QED. No action is specified, so the program generated by Lex (yylex) will ignore these
characters. Everything else will be copied. To change any remaining string of blanks or tabs
to a single blank, add another rule:
%%
[ \t]+$ ;
[ \t]+ printf(" ");
The finite automaton generated for this source will scan for both rules at once, observing at
the termination of the string of blanks or tabs whether or not there is a newline character, and
executing the desired rule action. The first rule matches all strings of blanks or tabs at the end
of lines, and the second rule all remaining strings of blanks or tabs.
Lex can be used alone for simple transformations, or for analysis and statistics gathering on a
lexical level. Lex can also be used with a parser generator to perform the lexical analysis
phase; it is particularly easy to interface Lex and Yacc [3]. Lex programs recognize only
regular expressions; Yacc writes parsers that accept a large class of context free grammars,
but require a lower level analyzer to recognize input tokens. Thus, a combination of Lex and
Yacc is often appropriate. When used as a preprocessor for a later parser generator, Lex is
used to partition the input stream, and the parser generator assigns structure to the resulting
pieces. The flow of control in such a case (which might be the first half of a compiler, for
example) is shown in Figure 2. Additional programs, written by other generators or by hand,
can be added easily to programs written by Lex.
lexical grammar
rules rules
| |
v v
+---------+ +---------+
| Lex | | Yacc |
+---------+ +---------+
| |
v v
+---------+ +---------+
Input -> | yylex | -> | yyparse | -> Parsed input
+---------+ +---------+
Lex with Yacc

Figure 2
Yacc users will realize that the name yylex is what Yacc expects its lexical analyzer to be
named, so that the use of this name by Lex simplifies interfacing.
Lex generates a deterministic finite automaton from the regular expressions in the source [4].
The automaton is interpreted, rather than compiled, in order to save space. The result is still a
fast analyzer. In particular, the time taken by a Lex program to recognize and partition an
input stream is proportional to the length of the input. The number of Lex rules or the
complexity of the rules is not important in determining speed, unless rules which include
forward context require a significant amount of rescanning. What does increase with the
number and complexity of rules is the size of the finite automaton, and therefore the size of
the program generated by Lex.
In the program written by Lex, the user's fragments (representing the actions to be performed
as each regular expression is found) are gathered as cases of a switch. The automaton
interpreter directs the control flow. Opportunity is provided for the user to insert either
declarations or additional statements in the routine containing the actions, or to add
subroutines outside this action routine.
Lex is not limited to source which can be interpreted on the basis of one character lookahead.
For example, if there are two rules, one looking for ab and another for abcdefg, and the input
stream is abcdefh, Lex will recognize ab and leave the input pointer just before cd. . . Such
backup is more costly than the processing of simpler languages.
3. IMPLEMENTATION SYMBOL TABLE
Description:
Symbol table management:

Symbol table is a data structure containing the record of each identifier, with fields
for the attributes of the identifier. The data structure allows us to find the record for each
identifier quickly and store or retrieve data from that record quickly. When
the lexical analyzer detects an identifier in the source program, the identifier is entered into
symbol table. The remaining phases enter information about identifiers in to the symbol table.
PROGRAM:
Symbol.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
void create();
void search ();
void modify();
void display();
struct symbol
char label[10], adddress[10], value[5];
}s[10];
struct instructions
{
char address[10],label[10],opcode[10],operand[10];
}t;
int ch,n=0,i,flag=0; char a[10];
void main()
char a[10]; int opt; clrscr();
ch='y';
do
printf("\n Enter the choice\n");
printf("1.Create\n2.Search\n3.Modify\n4.Display\n5.Exit\n");
scanf("%d",&opt);
switch(opt)
case 1: create(); break;
case 2: search(); break;
case 3: modify(); break;
case 4: display(); break;
case 5: exit(0); break;
} }while(opt<5); getch(); }
void create()
printf("\nEnter the Address, Label, Opcode, Operand\n");

scanf("%s%s%s%s",&t.address,&t.label,&t.opcode,&t.operand);
if(strcmp(t.label,"_")!=0)
flag=0; for(i=0;i<=n:i++)
if(strcmp(s[i].label,t.label)==0)
flag=1; break;}}
if(flag==0)
strcpy(s[n].address,t.address); strcpy(s[n].label,t.label);
strcpy(s[n].value,t.operand); n++;
else
{ printf("\nThe Label is already there");
}}}
void search()
printf("\nEnter the Label"); scnaf("%s",&a);
for(i=0;i<=n;i++)
{ if(strcmp(a,s[i].label)==0)
flag=1; break; }}
if(flag==1)
printf("\t\tLABEL\tADDRESS\tVALUE\t\n");
printf("\t\t%s\t%s\t%s\n",s[i].label,s[i].address,s[i].value);
else
printf("Not Found");
}}
void modify()
char a[10]; flag=0; printf("\nEnter the label\n"); scanf("%s",&a);
for(i=0;i<=n;i++)
{ if(strcmp(a,s[i].label)==0)
flag=1; break;
}} if(flag==0)
{ printf("\nError");
else
{ printf("\nEnter the Address, Value\n");
scanf("%s%s",&s[i].address,&s[i].value);
}}
void display()
printf("\t\tLABEL\tADDRESS\tVALUE\t\t\n");
for(i=0;i<=n;i++)
printf("\t\t%s\t%s\t%s\t\t\n",s[i].label,s[i].address,s[i].value);}}
OUTPUT:
SYMBOL TABLE..
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
Enter the Address, Label, Opcode, Operand
1000 COPY START 4890
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
2000 COPY START 2345
The Label is already there
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
2000 CLOOP LDA 9000
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
Enter the Label COPY
LABEL ADDRESS VALUE
COPY 1000 4890
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
3
Enter the label
CLOOP
Enter the Address, Value
2003 5000
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
LABEL ADDRESS VALUE
COPY 1000 4890
CLOOP 2003 5000
Enter the choice
1.Create
2.Search
3.Modify
4.Display
5.Exit
5
4a. IMPLEMENTATION OF A LEXICAL ANALYZER IN
C-ARITHMETIC EXPRESSION
Description:
In compiler, lexical analysis is also called linear analysis or scanning. In lexical analysis the
stream of characters making up the source program is read from left to right and grouped into tokens
that are sequences of characters having a collective meaning.
LEX:
LEX helps write programs whose control flow is directed by instances of regular expressions
in the input stream. It is well suited for editor-script type transformations and for segmenting input in
preparation for a parsing routine.
LEX is a program generator designed for Lexical processing of character input streams. It
accepts a high-level, problem oriented specification for character string matching, and produces a
program in a general purpose language which recognizes regular expressions. The regular expressions
are specified by the user in the source specifications given to LEX. The LEX written code recognizes
these expressions in an input stream and partitions the input stream into strings matching the
expressions. At the boundaries between strings program sections provided by the user are executed.
The LEX source file associates the regular expressions and the program fragments. As each
expression appears in the input to the program written by LEX, the corresponding fragment is
executed
expressions is generated in the general purpose programming language employed for the user's
program fragments. Thus, a high level expression language is provided to write the string expressions
to be matched while the user's freedom to write actions is unimpaired. This avoids forcing the user
who wishes to use a string manipulation language for input analysis to write processing programs in
the same and often inappropriate string handling language
LEX is not a complete language, but rather a generator representing a new language feature
which can be added to different programming languages, called ``host languages.'' Just as general
purpose languages can produce code to run on different computer hardware, LEX can write code in
different host languages
LEX turns the user's expressions and actions (called source in this memo) into the host
general-purpose language; the generated program is named yyLEX. The yyLEX program will
recognize expressions in a stream (called input in this memo) and perform the specified actions for
each expression as it is detected. See Figure 1.
LEX Source Program lex.yy.c
LEX Compiler
lex.yy.c C Compiler a.out
Input Stream Sequence of Tokens

a.out
LEX Source:
The general format of LEX source is:
{definitions}
%%
{rules}
%%
{user subroutines}
where the definitions and the user subroutines are often omitted. The second %% is optional, but the
first is required to mark the beginning of the rules. The absolute minimum LEX program is thus (no
definitions, no rules) which translates into a program which copies the input to the output unchanged.
PROGRAM:
Carith.c
#include<stdio.h>
#include<ctype.h>
#include<conio.h>
#include<string.h>
char vars[100][100];
int vcnt;
char input[1000],c;
char token[50],tlen;
int state=0,pos=0,i=0,id;
char*getAddress(char str[])
for(i=0;i<vcnt;i++)
if(strcmp(str,vars[i])==0)
return vars[i];
strcpy(vars[vcnt],str);
return vars[vcnt++];
int isrelop(char c)
if(c==+|| c==-||c==*|| c==/|| c==%|| c==^)
return 1;
else
return 0;
int main(void)
clrscr();
printf(enter the input string:);
gets(input);
do
c=input[pos];
putchar(c);
switch(state)
case 0:
if(isspace(c))
printf(\b);
if(isalpha(c))
token[0]=c;
tlen=1; state=1;
if(isdigit(c))
state=2;
if(isrelop(c))
state=3;
if(c==;)
printf(\t<3,3>\n);
if(c===)
printf(\t<4,4>\n);
break;
case 1:
if(!isalnum(c))
token[tlen]=\0;
printf(\b\t<1,%p>\n,getAddress(token));
state=0; pos--;
else
token[tlen++]=c;
break;
case 2:
if(!digit(c))
printf(\b\t<2,%p>\n,&input[pos]);
state=0; pos--;
break;
case 3:
id=input[pos-1];
if(c===)
printf(\t<%d,%d>\n,id*10,id*10);
else
pintf(\b\t<%d,%d>\n,id,id);
pos--;
state=0; break;
pos++;
while(c!=0);
getch();
return 0;}
OUTPUT:
.C-ARITHMETIC EXPRESSION.
Enter the input string: a=b*c
a <1,08CE>
= <4,4>
b <1,0932>
* <42,42>
c <1,0996>
4b. IMPLEMENTATION OF A LEXICAL ANALYZER IN
C-FOR LOOP
Description:
LEX:
executed
PROGRAM:
Cloop.c
#include<stdio.h>
#include<ctype.h>
#include<conio.h>
#include<string.h>
char vars[100][100];
int vcnt;
char input[1000],c;
char token[50],tlen;
int state=0,pos=0,i=0,id;
char*getAddress(char str[])
for(i=0;i<vcnt;i++)
if(strcmp(str,vars[i])==0)
return vars[i];
strcpy(vars[vcnt],str);
return vars[vcnt++];
int isrelop(char c)
if(c==>|| c==<||c==||| c===)
return 1;
else
return 0;
int main(void)
{ clrscr();
printf(enter the input string:);

gets(input);
do
{ c=input[pos]; putchar(c);
switch(state)
case 0: if(c==i) state=1;
break;
case 1: if(c==f)
printf("\t<1,1>\n); break;
case 2: if(isspace(c))
printf(\b);
if(isalpha(c))
token[0]=c; tlen=1;
state=3;
if(isdigit(c))
state=4;
if(isrelop(c))
state=5;
if(c==;)
printf(\t<4,4>\n);
if(c==()
printf(\t<5,0>\n);
if(c==))
printf(\t<5,1>\n);
if(c=={)
printf(\t<6,1>\n);
if(c==})
printf(\t<6,2>\n);
break;
case 3:
if(!isalnum(c))
token[tlen]=\0;
printf(\b\t<2,%p>\n,getAddress(token));
state=2;
pos--;
else
token[tlen++]=c;
break;
case 4:
if(!isdigit(c))
{
printf(\b\t<3,%p>\n,&input[pos]);
state=2; pos--;
break;
case 5: id=input[pos-1];
if(c===) printf(\t<%d,%d>\n,id*10,id*10);
else
pintf(\b\t<%d,%d>\n,id,id); pos--;
state=2; break;
pos++;
while(c!=0);
getch(); return 0; }
OUTPUT:
C-FOR LOOP.
Enter the input string: if(a>=b) max=a;
if <1,1>
( <5,0>
a <2,08EE>
>= <620,620>
b <2,0952>
) <5,1>
max <2,09B6>
= <61,61>
a <2,08EE>
; <4,4>
4c. IMPLEMENTATION OF A LEXICAL ANALYZER IN
C-KEYWORD
Description:
LEX:
executed
PROGRAM:
Ckey.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
#define found 1
#define notfound 0
int main()
int i,j,flag=notfound,result;
char keywords[10][10]={void,if,else,for,while,switch};
char str[10],str 1;
printf(\n enter the string);
scanf(%s,str);
gets(str);
printf(\n string is %s ,str);
for(i=0;i<0;i++)
result=strcmp(&keywords[i][0],str);
if(result==0)
flag=found;
break;
if(flag==notfound)
{
printf(\n\n%s string is not a keyword,str);
printf(keyword list is \n);
for(i=0;i<6;i++)
printf(%s\n,keywords[i]);
else
printf(\n\n%s string is a keyword,str);
getch();
}
OUTPUT:
.C-KEYWORD..
enter the string: abcd
abcd string is not a keyword
keyword list is
void
if
else
for
while
switch
enter the string: if
string is if
if string is a keyword
5.IMPLEMENTATION OF A LEXICAL
ANALYZER IN C++
Description:
LEX:
executed
PROGRAM:
Lex.c
#include<iostream.h>
#include<conio.h>
#include<ctype.h>
#include<string.h>
main()
int i,j,k,f,len,t;
char exp[100],word[12][12];
char op[10]={+,-,*,/};
char key[10][10]={while,do,else,if};
cout<<Enter the expression\n;
cin>>exp;
len=strlen(exp);t=0;
for(i=0;i<len;i++)
j=0;f=0;
if(isalpha(exp[i]))
while(isalpha(exp[i])
word[t][j]=exp[i];
i++;
j++;
word[t][j]=\0;
i--;
for(k=0;k<10;k++)
if(word[t]==key[k])
cout<<The<<word[t]<<is keyword\n;
f=1;
break;
}}
if(f==0)
cout<<The<<word[t]<<is identifier\n;
else if(isdigit(exp[i]))
while(isdigit(exp[i]))
word[t][i]=exp[i];
i++;j++:
i--;
cout<<The<<word[t]<<is constant\n;
}
else
cout<<The<<exp[i]<<is operator\n;
t++;
getch();
}
OUTPUT:
LEXICAL ANALYSER IN C++..
Enter the expression: int a+5
The int is keyword
The a is identifier
The + is operator
The 5 is constant
6a.IMPLEMENTATION OF A LEXICAL ANALYZER USING
LEXTOOL
Description:
LEX:
executed
LEX turns the user's expressions and actions (called source in this memo) into the host general-
purpose language; the generated program is named yyLEX. The yyLEX program will recognize
expressions in a stream (called input in this memo) and perform the specified actions
PROGRAM:
Lan.l
%{
%}
identifier[a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor di rective",yytext);}
int |
float |
char |
double |
whi le |
do |
if |
break |
continue |
void |
swi tch |
return |
else |
goto {printf("\n%s is a keyword",yytext);}
{identifier}\( {printf("\n function %s",yytext);}
\{ {printf("\nblock begins");}
\} {printf("\nblock ends");}
\( {printf("\n");ECHO;}
{identifier}(\[[0-9]*\])* {printf("\n%s is an identifier",yytext);}
\".*\" {printf("\n %s is a string ",yytext);}
[0-9]+ {printf("\n%s is a number",yytext);
\<= |
\>= |
\< |
\> |
\== {printf("\n %s is a relational operator",yytext);}
\= |
\+ |
\- |
\/ |
\& |
% {printf("\n %s is a operator",yytext);}
.|
\n;
%%
int main(int argc,char **argv)
FILE *fi le; fi le=fopen("inp.c","r");
if(!file)
{ printf("could not open the fi le!!!");
exi t(0);
}
yyin=file; yylex();
printf("\n\n"); return(0);
int yywrap()
{ return 1;
INPUT FILE:
#include<stdio.h>
void main()
{ int a,b,c;
printf("enter the value for a,b");
scanf("%d%d",&a,&b)';
c=a+b;
printf("the value of c:%d",&c);
}
OUTPUT:
.LEXTOOL..
[3cse01@localhost ~]$ lex lan.l
[3cse01@localhost ~]$ cc lex.yy.c
[3cse01@localhost ~]$ ./a.out
#include<stdio.h> is a preprocessor directive
void is a keyword
function main(
block begins
int is a keyword
a is an identifier
b is an identifier
c is an identifier
function printf(
"enter the value for a,b" is a string
function scanf(
"%d%d" is a string
&is a operator
a is an identifier
&is a operator
b is an identifier
c is an identifier
= is a operator
a is an identifier
+ is a operator
b is an identifier
function printf(
"the value of c:%d" is a string
&is a operator
c is an identifier
block ends
6b.IMPLEMENTATION OF A LEXICAL ANALYZER FOR
IDENTIFIER USING LEXTOOL
Description:
LEX:
executed
general-purpose language; the generated program is named yyLEX. The yyLEX program will
recognize expressions in a stream (called input in this memo) and perform the specified actions for
each expression as it is detected. See Figure 1.
LEX Source Program lex.yy.c
LEX Compiler
lex.yy.c C Compiler a.out
Input Stream Sequence of Tokens

a.out
PROGRAM:
Lex.l
%option noyywrap
%{
int id_cnt=0; char ch;
%}
int|flaot|double|char
ch=input();
for(;;)
if(ch==,)
id_cnt++;
else if(ch==;)
break; ch=input();
} id_cnt++;
%%
main(int argc,char **argv)
FILE*fp;
fp=fopen(argv[1],r);
yyin=fp; yylex();
printf(\nTotal number of identifiers are %d,id_cnt);
INPUT FILE:
#include<stdio.h>
int main()
printf(C programming);
return 0;
}
OUTPUT:
..LEXTOOL.
Total number of identifiers are 2

7.CONVERSION OF REGULAR EXPRESSION TO NON-
DETERMINISTIC FINITE AUTOMATA
Description:
Method :
1) First phase r in to its constituent sub expressions.
2) Using rules (1) {BASIC RULES}, construct NFA for basic symbols.
3) Using rules (2) {INDUCTIVE RULE}, obtain NFA for regular expression.
4) Each NFA produced at each step posses some properties.
(i) Exactly one final state.
(ii) No edge enters start state.
(iii) No edge leaves final state.
Basic rules :
(1) Building states and transitions of partial NFA for symbols
st art
1 2
st art
1
a 2
(2)
(a) Building states and transitions of partial NFA for unions
a
2 3

st art
1 6
b
4 5
(b) Building states and transitions of partial NFA for concatenations

start a b
1 2 3
(c) Building states and transitions of partial NFA for closures

st art a
1 2 3 4
Description:
Input : A Regular expression r
Output : An DFA that recognizes L(r).
Method:
1) Construct an syntax tree for the augmented regular expression (r)#, where # is a
unique end marker appended to (r).
2) Construct the functions Null able, first pos, Laptops and follow-on by making
depth- first traversals of T.
3) Construct States, the set of states of D, the Duran the transition table of D by the
following procedure.
4) The start state of D is firstpos(root) and the accepting states are all those
containing the position associated with the end marker #.
PROGRAM:
Rnfa.c
#include<stdio.h>
#include<string.h>
char re[35];
int sta[20],fna[20],stb[20],fnb[20],ste[20],fne[20],cnt=0,cnta=0,cntb=0;
int cnte=0,tag=0,i=0,j=0,k=0,f=0,a=0,b=0,a1=0,b1=0;
void onlya(int sa,int fa)
sta[cnta]=sa;
fna[cnta]=fa;
cnta++;
void onlyb(int sa,int fa)
stb[cntb]=sa;
fnb[cntb]=fa;
cntb++;
void onlye(int sa,int fa)
ste[cnte]=sa;
fne[cnte]=fa;
cnte++;
void star()
int p=0;
if(re[i -1]!=')')
if(re[i -1]=='a')
onlye(fna[cnta-1],sta[cnta-1]);
onlye(sta[cnta-1],fna[cnta-1]);
}
if(re[i -1]=='b')
onlye(stb[cntb-1],fnb[cntb-1]);
onlye(fnb[cntb-1],stb[cntb-1]);
}}
else
j=i ;
do
j--;
if(re[j]=='a')
a1++;
if(re[j]=='b')
b1++;
if(re[j]==')')
p++;
if(re[j]=='(')
p--;
while((re[j]!='(')||(p>0));
if((re[j+1]=='a')||(re[j+2]=='a'))
onlye(cnt,sta[cnta-a1]);
onlye(sta[cnta-a1],cnt);
}
if((re[j+1]=='b')||(re[j+2]=='b'))
onlye(cnt,stb[cntb-b1]);
onlye(stb[cntb-b1],cnt);
}}}
void or()
if((re[i-1]!='a')||(re[i-1]!='b'))
for(k=i -1;k>=0;k--)
if(re[k]=='a')
a++;
if(re[k]=='b')
b++;
if(re[k]=='(')
onlye(++cnt,cnta-a);
if(re[k+1]=='b')
onlye(++cnt,cntb-b);
if(re[i -1]=='(')
i++;
if(re[i+1]=='a')
i++;
f=cnt;
onlye(f,++cnt);
onlya(f,++cnt);
if(re[i+1]=='b')
i++;
f=cnt;
onlye(f,++cnt);
onlyb(f,++cnt);
else
if((re[i-1]=='a')&&(re[i+1]=='b'))
onlyb(sta[cnta-1],++cnt);
onlye(fna[cnta-1],++cnt);
onlye(fnb[cntb-1],cnt);
i++;
if((re[i-1]=='b')&&(re[i+1]=='a'))
{
onlya(stb[cntb-1],++cnt);
onlye(fnb[cntb-1],++cnt);
onlye(fna[cnta-1],cnt);
i++;
}}}
int main()
printf("Enter the expression:");
scanf("%s",re);
for(i=0;i<strlen(re);i++)
if(re[i]=='a')
f=cnt;
onlya(f,++cnt);
if(re[i]=='b')
f=cnt;
onlyb(f,++cnt);
if(re[i]=='*')
star();
if(re[i]=='|')
or();
printf("\n states\t|a\t|b\t|e\n");
printf(".......................................");
while(tag<=cnt)
printf("\n");
printf("{%d}\t|",tag);
for(k=0;k<cnta;k++)
if(tag==sta[k])
printf("{%d}",fna[k]);
putchar('\t');
putchar('|');
for(k=0;k<=cntb;k++)
if(tag==stb[k])
printf("{%d}",fnb[k]);
putchar('\t');
putchar('|');
for(k=0;k<cnte;k++)
if(tag==ste[k])
printf("{%d}",fne[k]);
tag++;
printf("\n\t|\t|\t|");
putchar('\n'); return 0; }
OUTPUT:
.RE TO NFA
[2cse54@localhost ~]$ ./a.out
Enter the expression: (ab)*
states |a |b |e
.......................................
{0} |{1} |{0} |{2}
| | |
{1} | |{2} |{2}
| | |
{2} | | |{0}{1}
| | |
8a. ANALYSIS OF SOURCE PROGRAM USING YACC TOOL
Description:
YACC
Theory:
Yacc stands for "yet another compiler-compiler," reflecting the popularity of parser
generators in the early 1970's when the first version of Yacc was created by S. C. Johnson. Yacc is
available as a command on the UNIX system, and has been used to help implement hundreds of
compilers.
Structure of a Yacc Grammar
A yacc grammar consists of three sections: the definition section, the rules section, and the user
subroutines section.
... definition section ...
%%
... rules section ...
%%
... user subroutines section ...
The sections are separated by lines consisting of two percent signs. The first two sections are required,
although a section may be empty. The third section and the preceding "%%" line may be omitted.
PROGRAM:
Yac.y
%{
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
%left +-
%left */
%%
stmt:stmt\n
{
printf(\n.....Valid Expression...\n);
exit(0);
|expr
|error\n
printf(\n...Invalid....\n);
exit(0);
expr:num
|let
|expr+expr
|expr-expr
|expr*expr
|expr/expr
|(expr)%%
main()
{ printf(Enter an expression to validate:);
yyparse(); }
yylex()
{
int ch;
while((ch=getchar())==);
if(isdigit(ch))
return num;
if(isalpha(ch))
return let;
return ch;
yyerror(char*s)
{printf(%s,s);}
OUTPUT:
..YACC TOOL..
adminlinux@newlab2sys53:~s vi yac.y
adminlinux@newlab2sys53:~s yacc yac.y
adminlinux@newlab2sys53:~s cc y.tab.c
adminlinux@newlab2sys53:~s ./a.out
Enter the expression:a*b
Valid expression
8b.IMPLEMENTATIOON OF CALCULATOR USING LEX AND YACC TOOL
Description:
LEX:
executed
general-purpose language; the generated program is named yyLEX. The yyLEX program
will recognize expressions in a stream (called input in this memo) and perform the specified
actions for each expression as it is detected.
YACC
Theory:
compilers.
%%
%%
PROGRAM:
Cal.Y
%{
#include<stdio.h>
#include<math.h>
#include"y.tab.h"
%}
%%
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)
{yylval.dval=atof(yytext);
return NUMBER;}
MEM {return MEM;}
[\t];
\$ {return 0;}
\n {return yytext[0];}
. {return yytext[0];}
%%
Cal.y
%{
#include<stdio.h>
#include<math.h>
double memvar;
%}
%union
{
double dval;
}
%token<dval> NUMBER
%token<dval> MEM
%left '-' '+'
%left '*' '/'
%nonassoc UMINUS
%type<dval> expression
%%
start:statement '\n'
|start statement '\n'
statement:MEM '=' expression {memvar=$3;}
|expression {printf("answer=%g\n",$1);}
;
expression:expression'+'expression {$$=$1+$3;}
|expression'-'expression {$$=$1-$3;}
|expression'*'expression {$$=$1*$3;}
|expression'/'expression {if($3==0)
yyerror("divide by zero");
else
$$=$1/$3;
};
expression:'-'expression %prec UMINUS {$$= -$2;}
|'('expression')' {$$=$2;}
|NUMBER {$$=$1;}
|MEM {$$=memvar;};
%%
int main(void)
{
printf("Enter the expression");
yyparse();
printf("\n\n");
return 0;
}
int yywrap()
{
ww.annauniversityplus.com
54
return 0;
}
int yyerror(char *error)
{
printf("%s\n",error);
return 0;
}
OUTPUT:
..CALCULATOR USING YACC TOOL.
[CSE@localhost ~]$ lex cal.l
[CSE@localhost ~]$ yacc -d cal.y
[CSE@localhost ~]$ cc lex.yy.c y.tab.c -ll
[CSE@localhost ~]$ ./a.out
Enter the expression5+3
answer=8
[cse@NFSSERVER ~]$ ./a.out
Enter the expression5+-5
answer=0
[cse@NFSSERVER ~]$ ./a.out
Enter the expression+5/
syntax error
9.ANALYSIS OF SOURCE PROGRAM USING LEX AND YACC TOOL
Description:
LEX:
executed
general-purpose language; the generated program is named yyLEX. The yyLEX program
will recognize expressions in a stream (called input in this memo) and perform the specified
actions for each expression as it is detected.
Description:
YACC
Theory:
compilers.
%%
%%
PROGRAM:
Lexyac.c
%{
/*Definition Section*/
#include<stdio.h>
#include"y.tab.h"
//to keep track of errors
int l ineno=1;
%}
identifier[a-zA-Z][a-zA-Z]*
number[0-9]+
%%
main|(1)return MAIN;
if return IF;
else return ELSE;
while return WHILE;
int |
char |
float |
long return TYPE;
{identifier} return VAR;
{number} return NUM;
\< |
\> |
\<= |
\>= |
==return RELOP;
[:space: ]+;
[\t]+;
\n l ineno++;
.return yytext[0];
%%
int yywrap()
{
return 1;
}
Exp5.y:
%{
#include<stdio.h>
#include<string.h>
extern int l ineno;
int errno=0;
%}
%token NUM VAR RELOP MAIN
%token IF ELSE WHILE TYPE
//define precedence and associativity of operators]
%left'-''+'
%left'*''/'
%%
PROGRAM:MAIN BLOCK
;
BLOCK: '{'CODE'}'
;
CODE:BLOCK
|STATEMENT CODE
|STATEMENT
;
STATEMENT:DECST';'
|DECST{printf("\n missing ';' l ineno%d",lineno);errno++;}
|ASSIGNMENT';'
|ASSIGNMENT {printf("\nMissing ';' l ineno%d",lineno);errno++;}
|CONDST
|WHILEST
;
DECST:TYPE VARLIST
;
VARLIST:VAR','VARLIST
|VAR
;
ASSIGNMENT:VAR'='EXPR
;
EXPR:EXPR'+'EXPR
|EXPR'-'EXPR
|EXPR'*'EXPR
|EXPR'/'EXPR
|'-'EXPR
|'('EXPR')'
|VAR
|NUM
;
CONDST: IFST
|IFST ELSEST
;
IFST: IF'('CONDITION')'
BLOCK
;
ELSEST:ELSE
BLOCK
;
CONDITION:VAR RELOP VAR
|VAR RELOP NUM
|VAR
|NUM
;
WHILEST:WHILELOOP
;
WHILELOOP:WHILE'('CONDITION')'
BLOCK
;
%%
#include"lex.yy.c"
extern FILE *yyin;
int main(int argc,char *argv[])
{
FILE *fp;
int i ;
fp=fopen(argv[1],"r");
yyin=fp;
while(!feof(yyin))
{
yyparse();
}
i f(errno==0)
printf("\n no error found!! \n parsing successfull\n");
else
printf("\n%derrno(s) found!!",errno);
putchar('\n');
return 0;
}
yyerror()
{
printf("\n error on l ineno:%d\n",lineno);
errno++;
}
INPUT:
main()
{
a=5;
b=4;
c=a+b;
i f(a>b)
{
a=1;
}
else
{
c=2;
}
whi le(c==1)
{
a=1;
}
}
OUTPUT:
.......................SOURCE PROGRAM USING LEX &YACC TOOL............
[2cse54@localhost]lex lexyac.l
[2cse54@localhost]yacc -d lexyac.y
[2cse54@localhost]cc y.tab.c -ll
No Errors Found!!
Parsing Sucessfull
10. IMPLEMENTATION OF CODE GENERATOR TECHNIQUE
Description:
Code Optimization is an important phase of compiler. This phase optimizes the three
address codes into sequence of optimized three address codes.
Code Generator
Intermediate code statements
Optimized three address
codes(three address code)
Front end Code Generator Front end
Control flow Data Flow analysis Transformations

analysis
A Simple but effective technique for locally improving the target code is peephole
optimization,
A method for trying to improve the performance of the target program
By examining a short sequence of target instructions and replacing these instructions by a
shorter or faster sequence whenever possible.
Characteristics of peephole optimization
1. Redundant instruction elimination
2. Flow of control information
3. Algebraic Simplification
4. Use of machine Idiom.
PROGRAM:
Cg.c
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
char optr[4]={+,-,*,/};
char code[5][4]={{ADD},{SUB},{MUL},{DIV}};
char input[20][6];
void codegen();
void getip();
void assemble(char *pos, int j);
int main()
getip();
codegen();
void getip()
int i;
printf(Enter your 3 address code stmtz # to terminate:\n);
for(i=0;i++)
scanf(%s,input[i]);
if(strcmp(#,input[i]==0)
break;
}}
void codegen()
{
int i,j,flag=0;
for(i=0;strcmp(#,input[i]!=0;i++)
for(j=0;j<5;j++)
flag=0;
if(input[i][3]==\0)
printf(MOV%c,%c\n,input[i][2],input[i][0]);
flag=1;
break;
else
if(input[i][3]==optr[i])
assemble(input[i],j);
flag=1;
break;
}}}
if(flag==0)
printf(Error!!!\n);
exit(0);
}}}
void assemble(char*pos,int j)
int index=0;
printf(MOV%c,R%d\n,pos[4],index);
printf(%s%c,R%d\n,code[j],pos[2],index;
printf(MOV R%d,%c\n,index,pos[0]);
}
OUTPUT:
.CODE GENERATOR
Enter your 3 address code stmtz # to terminate:a=b+5
MOV 5,R0
S<<,R98
MOV R0,a
11. IMPLEMENTATION OF CODE OPTIMIZATION TECHNIQUE
Description:
Code Optimization is an important phase of compiler. This phase optimizes the three
address codes into sequence of optimized three address codes.
Code Optimization
Intermediate code statements Optimized three address
codes(three address code)
Front end Code Optimizer Front end
Control flow Data Flow analysis Transformations

analysis
A Simple but effective technique for locally improving the target code is peephole
optimization,
A method for trying to improve the performance of the target program
By examining a short sequence of target instructions and replacing these instructions by a
shorter or faster sequence whenever possible.
Characteristics of peephole optimization
1.Redundant instruction elimination
2.Flow of control information
3.Algebraic Simplification
4.Use of machine Idiom.
PROGRAM:
Co.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
struct op
char l;
char r[20];
}op[10],pr[10];
void main()
int a,i,k,j,n,z=0,m,q;
char*p,*l;
char temp,t;
char*tem;
clrscr();
printf(Enter no of values);
scanf(%d,&n);
for(i=0;i<n;i++)
printf(left:\t);
op[i].l=getche();
printf(right:\t);
scanf(%s,op[i].r);
printf(intermediate code\n);
for(i=0;i<n;i++)
{
printf(%c=,op[i].l);
printf(%s\n,op[i].r);
for(i=0;i<n-1;i++)
temp=op[i].l;
for(j=0;j/<n;j++)
p=strchr(op[j].r,temp);
if(p)
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i],r);
z++;
}}}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf(\n After dead code elimination\n);
for(k=0;k<z;k++)
printf(%c\t=,pr[k].l);
printf(%s\n,pr[k].r);
for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
p=strstr(tem,pr[j].r);
if(p)
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z:i++)
l=strchr(pr[i].r,t);
if(1)
a=1-pr[i].r;
//printf(pos:%d,a);
pr[i].r[a]=pr[m].l;
}}}}}
printf(Eliminate common expression\n);
for(i=0;i<z;i++)
printf(%c\t=pr[i].l);
printf(%s\n,pr[i].r);
//duplicate production elimination

for(i=0;i<z;i++)
for(j=i+1;j<z;j++)
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
pr[i].l=\0;
strcpy(pr[i].r,\0);
}}}
printf(optimized code);
for(i=0;i<z;i++)
if(pr[i].l!=\0)
printf(%c=,pr[i].l);
printf(%s\n.,pr[i].r);
}}
getch();
}
OUTPUT:
..CODE OPTIMIZATION
Enter no of values:3
left:a right:9
left:b right:c+d
left:e right:b+a
intermediate code
a=9
b=c+d
e=b+a
After dead code elimination
b=c+d
e=b+a
Eliminate common expression
b=c+d
e=b+a
optimized code
b=c+d
12. IMPLEMENTATION OF RECURSIVE DESCENT PARSER
Description:
Statically allocated names are bound to storage at compile time.
Storage bindings of statically allocated names never change, so even if a name is local to
a procedure, its name is always bound to the same storage.
The compiler uses the type of a name (retrieved from the symbol table) to determine
storage size required.
The required number of bytes (possibly aligned) is set aside for the name.
The address of the storage is fixed at compile time.
Storage is organized as a stack.
Activation records are pushed and popped.
Locals and parameters are contained in the activation records for the call.
This means locals are bound to fresh storage on every call.
If we have a stack growing downwards, we just need a stack_top pointer.
To allocate a new activation record, we just increase stack_top.
PROGRAM:
Rdp.c
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
char ip_sym[15],ip_ptr=0;
void e_prime();
void t();
void e();
void t_prime();
void f();
void advance();
void e()
printf(\n\t\t E----->TE);
t();
e_prime();
void e_prime()
if(ip_sym[ip_ptr]==+)
printf(\n\t\t E--->+TE);
advance();
t();
e_prime();
else
printf(\n\t\t E----->e);
void t()
printf(\n\t\t T----->FT);
f();
t_prime();
void t_prime()
{
if(ip_sym[ip_ptr]==*)
printf(\n\t\t T----->*FT);
advance();
f();
t_prime();
else
printf(\n\t\t T----->e);
}}
void f()
if((ip_sym[ip_ptr]==i)||(ip_sym[ip_ptr]==j))
printf(\n\t\t F----->i);
advance();
else
if(ip_sym[ip_ptr]==()
advance();
e();
if(ip_sym[ip_ptr]==))
advance();
printff(\n\t\t F----->(E));
else
printf(\n\t\t Syntax error);
getch();
exit(1);
}}}}
void advance()
ip_ptr++;
void main()
int i;
clrscr();
printf(\n\t\t GRAMMER WITHOUT RECURSION);
printf(\n\t\t E----->TE\n\t\t E/e\r\t\t T----->FT);
printf(\n\t\t T----->*FT/e\n\t\t F----->(E)/id);
printf(\n\t\t Enter the input symbol:);
gets(ip_sym);
printf(\n\t\t Sequence of production rules);
e();
getch();
}
OUTPUT:
..RECURSIVE DESCENT PARAER
GRAMMER WITHOUT RECURSION
E----->TE
T----->FT
T----->*FT/e
F----->(E)/id
Enter the input symbol: T
Sequence of production rules
E----->TE
T----->FT
T----->e
E----->e
13. IMPLEMENTATION OF OPERATOR PRECEDENCE PARSER
Description:
1.Get the input as expression.
2. Check the preference of each symbol in the given expression with
all other symbols in the input expression.
3. After checking display the Operator Precedence matrix with
corresponding symbol comparison and specify it using the relational
operators< (Less than), > (Greater than) and =(equals).
4. Also display the postfix expression format of the given expression
as tree structure.
PROGRAM:
Opp.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<ctype.h>
char q[9][9]={
{>,>,<,<,<,<,<,>,>};
{>,>,<,<,<,<,>,<,>};
{>,>,>,>,<,<,>,<,>};
{>,>,>,>,<,<,>,<,>};
{>,>,<,<,<,<,>,<,>};
{<,<,<,<,<,<,=,<,E};
{>,>,>,>,>,E,>,E,>};
{>,>,>,>,>,E,>,E,>};
{<,<,<,<,<,<,E,<,A};
};
char s[30],st[30],qs[30];
int top=-1,r=-1,p=0;
void push(char a)
top++; st[top]=a;
char pop()
char a; a=st[top];
top--; return a;
int find(char a)
switch(a)
case +: return 0;
case -: return 1;
case *: return 2;
case /: return 3;
case ^: return 4;
case (: return 5;
case ): return 6;
case a: return 7;
case $: return 8;
default: return-1;
}}
void display(char a)
{
printf(\n Shifft %c, a);
void display1(char a)
if(isalpha(a))
printf(\n Reduce E->%c,a);
else if((a==+)||(a==-)||(a==*)||(a==/)||(a==^))
printf(\n Reduce E->E%c E,a);
else if(a==))
printf(\n Reduce E->(E));
intrel(char a,char b,char d)
if(isalpha(a)!=0)
a=a;
if(isalpha(b)!=0)
b=a;
if(q[find(a)][find(b)]==d)
return 1;
else
return 0;
void main()
{
char s[100]; int i=-1;
clrscr();
printf(\n\t Operator preceding parser\n);
printf(\n Enter the arithmetic expression end with$:);
gets(s); push($);
while(i)
if((s[p]==$)&&(st[top]==$))
printf(\n\n Accepted);
break;
else if(rel(st[top],s[p],<)||rel(st[top],s[p],=)
display(s[p]); push(s[p]);
p++;
else if(rel(st[top],s[p].>))
do
r++;
qs[r]=pop();
display1(qs[r]);
}
while(!rel(st[top],qs[r],<));
}}
getch();
}
OUTPUT:
OPERATOR PRECEDENCE PARSER.
Enter the arithmetic expression end with$: a-(b*c)^d$
Shift a
Reduce E->a
Shift
Shift (
Shift b
Reduce E->b
Shift *
Shift c
Reduce E->c
Reduce E->E*E
Shift )
Reduce E->(E)
Shift ^
Shift d
Reduce E->d
Reduce E->EÊReduce E->E-EAccepted

14. IMPLEMENTATION OF SHIFT REDUCE PARSER
Description:
1. A source program has three parts as follows:

Declarations
%%
translation rules
%%
supporting C routines
2. Declarations Section:
This section contains entries that:
i. Include standard I/O header file.
ii. Define global variables.
iii. Define the list rule as the place to start processing.
iv. Define the tokens used by the parser.
v. Define the operators and their precedence.
3. Rules Section:
The rules section defines the rules that parse the input stream.
Each rule of a grammar production and the associated semantic
action.
4. Programs Section:
The programs section contains the following subroutines. Because
these subroutines are included in this file, it is not necessary to use
the yacc library when processing this file.
5. Main- The required main program that calls the yyparse subroutine to
start the program.
6. yyerror(s)
PROGRAM:
Srp.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<stdio.h>
char ip_sym[15],stack[15];
int ip_ptr=0,st_ptr=0,len,i;
char temp[2],temp2[2];
char act[15];
void check();
void main()
clrscr();
printf(\n\t\t Shift Reduce Parser\n);
printf(\n\t ***** ***** *****);
printf(\n GRAMMAR\n\n);
printf(E->E+E\n E->E/E\n);
printf(E->E*E\n E->a/b \n);
printf(\n Enter the input symbol:\t);
gets(ip_sym);
printf(\n\t\t Stack implementation table);
printf(\n Stack \t\t Input Symbol \t\t Action);
printf(\n $\t\ t%s$\t\t\t.....,ip_sym);
strcpy(act,shift);
temp[0]=ip_sym[ip_ptr];
temp[1]=\0;
strcat(act,temp);
len=strlen(ip_sym);
for(i=0;i<=len-1;i++)
stack[str_ptr]=ip_sym[ip_ptr];
stack[st_ptr+1]=\0;
ip_sym[ip_ptr]=;
ip_ptr++;
printf(\n $%s\t\t%s$\t\t\t%s,stack,ip_sym,act);
strcpy(act,shift);
temp[0]=ip_sym[ip_ptr];
temp[1]=\0;
strcat(act,temp);
check();
st_ptr++;
st_ptr++;
check();
getch();
void check()
int flag=0;
temp2[0]=stack[st_ptr];
temp[1]=\0;
if((!strcmpi(temp2,a))||(!strcmpi(temp2,b)))
stack[st_ptr]=E;
if(!strcmpi(temp2,a);
printf(\n$%s\t\t%s$\t\t\t->a,stack,ip_sym);
else
printf(\n$%s\t\t%s$\t\t\t E->a,stack,ip_sym);
flag=1;
}
if((!strcmpi(temp2,+))||(!strcmpi(temp2,*))||(!strcmpi(temp2,/)))
flag=1;
if((!strcmpi(stack,E+E))||(!strcmpi(stack,E/E))||(!strcmpi(stack,E*E)))
strcpy(stack,E);
st_ptr=0;
if(!strcmpi(stack,E+E))
printf(\n$%s\t\t%s$\t\t\t E->E+E,stack,ip_sym);
else if(!strcmpi(stack,E/E))
printf(\n$%s\t\t%s$\t\t\tE->E/E,stack,ip_sym);
else
printf(\n$%s\t\t%s$\t\t\tE->E*E,stack,ip_sym);
flag=1;
if(!strcmpi(stack,E)&&ip_ptr==len)
printf(\n$%s\t\t%s$\t\t\t Accept,ip_sym);
getch();
exit(0);
if(flag==0)
{
printf(\n$%s\t\t%s$\t\t\t Reject,stack,ip_sym);
return;
}
OUTPUT:
.. SHIFT REDUCE PARSER
Shift Reduce Parser
***** ***** *****
GRAMMAR
E->E+E
E->E/E
E->E*E
E->a/b
Enter the input symbol: if(a*b)
Stack implementation table
Stack Input symbol Action
$ if(a*b)$ -----
$i f(a*b)$ Shift i
$if (a*b)$ Shift f
$if( a*b)$ Shift (
$if(a *b)$ Shift a
$if(E *b)$ E->a
$if(E* b)$ Shift i
$if(E*E) b) Reject
Press any key to continue..............

15. IMPLEMENT TYPE CHECKING
Description:
The purpose of types
To define what the program should do.
e.g. read an array of integers and return a double
To guarantee that the program is meaningful.
that it does not add a string to an integer

that variables are declared before they are used
To document the programmer's intentions.
better than comments, which are not checked by the compiler
To optimize the use of hardware.
reserve the minimal amount of memory, but not more

use the most appropriate machine instructions
What belongs to type checking
Depending on language, the type checker can prevent
application of a function to wrong number of arguments,

application of integer functions to floats,
use of undeclared variables in expressions,
functions that do not return values,
division by zero
array indices out of bounds,
nonterminating recursion,
sorting algorithms that don't sort...
Languages differ greatly in how strict their static semantics is: none of the things above is
checked by all programming languages!
In general, the more there is static checking in the compiler, the less need there is for manual
debugging.
Description formats for different compiler phases
These formats are independent of implementation language.
Lexer: regular expressions

Parser: BNF grammars
Type checker: typing rules
Interpreter: operational semantic rules
Code generator: compilation schemes
PROGRAM:
Typeche.c
#include <stdio.h>
#include <conio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
char* type(char[],int);
void main()
{
char a[10],b[10],mess[20],mess1[20];
int i,l;
clrscr();
printf( "\n\n int a,b;\n\n int c=a+b\n");
printf( "\n\n Enter a value for a\n");
scanf("%s", a);
l=strlen(a);
printf(" \n a is :");
strcpy(mess,type(a,l));
printf("%s",mess);
printf( "\n\n Enter a value for b\n\n");
scanf("%s",b);
l=strlen(b);
printf(" \n b is :");
strcpy(mess1,type(b,l));
printf("%s",mess1); if(strcmp(mess,"int")==0&&strcmp(mess1,"int")==0)
{
printf("\n\n No Type Error");
}
else
{
printf("\n\n Type Error");
getch();
}
char* type(char x[],int m)
{
int i;
char mes[20];
for(i=0;i<m;i++)
{
if(isalpha(x[i]))
{
strcpy(mes,"AlphaNumeric");
goto X;
}
else if (x[i]=='.')
{
strcpy(mes,"float");
goto X;
}
if(ischar(x[i]))
{
strcpy(mes,"Character");
goto X;
}
else if (x[i]=='.')
{
strcpy(mes,"int");
goto X;
}
if(isdouble(x[i]))
{
strcpy(mes,"Double");
goto X;
}
else if (x[i]=='.')
{
strcpy(mes,"alpha numeric");
goto X;
}
}
strcpy(mes,"int");
X: return mes;
}
OUTPUT:
..TYPE CHECKING.
int a,b;
int c=a+b;
Enter a value for a: 10.5
a is float
Enter a value for b: 56
b is int
Type error
int a,b;
int c=a+b;
Enter a value for a: 22
a is int
Enter a value for b: a10
b is alphanumeric
Type error
int a,b;
int c=a+b;
Enter a value for a: 10
a is int
Enter a value for b: 20
b is int
c=30
No Type error
16.PROGRAM FOR COMPUTATION OF FIRST
Description:
This set is created to know what terminal symbol is derived in the first position by a
non-terminal. For example,
t
That is, derives t (terminal) in the very first position. So, t FIRST().
Calculating First Set
Look at the definition of FIRST() set:
if is a terminal, then FIRST() = { }.
if is a non-terminal and is a production, then FIRST() = { }.
if is a non-terminal and 1 2 3 n and any FIRST() contains t, then t is in
FIRST().
First set can be seen as: FIRST() = { t | * t } { | * }
PROGRAM:
First.c
#include<stdio.h>
#include<conio.h>
#include<string.h>
void main()
{
char t[5],nt[10],p[5][5],first[5][5],temp; int
i,j,not,nont,k=0,f=0;
clrscr();
printf("\nEnter the no. of Non-terminals in the grammer:");
scanf("%d",&nont);
printf("\nEnter the Non-terminals in the grammer:\n");
for(i=0;i<nont;i++)
{
scanf("\n%c",&nt[i]);
}
printf("\nEnter the no. of Terminals in the grammer: ( Enter e for absiline ) ");
scanf("%d",&not);
printf("\nEnter the Terminals in the grammer:\n");
for(i=0;i<not||t[i]=='$';i++)
{
scanf("\n%c",&t[i]);
}
for(i=0;i<nont;i++)
{
p[i][0]=nt[i];
first[i][0]=nt[i];
}
printf("\nEnter the productions :\n");
for(i=0;i<nont;i++)
{
scanf("%c",&temp);
printf("\nEnter the production for %c ( End the production with '$' sign )
:",p[i][0]);
for(j=0;p[i][j]!='$';)
{
j+=1;
scanf("%c",&p[i][j]);
}
}
for(i=0;i<nont;i++)
{
printf("\nThe production for %c -> ",p[i][0]);
for(j=1;p[i][j]!='$';j++)
{
printf("%c",p[i][j]);
}
}
for(i=0;i<nont;i++)
{
f=0;
for(j=1;p[i][j]!='$';j++)
{
for(k=0;k<not;k++)
{
if(f==1)
break;
if(p[i][j]==t[k])
{
first[i][j]=t[k];
first[i][j+1]='$';
f=1;
break;
}
else if(p[i][j]==nt[k])
{
first[i][j]=first[k][j];
if(first[i][j]=='e')
continue;
first[i][j+1]='$';
f=1;
break;
}
}
}
}
for(i=0;i<nont;i++)
{
printf("\n\nThe first of %c -> ",first[i][0]);
for(j=1;first[i][j]!='$';j++)
{
printf("%c\t",first[i][j]);
}
}
getch();
}
OUTPUT:
..PROGRAM FOR FIRST..
Enter the no. of Non-terminals in the grammer:3
Enter the Non-terminals in the grammer:
ERT
Enter the no. of Terminals in the grammer: ( Enter e for absiline ) 5
Enter the Terminals in the grammer:
ase*+
Enter the productions :
Enter the production for E ( End the production with '$' sign ) :a+s$
Enter the production for R ( End the production with '$' sign ) :e$
Enter the production for T ( End the production with '$' sign ) :Rs$
The production for E -> a+s
The production for R -> e
The production for T -> Rs
The first of E -> a
The first of R -> e
The first of T -> e s
17.TO IMPLEMENT STACK ALLOCATION USING ARRAY
DESCRIPTION:
Procedure calls and their activations are managed by means of stack memory allocation. It
works in last-in-first-out (LIFO) method and this allocation strategy is very useful for
recursive procedure calls.
PROGRAM:
Stack.c
#include<stdio.h>
#include<conio.h>
#define MAXSIZE 10
void push();
int pop();
void traverse();
int stack[MAXSIZE];
int Top=-1;
void main()
{
int choice;
char ch;
do
{
clrscr();
printf("\n1. PUSH ");
printf("\n2. POP ");
printf("\n3. TRAVERSE ");
printf("\nEnter your choice");
scanf("%d",&choice);
switch(choice)
{
case 1: push();
break;
case 2: printf("\nThe deleted element is %d",pop());
break;
case 3: traverse();
break;
default: printf("\nYou Entered Wrong Choice");
}
printf("\nDo You Wish To Continue (Y/N)");
fflush(stdin);
scanf("%c",&ch);
}
while(ch=='Y' || ch=='y');
}
void push()
{
int item;
if(Top == MAXSIZE - 1)
{
printf("\nThe Stack Is Full");
getch();
exit(0);
}
else
{
printf("Enter the element to be inserted");
scanf("%d",&item);
Top= Top+1;
stack[Top] = item;
}
}
int pop()
{
int item;
if(Top == -1)
{
printf("The stack is Empty");
getch();
exit(0);
}
else
{
item = stack[Top];
Top = Top-1;
}
return(item);
}
void traverse()
{
int i;
if(Top == -1)
{
printf("The Stack is Empty");
getch();
exit(0);
}
else
{
for(i=Top;i>=0;i--)
{
printf("Traverse the element");
printf("\n%d",stack[i]);
}
}
}
OUTPUT:
..STACK USING ARRAY.

1.PUSH
2. POP
3. TRAVERSE
4. QUIT
Enter your choice(1-4) 1
Enter the element: 10
Do you want to enter more elements? (y/n) y

Enter the element:20
Do you want to enter more elements?(y/n)y
Enter the element:30
Do you want to enter more elements?(y/n)n
The List is created
1.PUSH
2. POP
3. TRAVERSE
4. QUIT
Enter the element you want to delete: 20
The element is present in the list
The element is deleted
1.PUSH
2. POP
3. TRAVERSE
4. QUIT
10-> 30-> NULL
18. C PROGRAM TO IMPLEMENT SYNTAX DIRECTED TREE
DESCRIPTION:
The next phase is called the syntax analysis or parsing. It takes the token produced by lexical
analysis as input and generates a parse tree (or syntax tree). In this phase, token
arrangements are checked against the source code grammar, i.e., the parser checks if the
expression made by the tokens is syntactically correct.
PROGRAM:
Syn.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int parsecondition(char[],int,char*,int);
void gen(char [],char [],char[],int);
int main()
{
int counter = 0,stlen =0,elseflag=0;
char stmt[60]; // contains the input statement
char strB[54]; // holds the expression for 'if'
condition
char strS1[50]; // holds the statement for true
condition
char strS2[45]; // holds the statement for false
condition
printf("Format of if statement \n Example...\n");
printf("if (a<b) then (s=a);\n");
printf("if (a<b) then (s=a) else (s=b);\n\n");
printf("Enter the statement \n");
gets(stmt);
stlen = strlen(stmt);
counter = counter + 2; // increment over 'if'
counter = parsecondition(stmt,counter,strB,stlen);
if(stmt[counter]==')')
counter++;
counter = counter + 3; // increment over 'then'
counter = parsecondition(stmt,counter,strS1,stlen);
if(stmt[counter+1]==';')
{ //reached end of statement, generate the output
printf("\n Parsing the input statement....");
gen(strB,strS1,strS2,elseflag);
return 0;
}
if(stmt[counter]==')')
counter++; // increment over ')'
counter = counter + 3; // increment over 'else'
counter = parsecondition(stmt,counter,strS2,stlen);
counter = counter + 2; // move to the end of
statement
if(counter == stlen)
{ //generate the output
elseflag = 1;
printf("\n Parsing the input statement....");
gen(strB,strS1,strS2,elseflag);
return 0;
}
return 0;
} /* Function :
parsecondition
Description : This function parses the statement
from the given index to get the statement enclosed
in ()
Input : Statement, index to begin search, string
to store the condition, total string length
Output : Returns 0 on failure, Non zero counter
value on success
*/
int parsecondition(char input[],int cntr,char
*dest,int totallen)
{
int index = 0,pos = 0;
while(input[cntr]!= '(' && cntr <= totallen)
cntr++;
if(cntr >= totallen)
return 0;
index = cntr;
while (input[cntr]!=')')
cntr++;
if(cntr >= totallen)
return 0;
while(index<=cntr)
dest[pos++] = input[index++];
dest[pos]='\0'; //null terminate the string
return cntr; //non zero value
} /* Function :
gen ()
Description : This function generates three
address code
Input : Expression, statement for true condition,
statement for false condition, flag to denote if
the 'else' part is present in the statement
output :Three address code
*/
void gen(char B[],char S1[],char S2[],int elsepart)
{
int Bt =101,Bf = 102,Sn =103;
printf("\n\tIf %s goto %d",B,Bt);
printf("\n\tgoto %d",Bf);
printf("\n%d: ",Bt);
printf("%s",S1);
if(!elsepart)
printf("\n%d: ",Bf);
else
{ printf("\n\tgoto %d",Sn);
printf("\n%d: %s",Bf,S2);
printf("\n%d:",Sn);
}
}
OUTPUT:
.SYNTAX DIRECTED.
Format of if statement
Example ...
if (a<b) then (s=a);
if (a<b) then (s=a) else (s=b);
Enter the statement
if (a<b) then (x=a) else (x=b);
Parsing the input statement....
If (a<b) goto 101
goto 102
101: (x=a)
goto 103
102: (x=b)
103:

CD Manual

Uploaded by

Copyright:

Available Formats

CD Manual

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

CD Manual

Uploaded by

Copyright:

Available Formats

1.

CASE STUDY FOR COMPILER

LANGUAGE PROCESSING SYSTEM

Lex with Yacc

Symbol table management:

void search ();

char label[10], adddress[10], value[5];

int ch,n=0,i,flag=0; char a[10];

char a[10]; int opt; clrscr();

printf("\n Enter the choice\n");

case 1: create(); break;

case 2: search(); break;

case 3: modify(); break;

case 4: display(); break;

case 5: exit(0); break;

printf("\nEnter the Address, Label, Opcode, Operand\n");

{ printf("\nThe Label is already there");

printf("\nEnter the Label"); scnaf("%s",&a);

char a[10]; flag=0; printf("\nEnter the label\n"); scanf("%s",&a);

{ printf("\nEnter the Address, Value\n");

Enter the choice

Enter the Address, Label, Opcode, Operand

1000 COPY START 4890

Enter the choice

Enter the Address, Label, Opcode, Operand

2000 COPY START 2345

The Label is already there

Enter the choice

Enter the Address, Label, Opcode, Operand

2000 CLOOP LDA 9000

Enter the choice

Enter the Label COPY

LABEL ADDRESS VALUE

COPY 1000 4890

Enter the choice

Enter the label

Enter the Address, Value

Enter the choice

LABEL ADDRESS VALUE

COPY 1000 4890

CLOOP 2003 5000

Enter the choice

lex.yy.c C Compiler a.out

Input Stream Sequence of Tokens

if(c==+|| c==-||c==*|| c==/|| c==%|| c==^)

Enter the input string: a=b*c

if(c==>|| c==<||c==||| c===)

printf(enter the input string:);

case 0: if(c==i) state=1;

printf(\n enter the string);

printf(\n string is %s ,str);

printf(keyword list is \n);

printf(\n\n%s string is a keyword,str);

enter the string: abcd

abcd string is not a keyword

enter the string: if

cout<<Enter the expression\n;

LEXICAL ANALYSER IN C++..