Compiler Design Lab Manual New
ENGINEERING COLLEGE
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
REGULATION R2021
PREPARED BY
Ms.V.Anithalakshmi
ASST. PROFESSOR/CSE
Table of Contents
Chapter Topics
I Introduction
II Syllabus
III Hardware and Software Requirements
IV Description about Experiments
V Exercises
1.
a. Using the LEX tool, develop a lexical analyser to recognize a few
patterns in C (e.g. identifiers, constants, comments, operators, etc.).
b.Create a symbol table, while recognizing identifiers.
2. Implement a Lexical Analyzer using LEX Tool.
3. Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses operator
+, -, * and /.
b. Program to recognize a valid variable which starts with a letter
followed by any number of letters or digits.
c. Program to recognize a valid control structures syntax of C language
(For loop, while loop, if-else, if-else-if, switch-case, etc.).
d. Implementation of calculator using LEX and YACC
4. Generate three address code for a simple program using LEX and
YACC.
5. Implement type checking using Lex and Yacc.
6. Implement simple code optimization techniques (Constant folding,
Strength reduction and Algebraic transformation)
7. Implement back-end of the compiler for which the three address code
is given as input and the 8086-assembly language code is produced as
output. The target assembly instructions can be simple move , add, sub,
jump. Also simple addressing modes are used.
CS3501 COMPILER DESIGN LABORATORY
CHAPTER 1
INTRODUCTION
COMPILER
A compiler is a program that reads a program written in one language (the source language) and
translates it into an equivalent program in another language (the target language).
ANALYSIS-SYNTHESIS MODEL OF COMPILATION
There are two parts to compilation: analysis and synthesis. The analysis part breaks up the source
program into its constituent pieces and creates an intermediate representation of the source program. The
synthesis part constructs the desired target program from the intermediate representation. Of the two
parts, synthesis requires the most specialized techniques.
PHASES OF A COMPILER
A compiler operates in six phases, each of which transforms the source program from one
representation to another. The six phases are lexical analysis, syntax analysis, semantic analysis,
intermediate code generation, code optimization and code generation. The first three phases form the
bulk of the analysis portion of a compiler. Two other activities, symbol table management and error
handling, interact with all six phases.
LEXICAL ANALYSIS
In compiler, lexical analysis is also called linear analysis or scanning. In lexical analysis the stream
of characters making up the source program is read from left to right and grouped into tokens that are
sequences of characters having a collective meaning.
SYNTAX ANALYSIS
Syntax analysis is also called hierarchical analysis or parsing. It involves grouping the tokens of the source
program into grammatical phrases that are used by the compiler to synthesize output. Usually, a parse tree
represents the grammatical phrases of the source program.
SEMANTIC ANALYSIS
The semantic analysis phase checks the source program for semantic errors and gathers type information
for the subsequent code generation phase. It uses the hierarchical structure determined by the syntax-
analysis phase to identify the operators and operands of expressions and statements. An important
component of semantic analysis is type checking. Here the compiler checks that each operator has
operands that are permitted by the source language specification.
SYMBOL TABLE MANAGEMENT
A symbol table is a data structure containing a record for each identifier, with fields for the attributes
of the identifier. The data structure allows us to find the record for each identifier quickly and to store or
retrieve data from that record quickly. When the lexical analyzer detects an identifier in the source
program, the identifier is entered into the symbol table. The remaining phases enter further information
about identifiers into the symbol table.
ERROR DETECTION
Each phase can encounter errors. The syntax and semantic analysis phases usually handle a large
fraction of the errors detectable by the compiler. The lexical phase can detect errors where the characters
remaining in the input do not form any token of the language. Errors where the token stream violates the
structure rules of the language are detected by the syntax analysis phase.
INTERMEDIATE CODE GENERATION
After syntax and semantic analysis, some compilers generate an explicit intermediate
representation of the source program. This intermediate representation should have two important
properties: it should be easy to produce and easy to translate into the target program.
CODE OPTIMIZATION
The code optimization phase attempts to improve the intermediate code so that faster-running
machine code will result. There are simple optimizations that significantly improve the running time of
the target program without slowing down compilation too much.
CODE GENERATION
The final phase of compilation is the generation of target code, consisting normally of relocatable
machine code or assembly code.
CHAPTER 2
SYLLABUS
CS3501 COMPILER DESIGN LABORATORY L T P C
0 0 3 2
OBJECTIVES:
The student should be made to:
▪ Be exposed to compiler writing tools.
▪ Learn to implement the different Phases of compiler
▪ Be familiar with control flow and data flow analysis
▪ Learn simple optimization techniques
LIST OF EXPERIMENTS:
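1. Using the LEX tool, develop a lexical analyzer to recognize a few patterns in C (e.g.
identifiers, constants, comments, operators, etc.). Create a symbol table while recognizing
identifiers.
2. Implement a lexical analyzer using the LEX tool.
3. Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses operators +, -, * and /.
b. Program to recognize a valid variable which starts with a letter followed by any
number of letters or digits.
c. Program to recognize valid control structure syntax of the C language (for loop, while loop,
if-else, if-else-if, switch-case, etc.).
d. Implementation of a calculator using LEX and YACC.
4. Generate three address code for a simple program using LEX and YACC.
5. Implement type checking using LEX and YACC.
6. Implement simple code optimization techniques (constant folding, strength reduction and
algebraic transformation).
7. Implement the back-end of the compiler for which the three address code is given as input and the
8086 assembly language code is produced as output.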
OUTCOMES: At the end of the course, the student should be able to
● Implement the different Phases of compiler using tools
● Analyze the control flow and data flow of a typical program
● Optimize a given program
● Generate an assembly language program equivalent to a source language program
CHAPTER 3
HARDWARE REQUIREMENTS
Processor Pentium IV
SOFTWARE REQUIREMENTS
The experiments expose students to compiler writing tools, to implementing the
different phases of a compiler, to control flow and data flow analysis, and to simple
optimization techniques. They also cover the breadth of material required for designing
a compiler. The lexical analyzer experiment illustrates the process of producing tokens,
eliminating blanks and comments, and generating a symbol table that stores information
about the identifiers and constants encountered in the input. The lexical analyzer works
in two phases: the first phase scans the input, and the second phase performs lexical
analysis proper, producing the series of tokens.
LEARNING OBJECTIVES
The objective of this lab is to teach the students about various operating systems,
including Windows and Unix. Students learn about systems configuration and
administration. Students learn, explore and practice technologies related to compiler
design and also gain working knowledge of the YACC parser generator.
YACC
YET ANOTHER COMPILER-COMPILER
Yacc is a computer program for the Unix operating system. The name is an
acronym for "Yet Another Compiler Compiler". It is an LALR parser generator: from an
analytic grammar written in a notation similar to BNF, it generates a parser, the part of
a compiler that tries to make syntactic sense of the source code. Yacc provides a general
tool for describing the input to a computer program. The Yacc user specifies the
structures of the input, together with code to be invoked as each such structure is
recognized. Yacc turns such a specification into a subroutine that handles the input
process; frequently, it is convenient and appropriate to have most of the flow of control
in the user's application handled by this subroutine.
GENERAL PROCEDURE FOR EXECUTING THE PROGRAMS
1. Use a text editor to create a file called hello.c containing the following lines:
#include<stdio.h>
int main(void)
{
printf("hello world");
return 0;
}
2. Save this file. At the UNIX command prompt, invoke the gcc compiler as follows:
gcc hello.c
3. The result of this step will be an executable file called a.out. This is the default file
name given by the compiler.
4. To execute the program, type ./a.out
5. To compile a program and save the compiled version under a different file name, use
the -o option, as in gcc -o hello.exe hello.c
CHAPTER 5
EXERCISES
Exercise-1(a):
Using the LEX tool, develop a lexical analyzer to recognize a few patterns in C (e.g.
identifiers, constants, comments, operators, etc.).
DESCRIPTION
Lexical analysis is the process of converting a sequence of characters into a
sequence of tokens, i.e. meaningful character strings. A program or function that
performs lexical analysis is called a lexical analyzer, lexer, tokenizer, or scanner, though
"scanner" is also used for the first stage of a lexer. A lexer is generally combined with a
parser, which together analyze the syntax of programming languages, such as in
compilers, but also HTML parsers in web browsers, among other examples.
PROCEDURE:
PROGRAM:
#include<stdio.h>
#include<ctype.h>
#include<string.h>
#include<stdlib.h>
#define SIZE 128
#define NONE -1
#define EOS '\0'
#define NUM 256
#define KEYWORD 257
#define PAREN 258
#define ID 259
#define ASSIGN 260
#define REL_OP 261
#define DONE 262
#define MAX 999
char lexemes[MAX];
char buffer[SIZE];
int lastchar = -1;
int lastentry = 0;
int tokenval=NONE;
int lineno=1;
struct entry
{
char *lexptr;
int token;
}symtable[100];
struct entry keywords[]={"if",KEYWORD,"else",KEYWORD,"for",KEYWORD,
"int",KEYWORD,"float",KEYWORD,"double",KEYWORD,"char",KEYWORD,
"struct",KEYWORD,"return",KEYWORD,0,0};
void Error_Message(char *m)
{
fprintf(stderr,"line %d:%s",lineno,m);
exit(1);
}
int look_up(char s[])
{
int k;
for(k=lastentry;k>0;k--)
if(strcmp(symtable[k].lexptr,s)==0)
return k;
return 0;
}
int insert(char s[],int tok)
{
int len; len=strlen(s);
if(lastentry+1>=100) /* symtable[] holds 100 entries */
Error_Message("Symbol Table is Full");
if(lastchar+len+1>=MAX)
Error_Message("Lexemes Array is Full");
lastentry++;
symtable[lastentry].token=tok;
symtable[lastentry].lexptr=&lexemes[lastchar+1];
lastchar = lastchar + len + 1;
strcpy(symtable[lastentry].lexptr,s);
return lastentry;
}
void Initialize()
{
struct entry *ptr;
for(ptr=keywords;ptr->token;ptr++)
insert(ptr->lexptr,ptr->token);
}
int lexer()
{
int t;
int val,i=0;
while(1)
{
t=getchar();
if(t == ' '|| t=='\t');
else if(t=='\n')
lineno++;
else if(t == '('|| t == ')')
return PAREN;
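else if(t=='<' || t=='>' || t=='!')
{
int next=getchar(); /* look ahead for '=' to form <=, >= or != */
if(next=='=')
return REL_OP;
if(next!=EOF)
ungetc(next,stdin);
if(t!='!')
return REL_OP;
tokenval=NONE;
return t; /* a lone '!' is not a relational operator */
}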
else if(t == '=')
return ASSIGN;
else if(isdigit(t))
{
ungetc(t,stdin);
scanf("%d",&tokenval);
return NUM;
}
else if(isalpha(t))
{
while(isalnum(t))
{
buffer[i]=t;
t=getchar();
i++;
if(i>=SIZE)
Error_Message("compiler error");
}
buffer[i]=EOS;
if(t!=EOF)
ungetc(t,stdin);
val=look_up(buffer);
if(val==0)
val=insert(buffer,ID);
tokenval=val;
return symtable[val].token;
}
else if(t==EOF)
return DONE;
else
{
tokenval=NONE;
return t;
}
}
}
int main(void)
{
int lookahead;
printf("\nProgram for Lexical Analysis \n");
Initialize();
printf("\n Enter the expression and put ; at the end");
printf("\n Press Ctrl + D (Ctrl + Z on Windows) to terminate... \n");
lookahead=lexer();
while(lookahead!=DONE)
{
if(lookahead==NUM)
printf("\n Number: %d",tokenval);
if(lookahead=='+'|| lookahead=='-'|| lookahead=='*'||
lookahead=='/')
printf("\n Operator\t\t%c",lookahead);
if(lookahead==PAREN)
printf("\n Parenthesis");
if(lookahead==ID)
printf("\n Identifier: %s",symtable[tokenval].lexptr);
if(lookahead==KEYWORD)
printf("\n Keyword");
if(lookahead==ASSIGN)
printf("\n Assignment Operator");
if(lookahead==REL_OP)
printf("\n Relational Operator");
lookahead=lexer();
}
}
Sample Output:
CONCLUSION:
Thus the C program for developing lexical analyzer to recognize a few patterns
was written and executed successfully.
Exercise-1(b):
Create a symbol table, while recognizing identifiers.
DESCRIPTION:
A symbol table is a data structure used by a language translator such as a compiler
or interpreter, where each identifier in a program's source code is associated with
information relating to its declaration or appearance in the source, such as its type, scope
level and sometimes its location. A common implementation technique is to use a hash
table.
A compiler may use one large symbol table for all symbols or use separated,
hierarchical symbol tables for different scopes. There are also trees, linear lists and self-
organizing lists which can be used to implement symbol table. It also simplifies the
classification of literals in tabular format. The symbol table is accessed by most phases of
a compiler, beginning with the lexical analysis to optimization.
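The hash-table technique mentioned above can be sketched as follows. This is a minimal illustration, not the program used in this exercise: the names `sym_insert` and `sym_lookup`, the table size of 64 and the open-addressing scheme with linear probing are all assumptions made for the example.

```c
#include <string.h>

#define TABLE_SIZE 64   /* hypothetical number of slots */
#define NAME_LEN   32   /* hypothetical maximum name/type length */

struct symbol {
    char name[NAME_LEN];
    char type[NAME_LEN];
    int  in_use;
};

static struct symbol table[TABLE_SIZE];

/* Simple multiplicative string hash, reduced to a slot index. */
static unsigned hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Return the slot holding name, or -1 if it is not in the table. */
int sym_lookup(const char *name)
{
    unsigned start = hash(name), i;
    for (i = 0; i < TABLE_SIZE; i++) {
        unsigned slot = (start + i) % TABLE_SIZE;
        if (!table[slot].in_use)
            return -1;              /* an empty slot ends the probe sequence */
        if (strcmp(table[slot].name, name) == 0)
            return (int)slot;
    }
    return -1;
}

/* Insert name with its type. Returns the slot used (the existing slot if
   the name is already present), or -1 if the table is full or a string
   does not fit. */
int sym_insert(const char *name, const char *type)
{
    unsigned start, i;
    int found = sym_lookup(name);
    if (found >= 0)
        return found;
    if (strlen(name) >= NAME_LEN || strlen(type) >= NAME_LEN)
        return -1;
    start = hash(name);
    for (i = 0; i < TABLE_SIZE; i++) {
        unsigned slot = (start + i) % TABLE_SIZE;
        if (!table[slot].in_use) {
            strcpy(table[slot].name, name);
            strcpy(table[slot].type, type);
            table[slot].in_use = 1;
            return (int)slot;
        }
    }
    return -1;
}
```

A lexical analyzer would call `sym_insert` each time it recognizes an identifier, and later phases would use `sym_lookup` to retrieve the stored attributes. Because entries are never deleted in this sketch, stopping the probe at the first empty slot is safe.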
PROCEDURE:
PROGRAM:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main(void)
{
char in[50],dig[50],id[50];
int i=0,j=0,k,l=0;
printf("Enter the Expression:\t");
/* fgets() bounds the read to the buffer size, unlike gets() */
if(fgets(in,sizeof in,stdin)==NULL)
return 1;
printf("\n***************************************************************************");
printf("\nDataType Identifier Address Constants Operators SpecialChar\n");
printf("\n***************************************************************************");
while(in[i]!='\0')
{
if(isalpha(in[i]))
{
j=0;
while((isalpha(in[i]))||(isdigit(in[i])))
{
id[j]=in[i];
i++;
j++;
}
id[j]='\0';
if(strcmp(id,"char")==0||strcmp(id,"int")==0||strcmp(id,"float")==0||strcmp(id,"if"
)==0||strcmp(id,"long")==0||strcmp(id,"while")==0||strcmp(id,"do")==0||
strcmp(id,"for")==0||strcmp(id,"switch")==0||strcmp(id,"double")==0)
{
printf("\n");
for(l=0;l<j;l++)
printf("%c",id[l]);
}
else
{
printf("\t\t\t");
for(l=0;l<j;l++)
printf("%c",id[l]);
/* Address column: address of the buffer that holds the identifier */
printf("\t%p",(void *)id);
}
}
else if(isdigit(in[i]))
{
k=0;
while(isdigit(in[i]))
{
dig[k]=in[i];
i++;
k++;
}
printf("\n\t\t\t\t");
for(l=0;l<k;l++)
printf("%c",dig[l]);
}
else if(in[i]=='+'||in[i]=='-'||in[i]=='*'||in[i]=='/'||in[i]=='<'||in[i]=='>'||in[i]=='=')
{
printf("\t\t\t\t\t\t\t%c",in[i]);
i++;
}
else if(in[i]==';'||in[i]==':'||in[i]=='.'||in[i]=='('||in[i]==')'||in[i]=='{'||in[i]=='}')
{
printf("\t\t\t\t\t\t\t\t\t%c",in[i]);
i++;
}
else
i++;
printf("\n");
printf("-------------------------------------------------------------------------");
}
}
SAMPLE OUTPUT:
Result :
Thus the C program for implementing symbol table was written and executed
successfully.
Exercise-2:
DESCRIPTION:
Lex helps write programs whose control flow is directed by instances of regular
expressions in the input stream. It is well suited for editor-script type transformations and for
segmenting input in preparation for a parsing routine.
Lex source is a table of regular expressions and corresponding program fragments. The
table is translated to a program which reads an input stream, copying it to an output stream and
partitioning the input into strings which match the given expressions. As each such string is
recognized the corresponding program fragment is executed. The recognition of the
expressions is performed by a deterministic finite automaton generated by Lex. The program
fragments written by the user are executed in the order in which the corresponding regular
expressions occur in the input stream.
The lexical analysis programs written with Lex accept ambiguous specifications and
choose the longest match possible at each input point. If necessary, substantial lookahead is
performed on the input, but the input stream will be backed up to the end of the current
partition, so that the user has general freedom to manipulate it.
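The longest-match rule described above can be illustrated in plain C. This sketch is not how Lex is implemented (Lex builds a deterministic finite automaton); it only shows the effect of the rule on a handful of operator patterns. The function name `longest_operator` and the operator list are assumptions chosen for the example.

```c
#include <string.h>

/* Candidate operator lexemes. Their order does not matter here because
   the longest matching candidate always wins. */
static const char *operators[] = { "<", "<=", "=", "==", ">", ">=", "!=" };

/* Return the length of the longest operator that is a prefix of input,
   or 0 if no operator matches at this point. */
int longest_operator(const char *input)
{
    size_t best = 0, i;
    for (i = 0; i < sizeof operators / sizeof operators[0]; i++) {
        size_t len = strlen(operators[i]);
        if (len > best && strncmp(input, operators[i], len) == 0)
            best = len;
    }
    return (int)best;
}
```

For the input `<=b`, both `<` and `<=` match at the current position, and the two-character lexeme is chosen; the scanner then resumes at `b`.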
PROCEDURE:
PROGRAM:
%{
#include<stdio.h>
#include<stdlib.h>
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor directive",yytext);}
int |
float |
char |
double |
while |
do |
if |
break |
continue |
void |
switch |
return |
else |
goto {printf("\n%s is a keyword",yytext);}
{identifier}\( {printf("\n function %s",yytext);}
\{ {printf("\nblock begins");}
\} {printf("\nblock ends");}
\( {printf("\n");ECHO;}
{identifier}(\[[0-9]*\])* {printf("\n%s is an identifier",yytext);}
\".*\" {printf("\n %s is a string ",yytext);}
[0-9]+ {printf("\n%s is a number",yytext);
}
\<= |
\>= |
\< |
\> |
\== {printf("\n %s is a relational operator",yytext);}
\= |
\+ |
\- |
\/ |
\& |
\% {printf("\n %s is an operator",yytext);}
. |
\n ;
%%
int main(int argc,char **argv)
{
FILE *file;
file=fopen("inp.c","r");
if(!file)
{
printf("could not open the file!!!");
exit(0);
}
yyin=file;
yylex();
printf("\n\n");
return(0);
}
int yywrap()
{
return 1;
}
INPUT FILE:
#include<stdio.h>
void main()
{
int a,b,c;
printf("enter the value for a,b");
scanf("%d%d",&a,&b);
c=a+b;
printf("the value of c:%d",&c);
}
OUTPUT:
[3cse01@localhost ~]$ lex ex3.l
[3cse01@localhost ~]$ cc lex.yy.c
[3cse01@localhost ~]$ ./a.out
#include<stdio.h> is a preprocessor directive
void is a keyword
function main(
block begins
int is a keyword
a is an identifier
b is an identifier
c is an identifier
function printf(
"enter the value for a,b" is a string
function scanf(
"%d%d" is a string
& is an operator
a is an identifier
& is an operator
b is an identifier
c is an identifier
= is an operator
a is an identifier
+ is an operator
b is an identifier
function printf(
"the value of c:%d" is a string
& is an operator
c is an identifier
block ends
Result:
Thus the program to implement the lexical analyzer using lex tool for a subset of C
language was implemented and verified.
Exercise – 3(a):
Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses operator +, -, * and
/.
AIM:
To write a Yacc program to recognize a valid arithmetic expression.
ALGORITHM:
Step 1: Start the program to recognize valid arithmetic expressions.
Step 2: Define the patterns for variables and digits in the lex file.
Step 3: Assign yylval where a token carries a value, and return the token number as an integer from the lex file.
Step 4: Define the patterns for operators and expressions in the yacc file.
Step 5: Define yywrap() to signal the end of input, and yyerror() in the yacc file to print an error message.
Step 6: Use yyparse() to read the stream of token/value pairs in the yacc file.
Step 7: Display whether the arithmetic expression is valid.
Step 8: Stop the program.
PROGRAM:
LEX PART : arithmetic.l
%{
#include<stdio.h>
#include "y.tab.h"
%}
%%
[a-zA-Z] {return VARIABLE;}
[0-9]+ {return NUMBER;}
[ \t] ;
[\n] {return 0;}
. {return yytext[0];}
%%
int yywrap(void)
{ return 1; }
OUTPUT:
$:lex arithmetic.l
$:yacc -d arithmetic.y
$:cc lex.yy.c y.tab.c
$:./a.out
RESULT:
Thus, the program to recognize valid arithmetic expressions was executed and verified
successfully.
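For comparison with the generated parser, the same kind of check can be written by hand as a recursive-descent recognizer. The sketch below is an illustration only and is not the Yacc specification for this exercise. The grammar in the comments, the support for parentheses and multi-character names, and the function name `is_valid_expression` are assumptions made for the example.

```c
#include <ctype.h>

static const char *p;   /* cursor into the expression being checked */

static int expr(void);

static void skip_spaces(void)
{
    while (*p == ' ' || *p == '\t')
        p++;
}

/* factor : NUMBER | VARIABLE | '(' expr ')' */
static int factor(void)
{
    skip_spaces();
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        return 1;
    }
    if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p))
            p++;
        return 1;
    }
    if (*p == '(') {
        p++;
        if (!expr())
            return 0;
        skip_spaces();
        if (*p != ')')
            return 0;
        p++;
        return 1;
    }
    return 0;
}

/* term : factor (('*' | '/') factor)* */
static int term(void)
{
    if (!factor())
        return 0;
    skip_spaces();
    while (*p == '*' || *p == '/') {
        p++;
        if (!factor())
            return 0;
        skip_spaces();
    }
    return 1;
}

/* expr : term (('+' | '-') term)* */
static int expr(void)
{
    if (!term())
        return 0;
    skip_spaces();
    while (*p == '+' || *p == '-') {
        p++;
        if (!term())
            return 0;
        skip_spaces();
    }
    return 1;
}

/* Return 1 if the whole string is a valid arithmetic expression, else 0. */
int is_valid_expression(const char *s)
{
    p = s;
    if (!expr())
        return 0;
    skip_spaces();
    return *p == '\0';
}
```

Splitting the grammar into `expr`, `term` and `factor` gives `*` and `/` higher precedence than `+` and `-`, which a Yacc specification would usually express with `%left` declarations instead.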
Exercise-3(b):
Program to recognize a valid variable which starts with a letter followed by
any number of letters or digits.
AIM:
To write a yacc program to recognize a valid variable which starts with a letter followed
by any number of letters or digits.
ALGORITHM:
Step 1: Start the program to recognize a valid variable.
Step 2: Define the pattern for letters and digits in lex file.
Step 3: Define the pattern for identifiers in yacc file.
Step 4: Define yywrap() to signal the end of input, and yyerror() in the yacc file to print an error message.
Step 5: Use yyparse() to read the stream of token/value pairs in the yacc file.
Step 6: Print the valid identifiers using yytext.
Step 7: Display the valid identifiers.
Step 8: Stop the program.
PROGRAM:
LEX PART : variable.l
%{
#include "y.tab.h"
%}
%%
[a-zA-Z] { return ALPHA ;}
[0-9]+ { return NUMBER ; }
"\n" { return ENTER ;}
. { return ER; }
%%
int yywrap(void)
{ return 1; }
YACC PART : variable.y
%{
#include<stdio.h>
#include<stdlib.h>
%}
%token ALPHA NUMBER ENTER ER
%%
var : v ENTER { printf(" Valid Variable\n"); exit(0);}
;
v : ALPHA exp1
;
exp1 : ALPHA exp1
| NUMBER exp1
|
;
%%
int yyerror(char *s)
{printf("Invalid Variable\n"); return 0;}
int main(void) { printf("Enter the Expression : "); yyparse(); return 0; }
OUTPUT:
$:lex variable.l
$:yacc -d variable.y
$:cc lex.yy.c y.tab.c
$:./a.out
RESULT:
Thus, the given program to validate variable is implemented and executed successfully
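The rule checked by this exercise (a letter followed by any number of letters or digits) is small enough to state directly in C. The function name `is_valid_variable` is an assumption for illustration; it mirrors the rule from the aim and is not part of the Lex/Yacc program above.

```c
#include <ctype.h>

/* Return 1 if s starts with a letter and continues with only letters
   or digits, else 0. An empty string is rejected. */
int is_valid_variable(const char *s)
{
    if (!isalpha((unsigned char)s[0]))
        return 0;
    for (s++; *s; s++)
        if (!isalnum((unsigned char)*s))
            return 0;
    return 1;
}
```

Under this rule `abc123` is accepted, while `1abc` and `a_b` are rejected; the underscore is not a letter or digit in the exercise statement, even though C itself allows it in identifiers.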
Exercise-3(c):
Program to recognize a valid control structures syntax of C language (For loop, while loop, if-
else, if-else-if, switch-case, etc.).
AIM:
ALGORITHM:
#include<stdio.h>
#include "y.tab.h"
%}
alpha [A-Za-z]
digit [0-9]
%%
. return yytext[0];
%%
int yywrap(void)
{ return 1; }
#include <stdio.h>
#include <stdlib.h>
%}
%right '='
%left OR AND
%right UMINUS
%left '!'
%%
| E';'
| ST
| E ';'
| ST
E : ID '=' E
| E '+' E
| E '-' E
| E '*' E
| E '/' E
| E '<' E
| E '>' E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| E '+' '+'
| E '-' '-'
| ID
| NUM
E2 : E'<'E | E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
;
%%
main() {
printf("Enter the expression:\n");
yyparse(); }
yyerror() {
OUTPUT:
$:lex for.l
$:yacc -d for.y
$:cc lex.yy.c y.tab.c
$:./a.out
#include<stdio.h>
#include "y.tab.h"
%}
alpha [A-Za-z]
digit [0-9]
%%
[\t \n]
. return yytext[0];
%%
int yywrap(void)
{ return 1; }
#include <stdio.h>
#include <stdlib.h>
%}
%right '='
%left AND OR
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
ST : ST ST
| E';'
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
%%
main()
}
yyerror()
{
OUTPUT:
$:lex while.l
$:yacc -d while.y
$:cc lex.yy.c y.tab.c
$:./a.out
#include<stdio.h>
#include "y.tab.h"
%}
alpha [A-Za-z]
digit [0-9]
%%
[ \t\n]
. return yytext[0];
%%
int yywrap(void)
{ return 1; }
#include <stdio.h>
#include <stdlib.h>
%}
%right '='
%left AND OR
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted.\n");exit(0);};
ST1 : ST
|E
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
%%
main()
}
yyerror()
{
OUTPUT:
$:lex if.l
$:yacc -d if.y
$:cc lex.yy.c y.tab.c
$:./a.out
%{
#include<stdio.h>
#include "y.tab.h"
%}
alpha [A-Za-z]
digit [0-9]
%%
[ \t\n]
. return yytext[0];
%%
int yywrap(void)
{ return 1; }
YACC PART : elseif.y
%{
#include <stdio.h>
#include <stdlib.h>
%}
%right '='
%left AND OR
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted.\n");exit(0);};
ST : IF '(' E2 ')' THEN ST1';' ELSEIF '(' E2 ')' THEN ST1';' ELSE ST1';'
ST1 : ST
|E
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
%%
main()
}
yyerror()
{
OUTPUT:
$:lex elseif.l
$:yacc -d elseif.y
$:cc lex.yy.c y.tab.c
$:./a.out
PROGRAM: SWITCH
#include<stdio.h>
#include "y.tab.h"
%}
alpha [A-Za-z]
digit [0-9]
%%
[ \n\t] ;
if return IF;
then return THEN;
while return WHILE;
switch return SWITCH;
case return CASE;
default return DEFAULT;
break return BREAK;
. return yytext[0];
%%
int yywrap(void)
{ return 1; }
#include<stdio.h>
#include<stdlib.h>
%}
%right '='
%left AND OR
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
S : ST{printf("\nInput accepted.\n");exit(0);};
ST : SWITCH'('ID')''{'B'}'
;
B : C
| C D
;
C : C C
| CASE NUM':'ST1 BREAK';'
;
D : DEFAULT':'ST1 BREAK';'
| DEFAULT':'ST1
| IF'('E2')'THEN E';'
| ST1 ST1
| E';'
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E AND E
| E OR E
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E AND E
| E OR E
| ID
| NUM
%%
main()
}
yyerror()
{
OUTPUT:
$:lex switch.l
$:yacc -d switch.y
$:cc lex.yy.c y.tab.c
$:./a.out
RESULT:
Thus, the given program to recognize valid control structures is executed successfully.
Exercise-3(d):
AIM:
ALGORITHM:
PROGRAM:
#include<stdio.h>
int yylex(void);
%}
%nonassoc UMINUS
%%
|exp {printf("=%d\n",$1);}
%%
int main(void)
{
printf("Enter the expression: ");
yyparse();
return 0;
}
int yyerror(char *s)
{
printf("\nError Occurred\n");
return 0;
}
OUTPUT:
$:lex calc.l
$:yacc -d calc.y
$:cc lex.yy.c y.tab.c
$:./a.out
RESULT:
Thus, the program to implement a calculator using lex tool is implemented and executed
successfully.
Exercise-4:
Generate three address code for a simple program using LEX and YACC
(Implementing Three Address Code)
Aim:
To generate three address code for a simple program using LEX and YACC.
Algorithm:
LEX:
1. Declare the required header files and variables within '%{' and '%}'.
lexemes.
3. LEX calls the yywrap() function when the input is exhausted. It should return 1 when the
work is done, or 0 when more processing is required.
YACC:
1. Declare the required header files and variables within '%{' and '%}'.
2. Define tokens in the first section and also define the associativity of the
operations
3. Mention the grammar productions and the action for each production.
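The core idea behind the grammar actions, popping two operands, creating a temporary and pushing it back, can be sketched independently of LEX and YACC. The example below is a simplified illustration that works on a postfix string instead of parser actions. The names `gen_quads` and `struct quad`, the single-letter operands and the temporary names `t0`, `t1`, ... are assumptions for the example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_QUADS 25
#define MAX_STACK 32
#define NAME_LEN  10

struct quad {
    char op;
    char arg1[NAME_LEN];
    char arg2[NAME_LEN];
    char result[NAME_LEN];
};

/* Translate a postfix string of single lowercase-letter operands and
   + - * / operators into quadruples. Returns the number of quadruples
   written to out, or -1 if the input is malformed or too long. */
int gen_quads(const char *postfix, struct quad out[])
{
    char stack[MAX_STACK][NAME_LEN];
    int top = 0, count = 0;

    for (; *postfix; postfix++) {
        char c = *postfix;
        if (c >= 'a' && c <= 'z') {
            if (top == MAX_STACK)
                return -1;
            stack[top][0] = c;
            stack[top][1] = '\0';
            top++;
        } else if (strchr("+-*/", c)) {
            if (top < 2 || count == MAX_QUADS)
                return -1;
            out[count].op = c;
            /* the right operand was pushed last, so it is popped first */
            strcpy(out[count].arg2, stack[--top]);
            strcpy(out[count].arg1, stack[--top]);
            sprintf(out[count].result, "t%d", count);
            strcpy(stack[top++], out[count].result);
            count++;
        } else {
            return -1;
        }
    }
    return top == 1 ? count : -1;
}
```

For the expression `a + b * c`, written in postfix as `abc*+`, this produces `t0 = b * c` followed by `t1 = a + t0`.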
LEX program:
%{
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include "ex4.tab.h"
%}
%%
[ \n\t]+ ;
[0-9]+ |
[0-9]+\.[0-9]+ {
yylval.dval=atof(yytext);
return NUMBER;
}
. return yytext[0];
%%
int yywrap(){
return 1;
}
YACC program:
%{
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
struct Symbol_Table
{
char sym_name[10];
char sym_type[10];
double value;
}Sym[10];
int sym_cnt=0;
int Index=0;
int temp_var=0;
void display_sym_tab();
void display_Quadruple();
void push(char*);
char* pop();
struct Quadruple
{
char operator[5];
char operand1[10];
char operand2[10];
char result[10];
}QUAD[25];
struct Stack
{
char *items[10];
int top;
}Stk;
%}
%union
{
int ival;
double dval;
char string[10];
}
%token <string> ID
%token MAIN
%%
i=search_symbol($3);
if(i!=-1)
printf("\n Multiple Declaration of Variable");
else
make_symtab_entry($3,$<string>0,0);
| ID'='NUMBER {
int i;
i=search_symbol($1);
if(i!=-1)
else
make_symtab_entry($1,$<string>0,$3);
int i;
i=search_symbol($3);
if(i!=-1)
else
make_symtab_entry($3,$<string>0,$5);
|ID { int i;
i=search_symbol($1);
if(i!=-1)
else
make_symtab_entry($1,$<string>0,0);
}
int i;
i=search_symbol($1);
if(i==-1)
else
char temp[10];
if(strcmp(Sym[i].sym_type,"int")==0)
sprintf(temp,"%d",(int)$3);
else
snprintf(temp,10,"%f",$3);
addQuadruple("=","",temp,$1);
| ID '=' ID ';'{
int i,j;
i=search_symbol($1);
j=search_symbol($3);
if(i==-1 || j==-1)
else
addQuadruple("=","",$3,$1);
char str[5],str1[5]="t";
strcat(str1,str);
temp_var++;
addQuadruple("+",pop(),pop(),str1);
push(str1);
char str[5],str1[5]="t";
strcat(str1,str);
temp_var++;
addQuadruple("-",pop(),pop(),str1);
push(str1);
char str[5],str1[5]="t";
strcat(str1,str);
temp_var++;
addQuadruple("*",pop(),pop(),str1);
push(str1);
char str[5],str1[5]="t";
strcat(str1,str);
temp_var++;
addQuadruple("/",pop(),pop(),str1);
push(str1);
|ID { int i;
i=search_symbol($1);
if(i==-1)
else
push($1);
snprintf(temp,10,"%f",$1);
push(temp);
%%
Stk.top = -1;
yyin = fopen("input.txt","r");
yyparse();
display_sym_tab();
printf("\n\n");
display_Quadruple();
printf("\n\n");
return(0);
int i,flag=0;
for(i=0;i<sym_cnt;i++)
if(strcmp(Sym[i].sym_name,sym)==0)
flag=1;
break;
if(flag==0)
return(-1);
else
return(i);
strcpy(Sym[sym_cnt].sym_name,sym);
strcpy(Sym[sym_cnt].sym_type,dtype);
Sym[sym_cnt].value=val;
sym_cnt++;
void display_sym_tab()
{
int i;
for(i=0;i<sym_cnt;i++)
printf("\n %s %s %f",Sym[i].sym_name,Sym[i].sym_type,Sym[i].value);
}
void display_Quadruple()
{
int i;
for(i=0;i<Index;i++)
printf("\n %d %s %s %s %s",i,QUAD[i].result,QUAD[i].operator,QUAD[i].operand1,QUAD[i].operand2);
}
int yyerror()
{
printf("\nERROR!!\n");
return(1);
}
Stk.top++;
Stk.items[Stk.top]=(char *)malloc(strlen(str)+1);
strcpy(Stk.items[Stk.top],str);
char * pop()
int i;
if(Stk.top==-1)
exit(0);
strcpy(str,Stk.items[Stk.top]);
Stk.top--;
return(str);
strcpy(QUAD[Index].operand2,op2);
strcpy(QUAD[Index].operand1,op1);
strcpy(QUAD[Index].result,res);
Index++;
Output:
Result:
Thus the three address code was generated successfully for a simple program
using LEX and YACC.
Exercise-5:
Implement type checking using Lex and Yacc.
AIM:
ALGORITHM:
Step1: Track the global scope type information (e.g. classes and their members)
Step2: Determine the type of expressions recursively, i.e. bottom-up, passing the resulting types
upwards.
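Step 2 can be sketched as a recursive function over an expression tree. This is an illustration only: the node layout, the two types `T_INT` and `T_FLOAT`, and the typing rules (the `%` operator needs two integers, and any other operator yields a float if either operand is a float) are assumptions chosen for the example, not rules taken from this manual.

```c
enum type { T_INT, T_FLOAT, T_ERROR };

struct node {
    char op;              /* '+', '-', '*', '/', '%', or 0 for a leaf */
    enum type leaf_type;  /* meaningful only when op == 0 */
    struct node *left, *right;
};

/* Determine the type of an expression bottom-up: the children are typed
   first, then the operator's rule combines their types. */
enum type check(const struct node *n)
{
    enum type l, r;

    if (n->op == 0)
        return n->leaf_type;

    l = check(n->left);
    r = check(n->right);
    if (l == T_ERROR || r == T_ERROR)
        return T_ERROR;               /* errors propagate upwards */

    if (n->op == '%')                 /* remainder needs two integers */
        return (l == T_INT && r == T_INT) ? T_INT : T_ERROR;

    return (l == T_FLOAT || r == T_FLOAT) ? T_FLOAT : T_INT;
}
```

With an integer `a` and a float `b`, the expression `a + b` is typed as float, and `(a + b) % a` is reported as a type error because the left operand of `%` is not an integer.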
PROGRAM CODE:
#include<stdio.h>
#include<stdlib.h>
int main()
int n,i,k,flag=0;
char vari[15],typ[15],b[15],c;
scanf(" %d",&n);
for(i=0;i<n;i++)
scanf(" %c",&vari[i]);
scanf(" %c",&typ[i]);
if(typ[i]=='f')
flag=1;
i=0;
getchar();
while((c=getchar())!='$')
b[i]=c;
i++; }
k=i;
for(i=0;i<k;i++)
if(b[i]=='/')
flag=1;
break; } }
for(i=0;i<n;i++)
if(b[0]==vari[i])
if(flag==1)
if(typ[i]=='f')
break; }
else
break; } }
else
break; } }
return 0;
}
OUTPUT:
Result:
Thus the LEX and YACC program for implementing type checking was written
and executed successfully.
Exercise-6:
DESCRIPTION:
Constant folding and constant propagation are related compiler optimizations
used by many modern compilers. An advanced form of constant propagation known as
sparse conditional constant propagation can more accurately propagate constants and
simultaneously remove dead code.
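The three techniques named in this exercise can be sketched on a single instruction of the form dst = a op b. The structures and the function name `optimize` below are assumptions for illustration, and the set of rewrites is deliberately small: folding two constants, replacing x * 2 with x + x, and removing x + 0, x - 0, x * 1 and x / 1.

```c
/* An operand is either a constant (is_const = 1, value used) or a
   variable (is_const = 0, name used). */
struct operand { int is_const; int value; char name; };

/* One instruction: dst = a op b. After a rewrite to a copy, op is '='
   and only operand a is meaningful. */
struct instr { char dst; char op; struct operand a, b; };

enum rewrite { RW_NONE, RW_FOLDED, RW_REDUCED, RW_SIMPLIFIED };

/* Apply at most one rewrite in place and report which one was used. */
enum rewrite optimize(struct instr *ins)
{
    struct operand *a = &ins->a, *b = &ins->b;

    /* Constant folding: both operands are known at compile time. */
    if (a->is_const && b->is_const) {
        int v;
        switch (ins->op) {
        case '+': v = a->value + b->value; break;
        case '-': v = a->value - b->value; break;
        case '*': v = a->value * b->value; break;
        case '/':
            if (b->value == 0)
                return RW_NONE;       /* leave division by zero alone */
            v = a->value / b->value;
            break;
        default:
            return RW_NONE;
        }
        a->value = v;
        ins->op = '=';
        return RW_FOLDED;
    }

    if (!a->is_const && b->is_const) {
        /* Algebraic transformation: x+0, x-0, x*1, x/1 become x. */
        if ((b->value == 0 && (ins->op == '+' || ins->op == '-')) ||
            (b->value == 1 && (ins->op == '*' || ins->op == '/'))) {
            ins->op = '=';
            return RW_SIMPLIFIED;
        }
        /* Strength reduction: x*2 becomes x+x. */
        if (ins->op == '*' && b->value == 2) {
            *b = *a;
            ins->op = '+';
            return RW_REDUCED;
        }
    }
    return RW_NONE;
}
```

For example, `x = 2 + 3` becomes `x = 5`, `x = y * 2` becomes `x = y + y`, and `x = y + 0` becomes `x = y`.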
PROCEDURE:
1) Start the Program
2) Get the input as a source program
3) Process the instructions inside the input file
4) Apply transformations on loops, procedure calls, and address calculations
5) Code generation will occur
6) Use registers and select appropriate instructions
7) Perform Peephole optimization
Program:
#include<stdio.h>
#include<string.h>
#include<conio.h>
#include<stdlib.h>
#include<ctype.h>
struct ConstFold
{
char new_Str[10];
char str[10];
}
Opt_Data[20];
void ReadInput(char Buffer[],FILE *Out_file);
CONCLUSION:
Thus the C program for implementing simple code optimization technique was written and
executed successfully.
Exercise-7:
Implement back-end of the compiler for which the three address code is given as input and the 8086-assembly
language code is produced as output. The target assembly instructions can be simple move , add, sub, jump.
Also simple addressing modes are used.
DESCRIPTION:
Modern processors have only a limited number of registers. Although some
processors, such as the x86, can perform operations directly on memory locations, we
will for now assume only register operations. Some processors (e.g., the MIPS
architecture) use three-address instructions, while others permit only two
addresses, so that the result overwrites one of the sources. With these assumptions, code
like the following would be produced for our example, after first assigning
memory locations to id1 and id2.
LD R1, id2
ADDF R1, R1, #3.0 // add float
RTOI R2, R1 // real to int
ST id1, R2
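For the 8086 target named in this exercise, one three-address statement result = arg1 op arg2 can be mapped to a load, an operation and a store through the AX register. The sketch below is an illustration under assumptions: the function name `emit_8086`, the use of AX as the only working register, and the treatment of operand names as direct memory operands are choices made for the example, and only addition and subtraction are handled.

```c
#include <stdio.h>

/* Write 8086-style assembly for  result = arg1 op arg2  into buf.
   Returns 0 on success, or -1 if the operator is not supported. */
int emit_8086(char *buf, size_t size, char op,
              const char *arg1, const char *arg2, const char *result)
{
    const char *mnemonic;

    switch (op) {
    case '+': mnemonic = "ADD"; break;
    case '-': mnemonic = "SUB"; break;
    default:  return -1;
    }
    /* load the first operand, combine with the second, store the result */
    snprintf(buf, size, "MOV AX, %s\n%s AX, %s\nMOV %s, AX\n",
             arg1, mnemonic, arg2, result);
    return 0;
}
```

The statement `a = b + c` is translated to `MOV AX, b`, `ADD AX, c`, `MOV a, AX`. This follows the two-address style described above, where the result of ADD overwrites its first operand.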
PROCEDURE:
1) Start the program
2) Open the input file
3) Enter the intermediate code as an input to the program
4) Apply conditions for checking the keywords in the intermediate code
5) Analyze each instruction in switch case
6) After generating machine code, copy it to the output file
7) Stop the program
PROGRAM:
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
int label[20];
int no =0;
int main()
{
FILE *fp1,*fp2;
int check_label(int n);
char fname[10],op[10],ch;
char operand1[8],operand2[8],result[8];
int i=0,j=0;
clrscr();
printf("\n Enter filename of the intermediate code");
scanf("%9s",fname); /* fname holds at most 9 characters plus the terminator */
fp1=fopen(fname,"r");
fp2=fopen("target.txt","w");
if(fp1==NULL || fp2==NULL)
{
printf("\n Error Opening the file");
getch();
exit(0);
}
while(fscanf(fp1,"%s",op)==1) /* stop at end of file instead of testing feof() before reading */
{
fprintf(fp2,"\n");
i++;
if(check_label(i))
{
fprintf(fp2,"\nlabel#%d:",i);
}
if(strcmp(op,"print")==0)
{
fscanf(fp1,"%s",result);
fprintf(fp2,"\n\t OUT %s",result);
}
if(strcmp(op,"goto")==0)
{
fscanf(fp1,"%s",operand2);
fprintf(fp2,"\n\t JMP label#%s",operand2);
label[no++] = atoi(operand2);
}
if(strcmp(op,"[]=")==0)
{
fscanf(fp1,"%s%s%s",operand1,operand2,result);
fprintf(fp2,"\n\tSTORE %s[%s],%s",operand1,operand2,result);
}
if(strcmp(op,"uminus")==0)