
Compiler Lab Manual


CS3501 – COMPILER DESIGN

LABORATORY
(LAB MANUAL)
(R-2021)
V SEMESTER
LIST OF EXPERIMENTS:

1. Using the LEX tool, develop a lexical analyzer to recognize a few patterns in C
(e.g. identifiers, constants, comments, operators). Create a symbol table while
recognizing identifiers.

2. Implement a lexical analyzer using the LEX tool.

3. Generate YACC specification for a few syntactic categories.
Implementation of calculator using LEX and YACC.

4. Generate three address code for a simple program using LEX and YACC.

5. Implement type checking using LEX and YACC.

6. Implement simple code optimization techniques (constant folding, strength
reduction and algebraic transformation).

7. Implement the back end of the compiler, for which the three address code is
given as input and the 8086 assembly language code is produced as output.

Ex.No.1 DEVELOP A LEXICAL ANALYZER TO RECOGNIZE A FEW PATTERNS IN C

AIM:

To develop a lexical analyzer to identify identifiers, constants, comments and operators.

PROGRAM :

//Develop a lexical analyzer to recognize a few patterns in C.

#include<string.h>
#include<ctype.h>
#include<stdio.h>
#include<stdlib.h>
void keyword(char str[10])
{
if(strcmp("for",str)==0||strcmp("while",str)==0||strcmp("do",str)==0||strcmp("int",str
)==0||strcmp("float",str)==0||strcmp("char",str)==0||strcmp("double",str)==0||strcmp
("printf",str)==0||strcmp("switch",str)==0||strcmp("case",str)==0)
printf("\n%s is a keyword",str);
else
printf("\n%s is an identifier",str);
}
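int main(void)
{
FILE *f1,*f2,*f3;
int c; /* int, not char, so that EOF can be told apart from valid characters */
char str[10];
int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0;
f1=fopen("input","r");
f2=fopen("identifier","w");
f3=fopen("specialchar","w");
if(f1==NULL||f2==NULL||f3==NULL)
{
printf("could not open the files");
exit(1);
}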
while((c=getc(f1))!=EOF)
{
if(isdigit(c))
{
tokenvalue=c-'0';
c=getc(f1);
while(isdigit(c))
{
tokenvalue=tokenvalue*10+(c-'0'); /* shift the previous digits left, then add the new digit */
c=getc(f1);
}
num[i++]=tokenvalue;
ungetc(c,f1);
}
else
if(isalpha(c))
{
putc(c,f2);
c=getc(f1);
while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
{
putc(c,f2);
c=getc(f1);
}
putc(' ',f2);
ungetc(c,f1);
}
else
if(c==' '||c=='\t')
printf(" ");
else
if(c=='\n')
lineno++;
else
putc(c,f3);
}
fclose(f2);
fclose(f3);
fclose(f1);
printf("\n the no's in the program are:");
for(j=0;j<i;j++)
printf("\t%d",num[j]);
printf("\n");
f2=fopen("identifier","r");
k=0;
printf("the keywords and identifier are:");
while((c=getc(f2))!=EOF)
if(c!=' ')
{
if(k<(int)sizeof(str)-1) /* drop characters that would overflow str */
str[k++]=c;
}
else
{
str[k]='\0';
keyword(str);
k=0;
}
fclose(f2);
f3=fopen("specialchar","r");
printf("\n Special Characters are");
while((c=getc(f3))!=EOF)
printf("\t%c",c);
printf("\n");
fclose(f3);
printf("Total no of lines are:%d",lineno);
}
OUTPUT:

RESULT:

Thus the program for developing a lexical analyzer to recognize a few patterns in C
has been executed successfully.
Ex.No:2 IMPLEMENTATION OF A LEXICAL ANALYZER USING LEX

AIM:

To write a program for implementing a Lexical analyser using LEX tool in Linux
platform.

ALGORITHM:

Step1: A Lex program contains three sections: definitions, rules, and user subroutines.
Each section must be separated from the others by a line containing only
the delimiter %%. The format is as follows:

definitions
%%
rules
%%
user_subroutines

Step2: In the definitions section, the names make up the left column and their
definitions make up the right column. Any C statements must be enclosed in %{ ... %}.
An identifier is defined so that its first character is a letter and the remaining
characters are alphanumeric.

Step3: In the rules section, the left column contains the pattern that yylex() is to
recognize in the input file. The right column contains the C program fragment executed when
that pattern is recognized. The various patterns are keywords, operators, new line
character, number, string, identifier, beginning and end of block, comment statements,
preprocessor directive statements etc.

Step4: Each pattern may have a corresponding action, that is, a fragment of C source
code to execute when the pattern is matched.

Step5: When yylex() matches a string in the input stream, it copies the matched text
to an external character array, yytext, before it executes any actions in the rules
section.

Step6: In the user subroutines section, the main routine calls yylex(). yywrap() is called
when the end of the input is reached; it returns 1 when there is no more input to process.

Step7: The lex command uses the rules and actions contained in the file to generate a
program, lex.yy.c, which can be compiled with the cc command. That program can then
receive input, break the input into the logical pieces defined by the rules in the file, and
run the program fragments contained in the actions in the file.

//Implementation of Lexical Analyzer using Lex tool


%{
#include<stdio.h>
#include<stdlib.h>
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor directive",yytext);}
int |
float |
char |
double |
while |
for |
struct |
typedef |
do |
if |
break |
continue |
void |
switch |
return |
else |
goto {printf("\n\t%s is a keyword",yytext);}
"/*" {COMMENT=1; printf("\n\t %s starts a COMMENT",yytext);}
"*/" {COMMENT=0;}
{identifier}\( {if(!COMMENT)printf("\nFUNCTION \n\t%s",yytext);}
\{ {if(!COMMENT)printf("\n BLOCK BEGINS");}
\} {if(!COMMENT)printf("BLOCK ENDS ");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT)printf("\n\t %s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n %s is a NUMBER ",yytext);}
\)(\:)? {if(!COMMENT)printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t %s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc, char **argv)
{
FILE *file;
file=fopen("var.c","r");
if(!file)
{
printf("could not open the file");
exit(0);
}
yyin=file;
yylex();
printf("\n");
return(0);
}
int yywrap()
{
return(1);
}
INPUT:
//var.c
#include<stdio.h>
#include<conio.h>
void main()
{
int a,b,c;
a=1;
b=2;
c=a+b;
printf("Sum:%d",c);
}

OUTPUT:
RESULT:

Thus the program for implementation of Lexical Analyzer using Lex tool has been
executed successfully.
EX NO 3 Generate YACC specification for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses
operator +, -, * and /.
b. Program to recognize a valid variable which starts with a
letter followed by any number of letters or digits.
c. IMPLEMENTATION OF CALCULATOR USING LEX & YACC

AIM:

To write a program for implementing a calculator for computing the given expression using
semantic rules of the YACC tool and LEX.

ALGORITHM:

Step1: A Yacc source program has three parts as follows:

declarations
%%
translation rules
%%
supporting C routines

Step2: Declarations Section: This section contains entries that:
i. Include standard I/O header file.
ii. Define global variables.
iii. Define the list rule as the place to start processing.
iv. Define the tokens used by the parser.
v. Define the operators and their precedence.

Step3: Rules Section: The rules section defines the rules that parse the input stream. Each rule consists of
a grammar production and the associated semantic action.

Step4: Programs Section: The programs section contains the following subroutines. Because these
subroutines are included in this file, it is not necessary to use the yacc library when processing this
file.

Step5: main - The required main program that calls the yyparse subroutine to start the program.

Step6: yyerror(s) - This error-handling subroutine only prints a syntax error message.

Step7: yywrap - The wrap-up subroutine that returns a value of 1 when the end of input occurs.

Step8: The calc.lex file contains include statements for standard input and output, as well as for the
y.tab.h file. The yacc program generates that file from the yacc grammar file if the -d flag is used
with the yacc command. The y.tab.h file contains definitions for the tokens that the parser program
uses. calc.lex also contains the rules to generate these tokens from the input stream.
PROGRAM CODE:
//Implementation of calculator using LEX and YACC
LEX PART:
%{
#include<stdio.h>
#include<stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ {
yylval=atoi(yytext);
return NUMBER;
}
[ \t] ;
[\n] return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
YACC PART:
%{
#include<stdio.h>
int yylex(void);
void yyerror(const char *s);
int flag=0;
%}
%token NUMBER
%left '+' '-'
%left '*' '/' '%'
%left '(' ')'
%%
ArithmeticExpression: E{
printf("\nResult=%d\n",$1);
return 0;
};
E:E'+'E {$$=$1+$3;}
|E'-'E {$$=$1-$3;}
|E'*'E {$$=$1*$3;}
|E'/'E {$$=$1/$3;}
|E'%'E {$$=$1%$3;}
|'('E')' {$$=$2;}
| NUMBER {$$=$1;}
;
%%
int main(void)
{
printf("\nEnter Any Arithmetic Expression which can have operations Addition, Subtraction, Multiplication, Division, Modulus and Round brackets:\n");
yyparse();
if(flag==0)
printf("\nEntered arithmetic expression is Valid\n\n");
return 0;
}
/* Called by the parser on a syntax error */
void yyerror(const char *s)
{
printf("\nEntered arithmetic expression is Invalid\n\n");
flag=1;
}
OUTPUT:

RESULT

Thus a program implementing a calculator that computes the given expression using the
semantic rules of the YACC tool and LEX has been written and executed.
EX NO 4. Generate three address code for a simple
program using LEX and YACC.

Aim:
To generate three address code for a simple program using LEX and YACC
Algorithm:
Program:
%{
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include"y.tab.h"
int yyparse(void);
int k=1; /* number of the next temporary */
%}

%%
[0-9]+ {
yylval.sval=strdup(yytext); /* keep the whole number as text */
return NUM;
}

\n {return 0;}
. {return yytext[0];}
%%

void yyerror(const char* str)
{
printf("\n%s",str);
}
/* Prints "tk = first op second" and returns the name of the new temporary (t1,t2,t3...) */
char *gencode(char *first,char op,char *second)
{
char *temp=malloc(16);
sprintf(temp,"t%d",k);
k++;
printf("%s = %s %c %s\n",temp,first,op,second);
return temp;
}
int yywrap()
{
return 1;
}

int main()
{
yyparse();
return 0;
}
yacc.y
%{
#include<stdio.h>
int yylex(void);
void yyerror(const char *str);
char *gencode(char *first,char op,char *second);
%}

%union{
char *sval; /* operand name: a number or a temporary such as t1 */
}
%token <sval> NUM
%type <sval> E
%left '+' '-'
%left '*' '/' '%'

%%
statement : E {printf("\nt = %s \n",$1);}
;

E : E '+' E { $$=gencode($1,'+',$3); }
| E '-' E { $$=gencode($1,'-',$3); }
| E '%' E { $$=gencode($1,'%',$3); }
| E '*' E { $$=gencode($1,'*',$3); }
| E '/' E { $$=gencode($1,'/',$3); }
| '(' E ')' { $$=$2; }
| NUM { $$=$1; }
;
%%
OUTPUT
Expected output for expression (2+3)*5 :
t1= 2 + 3
t2= t1 * 5

RESULT

Thus a program to generate three address code for a simple program using LEX and
YACC has been written and executed.
EXNO 5 IMPLEMENTATION OF TYPE CHECKING
AIM:
To write a C program to implement type checking.
ALGORITHM:
1. Start the program for type checking of the given expression.
2. Read the expression and the declarations.
3. Build the symbol table from the declaration part.
4. Check whether each symbol is already present in the symbol table. If it is found,
display “Label already defined”.
5. Read the data types of operand 1, operand 2 and the result from the symbol table.
6. If the types of both operands match, check the result variable.
Otherwise, print “Type mismatch”.
7. If all the data types match, display “No type mismatch”.

Program:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n,i,k,flag=0;
char vari[15],typ[15],b[15],c;
printf("Enter the number of variables:");
scanf(" %d",&n);
for(i=0;i<n;i++)
{
printf("Enter the variable[%d]:",i);
scanf(" %c",&vari[i]);
printf("Enter the variable-type[%d](float-f,int-i):",i);
scanf(" %c",&typ[i]);
if(typ[i]=='f')
flag=1;
}
printf("Enter the Expression(end with $):");
i=0;
getchar();
while(i<15&&(c=getchar())!='$') /* stop before b[] overflows */
{
b[i]=c;
i++; }
k=i;
for(i=0;i<k;i++)
{
if(b[i]=='/')
{
flag=1;
break; } }
for(i=0;i<n;i++)
{
if(b[0]==vari[i])
{
if(flag==1)
{
if(typ[i]=='f')
{ printf("\nthe datatype is correctly defined..!\n");
break; }
else
{ printf("Identifier %c must be a float type..!\n",vari[i]);
break; } }
else
{ printf("\nthe datatype is correctly defined..!\n");
break; } }
}
return 0;
}
OUTPUT:

RESULT
Thus a C program to implement type checking has been written and executed.
EX NO 6. Implement simple code optimization techniques
(Constant folding, Strength reduction and Algebraic
transformation)
RESULT

Thus a C program to implement simple code optimization techniques has been written and
executed.
EX NO 7 IMPLEMENT THE BACK END OF THE
COMPILER
AIM:
To implement the back end of the compiler which takes the three address code and produces
the 8086 assembly language instructions that can be assembled and run using a 8086
assembler. The target assembly instructions can be simple move, add, sub, jump. Also
simple addressing modes are used.
INTRODUCTION:
A compiler is a computer program that implements a programming language specification to
“translate” programs, usually as a set of files which constitute the source code written in
the source language, into their equivalent machine-readable instructions (the target language,
often having a binary form known as object code). This translation process is called compilation.
BACK END:
Some local optimization
Register allocation
Peep-hole optimization
Code generation
Instruction scheduling
The main phases of the back end include the following:
Analysis: This is the gathering of program information from the intermediate
representation derived from the input; data-flow analysis is used to build use-define
chains, together with dependence analysis, alias analysis, pointer analysis, escape
analysis etc.
Optimization: The intermediate language representation is transformed into
functionally equivalent but faster (or smaller) forms. Popular optimizations are
inline expansion, dead code elimination, constant propagation, loop transformation,
register allocation and even automatic parallelization.
Code generation: The transformed language is translated into the output language,
usually the native machine language of the system. This involves resource and storage
decisions, such as deciding which variables to fit into registers and memory and the
selection and scheduling of appropriate machine instructions along with their
associated modes. Debug data may also need to be generated to facilitate debugging.

ALGORITHM:
1. Start the program
2. Open the source file and store the contents as quadruples.
3. Check the operator in each quadruple: if it is an arithmetic operator, generate the
corresponding instruction; if it is an assignment operator, generate a move; otherwise
perform unary minus on register C.
4. Write the generated code to the output file outp.c.
5. Print the output.
6. Stop the program.

PROGRAM: (BACK END OF THE COMPILER)


#include<stdio.h>
//#include<conio.h>
#include<string.h>
int main(void)
{
char icode[10][30],str[30],opr[10]="";
int i=0;
//clrscr();
printf("\n Enter the set of intermediate code (terminated by exit):\n");
do
{
scanf("%29s",icode[i]); /* width limit keeps each line within icode[i] */
} while(strcmp(icode[i++],"exit")!=0);
printf("\n target code generation");
printf("\n************************");
i=0;
do
{
strcpy(str,icode[i]);
switch(str[3])
{
case '+':
strcpy(opr,"ADD");
break;
case '-':
strcpy(opr,"SUB");
break;
case '*':
strcpy(opr,"MUL");
break;
case '/':
strcpy(opr,"DIV");
break;
}
printf("\n\tMov %c,R%d",str[2],i);
printf("\n\t%s %c,R%d",opr,str[4],i);
printf("\n\tMov R%d,%c",i,str[0]);
}while(strcmp(icode[++i],"exit")!=0);
//getch();
}
