Compiler Lab Manual
Submitted By :
Amit Garg
Index
• Introduction
• Phases Of Compiler
Program-1: Design the Lexical Analyzer to split the file into tokens using C
Compiler.
Introduction
What is a Compiler
A compiler is a program that reads a program written in one language – the source language –
and translates it into an equivalent program in another language – the target language. As an
important part of this translation process, the compiler reports to its user the presence of errors
in the source program.
Figure: Source Program -> Compiler -> Target Program, with Error Messages reported by the Compiler.
There are two parts to compilation: Analysis and Synthesis. The analysis part breaks up the
source program into constituent pieces and creates an intermediate representation of source
program. The synthesis part constructs the desired target program from the intermediate
representation. Of the two parts, synthesis requires the most specialized techniques.
The Phases Of A Compiler
A compiler operates in six phases, each of which transforms the source program from one
representation to another. The first three phases form the bulk of the analysis portion of a
compiler. Two other activities, symbol table management and error handling, interact with
all six phases of the compiler. The six phases are lexical analysis, syntax analysis,
semantic analysis, intermediate code generation, code optimization and code generation.
Figure: Phases of Compiler. Source program -> Lexical Analyzer -> Syntax Analyzer ->
Semantic Analyzer -> Code Optimizer -> Code Generator -> Target program
Lexical analysis
In a compiler, lexical analysis is also called linear analysis or scanning. In lexical analysis the
stream of characters making up the source program is read from left to right and grouped into
tokens, which are sequences of characters having a collective meaning.
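The grouping described above can be sketched in a few lines of C. This is only an illustration under simple assumptions: a token is either a run of letters, digits and underscores, or a single other character, and white space only separates tokens. The names split_tokens and MAX_TOKEN_LEN are made up for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKEN_LEN 32

/* Hypothetical helper: splits src into tokens and returns how many were stored. */
int split_tokens(const char *src, char tokens[][MAX_TOKEN_LEN], int max_tokens)
{
    int count = 0;
    size_t i = 0;

    while (src[i] != '\0' && count < max_tokens) {
        unsigned char ch = (unsigned char)src[i];

        if (isspace(ch)) {
            i++;                          /* white space only separates tokens */
        } else if (isalnum(ch) || ch == '_') {
            int len = 0;                  /* group a run of letters, digits, '_' */
            while (isalnum((unsigned char)src[i]) || src[i] == '_') {
                if (len < MAX_TOKEN_LEN - 1)
                    tokens[count][len++] = src[i];
                i++;
            }
            tokens[count][len] = '\0';
            count++;
        } else {
            tokens[count][0] = src[i++];  /* any other character is its own token */
            tokens[count][1] = '\0';
            count++;
        }
    }
    return count;
}
```

For the input "position = initial + rate * 60" this sketch stores seven tokens: position, =, initial, +, rate, * and 60.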
Syntax analysis
Syntax analysis is also called hierarchical analysis or parsing. It involves grouping the tokens
of the source program into grammatical phrases that are used by the compiler to synthesize output.
Usually, a parse tree represents the grammatical phrases of the source program.
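As a rough illustration of how tokens are grouped into a tree, the sketch below parses a tiny expression language by recursive descent. The grammar (single-digit operands, + and *, parentheses, no white space) and all names such as Node, parse_expr and to_prefix are assumptions made for this example, not part of the lab programs.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical tree node: a digit leaf or an operator with two children. */
typedef struct Node {
    char symbol;
    struct Node *left, *right;
} Node;

Node *make_node(char symbol, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->symbol = symbol;
    n->left = left;
    n->right = right;
    return n;
}

void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* Grammar assumed for this sketch:
 *   expr   -> term   { '+' term }
 *   term   -> factor { '*' factor }
 *   factor -> digit | '(' expr ')'
 * Each function returns NULL on a syntax error. */
Node *parse_expr(const char **p);

Node *parse_factor(const char **p)
{
    if (**p == '(') {
        (*p)++;
        Node *inner = parse_expr(p);
        if (inner != NULL && **p != ')') {   /* missing ')' */
            free_tree(inner);
            return NULL;
        }
        if (inner != NULL)
            (*p)++;
        return inner;
    }
    if (isdigit((unsigned char)**p))
        return make_node(*(*p)++, NULL, NULL);
    return NULL;
}

Node *parse_term(const char **p)
{
    Node *left = parse_factor(p);
    while (left != NULL && **p == '*') {
        (*p)++;
        Node *right = parse_factor(p);
        if (right == NULL) {
            free_tree(left);
            return NULL;
        }
        left = make_node('*', left, right);
    }
    return left;
}

Node *parse_expr(const char **p)
{
    Node *left = parse_term(p);
    while (left != NULL && **p == '+') {
        (*p)++;
        Node *right = parse_term(p);
        if (right == NULL) {
            free_tree(left);
            return NULL;
        }
        left = make_node('+', left, right);
    }
    return left;
}

/* Writes the tree in prefix form into out, which the caller must make
 * large enough (one byte per node plus the terminator). */
void to_prefix(const Node *n, char *out, int *len)
{
    if (n == NULL)
        return;
    out[(*len)++] = n->symbol;
    out[*len] = '\0';
    to_prefix(n->left, out, len);
    to_prefix(n->right, out, len);
}
```

Parsing "2+3*4" with parse_expr and printing the result with to_prefix gives "+2*34": the multiplication is grouped below the addition, which is the hierarchical structure the later phases rely on.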
Semantic Analysis
The semantic analysis phase checks the source program for semantic errors and gathers type
information for the subsequent code generation phase. It uses the hierarchical structure
determined by the syntax-analysis phase to identify the operators and operands of expressions
and statements.
An important component of semantic analysis is type checking. Here the compiler checks
that each operator has operands that are permitted by the source language specification.
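A type check for a binary operator might look like the sketch below. The rules are assumptions chosen for illustration and loosely modeled on C: '%' needs two integer operands, while + - * / accept any mix of int and float. The names Type and check_binary are hypothetical.

```c
typedef enum { TYPE_INT, TYPE_FLOAT, TYPE_ERROR } Type;

/* Returns the result type of "left op right", or TYPE_ERROR when the
 * operands are not permitted for that operator (rules assumed above). */
Type check_binary(char op, Type left, Type right)
{
    if (left == TYPE_ERROR || right == TYPE_ERROR)
        return TYPE_ERROR;                 /* propagate earlier errors */

    switch (op) {
    case '%':
        return (left == TYPE_INT && right == TYPE_INT) ? TYPE_INT : TYPE_ERROR;
    case '+': case '-': case '*': case '/':
        /* the result is float as soon as one operand is float */
        return (left == TYPE_FLOAT || right == TYPE_FLOAT) ? TYPE_FLOAT : TYPE_INT;
    default:
        return TYPE_ERROR;                 /* unknown operator */
    }
}
```

Under these assumed rules, check_binary('%', TYPE_INT, TYPE_FLOAT) reports TYPE_ERROR, while check_binary('+', TYPE_INT, TYPE_FLOAT) yields TYPE_FLOAT.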
The symbol table is a data structure containing a record for each identifier, with fields for the
attributes of the identifier. The data structure allows us to find the record for each identifier
quickly and to store or retrieve data from that record quickly. When the lexical analyzer detects
an identifier in the source program, the identifier is entered into the symbol table. The remaining
phases enter information about identifiers into the symbol table.
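One common way to get quick lookups is a hash table. The sketch below is a minimal version with a fixed size and linear probing; the names Symbol, symtab_insert and symtab_lookup, the table size and the single "type" attribute are all assumptions made for this example.

```c
#include <string.h>

#define TABLE_SIZE 64
#define NAME_LEN   32

typedef struct {
    char name[NAME_LEN];   /* the identifier itself */
    char type[NAME_LEN];   /* an attribute filled in by a later phase */
    int  in_use;
} Symbol;

static Symbol symtab[TABLE_SIZE];   /* zero-initialized: every slot is free */

static unsigned symtab_hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Returns the record for name, creating it if it does not exist yet.
 * Returns NULL if the name is too long or the table is full. */
Symbol *symtab_insert(const char *name)
{
    if (strlen(name) >= NAME_LEN)
        return NULL;
    unsigned start = symtab_hash(name);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        Symbol *s = &symtab[(start + i) % TABLE_SIZE];
        if (!s->in_use) {               /* free slot: create the record here */
            strcpy(s->name, name);
            s->type[0] = '\0';
            s->in_use = 1;
            return s;
        }
        if (strcmp(s->name, name) == 0)
            return s;                   /* already present */
    }
    return NULL;
}

/* Returns the record for name, or NULL if it was never inserted. */
Symbol *symtab_lookup(const char *name)
{
    unsigned start = symtab_hash(name);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        Symbol *s = &symtab[(start + i) % TABLE_SIZE];
        if (!s->in_use)
            return NULL;                /* reached a free slot: not present */
        if (strcmp(s->name, name) == 0)
            return s;
    }
    return NULL;
}
```

In this sketch the lexical analyzer would call symtab_insert("rate") when it first sees the identifier, and a later phase could fill in the type field of the returned record. Records are never deleted, which is what makes stopping at the first free slot safe.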
Error detection
Each phase can encounter errors. The syntax and semantic analysis phases usually handle a
large fraction of the errors detectable by the compiler. The lexical phase can detect errors where
the characters remaining in the input do not form any token of the language. Errors where the
token stream violates the structure rules of the language are detected by the syntax analysis
phase.
After syntax and semantic analysis, some compilers generate an explicit intermediate
representation of the source program. This intermediate representation should have two
important properties: it should be easy to produce and easy to translate into the target program.
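One widely used intermediate form is three-address code, in which every statement has at most one operator. The sketch below turns a postfix expression into such statements. It assumes single-character operands and the operators + - * /, and the name postfix_to_tac is hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_STACK 32

/* Translates a postfix expression such as "abc*+" into three-address
 * statements written to out. Returns the number of temporaries created,
 * or -1 if the input is malformed or out is too small. */
int postfix_to_tac(const char *postfix, char *out, size_t out_size)
{
    char stack[MAX_STACK][8];       /* names of pending operands */
    int top = 0, temps = 0;
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (const char *p = postfix; *p != '\0'; p++) {
        if (isalnum((unsigned char)*p)) {
            if (top == MAX_STACK)
                return -1;
            stack[top][0] = *p;
            stack[top][1] = '\0';
            top++;
        } else if (strchr("+-*/", *p) != NULL) {
            char result[8];
            if (top < 2)
                return -1;          /* operator without two operands */
            snprintf(result, sizeof result, "t%d", ++temps);
            int n = snprintf(out + used, out_size - used, "%s = %s %c %s\n",
                             result, stack[top - 2], *p, stack[top - 1]);
            if (n < 0 || (size_t)n >= out_size - used)
                return -1;
            used += (size_t)n;
            top -= 2;
            strcpy(stack[top++], result);   /* the temporary replaces both operands */
        } else {
            return -1;              /* unexpected character */
        }
    }
    return (top == 1) ? temps : -1;
}
```

For the postfix input "abc*+" (that is, a + b * c) the sketch produces two statements, "t1 = b * c" followed by "t2 = a + t1". Such statements are easy to produce from a tree walk and easy to map onto machine instructions.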
During semantic analysis the compiler tries to detect constructs that have the right syntactic
structure but no meaning to the operation involved.
Code optimization
The code optimization phase attempts to improve the intermediate code so that faster-running
machine code will result. There are simple optimizations that significantly improve
the running time of the target program without slowing down compilation too much.
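Constant folding is one such simple optimization: when both operands of an operation are known at compile time, the compiler computes the result itself. The sketch below applies it to a single instruction. The types Operand and Instr and the function fold_constants are assumptions for this example, and integer overflow is ignored.

```c
#include <stdbool.h>

/* Hypothetical three-address instruction: dest = lhs op rhs.
 * Each operand is either a compile-time constant or a named variable. */
typedef struct {
    bool is_const;
    int  value;      /* meaningful when is_const is true  */
    char name;       /* meaningful when is_const is false */
} Operand;

typedef struct {
    char dest;
    char op;         /* '+', '-', '*', or '=' for a plain copy of lhs */
    Operand lhs, rhs;
} Instr;

/* Constant folding: if both operands are constants, compute the result
 * now and rewrite the instruction as a copy of that constant.
 * Returns true if the instruction was changed. */
bool fold_constants(Instr *in)
{
    int result;

    if (!in->lhs.is_const || !in->rhs.is_const)
        return false;
    switch (in->op) {
    case '+': result = in->lhs.value + in->rhs.value; break;
    case '-': result = in->lhs.value - in->rhs.value; break;
    case '*': result = in->lhs.value * in->rhs.value; break;
    default:  return false;     /* other operators are left alone here */
    }
    in->op = '=';               /* dest = result */
    in->lhs.value = result;
    return true;
}
```

With this sketch an instruction such as x = 60 * 2 becomes x = 120, so the multiplication no longer happens at run time, while y = a + 1 is left unchanged because a is not a constant.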
Code generation
The final phase of compilation is the generation of target code, consisting normally of
relocatable machine code or assembly code.
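To make the idea concrete, the sketch below emits assembly-like text for one statement of the form dest = lhs op rhs. The machine is imaginary: the mnemonics LOAD, ADD, SUB, MUL, DIV and STORE and the register R0 are made up for this example and do not belong to any real instruction set. The function name gen_code is also hypothetical.

```c
#include <stdio.h>

/* Emits code for "dest = lhs op rhs" on an imaginary one-register machine.
 * Returns 0 on success, -1 for an unknown operator or a too-small buffer. */
int gen_code(char dest, char lhs, char op, char rhs, char *out, size_t out_size)
{
    const char *mnemonic;

    switch (op) {
    case '+': mnemonic = "ADD"; break;
    case '-': mnemonic = "SUB"; break;
    case '*': mnemonic = "MUL"; break;
    case '/': mnemonic = "DIV"; break;
    default:  return -1;
    }
    /* load the first operand, apply the operator, store the result */
    int n = snprintf(out, out_size,
                     "LOAD R0, %c\n%s R0, %c\nSTORE %c, R0\n",
                     lhs, mnemonic, rhs, dest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For the statement x = a + b the sketch writes three lines: "LOAD R0, a", "ADD R0, b" and "STORE x, R0". A real code generator must also decide which values stay in registers, which this example does not attempt.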
Program-1
Objective: Design the Lexical Analyzer to split the file into tokens using C Compiler.
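#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 200
#define MAX_LEN 33

int main(void)
{
    int c;                              /* int, so that EOF can be detected */
    int i=0,k=0;
    char Token[MAX_LEN];
    static char TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp=fopen("b.b","r");
    if(fp==NULL)
    {
        perror("b.b");
        return 1;
    }
    do
    {
        c=getc(fp);
        if (c!=EOF && (isalnum(c) || c=='_'))
        {
            /* letters, digits and '_' extend the current word */
            if(i<MAX_LEN-1)
            {
                Token[i]=(char)c;
                i=i+1;
            }
        }
        else
        {
            /* any other character, or end of file, ends the current word */
            if(i>0 && k<MAX_TOKENS)
            {
                Token[i]='\0';
                strcpy(TokenList[k],Token);
                k=k+1;
            }
            i=0;
            /* a delimiter other than white space is a token by itself */
            if (c!=EOF && !isspace(c) && k<MAX_TOKENS)
            {
                TokenList[k][0]=(char)c;
                TokenList[k][1]='\0';
                k=k+1;
            }
        }
    } while(c!=EOF);
    fclose(fp);
    printf("TOKENS ARE:\n");
    printf("----------\n");
    for(i=0;i<k;i++)
    {
        printf("%s\n",TokenList[i]);
    }
    printf("\n Total Tokens in the File = %d\n",k);
    return 0;
}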
Program-2
Objective: Design the Lexical Analyzer to identify the keywords in the file using C
Compiler.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 200
#define MAX_LEN 33
#define NUM_KEYWORDS 5

int main(void)
{
    int c;                              /* int, so that EOF can be detected */
    int i=0,j=0,k=0,st,count=0,k1=0;
    char Token[MAX_LEN];
    static char TokenList[MAX_TOKENS][MAX_LEN];
    char KeyWords[NUM_KEYWORDS][10]={"int","float","char","printf","scanf"};
    char temp[NUM_KEYWORDS][10];        /* distinct keywords found so far */
    FILE *fp;

    fp=fopen("c.c","r");
    if(fp==NULL)
    {
        perror("c.c");
        return 1;
    }
    do
    {
        c=getc(fp);
        if (c!=EOF && (isalnum(c) || c=='_'))
        {
            /* letters, digits and '_' extend the current word */
            if(i<MAX_LEN-1)
            {
                Token[i]=(char)c;
                i=i+1;
            }
        }
        else
        {
            /* any other character, or end of file, ends the current word */
            if(i>0 && k<MAX_TOKENS)
            {
                Token[i]='\0';
                strcpy(TokenList[k],Token);
                k=k+1;
            }
            i=0;
            /* a delimiter other than white space is a token by itself */
            if (c!=EOF && !isspace(c) && k<MAX_TOKENS)
            {
                TokenList[k][0]=(char)c;
                TokenList[k][1]='\0';
                k=k+1;
            }
        }
    } while(c!=EOF);
    fclose(fp);
    /* compare every token with every entry of the keyword table */
    for(st=0;st<k;st++)
    {
        for(i=0;i<NUM_KEYWORDS;i++)
        {
            if (strcmp(TokenList[st],KeyWords[i])==0)
            {
                /* remember each keyword only once */
                for(j=0;j<k1;j++)
                {
                    if(strcmp(TokenList[st],temp[j])==0) break;
                }
                if(j==k1)
                {
                    strcpy(temp[k1],TokenList[st]);
                    k1=k1+1;
                }
                count=count+1;
            }
        }
    }
    printf("KEYWORDS FOUND:\n");
    for(i=0;i<k1;i++)
    {
        printf("%s\n",temp[i]);
    }
    printf("\n Total Keywords in the File = %d\n",count);
    return 0;
}
Program-3
Objective: Count the number of While loops and number of For loops in a program using
the Lexical Analyzer.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 200
#define MAX_LEN 33

int main(void)
{
    int c;                              /* int, so that EOF can be detected */
    int i=0,k=0,st,count=0;
    char Token[MAX_LEN];
    static char TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp=fopen("c.c","r");
    if(fp==NULL)
    {
        perror("c.c");
        return 1;
    }
    do
    {
        c=getc(fp);
        if (c!=EOF && (isalnum(c) || c=='_'))
        {
            /* letters, digits and '_' extend the current word */
            if(i<MAX_LEN-1)
            {
                Token[i]=(char)c;
                i=i+1;
            }
        }
        else
        {
            /* any other character, or end of file, ends the current word */
            if(i>0 && k<MAX_TOKENS)
            {
                Token[i]='\0';
                strcpy(TokenList[k],Token);
                k=k+1;
            }
            i=0;
            /* a delimiter other than white space is a token by itself */
            if (c!=EOF && !isspace(c) && k<MAX_TOKENS)
            {
                TokenList[k][0]=(char)c;
                TokenList[k][1]='\0';
                k=k+1;
            }
        }
    } while(c!=EOF);
    fclose(fp);
    /* count holds the number of "for" tokens, i the number of "while" tokens */
    i=0;
    for(st=0;st<k;st++)
    {
        if (strcmp(TokenList[st],"for")==0)
            count=count+1;
        else if(strcmp(TokenList[st],"while")==0)
            i=i+1;
    }
    printf("Total No. of For Loop In The Program = %d\n",count);
    printf("Total No. of While Loop In The Program = %d\n",i);
    return 0;
}
Program-4
Objective: Count the number of IF conditions in a program using the Lexical Analyzer.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 200
#define MAX_LEN 33

int main(void)
{
    int c;                              /* int, so that EOF can be detected */
    int i=0,k=0,st,count=0;
    char Token[MAX_LEN];
    static char TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp=fopen("a.a","r");
    if(fp==NULL)
    {
        perror("a.a");
        return 1;
    }
    do
    {
        c=getc(fp);
        if (c!=EOF && (isalnum(c) || c=='_'))
        {
            /* letters, digits and '_' extend the current word */
            if(i<MAX_LEN-1)
            {
                Token[i]=(char)c;
                i=i+1;
            }
        }
        else
        {
            /* any other character, or end of file, ends the current word */
            if(i>0 && k<MAX_TOKENS)
            {
                Token[i]='\0';
                strcpy(TokenList[k],Token);
                k=k+1;
            }
            i=0;
            /* a delimiter other than white space is a token by itself */
            if (c!=EOF && !isspace(c) && k<MAX_TOKENS)
            {
                TokenList[k][0]=(char)c;
                TokenList[k][1]='\0';
                k=k+1;
            }
        }
    } while(c!=EOF);
    fclose(fp);
    for(i=0;i<k;i++)
    {
        printf("%s\n",TokenList[i]);
    }
    /* every "if" token counts as one IF condition */
    for(st=0;st<k;st++)
    {
        if (strcmp(TokenList[st],"if")==0)
            count=count+1;
    }
    printf("Total No. of IF Condition In The Program = %d\n",count);
    return 0;
}
Program-5
Objective: Count the number of Variables present in a file with data types using the
Lexical Analyzer.
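#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 200
#define MAX_LEN 33

void dataType(void);
int st;                                 /* index of the token being examined */
char TokenList[MAX_TOKENS][MAX_LEN];

int main(void)
{
    int c;                              /* int, so that EOF can be detected */
    int i=0,k=0,callmodule;
    char Token[MAX_LEN];
    char KeyWords[5][10]={"int","float","char","printf","scanf"};
    FILE *fp;

    fp=fopen("c.c","r");
    if(fp==NULL)
    {
        perror("c.c");
        return 1;
    }
    do
    {
        c=getc(fp);
        if (c!=EOF && (isalnum(c) || c=='_'))
        {
            /* letters, digits and '_' extend the current word */
            if(i<MAX_LEN-1)
            {
                Token[i]=(char)c;
                i=i+1;
            }
        }
        else
        {
            /* any other character, or end of file, ends the current word */
            if(i>0 && k<MAX_TOKENS)
            {
                Token[i]='\0';
                strcpy(TokenList[k],Token);
                k=k+1;
            }
            i=0;
            /* a delimiter other than white space is a token by itself */
            if (c!=EOF && !isspace(c) && k<MAX_TOKENS)
            {
                TokenList[k][0]=(char)c;
                TokenList[k][1]='\0';
                k=k+1;
            }
        }
    } while(c!=EOF);
    fclose(fp);
    /* NOTE: the original listing is missing the lines between the token loop
       and the switch. The scan below is a reconstruction that assumes
       callmodule is the KeyWords index of the current token (0=int, 1=float,
       2=char) and -1 for every other token. */
    st=0;
    while(st<k)
    {
        callmodule=-1;
        for(i=0;i<5;i++)
        {
            if(strcmp(TokenList[st],KeyWords[i])==0)
            {
                callmodule=i;
            }
        }
        switch(callmodule)
        {
        case 0:
            dataType();
            printf(" Are Integer Type Variables.");
            break;
        case 1:
            dataType();
            printf(" Are Float Type Variables.");
            break;
        case 2:
            dataType();
            printf(" Are Character Type Variables.");
            break;
        }
        st++;
    }
    printf("\n\n");
    return 0;
}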
void dataType(void)
{
st=st+1;
printf("\n");
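/* stop at the end of the token array if no ';' is found */
while(st<(int)(sizeof TokenList/sizeof TokenList[0]) && strcmp(TokenList[st],";")!=0)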
{
if((TokenList[st][0]>=48 && TokenList[st][0]<=57) ||
(TokenList[st][0]>=65 && TokenList[st][0]<=90) ||
(TokenList[st][0]>=97 && TokenList[st][0]<=122) ||
(TokenList[st][0]==95))
{
printf("%s ",TokenList[st]);
}
st++;
}
st=st-1;