SPCC Practicals

Experiment No. 3
Aim :- To study the design and implementation of a lexical analyzer.
Theory : Lexical analysis is the first phase of a compiler. It takes the modified source code
produced by the language preprocessors, written in the form of sentences. The lexical analyzer
breaks this text into a series of tokens, removing any whitespace and comments in the
source code. If the lexical analyzer finds a token invalid, it generates an error. The lexical
analyzer works closely with the syntax analyzer: it reads character streams from the source
code, checks for legal tokens, and passes the tokens to the syntax analyzer on demand.
Tokens: A lexeme is the sequence of characters in the source code that makes up a token.
Predefined rules decide whether a lexeme is a valid token. These rules are given by the
grammar in the form of patterns: a pattern describes what can be a token, and
patterns are defined by means of regular expressions.
In a programming language, keywords, constants, identifiers, strings, numbers,
operators and punctuation symbols can be considered tokens.
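The token classes listed above can be illustrated with a small classifier. This is only a sketch: the enum names, the function classify and the short keyword and operator lists are assumptions made for the example, and the two patterns are checked by hand (identifier = letter followed by letters or digits, number = one or more digits) instead of with a regular-expression library.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token classes for illustration only; the names are assumptions. */
enum token_class { TK_KEYWORD, TK_IDENTIFIER, TK_NUMBER,
                   TK_OPERATOR, TK_PUNCT, TK_UNKNOWN };

/* Pattern: letter (letter | digit)* */
static int is_identifier(const char *s)
{
    if (!isalpha((unsigned char)s[0]))
        return 0;
    for (size_t i = 1; s[i] != '\0'; i++)
        if (!isalnum((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Pattern: digit+ */
static int is_number(const char *s)
{
    if (s[0] == '\0')
        return 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Returns 1 if s equals one of the n strings in list. */
static int in_list(const char *s, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(s, list[i]) == 0)
            return 1;
    return 0;
}

/* Classify one complete lexeme. Keywords are tested before identifiers
   because every keyword also matches the identifier pattern. */
enum token_class classify(const char *lexeme)
{
    static const char *const keywords[] = { "if", "else", "for", "int", "float", "return" };
    static const char *const operators[] = { "+", "-", "*", "/", "%", "=",
                                             "==", "!=", "<", "<=", ">", ">=" };
    static const char *const puncts[] = { ",", ";", "." };

    assert(lexeme != NULL);
    if (in_list(lexeme, keywords, sizeof keywords / sizeof keywords[0]))
        return TK_KEYWORD;
    if (in_list(lexeme, operators, sizeof operators / sizeof operators[0]))
        return TK_OPERATOR;
    if (in_list(lexeme, puncts, sizeof puncts / sizeof puncts[0]))
        return TK_PUNCT;
    if (is_identifier(lexeme))
        return TK_IDENTIFIER;
    if (is_number(lexeme))
        return TK_NUMBER;
    return TK_UNKNOWN;
}
```

For instance, classify("count1") yields TK_IDENTIFIER, while classify("9lives") yields TK_UNKNOWN because it matches neither pattern.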
Let us see how language theory defines the following terms.
Alphabets
An alphabet is any finite set of symbols: {0,1} is the binary alphabet,
{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} is the hexadecimal alphabet, and {a-z, A-Z} is the
English-language alphabet.
Strings
Any finite sequence of symbols from an alphabet is called a string. The length of a string is
the total number of symbol occurrences in it, e.g., the length of the string tutorialspoint is 14,
written |tutorialspoint| = 14. A string containing no symbols, i.e. a string of zero length, is
known as the empty string and is denoted by ε (epsilon).
Special Symbols
A typical high-level language contains the following symbols:
Arithmetic Symbols    Addition(+), Subtraction(-), Modulo(%), Multiplication(*), Division(/)
Punctuation           Comma(,), Semicolon(;), Dot(.), Arrow(->)
Assignment            =
Special Assignment    +=, /=, *=, -=
Comparison            ==, !=, <, <=, >, >=
Preprocessor          #
Location Specifier    &
Logical               &, &&, |, ||, !
Shift Operator        >>, >>>, <<, <<<
Language
A language is a set of strings over some finite alphabet. Because languages are sets, the
usual set operations (union, intersection and so on) can be performed on them. Regular
languages, which include every finite language, can be described by means of regular expressions.
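Longest Match Rule
When the lexical analyzer reads the source code, it scans it character by character; when it
encounters whitespace, an operator symbol, or a special symbol, it decides that a word is
complete. If more than one token pattern matches at the current position, the longest match
rule says to take the longest lexeme: intvalue is scanned as one identifier, not as the
keyword int followed by value.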
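A minimal sketch of the longest match rule follows. The function name longest_match and the operator list are assumptions chosen for the example; it returns the length of the longest lexeme that starts at the given position, or 0 if nothing matches.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of the longest lexeme starting at src (0 if none).
   Words: letter (letter | digit)*. Operators: taken from a small
   illustrative list, always preferring the longest one that fits. */
size_t longest_match(const char *src)
{
    static const char *const ops[] = { "=", "==", "!=", "<", "<=", "<<",
                                       ">", ">=", ">>", "+", "+=",
                                       "-", "-=", "->" };
    size_t best = 0;

    assert(src != NULL);
    if (isalpha((unsigned char)src[0])) {
        /* Keep consuming while the word can still grow. */
        while (isalnum((unsigned char)src[best]))
            best++;
        return best;
    }
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        size_t n = strlen(ops[i]);
        if (n > best && strncmp(src, ops[i], n) == 0)
            best = n;   /* a longer operator also matches here */
    }
    return best;
}
```

With this sketch, "intvalue;" gives a lexeme of length 8 (the whole identifier), and "<=b" gives length 2, so <= is preferred over <.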
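Program :
#include<stdio.h>
#include<ctype.h>
#include<string.h>
#include<stdlib.h>
#define SIZE 128        /* size of the identifier buffer */
#define NONE -1
#define EOS '\0'
#define NUM 256
#define KEYWORD 257
#define PAREN 258
#define ID 259
#define ASSIGN 260
#define REL_OP 261
#define DONE 262
#define MAX 999         /* size of the lexemes array */
#define SYMMAX 100      /* size of the symbol table */

char lexemes[MAX];
char buffer[SIZE];
int lastchar=-1;        /* last used position in lexemes */
int lastentry=0;        /* last used position in symtable */
int tokenval=NONE;
int lineno=1;

struct entry
{
    char *lexptr;
    int token;
}symtable[SYMMAX];

struct entry keywords[]={
    {"if",KEYWORD},{"else",KEYWORD},{"for",KEYWORD},{"int",KEYWORD},
    {"float",KEYWORD},{"double",KEYWORD},{"char",KEYWORD},
    {"struct",KEYWORD},{"return",KEYWORD},{0,0}};

void Error_Message(char *m)
{
    fprintf(stderr,"line %d:%s\n",lineno,m);
    exit(1);
}

/* Return the symbol-table index of s, or 0 if it is not present. */
int look_up(char s[])
{
    int k;
    for(k=lastentry;k>0;k--)
        if(strcmp(symtable[k].lexptr,s)==0)
            return k;
    return 0;
}

/* Store s with token tok and return its symbol-table index. */
int insert(char s[],int tok)
{
    int len;
    len=strlen(s);
    if(lastentry+1>=SYMMAX)
        Error_Message("Symbol Table is Full.");
    if(lastchar+len+1>=MAX)
        Error_Message("Lexemes Array is Full.");
    lastentry=lastentry+1;
    symtable[lastentry].token=tok;
    symtable[lastentry].lexptr=&lexemes[lastchar+1];
    lastchar=lastchar+1+len;
    strcpy(symtable[lastentry].lexptr,s);
    return lastentry;
}

/* Load the keywords into the symbol table. */
void Initialize()
{
    struct entry *ptr;
    for(ptr=keywords;ptr->token;ptr++)
        insert(ptr->lexptr,ptr->token);
}

/* Return the next token read from standard input. */
int lexer()
{
    int t;
    int val,i=0;
    while(1)
    {
        t=getchar();
        if(t==' '||t=='\t')
            ;                       /* skip blanks */
        else if(t=='\n')
            lineno=lineno+1;
        else if(t=='('||t==')')
            return PAREN;
        else if(t=='<'||t=='>'||t=='!')
            return REL_OP;
        else if(t=='=')
            return ASSIGN;
        else if(isdigit(t))
        {
            ungetc(t,stdin);
            scanf("%d",&tokenval);
            return NUM;
        }
        else if(isalpha(t))
        {
            while(isalnum(t))
            {
                buffer[i]=t;
                t=getchar();
                i=i+1;
                if(i>=SIZE)
                    Error_Message("Compiler Error");
            }
            /* The source listing breaks off at this point. Everything below
               is an assumed minimal completion, not the original text. */
            buffer[i]=EOS;
            if(t!=EOF)
                ungetc(t,stdin);
            val=look_up(buffer);
            if(val==0)
                val=insert(buffer,ID);
            tokenval=val;
            return symtable[val].token;
        }
        else if(t==EOF)
            return DONE;
        else
        {
            tokenval=NONE;
            return t;
        }
    }
}

/* Assumed driver (the original main is missing from the source):
   prints the numeric code of each token read from standard input. */
int main()
{
    int tok;
    Initialize();
    while((tok=lexer())!=DONE)
        printf("token %d\n",tok);
    return 0;
}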
Experiment No. 4
Aim :- Implement a shift-reduce parser.
Theory :-
A simple bottom-up parser, known as a shift-reduce parser, can be implemented
using a stack to hold a sequence of terminal and non-terminal symbols. Symbols from the
input string can be shifted onto this stack, or the items on the stack can be reduced by
applying a grammar rule whose right-hand side matches the symbols on the top of the
stack.
This is a bottom-up parser: starting from the input sequence and making
reductions, we aim to end up with the goal symbol. A sentential form is reduced by
substituting the left side of a production for a string that matches its right
side, rather than by substituting the right side of a production whose left side appears as a
non-terminal in the sentential form.
A bottom-up parsing algorithm might employ a parse stack that holds part of a
sentential form, made up of terminals and/or non-terminals. As we read each terminal
from the input string we push it onto the parse stack, and then examine the top elements of
the stack to see whether we can make a reduction. Some terminals may remain on the parse
stack for quite a long time before they are finally popped off by a reduction.
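Simulation of shift-reduce parser for the grammar
E -> E + E | E * E | ( E ) | id
Input string : id + id * id
Stack        Input             Action
$            id + id * id $    shift
$id          + id * id $       reduce by E -> id
$E           + id * id $       shift
$E+          id * id $         shift
$E+id        * id $            reduce by E -> id
$E+E         * id $            shift
$E+E*        id $              shift
$E+E*id      $                 reduce by E -> id
$E+E*E       $                 reduce by E -> E * E
$E+E         $                 reduce by E -> E + E
$E           $                 accept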
Algorithm:-
1) Accept the grammar rules from the user.
2) Accept the string to be parsed from the user.
3) Read the string character by character, identify the next token and shift it onto the stack.
4) Check whether the symbols on top of the stack match the right-hand side of any
grammar rule.
5) If they do, replace them with the non-terminal on the left-hand side of that rule.
6) If the end of the string has not been reached, go to step (3).
7) If the end of the string has been reached and only the start non-terminal remains after
reduction, then the string is accepted.
8) Otherwise the string is rejected, as it is not a valid string generated by the given
grammar.
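The steps above can be sketched in C as follows. This is an illustration under stated assumptions, not the practical's own listing: the grammar is fixed in the code as E -> E+E | E*E | id instead of being read from the user, the names try_reduce and sr_parse are invented for the example, and the parser reduces as soon as any right-hand side appears on top of the stack. Because of that greedy choice it reduces E+E before shifting *, so its action sequence differs from the simulation table even though it accepts the same strings.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define STACK_MAX 64

/* Try each rule against the top of the stack.
   Returns 1 if a reduction was made, 0 otherwise. */
static int try_reduce(char *stack, int *top)
{
    /* Right-hand sides of E -> id, E -> E+E, E -> E*E. */
    static const char *const rhs[] = { "id", "E+E", "E*E" };

    for (size_t r = 0; r < sizeof rhs / sizeof rhs[0]; r++) {
        int n = (int)strlen(rhs[r]);
        if (*top >= n && strncmp(stack + *top - n, rhs[r], (size_t)n) == 0) {
            *top -= n;               /* pop the handle */
            stack[(*top)++] = 'E';   /* push the left-hand side */
            stack[*top] = '\0';
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if input is accepted, 0 otherwise. Blanks are ignored. */
int sr_parse(const char *input)
{
    char stack[STACK_MAX + 1] = "";
    int top = 0;                     /* number of symbols on the stack */

    assert(input != NULL);
    for (const char *p = input; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (top >= STACK_MAX)
            return 0;                /* stack overflow: reject */
        stack[top++] = *p;           /* shift one character */
        stack[top] = '\0';
        while (try_reduce(stack, &top))
            ;                        /* reduce as long as a rule applies */
    }
    /* Accept only if the start symbol alone remains. */
    return strcmp(stack, "E") == 0;
}
```

Calling sr_parse("id + id * id") returns 1, while sr_parse("id +") returns 0 because the stack ends as E+ instead of E.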
Conclusion :- A shift-reduce parser for a given set of grammar rules has been implemented
using a stack.
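PROGRAM :
/* Only fragments of this listing survive in the source. The two printed
   columns are separated below and every lost statement is marked with a
   comment; the gaps are not filled in, so the program is incomplete. */
#include<stdio.h>
#include<conio.h>
#include<string.h>
void main()
{
char a[50],z,e[10],b[10],c[10],d[10];
int o,j,s,ptr1=0,ptr2=0,ptr3=0,i,l,k,m,n,x=1,p,q,f;
clrscr();
CALCULATION:
/* ... statements lost in the source ... */
m=strlen(b);
n=strlen(c);
o=strlen(d);
PARSER:
/* ... statement lost in the source ... */
k=strlen(a);
l=k;
for(i=30;i>0;i--)
a[i]=a[--k];
for(i=30;i>0;i++)
/* ... loop body lost in the source ... */
for(i=32;i<49;i++)
/* ... loop body lost in the source ... */
p=1;q=30-1;
i=1;
/* ... statements lost in the source; one of them printed the header
   "STACK   I/P STRING   ACTION" ... */
{
a[p]=a[q];
x=a[p];
if(p!=1)
while((ptr1==0)||(ptr2==0)||(ptr3==))   /* last operand lost in the source */
{
f=p;
for(j=2,s=0;j<0;j--)
{
e[s++]=d[j];
g[o-s-2]=a[p--];
}
ptr1=strcmp(e,g);
p=f;
if(ptr1==0)
{
for(j=p+1;j<=o;j++)
a[p]=d[0];
x=0;
}
f=p;
for(j=2,s=0;j<n;j++)
{
e[s++]=c[j];
g[n-s-2]=a[p--];
}
ptr2=strcmp(e,g);
p=f;
if(ptr2==0)
{
for(j=2;j<n;j++)
a[p--        /* statement truncated in the source */
a[++p]=c[0];
/* ... the rest of the listing is missing from the source ... */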
Experiment No. 5
Aim :- Design and implement an operator precedence parser.
Precedence Relations
Bottom-up parsers for a large class of context-free grammars can be easily
developed using operator grammars. Operator grammars have the property that no production
right side is empty or has two adjacent non-terminals. Consider:
E -> E op E | id
op -> + | *
This is not an operator grammar, but the following is:
E -> E + E | E * E | id
The parser relies on the following three precedence relations:
Relation    Meaning
a <· b      a yields precedence to b
a =· b      a has the same precedence as b
a ·> b      a takes precedence over b
Example: The input string:
id1 + id2 * id3
after inserting precedence relations becomes:
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
Basic Principle
Having precedence relations allows identifying handles as follows:
1. Scan the string from the left until seeing ·> and put a pointer there.
2. Scan backwards (from right to left) until seeing <·.
3. Everything between the two relations <· and ·> forms the handle.
4. Replace the handle with the head of the production.
Precedence table (row: symbol on the left, column: symbol on the right):
        id     +      *      $
id             ·>     ·>     ·>
+       <·     ·>     <·     ·>
*       <·     ·>     ·>     ·>
$       <·     <·     <·
Operator Precedence Parsing Algorithm
Initialize: Push $ onto the stack and set ip to point to the first symbol of the input string w$
Repeat: Let b be the top stack symbol, a the input symbol pointed to by ip
  if (a is $ and b is $)
    return
  else
    if b <· a or b =· a
    then
      push a onto the stack
      advance ip to the next input symbol
    else if b ·> a
    then
      repeat
        c <- pop the stack
      until (stack-top <· c)
    else error
end
Making Operator Precedence Relations :
The operator precedence parsers usually do not store the precedence table with the relations;
rather they are implemented in a special way. Operator precedence parsers use precedence
functions that map terminal symbols to integers, and so the precedence relations between the
symbols are implemented by numerical comparison.
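As a small illustration of comparing precedence functions numerically, the sketch below hard-codes one valid pair of functions for the terminals a (standing for id), +, * and $. The function names prec_f, prec_g and relation are assumptions made for the example; f is applied to the symbol on the left (the stack top) and g to the symbol on the right (the next input symbol).

```c
#include <assert.h>

/* f value of a terminal; -1 for a symbol outside {a, +, *, $}. */
int prec_f(char t)
{
    switch (t) {
    case 'a': return 4;
    case '+': return 2;
    case '*': return 4;
    case '$': return 0;
    default:  return -1;
    }
}

/* g value of a terminal; -1 for a symbol outside {a, +, *, $}. */
int prec_g(char t)
{
    switch (t) {
    case 'a': return 5;
    case '+': return 1;
    case '*': return 3;
    case '$': return 0;
    default:  return -1;
    }
}

/* Relation between left symbol a and right symbol b:
   '<' if f(a) < g(b), '>' if f(a) > g(b), '=' if they are equal. */
char relation(char a, char b)
{
    int fa = prec_f(a), gb = prec_g(b);

    assert(fa >= 0 && gb >= 0);   /* both symbols must be known */
    if (fa < gb)
        return '<';
    if (fa > gb)
        return '>';
    return '=';
}
```

Here relation('+', '*') is '<' and relation('*', '+') is '>'. One design consequence is that blank (error) entries of a precedence table cannot be represented: two integers always compare, so relation('a', 'a') returns '<' even though id followed by id is an error.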
Algorithm for Constructing Precedence Functions
1. Create symbols fa and ga for each grammar terminal a and for the end-of-string symbol $.
2. Partition the symbols into groups so that fa and gb are in the same group if a =· b (symbols
can end up in the same group even if they are not directly connected by this relation).
3. Create a directed graph whose nodes are the groups. Then for each pair of symbols a and b:
if a <· b, place an edge from the group of gb to the group of fa; if a ·> b, place an edge
from the group of fa to that of gb.
4. If the constructed graph has a cycle, then no precedence functions exist. When there are no
cycles, let f(a) and g(b) be the lengths of the longest paths beginning at the groups of fa
and gb respectively.
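The construction above can be sketched in C for a table of four terminals. The sketch makes two simplifying assumptions that should be stated loudly: the table contains no =· entries, so every fa and gb is its own group and step 2 is skipped, and the terminal order a (for id), +, *, $ as well as all names (example_table, build_precedence_functions and so on) are chosen only for this example.

```c
#include <assert.h>
#include <string.h>

#define NT 4            /* terminals, in the order: a (id), +, *, $ */
#define NN (2 * NT)     /* nodes 0..NT-1 are f_a, NT..2*NT-1 are g_a */

/* table[x][y] is the relation between left symbol x and right symbol y:
   '<', '>' or ' ' (no relation). */
static const char example_table[NT][NT] = {
    { ' ', '>', '>', '>' },   /* a */
    { '<', '>', '<', '>' },   /* + */
    { '<', '>', '>', '>' },   /* * */
    { '<', '<', '<', ' ' }    /* $ */
};

static int edge[NN][NN];
static int memo[NN];      /* longest path from a node, -1 = not computed */
static int on_path[NN];   /* nodes on the current DFS path (cycle check) */

/* Longest path, counted in edges, starting at node u; -1 on a cycle. */
static int longest(int u)
{
    int best = 0;

    if (on_path[u])
        return -1;
    if (memo[u] >= 0)
        return memo[u];
    on_path[u] = 1;
    for (int v = 0; v < NN; v++) {
        if (!edge[u][v])
            continue;
        int d = longest(v);
        if (d < 0) {
            on_path[u] = 0;
            return -1;
        }
        if (d + 1 > best)
            best = d + 1;
    }
    on_path[u] = 0;
    memo[u] = best;
    return best;
}

/* Fills f[] and g[] from table; returns 1 on success, 0 if the graph
   has a cycle (no precedence functions exist). */
int build_precedence_functions(const char table[NT][NT], int f[NT], int g[NT])
{
    memset(edge, 0, sizeof edge);
    memset(on_path, 0, sizeof on_path);
    for (int i = 0; i < NN; i++)
        memo[i] = -1;

    for (int a = 0; a < NT; a++)
        for (int b = 0; b < NT; b++) {
            if (table[a][b] == '<')
                edge[NT + b][a] = 1;     /* a <. b : edge g_b -> f_a */
            else if (table[a][b] == '>')
                edge[a][NT + b] = 1;     /* a .> b : edge f_a -> g_b */
        }

    for (int a = 0; a < NT; a++) {
        f[a] = longest(a);
        g[a] = longest(NT + a);
        if (f[a] < 0 || g[a] < 0)
            return 0;
    }
    return 1;
}
```

For example_table the sketch yields f = {4, 2, 4, 0} and g = {5, 1, 3, 0} for a, +, *, $, so that for instance f(+) = 2 is less than g(*) = 3, matching + <· *.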
Program :
#include<stdio.h>
#include<stdlib.h>
#include<string.h>

/* Map a terminal to its row/column in the precedence table; -1 if unknown. */
int find(char a)
{
    switch(a)
    {
    case 'a':
        return 0;
    case '+':
        return 1;
    case '*':
        return 2;
    case '$':
        return 3;
    }
    return -1;
}

/* Print the stack contents and the unread part of the input. */
void display(char a[],int top1,char b[],int top2)
{
    int i;
    for(i=0;i<=top1;i++)
        printf("%c",a[i]);
    printf("\t");
    for(i=top2;i<(int)strlen(b);i++)
        printf("%c",b[i]);
    printf("\n");
}

int main()
{
    /* Rows: symbol on top of the stack; columns: next input symbol,
       both in the order a + * $. Same-operator entries are '>' so that
       + and * are left-associative, as in the precedence table above. */
    char table[][4]={{' ','>','>','>'},
                     {'<','>','<','>'},
                     {'<','>','>','>'},
                     {'<','<','<',' '}};
    char input[32];
    char stack[96]={'$'};
    int top_stack=0,top_input=0,r,c;
    printf("operator precedence parsing for E->E+E/E*E/a\n");
    printf("enter the string\n");
    if(scanf("%30s",input)!=1)
        return 1;
    strcat(input,"$");
    printf("stack\tinput\n");
    display(stack,top_stack,input,top_input);
    while(1)
    {
        if((stack[top_stack]=='$')&&(input[top_input]=='$'))
        {
            printf("string accepted\n");
            break;
        }
        r=find(stack[top_stack]);
        c=find(input[top_input]);
        if(r<0||c<0||table[r][c]==' ')
        {
            printf("parse error\n");
            exit(1);
        }
        if(table[r][c]=='<')
        {
            /* shift: push the <. marker, then the input symbol */
            stack[++top_stack]='<';
            stack[++top_stack]=input[top_input];
            top_input++;
        }
        else
        {
            /* reduce: pop the handle, i.e. the symbol and its <. marker */
            top_stack-=2;
        }
        display(stack,top_stack,input,top_input);
    }
    return 0;
}
