SPCC Practicals
3
Aim :- To study the design and implementation of a lexical analyzer
Theory : Lexical analysis is the first phase of a compiler. It takes the modified source code
produced by the language preprocessor, written in the form of sentences. The lexical analyzer
breaks this text into a series of tokens, discarding any whitespace and comments in the
source code. If the lexical analyzer finds a token invalid, it generates an error. The lexical
analyzer works closely with the syntax analyzer: it reads the character stream from the source
code, checks for legal tokens, and passes them to the syntax analyzer on demand.
Tokens: A lexeme is the sequence of characters in the source code that makes up a token. There
are predefined rules for every lexeme to be identified as a valid token. These rules are
defined by the grammar of the language by means of patterns. A pattern describes what can be a
token, and patterns are defined by means of regular expressions.
In a programming language, keywords, constants, identifiers, strings, numbers,
operators and punctuation symbols can be considered as tokens.
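The token categories listed above can be sketched as a small classifier. This is only an illustration under stated assumptions: the enum `TokenKind`, the function `classify`, and the short keyword and operator lists are hypothetical names chosen for this example, not part of the practical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token categories named in the text; the enum itself is illustrative. */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_STRING,
    TOK_OPERATOR, TOK_PUNCTUATION, TOK_INVALID
} TokenKind;

/* Assumed, deliberately small word lists; a real language defines more. */
static const char *const KEYWORDS[] = { "int", "if", "else", "while", "return", NULL };
static const char *const OPERATORS[] = { "+", "-", "*", "/", "%", "=", "+=", "-=",
                                         "*=", "/=", "==", "!=", "<", "<=", ">",
                                         ">=", "&&", "||", "!", NULL };

/* Returns 1 if word appears in the NULL-terminated list, else 0. */
static int in_list(const char *word, const char *const list[])
{
    size_t i;
    for (i = 0; list[i] != NULL; i++)
        if (strcmp(word, list[i]) == 0) return 1;
    return 0;
}

/* Classify one complete lexeme against simple hand-written patterns. */
TokenKind classify(const char *lexeme)
{
    size_t i, n;
    assert(lexeme != NULL);
    n = strlen(lexeme);
    if (n == 0) return TOK_INVALID;
    if (in_list(lexeme, KEYWORDS)) return TOK_KEYWORD;

    /* identifier: a letter or '_' followed by letters, digits or '_' */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TOK_INVALID;
        return TOK_IDENTIFIER;
    }
    /* number: decimal digits only */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; i < n; i++)
            if (!isdigit((unsigned char)lexeme[i])) return TOK_INVALID;
        return TOK_NUMBER;
    }
    /* string: double-quoted, no escape sequences handled */
    if (lexeme[0] == '"') {
        if (n < 2 || lexeme[n - 1] != '"') return TOK_INVALID;
        for (i = 1; i + 1 < n; i++)
            if (lexeme[i] == '"') return TOK_INVALID;
        return TOK_STRING;
    }
    if (n == 1 && strchr(",;.", lexeme[0]) != NULL) return TOK_PUNCTUATION;
    if (in_list(lexeme, OPERATORS)) return TOK_OPERATOR;
    return TOK_INVALID;
}
```

For instance, `classify("while")` yields `TOK_KEYWORD`, `classify("count1")` yields `TOK_IDENTIFIER`, and `classify("4x")` yields `TOK_INVALID` because no pattern matches it.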
Let us understand how language theory defines the following terms.
Alphabets
An alphabet is any finite set of symbols. {0,1} is the binary alphabet,
{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} is the hexadecimal alphabet, and {a-z, A-Z} is the
alphabet of English letters.
Strings
Any finite sequence of symbols from an alphabet is called a string. The length of a string is
the total number of occurrences of symbols in it, e.g., the length of the string tutorialspoint
is 14 and is denoted by |tutorialspoint| = 14. A string having no symbols, i.e. a string of
zero length, is known as an empty string.
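As a small illustration of alphabets and strings, the sketch below checks whether a string is built only from the symbols of a given alphabet. The function name `is_string_over` is hypothetical; the standard `strlen` gives the length of a string.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if every symbol of s belongs to the alphabet, else 0.
   The empty string is a string over every alphabet. */
int is_string_over(const char *s, const char *alphabet)
{
    assert(s != NULL && alphabet != NULL);
    /* strspn counts the leading characters of s that occur in alphabet */
    return strspn(s, alphabet) == strlen(s);
}
```

Under these assumptions, `is_string_over("1011", "01")` is 1, `is_string_over("12", "01")` is 0, and `strlen("tutorialspoint")` is 14, matching the length given above.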
Special Symbols
A typical high-level language contains the following symbols:-
Arithmetic Symbols     Addition(+), Subtraction(-), Modulo(%), Multiplication(*), Division(/)
Punctuation            Comma(,), Semicolon(;), Dot(.), Arrow(->)
Assignment             =
Special Assignment     +=, /=, *=, -=
Comparison             ==, !=, <, <=, >, >=
Preprocessor           #
Location Specifier     &
Logical                &, &&, |, ||, !
Shift Operator         >>, >>>, <<, <<<
Language
A language is a set of strings over some finite alphabet. Computer languages are treated as
sets of strings, so set operations can be performed on them mathematically. Every finite
language, and more generally every regular language, can be described by a regular expression.
Longest Match Rule
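When the lexical analyzer reads the source code, it scans the code character by character; when
it encounters a whitespace, operator symbol, or special symbol, it decides that a word is
complete. When more than one token pattern matches at the current position, the analyzer
chooses the longest matching lexeme.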
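The rule can be sketched as follows: at the current position, try every candidate and keep the longest one that matches. The operator list is taken from the special symbols listed earlier; the function names `longest_operator` and `next_lexeme_length` are hypothetical, and the identifier and number patterns are simplifying assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Operators from the special-symbols listing; their order does not matter. */
static const char *const OPERATORS[] = {
    "+", "-", "*", "/", "%", "=", "+=", "-=", "*=", "/=",
    "==", "!=", "<", "<=", ">", ">=", "&", "&&", "|", "||", "!",
    ">>", ">>>", "<<", "<<<", "->"
};

/* Length of the longest operator that is a prefix of src, or 0 if none is. */
size_t longest_operator(const char *src)
{
    size_t best = 0, i;
    assert(src != NULL);
    for (i = 0; i < sizeof OPERATORS / sizeof OPERATORS[0]; i++) {
        size_t len = strlen(OPERATORS[i]);
        if (len > best && strncmp(src, OPERATORS[i], len) == 0)
            best = len;   /* a longer match replaces a shorter one */
    }
    return best;
}

/* Length of the lexeme starting at src: an identifier, a number or an
   operator. Returns 0 at end of input or on an unrecognized character. */
size_t next_lexeme_length(const char *src)
{
    size_t n = 0;
    assert(src != NULL);
    if (isalpha((unsigned char)src[0]) || src[0] == '_') {
        /* keep consuming, so "intvalue" is one identifier, not "int" + "value" */
        while (isalnum((unsigned char)src[n]) || src[n] == '_') n++;
        return n;
    }
    if (isdigit((unsigned char)src[0])) {
        while (isdigit((unsigned char)src[n])) n++;
        return n;
    }
    return longest_operator(src);
}
```

With this sketch, the input `>>>x` yields an operator of length 3 (`>>>`) rather than `>` or `>>`, and `intvalue = 3` yields a lexeme of length 8 rather than stopping after the keyword `int`.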
Program :
PARSER: \ a[p]=d[0];
k=strlen(a); x=0;
l=k; }
for(i=30;i>0;i--) f=p;
a[i]=a[--k]; for(j=2;s=0;j<n;j++)
for(i=30;i>0;i++) {
e[s++]=c[j];
\ g[n-s-2]=a[p--];
for(i=32;i<49;i++) }
\
\
ptr2=strcmp(e,g);
p=1;q=30-1; p=f;
i=1; if(ptr2==0)
{
\ for(j=2;j<n;j++)
\n STACK I/P STRING a[p--
ACTION \ a[++p]=c[0];
\
{
a[p]=a[q];
x=a[p];
Experiment No. 5
Aim :- To design and implement an operator precedence parser
Precedence Relations
Bottom-up parsers for a large class of context-free grammars can be easily developed using
operator grammars. Operator grammars have the property that no production right side is
empty or has two adjacent non-terminals. Consider:
E -> E op E | id
op -> + | *
This is not an operator grammar, because E op E contains adjacent non-terminals, but the
following equivalent grammar is:
E -> E + E | E * E | id
This parser relies on the following three precedence relations:
Relation    Meaning
a <· b      a yields precedence to b
a =· b      a has the same precedence as b
a ·> b      a takes precedence over b
For the grammar above, with the row symbol on the left of the relation and the column symbol
on the right, the relations are:
        id    +     *     $
id            ·>    ·>    ·>
+       <·    ·>    <·    ·>
*       <·    ·>    ·>    ·>
$       <·    <·    <·
Example: The input string:
id1 + id2 * id3
After inserting precedence relations becomes:
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
Basic Principle
Having precedence relations allows identifying handles as follows:
1. Scan the string from the left until seeing ·> and put a pointer.
2. Scan backwards from right to left until seeing <·.
3. Everything between the two relations <· and ·> forms the handle.
4. Replace the handle with the head of the production.
Operator Precedence Parsing Algorithm
Initialize: Set ip to point to the first symbol of the input string w$
Repeat: Let b be the top stack symbol, a the input symbol pointed to by ip
  if (a is $ and b is $)
    return
  else
    if b <· a or b =· a
    then
      push a onto the stack
      advance ip to the next input symbol
    else if b ·> a
    then
      repeat
        pop the stack
      until the top stack terminal is related by <· to the terminal most recently popped
    else error
end
Making Operator Precedence Relations :
Operator precedence parsers usually do not store the precedence table with the relations.
Instead, they use two precedence functions f and g that map terminal symbols to integers, so
that the relation between two symbols is decided by numerical comparison: f(a) < g(b) when
a <· b, f(a) = g(b) when a =· b, and f(a) > g(b) when a ·> b.
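A minimal sketch of this numerical comparison for the grammar E -> E + E | E * E | id is given below. The integer values are the commonly quoted textbook ones for this grammar and are an assumption here, as are the names `index_of` and `relation` and the choice to write id as the single character 'i'.

```c
#include <assert.h>

/* Map a terminal to an index; id is written as 'i' here (an assumption). */
static int index_of(char t)
{
    switch (t) {
    case 'i': return 0;
    case '+': return 1;
    case '*': return 2;
    case '$': return 3;
    default:  return -1;
    }
}

/* Assumed precedence function values, in the order id + * $. */
static const int F[4] = { 4, 2, 4, 0 };   /* f(id), f(+), f(*), f($) */
static const int G[4] = { 5, 1, 3, 0 };   /* g(id), g(+), g(*), g($) */

/* Relation between stack-top terminal a and input terminal b:
   '<' if f(a) < g(b), '>' if f(a) > g(b), '=' if equal, '?' if unknown. */
char relation(char a, char b)
{
    int i = index_of(a), j = index_of(b);
    if (i < 0 || j < 0) return '?';
    assert(i < 4 && j < 4);
    if (F[i] < G[j]) return '<';
    if (F[i] > G[j]) return '>';
    return '=';
}
```

For example, `relation('+', '*')` gives '<' and `relation('*', '+')` gives '>'. One design trade-off is that the blank (error) entries of a precedence table are lost: two integers can always be compared, so pairs such as id followed by id still produce a relation.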
Algorithm for Constructing Precedence Functions
1. Create symbols fa and ga for each grammar terminal a and for the end-of-string symbol.
2. Partition the symbols into groups so that fa and gb are in the same group if a =· b (there
can be symbols in the same group even if they are not connected by this relation).
3. Create a directed graph whose nodes are the groups. Then, for each pair of symbols a and b:
if a <· b, place an edge from the group of gb to the group of fa; if a ·> b, place an edge
from the group of fa to that of gb.
4. If the constructed graph has a cycle, then no precedence functions exist. When there are no
cycles, let f(a) and g(b) be the lengths of the longest paths beginning at the groups of fa
and gb respectively.
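The construction above can be sketched for a four-terminal table as follows. This is a simplified illustration, not a general implementation: it assumes the table contains no =· entries, so every fa and gb forms its own group, and the names `precedence_functions`, `dfs`, `T` and `N` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define T 4          /* number of terminals, e.g. in the order id + * $ */
#define N (2 * T)    /* nodes 0..T-1 stand for f_a, nodes T..2T-1 for g_b */

static int adj[N][N];    /* adjacency matrix of the directed graph */
static int state[N];     /* 0 = unvisited, 1 = on the DFS path, 2 = finished */
static int longest[N];   /* longest path length starting at each finished node */

/* Longest path (counted in edges) starting at u; -1 if a cycle is reachable. */
static int dfs(int u)
{
    int v, best = 0;
    if (state[u] == 2) return longest[u];
    if (state[u] == 1) return -1;            /* back edge: the graph has a cycle */
    state[u] = 1;
    for (v = 0; v < N; v++) {
        if (adj[u][v]) {
            int d = dfs(v);
            if (d < 0) return -1;
            if (d + 1 > best) best = d + 1;
        }
    }
    state[u] = 2;
    longest[u] = best;
    return best;
}

/* rel[a][b] is '<', '>' or ' ' (no relation). Fills f and g and returns 1,
   or returns 0 if the graph has a cycle, so no precedence functions exist. */
int precedence_functions(const char rel[T][T], int f[T], int g[T])
{
    int a, b;
    assert(rel != NULL && f != NULL && g != NULL);
    memset(adj, 0, sizeof adj);
    memset(state, 0, sizeof state);
    for (a = 0; a < T; a++)
        for (b = 0; b < T; b++) {
            if (rel[a][b] == '<') adj[T + b][a] = 1;        /* edge g_b -> f_a */
            else if (rel[a][b] == '>') adj[a][T + b] = 1;   /* edge f_a -> g_b */
        }
    for (a = 0; a < T; a++) {
        if ((f[a] = dfs(a)) < 0) return 0;
        if ((g[a] = dfs(T + a)) < 0) return 0;
    }
    return 1;
}
```

Applied to the relations for E -> E + E | E * E | id with terminals ordered id, +, *, $, this sketch produces f = 4, 2, 4, 0 and g = 5, 1, 3, 0.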
Program :
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
/* Map a terminal to its row/column index in the precedence table; -1 if unknown. */
int find(char a)
{ switch(a)
  { case 'a':
      return 0;
    case '+':
      return 1;
    case '*':
      return 2;
    case '$':
      return 3;
  }
  return -1;
}
/* Print the stack contents and the unread part of the input. */
void display(char a[],int top1,char b[],int top2)
{ int i;
  for(i=0;i<=top1;i++)
    printf("%c",a[i]);
  printf("\t");
  for(i=top2;i<(int)strlen(b);i++)
    printf("%c",b[i]);
  printf("\n");
}
int main()
{ /* table[stack top][input symbol]; rows and columns are ordered a + * $ */
  char table[4][4]= {{' ','>','>','>'},
                     {'<','>','<','>'},
                     {'<','>','>','>'},
                     {'<','<','<',' '}};
  char input[32];
  char stack[64]={'$'};
  int top_stack=0,top_input=0,r,c;
  char rel;
  printf("operator precedence parsing for E->E+E|E*E|a\n");
  printf("enter the string\n");
  if(scanf("%30s",input)!=1)   /* bounded read leaves room for the appended '$' */
    return 1;
  strcat(input,"$");
  printf("stack\tinput\n");
  display(stack,top_stack,input,top_input);
  while(1)
  { if((stack[top_stack]=='$')&&(input[top_input]=='$'))
    { printf("string accepted\n");
      break;
    }
    r=find(stack[top_stack]);
    c=find(input[top_input]);
    rel=(r<0||c<0)?' ':table[r][c];   /* unknown symbols count as errors */
    if(rel==' ')
    { printf("parse error\n");
      exit(1);
    }
    if(rel=='<')
    { /* shift: push the '<' marker and the input symbol */
      stack[++top_stack]='<';
      stack[++top_stack]=input[top_input];
      top_input++;
      display(stack,top_stack,input,top_input);
      continue;
    }
    /* rel is '>': mark the end of the handle, then pop '<', symbol, '>' */
    stack[++top_stack]='>';
    display(stack,top_stack,input,top_input);
    top_stack-=3;
    display(stack,top_stack,input,top_input);
  }
  return 0;
}