Compiler Construction Week 04 (Syntax Analysis I)


Compiler Construction

Week 04
Syntax Directed Translator

MUHAMMAD SHAHID KHAN


LECTURER
DEPARTMENT OF COMPUTING & TECHNOLOGY
IQRA UNIVERSITY ISLAMABAD
Layout

01 Syntax Directed Translator

02 What is Context Free Grammar?

03 Basic Terminologies of CFG


04 Derivation & Parse Tree
05 Ambiguity

Syntax Directed Translator
• This section illustrates compiling techniques by developing a program that translates representative programming language statements into three-address code, an intermediate representation.
• We will focus on the front end of a compiler:
  • Lexical analysis
  • Parsing
  • Intermediate code generation
Three address code

• Three-address code has the general form x = y op z.
• Example: X = (a*b) + (c*d)
• Converted to three-address code, this becomes
  • t1 = a*b
  • t2 = c*d
  • X = t1 + t2
• Various statements in three-address code
  • Assignment
    • x = y op z, e.g. x = y + z
    • x = op y, e.g. x = +y
    • x = y (copy)
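The conversion above can be sketched in Python. This is only an illustration: it assumes the expression is already available as a nested tuple (operator, left, right), and the name to_three_address and the temporary names t1, t2, ... are choices made for this sketch, not part of the lecture.

```python
from itertools import count


def to_three_address(expr, target):
    """Translate a nested-tuple expression into a list of three-address instructions."""
    code = []
    temps = count(1)  # supplies the numbers for t1, t2, ...

    def emit(node):
        # A plain name is already an address; nothing to compute.
        if isinstance(node, str):
            return node
        op, left, right = node
        left_addr = emit(left)
        right_addr = emit(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    if isinstance(expr, str):
        # Copy statement of the form x = y.
        return [f"{target} = {expr}"]

    op, left, right = expr
    left_addr = emit(left)
    right_addr = emit(right)
    # The outermost operation is stored directly in the target.
    code.append(f"{target} = {left_addr} {op} {right_addr}")
    return code


# X = (a*b) + (c*d)
tac = to_three_address(("+", ("*", "a", "b"), ("*", "c", "d")), "X")
for line in tac:
    print(line)
```

Each instruction has at most one operator on its right side, which is what makes it three-address code.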
Introduction

• Analysis is organized around the syntax of the language to be compiled.
• The syntax of a programming language describes the proper form of its programs.
• The semantics of the language defines what its programs mean.
• Context-free grammars are used to specify syntax.
• The notation is also known as BNF (Backus-Naur Form).
Introduction

• We start with a syntax-directed translation of an infix expression to postfix form.
  • Infix form: 9-5+2
  • Postfix form: 9 5 - 2 +
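A minimal Python sketch of the infix-to-postfix translation, assuming the input is a sequence of single digits separated by + or - (the function name infix_to_postfix is illustrative). Because both operators are applied left to right, each operator can be written out immediately after its right operand.

```python
DIGITS = "0123456789"


def infix_to_postfix(expr):
    """Translate single digits separated by + or - into postfix notation."""
    tokens = [ch for ch in expr if not ch.isspace()]
    if not tokens or tokens[0] not in DIGITS:
        raise SyntaxError("expected a digit at the start")
    out = [tokens[0]]
    i = 1
    while i < len(tokens):
        op = tokens[i]
        if op not in ("+", "-"):
            raise SyntaxError(f"expected + or -, found {op!r}")
        if i + 1 >= len(tokens) or tokens[i + 1] not in DIGITS:
            raise SyntaxError("expected a digit after the operator")
        # Emit the right operand first, then the operator.
        out.append(tokens[i + 1])
        out.append(op)
        i += 2
    return " ".join(out)


postfix = infix_to_postfix("9-5+2")
print(postfix)
```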
Syntax Definition

• A context-free grammar is used to specify the syntax of a language.
• For short, we simply call it a "grammar".
• A grammar describes the hierarchical structure of most programming language constructs.
• Example:
  • if (expression) statement else statement
Syntax Definition

• This rule can be expressed as a production by using the variable expr to denote an expression and the variable stmt to denote a statement.
  • stmt -> if (expr) stmt else stmt
• In a production,
  • lexical elements such as the keywords if and else and the parentheses are called terminals;
  • variables such as expr and stmt represent sequences of terminals and are called nonterminals.

What is Context Free Grammar?
Grammar

• A context-free grammar has four components:
  • A set of tokens (terminal symbols)
  • A set of non-terminals
  • A set of productions
  • A designated start symbol
• Let's look at an example that illustrates these components.
• Expressions: 9-5+2, 5-4, 8, ...
• Since a plus or minus sign must appear between two digits, we refer to such expressions as lists of digits separated by plus or minus signs.
• The productions are
  • list -> list + digit   (P-1)
  • list -> list - digit   (P-2)
  • list -> digit   (P-3)
  • digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   (P-4)
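The four components of the list grammar above can be written down as a small data structure. The sketch below is one possible encoding in Python; the class name Grammar, its field names, and the check is_well_formed are assumptions made for illustration.

```python
from dataclasses import dataclass

DIGITS = "0123456789"


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset
    nonterminals: frozenset
    productions: tuple  # (head, body) pairs; body is a tuple of symbols
    start: str


list_grammar = Grammar(
    terminals=frozenset("+-" + DIGITS),
    nonterminals=frozenset({"list", "digit"}),
    productions=(
        ("list", ("list", "+", "digit")),  # P-1
        ("list", ("list", "-", "digit")),  # P-2
        ("list", ("digit",)),              # P-3
    ) + tuple(("digit", (d,)) for d in DIGITS),  # P-4, one entry per alternative
    start="list",
)


def is_well_formed(g):
    """Check that the four components fit together."""
    symbols = g.terminals | g.nonterminals
    return (
        g.start in g.nonterminals
        and not (g.terminals & g.nonterminals)
        and all(
            head in g.nonterminals and all(s in symbols for s in body)
            for head, body in g.productions
        )
    )


print(is_well_formed(list_grammar))
```

Writing P-4 as ten separate (head, body) pairs makes explicit that `0 | 1 | ... | 9` is shorthand for ten productions.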
Context Free Grammar
• Why do we need CFGs?
• Many languages are not regular, for example the strings of balanced parentheses such as ((((…)))).
• There is no regular expression for this language.
• A finite automaton may revisit states; however, it cannot remember how many times it has been in a particular state.
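To illustrate what a grammar adds, here is a small Python sketch of a recognizer for balanced parentheses. It assumes the grammar S -> ( S ) S | ε, which is not given in the slides, and the function name is_balanced is illustrative. The recursion keeps track of the nesting depth that a finite automaton has no way to store.

```python
def is_balanced(text):
    """Recognize the language of S -> ( S ) S | ε by recursive descent."""

    def parse_s(i):
        # Try S -> ( S ) S when the next symbol is an opening parenthesis.
        if i < len(text) and text[i] == "(":
            j = parse_s(i + 1)
            if j is None or j >= len(text) or text[j] != ")":
                return None  # the matching ')' is missing
            return parse_s(j + 1)
        # Otherwise use S -> ε and consume nothing.
        return i

    # The whole input must be consumed for the string to be accepted.
    return parse_s(0) == len(text)


print(is_balanced("(())"), is_balanced("(()"))
```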

Context Free Grammar
• A context-free grammar is used to specify the syntax of a language.
• A grammar naturally describes hierarchical structure, e.g.
  • stmt -> if (expr) stmt else stmt
• The arrow may be read as "can have the form." Such a rule is called a production.
• The keyword if and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of terminals and are called non-terminals.
Basic Terminologies of CFG
A context-free grammar has four components:
• Terminals: sometimes referred to as "tokens." The terminals are the elementary symbols of the language defined by the grammar.
• Non-terminals: also called "syntactic variables." Each non-terminal represents a set of strings of terminals, in a manner we shall describe.
• Productions: each production consists of a non-terminal, called the head or left side of the production, an arrow, and a sequence of terminals and/or non-terminals, called the body or right side of the production.
• Start symbol: a designation of one of the non-terminals as the start symbol.
Basic Terminologies of CFG
A context-free grammar has four components:

CFG (V, T, P, S)
• Terminals (T): finite set of end-node values
  0, 1
• Non-terminals (V): finite set of variables
  S, P
• Productions (P): substitution rules
  P → 0P0
  P → 1P1
• Start symbol (S): starting point
  S
Basic Terminologies of Lexical Analysis
Example (9-5+2, 3-1, or 7)

• Terminals: + - 0 1 2 3 4 5 6 7 8 9
• Non-terminals: list and digit
• Empty string: the string of zero terminals, written as ε, is called the empty string.

Derivation & Parse Tree
Derivation:

• A grammar derives strings by beginning with the start symbol and repeatedly replacing a non-terminal by the body of a production for that non-terminal.

list ⇒ list + digit
     ⇒ list - digit + digit
     ⇒ digit - digit + digit
     ⇒ 9 - digit + digit
     ⇒ 9 - 5 + digit
     ⇒ 9 - 5 + 2

Therefore, the string 9-5+2 belongs to the language specified by the grammar.
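The derivation of 9-5+2 can be replayed mechanically. The Python sketch below represents each sentential form as a list of symbols and replaces the leftmost occurrence of a production's head by its body; the helper name replace_leftmost and the list encoding are assumptions made for illustration.

```python
LIST_PLUS = ("list", ["list", "+", "digit"])
LIST_MINUS = ("list", ["list", "-", "digit"])
LIST_DIGIT = ("list", ["digit"])


def digit(d):
    """Production digit -> d for a single digit d."""
    return ("digit", [d])


def replace_leftmost(form, production):
    """Replace the leftmost occurrence of the production's head by its body."""
    head, body = production
    i = form.index(head)  # raises ValueError if the head does not occur
    return form[:i] + body + form[i + 1:]


steps = [LIST_PLUS, LIST_MINUS, LIST_DIGIT, digit("9"), digit("5"), digit("2")]

form = ["list"]  # start symbol
derivation = [form]
for production in steps:
    form = replace_leftmost(form, production)
    derivation.append(form)

for sentential_form in derivation:
    print(" ".join(sentential_form))
```

The final sentential form contains only terminals, so the derivation is complete.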
Derivation & Parse Tree
Parse Tree:
• Parsing is the problem of taking a string of terminals and figuring out how to derive it from the start symbol of the grammar.
• If the string cannot be derived from the start symbol, the parser reports syntax errors within the string.
• Given a context-free grammar, a parse tree according to the grammar is a tree with the following properties:
  • The root is labeled by the start symbol.
  • Each leaf is labeled by a terminal or by ε.
  • Each interior node is labeled by a nonterminal.

Derivation & Parse Tree

Parse Tree:

If A → X1 X2 ... Xn is a production, then node A has immediate children X1, X2, ..., Xn, where each Xi is a terminal, a nonterminal, or ε.

Derivation & Parse Tree
Parse Tree:

• A parse tree shows how the start symbol of a grammar derives a string.
• If non-terminal A has a production A → XYZ, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z, from left to right.
Derivation & Parse Tree
Parse Tree:

• Each node in the tree is labeled by a grammar symbol.
• An interior node and its children correspond to a production.
• The children of the root are labeled, from left to right, by the symbols in the body of the production applied to the start symbol.
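The parse-tree properties listed above can be checked on a concrete tree. The Python sketch below builds the tree for 9-5+2 by hand under the list/digit grammar; the class Node and the helpers is_valid and frontier are illustrative names, not part of the lecture.

```python
from dataclasses import dataclass, field

NONTERMINALS = {"list", "digit"}
PRODUCTIONS = {
    ("list", ("list", "+", "digit")),
    ("list", ("list", "-", "digit")),
    ("list", ("digit",)),
} | {("digit", (d,)) for d in "0123456789"}


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def digit(d):
    """Subtree for the production digit -> d."""
    return Node("digit", [Node(d)])


# Parse tree for 9-5+2: the subtraction sits below the addition.
tree = Node("list", [
    Node("list", [Node("list", [digit("9")]), Node("-"), digit("5")]),
    Node("+"),
    digit("2"),
])


def is_valid(node):
    """Leaves are terminals; each interior node and its children form a production."""
    if not node.children:
        return node.label not in NONTERMINALS
    body = tuple(child.label for child in node.children)
    return (node.label, body) in PRODUCTIONS and all(is_valid(c) for c in node.children)


def frontier(node):
    """Read the leaves from left to right to recover the derived string."""
    if not node.children:
        return [node.label]
    return [symbol for child in node.children for symbol in frontier(child)]


print(is_valid(tree), "".join(frontier(tree)))
```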

Ambiguity
• A grammar is ambiguous if some string has more than one parse tree. Ambiguity is problematic because the same program can then be given more than one meaning.
• Ambiguity can be handled in several ways:
  • Enforce associativity and precedence
  • Rewrite the grammar (cleanest way)
• There are no general techniques for handling ambiguity.
• It is impossible to automatically convert every ambiguous grammar into an unambiguous one.

Ambiguity
Consider the grammar
string → string + string
       | string - string
       | 0 | 1 | … | 9
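The ambiguity of this grammar can be made concrete by enumerating every parse tree of 9-5+2. The Python sketch below is a brute-force illustration with assumed helper names (parses, evaluate); it is not an efficient parsing method.

```python
def parses(tokens):
    """Yield every parse tree of tokens under string -> string + string | string - string | digit."""
    if len(tokens) == 1 and tokens[0].isdigit():
        yield tokens[0]
        return
    # Try each operator position as the root of the tree.
    for i in range(1, len(tokens) - 1):
        if tokens[i] in ("+", "-"):
            for left in parses(tokens[:i]):
                for right in parses(tokens[i + 1:]):
                    yield (left, tokens[i], right)


def evaluate(tree):
    """Compute the value that a given parse tree assigns to the expression."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    left_value, right_value = evaluate(left), evaluate(right)
    return left_value + right_value if op == "+" else left_value - right_value


trees = list(parses(list("9-5+2")))
values = sorted(evaluate(tree) for tree in trees)
print(len(trees), values)
```

The two trees correspond to (9-5)+2 and 9-(5+2), which evaluate differently; this is why ambiguity affects meaning.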

Tree Terminology

• A tree consists of one or more nodes.
• Exactly one node is the root.
• If node N is the parent of node M, then M is a child of N.
• The children of one node are called siblings. They have an order, from left to right.
• A node with no children is called a leaf.
• A descendant of a node N is either N itself, a child of N, a child of a child of N, and so on.

Associativity of Operators

• Left-associative operators have left-recursive productions.
  • For instance,
  • list → list - digit | digit
  • The string 9-5-2 has the same meaning as (9-5)-2.
• Right-associative operators have right-recursive productions.
  • For instance, see the grammar below:
  • right → letter = right | letter
  • The string a=b=c has the same meaning as a=(b=c).
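The two grammars above lead to differently shaped trees. The Python sketch below builds them as nested tuples, assuming well-formed single-character input with no error handling; parse_list, parse_right, and subtract are illustrative names.

```python
def parse_list(tokens):
    """list -> list - digit | digit: the tree grows down to the left."""
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        # Everything parsed so far becomes the left operand.
        tree = (tree, tokens[i], tokens[i + 1])
    return tree


def parse_right(tokens):
    """right -> letter = right | letter: the tree grows down to the right."""
    if len(tokens) == 1:
        return tokens[0]
    # The rest of the input becomes the right operand.
    return (tokens[0], tokens[1], parse_right(tokens[2:]))


def subtract(tree):
    """Evaluate a tree whose operators are all subtraction."""
    if isinstance(tree, str):
        return int(tree)
    left, _, right = tree
    return subtract(left) - subtract(right)


left_tree = parse_list(list("9-5-2"))
right_tree = parse_right(list("a=b=c"))
print(left_tree, subtract(left_tree))
print(right_tree)
```

Grouping 9-5-2 to the right instead would give 9-(5-2), a different value, so the choice of recursion direction in the grammar matters.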

Operator Precedence

• Consider the expression 9+5*2.
• There are two possible interpretations of this expression:
  • (9+5)*2 or 9+(5*2)
• The associativity rules for + and * apply to occurrences of the same operator, so they do not resolve this ambiguity.
• A grammar for arithmetic expressions can be constructed from a table showing the associativity and precedence of operators.

Operator Precedence
• Let's see an example with the four common arithmetic operators and a precedence table showing the operators in order of increasing precedence.
  • left-associative: + -
  • left-associative: * /
• Now we create two nonterminals, expr and term, for the two levels of precedence, and an extra nonterminal, factor, for generating basic units in expressions.
• The basic units in expressions are presently digits and parenthesized expressions.
  • factor -> digit | (expr)

Operator Precedence
• Now consider the binary operators * and /, which have the highest precedence and are left-associative.
  • term -> term * factor | term / factor | factor
• Similarly, expr generates lists of terms separated by the additive operators.
  • expr -> expr + term | expr - term | term
• The final grammar is
  • expr -> expr + term | expr - term | term
  • term -> term * factor | term / factor | factor
  • factor -> digit | (expr)
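The final grammar maps naturally onto one function per nonterminal. The Python sketch below is a hedged illustration: it evaluates the expression directly, and it replaces the left-recursive productions with loops, a common workaround that keeps left associativity. The name evaluate_expression and the single-character tokenization are assumptions.

```python
def evaluate_expression(text):
    """Evaluate an expression using one function per nonterminal: expr, term, factor."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # factor -> digit | (expr)
        token = peek()
        if token is not None and token in "0123456789":
            return int(advance())
        if token == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # term -> term * factor | term / factor | factor, written as a loop
        value = factor()
        while peek() in ("*", "/"):
            op = advance()
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        # expr -> expr + term | expr - term | term, written as a loop
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate_expression("9+5*2"), evaluate_expression("(9+5)*2"))
```

Because term is called from inside expr, multiplication and division bind more tightly than addition and subtraction, so 9+5*2 is read as 9+(5*2).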

Solve It

“A tree of branches, you must construct,


A puzzle to solve, it's not abrupt.
Paths may fork, with no clear direction,
Follow them wrong, you'll face rejection.

What am I?”