1 Syntax Analyzer

Introduction to Syntax Analysis

Syntax Analysis
The parser (syntax analyzer) receives the source code in
the form of tokens from the lexical analyzer and performs
syntax analysis, which creates a tree-like intermediate
representation that depicts the grammatical structure of the
token stream.

Syntax analysis is also called parsing.


A typical representation is an abstract syntax tree, in which
each interior node represents an operation, and
the children of the node represent the arguments of the operation.
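For example, the expression a * (b + c) is represented by a tree whose root is the
* operation, with a and the subtree for b + c as its children. A minimal sketch of
such a tree in Python (the class and field names are illustrative, not from the slides):

from dataclasses import dataclass

@dataclass
class Node:
    op: str       # the operation at this interior node, e.g. "*" or "+"
    args: list    # the arguments: operand names (leaves) or nested Nodes

# Abstract syntax tree for  a * (b + c)
ast = Node("*", ["a", Node("+", ["b", "c"])])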
Position of Syntax Analyzer
Role of Parser

Parser
Checks the stream of words and their parts of speech
(produced by the scanner) for grammatical correctness
Determines if the input is syntactically well formed
Guides checking at deeper levels than syntax (static
semantics checking)
Builds an IR representation of the code
Study of Parsing

Parser
The parser needs
the syntax of programming language constructs, which can be
specified by context-free grammars or BNF (Backus-Naur Form), and
an algorithm for testing membership in the language of the grammar.

Roadmap
The roadmap for the study of parsing
Context-free grammars and derivations

Top-down parsing
Recursive descent (predictive parsing)
LL (Left-to-right, Leftmost derivation) methods

Bottom-up parsing
Operator precedence parsing
LR (Left-to-right, Rightmost derivation) methods
SLR, canonical LR, LALR
Expressive Power of Different Parsing Techniques
Benefits Offered by Grammar

Grammars offer significant benefits for both language
designers and compiler writers:
A grammar gives a precise, yet easy-to-understand
syntactic specification to a programming language.
Parsers can automatically be constructed for certain
classes of grammars.
The parser-construction process can reveal syntactic
ambiguities and trouble spots.
A grammar imparts structure to a language.
The structure is useful for translating source programs
into correct object code and for detecting errors.
A grammar allows a language to be evolved.
New constructs can be integrated more easily into an
implementation that follows the grammatical structure of
the language.
Why Not Use RE/DFA?

Advantages of RE/DFA

Simple & powerful notation for specifying patterns


Automatic construction of fast recognizers
Many kinds of syntax can be specified with REs

Limits of RE/DFA

Finite automata cannot count, which means a finite automaton cannot
accept a language like {aⁿbⁿ | n ≥ 1}, which would require it to keep count
of the number of a’s before it sees the b’s.
Therefore, REs cannot check the balance of parentheses, brackets, or
begin-end pairs.
CFG vs. RE

Grammars are a more powerful notation than regular expressions.
Every construct that can be described by a regular expression
can be described by a grammar, but not vice-versa.
Every regular language is a context-free language, but
not vice-versa.
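For instance, the language {aⁿbⁿ | n ≥ 1} from the previous slide, which no regular
expression can describe, is generated by the two-production grammar S → a S b | a b.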
Context-Free Grammar

Definition
A context-free grammar (CFG) has four components:
A set of terminal symbols, sometimes referred to as "tokens."
A set of nonterminal symbols, sometimes called "syntactic variables."
One nonterminal is distinguished as the start symbol.

A set of productions of the form: LHS → RHS
where LHS (called the head, or left side) is a single nonterminal symbol, and
RHS (called the body, or right side) consists of zero or more terminals
and nonterminals.

The terminals are the elementary symbols of the language defined by the
grammar.
Nonterminals impose a hierarchical structure on the language that is key to
syntax analysis and translation.
Conventionally, the productions for the start symbol are listed first.
The productions specify the manner in which the terminals and
nonterminals can be combined to form strings.
CFG Example

A CFG Grammar

1 Expr → Expr Op Expr
2 Expr → number
3 Expr → id
4 Op → +
5 Op → -
6 Op → *
7 Op → /
where
Expr and Op are nonterminals
number, id, +, -, *, and / are terminals
Expr is the start symbol
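A grammar this small can also be written down directly as a data structure.
A possible sketch in Python (the representation is an illustration, not something
given in the slides):

# Each nonterminal maps to a list of alternative production bodies;
# terminals and nonterminals are both represented as plain strings.
grammar = {
    "Expr": [["Expr", "Op", "Expr"], ["number"], ["id"]],
    "Op":   [["+"], ["-"], ["*"], ["/"]],
}
start_symbol = "Expr"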
CFG Example

Productions with the same head can be grouped. Therefore,
the previous CFG grammar is equivalent to the one below.

Equivalent CFG Grammar

1 Expr → Expr Op Expr | number | id
2 Op → + | - | * | /
Another CFG Example

Grammar for simple arithmetic expressions

1 expr → expr + term | expr - term | term
2 term → term * factor | term / factor | factor
3 factor → ( expr ) | id
where
expr, term, and factor are nonterminals
id, +, -, *, /, (, and ) are terminals
expr is the start symbol
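The roadmap's recursive descent entry can be previewed against this grammar: a
hand-written recognizer might look like the Python sketch below (the function names
and token conventions are assumptions, not code from the slides). The left recursion
in expr and term is replaced by loops, which also matches the left associativity of
the operators.

def parse(tokens):
    # Recognizer for:
    #   expr   -> expr + term | expr - term | term
    #   term   -> term * factor | term / factor | factor
    #   factor -> ( expr ) | id
    # Tokens are assumed to be plain strings such as "id", "+", "(", ")".
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {peek()!r}")
        pos += 1

    def expr():                      # left recursion handled by a loop
        term()
        while peek() in ("+", "-"):
            eat(peek())
            term()

    def term():
        factor()
        while peek() in ("*", "/"):
            eat(peek())
            factor()

    def factor():
        if peek() == "(":            # factor -> ( expr )
            eat("(")
            expr()
            eat(")")
        else:                        # factor -> id
            eat("id")

    expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return True

# Example: the token stream for  id + id * ( id - id )
parse(["id", "+", "id", "*", "(", "id", "-", "id", ")"])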
Notational Conventions

To avoid confusion between terminals and nonterminals, the following
notational conventions for grammars will be used.

terminal symbols

lowercase letters like a, b, c.
digits, operator and punctuation symbols, such as +, *, (, ), 0, 1, ..., 9.
Boldface strings such as id or if, each of which represents a single
terminal symbol.

nonterminal symbols

uppercase letters early in the alphabet, such as A, B, C.
lowercase italic names such as expr or stmt.
Specific nonterminals begin with an uppercase letter, such as Expr or Stmt.
Unless stated otherwise, the head of the first production is the start symbol.
Notational Conventions (cont)

To avoid confusion between terminals and nonterminals, the following
notational conventions for grammars will be used.

Grammar symbols (i.e. either terminal or nonterminal)

uppercase letters late in the alphabet, such as X, Y, Z.
lowercase Greek letters, for example α, β, γ, represent strings of grammar
symbols. Thus, a production can be written as A → α.
Derivations

Derivations
A grammar derives strings by beginning with the start symbol and repeatedly
replacing a nonterminal by the body of a production for that nonterminal. This
sequence of replacements is called a derivation.

Derivation Example
Given the grammar:
1 exp → exp op exp | ( exp ) | number
2 op → + | - | *
The following is a derivation for an expression. At each step the grammar rule
choice used for the replacement is given on the right.
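For instance, one possible derivation of the expression number + number, with the
rule used at each step shown on the right:

exp ⇒ exp op exp          [exp → exp op exp]
    ⇒ number op exp       [exp → number]
    ⇒ number + exp        [op → +]
    ⇒ number + number     [exp → number]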
Context-Free Language

New Notations: ⇒* and ⇒+

α₁ ⇒* αₙ means α₁ derives αₙ in zero or more steps.
α₁ ⇒+ αₙ means α₁ derives αₙ in one or more steps.

Definition

If S ⇒* α, where S is the start symbol of grammar G, then α is called a
sentential form of G. A sentential form may contain both terminals and
nonterminals.
A sentence of G is a sentential form with no nonterminals.
The language generated by a grammar G is its set of sentences, denoted
as L(G).
A language that can be generated by a context-free grammar is said to be a
context-free language.
If two grammars generate the same language, the grammars are said to be
equivalent.
The process of discovering a derivation is called parsing.
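For example, with the grammar above (exp → exp op exp | ( exp ) | number,
op → + | - | *), the string number + exp is a sentential form because
exp ⇒* number + exp and it still contains a nonterminal, while
number + number is a sentence and therefore a member of L(G).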
Leftmost and Rightmost Derivations

The point of parsing is to construct a derivation.
At each step, we choose a nonterminal to replace.
Different choices can lead to different derivations.
Two derivations are of interest:
Leftmost derivation - replace the leftmost nonterminal at
each step, denoted as ⇒lm.
Rightmost derivation - replace the rightmost nonterminal at
each step, denoted as ⇒rm.
Leftmost and rightmost are the two systematic
derivations. We don’t care about randomly-ordered
derivations!
Leftmost and Rightmost Derivations

Leftmost Derivation of (number - number)*number
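One way to write this out, using the expression grammar exp → exp op exp | ( exp ) | number
and op → + | - | *, replacing the leftmost nonterminal at every step:

exp ⇒lm exp op exp
    ⇒lm ( exp ) op exp
    ⇒lm ( exp op exp ) op exp
    ⇒lm ( number op exp ) op exp
    ⇒lm ( number - exp ) op exp
    ⇒lm ( number - number ) op exp
    ⇒lm ( number - number ) * exp
    ⇒lm ( number - number ) * number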

Rightmost Derivation of (number - number)*number
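Similarly, replacing the rightmost nonterminal at every step:

exp ⇒rm exp op exp
    ⇒rm exp op number
    ⇒rm exp * number
    ⇒rm ( exp ) * number
    ⇒rm ( exp op exp ) * number
    ⇒rm ( exp op number ) * number
    ⇒rm ( exp - number ) * number
    ⇒rm ( number - number ) * number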


Parse Trees

Definition
A parse tree is a labeled tree representation of a derivation
that filters out the order in which productions are applied to
replace nonterminals.
The interior nodes are labeled by nonterminals
The leaf nodes are labeled by terminals
The children of each internal node A are labeled, from
left to right, by the symbols in the body of the production
by which this A was replaced during the derivation.

Since a parse tree ignores variations in the order in which symbols in
sentential forms are replaced, there is a many-to-one relationship between
derivations and parse trees.
Leftmost and Rightmost Derivations

The following is the parse tree for the two derivations
discussed here.
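Both derivations correspond to the same tree; sketched in indented form, it might
look like this:

exp
├── exp
│   ├── (
│   ├── exp
│   │   ├── exp ── number
│   │   ├── op ── -
│   │   └── exp ── number
│   └── )
├── op ── *
└── exp ── number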
Ambiguous Grammars

Definition
A grammar that produces more than one parse tree for
some sentence is said to be ambiguous. Such a grammar is
called an ambiguous grammar.

Put another way,


If a grammar has more than one leftmost derivation for
a single sentential form, the grammar is ambiguous.
If a grammar has more than one rightmost derivation
for a single sentential form, the grammar is ambiguous.
Ambiguous Grammars

The grammar:
1 exp → exp op exp | number | id
2 op → + | - | * | /
is ambiguous because there are two different parse trees for the sentence
id - number*id.
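Sketched in indented form, the two trees differ in which operator ends up at the root:

Tree 1 (- at the root, reading id - (number*id)):
exp
├── exp ── id
├── op ── -
└── exp
    ├── exp ── number
    ├── op ── *
    └── exp ── id

Tree 2 (* at the root, reading (id - number)*id):
exp
├── exp
│   ├── exp ── id
│   ├── op ── -
│   └── exp ── number
├── op ── *
└── exp ── id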
Solving Ambiguity

There are two basic methods to deal with ambiguities.


Approach 1: Disambiguating Rule
State a rule that specifies in each ambiguous case which of
the parse trees is the correct one. Such a rule is called a
disambiguating rule.
Advantage: No need to change the grammar itself
Disadvantage: the syntactic structure of the language is
no longer given by the grammar alone.

Approach 2: Rewriting Grammar


Change the grammar into a form that forces the construction
of the correct parse tree, thus removing the ambiguity.
Precedence and Associativity

Ambiguous Grammar

1 exp → exp op exp | number | id
2 op → + | - | * | /

To use Approach 1 to remove ambiguity from the above ambiguous grammar, the
following disambiguating rules are defined:
all operators (+, -, *, /) are left associative.
+ and - have the same precedence
* and / have the same precedence
* and / have higher precedence than + and -.

Based on these rules, which parse tree in this slide is correct?


Precedence and Associativity

Ambiguous Grammar

1 exp → exp op exp | number | id
2 op → + | - | * | /

We can add precedence to the above ambiguous grammar to remove ambiguity.


To add precedence:
Group Operators into Precedence Levels
Create a nonterminal for each level of precedence
Make operators left-, right-, or non-associative. The position of the recursion
relative to the operator dictates the associativity:
Left (right) recursion → left (right) associativity
None: do not recurse; simply reference the next-higher-precedence
nonterminal on both sides of the operator
Isolate the corresponding levels of the grammar
Force the parser to recognize high precedence subexpressions first
Precedence and Associativity

The figure below demonstrates how to add precedence to a grammar.
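Applying these steps to the ambiguous expression grammar gives a layered grammar
much like the simple arithmetic grammar shown earlier: one nonterminal per
precedence level, with left recursion giving left associativity. A sketch:

exp    → exp + term | exp - term | term            (lowest precedence: + and -)
term   → term * factor | term / factor | factor    (next level: * and /)
factor → ( exp ) | number | id                     (highest level: operands and parenthesized subexpressions)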
Dangling Else Problem

Dangling Else Grammar


stmt → if expr then stmt
     | if expr then stmt else stmt
     | other

The above grammar is ambiguous since the string
if E1 then if E2 then S1 else S2 has the two parse trees shown below.
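In one tree the else is attached to the inner if, so the string reads as
if E1 then (if E2 then S1 else S2); in the other the else is attached to the
outer if, reading as if E1 then (if E2 then S1) else S2.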
Dangling Else Problem

There are two ways to solve the dangling else problem.

Approach 1: Create the following disambiguating rule:
            Match each else with the closest unmatched then.
Approach 2: Rewrite the grammar so that the disambiguating rule
            can be incorporated directly into the grammar.
Dangling Else Problem

The following explains the idea of rewriting the dangling-else grammar to remove
the ambiguity.
A statement appearing between a then and an else must be "matched"; that
is, the interior statement must not end with an unmatched or open then.
A matched statement is either an if-then-else statement containing no open
statements or it is any other kind of unconditional statement.

Rewritten Grammar without Ambiguity

stmt → matched_stmt
     | open_stmt

matched_stmt → if expr then matched_stmt else matched_stmt
             | other

open_stmt → if expr then stmt
          | if expr then matched_stmt else open_stmt
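With this grammar, the string if E1 then if E2 then S1 else S2 has only one parse:
the outer if can only come from open_stmt → if expr then stmt, because the statement
between a then and an else must be a matched_stmt and if E2 then S1 is not matched.
The inner if E2 then S1 else S2 is then derived as a matched_stmt, so the else is
attached to the inner if, exactly as the disambiguating rule requires.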
Dangling Else Problem

With regard to the dangling else problem:

Rewriting the grammar is usually not done. Instead, the
disambiguating rule is preferred.
The principal reason is that parsing methods are easy
to configure in such a way that the most closely nested
rule is obeyed.
Another reason is the added complexity of the new
grammar.
Class Problem

Ambiguous Grammar
S → S + S
  | S - S
  | S * S
  | S / S
  | ( S )
  | - S
  | S ^ S
  | number

Precedence (high to low): (), unary -  >  ^  >  *, /  >  +, -
Associativity: ^ is right-associative; the rest are left-associative.
