Bedasa
Introduction
Compiler Design
Definition
A compiler is a program that translates a source program into an equivalent program in the target language.
Source program -> Compiler -> Target program (the compiler also reports error messages)
A source program/code is a program/code written in the source language, which is usually a high-level language.
A target program/code is a program/code written in the target language, which is often a machine language or an ...
As a discipline, compiler design involves multiple computer science areas, including:
Assembly language
Software Engineering
Computer Architecture
Why Study the Theory of Compilers?
Curiosity
Prerequisite for developing advanced compilers, which continues to be an active area as new computer architectures emerge
with a single pass compilers, for example most latest
3. Load-and-Go Compilers: generate machine code and then immediately execute it.
and Link/Load.
computer program.
assembler -> linker/loader (with library files) -> target machine code
C. Linker: a program that takes one or more object files generated by a compiler and combines them into a single executable program.
D. Loader: the part of an operating system that is responsible for loading programs from executables (i.e.,
Compiler vs. Interpreter
Ideal concept:
Compiler:
Source code -> Compiler -> Executable
Input data -> Executable -> Output data
Interpreter:
Source code + Input data -> Interpreter -> Output data
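The two ideal pictures can be sketched in Python. This is only an illustration under stated assumptions: the tuple-based mini-language and the names interpret, compile_expr, source, and executable are hypothetical and do not come from the slides, and a Python function stands in for the executable.

```python
# Hypothetical mini-language: an expression is a number, the name "x",
# or a tuple (op, left, right) with op being "+" or "*".

def interpret(expr, x):
    """Interpreter: processes the source code and the input data together."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = interpret(left, x), interpret(right, x)
    return a + b if op == "+" else a * b


def compile_expr(expr):
    """Compiler: translates the source once into an 'executable' (a function)."""
    if expr == "x":
        return lambda x: x
    if isinstance(expr, (int, float)):
        return lambda x: expr
    op, left, right = expr
    f, g = compile_expr(left), compile_expr(right)
    if op == "+":
        return lambda x: f(x) + g(x)
    return lambda x: f(x) * g(x)


source = ("+", ("*", "x", 2), 1)    # represents x * 2 + 1
executable = compile_expr(source)   # translation happens once, before any input

print(interpret(source, 5))         # source and input data go in together
print(executable(5))                # only input data is needed at run time
```

Both calls compute the same value; the difference is that the compiled form can be run on new input data without looking at the source again.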
the other:
translated code.
End
together meaningfully
Checks the source program for semantic errors
Phases of Compilation
General Structure of a Compiler
Stream of characters
  -> scanner -> Stream of tokens
  -> parser -> Parse/syntax tree
  -> Semantic analyzer -> Annotated tree
  -> Intermediate code generator -> Intermediate code
  -> Code optimization -> Intermediate code
  -> Code generator -> Target code
  -> Code optimization -> Target code
Phase I: Lexical Analysis
The low-level text processing portion of the compiler
The source file, a stream of characters, is broken into larger
chunks called tokens.
For example:
    void main()
    {
    int x;
    x=3;
    }
It will be broken into 13 tokens as below:
    void main ( ) { int x ; x = 3 ; }
The lexical analyzer (scanner) reads a stream of characters
and puts them together into some meaningful (with respect to
the source language) units called tokens.
Typically, spaces, tabs, end-of-line characters and comments are ignored by the lexical analyzer.
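The scanning step can be sketched with a small regular-expression tokenizer. This is a minimal sketch, not the scanner of any real compiler: the token classes, the names TOKEN_RE and tokenize, and the comment syntax being skipped are assumptions chosen to fit the void main() example.

```python
import re

# One alternative per token class; the "skip" class is matched but never emitted.
TOKEN_RE = re.compile(r"""
    (?P<skip>\s+|/\*.*?\*/|//[^\n]*)   # whitespace and comments
  | (?P<word>[A-Za-z_]\w*)             # keywords and identifiers
  | (?P<number>\d+)                    # integer literals
  | (?P<symbol>[(){};=])               # punctuation and operators
""", re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Turn a stream of characters into a list of token strings."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append(match.group())
        pos = match.end()
    return tokens


source = "void main()\n{\n    int x;  // declare x\n    x=3;\n}\n"
tokens = tokenize(source)
print(len(tokens), tokens)
```

The comment and all white space disappear from the output, leaving the 13 tokens listed in the example.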
program syntax
Root node is entire program and leaves are tokens that were
Grammar (CFG)
Syntax structures are analyzed by a DPDA (Deterministic Pushdown Automaton)
Phase III: Semantic Analysis
It gets the parse tree from the parser together with
information about some syntactic elements
It determines whether the semantics (meaning) of the program is correct.
It detects errors in the program, such as using variables before they are declared or assigning an integer value to a Boolean variable.
This part deals with static semantics:
the semantics of programs that can be checked by reading the program text only.
the syntax of the language that cannot be described by a context-free grammar.
Mostly, a semantic analyzer does type checking (i.e. Gathers
The main tool used by the semantic analyzer is a symbol table
Symbol table: a data structure with a record for each identifier and its attributes
Attributes include storage allocation, type, scope, etc.
All the compiler phases insert into and modify the symbol table
Discovery of meaning in a program using the symbol table:
Do static semantics checks
Simplify the structure of the parse tree (from parse tree to abstract syntax tree (AST))
Static semantics checks:
Making sure identifiers are declared before use
Type checking for assignments and operators
Checking types and number of parameters to subroutines
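Two of these static checks (declaration before use, and type checking of assignments) can be sketched with a dictionary as the symbol table. Everything here is an assumption made for illustration: the statement tuples, the names check and SemanticError, and the restriction to int and bool literals are hypothetical, and a real symbol table would also track scope and storage.

```python
class SemanticError(Exception):
    """Raised when a static semantic check fails (hypothetical name)."""


def check(statements):
    """Check a list of ("declare", type, name) / ("assign", name, literal) statements.

    Returns the symbol table: one record per identifier with its attributes.
    """
    symbol_table = {}
    for stmt in statements:
        if stmt[0] == "declare":
            _, type_name, name = stmt
            if name in symbol_table:
                raise SemanticError(f"'{name}' is declared twice")
            symbol_table[name] = {"type": type_name}
        elif stmt[0] == "assign":
            _, name, value = stmt
            if name not in symbol_table:
                raise SemanticError(f"'{name}' is used before it is declared")
            expected = symbol_table[name]["type"]
            # bool must be tested first because Python's bool is a subclass of int
            actual = "bool" if isinstance(value, bool) else "int"
            if expected != actual:
                raise SemanticError(f"cannot assign {actual} to {expected} '{name}'")
        else:
            raise ValueError(f"unknown statement kind {stmt[0]!r}")
    return symbol_table


program = [("declare", "int", "x"), ("assign", "x", 3)]
print(check(program))
```

Assigning to an undeclared name, or assigning the integer 3 to a variable declared as bool, raises SemanticError in this sketch.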
Phase IV: Intermediate Code Generation
An intermediate code generator
takes a parse tree from the semantic analyzer
generates a program in the intermediate language.
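As an illustration, the sketch below walks a tree and emits three-address code. The slides do not say which intermediate language is used, so three-address code is an assumption, as are the tuple tree format and the names gen_tac and to_three_address.

```python
import itertools


def gen_tac(node, code, temps):
    """Append instructions for `node` to `code`; return the operand holding its value."""
    if isinstance(node, str):        # a variable name or literal is already an operand
        return node
    op, left, right = node
    left_operand = gen_tac(left, code, temps)
    right_operand = gen_tac(right, code, temps)
    if op == "=":
        code.append(f"{left_operand} = {right_operand}")
        return left_operand
    temp = f"t{next(temps)}"         # fresh temporary for each intermediate result
    code.append(f"{temp} = {left_operand} {op} {right_operand}")
    return temp


def to_three_address(tree):
    """Translate an (op, left, right) tree into a list of three-address instructions."""
    code = []
    gen_tac(tree, code, itertools.count(1))
    return code


tree = ("=", "A", ("+", "B", ("*", "C", "2")))   # A = B + C * 2
print("\n".join(to_three_address(tree)))
# t1 = C * 2
# t2 = B + t1
# A = t2
```

Each instruction has at most one operator, which is what makes this form convenient for the later optimization and code generation phases.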
In some compilers, a source program is translated into an
are required.
To do code generation
assembly
tree) and
Output the actual assembly code associated with the tiles that we used to cover the tree
Phase VI: Machine Code Generation and Linking
The final phase of compilation converts the assembly code into machine code and links in (by a linker) the appropriate language libraries
Code Optimization
Replacing an inefficient sequence of instructions with a better
sequence of instructions.
Sometimes called code improvement.
Code optimization can be done:
after semantic analysis
    performed on a parse tree
after intermediate code generation
    performed on intermediate code
after code generation
    performed on target code
1. Constant evaluation
2. Strength reduction
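These two optimizations can be sketched on a small expression tree. This is a minimal sketch under assumptions: the tuple tree format and the name optimize are hypothetical, only "+" and "*" are handled, and strength reduction is limited to rewriting a multiplication of a variable by 2 as an addition.

```python
def optimize(node):
    """Apply constant evaluation and a simple strength reduction bottom-up."""
    if not isinstance(node, tuple):
        return node                       # a variable name or an integer constant
    op, left, right = node
    left, right = optimize(left), optimize(right)

    # Constant evaluation: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        return left + right if op == "+" else left * right

    # Strength reduction: replace x * 2 (or 2 * x) with the cheaper x + x.
    # Restricted to plain variables so no sub-expression is computed twice.
    if op == "*" and right == 2 and isinstance(left, str):
        return ("+", left, left)
    if op == "*" and left == 2 and isinstance(right, str):
        return ("+", right, right)

    return (op, left, right)


print(optimize(("+", ("*", "x", ("+", 1, 1)), ("*", 3, 4))))
# ('+', ('+', 'x', 'x'), 12)
```

Here 1 + 1 and 3 * 4 are evaluated at compile time, and the resulting x * 2 is then replaced by x + x.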
Global Optimization
Much more difficult; usually omitted from all but the most
compilers”
Optimization cannot make an inefficient algorithm efficient
The Phases of a Compiler
Phase                               Output                      Sample
Programmer (source code producer)   Source string               A=B+C;
Scanner (performs lexical           Token string and symbol     ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’
analysis)                           table with names
Parser (performs syntax analysis    Parse tree or abstract        ;
based on the grammar of the         syntax tree                   |
programming language)                                             =
                                                                 / \
                                                                A   +
                                                                   / \
                                                                  B   C
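The parser row of the table can be reproduced with a tiny recursive-descent sketch that takes the scanner's token string and builds the same tree. The grammar in the comments and the name parse_statement are assumptions made for this example only; they are not the grammar of any particular language.

```python
def parse_statement(tokens):
    """Parse tokens such as ['A', '=', 'B', '+', 'C', ';'] into a nested tuple tree.

    Assumed grammar (hypothetical, just enough for the sample):
        statement -> NAME '=' expr ';'
        expr      -> NAME ('+' NAME)*
    """
    pos = 0

    def next_token():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def name():
        token = next_token()
        if not token.isalpha():
            raise SyntaxError(f"expected a name, got {token!r}")
        return token

    def expect(symbol):
        token = next_token()
        if token != symbol:
            raise SyntaxError(f"expected {symbol!r}, got {token!r}")

    target = name()
    expect("=")
    expr = name()
    while pos < len(tokens) and tokens[pos] == "+":
        next_token()                      # consume '+'
        expr = ("+", expr, name())        # left-associative addition
    expect(";")
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after ';'")
    return (";", ("=", target, expr))


tree = parse_statement(["A", "=", "B", "+", "C", ";"])
print(tree)
# (';', ('=', 'A', ('+', 'B', 'C')))
```

The nested tuples mirror the tree in the table: ';' at the root, '=' below it, and '+' with leaves B and C as the right child of '='.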
Compiler Construction Tools
Assignment I
Compiler Design Tools
History of Compilers