Lecture 01
Compiler Construction
Sehrish Saleem
MNS University of Engineering & Technology, Multan
Course Organization
General course information
Homework and project information
2
General Information
Instructor: Sehrish Saleem
Lectures: 45
Text: Compilers: Principles, Techniques, and Tools by Aho, Sethi and Ullman
3
Work Distribution
Theory
• Homeworks
• Exams
Practice
• Build a compiler
4
Project
Source language: subset of Java
Generated code: Intel x86 assembly
Implementation language: C++
Six programming assignments
5
Why Take this Course
Reason #1: understand compilers and languages
• understand the code structure
• understand language semantics
• understand relation between source code and generated machine code
• become a better programmer
6
Why Take this Course
Reason #2: nice balance of theory and practice
Theory
• mathematical models: regular expressions, automata, grammars, graphs
• algorithms that use these models
7
Why Take this Course
Reason #2: nice balance of theory and practice
Practice
• Apply theoretical notions to build a real compiler
8
Why Take this Course
Reason #3: programming experience
• write a large program which manipulates complex data structures
• learn more about C++ and Intel x86
9
What are Compilers
Compilers translate information from one representation to another
Usually the information is a program
10
Examples
Typical Compilers:
• VC, VC++, GCC, JavaC
• FORTRAN, Pascal, VB(?)
Translators
• Word to PDF
• PDF to Postscript
11
In This Course
We will study typical compilation: from programs written in high-level languages to low-level object code and machine code
12
Typical Compilation
High-level source code → Compiler → Low-level machine code
13
Source Code
int expr( int n )
{
    int d;
    d = 4*n*n*(n+1)*(n+1);
    return d;
}
14
Source Code
Optimized for human
readability
Matches human notions of
grammar
Uses named constructs such
as variables and procedures
15
Assembly Code
.globl _expr
_expr:
    pushl %ebp
    movl %esp,%ebp
    subl $24,%esp
    movl 8(%ebp),%eax
    movl %eax,%edx
    leal 0(,%edx,4),%eax
    movl %eax,%edx
    imull 8(%ebp),%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl %edx,-4(%ebp)
    movl -4(%ebp),%edx
    movl %edx,%eax
    jmp L2
    .align 4
L2:
    leave
    ret
16
Assembly Code
Optimized for hardware
Consists of machine instructions
Uses registers and unnamed memory locations
Much harder for humans to understand
17
How to Translate
Correctness: the generated machine code must execute precisely the same computation as the source code
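One way to get a feel for this requirement is to replay the assembly listing for expr by hand and compare the result with the source function. The C++ sketch below does that. It assumes the instructions run in the order given in the comments, models the registers %eax and %edx and the stack slot -4(%ebp) as ordinary variables, and uses the made-up helper names expr_by_hand and same_on_range. Agreement on a range of inputs is evidence of correctness, not a proof.

```cpp
#include <cassert>

// The function from the "Source Code" slide.
int expr(int n) {
    int d;
    d = 4 * n * n * (n + 1) * (n + 1);
    return d;
}

// Hypothetical helper: replays the body of the assembly listing one
// instruction at a time, with plain variables standing in for registers.
int expr_by_hand(int n) {
    assert(n >= 0 && n <= 100);  // stay far away from int overflow
    int eax, edx, d;
    eax = n;          // movl 8(%ebp),%eax    load the argument n
    edx = eax;        // movl %eax,%edx
    eax = 4 * edx;    // leal 0(,%edx,4),%eax 4*n without a multiply
    edx = eax;        // movl %eax,%edx
    edx = edx * n;    // imull 8(%ebp),%edx   4*n*n
    eax = n;          // movl 8(%ebp),%eax
    eax = eax + 1;    // incl %eax            n+1
    edx = edx * eax;  // imull %eax,%edx      4*n*n*(n+1)
    eax = n;          // movl 8(%ebp),%eax
    eax = eax + 1;    // incl %eax            n+1 is recomputed
    edx = edx * eax;  // imull %eax,%edx      4*n*n*(n+1)*(n+1)
    d = edx;          // movl %edx,-4(%ebp)   store into d
    edx = d;          // movl -4(%ebp),%edx   reload d
    eax = edx;        // movl %edx,%eax       return value goes in %eax
    return eax;
}

// Hypothetical helper: true if both versions agree for every n in [lo, hi].
bool same_on_range(int lo, int hi) {
    for (int n = lo; n <= hi; ++n) {
        if (expr(n) != expr_by_hand(n)) return false;
    }
    return true;
}
```

Calling same_on_range(0, 100) from a test program is expected to return true. The input range is capped because signed overflow is undefined in C++, while the machine instructions simply wrap around.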
18
How to Translate
Is there a unique translation? No!
Is there an algorithm for an “ideal translation”? No!
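A small illustration of why the translation is not unique: the same source expression can be evaluated by different operation sequences that produce the same result. The C++ sketch below uses two functions as stand-ins for two possible instruction sequences. The names expr_a, expr_b and variants_agree are made up for this example, and neither variant is claimed to be what any particular compiler emits.

```cpp
#include <cassert>

// Variant A: follows the source expression literally,
// using four multiplications.
int expr_a(int n) {
    return 4 * n * n * (n + 1) * (n + 1);
}

// Variant B: computes n*(n+1) once, squares it, and replaces the
// multiplication by 4 with a left shift by 2.
int expr_b(int n) {
    int t = n * (n + 1);
    return (t * t) << 2;
}

// Hypothetical helper: true if both variants agree for every n in [lo, hi].
// The range is restricted so that no intermediate value overflows an int.
bool variants_agree(int lo, int hi) {
    assert(lo >= 0 && hi <= 100);
    for (int n = lo; n <= hi; ++n) {
        if (expr_a(n) != expr_b(n)) return false;
    }
    return true;
}
```

Both variants are correct translations of the same expression, but they differ in the number and kind of operations. Which one is better depends on the target machine, which is one reason no single translation can be called ideal.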
19
How to Translate
Translation is a complex process
• source language and generated code are very different
Need to structure the translation
20