Compiler Lecture-1

Uploaded by

Md. Abdul Mukit

CSE-361:Compiler Design

TAHSINA HASHEM
Lecturer
CSE,JU
TEXTBOOK
 Compilers: Principles, Techniques, and Tools
 Aho, Lam, Sethi, Ullman
 Modern Compiler Implementation in C (The Tiger Book).
 Andrew W. Appel
GRADING POLICY
Attendance 10%
Assignment 5%
Class Test 20% (Best two of three)
(includes exercises and lecture materials)
Surprise test 5%
(at the start or end of a lecture)
==========================================================
Total 40%
Final Exam 60%
(includes exercises and lecture materials)
LANGUAGE PROCESSOR
 A computer understands instructions in machine code, i.e. in the form of 0s and 1s. It is a tedious
task to write a computer program directly in machine code.
 Programs are mostly written in high-level languages such as Java, C++ and Python; such a program is
called source code. Source code cannot be executed directly by the computer and must first be
converted into machine language.

 Hence a special piece of translator system software, called a language processor, is used to translate
a program written in a high-level language into machine code. The translated program is called the
object program (or object code).
The language processors can be any of the following three types:
1. Compiler
2. Interpreter
3. Assembler
LANGUAGE PROCESSOR: COMPILER
 A compiler is a program that reads an entire program written in a source
language (a high-level language) in one go and translates it into an
equivalent program in a target language.
LANGUAGE PROCESSOR: INTERPRETER
 An interpreter is another common kind of language processor.
 Instead of producing a target program as a translation, an interpreter appears to
directly execute the operations specified in the source program on inputs
supplied by the user.

[Figure: the source program and the input are both fed to the interpreter, which produces the output and any error messages]
COMPILER VS. INTERPRETER
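A minimal sketch of the contrast, assuming a toy expression language written as nested tuples and an invented stack machine (neither comes from the lecture). `interpret` executes the source directly on the inputs, while `compile_to_stack` first translates the whole program into a target program that `run` executes afterwards.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def interpret(expr, env):
    """Interpreter: execute the source expression directly on the inputs."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](interpret(left, env), interpret(right, env))
    if isinstance(expr, str):          # a variable name
        return env[expr]
    return expr                        # a numeric constant

def compile_to_stack(expr):
    """Compiler: translate the whole expression into a target program."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return compile_to_stack(left) + compile_to_stack(right) + [("APPLY", op)]
    if isinstance(expr, str):
        return [("LOAD", expr)]
    return [("PUSH", expr)]

def run(code, env):
    """Execute a target program on the hypothetical stack machine."""
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        elif instr == "LOAD":
            stack.append(env[arg])
        else:                          # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

program = ("+", "a", ("*", "b", 10))   # a + b * 10
inputs = {"a": 1, "b": 2}
target = compile_to_stack(program)
print(interpret(program, inputs))
print(target)
print(run(target, inputs))
```

Both routes give the same answer; the difference is that the compiled `target` can be kept and run again on new inputs without looking at the source.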
HYBRID COMPILER
 Java language processors combine compilation and interpretation.
HYBRID COMPILER
 A Java source program may first be compiled into an intermediate
form called bytecodes.
 The bytecodes are then interpreted by a virtual machine.

 A benefit of this arrangement is that bytecodes compiled on one machine can be interpreted on
another machine, perhaps across a network.

 In order to achieve faster processing of inputs to outputs, some Java compilers, called
just-in-time compilers, translate the bytecodes into machine language immediately before they
run the intermediate program to process the input.
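The same compile-then-interpret arrangement can be observed in CPython, used here only as an analogy to the Java case. The sketch below relies on the standard `compile`, `eval` and `dis` facilities; the expression and the input values are made up for illustration.

```python
import dis

source = "a + b * 10"
code = compile(source, "<example>", "eval")   # source text -> bytecode object
dis.dis(code)                                 # list the bytecode instructions
result = eval(code, {}, {"a": 1, "b": 2})     # the virtual machine interprets the bytecode
print(result)
```

The exact instruction names printed by `dis` differ between Python versions, which is why no listing is shown here.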
LANGUAGE PROCESSOR: ASSEMBLER
 An assembler translates a program written in assembly language into machine code.
Assembly language is machine dependent, yet the mnemonics it uses to represent
instructions are not directly understandable by the machine, so they must be translated.
A high-level language, by contrast, is machine independent.

OTHER LANGUAGE PROCESSORS:
COMPILATION TASK IS FULL OF VARIETY
 Thousands of source languages
• Fortran, Pascal, C, C++, Java, ……
 Thousands of target languages
• Some other lower level language (assembly language), machine language
 Compilation process has similar variety
• Single pass, multi-pass, load-and-go, debugging, optimizing….
 Variety is overwhelming……

Good news is:

 A few basic techniques are sufficient to cover all this variety
 Many efficient tools are available
JOB OF COMPILER
 We will study compilers that take as input programs in a high-level
programming language and give as output programs in a low-level assembly
language.
 Such compilers have 3 jobs:
 TRANSLATION
 VALIDATION
 OPTIMIZATION
WHY STUDY COMPILER?
 To be more effective users of compilers ... instead of treating a compiler as a
“black box”.
 To apply compiler techniques to typical SE tasks that require reading input and
taking action.
 To see how your core CS courses fit together, giving you the knowledge to
construct a compiler.
 To participate in R&D of high-level programming languages and optimizing
compilers.
CHALLENGES OF COMPILER CONSTRUCTION
Compiler construction poses challenging and interesting problems:
 Compilers must process large inputs, perform complex algorithms, but also run quickly
 Compilers have primary responsibility for run-time performance
 Compilers are responsible for making it acceptable to use the full power of the
programming language
 Computer architects perpetually create new challenges for the compiler by building
more complex machines
 Compilers must hide that complexity from the programmer

A successful compiler requires mastery of the many complex interactions between its constituent parts
COMPILER AND OTHER AREAS
Compiler construction involves ideas from different parts of computer science
 Artificial intelligence: Greedy algorithms, Heuristic search techniques
 Algorithms: Graph algorithms, Dynamic programming
 Theory of Automata: DFAs & PDAs, pattern matching, regular expressions
 Architecture: Pipelining and Instruction set use
REQUIREMENT
 In order to translate statements in a language, one needs to understand both
• Structure of the language: the way “sentences” are constructed in the language
• Meaning of the language: what each “sentence” stands for.

 Terminology:
 Structure ≡ Syntax
 Meaning ≡ Semantics
ANALYSIS-SYNTHESIS MODEL OF COMPILATION
 Two major parts:
 Analysis: an intermediate representation is created from the given source
program.
Phases: Lexical Analyzer, Syntax Analyzer and Semantic Analyzer
 Synthesis: the equivalent target program is created from this intermediate
representation.
Phases: Intermediate Code Generator, Code Optimizer and Code Generator
PHASES OF COMPILER
COMPILATION STEPS/PHASES
 Lexical Analysis: Generates the “tokens” in the source program
 Syntax Analysis: Recognizes “sentences" in the program using the syntax of the
language
 Semantic Analysis: Infers information about the program using the semantics of the
language
 Intermediate Code Generation: Generates “abstract” code based on the syntactic
structure of the program and the semantic information
 Optimization: Refines the generated code using a series of optimizing transformations
 Final Code Generation: Translates the abstract intermediate code into specific
machine instructions
LEXICAL ANALYSIS
 Converts the stream of characters that makes up the input program into meaningful sequences called lexemes.
 For each lexeme, the lexical analyzer produces as output a token of the form:
< token-name, attribute-value >
token-name: an abstract symbol that is used during syntax analysis
attribute-value: points to an entry in the symbol table for this token
 Example:
Input: “*x++” → Output: three tokens: “*”, “x”, “++”
Input: “static int” → Output: two tokens: “static”, “int”

 Removes whitespace and comments
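The two examples above can be reproduced with a small regular-expression scanner. This is a sketch only: the token pattern covers just identifiers, integers and a handful of operators chosen for illustration, and comment removal is left out. Note that "++" is listed before the single-character operators so that it is matched as one token.

```python
import re

# Whitespace is matched and discarded; everything else is captured as a token.
# "++" comes before the single-character class so it is not split into "+", "+".
TOKEN_RE = re.compile(r"\s+|(?P<tok>[A-Za-z_]\w*|\d+|\+\+|[-+*/=;,()\[\]])")

def tokenize(text):
    """Split the character stream into a list of lexemes."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.group("tok") is not None:   # None means the match was whitespace
            tokens.append(m.group("tok"))
        pos = m.end()
    return tokens

print(tokenize("*x++"))
print(tokenize("static int"))
```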

LEXICAL ANALYSIS
 Input: result = a + b * 10
 Tokens:
‘result’, ‘=’, ‘a’, ‘+’, ‘b’, ‘*’, ‘10’
Identifiers: result, a, b
Operators: =, +, *
Number: 10
LEXICAL ANALYSIS
 Input:
 Output: Sequence of tokens

• In this representation, the token names =, +, and * are abstract symbols for
the assignment, addition, and multiplication operators, respectively.
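As a sketch of what such a token sequence can look like, the scanner below emits (token-name, attribute-value) pairs. It assumes the statement `result = a + b * 10` from the earlier slide as input, stores identifiers in a plain list standing in for the symbol table, and uses `None` as the attribute of operator tokens; all of these are illustrative choices.

```python
import re

TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*]))")

def scan(text):
    """Return (tokens, symtab); identifier tokens carry a 1-based symbol-table index."""
    symtab = []                  # stand-in symbol table: one entry per distinct identifier
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.group("id"):
            name = m.group("id")
            if name not in symtab:
                symtab.append(name)
            tokens.append(("id", symtab.index(name) + 1))
        elif m.group("num"):
            tokens.append(("number", int(m.group("num"))))
        else:
            tokens.append((m.group("op"), None))   # operators need no attribute
        pos = m.end()
    return tokens, symtab

tokens, symtab = scan("result = a + b * 10")
print(tokens)
print(symtab)
```

Here `("id", 1)` plays the role of <id, 1>: the token name is `id` and the attribute points to entry 1 of the table, which holds `result`.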
SYNTAX ANALYSIS (PARSING)
 Builds a tree, called a parse tree, that reflects the structure of the input sentence.
 A syntax tree is a tree in which each interior node represents an operation and the children of
the node represent the arguments of the operation.
Example:
 The phrase: x = +y
 Four tokens: “x”, “=”, “+” and “y”
 Structure: x = (+(y)), i.e., an assignment expression
SYNTAX ANALYSIS: GRAMMARS
 Expression grammar
Exp → Exp ‘+’ Exp
| Exp ‘*’ Exp
| ID
| NUMBER
SYNTAX ANALYSIS: SYNTAX TREE
 Input: result = a + b * 10
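A sketch of how a syntax tree for this input could be built. The expression grammar as written is ambiguous (it does not say whether `+` or `*` binds tighter), so the parser below assumes the usual layered form in which `*` has higher precedence than `+`. The tuple representation of tree nodes is an illustrative choice.

```python
import re

def tokenize(text):
    # Sketch only: characters outside this small alphabet are silently ignored.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+*]", text)

def parse_assignment(tokens):
    """stmt -> ID '=' expr ; returns the tree ('=', name, expr_tree)."""
    if len(tokens) < 3 or not tokens[0].isidentifier() or tokens[1] != "=":
        raise SyntaxError("expected ID '=' expression")
    tree, rest = parse_expr(tokens[2:])
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return ("=", tokens[0], tree)

def parse_expr(tokens):
    """expr -> term ('+' term)*"""
    left, tokens = parse_term(tokens)
    while tokens and tokens[0] == "+":
        right, tokens = parse_term(tokens[1:])
        left = ("+", left, right)
    return left, tokens

def parse_term(tokens):
    """term -> factor ('*' factor)*"""
    left, tokens = parse_factor(tokens)
    while tokens and tokens[0] == "*":
        right, tokens = parse_factor(tokens[1:])
        left = ("*", left, right)
    return left, tokens

def parse_factor(tokens):
    """factor -> ID | NUMBER"""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens[0]
    if tok.isdigit():
        return int(tok), tokens[1:]
    if tok.isidentifier():
        return tok, tokens[1:]
    raise SyntaxError(f"unexpected token {tok!r}")

tree = parse_assignment(tokenize("result = a + b * 10"))
print(tree)   # ('=', 'result', ('+', 'a', ('*', 'b', 10)))
```

Each interior node is an operator and its children are the operands, so `b * 10` sits below the `+` node and is evaluated first.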
SEMANTIC ANALYSIS
 Check the source program for semantic errors
 It uses the hierarchical structure determined by the syntax-analysis phase to
identify the operators and operands of expressions and statements
 Performs type checking
 Operator operand compatibility
Example:
The compiler must report an error if a floating-point number is used to index an array.
SEMANTIC ANALYSIS
 The language specification may permit some type conversions, called coercions.
 Example:
When a binary arithmetic operator is applied to an integer and a floating-point number,
the compiler may convert, or coerce, the integer into a floating-point number.
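A minimal sketch of both checks, assuming expressions are nested tuples, only the types `int` and `float` exist, and an array's entry in `types` gives its element type. The node name `inttofloat` and the variable names are illustrative.

```python
def check(expr, types):
    """Return (typed_expr, type); insert coercions and reject bad array indices."""
    if isinstance(expr, tuple) and expr[0] == "index":
        _, array, index = expr
        index, index_type = check(index, types)
        if index_type != "int":
            raise TypeError(f"array index must be int, not {index_type}")
        return ("index", array, index), types[array]
    if isinstance(expr, tuple):
        op, left, right = expr
        left, left_type = check(left, types)
        right, right_type = check(right, types)
        if left_type == right_type:
            return (op, left, right), left_type
        # Mixed int/float operands: coerce the int side to float.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    if isinstance(expr, str):                      # a variable: look up its declared type
        return expr, types[expr]
    return expr, ("float" if isinstance(expr, float) else "int")

types = {"rate": "float", "count": "int", "myarray": "float", "realIndex": "float"}
print(check(("*", "rate", 60), types))
try:
    check(("index", "myarray", "realIndex"), types)
except TypeError as err:
    print("semantic error:", err)
```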
INTERMEDIATE CODE GENERATION
 Translates the decorated (annotated) syntax tree into intermediate code
 The intermediate code is a program for an abstract machine
 Properties of intermediate codes
 Should be easy to produce
 Should be easy to translate into the target program
 Intermediate code hides many machine-level details, but has instruction-level
mapping to many assembly languages
 Main motivation: portability
 One commonly used form is “Three-address Code”
INTERMEDIATE CODE GENERATION
 We consider an intermediate form called “three-address code”.
 It is like the assembly language for a machine in which every memory location can act like a register.
 Three-address code consists of a sequence of instructions, each of which has at most three operands.
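A sketch of how three-address code could be produced from a syntax tree. The tuple form of the tree and the temporary names `t1`, `t2`, ... are assumptions made for illustration.

```python
from itertools import count

def to_three_address(stmt):
    """Translate ('=', target, expr) into a list of three-address instructions."""
    temps = count(1)
    code = []

    def emit(expr):
        """Emit code for expr and return the name that holds its value."""
        if not isinstance(expr, tuple):
            return str(expr)               # identifier or constant: no code needed
        op, left, right = expr
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(temps)}"           # fresh compiler-generated temporary
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = stmt
    value = emit(expr)
    code.append(f"{target} = {value}")
    return code

tac = to_three_address(("=", "result", ("+", "a", ("*", "b", 10))))
for line in tac:
    print(line)
# t1 = b * 10
# t2 = a + t1
# result = t2
```

Every instruction has at most one operator on the right-hand side, so no instruction mentions more than three names.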
CODE OPTIMIZATION
 Apply a series of transformations to improve the time and space efficiency of
the generated code.
 Peephole optimizations: generate new instructions by combining/expanding on
a small number of consecutive instructions.
 Global optimizations: reorder, remove or add instructions to change the
structure of generated code
 Consumes a significant fraction of the compilation time
 Optimization capability varies widely
 Simple optimization techniques can be very valuable
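One very simple peephole rule, sketched on three-address code held as strings: when a temporary is computed and then immediately copied into a variable, the two instructions are combined. The sketch assumes temporaries are named `t1`, `t2`, ... and that each temporary is used exactly once; a real optimizer would have to verify this.

```python
import re

def peephole(code):
    """Merge 'tN = <expr>' followed by 'x = tN' into 'x = <expr>'."""
    out = []
    for instr in code:
        dest, rhs = instr.split(" = ", 1)
        if out:
            prev_dest, prev_rhs = out[-1].split(" = ", 1)
            # The window is two consecutive instructions: a temporary and its copy.
            if rhs == prev_dest and re.fullmatch(r"t\d+", prev_dest):
                out[-1] = f"{dest} = {prev_rhs}"
                continue
        out.append(instr)
    return out

optimized = peephole(["t1 = b * 10", "t2 = a + t1", "result = t2"])
print(optimized)   # ['t1 = b * 10', 'result = a + t1']
```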
CODE OPTIMIZATION
CODE GENERATION
 Map instructions in the intermediate code to specific machine instructions.
 Memory management, register allocation, instruction selection, instruction
scheduling, …
 Generates sufficient information to enable symbolic debugging.
CODE GENERATION
For example, using registers R1 and R2, the intermediate code might get
translated into the machine code
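A sketch of such a translation for a hypothetical register machine. The mnemonics `LD`, `ST`, `ADD` and `MUL`, the `#` prefix for constants, and the naive register assignment are all assumptions for illustration; the sketch handles only instructions of the form `dest = x op y` and does not spill registers.

```python
import re

MNEMONIC = {"+": "ADD", "*": "MUL"}

def generate(code, registers=("R1", "R2")):
    """Translate three-address instructions 'dest = x op y' into target instructions."""
    free = list(registers)
    where = {}                    # temporary name -> register currently holding it
    asm = []

    def operand(name):
        if name in where:         # value already lives in a register
            return where[name]
        return f"#{name}" if name.isdigit() else name

    for instr in code:
        dest, rhs = instr.split(" = ", 1)
        left, op, right = rhs.split()
        if left in where:
            reg = where.pop(left)             # reuse the register of the left operand
        else:
            reg = free.pop(0)
            asm.append(f"LD {reg}, {operand(left)}")
        asm.append(f"{MNEMONIC[op]} {reg}, {reg}, {operand(right)}")
        if right in where:
            free.append(where.pop(right))     # temporary is dead after its single use
        if re.fullmatch(r"t\d+", dest):
            where[dest] = reg                 # keep the temporary in its register
        else:
            asm.append(f"ST {dest}, {reg}")   # store the result into the variable
            free.append(reg)
    return asm

asm = generate(["t1 = b * 10", "result = a + t1"])
for line in asm:
    print(line)
# LD R1, b
# MUL R1, R1, #10
# LD R2, a
# ADD R2, R2, R1
# ST result, R2
```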
SYMBOL TABLE
 Records the identifiers used in the source program
 Collect information about various attributes of each identifier
 Variables: type, scope, storage allocation
 Procedure: number and types of arguments, method of argument passing
 It’s a data structure containing a record for each identifier
 Different fields are collected and used at different phases of compilation
 When an identifier in the source program is detected by the lexical analyzer, the
identifier is entered into the symbol table
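A minimal sketch of such a data structure, assuming a stack of dictionaries (one per scope) and free-form attribute fields; the field names `type` and `offset` are illustrative.

```python
class SymbolTable:
    """One record per identifier, organised as a stack of scopes."""

    def __init__(self):
        self.scopes = [{}]                 # the innermost scope is the last element

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        """Create the record for a new identifier in the current scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = {"name": name, **attributes}
        return scope[name]

    def lookup(self, name):
        """Search from the innermost scope outwards; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.declare("rate", type="float", offset=0)
table.enter_scope()
table.declare("rate", type="int", offset=8)     # shadows the outer declaration
inner = table.lookup("rate")["type"]
table.exit_scope()
outer = table.lookup("rate")["type"]
print(inner, outer)
```

Later phases can add fields to the same record, which matches the idea that different attributes are collected at different phases.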
SYMBOL TABLE
 It is built during the lexical and syntax analysis phases and is used by the compiler to achieve
compile-time efficiency.
 The information is collected by the analysis phases of the compiler and is used by the synthesis
phases to generate code.

Items stored in Symbol table:

 Variable names and constants
 Procedure and function names
 Literal constants and strings
 Compiler generated temporaries
 Labels in source languages
SYMBOL TABLE
Information used by compiler from Symbol table:
 Data type and name
 Declaring procedures
 Offset in storage
 For a structure or record, a pointer to the structure table
 For parameters, whether they are passed by value or by reference
 Number and type of arguments passed to function
 Base Address
SYMBOL TABLE
It is used by the various phases of the compiler as follows:
 Lexical Analysis: Creates new entries in the table, for example entries for tokens.
 Syntax Analysis: Adds information such as attribute type, scope, dimension, line of reference and use
to the table.
 Semantic Analysis: Uses the information in the table to check semantics, i.e. to verify that
expressions and assignments are semantically correct (type checking), and updates the table accordingly.
 Intermediate Code Generation: Refers to the symbol table to determine how much and what type of
run-time storage is allocated; the table also helps in adding temporary-variable information.
 Code Optimization: Uses the information in the symbol table for machine-dependent optimization.
 Target Code Generation: Generates code using the address information of the identifiers in the
table.
ERROR DETECTION, RECOVERY AND REPORTING
 Each phase can encounter errors.
 The tasks of the error-handling process are to detect each error, report it to the
user, and then choose and apply a recovery strategy to handle it.
 Specific types of error can be detected by specific phases
 Lexical error: int abc, 1num; (1num is not a valid identifier)
 Syntax error: total = capital + rate year; (an operator is missing between rate and year)
 Semantic error: value = myarray[realIndex]; (a real-valued array index)
 The compiler should be able to proceed and process the rest of the program after an error is detected
 The compiler should be able to link the error with the source program
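A sketch of detection, reporting and recovery at the lexical level, assuming a tiny token set and treating a lexeme such as `1num` (digits followed by letters) as a malformed identifier. Each error is linked to the source by line and column, and scanning continues after it.

```python
import re

# "bad" is tried first so that "1num" is reported whole instead of as "1" then "num".
TOKEN_RE = re.compile(
    r"(?P<bad>\d+[A-Za-z_]\w*)|(?P<tok>[A-Za-z_]\w*|\d+|[=+*,;\[\]])"
)

def scan(source):
    """Return (tokens, errors); errors are (line, column, message) triples."""
    tokens, errors = [], []
    for line_no, line in enumerate(source.splitlines(), start=1):
        pos = 0
        while pos < len(line):
            if line[pos].isspace():
                pos += 1
                continue
            m = TOKEN_RE.match(line, pos)
            if m is None:
                errors.append((line_no, pos + 1, f"unexpected character {line[pos]!r}"))
                pos += 1              # recovery: skip the character and keep scanning
                continue
            if m.group("bad"):
                errors.append((line_no, pos + 1, f"malformed identifier {m.group('bad')!r}"))
            else:
                tokens.append(m.group("tok"))
            pos = m.end()             # recovery: resume right after the bad lexeme
    return tokens, errors

tokens, errors = scan("int abc, 1num ;\ntotal = capital + rate;")
print(errors)    # [(1, 10, "malformed identifier '1num'")]
print(tokens)
```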
ERROR DETECTION, RECOVERY AND REPORTING
Types or Sources of Error:
There are two types of error: run-time and compile-time.
 A run-time error is an error that takes place during the execution of a program, and
usually happens because of adverse system parameters or invalid input data.
 Examples include a lack of sufficient memory to run an application or a memory conflict with
another program
 A logic error is another example. Logic errors occur when executed code does not
produce the expected result. Logic errors are best handled by meticulous program
debugging.
ERROR DETECTION, RECOVERY AND REPORTING
Types or Sources of Error:
There are two types of error: run-time and compile-time.
 A compile-time error arises at compile time, before the program is executed. A syntax error
or a missing file reference that prevents the program from compiling successfully is an
example.
 Classification of compile-time errors:
 Lexical: misspellings of identifiers, keywords or operators
 Syntactical: a missing semicolon or unbalanced parentheses
 Semantical: incompatible value assignment or type mismatches between operator and operand
 Logical: unreachable code, infinite loops
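The compile-time versus run-time distinction can be observed in Python itself, used here only as a convenient illustration: the built-in `compile` translates without running anything, and `exec` then runs the result. The helper name `classify` is made up. Note that Python reports most type errors at run time, unlike the statically checked languages the classification above has in mind.

```python
def classify(source):
    """Report whether `source` fails at compile time, at run time, or not at all."""
    try:
        code = compile(source, "<example>", "exec")   # translation only, nothing runs
    except SyntaxError as err:
        return f"compile-time error: {err.msg}"
    try:
        exec(code, {})                                # now the program actually runs
    except Exception as err:
        return f"run-time error: {type(err).__name__}"
    return "ok"

print(classify("total = (1 + 2"))      # unbalanced parenthesis
print(classify("value = 1 / 0"))       # fails only when executed
print(classify("value = 1 + 2"))
```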
REVIEWING THE ENTIRE PROCESS
COMPILER CONSTRUCTION TOOLS
1) Parser generators: They produce syntax analyzers (parsers) from a grammatical
description of a programming language, typically a context-free grammar.
2) Scanner generators: They generate lexical analyzers from a regular-expression
description of the tokens of a language. The generated analyzer is a
finite automaton that recognizes the regular expressions.
3) Syntax-directed translation engines: They generate intermediate code in three-address
format from a parse tree. These engines have routines that traverse the parse tree
and produce the intermediate code.
COMPILER CONSTRUCTION TOOLS
4) Code-generator generators: They produce a code generator that translates intermediate
code into the machine language of a target machine.
5) Data-flow analysis engines: They gather information about how values flow from one
part of a program to another. Data-flow analysis is a key part of code optimization.
6) Compiler-construction toolkits: They provide an integrated set of routines
that aid in building compiler components or in constructing the various
phases of a compiler.
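To make the scanner-generator item more concrete, here is a hand-written, table-driven finite automaton of the kind such a tool might emit for the regular expression letter (letter | digit)*, i.e. identifiers. The state names and the table layout are illustrative, not the output of any particular tool.

```python
def char_class(ch):
    """Map a character to the input symbol used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# Transition table of a DFA for: letter (letter | digit)*
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
}
ACCEPTING = {"in_id"}

def accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:          # no transition defined: the DFA rejects
            return False
    return state in ACCEPTING

print(accepts("num1"), accepts("1num"), accepts(""))
```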
Reading Materials
 Chapter 1 of your textbook:
 Compilers: Principles, Techniques, and Tools
 https://www.geeksforgeeks.org/compiler-design-tutorials/
THE END
