Lecture 1 - Ch1. Introduction To Compiler

The document provides an overview of compilers, including: 1. A compiler translates a program written in a high-level language into machine language. It has a front-end that analyzes the source code and a back-end that generates the target code. 2. The compilation process involves several phases - lexical analysis, syntax analysis, semantic analysis, code generation, and code optimization. 3. Lexical analysis converts the source code text into tokens. Syntax analysis checks the grammar and produces an abstract syntax tree. Semantic analysis performs type checking using symbol table information.

COMPILERS

LECTURE 1
CHAPTER 1: INTRODUCTION

Prof. Dr. Hala Abdel-El Galil


Dr. Ahmed Hesham Mostafa
Language Processors
• A compiler is a program that reads a program in one language (the source language)
and translates it into an equivalent program in another language (the target language).
• An important role of the compiler is to report any errors in the source program that it
detects during the translation process.
• If the target program is an executable machine-language program, the user can then run
it to process inputs and produce outputs.
Overview of Compilers

• Compiler: translates a source program written in a High-Level Language (HLL)
such as Pascal or C++ into the computer’s machine language (a Low-Level
Language (LLL)).
* The time at which the source program is converted into the
object program is called compile time.
* The object program is executed at run time.

• Interpreter: processes an internal form of the source program and data at the
same time (at run time); no object program is generated.

Example Of Combining Both
Interpreter and Compiler
• Java language processors combine compilation
and interpretation
• A Java source program may first be compiled into
an intermediate form called bytecodes.
• The bytecodes are then interpreted by a virtual
machine. A benefit of this arrangement is that
bytecodes compiled on one machine can be
interpreted on another machine, perhaps across a
network.
• In order to achieve faster processing of inputs to
outputs, some Java compilers, called just-in-time
compilers, translate the bytecodes into machine
language immediately before they run the
intermediate program to process the input.
Typical Implementations
• Compilers
  • FORTRAN, C, C++, Java, COBOL, etc.
  • Strong need for optimization
• Interpreters
  • Perl, Python, awk, sed, sh, csh, PostScript printers, Java VM
  • Effective if interpreter overhead is low relative to the execution cost of language
    statements

What is a compiler?
A program that reads a program written in one language and translates
it into another language.

Source language → Compiler → Target language

Traditionally, compilers go from high-level languages to low-level languages.

Implications
• Must recognize legal programs (& complain about illegal ones)
• Must generate correct code
• Must manage storage of all variables
• Must agree with OS on target format

Source → Front End → Back End → Target

More Implications
• Need some sort of Intermediate Representation (IR)
• Front end maps source into IR
• Back end maps IR to target machine code

Source → Front End → IR → Back End → Target

Structure of a Compiler
• First approximation
  • Front end: analysis
    • Read source program and understand its structure and meaning
  • Back end: synthesis
    • Generate equivalent target language program

Source → Front End → Back End → Target

Compiler Architecture
[Figure: Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → parse tree → Semantic Analysis → intermediate language → Code Optimizer → intermediate language → Code Generator → Target language; the Symbol Table is shared by the phases.]

Phases of a compiler
Lexical Analysis
• The first phase of a compiler is called lexical analysis or
scanning.
• The lexical analyzer reads the stream of characters making
up the source program and groups the characters into
meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces as output a
token of the form (token-name, attribute-value).

Lexical Analysis
• Token-name is an abstract symbol that is used during syntax
analysis.
• Attribute-value points to an entry in the symbol table for
this token.
• Information from the symbol-table entry is needed for
semantic analysis and code generation.
• For example, suppose a source program contains the
assignment statement: position = initial + rate * 60
Lexical Analysis
• The characters in this assignment could be grouped into the following
lexemes and mapped into the following tokens passed on to the syntax
analyzer:
• “position” is a lexeme mapped into a token (id, 1), where id is an abstract
symbol standing for identifier and 1 points to the symbol table entry for
position. The symbol-table entry for an identifier holds information about the
identifier, such as its name and type.
• “=” is a lexeme that is mapped into the token (=). Since this token
needs no attribute-value, the second component is omitted.
For notational convenience, the lexeme itself is used as the name of
the abstract symbol.

Lexical Analysis
• “initial” is a lexeme that is mapped into the token
(id, 2), where 2 points to the symbol-table entry for
initial.
• “+” is a lexeme that is mapped into the token (+).
• “rate” is a lexeme mapped into the token (id, 3), where 3
points to the symbol-table entry for rate.

Lexical Analysis
• “*” is a lexeme that is mapped into the token (*).
• “60” is a lexeme that is mapped into the token (60).
• Blanks separating the lexemes are discarded by the lexical
analyzer.
• After lexical analysis, the assignment statement is represented
as the sequence of tokens (id, 1) (=) (id, 2) (+) (id, 3) (*) (60).
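
The lexeme-to-token mapping described on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token patterns, the function name tokenize, and the list-based symbol table are hypothetical choices made for this example, not part of the lecture.

```python
import re

# Token patterns for the running example only; this tiny set is an assumption,
# not a complete language definition.
TOKEN_SPEC = [
    ("id", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"\d+"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for a source string."""
    symbol_table = []  # entry n-1 holds the name of identifier number n
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # blanks separating lexemes are discarded
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # attribute-value is the 1-based symbol-table index
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        else:
            # operators and numbers: the lexeme itself names the token
            tokens.append((lexeme,))
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
print(tokens)
print(table)
```

Tokens without an attribute-value are stored as one-element tuples, mirroring the (=) notation used on the slides.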

Syntax Analysis (parser)
• The parser uses the first components of the tokens produced
by the lexical analyzer to create a tree-like intermediate
representation that depicts the grammatical structure of the
token stream.
• A typical representation is a syntax tree in which each
interior node represents an operation and the children of the
node represent the arguments of the operation

Syntax Analysis (parser)
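
As a rough sketch of how a parser might turn the token sequence for position = initial + rate * 60 into a syntax tree, the following Python uses recursive descent. The grammar (an assignment whose right side uses + and *, with * binding tighter), the nested-tuple tree format, and the name parse_assignment are assumptions made for this example.

```python
def parse_assignment(tokens):
    """Build a syntax tree (nested tuples) from tokens such as ("id", 1) or ("+",)."""
    if not tokens:
        raise SyntaxError("empty token stream")
    pos = 0

    def peek():
        # Name of the next token, or None at the end of the stream.
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # factor -> id | number
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        token = advance()
        if token[0] == "id" or token[0].isdigit():
            return token  # leaves of the tree are the tokens themselves
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # term -> factor ("*" factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr -> term ("+" term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if target[0] != "id" or peek() != "=":
        raise SyntaxError("expected: id = expression")
    advance()  # consume "="
    tree = ("=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


tree = parse_assignment([("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)])
print(tree)
```

Each interior node is a tuple whose first element is the operation and whose remaining elements are its arguments, so the multiplication ends up below the addition in the tree.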

Semantic Analysis
• The semantic analyzer uses the syntax tree and the
information in the symbol table to check the source program
for semantic consistency with the language definition.
• It also gathers type information and saves it in either the syntax
tree or the symbol table, for subsequent use during
intermediate-code generation.

Semantic Analysis
• An important part of semantic analysis is type checking,
where the compiler checks that each operator has matching
operands. For example, many programming language
definitions require the two sides of an assignment statement to
have the same data type.
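
A minimal type-checking sketch for the running example follows. It assumes a syntax tree built from nested tuples, a hypothetical types mapping from symbol-table index to "int" or "float", and a hypothetical inttofloat node that widens an integer operand; none of these names come from the slides.

```python
def check(node, types):
    """Return (typed_node, type) for a syntax-tree node; raise TypeError on a mismatch."""
    head = node[0]
    if head == "id":
        return node, types[node[1]]  # look the identifier's type up by its index
    if head.isdigit():
        return node, "int"  # integer literal
    if head in ("+", "*"):
        left, left_type = check(node[1], types)
        right, right_type = check(node[2], types)
        if left_type == right_type:
            return (head, left, right), left_type
        # Mixed int/float operands: widen the int operand to float.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (head, left, right), "float"
    if head == "=":
        target, target_type = check(node[1], types)
        value, value_type = check(node[2], types)
        if target_type != value_type:
            raise TypeError(f"cannot assign {value_type} to {target_type}")
        return ("=", target, value), target_type
    raise ValueError(f"unknown node {head!r}")


# Assumed declarations: position, initial and rate are all floating point.
types = {1: "float", 2: "float", 3: "float"}
tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("60",))))
typed_tree, result_type = check(tree, types)
print(typed_tree)
```

With these declarations the integer 60 is wrapped in an inttofloat node, and an assignment whose two sides end up with different types is rejected with a TypeError.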

Intermediate Code Generation
• After syntax and semantic analysis of the source program, many compilers generate an explicit
low-level or machine-like intermediate representation (a program for an abstract machine). This
intermediate representation should have two important properties:
– it should be easy to produce and
– it should be easy to translate into the target machine.
• The intermediate form considered here is called three-address code, which consists of a sequence of
assembly-like instructions with three operands per instruction. Each operand can act like a
register.
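
To make three-address code concrete, here is a small sketch that walks a type-checked syntax tree for position = initial + rate * 60 and emits one instruction per interior node. The tuple tree format, the inttofloat node, the temporary names t1, t2, ... and the idN operand names are assumptions for illustration.

```python
def generate(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        # Returns the name (identifier, constant, or temporary) holding the node's value.
        head = node[0]
        if head == "id":
            return f"id{node[1]}"
        if head.isdigit():
            return head
        if head == "inttofloat":
            operand = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        if head in ("+", "*"):
            left = emit(node[1])
            right = emit(node[2])
            temp = new_temp()
            code.append(f"{temp} = {left} {head} {right}")
            return temp
        raise ValueError(f"unknown node {head!r}")

    target = emit(tree[1])
    value = emit(tree[2])
    code.append(f"{target} = {value}")
    return code


tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("inttofloat", ("60",)))))
code = generate(tree)
print("\n".join(code))
```

Every instruction has at most one operator on its right side, and each temporary is assigned exactly once, which is what lets operands behave like registers.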

Code Optimization
• The machine-independent code-optimization phase attempts to improve the intermediate code so
that better target code will result.
• Usually, better means faster or shorter code, or target code that consumes less power.
• For example, the optimizer can deduce that the conversion of 60 from integer to floating point can be done
once and for all at compile time, so the int-to-float operation can be eliminated by replacing the
integer 60 with the floating-point number 60.0. Moreover, the temporary t3 is used only once, to pass its
value on to the assignment target, so it can be eliminated as well.
• There are simple optimizations that significantly improve the running time of the target program
without slowing down compilation too much.
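
The two optimizations mentioned above (doing the int-to-float conversion of 60 at compile time, and removing a temporary that is used only once) can be sketched as two passes over three-address code. The (dest, op, arg1, arg2) instruction tuples, the "copy" and "inttofloat" operation names, and the convention that temporaries start with "t" are assumptions for this example.

```python
def optimize(code):
    """code: list of (dest, op, arg1, arg2) tuples; returns a shorter equivalent list."""
    # Pass 1: perform inttofloat on integer constants at compile time.
    constants = {}
    folded = []
    for dest, op, arg1, arg2 in code:
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttofloat" and arg1.isdigit():
            constants[dest] = str(float(arg1))  # "60" becomes "60.0"
            continue  # the conversion instruction itself disappears
        folded.append((dest, op, arg1, arg2))

    def uses(name):
        # Number of times a name appears as an operand.
        return sum((a1 == name) + (a2 == name) for _, _, a1, a2 in folded)

    # Pass 2: if a temporary is computed and then only copied once,
    # write the result directly into the copy's destination.
    result = []
    for dest, op, arg1, arg2 in folded:
        if (op == "copy" and result and result[-1][0] == arg1
                and arg1.startswith("t") and uses(arg1) == 1):
            previous = result.pop()
            result.append((dest,) + previous[1:])
        else:
            result.append((dest, op, arg1, arg2))
    return result


before = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
after = optimize(before)
for instruction in after:
    print(instruction)
```

The four instructions shrink to two. Temporaries are not renumbered in this sketch, so the surviving temporary keeps the name t2.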

Symbol-Table Management
• The symbol table is a data structure containing a record for each variable name,
with fields for the attributes of the name.
• The data structure should be designed to allow the compiler to find the record
for each name quickly and to store or retrieve data from that record quickly.
• These attributes may provide information about the storage allocated for a
name, its type, its scope (where in the program its value may be used), and in
the case of procedure names, such things as the number and types of its
arguments, the method of passing each argument (for example, by value or by
reference), and the type returned.

Example – Symbol Table
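
One possible shape for such a table is sketched below, using a hash table keyed by scope and name so that lookups stay fast. The class names SymbolEntry and SymbolTable, the field names, the "global" scope label, and the sample entries are all hypothetical choices for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SymbolEntry:
    name: str
    type: str                      # data type, or "procedure"
    scope: str                     # where in the program the name is visible
    # For procedures only: (argument type, passing method) pairs and the return type.
    params: List[Tuple[str, str]] = field(default_factory=list)
    returns: Optional[str] = None


class SymbolTable:
    def __init__(self):
        self._entries = {}         # dict gives average constant-time insert and lookup

    def insert(self, entry):
        key = (entry.scope, entry.name)
        if key in self._entries:
            raise KeyError(f"{entry.name!r} already declared in scope {entry.scope!r}")
        self._entries[key] = entry

    def lookup(self, name, scope):
        # Try the given scope first, then fall back to the global scope.
        for candidate in (scope, "global"):
            entry = self._entries.get((candidate, name))
            if entry is not None:
                return entry
        return None


table = SymbolTable()
for variable in ("position", "initial", "rate"):
    table.insert(SymbolEntry(variable, "float", "global"))
table.insert(SymbolEntry("scale", "procedure", "global",
                         params=[("float", "by value"), ("float", "by reference")],
                         returns="float"))

print(table.lookup("rate", "main"))
print(table.lookup("scale", "global").params)
```

A real compiler would usually model nested scopes with a stack or chain of tables; the single fallback to "global" here only keeps the example short.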

The Phases of a Compiler
Phase                                                     Output
Programmer                                                Source string
Scanner (performs lexical analysis)                       Token string
Parser (performs syntax analysis based on the grammar     Parse tree
  of the programming language)
Semantic analyzer (type checking, etc.)                   Parse tree
Intermediate code generator                               Three-address code
Optimizer                                                 Three-address code
Code generator                                            Assembly code
Peephole optimizer                                        Assembly code


Linker
• A linker is a system program that links the object
modules of a program into a single object file.
• It performs the process of linking. Linkers are also called link
editors.
• Linking can be performed at compile time, when the
source code is translated into machine code, and at load time,
when the program is loaded into memory by the loader.
• Linking is the last step in compiling a program.
References
Compilers: Principles, Techniques, and Tools, Second Edition, by Alfred V.
Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman
