
Chapter 1 Introduction To Compiler Design


CHAPTER ONE

1. Introduction to Compiler Design

✓ We have learned that computers are a combination of software and hardware. Hardware is just a
physical device, and its functions are controlled by compatible software.

✓ Hardware understands instructions in the form of electronic charge, which is the counterpart
of binary language (0 and 1) in software programming.

✓ To instruct the hardware, code must be written in binary format, which is simply a series of
1s and 0s. Writing such code by hand would be difficult for programmers, which is why
we have compilers to generate it.

✓ The compiler is software that converts a program written in a high-level language
(source language) into a low-level language (object, target, or machine language, i.e. 0s and 1s).
✓ A translator or language processor is a program that translates an input program
written in one programming language into an equivalent program in another language.
A compiler is a type of translator that takes a program written in a high-level
programming language as input and translates it into an equivalent program in a
low-level language such as machine language or assembly language.
✓ The program written in a high-level language is known as the source program, and the
program converted into a low-level language is known as the object (or target) program.
A program written in a high-level language cannot be executed directly by the hardware;
it must first be translated, by a compiler or by another translator such as an interpreter.
Each programming language needs its own compiler; however, the basic tasks
performed by every compiler are the same. The process of translating source code
into machine code involves several stages, including lexical analysis, syntax analysis,
semantic analysis, intermediate code generation, optimization, and code generation.
✓ A compiler is a more sophisticated program than an assembler. It checks limits,
ranges, and many kinds of errors. A compiler takes more time to run and occupies more
memory than most other system software, because it goes through the whole program and
then translates the full program. When a compiler runs on a machine and produces
machine code for that same machine, it is called a self-compiler or resident compiler.
When a compiler runs on one machine and produces machine code for a different
machine, it is called a cross compiler.

Compiler design, compiled by Worku B (MSc in computer science)

High-Level Programming Language
A high-level programming language is a language that abstracts away the details of the
computer, such as registers and memory addresses. It is more convenient for the user to
write programs in a high-level language.
Low-Level Programming Language
A low-level programming language is a language that provides little or no abstraction from
the computer's hardware. Machine language and assembly language are examples.
Why Study Compilers?

✓ Provides general background knowledge for a good software engineer

✓ Increases understanding of language semantics

✓ Seeing the machine code generated for language constructs helps understand performance
issues for languages

✓ Teaches good language design

✓ New devices may need device-specific languages

✓ New business fields may need domain-specific languages

Language Processing System

✓ The hardware understands a language that humans cannot easily understand.

✓ So we write programs in a high-level language, which is easier for us to understand and
remember.

✓ These programs are then fed into a series of tools and OS components to get the desired
code that can be used by the machine. This is known as a Language Processing System.

Pre-processor

A pre-processor, generally considered a part of the compiler, is a tool that produces input
for the compiler. It deals with macro processing, file inclusion, language extension, etc.

Assembler

An assembler translates assembly language programs into machine code. The output of an
assembler is called an object file, which contains a combination of machine instructions as
well as the data required to place these instructions in memory.

Compiler

A compiler is a program that takes a program written in a source language (typically a
high-level language) and translates it into a functionally equivalent program in a target
language (typically machine code).

Linker

A linker is a computer program that links and merges various object files together in order
to make an executable file.

Loader

▪ The loader is a part of the operating system and is responsible for loading executable files
into memory and executing them.

▪ It calculates the size of a program (instructions and data) and creates memory space for it.

▪ It initializes various registers to initiate execution.

The high-level language is converted into binary language in various phases.

Let us first understand how a program written in C is compiled and executed on a host machine.

✓ The user writes a program in C (a high-level language).

✓ The C compiler compiles the program and translates it into an assembly program (a
low-level language).

✓ An assembler then translates the assembly program into machine code (an object file).

✓ A linker is used to link all the parts of the program together for execution (executable
machine code).

✓ A loader loads all of them into memory and then the program is executed.

Cross-compiler

✓ A compiler that runs on platform (A) and is capable of generating executable code for
platform (B) is called a cross-compiler.

Source-to-source Compiler

✓ A compiler that takes the source code of one programming language and translates it into
the source code of another programming language is called a source-to-source compiler.

1.1. Phases of a Compiler
The compilation process is a sequence of various phases. Each phase takes input from its
previous stage, has its own representation of source program, and feeds its output to the next
phase of the compiler. Let us understand the phases of a compiler.

There are two major phases of compilation, which in turn have many parts. Each of
them takes input from the output of the previous level and works in a coordinated
way.

Analysis phase (front end): an intermediate representation is created from the given source
code. It has four parts:

✓ Lexical Analyzer
✓ Syntax Analyzer
✓ Semantic Analyzer
✓ Intermediate Code Generator

The lexical analyzer divides the program into “tokens”, the syntax analyzer recognizes
“sentences” in the program using the syntax of the language, and the semantic analyzer checks the
static semantics of each construct. The intermediate code generator generates “abstract” code.
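
To make the idea of dividing a program into tokens concrete, here is a minimal sketch of a lexical analyzer in Python. The token classes (NUMBER, IDENT, OP, and so on) and the function name tokenize are assumptions chosen for this illustration; a real compiler defines its own token set.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),
]
# One combined pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Split source text into (kind, text) tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = a + 42"))
```

Each token pairs a class with the matched text, which is the form in which the syntax analyzer would receive its input.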

Synthesis phase (back end): an equivalent target program is created from the intermediate
representation. It has two parts:
✓ Code Optimizer
✓ Code Generator
The code optimizer optimizes the abstract code, and the final code generator translates the
abstract intermediate code into specific machine instructions.
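
As one example of what a code optimizer can do, the sketch below applies constant folding to intermediate code. The representation of each statement as a tuple (dest, op, left, right) and the "copy" operation are assumptions made for this illustration only.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Replace operations on two integer constants with a single copy."""
    optimized = []
    for dest, op, left, right in code:
        if op in OPS and isinstance(left, int) and isinstance(right, int):
            # Both operands are known, so compute the result now.
            optimized.append((dest, "copy", OPS[op](left, right), None))
        else:
            optimized.append((dest, op, left, right))
    return optimized

code = [
    ("t1", "*", 60, 60),          # t1 = 60 * 60
    ("t2", "*", "hours", "t1"),   # t2 = hours * t1
]
print(fold_constants(code))
```

The first statement is reduced to a copy of the constant 3600, while the second is left unchanged because the value of hours is only known at run time.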

A compiler is a software program that converts the high-level source code written in a
programming language into low-level machine code that can be executed by the computer
hardware.

The process of converting the source code into machine code involves several phases or
stages, which are collectively known as the phases of a compiler. The typical phases of a
compiler are:
1. Lexical Analysis: The first phase of a compiler is lexical analysis, also known as
scanning. This phase reads the source code and breaks it into a stream of tokens, which
are the basic units of the programming language. The tokens are then passed on to the
next phase for further processing.
2. Syntax Analysis: The second phase of a compiler is syntax analysis, also known as
parsing. This phase takes the stream of tokens generated by the lexical analysis phase
and checks whether they conform to the grammar of the programming language. The
output of this phase is usually an Abstract Syntax Tree (AST).
3. Semantic Analysis: The third phase of a compiler is semantic analysis. This phase
checks whether the code is semantically correct, i.e., whether it conforms to the
language’s type system and other semantic rules. In this stage, the compiler checks the
meaning of the source code to ensure that it makes sense. The compiler performs type
checking, which ensures that variables are used correctly and that operations are
performed on compatible data types. The compiler also checks for other semantic
errors, such as undeclared variables and incorrect function calls.
4. Intermediate Code Generation: The fourth phase of a compiler is intermediate code
generation. This phase generates an intermediate representation of the source code that
can be easily translated into machine code.
5. Optimization: The fifth phase of a compiler is optimization. This phase applies
various optimization techniques to the intermediate code to improve the performance
of the generated machine code.
6. Code Generation: The final phase of a compiler is code generation. This phase takes
the optimized intermediate code and generates the actual machine code that can be
executed by the target hardware.
In summary, the phases of a compiler are: lexical analysis, syntax analysis, semantic
analysis, intermediate code generation, optimization, and code generation.
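
The following is a minimal sketch of how the first phases fit together for arithmetic expressions with + and *. It is not a full compiler: semantic analysis, optimization, and code generation are left out, and the grammar, the class name Parser, and the temporary names t1, t2 are assumptions chosen for this illustration.

```python
import re

def tokenize(text):
    """Lexical analysis: numbers, identifiers, operators, and parentheses.

    Characters outside these patterns are silently ignored in this sketch.
    """
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)

class Parser:
    """Syntax analysis for: expr -> term ('+' term)* ; term -> factor ('*' factor)*"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # a number or an identifier

def parse(text):
    """Build an AST of nested (op, left, right) tuples."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

def to_three_address(tree):
    """Intermediate code generation: flatten the AST into three-address code."""
    code = []

    def visit(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        temp = f"t{len(code) + 1}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    visit(tree)
    return code

tree = parse("a + b * 60")
print(tree)                    # ('+', 'a', ('*', 'b', '60'))
print(to_three_address(tree))  # ['t1 = b * 60', 't2 = a + t1']
```

Because term is parsed below expr, the multiplication binds more tightly than the addition, which is why b * 60 is computed first in the intermediate code.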

Symbol Table – It is a data structure used and maintained by the compiler, consisting
of all the identifiers' names along with their types. It helps the compiler function
smoothly by finding identifiers quickly.
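
A symbol table can be sketched as a dictionary from identifier names to types, which gives fast lookup. The class below is only an illustration: the method names declare and lookup, and the use of a parent link for nested scopes, are assumptions that go beyond the description above.

```python
class SymbolTable:
    """Maps identifier names to their types, with optional nested scopes."""

    def __init__(self, parent=None):
        self.symbols = {}      # name -> type
        self.parent = parent   # enclosing scope, or None for the outermost one

    def declare(self, name, type_name):
        if name in self.symbols:
            raise NameError(f"'{name}' is already declared in this scope")
        self.symbols[name] = type_name

    def lookup(self, name):
        # Search this scope first, then each enclosing scope in turn.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"'{name}' is not declared")

global_scope = SymbolTable()
global_scope.declare("count", "int")
local_scope = SymbolTable(parent=global_scope)
local_scope.declare("rate", "float")
print(local_scope.lookup("count"))  # int
print(local_scope.lookup("rate"))   # float
```

Looking up a name that was never declared raises an error, which is how a compiler would report an undeclared identifier.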
The analysis of a source program is divided mainly into three phases. They are:
1. Linear Analysis-
This involves a scanning phase where the stream of characters is read from left to right. It
is then grouped into various tokens having a collective meaning.
2. Hierarchical Analysis-
In this analysis phase, based on a collective meaning, the tokens are categorized
hierarchically into nested groups.

3. Semantic Analysis-
This phase is used to check whether the components of the source program are
meaningful or not.
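
The semantic analysis step can be illustrated with a small type checker over an expression tree. This is a sketch under simplifying assumptions: expressions are nested tuples (op, left, right), the symbol table is a plain dictionary, and both operands of an operator must have exactly the same type, which is stricter than most real languages.

```python
def check_types(node, symbols):
    """Return the type of an expression tree, or raise on a semantic error.

    node is an int literal, a float literal, a variable name (str),
    or a tuple (op, left, right).
    """
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if isinstance(node, str):
        if node not in symbols:
            raise NameError(f"undeclared variable '{node}'")
        return symbols[node]
    op, left, right = node
    left_type = check_types(left, symbols)
    right_type = check_types(right, symbols)
    if left_type != right_type:
        raise TypeError(f"cannot apply '{op}' to {left_type} and {right_type}")
    return left_type

symbols = {"count": "int", "rate": "float", "name": "string"}
print(check_types(("+", "count", 1), symbols))  # int
```

An expression such as ("+", "count", "name") is syntactically valid but is rejected here because it mixes int and string, and ("*", "total", 2) is rejected because total was never declared.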

The compiler has two modules, namely the front end and the back end. The front end consists of the
lexical analyzer, syntax analyzer, semantic analyzer, and intermediate code generator. The
remaining phases (the code optimizer and the code generator) form the back end.

Operations of Compiler
These are some operations that are done by the compiler.
✓ It breaks source programs into smaller parts.
✓ It enables the creation of symbol tables and intermediate representations.
✓ It helps in code compilation and error detection.
✓ It keeps track of the program's code and variables.
✓ It analyses the full program and translates it.
✓ It converts source code to machine code.
Advantages of Compiler Design
1. Efficiency: Compiled programs are generally more efficient than interpreted
programs because the machine code produced by the compiler is optimized for the
specific hardware platform on which it will run.
2. Portability: Once a program is compiled, the resulting machine code can be run on
any computer or device that has compatible hardware and operating system, and the
same source code can be recompiled for other platforms.
3. Error Checking: Compilers perform extensive error checking during the
compilation process, which can help catch syntax errors, semantic errors, and some
logical errors in the code before it is run.
4. Optimizations: Compilers can make various optimizations to the generated
machine code, such as eliminating redundant instructions or rearranging code for
better performance.

Disadvantages of Compiler Design
1. Longer Development Time: Developing a compiler is a complex and time-
consuming process that requires a deep understanding of both the programming
language and the target hardware platform.
2. Debugging Difficulties: Debugging compiled code can be more difficult than
debugging interpreted code because the generated machine code may not be easy to
read or understand.
3. Lack of Interactivity: Compiled programs are typically less interactive than
interpreted programs because they must be compiled before they can be run, which
can slow down the development and testing process.
4. Platform-Specific Code: If the compiler is designed to generate machine code for a
specific hardware platform, the resulting code may not be portable to other platforms.
1.2 Computer Language Representation

High-level and low-level languages are the two types of programming language. The main
difference between them is that programmers can understand a high-level language more easily
than the machine can, whereas the machine can understand a low-level language more easily than
human beings can. Examples of high-level languages are C, C++, Java, and Python. The
differences between high-level and low-level languages are:

HLL                                                     LLL
✓ It is a programmer-friendly language.                 ✓ It is a machine-friendly language.
✓ It is less memory efficient.                          ✓ It is highly memory efficient.
✓ It is easy to understand.                             ✓ It is tough to understand.
✓ Debugging is easy.                                    ✓ Debugging is comparatively complex.
✓ It is simple to maintain.                             ✓ It is comparatively complex to maintain.
✓ It is portable.                                       ✓ It is non-portable.
✓ It can run on any platform.                           ✓ It is machine-dependent.
✓ It needs a compiler or interpreter for translation.   ✓ It needs an assembler for translation.
✓ It is widely used for programming.                    ✓ It is not commonly used nowadays in programming.

1.3. Compiler Construction Tools


The compiler writer can use some specialized tools that help in implementing various
phases of a compiler. These tools assist in the creation of an entire compiler or its parts.
Some commonly used compiler construction tools include:
1. Parser Generator – It produces syntax analyzers (parsers) from input that is based
on a grammatical description of a programming language, i.e. a context-free grammar. It
is useful because the syntax analysis phase is highly complex and writing a parser by
hand takes considerable time and effort.
2. Scanner Generator – It generates lexical analyzers from input that consists of
regular-expression descriptions of the tokens of a language. It generates a finite
automaton to recognize the regular expressions. Example: Lex
3. Syntax-directed translation engines – They generate intermediate code in three-address
format from input that consists of a parse tree. These engines have
routines to traverse the parse tree and then produce the intermediate code. Each
node of the parse tree is associated with one or more translations.
4. Automatic code generators – They generate the machine language for a target
machine. Each operation of the intermediate language is translated using a collection
of rules that the code generator takes as input. A template-matching
process is used: an intermediate language statement is replaced by its equivalent
machine language statement using templates.
5. Data-flow analysis engines – They are used in code optimization. Data-flow analysis is a
key part of code optimization; it gathers information about how values
flow from one part of a program to another.
6. Compiler construction toolkits – They provide an integrated set of routines that aid
in building compiler components or in constructing various phases of a compiler.
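
The template-matching idea used by automatic code generators can be sketched as follows. The instruction names (LOAD, ADD, SUB, MUL, STORE) and the single register R1 belong to an imaginary machine invented for this illustration, and the tuple form (dest, op, left, right) for intermediate statements is likewise an assumption.

```python
# Hypothetical instruction templates for an imaginary register machine.
TEMPLATES = {
    "+": ["LOAD R1, {left}", "ADD R1, {right}", "STORE {dest}, R1"],
    "-": ["LOAD R1, {left}", "SUB R1, {right}", "STORE {dest}, R1"],
    "*": ["LOAD R1, {left}", "MUL R1, {right}", "STORE {dest}, R1"],
}

def generate(code):
    """Replace each intermediate statement by its instantiated template."""
    target = []
    for dest, op, left, right in code:
        if op not in TEMPLATES:
            raise ValueError(f"no template for operator {op!r}")
        for line in TEMPLATES[op]:
            target.append(line.format(dest=dest, left=left, right=right))
    return target

# Intermediate code for: t1 = b * 60 ; t2 = a + t1
intermediate = [("t1", "*", "b", "60"), ("t2", "+", "a", "t1")]
for line in generate(intermediate):
    print(line)
```

Each intermediate statement expands to three target instructions here. A real code generator would also choose registers and avoid redundant loads and stores, which this sketch does not attempt.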
Features of compiler construction tools:
Lexical Analyzer Generator: This tool helps in generating the lexical analyzer or scanner of
the compiler. It takes as input a set of regular expressions that define the tokens of the language
being compiled and produces a program that reads the input source code and tokenizes it based
on these regular expressions.

Parser Generator: This tool helps in generating the parser of the compiler. It takes as input a
context-free grammar that defines the syntax of the language being compiled and produces a
program that parses the input tokens and builds an abstract syntax tree.
Code Generation Tools: These tools help in generating the target code for the compiler. They
take as input the abstract syntax tree produced by the parser and produce code that can be
executed on the target machine.
Optimization Tools: These tools help in optimizing the generated code for efficiency and
performance. They can perform various optimizations such as dead code elimination, loop
optimization, and register allocation.
Debugging Tools: These tools help in debugging the compiler itself or the programs that are
being compiled. They can provide debugging information such as symbol tables, call stacks, and
runtime errors.
Profiling Tools: These tools help in profiling the compiler or the compiled code to identify
performance bottlenecks and optimize the code accordingly.
Documentation Tools: These tools help in generating documentation for the compiler and the
programming language being compiled. They can generate documentation for the syntax,
semantics, and usage of the language.
Language Support: Compiler construction tools are designed to support a wide range of
programming languages, including high-level languages such as C++, Java, and Python, as well
as low-level languages such as assembly language.
Cross-Platform Support: Compiler construction tools may be designed to work on multiple
platforms, such as Windows, Mac, and Linux.
User Interface: Some compiler construction tools come with a user interface that makes it easier
for developers to work with the compiler and its associated tools.
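
Returning to the lexical analyzer generator described above: the finite automaton that such a tool derives from a regular expression can be sketched by hand. The table below is a hand-written automaton for identifiers (a letter or underscore followed by letters, digits, or underscores); the state names and the function names are assumptions for this illustration, and a generator such as Lex would build an equivalent table automatically.

```python
import string

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch in string.ascii_letters or ch == "_":
        return "letter"
    if ch in string.digits:
        return "digit"
    return "other"

# Transition table: (current state, input class) -> next state.
# Missing entries mean the automaton rejects the input.
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
}
ACCEPTING = {"ident"}

def accepts(text):
    """Run the automaton over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING

print(accepts("count1"))  # True
print(accepts("1count"))  # False
```

The automaton reads one character at a time and needs no backtracking, which is why generated scanners are fast.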
