
Compiling Process

The document describes the four main stages of compilation: preprocessing, compiling (parsing and translation), assembling, and linking. Preprocessing removes comments and interprets preprocessor directives. Parsing and translation break the code into a parse tree and generate assembly code. Assembling translates assembly code to object code. Linking combines object files and libraries, resolving external references, to create an executable program.

Uploaded by

rupeshk_p
Copyright
© Attribution Non-Commercial (BY-NC)

Compilation is generally split into four stages :-

1) Preprocessing

2) Compiling ( Parsing and Translation )

3) Assembling

4) Linking

The compilation Process

Brief C Compilation Model

On UNIX, all of these stages are driven by a single program, namely cc or gcc (or g++).
On Microsoft Windows, the MSVC++ front end works in essentially the same way.

C C++ Compilation Process 1


Compilers :-

gcc

gcc is the C compiler of choice on most UNIX systems. The program gcc itself is actually
just a front end that executes various other programs corresponding to each stage in the
compilation process. To get it to print out the commands it executes at each step, use
gcc -v.

cl.exe

cl.exe is the back end to MSVC++, which is the most prevalent development
environment in use on Microsoft Windows. You'll find it has many options that are quite
similar to those of gcc. Try running cl -? for details.

The problem with running cl.exe outside of MSVC++ is that none of your include paths or
library paths are set. Running the program vsvars32.bat in the CommonX/Tools directory
will give you a shell with all the appropriate environment variables set to compile from the
command line.

1) Preprocessing

The preprocessor runs in a single pass and is essentially a text-substitution tool. It
accepts source code as input and:

• removes comments from the source files.

• interprets the special preprocessor directives denoted by #. It places the
contents of all included files into your .c file and expands all macros into
inline C code.

Linux :- gcc -E

gcc -E runs only the preprocessor stage. You can add -o file to redirect to a
file.

Windows :- cl -E

Likewise, cl -E will also run only the preprocessor stage, printing out the results to
standard out.

Preprocessor directives are used to save typing and to increase the readability of the code.
However, C++ is designed to discourage much of the use of the preprocessor, since it can
cause subtle bugs. The preprocessed code is often written to an intermediate file with
the extension .i.

2) Parsing and Translating

Now the preprocessed source code is submitted to the parser. It parses the
preprocessed code, breaks it into small units, and organizes them into
a structure called a tree. In the expression “A + B” the elements ‘A’, ‘+’, and ‘B’
are leaves on the parse tree.

The code generator, or translator, then walks through the parse tree and
generates either assembly language code or machine code for the nodes of the
tree. Assembly code files generally have the extension .s. If the code
generator creates assembly code, the assembler must then be run.

3) Assembling

The assembly stage is where assembly code is translated almost directly to machine
instructions. The assembler creates object code. On a UNIX system you may see files with
a .o suffix (.OBJ on MSDOS) to indicate object code files.

Linux :- GNU

Linux uses the GNU assembler (as). It takes an assembly file in AT&T or Intel syntax
as input and generates a .o object file.

Windows :- MASM

MASM is the Microsoft assembler. It is executed by running ml.

4) Linking

The linker combines a list of object modules into an executable program that can
be loaded and run by the operating system.
If a source file references library functions or functions defined in other source files, the
link editor combines these functions (with main()) to create an executable file. References
to external variables and external functions are also resolved here. The linker also adds a
special object module to perform start-up activities.

The linker can search through special files called libraries in order to resolve all its
references. A library contains a collection of object modules in a single file. A
library is created and maintained by a program called a librarian.

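The output file typically has an extension such as .exe (a Windows executable) or .so
(a UNIX shared library); UNIX executables usually have no extension at all.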

Both Microsoft Windows and UNIX have similar linking procedures. Both systems support 3
styles of linking :-

Static Linking

Static linking means that for each library function your program calls, the machine
code for that function is actually included in the executable file. Function calls are
performed by calling the address of this code directly, the same way that functions
of your program are called.

Dynamic Linking

Dynamic linking means that the library exists in only one location on the entire
system, and the operating system's virtual memory system maps that single
location into your program's address space when your program loads. The address
at which this mapping occurs is not guaranteed; historically it stayed constant
once the executable had been built, but systems that randomize address-space
layout may change it on every run. Function calls are performed by making calls
to a section of the executable generated at build time, called the Procedure
Linkage Table, PLT, or jump table, which is essentially a large array of jump
instructions to the proper addresses in the mapped memory.

Runtime Linking

Runtime linking is linking that happens when a program requests a function from a
library it was not linked against at compile time. The library is mapped with
dlopen() under UNIX, and LoadLibrary() under Microsoft Windows, both of which
return a handle that is then passed to symbol resolution functions (dlsym() and
GetProcAddress()), which actually return a function pointer that may be called
directly from the program as if it were any normal function. This approach is often
used by applications to load user-specified plugin libraries with well-defined
initialization functions. Such initialization functions typically report further function
addresses to the program that loaded them.
