Compiling Process
1) Preprocessing
3) Assembling
4) Linking
In UNIX, all of these stages are driven by one program, namely cc or gcc (or g++).
Under Microsoft Windows, with the MSVC++ front end, the process is essentially the same.
gcc
gcc is the C compiler of choice on most UNIX systems. The program gcc itself is actually
just a front end that executes various other programs corresponding to each stage in the
compilation process. To get it to print out the commands it executes at each step, use
gcc -v.
cl.exe
cl.exe is the back end to MSVC++, which is the most prevalent development
environment in use on Microsoft Windows. You'll find it has many options that are quite
similar to those of gcc. Try running cl -? for details.
The problem with running cl.exe outside of MSVC++ is that none of your include paths or
library paths are set. Running the program vsvars32.bat in the CommonX/Tools directory
will give you a shell with all the appropriate environment variables set to compile from the
command line.
1) Preprocessing
The preprocessor runs in a single pass and is essentially a text substitution tool. It
accepts source code as input and is responsible for processing the directives that begin
with #, such as #include and #define.
Linux :- gcc -E
gcc -E runs only the preprocessor stage and prints the result to standard output. You
can add -o file to write it to a file instead.
Windows :- cl -E
Likewise, cl -E runs only the preprocessor stage, printing the results to standard
output.
Preprocessor directives are used to save typing and to increase the readability of the code.
However, C++ is meant to discourage much of the use of the preprocessor, since it can
cause subtle bugs. The preprocessed code is often written to an intermediate file with
the extension .i.
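One kind of subtle bug mentioned above comes from the fact that a macro is pasted in as text, so an argument with a side effect may be evaluated more than once. The following is only a small sketch; the names SQUARE_MACRO, square_func, next_value and the two counting helpers are made up for illustration.

```c
/* Textual substitution: the preprocessor pastes the argument in twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* A real function evaluates its argument exactly once. */
static inline int square_func(int x) { return x * x; }

static int calls = 0;                      /* how often next_value() has run */
static int next_value(void) { return ++calls; }

/* Returns how many times next_value() ran when passed through the macro. */
int calls_through_macro(void) {
    calls = 0;
    /* Expands to ((next_value()) * (next_value())). */
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
int calls_through_function(void) {
    calls = 0;
    (void)square_func(next_value());
    return calls;
}
```

Here calls_through_macro() reports two evaluations and calls_through_function() reports one. Running gcc -E on a file like this shows the expanded macro text directly.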
Now the preprocessed source code is submitted to the parser. It parses the
preprocessed code, breaks the source code into small units, and organizes them into
a structure called a tree. In the expression “A + B” the elements ‘A’, ‘+’, and ‘B’
are leaves on the parse tree.
The code generator, or translator, then walks through the parse tree and
generates either assembly language code or machine code for the nodes of the
tree.
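The tree walk can be sketched in a few lines of C. This is a simplification, not how gcc or cl.exe work internally: the tree below is an abstract syntax tree in which the operator is an interior node (a full parse tree would keep ‘+’ as a leaf under a grammar node), and the PUSH and ADD instructions belong to a made-up stack machine. The names Node, generate and compile_a_plus_b are illustrative only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One node of a tiny expression tree: an operand or a binary operator. */
typedef struct Node {
    char symbol;                /* 'A', 'B', or an operator such as '+' */
    const struct Node *left;    /* NULL for operands */
    const struct Node *right;   /* NULL for operands */
} Node;

/* Post-order walk: emit code for both operands, then for the operator.
   Appends made-up stack-machine instructions to out (capacity cap). */
void generate(const Node *n, char *out, size_t cap) {
    char line[16];
    assert(n != NULL);
    if (n->left == NULL && n->right == NULL) {
        snprintf(line, sizeof line, "PUSH %c\n", n->symbol);
    } else {
        generate(n->left, out, cap);
        generate(n->right, out, cap);
        snprintf(line, sizeof line, "%s\n", n->symbol == '+' ? "ADD" : "???");
    }
    strncat(out, line, cap - strlen(out) - 1);
}

/* Builds the tree for "A + B" and writes the generated listing to out. */
void compile_a_plus_b(char *out, size_t cap) {
    static const Node a = { 'A', NULL, NULL };
    static const Node b = { 'B', NULL, NULL };
    static const Node plus = { '+', &a, &b };
    out[0] = '\0';
    generate(&plus, out, cap);
}
```

Calling compile_a_plus_b with a 64-byte buffer yields the three lines PUSH A, PUSH B, ADD. A real code generator would emit instructions for the target processor instead.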
3) Assembling
The assembly stage is where assembly code is translated almost directly to machine
instructions. The assembler creates object code. On a UNIX system you may see files with
a .o suffix (.OBJ on MSDOS) to indicate object code files.
Linux :- GNU
Linux uses the GNU assembler. It takes an assembly file in AT&T or Intel syntax as
input and generates a .o object file.
Windows :- MASM
4) Linking
The linker combines a list of object modules into an executable program that can
be loaded and run by the operating system.
If a source file references library functions or functions defined in other source files, the
link editor combines these functions (with main()) to create an executable file. References
to external variables and external functions are also resolved here. The linker also adds a
special object module to perform start-up activities.
The linker can search through special files called libraries in order to resolve all its
references. A library contains a collection of object modules in a single file. A
library is created and maintained by a program called a librarian.
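As a small illustration of an external reference, the sketch below declares puts() by hand instead of including stdio.h. The compiler then knows only the function's type, the object file carries an unresolved reference to the symbol puts, and the linker resolves it against the C library. The names greeting and print_greeting are made up for this example.

```c
/* No #include <stdio.h>: puts() is declared by hand. The compiler records
   an unresolved reference to the symbol "puts" in the object file. */
extern int puts(const char *s);

/* Defined in this file, so no external reference is needed for it. */
static const char *greeting(void) { return "resolved at link time"; }

/* The linker matches the call to puts() with the definition in the
   C library. puts() returns a non-negative value on success. */
int print_greeting(void) {
    return puts(greeting());
}
```

If the declaration named a function that exists in no object module or library, compilation would still succeed and the error would appear only at the link step.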
Both Microsoft Windows and UNIX have similar linking procedures. Both systems support 3
styles of linking :-
Static Linking
Static linking means that for each function your program calls, the machine code for
that function is actually included in the executable file. Function calls are
performed by calling the address of this code directly, the same way that functions
of your program are called.
Dynamic Linking
Dynamic linking means that the library exists in only one location on the entire
system, and the operating system's virtual memory system will map that single
location into your program's address space when your program loads. The address
at which this mapping occurs is not always guaranteed, although it will remain constant
once the executable has been built. Function calls are performed by making calls
to a compile-time generated section of the executable, called the Procedure
Linkage Table (PLT).
Runtime Linking
Runtime linking is linking that happens when a program requests a function from a
library it was not linked against at compile time. The library is mapped with
dlopen() under UNIX, and LoadLibrary() under Microsoft Windows, both of which
return a handle that is then passed to symbol resolution functions (dlsym() and
GetProcAddress()), which actually return a function pointer that may be called
directly from the program as if it were any normal function. This approach is often
used by applications to load user-specified plugin libraries with well-defined
initialization functions. Such initialization functions typically report further function
addresses to the program that loaded them.
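The UNIX half of this sequence can be sketched as follows. It is a minimal example, assuming a POSIX system that provides dlfcn.h; the helper name call_from_library and the type name unary_fn are made up here, and older C libraries may need the program to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Function-pointer type for functions shaped like double cos(double). */
typedef double (*unary_fn)(double);

/* Maps the library `lib`, looks up `symbol`, calls it with x and stores
   the result. Returns 0 on success, -1 if the library or the symbol
   cannot be found. */
int call_from_library(const char *lib, const char *symbol,
                      double x, double *result) {
    void *handle = dlopen(lib, RTLD_LAZY);      /* map the library */
    if (handle == NULL) return -1;

    /* POSIX allows converting the void * from dlsym() to a function
       pointer, although ISO C itself does not define that conversion. */
    unary_fn fn = (unary_fn)dlsym(handle, symbol);
    if (fn == NULL) { dlclose(handle); return -1; }

    *result = fn(x);                            /* call it like any function */
    dlclose(handle);
    return 0;
}
```

A call such as call_from_library("libm.so.6", "cos", 0.0, &y) would look up cos in the math library; the file name libm.so.6 is an assumption that holds on typical glibc-based Linux systems and differs elsewhere. A plugin loader would keep the handle open for as long as the returned function pointers are in use.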