Compiling A C Program - Behind The Scenes
Below are the steps we use on an Ubuntu machine with the gcc compiler.
$ vi filename.c
$ gcc -Wall filename.c -o filename
The option -Wall enables most of the compiler's commonly used warning messages (despite its name, not every warning). This option is
recommended because it helps catch mistakes early.
The option -o is used to specify the output file name. If we do not use this option, then an output file named a.out is generated.
After compilation, an executable is generated, and we run it using the command below.
$ ./filename
The compiler converts a C program into an executable. A C program passes through four phases to become an executable:
1. Pre-processing
2. Compilation
3. Assembly
4. Linking
By executing the command below, we get all the intermediate files in the current directory along with the executable.
$ gcc -Wall -save-temps filename.c -o filename
Pre-processing
This is the first phase through which the source code passes. This phase includes:
Removal of Comments
Expansion of Macros
Expansion of the included files
Conditional compilation
The preprocessed output is stored in filename.i. Let's see what's inside filename.i using $ vi filename.i
In the preprocessed output, the source file is filled with a large amount of additional text, but our code is preserved at the end.
Analysis:
printf now contains a + b rather than add(a, b) because the macro has been expanded.
Comments are stripped off.
#include<stdio.h> is missing; instead we see lots of code. The header file has been expanded and included in our source file.
Compiling
The next step is to compile filename.i and produce an intermediate compiled output file, filename.s. This file contains assembly-level instructions.
Let's look through this file using $ vi filename.s
The file is in assembly language, which the assembler can understand.
Assembly
In this phase, filename.s is taken as input and turned into filename.o by the assembler. This file contains machine-level instructions. At this
phase, only the existing code is converted into machine language; function calls like printf() are not resolved. Let's view this file using
$ vi filename.o
Linking
This is the final phase, in which all function calls are linked with their definitions. The linker knows where all these functions are
implemented. The linker also does some extra work: it adds code to our program that is required when the program starts and ends.
For example, there is code that sets up the environment, such as passing command line arguments. This can be easily
verified by using $ size filename.o and $ size filename. These commands show how the output grows from an object file to
an executable file, because of the extra code that the linker adds to our program.
Note that GCC uses dynamic linking by default, so printf() is dynamically linked in the above program.