Computer Fundamental: For Programming
Compiling and Linking
The total process of going from source code files to an
executable might better be referred to as a build.
Creating an executable is a multistage process divided
into two components: compilation and linking.
Even if a program "compiles fine", it might still fail to
produce an executable because of errors during the
linking phase.
Compilation
Compilation refers to the processing of source code
files (.c, .cc, or .cpp) and the creation of an 'object' file.
This step doesn't create anything the user can actually
run.
The compiler merely produces the machine language
instructions that correspond to the source code file
that was compiled.
For instance, if you compile (but don't link) three
separate files, you will have three object files created as
output, each with the name <filename>.o or
<filename>.obj (the extension will depend on your
compiler).
Each of these files contains a translation of your
source code file into a machine language file -- but you
can't run them yet! You need to turn them into
executables your operating system can use. That's
where the linker comes in.
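As a minimal sketch of the compile-only step, assuming a Unix-like system where the C compiler is available as cc (the directory compile_demo and the three file names are made up for illustration):

```shell
# Hypothetical demo directory and sources; assumes a C compiler named `cc`.
mkdir -p compile_demo
(
  cd compile_demo
  printf 'int add(int a, int b) { return a + b; }\n' > add.c
  printf 'int sub(int a, int b) { return a - b; }\n' > sub.c
  printf 'int add(int, int);\nint sub(int, int);\nint main(void) { return sub(add(2, 3), 5); }\n' > main.c

  # -c means "compile only": one object file per source file, no executable.
  cc -c add.c sub.c main.c

  # Three object files now exist, but nothing that can be run yet.
  ls *.o
)
```

Each .c file yields its own .o file. No executable appears until the object files are linked.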
Linking
Linking refers to the creation of a single executable file
from multiple object files.
In this step, the linker commonly complains about
undefined functions (often main itself). During
compilation, if the compiler cannot find the definition of
a particular function, it simply assumes that the function
is defined in another file. If this isn't the case, the
compiler has no way of knowing -- it doesn't look at the
contents of more than one file at a time. The linker, on the
other hand, looks at all the object files together and tries to
find definitions for the functions that were left unresolved.
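The difference can be seen in a small sketch, again assuming a compiler named cc. The names link_demo, helper and app are hypothetical:

```shell
# Hypothetical demo: main.c calls helper(), which is defined in helper.c.
mkdir -p link_demo
(
  cd link_demo
  printf 'int helper(void);\nint main(void) { return helper(); }\n' > main.c
  printf 'int helper(void) { return 0; }\n' > helper.c

  cc -c main.c      # compiles: the declaration of helper() is enough
  cc -c helper.c

  # Linking main.o alone fails: no object file defines helper().
  if cc main.o -o app 2> link_error.txt; then
    echo "unexpected: link succeeded"
  else
    echo "link failed as expected; see link_error.txt"
  fi

  # Supplying the object file that defines helper() resolves the reference.
  cc main.o helper.o -o app
  ./app && echo "app ran and returned 0"
)
```

The compiler accepts main.c on the strength of the declaration alone. Only the linker, which sees every object file, can tell whether helper() really exists.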
Knowing the difference between the compilation
phase and the link phase can make it easier to hunt for
bugs. Compiler errors are usually syntactic in nature --
a missing semicolon, an extra parenthesis. Linking
errors usually have to do with missing or multiple
definitions. If the linker reports that a function or
variable is defined multiple times, that's a good
indication that two of your source code files define
the same function or variable.
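A sketch of the "multiple definition" case, assuming a compiler named cc (dup_demo, a.c, b.c and helper are illustrative names only):

```shell
# Hypothetical demo: helper() is defined in both a.c and b.c.
mkdir -p dup_demo
(
  cd dup_demo
  printf 'int helper(void) { return 1; }\nint main(void) { return 0; }\n' > a.c
  printf 'int helper(void) { return 2; }\n' > b.c

  cc -c a.c b.c     # each file compiles cleanly on its own

  # The clash only shows up when the two object files are combined.
  if cc a.o b.o -o app 2> link_error.txt; then
    echo "unexpected: link succeeded"
  else
    echo "link failed: helper() is defined in both a.o and b.o"
  fi
)
```

Neither file contains a syntax error, so the compiler has nothing to report. The problem is visible only to the linker.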
Configure, Make and Make install
configure; make; make install
Submitted by Willy on Saturday, November 22, 2003 - 12:55
Over and over I have heard people say that you just use the usual configure, make, make install sequence to get a program running. Unfortunately, most people using computers today have never used a
compiler or written a line of program code. With the advent of graphical user interfaces and applications builders, there are lots of serious programmers who have never done this.
What you have are three steps, each of which will use a whole host of programs to get a new program up and running. Running configure is relatively new compared with the use of make. But, each step has a
very distinct purpose. I am going to explain the second and third steps first, then come back to configure.
The make utility is embedded in UNIX history. It is designed to decrease a programmer's need to remember things. I guess that is actually the nice way of saying it decreases a programmer's need to
document. In any case, the idea is that if you establish a set of rules to create a program in a format make understands, you don't have to remember them again.
To make this even easier, the make utility has a set of built-in rules so you only need to tell it what new things it needs to know to build your particular utility. For example, if you typed in make love, make
would first look for some new rules from you. If you didn't supply it any then it would look at its built-in rules. One of those built-in rules tells make that it can run the linker (ld) on a program name ending in
.o to produce the executable program.
So, make would look for a file named love.o. But, it wouldn't stop there. Even if it found the .o file, it has some other rules that tell it to make sure the .o file is up to date. In other words, newer than the source
program. The most common source program on Linux systems is written in C and its file name ends in .c.
If make finds the .c file (love.c in our example) as well as the .o file, it would check their timestamps to make sure the .o was newer. If it was not newer or did not exist, it would use another built-in rule to
build a new .o from the .c (using the C compiler). This same type of situation exists for other programming languages. The end result, in any case, is that when make is done, assuming it can find the right
pieces, the executable program will be built and up to date.
The old UNIX joke, by the way, is what early versions of make said when it could not find the necessary files. In the example above, if there was no love.o, love.c or any other source format, the program would
have said:
make: don't know how to make love. Stop.
Getting back to the task at hand, the default file for additional rules is Makefile in the current directory. If you have some source files for a program and there is a Makefile there, take a look. It is just text.
The lines that have a word followed by a colon are targets. That is, these are words you can type following the make command name to do various things. If you just type make with no target, the first target
will be executed.
What you will likely see at the beginning of most Makefile files are what look like some assignment statements. That is, lines with a couple of fields with an equal sign between them. Surprise, that is what they
are. They set internal variables in make. Common things to set are the location of the C compiler (yes, there is a default), version numbers of the program and such.
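To make this concrete, here is a sketch that writes a small Makefile and exercises its targets. It assumes make and cc are installed. The names mk_demo, hello and VERSION are invented for the example. printf is used so that the tab required before each command line is written explicitly as \t:

```shell
# Hypothetical demo: a Makefile with two variables and four targets.
mkdir -p mk_demo
(
  cd mk_demo
  printf 'int main(void) { return 0; }\n' > hello.c

  # Variable assignments first, then the targets (word followed by a colon).
  printf 'CC = cc\nVERSION = 1.0\n\n' > Makefile
  printf 'hello: hello.o\n\t$(CC) -o hello hello.o\n\n' >> Makefile
  printf 'hello.o: hello.c\n\t$(CC) -c hello.c\n\n' >> Makefile
  printf 'version:\n\t@echo hello version $(VERSION)\n\n' >> Makefile
  printf 'clean:\n\trm -f hello hello.o\n' >> Makefile

  make            # no target given: builds the first target, hello
  make version    # a named target
  make clean      # another named target; removes the build products
)
```

Typing make alone builds hello because it is the first target in the file. version and clean run only when asked for by name.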
This now brings us back to configure. On different systems, the C compiler might be in a different place, you might be using ZSH instead of BASH as your shell, the program might need to know your host
name, it might use a dbm library and need to know if the system has gdbm or ndbm, and a whole bunch of other things. You used to do this configuring by editing Makefile. That was another pain for the programmer,
and it also meant that any time you wanted to install software on a new system you needed to do a complete inventory of what was where.
As more and more software became available and more and more POSIX-compliant platforms appeared, this got harder and harder. This is where configure comes in. It is a shell script (generally generated by
GNU Autoconf) that goes out and looks for software and even tries various things to see what works. It then takes its instructions from Makefile.in and builds a Makefile (and possibly some other files) that works
on the current system.
Background work done, let me put the pieces together.
You run configure (you usually have to type ./configure as most people don't have the current directory in their search path). This builds a new Makefile.
Type make. This builds the program. That is, make is executed, looks for the first target in Makefile and does what the instructions say. The expected end result is an executable
program.
Now, as root, type make install. This again invokes make; make finds the target install in Makefile and follows the directions to install the program.
This is a very simplified explanation but, in most cases, this is what you need to know. With most programs, there will be a file named INSTALL that contains installation instructions that will fill you in on
other considerations. For example, it is common to supply some options to the configure command to change the final location of the executable program. There are also other make targets, such as clean, which
removes unneeded files after an install, and, in some cases, test, which allows you to test the software between the make and make install steps.
Installing any program
Details :
Generally you get Linux software in the tarball format (.tgz). This file has to be
unpacked into a directory using the tar command. If you download a new tarball named
game.tgz, you would run tar on it to extract its contents.
This typically creates a directory within the current directory and unpacks all the files into that new
directory. Once this is complete, the installation instructions ask you to execute the three (now
famous) commands: configure, make and make install.
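A typical extraction, assuming a gzip-compressed tarball and a tar that accepts the usual x, z and f options, might look like the sketch below. A stand-in game.tgz with made-up contents is created first so the example is self-contained:

```shell
# Build a stand-in game.tgz (hypothetical contents) to have something to extract.
mkdir -p tar_demo/src/game
echo "placeholder" > tar_demo/src/game/README
tar -czf tar_demo/game.tgz -C tar_demo/src game

(
  cd tar_demo
  # x = extract, z = decompress with gzip, f = read from the named file.
  tar -xzf game.tgz

  # The tarball unpacked into its own directory.
  ls game
)
```

Adding v (as in tar -xzvf) lists each file as it is extracted.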
Each software package comes with a few files that exist solely for installation. One of
them is the configure script. The user has to run the following command at the prompt:
$ ./configure
The above command makes the shell run the script named 'configure', which
exists in the current directory. The configure script consists of many
lines that check details about the machine on which the
software is going to be installed. This script checks for lots of dependencies on
your system. For the particular software to work properly, it may require a
lot of things to already exist on your machine. When you run the
configure script you will see a lot of output on the screen, each line being some
sort of check followed by its result (often yes or no). If any of the major
requirements are missing on your system, the configure script exits and
you cannot proceed with the installation until you get those required things.
The main job of the configure script is to create a 'Makefile'. This is a very
important file for the installation process. Depending on the results of the
tests (checks) that the configure script performed, it writes down the
various steps that need to be taken (while compiling the software) in the file
named Makefile.
If you get no errors and the configure script runs successfully (if there is an
error, the last few lines of the output will state it clearly), then
you can proceed with the next command, which is
$ make
'make' is a utility that exists on almost all Unix systems. The make utility normally
reads a file named Makefile in the same directory in which you run make. As we have
seen, the configure script's main job was to create that Makefile for the
make utility. (Sometimes the file is named makefile instead.)
make uses the directions present in the Makefile to build the software.
The Makefile indicates the sequence that make must follow to build the various components /
sub-programs of your software. The sequence depends on the way the software is designed
as well as many other factors.
The Makefile has a lot of labels, called targets (names for the things that can be built). Each
target lists the other targets or files it depends on, followed by the commands that build it.
make does not jump from section to section; before building a target it first builds whatever
that target depends on.
Basically the make utility compiles all your program code and creates the executables. Building
one part of the program might require some other part of the code to be
ready already, and this is what the Makefile records. It sets the sequence of events so that the
build does not fail because of missing dependencies.
If make ran successfully then you are almost done with the installation. Only the last step
remains which is
$ make install
As indicated before, make uses the file named Makefile in the same directory.
When you run make without any parameters, it builds the first target in the Makefile,
first building anything that target depends on, as per the rules defined within the Makefile
(this is why targets name their dependencies). But when you run make with
install as the parameter, the make utility searches for a target named install
within the Makefile and executes only that target and whatever it depends on.
The install target is simply the part where the executables and other
required files created during the last step (i.e. make) are copied into their
final directories on your machine. E.g. the executable that the user
runs may be copied to /usr/local/bin so that all users are able to run the
software. Similarly, all the other files are copied to the standard directories
in Linux. Remember that when you ran make, all the executables were created
in the directory where you had unpacked your original tarball. So
when you run make install, these executables are copied to the final directories.
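A stripped-down install target might look like the following sketch. Everything in it is hypothetical: the script named game stands in for an executable that make would have built, and PREFIX is an illustrative variable that is overridden on the command line so the demo copies into a local stage directory instead of /usr/local and needs no root access:

```shell
# Hypothetical demo: an install target that copies one file under PREFIX.
mkdir -p inst_demo
(
  cd inst_demo

  # Stand-in for the executable that the make step would have produced.
  printf '#!/bin/sh\necho "hello from game"\n' > game
  chmod +x game

  printf 'PREFIX = /usr/local\n\n' > Makefile
  printf 'install:\n\tmkdir -p "$(PREFIX)/bin"\n\tcp game "$(PREFIX)/bin/game"\n' >> Makefile

  # Override PREFIX so the files land in ./stage rather than /usr/local.
  make install PREFIX="$PWD/stage"

  # The installed copy runs from its final location.
  stage/bin/game
)
```

With the default PREFIX the same target would copy the file to /usr/local/bin, which is why that step is usually run as root.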
That's it! Now the installation process should be clear to you. You will surely feel
more at home when you begin your next software installation.