Compiling Programs
INSTALLING PROGRAMS ON THE CLUSTER
BUILDING COMPUTER PROGRAMS
C program (a simple text file written in the C programming language):

    #include <stdio.h>

    int main()
    {
        printf("Hello World!\n");
        return 0;
    }

Binary executable file (a set of CPU instructions encoded in 0's and 1's):

    0101010100001010101001
    0101110101010010100000
    0101110101001011000111
    0110101010101010101010
    1010001010101010111111
    0010111101110000111010
    1010100111101010111101
    0101010000110101010101

Output when the executable runs: Hello World!
Each stage translates the program into a lower-level language:
• Preprocessor: produces expanded source code.
• Compiler: produces assembly code.
• Assembler: produces object code.
• Linker: combines object code with external libraries into the binary executable, which is machine-readable (i.e. can be executed by the CPU).

Not all languages are compiled languages! The process above applies to programs written in C, C++, and Fortran.
PREPROCESSOR
• Expands or removes special lines of code prior to compilation.
• In C, preprocessor directives begin with the # symbol and are NOT considered C code.

    #include <stdio.h>
    #include "myHeader.h"

• Copies contents of stdio.h into file.

    #define PI 3.1415

• Replaces all instances of PI within file with 3.1415.

    #ifndef FOO_H
    #define FOO_H
    .
    .
    void myFunc(int);
    #endif

• Prevents expanding multiple copies of the same header file by defining a unique "macro" for each header file.
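The three directives above can be combined in one small file. The following is a minimal sketch, not taken from the slides: the guard name CIRCLE_H and the functions circle_area and print_area are hypothetical names chosen for illustration. Running the file through the preprocessor alone (for example with gcc -E) should show the expanded result.

```c
#include <assert.h>
#include <stdio.h>   /* replaced by the full contents of stdio.h */

#ifndef CIRCLE_H     /* first pass: CIRCLE_H is not defined yet, so keep going */
#define CIRCLE_H

#define PI 3.1415    /* each later PI token is replaced with 3.1415 */
double circle_area(double radius);

#endif /* CIRCLE_H */

#ifndef CIRCLE_H     /* second pass: CIRCLE_H is defined, so this block is removed */
#define PI 3.0
#endif

/* After preprocessing, the compiler sees 3.1415 * radius * radius. */
double circle_area(double radius)
{
    assert(radius >= 0.0);
    return PI * radius * radius;
}

/* Uses printf, which is declared by the included stdio.h. */
void print_area(double radius)
{
    printf("area = %f\n", circle_area(radius));
}
```

Calling print_area(1.0) from a main function would exercise all three directives; the second guarded block never reaches the compiler, which is why PI keeps its first definition.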
COMPILER
C source:

    #include <stdio.h>

    int main()
    {
        printf("Hello World!\n");
        return 0;
    }

Assembly code produced by the compiler (excerpt):

    main:
        .cfi_startproc
        pushq %rbp
        .cfi_def_cfa_offset 16
        movq %rsp, %rbp
        .
        .

• Portability is an issue with compiled languages since assembly language contains instructions that are specific to a CPU's architecture.
• Example instruction set architectures (ISAs) are x86, x86_64, and ARM. Most machines in HPC today support x86_64.
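Because each binary targets one ISA, the compiler knows the target at build time and exposes it through predefined macros. The sketch below relies on the macros that GCC and Clang define (__x86_64__, __i386__, __aarch64__, __arm__); other compilers use different names, and the functions target_isa and describe_target are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The answer is fixed when the file is compiled, not when it runs. */
const char *target_isa(void)
{
#if defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "x86";
#elif defined(__aarch64__) || defined(__arm__)
    return "ARM";
#else
    return "unknown";
#endif
}

/* Writes a short description into buf and returns snprintf's result. */
int describe_target(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "compiled for %s", target_isa());
}
```

Only one branch of the #if chain survives preprocessing, so the same source file yields a different binary on each architecture.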
ASSEMBLER AND LINKER
• Assembler: converts assembly code into object code.
• Linker: stitches together all object files (including any external libraries) into the final binary executable file.

    icc -o hello -xHost hello.c

• Use Intel's C compiler to aggressively optimize for the specific CPU microarchitecture of the machine doing the compiling.
• Using the -xHost option leads to poor binary portability. Only use this option if you are sure the binary will always be executed on a specific processor type.
EXTERNAL LIBRARIES (1/2)
• Statically Linked Library: naming convention: liblibraryname.a (e.g. libcurl.a is a static curl library)
• Dynamically Linked Library: naming convention: liblibraryname.so (e.g. libcurl.so is a dynamic curl library)
• Only the name of the library is copied into the final executable, not any actual code.
• At run time, the dynamic loader searches the directories in LD_LIBRARY_PATH and the standard paths for the library.
• Requires less memory and disk space; multiple binaries can share the same dynamically linked library at once.
• By default, a linker looks for a dynamic library rather than a static one.
• You do NOT need to specify the location of a library at build time if it's in a standard location (/lib64, /usr/lib64, /lib, /usr/lib). For example, libc.so lives in /lib64.
EXTERNAL LIBRARIES (2/2)
• In this example, two libraries (gsl and gslcblas) are linked to the final executable.
• Alternatively, use LIBRARY_PATH and C_INCLUDE_PATH to specify locations of libraries and headers.
• Check the LD_LIBRARY_PATH and output of the ldd command before running the program:
• LD_LIBRARY_PATH lists the directories that the dynamic linker searches for dynamically linked libraries at run time.
• Run ldd ./my_prog to see the dynamically linked libraries needed by an executable and the current path to each library.
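LD_LIBRARY_PATH is a colon-separated list of directories, which can be hard to read when it grows long. The following sketch only splits such a list and prints one entry per line; it does not reproduce the dynamic loader's full search (which also consults other sources such as the standard paths). The functions list_search_dirs and list_ld_library_path are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print each entry of a colon-separated directory list and return the
   number of entries. A NULL or empty list yields 0. */
size_t list_search_dirs(const char *path_list, FILE *out)
{
    assert(out != NULL);
    if (path_list == NULL || *path_list == '\0')
        return 0;

    size_t count = 0;
    const char *start = path_list;
    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        fprintf(out, "  [%zu] %.*s\n", count, (int)len, start);
        count++;
        if (end == NULL)
            break;
        start = end + 1;   /* continue after the colon */
    }
    return count;
}

/* Convenience wrapper: inspect the variable in the current environment. */
size_t list_ld_library_path(void)
{
    return list_search_dirs(getenv("LD_LIBRARY_PATH"), stdout);
}
```

Calling list_ld_library_path() before launching a program gives a quick view of which directories will be consulted first; ldd remains the authoritative check of what actually gets resolved.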
PORTABILITY
• Many different compilers exist, but not all compilers are created equal!
• GCC, Intel, Absoft, Portland Group (PGI), Microsoft Visual Studio (MSVS), to name a few.
• Some are free, others are not!
• It is not unusual (especially with large projects) for compiler A to build a program while compiler B fails.
• Error messages and levels of verbosity can also vary widely.
• This is especially true in scientific and high-performance computing involving a lot of numerical processing.
• Compiler optimizations are especially tricky. Sometimes the compiler needs help from the programmer (e.g. refactoring code so the compiler can make easier/safer decisions about when to optimize code).
• Some compilers (especially Intel's) tend to outperform their counterparts because they have more intimate/nuanced information about CPU architectures (which are often Intel-based!).
AUTOMATING THE PROCESS: MAKEFILES (1/3)
• The Make tool allows a programmer to define the dependencies between sets of files in a programming project, and sets of rules for how to (most often) build the project.
• Make provides a way to compile files separately and describes the dependencies among the project files.
• Default file is called Makefile or makefile.
• Allows the build process to be broken up into discrete steps, if desired. For example, separate rules can be defined for (i) compiling+assembling, (ii) linking, (iii) testing, and (iv) installing code.
• By defining dependencies, you can avoid unnecessarily rebuilding certain files. For example, in the dependency graph below, project2.c does not need to be re-compiled if changes have been made to project1.c.

Dependency graph (figure):
• project1.o depends on project1.c and common.h
• project2.o depends on project2.c and common.h
• executable depends on project1.o and project2.o
AUTOMATING THE PROCESS: MAKEFILES (2/3)
• Make analyzes the timestamp of a target's last modification and compares it to that of the target's dependencies to decide whether to execute the command(s) defined for that target's rule.
make test
• Generally runs unit tests.
make install
• Generally installs the software.
• "make install" generally fails with "permission denied" errors if you do not have administrative privileges or have not configured the build to install into a local directory.
AUTOMATING THE PROCESS: CONFIGURE SCRIPTS (1/2)
• A configure script is an executable file responsible for building a Makefile for a project.
• The dependencies available on a given system are difficult to predict and subject to constant change, so writing a Makefile by hand for each system (or even a subset of representative systems) would be an enormous challenge and an administrative hassle.
• Instead, a configure script can be used to scan a system in search of all the needed dependencies (including versions of software, locations of external libraries), and build a Makefile that is specific to that system.
• Configure scripts are indispensable for large projects, especially where the number of dependencies is large and difficult to manage/track.
• Alternatives to the configure script exist (CMake being the most common).
./configure --help
• Show command line options.
MAKE AND CONFIGURE MACROS
• There are a number of "macros" (think of them as variables) that have standard meanings in Make and configure scripts. These macros can generally be exported as environment variables to customize your build.
• CC: C compiler command (e.g. gcc)
• CFLAGS: C compiler flags (e.g. -Wall -O3)
• CPP: C preprocessor command (e.g. gcc -E)
• CXX: C++ compiler command (e.g. g++)
• CXXFLAGS: C++ compiler flags (e.g. -Wall -O3)
• LDFLAGS: Linker flags (e.g. -L/path/to/lib)
• LIBS: Library names (e.g. -lcurl)
• FC: Fortran compiler command (e.g. gfortran)
• FFLAGS: Fortran compiler flags (e.g. -O3)
• MPICC: MPI C compiler wrapper command (e.g. mpicc)
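The idea of a macro that falls back to a default unless the environment overrides it can be sketched as follows. This is only an illustration of the lookup pattern, not code from Make or from any configure script; setting_or_default is a hypothetical helper, and treating an empty value as unset is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of the environment variable `name`, or `fallback`
   when the variable is unset or empty. */
const char *setting_or_default(const char *name, const char *fallback)
{
    assert(name != NULL && fallback != NULL);
    const char *value = getenv(name);
    return (value != NULL && *value != '\0') ? value : fallback;
}
```

With this helper, setting_or_default("CC", "cc") yields "gcc" after the user runs export CC=gcc in the shell, and "cc" (GNU Make's built-in default for CC) otherwise.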
COMPILED VS. INTERPRETED LANGUAGES
Compiled Language
• Faster execution time
• Slower development time
• Less portable
• C, C++, Fortran
Interpreted Language
• Slower execution time
• Faster development time
• More portable
• Python, Matlab, R, Ruby, Julia

The tradeoffs listed above are not universally true but in general apply.

Many popular modules/packages (e.g. NumPy, SciPy) loaded from interpreted languages are compiled shared object files and offer comparable performance to pure compiled languages.