CPP Primer
Andrew Koenig
Bjarne Stroustrup
AT&T Research
Murray Hill, New Jersey 07974, USA
ABSTRACT
Over the past decade, C++ has become the most commonly used language for intro-
ducing object-oriented programming and other abstraction techniques into production
software. During this period, C++ has evolved to meet the challenges of production sys-
tems. In this, C++ differs radically from languages that come primarily from academic or
research environments, and from less widely used languages. Although C++ has also
been extensively used in academia and for research, its evolution was driven primarily by
feedback from its use in industrial applications.
In this paper, we focus on three design areas key to successful C++ use. In doing so, we
explore fundamental C++ concepts and facilities and present distinctive C++ design and
programming styles that have evolved to cope with the stringent demands of everyday
systems building. First we explore C++’s support for concrete data types and containers
and give examples of how the C++ generic programming facilities, together with well-
designed libraries, can yield flexibility and economy of expression. Next we examine
some uses of class hierarchies, touching on issues including encapsulation, interface
design, efficiency, and maintainability. Finally, we note that languages succeed for rea-
sons that are not entirely technical and review the background for C++’s success.
This paper is not a C++ tutorial. However, it does include enough code examples and
supporting commentary that readers familiar with programming languages in general but
unfamiliar with C++ can grasp the key C++ language constructs and programming tech-
niques.
1 Introduction
C++ was designed to combine the strengths of C as a systems programming language with Simula’s facili-
ties for organizing programs. During the 60’s and 70’s, the key concepts, techniques, and language features
for what came to be known as ‘‘object-oriented programming’’ and ‘‘object-oriented design’’ had devel-
oped in connection with the Simula language. During the 80’s, C’s close-to-the-machine semantics gave it
the edge in run-time and space efficiency, portability, and flexibility that established C as the dominant sys-
tems programming language.
Thus C++ started from a sound theoretical and practical basis. Feedback from widespread use guided its
further evolution. C++ supports the design and efficient implementation of elegant programs from toy
examples to very large systems.
Over the years, distinct C++ styles of design and programming have evolved. This evolution has pro-
gressed to the point where we can identify and explore key notions and techniques.
declares a function taking an integer argument and returning a reference to a Date. The type void is used
to specify that a function doesn’t return a value.
The set of operations is fairly typical for a user-defined type:
[1] A constructor specifying how objects/variables of the type are to be initialized. In this case, a Date
can be created given three integers representing the year, month, and day.
[2] A set of functions allowing a user to examine a Date. In this case, functions returning integers
representing the year, month, and day are provided, and also two functions returning character string
representations of the Date. The char[] is a C-style array of characters; string is the C++
standard library string type. These functions are marked const to indicate that they don’t modify
the state of the object/variable they are called for.
[3] A set of functions allowing the user to manipulate Dates without actually having to know the
details of the representation or fiddle with the intricacies of the semantics.
[4] In addition to the explicitly declared operations, Dates can be freely copied.
Declared properties are checked at compile time. For example, if a function not declared in class Date tries
to use a private member, that function will cause the compiler to issue an error message. Similarly, the
compiler checks that const member functions do not modify the state of their object.
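The properties listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the member names follow the text, but the representation (three integers d, m, y), the leapyear() helper, and the free function years_between() are hypothetical and are not the paper's own declaration.

```cpp
#include <cassert>

// Hypothetical, simplified Date: names mirror the text, bodies are assumptions.
class Date {
public:
    enum Month { jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };

    // Constructor: specifies how a Date is initialized (day, month, year).
    Date(int dd, Month mm, int yy) : d(dd), m(mm), y(yy) {}

    // Examining functions: const, so they cannot modify the object.
    int day() const { return d; }
    Month month() const { return m; }
    int year() const { return y; }

    // Manipulating function: returns *this so that calls can be chained.
    Date& add_year(int n)
    {
        if (d == 29 && m == feb && !leapyear(y + n)) {
            d = 1;
            m = mar;
        }
        y += n;
        return *this;
    }

private:
    static bool leapyear(int yy)
    {
        return (yy % 4 == 0 && yy % 100 != 0) || yy % 400 == 0;
    }

    int d;      // representation: accessible only to members of Date
    Month m;
    int y;
};

// A function that is not declared in Date must use the public interface.
int years_between(const Date& a, const Date& b)
{
    // a.add_year(1);      // would not compile: a is const, add_year is not
    // return b.y - a.y;   // would not compile: y is private
    return b.year() - a.year();
}

void demo_date()
{
    Date d(29, Date::feb, 1996);
    d.add_year(1).add_year(1);          // chaining relies on the returned reference
    assert(d.day() == 1 && d.month() == Date::mar && d.year() == 1998);
}
```

The two commented-out lines in years_between() are the kind of misuse that the compiler rejects.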
Here is a small—and contrived—example of how Dates can be used:
void f(Date& today)
{
    Date lvb_day = Date(16,dec,today.year());
    if (midnight()) today.add_day(1);
    // ...
}
Also, if we ever need to change the representation of Date it is useful that the representation be used only
by a designated set of functions. For example, if we decided to try representing a Date as the number of
days before or after January 1 year 1 A.D. then only the functions declared in the declaration of Date
would need changing.
int max;
switch (mm) {
case feb:
max = 28+leapyear(yy);
break;
case apr: case jun: case sep: case nov:
max = 30;
break;
case jan: case mar: case may: case jul:
case aug: case oct: case dec:
max = 31;
break;
}
Date& Date::add_year(int n)
{
if (d==29 && m==feb && !leapyear(y+n)) {
d = 1;
m = mar;
}
y += n;
return *this;
}
The notation *this refers to the object for which a member function is invoked. It is equivalent to
Simula’s THIS and Smalltalk’s self. Returning a self-reference is a useful convention that allows
However, this still leaves the association implicit as far as the C++ language rules are concerned, and the
names pollute the global name space. A more recent approach is to enclose the class and its helper func-
tions in a namespace:
namespace Chrono { // facilities for dealing with time
class Date {
// ...
};
// ...
}
Names from a namespace can be used by explicitly qualifying them with the namespace name or by intro-
ducing an alias. For example:
void f(Chrono::Date d)
{
Chrono::Date next_sunday = Chrono::next_saturday(d).add_day(1);
}
or
using Chrono::Date; // introduce alias ‘‘Date’’
void f(Date d)
{
    using Chrono::next_saturday; // introduce alias ‘‘next_saturday’’
    Date next_sunday = next_saturday(d).add_day(1);
}
If detailed control of names is not required, all the names from a namespace can be made available by a sin-
gle declaration:
using namespace Chrono; // make all names from Chrono available
void f(Date d)
{
Date next_sunday = next_saturday(d).add_day(1);
}
Using the more discriminating ways of referring to names in a namespace is less likely to lead to name
clashes and surprises [Stroustrup,1994,§17].
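The three ways of referring to names from a namespace can be compared side by side in a compilable sketch. The contents of Chrono here are assumptions made for illustration only: Date is reduced to a single day_number, and next_saturday() assumes that day_number % 7 == 6 denotes a Saturday.

```cpp
#include <cassert>

namespace Chrono {   // hypothetical miniature of the namespace in the text
    struct Date {
        int day_number;   // assumed representation: days since some epoch
        Date& add_day(int n) { day_number += n; return *this; }
    };

    // Helper function associated with Date by living in the same namespace.
    // Assumed convention: day_number % 7 == 6 is a Saturday.
    inline Date next_saturday(Date d)
    {
        do { d.add_day(1); } while (d.day_number % 7 != 6);
        return d;
    }
}

// Style 1: explicit qualification of every name.
int sunday_qualified(int today)
{
    Chrono::Date d = { today };
    return Chrono::next_saturday(d).add_day(1).day_number;
}

// Style 2: using-declarations introduce exactly the names that are needed.
int sunday_using_declarations(int today)
{
    using Chrono::Date;
    using Chrono::next_saturday;
    Date d = { today };
    return next_saturday(d).add_day(1).day_number;
}

// Style 3: a using-directive makes every name from Chrono available.
int sunday_using_directive(int today)
{
    using namespace Chrono;
    Date d = { today };
    return next_saturday(d).add_day(1).day_number;
}

void demo_namespaces()
{
    assert(sunday_qualified(0) == 7);
    assert(sunday_using_declarations(0) == 7);
    assert(sunday_using_directive(0) == 7);
}
```

All three functions compute the same result; they differ only in how many names they bring into scope.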
class Date_and_time {
private:
Date d;
Time t;
public:
Date_and_time(Date d, Time t);
Date_and_time(int d, Date::Month m, int y, Time t);
// ...
};
The derived class mechanism described in §5.2 can be used to define new types from a concrete class by
describing the desired differences, but that implementation technique is beyond the scope of this paper; see
[Stroustrup,1994,§2.9.1].
A concrete class such as Date needs no hidden overhead in time or space. The size of a concrete type
is known at compile time so that objects can be allocated on the run-time stack (that is, without free-store
operations). The layout of each object is known at compile time so that inlining of operations is trivially
achieved. Similarly, layout compatibility with other languages, such as C and Fortran, also comes without
special effort.
A good set of such types can provide a concrete foundation for applications. We feel that many pro-
gramming languages have neglected concrete types. Lack of support for ‘‘small efficient types’’ can lead
to gross run-time and space inefficiencies when overly general and expensive mechanisms are used. Alter-
natively, it can lead to obscure programs and wasted time when programmers are forced to discard expen-
sive abstraction mechanisms in favor of direct manipulation of data structures or lower-level languages.
/* use squares */
}
creates an array containing n integer values with indices 0 through n-1 and sets each element to the square
of its index. Unfortunately, the size of such an array, in this case n, must be a compile-time constant.
In C, a variable-length array is usually simulated using the library functions malloc and free that
deal in raw memory. For example:
void f2(n) int n; /* a C function */
{
int *squares = malloc(n * sizeof(int));
int i;
/* use squares */
free(squares);
}
To make this work, C supports a form of type punning: it is possible to take an array of one type and treat
the memory it occupies as if it really contained objects of another type. This makes it possible to assign
the result from malloc to squares. C’s definition of indexing is what makes it possible to refer to
squares[i] as if it were an element of an array. Probably the greatest inconvenience of using C this
way is the requirement to free the memory explicitly when done with it.
Now let us look at how C++ handles variable-length arrays. As with built-in arrays, C++ library arrays
are one-dimensional. Multi-dimensional arrays are most commonly used for numerical computation, which
is supported by a separate numerical library. A one-dimensional array is called a vector, and is used
something like this:
void f3(int n) // C++ function
{
vector<int> squares(n);
// use squares
}
This is not much more difficult than using a built-in array: As for the built-in arrays, there is no special
requirement to free the memory used by squares; that memory is automatically freed when the variable
goes out of scope†.
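The contrast between the two styles can be seen in one compilable C++ sketch. The function names and the idea of summing the squares are assumptions added so that each function has an observable result; the malloc version needs a cast because it is compiled as C++ here.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// C-style variable-length array: raw memory from malloc, explicit free.
long sum_of_squares_malloc(int n)
{
    if (n <= 0) return 0;
    int* squares = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (squares == nullptr) return -1;   // assumed convention for allocation failure
    for (int i = 0; i < n; ++i) squares[i] = i * i;
    long sum = 0;
    for (int i = 0; i < n; ++i) sum += squares[i];
    std::free(squares);                  // omitting this call would leak the memory
    return sum;
}

// Library vector: the size is a run-time value and cleanup is automatic.
long sum_of_squares_vector(int n)
{
    if (n <= 0) return 0;
    std::vector<int> squares(n);
    for (int i = 0; i < n; ++i) squares[i] = i * i;
    long sum = 0;
    for (int i = 0; i < n; ++i) sum += squares[i];
    return sum;                          // squares releases its memory on scope exit
}

void demo_sum_of_squares()
{
    assert(sum_of_squares_malloc(5) == 30);   // 0 + 1 + 4 + 9 + 16
    assert(sum_of_squares_vector(5) == 30);
}
```

The vector version has no counterpart to the free() call, which is the inconvenience the text singles out.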
Making this work for an array size that is not a compile time constant and for an array that is a user-
defined type requires the ability
__________________
† Because f3() uses free store and f1() uses the stack, f3() incurs a fixed allocation overhead, which depends, among other
things, on how fast the system’s memory allocator is and how much trouble the compiler takes to optimize uses of the standard library.
In the (worst and unrealistic) case where use squares was nothing, with a compiler that uses the allocator that comes with the machine
and no special optimization, we measured the overhead to be between a factor of 2 and a factor of 3 depending on the size of the
vector. On the other hand, when use squares was printing out the vector there was no measurable performance difference. We timed
an intermediate example, where use squares was to take the square root of each element. In this case, the overhead varied from 5% to
58% depending on the size of the vector. We leave it for the reader to decide in which situations the overhead might be significant.
There is no significant overhead in f3() compared to using the C-style variable length array in f2().
int k = 0;
while (k<n && squares[k]<=1000) ++k;
In addition to missing the part that resizes squares, this code is tedious and error-prone. What we really
want to do is two things:
[1] find the first element, if any, of squares that is greater than 1000, and
[2] erase the elements of squares before the one we found.
The library offers ways to do that directly. First we write a predicate function, which checks if its argu-
ment is greater than 1000:
bool bigger1000(int n) { return n > 1000; }
Next we use a standard library function called find_if to locate the first element for which bigger1000 is
true:
void g2(vector<int>& squares)
{
vector<int>::iterator vi =
find_if(squares.begin(), squares.end(), bigger1000);
// resize squares
}
This last example introduces three things we haven’t seen before:
[1] the library defines a type vector<int>::iterator that can be used to mark a location in a
vector<int>;
[2] every vector has a pair of member functions called begin and end, which return iterators that
identify the initial element and a point one past the last element of the vector; and
[3] the library function find_if locates the first element between the points identified by two iterators
that satisfies the property given by its third argument. In this case, the third argument is a pointer to
the function bigger1000(); find_if calls through that pointer to check each element.
After calling find_if, the vector iterator vi will identify either the first element of squares that is
larger than 1000 or a point one past the end of squares. All that is left to do is erase the elements of
squares starting at the beginning and ending just before vi:
void g3(vector<int>& squares)
{
vector<int>::iterator vi =
find_if(squares.begin(), squares.end(), bigger1000);
squares.erase(squares.begin(), vi);
}
This will work even if vi points past the end. Of course, we can combine these two expressions and do
away with the local variable vi:
void g4(vector<int>& squares)
{
squares.erase(squares.begin(),
find_if(squares.begin(), squares.end(), bigger1000));
}
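As a self-contained check of the behavior described above, the following sketch wraps the g4() idiom in a function and exercises it, including the case where no element exceeds 1000. The names drop_leading_small() and demo_drop_leading_small() are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

bool bigger1000(int n) { return n > 1000; }

// Erase every element that precedes the first one greater than 1000.
void drop_leading_small(std::vector<int>& squares)
{
    squares.erase(squares.begin(),
                  std::find_if(squares.begin(), squares.end(), bigger1000));
}

void demo_drop_leading_small()
{
    std::vector<int> v;
    for (int i = 30; i <= 34; ++i) v.push_back(i * i);   // 900 961 1024 1089 1156
    drop_leading_small(v);
    assert(v.size() == 3 && v.front() == 1024);

    std::vector<int> small(3, 7);    // no element exceeds 1000
    drop_leading_small(small);       // find_if returns end(), so all elements go
    assert(small.empty());
}
```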
then gt(3,4) would be false and gt(4,3) would be true. These objects are not truly functions, but
they act like functions. We therefore call them function objects.
There is also a library function called bind2nd that takes a predicate and a value and yields an object
that, when called with a single argument, applies the predicate to that argument and the value. This is con-
fusing to describe, but easy to use:
(bind2nd(gt, 1000)) (999)
is false and
(bind2nd(gt, 1000)) (1001)
is true. We can therefore use bind2nd(gt, 1000) as our predicate instead of bigger1000()
when calling find_if:
void g5(vector<int>& squares)
{
greater<int> gt;
squares.erase(squares.begin(),
find_if(squares.begin(), squares.end(),
bind2nd(gt, 1000)));
}
Again, we can go further still by eliminating the local variable gt. The explicit constructor call
greater<int>() will serve the same purpose by creating an anonymous object:
void g6(vector<int>& squares)
{
squares.erase(squares.begin(),
find_if(squares.begin(), squares.end(),
bind2nd(greater<int>(), 1000)));
}
We can think of the body of this function as meaning
‘‘Remove from squares all the elements up to and not including the first element that is greater
than 1000.’’
For programmers without experience with functional languages, this may appear confusing at first
glance, but that is mostly because of unfamiliarity. Once one understands what the original operations do,
we find this code easier to understand than the original ‘‘straightforward’’ version, g1(). It is also easier
to convince ourselves of its correctness.
Importantly, the notational convenience of g6() has not been bought at the cost of run-time ineffi-
ciency compared to the conventional C-style version g1()†.
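To make the notion of a function object concrete, here is a sketch of a hand-written one that plays the role of bind2nd(greater<int>(), 1000), together with the same predicate written as a lambda expression. The class name GreaterThan and both function names are hypothetical. The lambda form assumes a C++11 or later compiler; bind2nd itself was deprecated in C++11 and removed from the standard library in C++17, so one of these forms is needed with current compilers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A function object: an ordinary class whose objects can be called like functions.
class GreaterThan {
public:
    explicit GreaterThan(int limit) : limit_(limit) {}
    bool operator()(int x) const { return x > limit_; }
private:
    int limit_;   // the value that bind2nd would have bound
};

void trim_with_function_object(std::vector<int>& squares)
{
    squares.erase(squares.begin(),
                  std::find_if(squares.begin(), squares.end(), GreaterThan(1000)));
}

void trim_with_lambda(std::vector<int>& squares)
{
    squares.erase(squares.begin(),
                  std::find_if(squares.begin(), squares.end(),
                               [](int x) { return x > 1000; }));
}

void demo_function_objects()
{
    GreaterThan gt1000(1000);
    assert(!gt1000(999));
    assert(gt1000(1001));

    std::vector<int> a = { 999, 1000, 1001, 5 };
    std::vector<int> b = a;
    trim_with_function_object(a);
    trim_with_lambda(b);
    assert(a.size() == 2 && a.front() == 1001);
    assert(a == b);
}
```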
like Lisp or Smalltalk, where types are not determined until execution time, but it is unusual in languages
that support strong static typing. What in C++ makes this possible?
Here type I is the type of squares.begin(). We don’t actually know what that type is, but its name is
list<int>::iterator. All we know beyond its name is that it denotes an element of type int
somehow. We could think of an iterator as a simple pointer to int, though for a list a simple int* is
an unlikely candidate for an iterator type.
Our use of find_if is therefore equivalent to what we would have if we wrote it this way:
typedef typename list<int>::iterator I;
[Figure: iterator categories (Input, Output, Forward, Bidirectional, RandomAccess) and the operations they support (++, *, =, ==, !=, --, [])]
This iterator nomenclature is not part of the C++ language. Instead, it is part of the standard library docu-
mentation. Thus, for example, the description of find_if states that the first two arguments must be
input iterators that delimit a range of values.
C++ templates do not require the author of functions like find_if to declare explicitly that its argu-
ments should be input iterators. In fact, there is no explicit way to declare such things even if the author
wanted to. We have heard numerous suggestions that C++ should make it possible to write find_if in a
style similar to the following:
template<class I: input_iterator, class P: predicate>
I find_if(I begin, I end, P pred)
{
// ...
}
Why does C++ offer no such facility? There are three main reasons:
[1] Any such facility would have to take into account not only inheritance but also built-in types and
operations on types not defined as members (such as the ‘‘helper functions’’ in §3.2 and §3.3).
Ordinary pointers meet the requirements for random-access iterators when they are used to point to
elements of (built-in) arrays. That means we would need some way of saying that for any type T,
T* is a random-access iterator. Otherwise, we would have to forgo the ability to use functions like
find_if on built-in arrays.
[2] The facility would offer little additional safety, if any. The main benefit would be that errors would
be detected when a template function, such as find_if, is called instead of when code is generated
for it; we believe that this benefit alone is not enough to justify a whole new type-checking facility.
[3] Even if such a facility existed and checked usage completely at the earliest possible instant, that
would still not guarantee safety. To work correctly, a template requires that its parameter type pro-
vide the expected operations with the expected semantics. Specifying ‘‘the expected operations’’
can be messy and constraining. Specifying ‘‘the expected semantics’’ can be surprisingly difficult.
For example, most attempts to specify something as simple as a less than operator, <, in general can
involve the programmer in the intricacies of the IEEE floating-point value NaN (not a number). We
prefer to leave such complexity in the documentation.
In general, we know of no way of expressing constraints on template parameters that wouldn’t be either too
cumbersome or too constraining [Stroustrup,1994,§15.4]. Instead, C++ provides mechanisms for providing
separate implementations, called specializations, for special cases. For example, in addition to providing a
general list template, one can provide versions to be used for lists of pointers (in general), and for lists of
void* (in particular).
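The first reason can be illustrated with a sketch of a find_if-like template. The name my_find_if is hypothetical, and the body is one plausible implementation rather than the library's. The template states no requirements on I, yet it accepts both an ordinary pointer into a built-in array and a list iterator, because each supports the operations the body actually uses (!=, * and ++).

```cpp
#include <cassert>
#include <list>

// Sketch of a find_if-like template; requirements on I and P are implicit.
template<class I, class P>
I my_find_if(I first, I last, P pred)
{
    while (first != last && !pred(*first)) ++first;
    return first;
}

bool bigger1000(int n) { return n > 1000; }

// Built-in array: I is deduced as int*, which is not a class type at all.
int index_of_first_big_in_array()
{
    int a[] = { 1, 400, 1600, 2500 };
    int* p = my_find_if(a, a + 4, bigger1000);
    return static_cast<int>(p - a);
}

// Library list: I is deduced as std::list<int>::iterator.
int first_big_in_list()
{
    std::list<int> l = { 5, 2000, 7 };
    std::list<int>::iterator it = my_find_if(l.begin(), l.end(), bigger1000);
    return it == l.end() ? -1 : *it;
}

void demo_my_find_if()
{
    assert(index_of_first_big_in_array() == 2);
    assert(first_big_in_list() == 2000);
}
```

A misuse, such as passing a type without ++, is reported only when the template body is instantiated, which is the late detection that reason [2] refers to.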
main()
{
printf("%s", "Hello world\n");
}
Here, printf() determines the type of its second argument at run time. In general, it has to, because its
first argument, the format string, might be a variable. In most cases, static type checking of printf() is
possible. However, from an implementer’s viewpoint, it is easier to put this kind of run-time type checking
into the printf library function than into the compiler.
The C++ equivalent,
#include <iostream.h>
main()
{
cout << "Hello world\n";
}
does not rely on run-time typing. Instead, the types of cout and of the string literal are used to select
during compilation the appropriate version of the << operator to use. This means that there is no run-time
overhead involved in finding the right kind of output conversion to use and no possibility that the wrong
choice will cause a crash.
The cooperation between the user and the library is established through the convention that the << oper-
ator is used for output. If a library or a user needs to support output of a new type, a new << is provided.
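As a small sketch of that convention, a user-defined type can join the output system by supplying its own <<. The Point type and the to_text() helper are hypothetical; a string stream is used instead of cout only so that the result can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Point { int x; int y; };   // hypothetical user-defined type

// A new << for the new type; returning the stream allows chaining.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << '(' << p.x << ',' << p.y << ')';
}

std::string to_text(const Point& p)
{
    std::ostringstream out;
    out << "p = " << p << '\n';   // each << is chosen at compile time from its operand types
    return out.str();
}

void demo_output_operator()
{
    Point p = { 3, 4 };
    assert(to_text(p) == "p = (3,4)\n");
}
```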
Then ins is an object that on request will read strings from cin, so that if s is a string,
s = *ins++;
The STL model requires that we iterate from somewhere to somewhere. Consequently, we need a value
indicating ‘‘end of file’’ that we can compare the iterator ins to. Such a value is used by default for a
default-constructed istream_iterator<string>, so that we can say something like this:
istream_iterator<string> ins(cin);
istream_iterator<string> eof;
void f1()
{
while (ins != eof) {
s = *ins++;
// ...
}
}
and the loop will be executed once for each string in the standard input file.
This is equivalent to:
void f2()
{
while (cin >> s) {
// ...
}
}
However, defining istream_iterator makes it possible for the algorithm library to use the input/output
stream library unmodified. For example we can read all the strings in the standard input into a
vector<string> without writing an explicit loop. Instead, we can create the vector directly from the
standard input:
vector<string> vs(ins, eof);
Here, vs is constructed with two arguments, both iterators; doing that causes vs to be initialized with a
copy of the elements in the range delimited by those iterators. In this case, that range is the entire contents
of the standard input file.
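The same construction can be tried on any input stream. In this sketch, which assumes the standard name istream_iterator, the hypothetical function read_words() takes the stream as a parameter so that a string stream can stand in for the standard input.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Build a vector from every whitespace-separated string in a stream.
std::vector<std::string> read_words(std::istream& in)
{
    std::istream_iterator<std::string> ins(in);   // reads one string per increment
    std::istream_iterator<std::string> eof;       // default-constructed: end of input
    return std::vector<std::string>(ins, eof);    // no explicit loop
}

void demo_read_words()
{
    std::istringstream in("to be or not");
    std::vector<std::string> vs = read_words(in);
    assert(vs.size() == 4);
    assert(vs.front() == "to" && vs.back() == "not");
}
```

Calling read_words(std::cin) would read the entire standard input in the same way.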
Along similar lines, we can create an output iterator attached to the standard output file:
ostream_iterator<string> outs(cout, "\n");
Here, the second argument to the ostream_iterator constructor is a string that will be written after
each use of the ostream_iterator. Thus, for example
*outs++ = "Hello world";
int main()
{
istream_iterator<string> ins(cin), eof;
ostream_iterator<string> outs(cout, "\n");
class ival_box {
protected:
int val;
int low, high;
bool changed;
public:
ival_box(int ll, int hh)
{ changed = false; low=ll; high=hh; val = ll; }
virtual int get_value()
{ changed = false; return val; }
virtual void set_value(int i)
{ changed = true; val = i; }
virtual void prompt()
{ }
virtual bool was_changed() const
{ return changed; }
};
The default implementation of the functions is pretty sloppy and provided here primarily to illustrate the
intended semantics. A realistic class would, for example, provide some range checking.
Given this basic definition of ival_box, we can derive variants of the concept from it. For example:
class ival_slider : public ival_box {
// graphics stuff to define what the slider looks like, etc.
public:
ival_slider(int, int);
int get_value();
void prompt();
bool was_changed() const;
};
A class like ival_slider is said to be derived from class ival_box and ival_box is said to be a
base of ival_slider. Alternatively, we can call ival_box the superclass of ival_slider and
ival_slider a subclass of ival_box. The notation
class ival_slider : public ival_box { /* ... */ };
void some_fct()
{
ival_box* p1 = new ival_slider(0,5);
// ...
interact(p1);
ival_box* p2 = new ival_dial(1,12);
// ...
interact(p2);
}
Note that most application code is written in terms of (pointers to) plain ival_boxes the way
interact() is. That way, the application doesn’t have to know about the potentially large number of
variants of the ival_box concept. The knowledge of such specialized classes is isolated in the relatively
few functions that create such objects. This isolates users from changes in the implementations of the
derived classes and most code can be oblivious to the fact that there are different kinds of ival_boxes.
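The following sketch shows application code written purely in terms of ival_box pointers. It is an illustration under assumptions: clamped_box is a hypothetical variant invented for this example, and this interact() takes the user's input as a parameter so that the example is deterministic.

```cpp
#include <cassert>

class ival_box {
protected:
    int val;
    int low, high;
    bool changed;
public:
    ival_box(int ll, int hh) : val(ll), low(ll), high(hh), changed(false) {}
    virtual ~ival_box() {}
    virtual int get_value() { changed = false; return val; }
    virtual void set_value(int i) { changed = true; val = i; }
    virtual void prompt() {}
    virtual bool was_changed() const { return changed; }
};

// Hypothetical variant: clamps values to [low, high] and counts prompts.
class clamped_box : public ival_box {
    int prompts;
public:
    clamped_box(int ll, int hh) : ival_box(ll, hh), prompts(0) {}
    void set_value(int i) override
    {
        if (i < low) i = low;
        if (i > high) i = high;
        ival_box::set_value(i);
    }
    void prompt() override { ++prompts; }
    int prompt_count() const { return prompts; }
};

// Application code: knows only ival_box, never the derived classes.
int interact(ival_box* pb, int user_input)
{
    pb->prompt();                  // virtual call: the variant's version runs
    pb->set_value(user_input);     // stands in for an action by the user
    return pb->was_changed() ? pb->get_value() : -1;
}

void demo_interact()
{
    ival_box plain(0, 5);
    clamped_box clamped(0, 5);
    assert(interact(&plain, 12) == 12);     // base class does no range checking
    assert(interact(&clamped, 12) == 5);    // variant clamps to high
    assert(clamped.prompt_count() == 1);
}
```

Only demo_interact(), which creates the objects, names the derived class.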
Where would we get the graphics stuff from? Most user-interface systems provide a class defining the
basic properties of being an entity on the screen, so if we use the system from ‘‘Big Bucks Inc.’’ we would
have to make each of our ival_slider, ival_dial, etc., classes a kind of BBwindow class. This
would most simply be achieved by rewriting our ival_box so that it derives from BBwindow. That
way, all our classes inherit all the properties of a BBwindow. For example, every ival_box can be
placed on the screen, obey the graphical style rules, be resized, be dragged around, etc., according to the
standard set by the BBwindow system. Our class hierarchy would look like this:
class ival_box : public BBwindow { /* ... */ }; // rewritten
[Figure: class hierarchy diagram showing ibox, islider, idial, ipopup, and iflash]
5.2.1 Critique
This design works well in many ways, and for many problems this kind of hierarchy is a good solution.
However, there are some awkward details that could lead us to look for alternative designs.
We retrofitted BBwindow as the base of ival_box. This is not quite right. The use of BBwindow
wasn’t part of our basic notion of an ival_box; it was an implementation detail. Deriving ival_box
from BBwindow elevated an implementation detail to a first-level design decision. That can be right, say
when working in the environment defined by ‘‘Big Bucks Inc.’’ is a key decision of how our organization
conducts its business. However, what if we also wanted to have implementations of our ival_boxes for
systems from ‘‘Imperial Bananas,’’ ‘‘Liberated Software,’’ and ‘‘Compiler Wizzes?’’ This would require
us to write and maintain four distinct versions of our program:
// BB version:
// CW version:
// IB version:
// LS version:
Finally, our program may have to run in a mixed environment where windows of different user-
interface systems coexist. This could happen either because two systems somehow share a screen, or
because our program needs to communicate with users on different systems. Having our user-interface sys-
tems ‘‘wired in’’ as the one and only base of our one and only ival_box interface just isn’t flexible
enough to handle that.
protected:
// functions overriding BBwindow virtual functions
// e.g. BBwindow::draw(), BBwindow::mouse1hit()
public:
ival_slider(int,int);
~ival_slider();
int get_value();
void set_value(int i);
void prompt();
bool was_changed() const;
};
Interestingly, this declaration allows application code to be written exactly as in the interact() and
some_fct() example above. All we have done is to restructure the implementation details in a more
logical way.
The virtual function ival_box::~ival_box() and its overriding function
ival_slider::~ival_slider() are destructors, that is, functions that are implicitly called when an
object is destroyed (goes out of scope, is explicitly deleted, etc.). Many classes require some form of
cleanup for an object before it goes away. Since the abstract class ival_box cannot know if a derived
class requires such cleanup, it must assume that it does. Defining a virtual destructor in the base ensures
proper cleanup. For example:
void f(ival_box* p)
{
// ...
delete p;
}
The delete operator explicitly destroys the object pointed to by p. We have no way of knowing exactly
which class the object pointed to by p belongs to, but thanks to ival_box’s virtual destructor, the proper
cleanup, as (optionally) defined by that class’ destructor, will be performed.
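The effect of the virtual destructor can be observed directly. In this sketch the global counter open_resources is a hypothetical stand-in for whatever a real ival_slider would acquire and release, and the classes are cut down to the minimum needed.

```cpp
#include <cassert>

int open_resources = 0;   // hypothetical stand-in for a window, file, etc.

class ival_box {
public:
    virtual ~ival_box() {}            // virtual: derived cleanup runs on delete
    virtual int get_value() = 0;
};

class ival_slider : public ival_box {
public:
    ival_slider() { ++open_resources; }              // acquire
    ~ival_slider() override { --open_resources; }    // release
    int get_value() override { return 0; }
};

// Knows only the base class, as f() does in the text.
void destroy(ival_box* p)
{
    delete p;
}

void demo_virtual_destructor()
{
    ival_box* p = new ival_slider;
    assert(open_resources == 1);
    destroy(p);                       // runs ~ival_slider(), then ~ival_box()
    assert(open_resources == 0);
}
```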
The ival_box hierarchy can now be defined like this:
class ival_box { /* ... */ };
[Figure: class hierarchy diagram showing islider, idial, ipopup, and iflash]
Each derived class inherits an abstract class (for example, ival_box) requiring it to implement the base
class’ pure virtual functions, and a BBwindow provides them with the means of doing so. Since
ival_box provides the interface for the derived class, it is publicly derived using :public. Since
BBwindow is only an implementation aid, it is derived using :protected. This implies that a programmer
using, say, ival_slider cannot directly use facilities defined by BBwindow; only the interface
inherited from ival_box and possibly augmented by ival_slider is available.
Deriving directly from more than one class is usually called multiple inheritance. Note that
ival_slider must override functions from both ival_box and BBwindow so it must be defined by
deriving it directly or indirectly from both. As shown in §5.2, deriving ival_slider indirectly from
BBwindow by making BBwindow a base of ival_box is possible, but has undesirable side effects.
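A compilable sketch of this arrangement follows. The BBwindow shown here is a hypothetical stand-in with a single draw() function and a public counter; the real vendor class is not described in this paper. The redraws() member is likewise an assumption added to make the effect visible.

```cpp
#include <cassert>

class ival_box {   // abstract interface
public:
    virtual ~ival_box() {}
    virtual int get_value() = 0;
    virtual void set_value(int i) = 0;
};

class BBwindow {   // hypothetical stand-in for the vendor's window class
public:
    virtual ~BBwindow() {}
    virtual void draw() { ++draw_count; }
    int draw_count = 0;
};

// Interface inherited publicly, implementation aid inherited as protected.
class ival_slider : public ival_box, protected BBwindow {
    int val = 0;
public:
    int get_value() override { return val; }
    void set_value(int i) override { val = i; draw(); }   // uses BBwindow internally
    int redraws() const { return draw_count; }
};

// User code sees only the ival_box interface.
int use_through_interface(ival_box& box)
{
    box.set_value(3);
    return box.get_value();
}

void demo_multiple_inheritance()
{
    ival_slider s;
    assert(use_through_interface(s) == 3);
    assert(s.redraws() == 1);
    // s.draw();            // would not compile: draw() is protected in ival_slider
    // BBwindow* w = &s;    // would not compile here: BBwindow is an inaccessible base
}
```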
This design is cleaner and more easily maintainable than the traditional one—and no less efficient. It
still fails to solve the version control problem, though:
// common:
// BB version:
// CW version:
// ...
In addition, there is no way of having an ival_slider for BBwindows coexist with an ival_slider
for CWwindows even if the two user-interface systems can themselves coexist.
The obvious solution is to define several different ival_slider classes with separate names:
class ival_box { /* ... */ };
// ...
or graphically:
[Figure: BBislider derives from BBwindow and ibox; CWislider derives from ibox and CWwindow]
To further insulate our application-oriented ival_box classes from implementation details, we can go one
step further and first derive an abstract ival_slider class from ival_box and then derive the
system-specific ival_sliders from that:
class ival_box { /* ... */ };
// ...
or graphically:
[Figure: class hierarchy diagram showing ibox, BBislider, and CWislider]
Usually, we can do better yet by utilizing more specific classes in the implementation hierarchy. For exam-
ple, if the Big Bucks Inc. system has a slider class, we can derive our ival_slider directly from the
BBslider:
or graphically:
[Figure: class hierarchy diagram showing ibox, BBislider, and CWislider]
This improvement becomes significant where, as is not uncommon, our abstractions are not too different
from the ones provided by the system used for implementation. In that case, programming reduces to
mapping between similar concepts. Derivation from general base classes, such as BBwindow, is then done
only rarely.
The complete hierarchy will consist of our original application-oriented conceptual hierarchy of inter-
faces expressed as derived classes:
class ival_box { /* ... */ };
followed by the implementations of this hierarchy for various windows systems expressed as derived
classes:
// BB implementations:
class BB_ival_slider
: public ival_slider, protected BBslider { /* ... */ };
class BB_flashing_ival_slider
: public ival_slider,
private BBwindow_with_bells_and_whistles { /* ... */ };
class BB_popup_ival_slider
: public ival_slider, protected BBslider { /* ... */ };
// CW implementations:
class CW_ival_slider
: public ival_slider, protected CWslider { /* ... */ };
// ...
// ...
or graphically:
[Figure: class hierarchy diagram showing ibox, islider, idial, ipopup, and iflash, surrounded by implementation classes]
Note how the original ibox class hierarchy appears unchanged, but is surrounded by implementation
classes.
5.3.1 Critique
The abstract class design is flexible, and almost as simple to deal with as the equivalent design relying on a
common base defining the user-interface system. In the latter design, the windows class is the root of a
tree. In the former, the original application class hierarchy appears unchanged as the root of classes that
supply its implementations. In either case, you can look at the ival_box family of classes without both-
ering with the window-related implementation details most of the time.
In either case, the complete implementation of each ival_box class must be rewritten when the public
interface of the user-interface system changes. However, in the abstract class design almost all user code is
protected against changes to the implementation hierarchy and requires no recompilation after such a change.
BB_ival_maker BBim;
LS_ival_maker LSim;
void g()
{
f(&BBim); // let f use BB
f(&LSim); // let f use LS
}
This technique appears in [Gamma,1994] as the abstract factory pattern.
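A minimal sketch of the technique is given below. Everything beyond the names BB_ival_maker, LS_ival_maker, and f is an assumption made for illustration: the abstract ival_maker with a single dial() function, the system() query used to observe which implementation was created, and the use of std::unique_ptr (which requires C++11) to manage the created object.

```cpp
#include <cassert>
#include <memory>
#include <string>

class ival_box {   // interface seen by the application
public:
    virtual ~ival_box() {}
    virtual std::string system() const = 0;   // hypothetical query, for the demo only
};

class BB_ival_dial : public ival_box {
public:
    std::string system() const override { return "BB"; }
};

class LS_ival_dial : public ival_box {
public:
    std::string system() const override { return "LS"; }
};

class ival_maker {   // abstract factory: creates objects without naming their class
public:
    virtual ~ival_maker() {}
    virtual std::unique_ptr<ival_box> dial(int low, int high) = 0;
};

class BB_ival_maker : public ival_maker {
public:
    std::unique_ptr<ival_box> dial(int, int) override
    {
        return std::unique_ptr<ival_box>(new BB_ival_dial);
    }
};

class LS_ival_maker : public ival_maker {
public:
    std::unique_ptr<ival_box> dial(int, int) override
    {
        return std::unique_ptr<ival_box>(new LS_ival_dial);
    }
};

// f() never mentions BB or LS; the maker it is given decides.
std::string f(ival_maker* pim)
{
    std::unique_ptr<ival_box> pb = pim->dial(0, 99);
    return pb->system();
}

void demo_abstract_factory()
{
    BB_ival_maker BBim;
    LS_ival_maker LSim;
    assert(f(&BBim) == "BB");   // let f use BB
    assert(f(&LSim) == "LS");   // let f use LS
}
```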
6 C++ Style
C++ is often inaccurately described as an object-oriented language, and (therefore?) often criticized for not
fulfilling everybody’s fantasies of what an object-oriented language ought to be.
If we have to stick a pretentious-sounding label on C++ it must be: C++ is a multi-paradigm language.
It supports several styles of programming and combinations of those styles. The traditional summary is
[Stroustrup,1994]:
C++ is a general-purpose programming language that
– is a better C
– supports data abstraction
– supports object-oriented programming
However, the exact scope of this isn’t easy to pin down to a simple slogan such as ‘‘Everything is an
Object!’’ or ‘‘No side effects!’’ Such slogans are certainly not among the ideals of C++ even though
support for both object-oriented programming and functional styles of programming is.
Good C++ style is pragmatic, has evolved from the Simula ideas of object-oriented design as modelling,
places a premium on direct expression of ideas, shares much of C’s concern for low-level efficiency, and is
aimed at solving current everyday problems.
Naturally, this is just our view. Nothing is universally held in a community as large as the C++ user
community, but our view is directly reflected in the design of C++ [Stroustrup,1994,§4]. Fortunately for
people who hold other views, one of our strongest held opinions is exactly that C++ should support a vari-
ety of styles. Thus, even though we don’t try to provide direct support for every style of programming in
C++, we don’t go out of our way to prevent styles we don’t like, either. Indeed, it is often a source of
enjoyment to see people using C++ in ways we did not anticipate—especially when it is successful.
Unfortunately—or maybe fortunately—style is hard to define and must be taught (and learned!) with
liberal use of examples. We have presented three areas where C++ provides direct support, where a definite
view of design can guide the programmer, and where the design views and resulting coding style reflect
experience with C++. The examples were chosen to demonstrate areas that are not universally well-covered
by modern programming languages and where current practice—in C++ and other languages—often
diverges from our ideal. Thus the examples from §3, §4, and §5 can serve as discriminating cases and pos-
sibly as inspiration to do as well or better.
Clearly, by ‘‘style’’ we don’t just mean rules for indentation of code, the naming of variables, and the
banning of unfashionable language features. Good programs are the result of a focus on concepts and
sound notions of design, rather than mechanistic language-technical issues. Such issues matter, but at a
much more detailed level.
C++ supports enough data abstraction to make it possible to program at as high a level as in many more
‘‘advanced’’ languages. Doing so usually requires extensive work designing, implementing and tuning a
library supporting the style. Building such a framework should not be everyday work for most C++ pro-
grammers. For example, the STL wasn’t easy to design (Alex Stepanov and his colleagues worked on the
basic ideas for over a decade), was somewhat easier to implement (the current version was about two years
of work for two people), and it is quite simple to teach and use.
This is a key idea: first a relatively small group of people develops a library supporting an application
domain well. After that, many more people can use the library to develop applications or the next level of
library. We are not making a value judgement about programmers here. It is easier to use a well-designed
library than it is to design and implement it, and the subset of C++ needed to produce a complete, efficient,
and elegant library is far larger than what is needed to use it. This has led some people to propose a class
system of programmers with the best programmers focused on library development and the worst restricted
to application development.
However, the demands on a programmer’s skills are a function of both the inherent difficulty of the
application and the quality of tools available for its development. Therefore, one cannot blindly assume
that lesser skills or fewer language features are needed for application development. Sometimes, things
seem the other way around with the library developers benefitting from a relatively limited and well-
defined problem domain, and the application developers suffering from being lost in an overly large and
complicated design space. From this observation comes the notion that the best way to make progress on a
large system is to focus on the development of several libraries or frameworks and then build the system
incrementally from those.
The unit of design is not the individual class—in C++ or in any other language. It is a set of classes
related by some logical criteria [Stroustrup,1991,§12.11.3.3]. For example, the power of the STL comes
from the unifying criteria for what constitutes a container, an iterator, etc. Similarly, the discussion of
design issues relating to the input operation in §5 would have been impossible had we tried to consider the
problem one isolated class at a time.
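As a minimal sketch of what such unifying criteria buy, the function below (count_equal, a hypothetical name, not part of the STL) relies only on the iterator operations !=, ++, and *. Under that assumption the same code serves a built-in array, a vector, and a list, none of which needs to know about the others.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical generic algorithm: it assumes only that In supports
// !=, prefix ++ and unary *, which is what an input iterator provides.
template<class In, class T>
std::size_t count_equal(In first, In last, const T& value) {
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first == value) ++n;
    return n;
}

// Applies the same algorithm to three unrelated sequence types.
inline std::size_t demo_count() {
    int a[] = { 1, 2, 2, 3 };
    std::vector<int> v(a, a + 4);
    std::list<int> l(a, a + 4);
    return count_equal(a, a + 4, 2)              // pointers into an array
         + count_equal(v.begin(), v.end(), 2)    // vector iterators
         + count_equal(l.begin(), l.end(), 2);   // list iterators
}
```

The design lives in the shared requirements on iterators rather than in any one of the classes involved.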
Another key observation is that not every class is supposed to be used in the same way or obey the same
simple-minded design criteria. Often, simplified design rules of thumb are advertised as universal princi-
ples and a curious form of reductionism takes the place of calm thinking. Thus, we find people arguing that
because some classes are best designed as part of a hierarchy, every class must be designed to be part of a
class hierarchy; that because it makes sense for some functions to be virtual, every function must be
virtual; and that because some interfaces are best described as abstract classes, no class presented to
users may contain data.
This kind of purely language-driven thinking makes no sense to us. We must focus on the concepts in
the application and map them into the language constructs in the most appropriate way. In other words, we
must design first and keep our programming-language-technical concerns secondary. On the other hand,
we don’t consider totally language-independent design practical. The design must map into the language
used for its implementation in a way that suits the fundamental structure of the language. In particular, a
design for a C++ program that tries to subvert C++’s static type system will be ugly, unpleasant to imple-
ment, and hard to maintain. Against the fundamental structure of a language—any language—one can win
Pyrrhic victories only.
One implication of this is that major interfaces are usually best defined in terms of specific user-defined
types and that a class should provide an interface that matches a single coherent concept. This allows better
type checking, and wherever possible static (compile time) checking should be used to minimize confusion,
run-time errors, and the need for run-time checking of arguments passed across an interface. The Date
constructor can be used to illustrate some tradeoffs:
Date::Date(int d, Month m, int y);
Month is a user-defined type (an enumeration), so we can’t get much confusion from that. People reading
the declaration know what is expected; should they nevertheless mess up, the compiler catches the problem:
Date d1(1978,2,21); // error: 2 is not a Month
Date d2(1978,Date::feb,21); // ok
However, we reversed the year and the day. The Date constructor’s check of the range of dates in Febru-
ary will catch that at run-time.
Had Date been critical in our design, we might have introduced a Day or a Year type to allow
stronger compile-time checking. For example:
class Year {
    int y;
public:
    explicit Year(int i) { y = i; }      // construct Year from int
    operator int() const { return y; }   // conversion: Year to int
};
class Date {
public:
    Date(int d, Month m, Year y);
    // ...
};
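The following self-contained sketch fills in just enough of Date to show both kinds of checking at work. Several details are assumptions made for illustration: Month is declared at namespace scope rather than inside Date, the range check accepts any day from 1 to 31 regardless of month, and the helper rejects_swapped_day_and_year is hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

enum Month { jan = 1, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec };

// A year is a distinct type; converting an int to a Year must be spelled out.
class Year {
    int y;
public:
    explicit Year(int i) : y(i) {}
    operator int() const { return y; }
};

class Date {
    int d_;
    Month m_;
    Year y_;
public:
    // Assumed, simplified range check: days 1..31, ignoring month lengths.
    Date(int d, Month m, Year y) : d_(d), m_(m), y_(y) {
        if (d < 1 || d > 31) throw std::out_of_range("day out of range");
    }
    int day() const { return d_; }
    Month month() const { return m_; }
    int year() const { return y_; }   // uses the Year-to-int conversion
};

// Date bad(1978, feb, 21);   // would not compile: 21 is not a Year

// Swapping day and year with an explicit Year still compiles,
// but the constructor's run-time check rejects day 1978.
inline bool rejects_swapped_day_and_year() {
    try {
        Date d(1978, feb, Year(21));
        (void)d;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With the explicit constructor, passing a plain int where a Year is expected becomes a compile-time error, and only a deliberate Year(21) reaches the run-time check.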
community. There may very well be more good designers in the C++ community than in any other pro-
gramming community, but there certainly are more novices. The rapid growth of C++ usage ensures that.
We can teach design to small groups, and even to larger organizations. However, getting design technique
applied on a large scale (hundreds or thousands of programmers) is a task no language community has been
spectacularly successful at—yet.
7 Sociological Observations
A programming language by itself is useless. Unless supported by tools, techniques, and a user commu-
nity, a language is simply an intellectual plaything. There is a need for experimental languages, niche lan-
guages, languages devoted to the pursuit of beauty without compromise. However, C++ was never meant
to be one of those; it was designed and evolved to be a practical tool.
Like the success of C, the success of C++ was no accident. Naturally, a certain element of good fortune
was involved in both cases; nothing succeeds on a large scale without a bit of luck. However, a large part
of that success came from an effort to make C++ the best language possible, rather than the best possible
language.
Throughout its evolution, C++ was heavily influenced by a desire to make it a useful tool to a commu-
nity of potential users who already existed, whose problems we knew reasonably well. Another important
aspect was restraint: C++ was not allowed to grow without solid feedback on what we already had, without
practical experience with problem areas (where what we had felt ‘‘not good enough’’), and without con-
cerns for compatibility and transition issues. Theory was never a sufficient reason for adding something to
C++. Theory determines the form of what is added but not what is needed.
Because C++ was intended to be useful in the same areas as C, one major goal of C++ has been to do
everything C can, and do it as efficiently in time and space as C. Consequently, if one writes a C program
in C++, that program will be as fast and small as it would have been in C. This is not true of every C++
implementation, of course, but attainable in theory, and often achieved in practice. C++ even made a few
improvements on C in areas not related to abstraction. Some of those improvements, such as const types
and the ability to include argument types as part of a function declaration, found their way back into C.
Others, such as inline function definitions, did not.
The desire to do everything C can do is a strong constraint on C++. For example, it has made it infeasi-
ble to make the primitive C++ array and pointer operations any safer than their C counterparts. It is possi-
ble, of course, to define safe data structures as C++ classes, but in practice few C++ programmers have the
discipline needed to use such data structures exclusively. Thus, C is both a great strength of C++ and a
great weakness.
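To make the point about safe data structures concrete, here is a minimal sketch of a range-checked array defined as a class. The name Checked_array and its interface are hypothetical, and we assume throwing an exception is an acceptable response to a bad index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical fixed-size array whose subscript operator checks its index.
template<class T, std::size_t N>
class Checked_array {
    T elem[N];
public:
    Checked_array() : elem() {}   // value-initialize every element

    std::size_t size() const { return N; }

    T& operator[](std::size_t i) {
        if (i >= N) throw std::out_of_range("Checked_array: index out of range");
        return elem[i];
    }
    const T& operator[](std::size_t i) const {
        if (i >= N) throw std::out_of_range("Checked_array: index out of range");
        return elem[i];
    }
};
```

The built-in array remains available and unchecked. The safety comes only from choosing to use the class consistently, which is exactly the discipline discussed above.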
That C++’s relationship with C wouldn’t be easy was clear from the start. We like aspects of C, but
some key elements of the C language and culture are most disruptive to people trying to write more abstract
programs and trying to reason about programs. For example, the C preprocessor is essential for real-world
C programming, but is also a menace: any piece of source text may turn out not to be what it appears to be
because a macro substitution may radically change what the programmer wrote before the compiler sees it.
The traditional academic response to such problems seems to be ‘‘ban it!’’ The C++ answer has been:
first make the obnoxious feature redundant, then discourage its use; finally we may actually consider ban-
ning the now-unused feature. This strategy is slow and often frustrating, but it respects people’s practical
needs in a way a more radical approach doesn’t. C++ doesn’t yet have facilities that make the C preproces-
sor completely redundant, but inline functions, constants, namespaces, templates, etc., allow a programmer
to restrict the use of preprocessor facilities to a minimum related to source code management.
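A small sketch of how these facilities stand in for common macro idioms follows. All names here (BAD_SQUARE, bufsize, net, square, max_of) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical macro, kept only to show the hazard of textual substitution:
// BAD_SQUARE(1 + 2) expands to 1 + 2 * 1 + 2.
#define BAD_SQUARE(x) x * x

// A typed constant replaces "#define BUFSIZE 512" and obeys scope rules.
const int bufsize = 512;

// A namespace can hold a second constant of the same name; macro names cannot be scoped.
namespace net {
    const int bufsize = 1024;
}

// An inline function replaces a function-like macro; its argument is evaluated once.
inline int square(int x) { return x * x; }

// A template replaces a type-generic macro such as MAX(a,b) and is type checked.
template<class T>
inline const T& max_of(const T& a, const T& b) { return a < b ? b : a; }
```

Each replacement is visible to the compiler as a named, typed entity, so the source text means what it appears to mean.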
The policy regarding C/C++ compatibility has been expressed as: ‘‘As close to C as possible—but no
closer’’ [Koenig,1989]. In practice, this means that C++ accepts any C feature—however ugly—as long as
it does not interfere with the type system. This policy has kept incompatibilities to an easily manageable
minimum.
C is the de facto measure of efficiency. People generally accept that if something runs as fast as well-
written C it is fast enough. If it doesn’t, criticism results—fair or not. Since its inception, one of the aims
of C++ has been to make it possible to write programs that are not only abstract, but also run quickly.
Throughout the lifetime of C++, and well before it, people have argued that such emphasis on run-time per-
formance is unnecessary.
The typical argument runs something like this: ‘‘Computers are so fast these days that we can afford to
give up some of that speed if by doing so we gain something in exchange.’’ That something might be
development time, or safety, or whatever the favorite language of the person making the argument has to
offer. Such arguments are often valid, but not always, and it is not easy to tell when they will be important
and when they will not.
For small programs—such as many student projects and prototypes—efficiency rarely matters. Larger
systems, however, often consist of many layers of software. If overhead is allowed to build up in the
individual layers, the total system becomes glacial. Naturally, if the overhead in an individual layer is really
support for subsequent layers, so that these layers become simpler and faster, then this doesn’t happen.
Unfortunately, we have found this happy phenomenon less common than one might have hoped. In the
absence of such synergies, the language with the most efficient low-level semantics—that is, C or C++—
wins. Of course, when one is developing programs for one’s self or one’s immediate circle, such issues are
less important. That is one of the ways in which C++ has been guided by the requirements of commercial,
rather than academic, users.
Finally, there are application areas where efficiency is paramount. If you are writing an operating sys-
tems kernel or a network driver you don’t want any fat on your code—for any reason. For hard real-time
applications you have the additional requirement that the performance of every feature must be absolutely
predictable as well as sufficiently fast. C++ meets the requirements here.
By being C-compatible, C++ was able to benefit from C’s libraries easily, directly, and without over-
heads. The benefits of that are inestimable because it gives the C++ programmer access to the largest col-
lection of new and old code available. It made the difference between early C++ being a toy and being a
tool. In addition to gaining access to libraries written in C, link and layout compatibility with C allows C++
programs to call routines in languages with a C compatible calling sequence, such as Fortran and assembler
on many systems. Further, C++ functions can be called from such languages. This allowed C++ to be used
to write libraries for use from other languages from day one.
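As a minimal sketch of this two-way compatibility, the code below calls the C library routine qsort from C++ and also defines functions with C linkage that C code could call. The names compare_ints, c_sum, and sort_ints are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>   // std::qsort, from the C library
#include <cstring>   // std::strlen, from the C library

// Callback with C linkage, so that the C routine qsort can call it.
extern "C" int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);   // negative, zero, or positive
}

// A function written in C++ that is callable from C,
// or from any language that uses the C calling sequence.
extern "C" int c_sum(const int* v, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += v[i];
    return s;
}

// C++ code calls the C library directly, with no wrapper layer.
inline void sort_ints(int* v, std::size_t n) {
    std::qsort(v, n, sizeof(int), compare_ints);
}
```

The extern "C" specification switches off C++ name encoding for the declared functions, which is what lets a C compiler or linker refer to them by their plain names.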
The estimate of the time needed to become comfortable with C++ and object-oriented design is based on
the assumption that the programmer/designer learns on the job and stays productive—usually by program-
ming in a ‘‘less adventurous’’ style of C++ during that period. If one could devote full time to learning
C++, one would be comfortable faster. However, without application of the new ideas on real projects that
degree of comfort could be misleading. Object-oriented programming and object-oriented design are
practical—rather than theoretical—disciplines. Unapplied, or applied only to toy examples, these ideas can
become dangerous ‘‘religions.’’
The time-consuming thing to learn about C++ is not syntax, but design concepts. A good indication of
poor appreciation of C++ is code littered with casts (explicit type conversions). Often, the casts are the
result of someone writing C or trying to write Smalltalk in C++.
Our observation is that most people who are aware that there is something to be learned can learn C++
well in a reasonable amount of time. The people who fail, and in consequence write appalling C++ and
complain a lot, are in our experience mostly people who approach C++ with the attitude that they know all
there is to know about programming so that all they have to do is to pick up ‘‘a bit of odd syntax.’’ Unfor-
tunately, some such people proceed to teach C++ or even write C++ textbooks, and their students then suffer
with them.
8 Conclusions
The C++ programming language has evolved in response to its user community. Managing that evolution
hasn’t been easy, but new language features, techniques, and libraries had to be developed to meet the
needs of a growing user community. The coming ISO/ANSI standard should herald a period of stability of
the language definition that ought to set off an explosion of work on tools, techniques, and libraries.
The key problem is education. To use C++ well—or any other language supporting abstraction
mechanisms—people must focus on design issues, and teaching design on a large scale is not easy.
9 Acknowledgements
Vince Russo made our Christmas preparations more interesting by suggesting that we might be able to
write this paper at the same time. Section 4 was partly inspired by Alex Stepanov’s work on the STL
[Stepanov,1994]. Section 5 was partly inspired by [Gamma,1994]. Brian Kernighan made constructive
comments on a draft of this paper.
10 References
[Booch,1993] Grady Booch: Object-oriented Analysis and Design with Applications, 2nd edition.
Benjamin Cummings, Redwood City, CA. 1993. ISBN 0-8053-5340-2.
[Gamma,1994] Erich Gamma, et al.: Design Patterns. Addison Wesley. 1994. ISBN 0-201-63361-2.
[Koenig,1989] Andrew Koenig and Bjarne Stroustrup: As Close as Possible to C—but no Closer
The C++ Report. Vol 1 No 7 July 1989.
[Koenig,1995] Andrew Koenig (editor): The Working Papers for the ANSI-X3J16 /ISO-SC22-
WG21 C++ standards committee.
[Koenig,1995a] Andrew Koenig and Barbara Moo: Ruminations on C++. Book, to appear 1996.
[Stroustrup,1985] Bjarne Stroustrup: The C++ Programming Language. Addison Wesley, ISBN 0-
201-12078-X. October 1985.
[Stroustrup,1991] Bjarne Stroustrup: The C++ Programming Language (2nd Edition) Addison Wesley,
ISBN 0-201-53992-6. June 1991.
[Stroustrup,1994] Bjarne Stroustrup: The Design and Evolution of C++ Addison Wesley, ISBN 0-201-
54330-3. March 1994.
[Stepanov,1994] Alexander Stepanov and Meng Lee: The Standard Template Library. ISO Program-
ming language C++ project. Doc No: X3J16/94-0095, WG21/N0482. May 1994.
[Vilot,1994] Michael J Vilot: An Introduction to the STL Library. The C++ Report. October
1994.