C++ Tutorials
These tutorials explain the C++ language from its basics up to the newest features
introduced by C++11. Chapters have a practical orientation, with example programs in all
sections to start practicing what is being explained right away.
Unit 1
Introduction
Compilers
Unit 2
Basics of C++
Structure of a program
Variables and types
Constants
Operators
Basic Input/Output
Unit 3
Program structure
Control Structures
Functions
Overloads and templates
Name visibility
Unit 4
Compound data types
Arrays
Character sequences
Pointers
Dynamic Memory
Data structures
Other data types
Unit 5
Classes
Classes (I)
Classes (II)
Special members
Friendship and inheritance
Polymorphism
Unit 6
Other language features
Type conversions
Exceptions
Preprocessor directives
Unit 7
C++ Standard Library
Unit 1
Compilers
The essential tools needed to follow these tutorials are a computer and a compiler
toolchain able to compile C++ code and build the programs to run on it.
C++ is a language that has evolved much over the years, and these tutorials explain many
features added recently to the language. Therefore, in order to properly follow the
tutorials, a recent compiler is needed. It shall support (even if only partially) the features
introduced by the 2011 standard.
Many compiler vendors support the new features at different degrees. See the bottom of
this page for some compilers that are known to support the features needed. Some of
them are free!
If for some reason, you need to use some older compiler, you can access an older version
of these tutorials here (no longer updated).
What is a compiler?
Computers understand only one language and that language consists of sets of instructions
made of ones and zeros. This computer language is appropriately called machine language.
A particular computer's machine language program that allows a user to input two
numbers, adds the two numbers together, and displays the total could include these
machine code instructions:
00000 10011110
00001 11110100
00010 10011110
00011 11010100
00100 10111111
00101 00000000
As you can imagine, programming a computer directly in machine language using only
ones and zeros is very tedious and error-prone. To make programming easier, high-level
languages have been developed. High-level programs also make it easier for programmers
to inspect and understand each other's programs.
This is a portion of code written in C++ that accomplishes the exact same purpose:
int a, b, sum;

cin >> a;
cin >> b;

sum = a + b;
cout << sum << endl;
Even if you cannot really understand the code above, you should be able to appreciate how
much easier it will be to program in the C++ language as opposed to machine language.
Because a computer can only understand machine language and humans wish to write in
high-level languages, high-level programs have to be rewritten (translated) into machine
language at some point. This is done by special programs called compilers, interpreters, or
assemblers that are built into the various programming applications.
Console programs
Console programs are programs that use text to communicate with the user and the
environment, such as printing text to the screen or reading input from a keyboard.
Console programs are easy to interact with, and generally have a predictable behavior that
is identical across all platforms. They are also simple to implement and thus are very
useful to learn the basics of a programming language: The examples in these tutorials are
all console programs.
The way to compile console programs depends on the particular tool you are using.
The easiest way for beginners to compile C++ programs is by using an Integrated
Development Environment (IDE). An IDE generally integrates several development tools,
including a text editor and tools to compile programs directly from it.
Here are instructions on how to compile and run console programs using different
free Integrated Development Environments (IDEs):
If you happen to have a Linux or Mac environment with development features, you should
be able to compile any of the examples directly from a terminal just by including C++11
flags in the command for the compiler:
Compiler  Platform                Command
GCC       Linux, among others...  g++ -std=c++0x example.cpp -o example_program
Unit 2
Structure of a program
The best way to learn a programming language is by writing programs. Typically, the first program
beginners write is a program called "Hello World", which simply prints "Hello World" to your
computer screen. Although it is very simple, it contains all the fundamental components C++
programs have:
1 // my first program in C++
2 #include <iostream>
3
4 int main()
5 {
6   std::cout << "Hello World!";
7 }
Output: Hello World!
The listing above shows the C++ code for this program, followed by the result when the program
is executed by a computer. The numbers to the left of the code are line numbers, included to make
discussing programs and researching errors easier. They are not part of the program.
Let's examine this program line by line:
int main () { std::cout << " Hello World! "; std::cout << " I'm a C++ program "; }
The source code could have also been divided into more code lines instead:
Comments
As noted above, comments do not affect the operation of the program; however, they provide an
important tool to document directly within the source code what the program does and how it
operates.
C++ supports two ways of commenting code:
// line comment
/* block comment */
The first of them, known as line comment, discards everything from where the pair of slash signs
(//) are found up to the end of that same line. The second one, known as block comment, discards
everything between the /* characters and the first appearance of the */ characters, with the
possibility of including multiple lines.
Let's add comments to our second program:
If you have seen C++ code before, you may have seen cout being used instead of std::cout.
Both name the same object: the first one uses its unqualified name (cout), while the second
qualifies it directly within the namespace std (as std::cout).
cout is part of the standard library, and all the elements in the standard C++ library are declared
within what is called a namespace: the namespace std.
In order to refer to the elements in the std namespace a program shall either qualify each and
every use of elements of the library (as we have done by prefixing cout with std::), or introduce
visibility of its components. The most typical way to introduce visibility of these components is by
means of a using directive:
using namespace std;
The above declaration allows all elements in the std namespace to be accessed in
an unqualified manner (without the std:: prefix).
With this in mind, the last example can be rewritten to make unqualified uses of cout as:
// my second program in C++
#include <iostream>
using namespace std;

int main ()
{
  cout << "Hello World! ";
  cout << "I'm a C++ program";
}
Output: Hello World! I'm a C++ program
Both ways of accessing the elements of the std namespace (explicit qualification
and a using directive) are valid in C++ and produce the exact same behavior. For simplicity, and
to improve readability, the examples in these tutorials will more often use the latter approach,
although note that explicit qualification is the only way to guarantee that
name collisions never happen.
Namespaces are explained in more detail in a later chapter.
Identifiers
A valid identifier is a sequence of one or more letters, digits, or underscore characters (_). Spaces,
punctuation marks, and symbols cannot be part of an identifier. In addition, identifiers shall always
begin with a letter. They can also begin with an underscore character (_), but such identifiers are,
in most cases, considered reserved for compiler-specific keywords or external identifiers, as are
identifiers containing two successive underscore characters anywhere. In no case can they begin
with a digit.
C++ uses a number of keywords to identify operations and data descriptions; therefore, identifiers
created by a programmer cannot match these keywords. The standard reserved keywords that
cannot be used for programmer created identifiers are:
alignas, alignof, and, and_eq, asm, auto, bitand, bitor, bool, break,
case, catch, char, char16_t, char32_t, class, compl, const, constexpr,
const_cast, continue, decltype, default, delete, do, double,
dynamic_cast, else, enum, explicit, export, extern, false, float, for,
friend, goto, if, inline, int, long, mutable, namespace, new, noexcept,
not, not_eq, nullptr, operator, or, or_eq, private, protected, public,
register, reinterpret_cast, return, short, signed, sizeof, static,
static_assert, static_cast, struct, switch, template, this,
thread_local, throw, true, try, typedef, typeid, typename, union,
unsigned, using, virtual, void, volatile, wchar_t, while, xor, xor_eq
Specific compilers may also have additional specific reserved keywords.
Very important: The C++ language is a "case sensitive" language. That means that an identifier
written in capital letters is not equivalent to another one with the same name but written in small
letters. Thus, for example, the RESULT variable is not the same as the result variable or
the Result variable. These are three different identifiers identifying three different variables.
Character types: They can represent a single character, such as 'A' or '$'. The most
basic type is char, which is a one‐byte character. Other types are also provided for
wider characters.
Numerical integer types: They can store a whole number value, such as 7 or 1024. They
exist in a variety of sizes, and can either be signed or unsigned, depending on whether
they support negative values or not.
Floating‐point types: They can represent real values, such as 3.14 or 0.01, with
different levels of precision, depending on which of the three floating‐point types is
used.
Boolean type: The boolean type, known in C++ as bool, can only represent one of two
states, true or false.
Here is the complete list of fundamental types in C++:
Group Type names* Notes on size / precision
Declaration of variables
C++ is a strongly-typed language, and requires every variable to be declared with its type before
its first use. This tells the compiler how much memory to reserve for the variable and how to
interpret its value. The syntax to declare a new variable in C++ is straightforward: we simply write
the type followed by the variable name (i.e., its identifier). For example:
int a;
float mynumber;
These are two valid declarations of variables. The first one declares a variable of type int with the
identifier a. The second one declares a variable of type float with the identifier mynumber.
Once declared, the variables a and mynumber can be used within the rest of their scope in the
program.
If declaring more than one variable of the same type, they can all be declared in a single statement
by separating their identifiers with commas. For example:
int a, b, c;
This declares three variables (a, b and c), all of them of type int, and has exactly the same
meaning as:
int a;
int b;
int c;
To see what variable declarations look like in action within a program, let's have a look at the entire
C++ code of the example about your mental memory proposed at the beginning of this chapter:
Initialization of variables
When the variables in the example above are declared, they have an undetermined value until they
are assigned a value for the first time. But it is possible for a variable to have a specific value from
the moment it is declared. This is called the initialization of the variable.
In C++, there are three ways to initialize variables. They are all equivalent and are reminiscent of
the evolution of the language over the years:
The first one, known as c‐like initialization (because it is inherited from the C language), consists of
appending an equal sign followed by the value to which the variable is initialized:
type identifier = initial_value;
For example, to declare a variable of type int called x and initialize it to a value of zero from the
same moment it is declared, we can write:
int x = 0;
A second method, known as constructor initialization (introduced by the C++ language), encloses
the initial value between parentheses (()):
type identifier (initial_value);
For example:
int x (0);
Finally, a third method, known as uniform initialization, is similar to the above but uses curly braces
({}) instead of parentheses (this was introduced by the revision of the C++ standard in 2011):
type identifier {initial_value};
For example:
int x {0};
All three ways of initializing variables are valid and equivalent in C++.
// initialization of variables

#include <iostream>
using namespace std;

int main ()
{
  int a=5;               // initial value: 5
  int b(3);              // initial value: 3
  int c{2};              // initial value: 2
  int result;            // initial value undetermined

  a = a + b;
  result = a - c;
  cout << result;

  return 0;
}
Output: 6
int foo = 0;
auto bar = foo;  // the same as: int bar = foo;
Here, bar is declared as having an auto type; therefore, the type of bar is the type of the value
used to initialize it: in this case it uses the type of foo, which is int.
Variables that are not initialized can also make use of type deduction with the decltype specifier:
int foo = 0;
decltype(foo) bar;  // the same as: int bar;
Here, bar is declared as having the same type as foo.
auto and decltype are powerful features recently added to the language. But the type
deduction features they introduce are meant to be used either when the type cannot be obtained
by other means or when using it improves code readability. The two examples above were likely
neither of these use cases. In fact they probably decreased readability, since, when reading the
code, one has to search for the type of foo to actually know the type of bar.
Introduction to strings
Fundamental types represent the most basic types handled by the machines where the code may
run. But one of the major strengths of the C++ language is its rich set of compound types, of which
the fundamental types are mere building blocks.
An example of compound type is the string class. Variables of this type are able to store
sequences of characters, such as words or sentences. A very useful feature!
A first difference with fundamental data types is that in order to declare and use objects (variables)
of this type, the program needs to include the header where the type is defined within the standard
library (header <string>):
Constants
Constants are expressions with a fixed value.
Literals
Literals are the most obvious kind of constants. They are used to express particular values within
the source code of a program. We have already used some in previous chapters to give specific
values to variables or to express messages we wanted our programs to print out, for example, when
we wrote:
a = 5;
The 5 in this piece of code was a literal constant.
Literal constants can be classified into: integer, floating‐point, characters, strings, Boolean, pointers,
and user‐defined literals.
Integer Numerals
1776
707
-273
These are numerical constants that identify integer values. Notice that they are not enclosed in
quotes or any other special character; they are a simple succession of digits representing a whole
number in decimal base; for example, 1776 always represents the value one thousand seven
hundred seventy‐six.
In addition to decimal numbers (those that most of us use every day), C++ allows the use of octal
numbers (base 8) and hexadecimal numbers (base 16) as literal constants. For octal literals, the
digits are preceded with a 0 (zero) character. And for hexadecimal, they are preceded by the
characters 0x (zero, x). For example, the following literal constants are all equivalent to each
other:
75         // decimal
0113       // octal
0x4b       // hexadecimal
All of these represent the same number: 75 (seventy‐five) expressed as a base‐10 numeral, octal
numeral and hexadecimal numeral, respectively.
These literal constants have a type, just like variables. By default, integer literals are of type int.
However, certain suffixes may be appended to an integer literal to specify a different integer type:
Suffix    Type modifier
u or U    unsigned
l or L    long
ll or LL  long long
Unsigned may be combined with any of the other two in any order to form unsigned
long or unsigned long long.
For example:
75         // int
75u        // unsigned int
75l        // long
75ul       // unsigned long
75lu       // unsigned long
In all the cases above, the suffix can be specified using either upper or lowercase letters.
Floating Point Numerals
They express real values, with decimals and/or exponents. They can include either a decimal point,
an e character (that expresses "times ten raised to the power X", where X is an integer value that
follows the e character), or both a decimal point and an e character:
3.14159    // 3.14159
6.02e23    // 6.02 x 10^23
1.6e-19    // 1.6 x 10^-19
3.0        // 3.0
These are four valid numbers with decimals expressed in C++. The first number is pi, the second
one is Avogadro's number, the third is the electric charge of an electron (an extremely small
number), all of them approximated, and the last one is the number three expressed as a
floating-point numeric literal.
The default type for floating‐point literals is double. Floating‐point literals of type float or long
double can be specified by adding one of the following suffixes:
Suffix Type
f or F float
l or L long double
For example:
Character and string literals
Character and string literals are enclosed in quotes:
'z'
'p'
"Hello world"
"How do you do?"
The first two expressions represent single‐character literals, and the following two represent string
literals composed of several characters. Notice that to represent a single character, we enclose it
between single quotes ('), and to express a string (which generally consists of more than one
character), we enclose the characters between double quotes (").
Both single‐character and string literals require quotation marks surrounding them to distinguish
them from possible variable identifiers or reserved keywords. Notice the difference between these
two expressions:
x
'x'
Here, x alone would refer to an identifier, such as the name of a variable or a compound type,
whereas 'x' (enclosed within single quotation marks) would refer to the character literal 'x' (the
character that represents a lowercase x letter).
Character and string literals can also represent special characters that are difficult or impossible to
express otherwise in the source code of a program, like newline (\n) or tab (\t). These special
characters are all preceded by a backslash character (\).
Here is a list of the single character escape codes:
\n newline
\r carriage return
\t tab
\v vertical tab
\b backspace
\a alert (beep)
x = "string expressed in \
two lines"
is equivalent to:
Prefix  Character type
u       char16_t
U       char32_t
L       wchar_t
Note that, unlike type suffixes for integer literals, these prefixes are case sensitive: lowercase
for char16_t and uppercase for char32_t and wchar_t.
For string literals, apart from the above u, U, and L, two additional prefixes exist:
Prefix Description
Other literals
Three keyword literals exist in C++: true, false and nullptr:
true and false are the two possible values for variables of type bool.
nullptr is the null pointer value.
bool foo = true;
bool bar = false;
int* p = nullptr;
Operators
Once introduced to variables and constants, we can begin to operate with them by using operators.
What follows is a complete list of operators. At this point, it is likely not necessary to know all of
them, but they are all listed here to also serve as reference.
Assignment operator (=)
The assignment operator assigns a value to a variable.
x = 5;
This statement assigns the integer value 5 to the variable x. The assignment operation always takes
place from right to left, and never the other way around:
x = y;
This statement assigns to variable x the value contained in variable y. The value of x at the
moment this statement is executed is lost and replaced by the value of y.
Consider also that we are only assigning the value of y to x at the moment of the assignment
operation. Therefore, if y changes at a later moment, it will not affect the new value taken by x.
For example, let's have a look at the following code ‐ I have included the evolution of the content
stored in the variables as comments:
x = 5;
y = 2 + x;
With the final result of assigning 7 to y.
The following expression is also valid in C++:
x = y = z = 5;
It assigns 5 to all three variables: x, y and z; always from right-to-left.
Arithmetic operators ( +, -, *, /, % )
The five arithmetical operations supported by C++ are:
operator description
+ addition
- subtraction
* multiplication
/ division
% modulo
Operations of addition, subtraction, multiplication and division correspond literally to their
respective mathematical operators. The last one, modulo operator, represented by a percentage
sign (%), gives the remainder of a division of two values. For example:
x = 11 % 3;
results in variable x containing the value 2, since dividing 11 by 3 results in 3, with a remainder of
2.
Compound assignment (+=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |=)
Compound assignment operators modify the current value of a variable by performing an
operation on it. They are equivalent to assigning the result of an operation to the first operand:
expression            equivalent to...
y += x;               y = y + x;
x -= 5;               x = x - 5;
x /= y;               x = x / y;
price *= units + 1;   price = price * (units+1);
and the same for all other compound assignment operators. For example:
Some expressions can be shortened even more: the increase operator (++) and the decrease
operator (--) increase or reduce by one the value stored in a variable. They are equivalent
to +=1 and to -=1, respectively. Thus:
++x;
x+=1;
x=x+1;
are all equivalent in their functionality; the three of them increase by one the value of x.
In the early C compilers, the three previous expressions may have produced different executable
code depending on which one was used. Nowadays, this type of code optimization is generally
performed automatically by the compiler, thus the three expressions should produce exactly the
same executable code.
A peculiarity of this operator is that it can be used both as a prefix and as a suffix. That means that
it can be written either before the variable name (++x) or after it (x++). In simple expressions
like x++ or ++x, both have exactly the same meaning, but in other expressions in which the result
of the increment or decrement operation is evaluated, they may differ in an important way. When
the increase operator is used as a prefix (++x), the expression evaluates to the final value of x,
once it is already increased. When it is used as a suffix (x++), the value is also increased, but
the expression evaluates to the value that x had before being increased. Notice the difference:
Example 1 Example 2
x = 3; x = 3;
y = ++x; y = x++;
// x contains 4, y contains 4 // x contains 4, y contains 3
In Example 1, the value assigned to y is the value of x after being increased, while in Example 2
it is the value x had before being increased.
Relational and comparison operators ( ==, !=, >, <, >=, <= )
Two expressions can be compared using relational and equality operators. For example, to know if
two values are equal or if one is greater than the other.
The result of such an operation is either true or false (i.e., a Boolean value).
The relational operators in C++ are:
operator  description
==        Equal to
!=        Not equal to
<         Less than
>         Greater than
<=        Less than or equal to
>=        Greater than or equal to
(7 == 5)   // evaluates to false
(5 > 4)    // evaluates to true
(3 != 2)   // evaluates to true
(6 >= 6)   // evaluates to true
(5 < 5)    // evaluates to false
Of course, it's not just numeric constants that can be compared, but any value, including
variables. Suppose that a=2, b=3 and c=6, then:
The operator ! is the C++ operator for the Boolean operation NOT. It has only one operand, to its
right, and inverts it, producing false if its operand is true, and true if its operand is false.
Basically, it returns the opposite Boolean value of evaluating its operand. For example:
&& OPERATOR (and)
a      b      a && b
true   true   true
true   false  false
false  true   false
false  false  false
The operator || corresponds to the Boolean logical operation OR, which yields true if either of
its operands is true, thus being false only when both operands are false. Here are the possible
results of a||b:
|| OPERATOR (or)
a b a || b
true true true
true false true
false true true
false false false
For example:
( (5 == 5) && (3 > 6) )  // evaluates to false ( true && false )
( (5 == 5) || (3 > 6) )  // evaluates to true ( true || false )
When using the logical operators, C++ only evaluates what is necessary from left to right to come
up with the combined relational result, ignoring the rest. Therefore, in the last example
((5==5)||(3>6)), C++ evaluates first whether 5==5 is true, and if so, it never checks
whether 3>6 is true or not. This is known as short‐circuit evaluation, and works like this for these
operators:
operator  short-circuit
&&        if the left-hand side expression is false, the combined result is false (the right-hand
          side expression is never evaluated).
||        if the left-hand side expression is true, the combined result is true (the right-hand
          side expression is never evaluated).
This is mostly important when the right‐hand expression has side effects, such as altering values:
// conditional operator
#include <iostream>
using namespace std;

int main ()
{
  int a,b,c;

  a=2;
  b=7;
  c = (a>b) ? a : b;

  cout << c << '\n';
}
Output: 7
In this example, a was 2, and b was 7, so the expression being evaluated (a>b) was not true, thus
the first value specified after the question mark was discarded in favor of the second value (the
one after the colon) which was b (with a value of 7).
Comma operator ( , )
The comma operator (,) is used to separate two or more expressions that are included where only
one expression is expected. When the set of expressions has to be evaluated for a value, only the
right‐most expression is considered.
For example, the following code:
a = (b=3, b+2);
would first assign the value 3 to b, and then assign b+2 to variable a. So, at the end,
variable a would contain the value 5 while variable b would contain value 3.
Bitwise operators ( &, |, ^, ~, <<, >> )
Bitwise operators modify variables considering the bit patterns that represent the values they store.
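operator  asm equivalent  description
&         AND             Bitwise AND
|         OR              Bitwise inclusive OR
^         XOR             Bitwise exclusive OR
~         NOT             Unary complement (bit inversion)
<<        SHL             Shift bits left
>>        SHR             Shift bits right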
int i;
float f = 3.14;
i = (int) f;
The previous code converts the floating-point number 3.14 to an integer value (3); the fractional
part is lost. Here, the typecasting operator was (int). Another way to do the same thing in C++ is
to use functional notation, writing the type before the expression to be converted and enclosing
the expression in parentheses:
i = int (f);
Both ways of casting types are valid in C++.
sizeof
This operator accepts one parameter, which can be either a type or a variable, and returns the size
in bytes of that type or object:
x = sizeof (char);
Here, x is assigned the value 1, because char is a type with a size of one byte.
The value returned by sizeof is a compile‐time constant, so it is always determined before
program execution.
Other operators
Later in these tutorials, we will see a few more operators, like the ones referring to pointers or the
specifics for object‐oriented programming.
Precedence of operators
A single expression may have multiple operators. For example:
x = 5 + 7 % 2;
In C++, the above expression always assigns 6 to variable x, because the % operator has a higher
precedence than the + operator, and is always evaluated first. Parts of the expression can be
enclosed in parentheses to override this precedence order, or to make the intended effect
explicitly clear. Notice the difference:
Level  Precedence group              Operator                            Description                       Grouping
3      Prefix (unary)                + -                                 unary prefix                      Right-to-left
                                     & *                                 reference / dereference
                                     new delete                          allocation / deallocation
15     Assignment-level expressions  = *= /= %= += -= >>= <<= &= ^= |=   assignment / compound assignment  Right-to-left
                                     ?:                                  conditional operator
16     Sequencing                    ,                                   comma separator                   Left-to-right
When an expression has two operators with the same precedence level, grouping determines
which one is evaluated first: either left‐to‐right or right‐to‐left.
Enclosing all sub‐statements in parentheses (even those unnecessary because of their precedence)
improves code readability.
Basic Input/Output
The example programs of the previous sections provided little interaction with the user, if any at
all. They simply printed simple values on screen, but the standard library provides many additional
ways to interact with the user via its input/output features. This section will present a short
introduction to some of the most useful.
C++ uses a convenient abstraction called streams to perform input and output operations in
sequential media such as the screen, the keyboard or a file. A stream is an entity that a program
can either insert characters into or extract characters from. There is no need to know details
about the media associated with the stream or any of its internal specifications. All we need to
know is that streams are a source/destination of characters, and that these characters are
provided/accepted sequentially (i.e., one after another).
The standard library defines a handful of stream objects that can be used to access what are
considered the standard sources and destinations of characters by the environment where the
program runs:
stream description
cout << "This " << " is a " << "single C++ statement";
This last statement would print the text This is a single C++ statement. Chaining
insertions is especially useful to mix literals and variables in a single statement:
cout << "I am " << age << " years old and my zipcode is " <<
zipcode;
Assuming the age variable contains the value 24 and the zipcode variable contains 90064, the
output of the previous statement would be:
I am 24 years old and my zipcode is 90064
What cout does not do automatically is add line breaks at the end, unless instructed to do so. For
example, take the following two statements inserting into cout:
cout << "This is a sentence.";
cout << "This is another sentence.";
The output would be in a single line, without any line breaks in between. Something like:
This is a sentence.This is another sentence.
To insert a line break, a new‐line character shall be inserted at the exact position the line should
be broken. In C++, a new‐line character can be specified as \n (i.e., a backslash character followed
by a lowercase n). For example:
int age;
cin >> age;
The first statement declares a variable of type int called age, and the second extracts from cin a
value to be stored in it. This operation makes the program wait for input from cin; generally, this
means that the program will wait for the user to enter some sequence with the keyboard. In this
case, note that the characters introduced using the keyboard are only transmitted to the program
when the ENTER (or RETURN) key is pressed. Once the statement with the extraction operation
on cin is reached, the program will wait for as long as needed until some input is introduced.
The extraction operation on cin uses the type of the variable after the >> operator to determine
how it interprets the characters read from the input; if it is an integer, the format expected is a
series of digits; if it is a string, a sequence of characters; and so on.
cin >> a;
cin >> b;
In both cases, the user is expected to introduce two values, one for variable a, and another for
variable b. Any kind of space is used to separate two consecutive input operations; this may either
be a space, a tab, or a new‐line character.
The extraction operator can be used on cin to get strings of characters in the same way as with
fundamental data types:
string mystring;
cin >> mystring;
However, cin extraction always considers spaces (whitespaces, tabs, new‐line...) as terminating
the value being extracted, and thus extracting a string means to always extract a single word, not
a phrase or an entire sentence.
To get an entire line from cin, there exists a function, called getline, that takes the stream (cin)
as first argument, and the string variable as second. For example:
stringstream
The standard header <sstream> defines a type called stringstream that allows a string to be treated
as a stream, and thus allowing extraction or insertion operations from/to strings in the same way
as they are performed on cin and cout. This feature is most useful to convert strings to numerical
values and vice versa. For example, in order to extract an integer from a string we can write:
string mystr ("1204");
int myint;
stringstream(mystr) >> myint;
This declares a string initialized to a value of "1204", and a variable of type int. Then,
the third line extracts into this variable from a stringstream constructed from the string. This
piece of code stores the numerical value 1204 in the variable called myint.
Unit 3
Statements and flow control
A simple C++ statement is each of the individual instructions of a program, like the variable
declarations and expressions seen in previous sections. They always end with a semicolon (;), and
are executed in the same order in which they appear in a program.
But programs are not limited to a linear sequence of statements. During its process, a program may
repeat segments of code, or take decisions and bifurcate. For that purpose, C++ provides flow
control statements that serve to specify what has to be done by our program, when, and under
which circumstances.
Many of the flow control statements explained in this section require a generic (sub)statement as
part of their syntax. This statement may either be a simple C++ statement (such as a single instruction
terminated with a semicolon) or a compound statement. A compound statement is a group of
statements (each of them terminated by its own semicolon), all grouped together in a block
enclosed in curly braces, {}:
{ statement1; statement2; statement3; }
The entire block is considered a single statement (composed itself of multiple substatements).
Whenever a generic statement is part of the syntax of a flow control statement, this can either be
a simple statement or a compound statement.
The if keyword is used to execute a statement or block, if, and only if, a condition is fulfilled. Its
syntax is:
if (condition) statement
Here, condition is the expression that is being evaluated. If this condition is
true, statement is executed. If it is false, statement is not executed (it is simply ignored), and
the program continues right after the entire selection statement.
For example, the following code fragment prints the message (x is 100), only if the value stored
in the x variable is indeed 100:
if (x == 100)
  cout << "x is 100";
If x is not exactly 100, this statement is ignored, and nothing is printed.
If you want to include more than a single statement to be executed when the condition is fulfilled,
these statements shall be enclosed in braces ({}), forming a block:
if (x == 100)
{
  cout << "x is ";
  cout << x;
}
As usual, indentation and line breaks in the code have no effect, so the above code is equivalent
to:
if (x == 100)
  cout << "x is 100";
else
  cout << "x is not 100";
This prints x is 100, if indeed x has a value of 100, but if it does not, and only if it does not, it
prints x is not 100 instead.
Several if + else structures can be concatenated with the intention of checking a range of values.
For example:
if (x > 0)
  cout << "x is positive";
else if (x < 0)
  cout << "x is negative";
else
  cout << "x is 0";
This prints whether x is positive, negative, or zero by concatenating two if‐else structures. Again, it
would have also been possible to execute more than a single statement per case by grouping them
into blocks enclosed in braces: {}.
Iteration statements (loops)
Loops repeat a statement a certain number of times, or while a condition is fulfilled. They are
introduced by the keywords while, do, and for.
The while loop
The simplest kind of loop is the while‐loop. Its syntax is:
while (expression) statement
The while‐loop simply repeats statement while expression is true. If, after any execution
of statement, expression is no longer true, the loop ends, and the program continues right
after the loop. For example, let's have a look at a countdown using a while‐loop:
1. n is assigned a value
2. The while condition is checked (n>0). At this point there are two possibilities:
   ‐ condition is true: the statement is executed (to step 3)
   ‐ condition is false: ignore statement and continue after it (to step 5)
3. Execute statement:
   cout << n << ", ";
   --n;
   (prints the value of n and decreases n by 1)
4. End of block. Return automatically to step 2.
5. Continue the program right after the block:
   print liftoff! and end the program.
A thing to consider with while‐loops is that the loop should end at some point, and thus the
statement shall alter values checked in the condition in some way, so as to force it to become false
at some point. Otherwise, the loop will continue looping forever. In this case, the loop includes --n,
which decreases the value of the variable that is being evaluated in the condition (n) by one; this
will eventually make the condition (n>0) false after a certain number of loop iterations. To be more
specific, after 10 iterations, n becomes 0, making the condition no longer true, and ending the
while‐loop.
Note that the complexity of this loop is trivial for a computer, and so the whole countdown is
performed instantly, without any practical delay between elements of the count (if interested,
see sleep_for for a countdown example with delays).
The do‐while loop
A very similar loop is the do‐while loop, whose syntax is:
do statement while (condition);
It behaves like a while‐loop, except that condition is evaluated after the execution
of statement instead of before, guaranteeing at least one execution of statement, even
if condition is never fulfilled. For example, the following example program echoes any text the
user introduces until the user enters goodbye:
The for loop
The for loop is designed to iterate a number of times. Its syntax is:
for (initialization; condition; increase) statement;
Like the while‐loop, this loop repeats statement while condition is true. But, in addition, the
for loop provides specific locations to contain an initialization and
an increase expression, executed before the loop begins the first time, and after each iteration,
respectively. Therefore, it is especially useful to use counter variables as condition.
It works in the following way:
1. initialization is executed. Generally, this declares a counter variable, and sets it to
some initial value. This is executed a single time, at the beginning of the loop.
2. condition is checked. If it is true, the loop continues; otherwise, the loop ends,
and statement is skipped, going directly to step 5.
3. statement is executed. As usual, it can be either a single statement or a block enclosed
in curly braces { }.
4. increase is executed, and the loop gets back to step 2.
5. the loop ends: execution continues by the next statement after it.
Here is the countdown example using a for loop:
n starts with a value of 0, and i with 100, the condition is n!=i (i.e., that n is not equal to i).
Because n is increased by one, and i decreased by one on each iteration, the loop's condition will
become false after the 50th iteration, when both n and i are equal to 50.
Range‐based for loop
The for‐loop has another syntax, which is used exclusively with ranges:
for ( declaration : range ) statement;
This kind of for loop iterates over all the elements in range, where declaration declares some
variable able to take the value of an element in this range. Ranges are sequences of elements,
including arrays, containers, and any other type supporting the functions begin and end; Most of
these types have not yet been introduced in this tutorial, but we are already acquainted with at
least one kind of range: strings, which are sequences of characters.
An example of range‐based for loop using strings:
Jump statements
Jump statements allow altering the flow of a program by performing jumps to specific locations.
The break statement
break leaves a loop, even if the condition for its end is not fulfilled. It can be used to end an infinite
loop, or to force it to end before its natural end. For example, let's stop the countdown before its
natural end:
The continue statement
The continue statement causes the program to skip the rest of the loop in the current iteration,
as if the end of the statement block had been reached, causing it to jump to the start of the
following iteration. For example, let's skip number 5 in our countdown:
The goto statement
goto makes an absolute jump to another point in the program. This unconditional jump
ignores nesting levels, and does not cause any automatic stack unwinding. Therefore, it is a feature
to use with care, and preferably within the same block of statements, especially in the presence of
local variables.
The destination point is identified by a label, which is then used as an argument for
the goto statement. A label is made of a valid identifier followed by a colon (:).
goto is generally deemed a low‐level feature, with no particular use cases in modern higher‐level
programming paradigms generally used with C++. But, just as an example, here is a version of our
countdown loop using goto:
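The switch statement
The switch statement checks an expression against a list of possible constant values. Its most typical syntax is:
switch (expression)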
{
case constant1:
group-of-statements-1;
break;
case constant2:
group-of-statements-2;
break;
.
.
.
default:
default-group-of-statements
}
It works in the following way: switch evaluates expression and checks if it is equivalent
to constant1; if it is, it executes group-of-statements-1 until it finds
the break statement. When it finds this break statement, the program jumps to the end of the
entire switch statement (the closing brace).
If expression was not equal to constant1, it is then checked against constant2. If it is equal to
this, it executes group-of-statements-2 until a break is found, when it jumps to the end of
the switch.
Finally, if the value of expression did not match any of the previously specified constants (there
may be any number of these), the program executes the statements included after
the default: label, if it exists (since it is optional).
Both of the following code fragments have the same behavior, demonstrating the if‐else equivalent
of a switch statement:
switch example:
switch (x) {
  case 1:
    cout << "x is 1";
    break;
  case 2:
    cout << "x is 2";
    break;
  default:
    cout << "value of x unknown";
}
if‐else equivalent:
if (x == 1) {
  cout << "x is 1";
}
else if (x == 2) {
  cout << "x is 2";
}
else {
  cout << "value of x unknown";
}
The switch statement has a somewhat peculiar syntax inherited from the early times of the first
C compilers, because it uses labels instead of blocks. In the most typical use (shown above), this
means that break statements are needed after each group of statements for a particular label.
If break is not included, all statements following the case (including those under any other labels)
are also executed, until the end of the switch block or a jump statement (such as break) is reached.
If the example above lacked the break statement after the first group for case one, the program
would not jump automatically to the end of the switch block after printing x is 1, and would
instead continue executing the statements in case two (thus printing also x is 2). It would then
continue doing so until a break statement is encountered, or the end of the switch block. This
makes it unnecessary to enclose the statements for each case in braces {}, and can also be useful to
execute the same group of statements for different possible values. For example:
switch (x) {
  case 1:
  case 2:
  case 3:
    cout << "x is 1, 2 or 3";
    break;
  default:
    cout << "x is not 1, 2 nor 3";
}
Notice that switch is limited to comparing its evaluated expression against labels that are constant
expressions. It is not possible to use variables as labels or ranges, because they are not valid C++
constant expressions.
To check for ranges or values that are not constant, it is better to use concatenations
of if and else if statements.
Functions
Functions allow programs to be structured in segments of code that perform individual tasks.
In C++, a function is a group of statements that is given a name, and which can be called from some
point of the program. The most common syntax to define a function is:
type name ( parameter1, parameter2, ...) { statements }
Where:
‐ type is the type of the value returned by the function.
‐ name is the identifier by which the function can be called.
‐ parameters (as many as needed): Each parameter consists of a type followed by an identifier,
with each parameter being separated from the next by a comma. Each parameter looks very much
like a regular variable declaration (for example: int x), and in fact acts within the function as a
regular variable which is local to the function. The purpose of parameters is to allow passing
arguments to the function from the location where it is called from.
‐ statements is the function's body. It is a block of statements surrounded by braces { } that
specify what the function actually does.
Let's have a look at an example:
// function example
#include <iostream>
using namespace std;

int addition (int a, int b)
{
  int r;
  r=a+b;
  return r;
}

int main ()
{
  int z;
  z = addition (5,3);
  cout << "The result is " << z;
}
Output:
The result is 8
This program is divided in two functions: addition and main. Remember that no matter the
order in which they are defined, a C++ program always starts by calling main. In fact, main is the
only function called automatically, and the code in any other function is only executed if its function
is called from main (directly or indirectly).
In the example above, main begins by declaring the variable z of type int, and right after that, it
performs the first function call: it calls addition. The call to a function follows a structure very
similar to its declaration. In the example above, the call to addition can be compared to its
definition just a few lines earlier:
The parameters in the function declaration have a clear correspondence to the arguments passed
in the function call. The call passes two values, 5 and 3, to the function; these correspond to the
parameters a and b, declared for function addition.
At the point at which the function is called from within main, the control is passed to
function addition: here, execution of main is stopped, and will only resume once
the addition function ends. At the moment of the function call, the values of both arguments
(5 and 3) are copied to the local variables int a and int b within the function.
Then, inside addition, another local variable is declared (int r), and by means of the
expression r=a+b, the result of a plus b is assigned to r; which, for this case, where a is 5 and b is
3, means that 8 is assigned to r.
The final statement within the function:
return r;
Ends function addition, and returns the control back to the point where the function was called;
in this case: to function main. At this precise moment, the program resumes its course
on main returning exactly at the same point at which it was interrupted by the call to addition.
But additionally, because addition has a return type, the call is evaluated as having a value, and
this value is the value specified in the return statement that ended addition: in this particular
case, the value of the local variable r, which at the moment of the return statement had a value
of 8.
Therefore, the call to addition is an expression with the value returned by the function, and in
this case, that value, 8, is assigned to z. It is as if the entire function call (addition(5,3)) was
replaced by the value it returns (i.e., 8).
Then main simply prints this value by calling:
cout << "The result is " << z;
z = subtraction (7,2);
cout << "The first result is " << z;
If we replace the function call by the value it returns (i.e., 5), we would have:
z = 5;
cout << "The first result is " << z;
With the same procedure, we could interpret:
cout << "The second result is " << subtraction (7,2);
as:
cout << "The second result is " << 5;
since 5 is the value returned by subtraction (7,2).
In the case of:
z = 4 + subtraction (x,y);
The only addition being that now the function call is also an operand of an addition operation.
Again, the result is the same as if the function call was replaced by its result: 6. Note, that thanks
to the commutative property of additions, the above can also be written as:
z = subtraction (x,y) + 4;
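With exactly the same result. Note also that the semicolon does not necessarily go after the
function call, but, as always, at the end of the whole statement. Again, the logic behind this may be
easily seen by replacing the function calls by their returned value:
z = 4 + 2;
z = 2 + 4;
Functions that take no parameters are still called with an empty pair of parentheses. For example, a
function printmessage with no parameters is called with: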
printmessage ();
The parentheses are what differentiate functions from other kinds of declarations or statements.
The following would not call the function:
printmessage;
The return value of main
You may have noticed that the return type of main is int, but most examples in this and earlier
chapters did not actually return any value from main.
Well, there is a catch: If the execution of main ends normally without encountering
a return statement the compiler assumes the function ends with an implicit return statement:
return 0;
Note that this only applies to function main for historical reasons. All other functions with a return
type shall end with a proper return statement that includes a return value, even if this is never
used.
When main returns zero (either implicitly or explicitly), it is interpreted by the environment as that
the program ended successfully. Other values may be returned by main, and some environments
give access to that value to the caller in some way, although this behavior is not required nor
necessarily portable between platforms. The values for main that are guaranteed to be
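interpreted in the same way on all platforms are:
value         description
0             The program was successful.
EXIT_SUCCESS  The program was successful (same as above). This value is defined in header <cstdlib>.
EXIT_FAILURE  The program failed. This value is defined in header <cstdlib>.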
In certain cases, though, it may be useful to access an external variable from within a function. To
do that, arguments can be passed by reference, instead of by value. For example, the
function duplicate in this code duplicates the value of its three arguments, causing the variables
used as arguments to actually be modified by the call:
In fact, a, b, and c become aliases of the arguments passed on the function call (x, y, and z) and
any change on a within the function is actually modifying variable x outside the function. Any
change on b modifies y, and any change on c modifies z. That is why when, in the example,
function duplicate modifies the values of variables a, b, and c, the values of x, y, and z are
affected.
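If duplicate were instead defined with ordinary (non‐reference) parameters, the arguments would be
passed by value: the function would modify only its own local copies, and x, y, and z would keep
their original values.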
Inline functions
Calling a function generally causes a certain overhead (stacking arguments, jumps, etc...), and thus
for very short functions, it may be more efficient to simply insert the code of the function where it
is called, instead of performing the process of formally calling a function.
Preceding a function declaration with the inline specifier informs the compiler that inline
expansion is preferred over the usual function call mechanism for a specific function. This does not
change at all the behavior of a function, but is merely used to suggest to the compiler that the code
generated by the function body shall be inserted at each point the function is called, instead of
being invoked with a regular function call.
For example, the concatenate function above may be declared inline as:
divide (12)
The call only passes one argument to the function, even though the function has two parameters.
In this case, the function assumes the second parameter to be 2 (notice the function definition,
which declares its second parameter as int b=2). Therefore, the result is 6.
In the second call:
divide (20,4)
The call passes two arguments to the function. Therefore, the default value for b (int b=2) is
ignored, and b takes the value passed as argument, that is 4, yielding a result of 5.
Declaring functions
In C++, identifiers can only be used in expressions once they have been declared. For example,
some variable x cannot be used before being declared with a statement, such as:
int x;
The same applies to functions. Functions cannot be called before they are declared. That is why, in
all the previous examples of functions, the functions were always defined before
the main function, which is the function from where the other functions were called. If main were
defined before the other functions, this would break the rule that functions shall be declared
before being used, and thus would not compile.
The prototype of a function can be declared without actually defining the function completely,
giving just enough details to allow the types involved in a function call to be known. Naturally, the
function shall be defined somewhere else, like later in the code. But at least, once declared like
this, it can already be called.
The declaration shall include all types involved (the return type and the type of its arguments),
using the same syntax as used in the definition of the function, but replacing the body of the
function (the block of statements) with an ending semicolon.
The parameter list does not need to include the parameter names, but only their types. Parameter
names can nevertheless be specified, but they are optional, and do not need to necessarily match
those in the function definition. For example, a function called protofunction with two int
parameters can be declared with either of these statements:
Recursivity
Recursivity is the property that functions have of being able to call themselves. It is useful for some tasks,
such as sorting elements, or calculating the factorial of numbers. For example, in order to obtain
the factorial of a number (n!) the mathematical formula would be:
n! = n * (n-1) * (n-2) * (n-3) ... * 1
More concretely, 5! (factorial of 5) would be:
5! = 5 * 4 * 3 * 2 * 1 = 120
And a recursive function to calculate this in C++ could be:
// factorial calculator
#include <iostream>
using namespace std;

long factorial (long a)
{
  if (a > 1)
    return (a * factorial (a-1));
  else
    return 1;
}

int main ()
{
  long number = 9;
  cout << number << "! = " << factorial (number);
  return 0;
}
Output:
9! = 362880
Notice how in function factorial we included a call to itself, but only if the argument passed was
greater than 1, since, otherwise, the function would perform an infinite recursive loop, in which
once it arrived at 0, it would continue multiplying by all the negative numbers (probably provoking
a stack overflow at some point during runtime).
Overloaded functions
In C++, two different functions can have the same name if their parameters are different; either
because they have a different number of parameters, or because any of their parameters are of a
different type. For example:
// overloading functions
#include <iostream>
using namespace std;

int operate (int a, int b)
{
  return (a*b);
}

double operate (double a, double b)
{
  return (a/b);
}

int main ()
{
  int x=5,y=2;
  double n=5.0,m=2.0;
  cout << operate (x,y) << '\n';
  cout << operate (n,m) << '\n';
  return 0;
}
Output:
10
2.5
In this example, there are two functions called operate, but one of them has two parameters of
type int, while the other has them of type double. The compiler knows which one to call in each
case by examining the types passed as arguments when the function is called. If it is called with
two int arguments, it calls the function that has two int parameters, and if it is called with
two doubles, it calls the one with two doubles.
In this example, both functions have quite different behaviors: the int version multiplies its
arguments, while the double version divides them. This is generally not a good idea. Two functions
with the same name are generally expected to have, at least, a similar behavior, but this example
demonstrates that it is entirely possible for them not to. Two overloaded functions (i.e., two
functions with the same name) have entirely different definitions; they are, for all purposes,
different functions that only happen to have the same name.
Note that a function cannot be overloaded only by its return type. The parameter lists must differ,
either in the number of parameters or in the type of at least one of them.
Function templates
Overloaded functions may have the same definition. For example:
// overloaded functions
#include <iostream>
using namespace std;

int sum (int a, int b)
{
  return a+b;
}

double sum (double a, double b)
{
  return a+b;
}

int main ()
{
  cout << sum (10,20) << '\n';
  cout << sum (1.0,1.5) << '\n';
  return 0;
}
Output:
30
2.5
Here, sum is overloaded with different parameter types, but with the exact same body.
The function sum could be overloaded for a lot of types, and it could make sense for all of them to
have the same body. For cases such as this, C++ has the ability to define functions with generic
types, known as function templates. Defining a function template follows the same syntax as a
regular function, except that it is preceded by the template keyword and a series of template
parameters enclosed in angle‐brackets <>:
template <template-parameters> function-declaration
The template parameters are a series of parameters separated by commas. These parameters can
be generic template types by specifying either the class or typename keyword followed by an
identifier. This identifier can then be used in the function declaration as if it was a regular type. For
example, a generic sum function could be defined as:
x = sum<int>(10,20);
The function sum<int> is just one of the possible instantiations of function template sum. In this
case, by using int as template argument in the call, the compiler automatically instantiates a
version of sum where each occurrence of SomeType is replaced by int, as if it was defined as:
// function template
#include <iostream>
using namespace std;

template <class T>
T sum (T a, T b)
{
  T result;
  result = a + b;
  return result;
}

int main () {
  int i=5, j=6, k;
  double f=2.0, g=0.5, h;
  k=sum<int>(i,j);
  h=sum<double>(f,g);
  cout << k << '\n';
  cout << h << '\n';
  return 0;
}
Output:
11
2.5
In this case, we have used T as the template parameter name, instead of SomeType. It makes no
difference, and T is actually a quite common template parameter name for generic types.
In the example above, we used the function template sum twice. The first time with arguments of
type int, and the second one with arguments of type double. The compiler has instantiated and
then called each time the appropriate version of the function.
Note also how T is also used to declare a local variable of that (generic) type within sum:
T result;
Therefore, result will be a variable of the same type as the parameters a and b, and as the type
returned by the function.
In this specific case where the generic type T is used as a parameter for sum, the compiler is even
able to deduce the data type automatically without having to explicitly specify it within angle
brackets. Therefore, instead of explicitly specifying the template arguments with:
k = sum<int> (i,j);
h = sum<double> (f,g);
It is possible to instead simply write:
k = sum (i,j);
h = sum (f,g);
without the type enclosed in angle brackets. Naturally, for that, the type shall be unambiguous.
If sum is called with arguments of different types, the compiler may not be able to deduce the type
of T automatically.
Templates are a powerful and versatile feature. They can have multiple template parameters, and
the function can still use regular non‐templated types. For example:
are_equal(10,10.0)
Is equivalent to:
are_equal<int,double>(10,10.0)
There is no ambiguity possible because numerical literals are always of a specific type: Unless
otherwise specified with a suffix, integer literals small enough to fit in an int produce values of
type int, and floating‐point literals produce values of type double. Therefore 10 always has
type int and 10.0 always has type double.
The template parameters can not only include types introduced by class or typename, but can
also include expressions of a particular type:
// template arguments
#include <iostream>
using namespace std;

template <class T, int N>
T fixed_multiply (T val)
{
  return val * N;
}

int main() {
  std::cout << fixed_multiply<int,2>(10) << '\n';
  std::cout << fixed_multiply<int,3>(10) << '\n';
}
Output:
20
30
The second argument of the fixed_multiply function template is of type int. It just looks like
a regular function parameter, and can actually be used just like one.
But there exists a major difference: the value of template parameters is determined at compile
time to generate a different instantiation of the function fixed_multiply, and thus the value
of that argument is never passed during runtime. The two calls
to fixed_multiply in main essentially call two versions of the function: one that always
multiplies by two, and one that always multiplies by three. For that same reason, the second
template argument needs to be a constant expression (a variable whose value is only known at runtime cannot be used).
Name visibility
Scopes
Named entities, such as variables, functions, and compound types need to be declared before
being used in C++. The point in the program where this declaration happens influences its visibility:
An entity declared outside any block has global scope, meaning that its name is valid anywhere in
the code. While an entity declared within a block, such as a function or a selective statement,
has block scope, and is only visible within the specific block in which it is declared, but not outside
it.
Variables with block scope are known as local variables.
For example, a variable declared in the body of a function is a local variable that extends until the
end of the function (i.e., until the brace } that closes the function definition), but not outside
it:
int foo;        // global variable

int some_function ()
{
  int bar;      // local variable
  bar = 0;
}

int other_function ()
{
  foo = 1;  // ok: foo is a global variable
  bar = 2;  // wrong: bar is not visible from this function
}
In each scope, a name can only represent one entity. For example, there cannot be two variables
with the same name in the same scope:
int some_function ()
{
  int x;
  x = 0;
  double x;   // wrong: name already used in this scope
  x = 0.0;
}
The visibility of an entity with block scope extends until the end of the block, including inner blocks.
Nevertheless, an inner block, because it is a different block, can re‐utilize a name existing in an
outer scope to refer to a different entity; in this case, the name will refer to a different entity only
within the inner block, hiding the entity it names outside. While outside it, it will still refer to the
original entity. For example:
Namespaces
Only one entity can exist with a particular name in a particular scope. This is seldom a problem for
local names, since blocks tend to be relatively short, and names have particular purposes within
them, such as naming a counter variable, an argument, etc...
But non‐local names bring more possibilities for name collision, especially considering that libraries
may declare many functions, types, and variables, none of them local in nature, and some of
them very generic.
Namespaces allow us to group named entities that otherwise would have global scope into
narrower scopes, giving them namespace scope. This allows organizing the elements of programs
into different logical scopes referred to by names.
The syntax to declare a namespace is:
namespace identifier
{
named_entities
}
Where identifier is any valid identifier and named_entities is the set of variables, types
and functions that are included within the namespace. For example:
namespace myNamespace
{
  int a, b;
}
In this case, the variables a and b are normal variables declared within a namespace
called myNamespace.
These variables can be accessed from within their namespace normally, with their identifier
(either a or b), but if accessed from outside the myNamespace namespace they have to be
properly qualified with the scope operator ::. For example, to access the previous variables from
outside myNamespace they should be qualified like:
myNamespace::a
myNamespace::b
Namespaces are particularly useful to avoid name collisions. For example:
// namespaces
#include <iostream>
using namespace std;

namespace foo
{
  int value() { return 5; }
}

namespace bar
{
  const double pi = 3.1416;
  double value() { return 2*pi; }
}

int main () {
  cout << foo::value() << '\n';
  cout << bar::value() << '\n';
  cout << bar::pi << '\n';
  return 0;
}
Output:
5
6.2832
3.1416
In this case, there are two functions with the same name: value. One is defined within the
namespace foo, and the other one in bar. No redefinition errors happen thanks to namespaces.
Notice also how pi is accessed in an unqualified manner from within namespace bar (just as pi),
while it is again accessed in main, but here it needs to be qualified as bar::pi.
Namespaces can be split: two segments of code can be declared in the same namespace:
using
The keyword using introduces a name into the current declarative region (such as a block), thus
avoiding the need to qualify the name. For example:
// using
#include <iostream>
using namespace std;

namespace first
{
  int x = 5;
  int y = 10;
}

namespace second
{
  double x = 3.1416;
  double y = 2.7183;
}

int main () {
  using first::x;
  using second::y;
  cout << x << '\n';
  cout << y << '\n';
  cout << first::y << '\n';
  cout << second::x << '\n';
  return 0;
}
Output:
5
2.7183
10
3.1416
Notice how in main, the variable x (without any name qualifier) refers to first::x,
whereas y refers to second::y, just as specified by the using declarations. The
variables first::y and second::x can still be accessed, but require fully qualified names.
The keyword using can also be used as a directive to introduce an entire namespace:
// using
#include <iostream>
using namespace std;

namespace first
{
  int x = 5;
  int y = 10;
}

namespace second
{
  double x = 3.1416;
  double y = 2.7183;
}

int main () {
  using namespace first;
  cout << x << '\n';
  cout << y << '\n';
  cout << second::x << '\n';
  cout << second::y << '\n';
  return 0;
}
Output:
5
10
3.1416
2.7183
In this case, by declaring that we were using namespace first, all direct uses of x and y without
name qualifiers were also looked up in namespace first.
using and using namespace have validity only in the same block in which they are stated or
in the entire source code file if they are used directly in the global scope. For example, it would be
possible to first use the objects of one namespace and then those of another one by splitting the
code in different blocks:
// using namespace example
#include <iostream>
using namespace std;

namespace first
{
  int x = 5;
}

namespace second
{
  double x = 3.1416;
}

int main () {
  {
    using namespace first;
    cout << x << '\n';
  }
  {
    using namespace second;
    cout << x << '\n';
  }
  return 0;
}
Output:
5
3.1416
Namespace aliasing
Existing namespaces can be aliased with new names, with the following syntax:
namespace new_name = current_name;
Storage classes
The storage for variables with global or namespace scope is allocated for the entire duration of the
program. This is known as static storage, and it contrasts with the storage for local variables (those
declared within a block). These use what is known as automatic storage. The storage for local
variables is only available during the block in which they are declared; after that, that same storage
may be used for a local variable of some other function, or used otherwise.
But there is another substantial difference between variables with static storage and variables
with automatic storage:
‐ Variables with static storage (such as global variables) that are not explicitly initialized are
automatically initialized to zeroes.
‐ Variables with automatic storage (such as local variables) that are not explicitly initialized are left
uninitialized, and thus have an undetermined value.
For example:
// static vs automatic storage
#include <iostream>
using namespace std;

int x;

int main ()
{
  int y;
  cout << x << '\n';
  cout << y << '\n';
  return 0;
}
Output:
0
4285838
The actual output may vary, but only the value of x is guaranteed to be zero. y can actually contain
just about any value (including zero).
Unit 4
Arrays
An array is a series of elements of the same type placed in contiguous memory locations that can
be individually referenced by adding an index to a unique identifier.
That means that, for example, five values of type int can be declared as an array without having
to declare 5 different variables (each with its own identifier). Instead, using an array, the
five int values are stored in contiguous memory locations, and all five can be accessed using the
same identifier, with the proper index.
For example, an array containing 5 integer values of type int called foo could be represented as:
where each blank panel represents an element of the array. In this case, these are values of
type int. These elements are numbered from 0 to 4, with 0 being the first and 4 the last. In C++, the
first element in an array is always numbered with a zero (not a one), no matter its length.
Like a regular variable, an array must be declared before it is used. A typical declaration for an array
in C++ is:
type name [elements];
where type is a valid type (such as int, float...), name is a valid identifier and
the elements field (which is always enclosed in square brackets []), specifies the length of the
array in terms of the number of elements.
Therefore, the foo array, with five elements of type int, can be declared as:
int foo [5];
NOTE: The elements field within square brackets [], representing the number of elements in the
array, must be a constant expression, since arrays are blocks of static memory whose size must be
determined at compile time, before the program runs.
Initializing arrays
By default, regular arrays of local scope (for example, those declared within a function) are left
uninitialized. This means that none of their elements are set to any particular value; their contents
are undetermined at the point the array is declared.
But the elements in an array can be explicitly initialized to specific values when it is declared, by
enclosing those initial values in braces {}. For example:
The number of values between braces {} shall not be greater than the number of elements in the
array. For example, for an array foo declared with 5 elements (as specified by the
number enclosed in square brackets, []), the braces {} may contain exactly 5 values, one for
each element. If fewer values are provided, the remaining elements are set to their default values (which,
for fundamental types, means they are filled with zeroes). For example:
The initializer can even have no values, just the braces:
When an initialization of values is provided for an array, C++ allows the possibility of leaving the
square brackets empty []. In this case, the compiler will assume automatically a size for the array
that matches the number of values included between the braces {}:
For example, the following statement stores the value 75 in the third element of foo:
foo [2] = 75;
and, for example, the following copies the value of the third element of foo to a variable called x:
x = foo[2];
Therefore, the expression foo[2] is itself a variable of type int.
Notice that the third element of foo is specified foo[2], since the first one is foo[0], the
second one is foo[1], and therefore, the third one is foo[2]. For the same reason, its last
element is foo[4]. Therefore, if we write foo[5], we would be accessing the sixth element
of foo, and therefore actually exceeding the size of the array.
In C++, it is syntactically correct to exceed the valid range of indices for an array. This can create
problems, since accessing out-of-range elements does not cause errors at compile time, but can cause
errors at runtime. The reason for this being allowed will be seen in a later chapter when pointers
are introduced.
At this point, it is important to be able to clearly distinguish between the two uses that
brackets [] have related to arrays. They perform two different tasks: one is to specify the size of
arrays when they are declared; and the second one is to specify indices for concrete array elements
when they are accessed. Do not confuse these two possible uses of brackets [] with arrays.
Some other valid operations with arrays:
foo[0] = a;
foo[a] = 75;
b = foo [a+2];
foo[foo[a]] = foo[2] + 5;
For example:
// arrays example
#include <iostream>
using namespace std;

int foo [] = {16, 2, 77, 40, 12071};
int n, result=0;

int main ()
{
  for ( n=0 ; n<5 ; ++n )
  {
    result += foo[n];
  }
  cout << result;
  return 0;
}
Output:
12206
Multidimensional arrays
Multidimensional arrays can be described as "arrays of arrays". For example, a bidimensional array
can be imagined as a two‐dimensional table made of elements, all of them of a same uniform data
type.
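jimmy represents a bidimensional array of 3 by 5 elements of type int. The C++ syntax for this
is:
int jimmy [3][5];
and an individual element, for example the one in the second row and fourth column, is referenced as:
jimmy[1][3]
(remember that array indices always begin with zero).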
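Multidimensional arrays are not limited to two indices (i.e., two dimensions). They can contain as
many indices as needed. But be careful: the amount of memory needed for an array is proportional to the
product of all its dimensions, so it grows very quickly with each dimension added. For example, an array
of char with dimensions [100][365][24][60][60] needs more than 3 billion bytes.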
Note that the code uses defined constants for the width and height, instead of using directly their
numerical values. This gives the code a better readability, and allows changes in the code to be
made easily in one place.
Arrays as parameters
At some point, we may need to pass an array to a function as a parameter. In C++, it is not possible
to pass the entire block of memory represented by an array to a function directly as an argument.
But what can be passed instead is its address. In practice, this has almost the same effect, and it is
a much faster and more efficient operation.
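To accept an array as a parameter, a function can declare the parameter as the array type,
but with empty brackets, omitting the actual size of the array (for example, int arg[]). To pass an
array called myarray to such a function, here called procedure, it is enough to write its name in the call:
procedure (myarray);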
Here you have a complete example:
// arrays as parameters
#include <iostream>
using namespace std;

void printarray (int arg[], int length) {
  for (int n=0; n<length; ++n)
    cout << arg[n] << ' ';
  cout << '\n';
}

int main ()
{
  int firstarray[] = {5, 10, 15};
  int secondarray[] = {2, 4, 6, 8, 10};
  printarray (firstarray,3);
  printarray (secondarray,5);
}
Output:
5 10 15
2 4 6 8 10
In the code above, the first parameter (int arg[]) accepts any array whose elements are of
type int, whatever its length. For that reason, we have included a second parameter that tells the
function the length of each array that we pass to it as its first parameter. This allows the for loop
that prints out the array to know the range to iterate in the array passed, without going out of
range.
In a function declaration, it is also possible to include multidimensional arrays. The format for a
tridimensional array parameter is:
base_type[][depth][depth]
For example, a function with a multidimensional array as argument could be:
Language built-in array:
#include <iostream>
using namespace std;

int main()
{
  int myarray[3] = {10,20,30};

  for (int i=0; i<3; ++i)
    ++myarray[i];

  for (int elem : myarray)
    cout << elem << '\n';
}
Container library array:
#include <iostream>
#include <array>
using namespace std;

int main()
{
  array<int,3> myarray {10,20,30};

  for (int i=0; i<myarray.size(); ++i)
    ++myarray[i];

  for (int elem : myarray)
    cout << elem << '\n';
}
As you can see, both kinds of arrays use the same syntax to access their elements: myarray[i].
Other than that, the main differences lie in the declaration of the array, and the inclusion of an
additional header for the library array. Notice also how easy it is to access the size of the library
array.
Character sequences
The string class has been briefly introduced in an earlier chapter. It is a very powerful class to
handle and manipulate strings of characters. However, because strings are, in fact, sequences of
characters, we can represent them also as plain arrays of elements of a character type.
For example, the following array:
char foo [20];
is an array that can store up to 20 elements of type char.
Therefore, this array has a capacity to store sequences of up to 20 characters. But this capacity
does not need to be fully exhausted: the array can also accommodate shorter sequences. For
example, at some point in a program, either the sequence "Hello" or the sequence "Merry
Christmas" can be stored in foo, since both would fit in a sequence with a capacity for 20
characters.
By convention, the end of strings represented in character sequences is signaled by a special
character: the null character, whose literal value can be written as '\0' (backslash, zero).
In this case, the array of 20 elements of type char called foo can be represented storing the
character sequences "Hello" and "Merry Christmas" as:
Notice how after the content of the string itself, a null character ('\0') has been added in order
to indicate the end of the sequence. The panels in gray color represent char elements with
undetermined values.
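Because arrays cannot be assigned values as a whole, for an array of char called myword that has
already been declared, expressions such as:
myword = "Bye";
myword[] = "Bye";
would not be valid. Each of its elements, however, can be assigned a value individually. For example,
this would be correct:
myword[0] = 'B';
myword[1] = 'y';
myword[2] = 'e';
myword[3] = '\0';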
Pointers
In earlier chapters, variables have been explained as locations in the computer's memory which
can be accessed by their identifier (their name). This way, the program does not need to care about
the physical address of the data in memory; it simply uses the identifier whenever it needs to refer
to the variable.
For a C++ program, the memory of a computer is like a succession of memory cells, each one byte
in size, and each with a unique address. These single‐byte memory cells are ordered in a way that
allows data representations larger than one byte to occupy memory cells that have consecutive
addresses.
This way, each cell can be easily located in the memory by means of its unique address. For example,
the memory cell with the address 1776 always follows immediately after the cell with
address 1775 and precedes the one with 1777, and is exactly one thousand cells after 776 and
exactly one thousand cells before 2776.
When a variable is declared, the memory needed to store its value is assigned a specific location
in memory (its memory address). Generally, C++ programs do not actively decide the exact memory
addresses where their variables are stored. Fortunately, that task is left to the environment where
the program is run: generally, an operating system that decides the particular memory locations
at runtime. However, it may be useful for a program to be able to obtain the address of a variable
during runtime in order to access data cells that are at a certain position relative to it.
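The address of a variable can be obtained by preceding the name of the variable with an ampersand
sign (&), known as the address-of operator. For example:
foo = &myvar;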
This would assign the address of variable myvar to foo; by preceding the name of the
variable myvar with the address‐of operator (&), we are no longer assigning the content of the
variable itself to foo, but its address.
The actual address of a variable in memory cannot be known before runtime, but let's assume, in
order to help clarify some concepts, that myvar is placed during runtime in the memory
address 1776.
In this case, consider the following code fragment:
myvar = 25;
foo = &myvar;
bar = myvar;
After the execution of this fragment, myvar contains 25, foo contains 1776, and bar contains 25.
First, we have assigned the value 25 to myvar (a variable whose address in memory we assumed
to be 1776).
The second statement assigns foo the address of myvar, which we have assumed to be 1776.
Finally, the third statement, assigns the value contained in myvar to bar. This is a standard
assignment operation, as already done many times in earlier chapters.
The main difference between the second and third statements is the appearance of the address‐
of operator (&).
The variable that stores the address of another variable (like foo in the previous example) is what
in C++ is called a pointer. Pointers are a very powerful feature of the language that has many uses
in lower level programming. A bit later, we will see how to declare and use pointers.
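Pointers can be used to access the variable they point to directly, by preceding the pointer name
with the dereference operator (*). Following the values of the previous example:
baz = *foo;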
This could be read as: "baz equal to value pointed to by foo", and the statement would actually
assign the value 25 to baz, since foo is 1776, and the value pointed to by 1776 (following the
example above) would be 25.
It is important to clearly differentiate that foo refers to the value 1776, while *foo (with an
asterisk * preceding the identifier) refers to the value stored at address 1776, which in this case
is 25. Notice the difference of including or not including the dereference operator (an
explanatory comment shows how each of these two expressions could be read):
baz = foo;    // baz equal to foo (1776)
baz = *foo;   // baz equal to value pointed to by foo (25)
& is the address‐of operator, and can be read simply as "address of"
* is the dereference operator, and can be read as "value pointed to by"
Thus, they have sort of opposite meanings: An address obtained with & can be dereferenced with *.
Earlier, we performed the following two assignment operations:
myvar = 25;
foo = &myvar;
Right after these two statements, all of the following expressions would give true as result:
myvar == 25
&myvar == 1776
foo == 1776
*foo == 25
The first expression is quite clear, considering that the assignment operation performed
on myvar was myvar=25. The second one uses the address-of operator (&), which returns the
address of myvar, which we assumed to be 1776. The third one is somewhat
obvious, since the second expression was true and the assignment operation performed
on foo was foo=&myvar. The fourth expression uses the dereference operator (*) that can be
read as "value pointed to by", and the value pointed to by foo is indeed 25.
So, after all that, you may also infer that for as long as the address pointed to by foo remains
unchanged, the following expression will also be true:
*foo == myvar
Declaring pointers
Due to the ability of a pointer to directly refer to the value that it points to, a pointer has different
properties when it points to a char than when it points to an int or a float. Once dereferenced,
the type needs to be known. And for that, the declaration of a pointer needs to include the data
type the pointer is going to point to.
The declaration of pointers follows this syntax:
type * name;
where type is the data type pointed to by the pointer. This type is not the type of the pointer itself,
but the type of the data the pointer points to. For example:
int * number;
char * character;
double * decimals;
These are three declarations of pointers. Each one is intended to point to a different data type, but,
in fact, all of them are pointers and all of them are likely going to occupy the same amount of space
in memory (the size in memory of a pointer depends on the platform where the program runs).
Nevertheless, the data to which they point do not occupy the same amount of space, nor are they of
the same type: the first one points to an int, the second one to a char, and the last one to
a double. Therefore, although these three example variables are all pointers, they
actually have different types: int*, char*, and double* respectively, depending on the type
they point to.
Note that the asterisk (*) used when declaring a pointer only means that it is a pointer (it is part
of its type compound specifier), and should not be confused with the dereference operator seen a
bit earlier, but which is also written with an asterisk (*). They are simply two different things
represented with the same sign.
Let's see an example on pointers:
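Arrays and pointers are closely related: an array can always be implicitly converted to a pointer to
its first element. For example, given an array myarray of 20 elements of type int and a pointer to int
called mypointer, the following assignment is valid:
mypointer = myarray;
After that, mypointer and myarray would be equivalent and would have very similar properties.
The main difference is that mypointer can be assigned a different address,
whereas myarray can never be assigned anything, and will always represent the same block of 20
elements of type int. Therefore, the following assignment would not be valid: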
myarray = mypointer;
Let's see an example that mixes arrays and pointers:
a[5] = 0;       // a [offset of 5] = 0
*(a+5) = 0;     // pointed to by (a+5) = 0
These two expressions are equivalent and valid, not only if a is a pointer, but also if a is an array.
Remember that the name of an array can be used just like a pointer to its first element.
Pointer initialization
Pointers can be initialized to point to specific locations at the very moment they are defined:
int myvar;
int * myptr = &myvar;
The resulting state of variables after this code is the same as after:
int myvar;
int * myptr;
myptr = &myvar;
When pointers are initialized, what is initialized is the address they point to (i.e., myptr), never
the value being pointed to (i.e., *myptr). Therefore, the code above shall not be confused with:
1 int myvar;
2 int * myptr;
3 *myptr = &myvar;
Which anyway would not make much sense (and is not valid code).
The asterisk (*) in the pointer declaration (line 2) only indicates that it is a pointer, it is not the
dereference operator (as in line 3). Both things just happen to use the same sign: *. As always,
spaces are not relevant, and never change the meaning of an expression.
Pointers can be initialized either to the address of a variable (such as in the case above), or to the
value of another pointer (or array):
int myvar;
int *foo = &myvar;
int *bar = foo;
Pointer arithmetics
To conduct arithmetical operations on pointers is a little different than to conduct them on regular
integer types. To begin with, only addition and subtraction operations are allowed; the others make
no sense in the world of pointers. But both addition and subtraction have a slightly different
behavior with pointers, according to the size of the data type to which they point.
When fundamental data types were introduced, we saw that types have different sizes. For
example: char always has a size of 1 byte, short is generally larger than that,
and int and long are even larger; the exact size of these being dependent on the system. For
example, let's imagine that in a given system, char takes 1 byte, short takes 2 bytes,
and long takes 4.
Suppose now that we define three pointers in this compiler:
char *mychar;
short *myshort;
long *mylong;
and that we know that they point to the memory locations 1000, 2000, and 3000, respectively.
Therefore, if we write:
++mychar;
++myshort;
++mylong;
mychar, as one would expect, would contain the value 1001. But not so
obviously, myshort would contain the value 2002, and mylong would contain 3004, even though
they have each been incremented only once. The reason is that, when adding one to a pointer, the
pointer is made to point to the following element of the same type, and, therefore, the size in bytes
of the type it points to is added to the pointer.
This is applicable both when adding and subtracting any number to a pointer. It would happen
exactly the same if we wrote:
mychar = mychar + 1;
myshort = myshort + 1;
mylong = mylong + 1;
Regarding the increment (++) and decrement (--) operators, they both can be used as either prefix
or suffix of an expression, with a slight difference in behavior: as a prefix, the increment happens
before the expression is evaluated, and as a suffix, the increment happens after the expression is
evaluated. This also applies to expressions incrementing and decrementing pointers, which can
become part of more complicated expressions that also include dereference operators (*).
Remembering operator precedence rules, we can recall that postfix operators, such as increment
and decrement, have higher precedence than prefix operators, such as the dereference operator
(*). Therefore, the following expression:
*p++
is equivalent to *(p++). And what it does is to increase the value of p (so it now points to the next
element), but because ++ is used as postfix, the whole expression is evaluated as the value pointed
originally by the pointer (the address it pointed to before being incremented).
Essentially, these are the four possible combinations of the dereference operator with both the
prefix and suffix versions of the increment operator (the same being applicable also to the
decrement operator):
A typical, but not so simple, statement involving these operators is:
*p++ = *q++;
Because ++ has a higher precedence than *, both p and q are incremented, but because both
increment operators (++) are used as postfix and not prefix, the value assigned to *p is *q before
both p and q are incremented. And then both are incremented. It would be roughly equivalent to:
*p = *q;
++p;
++q;
As always, parentheses reduce confusion by making expressions more legible.
int x;
int y = 10;
const int * p = &y;
x = *p;          // ok: reading p
*p = x;          // error: modifying p, which is const-qualified
Here p points to a variable, but points to it in a const‐qualified manner, meaning that it can read
the value pointed, but it cannot modify it. Note also, that the expression &y is of type int*, but
this is assigned to a pointer of type const int*. This is allowed: a pointer to non‐const can be
implicitly converted to a pointer to const. But not the other way around! As a safety feature,
pointers to const are not implicitly convertible to pointers to non‐const.
One of the use cases of pointers to const elements is as function parameters: a function that
takes a pointer to non‐const as parameter can modify the value passed as argument, while a
function that takes a pointer to const as parameter cannot.
// pointers as arguments:
#include <iostream>
using namespace std;

void increment_all (int* start, int* stop)
{
  int * current = start;
  while (current != stop) {
    ++(*current);  // increment value pointed
    ++current;     // increment pointer
  }
}

void print_all (const int* start, const int* stop)
{
  const int * current = start;
  while (current != stop) {
    cout << *current << '\n';
    ++current;     // increment pointer
  }
}

int main ()
{
  int numbers[] = {10,20,30};
  increment_all (numbers,numbers+3);
  print_all (numbers,numbers+3);
  return 0;
}
Output:
11
21
31
Note that print_all uses pointers that point to constant elements. These pointers point to
constant content they cannot modify, but they are not constant themselves: i.e., the pointers can
still be incremented or assigned different addresses, although they cannot modify the content they
point to.
And this is where a second dimension to constness is added to pointers: Pointers can also be
themselves const. And this is specified by appending const to the pointed type (after the asterisk):
int x;
      int *       p1 = &x;  // non-const pointer to non-const int
const int *       p2 = &x;  // non-const pointer to const int
      int * const p3 = &x;  // const pointer to non-const int
const int * const p4 = &x;  // const pointer to const int
The syntax with const and pointers is definitely tricky, and recognizing the cases that best suit
each use tends to require some experience. In any case, it is important to get constness with
pointers (and references) right sooner rather than later, but you should not worry too much about
grasping everything if this is the first time you are exposed to the mix of const and pointers. More
use cases will show up in coming chapters.
To add a little bit more confusion to the syntax of const with pointers, the const qualifier can
either precede or follow the pointed type, with the exact same meaning:
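Given a declaration such as:
const char * foo = "hello";
and assuming that the literal is stored in memory starting at address 1702, note that foo is a pointer
and contains the value 1702, and not 'h', nor "hello", although
1702 indeed is the address of both of these.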
The pointer foo points to a sequence of characters. And because pointers and arrays behave
essentially in the same way in expressions, foo can be used to access the characters in the same
way arrays of null‐terminated character sequences are. For example:
*(foo+4)
foo[4]
Both expressions have a value of 'o' (the fifth element of the array).
Pointers to pointers
C++ allows the use of pointers that point to pointers, which, in turn, point to data (or even
to other pointers). The syntax simply requires an asterisk (*) for each level of indirection in the
declaration of the pointer:
char a;
char * b;
char ** c;
a = 'z';
b = &a;
c = &b;
This, assuming the randomly chosen memory locations for each variable of 7230, 8092,
and 10502, means that a (at address 7230) holds 'z', b (at address 8092) holds 7230, and c (at
address 10502) holds 8092.
The new thing in this example is variable c, which is a pointer to a pointer, and can be used in three
different levels of indirection, each one of them would correspond to a different value:
c is of type char** and a value of 8092
*c is of type char* and a value of 7230
**c is of type char and a value of 'z'
void pointers
The void type of pointer is a special type of pointer. In C++, void represents the absence of type.
Therefore, void pointers are pointers that point to a value that has no type (and thus also an
undetermined length and undetermined dereferencing properties).
This gives void pointers great flexibility, since they are able to point to any data type, from an integer
value or a float to a string of characters. In exchange, they have a great limitation: the data pointed
to by them cannot be directly dereferenced (which is logical, since we have no type to dereference
to), and for that reason, any address in a void pointer needs to be transformed into some other
pointer type that points to a concrete data type before being dereferenced.
One of its possible uses may be to pass generic parameters to a function. For example:
// increaser
#include <iostream>
using namespace std;

void increase (void* data, int psize)
{
  if ( psize == sizeof(char) )
  { char* pchar; pchar=(char*)data; ++(*pchar); }
  else if (psize == sizeof(int) )
  { int* pint; pint=(int*)data; ++(*pint); }
}

int main ()
{
  char a = 'x';
  int b = 1602;
  increase (&a,sizeof(a));
  increase (&b,sizeof(b));
  cout << a << ", " << b << '\n';
  return 0;
}
Output:
y, 1603
sizeof is an operator integrated in the C++ language that returns the size in bytes of its argument.
For non‐dynamic data types, this value is a constant. Therefore, for example, sizeof(char) is 1,
because char has always a size of one byte.
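Sometimes, a pointer needs to explicitly point to nowhere. For such cases, there exists a
special value that any pointer type can take: the null pointer value. It can be expressed either with
the integer value zero or with the nullptr keyword:
int * p = 0;
int * q = nullptr;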
Here, both p and q are null pointers, meaning that they explicitly point to nowhere, and they both
actually compare equal: all null pointers compare equal to other null pointers. It is also quite usual
to see the defined constant NULL be used in older code to refer to the null pointer value:
int * r = NULL;
NULL is defined in several headers of the standard library, and is defined as an alias of some null
pointer constant value (such as 0 or nullptr).
Do not confuse null pointers with void pointers! A null pointer is a value that any pointer can take
to represent that it is pointing to "nowhere", while a void pointer is a type of pointer that can
point to somewhere without a specific type. One refers to the value stored in the pointer, and the
other to the type of data it points to.
Pointers to functions
C++ allows operations with pointers to functions. The typical use of this is for passing a function as
an argument to another function. Pointers to functions are declared with the same syntax as a
regular function declaration, except that the name of the function is enclosed between
parentheses () and an asterisk (*) is inserted before the name:
// pointer to functions
#include <iostream>
using namespace std;

int addition (int a, int b)
{ return (a+b); }

int subtraction (int a, int b)
{ return (a-b); }

int operation (int x, int y, int (*functocall)(int,int))
{
  int g;
  g = (*functocall)(x,y);
  return (g);
}

int main ()
{
  int m,n;
  int (*minus)(int,int) = subtraction;

  m = operation (7, 5, addition);
  n = operation (20, m, minus);
  cout <<n;  // prints: 8
  return 0;
}
In the example above, minus is a pointer to a function that has two parameters of type int. It is
directly initialized to point to the function subtraction:
int (*minus)(int,int) = subtraction;
Dynamic memory
In the programs seen in previous chapters, all memory needs were determined before program
execution by defining the variables needed. But there may be cases where the memory needs of a
program can only be determined during runtime, for example, when the memory needed depends
on user input. In these cases, programs need to dynamically allocate memory, for which the C++
language integrates the operators new and delete.
Dynamic memory is allocated using operator new. new is followed by a data type specifier and, if
a sequence of more than one element is required, the number of these within brackets []. It
returns a pointer to the beginning of the new block of memory allocated. Its syntax is:
pointer = new type
pointer = new type [number_of_elements]
The first expression is used to allocate memory to contain one single element of type type. The
second one is used to allocate a block (an array) of elements of type type,
where number_of_elements is an integer value representing the amount of these. For example:
int * foo;
foo = new int [5];
In this case, the system dynamically allocates space for five elements of type int and returns a
pointer to the first element of the sequence, which is assigned to foo (a pointer).
Therefore, foo now points to a valid block of memory with space for five elements of type int.
Here, foo is a pointer, and thus, the first element pointed to by foo can be accessed either with
the expression foo[0] or the expression *foo (both are equivalent). The second element can be
accessed either with foo[1] or *(foo+1), and so on...
There is a substantial difference between declaring a normal array and allocating dynamic memory
for a block of memory using new. The most important difference is that the size of a regular array
needs to be a constant expression, and thus its size has to be determined at the moment of
designing the program, before it is run, whereas the dynamic memory allocation performed
by new allows memory to be assigned during runtime using any variable value as size.
The dynamic memory requested by our program is allocated by the system from the memory heap.
However, computer memory is a limited resource, and it can be exhausted. Therefore, there are
no guarantees that all requests to allocate memory using operator new are going to be granted by
the system.
C++ provides two standard mechanisms to check if the allocation was successful:
One is by handling exceptions. Using this method, an exception of type bad_alloc is thrown
when the allocation fails. Exceptions are a powerful C++ feature explained later in these tutorials.
But for now, you should know that if this exception is thrown and it is not handled by a specific
handler, the program execution is terminated.
This exception method is the method used by default by new, and is the one used in a declaration
like:
foo = new int [5];  // if allocation fails, an exception is thrown
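Although exceptions are covered later, the handling of a failed allocation can already be sketched. The following is a minimal illustration, not a definitive pattern: the helper try_allocate is a hypothetical name introduced here, and it releases the block with delete[], an operator described further below.

```cpp
#include <cstddef>   // std::size_t
#include <iostream>
#include <new>       // std::bad_alloc

// Hypothetical helper (not part of the tutorial): tries to allocate n ints
// and reports whether the allocation succeeded.
bool try_allocate (std::size_t n)
{
  try {
    int * block = new int [n];   // throws std::bad_alloc if it fails
    delete[] block;              // release the block again
    return true;
  }
  catch (const std::bad_alloc& e) {
    // Reached only when the allocation could not be granted.
    std::cerr << "allocation failed: " << e.what() << '\n';
    return false;
  }
}
```

Without the try/catch block, a failed allocation would terminate the program, as described above.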
The other method is known as nothrow, and what happens when it is used is that when a memory
allocation fails, instead of throwing a bad_alloc exception or terminating the program, the
pointer returned by new is a null pointer, and the program continues its execution normally.
This method can be specified by using a special object called nothrow, declared in header <new>,
as argument for new:
int * foo;
foo = new (nothrow) int [5];
if (foo == nullptr) {
  // error assigning memory. Take measures.
}
This nothrow method is likely to produce less efficient code than exceptions, since it implies
explicitly checking the pointer value returned after each and every allocation. Therefore, the
exception mechanism is generally preferred, at least for critical allocations. Still, most of the coming
examples will use the nothrow mechanism due to its simplicity.
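Once dynamically allocated memory is no longer needed, it can be freed with operator delete, so
that it becomes available again for other requests of dynamic memory. Its syntax is:
delete pointer;
delete[] pointer;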
The first statement releases the memory of a single element allocated using new, and the second
one releases the memory allocated for arrays of elements using new and a size in brackets ([]).
The value passed as argument to delete shall be either a pointer to a memory block previously
allocated with new, or a null pointer (in the case of a null pointer, delete produces no effect).
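The whole cycle of allocating, using and releasing a block can be sketched as follows. This is only an illustration under stated assumptions: the helper sum_first and its return convention (-1 on failure) are hypothetical names chosen for this example.

```cpp
#include <cstddef>   // std::size_t
#include <new>       // std::nothrow

// Hypothetical helper: returns 1 + 2 + ... + n, computed through a
// dynamically allocated array, or -1 if the allocation fails.
long sum_first (std::size_t n)
{
  int * values = new (std::nothrow) int [n];
  if (values == nullptr)
    return -1;                              // allocation failed

  for (std::size_t i = 0; i < n; ++i)
    values[i] = static_cast<int>(i) + 1;    // same as *(values+i) = ...

  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += values[i];

  delete[] values;                          // arrays are released with delete[]
  return total;
}
```

Note that the block was allocated with new and a size in brackets, so it is released with delete[] rather than with delete.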
Dynamic memory in C
C++ integrates the operators new and delete for allocating dynamic memory. But these were not
available in the C language; instead, it used a library solution, with the
functions malloc, calloc, realloc and free, defined in the header <cstdlib> (known
as <stdlib.h> in C). The functions are also available in C++ and can also be used to allocate and
deallocate dynamic memory.
Note, though, that the memory blocks allocated by these functions are not necessarily compatible
with those returned by new, so they should not be mixed; each one should be handled with its
own set of functions or operators.
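As a minimal sketch of the library approach, a block obtained with malloc is released with free, never with delete. The helper sum_repeated is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::malloc, std::free

// Hypothetical helper: stores n copies of value in a malloc'ed buffer and
// returns their sum, or -1 if malloc returns a null pointer.
long sum_repeated (int value, std::size_t n)
{
  int * buffer = static_cast<int*> (std::malloc (n * sizeof(int)));
  if (buffer == nullptr)
    return -1;

  long total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    buffer[i] = value;
    total += buffer[i];
  }

  std::free (buffer);   // paired with malloc, not with delete
  return total;
}
```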
Data structures
A data structure is a group of data elements grouped together under one name. These data
elements, known as members, can have different types and different lengths. Data structures can
be declared in C++ using the following syntax:
struct type_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;
Here, type_name is a name for the structure type, and object_names can be a set of valid
identifiers for objects that have the type of this structure. Within braces {}, there is a list of the
data members, each of which is specified with a type and a valid identifier as its name.
For example:
struct product {
  int weight;
  double price;
} ;

product apple;
product banana, melon;
This declares a structure type, called product, and defines it having two
members: weight and price, each of a different fundamental type. This declaration creates a
new type (product), which is then used to declare three objects (variables) of this
type: apple, banana, and melon. Note how once product is declared, it is used just like any
other type.
Right at the end of the struct definition, and before the ending semicolon (;), the optional
field object_names can be used to directly declare objects of the structure type. For example,
the structure objects apple, banana, and melon can be declared at the moment the data
structure type is defined:
struct product {
  int weight;
  double price;
} apple, banana, melon;
In this case, where object_names are specified, the type name (product) becomes
optional: struct requires either a type_name or at least one name in object_names, but not
necessarily both.
It is important to clearly differentiate between what is the structure type name (product), and
what is an object of this type (apple, banana, and melon). Many objects (such
as apple, banana, and melon) can be declared from a single structure type (product).
Once the three objects of a given structure type are declared (apple, banana, and melon),
their members can be accessed directly. The syntax for that is simply to insert a dot (.) between the
object name and the member name. For example, we could operate with any of these elements as
if they were standard variables of their respective types:
apple.weight
apple.price
banana.weight
banana.price
melon.weight
melon.price
Each one of these has the data type corresponding to the member they refer
to: apple.weight, banana.weight, and melon.weight are of type int,
while apple.price, banana.price, and melon.price are of type double.
Here is a real example with structure types in action:
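As a minimal sketch, the product structure from above can be filled in and used like this. The helpers make_product and total_price are hypothetical names introduced only for this illustration.

```cpp
// Structure type from the text: two members of different fundamental types.
struct product {
  int weight;
  double price;
};

// Hypothetical helper: builds a product by assigning each member with the
// dot (.) syntax.
product make_product (int weight, double price)
{
  product p;
  p.weight = weight;
  p.price = price;
  return p;
}

// Hypothetical helper: members behave like ordinary variables of their type.
double total_price (const product& a, const product& b)
{
  return a.price + b.price;
}
```

For instance, total_price (make_product (150, 0.5), make_product (120, 0.25)) adds the two price members as plain double values.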
Pointers to structures
Like any other type, structures can be pointed to by their own type of pointers:
struct movies_t {
  string title;
  int year;
};

movies_t amovie;
movies_t * pmovie;
Here amovie is an object of structure type movies_t, and pmovie is a pointer to
objects of structure type movies_t. Therefore, the following code would also be valid:
pmovie = &amovie;
The value of the pointer pmovie would be assigned the address of object amovie.
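Combining pointers and structures introduces a new operator: the arrow operator (->). It is a
dereference operator used exclusively with pointers to objects that have members, and it accesses
a member of the pointed-to object directly from its address. For example, the expression:
pmovie->title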
is, for all purposes, equivalent to:
(*pmovie).title
Both expressions, pmovie->title and (*pmovie).title, are valid, and both access the
member title of the data structure pointed to by a pointer called pmovie. It is definitely
something different from:
*pmovie.title
which is rather equivalent to:
*(pmovie.title)
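This would access the value pointed to by a hypothetical pointer member called title of the
structure object pmovie (which is not the case, since title is not a pointer type). The following
panel summarizes possible combinations of the operators for pointers and for structure members:
Expression   What is evaluated                             Equivalent
a.b          member b of object a
a->b         member b of object pointed to by a            (*a).b
*a.b         value pointed to by member b of object a      *(a.b)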
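The two equivalent forms can be seen side by side in a short sketch. The helper set_movie is a hypothetical name introduced here, and std::string is spelled out because the sketch does not assume a using directive.

```cpp
#include <string>

struct movies_t {
  std::string title;
  int year;
};

// Hypothetical helper: fills in a movie through a pointer to it.
void set_movie (movies_t * pmovie, const std::string& title, int year)
{
  pmovie->title = title;    // arrow form
  (*pmovie).year = year;    // equivalent form: dereference, then dot
}
```

Calling set_movie (&amovie, "Matrix", 1999) on an object amovie of type movies_t modifies amovie itself, since the function receives its address.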
Nesting structures
Structures can also be nested in such a way that an element of a structure is itself another structure:
struct movies_t {
  string title;
  int year;
};

struct friends_t {
  string name;
  string email;
  movies_t favorite_movie;
} charlie, maria;

friends_t * pfriends = &charlie;
After the previous declarations, all of the following expressions would be valid:
charlie.name
maria.favorite_movie.title
charlie.favorite_movie.year
pfriends->favorite_movie.year
(where, by the way, the last two expressions refer to the same member).
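A type alias is a different name by which a type can be identified. In C++, existing types can be
given aliases with the typedef keyword. For example:
typedef char C;
typedef unsigned int WORD;
typedef char * pChar;
typedef char field [50];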
This defines four type aliases: C, WORD, pChar, and field as char, unsigned
int, char* and char[50], respectively. Once these aliases are defined, they can be used in any
declaration just like any other valid type, for example:
WORD myword;
The same four aliases can also be defined with the using syntax:
using C = char;
using WORD = unsigned int;
using pChar = char *;
using field = char [50];
Aliases defined with typedef and aliases defined with using are semantically equivalent.
The only difference is that typedef has certain limitations in the realm of templates
that using does not have. Therefore, using is more generic, although typedef has a longer history
and is probably more common in existing code.
Note that neither typedef nor using creates a new distinct data type. They only create synonyms
of existing types. That means that the type of myword above, declared with type WORD, can as well
be considered of type unsigned int; it does not really matter, since both are actually referring
to the same type.
Type aliases can be used to reduce the length of long or confusing type names, but they are most
useful as tools to abstract programs from the underlying types they use. For example, using an
alias of int to refer to a particular kind of parameter, instead of using int directly, allows the
type to be easily replaced by long (or some other type) in a later version, without having to
change every instance where it is used.
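Both points can be checked in a short sketch. The alias length_t and the function double_length are hypothetical names introduced for this illustration; only WORD comes from the text.

```cpp
#include <type_traits>   // std::is_same

typedef unsigned int WORD;   // alias from the text, typedef syntax
using length_t = int;        // hypothetical alias for a kind of parameter

// An alias is only a synonym: WORD and unsigned int are the same type.
static_assert (std::is_same<WORD, unsigned int>::value,
               "WORD is a synonym of unsigned int");

// Code written against length_t keeps compiling if the alias is later
// changed to long; only the using line above would need to be edited.
length_t double_length (length_t value)
{
  return value * 2;
}
```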
Unions
Unions allow one portion of memory to be accessed as different data types. Their declaration and use
are similar to those of structures, but their functionality is totally different:
union type_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;
This creates a new union type, identified by type_name, in which all member elements occupy
the same physical space in memory. The size of this type is the size of its largest member element
(plus any padding the implementation requires).
For example:
union mytypes_t {
  char c;
  int i;
  float f;
} mytypes;
declares an object (mytypes) with three members:
mytypes.c
mytypes.i
mytypes.f
Each of these members is of a different data type. But since all of them are referring to the same
location in memory, the modification of one of the members will affect the value of all of them. It
is not possible to store different values in them in a way that each is independent of the others.
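The shared storage can be observed without reading a member other than the one written last. The following is a minimal sketch; the helpers members_share_address and store_then_read are hypothetical names introduced here.

```cpp
union mytypes_t {
  char c;
  int i;
  float f;
};

// The union must be able to hold its largest member.
static_assert (sizeof(mytypes_t) >= sizeof(int) &&
               sizeof(mytypes_t) >= sizeof(float),
               "large enough for its largest member");

// Hypothetical helper: all three members start at the same address.
bool members_share_address ()
{
  mytypes_t value;
  return static_cast<void*>(&value.c) == static_cast<void*>(&value.i)
      && static_cast<void*>(&value.i) == static_cast<void*>(&value.f);
}

// Hypothetical helper: writing i reuses the bytes previously used by c,
// so only the member written last is read back.
int store_then_read ()
{
  mytypes_t value;
  value.c = 'A';
  value.i = 1234;
  return value.i;
}
```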
One of the uses of a union is to be able to access a value either in its entirety or as an array or
structure of smaller elements. For example:
union mix_t {
  int l;
  struct {
    short hi;
    short lo;
  } s;
  char c[4];
} mix;
If we assume that the system where this program runs has an int type with a size of 4 bytes, and
a short type of 2 bytes, the union defined above gives access to the same group of 4 bytes
through three members: mix.l, mix.s and mix.c. We can use whichever matches how we want to
access these bytes: as a single value of type int, as two values of type short, or as an array
of char elements, respectively. The example mixes types, arrays, and structures in the union to
demonstrate different ways to access the data. On a little-endian system, mix.c[0] would refer
to the least significant byte of mix.l, and mix.c[3] to the most significant one.
The exact alignment and order of the members of a union in memory depend on the system, with
the possibility of creating portability issues.
Anonymous unions
When unions are members of a class (or structure), they can be declared with no name. In this
case, they become anonymous unions, and their members are directly accessible from objects by
their member names. For example, see the differences between these two structure declarations:
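A minimal sketch of two such declarations follows. The type names book1_t and book2_t and the member title are assumptions made for this illustration; only price, dollars and yen are taken from the text.

```cpp
// Structure with a regular union member, named price.
struct book1_t {
  char title[50];
  union {
    float dollars;
    int yen;
  } price;
};

// Structure with an anonymous union: the union member has no name, so
// dollars and yen are accessed as if they were members of the structure.
struct book2_t {
  char title[50];
  union {
    float dollars;
    int yen;
  };
};
```

The only difference between the two types is whether the member union has a name, which changes how dollars and yen are reached from an object.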
For an object of the first type (where the member union has a name, price), the members are
accessed as:
book1.price.dollars
book1.price.yen
whereas for an object of the second type (which has an anonymous union), it would be:
book2.dollars
book2.yen
Again, remember that because it is a member union (not a member structure), the
members dollars and yen actually share the same memory location, so they cannot be used to
store two different values simultaneously. The price can be set in dollars or in yen, but not in
both simultaneously.
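Enumerated types are types that are defined with a set of custom identifiers, known as
enumerators, as possible values. Their syntax is:
enum type_name {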
value1,
value2,
value3,
.
.
} object_names;
This creates the type type_name, which can take any of value1, value2, value3, ... as value.
Objects (variables) of this type can directly be instantiated as object_names.
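For example, a new type of variable called colors_t could be defined to store colors with the
following declaration:
enum colors_t {black, blue, green, cyan, red, purple, yellow, white};
Once colors_t is declared, expressions like the following become valid:
colors_t mycolor;

mycolor = blue;
if (mycolor == green) mycolor = red;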
Values of enumerated types declared with enum are implicitly convertible to an integer type (the
reverse conversion, from an integer to the enumerated type, requires an explicit cast in C++). In
fact, the elements of such an enum are always assigned an integer numerical equivalent internally.
If it is not specified otherwise, the integer value equivalent to the first possible value is 0, the
equivalent to the second is 1, to the third is 2, and so on. Therefore, in the data type colors_t
defined above, black would be equivalent to 0, blue would be equivalent to 1, green to 2, and so on.
A specific integer value can be specified for any of the possible values in the enumerated type. And
if the constant value that follows it is itself not given its own value, it is automatically assumed to
be the same value plus one. For example:
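A minimal sketch of this rule, assuming a hypothetical months_t type in which only the first enumerator is given a value:

```cpp
// january is set to 1; each following enumerator takes the previous value
// plus one, so february is 2, march is 3, and december is 12.
enum months_t { january = 1, february, march, april, may, june,
                july, august, september, october, november, december };

// Hypothetical helper: a plain enum value converts implicitly to int.
int month_number (months_t m)
{
  return m;
}
```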
But, in C++, it is possible to create real enum types that are not implicitly convertible
to int and whose enumerator values are not of type int, but of the enum type itself, thus
preserving type safety. They are declared with enum class (or enum struct) instead of
just enum:
enum class Colors {black, blue, green, cyan, red, purple, yellow, white};
Each of the enumerator values of an enum class type needs to be scoped into its type (this is
actually also possible with enum types, but it is only optional). For example:
Colors mycolor;

mycolor = Colors::blue;
if (mycolor == Colors::green) mycolor = Colors::red;
Enumerated types declared with enum class also have more control over their underlying type;
it may be any integral data type, such as char, short or unsigned int, which essentially
serves to determine the size of the type. This is specified by a colon and the underlying type
following the enumerated type. For example:
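As a minimal sketch, a hypothetical EyeColor type can be given char as its underlying type. The type name, its enumerators and the helper to_int are assumptions introduced for this illustration.

```cpp
#include <type_traits>   // std::underlying_type, std::is_same

// EyeColor is a distinct type with the same size as char.
enum class EyeColor : char { blue, green, brown };

static_assert (sizeof(EyeColor) == sizeof(char),
               "same size as its underlying type");
static_assert (std::is_same<std::underlying_type<EyeColor>::type, char>::value,
               "underlying type is char");

// Hypothetical helper: there is no implicit conversion to int for an
// enum class, so an explicit cast is needed.
int to_int (EyeColor c)
{
  return static_cast<int>(c);
}
```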
Unit 5
Classes (I)
Classes are an expanded concept of data structures: like data structures, they can contain data
members, but they can also contain functions as members.
An object is an instantiation of a class. In terms of variables, a class would be the type, and an
object would be the variable.
Classes are defined using either keyword class or keyword struct, with the following syntax:
class class_name {
access_specifier_1:
member1;
access_specifier_2:
member2;
...
} object_names;
Here, class_name is a valid identifier for the class, and object_names is an optional list of names
for objects of this class. The body of the declaration can contain members, which can either be
data or function declarations, and optionally access specifiers.
Classes have the same format as plain data structures, except that they can also include functions
and have these new things called access specifiers. An access specifier is one of the following three
keywords: private, public or protected. These specifiers modify the access rights for the
members that follow them:
private members of a class are accessible only from within other members of the
same class (or from their "friends").
protected members are accessible from other members of the same class (or from
their "friends"), but also from members of their derived classes.
Finally, public members are accessible from anywhere where the object is visible.
By default, all members of a class declared with the class keyword have private access.
Therefore, any member that is declared before any other access specifier has private
access automatically. For example:
class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area (void);
} rect;
This declares a class (i.e., a type) called Rectangle and an object (i.e., a variable) of this class,
called rect. This class contains four members: two data members of
type int (member width and member height) with private access (because private is the
default access level) and two member functions with public access: the
functions set_values and area, of which for now we have only included their declaration, but
not their definition.
Notice the difference between the class name and the object name: In the previous
example, Rectangle was the class name (i.e., the type), whereas rect was an object of
type Rectangle. It is the same relationship int and a have in the following declaration:
int a;
where int is the type name (the class) and a is the variable name (the object).
After the declarations of Rectangle and rect, any of the public members of object rect can
be accessed as if they were normal functions or normal variables, by simply inserting a dot (.)
between object name and member name. This follows the same syntax as accessing the members
of plain data structures. For example:
rect.set_values (3,4);
myarea = rect.area();
The only members of rect that cannot be accessed from outside the class
are width and height, since they have private access and they can only be referred to from
within other members of that same class.
Here is the complete example of class Rectangle:
// classes example
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area() {return width*height;}
};

void Rectangle::set_values (int x, int y)
{
  width = x;
  height = y;
}

int main () {
  Rectangle rect;
  rect.set_values (3,4);
  cout << "area: " << rect.area();  // prints: area: 12
  return 0;
}
This example reintroduces the scope operator (::, two colons), seen in earlier chapters in relation
to namespaces. Here it is used in the definition of function set_values to define a member of a
class outside the class itself.
Notice that the definition of the member function area has been included directly within the
definition of class Rectangle given its extreme simplicity. Conversely, set_values is merely
declared with its prototype within the class, but its definition is outside it. In this outside definition,
the operator of scope (::) is used to specify that the function being defined is a member of the
class Rectangle and not a regular non-member function.
The scope operator (::) specifies the class to which the member being defined belongs, granting
exactly the same scope properties as if this function definition was directly included within the
class definition. For example, the function set_values in the previous example has access to the
variables width and height, which are private members of class Rectangle, and thus only
accessible from other members of the class, such as this one.
The only difference between defining a member function completely within the class definition and
just including its declaration in the class and defining it later outside the class is that in the first
case the function is automatically considered an inline member function by the compiler, while in
the second it is a normal (not-inline) class member function. This causes no differences in behavior,
but only in possible compiler optimizations.
Members width and height have private access (remember that if nothing else is specified, all
members of a class defined with keyword class have private access). By declaring them private,
access from outside the class is not allowed. This makes sense, since we have already defined a
member function to set values for those members within the object: the member
function set_values. Therefore, the rest of the program does not need to have direct access to
them. Perhaps in such a simple example as this, it is difficult to see how restricting access to these
variables may be useful, but in larger projects it may be very important that values cannot be
modified in an unexpected way (unexpected from the point of view of the object).
The most important property of a class is that it is a type, and as such, we can declare multiple
objects of it. For example, following with the previous example of class Rectangle, we could have
declared the object rectb in addition to object rect:
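A minimal sketch of two objects of the same class, each with its own width and height. The helper area_difference and the values passed to set_values are assumptions chosen for this illustration.

```cpp
class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area () {return width*height;}
};

void Rectangle::set_values (int x, int y) {
  width = x;
  height = y;
}

// Hypothetical helper: rect and rectb are independent objects, so setting
// the values of one does not affect the other.
int area_difference ()
{
  Rectangle rect, rectb;
  rect.set_values (3,4);
  rectb.set_values (5,6);
  return rectb.area() - rect.area();
}
```

Each call to area operates on the members of the object it is called on: rect.area() and rectb.area() give different results because the two objects hold different values.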
Constructors
What would happen in the previous example if we called the member function area before having
called set_values? An undetermined result, since the members width and height had never
been assigned a value.
In order to avoid that, a class can include a special function called its constructor, which is
automatically called whenever a new object of this class is created, allowing the class to initialize
member variables or allocate storage.
This constructor function is declared just like a regular member function, but with a name that
matches the class name and without any return type; not even void.
The Rectangle class above can easily be improved by implementing a constructor:
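One possible way to do so, sketched here with set_values replaced by a constructor taking the same two parameters:

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle (int,int);                 // constructor: no return type
    int area () {return width*height;}
};

// Called automatically whenever a Rectangle object is created.
Rectangle::Rectangle (int a, int b) {
  width = a;
  height = b;
}
```

With this version, an object is declared together with its initial values, as in Rectangle rect (3,4); so area can no longer be called on an object whose members were never set.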
Overloading constructors
Like any other function, a constructor can also be overloaded with different versions taking
different parameters: with a different number of parameters and/or parameters of different types.
The compiler will automatically call the one whose parameters match the arguments:
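A minimal sketch with two constructors follows. The choice of 5 as the value set by the constructor without parameters is an assumption made for this illustration.

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle ();            // takes no parameters
    Rectangle (int,int);     // takes two parameters
    int area () {return width*height;}
};

Rectangle::Rectangle () {
  width = 5;                 // assumed default values
  height = 5;
}

Rectangle::Rectangle (int a, int b) {
  width = a;
  height = b;
}
```

Declaring Rectangle rect (3,4); selects the second constructor, while Rectangle rectb; selects the first. Note that rectb is declared without parentheses: Rectangle rectb(); would declare a function instead of an object.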
Uniform initialization
The way of calling constructors by enclosing their arguments in parentheses, as shown above, is
known as functional form. But constructors can also be called with other syntaxes:
First, constructors with a single parameter can be called using the variable initialization syntax (an
equal sign followed by the argument):
class_name object_name = initialization_value;
More recently, C++ introduced the possibility of constructors to be called using uniform
initialization, which essentially is the same as the functional form, but using braces ({}) instead of
parentheses (()):
class_name object_name { value, value, value, ... }
Optionally, this last syntax can include an equal sign before the braces.
Here is an example with four ways to construct objects of a class whose constructor takes a single
parameter:
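The four forms can be sketched as follows, assuming a hypothetical class Circle whose constructor takes a radius; the member get_radius and the helper total_radius are also illustrative names.

```cpp
class Circle {
    double radius;
  public:
    Circle (double r) { radius = r; }
    double get_radius () { return radius; }
};

// Hypothetical helper: builds one circle with each syntax and adds up
// their radii.
double total_radius ()
{
  Circle foo (10.0);     // functional form
  Circle bar = 20.0;     // variable initialization syntax
  Circle baz {30.0};     // uniform initialization
  Circle qux = {40.0};   // uniform initialization with an equal sign
  return foo.get_radius() + bar.get_radius()
       + baz.get_radius() + qux.get_radius();
}
```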
class Rectangle {
    int width,height;
  public:
    Rectangle(int,int);
    int area() {return width*height;}
};
The constructor for this class could be defined, as usual, as:
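A minimal sketch of that usual definition (shown as a comment), together with an equivalent one that initializes the members directly in a member initializer list placed after a colon. The parameter names x and y are assumptions.

```cpp
class Rectangle {
    int width,height;
  public:
    Rectangle(int,int);
    int area() {return width*height;}
};

// Usual form: members are assigned inside the constructor body.
//   Rectangle::Rectangle (int x, int y) { width=x; height=y; }

// Equivalent form: members are initialized before the body runs.
Rectangle::Rectangle (int x, int y) : width(x), height(y) { }
```

For members of fundamental types such as int, both forms produce the same result.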
Pointers to classes
Objects can also be pointed to by pointers: Once declared, a class becomes a valid type, so it can
be used as the type pointed to by a pointer. For example:
Rectangle * prect;
is a pointer to an object of class Rectangle.
Similarly as with plain data structures, the members of an object can be accessed directly from a
pointer by using the arrow operator (->). Here is an example with some possible combinations:
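The following sketch assumes a Rectangle class with a two-parameter constructor; the helper pointer_demo is a hypothetical name introduced here.

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle (int x, int y) : width(x), height(y) {}
    int area () { return width * height; }
};

// Hypothetical helper: reaches the same member through an object, through
// a pointer to it, and through a pointer to a dynamically allocated object.
int pointer_demo ()
{
  Rectangle obj (3, 4);
  Rectangle * foo = &obj;                   // points to an existing object
  Rectangle * bar = new Rectangle (5, 6);   // points to a new object

  int total = obj.area()       // member of object obj
            + foo->area()      // member of object pointed to by foo
            + (*foo).area()    // equivalent to the previous one
            + bar->area();

  delete bar;                  // release the dynamically allocated object
  return total;
}
```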
expression   can be read as
*x           pointed to by x
&x           address of x
x.y          member y of object x
x->y         member y of object pointed to by x
(*x).y       member y of object pointed to by x (equivalent to the previous one)
x[0]         first object pointed to by x
x[1]         second object pointed to by x
x[n]         (n+1)th object pointed to by x
Most of these expressions have been introduced in earlier chapters. Most notably, the chapter
about arrays introduced the offset operator ([]) and the chapter about plain data structures
introduced the arrow operator (->).
Classes can be defined not only with keyword class, but also with keywords struct and union.
The keyword struct, generally used to declare plain data structures, can also be used to declare
classes that have member functions, with the same syntax as with keyword class. The only
difference between both is that members of classes declared with the
keyword struct have public access by default, while members of classes declared with the
keyword class have private access by default. For all other purposes both keywords are
equivalent in this context.
Conversely, the concept of unions is different from that of classes declared
with struct and class, since unions only store one data member at a time, but nevertheless
they are also classes and can thus also hold member functions. The default access in union classes
is public.
Classes (II)
Overloading operators
Classes, essentially, define new types to be used in C++ code. And types in C++ not only interact
with code by means of constructions and assignments. They also interact by means of operators.
For example, take the following operation on fundamental types:
int a, b, c;
a = b + c;
Here, different variables of a fundamental type (int) are applied the addition operator, and then
the assignment operator. For a fundamental arithmetic type, the meaning of such operations is
generally obvious and unambiguous, but it may not be so for certain class types. For example:
struct myclass {
  string product;
  float price;
} a, b, c;
a = b + c;
Here, it is not obvious what the addition operation on b and c should do. In fact, this code
alone would cause a compilation error, since the type myclass has no defined behavior for
additions. However, C++ allows most operators to be overloaded so that their behavior can be
defined for just about any type, including classes. Here is a list of all the operators that can be
overloaded:
Overloadable operators
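An overloaded operator can be called either implicitly, using the operator itself, or explicitly,
using its function name. For objects of a class that overloads operator+ as a member function:
c = a + b;
c = a.operator+ (b);
Both expressions are equivalent.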
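Operators are overloaded by means of operator functions, which are regular functions whose name is the keyword operator followed by the operator sign. A minimal sketch, assuming a hypothetical class CVector that holds a two-dimensional vector:

```cpp
class CVector {
  public:
    int x,y;
    CVector () : x(0), y(0) {}
    CVector (int a, int b) : x(a), y(b) {}
    CVector operator + (const CVector&) const;
};

// Member function overload: the left operand is the object itself and the
// right operand arrives as the parameter.
CVector CVector::operator+ (const CVector& param) const {
  CVector temp;
  temp.x = x + param.x;
  temp.y = y + param.y;
  return temp;
}
```

With this class, adding CVector (3,1) and CVector (1,2) through either call form yields a vector whose members are 4 and 3.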
The operator overloads are just regular functions which can have any behavior; there is actually no
requirement that the operation performed by that overload bears a relation to the mathematical
or usual meaning of the operator, although it is strongly recommended. For example, a class that
overloads operator+ to actually subtract or that overloads operator== to fill the object with
zeros, is perfectly valid, although using such a class could be challenging.
The parameter expected for a member function overload for operations such as operator+ is
naturally the operand to the right hand side of the operator. This is common to all binary operators
(those with an operand to its left and one operand to its right). But operators can come in diverse
forms. Here you have a table with a summary of the parameters needed for each of the different
operators that can be overloaded (replace @ by the operator in each case):
Expression   Operator   Member function   Non-member function
The keyword this represents a pointer to the object whose member function is being executed.
It is used within a class's member function to refer to the object itself.
One of its uses can be to check if a parameter passed to a member function is the object itself. For
example:
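A minimal sketch of such a check, assuming a hypothetical class Dummy with a member function isitme:

```cpp
class Dummy {
  public:
    bool isitme (Dummy& param);
};

// Compares the address of the argument with this, the address of the
// object on which the member function was called.
bool Dummy::isitme (Dummy& param)
{
  return &param == this;
}
```

For two objects a and b of this class, a.isitme (a) is true, while b.isitme (a) is false.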
Static members
A class can contain static members, either data or functions.
A static data member of a class is also known as a "class variable", because there is only one
common variable for all the objects of that same class, sharing the same value: i.e., its value is not
different from one object of this class to another.
For example, it may be used for a variable within a class that can contain a counter with the number
of objects of that class that are currently allocated, as in the following example:
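A minimal sketch of such a counter, assuming a hypothetical class Dummy. The copy constructor is included as an assumption of this sketch, so that copies are counted too and the counter stays balanced with the destructor.

```cpp
class Dummy {
  public:
    static int n;                    // declared here, defined below
    Dummy () { n++; }                // one more object exists
    Dummy (const Dummy&) { n++; }    // copies are objects too
    ~Dummy () { n--; }               // one object fewer
};

// Definition and initialization of the static member, outside the class.
int Dummy::n = 0;
```

After declaring one object a and an array of five more, both a.n and Dummy::n hold the value 6, since there is a single n shared by all of them.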
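Static data members are declared inside the class, but they are typically defined and initialized
outside it. For a class Dummy with a static member n, that definition is:
int Dummy::n=0;
Because it is a common variable value for all the objects of the same class, it can be referred to as
a member of any object of that class or even directly by the class name (of course this is only valid
for static members). For an object a of class Dummy, both a.n and Dummy::n refer to the same variable.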
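When an object of a class is qualified as a const object, its data members can only be read from
outside the class, and only member functions that are themselves declared const can be called on it: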
// const objects
#include <iostream>
using namespace std;

class MyClass {
    int x;
  public:
    MyClass(int val) : x(val) {}
    const int& get() const {return x;}
};

void print (const MyClass& arg) {
  cout << arg.get() << '\n';  // prints: 10
}

int main() {
  MyClass foo (10);
  print(foo);

  return 0;
}
If, in this example, get were not specified as a const member, the call to arg.get() in
the print function would not be possible, because const objects only have access
to const member functions.
Member functions can be overloaded on their constness: i.e., a class may have two member
functions with identical signatures except that one is const and the other is not: in this case,
the const version is called only when the object is itself const, and the non‐const version is
called when the object is itself non‐const.
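A minimal sketch of such a pair of overloads, reusing the MyClass name from above; the helper overload_demo and the values used are assumptions of this illustration.

```cpp
class MyClass {
    int x;
  public:
    MyClass (int val) : x(val) {}
    const int& get () const { return x; }   // called on const objects
    int& get () { return x; }               // called on non-const objects
};

// Hypothetical helper: the object's constness selects the overload.
int overload_demo ()
{
  MyClass bar (10);
  const MyClass foo (20);

  bar.get() = 15;       // ok: the non-const version returns int&
  // foo.get() = 25;    // would not compile: get() returns const int&

  return bar.get() + foo.get();
}
```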
Class templates
Just like we can create function templates, we can also create class templates, allowing classes to
have members that use template parameters as types. For example:
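A minimal sketch, assuming a hypothetical class template mypair that stores two values of the same type and returns the greater one:

```cpp
template <class T>
class mypair {
    T a, b;
  public:
    mypair (T first, T second) : a(first), b(second) {}
    T getmax () const;
};

// A member defined outside the class template must repeat the template
// prefix and name the class as mypair<T>.
template <class T>
T mypair<T>::getmax () const
{
  return a > b ? a : b;
}
```

An object is declared by giving the type between angle brackets, as in mypair<int> myobject (100, 75); for which getmax returns 100.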
Template specialization
It is possible to define a different implementation for a template when a specific type is passed as
template argument. This is called a template specialization.
For example, let's suppose that we have a very simple class called mycontainer that can store
one element of any type and that has just one member function called increase, which increases
its value. But we find that when it stores an element of type char it would be more convenient to
have a completely different implementation with a function member uppercase, so we decide
to declare a class template specialization for that type:
// template specialization
#include <iostream>
using namespace std;

// class template:
template <class T>
class mycontainer {
    T element;
  public:
    mycontainer (T arg) {element=arg;}
    T increase () {return ++element;}
};

// class template specialization:
template <>
class mycontainer <char> {
    char element;
  public:
    mycontainer (char arg) {element=arg;}
    char uppercase ()
    {
      if ((element>='a')&&(element<='z'))
      element+='A'-'a';
      return element;
    }
};

int main () {
  mycontainer<int> myint (7);
  mycontainer<char> mychar ('j');
  cout << myint.increase() << endl;     // prints: 8
  cout << mychar.uppercase() << endl;   // prints: J
  return 0;
}
This is the syntax used for the class template specialization:
Special members
[NOTE: This chapter requires proper understanding of dynamically allocated memory]
Special member functions are member functions that are implicitly defined as members of classes
under certain circumstances. There are six:
Default constructor
The default constructor is the constructor called when objects of a class are declared, but are not
initialized with any arguments.
If a class definition has no constructors, the compiler assumes the class to have an implicitly
defined default constructor. Therefore, after declaring a class like this:
class Example {
  public:
    int total;
    void accumulate (int x) { total += x; }
};
The compiler assumes that Example has a default constructor. Therefore, objects of this class can
be constructed by simply declaring them without any arguments:
Example ex;
But as soon as a class has some constructor taking any number of parameters explicitly declared,
the compiler no longer provides an implicit default constructor, and no longer allows the
declaration of new objects of that class without arguments. For example, the following class:
class Example2 {
  public:
    int total;
    Example2 (int initial_value) : total(initial_value) { }
    void accumulate (int x) { total += x; }
};
Here, we have declared a constructor with a parameter of type int. Therefore the following object
declaration would be correct:
Example3() {}
This allows objects of class Example3 to be constructed without arguments (like foo was
declared in this example). Normally, a default constructor like this is implicitly defined for all classes
that have no other constructors and thus no explicit definition is required. But in this
case, Example3 has another constructor:
Destructor
Destructors fulfill the opposite functionality of constructors: They are responsible for the necessary
cleanup needed by a class when its lifetime ends. The classes we have defined in previous chapters
did not allocate any resource and thus did not really require any clean up.
But now, let's imagine that the class in the last example allocates dynamic memory to store the
string it had as data member; in this case, it would be very useful to have a function called
automatically at the end of the object's life in charge of releasing this memory. To do this, we use
a destructor. A destructor is a member function very similar to a default constructor: it takes no
arguments and returns nothing, not even void. It also uses the class name as its own name, but
preceded with a tilde sign (~):
// destructors
#include <iostream>
#include <string>
using namespace std;

class Example4 {
    string* ptr;
  public:
    // constructors:
    Example4() : ptr(new string) {}
    Example4 (const string& str) : ptr(new string(str)) {}
    // destructor:
    ~Example4 () {delete ptr;}
    // access content:
    const string& content() const {return *ptr;}
};

int main () {
  Example4 foo;
  Example4 bar ("Example");

  cout << "bar's content: " << bar.content() << '\n';   // prints: bar's content: Example
  return 0;
}
On construction, Example4 allocates storage for a string; this storage is later released by the
destructor.
The destructor for an object is called at the end of its lifetime; in the case of foo and bar this
happens at the end of function main.
Copy constructor
When an object is passed a named object of its own type as argument, its copy constructor is
invoked in order to construct a copy.
A copy constructor is a constructor whose first parameter is of type reference to the class itself
(possibly const qualified) and which can be invoked with a single argument of this type. For
example, for a class MyClass, the copy constructor may have the following signature:
class MyClass {
  public:
    int a, b; string c;
};
An implicit copy constructor is automatically defined. The definition assumed for this function
performs a shallow copy, roughly equivalent to:
Copy assignment
Objects are not only copied on construction, when they are initialized: They can also be copied on
any assignment operation. See the difference:
MyClass foo;
MyClass bar (foo);   // object initialization: copy constructor called
MyClass baz = foo;   // object initialization: copy constructor called
foo = bar;           // object already initialized: copy assignment called
Note that baz is initialized on construction using an equal sign, but this is not an assignment
operation (although it may look like one): the declaration of an object is not an assignment
operation, it is just another of the syntaxes to call single‐argument constructors.
The assignment on foo is an assignment operation. No object is being declared here, but an
operation is being performed on an existing object: foo.
The copy assignment operator is an overload of operator= which takes a value or reference of
the class itself as parameter. The return value is generally a reference to *this (although this is
not required). For example, for a class MyClass, the copy assignment may have the following
signature:
Implicit members
The six special member functions described above are members implicitly declared on classes
under certain circumstances:
Member function      implicitly defined:                                                     default definition:
Default constructor  if no other constructors                                                does nothing
Destructor           if no destructor                                                        does nothing
Copy constructor     if no move constructor and no move assignment                           copies all members
Copy assignment      if no move constructor and no move assignment                           copies all members
Move constructor     if no destructor, no copy constructor and no copy nor move assignment   moves all members
Move assignment      if no destructor, no copy constructor and no copy nor move assignment   moves all members
Notice how not all special member functions are implicitly defined in the same cases. This is mostly
due to backwards compatibility with C structures and earlier C++ versions, and in fact some include
deprecated cases. Fortunately, each class can select explicitly which of these members exist with
their default definition or which are deleted by using the keywords default and delete,
respectively. The syntax is either one of:
function_declaration = default;
function_declaration = delete;
For example:
// default and delete implicit members
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    Rectangle (int x, int y) : width(x), height(y) {}
    Rectangle() = default;
    Rectangle (const Rectangle& other) = delete;
    int area() {return width*height;}
};

int main () {
  Rectangle foo;
  Rectangle bar (10,20);

  cout << "bar's area: " << bar.area() << '\n';   // prints: bar's area: 200
  return 0;
}
Here, Rectangle can be constructed either with two int arguments or be default‐
constructed (with no arguments). It cannot however be copy‐constructed from
another Rectangle object, because this function has been deleted. Therefore, assuming the
objects of the last example, the following statement would not be valid:
Friend functions
In principle, private and protected members of a class cannot be accessed from outside the same
class in which they are declared. However, this rule does not apply to "friends".
Friends are functions or classes declared with the friend keyword.
A non‐member function can access the private and protected members of a class if it is declared
a friend of that class. That is done by including a declaration of this external function within the
class, and preceding it with the keyword friend:
// friend functions
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    Rectangle() {}
    Rectangle (int x, int y) : width(x), height(y) {}
    int area() {return width * height;}
    friend Rectangle duplicate (const Rectangle&);
};

Rectangle duplicate (const Rectangle& param)
{
  Rectangle res;
  res.width = param.width*2;
  res.height = param.height*2;
  return res;
}

int main () {
  Rectangle foo;
  Rectangle bar (2,3);
  foo = duplicate (bar);
  cout << foo.area() << '\n';   // prints 24
  return 0;
}
The duplicate function is a friend of class Rectangle. Therefore, function duplicate is able
to access the members width and height (which are private) of different objects of
type Rectangle. Notice though that neither in the declaration of duplicate nor in its later use
in main is function duplicate considered a member of class Rectangle. It isn't! It simply has
access to its private and protected members without being a member.
Typical use cases of friend functions are operations that are conducted between two different
classes accessing private or protected members of both.
Friend classes
Similar to friend functions, a friend class is a class whose members have access to the private or
protected members of another class:
// friend class
#include <iostream>
using namespace std;

class Square;

class Rectangle {
    int width, height;
  public:
    int area ()
      {return (width * height);}
    void convert (Square a);
};

class Square {
  friend class Rectangle;
  private:
    int side;
  public:
    Square (int a) : side(a) {}
};

void Rectangle::convert (Square a) {
  width = a.side;
  height = a.side;
}

int main () {
  Rectangle rect;
  Square sqr (4);
  rect.convert(sqr);
  cout << rect.area();   // prints 16
  return 0;
}
In this example, class Rectangle is a friend of class Square allowing Rectangle's member
functions to access private and protected members of Square. More
concretely, Rectangle accesses the member variable Square::side, which describes the side
of the square.
There is something else new in this example: at the beginning of the program, there is an empty
declaration of class Square. This is necessary because class Rectangle uses Square (as a
parameter in member convert), and Square uses Rectangle (declaring it a friend).
Friendships are never reciprocal unless specified: In our example, Rectangle is considered a
friend class by Square, but Square is not considered a friend by Rectangle. Therefore, the
member functions of Rectangle can access the protected and private members of Square but
not the other way around. Of course, Square could also be declared friend of Rectangle, if
needed, granting such an access.
Another property of friendships is that they are not transitive: The friend of a friend is not
considered a friend unless explicitly specified.
The Polygon class would contain members that are common for both types of polygon. In our
case: width and height. And Rectangle and Triangle would be its derived classes, with
specific features that are different from one type of polygon to the other.
Classes that are derived from others inherit all the accessible members of the base class. That
means that if a base class includes a member A and we derive a class from it with another member
called B, the derived class will contain both member A and member B.
The inheritance relationship of two classes is declared in the derived class. Derived class
definitions use the following syntax:
class derived_class_name: public base_class_name
{ /*...*/ };
Where derived_class_name is the name of the derived class and base_class_name is the
name of the class on which it is based. The public access specifier may be replaced by any one
of the other access specifiers (protected or private). This access specifier limits the most
accessible level for the members inherited from the base class: The members with a more
accessible level are inherited with this level instead, while the members with an equal or more
restrictive access level keep their restrictive level in the derived class.
// derived classes
#include <iostream>
using namespace std;

class Polygon {
  protected:
    int width, height;
  public:
    void set_values (int a, int b)
      { width=a; height=b;}
};

class Rectangle: public Polygon {
  public:
    int area ()
      { return width * height; }
};

class Triangle: public Polygon {
  public:
    int area ()
      { return width * height / 2; }
};

int main () {
  Rectangle rect;
  Triangle trgl;
  rect.set_values (4,5);
  trgl.set_values (4,5);
  cout << rect.area() << '\n';   // prints 20
  cout << trgl.area() << '\n';   // prints 10
  return 0;
}
The objects of the classes Rectangle and Triangle each contain members inherited
from Polygon. These are: width, height and set_values.
The protected access specifier used in class Polygon is similar to private. Its only difference
occurs in fact with inheritance: When a class inherits another one, the members of the derived
class can access the protected members inherited from the base class, but not its private members.
By declaring width and height as protected instead of private, these members are also
accessible from the derived classes Rectangle and Triangle, instead of just from members
of Polygon. If they were public, they could be accessed from anywhere.
We can summarize the different access types according to which functions can access them in the
following way:
Access                      public  protected  private
members of the same class   yes     yes        yes
members of derived class    yes     yes        no
not members                 yes     no         no
Where "not members" represents any access from outside the class, such as from main, from
another class or from a function.
In the example above, the members inherited by Rectangle and Triangle have the same
access permissions as they had in their base class Polygon:
its constructors and its destructor
its assignment operator members (operator=)
its friends
its private members
Even though access to the constructors and destructor of the base class is not inherited as such,
they are automatically called by the constructors and destructor of the derived class.
Unless otherwise specified, the constructors of a derived class call the default constructor of its
base classes (i.e., the constructor taking no arguments). Calling a different constructor of a base
class is possible, using the same syntax used to initialize member variables in the initialization list:
derived_constructor_name (parameters) : base_constructor_name (parameters) {...}
For example:
return 0;
}
Notice the difference between which Mother's constructor is called when a
new Daughter object is created and which when it is a Son object. The difference is due to the
different constructor declarations of Daughter and Son:
Multiple inheritance
A class may inherit from more than one class by simply specifying more base classes, separated by
commas, in the list of a class's base classes (i.e., after the colon). For example, if the program had
a specific class to print on screen called Output, and we wanted our
classes Rectangle and Triangle to also inherit its members in addition to those
of Polygon we could write:
class Rectangle: public Polygon, public Output { /*...*/ };
class Triangle: public Polygon, public Output { /*...*/ };
Here is the complete example:
// multiple inheritance
#include <iostream>
using namespace std;

class Polygon {
  protected:
    int width, height;
  public:
    Polygon (int a, int b) : width(a), height(b) {}
};

class Output {
  public:
    static void print (int i);
};

void Output::print (int i) {
  cout << i << '\n';
}

class Rectangle: public Polygon, public Output {
  public:
    Rectangle (int a, int b) : Polygon(a,b) {}
    int area ()
      { return width*height; }
};

class Triangle: public Polygon, public Output {
  public:
    Triangle (int a, int b) : Polygon(a,b) {}
    int area ()
      { return width*height/2; }
};

int main () {
  Rectangle rect (4,5);
  Triangle trgl (4,5);
  rect.print (rect.area());        // prints 20
  Triangle::print (trgl.area());   // prints 10
  return 0;
}
Polymorphism
Before getting any deeper into this chapter, you should have a proper understanding of pointers
and class inheritance. If you are not really sure of the meaning of any of the following expressions,
you should review the indicated sections:
ppoly1->set_values (4,5);
rect.set_values (4,5);
But because the type of ppoly1 and ppoly2 is pointer to Polygon (and not pointer
to Rectangle nor pointer to Triangle), only the members inherited from Polygon can be
accessed, and not those of the derived classes Rectangle and Triangle. That is why the
program above accesses the area members of both objects using rect and trgl directly,
instead of the pointers; the pointers to the base class cannot access the area members.
Member area could have been accessed with the pointers to Polygon if area were a member
of Polygon instead of a member of its derived classes, but the problem is
that Rectangle and Triangle implement different versions of area, therefore there is not a
single common version that could be implemented in the base class.
Virtual members
A virtual member is a member function that can be redefined in a derived class, while preserving
its calling properties through references. The syntax for a function to become virtual is to precede
its declaration with the virtual keyword:
// virtual members
#include <iostream>
using namespace std;

class Polygon {
  protected:
    int width, height;
  public:
    void set_values (int a, int b)
      { width=a; height=b; }
    virtual int area ()
      { return 0; }
};

class Rectangle: public Polygon {
  public:
    int area ()
      { return width * height; }
};

class Triangle: public Polygon {
  public:
    int area ()
      { return (width * height / 2); }
};

int main () {
  Rectangle rect;
  Triangle trgl;
  Polygon poly;
  Polygon * ppoly1 = &rect;
  Polygon * ppoly2 = &trgl;
  Polygon * ppoly3 = &poly;
  ppoly1->set_values (4,5);
  ppoly2->set_values (4,5);
  ppoly3->set_values (4,5);
  cout << ppoly1->area() << '\n';   // prints 20
  cout << ppoly2->area() << '\n';   // prints 10
  cout << ppoly3->area() << '\n';   // prints 0
  return 0;
}
In this example, all three classes (Polygon, Rectangle and Triangle) have the same
members: width, height, and functions set_values and area.
The member function area has been declared as virtual in the base class because it is later
redefined in each of the derived classes. Non‐virtual members can also be redefined in derived
classes, but non‐virtual members of derived classes cannot be accessed through a reference to the
base class: i.e., if virtual is removed from the declaration of area in the example above, all
three calls to area would return zero, because in all cases, the version of the base class would
have been called instead.
Therefore, essentially, what the virtual keyword does is to allow a member of a derived class
with the same name as one in the base class to be appropriately called from a pointer, and more
precisely when the type of the pointer is a pointer to the base class that is pointing to an object of
the derived class, as in the above example.
A class that declares or inherits a virtual function is called a polymorphic class.
Note that despite the virtuality of one of its members, Polygon was a regular class, of which
even an object was instantiated (poly), with its own definition of member area that always
returns 0.
Abstract base classes are something very similar to the Polygon class in the previous example.
They are classes that can only be used as base classes, and thus are allowed to have virtual member
functions without definition (known as pure virtual functions). The syntax is to replace their
definition by =0 (an equal sign and a zero):
An abstract base Polygon class could look like this:
// abstract class CPolygon
class Polygon {
  protected:
    int width, height;
  public:
    void set_values (int a, int b)
      { width=a; height=b; }
    virtual int area () =0;
};
Notice that area has no definition; this has been replaced by =0, which makes it a pure virtual
function. Classes that contain at least one pure virtual function are known as abstract base classes.
Abstract base classes cannot be used to instantiate objects. Therefore, this last abstract base class
version of Polygon could not be used to declare objects like:
Polygon * ppoly1;
Polygon * ppoly2;
And can actually be dereferenced when pointing to objects of derived (non‐abstract) classes. Here
is the entire example:
Unit 6
Type conversions
Implicit conversion
Implicit conversions are automatically performed when a value is copied to a compatible type. For
example:
short a=2000;
int b;
b=a;
Here, the value of a is promoted from short to int without the need of any explicit operator.
This is known as a standard conversion. Standard conversions affect fundamental data types, and
allow the conversions between numerical types
(short to int, int to float, double to int...), to or from bool, and some pointer
conversions.
Converting to int from some smaller integer type, or to double from float is known
as promotion, and is guaranteed to produce the exact same value in the destination type. Other
conversions between arithmetic types may not always be able to represent the same value exactly:
If a negative integer value is converted to an unsigned type, the resulting value
corresponds to its 2's complement bitwise representation (i.e., -1 becomes the largest
value representable by the type, -2 the second largest, ...).
The conversions from/to bool consider false equivalent to zero (for numeric types)
and to null pointer (for pointer types); true is equivalent to all other values and is
converted to the equivalent of 1.
If the conversion is from a floating‐point type to an integer type, the value is truncated
(the decimal part is removed). If the result lies outside the range of representable values
by the type, the conversion causes undefined behavior.
Otherwise, if the conversion is between numeric types of the same kind (integer‐to‐
integer or floating‐to‐floating), the conversion is valid, but the value is implementation‐
specific (and may not be portable).
Some of these conversions may imply a loss of precision, which the compiler can signal with a
warning. This warning can be avoided with an explicit conversion.
For non‐fundamental types, arrays and functions implicitly convert to pointers, and pointers in
general allow the following conversions:
Null pointers can be converted to pointers of any type
Pointers to any type can be converted to void pointers.
Pointer upcast: pointers to a derived class can be converted to a pointer of
an accessible and unambiguous base class, without modifying
its const or volatile qualification.
Single‐argument constructors: allow implicit conversion from a particular type to
initialize an object.
Assignment operator: allow implicit conversion from a particular type on assignments.
Type‐cast operator: allow implicit conversion to a particular type.
For example:
Keyword explicit
On a function call, C++ allows one implicit conversion to happen for each argument. This may be
somewhat problematic for classes, because it is not always what is intended. For example, if we
add the following function to the last example:
void fn (B arg) {}
This function takes an argument of type B, but it could as well be called with an object of type A as
argument:
fn (foo);
This may or may not be what was intended. But, in any case, it can be prevented by marking the
affected constructor with the explicit keyword:
B bar = foo;
Type‐cast member functions (those described in the previous section) can also be specified
as explicit. This prevents implicit conversions in the same way as explicit‐specified
constructors do for the destination type.
Type casting
C++ is a strongly typed language. Many conversions, especially those that imply a different
interpretation of the value, require an explicit conversion, known in C++ as type‐casting. There exist
two main syntaxes for generic type‐casting: functional and c‐like:
double x = 10.3;
int y;
y = int (x);    // functional notation
y = (int) x;    // c-like cast notation
The functionality of these generic forms of type‐casting is enough for most needs with fundamental
data types. However, these operators can be applied indiscriminately on classes and pointers to
classes, which can lead to code that ‐while being syntactically correct‐ can cause runtime errors.
For example, the following code compiles without errors:
// class type-casting
#include <iostream>
using namespace std;

class Dummy {
    double i,j;
};

class Addition {
    int x,y;
  public:
    Addition (int a, int b) { x=a; y=b; }
    int result() { return x+y;}
};

int main () {
  Dummy d;
  Addition * padd;
  padd = (Addition*) &d;
  cout << padd->result();
  return 0;
}
The program declares a pointer to Addition, but then it assigns to it the address of an object of
another unrelated type using explicit type‐casting:
dynamic_cast
dynamic_cast can only be used with pointers and references to classes (or with void*). Its
purpose is to ensure that the result of the type conversion points to a valid complete object of the
destination pointer type.
This naturally includes pointer upcast (converting from pointer‐to‐derived to pointer‐to‐base), in
the same way as allowed as an implicit conversion.
But dynamic_cast can also downcast (convert from pointer‐to‐base to pointer‐to‐derived)
polymorphic classes (those with virtual members) if ‐and only if‐ the pointed object is a valid
complete object of the target type. For example:
static_cast
Convert from void* to any pointer type. In this case, it guarantees that if
the void* value was obtained by converting from that same pointer type, the resulting
pointer value is the same.
Convert integers, floating‐point values and enum types to enum types.
Additionally, static_cast can also perform the following:
Explicitly call a single‐argument constructor or a conversion operator.
Convert to rvalue references.
Convert enum class values into integers or floating‐point values.
Convert any type to void, evaluating and discarding the value.
reinterpret_cast
reinterpret_cast converts any pointer type to any other pointer type, even of unrelated
classes. The operation result is a simple binary copy of the value from one pointer to the other. All
pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked.
It can also cast pointers to or from integer types. The format in which this integer value represents
a pointer is platform‐specific. The only guarantee is that a pointer cast to an integer type large
enough to fully contain it (such as intptr_t) can be cast back to a valid pointer.
The conversions that can be performed by reinterpret_cast but not by static_cast are
low‐level operations based on reinterpreting the binary representations of the types, which in
most cases results in code which is system‐specific, and thus non‐portable. For example:
class A { /* ... */ };
class B { /* ... */ };
A * a = new A;
B * b = reinterpret_cast<B*>(a);
This code compiles, although it does not make much sense, since now b points to an object of a
totally unrelated and likely incompatible class. Dereferencing b is unsafe.
const_cast
This type of casting manipulates the constness of the object pointed by a pointer, either to be set
or to be removed. For example, in order to pass a const pointer to a function that expects a non‐
const argument:
typeid allows checking the type of an expression:
typeid (expression)
This operator returns a reference to a constant object of type type_info that is defined in the
standard header <typeinfo>. A value returned by typeid can be compared with another value
returned by typeid using operators == and != or can serve to obtain a null‐terminated character
sequence representing the data type or class name by using its name() member.
Exceptions
Exceptions provide a way to react to exceptional circumstances (like runtime errors) in programs
by transferring control to special functions called handlers.
To catch exceptions, a portion of code is placed under exception inspection. This is done by
enclosing that portion of code in a try‐block. When an exceptional circumstance arises within that
block, an exception is thrown that transfers the control to the exception handler. If no exception is
thrown, the code continues normally and all handlers are ignored.
An exception is thrown by using the throw keyword from inside the try block. Exception handlers
are declared with the keyword catch, which must be placed immediately after the try block:
// exceptions
#include <iostream>
using namespace std;

int main () {
  try
  {
    throw 20;
  }
  catch (int e)
  {
    // prints: An exception occurred. Exception Nr. 20
    cout << "An exception occurred. Exception Nr. " << e << '\n';
  }
  return 0;
}
The code under exception handling is enclosed in a try block. In this example this code simply
throws an exception:
throw 20;
A throw expression accepts one parameter (in this case the integer value 20), which is passed as
an argument to the exception handler.
The exception handler is declared with the catch keyword immediately after the closing brace of
the try block. The syntax for catch is similar to a regular function with one parameter. The type
of this parameter is very important, since the type of the argument passed by
the throw expression is checked against it, and only in the case they match, the exception is
caught by that handler.
Multiple handlers (i.e., catch expressions) can be chained; each one with a different parameter
type. Only the handler whose argument type matches the type of the exception specified in
the throw statement is executed.
If an ellipsis (...) is used as the parameter of catch, that handler will catch any exception no
matter what the type of the exception thrown. This can be used as a default handler that catches
all exceptions not caught by other handlers:
try {
  // code here
}
catch (int param) { cout << "int exception"; }
catch (char param) { cout << "char exception"; }
catch (...) { cout << "default exception"; }
In this case, the last handler would catch any exception thrown of a type that is
neither int nor char.
After an exception has been handled, the program execution resumes after the try‐catch block, not
after the throw statement!
It is also possible to nest try-catch blocks within more external try blocks. In these cases, we
have the possibility that an internal catch block forwards the exception to its external level. This
is done with the expression throw; with no arguments. For example:
try {
  try {
    // code here
  }
  catch (int n) {
    throw;
  }
}
catch (...) {
  cout << "Exception occurred";
}
Exception specification
Older code may contain dynamic exception specifications. They are deprecated since C++11 (and
were removed from the language in C++17), but compilers may still accept them in older language
modes. A dynamic exception specification follows the declaration of a function, appending
a throw specifier to it. For example:
Standard exceptions
The C++ Standard library provides a base class specifically designed to declare objects to be thrown
as exceptions. It is called std::exception and is defined in the <exception> header. This class has a
virtual member function called what that returns a null‐terminated character sequence (of
type const char *) and that can be overridden in derived classes to contain some sort of description
of the exception.
exception          description
bad_alloc          thrown by new on allocation failure
bad_cast           thrown by dynamic_cast when it fails in a dynamic cast
bad_exception      thrown by certain dynamic exception specifiers
bad_typeid         thrown by typeid
bad_function_call  thrown by empty function objects
bad_weak_ptr       thrown by shared_ptr when passed a bad weak_ptr
Also deriving from exception, header <stdexcept> defines two generic exception types that can
be inherited by custom exceptions to report errors:
exception      description
logic_error    error related to the internal logic of the program
runtime_error  error detected during runtime
A typical example where standard exceptions need to be checked for is on memory allocation:
Preprocessor directives
Preprocessor directives are lines included in the code of programs preceded by a hash sign (#).
These lines are not program statements but directives for the preprocessor. The preprocessor
examines the code before actual compilation of code begins and resolves all these directives before
any code is actually generated by regular statements.
These preprocessor directives extend only across a single line of code. As soon as a newline
character is found, the preprocessor directive ends. No semicolon (;) is expected at the end of a
preprocessor directive. The only way a preprocessor directive can extend through more than one
line is by preceding the newline character at the end of the line by a backslash (\).
To define preprocessor macros we can use #define. Its syntax is:
#define identifier replacement
When the preprocessor encounters this directive, it replaces any occurrence of identifier in
the rest of the code by replacement. This replacement can be an expression, a statement, a
block or simply anything. The preprocessor does not understand C++ proper; it simply replaces any
occurrence of identifier by replacement.
For example, if TABLE_SIZE is defined with #define TABLE_SIZE 100, the declarations
int table1[TABLE_SIZE]; and int table2[TABLE_SIZE]; become equivalent to:
int table1[100];
int table2[100];
#define can also work with parameters to define function macros:
#define getmax(a,b) a>b?a:b
This would replace any occurrence of getmax followed by two arguments by the replacement
expression, but also replacing each argument by its identifier, exactly as you would expect if it were
a function:
// function macro
#include <iostream>
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main()
{
  int x=5, y;
  y= getmax(x,2);
  cout << y << endl;
  cout << getmax(7,x) << endl;
  return 0;
}
This program prints:
5
7
Defined macros are not affected by block structure. A macro lasts until it is undefined with
the #undef preprocessor directive:
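For example, defining TABLE_SIZE as 100 before declaring table1, then undefining it and
redefining it as 200 before declaring table2, generates the same code as:
int table1[100];
int table2[200];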
Function macro definitions accept two special operators (# and ##) in the replacement sequence:
The operator #, followed by a parameter name, is replaced by a string literal that contains the
argument passed (as if enclosed between double quotes):
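#define str(x) #x
cout << str(test);
This would be translated into:
cout << "test";
The operator ## concatenates two arguments, leaving no blank spaces between them:
#define glue(a,b) a ## b
glue(c,out) << "test";
This would also be translated into:
cout << "test";
Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)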
These directives allow part of the code of a program to be included or discarded if a certain
condition is met.
#ifdef allows a section of a program to be compiled only if the macro that is specified as the
parameter has been defined, no matter what its value is. For example:
#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif
In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE was
previously defined with #define, independently of its value. If it was not defined, that line will
not be included in the program compilation.
#ifndef serves for the exact opposite: the code between #ifndef and #endif directives is
only compiled if the specified identifier has not been previously defined. For example:
#ifndef TABLE_SIZE
#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];
In this case, if the TABLE_SIZE macro has not been defined yet when this piece of code is
reached, it is defined with a value of 100. If it already existed, it keeps its previous value, since
the #define directive is not processed.
The #if, #else and #elif (i.e., "else if") directives serve to specify some condition to be met in
order for the portion of code they surround to be compiled. The condition that
follows #if or #elif can only evaluate constant expressions, including macro expressions. For
example:
#if TABLE_SIZE>200
#undef TABLE_SIZE
#define TABLE_SIZE 200

#elif TABLE_SIZE<50
#undef TABLE_SIZE
#define TABLE_SIZE 50

#else
#undef TABLE_SIZE
#define TABLE_SIZE 100
#endif

int table[TABLE_SIZE];
Notice how the entire structure of #if, #elif and #else chained directives ends with #endif.
The behavior of #ifdef and #ifndef can also be achieved by using the special
operators defined and !defined respectively in any #if or #elif directive:
#if defined ARRAY_SIZE
#define TABLE_SIZE ARRAY_SIZE
#elif !defined BUFFER_SIZE
#define TABLE_SIZE 128
#else
#define TABLE_SIZE BUFFER_SIZE
#endif
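When the preprocessor finds an #include directive, it replaces it by the entire content of the
specified header or file. There are two ways to use #include:
#include <header>
#include "file"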
In the first case, a header is specified between angle‐brackets <>. This is used to include headers
provided by the implementation, such as the headers that compose the standard library
(iostream, string,...). Whether the headers are actually files or exist in some other form
is implementation‐defined, but in any case they shall be properly included with this directive.
The syntax used in the second #include uses quotes, and includes a file. The file is searched for
in an implementation‐defined manner, which generally includes the current path. In the case that
the file is not found, the compiler interprets the directive as a header inclusion, just as if the quotes
("") were replaced by angle‐brackets (<>).
macro        value
__LINE__     Integer value representing the current line in the source code file being
             compiled.
__FILE__     A string literal containing the presumed name of the source file being
             compiled.
__DATE__     A string literal in the form "Mmm dd yyyy" containing the date in which
             the compilation process began.
__TIME__     A string literal in the form "hh:mm:ss" containing the time at which the
             compilation process began.
__cplusplus  An integer value. All C++ compilers have this constant defined to some
             value. Its value depends on the version of the standard supported by
             the compiler (199711L for C++98 and C++03, 201103L for C++11).
Unit 7
Input/output with files
C++ provides the following classes to perform output and input of characters to/from files:
ofstream: Stream class to write on files
ifstream: Stream class to read from files
fstream: Stream class to both read and write from/to files.
These classes are derived directly or indirectly from the classes istream and ostream. We have
already used objects whose types were these classes: cin is an object of
class istream and cout is an object of class ostream. Therefore, we have already been using
classes that are related to our file streams. In fact, we can use our file streams the same way
we are already used to using cin and cout, with the only difference that we have to associate these
streams with physical files. Let's see an example:
Open a file
The first operation generally performed on an object of one of these classes is to associate it with a
real file. This procedure is known as opening a file. An open file is represented within a program by
a stream (i.e., an object of one of these classes; in the previous example, this was myfile) and
any input or output operation performed on this stream object will be applied to the physical file
associated with it.
In order to open a file with a stream object we use its member function open:
open (filename, mode);
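Where filename is a string representing the name of the file to be opened, and mode is an
optional parameter with a combination of the following flags:
flag         meaning
ios::in      Open for input operations.
ios::out     Open for output operations.
ios::binary  Open in binary mode.
ios::ate     Set the initial position at the end of the file.
ios::app     All output operations are performed at the end of the file, appending to
             its current contents.
ios::trunc   If the file is opened for output and already exists, its previous
             contents are discarded.
These flags can be combined using the bitwise OR operator (|). For example, to open the
file example.bin in binary mode to append data:
ofstream myfile;
myfile.open ("example.bin", ios::out | ios::app | ios::binary);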
Each of the open member functions of classes ofstream, ifstream and fstream has a
default mode that is used if the file is opened without a second argument:
class     default mode parameter
ofstream  ios::out
ifstream  ios::in
fstream   ios::in | ios::out
For ifstream and ofstream classes, ios::in and ios::out are automatically and
respectively assumed, even if a mode that does not include them is passed as second argument to
the open member function (the flags are combined).
For fstream, the default value is only applied if the function is called without specifying any value
for the mode parameter. If the function is called with any value in that parameter the default mode
is overridden, not combined.
File streams opened in binary mode perform input and output operations independently of any
format considerations. Non‐binary files are known as text files, and some translations may occur
due to formatting of some special characters (like newline and carriage return characters).
Since the first task that is performed on a file stream is generally to open a file, these three classes
include a constructor that automatically calls the open member function and has the exact same
parameters as this member. Therefore, we could also have declared the previous myfile object
and conducted the same opening operation in our previous example by writing:
Closing a file
When we are finished with our input and output operations on a file we shall close it so that the
operating system is notified and its resources become available again. For that, we call the stream's
member function close. This member function flushes the associated buffers and closes
the file:
myfile.close();
Once this member function is called, the stream object can be re‐used to open another file, and
the file is available again to be opened by other processes.
If an object is destroyed while still associated with an open file, the destructor
automatically calls the member function close.
Text files
Text file streams are those where the ios::binary flag is not included in their opening mode.
These files are designed to store text and thus all values that are input or output from/to them can
undergo some formatting transformations, which do not necessarily correspond to their literal binary
value.
Writing operations on text files are performed in the same way we operated with cout:
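The following member functions check for specific states of a stream (all of them return a
bool value):
bad()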
Returns true if a reading or writing operation fails. For example, in the case that we try
to write to a file that is not open for writing or if the device where we try to write has no
space left.
fail()
Returns true in the same cases as bad(), but also in the case that a format error
happens, like when an alphabetical character is extracted when we are trying to read an
integer number.
eof()
Returns true if a file open for reading has reached the end.
good()
It is the most generic state flag: it returns false in the same cases in which calling any of
the previous functions would return true. Note that good and bad are not exact
opposites (good checks more state flags at once).
The member function clear() can be used to reset the state flags.
tellg() and tellp()
These two member functions with no parameters return a value of the member type streampos,
which is a type representing the current get position (in the case of tellg) or the put position (in
the case of tellp).
seekg() and seekp()
These functions allow the location of the get and put positions to be changed. Both functions are
overloaded with two different prototypes. The first form is:
seekg ( position );
seekp ( position );
Using this prototype, the stream pointer is changed to the absolute position position (counting
from the beginning of the file). The type for this parameter is streampos, which is the same type
as returned by functions tellg and tellp.
The other form for these functions is:
seekg ( offset, direction );
seekp ( offset, direction );
Using this prototype, the get or put position is set to an offset value relative to some specific point
determined by the parameter direction. offset is of type streamoff, and direction is
of type seekdir, which is an enumerated type that determines the point from which offset is
counted, and that can take any of the following values:
value     offset counted from
ios::beg  the beginning of the stream
ios::cur  the current position
ios::end  the end of the stream
A position returned by tellg or tellp can be stored in a variable of type streampos:
streampos size;
streampos is a specific type used for buffer and file positioning and is the type returned
by file.tellg(). Values of this type can safely be subtracted from other values of the same
type, and can also be converted to an integer type large enough to contain the size of the file.
These stream positioning functions use two particular types: streampos and streamoff. These
types are also defined as member types of the stream class:
Type       Member type    Description
streampos  ios::pos_type  Defined as fpos<mbstate_t>. It can be converted to/from streamoff,
                          and values of these types can be added to or subtracted from it.
streamoff  ios::off_type  It is an alias of one of the fundamental integral types (such
                          as int or long long).
Each of the member types above is an alias of its non‐member equivalent (they are the exact same
type). It does not matter which one is used. The member types are more generic, because they are
the same on all stream objects (even on streams using exotic types of characters), but the non‐
member types are widely used in existing code for historical reasons.
Binary files
For binary files, reading and writing data with the extraction and insertion operators (>> and <<)
and functions like getline is not efficient, since we do not need to format any data and data is
likely not formatted in lines.
File streams include two member functions specifically designed to read and write binary data
sequentially: write and read. The first one (write) is a member function
of ostream (inherited by ofstream). And read is a member function of istream (inherited
by ifstream). Objects of class fstream have both. Their prototypes are:
write ( memory_block, size );
read ( memory_block, size );
Where memory_block is of type char* (pointer to char), and represents the address of an
array of bytes where the read data elements are stored or from where the data elements to be
written are taken. The size parameter is an integer value that specifies the number of characters
to be read or written from/to the memory block.
When we operate with file streams, data is first placed in an internal buffer instead of being
written directly to the file. The buffer is synchronized with the physical file in the following
circumstances:
When the file is closed: before closing a file, all buffers that have not yet been flushed
are synchronized and all pending data is written or read to the physical medium.
When the buffer is full: Buffers have a certain size. When the buffer is full it is
automatically synchronized.
Explicitly, with manipulators: When certain manipulators are used on streams, an
explicit synchronization takes place. These manipulators are: flush and endl.
Explicitly, with member function sync(): Calling the stream's member
function sync() causes an immediate synchronization. This function returns
an int value equal to -1 if the stream has no associated buffer or in case of failure.
Otherwise (if the stream buffer was successfully synchronized) it returns 0.